# troubleshoot
a
Hello! I'm trying to update DataHub using Helm (v0.8.24 to v0.9.1) and I'm hitting an error in the datahub-elasticsearch-setup-job. The credentials seem to be correct: I can log in to OpenSearch via the browser using the username datahub-dev-1 and the password from the k8s secret elasticsearch-secrets.
kubectl logs job.batch/datahub-elasticsearch-setup-job

2022/11/01 22:02:31 Waiting for: https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com:443
2022/11/01 22:02:32 Received 401 from https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com:443. Sleeping 1s
2022/11/01 22:04:31 Timeout after 2m0s waiting on dependencies to become available: [https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com:443]
values.yaml:
elasticsearch:
    host: "<http://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com|vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com>"
    port: "443"
    useSSL: "true"
    auth:
      username: "datahub-dev-1"
      password:
        secretRef: elasticsearch-secrets
        secretKey: elasticsearch-password
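Since the password works in the browser, it is also worth confirming that the value stored in the Kubernetes secret is byte-for-byte the same one (a hedged check; the secret and key names are taken from the values above, and kubectl is assumed to point at the release namespace):

# Decode the password exactly as the setup job will receive it from the secret
kubectl get secret elasticsearch-secrets \
  -o jsonpath='{.data.elasticsearch-password}' | base64 -d; echo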
i
Hello Slava, have you checked whether the Kubernetes cluster can connect to the OpenSearch cluster? Is there DNS resolution? Can a pod in k8s ping the ES cluster?
a
Hello @incalculable-ocean-74010,
> Have you checked whether the Kubernetes cluster can connect to the OpenSearch cluster?
Seems like I can connect:
kubectl exec -it pod/datahub-datahub-gms-677ffc497d-pxj9d -- sh

/ $ curl https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com -u 'datahub-dev-1:mypassword'

{
  "name" : "qwerr",
  "cluster_name" : "qwer:datahub-opensearch-dev-1",
  "cluster_uuid" : "qwer",
  "version" : {
    "number" : "7.10.2",
    "build_flavor" : "oss",
    "build_type" : "tar",
    "build_hash" : "unknown",
    "build_date" : "2022-07-20T07:43:57.819165Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
I have DNS resolution:
getent hosts vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com
10.194.16.104
Ping is not working:
/ $ ping vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com
^C
--- vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com ping statistics ---
80 packets transmitted, 0 packets received, 100% packet loss
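(The failed ping is not necessarily a problem on its own: AWS VPC endpoints generally do not answer ICMP unless the security group explicitly allows it, so a TCP-level check against port 443 is more telling. A rough sketch using curl from the same pod:)

# The TLS handshake should succeed even if the HTTP response is 401
curl -sv --connect-timeout 5 -o /dev/null \
  https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com:443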
I suspected that datahub-elasticsearch-setup-job somehow couldn't get the credentials from here:
elasticsearch:
    host: "<http://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com|vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com>"
    port: "443"
    useSSL: "true"
    auth:
      username: "datahub-dev-1"
      password:
        secretRef: elasticsearch-secrets
        secretKey: elasticsearch-password
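One way to test that suspicion (a sketch; the job name is taken from the kubectl logs command above) is to look at the environment the rendered job actually carries and confirm the secret reference is wired in:

# Inspect the env of the setup job as it exists in the cluster
kubectl get job datahub-elasticsearch-setup-job -o yaml | grep -A 30 'env:'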
i
What makes you say that? The logs don't report permission errors
It looks to me like the OpenSearch cluster was somehow not ready? Can you share your values.yaml file?
a
> What makes you say that? The logs don't report permission errors
The 401 error:
2022/11/01 22:02:32 Received 401 from https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com:443. Sleeping 1s
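For context, a 401 means the request reached OpenSearch and was rejected as unauthenticated, which is different from the cluster being unreachable or not ready; so the setup job's wait request is apparently not being accepted with credentials. A rough way to reproduce both cases from a pod (password elided; hostname from the logs above):

# Without credentials the domain should answer 401, roughly what the wait loop reports
curl -s -o /dev/null -w '%{http_code}\n' \
  https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com
# With credentials it should answer 200 if the username/password are accepted
curl -s -o /dev/null -w '%{http_code}\n' -u 'datahub-dev-1:<password>' \
  https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com

The full values.yaml: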
# Values to start up datahub after starting up the datahub-prerequisites chart with "prerequisites" release name
# Copy this chart and change configuration as needed.
datahub-gms:
  enabled: true
  image:
    repository: linkedin/datahub-gms
    tag: "v0.9.1"

datahub-frontend:
  enabled: true
  image:
    repository: linkedin/datahub-frontend-react
    tag: "v0.9.1"
  # Set up ingress to expose react front-end
  ingress:
    enabled: false
  extraEnvs:
    - name: AUTH_OIDC_ENABLED
      value: "true"
    - name: AUTH_OIDC_CLIENT_ID
      value: "qq"
    - name: AUTH_OIDC_CLIENT_SECRET
      value: "qq"
    - name: AUTH_OIDC_DISCOVERY_URI
      value: "qq"
    - name: AUTH_OIDC_BASE_URL
      value: "qq"
    - name: AUTH_OIDC_SCOPE
      value: "qq"


acryl-datahub-actions:
  enabled: true
  image:
    repository: acryldata/datahub-actions
    tag: "v0.0.7"
  resources:
    limits:
      memory: 512Mi
    requests:
      cpu: 300m
      memory: 256Mi

datahub-mae-consumer:
  image:
    repository: linkedin/datahub-mae-consumer
    tag: "v0.9.1"

datahub-mce-consumer:
  image:
    repository: linkedin/datahub-mce-consumer
    tag: "v0.9.1"

datahub-ingestion-cron:
  enabled: false #true
  image:
    repository: acryldata/datahub-ingestion
    tag: "v0.9.1"

elasticsearchSetupJob:
  enabled: true
  image:
    repository: linkedin/datahub-elasticsearch-setup
    tag: "v0.9.1"
  extraEnvs:
    - name: USE_AWS_ELASTICSEARCH
      value: "true"
  podSecurityContext:
    fsGroup: 1000
  securityContext:
    runAsUser: 1000
  podAnnotations: {}

kafkaSetupJob:
  enabled: true
  image:
    repository: linkedin/datahub-kafka-setup
    tag: "v0.9.1"
  podSecurityContext:
    fsGroup: 1000
  securityContext:
    runAsUser: 1000
  podAnnotations: {}

mysqlSetupJob:
  enabled: true
  image:
    repository: acryldata/datahub-mysql-setup
    tag: "v0.9.1"
  podSecurityContext:
    fsGroup: 1000
  securityContext:
    runAsUser: 1000
  podAnnotations: {}

postgresqlSetupJob:
  enabled: false
  image:
    repository: acryldata/datahub-postgres-setup
    tag: "v0.9.1"
  podSecurityContext:
    fsGroup: 1000
  securityContext:
    runAsUser: 1000
  podAnnotations: {}

datahubUpgrade:
  enabled: true
  image:
    repository: acryldata/datahub-upgrade
    tag: "v0.9.1"
  batchSize: 1000
  batchDelayMs: 100
  noCodeDataMigration:
    sqlDbType: "MYSQL"
  podSecurityContext: {}
    # fsGroup: 1000
  securityContext: {}
    # runAsUser: 1000
  podAnnotations: {}
  restoreIndices:
    resources:
      limits:
        cpu: 500m
        memory: 512Mi
      requests:
        cpu: 300m
        memory: 256Mi

global:
  graph_service_impl: elasticsearch
  datahub_analytics_enabled: true
  datahub_standalone_consumers_enabled: false

  elasticsearch:
    host: "<http://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com|vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com>"
    port: "443"
    useSSL: "true"
    auth:
      username: "datahub-dev-1"
      password:
        secretRef: elasticsearch-secrets
        secretKey: elasticsearch-password


  kafka:
    bootstrap:
      server: "<http://b-1.datahubmskclusterdev1.qweqwe.c24.kafka.us-east-1.amazonaws.com:9092|b-1.datahubmskclusterdev1.qweqwe.c24.kafka.us-east-1.amazonaws.com:9092>"
    zookeeper:
      server: "<http://z-1.datahubmskclusterdev1.qweqwe.c24.kafka.us-east-1.amazonaws.com:2181|z-1.datahubmskclusterdev1.qweqwe.c24.kafka.us-east-1.amazonaws.com:2181>"

    ## For AWS MSK set this to a number larger than 1
    partitions: 2
    replicationFactor: 2
    schemaregistry:
      url: "<http://prerequisites-cp-schema-registry:8081>"


  sql:
    datasource:
      host: "<http://datahub-rds-dev-1.cluster-qq.us-east-1.rds.amazonaws.com:3306|datahub-rds-dev-1.cluster-qq.us-east-1.rds.amazonaws.com:3306>"
      hostForMysqlClient: "<http://datahub-rds-dev-1.cluster-qq.us-east-1.rds.amazonaws.com|datahub-rds-dev-1.cluster-qq.us-east-1.rds.amazonaws.com>"
      port: "3306"
      url: "jdbc:<mysql://datahub-rds-dev-1.cluster-qq.us-east-1.rds.amazonaws.com:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8&enabledTLSProtocols=TLSv1.2>"
      driver: "com.mysql.cj.jdbc.Driver"
      username: "admin"
      password:
        secretRef: db-secrets
        secretKey: db-admin-password

  datahub:
    gms:
      port: "8080"
      nodePort: "30001"
    mae_consumer:
      port: "9091"
      nodePort: "30002"
    appVersion: "1.0"

    managed_ingestion:
      enabled: true
      defaultCliVersion: "0.9.1"
> It looks to me like the OpenSearch cluster was somehow not ready?
Does this response mean the cluster is ready?
/ $ curl https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com -u 'datahub-dev-1:mypassword'

{
  "name" : "qwerr",
  "cluster_name" : "qwer:datahub-opensearch-dev-1",
  "cluster_uuid" : "qwer",
  "version" : {
    "number" : "7.10.2",
    "build_flavor" : "oss",
    "build_type" : "tar",
    "build_hash" : "unknown",
    "build_date" : "2022-07-20T07:43:57.819165Z",
    "build_snapshot" : false,
    "lucene_version" : "8.7.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
@incalculable-ocean-74010
i
To check if an ES cluster is ready run:
curl -XGET 'http://<elasticsearch host url>:9200/_cluster/health?pretty=true'
a
It doesn't work on port 9200,
but HTTPS is OK:
curl -XGET 'https://vpc-datahub-opensearch-dev-1-q.us-east-1.es.amazonaws.com/_cluster/health?pretty=true' -u 'datahub-dev-1:pass'

{
  "cluster_name" : "123:datahub-opensearch-dev-1",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "discovered_master" : true,
  "active_primary_shards" : 387,
  "active_shards" : 387,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 385,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 50.129533678756474
}
@incalculable-ocean-74010
i
Yes
a
How should I do that? Like this, via extraEnvs?
elasticsearchSetupJob:
  enabled: true
  image:
    repository: linkedin/datahub-elasticsearch-setup
    tag: "v0.9.1"
  extraEnvs:
    - name: USE_AWS_ELASTICSEARCH
      value: "true"
    - name: port
      value: "443"
but anyway, the job already uses port 443 according to its logs:
kubectl logs job.batch/datahub-elasticsearch-setup-job

2022/11/01 22:02:31 Waiting for: https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com:443
2022/11/01 22:02:32 Received 401 from https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com:443. Sleeping 1s
2022/11/01 22:04:31 Timeout after 2m0s waiting on dependencies to become available: [https://vpc-datahub-opensearch-dev-1-qweqwe.us-east-1.es.amazonaws.com:443]
Maybe I need to put the login and password for the connection there, in elasticsearchSetupJob?
It was a password problem x_x
I had a password with symbols like |
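For anyone hitting the same wall: shell-sensitive characters such as | in the password can get mangled depending on how the secret was created or how the value is later interpolated. A hedged fix is to rotate the password, or to recreate the secret from a single-quoted literal so the character is passed through untouched (placeholder value below):

# Recreate the secret; single quotes keep characters like | literal in the shell
kubectl delete secret elasticsearch-secrets
kubectl create secret generic elasticsearch-secrets \
  --from-literal=elasticsearch-password='new-password-here'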
i
Glad you got it to work!