
better-fireman-33387

09/01/2022, 6:08 AM
Hi, DataHub was up and running via Helm. After trying to ingest with the MySQL source I needed to reinstall all the charts, and now the installation is failing. Nothing has changed in my values.yaml files and the configuration stayed the same. I suspect the MySQL pod is not ready; the describe-pod output is in the thread. Any help?
➜  ~ kubectl describe pod prerequisites-mysql-0
Name:         prerequisites-mysql-0
Namespace:    prod-it-data
Priority:     0
Node:         kube00560.taboolasyndication.com/10.110.108.239
Start Time:   Thu, 01 Sep 2022 09:02:29 +0300
Labels:       app.kubernetes.io/component=primary
              app.kubernetes.io/instance=prerequisites
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=mysql
              controller-revision-hash=prerequisites-mysql-765bc74f8c
              helm.sh/chart=mysql-9.1.2
              statefulset.kubernetes.io/pod-name=prerequisites-mysql-0
Annotations:  checksum/configuration: f25419cba89da9112afaebb339431970fb516b7719d448d12799914417968846
              kubernetes.io/limit-ranger: LimitRanger plugin set: cpu, memory request for container mysql; cpu, memory limit for container mysql
              podpreset.admission.kubernetes.io/podpreset-cluster-domain: 6821103621
Status:       Running
IP:           10.134.102.189
IPs:
  IP:           10.134.102.189
Controlled By:  StatefulSet/prerequisites-mysql
Containers:
  mysql:
    Container ID:   containerd://72d37d9195e176ca0be7f467b7f45ea330857c981d0d71edeb124a9494238f62
    Image:          docker.io/bitnami/mysql:8.0.29-debian-10-r23
    Image ID:       docker.io/bitnami/mysql@sha256:a4c097f505825077d7983c740dcf099f8b40ec7d318aecfd9891337600145a2a
    Port:           3306/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Thu, 01 Sep 2022 09:02:57 +0300
    Ready:          False
    Restart Count:  0
    Limits:
      cpu:     1
      memory:  512Mi
    Requests:
      cpu:     100m
      memory:  256Mi
    Liveness:  exec [/bin/bash -ec password_aux="${MYSQL_ROOT_PASSWORD:-}"
if [[ -f "${MYSQL_ROOT_PASSWORD_FILE:-}" ]]; then
    password_aux=$(cat "$MYSQL_ROOT_PASSWORD_FILE")
fi
mysqladmin status -uroot -p"${password_aux}"
] delay=5s timeout=1s period=10s #success=1 #failure=3
    Readiness:  exec [/bin/bash -ec password_aux="${MYSQL_ROOT_PASSWORD:-}"
if [[ -f "${MYSQL_ROOT_PASSWORD_FILE:-}" ]]; then
    password_aux=$(cat "$MYSQL_ROOT_PASSWORD_FILE")
fi
mysqladmin status -uroot -p"${password_aux}"
] delay=5s timeout=1s period=10s #success=1 #failure=3
    Startup:  exec [/bin/bash -ec password_aux="${MYSQL_ROOT_PASSWORD:-}"
if [[ -f "${MYSQL_ROOT_PASSWORD_FILE:-}" ]]; then
    password_aux=$(cat "$MYSQL_ROOT_PASSWORD_FILE")
fi
mysqladmin status -uroot -p"${password_aux}"
] delay=15s timeout=1s period=10s #success=1 #failure=10
    Environment:
      BITNAMI_DEBUG:        false
      MYSQL_ROOT_PASSWORD:  <set to the key 'mysql-root-password' in secret 'mysql-secrets'>  Optional: false
      MYSQL_DATABASE:       my_database
      cluster_domain:       taboolasyndication.com
    Mounts:
      /bitnami/mysql from data (rw)
      /opt/bitnami/mysql/conf/my.cnf from config (rw,path="my.cnf")
      /var/run/secrets/kubernetes.io/serviceaccount from prerequisites-mysql-token-dthdp (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  data-prerequisites-mysql-0
    ReadOnly:   false
  config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      prerequisites-mysql
    Optional:  false
  prerequisites-mysql-token-dthdp:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  prerequisites-mysql-token-dthdp
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age               From               Message
  ----     ------            ----              ----               -------
  Warning  FailedScheduling  87s               default-scheduler  0/854 nodes are available: 854 pod has unbound immediate PersistentVolumeClaims.
  Warning  FailedScheduling  87s               default-scheduler  0/854 nodes are available: 854 pod has unbound immediate PersistentVolumeClaims.
  Normal   Scheduled         85s               default-scheduler  Successfully assigned prod-it-data/prerequisites-mysql-0 to kube00560.taboolasyndication.com
  Normal   Pulling           79s               kubelet            Pulling image "docker.io/bitnami/mysql:8.0.29-debian-10-r23"
  Normal   Pulled            60s               kubelet            Successfully pulled image "docker.io/bitnami/mysql:8.0.29-debian-10-r23" in 19.14969436s
  Normal   Created           57s               kubelet            Created container mysql
  Normal   Started           57s               kubelet            Started container mysql
  Warning  Unhealthy         1s (x5 over 41s)  kubelet            Startup probe failed: mysqladmin: [Warning] Using a password on the command line interface can be insecure.
mysqladmin: connect to server at 'localhost' failed
error: 'Can't connect to local MySQL server through socket '/opt/bitnami/mysql/tmp/mysql.sock' (2)'
Check that mysqld is running and that the socket: '/opt/bitnami/mysql/tmp/mysql.sock' exists!
I already tried deleting the PVC and reinstalling, but I got the same error again
now it's working… very weird
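(For reference, the FailedScheduling events above point at an unbound PersistentVolumeClaim. A quick sketch for checking the claim, using the names from the describe output; these commands are illustrative additions, not from the original thread:
kubectl get pvc data-prerequisites-mysql-0 -n prod-it-data
kubectl describe pvc data-prerequisites-mysql-0 -n prod-it-data
kubectl get storageclass
If the claim sits in Pending, the volume provisioner hasn't bound it yet, which would also explain the problem clearing up on its own once the volume was provisioned.)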

bumpy-needle-3184

09/01/2022, 6:43 AM
👍

better-fireman-33387

09/01/2022, 7:30 AM
We tried to ingest with MySQL and got a lot of red errors in the UI. Checking the GMS logs:
06:43:12.107 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 282 Took time ms: -1 Message: failure in bulk execution:
06:43:13.144 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 266 Took time ms: -1 Message: failure in bulk execution:
06:43:14.204 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 263 Took time ms: -1 Message: failure in bulk execution:
06:43:15.098 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 176 Took time ms: -1 Message: failure in bulk execution:
06:43:16.097 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 183 Took time ms: -1 Message: failure in bulk execution:
06:43:17.094 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 225 Took time ms: -1 Message: failure in bulk execution:
06:43:18.152 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 239 Took time ms: -1 Message: failure in bulk execution:
06:43:19.105 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 255 Took time ms: -1 Message: failure in bulk execution:
06:43:20.183 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 254 Took time ms: -1 Message: failure in bulk execution:
06:43:21.089 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 228 Took time ms: -1 Message: failure in bulk execution:
06:43:25.174 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 259 Took time ms: -1 Message: failure in bulk execution:
06:43:25.282 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 137 Took time ms: -1 Message: failure in bulk execution:
06:43:26.327 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 234 Took time ms: -1 Message: failure in bulk execution:
06:43:27.504 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 226 Took time ms: -1 Message: failure in bulk execution:
06:43:28.406 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 289 Took time ms: -1 Message: failure in bulk execution:
06:43:29.576 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 277 Took time ms: -1 Message: failure in bulk execution:
06:43:30.337 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 263 Took time ms: -1 Message: failure in bulk execution:
06:43:31.577 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 391 Took time ms: -1 Message: failure in bulk execution:
06:43:32.318 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 200 Took time ms: -1 Message: failure in bulk execution:
06:43:33.480 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 318 Took time ms: -1 Message: failure in bulk execution:
06:43:34.344 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 189 Took time ms: -1 Message: failure in bulk execution:
06:43:35.333 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 326 Took time ms: -1 Message: failure in bulk execution:
06:43:36.575 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 343 Took time ms: -1 Message: failure in bulk execution:
06:43:37.416 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 304 Took time ms: -1 Message: failure in bulk execution:
06:43:38.479 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 403 Took time ms: -1 Message: failure in bulk execution:
06:43:39.393 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 156 Took time ms: -1 Message: failure in bulk execution:
06:43:40.854 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:25 - Failed to feed bulk request. Number of events: 222 Took time ms: -1 Message: failure in bulk execution:
06:44:15.442 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:35 - Error feeding bulk request. No retries left
06:44:17.285 [I/O dispatcher 1] ERROR c.l.m.k.e.ElasticsearchConnector:47 - Error feeding bulk request. No retries left
06:44:46.371 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:35 - Error feeding bulk request. No retries left
06:44:48.355 [I/O dispatcher 1] ERROR c.l.m.k.e.ElasticsearchConnector:47 - Error feeding bulk request. No retries left
06:45:17.082 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:35 - Error feeding bulk request. No retries left
06:45:19.978 [I/O dispatcher 1] ERROR c.l.m.k.e.ElasticsearchConnector:47 - Error feeding bulk request. No retries left
06:45:47.818 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:35 - Error feeding bulk request. No retries left
06:45:50.710 [I/O dispatcher 1] ERROR c.l.m.k.e.ElasticsearchConnector:47 - Error feeding bulk request. No retries left
06:46:18.676 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:35 - Error feeding bulk request. No retries left
06:46:21.627 [I/O dispatcher 1] ERROR c.l.m.k.e.ElasticsearchConnector:47 - Error feeding bulk request. No retries left
06:46:22.206 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:35 - Error feeding bulk request. No retries left
06:46:22.209 [ThreadPoolTaskExecutor-1] ERROR c.l.m.graph.elastic.ESGraphWriteDAO:66 - ERROR: Failed to delete by query. See stacktrace for a more detailed error:
06:46:22.296 [ThreadPoolTaskExecutor-1] ERROR c.l.m.graph.elastic.ESGraphWriteDAO:66 - ERROR: Failed to delete by query. See stacktrace for a more detailed error:
06:46:22.327 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.522 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.534 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.583 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.595 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.631 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.661 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.846 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.858 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.892 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.906 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.948 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:22.961 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:23.157 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:23.176 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
06:46:23.211 [ThreadPoolTaskExecutor-1] ERROR c.l.m.k.MetadataChangeLogProcessor:90 - Failed to execute MCL hook with name com.linkedin.metadata.kafka.hook.UpdateIndicesHook
and the Elasticsearch logs

bumpy-needle-3184

09/01/2022, 8:04 AM
What about the kafka-setup job? Could you share its logs?
The Elasticsearch log says:
Caused by: java.net.ConnectException: Timeout connecting to [elasticsearch-master/10.136.13.77:9200]
Could you verify the connection?
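(One way to verify from outside the cluster, as a sketch using the service name from the error above; the namespace is an assumption from earlier in the thread:
kubectl port-forward svc/elasticsearch-master 9200:9200 -n prod-it-data
curl -s "http://localhost:9200/_cluster/health?pretty"
A reachable cluster answers with a JSON body whose "status" field is green, yellow, or red; a hang here matches the ConnectException above.)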

better-fireman-33387

09/01/2022, 8:08 AM
How do I verify?
Should I change to elastic-headless? Why are there two services for each instance? (See the note after the log below.)
kafka-setup log:
~ kubectl logs datahub-kafka-setup-job-clttk
[main] INFO org.apache.kafka.clients.admin.AdminClientConfig - AdminClientConfig values:
	bootstrap.servers = [prerequisites-kafka:9092]
	client.dns.lookup = use_all_dns_ips
	client.id =
	connections.max.idle.ms = 300000
	default.api.timeout.ms = 60000
	metadata.max.age.ms = 300000
	metric.reporters = []
	metrics.num.samples = 2
	metrics.recording.level = INFO
	metrics.sample.window.ms = 30000
	receive.buffer.bytes = 65536
	reconnect.backoff.max.ms = 1000
	reconnect.backoff.ms = 50
	request.timeout.ms = 30000
	retries = 2147483647
	retry.backoff.ms = 100
	sasl.client.callback.handler.class = null
	sasl.jaas.config = null
	sasl.kerberos.kinit.cmd = /usr/bin/kinit
	sasl.kerberos.min.time.before.relogin = 60000
	sasl.kerberos.service.name = null
	sasl.kerberos.ticket.renew.jitter = 0.05
	sasl.kerberos.ticket.renew.window.factor = 0.8
	sasl.login.callback.handler.class = null
	sasl.login.class = null
	sasl.login.refresh.buffer.seconds = 300
	sasl.login.refresh.min.period.seconds = 60
	sasl.login.refresh.window.factor = 0.8
	sasl.login.refresh.window.jitter = 0.05
	sasl.mechanism = GSSAPI
	security.protocol = PLAINTEXT
	security.providers = null
	send.buffer.bytes = 131072
	socket.connection.setup.timeout.max.ms = 127000
	socket.connection.setup.timeout.ms = 10000
	ssl.cipher.suites = null
	ssl.enabled.protocols = [TLSv1.2]
	ssl.endpoint.identification.algorithm = https
	ssl.engine.factory.class = null
	ssl.key.password = null
	ssl.keymanager.algorithm = SunX509
	ssl.keystore.certificate.chain = null
	ssl.keystore.key = null
	ssl.keystore.location = null
	ssl.keystore.password = null
	ssl.keystore.type = JKS
	ssl.protocol = TLSv1.2
	ssl.provider = null
	ssl.secure.random.implementation = null
	ssl.trustmanager.algorithm = PKIX
	ssl.truststore.certificates = null
	ssl.truststore.location = null
	ssl.truststore.password = null
	ssl.truststore.type = JKS

[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka version: 6.1.4-ccs
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka commitId: c9124241a6ff43bc
[main] INFO org.apache.kafka.common.utils.AppInfoParser - Kafka startTimeMs: 1662018258463
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Completed updating config for topic _schemas.
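(On the two-services question above: with the Elastic Helm chart this is expected. elasticsearch-master is a regular ClusterIP service for client traffic, while elasticsearch-master-headless is the headless service the StatefulSet uses for stable per-pod DNS; clients such as GMS should keep pointing at elasticsearch-master. A rough way to compare them:
kubectl get svc -n prod-it-data | grep elasticsearch)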

bumpy-needle-3184

09/01/2022, 8:15 AM
Try running:
curl -XGET "10.136.13.77:9200/_cluster/health?pretty"
How did you set up Elasticsearch? Is it deployed as a Kubernetes pod as part of the prerequisites Helm chart, or are you using the AWS OpenSearch service?
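(If the pod IP isn't routable from your machine, a sketch for running the same check from inside the cluster; the helper pod name, image, and namespace are assumptions:
kubectl run es-check --rm -it --restart=Never --image=curlimages/curl -n prod-it-data --command -- curl -s "http://elasticsearch-master:9200/_cluster/health?pretty"
A healthy cluster reports "status" : "green", or "yellow" on a single-replica setup.)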

better-fireman-33387

09/01/2022, 8:23 AM
It's deployed in Kubernetes as part of the prerequisites.
I see the pod is not healthy now.
This is the elasticsearch section from the prerequisites values.yaml:
elasticsearch:
  enabled: true   # set this to false, if you want to provide your own ES instance.
  replicas: 3
  minimumMasterNodes: 1
  # Set replicas to 1 and uncomment this to allow the instance to be scheduled on
  # a master node when deploying on a single node Minikube / Kind / etc cluster.
  # antiAffinity: "soft"

  # # If you're running a single-replica cluster, add the following helm value
  # clusterHealthCheckParams: "wait_for_status=yellow&timeout=1s"

  # # Shrink default JVM heap.
  # esJavaOpts: "-Xmx128m -Xms128m"

  # # Allocate smaller chunks of memory per pod.
  resources:
    requests:
      cpu: "0.1"
      memory: "100M"
    limits:
      cpu: "1000m"
      memory: "512M"

  # # Request smaller persistent volumes.
  volumeClaimTemplate:
    accessModes: ["ReadWriteOnce"]
    storageClassName: "rook-ceph-block"
    resources:
      requests:
        storage: 200M
Maybe I need to increase the resources?
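(A possible starting point, as an untested sketch: Elasticsearch with a 512M memory limit and no explicit JVM heap is very tight, and the chart's commented-out esJavaOpts is the usual knob. The numbers below are illustrative only:
elasticsearch:
  esJavaOpts: "-Xmx512m -Xms512m"
  resources:
    requests:
      cpu: "500m"
      memory: "1Gi"
    limits:
      cpu: "1000m"
      memory: "1Gi"
Heap should stay at roughly half the container memory limit.)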

bumpy-needle-3184

09/01/2022, 9:38 AM
As discussed, all the Elasticsearch pods are not in the ready state, which is why the Elasticsearch cluster is not healthy.

better-fireman-33387

09/01/2022, 9:45 AM
OK, I did what we talked about and they are now all in the ready state.
Should I also see a running elasticsearch-setup job?
Because all of them are in an error state.

bumpy-needle-3184

09/01/2022, 9:50 AM
You need to run
helm upgrade --install
for the datahub Helm chart.
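(Spelled out, with the release name and chart reference as assumptions; adjust to your install:
helm repo add datahub https://helm.datahubproject.io/
helm upgrade --install datahub datahub/datahub --values values.yaml -n prod-it-data)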

better-fireman-33387

09/01/2022, 10:00 AM
OK, WORKING. Now I'm getting this, and the menu bar disappeared

bumpy-needle-3184

09/01/2022, 10:01 AM
Check the GMS pod log.

better-fireman-33387

09/01/2022, 10:03 AM
All of a sudden I have two GMS pods: one (the old one) is ready and running, and one is not ready.
10:02:11.189 [main] INFO  c.l.m.s.e.i.ESIndexBuilder:200 - Index datasetindex_v2_1662026531189 does not exist. Creating
java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-2 [ACTIVE]
	at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:834)
	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:259)
	at org.elasticsearch.client.RestClient.performRequest(RestClient.java:246)
	at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1613)
	at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1598)
	at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1565)
	at org.elasticsearch.client.IndicesClient.create(IndicesClient.java:145)
	at com.linkedin.metadata.search.elasticsearch.indexbuilder.ESIndexBuilder.createIndex(ESIndexBuilder.java:204)
	at com.linkedin.metadata.search.elasticsearch.indexbuilder.ESIndexBuilder.buildIndex(ESIndexBuilder.java:106)
	at com.linkedin.metadata.search.elasticsearch.indexbuilder.EntityIndexBuilder.buildIndex(EntityIndexBuilder.java:23)
	at com.linkedin.metadata.search.elasticsearch.indexbuilder.EntityIndexBuilders.buildAll(EntityIndexBuilders.java:21)
	at com.linkedin.metadata.search.elasticsearch.ElasticSearchService.configure(ElasticSearchService.java:37)
	at com.linkedin.metadata.kafka.hook.UpdateIndicesHook.<init>(UpdateIndicesHook.java:83)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
	at org.springframework.beans.BeanUtils.instantiateClass(BeanUtils.java:211)
	at org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:117)
	at org.springframework.beans.factory.support.ConstructorResolver.instantiate(ConstructorResolver.java:311)
	at org.springframework.beans.factory.support.ConstructorResolver.autowireConstructor(ConstructorResolver.java:296)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.autowireConstructor(AbstractAutowireCapableBeanFactory.java:1372)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBeanInstance(AbstractAutowireCapableBeanFactory.java:1222)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:582)
	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542)
	at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335)
	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)
	at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)
	at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208)
	at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:953)
	at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918)
	at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583)
	at org.springframework.web.context.ContextLoader.configureAndRefreshWebApplicationContext(ContextLoader.java:401)
	at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:292)
	at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:103)
	at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:1073)
	at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:572)
	at org.eclipse.jetty.server.handler.ContextHandler.contextInitialized(ContextHandler.java:1002)
	at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:746)
	at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:379)
	at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1449)
	at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1414)
	at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:916)
	at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:288)
	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
	at org.eclipse.jetty.server.Server.start(Server.java:423)
	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
	at org.eclipse.jetty.server.Server.doStart(Server.java:387)
	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
	at org.eclipse.jetty.runner.Runner.run(Runner.java:519)
	at org.eclipse.jetty.runner.Runner.main(Runner.java:564)
Caused by: java.net.SocketTimeoutException: 30,000 milliseconds timeout on connection http-outgoing-2 [ACTIVE]
	at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387)
	at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92)
	at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39)
	at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:261)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:502)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:211)
	at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280)
	at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
	at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
	at java.lang.Thread.run(Thread.java:748)
10:02:41.279 [main] INFO  c.l.m.s.e.i.EntityIndexBuilder:19 - Setting up index: chartindex_v2
10:02:41.284 [main] WARN  org.elasticsearch.client.RestClient:65 - request [HEAD http://elasticsearch-master:9200/chartindex_v2?ignore_throttled=false&ignore_unavailable=false&expand_wildcards=open%2Cclosed&allow_no_indices=false] returned 2 warnings: [299 Elasticsearch-7.16.2-2b937c44140b6559905130a8650c64dbd0879cfb "Elasticsearch built-in security features are not enabled. Without authentication, your cluster could be accessible to anyone. See https://www.elastic.co/guide/en/elasticsearch/reference/7.16/security-minimal-setup.html to enable security."],[299 Elasticsearch-7.16.2-2b937c44140b6559905130a8650c64dbd0879cfb "[ignore_throttled] parameter is deprecated because frozen indices have been deprecated. Consider cold or frozen tiers in place of frozen indices."]
10:02:41.285 [main] INFO  c.l.m.s.e.i.ESIndexBuilder:200 - Index chartindex_v2 does not exist. Creating
These errors happened after trying to ingest MySQL.
This worked in a test env with docker-compose, so the issue here is with the Helm configuration.
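(Two GMS pods during a redeploy is normal: the Deployment keeps the old replica serving until the new one passes its readiness probe. A sketch for watching it settle, with the deployment name assumed from the default release naming:
kubectl rollout status deployment/datahub-datahub-gms -n prod-it-data
kubectl get pods -n prod-it-data -w
If the new pod keeps timing out against Elasticsearch during index creation, that again points back at the Elasticsearch cluster's health and resources rather than at GMS itself.)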

bumpy-needle-3184

09/01/2022, 10:08 AM
This is not an error… GMS is trying to create an index in Elasticsearch.

better-fireman-33387

09/01/2022, 10:13 AM
If I do a clean install everything is OK; after trying to ingest MySQL I get many errors in the UI and in GMS/Elasticsearch.

bumpy-needle-3184

09/01/2022, 10:24 AM
For ingestion-related issues, you can ask for help in the ingestion channel.

better-fireman-33387

09/01/2022, 10:28 AM
@bumpy-needle-3184 thanks!