Hi hi I m facing a new problem deploying datahub with helm c DataHub #all-things-deployment

# all-things-deployment

rapid-book-98432

09/23/2022, 12:34 PM

Hi hi. I'm facing a new problem deploying DataHub with the Helm chart. It seems to be linked to the ES container setup job:

helm install datahub datahub/datahub -n demo --version 0.2.83 --debug

install.go:178: [debug] Original chart version: "0.2.83"

install.go:195: [debug] CHART PATH: /home/cmo/.cache/helm/repository/datahub-0.2.83.tgz

client.go:299: [debug] Starting delete for "datahub-elasticsearch-setup-job" Job

client.go:128: [debug] creating 1 resource(s)

client.go:529: [debug] Watching for changes to Job datahub-elasticsearch-setup-job with timeout of 5m0s

client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: ADDED

client.go:596: [debug] datahub-elasticsearch-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0

client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: MODIFIED

client.go:596: [debug] datahub-elasticsearch-setup-job: Jobs active: 1, jobs failed: 1, jobs succeeded: 0

client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: MODIFIED

client.go:596: [debug] datahub-elasticsearch-setup-job: Jobs active: 1, jobs failed: 2, jobs succeeded: 0

Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition

helm.go:84: [debug] failed pre-install: timed out waiting for the condition

INSTALLATION FAILED

main.newInstallCmd.func2

helm.sh/helm/v3/cmd/helm/install.go:127

github.com/spf13/cobra.(*Command).execute

github.com/spf13/cobra@v1.3.0/command.go:856

github.com/spf13/cobra.(*Command).ExecuteC

github.com/spf13/cobra@v1.3.0/command.go:974

github.com/spf13/cobra.(*Command).Execute

github.com/spf13/cobra@v1.3.0/command.go:902

main.main

helm.sh/helm/v3/cmd/helm/helm.go:83

runtime.main

runtime/proc.go:255

runtime.goexit

runtime/asm_amd64.s:1581

If you have any ideas, thanks! N.B.: the ES setup job container has this log:

2022/09/23 12:34:01 Problem with request: Get http://elasticsearch-master:9200: dial tcp 10.0.125.20:9200: connect: connection refused. Sleeping 1s

2022/09/23 12:34:03 Problem with request: Get http://elasticsearch-master:9200: dial tcp 10.0.125.20:9200: connect: connection refused. Sleeping 1s

2022/09/23 12:34:03 Timeout after 2m0s waiting on dependencies to become available: [http://elasticsearch-master:9200]

Well known error here -_-
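Editor's note: the job log above shows a wait-for-dependency pattern: the Elasticsearch URL is retried every second and the job gives up after two minutes. A minimal shell sketch of that behaviour follows. It is not the setup job's actual implementation, and `wait_for` is a hypothetical helper name.

```shell
# Retry a command once per second until it succeeds or the timeout elapses.
# $1 = timeout in seconds; remaining args = the command to run.
wait_for() {
  timeout_s=$1; shift
  elapsed=0
  until "$@" >/dev/null 2>&1; do
    if [ "$elapsed" -ge "$timeout_s" ]; then
      echo "Timeout after ${timeout_s}s waiting on: $*" >&2
      return 1
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
}
```

From a pod inside the cluster, something like `wait_for 120 curl -sf http://elasticsearch-master:9200` would reproduce the check. A "connection refused" here only says that nothing is listening yet; it does not say why.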

incalculable-ocean-74010

09/23/2022, 12:39 PM

How was elastic deployed?

incalculable-ocean-74010

09/23/2022, 12:40 PM

Prerequisites Helm chart? Managed service? Something else? It looks like the job cannot connect to the ES cluster.

rapid-book-98432

09/23/2022, 12:40 PM

Hi @incalculable-ocean-74010 ;) ! I also use the prerequisites Helm chart.

rapid-book-98432

09/23/2022, 12:41 PM

Regardless of the DataHub install, the prerequisites setup job is also failing:

NAME READY STATUS RESTARTS AGE

datahub-elasticsearch-setup-job-l6j59 1/1 Running 0 15s

elasticsearch-master-0 0/1 Pending 0 17m

elasticsearch-master-1 0/1 Pending 0 17m

prerequisites-cp-schema-registry-6f4b5b894f-h9927 2/2 Running 0 17m

prerequisites-kafka-0 1/1 Running 1 (17m ago) 17m

prerequisites-mysql-0 1/1 Running 0 17m

prerequisites-zookeeper-0 1/1 Running 0 17m

datahub-elasticsearch-setup-job-l6j59 0/1 Error 0 2m4s

datahub-elasticsearch-setup-job-l6j59 0/1 Error 0 2m6s

datahub-elasticsearch-setup-job-rmjqx 0/1 Pending 0 0s

datahub-elasticsearch-setup-job-rmjqx 0/1 ContainerCreating 0 0s

datahub-elasticsearch-setup-job-rmjqx 1/1 Running 0 5s

datahub-elasticsearch-setup-job-rmjqx 0/1 Error 0 2m4s

datahub-elasticsearch-setup-job-rmjqx 0/1 Error 0 2m6s

datahub-elasticsearch-setup-job-zbdr7 0/1 Pending 0 0s

datahub-elasticsearch-setup-job-zbdr7 0/1 ContainerCreating 0 0s

datahub-elasticsearch-setup-job-zbdr7 1/1 Running 0 1s

datahub-elasticsearch-setup-job-zbdr7 0/1 Error 0 2m2s

datahub-elasticsearch-setup-job-zbdr7 0/1 Error 0 2m3s

datahub-elasticsearch-setup-job-92trf 0/1 Pending 0 0s

datahub-elasticsearch-setup-job-92trf 0/1 ContainerCreating 0 0s

datahub-elasticsearch-setup-job-92trf 1/1 Running 0 2s

rapid-book-98432

09/23/2022, 12:41 PM

this was working perfectly 1 week ago

incalculable-ocean-74010

09/23/2022, 12:41 PM

It's failing because the Elasticsearch cluster is not up. Note that the elasticsearch-master pods are in the Pending state.

incalculable-ocean-74010

09/23/2022, 12:42 PM

You need to fix that and then the setup job should work

incalculable-ocean-74010

09/23/2022, 12:43 PM

Look at the pods with elasticsearch-master-<number>
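Editor's note: one way to follow this advice is to pull the Pending pod names out of the `kubectl get pods` listing and then describe each one. The sketch below only covers the filtering step and runs against rows copied from the listing earlier in this thread. `pending_pods` is a hypothetical helper name.

```shell
# Print the names of pods whose STATUS column is "Pending".
# Reads `kubectl get pods` output on stdin; the header line is skipped.
pending_pods() {
  awk 'NR > 1 && $3 == "Pending" { print $1 }'
}

# Sample rows taken from the pod listing earlier in this thread.
sample='NAME READY STATUS RESTARTS AGE
elasticsearch-master-0 0/1 Pending 0 17m
elasticsearch-master-1 0/1 Pending 0 17m
prerequisites-mysql-0 1/1 Running 0 17m'

printf '%s\n' "$sample" | pending_pods
```

Against a live cluster this could be chained as `kubectl get pods -n demo | pending_pods | xargs -n1 kubectl describe pod -n demo`, assuming the prerequisites live in the same `demo` namespace as the `helm install` above. The Events section at the end of the `describe` output normally states why the scheduler could not place the pod.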

rapid-book-98432

09/23/2022, 12:44 PM

Also, my only change is regarding the hardware.

rapid-book-98432

09/23/2022, 12:45 PM

I put 3 * (2 CPU + 4 GB) instead of 1 * (2 CPU + 8 GB) in my cluster.

incalculable-ocean-74010

09/23/2022, 12:45 PM

It's a non-datahub specific change and pertains to your infra. There is not much I can do to help

rapid-book-98432

09/23/2022, 12:46 PM

Also, 2 ES replicas in my settings.

incalculable-ocean-74010

09/23/2022, 12:46 PM

It's possible the Elasticsearch pods got moved onto other k8s nodes. Since they are stateful, they can't find the associated volumes (which were on the old nodes). Hence your issue.

rapid-book-98432

09/23/2022, 12:47 PM

Well, if I do a clean and full uninstall / install, that should work.

rapid-book-98432

09/23/2022, 12:47 PM

I also did a persistent volume clean.

rapid-book-98432

09/23/2022, 12:48 PM

kubectl delete pvc --all

persistentvolumeclaim "data-prerequisites-kafka-0" deleted

persistentvolumeclaim "data-prerequisites-mysql-0" deleted

persistentvolumeclaim "data-prerequisites-zookeeper-0" deleted

persistentvolumeclaim "elasticsearch-master-elasticsearch-master-0" deleted

persistentvolumeclaim "elasticsearch-master-elasticsearch-master-1" deleted

rapid-book-98432

09/23/2022, 12:48 PM

log like that

incalculable-ocean-74010

09/23/2022, 12:52 PM

Also delete the persistent volume

incalculable-ocean-74010

09/23/2022, 12:52 PM

This will delete all search and graph data you had in DataHub

rapid-book-98432

09/23/2022, 12:53 PM

You are right, this is the PersistentVolumeClaim, which is different from the PV.

rapid-book-98432

09/23/2022, 12:54 PM

i don't know the diff to be honest
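Editor's note on the difference: a PersistentVolumeClaim is a namespaced request for storage, while a PersistentVolume is the cluster-level volume bound to that claim. Deleting the claim does not always delete the volume. With a `Retain` reclaim policy, the PV stays behind in the `Released` phase. The sketch below filters such volumes out of a three-column listing. The sample rows and PV names are made up for illustration, and `released_pvs` is a hypothetical helper name.

```shell
# Print the names of PersistentVolumes in the "Released" phase, i.e. volumes
# whose claim was deleted while the volume itself still exists.
# Expects lines of "NAME STATUS CLAIM" on stdin.
released_pvs() {
  awk '$2 == "Released" { print $1 }'
}

# Made-up sample rows; the PV names are hypothetical.
sample='pv-aaa Released elasticsearch-master-elasticsearch-master-0
pv-bbb Bound data-prerequisites-mysql-0'

printf '%s\n' "$sample" | released_pvs
```

On a real cluster the input could come from `kubectl get pv --no-headers -o custom-columns=NAME:.metadata.name,STATUS:.status.phase,CLAIM:.spec.claimRef.name`. Note that `kubectl get pvc` is scoped to a namespace, whereas `kubectl get pv` is cluster-wide.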

rapid-book-98432

09/23/2022, 12:55 PM

OK, all PVs and PVCs are cleaned now, starting a new deployment.

rapid-book-98432

09/23/2022, 12:59 PM

Jesus i got this

rapid-book-98432

09/23/2022, 1:00 PM

the 4GB node is a problem it seems
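Editor's note: one plausible reading of this is that the scheduler places each pod on a single node, so what matters is the free memory per node and not the cluster total. Three 4 GB nodes can leave a pod Pending that a single 8 GB node would accept. The sketch below illustrates the idea with hypothetical numbers. They are not measured from this cluster, and the real Elasticsearch memory request is not stated in the thread.

```shell
# Succeed if at least one node has enough free memory for the pod.
# $1 = pod memory request in MiB; remaining args = free MiB on each node.
schedulable() {
  request=$1; shift
  for free in "$@"; do
    [ "$free" -ge "$request" ] && return 0
  done
  return 1
}

# Hypothetical figures: a 5000 MiB request against the free memory per node.
schedulable 5000 7000 && echo "one 8 GB node: fits"
schedulable 5000 3200 3200 3200 || echo "three 4 GB nodes: stays Pending"
```

The actual per-node figures appear in the "Allocatable" and "Allocated resources" sections of `kubectl describe node`.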
