Hi hi. I'm facing a new problem deploying datahub ...
# all-things-deployment
r
Hi hi. I'm facing a new problem deploying datahub with the helm chart. It seems to be linked to the ES setup job container:
helm install datahub datahub/datahub -n demo --version 0.2.83 --debug
install.go:178: [debug] Original chart version: "0.2.83"
install.go:195: [debug] CHART PATH: /home/cmo/.cache/helm/repository/datahub-0.2.83.tgz
client.go:299: [debug] Starting delete for "datahub-elasticsearch-setup-job" Job
client.go:128: [debug] creating 1 resource(s)
client.go:529: [debug] Watching for changes to Job datahub-elasticsearch-setup-job with timeout of 5m0s
client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: ADDED
client.go:596: [debug] datahub-elasticsearch-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: MODIFIED
client.go:596: [debug] datahub-elasticsearch-setup-job: Jobs active: 1, jobs failed: 1, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: MODIFIED
client.go:596: [debug] datahub-elasticsearch-setup-job: Jobs active: 1, jobs failed: 2, jobs succeeded: 0
Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition
helm.go:84: [debug] failed pre-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
helm.sh/helm/v3/cmd/helm/install.go:127
github.com/spf13/cobra.(*Command).execute
github.com/spf13/cobra@v1.3.0/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
github.com/spf13/cobra@v1.3.0/command.go:974
github.com/spf13/cobra.(*Command).Execute
github.com/spf13/cobra@v1.3.0/command.go:902
main.main
helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
runtime/proc.go:255
runtime.goexit
runtime/asm_amd64.s:1581
If you have any ideas, thanks! N.B.: the ES setup job container has this log:
2022/09/23 12:34:01 Problem with request: Get http://elasticsearch-master:9200: dial tcp 10.0.125.20:9200: connect: connection refused. Sleeping 1s
2022/09/23 12:34:03 Problem with request: Get http://elasticsearch-master:9200: dial tcp 10.0.125.20:9200: connect: connection refused. Sleeping 1s
2022/09/23 12:34:03 Timeout after 2m0s waiting on dependencies to become available: [http://elasticsearch-master:9200]
Well known error here -_-
i
How was elastic deployed?
Pre-requisite helm chart? Managed service? Something else? It looks like the job can not connect to the ES cluster
r
Hi @incalculable-ocean-74010 ;)! I use the prerequisites helm chart as well.
Regardless of the datahub install, the prerequisites setup job is also failing:
NAME READY STATUS RESTARTS AGE
datahub-elasticsearch-setup-job-l6j59 1/1 Running 0 15s
elasticsearch-master-0 0/1 Pending 0 17m
elasticsearch-master-1 0/1 Pending 0 17m
prerequisites-cp-schema-registry-6f4b5b894f-h9927 2/2 Running 0 17m
prerequisites-kafka-0 1/1 Running 1 (17m ago) 17m
prerequisites-mysql-0 1/1 Running 0 17m
prerequisites-zookeeper-0 1/1 Running 0 17m
datahub-elasticsearch-setup-job-l6j59 0/1 Error 0 2m4s
datahub-elasticsearch-setup-job-l6j59 0/1 Error 0 2m6s
datahub-elasticsearch-setup-job-rmjqx 0/1 Pending 0 0s
datahub-elasticsearch-setup-job-rmjqx 0/1 Pending 0 0s
datahub-elasticsearch-setup-job-rmjqx 0/1 ContainerCreating 0 0s
datahub-elasticsearch-setup-job-rmjqx 1/1 Running 0 5s
datahub-elasticsearch-setup-job-rmjqx 0/1 Error 0 2m4s
datahub-elasticsearch-setup-job-rmjqx 0/1 Error 0 2m6s
datahub-elasticsearch-setup-job-zbdr7 0/1 Pending 0 0s
datahub-elasticsearch-setup-job-zbdr7 0/1 Pending 0 0s
datahub-elasticsearch-setup-job-zbdr7 0/1 ContainerCreating 0 0s
datahub-elasticsearch-setup-job-zbdr7 1/1 Running 0 1s
datahub-elasticsearch-setup-job-zbdr7 0/1 Error 0 2m2s
datahub-elasticsearch-setup-job-zbdr7 0/1 Error 0 2m3s
datahub-elasticsearch-setup-job-92trf 0/1 Pending 0 0s
datahub-elasticsearch-setup-job-92trf 0/1 Pending 0 0s
datahub-elasticsearch-setup-job-92trf 0/1 ContainerCreating 0 0s
datahub-elasticsearch-setup-job-92trf 1/1 Running 0 2s
this was working perfectly 1 week ago
i
It's failing because the Elasticsearch cluster is not up. Note that its pods are in the Pending state.
You need to fix that and then the setup job should work.
Look at the pods named elasticsearch-master-<number>
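A minimal sketch for finding out which pods are stuck, assuming the prerequisites were installed into the same `demo` namespace used in the `helm install` command above. `pending_pods` is a hypothetical helper name; it only filters the plain `kubectl get pods` table for rows whose STATUS column reads `Pending`.

```shell
#!/bin/sh
# pending_pods: read `kubectl get pods` output on stdin and print the
# names of pods whose STATUS column (3rd field) is "Pending".
# Hypothetical helper, for illustration only.
pending_pods() {
  awk '$3 == "Pending" { print $1 }'
}
```

Typical use would be `kubectl get pods -n demo | pending_pods`, followed by `kubectl describe pod elasticsearch-master-0 -n demo` to read the scheduling events at the bottom of the output (for example an unbound volume claim or insufficient memory).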
r
also my only change is regarding the hardware
i put 3 * (2 CPU + 4GB) instead of 1 * (2 CPU + 8GB) in my cluster
i
It's a non-datahub specific change and pertains to your infra. There is not much I can do to help
r
2 ES replicas in my settings also
i
It's possible the Elasticsearch pods got moved onto other k8s nodes. Since they are stateful, they can't find the associated volumes (which were on the old nodes). Hence your issue
r
well if i do a clean and full uninstall / install that should work
i did a persistent volume clean also
kubectl delete pvc --all
persistentvolumeclaim "data-prerequisites-kafka-0" deleted
persistentvolumeclaim "data-prerequisites-mysql-0" deleted
persistentvolumeclaim "data-prerequisites-zookeeper-0" deleted
persistentvolumeclaim "elasticsearch-master-elasticsearch-master-0" deleted
persistentvolumeclaim "elasticsearch-master-elasticsearch-master-1" deleted
log like that
i
Also delete the persistent volume
This will delete all search and graph data you had in DataHub
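For context: a PersistentVolumeClaim (PVC) is a namespaced request for storage, while a PersistentVolume (PV) is the cluster-wide volume that satisfies it. Deleting PVCs does not always delete the PVs behind them; with a `Retain` reclaim policy the PV stays around in the `Released` phase. A minimal sketch for spotting such leftovers, where `released_pvs` is a hypothetical helper that expects two columns (name, phase) on stdin:

```shell
#!/bin/sh
# released_pvs: read "NAME PHASE" pairs on stdin and print the names of
# PersistentVolumes whose phase is "Released", i.e. their claim is gone
# but the volume itself still exists. Hypothetical helper.
released_pvs() {
  awk '$2 == "Released" { print $1 }'
}
```

It could be fed with `kubectl get pv --no-headers -o custom-columns=NAME:.metadata.name,PHASE:.status.phase | released_pvs`, and each volume that is no longer needed removed with `kubectl delete pv <name>`.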
r
you are right, this is the persistentvolumeclaim, which is different from the PV
i don't know the diff to be honest
Ok all PVs and PVCs cleaned now, starting a new deployment
Jesus i got this
the 4GB node is a problem it seems
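One likely reason three 4GB nodes behave differently from one 8GB node: the scheduler places each pod on a single node, so free memory is not pooled across nodes. A pod stays Pending if its memory request exceeds what any one node has left. A minimal sketch of that check; the function names and the numbers in the usage note are hypothetical, not taken from the chart's actual resource requests.

```shell
#!/bin/sh
# fits_on_node REQUEST_MI FREE_MI: succeed if a pod requesting REQUEST_MI
# mebibytes fits on a node with FREE_MI mebibytes still allocatable.
fits_on_node() {
  [ "$1" -le "$2" ]
}

# any_node_fits REQUEST_MI FREE_MI...: succeed if at least one node can
# host the pod. Free memory on different nodes is never added together.
any_node_fits() {
  request=$1
  shift
  for free in "$@"; do
    if fits_on_node "$request" "$free"; then
      return 0
    fi
  done
  return 1
}
```

With these made-up numbers, `any_node_fits 3000 2500 2500 2500` fails even though 7500 Mi is free in total, whereas `any_node_fits 3000 6500` succeeds. Real per-node figures can be read from the "Allocatable" and "Allocated resources" sections of `kubectl describe nodes`.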