fierce-finland-15121
04/18/2023, 6:41 PM
ERROR SpringApplication Application run failed
org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'upgradeCli': Unsatisfied dependency expressed through field 'noCodeUpgrade'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in class path resource [com/linkedin/gms/factory/entity/EbeanServerFactory.class]: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException
bland-orange-13353
04/19/2023, 5:29 AM
careful-lunch-53644
04/19/2023, 5:34 AM
flat-painter-78331
04/19/2023, 10:00 AM
https://scx-datahub.cxos.tech
is where I've exposed the datahub application.
best-daybreak-64419
04/19/2023, 6:59 PM
After running helm install prerequisites, the prerequisites-cp-schema-registry pod kept failing and restarting, while other pods remained in the Pending state.
It was exactly the same issue mentioned in this thread on Slack. Although I added the EBS usage policy to EKS, PVC binding did not work, and when I ran the kubectl get pv command, no PVs were found. Then I checked kubectl get storageclasses and found a StorageClass named 'standard'.
I finally succeeded in binding the PVC only after modifying the values.yaml file as follows, and I could see that prerequisites-cp-schema-registry was running normally.
elasticsearch:
  ...
  # Request smaller persistent volumes.
  volumeClaimTemplate:
    accessModes: ["ReadWriteOnce"]
    storageClassName: "standard"
    resources:
      requests:
        storage: 30Gi
...
mysql:
  enabled: true
  auth:
    # For better security, add mysql-secrets k8s secret with mysql-root-password, mysql-replication-password and mysql-password
    existingSecret: mysql-secrets
global:
  storageClass: "standard"
I have deployed DataHub once before on EKS v1.23.13-eks with DataHub version 0.9.x using the same method. At that time, when I checked the PVCs, the bound StorageClass was gp2 (the default), and once I added the EBS policy to EKS the PVC was bound immediately, without modifying the values.yaml file.
So, my first question is:
After the EKS version upgrade, I no longer see gp2 as the default storage class; there is only 'standard' (which is not marked as default). I added the storageClass option in the values.yaml file to solve the issue, but I'm wondering whether creating a separate default StorageClass with volumeBindingMode set to WaitForFirstConsumer is the correct solution.
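For illustration, such a default StorageClass might look like the following (a sketch assuming the AWS EBS CSI driver is installed; the gp3-default name and gp3 volume type are illustrative, not from the thread):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-default                     # illustrative name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com            # assumes the EBS CSI driver add-on
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete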
`Second question`:
The Kubernetes deployment document recommends installing 'prerequisites' first and then installing 'datahub/datahub' with helm, citing dependencies. However, when RDS (MySQL), MSK, and OpenSearch are already set up, should 'enabled' be true or false for es and mysql in the prerequisites values.yaml file?
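For illustration, disabling the bundled services when managed ones exist might look like this in the prerequisites values.yaml (a sketch; the key names assume the datahub-prerequisites chart layout at the time and may differ by chart version):

elasticsearch:
  enabled: false   # using managed OpenSearch instead
kafka:
  enabled: false   # using MSK instead
mysql:
  enabled: false   # using RDS instead

The datahub chart's own values.yaml would then point its global section (global.elasticsearch, global.kafka, global.sql) at the managed endpoints.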
`Third question`:
Do I need at least 3 EKS nodes? Also, is it necessary to have 3 or more Kafka brokers?
Thank you for taking the time to read through my lengthy question. If there is any part of my inquiry that you didn't fully understand, please feel free to ask for clarification. I look forward to your response.
bland-orange-13353
04/20/2023, 12:02 AM
rich-crowd-33361
04/20/2023, 12:18 AM
bland-orange-13353
04/20/2023, 6:25 AM
microscopic-machine-90437
04/20/2023, 9:41 AM
ERROR: The ingestion process was killed, likely because it ran out of memory. You can resolve this issue by allocating more memory to the datahub-actions container.
When I go through the values.yaml file, I can see that the datahub-actions container has 512Mi of memory.
My question is: when we ingest metadata, in which container is it stored? If the data we are trying to ingest from Snowflake is in the GBs, how much do we have to scale the memory of the actions container? Is there a way to find out the size of the data/metadata we are trying to ingest (from Snowflake or any other source)?
Can someone help me with this?
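For illustration, raising that limit would be a values.yaml change along these lines (a sketch assuming the acryl-datahub-actions key used by the datahub chart; the 2Gi figure is illustrative and would need to be sized to the source):

acryl-datahub-actions:
  resources:
    limits:
      memory: 2Gi    # raised from the 512Mi default
    requests:
      cpu: 300m
      memory: 1Gi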
bland-gold-64386
04/20/2023, 12:32 PM
airflow connections add --conn-type 'datahub_rest' 'datahub_rest_default' --conn-host 'http://domain.com' --conn-password ''
rapid-hamburger-95729
04/20/2023, 1:47 PM
steep-doctor-17127
04/20/2023, 10:55 PM
powerful-cat-68806
04/21/2023, 11:13 AM
most-animal-32096
04/21/2023, 12:28 PM
microscopic-machine-90437
04/21/2023, 1:58 PM
limited-forest-73733
04/24/2023, 1:51 PM
flat-painter-78331
04/25/2023, 11:01 AM
I'm getting an error from the datahub-system-update-job saying Error: secret "datahub-auth-secrets" not found. I had deployed DataHub previously and it was working fine as well. I deleted the deployment and am trying to re-deploy now.
Can someone tell me if there's anything I need to do or something I'm not looking at, please?
Thanks in advance!
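If the redeploy no longer provisions that secret, one workaround could be creating it by hand before installing; a minimal sketch, with key names assumed (they must match the secretKey values under the chart's metadata_service_authentication section):

apiVersion: v1
kind: Secret
metadata:
  name: datahub-auth-secrets
type: Opaque
stringData:
  token_service_signing_key: <random-string>   # assumed key name
  token_service_salt: <random-string>          # assumed key name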
bland-orange-13353
04/25/2023, 6:58 PM
bland-orange-13353
04/26/2023, 10:15 AM
early-kitchen-6639
04/26/2023, 11:23 AM
In the logs of datahub-gms, I see that it is able to connect to all the endpoints properly, but it is unable to resolve the Kafka broker hostnames. We have our Kafka running on EKS using the Strimzi operator, and I am providing the bootstrap server URL to datahub-gms.
All other pods in our EKS cluster are able to resolve the broker hostnames correctly, so the issue seems to be with the datahub-gms image. In fact, we use the same endpoint for Pinot tables as well. Has anyone faced this issue? Sharing the logs in the thread. Please help. Thanks!
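For reference, pointing the datahub chart at a Strimzi cluster usually comes down to the bootstrap service DNS name; a sketch assuming the chart's global.kafka.bootstrap.server key and Strimzi's <cluster>-kafka-bootstrap service naming (cluster name and namespace are illustrative):

global:
  kafka:
    bootstrap:
      server: "my-cluster-kafka-bootstrap.kafka.svc.cluster.local:9092"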
many-rocket-80549
04/26/2023, 2:51 PM
prehistoric-wall-71780
04/26/2023, 9:16 PM
datahub.ingestion.run.pipeline.PipelineInitError: Failed to find a registered source for type bigquery: 'str' object is not callable
bumpy-activity-74405
04/27/2023, 7:51 AM
Is it true that datahub-upgrade is needed to run datahub-gms? Some background:
I have been running DataHub on Kubernetes for ~2 years now; I don't use your provided helm charts. I simply have two pods, one for gms and one for frontend. ES and MySQL are not on k8s. For the longest time that's all I needed: both frontend and gms recovered after being restarted. But as of v0.10.1 (I think), gms just won't start if I don't run the upgrade container:
2023-04-27 06:41:59,065 [R2 Nio Event Loop-1-2] WARN c.l.r.t.h.c.c.ChannelPoolLifecycle:139 - Failed to create channel, remote=localhost/127.0.0.1:8080
I think it just gets stuck at this step:
2023-04-27 06:41:39,668 [main] INFO c.l.metadata.boot.BootstrapManager:33 - Executing bootstrap step 1/13 with name WaitForSystemUpdateStep...
I understand the need to reindex ES indices when a certain upgrade requires it (which I've done ad hoc when upgrading 0.9.6.1 -> 0.10.1), but what's the point of it outside of that? Is there any way to avoid having to run the upgrade each time gms starts?
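Since gms blocks on that WaitForSystemUpdateStep, something does have to run the system update after each version bump. Without the helm charts, that could be a Job run before gms starts; a minimal sketch (the image tag, ConfigMap name, and env wiring are illustrative, not from the thread):

apiVersion: batch/v1
kind: Job
metadata:
  name: datahub-system-update
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: system-update
          image: acryldata/datahub-upgrade:v0.10.1   # match the gms version
          args: ["-u", "SystemUpdate"]
          envFrom:
            - configMapRef:
                name: datahub-gms-env   # hypothetical; reuse the same ES/Kafka/MySQL env as gms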
wonderful-wall-76801
04/27/2023, 10:33 AM
Can someone explain what this message means:
2023-04-27 10:24:44,740 [I/O dispatcher 1] INFO c.l.m.s.e.update.BulkListener:47 - Successfully fed bulk request. Number of events: 6 Took time ms: -1
and why I see this string 5 times per second?
I think this message is telling me about some delay (or something like that) affecting other tasks, such as creating a term in the glossary or changing some permissions in settings.
When I try to do these steps, I'm facing the following problem:
2023-04-27 08:08:04,881 [I/O dispatcher 1] ERROR c.l.m.s.e.update.BulkListener:44 - Failed to feed bulk request. Number of events: 7 Took time ms: -1 Message: failure in bulk execution:
[0]: index [glossarytermindex_v2_1682499154226], type [_doc], id [urn%3Ali%3AglossaryTerm%3A15618cc3-f96d-4f77-94d7-b6adb9e02ba8], message [[glossarytermindex_v2_1682499154226/vbvG5KPmSpiGQawAxCxalg][[glossarytermindex_v2_1682499154226][0]] ElasticsearchException[Elasticsearch exception [type=document_missing_exception, reason=[_doc][urn%3Ali%3AglossaryTerm%3A15618cc3-f96d-4f77-94d7-b6adb9e02ba8]: document missing]]]
great-branch-515
04/27/2023, 3:01 PM
wonderful-book-58712
04/27/2023, 3:54 PM
astonishing-byte-5433
04/28/2023, 12:48 PM
limited-forest-73733
04/30/2023, 4:46 PM
limited-forest-73733
04/30/2023, 4:46 PM
best-daybreak-64419
05/02/2023, 2:45 AM