# all-things-deployment

**big-carpet-38439** (06/14/2022, 4:37 PM)
@bitter-dog-24903 Let's chat more here!
Please paste the command + the failure you're seeing

**bitter-dog-24903** (06/14/2022, 4:38 PM)
Thanks, John.
So I am deploying DataHub on AWS using EKS, and I want to use AWS RDS for the MySQL database that DataHub requires.
Below is the list of steps I performed:
• Created a cluster using eksctl:
```
eksctl create cluster \
--name datahub \
--region us-east-1 \
--with-oidc \
--nodes=3
```
This created a VPC and deployed a Kubernetes cluster in AWS.
• Created an RDS instance in the same VPC and included all the security groups that were part of the DataHub VPC.
• Changed values.yaml for the prerequisites chart to disable the bundled MySQL:
```yaml
mysql:
  enabled: false
  auth:
    # For better security, add mysql-secrets k8s secret with mysql-root-password, mysql-replication-password and mysql-password
    existingSecret: mysql-secrets
```
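(As a quick sanity check that the cluster can actually reach the RDS instance, a one-off MySQL client pod works; a sketch, with the image tag, endpoint, and username as placeholders:)
```bash
# Hypothetical connectivity test from inside the cluster; replace <rds-endpoint> and <username>.
kubectl run mysql-client --rm -it --image=mysql:8 --restart=Never -- \
  mysql -h <rds-endpoint> -P 3306 -u <username> -p
```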

**big-carpet-38439** (06/14/2022, 4:42 PM)
Okay makes sense
And did you configure the values to talk to your new RDS?

**bitter-dog-24903** (06/14/2022, 4:43 PM)
Deployed the prerequisites using that values.yaml file.
Also changed the values.yaml file for the DataHub deployment to point it at the new RDS instance.
After that, deployed DataHub using:
```
helm install datahub datahub/datahub --values values.yaml --debug
```
Getting below error:
```
Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition
helm.go:84: [debug] failed pre-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
	helm.sh/helm/v3/cmd/helm/install.go:127
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.3.0/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.3.0/command.go:974
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.3.0/command.go:902
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
	runtime/proc.go:255
runtime.goexit
	runtime/asm_amd64.s:1581
```
@kind-dawn-17532 ^

**big-carpet-38439** (06/14/2022, 4:45 PM)
Okay got it - and what version of DataHub are you deploying? (Which chart version?)

**bitter-dog-24903** (06/14/2022, 4:45 PM)
datahub-0.2.81, datahub-prerequisites-0.0.6

**big-carpet-38439** (06/14/2022, 4:49 PM)
ok thank you! cc @early-lamp-41924 @kind-dawn-17532

**early-lamp-41924** (06/14/2022, 4:50 PM)
Could you run
```
kubectl get pods -n <<namespace>>
```
One of the setup jobs must be failing.

**bitter-dog-24903** (06/14/2022, 4:51 PM)
The mysql setup job is failing
```
NAME                                               READY   STATUS                       RESTARTS   AGE
datahub-elasticsearch-setup-job-x98wk              0/1     Completed                    0          52m
datahub-kafka-setup-job-48fzj                      0/1     Completed                    0          52m
datahub-mysql-setup-job-2t92q                      0/1     Error                        0          51m
datahub-mysql-setup-job-5swhg                      0/1     Error                        0          49m
datahub-mysql-setup-job-b8ln6                      0/1     Error                        0          46m
datahub-mysql-setup-job-bdn4d                      0/1     Error                        0          51m
datahub-mysql-setup-job-dk7zp                      0/1     Error                        0          51m
datahub-mysql-setup-job-n7mb6                      0/1     Error                        0          51m
datahub-mysql-setup-job-zb5p9                      0/1     Error                        0          50m
elasticsearch-master-0                             1/1     Running                      0          68m
elasticsearch-master-1                             1/1     Running                      0          68m
elasticsearch-master-2                             1/1     Running                      0          68m
prerequisites-cp-schema-registry-cf79bfccf-6t7nq   2/2     Running                      0          68m
prerequisites-kafka-0                              1/1     Running                      0          68m
prerequisites-neo4j-community-0                    0/1     CreateContainerConfigError   0          68m
prerequisites-zookeeper-0                          1/1     Running                      0          68m
```

**early-lamp-41924** (06/14/2022, 5:08 PM)
Can you post the logs?

**bitter-dog-24903** (06/14/2022, 5:23 PM)
You mean the Kubernetes logs?
```
helm install datahub datahub/datahub --values values.yaml --debug
install.go:178: [debug] Original chart version: ""
install.go:195: [debug] CHART PATH: /Users/ronakshah/Library/Caches/helm/repository/datahub-0.2.81.tgz

client.go:299: [debug] Starting delete for "datahub-elasticsearch-setup-job" Job
client.go:128: [debug] creating 1 resource(s)
client.go:529: [debug] Watching for changes to Job datahub-elasticsearch-setup-job with timeout of 5m0s
client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: ADDED
client.go:596: [debug] datahub-elasticsearch-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: MODIFIED
client.go:299: [debug] Starting delete for "datahub-kafka-setup-job" Job
client.go:128: [debug] creating 1 resource(s)
client.go:529: [debug] Watching for changes to Job datahub-kafka-setup-job with timeout of 5m0s
client.go:557: [debug] Add/Modify event for datahub-kafka-setup-job: ADDED
client.go:596: [debug] datahub-kafka-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-kafka-setup-job: MODIFIED
client.go:299: [debug] Starting delete for "datahub-mysql-setup-job" Job
client.go:128: [debug] creating 1 resource(s)
client.go:529: [debug] Watching for changes to Job datahub-mysql-setup-job with timeout of 5m0s
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: ADDED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 1, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 2, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 3, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 4, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 5, jobs succeeded: 0
Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition
helm.go:84: [debug] failed pre-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
	helm.sh/helm/v3/cmd/helm/install.go:127
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.3.0/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.3.0/command.go:974
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.3.0/command.go:902
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
	runtime/proc.go:255
runtime.goexit
	runtime/asm_amd64.s:1581
```

**big-carpet-38439** (06/14/2022, 5:34 PM)
Failed on pre-install... so the mysql-setup configuration must not be correct. You mentioned that the RDS is in the same VPC; is it in the same subnet?

**bitter-dog-24903** (06/14/2022, 5:41 PM)
Yes, it is in the same subnet as the DataHub cluster.

**early-lamp-41924** (06/14/2022, 5:42 PM)
Yes, the Kubernetes logs.
The logs of the failed mysql-setup pods.

**bitter-dog-24903** (06/14/2022, 5:48 PM)
@early-lamp-41924 Posted the logs from the deployment above ^

**early-lamp-41924** (06/14/2022, 5:49 PM)
Those are logs from Helm? You can get the pod logs via
```
kubectl logs datahub-mysql-setup-job-2t92q -n <<namespace>>
```
i.e.
```
kubectl logs <<pod-name>> -n <<namespace>>
```
I just copied one of the mysql-setup job pod names from above.

**bitter-dog-24903** (06/14/2022, 5:50 PM)
Ohh got it.. getting it. Thanks
```
2022/06/14 17:46:54 Waiting for: tcp://datahub.cluster-cevhj.us-east-1.rds.amazonaws.com:3306
2022/06/14 17:46:54 Connected to tcp://datahub.cluster-cevhj.us-east-1.rds.amazonaws.com:3306
-- create datahub database
CREATE DATABASE IF NOT EXISTS datahub CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
USE datahub;

-- create metadata aspect table
create table if not exists metadata_aspect_v2 (
 urn              varchar(500) not null,
 aspect            varchar(200) not null,
 version            bigint(20) not null,
 metadata           longtext not null,
 systemmetadata        longtext,
 createdon           datetime(6) not null,
 createdby           varchar(255) not null,
 createdfor          varchar(255),
 constraint pk_metadata_aspect_v2 primary key (urn,aspect,version)
);

-- create default records for datahub user if not exists
CREATE TABLE temp_metadata_aspect_v2 LIKE metadata_aspect_v2;
INSERT INTO temp_metadata_aspect_v2 (urn, aspect, version, metadata, createdon, createdby) VALUES(
 'urn:li:corpuser:datahub',
 'corpUserInfo',
 0,
 '{"displayName":"Data Hub","active":true,"fullName":"Data Hub","email":"<mailto:datahub@linkedin.com|datahub@linkedin.com>"}',
 now(),
 'urn:li:corpuser:__datahub_system'
), (
 'urn:li:corpuser:datahub',
 'corpUserEditableInfo',
 0,
 '{"skills":[],"teams":[],"pictureLink":"<https://raw.githubusercontent.com/linkedin/datahub/master/datahub-web-react/src/images/default_avatar.png>"}',
 now(),
 'urn:li:corpuser:__datahub_system'
);
-- only add default records if metadata_aspect is empty
INSERT INTO metadata_aspect_v2
SELECT * FROM temp_metadata_aspect_v2
WHERE NOT EXISTS (SELECT * from metadata_aspect_v2);
DROP TABLE temp_metadata_aspect_v2;

-- create metadata index table
CREATE TABLE IF NOT EXISTS metadata_index (
 `id` BIGINT NOT NULL AUTO_INCREMENT,
 `urn` VARCHAR(200) NOT NULL,
 `aspect` VARCHAR(150) NOT NULL,
 `path` VARCHAR(150) NOT NULL,
 `longVal` BIGINT,
 `stringVal` VARCHAR(200),
 `doubleVal` DOUBLE,
 CONSTRAINT id_pk PRIMARY KEY (id),
 INDEX longIndex (`urn`,`aspect`,`path`,`longVal`),
 INDEX stringIndex (`urn`,`aspect`,`path`,`stringVal`),
 INDEX doubleIndex (`urn`,`aspect`,`path`,`doubleVal`)
);
ERROR 1045 (28000): Access denied for user 'root'@'192.168.60.40' (using password: YES)
2022/06/14 17:46:54 Command exited with error: exit status 1
```

**early-lamp-41924** (06/14/2022, 5:55 PM)
Access denied?

**bitter-dog-24903** (06/14/2022, 5:56 PM)
Looks like it's able to connect to the database, but does not have root access?
I am setting up the access password using:
```
kubectl create secret generic mysql-secrets --from-literal=mysql-root-password=<<password>>
```
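(To double-check what the setup job will actually read from that secret, something like this works, assuming the `mysql-root-password` key name used above:)
```bash
# Decode the password stored in the mysql-secrets secret.
kubectl get secret mysql-secrets -o jsonpath='{.data.mysql-root-password}' | base64 -d
```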

**early-lamp-41924** (06/14/2022, 5:58 PM)
oh one thing
are you using RDS?

**bitter-dog-24903** (06/14/2022, 5:58 PM)
Yes, I am using RDS

**early-lamp-41924** (06/14/2022, 5:58 PM)
there the username should be “admin”
not root

**big-carpet-38439** (06/14/2022, 5:58 PM)
is this an RDS thing?

**early-lamp-41924** (06/14/2022, 5:58 PM)
at least that is the case for our dbs. can you check?

**bitter-dog-24903** (06/14/2022, 5:59 PM)
I have the username set as datahub in RDS.
Should I change it to root?

**early-lamp-41924** (06/14/2022, 5:59 PM)
Are you setting that anywhere? We default the username to root; you should change it to datahub.
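(For reference, the chart-side setting typically lives under `global.sql.datasource` in the DataHub values.yaml; a sketch assuming recent datahub-helm key names, so verify against your chart version:)
```yaml
global:
  sql:
    datasource:
      host: "<rds-endpoint>:3306"          # placeholder RDS endpoint
      url: "jdbc:mysql://<rds-endpoint>:3306/datahub?verifyServerCertificate=false&useSSL=true"
      username: "datahub"                  # must match the RDS master username
      password:
        secretRef: mysql-secrets           # the k8s secret created earlier
        secretKey: mysql-root-password
```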

**bitter-dog-24903** (06/14/2022, 6:00 PM)
Ohhh, got it. Let me check.
Thanks a lot. Looks like the mysql setup job completed.
It is failing on the post-install now.

**early-lamp-41924** (06/14/2022, 6:14 PM)
That is fine; it will succeed eventually, once GMS is live.

**bitter-dog-24903** (06/14/2022, 6:15 PM)
```
client.go:557: [debug] Add/Modify event for datahub-datahub-upgrade-job: ADDED
client.go:596: [debug] datahub-datahub-upgrade-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Error: INSTALLATION FAILED: failed post-install: timed out waiting for the condition
helm.go:84: [debug] failed post-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
	helm.sh/helm/v3/cmd/helm/install.go:127
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.3.0/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.3.0/command.go:974
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.3.0/command.go:902
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
	runtime/proc.go:255
runtime.goexit
	runtime/asm_amd64.s:1581
```

**early-lamp-41924** (06/14/2022, 6:15 PM)
get pods?

**bitter-dog-24903** (06/14/2022, 6:15 PM)
```
NAME                                               READY   STATUS                       RESTARTS   AGE
datahub-acryl-datahub-actions-c8868cdf6-tgtj5      1/1     Running                      2          10m
datahub-datahub-frontend-8448f49655-pclfn          1/1     Running                      0          10m
datahub-datahub-gms-87f49d87b-fj4l4                0/1     CreateContainerConfigError   0          10m
datahub-datahub-upgrade-job-qvqjj                  0/1     CreateContainerConfigError   0          10m
datahub-elasticsearch-setup-job-ggnhd              0/1     Completed                    0          11m
datahub-kafka-setup-job-g5vmx                      0/1     Completed                    0          11m
datahub-mysql-setup-job-s2plt                      0/1     Completed                    0          11m
elasticsearch-master-0                             1/1     Running                      0          150m
elasticsearch-master-1                             1/1     Running                      0          150m
elasticsearch-master-2                             1/1     Running                      0          150m
prerequisites-cp-schema-registry-cf79bfccf-6t7nq   2/2     Running                      0          150m
prerequisites-kafka-0                              1/1     Running                      0          150m
prerequisites-neo4j-community-0                    0/1     CreateContainerConfigError   0          150m
prerequisites-zookeeper-0                          1/1     Running                      0          150m
```

**early-lamp-41924** (06/14/2022, 6:15 PM)
hmn, `CreateContainerConfigError`?
Can you run
```
kubectl describe pod <<pod-name>> -n <<namespace>>
```

**kind-dawn-17532** (06/14/2022, 6:16 PM)
Sorry, I got pulled into something. Are we good here, or should I add my team member who did all the Helm deployments for us?

**bitter-dog-24903** (06/14/2022, 6:18 PM)
```
kubectl logs datahub-datahub-upgrade-job-qvqjj
Error from server (BadRequest): container "datahub-upgrade-job" in pod "datahub-datahub-upgrade-job-qvqjj" is waiting to start: CreateContainerConfigError
```
Hi Atul, thanks for checking. The RDS issue got resolved; looking into the post-install issue now.

**early-lamp-41924** (06/14/2022, 6:19 PM)
describe pod please

**bitter-dog-24903** (06/14/2022, 6:20 PM)
```
Name:     datahub-datahub-upgrade-job-qvqjj
Namespace:  default
Priority:   0
Node:     ip-192-168-45-36.ec2.internal/192.168.45.36
Start Time:  Tue, 14 Jun 2022 11:02:14 -0700
Labels:    controller-uid=dac86aae-e63c-436d-9c87-5cbc7aadf9ea
       job-name=datahub-datahub-upgrade-job
Annotations: kubernetes.io/psp: eks.privileged
Status:    Pending
IP:      192.168.55.11
IPs:
 IP:      192.168.55.11
Controlled By: Job/datahub-datahub-upgrade-job
Containers:
 datahub-upgrade-job:
  Container ID:  
  Image:     acryldata/datahub-upgrade:v0.8.31
  Image ID:    
  Port:     <none>
  Host Port:   <none>
  Args:
   -u
   NoCodeDataMigration
   -a
   batchSize=1000
   -a
   batchDelayMs=100
   -a
   dbType=MYSQL
  State:     Waiting
   Reason:    CreateContainerConfigError
  Ready:     False
  Restart Count: 0
  Limits:
   cpu:   500m
   memory: 512Mi
  Requests:
   cpu:   300m
   memory: 256Mi
  Environment:
   ENTITY_REGISTRY_CONFIG_PATH: /datahub/datahub-gms/resources/entity-registry.yml
   DATAHUB_GMS_HOST:       datahub-datahub-gms
   DATAHUB_GMS_PORT:       8080
   DATAHUB_MAE_CONSUMER_HOST:  datahub-datahub-mae-consumer
   DATAHUB_MAE_CONSUMER_PORT:  9091
   EBEAN_DATASOURCE_USERNAME:  datahub
   EBEAN_DATASOURCE_PASSWORD:  <set to the key 'mysql-root-password' in secret 'mysql-secrets'> Optional: false
   EBEAN_DATASOURCE_HOST:    datahub.cluster-cevhjrrouwzn.us-east-1.rds.amazonaws.com:3306
   EBEAN_DATASOURCE_URL:     jdbc:mysql://datahub.cluster-cevhjrrouwzn.us-east-1.rds.amazonaws.com:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8
   EBEAN_DATASOURCE_DRIVER:   com.mysql.jdbc.Driver
   KAFKA_BOOTSTRAP_SERVER:    prerequisites-kafka:9092
   KAFKA_SCHEMAREGISTRY_URL:   http://prerequisites-cp-schema-registry:8081
   ELASTICSEARCH_HOST:      elasticsearch-master
   ELASTICSEARCH_PORT:      9200
   GRAPH_SERVICE_IMPL:      neo4j
   NEO4J_HOST:          prerequisites-neo4j-community:7474
   NEO4J_URI:          bolt://prerequisites-neo4j-community
   NEO4J_USERNAME:        neo4j
   NEO4J_PASSWORD:        <set to the key 'neo4j-password' in secret 'neo4j-secrets'> Optional: false
  Mounts:
   /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-54kpm (ro)
Conditions:
 Type       Status
 Initialized    True 
 Ready       False 
 ContainersReady  False 
 PodScheduled   True 
Volumes:
 kube-api-access-54kpm:
  Type:          Projected (a volume that contains injected data from multiple sources)
  TokenExpirationSeconds: 3607
  ConfigMapName:      kube-root-ca.crt
  ConfigMapOptional:    <nil>
  DownwardAPI:       true
QoS Class:          Burstable
Node-Selectors:       <none>
Tolerations:         node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
               node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
 Type   Reason   Age          From        Message
 ----   ------   ----         ----        -------
 Normal  Scheduled 17m          default-scheduler Successfully assigned default/datahub-datahub-upgrade-job-qvqjj to ip-192-168-45-36.ec2.internal
 Normal  Pulling  17m          kubelet      Pulling image "acryldata/datahub-upgrade:v0.8.31"
 Normal  Pulled   17m          kubelet      Successfully pulled image "acryldata/datahub-upgrade:v0.8.31" in 10.539330122s
 Warning Failed   15m (x12 over 17m)  kubelet      Error: secret "neo4j-secrets" not found
 Normal  Pulled   2m39s (x71 over 17m) kubelet      Container image "acryldata/datahub-upgrade:v0.8.31" already present on machine
```

**early-lamp-41924** (06/14/2022, 6:21 PM)
```
Error: secret "neo4j-secrets" not found
```
Are you trying to run with neo4j or with elasticsearch as graph backend?

**bitter-dog-24903** (06/14/2022, 6:22 PM)
I am not changing any default configuration for neo4j in values.yaml.
Running
```
kubectl create secret generic neo4j-secrets --from-literal=neo4j-password=datahub
```
again and trying.
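(Side note: if you would rather not run neo4j at all, the chart can use Elasticsearch as the graph backend instead; a sketch assuming the `global.graph_service_impl` value used by recent datahub-helm versions, so verify against your chart:)
```yaml
global:
  graph_service_impl: elasticsearch   # instead of the default neo4j; no neo4j-secrets needed then
```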

**big-carpet-38439** (06/14/2022, 6:23 PM)
@early-lamp-41924 He mentioned he was following the AWS deploy guide steps

**bitter-dog-24903** (06/14/2022, 6:53 PM)
The config error is now resolved.
Only the upgrade job is failing now.

**early-lamp-41924** (06/14/2022, 6:54 PM)
As I mentioned above, that should succeed once gms is live
it will retry until it succeeds

**bitter-dog-24903** (06/14/2022, 6:54 PM)
So just wait for some time and then rerun the upgrade?

**early-lamp-41924** (06/14/2022, 6:55 PM)
Kubernetes will automatically do it. Wait it out.
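(If you want to watch it converge, the job's pods carry the standard `job-name` label, visible in the describe output above:)
```bash
# Watch the upgrade job's pods until one completes.
kubectl get pods -l job-name=datahub-datahub-upgrade-job -w
```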

**bitter-dog-24903** (06/14/2022, 6:56 PM)
Ohh got it.. thanks a lot Dexter 🙏

**big-carpet-38439** (06/14/2022, 7:21 PM)
Working now? Thanks everyone.

**bitter-dog-24903** (06/14/2022, 8:22 PM)
It started working now 😀 Thank you..
While using AWS MSK as a dependency and trying to upgrade the DataHub deployment, it is failing with the below error:
```
[main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
[main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1655262435166, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
	at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:149)
	at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1655262435166, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited.
[main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
[main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
```
```
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Error while executing topic command : Call(callName=createTopics, deadlineMs=1655262505480, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
[2022-06-15 03:07:26,372] ERROR org.apache.kafka.common.errors.TimeoutException: Call(callName=createTopics, deadlineMs=1655262505480, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited. Call: createTopics
 (kafka.admin.TopicCommand$)
[2022-06-15 03:07:26,463] ERROR Uncaught exception in thread 'kafka-admin-client-thread | adminclient-1': (org.apache.kafka.common.utils.KafkaThread)
java.lang.OutOfMemoryError: Java heap space
	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
	at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
	at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
	at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:113)
	at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:452)
	at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:402)
	at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:674)
	at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:576)
	at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:561)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1333)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1264)
	at java.lang.Thread.run(Thread.java:750)
```
The Kafka nodes have 1000 GB of storage each.
@early-lamp-41924 Any suggestion on this? ^
nvm, it worked...

**big-carpet-38439** (06/17/2022, 3:11 PM)
Glad to hear it Ronak.

**careful-gigabyte-11162** (07/05/2022, 7:01 AM)
Hi @bitter-dog-24903, how did you solve it? I have allocated only 100 GB for the Kafka nodes.

**bitter-dog-24903** (07/07/2022, 12:10 AM)
Hi Siva, for me the issue was resolved by setting plaintext as the authentication type and using the plaintext endpoint in the values.yaml file to connect to the MSK (Kafka) cluster.
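(For reference, that amounts to pointing the chart at MSK's plaintext listener; a sketch assuming the `global.kafka.bootstrap.server` value in values.yaml, with a placeholder broker endpoint; MSK plaintext listeners are on port 9092, TLS on 9094:)
```yaml
global:
  kafka:
    bootstrap:
      server: "b-1.<msk-cluster>.kafka.us-east-1.amazonaws.com:9092"  # placeholder plaintext endpoint
```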
Hi @early-lamp-41924 @big-carpet-38439, I have deployed the https://github.com/datahub-project/datahub/blob/master/docker/quickstart/docker-compose-without-neo4j.quickstart.yml file on AWS ECS. When I try to log in to the DataHub UI, I get the error `Failed to log in! SyntaxError: Unexpected token < in JSON at position 1`. Is there any other compose file that I should be deploying, or any change to this file? The GMS container in AWS ECS is exiting with the error `Command exited with error: signal: killed`.

**famous-florist-7218** (08/01/2022, 7:35 AM)
Hi guys, I'm facing the same issue 😕
```
datahub-elastic-6cc45b5c4f-w2b6g                                 1/1     Running            0                2d8h
datahub-elasticsearch-setup-job--1-bb5n9                         0/1     Error              0                17m
datahub-elasticsearch-setup-job--1-fqlvq                         0/1     Error              0                21m
datahub-elasticsearch-setup-job--1-gxgdd                         0/1     Error              0                20m
datahub-elasticsearch-setup-job--1-jjl4w                         0/1     Error              0                21m
datahub-elasticsearch-setup-job--1-qd89w                         0/1     Error              0                12m
datahub-elasticsearch-setup-job--1-rtpgj                         0/1     Error              0                21m
datahub-elasticsearch-setup-job--1-wsksj                         0/1     Error              0                21m
datahub-kafka-789989768f-mhzw5                                   1/1     Running            0                2d8h
datahub-kafka-setup-job--1-6kqhf                                 0/1     Error              0                130m
datahub-kafka-setup-job--1-ggg6d                                 0/1     Error              0                134m
datahub-kafka-setup-job--1-q549v                                 0/1     Error              0                123m
datahub-kafka-setup-job--1-sfmgd                                 0/1     Error              0                128m
datahub-kafka-setup-job--1-svwsj                                 0/1     Error              0                133m
datahub-kafka-setup-job--1-t4vg6                                 0/1     Error              0                132m
datahub-kafka-setup-job--1-v5fzb                                 0/1     Error              0                117m
datahub-mysql-7cfd455897-dp4zb                                   1/1     Running            0                2d22h
datahub-zookeeper-569df875bd-7wgsj                               1/1     Running            0                2d8h
```

**bumpy-needle-3184** (08/01/2022, 9:07 AM)
Could you share more logs from the pod, and also describe the pod for which you are facing the issue?
```
kubectl logs <<pod-name>> -n <<namespace>>
kubectl describe pod <<pod-name>> -n <<namespace>>
```

**famous-florist-7218** (08/01/2022, 9:19 AM)
Hi @bumpy-needle-3184, I’ve solved this issue by setting a nodeSelector. Our EKS cluster mixed ARM and non-ARM nodes, and the ARM nodes are not compatible with the DataHub chart.
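(A sketch of that kind of pinning via the standard `kubernetes.io/arch` node label; the exact values.yaml location depends on which component you are scheduling, here assumed to be the `datahub-gms` section:)
```yaml
datahub-gms:
  nodeSelector:
    kubernetes.io/arch: amd64   # keep DataHub pods off the ARM nodes
```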

**bumpy-needle-3184** (08/01/2022, 9:19 AM)
good to know

**bitter-dog-24903** (10/03/2022, 9:25 PM)
Hello @early-lamp-41924, I am deploying DataHub and its components using Docker on AWS ECS. All the containers seem to be stable except datahub-gms and datahub-actions. The datahub-gms container logs have the below error:
```
Caused by: java.lang.IllegalStateException: Request cannot be executed; I/O reactor status: STOPPED
at org.apache.http.util.Asserts.check(Asserts.java:46)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.ensureRunning(CloseableHttpAsyncClientBase.java:90)
at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:123)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:255)
... 19 common frames omitted
21:11:26.190 [pool-6-thread-1] ERROR c.l.m.s.e.query.ESSearchDAO:72 - Search query failed
java.lang.RuntimeException: Request cannot be executed; I/O reactor status: STOPPED
at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:857)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:259)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:246)
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1613)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1583)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1553)
at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:1069)
at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.executeAndExtract(ESSearchDAO.java:60)
at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.search(ESSearchDAO.java:100)
at com.linkedin.metadata.search.elasticsearch.ElasticSearchService.search(ElasticSearchService.java:67)
at com.linkedin.entity.client.JavaEntityClient.search(JavaEntityClient.java:288)
at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:50)
at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:42)
at com.datahub.authorization.DataHubAuthorizer$PolicyRefreshRunnable.run(DataHubAuthorizer.java:222)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Request cannot be executed; I/O reactor status: STOPPED
at org.apache.http.util.Asserts.check(Asserts.java:46)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.ensureRunning(CloseableHttpAsyncClientBase.java:90)
at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:123)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:255)
... 19 common frames omitted
21:11:26.190 [pool-6-thread-1] ERROR c.d.authorization.DataHubAuthorizer:229 - Failed to retrieve policy urns! Skipping updating policy cache until next refresh. start: 0, count: 30
com.datahub.util.exception.ESQueryException: Search query failed:
at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.executeAndExtract(ESSearchDAO.java:73)
at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.search(ESSearchDAO.java:100)
at com.linkedin.metadata.search.elasticsearch.ElasticSearchService.search(ElasticSearchService.java:67)
at com.linkedin.entity.client.JavaEntityClient.search(JavaEntityClient.java:288)
at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:50)
at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:42)
at com.datahub.authorization.DataHubAuthorizer$PolicyRefreshRunnable.run(DataHubAuthorizer.java:222)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
```
And the datahub-actions container is failing with the below error:
```
2022/10/03 17:51:47 Received 503 from http://datahub-gms:8080/health. Sleeping 1s
2022/10/03 17:51:48 Received 503 from http://datahub-gms:8080/health. Sleeping 1s
2022/10/03 17:51:49 Received 503 from http://datahub-gms:8080/health. Sleeping 1s
2022/10/03 17:51:49 Timeout after 4m0s waiting on dependencies to become available: [http://datahub-gms:8080/health]
```
Can you please help with any suggestion on resolving this?