# all-things-deployment
b
@bitter-dog-24903 Let's chat more here!
Please paste the command + the failure you're seeing
b
Thanks John..
So I am deploying DataHub on AWS using EKS
And I want to use AWS RDS for the MySQL database that DataHub requires..
Below is the list of steps I performed:
• Created a cluster using eksctl:
eksctl create cluster \
--name datahub \
--region us-east-1 \
--with-oidc \
--nodes=3
This created a VPC and deployed a Kubernetes cluster in AWS
I created an RDS instance in the same VPC and attached all the security groups that were part of the DataHub VPC
Changed the values.yaml for the prerequisites chart to disable the bundled MySQL:
Copy code
mysql:
  enabled: false
  auth:
    # For better security, add mysql-secrets k8s secret with mysql-root-password, mysql-replication-password and mysql-password
    existingSecret: mysql-secrets
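(For context: pointing the main datahub chart at an external RDS instance typically means editing the global.sql.datasource block in its values.yaml. A sketch with a placeholder endpoint; verify the exact keys against your datahub-helm chart version:)
Copy code
global:
  sql:
    datasource:
      # RDS endpoint is a placeholder; use your instance's hostname
      host: "<<rds-endpoint>>:3306"
      hostForMysqlClient: "<<rds-endpoint>>"
      port: "3306"
      url: "jdbc:mysql://<<rds-endpoint>>:3306/datahub?verifyServerCertificate=false&useSSL=true"
      driver: "com.mysql.jdbc.Driver"
      username: "datahub"
      password:
        secretRef: mysql-secrets
        secretKey: mysql-root-password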
b
Okay makes sense
And did you configure the values to talk to your new RDS?
b
Deployed the prerequisites using that values.yaml file
Also changed the values.yaml file for datahub deployment to point it to the new RDS instance
After that, deployed DataHub using helm install datahub datahub/datahub --values values.yaml --debug
Getting below error:
Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition
helm.go:84: [debug] failed pre-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
helm.sh/helm/v3/cmd/helm/install.go:127
github.com/spf13/cobra.(*Command).execute
github.com/spf13/cobra@v1.3.0/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
github.com/spf13/cobra@v1.3.0/command.go:974
github.com/spf13/cobra.(*Command).Execute
github.com/spf13/cobra@v1.3.0/command.go:902
main.main
helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
runtime/proc.go:255
runtime.goexit
runtime/asm_amd64.s:1581
@kind-dawn-17532 ^
b
Okay got it - and what version of datahub are you deploying? (Which chart version)
b
datahub-0.2.81, datahub-prerequisites-0.0.6
b
ok thank you! cc @early-lamp-41924 @kind-dawn-17532
e
Could you run
Copy code
kubectl get pods -n <<namespace>>
one of the setup jobs must be failing
b
The mysql setup job is failing
Copy code
NAME                                               READY   STATUS                       RESTARTS   AGE
datahub-elasticsearch-setup-job-x98wk              0/1     Completed                    0          52m
datahub-kafka-setup-job-48fzj                      0/1     Completed                    0          52m
datahub-mysql-setup-job-2t92q                      0/1     Error                        0          51m
datahub-mysql-setup-job-5swhg                      0/1     Error                        0          49m
datahub-mysql-setup-job-b8ln6                      0/1     Error                        0          46m
datahub-mysql-setup-job-bdn4d                      0/1     Error                        0          51m
datahub-mysql-setup-job-dk7zp                      0/1     Error                        0          51m
datahub-mysql-setup-job-n7mb6                      0/1     Error                        0          51m
datahub-mysql-setup-job-zb5p9                      0/1     Error                        0          50m
elasticsearch-master-0                             1/1     Running                      0          68m
elasticsearch-master-1                             1/1     Running                      0          68m
elasticsearch-master-2                             1/1     Running                      0          68m
prerequisites-cp-schema-registry-cf79bfccf-6t7nq   2/2     Running                      0          68m
prerequisites-kafka-0                              1/1     Running                      0          68m
prerequisites-neo4j-community-0                    0/1     CreateContainerConfigError   0          68m
prerequisites-zookeeper-0                          1/1     Running                      0          68m
e
Can you post the logs?
b
You mean the kubernetes logs?
Copy code
helm install datahub datahub/datahub --values values.yaml --debug
install.go:178: [debug] Original chart version: ""
install.go:195: [debug] CHART PATH: /Users/ronakshah/Library/Caches/helm/repository/datahub-0.2.81.tgz

client.go:299: [debug] Starting delete for "datahub-elasticsearch-setup-job" Job
client.go:128: [debug] creating 1 resource(s)
client.go:529: [debug] Watching for changes to Job datahub-elasticsearch-setup-job with timeout of 5m0s
client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: ADDED
client.go:596: [debug] datahub-elasticsearch-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-elasticsearch-setup-job: MODIFIED
client.go:299: [debug] Starting delete for "datahub-kafka-setup-job" Job
client.go:128: [debug] creating 1 resource(s)
client.go:529: [debug] Watching for changes to Job datahub-kafka-setup-job with timeout of 5m0s
client.go:557: [debug] Add/Modify event for datahub-kafka-setup-job: ADDED
client.go:596: [debug] datahub-kafka-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-kafka-setup-job: MODIFIED
client.go:299: [debug] Starting delete for "datahub-mysql-setup-job" Job
client.go:128: [debug] creating 1 resource(s)
client.go:529: [debug] Watching for changes to Job datahub-mysql-setup-job with timeout of 5m0s
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: ADDED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 1, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 2, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 3, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 4, jobs succeeded: 0
client.go:557: [debug] Add/Modify event for datahub-mysql-setup-job: MODIFIED
client.go:596: [debug] datahub-mysql-setup-job: Jobs active: 1, jobs failed: 5, jobs succeeded: 0
Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition
helm.go:84: [debug] failed pre-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
	helm.sh/helm/v3/cmd/helm/install.go:127
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.3.0/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.3.0/command.go:974
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.3.0/command.go:902
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
	runtime/proc.go:255
runtime.goexit
	runtime/asm_amd64.s:1581
b
Failed on pre-install ...
So the mysql-setup configuration must not be correct. You mentioned that the RDS is in the same VPC; is it in the same subnet?
b
Yes, it is in the same subnet as the DataHub cluster
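(A quick way to sanity-check network reachability from inside the cluster is a throwaway MySQL client pod; a sketch, with the RDS endpoint as a placeholder:)
Copy code
kubectl run mysql-client --rm -it --restart=Never --image=mysql:8 -- \
  mysql -h <<rds-endpoint>> -u datahub -p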
e
Yes the kubernetes logs
The logs of the failed mysql-setup pods
b
@early-lamp-41924 Posted the logs from the deployment above ^
e
Those are logs from helm?
You can get it via
Copy code
kubectl logs datahub-mysql-setup-job-2t92q -n <<namespace>>
Copy code
kubectl logs <<pod-name>> -n <<namespace>>
Just copied one of the mysql setup job pod names from above
b
Ohh got it.. getting it. Thanks
Copy code
2022/06/14 17:46:54 Waiting for: tcp://datahub.cluster-cevhj.us-east-1.rds.amazonaws.com:3306
2022/06/14 17:46:54 Connected to tcp://datahub.cluster-cevhj.us-east-1.rds.amazonaws.com:3306
-- create datahub database
CREATE DATABASE IF NOT EXISTS datahub CHARACTER SET utf8mb4 COLLATE utf8mb4_bin;
USE datahub;

-- create metadata aspect table
create table if not exists metadata_aspect_v2 (
 urn              varchar(500) not null,
 aspect            varchar(200) not null,
 version            bigint(20) not null,
 metadata           longtext not null,
 systemmetadata        longtext,
 createdon           datetime(6) not null,
 createdby           varchar(255) not null,
 createdfor          varchar(255),
 constraint pk_metadata_aspect_v2 primary key (urn,aspect,version)
);

-- create default records for datahub user if not exists
CREATE TABLE temp_metadata_aspect_v2 LIKE metadata_aspect_v2;
INSERT INTO temp_metadata_aspect_v2 (urn, aspect, version, metadata, createdon, createdby) VALUES(
 'urn:li:corpuser:datahub',
 'corpUserInfo',
 0,
 '{"displayName":"Data Hub","active":true,"fullName":"Data Hub","email":"<mailto:datahub@linkedin.com|datahub@linkedin.com>"}',
 now(),
 'urn:li:corpuser:__datahub_system'
), (
 'urn:li:corpuser:datahub',
 'corpUserEditableInfo',
 0,
 '{"skills":[],"teams":[],"pictureLink":"<https://raw.githubusercontent.com/linkedin/datahub/master/datahub-web-react/src/images/default_avatar.png>"}',
 now(),
 'urn:li:corpuser:__datahub_system'
);
-- only add default records if metadata_aspect is empty
INSERT INTO metadata_aspect_v2
SELECT * FROM temp_metadata_aspect_v2
WHERE NOT EXISTS (SELECT * from metadata_aspect_v2);
DROP TABLE temp_metadata_aspect_v2;

-- create metadata index table
CREATE TABLE IF NOT EXISTS metadata_index (
 `id` BIGINT NOT NULL AUTO_INCREMENT,
 `urn` VARCHAR(200) NOT NULL,
 `aspect` VARCHAR(150) NOT NULL,
 `path` VARCHAR(150) NOT NULL,
 `longVal` BIGINT,
 `stringVal` VARCHAR(200),
 `doubleVal` DOUBLE,
 CONSTRAINT id_pk PRIMARY KEY (id),
 INDEX longIndex (`urn`,`aspect`,`path`,`longVal`),
 INDEX stringIndex (`urn`,`aspect`,`path`,`stringVal`),
 INDEX doubleIndex (`urn`,`aspect`,`path`,`doubleVal`)
);
ERROR 1045 (28000): Access denied for user 'root'@'192.168.60.40' (using password: YES)
2022/06/14 17:46:54 Command exited with error: exit status 1
e
Access denied?
b
Looks like it's able to connect to the database, but does not have root access?
I am setting up the access password using
kubectl create secret generic mysql-secrets --from-literal=mysql-root-password=<<password>>
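(To double-check what password the setup job will actually read, you can decode the secret; a sketch:)
Copy code
# prints the stored mysql-root-password value in plain text
kubectl get secret mysql-secrets -o jsonpath='{.data.mysql-root-password}' | base64 -d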
e
oh one thing
are you using RDS?
b
Yes, I am using RDS
e
there the username should be “admin”
not root
b
is this an RDS thing?
e
at least that is the case for our dbs. can you check?
b
I am having username as datahub in RDS
Should I change it to root?
e
are you setting that anywhere?
we default username to root
you should change it to datahub
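(For reference, that is the global.sql.datasource.username value in the datahub chart's values.yaml; a sketch, assuming the key layout from the chart:)
Copy code
global:
  sql:
    datasource:
      username: "datahub"   # must match the RDS master username; the chart defaults this to root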
b
Ohhh damn.. got it. let me check
Thanks a lot. Looks like the setup job for mysql completed
It is failing on the post-install
e
that is fine
it will succeed eventually
once gms is live
b
Copy code
client.go:557: [debug] Add/Modify event for datahub-datahub-upgrade-job: ADDED
client.go:596: [debug] datahub-datahub-upgrade-job: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
Error: INSTALLATION FAILED: failed post-install: timed out waiting for the condition
helm.go:84: [debug] failed post-install: timed out waiting for the condition
INSTALLATION FAILED
main.newInstallCmd.func2
	helm.sh/helm/v3/cmd/helm/install.go:127
github.com/spf13/cobra.(*Command).execute
	github.com/spf13/cobra@v1.3.0/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
	github.com/spf13/cobra@v1.3.0/command.go:974
github.com/spf13/cobra.(*Command).Execute
	github.com/spf13/cobra@v1.3.0/command.go:902
main.main
	helm.sh/helm/v3/cmd/helm/helm.go:83
runtime.main
	runtime/proc.go:255
runtime.goexit
	runtime/asm_amd64.s:1581
e
get pods?
b
Copy code
NAME                                               READY   STATUS                       RESTARTS   AGE
datahub-acryl-datahub-actions-c8868cdf6-tgtj5      1/1     Running                      2          10m
datahub-datahub-frontend-8448f49655-pclfn          1/1     Running                      0          10m
datahub-datahub-gms-87f49d87b-fj4l4                0/1     CreateContainerConfigError   0          10m
datahub-datahub-upgrade-job-qvqjj                  0/1     CreateContainerConfigError   0          10m
datahub-elasticsearch-setup-job-ggnhd              0/1     Completed                    0          11m
datahub-kafka-setup-job-g5vmx                      0/1     Completed                    0          11m
datahub-mysql-setup-job-s2plt                      0/1     Completed                    0          11m
elasticsearch-master-0                             1/1     Running                      0          150m
elasticsearch-master-1                             1/1     Running                      0          150m
elasticsearch-master-2                             1/1     Running                      0          150m
prerequisites-cp-schema-registry-cf79bfccf-6t7nq   2/2     Running                      0          150m
prerequisites-kafka-0                              1/1     Running                      0          150m
prerequisites-neo4j-community-0                    0/1     CreateContainerConfigError   0          150m
prerequisites-zookeeper-0                          1/1     Running                      0          150m
e
hmn
Copy code
CreateContainerConfigError
?
Can you run
Copy code
kubectl describe pod <<pod-name>> -n <<namespace>>
k
Sorry, I got pulled into something.. are we good here, or should I add my team member who did all the helm deployments for us?
b
Copy code
kubectl logs datahub-datahub-upgrade-job-qvqjj
Error from server (BadRequest): container "datahub-upgrade-job" in pod "datahub-datahub-upgrade-job-qvqjj" is waiting to start: CreateContainerConfigError
Hi Atul, Thanks for checking.. The RDS issue got resolved. Looking into the post-install issue now
e
describe pod please
b
Copy code
Name:     datahub-datahub-upgrade-job-qvqjj
Namespace:  default
Priority:   0
Node:     ip-192-168-45-36.ec2.internal/192.168.45.36
Start Time:  Tue, 14 Jun 2022 11:02:14 -0700
Labels:    controller-uid=dac86aae-e63c-436d-9c87-5cbc7aadf9ea
       job-name=datahub-datahub-upgrade-job
Annotations: kubernetes.io/psp: eks.privileged
Status:    Pending
IP:      192.168.55.11
IPs:
 IP:      192.168.55.11
Controlled By: Job/datahub-datahub-upgrade-job
Containers:
 datahub-upgrade-job:
  Container ID:  
  Image:     acryldata/datahub-upgrade:v0.8.31
  Image ID:    
  Port:     <none>
  Host Port:   <none>
  Args:
   -u
   NoCodeDataMigration
   -a
   batchSize=1000
   -a
   batchDelayMs=100
   -a
   dbType=MYSQL
  State:     Waiting
   Reason:    CreateContainerConfigError
  Ready:     False
  Restart Count: 0
  Limits:
   cpu:   500m
   memory: 512Mi
  Requests:
   cpu:   300m
   memory: 256Mi
  Environment:
   ENTITY_REGISTRY_CONFIG_PATH: /datahub/datahub-gms/resources/entity-registry.yml
   DATAHUB_GMS_HOST:       datahub-datahub-gms
   DATAHUB_GMS_PORT:       8080
   DATAHUB_MAE_CONSUMER_HOST:  datahub-datahub-mae-consumer
   DATAHUB_MAE_CONSUMER_PORT:  9091
   EBEAN_DATASOURCE_USERNAME:  datahub
   EBEAN_DATASOURCE_PASSWORD:  <set to the key 'mysql-root-password' in secret 'mysql-secrets'> Optional: false
   EBEAN_DATASOURCE_HOST:    datahub.cluster-cevhjrrouwzn.us-east-1.rds.amazonaws.com:3306
   EBEAN_DATASOURCE_URL:     jdbc:mysql://datahub.cluster-cevhjrrouwzn.us-east-1.rds.amazonaws.com:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8
   EBEAN_DATASOURCE_DRIVER:   com.mysql.jdbc.Driver
   KAFKA_BOOTSTRAP_SERVER:    prerequisites-kafka:9092
   KAFKA_SCHEMAREGISTRY_URL:   http://prerequisites-cp-schema-registry:8081
   ELASTICSEARCH_HOST:      elasticsearch-master
   ELASTICSEARCH_PORT:      9200
   GRAPH_SERVICE_IMPL:      neo4j
   NEO4J_HOST:          prerequisites-neo4j-community:7474
   NEO4J_URI:          bolt://prerequisites-neo4j-community
   NEO4J_USERNAME:        neo4j
   NEO4J_PASSWORD:        <set to the key 'neo4j-password' in secret 'neo4j-secrets'> Optional: false
  Mounts:
   /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-54kpm (ro)
Conditions:
 Type       Status
 Initialized    True 
 Ready       False 
 ContainersReady  False 
 PodScheduled   True 
Volumes:
 kube-api-access-54kpm:
  Type:          Projected (a volume that contains injected data from multiple sources)
  TokenExpirationSeconds: 3607
  ConfigMapName:      kube-root-ca.crt
  ConfigMapOptional:    <nil>
  DownwardAPI:       true
QoS Class:          Burstable
Node-Selectors:       <none>
Tolerations:         node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
               node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
 Type   Reason   Age          From        Message
 ----   ------   ----         ----        -------
 Normal  Scheduled 17m          default-scheduler Successfully assigned default/datahub-datahub-upgrade-job-qvqjj to ip-192-168-45-36.ec2.internal
 Normal  Pulling  17m          kubelet      Pulling image "acryldata/datahub-upgrade:v0.8.31"
 Normal  Pulled   17m          kubelet      Successfully pulled image "acryldata/datahub-upgrade:v0.8.31" in 10.539330122s
 Warning Failed   15m (x12 over 17m)  kubelet      Error: secret "neo4j-secrets" not found
 Normal  Pulled   2m39s (x71 over 17m) kubelet      Container image "acryldata/datahub-upgrade:v0.8.31" already present on machine
e
Copy code
Error: secret "neo4j-secrets" not found
Are you trying to run with neo4j or with elasticsearch as graph backend?
b
I am not changing any default configuration for neo4j in values.yaml
Running
kubectl create secret generic neo4j-secrets --from-literal=neo4j-password=datahub
again and trying
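(If you don't need Neo4j as the graph backend, the chart can also use Elasticsearch instead, which avoids needing neo4j-secrets at all; a sketch, assuming the datahub chart's graph_service_impl key:)
Copy code
global:
  # use Elasticsearch instead of Neo4j for the graph service
  graph_service_impl: elasticsearch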
b
@early-lamp-41924 He mentioned he was following the AWS deploy guide steps
b
The config error is now resolved..
Only the upgrade job is failing now
e
As I mentioned above, that should succeed once gms is live
it will retry until it succeeds
b
So just wait for some time and rerun the upgrade?
e
Kubernetes
will automatically do it
Wait it out
b
Ohh got it.. thanks a lot Dexter 🙏
b
Working now? Thanks everyone.
b
It started working now 😀 Thank you..
While using AWS MSK as a dependency and trying to upgrade the DataHub deployment, it is failing with the below error:
Copy code
[main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
[main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
java.util.concurrent.ExecutionException: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1655262435166, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
	at org.apache.kafka.common.internals.KafkaFutureImpl.wrapAndThrow(KafkaFutureImpl.java:45)
	at org.apache.kafka.common.internals.KafkaFutureImpl.access$000(KafkaFutureImpl.java:32)
	at org.apache.kafka.common.internals.KafkaFutureImpl$SingleWaiter.await(KafkaFutureImpl.java:89)
	at org.apache.kafka.common.internals.KafkaFutureImpl.get(KafkaFutureImpl.java:260)
	at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:149)
	at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
Caused by: org.apache.kafka.common.errors.TimeoutException: Call(callName=listNodes, deadlineMs=1655262435166, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited.
[main] INFO io.confluent.admin.utils.ClusterStatus - Expected 1 brokers but found only 0. Trying to query Kafka for metadata again ...
[main] ERROR io.confluent.admin.utils.ClusterStatus - Error while getting broker list.
Copy code
WARNING: Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Error while executing topic command : Call(callName=createTopics, deadlineMs=1655262505480, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
[2022-06-15 03:07:26,372] ERROR org.apache.kafka.common.errors.TimeoutException: Call(callName=createTopics, deadlineMs=1655262505480, tries=1, nextAllowedTryMs=-9223372036854775709) timed out at 9223372036854775807 after 1 attempt(s)
Caused by: org.apache.kafka.common.errors.TimeoutException: The AdminClient thread has exited. Call: createTopics
 (kafka.admin.TopicCommand$)
[2022-06-15 03:07:26,463] ERROR Uncaught exception in thread 'kafka-admin-client-thread | adminclient-1': (org.apache.kafka.common.utils.KafkaThread)
java.lang.OutOfMemoryError: Java heap space
	at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
	at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
	at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
	at org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:113)
	at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:452)
	at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:402)
	at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:674)
	at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:576)
	at org.apache.kafka.common.network.Selector.poll(Selector.java:481)
	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:561)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1333)
	at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1264)
	at java.lang.Thread.run(Thread.java:750)
Kafka nodes have 1000 GB of storage each
@early-lamp-41924 Any suggestion on this ^
nvm, It worked...
b
Glad to hear it Ronak.
c
Hi @bitter-dog-24903 How did you solve it? I have allocated only 100 GB for Kafka nodes.
b
Hi Siva, for me the issue was resolved by setting PlainText as the authentication type and using the plaintext endpoint in the values.yaml file to connect to the MSK (Kafka) cluster.
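(Concretely, that means pointing the chart's Kafka bootstrap at the MSK plaintext listener; a sketch with placeholder broker hostnames. MSK plaintext listeners are on port 9092, TLS listeners on 9094:)
Copy code
global:
  kafka:
    bootstrap:
      # comma-separated MSK plaintext broker endpoints
      server: "<<broker-1>>:9092,<<broker-2>>:9092"
The heap-space error above is a classic symptom of a plaintext client hitting a TLS listener: the client misreads the TLS handshake bytes as a huge message length and tries to allocate it, so it is usually not about broker storage size.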
Hi @early-lamp-41924 @big-carpet-38439, I have deployed the https://github.com/datahub-project/datahub/blob/master/docker/quickstart/docker-compose-without-neo4j.quickstart.yml file on AWS ECS. When I try to log in to the DataHub UI, I am getting the error:
Failed to log in! SyntaxError: Unexpected token < in JSON at position 1
Is there any other compose file that I should be deploying, or any change to this file? The GMS container in AWS ECS is exiting with the error:
Command exited with error: signal: killed
f
Hi guys, I'm facing the same issue 😕
Copy code
datahub-elastic-6cc45b5c4f-w2b6g                                 1/1     Running            0                2d8h
datahub-elasticsearch-setup-job--1-bb5n9                         0/1     Error              0                17m
datahub-elasticsearch-setup-job--1-fqlvq                         0/1     Error              0                21m
datahub-elasticsearch-setup-job--1-gxgdd                         0/1     Error              0                20m
datahub-elasticsearch-setup-job--1-jjl4w                         0/1     Error              0                21m
datahub-elasticsearch-setup-job--1-qd89w                         0/1     Error              0                12m
datahub-elasticsearch-setup-job--1-rtpgj                         0/1     Error              0                21m
datahub-elasticsearch-setup-job--1-wsksj                         0/1     Error              0                21m
datahub-kafka-789989768f-mhzw5                                   1/1     Running            0                2d8h
datahub-kafka-setup-job--1-6kqhf                                 0/1     Error              0                130m
datahub-kafka-setup-job--1-ggg6d                                 0/1     Error              0                134m
datahub-kafka-setup-job--1-q549v                                 0/1     Error              0                123m
datahub-kafka-setup-job--1-sfmgd                                 0/1     Error              0                128m
datahub-kafka-setup-job--1-svwsj                                 0/1     Error              0                133m
datahub-kafka-setup-job--1-t4vg6                                 0/1     Error              0                132m
datahub-kafka-setup-job--1-v5fzb                                 0/1     Error              0                117m
datahub-mysql-7cfd455897-dp4zb                                   1/1     Running            0                2d22h
datahub-zookeeper-569df875bd-7wgsj                               1/1     Running            0                2d8h
b
Could you share more logs from the pod, and also describe the pod for which you are facing the issue
Copy code
kubectl logs <<pod-name>> -n <<namespace>>
kubectl describe pod <<pod-name>> -n <<namespace>>
f
Hi @bumpy-needle-3184, I solved this issue by setting a nodeSelector. Because our EKS cluster mixed ARM and non-ARM nodes, the ARM nodes are not compatible with the DataHub chart.
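(For reference, that kind of pinning is a per-component nodeSelector in the chart's values.yaml, using the standard kubernetes.io/arch node label; a sketch for one component:)
Copy code
datahub-gms:
  nodeSelector:
    # schedule only onto x86_64 nodes
    kubernetes.io/arch: amd64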
b
good to know
b
Hello @early-lamp-41924 I am deploying DataHub and its components using Docker on AWS ECS. All the containers seem to be stable except datahub-gms and datahub-actions. The datahub-gms container logs have the below error:
Caused by: java.lang.IllegalStateException: Request cannot be executed; I/O reactor status: STOPPED
at org.apache.http.util.Asserts.check(Asserts.java:46)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.ensureRunning(CloseableHttpAsyncClientBase.java:90)
at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:123)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:255)
... 19 common frames omitted
21:11:26.190 [pool-6-thread-1] ERROR c.l.m.s.e.query.ESSearchDAO:72 - Search query failed
java.lang.RuntimeException: Request cannot be executed; I/O reactor status: STOPPED
at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:857)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:259)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:246)
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1613)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1583)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1553)
at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:1069)
at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.executeAndExtract(ESSearchDAO.java:60)
at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.search(ESSearchDAO.java:100)
at com.linkedin.metadata.search.elasticsearch.ElasticSearchService.search(ElasticSearchService.java:67)
at com.linkedin.entity.client.JavaEntityClient.search(JavaEntityClient.java:288)
at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:50)
at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:42)
at com.datahub.authorization.DataHubAuthorizer$PolicyRefreshRunnable.run(DataHubAuthorizer.java:222)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalStateException: Request cannot be executed; I/O reactor status: STOPPED
at org.apache.http.util.Asserts.check(Asserts.java:46)
at org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase.ensureRunning(CloseableHttpAsyncClientBase.java:90)
at org.apache.http.impl.nio.client.InternalHttpAsyncClient.execute(InternalHttpAsyncClient.java:123)
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:255)
... 19 common frames omitted
21:11:26.190 [pool-6-thread-1] ERROR c.d.authorization.DataHubAuthorizer:229 - Failed to retrieve policy urns! Skipping updating policy cache until next refresh. start: 0, count: 30
com.datahub.util.exception.ESQueryException: Search query failed:
at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.executeAndExtract(ESSearchDAO.java:73)
at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.search(ESSearchDAO.java:100)
at com.linkedin.metadata.search.elasticsearch.ElasticSearchService.search(ElasticSearchService.java:67)
at com.linkedin.entity.client.JavaEntityClient.search(JavaEntityClient.java:288)
at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:50)
at com.datahub.authorization.PolicyFetcher.fetchPolicies(PolicyFetcher.java:42)
at com.datahub.authorization.DataHubAuthorizer$PolicyRefreshRunnable.run(DataHubAuthorizer.java:222)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
And the datahub-actions container is failing with the below error:
2022/10/03 17:51:47 Received 503 from http://datahub-gms:8080/health. Sleeping 1s
2022/10/03 17:51:48 Received 503 from http://datahub-gms:8080/health. Sleeping 1s
2022/10/03 17:51:49 Received 503 from http://datahub-gms:8080/health. Sleeping 1s
2022/10/03 17:51:49 Timeout after 4m0s waiting on dependencies to become available: [http://datahub-gms:8080/health]
Can you please help with any suggestions on resolving this?