all-things-deployment
  • c

    chilly-library-82062

    11/16/2022, 11:57 AM
    Hello... Is there a way to enable Spring Actuator for datahub-gms?
  • f

    fresh-cricket-75926

    11/16/2022, 3:51 PM
    Hi everyone, I was trying to build the datahub-frontend Docker image, but I'm stuck at the step below where Gradle tries to build datahub-web-react and datahub-frontend. Any help would be much appreciated.
  • w

    witty-motorcycle-52108

    11/16/2022, 5:32 PM
    looking to confirm that all of the setup jobs (kafka, elasticsearch, postgres) are safe to run multiple times, so they can be run "on deploy" rather than manually when needed? what about the datahub-upgrade container? is it idempotent as well, or is it problematic to run multiple times?
  • w

    witty-motorcycle-52108

    11/16/2022, 5:36 PM
    also, does the elasticsearch client support IAM authentication? I see IAM auth mentioned in the Kafka docs, but not in the elasticsearch docs, so I wanted to double-check. I assume not, since it's not mentioned and I think the code only uses username/password.
  • m

    mysterious-motorcycle-80650

    11/16/2022, 6:03 PM
    hello. I have a problem with DataHub when I use AWS Kafka. It never finishes starting and the pod constantly restarts
  • g

    gentle-tailor-78929

    11/16/2022, 8:38 PM
    Hello, when deploying, I get this MySQL error:
    Waiting for: tcp://localhost:3306
    Problem with dial: dial tcp 127.0.0.1:3306: connect: connection refused. Sleeping 1s
    Any idea what could be wrong?
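    For reference, a hedged sketch of what usually fixes this when deploying via the Helm chart: the wait on localhost:3306 suggests the deployment is still pointing at a local MySQL instead of your database host. The key names below are assumptions based on the chart's values.yaml layout and the hostname is a placeholder; verify against your chart version.
    global:
      sql:
        datasource:
          # placeholder host; point this at your actual MySQL endpoint
          host: "my-mysql.example.internal:3306"
          hostForMysqlClient: "my-mysql.example.internal"
          port: "3306"
          url: "jdbc:mysql://my-mysql.example.internal:3306/datahub?verifyServerCertificate=false&useSSL=true"
          username: "datahub"
          password:
            secretRef: mysql-secrets
            secretKey: mysql-root-password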
  • p

    purple-forest-88570

    11/17/2022, 9:01 AM
    This is an inquiry about ingesting a file-based lineage and adding a custom model on k8s [DataHub version: 0.9.2.2]. Hello Team, I am trying to add a file-based lineage and a custom model. Everything works well with Docker on a single EC2 instance; however, it is a little more difficult to do on K8S. There are two problems I am running into.
    Problem 1) Ingesting metadata (file-based lineage) to GMS on k8s fails with the error msg "401 Client Error: Unauthorized for url". I set the sink address in the recipe yml file to datahub-datahub-gms's external IP, checked with the command [kubectl get services]. DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN are already set. Is there any suspicious point about this error?
    Problem 2) How can I place an unzipped custom model at "/etc/plugins/models/" in the GMS pod? The command below adds the yml file, but not libs/**.jar, to the configmap: "kubectl create configmap custom-model --from-file=~/.datahub/plugins/models/"
    Can anyone help me please?
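    For Problem 2, a minimal sketch of one way to get the model files into the GMS pod, assuming the datahub-gms sub-chart exposes extraVolumes/extraVolumeMounts (the frontend chart does, per other messages in this thread). Note that a ConfigMap created with --from-file only picks up regular files in that one directory, so the nested libs/**.jar files would need a separate volume or a custom image layer; names below are placeholders.
    datahub-gms:
      extraVolumes:
        - name: custom-model
          configMap:
            name: custom-model   # e.g. created with kubectl create configmap custom-model --from-file=...
      extraVolumeMounts:
        - name: custom-model
          mountPath: /etc/plugins/models/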
  • a

    acceptable-glass-96188

    11/17/2022, 11:24 PM
    We ran into this issue while working on DataHub v0.9.0. The images datahub-elasticsearch-setup, datahub-kafka-setup and datahub-gms were referenced with the “debug” tag in the docker compose files, but those tags do not exist in the public Docker registry. Because of this our DataHub instances are failing to come up. We changed the tag to “latest” for the time being to make it work, but we would like to understand why the debug tags are missing, when they will be created, and if they can't be created, can we do an upstream merge with the “latest” tag? Please let us know at the earliest. Thank you
  • r

    rich-van-74931

    11/18/2022, 10:14 AM
    hi everyone! I’m very interested in DataHub and have been testing it using the Docker quickstart, and now I wanted to deploy it on EKS, but I’m not able to install the prerequisites: all pods are pending except prerequisites-cp-schema-registry, which is always restarting, apparently because it cannot connect to Kafka (this pod is also pending). I’m using default values for everything and, after searching the channels for something related to this, I haven’t found a solution. I’ve tried increasing the EKS nodes as well, but with no luck… Hope someone can help with this… Thanks!
  • f

    fresh-cricket-75926

    11/18/2022, 11:19 AM
    Hi Folks, we are trying to deploy a Java keystore in datahub-frontend using the helm charts, but when we try to create the config and mount it via extraVolumes and extraEnvs in values.yml, the pod simply terminates with an "oops, can't start the server" message. So we decided to build the datahub-frontend docker image ourselves, but the build fails when gradle tries to build metadata-services and then at datahub-web-react:yarnBuild. Note: we are trying to build the Docker image in a github workflow. Any help would be appreciated here. Attached is the complete log.
  • r

    ripe-tailor-61058

    11/18/2022, 2:57 PM
    Good morning. Is there a way to change the default datahub admin password via the helm charts?
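    A hedged sketch of one common approach: put a user.props file in a Kubernetes Secret and mount it over the frontend's default one via the chart's extraVolumes/extraVolumeMounts (referenced elsewhere in this thread); the secret name and key below are placeholders.
    # kubectl create secret generic datahub-user-props --from-file=user.props
    datahub-frontend:
      extraVolumes:
        - name: user-props
          secret:
            secretName: datahub-user-props
      extraVolumeMounts:
        - name: user-props
          mountPath: /datahub-frontend/conf/user.props
          subPath: user.props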
  • f

    faint-tiger-13525

    11/21/2022, 8:15 AM
    Hello! Sorry if this is the wrong channel to ask. Maybe someone has configured an access policy for domains: I need to create a policy which allows users to change all objects belonging to some "Domain X", but at the same time allows these users to set a domain on entities which don't have any domain at the moment. Has anyone already solved such a challenge?
  • s

    shy-parrot-64120

    11/21/2022, 10:45 AM
    Hi dearest all, does anyone know about the 0.9.2.3 / 0.9.3 release dates? We need to plan our compatibility releases (Jinja2 stuff) due to contributions made into the main branch.
  • b

    blue-honey-61652

    11/21/2022, 12:13 PM
    Good morning, (I hope I am posting this in the right channel 😅) I am currently deploying DataHub on Kubernetes with the Helm chart and I need a bit of help with something I am trying to do, please. For security reasons I need to connect datahub-GMS to a specific VPN (behind which the data sources are exposed). To make the VPN connection work on Kubernetes I need to add the following security context to datahub-GMS:
    securityContext:
      capabilities:
        add: ["NET_ADMIN"]
    I didn't find anything in the helm chart's values.yaml that would allow me to do that. Is there a way to do that with the current helm chart? If the answer is no, do you have ideas for a workaround/alternative please ^^? Regards,
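    If the chart turns out not to expose a per-container securityContext, a possible workaround (a sketch, not chart-specific) is to patch the rendered GMS Deployment yourself, for example with a Kustomize patch applied through Helm's --post-renderer; the deployment and container names below assume the defaults seen elsewhere in this thread.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: datahub-datahub-gms
    spec:
      template:
        spec:
          containers:
            - name: datahub-gms
              securityContext:
                capabilities:
                  add: ["NET_ADMIN"]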
  • b

    bumpy-eye-36525

    11/22/2022, 4:22 AM
    Hello everyone, I am trying to set up DataHub via Helm with Confluent Kafka, but I get an error after deploying. I used security.protocol SSL. Details are below. Any help would be much appreciated.
    Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration (connecting to a SSL listener?) or broker version is < 0.10 (see api.version.request) (after 1ms in state APIVERSION_QUERY, 1 identical error(s) suppressed)
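    For reference, a hedged sketch of passing the SSL client settings through the chart's Spring Kafka overrides; the block name is taken from the chart's values.yaml as I understand it, and the store paths/passwords are placeholders that have to match whatever you mount into the pods.
    global:
      springKafkaConfigurationOverrides:
        security.protocol: SSL
        ssl.truststore.location: /mnt/certs/truststore.jks
        ssl.truststore.password: "<truststore-password>"
        ssl.keystore.location: /mnt/certs/keystore.jks
        ssl.keystore.password: "<keystore-password>"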
  • f

    fresh-cricket-75926

    11/22/2022, 12:12 PM
    Hello everyone, I am trying to build the datahub-frontend image using github-workflows, and every time the connection times out during "Task :datahub-web-react:yarnBuild". Error message: "The build failed because the process exited too early. This probably means the system ran out of memory or someone called kill -9 on the process. info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command. error Command failed with exit code 1." Any suggestion will be very helpful since we have been stuck on this issue for a few days. Attached are the docker build logs.
  • s

    silly-angle-91497

    11/22/2022, 6:28 PM
    Are there any examples for setting the values.yaml file when deploying datahub to AWS using MSK with IAM authentication?
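    A minimal sketch of the MSK IAM client properties, again via the chart's springKafkaConfigurationOverrides block (this assumes the aws-msk-iam-auth library is on the classpath of the DataHub images; the property names themselves are the standard MSK IAM client ones):
    global:
      springKafkaConfigurationOverrides:
        security.protocol: SASL_SSL
        sasl.mechanism: AWS_MSK_IAM
        sasl.jaas.config: software.amazon.msk.auth.iam.IAMLoginModule required;
        sasl.client.callback.handler.class: software.amazon.msk.auth.iam.IAMClientCallbackHandler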
  • g

    gentle-tailor-78929

    11/22/2022, 6:53 PM
    Hello, I am unable to get deployments working for the following containers (Broker, Kafka Setup, Schema Registry). It seems to be related to a SASL issue. What might I be missing? Thanks.
    Broker
    [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=localhost:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher@cc34f4d
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.common.X509Util - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 4194304 Bytes
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.ClientCnxn - zookeeper.request.timeout value is 0. feature enabled=
    [FATAL][2022-11-22 18:49:09 +0000]	      [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    [FATAL][2022-11-22 18:49:09 +0000]	      [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket error occurred: localhost/127.0.0.1:2181: Connection refused
    Kafka setup
    [kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.network.SaslChannelBuilder - [AdminClient clientId=adminclient-1] Failed to create channel due to
    [FATAL][2022-11-22 18:49:09 +0000]	      org.apache.kafka.common.errors.SaslAuthenticationException: Failed to configure SaslClientAuthenticator
    [FATAL][2022-11-22 18:49:09 +0000]	      Caused by: org.apache.kafka.common.KafkaException: Principal could not be determined from Subject, this may be a transient failure due to Kerberos re-login
    Schema registry
    [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=localhost:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher@cc34f4d
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.common.X509Util - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 4194304 Bytes
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.ClientCnxn - zookeeper.request.timeout value is 0. feature enabled=
    [FATAL][2022-11-22 18:49:09 +0000]	      [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    [FATAL][2022-11-22 18:49:09 +0000]	      [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket error occurred: localhost/127.0.0.1:2181: Connection refused
  • r

    refined-energy-76018

    11/23/2022, 10:04 AM
    Hi, when I upgrade the version for the setup jobs, it causes an error in Kubernetes upon deployment indicating that the image is immutable. This tells me that if I want to upgrade the version, I need to delete the completed but still-existing setup jobs of the previous version before deploying, or I have to give the setup jobs unique names for each version (not possible with the current Datahub helm chart templating). Wondering if anyone else has faced a similar issue and come up with a solution for it. It seems like my options are:
    • don't upgrade the version for the setup jobs after the first time (not sure if this is recommended, since I don't know if we're expected to run the setup jobs whenever the version is upgraded or just the first time you set up Datahub)
    • figure out a way to automatically delete the jobs between deployments
    • something else I haven't thought of (one possibility is sketched below)
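    One hedged workaround sketch: since Kubernetes Jobs are immutable once created, the setup jobs can be declared as Helm hooks with a delete policy so each upgrade recreates them. The annotations below are standard Helm hook annotations; whether the DataHub chart lets you attach them to its setup job templates is an assumption to verify.
    metadata:
      annotations:
        "helm.sh/hook": pre-install,pre-upgrade
        "helm.sh/hook-delete-policy": before-hook-creation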
  • m

    microscopic-mechanic-13766

    11/23/2022, 11:46 AM
    Good day everyone, quick question: has anyone tried, or does anyone know, whether it is possible to send validations from other sources to DataHub so that they are shown in the "Validations" tab? I am interested in this because the Great Expectations integration has some limitations (like not supporting the Spark execution engine for sources like Hive, the number of files that need to be created for a validation to occur, ...)
  • w

    witty-motorcycle-52108

    11/23/2022, 8:00 PM
    hey all, working through a deployment and was wondering what a few environment variables for the GMS container do. I looked through docs/issues and searched the code and this channel, but didn't see any obvious explainers, so hoping this isn't information I just missed somewhere.
    • DATASET_ENABLE_SCSI - no idea what this is or what it does
    • ENTITY_REGISTRY_CONFIG_PATH - this is a yaml file in the container; does it need to be persisted across restarts or is it effectively a cache? If it needs to be persisted, should it be shared across all containers if running replicas?
    • PE_CONSUMER_ENABLED - I believe PE means platform events from my searching, but I'm not quite sure what this enables/disables. Is this the variable we set to false when we are deploying MAE/MCE consumers separately?
    • MAE_CONSUMER_ENABLED - how does this work with PE_CONSUMER_ENABLED? Does it mean the MAE consumer should be started in GMS, and is not running separately?
    • MCE_CONSUMER_ENABLED - same question as above for MAE_CONSUMER_ENABLED
    ^ basically, given a deployment with distinct containers for GMS, MAE consumer, and MCE consumer, should the three above variables all be false? When GMS is running as a replicaset, do the replicas need the yaml file to persist like a daemonset would, or can they rebuild the file on server boot if not present? (A sketch of the split-consumer arrangement follows below.)
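    A hedged sketch (not tied to any particular chart layout) of the environment each container commonly ends up with when the MAE/MCE consumers run standalone; the defaults and the exact role of PE_CONSUMER_ENABLED should be confirmed against the docs for your version.
    gms:
      MAE_CONSUMER_ENABLED: "false"
      MCE_CONSUMER_ENABLED: "false"
    mae-consumer:
      MAE_CONSUMER_ENABLED: "true"
    mce-consumer:
      MCE_CONSUMER_ENABLED: "true"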
  • s

    shy-dog-84302

    11/24/2022, 12:56 PM
    Hi! We are evaluating DataHub for internal metadata management. I see that DataHub comes with a default ‘datahub’ user and password, and we have changed the default password for that user. I’m wondering if there are other security vulnerabilities in DataHub that we need to worry about before we push our business data onto it?
  • l

    limited-sundown-13355

    11/25/2022, 4:52 AM
    Hello All, I am trying to integrate DataHub & Great Expectations. I’ve installed the latest DataHub (v0.9.2) using GKE clusters on Google Cloud. The front-end web page runs fine on port 9002, but when I try to access the GMS service running on port 8080, either via the browser or when I pass assertions to DataHub from GE, I get the error in the thread below. I'd appreciate it if anyone can suggest how to fix it.
  • h

    hallowed-kilobyte-916

    11/25/2022, 8:53 AM
    I have installed datahub on an AWS Linux instance (CentOS) using the quickstart command. However, when I try logging in to the frontend, I get a "failed to login. An unexpected error occured." Upon digging around, I read that this error occurs when gms is unhealthy, and when I check, it is indeed unhealthy. This is the log: https://gist.github.com/theSekyi/d0f379e53e6a86160f006d87afdf6b8f
  • s

    shy-dog-84302

    11/25/2022, 11:30 AM
    Hi! I am facing an indentation issue with the lifecycle element in the datahub-frontend/deployment.yaml file. I provided a lifecycle element, as suggested in values.yaml, to edit the user.props file, since I could not do it with the traditional method of volumes and volumeMounts. Harness renders my deployment.yaml file as follows:
    lifecycle: map[postStart:map[exec:map[command:[/bin/sh -c echo "username:password" > /datahub-frontend/conf/user.props
    ]]]]
    I believe this is because the lifecycle element is not properly indented in the values.yaml file. My values override looks like this:
    lifecycle:
      postStart:
        exec:
          command:
            - /bin/sh
            - -c
            - |
              echo "username:password" > /datahub-frontend/conf/user.props
    Any clue what is going wrong here?
  • s

    strong-father-80629

    11/25/2022, 7:47 PM
    Anyone know where I can look into how to deploy datahub under a URL subpath? (eg: mydomain.com/datahub)
  • r

    red-waitress-53338

    11/28/2022, 4:08 AM
    Hi All, I have been stuck with this issue for quite a while now; any help will be highly appreciated. Is there a way to pass NULL as the DATAHUB_GMS_PORT value for the frontend application? I have deployed the GMS service on Google Cloud Run and it deployed successfully. I can see the correct output with curl https://google-cloudrun-gms-service/config, but it does not work when I do curl https://google-cloudrun-gms-service:8080/config, so I want to pass NULL as the DATAHUB_GMS_PORT value for the frontend application. Any ideas please??
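    A hedged alternative sketch: instead of a NULL port, point the frontend at the managed HTTPS endpoint on port 443. The env var names below are the ones the frontend image documents as far as I know, and the hostname is the placeholder from the message above.
    DATAHUB_GMS_HOST: google-cloudrun-gms-service
    DATAHUB_GMS_PORT: "443"
    DATAHUB_GMS_USE_SSL: "true"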
  • f

    famous-quill-82626

    11/28/2022, 4:54 AM
    I have DataHub installed and working locally on Minikube, but am unable to generate auth tokens. i.e. I see the message:
    Token based authentication is currently disabled. Contact your DataHub administrator to enable this feature.
    The instructions were to change the auth setting in the yaml file, setting metadata_service_authentication from false to true, and then redeploy. I solely made the following change in the file:
    global:
      ...
      metadata_service_authentication:
        # enabled: false
        enabled: true
        systemClientId: "__datahub_system"
        systemClientSecret:
        ...
    .. however, now my pods for frontend and gms will not start:
    NAMESPACE       NAME                                               READY   STATUS                       RESTARTS        AGE
    datahub-local   datahub-datahub-frontend-7c7bbd5d64-4fr99          0/1     CreateContainerConfigError   0               3m42s
    datahub-local   datahub-datahub-gms-5c5c549c7d-n6k88               0/1     CreateContainerConfigError   0               3m42s
    datahub-local   datahub-elasticsearch-setup-job-tkffw              0/1     Completed                    0               6m15s
    datahub-local   datahub-kafka-setup-job-45scn                      0/1     Completed                    0               6m8s
    datahub-local   datahub-mysql-setup-job-stwhn                      0/1     Completed                    0               3m47s
    datahub-local   elasticsearch-master-0                             1/1     Running                      0               9m46s
    datahub-local   prerequisites-cp-schema-registry-b55968566-zf5lm   2/2     Running                      0               9m46s
    datahub-local   prerequisites-kafka-0                              1/1     Running                      4 (6m58s ago)   9m46s
    datahub-local   prerequisites-mysql-0                              1/1     Running                      0               9m46s
    datahub-local   prerequisites-neo4j-community-0                    1/1     Running                      0               9m46s
    datahub-local   prerequisites-zookeeper-0                          1/1     Running                      0               9m46s
    .. any help would be much appreciated. Pete
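    A hedged note on the CreateContainerConfigError: that status usually means a referenced Secret or ConfigMap is missing, and when metadata_service_authentication is enabled the chart expects a secret holding the system client secret, along these lines (key and secret names assumed; the referenced secret has to exist or be provisioned by the chart):
    metadata_service_authentication:
      enabled: true
      systemClientId: "__datahub_system"
      systemClientSecret:
        secretRef: "datahub-auth-secrets"
        secretKey: "system_client_secret"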
  • c

    clever-lamp-13963

    11/28/2022, 9:58 AM
    Any documentation on how to deploy Aspects in an environment created via docker quickstart?