# all-things-deployment
  • w

    witty-motorcycle-52108

    11/16/2022, 5:36 PM
    also, does the elasticsearch client support IAM authentication? I see IAM auth mentioned in the Kafka docs, but not in the elasticsearch docs, so wanted to double check. I assume not since it's not mentioned and i think the code only uses username/password, but wanted to double check
  • m

    mysterious-motorcycle-80650

    11/16/2022, 6:03 PM
    Hello. I have a problem with DataHub when I use AWS Kafka: startup never completes and the pod constantly restarts.
  • g

    gentle-tailor-78929

    11/16/2022, 8:38 PM
    Hello, when deploying, I get this MySQL error:
    Waiting for: <tcp://localhost:3306>
    Problem with dial: dial tcp 127.0.0.1:3306: connect: connection refused. Sleeping 1s
    Any idea what could be wrong?
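    The "Waiting for" line in the log is a TCP reachability check against 127.0.0.1:3306. A minimal sketch of the same kind of check, using only the Python standard library, can help confirm whether anything is listening at that address from where the check runs. The function name `can_connect` is hypothetical. If the check runs inside a container, localhost refers to that container itself rather than to a separate MySQL service.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError)
        # when nothing is listening or the host is unreachable.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address taken from the log line in the message above.
    print(can_connect("127.0.0.1", 3306))
```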
  • b

    bland-orange-13353

    11/17/2022, 9:01 AM
    This message was deleted.
  • a

    acceptable-glass-96188

    11/17/2022, 11:24 PM
    We ran into this issue while working on DataHub v0.9.0. The images datahub-elasticsearch-setup, datahub-kafka-setup, and datahub-gms are referenced with the “debug” tag in the docker compose files, but those tags do not exist in the public Docker registry, so our DataHub instances fail to come up. We changed the tag to “latest” for the time being to make it work, but we would like to understand why the debug tags are missing, when they will be created, and, if they can't be created, whether we can do an upstream merge with the “latest” tag. Please let us know at the earliest. Thank you
  • r

    rich-van-74931

    11/18/2022, 10:14 AM
    hi everyone! I’m very interested in DataHub and have been testing using the Docker quickstart and now I wanted to deploy it on EKS, but I’m not able to install the prerequisites, all pods are pending except prerequisites-cp-schema-registry that is always restarting, apparently because it cannot connect to Kafka (this pod is also pending). I’m using default values for everything and after searching in the channels for something related to this, I haven’t found a solution. I’ve tried to increase the EKS nodes also, but with no luck… Hope someone can help with this… Thanks!
  • f

    fresh-cricket-75926

    11/18/2022, 11:19 AM
    Hi Folks, we are trying to deploy a Java keystore in datahub-frontend using the helm charts, but when we tried to create the config and mount extraVolumes and extraEnvs in values.yml, the pod simply terminates with an "oops cant start the server" message. So we decided to build the datahub-frontend docker image ourselves, but the build fails when gradle tries to build metadata-services and then at datahub-web-react:yarnBuild. Note: we are building the Docker image in a GitHub workflow. Any help would be appreciated here. The complete log is attached.
    log.txt
  • r

    ripe-tailor-61058

    11/18/2022, 2:57 PM
    Good morning. Is there a way to change the default datahub admin password via the helm charts?
  • f

    faint-tiger-13525

    11/21/2022, 8:15 AM
    Hello! Sorry if this is the wrong channel to ask. Has anyone configured an access policy for domains? I need to create a policy that allows users to change all objects belonging to some "Domain X" and, at the same time, allows these users to set a domain on entities that don't have any domain at the moment. Maybe someone has already solved such a challenge?
  • s

    shy-parrot-64120

    11/21/2022, 10:45 AM
    Hi dearest all does anyone know abt 0.9.2.3 / 0.9.3 release dates? need to plan our compatibility releases (Jinja2 stuff) due to contributions made into main branch
  • b

    blue-honey-61652

    11/21/2022, 12:13 PM
    Good morning, (I hope I am posting this in the right channel 😅) I am currently deploying datahub on Kubernetes with the helm chart and I need a bit of help on something I am trying to do please. For security reasons I need to connect the datahub-GMS to a specific VPN (behind which the data sources are exposed). To make the VPN connection work on Kubernetes I need to add the following security context to the datahub-GMS:
    securityContext:
      capabilities:
        add: ["NET_ADMIN"]
    I didn't find anything in the helm chart's values.yaml that would allow me to do that. Is there a way to do that with the current helm chart ? If the answer is no do you have ideas for workaround/alternative please ^^ ? Regards,
  • b

    bumpy-eye-36525

    11/22/2022, 4:22 AM
    Hello everyone, I am trying to set up the datahub helm chart with Confluent Kafka, but I get an error after deploying. I used security.protocol SSL. Details are below. Any help would be much appreciated.
    Disconnected while requesting ApiVersion: might be caused by incorrect security.protocol configuration (connecting to a SSL listener?) or broker version is < 0.10 (see api.version.request) (after 1ms in state APIVERSION_QUERY, 1 identical error(s) suppressed)
  • f

    fresh-cricket-75926

    11/22/2022, 12:12 PM
    Hello everyone, I am trying to build the datahub-frontend image using GitHub workflows, and every time the connection times out during "Task datahub web reactyarnBuild". Error message: "The build failed because the process exited too early. This probably means the system ran out of memory or someone called kill -9 on the process. info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command. error Command failed with exit code 1." Any suggestion will be very helpful since we have been stuck on this issue for a few days. The docker build logs are attached.
    build.txt
  • s

    silly-angle-91497

    11/22/2022, 6:28 PM
    Are there any examples for setting the values.yaml file when deploying datahub to AWS using MSK with IAM authentication?
  • g

    gentle-tailor-78929

    11/22/2022, 6:53 PM
    Hello, I am unable to get deployments working for the following containers (Broker, Kafka Setup, Schema Registry). It seems to be related to an SASL issue. What might I be missing? Thanks.
    Broker
    [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=localhost:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher@cc34f4d
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.common.X509Util - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 4194304 Bytes
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.ClientCnxn - zookeeper.request.timeout value is 0. feature enabled=
    [FATAL][2022-11-22 18:49:09 +0000]	      [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    [FATAL][2022-11-22 18:49:09 +0000]	      [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket error occurred: localhost/127.0.0.1:2181: Connection refused
    Kafka setup
    [kafka-admin-client-thread | adminclient-1] INFO org.apache.kafka.common.network.SaslChannelBuilder - [AdminClient clientId=adminclient-1] Failed to create channel due to
    [FATAL][2022-11-22 18:49:09 +0000]	      org.apache.kafka.common.errors.SaslAuthenticationException: Failed to configure SaslClientAuthenticator
    [FATAL][2022-11-22 18:49:09 +0000]	      Caused by: org.apache.kafka.common.KafkaException: Principal could not be determined from Subject, this may be a transient failure due to Kerberos re-login
    Schema registry
    [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client connection, connectString=localhost:2181 sessionTimeout=40000 watcher=io.confluent.admin.utils.ZookeeperConnectionWatcher@cc34f4d
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.common.X509Util - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 4194304 Bytes
    [FATAL][2022-11-22 18:49:09 +0000]	      [main] INFO org.apache.zookeeper.ClientCnxn - zookeeper.request.timeout value is 0. feature enabled=
    [FATAL][2022-11-22 18:49:09 +0000]	      [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server localhost/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknown error)
    [FATAL][2022-11-22 18:49:09 +0000]	      [main-SendThread(localhost:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket error occurred: localhost/127.0.0.1:2181: Connection refused
  • r

    refined-energy-76018

    11/23/2022, 10:04 AM
    Hi, when I upgrade the version for the setup jobs, it causes an error in Kubernetes upon deployment indicating that the image is immutable. This tells me that if I want to upgrade the version, I need to delete the completed but still-existing setup jobs of the previous version before deploying or I have to give the setup jobs unique names for each version (not possible with current Datahub helm chart templating). Wondering if anyone else has faced a similar issue and come up with a solution for it. It seems like my options are:
    • don't upgrade the version for the setup jobs after the first time (Not sure if this is recommended since I don't know if we're expected to run the setup jobs whenever the version is upgraded or just the first time you set up Datahub)
    • figure out a way to automatically delete the jobs between deployments
    • something else I haven't thought of
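    The option of deleting the jobs between deployments could be sketched as a small pre-deploy step. This is a minimal sketch and not a confirmed DataHub or helm chart feature: the job names are assumed from a release named `datahub`, the `datahub` namespace is hypothetical, and the helper names are made up for illustration. It relies only on the standard `kubectl delete job` command with `--ignore-not-found`.

```python
import subprocess
from typing import List

# Assumed job names for a helm release called "datahub"; adjust to your release.
SETUP_JOBS = [
    "datahub-elasticsearch-setup-job",
    "datahub-kafka-setup-job",
    "datahub-mysql-setup-job",
]


def build_delete_command(namespace: str, jobs: List[str]) -> List[str]:
    """Build a kubectl command that deletes the given jobs if they exist."""
    return [
        "kubectl", "delete", "job", *jobs,
        "--namespace", namespace,
        "--ignore-not-found",  # do not fail when a job is already gone
    ]


def delete_setup_jobs(namespace: str, jobs: List[str] = SETUP_JOBS) -> None:
    """Run the delete command; raises CalledProcessError if kubectl fails."""
    subprocess.run(build_delete_command(namespace, jobs), check=True)


if __name__ == "__main__":
    # Hypothetical namespace; print the command instead of running it.
    print(" ".join(build_delete_command("datahub", SETUP_JOBS)))
```

    Running `delete_setup_jobs` before the helm upgrade would remove the completed jobs so that new ones can be created with the upgraded image.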
  • m

    microscopic-mechanic-13766

    11/23/2022, 11:46 AM
    Good day everyone, quick question: has anyone tried/know if it is possible to send validations from other sources to Datahub so that they are shown in the "Validations" tab?? I am interested in this as Great Expectations' integration has some limitations (like not supporting the Spark execution engine for sources like Hive, the amount of files needed to create for a validation to occur, ...)
  • w

    witty-motorcycle-52108

    11/23/2022, 8:00 PM
    hey all, working through a deployment and was wondering what a few environment variables for the GMS container do. I looked through docs/issues and searched code and this channel, but didnt see any obvious explainers so hoping this isnt information i just missed somewhere.
    • DATASET_ENABLE_SCSI - no idea what this is or what it does
    • ENTITY_REGISTRY_CONFIG_PATH - this is a yaml file in the container, does it need to be persisted across restarts or is it effectively a cache? if it needs to be persisted, should it be shared across all containers if running replicas?
    • PE_CONSUMER_ENABLED - i believe PE means platform events from my searching, but not quite sure what this enables/disables. is this the variable we set to false when we are deploying MAE/MCE consumers separately?
    • MAE_CONSUMER_ENABLED - how does this work with PE_CONSUMER_ENABLED? does it mean the MAE consumer should be started in GMS, and is not running separately?
    • MCE_CONSUMER_ENABLED - same question as above for MAE_CONSUMER_ENABLED
    ^ basically given a deployment with distinct containers for GMS, MAE consumer, and MCE consumer, should the three above variables all be false? when GMS is running as a replicaset, do the replicas need the yaml file to persist like a daemonset would do, or can they rebuild the file on server boot if not present?
  • s

    shy-dog-84302

    11/24/2022, 12:56 PM
    Hi! We are evaluating datahub for internal metadata management. I see that datahub comes with default ‘datahub’ user and password. We have changed the default password for that user. I’m wondering if there are other security vulnerabilities to datahub that we need to worry about before we push our business data onto it?
  • l

    limited-sundown-13355

    11/25/2022, 4:52 AM
    Hello All, I am trying to integrate Datahub & GreatExpectations. I’ve installed the latest Datahub (v0.9.2) using GKE clusters on Google cloud. The front-end web page runs fine on the port 9002 but when I try to access the GMS service running on port 8080 either via browser or when I pass the assertions to Datahub from GE, I get this error ( in the thread below). Appreciate if anyone can suggest how to fix it.
  • h

    hallowed-kilobyte-916

    11/25/2022, 8:53 AM
    I have installed datahub on an aws linux instance (Centos os) using the quickstart command. However, when i try logging in to the frontend, I get a "failed to login. An unexpected error occured." Upon digging around, i read that this error occurs when gms is unhealthy. Unfortunately when I check, it is indeed unhealthy. This is the log https://gist.github.com/theSekyi/d0f379e53e6a86160f006d87afdf6b8f
  • s

    shy-dog-84302

    11/25/2022, 11:30 AM
    Hi! I am facing an indentation issue with lifecycle element in datahub-frontend/deployment.yaml file when I tried to provide a lifecycle element as suggested in values.yaml file to edit the user.props file as I could not do it in traditional method of volumes and volumeMounts. Harness renders my deployment.yaml file as follows
    lifecycle: map[postStart:map[exec:map[command:[/bin/sh -c echo "username:password" > /datahub-frontend/conf/user.props
    ]]]]
    I believe this is because the lifecycle element is not properly indented in values.yaml file. My values override looks like this:
    lifecycle:
      postStart:
        exec:
          command:
            - /bin/sh
            - -c
            - |
              echo "username:password" > /datahub-frontend/conf/user.props
    Any clue what is going wrong here?
  • s

    strong-father-80629

    11/25/2022, 7:47 PM
    Anyone know where can I look into on how to deploy datahub in a URL subpath? (eg: mydomain.com/datahub)
  • r

    red-waitress-53338

    11/28/2022, 4:08 AM
    Hi All, I have stuck with this issue for quite a while now, any help will be highly appreciated. Is there a way to pass NULL as the DATAHUB_GMS_PORT value for the frontend application? I have deployed the GMS service on Google CloudRun, the GMS service has been deployed successfully. I can see the correct output with curl <https://google-cloudrun-gms-service/config> but it is not working when I do curl <https://google-cloudrun-gms-service:8080/config>, so therefore I want to pass NULL as the DATAHUB_GMS_PORT value for the frontend application. Any ideas please??
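    A URL without an explicit port is not port-less: clients fall back to the scheme default, which is 443 for https. The sketch below, using only `urllib.parse` from the Python standard library, shows which port each of the two curl URLs in the message targets. The helper `effective_port` is a hypothetical name, and whether the frontend accepts 443 as DATAHUB_GMS_PORT is not confirmed here.

```python
from typing import Optional
from urllib.parse import urlsplit

# Well-known default ports used when a URL omits the port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url: str) -> Optional[int]:
    """Return the explicit port of a URL, or the scheme default if omitted."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)


if __name__ == "__main__":
    for url in (
        "https://google-cloudrun-gms-service/config",
        "https://google-cloudrun-gms-service:8080/config",
    ):
        print(url, "->", effective_port(url))
```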
  • f

    famous-quill-82626

    11/28/2022, 4:54 AM
    I have DataHub installed and working locally on Minikube, but am unable to generate auth tokens. i.e. I see the message:
    Token based authentication is currently disabled. Contact your DataHub administrator to enable this feature.
    Instructions were to change the auth setting on the yaml file, to set metadata_service_authentication from false to true and then redeploy. I solely made the following change in the file:
    global:
          ...
          metadata_service_authentication:
              # enabled: false
              enabled: true
              systemClientId: "__datahub_system"
              systemClientSecret:
              ...
    .. however , now my pods for frontend and gms will not start:
    NAMESPACE       NAME                                               READY   STATUS                       RESTARTS        AGE
    datahub-local   datahub-datahub-frontend-7c7bbd5d64-4fr99          0/1     CreateContainerConfigError   0               3m42s
    datahub-local   datahub-datahub-gms-5c5c549c7d-n6k88               0/1     CreateContainerConfigError   0               3m42s
    datahub-local   datahub-elasticsearch-setup-job-tkffw              0/1     Completed                    0               6m15s
    datahub-local   datahub-kafka-setup-job-45scn                      0/1     Completed                    0               6m8s
    datahub-local   datahub-mysql-setup-job-stwhn                      0/1     Completed                    0               3m47s
    datahub-local   elasticsearch-master-0                             1/1     Running                      0               9m46s
    datahub-local   prerequisites-cp-schema-registry-b55968566-zf5lm   2/2     Running                      0               9m46s
    datahub-local   prerequisites-kafka-0                              1/1     Running                      4 (6m58s ago)   9m46s
    datahub-local   prerequisites-mysql-0                              1/1     Running                      0               9m46s
    datahub-local   prerequisites-neo4j-community-0                    1/1     Running                      0               9m46s
    datahub-local   prerequisites-zookeeper-0                          1/1     Running                      0               9m46s
    .. any help would be much appreciated Pete
  • c

    clever-lamp-13963

    11/28/2022, 9:58 AM
    Any documentation on how to deploy Aspects in an environment created via docker quickstart?
  • b

    best-umbrella-88325

    11/29/2022, 7:25 AM
    Hi community! Quick question, the datahub helm package seems to have a large number of vulnerabilities (https://artifacthub.io/packages/helm/datahub/datahub?modal=security-report). Upon investigating, it seems a few of them are because the Jackson Core dependency is at version 2.4.0. However, in the build.gradle file of the datahub project, we can see the version is the upgraded one (2.9.10). Any idea why the docker image isn't getting the updated version of the dependency? We are trying to resolve this vulnerability in order to deploy datahub on Production. Thanks in advance!
  • s

    swift-wolf-62993

    11/29/2022, 8:15 AM
    Hey @best-umbrella-88325 I'm concerned about the same. Here's the vulnerability overview per image per type, but the unpatched Java dependencies are especially unsavory. Lots of criticals there which is definitely a showstopper for enterprise adoption ⬇️
    | Docker Image                              | Type       | CRITICAL | HIGH | LOW | MEDIUM | UNKNOWN |
    | ----------------------------------------- | ---------- | -------- | ---- | --- | ------ | ------- |
    | acryldata/datahub-frontend-react:v0.9.2.4 | jar        | 24       | 56   | 3   | 35     | 2       |
    | acryldata/datahub-gms:v0.9.2.4            | gobinary   |          | 11   |     | 2      |         |
    |                                           | jar        | 16       | 30   | 2   | 37     | 5       |
    | acryldata/datahub-ingestion:v0.9.2.4      | debian     | 17       | 378  | 614 | 306    | 3       |
    |                                           | gobinary   |          | 11   |     | 2      |         |
    |                                           | jar        | 21       | 38   | 4   | 20     | 5       |
    |                                           | python-pkg | 3        | 12   | 1   | 11     |         |
    | acryldata/datahub-mae-consumer:v0.9.2.4   | gobinary   |          | 11   |     | 2      |         |
    |                                           | jar        | 16       | 21   | 1   | 28     | 5       |
    | acryldata/datahub-mce-consumer:v0.9.2.4   | gobinary   |          | 11   |     | 2      |         |
    |                                           | jar        | 16       | 21   | 1   | 28     | 5       |
    |                                           |            | 113      | 600  | 626 | 473    | 25      |
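    The totals row can be cross-checked by summing each severity column, treating blank cells as zero. The sketch below transcribes the table rows as given; the names `ROWS` and `column_totals` are only illustrative.

```python
COLUMNS = ("CRITICAL", "HIGH", "LOW", "MEDIUM", "UNKNOWN")

# (image, type, critical, high, low, medium, unknown); blank cells entered as 0.
ROWS = [
    ("datahub-frontend-react", "jar", 24, 56, 3, 35, 2),
    ("datahub-gms", "gobinary", 0, 11, 0, 2, 0),
    ("datahub-gms", "jar", 16, 30, 2, 37, 5),
    ("datahub-ingestion", "debian", 17, 378, 614, 306, 3),
    ("datahub-ingestion", "gobinary", 0, 11, 0, 2, 0),
    ("datahub-ingestion", "jar", 21, 38, 4, 20, 5),
    ("datahub-ingestion", "python-pkg", 3, 12, 1, 11, 0),
    ("datahub-mae-consumer", "gobinary", 0, 11, 0, 2, 0),
    ("datahub-mae-consumer", "jar", 16, 21, 1, 28, 5),
    ("datahub-mce-consumer", "gobinary", 0, 11, 0, 2, 0),
    ("datahub-mce-consumer", "jar", 16, 21, 1, 28, 5),
]


def column_totals(rows):
    """Sum each severity column across all rows."""
    return tuple(sum(row[2 + i] for row in rows) for i in range(len(COLUMNS)))


if __name__ == "__main__":
    for name, total in zip(COLUMNS, column_totals(ROWS)):
        print(f"{name}: {total}")
```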
  • s

    shy-dog-84302

    11/29/2022, 10:08 PM
    Hi! I am experiencing an error situation in datahub-upgrade-job while updating my datahub installation in k8s from helm charts from 0.2.96 -> 0.2.114. Error message and the upgrade-job yaml file attached in thread 🧵 Anyone has experienced similar issue with possible hints for solving it?
  • w

    witty-motorcycle-52108

    11/30/2022, 5:40 PM
    hey all! opened up this PR to allow Actions to connect to a GMS instance over TLS by allowing the user to customize the GMS protocol, curious if there are any immediate concerns that would block this, or if we could assume it'll eventually be merged and tagged/built as an Actions release?