# all-things-deployment
  • b

    bumpy-keyboard-50565

    03/23/2020, 2:01 PM
    Yes, GMS has the /admin endpoint that returns OK. Frontend also has the same route: https://github.com/linkedin/datahub/blob/master/datahub-frontend/conf/routes#L9
  • b

    brash-airplane-35511

    06/12/2020, 9:18 AM
    Hi, I am looking into Datahub, which looks to be a very promising product. Would like to spin it up in EKS, and for that I plan to test the helm charts for Datahub. Regarding our Kafka cluster (in EKS) we are running TLS everywhere. Thus, is SSL/TLS supported for connection towards Kafka and schema registry? Best regards //Lars
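For context, a TLS connection from any Kafka client usually comes down to a handful of standard client properties. The sketch below only renders those properties; the helper names, file paths and password are placeholders, and whether DataHub's containers pass such settings through is the open question in the message above.

```python
def kafka_ssl_properties(truststore, truststore_password, keystore=None, keystore_password=None):
    """Return standard Kafka client settings for a TLS connection as a dict."""
    props = {
        "security.protocol": "SSL",
        "ssl.truststore.location": truststore,
        "ssl.truststore.password": truststore_password,
    }
    # Keystore entries are only needed when the broker asks for client certificates.
    if keystore is not None:
        props["ssl.keystore.location"] = keystore
        if keystore_password is not None:
            props["ssl.keystore.password"] = keystore_password
    return props


def to_properties_text(props):
    """Render the dict in Java .properties style, one key=value per line."""
    return "\n".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # Placeholder path and password, for illustration only.
    print(to_properties_text(kafka_ssl_properties("/certs/truststore.jks", "changeit")))
```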
  • v

    victorious-dawn-14920

    08/31/2020, 10:17 AM
    Hello everyone. After happily testing DataHub locally, we wanted to do a small PoC with a stable deployment on k8s. Everything is working fine, except the parameters to connect to Kafka: I need to be able to configure Kafka consumer names and Kafka topic names to fit our internal conventions (or at least be able to add a prefix for both, but full configuration would be better). While it seems possible to configure the consumer group (i.e. https://github.com/linkedin/datahub/pull/1745/files ), which would solve one of my problems, it seems that the topic names are hardcoded (i.e. https://github.com/linkedin/datahub/blob/master/metadata-events/mxe-registration/src/main/java/com/linkedin/mxe/Topics.java ), which would mean I would need to rebuild the Docker image. Am I missing something obvious, or what would be the best way to achieve that? Thanks a lot for your time!
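A minimal sketch of the prefixing idea in the message above, written in Python for brevity. The `KAFKA_TOPIC_PREFIX` variable and the `prefixed_topics` helper are hypothetical and not part of DataHub, and the base topic names are assumed for illustration; the real values live in Topics.java.

```python
import os

# Base topic names assumed for illustration; check Topics.java for the real ones.
BASE_TOPICS = (
    "MetadataChangeEvent_v4",
    "MetadataAuditEvent_v4",
    "FailedMetadataChangeEvent_v4",
)


def prefixed_topics(prefix=None, base_topics=BASE_TOPICS):
    """Map each base topic name to the name it would have with a prefix applied."""
    if prefix is None:
        # Hypothetical environment variable; an empty prefix keeps the names unchanged.
        prefix = os.environ.get("KAFKA_TOPIC_PREFIX", "")
    return {name: f"{prefix}{name}" for name in base_topics}


if __name__ == "__main__":
    for base, actual in prefixed_topics("myteam.").items():
        print(f"{base} -> {actual}")
```

Resolving the names at startup from the environment would avoid rebuilding the image for each naming convention, which is the concern raised above.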
  • s

    silly-apple-97303

    09/03/2020, 8:10 PM
    Howdy, is it intentional that the docker images for the consumer jobs are using alpine linux? https://github.com/linkedin/datahub/blob/master/docker/datahub-mce-consumer/Dockerfile#L4 Specifically running into errors like this because alpine linux does not use glibc. I think this error is only showing up in our real kafka environments/not docker compose because our kafka cluster is configured to use snappy compression by default.
    java.lang.UnsatisfiedLinkError: /tmp/snappy-1.1.7-bb847a5e-21b5-4d9b-babd-f31afc7109a7-libsnappyjava.so: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /tmp/snappy-1.1.7-bb847a5e-21b5-4d9b-babd-f31afc7109a7-libsnappyjava.so)
    	at java.lang.ClassLoader$NativeLibrary.load(Native Method)
  • h

    high-hospital-85984

    10/01/2020, 7:12 PM
    Hi! I noticed that all services (GMS, MCE, MAE, and frontend) seem to be running as root. Would it be out of the question to fix the images to run as a non-root user?
  • h

    high-hospital-85984

    11/12/2020, 5:47 PM
    I’m trying to run DataHub on k8s using YAMLs very much inspired by the helm charts found in the repo (we use kustomize). I’m running into problems very similar to https://github.com/linkedin/datahub/issues/1642 , and found the appropriate SQL command that should fix the issue (haven’t tried it yet). Just as a sanity check: are there no migrations run on startup of GMS?
  • w

    worried-orange-11965

    02/15/2021, 10:38 PM
    (1/5) Installing nghttp2-libs (1.35.1-r2)
    (2/5) Installing libssh2 (1.9.0-r1)
    (3/5) Installing libcurl (7.64.0-r5)
    (4/5) Installing curl (7.64.0-r5)
    (5/5) Installing tar (1.32-r0)
    Executing busybox-1.29.3-r10.trigger
    OK: 86 MiB in 58 packages
      % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                     Dload  Upload   Total   Spent    Left  Speed
      0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    curl: (60) SSL certificate problem: unable to get local issuer certificate
    More details here: https://curl.haxx.se/docs/sslcerts.html
  • m

    mammoth-bear-12532

    04/05/2021, 8:23 PM
    @dry-gigabyte-2025 @early-lamp-41924: you two should chat about the helm charts for aws
  • i

    icy-easter-2378

    04/06/2021, 6:29 PM
    pi@raspberrypi:~/workspace/datahub $ ./docker/quickstart.sh
    Pulling zookeeper       ... done
    Pulling neo4j         ... pulling from library/neo4j
    Pulling mysql         ... pulling from library/mysql
    Pulling broker        ... done
    Pulling schema-registry    ... done
    Pulling kafka-setup      ... done
    Pulling elasticsearch     ... pulling from library/elasticsearch
    Pulling kibana        ... pulling from library/kibana
    Pulling elasticsearch-setup  ... done
    Pulling datahub-gms      ... done
    Pulling datahub-mce-consumer ... done
    Pulling datahub-frontend-react ... done
    Pulling datahub-mae-consumer ... done
    Pulling kafka-rest-proxy   ... done
    Pulling kafka-topics-ui    ... done
    Pulling schema-registry-ui  ... done
    
    ERROR: for elasticsearch no matching manifest for unknown in the manifest list entries
    
    ERROR: for mysql no matching manifest for unknown in the manifest list entries
    
    ERROR: for neo4j no matching manifest for unknown in the manifest list entries
    
    
    ERROR: for kibana no matching manifest for unknown in the manifest list entries
    ERROR: no matching manifest for unknown in the manifest list entries
  • l

    little-france-72098

    04/13/2021, 11:24 AM
    Hi, I added SSL-Support via ENV-Variables to the kafka-setup-container which I'd like to contribute. I've seen that the Dockerfile would conflict with an open pull request from @early-lamp-41924 where he added configurable topics. How should I proceed? Just create a pull request? (Sorry, I'm totally new to contributing to larger projects)
  • a

    acceptable-football-40437

    04/16/2021, 1:27 PM
    Hello! 👋 I'm experimenting with cloud deployment of DataHub. I see that Helm charts are still in contrib, and I have a few questions:
    1. I saw that these are on the roadmap to be in the main line DataHub very soon, is that still the case?
    2. Looking at the README, there is a line that says "Also, these can be installed on-prem or can be leveraged as managed service on any cloud platform." (referring to Kafka, Elasticsearch, MySQL, and Neo4j). My team is a small one, so we're really interested in leveraging managed services wherever possible! I'd love to hear people's experiences doing this and more details on how that setup process works. Do you set up managed instances and then modify the charts' values.yaml to point at those instances?
    Thanks in advance, I'm still new to Helm!
    🙌 1
  • c

    calm-addition-66352

    04/18/2021, 10:44 PM
    Team, I get an error like this on my DataHub UI (screenshot). At the same time I can see the messages below in the log. Does anyone know what the reason might be, or has anyone bumped into the same thing before? I assume it might have something to do with the elasticsearch docker container not running as expected. FYI: I used quickstart.sh to set up the installation.
    kibana                    | {"type":"log","@timestamp":"2021-04-18T14:30:23Z","tags":["warning","elasticsearch","data"],"pid":7,"message":"No living connections"}
    kibana                    | {"type":"log","@timestamp":"2021-04-18T14:30:23Z","tags":["warning","elasticsearch","data"],"pid":7,"message":"Unable to revive connection: <http://elasticsearch:9200/>"}
    kibana                    | {"type":"log","@timestamp":"2021-04-18T14:30:23Z","tags":["warning","elasticsearch","data"],"pid":7,"message":"No living connections"}
  • a

    acceptable-football-40437

    04/19/2021, 7:48 PM
    Hi all! The links in the Setup section in kubernetes/README.md are all broken, since it seems Helm Hub is deprecated. Where should these point now?
  • m

    modern-nest-69826

    04/20/2021, 7:17 AM
    My earlier reported issue has been solved with Shirshanka's help. In the log of the front-end container the following was found "This application is already running (Or delete /datahub-frontend/play.pid file).". After stopping datahub the frontend image was removed from the system and a new copy was pulled. Datahub started again with the modified quickstart. One container was not started, but could then be started manually. All containers are now up and running!
    🎉 2
  • i

    important-television-65295

    04/23/2021, 3:49 AM
    Hi Everyone! 👋 I am currently trying to deploy DataHub to Google Kubernetes Engine. So far I’ve successfully installed DataHub by following this guide (https://github.com/linkedin/datahub/tree/master/contrib/kubernetes), but after installing it I cannot log in, and when I check the log from the GMS pod it says “javax.servlet.UnavailableException: Servlet Not Initialized”. Can anyone help me resolve this issue? Thank you
  • i

    incalculable-ocean-74010

    04/23/2021, 9:13 AM
    Hi All, Currently the K8s ingestion cron chart component creates a CronJob resource hardcoded to launch the metadata ingestion framework for a given configuration. This is not very flexible in cases where you may want to customize the output of the metadata ingestion framework or launch something else altogether. I ran into this need myself when trying to enrich the output of the ingestion framework before sending the MCEs to DataHub. To do so, I generalized the ingestion chart so that it can run a generic shell command with custom logic through a bash script. If anyone thinks this is useful, let me know and I'll open a PR. cc @gray-shoe-75895 @mammoth-bear-12532
  • h

    high-hospital-85984

    04/23/2021, 1:33 PM
    Curious to know, what do you use to monitor the GMS and frontend services (other than normal k8s metrics)? Setting up the actuator-based monitoring for MCE and MAE now.
  • h

    high-hospital-85984

    04/23/2021, 1:57 PM
    Hmm… I might be missing something, but in the MAE service we say that the server.port is 9091, while the docker image exposes 9090. Bug?
  • e

    early-lamp-41924

    04/23/2021, 3:51 PM
    Hi everyone! We moved helm charts out of contrib into https://github.com/linkedin/datahub/tree/master/datahub-kubernetes The templates themselves should be exactly the same as before! We added some example configuration for the prerequisites so folks can get started quickly in kubernetes. Let us know if you run into any issues!
    👍 2
  • i

    important-television-65295

    04/27/2021, 5:33 AM
    Hi all, does anyone know why datahub-gms returns HTTP status 500 in the web browser?
  • l

    little-france-72098

    05/11/2021, 7:34 AM
    Hello, when I try to deploy to a Kubernetes cluster and connect to Kafka via SSL, I run into a permission problem with the cert files. I found that this is because the secret is mounted as root-readonly with defaultMode: 256 in the deployment.yaml files, but the docker containers are not run as root anymore according to their Dockerfiles. Of course, if I delete the default mode (or set it to something less restrictive) I can deploy without a problem. I was wondering if there is another way to solve this without editing the official helm chart, or if this is an inconsistency remaining from the switch to running the containers as a non-root user?
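For reference, Kubernetes reads defaultMode as a decimal number when it is written without a leading zero, so 256 is octal 0400: read-only for the owning user and nothing for anyone else. A small Python check of that arithmetic; the `describe_mode` helper is made up for this note, and 0o440 is only an example of a less restrictive mode, not a value taken from the chart.

```python
import stat


def describe_mode(decimal_mode):
    """Return (octal string, ls-style permission string) for a decimal file mode."""
    # S_IFREG marks a regular file so the first character renders as "-".
    permissions = stat.filemode(stat.S_IFREG | decimal_mode)
    return format(decimal_mode, "04o"), permissions


if __name__ == "__main__":
    print(describe_mode(256))    # the mode used in the deployment.yaml files
    print(describe_mode(0o440))  # example: additionally readable by the group
```

Assuming the mounted files are owned by root, mode 0400 leaves them unreadable for a non-root container user, which is consistent with the permission problem described above.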
  • g

    gifted-art-69474

    05/12/2021, 11:35 AM
    Hello. We’ve noticed that curl in the base image (openjdk:8-jre-alpine) contains a vulnerability that we would like to mitigate. However, the newest version of the base image is 2 years old, so maybe it is time to switch that one out?
  • w

    white-beach-27328

    05/14/2021, 6:20 PM
    Has anyone run into an error with DataHub v0.7.1 in which the charts in datahub-kubernetes fail to find their dependent charts, such as datahub-gms? We are pushing the datahub chart to a chart museum, and when attempting to deploy that chart into our K8s cluster, I’m getting an error like this:
    resource-helm>>> Updating dependencies in /tmp/build/put/stg-chart/datahub...
    Saving 5 charts
    Save error occurred:  directory charts/datahub-gms not found
  • w

    white-beach-27328

    05/17/2021, 5:35 PM
    Is there anything to gain from having these test-connection.yaml files in the helm charts? I’m noticing that there are some odd bugs in them, like:
    • The gms test hits a different endpoint than /health, which causes the service to 404 and the test to fail (https://github.com/linkedin/datahub/blob/master/datahub-kubernetes/datahub/charts/datahub-gms/templates/tests/test-connection.yaml#L14)
    • The mae consumer test expects a service to provide DNS resolution: https://github.com/linkedin/datahub/blob/master/datahub-kubernetes/datahub/charts/datahub-mae-consumer/templates/tests/test-connection.yaml. No service exists.
    • Similarly, the mce consumer test expects a service to provide DNS resolution: https://github.com/linkedin/datahub/blob/master/datahub-kubernetes/datahub/charts/datahub-mce-consumer/templates/tests/test-connection.yaml#L14. No service exists.
    I was initially going to submit a PR fixing this, but tbh I’m not seeing their value given that there are now readiness and liveness probes on the different deployments which are effectively doing these tests anyway. Happy to just PR a delete on these files since they seem like duplicates. Thoughts?
  • p

    proud-jelly-46237

    05/18/2021, 9:13 PM
    has anybody here tried to automate the whole datahub-kubernetes installation via terraform?
  • a

    acoustic-midnight-64606

    06/04/2021, 9:41 PM
    Is there a plan to host versioned helm charts in a hosted helm repo, for example http://charts.datahubproject.io/, or is the goal for users to add them to their own (chartmuseum, artifactory, gitlab, ...)?
  • p

    proud-jelly-46237

    06/08/2021, 4:05 PM
    While installing the prerequisite charts, I got the following error in the datahub-prerequisites-cp-schema-registry pod. I am trying to install it in EKS.
    [main] WARN org.apache.kafka.clients.ClientUtils - Couldn't resolve server prerequisites-kafka:9092 from bootstrap.servers as DNS resolution failed for prerequisites-kafka
    [main] ERROR io.confluent.admin.utils.cli.KafkaReadyCommand - Error while running kafka-ready.
    org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
    	at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:499)
    	at org.apache.kafka.clients.admin.Admin.create(Admin.java:73)
    	at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:49)
    	at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:138)
    	at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
    Caused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers
    	at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:89)
    	at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:48)
    	at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:455)
    Just wanted to double check whether you have faced this before and have a resolution for it, before I start the debug process.
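One way to narrow this down is to check, from a pod in the same namespace, whether the bootstrap host from the log resolves at all. A minimal sketch using only the Python standard library; `can_resolve` is a hypothetical helper, and the host and port are simply the ones that appear in the warning above.

```python
import socket


def can_resolve(host, port=9092, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address, False on a DNS failure."""
    try:
        return len(resolver(host, port)) > 0
    except socket.gaierror:
        # Raised when the name cannot be resolved.
        return False


if __name__ == "__main__":
    # Run this inside the cluster; outside of it the service name is not expected to resolve.
    print(can_resolve("prerequisites-kafka"))
```

If this prints False inside the cluster, the Kafka service name in the chart values and the name of the deployed Kafka service are worth comparing before digging into the stack trace.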
  • p

    proud-jelly-46237

    06/09/2021, 9:15 PM
    Right now datahub-frontend with a service type of LoadBalancer creates a public, internet-facing LB. Is there any way we can have an option for an internal load balancer (e.g. https://kubernetes.io/docs/concepts/services-networking/service/#internal-load-balancer )? Probably just having an annotation option in service.yaml will do the trick, which can always be optional and read from values.yaml.
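A rough sketch of the optional-annotation idea, in Python rather than Helm templating. The `frontend_service` helper, the port and the selector are assumptions made for illustration; the annotation shown is the AWS one from the Kubernetes page linked above, and other clouds use different keys.

```python
import json


def frontend_service(annotations=None):
    """Build a Service manifest for datahub-frontend, adding annotations only if given."""
    metadata = {"name": "datahub-frontend"}
    if annotations:
        metadata["annotations"] = dict(annotations)
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": metadata,
        "spec": {
            "type": "LoadBalancer",
            "ports": [{"port": 9002, "targetPort": 9002}],  # port assumed
            "selector": {"app": "datahub-frontend"},        # selector assumed
        },
    }


if __name__ == "__main__":
    internal = {"service.beta.kubernetes.io/aws-load-balancer-internal": "true"}
    print(json.dumps(frontend_service(internal), indent=2))
```

Leaving the annotations out by default keeps the current behaviour, so the option stays optional as suggested in the message.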
  • r

    rich-policeman-92383

    06/14/2021, 12:50 PM
    Hi guys, please help me. I want to connect DataHub to a kerberised Kafka running as part of a Cloudera cluster.
    plus1 1
  • g

    gifted-bird-57147

    06/15/2021, 11:45 AM
    Hi, we want to run a proof of concept of DataHub on our AWS infrastructure. Our support engineers are using AWS CDK to manage our infrastructure, so I'm trying to figure out what I need to tell them to get a proper environment up and running... One thing I was wondering about: in the Deploying to AWS section of the documentation there is a part about exposing the datahub-frontend via a load balancer. Shouldn't the datahub-mce-consumer be exposed as well, so that it can receive messages from the ingestion framework?