# all-things-deployment
  • q

    quiet-wolf-56299

    11/04/2022, 7:53 PM
    Has anyone else run into a similar error when using a standalone mysql instance? ERROR 1045 (28000): Plugin caching_sha2_password could not be loaded: Error loading shared library /usr/lib/mariadb/plugin/caching_sha2_password.so: No such file or directory
    s
    • 2
    • 4
  • q

    quiet-wolf-56299

    11/04/2022, 7:54 PM
    Specifically, it's MySQL Setup tossing that error at the moment, but it boils down to an access denied error. Our DBA is hesitant to downgrade authentication in MySQL if we don't have to. Also, why is DataHub trying to use a MariaDB client?
    s
    • 2
    • 2
  • t

    thousands-truck-51652

    11/06/2022, 3:21 AM
    I'm trying to install datahub on a server with no internet access. I have datahub running but I can't get the Oracle source to work. I've installed everything it told me to install but I keep getting "oracle is disabled due to an error in initialization". Anyone have any ideas?
    s
    • 2
    • 3
  • c

    colossal-laptop-87082

    11/07/2022, 6:10 AM
    Hello team! I'm new to DataHub. I want to ingest CSV files and set up these observability checks with DataHub. Is this possible? • Freshness • Volume • Schema
    s
    • 2
    • 2
  • b

    billowy-pilot-93812

    11/07/2022, 8:30 AM
    Hi all, where can I find these two files when enabling personal access tokens? application.yml and application.config
    b
    • 2
    • 2
  • b

    bland-orange-13353

    11/07/2022, 8:51 AM
    This message was deleted.
    m
    • 2
    • 1
  • m

    many-piano-52097

    11/07/2022, 8:59 AM
    Hi everyone, has anyone used OIDC single sign-on and run into this error? Signed JWT rejected: Another algorithm expected, or no matching key(s) found
    • 1
    • 1
  • t

    thousands-branch-81757

    11/07/2022, 10:44 AM
    How can I enable profiling for dbt models using a Snowflake connection in DataHub? I couldn't find any config for it in the dbt recipe. I have tried ingesting dbt and Snowflake separately, but the tables/datasets in DataHub weren't merged. I'm using CLI version 0.9.0.
    a
    • 2
    • 1
  • b

    best-eve-12546

    11/07/2022, 6:09 PM
    Hi y’all, I’m curious what deployment models folks are maintaining internally? Ideally I’d like to be able to add some additional company-specific features on our end without needing to internally fork and maintain everything separately. Does anyone have something similar set up already?
    a
    • 2
    • 4
  • c

    cuddly-arm-8412

    11/08/2022, 3:32 AM
    Hi team. I found that the optional fields in advanced search do not include the name attribute.
  • f

    few-tent-85021

    11/08/2022, 7:00 AM
    Glad to be aboard the good ship datahub. I am having an issue with deployment using the docker-compose.yml file. Specifically datahub-gms & Jetty.
    datahub-gms:
        depends_on:
          - mysql
        environment:
          DATAHUB_SERVER_TYPE: quickstart
          DATAHUB_TELEMETRY_ENABLED: "true"
          DATASET_ENABLE_SCSI: "false"
          EBEAN_DATASOURCE_DRIVER: com.mysql.jdbc.Driver
          EBEAN_DATASOURCE_HOST: mysql:3306
          EBEAN_DATASOURCE_PASSWORD: datahub
          EBEAN_DATASOURCE_URL: jdbc:mysql://mysql:3306/datahub?verifyServerCertificate=false&useSSL=true&useUnicode=yes&characterEncoding=UTF-8
          EBEAN_DATASOURCE_USERNAME: datahub
          ELASTICSEARCH_HOST: elasticsearch
          ELASTICSEARCH_PORT: "9200"
          ENTITY_REGISTRY_CONFIG_PATH: /datahub/datahub-gms/resources/entity-registry.yml
          GRAPH_SERVICE_IMPL: elasticsearch
          JAVA_OPTS: -Xms1g -Xmx1g
          KAFKA_BOOTSTRAP_SERVER: broker:29092
          KAFKA_SCHEMAREGISTRY_URL: http://schemaregistry:8081
          MAE_CONSUMER_ENABLED: "true"
          MCE_CONSUMER_ENABLED: "true"
          PE_CONSUMER_ENABLED: "true"
        hostname: datahub-gms
        image: linkedin/datahub-gms:head
        networks:
          default: null
        ports:
        - mode: ingress
          target: 8080
          published: 8080
          protocol: tcp
        volumes:
        - type: bind
          source: /Users/delicountercouture/.datahub/plugins
          target: /etc/datahub/plugins
          bind:
            create_host_path: true
    I can confirm that mysql, zookeeper, and elasticsearch are OK. The setup scripts for mysql and elasticsearch ran fine (I can query mysql and see the tables, and query es for the indexes, etc.). The broker is running and kafka-setup has completed without errors (I can see the topics by querying the broker via zookeeper). And then comes datahub-gms, and the tears begin. I have attached a screenshot to show that Jetty is running in the datahub-gms container, but something isn't correct with the *.war file. I have also attached the container's log, which shows the errors in verbose detail. The log file is long-winded and I hope someone here can help me understand where the fault is.
    datahub-gms.log
    b
    • 2
    • 2
  • d

    damp-queen-61493

    11/08/2022, 12:21 PM
    Hi team! I'm trying to set up an SSL connection between MySQL (used as the backend database by DataHub) and DataHub deployed with the helm chart. I can't find instructions in the helm docs. Has anyone configured DataHub to connect to MySQL using SSL and certificates?
    a
    i
    r
    • 4
    • 8
  • w

    witty-microphone-40893

    11/09/2022, 10:13 AM
    Hi all! I'm looking to deploy DataHub into an AWS environment. Has anyone done this who can give me a rough idea of the costs involved? It's an area that's hard to estimate without a baseline of some sort.
    i
    • 2
    • 2
  • m

    microscopic-mechanic-13766

    11/09/2022, 11:38 AM
    Good morning, I have a few questions that might be a bit basic: What is the main difference between datahub-ingestion and datahub-actions? Which of them is in charge of metadata ingestion? Until now I have had a deployment with just 3 DataHub services: Front, GMS and Actions. This deployment worked perfectly and didn't seem to lack any functionality. Are any other services needed (MAE, Ingestion, ...)? (Note: I am mainly asking because I am trying to improve the Spark ingestion a bit so that Hive datasets appear as such and not as HDFS datasets (issue mentioned here). I found that Actions doesn't really have a Dockerfile and everything is done inside the Ingestion Dockerfile, so I don't know where the needed files (DatahubSparkListener and DatasetExtractor) are mapped; or are they only in the Spark plugin JAR?) Thanks in advance!!
    m
    m
    • 3
    • 6
  • b

    better-fireman-33387

    11/10/2022, 10:55 AM
    Hi all, after trying to upgrade from v0.9 to v0.9.1, the Elasticsearch setup job failed. After I manually changed the setup job tag from 0.9.1 to v0.8.43, it succeeded. Any idea why? values.yaml inside
    • 1
    • 4
  • c

    cuddly-arm-8412

    11/10/2022, 1:04 PM
    Hi team. When I pull the latest code and build it, I get this error:
    Exception in thread "main" java.lang.UnsupportedClassVersionError: com/linkedin/metadata/model/validation/ModelValidationTask has been compiled by a more recent version of the Java Runtime (class file version 55.0), this version of the Java Runtime only recognizes class file versions up to 52.0
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:756)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:473)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
        at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:601)
    :metadata-service:restli-servlet-impl:validateModels (Thread[Execution worker for ':' Thread 5,5,main]) completed. Took 0.137 secs.
    Error: A JNI error has occurred, please check your installation and try again
    Execution failed for task ':metadata-service:restli-servlet-impl:validateModels'.
    Process 'command '/Users/wangdongkun/Library/Java/JavaVirtualMachines/liberica-1.8.0_332/bin/java'' finished with non-zero exit value 1
    d
    • 2
    • 3
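The class file versions in the UnsupportedClassVersionError above can be decoded with simple arithmetic: for Java 5 and later, the Java release is the class file major version minus 44. The sketch below is only an illustration of that mapping; the function names are hypothetical and not part of DataHub or Gradle. It suggests the class was compiled for Java 11 while the build ran on a Java 8 runtime, which matches the liberica-1.8.0_332 path in the message.

```python
# Class file major versions map to Java releases as: release = major - 44.
# This holds for Java 5 and later (e.g. 52 -> Java 8, 55 -> Java 11).
# Function names here are hypothetical helpers for illustration only.

def java_release(class_major: int) -> int:
    """Return the Java release that produces the given class file major version."""
    if class_major < 49:
        raise ValueError("mapping only covers Java 5 (major 49) and later")
    return class_major - 44


def explain_mismatch(compiled_major: int, runtime_max_major: int) -> str:
    """Describe whether a runtime can load a class with the given major version."""
    compiled = java_release(compiled_major)
    runtime = java_release(runtime_max_major)
    if compiled_major <= runtime_max_major:
        return f"OK: Java {runtime} can load classes built for Java {compiled}"
    return f"Mismatch: class needs Java {compiled}+, but the runtime is Java {runtime}"


if __name__ == "__main__":
    # Values taken from the error message: compiled as 55.0, runtime supports up to 52.0.
    print(explain_mismatch(55, 52))
```

Running the build with a JDK at or above the release reported for the compiled class is the usual way to clear this kind of error.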
  • b

    broad-article-1339

    11/10/2022, 1:48 PM
    Morning everyone, does anyone know when the helm chart for 0.9.2 is being released?
    i
    • 2
    • 6
  • t

    tall-butcher-30509

    11/11/2022, 9:28 AM
    Hello everyone. We’d like to replicate our production deployment’s state into a QA deployment so that we can test our custom emitter app against it before rollout. Is there a better way to do it than just ‘restoring’ a new deployment from a backup of the prod deployment? We are not using quickstart and have a relatively small deployment on a single MySQL DB instance.
    i
    • 2
    • 3
  • g

    gifted-diamond-19544

    11/11/2022, 1:13 PM
    Hello all! Is there any way to have all new users who log in to our platform via Okta assigned as Readers by default? Thank you 🙂
    e
    b
    • 3
    • 4
  • g

    green-soccer-37145

    11/11/2022, 2:32 PM
    Hey everyone, is there any recommendation for how many resources a Kubernetes cluster needs in order to run a DataHub instance? I've read the documentation, which says that at least 7GB of memory is required, but I'm not sure what to expect in a multi-node Kubernetes deployment.
    b
    s
    • 3
    • 4
  • h

    high-ice-84066

    11/11/2022, 5:28 PM
    Question for you all: how do git tags work for the helm charts? e.g. https://github.com/acryldata/datahub-helm/tree/datahub-0.2.110/charts/datahub <-- is this v0.9.1 of DH? Asking so we can understand a rollback strategy on upgrades.
    i
    • 2
    • 4
  • r

    red-waitress-53338

    11/12/2022, 9:14 PM
    Hi, I am new to DataHub and am getting the following error. How can I debug where the issue is?
    2022/11/12 21:12:46 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    2022/11/12 21:12:47 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    2022/11/12 21:12:48 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    2022/11/12 21:12:49 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    2022/11/12 21:12:50 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    2022/11/12 21:12:51 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    2022/11/12 21:12:52 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    2022/11/12 21:12:53 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    2022/11/12 21:12:54 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    2022/11/12 21:12:55 Problem with request: Get http:: http: no Host in request URL. Sleeping 1s
    @early-lamp-41924 @green-football-43791 Any help is highly appreciated.
    b
    c
    • 3
    • 14
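A "no Host in request URL" message generally means the URL being requested has an empty host part. One common way to end up there, and this is an assumption since the thread does not show the container's configuration, is a wait URL assembled from host and port settings that were never set. The sketch below uses hypothetical helper names to show how such a URL looks and how to check for it before waiting on it.

```python
from urllib.parse import urlsplit


def build_wait_url(host: str, port: str, scheme: str = "http") -> str:
    """Assemble a URL from host and port settings (hypothetical helper)."""
    return f"{scheme}://{host}:{port}"


def has_host(url: str) -> bool:
    """Return True only if the URL carries a non-empty host component."""
    return urlsplit(url).hostname is not None


if __name__ == "__main__":
    # Empty settings produce a URL with no host, which an HTTP client cannot request.
    broken = build_wait_url("", "")
    print(broken, has_host(broken))

    # Example values only; the real host and port depend on the deployment.
    working = build_wait_url("datahub-gms", "8080")
    print(working, has_host(working))
```

Checking which host and port values the failing container actually received is a reasonable first debugging step under this assumption.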
  • l

    lemon-cat-72045

    11/14/2022, 7:24 AM
    Hi all, can we increase the ES bulk flush period and request limit by changing the k8s YAML file? From the source code, both appear to be set to 1 by default. Thanks.
    b
    • 2
    • 2
  • h

    high-summer-78960

    11/14/2022, 10:16 PM
    Hey all, are there any references to DataHub running on Kafka 3?
    i
    • 2
    • 2
  • r

    ripe-tailor-61058

    11/15/2022, 4:19 PM
    Hello, I know some version of this has been asked before, but after searching through Slack I wasn't sure I had the latest info. We plan to use DataHub for our system, which would create over a million datasets. Actually, with our pipeline stages, it may be closer to 100 million. Are there any sizing guidelines or suggestions for handling those numbers regarding Elasticsearch, EBS volumes, etc.? Thanks in advance.
    i
    b
    w
    • 4
    • 4
  • w

    witty-motorcycle-52108

    11/15/2022, 6:37 PM
    hi all! are there any documented version guidelines for prerequisites like postgres/mysql, elasticsearch/opensearch, kafka, etc?
    i
    • 2
    • 3
  • c

    cuddly-arm-8412

    11/16/2022, 2:53 AM
    Hi team. I found that when we have more than 10000 records, only 10000 are returned.
    b
    • 2
    • 2
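One possible explanation for a cap at exactly 10000, assuming search is served by Elasticsearch 7.x with default settings (the thread does not confirm this): by default Elasticsearch counts hits accurately only up to 10000 and then reports the total as a lower bound, and `index.max_result_window` also defaults to 10000 for paging. The sketch below shows how to read the `hits.total` object from a search response; the function name is hypothetical.

```python
def describe_total(total: dict) -> str:
    """Interpret the hits.total object of an Elasticsearch 7.x search response."""
    value = total["value"]
    # "gte" means the real number of matches is at least `value`;
    # "eq" means the count is exact.
    if total.get("relation") == "gte":
        return f"at least {value} matches (count is a lower bound)"
    return f"exactly {value} matches"


if __name__ == "__main__":
    # Shape of hits.total when more documents match than the tracking limit.
    print(describe_total({"value": 10000, "relation": "gte"}))
    # Shape of hits.total when the count is exact.
    print(describe_total({"value": 4213, "relation": "eq"}))
```

If the relation is "gte", the 10000 is a counting limit rather than the true size of the result set.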
  • c

    chilly-library-82062

    11/16/2022, 11:57 AM
    Hello... Is there a way to enable Spring Actuator for datahub-gms?
    i
    o
    • 3
    • 5
  • f

    fresh-cricket-75926

    11/16/2022, 3:51 PM
    Hi everyone, I was trying to build the datahub-frontend docker image, but I'm stuck at the step below, where gradle is trying to build datahub-web-react and datahub-frontend. Any help would be much appreciated.
    error.txt
    i
    • 2
    • 7
  • w

    witty-motorcycle-52108

    11/16/2022, 5:32 PM
    looking to confirm that all of the setup jobs (kafka, elasticsearch, postgres) are safe to run multiple times, so they can be run "on deploy" rather than manually when needed? what about the datahub-upgrade container? is it idempotent as well, or is it problematic to run multiple times?
    i
    • 2
    • 2