# troubleshoot
  • b

    bitter-waitress-17567

    12/26/2022, 8:56 AM
    acryl-datahub, version 0.9.3.2
  • g

    gentle-camera-33498

    12/26/2022, 7:32 PM
    Hello everyone, I'm having problems with my restoreIndices cronjob on my Kubernetes deployment after upgrading to DataHub 0.9.5.
    Kubernetes version: v1.21
    DataHub version: v0.9.5
    Kafka version: 3.2.0
    Elasticsearch version: 7.17.3
    Error message:
    ***************************
    APPLICATION FAILED TO START
    ***************************
    Description:
    Field kafkaHealthChecker in com.linkedin.gms.factory.kafka.DataHubKafkaEventProducerFactory required a bean of type 'com.linkedin.metadata.dao.producer.KafkaHealthChecker' that could not be found.
    The injection point has the following annotations:
    - @javax.inject.Inject()
    - @javax.inject.Named(value="noCodeUpgrade")
    Action:
    Consider defining a bean of type 'com.linkedin.metadata.dao.producer.KafkaHealthChecker' in your configuration.
  • m

    millions-hydrogen-95879

    12/27/2022, 3:39 AM
    Hi, I am trying to install datahub on my Mac M1 laptop but some of the containers fail to start:
    datahub version
    DataHub CLI version: 0.9.4
    
    datahub docker quickstart --arch m1
    
    Getting the errors as below:
    Unable to run quickstart - the following issues were detected:
    - kafka-setup is still running
    - datahub-gms is still starting
    - zookeeper is not running
  • p

    powerful-cat-68806

    12/27/2022, 8:10 AM
    Hi all, I need docs/guidelines for updating my ingress controller to support public access to the DataHub portal. Please advise 🙂
  • e

    echoing-needle-51090

    12/27/2022, 8:19 AM
    Hi all,
  • e

    echoing-needle-51090

    12/27/2022, 8:23 AM
    Hi all, I'm trying to delete all Business Glossary Nodes using the command "datahub delete --entity_type glossaryNode", but it did not work: the command did not detect any node to delete and said "No urn to delete". I would appreciate advice on this issue, since the only --entity_type option that works for me is dataset.
  • e

    elegant-salesmen-99143

    12/27/2022, 12:19 PM
    We upgraded our DataHub recently, and it looks like there's a new mechanism for Groups, Roles, and Policies; we lost all our previous groups, memberships, and policies. Is there any way to access that lost info? I really hope there is, because trying to remember and recreate all of them would be a nightmare...
  • l

    late-ability-59580

    12/27/2022, 2:56 PM
    Hi all ❄️ *Snowflake-dbt* issues. Here are some I keep encountering and can't find a solution to, or even make sense of:
    1. In the dbt recipe, when using entities_enabled.sources: 'NO', the sources, which are then considered datasets, don't get the Snowflake symbol.
    2. My dbt entities always have their urns in uppercase. Oddly, when ingesting Snowflake with convert_urns_to_lowercase: False, the Snowflake entities are separate from their dbt counterparts. When not using that flag, they merge together.
    3. When ingesting Snowflake with lowercase urns, some tables appear twice in the lineage view as the up/downstream of each other (two identical squares: same stats, urns, etc.).
    I would appreciate any tip regarding any of these issues.
  • g

    glamorous-wire-83850

    12/28/2022, 11:43 AM
    Hi team, when I try to look up a table, an error shows up. I am using the latest version on k8s with local Elasticsearch, Postgres, and Kafka. Any ideas on how to solve this?
    Validation error (FieldUndefined@[nonSiblingDatasetFields/privileges]) : Field 'privileges' in type 'Dataset' is undefined (code undefined)
  • g

    gentle-camera-33498

    12/28/2022, 2:33 PM
    Hello everyone, after upgrading my development instance of DataHub to version 0.9.5, I started to receive errors in my GMS instance that I wasn't receiving before. Examples below: 1.
    ERROR c.l.m.s.e.query.ESSearchDAO:72 - Search query failed
    java.lang.RuntimeException: error while performing request
    ...
    Caused by: java.util.concurrent.TimeoutException: Connection lease request time out
    2.
    ERROR c.l.d.g.r.r.ListRecommendationsResolver:66 - Failed to get recommendations for input com.linkedin.datahub.graphql.generated.ListRecommendationsInput@361dc5ce
    java.lang.RuntimeException: error while performing request
    ...
    Caused by: java.util.concurrent.TimeoutException: Connection lease request time out
    3.
    [ThreadPoolTaskExecutor-1] INFO c.l.m.k.t.DataHubUsageEventTransformer:74 - Invalid event type: HomePageViewEvent
    140700.296 [ThreadPoolTaskExecutor-1] WARN c.l.m.k.DataHubUsageEventsProcessor:56 - Failed to apply usage events transform to record: {"type":"HomePageViewEvent","actorUrn":"urnlicorpuser:patrick.braz","timestamp":1672236418541,"date":"Wed Dec 28 2022 110658 GMT-0300 (Horário Padrão de Brasília)","userAgent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36","browserId":"a449948d-c76e-4545-8bf1-6aac1047583d"}
    Could anyone please help me understand why?
  • p

    powerful-cat-68806

    12/28/2022, 3:34 PM
    Hi team, the landing page (after login) is blank. Any idea?
    Screen Recording 2022-12-28 at 14.31.24.mov
  • a

    astonishing-animal-7168

    12/28/2022, 4:15 PM
    Hi team, after upgrading to v0.9.5 I can no longer log in with the root DataHub user or through OIDC. The frontend and GMS logs below were captured after trying to log in with the root DataHub user. Any hints?
  • r

    rhythmic-lock-29204

    12/28/2022, 10:01 PM
    Hi, I'm trying to evaluate DataHub for my company and we've been stuck for a couple of weeks now. I could use any advice at this point; all suggestions welcome. Overview: attempting to run an MSSQL source ingestion through the UI. Config: token-based auth is disabled for evaluation. Recipe (IP/port are faked here):
    source:
        type: mssql
        config:
            password: '${secretPass}'
            database: DatabaseName2
            host_port: '192.168.1.1:9999'
            username: '${secretUser}'
    The sink configuration is no problem. I've tinkered with it extensively and it was working fine for us to ingest data at one point. Unfortunately we tried to compose DataHub again and lost this functionality on the current image. When running UI ingestion, this is the relevant error I see:
    [2022-12-28 19:53:58,671] ERROR    {datahub.ingestion.run.pipeline:127} - mssql is disabled; try running: pip install 'acryl-datahub[mssql]'
    We have installed the plugin as requested by the error message and also followed the steps to set up the ODBC driver, as well as pyodbc, just in case that is an issue. I have attached three files here: 1. the error log, 2. the output of datahub check plugins --verbose, 3. config showing the ODBC driver installed. Any idea what I'm doing wrong, or advice on how to further diagnose the issue?
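For context on this class of failure: "mssql is disabled" usually means the acryl-datahub[mssql] extra is missing from the environment that actually executes the ingestion, which can differ from the shell where pip install was run (UI ingestion runs in the actions container). A minimal sketch for comparing environments; the dependency names sqlalchemy and pyodbc are assumptions based on the ODBC setup described above, and `datahub check plugins --verbose` remains the authoritative check:

```python
import importlib.util
import sys

# Print which interpreter/environment is running: UI ingestion executes
# in the actions container, which may not be the environment where the
# mssql extra was pip-installed.
print("interpreter:", sys.executable)

# If these modules are missing from *this* environment, a SQL-based
# plugin such as mssql will report as disabled. (Module names are an
# assumption for illustration.)
for mod in ("sqlalchemy", "pyodbc"):
    spec = importlib.util.find_spec(mod)
    print(f"{mod}: {'installed' if spec else 'missing'}")
```

Running this from inside the actions container and from the local shell, and comparing the two interpreter paths, quickly shows whether the install landed in the wrong place.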
  • b

    best-rose-86507

    12/29/2022, 10:41 AM
    Hi guys, I've been trying to ingest Databricks Unity Catalog tables into DataHub, and everything succeeds except the lineage (it's not visible). I'm not sure why it's not being ingested; I've explicitly set include_table_lineage and include_column_lineage to true in the YAML recipe. The lineage appears fine in Databricks Unity Catalog, as seen in the attached image, but when looking at the lineage for the same table in DataHub, it is not visible 😕 Would really appreciate it if someone could direct me on how to fix this.
  • g

    glamorous-wire-83850

    12/30/2022, 9:28 AM
    Hi team, I am not sure this problem belongs here, because it is also related to ingestion. We have a multi-project GCP setup with very large dataset and column counts, so column-based profiling took days. Is there any way to speed up that process? Any suggestions? I am using Kubernetes deployed from Docker, with actions on 6-8 cores / 60-70 GB RAM and GMS on 3 cores / 20 GB RAM, v0.9.5.
  • f

    flaky-camera-29314

    12/30/2022, 6:15 PM
    Hello! I'm having trouble installing DataHub using the steps here. Can someone help me fix it?
    python3 -m datahub version
    Traceback (most recent call last):
      File "/Users/obritto/opt/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/Users/obritto/opt/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/__main__.py", line 1, in <module>
        from datahub.entrypoints import main
      File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/entrypoints.py", line 11, in <module>
        from datahub.cli.check_cli import check
      File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/cli/check_cli.py", line 7, in <module>
        from datahub.cli.json_file import check_mce_file
      File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/cli/json_file.py", line 3, in <module>
        from datahub.ingestion.source.file import GenericFileSource
      File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/ingestion/source/file.py", line 17, in <module>
        from datahub.ingestion.api.common import PipelineContext
      File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/ingestion/api/common.py", line 7, in <module>
        from datahub.emitter.mce_builder import set_dataset_urn_to_lower
      File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/emitter/mce_builder.py", line 13, in <module>
        from datahub.configuration.source_common import DEFAULT_ENV as DEFAULT_ENV_CONFIGURATION
      File "/Users/obritto/opt/anaconda3/lib/python3.7/site-packages/datahub/configuration/source_common.py", line 49, in <module>
        class DatasetSourceConfigBase(PlatformSourceConfigBase, EnvBasedSourceConfigBase):
      File "pydantic/main.py", line 324, in pydantic.main.ModelMetaclass.__new__
      File "/Users/obritto/opt/anaconda3/lib/python3.7/abc.py", line 126, in __new__
        cls = super().__new__(mcls, name, bases, namespace, **kwargs)
    TypeError: multiple bases have instance lay-out conflict
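The final TypeError above is CPython's generic complaint when a class inherits from two bases whose low-level instance layouts conflict; in this traceback it surfaces through pydantic's metaclass, typically pointing at an incompatible pydantic version in the environment. A tiny, unrelated sketch that triggers the same message:

```python
# int and str have conflicting C-level instance layouts, so CPython
# refuses to create a class inheriting from both. The DataHub traceback
# hits the same interpreter check via its pydantic model base classes.
try:
    class Broken(int, str):
        pass
except TypeError as err:
    print(err)  # multiple bases have instance lay-out conflict
```

The usual remedy hinted at by tracebacks like this is reinstalling the package in a clean environment so that compatible pydantic/Python versions are picked up.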
  • p

    powerful-cat-68806

    01/01/2023, 10:25 AM
    Hi team, I’ve created a new data source, but I unable to connect. The DS is AWS Redshift cluster. I’m using an endpoint to connect it (I.e. - the endpoint is
    vpce-xxxxx-xxxx
    and not a standard RS endpoint) Also - to which pod I should connect to routing to DSs? 10x 🙂
  • p

    powerful-cat-68806

    01/02/2023, 7:31 AM
    Hi team, does the DataHub CLI support testing connections to other resources? If so, what command do I need to execute?
  • p

    proud-policeman-19830

    01/02/2023, 7:58 AM
    Hey, Happy New Year to one and all! Is there any way I can add a root CA certificate into DataHub (running via datahub docker quickstart)? Specifically, I'd like to add a root cert to the db connection (postgres). I think I could do it by setting sslrootcert on a connection URI, but how do I get the cert into DataHub so it can pick the cert up, and what would be the path for sslrootcert? If I have to build one of the images myself to do this, which one would it be?
  • r

    rough-gold-15434

    01/02/2023, 1:41 PM
    Hi Team, we are trying to integrate Great Expectations with DataHub for a Snowflake source, following the steps in https://docs.greatexpectations.io/docs/integrations/integration_datahub/. On running the checkpoint, it runs fine, but the Validation tab in DataHub is still disabled.
  • r

    rough-gold-15434

    01/02/2023, 1:42 PM
    Any help would be much appreciated
  • p

    powerful-cat-68806

    01/02/2023, 4:11 PM
    Hi team & happy new year 🎅 ☃️ I'm adding my DH cluster (k8s) to Okta (OIDC). Regarding this: does it need to be configured in the datahub-frontend-xxx pod? If so, I can't find the path docker/datahub-frontend/env/docker.env. Please assist 🙂
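For context: on a Helm/Kubernetes deployment, the docker.env file from the source tree does not exist inside the pod; OIDC is configured through environment variables on the datahub-frontend deployment. A sketch of the usual variables (all values here are placeholders, to be replaced with your Okta app's settings):

```properties
# Placeholders throughout — substitute your Okta application's values.
AUTH_OIDC_ENABLED=true
AUTH_OIDC_CLIENT_ID=<okta-client-id>
AUTH_OIDC_CLIENT_SECRET=<okta-client-secret>
AUTH_OIDC_DISCOVERY_URI=https://<okta-domain>/.well-known/openid-configuration
AUTH_OIDC_BASE_URL=https://<your-datahub-host>
```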
  • b

    brave-waitress-14748

    01/03/2023, 5:25 AM
    Hi all and happy new year 🙂 New DataHub user here - we've deployed
    0.9.3
    to a GKE cluster, (with an Istio service mesh) and are noticing some strange behaviour when trying to run CLI ingestion commands. When I configure
    DATAHUB_GMS_URL
    to point directly to the GMS service (via an ingress, or port forwarding), all works as expected. But when I point the CLI to the frontend service (via an ingress), with suffix
    /api/gms
    , I get a
    401
    error.
    <html>\n<head>\n<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>\n<title>Error 401 Unauthorized to perform this action.</title>\n</head>\n<body><h2>HTTP ERROR 401 Unauthorized to perform this action.</h2>\n<table>\n<tr><th>URI:</th><td>/entities</td></tr>\n<tr><th>STATUS:</th><td>401</td></tr>\n<tr><th>MESSAGE:</th><td>Unauthorized to perform this action.</td></tr>\n<tr><th>SERVLET:</th><td>restliRequestHandler</td></tr>\n</table>\n<hr/><a href="<https://eclipse.org/jetty>">Powered by Jetty:// 9.4.46.v20220331</a><hr/>\n\n</body>\n</html>
    This is initially hard to see, as the HTML response causes the JSON parser to barf (see
    cli_utils.py
    , L207), but I narrowed it down as only occurring under the above conditions. Note that I am using a token generated by the root
    datahub
    user, and have set
    METADATA_SERVICE_AUTH_ENABLED
    to
    "true"
    for both the frontend, and gms deployments. I can work with things as they stand and communicate with the GMS service directly, but would like to know if this is intended behaviour, as it contradicts the documentation which suggests I should be able to use the frontend proxy
    ... we will be shifting to the recommendation that folks direct all traffic, whether it's programmatic or not, to the DataHub Frontend Proxy, as routing to Metadata Service endpoints is currently available at the path
    /api/gms
    Thanks in advance!
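The parser "barf" described above is a generic failure mode: the auth error comes back as a Jetty HTML page, and the client tries to JSON-decode it. A hedged sketch of a guard; parse_gms_response is an illustrative helper, not DataHub's actual cli_utils code:

```python
import json

def parse_gms_response(raw: str) -> dict:
    """Parse a GMS response body, surfacing HTML error pages readably.

    Hypothetical helper for illustration: a 401 from the frontend proxy
    arrives as an HTML page, which json.loads cannot parse.
    """
    stripped = raw.lstrip()
    if stripped.startswith("<"):  # Jetty error pages start with <html>
        raise RuntimeError(
            "Server returned HTML instead of JSON (likely an auth/routing "
            "error): " + stripped[:80]
        )
    return json.loads(raw)

# A JSON body parses normally; the 401 HTML body raises a clear error.
print(parse_gms_response('{"status": "ok"}'))
try:
    parse_gms_response("<html><head><title>Error 401 Unauthorized</title></head></html>")
except RuntimeError as err:
    print(err)
```

Checking the content type (or the first non-whitespace character) before decoding makes the 401-via-frontend-proxy case visible immediately instead of surfacing as a JSON parse error.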
  • q

    quick-student-61408

    01/03/2023, 1:42 PM
    Hi and happy new year! Can I switch this information? It's a little bit annoying to hover with the mouse to get the information 🙂 Thank you
  • a

    agreeable-belgium-70840

    01/03/2023, 3:26 PM
    Hello, I am trying to deploy DataHub v0.9.5. In GMS, I am getting the following error. Any ideas?
    Error opening zip file or JAR manifest missing : opentelemetry-javaagent-all.jar
    Error occurred during initialization of VM
    agent library failed to init: instrument
    2023/01/03 15:25:19 Command exited with error: exit status 1
  • q

    quiet-smartphone-60119

    01/03/2023, 3:46 PM
    Hi folks! Happy new year 🙂 Checking in to see if anyone knows a way in Search to pull up mlfeatures via matches against their description.
  • m

    melodic-dress-7431

    01/03/2023, 4:39 PM
    Hi, I have built the datahub-frontend-react image and am facing the following issue:
  • m

    melodic-dress-7431

    01/03/2023, 4:39 PM
    datahub-frontend-react     | play.api.UnexpectedException: Unexpected exception[NullPointerException: Null stream]
  • m

    melodic-dress-7431

    01/03/2023, 4:40 PM
    while running
    docker/dev.sh
  • m

    melodic-dress-7431

    01/03/2023, 4:40 PM
    When accessing the UI, I get the following page: