# troubleshoot
  • busy-furniture-10879 (12/14/2022, 2:42 PM)
    I'm trying to debug some Azure SSO stuff and I don't see any logs in the locations mentioned here. Is there a specific command or configuration change needed for Datahub to generate logs?
  • gentle-camera-33498 (07/20/2022, 7:11 PM)
    Hello guys, I created a custom ingestion source for Dataset Assertions. The ingestion is OK; the AssertionInfo is in my MySQL database and the AssertionRunEvent is in the Elasticsearch index. But I can't see the Assertions on the "Validation" tab, and I can't even delete the metadata. Any idea what could cause this?
    datahub delete --env PROD --entity_type assertion --hard
    This will permanently delete data from DataHub. Do you want to continue? [y/N]: y
    [2022-07-20 16:01:44,919] INFO     {datahub.cli.delete_cli:234} - datahub configured with <https://------>-
    [2022-07-20 16:01:46,660] INFO     {datahub.cli.delete_cli:247} - Filter matched 0 entities. Sample: []
    This will delete 0 entities. Are you sure? [y/N]: N
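    In case it's useful, a minimal sketch of soft-deleting a single assertion by URN with the Python emitter instead of the filter-based delete; the assertion URN and GMS address below are placeholders:

    # Hypothetical sketch: soft-delete one assertion by writing a Status aspect.
    # The assertion URN and GMS address are placeholders to adapt.
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import StatusClass

    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

    # Status(removed=True) hides the entity from the UI (a soft delete).
    emitter.emit(
        MetadataChangeProposalWrapper(
            entityUrn="urn:li:assertion:<assertion-id>",
            aspect=StatusClass(removed=True),
        )
    )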
  • brainy-piano-85560 (12/14/2022, 2:56 PM)
    Hey, I'm encountering a very weird problem. I'm trying to add a glossary term / domain; the UI pop-up says it was added, but it doesn't show in the UI, not after a refresh, and not on the entity page where I added it. Moreover, the 'most popular' and 'last seen' sections on the homepage do not update. I'm using DataHub on AWS EC2 with the quickstart build, and have only ingested some Postgres & Elastic data.
  • colossal-hairdresser-6799 (12/14/2022, 3:29 PM)
    Hi guys! What’s the easiest way of deleting domains from multiple datasets?
  • colossal-hairdresser-6799 (12/14/2022, 6:56 PM)
    CROSS POSTING https://datahubspace.slack.com/archives/CV2UXSE9L/p1671044127009829
  • salmon-area-51650 (12/15/2022, 7:05 AM)
    Hello team 👋, I'm getting an issue after upgrading to DataHub v0.9.3, using lookml ingestion. This is the error:
    [2022-12-15 06:56:05,712] ERROR    {datahub.entrypoints:187} - Failed to configure source (lookml): 1 validation error for LookMLSourceConfig
    base_folder
      base_folder is not provided. Neither has a github deploy_key or deploy_key_file been provided (type=value_error)
    However, deploy_key is provided. This is my configuration:
    source:
      type: lookml
      config:
        parse_table_names_from_sql: false
        github_info:
          deploy_key: '${DEPLOY_KEY}'
          repo: 'XXXXXXXX'
        api:
          base_url: 'https://XXXXX.eu.looker.com'
          client_secret: '${LOOKER_CLIENT_SECRET}'
          client_id: XXXXXXXXXXX
        project_name: my_project

    pipeline_name: 'lookerml_production'

    sink:
      type: "datahub-rest"
      config:
        server: "http://datahub-datahub-gms:8080"
    Any clue? Thanks in advance!
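    One way to take YAML indentation out of the equation (a common cause of "base_folder is not provided" when github_info ends up at the wrong nesting level) is to run the recipe programmatically; a minimal sketch, with placeholder repo and hostnames:

    # Hypothetical sketch: run the lookml recipe from Python so the nesting
    # of github_info under config is unambiguous. Repo/hostnames are placeholders.
    import os

    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(
        {
            "source": {
                "type": "lookml",
                "config": {
                    "parse_table_names_from_sql": False,
                    "github_info": {
                        "deploy_key": os.environ["DEPLOY_KEY"],
                        "repo": "my-org/my-looker-repo",
                    },
                    "api": {
                        "base_url": "https://example.eu.looker.com",
                        "client_id": os.environ["LOOKER_CLIENT_ID"],
                        "client_secret": os.environ["LOOKER_CLIENT_SECRET"],
                    },
                    "project_name": "my_project",
                },
            },
            "pipeline_name": "lookerml_production",
            "sink": {
                "type": "datahub-rest",
                "config": {"server": "http://datahub-datahub-gms:8080"},
            },
        }
    )
    pipeline.run()
    pipeline.raise_from_status()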
  • magnificent-lock-58916 (12/15/2022, 8:46 AM)
    Using Tableau ingestion and having an interesting problem… Before I explain it in detail, it's worth noting that we automatically delete everything related to Tableau from DataHub before each ingest execution. We do this because DataHub currently doesn't support stateful ingestion for Tableau and we need our metadata to be up to date. The problem is that the scheduled automatic ingestion ingests only a single Tableau folder called “Draft”, while executing it manually somehow ingests everything we need. What could be the reason behind this, and how can we fix it? Manually ingesting every day is kind of troublesome.
    Another problem we experience only with Tableau ingestion is that after ingestion it takes DataHub some time to display every ingested entity. For example, right after execution it shows only 20 Tableau entities, after some time 150, and later 300. In total we have 1.7k Tableau entities; could this quantity be related to the problem?
  • thankful-diamond-10319 (12/15/2022, 2:23 PM)
    Hi, I am trying to ingest from a Postgres database, and when I run the ingestion it appears that all of the data is added, but the last status shows as Failed, along with this extremely long error log. I am unable to identify the errors due to the length of the log, and was wondering whether any of these errors are critical and how to resolve them.
    exec-urn_li_dataHubExecutionRequest_4cf638d9-63e3-43c1-87ea-9f2ab965210e.log
  • flat-agency-53385 (12/15/2022, 9:58 PM)
    Hey folks, I have a simple GraphQL query I am using to return fields and descriptions for a dataset. It works great except I cannot get it to return the glossary terms associated with the fields in the dataset. I can see the terms attached to the fields in the GUI, but the glossary terms section for the fields in the GraphQL output returns null. Has anyone run into this before? Here is my query:
    query table_with_terms($urn: String!) {
      dataset(urn: $urn) {
        urn
        type
        name
        schemaMetadata(version: 0) {
          fields {
            fieldPath
            label
            description
            glossaryTerms {
              terms {
                associatedUrn
                term {
                  hierarchicalName
                  properties {
                    name
                  }
                }
              }
            }
          }
        }
      }
    }
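    Worth checking: glossary terms added through the UI are generally stored on the editableSchemaMetadata aspect rather than on schemaMetadata, which can make the query above return null. A minimal sketch of querying that aspect via the GraphQL API from Python; the host, token, and URN are placeholders:

    # Hypothetical sketch: query editableSchemaMetadata, where UI-added terms
    # usually live. Host, token, and dataset URN are placeholders.
    import requests

    QUERY = """
    query tableWithTerms($urn: String!) {
      dataset(urn: $urn) {
        editableSchemaMetadata {
          editableSchemaFieldInfo {
            fieldPath
            glossaryTerms {
              terms {
                associatedUrn
              }
            }
          }
        }
      }
    }
    """

    resp = requests.post(
        "http://localhost:9002/api/graphql",
        headers={"Authorization": "Bearer <personal-access-token>"},
        json={"query": QUERY, "variables": {"urn": "urn:li:dataset:(...)"}},
    )
    resp.raise_for_status()
    print(resp.json())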
  • broad-article-1339 (12/15/2022, 10:40 PM)
    Hey everyone, I'm trying to ingest Unity Catalog metadata, but there's a known issue where the env variable is ignored. Are there any known workarounds?
  • witty-butcher-82399 (12/16/2022, 10:51 AM)
    Hi! A couple of questions about authorization. I would like to define a policy such as: allow some users/groups to edit the description of the datasets in a given data platform instance. This is somewhat similar to this roadmap item: https://datahubproject.io/docs/authorization/policies#coming-soon
    • Ability to define Metadata Policies against multiple resources scoped to particular “Containers” (e.g. a “schema”, “database”, or “collection”)
    just going higher in the hierarchy, up to the platform instance level. For now, I was thinking of solving this with the resource_urn criterion: https://datahubproject.io/docs/authorization/policies#resources Does that criterion support operators other than EQUALS, such as starts-with, contains, or even regexp? It's definitely not possible in the UI; is it possible via policy as code? My second question is about the option to apply a policy to owners:
    Whether this policy should be applied to owners of the Metadata asset. If true, those who are marked as owners of a Metadata Asset, either directly or indirectly via a Group, will have the selected privileges.
    Can this be restricted to some ownership type in particular? Thanks
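    For the policy-as-code part, a rough sketch of creating such a policy through the GraphQL API from Python; the privilege name, actor URNs, and resource URNs are assumptions to adapt, and the resource filter here still matches exact URNs rather than prefixes or regexes:

    # Hypothetical sketch: create a metadata policy via the createPolicy
    # GraphQL mutation. All privilege/actor/resource values are placeholders.
    import requests

    MUTATION = """
    mutation createPolicy($input: PolicyUpdateInput!) {
      createPolicy(input: $input)
    }
    """

    policy_input = {
        "type": "METADATA",
        "name": "Edit docs on platform instance X",
        "description": "Allow a group to edit dataset descriptions",
        "state": "ACTIVE",
        "privileges": ["EDIT_ENTITY_DOCS"],
        "actors": {
            "users": [],
            "groups": ["urn:li:corpGroup:data-stewards"],
            "allUsers": False,
            "allGroups": False,
            "resourceOwners": False,
        },
        "resources": {
            "type": "dataset",
            # Exact URNs only; prefix/contains/regex matching is not supported here.
            "resources": ["urn:li:dataset:(urn:li:dataPlatform:kafka,topic,PROD)"],
            "allResources": False,
        },
    }

    resp = requests.post(
        "http://localhost:9002/api/graphql",
        headers={"Authorization": "Bearer <personal-access-token>"},
        json={"query": MUTATION, "variables": {"input": policy_input}},
    )
    resp.raise_for_status()
    print(resp.json())  # the new policy URN on success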
  • best-umbrella-88325 (12/16/2022, 11:27 AM)
    Hello community! We're trying to set up the DataHub code on our local systems. We've been able to build the code and are now trying to run the dev.sh script to create the images on our local system, as mentioned here: https://datahubproject.io/docs/docker/development However, we see this error pop up each time we run the script.
    WARNING: Some service image(s) must be built from source by running:
        docker compose build %s elasticsearch-setup datahub-frontend-react kafka-setup
    Error response from daemon: manifest for linkedin/datahub-elasticsearch-setup:debug not found: manifest unknown: manifest unknown
    In the output of docker images, we can see these images:
    $ docker images | grep debug
    linkedin/datahub-frontend-react                 debug                                                   912fbfb19f09   2 hours ago     182MB
    linkedin/datahub-kafka-setup                    debug                                                   796dcb017865   2 hours ago     673MB
    linkedin/datahub-elasticsearch-setup            debug                                                   482e6a255771   2 hours ago     23.1MB
    linkedin/datahub-gms                            debug                                                   2d51843ad479   10 months ago   292MB
    Can anyone help us out on this? Any help would be appreciated. Thanks
  • cold-father-66356 (12/16/2022, 12:50 PM)
    👋 Hello, team! What would be the easiest way to add an AWS access key, secret key, and token to the docker compose ENVs? (I want to ingest Athena data, and we use AWS credentials for access, not a username and password.) Thanks a lot
  • aloof-iron-76856 (12/16/2022, 1:17 PM)
    Hi community, we have been trying for a few days to configure the CLI on a dedicated machine in an environment with DataHub deployed on AWS EKS using the standard Helm chart from your distribution. curl works without any problem, but the datahub client ends up talking to the frontend (even when we specify the GMS port). Error:
    ...
    ConfigurationError: You seem to have connected to the frontend instead of the GMS endpoint. The rest emitter should connect to DataHub GMS (usually <datahub-gms-host>:8080) or Frontend GMS API (usually <frontend>:9002/api/gms)
    ...
    We tried changing addresses as prompted - no luck. Actual datahub init:
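    A quick way to confirm which endpoint the CLI-side emitter is actually reaching, sketched with the Python emitter; the hostname and port are placeholders for the in-cluster GMS service:

    # Hypothetical sketch: verify the emitter reaches GMS rather than the
    # frontend. The service hostname/port below are placeholders.
    from datahub.emitter.rest_emitter import DatahubRestEmitter

    emitter = DatahubRestEmitter(gms_server="http://datahub-datahub-gms:8080")

    # test_connection() raises a ConfigurationError when the address answers
    # like the frontend (port 9002) instead of GMS.
    emitter.test_connection()
    print("Connected to GMS")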
  • best-market-29539 (12/16/2022, 3:24 PM)
    Hello guys, my logo for GCS disappeared. Do you know how I can get it back?
  • wonderful-hair-89448 (12/16/2022, 5:05 PM)
    Hi, team. I am facing this issue after running the quickstart command mentioned below; I need to get this resolved.
    datahub docker quickstart --quickstart-compose-file ./docker/quickstart/docker-compose-without-neo4j-m1.quickstart.yml
    unable to run quickstart - the following issues were detected:
    • kafka-setup is still running
    • datahub-gms is running but not healthy
  • wonderful-hair-89448 (12/16/2022, 5:58 PM)
    I did install acryl-datahub 0.9.2, and after running the command datahub docker quickstart --quickstart-compose-file ./docker/quickstart/docker-compose-without-neo4j-m1.quickstart.yml I am still seeing:
    unable to run quickstart - the following issues were detected:
    • kafka-setup is still running
  • bitter-lawyer-49179 (12/19/2022, 8:53 AM)
    Hi everyone! I'm new to DataHub and I'm setting it up on my laptop to test it out. I've been able to get through some of my errors by finding resolutions on this channel. However, I'm now stuck at the point where I'm trying to connect to Postgres running on my local machine (not on Docker), but I'm unable to get it working. I came across this existing issue created by someone else, but no luck. Kindly help. Thanks!
    Postgers_connection_error
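    One common cause, sketched below: when the ingestion executes inside a DataHub container (e.g. UI-based ingestion), "localhost" resolves to the container itself, not your machine, so Docker's host alias is needed. Credentials, database name, and hostnames in this recipe are placeholders:

    # Hypothetical sketch: reach a Postgres instance on the host machine from
    # inside a DataHub container via host.docker.internal instead of localhost.
    # Credentials and database name are placeholders.
    from datahub.ingestion.run.pipeline import Pipeline

    recipe = {
        "source": {
            "type": "postgres",
            "config": {
                "host_port": "host.docker.internal:5432",
                "database": "mydb",
                "username": "datahub",
                "password": "example-password",
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://datahub-gms:8080"},
        },
    }

    Pipeline.create(recipe).run()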
  • busy-analyst-35820 (12/19/2022, 10:18 AM)
    Hi team, we have enabled Google Auth in DataHub; the version used is v0.9.2. Currently we have a scenario where we would like to authorize users based on the Google Groups they belong to, i.e., along with authenticating the user with Google credentials, we should also check the groups and authorize only those users who belong to a certain group that we specify. Is there any option to achieve this? We tried the below and couldn't notice any changes. https://datahubproject.io/docs/authentication/guides/sso/configure-oidc-react/
  • breezy-portugal-43538 (12/19/2022, 10:42 AM)
    Hi, is there some Python example of how I can pass the trainingData and evaluationData aspects in an MLModelSnapshot? What I tried was to pass it like:
    TrainingDataClass(
        trainingData=training_data
    )
    But it doesn't work; my training_data is just a list of dictionaries containing key/value pairs like in the example below: https://github.com/datahub-project/datahub/blob/a121aff6eb2814dc6d15c4406793ad1bd9[…]5eed3e/metadata-ingestion/examples/mce_files/bootstrap_mce.json
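    A minimal sketch of what may be going wrong, assuming the list-of-dicts is the issue: TrainingDataClass expects a list of BaseDataClass objects rather than plain dicts. Model and dataset names and the GMS address below are placeholders:

    # Hypothetical sketch: TrainingData takes BaseData objects, not dicts.
    # Model/dataset names and the GMS address are placeholders.
    from datahub.emitter.mce_builder import make_dataset_urn, make_ml_model_urn
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import (
        BaseDataClass,
        MetadataChangeEventClass,
        MLModelSnapshotClass,
        TrainingDataClass,
    )

    training_data = TrainingDataClass(
        trainingData=[
            BaseDataClass(
                dataset=make_dataset_urn(platform="hive", name="fct_users", env="PROD"),
                motivation="why this dataset was used",  # optional
                preProcessing=["dedupe", "normalize"],   # optional
            )
        ]
    )

    mce = MetadataChangeEventClass(
        proposedSnapshot=MLModelSnapshotClass(
            urn=make_ml_model_urn(platform="science", model_name="my-model", env="PROD"),
            aspects=[training_data],
        )
    )

    DatahubRestEmitter(gms_server="http://localhost:8080").emit(mce)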
  • salmon-angle-92685 (12/19/2022, 1:07 PM)
    Hello guys, two days ago the DataHub Helm chart (https://github.com/acryldata/datahub-helm/blob/master/charts/datahub/Chart.yaml) was updated, and since then our OIDC doesn't work anymore. We were forced to go back to the previous version, and it works fine again. We are using Auth0 as our authentication system. Error from the datahub-frontend pod:
    Caught exception while attempting to handle SSO callback! It's likely that SSO integration is mis-configured
    Error in our Auth0 logs:
    Parameter 'code_verifier' is required
    Anyone else facing this problem since Friday? Thanks!
  • lively-minister-2773 (12/19/2022, 1:38 PM)
    Hello, does DataHub support Redshift Serverless? Or is there a workaround, given that stl_insert is not available in Redshift Serverless? https://docs.aws.amazon.com/redshift/latest/mgmt/serverless-monitoring.html
    You can’t query STL, STV, SVCS, SVL, and some SVV system tables and views with Amazon Redshift Serverless, except the following: ….
    Thank you
  • microscopic-mechanic-13766 (12/19/2022, 3:54 PM)
    Good afternoon, I am trying to clone the project from Git using IntelliJ, but I am getting the following error: I have installed all the dependencies needed (Docker, docker-compose, and JDK 11). Has anyone experienced this "error" before?
  • elegant-salesmen-99143 (12/20/2022, 7:45 AM)
    Hi. I have a problem on Hive where, after tables are moved to a different schema or renamed, I can still see them in DataHub under the previous schema / previous name. I can see both the old and new versions of the table; the old one does not disappear from its schema after being renamed or moved. This happens even though the source metadata updates every day. Any ideas what I can look into to fix this?
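    One thing that may help, sketched under the assumption that the Hive source runs through a recipe: enabling stateful ingestion so entities that disappear from the source are soft-deleted on the next run. The pipeline_name and connection details below are placeholders:

    # Hypothetical sketch: stateful ingestion with stale-metadata removal, so
    # renamed/moved Hive tables are soft-deleted on the next run.
    # Connection details and pipeline_name are placeholders.
    from datahub.ingestion.run.pipeline import Pipeline

    pipeline = Pipeline.create(
        {
            # A stable pipeline_name is required for stateful ingestion.
            "pipeline_name": "hive_prod",
            "source": {
                "type": "hive",
                "config": {
                    "host_port": "hive-server:10000",
                    "stateful_ingestion": {
                        "enabled": True,
                        "remove_stale_metadata": True,
                    },
                },
            },
            "sink": {
                "type": "datahub-rest",
                "config": {"server": "http://datahub-gms:8080"},
            },
        }
    )
    pipeline.run()
    pipeline.raise_from_status()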
  • microscopic-mechanic-13766 (12/20/2022, 10:20 AM)
    Hi, so I am using v0.9.3 on both GMS and frontend, and when I Sign Out, if I use the browser's back arrow to go back to the DataHub home page, I can still access it (if the user I logged in with is from my OIDC provider, which in my case is Keycloak). This doesn't happen with users created inside DataHub.
  • silly-ability-65278 (12/20/2022, 10:39 AM)
    Hi, I tried to use ClickHouse as my data source. My server is running on Kubernetes and it only has IPv4. When I tried to ingest the metadata, it said the URL is an invalid IPv6 URL. How do I change to IPv4? Thank you
  • faint-actor-78390 (12/20/2022, 12:04 PM)
    Hi, is the option "datahub": { "serverType": "quickstart" } compatible with Great Expectations ingestion to DataHub? Thanks, Bruno
  • eager-vase-41681 (12/20/2022, 12:18 PM)
    Hi all, I successfully set up DataHub with Docker, but I'm failing to ingest the sample data. Elasticsearch and datahub-gms have different ports assigned compared to the default settings.
  • lemon-lock-89160 (12/20/2022, 2:59 PM)
    I'm facing an OCSP error when trying to connect to Snowflake. I have verified that:
    • the Snowflake network policy has the DataHub IP added
    • ports 80 and 443 are open
    Below is a snippet of my error log. Any ideas?
    {datahub.cli.ingest_cli:120} - Starting metadata ingestion
    [2022-12-20 14:39:38,569] ERROR    {snowflake.connector.ocsp_snowflake:1490} - Failed to get OCSP response after 1 attempt. Consider checking for OCSP URLs being blocked
    [2022-12-20 14:39:38,570] ERROR    {snowflake.connector.ocsp_snowflake:1065} - WARNING!!! Using fail-open to connect. Driver is connecting to an HTTPS endpoint without OCSP based Certificate Revocation checking as it could not obtain a valid OCSP Response to use from the CA OCSP responder. Details:
    {'driver': 'PythonConnector', 'version': '2.9.0', 'eventType': 'RevocationCheckFailure', 'eventSubType': 'OCSPResponseFailedToConnectCacheServer|OCSPResponseFetchFailure', 'sfcPeerHost':
  • broad-article-1339 (12/20/2022, 3:01 PM)
    Hey everyone, I recently upgraded DataHub from v0.8.33 to v0.9.2. I don't know if it's related, but now when I try adding new recipes I keep getting an Elasticsearch error
    ElasticsearchException[Elasticsearch exception [type=document_missing_exception, reason=[_doc][urn%3Ali%3Aassertion%3A5cc4c9d82c421bcc03798c6c42b7010f]: document missing]]]
    Has anyone come across this issue?