DataHub Community Slack (https://datahubproject.io)
# troubleshoot
  • bland-orange-13353 (01/27/2023, 1:27 PM)
    If you’re having trouble with quickstart, please make sure you’re using the most up-to-date version of DataHub by following the steps in the quickstart deployment guide: https://datahubproject.io/docs/quickstart/#deploying-datahub. Specifically, ensure you’re up to date with the DataHub CLI:
    python3 -m pip install --upgrade pip wheel setuptools
    python3 -m pip install --upgrade acryl-datahub
    datahub version
  • damp-greece-27806 (01/27/2023, 7:18 PM)
    Hi again, we’re still struggling with the datahub cron jobs failing on this method: https://github.com/datahub-project/datahub/blob/55357783f330950408e4624b3f1421594c[…]b/upgrade/nocodecleanup/DeleteLegacyGraphRelationshipsStep.java The failure being:
    Starting upgrade with id NoCodeDataMigrationCleanup...
    Executing Step 1/4: UpgradeQualificationStep...
    Found qualified upgrade candidate. Proceeding with upgrade...
    Completed Step 1/4: UpgradeQualificationStep successfully.
    Executing Step 2/4: DeleteLegacyAspectRowsStep...
    Completed Step 2/4: DeleteLegacyAspectRowsStep successfully.
    Executing Step 3/4: DeleteLegacyGraphRelationshipStep...
    Failed to delete legacy data from graph: java.lang.ClassCastException: class com.linkedin.metadata.graph.elastic.ElasticSearchGraphService cannot be cast to class com.linkedin.metadata.graph.neo4j.Neo4jGraphService (com.linkedin.metadata.graph.elastic.ElasticSearchGraphService and com.linkedin.metadata.graph.neo4j.Neo4jGraphService are in unnamed module of loader org.springframework.boot.loader.LaunchedURLClassLoader @7ca48474)
    Failed to delete legacy data from graph: java.lang.ClassCastException: class com.linkedin.metadata.graph.elastic.ElasticSearchGraphService cannot be cast to class com.linkedin.metadata.graph.neo4j.Neo4jGraphService (com.linkedin.metadata.graph.elastic.ElasticSearchGraphService and com.linkedin.metadata.graph.neo4j.Neo4jGraphService are in unnamed module of loader org.springframework.boot.loader.LaunchedURLClassLoader @7ca48474)
    Failed Step 3/4: DeleteLegacyGraphRelationshipStep. Failed after 1 retries.
    Exiting upgrade NoCodeDataMigrationCleanup with failure.
    Upgrade NoCodeDataMigrationCleanup completed with result FAILED. Exiting...
I’m not fully understanding this bit, as it seems specific to Neo4j, and I don’t know why it would or should be running for setups that aren’t using it. Thanks for your help in advance.
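A possible direction (a sketch, not a confirmed fix): the ClassCastException suggests the cleanup step obtained the Elasticsearch graph service where it expected Neo4j, so it may help to make the graph backend explicit to the upgrade container. The `GRAPH_SERVICE_IMPL` variable and the `datahub-upgrade` service name below are assumptions to verify against your deployment and DataHub version:

```yaml
# Hypothetical docker-compose override: pin the graph backend for the
# upgrade container so Neo4j-specific cleanup steps are not attempted.
services:
  datahub-upgrade:
    environment:
      - GRAPH_SERVICE_IMPL=elasticsearch  # assumption: same flag GMS reads
```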
  • ambitious-notebook-45027 (01/28/2023, 9:40 AM)
Hi everyone, can someone help me? I want to use LDAP to log in, but I get “Failed to log in! Invalid Credentials”. My JAAS config:
    WHZ-Authentication {
      com.sun.security.auth.module.LdapLoginModule sufficient
      java.naming.security.authentication="simple"
  userProvider="ldap://192.168.3.75:389"
      authIdentity="cn={USERNAME},dc=lenovoedu,dc=cn"
      userFilter="(&(objectClass=inetOrgPerson)(uid={USERNAME}))"
      debug="true"
      useSSL="false";
    };
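“Invalid Credentials” usually means the bind DN built from authIdentity doesn’t match an actual directory entry. A small sketch (not DataHub code) of how LdapLoginModule expands the {USERNAME} token, so the resulting DN can be tested directly against the directory:

```python
# Sketch: expand the {USERNAME} token the way LdapLoginModule does for
# authIdentity / userFilter. Templates copied from the JAAS config above.
AUTH_IDENTITY = "cn={USERNAME},dc=lenovoedu,dc=cn"
USER_FILTER = "(&(objectClass=inetOrgPerson)(uid={USERNAME}))"

def expand(template: str, username: str) -> str:
    """Substitute the {USERNAME} token into a JAAS LDAP template."""
    return template.replace("{USERNAME}", username)

if __name__ == "__main__":
    dn = expand(AUTH_IDENTITY, "alice")
    print(dn)  # cn=alice,dc=lenovoedu,dc=cn
    # Verify the same DN binds outside DataHub, e.g. with OpenLDAP tools:
    #   ldapsearch -x -H ldap://192.168.3.75:389 \
    #       -D "cn=alice,dc=lenovoedu,dc=cn" -W
    # If that bind also fails, the DN layout is wrong, not DataHub.
```

If users don’t actually live under `cn=...,dc=lenovoedu,dc=cn`, adjust authIdentity to the real entry layout.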
  • brainy-piano-85560 (01/29/2023, 1:12 PM)
Hey guys, I have a problem starting DataHub 🤔. I upgraded the code & CLI (DataHub CLI version: 0.9.6.2), and when I run datahub docker quickstart it shows me this:
- datahub-gms is still starting
- schema-registry is not running
- broker is not running
- zookeeper is not running
After a few minutes GMS goes down. I'm running DataHub on an EC2 instance, and it worked before (even with this version and a different CLI version); I've already ingested data, etc.
  • bland-orange-13353 (01/29/2023, 1:12 PM)
    If you’re having trouble with quickstart, please make sure you’re using the most up-to-date version of DataHub by following the steps in the quickstart deployment guide: https://datahubproject.io/docs/quickstart/#deploying-datahub. Specifically, ensure you’re up to date with the DataHub CLI:
    python3 -m pip install --upgrade pip wheel setuptools
    python3 -m pip install --upgrade acryl-datahub
    datahub version
  • elegant-state-4 (01/29/2023, 10:27 PM)
    Hi folks! I am working on a fork of datahub which I recently updated. I am running into the following build error when running the command
    ./gradlew :datahub-frontend:dist -x yarnTest -x yarnLint
    [16:15:15] Generate [started]
    [16:15:16] Generate [completed]
    [16:15:16] Generate src/types.generated.ts [completed]
    [16:15:16] Load GraphQL documents [completed]
    [16:15:16] Generate [started]
    [16:15:17] Generate [completed]
    [16:15:17] Generate to src/ (using EXPERIMENTAL preset "near-operation-file") [completed]
    [16:15:17] Generate outputs [completed]
    Creating an optimized production build...
    Browserslist: caniuse-lite is outdated. Please run:
    npx browserslist@latest --update-db
    
    Why you should do it regularly:
https://github.com/browserslist/browserslist#browsers-data-updating
    Failed to compile.
    
    /Users/eyomi/datahub/datahub-web-react/src/App.tsx
    TypeScript error in /Users/eyomi/datahub/datahub-web-react/src/App.tsx(109,10):
    'ThemeProvider' cannot be used as a JSX component.
      Its instance type 'Component<ThemeProviderProps<DefaultTheme, DefaultTheme>, any, any>' is not a valid JSX element.
        The types returned by 'render()' are incompatible between these types.
          Type 'React.ReactNode' is not assignable to type 'import("/Users/eyomi/datahub/node_modules/@types/react/index").ReactNode'.  TS2786
    
        107 |
        108 |     return (
      > 109 |         <ThemeProvider theme={dynamicThemeConfig}>
            |          ^
        110 |             <Router>
        111 |                 <Helmet>
        112 |                     <title>{dynamicThemeConfig.content.title}</title>
    
    
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.
    error Command failed with exit code 1.
    
    > Task :datahub-web-react:yarnQuickBuild FAILED
    
    FAILURE: Build failed with an exception.
    Any idea what could be causing this issue?
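The “'ThemeProvider' cannot be used as a JSX component” error is a known symptom of mixed @types/react versions in a yarn workspace (React 17 types in the app clashing with React 18 types pulled in transitively). One workaround, assuming Yarn is in use (it is for datahub-web-react), is to pin @types/react via a resolutions entry in the root package.json; the exact version below is illustrative and should match your branch:

```json
{
  "resolutions": {
    "@types/react": "^17.0.0"
  }
}
```

After adding it, re-run yarn install so the single pinned version is hoisted.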
  • average-dinner-25106 (01/30/2023, 7:23 AM)
Hello, how can I deactivate the 'Stats' tab? It makes allocating resources inefficient. I made a YAML recipe without adding 'stats', but the tab still appeared. I did activate it once before, but that was just for testing; after that, the Stats tab stayed activated automatically.
  • many-solstice-66904 (01/30/2023, 9:22 AM)
Good morning all! I am trying to run the ./docker/dev.sh script to get started on local development. When I do this I run into the following issue:
    + docker-compose -f docker-compose.yml -f docker-compose.override.yml -f docker-compose.dev.yml pull
    [+] Running 9/12
     ⠿ zookeeper Pulled                                                                                                                                                                                            1.2s
     ⠿ broker Pulled                                                                                                                                                                                               1.2s
     ⠿ elasticsearch-setup Warning                                                                                                                                                                                 1.5s
     ⠿ kafka-setup Warning                                                                                                                                                                                         1.7s
     ⠿ mysql Pulled                                                                                                                                                                                                1.2s
     ⠿ datahub-actions Pulled                                                                                                                                                                                      1.3s
     ⠿ mysql-setup Pulled                                                                                                                                                                                          1.4s
     ⠿ datahub-frontend-react Warning                                                                                                                                                                              1.5s
     ⠿ neo4j Pulled                                                                                                                                                                                                1.3s
     ⠿ datahub-gms Pulled                                                                                                                                                                                          1.4s
     ⠿ schema-registry Pulled                                                                                                                                                                                      1.4s
     ⠿ elasticsearch Pulled                                                                                                                                                                                        1.2s
    WARNING: Some service image(s) must be built from source by running:
        docker compose build datahub-frontend-react elasticsearch-setup kafka-setup
    3 errors occurred:
    	* Error response from daemon: manifest for linkedin/datahub-frontend-react:debug not found: manifest unknown: manifest unknown
    	* Error response from daemon: manifest for linkedin/datahub-elasticsearch-setup:debug not found: manifest unknown: manifest unknown
    	* Error response from daemon: manifest for linkedin/datahub-kafka-setup:debug not found: manifest unknown: manifest unknown
Running
COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1 DOCKER_DEFAULT_PLATFORM="$(uname -m)" docker-compose -p datahub -f docker-compose.yml -f docker-compose.override.yml -f docker-compose.dev.yml build kafka-setup
exits with exit code 0, so that works fine. My local build also succeeds with no issues. I've had a look through the documentation and this channel but have not found a similar issue, so I am at a bit of a loss what to do here.
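For what it's worth, the `:debug` image tags in the errors are not published to Docker Hub, which is why the pull fails; the warning's suggestion to build them locally is the likely path. A sketch combining the env vars already used above with the three services the pull step could not find (verify service names against your docker-compose files):

```
# Build the images dev.sh could not pull, then re-run ./docker/dev.sh
export COMPOSE_DOCKER_CLI_BUILD=1 DOCKER_BUILDKIT=1
export DOCKER_DEFAULT_PLATFORM="$(uname -m)"
docker-compose -p datahub \
  -f docker-compose.yml -f docker-compose.override.yml -f docker-compose.dev.yml \
  build datahub-frontend-react elasticsearch-setup kafka-setup
```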
  • fresh-cricket-75926 (01/30/2023, 11:17 AM)
Hi community, I was trying to ingest dbt data using the recipe below, but I am getting "JSONDecodeError: Extra data: line 4 column 1 (char 1741602)". Any suggestions would be helpful.
source:
  type: "dbt"
  config:
    manifest_path: "/datahub/ingestion/recipes/dbt_docs_split_data/manifest.json"
    catalog_path: "/datahub/ingestion/recipes/dbt_docs_split_data/catalog.json"
    test_results_path: "/datahub/ingestion/recipes/dbt_docs_split_data/run_results.json"
    target_platform: "redshift"
sink:
  type: datahub-rest
  config:
    server: 'http://datahub-datahub-gms:8080'
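"Extra data" from Python's json parser means the file contains more than one JSON document (e.g. two manifests concatenated together). A quick sketch to check which of the three artifact files is malformed before handing them to the recipe; the paths are the ones from the recipe above:

```python
import json
from typing import Optional

# dbt artifact files referenced by the recipe above.
PATHS = [
    "/datahub/ingestion/recipes/dbt_docs_split_data/manifest.json",
    "/datahub/ingestion/recipes/dbt_docs_split_data/catalog.json",
    "/datahub/ingestion/recipes/dbt_docs_split_data/run_results.json",
]

def check_single_json(text: str) -> Optional[str]:
    """Return an error description, or None if text is one JSON document."""
    try:
        json.loads(text)
        return None
    except json.JSONDecodeError as e:
        # "Extra data" here means a second document starts after the first.
        return f"{e.msg} at line {e.lineno} column {e.colno}"

if __name__ == "__main__":
    for path in PATHS:
        try:
            with open(path) as f:
                err = check_single_json(f.read())
        except OSError as e:
            err = str(e)
        print(path, "->", err or "OK")
```

If a file reports "Extra data", regenerate it with a fresh dbt run rather than appending to the old artifact.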
  • delightful-orange-22738 (01/30/2023, 12:34 PM)
Hello, have you seen this bug? Auth is not working on this version:
helm upgrade --install datahub datahub/datahub --values charts/datahub/values.yaml --version 0.2.128
The login attempt fails with:
LogIn.tsx:85  POST http://datahub.bdp-prod/logIn 500 (Internal Server Error)
  • fierce-garage-74290 (01/30/2023, 1:00 PM)
Export dataset validation results to a BI dashboard (e.g. Power BI): we were asked by our client whether it'd be possible to create a dashboard with DQ-check history in a BI tool. Just wondering if anyone has had a similar task before and what would be the simplest way to accomplish it. API requests directly from Power BI using the GraphQL API? Or maybe sync some underlying RDS structures (we're on AWS) to Snowflake and then create a dashboard on top of that? The sync does not have to be real-time; even once a day is sufficient. Thanks in advance for the tips!
  • delightful-orange-22738 (01/30/2023, 1:26 PM)
I can log in as the deleted user datahub/datahub with full admin rights 😢
helm upgrade --install datahub datahub/datahub --values charts/datahub/values.yaml --version 0.2.128
  • cuddly-plumber-64837 (01/30/2023, 2:51 PM)
    Hello all, my docker quickstart is now failing after no issues previously. Has anyone faced a similar issue?
  • bland-orange-13353 (01/30/2023, 2:51 PM)
    If you’re having trouble with quickstart, please make sure you’re using the most up-to-date version of DataHub by following the steps in the quickstart deployment guide: https://datahubproject.io/docs/quickstart/#deploying-datahub. Specifically, ensure you’re up to date with the DataHub CLI:
    python3 -m pip install --upgrade pip wheel setuptools
    python3 -m pip install --upgrade acryl-datahub
    datahub version
  • early-student-2446 (01/30/2023, 3:59 PM)
    Hi! Is there a way to recreate all elastic Indices without a Kubernetes job? (maybe post with wildcard for all URNs?)
  • bulky-jackal-3422 (01/30/2023, 4:16 PM)
Hi everyone, I'm trying to use Airflow 2.4.3 with Meltano's datahub plugin, but every time I try to start up Airflow I now get an error that says:
Invalid version: '[<current_datetime>] {_plugin.py:349} INFO - Patching datahub policy 2.4.3'
I'm using apache-airflow==2.4.3, apache-airflow-providers-amazon==6.0.0, and acryl-datahub-airflow-plugin==0.9.6.2. I've configured the gms_host in the datahub utility, but I haven't set the connection on the Airflow side because I would normally do this through the UI. Is there a way for me to set the configuration in my meltano.yml? I've tried setting datahub_conn_id under config.core, but no luck. Any idea what policy this error is referring to?
  • numerous-ram-92457 (01/30/2023, 6:44 PM)
    Hey all 👋🏼, trying to setup our Snowflake ingestion and keep running into issues. Wondering if anyone could take a look at our log file and provide any insight.
    exec-urn_li_dataHubExecutionRequest_6b3edb7d-8e93-455d-8712-7c5015d7d33c.log
  • fierce-garage-74290 (01/30/2023, 11:20 PM)
Is it possible to mark in the Business Glossary File Format that a certain glossary term is deprecated? I cannot see any such option in the documentation: https://datahubproject.io/docs/generated/ingestion/sources/business-glossary/ I'd like to manage this via the repo, not the UI.
  • average-dinner-25106 (01/31/2023, 1:47 AM)
Hi, I made a customized policy that restricts managing metadata ingestion. The problem is that this policy doesn't seem to be applied. As the second figure shows, the user who is permitted to manage metadata ingestion can't find the 'Ingestion' tab. Is this a bug? If not, I don't know why it didn't work.
  • brash-helicopter-28341 (01/31/2023, 8:21 AM)
Hi, an ingestion run was triggered twice at once. Why did that happen? https://github.com/datahub-project/datahub/issues/7053
  • brash-helicopter-28341 (01/31/2023, 8:23 AM)
Still having this bug :(
  • lively-jackal-83760 (01/31/2023, 8:56 AM)
Hi guys, I'm working with DataHub using the Java lib datahub-client. In the latest version, this dependency has a built-in dependency on io.swagger, which conflicts with my own project. How can we exclude Swagger from DataHub's Java lib? It doesn't look like a common Maven library with regular dependencies.
  • wooden-breakfast-17692 (01/31/2023, 12:49 PM)
Hey all, I’m trying to build DataHub inside Docker. Currently, I’m only able to run the provided docker(-compose) files, which mount bin files and deploy them from Docker. What I’m trying to achieve is to also build inside Docker. Is this currently possible? Thanks in advance!
  • numerous-application-54063 (01/31/2023, 5:07 PM)
Hello, we have an issue with advanced filtering in the UI. In any filter type (platform, tags, ...), we are not able to search for a free keyword, because the search always returns the same 5 fixed elements, which are also shown as default values. Any valid tag or platform is not found if it's not in the default list. Has anyone encountered this issue before? We are on version 0.9.6, but it was not working on 0.9.0 for us either.
  • faint-painting-38451 (01/31/2023, 5:14 PM)
Hi everyone, we set up a couple of users to generate access tokens for the roles admin, writer and reader. With the reader user, the UI looks correct and we were able to generate a token; however, that token still seems able to hit all GMS endpoints. For example, I tried both /entities?action=ingest and /aspects?action=ingestProposal, and both went through successfully with the reader token. The GMS is validating the token, because requests with no token or with an invalid one both got 401 errors. Also, I noticed that a GraphQL mutate with the reader user was rejected; it's just the GMS calls that go through. Do the GMS endpoints check whether the actor has permission to make the call?
  • creamy-machine-95935 (01/31/2023, 7:58 PM)
Hey! 😄 How often do you publish new releases? I'm asking because the feature my team needs is already merged to master.
  • elegant-state-4 (01/31/2023, 9:03 PM)
Hey folks! I am trying to publish metadata to DataHub using the OpenAPI interface. I understand I need to generate the API and model classes using a codegen tool of my choosing. Can someone recommend a good codegen tool they have used for this purpose, and if possible sample code to demonstrate? I tried using openapi-generator but ran into some compilation issues. Any help would be appreciated.
  • gentle-lifeguard-88494 (02/01/2023, 1:06 AM)
    Hey everyone, I need some help programmatically accessing the graphql API endpoint in Python. I generated a PAT using this code:
    # query = """mutation {
    #   createAccessToken(input: {type: PERSONAL, actorUrn: "urn:li:corpuser:datahub", duration: NO_EXPIRY, name: "my personal token"}) {
    #     accessToken
    #     metadata {
    #       id
    #       name
    #       description
    #     }
    #   }
    # }
    # """
    Then I went ahead and added it to my recipe:
    source:
      type: postgres
      config:
        # Coordinates
        host_port: localhost:5432
        database: andreasmartinson
    
        # Credentials
        username: ${USERNAME}
        password: ${PASSWORD}
    
        # Options
        database_alias: postgres
    
# https://datahubproject.io/docs/authentication/introducing-metadata-service-authentication/
    sink:
      type: "datahub-rest"
      config:
    server: "http://localhost:8080"
        token: ${DATAHUB_TOKEN}
    I then re-ingested the metadata and passed in the token:
    datahub ingest -c config/postgres_datahub_config.dhub.yaml
    Then I tried this query:
    import requests
    
    query = """query search {
      search(input: {type: DATASET, query: "*", start: 0, count: 100}) {
        searchResults {
          entity {
            ... on Dataset {
              properties {
                name
                description
              }
              schemaMetadata {
                name
                fields {
                  fieldPath
                  nativeDataType
                }
              }
            }
          }
        }
      }
    }
    """
url = 'http://localhost:9002/api/graphiql'
r = requests.post(url, json={'query': query})
    print(r.status_code)
    print(r.text)
    I verified that the query worked using the graphql frontend, it's just that I keep getting a 401 permissions error. Could someone help me figure out why I might be getting a permissions error? Feel like I'm probably missing something simple here
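A 401 there is not surprising: /api/graphiql is the interactive browser UI route, and the request above carries no credentials at all. A sketch of the two likely fixes, i.e. targeting a GraphQL API endpoint and passing the PAT as a bearer token; the endpoint paths are assumptions to verify against your deployment (GMS commonly serves /api/graphql on :8080, and the frontend proxies /api/v2/graphql on :9002):

```python
import json
import os
import urllib.request

GRAPHQL_URL = "http://localhost:8080/api/graphql"  # assumption: GMS host

def bearer_headers(token: str) -> dict:
    """Build the Authorization header DataHub's token auth expects."""
    return {"Authorization": f"Bearer {token}"}

def post_graphql(url: str, query: str, token: str) -> str:
    """POST a GraphQL query with the PAT attached; return the raw body."""
    body = json.dumps({"query": query}).encode()
    req = urllib.request.Request(
        url, data=body,
        headers={"Content-Type": "application/json", **bearer_headers(token)},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()

if __name__ == "__main__":
    token = os.environ["DATAHUB_TOKEN"]
    print(post_graphql(GRAPHQL_URL, "query { me { corpUser { username } } }", token))
```

The same headers work with the requests library used above; the key point is that the token from the createAccessToken mutation must be sent on every API call.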
  • average-dinner-25106 (02/01/2023, 1:44 AM)
Hello, as far as I know, DataHub supports extracting lineage from Postgres metadata, so I set 'include_table_location_lineage' to true in the YAML recipe. After ingestion, what I expected is that since account_id is a foreign key of the table 'account_product_relation', it would be connected to the 'account' table automatically. But the lineage didn't show up: clicking the Lineage tab results in an empty space. Is the lineage service not meant for entity relationships? I want to use lineage like an ERD.
  • silly-accountant-1288 (02/01/2023, 7:14 AM)
Hello, I am having trouble with the Neo4j-backed lineage store. When lineages are nested and contain cycles, the GMS container always runs into OOM. I found that the lineage shortest-path check is done in metadata-io, not on the Neo4j side. Current code in Neo4jGraphService.java:
    final String multiHopTemplateDirect = "MATCH (a {urn: '%s'})-[r:%s*1..%d]->(b) WHERE b:%s RETURN a,r,b";
    final String multiHopTemplateIndirect = "MATCH (a {urn: '%s'})<-[r:%s*1..%d]-(b) WHERE b:%s RETURN a,r,b";
Should it become:
    final String multiHopTemplateDirect = "MATCH shortestPath((a {urn: '%s'})-[r:%s*1..%d]->(b)) WHERE b.urn <> '%s' AND b:%s RETURN a,r,b";
    final String multiHopTemplateIndirect = "MATCH shortestPath((a {urn: '%s'})<-[r:%s*1..%d]-(b)) WHERE b.urn <> '%s' AND b:%s RETURN a,r,b";
and also remove:
    // It is possible to have more than 1 path from node A to node B in the graph and previous query returns all the paths.
    // We convert the List into Map with only the shortest paths. "item.get(i).size()" is the path size between two nodes in relation.
    // The key for mapping is the destination node as the source node is always the same, and it is defined by parameter.
neo4jResult = neo4jResult.stream()
        .collect(Collectors.toMap(
            item -> item.values().get(2).asNode().get("urn").asString(),
            Function.identity(),
            (item1, item2) -> item1.get(1).size() < item2.get(1).size() ? item1 : item2))
        .values()
        .stream()
        .collect(Collectors.toList());