# troubleshoot
  • p

    plain-farmer-27314

    01/11/2022, 8:32 PM
    Hey y'all, trying out the new version that was released. Seeing an error when running our usual BigQuery ingest. Version:
    Copy code
    ➜  datahub-explore python3 -m datahub version
    DataHub CLI version: 0.8.22.1
    Python version: 3.7.9 (default, Jan 24 2021, 17:48:25)
    [Clang 7.1.0 (tags/RELEASE_710/final)]
    I've also made sure to upgrade the bigquery plugin and the datahub-rest plugin. The error:
    Copy code
    File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
    File "<frozen importlib._bootstrap>", line 983, in _find_and_load
    File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
    File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
    File "<frozen importlib._bootstrap_external>", line 728, in exec_module
    File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
    File "/Users/zachary.bluhm/.local/lib/python3.7/site-packages/datahub/ingestion/source/sql/bigquery.py", line 25, in <module>
        from datahub.ingestion.source.sql.sql_common import (
    File "/Users/zachary.bluhm/.local/lib/python3.7/site-packages/datahub/ingestion/source/sql/sql_common.py", line 39, in <module>
        from datahub.ingestion.source.state.stateful_ingestion_base import (
    File "/Users/zachary.bluhm/.local/lib/python3.7/site-packages/datahub/ingestion/source/state/stateful_ingestion_base.py", line 18, in <module>
        from datahub.ingestion.source.state_provider.state_provider_registry import (
    File "/Users/zachary.bluhm/.local/lib/python3.7/site-packages/datahub/ingestion/source/state_provider/state_provider_registry.py", line 10, in <module>
        assert ingestion_state_provider_registry.get("datahub")
    File "/Users/zachary.bluhm/.local/lib/python3.7/site-packages/datahub/ingestion/api/registry.py", line 124, in get
        raise KeyError(f"Did not find a registered class for {key}")
    Also fwiw, looker ingestion is working as normal
  • m

    melodic-helmet-78607

    01/12/2022, 5:39 AM
    The GraphQL listRecommendations query breaks when I search two URNs at the same time that have similar names but different case, e.g. urn:li:glossaryTerm:channel vs urn:li:glossaryTerm:Channel. Does anyone know how to reset frontend recommendations / reset tracking statistics?
  • h

    high-hospital-85984

    01/12/2022, 10:43 AM
    👋 I'm trying out the new authentication feature locally, enabled it both for GMS and frontend, and created a token for myself through the UI. Now, however, when I make the request suggested in the UI:
    Copy code
    curl -X POST 'http://localhost:9002/api/graphql' \
    --header 'Authorization: Bearer <token>' \
    --header 'Content-Type: application/json' \
    --data-raw '{"query":"{\n  me {\n    corpUser {\n        username\n    }\n  }\n}","variables":{}}'
    I get a 401. Making the same request to GMS (:8080) works, but so does any other token, or no token at all. Is there some way to ensure that the authentication service has indeed been enabled properly?
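For comparing the two endpoints side by side, the curl call above can be reproduced with Python's standard library. This is only a minimal sketch: the helper names `build_me_request` and `status_of` are hypothetical, and it assumes GMS serves the same `/api/graphql` path as the frontend.

```python
import json
import urllib.error
import urllib.request

# Same GraphQL document as in the curl example above.
ME_QUERY = "{\n  me {\n    corpUser {\n        username\n    }\n  }\n}"


def build_me_request(base_url, token=None):
    """Build the POST request; no Authorization header is sent when token is None."""
    headers = {"Content-Type": "application/json"}
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"
    body = json.dumps({"query": ME_QUERY, "variables": {}}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/graphql", data=body, headers=headers, method="POST"
    )


def status_of(request):
    """Send the request and return the HTTP status code, including 4xx/5xx."""
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code
```

Calling `status_of(build_me_request("http://localhost:8080"))` with no token and then with a deliberately wrong token gives a quick check of whether GMS is rejecting anything at all; a 200 in both cases would match the behaviour described above.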
  • q

    quick-pizza-8906

    01/12/2022, 1:13 PM
    The docs mention datahub-spark-lineage version 0.0.3: https://datahubproject.io/docs/metadata-integration/java/spark-lineage/ But when I search for it in Maven: https://search.maven.org/search?q=a:datahub-spark-lineage I find only version 0.0.2. Where can I find the newest jar?
  • r

    red-pizza-28006

    01/12/2022, 1:45 PM
    I am starting to see this error at the top of our datasets after the recent upgrade:
    Copy code
    Validation error of type FieldUndefined: Field 'operations' in type 'Dataset' is undefined @ 'dataset/operations' (code undefined)
    ✅ 1
  • f

    faint-painting-38451

    01/12/2022, 7:18 PM
    We are currently unable to build DataHub due to this issue: https://github.com/linkedin/datahub/issues/3879. Is there a workaround to build without connecting to jcenter.bintray?
    ✅ 1
    👍 1
  • c

    careful-insurance-60247

    01/12/2022, 9:35 PM
    Stood up a new DataHub deployment on AWS k8s, but after I log in I'm presented with a white screen. The DataHub dashboard shows for a split second.
  • e

    eager-gpu-17565

    01/13/2022, 9:02 AM
    Hi Team! Want to know if I can ingest metadata from Salesforce into DataHub?
  • a

    acceptable-potato-35922

    01/13/2022, 5:29 PM
    We are running into issues extracting lineage from BigQuery because our BQ implementation is split across multiple projects, and currently DataHub only auto-extracts from one project at a time. https://feature-requests.datahubproject.io/b/feedback/p/bigquery-lineage-between-multiple-projects Since this feature is not available yet, is there a workaround that has worked for others with a similar BigQuery setup?
  • a

    acceptable-architect-70237

    01/13/2022, 9:00 PM
    Hi, not sure if it's just me. I have deployed the frontend-react and gms apps separately. I have made sure the gms API endpoint works as expected by calling api/graphql from cURL or Postman. I also made sure my frontend deployed successfully. But I keep receiving errors from the datahub-frontend module:
    Copy code
    502,  Unable to route request - Domain Unavailable
    I did another experiment: running the frontend-react Docker container locally and pointing it at the remote gms, I received the same 502, Unable to route request - Domain Unavailable error. Of course, if the local frontend-react points to the local datahub-gms, everything works fine. I also tried having other backend apps (Python, Node.js) talk to the remote backend api/graphql, and that works fine as well. What could be the problem? Is it a setting in the frontend module?
  • n

    numerous-eve-42142

    01/13/2022, 9:38 PM
    Hi everyone! I'm running DataHub locally to do some tests. Before putting it to work with real databases, I'm using a local Postgres Docker container as the source to ingest. When I ingested metadata from the CLI, it seemed to work just fine, but only postgres.information_schema entries are shown in DataHub, not the sample tables I've created. Does anyone have a clue what could be happening? Here are the code and images:
    postgre_datahum.yml:
    source:
      type: postgres
      config:
        # Coordinates
        host_port: localhost:5432
        database: postgres
        # Credentials
        username: postgres
        password: 1234
    sink:
      # sink configs
      type: "datahub-rest"
      config:
        server: "http://localhost:8080"
  • n

    nutritious-bird-77396

    01/13/2022, 10:59 PM
    I see some sample urns like urn:li:dataset:(urn:li:dataPlatform:foo,bar,PROD) returned from GMS but they don't show up in the frontend. Are the example urns ignored in the frontend config somewhere?
  • m

    miniature-television-17996

    01/14/2022, 7:52 AM
    Hello, could you tell me how to debug such a problem?
  • w

    witty-laptop-49489

    01/14/2022, 10:39 AM
    Hi, since version v0.8.22 the UI stopped showing Inputs for charts.
  • d

    damp-queen-61493

    01/14/2022, 7:39 PM
    Hi everyone! I just deployed DataHub on k8s with the image linkedin/datahub-frontend-react:v0.8.22, but I can't see the DataHub version at the top right. Am I really using version 0.8.22? (Sorry for my English)
  • b

    billions-tent-29367

    01/14/2022, 8:56 PM
    Question regarding designing models: We have written an entity to represent an organization Team. The key aspect is the Team's Name. Teams change their names occasionally. Since the key aspect is immutable, team name changes look the same as team creation. How best should we maintain the history of the team?
  • b

    bland-barista-59197

    01/15/2022, 4:39 AM
    Hi, with the new version the following GraphQL query stopped working:
    { search(input: { type: DATASET, query: "*", start: 0, count: 10000, filters: [ { field: "tags", value: "Legacy" } ] }) { searchResults { entity { urn } } } }
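Queries copied through chat clients or word processors often pick up typographic double quotes, which are not valid GraphQL string delimiters. This may be unrelated to the regression described above, but it is worth ruling out before debugging the server side. A minimal sketch, where `normalize_quotes` is a hypothetical helper name:

```python
# Map typographic double quotes to the ASCII double quote that
# GraphQL string literals are delimited by.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}


def normalize_quotes(query: str) -> str:
    """Return the query with typographic double quotes replaced by ASCII ones."""
    return query.translate(str.maketrans(SMART_QUOTES))


pasted = "filters: [ { field: \u201ctags\u201d, value: \u201cLegacy\u201d } ]"
print(normalize_quotes(pasted))
```

Running the whole query text through the helper before sending it leaves already-correct quotes untouched.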
  • n

    nutritious-bird-77396

    01/15/2022, 5:28 PM
    Build failure with the new release (0.8.23) tar file. The build works from master, but fails when building from the release archive. Steps to reproduce:
    Copy code
    export DATAHUB_VERSION=0.8.23
    wget https://github.com/linkedin/datahub/archive/refs/tags/v${DATAHUB_VERSION}.tar.gz && \
        tar -xzf v${DATAHUB_VERSION}.tar.gz && \
        mv datahub-${DATAHUB_VERSION} datahub-src && \
        rm v${DATAHUB_VERSION}.tar.gz
    
    cd /datahub-src && ./gradlew :metadata-service:war:build -x test
    Error stack in 🧵
  • a

    ancient-pillow-45716

    01/17/2022, 6:57 AM
    I installed DataHub with the Docker quickstart (datahub docker quickstart) and that did not include any source code, just several running containers. How can I edit the user.props file to add a user for login? I tried to edit this file inside the datahub-frontend-react container but it didn't work. @square-activity-64562 @little-megabyte-1074
  • g

    glamorous-microphone-33484

    01/17/2022, 8:33 AM
    Hi all, Got the following error when running open-api ingest. The error was: "<method 'sourceReport.report_warning' of SourceReport_produced=1, workunit_ids=['endpoint url']], warnings={endpoint_url}: ['Field in swagger file does not give consistent data', 'Unable to find an example for endpoint. Please add it to the list of forced examples']. The weird thing was that I added the forced examples in the yml file. Any idea how to debug it further?
  • b

    billowy-lock-72499

    01/17/2022, 11:07 AM
    Hi team, could anyone help? To work with GraphQL, do we need a separate GraphQL database?
  • q

    quaint-branch-37931

    01/17/2022, 1:37 PM
    After updating DataHub from 0.8.17 to 0.8.23 I'm seeing a lot of FieldUndefined errors in the UI. Do I have to take additional steps to upgrade, apart from updating the container versions?
  • r

    rich-policeman-92383

    01/18/2022, 8:27 AM
    Hello. Can we copy the description, glossary terms, about, and owners from one dataset to another, provided both datasets are identical? Scenario: there are two Oracle datasets, A and A_new, with similar columns. We used to use dataset A, which had all the required column descriptions entered manually through the DataHub UI. Now we will be using dataset A_new. How can we copy the description, glossary terms, about, and owners? Both datasets have 200+ columns, of which 100 have descriptions and glossary terms. I was thinking of selecting all the postgres rows that have the URN of dataset A with a description in the metadata column, then inserting these rows with the URN replaced by A_new's in both the URN and metadata columns. This will require some effort and is error prone. Looking for suggestions on this.
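The row-rewriting idea described above can be sketched in plain Python to make the error-prone part explicit. Everything here is an assumption for illustration: the two URNs are hypothetical, the row shape (`urn`, `aspect`, `metadata` as a JSON string) is assumed rather than taken from the actual table schema, and `retarget_row` is a hypothetical helper name.

```python
import json

# Hypothetical URNs; the real ones depend on how the Oracle source names datasets.
OLD_URN = "urn:li:dataset:(urn:li:dataPlatform:oracle,schema.A,PROD)"
NEW_URN = "urn:li:dataset:(urn:li:dataPlatform:oracle,schema.A_new,PROD)"


def retarget_row(row, old_urn, new_urn):
    """Return a copy of an aspect row that points at new_urn instead of old_urn.

    `row` is a dict with 'urn', 'aspect' and 'metadata' keys, where 'metadata'
    is a JSON string (an assumed shape for the rows described above).
    """
    if row["urn"] != old_urn:
        raise ValueError(f"row belongs to {row['urn']}, not {old_urn}")
    # Replace the full URN string only, so that 'A' never matches inside 'A_new'.
    metadata = row["metadata"].replace(old_urn, new_urn)
    json.loads(metadata)  # fail early if the rewrite produced invalid JSON
    return {**row, "urn": new_urn, "metadata": metadata}
```

Matching on the complete URN rather than the bare table name is the main design choice: a substring replace on `A` alone would also rewrite `A_new` and any column text containing that letter.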
  • b

    bland-orange-13353

    01/18/2022, 2:08 PM
    This message was deleted.
  • r

    rich-policeman-92383

    01/18/2022, 9:06 PM
    Hello. While using datahub-spark-lineage-0.0.2.jar with Spark version 2.4.8 I am getting the error "ERROR MCPEmitter: GMS URL not configured". Command used: spark-submit count.py mnm_dataset.csv. If I configure "spark.datahub.lineage.mcpEmitter.gmsUrl" instead of "spark.datahub.rest.server", it results in the error "Server returned HTTP response code: 500 for URL http://localhost:8080/aspects?action=ingestProposal". The config looks like:
    Copy code
    spark = (SparkSession
            .builder
            .appName("PythonMnMCount")
            .config("spark.jars.packages","io.acryl:datahub-spark-lineage:0.0.2")
            .config("spark.extraListeners","com.linkedin.datahub.lineage.spark.interceptor.DatahubLineageEmitter")
            .config("spark.datahub.rest.server", "<http://localhost:8080>")
            .enableHiveSupport()
            .getOrCreate())
  • b

    busy-zebra-64439

    01/19/2022, 4:30 AM
    Hi team, I have a question regarding the versions of MySQL and Elasticsearch. From the Docker images we found that DataHub uses MySQL version 5.7 and Elasticsearch version 7.9.3. 1. Could you please confirm whether we can use MySQL version 8 for DataHub? 2. We could not find the versions of all the components used in DataHub in the documentation. Is there a document link available?
  • w

    witty-laptop-49489

    01/19/2022, 2:05 PM
    Hi all, how should I deal with dot (.) in table/column name of Dataset? Do I need to apply some transformation for names or is it possible to show in UI w/o split?
    Copy code
    DESCRIBE TABLE test.test_nested
    
    ┌─name──────────────────┬─type──────────────────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
    │ id                    │ UInt64                    │              │                    │         │                  │                │
    │ some.column           │ String                    │              │                    │         │                  │                │
    │ col_Nested.ID         │ Array(UInt32)             │              │                    │         │                  │                │
    │ col_Nested.Serial     │ Array(UInt32)             │              │                    │         │                  │                │
    │ col_Nested.EventTime  │ Array(Nullable(DateTime)) │              │                    │         │                  │                │
    │ col_Nested.Price      │ Array(Int64)              │              │                    │         │                  │                │
    │ col_Nested.OrderID    │ Array(String)             │              │                    │         │                  │                │
    │ col_Nested.CurrencyID │ Array(Nullable(UInt32))   │              │                    │         │                  │                │
    └───────────────────────┴───────────────────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘
  • f

    few-air-56117

    01/19/2022, 2:55 PM
    Hi guys, I set up an OIDC integration with Azure and ingested the groups using a recipe. The groups are now in DataHub but the users are not mapped to the groups. Thx :D
  • m

    melodic-helmet-78607

    01/20/2022, 10:10 AM
    Hi, how do I enable verbose logging? There is an error in the JAAS/LDAP plugin, and it only says connection error (name resolution?). I cannot reproduce it since it is random.
    Copy code
    08:04:10 [application-akka.actor.default-dispatcher-11] ERROR controllers.AuthenticationController - Authentication error
    javax.naming.AuthenticationException: javax.security.auth.login.FailedLoginException: Cannot connect to LDAP server
  • f

    few-air-56117

    01/20/2022, 1:28 PM
    Hi guys, I tried to ingest some metadata from BigQuery. One week ago the code worked, but now I get this:
    Copy code
    {datahub.ingestion.source.sql.bigquery:224} - Built lineage map containing 0 entries.
     UserWarning: Cannot create BigQuery Storage client, the dependency google-cloud-bigquery-storage is not installed.
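The UserWarning names a missing optional dependency. Whether that relates to the empty lineage map is not established here, but checking whether the module is visible to the interpreter that runs the ingestion is quick. A minimal sketch, where `is_importable` is a hypothetical helper name:

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate module_name (parent packages get imported)."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. `google.cloud`) is itself missing.
        return False


# `google.cloud.bigquery_storage` is the import name of google-cloud-bigquery-storage.
print(is_importable("google.cloud.bigquery_storage"))
```

Run it with the same `python3` that is used for `python3 -m datahub`, since a package installed into a different environment would not be found.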