# troubleshoot

    future-hamburger-62563

    11/04/2021, 11:59 PM
    I came across an issue with Docker on Windows/WSL using `dev.sh`. The script would fail because of an unexpected character, and a Stack Overflow post alluded to a breaking change in Docker Compose V2. I turned this option off in Docker and it works for me now, but I am curious: did anyone else have this issue? Do people still use dev.sh? Or are people using Kubernetes for development?

    adamant-van-40260

    11/06/2021, 5:53 AM
    I got this exception when searching for a dataset, so I guess I have to increase the timeout value or otherwise optimize the query?
    Copy code
    Caused by: java.util.concurrent.TimeoutException: Exceeded request timeout of 10000ms
    	at com.linkedin.r2.transport.http.client.TimeoutTransportCallback$1.run(TimeoutTransportCallback.java:69)

    rough-zoo-50278

    11/08/2021, 6:39 AM
    Good morning! I'm running the DataHub Docker quickstart and trying Postgres ingestion. It fails with a 500 in GMS ("can't ingest entity"). Any hints on how to debug? I assume I can change the log level for ingestion?
    Copy code
    07:21:46.830 [qtp544724190-12] ERROR c.l.m.filter.RestliLoggingFilter - java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    07:21:47.228 [qtp544724190-10] INFO  c.l.m.filter.RestliLoggingFilter - POST /entities?action=ingest - ingest - 500 - 1ms
    Running the pipeline throws this:
    Copy code
    91 more\nCaused by: java.net.URISyntaxException: Invalid URN Parameter: 'No enum constant com.linkedin.common.FabricType.dev: urn:li:dataset:(urn:li:dataPlatform:postgres,dev.public.consumer,dev)\n\tat com.linkedin.common.urn.DatasetUrn.createFromUrn(DatasetUrn.java:55)\n\tat com.linkedin.common.urn.DatasetUrn.createFromString(DatasetUrn.java:38)\n\tat com.linkedin.common.urn.DatasetUrn$1.coerceOutput(DatasetUrn.java:76)\n\t... 94 more\n", 'message': 'java.lang.RuntimeException: java.lang.reflect.InvocationTargetException', 'status': 500}
    EDIT: It was the wrong env (one which doesn't exist).
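    For context, the third part of a dataset URN must be one of the `FabricType` enum constants (e.g. `PROD`, `DEV`, in upper case); the lowercase `dev` in the failing URN is exactly what produces the "No enum constant com.linkedin.common.FabricType.dev" error above. A minimal sketch, assuming the Python `acryl-datahub` client (the `make_dataset_urn` helper just formats the URN; GMS does the validation):
    Copy code
    from datahub.emitter.mce_builder import make_dataset_urn

    # env must match a FabricType constant such as "PROD" or "DEV";
    # the lowercase "dev" in the failing URN above is what GMS rejects.
    urn = make_dataset_urn(platform="postgres", name="dev.public.consumer", env="DEV")
    print(urn)  # urn:li:dataset:(urn:li:dataPlatform:postgres,dev.public.consumer,DEV)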

    kind-psychiatrist-76973

    11/08/2021, 10:22 AM
    Hello, I deployed DataHub using this helm chart: https://github.com/acryldata/datahub-helm. I am getting several errors like this one:
    Copy code
    datahub/datahub-datahub-gms-654d4f8457-467qj[datahub-gms]: 10:13:31.450 [Thread-2114] ERROR c.d.m.graphql.GraphQLController - Errors while executing graphQL query: "query getSearchResultsForMultiple($input: SearchAcrossEntitiesInput!) {\n  searchAcrossEntities(input: $input)
    when accessing pages like `/search?page=1&query=name`. Have you seen this type of error? I am running version `linkedin/datahub-gms:v0.8.16`.

    stocky-guitar-68560

    11/08/2021, 11:24 AM
    Hey everyone, how can I add users to my DataHub running locally? Is there an API for this, or is there any way to add users from the UI?
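    For what it's worth, in this era the frontend's local logins were read from a JAAS `user.props` file in the datahub-frontend container, while user *profiles* can be created through the regular ingestion API. A hedged sketch of the latter with the Python emitter (the URN, name, email, and GMS address are placeholders, and the MCP wrapper shown assumes a recent `acryl-datahub` client):
    Copy code
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    from datahub.emitter.rest_emitter import DatahubRestEmitter
    from datahub.metadata.schema_classes import CorpUserInfoClass

    # Creates the user profile in the catalog; login credentials are separate
    # (user.props for local auth, or SSO).
    emitter = DatahubRestEmitter(gms_server="http://localhost:8080")
    emitter.emit(
        MetadataChangeProposalWrapper(
            entityUrn="urn:li:corpuser:jdoe",  # hypothetical user
            aspect=CorpUserInfoClass(
                active=True, displayName="Jane Doe", email="jdoe@example.com"
            ),
        )
    )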

    lively-jackal-83760

    11/08/2021, 12:10 PM
    Hey guys, I'm getting this error during Kafka ingestion with schema-registry:
    Copy code
    File "/Users/dseredenko/.conda/envs/dataCatalog/lib/python3.9/site-packages/datahub/ingestion/extractor/schema_util.py", line 408, in _to_mce_fields
        yield from self._avro_type_to_mce_converter_map[type(avro_schema)](avro_schema)
    File "/Users/dseredenko/.conda/envs/dataCatalog/lib/python3.9/site-packages/datahub/ingestion/extractor/schema_util.py", line 328, in _gen_nested_schema_from_field
        yield from self._to_mce_fields(sub_schema)
    File "/Users/dseredenko/.conda/envs/dataCatalog/lib/python3.9/site-packages/datahub/ingestion/extractor/schema_util.py", line 408, in _to_mce_fields
        yield from self._avro_type_to_mce_converter_map[type(avro_schema)](avro_schema)
    
    KeyError: <class 'avro.schema.TimestampMillisSchema'>
    It was trying to parse this field from schema-registry:
    Copy code
    {
      "name": "event_ts",
      "type": {
        "type": "long",
        "logicalType": "timestamp-millis"
      },
      "tags": [
        "business-timestamp"
      ]
    }
    Seems like there is no `TimestampMillisSchema` key in the dict `AvroToMceSchemaConverter._avro_type_to_mce_converter_map`.
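    A minimal repro sketch of what the converter sees, assuming a recent `avro` package (the one the traceback shows): parsing a `timestamp-millis` field yields an `avro.schema.TimestampMillisSchema` node, which per the report above has no entry in the converter map:
    Copy code
    import json
    import avro.schema

    schema = avro.schema.parse(json.dumps({
        "type": "record",
        "name": "Event",
        "fields": [{"name": "event_ts",
                    "type": {"type": "long", "logicalType": "timestamp-millis"}}],
    }))
    # The field's type is the logical-type schema class the converter map lacks.
    print(type(schema.fields[0].type))  # <class 'avro.schema.TimestampMillisSchema'>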

    great-oxygen-13926

    11/08/2021, 12:44 PM
    Hey everyone, I am running DataHub v0.8.16, chart version 0.2.24, and tried enabling JMX metrics via the exporter. But the exporter is throwing an error:
    Copy code
    Exception in thread "main" java.lang.NullPointerException
            at io.prometheus.jmx.JmxCollector.loadConfig(JmxCollector.java:164)
            at io.prometheus.jmx.JmxCollector.<init>(JmxCollector.java:78)
            at io.prometheus.jmx.WebServer.main(WebServer.java:30)
    on the jmx-exporter pod. I verified the configmap is correctly configured and I didn't make any changes to the values.yml

    brief-lizard-77958

    11/08/2021, 3:07 PM
    Hey everyone. I'm having trouble ingesting properties: they don't get added even though ingestion succeeds. For example, when ingesting a dashboard with customProperties (posting the JSON I ingest in the thread), the ingestion reports success but I can't find the properties in the UI. I'm ingesting through REST; this exact code worked on previous versions and is taken from the bootstrap_mce.json file in the latest GitHub version.
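    One way to sanity-check the payload outside the JSON file is to build the aspect with the Python SDK, where `customProperties` is a plain string-to-string map on `DashboardInfo`. A hedged sketch (the title, properties, and audit stamp below are made up):
    Copy code
    from datahub.metadata.schema_classes import (
        AuditStampClass, ChangeAuditStampsClass, DashboardInfoClass)

    stamp = AuditStampClass(time=0, actor="urn:li:corpuser:unknown")
    info = DashboardInfoClass(
        title="Example dashboard",  # hypothetical
        description="",
        lastModified=ChangeAuditStampsClass(created=stamp, lastModified=stamp),
        customProperties={"team": "data-platform"},
    )
    It can then be emitted with `DatahubRestEmitter` like any other aspect; if this round-trips but the UI still shows nothing, the properties in the original file likely never reached GMS.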

    kind-psychiatrist-76973

    11/08/2021, 4:27 PM
    While following the debug page https://datahubproject.io/docs/debugging#how-can-i-check-if-search-indices-are-created-in-elasticsearch I found out that `datasetdocument` and `corpuserinfodocument` are not present.

    refined-apple-6340

    11/08/2021, 6:30 PM
    I am trying to configure DataHub to use Keycloak (OIDC) for authentication in a Docker env. I set up my Keycloak and DataHub to map to 0.0.0.0 in /etc/hosts, so this works. I can log in and get the redirect, but it fails at the last step on a CSRF-type check: "State parameter is different from the one sent in authentication request. Session expired or possible threat of cross-site request forgery". Thanks for any help or guidance.

    handsome-belgium-11927

    11/09/2021, 10:50 AM
    Has anybody managed to ingest a GlossaryNode? I get a 400 error if I ingest via curl, and nothing happens if I ingest via Python.

    dazzling-appointment-34954

    11/09/2021, 12:58 PM
    Hey guys, first of all thanks for this nice community and the help! This might be a newbie problem, but I still hope you can give me some support. We just set up a DataHub instance and I am now trying to install the plugins for metadata ingestion as described here: https://datahubproject.io/docs/metadata-ingestion. Sadly I receive the error visible in the screenshot. Can someone point me to a solution? Thanks!

    rapid-sundown-8805

    11/09/2021, 1:57 PM
    Hi all! So with the new version v0.8.16, Azure AD JIT group provisioning is working swimmingly, but by default group names are assigned from the group ID. This is not really user friendly, and I need to change this to the group name. I can do that in the Azure AD ingestion recipe by setting `azure_ad_response_to_groupname_attr` (I think, not tested), but how can I set this for the JIT provisioning, i.e. when someone logs into the front end? (see screenshot) I think I want it set to `displayName`.

    quick-pizza-8906

    11/09/2021, 2:05 PM
    Hello, I played around with the fairly new glossary terms feature and found some problems; am I doing something wrong?
    1. On the demo page I opened the MySQL dataset `User.UserAccount` and assigned the `Sensitive` term to the field `user_id`. Then I opened the `Sensitive` glossary term details, and its related entities showed 3 datasets but not MySQL `User.UserAccount`. Why is that? Shouldn't I be able to see this association from that subpage? If not, how can I associate a field with a glossary term so that the association is visible both in the dataset details and in the glossary term details?
    2. In a local Docker deployment I associated a term with a field and then deleted the term via the `datahub delete ...` command. While the term was gone from the terms UI, it was still visible in the dataset details. Is there any procedure to trigger removal there as well?

    red-pizza-28006

    11/09/2021, 3:18 PM
    Are there any other permissions needed to ingest users from Azure AD? I am having trouble ingesting both users and groups.

    nutritious-bird-77396

    11/09/2021, 7:24 PM
    When trying to build DataHub locally, I am facing build errors due to test failures:
    Copy code
    =========================== short test summary info ============================
    FAILED tests/unit/test_glue_source.py::test_get_column_type_contains_key - bo...
    FAILED tests/unit/test_glue_source.py::test_get_column_type_contains_array - ...
    FAILED tests/unit/test_glue_source.py::test_get_column_type_contains_map - bo...
    FAILED tests/unit/test_glue_source.py::test_get_column_type_contains_set - bo...
    FAILED tests/unit/test_glue_source.py::test_get_column_type_not_contained - b...
    FAILED tests/unit/test_glue_source.py::test_glue_ingest - botocore.exceptions...
    FAILED tests/unit/test_glue_source.py::test_underlying_platform_takes_precendence
    FAILED tests/unit/test_glue_source.py::test_underlying_platform_cannot_be_other_than_athena
    FAILED tests/unit/test_glue_source.py::test_without_underlying_platform - bot...
    FAILED tests/unit/sagemaker/test_sagemaker_source.py::test_sagemaker_ingest
    ========== 10 failed, 154 passed, 18 deselected, 2 warnings in 34.06s ==========
    
    > Task :metadata-ingestion:testQuick FAILED
    Any idea on what I am missing? More details in 🧵
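    One hedged guess while the details live in the thread: the glue and sagemaker unit tests construct boto3 clients, and truncated `botocore.exceptions...` failures are often a missing default region or credentials in the local environment. A sketch of that workaround, not confirmed by this thread:
    Copy code
    import os

    # Assumption: give botocore a region and dummy credentials so client
    # construction in the mocked glue/sagemaker tests doesn't fail locally.
    os.environ.setdefault("AWS_DEFAULT_REGION", "us-east-1")
    os.environ.setdefault("AWS_ACCESS_KEY_ID", "testing")
    os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "testing")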

    gifted-continent-35826

    11/09/2021, 8:00 PM
    Can someone point me in the right direction regarding creating users in quickstart? For some reason, I am unable to create users, and can't find any info in the documentation....

    red-pizza-28006

    11/10/2021, 12:37 PM
    I tried deleting datasets in DataHub using
    Copy code
    datahub delete --env DEV
    but I got this error
    Copy code
    HTTPError: 404 Client Error: Not Found for url: https://<datahub-endpoint>//entities?action=search
    Could this be because I have SSO turned on?

    loud-camera-71352

    11/10/2021, 5:11 PM
    Hi guys! Can we indent SQL queries (Redshift usage) as on the demo site?

    enough-zoo-71516

    11/10/2021, 5:41 PM
    Has anyone used Preset to do a Superset recipe? https://datahubproject.io/docs/metadata-ingestion/source_docs/superset/ I just created a Preset (https://preset.io/, managed Superset) account today and was wondering how to figure out the `connect_uri`. Just wondering if anyone else has already got DataHub + Preset running?

    mysterious-park-53124

    11/11/2021, 4:30 AM
    Copy code
    Source (kafka) report:
    {'failures': {},
     'filtered': ['__transaction_state',
                  '_confluent-telemetry-metrics',
                  '_confluent_balancer_partition_samples',
                  '_confluent-metrics',
                  '_confluent_balancer_api_state',
                  '__consumer_offsets',
                  '_confluent_balancer_broker_samples',
                  '_confluent-license'],
     'topics_scanned': 21,
     'warnings': {'topic1': ['failed to get value schema: Subject not found. (HTTP status code 404, SR code 40401)'],
                  'topic2': ['failed to get value schema: Subject not found. (HTTP status code 404, SR code 40401)'],
                  'topic3': ['failed to get value schema: Subject not found. (HTTP status code 404, SR code 40401)']},
     'workunit_ids': ['kafka-topic1',
                      'kafka-topic2',
                      'kafka-topic3',
                      ],
     'workunits_produced': 13}
    Sink (datahub-kafka) report:
    {'downstream_end_time': None,
     'downstream_start_time': None,
     'downstream_total_latency_in_seconds': None,
     'failures': [],
     'records_written': 13,
     'warnings': []}
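    The "Subject not found (40401)" warnings above usually just mean the registry has no schema registered for those topics: Confluent Schema Registry stores value schemas under the subject `<topic>-value` (and key schemas under `<topic>-key`). A quick hedged check, assuming a locally reachable registry:
    Copy code
    import requests

    # Lists all registered subjects; a topic with no "<topic>-value" entry
    # here will produce the "Subject not found" warning during ingestion.
    subjects = requests.get("http://localhost:8081/subjects").json()
    print("topic1-value" in subjects)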

    bumpy-activity-74405

    11/11/2021, 9:19 AM
    Hey, so I've tried looking at some frontend logs in the `datahub-frontend-react` container and it is full of
    Copy code
    [application-akka.actor.default-dispatcher-23] WARN  akka.actor.ActorSystemImpl - Explicitly set HTTP header 'Content-Length: 641' is ignored, explicit `Content-Length` header is not allowed. Use the appropriate HttpEntity subtype.
    warnings. Is this a known issue? Running quickstart `v0.8.16`.

    square-activity-64562

    11/11/2021, 11:46 AM
    Not sure where this is in GitHub: broken logos on the docs home page https://datahubproject.io/docs/. It is fetching https://datahubproject.io/docs-website/static/img/acryl-logo-light-mark.png instead of https://datahubproject.io/img/acryl-logo-light-mark.png, which actually works.

    broad-crowd-13788

    11/11/2021, 5:19 PM
    What are the implications of making changes to CorpUserKey, which is the key aspect of the CorpUser entity? I basically want to add another field, i.e. 'Employee number', to the key aspect. I know that adding another field to the key aspect will change its URN and the LDAP source for ingestion, but does it affect any other functionality?

    breezy-florist-18916

    11/12/2021, 5:39 AM
    Hi all, I tried to use `datahub docker quickstart --quickstart-compose-file ./docker/quickstart/docker-compose-without-neo4j-m1.quickstart.yml` to install DataHub on a Mac M1 and hit this issue: "Unable to run quickstart - the following issues were detected: - elasticsearch-setup is still running - mysql-setup is still running". The Docker log shows: "qemu: uncaught target signal 11 (Segmentation fault) - core dumped". Any ideas? Thanks!

    handsome-belgium-11927

    11/12/2021, 8:11 AM
    Hi all. I've just pulled the latest version and I can't ingest datasets; I get the error `Caused by: java.net.URISyntaxException: Invalid URN Parameter: 'No enum constant com.linkedin.common.FabricType.PROD)` (full in thread). Has something changed about the ingestion process, or is this a bug?

    nice-planet-17111

    11/12/2021, 10:43 AM
    Hi, I just found an interesting error(?) regarding deleting entities:
    1. When I delete entities via `datahub delete --urn <some_urn>`, the path/button to the entity also gets deleted.
    2. However, when I delete entities using `DELETE FROM ~` on the DataHub database (metadata_aspect_v2), the path/button to the entity does not get deleted, although there's no related data in the DB.
    3. After that, even if I try `datahub delete --urn <some_urn>` on the same entity, the path/button still does not get deleted. (At this point, is it somehow impossible to hide these paths/buttons?)
    Does anyone know about this issue or how to solve this problem? 🙂
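    For what it's worth, the entity path/button is served from the Elasticsearch search index, not from MySQL, which would explain points 2 and 3: a raw `DELETE FROM metadata_aspect_v2` leaves the index document behind, while `datahub delete` cleans both stores. A hedged way to confirm, assuming a quickstart Elasticsearch on port 9200 and the v2 dataset index name:
    Copy code
    import requests

    # If a document for the deleted URN still matches here, the stale search
    # entry (and hence the path/button) is coming from Elasticsearch.
    resp = requests.get(
        "http://localhost:9200/datasetindex_v2/_search",
        json={"query": {"term": {"urn": "urn:li:dataset:(...)"}}},  # placeholder URN
    )
    print(resp.json()["hits"]["total"])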

    thousands-intern-95970

    11/12/2021, 2:19 PM
    Hello everyone! Being a newbie with DataHub, I need a little help troubleshooting an error. While ingesting metadata from the source: file [JSON] to the sink: datahub-rest on a local server, with the recipe written as per the guide, I get the error: "ValueError: com.linkedin.pegasus2avro.usage.UsageAggregation is missing required field: bucket". Can anyone help me understand the error and how to solve it? Thank you in advance :)
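    The error means every `UsageAggregation` record in the input JSON needs a `bucket` field, i.e. the start of the usage time window in epoch milliseconds. A hedged sketch of a record that satisfies the schema, using the Python classes the file source deserializes into (all field values below are made up):
    Copy code
    from datahub.metadata.schema_classes import (
        UsageAggregationClass, UsageAggregationMetricsClass, WindowDurationClass)

    usage = UsageAggregationClass(
        bucket=1636675200000,  # window start in epoch millis -- the missing field
        duration=WindowDurationClass.DAY,
        resource="urn:li:dataset:(urn:li:dataPlatform:postgres,public.orders,PROD)",
        metrics=UsageAggregationMetricsClass(totalSqlQueries=10),
    )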

    plain-farmer-27314

    11/12/2021, 6:38 PM
    Hey all, trying to set up a local dev env, specifically for metadata-ingestion. I'm getting the below error when trying to run the gradlew setup command:
    Copy code
    > Task :metadata-ingestion:environmentSetup FAILED
    Error: [Errno 13] Permission denied: '-/Dev/datahub-explore/datahub/metadata-ingestion/venv/bin/activate.fish'

    plain-farmer-27314

    11/12/2021, 9:36 PM
    I'm still trying to get tests to run for an update I've made to the BigQuery usage ingestion plugin. When I run:
    Copy code
    pytest -m 'not integration' -vv
    I'm seeing:
    Copy code
    rootdir: /Users/-/Dev/datahub-explore/datahub/metadata-ingestion, configfile: setup.cfg, testpaths: tests/unit, tests/integration
    And this is resulting in lots of test failures, with this error:
    Copy code
    ImportError while importing test module '/Users/-/Dev/datahub-explore/datahub/metadata-ingestion/tests/integration/trino/test_trino.py'.
    Hint: make sure your test modules/packages have valid Python names.