# troubleshoot

    better-spoon-77762

    12/23/2021, 8:52 AM
Hi All, I started getting this test failure, which causes the build to fail.

    able-yacht-17327

    12/26/2021, 7:31 AM
Hello Team, I want to configure JAAS authentication in my DataHub. I only have one .yml file (with source and sink info) on my server under the datahub folder. Please guide @witty-plumber-82249

    future-petabyte-5942

    12/27/2021, 6:30 AM
Hi Team, how can we access the MySQL storage layer that was deployed along with DataHub on AWS?
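A minimal sketch of what querying that storage layer can look like once the MySQL endpoint is reachable (for example after a kubectl port-forward or SSH tunnel, depending on the AWS setup). Host, credentials, and database name below are assumptions modeled on the quickstart defaults:
import pymysql  # pip install pymysql

# Assumption: quickstart-style defaults; substitute your actual endpoint/credentials.
conn = pymysql.connect(
    host="localhost",   # tunnel/port-forward target or the RDS endpoint
    port=3306,
    user="datahub",
    password="datahub",
    database="datahub",
)
try:
    with conn.cursor() as cur:
        # metadata_aspect_v2 is the table DataHub persists entity aspects into.
        cur.execute("SELECT urn, aspect, version FROM metadata_aspect_v2 LIMIT 5")
        for urn, aspect, version in cur.fetchall():
            print(urn, aspect, version)
finally:
    conn.close()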

    curved-magazine-23582

    12/27/2021, 7:57 PM
Hello, I've set up the Glue ingestion plugin, and it errors out as shown below. Is it because it can't handle the marketplace.jdbc connection type? 🤔
    File "/usr/local/lib/python3.7/site-packages/datahub/entrypoints.py", line 93, in main
        sys.exit(datahub(standalone_mode=False, **kwargs))
    File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
        return self.main(*args, **kwargs)
    File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1053, in main
        rv = self.invoke(ctx)
    File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
    File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
    File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
        return ctx.invoke(self.callback, **ctx.params)
    File "/usr/local/lib/python3.7/site-packages/click/core.py", line 754, in invoke
        return __callback(*args, **kwargs)
    File "/usr/local/lib/python3.7/site-packages/datahub/cli/ingest_cli.py", line 73, in run
        pipeline.run()
    File "/usr/local/lib/python3.7/site-packages/datahub/ingestion/run/pipeline.py", line 149, in run
        self.source.get_workunits(), 10 if self.preview_mode else None
    File "/usr/local/lib/python3.7/site-packages/datahub/ingestion/source/aws/glue.py", line 521, in get_workunits
        dag, flow_urn, s3_formats
    File "/usr/local/lib/python3.7/site-packages/datahub/ingestion/source/aws/glue.py", line 309, in process_dataflow_graph
        node, flow_urn, new_dataset_ids, new_dataset_mces, s3_formats
    File "/usr/local/lib/python3.7/site-packages/datahub/ingestion/source/aws/glue.py", line 265, in process_dataflow_node
        raise ValueError(f"Unrecognized Glue data object type: {node_args}")
    
    ValueError: Unrecognized Glue data object type: {'connection_type': 'marketplace.jdbc', 'connection_options': {'dbTable': 'contact', 'connectionName': 'prod-salesforce', 'filterPredicate': 'SystemModstamp'}, 'transformation_ctx': 'DataSource0'}

    busy-dusk-4970

    12/27/2021, 10:50 PM
Does anyone have Keycloak as an OIDC auth provider and care to share their settings? I can log in via Postman, but DataHub keeps sending me back to http://localhost:9002/authenticate?redirect_uri=%2F. Attached are screenshots of the error and my Keycloak settings. My datahub-frontend-react env vars:
- DATAHUB_GMS_HOST=datahub-gms
- DATAHUB_GMS_PORT=8090
- DATAHUB_SECRET=YouKnowNothing
- DATAHUB_APP_VERSION=1.0
- DATAHUB_PLAY_MEM_BUFFER_SIZE=10MB
- JAVA_OPTS=-Xms512m -Xmx512m -Dhttp.port=9002 -Dconfig.file=datahub-frontend/conf/application.conf
  -Djava.security.auth.login.config=datahub-frontend/conf/jaas.conf
  -Dlogback.configurationFile=datahub-frontend/conf/logback.xml
  -Dlogback.debug=false -Dpidfile.path=/dev/null
- KAFKA_BOOTSTRAP_SERVER=broker:29092
- DATAHUB_TRACKING_TOPIC=DataHubUsageEvent_v1
- ELASTIC_CLIENT_HOST=opensearch
- ELASTIC_CLIENT_PORT=9200
- KEYCLOAK_CLIENT_SECRET=e96ef414-de3e-460c-bd91-0866debfd8eb
- AUTH_OIDC_ENABLED=true
- AUTH_OIDC_CLIENT_ID=data-fabric-confidential
- AUTH_OIDC_CLIENT_SECRET=e96ef414-de3e-460c-bd91-0866debfd8eb
- AUTH_OIDC_DISCOVERY_URI=http://localhost:8080/auth/realms/data-fabric-realm/.well-known/openid-configuration
- AUTH_OIDC_BASE_URL=http://localhost:9002
- AUTH_OIDC_SCOPE="openid profile email groups"
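One thing worth checking here (an assumption about the cause, not a confirmed diagnosis): inside the datahub-frontend container, the localhost:8080 in AUTH_OIDC_DISCOVERY_URI points at the container itself rather than at Keycloak. A quick probe, run from wherever datahub-frontend runs, with a hypothetical keycloak hostname:
import requests  # pip install requests

# Hypothetical hostname: replace "keycloak" with a name the frontend
# container can actually resolve for your Keycloak instance.
disc = "http://keycloak:8080/auth/realms/data-fabric-realm/.well-known/openid-configuration"
resp = requests.get(disc, timeout=5)
resp.raise_for_status()
doc = resp.json()
# The frontend needs these endpoints to complete the OIDC redirect flow.
print(doc["authorization_endpoint"])
print(doc["token_endpoint"])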

    better-spoon-77762

    12/29/2021, 2:04 AM
Hello Team, I am running DataHub using docker-compose. I see that ingested data goes into both Neo4j and Elasticsearch. Is there a way to ensure data is stored in only one of Neo4j or ES?

    ambitious-guitar-89068

    12/29/2021, 11:32 AM
QQ: Is there a way to skip a dashboard in Metabase while ingesting (I'm aware of the regex option)? I'm asking more about a skip-on-error for a specific dashboard's ingestion…

    magnificent-park-81142

    12/30/2021, 6:56 PM
Could you help me build metadata-ingestion to work on my local machine?

    astonishing-lamp-98820

    12/30/2021, 8:37 PM
Hi, I want to extend the schema of a dataset with new metadata, so that the UI shows "field", "description", "tags", "terms", and e.g. "customColumn". I defined a new aspect on the dataset entity like the metadata-model-custom module does, but I realised that this is not what I want, because I couldn't find a way to use metadata-ingestion with this new aspect. Is there another way to add such a column to the "schema" view?
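If plain key/value metadata per dataset would do (this is dataset-level, not a true per-field column, so it only partially matches the ask), one route the stock tooling already understands is custom properties emitted over REST. A minimal sketch with the Python emitter; the URN and property name are placeholders:
from datahub.emitter.mce_builder import make_dataset_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import ChangeTypeClass, DatasetPropertiesClass

emitter = DatahubRestEmitter("http://localhost:8080")
mcp = MetadataChangeProposalWrapper(
    entityType="dataset",
    changeType=ChangeTypeClass.UPSERT,
    entityUrn=make_dataset_urn("hive", "SampleHiveDataset", "PROD"),  # placeholder
    aspectName="datasetProperties",
    # Note: upserting datasetProperties replaces the whole aspect, description included.
    aspect=DatasetPropertiesClass(customProperties={"customColumn": "example-value"}),
)
emitter.emit_mcp(mcp)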

    magnificent-park-81142

    01/01/2022, 5:23 PM
I am able to change the upstream for a dataset (urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)) using lineage_emitter_mcpw_rest.py. Could you help me set the downstream for the same dataset (urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD))?
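A sketch of the usual pattern (mirroring the lineage_emitter_mcpw_rest.py example): the "downstream" direction isn't written directly; instead you emit an upstreamLineage aspect on the downstream dataset that points back at SampleHiveDataset. The downstream dataset name below is hypothetical:
from datahub.emitter.mce_builder import make_dataset_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import (
    ChangeTypeClass,
    DatasetLineageTypeClass,
    UpstreamClass,
    UpstreamLineageClass,
)

# SampleHiveDataset becomes the upstream of a (hypothetical) downstream dataset.
upstream = UpstreamClass(
    dataset=make_dataset_urn("hive", "SampleHiveDataset", "PROD"),
    type=DatasetLineageTypeClass.TRANSFORMED,
)
mcp = MetadataChangeProposalWrapper(
    entityType="dataset",
    changeType=ChangeTypeClass.UPSERT,
    entityUrn=make_dataset_urn("hive", "SampleHiveDownstream", "PROD"),  # hypothetical
    aspectName="upstreamLineage",
    aspect=UpstreamLineageClass(upstreams=[upstream]),
)
DatahubRestEmitter("http://localhost:8080").emit_mcp(mcp)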

    rich-policeman-92383

    01/03/2022, 1:47 PM
Hello, how can I enable debug logs for GMS and the MAE/MCE consumers? Reason: I am trying to communicate with a kerberised Kafka following the docs, but I'm getting the error below (and similar errors for other topics). These topics were created using the kafka-setup container, but the GMS, MAE, and MCE consumers are failing to read them.
    datahub_datahub-mae-consumer.1.ia85tvruu4y4@N4PBIL-DARD0887    | 13:29:30.498 [kafka-kerberos-refresh-thread-my_user@INDIA.TTT] WARN  o.a.k.c.s.kerberos.KerberosLogin - [Principal=my_user@INDIA.TTT]: TGT renewal thread has been interrupted and will exit.
    datahub_datahub-mae-consumer.1.ia85tvruu4y4@N4PBIL-DARD0887    | 13:29:30.499 [main] WARN  o.s.b.w.s.c.AnnotationConfigServletWebServerApplicationContext - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.context.ApplicationContextException: Failed to start bean 'org.springframework.kafka.config.internalKafkaListenerEndpointRegistry'; nested exception is org.apache.kafka.common.errors.TopicAuthorizationException: Not authorized to access topics: [DataHubUsageEvent_v1]
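Not an answer to the debug-log question itself, but a way to narrow down the TopicAuthorizationException: consume the failing topic with the same SASL/GSSAPI settings from outside the DataHub containers. Broker address, keytab path, and principal below are assumptions to adapt:
from confluent_kafka import Consumer  # pip install confluent-kafka

consumer = Consumer({
    "bootstrap.servers": "broker:9092",                      # assumption
    "group.id": "datahub-debug-probe",
    "security.protocol": "SASL_PLAINTEXT",                   # or SASL_SSL, match your cluster
    "sasl.mechanisms": "GSSAPI",
    "sasl.kerberos.service.name": "kafka",
    "sasl.kerberos.keytab": "/etc/security/my_user.keytab",  # assumption
    "sasl.kerberos.principal": "my_user@INDIA.TTT",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["DataHubUsageEvent_v1"])
msg = consumer.poll(10.0)
# An authorization error here as well would point at missing topic/group ACLs
# for the principal, rather than at DataHub's own configuration.
print(msg.error() if msg is not None and msg.error() else msg)
consumer.close()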

    worried-terabyte-81786

    01/03/2022, 7:12 PM
    Hello everyone! I tried to ingest a few glossary terms using the examples from the documentation. However, using the example as is, I got the following error messages:

    worried-terabyte-81786

    01/03/2022, 7:15 PM
    ValidationError: 2 validation errors for BusinessGlossaryConfig
    url
     none is not an allowed value (type=type_error.none.not_allowed)
    nodes -> 0 -> terms -> 1 -> custom_properties
     extra fields not permitted (type=value_error.extra)
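Reading the pydantic output: the file is missing a required top-level url value, and the second term under the first node carries a custom_properties key that schema version doesn't allow. A sketch for validating the file in isolation; the import path is an assumption about where the config class lives and may differ between versions:
import yaml  # pip install pyyaml

# Assumed class location within metadata-ingestion; adjust to your version.
from datahub.ingestion.source.metadata.business_glossary import BusinessGlossaryConfig

with open("business_glossary.yml") as f:
    raw = yaml.safe_load(f)

# pydantic reports every offending field path at once, as in the error above.
glossary = BusinessGlossaryConfig.parse_obj(raw)
print(f"parsed {len(glossary.nodes)} top-level nodes")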

    hallowed-breakfast-61855

    01/04/2022, 3:20 AM
Hello, please help me with the following question: is ADFS SSO supported for DataHub login?

    ambitious-guitar-89068

    01/04/2022, 5:27 AM
While ingesting from Metabase via Kafka, I get the error below. I could go about tampering with the Kafka message size on client and server, but what's more sensible?
    KafkaException: KafkaError{code=MSG_SIZE_TOO_LARGE,val=10,str="Unable to produce message: Broker: Message size too large"}
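For reference, a sketch of the client-side half of that trade-off (assuming the confluent-kafka client, which that error string comes from). Since the broker is the one rejecting the record, the broker's message.max.bytes (and replica.fetch.max.bytes) would need raising too; the client key below only lifts the local limit to match:
from confluent_kafka import Producer  # pip install confluent-kafka

producer = Producer({
    "bootstrap.servers": "broker:29092",   # assumption
    "message.max.bytes": 5 * 1024 * 1024,  # example: 5 MiB; mirror on the broker side
})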

    gentle-florist-49869

    01/04/2022, 2:11 PM
Hello people, happy new year! I'd like to know if someone has experienced this issue and knows a solution. In my lab, I created the DataHub stack via datahub docker quickstart --quickstart-compose-file ./docker/quickstart/docker-compose.quickstart.yml and then tried to bring up the consumer jobs (MAE/MCE) from https://github.com/linkedin/datahub/blob/master/docker/docker-compose.consumers.yml to see Kafka Streams metrics (example: http://localhost:9091/actuator/metrics), but the MAE container failed to start with this error:
Description: Field systemAuthentication in com.linkedin.metadata.kafka.config.EntityHydratorConfig required a bean of type 'com.datahub.authentication.Authentication' that could not be found. The injection point has the following annotations: @org.springframework.beans.factory.annotation.Autowired(required=true), @org.springframework.beans.factory.annotation.Qualifier(value=systemAuthentication)
Action: Consider defining a bean of type 'com.datahub.authentication.Authentication' in your configuration.

    millions-notebook-72121

    01/04/2022, 3:26 PM
Hi All - I am in the process of redeploying DataHub to the newer version after the log4j storm. It was working perfectly fine before the updates, but now I'm having some issues. In particular, I am installing on Kubernetes following this guide: https://datahubproject.io/docs/deploy/kubernetes/ - I created the secrets as shown below. When I install the prerequisites chart, the mysql pod starts but never becomes ready. If I do a kubectl describe on the pod I get:
    Readiness probe failed: mysqladmin: [Warning] Using a password on the command line interface can be insecure.
    mysqladmin: connect to server at 'localhost' failed
    error: 'Access denied for user 'root'@'localhost' (using password: YES)'
Has anyone seen this before? I'm checking the Kubernetes cluster to see if anything has changed there, but it seems the same as before. Attaching some screenshots for reference. If I do a port-forward for the DB port and try accessing the DB with a client, I get the same error.

    gentle-nest-904

    01/05/2022, 9:47 AM
Can anyone confirm/validate this - or did I do something wrong?

    gentle-nest-904

    01/05/2022, 9:55 AM
Here is the input for the file config:
# A sample recipe that pulls JSON from file and puts it into DataHub
source:
  type: file
  config:
    # Coordinates
    filename: ./DataHub/JsonInputs/BCS_Persoon.json
sink:
  type: "datahub-rest"
  config:
    server: http://localhost:8080
And I attached the error response here as well.
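For comparison, a minimal sketch of that same recipe driven through the ingestion framework's Python Pipeline API, which tends to surface config and parse errors with a fuller stack trace than the CLI:
from datahub.ingestion.run.pipeline import Pipeline

pipeline = Pipeline.create({
    "source": {
        "type": "file",
        "config": {"filename": "./DataHub/JsonInputs/BCS_Persoon.json"},
    },
    "sink": {
        "type": "datahub-rest",
        "config": {"server": "http://localhost:8080"},
    },
})
pipeline.run()
pipeline.raise_from_status()  # re-raises the first failure, if any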

    busy-dusk-4970

    01/05/2022, 6:17 PM
I'm trying to do local development on DataHub for some customizations, and I get this error when trying to run docker/dev.sh:
    for docker_datahub-frontend-react_1  Cannot start service datahub-frontend-react: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "datahub-frontend/bin/playBinary": stat datahub-frontend/bin/playBinary: no such file or directory: unknown
    Any ideas how I can fix this? Tnx 🙏

    lemon-hydrogen-83671

    01/05/2022, 6:45 PM
Whoops, just realized there's a troubleshoot channel 😬 https://datahubspace.slack.com/archives/CV2KB471C/p1641401560096300

    nutritious-bird-77396

    01/05/2022, 8:43 PM
Hi All.... I am trying to set the Kafka producer/consumer configs in the GMS application. I would like to override these configs:
• ssl.truststore.location
• security.protocol
• sasl.mechanism
• sasl.jaas.config
• sasl.client.callback.handler.class
Setting these properties (e.g. KAFKA_PROPERTIES_SSL_TRUSTSTORE_LOCATION) did not work..... Any help on this would be great....

    some-crayon-90964

    01/06/2022, 6:00 PM
Hey guys, we pulled the latest version, but we got the issue below when running ./gradlew build. However, when we ran this specific task (./gradlew :metadata-integration:java:spark-lineage:compileJava), it was always successful. Any idea how to fix that? Thanks.

    many-pilot-7340

    01/06/2022, 8:10 PM
Seeing SQLAlchemy errors, and Airflow containers (scheduler, worker, and eventually postgres) restarting, when using DataHub with Airflow following the steps at https://datahubproject.io/docs/docker/airflow/local_airflow/

    nutritious-bird-77396

    01/06/2022, 9:20 PM
I was having issues connecting to AWS MSK using SPRING_KAFKA_PROPERTIES_SASL_MECHANISM=AWS_MSK_IAM. I raised the issue in the project: https://github.com/aws/aws-msk-iam-auth/issues/50. If anyone else has seen this issue, or has any suggestions, do let me know...

    magnificent-park-81142

    01/07/2022, 3:58 PM
Hi Team, I tried to do a POST request with an Authorization token (generated from the DataHub settings page) to my localhost, but a 401 Unauthorized error occurs in Postman. Could you help with this? The same request works fine from the Chrome browser using the GraphQL extension.
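A sketch of the request shape that typically works with tokens from the settings page (the usual Postman culprits are a missing "Bearer " prefix or posting to GMS directly instead of the frontend endpoint; both are assumptions about this case):
import requests  # pip install requests

resp = requests.post(
    "http://localhost:9002/api/graphql",  # datahub-frontend endpoint; adjust host/port
    headers={"Authorization": "Bearer <token-from-settings-page>"},
    json={"query": "{ me { corpUser { username } } }"},
)
print(resp.status_code, resp.text[:200])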

    lemon-hydrogen-83671

    01/07/2022, 6:47 PM
Hey folks, has anyone run into issues running the React front-end? Pretty much any request made to /api/*path returns:
    HTTP/2 400 Bad Request: HTTP message contains more than the configured limit of 64 headers

    billions-tent-29367

    01/11/2022, 5:41 PM
Hello! I have been looking for a way to fetch all entities of a particular type. I found https://feature-requests.datahubproject.io/b/Developer-Experience/p/graphql-api-list-entities which is exactly what I want. But I also found that a search like this seems to work:
    curl --location --request POST 'gms:8080/entities?action=search' \
    --header 'Content-Type: text/plain' \
    --data-raw '{
        "input": "**",
        "entity": "Application",
        "start": 0,
        "count": 500
    }'
So what does ** actually mean? (Application is a custom entity.)
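On what ** means: the input is passed through to the search index as a query string, where * acts as a wildcard, so ** behaves as a match-everything query (an inference from the search behavior, not documented semantics). The same call from Python:
import requests  # pip install requests

resp = requests.post(
    "http://gms:8080/entities?action=search",
    json={"input": "**", "entity": "Application", "start": 0, "count": 500},
)
resp.raise_for_status()
result = resp.json()["value"]  # response field names may vary by version
print(result["numEntities"])
for match in result["entities"]:
    print(match["entity"])     # URNs of the matched entities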

    billions-tent-29367

    01/11/2022, 5:49 PM
Second question: previously, custom models had to begin with the com.linkedin namespace, otherwise the build did not pick them up properly. Every person I've demonstrated our models to has asked something like "Why do the models say com.linkedin? Can you remove that?" So... is it possible yet?

    quick-pizza-8906

    01/11/2022, 7:33 PM
Hello, I have a problem with deleting a dataset. When I run the CLI I see:
    ❯ datahub delete --hard --urn 'urn-here'
    This will permanently delete data from DataHub. Do you want to continue? [y/N]: y
    [2022-01-10 15:20:27,608] INFO {datahub.cli.delete_cli:126} - DataHub configured with <http://localhost:9999>
    Nothing deleted for urn-here
    Took 1.898 seconds to hard delete 0 rows for 1 entities
where urn-here is the actual URN. After this operation I still see the dataset in the UI, and running the reindexing job does not help. Note the "0 rows deleted for 1 entities" info. What can I do?
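One possible reading of "0 rows for 1 entities" is that the URN string doesn't exactly match what's stored (URNs are case-sensitive). A sanity-check sketch against GMS before retrying the delete; the URN below is a placeholder and must be URL-encoded:
import urllib.parse
import requests  # pip install requests

urn = "urn:li:dataset:(urn:li:dataPlatform:hive,SampleHiveDataset,PROD)"  # placeholder
encoded = urllib.parse.quote(urn, safe="")
resp = requests.get(f"http://localhost:9999/entities/{encoded}")
# A 404 or an empty aspect map here would mean GMS doesn't know this exact URN,
# e.g. a casing or platform mismatch with what the UI displays.
print(resp.status_code)
print(resp.text[:500])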