# troubleshoot
  • d

    dazzling-appointment-34954

    02/02/2022, 6:09 PM
    Hi all, I ingested some datasets through file ingestion and played around with deleting them afterwards through the CLI. Now I tried to ingest the same data again, but I see some strange behaviour:
    • Ingestion runs without errors and reports that all records were inserted.
    • In the UI no new records appear (e.g. under the platform), and I also cannot see them via search or a CLI query.
    • But: when I enter the direct URN of one of the entries in the URL, I see the dataset, with the correct browse path and information.
    I suspect this might be related to an indexing error? Does anyone have an idea what to do / what is happening? Thanks in advance!
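The symptom above (entity reachable by direct URN but absent from search and browse) is consistent with soft deletion: `datahub delete` without `--hard` only marks the entity as removed, search filters on that flag, and re-ingesting the same records does not necessarily clear it. A toy sketch of that behavior (illustrative Python only, not DataHub code):

```python
# Toy model of soft deletion: the entity row survives with removed=True,
# search filters on that flag, but direct URN lookup does not.
entities = {
    "urn:li:dataset:(urn:li:dataPlatform:file,my_table,PROD)": {
        "name": "my_table",
        "removed": True,  # left behind by a soft `datahub delete`
    },
}

def search(query: str):
    """Search hides soft-deleted entities."""
    return [u for u, e in entities.items() if query in e["name"] and not e["removed"]]

def get_by_urn(urn: str):
    """Direct URN resolution ignores the removed flag."""
    return entities.get(urn)

assert search("my_table") == []                    # invisible in search/UI listings
assert get_by_urn(next(iter(entities))) is not None  # but resolvable by direct URN
```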
  • l

    late-bear-87552

    02/02/2022, 6:11 PM
    ```yaml
    source:
      type: "bigquery"
      config:
        ## Coordinates
        project_id: adf-adfa-240416
        credential:
          project_id: adf-adfa-240416
          private_key_id: ""
          private_key: "-----BEGIN PRIVATE KEY"
          client_email: ""
          client_id: ""
        table_pattern:
          deny:
          - 
    sink:
      type: "datahub-rest"
      config:
        server: "http://localhost:8080"
    ```
    I want to deny tables that start with temp_. Can anyone help me with the YAML file?
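On the deny question above: `table_pattern.deny` entries are regular expressions. Assuming this source version matches them against the qualified `dataset.table` name (an assumption; adjust to what your source actually matches), a sketch:

```yaml
source:
  type: "bigquery"
  config:
    project_id: adf-adfa-240416
    table_pattern:
      deny:
        # regex: any table whose bare name starts with temp_, in any dataset
        - ".*\\.temp_.*"
        # or, if patterns are matched against the bare table name only:
        - "^temp_.*"
sink:
  type: "datahub-rest"
  config:
    server: "http://localhost:8080"
```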
  • h

    handsome-football-66174

    02/02/2022, 10:35 PM
    Hi, facing this issue: we are using SSO and the application is deployed on an EKS cluster.
    ```
    215705 [application-akka.actor.default-dispatcher-425728] ERROR application - ! @7mh86m29f - Internal server error, for (GET) [/callback/oidc?code=aqWFN5bo1D6VoA-AUd03cBOIiZedSv7RnDQAAABw&state=NKO1aayjWv6I7oW4wv1dUrMdd2DA7idRPyccUkBbjGA] ->
    play.api.UnexpectedException: Unexpected exception[CompletionException: org.pac4j.core.exception.TechnicalException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target]
            at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:247)
            at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:176)
            at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:363)
            at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:361)
            at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:346)
            at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:345)
    ```
  • c

    calm-river-44367

    02/03/2022, 1:59 PM
    hey, I have been trying to add new users to DataHub using a user.props file. What I'm doing: I add the name and password of my custom user to user.props exactly the way the datahub:datahub user is written there. Then, for the datahub-frontend-react service in the docker-compose file, I add:
    ```yaml
    build:
      context: ../
      dockerfile: docker/datahub-frontend/Dockerfile
    image: linkedin/datahub-frontend-react:${DATAHUB_VERSION:-head}
    env_file: datahub-frontend/env/docker.env
    hostname: datahub-frontend-react
    container_name: datahub-frontend-react
    ports:
      - "9002:9002"
    depends_on:
      - datahub-gms
    volumes:
      - ./my-custom-dir/user.props:/datahub-frontend/conf/user.props
    ```
    with the paths of the user.props and env files wherever they are needed, and, as the docs say, I run the new docker-compose file. However, the custom users have not been added. Does anyone have any idea what I'm doing wrong or not doing? https://datahubproject.io/docs/how/auth/jaas#mount-a-custom-userprops-file-docker-compose
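For reference, the mounted user.props is a plain properties file of `username:password` lines; a sketch like this (the second user is hypothetical) should define both the default and a custom login:

```
# my-custom-dir/user.props
datahub:datahub
myuser:my-secret-password
```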
  • s

    strong-iron-17184

    02/03/2022, 2:40 PM
    image.png
  • n

    numerous-eve-42142

    02/03/2022, 9:07 PM
    Hi everyone!! I'm running DataHub on Kubernetes now, and I couldn't find out how to erase some metadata from the platform, like rollbacks, since I'm not able to run `datahub` commands. Can anyone help me?
  • b

    better-orange-49102

    02/04/2022, 10:34 AM
    (I'm using a code base that is 2 months old and can't screenshot because it is inside an intranet environment.) I'm seeing rows in MySQL containing the same platform urn in both lowercase and camelcase (OpenApi and openapi), i.e. some rows are:
    1. urn:li:dataPlatform:openapi
    2. urn:li:dataPlatform:OpenAPI
    3. urn:li:dataset:(urn:li:dataPlatform:OpenApi, dataset1, PROD)
    4. urn:li:dataset:(urn:li:dataPlatform:OpenApi, dataset2, PROD)
    and I think it's causing problems in displaying the logo of the openapi platform... because cases (1) and (2) have common aspects with different version numbers. I don't know how the camelcase urn snuck into the DB. I guess the only way is to clear the ES indices, amend the MySQL records, and reindex? 🥴
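A note on the mixed-case rows above: URNs are compared as exact strings, so `openapi` and `OpenApi` are two distinct platforms as far as the store and the index are concerned. A quick illustration of the collision and of case-normalizing before emitting (plain Python, not DataHub code; `platform_id` is a hypothetical helper):

```python
def platform_id(urn: str) -> str:
    """Extract the platform id from a dataPlatform URN, e.g. 'OpenApi'."""
    return urn.rsplit(":", 1)[-1]

rows = [
    "urn:li:dataPlatform:openapi",
    "urn:li:dataPlatform:OpenApi",
]

# Exact-string keys: the store treats these as two different platforms...
assert len(set(rows)) == 2
# ...but after normalizing case they collapse to a single platform id,
# which is why ingestion should emit one consistent casing.
assert len({platform_id(u).lower() for u in rows}) == 1
```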
  • g

    gifted-queen-61023

    02/04/2022, 10:57 AM
    Hi guys 👋 I fetched the latest changes and now I'm getting an error when building. Can you help? 😕 The error is as follows:
    ```
    Task :metadata-ingestion:installDev FAILED
    × Encountered error while trying to install package.
    ╰─> sasl3
    × Running setup.py install for sasl3 did not run successfully.
    │ exit code: 1
    ╰─> [28 lines of output]
        /.../datahub/metadata-ingestion/venv/lib/python3.8/site-packages/setuptools/dist.py:697: UserWarning: Usage of dash-separated 'description-file' will not be supported in future versions. Please use the underscore name 'description_file' instead
    note: This error originates from a subprocess, and is likely not a problem with pip.
    error: legacy-install-failure
    ```
    Thanks in advance 🙌
  • b

    boundless-student-48844

    02/04/2022, 11:04 AM
    Hey team, I am trying to extend the MLFeature model, but I can't find the aspect code for `Urn`, `MLFeatureDataType`, `VersionTag` (link) from `com.linkedin.common` (link) under the `metadata-models` repo. Did I miss anything? 🙇
  • s

    sparse-planet-56664

    02/04/2022, 12:19 PM
    Hi, after deleting an entire platform, or at least a bunch of datasets, via the CLI, their lineage is still visible when you look at other datasets. Is this the expected behaviour as of right now, or have I missed something?
  • b

    brief-toothbrush-55766

    02/04/2022, 2:47 PM
    I am using the Java API for emitting metadata, and I am getting this error. I see the problem from the error, but I can't figure out how to fix it. Here is the Java code snippet:
    ```java
    DatasetAspect dataAspect = new DatasetAspect();

    dataAspect.setOwnership(getOwnership(dataset));
    dataAspect.setSchemaMetadata(getSchemaMetadata(dataset));
    dataAspect.setInstitutionalMemory(getInstitutionalMemory(dataset));
    dataAspect.setDatasetProperties(new DatasetProperties().setDescription("Gama Test description").setCustomProperties(map));
    MetadataChangeProposalWrapper mcpw = MetadataChangeProposalWrapper.builder()
            .entityType("dataset")
            .entityUrn("urn:li:dataset:(urn:li:dataPlatform:s3,test,PROD)")
            //.entityUrn("urn:li:dataset:(foo,bar,PROD)")
            .upsert()

            .aspect(dataAspect).aspectName("dataset")
            .build();
    ```
  • a

    ancient-author-86397

    02/04/2022, 9:22 PM
    Hi! I'm trying to build DataHub, and when gradle gets to `compileMainGeneratedDataTemplateJava` it fails because it cannot find the symbol `FabricType`.
  • c

    calm-river-44367

    02/06/2022, 9:52 AM
    Hey, I want to deploy OIDC in DataHub with a provider other than Google, Azure, or Okta. First of all, I want to know if that's possible, and, besides the client ID, client secret, and scopes, what else do I need for such a configuration?
  • s

    shy-parrot-64120

    02/07/2022, 9:30 AM
    Hi all, the datahub CLI started to throw an exception:
    ```
    UnboundVariable: ': unbound variable'
    ```
    when running:
    ```
    datahub ingest run -c glue.yml
    ```
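`': unbound variable'` is the classic strict-mode shell-style expansion error for an unset variable, which suggests the recipe references an environment variable that isn't set in the CLI's environment. An analogy using Python's stdlib templating (an illustration only, not the CLI's actual code):

```python
from string import Template

# Shell-style ${VAR} placeholders, as a recipe file might contain.
recipe = "sink:\n  config:\n    server: ${DATAHUB_GMS_URL}"

# Strict expansion fails loudly on an unset variable
# (the analogue of the 'unbound variable' error above):
try:
    Template(recipe).substitute({})
    strict_failed = False
except KeyError:
    strict_failed = True
assert strict_failed

# Lenient expansion leaves unknown variables untouched instead of failing.
assert Template(recipe).safe_substitute({}) == recipe
```

Checking that every `${...}` referenced by the recipe is exported before running the CLI should clear the error.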
  • s

    shy-parrot-64120

    02/07/2022, 1:06 PM
    search results not showing root-level attribute name
  • h

    high-toothbrush-90528

    02/07/2022, 1:48 PM
    Hi everybody! I have a problem building the images using Teamcity. The error comes from git.properties. I have added linkedin/datahub as upstream in a BitBucket repo. Currently I am not able to find a solution or at least a workaround for this issue. Any suggestions are really welcome. Thanks!
    ```
    #9 252.3 There are 31 data schema input files. Using input root folder: /datahub-src/li-utils/src/main/pegasus
    #9 252.9 [main] INFO com.linkedin.pegasus.generator.PegasusDataTemplateGenerator - Generating 32 files
    #9 253.1
    #9 253.1 FAILURE: Build failed with an exception.
    #9 253.1
    #9 253.1 * What went wrong:
    #9 253.1 Execution failed for task ':metadata-models:generateGitProperties'.
    #9 253.1 > gradlegitproperties.org.eclipse.jgit.errors.MissingObjectException: Missing unknown 0b1d79ea8d4295908f5f808431e7a8b5faba6759
    #9 253.1
    ```
  • p

    plain-farmer-27314

    02/07/2022, 3:34 PM
    Hey all, today I went to undo an ingestion run and noticed that `datahub ingest list-runs` is not showing all of the different ingestion jobs we run. We run 5 or so jobs daily, and the latest one it's showing is from 02/05; it's also missing several jobs from each day. I double-checked our Airflow logs and confirmed the jobs ran successfully each day. Wondering if this is a known issue or if I'm missing something here.
  • c

    calm-river-44367

    02/08/2022, 7:49 AM
    hello! I have been checking out the OIDC feature of DataHub, but for a provider other than the three explained in the docs. I got my client ID, client secret, and my discovery URL with the right suffix (.well-known/openid-configuration). I put all of them in:
    ```
    # Required Configuration Values:
    AUTH_OIDC_ENABLED=true
    AUTH_OIDC_CLIENT_ID=your-client-id
    AUTH_OIDC_CLIENT_SECRET=your-client-secret
    AUTH_OIDC_DISCOVERY_URI=your-provider-discovery-url
    AUTH_OIDC_BASE_URL=your-datahub-url
    ```
    as the docs say, and placed it inside the docker.env file. After that, I brought the docker-compose file down and then up again. When I open my DataHub page again I get the page below; I have no idea what is wrong. If anyone has any idea, please help me out.
  • s

    strong-iron-17184

    02/08/2022, 2:06 PM
    Hi, I got this error when trying to run Airflow lineage on port 58080.
  • w

    wooden-football-7175

    02/08/2022, 2:32 PM
    Hello everyone. I’m having some issues while trying to ingest lineage from Airflow with the datahub backend. I followed the instructions, installed the dependencies, and created a demo DAG as the page details, but I’m getting the following error:
    ```
    {datahub.py:122} ERROR - ('Unable to emit metadata to DataHub GMS', {'message': "Invalid URL '<host>/entities?action=ingest': No schema supplied. Perhaps you meant http://<host>/entities?action=ingest?"})
    ```
    Does anyone have an idea about this? Thanks!
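The "No schema supplied" message above means the configured GMS host is missing its `http://` or `https://` prefix, so fixing the host in the Airflow connection should resolve it. A small stdlib sketch of the same check (`check_gms_url` is a hypothetical helper, not DataHub code):

```python
from urllib.parse import urlparse

def check_gms_url(url: str) -> str:
    """Reject endpoint URLs that lack an http(s) scheme."""
    if urlparse(url).scheme not in ("http", "https"):
        raise ValueError(f"No scheme supplied for {url!r}; did you mean 'http://{url}'?")
    return url

# A bare host, as in the failing configuration, is rejected...
try:
    check_gms_url("mygms.internal/entities?action=ingest")
    rejected = False
except ValueError:
    rejected = True
assert rejected

# ...while a fully qualified URL passes through unchanged.
assert check_gms_url("http://mygms.internal:8080") == "http://mygms.internal:8080"
```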
  • m

    modern-monitor-81461

    02/08/2022, 5:33 PM
    I just upgraded to 0.8.25 and I am now seeing new errors when ingesting data (I'm using my own custom source that I'm developing: Iceberg). I think my code did not really change, only the new dependency to 0.8.25. Have we added some new kind of validation? Here is the error:
    ```
    'message': "Parameters of method 'ingest' failed validation with error 'ERROR :: "
               '/entity/value/com.linkedin.metadata.snapshot.DatasetSnapshot/aspects/2/com.linkedin.schema.SchemaMetadata/fields/3/type/type/com.linkedin.schema.ArrayType/nestedType '
               ':: array type is not backed by a DataList\n'
    ```
    I understand that there is a problem with my SchemaMetadata aspect, specifically its 4th field. It is indeed an array:
    ```
    SchemaFieldClass({"fieldPath": "[version=2.0].[type=struct].[type=struct].leaf_cert.[type=array].[type=string].all_domains", "jsonPath": None, "nullable": True, "description": None, "type": SchemaFieldDataTypeClass({"type": ArrayTypeClass({"nestedType": None})}), "nativeDataType": "list<string>", "recursive": False, "globalTags": None, "glossaryTerms": None, "isPartOfKey": False, "jsonProps": '{"native_data_type": "list<string>"}'}),
    ```
    `all_domains` is a `List<String>` in Iceberg, so I'm modeling it as an `Array` of `String` in DataHub. What is wrong with my code now that I'm on 0.8.25? What does "array type is not backed by a DataList" mean?
  • a

    ambitious-cartoon-15344

    02/09/2022, 3:27 AM
    Hi, we are trying to use Data Domains. We use Chinese, and we discovered that Datasets can't have a Domain set.
  • b

    blue-plastic-11088

    02/09/2022, 5:56 AM
    Hello! Are there any known issues around displaying glossary term details? I have successfully imported a business glossary and am able to tag datasets with glossary terms. However, when I go to the glossary page and click on any term, the page flashes for a fraction of a second and goes blank. The URL looks like this: http://xxx.xxx.xxx.xxx:9002/glossary/urn:li:glossaryTerm:Ecommerce.CallToAction/Related%20Entities?is_lineage_mode=false
    ```
    $ datahub version
    DataHub CLI version: 0.8.25.1
    Python version: 3.8.10 (default, Nov 26 2021, 20:14:08) [GCC 9.3.0]
    ```
  • d

    dazzling-appointment-34954

    02/09/2022, 7:33 AM
    Hi everyone, I would like to report a search issue. When searching for an item that has a “/” in its name, the search runs into “an unknown error occurred”. Sometimes these elements are suggested in the “Try searching for” area, and clicking on them always throws an error (not so nice for users browsing the catalog). Is this intentional, so the best practice would be to remove these “/” from urns, or can we find a fix for it?
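The “/” failures above are consistent with the frontend building entity URLs from raw URNs: an unencoded slash inside the name splits the URL path into an extra segment, while percent-encoding keeps the URN intact. A stdlib illustration (the hive dataset name is hypothetical):

```python
from urllib.parse import quote, unquote

# A dataset name containing '/' (hypothetical example)
urn = "urn:li:dataset:(urn:li:dataPlatform:hive,db/schema.table,PROD)"

# Embedded raw, the '/' introduces a spurious path segment:
assert f"/dataset/{urn}".count("/") > 2

# Percent-encoding the whole URN keeps it a single path segment
# and round-trips losslessly:
encoded = quote(urn, safe="")
assert "/" not in encoded
assert unquote(encoded) == urn
```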
  • f

    few-air-56117

    02/09/2022, 1:03 PM
    Hi guys, I deployed datahub 0.2.85 today and now I get these errors.
  • d

    damp-queen-61493

    02/09/2022, 3:07 PM
    Hi guys! We're evaluating the glossary feature in dev. After trying to rename a term, a new term was created and I removed (soft-deleted) the old one. But the old term is still assigned to a dataset field, and when I try to remove it I get an error in the UI:
    ```
    Failed to remove term: An unknown error occurred.
    ```
    GMS log:
    ```
    15:06:25.538 [Thread-9955] INFO  c.l.d.g.r.mutate.RemoveTermResolver:52 - Removing Term. input: {}
    15:06:25.545 [Thread-9955] ERROR c.l.d.g.r.mutate.RemoveTermResolver:63 - Failed to perform update against input com.linkedin.datahub.graphql.generated.TermAssociationInput@43af2fc3, Failed to validate record with class com.linkedin.common.GlossaryTerms: ERROR :: /editableSchemaFieldInfo :: unrecognized field found but not allowed
    ERROR :: /terms :: field is required but not found and has no default value
    ERROR :: /auditStamp :: field is required but not found and has no default value

    15:06:25.546 [Thread-9955] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler:21 - Failed to execute DataFetcher
    java.util.concurrent.CompletionException: java.lang.RuntimeException: Failed to perform update against input com.linkedin.datahub.graphql.generated.TermAssociationInput@43af2fc3
    	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
    	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
    	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
    	at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.RuntimeException: Failed to perform update against input com.linkedin.datahub.graphql.generated.TermAssociationInput@43af2fc3
    	at com.linkedin.datahub.graphql.resolvers.mutate.RemoveTermResolver.lambda$get$0(RemoveTermResolver.java:64)
    	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
    	... 1 common frames omitted
    Caused by: com.linkedin.metadata.entity.ValidationException: Failed to validate record with class com.linkedin.common.GlossaryTerms: ERROR :: /editableSchemaFieldInfo :: unrecognized field found but not allowed
    ERROR :: /terms :: field is required but not found and has no default value
    ERROR :: /auditStamp :: field is required but not found and has no default value

    	at com.linkedin.metadata.entity.ValidationUtils.lambda$validateOrThrow$0(ValidationUtils.java:19)
    	at com.linkedin.metadata.entity.RecordTemplateValidator.validate(RecordTemplateValidator.java:37)
    	at com.linkedin.metadata.entity.ValidationUtils.validateOrThrow(ValidationUtils.java:17)
    	at com.linkedin.metadata.entity.EntityService.ingestProposal(EntityService.java:398)
    	at com.linkedin.datahub.graphql.resolvers.mutate.MutationUtils.persistAspect(MutationUtils.java:33)
    	at com.linkedin.datahub.graphql.resolvers.mutate.util.LabelUtils.removeTermFromTarget(LabelUtils.java:69)
    	at com.linkedin.datahub.graphql.resolvers.mutate.RemoveTermResolver.lambda$get$0(RemoveTermResolver.java:54)
    	... 2 common frames omitted
    15:06:25.547 [Thread-9954] ERROR c.datahub.graphql.GraphQLController:94 - Errors while executing graphQL query: "mutation removeTerm($input: TermAssociationInput!) {\n  removeTerm(input: $input)\n}\n", result: {errors=[{message=An unknown error occurred., locations=[{line=2, column=3}], path=[removeTerm], extensions={code=500, type=SERVER_ERROR, classification=DataFetchingException}}], data={removeTerm=null}}, errors: [DataHubGraphQLError{path=[removeTerm], code=SERVER_ERROR, locations=[SourceLocation{line=2, column=3}]}]
    ```
    DataHub version: 0.8.25. How can I recover from this error, and what is the correct flow to remove a glossary term with associated datasets and fields?
  • w

    wooden-football-7175

    02/09/2022, 3:52 PM
    Hello everyone. I’m having a problem with the Ubuntu / Docker quickstart. Any recommendation for the `datahub-rest` sink config? I’m trying to execute a recipe from the web UI and the publish fails, while on “console” it works fine!
  • n

    nutritious-bird-77396

    02/09/2022, 9:36 PM
    Hello everyone! I am running into this bug in the GMS log after updating from version `0.8.24` to `0.8.26`: when trying to list groups from the UI, it throws an Invalid urn error... Error stack in the 🧵
  • n

    nutritious-bird-77396

    02/09/2022, 9:53 PM
    Team, a couple of questions looking at the mysql/postgres-setup:
    1. What is the purpose of the `metadata_index` table? https://github.com/arunvasudevan/datahub/blob/master/docker/mysql-setup/init.sql#L42
    2. I don't see it in `postgres-setup`. I am assuming it's just missed, right? https://github.com/arunvasudevan/datahub/blob/master/docker/postgres-setup/init.sql
  • f

    few-air-56117

    02/10/2022, 7:07 AM
    Hi guys, I upgraded my helm repo
    ```
    helm repo update
    ```
    and installed datahub 0.8.26, but I got these errors:
    ```
    14      	Thu Feb 10 08:53:50 2022	superseded	datahub-0.2.42	0.8.24     	Upgrade complete
    15      	Thu Feb 10 08:58:38 2022	deployed  	datahub-0.2.45	0.8.26     	Upgrade complete
    ```