# troubleshoot
  • wooden-arm-26381
    08/05/2022, 7:56 AM
    Hey, with v0.8.42 it seems a small Python mistake snuck into the Azure AD connector:
    UnboundLocalError: local variable 'datahub_corp_group_snapshot' referenced before assignment
    The variable is accessed outside this for loop: https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/identity/azure_ad.py#L280 This happens on lines #297 and #305.
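    For context, a minimal runnable sketch of this failure mode and one defensive fix; everything except the variable name is a hypothetical stand-in for the connector's real code:
    Copy code
    azure_ad_groups: list = []          # hypothetical stand-in; an empty result set triggers the bug
    datahub_corp_group_snapshot = None  # bind the name before the loop so it always exists
    for azure_ad_group in azure_ad_groups:
        datahub_corp_group_snapshot = azure_ad_group  # stand-in for the real mapping logic
    if datahub_corp_group_snapshot is not None:       # guard the later uses (lines #297/#305)
        print(datahub_corp_group_snapshot)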
  • jolly-traffic-67085
    08/05/2022, 7:58 AM
    Hello everyone. I connected DataHub OIDC with Keycloak on v0.8.35 and I can log out. On v0.8.41 I connected OIDC to Keycloak the same way, but I can't log out of DataHub: when I click log out I'm taken back to the main page instead of the login page. I don't know what's going on. Thank you!
  • hallowed-lawyer-5424
    08/05/2022, 10:01 AM
    Hello all, I am trying to fetch entities using the search API
    <http://localhost:8080/entities?action=search>
    with this body as payload
    Copy code
    {
      "input": "",
      "entity": "dataset",
      "start": 9999,
      "count": 10000
    }
    I'm only able to fetch up to 10,000 records. Is there any way to get all records for a given entity type and environment? If I set start to '0' and count to '10000' I get a response, but I can't retrieve the remaining records.
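    The 10,000 ceiling matches Elasticsearch's default index.max_result_window, which caps start + count. A minimal paging sketch against the endpoint above; the "*" match-all input and the value.entities response shape are assumptions to adjust to what your instance actually returns:
    Copy code
    import requests

    URL = "http://localhost:8080/entities?action=search"
    PAGE_SIZE = 1000

    start = 0
    while start + PAGE_SIZE <= 10000:  # server-side window; raise max_result_window to go past it
        body = {"input": "*", "entity": "dataset", "start": start, "count": PAGE_SIZE}
        resp = requests.post(URL, json=body)
        resp.raise_for_status()
        entities = resp.json().get("value", {}).get("entities", [])
        if not entities:
            break
        print(f"fetched {len(entities)} datasets at offset {start}")
        start += PAGE_SIZE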
  • brave-tomato-16287
    08/05/2022, 10:04 AM
    Dear all, after updating to .42 we are facing this error when running ingest:
    Copy code
    datahub_actions.pipeline.pipeline.PipelineException: Failed to log failed event to file! EventEnvelope(event_type='MetadataChangeLogEvent_v1', event=MetadataChangeLogEvent({'auditHeader': None, 'entityType': 'dataset', 'entityUrn': 'urn:li:dataset:(urn:li:dataPlatform:dbt,dev.analytics_dbt_test__audit.not_null_dim_customers_customer_id,PROD)', 'entityKeyAspect': None, 'changeType': 'RESTATE', 'aspectName': 'upstreamLineage', 'aspect': GenericAspectClass({'value': b'{"upstreams":[{"auditStamp":{"actor":"urn:li:corpuser:unknown","time":0},"type":"TRANSFORMED","dataset":"urn:li:dataset:(urn:li:dataPlatform:redshift,dev.analytics.dim_customers,PROD)"}]}', 'contentType': 'application/json'}), 'systemMetadata': None, 'previousAspectValue': None, 'previousSystemMetadata': None, 'created': AuditStampClass({'time': 1659685194401, 'actor': 'urn:li:corpuser:__datahub_system', 'impersonator': None})}), meta={'kafka': {'topic': 'MetadataChangeLog_Versioned_v1', 'offset': 162992, 'partition': 0}})
  • little-army-38555
    08/05/2022, 10:48 AM
    Hi everyone! Could you help with mssql plugin installation in Kubernetes with the Helm chart? Is there any way to do this without pip install, or should I build a Docker container with it?
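    One common approach is to bake the plugin into a custom image and point the chart at it; a minimal sketch, where the base image and tag are assumptions to match to your deployment:
    Copy code
    # Hypothetical Dockerfile: extend the ingestion image with the mssql plugin
    # so nothing needs to be pip-installed at pod startup.
    FROM acryl/datahub-ingestion:v0.8.42
    RUN pip install --no-cache-dir 'acryl-datahub[mssql]'
    Push it to your registry and reference it from the Helm values of whichever component runs your ingestion.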
  • square-solstice-69079
    08/05/2022, 1:52 PM
    I'm testing the new bulk edit after upgrading DataHub to the latest version. I'm able to bulk-add tags and owners, but domains are not being set. Is this a bug? Edit: tested it on the demo site, and it's not working there either.
  • gray-nest-42961
    08/05/2022, 6:08 PM
    👋 Hello, team! Our GMS instances occasionally hit errors trying to create the Elasticsearch system metadata index. The error looks like:
    Copy code
    Failed to instantiate [com.linkedin.metadata.kafka.hook.UpdateIndicesHook]: Constructor threw exception; nested exception is java.lang.RuntimeException: Could not configure system metadata index
    22:15:58 [main] INFO  c.l.r.t.h.c.c.AbstractNettyClient - Shutdown requested
    22:15:58 [main] INFO  c.l.r.t.h.c.c.AbstractNettyClient - Shutting down
    ...
    Caused by: java.net.ConnectException: Connection refused
    Any idea what might cause this and how to fix it 👀? Thanks! cc @bitter-lizard-32293 @numerous-byte-87938
  • numerous-account-62719
    08/08/2022, 4:56 AM
    Hi Team, how can I delete all tables from a data source in DataHub?
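    A sketch using the delete CLI; the flags are from the 0.8.x delete docs, and mssql/PROD are placeholder values (check datahub delete --help on your version):
    Copy code
    # soft-delete every dataset ingested from one platform/environment
    datahub delete --entity_type dataset --platform mssql --env PROD --soft
    # use --hard instead to remove the records entirely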
  • echoing-farmer-38304
    08/08/2022, 8:01 AM
    Hello! I'm having trouble with PowerBI ingestion; I get the following error during ingestion:
    Copy code
    \datahub\ingestion\source\powerbi.py", line 588, in get_data_source
        id=datasource_dict["datasourceId"],
    
    KeyError: 'datasourceId'
    Dictionary with data (datasource_dict)
    Copy code
    {'datasourceType': 'Sql', 'connectionDetails': {'server': 'server-name', 'database': 'db-name'}}
    dataset_type_mapping
    Copy code
    dataset_type_mapping:
            PostgreSql: postgres
            Oracle: oracle
    Is there a problem in my config, or has something gone wrong in the module? Any ideas?
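    For reference, the KeyError means the connector indexes datasource_dict["datasourceId"] unconditionally, while the gateway-style SQL datasource above has no such key. A hedged sketch of a defensive guard (not the official fix):
    Copy code
    import logging
    from typing import Optional

    logger = logging.getLogger(__name__)

    def get_datasource_id(datasource_dict: dict) -> Optional[str]:
        # Gateway-style SQL datasources (like the dict above) can lack "datasourceId";
        # return None and log a warning instead of crashing the whole ingestion run.
        datasource_id = datasource_dict.get("datasourceId")
        if datasource_id is None:
            logger.warning("Skipping datasource without datasourceId: %s", datasource_dict)
        return datasource_id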
  • salmon-rose-54694
    08/08/2022, 10:10 AM
    Hi, we coded a new aspect called 'dataset_ttl'. CRUD on this aspect works, but we see many errors from the MAE and MCL consumers like:
    Copy code
    org.apache.kafka.common.errors.SerializationException: Error deserializing key/value for partition MetadataChangeLog_Versioned_v1-0
    Or
    Copy code
    Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema not found; error code: 40403
    	at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:292)
    	at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:351)
    	at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:659)
    	at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:641)
    Just want to know: is there anything the new aspect might be missing with regard to the MAE and MCL consumers? Thank you.
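    Error code 40403 comes from the Confluent Schema Registry, so a quick check is whether the value schema the consumers expect is actually registered; the subject name below assumes the default TopicNameStrategy and the registry's standard port:
    Copy code
    # list all registered subjects
    curl http://localhost:8081/subjects
    # inspect the versions registered for the MCL topic's value schema
    curl http://localhost:8081/subjects/MetadataChangeLog_Versioned_v1-value/versions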
  • numerous-account-62719
    08/08/2022, 11:29 AM
    Hi Team, I tried installing DataHub via pip using the command pip install datahub, but I am getting the following error:
    Copy code
    Requirement already satisfied: pip in /opt/conda/lib/python3.9/site-packages (22.0.4)
    Collecting install
      Using cached install-1.3.5-py3-none-any.whl (3.2 kB)
    Collecting datahub
      Using cached datahub-0.8.90dev.tar.gz (11 kB)
      Preparing metadata (setup.py) ... done
    Collecting pastescript>=1.0
      Using cached PasteScript-3.2.1-py2.py3-none-any.whl (73 kB)
    Collecting cheetah>=2.0
      Using cached Cheetah-2.4.4.tar.gz (190 kB)
      Preparing metadata (setup.py) ... error
      error: subprocess-exited-with-error
      
      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [9 lines of output]
          Traceback (most recent call last):
            File "<string>", line 2, in <module>
            File "<pip-setuptools-caller>", line 34, in <module>
            File "/tmp/pip-install-qqhf77ju/cheetah_8ebe7010343c4b18a855f6ae5ba8cd7b/setup.py", line 10, in <module>
              import SetupTools
            File "/tmp/pip-install-qqhf77ju/cheetah_8ebe7010343c4b18a855f6ae5ba8cd7b/SetupTools.py", line 50
              except DistutilsPlatformError, x:
                                           ^
          SyntaxError: invalid syntax
          [end of output]
      
      note: This error originates from a subprocess, and is likely not a problem with pip.
    error: metadata-generation-failed
    
    × Encountered error while generating package metadata.
    ╰─> See above for output.
    
    note: This is an issue with the package mentioned above, not pip.
    hint: See above for details.
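    Worth noting: the PyPI project named datahub is an unrelated, long-abandoned package, which is why pip is trying to build ancient Python 2 era dependencies like Cheetah and PasteScript. The DataHub CLI is published as acryl-datahub:
    Copy code
    pip install acryl-datahub
    datahub version  # verify the CLI is installed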
  • gentle-camera-33498
    08/08/2022, 3:13 PM
    Hello everyone, I deployed DataHub in GKE using Helm. I have 4 replicas of GMS and 2 replicas of the frontend, but I still see a lot of these errors. Does anyone know what it could be?
  • faint-translator-23365
    08/08/2022, 3:13 PM
    When I'm configuring OIDC I'm getting this error in the logs:
    Copy code
    ERROR application - Internal server error, for (GET) [/authenticate?redirect_uri=...] ->
    play.api.UnexpectedException: Unexpected exception[TechnicalException: java.net.ConnectException: Connection refused]
    	at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:340)
    	at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:263)
    	at play.core.server.AkkaHttpServer$$anonfun$1.applyOrElse(AkkaHttpServer.scala:443)
    	...
    Caused by: org.pac4j.core.exception.TechnicalException: java.net.ConnectException: Connection refused
    	at org.pac4j.oidc.config.OidcConfiguration.internalInit(OidcConfiguration.java:136)
    	at org.pac4j.core.util.InitializableObject.init(InitializableObject.java:20)
    	at auth.sso.oidc.custom.CustomOidcClient.clientInit(CustomOidcClient.java:21)
    	at org.pac4j.core.client.IndirectClient.internalInit(IndirectClient.java:58)
    	at org.pac4j.core.client.IndirectClient.getRedirectAction(IndirectClient.java:93)
    	at org.pac4j.core.client.IndirectClient.redirect(IndirectClient.java:79)
    	at controllers.AuthenticationController.redirectToIdentityProvider(AuthenticationController.java:278)
    	at controllers.AuthenticationController.authenticate(AuthenticationController.java:89)
    	...
  • shy-parrot-64120
    08/08/2022, 7:13 PM
    Hi all, I just tried to upgrade 0.8.41 -> 0.8.42 (via k8s Helm chart 0.2.87), and datahub-gms failed to start:
    Copy code
    ERROR: No such classes directory file:///etc/datahub/plugins/auth/resources
  • bright-diamond-60933
    08/08/2022, 9:05 PM
    datahub-frontend-react | Caused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers
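    For anyone hitting the same thing: in the standard docker-compose files the frontend reads its brokers from an env var, so a missing or unresolvable value there produces exactly this error; the host and port below are the compose defaults, adjust to your setup:
    Copy code
    # docker-compose environment for datahub-frontend-react
    KAFKA_BOOTSTRAP_SERVER=broker:29092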
  • rapid-house-76230
    08/08/2022, 9:10 PM
    When I try to port-forward my frontend pod and log in with the default username/password, I get these errors from the frontend pod. It seems like the GMS pod could not be resolved, but the GMS pod gives no errors and is running fine. Does anyone have any pointers?
    Untitled.txt
  • bright-diamond-60933
    08/08/2022, 8:57 PM
    Have any of you seen these errors when you run ./dev.sh?
  • purple-analyst-83660
    08/09/2022, 4:13 AM
    Hi Team, I am trying to ingest data from a Tableau project into my DataHub. I get this error:
    failed to write record with workunit urn:li:dashboard:(tableau,10625179-12ce-63aa-edce-798f6a70d9f6) with ('Unable to emit metadata to DataHub GMS', {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException', 'stackTrace': 'com.linkedin.restli.server.RestLiServiceException [HTTP Status:422]: com.linkedin.metadata.entity.ValidationException: Failed to validate record with class com.linkedin.entity.Entity: ERROR :: /value/com.linkedin.metadata.snapshot.DashboardSnapshot/aspects/0/com.linkedin.dashboard.DashboardInfo/datasets :: unrecognized field found but not allowed
    My datahub GMS version is 0.8.40 and CLI is 0.8.42.
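    The "unrecognized field found but not allowed" validation error plus the version skew above usually means the newer CLI is emitting a field the older GMS model doesn't know about yet; a sketch of the usual workaround, where the exact pin is an assumption based on the versions quoted:
    Copy code
    pip install 'acryl-datahub==0.8.40'  # match the CLI to the GMS version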
  • faint-translator-23365
    08/08/2022, 2:02 PM
    Hi, I am trying to enable profiling for a Snowflake source. Tables are getting ingested but not profiled; am I missing something? Can someone please help? Is profile_pattern still part of the Snowflake recipe? Once we remove profile_pattern, all the tables are profiled. Thanks!
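    For reference, a minimal recipe sketch with profiling enabled and an explicit profile_pattern; connection details are omitted and the allow regex is an assumption to widen or narrow as needed:
    Copy code
    source:
      type: snowflake
      config:
        # ...connection details...
        profiling:
          enabled: true
        profile_pattern:
          allow:
            - ".*"   # profile everything; tighten this once profiling works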
  • echoing-farmer-38304
    08/09/2022, 7:17 AM
    Hello! I'm trying to install the datahub library locally (pip install -e .). It runs successfully, but when I run tests I get the following error:
    Copy code
    from datahub.metadata.schema_classes import (
    E   ModuleNotFoundError: No module named 'datahub.metadata'
    Also, I tried building a whl package and running that, but I get the same error. If I install with pip install acryl-datahub==0.8.41, tests run without that error, but then it doesn't see my local changes. Is there a solution for this?
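    The datahub.metadata.* modules are generated from the PDL models, so a plain editable install doesn't create them; the metadata-ingestion developer guide runs codegen through Gradle, roughly like this:
    Copy code
    # from the metadata-ingestion/ directory of the repo
    ../gradlew :metadata-ingestion:installDev   # runs codegen and an editable install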
  • agreeable-belgium-70840
    08/09/2022, 9:50 AM
    Hello, I am facing the following issue: I have created a group and two different privileges for it, all on metadata and all on platform. Everything works, and the admin group has elevated privileges, but the users can't create a tag; they get an error saying they don't have sufficient privileges for that. What am I missing? Any ideas?
  • straight-agent-79732
    08/09/2022, 11:39 AM
    Hi, I am trying to deploy datahub using
    Copy code
    datahub docker quickstart
    I got port conflicts with Elasticsearch, the schema registry, and datahub-gms. I see we can pass different ports for Elasticsearch and the schema registry, but there is no documentation for datahub-gms. Leaving datahub-gms aside, I tried passing different ports for Elasticsearch and the schema registry like this: datahub docker quickstart --elastic-port 7310 --schema-registry-port 7311. But no luck; DataHub is still using the same old ports. Can anyone help me out here?
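    On the GMS side, the quickstart compose files map host ports through env vars of the form DATAHUB_MAPPED_*_PORT, so overriding those before running quickstart may help; a sketch, where the variable name is an assumption taken from the quickstart compose file (verify against your version):
    Copy code
    export DATAHUB_MAPPED_GMS_PORT=7312
    datahub docker quickstart --elastic-port 7310 --schema-registry-port 7311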
  • wooden-pencil-40912
    08/09/2022, 12:39 PM
    Hi Team 👋 When will the Helm chart be released for DataHub version v0.8.42? Any estimate would be really helpful. 🙂
  • stale-printer-44316
    08/09/2022, 3:32 PM
    Hi guys, is it possible to ingest descriptions from Avro schemas (schema registry) into DataHub directly, without any manual intervention? If so, how can this be done, please?
  • most-nightfall-36645
    08/09/2022, 3:42 PM
    Hi, is there a log_level option we can use to reduce log verbosity? I tried searching around the documentation but I can't seem to find anything.
  • kind-whale-32412
    08/09/2022, 4:59 PM
    Could anyone please update the version of confluent-kafka in the acryl-datahub-actions pip package? Currently it's not compatible with the MacBook M1.
  • bland-stone-30401
    08/09/2022, 11:49 AM
    Hi Team, we have upgraded LDH to version 0.8.41. Our code built and deployed successfully, but when we try to access it via the UI we get the error below.
    Oops, an error occurred
    This exception has been logged with id *7ohk695mj*.
    PS: We have also disabled OIDC authentication.
  • steep-finland-24780
    08/10/2022, 12:17 AM
    Hello, I was trying to troubleshoot a problem on docker-compose with Google OIDC. The logs from the front-end container show the following error:
    Copy code
    Caused by: com.nimbusds.oauth2.sdk.ParseException: The scope must include an "openid" value
    	at com.nimbusds.openid.connect.sdk.AuthenticationRequest.parse(AuthenticationRequest.java:1378)
    	at com.nimbusds.openid.connect.sdk.AuthenticationRequest.parse(AuthenticationRequest.java:1312)
    	at org.pac4j.oidc.redirect.OidcRedirectActionBuilder.buildAuthenticationRequestUrl(OidcRedirectActionBuilder.java:110)
    It seems it's not parsing the AUTH_OIDC_SCOPE env var properly. I opened an interactive shell inside the container, and it seems the variables are being set correctly. Here's the output from the front-end container:
    Copy code
    ubuntu@host-name:~$ docker exec -it <CONTAINER_ID_FRONT-END> /bin/sh
    
    / $ env
    
    ELASTIC_CLIENT_HOST=elasticsearch
    HOSTNAME=datahub-frontend-react
    SHLVL=1
    HOME=/home/datahub
    AUTH_OIDC_DISCOVERY_URI=<https://accounts.google.com/.well-known/openid-configuration>
    ELASTIC_CLIENT_PORT=9200
    AUTH_OIDC_CLIENT_ID=<correct_OIDC_CLIENT>
    AUTH_OIDC_CLIENT_SECRET=<correct_OIDC_SECRET>
    AUTH_OIDC_ENABLED=true
    AUTH_OIDC_USER_NAME_CLAIM=email
    AUTH_OIDC_SCOPE="openid profile email"
    TERM=xterm
    Has anyone had a similar problem? How are you setting those variables?
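    One detail in the env output above stands out: the scope value still contains literal double quotes. Values passed through a docker-compose env_file keep quotes as part of the value, so the scope string starts with a quote character and the "openid" check in nimbusds fails; a sketch of the likely fix:
    Copy code
    # in the env file consumed by docker-compose: no quotes around the value
    AUTH_OIDC_SCOPE=openid profile email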
  • numerous-account-62719
    08/10/2022, 5:46 AM
    Hi Team, I am upgrading the DataHub version from 0.8.33 to 0.8.41 and getting the following error: Unable to create application: application spec for telco-datahub-test is invalid: InvalidSpecError: Unable to generate manifests in helm/datahub: rpc error: code = Unknown desc = helm dependency build failed exit status 1: Error: can't get a valid version for repositories datahub-gms, datahub-mae-consumer, datahub-mce-consumer. Try changing the version constraint in Chart.yaml
  • bulky-jordan-44775
    08/10/2022, 6:28 AM
    👋 Hello, team! I installed DataHub using Kubernetes and have it deployed to AWS following the guide published on the site, but as soon as I log in I get this error message:
    [Thread-481] WARN  n.g.e.SimpleDataFetcherExceptionHandler - Exception while fetching data (/corpUser) : java.lang.RuntimeException: Failed to retrieve entities of type CorpUser
    Can anybody help me?