# troubleshoot

    thousands-intern-95970

    03/14/2022, 4:10 PM
Hi all! We are using lineage_emitter_dataset_finegrained.py to visualize lineage but encountered the error: "The field at path '/dataset/upstream/relationships[0]/entity' was declared as a non null type, but the code involved in retrieving data has wrongly returned a null value. The graphql specification requires that the parent field be set to null, or if that is non nullable that it bubble up null to its parent and so on. The non-nullable type is 'Entity' within parent type 'LineageRelationship'"
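For context, a pared-down sketch of the emitter flow that script wraps (coarse table-level upstream only, not the fine-grained fields), assuming a local REST sink and hypothetical dataset names; one hedged guess at the error above is lineage that references an upstream URN that was never ingested, leaving the GraphQL layer unable to resolve the non-null Entity:
from datahub.emitter.mce_builder import make_dataset_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.com.linkedin.pegasus2avro.dataset import (
    DatasetLineageType,
    Upstream,
    UpstreamLineage,
)

# Hypothetical names; both datasets should already exist in DataHub,
# otherwise the lineage points at an entity GraphQL cannot load.
upstream_urn = make_dataset_urn(platform="postgres", name="db.schema.src", env="PROD")
downstream_urn = make_dataset_urn(platform="postgres", name="db.schema.dst", env="PROD")

lineage = UpstreamLineage(
    upstreams=[Upstream(dataset=upstream_urn, type=DatasetLineageType.TRANSFORMED)]
)
emitter = DatahubRestEmitter("http://localhost:8080")
emitter.emit_mcp(
    MetadataChangeProposalWrapper(entityUrn=downstream_urn, aspect=lineage)
)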

    lemon-hydrogen-83671

    03/14/2022, 7:35 PM
Hey guys, I was wondering how folks are dealing with scenarios where the datastore for datahub_gms becomes unavailable. I noticed that during an ingestion run it will just publish an event to Kafka to indicate that the record failed and move on, but is there anything we can set to enable automatic retries? AFAIK the failed-event topics just log the events.

    gray-carpet-60705

    03/14/2022, 9:06 PM
Hello, I’m trying to upgrade DataHub from v0.8.21 to v0.8.23 (and eventually to the latest) and running into a UI login error as the default “datahub” user upon initial upgrade. I’ll include more details in the thread. Any help would be greatly appreciated. Thanks!

    salmon-area-51650

    03/15/2022, 11:27 AM
    👋 Hello team!!! I’m trying to enable OIDC with Google SSO but I’m getting an error. Any help would be appreciated.
    extraEnvs:
        - name: AUTH_OIDC_ENABLED
          value: "true"
        - name: AUTH_OIDC_CLIENT_ID
          value: "<http://XXXXXXX.apps.googleusercontent.com|XXXXXXX.apps.googleusercontent.com>"
        - name: AUTH_OIDC_CLIENT_SECRET
          value: "YYYYYYY"
        - name: AUTH_OIDC_DISCOVERY_URI
          value: "<https://accounts.google.com/.well-known/openid-configuration>"
        - name: AUTH_OIDC_USER_NAME_CLAIM
          value: "email"
        - name: AUTH_OIDC_USER_NAME_CLAIM_REGEX
          value: "([^@]+)"
        - name: AUTH_OIDC_BASE_URL
          value: "<https://datahub.mydomain.com>"
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend ! @7n2h5bp28 - Internal server error, for (GET) [/authenticate?redirect_uri=%2F] ->
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend play.api.UnexpectedException: Unexpected exception[CryptoException: Unable to init cipher instance.]
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.http.HttpErrorHandlerExceptions$.throwableToUsefulException(HttpErrorHandler.scala:247)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.http.DefaultHttpErrorHandler.onServerError(HttpErrorHandler.scala:176)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:363)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.server.AkkaHttpServer$$anonfun$2.applyOrElse(AkkaHttpServer.scala:361)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:346)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.concurrent.Future$$anonfun$recoverWith$1.apply(Future.scala:345)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:36)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.BatchingExecutor$AbstractBatch.processBatch(BatchingExecutor.scala:55)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:92)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:92)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.BatchingExecutor$BlockableBatch$$anonfun$run$1.apply(BatchingExecutor.scala:92)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.BatchingExecutor$BlockableBatch.run(BatchingExecutor.scala:91)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(ForkJoinExecutorConfigurator.scala:49)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend Caused by: org.apache.shiro.crypto.CryptoException: Unable to init cipher instance.
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.apache.shiro.crypto.JcaCipherService.init(JcaCipherService.java:495)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.apache.shiro.crypto.JcaCipherService.initNewCipher(JcaCipherService.java:599)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.apache.shiro.crypto.JcaCipherService.crypt(JcaCipherService.java:444)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.apache.shiro.crypto.JcaCipherService.encrypt(JcaCipherService.java:324)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.apache.shiro.crypto.JcaCipherService.encrypt(JcaCipherService.java:313)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.pac4j.play.store.ShiroAesDataEncrypter.encrypt(ShiroAesDataEncrypter.java:51)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.pac4j.play.store.PlayCookieSessionStore.set(PlayCookieSessionStore.java:77)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.pac4j.play.store.PlayCookieSessionStore.set(PlayCookieSessionStore.java:29)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.pac4j.oidc.redirect.OidcRedirectActionBuilder.addStateAndNonceParameters(OidcRedirectActionBuilder.java:97)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.pac4j.oidc.redirect.OidcRedirectActionBuilder.redirect(OidcRedirectActionBuilder.java:72)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.pac4j.core.client.IndirectClient.getRedirectAction(IndirectClient.java:109)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.pac4j.core.client.IndirectClient.redirect(IndirectClient.java:79)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at controllers.AuthenticationController.redirectToIdentityProvider(AuthenticationController.java:160)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at controllers.AuthenticationController.authenticate(AuthenticationController.java:87)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at router.Routes$$anonfun$routes$1$$anonfun$applyOrElse$4$$anonfun$apply$4.apply(Routes.scala:450)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at router.Routes$$anonfun$routes$1$$anonfun$applyOrElse$4$$anonfun$apply$4.apply(Routes.scala:450)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.routing.HandlerInvokerFactory$$anon$3.resultCall(HandlerInvoker.scala:134)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.routing.HandlerInvokerFactory$$anon$3.resultCall(HandlerInvoker.scala:133)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.routing.HandlerInvokerFactory$JavaActionInvokerFactory$$anon$8$$anon$2$$anon$1.invocation(HandlerInvoker.scala:108)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.j.JavaAction$$anon$1.call(JavaAction.scala:88)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.http.DefaultActionCreator$1.call(DefaultActionCreator.java:31)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.j.JavaAction$$anonfun$9.apply(JavaAction.scala:138)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.j.JavaAction$$anonfun$9.apply(JavaAction.scala:138)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.j.HttpExecutionContext$$anon$2.run(HttpExecutionContext.scala:56)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.libs.streams.Execution$trampoline$.execute(Execution.scala:70)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.j.HttpExecutionContext.execute(HttpExecutionContext.scala:48)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.concurrent.impl.Future$.apply(Future.scala:31)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.concurrent.Future$.apply(Future.scala:494)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.j.JavaAction.apply(JavaAction.scala:138)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:96)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.mvc.Action$$anonfun$apply$2.apply(Action.scala:89)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2$$anonfun$1.apply(Accumulator.scala:174)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2$$anonfun$1.apply(Accumulator.scala:174)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.util.Try$.apply(Try.scala:192)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2.apply(Accumulator.scala:174)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.libs.streams.StrictAccumulator$$anonfun$mapFuture$2.apply(Accumulator.scala:170)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at scala.Function1$$anonfun$andThen$1.apply(Function1.scala:52)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.api.libs.streams.StrictAccumulator.run(Accumulator.scala:207)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.server.AkkaHttpServer$$anonfun$14.apply(AkkaHttpServer.scala:357)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at play.core.server.AkkaHttpServer$$anonfun$14.apply(AkkaHttpServer.scala:355)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.http.scaladsl.util.FastFuture$.akka$http$scaladsl$util$FastFuture$$strictTransform$1(FastFuture.scala:41)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.http.scaladsl.util.FastFuture$$anonfun$transformWith$extension1$1.apply(FastFuture.scala:51)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at akka.http.scaladsl.util.FastFuture$$anonfun$transformWith$extension1$1.apply(FastFuture.scala:50)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	... 13 common frames omitted
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend Caused by: java.security.InvalidKeyException: Invalid AES key length: 30 bytes
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at com.sun.crypto.provider.AESCrypt.init(AESCrypt.java:87)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at com.sun.crypto.provider.GaloisCounterMode.init(GaloisCounterMode.java:302)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at com.sun.crypto.provider.CipherCore.init(CipherCore.java:589)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at com.sun.crypto.provider.AESCipher.engineInit(AESCipher.java:346)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at javax.crypto.Cipher.implInit(Cipher.java:809)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at javax.crypto.Cipher.chooseProvider(Cipher.java:867)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at javax.crypto.Cipher.init(Cipher.java:1399)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at javax.crypto.Cipher.init(Cipher.java:1330)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	at org.apache.shiro.crypto.JcaCipherService.init(JcaCipherService.java:488)
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 	... 57 common frames omitted
    datahub-datahub-frontend-8d7f7cf6f-xvjwm datahub-frontend 11:24:21 [Thread-1] INFO  play.core.server.AkkaHttpServer - Stopping server...
    Thanks in advance!!
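The giveaway line is "Invalid AES key length: 30 bytes": the frontend's cookie encrypter needs an AES-sized secret. A hedged sketch of the likely fix, assuming the chart passes the Play secret through the DATAHUB_SECRET variable (name and value below are illustrative; the secret must be 16, 24, or 32 bytes long):
extraEnvs:
    - name: DATAHUB_SECRET
      # 32-byte value, e.g. generated with `openssl rand -hex 16`
      value: "ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ"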

    fresh-salesclerk-60575

    03/15/2022, 12:16 PM
Hi, I am trying to deploy linkedin/datahub-frontend-react locally, connected to a datahub-gms I have on another machine (in AWS). The GMS seems to work fine and has data, but the frontend gives me the following error as soon as I open it:
Request URL: http://localhost:9002/api/v2/graphql
    Request Method: POST
    Status Code: 503 Service Unavailable
    Remote Address: [::1]:9002
There is no extra info at all in the logs. If anyone has seen anything like this or has an idea, I'd appreciate some help. Thanks!
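A hedged configuration sketch for this setup: the frontend locates GMS through environment variables, so a local container has to be pointed explicitly at the remote host (hostname below is a placeholder):
# point the local frontend at the remote GMS instance
docker run -p 9002:9002 \
  -e DATAHUB_GMS_HOST=gms.example.internal \
  -e DATAHUB_GMS_PORT=8080 \
  linkedin/datahub-frontend-react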

    bland-balloon-48379

    03/15/2022, 8:30 PM
Hi, my team is currently running an instance of DataHub on Kubernetes. I recently updated the values file for our Helm chart to use the latest image tags for DataHub (v0.8.28). I was told that this version would have ingestion available in the UI (presented as a tab in the top right corner of the home page), but I am still only able to see "Analytics," "Users & Groups," and "Policies." I can see that it is running version 0.8.28 (at least for the frontend) when checking under my user icon on the home page, so I'm not sure what's causing this hold-up. Is there something else I need to do to enable this feature, or something I am missing? Would appreciate any feedback, thanks.

    red-napkin-59945

    03/15/2022, 11:31 PM
Hey team, regarding this PR: I checked in the PNG image together with the .md file, but it still complains:
    Error: Image ../docs/rfc/active/000-datadoc-entity/datadoc-high-level-model.png used in genDocs/docs/rfc/active/000-datadoc-entity/datadoc-entity-rfc.md not found.
        at async Promise.all (index 0)
    Is there anything else I missed?

    better-orange-49102

    03/16/2022, 2:44 AM
Using OIDC login on v0.8.26: some OIDC errors spotted in the logs, indicating GMS connection errors? (logs in thread)

    early-midnight-66457

    03/16/2022, 7:58 AM
    javax.servlet.ServletException: org.springframework.web.util.NestedServletException: Async processing failed; nested exception is java.lang.OutOfMemoryError: unable to create new native thread
    	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:162)
    	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    	at org.eclipse.jetty.server.Server.handleAsync(Server.java:537)
    	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:398)
    	at org.eclipse.jetty.server.HttpChannel.run(HttpChannel.java:314)
    	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:782)
    	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:918)
    	at java.lang.Thread.run(Thread.java:748)
    Caused by: org.springframework.web.util.NestedServletException: Async processing failed; nested exception is java.lang.OutOfMemoryError: unable to create new native thread
    	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod$ConcurrentResultHandlerMethod.lambda$new$0(ServletInvocableHandlerMethod.java:213)
    	at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source)
    	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    	at java.lang.reflect.Method.invoke(Method.java:498)
    	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:190)
    	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:138)
    	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
    	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:888)
    	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:793)
    	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
    	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1040)
    	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:943)
    	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1006)
    	at org.springframework.web.servlet.FrameworkServlet.doPost(FrameworkServlet.java:909)
    	at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:883)
    	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
    	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:852)
    	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:544)
    	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
    	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:536)
    	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
    	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
    	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1581)
    	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
    	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1307)
    	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
    	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:482)
    	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1549)
    	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
    	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1204)
    	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:195)
    	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
    	... 7 more
    Caused by: java.lang.OutOfMemoryError: unable to create new native thread
    	at java.lang.Thread.start0(Native Method)

    stale-petabyte-93467

    03/16/2022, 8:16 AM
Hi, good day. I just finished a test ingestion and I would like to know where I can see the errors or failures. Based on the report, failures is empty, but I am still getting a failure message.
    Sink (datahub-rest) report:
    {'downstream_end_time': datetime.datetime(2022, 3, 16, 16, 7, 18, 188112),
     'downstream_start_time': datetime.datetime(2022, 3, 16, 16, 0, 51, 276126),
     'downstream_total_latency_in_seconds': 386.911986,
     'failures': [],
     'gms_version': 'v0.8.29',
     'records_written': 2495,
     'warnings': []}
    
    Pipeline finished with failures
    [2022-03-16 16:07:19,211] INFO     {datahub.telemetry.telemetry:159} - Sending Telemetry
    [2022-03-16 16:07:29,226] INFO     {datahub.telemetry.telemetry:159} - Sending Telemetry

    fierce-author-36990

    03/16/2022, 10:04 AM
Hi, I am trying to extend the metadata model. I followed the example project on GitHub (https://github.com/datahub-project/datahub/tree/master/metadata-models-custom); on v0.8.20 it succeeds, but after recently updating to v0.8.29 it reports an error. The error log from the datahub-gms container is below. How should I fix it?
    09:54:08 [ForkJoinPool.commonPool-worker-5] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler - Failed to execute DataFetcher
    java.util.concurrent.CompletionException: java.lang.ClassCastException: com.linkedin.data.DataList cannot be cast to com.linkedin.data.DataMap
            at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
            at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
            at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1606)
            at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1596)
            at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
            at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
            at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
            at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:175)
    Caused by: java.lang.ClassCastException: com.linkedin.data.DataList cannot be cast to com.linkedin.data.DataMap
            at com.linkedin.data.DataMap.getDataMap(DataMap.java:286)
            at com.linkedin.datahub.graphql.WeaklyTypedAspectsResolver.lambda$null$1(WeaklyTypedAspectsResolver.java:72)
            at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
            at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
            at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
            at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
            at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
            at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
            at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
            at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
            at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
            at com.linkedin.datahub.graphql.WeaklyTypedAspectsResolver.lambda$get$2(WeaklyTypedAspectsResolver.java:56)
            at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
            ... 5 common frames omitted
    09:54:08 [qtp544724190-13] INFO  c.l.m.resources.usage.UsageStats - Attempting to query usage stats
    09:54:08 [pool-9-thread-1] INFO  c.l.m.filter.RestliLoggingFilter - POST /usageStats?action=queryRange - queryRange - 200 - 13ms

    high-family-71209

    03/16/2022, 1:04 PM
When I run `datahub delete --platform oracle`, why do I have remaining entries for the schema and the database?
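A hedged guess, borrowing the command shape from a later message on this page: the schema and database survive as separate container entities, so a platform-wide dataset delete does not touch them and they need their own pass (flags may vary by CLI version):
# delete the datasets, then the container entities left behind
datahub delete --platform oracle --entity_type dataset
datahub delete --platform oracle --entity_type container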

    stale-jewelry-2440

    03/16/2022, 1:05 PM
    DataHubValidationAction
    in Great Expectations and Mixpanel Hi folks, it seems that the module sends data to mixpaned by default. Differently statetd, I never asked nor configured anything to send data to Mixpanel, but when I activate the module to send expectations results to DataHub (i.e. adding the relative action in the checkpoint yml file), I read this on the logs:
[2022-03-16, 13:55:38 CET] {connectionpool.py:1005} DEBUG - Starting new HTTPS connection (1): api.mixpanel.com:443
[2022-03-16, 13:55:38 CET] {connectionpool.py:465} DEBUG - https://api.mixpanel.com:443 "POST /engage HTTP/1.1" 200 25
    Note that those messages disappear if I comment out the DataHubValidationAction in the checkpoint file.

    high-family-71209

    03/16/2022, 1:06 PM
Hi all, when ingesting with the CLI, why is this not working:
schema_pattern:
  deny:
    - '.*'
  allow:
    - 'myschema.*'
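A likely reading, hedged: in the ingestion framework's allow/deny pattern, a string must match an allow pattern and not match any deny pattern, so `deny: ['.*']` excludes everything before the allow is ever useful. Dropping the deny (the default deny list is empty) should behave as intended:
schema_pattern:
  allow:
    - 'myschema.*'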

    high-family-71209

    03/16/2022, 1:41 PM
Hi all - I just ingested 170 Oracle views. When I click on the Oracle platform, i.e. `search?filter_platform=urn:li:dataPlatform:oracle`, I just get an empty view. How do I fix this?

    brave-businessperson-3969

    03/16/2022, 2:28 PM
I'm trying to query DataHub via the Python packages. The installed version is 0.8.29 with the latest acryl PyPI packages.
    from datahub.emitter.mce_builder import make_dataset_urn, dataset_urn_to_key
    from datahub.emitter.mcp import MetadataChangeProposalWrapper
    
    # read-modify-write requires access to the DataHubGraph (RestEmitter is not enough)
    from datahub.ingestion.graph.client import DatahubClientConfig, DataHubGraph
    
from datahub.metadata.schema_classes import (
    InstitutionalMemoryClass,
    SchemaMetadataClass,
)
    
    gms_endpoint = "<http://localhost:8080>"
    datahub_uplink = DataHubGraph(config=DatahubClientConfig(server=gms_endpoint))
    urn = 'urn:li:dataset:(urn:li:dataPlatform:postgres,Demo.PagilaDemo.public.actor,DEV)'
    
    current_institutional_memory = datahub_uplink.get_aspect_v2(
        entity_urn=urn,
        aspect="institutionalMemory",
        aspect_type=InstitutionalMemoryClass,
    )
    
    print(current_institutional_memory)
    # -> results in the expected output
    
    
# but this command throws an exception
    datahub_uplink.get_aspect_v2(urn, aspect="schemaMetadata", aspect_type=SchemaMetadataClass)
    
    ValueError                                Traceback (most recent call last)
    [....]
    File ~/datahub/env/lib/python3.8/site-packages/avrogen/avrojson.py:358, in AvroJsonConverter._record_from_json(self, json_obj, writers_schema, readers_schema)
        356     result[field.name] = field_value
        357 if input_keys:
    --> 358     raise ValueError(f'{readers_schema.fullname} contains extra fields: {input_keys}')
        359 return self._instantiate_record(result, writers_schema, readers_schema)
    
    ValueError: com.linkedin.pegasus2avro.schema.Schemaless contains extra fields: {'com.linkedin.schema.MySqlDDL'}
Any idea why the parser can't handle the MySqlDDL field? The strange thing is, the object I'm querying is not even ingested from a MySQL DB but from a Postgres DB.

    delightful-appointment-36689

    03/16/2022, 10:51 PM
Hi folks, I'm trying to use the Great Expectations integration with DataHub. It's working with Postgres but I have an issue with Redshift. I am running the checkpoint with `great_expectations -v checkpoint run datahub_checkpoint`. The validation passes but I can't see the assertions in the DataHub UI, whereas running the same checkpoint with a Postgres datasource gives me the validation details in the UI. Any ideas?

    high-family-71209

    03/17/2022, 12:28 PM
Hi all, `sudo datahub delete --platform glue --entity_type container` matches 8 entities for me, but then fails to delete them with error code 500, saying `failed to convert urn to entity key: urns parts and key fields do not have same length`. Why?

    billowy-candle-87381

    03/17/2022, 1:32 PM
Hi! Anyone have an idea why I might be receiving this when running manual ingestion for snowflake-usage? The user has the accountadmin role and the query runs fine in Snowflake.

    red-napkin-59945

    03/17/2022, 8:49 PM
Hey team, I found that even if I set `AUTH_OIDC_USER_NAME_CLAIM_REGEX=([^@]+)` in the frontend.env file, that env var does not exist in the JVM.
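A quick way to verify whether the variable reaches the container at all (container name below is a placeholder):
# list the environment the frontend process actually inherits
docker exec datahub-frontend-react env | grep AUTH_OIDC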

    breezy-controller-54597

    03/18/2022, 7:24 AM
I'm trying to deploy DataHub to EC2 using minikube, but it's not working. The prerequisites seem to be working fine, but when I run `helm install datahub datahub/datahub` it times out.
    $ kubectl get pods
    NAME                                                READY   STATUS    RESTARTS        AGE
    elasticsearch-master-0                              1/1     Running   0               5m3s
    prerequisites-cp-schema-registry-6f4b5b894f-zd8kn   2/2     Running   0               5m3s
    prerequisites-kafka-0                               1/1     Running   1 (4m21s ago)   5m3s
    prerequisites-mysql-0                               1/1     Running   0               5m3s
    prerequisites-neo4j-community-0                     1/1     Running   0               5m3s
    prerequisites-zookeeper-0                           1/1     Running   0               5m3s
    
    $ helm install datahub datahub/datahub
    W0318 06:58:33.319740  292308 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
    W0318 06:58:33.321488  292308 warnings.go:70] batch/v1beta1 CronJob is deprecated in v1.21+, unavailable in v1.25+; use batch/v1 CronJob
    Error: INSTALLATION FAILED: failed pre-install: timed out waiting for the condition
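The Helm error only says a pre-install hook timed out; a hedged debugging sketch (the setup-job names below are what the chart typically creates, but may differ by chart version):
# find the hook job that never completed, then read its pod logs
kubectl get jobs
kubectl get pods --field-selector=status.phase!=Running
kubectl logs job/datahub-elasticsearch-setup-job   # or the kafka/mysql setup jobs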

    brave-insurance-80044

    03/18/2022, 12:20 PM
Hi team, I’m working on creating a new EntityType and a custom Platform. I’ve successfully ingested the new EntityType with the custom Platform. The Platform shows up in the new entity’s breadcrumbs and search results, but the Platform tile is not appearing in the Home Page’s Platform section. Does anyone have a clue how to fix it?

    plain-farmer-27314

    03/18/2022, 7:16 PM
Hi all, we recently updated to 0.8.31 to take advantage of the impact analysis feature as shown in the demo. As seen in the picture, it looks like we are still missing the impact analysis button. Just wondering if there's a config or something we have to enable to get this functionality. Thanks!

    plain-farmer-27314

    03/18/2022, 8:32 PM
Hi everyone - wondering what the best way is to query all datasets that have a certain folder as a parent. E.g. in Looker we have a folder called "production" that has several subfolders, and we'd like to filter a lineage query to "all dashboards that have production as a parent folder".

    numerous-eve-42142

    03/19/2022, 4:32 PM
Hi! I'm having a little problem ingesting data from Redshift with an Airflow DAG. The error message shown is this:
    File "/home/airflow/.local/lib/python3.8/site-packages/datahub/ingestion/api/registry.py", line 128, in get
        raise ConfigurationError(
    datahub.configuration.common.ConfigurationError: redshift is disabled; try running: pip install 'acryl-datahub[redshift]'
    [2022-03-17, 18:59:09 UTC] {local_task_job.py:154} INFO - Task exited with return code 1
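The traceback names its own fix; the hedged Airflow-specific catch is that the extra has to be installed in the environment the task actually runs in (the worker image or its requirements), not just on the machine where the recipe was written:
pip install 'acryl-datahub[redshift]'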

    shy-parrot-64120

    03/19/2022, 6:34 PM
Hi folks, I just found that when manually constructing an Airflow flow URN via `mce_builder.make_data_flow_urn('airflow', dag_id)`, the dataFlow object's URN does not contain a dataPlatform, like this:
urn:li:dataFlow:(airflow,aws_transforms_sportsbook,prod)
Is this an issue or a feature? It leaves me unable to operate on the objects via the CLI: `--platform airflow` finds nothing, while on the other hand the UI shows the object under the correct platform.
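A minimal check of the builder's output, for context; a hedged explanation is that dataFlow URNs embed the orchestrator name directly rather than a dataPlatform URN, which would be why a dataPlatform-based CLI filter matches nothing:
from datahub.emitter.mce_builder import make_data_flow_urn

# the third URN tuple member is the cluster, defaulting to "prod"
print(make_data_flow_urn("airflow", "aws_transforms_sportsbook"))
# -> urn:li:dataFlow:(airflow,aws_transforms_sportsbook,prod)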

    witty-butcher-82399

    03/21/2022, 12:10 PM
I have a recipe with the following `schema_pattern` (note the trailing `$`) working in 0.8.27 but not working in >= 0.8.28.
schema_pattern:
  allow:
    - ^my_database\$
This config is loaded as follows in 0.8.27, and tables in `my_database` are processed:
    [2022-03-21 12:11:33,341] DEBUG    {datahub.cli.ingest_cli:76} - Using config: {'source': {'type': 'hive', ..., 'schema_pattern': {'allow': ['^my_database$',
whereas in 0.8.28 it gets loaded as follows:
    [2022-03-21 12:10:51,588] DEBUG    {datahub.cli.ingest_cli:76} - Using config: {'source': {'type': 'hive', ..., 'schema_pattern': {'allow': ['^my_database\\$',
This results in the database not matching the pattern in 0.8.28 and so being filtered out: a recipe that was loading many tables with 0.8.27 loads none with 0.8.28. Was this a deliberate change fixing bad previous behaviour, or a bug introduced in 0.8.28?
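A minimal repro of the difference the two DEBUG lines show, hedged in that it only demonstrates the regex effect, not where the extra backslash comes from: once the pattern gains a backslash, the trailing `$` stops being an end-of-string anchor and becomes a literal character:
import re

# pattern as logged by 0.8.27: '$' is the end-of-string anchor
assert re.match("^my_database$", "my_database")

# pattern as logged by 0.8.28 ('\\$' in the repr): '\$' matches a
# literal dollar sign, so the plain database name no longer matches
assert not re.match("^my_database\\$", "my_database")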

    damp-greece-27806

    03/21/2022, 2:03 PM
Howdy - we’re experiencing issues with the Metabase source, where the failure report is too big to send to CloudWatch, as the watchtower lib sets `max_message_size` to 262144 by default:
    [2022-03-21 04:12:56,631] {{logging_mixin.py:104}} WARNING - /usr/local/lib/python3.7/site-packages/watchtower/__init__.py:199 WatchtowerWarning: Failed to deliver logs: An error occurred (InvalidParameterException) when calling the PutLogEvents operation: Log event too large: 666943 bytes exceeds limit of 262144
Is there any way to configure this, or would it require a code change?
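For what it's worth, a hedged sketch: watchtower exposes the limit as a constructor argument on its handler, though CloudWatch itself caps a single event at 256 KiB, so an oversized report ultimately has to be truncated or split upstream rather than raised past that (parameter names vary across watchtower versions; group name is a placeholder):
import watchtower

handler = watchtower.CloudWatchLogHandler(
    log_group_name="airflow-task-logs",  # placeholder group name
    max_message_size=256 * 1024,         # bytes; used when batching events
)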

    mysterious-butcher-86719

    03/21/2022, 3:05 PM
Hi team, I am getting a 401 error while trying to get GraphQL output through Python's requests.post method. Is there a particular way I have to authenticate to DataHub before executing requests.post to get the response?
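A hedged sketch of the usual shape of an authenticated call when metadata service authentication is enabled, assuming a personal access token generated in the UI (endpoint, token, and query are placeholders):
import requests

resp = requests.post(
    "http://localhost:8080/api/graphql",  # GMS GraphQL endpoint
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={"query": "{ me { corpUser { username } } }"},
)
print(resp.status_code, resp.json())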

    gentle-father-80172

    03/21/2022, 3:46 PM
    Hi Team 👋 Need some help with a GraphQL query please.... Trying to get a dataset's lineage information but GraphQL is complaining that
    "Validation error of type FieldUndefined: Field 'lineage' in type 'Dataset' is undefined @ 'dataset/lineage'"