# troubleshoot
  • refined-energy-76018 (07/06/2022, 5:38 PM)
    Anyone else run into this issue before? Any pointers would be greatly appreciated
  • kind-dawn-17532 (07/06/2022, 5:54 PM)
    How should I make sense of the two breadcrumbs in this screenshot? In our case we have platform instances enabled, and in the second breadcrumb the database name appears twice.
  • gentle-camera-33498 (07/06/2022, 8:37 PM)
    Hello everyone, I'm having some issues with DataHub. At the company I work for, tables and views within the same BigQuery dataset are organized by a naming convention: tables get lowercase names separated by '_', while views use upper camel case. In some cases, though, the table name and the view name differ by only one capital letter (e.g., <dataset>.order and <dataset>.Order). As a result, when I search for 'order' in the UI I get several 500 responses and a JSON parsing error message, and when I browse looking for the table I only find the view. Since I can't change the conventions already adopted, does anyone have any idea how I can solve this?
  • green-pharmacist-54624 (07/07/2022, 9:51 AM)
    Hi, I am new to DataHub. I'm setting it up locally using `datahub docker quickstart` and I am getting this error:
    Unable to run quickstart - the following issues were detected:
    - datahub-gms is running but not healthy
    Here is the error from my log file:
    09:05:37 [main] WARN  o.s.w.c.s.XmlWebApplicationContext - Exception encountered during context initialization - cancelling refresh attempt: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'siblingGraphServiceFactory': Unsatisfied dependency expressed through field '_entityService'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'entityAspectDao' defined in com.linkedin.gms.factory.entity.EntityAspectDaoFactory: Unsatisfied dependency expressed through method 'createEbeanInstance' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in com.linkedin.gms.factory.entity.EbeanServerFactory: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException
    09:05:37 [main] ERROR o.s.web.context.ContextLoader - Context initialization failed
    org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'siblingGraphServiceFactory': Unsatisfied dependency expressed through field '_entityService'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'entityAspectDao' defined in com.linkedin.gms.factory.entity.EntityAspectDaoFactory: Unsatisfied dependency expressed through method 'createEbeanInstance' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in com.linkedin.gms.factory.entity.EbeanServerFactory: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException
    	at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.resolveFieldValue(AutowiredAnnotationBeanPostProcessor.java:659)
    	at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:639)
    	at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:119)
    	at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessProperties(AutowiredAnnotationBeanPostProcessor.java:399)
    	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1431)
    	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:619)
    	at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542)
    	at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335)
    	at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)
    	at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)
    	at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208)
    	at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:953)
    	at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918)
    	at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583)
    	at org.springframework.web.context.ContextLoader.configureAndRefreshWebApplicationContext(ContextLoader.java:401)
    	at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:292)
    	at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:103)
    	at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:1073)
    	at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:572)
    	at org.eclipse.jetty.server.handler.ContextHandler.contextInitialized(ContextHandler.java:1002)
    	at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:746)
    	at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:379)
    	at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1449)
    	at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1414)
    	at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:916)
    	at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:288)
    	at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
    	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
    	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
    	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
    	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
    	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
    	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
    	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
    	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
    	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
    	at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
    	at org.eclipse.jetty.server.Server.start(Server.java:423)
    	at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
    	at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
    	at org.eclipse.jetty.server.Server.doStart(Server.java:387)
    	at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
    	at org.eclipse.jetty.runner.Runner.run(Runner.java:519)
    	at org.eclipse.jetty.runner.Runner.main(Runner.java:564)
  • bumpy-activity-74405 (07/07/2022, 10:34 AM)
    Hi, I was trying to find some docs on policies, and clicking the first search result, "Policies Guide", takes me to a page that doesn't exist (https://datahubproject.io/docs/policies/). However, there is a page that does exist, which you can find manually under Authorization: https://datahubproject.io/docs/authorization/policies.
  • bumpy-activity-74405 (07/07/2022, 11:06 AM)
    I think there's a bug in the settings UI in v0.8.38. Prior to updating, I had a platform policy that allowed users to Manage Users & Groups, and it worked fine. It still works, but the only way to access it is if you know the URL (https://somehost.com/settings/identities/users). The Users & Groups tab under Access is simply not there for users with those privileges. After tinkering around I found that the tab does appear if you give a user the Manage Policies platform privilege along with Manage Users & Groups, but that is obviously too broad to use as a workaround.
  • bland-orange-13353 (07/07/2022, 2:22 PM)
    This message was deleted.
  • rhythmic-stone-77840 (07/07/2022, 2:56 PM)
    Hey all, I'm having an issue figuring out the right way to set a filter for a GraphQL query. I'd like to filter on dataset entries that come from the bigquery platform. I've tried a bunch of different values for the field, but I can't seem to find the right one. Example in 🧵
  • calm-dinner-63735 (07/07/2022, 8:39 PM)
    If I want to enable TLS encryption between client and broker for MSK Kafka, what do I need to change in values.yaml?
  • jolly-traffic-67085 (07/08/2022, 6:53 AM)
    Hello, I need to generate an access token (GenerateAccessToken) using GraphiQL in DataHub and then use it with OpenAPI, following the commands below:
    mutation {
      createAccessToken(input: {type: PERSONAL, actorUrn: "urn:li:corpuser:datahub", duration: ONE_HOUR, name: "my personal token"}) {
        accessToken
        metadata {
          id
          name
          description
        }
      }
    }
    and
    curl --location --request POST 'http://localhost:8080/api/graphql' \
    --header 'X-DataHub-Actor: urn:li:corpuser:datahub' \
    --header 'Content-Type: application/json' \
    --data-raw '{ "query":"{ createAccessToken(input: { type: PERSONAL, actorUrn: \"urn:li:corpuser:datahub\", duration: ONE_HOUR, name: \"my personal token\" } ) { accessToken metadata { id name description} } }", "variables":{}}'
    but it is not working, as shown in the attached picture. Please suggest how to generate an access token using GraphiQL and OpenAPI. Thanks in advance!
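    For reference, here is a minimal sketch of the same call from Python, assuming metadata service authentication is enabled on GMS so the request must carry an existing token in an Authorization: Bearer header (the endpoint and token value below are placeholders):
    ```python
    # Hedged sketch: POST the createAccessToken mutation from the message above to GMS.
    # The existing token must belong to an actor allowed to generate tokens; both values are placeholders.
    import requests

    GMS_GRAPHQL = "http://localhost:8080/api/graphql"        # assumed GMS GraphQL endpoint
    EXISTING_TOKEN = "<token-with-token-management-rights>"  # placeholder

    mutation = """
    mutation {
      createAccessToken(
        input: {type: PERSONAL, actorUrn: "urn:li:corpuser:datahub", duration: ONE_HOUR, name: "my personal token"}
      ) {
        accessToken
        metadata { id name description }
      }
    }
    """

    resp = requests.post(
        GMS_GRAPHQL,
        json={"query": mutation, "variables": {}},
        headers={"Authorization": f"Bearer {EXISTING_TOKEN}"},
    )
    resp.raise_for_status()
    print(resp.json())  # inspect both the "data" and "errors" keys of the response
    ```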
  • calm-dinner-63735 (07/08/2022, 8:21 AM)
    Hi, can I get any help here: if I want to enable TLS encryption between client and broker for MSK Kafka, what do I need to change in values.yaml?
  • best-lamp-53937 (07/08/2022, 12:51 PM)
    Which version(s) of Java and Gradle are recommended for building datahub locally?
  • microscopic-mechanic-13766 (07/11/2022, 8:58 AM)
    Good morning, I have been facing an error whose source I don't know exactly, nor how to fix it. The first time DataHub is started, no matter which user you log in with, users can't do anything because they get a page saying "Unauthorized". If I get that error, log out, and log in with another user, the error disappears for both users.
  • quick-pizza-8906 (07/11/2022, 10:27 AM)
    Good afternoon. We are having some problems with user permissions, which prompted some investigation, and I got to this piece of code: https://github.com/datahub-project/datahub/blob/master/datahub-web-react/src/app/entity/EntityPage.tsx#L54-L74 It checks user permissions on the browser side: even though my user was not authorized to see a dataset, I could see it by simply substituting the value of the flag via Chrome debugging... Is this expected behavior?
  • most-nightfall-36645 (07/11/2022, 11:33 AM)
    Hi, I am having some issues upgrading DataHub. I have deployed DataHub via Terraform using the DataHub k8s Helm charts. When I upgrade, the `datahub-datahub-upgrade-job` isn't run. Is there a documented process for upgrading k8s deployments?
  • most-nightfall-36645 (07/11/2022, 12:20 PM)
    Intermittent authentication errors: Hi, we run ingestions as part of our CI pipelines using the DataHub REST API. When ingesting, we receive intermittent authentication errors; for example, our client reports:
    requests.exceptions.JSONDecodeError: [Errno Expecting value] <html>
    <head>
    <meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1"/>
    <title>Error 401 Unauthorized to perform this action.</title>
    </head>
    <body><h2>HTTP ERROR 401 Unauthorized to perform this action.</h2>
    <table>
    <tr><th>URI:</th><td>/entities</td></tr>
    <tr><th>STATUS:</th><td>401</td></tr>
    <tr><th>MESSAGE:</th><td>Unauthorized to perform this action.</td></tr>
    <tr><th>SERVLET:</th><td>restliRequestHandler</td></tr>
    </table>
    <hr/><a href="https://eclipse.org/jetty">Powered by Jetty:// 9.4.46.v20220331</a><hr/>
    </body>
    </html>
    The server reports a missing authentication token:
    10:36:57.676 [qtp1830908236-57260] WARN c.d.a.a.AuthenticatorChain:70 - Authentication chain failed to resolve a valid authentication. Errors: [(com.datahub.authentication.authenticator.DataHubSystemAuthenticator,Failed to authenticate inbound request: Authorization header is missing 'Basic' prefix.), (com.datahub.authentication.authenticator.DataHubTokenAuthenticator,Failed to authenticate inbound request: Unable to verify the provided token.)]
    This behaviour happens intermittently: some jobs succeed and others fail. We haven't changed our client or token between jobs, so I don't understand why the token is missing. We host our deployment on EKS and use MySQL as our datastore. I have checked:
    • RDS database connections and system resources
    • Kafka system resources
    • ES system resources
    None of these are under contention. I also checked the node where the frontend and GMS containers are running; both have plenty of free memory and CPU time. I am wondering if this could be a bug. Does anyone have any suggestions?
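    One client-side check worth ruling things out with (a hedged sketch, not the poster's actual pipeline code): build the emitter with the token passed explicitly and verify the connection once at job start, so a 401 can be traced to the token itself rather than to a job that silently ran without an Authorization header. DatahubRestEmitter and its token argument come from the acryl-datahub Python package; the server address and token below are placeholders.
    ```python
    # Hedged sketch: fail fast at the start of a CI job if GMS does not accept our requests.
    from datahub.emitter.rest_emitter import DatahubRestEmitter

    emitter = DatahubRestEmitter(
        gms_server="http://datahub-gms:8080",        # placeholder in-cluster GMS address
        token="<personal-or-service-access-token>",  # placeholder; inject from a CI secret
    )
    emitter.test_connection()  # sanity-check connectivity to GMS before emitting anything
    ```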
  • hallowed-dog-79615 (07/11/2022, 12:58 PM)
    Hi everyone! I just wanted to note here some bugs (at least that is what we think) we have been finding lately in DataHub.
    • Since we installed the .40 version, dbt tests are not shown on datasets' entity pages. Formerly they were found under "Assertions", which is now disabled.
    • We are not sure since which specific release, but our "impact analysis", the list-like lineage section available on each entity's details page, does not show nodes deeper than 1st order. We can select the 2nd and 3+ order options in the filters, but those nodes are not shown in the list nor in the downloaded CSV; only 1st order nodes appear in both.
    • The column lineage section is also missing. We have screenshots from our team from a few weeks ago where this section appeared under the Properties tab on a dataset's page. Now the Properties tab contains different information (related to dbt, even though it is a Snowflake object). Is the dbt object overwriting the Snowflake one in a sibling association?
    Thanks!
  • wooden-arm-26381 (07/11/2022, 4:23 PM)
    Hi, has anyone had any success implementing filter inputs for creating policies? I'm trying to apply a "view entity" policy based on some criteria, e.g. entities with a certain tag attached or which are part of a specific cloud project. I'm not sure whether the filter functionality is implemented yet. I've been trying to create it via GraphQL without any effect so far; even an exact match on a URN does not work for me. Example:
    mutation createPolicy {
      createPolicy(
        input: {
          type: METADATA,
          name: "filter match urn",
          state: ACTIVE,
          description: "",
          resources: {
            type: "",
            resources: [],
            allResources: false,
            filter: {
              criteria: [
                {
                  field: "entity_urn",
                  values: ["urn:li:dataset:(urn:li:dataPlatform:bigquery,<project>.<dataset>.<table>,DEV)"],
                  condition: EQUALS
                }
              ]
            }
          },
          actors: {
            users: [],
            groups: [],
            resourceOwners: false,
            allUsers: true,
            allGroups: false
          },
          privileges: ["VIEW_ENTITY_PAGE"]
        }
      )
    }
    Any help greatly appreciated 🙂 Cheers
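    When a mutation like this appears to do nothing, the response usually still carries a GraphQL errors array explaining why. A hedged sketch for surfacing it with the DataHub Python client (DataHubGraph.execute_graphql is available in recent acryl-datahub releases; the server and token below are placeholders):
    ```python
    # Hedged sketch: run the createPolicy mutation above and surface any GraphQL errors
    # instead of letting the call fail silently.
    from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig

    graph = DataHubGraph(DatahubClientConfig(
        server="http://localhost:8080",  # placeholder GMS address
        token="<access-token>",          # placeholder token
    ))

    create_policy = """ ...the createPolicy mutation from the message above... """

    try:
        print(graph.execute_graphql(create_policy))  # prints the result on success
    except Exception as exc:
        print(f"createPolicy failed: {exc}")         # carries the server-side GraphQL error message
    ```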
  • most-pillow-90882 (07/11/2022, 9:41 PM)
    Hi, we've configured DataHub with AWS Cognito OIDC. The user appears to authenticate correctly, but I get an infinitely looping white screen when it tries to redirect to the DataHub homepage. This still happens when using an incognito window and after clearing browser cookies. Any help diagnosing this would be much appreciated, thank you. Each loop displays different OIDC code and state query parameters in the URL (see the browser screenshot). The frontend logs show multiple repeats of this error:
    21:05:58 [application-akka.actor.default-dispatcher-16] ERROR auth.sso.oidc.OidcCallbackLogic - Unable to renew the session. The session store may not support this feature
  • hallowed-machine-2603 (07/12/2022, 4:29 AM)
    Hi team, I have two questions about the transformers feature.
    1. 'Add dataset browse paths': I use this transformer to classify datasets, but I can't change the path for each dataset. For example, I use 'path templates: Test/Test route', and I want to set a path like Test/Test/[Test_table] for a specific table. How can I set the path for each dataset?
    2. 'Add a set of properties': same question as No. 1. How can I set this for a specific table? Thanks 🙂
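    On question 1, the browse-path transformer's template can include a placeholder that expands to each dataset's own name, so the path does vary per table. A hedged sketch of a recipe run programmatically (the set_dataset_browse_path transformer, its path_templates key, and the DATASET_PARTS placeholder are as I recall them from the ingestion transformer docs; the source and sink settings are placeholders):
    ```python
    # Hedged sketch: an ingestion recipe whose browse-path template expands per dataset.
    from datahub.ingestion.run.pipeline import Pipeline

    recipe = {
        "source": {
            "type": "postgres",  # placeholder source; use your real source here
            "config": {"host_port": "localhost:5432", "database": "mydb",
                       "username": "user", "password": "pass"},
        },
        "transformers": [
            {
                "type": "set_dataset_browse_path",
                "config": {
                    # DATASET_PARTS is substituted with each dataset's own name parts,
                    # so every table lands under Test/Test/<its own name>.
                    "path_templates": ["/Test/Test/DATASET_PARTS"],
                },
            }
        ],
        "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
    }

    Pipeline.create(recipe).run()
    ```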
  • purple-analyst-83660 (07/12/2022, 11:07 AM)
    Hi team, it would be great if you could help me with this. I want to use https://github.com/linkedin/datahub/blob/master/metadata-ingestion/src/datahub/cli/cli_utils.py to send delete/update requests. I have DataHub hosted on GKE; what do I need to authenticate my requests? Which environment variables do I need to set before making a delete/put request, and how and where can I get their values? Thanks for reading!
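    For context, a hedged sketch of what the delete path in cli_utils boils down to, written with plain requests so the required pieces are visible: the GMS address and a personal access token sent as an Authorization: Bearer header (a token can be generated in the UI or via the createAccessToken mutation shown earlier on this page). If you use the datahub CLI itself, the same settings usually come from `datahub init` (~/.datahubenv) or the DATAHUB_GMS_URL and DATAHUB_GMS_TOKEN environment variables. The URL, token, and URN below are placeholders.
    ```python
    # Hedged sketch: delete an entity by calling the GMS endpoint that cli_utils' delete helper targets.
    # Server URL, token, and URN are placeholders; adjust for your GKE ingress or a kubectl port-forward.
    import requests

    GMS = "http://localhost:8080"          # placeholder, e.g. a port-forward to datahub-gms
    TOKEN = "<personal-access-token>"      # placeholder; sent as the Bearer token
    URN = "urn:li:dataset:(urn:li:dataPlatform:bigquery,project.dataset.table,PROD)"  # placeholder

    resp = requests.post(
        f"{GMS}/entities?action=delete",
        json={"urn": URN},
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    resp.raise_for_status()
    print(resp.status_code, resp.text)
    ```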
  • mysterious-butcher-86719 (07/12/2022, 1:54 PM)
    Hi team, could you please help with how to use the search_after parameter via the GraphQL API to retrieve more than 10,000 records from DataHub?
  • microscopic-mechanic-13766 (07/12/2022, 2:25 PM)
    Hi, I am currently deploying DataHub v0.8.40. My problem is that the first time DataHub starts, it has to create a lot of new indices in Elasticsearch. As this takes some time, in the middle of the process I get:
    Received signal: terminated
    Command exited with error: exit status 143
    Is there a way to modify the time after which this graceful termination is triggered?
  • steep-carpet-52398 (07/12/2022, 2:42 PM)
    Hi, I'm on RHEL and I'm getting some errors when I try to deploy DataHub with quickstart. I'm using the latest DataHub version. Errors:
    - kafka-setup is still running
    - schema-registry is not running
    - broker is not running
    - datahub-frontend-react is running but not healthy
    - datahub-gms is still starting
    - elasticsearch-setup is still running
    - elasticsearch is running but not healthy
    Does anyone know how I can fix this? Thanks.
  • handsome-football-66174 (07/12/2022, 7:21 PM)
    Hi team, I am trying to build the Docker images using version 0.8.40 and am facing this error. Any suggestions?
    FAILURE: Build failed with an exception.
    
    * What went wrong:
    Execution failed for task ':datahub-web-react:yarnBuild'.
    > Process 'command 'yarn'' finished with non-zero exit value 131
    
    * Try:
    Run with --stacktrace option to get the stack trace. Run with --info or --debug option to get more log output. Run with --scan to get full insights.
    
    * Get more help at <https://help.gradle.org>
    
    BUILD FAILED in 2m 26s
  • echoing-farmer-38304 (07/13/2022, 7:45 AM)
    Hello, when we try to ingest from MSSQL this problem occurs: {'MyServer.MyDB.MySchema.MyTable': ['unable to map type UNIQUEIDENTIFIER() to metadata schema', 'unable to map type BIT() to metadata schema']} These are basic MSSQL types, so we don't know what to do. Can somebody help us?
  • plain-farmer-27314 (07/13/2022, 2:34 PM)
    Hi team, quick question: I'm trying to retrieve all of our BigQuery datasets, even those that we didn't directly ingest (i.e. they were created from LookML lineage or something). I'm using the snippet below:
    search(
      input: {
        type: DATASET,
        query: "",
        count: 6000,
        filters: [
          {
            field: "platform",
            value: "urn:li:dataPlatform:bigquery"
          }
        ]
      }
    )
    However this is only returning datasets that I have directly ingested, and not those brought in "indirectly" through lineage or other means. Is this supposed to happen? And if so, can I modify the search params to include these results?
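    For reference, a hedged, runnable version of the same search sent from Python with a wildcard query string; the selection set (total, searchResults, entity.urn) follows the SearchResults type in the GraphQL schema, and the endpoint and token are placeholders:
    ```python
    # Hedged sketch: run the platform-filtered dataset search against the GraphQL endpoint
    # and print the URNs it returns, to compare the total against expectations.
    import requests

    GMS_GRAPHQL = "http://localhost:8080/api/graphql"  # placeholder endpoint
    TOKEN = "<access-token>"                           # placeholder token

    query = """
    query bigqueryDatasets {
      search(input: {type: DATASET, query: "*", count: 1000,
                     filters: [{field: "platform", value: "urn:li:dataPlatform:bigquery"}]}) {
        total
        searchResults {
          entity { urn }
        }
      }
    }
    """

    resp = requests.post(
        GMS_GRAPHQL,
        json={"query": query},
        headers={"Authorization": f"Bearer {TOKEN}"},
    )
    resp.raise_for_status()
    results = resp.json()["data"]["search"]
    print(results["total"], "datasets matched")
    for hit in results["searchResults"]:
        print(hit["entity"]["urn"])
    ```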
  • kind-whale-32412 (07/13/2022, 3:34 PM)
    Hey there, I'm going through the quickstart guide here: https://datahubproject.io/docs/quickstart/. After running `datahub docker quickstart` I get an error:
    Detected M1 machine
    Unable to run quickstart:
    - Docker doesn't seem to be running. Did you start it?
  • kind-whale-32412 (07/13/2022, 3:34 PM)
    Does anybody know how to proceed with this ^^?
  • gifted-knife-16120 (07/14/2022, 7:53 AM)
    Hi team, I am new to DataHub and have one question. My team would like to explore data lineage, and our data platform is PostgreSQL. However, when I run the ingestion and open a dataset, there is no lineage available. May I know how to make it available and clickable?