# getting-started
  • f

    full-chef-85630

    07/27/2022, 2:38 PM
    Hi all, how can I quickly locate the cause of this problem?
    com.linkedin.restli.server.RestLiServiceException: Failed to validate record with class com.linkedin.dataset.DatasetProperties: ERROR :: /platformSchema :: unrecognized field found but not allowed
    ERROR :: /created :: unrecognized field found but not allowed
    ERROR :: /lastModified :: unrecognized field found but not allowed
    ERROR :: /fields :: unrecognized field found but not allowed
    ERROR :: /schemaName :: unrecognized field found but not allowed
    ERROR :: /version :: unrecognized field found but not allowed
    ERROR :: /hash :: unrecognized field found but not allowed
    ERROR :: /platform :: unrecognized field found but not allowed
  • f

    full-chef-85630

    07/27/2022, 3:03 PM
    Hi all, why does this tags setting not take effect?
    dataset_properties = DatasetPropertiesClass(
        name="test-bigquery",
        tags=[
            builder.make_tag_urn("bigquery-user-info"),
            builder.make_tag_urn("user-info"),
        ],
        description="This table stored the canonical User profile",
        customProperties={"governance": "ENABLED"},
        externalUrl="http://150.158.172.130:3001",
        qualifiedName="bigquery-test",
    )
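One possible explanation, offered as an assumption rather than a confirmed diagnosis: tags shown in the UI are usually carried by a separate globalTags aspect attached to the dataset, not by the tags field of the dataset properties. The sketch below builds the shape of that aspect with plain dictionaries so it runs without the SDK. The helper names make_tag_urn_sketch and build_global_tags are hypothetical; with the Python SDK the counterparts would be GlobalTagsClass and TagAssociationClass.

```python
import json


def make_tag_urn_sketch(tag: str) -> str:
    """Hypothetical stand-in for builder.make_tag_urn: builds a tag URN string."""
    return f"urn:li:tag:{tag}"


def build_global_tags(tag_names):
    """Build a dict shaped like a globalTags aspect: a list of tag associations."""
    return {"tags": [{"tag": make_tag_urn_sketch(name)} for name in tag_names]}


# The same two tags as in the snippet above, expressed as a separate aspect.
aspect = build_global_tags(["bigquery-user-info", "user-info"])
print(json.dumps(aspect, indent=2))
```

The aspect would then be emitted for the same dataset URN as the properties, as its own metadata change proposal.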
  • w

    wooden-chef-22394

    07/28/2022, 1:38 AM
    I am new to GraphQL. Could you give me a GraphQL query that gets the upstream and downstream lineage of a Dataset?
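A minimal sketch of how such a request could be built in Python, assuming the lineage field used in a later query in this channel is also available on a dataset looked up by URN. The query name getLineage, the helper build_lineage_payload, and the example URN are illustrative assumptions; the payload would be POSTed as JSON to the GraphQL endpoint of your deployment.

```python
import json

# GraphQL document: one dataset, one lineage direction per request.
LINEAGE_QUERY = """
query getLineage($urn: String!, $direction: LineageDirection!) {
  dataset(urn: $urn) {
    lineage(input: { direction: $direction, start: 0, count: 100 }) {
      relationships {
        type
        entity {
          urn
        }
      }
    }
  }
}
"""


def build_lineage_payload(urn: str, direction: str) -> dict:
    """Return the JSON body for a lineage request (direction: UPSTREAM or DOWNSTREAM)."""
    if direction not in ("UPSTREAM", "DOWNSTREAM"):
        raise ValueError("direction must be UPSTREAM or DOWNSTREAM")
    return {"query": LINEAGE_QUERY, "variables": {"urn": urn, "direction": direction}}


# Hypothetical dataset URN, for illustration only.
example_urn = "urn:li:dataset:(urn:li:dataPlatform:bigquery,my_project.my_dataset.my_table,PROD)"
upstream_payload = build_lineage_payload(example_urn, "UPSTREAM")
downstream_payload = build_lineage_payload(example_urn, "DOWNSTREAM")
print(json.dumps(upstream_payload["variables"], indent=2))
```

Sending the two payloads separately keeps each response small; both directions could also be requested in one document using field aliases.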
  • a

    alert-fall-82501

    07/28/2022, 9:37 AM
    I am facing the issue below while running datahub quickstart.
  • a

    alert-fall-82501

    07/28/2022, 9:37 AM
    Unable to run quickstart - the following issues were detected:
    - datahub-gms is still starting
    - kafka-setup is still running
    - elasticsearch-setup is still running
    - elasticsearch is still starting
  • a

    alert-fall-82501

    07/28/2022, 9:38 AM
    These are the error messages I am getting. Can someone help me with this?
  • b

    brash-minister-83469

    07/28/2022, 10:49 AM
    Hey, stupid question to the community. Would you distinguish between operational and analytical workloads consuming and producing data? I have the horror scenario in mind that Java developers might simply code ETLs / ELTs in Java, consuming and producing data which I need to catalog for data lineage and data governance in general. Do you have a view on this?
  • c

    clever-lamp-13963

    07/28/2022, 12:43 PM
    Using OpenAPI, how do I retrieve an entity with a particular version stamp?
  • c

    clever-lamp-13963

    07/28/2022, 2:54 PM
    GraphQL always returns null for editableSchemaMetadata in versionedDataset. OpenAPI does return a value. Is it not implemented?
  • a

    adamant-napkin-88678

    07/28/2022, 8:51 PM
    Hello everyone, I am Sebastian Cajamarca, a data scientist from Bia, a startup in the energy sector in Colombia.
  • a

    adamant-napkin-88678

    07/28/2022, 8:52 PM
    I just want to build a data catalogue on DataHub. Does anyone have a good tutorial for that? Thanks
  • b

    best-umbrella-24804

    07/29/2022, 5:45 AM
    Hi, I need to rename a Domain without deleting it, so that all the mappings don't disappear. Does anyone know how to do this?
  • a

    alert-ram-30868

    07/29/2022, 7:03 AM
    Hi folks,
  • s

    square-hair-99480

    07/29/2022, 9:39 AM
    Hello friends, I am trying to change my root user password, but I am not sure how to do so. Sorry if it is a very dumb question. My instance is running on an EC2 using the docker compose file here: https://github.com/datahub-project/datahub/blob/master/docker/quickstart/docker-compose-without-neo4j.quickstart.yml . I tried to change EBEAN_DATASOURCE_PASSWORD, but that is not the way. So is there another env var for setting this? In which docker service should I set it? Or is there a better way?
  • s

    square-hair-99480

    07/29/2022, 12:55 PM
    Hello friends, I was just curious about the relation between DataHub and Acryl. To me it looks like Acryl is to DataHub as Astronomer is to Apache Airflow. But what is not clear to me is who maintains DataHub: LinkedIn or Acryl? In other words, who is the Apache?
  • l

    lemon-engine-23512

    07/29/2022, 1:09 PM
    Hello all, I have added Glue metadata using the CLI. I can see the datasets in the UI now, but not lineage. Any idea how to add lineage for the Glue source?
  • r

    rapid-king-93225

    07/31/2022, 11:00 AM
    Some days ago (https://datahubspace.slack.com/archives/CV2KB471C/p1658481799112079) I started some first experiments with DataHub. However, today I can neither get the Login UI to show up for the old version from approx. 10 days ago - nor for the new version published 5 days ago: https://github.com/datahub-project/datahub/blob/master/docker/quickstart/docker-compose-without-neo4j-m1.quickstart.yml
    In fact, the new version is bringing up even more problems than the earlier one. GMS is complaining:
    ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.8
    ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.8
    ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.8
    ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.8
    2022/07/31 105240 Command exited with error: exit status 1
    Also:
    - Mysql never comes up on first try. I manually need to start it
    - In the new version, Kafka and schema registry show some problems and sometimes seem to stop
    - the UI at http://localhost:9002/ never shows up for me
    This seems to be persistent - even if I delete all docker containers and associated volumes.
  • s

    shy-kitchen-7972

    08/01/2022, 8:56 AM
    Hi all, in the schema of dashboardInfo I see it's possible to indicate the datasets it is consuming. However, in the DashboardInfoClass "datasets" is not available as a parameter, nor can I see it in the API definition. So it seems that I can only document dashboard lineage by documenting the charts. Any idea if this is correct, or maybe lineage between dataset and dashboard is a new feature and I don't have the proper version yet?
  • m

    modern-soccer-40481

    08/01/2022, 1:51 PM
    Sorry, a bit new with datahub, so maybe a basic question: if I document the columns of a table, is there an automatic method to classify the same columns in other tables that have these as well?
  • s

    sticky-guitar-6066

    08/01/2022, 11:04 AM
    Hi all, when I ingest BigQuery metadata, how do I put BigQuery table info (e.g. Last modified) into the “Properties” of the dataset? I would like to use this information to tag datasets which are ‘out-of-date’.
  • c

    curved-book-15799

    08/01/2022, 6:28 PM
    I've found a couple of questions recently asking about adding multi-tenancy support to DataHub and I'm curious - has anyone made any progress on that or thought about what it would take to implement?
  • c

    cuddly-butcher-39945

    08/01/2022, 9:11 PM
    Hi Everyone, I believe the answer is no, but looking for a confirmation. Is there a way to apply Role Based Access Policies to downstream database management systems like Snowflake? Use case is the following: I would like to create Data Access and Subscription policies in Datahub and have these manifest in the form of Snowflake Roles which are then automatically applied to views, schemas and tables. Thanks in advance!
  • g

    gorgeous-dinner-4055

    08/01/2022, 7:28 PM
    Do folks have tips on making their local development faster? For example: If I'm doing work on the front-end, I run the following command after making changes:
    ./gradlew :datahub-web-react:build \
        -x :datahub-web-react:yarnLint \
        -x :datahub-web-react:yarnTest \
        && docker restart datahub-frontend-react
    Do people have similar/better workflows for front end & backend development? I've thought about using something like watchmedo to automatically detect if anything in the frontend sub dir gets touched and do the automatic restart.
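The watch-and-rebuild idea could be sketched with the standard library alone, as a polling loop instead of watchmedo. This is only a sketch under stated assumptions: the skipped directory names, the polling interval, and the function names snapshot and watch are hypothetical, and the rebuild command is the one quoted above.

```python
import os
import subprocess
import time

# The rebuild command from the message above, run through the shell.
REBUILD_COMMAND = (
    "./gradlew :datahub-web-react:build "
    "-x :datahub-web-react:yarnLint "
    "-x :datahub-web-react:yarnTest "
    "&& docker restart datahub-frontend-react"
)

# Assumed output/dependency directories that should not trigger a rebuild.
SKIP_DIRS = {"node_modules", "build", "dist", ".git"}


def snapshot(root):
    """Map every file under root to (mtime_ns, size), skipping SKIP_DIRS."""
    state = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def watch(root, command=REBUILD_COMMAND, interval=2.0, max_polls=None):
    """Poll root every `interval` seconds and run `command` when files change."""
    previous = snapshot(root)
    polls = 0
    while max_polls is None or polls < max_polls:
        time.sleep(interval)
        current = snapshot(root)
        if current != previous:
            subprocess.run(command, shell=True, check=False)
            # Re-snapshot after the build so files it wrote do not retrigger it.
            previous = snapshot(root)
        polls += 1

# Usage (runs until interrupted): watch("datahub-web-react/src")
```

Polling is simpler than filesystem events but slower to react; a tool such as watchmedo would avoid the periodic directory walk.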
  • f

    full-chef-85630

    08/01/2022, 11:52 PM
    Hi all, I have encountered some errors when deploying with k8s. Has anyone else encountered this?
    23:44:29.655 [main] ERROR o.s.web.context.ContextLoader:313 - Context initialization failed
    org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'siblingGraphServiceFactory': Unsatisfied dependency expressed through field '_entityService'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'entityAspectDao' defined in com.linkedin.gms.factory.entity.EntityAspectDaoFactory: Unsatisfied dependency expressed through method 'createEbeanInstance' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in com.linkedin.gms.factory.entity.EbeanServerFactory: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException
            at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.resolveFieldValue(AutowiredAnnotationBeanPostProcessor.java:659)
            at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor$AutowiredFieldElement.inject(AutowiredAnnotationBeanPostProcessor.java:639)
            at org.springframework.beans.factory.annotation.InjectionMetadata.inject(InjectionMetadata.java:119)
            at org.springframework.beans.factory.annotation.AutowiredAnnotationBeanPostProcessor.postProcessProperties(AutowiredAnnotationBeanPostProcessor.java:399)
            at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.populateBean(AbstractAutowireCapableBeanFactory.java:1431)
            at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.doCreateBean(AbstractAutowireCapableBeanFactory.java:619)
            at org.springframework.beans.factory.support.AbstractAutowireCapableBeanFactory.createBean(AbstractAutowireCapableBeanFactory.java:542)
            at org.springframework.beans.factory.support.AbstractBeanFactory.lambda$doGetBean$0(AbstractBeanFactory.java:335)
            at org.springframework.beans.factory.support.DefaultSingletonBeanRegistry.getSingleton(DefaultSingletonBeanRegistry.java:234)
            at org.springframework.beans.factory.support.AbstractBeanFactory.doGetBean(AbstractBeanFactory.java:333)
            at org.springframework.beans.factory.support.AbstractBeanFactory.getBean(AbstractBeanFactory.java:208)
            at org.springframework.beans.factory.support.DefaultListableBeanFactory.preInstantiateSingletons(DefaultListableBeanFactory.java:953)
            at org.springframework.context.support.AbstractApplicationContext.finishBeanFactoryInitialization(AbstractApplicationContext.java:918)
            at org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:583)
            at org.springframework.web.context.ContextLoader.configureAndRefreshWebApplicationContext(ContextLoader.java:401)
            at org.springframework.web.context.ContextLoader.initWebApplicationContext(ContextLoader.java:292)
            at org.springframework.web.context.ContextLoaderListener.contextInitialized(ContextLoaderListener.java:103)
            at org.eclipse.jetty.server.handler.ContextHandler.callContextInitialized(ContextHandler.java:1073)
            at org.eclipse.jetty.servlet.ServletContextHandler.callContextInitialized(ServletContextHandler.java:572)
            at org.eclipse.jetty.server.handler.ContextHandler.contextInitialized(ContextHandler.java:1002)
            at org.eclipse.jetty.servlet.ServletHandler.initialize(ServletHandler.java:746)
            at org.eclipse.jetty.servlet.ServletContextHandler.startContext(ServletContextHandler.java:379)
            at org.eclipse.jetty.webapp.WebAppContext.startWebapp(WebAppContext.java:1449)
            at org.eclipse.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1414)
            at org.eclipse.jetty.server.handler.ContextHandler.doStart(ContextHandler.java:916)
            at org.eclipse.jetty.servlet.ServletContextHandler.doStart(ServletContextHandler.java:288)
            at org.eclipse.jetty.webapp.WebAppContext.doStart(WebAppContext.java:524)
            at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
            at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
            at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
            at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
            at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
            at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
            at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:117)
            at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
            at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
            at org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
            at org.eclipse.jetty.server.Server.start(Server.java:423)
            at org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
            at org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
            at org.eclipse.jetty.server.Server.doStart(Server.java:387)
            at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
            at org.eclipse.jetty.runner.Runner.run(Runner.java:519)
            at org.eclipse.jetty.runner.Runner.main(Runner.java:564)
    Caused by: org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'entityAspectDao' defined in com.linkedin.gms.factory.entity.EntityAspectDaoFactory: Unsatisfied dependency expressed through method 'createEbeanInstance' parameter 0; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'ebeanServer' defined in com.linkedin.gms.factory.entity.EbeanServerFactory: Bean instantiation via factory method failed; nested exception is org.springframework.beans.BeanInstantiationException: Failed to instantiate [io.ebean.EbeanServer]: Factory method 'createServer' threw exception; nested exception is java.lang.NullPointerException
  • l

    lively-farmer-38551

    08/01/2022, 9:50 PM
    Hi, does anybody know how to import metadata comments from tables and columns in PostgreSQL? Thanks a lot!
  • f

    full-chef-85630

    08/01/2022, 3:12 AM
    Hi all, does DataHub with neo4j support distributed storage and federated queries?
  • b

    bright-cpu-56427

    08/02/2022, 2:47 AM
    Hi guys, how can I query the first dataset in a lineage chain where the depth of lineage is unknown? For example, when there is a lineage from a > b > c > d > ... n, is there a GraphQL query that can tell at once that n starts with a? I am using Python, and I checked that I can query about 2 levels deep using the query below:
    """
    {
      search(input: { type: CHART, query: "qs", start: 0, count: 10000 }) {
        searchResults {
          entity {
            ... on Chart {
              properties {
                name
              }
              lineage(input: { direction: UPSTREAM, start: 0, count: 10000 }) {
                relationships {
                  type
                  entity {
                    ... on Dataset {
                      name
                      lineage(input: { direction: UPSTREAM, start: 0, count: 10000 }) {
                        relationships {
                          entity {
                            ... on Dataset {
                              name
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
    """
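Since a nested query only reaches a fixed depth, one way to handle unknown depth is to walk the lineage iteratively on the client side, one hop per request. The sketch below is a breadth-first traversal under that assumption; fetch_upstreams is a hypothetical callable that would wrap a single-hop UPSTREAM lineage query like the one above, and the toy dictionary only stands in for real responses.

```python
from collections import deque


def find_lineage_roots(start_urn, fetch_upstreams):
    """Walk upstream hop by hop and return entities that have no upstreams.

    fetch_upstreams(urn) must return an iterable of direct upstream URNs.
    """
    seen = {start_urn}
    queue = deque([start_urn])
    roots = set()
    while queue:
        urn = queue.popleft()
        upstreams = list(fetch_upstreams(urn))
        if not upstreams:
            roots.add(urn)  # nothing further upstream: this is a starting point
        for upstream in upstreams:
            if upstream not in seen:  # guards against cycles and repeated visits
                seen.add(upstream)
                queue.append(upstream)
    return roots


# Toy lineage a > b > c > d, stored as "entity -> direct upstreams".
TOY_LINEAGE = {"d": ["c"], "c": ["b"], "b": ["a"], "a": []}
roots = find_lineage_roots("d", lambda urn: TOY_LINEAGE.get(urn, []))
print(roots)  # {'a'}
```

Each entity is fetched once, so the number of requests grows with the size of the upstream graph rather than with its depth alone.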
  • h

    high-notebook-40979

    10/19/2021, 11:54 AM
    Hi all! I'm deploying DataHub with LDAP authentication. Currently we are able to log in with an LDAP user, but on the Manage Users & Groups page we only see the datahub user.
  • a

    alert-fall-82501

    07/28/2022, 9:36 AM
    Hi Team
  • a

    alert-ram-30868

    07/29/2022, 7:05 AM
    Hi, I am looking to get the metadata change proposal wrapper from the nested fields (Struct, Array, Map, ...) of the metadata, but I am unable to find how. Can anyone share some references or related info that would be helpful?