# getting-started
  • b

    billions-baker-82097

    05/25/2023, 8:50 AM
    https://datahubproject.io/docs/generated/metamodel/entities/mlfeature has a schema defined. What is the use of these schemas? Can we use these schemas to make aspects (like we do by using a .pdl file), or are they just for understanding purposes?
  • j

    jolly-ram-48429

    05/25/2023, 9:24 AM
    Hi all, I’m new to DataHub (and to Slack), so I really hope someone can help us out with the following problem. At KPN we are now storing tables (datasets) as nodes (term groups) in the glossary, with the columns (attributes) as terms. For the Enterprise Conceptual Data Model (ECDM) we also want to store the business glossary in DataHub, and we used the same method: we store the entities as GlossaryNodes, with the definition, enlightenments and examples in the description field of the node, and we store all attributes as terms, with the definition, enlightenments and examples in the description field of the term. Now we are trying to relate the tables to the entities, but we can’t see any possibility to link on nodes: we can link at the individual attribute/field/term level (by using the inherits or contains fields), but that isn’t what we want: we want to link tables to entities. We also want to link the entity to the external Enterprise Conceptual Data Model (at an entity-to-entity level). However, we can provide a URL at term level (source_url) and at complete-file level (url), but again the possibility to do this at node/term-group level is missing. Can somebody guide us on how to correctly use DataHub to link tables to entities and to give a URL per entity to the ECDM? Are we doing something wrong?
  • b

    bitter-translator-92563

    05/25/2023, 9:38 AM
    Hi guys. We are using DataHub and have a number of OpenAPI contracts ingested into it. Our next step is to link those specifications with Terms in the glossary. I'd appreciate hearing any best practices or advice on how you are managing linking API specifications with terms. Here are the two ways we've come up with so far: (1) link terms and API methods/fields in the UI after ingestion of the API specification; (2) make an additional development in DataHub to parse the term urn from the specification during ingestion and turn it into a relation within DataHub. It seems the second way is the one we are considering implementing, but I would appreciate it if anyone could share any experience with this type of task.
  • r

    red-parrot-86331

    05/25/2023, 5:47 PM
    Hello, the difference between Tags & Glossary Terms, and when/how to use them, is clear in the documentation. But the description for, say, a column would also carry business context. So why attach a separate glossary term as well? Can anyone here please help clarify, or share how you are using both the description and glossary term features in conjunction?
    ✅ 1
  • r

    refined-mouse-71434

    05/26/2023, 1:44 AM
    Hello, I want to ask something about Kafka metadata ingestion. The log shows {datahub.ingestion.source.kafka:426} - Config details for topic ODS.dbo.bswgoods fetched successfully, but this Kafka topic's schema view shows no data. My kafka.yml is:
        source:
          type: kafka
          config:
            connection:
              consumer_config:
                security.protocol: PLAINTEXT
              bootstrap: 'xx.xx.xx.xx9092,xx.xx.xx.xx9092,xx.xx.xx.xx:9092'
        sink:
          type: "datahub-rest"
          config:
            server: "http://xx.xx.xx.xx:8080"
            token: "mytokenxxxxx"
    Thanks.
    plus1 1
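    For reference, the kafka source resolves topic schemas through a Confluent Schema Registry, so a topic can be listed successfully while its schema view stays empty if no registry is configured. A minimal recipe sketch (field names as in the DataHub kafka source docs; all hosts are placeholders):

    ```yaml
    # Sketch only: schema_registry_url is what populates the schema view.
    source:
      type: kafka
      config:
        connection:
          bootstrap: "broker1:9092,broker2:9092"
          schema_registry_url: "http://schema-registry:8081"
          consumer_config:
            security.protocol: PLAINTEXT
    sink:
      type: datahub-rest
      config:
        server: "http://datahub-gms:8080"
    ```

    Without a reachable schema registry the source can still emit the topic entity, which matches the symptom described above.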
  • c

    colossal-exabyte-39072

    05/26/2023, 10:14 AM
    Hi guys, I don't have a complete set of data, just metadata (including field names and descriptions). Is there any way I can use DataHub to ingest this metadata (CSV format) into objects? Thanks
    ✅ 1
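    If the goal is to attach CSV-held descriptions, terms, tags or owners to entities that already exist in DataHub, the csv-enricher source covers that case. A sketch (config keys per the csv-enricher docs; the filename and CSV column layout, e.g. a resource urn column plus description/tag columns, are assumptions to check against that doc):

    ```yaml
    # Sketch: enriches existing entities from a metadata CSV.
    source:
      type: csv-enricher
      config:
        filename: ./metadata.csv
        write_semantics: PATCH   # or OVERRIDE
    ```

    Creating brand-new datasets purely from a metadata file is a different task; that typically goes through the Python emitter SDK instead.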
  • r

    rapid-controller-60841

    05/29/2023, 4:28 AM
    👋 Hello everyone!
  • r

    rapid-controller-60841

    05/29/2023, 4:29 AM
    source:
      type: hive
      config:
        env: PROD
        platform: databricks
        host_port: 'http://JD-in-us.cloud.databricks.com/published'
        username: token
        password: '${databricks_token}'
  • r

    rapid-controller-60841

    05/29/2023, 4:29 AM
    I would like to know whether the connection timeout can be set through configuration. If anyone knows, please tell me how to set it. Thank you very much.
    plus1 1
  • r

    rapid-controller-60841

    05/29/2023, 4:30 AM
    I would like to know whether the timeout period of connection can be set through configuration. If anyone knows, please tell me how to set it. Thank you very much
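    SQLAlchemy-based sources (hive included) accept an options block that is forwarded to SQLAlchemy's create_engine, which is one place a connect timeout can live. A sketch only; the exact connect_args keys depend on the underlying driver (pyhive here), so treat the key name and value below as assumptions to verify:

    ```yaml
    # Sketch: options is passed through to SQLAlchemy create_engine.
    source:
      type: hive
      config:
        host_port: 'http://example.cloud.databricks.com/published'
        options:
          connect_args:
            timeout: 60   # driver-specific; confirm against the pyhive docs
    ```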
  • a

    average-dentist-82800

    05/29/2023, 11:13 PM
    Hello, is there a way to configure the sink to store the metadata in DynamoDB?
    ✅ 1
    plus1 1
  • b

    billions-rose-75566

    05/30/2023, 11:42 AM
    Hi DataHub! Is it possible to send Azure blob data to DataHub? I cannot find it....
  • p

    prehistoric-greece-5672

    05/30/2023, 8:45 PM
    How would I migrate my data catalog out of DataHub and into another product? I hope I'll never need to do that, but suppose that I do. If there is documentation about this, please show me. I haven't found anything yet. I'm concerned about vendor lock-in, even with an open source product.
  • l

    late-king-71541

    05/31/2023, 5:59 AM
    Hello everyone, I encountered some problems after using Docker to quickly deploy DataHub, and I would like to ask for your advice. 1. The DataHub official website says a Kubernetes-based installation is recommended for production, but if the production environment does not use Kubernetes, how should it be deployed? Of course I know there is a Docker-based deployment, but I really don't want to use Docker to deploy DataHub in production. 2. Docker and Docker Compose are listed as prerequisites in the DataHub development guide. Is there any other way to develop and compile?
    ✅ 1
  • b

    billions-baker-82097

    05/31/2023, 2:45 PM
    Hi Team, I am trying to build a custom entity as a top-level entity, but there is no proper documentation available on how to do so. Can anyone provide some insight? I saw a recent push where the DataHub team added ownership type as a top-level entity. It looks like the whole of DataHub needs to be rebuilt for a top-level entity, unlike for a custom aspect: for a custom aspect we can utilise metadata-model-custom, but for a custom entity as a top-level entity there is no such method, if I am not wrong. Since I have to make my custom entity a top-level entity, can anyone from the DataHub team guide me on this? It would be very helpful.
  • g

    great-rainbow-70545

    05/31/2023, 4:07 PM
    Is there a way to tell what plugins are present by default in the containers used in the official helm chart? (without spelunking through the Dockerfile).
    ✅ 1
  • g

    gifted-barista-64719

    06/01/2023, 6:50 AM
    Hi, Is there a way to open a specific dataset to specific users only?
    ✅ 1
  • o

    orange-crayon-34273

    06/01/2023, 11:24 AM
    Hi, are there any plans to support Elasticsearch 8? There is an open feature request from a year ago, but no further response. We can offer to help with a contribution, but only if the core team agrees to it. Upgrading would probably result in a breaking change. Thanks for any response!
  • a

    ambitious-bird-91607

    06/01/2023, 2:17 PM
    Hi there! Do actions (such as sending a notification to Microsoft Teams) apply only to changes done within datahub? Is there a way to get notified if a, for example, description of a dataset has changed between the current ingestion and the previous ingestion? Thanks in advance! :)
  • f

    future-table-91845

    06/01/2023, 3:33 PM
    Question about pricing. Where can I find exact info about pricing? I believe pricing is based on the number of sources, but where do I find the details?
    ✅ 1
  • e

    eager-monitor-4683

    06/01/2023, 11:47 PM
    Hey team, just a quick question regarding the hosts, are there any hosts in Australia? As we need to comply with company policy to make sure all data sits in AUS. Thank you
  • b

    busy-honey-716

    06/02/2023, 4:27 AM
    Hi there! While installing via the command python -m datahub docker quickstart, I'm unable to bring up DataHub. It shows a few containers like mysql-setup exited, and throws errors like 'Frontend is still starting', etc. Could this happen because of the unhealthy containers? If so, how can I get DataHub running locally?
    plus1 1
  • p

    powerful-flag-14641

    06/02/2023, 12:46 PM
    Hello. I could use some guidance on how to programmatically query for lineage using any of the REST, GraphQL, or Python SDK methods. I have a set of RDS tables that are synced to Snowflake using Fivetran. I want to propagate the domain and group of the RDS tables through the Fivetran tables that function as dbt sources. I have found the docs to be pretty good, and I have been able to use the Python SDK to do some metadata decoration of datasets, but I am stuck trying to query for downstream dependencies of a dataset. I don’t see examples of this functionality. I’ve tried making an HTTP call to the relationships endpoint, but it does not return the data I am expecting. I have tried both /relationships and /openapi/relationships/. I have looked for documentation on the acceptable values for the URL query params direction, types, and relationshipTypes, but can’t seem to find this endpoint documented on the DataHub site. I also see a call being made to /api/v2/graphql when I look at the lineage in the UI; it uses a query of type getEntityLineage. When I search for getEntityLineage in the DataHub docs, I get no results. We have many flows and use cases where we would like to use DataHub to provide dataset lineage outside of the UI. Can someone point me in the right direction for how to pull this out of the system? Any of the Python SDK, GraphQL, or REST API is fine by me.
    🩺 1
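    The lineage query above can be sketched against /api/v2/graphql using searchAcrossLineage, which is the publicly documented counterpart of the internal getEntityLineage query the UI issues. The urn below is a placeholder, and the exact field set should be checked against the GraphQL schema in graphiql:

    ```graphql
    # Sketch: fetch downstream entities of a dataset.
    query downstreams {
      searchAcrossLineage(
        input: {
          urn: "urn:li:dataset:(urn:li:dataPlatform:snowflake,db.schema.table,PROD)"
          direction: DOWNSTREAM
          query: "*"
          start: 0
          count: 100
        }
      ) {
        total
        searchResults {
          entity {
            urn
            type
          }
        }
      }
    }
    ```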
  • g

    green-honey-91903

    06/02/2023, 6:37 PM
    Hi there 👋 Is there a good email contact to ask security related questions about datahub/acryldata?
  • h

    hallowed-market-13473

    06/02/2023, 9:54 PM
    Hi, I was interested in using datahub without connecting directly to the raw data, but instead providing the files containing the metadata itself - for instance, a JSON file containing all the tables in a database, as well as the schema for each table. Is there documentation on the recommended way of doing this? I was thinking of using JSONSchema as the datasource, but wasn’t sure if that was the best/recommended way.
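    One established route for metadata-only ingestion is the file source, which replays metadata events (MCE/MCP JSON, e.g. previously produced by the file sink or the SDK) without touching the raw data. A sketch; the config key name should be checked against the file source docs for your version, and a custom JSON layout would first need converting into that event format:

    ```yaml
    # Sketch: ingest pre-built metadata events from disk.
    source:
      type: file
      config:
        filename: ./metadata_events.json
    sink:
      type: datahub-rest
      config:
        server: "http://localhost:8080"
    ```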
  • g

    gifted-butcher-68437

    06/05/2023, 9:44 AM
    I've been trying to setup datahub. Installing it using helm charts. I've tried installed the prerequisites helm chart and the
    datahub/datahub
    helm in a new namespace (not default). This in turn spins up services with the namespace as part of the name of the service. Kafka and Zookeeper especially. ElasticSearch remains the same. This fails the kafka job when bringing up the
    datahub/datahub
    helm chart.
  • e

    enough-eve-67949

    06/06/2023, 8:36 AM
    hi, does anyone know how DataHub can integrate with dbt? I mean, can DataHub access the metadata generated by dbt? Many thanks.
    ✅ 1
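    DataHub has a dedicated dbt source that reads the artifacts dbt writes to its target directory. A minimal recipe sketch (paths and target_platform value are placeholders; key names per the dbt source docs):

    ```yaml
    # Sketch: ingest dbt-generated metadata from its artifact files.
    source:
      type: dbt
      config:
        manifest_path: ./target/manifest.json
        catalog_path: ./target/catalog.json
        target_platform: snowflake   # the warehouse dbt runs against
    ```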
  • n

    nutritious-megabyte-12020

    06/06/2023, 1:40 PM
    Hi! Is there a possibility to show unstructured data like images, PNGs and so on in DataHub?
  • i

    icy-dawn-43119

    06/06/2023, 2:46 PM
    Hi all, I hope you are having a good day! does anyone know if Datahub could integrate with BQ Dataform?
  • d

    dazzling-rainbow-96194

    06/06/2023, 11:21 PM
    Is the initial ingest going to take a lot of time if we have too many schemas and tables? If yes, is there a way to optimize initial ingest time?
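    On trimming the initial ingest: most SQL-based sources support allow/deny patterns so the first run can be scoped to what matters, and profiling (usually the slowest step) can be disabled. A sketch (the snowflake source is illustrative; the regexes are placeholders):

    ```yaml
    # Sketch: limit the first ingest to selected schemas and skip profiling.
    source:
      type: snowflake
      config:
        schema_pattern:
          allow:
            - "analytics.*"
        table_pattern:
          deny:
            - ".*_tmp"
        profiling:
          enabled: false
    ```

    Incremental runs after that are typically much cheaper than the first full scan.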