Channels
acryl-omnisend
advice-data-governance
advice-metadata-modeling
all-things-datahub-in-windows
all-things-deployment
announcements
authentication-authorization
chatter
column-level-lineage
contribute
contribute-datahub-blog
data-council-workshop-2023
datahub-soda-test
demo-slack-notifications
design-business-glossary
design-data-product-entity
design-data-quality
design-datahub-documentation
design-dataset-access-requests
design-dataset-joins
feature-requests
flyte-datahub-integration
getting-started
github-activities
help-
i18n-community-contribution
ingestion
integration-alteryx-datahub
integration-azure-datahub
integration-dagster-datahub
integration-databricks-datahub
integration-datastudio-datahub
integration-iceberg-datahub
integration-powerbi-datahub
integration-prefect-datahub
integration-protobuf
integration-tableau-datahub
integration-vertica-datahub
introduce-yourself
jobs
metadata-day22-hackathon
muti-tenant-deployment
office-hours
openapi
plugins
show-and-tell
talk-data-product-management
troubleshoot
ui
show-and-tell
  • b

    bumpy-library-47725

    09/24/2021, 6:42 AM
    Hey, has anyone integrated Tableau/Power BI with DataHub?
    👀 4
    s
    f
    +5
    • 8
    • 7
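For the question above: DataHub ships ingestion connectors for both Tableau and Power BI, driven by a small recipe. Below is a minimal, hedged sketch of running the Tableau source programmatically with the Python SDK; the server URL, credentials, and site name are placeholders, and the connector also supports token-based auth.

from datahub.ingestion.run.pipeline import Pipeline

# Minimal programmatic ingestion pipeline: Tableau source -> DataHub REST sink.
# All connection values below are placeholders.
pipeline = Pipeline.create(
    {
        "source": {
            "type": "tableau",
            "config": {
                "connect_uri": "https://tableau.mycompany.com",
                "site": "default",
                "username": "tableau_service_account",
                "password": "********",
            },
        },
        "sink": {
            "type": "datahub-rest",
            "config": {"server": "http://localhost:8080"},
        },
    }
)
pipeline.run()
pipeline.raise_from_status()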
  • r

    red-pizza-28006

    10/20/2021, 8:49 AM
    Hello, has anyone had experience with using your own Kafka instead of the default k8s one?
    w
    • 2
    • 1
  • n

    nice-planet-17111

    10/21/2021, 1:46 AM
    Hello, has anyone had experience with using CloudSQL as storage instead of the default MySQL container? 🙂
    b
    g
    • 3
    • 6
  • b

    better-orange-49102

    10/22/2021, 9:21 AM
    I have a student intern who joined my team for a few months and built an administrator dashboard to solve the following pain points that my team anticipated when deploying DataHub:
    1. Allow us to see all the fields and datasets inside DataHub at a glance (the current frontend UI displays one dataset at a time), letting us sort similar field names and see whether certain field descriptions or tags can be applied consistently.
    2. Mass-rename tags inside DataHub to cut down on the tags used (and edit tag descriptions). Tags can't be deleted inside DataHub at the moment, so it's more about renaming tags to a common name. There is also a panel showing a list of all tags and the number of datasets using each tag.
    The whole dashboard was built by querying the REST endpoint and wrangling the data for display in the UI. (He didn't try GraphQL since it's a rather new development and he is coming to the end of his internship.) He doesn't have much development experience and is too shy to show his repo 😂, so I'm just attaching animated GIFs of what the tool can do. I do think DataHub could use something like this (but maybe more integrated with the existing UI), so I'm just sharing the idea with the community 🙂
    👏 1
    👍 7
    l
    b
    r
    • 4
    • 5
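A rough sketch of the kind of query such a dashboard can be built on, using DataHub's GraphQL API via the Python client to list datasets with their tags and count how many datasets use each tag. The server address and page size are assumptions, and pagination beyond the first page is left out.

from collections import Counter

from datahub.ingestion.graph.client import DataHubGraph, DatahubClientConfig

# Connect to the GMS GraphQL endpoint (address is a placeholder).
graph = DataHubGraph(DatahubClientConfig(server="http://localhost:8080"))

LIST_DATASETS_WITH_TAGS = """
{
  search(input: { type: DATASET, query: "*", start: 0, count: 1000 }) {
    searchResults {
      entity {
        urn
        ... on Dataset {
          globalTags { tags { tag { urn } } }
        }
      }
    }
  }
}
"""

data = graph.execute_graphql(LIST_DATASETS_WITH_TAGS)

# Count how many datasets carry each tag.
tag_counts: Counter = Counter()
for hit in data["search"]["searchResults"]:
    tags = (hit["entity"].get("globalTags") or {}).get("tags") or []
    for assoc in tags:
        tag_counts[assoc["tag"]["urn"]] += 1

for tag_urn, n in tag_counts.most_common():
    print(f"{tag_urn}: {n} datasets")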
  • p

    polite-flower-25924

    11/04/2021, 7:36 AM
    Hey @little-megabyte-1074 and @mammoth-bear-12532, today I tried to mention DataHub on Twitter, but I couldn't find the account 🙂 In my humble opinion, it would be good to share DataHub updates on Twitter for visibility. What do you think?
    👍 3
    m
    f
    b
    • 4
    • 7
  • f

    fancy-fireman-15263

    11/23/2021, 7:44 PM
    FYI - managed to work out what proportion of our schemas have descriptions using the following query against the GraphQL API:
    {
      search(input: { type: DATASET, query: "*", start: 0, count: 4000 }) {
        searchResults {
          entity {
            urn
            type
            ... on Dataset {
              name
              schemaMetadata(version: 0) {
                fields {
                  fieldPath
                  description
                }
              }
            }
          }
        }
      }
    }
    👍🏾 1
    :teamwork: 2
    👍 3
    👏 3
    l
    b
    • 3
    • 3
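And a small follow-up sketch of turning that query's result into a coverage number. It assumes the JSON response has been saved to a file (for example from the GraphiQL explorer at /api/graphiql) as response.json.

import json

# Load the JSON returned by the query above.
with open("response.json") as f:
    data = json.load(f)["data"]

total_fields = 0
described_fields = 0
for hit in data["search"]["searchResults"]:
    schema = hit["entity"].get("schemaMetadata") or {}
    for field in schema.get("fields", []):
        total_fields += 1
        if (field.get("description") or "").strip():
            described_fields += 1

if total_fields:
    pct = 100 * described_fields / total_fields
    print(f"{described_fields}/{total_fields} fields documented ({pct:.1f}%)")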
  • s

    stale-guitar-98627

    01/31/2022, 7:48 PM
    Hi friends! I am a co-chair for the SciPy Conference Data Lifecycle track this year. Our goal is to introduce industry best practices to scientific computing. If you use DataHub (or any other data tools), please consider submitting an abstract to present for our track. Always happy to chat if there are any questions 🙂
    🔥 5
    :teamwork: 5
    s
    • 2
    • 2
  • a

    astonishing-lunch-91223

    03/25/2022, 5:56 PM
    https://twitter.com/MihaiTodor/status/1507415722988736514
    :excited: 3
    :teamwork: 6
    ❤️ 3
    :datahub: 2
    :datahubbbb: 2
    m
    d
    l
    • 4
    • 8
  • f

    faint-television-78785

    07/12/2022, 10:53 AM
    anyone here work in the aerospace or defense industries? would be interested to hear about your work
  • a

    aloof-dentist-85908

    07/13/2022, 3:13 PM
    Hi everybody 😊 Is there anyone working with SAP who has integrated metadata from all the different SAP tools (SAP Lumira Designer, SAP Analytics Cloud, SAP BW, SAP HANA, SAP Data Intelligence, etc.)? #sap
    👀 2
    d
    s
    +2
    • 5
    • 18
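On the SAP question: of the tools listed, SAP HANA is the one DataHub has a dedicated ingestion source for (plugin name hana, installable via pip install 'acryl-datahub[hana]'); the other SAP products would need custom emission as far as I know. A sketch of the source section of a recipe, with placeholder connection details:

# Source section of a DataHub ingestion recipe for SAP HANA; plug it into a
# pipeline/recipe the same way as the Tableau sketch earlier in this channel.
# All connection values are placeholders.
hana_source = {
    "type": "hana",
    "config": {
        "host_port": "hana.mycompany.internal:30015",
        "database": "HXE",
        "username": "DATAHUB_READER",
        "password": "********",
    },
}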
  • f

    full-shoe-73099

    07/25/2022, 1:40 PM
    Hi friends! Does anyone know how to display the Power BI report environment in the DataHub interface?
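The Power BI connector authenticates through an Azure AD app registration and can pull workspaces, datasets, and (in recent versions) reports into DataHub. A hedged sketch of the source section of a recipe; all IDs are placeholders and option names can vary a bit between DataHub versions.

# Source section of a DataHub ingestion recipe for Power BI; all IDs/secrets
# below are placeholders for an Azure AD app registration with Power BI API access.
powerbi_source = {
    "type": "powerbi",
    "config": {
        "tenant_id": "00000000-0000-0000-0000-000000000000",
        "client_id": "00000000-0000-0000-0000-000000000000",
        "client_secret": "********",
        # Depending on the DataHub version, options such as extract_reports
        # control whether report metadata is pulled in alongside dashboards.
    },
}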
  • m

    mysterious-pager-59554

    07/27/2022, 2:16 PM
    Hello team, has anybody here worked on DataHub's integration with Great Expectations, i.e. pushing Great Expectations validation results to DataHub for a CSV/Parquet file? I could only accomplish this for SQL-like data sources (e.g. BigQuery).
    s
    l
    q
    • 4
    • 4
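For reference, DataHub's Great Expectations integration is wired in as a checkpoint action, as in the sketch below (the server address is a placeholder); as the message notes, it has only worked against SQL-style datasources, which is why CSV/Parquet batches are the sticking point.

# Entry for a Great Expectations checkpoint's action_list that sends validation
# results to DataHub. The GMS address is a placeholder.
datahub_action = {
    "name": "datahub_action",
    "action": {
        "module_name": "datahub.integrations.great_expectations.action",
        "class_name": "DataHubValidationAction",
        "server_url": "http://localhost:8080",
    },
}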
  • h

    hallowed-lizard-92381

    08/18/2022, 4:13 PM
    Interested in a recording 🤔
    l
    • 2
    • 2
  • d

    dazzling-appointment-34954

    09/16/2022, 12:51 PM
    Hey everyone, I don't know if this is the right place to post this, but I created a little helper for some client projects which might also be helpful for other people in the community, so I would like to contribute 🙂 What does it do? It uses a Google Sheets template to create a well-formatted JSON file that can be ingested as individual datasets into your DataHub instance. Through Google Sheets you can easily copy+paste a lot of data into it, e.g. to document all datasets from an individual system you have at your company (we did this a couple of times with KPIs or SAP data, for example). It is a very basic version that supports the main aspects of a dataset plus 3 individual schema fields for now (but can easily be extended/adapted). The code in the Apps Script should be self-explanatory. Disclaimer: I am not a software engineer, so there is definitely room for improvement 😉 You can find the script here: https://docs.google.com/spreadsheets/d/1DFCPi2_o8oTXpwVwxDvdhLiaUaGsj5yNmN-NhJnIWxQ/edit?usp=sharing Feel free to try it and give me feedback if anything does not work as expected.
    c
    b
    • 3
    • 2
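For anyone who would rather skip the JSON step, roughly the same result can be achieved with DataHub's Python emitter. A minimal sketch below; the platform, dataset name, and properties are made-up examples, and schema fields can be emitted the same way via a SchemaMetadata aspect.

from datahub.emitter.mce_builder import make_dataset_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import DatasetPropertiesClass

# Emit a dataset's properties aspect to a local DataHub instance (placeholder URL).
emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

properties = DatasetPropertiesClass(
    name="monthly_active_users",  # made-up example dataset
    description="KPI documented via the Google Sheets template",
    customProperties={"owner_team": "analytics", "source_system": "SAP BW"},
)

emitter.emit(
    MetadataChangeProposalWrapper(
        entityUrn=make_dataset_urn(platform="sap", name="kpis.monthly_active_users", env="PROD"),
        aspect=properties,
    )
)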
  • m

    many-rainbow-50695

    10/10/2022, 6:50 AM
    Hi, everyone! I've created an open business glossary from a semantic data types registry. It includes 300+ existing data types, such as classification codes, personal and company identifiers, geocodes, etc.: the most common data types plus dozens of country- and language-specific ones like the UK Ward code or the French SIRET code. I am also working on integrating semantic data type detection with DataHub ingestion. I've already published the metacrafter tool, which allows automatic identification of semantic data types; the next step is to make it compatible with the DataHub API and ingestion process. It is available here: https://github.com/apicrafter/metacrafter-registry/blob/main/data/datahub/metacrafter.yml Feel free to contact me and provide your feedback.
    l
    c
    • 3
    • 3
  • c

    crooked-van-51704

    01/03/2023, 4:33 PM
    Hey, at the end of the year we were really sad that the Postgres source didn't have support for table lineage yet, and we really needed it, so … we wrote it and want to share it with everyone else. We have open-sourced a new package,
    datahub-postgres-lineage
    . It behaves as a new “data source” for now, but it only emits lineage for views in Postgres. The package is already available on PyPI and can easily be installed using
    pip install datahub-postgres-lineage
    For now, we decided to release the package as a standalone data source so that people can try it right away, but we plan to propose including it in the built-in Postgres data source as one more option, similar to Snowflake.
    l
    m
    b
    • 4
    • 6
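For anyone curious what "emits lineage for views" translates to on the DataHub side, here is a rough sketch using the plain Python emitter with made-up table and view names; the package above automates discovering these relationships from Postgres itself.

from datahub.emitter.mce_builder import make_dataset_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import (
    DatasetLineageTypeClass,
    UpstreamClass,
    UpstreamLineageClass,
)

emitter = DatahubRestEmitter(gms_server="http://localhost:8080")  # placeholder address

# Made-up example: a Postgres view built on top of a table.
view_urn = make_dataset_urn("postgres", "mydb.public.active_customers", env="PROD")
table_urn = make_dataset_urn("postgres", "mydb.public.customers", env="PROD")

lineage = UpstreamLineageClass(
    upstreams=[UpstreamClass(dataset=table_urn, type=DatasetLineageTypeClass.VIEW)]
)
emitter.emit(MetadataChangeProposalWrapper(entityUrn=view_urn, aspect=lineage))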
  • b

    bland-orange-13353

    01/25/2023, 6:04 AM
    This message was deleted.
    l
    l
    • 3
    • 3
  • g

    gentle-lifeguard-88494

    02/24/2023, 8:07 PM
    Hey everyone, I was able to use the custom metadata model to get distinct values for all columns! This is only for low-cardinality columns (Cardinality.FEW) based on the GE definition. It was really nice not having to code any React/TypeScript to get it to show in the UI as well :phew: Thought I would share since it took me a little bit to orient myself to the open-source project, but I'm proud of this small achievement! If anyone is interested in all the steps involved (could be helpful for those doing it for the first time), I can share more details as well. Thanks to everyone who helped out! @orange-night-91387 @bulky-soccer-26729 @curved-planet-99787 @astonishing-answer-96712 Looking forward to making bigger contributions in the future 💪 P.S. The data shown is from the public Chinook sample database, so no PII issues to worry about from the screenshot
    e
    c
    +4
    • 7
    • 19
  • s

    sparse-address-17104

    03/16/2023, 8:56 PM
    Hi everyone, just contributing to the group: I have written DataHub documentation for deploying DataHub via a straightforward step-by-step process, using dbt/Trino push/pull data sources. If anyone is interested, it can be read here: https://www.linkedin.com/feed/update/urn:li:activity:7037156471440072704/
  • s

    sparse-address-17104

    03/16/2023, 8:56 PM
    On Medium as well: https://medium.com/@leandro.totino87/centralized-open-data-plataform-jupyterhub-trino-dbt-grafana-minio-hive-datahub-751d320ff8d7
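Since the write-up is built around dbt and Trino, here is a hedged sketch of the dbt piece as a programmatic ingestion pipeline; the artifact paths, target platform, and server address are assumptions.

from datahub.ingestion.run.pipeline import Pipeline

pipeline = Pipeline.create(
    {
        "source": {
            "type": "dbt",
            "config": {
                # Artifacts written by `dbt build` / `dbt docs generate`.
                "manifest_path": "./target/manifest.json",
                "catalog_path": "./target/catalog.json",
                "target_platform": "trino",
            },
        },
        "sink": {"type": "datahub-rest", "config": {"server": "http://localhost:8080"}},
    }
)
pipeline.run()
pipeline.raise_from_status()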
  • h

    hallowed-microphone-6899

    03/22/2023, 1:16 PM
    Hi team, I just followed the Airflow Integration guide, but the Airflow log is stuck pending. Status is in picture 1, Airflow connections in picture 2, and the code is in picture 3. Env info: Airflow = 2.5.2 (standalone), acryl_datahub_airflow_plugin = 0.10.0.6, Python 3.9.6. Can you tell me what to do?
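Not a fix for the pending log itself, but for comparison, a hedged sketch of how the plugin is typically wired up: an Airflow connection of type datahub_rest pointing at GMS (the default connection id is assumed here), plus optional inlets/outlets on tasks for lineage.

# Assumes acryl-datahub-airflow-plugin is installed and a connection exists, e.g.:
#   airflow connections add datahub_rest_default \
#       --conn-type datahub_rest --conn-host http://localhost:8080
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from datahub_provider.entities import Dataset  # ships with the DataHub Airflow integration

with DAG(
    dag_id="datahub_lineage_demo",
    start_date=datetime(2023, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # inlets/outlets are picked up by the DataHub plugin and emitted as lineage.
    transform = BashOperator(
        task_id="transform",
        bash_command="echo transform",
        inlets=[Dataset("postgres", "mydb.public.raw_orders")],
        outlets=[Dataset("postgres", "mydb.public.orders_clean")],
    )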