# advice-data-governance
  • w

    wonderful-baker-8803

    11/27/2023, 5:27 AM
    Hi there, I am using DataHub v0.12.0. I have a question on glossary governance. Is there any way to track changes made to Glossary items (history)? One of the functions of governance is knowing who made changes and, possibly, the reasons for the change.
  • b

    brief-beach-39550

    11/27/2023, 1:35 PM
    Our existing user shared an observation: the existing search functionality treats _-separated terms as wildcard searches (an item shows in the search results if any of the searched words is found). I need help understanding the role of _ in search. Please suggest where I can get help. I have referred to https://datahubproject.io/docs/how/search/#advanced-queries but did not find detailed information.
  • r

    rough-motorcycle-9079

    12/13/2023, 12:15 PM
    Hi there, I have some datasets and column schemas, and I have also ingested some glossary terms. I am trying to attach my glossary terms to a specific column, but wasn’t able to find any YAML file or example of how to do it from code. Can someone share an example, please?
  • m

    most-restaurant-63516

    12/20/2023, 8:12 AM
    hi! is openlineage integrated?
  • q

    quiet-television-68466

    12/21/2023, 4:25 PM
    Hello all, we’re trying to assemble some more fine-grained analytics of who is using our DataHub instance so we can better target future communications. How are these statistics aggregated? Is login information aggregated as its own entity, or inside the user entity? Most of our other reports are based on building views on top of the MySQL metadata_aspect_v2 table. Is this information available anywhere in there? We will share any findings here.
  • p

    powerful-dawn-10711

    12/27/2023, 12:41 PM
    Is there a way I can propagate tags across lineage? Suppose that I have a table (Customer) with attributes such as (name, age), and I add a PII tag to the "age" column. Will any view with an age column that reads from that table have the same tag? I want something like classification propagation, to be exact. Is it possible in DataHub using tags or some other way?
  • f

    fast-receptionist-22258

    01/04/2024, 12:36 AM
    Has anyone tried (or thought about) pushing semantic documentation or governance tags from source code data schemas into DataHub as an alternative to manual documentation/tagging? Perhaps something like dbt docs, but for other datastores. If so, what motivated you to do it, and did it work as you expected?
  • h

    hallowed-airline-55013

    01/04/2024, 7:02 AM
    Hi team, it seems that in DataHub one dataset can only be assigned to one data product. So how can we handle a scenario where two data products share one dataset?
  • m

    microscopic-dream-41425

    01/08/2024, 6:43 AM
    Hi guys, I have a question. How can I grant a user the permission to ‘allow granting permissions to others’ within a specified scope in DataHub? For example, allowing user A to grant permissions to other users on the ‘dw’ library in S3?
  • m

    melodic-scientist-67175

    01/15/2024, 12:44 PM
    Hi guys, I wanted to see if there's a presentation on this platform covering all its features, comparisons, etc.
  • q

    quaint-appointment-83049

    01/17/2024, 7:05 PM
    Hi all, I am using Google Cloud (GCP) and all our data is ingested into BigQuery. I use DataHub for my data catalog and data governance. Data lineage is also a feature in DataHub. However, the lineage is shown only when the tables/views are created via BigQuery. If they are created through Google Dataform, the lineage of those tables is not visible: both upstream and downstream are empty. I need data lineage for the tables created by Dataform as well. How can this be done? I am currently stuck and need a quick solution. If anyone has any idea on this, please feel free to respond.
  • r

    rich-salesmen-77587

    01/17/2024, 10:05 PM
    Do we have an API to create a data product? #dataproduct
  • m

    miniature-hair-20451

    01/18/2024, 1:17 PM
    Hello, there is no glossary at my current company, and I would like to set one up. Please share the glossary implementation that you consider has worked best for you.
  • w

    white-nightfall-84574

    01/19/2024, 11:17 AM
    Hello, I wanted clarity on whether DataHub has an API available to fetch metadata from other data catalogues.
  • b

    broad-honey-63164

    01/22/2024, 7:05 PM
    I noticed there is very, very little documentation or video material on data contracts. Where can I find EXPLICIT examples, rather than YouTube videos and talks about it? There are no real examples anywhere.
  • a

    adventurous-dawn-19232

    01/23/2024, 10:56 AM
    My requirement is to create a dataset for each database inside MySQL. Inside each database dataset, create separate table datasets with names like "table_name" (without the database prefix), where each table dataset contains all the fields for that specific table. Is that possible from a CSV file using Python code? I am getting databasename.tablename.
  • b

    brave-judge-32701

    02/01/2024, 7:31 AM
    I want to implement through code the ability to batch assign database tables to a specific data domain based on their naming conventions. However, I have not found an API that conveniently searches for datasets. Is my only option to use the getSearchResultsForMultiple function? I also haven’t found any documentation related to this function.
  • r

    rich-salesmen-77587

    02/08/2024, 7:01 PM
    Is there an example GraphQL query to delete a tag?
  • b

    brainy-musician-50192

    02/12/2024, 2:28 PM
    Has anyone successfully implemented a system where if a job that performs ETL fails (e.g. Airflow task), then the table affected by it and all downstream tables are flagged in some way? Essentially to allow someone to quickly check if the table data is correct and up-to-date
  • f

    flat-judge-25164

    02/13/2024, 7:38 PM
    Hi folks, is it possible to build an approval workflow in the open source version of DataHub using the datahub-actions framework?
  • g

    gorgeous-planet-81148

    02/27/2024, 5:53 PM
    Hello, is there a way to set the ownership of a domain when ingesting it from a file? I'm trying to ingest domains at startup and have their ownership be set, but there's no obvious way to set ownership when using a file source type.
  • l

    limited-motherboard-51317

    02/28/2024, 6:34 PM
    Hello colleagues! I found that DataHub has a data classification feature: https://datahubproject.io/docs/metadata-ingestion/docs/dev_guides/classification/ This could be very useful for my project, but I need to clarify a few things. 1. Can I use this feature not only for Snowflake, but also with Postgres, MySQL, and MSSQL? 2. Can I implement custom classifiers? In my case it could be classification of sensitive data other than personal data, for example Salary.
  • v

    victorious-eve-73426

    02/28/2024, 9:38 PM
    Hey team - is there any feature to bulk update metadata at the schema or table level via a CSV file? I know there is a capability to download a CSV.
  • p

    plain-optician-17131

    02/28/2024, 11:43 PM
    Hey all, looking for operational advice: we're designating Data Products and/or entities with "Tiers" (e.g. T1, T2, T3) to denote level of business impact, quality, SLA, etc., and looking at options for how to flag this in DataHub. I was thinking of creating user "groups" for each tier. Any other thoughts or recommendations?
  • v

    victorious-eve-73426

    03/04/2024, 1:59 PM
    Hey all - is there a way to download the data dictionary of one or many datasets via the API or UI?
  • l

    limited-motherboard-51317

    03/05/2024, 9:10 AM
    Hello colleagues! I'm looking for a data modeling tool in which we can design a conceptual and logical data model and correlate it with the actual schemas of datasets in DataHub. Do we have such a tool?
  • b

    bland-receptionist-85001

    03/10/2024, 1:09 AM
    Hi, is it possible for DataHub to support notifications via the Apprise library? Apprise makes it possible to send notifications to almost every platform, including Slack but also others like Discord, Mattermost, email, etc. What do you think, can we collaborate to make this happen?
  • l

    limited-motherboard-51317

    03/10/2024, 7:04 PM
    Hi! I'm currently studying how data contracts work. I started the latest (0.13.0) DataHub locally via quickstart and tried to add a simple data contract. After adding the data contract to a dataset, the following exception is shown when opening the dataset in the UI:
    no enum constant com.linkedin.datahub.generated.AssertionType.DATA_SCHEMA(code 400)
    And one more general question. I don't understand the end-to-end logic of data contracts: who validates the actual dataset against the contract published for it on DataHub? What are the executors of a data contract? As I understand it, a data contract validates both the schema (metamodel) and the data of the actual dataset, so it requires an executor to run the contract against a concrete dataset. Where can I find more extended end-to-end samples and documentation?
  • f

    fast-gold-39211

    03/13/2024, 5:13 PM
    Is anyone using DataHub for MSSQL data warehouse data lineage and governance? Do you think DataHub is the right tool for a smaller team?
  • c

    calm-alligator-12692

    03/14/2024, 9:20 AM
    Is there a way to trigger managed ingestion from the CLI or programmatically?