#advice-data-governance

acceptable-potato-35922

01/12/2022, 8:03 PM
What are some of the guidelines/best practices that folks here use to ensure that Owners are responsible for their metadata? We want our Data Producers to be responsible for the PUSH of metadata to DataHub and our team to be responsible for the platform itself - not the individual datasets. How do others manage this?

little-megabyte-1074

01/12/2022, 8:57 PM
Talk about perfect timing - I’m currently working on a blog post talking about this very topic! I’ll be sure to link it in this channel once it’s published (targeting this week)
🤩 1
plus1 2
excited 1

alert-jackal-50417

02/18/2022, 10:13 PM
Also would love to read this! Thank you @little-megabyte-1074

little-megabyte-1074

02/18/2022, 10:37 PM
Hi @alert-jackal-50417! I published it a while back — https://datahubspace.slack.com/archives/C02QMLWJG12/p1643062891025700

alert-jackal-50417

02/18/2022, 10:38 PM
Thank you! I just saw that later in the channel as well
❤️ 1

busy-dentist-64466

03/10/2022, 2:24 PM
@acceptable-potato-35922 Who is the data producer please? The Data engineer who creates the data product, the data product owner, or the data owner?

acceptable-potato-35922

03/15/2022, 4:02 PM
@busy-dentist-64466 That’s a tricky question to answer 🙂 I think it depends on the organization. The way I have done it in the past is to set two levels of ownership:
1. Data Owner: the person (or team DL) that created the dataset. Typically engineering or data science.
2. Information Owner: the person who understands the permissible uses of the data and thus controls who gets access to it and who doesn’t. Typically somebody from Product, but I’ve also seen a lot of engineers take that role as well.

busy-dentist-64466

03/15/2022, 7:17 PM
Thanks for clarifying. I think it could be interesting to specify the categories of meta-data the different types of owners are responsible for. Maybe something like the below?
• Data Owners are responsible for Technical Meta-Data: Database-Schema-Table, technical lineage, column names,...
• Information Owners are responsible for Logical Meta-Data: Definition, Sensitivity, Entity,...
I agree that having users push meta-data is waaaaay more efficient than having to discover & classify data, but it is very hard to incentivize users to do this. If I had to do this, I would combine carrot and stick:
• carrot: Data products with good meta-data will be easier to find in the catalog, and will garner more trust. As a result, more data consumers will use this data set. Maybe you could reward data producers for this with gifts?
• stick: I really hate the stick. This is data governance 1.0, and it does not always lead to the desired outcome. But maybe you can revoke access for data products that have no meta-data? People won't like you, and you'll have to make sure they don't stick "kick me" notes on your back when you're walking through your company's hallways.
🤣 1
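The split suggested above could be sketched roughly as follows. This is only an illustration in plain Python, not DataHub's API: the field names, the `RESPONSIBILITIES` mapping, the `DatasetRecord` class, and both helper functions are hypothetical names made up for this example. It shows how a platform team might report which owner type still owes which meta-data, and compute a completeness score that could feed either a carrot (ranking in the catalog) or a stick (an access policy).

```python
from dataclasses import dataclass, field

# Hypothetical mapping of meta-data fields to the owner type responsible
# for them, loosely following the categories proposed in the thread.
RESPONSIBILITIES = {
    "data_owner": ["schema", "lineage", "column_names"],
    "information_owner": ["definition", "sensitivity", "entity"],
}


@dataclass
class DatasetRecord:
    """Illustrative stand-in for a catalog entry (not a DataHub type)."""
    name: str
    metadata: dict = field(default_factory=dict)


def missing_by_owner(record: DatasetRecord) -> dict:
    """Return, per owner type, the fields that are absent or empty."""
    return {
        owner: [f for f in fields if not record.metadata.get(f)]
        for owner, fields in RESPONSIBILITIES.items()
    }


def completeness(record: DatasetRecord) -> float:
    """Fraction of expected fields that are filled in, between 0.0 and 1.0."""
    total = sum(len(fields) for fields in RESPONSIBILITIES.values())
    missing = sum(len(fields) for fields in missing_by_owner(record).values())
    return (total - missing) / total


if __name__ == "__main__":
    orders = DatasetRecord(
        name="orders",
        metadata={"schema": "sales.orders", "column_names": ["id"], "definition": "Customer orders"},
    )
    print(missing_by_owner(orders))
    print(completeness(orders))
```

A score like this is easier to use as a carrot (sort search results by it) than as a stick, since a hard threshold would need agreement on which fields are truly mandatory.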

acceptable-potato-35922

03/16/2022, 3:36 PM
Very true. The Governance sticks always cause friction. We are trying to come up with carrots to incentivize data producers to be part of the push. But it’ll be a long journey.
👍 1