# advice-data-governance
a
What are some of the guidelines/best practices that folks here use to ensure that Owners are responsible for their metadata? We want our Data Producers to be responsible for the PUSH of metadata to DataHub and our team to be responsible for the platform itself - not the individual datasets. How do others manage this?
l
Talk about perfect timing - I’m currently working on a blog post talking about this very topic! I’ll be sure to link it in this channel once it’s published (targeting this week)
🤩 1
plus1 2
excited 1
a
Also would love to read this! Thank you @little-megabyte-1074
l
Hi @alert-jackal-50417! I published it a while back — https://datahubspace.slack.com/archives/C02QMLWJG12/p1643062891025700
a
Thank you! I just saw that later in the channel as well
❤️ 1
b
@acceptable-potato-35922 Who is the data producer please? The Data engineer who creates the data product, the data product owner, or the data owner?
a
@busy-dentist-64466 That’s a tricky question to answer 🙂 I think it depends on the organization. The way that I have done it in the past is to set the 2 levels of ownership:
1. Data Owner --> The person (or team DL) that created the dataset. Typically engineering or data science
2. Information Owner --> The person who understands the permissible uses of the data and thus will control who gets access to it and who doesn’t. Typically somebody from Product - but I’ve also seen a lot of engineers take that role as well.
b
Thanks for clarifying. I think it could be interesting to specify the categories of metadata the different types of owners are responsible for. Maybe something like the below?
• Data Owners are responsible for Technical Metadata: Database-Schema-Table, technical lineage, column names,...
• Information Owners are responsible for Logical Metadata: Definition, Sensitivity, Entity,...
I agree that having users push metadata is waaaaay more efficient than having to discover & classify data, but it is very hard to incentivize users to do this. If I had to do this, I would combine carrot and stick:
• carrot: Data products with good metadata will be easier to find in the catalog, and will garner more trust. As a result, more data consumers will use this data set. Maybe you could reward data producers for this with gifts?
• stick: I really hate the stick. This is data governance 1.0, and does not always lead to the desired outcome. But maybe you can revoke access for data products that have no metadata? People won't like you, and you'll have to make sure they don't stick "kick me" notes on your back when you're walking through your company's hallways.
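A minimal sketch of how that split of responsibilities could be checked, assuming hypothetical owner types and field names (this is plain Python for illustration, not a DataHub API):

```python
# Hypothetical mapping: which owner type is responsible for which metadata fields.
# The field names are illustrative assumptions, not DataHub aspect names.
RESPONSIBILITIES = {
    "data_owner": ["schema", "lineage", "column_names"],
    "information_owner": ["definition", "sensitivity", "entity"],
}


def missing_metadata(metadata: dict) -> dict:
    """Return, per owner type, the fields that are absent or empty."""
    gaps = {}
    for owner_type, fields in RESPONSIBILITIES.items():
        missing = [name for name in fields if not metadata.get(name)]
        if missing:
            gaps[owner_type] = missing
    return gaps


def completeness(metadata: dict) -> float:
    """Fraction of expected fields that are filled in (0.0 to 1.0)."""
    total = sum(len(fields) for fields in RESPONSIBILITIES.values())
    missing = sum(len(fields) for fields in missing_metadata(metadata).values())
    return (total - missing) / total


# Example dataset record (made-up values).
example = {
    "schema": "db.sales.orders",
    "lineage": ["raw.orders"],
    "column_names": ["id", "amount"],
    "definition": "Customer orders",
}
gaps = missing_metadata(example)
score = completeness(example)
```

A score like this could feed the carrot side, for example by ranking better-documented datasets higher in search, while the per-owner gaps show who to nudge.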
🤣 1
a
Very true. The Governance sticks always cause friction. We are trying to come up with carrots to incentivize data producers to be part of the push. But it’ll be a long journey.
👍 1