As a pilot launch, @silly-dusk-92062 and I started setting up DataHub with our data products. We each own ~100 tables, and they can be linked to the same documentation. It would be nice to be able to link documentation at a higher level. We use BigQuery, so that would be at the project level or even just at the dataset level.
Curious if anyone else has run into the same challenge when setting up documentation.
mammoth-bear-12532
09/24/2021, 2:41 AM
Hi @clean-cpu-43303, great question, and aligned with how we are thinking about "dataset containers". How are you thinking about rendering the documentation? At the leaf dataset level, or at the container level?
silly-dusk-92062
09/24/2021, 5:57 PM
Hi @mammoth-bear-12532 - We would like to link documentation at both the container level and the leaf level. Right now we are starting with BigQuery as a source for our data catalog. Most of our BigQuery projects represent a data product, so we need documentation at the container level. However, we do have projects where our datasets represent a data product, so there are cases where we would like to link documentation at the leaf level as well. cc @clean-cpu-43303
busy-accountant-26554
10/01/2021, 2:26 PM
Hi @mammoth-bear-12532, @clean-cpu-43303 and @silly-dusk-92062. We have a similar use case where we want to govern datasets at a container level, meaning that we have defined a dataset to be a union of schemas (or tables for RDBMS). At the same time, we want to display the dataset in the catalog at the container level as well as at the table (leaf) level.
busy-accountant-26554
10/01/2021, 2:30 PM
@mammoth-bear-12532, are dataset containers something that will end up on the DataHub roadmap? Such a feature would create a lot of value for our catalog customers.
mammoth-bear-12532
10/01/2021, 2:31 PM
Hi @busy-accountant-26554: yes, they definitely are.