calm-winter-90524
10/29/2022, 7:41 PM
dbt? Is it DataHub (or data.world or Collibra or Atlan or Alation)? Some combination of the two (obviously), but where are the lines drawn?
I love the idea of keeping docs, tests, and metadata right next to my models in dbt, managed in GitHub. If I take the "dbt as source of truth" approach, then I would obviously need to push metadata changes to the data catalog or enterprise metadata tool/framework. But there's way more to it than that ... metadata for the same objects is likely available in other forms from other sources (e.g., crowdsourcing, a separate glossary of business terms, higher-fidelity lineage data, or profiling, not to mention data contracts, access policies, etc.).
How do you all think about and solve this? Thanks!
modern-artist-55754
10/31/2022, 1:24 AM
calm-winter-90524
10/31/2022, 9:34 AM
schema.yml file in GitHub) but should first PULL that data from the source of truth (DataHub), then execute dbt build from there.
I am not suggesting this is a good thing to do; rather, that it is, or could be, the logical deduction from a strict application of "DataHub-as-source-of-truth". Is it more appropriate to say something like "DataHub is a window to the truth, a truth that is the union of various 'bits' of truth that come from multiple places (e.g., dbt and Azure Directory)"?
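To make the "union of bits of truth" idea a little more concrete, here is a minimal sketch, assuming each metadata field has exactly one owning system. The field names, the OWNERS mapping, and the merge_metadata helper are all hypothetical and purely illustrative; none of them are dbt or DataHub APIs.

```python
from typing import Any

# Hypothetical ownership map: which system is authoritative for each field.
OWNERS = {
    "description": "dbt",
    "tests": "dbt",
    "glossary_terms": "catalog",
    "access_policy": "directory",
}


def merge_metadata(sources: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Build one view of a model's metadata, taking each field only from its owner."""
    merged: dict[str, Any] = {}
    for field, owner in OWNERS.items():
        # Values for this field coming from non-owning systems are ignored.
        value = sources.get(owner, {}).get(field)
        if value is not None:
            merged[field] = value
    return merged


# Made-up example inputs: each system reports what it knows about one model.
sources = {
    "dbt": {"description": "Daily orders", "tests": ["not_null"], "glossary_terms": ["stale"]},
    "catalog": {"glossary_terms": ["Order"], "description": "edited in UI"},
    "directory": {"access_policy": "analysts-read"},
}
view = merge_metadata(sources)
```

In this sketch the catalog would only display the merged view; an edit to a dbt-owned field would still have to happen in the dbt repo, which is one possible way of drawing the lines.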
Thanks again!
modern-artist-55754
10/31/2022, 10:11 AM