Hi - I’m one of the authors of https://github.com/stitchfix/hamilton. I’m researching an idea around emitting metadata from Hamilton to external systems, so I’m looking at what I could emit/send to DataHub. I see there are some special emitters, like the dbt one; what’s required to get a dedicated one added?
echoing-airport-49548
06/22/2022, 12:14 AM
Hi @busy-gigabyte-97279, great to have you here! We would love contributions for adding another ingestion source. Take a look at these guides on developing on metadata ingestion and adding a new source to get some more info! Please let me know if you have any other questions 🙂
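For context on what that entails: a pull-based ingestion source is a Python class implementing the Source interface from the metadata-ingestion package. A minimal sketch of what a hypothetical Hamilton source could look like is below; the class name, config handling, and method bodies are placeholders, so check the Source base class in the repo for the current interface.

```python
from typing import Dict, Iterable

from datahub.ingestion.api.common import PipelineContext
from datahub.ingestion.api.source import Source, SourceReport
from datahub.ingestion.api.workunit import MetadataWorkUnit


class HamiltonSource(Source):
    """Hypothetical pull-based source that would walk a Hamilton DAG
    and turn each node into DataHub metadata work units."""

    def __init__(self, config: Dict, ctx: PipelineContext):
        super().__init__(ctx)
        self.config = config
        self.report = SourceReport()

    @classmethod
    def create(cls, config_dict: Dict, ctx: PipelineContext) -> "HamiltonSource":
        return cls(config_dict, ctx)

    def get_workunits(self) -> Iterable[MetadataWorkUnit]:
        # Walk the Hamilton function graph here and yield one
        # MetadataWorkUnit per node/dataset to register in DataHub.
        return []

    def get_report(self) -> SourceReport:
        return self.report

    def close(self) -> None:
        pass
```

A source like this gets registered with the source registry and driven by a YAML recipe, which is what the "adding a new source" guide walks through.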
mammoth-bear-12532
06/23/2022, 6:35 PM
@busy-gigabyte-97279: welcome to DataHub! If you are looking to emit metadata (in push fashion) from your orchestrator, you can check out the emitter docs here. For code examples, you can also check out the Airflow integration code referred to from here.
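For a rough idea of the push pattern those docs cover, a minimal sketch using the Python REST emitter is below. The gms_server URL, the "hamilton" platform, the dataset name, and the description are placeholders invented for illustration.

```python
from datahub.emitter.mce_builder import make_dataset_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import DatasetPropertiesClass

# Point this at your DataHub GMS endpoint (placeholder URL).
emitter = DatahubRestEmitter(gms_server="http://localhost:8080")

# A made-up Hamilton output, modeled as a DataHub dataset.
dataset_urn = make_dataset_urn(
    platform="hamilton", name="my_dataflow.final_df", env="PROD"
)

# Push one aspect (dataset properties) for that entity.
emitter.emit(
    MetadataChangeProposalWrapper(
        entityUrn=dataset_urn,
        aspect=DatasetPropertiesClass(description="Output node of a Hamilton dataflow"),
    )
)
```

The same pattern works from inside an orchestrator task: build the URNs and aspects for whatever just ran, then emit them over REST (or Kafka) as the run progresses.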
busy-gigabyte-97279
06/23/2022, 6:47 PM
@mammoth-bear-12532 yep, thanks. One question: is DataHub’s premise that it stores metadata about materialized data only?
Hamilton is pretty interesting in that it gets people to model their dataflow independently of materialization concerns. So I’m trying to figure out whether the metadata encoded in Hamilton would be useful to emit prior to execution, or whether it’s only useful after execution, once something has been materialized to some datastore.
mammoth-bear-12532
06/25/2022, 2:13 AM
It’s definitely fine to emit nodes that are purely logical and have not yet run... e.g. we represent views in databases as well as dbt models.
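To make that concrete, a sketch of emitting a purely logical, not-yet-materialized node with the same emitter pattern is below. DataHub represents database views as dataset entities carrying viewProperties and subTypes aspects; the "Hamilton Node" subtype, the sample function body, and the URL/names here are invented for illustration.

```python
from datahub.emitter.mce_builder import make_dataset_urn
from datahub.emitter.mcp import MetadataChangeProposalWrapper
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.metadata.schema_classes import SubTypesClass, ViewPropertiesClass

emitter = DatahubRestEmitter(gms_server="http://localhost:8080")  # placeholder GMS URL

# A hypothetical Hamilton node that exists only as logic, prior to any run.
node_urn = make_dataset_urn(
    platform="hamilton", name="my_dataflow.spend_per_signup", env="PROD"
)

# Mark it as a logical/view-like node and attach the code that defines it.
aspects = [
    SubTypesClass(typeNames=["Hamilton Node"]),
    ViewPropertiesClass(
        materialized=False,
        viewLogic=(
            "def spend_per_signup(spend: pd.Series, signups: pd.Series)"
            " -> pd.Series: ..."
        ),
        viewLanguage="Python",
    ),
]
for aspect in aspects:
    emitter.emit(MetadataChangeProposalWrapper(entityUrn=node_urn, aspect=aspect))
```

Once the dataflow actually runs, the same URN can be enriched with run/materialization facts without changing the logical definition.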