# ingestion
m
Hi, how's it going? I was scouring your documentation for an example of ingesting Airflow DAG metadata (not lineage), but was unsuccessful (high chance I overlooked something). Inspired by your demo (https://demo.datahubproject.io/browse/pipelines/airflow/prod), I would love to know how to see Airflow DAG and task metadata in DataHub. Any guidance will be appreciated. Thank you in advance.
m
Hi Fredrik - thanks for responding so quickly, I should have been more specific. Not talking about lineage, but rather the DAG metadata itself (including tasks in the DAG).
h
I think putting the LineageBackend into use is enough, at least based on my experience using the lineage feature in Airflow. Airflow takes care of the rest. But true, docs on that are missing.
m
Would LOVE it if you could share a bit more on this, and I am happy to raise a PR against your docs to include this 🙂
h
Yeah, I haven't tested this properly. It's more an observation from working on dataset-related lineage in Airflow.
m
@millions-jelly-76272, @high-hospital-85984 is right, just configuring the lineage backend will emit lineage as well as DAG metadata when the DAG runs.
✅ 1
g
Yep - if you do steps 1 and 2 in the lineage backend guide https://datahubproject.io/docs/metadata-ingestion/#using-datahubs-airflow-lineage-backend-recommended, you’ll automatically start getting Airflow DAG and task metadata when your tasks run
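For anyone reading later, a rough sketch of what those two steps amount to, written from memory of the guide at the time. The connection ID `datahub_rest_default`, the host URL in the comment, and the kwargs shown are assumptions; check them against the guide for your DataHub version.

```ini
# Step 1 (assumed): an Airflow connection of type "datahub_rest" already exists,
# here named datahub_rest_default and pointing at your DataHub GMS host
# (for example http://localhost:8080).

# Step 2 (assumed): enable the lineage backend in airflow.cfg.
[lineage]
backend = datahub_provider.lineage.datahub.DatahubLineageBackend
datahub_kwargs = {"datahub_conn_id": "datahub_rest_default", "capture_ownership_info": true, "capture_tags_info": true, "graceful_exceptions": true}
```

With something like this in place, DAG and task metadata should show up in DataHub after the tasks have run, even for tasks that declare no inlets or outlets.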
m
Thank you @clean-bear-94984 @mammoth-bear-12532 and @high-hospital-85984 - I confirm it is working now :-)
🎉 1
h
Nice!