# ingestion
s
Hi, I would like to see the run history of Airflow tasks in DataHub, as shown in this example: https://demo.datahubproject.io/tasks/urn:li:dataJob:(urn:li:dataFlow:(airflow,datahub_li[…]kend_demo,prod),run_data_task)/Runs?is_lineage_mode=false but I couldn't find the documentation that covers it or explains how to configure it. I already have Airflow DAGs ingested in DataHub, but no Runs are shown. Thanks for the help.
d
Here is the documentation: https://datahubproject.io/docs/lineage/airflow/ If capturing Airflow tasks already works, then I think the only missing piece is enabling the capture_executions property in your Airflow config:
• capture_executions (defaults to false): If true, it captures task runs as DataHub DataProcessInstances. This feature only works with DataHub GMS version v0.8.33 or greater.
[lineage]
backend = datahub_provider.lineage.datahub.DatahubLineageBackend
datahub_kwargs = {
    "datahub_conn_id": "datahub_rest_default",
    "cluster": "prod",
    "capture_ownership_info": true,
    "capture_tags_info": true,
    "capture_executions": true,
    "graceful_exceptions": true }
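If you want to double-check that the flag is actually set before restarting Airflow, here is a rough sketch in Python. It assumes the datahub_kwargs value is valid JSON (the lowercase true in the snippet above suggests so) and that continuation lines are indented as shown. The helper name executions_enabled and the SAMPLE_CFG text are made up for illustration; they are not part of Airflow or DataHub.

```python
import configparser
import json

# Sample config text mirroring the snippet above (illustrative only).
SAMPLE_CFG = """
[lineage]
backend = datahub_provider.lineage.datahub.DatahubLineageBackend
datahub_kwargs = {
    "datahub_conn_id": "datahub_rest_default",
    "cluster": "prod",
    "capture_ownership_info": true,
    "capture_tags_info": true,
    "capture_executions": true,
    "graceful_exceptions": true }
"""


def executions_enabled(cfg_text: str) -> bool:
    """Return True if [lineage] datahub_kwargs sets capture_executions to true.

    Hypothetical helper: assumes datahub_kwargs holds a JSON object.
    """
    # Disable interpolation so any '%' in the config is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    # Fall back to an empty JSON object if the section or option is missing.
    raw = parser.get("lineage", "datahub_kwargs", fallback="{}")
    kwargs = json.loads(raw)
    # The option defaults to false when it is absent.
    return bool(kwargs.get("capture_executions", False))


if __name__ == "__main__":
    print(executions_enabled(SAMPLE_CFG))
```

To run it against a real file, read the contents of your airflow.cfg into a string and pass that in instead of SAMPLE_CFG.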
s
Oh thanks! I'll have a look at that. We have v0.8.36, so it should be fine. 🙂
b
Let us know how this goes 🙂
s
It worked! thanks
d
Awesome, I think we should enable it now by default.
s
yep, it's really a nice feature!