How to disable Airflow lineage for some DAGs
# ingestion
m
Hi all, I am using Airflow as a job scheduler and have been enjoying the lineage backend with DataHub. I looked at the code and did not see any hint of this, so I'll ask here: is there a way to configure a DAG or an Operator so that Airflow does not emit task and pipeline lineage to DataHub? By default, once you install and configure the backend, every task and DAG that runs in Airflow emits to DataHub. That's all cool, but we have jobs running in Airflow that are unrelated to data (infrastructure maintenance jobs, housekeeping, etc.), and it makes no sense to see those in DataHub. It would be nice to have a flag that I can set on a DAG and/or Operator to indicate whether Airflow should emit to DataHub. There should also be a default value for this flag, set in the lineage backend config, so that you can override the current default behavior (emit by default or not). Does this make sense?
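To make the request concrete, the precedence described above could be sketched roughly as follows. This is only an illustration of the proposed behavior, not an existing DataHub or Airflow feature: the function `should_emit_lineage` and its parameters `operator_flag`, `dag_flag`, and `backend_default` are hypothetical names invented for this sketch.

```python
from typing import Optional


def should_emit_lineage(
    operator_flag: Optional[bool],
    dag_flag: Optional[bool],
    backend_default: bool = True,
) -> bool:
    """Decide whether a task should emit lineage (hypothetical logic).

    Precedence assumed here: Operator flag, then DAG flag, then the
    default from the lineage backend config. None means "not set".
    """
    if operator_flag is not None:
        return operator_flag
    if dag_flag is not None:
        return dag_flag
    return backend_default


# A housekeeping DAG opts out; its tasks inherit that choice.
housekeeping = should_emit_lineage(operator_flag=None, dag_flag=False)

# One task inside that DAG opts back in explicitly.
special_task = should_emit_lineage(operator_flag=True, dag_flag=False)

# Nothing set anywhere: fall back to an opt-in backend default.
unset_opt_in = should_emit_lineage(None, None, backend_default=False)
```

With a backend default of "do not emit", only DAGs or Operators that set the flag explicitly would show up in DataHub, which covers the opt-in case as well.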
o
We don't currently have a way to do this, but definitely consider putting up a feature request here: https://feature-requests.datahubproject.io/
s
Hey @orange-night-91387, I am looking for a similar feature to selectively allow DAGs to emit metadata in Airflow. Is it available now?
@modern-monitor-81461 did you find any workaround for this?
++ @mammoth-bear-12532