How to disable Airflow lineage for some DAGs
Hi all, I am using Airflow as a job scheduler and I have been enjoying the lineage backend with DataHub. I looked through the code and did not see any hint of this, so I'll ask here: is there a way to configure a DAG or an Operator so that Airflow does not emit task and pipeline lineage to DataHub?

By default, once the backend is installed and configured, every task and DAG that runs in Airflow emits to DataHub. That's fine in general, but we also run jobs in Airflow that are unrelated to data (infrastructure maintenance, housekeeping, etc.), and it makes no sense to see those in DataHub.

It would be nice to have a flag that I can set on a DAG and/or Operator to indicate whether Airflow should emit to DataHub. There should also be a default value for this flag, settable in the lineage backend config, so that the current default behavior (emit by default or not) can be overridden.
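
To make the request concrete, here is a rough sketch of the precedence I have in mind. This is plain Python and does not touch Airflow or DataHub. The function name `should_emit_lineage` and its parameters are hypothetical names I made up for illustration, not an existing API in either project.

```python
from typing import Optional


def should_emit_lineage(
    backend_default: bool,
    dag_setting: Optional[bool] = None,
    task_setting: Optional[bool] = None,
) -> bool:
    """Hypothetical resolution of an "emit lineage" flag.

    Precedence (most specific wins):
      1. the flag set on the Operator/task, if any
      2. the flag set on the DAG, if any
      3. the default from the lineage backend config
    None means "not set at this level".
    """
    if task_setting is not None:
        return task_setting
    if dag_setting is not None:
        return dag_setting
    return backend_default


# Backend emits by default, but a housekeeping DAG opts out.
print(should_emit_lineage(backend_default=True, dag_setting=False))

# Same DAG, but one task in it explicitly opts back in.
print(should_emit_lineage(backend_default=True, dag_setting=False, task_setting=True))
```

With something like this, data teams could keep "emit by default" and mark maintenance DAGs as excluded, while others could flip the default and opt in only the DAGs they care about.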
Does this make sense?