silly-nest-50341
05/16/2023, 5:41 AM

silly-nest-50341
05/16/2023, 7:56 AM
(dataset -> datajob (airflow task) -> dataset) works fine. But when the airflow task finishes, it seems to automatically send its status to DataHub and update the task, and I assume this is overwriting the task's lineage… How can I keep the lineage even after the task is done?
[2023-05-16, 16:44:12] {_plugin.py:147} INFO - Emitting Datahub Dataflow: DataFlow(urn=<datahub.utilities.urns.data_flow_urn.DataFlowUrn object at 0x7f892da85c40>, id='BigQueryLineageOperator_table_test', orchestrator='airflow', cluster='prod', name=None, properties={'is_paused_upon_creation': 'None', ...})
[2023-05-16, 16:44:12] {_plugin.py:165} INFO - Emitting Datahub Datajob: DataJob(id='create_test', urn=<datahub.utilities.urns.data_job_urn.DataJobUrn object at 0x7f892daac6a0>, flow_urn=<datahub.utilities.urns.data_flow_urn.DataFlowUrn object at 0x7f89190f99d0>, name=None, description=None, properties={'_inlets': '[]', '_outlets': '[]', 'depends_on_past': 'False', 'email': 'None', 'label': "'create_test'", 'execution_timeout': 'None', 'sla': 'None', 'trigger_rule': "<TriggerRule.ALL_SUCCESS: 'all_success'>", 'wait_for_downstream': 'False', 'downstream_task_ids': 'set()', 'inlets': '[]', 'outlets': '[]'}, url='/?flt1_dag_id_equals=&_flt_3_task_id=create_test', tags={'BigQueryLineageOperator', 'data_discovery', 'datahub'}, owners={'XXXX'}, group_owners=set(), inlets=[], outlets=[], upstream_urns=[])
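
For context, the lineage emission inside my operator looks roughly like this (a trimmed sketch, not the exact code — the GMS endpoint and table names are placeholders):

from datahub.api.entities.datajob import DataFlow, DataJob
from datahub.emitter.rest_emitter import DatahubRestEmitter
from datahub.utilities.urns.dataset_urn import DatasetUrn

# Placeholder endpoint; the real one points at our GMS instance.
emitter = DatahubRestEmitter("http://datahub-gms:8080")

# Mirror the DAG/task ids so the URNs line up with what the plugin emits.
dataflow = DataFlow(
    orchestrator="airflow",
    cluster="prod",
    id="BigQueryLineageOperator_table_test",
)
datajob = DataJob(id="create_test", flow_urn=dataflow.urn)

# Placeholder BigQuery table names.
datajob.inlets.append(
    DatasetUrn.create_from_ids("bigquery", "my-project.my_dataset.source_table", "PROD")
)
datajob.outlets.append(
    DatasetUrn.create_from_ids("bigquery", "my-project.my_dataset.dest_table", "PROD")
)

dataflow.emit(emitter)
datajob.emit(emitter)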
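
And judging by the '_inlets': '[]', '_outlets': '[]' in the DataJob log above, the plugin re-emits whatever inlets/outlets are declared on the Airflow task itself when it finishes. So maybe declaring them on the operator would survive the overwrite? Something like this untested sketch (using the datahub_provider entities — the DAG, bash command, and table names are made up):

from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from datahub_provider.entities import Dataset

with DAG(
    dag_id="BigQueryLineageOperator_table_test",
    start_date=datetime(2023, 5, 1),
    schedule_interval=None,
) as dag:
    create_test = BashOperator(
        task_id="create_test",
        bash_command="echo 'placeholder for the real BigQuery job'",
        # With these set, the plugin's automatic DataJob emission on task
        # completion should carry this lineage instead of empty lists.
        inlets=[Dataset("bigquery", "my-project.my_dataset.source_table")],
        outlets=[Dataset("bigquery", "my-project.my_dataset.dest_table")],
    )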