# advice-metadata-modeling
s
Hey everyone, I have an Airflow DAG (with 4 tasks) that moves my data from a Postgres DB to a Hive Metastore and also pushes lineage data to DataHub. But I feel like I'm doing something wrong with this approach. When I get to the "Lineage View", I need to make around 4 clicks to see the dependency between the two tables:
Postgres Table > Task A > Task B > Task C > Task D > Hive Table
Is it normal to omit the Airflow tasks from the data lineage to make it easier to read and to calculate the degree of dependency, or would I be leaving an important part out? Cheers, Filipe
a
Hi @strong-father-80629, are you expecting to see table -> DAG -> table, or table -> table directly?
s
Hi @astonishing-answer-96712, thanks for replying. I'm open to both ways actually. I just wanted to check with some more experienced people what the recommended way is, or whether there are benefits to both solutions.
a
Hi @strong-father-80629, we typically don’t recommend using tasks as DAGs. I think your implementation is the most correct at the moment, though we don’t have a great way to collapse this into a more human-readable format.
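For anyone who wants the table -> table view and the degree of dependency without dropping the tasks from the emitted lineage, one option is to collapse the task nodes outside of DataHub, on a copy of the edges. Below is a minimal sketch in plain Python, not a DataHub feature or API: the edge list, the `DATASETS` set, and the `collapse_tasks` helper are all hypothetical names, and the edges simply mirror the chain from the question.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges mirroring the chain in the question.
EDGES = [
    ("Postgres Table", "Task A"),
    ("Task A", "Task B"),
    ("Task B", "Task C"),
    ("Task C", "Task D"),
    ("Task D", "Hive Table"),
]
# Assumption: every node not listed here is an Airflow task.
DATASETS = {"Postgres Table", "Hive Table"}


def collapse_tasks(edges, datasets):
    """Return {(upstream_dataset, downstream_dataset): tasks_in_between}."""
    downstream = defaultdict(list)
    for src, dst in edges:
        downstream[src].append(dst)

    collapsed = {}
    for start in datasets:
        # Breadth-first search that walks through task nodes only
        # and stops as soon as it reaches the next dataset.
        queue = deque((node, 0) for node in downstream[start])
        seen = set()
        while queue:
            node, tasks_crossed = queue.popleft()
            if node in seen:
                continue
            seen.add(node)
            if node in datasets:
                # BFS reaches the shortest path first, so keep the first count.
                collapsed.setdefault((start, node), tasks_crossed)
            else:
                queue.extend((nxt, tasks_crossed + 1) for nxt in downstream[node])
    return collapsed


if __name__ == "__main__":
    for (src, dst), hops in collapse_tasks(EDGES, DATASETS).items():
        print(f"{src} -> {dst} (via {hops} tasks)")
```

This keeps the full task-level lineage as the source of truth and only derives the shorter view for reading, so nothing is lost if the task detail is needed later.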