Hello everyone,
How does DataHub create the lineage for Redshift objects? In particular, I want to know where DataHub retrieves the information for the lineage between the S3 files and the Redshift table. Is there a particular view that is being ingested? Or is DataHub parsing the queries on the table?
Thank you, @dazzling-judge-80093! Follow-up question: what determines which files get used? It appears to be the same file in different folders (one for each day/time that it is uploaded).
dazzling-judge-80093
11/08/2022, 6:19 PM
we join this table with the query history
swift-plastic-79414
11/08/2022, 6:49 PM
Is there anything that can be done to filter/hide the duplicates?
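For illustration only, here is a minimal sketch of one way to collapse per-day copies of the same file into a single entry. It is not a DataHub feature or setting: the function names, the `DATE_SEGMENT` pattern, and the example bucket layout are all hypothetical, and it assumes the only difference between duplicates is a date- or timestamp-like folder in the path.

```python
import re

# Hypothetical pattern for a folder segment that looks like a date or
# timestamp, e.g. 2022-11-08, 20221108, or 20221108T181900.
DATE_SEGMENT = re.compile(
    r"^\d{4}-?\d{2}-?\d{2}([T_-]?\d{2}[-:]?\d{2}([-:]?\d{2})?)?$"
)

def normalize_s3_path(path: str) -> str:
    """Drop date-like folder segments so daily copies share one key."""
    scheme, sep, rest = path.partition("://")
    if not sep:
        # No URI scheme present: treat the whole string as the path.
        scheme, rest = "", path
    segments = [s for s in rest.split("/") if not DATE_SEGMENT.match(s)]
    return scheme + sep + "/".join(segments)

def dedupe_paths(paths):
    """Keep the first path seen for each normalized key, in input order."""
    seen = {}
    for p in paths:
        seen.setdefault(normalize_s3_path(p), p)
    return list(seen.values())

# Made-up example paths, not taken from any real bucket.
paths = [
    "s3://bucket/exports/2022-11-07/orders.csv",
    "s3://bucket/exports/2022-11-08/orders.csv",
    "s3://bucket/exports/2022-11-08/users.csv",
]
unique = dedupe_paths(paths)
```

With the made-up paths above, the two `orders.csv` copies share a normalized key, so only the first one is kept alongside `users.csv`.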