# troubleshoot
l
Hello team, when we ingest fine-grained lineage into DataHub using the Python SDK, what should `dataset` be in the following line: `upstream = Upstream(dataset= , type=DatasetLineageType.TRANSFORMED)`? Is it all of the upstream tables found while computing lineage, or one particular table? What is the purpose of this field? Could anyone please help me with this?
b
hey Priya! yeah, great question. The `UpstreamLineage` aspect for a dataset includes two fields: `upstreams` and `fineGrainedLineages`. `upstreams` is a list of all the datasets that are upstream of the specific dataset this aspect belongs to. This is at the entity level. `fineGrainedLineages` contains a list of column-level lineages, each object describing relationships between columns (one-to-one, one-to-many, many-to-one, or many-to-many). For the UI, we assume that any columns from other datasets have those other datasets in the list of `upstreams` as well. So, to specifically answer your question: the `dataset` in an `Upstream` is a single upstream dataset URN, at the entity level, that is upstream of the dataset this aspect is for.
does that answer your question?
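A rough sketch of that rule, in plain Python rather than the SDK: given the column-level lineages, collect the datasets that own any upstream column, since each of those needs its own entry in `upstreams`. The dict shape, the example URNs, and the helper names (`field_urn`, `dataset_urn_of`, `upstream_dataset_urns`) are made up for illustration and are not part of the DataHub SDK. The sketch also assumes column names contain no commas.

```python
def field_urn(dataset_urn: str, column: str) -> str:
    """Build a schemaField URN for a column of the given dataset."""
    return f"urn:li:schemaField:({dataset_urn},{column})"


def dataset_urn_of(schema_field_urn: str) -> str:
    """Return the parent dataset URN embedded in a schemaField URN."""
    prefix = "urn:li:schemaField:("
    if not (schema_field_urn.startswith(prefix) and schema_field_urn.endswith(")")):
        raise ValueError(f"not a schemaField URN: {schema_field_urn}")
    inner = schema_field_urn[len(prefix):-1]
    # Assumption: the column name has no comma, so the last comma
    # separates the dataset URN from the column.
    dataset_urn, _, _column = inner.rpartition(",")
    return dataset_urn


def upstream_dataset_urns(column_lineages: list) -> list:
    """Collect, in first-seen order, every dataset that owns an upstream column."""
    seen = {}
    for lineage in column_lineages:
        for column_urn in lineage["upstreams"]:
            seen.setdefault(dataset_urn_of(column_urn), None)
    return list(seen)


# Hypothetical example datasets.
ORDERS = "urn:li:dataset:(urn:li:dataPlatform:hive,db.orders,PROD)"
CUSTOMERS = "urn:li:dataset:(urn:li:dataPlatform:hive,db.customers,PROD)"
REPORT = "urn:li:dataset:(urn:li:dataPlatform:hive,db.report,PROD)"

# Plain-dict stand-ins for column-level lineage entries on REPORT.
column_lineages = [
    {"upstreams": [field_urn(ORDERS, "amount")],
     "downstreams": [field_urn(REPORT, "total")]},
    {"upstreams": [field_urn(ORDERS, "customer_id"), field_urn(CUSTOMERS, "id")],
     "downstreams": [field_urn(REPORT, "customer")]},
]

datasets = upstream_dataset_urns(column_lineages)
for urn in datasets:
    print(urn)
```

Each URN in `datasets` would then be the `dataset` argument of one `Upstream(dataset=..., type=DatasetLineageType.TRANSFORMED)`.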
l
Yes, thank you so much @bulky-soccer-26729. Please correct me if I am wrong: `upstreams` should be all upstream datasets of the `entityUrn` in the `MetadataChangeProposalWrapper`, and `fineGrainedLineages` should be the complete list of column-level lineages?
b
yup exactly! as with `upstreams`, `fineGrainedLineages` is a complete list of column-level lineages in the upstream direction. So the `downstreams` field on `FineGrainedLineage` will be the column(s) from the entity for this aspect, and the `upstreams` will be the column(s) upstream of them on other datasets, which also exist at the entity level in `upstreams`.
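One way to sanity-check that direction before emitting, sketched in plain Python: every downstream column should belong to the entity the aspect is for, and every upstream column should belong to a dataset listed at the entity level. The dict shape and the names `parent_dataset` and `check_lineage` are hypothetical stand-ins, not SDK types, and the parsing assumes column names contain no commas.

```python
def parent_dataset(schema_field_urn: str) -> str:
    """Extract the dataset URN from 'urn:li:schemaField:(<dataset urn>,<column>)'."""
    prefix = "urn:li:schemaField:("
    if not (schema_field_urn.startswith(prefix) and schema_field_urn.endswith(")")):
        raise ValueError(f"not a schemaField URN: {schema_field_urn}")
    # Assumption: no comma in the column name, so split on the last comma.
    return schema_field_urn[len(prefix):-1].rpartition(",")[0]


def check_lineage(entity_urn: str, upstream_datasets: list, column_lineages: list) -> list:
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    for i, lineage in enumerate(column_lineages):
        for col in lineage["downstreams"]:
            # Downstream columns must live on the entity this aspect is for.
            if parent_dataset(col) != entity_urn:
                problems.append(f"lineage {i}: downstream {col} is not on {entity_urn}")
        for col in lineage["upstreams"]:
            # Upstream columns must live on a dataset listed in upstreams.
            if parent_dataset(col) not in upstream_datasets:
                problems.append(f"lineage {i}: dataset of {col} is missing from upstreams")
    return problems


# Hypothetical example URNs.
SRC = "urn:li:dataset:(urn:li:dataPlatform:hive,db.src,PROD)"
OTHER = "urn:li:dataset:(urn:li:dataPlatform:hive,db.other,PROD)"
DST = "urn:li:dataset:(urn:li:dataPlatform:hive,db.dst,PROD)"

good = [{"upstreams": [f"urn:li:schemaField:({SRC},a)"],
         "downstreams": [f"urn:li:schemaField:({DST},x)"]}]
bad = [{"upstreams": [f"urn:li:schemaField:({OTHER},b)"],
        "downstreams": [f"urn:li:schemaField:({DST},y)"]}]

good_problems = check_lineage(DST, [SRC], good)
bad_problems = check_lineage(DST, [SRC], bad)
print(good_problems)
print(bad_problems)
```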
l
Thank you so much @bulky-soccer-26729 for helping me. It works.
b
of course, glad I could help!