# troubleshoot
👋🏽 Hi All! I am seeing weird behavior with lineage, and wondering if it's a bug or something I'm misunderstanding. We have a couple of datasets with multiple versions of lineage that we have ingested over time. When looking at a URN's history, we see that there are 4 versions (details in thread), and the most recently created version is not the one with the largest version number. Is that perhaps a bug with how we're ingesting data?
Query to fetch the urn info:
select version, createdon from metadata_aspect_v2 where urn = "urn:li:dataset:(URN NAME HERE)" and aspect='upstreamLineage'
returns:
+---------+----------------------------+
| version | createdon                  |
+---------+----------------------------+
|       0 | 2022-08-19 08:05:43.900000 |
|       1 | 2022-04-09 00:42:35.781000 |
|       2 | 2022-04-28 08:07:35.735000 |
|       3 | 2022-08-16 08:07:22.181000 |
+---------+----------------------------+
The correct version of lineage should be version 0, but the GraphQL endpoint returns version 3.
Any tips would be appreciated!
What I'm observing is that lastObserved is updating for version 0, and it now contains the latest timestamp. So, without knowing what the code is doing, I'm guessing that the lineage lookup is ignoring lastObserved in favor of max(version). Does that seem right? If someone could point me to where to validate that, I'm happy to contribute here.
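To make the two candidate rules concrete, here is a minimal sketch that loads the four rows from the query output into an in-memory SQLite table and compares "newest by createdon" against "max(version)". This is only an illustration of the hypothesis, not DataHub's actual lookup code; the table name `aspect` and the helper `pick_versions` are hypothetical, and the table is a simplified stand-in for metadata_aspect_v2.

```python
import sqlite3

# Rows copied from the query output earlier in the thread: (version, createdon).
ROWS = [
    (0, "2022-08-19 08:05:43.900000"),
    (1, "2022-04-09 00:42:35.781000"),
    (2, "2022-04-28 08:07:35.735000"),
    (3, "2022-08-16 08:07:22.181000"),
]


def pick_versions(rows):
    """Return (version with the newest createdon, highest version number)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical, simplified stand-in for metadata_aspect_v2.
    conn.execute("CREATE TABLE aspect (version INTEGER, createdon TEXT)")
    conn.executemany("INSERT INTO aspect VALUES (?, ?)", rows)

    # Rule A: treat the most recently created row as the latest.
    # The timestamps share one fixed-width format, so text ordering
    # matches chronological ordering.
    newest = conn.execute(
        "SELECT version FROM aspect ORDER BY createdon DESC LIMIT 1"
    ).fetchone()[0]

    # Rule B: treat the highest version number as the latest.
    highest = conn.execute("SELECT MAX(version) FROM aspect").fetchone()[0]

    conn.close()
    return newest, highest


newest, highest = pick_versions(ROWS)
print(f"newest by createdon: version {newest}")
print(f"max(version):        version {highest}")
```

On this data the two rules disagree (version 0 versus version 3), which matches the mismatch between what the table shows and what the GraphQL endpoint returned.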
Deleting and re-creating the OpenSearch indices seems to have fixed this; not sure what caused it 🤷🏽
One other possible cause is that one of the Kafka topics (PlatformEvent_v1) hadn't been created. After creating that topic, things seemed to start flowing more normally. So the "fix" could have been either of these two changes.