# ui
n
hi! congrats on shipping 0.9.0, the column level lineage ui is awesome!! we have updated our DataHub as we were already prepared for this ui enhancement (at a db level). unfortunately, we are not able to see the column level lineage, although it is already stored in our db
these are my lineages from the graphql request from the UI
pretty similar to the demo ui you have provided!
any clue?
b
hey Alberto! it looks like the urns in your
fineGrainedLineages
are actually dataset urns instead of schema field urns. in order to display column level lineage we're expecting those urns to map to a specific schema field, so the urn would look like this, for example:
urn:li:schemaField:(urn:li:dataset:(urn:li:dataPlatform:athena,pro_arg_cliente360.a_collectivos,PROD),field_path)
how are you getting
fineGrainedLineages
into your db?
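A minimal sketch of the difference between the two urn shapes mentioned above, using plain string handling only. The helper names `build_schema_field_urn` and `is_schema_field_urn` are hypothetical and written for this illustration; the DataHub Python SDK has its own urn helpers, which would be the better choice in real code.

```python
def build_schema_field_urn(dataset_urn: str, field_path: str) -> str:
    """Wrap a dataset urn and a field path into a schemaField urn."""
    return f"urn:li:schemaField:({dataset_urn},{field_path})"


def is_schema_field_urn(urn: str) -> bool:
    """Return True if the urn points at a schema field, not a whole dataset."""
    return urn.startswith("urn:li:schemaField:(") and urn.endswith(")")


# Dataset urn taken from the example in this thread.
dataset_urn = (
    "urn:li:dataset:(urn:li:dataPlatform:athena,"
    "pro_arg_cliente360.a_collectivos,PROD)"
)
field_urn = build_schema_field_urn(dataset_urn, "field_path")

print(field_urn)
print(is_schema_field_urn(dataset_urn))
print(is_schema_field_urn(field_urn))
```

A check like `is_schema_field_urn` could be run over the urns stored in `fineGrainedLineages` to spot entries that point at datasets instead of fields.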
actually ignore my first comment, I believe I was thinking of the wrong thing!
when you go to the lineage visualization, are you able to see the toggle saying "Show Columns"? and if so, when you click it are you seeing your columns and just not the relationships between them you would expect?
n
yeah, I toggle the column level
and it displays the table columns
but the relationship arrows are only between tables
not fields
b
sorry I've been caught up in something - just letting you know I haven't forgotten about you and will be back soon!
n
no problem!! thank you Chris
hi @bulky-soccer-26729! any ideas so far?
b
hey Alberto! i think i might have an idea. can I see what your schemaMetadata looks like for the table in question here?
n
hi!
here it goes
message has been deleted
if you need it in plain text, please let me know
b
ahh yup just as I suspected - we actually just discovered an issue with column level lineage and V2 field paths and I just pushed a fix for this! so if you pull from
head
I believe your issue should be solved. otherwise it will come out in the next release. so sorry about the inconvenience!
n
oh, I am deploying directly from the hub
so I guess I have to wait
but that sounds good enough for me!! thank you @bulky-soccer-26729
waiting eagerly for that 🙂
b
okay gotcha, yeah we should be issuing a new release within the next week or two at the latest!
just curious - what does your column level lineage look like in your database and how did you get it in there? (speaking of the
fineGrainedLineage
property on the
upstreamLineage
aspect)
n
hi!
we are populating those fields in two ways
1. with an adhoc script using the python sdk
2. in an automated way using a Spark listener every time one of our ETLs finishes
our idea is to end up using only the second one!
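For readers following along, here is a rough sketch of the shape of an `upstreamLineage` aspect carrying `fineGrainedLineages`, built as a plain Python dict. It only illustrates the structure, assuming the field names `upstreamType`, `downstreamType`, `upstreams` and `downstreams`; it is not a complete aspect (audit information is left out), the dataset and column names are made up, and the function names are hypothetical. It is not the script used in this thread.

```python
import json


def schema_field_urn(dataset_urn, field_path):
    """Build a schemaField urn from a dataset urn and a field path."""
    return f"urn:li:schemaField:({dataset_urn},{field_path})"


def build_upstream_lineage(upstream_dataset, downstream_dataset, column_pairs):
    """Return a dict shaped like an upstreamLineage aspect.

    column_pairs is a list of (upstream_column, downstream_column) tuples.
    """
    return {
        "upstreams": [{"dataset": upstream_dataset, "type": "TRANSFORMED"}],
        "fineGrainedLineages": [
            {
                "upstreamType": "FIELD_SET",
                "upstreams": [schema_field_urn(upstream_dataset, up)],
                "downstreamType": "FIELD",
                "downstreams": [schema_field_urn(downstream_dataset, down)],
            }
            for up, down in column_pairs
        ],
    }


# Hypothetical datasets and columns, for illustration only.
source = "urn:li:dataset:(urn:li:dataPlatform:athena,example_db.source_table,PROD)"
target = "urn:li:dataset:(urn:li:dataPlatform:athena,example_db.target_table,PROD)"
aspect = build_upstream_lineage(source, target, [("id", "customer_id")])

print(json.dumps(aspect, indent=2))
```

The point to notice is that every entry under `fineGrainedLineages` refers to schema field urns, while the table-level `upstreams` entry refers to the dataset urn.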
hi @bulky-soccer-26729! I have just updated to 0.9.1
but now the dataset view is completely broken 😞
message has been deleted
the getDataset GraphQL operation returns a 503
looking in the GMS logs
I see a lot of exceptions like
11:25:47.238 [ForkJoinPool.commonPool-worker-399] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler:21 - Failed to execute DataFetcher
java.util.concurrent.CompletionException: java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
	at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
	at java.base/java.util.concurrent.ForkJoinPool.tryCompensate(ForkJoinPool.java:1575)
	at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3115)
	at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1823)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
	at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:100)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:62)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:47)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:34)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:39)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.runQuery(Neo4jGraphService.java:335)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.findRelatedEntities(Neo4jGraphService.java:160)
	at com.linkedin.metadata.graph.JavaGraphClient.getRelatedEntities(JavaGraphClient.java:41)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.fetchEntityRelationships(EntityRelationshipsResultResolver.java:66)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.lambda$get$0(EntityRelationshipsResultResolver.java:46)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	... 6 common frames omitted
am I doing something wrong?
b
oh dang! hmm yeah something seems to be off on your deployment with GMS.. is all of datahub broken for you or only the dataset page? could you try restarting your GMS pod?
n
it seems to be only the dataset detail view
I will!
same stuff 😞
b
hmm so this is just dataset view? any other places where you see an error? also are there any more errors or more to the error above? from that it's hard to tell what's going on
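One way to answer the "are there any more errors" question quickly is to count the distinct exception types in a saved copy of the GMS log. The sketch below is an assumption-laden helper, not a DataHub tool: `summarize_exceptions` is a hypothetical name, and it assumes Java-style traces where the first line of each trace starts with the exception class name. It reports the innermost exception named on that line.

```python
import re
from collections import Counter

# A trace's first line starts with a fully qualified exception class name.
TRACE_START = re.compile(r"^[\w.$]+(?:Exception|Error)\b")
# Any exception class name appearing on that line (outer wrapper and its cause).
CLASS_NAME = re.compile(r"[\w.$]+(?:Exception|Error)\b")


def summarize_exceptions(log_text):
    """Count the innermost exception named on the first line of each trace."""
    counts = Counter()
    for line in log_text.splitlines():
        if TRACE_START.match(line):
            names = CLASS_NAME.findall(line)
            counts[names[-1]] += 1
    return counts


# Shortened sample in the same shape as the logs in this thread.
sample = """\
11:25:47.238 [worker-399] ERROR Handler:21 - Failed to execute DataFetcher
java.util.concurrent.CompletionException: java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
\tat java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
Caused by: java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
14:40:06.594 [worker-3] ERROR Handler:21 - Failed to execute DataFetcher
java.util.concurrent.CompletionException: java.lang.StackOverflowError
"""

counts = summarize_exceptions(sample)
for name, count in counts.most_common():
    print(count, name)
```

Lines beginning with `Caused by:` are skipped on purpose so each trace is counted once.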
n
14:40:06.609 [ForkJoinPool.commonPool-worker-495] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler:21 - Failed to execute DataFetcher
java.util.concurrent.CompletionException: java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
	at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
	at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by: java.util.concurrent.RejectedExecutionException: Thread limit exceeded replacing blocked worker
	at java.base/java.util.concurrent.ForkJoinPool.tryCompensate(ForkJoinPool.java:1575)
	at java.base/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3115)
	at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1823)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
	at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:100)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:62)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:47)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:34)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:39)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.runQuery(Neo4jGraphService.java:335)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.findRelatedEntities(Neo4jGraphService.java:160)
	at com.linkedin.metadata.graph.JavaGraphClient.getRelatedEntities(JavaGraphClient.java:41)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.fetchEntityRelationships(EntityRelationshipsResultResolver.java:66)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.lambda$get$0(EntityRelationshipsResultResolver.java:46)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	... 6 common frames omitted
and a new one
14:40:06.656 [ForkJoinPool.commonPool-worker-495] INFO  c.l.m.graph.neo4j.Neo4jGraphService:150 - MATCH (src {urn:"urn:li:glossaryNode:Technical"})<-[r:IsPartOf ]-(dest )
14:40:06.594 [ForkJoinPool.commonPool-worker-3] ERROR c.l.d.g.e.DataHubDataFetcherExceptionHandler:21 - Failed to execute DataFetcher
java.util.concurrent.CompletionException: java.lang.StackOverflowError
	at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:314)
	at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:319)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1702)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.helpAsyncBlocker(ForkJoinPool.java:1144)
	at java.base/java.util.concurrent.ForkJoinPool.helpAsyncBlocker(ForkJoinPool.java:3151)
	at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1817)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
	at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:100)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:62)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:47)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:34)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:39)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.runQuery(Neo4jGraphService.java:335)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.findRelatedEntities(Neo4jGraphService.java:160)
	at com.linkedin.metadata.graph.JavaGraphClient.getRelatedEntities(JavaGraphClient.java:41)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.fetchEntityRelationships(EntityRelationshipsResultResolver.java:66)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.lambda$get$0(EntityRelationshipsResultResolver.java:46)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.helpAsyncBlocker(ForkJoinPool.java:1144)
	at java.base/java.util.concurrent.ForkJoinPool.helpAsyncBlocker(ForkJoinPool.java:3151)
	at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1817)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
	at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:100)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:62)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:47)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:34)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:39)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.runQuery(Neo4jGraphService.java:335)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.findRelatedEntities(Neo4jGraphService.java:160)
	at com.linkedin.metadata.graph.JavaGraphClient.getRelatedEntities(JavaGraphClient.java:41)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.fetchEntityRelationships(EntityRelationshipsResultResolver.java:66)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.lambda$get$0(EntityRelationshipsResultResolver.java:46)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.helpAsyncBlocker(ForkJoinPool.java:1144)
	at java.base/java.util.concurrent.ForkJoinPool.helpAsyncBlocker(ForkJoinPool.java:3151)
	at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1817)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
	at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:100)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:62)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:47)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:34)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:39)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.runQuery(Neo4jGraphService.java:335)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.findRelatedEntities(Neo4jGraphService.java:160)
	at com.linkedin.metadata.graph.JavaGraphClient.getRelatedEntities(JavaGraphClient.java:41)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.fetchEntityRelationships(EntityRelationshipsResultResolver.java:66)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.lambda$get$0(EntityRelationshipsResultResolver.java:46)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.helpAsyncBlocker(ForkJoinPool.java:1144)
	at java.base/java.util.concurrent.ForkJoinPool.helpAsyncBlocker(ForkJoinPool.java:3151)
	at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1817)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
	at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:100)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:62)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:47)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:34)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:39)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.runQuery(Neo4jGraphService.java:335)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.findRelatedEntities(Neo4jGraphService.java:160)
	at com.linkedin.metadata.graph.JavaGraphClient.getRelatedEntities(JavaGraphClient.java:41)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.fetchEntityRelationships(EntityRelationshipsResultResolver.java:66)
	at com.linkedin.datahub.graphql.resolvers.load.EntityRelationshipsResultResolver.lambda$get$0(EntityRelationshipsResultResolver.java:46)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700)
	at java.base/java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1692)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.helpAsyncBlocker(ForkJoinPool.java:1144)
	at java.base/java.util.concurrent.ForkJoinPool.helpAsyncBlocker(ForkJoinPool.java:3151)
	at java.base/java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1817)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1998)
	at org.neo4j.driver.internal.util.Futures.blockingGet(Futures.java:100)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:62)
	at org.neo4j.driver.internal.InternalSession.run(InternalSession.java:47)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:34)
	at org.neo4j.driver.internal.AbstractQueryRunner.run(AbstractQueryRunner.java:39)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.runQuery(Neo4jGraphService.java:335)
	at com.linkedin.metadata.graph.neo4j.Neo4jGraphService.findRelatedEntities(Neo4jGraphService.java:160)
	at com.linkedin.metadata.graph.JavaGraphClient.getRelatedEntities(JavaGraphClient.java:41)
I think it's due to how the data is in my db, right?
other dataset views are ok
those without lineage actually
b
yeah that would be my guess.. something is causing a stack overflow error when trying to fetch lineage data for your dataset. has your data changed at all for this entity? and is it all entities that have lineage or just some that may have something broken going on in them?
n
just some of them
I haven't changed anything actually
but I guess I have some problem with the lineage of that dataset in particular
I will look at it
thank you Chris!
b
of course! let me know if you find out what's going on or if you need any more help 🙂