# ui
w
I’m trying to figure out why some of my datasets are throwing exceptions when rendering in the UI. Most seem to load correctly, but some throw a large series of exceptions, e.g.:
Exception while fetching data (/dataset/upstreamLineage/upstreams[0]/dataset/downstreamLineage/downstreams[1]/dataset) : java.lang.RuntimeException: Failed to retrieve entities of type Dataset
• Assuming my ingest code is correct, all referenced datasets should exist
• I am able to query the problematic dataset via the REST API
• I am able to load the referenced upstream in the UI (e.g. loading up the `upstream[0]` dataset)
• I checked the `datahub-frontend-react` logs and put the stack trace in the thread 👇

What are some good methods to investigate this?
Example `datahub-frontend-react` log stack trace:
19:47:25 [ForkJoinPool.commonPool-worker-1] WARN  n.g.e.SimpleDataFetcherExceptionHandler - Exception while fetching data (/dataset/downstreamLineage/downstreams[75]/dataset/upstreamLineage/upstreams[0]/dataset) : java.lang.RuntimeException: Failed to retrieve entities of type Dataset
java.util.concurrent.CompletionException: java.lang.RuntimeException: Failed to retrieve entities of type Dataset
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1592)
	at java.util.concurrent.CompletableFuture$AsyncSupply.exec(CompletableFuture.java:1582)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.lang.RuntimeException: Failed to retrieve entities of type Dataset
	at com.linkedin.datahub.graphql.GmsGraphQLEngine.lambda$null$42(GmsGraphQLEngine.java:368)
	at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
	... 5 common frames omitted
Caused by: java.lang.RuntimeException: Failed to batch load Datasets
	at com.linkedin.datahub.graphql.types.dataset.DatasetType.batchLoad(DatasetType.java:92)
	at com.linkedin.datahub.graphql.GmsGraphQLEngine.lambda$null$42(GmsGraphQLEngine.java:366)
	... 6 common frames omitted
Caused by: com.linkedin.r2.RemoteInvocationException: com.linkedin.r2.RemoteInvocationException: Received error 414 from server for URI <http://datahub-gms:8080/datasets>
	at com.linkedin.restli.internal.client.ExceptionUtil.wrapThrowable(ExceptionUtil.java:135)
	at com.linkedin.restli.internal.client.ResponseFutureImpl.getResponseImpl(ResponseFutureImpl.java:130)
	at com.linkedin.restli.internal.client.ResponseFutureImpl.getResponse(ResponseFutureImpl.java:94)
	at com.linkedin.restli.internal.client.ResponseFutureImpl.getResponseEntity(ResponseFutureImpl.java:173)
	at com.linkedin.dataset.client.Datasets.batchGet(Datasets.java:233)
	at com.linkedin.datahub.graphql.types.dataset.DatasetType.batchLoad(DatasetType.java:79)
	... 7 common frames omitted
Caused by: com.linkedin.r2.RemoteInvocationException: Received error 414 from server for URI <http://datahub-gms:8080/datasets>
	at com.linkedin.restli.internal.client.ExceptionUtil.exceptionForThrowable(ExceptionUtil.java:98)
	at com.linkedin.restli.client.RestLiCallbackAdapter.convertError(RestLiCallbackAdapter.java:66)
	at com.linkedin.common.callback.CallbackAdapter.onError(CallbackAdapter.java:86)
	at com.linkedin.r2.message.timing.TimingCallback.onError(TimingCallback.java:81)
	at com.linkedin.r2.transport.common.bridge.client.TransportCallbackAdapter.onResponse(TransportCallbackAdapter.java:47)
	at com.linkedin.r2.filter.transport.FilterChainClient.lambda$createWrappedClientTimingCallback$0(FilterChainClient.java:113)
	at com.linkedin.r2.filter.transport.ResponseFilter.onRestError(ResponseFilter.java:79)
	at com.linkedin.r2.filter.TimedRestFilter.onRestError(TimedRestFilter.java:92)
	at com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnError(FilterChainIterator.java:166)
	at com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnError(FilterChainIterator.java:132)
	at com.linkedin.r2.filter.FilterChainIterator.onError(FilterChainIterator.java:101)
	at com.linkedin.r2.filter.TimedNextFilter.onError(TimedNextFilter.java:48)
	at com.linkedin.r2.filter.message.rest.RestFilter.onRestError(RestFilter.java:84)
	at com.linkedin.r2.filter.TimedRestFilter.onRestError(TimedRestFilter.java:92)
	at com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnError(FilterChainIterator.java:166)
	at com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnError(FilterChainIterator.java:132)
	at com.linkedin.r2.filter.FilterChainIterator.onError(FilterChainIterator.java:101)
	at com.linkedin.r2.filter.TimedNextFilter.onError(TimedNextFilter.java:48)
	at com.linkedin.r2.filter.message.rest.RestFilter.onRestError(RestFilter.java:84)
	at com.linkedin.r2.filter.TimedRestFilter.onRestError(TimedRestFilter.java:92)
	at com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnError(FilterChainIterator.java:166)
	at com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnError(FilterChainIterator.java:132)
	at com.linkedin.r2.filter.FilterChainIterator.onError(FilterChainIterator.java:101)
	at com.linkedin.r2.filter.TimedNextFilter.onError(TimedNextFilter.java:48)
	at com.linkedin.r2.filter.message.rest.RestFilter.onRestError(RestFilter.java:84)
	at com.linkedin.r2.filter.TimedRestFilter.onRestError(TimedRestFilter.java:92)
	at com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnError(FilterChainIterator.java:166)
	at com.linkedin.r2.filter.FilterChainIterator$FilterChainRestIterator.doOnError(FilterChainIterator.java:132)
	at com.linkedin.r2.filter.FilterChainIterator.onError(FilterChainIterator.java:101)
	at com.linkedin.r2.filter.TimedNextFilter.onError(TimedNextFilter.java:48)
	at com.linkedin.r2.filter.transport.ClientRequestFilter.lambda$createCallback$0(ClientRequestFilter.java:102)
	at com.linkedin.r2.transport.http.common.HttpBridge$1.onResponse(HttpBridge.java:82)
	at com.linkedin.r2.transport.http.client.rest.ExecutionCallback.lambda$onResponse$0(ExecutionCallback.java:64)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.linkedin.r2.message.rest.RestException: Received error 414 from server for URI <http://datahub-gms:8080/datasets>
	at com.linkedin.r2.transport.http.common.HttpBridge$1.onResponse(HttpBridge.java:76)
	... 4 common frames omitted
b
Does the dataset you were browsing have a ton of downstreams?
414 means that we are sending a request URI to GMS that is too long
When we invoke a "batch load" of datasets against GMS, Rest.li embeds all of the requested urns in the URI (IIRC)
So I think this is a scale issue: we need to somehow tell Rest.li to put those urns in the request body (or implement that ourselves)
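To make the 414 concrete, here is a rough sketch of how the URI grows when every requested urn is embedded in the query string. This is not DataHub's actual client code; the URN values, count, and `ids=` query shape are illustrative assumptions:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class UriLengthDemo {
    public static void main(String[] args) {
        // Hypothetical dataset URNs of a realistic length (~80-90 chars each).
        List<String> urns = new ArrayList<>();
        for (int i = 0; i < 200; i++) {
            urns.add("urn:li:dataset:(urn:li:dataPlatform:hive,warehouse.some_db.some_table_" + i + ",PROD)");
        }
        // A batch GET in this style puts every id into the query string,
        // and URL-encoding inflates each urn further (':' -> %3A, ',' -> %2C, ...).
        String query = urns.stream()
                .map(u -> URLEncoder.encode(u, StandardCharsets.UTF_8))
                .collect(Collectors.joining("&ids=", "ids=", ""));
        String uri = "http://datahub-gms:8080/datasets?" + query;
        System.out.println("URI length: " + uri.length());
        // Many HTTP servers reject request lines longer than ~8 KB with a 414.
        System.out.println("Exceeds 8KB limit: " + (uri.length() > 8192));
    }
}
```

With a few hundred realistic dataset URNs the request line easily passes the roughly 8 KB limit many HTTP servers enforce, which matches the 414 in the stack trace above.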
w
Oh it’s definitely one of our larger ones. To give you an idea about scale, we have 6k+ datasets
m
we are running into a similar issue whenever we try to load a dataset with a lot of upstreams/downstreams
b
Okay let us look into alternatives here
we fetch all of the downstreams for a dataset in one go. We may need to implement pagination here
w
Thanks for looking into this! Let me know if you’d like me to make an issue or help with testing
b
We've managed to reproduce 😛
m
outside of this… but maybe the golden test dataset needs to grow a bit… 🤔
b
Agreed! Would not be opposed to that. Would you like to make the changes?
g
Hey @millions-engineer-56536 and @worried-sundown-63248, we have made some performance improvements to the lineage graph data fetching. Can you try pulling the latest master and seeing how the lineage viz works now?
w
I pulled the latest changes, rebuilt with gradle, and ran the dev docker setup + separate yarn build for the react UI, but I’m still getting the error. Would I need to re-ingest the data? I can try nuking everything and trying again
b
You shouldn't need to. Do you have any idea how many downstream entities are being loaded for the failing guy?
m
I won’t be able to get to this till Monday
g
Hey folks, another update here. We have a fix to batch get the ids via POST, but found the generated rest.li clients do not support POST-ing a batch get request. We filed an issue on rest.li here: https://github.com/linkedin/rest.li/issues/603 and should have things resolved soon.
b
@green-football-43791 Are we tracking an issue on our side?
g
w
Thanks for opening the issues 👍
g
Update here @worried-sundown-63248 @millions-engineer-56536 as a temporary solution, we are going to batch load downstream entities from gms
pr here:
after this is merged your issue should be resolved
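For anyone curious what batching the downstream loads might look like, here is a minimal hypothetical sketch: split the urn list into fixed-size chunks so each batch GET stays well under the URI length limit, then merge the results. The chunk size, helper names, and stubbed fetcher are all assumptions, not the actual PR:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ChunkedBatchLoad {
    // Split a large list into fixed-size chunks.
    static <T> List<List<T>> partition(List<T> items, int size) {
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            chunks.add(items.subList(i, Math.min(i + size, items.size())));
        }
        return chunks;
    }

    // Hypothetical loader: fetch each chunk separately and merge the results,
    // so no single request carries more than `chunkSize` urns in its URI.
    static Map<String, String> batchLoad(List<String> urns,
                                         Function<List<String>, Map<String, String>> fetchChunk) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (List<String> chunk : partition(urns, 25)) {
            merged.putAll(fetchChunk.apply(chunk));
        }
        return merged;
    }

    public static void main(String[] args) {
        List<String> urns = new ArrayList<>();
        for (int i = 0; i < 103; i++) urns.add("urn:li:dataset:ds_" + i);
        // Stub fetcher that just echoes each urn; a real one would call GMS.
        Map<String, String> result = batchLoad(urns, chunk -> {
            Map<String, String> m = new LinkedHashMap<>();
            chunk.forEach(u -> m.put(u, "entity:" + u));
            return m;
        });
        System.out.println("chunks=" + partition(urns, 25).size() + " loaded=" + result.size());
        // prints: chunks=5 loaded=103
    }
}
```

The trade-off is more round trips to GMS in exchange for bounded request sizes; the POST-based batch get tracked in the rest.li issue above would remove the limit entirely.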
m
thank you
b
Thanks Gabe!
w
I pulled the fork and rebuilt, but I’m still getting issues. The dev docker setup pulls images built from the main branch, so maybe I’ll test it after the PR is merged
b
Same error?
w
The react client error is slightly different (`Exception while fetching data (/dataset/upstreamLineage/entities[0]/entity) : java.lang.RuntimeException: Failed to retrieve entities of type Dataset`), but the frontend is still giving:
`Caused by: com.linkedin.r2.message.rest.RestException: Received error 414 from server for URI <http://datahub-gms:8080/datasets>`
g
@worried-sundown-63248 I’m surprised you’re still getting the 414 error after batching. One step you could take to help me understand the issue would be to run the frontend in debug mode (you may also need to run gms locally for it to work) and put a breakpoint in `BatchGetUtils`
If you could let me know what the request(s) being constructed look like, that would be very helpful!
I can also pair with you later today (5pm PST) or tomorrow morning to investigate- let me know if either of those times work.
b
Yeah --would be great to pair so we can get to the bottom of things
w
I nuked, rebuilt, and re-ingested, and I’m still getting the error, so let’s pair up. I’m EST, so 8pm is a little late tonight, but does 9am PST tomorrow work? I’m also free after that
g
Yep- let’s sync 9am tomorrow 👍
m
I’m not sure if you guys were able to get to the bottom of this… but we were finally able to test the changes last night and are still seeing an error
g
Hey- we were able to determine the fix
I'll get it out by EOD today 👍
m
ok cool
g
Great to hear @millions-engineer-56536!