# troubleshoot
d
Hey team 👋🏻 I'm not sure if this is valuable feedback, but we observed that DataHub fails to return the impact analysis (transitive downstream consumers) of an entity if that entity has more than around 3K downstreams. The effect becomes more visible with entities that have 5-7K downstream dependencies. I know these numbers seem very high, but I think they make sense when you want the downstreams of a very core table that also feeds Looker (Looker adds many entity types, such as dashboards). I don't think the infrastructure we serve DataHub on is the bottleneck here, but that is always a possibility.
BTW, as a workaround, we fetch the downstreams level by level and implement BFS/DFS in our application layer for such tables (see the sketch below). This increases the computation time, since we have to make many GraphQL queries instead of issuing one and letting DataHub do the same traversal.
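For context, here is a minimal sketch of that level-by-level workaround. Everything in it is an assumption, not from the thread: `GRAPHQL_URL` is a placeholder endpoint, and `ONE_HOP_QUERY` is a hypothetical query whose exact shape depends on your DataHub version (e.g. `searchAcrossLineage` restricted to one hop with a degree filter, or a `relationships` query), so adjust it to your schema.

```python
# Sketch: BFS over one-hop lineage queries instead of one server-side
# impact-analysis call. All names below are assumptions for illustration.
from collections import deque

import requests

GRAPHQL_URL = "http://localhost:8080/api/graphql"  # assumption: your GMS endpoint

# Hypothetical one-hop downstream query; limit it to a single hop
# (e.g. with a degree filter) so the server-side traversal stays cheap.
ONE_HOP_QUERY = """
query downstreams($urn: String!) {
  searchAcrossLineage(
    input: {urn: $urn, direction: DOWNSTREAM, query: "*", start: 0, count: 1000}
  ) {
    searchResults { entity { urn } }
  }
}
"""


def fetch_downstreams(urn: str) -> list[str]:
    """Return the direct (one-hop) downstream URNs of an entity."""
    resp = requests.post(
        GRAPHQL_URL, json={"query": ONE_HOP_QUERY, "variables": {"urn": urn}}
    )
    resp.raise_for_status()
    results = resp.json()["data"]["searchAcrossLineage"]["searchResults"]
    return [r["entity"]["urn"] for r in results]


def transitive_downstreams(root_urn: str) -> set[str]:
    """BFS over one-hop queries to collect all transitive downstreams."""
    seen: set[str] = {root_urn}
    queue = deque([root_urn])
    downstreams: set[str] = set()
    while queue:
        urn = queue.popleft()
        for child in fetch_downstreams(urn):
            if child not in seen:  # skip shared dependencies we already visited
                seen.add(child)
                downstreams.add(child)
                queue.append(child)
    return downstreams
```

A single `transitive_downstreams(root_urn)` call then replaces the one impact-analysis query, at the cost of one GraphQL round trip per visited node, which is the extra computation time mentioned above.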
g
Thanks for the report - we’re aware of this problem, and in particular aware that performance got worse in the 0.10.0 release