# ui
g
@microscopic-receptionist-23548 I noticed that in the GMA ElasticSearch DAO we paginate on entities but not on groups: https://github.com/linkedin/datahub-gma/blob/master/dao-impl/elasticsearch-dao-7/src/main/java/com/linkedin/metadata/dao/browse/ESBrowseDAO.java#L72 Curious if you have any context as to why that was done. It forces the UI to handle group pagination in browse, which could run into scalability issues for large & complex browse trees. Would it make sense to also pass along the pagination flag when fetching groups?
cc @big-carpet-38439
m
no, i don't have context here.
g
@microscopic-receptionist-23548 would you be open to me adding a feature to paginate groups?
How does LI handle this currently? Are they paginating in the internal ember app?
I see in the OS ember code a comment that says groups are not affected by pagination:
m
to be honest i'm not even sure what groups are in this context...
g
browse returns 2 lists, entities and groups
groups are essentially directories
a non-completed browse path
m
my work with ES has been more operational than on the API side
oh right this is for browse...
g
right.
I would be happy to make the change, but if we started paginating groups
and LI was assuming the whole response would come back
then that would be a breaking change for LI
m
idk why those lists are different; maybe we should have a new API that returns a List<Stuff> where Stuff can be either a directory or entity. then page that....
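rough sketch of what that could look like, in Java. every name here (`BrowsePagingSketch`, `BrowseItem`, `Kind`, `page`) is made up for illustration and is not part of GMA, and it pages an in-memory list instead of pushing the paging down into the ES query, which a real version would presumably need to do:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical names throughout; none of this is the existing GMA API.
class BrowsePagingSketch {

    enum Kind { GROUP, ENTITY }

    // One browse result row: either a group (directory) or an entity (leaf).
    static final class BrowseItem {
        final Kind kind;
        final String name;

        BrowseItem(Kind kind, String name) {
            this.kind = kind;
            this.name = name;
        }

        @Override
        public String toString() {
            return kind + ":" + name;
        }
    }

    // Merge groups and entities into one list (groups first), then return one page of it.
    public static List<BrowseItem> page(List<String> groups, List<String> entities, int from, int size) {
        List<BrowseItem> all = new ArrayList<>();
        for (String g : groups) {
            all.add(new BrowseItem(Kind.GROUP, g));
        }
        for (String e : entities) {
            all.add(new BrowseItem(Kind.ENTITY, e));
        }
        // Clamp the window to the list bounds; long arithmetic avoids int overflow on huge sizes.
        int start = Math.max(0, Math.min(from, all.size()));
        int end = (int) Math.min((long) all.size(), (long) start + Math.max(0, size));
        return new ArrayList<>(all.subList(start, end));
    }

    public static void main(String[] args) {
        List<String> groups = Arrays.asList("prod", "dev");
        List<String> entities = Arrays.asList("table_a", "table_b", "table_c");
        // Prints [GROUP:dev, ENTITY:table_a, ENTITY:table_b]
        System.out.println(page(groups, entities, 1, 3));
    }
}
```

with one list the client only needs a single from/size pair, and a page can straddle the boundary between directories and entities.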
g
since some paths would become undiscoverable via browse
m
?
g
I like that idea!
m
idk how big of a refactor that would be though
if there's something smaller that isn't like that, that's fine too
tldr you can fix this at your discretion 😛
though consider making this non-breaking; add new functionality here and deprecate the old; will be easier for LI to adopt
that's generally the way to make all clients happy: add new stuff, deprecate old stuff, delete deprecated things over time
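roughly this shape, as a sketch only. `BrowseGroupsSketch` and both `getGroups` overloads are hypothetical stand-ins, not the actual ESBrowseDAO methods; the point is just that the old unpaginated call keeps working and delegates to the new paginated one:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class; it stands in for whatever serves browse groups today.
class BrowseGroupsSketch {
    private final List<String> groups;

    BrowseGroupsSketch(List<String> groups) {
        this.groups = groups;
    }

    /** Old behaviour: every group in one response. Kept so existing callers don't break. */
    @Deprecated
    public List<String> getGroups() {
        return getGroups(0, Integer.MAX_VALUE);
    }

    /** New behaviour: a single page of groups, starting at {@code from}. */
    public List<String> getGroups(int from, int size) {
        // Clamp the window to the list bounds; long arithmetic avoids int overflow.
        int start = Math.max(0, Math.min(from, groups.size()));
        int end = (int) Math.min((long) groups.size(), (long) start + Math.max(0, size));
        return new ArrayList<>(groups.subList(start, end));
    }

    public static void main(String[] args) {
        BrowseGroupsSketch dao = new BrowseGroupsSketch(Arrays.asList("a", "b", "c", "d"));
        System.out.println(dao.getGroups(1, 2));      // prints [b, c]
        System.out.println(dao.getGroups().size());   // prints 4
    }
}
```

clients that assume the whole response keep calling the deprecated overload until they migrate, then it can be deleted.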
g
For sure 👍
c
The groups are used to help calculate the number of leaf nodes in each subfolder. Also, somewhat related to a ticket open internally: an ES query with a large aggregation size is resource-intensive, and yet we currently have INT_MAX set.
g
Hey @cool-river-24902 - thanks for your answer! How does LI currently handle pagination of groups in browse internally then? Does LI DH paginate on the client side? Or does LI DH show all groups at once, even if there may be 100s or 1000s?
c
Hi Gabe, browse in general needs improvement; currently (to the best of my knowledge) there is no pagination on groups, feel free to open a ticket or a PR : )
g
Thanks Na! I added a few more details to this issue here: https://github.com/linkedin/datahub/issues/2200