Any recommendation to do rolling aggregation (eg m...
# general
c
Any recommendation to do rolling aggregation (eg movingAvg of past 7 days for each hour of last 24 hours) efficiently inside Pinot?
j
Pinot does not support this kind of query natively. The most efficient way is to query the per-hour sum and count for the past 8 days, and then calculate the rolling aggregation on the client side. Another way (less efficient) is to send a separate query for each hour with a filter to query the data for the last 24 hours
c
Thanks. That’s what I was thinking. The other option would be to do it upstream, but then you’d lose the flexibility that Pinot brought
I’m also wondering if the separate query option would create an unnecessary spike of query amount
j
Separate query option is definitely much more expensive than the one query approach with some client side post-aggregation, so for performance perspective, I would recommend the one query approach
As for the the upstream approach, if this query pattern is a very commonly used for your use case, you might also consider pre-calculate the rolling aggregation and store it as a separate column
Then you can still benefit from the arbitrary slice and dice provided by Pinot
k
If you are brave enough to try, you can write a post aggregation transform function to do this on Pinot side
Basically what ever you are doing on client side, can be done on broker side
We will be happy to help