Any recommendation to do rolling aggregation (eg movingAvg of past 7 days for each hour of last 24 hours) efficiently inside Pinot?
j
Jackie
10/23/2020, 8:23 PM
Pinot does not support this kind of query natively. The most efficient way is to query the per-hour sum and count for the past 8 days, and then calculate the rolling aggregation on the client side. Another way (less efficient) is to send a separate query for each hour with a filter to query the data for the last 24 hours
c
Chundong Wang
10/23/2020, 8:25 PM
Thanks. That’s what I was thinking. The other option would be to do it upstream, but then you’d lose the flexibility that Pinot brought
Chundong Wang
10/23/2020, 8:25 PM
I’m also wondering if the separate query option would create an unnecessary spike of query amount
j
Jackie
10/23/2020, 8:31 PM
Separate query option is definitely much more expensive than the one query approach with some client side post-aggregation, so for performance perspective, I would recommend the one query approach
Jackie
10/23/2020, 8:33 PM
As for the the upstream approach, if this query pattern is a very commonly used for your use case, you might also consider pre-calculate the rolling aggregation and store it as a separate column
Jackie
10/23/2020, 8:33 PM
Then you can still benefit from the arbitrary slice and dice provided by Pinot
k
Kishore G
10/23/2020, 9:19 PM
If you are brave enough to try, you can write a post aggregation transform function to do this on Pinot side
Kishore G
10/23/2020, 9:20 PM
Basically what ever you are doing on client side, can be done on broker side