#general

Priyank Bagrecha

01/04/2022, 11:38 PM
also what is the time bucket for which aggregation happens?

Mayank

01/04/2022, 11:41 PM
The bucketing right now is the time granularity. In other words, the dimension/time values of rows have to match for metrics to be aggregated.
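
To illustrate the matching rule described here, a minimal Python sketch that simulates the behaviour outside of Pinot. This is not Pinot code; the column names (`country`, `ts_millis`, `clicks`) and the helper `aggregate_rows` are hypothetical, and summing is assumed as the metric aggregation.

```python
from collections import defaultdict


def aggregate_rows(rows, key_columns, metric_column):
    """Sum metric_column over rows whose key_columns values all match."""
    totals = defaultdict(int)
    for row in rows:
        # Rows merge only when every dimension/time value is identical.
        key = tuple(row[column] for column in key_columns)
        totals[key] += row[metric_column]
    return dict(totals)


# Hypothetical rows: the third differs from the first two by one millisecond.
rows = [
    {"country": "US", "ts_millis": 1641340800000, "clicks": 1},
    {"country": "US", "ts_millis": 1641340800000, "clicks": 2},
    {"country": "US", "ts_millis": 1641340800001, "clicks": 4},
]

merged = aggregate_rows(rows, ["country", "ts_millis"], "clicks")
print(merged)
```

With a millisecond time column as part of the key, the first two rows merge and the third stays separate, so the effective bucket is whatever granularity the time values carry.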

Priyank Bagrecha

01/04/2022, 11:46 PM
i see. what i am thinking of doing, is aggregate metrics per 5 min time bucket. now i can create a derived column which will mark the start of a 5 min window using event timestamp. but the event timestamp is still in the row. i am guessing that will be a problem for this to work right? unless of course i manage that outside of pinot.
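
A rough sketch of the derived "start of 5 min window" value mentioned here, written in plain Python rather than as a Pinot transform. It assumes the event timestamp is epoch milliseconds; `window_start_ms` and `FIVE_MINUTES_MS` are hypothetical names used only for illustration.

```python
FIVE_MINUTES_MS = 5 * 60 * 1000


def window_start_ms(event_ts_ms, window_ms=FIVE_MINUTES_MS):
    """Round an epoch-millisecond timestamp down to the start of its window."""
    return event_ts_ms - (event_ts_ms % window_ms)


base = 1641340800000  # 2022-01-05 00:00:00 UTC in epoch milliseconds

# Two events roughly two minutes apart land in the same 5 minute window.
first = window_start_ms(base + 1_000)
second = window_start_ms(base + 123_456)

# An event just past the 5 minute mark lands in the next window.
third = window_start_ms(base + FIVE_MINUTES_MS + 1)

print(first, second, third)
```

Rows carrying only this derived value would share a time value within each window; rows that also carry the raw event timestamp would still differ from each other.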

Johan Adami

01/05/2022, 12:41 AM
i’ve experimented with this a bit. you can use a transform function in the table config to make derived columns with whatever granularities you want. but then you have to make sure to pick the lowest granularity as the time column and not include the original column in your schema

Priyank Bagrecha

01/05/2022, 1:01 AM
wait, so i don't have to keep the original column in the schema when using a derived column? i thought i needed to define both for the transform function in the table config to work.

Johan Adami

01/05/2022, 1:14 AM
my personal experiments used an event source that had a millisecond granularity time column. I used 2 transform functions to convert it to hourly and daily. I did not include the original field in the schema and it worked correctly.
when I kept the original column, it also tried to preaggregate on the millisecond column which defeated the purpose
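
A sketch of what the setup described above might look like, built as Python dicts and printed as JSON. Everything here is an assumption to check against the Pinot docs for your version: the source field `eventTimeMillis`, the derived column names, and the `country`/`clicks` columns are hypothetical, and `toEpochHours` / `toEpochDays` are used on the understanding that they are built-in Pinot transform functions. It is not the exact config from the experiment.

```python
import json

# Derived time columns computed at ingestion from the source field.
# "eventTimeMillis" is a hypothetical field name in the incoming events.
table_config_fragment = {
    "ingestionConfig": {
        "transformConfigs": [
            {
                "columnName": "hoursSinceEpoch",
                "transformFunction": "toEpochHours(eventTimeMillis)",
            },
            {
                "columnName": "daysSinceEpoch",
                "transformFunction": "toEpochDays(eventTimeMillis)",
            },
        ]
    }
}

# The schema lists only the derived time columns. The millisecond source
# field is deliberately left out so it cannot keep rows from matching.
schema_fragment = {
    "dimensionFieldSpecs": [{"name": "country", "dataType": "STRING"}],
    "metricFieldSpecs": [{"name": "clicks", "dataType": "LONG"}],
    "dateTimeFieldSpecs": [
        {
            "name": "hoursSinceEpoch",
            "dataType": "LONG",
            "format": "1:HOURS:EPOCH",
            "granularity": "1:HOURS",
        },
        {
            "name": "daysSinceEpoch",
            "dataType": "LONG",
            "format": "1:DAYS:EPOCH",
            "granularity": "1:DAYS",
        },
    ],
}

schema_columns = [
    spec["name"]
    for specs in schema_fragment.values()
    for spec in specs
]

print(json.dumps(table_config_fragment, indent=2))
print(schema_columns)
```

The point of the sketch is the omission: the transform functions read the source field from the incoming record, while the schema only declares the columns that should actually be stored.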

Priyank Bagrecha

01/05/2022, 1:15 AM
Thank you Johan. I will give that a try.

Mayank

01/08/2022, 6:30 PM
Thanks @User