# general
s
Hi all, I have a question about the best way to create segments. I have a realtime table ingesting data from a Kafka topic, and the topic uses deviceid as the key. Should we create segments per device id, or stay with the default time-based segments? Our queries mostly filter on device id and a time range. If we create segments per device id, will all data for the same device id go into the same segment? And if so, will queries be faster because they only look at the segments that contain those device ids? How does this work?
k
Pinot supports both time and space partitioning
In most cases, it’s better to partition by time first (e.g. by day) and then by space (deviceid).
s
Is there any sample or document where both are used? I am struggling to work out what config is required to use both.
Time partitioning happens automatically. The segment metadata has a start time and an end time, which get used during pruning.
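To make the pruning idea concrete, here is a rough Python sketch of how time plus partition pruning could narrow down the segments for a query. This is only an illustration, not Pinot's actual code: `SegmentMeta`, `partition_of`, and `prune` are hypothetical names, and the toy hash stands in for whatever partition function the table is configured with.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SegmentMeta:
    """Hypothetical stand-in for the metadata kept per segment."""
    name: str
    start_time: int  # inclusive, e.g. epoch millis
    end_time: int    # inclusive
    partition: int   # partition id of the deviceid values in this segment


def partition_of(device_id: str, num_partitions: int) -> int:
    # Toy hash for illustration only; a real setup would use the
    # partition function configured on the table.
    return sum(device_id.encode()) % num_partitions


def prune(segments: List[SegmentMeta], device_id: str,
          q_start: int, q_end: int, num_partitions: int) -> List[SegmentMeta]:
    """Keep segments whose time range overlaps the query range
    and whose partition matches the queried device id."""
    target = partition_of(device_id, num_partitions)
    return [
        s for s in segments
        if s.end_time >= q_start and s.start_time <= q_end  # time pruning
        and s.partition == target                           # partition pruning
    ]


segments = [
    SegmentMeta("a", 0, 99, 0),
    SegmentMeta("b", 0, 99, 1),
    SegmentMeta("c", 100, 199, 0),
    SegmentMeta("d", 100, 199, 1),
    SegmentMeta("e", 200, 299, 0),
]
remaining = prune(segments, "d", 50, 120, num_partitions=2)
print([s.name for s in remaining])
```

So a segment does not hold one device id only. It holds every device id that hashes to its partition, within its time window, and the query skips everything else.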
m
Yes, for realtime consumption, simply partition by the primary key, as time partition will happen naturally.
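For the config side, a minimal sketch of the table config fragment that declares partitioning on the key column, written as a Python dict so it can be dumped to JSON. The field names (`segmentPartitionConfig`, `columnPartitionMap`, `segmentPrunerTypes`) are as I recall them from the Pinot table config docs, so please verify them against your Pinot version. The column name `deviceid`, the partition count of 4, and the helper `build_partition_config` are assumptions for illustration; the count would normally match the number of Kafka topic partitions, and the function should match how the producer partitions by key.

```python
import json


def build_partition_config(column: str, num_partitions: int,
                           function_name: str = "Murmur") -> dict:
    """Hypothetical helper: returns only the partition-related parts
    of a table config, to be merged into the full config."""
    return {
        "tableIndexConfig": {
            "segmentPartitionConfig": {
                "columnPartitionMap": {
                    column: {
                        "functionName": function_name,
                        "numPartitions": num_partitions,
                    }
                }
            }
        },
        # Asks the broker to prune segments by partition at query time.
        "routing": {"segmentPrunerTypes": ["partition"]},
    }


config = build_partition_config("deviceid", 4)
print(json.dumps(config, indent=2))
```

Nothing extra should be needed for the time part beyond the table's usual time column setting, since the start and end time are recorded per segment as described above.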