#troubleshooting

eywek

03/23/2022, 4:59 PM
Hello, I was wondering if it’s possible to partition segments based on a field value (but without any transformation). For example, I store events from multiple websites in Pinot. Those events have a name (e.g. `purchase`, `page_view`…) and I would like to create a segment per event name (with a size limit ofc). Since those events are user-defined, I can’t really know how many partitions I’ll have. I’ve seen the Murmur, Hashcode… partition configs, but they don’t ensure that each event type will have a dedicated segment (e.g. I don’t want `page_view` and `purchase` events to be in the same segments, to avoid loading any `page_view` data when doing a query on `purchase` ones). Thank you
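
To illustrate the concern above: a hash-based partition function maps each value to one of a fixed number of partitions, so once there are more distinct event names than partitions, some names must share a partition. The sketch below is only an illustration. It uses CRC32 as a stand-in hash (not Pinot's Murmur or HashCode implementation), and the event names and partition count are made up.

```python
import zlib


def partition_of(value: str, num_partitions: int) -> int:
    """Map a value to a partition id. CRC32 is a stand-in, not Pinot's hash."""
    return zlib.crc32(value.encode("utf-8")) % num_partitions


def shared_partitions(values, num_partitions):
    """Return only the partitions that hold more than one distinct value."""
    groups = {}
    for v in values:
        groups.setdefault(partition_of(v, num_partitions), []).append(v)
    return {p: vs for p, vs in groups.items() if len(vs) > 1}


# Hypothetical user-defined event names: 5 names, 4 partitions.
event_names = ["purchase", "page_view", "add_to_cart", "signup", "login"]
shared = shared_partitions(event_names, num_partitions=4)
print(shared)  # at least one partition contains two or more event names
```

With an unbounded set of user-defined names, no fixed partition count avoids this, which is why a hash function alone cannot give each event type its own segment.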

Ken Krugler

03/23/2022, 5:01 PM
We do something similar (partitioning by country), but we have a Flink workflow building segments, so this is pretty easy - we just generate different segments that have a year-month and the country name in the segment name, so that it’s partitioned by both date & country. We have one country (US) that is significantly larger than the others, so we sub-partition that by a hash of the fields that we use for various aggregations.
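
A minimal sketch of the naming scheme described above, not the actual Flink workflow: the segment name carries the year-month and the country, and only oversized countries get an extra hash bucket. The table name, the bucket count for US, the `agg_key` parameter, and the use of CRC32 as the hash are all assumptions for illustration.

```python
import zlib
from datetime import date

# Hypothetical: countries large enough to need sub-partitioning, with bucket counts.
SUB_PARTITIONS = {"US": 8}


def segment_name(table: str, day: date, country: str, agg_key: str) -> str:
    """Build a segment name partitioned by year-month and country.

    For countries listed in SUB_PARTITIONS, append a bucket derived from a
    hash of the aggregation key (CRC32 here, as a stand-in).
    """
    parts = [table, day.strftime("%Y-%m"), country]
    buckets = SUB_PARTITIONS.get(country)
    if buckets:
        bucket = zlib.crc32(agg_key.encode("utf-8")) % buckets
        parts.append(str(bucket))
    return "_".join(parts)


print(segment_name("events", date(2022, 3, 23), "FR", "user-1"))  # events_2022-03_FR
print(segment_name("events", date(2022, 3, 23), "US", "user-1"))  # events_2022-03_US_<bucket>
```

Because each segment is built from rows of a single country and month, rows from different countries never share a segment, regardless of how many countries appear.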

eywek

03/23/2022, 5:03 PM
Okay, so you aren’t using the native Pinot partitioning, right?

Richard Startin

03/23/2022, 5:07 PM
I recall there was recently a community contribution to do just this, YMMV

eywek

03/23/2022, 5:09 PM
Oh, great! Thank you!