# pinot-dev
k
In Pinot, segments belonging to a particular dataset may have different indexes. Is this correct? If so, what is the benefit, aside from not having to reload all segments to apply index changes?
r
you'd do that to accelerate queries: new data gets the new indexing, which will (hopefully) make the queries faster. Any reloaded segments get the new indexes too.
k
But wouldn't the old segments be slow to query? Wouldn't that impact overall performance?
r
yes, that's why you'd apply the change to the existing segments too, but it takes time to propagate, since all the segments have to be reloaded
k
So the advantage is that the impact of the index on query performance is visible immediately,
as opposed to the long time it takes to update an index in a traditional DB?
r
given that the reload can't be done instantaneously, it's really nice to know what fraction of segments queried have been reloaded and reindexed
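A rough sketch of that fraction, for illustration only. The `Segment` class and its `indexes` field are hypothetical stand-ins, not Pinot's API:

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    # Hypothetical stand-in for a segment's metadata, not a Pinot class.
    name: str
    indexes: dict = field(default_factory=dict)  # column -> set of index types


def fraction_reindexed(segments, column, index_type):
    """Share of the queried segments that already carry the new index."""
    if not segments:
        return 0.0
    done = sum(1 for s in segments if index_type in s.indexes.get(column, set()))
    return done / len(segments)


segments = [
    Segment("seg_0", {"city": {"inverted"}}),  # already reloaded
    Segment("seg_1", {}),                      # still waiting for reload
    Segment("seg_2", {"city": {"inverted"}}),
    Segment("seg_3", {}),
]
print(fraction_reindexed(segments, "city", "inverted"))  # 0.5
```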
not really, it's more scalable to have data structures like indexes and dictionaries embedded in segments, because each segment is self-sufficient and doesn't need to coordinate with any other segment
k
I'm trying to think of disadvantages of segments not all having the same indexes, and hence giving different results for an explain query
I suppose query optimization happens per segment? It would look different for every segment, given that segments can have different indexes.
r
it's actually less efficient than global data structures in several ways: dictionaries can be similar across segments for many attributes, which leads to duplication, makes things harder to cache in memory, and so on
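A toy example of that duplication, assuming a dictionary is just the sorted list of distinct values in a column. The segment names and values are made up, and this is not how Pinot stores dictionaries internally:

```python
def build_dictionary(values):
    """Sorted list of distinct values; a value's dictionary id is its position."""
    return sorted(set(values))


# Hypothetical column values for three segments of the same table.
segment_values = {
    "seg_0": ["US", "DE", "US", "FR"],
    "seg_1": ["US", "FR", "FR", "IN"],
    "seg_2": ["DE", "US", "IN", "US"],
}

# Each segment carries its own dictionary, so it can be loaded and queried alone.
per_segment = {name: build_dictionary(vals) for name, vals in segment_values.items()}

# A single shared dictionary would store each distinct value only once.
global_dict = build_dictionary(v for vals in segment_values.values() for v in vals)

entries_per_segment = sum(len(d) for d in per_segment.values())
print(entries_per_segment, len(global_dict))  # 9 4
```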
yes it happens per segment, as well as planning
planning has to happen at segment level because that's the only level where you know what inventory you have to execute the query against
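Roughly, as a sketch: the planner looks at what each segment actually has and picks a filter strategy from that. The function and the strategy names below are hypothetical and only illustrate the idea, they are not Pinot's planner:

```python
def plan_filter(segment_indexes, column):
    """Pick a filter strategy from whatever indexes this segment has."""
    available = segment_indexes.get(column, set())
    if "inverted" in available:
        return "inverted_index_lookup"
    if "sorted" in available:
        return "sorted_range_lookup"
    return "full_scan"


segments = {
    "seg_old": {},                      # not reloaded yet
    "seg_new": {"city": {"inverted"}},  # built or reloaded with the new config
}
plans = {name: plan_filter(idx, "city") for name, idx in segments.items()}
print(plans)  # {'seg_old': 'full_scan', 'seg_new': 'inverted_index_lookup'}
```

The same query ends up with a different plan on each segment, which is why explain output can differ between segments while a reload is in progress.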
k
Makes sense. Thank you for answering my questions.