Neeraja Sridharan
09/26/2022, 7:35 PM
offline tables in Pinot with invertedIndexColumns, sortedColumn and segmentPartition (with Murmur-based partitions) enabled. We also have instanceSelectorType set to "replicaGroup".
We currently have the createInvertedIndexDuringSegmentGeneration flag set to false by default.
Is there a recommended approach for setting this flag to true, and what is the expected behavior?
Will enabling it be beneficial for minimizing index creation after segments are loaded onto servers?
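For reference, the setup described above roughly corresponds to a table config fragment like the one sketched below. This is only an illustration, not the actual config from this thread: the table name, column names, and partition count are made up, the helper `with_index_at_generation` is hypothetical, and other sections a real table config needs (for example segmentsConfig and tenants) are left out.

```python
import copy
import json

# Illustrative fragment only. The config keys mirror the ones named in the
# question; the table name, column names and numPartitions are assumptions.
table_config = {
    "tableName": "myTable",  # hypothetical
    "tableType": "OFFLINE",
    "tableIndexConfig": {
        "invertedIndexColumns": ["memberId"],  # hypothetical column
        "sortedColumn": ["eventDate"],         # hypothetical column
        "segmentPartitionConfig": {
            "columnPartitionMap": {
                "memberId": {"functionName": "Murmur", "numPartitions": 8}
            }
        },
        # false: servers build the inverted index when they load the segment.
        # true: the segment generation job builds it, so pushed segments are larger.
        "createInvertedIndexDuringSegmentGeneration": False,
    },
    "routing": {"instanceSelectorType": "replicaGroup"},
}


def with_index_at_generation(config, enabled):
    """Return a copy of the config with the flag set; the input is not modified."""
    updated = copy.deepcopy(config)
    updated["tableIndexConfig"]["createInvertedIndexDuringSegmentGeneration"] = enabled
    return updated


if __name__ == "__main__":
    print(json.dumps(with_index_at_generation(table_config, True), indent=2))
```

Keeping the flag as a single boolean in tableIndexConfig means a new table can start with either value, and switching later only changes where the index gets built for segments generated after the change.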
Appreciate any help regarding this 🙇‍♀️
Mayank
Neeraja Sridharan
09/26/2022, 8:27 PM
Mayank
false. The tradeoff is essentially pushing a larger segment (with the index) vs. creating the index on the server (CPU/memory on the server). Also, independently, the primary reason the server can build indexes during loading is that it gives you the flexibility to change indexing as the use case evolves, without needing to re-bootstrap the data.
Ken Krugler
09/26/2022, 11:09 PM
Neeraja Sridharan
09/27/2022, 1:43 AM
createInvertedIndexDuringSegmentGeneration to be beneficial?
FYI, we use the Pinot Spark batch ingestion job for building and pushing segments to Pinot.
Ken Krugler
09/27/2022, 2:53 PM
Neeraja Sridharan
09/27/2022, 3:40 PM
Neeraja Sridharan
09/27/2022, 3:45 PM
Neha Pawar
Neeraja Sridharan
09/27/2022, 4:08 PM
createInvertedIndexDuringSegmentGeneration flag set to true always? Just trying to evaluate what to expect if we start setting this to true for our new tables, given that we've had it set to false for our existing Pinot offline tables. FYI, we use the Pinot Spark batch ingestion job for building and pushing segments to Pinot.
Ken Krugler
09/27/2022, 5:17 PM
Ken Krugler
09/27/2022, 5:18 PM
Neeraja Sridharan
09/27/2022, 7:59 PM
false value & update to true as needed (if we hit issues with Pinot CPU load/running out of memory).
Neha Pawar
Neeraja Sridharan
09/27/2022, 8:36 PM