Also, when I run the batch ingestion job I see some debug output about dictionary encoding the columns, including numeric metric columns. Does that mean it's dictionary encoding the data in Pinot's internal format? Say I'd like to compute averages and quantiles of these metrics grouped by different dimensions -- is dictionary encoding best for that, or should I disable it? Or is what I'm seeing not relevant to query performance?
Mayank
04/13/2021, 4:12 PM
By default most columns are dictionary encoded, and that works well. Disabling it helps in certain cases, such as string columns with really high cardinality. For your case you can assume the default works fine unless you are seeing issues.
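
To make the cardinality point concrete, here is a rough Python sketch of what dictionary encoding does in general. It is not Pinot's actual implementation; `dictionary_encode`, `decode`, and the sample columns are made up for illustration. The idea is that each distinct value is stored once and rows hold small integer ids, so a low-cardinality column gets a tiny dictionary while a near-unique column gets one about as large as the column itself. If you do want to opt a column out in Pinot, the setting to look at is, to my knowledge, the `noDictionaryColumns` list under `tableIndexConfig` in the table config.

```python
def dictionary_encode(values):
    """Return (dictionary, ids): the sorted distinct values and one integer id per row."""
    dictionary = sorted(set(values))
    index = {value: i for i, value in enumerate(dictionary)}
    ids = [index[value] for value in values]
    return dictionary, ids


def decode(dictionary, ids):
    """Rebuild the original column from the dictionary and the per-row ids."""
    return [dictionary[i] for i in ids]


# Hypothetical sample columns, purely for illustration.
low_card = ["US", "DE", "US", "FR", "DE", "US"]        # few distinct values
high_card = [f"session-{i}" for i in range(6)]          # every value distinct

low_dict, low_ids = dictionary_encode(low_card)
high_dict, high_ids = dictionary_encode(high_card)

# Low cardinality: 3 dictionary entries cover 6 rows.
print(len(low_dict), "entries for", len(low_card), "rows")
# High cardinality: the dictionary is as long as the column, so it saves nothing.
print(len(high_dict), "entries for", len(high_card), "rows")
```

For numeric metrics that repeat a lot, the first case applies, which is in line with the advice to leave the default alone unless you actually see a problem.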