Prashant Pandey
04/19/2022, 10:49 AM
Kartik Khare
04/19/2022, 11:09 AM
optimizeDictionaryForMetrics config in https://docs.pinot.apache.org/configuration-reference/table
Currently, this only works for single-valued columns, though.
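As a rough sketch of where this flag would go, the snippet below builds a fragment of a table config as a Python dict and prints it as JSON. It assumes `optimizeDictionaryForMetrics` sits under `tableIndexConfig`, as in the linked configuration reference. The helper `index_config_fragment` is hypothetical and the output is not a complete table config.

```python
import json


def index_config_fragment(optimize_metrics_dictionary: bool) -> dict:
    """Return a partial table config (hypothetical helper, not a Pinot API)."""
    return {
        "tableIndexConfig": {
            # Assumed location of the flag, per the configuration reference.
            "optimizeDictionaryForMetrics": optimize_metrics_dictionary,
        }
    }


fragment = index_config_fragment(True)
print(json.dumps(fragment, indent=2))
```

The fragment would still need to be merged into the rest of the table config before being applied.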
For your second question, a forward index is created by default. An inverted index should be used when you want a constant-time lookup for a column value.
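To make the difference concrete, here is a simplified model of a dictionary-encoded column with a forward index and an inverted index. It is only an illustration: Pinot's real on-disk structures differ, and the function names and sample values are made up for this sketch.

```python
from collections import defaultdict


def build_indexes(values):
    # Dictionary: sorted unique values; a value's position is its dictionary ID.
    dictionary = sorted(set(values))
    dict_id = {v: i for i, v in enumerate(dictionary)}
    # Forward index: document ID -> dictionary ID.
    forward = [dict_id[v] for v in values]
    # Inverted index: dictionary ID -> list of matching document IDs.
    inverted = defaultdict(list)
    for doc_id, d in enumerate(forward):
        inverted[d].append(doc_id)
    return dictionary, dict_id, forward, dict(inverted)


def lookup(value, dict_id, inverted):
    # Resolve the value to its dictionary ID, then read the posting list directly.
    d = dict_id.get(value)
    return [] if d is None else inverted[d]


column = ["key2", "key1", "key2", "key3", "key1"]
dictionary, dict_id, forward, inverted = build_indexes(column)
matches = lookup("key2", dict_id, inverted)
print(matches)
```

With only the forward index, answering the same predicate would mean scanning every document; the inverted index goes from dictionary ID to matching documents directly.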
https://docs.pinot.apache.org/basics/indexing/inverted-index
Kartik Khare
04/19/2022, 11:11 AM
Prashant Pandey
04/19/2022, 11:16 AM
So you will need to have a dictionary for it to work.
In our case, we have segments of around 800M, out of which 400M is the size of the dict due to the cardinality (700k unique values). Can we somehow optimise this?
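A quick back-of-the-envelope check on those numbers, assuming "400M" means roughly 400 MB (taken here as 400 × 10^6 bytes) and that all 700k unique values belong to this one dictionary. The helper name is hypothetical.

```python
def avg_bytes_per_entry(dict_bytes: int, unique_values: int) -> float:
    """Average dictionary bytes per unique value."""
    return dict_bytes / unique_values


DICT_BYTES = 400 * 10**6   # assumption: "400M" = 400 MB
UNIQUE_VALUES = 700_000    # "700k unique values"

avg = avg_bytes_per_entry(DICT_BYTES, UNIQUE_VALUES)
print(round(avg))
```

Under these assumptions each dictionary entry averages several hundred bytes, which suggests the size is driven by how wide the stored values are as well as by how many there are.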
Prashant Pandey
04/19/2022, 11:18 AM
Kartik Khare
04/19/2022, 11:24 AM
Prashant Pandey
04/19/2022, 11:26 AM
tag__KEYS: Stores keys of your tags.
tag__VALUES: Stores values of the corresponding tags in tag__KEYS.
So basically, they look like:
tag__KEYS: "key1, key2, key3"
tag__VALUES: "val1, val2, val3"
Almost all of our queries have predicates on these two columns.
Mayank
Mayank
Prashant Pandey
04/19/2022, 6:57 PM
Mayank