Ken Krugler
03/04/2021, 4:50 PM
DistinctCountHLL only works for single value fields. It seems like a simple change in DistinctCountHLLAggregationFunction.aggregate() to check if the BlockValSet is multi-valued, and if so then call BlockValSet.getXXXMV() and do a sub-iteration on the secondary array it returns. Does that make sense?
Kishore G
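
A minimal sketch of the change proposed in the 4:50 PM message, added for illustration and not taken from the thread or from Pinot's source. `SimpleBlockValSet` is a hypothetical, simplified stand-in for `BlockValSet`, and a `HashSet` stands in for the HyperLogLog sketch so the example runs without extra libraries.

```java
import java.util.HashSet;
import java.util.Set;

class MvAwareDistinctCount {
    /** Hypothetical, simplified stand-in for Pinot's BlockValSet. */
    interface SimpleBlockValSet {
        boolean isSingleValue();
        String[] getStringValuesSV();   // one value per document
        String[][] getStringValuesMV(); // one array of values per document
    }

    /** Adds every value in the block to the sketch (a HashSet stands in for the HLL). */
    static void aggregate(int length, SimpleBlockValSet block, Set<String> sketch) {
        if (block.isSingleValue()) {
            String[] values = block.getStringValuesSV();
            for (int i = 0; i < length; i++) {
                sketch.add(values[i]);
            }
        } else {
            String[][] valuesPerDoc = block.getStringValuesMV();
            for (int i = 0; i < length; i++) {
                // Sub-iteration over the secondary array for document i.
                for (String value : valuesPerDoc[i]) {
                    sketch.add(value);
                }
            }
        }
    }

    /** Example helper: wraps a multi-value column in the stand-in interface. */
    static SimpleBlockValSet multiValue(String[][] docs) {
        return new SimpleBlockValSet() {
            public boolean isSingleValue() { return false; }
            public String[] getStringValuesSV() { throw new UnsupportedOperationException(); }
            public String[][] getStringValuesMV() { return docs; }
        };
    }

    public static void main(String[] args) {
        Set<String> sketch = new HashSet<>();
        String[][] docs = { {"a", "b"}, {"b", "c"}, {} };
        aggregate(docs.length, multiValue(docs), sketch);
        System.out.println(sketch.size()); // prints 3
    }
}
```

Calling the SV getter on the multi-value stand-in throws `UnsupportedOperationException`, which is the same exception type that appears in the stack trace posted at 5:09 PM.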
Ken Krugler
03/04/2021, 5:09 PM
"message": "QueryExecutionError:
java.lang.UnsupportedOperationException
    at org.apache.pinot.core.segment.index.readers.ForwardIndexReader.readDictIds(ForwardIndexReader.java:84)
    at org.apache.pinot.core.common.DataFetcher$ColumnValueReader.readStringValues(DataFetcher.java:439)
    at org.apache.pinot.core.common.DataFetcher.fetchStringValues(DataFetcher.java:146)
    at org.apache.pinot.core.common.DataBlockCache.getStringValuesForSVColumn(DataBlockCache.java:194)
    at org.apache.pinot.core.operator.docvalsets.ProjectionBlockValSet.getStringValuesSV(ProjectionBlockValSet.java:94)
    at org.apache.pinot.core.query.aggregation.function.DistinctCountHLLAggregationFunction.aggregate(DistinctCountHLLAggregationFunction.java:103)
    at org.apache.pinot.core.query.aggregation.DefaultAggregationExecutor.aggregate(DefaultAggregationExecutor.java:47)
    at org.apache.pinot.core.operator.query.AggregationOperator.getNextBlock(AggregationOperator.java:66)
    at org.apache.pinot.core.operator.query.AggregationOperator.getNextBlock(AggregationOperator.java:35)
    at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:49)
    at org.apache.pinot.core.operator.combine.BaseCombineOperator$1.runJob(BaseCombineOperator.java:94)
    at org.apache.pinot.core.util.trace.TraceRunnable.run(TraceRunnable.java:40)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)"
Ken Krugler
03/04/2021, 5:09 PM
Mayank
distinctCountHLLMV?
Mayank
MV suffix in the name.
Ken Krugler
03/04/2021, 6:59 PM
aggregate, aggregateGroupBySV, and aggregateGroupByMV. Made me think there was a missing aggregateMV function. I see now that the BySV and ByMV methods are for doing aggregations when the grouping column is SV vs. MV.
Mayank
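
To make the distinction in the 6:59 PM message concrete, here is a rough sketch, added for illustration and not taken from the thread. The method names match those mentioned above, but the signatures are simplified and hypothetical, and a plain sum stands in for the real aggregation. The point it shows is that SV vs. MV in the method name refers to the group key column, not to the column being aggregated.

```java
import java.util.Arrays;

class GroupBySketch {
    // No grouping: fold every value into a single result.
    static long aggregate(int length, long[] values) {
        long sum = 0;
        for (int i = 0; i < length; i++) {
            sum += values[i];
        }
        return sum;
    }

    // Grouping column is single-valued: document i belongs to exactly one group.
    static void aggregateGroupBySV(int length, int[] groupKeys, long[] values, long[] resultPerGroup) {
        for (int i = 0; i < length; i++) {
            resultPerGroup[groupKeys[i]] += values[i];
        }
    }

    // Grouping column is multi-valued: document i contributes to every group in groupKeys[i].
    static void aggregateGroupByMV(int length, int[][] groupKeys, long[] values, long[] resultPerGroup) {
        for (int i = 0; i < length; i++) {
            for (int groupKey : groupKeys[i]) {
                resultPerGroup[groupKey] += values[i];
            }
        }
    }

    public static void main(String[] args) {
        long[] values = {10, 20, 30};

        long[] bySv = new long[2];
        aggregateGroupBySV(3, new int[] {0, 1, 0}, values, bySv);
        System.out.println(Arrays.toString(bySv)); // prints [40, 20]

        long[] byMv = new long[2];
        aggregateGroupByMV(3, new int[][] { {0, 1}, {1}, {0} }, values, byMv);
        System.out.println(Arrays.toString(byMv)); // prints [40, 30]
    }
}
```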
Ken Krugler
03/04/2021, 6:59 PM
BlockValSet could be used to determine whether to handle it as an SV or an MV column.
Mayank
Ken Krugler
03/04/2021, 7:00 PM
Mayank