Chundong Wang
12/29/2020, 6:01 PM
SELECT facility_name as key_col, COUNT(*) as val_col
FROM enriched_station_orders_v1_OFFLINE
WHERE created_at_seconds BETWEEN 1606756268 AND 1609175468
AND (facility_organization_id <> 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef')
GROUP BY key_col
ORDER BY val_col DESC
LIMIT 5
We get exceptions like the following on pinot-server (the index number seems to vary):
Caught exception while processing and combining group-by order-by for index: 1
However, if we change facility_organization_id <> 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef' to facility_organization_id = 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef', the exception does not occur. Likewise, if we switch to facility_id instead of facility_name, no exception is thrown.
Have you seen such an issue before?
Chundong Wang
12/29/2020, 6:01 PM
Mayank
Chundong Wang
12/29/2020, 6:25 PM
{
  "message": "QueryExecutionError:\njava.lang.ArrayIndexOutOfBoundsException",
  "errorCode": 200
},
Chundong Wang
12/29/2020, 6:25 PM
Mayank
LOGGER.error(
"Caught exception while processing and combining group-by order-by for index: {}, operator: {}, queryContext: {}",
index, _operators.get(index).getClass().getName(), _queryContext, e);
mergedProcessingExceptions.add(QueryException.getException(QueryException.QUERY_EXECUTION_ERROR, e));
Chundong Wang
12/29/2020, 6:26 PM
2020/12/29 17:50:16.761 ERROR [GroupByOrderByCombineOperator] [pqw-7] Caught exception while processing and combining group-by order-by for index: 1, operator: org.apache.pinot.core.operator.query.AggregationGroupByOrderByOperator, queryContext: QueryContext{_selectExpressions=[facility_name, count(*)], _aliasMap={facility_name=key_col, count(*)=val_col}, _filter=(created_at_seconds BETWEEN '1606756268' AND '1609175468' AND facility_organization_id != 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef'), _groupByExpressions=[facility_name], _orderByExpressions=[count(*) DESC], _havingFilter=null, _limit=5, _offset=0, _queryOptions={responseFormat=sql, groupByMode=sql, timeoutMs=24999}, _debugOptions=null, _brokerRequest=BrokerRequest(querySource:QuerySource(tableName:enriched_station_orders_v1_OFFLINE), filterQuery:FilterQuery(id:0, value:null, operator:AND, nestedFilterQueryIds:[1, 2]), aggregationsInfo:[AggregationInfo(aggregationType:COUNT, aggregationParams:{column=*}, isInSelectList:true, expressions:[*])], groupBy:GroupBy(topN:5, expressions:[facility_name]), filterSubQueryMap:FilterQueryMap(filterQueryMap:{0=FilterQuery(id:0, value:null, operator:AND, nestedFilterQueryIds:[1, 2]), 1=FilterQuery(id:1, column:created_at_seconds, value:[[1606756268 1609175468]], operator:RANGE, nestedFilterQueryIds:[]), 2=FilterQuery(id:2, column:facility_organization_id, value:[ac56d23b-a6a2-4c49-8412-a0a0949fb5ef], operator:NOT, nestedFilterQueryIds:[])}), queryOptions:{responseFormat=sql, groupByMode=sql, timeoutMs=24999}, pinotQuery:PinotQuery(dataSource:DataSource(tableName:enriched_station_orders_v1_OFFLINE), selectList:[Expression(type:FUNCTION, functionCall:Function(operator:AS, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:facility_name)), Expression(type:IDENTIFIER, identifier:Identifier(name:key_col))])), Expression(type:FUNCTION, functionCall:Function(operator:AS, operands:[Expression(type:FUNCTION, functionCall:Function(operator:COUNT, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:*))])), Expression(type:IDENTIFIER, identifier:Identifier(name:val_col))]))], filterExpression:Expression(type:FUNCTION, functionCall:Function(operator:AND, operands:[Expression(type:FUNCTION, functionCall:Function(operator:BETWEEN, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:created_at_seconds)), Expression(type:LITERAL, literal:<Literal longValue:1606756268>), Expression(type:LITERAL, literal:<Literal longValue:1609175468>)])), Expression(type:FUNCTION, functionCall:Function(operator:NOT_EQUALS, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:facility_organization_id)), Expression(type:LITERAL, literal:<Literal stringValue:ac56d23b-a6a2-4c49-8412-a0a0949fb5ef>)]))])), groupByList:[Expression(type:IDENTIFIER, identifier:Identifier(name:facility_name))], orderByList:[Expression(type:FUNCTION, functionCall:Function(operator:DESC, operands:[Expression(type:FUNCTION, functionCall:Function(operator:COUNT, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:*))]))]))], limit:5), orderBy:[SelectionSort(column:count(*), isAsc:false)], limit:5)}
Chundong Wang
12/29/2020, 6:26 PM
queryContext part
Mayank
Chundong Wang
12/29/2020, 6:28 PM
Mayank
Chundong Wang
12/29/2020, 6:29 PM
Mayank
Chundong Wang
12/29/2020, 6:30 PM
Chundong Wang
12/29/2020, 6:30 PM
Mayank
Mayank
Chundong Wang
12/29/2020, 6:32 PM
facility_organization_id is a couple of thousand.
Mayank
Mayank
Chundong Wang
12/29/2020, 6:33 PM
Chundong Wang
12/29/2020, 6:33 PM
Mayank
Chundong Wang
12/29/2020, 6:39 PM
Mayank
Chundong Wang
12/29/2020, 6:41 PM
Chundong Wang
12/29/2020, 6:42 PM
Mayank
AnonymizeDataCommand which can anonymize your data (if that can make it shareable)
Kishore G
Chundong Wang
12/29/2020, 7:14 PM
Jackie
12/29/2020, 7:32 PM
Chundong Wang
12/29/2020, 7:33 PM
Jackie
12/29/2020, 7:38 PM
\t in some facility_name values
Jackie
12/29/2020, 7:38 PM
Chundong Wang
12/29/2020, 7:38 PM
0.5.0
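A minimal sketch for checking the tab-character question raised above, assuming the facility_name values have already been exported from the table into a list of strings. The class name TabValueScan, the method valuesWithTab, and the sample values are hypothetical and not part of Pinot:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TabValueScan {
    // Returns the values that contain at least one tab character.
    // Null entries are skipped.
    static List<String> valuesWithTab(List<String> values) {
        List<String> hits = new ArrayList<>();
        for (String value : values) {
            if (value != null && value.indexOf('\t') >= 0) {
                hits.add(value);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample; replace with an export of the real column values.
        List<String> sample = Arrays.asList("Kitchen A", "Kitchen\tB", "Kitchen C");
        for (String hit : valuesWithTab(sample)) {
            // Show the tab as a visible backslash-t so it is easy to spot.
            System.out.println(hit.replace("\t", "\\t"));
        }
    }
}
```

If the returned list is non-empty, those values are the ones worth testing against the failing group-by query.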
Jackie
12/29/2020, 7:40 PM
Jackie
12/29/2020, 7:41 PM
Chundong Wang
12/29/2020, 7:41 PM
0.6.0 released yet (which I suppose includes this fix)?
Jackie
12/29/2020, 7:43 PM