Hi team, when we tried to ran below query, ```SELE...
# general
c
Hi team, when we tried to ran below query,
Copy code
SELECT facility_name as key_col, COUNT(*) as val_col
FROM enriched_station_orders_v1_OFFLINE
WHERE created_at_seconds BETWEEN 1606756268 AND 1609175468
AND (facility_organization_id <> 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef') 
GROUP BY key_col
ORDER BY val_col DESC
LIMIT 5
We’ll get exceptions on pinot-server like (index number seems to vary),
Copy code
Caught exception while processing and combining group-by order-by for index: 1
However if we change from
facility_organization_id <> 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef'
to
facility_organization_id = 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef'
there won’t be such exception. Or if we switch to
facility_id
instead of
facility_name
it won’t threw exception as well. Have you seen such issue before?
cc @Alex Odle
m
What's the exception you are seeing (it should be logged along with the error message)?
c
The exception I got from pinot-ui is
Copy code
{
    "message": "QueryExecutionError:\njava.lang.ArrayIndexOutOfBoundsException",
    "errorCode": 200
  },
But from the log on pinot-server, the exception isn’t there.
m
Hmm, server should have also logged:
Copy code
LOGGER.error(
                "Caught exception while processing and combining group-by order-by for index: {}, operator: {}, queryContext: {}",
                index, _operators.get(index).getClass().getName(), _queryContext, e);
            mergedProcessingExceptions.add(QueryException.getException(QueryException.QUERY_EXECUTION_ERROR, e));
c
I know and that’s weird to me too. The full error log I see on pinot-server is,
Copy code
2020/12/29 17:50:16.761 ERROR [GroupByOrderByCombineOperator] [pqw-7] Caught exception while processing and combining group-by order-by for index: 1, operator: org.apache.pinot.core.operator.query.AggregationGroupByOrderByOperator, queryContext: QueryContext{_selectExpressions=[facility_name, count(*)], _aliasMap={facility_name=key_col, count(*)=val_col}, _filter=(created_at_seconds BETWEEN '1606756268' AND '1609175468' AND facility_organization_id != 'ac56d23b-a6a2-4c49-8412-a0a0949fb5ef'), _groupByExpressions=[facility_name], _orderByExpressions=[count(*) DESC], _havingFilter=null, _limit=5, _offset=0, _queryOptions={responseFormat=sql, groupByMode=sql, timeoutMs=24999}, _debugOptions=null, _brokerRequest=BrokerRequest(querySource:QuerySource(tableName:enriched_station_orders_v1_OFFLINE), filterQuery:FilterQuery(id:0, value:null, operator:AND, nestedFilterQueryIds:[1, 2]), aggregationsInfo:[AggregationInfo(aggregationType:COUNT, aggregationParams:{column=*}, isInSelectList:true, expressions:[*])], groupBy:GroupBy(topN:5, expressions:[facility_name]), filterSubQueryMap:FilterQueryMap(filterQueryMap:{0=FilterQuery(id:0, value:null, operator:AND, nestedFilterQueryIds:[1, 2]), 1=FilterQuery(id:1, column:created_at_seconds, value:[[1606756268		1609175468]], operator:RANGE, nestedFilterQueryIds:[]), 2=FilterQuery(id:2, column:facility_organization_id, value:[ac56d23b-a6a2-4c49-8412-a0a0949fb5ef], operator:NOT, nestedFilterQueryIds:[])}), queryOptions:{responseFormat=sql, groupByMode=sql, timeoutMs=24999}, pinotQuery:PinotQuery(dataSource:DataSource(tableName:enriched_station_orders_v1_OFFLINE), selectList:[Expression(type:FUNCTION, functionCall:Function(operator:AS, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:facility_name)), Expression(type:IDENTIFIER, identifier:Identifier(name:key_col))])), Expression(type:FUNCTION, functionCall:Function(operator:AS, operands:[Expression(type:FUNCTION, functionCall:Function(operator:COUNT, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:*))])), Expression(type:IDENTIFIER, identifier:Identifier(name:val_col))]))], filterExpression:Expression(type:FUNCTION, functionCall:Function(operator:AND, operands:[Expression(type:FUNCTION, functionCall:Function(operator:BETWEEN, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:created_at_seconds)), Expression(type:LITERAL, literal:<Literal longValue:1606756268>), Expression(type:LITERAL, literal:<Literal longValue:1609175468>)])), Expression(type:FUNCTION, functionCall:Function(operator:NOT_EQUALS, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:facility_organization_id)), Expression(type:LITERAL, literal:<Literal stringValue:ac56d23b-a6a2-4c49-8412-a0a0949fb5ef>)]))])), groupByList:[Expression(type:IDENTIFIER, identifier:Identifier(name:facility_name))], orderByList:[Expression(type:FUNCTION, functionCall:Function(operator:DESC, operands:[Expression(type:FUNCTION, functionCall:Function(operator:COUNT, operands:[Expression(type:IDENTIFIER, identifier:Identifier(name:*))]))]))], limit:5), orderBy:[SelectionSort(column:count(*), isAsc:false)], limit:5)}
Basically ended at
queryContext
part
m
Seems like a bug, mind filing an issue?
c
Sure!
m
Also, can you try NOT IN instead, just to check?
c
Just tried, and it’s the same.
m
Thanks. I'd it is a general bug though, and something in your data might be triggering this. Would it be possible for you to test similar queries on other tables?
c
On staging environment (exact same configured except different data), this issue couldn’t be repro. So I’m suspecting it has something to do with corner case of data. I’d imaging prod to have more real data.
I tried this in other table on prod and it won’t repro. I tried other columns in the same table and it won’t repro. But I don’t know what kind of data issue is causing this.
m
Hmm, would be hard to repro without knowing what to look for
What's the cardinality of the column?
c
facility_organization_id
is a couple of thousands.
m
And how many docs are you expecting in your original query result?
Seems like you want to including all orgs except one, which could be huge
c
6 million doc scanned I think
total around 100mm docs
m
I see, that is not too much
c
Thanks. https://github.com/apache/incubator-pinot/issues/6393 filed for this. Not sure what we could do to help pin point the exact issue.
m
Yeah, at this point, I think issue is only going to help with tracking. Are you open to debugging at your end (with help from community)? I presume you cannot share the data (even one segment)
c
I’ll need @Elon’s help to turn on/off flags or gather dump.
But yes I think we could debug this together when he’s ready. 🙂
✔️ 1
m
There's also
AnonymizeDataCommand
which can anonymize your data (if that can make it shareable)
k
@Jackie ^^
c
I heard this has something to do with tab character in string column
j
@Chundong Wang Do you have the stack trace after the exception log?
c
I don’t. The server log ends at the query context part without actual stack trace of the exception.
j
Based on the description, I think the problem is that there are
\t
in some
facility_name
values
Which version of Pinot are you running?
c
0.5.0
j
Is it easy to try with the latest version?
There is a fix for this issue, but not included in `0.5.0`: https://github.com/apache/incubator-pinot/pull/5858
c
I see. I’ll check with the team to see how we’d upgrade. Is
0.6.0
released yet (which I suppose included this fix)?
j
Yes, and it includes this fix