https://pinot.apache.org/ logo
w

Will Briggs

01/22/2021, 4:56 PM
I’m running into a situation where Pinot is using a star-tree index to satisfy a query in one case, but not in another, and the queries are almost identical. This one does not use the star tree:
Copy code
SELECT dimension, SUM(metric) AS totalMetrics FROM myTable WHERE otherDimension='filterValue' AND eventTimestamp >= cast(now() - 172800000 as long) GROUP BY 1 ORDER BY 2 DESC LIMIT 10
This one uses the star tree:
Copy code
SELECT dimension, SUM(metric) AS totalMetrics FROM myTable WHERE otherDimension='filterValue' AND eventTimestamp >= 1611161288000 GROUP BY 1 ORDER BY 2 DESC LIMIT 10
It looks like the use of a dynamically-computed timestamp value is confusing the optimizer somehow? the
eventTimestamp
column is not part of my star-tree index in either case.
m

Mayank

01/22/2021, 5:22 PM
How did you find out that StarTree was used for one query?
w

Will Briggs

01/22/2021, 5:31 PM
Trace
m

Mayank

01/22/2021, 5:49 PM
You are right, the code to check if star-tree can be used or not does not handle the expression
cast(now() - 172800000 as long)
. You can file an issue for the same.
w

Will Briggs

01/22/2021, 6:03 PM
@Mayank I’m new to the Pinot codebase, but if you can point me in the right direction, I’d be happy to see if I can put together a PR to address this.
m

Mayank

01/22/2021, 6:03 PM
Nice
Let me post a pointer here
Note, after optimization lhs and rhs for the time predicate are swapped
w

Will Briggs

01/22/2021, 6:16 PM
@Mayank Correct me if I’m wrong, but based on a quick look, it appears that the problem might actually be in
extractPredicateEvaluatorsMap
, and not in
isFitForStarTree
?
m

Mayank

01/22/2021, 6:17 PM
yes
isFitForStarTree
is the high level entry point for you to get the algorithm. The code that passes the predicateColumns is the one that needs to be checked.
w

Will Briggs

01/22/2021, 6:18 PM
Got it, perfect, and thank you.
👍 1
m

Mayank

01/22/2021, 6:18 PM
You'll jump a few classes from this entry point to go to the exact place that needs fix, but you'll get a better picture of what is going on
w

Will Briggs

01/22/2021, 6:20 PM
Yup, looks like the various Aggregation*PlanNode instances. I’ll spend some time on it after I get done my day job, thanks again!
m

Mayank

01/22/2021, 6:29 PM
They'll probably lead you to
Copy code
StarTreeUtils.extractPredicateEvaluatorsMap
w

Will Briggs

01/22/2021, 6:30 PM
Yeah, they did, that was what I referenced above 🙂
m

Mayank

01/22/2021, 6:30 PM
Oh yeah you did
w

Will Briggs

01/22/2021, 8:36 PM
@Jackie Saw you commented on the time column predicate issue… wanted to tag you here for context.
j

Jackie

01/22/2021, 8:41 PM
@Will Briggs I think the problem here is that
1611161288000
is in micros instead of millis
Then everything is pruned out
m

Mayank

01/22/2021, 8:42 PM
Copy code
I did a simple test to compile the query to BrokerRequest, and do see cast(now() - 172800000 as long) as LHS for predicate. My test did not actually go through any query execution.
w

Will Briggs

01/22/2021, 8:43 PM
@Jackie 1611161288000 is ms since epoch
j

Jackie

01/22/2021, 8:45 PM
Oh, sorry my bad
👍 1
The problem is not how the predicate is evaluated, but why the first query is not converted to the second query on the broker side
m

Mayank

01/22/2021, 8:49 PM
Because of use of a function (now()) as opposed to a constant eval? (just guessing)
j

Jackie

01/22/2021, 8:55 PM
Seems working without the cast
w

Will Briggs

01/22/2021, 8:56 PM
yes, it seems to be the cast. The only reason the cast is necessary is being addressed here: https://github.com/apache/incubator-pinot/pull/6403
m

Mayank

01/22/2021, 8:56 PM
Yeah, I am working with @Amrish Lal on ^^
w

Will Briggs

01/22/2021, 8:57 PM
Yes, I was just thanking him earlier because I noticed that PR 🙂
I didn’t realize these were so closely intertwined, though
m

Mayank

01/22/2021, 8:58 PM
I have concerns with this PR, so we would probably find other ways to fix the problem this PR is trying to address.
j

Jackie

01/22/2021, 9:03 PM
The cast is not invoked on broker side because cast is not registered as a scalar function
w

Will Briggs

01/22/2021, 9:03 PM
So presumably there’s something buried in the
PinotQuery2BrokerRequestConverter
(or thereabouts) that is reifying eligible expressions into constants (e.g.,
now()
) before handing the query off from the Broker for execution, and it isn’t handling
cast
function expressions?
ah
j

Jackie

01/22/2021, 9:04 PM
We can add a scalar function for cast, then it should work on broker side (compile time)