Hello :slightly_smiling_face: Are subqueries suppo...
# troubleshooting
j
Hello 🙂 Are subqueries supported in Pinot ? Such as 1.
SELECT * FROM table WHERE x in (SELECT ...)
and 2.
SELECT * FROM (SELECT ...)
(i.e. working with the output of a subquery)
My use case in particular right now is 2. Like:
SELECT SUM(agg_value) FROM (SELECT dateString, AVG(value) as agg_value FROM table GROUP BY dateString)
Or maybe there's another way to express this query ?
The data looks like this
Copy code
date    | 01 | 02 | 03
entity1 | 1  |  2 |  3 
entity2 | 3  |  4 |  5
The idea is to AVG over entity1 and entity2 per day, then SUM those results over all days
m
Pinot does not support generic nested queries. There's some limited use case like: https://docs.pinot.apache.org/basics/indexing/json-index#nested-filter-expression
thankyou 1
🆗 1
@Jackie is there way to model this query using post-aggregation?
j
I've tried to find some info on post-aggregation (I've found a PR about this), but couldn't find much ^^ Would really appreciate @Jackie input then 😄
j
Unfortunately this cannot be achieved with the post-aggregation
j
@Mayank @Jackie Is there another approach to solving that query ? Ideally as much as possible in realtime (i.e. no precomputation), within Pinot
j
In the subquery, we need to return the results for all groups
j
Yes Can this subquery result be "exploited" further in any way ? Without resorting to client level aggregations ?
j
As a workaround, you may issue
SELECT AVG(value) FROM table GROUP BY dateString LIMIT 10000
(assuming the cardinality of
dateString
is smaller than 10000), then do the sum on the client side
j
Yes, luckily dateString cardinality is low (<1000) I'll see how performance fares in that case then Do you think any upcoming feature would allow this query to be done in Pinot ?
j
We don't have a feature already planned to support this
m
One option is to use Pinot + Presto if you have nested queries like this.
j
We can probably build a simple query engine on the broker to support simple subquery like this
Which takes a result table as the source
Yes, using Presto might be an available option supported now
✔️ 1
j
One option is to use Pinot + Presto if you have nested queries like this.
Sounds a bit overkill for that use case as it's not currently available in my environment but thanks for the idea 🙂
@Jackie I see, maybe some day then ! 😄
j
Can you please file an issue about this feature so that we keep track of it
j
Sure, will do
Thank you very much for the feedback @Mayank & @Jackie
m
👍
j
https://github.com/apache/incubator-pinot/issues/7043 I'm having a bit of trouble finding the appropriate terms to describe the feature request, feel free to edit & add anything 🙂