Apache Pinot

Quick question: is it possible to run a query on a certain table, but with an IN_SUBQUERY where the subquery is run against another table?

Thanks. Are there any plans to support it though?

Likely yes, but no concrete timeline at the moment. This is venturing into full support for nested queries and joins 

Got it, makes sense. Thanks man! :slightly_smiling_face:

<@U02BE3NN5U7> can you please try the query.. I dont remember restricting IN_SUBQUERY to the same table.

Yes, please try out <@U02BE3NN5U7>, I might have missed something here.

Ah, nice, I'll try it out then :slightly_smiling_face:

Well my friends it just so happens that... YES, it does support subqueries to different tables!!! Yay!!! \o/

For reference, this is my query:
```select weight, caseid as weights_caseid from weights where in_subquery(caseid, 'select id_set(caseid) from brands_metrics where brand_id = 1000226 and date_ &gt; 20150501 and date_ &lt; 20150515 limit 1000000') = 1 and in_subquery(caseid, 'select id_set(caseid) from filters where name = ''gender'' and values_ = 2 and date_ &gt; 20150501 and date_ &lt; 20150515 limit 1000000') = 1 and date_ &gt; 20150501 and date_ &lt; 20150515 limit 1000000```
and I get results!

Guys, this is amazing news, because the fact that this works means that I don't even need to use Trino or anything like that, because then I can just do the aggregation of the data in the Pinot Broker itself! And it runs much faster than in Trino, actually :slightly_smiling_face:

Thanks for confirming <@U02BE3NN5U7>, we will add this to the doc. cc: <@U01DS9A0DS4>

Dude, I am *SO* happy with this... it's a game-changer for me, really! :heart:
Next on: I'll change our structure and do some more complex stuff with Groovy in order to have some calculations done over JSON columns.
And let me say this again: Pinot is ridiculously fast, OMG!!!

By the way: the query above ran in 28ms on my computer, with 1 year worth of data, but where each segment contains data for 1 day only - in total I have 202 segments (it's where we have data). If I was doing monthly merge roll-ups, this would be even faster. I can already see this awesome database becoming our next data source for our main data backend! :slightly_smiling_face:

Glad you are able to use Pinot to solve your use cases.

Yep. We already use it for internal analytics, which is important but not mission-critical; Now this experiment I'm doing is to serve our main product, so if everything works, we'll put up another cluster, but a much more robust one, with more and beefier nodes.

Have updated the docs to indicate you can run a sub query against any table <https://docs.pinot.apache.org/users/user-guide-query/filtering-with-idset>

Awesome, thanks man! :slightly_smiling_face: