# troubleshooting
**Jonathan Meyer:** Hello! Is there a way to use lookups (dimTable) inside a WHERE clause? Something like
`SELECT SUM(value) FROM table WHERE user IN LOOKUP('group', 'user', 'groupId', '<groupId>')`
? (which isn't valid) Basically my goal is to fetch a list of 'users' in the dimTable using a 'group' identifier and filter on those (in the main table)
**Jackie:** This is not a look-up query. You can take a look at the "Use IdSet for Id Filtering" design: https://docs.pinot.apache.org/developers/design-documents. This query can be modeled as a subquery and solved with the `IN_SUBQUERY` function.
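A minimal sketch of what that rewrite could look like, reusing the table and column names from the question above; the `= 1` convention and the doubled-quote escaping are confirmed later in this thread, but the names themselves are just the question's placeholders:

```sql
-- Hypothetical rewrite of the LOOKUP-style filter as an IN_SUBQUERY:
-- the inner query builds an IdSet of users belonging to the group, and
-- the outer query keeps rows whose user is in that set (= 1 means "in set").
SELECT SUM(value)
FROM table
WHERE IN_SUBQUERY(user, 'SELECT ID_SET(user) FROM group WHERE groupId = ''<groupId>''') = 1
```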
**Jonathan Meyer:** Fantastic, looks like what I need, thank you @Jackie! šŸŽ‰
@Jackie Sorry to bother you; as there currently exists no user documentation (as far as I know), I've got a few questions (then I'll contribute back with a docs PR asap šŸ™‚). I'm having a hard time figuring out how to construct and use the IdSet.
1. Building the IdSet
   a. Where should I input `IdSets.create(DataType dataType)`?
   b. Creating from a query using `SELECT ID_SET(userId) FROM table`
      i. With only 2 ids (string / hex data type) in the ID_SET:
         1. The default IdSet `expectedInsertions` value returns an output (the serialized IdSet?) so long that Firefox truncates it and the Pinot UI doesn't display it
         2. Using a smaller value (`expectedInsertions:1000`) reduces the output size, but it is still very, very long
         3. Is that due to the data type of my ids?
2. Querying: ā—¦ ...
**Jackie:** @Jonathan Meyer For your use case, you should be able to use the `IN_SUBQUERY` function; no need to explicitly create an `IdSet`.
**Jonathan Meyer:** Oh! I'll try that, thanks @Jackie. I guess it is necessary to explicitly create an `IdSet` when the data is not already in a table?
**Jackie:** E.g.
`SELECT SUM(value) FROM table1 WHERE IN_SUBQUERY(user, 'SELECT ID_SET(groupId) FROM group WHERE ...') = 1`
You can send a subquery with the `ID_SET()` aggregation.
**Jonathan Meyer:** Sorry for the "spam", here's more detail in case that helps.
Main table: [screenshot]
"ids" table: [screenshot]
Query:
`SELECT entityId, dateString, value FROM mainTable WHERE IN_SUBQUERY(entityId, 'SELECT ID_SET(entityId) FROM idsTable WHERE communityId = comm1') = 1`
This returns no results, and using `= 0` returns all of them. It looks like it works without the `WHERE` clause (`IN_SUBQUERY`). But performance is really impacted: a very simple query over basically no data takes nearly a second to return a result that isn't correct.
**Jackie:** @Jonathan Meyer You need to quote the `comm1`:
`SELECT entityId, dateString, value FROM mainTable WHERE IN_SUBQUERY(entityId, 'SELECT ID_SET(entityId) FROM idsTable WHERE communityId = ''comm1''') = 1`
Without quotes, it is parsed as a column. Note that you need 2 single-quotes to escape the quote inside the subquery string.
āœ”ļø 1
**Jonathan Meyer:** I've tried messing around with quotes but couldn't figure it out haha. Thank you, it really works now šŸ™‚ I've got ~two issues~ one issue left:
1. The query is super slow (but I've seen interesting logs, I'll send them afterwards)
2. Filtering on `communityId=comm`, both entityId `[...]6a` and `[...]6b` should come up, but 6b doesn't
   a. Filtering on `communityId=comm1` only returns `[...]6a` results, as expected
   b. Oopsie, bad data
When it comes to query performance, here are the query stats from the Pinot UI: the query took 577ms. The servers seem to respond fairly quickly => 41ms + 34ms + 47ms. Looks like there are 2 identical queries for the subquery (so 3 queries in total)? Is that normal? Overall, I don't see where the other 300+ ms are coming from. The data I showed in the screenshot really is all of it (mainTable: 6 docs, idsTable: 3 docs).
**Jackie:** Yes, there are 2 identical subqueries due to the PQL deprecation; this will be fixed in the next release (the fix is already merged into the latest master but not released yet).
šŸ†— 1
**Jonathan Meyer:** Ah okay, thanks for the explanation.
Is this level of performance expected, or is there maybe a misconfiguration on my side?
**Jackie:** I think the time is spent transporting data. The query needs to run 3 RTTs between broker and server.
**Jonathan Meyer:** Sounds like a lot, no? The broker and server are on the same machine, and I've sent pretty massive payloads (like 10000 ids) with much better performance (<100ms) [& with 100M docs]
**Jackie:** If they are running on the same machine, then it's definitely too long.
**Jonathan Meyer:** Yep, I think so too.
**Jackie:** But from the log we can see it takes ~60ms for one RTT.
āž• 1
**Jonathan Meyer:** I think the explanation may lie in `SELECT ID_SET(entityId) FROM idsTable`. This query takes "forever" (~500ms) and returns a massive payload.
[attachment: id_set_logs]
**Jackie:** But you mentioned there are only 3 docs there?
**Jonathan Meyer:** Yes, that is the case. In the logs we can see `ResponseSize: 4562277`. Maybe it is because the columns are of string type? But still.
**Jackie:** Good finding. That is way too big.
Did you enable the trace?
**Jonathan Meyer:** Yep
**Jackie:** Let's try without the trace and see how big the response is.
**Jonathan Meyer:** I nearly crashed my Slack copying the trace šŸ˜‚
I'll try without the trace.
The query is said to return in ~100ms, but I can see from the UI that's not the case (easily 2s++). Here are the logs:
[attachment: Untitled]
Not exactly the same size, but still massive -> `4561917`
**Jackie:** Let me try to reproduce it. Which version are you running? `0.7.1`?
**Jonathan Meyer:** Let me check.
Nope, `0.7.0`.
Can I simply restart Pinot with the `0.7.1` image and try to reproduce? I'm not yet very informed on upgrade procedures.
**Jackie:** Hmm, `0.7.0` is not a good build. If you are just trying it locally, I'd suggest directly trying the latest image.
**Jonathan Meyer:** We won't be running the current latest in production, so isn't it a bad idea to test things out on a different version?
But I'll try latest now anyway, just to check if the problem goes away šŸ™‚
**Jackie:** Let's try `0.7.1` then; we released `0.7.1` to replace `0.7.0` because that release contains some bugs.
**Jonathan Meyer:** Okay, give my internet connection a sec šŸ˜„
Nope, still slow.
Size: `4561913`
That's on `0.7.1` (non jdk-11). Trying on latest now.
**Jackie:** Oh, I think I know the reason.
**Jonathan Meyer:** Oh? šŸ˜„
**Jackie:** `entityId` is a `STRING` column, right?
**Jonathan Meyer:** Yes. But I've tried with a `FLOAT` column too.
**Jackie:** Can you try again on an `INT` column? I cannot reproduce the issue with an `INT` column.
**Jonathan Meyer:** I'll try.
**Jackie:** For non-integer data types, Pinot has to create a bloom filter for the `IdSet`.
**Jonathan Meyer:** That's what I read, yeah. Is that so detrimental to performance?
Can't it be combined with a form of dictionary encoding? (STRING -> INT, given that cardinality is low enough)
**Jackie:** Yeah, that's a good optimization: basically keep an exact set until its size reaches the threshold. That hasn't been done yet because the feature was designed for integer/long id fields.
**Jonathan Meyer:** Ah, I see.
**Jackie:** With the current code, you may tune `expectedInsertions` to reduce the size of the bloom filter. By default it expects 5M values.
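A sketch of that tuning, reusing the `idsTable` query from earlier in the thread; the optional second argument to `ID_SET` and its `expectedInsertions`/`fpp` keys are recalled from the Pinot docs rather than confirmed here, so verify the exact syntax:

```sql
-- Hypothetical tuned subquery: size the bloom filter for ~100k ids instead of
-- the 5M default, shrinking the serialized IdSet sent back to the broker.
-- Note the doubled single-quotes: the parameter string is nested inside the
-- subquery string literal.
SELECT entityId, dateString, value
FROM mainTable
WHERE IN_SUBQUERY(
  entityId,
  'SELECT ID_SET(entityId, ''expectedInsertions=100000;fpp=0.03'') FROM idsTable WHERE communityId = ''comm1'''
) = 1
```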
**Jonathan Meyer:** Yeah, in my first experiments that's what I saw: the payload size was reduced and performance was a bit better. I'll try that again. Luckily my use case should not have more than 100k different keys.
**Jackie:** For int/long fields, we use a bitmap, which is much more compact.
āœ”ļø 1
**Jonathan Meyer:** Given that my input is this string thing (a MongoDB ObjectId, actually), I'd have to do a mapping...
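One hypothetical shape for such a mapping, assuming an INT surrogate-key column (here called `entityKey`, not a column from the thread) is populated in both tables at ingestion time; this follows from the bitmap-based `IdSet` for int/long fields mentioned above:

```sql
-- Hypothetical: with an INT surrogate key (entityKey) mapped from the string
-- ObjectId during ingestion, the IdSet is built on the INT column and can use
-- the compact bitmap representation instead of a bloom filter.
SELECT entityId, dateString, value
FROM mainTable
WHERE IN_SUBQUERY(entityKey, 'SELECT ID_SET(entityKey) FROM idsTable WHERE communityId = ''comm1''') = 1
```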
**Jackie:** `IdSet` is implemented as an interface, so we can optimize it to support a plain set for non-integer fields.
Contributions are very welcome šŸ˜‰
**Jonathan Meyer:** Should we file an issue about that? My Java skills are not there yet, but hopefully someday. If anything motivates me to dive into Java, it'd be Pinot šŸ˜‰
**Jackie:** Yes, please file an issue for this enhancement. Thanks!
**Jonathan Meyer:** Thank you for all your help!
@Jackie I confirm that performance is great when using an INT type column šŸ‘
šŸ‘ 1
Hello @Jackie Have you ever used IdSets with non-time-based tables? ("child" tables) If so, how did you handle segment replacements?
**Jackie:** @Jonathan Meyer Segment replacement is always based on the segment name: a new segment will replace the old segment with the same name. There is no limitation on using IdSet with non-time-based tables.
**Jonathan Meyer:** What I mean is: how do you name your segments so that you can easily replace them when the schema doesn't contain a datetime column?
Let's say I have a segment per `id`; how would I easily replace it? (i.e. generate a new segment with the same name)
**Jackie:** We usually call tables without a time column a `refresh` use case, where all the segments are usually replaced together. When Pinot generates segments, it appends a sequence number to the table name to form the segment name. If the same data files are provided each time, then all the segments will be replaced properly. Do you think your use case can fit into this model?
**Jonathan Meyer:** Ah yes, refresh tables, I forgot about that feature. That would be a suboptimal workaround, but thanks for pointing it out. My use case doesn't quite fit this model, as my scenario is the following:
• 1 segment per group
• Per-group refresh
• A new group can appear at any time
• (One can disappear too)
So refreshing all segments / groups every time a group is updated is a possibility, but quite a heavy one in my case.
**Jackie:** So basically you need to control the segment name generated for each group. How do you generate segments now? You should be able to directly set the segment name in the segment generation config.
**Jonathan Meyer:**
> So basically you need to control the segment name generated for each group.
Yes, exactly.
> How do you generate segments now?
Right now it follows the `simple` strategy (default), with global IDs, leading to segment names `<table>_OFFLINE_<N>` (where `N` is a number I cannot control directly and depends on the number of input files), which doesn't really work for my use case.
> You should be able to directly set the segment name in the segment generation config.
Are you talking about the `fixed` strategy? Isn't it only able to handle a single input file? (given that we provide a single [fixed] segment name)
**Jackie:** The `simple` strategy should generate segments with name `<table>_<N>` (no `OFFLINE` in between). If you want to control the segment name, a workaround would be to use the `fixed` strategy and generate one segment for each job.
āž• 1
Or design a new name generator that generates the segment name based on the input file name, as you suggested.
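A sketch of the `fixed` workaround as a batch ingestion job-spec fragment, one job per group; the `segmentNameGeneratorSpec` keys are recalled from the Pinot batch ingestion docs and the segment name is a made-up per-group value, so treat the exact layout as an assumption:

```yaml
# Hypothetical job-spec fragment: run one ingestion job per group with a fixed,
# group-derived segment name, so re-running the job for that group produces a
# segment with the same name and replaces the previous one.
segmentNameGeneratorSpec:
  type: fixed
  configs:
    segment.name: 'mainTable_group_comm1'
```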
**Jonathan Meyer:** Thanks @Jackie, I came to the same conclusion. Maybe there's one more option:
• For "single group update", using the `fixed` type sounds like it should work
• For a full refresh, I'm not sure: would a table with `segmentPushType: refresh` work? Would every previous segment be discarded whenever a new "batch" (i.e. a job with multiple files) of segments is generated?
Edit: Actually no, that wouldn't work. But I've realized that running multiple ingestion jobs is probably my best and simplest solution. As for the name generator solution, there's an open issue :) -> https://github.com/apache/incubator-pinot/issues/7090 Thanks for the discussion @Jackie!
šŸ‘ 1