
eywek

05/21/2021, 9:01 AM
Hello, I’m having an issue with LIMIT on a table with 4.5 million rows. When I run this query:
SELECT * FROM datasource_609bc4f74e3c000300131110 ORDER BY "timestamp" ASC LIMIT 100000,10
I’m getting a result in ~2.5s, and the query response stats show totalDocs=4794306, which is fine. But when I run this one (offset 1,000,000 instead of 100,000):
SELECT * FROM datasource_609bc4f74e3c000300131110 ORDER BY "timestamp" ASC LIMIT 1000000,10
I’m getting no rows (which isn’t the expected behavior) in ~10s, and totalDocs is 569840 (which isn’t the expected behavior either?). I have a hybrid table with segmentPruning by time. Do you have any idea why I’m having this kind of issue? Thank you

Jackie

05/21/2021, 6:05 PM
Hi, can you please check the other metadata within the query response? I suspect some servers timed out because of the high offset
FYI, for offset queries, Pinot has to gather all the records, including those before the offset, i.e. 100010 records for the first query, and 1000010 records for the second query
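A toy illustration of the point above, not Pinot's actual code: when sorted data is spread over several servers, nobody knows in advance which server holds the earliest rows, so each one has to return its first offset + limit rows and the merge step throws away everything before the offset. The function names and the three-server layout are made up for this sketch.

```python
import heapq


def top_rows_per_server(server_rows, offset, limit):
    """Each server returns its first offset + limit rows in sort order."""
    need = offset + limit
    return [heapq.nsmallest(need, rows) for rows in server_rows]


def paginate(server_rows, offset, limit):
    """Merge the sorted partial results and keep only the requested page."""
    partials = top_rows_per_server(server_rows, offset, limit)
    gathered = sum(len(p) for p in partials)
    merged = list(heapq.merge(*partials))  # each partial is already sorted
    return merged[offset:offset + limit], gathered


# Hypothetical data: 3 servers holding interleaved timestamps 0..29.
servers = [list(range(i, 30, 3)) for i in range(3)]
page, gathered = paginate(servers, offset=5, limit=2)
print(page, gathered)  # 2 rows returned, 21 rows gathered to produce them
```

The work grows with the offset, not with the page size, which matches the jump from ~2.5s to a timeout when the offset goes from 100,000 to 1,000,000.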

eywek

05/25/2021, 8:01 AM
Right, it seems the servers are timing out. I hadn’t thought of that, thank you
Is there any way to fix this (without increasing timeouts)?

Jackie

05/25/2021, 3:02 PM
Unfortunately, there is currently no index that can accelerate this query. What are the JVM settings for your servers? You might want to consider increasing the memory limit

vmarchaud

05/25/2021, 3:32 PM
We have 16G of heap and around 32G available for mmapped files. Do you have any estimate of how much we should allocate?

Jackie

05/25/2021, 5:09 PM
That should be enough
The reason this query is slow is that Pinot currently tries to gather 1000010 records from each segment. If each segment has fewer than 1M records, then Pinot basically scans the whole table
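To put rough numbers on that, here is a back-of-the-envelope sketch. The segment layout (10 segments of 479,430 docs each) is an assumption chosen to land near the table size in this thread; the real segment sizes were not shared.

```python
def rows_selected(segment_sizes, offset, limit):
    """Rows gathered when every segment returns up to offset + limit rows."""
    need = offset + limit
    return sum(min(size, need) for size in segment_sizes)


# Hypothetical layout, NOT taken from the thread: 10 equal segments.
segments = [479_430] * 10

small_offset = rows_selected(segments, offset=100_000, limit=10)
large_offset = rows_selected(segments, offset=1_000_000, limit=10)

print(small_offset)  # 10 segments x 100,010 rows each
print(large_offset)  # every segment is smaller than 1,000,010, so all rows
print(large_offset == sum(segments))
```

Under this assumed layout the first query gathers about a fifth of the table, while the second one gathers every row.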
Can you please file an issue about this? If there is no overlap on timestamp between segments, we might be able to locate the segment which contains the target records
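A minimal sketch of that idea, assuming per-segment metadata with a min/max timestamp and a doc count is available. This is only an illustration of the proposal, not an existing Pinot feature, and SegmentMeta and segments_for_page are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SegmentMeta:
    name: str
    min_time: int
    max_time: int
    num_docs: int


def segments_for_page(segments, offset, limit):
    """Return (segment name, local offset, row count) for the requested page.

    Only valid when segment time ranges do not overlap and the query has no
    filter, so that doc counts alone give each segment's global row range.
    """
    ordered = sorted(segments, key=lambda s: s.min_time)
    for prev, nxt in zip(ordered, ordered[1:]):
        if prev.max_time >= nxt.min_time:
            raise ValueError("segments overlap on time; a full gather is needed")

    hits = []
    start = 0  # global rank of the first row in the current segment
    for seg in ordered:
        end = start + seg.num_docs
        lo = max(start, offset)
        hi = min(end, offset + limit)
        if lo < hi:
            hits.append((seg.name, lo - start, hi - lo))
        start = end
    return hits


# Hypothetical metadata for three non-overlapping segments.
metas = [
    SegmentMeta("seg_a", 0, 9, 100),
    SegmentMeta("seg_b", 10, 19, 100),
    SegmentMeta("seg_c", 20, 29, 100),
]
plan = segments_for_page(metas, offset=195, limit=10)
print(plan)
```

With such a plan, only the segments that actually contain the page would need to be read, instead of gathering offset + limit rows from every segment.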

eywek

05/26/2021, 8:15 AM