# troubleshooting
y
Hey team, I am seeing a huge variation in performance of the following queries.
```sql
select distinct DATETIMECONVERT(transaction_date, '1:DAYS:EPOCH', '1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', '1:DAYS') from transactions limit 1000 -- 80+ seconds
select distinct transaction_date from transactions limit 1000 -- 3.5 seconds
```
Can you help with how to optimize this? In the meantime, we have added another column in the `yyyy-MM-dd` format as a workaround.
m
How many docs does the filter select, and how many distinct values are there? Also, what are your JVM settings?
y
There is no filter; about 22 billion records, and 700 distinct values.
[Screenshot attached: Screenshot 2021-07-21 at 11.19.50 PM.png]
m
Yeah, so I think DateTimeConvert on 22B records might be slow.
Can you try `count(*)` + group by `DATETIMECONVERT(transaction_date, '1:DAYS:EPOCH', '1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', '1:DAYS')` to get the distinct values?
It might be slightly faster, but I'm not sure by how much. Also, adding another column will make it super fast.
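Spelled out, the suggested query would look something like this (a sketch only; latency on 22B rows may still be dominated by the per-record transform):

```sql
-- Group by the transformed value instead of running DISTINCT over it
SELECT DATETIMECONVERT(transaction_date, '1:DAYS:EPOCH', '1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', '1:DAYS') AS txn_date, COUNT(*)
FROM transactions
GROUP BY DATETIMECONVERT(transaction_date, '1:DAYS:EPOCH', '1:DAYS:SIMPLE_DATE_FORMAT:yyyy-MM-dd', '1:DAYS')
LIMIT 1000
```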
j
I don't think an aggregation group-by can be faster than distinct, though. The cost is mainly in the transform; as we can see, distinct itself is quite fast (the second query).
Ideally we should support applying the transform on the broker side after getting the distinct results, which would avoid the per-record transform.
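To see why transforming after distinct would help: with only ~700 distinct values, the conversion is trivial once it runs per-value instead of per-record. A minimal Python sketch of the idea (the helper name is mine; `'1:DAYS:EPOCH'` means the raw values are days since epoch):

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_days_to_str(days: int) -> str:
    """Convert a days-since-epoch value to a 'yyyy-MM-dd' string."""
    return (EPOCH + timedelta(days=days)).strftime("%Y-%m-%d")

# Per-value transform: ~700 conversions instead of 22 billion.
distinct_days = [0, 18628, 18829]  # example raw values
formatted = sorted(epoch_days_to_str(d) for d in distinct_days)
print(formatted)  # ['1970-01-01', '2021-01-01', '2021-07-21']
```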
Currently, one workaround is to create a derived column for this transformation and query the derived column directly.
You can create the derived column by adding an ingestion transform and doing a table reload: https://docs.pinot.apache.org/developers/advanced/ingestion-level-transformations
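For reference, the table config fragment might look roughly like this (the column name `transaction_date_str` and the exact transform expression are illustrative; check the docs linked above for the supported transform functions):

```json
{
  "ingestionConfig": {
    "transformConfigs": [
      {
        "columnName": "transaction_date_str",
        "transformFunction": "toDateTime(transaction_date * 86400000, 'yyyy-MM-dd')"
      }
    ]
  }
}
```

The new column would also need to be added to the schema (e.g. as a STRING dimension) before reloading the segments.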