Apache Druid #dev

Slackbot

01/06/2024, 3:19 AM

This message was deleted.

Kai Sun

01/06/2024, 3:26 AM

This would be very useful for our use cases.

Abhishek Agarwal

01/06/2024, 4:09 PM

If you want to give it a try, Kai, I will be happy to help with review or design. We can rope in @Vadim for any web-console changes.

👍 1

Kai Sun

01/08/2024, 6:46 PM

Sure, let me have a look.

Ben Krug

03/07/2024, 12:01 AM

This would be awesome

Kai Sun

03/08/2024, 1:42 AM

Guys, let me revive this thread a little bit. For the past two months, I have worked on query performance tuning in our system, mainly to reduce query latency, specifically in the real-time path. In the meantime, I have found that a query insight tool like the one proposed above would be very valuable. More specifically, the following would be useful, and some of them are missing:
1/ per-segment query processing time
2/ per-segment waiting time for the processing queue, and current queue size
3/ per-hydrant query processing time -- missing
4/ per-hydrant waiting time / thread time -- if parallelized, missing
5/ per-query merge buffer acquisition time and current waiting queue size -- missing
6/ time spent by the grouper potentially spilling to disk after segment processing -- missing
This information should not be limited to brokers only; it should also cover data nodes (peons and historicals).

Kai Sun

03/08/2024, 1:47 AM

The main idea is:
• attribute the time spent by each query to each stage, and report contention-point statistics such as the thread pool queue length while waiting, the merge buffer queue length while waiting, or spill-to-disk activity for groupers.
• report this data collectively to some UI so that admins can get direct insight into queries and why they may be slow.
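A minimal sketch of the two points above, to make the shape of the data concrete. This is not Druid code: `QueryTrace`, the stage names, the `processing_pool` resource name, and the report keys are all hypothetical, and a real implementation would live in Druid's Java query path. It only illustrates attributing wall-clock time to named stages and sampling a queue length at the moment a query starts waiting.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class QueryTrace:
    """Hypothetical per-query trace: time per stage plus contention samples."""

    def __init__(self, query_id, clock=time.perf_counter):
        self.query_id = query_id
        self._clock = clock  # injectable so the trace can be tested deterministically
        self.stage_seconds = defaultdict(float)
        self.contention = []  # (stage, resource, queue_length) samples

    @contextmanager
    def stage(self, name):
        """Add the elapsed time of the wrapped block to the named stage."""
        start = self._clock()
        try:
            yield
        finally:
            self.stage_seconds[name] += self._clock() - start

    def record_contention(self, stage, resource, queue_length):
        """Record how long a queue was when the query started waiting on it."""
        self.contention.append((stage, resource, queue_length))

    def report(self):
        """Return a plain dict that could be emitted as a metric or shown in a UI."""
        return {
            "queryId": self.query_id,
            "stageSeconds": dict(self.stage_seconds),
            "contention": list(self.contention),
        }


# Example usage with made-up stage names and a made-up queue length.
trace = QueryTrace("example-query")
with trace.stage("segment/wait"):
    trace.record_contention("segment/wait", "processing_pool", queue_length=12)
with trace.stage("segment/process"):
    pass  # the per-segment work would run here
print(trace.report())
```

Keeping the report a flat dictionary is a deliberate choice in this sketch: the same record could feed a metrics emitter or a console view without committing to either.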

Kai Sun

03/08/2024, 1:48 AM

I will sketch a design and will most likely work on it over the next 3 or 4 months. cc: @Abhishek Agarwal, @Vadim, @Ben Krug

Abhishek Agarwal

03/08/2024, 2:51 AM

Yes, indeed, such a tool would be very useful.

👍 1

Ben Krug

03/13/2024, 1:23 AM

This kind of instrumentation, with a nice interface (e.g., SQL), would be amazing, thank you. (If not SQL, then emitting metrics.)

Ben Krug

03/13/2024, 1:25 AM

(I think of the performance schema in MySQL: after Mark Leith added this, it was widely adopted and made it into MySQL proper.)
