hey friends, i have a question regarding `Table Consuming Latency` | Apache Pinot #troubleshooting

Luis Fernandez

03/18/2022, 4:12 PM

hey friends, i have a question regarding `Table Consuming Latency`.

I have been turning various parts of Pinot off and on to see how it behaves. This time I turned off, for some time, the Kafka app that produces the records for Pinot. When I turned the app off I saw a latency increase, at least for p99: it was 160 ms and is now over a minute. When something like this happens, when do you expect Pinot to get back to its regular level? Does it ever get back? I was thinking that as the day goes by and this topic starts to get less traffic, things might come down, but I was wondering whether it can recover any other way. Ofc this is still pretty fast, but I'm wondering what happens if I take the app down for longer: how could that impact the p99 times?

Mayank

03/18/2022, 4:37 PM

Assuming the > 1 min latency you are referring to is for consumption, I'd say consumption catches up pretty fast. However, as you can imagine, it is a function of data size, number of partitions, number of servers, etc. I'd recommend testing it for the practical scenarios you think you will run into.

Luis Fernandez

03/18/2022, 4:40 PM

this is 2 servers and 16 partitions; the Kafka app has only 1 replica. It's back up to speed and we are processing 4k messages/sec

Rong R

03/18/2022, 4:41 PM

can you share how you measure the "Table Consuming Latency" during the time the app is turned off?

Luis Fernandez

03/18/2022, 4:41 PM

avg by (table) (pinot_server_freshnessLagMs_XXthPercentile{kubernetes_namespace="$namespace"})

Luis Fernandez

03/18/2022, 4:44 PM

when I turned off the Kafka app it went up, particularly for p99

Rong R

03/18/2022, 4:51 PM

ok so that value is measured by

System.currentTimeMillis() - minConsumingFreshnessMs

in the case where your app was turned off, it is basically a linear function of wall-clock time.

Rong R

03/18/2022, 4:52 PM

(since your minConsumingFreshnessMs is the timestamp of your last ingested Kafka msg)
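
To make the point above concrete, here is a minimal sketch of the formula as quoted in the thread. The class and method names are hypothetical and are not Pinot's actual API; the only thing taken from the thread is the expression `System.currentTimeMillis() - minConsumingFreshnessMs`. While no new messages arrive, the second term stays fixed, so the lag grows one millisecond per millisecond of wall-clock time.

```java
// Hypothetical sketch: names are illustrative, not Pinot's real classes.
final class FreshnessLagSketch {

    // Lag as described in the thread: wall-clock time minus the timestamp
    // of the last ingested Kafka message.
    static long freshnessLagMs(long nowMs, long minConsumingFreshnessMs) {
        return nowMs - minConsumingFreshnessMs;
    }

    public static void main(String[] args) {
        // Assumed scenario: the producer stops right after this timestamp.
        long lastIngestedMs = System.currentTimeMillis();

        // With no new messages, the lag simply tracks elapsed wall-clock time.
        for (long elapsedMs = 0; elapsedMs <= 60_000; elapsedMs += 20_000) {
            long nowMs = lastIngestedMs + elapsedMs;
            System.out.println("elapsed=" + elapsedMs + "ms lag="
                    + freshnessLagMs(nowMs, lastIngestedMs) + "ms");
        }
    }
}
```

Under this reading, a lag of over a minute after stopping the producer for over a minute is expected and does not by itself indicate slow consumption.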

Rong R

03/18/2022, 4:52 PM

is it possible for you to test turning the app back on and see how fast this metric gets back to 0, like Mayank suggested?

Luis Fernandez

03/18/2022, 4:53 PM

right, the app is already back on but that metric is not going down

Luis Fernandez

03/18/2022, 4:53 PM

image.png

Luis Fernandez

03/18/2022, 4:54 PM

that’s p99

Rong R

03/18/2022, 4:54 PM

can you make a query?

Luis Fernandez

03/18/2022, 4:55 PM

like MAX time or something like that on the table?

Rong R

03/18/2022, 4:55 PM

just a count(*) is fine

Luis Fernandez

03/18/2022, 4:56 PM

yea i can

Rong R

03/18/2022, 4:57 PM

and check if the metric comes down

Luis Fernandez

03/18/2022, 4:58 PM

they don’t come down

Luis Fernandez

03/18/2022, 4:58 PM

but i’m curious how does it relate

Rong R

03/18/2022, 4:59 PM

upon checking the code path, that metric is only updated when a query is being processed.
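
A rough sketch of what "only updated when a query is being processed" would imply, assuming the behaviour is as described above. All names here are hypothetical and this is not Pinot's implementation: the ingestion path records the newest message timestamp, but the reported value is recomputed only on the query path, so it stays stale until the next query against that table.

```java
// Hypothetical sketch of a query-driven metric; not Pinot's actual code.
final class QueryDrivenLagSketch {
    private volatile long lastIngestedMs;
    private volatile long reportedLagMs;

    // Ingestion path: remembers the newest message timestamp,
    // but does not touch the reported metric.
    void onMessageIngested(long messageTimestampMs) {
        lastIngestedMs = messageTimestampMs;
    }

    // Query path: the only place where the reported lag is recomputed.
    void onQuery(long nowMs) {
        reportedLagMs = nowMs - lastIngestedMs;
    }

    // What a metrics scraper would read between queries.
    long reportedLagMs() {
        return reportedLagMs;
    }
}
```

In this model, the reported lag would drop as soon as a query hits the table after ingestion resumes, which is why running a simple query is a useful check.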

Luis Fernandez

03/18/2022, 4:59 PM

ohh, this cluster is actively getting some queries

Rong R

03/18/2022, 4:59 PM

yeah but are those queries specifically for that table?

Luis Fernandez

03/18/2022, 5:01 PM

yes specific for this table

Luis Fernandez

03/18/2022, 5:01 PM

the metric too

Rong R

03/18/2022, 5:04 PM

oh. interesting!
