# troubleshooting
l
hey friends, I wonder if anyone has run into issues like this: we have a table and we want to ingest data from the beginning of the topic. However, while we are ingesting this data the servers in the cluster get really, really busy and the p99 response metrics for other tables are impacted greatly. Has anyone come across this? Do you know what the bottleneck is and why the servers get so impacted in terms of response times? It's really weird that adding one table can impact the cluster so negatively.
the servers are trying to catch up, but the response times are impacted greatly
our p99 is usually 15ms, and with the new table it spikes to 300ms
j
it’s likely CPU. Pinot tries to ingest as many events as possible per second
l
right but we are at 32 cores and usually they are chilling
unless we do something like this 😄
should we over-provision or add more replicas?
j
there is a
topic.consumption.rate.limit
config described at https://docs.pinot.apache.org/basics/data-import/pinot-stream-ingestion that you can use. we’ve been meaning to experiment with it, but I can’t say for certain how well it works yet
👍 1
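For reference, a minimal sketch of where that property sits in a realtime table's streamConfigs, based on the stream-ingestion docs linked above; the topic name, consumer settings and limit value here are placeholders, not taken from this thread:

"streamConfigs": {
  "streamType": "kafka",
  "stream.kafka.topic.name": "my-topic",
  "stream.kafka.consumer.type": "lowlevel",
  "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
  "topic.consumption.rate.limit": "100000"
}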
l
omg
#pro
#PROTIP
j
the description for that config also mentions there’s tons of GC with high ingestion, so maybe check those metrics as well? I know we had to disallow “smallest” offset ingestion for our tables to avoid this same latency impact
l
we just did smallest because we need the data from the beginning of the topic, and wow, yeah, GC pauses, I need to check that too
image.png
i do see it goes up but it’s still .3% time spent on GC
j
what is that metric? i thought GC metrics were emitted as milliseconds
l
it’s % of CPU time on GC based on what i see lol
like this:
rate(jvm_gc_collection_seconds_sum{ kubernetes_namespace="pinot", component="server"}[4m])
image.png
you can see how it just spikes up, and that’s when we started ingesting all those records
and we are going right now at 200k/s
j
jvm_gc_collection_seconds_sum
sounds like seconds spent doing GC no? ~.3 sounds like 300ms which correlates with the p99 you’re seeing
😮 1
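A quick note on the unit, since this is easy to misread: jvm_gc_collection_seconds_sum accumulates seconds spent in GC, so rate() over it yields GC seconds per second of wall-clock time. A value of ~0.3 therefore means roughly 300ms of GC per second (about 30% of the time), not 0.3%. A sketch reusing the same selector as the query above; the second expression assumes the matching _count series is exported alongside the _sum, as is normal for this summary metric:

# fraction of wall-clock time spent in GC (0.3 ≈ 300ms of GC per second, i.e. ~30%)
rate(jvm_gc_collection_seconds_sum{kubernetes_namespace="pinot", component="server"}[4m])

# rough average duration of a single collection
rate(jvm_gc_collection_seconds_sum{kubernetes_namespace="pinot", component="server"}[4m])
  / rate(jvm_gc_collection_seconds_count{kubernetes_namespace="pinot", component="server"}[4m])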
l
thank you this is very insightful
it seems like it has caught up now
r
as @Johan Adami mentioned, limiting it at the table level using
topic.consumption.rate.limit
is the right way for this use case. it works quite well.
• it is also a dynamic config, i.e. there is no need to reload or reset the table for it to take effect, although it will not take effect until the next segment starts consuming
• it doesn't limit the rate at the server level, i.e. it doesn't enforce the limit across all consumers within the same server
• it only limits at the per-partition level for the lower-level consumer
l
do you all know how to establish the rate?
topic.consumption.rate.limit
like I know it’s a double, but is it messages per second or something like that it should process? @Rong R @Johan Adami
j
double topicRateLimit = streamConfig.getTopicConsumptionRateLimit().get();
double partitionRateLimit = topicRateLimit / partitionCount;
LOGGER.info("A consumption rate limiter is set up for topic {} in table {} with rate limit: {} "
    + "(topic rate limit: {}, partition count: {})", streamConfig.getTopicName(), tableName, partitionRateLimit,
    topicRateLimit, partitionCount);
MetricEmitter metricEmitter = new MetricEmitter(serverMetrics, metricKeyName);
return new RateLimiterImpl(partitionRateLimit, metricEmitter);
It seems it naively assumes equal traffic per partition, so if you set 1000 as the limit on a topic with 10 partitions, each partition consumer will get 100 as its rate limit
l
that rate limit is messages/s (?)
j
correct
i can’t tell if it’s recreated or not when partition count changes, though
l
thank you Johan you the MVP
thankyou 1
j
just some links so you or others can double check. here is where the rate limiter is created. here is where it’s created in the stream processing code. here is where the rate limiting is being done
🙏 1
l
it works like a charm
one question about rate limiting: what happens when we hit the rate limit with
topic.consumption.rate.limit
? are messages just queued up or dropped? I think they get queued, because I think we just sleep if we hit the rate limit, right?
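For what it's worth on the queued-vs-dropped question: nothing should be dropped. The usual pattern (which matches the guess above that consumption just sleeps) is a token-bucket limiter that blocks the consuming thread until a permit is available, so the consumer simply falls further behind the topic instead of losing messages. A minimal sketch of that blocking behaviour, assuming a Guava-style RateLimiter and made-up numbers; this illustrates the pattern, not Pinot's actual code:

import com.google.common.util.concurrent.RateLimiter;

public class RateLimitBehaviorSketch {
  public static void main(String[] args) {
    // Made-up numbers: a 1000 msg/s topic limit split evenly across 4 partitions.
    double topicRateLimit = 1000.0;
    int partitionCount = 4;
    RateLimiter partitionLimiter = RateLimiter.create(topicRateLimit / partitionCount);

    for (int i = 0; i < 10; i++) {
      // acquire() blocks (sleeps) the consuming thread until a permit is available,
      // so messages are delayed and the consumer lags, but nothing is discarded.
      double secondsSlept = partitionLimiter.acquire();
      System.out.printf("message %d consumed after sleeping %.3fs%n", i, secondsSlept);
    }
  }
}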