# troubleshooting
l
Question: are the servers supposed to have as many cores as there are partitions on a Kafka topic? If that's the case, what's the best way to scale up Pinot setups? The more tables we add, the more the servers have to ingest, and we run into resource contention given the cores on a server. How do you all manage this?
m
No, that is not a requirement. For scaling up, you'll have to check resource usage against your workload to see whether scaling is needed.
l
We have observed, for our setup at least, that as we keep adding tables we have had to scale up our servers with more cores.
m
Yeah, that is a function of workload; there is no required relationship between the number of partitions and the number of cores.
l
Also, the number of partitions is not a function of the number of server nodes, right? Like, if I have 16 partitions in my Kafka topic it doesn't mean I need 16 servers.
m
Yeah, no such requirement. It is a function of total ingestion across partitions while also serving queries and perhaps even generating segments at peak (aka workload). You can refer to this two-part blog: https://www.startree.ai/blog/capacity-planning-in-apache-pinot-part-1
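As a purely illustrative way to frame that (a rough sketch, not taken from the blog): the server count is driven by total workload relative to what one server can sustain, and the partition count only bounds how far consumption can be parallelized.

$$
N_{\text{servers}} \;\approx\; \left\lceil \frac{W_{\text{ingest}} + W_{\text{query}} + W_{\text{segment build}}}{C_{\text{per-server}}} \right\rceil
$$

So a topic with 16 partitions could be consumed by anywhere from 1 to 16 servers; the partition count caps consumer parallelism, it does not dictate the number of servers.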
l
One thing we do see is that, at our size for example, tables can get impacted.
We have servers with 32 cores, 2 replicas, 64 GB of RAM, and a 32 GB heap,
but our p99 metrics get impacted if we ingest data going back 7 days in our topics,
and it impacts the p99 response times of other tables.
m
You can use the ingestion rate limiter to limit the max rate at which one table can ingest, so that it doesn’t steal resources from other tables when bootstrapping
Check the table config docs page; a sketch of the relevant snippet is below.
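For reference, the rate limiter is set inside the table's `streamConfigs`. A minimal sketch is below; the topic name and the 1000 msgs/sec value are placeholders, other required Kafka consumer settings are omitted, and the exact key name should be verified against the table config docs for your Pinot version:

```json
"streamConfigs": {
  "streamType": "kafka",
  "stream.kafka.topic.name": "my-topic",
  "topic.consumption.rate.limit": "1000"
}
```

The limit applies to the whole topic for that table and is split across its partitions, so a bootstrapping table consuming 7 days of backlog can be throttled without touching the other tables' configs.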