Hi, are there any articles about running Pinot in production on Kubernetes? We want to learn things like the optimal server count, the number of segments per server, and the optimal resources for realtime and offline servers. I've found a few articles, but I'd like to know if there are others.
Mayank
03/25/2021, 4:03 PM
Some of it is tribal knowledge atm, but yes, it would be good to document it. What are your specific questions? Maybe I can help.
Oguzhan Mangir
03/25/2021, 7:31 PM
I actually don't know what the memory size should be for each server for a table of about 1 TB with high QPS. For example, in Druid we use 128 GB of memory per node for all tables. But in Pinot, we want to use a different tenant for each big table.
Oguzhan Mangir
03/25/2021, 7:35 PM
Also, we expect 4 segments per day and want to keep the last 2 years of data for a table. Each segment can be 100 MB to 300 MB. I really don't know the number of servers, memory size, etc.
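To put rough numbers on the retention described above, here is a back-of-the-envelope sketch. It assumes 365-day years and decimal units (1 GB = 1000 MB); the function name and its parameters are made up for illustration and are not part of Pinot.

```python
def retention_estimate(segments_per_day, retention_days, segment_mb_min, segment_mb_max):
    """Return (total segment count, min size in GB, max size in GB) for one replica."""
    total_segments = segments_per_day * retention_days
    # Assumption: decimal units, 1 GB = 1000 MB.
    min_gb = total_segments * segment_mb_min / 1000
    max_gb = total_segments * segment_mb_max / 1000
    return total_segments, min_gb, max_gb


# Numbers from the message: 4 segments/day, 2 years, 100 MB to 300 MB per segment.
total_segments, min_gb, max_gb = retention_estimate(4, 2 * 365, 100, 300)
print(f"{total_segments} segments, {min_gb:.0f} GB to {max_gb:.0f} GB per replica")
```

Under these assumptions the table holds 2920 segments and somewhere between roughly 0.3 TB and 0.9 TB per replica, which is in line with the "about 1 TB" figure mentioned earlier.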
Mayank
03/25/2021, 8:23 PM
What's the read throughput/latency requirement?
Mayank
03/25/2021, 8:24 PM
For high throughput (thousands of reads) and low latency (< 200 ms), you want at least 64 GB RAM and 32 cores per server, and you should load each server with a few hundred GB of data on SSD.
Mayank
03/25/2021, 8:26 PM
If the data can be partitioned, that really helps a lot with scaling for throughput
Mayank
03/25/2021, 8:27 PM
Say you load 200 GB per server; then for a 1 TB table you need 5 server nodes x 3 for replication, so 15 servers in total.
Mayank
03/25/2021, 8:27 PM
This will leave you with good headroom.
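The server-count arithmetic above can be sketched as follows. This is only an illustration of the rule of thumb from this thread, not an official Pinot sizing formula; the function name is hypothetical, and 1 TB is treated as 1000 GB.

```python
import math


def servers_needed(table_gb, gb_per_server, replication):
    """Return (servers per replica, total servers) for a given data load per server."""
    # Round up so that a partially filled server still counts as a whole node.
    per_replica = math.ceil(table_gb / gb_per_server)
    return per_replica, per_replica * replication


# Numbers from the thread: about 1 TB of data, 200 GB per server, replication of 3.
per_replica, total = servers_needed(1000, 200, 3)
print(f"{per_replica} servers per replica, {total} servers in total")
```

With these inputs the sketch gives 5 servers per replica and 15 in total. If 1 TB is taken as 1024 GB instead, the rounding pushes it to 6 per replica, so treat the result as a starting point rather than a precise requirement.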
Mayank
03/25/2021, 8:28 PM
If you want to discuss further, we can hop on a zoom call if that is more efficient.
Oguzhan Mangir
03/26/2021, 4:04 PM
That will be perfect. We will check our requirements again. So what about brokers? Would 3 brokers with 64 GB of memory be enough for those 5 servers?
Mayank
03/26/2021, 4:18 PM
Depending on how much CPU processing the broker is doing (that is, how big the responses from the servers are), you may need to adjust the broker footprint. Normally 3 brokers (for fault tolerance) with 64 GB each would suffice.
Mayank
03/26/2021, 4:19 PM
Again, I am giving these numbers without knowing your query workload, so take them with a grain of salt.
Mayank
03/26/2021, 4:19 PM
If you want to do a Zoom call, ping me and we can set up a time.