Hi folks! I'm trying to get a Pinot cluster set up in AWS, but I need to decide exactly which pieces this cluster needs so that the sysops at my company can set it up, and I'd like your opinions on this. I'll explain more in this thread.
Mayank
01/11/2022, 10:38 PM
Sure
Diogo Baeder
01/11/2022, 10:39 PM
Basically, we'll start with around 100M rows. I have this data backed up elsewhere as individual compressed objects, where each object will be inserted as one row; in total we have ~500GB of these compressed objects.
Diogo Baeder
01/11/2022, 10:39 PM
We don't need crazy fast queries, just being sub-minute is already great for us. So I'm thinking about this organization:
Diogo Baeder
01/11/2022, 10:41 PM
• 1 node for Kafka, gp2 EBS, 100GB
• 1 node for the Pinot Controller, gp2 as well (although it doesn't seem to need to be fast)
• 2 nodes for the Pinot Brokers (if it's possible to have this replication, for availability)
• 3 nodes for the Pinot Servers, gp2 with 1TB each
Diogo Baeder
01/11/2022, 10:41 PM
Does the above make sense?
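For a rough sanity check of the server sizing above, here is a back-of-the-envelope sketch in Python. Only the 100M rows, the ~500GB of compressed objects and the 3 servers with 1TB each come from this thread. The replication factors and the on-disk expansion ratios are pure assumptions for illustration, and `per_server_storage_gb` is a hypothetical helper, not part of Pinot.

```python
# Figures taken from the thread.
ROWS = 100_000_000          # ~100M rows to start with
COMPRESSED_GB = 500         # ~500GB of compressed objects in total
SERVERS = 3                 # proposed number of Pinot Servers
DISK_PER_SERVER_GB = 1000   # 1TB gp2 volume per server


def per_server_storage_gb(data_gb, replication, servers, expansion=1.0):
    """Estimate how much data each server holds.

    `replication` (copies of each segment) and `expansion` (on-disk size
    relative to the compressed backup) are assumptions, not measured values.
    """
    return data_gb * expansion * replication / servers


# Average compressed size of one object (one row), using 1 GB = 1e9 bytes.
avg_row_bytes = COMPRESSED_GB * 1e9 / ROWS
print(f"average compressed object size: {avg_row_bytes:.0f} bytes")

# Try a few assumed replication factors and expansion ratios.
for replication in (1, 2, 3):
    for expansion in (1.0, 2.0, 4.0):
        needed = per_server_storage_gb(COMPRESSED_GB, replication, SERVERS, expansion)
        fits = "fits" if needed <= DISK_PER_SERVER_GB else "does NOT fit"
        print(f"replication={replication} expansion={expansion}: "
              f"{needed:.0f} GB per server, {fits} in {DISK_PER_SERVER_GB} GB")
```

The real ratio between the compressed backup and Pinot's segment size depends on the schema and indexes, so it would have to be measured on a sample before trusting any of these numbers.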
Mayank
01/11/2022, 10:45 PM
What do you mean one object of 100M rows becomes 1 row in Pinot? If so, how are you planning to query it?