We did a basic POC of Pinot on bare metal EC2 host...
# general
a
We did a basic POC of Pinot on bare-metal EC2 hosts on AWS (and see that Pinot will be an excellent fit for our use cases) and are now at a stage to start working toward a production setup. Wanted to check with the community about the production setup: are there any gotchas with running the cluster via Kubernetes (Helm) vs. dedicated EC2 hosts? Does Pinot have any performance benefit when running on dedicated EC2 instances vs. Kubernetes pods at scale?
@User^: To add more details
x
Hi Abhinav, what does your scale look like? We run many Pinot clusters of different sizes on k8s and haven’t seen particular perf issues related to k8s. Even with k8s, you can also have EC2 instances dedicated to the pinot clusters.
a
Hi @User: Thanks for confirming. Can you elaborate a little more on what you mean by
Even with k8s, you can also have EC2 instances dedicated to the pinot clusters.
?
Do you mean having dedicated EC2 instances and deploying containers on them via k8s? And can all pods on a specific host share the same data segments?
x
E.g. we use k8s to deploy Pinot servers, controllers, etc. Each of them (like every Pinot server) runs on a dedicated EC2 instance. Pods (like those server pods) don’t share data segments.
👍 1
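A minimal sketch of how that one-server-pod-per-instance placement could be expressed, assuming a pod anti-affinity rule keyed on the node hostname. The labels `app: pinot` and `component: server` are hypothetical placeholders, not necessarily what the Pinot Helm chart uses; the field names are standard Kubernetes ones.

```python
import json


def one_pod_per_node_affinity(match_labels):
    """Build a pod anti-affinity stanza that keeps two pods carrying
    `match_labels` from being scheduled onto the same node."""
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": dict(match_labels)},
                    # Topology domain = one node, so at most one match per host.
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }


# Hypothetical labels for the server pods (assumption, adjust to your chart).
server_labels = {"app": "pinot", "component": "server"}
affinity = one_pod_per_node_affinity(server_labels)

# This dict would sit under `spec.template.spec.affinity` of the server workload.
print(json.dumps(affinity, indent=2))
```

With a rule like this, scaling the servers up also needs more nodes, which is the same 1:1 relationship as a non-k8s deployment.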
a
Gotcha. So does k8s offer any benefit compared to a non-k8s deployment, given that data segments are not shared?
x
The benefit of k8s I’m mainly aware of is that it simplifies operations.
👍 1
As to any perf impact, I’ll cc @User to help shed some light here.
m
Technically, if the resources provided are the same, there isn’t any difference in perf.
a
Got it. Thanks @User, @User
p
I guess our major concern (or at least mine, since I haven't used k8s before) is using a dedicated disk on the EC2 instance mounted to the k8s pod vs. using EBS volumes. I honestly don't know if the first option is possible, although I hope it is. Any pointers in that direction would be appreciated.
x
We mount EBS to the EC2 instance for the Pinot server pods to access. I think that’s handled by a k8s PersistentVolume, but I'm not an expert on that front.
👍 1
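A rough sketch of what an EBS-backed claim for a server pod might look like, assuming the servers run as a StatefulSet with a volume claim template. The claim name `data`, the storage class name `gp3`, and the 500 Gi size are all hypothetical; storage class names are defined per cluster.

```python
import json


def ebs_volume_claim_template(name, storage_class, size_gi):
    """Build one entry for a StatefulSet's `volumeClaimTemplates` list.

    ReadWriteOnce matches how an EBS volume behaves: it is attached
    read-write to a single node at a time.
    """
    return {
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


# All three arguments are illustrative assumptions, not values from this thread.
claim = ebs_volume_claim_template("data", "gp3", 500)
print(json.dumps(claim, indent=2))
```

Each replica of the StatefulSet gets its own claim from the template, so every server pod ends up with its own volume.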
a
@User: As @User mentioned, they have a 1:1 mapping from the k8s pod to the instance. Hence nothing is shared among pods in terms of segments.
m
Yes, most deployments I know of use EBS attached to VMs.
🙏 1
p
@User EBS volumes can be mapped 1:1 too. They are different from instances with a dedicated disk (instance store): I/O on EBS volumes goes over the network, while a dedicated disk is local to the instance. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html
👍 1
j
Shouldn’t we use instance store for storage, since we have replication enabled and deep storage as well in case we lose an instance? @User @User
p
Hi Jatin, that is what I am implying when I say I want to use a dedicated local NVMe SSD disk. :)
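On whether a local disk can be mounted into a pod at all: Kubernetes does have a `local` PersistentVolume type that is pinned to one node. A minimal sketch, assuming the NVMe disk is already formatted and mounted on the host. The path `/mnt/nvme0`, node name `ip-10-0-0-1`, storage class `local-nvme`, and size are hypothetical.

```python
import json


def local_persistent_volume(name, node_name, path, storage_class, size_gi):
    """Build a `local` PersistentVolume manifest bound to a single node.

    Local volumes require `nodeAffinity`, so the scheduler places the
    consuming pod on the node that physically holds the disk.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": f"{size_gi}Gi"},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            "storageClassName": storage_class,
            "local": {"path": path},
            "nodeAffinity": {
                "required": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {
                                    "key": "kubernetes.io/hostname",
                                    "operator": "In",
                                    "values": [node_name],
                                }
                            ]
                        }
                    ]
                }
            },
        },
    }


# Every argument below is an illustrative assumption.
pv = local_persistent_volume(
    "pinot-server-0-nvme", "ip-10-0-0-1", "/mnt/nvme0", "local-nvme", 900
)
print(json.dumps(pv, indent=2))
```

Data on an instance store disk does not survive the instance being stopped or terminated, so this only makes sense when segments can be reloaded from replicas or deep storage.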
m
We have seen good perf even with EBS.
I don’t recommend using local disk, as it may be volatile (and expensive).
j
@User it's a tradeoff between latency and availability. It's actually not costly; in fact, it's cheaper than EBS given the type of machine we use. P.S. What's the recommendation on machine type?
m
If it is cheaper, and satisfies your availability / latency needs, go for it @User
j
Please suggest a machine type as well.
m
That depends on your workload and SLA requirements. Will ping you on DM.