Hi everyone. I am planning to experiment with Pinot for our user-facing analytics use cases. Our scale is not too large (~1M DAU), and we have a small team of 3 engineers working on data engineering for the first time. We primarily use managed services on AWS. With Pinot, one of our concerns is self-managing the infrastructure, and I wanted to know what others' experience has been in this regard.
Ken Krugler
04/09/2021, 3:16 PM
We’ve been running Pinot for a few months now, using Docker containers on self-serve hardware. In general it’s been no problem, though I always worry about having Zookeeper in the mix 🙂
We did run into one cluster-killer issue, where a query with a distinct count that was too large would put the cluster in a weirdly broken state, until brokers were restarted.