# troubleshooting
s
Before I reinvent the wheel here: has anyone created a Kubernetes init container that checks that all servers are available before running table creation scripts? I've got an init container in my server StatefulSets that waits for the controller, and that works great, but my table init script now needs to wait for the servers to be available too. My current plan is to get my expected replica count and compare it to the Pinot instances array (after filtering it down to only the servers).
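A rough sketch of that plan in shell, to make the idea concrete. It assumes the controller's `/instances` endpoint returns a JSON list of instance names and that server names start with `Server_`; the function names, the polling interval, and the timeout are made up for illustration. It only checks that servers are registered with the controller, which may not be the same as each server being healthy.

```shell
#!/usr/bin/env bash
# Sketch only: the "Server_" name prefix, the timeout, and the function
# names are assumptions, not something confirmed in this thread.

# Count entries that look like server instances in the JSON read from stdin,
# e.g. {"instances":["Controller_x_9000","Server_y_8098"]} -> 1
count_servers() {
  grep -o '"Server_[^"]*"' | wc -l | tr -d ' '
}

# Poll the controller until at least <expected> servers are registered,
# or give up after <timeout_s> seconds (default 300).
wait_for_servers() {
  local controller_url="$1" expected="$2" timeout_s="${3:-300}"
  local waited=0 found
  while :; do
    found=$(curl -s "$controller_url/instances" | count_servers)
    if [ "${found:-0}" -ge "$expected" ]; then
      return 0
    fi
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "only ${found:-0}/$expected servers registered after ${waited}s" >&2
      return 1
    fi
    echo "waiting for pinot servers (${found:-0}/$expected)"
    sleep 2
    waited=$((waited + 2))
  done
}
```

In an init container or post-deploy job this could be called as `wait_for_servers "http://$PINOT_CONTROLLER" "$EXPECTED_REPLICAS" 300`, where both variables are placeholders for whatever the chart provides.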
m
You want to wait until all Replicas are up? That might be too strict. Cc @Xiang Fu
s
Yes, but I'm a little torn on the idea. We have an idempotent table creation shell script that runs in a Kubernetes post-deploy job, so when we deploy to our local or a GKE environment, that job waits for the controller to be up. That definitely has to happen. But later we noticed some errors in the controller b/c it was trying to do things before the servers were totally ready. I'll have to dig for the issues we saw. My theory is that maybe we don't have to do this b/c the controller would sort things out eventually as the servers came up.
m
Yes, the state of what is available and loaded is maintained, and queries will be routed accordingly.
s
So I guess I'm curious about your experience in production Kubernetes environments: have you seen init containers used in the server StatefulSets to wait for the controller to be ready?
m
We use health checks + timeouts
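One way to read "health checks + timeouts" as a script, sketched as a bounded polling loop rather than a Kubernetes probe. It assumes the endpoint answers with the literal body `OK` when healthy, as the controller's `/health` does in this thread; the function name and the default timeout and interval are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: wait_for_ok and its defaults are illustrative assumptions.

# Poll <url> until it returns the body "OK", failing after <timeout_s> seconds.
wait_for_ok() {
  local url="$1" timeout_s="${2:-120}" interval_s="${3:-2}"
  local waited=0
  until [ "$(curl -s --max-time 5 "$url")" = "OK" ]; do
    if [ "$waited" -ge "$timeout_s" ]; then
      echo "timed out after ${waited}s waiting for $url" >&2
      return 1
    fi
    echo "waiting for $url"
    sleep "$interval_s"
    waited=$((waited + interval_s))
  done
}
```

Unlike an unbounded `while` loop, this returns a non-zero status when the timeout is hit, so the job fails visibly instead of hanging.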
s
So here's the one we use for our table init script; it waits for the controller to be ready:
- name: wait-pinot-controller
  image: {{ .Values.deployImage }}
  imagePullPolicy: {{ .Values.imagePullPolicy }}
  command: ['bash', '-c', 'while [[ "$(curl -X GET http://{{ $pinotService }}/health -s)" != "OK" ]]; do echo waiting for pinot controller to be created; sleep 2s; done']
I was looking for an equivalent approach to check that the servers are ready too. I put an init container on my servers that waits for the controller, and my table creation script (which also waits for the controller) fails b/c the servers were still waiting on the controller as well. So both start at the same time, and my table script tries to add tags to servers that aren't quite available yet. So like I said, I may have gone too far and maybe should rip the init container out of my servers, i.e. remove the wait.
So maybe this question will help me. Is the controller health check going to return ok if all of the servers are not ready?
I'm pretty sure it will return OK regardless of whether the servers are ready. There seems to be no dependency between controller status and server status, which kinda makes sense if true.
m
Yes, controller should not block on servers
Because you may have 100s of servers and thousands of tables and you may be doing a rolling restart
s
So we want to script the addition of tags to servers, so in order for those to be successful, the servers need to be up and running.
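A sketch of what scripted tagging could look like once the servers have registered. It assumes the controller exposes `PUT /instances/{instanceName}/updateTags?tags=...` and that server instance names start with `Server_`; the function names and the example tag are hypothetical, so check them against your controller's API before relying on this.

```shell
#!/usr/bin/env bash
# Sketch only: the updateTags endpoint shape, the "Server_" prefix, and
# the function names are assumptions to verify against your Pinot version.

# Print one server instance name per line from the /instances JSON on stdin.
list_servers() {
  grep -o '"Server_[^"]*"' | tr -d '"'
}

# Apply the same comma-separated tag list to every registered server.
tag_servers() {
  local controller_url="$1" tags="$2" instance
  curl -s "$controller_url/instances" | list_servers | while read -r instance; do
    echo "tagging $instance with $tags"
    curl -s -X PUT "$controller_url/instances/$instance/updateTags?tags=$tags"
    echo
  done
}
```

For example, `tag_servers "http://$PINOT_CONTROLLER" "DefaultTenant_OFFLINE"`. Because each call sets the tag list rather than appending to it, re-running the script should leave the servers in the same state.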
m
You can decouple controller being ready from table being ready?
s
so for now the only condition we wait to be ready before running our table scripts is that the controller be ready
but my concern is that's not enough, it also needs to wait until all servers are ready I think
m
I guess I don’t understand the requirements well. Is this a first-time setup, a rolling restart, or something else?
And how often do you intend to do this operation?
s
so we deploy to production every 4 weeks, and can deploy to dev and our test env on demand, and of course locally on demand. So we have to work for both first time and later deploys.
I could totally be overthinking it, but I have to try to cover any potential issues. I'm assuming Pinot handles most of my concerns naturally; say I scale up a server, I'm assuming the tags will be copied over too.
We are generally not given access to our production environments, so everything has to be scripted and idempotent for the most part.
m
Table creation and server tagging are one-time operations and don’t tie into deployment.
s
what if I need to add columns to the table though? Or make other config changes.
say I want to change my tier storage config from 7d to 14d
m
Ok, when you say deploy, are you deploying Pinot, or something else?
The changes you mentioned above can be done dynamically via rest-api calls, you don’t need to restart/redeploy Pinot for that
That would be really bad UX 🙂
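To illustrate applying such changes through the REST API from an idempotent script, here is a minimal create-or-update sketch. It assumes `GET /tables` lists existing table names, `POST /tables` creates a table, and `PUT /tables/{tableName}` updates its config; the function names and the config file path are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: function names and the config file are illustrative, and the
# endpoint behaviour should be checked against your controller version.

# Succeeds if the quoted table name appears in the GET /tables JSON on stdin.
table_exists() {
  grep -qF "\"$1\""
}

# Update the table config if the table exists, otherwise create the table.
apply_table_config() {
  local controller_url="$1" table_name="$2" config_file="$3"
  if curl -s "$controller_url/tables" | table_exists "$table_name"; then
    curl -s -X PUT -H 'Content-Type: application/json' \
      -d @"$config_file" "$controller_url/tables/$table_name"
  else
    curl -s -X POST -H 'Content-Type: application/json' \
      -d @"$config_file" "$controller_url/tables"
  fi
}
```

Running `apply_table_config "http://$PINOT_CONTROLLER" events events_offline.json` on every deploy would then push a config change such as the tier storage period without restarting Pinot.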
s
LOL
Right, that makes sense. It almost makes me think we need the concept of migrations for Pinot.