# troubleshooting
s
Hello all. We just upgraded one of our test k8s namespaces from .10 to .11 by deleting all of the statefulsets (with cascade=orphan), then deploying, then deleting the zookeeper, broker, controller, server, and minion pods in that order, and then re-deploying. Pinot was upgraded successfully, but our segments were not downloaded from the deepstore. Looks like zookeeper doesn't have knowledge of them. We know we can use LaunchDataIngestionJob to load the deepstore segments (BTW this is a hybrid table), but I'm curious what we would do in production. Was there a step we missed that would have made pinot recognize the segments in the deepstore and automatically load them?
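For reference, the orphan delete and the manual deepstore re-ingestion described above would look roughly like this (a sketch; the namespace, statefulset names, and job spec path are placeholders, not the actual values):
```
# Orphan-delete the statefulsets so the pods and their PVs survive the delete.
kubectl -n pinot-test delete statefulset pinot-zookeeper pinot-controller \
  pinot-broker pinot-server pinot-minion --cascade=orphan

# Manually (re)load deepstore segments with the admin tool.
bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /path/to/ingestionJobSpec.yaml
```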
k
where is the zookeeper dataDir? was it using a PV or local storage
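One quick way to check (a sketch; the pod name, config path, and label assume a typical helm zookeeper deployment and may differ in yours):
```
# Print the configured dataDir from the running zookeeper pod.
kubectl -n pinot-test exec pinot-zookeeper-0 -- grep dataDir /conf/zoo.cfg

# See whether a PVC/PV is backing it.
kubectl -n pinot-test get pvc -l app=zookeeper
```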
l
hey Stuart, we're also in the process of upgrading to .11 in our test clusters. wondering why you had to use cascade=orphan? also, according to this guide https://docs.pinot.apache.org/operators/operating-pinot/upgrading-pinot-cluster there’s no need to redeploy zk
s
@Kishore G it was using a pv
@Luis Fernandez we used that option so the PVs would not get deleted (our thinking anyway)
Yeah, maybe our problem was redeploying zookeeper in the first place
In any case this was a test env so we will use these learnings
l
you have all your old tables in your test env, just not the data?
s
Correct and the realtime table started ingesting so that part was all good
m
If zk snapshots are available on the PVs, you can restore it from there. In general though, I’d run ZK separately from Pinot.
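The restore would be roughly this shape (a sketch; names and paths are illustrative, and ZK keeps its snapshots and txn logs under dataDir/version-2):
```
# Stop zookeeper, restore the snapshot files onto the PV, then bring it back.
kubectl -n pinot-test scale statefulset pinot-zookeeper --replicas=0
# ...copy the saved version-2/ contents back into the dataDir on the PV...
kubectl -n pinot-test scale statefulset pinot-zookeeper --replicas=1
```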
s
I think we just nuked zk and that was our issue sounds like
Yeah, we feel confident we could restore the data, and it looks like we caused the issue ourselves by messing w/ zk
l
if zk gets nuked and you lose information, i would imagine you would also lose your old tables and all, which is interesting. at least that has been my experience when messing with zk 😄 always scared to do anything to it
🌟 1
(given that zk stores all that information if i’m not mistaken)
s
Well we have an init script that is idempotent that creates them 🙂
l
oooo i see i see
that explains that
s
ya lol
l
i guess there should be ways to restore this info and also make copies of it elsewhere. do you have something like that set up? cause we don’t and maybe we should 😄
m
Are you destroying and recreating tables with each deployment? That should not be necessary. You can upgrade Pinot bits in a deployment without having to recreate tables, and with zero downtime
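In practice that's just an image bump rolled one component at a time (a sketch; statefulset and container names are placeholders, and the controller-then-broker-then-server order follows the upgrade guide linked above):
```
# Roll each component to the new image and wait for it to settle before the next.
kubectl -n pinot-test set image statefulset/pinot-controller controller=apachepinot/pinot:0.11.0
kubectl -n pinot-test rollout status statefulset/pinot-controller

kubectl -n pinot-test set image statefulset/pinot-broker broker=apachepinot/pinot:0.11.0
kubectl -n pinot-test rollout status statefulset/pinot-broker

kubectl -n pinot-test set image statefulset/pinot-server server=apachepinot/pinot:0.11.0
kubectl -n pinot-test rollout status statefulset/pinot-server
```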
s
Nope, it doesn't destroy them, it's idempotent (for the most part)
Yeah I think we just made a mistake here by nuking zk, lesson learned.
m
But I am still curious about the need for an idempotent script to create tables (and applying it during deployment?).
s
Well for us it's to make local development easier
We use skaffold for local dev
We do a few things in that init script: we add cluster configs, server tags, schemas, and such
For now the table portion is very simple: just create it if it doesn't exist. We don't make any changes to an existing table yet; we may add some sort of "migration" process for that later on
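The create-if-not-exists part can be as small as a check against the controller REST API (a sketch; the controller URL, table name, and config file are placeholders, not the actual script):
```
CONTROLLER=http://pinot-controller:9000
TABLE=myTable

# GET /tables lists existing table names; only POST the config if ours is missing.
if ! curl -s "$CONTROLLER/tables" | grep -q "\"$TABLE\""; then
  curl -s -X POST "$CONTROLLER/tables" \
    -H 'Content-Type: application/json' \
    -d @table-config.json
fi
```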