# general
a
question.. is there a side-effect on ZooKeeper from running `bin/pinot-admin.sh AddTable`, or is it contained to pinot's data directory?
x
table configs will also be registered in helix which is backed by zookeeper
a
so as I understand it, I cannot bring up all-in-one pinot against one ZK and then start it later with only the same data directory, since AddTable indirectly requires write access to ZK (via Helix)
x
E.g. table configs, table ideal states and broker ideal states are written into zookeeper
Right, you need to bring up a zookeeper which you can write to first
a
yeah but I need to use same ZK or use something else to import the state back into ZK before start
I'm trying to remove the orchestration needed to externally one-time bootstrap schema
and trying to do this such that it doesn't leak traffic until that's done.. since there's a ZK state dep I think the only way is to use java at this point
unless ServiceManager can add a list of AddTable things 😄
which would be similar to bootstrapping anyway except only want to do this one-time, and not listen on service port..
ideally the image can be baked with the schema, which we can usually do except when state also has to live in ZK (e.g. cassandra etc are self-contained)
right now we double-host kafka with ZK
I know kafka eventually will drop ZK but still uses it...
one way is to double-host pinot with ZK I guess
then change order of startup..
but the flakiest thing we have has to do with lazy creation of pinot views, as traffic can pass prior to that, especially if not using slow k8s barriers
I basically have everything done to do in docker layer except ran into this ZK thing
I'm tempted to persist the ZK state 😛
but better solution is a way to use ServiceManager to install prior to passing health check
install schema I mean
I'll raise a distinct issue on this.
x
hmm, do you mean you want to have some state stored somewhere and recover from that ?
in case of that, one way is to have an external zk which you can connect to always.
another is to mount data directory somewhere in the container so you can always recover the state.
I think this is a good feature with bootstrap tables.
It can be just a simple wrapper script that starts PinotSM, waits for the `/health` check, then creates the pinot tables
But the ultimate solution is to integrate this inside PinotSM.
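A rough sketch of what such a wrapper could look like, in Python. This is only an illustration of the "wait for `/health`, then AddTable" idea, not something that ships with Pinot. The function names, the controller URL and port, and the file paths are all hypothetical, and the AddTable flags in the comment should be checked against your Pinot version.

```python
import subprocess
import time
import urllib.error
import urllib.request


def http_ok(url):
    """Return True if a GET on url answers HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False  # not listening yet, or not healthy yet


def wait_until(probe, timeout_s=120.0, interval_s=2.0):
    """Call probe() until it returns True; give up after timeout_s seconds."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


def bootstrap(probe, commands, run=subprocess.run,
              timeout_s=120.0, interval_s=2.0):
    """Wait for the health probe, then run each command in order.

    Raises if the probe never succeeds or if any command exits non-zero.
    """
    if not wait_until(probe, timeout_s, interval_s):
        raise RuntimeError("service never became healthy")
    for cmd in commands:
        run(cmd, check=True)


# Hypothetical usage (URL, port and paths are assumptions):
#
# bootstrap(
#     lambda: http_ok("http://localhost:9000/health"),
#     [["bin/pinot-admin.sh", "AddTable",
#       "-schemaFile", "/schemas/myTable_schema.json",
#       "-tableConfigFile", "/schemas/myTable_config.json",
#       "-exec"]],
# )
```

Note this still leaves the gap discussed in this thread: `/health` goes green before the tables exist, so traffic can arrive in between unless something else gates on the bootstrap finishing.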
a
the problem is that even if you remount the data dirs of pinot, the lack of metadata in ZK makes it impossible to rebuild. I would say the options are: pinot rebuilds this state when encountering a clean ZK (duplicating the storage concern in ZK), bootstrap tables (which allows staying unhealthy until bootstrap is done), or the status quo of lazy setup (https://github.com/hypertrace/pinot/blob/main/pinot-servicemanager/docker-bin/install-schema)
so what I did was mess with docker HEALTHCHECK for now, in order to pretend the feature in PinotSM existed, so that later this external messing around can be removed. The approach I took only works if you use docker-healthcheck (e.g. docker-compose v2 not v3, or explicit use of the same HEALTHCHECK in k8s etc). By embedding this into PinotSM the health barrier is more portable.. anything that reads /health will know when things are ready or not
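To make the "stay unhealthy until bootstrap is done" idea concrete, here is a minimal sketch of a gated health check in Python. It is not the actual hypertrace script. The marker file is an assumption (something the install step would have to create once the tables are added), and `gated_health` is a hypothetical name. The only fixed part is the exit code convention of Docker's HEALTHCHECK: 0 means healthy, 1 means unhealthy.

```python
import os


def gated_health(upstream_ok, marker_path):
    """Return a HEALTHCHECK-style exit code.

    0 (healthy) only when the bootstrap marker file exists AND the
    upstream health probe passes; 1 (unhealthy) otherwise.
    """
    if not os.path.exists(marker_path):
        return 1  # schema/tables not installed yet, keep traffic away
    return 0 if upstream_ok() else 1


# Hypothetical usage inside a script run by HEALTHCHECK, where
# upstream_ok would be a function that GETs the service's /health:
#
# import sys
# sys.exit(gated_health(upstream_ok, "/tmp/pinot-bootstrapped"))
```

The design choice is that the marker is checked first, so the container reports unhealthy for the whole window between the service coming up and the tables being created.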