Apache Pinot #general

Adrian Cole

09/04/2020, 1:30 AM

question.. is there a side-effect on ZooKeeper from running `bin/pinot-admin.sh AddTable`, or is it contained to pinot's data directory?

Xiang Fu

09/04/2020, 2:26 AM

table configs will also be registered in helix which is backed by zookeeper

Adrian Cole

09/04/2020, 2:27 AM

so as I understand it, I cannot bring up all-in-one pinot against one ZK, then start it later only with the same data directory, as AddTable indirectly requires write access to ZK (via Helix)

Xiang Fu

09/04/2020, 2:27 AM

E.g. Table configs/table ideal states/broker ideal states are written into zookeeper

Xiang Fu

09/04/2020, 2:28 AM

Right, you first need to bring up a zookeeper which you can write to

Adrian Cole

09/04/2020, 2:28 AM

yeah but I need to use same ZK or use something else to import the state back into ZK before start

Adrian Cole

09/04/2020, 2:32 AM

I'm trying to remove the orchestration needed to externally one-time bootstrap schema

Adrian Cole

09/04/2020, 2:33 AM

and trying to do this such that it doesn't leak traffic until that's done.. since there's a ZK state dep I think the only way is to use java at this point

Adrian Cole

09/04/2020, 2:33 AM

unless ServiceManager can add a list of AddTable things 😄

Adrian Cole

09/04/2020, 2:34 AM

which would be similar to bootstrapping anyway except only want to do this one-time, and not listen on service port..

Adrian Cole

09/04/2020, 2:35 AM

ideal though is the image can be baked with schema, as we can do this usually except when state has to also live in ZK (ex cassandra etc are self-contained)

Adrian Cole

09/04/2020, 2:35 AM

right now we double-host kafka with ZK

Adrian Cole

09/04/2020, 2:36 AM

I know kafka eventually will drop ZK but still uses it...

Adrian Cole

09/04/2020, 2:36 AM

one way is to double-host pinot with ZK I guess

Adrian Cole

09/04/2020, 2:36 AM

then change order of startup..

Adrian Cole

09/04/2020, 2:37 AM

but the flakiest thing we have has to do with lazy creation of pinot views, as traffic can pass prior to that, especially if not using slow k8s barriers

Adrian Cole

09/04/2020, 2:37 AM

I basically have everything done in a docker layer, except I ran into this ZK thing

Adrian Cole

09/04/2020, 2:38 AM

I'm tempted to persist the ZK state 😛

Adrian Cole

09/04/2020, 2:39 AM

but better solution is a way to use ServiceManager to install prior to passing health check

Adrian Cole

09/04/2020, 2:39 AM

install schema I mean

Adrian Cole

09/04/2020, 2:39 AM

I'll raise a distinct issue on this.

Adrian Cole

09/04/2020, 2:52 AM

https://github.com/apache/incubator-pinot/issues/5977

Xiang Fu

09/04/2020, 7:06 AM

hmm, do you mean you want to have some state stored somewhere and recover from that?

Xiang Fu

09/04/2020, 7:06 AM

in that case, one way is to have an external zk which you can always connect to.

Xiang Fu

09/04/2020, 7:06 AM

another is to mount data directory somewhere in the container so you can always recover the state.

Xiang Fu

09/04/2020, 7:08 AM

I think this is a good feature with bootstrap tables.

Xiang Fu

09/04/2020, 7:09 AM

It can be just a simple wrapper script to start PinotSM and wait for the `/health` check, then create pinot tables
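
A minimal sketch of the wrapper idea described here: poll a health endpoint, then run `AddTable` once per table. It is not an existing Pinot tool. The function names are hypothetical, and the `-schemaFile`, `-tableConfigFile` and `-exec` flags are assumptions that should be checked against the `pinot-admin.sh` version in use.

```python
import http.client
import subprocess
import time
import urllib.request


def http_ok(url, timeout=2.0):
    """Return True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts and non-2xx answers all mean "not ready".
        return False


def wait_until_healthy(probe, attempts=60, delay=2.0, sleep=time.sleep):
    """Call `probe()` until it returns True or `attempts` runs out."""
    for _ in range(attempts):
        if probe():
            return True
        sleep(delay)
    return False


def add_table_command(schema_file, table_config_file, admin="bin/pinot-admin.sh"):
    """Build the AddTable invocation (flag names are assumptions to verify)."""
    return [admin, "AddTable",
            "-schemaFile", schema_file,
            "-tableConfigFile", table_config_file,
            "-exec"]


def bootstrap(tables, probe, run=subprocess.check_call):
    """Wait for the health probe, then register each (schema, table config) pair."""
    if not wait_until_healthy(probe):
        raise RuntimeError("Pinot did not become healthy in time")
    for schema_file, table_config_file in tables:
        run(add_table_command(schema_file, table_config_file))
```

It could be called as `bootstrap([("schema.json", "table.json")], probe=lambda: http_ok("http://localhost:9000/health"))`, where the file names, host and port are placeholders. Note that this still creates tables after `/health` already passes, so it does not close the traffic gap discussed above.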

Xiang Fu

09/04/2020, 7:23 AM

But the ultimate solution is to integrate this inside PinotSM.

Adrian Cole

09/07/2020, 1:10 AM

the problem is that even if you remount the data dirs of pinot, the lack of metadata in ZK makes it impossible to rebuild. I would say that pinot could rebuild this state when encountering a clean ZK (duplicating the storage concern in ZK), allow bootstrap tables (which allows it to stay unhealthy until bootstrap), or keep the status quo lazy setup (https://github.com/hypertrace/pinot/blob/main/pinot-servicemanager/docker-bin/install-schema)

Adrian Cole

09/07/2020, 1:12 AM

so what I did was mess with docker HEALTHCHECK for now, in order to pretend the feature in PinotSM existed, and later this external messing around can be removed. The approach I did now only works if you use docker-healthcheck (ex docker-compose v2 not v3, or explicit use of the same HEALTHCHECK in k8s etc). By embedding into PinotSM the health barrier is more portable.. anything that reads /health will know when things are ready or not
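
One way to picture the health barrier described here is a check that reports healthy only once the service answers and the one-time schema install has finished. This is a sketch under assumptions, not the script actually used: the marker file and both function names are hypothetical. The exit codes follow Docker's HEALTHCHECK convention, where 0 means healthy and 1 means unhealthy.

```python
import os
import urllib.request


def service_responds(url, timeout=2.0):
    """True if a GET on `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False


def healthcheck_exit_code(service_ok, marker_path):
    """Return 0 only once the service answers AND the bootstrap marker exists."""
    # marker_path is a hypothetical file the schema install step would
    # create when it finishes; until then the container stays unhealthy.
    if service_ok() and os.path.exists(marker_path):
        return 0
    return 1
```

A small entry script could pass the result to `sys.exit`, with `service_ok` wrapping `service_responds` for whatever health URL the container exposes.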
