# troubleshooting
l
hey friends, i’m seeing our ZooKeeper disk space almost reach max usage. we have 5GB of disk space, and we currently have it set up with
Copy code
- name: ZK_PURGE_INTERVAL
              value: "1"
            - name: ZK_SNAP_RETAIN_COUNT
              value: "3"
in the logs i can see things getting set:
Copy code
2022-04-15 16:14:35,914 [myid:1] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2022-04-15 16:14:35,915 [myid:1] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 1
2022-04-15 16:14:35,979 [myid:1] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
2022-04-15 16:14:35,980 [myid:1] - INFO  [main:ManagedUtil@46] - Log4j found with jmx enabled.
2022-04-15 16:14:35,988 [myid:1] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
however i don’t see any more cleanup logs after that, is there a reason for this? also, i can see that the space is being chugged by the logs. does anyone know why things may not be getting cleaned up? thank you, would appreciate your help.
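A minimal sketch of how to check what is actually consuming the disk inside a ZooKeeper pod (pod names follow the pinot-zookeeper-N pattern used later in this thread and may differ in your deployment):
```
# overall usage of the mounted data volume
kubectl exec pinot-zookeeper-0 -- df -h /data

# usage split between snapshots and transaction logs
kubectl exec pinot-zookeeper-0 -- du -sh /data/version-2 /data/log/version-2
```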
m
@Xiaoman Dong ^^
l
i did an ls -lh
/data/log/version-2
and things are getting filled up there
Copy code
$ ls -lh
total 3.2G
-rw-rw-r-- 1 zookeeper zookeeper 768M Apr  5 00:45 log.700000001
-rw-rw-r-- 1 zookeeper zookeeper 1.0G Apr 11 17:51 log.700014b0e
-rw-rw-r-- 1 zookeeper zookeeper 1.1G Apr 15 16:01 log.8000001e6
-rw-r--r-- 1 zookeeper zookeeper 448M Apr 18 20:11 log.80000ea05
x
A bit off topic, but I don’t think this is related to the zookeeper node size too big issue; this looks very zookeeper specific. cc @Mayank. I will try to have a quick look at zookeeper, but we may need some zookeeper expertise here
m
What’s filling the logs @Luis Fernandez? Also, for prod ZK, you probably want to have a persistent store for snapshots.
l
whatever gets stored in this path
Copy code
/data/log/version-2
they do not look like logs tho lol they look like our configs
m
Transaction logs?
l
example
Copy code
`ZKLG
?5
  "id" : "WorkflowContext",
  "simpleFields" : {
    "LAST_PURGE_TIME" : "1646699352774",
    "NAME" : "TaskQueue_RealtimeToOfflineSegmentsTask",
    "START_TIME" : "1646234218213",
    "STATE" : "IN_PROGRESS"
  },
  "mapFields" : {
    "JOB_STATES" : {
      "TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1646698682995" : "COMPLETED"
    },
    "StartTime" : {
      "TaskQueue_RealtimeToOfflineSegmentsTask_Task_RealtimeToOfflineSegmentsTask_1646698682995" : "1646698697926"
    }
  },
  "listFields" : { }
}
  "id" : "etsyads_metrics_dev__0__5__20220307T2037Z",
  "simpleFields" : {
m
Yeah, I am saying it is the ZK transaction logs
l
yes so this is filling up
any way to purge them?
m
You should mount snapshots on EVS, otherwise you are running the risk of data loss
x
Not sure if it is helpful but I searched and found this: https://github.com/31z4/zookeeper-docker/issues/30
m
cc: @Daniel Lavoie
l
aren’t snapshots here?
Copy code
/data/version-2
content:
Copy code
total 3.5M
-rw-r--r-- 1 zookeeper zookeeper    1 Apr 15 19:06 acceptedEpoch
-rw-r--r-- 1 zookeeper zookeeper    1 Apr 15 19:06 currentEpoch
-rw-rw-r-- 1 zookeeper zookeeper 481K Mar  8 19:58 snapshot.500a20f00
-rw-rw-r-- 1 zookeeper zookeeper 1.3M Apr  5 00:45 snapshot.700014b0d
-rw-rw-r-- 1 zookeeper zookeeper 1.8M Apr 11 17:53 snapshot.8000001e5
also noob question what’s EVS?
@Xiaoman Dong we are using
apache-zookeeper-3.5.5
m
Typo EBS - persistent storage
l
we are using whatever is there in the helm image
d
I think the ZK env variables changed between 3.5 and 3.7 for the snapshot configuration.
can you share the content of
/conf/zoo.cfg
from within the ZK container?
l
yes
Copy code
clientPort=2181
dataDir=/data
dataLogDir=/data/log
tickTime=2000
initLimit=10
syncLimit=10
maxClientCnxns=60
minSessionTimeout= 4000
maxSessionTimeout= 40000
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
4lw.commands.whitelist=*
server.1=pinot-zookeeper-0.pinot-zookeeper-headless.pinot-dev.svc.cluster.local:2888:3888
server.2=pinot-zookeeper-1.pinot-zookeeper-headless.pinot-dev.svc.cluster.local:2888:3888
server.3=pinot-zookeeper-2.pinot-zookeeper-headless.pinot-dev.svc.cluster.local:2888:3888
d
Auto purge seems to be configured as expected.
l
is it that it still hasn’t hit those parameters whatever they are?
d
I see that both dataDir and dataLogDir are within the same directory. I’ve personally encountered issues when these two are nested.
Is that a default config that got you that directory configuration?
l
yes that was default
from the helm chart
Copy code
ZK_DATA_DIR=${ZK_DATA_DIR:-"/data"}
      ZK_DATA_LOG_DIR=${ZK_DATA_LOG_DIR:-"/data/log"}
i’m not super knowledgeable about zookeeper but how are these 2 directories related?
d
the log dir stores the snapshots of your ZK state.
sorry
Data Dir stores the snapshots of your ZK state. That’s a complete copy of the state.
The Data Log Dir holds the transaction logs. That’s the critical data.
When a snapshot is made, ZK can purge the transaction logs to reduce their size.
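For reference, a sketch of what a zoo.cfg with the two directories separated might look like; the paths are illustrative and the real values come from the helm chart:
```
# snapshots (complete copies of the ZK state)
dataDir=/data/snapshots
# transaction logs, in a sibling rather than a nested directory
dataLogDir=/data/log
# keep 3 snapshots and run the purge task every hour
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
```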
l
probably don't wanna lose that transaction log yes?
d
You can lose the transactions if you have a fresh snapshot.
But you would want to let ZK do its thing.
I have reason to believe you could be impacted by a bug in the helm chart configuration. Can you manually edit the PVC to increase its size? If you are on AWS it can be done without a restart.
l
i will have to ask my team but i think we can increase it yes
/data/log/version-2
Copy code
total 3.2G
-rw-rw-r-- 1 zookeeper zookeeper 768M Apr  5 00:45 log.700000001
-rw-rw-r-- 1 zookeeper zookeeper 1.0G Apr 11 17:51 log.700014b0e
-rw-rw-r-- 1 zookeeper zookeeper 1.1G Apr 15 16:01 log.8000001e6
-rw-r--r-- 1 zookeeper zookeeper 448M Apr 18 20:37 log.80000ea05
/data/version-2
Copy code
total 3.5M
-rw-r--r-- 1 zookeeper zookeeper    1 Apr 15 19:06 acceptedEpoch
-rw-r--r-- 1 zookeeper zookeeper    1 Apr 15 19:06 currentEpoch
-rw-rw-r-- 1 zookeeper zookeeper 481K Mar  8 19:58 snapshot.500a20f00
-rw-rw-r-- 1 zookeeper zookeeper 1.3M Apr  5 00:45 snapshot.700014b0d
-rw-rw-r-- 1 zookeeper zookeeper 1.8M Apr 11 17:53 snapshot.8000001e5
d
Snapshots are happening but the log isn’t cleaned.
I observed that behaviour when dataLogDir and dataDir are nested in the same folder.
I’ll do some testing tomorrow and see if we can get a recipe to fix this.
l
oh shoot D-:
m
Thanks @Daniel Lavoie for jumping on this.
@Harish Bohara is also running into a ZK disk space issue
h
I am using EBS and one of the ZK 20GB disks was 100% consumed. This happened in a cluster which had been running for only 2 days
l
can you also share your zk configs?
d
Can you also both share the helm values of your deployment?
By removing sensitive stuff of course.
l
i have the yaml file
h
Using configs from the helm chart which is on the Pinot site. The only change I made: I increased the disk + CPU and memory for the zk nodes
l
same here and added autopurge
Copy code
env:
            - name: ZK_REPLICAS
              value: "3"
            - name: JMXAUTH
              value: "false"
            - name: JMXDISABLE
              value: "false"
            - name: JMXPORT
              value: "1099"
            - name: JMXSSL
              value: "false"
            - name: ZK_HEAP_SIZE
              value: "256M"
            - name: ZK_SYNC_LIMIT
              value: "10"
            - name: ZK_TICK_TIME
              value: "2000"
            - name: ZK_PURGE_INTERVAL
              value: "1"
            - name: ZK_SNAP_RETAIN_COUNT
              value: "3"
            - name: ZOO_INIT_LIMIT
              value: "5"
            - name: ZOO_MAX_CLIENT_CNXNS
              value: "60"
            - name: ZOO_PORT
              value: "2181"
            - name: ZOO_STANDALONE_ENABLED
              value: "false"
            - name: ZOO_TICK_TIME
              value: "2000"
Copy code
resources:
            limits: 
              cpu: 500m
              memory: 1Gi
            requests:
              cpu: 500m
              memory: 1Gi
          volumeMounts:
            - name: data
              mountPath: /data
            - name: config
              mountPath: /config-scripts
volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "5Gi"
d
Setting ZK_DATA_DIR to /data/snapshots should do the trick. Big big spoiler: do not attempt this on an environment where you cannot afford to lose data. I have reason to believe that this could cause problems if all pods are updated at once. I would recommend updating your statefulset update policy to OnDelete and setting the env variable. Delete the pods one by one, checking the quorum status. Once the cluster rollout is done, you can apply the changes with helm to ensure they are permanent.
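A minimal sketch of that rollout, assuming the statefulset and pods are named pinot-zookeeper as elsewhere in this thread; adapt the names, and how you set the env variable, to your own deployment:
```
# 1. switch the statefulset to manual rollout so pods only restart when deleted
kubectl patch statefulset pinot-zookeeper \
  -p '{"spec":{"updateStrategy":{"type":"OnDelete"}}}'

# 2. set ZK_DATA_DIR=/data/snapshots in the pod template (helm values / jsonnet),
#    then delete the pods one at a time and wait for each to rejoin the quorum
kubectl delete pod pinot-zookeeper-0
kubectl exec pinot-zookeeper-0 -- zkCli.sh ls /   # should list [pinot, zookeeper] once synced
```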
l
you mean this
Copy code
updateStrategy:
    type: RollingUpdate
instead of that
OnDelete
?
d
Yes.
There is a fundamental problem with Zookeeper in Kubernetes.
ZK needs to be ready before it can start synchronizing with other nodes.
Statefulset rollout only looks at pod readiness.
But when a ZK pod is ready, that doesn’t mean it’s synced.
l
which is this
Copy code
readinessProbe:
            exec:
              command:
                - sh
                - /config-scripts/ready
d
If the k8s health check is adapted to sync readiness, then the pod can never join and sync with the quorum, given that there is no networking available while readiness is not reached.
By switching to
OnDelete
you can control the rollout yourself.
l
so basically i first make the change to
OnDelete
on the updateStrategy, is that a no op change?
d
That will trigger no change
How many ZK nodes do you have?
l
right, then even when i change the folder and i merge that change that also won’t do anything yes?
I have 3
it will only take effect when i manually delete the pod and the pod comes back
so i can do it one by one
d
1. Set OnDelete
2. Set the env var
3. Delete the first pod and wait for it to be synced
4. Delete the second pod and wait
5. Delete the third pod and wait
I would also assert between 3 and 4 that the issue is fixed, and that you don’t see the transaction log growing to crazy sizes again.
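One way to check that a restarted node has rejoined and synced, sketched with the four-letter-word commands (the zoo.cfg above whitelists them with 4lw.commands.whitelist=*; this assumes nc is available in the image):
```
# role (leader/follower) and last zxid of this node
kubectl exec pinot-zookeeper-0 -- sh -c 'echo srvr | nc localhost 2181'

# against the node that reports Mode: leader, mntr shows zk_synced_followers
# (should be 2 in a healthy 3-node ensemble)
kubectl exec pinot-zookeeper-1 -- sh -c 'echo mntr | nc localhost 2181'
```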
l
cause the space usage should be better yes?
d
Yes.
Oh.
Just a minute
You will need to delete
/data/version-2
manually.
l
in each of the pods as they are coming back yes?
d
Once you’ve confirmed that the new directory is stable and not growing out of control
You can wait until after the full rollout
h
Copy code
limits:
              memory: 7G
            requests:
              cpu: 1000m
              memory: 6G
          env:
            - name: ZOO_DATA_LOG_DIR
              value: ""
            - name: ZOO_PORT_NUMBER
              value: "2181"
            - name: ZOO_TICK_TIME
              value: "2000"
            - name: ZOO_INIT_LIMIT
              value: "10"
            - name: ZOO_SYNC_LIMIT
              value: "5"
            - name: ZOO_MAX_CLIENT_CNXNS
              value: "60"
            - name: ZOO_4LW_COMMANDS_WHITELIST
              value: "srvr, mntr, ruok"
            - name: ZOO_LISTEN_ALLIPS_ENABLED
              value: "no"
            - name: ZOO_AUTOPURGE_INTERVAL
              value: "0"
            - name: ZOO_AUTOPURGE_RETAIN_COUNT
              value: "3"
            - name: ZOO_MAX_SESSION_TIMEOUT
              value: "40000"
            - name: ZOO_SERVERS
              value: zookeeper-0.zookeeper-headless.pinot.svc.cluster.local:2888:3888::1 zookeeper-1.zookeeper-headless.pinot.svc.cluster.local:2888:3888::2 zookeeper-2.zookeeper-headless.pinot.svc.cluster.local:2888:3888::3 
            - name: ZOO_ENABLE_AUTH
              value: "no"
            - name: ZOO_HEAP_SIZE
              value: "1024"
            - name: ZOO_LOG_LEVEL
              value: "ERROR"
            - name: ALLOW_ANONYMOUS_LOGIN
              value: "yes"
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: metadata.name
          ports:
            - name: client
              containerPort: 2181
            - name: follower
              containerPort: 2888
            - name: election
              containerPort: 3888
          livenessProbe:
            exec:
              command: ['/bin/bash', '-c', 'echo "ruok" | timeout 2 nc -w 2 localhost 2181 | grep imok']
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 6
          readinessProbe:
            exec:
              command: ['/bin/bash', '-c', 'echo "ruok" | timeout 2 nc -w 2 localhost 2181 | grep imok']
            initialDelaySeconds: 5
            periodSeconds: 10
            timeoutSeconds: 5
            successThreshold: 1
            failureThreshold: 6
          volumeMounts:
            - name: data
              mountPath: /bitnami/zookeeper
      volumes:
  volumeClaimTemplates:
    - metadata:
        name: data
        annotations:
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "100Gi"
l
@Daniel Lavoie when you said that we should be able to afford some data loss, what data would that be?
d
What I said was to test this anywhere you can afford to lose data, before rolling it out in production.
Anything stored in ZK could be corrupted if not done properly
l
oh yea 💯
d
Ensure you reproduce the FS-growth problem too on that non-critical environment.
l
that may be a tricky one, it has only been getting filled up as several days have gone by 😄
(3 weeks)
d
Yeah. Ok, well at least test the rollout recipe so you can confirm you are comfortable with it.
h
These are the steps? 1. Update StatefulSet
Copy code
updateStrategy:
      type: RollingUpdate
To
Copy code
updateStrategy:
      type: OnDelete
Also add
Copy code
env:
  - name: ZK_DATA_DIR
    value: "/bitnami/zookeeper/data/snapshots"
This will not restart the ZK nodes.
2. Delete the first ZK node manually - wait for it to be synced
3. Delete the second ZK node manually - wait for it to be synced
4. Delete the third ZK node manually - wait for it to be synced
5. Roll back the change in the StatefulSet and switch updateStrategy back to RollingUpdate
Copy code
updateStrategy:
      type: RollingUpdate
d
Copy code
env:
  - name: ZK_DATA_DIR
    value: "/data/snapshots"
and once the cluster is stabilized and the FS is not growing anymore, clean
/data/version-2
on all pods
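A hedged sketch of that cleanup, to be run only once the new snapshot directory is confirmed stable; pod names follow the pinot-zookeeper-N pattern from this thread, so double-check the path before running rm -rf:
```
# remove the old, now-unused snapshot directory on each pod, one at a time
for i in 0 1 2; do
  kubectl exec pinot-zookeeper-$i -- rm -rf /data/version-2
done
```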
h
In my setup i have
Copy code
volumeMounts:
   - name: data
     mountPath: /bitnami/zookeeper

I have no name!@zookeeper-0:/$ df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay          50G  3.4G   47G   7% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.8G     0  3.8G   0% /sys/fs/cgroup
/dev/nvme0n1p1   50G  3.4G   47G   7% /etc/hosts
/dev/nvme1n1     99G  441M   98G   1% /bitnami/zookeeper
shm              64M     0   64M   0% /dev/shm
tmpfs           3.8G   12K  3.8G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           3.8G     0  3.8G   0% /proc/acpi
tmpfs           3.8G     0  3.8G   0% /sys/firmware


zookeeper-0:/bitnami/zookeeper/data$ ls -la
total 16
drwxrwsr-x 3 1001 1001 4096 Apr 19 07:58 .
drwxrwsr-x 4 root 1001 4096 Apr 19 07:58 ..
-rw-rw-r-- 1 1001 1001    2 Apr 19 07:58 myid
drwxrwsr-x 2 1001 1001 4096 Apr 19 15:08 version-2
In the setup I have, the data dir is “/bitnami/zookeeper/data”
d
Ok, I see. Indeed your change makes sense
h
However I still don’t understand why this fix should work 🙂.
What will happen after making this change? Will the logs go to the “/data/snapshots” dir??
d
Yes
You have the snapshot and log dir in the same
/data
path.
h
Copy code
I think I should do 2 changes: my configs do now have both ZK_DATA_DIR and ZK_DATA_LOG_DIR.
 
env:
   - name: ZK_DATA_DIR
     value: /bitnami/zookeeper/data/snapshots
   - name: ZK_DATA_LOG_DIR
     value: /bitnami/zookeeper/data/log
d
Yes indeed.
ZK_DATA_LOG_DIR
will default to
/data/log/
@Luis Fernandez I would recommend explicitly setting the value too if it’s not too late. I know that with ZK 3.7, the default value changed in the bitnami docker image.
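A sketch of how both variables could be set explicitly on the statefulset for this chart; the variable names are the ones already used in this thread and the values mirror the recommendation above:
```
env:
  - name: ZK_DATA_DIR
    value: "/data/snapshots"
  - name: ZK_DATA_LOG_DIR
    value: "/data/log"
```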
l
will do so
also another change for the OSS may be to change this to
/data/snapshot
instead of that so that by default future clients don’t run into it
d
Yes but let’s confirm this fixes the issue from your end.
We don’t recommend using Zookeeper in helm for production.
l
what should we use
d
/data/log
l
oh wait sorry, you said you don’t recommend using zookeeper in helm for production, what should it be instead?
d
Sorry 🤣
Zookeeper on baremetal
or VM.
there is a Zookeeper operator in the wild but I have no experience with it
We have written our own Operator for zookeeper.
h
Copy code
$printenv
ZK_DATA_LOG_DIR=/bitnami/zookeeper/data/log
ZK_DATA_DIR=/bitnami/zookeeper/data/snapshots
...

I dont see /bitnami/zookeeper/data/snapshots or  /bitnami/zookeeper/data/logs dir after restart..
d
what is your
/conf/zoo.cfg
config?
h
This is from the pod
Copy code
I have no name!@zookeeper-2:/$ cat /opt/bitnami/zookeeper/bin/../conf/zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial 
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between 
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just 
# example sakes.
dataDir=/bitnami/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
maxClientCnxns=60
#
# Be sure to read the maintenance section of the 
# administrator guide before turning on autopurge.
#
# <http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance>
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=0

## Metrics Providers
#
# <https://prometheus.io> Metrics Exporter
#metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
#metricsProvider.httpPort=7000
#metricsProvider.exportJvmInfo=true
preAllocSize=65536
snapCount=100000
maxCnxns=0
reconfigEnabled=false
quorumListenOnAllIPs=false
4lw.commands.whitelist=srvr, mntr, ruok
maxSessionTimeout=40000
server.1=zookeeper-0.zookeeper-headless.pinot.svc.cluster.local:2888:3888;2181
server.2=zookeeper-1.zookeeper-headless.pinot.svc.cluster.local:2888:3888;2181
server.3=zookeeper-2.zookeeper-headless.pinot.svc.cluster.local:2888:3888;2181
I have no name!@zookeeper-2:/$
l
btw you also need to set purgeInterval=1
d
I don’t see
dataDir
l
and I don’t see the
dataLogDir
d
Harish you could be facing a different issue than Luis.
h
I am using the helm chart from the Pinot site. I don’t think it takes the config file from the volume mount. It must be the default config from the ZK docker image
d
Yeah
h
I think I can change “ZOO_DATA_LOG_DIR=/bitnami/zookeeper/data/log” but “ZK_DATA_DIR” will remain “/bitnami/zookeeper/data”? Will this work out?
d
What we are suspecting is that having both dirs in the same /data dir causes the issue.
Harish, I think you are simply encountering a snapshot misconfiguration issue.
Not the same issue as Luis’s.
h
Ok.. However, I did get the problem where my ZK failed to start due to a “No space left” issue when this disk was 100% full. Is this not due to this config?
l
I think you may have 2 issues, one of them the same one about space getting filled up: looking at your zookeeper config I can tell that you don’t have purging activated, plus the issue that Daniel and I have been trying to solve
m
This thread is now blog worthy 😅
Or at least a recipe. @Mark Needham
d
Zookeeper on kubernetes should be a book by itself 😞
l
no wonder why 😄
there were some issues with our builds, i’m gonna try to roll this out on dev now
how do i know when a particular zookeeper is ready and good to go
d
You can run
zkCli.sh ls
command on it
You should see the pinot state
l
getting this in one of the zookeepers when it’s coming back:
Copy code
2022-04-19 17:47:13,503 [myid:3] - ERROR [main:QuorumPeer@955] - Unable to load database on disk
java.io.IOException: No snapshot found, but there are log entries. Something is broken!
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:211)
	at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
	at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:919)
	at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:905)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:205)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:123)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
2022-04-19 17:47:13,505 [myid:3] - ERROR [main:QuorumPeerMain@101] - Unexpected exception, exiting abnormally
java.lang.RuntimeException: Unable to run quorum server 
	at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:956)
	at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:905)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:205)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:123)
	at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
Caused by: java.io.IOException: No snapshot found, but there are log entries. Something is broken!
	at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:211)
	at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
	at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:919)
d
Yeah, the snapshot dir changed and the logs are still present, so it complains.
When you delete the pod, add the pvc delete in the same command. It will reprovision an empty pvc
l
noob question how do i tell it to remove the pvc as well?
d
kubectl delete pod <pod-name> && kubectl delete pvc <pvc-name>
Make sure you delete the right pvc 😛
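For context, the PVC created from a volumeClaimTemplate named data is typically called data-&lt;pod-name&gt;, so the pair of commands might look like this sketch (verify the PVC name first):
```
# confirm the PVC naming before deleting anything
kubectl get pvc | grep zookeeper

# delete the pod and its claim together so an empty volume is reprovisioned
kubectl delete pod pinot-zookeeper-0 && kubectl delete pvc data-pinot-zookeeper-0
```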
l
i run
zkCli.sh ls
within the box?
it came back now
d
`kubectl exec <pod-name> -- zkCli.sh ls` is your friend
l
Copy code
clientPort=2181
dataDir=/data/snapshot
dataLogDir=/data/log
tickTime=2000
initLimit=10
syncLimit=10
maxClientCnxns=60
minSessionTimeout= 4000
maxSessionTimeout= 40000
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
4lw.commands.whitelist=*
server.1=pinot-zookeeper-0.pinot-zookeeper-headless.pinot-dev.svc.cluster.local:2888:3888
server.2=pinot-zookeeper-1.pinot-zookeeper-headless.pinot-dev.svc.cluster.local:2888:3888
server.3=pinot-zookeeper-2.pinot-zookeeper-headless.pinot-dev.svc.cluster.local:2888:3888
is it okay if i don’t see anything on
data/log
?
d
Does
zkCli.sh ls
show anything?
l
Copy code
Connecting to localhost:2181
2022-04-19 17:59:27,717 [myid:] - INFO  [main:Environment@109] - Client environment:zookeeper.version=3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on 05/03/2019 12:07 GMT
2022-04-19 17:59:27,720 [myid:] - INFO  [main:Environment@109] - Client environment:host.name=pinot-zookeeper-2.pinot-zookeeper-headless.pinot-dev.svc.cluster.local
2022-04-19 17:59:27,720 [myid:] - INFO  [main:Environment@109] - Client environment:java.version=1.8.0_232
2022-04-19 17:59:27,721 [myid:] - INFO  [main:Environment@109] - Client environment:java.vendor=Oracle Corporation
2022-04-19 17:59:27,721 [myid:] - INFO  [main:Environment@109] - Client environment:java.home=/usr/local/openjdk-8
2022-04-19 17:59:27,721 [myid:] - INFO  [main:Environment@109] - Client environment:java.class.path=/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/target/classes:/apache-zookeeper-3.5.5-bin/bin/../build/classes:/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/target/lib/*.jar:/apache-zookeeper-3.5.5-bin/bin/../build/lib/*.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/zookeeper-jute-3.5.5.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/zookeeper-3.5.5.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/slf4j-log4j12-1.7.25.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/slf4j-api-1.7.25.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/netty-all-4.1.29.Final.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/log4j-1.2.17.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/json-simple-1.1.1.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jline-2.11.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-util-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-servlet-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-server-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-security-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-io-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jetty-http-9.4.17.v20190418.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/javax.servlet-api-3.1.0.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-databind-2.9.8.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-core-2.9.8.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/jackson-annotations-2.9.0.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/commons-cli-1.2.jar:/apache-zookeeper-3.5.5-bin/bin/../lib/audience-annotations-0.5.0.jar:/apache-zookeeper-3.5.5-bin/bin/../zookeeper-*.jar:/apache-zookeeper-3.5.5-bin/bin/../zookeeper-server/src/main/resources/lib/*.jar:/conf:
2022-04-19 17:59:27,724 [myid:] - INFO  [main:Environment@109] - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2022-04-19 17:59:27,724 [myid:] - INFO  [main:Environment@109] - Client environment:java.io.tmpdir=/tmp
2022-04-19 17:59:27,724 [myid:] - INFO  [main:Environment@109] - Client environment:java.compiler=<NA>
2022-04-19 17:59:27,724 [myid:] - INFO  [main:Environment@109] - Client environment:os.name=Linux
2022-04-19 17:59:27,724 [myid:] - INFO  [main:Environment@109] - Client environment:os.arch=amd64
2022-04-19 17:59:27,724 [myid:] - INFO  [main:Environment@109] - Client environment:os.version=5.4.144+
2022-04-19 17:59:27,724 [myid:] - INFO  [main:Environment@109] - Client environment:user.name=zookeeper
2022-04-19 17:59:27,725 [myid:] - INFO  [main:Environment@109] - Client environment:user.home=/home/zookeeper
2022-04-19 17:59:27,725 [myid:] - INFO  [main:Environment@109] - Client environment:user.dir=/apache-zookeeper-3.5.5-bin
2022-04-19 17:59:27,725 [myid:] - INFO  [main:Environment@109] - Client environment:os.memory.free=11MB
2022-04-19 17:59:27,726 [myid:] - INFO  [main:Environment@109] - Client environment:os.memory.max=247MB
2022-04-19 17:59:27,726 [myid:] - INFO  [main:Environment@109] - Client environment:os.memory.total=15MB
2022-04-19 17:59:27,728 [myid:] - INFO  [main:ZooKeeper@868] - Initiating client connection, connectString=localhost:2181 sessionTimeout=30000 watcher=org.apache.zookeeper.ZooKeeperMain$MyWatcher@3b95a09c
2022-04-19 17:59:27,731 [myid:] - INFO  [main:X509Util@79] - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
2022-04-19 17:59:27,800 [myid:] - INFO  [main:ClientCnxnSocket@237] - jute.maxbuffer value is 4194304 Bytes
2022-04-19 17:59:27,806 [myid:] - INFO  [main:ClientCnxn@1653] - zookeeper.request.timeout value is 0. feature enabled=
ls [-s] [-w] [-R] path
command terminated with exit code 1
d
zkCli.sh ls /
l
Copy code
[pinot, zookeeper]
that means it’s cool right?
d
ZK seems happy
Can you share the startup logs?
l
Copy code
2022-04-19 17:54:15,446 [myid:3] - WARN  [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Learner@282] - Unexpected exception, tries=2, remaining init limit=17995, connecting to pinot-zookeeper-0.pinot-zookeeper-headless.pinot-dev.svc.cluster.local/10.12.177.117:2888
java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:607)
	at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:233)
	at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:262)
	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:77)
	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1271)
2022-04-19 17:54:16,448 [myid:3] - WARN  [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Learner@282] - Unexpected exception, tries=3, remaining init limit=16993, connecting to pinot-zookeeper-0.pinot-zookeeper-headless.pinot-dev.svc.cluster.local/10.12.177.117:2888
java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:607)
	at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:233)
	at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:262)
	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:77)
	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1271)
2022-04-19 17:54:17,450 [myid:3] - ERROR [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Learner@277] - Unexpected exception, retries exceeded. tries=4, remaining init limit=15991, connecting to pinot-zookeeper-0.pinot-zookeeper-headless.pinot-dev.svc.cluster.local/10.12.177.117:2888
java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:607)
	at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:233)
	at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:262)
	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:77)
	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1271)
2022-04-19 17:54:17,450 [myid:3] - WARN  [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Follower@96] - Exception when following the leader
java.net.ConnectException: Connection refused (Connection refused)
	at java.net.PlainSocketImpl.socketConnect(Native Method)
	at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
	at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
	at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
	at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
	at java.net.Socket.connect(Socket.java:607)
	at org.apache.zookeeper.server.quorum.Learner.sockConnect(Learner.java:233)
	at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:262)
	at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:77)
	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1271)
2022-04-19 17:54:17,450 [myid:3] - INFO  [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):Follower@201] - shutdown called
java.lang.Exception: shutdown Follower
	at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:201)
	at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1275)
2022-04-19 17:54:17,451 [myid:3] - WARN  [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1318] - PeerState set to LOOKING
2022-04-19 17:54:17,451 [myid:3] - INFO  [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1193] - LOOKING
2022-04-19 17:54:17,451 [myid:3] - INFO  [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):FastLeaderElection@885] - New election. My id =  3, proposed zxid=0x0
2022-04-19 17:54:17,452 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 3 (n.leader), 0x0 (n.zxid), 0x2 (n.round), LOOKING (n.state), 3 (n.sid), 0x0 (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-04-19 17:54:17,453 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0xa00000018 (n.zxid), 0x1 (n.round), FOLLOWING (n.state), 1 (n.sid), 0xb (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-04-19 17:54:17,454 [myid:3] - INFO  [WorkerReceiver[myid=3]:FastLeaderElection@679] - Notification: 2 (message format version), 2 (n.leader), 0xa00000018 (n.zxid), 0x1 (n.round), LEADING (n.state), 2 (n.sid), 0xb (n.peerEPoch), LOOKING (my state)0 (n.config version)
2022-04-19 17:54:17,454 [myid:3] - INFO  [QuorumPeer[myid=3](plain=/0.0.0.0:2181)(secure=disabled):QuorumPeer@1269] - FOLLOWING
Copy code
+ /config-scripts/run
+ exec java -cp '/apache-zookeeper-3.5.5-bin/lib/*:/apache-zookeeper-3.5.5-bin/*jar:/conf:' -Xmx256M -Xms256M org.apache.zookeeper.server.quorum.QuorumPeerMain /conf/zoo.cfg
2022-04-19 17:54:12,120 [myid:] - INFO  [main:QuorumPeerConfig@133] - Reading configuration from: /conf/zoo.cfg
2022-04-19 17:54:12,125 [myid:] - INFO  [main:QuorumPeerConfig@385] - clientPortAddress is 0.0.0.0/0.0.0.0:2181
2022-04-19 17:54:12,125 [myid:] - INFO  [main:QuorumPeerConfig@389] - secureClientPort is not set
2022-04-19 17:54:12,203 [myid:3] - INFO  [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
2022-04-19 17:54:12,205 [myid:3] - INFO  [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 1
2022-04-19 17:54:12,205 [myid:3] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@138] - Purge task started.
2022-04-19 17:54:12,206 [myid:3] - INFO  [main:ManagedUtil@46] - Log4j found with jmx enabled.
2022-04-19 17:54:12,213 [myid:3] - INFO  [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed.
2022-04-19 17:54:12,315 [myid:3] - INFO  [main:QuorumPeerMain@141] - Starting quorum peer
2022-04-19 17:54:12,320 [myid:3] - INFO  [main:ServerCnxnFactory@135] - Using org.apache.zookeeper.server.NIOServerCnxnFactory as server connection factory
2022-04-19 17:54:12,321 [myid:3] - INFO  [main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s sessionless connection timeout, 1 selector thread(s), 2 worker threads, and 64 kB direct buffers.
2022-04-19 17:54:12,399 [myid:3] - INFO  [main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:2181
2022-04-19 17:54:12,419 [myid:3] - INFO  [main:Log@193] - Logging initialized @583ms to org.eclipse.jetty.util.log.Slf4jLog
2022-04-19 17:54:12,902 [myid:3] - WARN  [main:ContextHandler@1588] - o.e.j.s.ServletContextHandler@1bce4f0a{/,null,UNAVAILABLE} contextPath ends with /*
2022-04-19 17:54:12,902 [myid:3] - WARN  [main:ContextHandler@1599] - Empty contextPath
2022-04-19 17:54:12,911 [myid:3] - INFO  [main:X509Util@79] - Setting -D jdk.tls.rejectClientInitiatedRenegotiation=true to disable client-initiated TLS renegotiation
2022-04-19 17:54:12,912 [myid:3] - INFO  [main:QuorumPeer@1488] - Local sessions disabled
2022-04-19 17:54:12,912 [myid:3] - INFO  [main:QuorumPeer@1499] - Local session upgrading disabled
2022-04-19 17:54:12,912 [myid:3] - INFO  [main:QuorumPeer@1466] - tickTime set to 2000
2022-04-19 17:54:12,913 [myid:3] - INFO  [main:QuorumPeer@1510] - minSessionTimeout set to 4000
2022-04-19 17:54:12,913 [myid:3] - INFO  [main:QuorumPeer@1521] - maxSessionTimeout set to 40000
2022-04-19 17:54:12,913 [myid:3] - INFO  [main:QuorumPeer@1536] - initLimit set to 10
2022-04-19 17:54:12,921 [myid:3] - INFO  [main:ZKDatabase@117] - zookeeper.snapshotSizeFactor = 0.33
2022-04-19 17:54:12,922 [myid:3] - INFO  [main:QuorumPeer@1781] - Using insecure (non-TLS) quorum communication
2022-04-19 17:54:12,997 [myid:3] - INFO  [main:QuorumPeer@1787] - Port unification disabled
2022-04-19 17:54:12,997 [myid:3] - INFO  [main:QuorumPeer@2154] - QuorumPeer communication is not secured! (SASL auth disabled)
2022-04-19 17:54:12,997 [myid:3] - INFO  [main:QuorumPeer@2183] - quorum.cnxn.threads.size set to 20
2022-04-19 17:54:12,998 [myid:3] - INFO  [main:FileTxnSnapLog@372] - Snapshotting: 0x0 to /data/snapshot/version-2/snapshot.0
2022-04-19 17:54:13,001 [myid:3] - INFO  [main:QuorumPeer@931] - currentEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2022-04-19 17:54:13,006 [myid:3] - INFO  [main:QuorumPeer@946] - acceptedEpoch not found! Creating with a reasonable default of 0. This should only happen when you are upgrading your installation
2022-04-19 17:54:13,012 [myid:3] - INFO  [main:Server@370] - jetty-9.4.17.v20190418; built: 2019-04-18T19:45:35.259Z; git: aa1c656c315c011c01e7b21aabb04066635b9f67; jvm 1.8.0_232-b09
d
Can you get it from the beginning? I’m interested in the bit that shows the startup config.
thank you
Try moving on to the next server now
l
moving to next one
Copy code
[pinot, zookeeper]
d
can you run
ls /data/log/
on the first and second zk pods?
l
Copy code
$ kubectl exec pinot-zookeeper-1 -- ls /data/log/
version-2
Copy code
$ kubectl exec pinot-zookeeper-2 -- ls /data/log/
version-2
d
then check within
/data/log/version-2
l
Copy code
$ kubectl exec pinot-zookeeper-2 -- ls /data/log/version-2
log.b000011cd
Copy code
$ kubectl exec pinot-zookeeper-1 -- ls /data/log/version-2
log.c00000001
d
alright. Now for the last one
l
here we go
Copy code
[pinot, zookeeper]
Copy code
$ kubectl exec pinot-zookeeper-0 -- ls /data/log/version-2
log.c000002a3
d
So you had the issue on dev?
l
prod is gonna get to the same level 😄
but for some reason it hasn’t gotten to the level dev was at
d
Sounds like you are ready for surgery
l
my hands are sweating 😄 haha
d
Oh before you proceed
I would add the helm values and do a helm upgrade to ensure things are not blowing up for your next rollout
l
i’m not sure i follow 😄
d
So you manually added an env variable outside of helm.
Next time you run helm upgrade, that variable will be missing.
l
ohh you mean that when there’s a helm upgrade that i may need to apply, then it would be gone from here?
Copy code
env:
  - name: ZK_REPLICAS
    value: "3"
  - name: JMXAUTH
    value: "false"
  - name: JMXDISABLE
    value: "false"
on the statefulset.yaml
d
you can add the env variable as part of a helm value
zookeeper.env.ZK_DATA_LOG_DIR=/data/log
zookeeper.env.ZK_DATA_DIR=/data/snapshots
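A hedged sketch of what that could look like in a values file, assuming the chart exposes a zookeeper.env map as the paths above suggest (key names depend on the chart version):
```
zookeeper:
  env:
    ZK_DATA_LOG_DIR: /data/log
    ZK_DATA_DIR: /data/snapshots
```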
l
i think we have sort of a different setup, we use jsonnet and that’s what i was editing; we don’t regularly use
helm upgrade
d
As long as you don’t lose the env var during your next upgrade 😛
l
i will keep note of that for sure
h
FYI - I had a setup where the ZK logs reached 20GB. I am using “docker.io/bitnami/zookeeper:3.7.0-debian-10-r56” -> setting ZOO_AUTOPURGE_INTERVAL=1 and restarting ZK fixed it
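For the bitnami-based chart shown earlier in this thread, that change is just the autopurge env vars on the ZK statefulset (a sketch; the rest of the env list is the one posted above):
```
env:
  - name: ZOO_AUTOPURGE_INTERVAL    # purge task interval in hours; 0 disables it
    value: "1"
  - name: ZOO_AUTOPURGE_RETAIN_COUNT
    value: "3"
```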
l
this has been deployed to prod flawlessly, no issues at all
@Daniel Lavoie thank you so much for your support and help on how to best deploy this
i’m gonna keep a watch on the logs and see if things get filled up again
d
Sweet, happy this worked out all the way up to production
🍷