# general
r
Hi all! For anyone unfamiliar with ZooKeeper operations (which we were), you will see that ZooKeeper keeps increasing its disk usage. We found this odd since it only allows 1MB of data per znode. Looking at our data directory, we saw that the disk usage was mainly in the transaction log folder and the snapshots. After searching a bit, we found two configuration options that will automatically purge these files: `autopurge.snapRetainCount` and `autopurge.purgeInterval`. The logs are transaction logs, not application-level logs, and they relate to the snapshots such that ZooKeeper can recover from a failure by loading the latest snapshot and replaying the transaction logs. `purgeInterval` is 0 by default, so nothing is purged; `snapRetainCount` defaults to 3, but with purging disabled it has no effect. Depending on the docker image and helm chart you are using, you may already have an env var to change the `purgeInterval`:
• `ZOO_AUTOPURGE_PURGEINTERVAL` for the official docker image
• `ZK_PURGE_INTERVAL` for the zookeeper incubator helm chart
There is a rough zoo.cfg sketch below. Hope this helps!
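To make that concrete, a minimal zoo.cfg sketch (the values here are just illustrative, tune them for your own write volume):
```
# zoo.cfg -- enable automatic cleanup of old snapshots + transaction logs
# keep the 3 most recent snapshots (and their matching transaction logs)
autopurge.snapRetainCount=3
# run the purge task every 1 hour (0 = disabled, which is the default)
autopurge.purgeInterval=1
```
With the official docker image you can get the same effect by setting `-e ZOO_AUTOPURGE_PURGEINTERVAL=1` instead of editing zoo.cfg.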
👍 3
x
Many thanks for pointing this out! We’ve also seen this issue, and the auto purge does help!
k
can we add this to docs?
l
I added this auto purge feature to zookeeper long ago. I can explain why this feature is disabled by default. More write ops to ZooKeeper result in more snapshots. ZooKeeper snapshots can be backed up and used to restore the ZooKeeper state. Depending on how frequently snapshots are being rolled, users may want to control the number of snapshots to retain. For example, if snapshots are being rolled every hour (due to heavy writes), you may want to retain 24 of them to be able to restore state from up to a day ago.
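For that hourly-snapshot example, the retention settings would look roughly like this (illustrative only, size it to how far back you need to be able to restore):
```
# snapshots roll ~hourly under heavy writes; keep 24 so roughly one day
# of history stays restorable, and run the purge task every hour
autopurge.snapRetainCount=24
autopurge.purgeInterval=1
```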