Any idea why Pinot server JVM heap usage grows as the number of records grows? It does grow slowly, but at this rate it will eventually reach high heap usage. (I have 5-6 inverted and sorted indexes in my table, on very low cardinality columns.)
• using off-heap and MMAP for segments in my setup
• have ~ 100-150 segments
• 500-600M rows, growing by over 150M per day
• 6 server nodes; 8GB is given to the JVM and the rest is available for off-heap
What I expected: heap grows, and as segments go to disk it should come back to the lower bound. I expected this cycle to continue and the lower bound of memory to remain constant. Am I missing some setting?
04/25/2022, 12:55 PM
Which version of Pinot? Also, are you seeing this increase only for the server, or for broker/controller as well? If all, it is perhaps related to Prometheus.
04/25/2022, 12:55 PM
Should I remove Prometheus and try?
04/25/2022, 12:56 PM
First check whether all components show the increase
04/25/2022, 12:59 PM
Controller and broker look OK to me (I see controller and broker come down to roughly their lower bounds after GC, then go up, and then come back to about the same value again).
04/25/2022, 1:01 PM
But you are just ingesting data, so the broker should not have any spikes at all
It should be flat
Maybe try without Prometheus, just to debug
04/25/2022, 1:27 PM
Got the issue. Initially I had 2 tables (both tables have the same data source); this setup was giving all those issues with ZK disconnects, very high GC time, etc.
1. Append table with star-tree index
2. Upsert table with 1-2 bloom filter indexes
I created 2 different clusters:
1. Append table without star-tree 🟢
a. This grows slowly as I mentioned above; however, I ran a full GC once and it came down to 200MB, which looks good
b. This means it does not have a memory leak, and eventually a full GC will happen and free up the memory
2. Upsert table with 1-2 bloom filter indexes 🔴
a. This setup grows very fast, and at ~200M records it reaches 8GB heap size
b. Then GC does not help; even a forced GC does not recover memory
c. Out-of-the-box table configs; need to debug what is wrong here
Will debug the issue in Upsert tables. I will run a separate cluster with Append and Upsert to see how it performs
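To make the full-GC check from 1a/1b repeatable, here is a minimal sketch of the comparison: read used heap, request a full GC, read it again. `HeapAfterGc` is a hypothetical helper, not something that ships with Pinot, and it measures the JVM it runs in. Against a running server the same comparison can be done from outside, for example with `jcmd <pid> GC.run` and a heap metric read before and after.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper (not part of Pinot): compares used heap before and
// after a requested full GC in the current JVM.
final class HeapAfterGc {

    // Bytes of heap currently in use, as reported by the JVM.
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    // Returns {usedBefore, usedAfter}. The GC call is only a request; the JVM
    // may ignore it, for example when started with -XX:+DisableExplicitGC.
    static long[] measure() {
        long before = usedHeapBytes();
        ManagementFactory.getMemoryMXBean().gc();
        long after = usedHeapBytes();
        return new long[] {before, after};
    }

    public static void main(String[] args) {
        // Allocate some short-lived garbage so there is something to collect.
        byte[][] garbage = new byte[64][];
        for (int i = 0; i < garbage.length; i++) {
            garbage[i] = new byte[1 << 20];
        }
        garbage = null;

        long[] result = measure();
        System.out.printf("used before GC: %d MB, after GC: %d MB%n",
                result[0] >> 20, result[1] >> 20);
    }
}
```

If the "after" value keeps returning to about the same floor (as in setup 1), the growth is collectable garbage. If the floor itself keeps rising (as in setup 2), something is holding references on heap.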
04/26/2022, 5:35 PM
For 2, is your PK monotonically increasing? Upsert primary keys are stored on-heap today. Also, have you noticed a need for the bloom filter? It is also stored on heap IIRC; it might be better to just use an inverted index, which is off-heap and will give you good perf.
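A rough back-of-the-envelope on the setup 2 numbers (8GB heap at ~200M records) shows why on-heap keys matter. This is only a sketch: `UpsertHeapEstimate` is a hypothetical helper, and it assumes the whole heap is attributable to per-record state, which overstates the real per-key cost.

```java
// Hypothetical helper (not part of Pinot): simple linear heap arithmetic.
final class UpsertHeapEstimate {

    // Average heap bytes per record, assuming all heap belongs to record state.
    static double bytesPerRecord(long heapBytes, long records) {
        return (double) heapBytes / records;
    }

    // Heap needed for a target record count if growth stays linear.
    static double projectedHeapBytes(long heapBytes, long records, long targetRecords) {
        return bytesPerRecord(heapBytes, records) * targetRecords;
    }

    public static void main(String[] args) {
        long heapBytes = 8L << 30;       // 8GB heap, from the thread
        long records = 200_000_000L;     // ~200M records, from the thread
        long target = 1_000_000_000L;    // assumed target, for illustration only

        System.out.printf("bytes per record: %.1f%n", bytesPerRecord(heapBytes, records));
        System.out.printf("projected heap at %d records: %.1f GB%n",
                target, projectedHeapBytes(heapBytes, records, target) / (1L << 30));
    }
}
```

With these inputs it works out to roughly 43 bytes per record. The point is that the heap floor scales with the number of distinct keys, so no amount of GC brings it back down.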
04/28/2022, 9:06 PM
I thought the bloom filter would allow Pinot to skip a segment file (a few DBs like C* do this). OK, I will remove the bloom filter.
For 2, my PK is a UUID.
Setup (1) has been working fine for 1 week (with 2 tables, each having 1B records). I see the Pinot server memory-map usage increasing to 30+ GB. Is that normal? I did not understand why it should grow; it should be a flat line. Am I missing something?
04/28/2022, 11:55 PM
If data is being ingested, isn't memory-map usage expected to increase with data growth?
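To illustrate the mechanism: memory-mapped file data is accounted outside the Java heap, and the mapped total tracks how many bytes of file are mapped. A minimal sketch using the JVM's standard "mapped" buffer pool follows. `MappedUsage` is a hypothetical helper, and whether Pinot's own segment mappings show up in this particular pool is an assumption I have not verified; the sketch only shows the general JVM behaviour.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper (not part of Pinot): shows mapped-memory accounting.
final class MappedUsage {

    // Bytes reported by the JVM's "mapped" buffer pool, or -1 if absent.
    static long mappedBytes() {
        for (BufferPoolMXBean pool
                : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("mapped".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    // Maps a temp file of the given size and returns {mappedBefore, mappedAfter}.
    static long[] mapTempFile(int sizeBytes) {
        try {
            Path file = Files.createTempFile("segment-sketch", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[sizeBytes]);

            long before = mappedBytes();
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                MappedByteBuffer buffer =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, sizeBytes);
                buffer.get(0); // touch one page so the mapping is actually used
                long after = mappedBytes();
                return new long[] {before, after};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] result = mapTempFile(16 << 20);
        System.out.printf("mapped before: %d bytes, after: %d bytes%n",
                result[0], result[1]);
    }
}
```

Each additional mapped file adds its size to the mapped total, so with continuous ingestion that number grows along with the data on disk while the heap is unaffected. Resident memory is a separate matter, since the OS decides which mapped pages stay in RAM.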