# general
a
Currently, when I am uploading a lot of segments into Pinot, the table status moves into a BAD state for a long period of time. Is this expected, or have I misconfigured the system?
m
What do you see in the IdealState and ExternalView in ZK?
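(For reference, a minimal sketch of pulling the two from the controller REST API and diffing them, instead of browsing ZK by hand. The controller URL and table name below are placeholders, and the exact response shape is an assumption; check your controller's Swagger UI.)

```python
# Minimal sketch: compare IdealState vs ExternalView via the Pinot controller.
# CONTROLLER and TABLE are placeholders; the response layout (a segment ->
# {instance: state} map under "OFFLINE"/"REALTIME") is an assumption.
import requests

CONTROLLER = "http://localhost:9000"   # assumption: controller host/port
TABLE = "myTable_OFFLINE"              # assumption: table name with type suffix

ideal = requests.get(f"{CONTROLLER}/tables/{TABLE}/idealstate").json()
external = requests.get(f"{CONTROLLER}/tables/{TABLE}/externalview").json()

is_map = ideal.get("OFFLINE") or {}
ev_map = external.get("OFFLINE") or {}

# Segments whose actual (external view) state differs from the target (ideal state)
for segment, target in sorted(is_map.items()):
    actual = ev_map.get(segment, {})
    if actual != target:
        print(f"{segment}: ideal={target} external={actual}")
```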
a
Screenshot 2021-05-07 at 23.29.16.png
Seems like ZK is not able to sync well.
m
Hmm, seems like it's gzipped.
IIRC there was a way to tell ZooInspector to unzip; I will need to recollect the details.
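(If ZooInspector won't decompress it, a minimal sketch of reading the znode directly and gunzipping it, assuming the kazoo client library; the ZK address and znode path are placeholders.)

```python
# Minimal sketch: read a possibly gzip-compressed znode directly, as an
# alternative to ZooInspector. ZK address and znode path are placeholders.
import gzip
from kazoo.client import KazooClient

zk = KazooClient(hosts="localhost:2181")              # assumption: ZK address
zk.start()

path = "/PinotCluster/IDEALSTATES/myTable_OFFLINE"    # assumption: cluster/table names
data, stat = zk.get(path)

if data[:2] == b"\x1f\x8b":                           # gzip magic number
    data = gzip.decompress(data)

print(data.decode("utf-8"))
zk.stop()
```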
k
this means there are too many segments
which is fine but I am guessing you have too many small segments
m
correct
k
Pinot UI will be able to decompress this
a
Current segment size is around 80 MB. What is the sweet spot for Pinot?
k
100 to 500 MB
how many segments do you have, 10k+ ?
j
2544 based on the screenshot, so not too many
m
I think the original problem was BAD state
a
For a day of data, I currently have 2000 partitions of ~80 MB each.
m
Which may mean the server is running out of memory?
How many servers do you have and what's the replication?
a
Pinot Server or ZK?
3 servers, 1 replication.
m
Pinot server (if there's too much metadata in memory). Or the servers don't have enough local storage.
What's the VM config?
Are you running latest master or an official release?
a
23 cores, 128 GB memory, 1 TB disk.
I will add more memory to Pinot Server.
m
latest master or official release?
a
0.7.1
m
oh ok
in latest release there's a debug endpoint to get some info (so that we don't have to ask a lot of questions)
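(A rough sketch of what calling that debug endpoint could look like in a newer release; the /debug/tables/{table} path and the verbosity parameter are assumptions, so check the controller's Swagger UI for the exact route.)

```python
# Rough sketch: fetch table debug info from a newer controller. Path and
# parameter names are assumptions; verify against the controller's Swagger UI.
import requests

CONTROLLER = "http://localhost:9000"   # assumption
TABLE = "myTable"                      # assumption

resp = requests.get(f"{CONTROLLER}/debug/tables/{TABLE}", params={"verbosity": 1})
print(resp.json())
```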
a
Considering 500 MB partitions, 90 days of data will have roughly 28k segments.
This will be interesting with ZK.
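(A quick back-of-the-envelope check of that 28k figure, assuming the volume stays at roughly 2000 segments of ~80 MB per day.)

```python
# Back-of-the-envelope check of the ~28k segment projection above, assuming
# a steady ~2000 segments of ~80 MB per day.
segments_per_day = 2000
current_segment_mb = 80
target_segment_mb = 500
days = 90

daily_gb = segments_per_day * current_segment_mb / 1024                      # ~156 GB/day
resized_per_day = segments_per_day * current_segment_mb / target_segment_mb  # 320 segments/day
total_segments = resized_per_day * days                                      # 28,800 ~= 28k

print(f"{daily_gb:.0f} GB/day -> {resized_per_day:.0f} segments/day -> {total_segments:.0f} segments in {days} days")
```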
m
Xms/Xmx on server?
a
-Xms12G -Xmx28G
This is the server config.
m
Yeah, check server logs for errors?
Servers store segment metadata on heap; with a large number of segments, maybe the heap size is not sufficient (just guessing).
a
I will take a look at the gc.log
and report back.
m
Look at the server log first.
I have seen use cases go up to 1-1.5 GB per segment without issues, just FYI. There are other factors to consider though before increasing the segment size to > 1 GB.
a
No errors in server.log.
Interestingly, the error was only there when the metadata push was happening.
Once it completed, the table was back in BAD state.
m
I am confused, so is the table in BAD state after everything is done, or only during push?
a
Only when the SparkSegmentMetadataPushJobRunner.java task was running.
m
Ok, so now table is GOOD?
a
Once it's completed, the table is now in GOOD state,
yes.
ZK is a vanilla installation also: zero config, running on the same machine.
m
Ok, I need to check how UI reports BAD state. I usually go by ZK
a
I will also configure this.
m
My guess is that it takes a while for the external view to update, since the server has to download segments from deep store, and that's when it is reported BAD by the UI. If so, that should be fixed.
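(A minimal sketch of waiting for the external view to converge to the ideal state after a push, so the table isn't judged BAD while servers are still downloading from deep store. Same placeholder controller/table and response-shape assumptions as above.)

```python
# Minimal sketch: poll until the external view matches the ideal state after a
# push. CONTROLLER/TABLE and the response shape are assumptions, as above.
import time
import requests

CONTROLLER = "http://localhost:9000"   # assumption
TABLE = "myTable_OFFLINE"              # assumption

def segment_states(view):
    resp = requests.get(f"{CONTROLLER}/tables/{TABLE}/{view}").json()
    return resp.get("OFFLINE") or {}

while True:
    ideal = segment_states("idealstate")
    external = segment_states("externalview")
    pending = [s for s, target in ideal.items() if external.get(s) != target]
    if not pending:
        print("External view has converged to the ideal state")
        break
    print(f"{len(pending)} segments still converging...")
    time.sleep(10)
```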
a
I have many new batches to load. I will check and let you know the details.
m
Ok