# general
a
Currently, when I am uploading a lot of segments into Pinot, the table status moves into a BAD state for a long period of time. Is this expected, or have I misconfigured the system?
m
What do you see in the IdealState and ExternalView in ZK?
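(For reference, a minimal sketch of pulling the two from the controller REST API and diffing them, instead of browsing ZK by hand. The controller URL and table name below are placeholders, and the exact response shape is an assumption; check your controller's Swagger UI.)

```python
# Minimal sketch: compare IdealState vs ExternalView via the Pinot controller.
# CONTROLLER and TABLE are placeholders; the response layout (a segment ->
# {instance: state} map under "OFFLINE"/"REALTIME") is an assumption.
import requests

CONTROLLER = "http://localhost:9000"   # assumption: controller host/port
TABLE = "myTable_OFFLINE"              # assumption: table name with type suffix

ideal = requests.get(f"{CONTROLLER}/tables/{TABLE}/idealstate").json()
external = requests.get(f"{CONTROLLER}/tables/{TABLE}/externalview").json()

is_map = ideal.get("OFFLINE") or {}
ev_map = external.get("OFFLINE") or {}

# Segments whose actual (external view) state differs from the target (ideal state)
for segment, target in sorted(is_map.items()):
    actual = ev_map.get(segment, {})
    if actual != target:
        print(f"{segment}: ideal={target} external={actual}")
```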
a
Screenshot 2021-05-07 at 23.29.16.png
Seems like ZK is not able to sync well.
m
Hmm, seems like it's gzipped.
IIRC there was a way to tell ZooInspector to unzip; I will need to recollect the details.
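(If ZooInspector won't decompress it, a minimal sketch of reading the znode directly and gunzipping it, assuming the kazoo client library; the ZK address and znode path are placeholders.)

```python
# Minimal sketch: read a possibly gzip-compressed znode directly, as an
# alternative to ZooInspector. ZK address and znode path are placeholders.
import gzip
from kazoo.client import KazooClient

zk = KazooClient(hosts="localhost:2181")              # assumption: ZK address
zk.start()

path = "/PinotCluster/IDEALSTATES/myTable_OFFLINE"    # assumption: cluster/table names
data, stat = zk.get(path)

if data[:2] == b"\x1f\x8b":                           # gzip magic number
    data = gzip.decompress(data)

print(data.decode("utf-8"))
zk.stop()
```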
k
this means there are too many segments
which is fine but I am guessing you have too many small segments
m
correct
k
Pinot UI will be able to decompress this
a
Current segment size is around 80 MB. What is the sweet spot for Pinot?
k
100 to 500 MB
how many segments do you have, 10k+ ?
j
2544 based on the screenshot, so not too many
m
I think the original problem was BAD state
a
For a day of data, I currently have 2000 partitions of ~80 MB each.
m
Which may mean the server is running out of memory?
How many servers do you have and what's the replication?
a
Pinot Server or ZK?
3 servers, 1 replication.
m
Pinot server (if there's too much metadata in memory). Or the servers don't have enough local storage.
What's the VM config?
Are you running latest master or an official release?
a
23 cores, 128 GB memory, 1 TB disk.
I will add more memory to Pinot Server.
m
latest master or official release?
a
0.7.1
m
oh ok
in latest release there's a debug endpoint to get some info (so that we don't have to ask a lot of questions)
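(A rough sketch of what calling that debug endpoint could look like in a newer release; the /debug/tables/{table} path and the verbosity parameter are assumptions, so check the controller's Swagger UI for the exact route.)

```python
# Rough sketch: fetch table debug info from a newer controller. Path and
# parameter names are assumptions; verify against the controller's Swagger UI.
import requests

CONTROLLER = "http://localhost:9000"   # assumption
TABLE = "myTable"                      # assumption

resp = requests.get(f"{CONTROLLER}/debug/tables/{TABLE}", params={"verbosity": 1})
print(resp.json())
```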
a
Considering 500 MB partitions, 90 days of data will have roughly 28k segments.
This will be interesting with ZK.
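(A quick back-of-the-envelope check of that 28k figure, assuming the volume stays at roughly 2000 segments of ~80 MB per day.)

```python
# Back-of-the-envelope check of the ~28k segment projection above, assuming
# a steady ~2000 segments of ~80 MB per day.
segments_per_day = 2000
current_segment_mb = 80
target_segment_mb = 500
days = 90

daily_gb = segments_per_day * current_segment_mb / 1024                      # ~156 GB/day
resized_per_day = segments_per_day * current_segment_mb / target_segment_mb  # 320 segments/day
total_segments = resized_per_day * days                                      # 28,800 ~= 28k

print(f"{daily_gb:.0f} GB/day -> {resized_per_day:.0f} segments/day -> {total_segments:.0f} segments in {days} days")
```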
m
Xms/Xmx on server?
a
-Xms12G -Xmx28G
This is the server config.
m
Yeah, check server logs for errors?
Servers store segment metadata on heap; with a large number of segments, maybe the heap size is not sufficient (just guessing).
a
I will take a look at the gc.log
and report back.
m
Look at the server log first.
I have seen use cases go up to 1-1.5 GB per segment without issues, just FYI. There are other factors to consider though before increasing the segment size to > 1 GB.
a
No errors in server.log.
Interestingly, the error was only there when the metadata push was happening.
Once it completed, the table was back in BAD state.
m
I am confused, so is the table in BAD state after everything is done, or only during push?
a
Only when the SparkSegmentMetadataPushJobRunner.java task was running.
m
Ok, so now table is GOOD?
a
Once it's completed, the table is now in GOOD state,
yes.
ZK is a vanilla installation also: zero config, running on the same machine.
m
Ok, I need to check how UI reports BAD state. I usually go by ZK
a
I will also configure this.
m
My guess is that it takes a while for the external view to update, since the server has to download segments from deep store, and that's when it is reported BAD by the UI. If so, that should be fixed.
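(A minimal sketch of waiting for the external view to converge to the ideal state after a push, so the table isn't judged BAD while servers are still downloading from deep store. Same placeholder controller/table and response-shape assumptions as above.)

```python
# Minimal sketch: poll until the external view matches the ideal state after a
# push. CONTROLLER/TABLE and the response shape are assumptions, as above.
import time
import requests

CONTROLLER = "http://localhost:9000"   # assumption
TABLE = "myTable_OFFLINE"              # assumption

def segment_states(view):
    resp = requests.get(f"{CONTROLLER}/tables/{TABLE}/{view}").json()
    return resp.get("OFFLINE") or {}

while True:
    ideal = segment_states("idealstate")
    external = segment_states("externalview")
    pending = [s for s, target in ideal.items() if external.get(s) != target]
    if not pending:
        print("External view has converged to the ideal state")
        break
    print(f"{len(pending)} segments still converging...")
    time.sleep(10)
```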
a
I have many new batches to load. I will check and let you know the details.
m
Ok