Question about segment flush thresholds (Apache Pinot #general)

Diogo Baeder

11/17/2021, 11:53 AM

Hi folks! I have a question about segment thresholds. If I have something like this:

'realtime.segment.flush.threshold.rows': '10000',
'realtime.segment.flush.threshold.time': '24h',
'realtime.segment.flush.desired.size': '100M',

does this mean that the first value that gets reached from the above ones determines that the segment will be flushed? Or is it the last value reached that determines that? For example, if a segment has been filling for 24h already, but has only 200 rows and 10M in size, does it get flushed because it reached the 24h mark?

Mark Needham

11/17/2021, 1:13 PM

soooo.... The High Level Realtime Segment Data Manager:

• Checks realtime.segment.flush.threshold.rows on every record that gets processed. If that value is exceeded, it flushes.
• If it hasn't been exceeded, then there's a task that checks once a minute if realtime.segment.flush.threshold.time has been exceeded, and flushes if it has. That task also checks the rows threshold, in case it was missed by the first check, I guess. If the time threshold has been exceeded and no new documents have been indexed, it won't flush the segment. But if there have been > 0 documents indexed, it will flush.

And then the Low Level Realtime Segment Data Manager checks:

• realtime.segment.flush.desired.size gets translated into a number-of-rows threshold based on the size of each row in the previous segment.

* The formula used to compute new number of rows is:
 * targetNumRows = ideal_segment_size * (a * current_rows_to_size_ratio + b * previous_rows_to_size_ratio)
 * where a = 0.25, b = 0.75, prev ratio= ratio collected over all previous segment completions
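
The formula quoted above can be sketched in Python. This is only an illustration, not Pinot's actual code: the function name `target_num_rows`, its parameter names, and the use of bytes as the size unit are assumptions made for this example. Only the formula itself and the weights a = 0.25, b = 0.75 come from the quoted comment.

```python
def target_num_rows(ideal_segment_size_bytes, current_rows, current_size_bytes,
                    previous_rows_to_size_ratio, a=0.25, b=0.75):
    """Estimate the rows threshold for the next segment (hypothetical helper).

    current_rows / current_size_bytes describe the segment that just completed;
    previous_rows_to_size_ratio is the ratio collected over earlier completions.
    """
    current_rows_to_size_ratio = current_rows / current_size_bytes
    # Weighted blend: 25% from the latest segment, 75% from history.
    blended_ratio = (a * current_rows_to_size_ratio
                     + b * previous_rows_to_size_ratio)
    # Rounding to a whole number of rows is an assumption of this sketch.
    return round(ideal_segment_size_bytes * blended_ratio)


# Example: desired size of 100,000,000 bytes; the last segment held
# 400 rows in 10,000,000 bytes; the historical ratio is 200 rows per 10,000,000 bytes.
next_threshold = target_num_rows(100_000_000, 400, 10_000_000, 200 / 10_000_000)
print(next_threshold)
```

Because the historical ratio carries most of the weight, one unusual segment only moves the next threshold a little.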

I'm not entirely sure how it switches between those two data managers, but I expect @User or @User will know. But assuming that it's using the high level one, for your example:

For example, if a segment has been filling for 24h already, but has only 200 rows and 10M in size, does it get flushed because it reached the 24h mark?

It would flush at the 24h mark from my understanding.

Mayank

11/17/2021, 2:12 PM

@User please check out the stream config section here https://docs.pinot.apache.org/configuration-reference/table#realtime-table-config

Mayank

11/17/2021, 2:13 PM

Please let me know if it is still unclear, will fix the docs

Diogo Baeder

11/17/2021, 2:47 PM

Awesome, that's very helpful, @User! Thanks @User, I've seen the docs but it's not very clear how the flushing happens when multiple flushing rules are configured; Mark's explanation above was what I needed, just to understand how this will happen.

Kamal Chavda

11/17/2021, 2:58 PM

I was wondering about the same thing. So each segment should have a consistent size, correct? I've got this set up, but my segment sizes start off small and then get larger.

Diogo Baeder

11/17/2021, 3:01 PM

Overall, then, my understanding is that it's a "whatever limit gets hit first" case.

Mark Needham

11/17/2021, 3:03 PM

so I'm not an expert on this code, but this is the place where it decides when to flush the segment: https://github.com/apache/pinot/blob/master/pinot-core/src/main/java/org/apache/pi[…]ot/core/data/manager/realtime/HLRealtimeSegmentDataManager.java

Kishore G

11/17/2021, 3:10 PM

Yes, whatever is reached first. But note that under the hood there are only two thresholds:

• Rows
• Time

Size gets converted into rows by looking at previous segments. It takes a few iterations for Pinot to get this right, as it needs to learn the mapping between rows and size.

Diogo Baeder

11/17/2021, 3:17 PM

Got it... thanks @User!

Mark Needham

11/17/2021, 3:18 PM

@User what does it do if you specify both rows and size? Does it ignore the rows threshold in favour of size or something?

Kishore G

11/17/2021, 3:19 PM

Good question.. I don’t know

Kishore G

11/17/2021, 3:20 PM

Typically you use one or the other

Mayank

11/17/2021, 3:59 PM

Typically, whichever threshold is met first is honored. The only exception is desired size, which takes effect only when rows is set to zero (as in the docs).
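
The "whichever threshold is met first" rule, as described in this thread, can be sketched as follows. This is a simplified illustration and not Pinot's implementation: the function `should_flush` and its parameters are hypothetical, `effective_rows_threshold` stands for either the configured rows value or the rows value derived from the desired size, and treating "reached" as `>=` is an assumption.

```python
from datetime import timedelta


def should_flush(rows_indexed, elapsed, effective_rows_threshold, time_threshold):
    """Return True if a consuming segment should be flushed (hypothetical sketch)."""
    # Rows threshold: checked first, as in the per-record check described above.
    if effective_rows_threshold > 0 and rows_indexed >= effective_rows_threshold:
        return True
    # Time threshold: only flush if at least one document has been indexed.
    if elapsed >= time_threshold and rows_indexed > 0:
        return True
    return False


# The case from the original question: 24h elapsed, only 200 rows indexed.
flush_now = should_flush(
    rows_indexed=200,
    elapsed=timedelta(hours=24),
    effective_rows_threshold=10_000,
    time_threshold=timedelta(hours=24),
)
print(flush_now)
```

Under these assumptions the segment in the question is flushed by the time threshold, while an empty segment is left alone no matter how old it is.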

Diogo Baeder

11/17/2021, 4:01 PM

Ah, got it. Thanks man! 🙂

Neha Pawar

11/17/2021, 5:07 PM

1. Rows has to be 0 for size to take effect.
2. No matter what size you specify, it starts off with 100k rows, and then slowly ramps up the rows to get to the desired size.
3. If you specify only rows, and no size, then the rows get divided amongst all consuming partitions on a server (so if you specify rows 10k, and use 1 server and have 3 partitions and 1 replica, each segment will use 3333 as rows threshold).
4. Regardless of whether you go with size or just rows, you can (and always should) set a time threshold, so that it serves as the ultimate safety check.
5. Having said all this, typically you can just go with 0 rows, 24h time, and 200M size.
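
Points 3 and 5 above can be illustrated with a short Python sketch. The helper `rows_per_consuming_partition` is hypothetical, and the use of integer division is an assumption chosen because it reproduces the 3333 figure given above. The config keys are the ones from the original question and may differ between Pinot versions.

```python
def rows_per_consuming_partition(rows_threshold, consuming_partitions_on_server):
    """Split a table-level rows threshold across the partitions a server consumes.

    Hypothetical helper: integer division is an assumption of this sketch.
    """
    return rows_threshold // consuming_partitions_on_server


# Point 3: rows = 10k, 1 server, 3 partitions, 1 replica.
per_segment_rows = rows_per_consuming_partition(10_000, 3)
print(per_segment_rows)

# Point 5: the suggested starting point (0 rows, 24h time, 200M size).
suggested_thresholds = {
    'realtime.segment.flush.threshold.rows': '0',   # 0 lets the size setting take effect
    'realtime.segment.flush.threshold.time': '24h',  # time acts as the safety check
    'realtime.segment.flush.desired.size': '200M',
}
```

With rows set to 0, the size setting drives the rows threshold and the time threshold remains as the fallback.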

Neha Pawar

11/17/2021, 5:08 PM

I think we should add a small section in docs for "setting thresholds"

Diogo Baeder

11/17/2021, 5:21 PM

That's very useful info, thanks a lot @User! And yes, it would be a great addition to the docs, I agree

Mark Needham

11/17/2021, 5:23 PM

@User I can do that. I guess it's more or less summarising this thread

Neha Pawar

11/17/2021, 6:57 PM

thank you Mark! yes, summarizing this thread. some points to note:

• it would also be helpful to include the defaults that kick in if nothing is specified
• there are some property name changes between 0.6 and 0.7
