Diogo Baeder
11/17/2021, 11:53 AM
'realtime.segment.flush.threshold.rows': '10000',
'realtime.segment.flush.threshold.time': '24h',
'realtime.segment.flush.desired.size': '100M',
Does this mean that whichever of the values above is reached first determines when the segment gets flushed? Or is it the last value reached that determines that? For example, if a segment has been filling for 24h already, but has only 200 rows and 10M in size, does it get flushed because it reached the 24h mark?
Mark Needham
• realtime.segment.flush.threshold.rows is checked on every record that gets processed. If that value is exceeded, the segment is flushed.
• If it hasn't been exceeded, there's a task that checks once a minute whether realtime.segment.flush.threshold.time has been exceeded, and flushes if it has. That task also checks the rows threshold, in case it was missed by the first check, I guess.
If the time threshold has been exceeded and no new documents have been indexed, it won't flush the segment. But if more than 0 documents have been indexed, it will flush.
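To make the two checks described here concrete, here is a minimal Python sketch of the decision logic. It is not Pinot's code: the FlushThresholds class and both function names are hypothetical, and treating "exceeded" as >= is an assumption about the boundary case.

```python
from dataclasses import dataclass


@dataclass
class FlushThresholds:
    """Hypothetical container for the two thresholds discussed in this thread."""
    max_rows: int           # realtime.segment.flush.threshold.rows
    max_age_seconds: float  # realtime.segment.flush.threshold.time, in seconds


def should_flush_on_record(rows_indexed: int, t: FlushThresholds) -> bool:
    """Per-record check: only the row threshold is considered."""
    # Assumption: reaching the threshold counts as exceeding it.
    return rows_indexed >= t.max_rows


def should_flush_on_timer(rows_indexed: int, age_seconds: float,
                          t: FlushThresholds) -> bool:
    """Once-a-minute check: rows again, then time (only if something was indexed)."""
    if rows_indexed >= t.max_rows:
        return True
    return age_seconds >= t.max_age_seconds and rows_indexed > 0


thresholds = FlushThresholds(max_rows=10_000, max_age_seconds=24 * 3600)

# The segment from the question: 24h old, 200 rows.
print(should_flush_on_record(200, thresholds))            # row check alone
print(should_flush_on_timer(200, 24 * 3600, thresholds))  # periodic check
# An empty segment past the time threshold is not flushed.
print(should_flush_on_timer(0, 48 * 3600, thresholds))
```

Under these assumptions, the 200-row segment is not flushed by the per-record check but is flushed by the periodic check once it reaches 24h, while an empty segment is never flushed on time alone.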
And then the Low Level Realtime Segment Data Manager checks:
• realtime.segment.flush.desired.size, which gets translated into a number-of-rows threshold based on the size of each row in the previous segment.
* The formula used to compute new number of rows is:
* targetNumRows = ideal_segment_size * (a * current_rows_to_size_ratio + b * previous_rows_to_size_ratio)
* where a = 0.25, b = 0.75, prev ratio= ratio collected over all previous segment completions
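As a rough illustration of that formula, the Python sketch below plugs in numbers. The function name and parameters are hypothetical, the previous ratio of 4e-5 rows per byte is made up for the example, and reading '100M' as 100,000,000 bytes is an assumption chosen to keep the arithmetic simple.

```python
def target_num_rows(ideal_segment_size_bytes: float,
                    current_rows: int,
                    current_size_bytes: float,
                    previous_rows_to_size_ratio: float,
                    a: float = 0.25,
                    b: float = 0.75) -> int:
    """Weighted blend of the current and previous rows-per-byte ratios."""
    current_ratio = current_rows / current_size_bytes
    blended_ratio = a * current_ratio + b * previous_rows_to_size_ratio
    # Rounded to a whole number of rows for readability (rounding is an assumption).
    return round(ideal_segment_size_bytes * blended_ratio)


# Hypothetical inputs: the segment that just completed had 200 rows in 10M,
# and earlier segments averaged 4e-5 rows per byte.
rows = target_num_rows(
    ideal_segment_size_bytes=100_000_000,
    current_rows=200,
    current_size_bytes=10_000_000,
    previous_rows_to_size_ratio=4e-5,
)
print(rows)
```

With these made-up inputs the blended ratio is 0.25 * 2e-5 + 0.75 * 4e-5 = 3.5e-5 rows per byte, so the next segment's row threshold comes out at about 3,500 rows. Because b is larger than a, the history of previous segments dominates, and a single unusual segment only moves the threshold a little.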
I'm not entirely sure how it switches between those two data managers, but I expect @User or @User will know.
But assuming that it's using the high level one, for your example:
For example, if a segment has been filling for 24h already, but has only 200 rows and 10M in size, does it get flushed because it reached the 24h mark?
It would flush at the 24h mark, from my understanding.