Slackbot
05/26/2023, 8:49 PM
John Kowtko
05/26/2023, 11:12 PM
maxBytesInMemory and intermediatePersistPeriod control when persists occur.
• Eventually the size or time threshold is reached and a segment is built and published. At that point the persisted files are all combined into one regular-sized segment and sent to deep storage.
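To make the persist-then-publish cycle concrete, here is a toy Python model of it. This is only a sketch under simplifying assumptions and not how Druid is implemented: the `Thresholds` class, the `simulate` function, and the `max_rows_per_segment` field are hypothetical names made up for illustration, and only a size-based publish threshold is modeled (the time-based one is left out).

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    max_rows_in_memory: int     # plays the role of a row-count limit on the buffer
    max_bytes_in_memory: int    # plays the role of maxBytesInMemory
    persist_period_s: float     # plays the role of intermediatePersistPeriod, in seconds
    max_rows_per_segment: int   # hypothetical size threshold for publishing


def simulate(events, t):
    """Replay (timestamp_s, row_bytes) events; return a log of (action, timestamp_s, rows)."""
    log = []
    buf_rows = buf_bytes = 0    # rows and bytes currently in the in-memory row buffer
    persisted_rows = 0          # rows already written to persist files
    last_persist = None
    for ts, row_bytes in events:
        if last_persist is None:
            last_persist = ts   # start the persist timer at the first row
        buf_rows += 1
        buf_bytes += row_bytes
        # Persist when any buffer limit is hit or the persist period has elapsed.
        if (buf_rows >= t.max_rows_in_memory
                or buf_bytes >= t.max_bytes_in_memory
                or ts - last_persist >= t.persist_period_s):
            log.append(("persist", ts, buf_rows))
            persisted_rows += buf_rows
            buf_rows = buf_bytes = 0
            last_persist = ts
            # Publish once the persisted rows reach the segment size threshold.
            if persisted_rows >= t.max_rows_per_segment:
                log.append(("publish", ts, persisted_rows))
                persisted_rows = 0
    return log


events = [(second, 100) for second in range(10)]
limits = Thresholds(max_rows_in_memory=3, max_bytes_in_memory=10**6,
                    persist_period_s=1000, max_rows_per_segment=6)
for entry in simulate(events, limits):
    print(entry)
```

Lowering `max_rows_in_memory` or `persist_period_s` in this model produces more, smaller persists before each publish, which is the knob the parameters above turn.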
I would suggest reading through this section of the ingestion doc for an explanation of the various parameters involved: https://druid.apache.org/docs/latest/development/extensions-core/kafka-supervisor-reference.html#kafkasupervisortuningconfig
Because the persisted files serve queries faster than the row buffer, we generally see people lean towards more persisting and less time in the row buffer, balanced with appropriate timing for building segments.
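As a rough illustration of leaning towards more frequent persists, a `tuningConfig` fragment for a Kafka supervisor spec might look like the sketch below, built as a Python dict and printed as JSON. The field names are the ones listed in the reference linked above as far as I know; the values are made-up placeholders, not recommendations, so check them against the docs for your Druid version.

```python
import json

# Illustrative values only (assumptions, not tuned recommendations).
tuning_config = {
    "type": "kafka",
    "maxRowsInMemory": 100000,            # persist after this many rows in the buffer
    "maxBytesInMemory": 100000000,        # or after roughly this many bytes in the buffer
    "intermediatePersistPeriod": "PT5M",  # or after this much time (ISO 8601 period)
    "maxRowsPerSegment": 5000000,         # size threshold for building a segment
    "intermediateHandoffPeriod": "PT1H",  # time threshold for handing off segments
}

print(json.dumps({"tuningConfig": tuning_config}, indent=2))
```

Smaller values for the first three settings mean data spends less time in the row buffer; the last two control how often segments are built.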
There are a lot of moving parts in this area of the product, so feel free to post more detailed questions.