When we insert data into Pinot, how is replication achieved? | Apache Pinot #getting-started


Luis Fernandez

09/07/2021, 4:38 PM

When we insert data into Pinot, how is replication achieved? Is it when a segment is completed that we make this data available to other nodes?

Kulbir Nijjer

09/07/2021, 11:05 PM

Depends on the type of table: realtime vs. offline. For realtime, as many servers (consumers) as the replication factor start consuming data in parallel from the streaming source. Whenever a segment is completed, the controller gets notified and picks one of the replica servers to commit the segment and update the segment store. For offline servers, since the segment has already been generated, replication simply controls which servers from the pool host the offline segment, and that is decided by the controller. More details are here: https://docs.pinot.apache.org/basics/architecture#real-time-data-flow as well as in the offline/batch data flow section.
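To make the offline case concrete, here is a minimal sketch of how a replication factor could map each segment to a set of hosting servers. It assumes a simple round-robin placement, which is not necessarily the strategy the Pinot controller actually uses; `assign_segments`, the server names, and the segment names are all hypothetical.

```python
def assign_segments(segments, servers, replication):
    """Toy placement: give each segment `replication` distinct servers, round-robin.

    Illustration only; this is NOT Pinot's real segment assignment logic.
    """
    if replication > len(servers):
        raise ValueError("replication factor cannot exceed the number of servers")
    assignment = {}
    start = 0
    for segment in segments:
        # Take `replication` consecutive servers, wrapping around the pool.
        hosts = [servers[(start + i) % len(servers)] for i in range(replication)]
        assignment[segment] = hosts
        start = (start + 1) % len(servers)
    return assignment


# Hypothetical server pool and already-generated offline segments.
servers = ["server-1", "server-2", "server-3"]
segments = ["seg_0", "seg_1", "seg_2", "seg_3"]

assignment = assign_segments(segments, servers, replication=2)
for segment, hosts in assignment.items():
    print(segment, "->", hosts)
```

With a replication factor of 2, every segment ends up on two different servers, so losing one server still leaves a copy of each segment available for queries.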

Luis Fernandez

09/08/2021, 1:35 PM

Thank you so much for your answer, Kulbir. That brings me to my next question: when we say online servers vs. offline servers, do we really mean only online tables vs. offline tables?

Kulbir Nijjer

09/08/2021, 2:46 PM

Not sure if I fully understood the question. A server that is hosting offline segments is referred to as an offline server, and similarly one that is processing real-time or streaming input is called a real-time server. The type of segment being served identifies the server type, and a server can host both types of segments. The documentation covers all of this well, so I recommend going through it; the YouTube getting-started video and the other videos on the Pinot channel are also really great resources.
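The point that the server type follows from the segments it hosts can be sketched as below. This is only an illustration: `server_roles` and the `hosted` mapping are hypothetical names, and the `OFFLINE` / `REALTIME` labels are assumed to mirror Pinot's two table types.

```python
def server_roles(hosted_segment_types):
    """Derive each server's role(s) from the types of segments it hosts."""
    roles = {}
    for server, segment_types in hosted_segment_types.items():
        # A server hosting both segment types plays both roles.
        roles[server] = sorted(segment_types)
    return roles


# Hypothetical cluster: which segment types each server currently serves.
hosted = {
    "server-1": {"OFFLINE"},
    "server-2": {"REALTIME"},
    "server-3": {"OFFLINE", "REALTIME"},
}

roles = server_roles(hosted)
for server, server_role in roles.items():
    print(server, "->", server_role)
```

Here `server-3` acts as both an offline and a real-time server, since it serves segments of both types.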
