Hello, Regarding kafka-based streaming ingestion. ...
# general
p
Hello, regarding Kafka-based streaming ingestion: when does Pinot commit offsets to Kafka? Is it after creating a segment? Can Pinot be configured to commit offsets only after a segment has been stored in deep storage, so that no data is lost if segments that exist on the server but not in deep storage are deleted?
k
That’s the default behavior. Offsets are checkpointed only after segments are committed to the deep store.
p
What happens then when deep store is not configured?
k
Note that Pinot does not use Kafka for checkpointing. Pinot has its own checkpointing mechanism.
When deep store is not configured, it will still always upload the segment to the controller, and the controller will store it in its local directory.
p
Does a Pinot segment checkpoint mean the controller storing the segment in its local directory?
If the controller's local directory is corrupted or deleted, is the checkpoint invalid?
I.e., is the controller directory a single point of failure?
k
No
p
Is there a way to tell Pinot to discard/delete consuming segments and re-read events from Kafka from the last committed offset?
For example, in the case where the consuming segment was deleted from local storage but the metadata about the segment still exists.
m
Did you try deleting the segment using the REST API / Swagger?
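For reference, a minimal sketch of what such a delete call might look like from Python. It assumes the controller is reachable at `http://localhost:9000` and exposes `DELETE /segments/{tableName}/{segmentName}`; the table and segment names are hypothetical placeholders, so check the controller's Swagger page for the exact endpoint in your version. Whether the deleted segment is then re-consumed from the last committed offset is not something this thread confirms.

```python
from urllib.parse import quote
from urllib.request import Request


def build_delete_segment_request(controller_url, table_name, segment_name):
    """Build (but do not send) a DELETE request for one segment.

    Assumption: the controller serves DELETE /segments/{tableName}/{segmentName}.
    """
    base = controller_url.rstrip("/")
    # Percent-encode each path component so unusual characters stay inside it.
    table = quote(table_name, safe="")
    segment = quote(segment_name, safe="")
    return Request(f"{base}/segments/{table}/{segment}", method="DELETE")


# Hypothetical controller address, table name and segment name.
req = build_delete_segment_request(
    "http://localhost:9000",
    "myTable_REALTIME",
    "myTable__0__5__20240101T0000Z",
)
print(req.get_method(), req.full_url)

# To actually send it against a running controller:
# from urllib.request import urlopen
# with urlopen(req) as resp:
#     print(resp.status, resp.read().decode())
```

The request is only built and printed here, so nothing is deleted until the `urlopen` lines are uncommented.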
p
Not yet, I will try after configuring the deep store.
m
Ok