# general
p
Hi Pinot folks, we observed a peculiar incident today wherein consumption stopped from just 1 partition of a topic (the topic has 96 partitions; the other 95 are working fine). The consuming segment for that partition was moved from CONSUMING to OFFLINE state due to an exception during consumption.
Copy code
2022/04/06 12:06:04.179 ERROR [LLRealtimeSegmentDataManager_span_event_view_1__50__287__20220406T1205Z] [span_event_view_1__50__287__20220406T1205Z] Exception while in work
2022/04/06 12:06:04.365 INFO [FileUploadDownloadClient] [span_event_view_1__50__287__20220406T1205Z] Sending request: http://controller-0.controller-headless.pinot.svc.cluster.local:9000/segmentStoppedConsuming?reason=java.lang.NullPointerException&streamPartitionMsgOffset=1059610656&instance=Server_server-span-event-view-realtime-7.span-event-view-realtime-headless.pinot.svc.cluster.local_8098&offset=-1&name=span_event_view_1__50__287__20220406T1205Z to controller: controller-0.controller-headless.pinot.svc.cluster.local, version: Unknown
2022/04/06 12:06:04.366 INFO [ServerSegmentCompletionProtocolHandler] [span_event_view_1__50__287__20220406T1205Z] Controller response {"isSplitCommitType":false,"streamPartitionMsgOffset":null,"buildTimeSec":-1,"status":"PROCESSED","offset":-1} for http://controller-0.controller-headless.pinot.svc.cluster.local:9000/segmentStoppedConsuming?reason=java.lang.NullPointerException&streamPartitionMsgOffset=1059610656&instance=Server_server-span-event-view-realtime-7.span-event-view-realtime-headless.pinot.svc.cluster.local_8098&offset=-1&name=span_event_view_1__50__287__20220406T1205Z
2022/04/06 12:06:04.366 INFO [LLRealtimeSegmentDataManager_span_event_view_1__50__287__20220406T1205Z] [span_event_view_1__50__287__20220406T1205Z] Got response {"isSplitCommitType":false,"streamPartitionMsgOffset":null,"buildTimeSec":-1,"status":"PROCESSED","offset":-1}
I have attached the server logs when this happened.
This segment is OFFLINE right now. Is there any way to move it back to CONSUMING?
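For anyone who hits something similar: a quick way to spot the stuck partition is to diff the table's ideal state against its external view and look for replicas that are not ONLINE/CONSUMING. Below is a rough, untested sketch (Python); the controller URL and table name are the ones from the logs above, and the response shape of the /idealstate and /externalview endpoints is an assumption, so treat it as a starting point rather than official tooling.
Copy code
# Hedged sketch: flag segment replicas that look stuck (OFFLINE/ERROR/missing).
# CONTROLLER and TABLE are placeholders taken from the logs in this thread.
import json
import urllib.request

CONTROLLER = "http://controller-0.controller-headless.pinot.svc.cluster.local:9000"
TABLE = "span_event_view"

def fetch(path):
    with urllib.request.urlopen(f"{CONTROLLER}{path}") as resp:
        return json.load(resp)

# Both endpoints are assumed to return {"OFFLINE": {...}, "REALTIME": {segment: {instance: state}}}.
ideal = fetch(f"/tables/{TABLE}/idealstate").get("REALTIME") or {}
external = fetch(f"/tables/{TABLE}/externalview").get("REALTIME") or {}

for segment, instance_states in sorted(ideal.items()):
    for instance, wanted in instance_states.items():
        actual = external.get(segment, {}).get(instance, "MISSING")
        if wanted == "OFFLINE" or actual in ("OFFLINE", "ERROR", "MISSING"):
            print(f"{segment} on {instance}: ideal={wanted} actual={actual}")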
r
what version are you using?
p
0.9.1
It created the next segment with the same starting offset as the OFFLINE segment, and then consumption started happening again, as you can see.
n
There's a periodic background task on the controller which fixes this. That's how you got the new segment. It runs every hour
p
Ah. But does it fix OFFLINE segments as well? I thought it fixes only those segments that are in ERROR state.
Copy code
* Validates realtime ideal states and segment metadata, fixing any partitions which have stopped consuming,
* and uploading segments to deep store if segment download url is missing in the metadata.
Okay got it. But not sure why it suddenly errored out during consumption in the first place.
k
@User the PR for updating ideal states during segment commit should take care of this in future, right?
s
@User the offline segments will go away in a few days (default 7d I think). They only exist in idealstate, and do not have any data. They are there for debugging purposes so that you can search logs with that segment name and find out what went wrong. The automatic correction mechanism kicks in and creates new segments from the same offset (as long as that is still available in the stream).
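In case it helps to verify that, here is a hedged sketch for comparing the stream offsets of the OFFLINE segment and its replacement via the controller's segment metadata endpoint. The endpoint path and the offset field names are assumptions based on what our build returns, and the replacement segment name (sequence 288) is hypothetical.
Copy code
# Hedged sketch: print offset-related metadata for the OFFLINE segment and its
# (hypothetical) replacement to confirm consumption restarted from the same offset.
import json
import urllib.request

CONTROLLER = "http://controller-0.controller-headless.pinot.svc.cluster.local:9000"
TABLE = "span_event_view_REALTIME"

def segment_metadata(segment):
    url = f"{CONTROLLER}/segments/{TABLE}/{segment}/metadata"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

for segment in ("span_event_view_1__50__287__20220406T1205Z",    # went OFFLINE
                "span_event_view_1__50__288__20220406T1306Z"):   # hypothetical replacement name
    meta = segment_metadata(segment)
    offsets = {k: v for k, v in meta.items() if "offset" in k.lower()}  # e.g. ...startOffset
    print(segment, offsets)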
n
@User the existing ValidationManager logic is the one that’s expected to fix this (as it did). I don’t think the PR you’re referring to would’ve addressed this (as Prashant’s problem wasn’t about a new partition)
p
@User thanks for the response. In any such future incidents, I guess we can trigger this job manually to speed up the repair process. However, can we do something to know better what exactly went wrong during consumption? It says it's an NPE but doesn't tell exactly where.
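For the record, here is what a manual trigger could look like, assuming the controller build exposes the /periodictask endpoints (worth confirming with /periodictask/names first; their availability on 0.9.1 is an assumption on my part).
Copy code
# Hedged sketch: ask the controller to run the RealtimeSegmentValidationManager
# now instead of waiting for the next scheduled run. Endpoint availability and
# parameter names are assumptions; verify against your controller's Swagger UI.
import urllib.parse
import urllib.request

CONTROLLER = "http://controller-0.controller-headless.pinot.svc.cluster.local:9000"

def run_periodic_task(task_name, table=None):
    params = {"taskname": task_name}
    if table:
        params["tableName"] = table
    url = f"{CONTROLLER}/periodictask/run?{urllib.parse.urlencode(params)}"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode()

print(run_periodic_task("RealtimeSegmentValidationManager", table="span_event_view"))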
s
It should log a stack
Our experience has been that on most occasions the OFFLINE transition is because of a transient failure in the underlying stream. There is logic to retry multiple times, but it is possible that the retries fail as well. In that case, the segment turns to OFFLINE state, and a new one is created automatically at some point in the future to recover. Until that time, data will be stale on the partitions that had consumption issues.
However, there have been extremely rare occasions where the input data is either invalid or unsupported in pinot, causing an exception. In this case, no matter how many times new segments get created, the row in question will disrupt consumption. The only way (at the moment) is to wait until this row is evicted from the stream (due to retention), and pinot will then automatically find a new (more recent) offset to consume data from -- at the cost of data loss, of course. A metric is raised to indicate the data loss.
Either way, an exception stack should be logged on the server that attempted to consume the row. You can retrieve the stack via status APIs as well, I believe (not sure about this). That said, I have seen that if there are too many exceptions, the JVM tends to log only a single line after a few occurrences, so you may need to go back to locate the first exception stack log.
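In case it is useful, a small sketch for pulling the first full stack trace for a segment out of a server log file, since later occurrences may be collapsed to a single line. The log file path and timestamp layout are assumptions about the deployment above.
Copy code
# Hedged sketch: print the first ERROR line for a segment plus the stack-trace
# lines that follow it, stopping at the next regular timestamped log line.
import re
import sys

SEGMENT = "span_event_view_1__50__287__20220406T1205Z"
LOG_FILE = sys.argv[1] if len(sys.argv) > 1 else "pinotServer.log"   # assumed log file name
TIMESTAMPED = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}")    # matches the log format above

capturing = False
captured = []
with open(LOG_FILE, errors="replace") as fh:
    for line in fh:
        if capturing:
            if TIMESTAMPED.match(line):
                break                              # next regular log line => stack trace ended
            captured.append(line.rstrip())
        elif "ERROR" in line and SEGMENT in line:
            capturing = True                       # first ERROR for this segment; stack follows
            captured.append(line.rstrip())

print("\n".join(captured) or f"no ERROR lines found for {SEGMENT}")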
p
@User Will investigate on these lines, thanks a lot for the thorough analysis 🙂
the row in question will disrupt consumption
The new consuming segment created by the repair job actually went ahead and completed successfully. As you said, if there were issues with the data, it should have failed as well.
@User I think this JVM flag could be the reason for the truncated stack traces? It says it kicks in if the exception recurs multiple times.
s
possible.