https://pinot.apache.org/ logo
z

Zsolt Takacs

07/12/2021, 11:18 AM
I've ran into a problem with a realtime table using
"completionMode": "DOWNLOAD"
. The download failed three times on the replica server, and the segment is stuck in error state. If I try to reload it the log says it's reloading a consuming segment, but nothing changes. How can I recover from this state?
The relevant logs grepping for the segment name:
Copy code
2021/07/12 03:11:23.622 INFO [AppInfoParser] [viewSegments__6__322__20210712T0211Z] Kafka version : 2.0.0
2021/07/12 03:11:23.622 INFO [AppInfoParser] [viewSegments__6__322__20210712T0211Z] Kafka commitId : 3402a8361b734732
2021/07/12 03:11:23.622 INFO [LLRealtimeSegmentDataManager_viewSegments__6__322__20210712T0211Z] [viewSegments__6__322__20210712T0211Z] Stopping consumption due to time limit start=1626055882438 now=1626059483622 numRowsConsumed=3 numRowsIndexed=3
2021/07/12 03:11:23.628 INFO [FileUploadDownloadClient] [viewSegments__6__322__20210712T0211Z] Sending request: <http://merry-macaw-pinot-controller-0.merry-macaw-pinot-controller-headless.crt.svc.cluster.local:9000/segmentConsumed?name=viewSegments__6__322__20210712T0211Z&offset=58088148&instance=Server_merry-macaw-pinot-server-1.merry-macaw-pinot-server-headless.crt.svc.cluster.local_8098&reason=timeLimit&rowCount=3&streamPartitionMsgOffset=58088148> to controller: merry-macaw-pinot-controller-0.merry-macaw-pinot-controller-headless.crt.svc.cluster.local, version: Unknown
2021/07/12 03:11:23.628 INFO [ServerSegmentCompletionProtocolHandler] [viewSegments__6__322__20210712T0211Z] Controller response {"isSplitCommitType":false,"streamPartitionMsgOffset":"58088148","buildTimeSec":-1,"status":"HOLD","offset":58088148} for <http://merry-macaw-pinot-controller-0.merry-macaw-pinot-controller-headless.crt.svc.cluster.local:9000/segmentConsumed?name=viewSegments__6__322__20210712T0211Z&offset=58088148&instance=Server_merry-macaw-pinot-server-1.merry-macaw-pinot-server-headless.crt.svc.cluster.local_8098&reason=timeLimit&rowCount=3&streamPartitionMsgOffset=58088148>
2021/07/12 03:11:25.074 INFO [HelixTaskExecutor] [ZkClient-EventThread-21-merry-macaw-zookeeper:2181] Scheduling message f0e75749-a1c3-4d9a-a65f-ce5c82ce9dc5: viewSegments_REALTIME:viewSegments__6__322__20210712T0211Z, CONSUMING->ONLINE
2021/07/12 03:11:25.074 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] handling message: f0e75749-a1c3-4d9a-a65f-ce5c82ce9dc5 transit viewSegments_REALTIME.viewSegments__6__322__20210712T0211Z|[] from:CONSUMING to:ONLINE, relayedFrom: null
2021/07/12 03:11:25.106 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] Instance Server_merry-macaw-pinot-server-1.merry-macaw-pinot-server-headless.crt.svc.cluster.local_8098, partition viewSegments__6__322__20210712T0211Z received state transition from CONSUMING to ONLINE on session 100e6aa105f006d, message id: f0e75749-a1c3-4d9a-a65f-ce5c82ce9dc5
2021/07/12 03:11:25.110 INFO [LLRealtimeSegmentDataManager_viewSegments__6__322__20210712T0211Z] [HelixTaskExecutor-message_handle_thread] State: HOLDING, transitioning from CONSUMING to ONLINE (startOffset: 58088145, endOffset: 58088148)
2021/07/12 03:11:26.629 INFO [LLRealtimeSegmentDataManager_viewSegments__6__322__20210712T0211Z] [HelixTaskExecutor-message_handle_thread] Consumer thread stopped in state HOLDING
2021/07/12 03:11:26.629 INFO [LLRealtimeSegmentDataManager_viewSegments__6__322__20210712T0211Z] [HelixTaskExecutor-message_handle_thread] State HOLDING. CompletionMode DOWNLOAD. Downloading to replace
2021/07/12 03:11:26.630 INFO [S3PinotFS] [HelixTaskExecutor-message_handle_thread] Copy <s3://data-crt-pinot-us-prod/data/viewSegments/viewSegments__6__322__20210712T0211Z> to local /var/pinot/server/data/index/viewSegments_REALTIME/viewSegments__6__322__20210712T0211Z.tar.gz
2021/07/12 03:11:56.637 WARN [PinotFSSegmentFetcher] [HelixTaskExecutor-message_handle_thread] Caught exception while fetching segment from: <s3://data-crt-pinot-us-prod/data/viewSegments/viewSegments__6__322__20210712T0211Z> to: /var/pinot/server/data/index/viewSegments_REALTIME/viewSegments__6__322__20210712T0211Z.tar.gz
2021/07/12 03:11:57.056 INFO [S3PinotFS] [HelixTaskExecutor-message_handle_thread] Copy <s3://data-crt-pinot-us-prod/data/viewSegments/viewSegments__6__322__20210712T0211Z> to local /var/pinot/server/data/index/viewSegments_REALTIME/viewSegments__6__322__20210712T0211Z.tar.gz
2021/07/12 03:12:27.088 WARN [PinotFSSegmentFetcher] [HelixTaskExecutor-message_handle_thread] Caught exception while fetching segment from: <s3://data-crt-pinot-us-prod/data/viewSegments/viewSegments__6__322__20210712T0211Z> to: /var/pinot/server/data/index/viewSegments_REALTIME/viewSegments__6__322__20210712T0211Z.tar.gz
2021/07/12 03:12:28.192 INFO [S3PinotFS] [HelixTaskExecutor-message_handle_thread] Copy <s3://data-crt-pinot-us-prod/data/viewSegments/viewSegments__6__322__20210712T0211Z> to local /var/pinot/server/data/index/viewSegments_REALTIME/viewSegments__6__322__20210712T0211Z.tar.gz
2021/07/12 03:12:58.196 WARN [PinotFSSegmentFetcher] [HelixTaskExecutor-message_handle_thread] Caught exception while fetching segment from: <s3://data-crt-pinot-us-prod/data/viewSegments/viewSegments__6__322__20210712T0211Z> to: /var/pinot/server/data/index/viewSegments_REALTIME/viewSegments__6__322__20210712T0211Z.tar.gz
2021/07/12 03:13:01.512 WARN [viewSegments_REALTIME-RealtimeTableDataManager] [HelixTaskExecutor-message_handle_thread] Failed to download segment viewSegments__6__322__20210712T0211Z from deep store:
2021/07/12 03:13:01.513 WARN [viewSegments_REALTIME-RealtimeTableDataManager] [HelixTaskExecutor-message_handle_thread] Download segment viewSegments__6__322__20210712T0211Z from deepstore uri <s3://data-crt-pinot-us-prod/data/viewSegments/viewSegments__6__322__20210712T0211Z> failed.
2021/07/12 03:13:01.513 ERROR [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] Exception while executing a state transition task viewSegments__6__322__20210712T0211Z
2021/07/12 03:13:01.593 INFO [HelixTask] [HelixTaskExecutor-message_handle_thread] Message: f0e75749-a1c3-4d9a-a65f-ce5c82ce9dc5 (parent: null) handling task for viewSegments_REALTIME:viewSegments__6__322__20210712T0211Z completed at: 1626059581593, results: false. FrameworkTime: 23 ms; HandlerTime: 96496 ms.
2021/07/12 07:52:48.453 INFO [HelixTaskExecutor] [ZkClient-EventThread-21-merry-macaw-zookeeper:2181] Scheduling message ba69fae1-316e-4923-b2b0-3ad4701a60eb: viewSegments_REALTIME:viewSegments__6__322__20210712T0211Z, null->null
2021/07/12 07:52:48.454 INFO [viewSegments_REALTIME-SegmentReloadMessageHandler] [HelixTaskExecutor-message_handle_thread] Handling message: ZnRecord=ba69fae1-316e-4923-b2b0-3ad4701a60eb, {CREATE_TIMESTAMP=1626076368284, EXECUTE_START_TIMESTAMP=1626076368454, MSG_ID=ba69fae1-316e-4923-b2b0-3ad4701a60eb, MSG_STATE=new, MSG_SUBTYPE=RELOAD_SEGMENT, MSG_TYPE=USER_DEFINE_MSG, PARTITION_NAME=viewSegments__6__322__20210712T0211Z, RESOURCE_NAME=viewSegments_REALTIME, RETRY_COUNT=0, SRC_CLUSTER=crt, SRC_INSTANCE_TYPE=PARTICIPANT, SRC_NAME=Controller_merry-macaw-pinot-controller-1.merry-macaw-pinot-controller-headless.crt.svc.cluster.local_9000, TGT_NAME=Server_merry-macaw-pinot-server-1.merry-macaw-pinot-server-headless.crt.svc.cluster.local_8098, TGT_SESSION_ID=100e6aa105f006d, TIMEOUT=-1}{}{}, Stat=Stat {_version=0, _creationTime=1626076368404, _modifiedTime=1626076368404, _ephemeralOwner=0}
2021/07/12 07:52:48.454 INFO [viewSegments_REALTIME-SegmentReloadMessageHandler] [HelixTaskExecutor-message_handle_thread] Waiting for lock to refresh : viewSegments__6__322__20210712T0211Z, queue-length: 0
2021/07/12 07:52:48.455 INFO [viewSegments_REALTIME-SegmentReloadMessageHandler] [HelixTaskExecutor-message_handle_thread] Acquired lock to refresh segment: viewSegments__6__322__20210712T0211Z (lock-time=1ms, queue-length=0)
2021/07/12 07:52:48.455 INFO [HelixInstanceDataManager] [HelixTaskExecutor-message_handle_thread] Reloading single segment: viewSegments__6__322__20210712T0211Z in table: viewSegments_REALTIME
2021/07/12 07:52:48.463 INFO [HelixInstanceDataManager] [HelixTaskExecutor-message_handle_thread] Reloading segment: viewSegments__6__322__20210712T0211Z in table: viewSegments_REALTIME
2021/07/12 07:52:48.464 INFO [HelixInstanceDataManager] [HelixTaskExecutor-message_handle_thread] Try reloading REALTIME consuming segment: viewSegments__6__322__20210712T0211Z in table: viewSegments_REALTIME
2021/07/12 07:52:48.464 INFO [MutableSegmentImpl_viewSegments__6__322__20210712T0211Z_crt-pinot-segments] [HelixTaskExecutor-message_handle_thread] Newly added columns: {}
2021/07/12 07:52:48.464 INFO [HelixInstanceDataManager] [HelixTaskExecutor-message_handle_thread] Reloaded single segment: viewSegments__6__322__20210712T0211Z in table: viewSegments_REALTIME
2021/07/12 07:52:48.472 INFO [HelixTask] [HelixTaskExecutor-message_handle_thread] Message: ba69fae1-316e-4923-b2b0-3ad4701a60eb (parent: null) handling task for viewSegments_REALTIME:viewSegments__6__322__20210712T0211Z completed at: 1626076368472, results: true. FrameworkTime: 8 ms; HandlerTime: 10 ms.
b

Bruce Ritchie

07/12/2021, 3:32 PM
Log indicates that it couldn't dl the segment from S3:
Copy code
Download segment viewSegments__6__322__20210712T0211Z from deepstore uri <s3://data-crt-pinot-us-prod/data/viewSegments/viewSegments__6__322__20210712T0211Z> failed.
Does that segment exist in S3?
z

Zsolt Takacs

07/12/2021, 3:36 PM
Yes it exists. In the meantime I've removed the tenant tags from the affected server and rebalanced everything from it, and all segments recovered. The problem is it seems there's no automatic recovery if the deepstore download fails enough times.
And also I've seen this same occurence with an offline table where segments were batch ingested, it is not realtime only.
b

Bruce Ritchie

07/12/2021, 3:39 PM
I haven't looked into the code but it's possible it's not handling the 503 slowdown response (or some other common s3 issue) from s3 well.
z

Zsolt Takacs

07/12/2021, 3:43 PM
Here's a download error with some stack trace:
Copy code
2021/07/08 03:11:04.064 WARN [PinotFSSegmentFetcher] [HelixTaskExecutor-message_handle_thread] Caught exception while fetching segment from: <s3://data-crt-pinot-us-prod/data/viewSegments/viewSegments_OFFLINE_1564164000000_1564167600000_34> to: /var/pinot/server/data/segment/viewSegments_OFFLINE/tmp-viewSegments_OFFLINE_1564164000000_1564167600000_34-36849a99-61b3-4643-bf12-afa5908c35cc/viewSegments_OFFLINE_1564164000000_1564167600000_34.enc
software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: Read timed out
	at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:98) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:43) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:199) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:66) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:34) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:189) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:121) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:147) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:76) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:52) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:62) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at software.amazon.awssdk.services.s3.DefaultS3Client.getObject(DefaultS3Client.java:3539) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.plugin.filesystem.S3PinotFS.copyToLocalFile(S3PinotFS.java:490) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
this same error occured three times, then it's
Attempts exceeded when downloading segment
and the segment is stuck
j

Jackie

07/12/2021, 4:22 PM
@Zsolt Takacs You can use the segment reset api in the swagger which should be able to fix the error'ed segment
z

Zsolt Takacs

07/12/2021, 6:22 PM
thanks, I will try next time
Since these problems are frequent for us I've made a script to call the reset api for every segment that has problems, it's working. Is anything planned that can fix these issues without manual actions?