Prashant Pandey
03/11/2022, 1:33 PM{
"id": "backend_entity_view_REALTIME",
"simpleFields": {
"BUCKET_SIZE": "0",
"SESSION_ID": "30162be05be0043",
"STATE_MODEL_DEF": "SegmentOnlineOfflineStateModel",
"STATE_MODEL_FACTORY_NAME": "DEFAULT"
},
"mapFields": {
"backend_entity_view__6__1770__20220311T1126Z": {
"CURRENT_STATE": "CONSUMING",
"END_TIME": "1646997978968",
"INFO": "",
"PREVIOUS_STATE": "OFFLINE",
"START_TIME": "1646997978753",
"TRIGGERED_BY": "*"
}
},
"listFields": {}
}
{
"id": "service_call_view_REALTIME",
"simpleFields": {
"BUCKET_SIZE": "0",
"SESSION_ID": "30162be05be0043",
"STATE_MODEL_DEF": "SegmentOnlineOfflineStateModel",
"STATE_MODEL_FACTORY_NAME": "DEFAULT"
},
"mapFields": {
"service_call_view__4__1268__20220311T1227Z": {
"CURRENT_STATE": "ONLINE",
"END_TIME": "1647004152026",
"INFO": "",
"PREVIOUS_STATE": "CONSUMING",
"START_TIME": "1647004133127",
"TRIGGERED_BY": "*"
},
"service_call_view__4__1269__20220311T1308Z": {
"CURRENT_STATE": "CONSUMING",
"END_TIME": "1647004133319",
"INFO": "",
"PREVIOUS_STATE": "OFFLINE",
"START_TIME": "1647004133127",
"TRIGGERED_BY": "*"
}
},
"listFields": {}
}
{
"id": "span_event_view_1_REALTIME",
"simpleFields": {
"BUCKET_SIZE": "0",
"SESSION_ID": "30162be05be0043",
"STATE_MODEL_DEF": "SegmentOnlineOfflineStateModel",
"STATE_MODEL_FACTORY_NAME": "DEFAULT"
},
"mapFields": {
"span_event_view_1__1__9751__20220309T1444Z": {
"CURRENT_STATE": "OFFLINE",
"END_TIME": "1646837055817",
"INFO": "",
"PREVIOUS_STATE": "CONSUMING",
"START_TIME": "1646837055782",
"TRIGGERED_BY": "*"
},
"span_event_view_1__1__9865__20220311T1302Z": {
"CURRENT_STATE": "ONLINE",
"END_TIME": "1647004903102",
"INFO": "",
"PREVIOUS_STATE": "CONSUMING",
"START_TIME": "1647004896155",
"TRIGGERED_BY": "*"
},
"span_event_view_1__13__9635__20220311T1303Z": {
"CURRENT_STATE": "CONSUMING",
"END_TIME": "1647003820644",
"INFO": "",
"PREVIOUS_STATE": "OFFLINE",
"START_TIME": "1647003820427",
"TRIGGERED_BY": "*"
},
"span_event_view_1__1__9866__20220311T1321Z": {
"CURRENT_STATE": "CONSUMING",
"END_TIME": "1647004896393",
"INFO": "",
"PREVIOUS_STATE": "OFFLINE",
"START_TIME": "1647004896155",
"TRIGGERED_BY": "*"
}
},
"listFields": {}
}
The server is consuming from all partitions of span_event_view_1_REALTIME
just fine, but the lag in just this partition (partition 6) of backend_entity_view_REALTIME
is continually increasing. I checked the controller logs, and see a whole lot of:
2022/03/11 13:20:53.141 WARN [ConsumerConfig] [grizzly-http-server-0] The configuration 'stream.kafka.topic.name' was supplied but isn't a known config.
2022/03/11 13:20:53.397 WARN [TopStateHandoffReportStage] [HelixController-pipeline-default-pinot-prod-(0ff7d49b_DEFAULT)] Event 0ff7d49b_DEFAULT : Cannot confirm top state missing start time. Use the current system time as the start time.
2022/03/11 13:21:36.012 WARN [TopStateHandoffReportStage] [HelixController-pipeline-default-pinot-prod-(d95795b6_DEFAULT)] Event d95795b6_DEFAULT : Cannot confirm top state missing start time. Use the current system time as the start time.
2022/03/11 13:21:59.914 WARN [ZkBaseDataAccessor] [HelixController-pipeline-default-pinot-prod-(6831a128_DEFAULT)] Fail to read record for paths: {/pinot-prod/INSTANCES/Broker_broker-1.broker-headless.pinot.svc.cluster.local_8099/MESSAGES/b43eba4e-58fb-4fae-bb8e-72bcd97ed0ea=-101}
2022/03/11 13:22:00.411 WARN [TopStateHandoffReportStage] [HelixController-pipeline-default-pinot-prod-(ed7c9add_DEFAULT)] Event ed7c9add_DEFAULT : Cannot confirm top state missing start time. Use the current system time as the start time.
2022/03/11 13:22:00.899 WARN [TaskGarbageCollectionStage] [TaskJobPurgeWorker-pinot-prod] ResourceControllerDataProvider or HelixManager is null for event 7ed61473_TASK(CurrentStateChange) in cluster pinot-prod. Skip TaskGarbageCollectionStage.
2022/03/11 13:22:12.082 ERROR [ZkBaseDataAccessor] [grizzly-http-server-2] paths is null or empty
2022/03/11 13:23:09.694 WARN [ZkBaseDataAccessor] [HelixController-pipeline-default-pinot-prod-(6e6a3d4e_DEFAULT)] Fail to read record for paths: {/pinot-prod/INSTANCES/Broker_broker-1.broker-headless.pinot.svc.cluster.local_8099/MESSAGES/3f8e92c9-7eb4-49ca-91d1-caa9e868e071=-101}
2022/03/11 13:23:09.694 WARN [ZkBaseDataAccessor] [HelixController-pipeline-task-pinot-prod-(6e6a3d4e_TASK)] Fail to read record for paths: {/pinot-prod/INSTANCES/Broker_broker-1.broker-headless.pinot.svc.cluster.local_8099/MESSAGES/3f8e92c9-7eb4-49ca-91d1-caa9e868e071=-101}
2022/03/11 13:23:10.248 WARN [TopStateHandoffReportStage] [HelixController-pipeline-default-pinot-prod-(9f5bd9ba_DEFAULT)] Event 9f5bd9ba_DEFAULT : Cannot confirm top state missing start time. Use the current system time as the start time.
2022/03/11 13:23:20.224 WARN [ZkBaseDataAccessor] [HelixController-pipeline-task-pinot-prod-(28af3294_TASK)] Fail to read record for paths: {/pinot-prod/INSTANCES/Broker_broker-1.broker-headless.pinot.svc.cluster.local_8099/MESSAGES/0ebed9b5-51b6-41fe-b5df-a8d69c1b717b=-101}
2022/03/11 13:23:20.224 WARN [ZkBaseDataAccessor] [HelixController-pipeline-default-pinot-prod-(28af3294_DEFAULT)] Fail to read record for paths: {/pinot-prod/INSTANCES/Broker_broker-1.broker-headless.pinot.svc.cluster.local_8099/MESSAGES/0ebed9b5-51b6-41fe-b5df-a8d69c1b717b=-101}
2022/03/11 13:23:20.905 WARN [TopStateHandoffReportStage] [HelixController-pipeline-default-pinot-prod-(b4f04a02_DEFAULT)] Event b4f04a02_DEFAULT : Cannot confirm top state missing start time. Use the current system time as the start time.
2022/03/11 13:25:05.373 WARN [ZkBaseDataAccessor] [HelixController-pipeline-default-pinot-prod-(364d2339_DEFAULT)] Fail to read record for paths: {/pinot-prod/INSTANCES/Broker_broker-0.broker-headless.pinot.svc.cluster.local_8099/MESSAGES/aae2fb99-0322-4568-b410-88fddd305fd9=-101}
2022/03/11 13:25:05.373 WARN [ZkBaseDataAccessor] [HelixController-pipeline-task-pinot-prod-(364d2339_TASK)] Fail to read record for paths: {/pinot-prod/INSTANCES/Broker_broker-0.broker-headless.pinot.svc.cluster.local_8099/MESSAGES/aae2fb99-0322-4568-b410-88fddd305fd9=-101}
The server has only one conspicuous warning:
2022/03/11 12:57:12.755 ERROR [ServerSegmentCompletionProtocolHandler] [span_event_view_1__1__9864__20220311T1243Z] Could not send request <http://controller-0.controller-headless.pinot.svc.cluster.local:9000/segmentConsumed?reason=rowLimit&streamPartitionMsgOffset=28119323382&instance=Server_server-realtime-10.server-realtime-headless.pinot.svc.cluster.local_8098&offset=-1&name=span_event_view_1__1__9864__20220311T1243Z&rowCount=9138843&memoryUsedBytes=3188798201>
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method) ~[?:?]
at java.net.SocketInputStream.socketRead(SocketInputStream.java:115) ~[?:?]
at java.net.SocketInputStream.read(SocketInputStream.java:168) ~[?:?]
at java.net.SocketInputStream.read(SocketInputStream.java:140) ~[?:?]
at shaded.org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-f8ec6f6f8eead03488d3f4d0b9501fc3c4232961]
at shaded.org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-f8ec6f6f8eead03488d3f4d0b9501fc3c4232961]
at shaded.org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-f8ec6f6f8eead03488d3f4d0b9501fc3c4232961]
at shaded.org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-f8ec6f6f8eead03488d3f4d0b9501fc3c4232961]
at shaded.org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-f8ec6f6f8eead03488d3f4d0b9501fc3c4232961]
at shaded.org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-f8ec6f6f8eead03488d3f4d0b9501fc3c4232961]
at shaded.org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-f8ec6f6f8eead03488d3f4d0b9501fc3c4232961]
at shaded.org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:157) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-f8ec6f6f8eead03488d3f4d0b9501fc3c4232961]
at shaded.org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-f8ec6f6f8eead03488d3f4d0b9501fc3c4232961]
at shaded.org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) ~[pinot-all-0.9.1-jar-with-dependencies.jar:0.9.1-
at java.lang.Thread.run(Thread.java:829) [?:?]
2022/03/11 12:57:12.756 ERROR [LLRealtimeSegmentDataManager_span_event_view_1__1__9864__20220311T1243Z] [span_event_view_1__1__9864__20220311T1243Z] Holding after response from Controller: {"offset":-1,"streamPartitionMsgOffset":null,"buildTimeSec":-1,"isSplitCommitType":false,"status":"NOT_SENT"}
The server’s resource usage is well under limits. Any idea what might be going on here?Prashant Pandey
03/11/2022, 1:37 PMbackend_entity_view_REALTIME
are being consumed from as usual, so doesn’t look like a problem with the kafka topic.Prashant Pandey
03/11/2022, 2:05 PMMayank
Kishore G
Prashant Pandey
03/11/2022, 2:44 PMKishore G
Kishore G
Kishore G
Prashant Pandey
03/11/2022, 2:51 PMspan_event_view_1_REALTIME
all fine but stopped consuming from backend_entity_view_REALTIME
.Kishore G
Kishore G
Kishore G
Prashant Pandey
03/11/2022, 2:53 PMspan_event_view_1_REALTIME
, which was working fine. I can look for similar logs for backend_entity_view_REALTIME
in Sumo Logic.Prashant Pandey
03/11/2022, 2:54 PMKishore G
Prashant Pandey
03/11/2022, 2:55 PMPrashant Pandey
03/11/2022, 3:07 PM"1647001516682","11/03/2022 17:55:16.682 +0530","Exception in thread ""backend_entity_view__6__1770__20220311T1126Z"" java.lang.OutOfMemoryError: Java heap space","{""timestamp"":1647001516682,""log"":""Exception in thread \""backend_entity_view__6__1770__20220311T1126Z\"" java.lang.OutOfMemoryError: Java heap space"",""stream"":""stderr"",""kubernetes"":{""host"":""ip-10-1-136-249.ap-south-1.compute.internal""}}","PROD","ip-10-1-136-249.ap-south-1.compute.internal","234","k8s-prod-green-eks","k8s/prod/pinot/server","prod-green-eks","pinot.server-realtime-10.server","stderr","1647001516682"
Mayank
Mayank
Prashant Pandey
03/11/2022, 3:11 PMMayank
Prashant Pandey
03/11/2022, 3:13 PMMayank
Mayank
Mayank
Prashant Pandey
03/11/2022, 3:14 PMMayank
Prashant Pandey
03/11/2022, 3:14 PMMayank
Prashant Pandey
03/11/2022, 3:16 PMMayank
Prashant Pandey
03/11/2022, 3:16 PMPrashant Pandey
03/11/2022, 3:17 PMMayank
Mayank
Prashant Pandey
03/11/2022, 3:18 PMPrashant Pandey
03/11/2022, 3:19 PMMayank
Mayank
Prashant Pandey
03/11/2022, 3:21 PMPrashant Pandey
03/11/2022, 3:23 PMMayank
Sajjad Moradi
03/11/2022, 6:39 PMSajjad Moradi
03/11/2022, 6:45 PM