# Troubleshooting
l
Hello, we are on 0.6 and we are facing an issue with realtime table consumption from Kafka. Consumption from Kafka stops all of a sudden. There are no errors/exceptions in the logs, and from thread dumps we don't see the consumption thread. From this I assume we hit some uncaught exception in the consumption thread. We are using the Kafka low-level consumer. At Pinot bootstrap, should we register a default uncaught exception handler which just logs such exceptions with the full stack trace? This would help in debugging several unhandled corner cases.
j
The consuming threads have a global catch which will log the exception if anything goes wrong. Can you please check the external view of the table and see if the segment runs into ERROR state?
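(For reference, one way to check the external view programmatically is the controller REST API. Below is a rough sketch using the Java 11 HTTP client; it assumes a controller reachable at localhost:9000 and the /tables/{tableName}/externalview endpoint, so adjust the host, port, and table name for your cluster.)

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ExternalViewFetcher {
  public static void main(String[] args) throws Exception {
    // Assumed controller address and table name; adjust for your cluster.
    String controllerUrl = "http://localhost:9000";
    String tableName = "spanEventView";

    HttpClient client = HttpClient.newHttpClient();
    HttpRequest request = HttpRequest.newBuilder(
            URI.create(controllerUrl + "/tables/" + tableName + "/externalview?tableType=realtime"))
        .GET()
        .build();

    // The response is a JSON map of segment -> {server -> state}; look for segments in ERROR
    // state or segments missing the expected server entirely.
    HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
    System.out.println(response.body());
  }
}
```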
l
yes. checking
j
Also search for the ERROR log with logger name LLRealtimeSegmentDataManager_<segmentName>.
l
Nope. There are no error logs for this, either in the controller or the server.
x
hmm, have you checked the log file inside the server container?
l
yes
x
hmm, can you try setting the log level to info or debug on the server so it may print out the stack trace? In any case, there should be some output. Also, for testing purposes, you can make the logger synchronous.
l
Log level is already info. We can't really afford to set it to debug as this is happening in a cluster with significant load. And this issue is not occurring in other clusters (which have less load).
At Pinot bootstrap, should we register a default uncaught exception handler which just logs them with the full stack trace?
@Xiang Fu: Should we have one?
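(For illustration, a minimal sketch of such a default handler; this is not existing Pinot code. It assumes an SLF4J logger and would be registered once early in server startup.)

```java
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class DefaultUncaughtExceptionHandlerRegistrar {
  private static final Logger LOGGER =
      LoggerFactory.getLogger(DefaultUncaughtExceptionHandlerRegistrar.class);

  private DefaultUncaughtExceptionHandlerRegistrar() {
  }

  // Call this once during server bootstrap so that any thread dying from an uncaught
  // exception leaves a full stack trace in the logs instead of disappearing silently.
  public static void register() {
    Thread.setDefaultUncaughtExceptionHandler((thread, throwable) ->
        LOGGER.error("Uncaught exception in thread: {}", thread.getName(), throwable));
  }
}
```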
This issue occurred again in our system and here are my observations:
• Consumer threads for a few partitions are not seen in thread dumps. This means the consumption threads are not just hung, they are dead.
• The issue happens when an old segment is moving from CONSUMING to ONLINE state and a new segment is being created.
• No error logs and no stack traces.
• The last log we see on the server before the consumption thread dies quietly:
Copy code
2021/03/31 07:31:39.825 INFO [LLRealtimeSegmentDataManager_spanEventView__27__2251__20210331T0731Z] [HelixTaskExecutor-message_handle_thread] RealtimeDataResourceZKMetadata contains no information about sorted column for segment spanEventView__27__2251__20210331T0731Z
• External view of the table also doesn't show this server for this specific segment.
• The issue gets fixed after a restart of the problematic pinot-server.
Controller logs
Copy code
2021/03/31 07:31:37.786 INFO [PinotLLCRealtimeSegmentManager] [grizzly-http-server-1] Creating segment ZK metadata for new CONSUMING segment: spanEventView__27__2251__20210331T0731Z with start offset: 166204784 and creation time: 1617175897785
2021/03/31 07:31:37.786 INFO [SegmentSizeBasedFlushThresholdUpdater] [grizzly-http-server-1] spanEventView__27__2251__20210331T0731Z: Data from committing segment: Time 1h32m3s  numRows 182331 threshold 182331 segmentSize(bytes) 326256819
2021/03/31 07:31:37.786 INFO [SegmentSizeBasedFlushThresholdUpdater] [grizzly-http-server-1] Committing segment size 326256819, current ratio 4.443249882066186E-4, setting threshold for spanEventView__27__2251__20210331T0731Z as 186363
2021/03/31 07:31:37.786 INFO [PinotLLCRealtimeSegmentManager] [grizzly-http-server-1] Persisting segment ZK metadata for segment: spanEventView__27__2251__20210331T0731Z
2021/03/31 07:31:37.854 INFO [RealtimeSegmentAssignment] [grizzly-http-server-1] Assigning segment: spanEventView__27__2251__20210331T0731Z with instance partitions: {"instancePartitionsName":"spanEventView_CONSUMING","partitionToInstancesMap":{"0_0":["Server_pinot-server-3.pinot-server.traceable.svc.cluster.local_8098","Server_pinot-server-0.pinot-server.traceable.svc.cluster.local_8098","Server_pinot-server-1.pinot-server.traceable.svc.cluster.local_8098","Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098"]}} for table: spanEventView_REALTIME
2021/03/31 07:31:37.854 INFO [RealtimeSegmentAssignment] [grizzly-http-server-1] Assigned segment: spanEventView__27__2251__20210331T0731Z to instances: [Server_pinot-server-1.pinot-server.traceable.svc.cluster.local_8098, Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098] for table: spanEventView_REALTIME
2021/03/31 07:31:37.854 INFO [PinotLLCRealtimeSegmentManager] [grizzly-http-server-1] Adding new CONSUMING segment: spanEventView__27__2251__20210331T0731Z to instances: [Server_pinot-server-1.pinot-server.traceable.svc.cluster.local_8098, Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098]
2021/03/31 07:31:39.689 INFO [MessageDispatchStage] [HelixController-pipeline-default-traceable-views-(d7a6546b_DEFAULT)] Event d7a6546b_DEFAULT : Sending Message 6d73e84e-fa4d-4987-b026-7cb93397b7a9 to Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098 transit spanEventView_REALTIME.spanEventView__27__2251__20210331T0731Z|[] from:OFFLINE to:CONSUMING, relayMessages: 0
2021/03/31 07:31:39.689 INFO [MessageDispatchStage] [HelixController-pipeline-default-traceable-views-(d7a6546b_DEFAULT)] Event d7a6546b_DEFAULT : Sending Message 803e30a5-3c3b-4cb2-9f2d-065ee9a78404 to Server_pinot-server-1.pinot-server.traceable.svc.cluster.local_8098 transit spanEventView_REALTIME.spanEventView__27__2251__20210331T0731Z|[] from:OFFLINE to:CONSUMING, relayMessages: 0
2021/03/31 07:31:40.937 INFO [MessageGenerationPhase] [HelixController-pipeline-default-traceable-views-(f69e3778_DEFAULT)] Event f69e3778_DEFAULT : Message already exists for Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098 to transit spanEventView_REALTIME.spanEventView__27__2251__20210331T0731Z from OFFLINE to CONSUMING, isRelay: false
2021/03/31 07:31:40.937 INFO [MessageGenerationPhase] [HelixController-pipeline-default-traceable-views-(f69e3778_DEFAULT)] Event f69e3778_DEFAULT : Message already exists for Server_pinot-server-1.pinot-server.traceable.svc.cluster.local_8098 to transit spanEventView_REALTIME.spanEventView__27__2251__20210331T0731Z from OFFLINE to CONSUMING, isRelay: false
2021/03/31 07:31:41.892 INFO [MessageGenerationPhase] [HelixController-pipeline-default-traceable-views-(1de6a604_DEFAULT)] Event 1de6a604_DEFAULT : Message already exists for Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098 to transit spanEventView_REALTIME.spanEventView__27__2251__20210331T0731Z from OFFLINE to CONSUMING, isRelay: false
Pinot server logs
Copy code
2021/03/31 07:31:39.730 INFO [HelixTaskExecutor] [ZkClient-EventThread-22-zookeeper-0.zookeeper-headless.traceable.svc.cluster.local:2181,zookeeper-1.zookeeper-headless.traceable.svc.cluster.local:2181,zookeeper-2.zookeeper-headless.traceable.svc.cluster.local:2181/pinot] Scheduling message 6d73e84e-fa4d-4987-b026-7cb93397b7a9: spanEventView_REALTIME:spanEventView__27__2251__20210331T0731Z, OFFLINE->CONSUMING
2021/03/31 07:31:39.730 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] handling message: 6d73e84e-fa4d-4987-b026-7cb93397b7a9 transit spanEventView_REALTIME.spanEventView__27__2251__20210331T0731Z|[] from:OFFLINE to:CONSUMING, relayedFrom: null
2021/03/31 07:31:39.811 INFO [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] Instance Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098, partition spanEventView__27__2251__20210331T0731Z received state transition from OFFLINE to CONSUMING on session 377be7c4e47028e, message id: 6d73e84e-fa4d-4987-b026-7cb93397b7a9
2021/03/31 07:31:39.811 INFO [SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel] [HelixTaskExecutor-message_handle_thread] SegmentOnlineOfflineStateModel.onBecomeConsumingFromOffline() : ZnRecord=6d73e84e-fa4d-4987-b026-7cb93397b7a9, {CREATE_TIMESTAMP=1617175899403, ClusterEventName=MessageChange, EXECUTE_START_TIMESTAMP=1617175899730, EXE_SESSION_ID=377be7c4e47028e, FROM_STATE=OFFLINE, MSG_ID=6d73e84e-fa4d-4987-b026-7cb93397b7a9, MSG_STATE=read, MSG_TYPE=STATE_TRANSITION, PARTITION_NAME=spanEventView__27__2251__20210331T0731Z, READ_TIMESTAMP=1617175899722, RESOURCE_NAME=spanEventView_REALTIME, RESOURCE_TAG=spanEventView_REALTIME, SRC_NAME=pinot-controller-1.pinot-controller.traceable.svc.cluster.local_9000, SRC_SESSION_ID=177be7a05650257, STATE_MODEL_DEF=SegmentOnlineOfflineStateModel, STATE_MODEL_FACTORY_NAME=DEFAULT, TGT_NAME=Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098, TGT_SESSION_ID=377be7c4e47028e, TO_STATE=CONSUMING}{}{}, Stat=Stat {_version=0, _creationTime=1617175899694, _modifiedTime=1617175899694, _ephemeralOwner=0}
2021/03/31 07:31:39.811 INFO [SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel] [HelixTaskExecutor-message_handle_thread] SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline() : ZnRecord=6d73e84e-fa4d-4987-b026-7cb93397b7a9, {CREATE_TIMESTAMP=1617175899403, ClusterEventName=MessageChange, EXECUTE_START_TIMESTAMP=1617175899730, EXE_SESSION_ID=377be7c4e47028e, FROM_STATE=OFFLINE, MSG_ID=6d73e84e-fa4d-4987-b026-7cb93397b7a9, MSG_STATE=read, MSG_TYPE=STATE_TRANSITION, PARTITION_NAME=spanEventView__27__2251__20210331T0731Z, READ_TIMESTAMP=1617175899722, RESOURCE_NAME=spanEventView_REALTIME, RESOURCE_TAG=spanEventView_REALTIME, SRC_NAME=pinot-controller-1.pinot-controller.traceable.svc.cluster.local_9000, SRC_SESSION_ID=177be7a05650257, STATE_MODEL_DEF=SegmentOnlineOfflineStateModel, STATE_MODEL_FACTORY_NAME=DEFAULT, TGT_NAME=Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098, TGT_SESSION_ID=377be7c4e47028e, TO_STATE=CONSUMING}{}{}, Stat=Stat {_version=0, _creationTime=1617175899694, _modifiedTime=1617175899694, _ephemeralOwner=0}
2021/03/31 07:31:39.811 INFO [HelixInstanceDataManager] [HelixTaskExecutor-message_handle_thread] Adding segment: spanEventView__27__2251__20210331T0731Z to table: spanEventView_REALTIME
2021/03/31 07:31:39.825 INFO [LLRealtimeSegmentDataManager_spanEventView__27__2251__20210331T0731Z] [HelixTaskExecutor-message_handle_thread] RealtimeDataResourceZKMetadata contains no information about sorted column for segment spanEventView__27__2251__20210331T0731Z
The last log on the server looks like the problem to me. Does this mean a premature notification from the controller to the server, even before the actual metadata is committed to ZooKeeper? cc: @Xiang Fu
Updated this thread with more info. Any thoughts?
x
@Mayank this seems to be an issue that many people are facing
@Laxman Ch so is the segment committed and the new segment not created, or did the consuming segment's thread just get dropped while the segment is still in CONSUMING status in the ideal state?
l
From the above logs, the segment thread died while creating a new segment.
segment is still in consuming status in idealstates
I checked the external view and the server is missing in the list. I didn't check the ideal state. It's happening every few hours for us. Will check next time and provide details.
@Xiang Fu @Mayank: Please zoom into this log. I feel this should not happen.
Copy code
2021/03/31 07:31:39.825 INFO [LLRealtimeSegmentDataManager_spanEventView__27__2251__20210331T0731Z] [HelixTaskExecutor-message_handle_thread] RealtimeDataResourceZKMetadata contains no information about sorted column for segment spanEventView__27__2251__20210331T0731Z
This is the last log line from the segment consumption thread before it dies.
m
Reading the thread
Are there other log files created (e.g. hs_err.log) in the log directory?
l
No @Mayank. The JVM didn't crash. Only a few consumption threads stopped.
m
Is this a fair summary:
Copy code
1. Once in a while some servers stop consuming.
2. Segment is in consuming in ideal-state
3. Server is missing from external view.
4. Logs seem to suggest that an attempt to commit the segment may have caused this?
l
Made it a bit verbose to be clear. Sorry.
Copy code
1. Once in a while, some servers stop consuming some of the assigned partitions (say 2 out of 16 partitions of a table) from Kafka.
2. Segment is in consuming in ideal-state (DON'T KNOW. WILL PROVIDE THESE DETAILS WHEN IT HAPPENS AGAIN)
3. Server is missing from external view ONLY for the segments where the consumption thread dies.
4. Logs seem to suggest that an attempt to commit an OLD segment and start a NEW segment may have caused this.
External view of the segment where the consumption thread died on pinot-server-2:
Copy code
"spanEventView__27__2251__20210331T0731Z" : {
         "Server_pinot-server-1.pinot-server.traceable.svc.cluster.local_8098" : "CONSUMING"
}
External view of a segment where the consumption thread is running fine on pinot-server-2:
Copy code
"spanEventView__13__2255__20210331T0620Z" : {
         "Server_pinot-server-1.pinot-server.traceable.svc.cluster.local_8098" : "CONSUMING",
         "Server_pinot-server-2.pinot-server.traceable.svc.cluster.local_8098" : "CONSUMING"
}
x
oh, so it's not that all segments failed
Then when the segment on server-1 is sealed and goes online, server-2 should download it from the controller
l
Nope. That does not happen. This gets fixed only after a restart of pinot-server-2. If we do not restart, we slowly lose partitions and eventually the consuming-partition count on all servers comes down to zero.
2. Segment is in consuming in ideal-state (DON’T KNOW. WILL PROVIDE THESE DETAILS WHEN IT HAPPENS AGAIN)
Yes. All segments are in CONSUMING in the ideal state, but they are not in the external view. cc: @Mayank @Xiang Fu
j
@Laxman Ch How about the EV of the committed segment? E.g. if spanEventView__13__2255__* is the new consuming segment, what's the state of spanEventView__13__2254__*?
Also, how much resource is allocated to each Pinot server? I feel the server might be stuck at the segment creation step.
l
Found the root cause. This is possibly due to a bug in Groovy. From the thread dumps, we see all message handler threads slowly going into the following state and getting stuck there.
Copy code
"HelixTaskExecutor-message_handle_thread" #51 daemon prio=5 os_prio=0 cpu=70457.28ms elapsed=4885.80s tid=0x00007fe4e43d6000 nid=0x6e waiting for monitor entry  [0x00007fe4aa6e5000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.codehaus.groovy.reflection.ClassInfo$GlobalClassSet.add(ClassInfo.java:477)
	- waiting to lock <0x0000000702ed2218> (a org.codehaus.groovy.util.ManagedLinkedList)
	at org.codehaus.groovy.reflection.ClassInfo$1.computeValue(ClassInfo.java:83)
	at org.codehaus.groovy.reflection.ClassInfo$1.computeValue(ClassInfo.java:79)
	at org.codehaus.groovy.reflection.GroovyClassValuePreJava7$EntryWithValue.<init>(GroovyClassValuePreJava7.java:37)
	at org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Segment.createEntry(GroovyClassValuePreJava7.java:64)
	at org.codehaus.groovy.reflection.GroovyClassValuePreJava7$GroovyClassValuePreJava7Segment.createEntry(GroovyClassValuePreJava7.java:55)
	at org.codehaus.groovy.util.AbstractConcurrentMap$Segment.put(AbstractConcurrentMap.java:120)
	at org.codehaus.groovy.util.AbstractConcurrentMap$Segment.getOrPut(AbstractConcurrentMap.java:100)
	at org.codehaus.groovy.util.AbstractConcurrentMap.getOrPut(AbstractConcurrentMap.java:38)
	at org.codehaus.groovy.reflection.GroovyClassValuePreJava7.get(GroovyClassValuePreJava7.java:94)
	at org.codehaus.groovy.reflection.ClassInfo.getClassInfo(ClassInfo.java:144)
	at org.codehaus.groovy.runtime.metaclass.MetaClassRegistryImpl.getMetaClass(MetaClassRegistryImpl.java:258)
	at org.codehaus.groovy.runtime.InvokerHelper.getMetaClass(InvokerHelper.java:883)
	at groovy.lang.GroovyObjectSupport.<init>(GroovyObjectSupport.java:34)
	at groovy.lang.Script.<init>(Script.java:42)
	at Script1.<init>(Script1.groovy)
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(java.base@11.0.10/Native Method)
	at jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(java.base@11.0.10/NativeConstructorAccessorImpl.java:62)
	at jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(java.base@11.0.10/DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(java.base@11.0.10/Constructor.java:490)
	at org.codehaus.groovy.runtime.InvokerHelper.createScript(InvokerHelper.java:431)
	at groovy.lang.GroovyShell.parse(GroovyShell.java:700)
	at groovy.lang.GroovyShell.parse(GroovyShell.java:736)
	at groovy.lang.GroovyShell.parse(GroovyShell.java:727)
	at org.apache.pinot.core.data.function.GroovyFunctionEvaluator.<init>(GroovyFunctionEvaluator.java:73)
	at org.apache.pinot.core.data.function.FunctionEvaluatorFactory.getExpressionEvaluator(FunctionEvaluatorFactory.java:91)
	at org.apache.pinot.core.data.function.FunctionEvaluatorFactory.getExpressionEvaluator(FunctionEvaluatorFactory.java:78)
	at org.apache.pinot.core.util.IngestionUtils.extractFieldsFromSchema(IngestionUtils.java:61)
	at org.apache.pinot.core.util.IngestionUtils.getFieldsForRecordExtractor(IngestionUtils.java:50)
	at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.<init>(LLRealtimeSegmentDataManager.java:1186)
	at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:314)
	at org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addRealtimeSegment(HelixInstanceDataManager.java:133)
	at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:164)
	at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeConsumingFromOffline(SegmentOnlineOfflineStateModelFactory.java:88)
	at jdk.internal.reflect.GeneratedMethodAccessor723.invoke(Unknown Source)
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.10/DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(java.base@11.0.10/Method.java:566)
	at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404)
	at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331)
	- locked <0x000000072defa200> (a org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel)
	at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97)
	at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49)
	at java.util.concurrent.FutureTask.run(java.base@11.0.10/FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@11.0.10/ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@11.0.10/ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(java.base@11.0.10/Thread.java:834)
Looks like this is the bug we are hitting in Pinot 0.6.0 (Groovy 2.4.8). We are trying the following:
• The workaround mentioned in GROOVY-8067
• Upgrading the Groovy library to 2.4.9
Will provide an update soon after today's test.
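(For illustration, a minimal reproduction sketch, assuming Groovy 2.4.8 on the classpath as shipped with Pinot 0.6.0. It exercises the same GroovyShell.parse path that the Helix message handlers go through via GroovyFunctionEvaluator; it will not necessarily block on every JDK/Groovy combination, but on affected ones threads pile up on the global lock in org.codehaus.groovy.reflection.ClassInfo, as in the thread dump above.)

```java
import groovy.lang.GroovyShell;

public class GroovyParseContention {
  public static void main(String[] args) throws InterruptedException {
    int numThreads = 16;
    Thread[] threads = new Thread[numThreads];
    for (int i = 0; i < numThreads; i++) {
      threads[i] = new Thread(() -> {
        for (int iteration = 0; iteration < 10_000; iteration++) {
          // Same call path as GroovyFunctionEvaluator: parse a tiny script into a Script class.
          new GroovyShell().parse("arg0 + arg1");
        }
      }, "fake-message_handle_thread-" + i);
      threads[i].start();
    }
    for (Thread thread : threads) {
      thread.join();
    }
    System.out.println("All parse loops finished without blocking indefinitely.");
  }
}
```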
x
thanks!
please file an issue or submit a PR 😛
l
Sure @Xiang Fu. I see the Groovy lib is already updated in 0.7. Will validate this on 0.7.1 as well.
For anyone else experiencing the same problem, watch out for the following combination in your setup:
• Pinot server thread dumps show threads stuck in the Groovy library
• You are using the map type in your schema (columns with __KEYS and __VALUES)
• You are using Pinot 0.6
We are running with the -Dgroovy.use.classvalue=true JVM argument as suggested in GROOVY-8067. The workaround seems to be working fine; no occurrence in the last 12 hours (the incident used to occur every 2-3 hours earlier). We will continue to use this as a workaround and upgrade to 0.7.1 as needed. @Xiang Fu @Mayank: fyi
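(For completeness, a small hypothetical sanity check, not part of Pinot, that can be run in the same JVM to confirm the GROOVY-8067 workaround flag was actually passed to the process.)

```java
public final class GroovyClassValueFlagCheck {
  public static void main(String[] args) {
    // Reads the system property set via -Dgroovy.use.classvalue=true on the server JVM.
    String value = System.getProperty("groovy.use.classvalue");
    System.out.println("groovy.use.classvalue = " + value);
    if (!"true".equals(value)) {
      System.out.println("WARNING: expected -Dgroovy.use.classvalue=true to be set on this JVM");
    }
  }
}
```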
m
Thanks @Laxman Ch for confirming
@Xiang Fu Is there a way to retroactively update the 0.6 release notes, or document it in a way that other users on that release also get notified?
x
we can update the release notes by adding a section on known issues and workarounds
s
@Laxman Ch this article suggests that the workaround is not recommended as it could lead to memory leaks - https://github.com/fabric8-services/fabric8-tenant-jenkins/issues/77
l
@Suraj: Verified the details of the above bug.
workaround is not recommended as it could lead to memory leaks
• GROOVY-7591 is fixed in 2.4.5. We are on 2.4.8.
• This is actually due to a JDK bug that got fixed in JDK 9.
x
hmm, have you tried the JDK 11 image?
r
l
• We use JDK 11 as Ravi mentioned.
• Verified the heap usage. There is no leak observed in the last 20 hours under production load with the above Groovy flag enabled.
Pinot server heap usage looks stable for 1 day after enabling the above-mentioned groovy.use.classvalue flag. We don't see any signs of a memory leak.
x
I see, so is there anything we need to fix from the Pinot side?
l
No fix is needed in 0.7 as the Groovy version is already upgraded there. If there are plans to maintain the 0.6 branch, we can fix it by upgrading Groovy. Not sure if we maintain/support old releases in the Pinot project.
m
We should at least document this on the 0.6 page, @Xiang Fu?
x
Or maybe FAQ
m
@Laxman Ch Would you be able to add it (since you have the most details)?
l
I can do that.
m
thank you