# troubleshooting
n
Hi, until yesterday everything was fine and I was able to ingest data into a Realtime table. However, today I started facing an issue where the existing tables had some corrupted segments showing as BAD. Also, tables created with pre-defined segments are showing in a BAD state. I followed the troubleshooting steps mentioned in previous threads, including reloading the segments, resetting the segments, deleting the segments, rebalancing, etc. None of them restarted the ingestion process. I also deleted the entire cluster and reconfigured everything, but I am still facing the same issue even after creating the table successfully. Server Exception:
Caught exception in state transition from OFFLINE -> ONLINE for resource: caseData_REALTIME,
Controller:
Reading segments debug info from servers: [Server_pinot-server-0.pinot-server-headless.pinot-quickstart.svc.cluster.local_8098] for table: caseData_REALTIME
Server: Server_pinot-server-0.pinot-server-headless.pinot-quickstart.svc.cluster.local_8098 returned error: 404
m
What’s the exception on the server side? Also, are there enough resources (storage, CPU/memory)?
n
@Mayank Here is the server exception for every segment:
Caught exception in state transition from OFFLINE -> ONLINE for resource: pinotSourceTable_REALTIME, partition: pinotSourceTable__2__0__20220928T0111Z
java.lang.UnsupportedOperationException: null
        at org.apache.pinot.plugin.stream.kinesis.KinesisStreamMetadataProvider.fetchStreamPartitionOffset(KinesisStreamMetadataProvider.java:80) ~[pinot-kinesis-0.12.0-SNAPSHOT-shaded.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.fetchLatestStreamOffset(LLRealtimeSegmentDataManager.java:1435) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.<init>(LLRealtimeSegmentDataManager.java:1393) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:362) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addRealtimeSegment(HelixInstanceDataManager.java:175) ~[pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:165) [pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeConsumingFromOffline(SegmentOnlineOfflineStateModelFactory.java:87) [pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:?]
        at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:?]
        at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
        at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
        at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:350) [pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:278) [pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.12.0-SNAPSHOT-jar-with-dependencies.jar:0.12.0-SNAPSHOT-83b7f157f77c07675d7760569a796199a41b5555]
        at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]
Also, when I use release 0.10.0 or 0.11.0, it works as expected. But when I create the cluster with the latest image tag, it throws this error.
n
@Nagendra Gautham Gondi since it is happening in the latest image tag, can you please share a few lines before this exception on the same thread? I wonder if it has to do with a refactoring PR that got merged yesterday.
m
Thanks @Navina for following up
👍 1
n
Untitled.txt
Hi @Navina I’ve attached the server log above.
n
@Nagendra Gautham Gondi I am able to reproduce this with the Kinesis quickstart. I'm still debugging to see why, though; I thought I had added a fix for Kinesis too. I will look into this tomorrow and give you an update.
OK, I found the issue. Prior to my change, all exceptions were caught and `fetchLatestStreamOffset` would return `null`. With my change, I ended up handling only `TimeoutException`. I will create a fix for this in my AM time (IST).
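To illustrate the difference described above, here is a minimal Java sketch. It is not Pinot's code and not the content of the fix: `OffsetFetchSketch`, `OffsetProvider`, `UnsupportedProvider`, `fetchCatchAll`, and `fetchTimeoutOnly` are hypothetical names made up for the example. It only assumes what the thread and the stack trace state: a provider whose offset fetch throws `UnsupportedOperationException`, a caller that used to swallow every exception and return `null`, and a newer caller that handles only `TimeoutException`.

```java
import java.util.concurrent.TimeoutException;

public class OffsetFetchSketch {

    /** Hypothetical stand-in for a stream metadata provider; not Pinot's real interface. */
    interface OffsetProvider {
        Long fetchOffset(long timeoutMillis) throws TimeoutException;
    }

    /** Mimics a provider that does not support offset fetching, as in the stack trace above. */
    static class UnsupportedProvider implements OffsetProvider {
        @Override
        public Long fetchOffset(long timeoutMillis) {
            throw new UnsupportedOperationException();
        }
    }

    /** Behavior described as "prior to my change": any exception is turned into null. */
    static Long fetchCatchAll(OffsetProvider provider) {
        try {
            return provider.fetchOffset(5000L);
        } catch (Exception e) {
            return null;
        }
    }

    /** Behavior described as "with my change": only TimeoutException is handled. */
    static Long fetchTimeoutOnly(OffsetProvider provider) {
        try {
            return provider.fetchOffset(5000L);
        } catch (TimeoutException e) {
            return null;
        }
        // Any other exception, including UnsupportedOperationException, escapes to the caller.
    }

    public static void main(String[] args) {
        OffsetProvider provider = new UnsupportedProvider();

        // The catch-all variant hides the unsupported operation behind a null offset.
        System.out.println("catch-all result: " + fetchCatchAll(provider));

        // The timeout-only variant lets the exception propagate up the call stack.
        try {
            fetchTimeoutOnly(provider);
        } catch (UnsupportedOperationException e) {
            System.out.println("timeout-only variant propagated: " + e);
        }
    }
}
```

In the sketch, the second call is the analogue of what the server log shows: the exception escapes the offset fetch and surfaces as the failed OFFLINE -> ONLINE state transition.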
n
Awesome, thanks Navina! I very much appreciate it. Please keep me posted and I’ll continue exploring.
👍 1
n
@Nagendra Gautham Gondi this should fix your problem -> https://github.com/apache/pinot/pull/9481/files ty!
n
Great, I will watch out for this to be pushed into master before I re-create my setup.
👍 1