# general
p
Hello, I've had to downgrade Pinot from a 0.8.0 snapshot version to 0.7.1 (I needed some features from 0.8.0, but due to shifting needs was forced back to 0.7.1). I deleted my old table and am currently re-ingesting into the 0.7.1 equivalent. However, the UI is extremely slow and returns the following error when I try to query the table:
[
  {
    "errorCode": 410,
    "message": "BrokerResourceMissingError"
  }
]
The broker logs show this exception:
2021/06/17 16:47:13.084 WARN [BaseInstanceSelector] [ClusterChangeHandlingThread] Failed to find servers hosting segment: HitExecutionView__10__19__20210616T1108Z for table: HitExecutionView_REALTIME (all ONLINE/CONSUMING instances: [] and OFFLINE instances: [] are disabled, counting segment as unavailable)
2021/06/17 17:05:18.183 ERROR [BrokerResourceOnlineOfflineStateModelFactory] [HelixTaskExecutor-message_handle_thread] Caught exception while processing transition from OFFLINE to ONLINE for table: hitexecutionview_REALTIME
java.lang.IllegalStateException: Failed to find ideal state for table: hitexecutionview_REALTIME
	at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.broker.routing.RoutingManager.buildRouting(RoutingManager.java:309) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.broker.broker.helix.BrokerResourceOnlineOfflineStateModelFactory$BrokerResourceOnlineOfflineStateModel.onBecomeOnlineFromOffline(BrokerResourceOnlineOfflineStateModelFactory.java:80) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_282]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_282]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_282]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282]
	at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_282]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_282]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_282]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
2021/06/17 17:05:18.283 ERROR [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] Exception while executing a state transition task hitexecutionview_REALTIME
java.lang.reflect.InvocationTargetException: null
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_282]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_282]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_282]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_282]
	at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_282]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_282]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_282]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
Caused by: java.lang.IllegalStateException: Failed to find ideal state for table: hitexecutionview_REALTIME
	at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.broker.routing.RoutingManager.buildRouting(RoutingManager.java:309) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	at org.apache.pinot.broker.broker.helix.BrokerResourceOnlineOfflineStateModelFactory$BrokerResourceOnlineOfflineStateModel.onBecomeOnlineFromOffline(BrokerResourceOnlineOfflineStateModelFactory.java:80) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
	... 12 more
2021/06/17 17:05:18.301 ERROR [StateModel] [HelixTaskExecutor-message_handle_thread] Default rollback method invoked on error. Error Code: ERROR
2021/06/17 17:05:18.383 ERROR [HelixTask] [HelixTaskExecutor-message_handle_thread] Message execution failed. msgId: 366af265-24b4-4b59-a28f-1e6387e5d2aa, errorMsg: java.lang.reflect.InvocationTargetException
2021/06/17 17:05:18.392 ERROR [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] Skip internal error. errCode: ERROR, errMsg: null
2021/06/17 17:06:50.697 WARN [RoutingManager] [HelixTaskExecutor-message_handle_thread] Routing does not exist for table: hitexecutionview_REALTIME, skipping refreshing segment
2021/06/17 17:06:51.179 WARN [RoutingManager] [HelixTaskExecutor-message_handle_thread] Routing does not exist for table: hitexecutionview_REALTIME, skipping refreshing segment
2021/06/17 17:07:22.699 WARN [RoutingManager] [HelixTaskExecutor-message_handle_thread] Routing does not exist for table: hitexecutionview_REALTIME, skipping refreshing segment
2021/06/17 17:07:23.997 WARN [RoutingManager] [HelixTaskExecutor-message_handle_thread] Routing does not exist for table: hitexecutionview_REALTIME, skipping refreshing segment
2021/06/17 17:07:25.800 WARN [RoutingManager] [HelixTaskExecutor-message_handle_thread] Routing does not exist for table: hitexecutionview_REALTIME, skipping refreshing segment
Any ideas?
m
The log suggests there isn't an ideal-state for the table. Can you confirm? If so, it likely means the table was not created.
p
Where can I confirm? Zookeeper browser?
m
yes
p
I see an ideal state for the table
This new table has the same name as the older table but lowercased. Perhaps that is the issue?
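The broker log above does mention both spellings (HitExecutionView_REALTIME in the WARN line, hitexecutionview_REALTIME in the exception). A minimal sketch for spotting such a clash, assuming you have already collected the resource names yourself, for example from the children of the IDEALSTATES and EXTERNALVIEW nodes in the ZooKeeper browser. The function name find_case_collisions is hypothetical and not part of Pinot:

```python
from collections import defaultdict


def find_case_collisions(resource_names):
    """Group resource names that differ only by letter case.

    Returns a dict mapping the lowercased name to the sorted list of
    distinct spellings, keeping only names with more than one spelling.
    """
    groups = defaultdict(set)
    for name in resource_names:
        groups[name.lower()].add(name)
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}


# The two spellings that appear in the broker log in this thread.
names = ["HitExecutionView_REALTIME", "hitexecutionview_REALTIME"]
collisions = find_case_collisions(names)
print(collisions)
```

If the result is non-empty, leftovers from the old table are probably still present under the other spelling.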
m
and it has all segments ONLINE/CONSUMING?
p
yes
m
Yes, that explains the BrokerResourceMissing, as in broker did not find the table
p
How can I fix this? It seems data was not correctly or completely deleted when I deleted the old table.
m
I think when you delete a RT table, you have to wait for its ideal-state and external view to be completely gone from the ZK browser before recreating it. Can you try again?
It should be documented in the docs, if it isn't already (I remember seeing it though)
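The "wait until both are gone" step can be sketched as a small polling loop. This is only an illustration under assumptions: the exists callable is supplied by you (for instance a thin wrapper around a ZooKeeper client's exists call), the cluster name PinotCluster is a placeholder, and the helper names wait_until_gone and helix_paths are hypothetical. The IDEALSTATES and EXTERNALVIEW paths follow the usual Helix layout under the cluster root:

```python
import time


def helix_paths(cluster, resource):
    """ZooKeeper paths Helix uses for a resource's ideal state and external view."""
    return [
        f"/{cluster}/IDEALSTATES/{resource}",
        f"/{cluster}/EXTERNALVIEW/{resource}",
    ]


def wait_until_gone(exists, paths, timeout_s=300.0, poll_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Block until exists(path) is falsy for every path, else raise TimeoutError.

    `exists` is any callable taking a path; it is an assumption of this
    sketch and must be wired to a real ZooKeeper client by the caller.
    """
    deadline = clock() + timeout_s
    remaining = list(paths)
    while True:
        # Keep only the paths that are still present.
        remaining = [p for p in remaining if exists(p)]
        if not remaining:
            return
        if clock() >= deadline:
            raise TimeoutError(f"still present: {remaining}")
        sleep(poll_s)


# "PinotCluster" is a placeholder cluster name; the table name is from this thread.
print(helix_paths("PinotCluster", "hitexecutionview_REALTIME"))
```

Only recreate the table after wait_until_gone returns without raising.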
p
of course
That does not seem to work; this is starting to look like a degraded filesystem.
In the zookeeper browser in the server instances I see the following errors per partition:
{
  "id": "103988a1c17001e__HitExecutionView_REALTIME",
  "simpleFields": {},
  "mapFields": {
    "HELIX_ERROR     20210617-164711.000407 STATE_TRANSITION 86863a54-a4a6-4ea3-ace3-49b20fdb44e1": {
      "AdditionalInfo": "Exception while executing a state transition task HitExecutionView__0__16__20210615T1105Zjava.lang.reflect.InvocationTargetException\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404)\n\tat org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331)\n\tat org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97)\n\tat org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: java.lang.IllegalStateException: Unsupported json index version: 2\n\tat shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:493)\n\tat org.apache.pinot.core.segment.index.readers.json.ImmutableJsonIndexReader.<init>(ImmutableJsonIndexReader.java:56)\n\tat org.apache.pinot.core.segment.index.column.PhysicalColumnIndexContainer.<init>(PhysicalColumnIndexContainer.java:114)\n\tat org.apache.pinot.core.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:115)\n\tat org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:283)\n\tat org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addRealtimeSegment(HelixInstanceDataManager.java:138)\n\tat org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:164)\n\t... 12 more\n",
      "Class": "class org.apache.helix.messaging.handling.HelixStateTransitionHandler",
      "MSG_ID": "f73f44ab-62fe-4f24-86af-8e44ae873a33",
      "Message state": "READ"
    },
    "HELIX_ERROR     20210617-164711.000434 STATE_TRANSITION 254788e1-6826-4321-a3ae-3ecb978ca3ae": {
      "AdditionalInfo": "Message execution failed. msgId: f73f44ab-62fe-4f24-86af-8e44ae873a33, errorMsg: java.lang.reflect.InvocationTargetException",
      "Class": "class org.apache.helix.messaging.handling.HelixStateTransitionHandler",
      "MSG_ID": "f73f44ab-62fe-4f24-86af-8e44ae873a33",
      "Message state": "READ"
    }
  },
  "listFields": {}
}
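The useful part of an error node like this is usually the last "Caused by:" line buried in AdditionalInfo. A minimal sketch for pulling it out, assuming the node has been copied out of the ZooKeeper browser as a JSON string. The function name root_causes is hypothetical, and the sample below is an abbreviated stand-in for the node above (shortened keys and stack trace), not a verbatim copy:

```python
import json


def root_causes(error_node_json):
    """Return {error key: text of the last 'Caused by:' line} per error entry.

    Entries whose AdditionalInfo has no 'Caused by:' line are skipped.
    """
    node = json.loads(error_node_json)
    causes = {}
    for key, fields in node.get("mapFields", {}).items():
        info = fields.get("AdditionalInfo", "")
        caused = [
            line.strip()
            for line in info.splitlines()
            if line.strip().startswith("Caused by:")
        ]
        if caused:
            # The last 'Caused by:' is the innermost (root) cause.
            causes[key] = caused[-1][len("Caused by:"):].strip()
    return causes


# Abbreviated sample shaped like the error node above.
sample = json.dumps({
    "id": "103988a1c17001e__HitExecutionView_REALTIME",
    "simpleFields": {},
    "mapFields": {
        "HELIX_ERROR 1": {
            "AdditionalInfo": (
                "Exception while executing a state transition task\n"
                "\tat java.lang.Thread.run(Thread.java:748)\n"
                "Caused by: java.lang.IllegalStateException: "
                "Unsupported json index version: 2\n"
                "\t... 12 more\n"
            ),
        },
        "HELIX_ERROR 2": {"AdditionalInfo": "Message execution failed."},
    },
    "listFields": {},
})
print(root_causes(sample))
```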
m
Unsupported json index version: 2\n\tat shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:493)\n\tat org.apache.pinot.core.segment.index.readers.json.ImmutableJsonIndexReader.<init>
p
Where does that come from?
m
In your log above
p
I know, I mean what causes it in the pinot code path?
m
Seems like going back from 0.8.0 to 0.7.1 might be causing this.
This is when the code is trying to add a new realtime segment.
p
That is strange, I tested this downgrade change in lower environments without issue.
Perhaps segments were not online and consuming... 🤔
m
I think the delete + recreate and downgrade was not done cleanly and the cluster went into a weird state
p
Is there a growing list of 0.8.0 features I can take a look at?