# troubleshooting
b
Hi everyone, we have a couple of segments with no replicas, and they're not coming online at all. How can that be fixed? I tried reloading those segments through the REST API, but that didn't seem to work.
m
Are they in error state?
b
Yes.
m
Restart the server and see if they come online. If they are still in the error state, the server should log why.
b
Restarted the server, but that didn't fix it. Let me check the logs.
2020/09/25 00:07:12.246 ERROR [HelixStateTransitionHandler] [HelixTaskExecutor-message_handle_thread] Exception while executing a state transition task spanEventView__5__2157__20200924T0934Z
java.lang.reflect.InvocationTargetException: null
	at sun.reflect.GeneratedMethodAccessor61.invoke(Unknown Source) ~[?:?]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_265]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_265]
	at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_265]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
--
Caused by: java.lang.RuntimeException: Attempt to re-create an existing index for key: start_time_millis.range_index, for segmentDirectory: /var/pinot/server/data/index/spanEventView_REALTIME/spanEventView__5__2157__20200924T0934Z/v3
	at org.apache.pinot.core.segment.store.SingleFileIndexDirectory.checkKeyNotPresent(SingleFileIndexDirectory.java:178) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.pinot.core.segment.store.SingleFileIndexDirectory.allocNewBufferInternal(SingleFileIndexDirectory.java:151) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.pinot.core.segment.store.SingleFileIndexDirectory.newBuffer(SingleFileIndexDirectory.java:106) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.pinot.core.segment.store.SegmentLocalFSDirectory$Writer.getNewIndexBuffer(SegmentLocalFSDirectory.java:335) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.pinot.core.segment.store.SegmentLocalFSDirectory$Writer.newIndexFor(SegmentLocalFSDirectory.java:320) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.pinot.core.segment.index.loader.LoaderUtils.writeIndexToV3Format(LoaderUtils.java:57) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.pinot.core.segment.index.loader.invertedindex.RangeIndexHandler.createRangeIndexForColumn(RangeIndexHandler.java:109) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.pinot.core.segment.index.loader.invertedindex.RangeIndexHandler.createRangeIndices(RangeIndexHandler.java:74) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.pinot.core.segment.index.loader.SegmentPreProcessor.process(SegmentPreProcessor.java:108) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
	at org.apache.pinot.core.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:99) ~[pinot-all-0.5.0-jar-with-dependencies.jar:0.5.0-d87bbc9032c6efe626eb5f9ef1db4de7aa067179]
You're right, this exception is in the server log.
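For long traces like the one above, the useful part is usually the "Caused by:" line rather than the reflection frames on top. A minimal sketch of pulling those lines out of a pasted log, assuming the standard Java stack trace layout; the function name `root_causes` is made up for illustration and is not part of Pinot or Helix:

```python
import re

# Matches Java "Caused by: <exception class>: <message>" lines.
CAUSED_BY = re.compile(r"^Caused by: (?P<exc>[\w.$]+): (?P<msg>.*)$")


def root_causes(log_text):
    """Return (exception class, message) for each 'Caused by:' line."""
    causes = []
    for line in log_text.splitlines():
        match = CAUSED_BY.match(line.strip())
        if match:
            causes.append((match.group("exc"), match.group("msg")))
    return causes


# Excerpt of the trace pasted in this thread.
sample = """java.lang.reflect.InvocationTargetException: null
\tat java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_265]
--
Caused by: java.lang.RuntimeException: Attempt to re-create an existing index for key: start_time_millis.range_index, for segmentDirectory: /var/pinot/server/data/index/spanEventView_REALTIME/spanEventView__5__2157__20200924T0934Z/v3
"""

for exc, msg in root_causes(sample):
    print(exc, "->", msg)
```

Here the only cause reported is the RuntimeException about re-creating the start_time_millis.range_index entry.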
j
@Buchi Reddy Can you please check the index_map file for the errored segment? (under /var/pinot/server/data/index/spanEventView_REALTIME/spanEventView__5__2157__20200924T0934Z/v3)
x
Was this segment good before? If so, you can try deleting it on the Pinot server and then reloading it, to see if it gets downloaded from the deep store.
m
Seems like re-creating the range index fails.
b
@Xiang Fu removing the segment from the server and reloading didn't work. Here are the logs on the server.
2020/09/25 07:15:24.897 INFO [spanEventView_REALTIME-SegmentReloadMessageHandler] [HelixTaskExecutor-message_handle_thread] Waiting for lock to refresh : spanEventView__5__2157__20200924T0934Z, queue-length: 0
2020/09/25 07:15:24.897 INFO [spanEventView_REALTIME-SegmentReloadMessageHandler] [HelixTaskExecutor-message_handle_thread] Acquired lock to refresh segment: spanEventView__5__2157__20200924T0934Z (lock-time=0ms, queue-length=0)
2020/09/25 07:15:24.897 INFO [HelixInstanceDataManager] [HelixTaskExecutor-message_handle_thread] Reloading single segment: spanEventView__5__2157__20200924T0934Z in table: spanEventView_REALTIME
2020/09/25 07:15:24.897 INFO [HelixInstanceDataManager] [HelixTaskExecutor-message_handle_thread] Segment metadata is null. Skip reloading segment: spanEventView__5__2157__20200924T0934Z in table: spanEventView_REALTIME
2020/09/25 07:15:24.897 INFO [HelixTask] [HelixTaskExecutor-message_handle_thread] Message 827a6d91-831b-4a27-acfe-88d62f3a0ed2 completed.
@Jackie here are the files in that segment dir.
root@pinot-server-0:/var/pinot/server/data/index/spanEventView_REALTIME/spanEventView__5__2157__20200924T0934Z/v3# ls -latr
total 518140
drwxrwsr-x 2 root 1337      4096 Sep 24 10:14 .
-rw-rw-r-- 1 root 1337        16 Sep 24 10:14 creation.meta
-rw-rw-r-- 1 root 1337     39508 Sep 24 10:14 metadata.properties
-rw-rw-r-- 1 root 1337      9387 Sep 24 23:53 index_map
-rw-rw-r-- 1 root 1337 530504277 Sep 24 23:53 columns.psf
drwxrwsr-x 3 root 1337      4096 Sep 25 00:07 ..
Don't see anything off there. The index_map file even has the range index start offset and size.
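A quick way to confirm what index_map already records for a given key is to filter its entries. This is a minimal sketch, assuming the file is a properties-style list of `<column>.<index_type>.startOffset = N` and `<column>.<index_type>.size = N` lines; the helper name `index_map_entries` and the numbers in the sample are made up for illustration and are not taken from the real segment:

```python
def index_map_entries(text, key):
    """Collect the properties recorded under `key.` in index_map-style text."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a key = value pair.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        if name.startswith(key + "."):
            entries[name[len(key) + 1:]] = value
    return entries


# Hypothetical file content; offsets and sizes are invented.
sample = """
start_time_millis.forward_index.startOffset = 8
start_time_millis.forward_index.size = 1024
start_time_millis.range_index.startOffset = 1032
start_time_millis.range_index.size = 256
"""

print(index_map_entries(sample, "start_time_millis.range_index"))
```

A non-empty result for start_time_millis.range_index would be consistent with the "Attempt to re-create an existing index" exception: the entry is already there when the loader tries to add it again.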
x
Hmm, can you try deleting the segment and restarting the server?
b
Yes, I can try. Why does it say the metadata is null? Where is it looking for the metadata?
x
I feel that the reload task assumes the segment is already present on the server. It will only download the segment from the deep store for a newly added segment or at startup.
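For reference, the single-segment reload tried earlier in this thread goes through the controller REST API. A minimal sketch of building that request, assuming the controller listens on http://localhost:9000 and exposes `POST /segments/{tableName}/{segmentName}/reload`; both the address and the exact path are assumptions to check against the controller's Swagger page for your Pinot version, and `reload_request` is a made-up helper name:

```python
from urllib.parse import quote
from urllib.request import Request


def reload_request(controller_url, table, segment):
    """Build (but do not send) a POST request that asks for a segment reload."""
    path = "/segments/{}/{}/reload".format(
        quote(table, safe=""), quote(segment, safe="")
    )
    return Request(controller_url.rstrip("/") + path, method="POST")


req = reload_request(
    "http://localhost:9000",  # assumed controller address
    "spanEventView_REALTIME",
    "spanEventView__5__2157__20200924T0934Z",
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

As the server log above shows, a reload on a segment whose local copy was removed is skipped ("Segment metadata is null"), so this call alone would not be expected to bring the segment back.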