# general
m
Hello, I am using a GCS bucket as my deep store, and I also have a RealtimeToOfflineSegmentsTask set up to convert realtime segments to offline segments. I would like to store only the offline segments in GCS and not the realtime segments, because reading realtime segments from GCS is causing an issue for some of my tables. Where can I find the configuration for storing only offline segments in GCS?
l
your segments should always be on disk; as I understand it, the deep store is just a backup. Are you using the open source Pinot?
m
Yes I am using the open source pinot
l
yeah so everything should be on disk first
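(For context: the deep store is configured on the controller, and as far as I know it is a cluster-level setting, not a per-table-type one, so there is no switch to keep only OFFLINE segments in GCS. A rough sketch of a GCS deep-store controller config, assuming the bundled GCS plugin and a `gs` scheme; the bucket, project, and key path are hypothetical:)

```
# controller.conf -- hedged sketch, not a verified config
controller.data.dir=gs://my-pinot-bucket/pinot-segments
pinot.controller.storage.factory.class.gs=org.apache.pinot.plugin.filesystem.GcsPinotFS
pinot.controller.storage.factory.gs.projectId=my-gcp-project
pinot.controller.storage.factory.gs.gcpKey=/path/to/key.json
pinot.controller.segment.fetcher.protocols=file,http,gs
pinot.controller.segment.fetcher.gs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
```

Servers keep their own local working copy of each segment under their instance data dir; the deep store copy is what they fall back to when the local copy is missing or invalid.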
m
Ah, OK. I ask because I keep seeing this exception on a Pinot server trying to convert an offline segment to a realtime segment.
This is what I see:
java.lang.RuntimeException: java.lang.IllegalStateException
    at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.replaceLLSegment(RealtimeTableDataManager.java:535) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.untarAndMoveSegment(RealtimeTableDataManager.java:483) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.downloadSegmentFromDeepStore(RealtimeTableDataManager.java:459) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.downloadAndReplaceSegment(RealtimeTableDataManager.java:432) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.addSegment(RealtimeTableDataManager.java:337) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addRealtimeSegment(HelixInstanceDataManager.java:170) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:164) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at jdk.internal.reflect.GeneratedMethodAccessor10.invoke(Unknown Source) ~[?:?]
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:?]
    at java.lang.reflect.Method.invoke(Method.java:566) ~[?:?]
    at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
    at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.lang.IllegalStateException
    at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:429) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.segment.local.segment.index.readers.forward.BaseChunkSVForwardIndexReader.<init>(BaseChunkSVForwardIndexReader.java:72) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.segment.local.segment.index.readers.forward.FixedByteChunkMVForwardIndexReader.<init>(FixedByteChunkMVForwardIndexReader.java:40) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.segment.local.segment.index.readers.DefaultIndexReaderProvider.newForwardIndexReader(DefaultIndexReaderProvider.java:104) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.segment.spi.index.IndexingOverrides$Default.newForwardIndexReader(IndexingOverrides.java:205) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.segment.local.segment.index.column.PhysicalColumnIndexContainer.<init>(PhysicalColumnIndexContainer.java:166) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:181) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:121) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:91) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    at org.apache.pinot.core.data.manager.realtime.RealtimeTableDataManager.replaceLLSegment(RealtimeTableDataManager.java:533) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-078c711d35769be2dc4e4b7e235e06744cf0bba7]
    ... 17 more
The segments for one particular table go BAD once they finish consuming and are committed.
l
you mean when an online segment is converted into an offline one?
I can't tell what's happening from that exception alone
m
Probably. Let me share some more of the exceptions I see on the Pinot server.
l
isolate the issue
m
I hope this sheds some more light.
What you said about the deep store being just a backup seems right, because when I exec into the servers I do see the segments there as well.
l
try removing that task and see if you still get that issue
has anyone seen this one before?
m
{
  "REALTIME": {
    "tableName": "transaction_record_new_REALTIME",
    "tableType": "REALTIME",
    "segmentsConfig": {
      "timeColumnName": "consensus_timestamp",
      "timeType": "NANOSECONDS",
      "replication": "1",
      "replicasPerPartition": "1",
      "schemaName": "poc_new"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "streamConfigs": {
        "streamType": "kafka",
        "stream.kafka.consumer.type": "simple",
        "stream.kafka.topic.name": "transaction_record_new",
        "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
        "stream.kafka.consumer.factory.class.name": "org.apache.pinot.core.realtime.impl.kafka2.KafkaConsumerFactory",
        "stream.kafka.hlc.zk.connect.string": "kafka-zookeeper:2181",
        "stream.kafka.zk.broker.url": "kafka-zookeeper:2181",
        "stream.kafka.broker.list": "kafka:9092",
        "stream.kafka.hlc.bootstrap.server": "kafka:9092",
        "realtime.segment.flush.threshold.time": "1d",
        "realtime.segment.flush.threshold.size": "200M",
        "realtime.segment.flush.threshold.rows": "0",
        "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
      },
      "noDictionaryColumns": [
        "transaction_id",
        "ids"
      ],
      "rangeIndexVersion": 2,
      "jsonIndexColumns": [
        "transfers_tokens_1",
        "transfers_hbar_1",
        "transfers_nft_1"
      ],
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "loadMode": "MMAP",
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": false,
      "nullHandlingEnabled": false
    },
    "metadata": {},
    "task": {
      "taskTypeConfigsMap": {
        "RealtimeToOfflineSegmentsTask": {
          "bucketTimePeriod": "2d",
          "bufferTimePeriod": "2d",
          "mergeType": "concat"
        }
      }
    },
    "ingestionConfig": {
      "transformConfigs": [
        {
          "columnName": "fields_1",
          "transformFunction": "JSONFORMAT(fields)"
        }
      ]
    },
    "isDimTable": false
  }
}
So this is my table config. Are you suggesting that I remove the RealtimeToOfflineSegmentsTask?
l
yeah
just to see if that’s the issue
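(Side note: removing the task amounts to deleting the `task` section from the table config and pushing the config back to the controller. A hedged sketch of the config edit; the host/port in the comment and the helper name are assumptions for illustration:)

```python
import json

def remove_task(table_config: dict, task_type: str) -> dict:
    """Return a copy of a Pinot table config with one task type removed.

    Drops `task_type` from task.taskTypeConfigsMap; if the map becomes
    empty, drops the whole `task` section.
    """
    cfg = json.loads(json.dumps(table_config))  # cheap deep copy
    task_map = cfg.get("task", {}).get("taskTypeConfigsMap", {})
    task_map.pop(task_type, None)
    if not task_map:
        cfg.pop("task", None)
    return cfg

# Minimal example mirroring the table config above.
cfg = {
    "tableName": "transaction_record_new_REALTIME",
    "task": {
        "taskTypeConfigsMap": {
            "RealtimeToOfflineSegmentsTask": {"bucketTimePeriod": "2d"}
        }
    },
}
cleaned = remove_task(cfg, "RealtimeToOfflineSegmentsTask")
# The cleaned config can then be pushed back, e.g.:
# curl -X PUT -H 'Content-Type: application/json' \
#   http://localhost:9000/tables/transaction_record_new -d @cleaned.json
```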
also, things are making it to your GS bucket successfully yes?
m
yes they are
@Luis Fernandez Getting rid of the task didn't really help
I still see the bad segments
l
what happens when you query?
m
This is what I see on the controller UI
l
Check the external view for the status of these segments
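(The external view can be fetched from the controller at `GET /tables/{tableName}/externalview`; it maps each segment to per-server states such as ONLINE, CONSUMING, OFFLINE, and ERROR. A small sketch that flags segments any server reports as ERROR; the exact response shape here is my assumption, and the segment/server names are made up:)

```python
def find_error_segments(external_view: dict) -> list:
    """Return (tableType, segment, server) triples where a replica is in ERROR."""
    bad = []
    for table_type, segments in external_view.items():  # "OFFLINE" / "REALTIME"
        for segment, replicas in (segments or {}).items():
            for server, state in replicas.items():
                if state == "ERROR":
                    bad.append((table_type, segment, server))
    return bad

# Example shaped like an externalview response (values are illustrative).
ev = {
    "REALTIME": {
        "transaction_record_new__0__0__20220728T0000Z": {
            "Server_pinot-server-0": "ONLINE"
        },
    },
    "OFFLINE": {
        "transaction_record_new_OFFLINE_0": {
            "Server_pinot-server-0": "ERROR"
        },
    },
}
print(find_error_segments(ev))
```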
m
ok thx
k
I feel like @Neha Pawar was addressing something like this (moving realtime segments to deep store), maybe she can chime in here…
l
do you see any errors on your controller?
m
Nothing on the controller specifically. This is something I constantly see on the server, and even in the ZooKeeper logs:
{
  "id": "10061a6ffc90539__transaction_record_new_OFFLINE",
  "simpleFields": {},
  "mapFields": {
    "HELIX_ERROR     20220728-140949.000307 STATE_TRANSITION cbe2ce03-e2f0-414c-8059-a9ff25b878f3": {
      "AdditionalInfo": "Exception while executing a state transition task transaction_record_new_OFFLINE_1568411631396440000_1568437595042231000_0java.lang.reflect.InvocationTargetException\n\tat jdk.internal.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)\n\tat java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.base/java.lang.reflect.Method.invoke(Method.java:566)\n\tat org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404)\n\tat org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331)\n\tat org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97)\n\tat org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49)\n\tat java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\n\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\n\tat java.base/java.lang.Thread.run(Thread.java:829)\nCaused by: java.lang.IllegalStateException\n\tat shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:429)\n\tat org.apache.pinot.segment.local.segment.index.readers.forward.BaseChunkSVForwardIndexReader.<init>(BaseChunkSVForwardIndexReader.java:72)\n\tat org.apache.pinot.segment.local.segment.index.readers.forward.FixedByteChunkMVForwardIndexReader.<init>(FixedByteChunkMVForwardIndexReader.java:40)\n\tat org.apache.pinot.segment.local.segment.index.readers.DefaultIndexReaderProvider.newForwardIndexReader(DefaultIndexReaderProvider.java:104)\n\tat org.apache.pinot.segment.spi.index.IndexingOverrides$Default.newForwardIndexReader(IndexingOverrides.java:205)\n\tat org.apache.pinot.segment.local.segment.index.column.PhysicalColumnIndexContainer.<init>(PhysicalColumnIndexContainer.java:166)\n\tat 
org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:181)\n\tat org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:121)\n\tat org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:91)\n\tat org.apache.pinot.core.data.manager.offline.OfflineTableDataManager.addSegment(OfflineTableDataManager.java:52)\n\tat org.apache.pinot.core.data.manager.BaseTableDataManager.addOrReplaceSegment(BaseTableDataManager.java:373)\n\tat org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addOrReplaceSegment(HelixInstanceDataManager.java:355)\n\tat org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:162)\n\t... 11 more\n",
      "Class": "class org.apache.helix.messaging.handling.HelixStateTransitionHandler",
      "MSG_ID": "69099261-a641-4e68-8965-bf607838c563",
      "Message state": "READ"
    },
    "HELIX_ERROR     20220728-140949.000386 STATE_TRANSITION bdcb3ab9-9721-4c80-a14c-4a58705a9962": {
      "AdditionalInfo": "Message execution failed. msgId: 69099261-a641-4e68-8965-bf607838c563, errorMsg: java.lang.reflect.InvocationTargetException",
      "Class": "class org.apache.helix.messaging.handling.HelixStateTransitionHandler",
      "MSG_ID": "69099261-a641-4e68-8965-bf607838c563",
      "Message state": "READ"
    }
  },
  "listFields": {}
}