
Harold Lim

03/25/2021, 8:55 PM
Hi, we have a Pinot cluster with around 100 realtime tables, and around 28 of them went into a bad state. We have 2 sets of servers (3 each) with different tags (e.g., realtime and offline). Our tables are configured (using tagOverrideConfig) so that once a consuming segment completes, it is moved immediately to servers with the offline tag. On the tables that went into a bad state, the UI still shows the segment assigned to the "realtime" server, even though we do see the segment complete and get uploaded to the deepstore. We also noticed that in ZooKeeper, under pinot -> instances -> server -> messages, there are lots of messages. Does that mean messages are not getting consumed by the server? I assume this is how the controller and servers communicate through Helix(?).
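For reference, a tagOverrideConfig like the one described lives under the "tenants" section of the Pinot table config and looks roughly like this (the tenant/tag names here are illustrative, not from this cluster):
```json
{
  "tableName": "comp0_horizontal",
  "tableType": "REALTIME",
  "tenants": {
    "server": "realtime",
    "tagOverrideConfig": {
      "realtimeConsuming": "realtime_REALTIME",
      "realtimeCompleted": "offline_OFFLINE"
    }
  }
}
```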

Xiang Fu

03/25/2021, 8:59 PM
@Neha Pawar

Neha Pawar

03/25/2021, 9:09 PM
When you use tagOverrideConfig, the consuming segment completes and still remains on the realtime servers. There's an hourly periodic task that moves the completed segments from the realtime-tagged to the offline-tagged servers.
so, it is normal to see some completed segments still on the realtime servers

Harold Lim

03/25/2021, 9:10 PM
I think the issue is that the tables are in a bad state
and the segment completed 12+ hours ago
For example, we started the workload at midnight. Segment 0_0 completed, then segment 0_1 started consuming (both on realtime server 0), and no new data is getting consumed.
I do see this message in pinot-controller (every hour): 2021/03/25 09:02:16.763 ERROR [SegmentRelocator] [restapi-multiget-thread-789] Relocation failed for table: comp0_horizontal_REALTIME

Neha Pawar

03/25/2021, 9:18 PM
And any stack trace with it? AFAIK, relocation will not happen if the segments are in an error state

Harold Lim

03/25/2021, 9:18 PM
External View in ZK:
{
  "id": "comp0_horizontal_REALTIME",
  "simpleFields": {
    "BATCH_MESSAGE_MODE": "false",
    "BUCKET_SIZE": "0",
    "IDEAL_STATE_MODE": "CUSTOMIZED",
    "INSTANCE_GROUP_TAG": "comp0_horizontal_REALTIME",
    "MAX_PARTITIONS_PER_INSTANCE": "1",
    "NUM_PARTITIONS": "2",
    "REBALANCE_MODE": "CUSTOMIZED",
    "REPLICAS": "1",
    "STATE_MODEL_DEF_REF": "SegmentOnlineOfflineStateModel",
    "STATE_MODEL_FACTORY_NAME": "DEFAULT"
  },
  "mapFields": {
    "comp0_horizontal__0__0__20210324T2221Z": {
      "Server_pinot-server-realtime-1.pinot-server-realtime-headless.svc.cluster.local_8098": "CONSUMING"
    }
  },
  "listFields": {}
}
Ideal state:
{
  "id": "comp0_horizontal_REALTIME",
  "simpleFields": {
    "BATCH_MESSAGE_MODE": "false",
    "IDEAL_STATE_MODE": "CUSTOMIZED",
    "INSTANCE_GROUP_TAG": "comp0_horizontal_REALTIME",
    "MAX_PARTITIONS_PER_INSTANCE": "1",
    "NUM_PARTITIONS": "2",
    "REBALANCE_MODE": "CUSTOMIZED",
    "REPLICAS": "1",
    "STATE_MODEL_DEF_REF": "SegmentOnlineOfflineStateModel",
    "STATE_MODEL_FACTORY_NAME": "DEFAULT"
  },
  "mapFields": {
    "comp0_horizontal__0__0__20210324T2221Z": {
      "Server_pinot-server-realtime-1.pinot-server-realtime-headless.svc.cluster.local_8098": "ONLINE"
    },
    "comp0_horizontal__0__1__20210325T0755Z": {
      "Server_pinot-server-realtime-1.pinot-server-realtime-headless.svc.cluster.local_8098": "CONSUMING"
    }
  },
  "listFields": {}
}
Stack trace before:
2021/03/25 09:02:16.763 WARN [TableRebalancer] [restapi-multiget-thread-789] Caught exception while waiting for ExternalView to converge for table: comp0_horizontal_REALTIME, aborting the rebalance
java.util.concurrent.TimeoutException: Timeout while waiting for ExternalView to converge
        at org.apache.pinot.controller.helix.core.rebalance.TableRebalancer.waitForExternalViewToConverge(TableRebalancer.java:504) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-27b61fe6a338b1363efb64a7fed87d95cc793f8a]
        at org.apache.pinot.controller.helix.core.rebalance.TableRebalancer.rebalance(TableRebalancer.java:351) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-27b61fe6a338b1363efb64a7fed87d95cc793f8a]
        at org.apache.pinot.controller.helix.core.relocation.SegmentRelocator.lambda$processTable$0(SegmentRelocator.java:96) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-27b61fe6a338b1363efb64a7fed87d95cc793f8a]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_282]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_282]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_282]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_282]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
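The disagreement between the two ZK records can be surfaced mechanically by diffing their mapFields. A minimal sketch (not Pinot tooling; the segment and server names are copied from the JSON pasted above):

```python
# Diff the Helix ideal state against the external view to find segments
# whose actual state does not match the target state (i.e. stuck transitions).
# The mapFields below are copied from the two ZK records in this thread.

SERVER = ("Server_pinot-server-realtime-1."
          "pinot-server-realtime-headless.svc.cluster.local_8098")

ideal_state = {
    "comp0_horizontal__0__0__20210324T2221Z": {SERVER: "ONLINE"},
    "comp0_horizontal__0__1__20210325T0755Z": {SERVER: "CONSUMING"},
}
external_view = {
    "comp0_horizontal__0__0__20210324T2221Z": {SERVER: "CONSUMING"},
}

def find_mismatches(ideal, external):
    """Return {segment: (expected, actual)} for every segment whose
    external-view entry differs from its ideal-state entry."""
    return {
        segment: (expected, external.get(segment, {}))
        for segment, expected in ideal.items()
        if external.get(segment, {}) != expected
    }

for segment, (expected, actual) in find_mismatches(ideal_state,
                                                   external_view).items():
    print(segment, expected, "->", actual or "missing from external view")
```

Here this flags both segments: __0__0 is ONLINE in the ideal state but still CONSUMING in the external view, and __0__1 has no external-view entry at all.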

Jackie

03/25/2021, 9:36 PM
Seems the problem is that the consuming segment is not able to turn ONLINE. Can you please check the server log and see if there is any exception logged?

Harold Lim

03/25/2021, 9:40 PM
I don't see any NullPointerException in the log. Any particular thing I need to look for?

Jackie

03/25/2021, 10:48 PM
Any ERROR log in your server log?
Based on the external view and ideal state, the server is stuck at the CONSUMING -> ONLINE segment transition

Harold Lim

03/26/2021, 12:17 AM
I don't see any other error in SERVER log related to this particular table