# troubleshooting
  • raghav

    10/13/2025, 3:19 PM
    Hey Team, we are facing an issue with ingestion in Pinot: our prod cluster has stopped ingesting data. In the server Helix logs I can see that the servers can't connect to ZooKeeper. I have tried restarting all the components. Disk usage on ZooKeeper is <5%, and CPU on ZooKeeper is ~10%. We have 24 servers, 36 Kafka partitions, and 50 GB of memory each, with a peak ingestion rate of 1MM rps and a segment size of 300 MB. Can anyone please help us understand and mitigate this issue?
    2025/10/13 07:46:07.467 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( Disconnected )
    2025/10/13 07:46:07.472 WARN [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 10000184ff502de, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
    2025/10/13 07:46:09.059 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( SyncConnected )
    2025/10/13 07:46:09.059 INFO [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState: SyncConnected, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
    2025/10/13 07:46:21.387 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( Disconnected )
    2025/10/13 07:46:21.387 WARN [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 10000184ff502de, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
    2025/10/13 07:46:22.025 WARN [ZKHelixManager] [message-count-scheduler-0] zkClient to pinot-zookeeper:2181 is not connected, wait for 10000ms.
    2025/10/13 07:46:32.028 ERROR [ZKHelixManager] [message-count-scheduler-0] zkClient is not connected after waiting 10000ms., clusterName: d3-pinot-cluster, zkAddress: pinot-zookeeper:2181
    2025/10/13 07:46:34.790 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( SyncConnected )
    2025/10/13 07:46:34.790 INFO [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState: SyncConnected, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
    2025/10/13 12:34:34.225 INFO [CallbackHandler] [ZkClient-EventThread-125-pinot-zookeeper:2181] 125 START: CallbackHandler 0, INVOKE /d3-pinot-cluster/INSTANCES/Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098/MESSAGES listener: org.apache.helix.messaging.handling.HelixTaskExecutor@1b9d313c type: CALLBACK
    2025/10/13 12:34:34.226 INFO [CallbackHandler] [ZkClient-EventThread-125-pinot-zookeeper:2181] CallbackHandler 0 subscribing changes listener to path: /d3-pinot-cluster/INSTANCES/Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098/MESSAGES, callback type: CALLBACK, event types: [NodeChildrenChanged], listener: org.apache.helix.messaging.handling.HelixTaskExecutor@1b9d313c, watchChild: false
    2025/10/13 12:34:34.227 INFO [CallbackHandler] [ZkClient-EventThread-125-pinot-zookeeper:2181] CallbackHandler0, Subscribing to path: /d3-pinot-cluster/INSTANCES/Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098/MESSAGES took: 1
    2025/10/13 12:34:34.231 INFO [MessageLatencyMonitor] [ZkClient-EventThread-125-pinot-zookeeper:2181] The latency of message 89f57203-2271-4d7a-abc3-1087222fc439 is 853 ms
    2025/10/13 12:34:34.246 INFO [HelixTaskExecutor] [ZkClient-EventThread-125-pinot-zookeeper:2181] Scheduling message 89f57203-2271-4d7a-abc3-1087222fc439: metric_numerical_agg_1H_REALTIME:, null->null
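    The flapping Disconnected/SyncConnected pattern above usually points to ZooKeeper client session trouble (often GC pauses or network blips) rather than a down ZooKeeper. A minimal diagnostic sketch, assuming the ZooKeeper four-letter-word commands are whitelisted and that Helix honors the zk.session.timeout JVM system property (both assumptions to verify for your versions):
    # Check ZooKeeper health and per-client connection state
    echo srvr | nc pinot-zookeeper 2181
    echo cons | nc pinot-zookeeper 2181

    # Hypothetical mitigation: raise the Helix ZK session timeout on Pinot components
    JAVA_OPTS="$JAVA_OPTS -Dzk.session.timeout=60000"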
  • Андрей Морозов

    10/14/2025, 6:53 AM
    Hi, all! I'm trying to run batch ingestion from multiple Parquet files in a directory. The job created all the segments in the mounted directory, but didn't push them to Pinot. Before this, my table already had one old segment from a previous job, and that data was pushed successfully. My cluster configuration: Docker [controller, broker, server1, server2, server3], 16 CPU / 64 GB RAM / 1 TB SSD / Ubuntu Server. Job spec:
    executionFrameworkSpec:
      name: standalone
      segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
      segmentTarPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
      segmentUriPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner
    
    jobType: SegmentCreationAndTarPush
    
    inputDirURI: '/var/imports/insights_ch1_fff_seg/'
    includeFileNamePattern: "glob:**/*.parquet"
    outputDirURI: '/tmp/pinot-segments/insights_ch1_fff_sm'
    overwriteOutput: true
    
    pushJobSpec:
      pushFileNamePattern: 'glob:**/*.tar.gz'
      pushParallelism: 2
      pushAttempts: 2
    
    recordReaderSpec:
      dataFormat: parquet
      className: org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader
    
    pinotFSSpecs:
      - scheme: file
        className: org.apache.pinot.spi.filesystem.LocalPinotFS
    
    tableSpec:
      tableName: insights_ch1_4
      schemaURI: 'http://pinot-controller:9000/tables/insights_ch1_4/schema'
      tableConfigURI: 'http://pinot-controller:9000/tables/insights_ch1_4'
    
    pinotClusterSpecs:
      - controllerURI: 'http://pinot-controller:9000'
    Segments created in the mounted directory after the job ran: (screenshot) Command for running the job:
    docker exec -e JAVA_OPTS="-Xms16g -Xmx40g" -it pinot-controller \
      bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /config/insights_ch1_4_job.yaml
    I don't see any logs on stdout, only when it fails. Xmx is 40g (when it was 24g, the job failed with an out-of-heap-space error). What is wrong?
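    If the segment tars exist under outputDirURI but never reach the controller, it helps to confirm what the push step actually saw. A minimal check, assuming the controller is reachable at pinot-controller:9000 and reusing the table name and paths from the job spec above:
    # Segments the controller actually knows about for this table
    curl -s "http://pinot-controller:9000/segments/insights_ch1_4"

    # The push step only uploads files matching pushFileNamePattern ('glob:**/*.tar.gz'),
    # so verify the generated segment tars actually match that pattern
    ls -R /tmp/pinot-segments/insights_ch1_fff_sm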
  • madhulika

    10/14/2025, 4:07 PM
    Hi @Mayank, I was changing a table's configuration from replica-group instance assignment to the balanced segment assignment strategy, and noticed the segment count did not change much but the table size doubled.
  • Sonit Rathi

    10/15/2025, 4:37 AM
    Hi team, I am trying to remove the sorted index on one of the columns and have tried reloading all segments. Still, after reloading, the segments show sorted=true and the column still behaves as sorted in queries.
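    One way to check what the server actually reloaded is the per-column segment metadata API; note that sortedness is a physical property of how a segment was written, so a reload alone may not change it for already-sealed segments. A sketch with placeholder table, segment, and column names (the endpoint exists on the controller; the names are illustrative):
    curl -s "http://localhost:9000/segments/myTable_REALTIME/mySegmentName/metadata?columns=myColumn"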
  • madhulika

    10/15/2025, 3:28 PM
    Hi @Mayank, even with the balanced segment strategy, some tables' segments are being assigned to only a few servers. I was expecting all servers to participate in segment assignment, round-robin.
  • mg

    10/16/2025, 9:00 AM
    Hi team, I'm running a real-time table with Kafka ingestion, and although data ingestion is working perfectly fine and the table status is green, I am getting a recurring stream of WARN logs in the Controller that I'd like to clarify. It appears the underlying Kafka client's ConsumerConfig is flagging Pinot-specific properties as unknown, likely because they are wrappers around the core Kafka properties. Are these warnings benign and expected, or does this indicate a potential issue with our configuration style? I'm seeking recommendations on whether we can suppress these warnings or if there's an updated configuration pattern we should use to avoid passing these metadata properties to the Kafka client. 1. Controller WARN Logs (Example)
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.decoder.class.name' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'streamType' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.consumer.type' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.broker.list' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.consumer.factory.class.name' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.topic.name' was supplied but isn't a known config.
    2. Relevant Table Config (streamConfigs)
    {
      "REALTIME": {
        "tableName": "XYZ",
        "tableType": "REALTIME",
        "segmentsConfig": {...},
        "tenants": {...},
        "tableIndexConfig": {
          "streamConfigs": {
            "streamType": "kafka",
            "stream.kafka.consumer.type": "LowLevel",
            "stream.kafka.topic.name": "test.airlineStats",
            "stream.kafka.broker.list": "kafka-bootstrap.kafka.svc:9093",
            "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder",
            "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka30.KafkaConsumerFactory",
            "security.protocol": "SSL",
            // SSL config continues...
          },
          "other-configs": ...
        },
        "metadata": {},
        "other-configs": ...
      }
    }
    Any guidance on best practices for stream config in recent Pinot versions, or a way to silence these specific ConsumerConfig warnings, would be highly appreciated! Thanks!
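    These warnings are emitted by the Kafka client's own logger, so one low-risk option is to raise that logger's level. A sketch in log4j2 properties style, assuming you can edit the controller's logging configuration (adapt to the XML format if that is what your deployment uses):
    # Silence "supplied but isn't a known config" warnings from the Kafka consumer
    logger.kafkaConsumerConfig.name = org.apache.kafka.clients.consumer.ConsumerConfig
    logger.kafkaConsumerConfig.level = ERROR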
  • Tommaso Peresson

    10/16/2025, 10:55 AM
    Is there a cluster config to periodically clean up the task history, to avoid bogging down ZK? I know there's an API; I just wanted to know if this could be self-contained, without having to schedule a job external to Pinot to call it.
  • Андрей Морозов

    10/17/2025, 11:43 AM
    Hi, Team! I have a problem with ingestion from a CSV file which contains STRING values in a column, such as "#1082;аБ....". I get: ERROR Caused by: java.lang.IllegalArgumentException: Cannot read single-value from Object[]: [Б, а, р,......] for column: ext_id. The parser reads this as an array, but I want to load it into Pinot as-is, as a STRING. How can I fix this? Another problem: a STRING like " Text , text text" is also parsed as Object[].
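    The character-by-character Object[] suggests the CSV reader's multi-value delimiter (';' by default) is firing inside values such as HTML entities. A sketch of a recordReaderSpec that overrides it, assuming the standard CSV plugin classes and that these config keys match your Pinot version:
    recordReaderSpec:
      dataFormat: csv
      className: org.apache.pinot.plugin.inputformat.csv.CSVRecordReader
      configClassName: org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig
      configs:
        delimiter: ','
        # Pick a multi-value delimiter that never appears in the data
        multiValueDelimiter: '|'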
  • Mustafa Shams

    10/20/2025, 7:02 PM
    I'm having an issue with the UI in Pinot 1.4.0 when trying to add an Offline or Realtime table: sometimes the Table Type option is unselected and grayed out, so I'm not able to select it. I have to switch to the JSON editor and enter the table type for it to work. I was wondering if this is a known issue or a bug in 1.4.0. Is there a way to fix it, or a version where this doesn't happen?
  • Alaa Halawani

    10/22/2025, 5:47 AM
    Hi everyone, I've recently started using Apache Pinot 1.4 and set up a real-time table with upsert enabled, consuming data from Kafka. I ingested about 1.7 million rows across 12 segments, and during the initial load test, query performance was blazing fast. However, after restarting the server, I noticed:
    • The server's memory usage dropped noticeably
    • A significant spike in query latency, especially in schedulerWaitMs
    Additional details:
    • Ingestion is stopped (so no extra Kafka load)
    • Increasing pinot.query.scheduler.query_runner_threads helped slightly, but performance is still slower than before the restart
    • I tried both MMAP and HEAP loading modes with similar results
    • I am running the Pinot cluster on k8s nodes
    Has anyone run into similar behavior after a restart? Any idea why it happens? Any recommendations or configuration tips to improve performance would be much appreciated.
  • Rahul Sharma

    10/22/2025, 7:56 PM
    Hi Team, we are using realtime tables with upsert in Pinot. However, since Pinot does not actually delete old records, we need to schedule Minion compaction tasks to handle this. I added the configuration below to my table. Now the upsertCompactionTask is visible in the Task Manager, but its task configuration is empty. As a result, compaction is not working and the number of records in my table stays the same. Can anyone please help? Conf:
    "task": {
          "taskTypeConfigsMap": {
            "UpsertCompactionTask": {
              "schedule": "0 */5 * ? * *",
              "bufferTimePeriod": "0d",
              "invalidRecordsThresholdPercent": "10",
              "invalidRecordsThresholdCount": "1000"
            }
          }
        },
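    Two things worth verifying, stated as assumptions rather than a diagnosis: UpsertCompactionTask generally requires snapshots to be enabled in the table's upsertConfig ("enableSnapshot": true), and task generation can be triggered by hand to surface scheduling errors. A sketch with a placeholder table name:
    # Manually trigger task generation for one table
    curl -X POST "http://localhost:9000/tasks/schedule?taskType=UpsertCompactionTask&tableName=myTable_REALTIME"
    The controller logs from this call usually explain why no task was generated.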
  • Krupa

    10/24/2025, 11:19 AM
    Hi @Mayank, I have created a table and its performance is initially good: with fewer than 1 million records it serves around 1000 QPS. When the data increased to 15 million records, query performance degraded at the same QPS, with queries taking more than 3 seconds. I have put the relevant indexes on the relevant columns. What am I missing, and what can the reasons be? Please help.
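    A first step for this kind of degradation is confirming the indexes are actually used at query time. Pinot supports EXPLAIN PLAN; a sketch with placeholder table and column names:
    -- Shows the operator tree; look for index-based filter operators,
    -- and compare numEntriesScannedInFilter in the query response stats
    EXPLAIN PLAN FOR
    SELECT col1, COUNT(*)
    FROM myTable
    WHERE col2 = 'value'
    GROUP BY col1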
  • Utsav Jain

    10/29/2025, 5:15 AM
    Hi Team, we are using realtime tables with upserts enabled, and the TTL window is 12 hours. To purge any stale record arriving after the TTL window, and to avoid running queries with DISTINCT, we enabled the segment compaction job. But we are seeing accuracy issues when running it: a few of the older segments are never considered for compaction, which results in inaccurate numbers when querying. Can you please help us understand the causes of such cases?
  • Rajat

    10/29/2025, 10:16 AM
    Hi Team, is there any known bug in Pinot where duplicate results appear intermittently? When I check for duplicates in the data by running:
    SELECT s_id, count(*)
    FROM shipmentMerged_final
    GROUP BY s_id
    HAVING COUNT(*) > 1
    Sometimes it returns no records, but sometimes it returns rows with a count of 2.
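    If shipmentMerged_final is an upsert table, the raw (pre-deduplication) rows can be inspected with the skipUpsert query option, which helps tell real duplicates from upsert-resolution timing; in newer releases "SET skipUpsert=true;" before the query also works. A sketch, assuming an upsert table:
    SELECT s_id, COUNT(*)
    FROM shipmentMerged_final
    GROUP BY s_id
    HAVING COUNT(*) > 1
    OPTION(skipUpsert=true)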
  • Rajat

    10/29/2025, 10:49 AM
    Another issue:
    SELECT COUNT(*) AS aggregate, s_id
    FROM shipmentMerged_final
    WHERE o_company_id = 2449226
      AND o_created_at BETWEEN TIMESTAMP '2025-10-10 00:00:00' AND TIMESTAMP '2025-10-26 23:59:59'
      AND o_shipping_method IN ('SR', 'SRE', 'AC')
      AND o_is_return = 0
      AND o_state = 0
    GROUP BY 2
    LIMIT 1500
    The query above shows 1150 total records, but when running:
    SELECT COUNT(*) AS aggregate
    FROM shipmentMerged_final
    WHERE o_company_id = 2449226
      AND o_created_at BETWEEN TIMESTAMP '2025-10-10 00:00:00' AND TIMESTAMP '2025-10-26 23:59:59'
      AND o_shipping_method IN ('SR', 'SRE', 'AC')
      AND o_is_return = 0
      AND o_state = 0
    The count comes back as 1162.
  • Rajat

    10/29/2025, 10:49 AM
    @Xiang Fu @Mayank
  • Rashpal Singh

    10/29/2025, 11:24 PM
    Hi All, I am using Pinot 1.1 and I want to store null for my DOUBLE column. For that I have used the configs below:
    nullHandlingEnabled=true at the table config level
    "enableColumnBasedNullHandling": true at the schema level
    {
      "name": "notNullColumn",
      "dataType": "DOUBLE",
      "notNull": false
    }
    Still, when I query, I am getting "0" instead of null. How can I fix this so that I see null (the original value) instead of 0 in the query response, without adding "SET enableNullHandling=true" to my query?
  • Rahul Sharma

    10/30/2025, 4:23 AM
    Hi team, I am creating an autoscaler for minion-based batch ingestions. To scale up and down, I need the number of tasks that are waiting and the number that are running. I checked the Pinot metrics and found these two: pinot_controller_numMinionSubtasksWaiting_Value and pinot_controller_numMinionSubtasksRunning_Value. However, for each task type they always show a value of 0, even when tasks are running. Am I using the wrong metrics? Which metrics should I use to build a custom autoscaler for minions?
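    As a cross-check (or a fallback data source for the autoscaler), per-task states can be fetched straight from the controller and counted by state. A sketch with a placeholder task type:
    # Returns a map of taskName -> state (IN_PROGRESS, COMPLETED, NOT_STARTED, ...)
    curl -s "http://localhost:9000/tasks/SegmentGenerationAndPushTask/taskstates"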
  • francoisa

    10/30/2025, 8:49 AM
    Hi team 😉 Quick question about some messages I see in my monitoring: "Recreating stream consumer for topic partition *, reason: Total idle time: 183647 ms exceeded idle timeout: 180000 ms". What is the behaviour behind that? Does it reset the consumer to the last committed offset and re-ingest, or just re-create the consumer at its last consumed offset?
  • Badhusha Muhammed

    10/30/2025, 4:17 PM
    Hello Team, we are encountering an issue where our Pinot servers time out when attempting to establish a session with ZooKeeper. This is causing the Pinot servers to crash (or go down). Although the server iteratively attempts to establish a new connection, the process continues to time out until we manually restart the server instance. A similar scenario can be found in the following GitHub issue: https://github.com/apache/pinot/issues/4686. 1. The initial issue between the Pinot server and ZooKeeper was session expiration. 2. Regardless of the underlying cause (e.g., ZooKeeper latency, GC pauses blocking the main thread), Pinot should be capable of automatically re-establishing the connection once the problem is resolved. Instead, we are forced to manually restart the server to restore a healthy ZooKeeper session. As a result, the server is removed from the LIVE_INSTANCE metadata and registered as DEAD.
  • Victor Bivolaru

    10/31/2025, 1:31 PM
    Hello, I have a question about how the controller handles rebalancing segments. I am mostly interested in the following aspect: is there any downtime while moving a segment from one server to another? I see that in the manual rebalance job you can specify downtime=false only if you have replication. Is the mechanism behind controller rebalancing the same?
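    For reference, a no-downtime rebalance is typically invoked like the sketch below (placeholder table name; downtime and minAvailableReplicas are parameters on the controller's rebalance endpoint, but verify the defaults for your version):
    # Keep at least one replica serving while segments move
    curl -X POST "http://localhost:9000/tables/myTable/rebalance?type=REALTIME&downtime=false&minAvailableReplicas=1&dryRun=false"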
  • Mannoj

    11/03/2025, 4:39 PM
    Hi Team, I was checking whether Pinot really logs auditing, as in who did what, how, from which source, and at what time. It seems the code base logs only the response and the type, not the request. It would be great if the request were also logged, so that audit info is fully available. In the code base, ControllerResponseFilter.java has:
    LOGGER.info("Handled request from {} {} {}, content-type {} status code {} {}", srcIpAddr, method, uri, contentType, respStatus, reasonPhrase);
    If the requestContext were also added, I believe it would include the request details and payload initially sent by the user; or, if it's disabled on purpose, would you mind giving that control to log4j so the end user can choose whether to enable it? I'm no developer 🥺, I'm trying to make sense of the code and see if it can be added. Where I'm coming from: I just added a user via the controller to grant a particular user read/write permissions on all tables. All I get is below.
    2025/11/03 20:30:59.922 INFO [ControllerResponseFilter] [grizzly-http-server-15] Handled request from 192.168.13.1 PUT http://test-phaseroundtoaudit11.ori.com:9000/users/dedactid_rw?component=BROKER&passwordChanged=false, content-type text/plain;charset=UTF-8 status code 200 OK
    2025/11/03 20:30:59.957 INFO [ControllerResponseFilter] [grizzly-http-server-14] Handled request from 192.168.13.1 GET http://test-phaseroundtoaudit11.ori.com:9000/tables, content-type null status code 200 OK
    2025/11/03 20:30:59.980 INFO [ControllerResponseFilter] [grizzly-http-server-12] Handled request from 192.168.13.1 GET http://test-phaseroundtoaudit11.ori.com:9000/users, content-type null status code 200 OK
    But it's missing that read/write was granted by the admin user to ALL or particular tables. There is further granularity missing, which I believe is crucial. Let me know your views. Thanks!!
  • Alexander Maniates

    11/03/2025, 7:10 PM
    QQ: is there a certain task we can run to force a server to re-upload its segment to the deep store (in our case S3)? We have a situation where a realtime server failed to upload to S3, and then the segment was offloaded to offline servers. The offline servers were able to fetch the segment from their online peers and load it successfully, but the segment is still in a weird state where it is missing from the deep store/S3. Should some periodic task be running to check on this, or can we run some manual controller task to "heal" the situation?
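    For what it's worth, recent Pinot versions have a controller-side retry for exactly this case, run from the realtime segment validation periodic task. A sketch of the controller config, with the flag name stated as an assumption to verify for your version:
    # Re-attempt deep store upload for committed segments that missed it
    controller.realtime.segment.deepStoreUploadRetryEnabled=true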
  • Rahul Sharma

    11/04/2025, 10:02 AM
    Hi Team, Context: We want to use Apache Pinot for real-time analytics query use cases in our microservices. Since realtime Pinot tables ingest directly from Kafka, ingestion delays/lag can occur. Our requirement is: whenever a document (row) in Pinot is updated, we want to push an event to Kafka with the primary key that changed. This would allow downstream microservices to consume that event, know that a specific record has been updated in Pinot, then trigger real-time analytics queries and perform required downstream actions. Question: Is there any existing feature or recommended workaround in Pinot to detect when a row is updated in a realtime table and trigger an event (e.g., send a Kafka message) so downstream services can be notified?
  • Mariusz

    11/04/2025, 2:42 PM
    Hi Team, recently I was trying to enable OOM protection (https://docs.pinot.apache.org/operators/operating-pinot/oom-protection-using-automatic-query-killing). I have added the configurations below to both the broker and server config files.
    pinot.broker.instance.enableThreadCpuTimeMeasurement=true
    pinot.broker.instance.enableThreadAllocatedBytesMeasurement=true
    pinot.server.instance.enableThreadAllocatedBytesMeasurement=true
    pinot.server.instance.enableThreadCpuTimeMeasurement=true
    pinot.query.scheduler.accounting.enable.thread.memory.sampling=true
    pinot.query.scheduler.accounting.enable.thread.cpu.sampling=true
    
    
    pinot.query.scheduler.accounting.oom.enable.killing.query=true
    pinot.query.scheduler.accounting.query.killed.metric.enabled=true
    
    pinot.query.scheduler.accounting.oom.critical.heap.usage.ratio=0.3
    pinot.query.scheduler.accounting.oom.panic.heap.usage.ratio=0.3
    pinot.query.scheduler.accounting.sleep.ms=30
    pinot.query.scheduler.accounting.oom.alarming.usage.ratio=0.3
    pinot.query.scheduler.accounting.sleep.time.denominator=3
    pinot.query.scheduler.accounting.min.memory.footprint.to.kill.ratio=0.01
    
    pinot.query.scheduler.accounting.factory.name=org.apache.pinot.core.accounting.PerQueryCPUMemAccountantFactory
    pinot.query.scheduler.accounting.cpu.time.based.killing.enabled=true
    pinot.query.scheduler.accounting.publishing.jvm.heap.usage=true
    pinot.query.scheduler.accounting.cpu.time.based.killing.threshold.ms=1000
    I have run some heavy queries to test the OOM killing feature, but I don't see any killed queries in the broker/server metrics.
    SELECT accountId,countryCode,direction,day,hour,msgType,currency,topic,finalStatus,year,month,
      SUM(CASE WHEN finalStatus = 'Failed' THEN 1 ELSE 0 END) AS failed_count,
      SUM(CASE WHEN finalStatus = 'Delivered' THEN 1 ELSE 0 END) AS success_count,
      COUNT(*) AS total_records,
      COUNT(DISTINCT udrId) AS unique_udrs,
      SUM(price) AS total_revenue,
      AVG(price) AS avg_price,
      MAX(price) AS max_price,
      MIN(price) AS min_price,
      SUM(CASE WHEN errorCode > 0 THEN 1 ELSE 0 END) AS error_count,
      SUM(price * (CASE WHEN direction = 'Unknown' THEN 1 ELSE -1 END)) AS net_revenue
    FROM
      dummy_table
    GROUP BY
      accountId,countryCode,direction,msgType,currency,topic,finalStatus,year,month,day,hour
    ORDER BY
      total_revenue DESC,
      avg_price DESC
    LIMIT 1000000
    Whenever I run this query, the server goes down, but no queries are terminated automatically. Can you please help me understand whether I am missing any configurations or steps to enable this feature? I tested on apachepinot/pinot:1.5.0-SNAPSHOT-9d32f376d8-20251016, with a heap size of -Xms2G -Xmx2G for both server and broker.
  • Naveen

    11/05/2025, 9:20 AM
    Hi Team, I'm getting this error continuously even though my servers are running properly and the tables are in a good state. Please help me resolve the issue.
    kubectl get pod -n dp-1-346
    NAME                                      READY   STATUS    RESTARTS   AGE
    pinot-broker-0                            1/1     Running   0          13h
    pinot-controller-0                        1/1     Running   0          6m26s
    pinot-minion-stateless-84fc6899f9-2shqp   1/1     Running   0          13h
    pinot-server-0                            1/1     Running   0          6m32s
    pinot-server-1                            1/1     Running   0          6m39s
    presto-coordinator-0                      1/1     Running   0          25h
    presto-worker-0                           1/1     Running   0          25h
    zookeeper-0                               1/1     Running   0          13h
  • Rajasekharan A P

    11/06/2025, 7:04 AM
    Hi, in my Pinot cluster I initially had 4 servers (A, B, C, D) with segments distributed across them. I wanted to consolidate all segments onto a single server, so I removed the tags from servers B, C, and D, and then ran a rebalance operation to allocate all segments to the remaining server (A). After rebalancing, all segments were assigned to the single server. However, the segments that were originally on the other servers appeared in ERROR state in the external view, even though their ideal state in ZooKeeper showed them as ONLINE. For example: • Ideal State:
    "load_chat_messages_core_1756318894786_1758914214102_1758919671601": {
        "Server_172.18.0.6_8098": "ONLINE"
    }
    • External View:
    "load_chat_messages_core_1756318894786_1758914214102_1758919671601": {
        "Server_172.18.0.6_8098": "ERROR"
    }
    To resolve this, I performed a reload and reset operation on the affected segments. After the reset, the segment state transitioned from ERROR to OFFLINE, allowing it to be properly reloaded. Setup details: • Running Pinot in Docker • Using local storage for segment files • Segment data is volume-mounted
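    For reference, the reset described above can also be driven through the controller API; a sketch with placeholder table and segment names (the reset endpoint exists on the controller, the names are illustrative):
    # Transitions an ERROR segment to OFFLINE so it can come back ONLINE
    curl -X POST "http://localhost:9000/segments/myTable_OFFLINE/mySegmentName/reset"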
  • francoisa

    11/06/2025, 10:41 AM
    Hi, a question about strange load across VMs. Has anyone faced this kind of issue before? We've got 4 servers, and 2 of them show heavy CPU/RAM usage while the other two look normal/chill. Data is properly balanced (32 tables / 2 partitions (non-spooled) with replication of 2), so each server is used equally when queried; this can be seen in the network traffic, which is close to equal on each server. I tried rebalancing the tables, but all are already balanced. I've reworked the JMX to grab only relevant data and I do not see anything: the same query rate and segments processed per server. Any clues?
  • Victor Bivolaru

    11/07/2025, 1:09 PM
    I am trying to debug a strange issue regarding segment generation from a realtime table. Its config is set up like this:
    "realtime.segment.flush.threshold.rows": "0",
    "realtime.segment.flush.threshold.segment.size": "500M",
    "realtime.segment.flush.threshold.time": "4h"
    However, when inspecting the metadata of any of the realtime segments, we can see, for example:
    "segment.realtime.endOffset": "67399447",
    "segment.start.time": "1762424217000",
    "segment.time.unit": "MILLISECONDS",
    "segment.flush.threshold.size": "100000",
    "segment.realtime.startOffset": "66512835",
    "segment.size.in.bytes": "14018213",  <====== 14MB instead of 500M    
    "segment.end.time": "1762426143000",  <====== subtracting segment.start.time from this we get roughly 35 min 
    "segment.total.docs": "100000",
    "segment.realtime.numReplicas": "1",
    "segment.creation.time": "1762511599197",
    "segment.index.version": "v3",
    "segment.crc": "3704033136",
    "segment.realtime.status": "DONE",
  • Rajasekharan A P

    11/10/2025, 4:44 AM
    Hello everyone, I am facing some issues in production with the Pinot setup. Could anyone help me? 🙂