# troubleshooting
  • raghav

    10/13/2025, 3:19 PM
    Hey Team, we are facing an issue with ingestion in Pinot: our prod cluster has stopped ingesting data. In the server Helix logs I can see that the servers can't connect to ZooKeeper. I have tried restarting all the components. Disk usage on ZooKeeper is <5%, and CPU on ZooKeeper is ~10%. We have 24 servers, 36 Kafka partitions, and 50 GB of memory each, with a peak ingestion rate of 1MM rps and a segment size of 300 MB. Can anyone please help us understand and mitigate this issue?
    2025/10/13 07:46:07.467 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( Disconnected )
    2025/10/13 07:46:07.472 WARN [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 10000184ff502de, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
    2025/10/13 07:46:09.059 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( SyncConnected )
    2025/10/13 07:46:09.059 INFO [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState: SyncConnected, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
    2025/10/13 07:46:21.387 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( Disconnected )
    2025/10/13 07:46:21.387 WARN [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState:Disconnected, SessionId: 10000184ff502de, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
    2025/10/13 07:46:22.025 WARN [ZKHelixManager] [message-count-scheduler-0] zkClient to pinot-zookeeper:2181 is not connected, wait for 10000ms.
    2025/10/13 07:46:32.028 ERROR [ZKHelixManager] [message-count-scheduler-0] zkClient is not connected after waiting 10000ms., clusterName: d3-pinot-cluster, zkAddress: pinot-zookeeper:2181
    2025/10/13 07:46:34.790 INFO [ZkClient] [Start a Pinot [SERVER]-EventThread] zkclient 3, zookeeper state changed ( SyncConnected )
    2025/10/13 07:46:34.790 INFO [ZKHelixManager] [ZkClient-EventThread-125-pinot-zookeeper:2181] KeeperState: SyncConnected, instance: Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098, type: PARTICIPANT
    2025/10/13 12:34:34.225 INFO [CallbackHandler] [ZkClient-EventThread-125-pinot-zookeeper:2181] 125 START: CallbackHandler 0, INVOKE /d3-pinot-cluster/INSTANCES/Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098/MESSAGES listener: org.apache.helix.messaging.handling.HelixTaskExecutor@1b9d313c type: CALLBACK
    2025/10/13 12:34:34.226 INFO [CallbackHandler] [ZkClient-EventThread-125-pinot-zookeeper:2181] CallbackHandler 0 subscribing changes listener to path: /d3-pinot-cluster/INSTANCES/Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098/MESSAGES, callback type: CALLBACK, event types: [NodeChildrenChanged], listener: org.apache.helix.messaging.handling.HelixTaskExecutor@1b9d313c, watchChild: false
    2025/10/13 12:34:34.227 INFO [CallbackHandler] [ZkClient-EventThread-125-pinot-zookeeper:2181] CallbackHandler0, Subscribing to path: /d3-pinot-cluster/INSTANCES/Server_pinot-server-4.pinot-server-headless.d3-pinot-cluster.svc.cluster.local_8098/MESSAGES took: 1
    2025/10/13 12:34:34.231 INFO [MessageLatencyMonitor] [ZkClient-EventThread-125-pinot-zookeeper:2181] The latency of message 89f57203-2271-4d7a-abc3-1087222fc439 is 853 ms
    2025/10/13 12:34:34.246 INFO [HelixTaskExecutor] [ZkClient-EventThread-125-pinot-zookeeper:2181] Scheduling message 89f57203-2271-4d7a-abc3-1087222fc439: metric_numerical_agg_1H_REALTIME:, null->null
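    The flapping Disconnected/SyncConnected pattern above usually points to ZooKeeper client session trouble (often GC pauses or network blips) rather than a down ZooKeeper. A minimal diagnostic sketch, assuming the ZooKeeper four-letter-word commands are whitelisted and that Helix honors the zk.session.timeout JVM system property (both assumptions to verify for your versions):
    # Check ZooKeeper health and per-client connection state
    echo srvr | nc pinot-zookeeper 2181
    echo cons | nc pinot-zookeeper 2181

    # Hypothetical mitigation: raise the Helix ZK session timeout on Pinot components
    JAVA_OPTS="$JAVA_OPTS -Dzk.session.timeout=60000"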
  • Андрей Морозов

    10/14/2025, 6:53 AM
    Hi, all! I'm trying to run batch ingestion from multiple Parquet files in a directory. The job created all the segments in the mounted directory, but didn't push them to Pinot. Before this, my table already had one old segment from a previous job, and that data was pushed successfully. My cluster configuration: Docker [controller, broker, server1, server2, server3], 16 CPU / 64 GB RAM / 1 TB SSD / Ubuntu Server. Job spec:
    executionFrameworkSpec:
      name: standalone
      segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
      segmentTarPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
      segmentUriPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner
    
    jobType: SegmentCreationAndTarPush
    
    inputDirURI: '/var/imports/insights_ch1_fff_seg/'
    includeFileNamePattern: "glob:**/*.parquet"
    outputDirURI: '/tmp/pinot-segments/insights_ch1_fff_sm'
    overwriteOutput: true
    
    pushJobSpec:
      pushFileNamePattern: 'glob:**/*.tar.gz'
      pushParallelism: 2
      pushAttempts: 2
    
    recordReaderSpec:
      dataFormat: parquet
      className: org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader
    
    pinotFSSpecs:
      - scheme: file
        className: org.apache.pinot.spi.filesystem.LocalPinotFS
    
    tableSpec:
      tableName: insights_ch1_4
      schemaURI: 'http://pinot-controller:9000/tables/insights_ch1_4/schema'
      tableConfigURI: 'http://pinot-controller:9000/tables/insights_ch1_4'
    
    pinotClusterSpecs:
      - controllerURI: 'http://pinot-controller:9000'
    Segments created in the mounted directory after the job ran: (screenshot) Command for running the job:
    docker exec -e JAVA_OPTS="-Xms16g -Xmx40g" -it pinot-controller \
      bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /config/insights_ch1_4_job.yaml
    I don't see any logs on stdout, only when it fails. Xmx is 40g (when it was 24g, the job failed with an out-of-heap-space error). What is wrong?
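    If the segment tars exist under outputDirURI but never reach the controller, it helps to confirm what the push step actually saw. A minimal check, assuming the controller is reachable at pinot-controller:9000 and reusing the table name and paths from the job spec above:
    # Segments the controller actually knows about for this table
    curl -s "http://pinot-controller:9000/segments/insights_ch1_4"

    # The push step only uploads files matching pushFileNamePattern ('glob:**/*.tar.gz'),
    # so verify the generated segment tars actually match that pattern
    ls -R /tmp/pinot-segments/insights_ch1_fff_sm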
  • madhulika

    10/14/2025, 4:07 PM
    Hi @Mayank, I was changing a table's configuration from replica-group instance assignment to the balanced segment assignment strategy, and noticed the segment count did not change much but the table size doubled.
  • Sonit Rathi

    10/15/2025, 4:37 AM
    Hi team, I am trying to remove the sorted index on one of the columns and have tried reloading all segments. Still, after reloading, the segments show sorted=true and the column still behaves as sorted in queries.
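    One way to check what the server actually reloaded is the per-column segment metadata API; note that sortedness is a physical property of how a segment was written, so a reload alone may not change it for already-sealed segments. A sketch with placeholder table, segment, and column names (the endpoint exists on the controller; the names are illustrative):
    curl -s "http://localhost:9000/segments/myTable_REALTIME/mySegmentName/metadata?columns=myColumn"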
  • madhulika

    10/15/2025, 3:28 PM
    Hi @Mayank, even with the balanced segment strategy, some tables' segments are being assigned to only a few servers. I was expecting all servers to participate in segment assignment, round-robin.
  • mg

    10/16/2025, 9:00 AM
    Hi team, I'm running a real-time table with Kafka ingestion, and although data ingestion is working perfectly fine and the table status is green, I am getting a recurring stream of WARN logs in the Controller that I'd like to clarify. It appears the underlying Kafka client's ConsumerConfig is flagging Pinot-specific properties as unknown, likely because they are wrappers around the core Kafka properties. Are these warnings benign and expected, or does this indicate a potential issue with our configuration style? I'm seeking recommendations on whether we can suppress these warnings or if there's an updated configuration pattern we should use to avoid passing these metadata properties to the Kafka client. 1. Controller WARN Logs (Example)
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.decoder.class.name' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'streamType' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.consumer.type' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.broker.list' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.consumer.factory.class.name' was supplied but isn't a known config.
    2025/10/16 08:20:15.667 WARN [ConsumerConfig] [pool-14-thread-9] The configuration 'stream.kafka.topic.name' was supplied but isn't a known config.
    2. Relevant Table Config (streamConfigs)
    {
      "REALTIME": {
        "tableName": "XYZ",
        "tableType": "REALTIME",
        "segmentsConfig": {...},
        "tenants": {...},
        "tableIndexConfig": {
          "streamConfigs": {
            "streamType": "kafka",
            "stream.kafka.consumer.type": "LowLevel",
            "stream.kafka.topic.name": "test.airlineStats",
            "stream.kafka.broker.list": "kafka-bootstrap.kafka.svc:9093",
            "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder",
            "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka30.KafkaConsumerFactory",
            "security.protocol": "SSL",
            // SSL config continues...
          },
          "other-configs": ...
        },
        "metadata": {},
        "other-configs": ...
      }
    }
    Any guidance on best practices for stream config in recent Pinot versions, or a way to silence these specific ConsumerConfig warnings, would be highly appreciated! Thanks!
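    These warnings are emitted by the Kafka client's own logger, so one low-risk option is to raise that logger's level. A sketch in log4j2 properties style, assuming you can edit the controller's logging configuration (adapt to the XML format if that is what your deployment uses):
    # Silence "supplied but isn't a known config" warnings from the Kafka consumer
    logger.kafkaConsumerConfig.name = org.apache.kafka.clients.consumer.ConsumerConfig
    logger.kafkaConsumerConfig.level = ERROR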
  • Tommaso Peresson

    10/16/2025, 10:55 AM
    Is there a cluster config to periodically clean up the task history, to avoid bogging down ZK? I know there's an API; I just wanted to know if this could be self-contained, without having to schedule a job external to Pinot to call it.
  • Андрей Морозов

    10/17/2025, 11:43 AM
    Hi, Team! I have a problem with ingestion from a CSV file which contains STRING values in a column, such as "#1082;аБ....". I get: ERROR Caused by: java.lang.IllegalArgumentException: Cannot read single-value from Object[]: [Б, а, р,......] for column: ext_id. The parser reads this as an array, but I want to load it into Pinot as-is, as a STRING. How can I fix this? Another problem: a STRING like " Text , text text" is also parsed as Object[].
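    The character-by-character Object[] suggests the CSV reader's multi-value delimiter (';' by default) is firing inside values such as HTML entities. A sketch of a recordReaderSpec that overrides it, assuming the standard CSV plugin classes and that these config keys match your Pinot version:
    recordReaderSpec:
      dataFormat: csv
      className: org.apache.pinot.plugin.inputformat.csv.CSVRecordReader
      configClassName: org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig
      configs:
        delimiter: ','
        # Pick a multi-value delimiter that never appears in the data
        multiValueDelimiter: '|'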
  • Mustafa Shams

    10/20/2025, 7:02 PM
    I'm having an issue with the UI in Pinot 1.4.0 when trying to add an Offline or Realtime table: sometimes the Table Type option is unselected and grayed out, so I'm not able to select it. I have to switch to the JSON editor and enter the table type for it to work. I was wondering if this is a known issue or a bug in 1.4.0. Is there a way to fix it, or a version where this doesn't happen?
  • Alaa Halawani

    10/22/2025, 5:47 AM
    Hi everyone, I've recently started using Apache Pinot 1.4 and set up a real-time table with upsert enabled, consuming data from Kafka. I ingested about 1.7 million rows across 12 segments, and during the initial load test, query performance was blazing fast. However, after restarting the server, I noticed:
    • The server's memory usage dropped noticeably
    • A significant spike in query latency, especially in schedulerWaitMs
    Additional details:
    • Ingestion is stopped (so no extra Kafka load)
    • Increasing pinot.query.scheduler.query_runner_threads helped slightly, but performance is still slower than before the restart
    • I tried both MMAP and HEAP loading modes with similar results
    • I am running the Pinot cluster on k8s nodes
    Has anyone run into similar behavior after a restart? Any idea why it happens? Any recommendations or configuration tips to improve performance would be much appreciated.
  • Rahul Sharma

    10/22/2025, 7:56 PM
    Hi Team, we are using realtime tables with upsert in Pinot. However, since Pinot does not actually delete old records, we need to schedule Minion compaction tasks to handle this. I added the configuration below to my table. Now the upsertCompactionTask is visible in the Task Manager, but its task configuration is empty. As a result, compaction is not working and the number of records in my table stays the same. Can anyone please help? Conf:
    "task": {
          "taskTypeConfigsMap": {
            "UpsertCompactionTask": {
              "schedule": "0 */5 * ? * *",
              "bufferTimePeriod": "0d",
              "invalidRecordsThresholdPercent": "10",
              "invalidRecordsThresholdCount": "1000"
            }
          }
        },
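    Two things worth verifying, stated as assumptions rather than a diagnosis: UpsertCompactionTask generally requires snapshots to be enabled in the table's upsertConfig ("enableSnapshot": true), and task generation can be triggered by hand to surface scheduling errors. A sketch with a placeholder table name:
    # Manually trigger task generation for one table
    curl -X POST "http://localhost:9000/tasks/schedule?taskType=UpsertCompactionTask&tableName=myTable_REALTIME"
    The controller logs from this call usually explain why no task was generated.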
  • Krupa

    10/24/2025, 11:19 AM
    Hi @Mayank, I have created a table and its performance is initially good: with fewer than 1 million records it serves around 1000 QPS. When the data increased to 15 million records, query performance degraded at the same QPS, with queries taking more than 3 seconds. I have put the relevant indexes on the relevant columns. What am I missing, and what can the reasons be? Please help.
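    A first step for this kind of degradation is confirming the indexes are actually used at query time. Pinot supports EXPLAIN PLAN; a sketch with placeholder table and column names:
    -- Shows the operator tree; look for index-based filter operators,
    -- and compare numEntriesScannedInFilter in the query response stats
    EXPLAIN PLAN FOR
    SELECT col1, COUNT(*)
    FROM myTable
    WHERE col2 = 'value'
    GROUP BY col1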
  • Utsav Jain

    10/29/2025, 5:15 AM
    Hi Team, we are using realtime tables with upserts enabled, and the TTL window is 12 hours. To purge any stale record arriving after the TTL window, and to avoid running queries with DISTINCT, we enabled the segment compaction job. But we are seeing accuracy issues when running it: a few of the older segments are never considered for compaction, which results in inaccurate numbers when querying. Can you please help us understand the causes of such cases?
  • Rajat

    10/29/2025, 10:16 AM
    Hi Team, is there any known bug in Pinot where duplicate results appear intermittently? When I check for duplicates in the data by running:
    SELECT s_id, count(*)
    FROM shipmentMerged_final
    GROUP BY s_id
    HAVING COUNT(*) > 1
    Sometimes it returns no records, but sometimes it returns rows with a count of 2.
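    If shipmentMerged_final is an upsert table, the raw (pre-deduplication) rows can be inspected with the skipUpsert query option, which helps tell real duplicates from upsert-resolution timing; in newer releases "SET skipUpsert=true;" before the query also works. A sketch, assuming an upsert table:
    SELECT s_id, COUNT(*)
    FROM shipmentMerged_final
    GROUP BY s_id
    HAVING COUNT(*) > 1
    OPTION(skipUpsert=true)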
  • Rajat

    10/29/2025, 10:49 AM
    Another issue:
    SELECT COUNT(*) AS aggregate, s_id
    FROM shipmentMerged_final
    WHERE o_company_id = 2449226
      AND o_created_at BETWEEN TIMESTAMP '2025-10-10 00:00:00' AND TIMESTAMP '2025-10-26 23:59:59'
      AND o_shipping_method IN ('SR', 'SRE', 'AC')
      AND o_is_return = 0
      AND o_state = 0
    GROUP BY 2
    LIMIT 1500
    The query above shows 1150 total records, but when running:
    SELECT COUNT(*) AS aggregate
    FROM shipmentMerged_final
    WHERE o_company_id = 2449226
      AND o_created_at BETWEEN TIMESTAMP '2025-10-10 00:00:00' AND TIMESTAMP '2025-10-26 23:59:59'
      AND o_shipping_method IN ('SR', 'SRE', 'AC')
      AND o_is_return = 0
      AND o_state = 0
    The count comes back as 1162.
  • Rajat

    10/29/2025, 10:49 AM
    @Xiang Fu @Mayank
  • Rashpal Singh

    10/29/2025, 11:24 PM
    Hi All, I am using Pinot 1.1 and I want to store null for my DOUBLE column. For that I have used the configs below:
    nullHandlingEnabled=true at the table config level
    "enableColumnBasedNullHandling": true at the schema level
    {
      "name": "notNullColumn",
      "dataType": "DOUBLE",
      "notNull": false
    }
    Still, when I query, I am getting "0" instead of null. How can I fix this so that I see null (the original value) instead of 0 in the query response, without adding "SET enableNullHandling=true" to my query?
  • Rahul Sharma

    10/30/2025, 4:23 AM
    Hi team, I am creating an autoscaler for minion-based batch ingestions. To scale up and down, I need the number of tasks that are waiting and the number that are running. I checked the Pinot metrics and found these two: pinot_controller_numMinionSubtasksWaiting_Value and pinot_controller_numMinionSubtasksRunning_Value. However, for each task type they always show a value of 0, even when tasks are running. Am I using the wrong metrics? Which metrics should I use to build a custom autoscaler for minions?
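    As a cross-check (or a fallback data source for the autoscaler), per-task states can be fetched straight from the controller and counted by state. A sketch with a placeholder task type:
    # Returns a map of taskName -> state (IN_PROGRESS, COMPLETED, NOT_STARTED, ...)
    curl -s "http://localhost:9000/tasks/SegmentGenerationAndPushTask/taskstates"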
  • francoisa

    10/30/2025, 8:49 AM
    Hi team 😉 Quick question about some messages I see in my monitoring: "Recreating stream consumer for topic partition *, reason: Total idle time: 183647 ms exceeded idle timeout: 180000 ms". What is the behaviour behind that? Does it reset the consumer to the last committed offset and re-ingest, or just re-create the consumer at its last consumed offset?
  • Badhusha Muhammed

    10/30/2025, 4:17 PM
    Hello Team, we are encountering an issue where our Pinot servers time out when attempting to establish a session with ZooKeeper. This is causing the Pinot servers to crash (or go down). Although the server iteratively attempts to establish a new connection, the process continues to time out until we manually restart the server instance. A similar scenario can be found in the following GitHub issue: https://github.com/apache/pinot/issues/4686. 1. The initial issue between the Pinot server and ZooKeeper was session expiration. 2. Regardless of the underlying cause (e.g., ZooKeeper latency, GC pauses blocking the main thread), Pinot should be capable of automatically re-establishing the connection once the problem is resolved. Instead, we are forced to manually restart the server to restore a healthy ZooKeeper session. As a result, the server is removed from the LIVE_INSTANCE metadata and registered as DEAD.
  • Victor Bivolaru

    10/31/2025, 1:31 PM
    Hello, I have a question about how the controller handles rebalancing segments. I am mostly interested in the following aspect: is there any downtime while moving a segment from one server to another? I see that in the manual rebalance job you can specify downtime=false only if you have replication. Is the mechanism behind controller rebalancing the same?
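    For reference, a no-downtime rebalance is typically invoked like the sketch below (placeholder table name; downtime and minAvailableReplicas are parameters on the controller's rebalance endpoint, but verify the defaults for your version):
    # Keep at least one replica serving while segments move
    curl -X POST "http://localhost:9000/tables/myTable/rebalance?type=REALTIME&downtime=false&minAvailableReplicas=1&dryRun=false"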
  • Mannoj

    11/03/2025, 4:39 PM
    Hi Team, I was checking whether Pinot really logs auditing, as in who did what, how, from which source, and at what time. It seems the code base logs only the response and the type, not the request. It would be great if the request were also logged, so that audit info is fully available. In the code base, ControllerResponseFilter.java has:
    LOGGER.info("Handled request from {} {} {}, content-type {} status code {} {}", srcIpAddr, method, uri, contentType, respStatus, reasonPhrase);
    If the requestContext were also added, I believe it would include the request details and payload initially sent by the user; or, if it's disabled on purpose, would you mind giving that control to log4j so the end user can choose whether to enable it? I'm no developer 🥺, I'm trying to make sense of the code and see if it can be added. Where I'm coming from: I just added a user via the controller to grant a particular user read/write permissions on all tables. All I get is below.
    2025/11/03 20:30:59.922 INFO [ControllerResponseFilter] [grizzly-http-server-15] Handled request from 192.168.13.1 PUT http://test-phaseroundtoaudit11.ori.com:9000/users/dedactid_rw?component=BROKER&passwordChanged=false, content-type text/plain;charset=UTF-8 status code 200 OK
    2025/11/03 20:30:59.957 INFO [ControllerResponseFilter] [grizzly-http-server-14] Handled request from 192.168.13.1 GET http://test-phaseroundtoaudit11.ori.com:9000/tables, content-type null status code 200 OK
    2025/11/03 20:30:59.980 INFO [ControllerResponseFilter] [grizzly-http-server-12] Handled request from 192.168.13.1 GET http://test-phaseroundtoaudit11.ori.com:9000/users, content-type null status code 200 OK
    But it's missing that read/write was granted by the admin user to ALL or particular tables. There is further granularity missing, which I believe is crucial. Let me know your views. Thanks!!
  • Alexander Maniates

    11/03/2025, 7:10 PM
    QQ: is there a certain task we can run to force a server to re-upload its segment to the deep store (in our case S3)? We have a situation where a realtime server failed to upload to S3, and then the segment was offloaded to offline servers. The offline servers were able to fetch the segment from their online peers and load it successfully, but the segment is still in a weird state where it is missing from the deep store/S3. Should some periodic task be running to check on this, or can we run some manual controller task to "heal" the situation?
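    For what it's worth, recent Pinot versions have a controller-side retry for exactly this case, run from the realtime segment validation periodic task. A sketch of the controller config, with the flag name stated as an assumption to verify for your version:
    # Re-attempt deep store upload for committed segments that missed it
    controller.realtime.segment.deepStoreUploadRetryEnabled=true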
  • Rahul Sharma

    11/04/2025, 10:02 AM
    Hi Team, Context: We want to use Apache Pinot for real-time analytics query use cases in our microservices. Since realtime Pinot tables ingest directly from Kafka, ingestion delays/lag can occur. Our requirement is: whenever a document (row) in Pinot is updated, we want to push an event to Kafka with the primary key that changed. This would allow downstream microservices to consume that event, know that a specific record has been updated in Pinot, then trigger real-time analytics queries and perform required downstream actions. Question: Is there any existing feature or recommended workaround in Pinot to detect when a row is updated in a realtime table and trigger an event (e.g., send a Kafka message) so downstream services can be notified?
  • Mariusz

    11/04/2025, 2:42 PM
    Hi Team, recently I was trying to enable OOM protection (https://docs.pinot.apache.org/operators/operating-pinot/oom-protection-using-automatic-query-killing). I have added the configurations below to both the broker and server config files.
    pinot.broker.instance.enableThreadCpuTimeMeasurement=true
    pinot.broker.instance.enableThreadAllocatedBytesMeasurement=true
    pinot.server.instance.enableThreadAllocatedBytesMeasurement=true
    pinot.server.instance.enableThreadCpuTimeMeasurement=true
    pinot.query.scheduler.accounting.enable.thread.memory.sampling=true
    pinot.query.scheduler.accounting.enable.thread.cpu.sampling=true
    
    
    pinot.query.scheduler.accounting.oom.enable.killing.query=true
    pinot.query.scheduler.accounting.query.killed.metric.enabled=true
    
    pinot.query.scheduler.accounting.oom.critical.heap.usage.ratio=0.3
    pinot.query.scheduler.accounting.oom.panic.heap.usage.ratio=0.3
    pinot.query.scheduler.accounting.sleep.ms=30
    pinot.query.scheduler.accounting.oom.alarming.usage.ratio=0.3
    pinot.query.scheduler.accounting.sleep.time.denominator=3
    pinot.query.scheduler.accounting.min.memory.footprint.to.kill.ratio=0.01
    
    pinot.query.scheduler.accounting.factory.name=org.apache.pinot.core.accounting.PerQueryCPUMemAccountantFactory
    pinot.query.scheduler.accounting.cpu.time.based.killing.enabled=true
    pinot.query.scheduler.accounting.publishing.jvm.heap.usage=true
    pinot.query.scheduler.accounting.cpu.time.based.killing.threshold.ms=1000
    I have run some heavy queries to test the OOM killing feature, but I don't see any killed queries in the broker/server metrics.
    SELECT accountId,countryCode,direction,day,hour,msgType,currency,topic,finalStatus,year,month,
      SUM(CASE WHEN finalStatus = 'Failed' THEN 1 ELSE 0 END) AS failed_count,
      SUM(CASE WHEN finalStatus = 'Delivered' THEN 1 ELSE 0 END) AS success_count,
      COUNT(*) AS total_records,
      COUNT(DISTINCT udrId) AS unique_udrs,
      SUM(price) AS total_revenue,
      AVG(price) AS avg_price,
      MAX(price) AS max_price,
      MIN(price) AS min_price,
      SUM(CASE WHEN errorCode > 0 THEN 1 ELSE 0 END) AS error_count,
      SUM(price * (CASE WHEN direction = 'Unknown' THEN 1 ELSE -1 END)) AS net_revenue
    FROM
      dummy_table
    GROUP BY
      accountId,countryCode,direction,msgType,currency,topic,finalStatus,year,month,day,hour
    ORDER BY
      total_revenue DESC,
      avg_price DESC
    LIMIT 1000000
    Whenever I run this query, the server goes down, but no queries are terminated automatically. Can you please help me understand whether I am missing any configurations or steps to enable this feature? I tested on apachepinot/pinot:1.5.0-SNAPSHOT-9d32f376d8-20251016, with a heap size of -Xms2G -Xmx2G for both server and broker.
  • Naveen

    11/05/2025, 9:20 AM
    Hi Team, I'm getting this error continuously even though my servers are running properly and the tables are in a good state. Please help me resolve the issue.
    kubectl get pod -n dp-1-346
    NAME                                      READY   STATUS    RESTARTS   AGE
    pinot-broker-0                            1/1     Running   0          13h
    pinot-controller-0                        1/1     Running   0          6m26s
    pinot-minion-stateless-84fc6899f9-2shqp   1/1     Running   0          13h
    pinot-server-0                            1/1     Running   0          6m32s
    pinot-server-1                            1/1     Running   0          6m39s
    presto-coordinator-0                      1/1     Running   0          25h
    presto-worker-0                           1/1     Running   0          25h
    zookeeper-0                               1/1     Running   0          13h
  • Rajasekharan A P

    11/06/2025, 7:04 AM
    Hi, in my Pinot cluster I initially had 4 servers (A, B, C, D) with segments distributed across them. I wanted to consolidate all segments onto a single server, so I removed the tags from servers B, C, and D, and then ran a rebalance operation to allocate all segments to the remaining server (A). After rebalancing, all segments were assigned to the single server. However, the segments that were originally on the other servers appeared in ERROR state in the external view, even though their ideal state in ZooKeeper showed them as ONLINE. For example: • Ideal State:
    "load_chat_messages_core_1756318894786_1758914214102_1758919671601": {
        "Server_172.18.0.6_8098": "ONLINE"
    }
    • External View:
    "load_chat_messages_core_1756318894786_1758914214102_1758919671601": {
        "Server_172.18.0.6_8098": "ERROR"
    }
    To resolve this, I performed a reload and reset operation on the affected segments. After the reset, the segment state transitioned from ERROR to OFFLINE, allowing it to be properly reloaded. Setup details: • Running Pinot in Docker • Using local storage for segment files • Segment data is volume-mounted
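    For reference, the reset described above can also be driven through the controller API; a sketch with placeholder table and segment names (the reset endpoint exists on the controller, the names are illustrative):
    # Transitions an ERROR segment to OFFLINE so it can come back ONLINE
    curl -X POST "http://localhost:9000/segments/myTable_OFFLINE/mySegmentName/reset"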
  • francoisa

    11/06/2025, 10:41 AM
    Hi, a question about strange load across VMs. Has anyone faced this kind of issue before? We've got 4 servers, and 2 of them show heavy CPU/RAM usage while the other two look normal/chill. Data is properly balanced (32 tables / 2 partitions (non-spooled) with replication of 2), so each server is used equally when queried; this can be seen in the network traffic, which is close to equal on each server. I tried rebalancing the tables, but all are already balanced. I've reworked the JMX to grab only relevant data and I do not see anything: the same query rate and segments processed per server. Any clues?
  • Victor Bivolaru

    11/07/2025, 1:09 PM
    I am trying to debug a strange issue regarding segment generation from a realtime table. Its config is set up like this:
    "realtime.segment.flush.threshold.rows": "0",
    "realtime.segment.flush.threshold.segment.size": "500M",
    "realtime.segment.flush.threshold.time": "4h"
    However, when inspecting the metadata of any of the realtime segments, we can see, for example:
    "segment.realtime.endOffset": "67399447",
    "segment.start.time": "1762424217000",
    "segment.time.unit": "MILLISECONDS",
    "segment.flush.threshold.size": "100000",
    "segment.realtime.startOffset": "66512835",
    "segment.size.in.bytes": "14018213",  <====== 14MB instead of 500M    
    "segment.end.time": "1762426143000",  <====== subtracting segment.start.time from this we get roughly 35 min 
    "segment.total.docs": "100000",
    "segment.realtime.numReplicas": "1",
    "segment.creation.time": "1762511599197",
    "segment.index.version": "v3",
    "segment.crc": "3704033136",
    "segment.realtime.status": "DONE",
  • Rajasekharan A P

    11/10/2025, 4:44 AM
    Hello everyone, I am facing some issues in production with the Pinot setup. Could anyone help me? 🙂