AG
06/03/2025, 8:45 AM
tableIndexConfig?

Georgi Varbanov
06/03/2025, 10:17 AM

Dong Zhou
06/03/2025, 11:19 PM

Georgi Varbanov
06/04/2025, 7:29 AM

prasanna
06/04/2025, 8:15 AM

coco
06/05/2025, 6:34 AM

Rajat
06/05/2025, 6:45 AM

Gaurav
06/06/2025, 9:31 AM

Georgi Varbanov
06/06/2025, 9:45 AM

Vipin Rohilla
06/09/2025, 5:14 PM

Eddie Simeon
06/09/2025, 7:06 PM

Naz Karnasevych
06/09/2025, 10:50 PM
/bitnami/zookeeper/data dir, so clearly some things were not lost, but I'm curious if migrating charts requires some extra steps in the config. Also, a couple of follow-up questions:
• are there ways to recover the tables/schemas if zookeeper lost them on restart?
• how can this happen in one env but not in another? the only difference i can think of is the addition of basic auth to the controller ui
• ways to prevent this in the future? we want to update prod as well, but we're not sure how to prevent this scenario. Backing up tables/schemas manually is one thing (see the sketch below), but is there other important data whose loss could prevent a healthy recovery of Pinot?
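For the manual backup mentioned above, a minimal sketch of one way to snapshot table configs and schemas through the controller REST API before an upgrade (the controller address, backup directory, and use of jq are assumptions; add credentials if basic auth is enabled):
CONTROLLER=http://pinot-controller:9000   # assumed controller address
mkdir -p pinot-backup
# save every schema returned by GET /schemas
for s in $(curl -s "$CONTROLLER/schemas" | jq -r '.[]'); do
  curl -s "$CONTROLLER/schemas/$s" > "pinot-backup/schema_$s.json"
done
# save every table config returned by GET /tables
for t in $(curl -s "$CONTROLLER/tables" | jq -r '.tables[]'); do
  curl -s "$CONTROLLER/tables/$t" > "pinot-backup/table_$t.json"
done
If ZooKeeper metadata is lost again, these JSON files can be re-posted to /schemas and /tables to recreate the schemas and table configs.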
Anish Nair
06/11/2025, 1:10 PM

Dong Zhou
06/12/2025, 8:12 AM

Richa Kumari
06/13/2025, 4:46 PM

Ross Morrow
06/14/2025, 7:59 PM
pod/pinot-server-0 1/1 Running 0 43h
pod/pinot-server-1 1/1 Running 0 43h
pod/pinot-server-2 1/1 Running 1 (26s ago) 43h
pod/pinot-server-3 1/1 Running 0 43h
pod/pinot-server-4 1/1 Running 0 43h
pod/pinot-server-5 0/1 OOMKilled 3 (3m56s ago) 43h
pod/pinot-server-6 1/1 Running 1 (8m50s ago) 81m
pod/pinot-server-7 1/1 Running 1 (7m44s ago) 81m
The table data itself is not that large, pretty small in fact (at 10GB currently); there are 8 servers with O(60GB) memory each, 100GB PVCs, and a total of maybe 450 segments over 4 tables.
But this table is (IIUC) doing some pretty high-cardinality upserts, easily into the many tens of millions of keys. Could this be the cause of the OOMs? Are there specific settings I can review or actions I can take while I'm learning, besides larger instances?
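On the settings question: upsert memory pressure mostly comes from the per-partition primary-key-to-segment map each server keeps in heap, so the upsert section of the table config is the usual place to look. A hedged sketch of the kind of knobs to read up on (values are placeholders, not recommendations, and metadataTTL only makes sense with a suitable time-based comparison column):
"upsertConfig": {
  "mode": "FULL",
  "enableSnapshot": true,
  "metadataTTL": 86400
}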
Rajat
06/15/2025, 2:56 PM
image:
repository: 615177075440.dkr.ecr.ap-south-1.amazonaws.com/pinot
tag: 1.0.1
pullPolicy: IfNotPresent
cluster:
name: pinot-prod
# ----------------------------------------------------------------------------
# ZOOKEEPER: 3 replicas
# ----------------------------------------------------------------------------
zookeeper:
name: pinot-zookeeper
replicaCount: 3
persistence:
enabled: true
storageClass: gp3 # ← GP3 EBS
size: 10Gi
resources:
requests:
cpu: 100m
memory: 256Mi
limits:
cpu: 300m
memory: 512Mi
port: 2181
# ----------------------------------------------------------------------------
# CONTROLLER: 2 replicas, internal LB
# ----------------------------------------------------------------------------
controller:
name: pinot-controller
replicaCount: 2
startCommand: "StartController"
# podManagementPolicy: Parallel
resources:
requests:
cpu: 100m
memory: 1Gi
limits:
cpu: 300m
memory: 3Gi
jvmOpts: "-javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent.jar=9010:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml -XX:ActiveProcessorCount=2 -Xms512M -Xmx2G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xlog:gc*:file=/opt/pinot/gc-pinot-controller.log -Djute.maxbuffer=4000000"
# Persist controller metadata
persistence:
enabled: true
accessMode: ReadWriteOnce
storageClass: gp3
size: 50Gi
mountPath: /var/pinot/controller/data
service:
name: controller
annotations:
prometheus.io/scrape: "true"
prometheus.io/path: /metrics
prometheus.io/port: "9010"
# labels:
# prometheus-monitor: "true"
extraPorts:
- name: controller-prom
protocol: TCP
containerPort: 9010
podAnnotations:
prometheus.io/scrape: "true"
prometheus.io/path: /metrics
prometheus.io/port: "9010"
# Expose via Kubernetes Ingress on port 9000
# ingress:
# v1:
# enabled: true
# ingressClassName: nginx # or your ingress controller
# annotations:
# nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
# nginx.ingress.kubernetes.io/rewrite-target: /
# hosts: [pinot-eks.sr-bi-internal.in]
# path: /controller
# tls: []
external:
enabled: false
# ----------------------------------------------------------------------------
# BROKER: 2 replicas, internal LB
# ----------------------------------------------------------------------------
broker:
name: pinot-broker
startCommand: "StartBroker"
# podManagementPolicy: Parallel
replicaCount: 2
resources:
requests:
cpu: 200m
memory: 1Gi
limits:
cpu: 500m
memory: 3Gi
jvmOpts: "-javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent.jar=9020:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml -XX:ActiveProcessorCount=2 -Xms512M -Xmx2G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xlog:gc*:file=/opt/pinot/gc-pinot-broker.log -Djute.maxbuffer=4000000"
service:
name: broker
# type: LoadBalancer
annotations:
# service.beta.kubernetes.io/aws-load-balancer-internal: "true"
prometheus.io/scrape: "true"
prometheus.io/path: /metrics
prometheus.io/port: "9020"
# labels:
# prometheus-monitor: "true"
extraPorts:
- name: broker-prom
protocol: TCP
containerPort: 9020
podAnnotations:
prometheus.io/scrape: "true"
prometheus.io/path: /metrics
prometheus.io/port: "9020"
# Expose via Kubernetes Ingress on port 8099
# ingress:
# v1:
# enabled: true
# ingressClassName: nginx # or your ingress controller
# annotations:
# nginx.ingress.kubernetes.io/backend-protocol: "HTTP"
# nginx.ingress.kubernetes.io/use-regex: "true"
# nginx.ingress.kubernetes.io/rewrite-target: /$2
# hosts: [pinot-eks.sr-bi-internal.in]
# path: /broker(/|$)(.*)
# pathType: ImplementationSpecific
# tls: []
external:
enabled: false
# ----------------------------------------------------------------------------
# PINOT SERVER: 2 replicas, each with 100 Gi gp3 PVC
# ----------------------------------------------------------------------------
server:
name: pinot-server
startCommand: "StartServer"
# podManagementPolicy: Parallel
replicaCount: 2
resources:
requests:
cpu: 2000m # 2 vCPU
memory: 5Gi
limits:
cpu: 4000m # 4 vCPU
memory: 10Gi
persistence:
enabled: true
accessMode: ReadWriteOnce
size: 100G
mountPath: /var/pinot/server/data
storageClass: gp3
jvmOpts: "-javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent.jar=9030:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml -Xms4G -Xmx8G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xlog:gc*:file=/opt/pinot/gc-pinot-server.log -Djute.maxbuffer=4000000"
service:
name: server
# type: LoadBalancer
annotations:
# service.beta.kubernetes.io/aws-load-balancer-internal: "true"
prometheus.io/scrape: "true"
prometheus.io/path: /metrics
prometheus.io/port: "9030"
# labels:
# prometheus-monitor: "true"
extraPorts:
- name: server-prom
protocol: TCP
containerPort: 9030
podAnnotations:
prometheus.io/scrape: "true"
prometheus.io/path: /metrics
prometheus.io/port: "9030"
# ----------------------------------------------------------------------------
# MINION: 1 replica (for background tasks / retention, compaction, etc.)
# ----------------------------------------------------------------------------
minion:
enabled: true # run the minion pod
name: pinot-minion
startCommand: "StartMinion"
# podManagementPolicy: Parallel
replicaCount: 1 # scale up if you have heavy compaction/merge workloads
resources:
requests:
cpu: 100m
memory: 512Mi
limits:
cpu: 200m
memory: 1Gi
jvmOpts: "-javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent.jar=8008:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml -XX:ActiveProcessorCount=2 -Xms256M -Xmx1G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xlog:gc*:file=/opt/pinot/gc-pinot-minion.log -Djute.maxbuffer=4000000"
I am running Pinot on EKS with 3 nodes but am utilizing only 2, since the replica count is 2.
I want to use all 3 nodes for better performance. What will happen if I increase the number of server pods to 3, and how should I do it without losing the existing data?
@Xiang Fu @Mayank
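For the scale-up question above, one possible sequence, sketched with placeholder names (the helm release/chart names and the table name are assumptions). Adding a third server pod does not move any data by itself; existing segments stay on their PVCs, and a rebalance has to be triggered afterwards so the new server actually gets segments assigned:
# add a third server pod (existing pods and PVCs are left untouched)
helm upgrade pinot pinot/pinot -n pinot --reuse-values --set server.replicaCount=3
# then rebalance each table so segments are spread onto the new server
curl -X POST "http://pinot-controller:9000/tables/myTable/rebalance?type=REALTIME&dryRun=false"
Running the rebalance with dryRun=true first shows the planned assignment without moving anything.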
Mayank
Georgi Varbanov
06/16/2025, 11:29 AM

Rajat
06/17/2025, 9:16 AM

Pratik Bhadane
06/17/2025, 12:42 PM

Rajat
06/19/2025, 8:05 AM

francoisa
06/23/2025, 9:19 AM
I'm getting logs like software.amazon.awssdk.services.s3.model.S3Exception: The authorization header is malformed; the region is wrong; expecting 'eu-west-1'. (Service: S3, Status Code: 400, Request ID: 184BA1F70B628FA6, Extended Request ID: 82b9e6b1548ad0837abe6ff674d1d3e982a2038442a1059f595d95962627f827)
Here is my server conf for the S3 part:
# Pinot Server Data Directory
pinot.server.instance.dataDir=/var/lib/pinot_data/server/index
# Pinot Server Temporary Segment Tar Directory
pinot.server.instance.segmentTarDir=/var/lib/pinot_data/server/segmentTar
#S3
pinot.server.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
pinot.server.storage.factory.s3.region=us-west-1
pinot.server.segment.fetcher.protocols=file,http,s3
pinot.server.storage.factory.s3.bucket.name=bucketName
pinot.server.storage.factory.s3.endpoint=URL_OF_MY_S3_ENDOINT
pinot.server.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
pinot.server.segment.fetcher.s3.pathStyleAccess=true
Any ideas welcome 🙂
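Going by the error above, the bucket appears to live in eu-west-1 while the server config pins us-west-1, so one hedged guess is simply aligning the region in the S3 filesystem config (and in the matching controller-side key if the deep store is configured there as well):
pinot.server.storage.factory.s3.region=eu-west-1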
Kiril Kalchev
06/23/2025, 11:37 AM
If I export the data with SELECT * FROM table and then re-import it using a simple tool, the size drops to just 15 MB.
I only need the aggregated data — I don’t need per-event details.
Is there a way to merge the old segments and significantly reduce table size and improve query speed using Pinot tasks?
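Assuming a MergeRollupTask with "mergeType": "rollup" is added under the table's taskTypeConfigsMap (a config of that shape appears later in this thread), the minion job can be triggered on demand through the controller's task API; the controller address and table name here are placeholders:
# schedule a merge/rollup run for one table (requires minions to be running)
curl -X POST "http://pinot-controller:9000/tasks/schedule?taskType=MergeRollupTask&tableName=myTable_OFFLINE"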
06/24/2025, 9:13 AM"streamIngestionConfig": {
"streamConfigMaps": [
{
"streamType": "kafka",
"stream.kafka.topic.name": "flattened_spans2",
"stream.kafka.broker.list": "kafka:9092",
"stream.kafka.consumer.type": "lowlevel",
"stream.kafka.consumer.prop.auto.offset.reset": "smallest",
"stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.time": "30m",
"realtime.segment.flush.threshold.segment.size": "300M"
},
{
"streamType": "kafka",
"stream.kafka.topic.name": "flattened_spans3",
"stream.kafka.broker.list": "kafka.pinot-0-nfr-setup.svc.cluster.local:9092",
"stream.kafka.consumer.type": "lowlevel",
"stream.kafka.consumer.prop.auto.offset.reset": "smallest",
"stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
"stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.time": "30m",
"realtime.segment.flush.threshold.segment.size": "300M"
}
]
}
But I am running into this issue.
First Kafka:
2025/06/20 13:21:17.528 INFO [KafkaConsumer] [otel_spans__1__0__20250620T1321Z] [Consumer clientId=otel_spans_REALTIME-flattened_spans2-1, groupId=null] Seeking to offset 0 for partition flattened_spans2-1
Second Kafka:
2025/06/20 13:22:08.659 INFO [KafkaConsumer] [otel_spans__10001__0__20250620T1321Z] [Consumer clientId=otel_spans_REALTIME-flattened_spans3-1, groupId=null] Seeking to offset 0 for partition flattened_spans3-10001
2025/06/20 13:22:08.659 INFO [KafkaConsumer] [otel_spans__10000__0__20250620T1321Z] [Consumer clientId=otel_spans_REALTIME-flattened_spans3-0, groupId=null] Seeking to offset 0 for partition flattened_spans3-10000
The flattened_spans3 topic has only partitions 1-3, but the Pinot server is seeking out partition number 10000 for some reason.
Can someone please guide me on where I'm going wrong with my config?
baarath
06/25/2025, 7:15 AM
bin/pinot-admin.sh StartServer -configFileName conf/pinot-server.conf
Aman Satya
06/25/2025, 8:54 AM
I'm running a MergeRollupTask on the sales_OFFLINE table, but it fails with a StringIndexOutOfBoundsException.
It looks like the error comes from this line:
MergeRollupTaskUtils.getLevelToConfigMap()
Here is the config that I am using:
"taskTypeConfigsMap": {
"MergeRollupTask": {
"mergeType": "rollup",
"bucketTimePeriod": "1d",
"bufferTimePeriod": "3d",
"revenue.aggregationType": "sum",
"quantity.aggregationType": "sum"
}
}
And here's the relevant part of the error:
java.lang.StringIndexOutOfBoundsException: begin 0, end -1, length 9
at ...MergeRollupTaskUtils.getLevelToConfigMap(MergeRollupTaskUtils.java:64)
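The substring arithmetic in the exception (begin 0, end -1, length 9) lines up with the bare "mergeType" key, so one hedged reading is that getLevelToConfigMap expects each bucket setting to carry a level prefix, as in the documented multi-level form. A sketch of that shape, using an arbitrary level label 1day:
"taskTypeConfigsMap": {
  "MergeRollupTask": {
    "1day.mergeType": "rollup",
    "1day.bucketTimePeriod": "1d",
    "1day.bufferTimePeriod": "3d",
    "revenue.aggregationType": "sum",
    "quantity.aggregationType": "sum"
  }
}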
mathew
06/26/2025, 8:15 AM

Jan
06/26/2025, 10:34 AM