Tony Zhang
08/11/2022, 5:28 AM
Kishore G
Tony Zhang
08/11/2022, 5:37 AM
Tony Zhang
08/11/2022, 5:38 AM
Tony Zhang
08/11/2022, 5:41 AM
At LinkedIn, we run a cluster with several million segments, thousands of tables, and hundreds of servers. Over time, we have made improvements to Pinot that help us handle this type of load. The load is on ZooKeeper. Increasing the bandwidth available to ZooKeeper and separating the Helix and Pinot controller instances are things you can do.
Mayank
Mayank
Tony Zhang
08/11/2022, 6:01 AM
Tony Zhang
08/11/2022, 6:04 AM
Mayank
Tony Zhang
08/11/2022, 4:00 PM
{
  "segment.realtime.endOffset": "140557099793",
  "segment.start.time": "1660055340000",
  "segment.time.unit": "MILLISECONDS",
  "segment.flush.threshold.size": "1966448",
  "segment.realtime.startOffset": "140555344567",
  "segment.end.time": "1660056581000",
  "segment.total.docs": "1966933",
  "segment.realtime.numReplicas": "1",
  "segment.creation.time": "1660055859719",
  "segment.index.version": "v3",
  "segment.crc": "1224605237",
  "segment.realtime.status": "DONE",
  "segment.download.url": "<s3://xxxxxx>"
}
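To put the metadata above in terms of segment count, which is what drives the ZooKeeper load mentioned earlier, here is a rough Python sketch that derives a few numbers from it. The function name `summarize_segment` and the output keys are made up for illustration and are not part of Pinot. It also assumes that the segment's data time range (`segment.start.time` to `segment.end.time`) roughly matches how long the segment took to fill, which may not hold if event times lag or arrive out of order.

```python
import json

# Segment metadata copied from the message above (download URL left redacted).
SEGMENT_METADATA = json.loads("""
{
  "segment.realtime.endOffset": "140557099793",
  "segment.start.time": "1660055340000",
  "segment.time.unit": "MILLISECONDS",
  "segment.flush.threshold.size": "1966448",
  "segment.realtime.startOffset": "140555344567",
  "segment.end.time": "1660056581000",
  "segment.total.docs": "1966933",
  "segment.realtime.numReplicas": "1",
  "segment.creation.time": "1660055859719",
  "segment.index.version": "v3",
  "segment.crc": "1224605237",
  "segment.realtime.status": "DONE",
  "segment.download.url": "<s3://xxxxxx>"
}
""")


def summarize_segment(meta: dict) -> dict:
    """Derive rough sizing numbers from one segment's metadata (hypothetical helper)."""
    if meta["segment.time.unit"] != "MILLISECONDS":
        raise ValueError("this sketch only handles MILLISECONDS")

    start_ms = int(meta["segment.start.time"])
    end_ms = int(meta["segment.end.time"])
    docs = int(meta["segment.total.docs"])
    offsets = int(meta["segment.realtime.endOffset"]) - int(
        meta["segment.realtime.startOffset"]
    )

    span_s = (end_ms - start_ms) / 1000
    return {
        "offsets_consumed": offsets,
        "time_span_seconds": span_s,
        "docs_per_second": docs / span_s,
        # Assumption: every segment of this partition covers a similar span.
        "segments_per_day": 86_400 / span_s,
    }


summary = summarize_segment(SEGMENT_METADATA)
for key, value in summary.items():
    print(f"{key}: {value:,.1f}")
```

Under those assumptions, this segment covers about 21 minutes of data, so one partition would produce roughly 70 segments per day. Multiplying by the number of partitions, tables, and retention days gives an estimate of how many segment records ZooKeeper has to hold.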
Mayank
Tony Zhang
08/11/2022, 4:06 PM
Mayank
Tony Zhang
08/11/2022, 4:07 PM
Tony Zhang
08/11/2022, 4:09 PM
Mayank
Mayank
Tony Zhang
08/11/2022, 4:18 PM
zookeeper:
  ## Replicas
  replicaCount: 3
  autopurge:
    ## The time interval in hours for which the purge task has to be triggered. Set to a positive integer (1 and above) to enable the auto purging.
    ##
    purgeInterval: 1
  resources:
    requests:
      memory: 8Gi
      cpu: 2
    limits:
      cpu: 2
      memory: 8Gi
Tony Zhang
08/11/2022, 4:19 PM
controller:
  replicaCount: 3
  persistence:
    size: "600Gi"
    mountPath: /var/pinot/server/data
  resources:
    requests:
      memory: "24Gi"
      cpu: "6"
    limits:
      cpu: "6"
      memory: "24Gi"
Tony Zhang
08/11/2022, 4:20 PM
Mayank
Tony Zhang
08/11/2022, 4:27 PM
Mayank
Tony Zhang
08/11/2022, 4:31 PM
Tony Zhang
08/11/2022, 4:32 PM
Mayank
Tony Zhang
08/11/2022, 10:20 PM
Mayank