# troubleshooting
j
Hello, we are migrating to Pinot and we have come across an interesting problem. While estimating the required infrastructure for our Pinot cluster, we have observed that when we have a small number of server nodes (we use c5.18xlarge), the CPU utilization for a single query is relatively low on each server, leading to longer query runtimes. With Trino, we always get full utilization of the cluster (unless limited by resource groups). We have the
pinot.server.query.executor.max.execution.threads
set to -1, and thus have no upper limit on the number of threads used for a single query. Has anyone experienced similar behavior?
m
Are you saying that the queries become faster when called via Trino, or that CPU utilization is higher? If the latter, it could be because Trino might be calling Pinot to fetch all the data (with filter push-down).
j
Sorry for the confusing description šŸ™‚. Let me rephrase:
- When we run queries on Trino (querying Hive tables, no Pinot involvement), we always see that Trino tries to utilize all available nodes and cores; this is by design.
- When we run queries on Pinot (querying Pinot tables, no Trino involvement), we observe that Pinot utilizes only a fraction of its available resources. Is this by design as well?
m
Yes, Pinot will try to minimize the work done for a query (via indexing, metadata, etc.). It does have good default heuristics for using the available cores efficiently.
šŸ‘ 1
If you are seeing low CPU utilization but your queries are slow, then you likely have an IO bottleneck.
šŸ‘ 1
j
Thank you for the response; our use case is very similar to the one described here: https://apache-pinot.slack.com/archives/CDRCA57FC/p1662027019360169
We have a table of around 2 TB, partitioned into 11k files, and we use gp3 EBS volumes.
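(For reference, gp3 volumes default to 3,000 IOPS and 125 MB/s of throughput, and both can be raised independently of volume size if the disks turn out to be the bottleneck. A hypothetical example with a placeholder volume id:)
aws ec2 modify-volume \
  --volume-id vol-0123456789abcdef0 \
  --iops 8000 \
  --throughput 500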
m
Do you have metrics set up? Are you seeing heavy IO? What I am unclear on is whether you are just curious about the low CPU utilization, or whether it is translating into a real issue (e.g. poor latency or throughput)?
j
This is a real problem for us: the query latency is too high to be acceptable, and we need sufficient resource utilization. For our use case, the query throughput does not need to be high; in fact, it can be low (a few dozen up to ~100 QPS). Our primary constraint is query latency (acceptable: <= 10 s). We run queries on datasets of 1-10B records. I will do some measurements and share the findings.
We will also investigate the cluster's I/O operations as the potential culprit.
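(For the measurements, we will look at the per-query execution stats the broker returns from its SQL endpoint; a sketch, assuming the broker host is pinot-broker and using a placeholder query:)
curl -s -X POST http://pinot-broker:8099/query/sql \
  -H "Content-Type: application/json" \
  -d '{"sql": "SELECT COUNT(*) FROM mytable"}'
# the response metadata includes timeUsedMs, numSegmentsProcessed, numEntriesScannedInFilter, etc.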
m
Yes, it sounds like that might be the culprit. If you could also share the cluster config (DM if you prefer), I can review the JVM settings, instance type, etc. as well.
šŸ‘ 1
j
Here is the configuration of our test cluster:
################ controller ################

Controller config:

cluster.tenant.isolation.enable=true
controller.helix.cluster.name=PinotCluster
controller.host=pinot-controller
controller.port=9000
controller.vip.host=pinot-controller
controller.vip.port=9000
#controller.data.dir=/var/pinot/controller/data
controller.local.temp.dir=/var/pinot/controller/data
controller.zk.str=zookeeper:2181
pinot.set.instance.id.to.hostname=true
pinot.server.grpc.enable=true
pinot.server.grpc.port=8090

Container config:

- CPU limit: 0.0
- memory limit: 25g

JVM config:

-Xms20g
-Xmx20g

################ broker ################

pinot.broker.client.queryPort=8099
pinot.broker.enable.query.limit.override=true
pinot.broker.query.response.limit=2147483647
pinot.broker.routing.table.builder.class=random
pinot.broker.startup.minResourcePercent=100.0
pinot.broker.timeoutMs=60000
pinot.set.instance.id.to.hostname=true
pinot.server.grpc.enable=true
pinot.server.grpc.port=8090

Container config:

- CPU limit: 0.0
- memory limit: 40g

JVM config:

-Xms35g
-Xmx35g

################ server ################

pinot.server.grpc.enable=true
pinot.server.grpc.port=8090
pinot.server.netty.host=pinot-server1
pinot.server.netty.port=8098
pinot.server.adminapi.port=8097
pinot.server.instance.dataDir=/opt/pinot/data
pinot.server.instance.segmentTarDir=/opt/pinot/segment_tar
pinot.server.query.executor.timeout=120000
pinot.server.query.executor.max.execution.threads=-1
pinot.server.query.executor.max.init.group.holder.capacity=10000
pinot.server.query.executor.num.groups.limit=100000
pinot.server.query.executor.min.segment.group.trim.size=-1
pinot.server.query.executor.min.server.group.trim.size=20000
pinot.server.query.executor.groupby.trim.threshold=100000000
pinot.set.instance.id.to.hostname=true

Container config:

- CPU limit: 0.0
- memory limit: 120g

JVM config:

-Xms100g
-Xmx100g
m
OK, the problem with the server is that you have given most of the memory to the Java heap, so there is little memory left for mmap, and hence it is causing heavy IO.
šŸ‘ 1
I’d recommend no more than 32 GB of heap (typically 16 GB) for Pinot components.
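For example, on your 120 GB server containers, something like the following would leave roughly 100 GB for the OS page cache, so the mmap'd segment data can stay memory-resident:
-Xms16g
-Xmx16g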
@Juraj Pohanka
j
Aha, I see
Thank you very much! We will try it with less heap.
šŸ‘ 1