#general

Shounak Kulkarni

04/14/2020, 6:25 PM
I was running a performance test (PT) on the Pinot setup (3 brokers, 3 servers, 3 controllers). There are currently 20,000,000+ entries (64 segments across all 3 servers, about 35 MB per segment). The Kafka topic has 3 partitions and replicationPerPartition is 2 in Pinot. With a minimal load of 100 rps the brokers are going down. Logs given below.
The pods are stable now, but latency is very poor. At 80 rps latency was 5 ms, but at 100 rps it goes beyond 1 s. There are no error logs on any container.

Kishore G

04/15/2020, 1:47 PM
What are the Kubernetes container settings? Have you added any indexes?

Shounak Kulkarni

04/15/2020, 1:48 PM
Yes, I have applied an inverted index to the columns used in the WHERE clause of the queries.
broker - 4 GB, 1200 millicores
server - 5 GB, 1500 millicores
controller - 3 GB, 500 millicores

Kishore G

04/15/2020, 1:58 PM
Can you paste a sample query?
And response
Also, you are probably hitting a capacity limit. 1.5 cores is not enough to do performance testing.

Shounak Kulkarni

04/15/2020, 2:10 PM
select MAX(timeSinceEpoch), modelYearCode FROM vehicleTable where timeSinceEpoch >= 1584988200 and userId = 880560022 group by modelYearCode limit 10
{
  "resultTable": {
    "dataSchema": {
      "columnDataTypes": [
        "DOUBLE",
        "STRING"
      ],
      "columnNames": [
        "max(timeSinceEpoch)",
        "modelYearCode"
      ]
    },
    "rows": [
      [
        1585835137,
        "CUJ202008"
      ],
      [
        1585717081,
        "CUJ202009"
      ],
      [
        1585741681,
        "CUJ202010"
      ],
      [
        1585800122,
        "CUJ202011"
      ],
      [
        1585839286,
        "CUJ202003"
      ],
      [
        1585828710,
        "CUJ202014"
      ]
    ]
  },
  "exceptions": [],
  "numServersQueried": 3,
  "numServersResponded": 3,
  "numSegmentsQueried": 38,
  "numSegmentsProcessed": 15,
  "numSegmentsMatched": 15,
  "numConsumingSegmentsQueried": 3,
  "numDocsScanned": 309,
  "numEntriesScannedInFilter": 349,
  "numEntriesScannedPostFilter": 618,
  "numGroupsLimitReached": false,
  "totalDocs": 22051919,
  "timeUsedMs": 5,
  "segmentStatistics": [],
  "traceInfo": {},
  "minConsumingFreshnessTimeMs": 9223372036854776000
}
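As a rough reading aid for the response above, the sketch below derives a few ratios from its metadata fields. The function name `summarize` and the output keys are illustrative only; the field names and values are copied from the pasted response.

```python
def summarize(stats: dict) -> dict:
    """Derive simple ratios from a Pinot broker response's metadata fields."""
    return {
        # Segments that were queried but did not need to be processed.
        "segments_skipped": stats["numSegmentsQueried"] - stats["numSegmentsProcessed"],
        # Share of the table's documents that were actually scanned.
        "fraction_docs_scanned": stats["numDocsScanned"] / stats["totalDocs"],
        # True when every queried server sent back a response.
        "all_servers_responded": stats["numServersQueried"] == stats["numServersResponded"],
    }


# Values copied from the response pasted in the thread.
response_stats = {
    "numServersQueried": 3,
    "numServersResponded": 3,
    "numSegmentsQueried": 38,
    "numSegmentsProcessed": 15,
    "numDocsScanned": 309,
    "totalDocs": 22051919,
    "timeUsedMs": 5,
}

summary = summarize(response_stats)

if __name__ == "__main__":
    for key, value in summary.items():
        print(key, value)
```

Only a tiny share of the roughly 22 million documents is scanned, which is consistent with the 5 ms reported in `timeUsedMs`.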
Regarding hitting capacity: CPU usage for the pods does not even go beyond 1 core. Resource utilization during the PT:
NAME                 CPU(cores)   MEMORY(bytes)
pinot-broker-0       179m         994Mi
pinot-broker-1       36m          1006Mi
pinot-broker-2       28m          1004Mi
pinot-controller-0   4m           913Mi
pinot-controller-1   4m           926Mi
pinot-controller-2   10m          944Mi
pinot-server-0       70m          1481Mi
pinot-server-1       124m         1477Mi
pinot-server-2       127m         1475Mi
pinot-zookeeper-0    14m          382Mi
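To put the numbers above next to the container settings quoted earlier in the thread (broker 1200m, server 1500m, controller 500m), here is a minimal sketch that parses the pasted table and reports CPU usage as a fraction of each pod's setting. It assumes the quoted millicore values are the CPU limits and that pod names follow the `pinot-<role>-<n>` pattern seen above; `cpu_utilization` is a hypothetical helper, not part of any tool.

```python
# Table as pasted in the thread.
TOP_OUTPUT = """\
NAME                 CPU(cores)   MEMORY(bytes)
pinot-broker-0       179m         994Mi
pinot-broker-1       36m          1006Mi
pinot-broker-2       28m          1004Mi
pinot-controller-0   4m           913Mi
pinot-controller-1   4m           926Mi
pinot-controller-2   10m          944Mi
pinot-server-0       70m          1481Mi
pinot-server-1       124m         1477Mi
pinot-server-2       127m         1475Mi
pinot-zookeeper-0    14m          382Mi
"""

# Assumption: the millicore figures quoted in the thread are CPU limits.
CPU_LIMIT_MILLICORES = {"broker": 1200, "server": 1500, "controller": 500}


def cpu_utilization(top_output: str, limits: dict) -> dict:
    """Map pod name to CPU usage as a fraction of its role's limit."""
    result = {}
    for line in top_output.strip().splitlines()[1:]:  # skip the header row
        name, cpu, _memory = line.split()
        role = name.split("-")[1]  # "pinot-broker-0" -> "broker"
        if role not in limits:
            continue  # no limit was quoted for this role (e.g. zookeeper)
        result[name] = int(cpu.rstrip("m")) / limits[role]
    return result


utilization = cpu_utilization(TOP_OUTPUT, CPU_LIMIT_MILLICORES)

if __name__ == "__main__":
    for pod, fraction in sorted(utilization.items()):
        print(f"{pod}: {fraction:.1%}")
```

Note that broker-0 carries several times the CPU load of the other two brokers, which may be worth checking on the load-balancing side.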

Kishore G

04/15/2020, 2:13 PM
Do you have an inverted index or a sorted index on userId?

Shounak Kulkarni

04/15/2020, 2:14 PM
inverted index
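For reference, both index types are declared in the `tableIndexConfig` section of the Pinot table config: `invertedIndexColumns` for an inverted index and `sortedColumn` for a sorted one. The sketch below only builds that fragment. The helper `index_config` is hypothetical, and the rest of the table config (schema, stream settings, and so on) is omitted because it was not shared in the thread.

```python
import json


def index_config(column: str, kind: str) -> dict:
    """Return a tableIndexConfig fragment that indexes `column`.

    kind is "inverted" or "sorted".
    """
    if kind == "inverted":
        return {"tableIndexConfig": {"invertedIndexColumns": [column]}}
    if kind == "sorted":
        return {"tableIndexConfig": {"sortedColumn": [column]}}
    raise ValueError(f"unknown index kind: {kind!r}")


if __name__ == "__main__":
    # Fragment matching the inverted index on userId described in the thread.
    print(json.dumps(index_config("userId", "inverted"), indent=2))
    # Alternative fragment using a sorted index on the same column.
    print(json.dumps(index_config("userId", "sorted"), indent=2))
```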

Kishore G

04/15/2020, 2:15 PM
try sorted index
but even this query is using 5ms
so even if you run sequentially, you should be getting 100+ qps
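The arithmetic behind that estimate, as a small sketch: a single client that issues one query at a time can complete at most one query per round trip. The function name is illustrative, and the calculation assumes the 5 ms broker time is the whole round trip, which ignores network and client overhead.

```python
def sequential_qps(latency_ms: float) -> float:
    """Upper bound on queries per second for one client issuing queries back to back."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # 5 ms per query, as reported by timeUsedMs in the thread.
    print(sequential_qps(5))
```

At 5 ms per query this gives 200 queries per second for one sequential client, in line with the "100+ qps" estimate.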

Shounak Kulkarni

04/15/2020, 2:17 PM
So what exactly does the "timeUsedMs" field measure?

Kishore G

04/15/2020, 2:17 PM
total time at the broker before responding
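One way to see how much time is spent outside the broker is to time a query from the client and subtract the `timeUsedMs` value in the response. The sketch below assumes the broker's `/query/sql` REST endpoint is reachable from the client. The broker URL in the usage note is a placeholder, and `post_sql` and `compare_latency` are hypothetical helper names.

```python
import json
import time
import urllib.request


def post_sql(broker_url: str, sql: str) -> dict:
    """POST a SQL query to the broker's /query/sql endpoint and decode the JSON reply."""
    request = urllib.request.Request(
        broker_url.rstrip("/") + "/query/sql",
        data=json.dumps({"sql": sql}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def compare_latency(send, sql: str) -> dict:
    """Time one query on the client and compare it with the broker-reported time.

    `send` is any callable that takes the SQL string and returns the decoded
    broker response, which keeps the timing logic independent of the transport.
    """
    start = time.perf_counter()
    response = send(sql)
    client_ms = (time.perf_counter() - start) * 1000.0
    broker_ms = response["timeUsedMs"]
    return {
        "client_ms": client_ms,
        "broker_ms": broker_ms,
        # Time not accounted for by the broker: network, serialization, client.
        "outside_broker_ms": client_ms - broker_ms,
    }
```

Usage against a live broker would look like `compare_latency(lambda q: post_sql("http://localhost:8099", q), "select count(*) from vehicleTable")`, with the URL replaced by the actual broker address.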

Shounak Kulkarni

04/15/2020, 2:19 PM
So the latency shown by the PT tool (270 ms) is all network latency?

Kishore G

04/15/2020, 2:20 PM
Yes. Where are you running the PT tool?

Shounak Kulkarni

04/15/2020, 2:21 PM
on a separate vm

Kishore G

04/15/2020, 2:21 PM
yeah, your client might not be scaling
or network latency
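A quick way to check whether the client could be the bottleneck is Little's law: the number of requests in flight equals throughput times latency. The sketch below assumes a closed-loop load generator, where each connection waits for a response before sending the next request. The function names are illustrative, and the 100 rps and 270 ms figures come from the thread.

```python
import math


def required_concurrency(target_rps: int, latency_ms: int) -> int:
    """Requests that must be in flight to sustain target_rps at the given latency."""
    return math.ceil(target_rps * latency_ms / 1000)


def per_connection_rps(latency_ms: int) -> float:
    """Throughput of one connection that sends requests back to back."""
    return 1000 / latency_ms


if __name__ == "__main__":
    # 100 rps target and 270 ms client-observed latency, as quoted in the thread.
    print(required_concurrency(100, 270))
    print(per_connection_rps(270))
```

With these numbers the load generator needs 27 concurrent in-flight requests to reach 100 rps. If it runs with fewer workers or connections than that, throughput is capped on the client side regardless of how fast the broker answers.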

Shounak Kulkarni

04/15/2020, 2:30 PM
Is it possible that traffic congestion is happening at brokers?

Kishore G

04/15/2020, 2:31 PM
Not at this qps.
I don't know if Kubernetes is limiting the network I/O.

Shounak Kulkarni

04/15/2020, 2:32 PM
Yes, the brokers are not even using that many resources.
I'll check the Kubernetes side.

Kishore G

04/15/2020, 2:35 PM
Is this real-time?

Shounak Kulkarni

04/15/2020, 2:36 PM
Yes, the table type is realtime, but I'm currently working with statically generated entries in Kafka.

Kishore G

04/15/2020, 2:39 PM
That's ok.