#troubleshooting

Luis Fernandez

06/08/2022, 7:39 PM
Another question related to importing data: we have imported our last 2 years' worth of data into Pinot using the standalone job in dev. However, we are observing different behavior for the same query between prod and dev (prod doesn't have this historical data yet, but it does have data for this particular time range). Performance in dev is way slower. This is the query:
SELECT product_id, SUM(impression_count) as impression_count, SUM(click_count) as click_count, SUM(cost) as spent_total FROM metrics WHERE user_id = xxx AND serve_time BETWEEN 1651363200 AND 1654012799 GROUP BY product_id LIMIT 6000
production metadata response:
"numServersQueried": 4,
  "numServersResponded": 4,
  "numSegmentsQueried": 97,
  "numSegmentsProcessed": 31,
  "numSegmentsMatched": 31,
  "numConsumingSegmentsQueried": 1,
  "numDocsScanned": 15109,
  "numEntriesScannedInFilter": 0,
  "numEntriesScannedPostFilter": 60436,
  "numGroupsLimitReached": false,
  "totalDocs": 493642793,
  "timeUsedMs": 32,
  "offlineThreadCpuTimeNs": 0,
  "realtimeThreadCpuTimeNs": 0,
  "offlineSystemActivitiesCpuTimeNs": 0,
  "realtimeSystemActivitiesCpuTimeNs": 0,
  "offlineResponseSerializationCpuTimeNs": 0,
  "realtimeResponseSerializationCpuTimeNs": 0,
  "offlineTotalCpuTimeNs": 0,
  "realtimeTotalCpuTimeNs": 0,
  "segmentStatistics": [],
  "traceInfo": {},
  "minConsumingFreshnessTimeMs": 1654715649414,
  "numRowsResultSet": 9708
dev metadata response:
"exceptions": [],
  "numServersQueried": 4,
  "numServersResponded": 4,
  "numSegmentsQueried": 11703,
  "numSegmentsProcessed": 31,
  "numSegmentsMatched": 31,
  "numConsumingSegmentsQueried": 1,
  "numDocsScanned": 15117,
  "numEntriesScannedInFilter": 0,
  "numEntriesScannedPostFilter": 60468,
  "numGroupsLimitReached": false,
  "totalDocs": 51283295726,
  "timeUsedMs": 580,
  "offlineThreadCpuTimeNs": 0,
  "realtimeThreadCpuTimeNs": 0,
  "offlineSystemActivitiesCpuTimeNs": 0,
  "realtimeSystemActivitiesCpuTimeNs": 0,
  "offlineResponseSerializationCpuTimeNs": 0,
  "realtimeResponseSerializationCpuTimeNs": 0,
  "offlineTotalCpuTimeNs": 0,
  "realtimeTotalCpuTimeNs": 0,
  "segmentStatistics": [],
  "traceInfo": {},
  "minConsumingFreshnessTimeMs": 1654716958681,
  "numRowsResultSet": 9708
Number of segments in prod: 1600. Number of segments in dev: 13000. I guess my question is: I see numSegmentsQueried is way higher in dev, and I'm wondering why, and whether that's the reason the query performs slower in dev. In dev it's almost equal to the number of segments that exist in the cluster, while prod is only querying a tiny portion. Do you have an idea as to what may be happening?
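To put the two responses side by side, here is a minimal sketch that computes what share of each cluster's segments the query was routed to. The numbers are copied from the message above; the key `totalSegments` and both helper names are made up for this illustration, and the segment totals are the approximate counts quoted in the message.

```python
# Figures copied from the two metadata responses and the segment counts above.
# "totalSegments" is an illustrative key, not a field Pinot returns.
prod = {"numSegmentsQueried": 97, "numSegmentsProcessed": 31,
        "totalSegments": 1600, "timeUsedMs": 32}
dev = {"numSegmentsQueried": 11703, "numSegmentsProcessed": 31,
       "totalSegments": 13000, "timeUsedMs": 580}


def queried_fraction(stats):
    """Share of the table's segments that the query was routed to."""
    return stats["numSegmentsQueried"] / stats["totalSegments"]


def queried_but_not_processed(stats):
    """Segments the query was routed to that ended up not being processed."""
    return stats["numSegmentsQueried"] - stats["numSegmentsProcessed"]


for name, stats in (("prod", prod), ("dev", dev)):
    print(f"{name}: queried {queried_fraction(stats):.1%} of segments, "
          f"{queried_but_not_processed(stats)} queried but not processed, "
          f"{stats['timeUsedMs']} ms")
```

Both clusters process the same 31 segments, but prod is routed to roughly 6% of its segments while dev is routed to roughly 90%.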

Mayank

06/08/2022, 9:26 PM
Probably your dev data is not partitioned.

Luis Fernandez

06/08/2022, 9:29 PM
i thought partitioning was only for QPS

Mayank

06/08/2022, 9:30 PM
Is your question on "numSegmentsQueried": 11703?

Luis Fernandez

06/08/2022, 9:30 PM
yes

Mayank

06/08/2022, 9:31 PM
Partitioning improves QPS by reducing numSegmentsQueried.
Also, your VM configs may differ between dev and prod
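The point about partitioning reducing numSegmentsQueried can be sketched as follows. This is a simplified model, not Pinot's implementation: `partition_of` is a modulo stand-in (Pinot's "Murmur" function computes different values), and `Segment` and `prune` are hypothetical names. The idea is that a segment can only be skipped for a given user_id if its recorded partition set does not contain that user's partition.

```python
from dataclasses import dataclass

NUM_PARTITIONS = 8  # matches "numPartitions": 8 quoted in this thread


def partition_of(user_id: int, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in for illustration only; NOT Pinot's Murmur partition function.
    return user_id % num_partitions


@dataclass(frozen=True)
class Segment:
    name: str
    partitions: frozenset  # partition ids recorded in the segment's metadata


def prune(segments, user_id):
    """Keep only the segments that could contain rows for this user_id."""
    target = partition_of(user_id)
    return [s for s in segments if target in s.partitions]


# Case 1: each segment holds rows from exactly one partition.
partitioned = [Segment(f"seg_p{p}", frozenset({p}))
               for p in range(NUM_PARTITIONS)]
# Case 2: every segment holds rows from every partition.
unpartitioned = [Segment(f"seg_{i}", frozenset(range(NUM_PARTITIONS)))
                 for i in range(NUM_PARTITIONS)]

print(len(prune(partitioned, user_id=42)))    # 1 segment left to query
print(len(prune(unpartitioned, user_id=42)))  # all 8 segments still queried
```

In the second case nothing can be skipped, so every segment in the time range still has to be queried.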

Luis Fernandez

06/08/2022, 9:31 PM
dev is a replica of prod in terms of specs
the only difference i see between the 2 metadata responses is the numSegmentsQueried

Mayank

06/08/2022, 9:32 PM
Num nodes, jvm configs, all other variables match exactly?

Luis Fernandez

06/08/2022, 9:32 PM
yes everything is the same

Mayank

06/08/2022, 9:32 PM
Check segment metadata in dev to see if it shows 1 partition or multiple

Luis Fernandez

06/08/2022, 9:33 PM
dev however has way more data because we haven’t run that import
on prod
"segment.partition.metadata": "{\"columnPartitionMap\":{\"user_id\":{\"numPartitions\":8,\"partitions\":[0,1,2,3,4,5,6,7],\"functionName\":\"Murmur\",\"functionConfig\":null}}}"

Mayank

06/08/2022, 9:35 PM
It is not partitioned ^^

Luis Fernandez

06/08/2022, 9:35 PM
that means it's not partitioned?

Mayank

06/08/2022, 9:35 PM
Yes
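The check on the segment metadata can be sketched in a few lines. The JSON string is the "segment.partition.metadata" value pasted above with the escaping removed; `prunable_columns` is a hypothetical helper name. It reports, per column, whether the segment covers only a subset of the partitions. A segment that lists all 8 of 8 partitions, as this one does, gives a partition pruner nothing to skip.

```python
import json

# Value of "segment.partition.metadata" from the thread, JSON-unescaped.
raw = ('{"columnPartitionMap":{"user_id":{"numPartitions":8,'
       '"partitions":[0,1,2,3,4,5,6,7],'
       '"functionName":"Murmur","functionConfig":null}}}')


def prunable_columns(partition_metadata: str) -> dict:
    """Map each partitioned column to True when the segment covers only a
    subset of that column's partitions, False when it covers all of them."""
    column_map = json.loads(partition_metadata)["columnPartitionMap"]
    return {
        column: len(set(info["partitions"])) < info["numPartitions"]
        for column, info in column_map.items()
    }


print(prunable_columns(raw))  # {'user_id': False}
```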

Luis Fernandez

06/08/2022, 9:35 PM
would that explain that bottleneck we are seeing?
like that difference in response times?

Mayank

06/08/2022, 9:37 PM
It might, depending on main memory vs total segment size on disk, whether you are running other load at the time, and other factors.

Luis Fernandez

06/08/2022, 9:38 PM
all the queries in general are way slower in dev than they are on prod
for the same time windows that prod has data for
and the only variable i see is the numSegmentsQueried

Mayank

06/08/2022, 9:39 PM
And that’s what I am explaining above.

Luis Fernandez

06/08/2022, 9:41 PM
i have this config on offline
"segmentPartitionConfig": {
  "columnPartitionMap": {
    "user_id": {
      "functionName": "Murmur",
      "numPartitions": 8
    }
  }
},
"routing": {
  "segmentPrunerTypes": [
    "partition"
  ]
},
this means that whatever generates the data (which the offline table then ingests through the standalone job) is responsible for partitioning it, not pinot itself, yes?
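This excerpt ends before the question is answered. If the input does have to be partitioned before the standalone job runs, a minimal sketch of that step could look like the code below: rows are grouped by partition id so that each output group (and, by assumption, each input file and resulting segment) contains a single partition. `partition_of` is a modulo stand-in and `split_by_partition` is a hypothetical helper; a real pipeline would have to use the same function and partition count as the table's segmentPartitionConfig.

```python
from collections import defaultdict


def partition_of(user_id: int, num_partitions: int) -> int:
    # Stand-in for illustration only; NOT Pinot's Murmur partition function.
    return user_id % num_partitions


def split_by_partition(rows, num_partitions=8):
    """Group rows so that each bucket holds rows from exactly one partition."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[partition_of(row["user_id"], num_partitions)].append(row)
    return dict(buckets)


# Made-up sample rows; only user_id matters for the grouping.
rows = [{"user_id": uid, "product_id": pid}
        for uid, pid in [(1, "a"), (9, "b"), (2, "c"), (17, "d")]]

buckets = split_by_partition(rows)
for partition_id, bucket in sorted(buckets.items()):
    # Each bucket would be written to its own input file.
    print(partition_id, [row["user_id"] for row in bucket])
```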