Luis Fernandez
06/08/2022, 7:39 PM
SELECT product_id,
       SUM(impression_count) AS impression_count,
       SUM(click_count) AS click_count,
       SUM(cost) AS spent_total
FROM metrics
WHERE user_id = xxx
  AND serve_time BETWEEN 1651363200 AND 1654012799
GROUP BY product_id
LIMIT 6000
production metadata response:
"numServersQueried": 4,
"numServersResponded": 4,
"numSegmentsQueried": 97,
"numSegmentsProcessed": 31,
"numSegmentsMatched": 31,
"numConsumingSegmentsQueried": 1,
"numDocsScanned": 15109,
"numEntriesScannedInFilter": 0,
"numEntriesScannedPostFilter": 60436,
"numGroupsLimitReached": false,
"totalDocs": 493642793,
"timeUsedMs": 32,
"offlineThreadCpuTimeNs": 0,
"realtimeThreadCpuTimeNs": 0,
"offlineSystemActivitiesCpuTimeNs": 0,
"realtimeSystemActivitiesCpuTimeNs": 0,
"offlineResponseSerializationCpuTimeNs": 0,
"realtimeResponseSerializationCpuTimeNs": 0,
"offlineTotalCpuTimeNs": 0,
"realtimeTotalCpuTimeNs": 0,
"segmentStatistics": [],
"traceInfo": {},
"minConsumingFreshnessTimeMs": 1654715649414,
"numRowsResultSet": 9708
dev metadata response:
"exceptions": [],
"numServersQueried": 4,
"numServersResponded": 4,
"numSegmentsQueried": 11703,
"numSegmentsProcessed": 31,
"numSegmentsMatched": 31,
"numConsumingSegmentsQueried": 1,
"numDocsScanned": 15117,
"numEntriesScannedInFilter": 0,
"numEntriesScannedPostFilter": 60468,
"numGroupsLimitReached": false,
"totalDocs": 51283295726,
"timeUsedMs": 580,
"offlineThreadCpuTimeNs": 0,
"realtimeThreadCpuTimeNs": 0,
"offlineSystemActivitiesCpuTimeNs": 0,
"realtimeSystemActivitiesCpuTimeNs": 0,
"offlineResponseSerializationCpuTimeNs": 0,
"realtimeResponseSerializationCpuTimeNs": 0,
"offlineTotalCpuTimeNs": 0,
"realtimeTotalCpuTimeNs": 0,
"segmentStatistics": [],
"traceInfo": {},
"minConsumingFreshnessTimeMs": 1654716958681,
"numRowsResultSet": 9708
number of segments in prod: 1600
number of segments in dev: 13000
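To put the two responses side by side, here is a minimal Python sketch that computes what share of each cluster's segments was queried, using only the numbers posted above. The dictionary layout, the `totalSegments` key, and the function name are illustrative and not part of Pinot; the segment totals are the rounded figures quoted in the message.

```python
# Numbers copied from the two metadata responses and the segment counts above.
# "totalSegments" is a hypothetical key holding the rounded counts from the message.
clusters = {
    "prod": {"numSegmentsQueried": 97, "numSegmentsProcessed": 31,
             "totalSegments": 1600, "timeUsedMs": 32},
    "dev": {"numSegmentsQueried": 11703, "numSegmentsProcessed": 31,
            "totalSegments": 13000, "timeUsedMs": 580},
}


def queried_fraction(stats):
    """Share of the cluster's segments reported as queried for this request."""
    return stats["numSegmentsQueried"] / stats["totalSegments"]


for name, stats in clusters.items():
    print(f"{name}: {queried_fraction(stats):.1%} of segments queried, "
          f"{stats['numSegmentsProcessed']} processed, "
          f"{stats['timeUsedMs']} ms")
```

With these figures, prod queries roughly 6% of its segments and dev roughly 90%, while both end up processing the same 31 segments.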
I guess my question is: I see segments queried is way higher in dev, and I'm wondering why, and whether that's the reason the query performs slower there. In dev it's almost equal to the number of segments that exist in the cluster, while prod is only querying a tiny portion. Do you have an idea as to what may be happening?
Mayank
Luis Fernandez
06/08/2022, 9:29 PM
Mayank
"numSegmentsQueried": 11703,
?
Luis Fernandez
06/08/2022, 9:30 PM
Mayank
numSegmentsQueried
Mayank
Luis Fernandez
06/08/2022, 9:31 PM
Luis Fernandez
06/08/2022, 9:32 PM
numSegmentsQueried
Mayank
Luis Fernandez
06/08/2022, 9:32 PM
Mayank
Luis Fernandez
06/08/2022, 9:33 PM
Luis Fernandez
06/08/2022, 9:33 PM
Luis Fernandez
06/08/2022, 9:34 PM
"segment.partition.metadata": "{\"columnPartitionMap\":{\"user_id\":{\"numPartitions\":8,\"partitions\":[0,1,2,3,4,5,6,7],\"functionName\":\"Murmur\",\"functionConfig\":null}}}"
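As a rough way to read the metadata above, the sketch below parses the posted value and checks whether the segment holds only a subset of the partitions. It assumes that partition-based pruning can only skip a segment when the partition of the queried `user_id` is absent from the segment's `partitions` list; the function name `can_prune_on_partition` is hypothetical and not a Pinot API.

```python
import json

# Value of "segment.partition.metadata" as posted above, with the outer
# JSON string escaping removed.
raw_metadata = (
    '{"columnPartitionMap":{"user_id":{"numPartitions":8,'
    '"partitions":[0,1,2,3,4,5,6,7],'
    '"functionName":"Murmur","functionConfig":null}}}'
)


def can_prune_on_partition(metadata_json, column):
    """Return True if the segment covers only some partitions of `column`.

    Assumption: a segment listing every partition can never be excluded
    by a partition-based pruner, whatever value the query filters on.
    """
    info = json.loads(metadata_json)["columnPartitionMap"][column]
    return len(set(info["partitions"])) < info["numPartitions"]


print(can_prune_on_partition(raw_metadata, "user_id"))
```

For the metadata shown, this returns `False`, because the segment reports all 8 partitions for `user_id`.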
Mayank
Luis Fernandez
06/08/2022, 9:35 PM
Mayank
Luis Fernandez
06/08/2022, 9:35 PM
Luis Fernandez
06/08/2022, 9:35 PM
Mayank
Luis Fernandez
06/08/2022, 9:38 PM
Luis Fernandez
06/08/2022, 9:39 PM
Luis Fernandez
06/08/2022, 9:39 PM
Mayank
Luis Fernandez
06/08/2022, 9:41 PM
"segmentPartitionConfig": {
  "columnPartitionMap": {
    "user_id": {
      "functionName": "Murmur",
      "numPartitions": 8
    }
  }
},
"routing": {
  "segmentPrunerTypes": [
    "partition"
  ]
},
Luis Fernandez
06/08/2022, 9:42 PM