Kishore G
Kishore G
Elon
12/18/2020, 6:23 PM
-XX:SoftRefLRUPolicyMSPerMB=0 and combined with java11 (pr coming in the next 2 days) we are seeing that gc clears more soft references than before. Found this on the jvm mailing list: https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-April/025586.html
Elon
12/18/2020, 6:24 PM
samarth
12/21/2020, 4:30 AM
-XX:ActiveProcessorCount java opts, so that I can override the number of available processors and work around the need to request a high number of CPUs in the k8s deployment.
Wanted to do some perf tests to find the optimal value.
/**
 * Use at most 10 or half of the processors threads for each query. If there are less than 2 processors, use 1 thread.
 * <p>NOTE: Runtime.getRuntime().availableProcessors() may return value < 2 in container based environment, e.g.
 * Kubernetes.
 */
public static final int MAX_NUM_THREADS_PER_QUERY =
    Math.max(1, Math.min(10, Runtime.getRuntime().availableProcessors() / 2));
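To see how this formula behaves for different CPU counts, here is a minimal sketch that pulls the same expression into a helper. The class name ThreadBudget and the method maxThreadsPerQuery are made up for illustration and are not part of Pinot. The -XX:ActiveProcessorCount flag mentioned above changes what availableProcessors() reports to the JVM.

```java
public class ThreadBudget {
    // Same expression as MAX_NUM_THREADS_PER_QUERY, with the processor count passed in
    // so different container sizes can be compared. Hypothetical helper, not Pinot code.
    static int maxThreadsPerQuery(int availableProcessors) {
        return Math.max(1, Math.min(10, availableProcessors / 2));
    }

    public static void main(String[] args) {
        // What this JVM reports; starting it with -XX:ActiveProcessorCount=<n> overrides the value.
        int detected = Runtime.getRuntime().availableProcessors();
        System.out.println("availableProcessors=" + detected
            + " -> threads per query=" + maxThreadsPerQuery(detected));

        // Compare a few container sizes.
        for (int cpus : new int[] {1, 2, 4, 8, 20, 64}) {
            System.out.println(cpus + " processors -> " + maxThreadsPerQuery(cpus) + " threads");
        }
    }
}
```

With 1 or 2 reported processors the formula yields a single thread per query, and the cap of 10 is only reached at 20 or more processors, so a perf test would mainly need to cover the range in between.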
https://github.com/apache/incubator-pinot/blob/master/pinot-core/src/main/java/org/apache/pinot/core/operator/combine/CombineOperatorUtils.java#[…]8
Ken Krugler
12/29/2020, 3:13 PM
Elon
01/08/2021, 9:25 AM
-XX:SoftRefLRUPolicyMSPerMB=0 to fix the issue where soft references are not sufficiently cleaned up in gc.
Prashant Pandey
07/19/2021, 7:48 AM
SELECT backend_id, backend_protocol, backend_name, COUNT(*)
FROM backendEntityView
WHERE tenant_id = '__default'
  AND ( backend_id IS NOT NULL AND start_time_millis >= 1626546008022 AND start_time_millis < 1626632408022 )
GROUP BY backend_id, backend_protocol, backend_name
ORDER BY PERCENTILETDIGEST99(duration_millis) DESC
LIMIT 10000
We have a sorted index on backend_id and a range index on start_time_millis. What we have observed is that PERCENTILETDIGEST99 increases the query time by a large amount (more than 300% easily). Is there some way we can optimise this using pre-aggregations (or some other way)?
Mayank
Mayank
Mayank
Prashant Pandey
07/19/2021, 4:39 PM
numDocsScanned is 658418392.
Prashant Pandey
07/19/2021, 4:40 PM
7611342 is the number of entries scanned in filter.
Prashant Pandey
07/19/2021, 4:40 PM
2633673568 is numEntriesScannedPostFilter
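As a quick sanity check on these numbers, the sketch below divides numEntriesScannedPostFilter by numDocsScanned to get the number of column values read per matched document. The class and method names are hypothetical, and the interpretation in the comments (three group-by columns plus duration_millis) is an assumption based on the query above, not something confirmed in the thread.

```java
public class ScanStats {
    // Average number of column entries read per document after filtering.
    static double entriesPerDoc(long numEntriesScannedPostFilter, long numDocsScanned) {
        return (double) numEntriesScannedPostFilter / numDocsScanned;
    }

    public static void main(String[] args) {
        long numDocsScanned = 658_418_392L;
        long numEntriesScannedInFilter = 7_611_342L;
        long numEntriesScannedPostFilter = 2_633_673_568L;

        // Assumed reading: backend_id, backend_protocol, backend_name and duration_millis
        // are each read once for every document that passes the filter.
        System.out.println(entriesPerDoc(numEntriesScannedPostFilter, numDocsScanned)); // 4.0

        // The filter touches far fewer entries than the post-filter phase.
        System.out.println(numEntriesScannedPostFilter / numEntriesScannedInFilter); // 346
    }
}
```

If that reading is right, most of the work sits in grouping and aggregating roughly 658 million documents rather than in the filter itself.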
Prashant Pandey
07/19/2021, 4:44 PM
Prashant Pandey
07/19/2021, 6:57 PM
Kavin Kuppusamy
11/06/2021, 5:54 PM
Tony Requist
11/10/2021, 3:45 PM
Kishore G
Tony Requist
11/10/2021, 3:55 PM
Subbu Subramaniam
11/10/2021, 6:31 PM
Subbu Subramaniam
11/10/2021, 6:33 PM
Priyank Bagrecha
06/03/2022, 11:42 PM
Lee Wei Hern Jason
03/03/2023, 4:31 AM
Shreeram Goyal
03/21/2023, 4:30 PM
Eric Liu
05/09/2023, 11:59 PM
SHA-256 algorithm for the upsert table from the performance perspective? I want to avoid collisions as much as possible.
Eric Liu
08/16/2023, 3:02 AM
segmentPrunerTypes config) for an upsert table? The partition key (say pk) in my upsert table is the hashed value of two concatenated id fields (let's say A and B), and the most common pattern of queries against that table is filtering on column A instead of the pk.
Eric Liu
08/24/2023, 3:34 PM
Lee Wei Hern Jason
08/25/2023, 5:20 AM
SELECT start_timestamp_10mins, SUM(online_seconds)/3600.0 AS value
FROM m_driver_supply_derived
WHERE country_id = 6
  AND city_id IN (84, 130, 20, 340, 369, 79, 360, 308, 98, 81, 230, 34, 349, 350, 370, 63, 372, 346, 231, 60, 86, 61, 218, 99, 43, 363, 222, 64, 367, 80, 15, 101, 348, 62, 361, 28, 75, 373, 356, 352, 69, 41, 362, 55, 347, 341, 374, 365, 219, 65, 275, 10, 66, 364, 40, 225, 366, 146, 35, 337, 354, 342, 344, 102, 345, 96, 132, 78, 18, 359, 36, 256, 77, 358, 343, 371, 100, 357, 368, 215, 26, 255, 44, 144, 351, 85, 353, 76)
  AND business_vertical = 'BUSINESS_VERTICAL_TYPE_TRANSPORT'
  AND granularity = 'GRANULARITY_BUSINESS'
  AND start_timestamp_10mins >= '2023-08-15 17:00:00.0' AND start_timestamp_10mins < '2023-08-25 17:00:00.0'
GROUP BY start_timestamp_10mins
HAVING value >= 0
ORDER BY start_timestamp_10mins ASC
LIMIT 10000000
Star Tree Index:
We are configuring it in the order of highest cardinality to lowest.
"starTreeIndexConfigs": [
  {
    "dimensionsSplitOrder": [
      "vehicle_type_id",
      "city_id",
      "country_id",
      "vehicle_mode",
      "granularity",
      "start_timestamp_10mins",
      "start_timestamp_1hour",
      "start_timestamp_4hour"
    ],
    "skipStarNodeCreationForDimensions": [],
    "functionColumnPairs": [
      "SUM__online_seconds",
      "SUM__online_count",
      "SUM__in_transit_seconds"
    ],
    "maxLeafRecords": 10000
  }
],
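One thing that can be checked mechanically, as a rough sketch: the Pinot star-tree documentation describes a query as eligible for the star-tree only when every filter and group-by column appears in dimensionsSplitOrder and every aggregation appears in functionColumnPairs. The helper below compares the columns of the query above against the split order above. The class and method names are hypothetical, and whether the rule applies exactly as written depends on the Pinot version in use.

```java
import java.util.ArrayList;
import java.util.List;

public class StarTreeCoverage {
    // Returns the query columns that are absent from the star-tree split order.
    static List<String> missingDimensions(List<String> splitOrder, List<String> queryColumns) {
        List<String> missing = new ArrayList<>();
        for (String column : queryColumns) {
            if (!splitOrder.contains(column)) {
                missing.add(column);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // dimensionsSplitOrder from the config above.
        List<String> splitOrder = List.of("vehicle_type_id", "city_id", "country_id", "vehicle_mode",
            "granularity", "start_timestamp_10mins", "start_timestamp_1hour", "start_timestamp_4hour");
        // Filter and group-by columns taken from the query above.
        List<String> queryColumns = List.of("country_id", "city_id", "business_vertical",
            "granularity", "start_timestamp_10mins");

        System.out.println(missingDimensions(splitOrder, queryColumns)); // [business_vertical]
    }
}
```

Under that assumption, the business_vertical filter is the one column not covered by the split order, while SUM(online_seconds) is covered by SUM__online_seconds.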
Venkat Boina(VB)
12/01/2023, 6:05 AM