Apache Pinot #pinot-perf-tuning

Kishore G

12/11/2020, 1:10 AM

a) descending order of cardinality, with the time column at the end b) no c) maxLeafRecords - the default is 10,000; stick to the default for now.
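
A minimal sketch of the ordering rule in (a), for illustration only. The helper `splitOrder` and the column names and cardinalities below are hypothetical, not part of Pinot; the output is the list you would put in `dimensionsSplitOrder`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not a Pinot API): orders star-tree dimensions by
// descending cardinality and places the time column last.
class SplitOrderSketch {
  static List<String> splitOrder(Map<String, Long> cardinalities, String timeColumn) {
    List<String> dims = new ArrayList<>();
    for (String name : cardinalities.keySet()) {
      if (!name.equals(timeColumn)) {
        dims.add(name);
      }
    }
    // Highest cardinality first; List.sort is stable, so ties keep map order.
    dims.sort(Comparator.comparing((String d) -> cardinalities.get(d)).reversed());
    // The time column always goes at the end, whatever its cardinality.
    dims.add(timeColumn);
    return dims;
  }

  public static void main(String[] args) {
    // Made-up cardinalities for illustration.
    Map<String, Long> cardinalities = new LinkedHashMap<>();
    cardinalities.put("event_time", 10_000L);
    cardinalities.put("country", 50L);
    cardinalities.put("user_id", 1_000_000L);
    cardinalities.put("city", 500L);
    // Prints [user_id, city, country, event_time]
    System.out.println(splitOrder(cardinalities, "event_time"));
  }
}
```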

Kishore G

12/11/2020, 1:58 AM

only at very high throughput - like 10k+ queries per sec

Elon

12/18/2020, 6:23 PM

Hi, we had some server crashes. I was able to take some heap histograms (couldn't get a heap dump though). Like last time, we saw that DirectR references were taking up the most space in the heap (refs to the mmapped segments). We found that by setting -XX:SoftRefLRUPolicyMSPerMB=0 and running on Java 11 (PR coming in the next 2 days), GC clears more soft references than before. Found this on the JVM mailing list: https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2019-April/025586.html

Elon

12/18/2020, 6:24 PM

Any thoughts/experiences regarding this? I will let you know how it goes:)

samarth

12/21/2020, 4:30 AM

Are there any default perf tests in the Pinot repo that I can use to measure performance? I am starting pinot-server with -XX:ActiveProcessorCount in the Java opts so that I can override the number of available processors and work around the need to request a high number of CPUs in our Kubernetes deployment. I wanted to run some perf tests to find the optimal value.

/**
   * Use at most 10 or half of the processors threads for each query. If there are less than 2 processors, use 1 thread.
   * <p>NOTE: Runtime.getRuntime().availableProcessors() may return value < 2 in container based environment, e.g.
   *          Kubernetes.
   */
  public static final int MAX_NUM_THREADS_PER_QUERY =
      Math.max(1, Math.min(10, Runtime.getRuntime().availableProcessors() / 2));

https://github.com/apache/incubator-pinot/blob/master/pinot-core/src/main/java/org/apache/pinot/core/operator/combine/CombineOperatorUtils.java#[…]8
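
To see how the processor count feeds into that constant, here is a small sketch that evaluates the same expression for a few counts. The class and method names are made up for illustration; only the formula comes from the snippet above. It assumes that -XX:ActiveProcessorCount=N makes Runtime.getRuntime().availableProcessors() report N.

```java
// Illustrative only: re-evaluates the MAX_NUM_THREADS_PER_QUERY expression
// for a given processor count instead of reading it from the runtime.
class ThreadsPerQuerySketch {
  static int maxThreadsPerQuery(int availableProcessors) {
    // At most 10, at most half the processors, and never fewer than 1.
    return Math.max(1, Math.min(10, availableProcessors / 2));
  }

  public static void main(String[] args) {
    for (int cpus : new int[] {1, 2, 4, 8, 20, 64}) {
      System.out.println(cpus + " processors -> " + maxThreadsPerQuery(cpus) + " threads per query");
    }
  }
}
```

With this formula the per-query thread count stops growing at 20 processors, so raising the reported processor count beyond that does not change this particular limit.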

Ken Krugler

12/29/2020, 3:13 PM

What’s the best way to confirm that a star tree index is being used for a query?

Elon

01/08/2021, 9:25 AM

PR to use Java 11: https://github.com/apache/incubator-pinot/pull/6424 It's been working for us. We also use -XX:SoftRefLRUPolicyMSPerMB=0 to fix the issue where soft references are not sufficiently cleaned up in GC.

Prashant Pandey

07/19/2021, 7:48 AM

Hi. We have the following query:

SELECT backend_id, backend_protocol, backend_name, COUNT(*)
FROM backendEntityView
WHERE tenant_id = '__default'
  AND (backend_id IS NOT NULL
       AND start_time_millis >= 1626546008022
       AND start_time_millis < 1626632408022)
GROUP BY backend_id, backend_protocol, backend_name
ORDER BY PERCENTILETDIGEST99(duration_millis) DESC
LIMIT 10000

We have a sorted index on backend_id and a range index on start_time_millis. What we have observed is that PERCENTILETDIGEST99 increases the query time by a large amount (more than 300% easily). Is there some way we can optimise this using pre-aggregations (or some other way)?

Mayank

07/19/2021, 3:39 PM

You could pre-aggregate the percentile tdigest in a column and store it byte serialized

Mayank

07/19/2021, 3:40 PM

How many docs is the query selecting?

Mayank

07/19/2021, 3:41 PM

You could also try star tree index, however you might then want to also select the percentile in the aggregation so that star tree index is used to compute it

Prashant Pandey

07/19/2021, 4:39 PM

Hi Mayank, numDocsScanned is 658418392.

Prashant Pandey

07/19/2021, 4:40 PM

is the number of entries scanned in filter.

Prashant Pandey

07/19/2021, 4:40 PM

2633673568 numEntriesScannedPostFilter

Prashant Pandey

07/19/2021, 4:44 PM

“You could pre-aggregate the percentile tdigest in a column and store it byte serialized” - How can I do this?

Prashant Pandey

07/19/2021, 6:57 PM

I mean, the docs say that the only supported aggregation rn is SUM

Tony Requist

11/10/2021, 3:45 PM

Question about server disk size - do server nodes need enough disk space to store all segments? Or will segments get dropped from local disk and re-read from deep storage as needed if the disk gets full?

Kishore G

11/10/2021, 3:46 PM

it needs enough disk space to store all the segments assigned to it

Tony Requist

11/10/2021, 3:55 PM

Thanks. So deep storage is just a backup. Is this the use case that tiered storage is meant to address? We have an AWS/EKS deployment and our cost is driven by server storage (EBS) - it would be ideal to have older data in S3.

Subbu Subramaniam

11/10/2021, 6:31 PM

Tiered storage just moves some segments to a different set of servers, but those servers now need to have enough storage to host these.

Subbu Subramaniam

11/10/2021, 6:33 PM

Even in the issue that I mention, it is expected that storage use temporarily bumps up on the servers and is then reclaimed when the segments "age". Pinot does not handle the case of serving data from segments that cannot be stored on servers.

Lee Wei Hern Jason

03/03/2023, 4:31 AM

Hi Team, we would like recommendations for resource allocation. Right now we use the default configuration of allowing our segments to build off heap, and I know that indexes are kept on heap. We want some advice on how much memory we should allocate to our servers/broker/controller. Right now we are using i3.2xlarge. We want to fix our heap/non-heap allocation (currently we are using a %, which is not ideal).

Shreeram Goyal

03/21/2023, 4:30 PM

Hi, I am facing an issue with offline servers when running heavy queries via Presto. I have 6 offline servers, each with 32G RAM, and I have configured my tables to have 2 RGs with 3 servers each. We have a major chunk of our data (some tables of size 29G) in offline servers, and that will keep increasing with time. When I run queries, most of the time the server goes down due to OOM or the query gets aborted due to some exception. Can I get some insights on the configuration for heap/non-heap allocation?

Eric Liu

05/09/2023, 11:59 PM

Is it NOT recommended, from a performance perspective, to use a primary key created by the SHA-256 algorithm for an upsert table? I want to avoid collisions as much as possible.

Eric Liu

08/16/2023, 3:02 AM

What are the recommended routing configs (more specifically, the segmentPrunerTypes config) for an upsert table? The partition key (say pk) in my upsert table is the hashed value of two concatenated id fields (let's say […] and […]), and the most common pattern of queries against that table is filtering on column […] instead of the pk

Eric Liu

08/24/2023, 3:34 PM

What’s the recommended memory setup for a broker? If a node has 16GB of memory, how much should I configure for the pod's memory request (assuming a single pod on the node), the heap, and off-heap?

Lee Wei Hern Jason

08/25/2023, 5:20 AM

Hi Team, can I get some advice on whether our star tree index is configured correctly for this query? Query:

SELECT start_timestamp_10mins, SUM(online_seconds)/3600.0 AS value FROM m_driver_supply_derived WHERE country_id = 6 AND city_id IN (84, 130, 20, 340, 369, 79, 360, 308, 98, 81, 230, 34, 349, 350, 370, 63, 372, 346, 231, 60, 86, 61, 218, 99, 43, 363, 222, 64, 367, 80, 15, 101, 348, 62, 361, 28, 75, 373, 356, 352, 69, 41, 362, 55, 347, 341, 374, 365, 219, 65, 275, 10, 66, 364, 40, 225, 366, 146, 35, 337, 354, 342, 344, 102, 345, 96, 132, 78, 18, 359, 36, 256, 77, 358, 343, 371, 100, 357, 368, 215, 26, 255, 44, 144, 351, 85, 353, 76) AND 
business_vertical = 'BUSINESS_VERTICAL_TYPE_TRANSPORT'  AND 
granularity = 'GRANULARITY_BUSINESS'  AND 
start_timestamp_10mins >= '2023-08-15 17:00:00.0' AND start_timestamp_10mins < '2023-08-25 17:00:00.0'  GROUP BY start_timestamp_10mins HAVING value >= 0 ORDER BY start_timestamp_10mins ASC LIMIT 10000000

Star Tree Index: We are configuring it in the order of highest cardinality to lowest.

"starTreeIndexConfigs": [
        {
          "dimensionsSplitOrder": [
            "vehicle_type_id",
            "city_id",
            "country_id",
            "vehicle_mode",
            "granularity",
            "start_timestamp_10mins",
            "start_timestamp_1hour",
            "start_timestamp_4hour"
          ],
          "skipStarNodeCreationForDimensions": [],
          "functionColumnPairs": [
            "SUM__online_seconds",
            "SUM__online_count",
            "SUM__in_transit_seconds"
          ],
          "maxLeafRecords": 10000
        }
      ],
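
One sanity check that can be sketched here, assuming the usual rule that a star-tree is only eligible when every filter and group-by column of the query appears in dimensionsSplitOrder (and every aggregation in functionColumnPairs). The helper below is hypothetical, not a Pinot API; the column lists are copied from the query and config above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical check (not part of Pinot): lists the filter/group-by columns
// of a query that are absent from a star-tree's dimensionsSplitOrder.
class StarTreeCoverageSketch {
  static Set<String> missingDimensions(List<String> queryColumns, List<String> splitOrder) {
    Set<String> missing = new LinkedHashSet<>(queryColumns);
    missing.removeAll(splitOrder);
    return missing;
  }

  public static void main(String[] args) {
    // Columns used in WHERE and GROUP BY of the query above.
    List<String> queryColumns = Arrays.asList(
        "country_id", "city_id", "business_vertical", "granularity", "start_timestamp_10mins");
    // dimensionsSplitOrder from the config above.
    List<String> splitOrder = Arrays.asList(
        "vehicle_type_id", "city_id", "country_id", "vehicle_mode", "granularity",
        "start_timestamp_10mins", "start_timestamp_1hour", "start_timestamp_4hour");
    // Prints [business_vertical]
    System.out.println(missingDimensions(queryColumns, splitOrder));
  }
}
```

Under that assumption, the filter on business_vertical is worth a look, since that column is not in the split order shown. SUM(online_seconds) is covered by SUM__online_seconds.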

Venkat Boina(VB)

12/01/2023, 6:05 AM

@Elon When can we start using lookup function and other functions in passthrough queries?