Apache Pinot #pinot-perf-tuning

Elon

12/07/2020, 6:35 PM

Yep, very much so :) Will get back to you shortly with details.

Elon

12/07/2020, 6:59 PM

So in about an hour we will be deploying Java 11 in staging (and then in prod a few hours later if it looks good). The main takeaways so far: off-heap settings should be disabled to eliminate server crashes caused by a huge number of DirectR buffers, and we still see latency spikes when the JVM finally decides to clear soft references.

Elon

12/07/2020, 6:59 PM

What are your thoughts about reducing -XX:SoftRefLRUPolicyMSPerMB?
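For context on the flag above: in HotSpot, a softly reachable object is kept for roughly `SoftRefLRUPolicyMSPerMB` milliseconds per megabyte of free heap since it was last accessed, and the default value is 1000. The sketch below only does that arithmetic; the function name and the 4 GB figure are illustrative assumptions, and how "free heap" is measured depends on the VM mode, so treat the numbers as rough estimates rather than a tuning recommendation.

```python
def soft_ref_retention_ms(free_heap_mb: float, ms_per_mb: int = 1000) -> float:
    """Approximate time a soft reference survives after its last access.

    free_heap_mb: free heap in megabytes (hypothetical input, measure it yourself).
    ms_per_mb: value of -XX:SoftRefLRUPolicyMSPerMB (HotSpot default is 1000).
    """
    return free_heap_mb * ms_per_mb


# With 4 GB of free heap, the default keeps soft references for about 68 minutes,
# while a value of 10 would shorten that to about 41 seconds.
default_minutes = soft_ref_retention_ms(4096) / 60_000
reduced_seconds = soft_ref_retention_ms(4096, ms_per_mb=10) / 1000
print(f"default: {default_minutes:.1f} min, reduced: {reduced_seconds:.1f} s")
```

A lower value trades longer-lived caches for more frequent, smaller cleanups.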

Kishore G

12/07/2020, 7:00 PM

Where are the direct buffers getting created?

Kishore G

12/07/2020, 7:00 PM

Do we see a spike in the number?

Elon

12/07/2020, 8:56 PM

I see that they are being created for inverted indexes. Not a spike but a slow creep as the server keeps running. I think we allowed users to create too many inverted indexes. Is the recommendation to create very few indexes?

Elon

12/08/2020, 3:16 AM

Upgraded to Java 11, letting it run for a few hours.

Elon

12/08/2020, 3:17 AM

Do you recommend we separate workloads into 2 clusters? We have a realtime-only operational workload with a very tight SLA, and an analytic workload where people are always bulk inserting data, redoing indexes, etc., which is more ad hoc. Since our cluster is only 6 nodes, we thought it would be too small for a multi-tenant setup.

Elon

12/08/2020, 3:18 AM

We see a lot of DirectR buffers for the offline analytic tables which have very large retentions, but the realtime only tables have 1 week retentions and grow relatively slowly.

Sidd

12/08/2020, 3:22 AM

Sorry, just catching up here. Question: is the issue related to the heap overhead and GC caused by a large number of references to direct byte buffers? Is the cleanup of those references subject to the normal GC cycle even though the off-heap memory they point to has been freed?

Kishore G

12/08/2020, 4:20 AM

Where are these being created?

Elon

12/08/2020, 7:18 PM

Hi, does anyone have a recommendation regarding when it is beneficial to use separate clusters? We are exploring whether to separate realtime-only, super-low-latency (queries must return in < 2 seconds), business-critical tables with fixed-size or slowly growing segments from hybrid/offline, very fast growing, less critical tables that tolerate higher latency (i.e. queries can return in ~5 seconds). We only have 6 nodes currently, so until we really scale up, it makes more sense to have the critical tables in one cluster and the analytic tables in another.

Elon

12/08/2020, 7:19 PM

Does that make sense? The analytic tables are constantly changing indexing, exploring data, etc., whereas for the realtime tables the result of a query can trigger an alert (i.e. anomaly detection).

Elon

12/08/2020, 7:25 PM

Or does everyone use 1 cluster?

Matt

12/08/2020, 10:58 PM

Hello, I deployed Pinot in a k8s cluster and can see that memory is reported incorrectly: it shows as very high. The top command inside the node and docker stats both show the correct usage (~2 GB). However, the K8s and Prometheus metrics all report high usage (25 GB).

Ken Krugler

12/11/2020, 12:49 AM

Assuming I want to optimize for the query “select dim1,dim2,sum(met1) from t where dim3=‘val’ group by dim1,dim2 order by sum(met1) desc”, then (a) what’s the best ordering of fields in dimensionsSplitOrder, (b) does it matter whether met1 has a dictionary and/or an inverted index (I assume not), and (c) how would one tune the maxLeafRecords setting (or does that rarely matter)?

Ken Krugler

12/11/2020, 1:24 AM

I’ve got maxLeafRecords set to 10K, I thought that was the default - should I increase it?
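To make the question concrete, here is a minimal sketch of a star-tree entry for that query, built as a Python dict and printed as JSON. The keys (`starTreeIndexConfigs`, `dimensionsSplitOrder`, `skipStarNodeCreationForDimensions`, `functionColumnPairs`, `maxLeafRecords`) are the ones used in Pinot table configs; the helper name is hypothetical, and the split order shown (filter column first) is only an assumption for illustration, not a confirmed answer to (a).

```python
import json


def star_tree_config(split_order, function_column_pairs, max_leaf_records=10000):
    """Build one starTreeIndexConfigs entry (helper name is hypothetical)."""
    return {
        "dimensionsSplitOrder": list(split_order),
        "skipStarNodeCreationForDimensions": [],
        "functionColumnPairs": list(function_column_pairs),
        "maxLeafRecords": max_leaf_records,
    }


# Assumed ordering: dim3 (the filter column) first, then the group-by columns.
config = star_tree_config(["dim3", "dim1", "dim2"], ["SUM__met1"])
print(json.dumps({"tableIndexConfig": {"starTreeIndexConfigs": [config]}}, indent=2))
```

Every dimension referenced by the query (dim1, dim2, dim3) appears in the split order, and the only pre-aggregated pair is the sum of met1.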

Ken Krugler

12/11/2020, 1:26 AM

And I don’t see anything about when to use onHeapDictionaryColumns, does that matter at all?

Elon

12/11/2020, 6:02 AM

We noticed that offline tables with a lot of segments require a lot of DirectR buffer references - would this indicate that we need to scale up the number of servers? What % of the heap should DirectR buffer references consume before it is recommended to scale up?

Ken Krugler

12/16/2020, 3:24 PM

I’d heard that the server process has a set (10?) number of threads to handle query requests. Assuming we have very low qps, and want to minimize latency, is there a way to set the number of threads == the number of cores? And are there any other places in the system where similar constraints exist?

Ken Krugler

12/29/2020, 3:17 PM

Asking because if I add a where clause (that filters out nothing) using a dimension NOT in my dimensionsSplitOrder list, I thought that the star tree wouldn’t be used - and yet the query time is the same as without that where clause.

Ken Krugler

01/07/2021, 9:05 PM

I added two more servers to my cluster, and performance has dropped. One theory is that one or both of these new servers is slower than the previous servers, and thus causing the drop in performance. How can I confirm or refute that theory? Are there Pinot metrics I should be examining?
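One way to test the "slow new server" theory is to compare per-server latencies for the same queries. The sketch below assumes you have already collected latency samples per server by whatever means you have available (the input dict, server names, and the 1.5x threshold are all hypothetical); it flags servers whose median latency is well above the cluster-wide median.

```python
from statistics import median


def flag_slow_servers(latencies_ms, factor=1.5):
    """Return servers whose median latency exceeds factor x the median of medians."""
    per_server = {name: median(samples) for name, samples in latencies_ms.items() if samples}
    baseline = median(per_server.values())
    return sorted(name for name, value in per_server.items() if value > factor * baseline)


# Hypothetical samples in milliseconds; server-4 stands in for a newly added node.
samples = {
    "server-1": [40, 42, 45],
    "server-2": [41, 44, 43],
    "server-3": [39, 40, 47],
    "server-4": [90, 110, 95],
}
print(flag_slow_servers(samples))
```

Using medians keeps a single outlier query from flagging an otherwise healthy server.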

Ken Krugler

01/20/2021, 3:57 PM

Is there a way to have an inverted index for a column, but not store the column data? So a pure filter-only field?

Leon Liu

06/30/2021, 12:43 PM

Hey, good morning. I read some articles about Pinot and feel Pinot can be a great tool for our real-time analytics platform. We currently use Snowflake and Redshift. I tried it with a simple use case (63 million records with percentileest and avg aggregation) on a single EC2 instance and the performance is amazing. I want to pursue this further and have a few questions related to building star-tree indexes for the aggregations. Mainly we want to make sure that building the star-tree indexes takes much less time than full cubing. Hope you can help me out:

1. For our percentile aggregation, we only care about the values at 10, 25, 50, 75 and 90 percent. Is there any way to do the aggregation only for those percentiles?
2. How do I know if a star-tree index is built? On the UI “Reload Status” screen, I don’t see anything related to the star-tree index.
3. Currently we are doing very intensive monthly cubing to support realtime analytics (percentile on 12 columns, avg on 12 columns, approx_count_distinct on 5 columns). At the end of each month, we batch feed about 70 million records. Is it possible to build the star-tree index in a couple of hours? If so, what are the recommended ways to speed up the index building process?

Some context for our table:

1. 40 dimension columns, 1 time column and 15 metric columns
2. We have a monthly feed of about 70 million records
3. We need monthly, quarterly and yearly analytics

Thanks in advance
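As a rough sketch of what the star-tree pre-aggregation list for this workload could look like: the helper below generates `functionColumnPairs` entries in Pinot's `FUNCTION__column` form and the matching select expressions for the five percentiles. The column names are hypothetical, and mapping the distinct count to `DISTINCT_COUNT_HLL` is an assumption about which approximate function would be used. The pair list needs only one `PERCENTILE_EST` entry per column because, as far as I understand, the percentile is chosen at query time rather than at build time; treat that as something to verify.

```python
PERCENTILES = (10, 25, 50, 75, 90)


def function_column_pairs(percentile_cols, avg_cols, distinct_cols):
    """Build star-tree functionColumnPairs entries (column names are hypothetical)."""
    pairs = [f"PERCENTILE_EST__{col}" for col in percentile_cols]
    pairs += [f"AVG__{col}" for col in avg_cols]
    pairs += [f"DISTINCT_COUNT_HLL__{col}" for col in distinct_cols]
    return pairs


def percentile_expressions(col):
    """Select expressions for the five percentiles of interest on one column."""
    return [f"PERCENTILEEST({col}, {p})" for p in PERCENTILES]


metrics = [f"metric_{i}" for i in range(1, 13)]  # 12 hypothetical metric columns
ids = [f"id_{i}" for i in range(1, 6)]           # 5 hypothetical id columns
pairs = function_column_pairs(metrics, metrics, ids)
print(len(pairs), pairs[0], percentile_expressions("metric_1")[0])
```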

Mayank

07/19/2021, 7:31 PM

I was referring to pre aggregate during indexing in offline

Subbu Subramaniam

11/10/2021, 6:31 PM

@User perhaps you are looking for a solution being worked on in this issue: https://github.com/apache/pinot/issues/7229

Anish Nair

07/05/2022, 7:51 AM

Hey guys, regarding realtime table memory usage: I had posted a thread in the troubleshooting group. https://apache-pinot.slack.com/archives/C011C9JHN7R/p1656747644771759 Can anyone advise on this?

Kartik Khare

09/14/2022, 3:37 PM

@Abhishek Gupta

Abhishek Gupta

09/14/2022, 5:26 PM

Hey everyone, I'm trying to run the query below, but it errors out with "upstream request timeout". When the filter covers only a single month of data, it works. I'm not sure why the error message doesn't give a valid reason for the failure. Can you suggest the right and necessary table config?

SELECT
    DISTINCTCOUNT( CASE WHEN data_source = 'web' THEN master_id ELSE 'a' END) as activity_account_count,
    DISTINCTCOUNT( CASE WHEN data_source in ('b2bn', 'b2bn_excluded') THEN master_id ELSE 'a' END) as kw_account_count,
    DISTINCTCOUNT( CASE WHEN data_source = 'fpm' THEN master_id ELSE 'a' END) as fpm_account_count,
    DISTINCTCOUNT( CASE WHEN data_source = 'tpm' THEN master_id ELSE 'a' END) as tpm_account_count,
    DISTINCTCOUNT( CASE WHEN data_source = 'crm' THEN master_id ELSE 'a' END) as crm_account_count,
    DISTINCTCOUNT( CASE WHEN data_source = 'map' THEN master_id ELSE 'a' END) as map_account_count,
    DISTINCTCOUNT( master_id) as all_account_count
FROM
    six_sense_dapm
WHERE
    dt BETWEEN '2022-01-01' AND '2022-08-30'
    AND data_source IN ('web', 'b2bn', 'b2bn_excluded', 'fpm', 'tpm', 'crm', 'map')
    AND product='__all__'

table config :

{
  "OFFLINE": {
    "tableName": "sumologic_dapm_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "timeType": "DAYS",
      "schemaName": "sumologic_dapm",
      "replication": "2",
      "segmentPushType": "APPEND",
      "timeColumnName": "dt",
      "allowNullTimeValue": false
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "invertedIndexColumns": [
        "data_source",
        "product",
        "source_activity_name"
      ],
      "noDictionaryColumns": [
        "external_id",
        "master_id",
        "secondary_id",
        "source_activity_desc",
        "source_activity_url",
        "source_activity_referrer_url",
        "source_activity_desc",
        "source_activity_url_r",
        "metric_value",
        "source_id"
      ],
      "rangeIndexColumns": [
        "dt"
      ],
      "optimizeDictionaryForMetrics": true,
      "enableDefaultStarTree": true,
      "enableDynamicStarTreeCreation": true,
      "aggregateMetrics": true,
      "nullHandlingEnabled": true,
      "rangeIndexVersion": 2,
      "autoGeneratedInvertedIndex": false,
      "createInvertedIndexDuringSegmentGeneration": false,
      "sortedColumn": [
        "data_source",
        "master_id"
      ],
      "loadMode": "MMAP"
    },
    "metadata": {},
    "isDimTable": false
  }
}
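A side note on the query itself, independent of the timeout: `ELSE 'a'` maps every non-matching row to the literal `'a'`, which then counts as one extra distinct value whenever at least one row falls outside the CASE condition. The plain-Python sketch below only mimics that SQL semantics on a few hypothetical rows; it is not Pinot code and makes no claim about the cause of the timeout.

```python
# Hypothetical (data_source, master_id) rows.
rows = [("web", "m1"), ("web", "m2"), ("crm", "m1"), ("fpm", "m3")]


def distinct_count_case(rows, sources, else_value="a"):
    """Mimic DISTINCTCOUNT(CASE WHEN data_source IN sources THEN master_id ELSE 'a' END)."""
    return len({master_id if source in sources else else_value for source, master_id in rows})


def distinct_count_matching(rows, sources):
    """Count distinct master_id values among matching rows only."""
    return len({master_id for source, master_id in rows if source in sources})


# The CASE form reports one more than the number of distinct matching ids.
print(distinct_count_case(rows, {"web"}), distinct_count_matching(rows, {"web"}))
```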

Prakhar Pande

11/07/2022, 8:16 PM

This query does not scale beyond 240 qps: {"sql":"select event_time , sum(count) as event_counts from nrt_app_open where year=year(now()) and month=month(now()) and day>=(day(now())-1) and FromDateTime(event_time, 'yyyy-MM-dd HHmmss') > cast((now() - 86400000) as long) group by 1 limit 1000000000","trace":false,"queryOptions":""} Any ideas on how I can optimise this? CPU and memory of the pinot-servers are also underutilized. Thanks in advance.
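For reference, `86400000` in the query is 24 hours in milliseconds, so the filter keeps events from the last day. The sketch below computes that cutoff on the client in the same `yyyy-MM-dd HHmmss` layout. It assumes `event_time` is stored as a zero-padded string in that format and interpreted as UTC, in which case a plain string comparison orders the same way as the parsed timestamps. Whether replacing the per-row `FromDateTime` call with such a literal actually improves throughput is untested here, and the function names are hypothetical.

```python
from datetime import datetime, timezone

DAY_MS = 86_400_000  # the constant subtracted from now() in the query above


def cutoff_literal(now_ms: int) -> str:
    """Format (now - 24h) as 'yyyy-MM-dd HHmmss' in UTC."""
    cutoff = datetime.fromtimestamp((now_ms - DAY_MS) // 1000, tz=timezone.utc)
    return cutoff.strftime("%Y-%m-%d %H%M%S")


def time_predicate(now_ms: int) -> str:
    """Build a string-comparison predicate on event_time (assumed layout, see lead-in)."""
    return f"event_time > '{cutoff_literal(now_ms)}'"


# Example with a fixed "now" of 2022-11-07 20:16:00 UTC, expressed in epoch milliseconds.
print(time_predicate(1_667_852_160_000))
```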