# pinot-perf-tuning
  • e

    Elon

    12/07/2020, 6:35 PM
    Yep, very much so :) Will get back to you shortly with details.
  • e

    Elon

    12/07/2020, 6:59 PM
    So in about an hour we will be deploying Java 11 in staging (and then in prod a few hours later if it looks good). The main takeaways so far are that off-heap settings should be disabled to eliminate server crashes caused by the huge number of DirectR buffers, and that we still see increased latencies when the JVM finally decides to clear soft references.
  • e

    Elon

    12/07/2020, 6:59 PM
    What are your thoughts about reducing -XX:SoftRefLRUPolicyMSPerMB?
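For context on the flag in question: HotSpot keeps a softly reachable object alive for roughly SoftRefLRUPolicyMSPerMB milliseconds per megabyte of free heap after its last access, and the default value is 1000. The sketch below only models that arithmetic; the 8192 MB free-heap figure is an assumed example, not a number from this cluster.

```python
# Rough model of HotSpot's soft-reference retention policy (a sketch, not JVM code).
# A soft reference becomes eligible for clearing once it has been idle longer than
#   free_heap_mb * SoftRefLRUPolicyMSPerMB   milliseconds.

DEFAULT_MS_PER_MB = 1000  # HotSpot default for -XX:SoftRefLRUPolicyMSPerMB


def soft_ref_retention_seconds(free_heap_mb: float, ms_per_mb: int = DEFAULT_MS_PER_MB) -> float:
    """Approximate idle time (in seconds) before a soft reference may be cleared."""
    return free_heap_mb * ms_per_mb / 1000.0


if __name__ == "__main__":
    free_heap_mb = 8192  # assumed example value, not measured on any real server
    for ms_per_mb in (1000, 100, 0):
        secs = soft_ref_retention_seconds(free_heap_mb, ms_per_mb)
        print(f"-XX:SoftRefLRUPolicyMSPerMB={ms_per_mb}: ~{secs:.0f} s of idle time")
```

Under this model, lowering the value shortens how long idle soft references survive, so they are cleared in smaller batches more often rather than all at once after a long build-up. Whether that helps the latencies described here would need to be measured.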
  • k

    Kishore G

    12/07/2020, 7:00 PM
    Where are the direct buffers getting created?
  • k

    Kishore G

    12/07/2020, 7:00 PM
    Do we see a spike in the number?
  • e

    Elon

    12/07/2020, 8:56 PM
    I see that they are being created for inverted indexes. Not a spike but a slow creep as the server keeps running. I think we allowed users to create too many inverted indexes. Is the recommendation to create very few indexes?
  • e

    Elon

    12/08/2020, 3:16 AM
    Upgraded to Java 11, letting it run for a few hours.
  • e

    Elon

    12/08/2020, 3:17 AM
    Do you recommend we separate the workloads into 2 clusters? We have a realtime-only operational workload with a very tight SLA, and an analytic workload where people are always bulk inserting data, redoing indexes, etc. (more ad hoc). Since our cluster is only 6 nodes, we thought it would be too small for a multi-tenant setup.
  • e

    Elon

    12/08/2020, 3:18 AM
    We see a lot of DirectR buffers for the offline analytic tables which have very large retentions, but the realtime only tables have 1 week retentions and grow relatively slowly.
  • s

    Sidd

    12/08/2020, 3:22 AM
    Sorry, just catching up here. Question: is the issue related to the heap overhead and GC caused by a large number of references to direct byte buffers? Is cleanup of the references subject to the normal GC cycle even though the off-heap memory they point to has been freed?
  • k

    Kishore G

    12/08/2020, 4:20 AM
    Where are we creating these?
  • e

    Elon

    12/08/2020, 7:18 PM
    Hi, does anyone have a recommendation on when it is beneficial to use separate clusters? We are exploring whether to separate realtime-only, very low latency (queries must return in < 2 seconds), business-critical tables with fixed-size or slowly growing segments from hybrid/offline, very fast growing, less critical tables that tolerate higher latency (i.e. queries can return in ~5 seconds). We only have 6 nodes currently, so until we really scale up, it seems to make more sense to have the critical tables in one cluster and the analytic tables in another.
  • e

    Elon

    12/08/2020, 7:19 PM
    Does that make sense? I.e., the analytic tables constantly have their indexing changed, data explored, etc., whereas for the realtime tables the result of a query can trigger an alert (e.g. anomaly detection).
  • e

    Elon

    12/08/2020, 7:25 PM
    Or does everyone use 1 cluster?
  • m

    Matt

    12/08/2020, 10:58 PM
    Hello, I deployed Pinot in a k8s cluster and can see that memory usage is reported incorrectly, as very high. The top command inside the node and docker stats both show the correct usage (~2 GB). However, k8s and Prometheus metrics are all reporting high usage (25 GB).
  • k

    Ken Krugler

    12/11/2020, 12:49 AM
    Assuming I want to optimize for the query “select dim1,dim2,sum(met1) from t where dim3=‘val’ group by dim1,dim2 order by sum(met1) desc”, then (a) what’s the best ordering of fields in dimensionsSplitOrder, (b) does it matter whether met1 has a dictionary and/or an inverted index (I assume not), and (c) how would one tune the maxLeafRecords setting (or does that rarely matter)?
  • k

    Ken Krugler

    12/11/2020, 1:24 AM
    I’ve got maxLeafRecords set to 10K, I thought that was the default - should I increase it?
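For reference, a star-tree entry for the query above might be laid out as in the sketch below. The key names (dimensionsSplitOrder, skipStarNodeCreationForDimensions, functionColumnPairs, maxLeafRecords) are Pinot's star-tree config keys; the column names come from the question, and the split order shown is only a placeholder, not an answer to (a). The helper function is hypothetical and encodes the assumption that a star-tree can serve a query only when every filter and group-by dimension is one of its dimensions.

```python
import json

# Hypothetical star-tree entry for:
#   select dim1, dim2, sum(met1) from t where dim3 = 'val' group by dim1, dim2 ...
# The split order below is a placeholder, not a tuning recommendation.
star_tree_index = {
    "dimensionsSplitOrder": ["dim3", "dim1", "dim2"],
    "skipStarNodeCreationForDimensions": [],
    "functionColumnPairs": ["SUM__met1"],  # format: FUNCTION__column
    "maxLeafRecords": 10000,  # the 10K value mentioned in the thread
}

# The entry sits in the table config under tableIndexConfig.starTreeIndexConfigs.
table_index_config = {"starTreeIndexConfigs": [star_tree_index]}


def query_dimensions_covered(query_dims, config):
    """Hypothetical check: are all filter/group-by dimensions in the split order?"""
    return set(query_dims) <= set(config["dimensionsSplitOrder"])


if __name__ == "__main__":
    print(json.dumps(table_index_config, indent=2))
    print(query_dimensions_covered(["dim1", "dim2", "dim3"], star_tree_index))
```

Since met1 appears only inside SUM here, it is listed under functionColumnPairs rather than in the split order.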
  • k

    Ken Krugler

    12/11/2020, 1:26 AM
    And I don’t see anything about when to use onHeapDictionaryColumns, does that matter at all?
  • e

    Elon

    12/11/2020, 6:02 AM
    We noticed that offline tables with a lot of segments require a lot of DirectR buffer references - would this indicate that we need to scale up the number of servers? What % of the heap should DirectR buffer references consume before it is recommended to scale up?
  • k

    Ken Krugler

    12/16/2020, 3:24 PM
    I’d heard that the server process has a fixed number (10?) of threads to handle query requests. Assuming we have very low qps and want to minimize latency, is there a way to set the number of threads == the number of cores? And are there any other places in the system where similar constraints exist?
  • k

    Ken Krugler

    12/29/2020, 3:17 PM
    Asking because if I add a where clause (that filters out nothing) using a dimension NOT in my dimensionsSplitOrder list, I thought that the star tree wouldn’t be used - and the query time is the same as for the case without that where clause.
  • k

    Ken Krugler

    01/07/2021, 9:05 PM
    I added two more servers to my cluster, and performance has dropped. One theory is that one or both of these new servers is slower than the previous servers, and thus causing the drop in performance. How can I confirm or refute that theory? Are there Pinot metrics I should be examining?
  • k

    Ken Krugler

    01/20/2021, 3:57 PM
    Is there a way to have an inverted index for a column, but not store the column data? So a pure filter-only field?
  • l

    Leon Liu

    06/30/2021, 12:43 PM
    Hey, good morning. I read some articles about Pinot and feel it could be a great tool for our real-time analytics platform. We currently use Snowflake and Redshift. I tried it with a simple use case (63 million records with percentileest and avg aggregations) on a single EC2 instance, and the performance is amazing. I want to pursue this further and have a few questions about building star-tree indexes for the aggregations. Mainly, we want to make sure that building the star-tree indexes takes much less time than full cubing. Hope you can help me out:
    1. For our percentile aggregation, we only care about the values at 10, 25, 50, 75, and 90 percent. Is there a way to do the aggregation only for those percentiles?
    2. How do I know whether a star-tree index has been built? On the UI “Reload Status” screen, I don’t see anything related to the star-tree index.
    3. Currently we do very intensive monthly cubing to support real-time analytics (percentile on 12 columns, avg on 12 columns, approx_count_distinct on 5 columns). At the end of each month, we batch feed about 70 million records. Is it possible to build the star-tree index in a couple of hours? If so, what are the recommended ways to speed up the index build?
    Some context for our table:
    1. 40 dimension columns, 1 time column, and 15 metric columns
    2. A monthly feed of about 70 million records
    3. We need monthly, quarterly, and yearly analytics
    Thanks in advance.
  • m

    Mayank

    07/19/2021, 7:31 PM
    I was referring to pre aggregate during indexing in offline
  • s

    Subbu Subramaniam

    11/10/2021, 6:31 PM
    @User perhaps you are looking for a solution being worked on in this issue: https://github.com/apache/pinot/issues/7229
  • a

    Anish Nair

    07/05/2022, 7:51 AM
    Hey guys, regarding realtime table memory usage: I posted a thread in the troubleshoot group: https://apache-pinot.slack.com/archives/C011C9JHN7R/p1656747644771759 Can anyone advise on it?
  • k

    Kartik Khare

    09/14/2022, 3:37 PM
    @Abhishek Gupta
  • a

    Abhishek Gupta

    09/14/2022, 5:26 PM
    Hey everyone, I'm trying to run the query below, but it errors out with "upstream request timeout". It works when the filter covers a single month of data. I'm not sure why the error message doesn't give a valid reason for the failure. Can you suggest the right and necessary table config?
    SELECT
        DISTINCTCOUNT( CASE WHEN data_source = 'web' THEN master_id ELSE 'a' END) as activity_account_count,
        DISTINCTCOUNT( CASE WHEN data_source in ('b2bn', 'b2bn_excluded') THEN master_id ELSE 'a' END) as kw_account_count,
        DISTINCTCOUNT( CASE WHEN data_source = 'fpm' THEN master_id ELSE 'a' END) as fpm_account_count,
        DISTINCTCOUNT( CASE WHEN data_source = 'tpm' THEN master_id ELSE 'a' END) as tpm_account_count,
        DISTINCTCOUNT( CASE WHEN data_source = 'crm' THEN master_id ELSE 'a' END) as crm_account_count,
        DISTINCTCOUNT( CASE WHEN data_source = 'map' THEN master_id ELSE 'a' END) as map_account_count,
        DISTINCTCOUNT( master_id) as all_account_count
    FROM
        six_sense_dapm
    WHERE
        dt BETWEEN '2022-01-01' AND '2022-08-30'
        AND data_source IN ('web', 'b2bn', 'b2bn_excluded', 'fpm', 'tpm', 'crm', 'map')
        AND product='__all__'
    Table config:
    {
      "OFFLINE": {
        "tableName": "sumologic_dapm_OFFLINE",
        "tableType": "OFFLINE",
        "segmentsConfig": {
          "timeType": "DAYS",
          "schemaName": "sumologic_dapm",
          "replication": "2",
          "segmentPushType": "APPEND",
          "timeColumnName": "dt",
          "allowNullTimeValue": false
        },
        "tenants": {
          "broker": "DefaultTenant",
          "server": "DefaultTenant"
        },
        "tableIndexConfig": {
          "invertedIndexColumns": [
            "data_source",
            "product",
            "source_activity_name"
          ],
          "noDictionaryColumns": [
            "external_id",
            "master_id",
            "secondary_id",
            "source_activity_desc",
            "source_activity_url",
            "source_activity_referrer_url",
            "source_activity_desc",
            "source_activity_url_r",
            "metric_value",
            "source_id"
          ],
          "rangeIndexColumns": [
            "dt"
          ],
          "optimizeDictionaryForMetrics": true,
          "enableDefaultStarTree": true,
          "enableDynamicStarTreeCreation": true,
          "aggregateMetrics": true,
          "nullHandlingEnabled": true,
          "rangeIndexVersion": 2,
          "autoGeneratedInvertedIndex": false,
          "createInvertedIndexDuringSegmentGeneration": false,
          "sortedColumn": [
            "data_source",
            "master_id"
          ],
          "loadMode": "MMAP"
        },
        "metadata": {},
        "isDimTable": false
      }
    }
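One observation about the query as posted, separate from the timeout: each DISTINCTCOUNT(CASE WHEN ... THEN master_id ELSE 'a' END) also counts the sentinel 'a' as a distinct value whenever a non-matching row exists. The sketch below emulates that semantics on made-up rows; the sample data and both function names are hypothetical and only illustrate the counting, not how Pinot executes the query.

```python
# Emulates DISTINCTCOUNT(CASE WHEN data_source IN (...) THEN master_id ELSE 'a' END)
# on in-memory rows. Rows are (data_source, master_id) pairs invented for illustration.
rows = [
    ("web", "m1"),
    ("web", "m2"),
    ("crm", "m1"),
    ("fpm", "m3"),
]


def distinct_count_case(rows, sources, sentinel="a"):
    """Distinct count where non-matching rows collapse to the sentinel value."""
    return len({mid if src in sources else sentinel for src, mid in rows})


def distinct_count_filtered(rows, sources):
    """Distinct count over matching rows only (no sentinel)."""
    return len({mid for src, mid in rows if src in sources})


if __name__ == "__main__":
    print(distinct_count_case(rows, {"web"}))      # m1, m2 and the sentinel 'a'
    print(distinct_count_filtered(rows, {"web"}))  # m1, m2 only
```

On this sample the CASE form reports one more than the number of web accounts, because the crm and fpm rows all map to 'a'.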
  • p

    Prakhar Pande

    11/07/2022, 8:16 PM
    This query does not scale beyond 240 qps:
    {"sql":"select event_time , sum(count) as event_counts from nrt_app_open where year=year(now()) and month=month(now()) and day>=(day(now())-1) and FromDateTime(event_time, 'yyyy-MM-dd HHmmss') > cast((now() - 86400000) as long) group by 1 limit 1000000000","trace":false,"queryOptions":""}
    Any ideas on how I can optimise this? CPU and memory of the pinot-servers are also underutilized. Thanks in advance.
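A side note on the filter logic rather than the qps ceiling: the predicates year=year(now()), month=month(now()) and day>=(day(now())-1) are ANDed with the 24-hour window, so on the first day of a month they exclude rows from the previous day. The sketch below mirrors the two conditions in Python, assuming year, month and day are integer date parts of the event time and ignoring time zones; both function names are hypothetical.

```python
from datetime import datetime, timedelta


def passes_partition_filter(event: datetime, now: datetime) -> bool:
    """Mirrors: year=year(now()) and month=month(now()) and day>=(day(now())-1)."""
    return (
        event.year == now.year
        and event.month == now.month
        and event.day >= now.day - 1
    )


def in_last_24h(event: datetime, now: datetime) -> bool:
    """Mirrors the event_time > now() - 86400000 ms condition."""
    return event > now - timedelta(days=1)


if __name__ == "__main__":
    # Mid-month: yesterday evening passes both conditions.
    now = datetime(2022, 11, 7, 20, 16)
    event = datetime(2022, 11, 6, 22, 0)
    print(passes_partition_filter(event, now), in_last_24h(event, now))

    # First of the month: the event is within 24 hours but in the previous month.
    now = datetime(2022, 11, 1, 12, 0)
    event = datetime(2022, 10, 31, 18, 0)
    print(passes_partition_filter(event, now), in_last_24h(event, now))
```

In the second case the event falls inside the 24-hour window but fails the month predicate, so the combined filter would drop it.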