# troubleshooting
  • l

    Laxman Ch

    08/17/2020, 4:25 PM
    ok. Can you please point me to the PR or version where this is fixed?
  • e

    Elon

    08/17/2020, 7:38 PM
    @Laxman Ch - in your controller config do you have the following set?
    pinot.controller.storage.factory.class.gs=org.apache.pinot.plugin.filesystem.GcsPinotFS
    pinot.controller.storage.factory.gs.projectId=<YOUR PROJECT ID>
    pinot.controller.storage.factory.gs.gcpKey=<GCS KEY>
    pinot.controller.segment.fetcher.protocols=file,http,gs
    pinot.controller.segment.fetcher.gs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
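A quick way to confirm that a controller config contains the five keys listed above is a small check like the following. This is only a sketch: the key names are copied from the message, while `parse_properties` and `check_gcs_config` are hypothetical helpers, not part of Pinot.

```python
# Keys taken from the message above; everything else here is illustrative.
REQUIRED_KEYS = [
    "pinot.controller.storage.factory.class.gs",
    "pinot.controller.storage.factory.gs.projectId",
    "pinot.controller.storage.factory.gs.gcpKey",
    "pinot.controller.segment.fetcher.protocols",
    "pinot.controller.segment.fetcher.gs.class",
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_gcs_config(props):
    """Return a list of problems; an empty list means nothing obvious is missing."""
    problems = [f"missing: {key}" for key in REQUIRED_KEYS if not props.get(key)]
    protocols = props.get("pinot.controller.segment.fetcher.protocols", "")
    if "gs" not in [p.strip() for p in protocols.split(",")]:
        problems.append("'gs' is not listed in segment.fetcher.protocols")
    return problems
```

Running `check_gcs_config(parse_properties(open(path).read()))` against the controller config file would list any of the keys that are absent or empty.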
  • e

    Elon

    08/17/2020, 9:45 PM
    qq - for offline segment generation: In SegmentGeneratorConfig should either autoGeneratedInvertedIndex or createInvertedIndexDuringSegmentGeneration be set to true for inverted indexes to get generated?
  • e

    Elon

    08/17/2020, 11:04 PM
    And another question for offline segment generation - just want to confirm that sortedIndex is not used by offline, we have to provide the sorted data, is that right?
  • k

    Kishore G

    08/18/2020, 12:11 AM
    Can you check?
  • s

    Suvodeep Pyne

    08/19/2020, 1:40 AM
    @Xiang Fu @Kishore G Getting an NPE on GroupByResultSet when reading alerts from TE in a docker container in k8s. Any idea what could be the root cause? This error doesn’t happen when running TE on bare metal.
    te-npe-stacktrace.txt
  • l

    Laxman Ch

    08/19/2020, 6:57 PM
    Folks, I have some doubts related to rebalance. Can someone please clarify these or point me to the relevant documentation?
    What happened:
    • One of our Pinot servers was scaled down (Kubernetes, from 4 servers to 3 servers).
    • Even after several hours, the segments didn’t come online.
    • Same case with CONSUMING segments: the Kafka partitions that were being processed by the scaled-down server are now not being processed at all.
    Questions:
    • When does rebalance get triggered? I already tried server/controller restarts. I also tried rebalance from the controller UI.
    • What is the right way to scale down a server?
    FYI: we are on 0.3.0 + some fixes, in case it matters.
  • e

    Elon

    08/19/2020, 10:38 PM
    Is there a quick way we can convert realtime segments to offline segments? Is there any benefit to doing that since the realtime segment is created with star tree, inverted, sorted and text indexes?
  • y

    Yash Agarwal

    08/20/2020, 9:16 AM
    I am using Presto for querying and joining results from Pinot. What is the recommended approach for doing multiple aggregations like the following in a single query?
    select channel,
        sales_date,
        sum(sales) as sum_sales,
        sum(units) as sum_units
    from pinot.default.sales
    group by channel, sales_date
    Currently Presto is trying to fetch raw values for all the columns.
  • y

    Yash Agarwal

    08/20/2020, 9:17 AM
    Also, how can I use custom Pinot UDFs like segmentPartitionedDistinctCount in Presto queries?
  • m

    Mayank

    08/20/2020, 10:13 PM
    what's the numDocsScanned?
  • k

    Kishore G

    08/21/2020, 2:16 AM
    Startree HLL will work well for this
  • k

    Kishore G

    08/21/2020, 2:17 AM
    Also, in the latest version we did some optimization to use dictionary to solve these queries whenever possible
  • e

    Elon

    08/21/2020, 4:54 PM
    in 0.5.0?
  • e

    Elon

    08/21/2020, 7:34 PM
    We can upload a flamegraph. @Onha Choe is the brave person that is working with the pinot logging controller. Anyone familiar with this that we can work with?
  • p

    Pradeep

    08/24/2020, 5:50 PM
    QQ regarding the way the broker processes queries for hybrid tables. I see in the docs that the query gets split between offline and realtime based on a timestamp. My takeaway is that data has to be continuous in the offline table for all the previous days. Is it possible to make the broker query the offline table only for a window of time instead? The use case is when we have some issue/holes in the data (let's say the whole of yesterday): we want to be able to fix the segments only in that time range, instead of maintaining an offline table altogether. Are there alternative ways to fix this?
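The timestamp-based split mentioned in the question can be pictured with a rough sketch like the one below. It assumes the boundary is the latest offline segment end day minus one day, which is how the docs are usually read; `time_boundary` and `split_query` are hypothetical names, not Pinot's API, and the date-string filter is only for illustration.

```python
from datetime import date, timedelta


def time_boundary(offline_segment_end_days):
    """Assumed rule: latest end day among offline segments, minus one day."""
    return max(offline_segment_end_days) - timedelta(days=1)


def split_query(base_filter, time_column, boundary):
    """Return (offline_filter, realtime_filter) for a hybrid table query."""
    stamp = boundary.isoformat()
    offline = f"{base_filter} AND {time_column} <= '{stamp}'"
    realtime = f"{base_filter} AND {time_column} > '{stamp}'"
    return offline, realtime


boundary = time_boundary([date(2020, 8, 22), date(2020, 8, 23)])
offline_filter, realtime_filter = split_query("channel = 'WEB'", "sales_date", boundary)
```

Under this assumed rule, everything at or before the boundary is answered only from the offline table, which is why a hole in older offline data would not be filled from realtime.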
  • y

    Yash Agarwal

    08/25/2020, 6:11 PM
    I am getting an error in PrestoDB when running the following query.
    select channel,
        sales_date,
        sum(sales) as sum_sales,
        sum(units) as sum_unts
    from pinot.default.guestSlsLitm
    where channel = 'WEB'
    group by channel, sales_date
    union all
    select channel,
        sales_date,
        sum(sales) as sum_sales,
        sum(units) as sum_units
    from pinot.default.guestSlsLitm
    where channel = 'STORES'
    group by channel, sales_date;
    Each of the individual queries works separately, but the union does not. Even the explain plan fails with
    Query 20200825_180515_00000_mshj8 failed: Expected to find the pinot table handle for the scan node
    com.facebook.presto.spi.PrestoException: Expected to find the pinot table handle for the scan node
    	at com.facebook.presto.pinot.PinotPlanOptimizer.lambda$optimize$0(PinotPlanOptimizer.java:86)
  • m

    Mustafa

    08/26/2020, 8:27 AM
    Also, when I get rid of the sum aggregation it works perfectly. (Is this valid SQL?)
    SELECT "timestamp",variant_id FROM Sales WHERE operator_id = 1 AND campaign_id = 1 GROUP BY Hour("timestamp"), variant_id
  • y

    Yash Agarwal

    08/26/2020, 10:47 AM
    How to achieve something similar to local.directory.sequence.id=true in SparkSegmentGenerationJobRunner?
  • e

    Elon

    08/26/2020, 5:35 PM
    So the granularity doesn't do anything by itself; a transform function is also needed, right?
  • e

    Elon

    08/27/2020, 12:39 AM
    Sorry, another date time field spec question: can you have multiple date time columns bucketed on different granularities based on one source column?
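To make the question concrete, here is a plain-Python sketch of what "multiple columns bucketed on different granularities from one source column" would mean for the data itself. It says nothing about how Pinot's date time field spec handles this; the column names `ts`, `hoursBucket` and `daysBucket` are hypothetical, and `ts` is assumed to be epoch milliseconds.

```python
HOUR_MS = 60 * 60 * 1000
DAY_MS = 24 * HOUR_MS


def bucket_epoch_millis(epoch_millis, granularity_millis):
    """Floor an epoch-millis timestamp to the start of its bucket."""
    return epoch_millis - (epoch_millis % granularity_millis)


def derive_time_columns(row, source="ts"):
    """Return a copy of the row with hour and day buckets derived from one source column."""
    derived = dict(row)
    derived["hoursBucket"] = bucket_epoch_millis(row[source], HOUR_MS)
    derived["daysBucket"] = bucket_epoch_millis(row[source], DAY_MS)
    return derived
```

Both derived columns come from the same source value, so they can be recomputed independently if a granularity changes.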
  • y

    Yash Agarwal

    08/28/2020, 2:25 PM
    I am using Pinot image 0.5.0-SNAPSHOT-331b874cd-20200821, but with it I am not able to access the Swagger UI. All APIs for swaggerui-dist/lib/* and swaggerui-dist/css/* are failing with 404.
  • a

    Ankit

    08/28/2020, 2:49 PM
    {
      "tableName": "ordereventmap_OFFLINE",
      "reportedSizeInBytes": -1,
      "estimatedSizeInBytes": -1,
      "offlineSegments": {
        "reportedSizeInBytes": -1,
        "estimatedSizeInBytes": -1,
        "missingSegments": 1,
        "segments": {
          "ordereventmapbatch": {
            "reportedSizeInBytes": -1,
            "estimatedSizeInBytes": -1,
            "serverInfo": {
              "Server_pinot-server-59619160-2-966153609.stg.omsanalyticsplatform.cp.dfwstg2.prod.walmart.com_8098": {
                "segmentName": "ordereventmapbatch",
                "diskSizeInBytes": -1
              }
            }
          }
        }
      },
      "realtimeSegments": null
    }
    I have been trying to upload a batch segment to the table, but the segment goes missing and no data is available. Can anyone tell me what can cause a segment to go missing after upload? I didn’t find any errors in the controller, server, or broker.
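When a table has many segments, a response shaped like the one above can be scanned for the affected entries with a short script. This is a sketch: it assumes the JSON layout shown in the message, reads a `diskSizeInBytes` of -1 as "no size reported" (an interpretation, not something confirmed in this thread), and uses a made-up server name in the sample.

```python
import json


def find_unreported_segments(size_response):
    """Return (segment, server) pairs whose diskSizeInBytes is negative."""
    unreported = []
    offline = size_response.get("offlineSegments") or {}
    for segment_name, segment in (offline.get("segments") or {}).items():
        for server, info in (segment.get("serverInfo") or {}).items():
            if info.get("diskSizeInBytes", -1) < 0:
                unreported.append((segment_name, server))
    return unreported


# Sample modeled on the response above; the server name is a placeholder.
sample = json.loads("""
{
  "tableName": "ordereventmap_OFFLINE",
  "offlineSegments": {
    "missingSegments": 1,
    "segments": {
      "ordereventmapbatch": {
        "serverInfo": {
          "Server_example-host_8098": {
            "segmentName": "ordereventmapbatch",
            "diskSizeInBytes": -1
          }
        }
      }
    }
  },
  "realtimeSegments": null
}
""")
missing = find_unreported_segments(sample)
```

The resulting list gives the segment and server pairs to look at first in the server logs.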
  • y

    Yash Agarwal

    08/28/2020, 3:04 PM
    Is there segment-level query caching in Pinot? If we fire the same metric query repeatedly, the response time drops from seconds to milliseconds, but it goes back to seconds after some time or after other queries run. Is there a document where I can read more about it?
  • y

    Yash Agarwal

    09/01/2020, 11:33 AM
    Hey Team, I am getting intermittent exceptions in CombinePlanNode.
    Exception processing requestId 137
    java.lang.RuntimeException: Caught exception while running CombinePlanNode.
    	at org.apache.pinot.core.plan.CombinePlanNode.run(CombinePlanNode.java:149) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at org.apache.pinot.core.plan.InstanceResponsePlanNode.run(InstanceResponsePlanNode.java:33) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at org.apache.pinot.core.plan.GlobalPlanImplV0.execute(GlobalPlanImplV0.java:45) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:221) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:155) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:139) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_265]
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_265]
    	at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) [pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
    	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
    Caused by: java.util.concurrent.TimeoutException
    	at java.util.concurrent.FutureTask.get(FutureTask.java:205) ~[?:1.8.0_265]
    	at org.apache.pinot.core.plan.CombinePlanNode.run(CombinePlanNode.java:139) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	... 13 more
    Processed requestId=137,table=guestslslitm3years_OFFLINE,segments(queried/processed/matched/consuming)=1058/-1/-1/-1,schedulerWaitMs=0,reqDeserMs=4,totalExecMs=10659,resSerMs=0,totalTimeMs=10663,minConsumingFreshnessMs=-1,broker=Broker_10.59.100.47_8099,numDocsScanned=-1,scanInFilter=-1,scanPostFilter=-1,sched=fcfs
    Slow query: request handler processing time: 10663, send response latency: 58, total time to handle request: 10721
    Is there a reason why this is happening? Is there a way we can override the 10s timeout in CombinePlanNode?
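For comparing many such failures, the "Processed ..." line from the server log above can be split into fields with a few lines of Python. This is only a log-parsing sketch based on the format visible in that one line; `parse_processed_line` is a hypothetical helper.

```python
def parse_processed_line(line):
    """Split a 'Processed k=v,k=v,...' server log line into a dict of strings."""
    body = line.split("Processed ", 1)[1]
    fields = {}
    for item in body.split(","):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key] = value
    return fields


LOG_LINE = (
    "Processed requestId=137,table=guestslslitm3years_OFFLINE,"
    "segments(queried/processed/matched/consuming)=1058/-1/-1/-1,"
    "schedulerWaitMs=0,reqDeserMs=4,totalExecMs=10659,resSerMs=0,"
    "totalTimeMs=10663,minConsumingFreshnessMs=-1,"
    "broker=Broker_10.59.100.47_8099,numDocsScanned=-1,scanInFilter=-1,"
    "scanPostFilter=-1,sched=fcfs"
)

fields = parse_processed_line(LOG_LINE)
segments_queried = int(fields["segments(queried/processed/matched/consuming)"].split("/")[0])
total_exec_ms = int(fields["totalExecMs"])
```

For the line above this shows 1058 segments queried and about 10.6 s of execution time, with zero scheduler wait.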
  • y

    Yash Agarwal

    09/01/2020, 11:37 AM
    What is the best approach to solve this? I am currently storing about 3 billion rows (1000 segments) per table on a single data node. Should I rebalance across more servers or add CPU/RAM to the same one?
  • n

    Neha Pawar

    09/01/2020, 10:49 PM
    table config says
    "segmentsConfig":{
           "timeColumnName":"timestampInEpoch",
  • d

    Dileep Reddy

    09/02/2020, 9:22 AM
    Doesn’t the trim function work with SQL?
  • d

    Dan Hill

    09/03/2020, 1:57 AM
    1. Has anyone hooked up Pinot to read files from MinIO (open source S3 alternative)? I'm using MinIO in a local Kubernetes setup (and S3 in prod). I checked out S3PinotFS but I don't see an easy way to override the endpoint for S3. Can this be done using the URI? E.g. s3://minio:9000/mybucket/objectpath or s3://mybucket.minio:9000/objectpath?
    2. What's the best way to add the pinot-s3 plugin with the apachepinot/pinot docker image? Do I need to create my own wrapping image? I see a few environment variables in pinot-admin.sh that I can set, e.g. JAVA_OPTS, PLUGINS_INCLUDE, PLUGINS_CLASSPATH, PLUGINS_DIR.
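One way to see what the two candidate URIs in question 1 actually carry is to parse them generically. The sketch below uses only Python's standard `urllib.parse`; it does not show what S3PinotFS does. It is based on the common convention that the authority part of an `s3://` URI names the bucket, and whether S3PinotFS follows that convention is an assumption here.

```python
from urllib.parse import urlparse


def describe_s3_uri(uri):
    """Break an s3:// style URI into authority, host, port and path."""
    parsed = urlparse(uri)
    return {
        "authority": parsed.netloc,
        "host": parsed.hostname,
        "port": parsed.port,
        "path": parsed.path,
    }


first = describe_s3_uri("s3://minio:9000/mybucket/objectpath")
second = describe_s3_uri("s3://mybucket.minio:9000/objectpath")
```

In both forms the MinIO host ends up in the authority slot, so a client that treats the authority as the bucket name would look for a bucket called `minio` or `mybucket.minio` rather than connect to a different endpoint.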
  • s

    Shen Wan

    09/03/2020, 6:25 PM
    We have a realtime table with a field timestamp_ns, which is the segmentsConfig.timeColumnName and has a range index. When I use order by timestamp_ns in an SQL query, it times out. Why? I expect the range index to help order by.