# troubleshooting
  • e

    Elon

    09/04/2020, 8:52 PM
    Hi, we noticed that when we upload an offline segment it is not available unless we select from the `_OFFLINE` table. Then if we upload a more recent segment, that segment becomes available but the newest uploaded segment is not available. Is there any way to set a segment as ready to be served or something similar?
  • s

    srisudha

    09/08/2020, 1:16 PM
    hi @Buchi Reddy you can check out the observability section of this article - Pinot exposes JMX metrics which can give you insights - https://medium.com/@shounakmk219/tasted-apache-pinot-and-we-loved-it-85f9022c30f7
    👍 1
  • e

    Elon

    09/09/2020, 6:30 PM
    Is all of this going to be in pinot 0.5.0?
  • n

    Neha Pawar

    09/09/2020, 6:31 PM
    @Ting Chen can you please guide Elon about the necessary PRs? ^^
    ❤️ 1
  • y

    Yash Agarwal

    09/09/2020, 6:32 PM
    I am getting the following exception during rebalancing. I triggered a rebalance once, but it seems to have stopped in the middle, and now I am getting the following exception. Any ideas?
    Rebalancing table: guestslslitm3years_OFFLINE with minAvailableReplicas: 1, bestEfforts: false
    Found ERROR instance: Server_10.59.98.103_8098 for segment: guestslslitm3years_2018-01-01_2018-01-01_3, table: guestslslitm3years_OFFLINE
    Caught exception while waiting for ExternalView to converge for table: guestslslitm3years_OFFLINE, aborting the rebalance
    java.lang.IllegalStateException: Found segments in ERROR state
    	at org.apache.pinot.controller.helix.core.rebalance.TableRebalancer.isExternalViewConverged(TableRebalancer.java:538) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at org.apache.pinot.controller.helix.core.rebalance.TableRebalancer.waitForExternalViewToConverge(TableRebalancer.java:480) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at org.apache.pinot.controller.helix.core.rebalance.TableRebalancer.rebalance(TableRebalancer.java:344) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at org.apache.pinot.controller.helix.core.PinotHelixResourceManager.rebalanceTable(PinotHelixResourceManager.java:2128) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at org.apache.pinot.controller.api.resources.PinotTableRestletResource.lambda$rebalance$0(PinotTableRestletResource.java:530) ~[pinot-all-0.5.0-SNAPSHOT-jar-with-dependencies.jar:0.5.0-SNAPSHOT-701ffcbd5be5f39e91cea9a0297c4e8b0a7d9343]
    	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_265]
    	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_265]
    	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_265]
    	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_265]
    	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_265]
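The log above shows the rebalance aborting because one replica is in ERROR state in the ExternalView. A minimal sketch for listing such replicas, assuming the external view has already been fetched and parsed into a `{segment: {server: state}}` mapping. The function name `find_error_replicas`, the second server, and the second segment in the sample are hypothetical and only there for illustration.

```python
from typing import Dict, List, Tuple


def find_error_replicas(
    external_view: Dict[str, Dict[str, str]]
) -> List[Tuple[str, str]]:
    """Return (segment, server) pairs whose reported state is ERROR."""
    errors = []
    for segment, replicas in sorted(external_view.items()):
        for server, state in sorted(replicas.items()):
            if state == "ERROR":
                errors.append((segment, server))
    return errors


# Hypothetical sample: only the first segment/server pair comes from the log
# above; the other names are made up for the example.
sample_view = {
    "guestslslitm3years_2018-01-01_2018-01-01_3": {
        "Server_10.59.98.103_8098": "ERROR",
        "Server_10.59.98.104_8098": "ONLINE",
    },
    "guestslslitm3years_2018-01-02_2018-01-02_0": {
        "Server_10.59.98.104_8098": "ONLINE",
    },
}

error_replicas = find_error_replicas(sample_view)
print(error_replicas)
```

Knowing exactly which segment and server pairs are in ERROR makes it easier to decide what to fix before retrying the rebalance.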
  • y

    Yash Agarwal

    09/09/2020, 7:27 PM
    Sure. I have been doing that until now, but I just wanted to confirm whether there is a better approach 🙂
  • t

    Tim Chan

    09/11/2020, 3:18 PM
    Is there a string length limit for a string column? I’m seeing my data being truncated to 575 characters.
  • t

    Tim Chan

    09/11/2020, 6:35 PM
    2020/09/11 18:26:00.615 WARN [LLRealtimeSegmentDataManager_sesame_person_features__4__6__20200911T1716Z] [sesame_person_features__1__7__20200911T1716Z] Commit failed  with response {"isSplitCommitType":false,"streamPartitionMsgOffset":null,"buildTimeSec":-1,"status":"FAILED","offset":-1}
  • a

    Ankit

    09/14/2020, 7:08 AM
    Can anyone tell me how to see Pinot server metrics (either via some API or JMX) and how to set the lead controller for the retention job?
  • y

    Yash Agarwal

    09/15/2020, 1:44 PM
    I am getting the following error when trying to drop a test broker:
    {
      "code": 409,
      "error": "Failed to drop instance Broker_172.17.0.2_8099 - Instance Broker_172.17.0.2_8099 exists in ideal state for brokerResource"
    }
    How do I remove the instance from the ideal state and also drop it?
  • s

    Shen Wan

    09/15/2020, 8:56 PM
    Pinot logging has a bug here (mismatching number of arguments) that truncates the error details.
  • p

    Pradeep

    09/16/2020, 1:40 AM
    QQ: what is the best way to drop a realtime server and move the segments to a different server? Would disabling the server and rebalancing do the trick?
  • s

    Shen Wan

    09/16/2020, 3:39 PM
    A SQL query filtering on the field used for partitioning returns nothing. Filtering on other fields is fine. I do not see anything worth mentioning in the log. What’s going on?
  • b

    Buchi Reddy

    09/20/2020, 3:53 AM
    Hey everyone, I'm trying to add a star-tree index to an existing table using the REST API. I validated the tableConfig and did a `PUT` of the config, and the PUT succeeded with HTTP `200`, but when I get the tableConfig back, the star-tree index config is no longer there.
  • y

    Yash Agarwal

    09/21/2020, 8:04 PM
    In the batch ingestion flow, I am partitioning the data by Date, Channel, and pmod(hash of ID, x), where x varies for different channel values. How can I define this in the partition config for the table?
  • j

    Jackie

    09/23/2020, 5:32 PM
    @Yash Agarwal You can use the APIs defined in `PinotInstanceAssignmentRestletResource` for that. These are detailed fine-tuning APIs, so we don't have them documented yet. We will enhance the documentation. Let me know if you need more help understanding the APIs.
  • b

    Buchi Reddy

    09/25/2020, 12:46 AM
    Hi everyone, we have a couple of segments with no replicas and they're not coming online at all. How can that be fixed? We tried reloading those segments with the REST API, but that didn't seem to work.
  • t

    Tim Chan

    09/29/2020, 5:11 PM
    How do I avoid seeing an InternalError message when I want to return a very large result set?
  • p

    Pradeep

    09/29/2020, 11:14 PM
    Hi, I added some additional columns to the star-tree index, but I don’t see the new columns being part of the star-tree index in the new segments. Don’t modifications get reflected in new segments? IIRC reloads don’t help with modifications to the star-tree index; correct me if I am wrong.
  • p

    Pradeep

    09/30/2020, 12:12 AM
    Shouldn't the segment rotate if 2h is hit?
  • p

    Pradeep

    10/01/2020, 12:47 AM
    QQ: doesn’t the pinot-admin.sh AddTable interface work for updating an OFFLINE table config? I am able to update the realtime table config.
  • d

    Dan Hill

    10/01/2020, 1:11 AM
    Hi. This isn't a problem. I'm looking for design feedback. I'm using Apache Flink to generate inputs into Pinot's realtime tables. Any experience with doing aggregation in Flink vs Pinot? My current prototype aggregates user events inside of Pinot for streaming but aggregates in Flink for Pinot's offline ingestion. I'm tempted to move the streaming aggregation to Flink too so I can reuse the same code for streaming and batch. Flink's aggregation is nice and very performant. Flink supports sending `delete` and `insert` operations for group by results. Here's a simplified version of my Pinot table to help illustrate my use case: (dimensions) utc_date, seller_id, content_id, (metrics) sum_clicks.
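To make the pre-aggregation idea concrete, here is a small sketch of rolling raw click events up to the table shape described above (utc_date, seller_id, content_id, sum_clicks) before they reach Pinot. It is plain Python rather than Flink, the function name `aggregate_clicks` is hypothetical, and it assumes each input event represents exactly one click.

```python
from collections import Counter
from typing import Dict, Iterable, List


def aggregate_clicks(events: Iterable[Dict[str, str]]) -> List[Dict[str, object]]:
    """Group click events by (utc_date, seller_id, content_id) and count them."""
    totals: Counter = Counter()
    for event in events:
        key = (event["utc_date"], event["seller_id"], event["content_id"])
        totals[key] += 1  # assumption: one event equals one click
    return [
        {"utc_date": d, "seller_id": s, "content_id": c, "sum_clicks": n}
        for (d, s, c), n in sorted(totals.items())
    ]


# Made-up sample events for illustration only.
sample_events = [
    {"utc_date": "2020-10-01", "seller_id": "s1", "content_id": "c1"},
    {"utc_date": "2020-10-01", "seller_id": "s1", "content_id": "c1"},
    {"utc_date": "2020-10-01", "seller_id": "s2", "content_id": "c9"},
]

rows = aggregate_clicks(sample_events)
print(rows)
```

Running the same grouping logic for both the streaming and the batch path is the code reuse the message is weighing.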
  • s

    Subbu Subramaniam

    10/01/2020, 5:45 PM
    We don't build the star-tree in the realtime ingestion path, right?
  • p

    Pradeep

    10/01/2020, 7:55 PM
    We most likely figured out the issue, thanks a lot @Jackie. We have a hybrid table, and the timestamp column gets automatically added to the filter and the star-tree is not used in query execution.
  • e

    Elon

    10/02/2020, 7:46 PM
    Thanks! We're on pinot 4 so should I just go to zookeeper and manually delete the tag somehow?
  • t

    Tim Chan

    10/02/2020, 7:58 PM
    Has anyone used trio (https://github.com/python-trio/trio) to run concurrent queries with pinot-dbapi (https://github.com/python-pinot-dbapi/pinot-dbapi)?
  • s

    Subbu Subramaniam

    10/02/2020, 8:28 PM
    @Elon the realtime consumer automatically adjusts to new tags that may be there in the cluster and removes segments from untagged ones. This is done over time. So, any new consuming segments are allocated ONLY amongst the tagged servers that exist at the time the new segment is created. The old segments (already consumed ones) are left to be retained out (unless you invoke rebalance). That said, if you had (say) 4 replicas configured, and only 3 tagged instances exist, then we continue consumption as before on an untagged host since we don't want to run the risk of losing a replica.
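The behavior described above can be pictured with a toy model. This is only a sketch of the rule as stated in the message, not Pinot's actual assignment code (which also has to balance partitions across servers). The function name `pick_consuming_servers` and the server names are hypothetical.

```python
from typing import List, Sequence


def pick_consuming_servers(
    tagged: Sequence[str], previous: Sequence[str], replicas: int
) -> List[str]:
    """Choose servers for a newly created consuming segment.

    Prefer currently tagged servers. If there are fewer tagged servers than
    configured replicas, keep using previously assigned (possibly untagged)
    servers so that no replica is lost.
    """
    chosen = list(tagged[:replicas])
    for server in previous:
        if len(chosen) >= replicas:
            break
        if server not in chosen:
            chosen.append(server)
    return chosen


# 4 replicas configured, only 3 tagged servers: the untagged s4 keeps consuming.
short_on_tagged = pick_consuming_servers(
    tagged=["s1", "s2", "s3"], previous=["s1", "s2", "s3", "s4"], replicas=4
)

# Enough tagged servers: only tagged servers are used for the new segment.
enough_tagged = pick_consuming_servers(
    tagged=["s1", "s2", "s5", "s6"], previous=["s1", "s2", "s3", "s4"], replicas=4
)
print(short_on_tagged, enough_tagged)
```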
  • d

    Dan Hill

    10/03/2020, 5:57 AM
    I'm trying to build Pinot (so I can send a PR), and I'm not sure I'm doing this correctly. I'm following the code setup and contribution guides, and I'm hitting an error with `mvn checkstyle:check -X`. It looks like the repository.apache.org snapshots have not been updated for `0.6.0`. Also, I'm using `openjdk 13.0.2 2020-01-14`. Should I change this?
    Caused by: org.eclipse.aether.transfer.ArtifactNotFoundException: Failure to find org.apache.pinot:pinot-spi:jar:0.6.0-SNAPSHOT in https://repository.apache.org/snapshots was cached in the local repository, resolution will not be reattempted until the update interval of apache.snapshots has elapsed or updates are forced
  • d

    Dan Hill

    10/07/2020, 1:44 AM
    I have a simple k8s job that runs LaunchDataIngestionJob. Is it possible to use an environment variable in the batch job spec? I'm trying to pass a date parameter into the input path. Here are the error and an example spec.
    Caused by: groovy.lang.MissingPropertyException: No such property: PINOT_DATE for class: SimpleTemplateScript1
    	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:53) ~[pinot-all-0.6.0-SNAPSHOT-jar-with-dependencies.jar:0.6.0-SNAPSHOT-8782e47b45c945cd02ccb8f06597b0ffa66a735a]
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: pinot-s3-data-config
    data:
      local_batch_job_spec.yaml: |-
        executionFrameworkSpec:
        ...
        inputDirURI: 's3://promoted-event-logs/offlinepinot/dt=$JOB_DATE/'
        ...
    
    ---
    apiVersion: batch/v1
    kind: Job
    metadata:
      name: pinot-populate-from-s3
    spec:
      template:
        spec:
          containers:
            - name: pinot-populate-from-s3
              image: apachepinot/pinot:0.6.0-SNAPSHOT-8782e47b4-20201006-jdk8
              env:
                - name: JOB_DATE
                  value: $(date -u +"%Y-%m-%d")
              args: [
                "LaunchDataIngestionJob",
                "-jobSpecFile",
                "/home/pinot/pinot-config/local_batch_job_spec.yaml"
              ]
              volumeMounts:
                - name: pinot-s3-data-config
                  mountPath: /home/pinot/pinot-config
          restartPolicy: OnFailure
          volumes:
            - name: pinot-s3-data-config
              configMap:
                name: pinot-s3-data-config
      backoffLimit: 100
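One possible workaround, sketched here under assumptions, is to render the job spec yourself before handing it to LaunchDataIngestionJob, so that no `$NAME` placeholder is left for the template engine to resolve. The sketch uses Python's standard `string.Template`; the helper names `render_job_spec` and `utc_date_string` are hypothetical. Also note that, as far as I know, Kubernetes does not run shell command substitution in an `env` `value:`, so `$(date -u +"%Y-%m-%d")` would be passed as literal text rather than as a date.

```python
from datetime import datetime, timezone
from string import Template
from typing import Mapping


def utc_date_string() -> str:
    """Today's UTC date as YYYY-MM-DD, the format used in the input path."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%d")


def render_job_spec(template_text: str, values: Mapping[str, str]) -> str:
    """Replace $NAME / ${NAME} placeholders; raises KeyError if one is missing."""
    return Template(template_text).substitute(values)


# The inputDirURI line from the spec above, used as a one-line template.
spec_template = "inputDirURI: 's3://promoted-event-logs/offlinepinot/dt=$JOB_DATE/'\n"

# A fixed date keeps the example reproducible; in a real job you might pass
# {"JOB_DATE": utc_date_string()} or a value read from os.environ instead.
rendered = render_job_spec(spec_template, {"JOB_DATE": "2020-10-06"})
print(rendered)
```

The rendered text could then be written to the path given to `-jobSpecFile`, for example from an init step in the same pod.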
  • p

    Pradeep

    10/14/2020, 12:09 AM
    Hi, for the hybrid tables is there a way to adjust the boundary used by the broker for querying offline vs realtime?