# troubleshooting
  • e

    Elon

    03/16/2022, 5:18 PM
    I can't see the full schema, is it an mv column?
  • f

    Facundo Bianco

    03/16/2022, 8:18 PM
    Hi all 👋, I'm trying to configure a date format like "2020-12-31T19:59:21.522-0400" and created table-schema.json as:
    "dateTimeFieldSpecs": [{
        "name": "timestampCustom",
        "dataType": "STRING",
        "format" : "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd'T'HH:mm:ss.SSZZ",
        "granularity": "1:MILLISECONDS"
      }]
    The table is generated successfully, but the POST command returns:
    {
      "code": 500,
      "error": "Caught exception when ingesting file into table: foo_OFFLINE. null"
    }
    I discovered it is related to the date format. Could you kindly indicate how it should be written? I used this site to generate the custom format. Thanks in advance!
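A quick way to sanity-check the shape of the sample timestamp outside Pinot is to parse it with Python's `strptime`. This is only a sketch: it is not Pinot's parser (the SIMPLE_DATE_FORMAT pattern in the schema uses Java-style pattern letters, not `strptime` directives), and it assumes the sample value is `2020-12-31T19:59:21.522-0400`, i.e. three fractional digits and a UTC offset without a colon.

```python
from datetime import datetime, timedelta, timezone

# Assumed sample value; the colons in the time part are an assumption.
SAMPLE = "2020-12-31T19:59:21.522-0400"


def to_epoch_millis(text: str) -> int:
    """Parse a timestamp with fractional seconds and a numeric UTC offset."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%f%z")
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer division by a timedelta avoids float rounding on the milliseconds.
    return (parsed - epoch) // timedelta(milliseconds=1)


print(to_epoch_millis(SAMPLE))
```

If a row fails to parse here, the pattern and the data disagree on some field (for example the number of fractional digits or the offset style), which is the same kind of mismatch worth checking in the Pinot format string.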
  • g

    Grace Lu

    03/16/2022, 11:26 PM
    Hi team, we ran into lots of issues when setting up a Spark ingestion job with YARN. The latest issue is that the application master reported the following error after the job was submitted to the cluster, and no resources could be assigned to the job:
    Exception in thread "main" java.lang.NoSuchMethodError: org.apache.hadoop.yarn.api.records.impl.pb.ProtoUtils.convertToProtoFormat(Lorg/apache/hadoop/yarn/api/records/ExecutionType;)Lorg/apache/hadoop/yarn/proto/YarnProtos$ExecutionTypeProto;
    	at org.apache.hadoop.yarn.api.records.impl.pb.ExecutionTypeRequestPBImpl.setExecutionType(ExecutionTypeRequestPBImpl.java:73)
    We wonder whether Pinot also pulls this class in through its dependencies and whether it conflicts with the library in our Hadoop cluster. We are on Spark 2.4.6, Hadoop 2.9.1, and Pinot 0.9.2, and it seems Pinot 0.9.2 is built with Hadoop 2.7.0 and Spark 2.4.0. Have compatible Spark/Hadoop versions been tested for running ingestion jobs?
  • j

    Jonathan Meyer

    03/17/2022, 4:18 PM
    Hello Pinot community, long time no see 🙂 Is there any way for `SUM` to not return 0 when there are actually no values to aggregate, i.e. to return `null` in such a case?
  • t

    Tony Requist

    03/17/2022, 7:22 PM
    I have a realtime table with
    "realtime.segment.flush.threshold.rows": "10000000",
            "realtime.segment.flush.threshold.time": "6h",
            "realtime.segment.flush.threshold.segment.size": "400M",
    I changed these values two days ago; previously the "rows" limit was 0. Pinot is generating segments with 3,333,333 rows every ~90 minutes, at 95-100MB each, significantly below any of the limits. Server logs show
    Starting consumption on realtime consuming segment ... maxRowCount 33333
    and
    Stopping consumption due to row limit nRows=3333333
    I am trying to figure out where that limit is coming from.
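One reading that is consistent with the numbers in this message, offered as an assumption rather than a confirmed diagnosis: if the table-level row threshold is split evenly across the partitions a server is consuming, then 10,000,000 rows over 3 consuming partitions gives exactly the 3,333,333 rows seen in the log. The partition count of 3 is hypothetical; it does not appear in the message.

```python
def rows_per_consuming_segment(table_row_threshold: int,
                               consuming_partitions_per_server: int) -> int:
    """Evenly split a table-level row threshold across consuming partitions.

    This models an assumed behaviour, not a documented Pinot formula.
    """
    if consuming_partitions_per_server <= 0:
        raise ValueError("need at least one consuming partition")
    return table_row_threshold // consuming_partitions_per_server


# 10,000,000 is the configured realtime.segment.flush.threshold.rows value;
# 3 partitions per server is a hypothetical figure chosen to match the log.
print(rows_per_consuming_segment(10_000_000, 3))
```

Under this reading, the row limit would be reached well before the 6h or 400M limits, which matches segments being cut at roughly 90 minutes and 95-100MB.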
  • l

    Luis Fernandez

    03/17/2022, 9:13 PM
    Hey friends, I want to run something by you. I have been doing some chaos exercises in Pinot to see how it reacts. This is my current scenario:
    Chaos exercise in Pinot
    System config: 1 minion, 2 servers, 2 brokers, 3 controllers, 3 zookeepers, data replication 2, backup GCS, environment GKE
    Scenario: downsize to 1 server, remove the server PVC, see the impact, try to go back to normal (2 servers).
    Steps:
    1. Downsize server to 1 with kubectl scale
    2. Remove the PVC of server 1 with kubectl delete pvc
    3. Observation: p99 response time in the system is still strong, no noticeable changes
    4. Upsize back to 2 with kubectl scale
    5. Observation: things don’t kick in automatically. It seems like there are some manual steps I have to do: I don’t see the new server consuming or pulling data from GCS, and I still see the old server in the servers UI in the pinot-controller. It seems like I need to run a rebalance at this point
    6. Update the offline and online tags from the old server with the endpoint in pinot-controller
    7. Seems like we can issue a rebalance now
    8. Issuing it with the following: dryRun=false, reassignInstances=true, includeConsuming=false, bootstrap=true, downtime=false, minAvailableReplicas=true, bestEfforts=false
    9. Observation: not seeing noticeable changes in p99 response time
    At this point the second instance is still not in a great state and not consuming; however, the system is okay, still performing at ms for p99s.
    Questions:
    • What should I look for in the pinot-controller logs when a rebalance is done?
    • When should I delete the old server tag? Do I also need to issue an updateBrokerResource? I tried to delete it, but it says that Instance Server_10.12.64.88_8098 exists in ideal state for table and it doesn’t let me drop it; at this point I cannot see the tables in the UI
    • Is there anything else I should have done while rebalancing?
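For reference, the rebalance call from step 8 can be written down as a controller URL. This is a minimal sketch that only builds the URL and sends nothing: the controller address, table name, and table type are hypothetical, and the option values are copied from the message as written.

```python
from urllib.parse import urlencode


def rebalance_url(controller: str, table: str, table_type: str, **options: str) -> str:
    """Build the URL for a POST to the controller's table rebalance endpoint."""
    query = urlencode({"type": table_type, **options})
    return f"{controller}/tables/{table}/rebalance?{query}"


url = rebalance_url(
    "http://localhost:9000",  # hypothetical controller address
    "myTable",                # hypothetical table name
    "OFFLINE",                # hypothetical table type
    dryRun="false",
    reassignInstances="true",
    includeConsuming="false",
    bootstrap="true",
    downtime="false",
    minAvailableReplicas="true",  # value as given in the message
    bestEfforts="false",
)
print(url)
```

Flipping `dryRun` to `"true"` in the same call is a low-risk way to see what the controller would do before applying it.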
  • s

    Sandeep R

    03/18/2022, 1:19 AM
    Hi team, I have a problem with this timestamp. We have this date format in the table: "LOG_TS": "2022-03-09T16:47:42.995+00:00". I am adding the date format below, but I'm not sure whether it is correct:
    "name": "LOG_TS",
          "dataType": "LONG",
          "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyy-mm-ddThh:mm:ss.sssZ",
          "granularity": "1:MILLISECONDS"
  • l

    Luis Fernandez

    03/18/2022, 4:12 PM
    Hey friends, I have a question regarding `Table Consuming Latency`. I have been turning various parts of Pinot off and on to see how it behaves. This time I turned off, for some time, the Kafka app that produces the records to Pinot. I saw a latency increase when I turned off the app: p99 was 160ms and is now over a minute. When things like this happen, when do you expect Pinot to get back to its regular level? Does it ever get back? I was thinking that as the day goes by and this topic starts to get less traffic, things might come down, but I was wondering whether it can recover some other way. Of course this is still pretty fast, but I’m wondering how the p99 times would be affected if I took the app down for a longer time.
  • l

    Luis Fernandez

    03/18/2022, 7:00 PM
    Hey friends, it’s me again. I was using Apache ab to run a simple load test against the brokers in Pinot, and we noticed that the exceptions on the server skyrocketed while ab was running. This seems to be the stack trace:
    Encountered exception while processing requestId 9610 from broker Broker_pinot-broker-1.pinot-broker-headless.pinot.svc.cluster.local_8099
    java.lang.NullPointerException: null
            at org.apache.pinot.core.util.trace.TraceContext.getTraceInfo(TraceContext.java:191) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
            at org.apache.pinot.core.query.executor.ServerQueryExecutorV1Impl.processQuery(ServerQueryExecutorV1Impl.java:223) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
            at org.apache.pinot.core.query.executor.QueryExecutor.processQuery(QueryExecutor.java:60) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
            at org.apache.pinot.core.query.scheduler.QueryScheduler.processQueryAndSerialize(QueryScheduler.java:151) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
            at org.apache.pinot.core.query.scheduler.QueryScheduler.lambda$createQueryFutureTask$0(QueryScheduler.java:137) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
            at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
            at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:111) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
            at shaded.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:58) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
            at shaded.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:75) [pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-b7c181a77289fccb10cea139a097efb5d82f634a]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
            at java.lang.Thread.run(Thread.java:829) [?:?]
    Does anyone know what this NullPointerException may refer to?
  • w

    Weixiang Sun

    03/18/2022, 8:25 PM
    In an upsert table, can we update the timestamp of a row?
  • b

    Bordin Suwannatri

    03/21/2022, 5:15 AM
    Hi, I'm trying to create a realtime table that consumes from Kafka. It is created, but its status is BAD. Please help. The error log is:
    Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__2__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
    Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__3__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
    Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__4__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
    Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__5__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
    Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__6__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
    Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__7__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
    Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.937 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__8__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
    Mar 21 12:06:40 poc-pinot01 pinot-admin.sh: 2022/03/21 12:06:40.938 ERROR [MessageGenerationPhase] [HelixController-pipeline-default-pinot-uat-(1ad8bfc9_DEFAULT)] Event 1ad8bfc9_DEFAULT : Unable to find a next state for resource: user_stream_tykb_REALTIME partition: user_stream_tykb__9__0__20220321T0506Z from stateModelDefinitionclass org.apache.helix.model.StateModelDefinition from:ERROR to:CONSUMING
    Mar 21 12:07:07 poc-pinot01 systemd-logind: Removed session 590.
  • a

    Ali Atıl

    03/21/2022, 7:22 AM
    Hi everyone, do indexes also work for multi-valued columns?
  • d

    Diana Arnos

    03/22/2022, 2:01 PM
    👋 Hey there, I have a strange situation going on. I have 2 servers set up. At some point they had a problem and restarted, and they are still running fine. I can see in the logs that they are consuming data normally:
    Consumed 261 events from (rate:3.1030054/s), currentOffset=763096, numRowsConsumedSoFar=288096, numRowsIndexedSoFar=288096
    ....
    [Consumer clientId=consumer-455, groupId=] Discovered group coordinator <redacted> (id: 2147483646 rack: null)
    But the controller still shows them with `dead` status, and when I try to query the data, I see this in the broker log:
    No server found for request 1: select responseId from responseCount limit 1
    And this is the response from the query API:
    {
      "exceptions": [],
      "numServersQueried": 0,
      "numServersResponded": 0,
      "numSegmentsQueried": 0,
      "numSegmentsProcessed": 0,
      "numSegmentsMatched": 0,
      "numConsumingSegmentsQueried": 0,
      "numDocsScanned": 0,
      "numEntriesScannedInFilter": 0,
      "numEntriesScannedPostFilter": 0,
      "numGroupsLimitReached": false,
      "totalDocs": 0,
      "timeUsedMs": 0,
      "offlineThreadCpuTimeNs": 0,
      "realtimeThreadCpuTimeNs": 0,
      "segmentStatistics": [],
      "traceInfo": {},
      "minConsumingFreshnessTimeMs": 0,
      "numRowsResultSet": 0
    }
    How can I make the Controller see they are alive? 👀
  • w

    Weixiang Sun

    03/23/2022, 4:52 AM
    When I try to use the lookup UDF join between a dimension table and a realtime table, it does not work, but it works between a dimension table and an offline table. Is this expected? I do not see such a restriction in https://docs.pinot.apache.org/users/user-guide-query/lookup-udf-join. Is there anything missing?
  • b

    Bordin Suwannatri

    03/23/2022, 8:30 AM
    Hello everyone, I found an error with the transformFunction jsonPathString: I cannot use the word order in it. "transformFunction": "jsonPathString(order,'$.channel')" does not work. As a test, I renamed the JSON field from order to hello and used "transformFunction": "jsonPathString(hello,'$.channel')", and that works. Why can't I use "order"? My real JSON messages use "order". Please help.
  • e

    eywek

    03/23/2022, 4:59 PM
    Hello, I was wondering if it’s possible to partition segments based on a field value (but without any transformation). For example, I store in Pinot events from multiple websites; those events have a name (e.g. `purchase`, `page_view`…) and I would like to create a segment per event name (with a size limit, of course). Since those events are user-defined, I can’t really know how many partitions I’ll have. I’ve seen the Murmur, HashCode… partition configs, but they don’t ensure that each event type will have a dedicated segment (e.g. I don’t want `page_view` and `purchase` events to be in the same segments, to avoid loading any `page_view` data when doing a query on `purchase` ones). Thank you
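The concern about hash-based partitioning can be illustrated with a small sketch: with a fixed number of partitions and an open-ended set of event names, some names must end up sharing a partition. The hash below (`zlib.crc32`) is only a stand-in and is not Pinot's Murmur implementation; the extra event names and the partition count are hypothetical.

```python
import zlib
from typing import Dict, List


def partition_of(event_name: str, num_partitions: int) -> int:
    """Map an event name to a partition with a stand-in hash (not Pinot's Murmur)."""
    return zlib.crc32(event_name.encode("utf-8")) % num_partitions


def shared_partitions(event_names: List[str], num_partitions: int) -> Dict[int, List[str]]:
    """Return only the partitions that hold more than one distinct event name."""
    buckets: Dict[int, List[str]] = {}
    for name in event_names:
        buckets.setdefault(partition_of(name, num_partitions), []).append(name)
    return {p: names for p, names in buckets.items() if len(names) > 1}


# Five event names into four partitions: at least two names must collide.
names = ["purchase", "page_view", "signup", "add_to_cart", "logout"]
collisions = shared_partitions(names, 4)
print(collisions)
```

Which names collide depends on the hash, but the pigeonhole argument holds for any hash function once there are more distinct names than partitions.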
  • w

    Wei Li

    03/24/2022, 6:51 AM
    Hi, I am setting up Pinot in AWS EKS. The clusters are successfully set up in EKS. However, when I try to create the schema and load data (Sec 3.4 in this doc https://docs.pinot.apache.org/basics/getting-started/kubernetes-quickstart) by running `kubectl apply -f pinot/pinot-realtime-quickstart.yml`, I see the jobs are created but not running.
  • a

    ahsen m

    03/24/2022, 5:45 PM
    Hello, is there any tutorial on connecting Pinot with MongoDB?
  • l

    Luis Fernandez

    03/25/2022, 6:44 PM
    Does anyone know why a server that has been marked as Dead, had its tags updated, and then had a rebalance issued would still pop up in the `IdealState` in ZooKeeper?
  • d

    Diogo Baeder

    03/25/2022, 11:35 PM
    Hi folks! Now that we're using Pinot with realtime tables in production, I'm also doing some experiments with offline tables for something else I'm developing. However, one thing I'd like to do is to be able to partition the data according to the values in some of the dimension columns. I'll follow in a thread:
  • d

    Diogo Baeder

    03/27/2022, 11:03 PM
    Hi again folks! Related to my previous question, but not the same: what's the best partitioning strategy for a STRING column: Murmur, HashCode or ByteArray? What are the criteria I should use to choose what's the best for my case?
  • d

    Diana Arnos

    03/28/2022, 12:08 PM
    Hello again 😬 How can I set up S3 as deep storage while using the Helm chart? I tried adding the configs from this article to `controller.extra.configs`, but every time I do it the controller starts responding with `502 Bad Gateway` and I can't see anything wrong in the logs. Results from `helm template` are in the thread.
  • b

    Bordin Suwannatri

    03/28/2022, 3:56 PM
    Hi guys, I have multiple Kafka clusters using SASL, each with a separate Kerberos setup. I don't know which realtime table parameter points to krb5.conf or to its contents. I need to configure realtime tables against multiple KDCs. Please recommend which parameter to use, or share an example.
  • l

    Luis Fernandez

    03/28/2022, 5:28 PM
    Hey friends... I was issuing rolling updates for pinot-servers with `kubectl`; however, I noticed that when I run this command I always get a brand new server and have to issue rebalances again. Is restarting servers something that requires rebalancing? I'm pretty sure there must be something funky going on with our config.
  • l

    Lakshmanan Velusamy

    03/28/2022, 6:45 PM
    Hi community, can dimension tables be created across different tenants with the same name?
  • a

    ahsen m

    03/29/2022, 1:34 AM
    So I updated the values like this:
    persistence:
          enabled: true
          accessMode: ReadWriteOnce
          size: 2G
          mountPath: /var/pinot/controller/data
          storageClass: ""
          extraVolumes:
            - name: gcp-credentials-volume
              secret:
                secretName: gcp-credentials
                items:
                - key: gcp_creds_json
                  path: gcp_credentials.json
          extraVolumeMounts:
            - name: gcp-credentials-volume
              mountPath: /opt/pinot/gcp
              readOnly: true
    But when I run `helm template testing --debug .`, the template it generates does not have any volume mount named `gcp-credentials-volume`. Any ideas?
  • s

    sunny

    03/29/2022, 1:43 AM
    Hi, I have an issue with partitioning in Pinot. When I query with 'select where in' on the partition column, it doesn't show any records, but 'select where not in' on the partition column seems OK. After the segment is flushed, the 'select where in' query returns the right records; for newly produced rows (before the segment is flushed), it doesn't show them.
    • realtime table
    • partition column: subject
    • Kafka topic partitions = 3
    • Pinot partition function: Murmur
  • m

    Mohammed Galalen

    03/29/2022, 6:01 AM
    Hi team, I was trying to compile Pinot from source on a MacBook Pro M1 and I got two errors during compilation, one regarding `protoc-gen-grpc-java-1.4.0-osx-x86_64` and the other regarding `com.github.eirslett:frontend-maven-plugin:1.1`. I had to upgrade `com.github.eirslett:frontend-maven-plugin` to `1.11.0` and download `protoc-gen-grpc-java-1.4.0-osx-x86_64` manually. But I couldn't run the example, and I'm getting this error:
    Failed to start a Pinot [SERVER] at 15.16 since launch
    java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: event executor terminated
        at org.apache.pinot.core.transport.QueryServer.start(QueryServer.java:136) ~[pinot-all-0.10.0-SNAPSHOT-jar-with-dependencies.jar:0.10.0-SNAPSHOT-649f5988d5746869ef6a690f4747ff4d6fb9c607]
        at org.apache.pinot.server.starter.ServerInstance.start(ServerInstance.java:165)
  • k

    Kamal Chavda

    03/29/2022, 5:50 PM
    Is anyone using Tableau with Pinot? Getting this error when trying to connect to hosted instance:
  • d

    Diogo Baeder

    03/30/2022, 12:31 AM
    Hey guys, a few surprises I had with 0.10.0:
    • The `segmentPartitionConfig` map doesn't accept the mapping of column to partition config directly, as the table configuration documentation says; it seems it can only contain a `columnPartitionMap` field, which in turn contains the mapping between column and partition config.
    • The `segmentsConfig` seems to have had its old `replicasPerPartition` renamed to `replication`, if I understand correctly. Or maybe I just don't understand where each should be used, if both are valid (although the config docs don't mention `replicasPerPartition` anymore).
    Should I open a ticket on GitHub about these? Or am I getting something wrong perhaps?
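The nesting described in the first point can be sketched as follows. The snippet only builds and prints the JSON fragment; the column name `myDimension` and the partition count are hypothetical, and the shape follows the description in this message rather than a checked copy of the 0.10.0 documentation.

```python
import json

# Shape as described in the message: segmentPartitionConfig wraps a
# columnPartitionMap, which maps each column to its partition settings.
index_config_fragment = {
    "segmentPartitionConfig": {
        "columnPartitionMap": {
            "myDimension": {          # hypothetical column name
                "functionName": "Murmur",
                "numPartitions": 4,   # hypothetical partition count
            }
        }
    }
}

print(json.dumps(index_config_fragment, indent=2))
```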