# troubleshooting

    Mahesh babu

    08/24/2022, 4:55 AM
```
java.lang.NullPointerException: null
2022/08/24 04:37:59.411 ERROR [LLRealtimeSegmentDataManager_DISH13__0__0__20220823T0834Z] [DISH13__0__0__20220823T0834Z] Caught exception while transforming the record: { "nullValueFields" : [ "Destination", "RAN-UE-NGAP-ID", "5G-TMSI", "5GMM_cause", "5GSM_cause", "No.", "Info", "Protocol", "Source", "Destination_Port" ], "fieldToValueMap" : { "Destination" : "null", "RAN-UE-NGAP-ID" : "null", "Time" : null, "No." : "null", "Info" : "null", "Source" : "null", "MSIN" : "100000031", "STATUS" : "OK", "5G-TMSI" : "null", "5GMM_cause" : "null", "5GSM_cause" : "null", "Protocol" : "null", "Destination_Port" : "null" } }
```

    Piyush Chauhan

    08/24/2022, 5:03 AM
I am not getting correct data after upsert. The 2nd row was already there and I wanted to update it, but when I sent a new Kafka message with the same `id` (the primary key in the schema) and updated `exception_count` and `updated_at`, it created a new record (1st row). When querying from the Query Console I get the attached results for the given id, but when I query using the Java client I get only the older record, whose `updated_at` value is 1660654106. *All of the columns in the screenshot are actual columns; they are not derived.

    Luis Fernandez

    08/24/2022, 8:50 PM
Question: I was doing a scaling exercise today and a couple of things happened. Our setup is 2 servers, and we were trying to increase the number of cores. We did this by adding the configs in Kubernetes and applying them by deleting a pod, say `pinot-server-1`. At that point `pinot-server-1` was getting scaled up while `pinot-server-0` was working without issue and serving traffic. After a bit, when `pinot-server-1` was coming back up, we started getting the following error:
```json
[
  {
    "message": "java.net.UnknownHostException: pinot-server-1.pinot-server-headless.pinot.svc.cluster.local\n\tat java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:797)\n\tat java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1509)\n\tat java.base/java.net.InetAddress.getAllByName(InetAddress.java:1368)\n\tat java.base/java.net.InetAddress.getAllByName(InetAddress.java:1302)",
    "errorCode": 425
  },
  {
    "message": "2 servers [pinot-server-0_R, pinot-server-1_R] not responded",
    "errorCode": 427
  }
]
```
and also
```
2022-08-24 16:47:32
java.net.UnknownHostException: pinot-server-1.pinot-server-headless.pinot.svc.cluster.local
2022-08-24 16:47:32
Caught exception while sending request 183945048 to server: pinot-server-1_R, marking query failed
```
What I'm trying to understand is how queries were getting routed to `pinot-server-1` if it was down. After a bit this problem resolved itself without us doing anything, but we did get some downtime.

    Prashant Pandey

    08/25/2022, 7:54 AM
Hi team, we were facing an access issue with our S3-based deep store and we wanted to disable it. So I went ahead and removed all S3-related configs from the controller config, leaving:
```
controller.helix.cluster.name=myCluster
controller.port=9000
controller.data.dir=/tmp/controller
controller.zk.str=myZKStr
pinot.set.instance.id.to.hostname=true
```
And it doesn't seem to be trying to connect to S3 anymore. Wanted to confirm that this is the right way to do it? Thanks!

    Jinny Cho

    08/25/2022, 7:15 PM
👋 team. Just double-checking one thing: can I paginate a Pinot result? The query is like
```sql
SELECT SUM(a)
FROM my_table
WHERE b = c
GROUP BY d;
```
If not, is there any plan to enable pagination for GROUP BY queries? It'd be really nice to have.
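(At the time of this thread, Pinot did not support `OFFSET` on GROUP BY results, but a degree of pagination can be emulated client-side with `ORDER BY` + `LIMIT` plus a keyset filter. A hedged sketch reusing the column names from the question, assuming `d` is orderable; `'last_d_value'` is a placeholder for the last key returned on the previous page:)
```sql
-- Page 1: first 100 groups, ordered by the group-by key
SELECT d, SUM(a)
FROM my_table
WHERE b = c
GROUP BY d
ORDER BY d
LIMIT 100;

-- Next page: resume after the last key seen on the previous page
SELECT d, SUM(a)
FROM my_table
WHERE b = c AND d > 'last_d_value'
GROUP BY d
ORDER BY d
LIMIT 100;
```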

    suraj sheshadri

    08/26/2022, 1:18 AM
Is there any API that can update the metadata for a segment? I see that one of the segments was created with no metadata information, and I didn't find any way to reload the segment data. All other segments were loaded properly, but one bad segment can make the table unqueryable, and I had to delete it for now. Any suggestions on how to just update the metadata for a segment and fix it?

    Deena Dhayalan

    08/23/2022, 1:20 PM
Hi team, I have found that the working package for @ScalarFunction should be
```
org.apache.pinot.*.function.*
```
But in the doc it is `org.apache.pinot.scalar.XXXX`. Can anyone say whether there is a way to auto-register the package when it is `org.apache.pinot.scalar.XXXX`, or is `org.apache.pinot.*.function.*` the actual package?

    Lars-Kristian Svenøy

    08/26/2022, 9:12 AM
Hey team 👋 What versioning scheme does Pinot use? In which version-number bumps can we expect backwards-compatibility-breaking changes in the API? The theme seems to be that patch releases are not breaking, while minor releases can be. Just want to confirm.

    구상모Sangmo Koo

    08/29/2022, 6:49 AM
Hi, I want to filter out records when there is no 'ts' or 'exp' object.

Table config:
```json
"ingestionConfig": {
  "filterConfig": {
    "filterFunction": "Groovy({ ts == null || ts < 1000000000 || exp == null}, exp, ts)"
  }
}
```
Table schema:
```json
{
  "name": "exp_utc",
  "dataType": "LONG",
  "transformFunction": "exp*1000",
  "format": "1:MILLISECONDS:EPOCH",
  "granularity": "1:MILLISECONDS"
},
{
  "name": "exp_asia_seoul_datetime",
  "dataType": "STRING",
  "transformFunction": "toDateTime((exp*1000)+(timezoneHour('Asia/Seoul')*3600000), 'yyyy-MM-dd HH:mm:ss')",
  "format": "1:SECONDS:EPOCH",
  "granularity": "1:SECONDS"
}
```
A normal record:
```json
{"id":"E8DB","tp":1,"fw":"1.5.0","vc":22,"ts":1661175231,"ri":-59,"ad":3,"exp":1678450474,"ar":{},"mg":{"1":0.2,"2":0.2,"3":0.2}}
```
A record that should be filtered:
```json
{"id":"240A","tp":1,"fw":{"1":{"1":-25,"2":-10,"3":-4}},"ar":{"0":1},"mg":{"1":0.2,"2":0.2,"3":0.2}}
```
The following error occurs because the 'exp' filter does not work:
```
Caused by: java.lang.RuntimeException: Caught exception while executing function: plus(times(exp,'1000'),times(timezoneHour('Asia/Seoul'),'3600000'))
    at org.apache.pinot.segment.local.function.InbuiltFunctionEvaluator$FunctionExecutionNode.execute(InbuiltFunctionEvaluator.java:124) ~[pinot-all-0.9.3-jar-with-dependencies.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
    at org.apache.pinot.segment.local.function.InbuiltFunctionEvaluator$FunctionExecutionNode.execute(InbuiltFunctionEvaluator.java:119) ~[pinot-all-0.9.3-jar-with-dependencies.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
    ... 15 more
Caused by: java.lang.RuntimeException: Caught exception while executing function: times(exp,'1000')
    at org.apache.pinot.segment.local.function.InbuiltFunctionEvaluator$FunctionExecutionNode.execute(InbuiltFunctionEvaluator.java:124) ~[pinot-all-0.9.3-jar-with-dependencies.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
    at org.apache.pinot.segment.local.function.InbuiltFunctionEvaluator$FunctionExecutionNode.execute(InbuiltFunctionEvaluator.java:119) ~[pinot-all-0.9.3-jar-with-dependencies.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
    at org.apache.pinot.segment.local.function.InbuiltFunctionEvaluator$FunctionExecutionNode.execute(InbuiltFunctionEvaluator.java:119) ~[pinot-all-0.9.3-jar-with-dependencies.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
    ... 15 more
Caused by: java.lang.IllegalStateException: Caught exception while invoking method: public static double org.apache.pinot.common.function.scalar.ArithmeticFunctions.times(double,double) with arguments: [null, 1000.0]
```
Is there any problem with this filterFunction?
```json
"filterFunction": "Groovy({ ts == null || ts < 1000000000 || exp == null}, exp, ts)"
```
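(The stack trace suggests the `exp*1000` column transform is still being evaluated for records the filter would drop. One hedged workaround, a sketch rather than a verified fix: make the transform itself null-safe with Groovy so it never multiplies a null. The sentinel behavior of returning null here is an assumption and depends on the column's default-null-value handling:)
```json
{
  "name": "exp_utc",
  "dataType": "LONG",
  "transformFunction": "Groovy({exp == null ? null : exp * 1000}, exp)",
  "format": "1:MILLISECONDS:EPOCH",
  "granularity": "1:MILLISECONDS"
}
```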

    Abdelhakim Bendjabeur

    08/29/2022, 9:41 AM
Hello everyone, I am trying to set up a deep store on GCS and experiment with backup and recovery on Kubernetes. I followed this doc, and Pinot does not seem to upload segments to the bucket. The controller is running with the following config:
```
JAVA_OPTS:-XX:ActiveProcessorCount=2 -Xms256M -Xmx1G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xlog:gc*:file=/opt/pinot/gc-pinot-controller.log -Dplugins.dir=/opt/pinot/plugins -Dplugins.include=pinot-gcs -Dlog4j2.configurationFile=/opt/pinot/conf/log4j2.xml -Dplugins.dir=/opt/pinot/plugins

StartController -configFileName /var/pinot/controller/config/pinot-controller.conf

$ cat /var/pinot/controller/config/pinot-controller.conf
controller.helix.cluster.name=pinot-quickstart
controller.port=9000
controller.data.dir=/var/pinot/controller/data
controller.zk.str=pinot-zookeeper:2181
pinot.set.instance.id.to.hostname=true
controller.task.scheduler.enabled=true
controller.data.dir=gs://pinot-quickstart-deep-storage/data
controller.local.temp.dir=/temp
controller.enable.split.commit=true
pinot.controller.storage.factory.class.gs=org.apache.pinot.plugin.filesystem.GcsPinotFS
pinot.controller.storage.factory.gs.projectId=some-project-id
pinot.controller.storage.factory.gs.gcpKey=/var/pinot/controller/config/gcp-key.json
pinot.controller.segment.fetcher.protocols=file,http,gs
pinot.controller.segment.fetcher.gs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
```
The plugin is there:
```
$ ls /opt/pinot/plugins/pinot-file-system/
pinot-adls  pinot-gcs  pinot-hdfs  pinot-s3
```
When I check the segments in the ZooKeeper property store, I see that the download URL is not a GCS URL:
```
"segment.download.url": "http://pinot-controller-0.pinot-controller-headless.pinot-quickstart.svc.cluster.local:9000/segments/tickets_channels/tickets_channels__2__0__20220826T0933Z",
```
Did anyone succeed at configuring this? Am I missing something? Thanks a lot for your help 🙏
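(One commonly missed piece when enabling split commit with a deep store is mirroring the filesystem configuration on the servers, so they can also upload/fetch segments via GCS. A hedged sketch; the property names mirror the controller config above and should be checked against the Pinot deep-store docs for the version in use:)
```
pinot.server.instance.enable.split.commit=true
pinot.server.storage.factory.class.gs=org.apache.pinot.plugin.filesystem.GcsPinotFS
pinot.server.storage.factory.gs.projectId=some-project-id
pinot.server.storage.factory.gs.gcpKey=/path/to/gcp-key.json
pinot.server.segment.fetcher.protocols=file,http,gs
pinot.server.segment.fetcher.gs.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
```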

    Ruthika Vasudevan

    08/29/2022, 4:18 PM
Hi. I'm looking to get the state of an instance via an API call. I see that `GET /instances/{instanceName}` gives certain info, but I'm looking to specifically get the instance state. Has anyone here tried something similar?

    Tanmesh Mishra

    08/29/2022, 5:23 PM
Hey 👋, I am working on this PR and facing errors (in the thread). Can someone help me resolve them?

    Petr Ponomarenko

    08/29/2022, 9:48 PM
Reposting from #C019DKYBC6P as advised by @Kishore G (thank you): https://apache-pinot.slack.com/archives/C019DKYBC6P/p1661795736703099

    Neeraja Sridharan

    08/29/2022, 11:29 PM
Hey team 👋 We are currently setting up partition-based segment pruning for an OFFLINE table (in addition to existing time-based pruning). We already have Replica-Group Instance Assignment set up and are planning to move to Partitioned Replica-Group Instance Assignment to make partition-based segment pruning more effective. Given the current `replicaGroupPartitionConfig`, any recommendation on how to distribute the (50) servers across (16) partitions for (2) replica groups, i.e. the recommended value for `numInstancesPerPartition`?
```json
"segmentPartitionConfig": {
  "columnPartitionMap": {
    "team_id": {
      "functionName": "Murmur",
      "numPartitions": 16
    }
  }
}
```
```json
"replicaGroupPartitionConfig": {
  "replicaGroupBased": true,
  "numInstances": 0,
  "numReplicaGroups": 2,
  "numInstancesPerReplicaGroup": 25,
  "numPartitions": 0,
  "numInstancesPerPartition": 0
}
```

    Diogo Baeder

    08/30/2022, 9:27 PM
    Hi folks! I'm having an issue where my Server instances go "Dead" every now and then, and I'd like to know if there's a way to check this with a single HTTP request to some "health-check endpoint" or similar. More on this thread.

    Alice

    08/31/2022, 1:08 AM
Hi team, a question about Pinot consuming stream data. Pinot checks the Kafka consuming offset periodically and resets the offset when there's a discrepancy between the Kafka partition offset and Pinot's current consuming offset, right? What's the frequency of this check?

    Diogo Baeder

    08/31/2022, 12:15 PM
Hi guys, I need help. We made some changes to our Pinot cluster deployment values (JVM tuning and liveness probes), redeployed the instances, and now we have lost all our tables (from what's showing in the Incubator UI) and they're not receiving anything from Kafka. What's going on, and how can we recover our tables and segments?

    Naz Karnasevych

    08/31/2022, 2:57 PM
Hey folks, quick question: is it possible to override Pinot configuration properties via env vars? For example, when setting params like `pinot.controller.storage.factory.class.gs`.

    Ethan Huang

    08/31/2022, 3:15 PM
Hi team, when I ran a query with the condition `value > 0`, I got results with `value = 0`. Is this an issue, or should I make some special configuration to avoid this? The query is:
```sql
select clock, value from application_metrics where metric = 'successCount' and statistic='duration' and value > 0 limit 10
```
`clock` is a time column with TIMESTAMP type and `value` is a metric column with DOUBLE type; the others are dimension columns. The Pinot version is 0.11.0-SNAPSHOT, built from source with last commit hash `561e471a86278e0e486cd9e602f8499fc8f8517c`. I also ran this query using the broker query API and got the same wrong results. Screenshots from the Pinot UI and REST API tool are attached. Thank you very much!

    suraj sheshadri

    08/31/2022, 9:30 PM
We have a use case where we need common table expressions or subqueries to achieve the query below.
1) Is there any alternate way in Pinot to achieve the same at this time? We do not want to use Presto/Trino etc. for now.
2) Do we know when subquery / CTE support is planned for Pinot?
3) Do we know when the feature to insert rows into Pinot using a query will be made available? Do we have a way to create temp tables, so we can use those tables in the next step?
4) Do we have a way to output the data of a query to an S3 location? I only see `INSERT INTO "baseballStats" FROM FILE 's3://my-bucket/public_data_set/baseballStats/rawdata/'` in the documentation.
```sql
WITH weekly_agg AS (
  SELECT
    user,
    SUM(req_daily_cap) AS req_cnt,
    SUM(total_day_cnt) AS total_week_cnt
  FROM fcap_day_agg
  GROUP BY user
),
fcap_weekly_agg AS (
  SELECT
    user,
    IF(weeklyFCAP > 0, Min(req_cnt, weeklyFCAP), req_cnt) AS req_weekly_cap,
    total_week_cnt
  FROM weekly_agg
)
SELECT SUM(req_weekly_cap) / SUM(total_week_cnt) AS fcap_factor FROM fcap_weekly_agg
```

    Padma Malladi

    08/31/2022, 9:45 PM
Hi, what are the use cases in which Pinot's ZooKeeper persistence storage would grow? What would be the WAL?

    Padma Malladi

    08/31/2022, 9:46 PM
If I increase the segment flush threshold time to 5 min, would that cause the ZooKeeper persistence storage to go up?

    suraj sheshadri

    09/01/2022, 2:58 AM
I am seeing that a Pinot query doesn't work for a "not like" clause; it works as "like". Is there a known bug?

    Juraj Pohanka

    09/01/2022, 9:01 AM
Hello, we are migrating to Pinot and we have come across an interesting problem. While estimating the required infrastructure for our Pinot cluster, we observed that when we have a small number of server nodes (we use c5.18xlarge), the CPU utilization on a single query is relatively small per individual server, leading to longer query runtimes. With Trino, we always get full utilization of the cluster (if not limited by resource groups). We have `pinot.server.query.executor.max.execution.threads` set to -1, and thus have no upper limit on the threads used by a single query. Has anyone experienced similar behavior?

    Jinny Cho

    09/01/2022, 4:08 PM
👋 Team. Have one question. I found in the doc that we can technically use a `HAVING` clause (here). However, when I try it, I receive the following error message. The query I tried was the same as what was written in the doc. Is this something new that we need to upgrade for?
```
Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered "HAVING" at line 11, column 1.
Was expecting one of:
    <EOF> 
    "LIMIT" ...
    "OFFSET" ...
...
```

    Padma Malladi

    09/01/2022, 5:43 PM
Hi, is there a way for me to get the consuming rows (yet to be persisted) with a query? I am seeing that the Kafka messages are coming through, but the data is not yet persisted in Pinot for some of the indexed attributes. I set the segment threshold size to a much lower value and it doesn't help. I set the threshold time to 1 min and it persists every minute, but that's not something I like, as ZK's storage is increasing rapidly. I also would like the data to be returned to me much sooner than 1 min.

    suraj sheshadri

    09/02/2022, 1:14 AM
Is there a way to use the cast function on a multi-value column? E.g. `select cast(deals as STRING) from bidrequest`, where `deals` is a multi-value column.

    Deena Dhayalan

    09/02/2022, 5:35 AM
I am trying to query a table in Trino 391 but am getting this error, as I have used the TIMESTAMP datatype. Is there any support for it in Trino?

    Stuart Millholland

    09/02/2022, 1:41 PM
So this is not a huge deal as it's a dev environment, but we accidentally loaded about 20 mutable realtime segments into an immutable offline table and want to delete them, but we get this error message: `Table name: immutable_events_OFFLINE does not match table type: REALTIME`. Can we manually delete those segments?