# troubleshooting
  • x

    Xiang Fu

    04/02/2023, 8:04 PM
    Is this due to the user/group permissions?
  • d

    Damon

    04/03/2023, 7:41 AM
    Hello team, a question about upsert and segments: when a row is upserted, does an existing segment get updated, or does a new segment get created?
  • d

    Dugi Sarma

    04/03/2023, 7:52 AM
    Hello all, I have an EKS cluster where I have Pinot deployed. When I try to run the following:
    bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile bin/job.yml
    I get:
    Caused by: java.io.IOException: software.amazon.awssdk.services.s3.model.S3Exception: Access Denied (Service: S3, Status Code: 403, Request ID: FV53WSKHTDD1XC20, Extended Request ID: 49ztRi8mEScOkZn2z80nf/5r1q3F7rs1NC0FvjFDT/tpZfLUVF94euXtZ3Ya7PMN/qlEPioI99U=)
    I have given the existing Kubernetes namespace where the pods are deployed access to the S3 bucket, so I am wondering what I am missing. Does it also need the AWS access/secret keys explicitly?
  • e

    Eric Song

    04/03/2023, 8:42 AM
    Hi team, a question about TEXT_MATCH. I have configured a text index in the tableConfig, but when I query data from 1 or 2 days ago, I sometimes get one of three exceptions: ParseException, NullPointerException, or ArrayIndexOutOfBoundsException. Data older than 3 days does not trigger the exceptions.
  • s

    Sid

    04/03/2023, 9:05 AM
    Hi team, I have been trying to ingest from S3 following this tutorial: https://startree.ai/blog/apache-pinot-0-11-inserts-from-sql The files in S3 are in json.gz format, but the task keeps failing in the minion with a runtime exception.
  • l

    Luis Fernandez

    04/03/2023, 2:16 PM
    Hello friends, long time! I have a question. We have a requirement to keep a real-time average of a field per day, and I was wondering how to achieve that, given that AVG is not supported here: https://docs.pinot.apache.org/developers/advanced/ingestion-level-aggregations#allowed-aggregation-functions. Is the way to do it to compute the running average in Kafka itself and then use some upsert logic in Pinot to update that running average? We are looking at daily granularity for this.
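One way to frame the question above: a per-day average can be rebuilt at read time from a running sum and a running count, both of which are plain additive aggregations. The sketch below only illustrates that arithmetic in Python. The `DailyAverage` class and its method names are hypothetical, and it makes no claim about which aggregation functions a given Pinot version accepts at ingestion.

```python
from collections import defaultdict


class DailyAverage:
    """Keeps a running (sum, count) per day; the average is derived on read."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def add(self, day, value):
        # Both updates are additive, so they can be applied incrementally.
        self._sum[day] += value
        self._count[day] += 1

    def average(self, day):
        # Use .get so that reading an unseen day does not create an entry.
        count = self._count.get(day, 0)
        if count == 0:
            return None
        return self._sum[day] / count


tracker = DailyAverage()
tracker.add("2023-04-03", 10)
tracker.add("2023-04-03", 20)
tracker.add("2023-04-04", 5)
print(tracker.average("2023-04-03"))
```

Keeping the sum and count separate avoids having to rewrite a stored average on every event; the division happens only when the value is read.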
  • r

    Ravi Singal

    04/03/2023, 4:01 PM
    Hi team, one question regarding the Pinot controller UI. I want to put it behind an nginx proxy. How can I change the root from / to /pinot? This will allow me to route the requests based on path.
    - path: /pinot
      pathType: Prefix
      backend:
        service:
          name: pinot-controller
          port:
            number: 9000
  • d

    Dugi Sarma

    04/03/2023, 5:09 PM
    Hello all, I had a question regarding filtering. I was advised to do something like this for filtering:
    "filterConfig": {
      "filterFunction": "strcmp(name, 'pdp_view') != 0"
    },
    "transformConfigs": [
      {
        "columnName": "name",
        "transformFunction": "JSONPATH(event, '$.event_name')"
      }
    ]
    However, this just converts the other values of the column to null; it does not filter those rows out completely. Is there anything I could be missing?
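For readers following along, here is a toy model of the behavior being asked for: derive `name` from the event JSON, then drop every row for which a predicate is true. It is a sketch in plain Python, not Pinot's implementation. The `ingest` and `add_name` helpers are hypothetical, and the ordering (transform first, filter second) is an assumption made for illustration.

```python
import json


def ingest(rows, transform, drop_if):
    """Apply the transform to every row, then drop rows where drop_if is true."""
    kept = []
    for row in rows:
        row = transform(dict(row))
        if not drop_if(row):
            kept.append(row)
    return kept


def add_name(row):
    # Stand-in for JSONPATH(event, '$.event_name') on a flat JSON string.
    row["name"] = json.loads(row["event"]).get("event_name")
    return row


rows = [
    {"event": '{"event_name": "pdp_view"}'},
    {"event": '{"event_name": "add_to_cart"}'},
]

# The predicate is true for every row whose name differs from 'pdp_view',
# so under drop-when-true semantics only 'pdp_view' rows survive.
kept = ingest(rows, add_name, lambda r: r["name"] != "pdp_view")
print([r["name"] for r in kept])
```

The point of the sketch is the direction of the predicate: a row is removed when the condition holds, so the condition should describe the rows to discard, not the rows to keep.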
  • s

    Shreeram Goyal

    04/04/2023, 7:34 AM
    Hi, I am trying the ingestion transform JSONPATHLONG while ingesting offline data, with -1 as the default value for parsing errors. I have also set -1 as the defaultNullValue in the table schema, but when I query:
    select * from table where column is null
    I get no records, whereas when I query:
    select * from table where column=-1
    I get records. How can I rectify this?
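The difference between the two queries above can be pictured with a toy column model. This is a sketch, not Pinot's storage code. It assumes that a null is stored as the configured default value, and that an `IS NULL` lookup can only match rows that were separately recorded as null (in Pinot this is tied to null handling being enabled for the table). The `LongColumn` class is hypothetical.

```python
class LongColumn:
    """Toy column: stores a default in place of nulls, optionally tracking them."""

    def __init__(self, default_null_value, track_nulls):
        self.default_null_value = default_null_value
        self.track_nulls = track_nulls
        self.values = []
        self.null_rows = set()

    def append(self, value):
        if value is None:
            if self.track_nulls:
                # Remember the row index so IS NULL can find it later.
                self.null_rows.add(len(self.values))
            value = self.default_null_value
        self.values.append(value)

    def where_is_null(self):
        return sorted(self.null_rows)

    def where_equals(self, target):
        return [i for i, v in enumerate(self.values) if v == target]


untracked = LongColumn(default_null_value=-1, track_nulls=False)
tracked = LongColumn(default_null_value=-1, track_nulls=True)
for col in (untracked, tracked):
    col.append(7)     # ordinary value
    col.append(None)  # genuinely missing value
    col.append(-1)    # -1 produced by a transform default, not a null

print(untracked.where_is_null(), untracked.where_equals(-1))
print(tracked.where_is_null(), tracked.where_equals(-1))
```

In this model a -1 written by the transform's error default is an ordinary value, so it matches `column = -1` but is never reported as null, even when null tracking is on.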
  • s

    Soumitra Kumar

    04/04/2023, 7:30 PM
    Hello, I am trying to understand how Fault-Domain-Aware Instance Assignment works, details in the thread. cc: @Jia Guo @Mayank
  • z

    Zhuangda Z

    04/04/2023, 8:07 PM
    Hi 👋 Does anyone have an example of a starts-with match using the native text index?
  • a

    abhinav wagle

    04/04/2023, 11:54 PM
    Hello, I want to connect a Trino cluster to multiple Pinot clusters. As described here, if I add them comma-separated like this:
    pinot.controller-urls=cluster1:8098,cluster2:8098
    the Trino client randomly picks the first or the second one for each new session. What config should I add so that I can query both Pinot clusters from a single Trino cluster?
  • p

    pramod shenoy

    04/05/2023, 5:16 AM
    Hi team, are the properties below valid in Pinot 0.12 for the ZooKeeper username and password in the controller conf?
    controller.zk.username=<XYZ>
    controller.zk.password=<XYZ>
  • m

    Malte Granderath

    04/05/2023, 10:17 AM
    Hey all 👋 Is there any way to exclude specific dimension values from a star-tree index? We have a scenario where we use a boolean to filter out "soft-deleted" events, and we only care about the metrics for the events that are not soft-deleted. Example:
    SELECT COUNT(*) FROM events WHERE some_group_id = x AND is_obsoleted = False
    So in our queries the some_group_id would change, but we would always filter for is_obsoleted = False. Thanks for any help here!
  • m

    Malte Granderath

    04/05/2023, 12:36 PM
    Another question about the logging of queries: Is there currently any way to direct the query logs on the brokers to a separate file? It would help a lot to have those not flood the regular logs 🙂
  • s

    Saket Kothari

    04/05/2023, 1:11 PM
    Hi, I am seeing very high query response times for my data. I am running a count(*) for a user_id and have created an inverted index on this column. The query is slow the first time for each unique user and runs very fast afterwards. I am using 1 broker, 1 controller, and 1 server with no CPU limit and 2 GB of memory. Can someone please help?
  • p

    Pratik Tibrewal

    04/06/2023, 9:05 AM
    Hey can someone help me with this issue: https://github.com/apache/pinot/issues/10552
  • s

    Saket Kothari

    04/06/2023, 1:22 PM
    Hi team, I am getting an error while bulk ingesting Parquet files from GCS. It's just a single Parquet file (~11 KB). Exception details in thread.
  • d

    Dugi Sarma

    04/06/2023, 2:49 PM
    Hi team, I am ingesting around 25k files from S3. The ingestion is very slow (around 4 records per second) and has resulted in 14471 segments when the size is only 138 MB. I am attaching the jobSpec YAML. Can someone please point out any glaring mistakes or places where there can be improvements? Currently using EKS with 2 nodes.
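To put the figures in the message above in proportion, the short calculation below derives the average segment size and the number of input files per segment. It uses only the numbers quoted in the message; "around 25k" is taken as 25,000, which is an approximation.

```python
total_mb = 138       # total data size quoted in the message
segments = 14471     # segment count quoted in the message
files = 25_000       # "around 25k" in the message, so approximate

# Average size of one segment, in KB (1 MB taken as 1024 KB).
avg_kb_per_segment = total_mb * 1024 / segments

# Roughly how many input files ended up in each segment.
files_per_segment = files / segments

print(f"{avg_kb_per_segment:.2f} KB per segment")
print(f"{files_per_segment:.2f} files per segment")
```

The result is on the order of 10 KB per segment and fewer than two input files per segment, which is why the segment count looks so high relative to the data volume.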
  • r

    Ryan Ivey

    04/06/2023, 7:46 PM
    We're testing Pinot ingestion from Pulsar. After initially creating the cluster with 3 servers, I increased it to 5. This immediately resulted in several bad segments, and the overall size of the only table ingested declined by about 200G. I've tried rebalancing and reloading all segments, but nothing changes. Initially all 5 servers had the same tags; I've since removed the realtime tag from the 2 most recently created servers. I've narrowed the bad segments down to segments missing on specific servers. How can these bad segments be reloaded, or what is the recommended process when increasing the number of servers?
  • e

    Eaugene Thomas

    04/07/2023, 6:28 AM
    Hey, I am getting this error for dimension table segments in Pinot. Exception trace in thread.
  • k

    Kevin Xu

    04/07/2023, 1:22 PM
    Hi team, we have a question about how to deal with 400K+ segments in one table. We tried to increase the segment count to 200K+, but found that the IDEALSTATE and EXTERNALVIEW znodes for the table are both 1M+ in size. Reading these znodes when responding to completed-segment requests took a long time, which caused query timeouts in the server log. When segments are built very frequently, Pinot does not seem able to support that much data. Am I right? If so, have you looked for a solution to this situation?
  • a

    abhinav wagle

    04/07/2023, 9:54 PM
    Hello, I am trying to change Pinot ingestion by pointing the Pinot table to an SSL-based Kafka using the changes shown here, and I am running into the following exception when I try to create the table. I don't see any specific error logs in pinot-controller for what is causing the issue. Any pointers on where/how I can dig deeper? Thanks!
    {
      "code": 500,
      "error": "org.apache.pinot.shaded.org.apache.kafka.common.KafkaException: Failed to construct kafka consumer"
    }
  • s

    Shreeram Goyal

    04/09/2023, 8:11 PM
    Hi, I am facing an issue while applying an ingestion transform on a real-time (RT) stream where I have nested JSON and want to extract a value from it. My data looks like:
    {....., "nested_json" :{"json_obj": {"a":{"b":"c"}}}, .....}
    Please advise what to apply in the ingestion transform.
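For the sample record in the message above, the value "c" sits at the key path nested_json → json_obj → a → b. The sketch below walks that path in plain Python to make the nesting explicit. The `extract` helper and its dotted-path syntax are hypothetical, used only for illustration. In JSONPath notation the same location, relative to the `nested_json` field, would be written `$.json_obj.a.b`.

```python
import json


def extract(doc, dotted_path):
    """Walk a dotted path through nested dicts; return None if any key is missing."""
    node = doc
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Only the nested field from the message is reproduced; the elided fields are omitted.
record = json.loads('{"nested_json": {"json_obj": {"a": {"b": "c"}}}}')
value = extract(record, "nested_json.json_obj.a.b")
print(value)
```

Returning `None` for a missing key keeps a single malformed record from raising, which is usually the safer choice in an ingestion path.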
  • v

    Venkat Boina(VB)

    04/10/2023, 11:27 AM
    Hi all, if I change my index config and reload all segments, I don't see the index changes getting applied. Also, rebalancing brokers does not seem to be working. Any help with this would be appreciated.
  • s

    Shreeram Goyal

    04/10/2023, 11:41 AM
    Hi, I was trying to ingest data offline using Spark and found an issue with a few rows: I have applied the ingestion transform JSONPATHSTRING to extract a string from a JSON field, and for those rows the field wasn't valid JSON. Can't I get a null value in that case instead of the whole dataframe being dropped? Is there a workaround that simply returns null for such malformed JSON? Losing data isn't desirable for us.
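The behavior requested above (null for malformed JSON rather than a failure) can be sketched as a small parsing helper. This is a plain Python illustration that could, for example, run as a pre-processing step before ingestion. The `json_path_string_or_none` name is hypothetical and is not a Pinot or Spark function, and it handles only a single top-level key.

```python
import json


def json_path_string_or_none(raw, key):
    """Return the string under key, or None when raw is not valid JSON."""
    try:
        doc = json.loads(raw)
    except (TypeError, ValueError):
        # json.JSONDecodeError subclasses ValueError; TypeError covers non-string input.
        return None
    if not isinstance(doc, dict):
        return None
    value = doc.get(key)
    return None if value is None else str(value)


print(json_path_string_or_none('{"k": "v"}', "k"))
print(json_path_string_or_none('{bad json', "k"))
```

Catching the parse error per row means one bad record yields a null for that row and leaves the rest of the batch untouched.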
  • s

    Saket Kothari

    04/10/2023, 1:37 PM
    Hi, I am trying to batch ingest through spark-submit on a Dataproc cluster. Can someone please let me know what I am doing wrong? Details in thread.
  • l

    Lee Wei Hern Jason

    04/10/2023, 3:03 PM
    Hi team, we have 2 brokers, and on one of them direct memory usage increases and doesn't come back down. We have allocated 1.5g of direct memory, and the dashboard shows that it peaked at 0.9g. We saw logs about the broker hitting a direct memory OOM. Searching through Pinot community threads, I found someone who faced a similar issue, but unlike in his case, I didn't set -XX:+DisableExplicitGC. Does anyone know why direct memory wasn't GC'd if it was full, or why it shows OOM when it didn't exceed the max?
  • p

    pramod shenoy

    04/10/2023, 10:22 PM
    Hi team, I am getting an exception when configuring HDFS as deep storage in the controller conf. Any pointers on what I may be doing wrong? Details in thread.