# troubleshooting
  • Priyank Bagrecha — 06/22/2022, 7:19 AM
    I am trying to use `valuein` and `distinctcounthll` from Looker to query Pinot data via Trino. I get this error:
    ```
    Query failed (#20220622_071712_00006_mi58r): line 4:24: Function 'valuein' not registered
    ```
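    A workaround sometimes suggested for Pinot-specific functions (a sketch, assuming the Trino Pinot connector's dynamic-table passthrough syntax; the catalog, schema, table, and column names here are hypothetical) is to embed the whole Pinot query as a quoted table name, so it is evaluated by Pinot itself rather than resolved against Trino's function registry:

    ```sql
    -- The double-quoted "table name" is sent to Pinot verbatim, so Pinot-native
    -- functions such as distinctcounthll are executed by Pinot, not Trino.
    SELECT *
    FROM pinot.default."SELECT distinctcounthll(user_id) FROM my_events"
    ```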
  • Priyank Bagrecha — 06/22/2022, 8:30 AM
    Trino lowercases the query when passing it to Pinot. As a result, a query with a predicate like `field = 'Some Value'` returns no results because it gets translated to `field = 'some value'`. Did anyone figure out a way to resolve this issue?
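    One workaround to consider (a sketch only; `events` and `field` are hypothetical, and whether it helps depends on how the connector pushes the predicate down) is to make the comparison case-insensitive on both sides, so the lowercased literal still matches:

    ```sql
    -- lower() normalizes the stored value to match the lowercased literal.
    SELECT * FROM events WHERE lower(field) = 'some value'
    ```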
  • Lars-Kristian Svenøy — 06/22/2022, 9:58 AM
    Hello team 👋 Is there any way to remove fields from a schema? I know this counts as a breaking change, but it would be nice to be able to do it anyway when it is explicitly intentional. I have a field right now which is set to null and which we've decided not to include. The only way to get rid of that field right now is to delete the entire table, recreate the schema, and start ingesting the data again, but that isn't a very friendly solution. Any suggestions?
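    For reference, the heavyweight workaround described above can be sketched against the controller's REST API (a sketch only; the host, table name, and file names are hypothetical, and this drops all existing data for the table):

    ```shell
    CTRL=http://localhost:9000

    # Delete the table and its schema...
    curl -X DELETE "$CTRL/tables/myTable"
    curl -X DELETE "$CTRL/schemas/myTable"

    # ...then re-create the schema without the unwanted field and re-add the table.
    curl -X POST "$CTRL/schemas" -H 'Content-Type: application/json' -d @schema-without-field.json
    curl -X POST "$CTRL/tables"  -H 'Content-Type: application/json' -d @table-config.json
    ```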
  • Rakesh Bobbala — 06/22/2022, 2:45 PM
    Hello team, my realtime table is not consuming records from the Kafka topic after reaching `"segment.flush.threshold.size": "100000"`. Am I missing some configuration?
  • Rakesh Bobbala — 06/22/2022, 2:48 PM
    So when I run the query below, it keeps returning 100000:
    ```
    select count(*) from table
    ```
    Also, the segments are not getting pushed to the S3 bucket after the threshold.
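    For context, the flush threshold lives in the realtime table's `streamConfigs`; a minimal sketch is below (topic and broker values are illustrative, and exact threshold key names vary across Pinot versions). Reaching the threshold is supposed to commit the consuming segment and start a new one, not stop consumption:

    ```json
    "streamConfigs": {
      "streamType": "kafka",
      "stream.kafka.topic.name": "my-topic",
      "stream.kafka.broker.list": "kafka:9092",
      "segment.flush.threshold.size": "100000"
    }
    ```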
  • Rakesh Bobbala — 06/22/2022, 4:33 PM
    Can someone help me with the error below?
    ```
    mkdir <s3://test_bucket/test_key_1/data/rakesh_test>
    Copying uri <s3://test_bucket/test_key_1/data/rakesh_test/rakesh_test__0__0__20220622T1609Z.tmp.c2fcb060-bf45-487c-b025-c0b27c601f38> to uri <s3://test_bucket/test_key_1/data/rakesh_test/rakesh_test__0__0__20220622T1609Z>
    Deleting uri <s3://test_bucket/test_key_1/data/rakesh_test/rakesh_test__0__0__20220622T1609Z> force true
    Caught exception while committing segment file for segment: rakesh_test__0__0__20220622T1609Z
    software.amazon.awssdk.services.s3.model.S3Exception: Access Denied (Service: S3, Status Code: 403
    ```
    The controller was able to create the tmp files, but I still see access denied.
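    One thing worth checking here (a sketch; the bucket name follows the log, the policy is illustrative): the commit step copies and then deletes objects, so the controller's credentials need delete permission too, and with `disableAcl=false` the object is written with an ACL, which additionally requires `s3:PutObjectAcl`:

    ```json
    {
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Action": [
          "s3:ListBucket",
          "s3:GetObject",
          "s3:PutObject",
          "s3:DeleteObject",
          "s3:PutObjectAcl"
        ],
        "Resource": [
          "arn:aws:s3:::test_bucket",
          "arn:aws:s3:::test_bucket/*"
        ]
      }]
    }
    ```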
  • Rakesh Bobbala — 06/22/2022, 4:34 PM
    I tried both `pinot.controller.storage.factory.s3.disableAcl=true` and `pinot.controller.storage.factory.s3.disableAcl=false`.
  • Rakesh Bobbala — 06/22/2022, 4:34 PM
    No luck.
  • Michael Latta — 06/22/2022, 4:34 PM
    Looks like your S3 credentials need to be looked at.
  • Stuart Millholland — 06/22/2022, 5:15 PM
    We are running an init shell script that runs a few curl commands to create our tables. We sometimes see the error `{"code":409,"error":"Table mutable_events_REALTIME already exists"}` and fail to create the mutable_events table even when we know the table doesn't exist. Is this a known issue? Is there some sort of lag after you delete a table before it is recognized as being gone? Using the Swagger API, we verify that there are no tables in the system, but the create still fails with this error.
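    If it is a propagation lag, one defensive pattern for an init script (a sketch only; the controller URL and table name are hypothetical, and the grep is a heuristic on the table-config response) is to poll until the table is really gone before re-creating it:

    ```shell
    CTRL=http://localhost:9000
    TABLE=mutable_events

    # Wait until the controller no longer reports a config for the table.
    while curl -fs "$CTRL/tables/$TABLE" | grep -q '"REALTIME"'; do
      echo "waiting for $TABLE to disappear..."
      sleep 2
    done
    curl -X POST "$CTRL/tables" -H 'Content-Type: application/json' -d @table-config.json
    ```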
  • Rakesh Bobbala — 06/22/2022, 7:59 PM
    Can someone help with the S3 access issue below?
    ```
    Copy /tmp/pinot-tmp-data/fileUploadTemp/rakesh_test__0__0__20220622T1821Z.a804f229-8b20-4b26-be2a-36c97eecd56b from local to <s3://test_bucket/test_key_1/data/rakesh_test/rakesh_test__0__0__20220622T1821Z.tmp.295786f8-ca70-4929-bacc-8685d7eed4d5>
    Response to segmentUpload for segment:rakesh_test__0__0__20220622T1821Z is:{"offset":143796,"status":"UPLOAD_SUCCESS","isSplitCommitType":false,"segmentLocation":"<s3://test_bucket/test_key_1/data/rakesh_test/rakesh_test__0__0__20220622T1821Z.tmp.295786f8-ca70-4929-bacc-8685d7eed4d5>","streamPartitionMsgOffset":"143796","buildTimeSec":-1}
    Handled request from 10.0.1.187 POST <http://pinot-controller-0.pinot-controller-headless.pinot-quickstart.svc.cluster.local:9000/segmentUpload?segmentSizeBytes=151514&buildTimeMillis=119&streamPartitionMsgOffset=143796&instance=Server_pinot-server-0.pinot-server-headless.pinot-quickstart.svc.cluster.local_8098&offset=-1&name=rakesh_test__0__0__20220622T1821Z&rowCount=1000&memoryUsedBytes=510612>, content-type multipart/form-data; boundary=XMJs_YLCxWLk2ADRtRKgDg5q7PaR5pB4_Bkwm3 status code 200 OK
    Processing segmentCommitEndWithMetadata:Offset: -1,Segment name: rakesh_test__0__0__20220622T1821Z,Instance Id: Server_pinot-server-0.pinot-server-headless.pinot-quickstart.svc.cluster.local_8098,Reason: null,NumRows: 1000,BuildTimeMillis: 119,WaitTimeMillis: 0,ExtraTimeSec: -1,SegmentLocation: <s3://test_bucket/test_key_1/data/rakesh_test/rakesh_test__0__0__20220622T1821Z.tmp.295786f8-ca70-4929-bacc-8685d7eed4d5,MemoryUsedBytes>: 510612,SegmentSizeBytes: 151514,StreamPartitionMsgOffset: 143796
    Processing segmentCommitEnd(Server_pinot-server-0.pinot-server-headless.pinot-quickstart.svc.cluster.local_8098, 143796)
    Committing segment rakesh_test__0__0__20220622T1821Z at offset 143796 winner Server_pinot-server-0.pinot-server-headless.pinot-quickstart.svc.cluster.local_8098
    Committing segment file for segment: rakesh_test__0__0__20220622T1821Z
    mkdir <s3://test_bucket/test_key_1/data/rakesh_test>
    Copying uri <s3://test_bucket/test_key_1/data/rakesh_test/rakesh_test__0__0__20220622T1821Z.tmp.295786f8-ca70-4929-bacc-8685d7eed4d5> to uri <s3://test_bucket/test_key_1/data/rakesh_test/rakesh_test__0__0__20220622T1821Z>
    Deleting uri <s3://test_bucket/test_key_1/data/rakesh_test/rakesh_test__0__0__20220622T1821Z> force true
    Caught exception while committing segment file for segment: rakesh_test__0__0__20220622T1821Z
    software.amazon.awssdk.services.s3.model.S3Exception: Access Denied (Service: S3, Status Code: 403, Request ID: TNEVQ80HE8YYE6QV, Extended Request ID: s2urpeZBQG+wFdNvBN/AC57hEzwBOJy4kQJMP/rKOpJZHQLsfTQMV5ghT3bF2XatKwxTqmjP0UQ=)
    ```
  • Rakesh Bobbala — 06/22/2022, 7:59 PM
    I checked all the permissions but couldn't resolve this.
  • Tiger Zhao — 06/22/2022, 8:40 PM
    Hi, is it possible to update the `primaryKeyColumns` for an existing upsert table and have it take effect? Or would I need to recreate the table?
  • Stuart Millholland — 06/23/2022, 12:20 AM
    Any insight on why some of my deepstore segments are named like `immutable_events__8__0__20220622T1721Z655cec28-e5c8-4500-b18c-abbbc9d77b47` and some are named like `immutable_events__5__0__20220622T1721Z`, for the same table?
  • ahmed — 06/23/2022, 1:20 AM
    Hi, I started working on Pinot this week. I read a couple of articles on how to make a UDF and created one, but after importing it into `plugins` and `lib`, Pinot doesn't recognize the function and I don't know why. Can you help? Here is the UDF code: https://gist.github.com/AhmedElsagher/fd941e7d6d9607167a52825c8e370d03
  • Alice — 06/23/2022, 6:17 AM
    Hi team. I noticed the sample config for `segmentPartitionConfig` in Pinot, but I have a question about it. If the highlevel consumer type is used, is it OK for the partition config in Pinot to differ from the Kafka partition config, e.g. setting `numPartitions` to a value that is not the same as the Kafka topic's partition count? And if no partition key is set in Kafka, will it take effect in Pinot if a column is used in `columnPartitionMap`?
  • Lars-Kristian Svenøy — 06/23/2022, 8:44 AM
    Hello team. Quick question: does pool-based tagging also work for brokers?
  • Mohamed Emad — 06/23/2022, 10:36 AM
    Hello, we have a Pinot cluster and we noticed that after adding real-time tables, the status of a server became dead with the following error:
    ```
    "_code": 404,
    "_error": "ZKPath /pinot-quickstart/LIVEINSTANCES/Server_pinot-release-server-1.pinot-release-server-headless.default.svc.cluster.local_8098 does not exist"
    ```
    When I restart the server pod, the status becomes healthy. Has anyone faced this issue before?
  • Alice — 06/23/2022, 3:40 PM
    Hi, how do I tune the performance of a query like `select count(distinct user_id) from table_name`? When I run such a query, it returns "servers not responded" for several servers. 😂 The table has about 100 million rows, and an inverted index is created for user_id.
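    If an approximate count is acceptable, one common tuning step (a sketch; the table and column names follow the message) is to use Pinot's HyperLogLog-based aggregation instead of an exact distinct count, which avoids shipping every distinct value to the broker:

    ```sql
    -- Approximate distinct count; much cheaper than count(distinct ...) on
    -- high-cardinality columns, at the cost of a small error.
    SELECT distinctcounthll(user_id) FROM table_name
    ```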
  • Alice — 06/24/2022, 2:49 AM
    Hi team, can I use upsert in a realtime table and use `RealtimeToOfflineSegmentsTask` at the same time? Will this task make sure rows in the offline table are upserted?
  • abhinav wagle — 06/24/2022, 3:44 AM
    Hi team, when I start the broker locally, where can I tail the broker logs? This is what I see during the startup process, after which no activity logs are seen. Do they get redirected to some file?
    ```
    export JAVA_OPTS="-Xms4G -Xmx4G -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xloggc:gc-pinot-broker.log"
    ./bin/pinot-admin.sh StartBroker \
        -zkAddress localhost:2191
    [0.006s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:gc-pinot-broker.log instead.
    2022/06/23 20:41:49.149 INFO [StartBrokerCommand] [main] Executing command: StartBroker -brokerHost null -brokerPort 8099 -zkAddress localhost:2191
    2022/06/23 20:41:49.161 INFO [StartServiceManagerCommand] [main] Executing command: StartServiceManager -clusterName PinotCluster -zkAddress localhost:2191 -port -1 -bootstrapServices []
    2022/06/23 20:41:49.161 INFO [StartServiceManagerCommand] [main] Starting a Pinot [SERVICE_MANAGER] at 0.341s since launch
    2022/06/23 20:41:49.165 INFO [StartServiceManagerCommand] [main] Started Pinot [SERVICE_MANAGER] instance [ServiceManager_192.168.50.11_-1] at 0.346s since launch
    2022/06/23 20:41:49.167 INFO [StartServiceManagerCommand] [Start a Pinot [BROKER]] Starting a Pinot [BROKER] at 0.347s since launch
    Jun 23, 2022 8:41:53 PM org.glassfish.grizzly.http.server.NetworkListener start
    INFO: Started listener bound to [0.0.0.0:8099]
    Jun 23, 2022 8:41:53 PM org.glassfish.grizzly.http.server.HttpServer start
    INFO: [HttpServer] Started.
    2022/06/23 20:41:57.595 INFO [StartServiceManagerCommand] [Start a Pinot [BROKER]] Started Pinot [BROKER] instance [Broker_192.168.50.11_8099] at 8.775s since launch
    ```
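    One way to control where the logs go (a sketch; the config path is hypothetical, and `-Dlog4j2.configurationFile` is the standard Log4j2 system property, passed here through the same `JAVA_OPTS` the message already uses) is to point the broker at an explicit Log4j2 config whose appenders write to a file:

    ```shell
    # Use an explicit Log4j2 config; a file appender defined in it decides the
    # log file location, which you can then tail.
    export JAVA_OPTS="-Xms4G -Xmx4G -Dlog4j2.configurationFile=file:conf/log4j2.xml"
    ./bin/pinot-admin.sh StartBroker -zkAddress localhost:2191
    ```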
  • Alice — 06/24/2022, 4:24 AM
    Hi, is there any possibility that Pinot will create 2 rows for one Kafka message?
  • Tommaso Peresson — 06/24/2022, 9:47 AM
    Hi everybody, I have a question for you. I'm trying to figure out the best combination of indexes for queries like:
    ```
    select date,
           fields.column1,
           distinctcounthll(hllState)
      from EventsHll
     where fields.column2 in (1,2,10)
     group by date, fields.column1
     limit 300
    ```
    And I get suboptimal performance:
    ```
    "numServersQueried": 2,
    "numServersResponded": 2,
    "numSegmentsQueried": 1816,
    "numSegmentsProcessed": 1816,
    "numSegmentsMatched": 1816,
    "numConsumingSegmentsQueried": 0,
    "numDocsScanned": 154861922,
    "numEntriesScannedInFilter": 214829377,
    "numEntriesScannedPostFilter": 464585766,
    "numGroupsLimitReached": false,
    "totalDocs": 447509450,
    "timeUsedMs": 25832,
    "offlineThreadCpuTimeNs": 0,
    "realtimeThreadCpuTimeNs": 0,
    "offlineSystemActivitiesCpuTimeNs": 0,
    "realtimeSystemActivitiesCpuTimeNs": 0,
    "offlineResponseSerializationCpuTimeNs": 0,
    "realtimeResponseSerializationCpuTimeNs": 0,
    "offlineTotalCpuTimeNs": 0,
    "realtimeTotalCpuTimeNs": 0,
    "segmentStatistics": [],
    "traceInfo": {},
    "minConsumingFreshnessTimeMs": 0,
    "numRowsResultSet": 140
    ```
    So from my understanding, the best combination would be a star-tree index for the aggregation and an inverted index for the filtering. When I look at the query explanation, it seems that it uses the star-tree index for filtering:
    ```
    Operator                                                    Operator_Id   Parent_Id
    BROKER_REDUCE(limit:300)                                    0             -1
    COMBINE_GROUPBY_ORDERBY                                     1             0
    AGGREGATE_GROUPBY_ORDERBY                                   2             1
    TRANSFORM(fields.column1, date)                             3             2
    PROJECT(fields.column1, date, distinctCountHLL__hllState)   4             3
    FILTER_STARTREE_INDEX                                       5             4
    ```
    I'm using Pinot 0.10.0 and I have the star-tree index enabled on `date, fields.column1, fields.column2` with the aggregation `distinctcounthll__hllState`, and the inverted index enabled on `fields.column2`. Just as a reference, the same query without the filter takes 277 ms, with `numEntriesScannedInFilter: 0` and `numEntriesScannedPostFilter: 54480`. My question is: how can I further optimise filtering when grouping by and using a star-tree index? Can the star-tree index be used in conjunction with the inverted index? Thanks a lot.
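    For reference, that layout can be expressed in `tableIndexConfig` roughly like this (a sketch; the `maxLeafRecords` value is illustrative). Placing the filtered column early in `dimensionsSplitOrder` matters, since the star-tree can only prune on filter columns that appear in its split order:

    ```json
    "tableIndexConfig": {
      "starTreeIndexConfigs": [{
        "dimensionsSplitOrder": ["fields.column2", "date", "fields.column1"],
        "skipStarNodeCreationForDimensions": [],
        "functionColumnPairs": ["DISTINCTCOUNTHLL__hllState"],
        "maxLeafRecords": 10000
      }],
      "invertedIndexColumns": ["fields.column2"]
    }
    ```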
  • kauts shukla — 06/24/2022, 10:09 AM
    @All: I have moved to the latest version, 0.10.0, and set it up on EC2 Graviton machines. No errors at all, but it is not able to consume from Kafka. Strangely, there are no error logs in the server logs.
  • Alice — 06/24/2022, 10:47 AM
    Does Pinot 0.11 disable Groovy by default? I tried pinot-admin.sh to upload a table and it returned the following error 😅: `{"code":400,"error":"Groovy filter functions are disabled for table config`
  • Michael Latta — 06/24/2022, 3:35 PM
    When I attempt to create a real-time table but get the Kafka connect string wrong, Pinot is left in an inconsistent state. Attempting to create the table again fails with a message that the table already exists, but the table is not listed in the UI or Swagger as an existing table, and an attempt to delete it using Swagger fails. I am not sure how to clean this up other than rebuilding the cluster, which is undesirable once we actually start to rely on it.
  • abhinav wagle — 06/24/2022, 9:18 PM
    Hi team, I am trying to load the table from https://docs.pinot.apache.org/basics/getting-started/pushing-your-data-to-pinot on a locally running Pinot instance and am seeing this:
    ```
    {"code":400,"error":"Invalid table config for table transcript_OFFLINE: Failed to find instances with tag: DefaultTenant_OFFLINE for table: transcript_OFFLINE"}
    ```
  • Diogo Baeder — 06/25/2022, 1:48 AM
    Hi folks, I need some help understanding how I can do some numeric conversions within queries. I want to be able to convert an integer to `1` if it matches a certain number X, and to `0` if it doesn't. More on this thread.
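    The usual way to express this kind of conversion (a sketch; the table and column names are hypothetical, and the constant 5 stands in for X) is a `CASE` expression:

    ```sql
    -- Yields 1 when the column equals the target value, 0 otherwise.
    SELECT CASE WHEN some_int = 5 THEN 1 ELSE 0 END AS matched
      FROM my_table
    ```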
  • Alice — 06/25/2022, 4:31 AM
    Hi team, we're using Pinot master and found that the Kafka SSL configuration in the `streamConfigs` section has changed. The previous version's config was like `"sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule`, and the latest config needs to be like `"sasl.jaas.config": "org.apache.pinot.shaded.org.apache.kafka.common.security.scram.ScramLoginModule`. So, just to confirm: is this change temporary, or will it be used in future versions?
  • KY — 06/25/2022, 4:55 PM
    We were trying peer download as per the following doc: https://docs.pinot.apache.org/operators/operating-pinot/decoupling-controller-from-the-data-path#overview-of-peer-download-policy We noticed that there is an option to download from a replica's peer using the local PinotFS. What should the value of `pinot.server.instance.segment.store.uri` be when used in combination with `file://dir`? Our goal is to bypass the controller and rely on peer download in case of offline assignment of segments. We want to try out the behavior with and without deep store.