# troubleshooting
  • Abhijeet Kushe

    04/29/2022, 1:09 AM
    Hi Team, there was a production issue due to which we lost records from 04/20 to 04/26. We create a segment each day, and we have a log of all the records. I wanted to know whether I can just directly stream those records to Pinot. The eventTimestamp on which the Pinot real-time table is configured would be between 04/20 and 04/26. So will the backdated entries be captured in a current segment of the real-time table, or will they get merged into the older segment which has already been closed?
  • Alice

    04/29/2022, 8:43 AM
    Hi, what’s the best type choice for currency values in Pinot? Is it STRING?
  • Saumya Upadhyay

    04/29/2022, 10:55 AM
    hi all, if segments are offline, how do we make them online again? One of the tables had segments in an offline state and recovered automatically, but the segments of three other tables are still in an offline state.
  • Luis Fernandez

    04/29/2022, 2:20 PM
    hey all, we continue to have issues with ZooKeeper on GKE, sadly. Our sandbox environment got its disk space filled up. Does anyone know how to recover from this scenario? Pretty much the entire system is sad at the moment.
  • Young Seok (Tony) Kim

    04/29/2022, 8:03 PM
    Hi all, I’m trying to do batch ingestion with Spark according to this documentation. It seems that the current Pinot version 0.10.0 doesn’t include some dependencies, and the documentation recommends using `0.11.0-SNAPSHOT` instead (I got some runtime issues, a ClassNotFoundException). Does anyone know how I can find the 0.11.0-SNAPSHOT binary?
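    Snapshot binaries aren't published on the downloads page, so one option, sketched here assuming a standard JDK 11 and Maven setup, is to build the distribution from source:
    ```
    git clone https://github.com/apache/pinot.git
    cd pinot
    # build the full distribution; the tarball lands under pinot-distribution/target
    mvn clean install -DskipTests -Pbin-dist
    ```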
  • Jinal Panchal

    05/02/2022, 12:43 PM
    Hello, I've started exploring Pinot. Is there any way to define primary key and foreign key relationships so that we can maintain mappings?
  • Diogo Baeder

    05/02/2022, 1:53 PM
    Hi folks, let me ask for your opinion on modeling tables in Pinot. Suppose (just a fake case for simple illustration) that you had a data source where users have different objects at home, where the types and names of these objects are dynamic, and you wanted to store them in such a way that you could query them by object amounts, like finding users that have 2 cars and 2 TVs. Considering that you don't know beforehand what objects would be coming in, how would you model this? A JSON field for the objects, to keep them in a single row that represents the individual user? Spreading the objects over different rows and then aggregating and filtering on the application side? How would you model this?
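    For what it's worth, the JSON-column option can keep the filtering on the Pinot side via the JSON_MATCH predicate (it needs a JSON index on the column). A sketch, assuming a hypothetical users table with a JSON column objects holding per-type counts like {"car": 2, "tv": 2}:
    ```
    -- hypothetical schema: users(userId STRING, objects JSON)
    SELECT userId
    FROM users
    WHERE JSON_MATCH(objects, '"$.car"=2 AND "$.tv"=2')
    ```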
  • Ryan Ruane

    05/03/2022, 1:39 PM
    Hi everyone. New here and I was just wanting to ask some questions on ingesting arrays, that is `INT_ARRAY`, `FLOAT_ARRAY`, `TIMESTAMP_ARRAY`, etc. I have found that I can ingest all types from JSON as multi-valued dimension columns, with the exception of `BOOLEAN`, `TIMESTAMP`, and `BYTES`. I believe that `JSON_ARRAY` isn't a valid type, but I wasn't sure about `BYTES_ARRAY`. If anyone is about and can shed some light, I would be very appreciative. More info present in-thread
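    For reference, multi-valued columns are declared in the schema as dimension fields with singleValueField set to false; a minimal sketch with hypothetical column names:
    ```
    {
      "dimensionFieldSpecs": [
        {"name": "intValues", "dataType": "INT", "singleValueField": false},
        {"name": "floatValues", "dataType": "FLOAT", "singleValueField": false}
      ]
    }
    ```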
  • Tao Hu

    05/03/2022, 6:49 PM
    Hi everyone. I updated to 0.10.0 and tried a GROUP BY query with a FILTER clause in the aggregations but got the following error:
    ```
    GROUP BY with FILTER clauses is not supported
    ```
    Do we have any plans to support the FILTER clause with GROUP BY? And for now, is there any workaround if I want to filter my aggregations in a GROUP BY query? Thanks
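    Until FILTER works with GROUP BY, a common workaround is to push the predicate into the aggregation with CASE; a sketch over a hypothetical orders table:
    ```
    -- stands in for: SUM(amount) FILTER (WHERE status = 'paid')
    SELECT country,
           SUM(CASE WHEN status = 'paid' THEN amount ELSE 0 END) AS paidAmount
    FROM orders
    GROUP BY country
    ```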
  • Nikhil Varma

    05/05/2022, 5:30 AM
    Hi everyone, I’m trying to store segments in MinIO as S3 deep storage, but it is writing the temporary segments instead of saving the final segments. Please help with this if anyone has faced this issue before. Thank you
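    For comparison, a minimal controller config sketch for an S3-compatible deep store such as MinIO, with placeholder endpoint, region, and bucket values (the servers need the equivalent pinot.server.* entries):
    ```
    controller.data.dir=s3://pinot-segments/
    pinot.controller.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
    pinot.controller.storage.factory.s3.region=us-east-1
    pinot.controller.storage.factory.s3.endpoint=http://minio:9000
    pinot.controller.segment.fetcher.protocols=file,http,s3
    pinot.controller.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
    ```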
  • Diogo Baeder

    05/05/2022, 9:12 PM
    Hi folks, quick question: table indexes - whatever the type - are created inside each segment, and don't ever cross segments, right? Asking just to confirm.
  • Kevin Xu

    05/06/2022, 2:49 AM
    Hi @Xiang Fu, could you please help me fix this issue with Presto in the workflow? Link: https://github.com/apache/pinot/runs/6316328667?check_suite_focus=true
  • Diogo Baeder

    05/06/2022, 1:49 PM
    Hi folks! The documentation about the sorted inverted index doesn't say how to configure it; how can that be done?
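    For the record, a sorted column is declared in the table config's tableIndexConfig; a sketch with a hypothetical column name (for real-time tables Pinot sorts the data at segment completion, while for offline tables the input data itself must already be sorted on that column):
    ```
    {
      "tableIndexConfig": {
        "sortedColumn": ["memberId"]
      }
    }
    ```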
  • Rebecca Lau

    05/06/2022, 4:25 PM
    hello! we’re trying to batch ingest segments into our pinot instance, but we are finding that some segments are in a bad state. the stack trace we see from the debug/tables/{tableName} endpoint is like so:
    ```
    java.lang.IllegalArgumentException: newLimit > capacity: (604 > 28)
      at java.base/java.nio.Buffer.createLimitException(Buffer.java:372)
      at java.base/java.nio.Buffer.limit(Buffer.java:346)
      at java.base/java.nio.ByteBuffer.limit(ByteBuffer.java:1107)
      at java.base/java.nio.MappedByteBuffer.limit(MappedByteBuffer.java:235)
      at java.base/java.nio.MappedByteBuffer.limit(MappedByteBuffer.java:67)
      at org.apache.pinot.segment.spi.memory.PinotByteBuffer.view(PinotByteBuffer.java:303)
      at org.apache.pinot.segment.spi.memory.PinotDataBuffer.view(PinotDataBuffer.java:379)
      at org.apache.pinot.segment.local.segment.index.readers.forward.BaseChunkSVForwardIndexReader.<init>(BaseChunkSVForwardIndexReader.java:97)
      at org.apache.pinot.segment.local.segment.index.readers.forward.FixedByteChunkSVForwardIndexReader.<init>(FixedByteChunkSVForwardIndexReader.java:37)
      at org.apache.pinot.segment.local.segment.index.readers.DefaultIndexReaderProvider.newForwardIndexReader(DefaultIndexReaderProvider.java:97)
      at org.apache.pinot.segment.spi.index.IndexingOverrides$Default.newForwardIndexReader(IndexingOverrides.java:184)
      at org.apache.pinot.segment.local.segment.index.column.PhysicalColumnIndexContainer.<init>(PhysicalColumnIndexContainer.java:166)
      at org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:181)
      at org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:121)
      at org.apache.pinot.segment.local.indexsegment.immutable.ImmutableSegmentLoader.load(ImmutableSegmentLoader.java:91)
      at org.apache.pinot.core.data.manager.offline.OfflineTableDataManager.addSegment(OfflineTableDataManager.java:52)
      at org.apache.pinot.core.data.manager.BaseTableDataManager.addOrReplaceSegment(BaseTableDataManager.java:373)
      at org.apache.pinot.server.starter.helix.HelixInstanceDataManager.addOrReplaceSegment(HelixInstanceDataManager.java:355)
      at org.apache.pinot.server.starter.helix.SegmentOnlineOfflineStateModelFactory$SegmentOnlineOfflineStateModel.onBecomeOnlineFromOffline(SegmentOnlineOfflineStateModelFactory.java:162)
      at jdk.internal.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
      at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.base/java.lang.reflect.Method.invoke(Method.java:566)
      at org.apache.helix.messaging.handling.HelixStateTransitionHandler.invoke(HelixStateTransitionHandler.java:404)
      at org.apache.helix.messaging.handling.HelixStateTransitionHandler.handleMessage(HelixStateTransitionHandler.java:331)
      at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:97)
      at org.apache.helix.messaging.handling.HelixTask.call(HelixTask.java:49)
      at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      at java.base/java.lang.Thread.run(Thread.java:829)
    ```
    @Luis Fernandez and I were wondering what this `capacity` value (28, according to the trace) might be? thanks!
  • Prashant Pandey

    05/06/2022, 5:50 PM
    Hi team. What should be the value of `controller.host` in the controller config for a k8s deployment? I am deploying Pinot to a new env, and leaving this field empty results in an NPE during controller startup:
    ```
    java.lang.NullPointerException: null
      at org.apache.pinot.common.utils.helix.HelixHelper.updateHostnamePort(HelixHelper.java:550) ~[pinot-all-0.9.3-jar-with-dependencies.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
      at org.apache.pinot.controller.BaseControllerStarter.updateInstanceConfigIfNeeded(BaseControllerStarter.java:607) ~[pinot-all-0.9.3-jar-with-dependencies.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
      at org.apache.pinot.controller.BaseControllerStarter.registerAndConnectAsHelixParticipant(BaseControllerStarter.java:583) ~[pinot-all-0.9.3-jar-with-dependencies.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
      at org.apache.pinot.controller.BaseControllerStarter.setUpPinotController(BaseControllerStarter.java:382) ~[pinot-all-0.9.3-jar-with-dependencies.jar:0.9.3-e23f213cf0d16b1e9e086174d734a4db868542cb]
    ```
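    One approach that tends to work on k8s, sketched here with placeholder names, is to point controller.host at the pod's resolvable headless-service DNS name (or inject the pod IP via the Downward API):
    ```
    # placeholder values: adjust release name and namespace to your deployment
    controller.host=pinot-controller-0.pinot-controller-headless.default.svc.cluster.local
    controller.port=9000
    ```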
  • Ryan Persaud

    05/06/2022, 10:07 PM
    👋 Hello, I am working through the QuickStart Tutorial, and I started Pinot locally with the command `./bin/pinot-admin.sh QuickStart -type batch`. I can see a log entry for the table being added, and no obvious errors:
    ```
    Adding offline table: baseballStats
    Executing command: AddTable -tableConfigFile /var/folders/jv/g99n5jcj3hz0lbbf90gykcc40000gq/T/1651874628141/baseballStats_1651874628195.config -schemaFile /var/folders/jv/g99n5jcj3hz0lbbf90gykcc40000gq/T/1651874604715/baseballStats/baseballStats_schema.json -controllerProtocol http -controllerHost localhost -controllerPort 9000 -user null -password [hidden] -exec
    ```
    but I do not see the table via the UI (please see screenshot). Is there an additional step that I need to take in order to see the table? Thanks! Not sure if it's relevant, but here is some version information: Java: `openjdk 11.0.15 2022-04-19`, Pinot: `pinot-0.10.0`
  • Diogo Baeder

    05/08/2022, 3:11 AM
    Hi folks! How can I use the `/ingestFromURI` endpoint of the Controller API to ingest a file as a segment while defining the segment name myself? I tried passing `segment.name` in the `batchConfigMapStr` parameter JSON, but it didn't work; the Controller ends up creating the segment name by itself. I'd like to have more control over this, because I want to be able to more easily replace segments.
  • Deepak Mishra

    05/09/2022, 11:05 AM
    Hi Team, I am trying to execute a Pinot ingestion job with segment name generator type 'fixed'. The input data sits in different directories: dir Batch1 - part-**.avro, dir Batch2 - part-00**.avro, etc. I would like to generate segments with fixed segment names, e.g. segment name Batch1, segment name Batch2, etc. Can anyone please help with the same?
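    For reference, the fixed segment name generator is set in the ingestion job spec; a sketch, with the caveat that a fixed name covers a single output segment, so each directory (Batch1, Batch2, ...) would need its own job run:
    ```
    # excerpt of an ingestion job spec (one run per input directory)
    segmentNameGeneratorSpec:
      type: fixed
      configs:
        segment.name: Batch1
    ```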
  • Prashant Pandey

    05/09/2022, 11:27 AM
    Hi team, how does Pinot encode byte columns to display on the UI? Are they encoded as hex strings?
  • Luis Fernandez

    05/09/2022, 3:08 PM
    anyone know why in my pinot metrics when i send a query i get `"numServersQueried": 4`, but i only have 2 servers o.O ?
  • Tiger Zhao

    05/09/2022, 6:41 PM
    Hi, I noticed some strange behavior when setting `realtime.segment.flush.threshold.rows` for my realtime tables. It seems that the actual number of rows per segment becomes some value smaller than the value I set. For example, I'll set this to 1000000, but in the segment metadata, `segment.flush.threshold.size` would be 500000 and the segment only ingests 500000 rows. This seems to only happen for some tables, and sometimes it is shrunk by a factor of 2 or 4. Just wondering if there is any other setting I'm missing that is causing this?
  • Luis Fernandez

    05/09/2022, 6:48 PM
    hey friends, question: why do we consider queries that take more than 100ms to be slow? in our current cluster we have some queries that are taking more than 100ms to execute; is that a reason to be worried?
  • abhinav wagle

    05/09/2022, 9:35 PM
    Hello, any pointers while I am using the Pinot Helm chart (helm) on AWS? How do I make the Pinot controller load balancer URL accessible only inside our VPN when deploying on AWS?
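    One common pattern on AWS, assuming the Helm chart passes service annotations through (treat the exact values path as hypothetical and check the chart's values.yaml), is to mark the controller service as an internal load balancer so it is only reachable from inside the VPC/VPN:
    ```
    # Helm values excerpt (hypothetical path)
    controller:
      service:
        annotations:
          # provisions an internal-only load balancer, not an internet-facing one
          service.beta.kubernetes.io/aws-load-balancer-internal: "true"
    ```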
  • Deepak Mishra

    05/10/2022, 3:55 AM
    Hi Team, I am trying to execute a backfill job using a Pinot ingestion job. Basically, I am trying to create offline segments with the ingestion job. Can we fix the offline segment size without using Minion while executing the backfill job, if there is too much data? Can anyone please help with the same?
  • Luis Fernandez

    05/10/2022, 3:55 PM
    hello friends!! we are encountering some issues when migrating data using the job spec. we are basically migrating a bunch of json files in gcs into pinot. a json file looks like this:
    ```
    {"serve_time":1623110400.00000000,"p_id":8.0476135E7,"u_id":6047599.0,"i_count":1}
    {"serve_time":1623110400.00000000,"p_id":8.1923416E7,"u_id":5407252.0,"i_count":1,"c_count":1,"c":17}
    ```
    we end up having this exception for some of the files:
    ```
    2022/05/10 15:48:19.314 ERROR [SegmentGenerationJobRunner] [pool-2-thread-1] Failed to generate Pinot segment for file - gs://rblau_tmp/raw_data/date=2020-07-25/part-00168-c741f867-338d-4c84-afaf-428f85c14088.c000.json
    java.lang.RuntimeException: Unexpected end-of-input within/between Object entries
    ```
    do you know why we may end up getting these errors?
  • George He

    05/11/2022, 6:29 AM
    Hi guys, I am trying to use the `LastWithTime` function to get the latest data of each group, but unfortunately I got an NPE. The original intention is to get the top N of a group after grouping by an ad hoc column. Wondering if I have a grammar issue in my query:
    ```
    select hostname, lastWithTime('alertId', 'issued', 'STRING') from uas_nomalized_alert
    where JSON_EXTRACT_SCALAR(attributes, '$.graphQL-businessService', 'STRING', '') = 'Meeting Centers'
    group by hostname
    ```
    `alertId` is a 'STRING' column, `issued` is a 'TIMESTAMP' column
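    One guess worth checking: lastWithTime takes column identifiers, not string literals, for its first two arguments, so the quoted 'alertId' and 'issued' may be the problem. A sketch of the same query with bare identifiers:
    ```
    SELECT hostname, lastWithTime(alertId, issued, 'STRING')
    FROM uas_nomalized_alert
    WHERE JSON_EXTRACT_SCALAR(attributes, '$.graphQL-businessService', 'STRING', '') = 'Meeting Centers'
    GROUP BY hostname
    ```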
  • Fizza Abid

    05/11/2022, 6:44 AM
    Hi, after setting up authentication, my ingestion stopped.
  • Fizza Abid

    05/11/2022, 6:45 AM
    Used this link for enabling authentication https://docs.pinot.apache.org/operators/tutorials/authentication-authorization-and-acls
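    A detail from that guide that is easy to miss: once ACLs are on, the other components also need auth tokens for their own calls back to the controller (segment completion and uploads), otherwise real-time ingestion stalls. A sketch with placeholder credentials (the token is the base64 of user:password):
    ```
    # server config: tokens the server presents to the controller
    pinot.server.segment.fetcher.auth.token=Basic YWRtaW46dmVyeXNlY3JldA
    pinot.server.segment.uploader.auth.token=Basic YWRtaW46dmVyeXNlY3JldA
    pinot.server.instance.auth.token=Basic YWRtaW46dmVyeXNlY3JldA
    ```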
  • Alice

    05/11/2022, 8:27 AM
    Hi team, I’m trying to use the Pinot upsert feature. Part of my table config is like below:
    ```
    {
      "tableName": "upsert_test_local",
      "tableType": "REALTIME",
      "segmentsConfig": {
        "schemaName": "upsert_test_local",
        "timeColumnName": "created_on",
        "timeType": "MILLISECONDS",
        "allowNullTimeValue": true,
        "replicasPerPartition": "1",
        "retentionTimeUnit": "DAYS",
        "retentionTimeValue": "30",
        "segmentPushType": "APPEND",
        "completionConfig": {
          "completionMode": "DOWNLOAD"
        }
      },
      "tenants": {},
      "tableIndexConfig": {
        "loadMode": "MMAP",
        "aggregateMetrics": true,
        "nullHandlingEnabled": true,
        "streamConfigs": {
          "streamType": "kafka",
          "stream.kafka.consumer.type": "lowlevel",
          "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
          "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
          "stream.kafka.consumer.prop.auto.offset.reset": "largest",
          "realtime.segment.flush.threshold.time": "30m",
          "realtime.segment.flush.threshold.rows": "0",
          "realtime.segment.flush.threshold.segment.size": "100M",
          "realtime.segment.flush.autotune.initialRows": "1000000"
        }
      },
      "ingestionConfig": {
        "filterConfig": {
          "filterFunction": "Groovy({tablename != \"test_table_name\"}, tablename)"
        },
        "transformConfigs": [
          {
            "columnName": "id",
            "transformFunction": "Groovy({UUID.randomUUID().toString()}, tablename)"
          },
          {
            "columnName": "timestamp",
            "transformFunction": "jsonPathString(metrics, '$.timestamp')"
          },
          {
            "columnName": "created_on",
            "transformFunction": "Groovy({System.currentTimeMillis()}, tablename)"
          },
          {
            "columnName": "updated_on",
            "transformFunction": "Groovy({System.currentTimeMillis()}, tablename)"
          }
        ]
      },
      "metadata": {
        "customConfigs": {}
      },
      "routing": {
        "instanceSelectorType": "strictReplicaGroup"
      },
      "upsertConfig": {
        "mode": "PARTIAL",
        "defaultPartialUpsertStrategy": "OVERWRITE",
        "partialUpsertStrategies": {
          "created_on": "IGNORE"
        }
      }
    }
    ```
    And part of the schema is like this:
    ```
    "dateTimeFieldSpecs": [
      {
        "name": "timestamp",
        "dataType": "LONG",
        "format": "1:MILLISECONDS:EPOCH",
        "granularity": "1:MILLISECONDS"
      },
      {
        "name": "created_on",
        "dataType": "LONG",
        "format": "1:MILLISECONDS:EPOCH",
        "granularity": "1:MILLISECONDS"
      },
      {
        "name": "updated_on",
        "dataType": "LONG",
        "format": "1:MILLISECONDS:EPOCH",
        "granularity": "1:MILLISECONDS"
      }
    ],
    "primaryKeyColumns": ["timestamp"]
    ```
    At first, upsert works as expected. But after a while, like 30 minutes later, when I query this table, there is no record in it, although totalDocs in the query response stats is not 0. Then I write some data to the same Kafka topic and query this table, and there are some records, but the value of the created_on field is 0 instead of the current timestamp. Any idea what property is not set right here? Is it the timeColumnName property?
  • Marium Faheem

    05/11/2022, 8:50 AM
    I was trying to connect Pinot with Trino. The connection looks good, and I’m getting the pinot catalog too, but when I execute a query I get:
    ```
    {"code":403,"error":"Permission is denied for access type 'READ' to the endpoint '<http://controller.com/tables>'"}
    ```
    Any idea how to resolve this?