# troubleshooting

    Aaron Weiss

    08/15/2022, 3:40 PM
    Hey, this is more of a question to validate: you cannot have a multi-value / array Metric field in Pinot, correct? i.e. for Dimensions you can specify `"singleValueField": false`. Is there something analogous for Metrics? I ask because we would like to keep our records at a certain granularity with a couple of metrics that are arrays, but we want to keep them as metrics to be able to run aggregates on them.
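For reference, the multi-value knob on the dimension side looks like this in a schema (a sketch; field names and types are made up, and the metric spec shown is the ordinary single-value form the question is about):

```json
{
  "dimensionFieldSpecs": [
    {
      "name": "tags",
      "dataType": "STRING",
      "singleValueField": false
    }
  ],
  "metricFieldSpecs": [
    {
      "name": "clicks",
      "dataType": "LONG"
    }
  ]
}
```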

    Tiger Zhao

    08/15/2022, 9:55 PM
    Hi, looking at https://github.com/apache/pinot/pull/8032 from the 0.10.0 release, my understanding is that we can now batch ingest segments into realtime tables? I just tried doing this with `pinot-admin.sh LaunchDataIngestionJob`, but I get a `Failed to decode table config from JSON` error. Is this expected?
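For context, `LaunchDataIngestionJob` is pointed at a job spec YAML; a minimal sketch of such a spec (paths, table name, and push details are hypothetical and worth checking against the docs for your version):

```yaml
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: 's3://my-bucket/input/'
outputDirURI: 's3://my-bucket/segments/'
recordReaderSpec:
  dataFormat: 'json'
  className: 'org.apache.pinot.plugin.inputformat.json.JSONRecordReader'
tableSpec:
  tableName: 'myTable'
pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'
```

A "Failed to decode table config from JSON" error suggests the table config being parsed isn't valid JSON, so validating that file is a cheap first check.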

    Andrew Sunarto

    08/16/2022, 1:09 AM
    Hi guys, I'm new to Pinot. I have a realtime table filled with 2603 documents from a Kafka producer, which continues to produce messages to the topic. The number of documents in Pinot from this topic is no longer increasing, so I'm wondering what could have caused this. Reported/Estimated size for this table is 0 bytes (although there are 2603 valid documents in the table), the status of each of the table's segments is Good, and the server is Consuming in the replica set. Any ideas I can try to get Pinot to continue collecting messages from the Kafka topic?

    Ehsan Irshad

    08/16/2022, 8:15 AM
    Hi. We have Pinot set up on local EC2 instances (no Kubernetes). Just wondering where the ZooKeeper config is stored, or do we have to create zoo.conf manually in the /conf directory? In my ZooKeeper Browser I see the ZooKeeper config is empty. The problem is that ZooKeeper is eating all the disk we assign to it, so it seems some logs are not being purged. Following this thread in particular to debug (@Mayank hope I'm heading in the right direction?) https://apache-pinot.slack.com/archives/C011C9JHN7R/p1650312581517129
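On the disk-usage part: ZooKeeper does not purge old snapshots and transaction logs by default; the `autopurge` settings in `zoo.cfg` control that. A sketch (values illustrative):

```properties
# keep only the 3 most recent snapshots / txn logs...
autopurge.snapRetainCount=3
# ...and run the purge task every hour
autopurge.purgeInterval=1
```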

    Priyank Bagrecha

    08/16/2022, 10:15 PM
    We are deploying Pinot on a k8s cluster using the community-provided helm charts. We are using spot instances for the k8s cluster. Do we have to rebalance the cluster to allow re-assignment whenever a pod gets replaced as the spot instances go away?

    Priyank Bagrecha

    08/16/2022, 11:54 PM
    One more question - we are loading data for an offline table using a Spark job, and we see some segments go into ERROR state. What makes a segment go into ERROR state vs. OFFLINE? Does it mean the download of the segment from S3 to the local server disk failed? If the server retries to grab the segment, how many retries does it do, and is that configurable? Will `Reload Segment` force a download of the segment from S3 to the Pinot server disk?
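For reference, recent Pinot versions expose a `forceDownload` flag on the controller's segment reload endpoint, which is meant to re-fetch the segment from the deep store; a sketch (host, table, and segment names are hypothetical, and the exact path is worth verifying in your controller's Swagger UI):

```shell
curl -X POST "http://localhost:9000/segments/myTable_OFFLINE/myTable_OFFLINE_0/reload?forceDownload=true"
```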

    Tanmesh Mishra

    08/17/2022, 9:11 PM
    Hey team, I am working on this PR and have a question

    Neeraja Sridharan

    08/17/2022, 9:31 PM
    Just wanted to check if there is a recipe for broker side pruning. More info in this thread:

    Scott deRegt

    08/17/2022, 10:52 PM
    Hey team 👋 I'm running into an issue with offline segments in `error` state. When checking the `debug` endpoint and server logs, I am seeing `java.nio.file.NoSuchFileException` on specific segment files' local paths. I've tried rebalancing servers with `downtime` + `bootstrap` toggled, as well as running reload segment with `forceDownload` set to `true`, but still can't seem to clear the `error`s. Any tips on how to repair this error state?

    Ryan Ruane

    08/18/2022, 4:58 PM
    Hi guys, I was wondering whether it would be possible to start an entire cluster in a single Docker image, rather than having to spin up one for each part of the cluster with docker compose. I know that the quick start function will spin up a working cluster, but from the documentation I only see it providing preexisting tables. I would like to create a custom table with all cluster nodes within one Docker image. Do you happen to know whether this would be easily achievable or desirable? The reason I'm asking concerns GitLab CI and issues I was having trying to impose a dependency ordering. From what I understand, GitLab CI lacks some of Docker Compose's features, such as a restart policy, etcetera.
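Worth noting for context: the QuickStart command already runs ZooKeeper, controller, broker, and server inside a single container, so a one-container cluster is possible; a sketch (version tag illustrative, and a custom table would still need to be added afterwards, e.g. via the controller API):

```shell
docker run -p 9000:9000 apachepinot/pinot:0.10.0 QuickStart -type batch
```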

    Ryan Ruane

    08/18/2022, 5:00 PM
    Secondarily, on a completely unrelated note, does anybody know how it would be possible to extract two values from JSON in a query and then compose them into an array? I'm curious to know whether it is possible to extract three values by key and return them as an array of integers.
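I'm not aware of a built-in array constructor in Pinot's query layer, but the per-key extraction part can be sketched with `JSON_EXTRACT_SCALAR` (column name, JSON paths, and default values here are hypothetical); composing the three results into one array would then happen client-side:

```sql
SELECT
  JSON_EXTRACT_SCALAR(payload, '$.a', 'INT', 0) AS a,
  JSON_EXTRACT_SCALAR(payload, '$.b', 'INT', 0) AS b,
  JSON_EXTRACT_SCALAR(payload, '$.c', 'INT', 0) AS c
FROM myTable
```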

    Luis Fernandez

    08/18/2022, 7:17 PM
    question: can I add auth to the controller UI but not necessarily to the cluster as a whole? i.e. can I add auth to the UI without requests to the broker having to be authorized, e.g. via cURL?

    Timothy James

    08/19/2022, 12:32 AM
    Hi Pinot heroes like @Mayank! I'm attempting to use a minion merge rollup task (as @Mayank helpfully suggested), but it seems to do... nothing. The segments aren't getting merged and I'm not seeing any relevant logging in minion or controller logs. Details in thread... but here's what my OFFLINE table's task definition looks like:
    "task": {
          "taskTypeConfigsMap": {
            "MergeRollupTask": {
              "5min.bucketTimePeriod": "5min",
              "5min.bufferTimePeriod": "15min",
              "1hour.bucketTimePeriod": "1h",
              "1hour.bufferTimePeriod": "2h",
              "1day.bucketTimePeriod": "1d",
              "1day.bufferTimePeriod": "1d"
            }
          }
        },
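A common thing to rule out when merge rollup tasks silently do nothing (an assumption, not a diagnosis): minion task generation is driven by the controller's periodic task scheduler, which has to be enabled in the controller config. A sketch (property name from memory; verify against the docs for your version):

```properties
# without this the controller never generates minion tasks
controller.task.scheduler.enabled=true
```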

    Rafael Jeon

    08/19/2022, 9:19 AM
    Hi, I'm trying to run the upsert quickstart example in a Docker environment, but I got the following error:
    docker run \
        -p 9000:9000 \
        apachepinot/pinot:0.9.3 QuickStart \
        -type upsert_json_index
    Unable to find image 'apachepinot/pinot:0.9.3' locally
    0.9.3: Pulling from apachepinot/pinot
    a2abf6c4d29d: Pull complete
    2bbde5250315: Pull complete
    202a34e7968e: Pull complete
    8c484b17211c: Pull complete
    c79d6edef3e3: Pull complete
    9335053f1957: Pull complete
    99fa3710378d: Pull complete
    dd4c492811d1: Pull complete
    1979ceaa5442: Pull complete
    Digest: sha256:fa8e27a6b81732ea238f0c41f85ba2f1f4578e1011c40d007e938f00bb59fb5d
    Status: Downloaded newer image for apachepinot/pinot:0.9.3
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/pinot/lib/pinot-all-0.9.3-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.9.3-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-metrics/pinot-yammer/pinot-yammer-0.9.3-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-metrics/pinot-dropwizard/pinot-dropwizard-0.9.3-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-environment/pinot-azure/pinot-azure-0.9.3-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-file-system/pinot-s3/pinot-s3-0.9.3-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See <http://www.slf4j.org/codes.html#multiple_bindings> for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by org.codehaus.groovy.reflection.CachedClass (file:/opt/pinot/lib/pinot-all-0.9.3-jar-with-dependencies.jar) to method java.lang.Object.finalize()
    WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.reflection.CachedClass
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release
    ***** Starting Kafka *****
    ***** Starting meetup data stream and publishing to Kafka *****
    javax.websocket.DeploymentException: Handshake error.
    	at org.glassfish.tyrus.client.ClientManager$3$1.run(ClientManager.java:656)
    	at org.glassfish.tyrus.client.ClientManager$3.run(ClientManager.java:694)
    	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
    	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    	at org.glassfish.tyrus.client.ClientManager$SameThreadExecutorService.execute(ClientManager.java:848)
    	at java.base/java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:118)
    	at org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:493)
    	at org.glassfish.tyrus.client.ClientManager.connectToServer(ClientManager.java:337)
    	at org.apache.pinot.tools.streams.MeetupRsvpStream.run(MeetupRsvpStream.java:71)
    	at org.apache.pinot.tools.UpsertJsonQuickStart.execute(UpsertJsonQuickStart.java:86)
    	at org.apache.pinot.tools.admin.command.QuickStartCommand.execute(QuickStartCommand.java:161)
    	at org.apache.pinot.tools.Command.call(Command.java:33)
    	at org.apache.pinot.tools.Command.call(Command.java:29)
    	at picocli.CommandLine.executeUserObject(CommandLine.java:1953)
    	at picocli.CommandLine.access$1300(CommandLine.java:145)
    	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2352)
    	at picocli.CommandLine$RunLast.handle(CommandLine.java:2346)
    	at picocli.CommandLine$RunLast.handle(CommandLine.java:2311)
    	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2179)
    	at picocli.CommandLine.execute(CommandLine.java:2078)
    	at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:161)
    	at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:192)
    Caused by: org.glassfish.tyrus.core.HandshakeException: Response code was not 101: 404.
    	at org.glassfish.tyrus.client.TyrusClientEngine.processResponse(TyrusClientEngine.java:299)
    	at org.glassfish.tyrus.container.grizzly.client.GrizzlyClientFilter.handleHandshake(GrizzlyClientFilter.java:322)
    	at org.glassfish.tyrus.container.grizzly.client.GrizzlyClientFilter.handleRead(GrizzlyClientFilter.java:291)
    	at org.glassfish.grizzly.filterchain.ExecutorResolver$9.execute(ExecutorResolver.java:95)
    	at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeFilter(DefaultFilterChain.java:260)
    	at org.glassfish.grizzly.filterchain.DefaultFilterChain.executeChainPart(DefaultFilterChain.java:177)
    	at org.glassfish.grizzly.filterchain.DefaultFilterChain.execute(DefaultFilterChain.java:109)
    	at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:88)
    	at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:53)
    	at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:515)
    	at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:89)
    	at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.run0(WorkerThreadIOStrategy.java:94)
    	at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy.access$100(WorkerThreadIOStrategy.java:33)
    	at org.glassfish.grizzly.strategies.WorkerThreadIOStrategy$WorkerThreadRunnable.run(WorkerThreadIOStrategy.java:114)
    	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:569)
    	at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:549)
    	at java.base/java.lang.Thread.run(Thread.java:829)

    Mark Needham

    08/19/2022, 10:27 AM
    I think that's because meetup disabled the RSVP API

    Mark Needham

    08/19/2022, 10:27 AM
    can you try using 0.10.0?

    Mark Needham

    08/19/2022, 10:27 AM
    I think that quickstart has been fixed in that version
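Putting the suggestion together, the same quickstart with the newer tag would be (a sketch):

```shell
docker run -p 9000:9000 apachepinot/pinot:0.10.0 QuickStart -type upsert_json_index
```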

    Priyank Bagrecha

    08/19/2022, 8:36 PM
    Hello, I am looking for general guidance here. We are loading data for an offline table from AWS S3 using a job spec for a Spark job. The segment size is ~400 MB on disk. I am noticing that the servers run into OOMs while trying to transition segment state after downloading segments to the server disk. We are using 15 servers with 4 CPUs and 32 GB RAM each, with 16 GB for heap, and we are also using off-heap. The servers have 2 TB of disk each, i.e. a total of 30 TB of disk space, and we are loading a total of 2 TB of data. We have also configured an inverted index on some fields in the data.

    Priyank Bagrecha

    08/19/2022, 8:58 PM
    How does Pinot, and especially the server, behave differently between online vs. offline tables?

    Sukesh Boggavarapu

    08/19/2022, 9:03 PM
    I have a realtime table with 3-day retention, and an offline table that gets data loaded daily through an offline ingestion job. So for example, when I query for yesterday's data, does it get queried only from the offline table despite the realtime table also having it? Just wanted to check whether the offline table always takes precedence over the realtime table.

    Jay Bhatt

    08/22/2022, 2:55 PM
    Hello folks 👋, new to the Pinot world, just wanted clarification on one point. We have a `Long` type column in one of the tables. Going ahead, for more precision, we want to maintain a `Double` type column. Would it be possible to directly update the column type in the schema, or should we add a new column and delete the old one? I referred to the schema evolution link, but couldn't determine how to proceed from it.
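For what it's worth, the additive path would be a new `DOUBLE` column alongside the existing one (a sketch; names are hypothetical, and whether an in-place type change is safe is exactly the open question here):

```json
{
  "metricFieldSpecs": [
    { "name": "amount", "dataType": "LONG" },
    { "name": "amount_precise", "dataType": "DOUBLE" }
  ]
}
```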

    Stuart Millholland

    08/22/2022, 4:52 PM
    So we found ourselves in a position where we had to decrease kafka partitions for realtime and ended up deleting our consuming segments. We are looking for how we tell Pinot to start creating consuming segments again.

    Stuart Millholland

    08/22/2022, 7:00 PM
    We've noticed realtime consumption "Seeking to offset" logs are super chatty in the realtime servers. Is there a way to tweak that?

    Luis Fernandez

    08/22/2022, 7:28 PM
    hey friends, i come to you with a query optimization question. we have the following query:
    select sum(impression_count) from metrics where user_id = xxx and product_id = xxxx
    this query, even though it has more selectivity, is slower than this one:
    select sum(impression_count) from metrics where user_id = xxx
    we are currently partitioning on `user_id` and have a bloom filter on `product_id,user_id`. is it because of the partitioning? we also see an elevated `numEntriesScannedInFilter` when we add `product_id` to the query, but without it it is pretty much 0. what do you recommend in this case?
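For context, this is roughly where the partitioning and bloom filter described above live in the table config (a sketch; the function name and `numPartitions` are illustrative):

```json
{
  "tableIndexConfig": {
    "segmentPartitionConfig": {
      "columnPartitionMap": {
        "user_id": {
          "functionName": "Murmur",
          "numPartitions": 8
        }
      }
    },
    "bloomFilterColumns": ["product_id", "user_id"]
  }
}
```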

    Mahesh babu

    08/23/2022, 7:08 AM
    Hi Team, Pinot realtime tables connected to Kafka topics stop updating after some time even though the Kafka topics keep getting populated. This is the debug information:
    [
      {
        "tableName": "DISH12_REALTIME",
        "numSegments": 1,
        "numServers": 1,
        "numBrokers": 1,
        "segmentDebugInfos": [
          {
            "segmentName": "DISH12__0__0__20220817T1159Z",
            "serverState": {
              "Server_10.90.21.84_8098": {
                "idealState": "CONSUMING",
                "externalView": "CONSUMING",
                "segmentSize": "0 bytes",
                "consumerInfo": {
                  "segmentName": "DISH12__0__0__20220817T1159Z",
                  "consumerState": "NOT_CONSUMING",
                  "lastConsumedTimestamp": 1660823957234,
                  "partitionToOffsetMap": {"0": "101151"}
                },
                "errorInfo": {
                  "timestamp": "2022-08-18 11:59:22 UTC",
                  "errorMessage": "Could not build segment",
                  "stackTrace": null
                }
              }
            }
          }
        ],
        "serverDebugInfos": [],
        "brokerDebugInfos": [],
        "tableSize": {
          "reportedSize": "0 bytes",
          "estimatedSize": "0 bytes"
        },
        "ingestionStatus": {
          "ingestionState": "UNHEALTHY",
          "errorMessage": "Segment: DISH12__0__0__20220817T1159Z is not being consumed on server: Server_10.90.21.84_8098"
        }
      }
    ]

    Tony Zhang

    08/23/2022, 7:32 AM
    Hey guys, is tiered storage available for Pinot? If so, how can I configure it with AWS S3? Thanks.

    Kaushik Ranganath

    06/07/2021, 5:19 AM
    When I do a kubectl get all -n pinot-quickstart, I see this has brought up classic load balancers to expose both the broker and the controller on tcp ports, and when I make a curl/browser request to the DNSs, I expect these to show up the UI for the broker and the Swagger UI for the controller, but the request times out eventually without bringing up the UI. I am a beginner in AWS Networking as well, but the security groups created by these setup instructions which I have followed exactly allows TCP requests from 0.0.0.0/0. Any inputs on bringing the UI up for the broker and server are much appreciated!

    Luis Fernandez

    08/23/2022, 7:33 PM
    hello friends, me again, with a different issue: we are executing queries in Pinot and getting the following exception at query time, not for all of them, just a few:
    QueryExecutionError:
    java.lang.IndexOutOfBoundsException
    	at java.base/java.nio.Buffer.checkBounds(Buffer.java:714)
    	at java.base/java.nio.DirectByteBuffer.get(DirectByteBuffer.java:288)
    	at org.apache.pinot.segment.local.segment.index.readers.forward.VarByteChunkSVForwardIndexReader.getStringCompressed(VarByteChunkSVForwardIndexReader.java:81)
    	at org.apache.pinot.segment.local.segment.index.readers.forward.VarByteChunkSVForwardIndexReader.getString(VarByteChunkSVForwardIndexReader.java:61)
    query looks like this:
    SELECT SUM(impression_count) as imp_count, stemmed_query FROM query_metrics WHERE user_id = xxx AND product_id = xxx AND serve_time BETWEEN 1660622400 AND 1661227199 GROUP BY stemmed_query ORDER BY impression_count LIMIT 100000
    stats:
    "numServersQueried": 2,
        "numServersResponded": 2,
        "numSegmentsQueried": 11,
        "numSegmentsProcessed": 10,
        "numSegmentsMatched": 10,
        "numConsumingSegmentsQueried": 1,
        "numDocsScanned": 16241,
        "numEntriesScannedInFilter": 5862,
        "numEntriesScannedPostFilter": 64964,
        "numGroupsLimitReached": false,
        "totalDocs": 77203847,
        "timeUsedMs": 133,
        "offlineThreadCpuTimeNs": 0,
        "realtimeThreadCpuTimeNs": 0,
        "offlineSystemActivitiesCpuTimeNs": 0,
        "realtimeSystemActivitiesCpuTimeNs": 0,
        "offlineResponseSerializationCpuTimeNs": 0,
        "realtimeResponseSerializationCpuTimeNs": 0,
        "offlineTotalCpuTimeNs": 0,
        "realtimeTotalCpuTimeNs": 0,
        "segmentStatistics": [],
        "traceInfo": {},
        "minConsumingFreshnessTimeMs": 1661283161852,
        "numRowsResultSet": 100

    Devang Shah

    08/23/2022, 11:37 PM
    Another question: I searched through the logs for the job that has been CrashLooping. Here's what I found. Any clues on how I can fix this?
    Defaulted container "pinot-add-example-realtime-table-json" out of: pinot-add-example-realtime-table-json, pinot-add-example-realtime-table-avro
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/pinot/lib/pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.11.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-file-system/pinot-s3/pinot-s3-0.11.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-metrics/pinot-yammer/pinot-yammer-0.11.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-metrics/pinot-dropwizard/pinot-dropwizard-0.11.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-environment/pinot-azure/pinot-azure-0.11.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/pinot/plugins/pinot-stream-ingestion/pinot-pulsar/pinot-pulsar-0.11.0-SNAPSHOT-shaded.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
    WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.
    WARNING: An illegal reflective access operation has occurred
    WARNING: Illegal reflective access by org.codehaus.groovy.reflection.CachedClass (file:/opt/pinot/lib/pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar) to method java.lang.Object.finalize()
    WARNING: Please consider reporting this to the maintainers of org.codehaus.groovy.reflection.CachedClass
    WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
    WARNING: All illegal access operations will be denied in a future release
    2022/08/23 23:29:55.035 INFO [AddTableCommand] [main] Executing command: AddTable -tableConfigFile /var/pinot/examples/airlineStats_realtime_table_config.json -schemaFile /var/pinot/examples/airlineStats_schema.json -controllerProtocol http -controllerHost pinot-controller -controllerPort 9000 -user null -password [hidden] -exec
    2022/08/23 23:29:55.765 INFO [AddTableCommand] [main] {"code":500,"error":"org.apache.pinot.shaded.org.apache.kafka.common.KafkaException: Failed to construct kafka consumer"}

    Devang Shah

    08/23/2022, 11:31 PM
    Hello friends, I have been busy deploying the Pinot Quickstart on Kubernetes in Azure. Everything went well until deploying the pods and services and seeing them running. I was even able to access the controller through my browser. But then the moments of glory did not last long: my controller is not responding anymore, and I can't load the broker web page. See the screenshot below of the current state of the pods and services. Except for the failure of a job, I don't see any other pod or service failing. Has anyone faced this issue? Do you know what could be going wrong? Could it be that my k8s cluster is undersized?