# troubleshooting
  • prasanna

    08/23/2024, 5:29 AM
    Hi Team, I have a peculiar problem; below is the situation. We have a hybrid table setup, say tableA, which is ingested with weekly data (note: this is a dev setup). The realtime table has 7 DAYS retention and the offline table has YEARS retention. When we pushed data for the week, the scheduled RealtimeToOffline task moved it to the offline table. Since ingestion is not continuous, the realtime table is empty as of now. When I query tableA without a time-range condition, or even with a time-range condition for older data, we always receive 0 results, but when the same query with a time-range condition for older data is executed specifically against tableA_OFFLINE, we see results. Can someone please help me understand the problem here? As per the documentation we just need to reference tableA in the query, and the OFFLINE vs. REALTIME routing is handled by Pinot internally. Is there any gap in my understanding, or am I missing a config?
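    For anyone debugging the same symptom: the broker splits a hybrid query at a time boundary derived from the OFFLINE table's segment end times, so checking which boundary the broker computed is a good first step. A minimal sketch, assuming a broker at localhost:8099 and a Pinot version that exposes the broker debug endpoints (both assumptions):

        # Time boundary the broker uses to split hybrid queries (if your version exposes it)
        curl http://localhost:8099/debug/timeBoundary/tableA

        # Compare counts from the logical table vs. the physical offline table
        curl -X POST http://localhost:8099/query/sql \
          -H 'Content-Type: application/json' \
          -d '{"sql": "SELECT COUNT(*) FROM tableA"}'
        curl -X POST http://localhost:8099/query/sql \
          -H 'Content-Type: application/json' \
          -d '{"sql": "SELECT COUNT(*) FROM tableA_OFFLINE"}'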
  • Adil Shaikh

    08/23/2024, 9:50 AM
    Hi Team, I am using deep archive storage: my controller sends the data to an S3 bucket, but when I add another server and it tries to fetch segments, those segments show as bad on that server. My setup is Docker.
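    Two hedged things to check here. If "deep archive" means the S3 Glacier Deep Archive storage class, segment downloads will fail outright, since objects in that class must be restored before they can be fetched. Separately, every server (not just the controller) needs the S3 filesystem and segment fetcher configured so it can download segments from the deep store; a minimal sketch of the server-side properties (region and credentials are placeholders):

        pinot.server.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
        pinot.server.storage.factory.s3.region=us-east-1
        pinot.server.segment.fetcher.protocols=file,http,s3
        pinot.server.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher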
  • Pramiti

    08/23/2024, 11:27 AM
    Hi! I am trying to ingest realtime data from Apache Pulsar into Pinot. I have created a topic and producer in Pulsar and have checked the schema using a temporary consumer. When I try to ingest the same data into Pinot using the same schema and the table stream configs from the documentation, no segment is created and no data is ingested. Pinot 1.1.0 is in a cluster and Pulsar 3.3.1 is in standalone mode, both running on Docker using compose files.
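    For reference, a minimal sketch of Pulsar streamConfigs, assuming the JSON decoder and the Pulsar plugin bundled with Pinot 1.1.0 (topic name and service URL are placeholders). One common Docker Compose pitfall: the service URL must use the Pulsar container's service name, since localhost inside the Pinot container is not the Pulsar broker.

        "streamConfigs": {
          "streamType": "pulsar",
          "stream.pulsar.topic.name": "my-topic",
          "stream.pulsar.bootstrap.servers": "pulsar://pulsar:6650",
          "stream.pulsar.consumer.type": "lowlevel",
          "stream.pulsar.consumer.prop.auto.offset.reset": "smallest",
          "stream.pulsar.consumer.factory.class.name": "org.apache.pinot.plugin.stream.pulsar.PulsarConsumerFactory",
          "stream.pulsar.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder"
        }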
  • Zhuangda Z

    08/24/2024, 2:57 AM
    Team, curious whether a sparse segment would waste a lot of space, or does Pinot have some underlying optimization for this?
  • Zhuangda Z

    08/24/2024, 4:03 PM
    One of the restrictions of the text index is that it can't coexist with other indexes on the same column. Curious if we can address this by having a mirror column? Like, enable a star-tree index on the mirror column while the original one is used for text search. Will this work?
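    For what it's worth, the mirror column itself can be materialized at ingestion time with a transform config rather than changing the producer; a sketch, under the assumption that an identity transform expression is accepted (all column names here are hypothetical):

        "ingestionConfig": {
          "transformConfigs": [
            { "columnName": "description_mirror", "transformFunction": "description" }
          ]
        },
        "fieldConfigList": [
          { "name": "description", "encodingType": "RAW", "indexTypes": ["TEXT"] }
        ]

    description_mirror would also need to be declared in the schema and listed in the star-tree dimensionsSplitOrder.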
  • Slackbot

    08/26/2024, 10:34 AM
    This message was deleted.
  • prasanna

    08/26/2024, 1:25 PM
    Hi All, I have a basic question about merge rollup config. Are the below configs acceptable? I was able to create the table with this config without any issue, but at the same time I was not able to use WEEK, MONTH, or YEAR as a retention unit, thus I want to be sure this is OK. Can someone kindly confirm or correct? Basically I want to be sure that values like 1w, 1m, 1y are acceptable config values.

        "task": {
          "taskTypeConfigsMap": {
            "MergeRollupTask": {
              "1hour.mergeType": "rollup",
              "1hour.bucketTimePeriod": "1d",
              "1hour.bufferTimePeriod": "3d",
              "1day.mergeType": "rollup",
              "1day.bucketTimePeriod": "1w",
              "1day.bufferTimePeriod": "3w",
              "1week.mergeType": "rollup",
              "1week.bucketTimePeriod": "1m",
              "1week.bufferTimePeriod": "3m",
              "1month.mergeType": "rollup",
              "1month.bucketTimePeriod": "1y",
              "1month.bufferTimePeriod": "3y",
              "sumClm.aggregationType": "sum",
              "minClm.aggregationType": "min",
              "maxClm.aggregationType": "max",
              "countClm.aggregationType": "sum",
              "col.aggregationType": "sum"
            }
          }
        }
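    One hedged caution on the period strings: as far as I can tell, Pinot's period parser only understands d/h/m/s suffixes, with m meaning minutes, so "1w" and "1y" would likely fail to parse when the task generator runs, and "1m" would mean one minute rather than one month. A sketch of the same schedule expressed in days (the day counts are approximations I chose):

        "1day.bucketTimePeriod": "7d",
        "1day.bufferTimePeriod": "21d",
        "1week.bucketTimePeriod": "30d",
        "1week.bufferTimePeriod": "90d",
        "1month.bucketTimePeriod": "365d",
        "1month.bufferTimePeriod": "1095d"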
  • Sandeep R

    08/26/2024, 6:30 PM
    Hi everyone, we have configured a table retention policy of 6 hours, but I've noticed that the deleted segments are not being automatically flushed from disk. I currently have to clean up this data manually across all servers.
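    Retention is enforced by the controller's RetentionManager periodic task, so disk is only freed when that task runs and the servers drop the deleted segments. If your version exposes the periodic-task API, you can trigger it on demand to see whether cleanup then happens (controller address is a placeholder):

        curl -X GET "http://localhost:9000/periodictask/run?taskname=RetentionManager"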
  • raghav

    08/27/2024, 10:45 AM
    Hey Team, we are facing the below issue with several of our tables; this is the output from running the VerifyClusterState command. Is there any SOP to resolve these errors?

        2024/08/27 10:41:33.110 ERROR [ClusterStateVerifier] [pool-3-thread-1] Table drift_execution_history_REALTIME is not stable. numUnstablePartitions: 18
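    A common first step, hedged since the root cause can vary: when the external view has drifted from the ideal state, resetting the affected segments (or rebalancing the table) often restores stability. Controller address is a placeholder:

        # Reset segments whose external view disagrees with the ideal state
        curl -X POST "http://localhost:9000/segments/drift_execution_history_REALTIME/reset"

        # Or run a rebalance (dry-run first)
        curl -X POST "http://localhost:9000/tables/drift_execution_history/rebalance?type=REALTIME&dryRun=true"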
  • Nathan

    08/27/2024, 11:42 AM
    Hi Everyone, I'm working on reading data from Apache Pinot using the JDBC client and encountering some issues that might be related to missing JAR files. Could someone please provide a list of the required JARs that should be included along with pinot-java-client-1.2.0.jar?
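    Not an authoritative list, but the Tableau message later in this channel uses the shaded JDBC client plus async-http-client and calcite-core alongside it. A minimal connectivity test with the JDBC driver (controller address and table name are placeholders):

        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.ResultSet;
        import java.sql.Statement;

        public class PinotJdbcSmokeTest {
          public static void main(String[] args) throws Exception {
            // The jdbc:pinot URL points at the controller; brokers are discovered from it
            try (Connection conn = DriverManager.getConnection("jdbc:pinot://localhost:9000");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT * FROM myTable LIMIT 1")) {
              while (rs.next()) {
                System.out.println(rs.getString(1));
              }
            }
          }
        }

    A NoClassDefFoundError from this test should name whichever JAR is still missing from the classpath.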
  • Vũ Lê

    08/28/2024, 3:18 AM
    Hi everyone, I am configuring authentication for Pinot. I followed the instructions from the Pinot documentation, however when I restart Pinot the login page still does not appear. Is there any configuration I need to enable? Please help me. I have configured according to this documentation: https://docs.pinot.apache.org/v/release-0.9.0/operators/tutorials/authentication-authorization-and-acls
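    For comparison, a minimal basic-auth sketch for the controller, based on my reading of the current docs (passwords are placeholders, the broker needs its own equivalent settings, and the linked page is for release-0.9.0, so property names may differ across versions):

        controller.admin.access.control.factory.class=org.apache.pinot.controller.api.access.BasicAuthAccessControlFactory
        controller.admin.access.control.principals=admin,user
        controller.admin.access.control.principals.admin.password=verysecret
        controller.admin.access.control.principals.user.password=secret
        controller.admin.access.control.principals.user.permissions=READ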
  • Apoorv Upadhyay

    08/28/2024, 6:38 AM
    Hi Team, can you help with this? I have a dateTimeFieldSpec as below:

        "dateTimeFieldSpecs": [
          {
            "name": "order_date",
            "dataType": "LONG",
            "defaultNullValue": 0,
            "format": "1:MILLISECONDS:EPOCH",
            "granularity": "1:SECONDS"
          }
        ]

    I was expecting the column to have a default value of 0, but it's Long.MIN_VALUE. Any possible reason for it?
  • Anand Kr Shaw

    08/29/2024, 6:57 AM
    Hi Team, doing a prod setup of Pinot with three controllers. One pod comes up successfully, but the other two keep restarting with the below error.

        ➜ pinot-prod git:(anandsh/tlsconfig) ✗ kubectl logs pinot-controller-1 -n pinot -f
        2024/08/29 05:05:43.946 INFO [StartControllerCommand] [main] Executing command: StartController -configFileName /config/controller.conf
        2024/08/29 05:05:44.050 INFO [StartServiceManagerCommand] [main] Executing command: StartServiceManager -clusterName PinotCluster -zkAddress zookeeper:2181 -port -1 -bootstrapServices []
        2024/08/29 05:05:44.051 INFO [StartServiceManagerCommand] [main] Starting a Pinot [SERVICE_MANAGER] at 0.553s since launch
        2024/08/29 05:05:44.054 INFO [StartServiceManagerCommand] [main] Started Pinot [SERVICE_MANAGER] instance [ServiceManager_pinot-controller-1.pinot-controller.pinot.svc.cluster.local_-1] at 0.556s since launch
        2024/08/29 05:05:44.054 INFO [StartServiceManagerCommand] [main] Starting a Pinot [CONTROLLER] at 0.556s since launch
        2024/08/29 05:06:27.151 ERROR [ZKHelixManager] [main] fail to createClient. retry 1
        org.apache.helix.HelixException: Failed to create live instance because instance: Controller_pinot-controller_9000 already has a live-instance in cluster: PinotCluster. Path is: /PinotCluster/LIVEINSTANCES/Controller_pinot-controller_9000
            at org.apache.helix.manager.zk.ParticipantManager.createLiveInstance(ParticipantManager.java:354) ~[pinot-all-1.2.0-jar-with-dependencies.jar:1.2.0-cc33ac502a02e2fe830fe21e556234ee99351a7a]
            at org.apache.helix.manager.zk.ParticipantManager.handleNewSession(ParticipantManager.java:159)
            at org.apache.helix.manager.zk.ZKHelixManager.handleNewSessionAsParticipant(ZKHelixManager.java:1443)
            at org.apache.helix.manager.zk.ZKHelixManager.handleNewSession(ZKHelixManager.java:1390)
            at org.apache.helix.manager.zk.ZKHelixManager.createClient(ZKHelixManager.java:782)
            at org.apache.helix.manager.zk.ZKHelixManager.connect(ZKHelixManager.java:817)
            at org.apache.pinot.controller.BaseControllerStarter.registerAndConnectAsHelixParticipant(BaseControllerStarter.java:769)
            at org.apache.pinot.controller.BaseControllerStarter.setUpPinotController(BaseControllerStarter.java:449)
            at org.apache.pinot.controller.BaseControllerStarter.start(BaseControllerStarter.java:378)
            at org.apache.pinot.tools.service.PinotServiceManager.startController(PinotServiceManager.java:118)
            at org.apache.pinot.tools.service.PinotServiceManager.startRole(PinotServiceManager.java:87)
            at org.apache.pinot.tools.admin.command.StartServiceManagerCommand.lambda$startBootstrapServices$0(StartServiceManagerCommand.java:252)
    (attachments: pinot-controller-configmap.yaml, pinot-controller-statefulset.yaml)
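    The log shows every pod trying to register the same Helix instance id, Controller_pinot-controller_9000, so the second and third controllers collide with the first one's live instance. A hedged guess at the fix is to give each pod a unique id, e.g. (values are placeholders):

        pinot.set.instance.id.to.hostname=true
        # or pin the id explicitly per pod to its stable StatefulSet DNS name:
        controller.host=pinot-controller-1.pinot-controller.pinot.svc.cluster.local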
  • Mrityunjay Sharma

    08/29/2024, 11:14 AM
    Hi Team, I want to use S3 as deep storage and configured the below in Helm, but data is not being stored in S3. Please help.
    controller:
      
      extra:
        configs: |-
          pinot.set.instance.id.to.hostname=true
          controller.task.scheduler.enabled=true
          pinot.controller.segment.fetcher.protocols=file,http,s3
          pinot.controller.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
          pinot.controller.storage.factory.s3.region=us-east-1
          
          controller.helix.cluster.name=pinot
          controller.data.dir=s3://{s3-bucket-name}
          controller.local.temp.dir=/tmp/pinot-tmp-data/
          controller.enable.split.commit=true
          pinot.controller.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
          pinot.controller.storage.factory.s3.accessKey={access-key}
          pinot.controller.storage.factory.s3.secretKey={secret-key}
          pinot.controller.storage.factory.s3.disableAcl=false
          
          
     Server:
     
      extra:
        configs: |-
          pinot.set.instance.id.to.hostname=true
          pinot.server.instance.realtime.alloc.offheap=true
          pinot.query.server.port=7321
          pinot.query.runner.port=7732
          pinot.server.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
          pinot.server.segment.fetcher.protocols=file,http,s3
          pinot.server.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
          
          pinot.server.instance.enable.split.commit=true
          pinot.server.storage.factory.s3.httpclient.maxConnections=50
          pinot.server.storage.factory.s3.httpclient.socketTimeout=30s
          pinot.server.storage.factory.s3.httpclient.connectionTimeout=2s
          pinot.server.storage.factory.s3.httpclient.connectionTimeToLive=0s
          pinot.server.storage.factory.s3.httpclient.connectionAcquisitionTimeout=10s
          
          pinot.use-streaming-for-segment-queries=true
          realtime.segment.serverUploadToDeepStore = true
          
          pinot.server.storage.factory.s3.region=us-east-1
          pinot.server.instance.dataDir=s3://{s3-bucket-name}
          pinot.server.instance.segmentTarDir=/tmp/pinot-tmp/server/segmentTars
          pinot.server.storage.factory.s3.disableAcl=false
          pinot.server.storage.factory.s3.endpoint=s3://{s3-bucket-name}
          pinot.server.segment.store.uri=s3://{s3-bucket-name}
          pinot.server.instance.segment.store.uri=s3://{s3-bucket-name}
          
          pinot.server.storage.factory.s3.accessKey={access-key}
          pinot.server.storage.factory.s3.secretKey={secret-key}
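    A few things look off in this snippet, hedged since the full values file isn't shown: (1) the "Server:" key is capitalized and indented differently from "controller:", so Helm may be ignoring the whole server block (the chart key is lowercase "server"); (2) pinot.server.storage.factory.s3.endpoint normally takes an HTTP(S) endpoint and is only needed for S3-compatible stores, not an s3:// bucket URI; (3) pinot.server.instance.dataDir should stay a local path, with only the deep-store locations pointing at s3://. A sketch of the server lines I would expect (bucket name is a placeholder):

        pinot.server.instance.dataDir=/var/pinot/server/data/index
        pinot.server.instance.segmentTarDir=/var/pinot/server/data/segmentTars
        pinot.server.instance.segment.store.uri=s3://{s3-bucket-name}
        realtime.segment.serverUploadToDeepStore=true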
  • Bruno Mendes

    08/29/2024, 2:02 PM
    Hello guys, when a real-time table stops being populated (receiving new data), what should be the first thing I look at? Server logs? Is there any API endpoint which gives some clue? Thanks.
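    One endpoint that often helps, assuming a controller at localhost:9000 and the table name as a placeholder: the consuming-segments info API shows each partition's consumer state and offsets, which usually points at where ingestion stopped:

        curl "http://localhost:9000/tables/myTable/consumingSegmentsInfo"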
  • Dor Levi

    08/29/2024, 4:59 PM
    Is anyone running the latest master in production? or at least a branch with https://github.com/apache/pinot/pull/13711 (Support polymorphic scalar comparison functions in the multi-stage query engine #13711) ? We are very interested in this and want to build more confidence before using the latest master with it.
  • Bruno Mendes

    08/29/2024, 5:33 PM
    My tables are stuck in the UPDATING status for more than one day; how should I troubleshoot to discover what is happening?
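    Comparing the ideal state with the external view usually shows which segments or servers are out of sync; a sketch, with controller address and table name as placeholders:

        curl "http://localhost:9000/tables/myTable/idealstate"
        curl "http://localhost:9000/tables/myTable/externalview"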
  • Dor Levi

    08/30/2024, 11:01 PM
    I'm trying to look at the trace info for a (multi-stage) query. I'm executing it in the Pinot GUI with the trace checkbox enabled, but in the JSON output I keep getting:

        "traceInfo": {}

    Do we need to turn on some other flags as well?
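    For what it's worth, tracing can also be requested directly in the broker request body; if that comes back empty too, the multi-stage engine in your version may simply not populate traceInfo yet (hedged; broker address and table name are placeholders):

        curl -X POST http://localhost:8099/query/sql \
          -H 'Content-Type: application/json' \
          -d '{"sql": "SELECT * FROM myTable LIMIT 1", "trace": true}'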
  • meshari aldossari

    08/30/2024, 11:20 PM
    Hi team, I was wondering if it's possible to build extra indexes on a column that's used in the star-tree index split order?
  • Anand Kr Shaw

    08/31/2024, 8:19 AM
    Hi Team, setting up the monitoring part of Pinot via Prometheus and facing an issue with the agent coming up:

        controller:
          ...
          jvmOpts: "-javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent.jar=8008:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml -Xms256M -Xmx1G"

    I am following this document but somehow the port is not up.
        apiVersion: apps/v1
        kind: StatefulSet
        metadata:
          name: pinot-controller
          namespace: pinot
        spec:
          serviceName: "pinot-controller"
          replicas: 1
          selector:
            matchLabels:
              app: pinot-controller
          template:
            metadata:
              labels:
                app: pinot-controller
              annotations:
                prometheus.io/scrape: "true"
                prometheus.io/port: "8008"
                prometheus.io/path: "/metrics"
            spec:
              affinity:
                podAntiAffinity:
                  requiredDuringSchedulingIgnoredDuringExecution:
                    - labelSelector:
                        matchExpressions:
                          - key: app.kubernetes.io/name
                            operator: In
                            values:
                              - zookeeper
                      topologyKey: "kubernetes.io/hostname"
              containers:
                - name: pinot-controller
                  image: apachepinot/pinot:1.2.0
                  env:
                    - name: JVM_OPTS
                      value: "-javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent.jar=8008:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml -Xms256M -Xmx1G"
                  ports:
                    - containerPort: 9000
                    - containerPort: 8008
                  command:
                    - "bin/pinot-admin.sh"
                    - "StartController"
                    - "-zkAddress"
                    - "zookeeper-0:2181"
                    - "-configFileName"
                    - "/config/controller.conf"
                  resources:
                    requests:
                      memory: "4Gi"
                      cpu: "4000m"
                    limits:
                      memory: "8Gi"
                      cpu: "8000m"
                  volumeMounts:
                    - mountPath: /data
                      name: pinot-controller-storage
                    - mountPath: /config
                      name: pinot-config
              volumes:
                - name: pinot-config
                  configMap:
                    name: pinot-controller-config
          volumeClaimTemplates:
            - metadata:
                name: pinot-controller-storage
              spec:
                accessModes: [ "ReadWriteOnce" ]
                resources:
                  requests:
                    storage: 50Gi
                storageClassName: "gp2"
        ---
        apiVersion: v1
        kind: Service
        metadata:
          name: pinot-controller
          namespace: pinot
          annotations:
            prometheus.io/scrape: "true"
            prometheus.io/port: "8008"
            prometheus.io/path: "/metrics"
        spec:
          ports:
            - name: http
              port: 9000
              targetPort: 9000
            - name: prometheus
              port: 8008
              targetPort: 8008
          selector:
            app: pinot-controller

    From within the pod:

        sh-5.2# curl http://localhost:8008/metrics
        curl: (7) Failed to connect to localhost port 8008 after 0 ms: Couldn't connect to server
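    A hedged observation: when the container runs bin/pinot-admin.sh directly, the launch scripts (and the official Docker images) read JAVA_OPTS, so an env var named JVM_OPTS may be ignored entirely, which would explain the agent never binding port 8008. Worth trying with the same value under the other variable name:

        env:
          - name: JAVA_OPTS
            value: "-javaagent:/opt/pinot/etc/jmx_prometheus_javaagent/jmx_prometheus_javaagent.jar=8008:/opt/pinot/etc/jmx_prometheus_javaagent/configs/pinot.yml -Xms256M -Xmx1G"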
  • Adil Shaikh

    08/31/2024, 2:06 PM
    How can I provide table-level access for a specific user in Pinot in a Docker setup?
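    With the built-in basic auth, per-user table restrictions can be expressed in the broker config; a sketch from my reading of the docs (user, password, and table names are placeholders):

        pinot.broker.access.control.class=org.apache.pinot.broker.broker.BasicAuthAccessControlFactory
        pinot.broker.access.control.principals=admin,user1
        pinot.broker.access.control.principals.admin.password=verysecret
        pinot.broker.access.control.principals.user1.password=secret
        pinot.broker.access.control.principals.user1.tables=tableA
        pinot.broker.access.control.principals.user1.permissions=READ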
  • Sumitra Saksham

    09/02/2024, 8:11 AM
    Hi Team, I am trying to understand Pinot for our use case. We have Kafka as the data ingestion source, and I get the part where we will have consuming segments that are flushed once they reach the row or time limit. I have a few doubts:
    1. If, say, the server that holds the consuming segment for Kafka partition 1 goes down, what happens, given that the consuming segment lives on that server's local storage?
    2. Once a segment is flushed, it is stored in a segment store such as the deep store. If my server goes down and another server is added, will the controller tell the new server to fetch the segment from the segment store and use it for serving queries?
    3. While the segment store is providing the data to the new server, will queries return wrong results, since a few segments will be missing from the final result?
  • Jaideep C

    09/02/2024, 10:12 AM
    Hi, I am following this doc to set up Grafana and Prometheus. I followed it, but I am not able to see the below tables:
    • Pinot Controller CPU User
    • Pinot Controller JVM User
    • Pinot Broker CPU User
    • Pinot Broker JVM User
    • Pinot Server CPU User
    • Pinot Server JVM User
    I am guessing it is not picking these up because I am running on bare metal, unlike the docs where the cluster runs on k8s; maybe it pulls the usage info from the k8s cluster, I am not sure. I'm an absolute newbie when it comes to Grafana and Prometheus, so I'm not sure how to fix this. Please let me know if you have any suggestions. For the exporter.yml files I am using the ones found in the Apache Pinot Docker container.
  • Sumitra Saksham

    09/02/2024, 2:24 PM
    I am facing an issue: segments are not getting created. Information: 1. Ingestion source: Kafka topic with 2 partitions. 2. Below is the JSON table config:
    {
      "tableName": "RawData",
      "tableType": "REALTIME",
      "segmentsConfig": {
        "timeColumnName": "event_time",
        "schemaName": "RawData",
        "replication": "2",
        "retentionTimeUnit": "DAYS",
        "retentionTimeValue": "180",
        "minimizeDataMovement": true,
        "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
        "replicasPerPartition": "2",
        "completionMode": "DOWNLOAD",
        "peerSegmentDownloadScheme": "http"
      },
      "tenants": {
        "broker": "DefaultTenant",
        "server": "DefaultTenant"
      },
      "routing": {
        "instanceSelectorType": "strictReplicaGroup",
        "segmentPrunerTypes": [
          "time"
        ]
      },
      "query": {
        "timeoutMs": 30000,
        "disableGroovy": true,
        "useApproximateFunction": true,
        "maxQueryResponseSizeBytes": 104857600,
        "maxServerResponseSizeBytes": 52428800
      },
      "tableIndexConfig": {
        "loadMode": "MMAP",
        "streamConfigs": {
          "streamType": "kafka",
          "stream.kafka.broker.list": "${KAFKA_BROKERS}",
          "stream.kafka.consumer.prop.auto.offset.reset": "smallest",
          "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
          "realtime.segment.flush.threshold.time": "24h",
          "stream.kafka.topic.name": "raw-data",
          "stream.kafka.consumer.type": "lowlevel",
          "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
          "realtime.segment.flush.threshold.rows": "100000",
          "realtime.segment.flush.segment.size": "1GB",
          "sasl.mechanism": "SCRAM-SHA-256",
          "security.protocol": "SASL_PLAINTEXT",
          "sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"${SASL_USERNAME}\" password=\"${SASL_PASSWORD}\";"
        },
        "enableDefaultStarTree": true,
        "invertedIndexColumns": [
          "id",
          "raw_val"
        ],
        "rangeIndexColumns": [
          "event_time"
        ],
        "sortedColumn": [
          "event_time"
        ],
        "aggregateMetrics": false,
        "nullHandlingEnabled": false,
        "columnMajorSegmentBuilderEnabled": true,
        "starTreeIndexConfigs": [
          {
            "dimensionsSplitOrder": [
              "id",
              "raw_val",
              "event_time"
            ],
            "skipStarNodeCreationForDimensions": [],
            "functionColumnPairs": [
              "COUNT__*"
            ],
            "maxLeafRecords": 10000
          }
        ]
      },
      "metadata": {
        "customConfigs": {}
      },
      "ingestionConfig": {
        "continueOnError": false,
        "rowTimeValueCheck": false,
        "segmentTimeValueCheck": true
      },
      "fieldConfigList": [
        {
          "name": "id",
          "encodingType": "DICTIONARY"
        },
        {
          "name": "raw_val",
          "encodingType": "DICTIONARY"
        }
      ]
    }
    I am using 3 Servers and 1 Controller. I am using GKS for deep store. Can you please help?
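    Two hedged things to check: first, whether the ${KAFKA_BROKERS} / ${SASL_USERNAME} / ${SASL_PASSWORD} placeholders are actually being resolved (depending on the version and how the config is submitted, environment-variable substitution may or may not apply on your controller); second, SASL/authentication failures surface in the server (consumer) logs rather than the controller's. The consuming state per partition can be checked with (controller address is a placeholder):

        curl "http://localhost:9000/tables/RawData/consumingSegmentsInfo"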
  • Deepak Gautam

    09/03/2024, 9:31 AM
    Hi team, we increased the vCPUs of our servers from 2 to 12, and after that we started facing the following exception while running VerifySegmentState. Does anyone have an idea how this could be fixed? Exception:

        2024/09/03 09:23:24.816 INFO [VerifySegmentState] [main] Segment: d3_ob_metrics_1H__7__9__20240826T0915Z idealstate: {Server_pinot-server-19.pinot-server-headless.d3-ob-cluster-latest.svc.cluster.local_8098=ONLINE, Server_pinot-server-2.pinot-server-headless.d3-ob-cluster-latest.svc.cluster.local_8098=ONLINE} does NOT match external view: {Server_pinot-server-2.pinot-server-headless.d3-ob-cluster-latest.svc.cluster.local_8098=ONLINE}
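    When the ideal state lists a replica that the external view lacks, resetting the segment often forces the missing server to reload it; a sketch, assuming the table name with type is d3_ob_metrics_1H_REALTIME (an assumption from the segment name) and a placeholder controller address:

        curl -X POST "http://localhost:9000/segments/d3_ob_metrics_1H_REALTIME/reset"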
  • Nick Just

    09/03/2024, 9:53 AM
    Hello, I am trying to connect Tableau Desktop with Pinot using these steps (https://docs.pinot.apache.org/integrations/tableau), but Tableau gives me the following error:

        Bad Connection: Tableau could not connect to the data source.
        Error Code: FAB9A2C5
        org.apache.pinot.client.PinotClientException: Pinot returned HTTP status 400, expected 200

    Pinot version: 1.0.0. Jar files located in the Drivers folder as stated in the docs:
    • pinot-jdbc-client-1.0.0-shaded.jar
    • async-http-client-2.12.3.jar
    • calcite-core-1.34.0.jar
    I checked through curl that the broker and server endpoints are working.
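    An HTTP 400 from the client usually means the broker's SQL endpoint rejected the request itself, so reproducing it with curl against the broker often surfaces the underlying error message (broker address and table name are placeholders):

        curl -X POST http://localhost:8099/query/sql \
          -H 'Content-Type: application/json' \
          -d '{"sql": "SELECT * FROM myTable LIMIT 1"}'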