Pratik Bhadane
01/04/2023, 6:53 AM
Mostafa Ghadimi
01/04/2023, 12:45 PM
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.time": "1h",
"realtime.segment.flush.segment.size": "500M"
Issue 1: The segments' sizes after changing state from CONSUMING to COMPLETED are about 200M, not 500M (the segment creation duration is definitely less than 1 hour)
Issue 2: The segments are stored at /var/pinot/server/data/index and not at /var/pinot/server/data/segment. Here is the volume mapping in the docker-compose file:
- ./data/server_data/segment:/var/pinot/server/data/segment
- ./data/server_data/index:/var/pinot/server/data/index
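For reference, the documented size-based flush key is realtime.segment.flush.threshold.segment.size, and Pinot's default desired segment size is 200M; if the key realtime.segment.flush.segment.size is not recognized, the default would kick in, which would match the ~200M segments (hedged: inferred from the documented config keys, not verified against this deployment). Likewise, the server property pinot.server.instance.dataDir (untarred segment indexes) conventionally points at the .../index directory, while .../segment holds downloaded segment tars, so data landing under index may be expected. A sketch of the size-based config:
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.time": "1h",
"realtime.segment.flush.threshold.segment.size": "500M"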
Huaqiang He
01/04/2023, 12:46 PM
lastwithtime
select str11 as job_id,
lastwithtime(str14,event_timestamp,'STRING') as query
from telemetry_events
where epoch_minute between toEpochMinutes(now()-60000*24*7) and toEpochMinutes(now())
group by str11
limit 10000
execute query error: QueryExecutionError: java.lang.RuntimeException: Caught exception while building data table.
at org.apache.pinot.core.operator.blocks.InstanceResponseBlock.<init>(InstanceResponseBlock.java:46)
at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:118)
at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:39)
at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:39)
...
Caused by: java.nio.BufferOverflowException
at java.base/java.nio.HeapByteBuffer.put(HeapByteBuffer.java:221)
at java.base/java.nio.ByteBuffer.put(ByteBuffer.java:914)
at org.apache.pinot.segment.local.customobject.StringLongPair.toBytes(StringLongPair.java:46)
at org.apache.pinot.core.common.ObjectSerDeUtils$11.serialize(ObjectSerDeUtils.java:438)
where str14 (query) is a SQL-like string.
It looks like a character encoding issue. I can work around it with:
decodeUrl(lastwithtime(encodeUrl(str14),event_timestamp,'STRING')) as query
Sevvy Yusuf
01/04/2023, 1:59 PM
chandarasekaran m
01/05/2023, 3:29 PM
"pinot.multistage.engine.enabled": "true",
"pinot.server.instance.currentDataTableVersion": "4",
"pinot.query.server.port": "8421",
"pinot.query.runner.port": "8442"
chandarasekaran m
01/05/2023, 3:30 PM
Thomas Steinholz
01/05/2023, 10:47 PM
Shreeram Goyal
01/06/2023, 10:44 AM
Mostafa Ghadimi
01/07/2023, 1:11 PM
org.apache.pinot.spi.stream.TransientConsumerException: org.apache.pinot.shaded.org.apache.kafka.common.errors.TimeoutException: Failed to get offsets by times in 5001ms
Could someone help us fix this issue?
What has been done:
• The connection with Kafka nodes has been checked.
• The legacy version of Pinot had the same table definition for Kafka data ingestion!
P.S.: We are using Ansible to deploy Pinot; the playbook has been open-sourced at this link.
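"Failed to get offsets by times" generally means the consumer reached a broker but an offsetsForTimes lookup timed out, which is often a reachability problem with the brokers' advertised listeners rather than a table-config problem. A hedged sketch of checking the offset lookup from a Pinot host with Kafka's bundled tools (host and topic are placeholders; newer Kafka versions ship kafka-get-offsets.sh instead):
kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list kafka-node:9092 --topic my_topic --time -1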
Mostafa Ghadimi
01/07/2023, 1:14 PM
Caleb Shei
01/08/2023, 5:21 PM
23/01/08 16:52:57 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) (ph-jp98v52.infra.adtechlabs.com executor 2): java.lang.RuntimeException: java.lang.RuntimeException: Failed to authenticate user principal [i-cshei@INFRA.ADTECHLABS.COM] with keytab [/home/i-cshei/.keytab]
at org.apache.pinot.spi.filesystem.PinotFSFactory.register(PinotFSFactory.java:77)
at org.apache.pinot.plugin.ingestion.batch.spark3.SparkSegmentGenerationJobRunner$1.call(SparkSegmentGenerationJobRunner.java:349)
at org.apache.pinot.plugin.ingestion.batch.spark3.SparkSegmentGenerationJobRunner$1.call(SparkSegmentGenerationJobRunner.java:342)
at org.apache.spark.api.java.JavaRDDLike.$anonfun$foreach$1(JavaRDDLike.scala:352)
at org.apache.spark.api.java.JavaRDDLike.$anonfun$foreach$1$adapted(JavaRDDLike.scala:352)
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:575)
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:573)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1003)
at org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1003)
at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2268)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.RuntimeException: Failed to authenticate user principal [i-cshei@INFRA.ADTECHLABS.COM] with keytab [/home/i-cshei/.keytab]
at org.apache.pinot.plugin.filesystem.HadoopPinotFS.authenticate(HadoopPinotFS.java:288)
at org.apache.pinot.plugin.filesystem.HadoopPinotFS.init(HadoopPinotFS.java:72)
at com.valassis.plugin.filesystem.HadoopValassisFS.init(HadoopValassisFS.java:48)
at org.apache.pinot.plugin.filesystem.HadoopPinotFS.init(HadoopPinotFS.java:65)
... 18 more
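Since the authentication failure happens on a Spark executor, the keytab has to be readable at that path on every executor host. A hedged sketch of one way to pass Kerberos credentials with standard spark-submit flags (the principal and keytab path come from the log above; the class name and remaining arguments are placeholders):
spark-submit \
  --principal i-cshei@INFRA.ADTECHLABS.COM \
  --keytab /home/i-cshei/.keytab \
  --files /home/i-cshei/.keytab \
  --class org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand \
  ...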
Raluca Lazar
01/09/2023, 6:20 PM
"replicasPerPartition": "2"
)
• this is the REALTIME portion of a hybrid table, and we have a realtime-to-offline job set up to run every day
I've read through this and followed all the steps, and I still end up in a situation where Pinot thinks my segments are distributed across 6 server instances. The curl command to rebalance returns this message: "description": "Instance reassigned, table is already balanced", and the segmentAssignment section still shows segments distributed on 6 servers. Am I missing anything?
Thomas Steinholz
01/09/2023, 6:52 PM
There are <many thousands of> invalid segment/s. This usually means that they were created with an older schema. Please reload the table in order to refresh these segments to the new schema.
The smaller tables I have seem to have generated valid values for all the segments, but these bigger tables can't seem to reload any more segments. I assume this is related to the two filled-up servers not being able to reload the segments but also not moving the segments to other servers (which are mostly under 50% utilization).
Vincent Vu
01/09/2023, 10:46 PM
Pyry Kovanen
01/10/2023, 10:34 AM
When pinot-server starts, it prints the line: Starting server admin application on: http://0.0.0.0:8097, https://0.0.0.0:7443. Why is that, given that the settings explicitly disable http?
◦ pinot.server.adminapi.access.protocols=https
◦ pinot.server.adminapi.access.protocols.https.port=7443
◦ pinot.server.netty.enabled=false
◦ pinot.server.nettytls.enabled=true
◦ pinot.server.nettytls.port=8098
◦ This happens with other components as well.
• The Helm chart values.yaml does not support setting TLS/SSL-related ports on Kubernetes services; it's hardwired to the default non-secure ports. For example, for pinot-controller the port is 9000 instead of the 9443 used in the settings. To change these I must either delete the unnecessary services or use kubectl patch to change the ports right after helm install, as a quick workaround (see the sketch after this message). Is there something I missed here?
• There is no built-in way to secure Zookeeper traffic with the chart, it seems. Is this due to the recommendation to use Zookeeper Operator instead?
In general:
• Is the Helm chart suitable for production usage? Possibly, if I replace the Zookeeper with a Zookeeper Operator-managed installation?
The Pinot version I'm using is apachepinot/pinot:0.11.0-SNAPSHOT-a6f5e89-20221207
Thanks already in advance!
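For the port workaround mentioned above, a hedged sketch (service name, namespace, and port index are placeholders and depend on the chart's rendered services):
kubectl patch svc pinot-controller -n pinot --type='json' \
  -p='[{"op": "replace", "path": "/spec/ports/0/port", "value": 9443}]'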
Elon
01/11/2023, 1:07 AM
controller.disable.ingestion.groovy
which defaults to false. Some users here want to use Groovy transforms, and we wanted to know if there are any risks or recommendations (e.g. do not use Groovy; transform via Flink or a built-in function where applicable). Thanks!
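For context, a Groovy ingestion transform is declared in the table config like this (a minimal sketch following the ingestion-transformation docs; the column names are hypothetical):
"ingestionConfig": {
  "transformConfigs": [
    {
      "columnName": "fullName",
      "transformFunction": "Groovy({firstName + ' ' + lastName}, firstName, lastName)"
    }
  ]
}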
Rohit Anilkumar
01/11/2023, 9:29 AM
-javaagent:/home/ec2-user/pinot/jmx/jmx_prometheus_javaagent-0.17.2.jar=8008:/home/ec2-user/pinot/jmx/pinot.yml
but when I check public_IP:8008/metrics, it gives me "refused to connect". I tried curl localhost:8008/metrics from the EC2 node and it gives me curl: (7) Failed to connect to localhost port 8008 after 0 ms: Connection refused
-> Does this mean nothing is being pushed to that port? Am I missing something here?
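Connection refused on localhost usually means the agent was never loaded, e.g. the -javaagent flag was not added to the JAVA_OPTS of the component that was actually restarted. Two quick checks, as a hedged sketch on a standard Linux host:
ps aux | grep javaagent   # is the flag on the running JVM?
ss -ltnp | grep 8008      # is anything listening on the port?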
Sachin Mittal Consultant
01/12/2023, 5:17 AM
The records fetched by KinesisConsumer are aggregated records which were published by some other KPL.
So the consumer needs to do de-aggregation in order to process them further.
Refer: https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-consumer-deaggregation.html
Now, I don't see this happening in Pinot's KinesisConsumer, and it looks like we are using AWS SDK v2 for the Kinesis client, so I am not sure how we can de-aggregate them.
Any thoughts?
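For reference, the KCL 2.x library (which sits on top of AWS SDK v2) ships a de-aggregation helper; a hedged sketch of de-aggregating raw SDK records outside of Pinot, assuming the amazon-kinesis-client 2.x dependency:
import java.util.List;
import java.util.stream.Collectors;
import software.amazon.awssdk.services.kinesis.model.Record;
import software.amazon.kinesis.retrieval.AggregatorUtil;
import software.amazon.kinesis.retrieval.KinesisClientRecord;

class KplDeaggregation {
  // Input: the raw Record list from a GetRecords response (placeholder).
  static List<KinesisClientRecord> deaggregate(List<Record> records) {
    List<KinesisClientRecord> converted = records.stream()
        .map(KinesisClientRecord::fromRecord)   // wrap SDK v2 records
        .collect(Collectors.toList());
    // AggregatorUtil expands KPL-aggregated records into individual user records.
    return new AggregatorUtil().deaggregate(converted);
  }
}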
Ehsan Irshad
01/12/2023, 11:21 AM
Pratik Bhadane
01/13/2023, 11:54 AM
Phil Sarkis
01/13/2023, 7:38 PM
Sidharth Sawhney
01/13/2023, 11:14 PM
pinot/bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile ingestionJobEVSpec.yml
My CSV file, when uploaded, shows up as 0s and nulls. The values from the CSV are missing, yet there is no error when I execute the above command. Does anyone know what could be the issue?
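All-zero and null values with no error often mean the reader did not line the CSV header or delimiter up with the schema columns. A hedged sketch of the CSV reader section of the job spec (class names follow the pinot-csv input-format plugin; the delimiter value is a placeholder):
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
  configs:
    delimiter: ','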
Shubham Kumar
01/16/2023, 7:08 AM
},
"device_id" : "****WAD*A*D*AS",
"os" : "Android",
"session" : null,
"advertising_id" : "null",
"source" : "SyncTimer",
"manufacturer" : "OnePlus",
"event_ts_mins" : null,
"app_name" : null,
"event_ts" : [ 1673719133085 ],
"event_date" : null,
"event_ts_hr" : null,
"event_name" : null,
"event_ts_days" : null,
"customer_id" : null,
"device" : {
"device_id" : "*****",
"os" : "Android",
"os_version" : "33",
"model" : "CPH2411",
"manufacturer" : "OnePlus"
},
"user" : {
"customer_id" : "DFH*#U*$*#Q93(***8"
},
"events" : [ {
"event_name" : "pulse_sdk_init",
"timestamp" : 1673719133085
} ],
"timestamp" : null
}
}
mahmoud elhalwany
01/16/2023, 10:56 AM
Rohit Anilkumar
01/16/2023, 11:23 AM
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.queries\"><>(\\w+)"
name: "pinot_broker_queries_$2"
But when I check Prometheus, I don't see anything that starts with pinot_broker_queries. Is it not scraping the metrics correctly? I am getting some of the other metrics, though.
I get the metrics with _exceptions, _nettyConnection, _healthChecks etc. I'm using the config provided in the documentation: https://docs.pinot.apache.org/operators/operating-pinot/monitoring
broker
rules:
# Pinot Broker
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+).authorization\"><>(\\w+)"
name: "pinot_broker_authorization_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.documentsScanned\"><>(\\w+)"
name: "pinot_broker_documentsScanned_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.entriesScannedInFilter\"><>(\\w+)"
name: "pinot_broker_entriesScannedInFilter_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.entriesScannedPostFilter\"><>(\\w+)"
name: "pinot_broker_entriesScannedPostFilter_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.freshnessLagMs\"><>(\\w+)"
name: "pinot_broker_freshnessLagMs_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.queries\"><>(\\w+)"
name: "pinot_broker_queries_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.queryExecution\"><>(\\w+)"
name: "pinot_broker_queryExecution_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.queryRouting\"><>(\\w+)"
name: "pinot_broker_queryRouting_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.reduce\"><>(\\w+)"
name: "pinot_broker_reduce_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.requestCompilation\"><>(\\w+)"
name: "pinot_broker_requestCompilation_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.scatterGather\"><>(\\w+)"
name: "pinot_broker_scatterGather_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.totalServerResponseSize\"><>(\\w+)"
name: "pinot_broker_totalServerResponseSize_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)_(\\w+).groupBySize\"><>(\\w+)"
name: "pinot_broker_groupBySize_$3"
labels:
table: "$1"
tableType: "$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)_(\\w+).noServingHostForSegment\"><>(\\w+)"
name: "pinot_broker_noServingHostForSegment_$3"
labels:
table: "$1"
tableType: "$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.healthcheck(\\w+)\"><>(\\w+)"
name: "pinot_broker_healthcheck_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.helix.(\\w+)\"><>(\\w+)"
name: "pinot_broker_helix_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.helixZookeeper(\\w+)\"><>(\\w+)"
name: "pinot_broker_helix_zookeeper_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.nettyConnection(\\w+)\"><>(\\w+)"
name: "pinot_broker_nettyConnection_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.clusterChangeCheck\"\"><>(\\w+)"
name: "pinot_broker_clusterChangeCheck_$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.proactiveClusterChangeCheck\"><>(\\w+)"
name: "pinot_broker_proactiveClusterChangeCheck_$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)Exceptions\"><>(\\w+)"
name: "pinot_broker_exceptions_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.routingTableUpdateTime\"><>(\\w+)"
name: "pinot_broker_routingTableUpdateTime_$1"
These are the metric names I see in Prometheus. Some of the metrics aren't getting scraped, I guess?
pinot_broker_exceptions_requestCompilation_Count
pinot_broker_exceptions_requestCompilation_FifteenMinuteRate
pinot_broker_exceptions_requestCompilation_FiveMinuteRate
pinot_broker_exceptions_requestCompilation_MeanRate
pinot_broker_exceptions_requestCompilation_OneMinuteRate
pinot_broker_exceptions_resourceMissing_Count
pinot_broker_exceptions_resourceMissing_FifteenMinuteRate
pinot_broker_exceptions_resourceMissing_FiveMinuteRate
pinot_broker_exceptions_resourceMissing_MeanRate
pinot_broker_exceptions_resourceMissing_OneMinuteRate
pinot_broker_exceptions_uncaughtGet_Count
pinot_broker_exceptions_uncaughtGet_FifteenMinuteRate
pinot_broker_exceptions_uncaughtGet_FiveMinuteRate
pinot_broker_exceptions_uncaughtGet_MeanRate
pinot_broker_exceptions_uncaughtGet_OneMinuteRate
pinot_broker_exceptions_uncaughtPost_Count
pinot_broker_exceptions_uncaughtPost_FifteenMinuteRate
pinot_broker_exceptions_uncaughtPost_FiveMinuteRate
pinot_broker_exceptions_uncaughtPost_MeanRate
pinot_broker_exceptions_uncaughtPost_OneMinuteRate
pinot_broker_healthcheck_BadCalls_Count
pinot_broker_healthcheck_BadCalls_FifteenMinuteRate
pinot_broker_healthcheck_BadCalls_FiveMinuteRate
pinot_broker_healthcheck_BadCalls_MeanRate
pinot_broker_healthcheck_BadCalls_OneMinuteRate
pinot_broker_healthcheck_OkCalls_Count
pinot_broker_healthcheck_OkCalls_FifteenMinuteRate
pinot_broker_healthcheck_OkCalls_FiveMinuteRate
pinot_broker_healthcheck_OkCalls_MeanRate
pinot_broker_healthcheck_OkCalls_OneMinuteRate
pinot_broker_helix_connected_Value
pinot_broker_helix_ookeeperReconnects_Count
pinot_broker_helix_ookeeperReconnects_FifteenMinuteRate
pinot_broker_helix_ookeeperReconnects_FiveMinuteRate
pinot_broker_helix_ookeeperReconnects_MeanRate
pinot_broker_helix_ookeeperReconnects_OneMinuteRate
pinot_broker_nettyConnection_BytesReceived_Count
pinot_broker_nettyConnection_BytesReceived_FifteenMinuteRate
pinot_broker_nettyConnection_BytesReceived_FiveMinuteRate
pinot_broker_nettyConnection_BytesReceived_MeanRate
pinot_broker_nettyConnection_BytesReceived_OneMinuteRate
pinot_broker_nettyConnection_BytesSent_Count
pinot_broker_nettyConnection_BytesSent_FifteenMinuteRate
pinot_broker_nettyConnection_BytesSent_FiveMinuteRate
pinot_broker_nettyConnection_BytesSent_MeanRate
pinot_broker_nettyConnection_BytesSent_OneMinuteRate
pinot_broker_nettyConnection_ConnectTimeMs_Value
pinot_broker_nettyConnection_RequestsSent_Count
pinot_broker_nettyConnection_RequestsSent_FifteenMinuteRate
pinot_broker_nettyConnection_RequestsSent_FiveMinuteRate
pinot_broker_nettyConnection_RequestsSent_MeanRate
pinot_broker_nettyConnection_RequestsSent_OneMinuteRate
pinot_broker_proactiveClusterChangeCheck_Count
pinot_broker_proactiveClusterChangeCheck_FifteenMinuteRate
pinot_broker_proactiveClusterChangeCheck_FiveMinuteRate
pinot_broker_proactiveClusterChangeCheck_MeanRate
pinot_broker_proactiveClusterChangeCheck_OneMinuteRate
I don't see the missing metrics on the JMX port either. I'm seeing the same issue on the controller as well.
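One way to narrow this down, as a hedged sketch (port taken from the agent flag above): dump the exporter output directly and grep for the expected bean, keeping in mind that per-table DropWizard meters such as pinot.broker.<table>.queries are typically registered lazily, so they won't appear until at least one query has hit that table. Note also that the clusterChangeCheck rule above contains a doubled quote (clusterChangeCheck\"\"), which would stop that pattern from ever matching.
curl -s localhost:8008/metrics | grep -i queries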
Pratik Bhadane
01/16/2023, 1:21 PM
Prashanth Rao
01/17/2023, 7:45 AM
Ehsan Irshad
01/17/2023, 10:15 AM
[TABLENAME_REALTIME-RealtimeTableDataManager] [HelixTaskExecutor-message_handle_thread_10] Download and move segment TABLENAME__1__169__20230104T0711Z from peer with scheme http failed.
Do the server and controller need to point to the same URI? Our config:
Controller: controller.data.dir=s3://prd-pinot-archive/prd-mimic-pinot/controller-data
Server: pinot.server.instance.segment.store.uri=s3://prd-pinot-archive/prd-mimic-pinot/server-data
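For what it's worth, the deep-store/peer-download docs describe pinot.server.instance.segment.store.uri as pointing at the same deep-store location the controller writes segments to, so the controller-data vs. server-data mismatch above looks suspect (hedged: inferred from the documented setup, not verified against this cluster). A sketch:
controller.data.dir=s3://prd-pinot-archive/prd-mimic-pinot/controller-data
pinot.server.instance.segment.store.uri=s3://prd-pinot-archive/prd-mimic-pinot/controller-data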
eywek
01/17/2023, 10:22 AM
"streamConfigs": {
"streamType": "pulsar",
"topic.consumption.rate.limit": "1500",
"stream.pulsar.fetch.timeout.millis": "30000",
"stream.pulsar.consumer.type": "lowlevel",
"stream.pulsar.topic.name": "<persistent://public/default/worker_datasource_60e5a1ab40480001009289b7_6258e7b21993b1000737be40_28>",
"stream.pulsar.decoder.class.name": "org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder",
"stream.pulsar.consumer.factory.class.name": "org.apache.pinot.plugin.stream.pulsar.PulsarConsumerFactory",
"stream.pulsar.bootstrap.servers": "<pulsar://pulsar.production.internal.reelevant.io>.:6650",
"stream.pulsar.consumer.prop.auto.offset.reset": "smallest",
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.segment.size": "200M"
}
I'm getting an UNHEALTHY ingestion status for the table:
{
"ingestionStatus": {
"ingestionState": "UNHEALTHY",
"errorMessage": "Did not get any response from servers for segment: worker_datasource_60e5a1ab40480001009289b7_6258e7b21993b1000737be40_28__0__1__20230117T1012Z"
}
}
And if I use the /consumingSegmentsInfo endpoint, I get the following result:
{
"_segmentToConsumingInfoMap": {
"worker_datasource_60e5a1ab40480001009289b7_6258e7b21993b1000737be40_28__0__1__20230117T1012Z": []
}
}
From what I see, the plugin is consuming data from Pulsar and I'm able to query it, but I was wondering why I get an UNHEALTHY ingestion status, since I was using it for monitoring purposes.
Thank you
abhinav wagle
01/17/2023, 11:17 PM
test_OFFLINE.
"listFields": {
"TAG_LIST": [
"test_REALTIME"
]
}