Pratik Bhadane
01/04/2023, 6:53 AM
Mostafa Ghadimi
01/04/2023, 12:45 PM
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.time": "1h",
"realtime.segment.flush.segment.size": "500M"
Issue 1: The segments' sizes after changing state from CONSUMING to COMPLETED are about 200M, not 500M (the segment creation duration is definitely less than 1 hour)
Issue 2: The segments are stored at /var/pinot/server/data/index and not at /var/pinot/server/data/segment. Here is the volume mapping in the docker-compose file:
- ./data/server_data/segment:/var/pinot/server/data/segment
- ./data/server_data/index:/var/pinot/server/data/index
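For reference, the documented size-based flush key is realtime.segment.flush.threshold.segment.size, and Pinot's default desired segment size is 200M; if the key realtime.segment.flush.segment.size is not recognized, the default would kick in, which would match the ~200M segments (hedged: inferred from the documented config keys, not verified against this deployment). Likewise, the server property pinot.server.instance.dataDir (untarred segment indexes) conventionally points at the .../index directory, while .../segment holds downloaded segment tars, so data landing under index may be expected. A sketch of the size-based config:
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.time": "1h",
"realtime.segment.flush.threshold.segment.size": "500M"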
Huaqiang He
01/04/2023, 12:46 PM
lastwithtime
select str11 as job_id,
lastwithtime(str14,event_timestamp,'STRING') as query
from telemetry_events
where epoch_minute between toEpochMinutes(now()-60000*24*7) and toEpochMinutes(now())
group by str11
limit 10000
execute query error: QueryExecutionError: java.lang.RuntimeException: Caught exception while building data table.
at org.apache.pinot.core.operator.blocks.InstanceResponseBlock.<init>(InstanceResponseBlock.java:46)
at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:118)
at org.apache.pinot.core.operator.InstanceResponseOperator.getNextBlock(InstanceResponseOperator.java:39)
at org.apache.pinot.core.operator.BaseOperator.nextBlock(BaseOperator.java:39)
...
Caused by: java.nio.BufferOverflowException
at java.base/java.nio.HeapByteBuffer.put(HeapByteBuffer.java:221)
at java.base/java.nio.ByteBuffer.put(ByteBuffer.java:914)
at org.apache.pinot.segment.local.customobject.StringLongPair.toBytes(StringLongPair.java:46)
at org.apache.pinot.core.common.ObjectSerDeUtils$11.serialize(ObjectSerDeUtils.java:438)
where str14 (query) is a SQL-like string.
It looks like a character encoding issue. I can work around it with:
decodeUrl(lastwithtime(encodeUrl(str14),event_timestamp,'STRING')) as query
Sevvy Yusuf
01/04/2023, 1:59 PM
chandarasekaran m
01/05/2023, 3:29 PM
"pinot.multistage.engine.enabled": "true",
"pinot.server.instance.currentDataTableVersion": "4",
"pinot.query.server.port": "8421",
"pinot.query.runner.port": "8442"
chandarasekaran m
01/05/2023, 3:30 PM
Thomas Steinholz
01/05/2023, 10:47 PM
Shreeram Goyal
01/06/2023, 10:44 AM
Mostafa Ghadimi
01/07/2023, 1:11 PM
org.apache.pinot.spi.stream.TransientConsumerException: org.apache.pinot.shaded.org.apache.kafka.common.errors.TimeoutException: Failed to get offsets by times in 5001ms
Could someone help us fix this issue?
What has been done:
• The connection with Kafka nodes has been checked.
• The legacy version of Pinot had the same table definition for Kafka data ingestion!
P.S.: We are using Ansible to deploy Pinot; the playbook has been open-sourced at this link.
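"Failed to get offsets by times" generally means the consumer reached a broker but an offsetsForTimes lookup timed out, which is often a reachability problem with the brokers' advertised listeners rather than a table-config problem. A hedged sketch of checking the offset lookup from a Pinot host with Kafka's bundled tools (host and topic are placeholders; newer Kafka versions ship kafka-get-offsets.sh instead):
kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list kafka-node:9092 --topic my_topic --time -1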
Mostafa Ghadimi
01/07/2023, 1:14 PM
Caleb Shei
01/08/2023, 5:21 PM
23/01/08 16:52:57 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) (ph-jp98v52.infra.adtechlabs.com executor 2): java.lang.RuntimeException: java.lang.RuntimeException: Failed to authenticate user principal [i-cshei@INFRA.ADTECHLABS.COM] with keytab [/home/i-cshei/.keytab]
at org.apache.pinot.spi.filesystem.PinotFSFactory.register(PinotFSFactory.java:77)
at org.apache.pinot.plugin.ingestion.batch.spark3.SparkSegmentGenerationJobRunner$1.call(SparkSegmentGenerationJobRunner.java:349)
at org.apache.pinot.plugin.ingestion.batch.spark3.SparkSegmentGenerationJobRunner$1.call(SparkSegmentGenerationJobRunner.java:342)
at org.apache.spark.api.java.JavaRDDLike.$anonfun$foreach$1(JavaRDDLike.scala:352)
at org.apache.spark.api.java.JavaRDDLike.$anonfun$foreach$1$adapted(JavaRDDLike.scala:352)
at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:575)
at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:573)
at org.apache.spark.InterruptibleIterator.foreach(InterruptibleIterator.scala:28)
at org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1003)
at org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1003)
at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2268)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.RuntimeException: Failed to authenticate user principal [i-cshei@INFRA.ADTECHLABS.COM] with keytab [/home/i-cshei/.keytab]
at org.apache.pinot.plugin.filesystem.HadoopPinotFS.authenticate(HadoopPinotFS.java:288)
at org.apache.pinot.plugin.filesystem.HadoopPinotFS.init(HadoopPinotFS.java:72)
at com.valassis.plugin.filesystem.HadoopValassisFS.init(HadoopValassisFS.java:48)
at org.apache.pinot.plugin.filesystem.HadoopPinotFS.init(HadoopPinotFS.java:65)
... 18 more
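Since the authentication failure happens on a Spark executor, the keytab has to be readable at that path on every executor host. A hedged sketch of one way to pass Kerberos credentials with standard spark-submit flags (the principal and keytab path come from the log above; the class name and remaining arguments are placeholders):
spark-submit \
  --principal i-cshei@INFRA.ADTECHLABS.COM \
  --keytab /home/i-cshei/.keytab \
  --files /home/i-cshei/.keytab \
  --class org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand \
  ...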
Raluca Lazar
01/09/2023, 6:20 PM
"replicasPerPartition": "2"
)
• this is the REALTIME portion of a hybrid table, and we have a realtime-to-offline job set up to run every day
I've read through this and followed all the steps, and I still end up in a situation where Pinot thinks my segments are distributed across 6 server instances. The curl command to rebalance returns this message: "description": "Instance reassigned, table is already balanced", and the segmentAssignment section still shows segments distributed on 6 servers. Am I missing anything?
Thomas Steinholz
01/09/2023, 6:52 PM
There are <many thousands of> invalid segment/s. This usually means that they were created with an older schema. Please reload the table in order to refresh these segments to the new schema.
The smaller tables I have seem to have generated valid values for all the segments, but these bigger tables can't seem to reload any more segments. I assume this is related to the two filled-up servers not being able to reload the segments but also not moving the segments to other servers (which are mostly under 50% utilization).
Vincent Vu
01/09/2023, 10:46 PM
Pyry Kovanen
01/10/2023, 10:34 AM
When pinot-server starts, it prints the line: Starting server admin application on: http://0.0.0.0:8097, https://0.0.0.0:7443. Why is that, given that the settings explicitly disable http?
◦ pinot.server.adminapi.access.protocols=https
◦ pinot.server.adminapi.access.protocols.https.port=7443
◦ pinot.server.netty.enabled=false
◦ pinot.server.nettytls.enabled=true
◦ pinot.server.nettytls.port=8098
◦ This happens with other components as well.
• The Helm chart values.yaml does not support setting TLS/SSL-related ports on Kubernetes services; it's hardwired to the default non-secure ports. For example, for pinot-controller the port is 9000 instead of the 9443 used in the settings. To change these I must either delete the unnecessary services or use kubectl patch to change the ports right after helm install, as a quick workaround (see the sketch after this message). Is there something I missed here?
• There is no built-in way to secure Zookeeper traffic with the chart, it seems. Is this due to the recommendation to use Zookeeper Operator instead?
In general:
• Is the Helm chart suitable for production usage? Possibly, if I replace the Zookeeper with a Zookeeper Operator-managed installation?
The Pinot version I'm using is apachepinot/pinot:0.11.0-SNAPSHOT-a6f5e89-20221207
Thanks already in advance!
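For the port workaround mentioned above, a hedged sketch (service name, namespace, and port index are placeholders and depend on the chart's rendered services):
kubectl patch svc pinot-controller -n pinot --type='json' \
  -p='[{"op": "replace", "path": "/spec/ports/0/port", "value": 9443}]'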
Elon
01/11/2023, 1:07 AM
controller.disable.ingestion.groovy
which defaults to false. Some users here want to use Groovy transforms, and we wanted to know if there are any risks or recommendations (e.g. do not use Groovy; transform via Flink or a built-in function where applicable). Thanks!
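For context, a Groovy ingestion transform is declared in the table config like this (a minimal sketch following the ingestion-transformation docs; the column names are hypothetical):
"ingestionConfig": {
  "transformConfigs": [
    {
      "columnName": "fullName",
      "transformFunction": "Groovy({firstName + ' ' + lastName}, firstName, lastName)"
    }
  ]
}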
Rohit Anilkumar
01/11/2023, 9:29 AM
-javaagent:/home/ec2-user/pinot/jmx/jmx_prometheus_javaagent-0.17.2.jar=8008:/home/ec2-user/pinot/jmx/pinot.yml
but when I check public_IP:8008/metrics, it gives me "refused to connect". I tried curl localhost:8008/metrics from the EC2 node and it gives me curl: (7) Failed to connect to localhost port 8008 after 0 ms: Connection refused
-> Does this mean nothing is being pushed to that port? Am I missing something here?
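Connection refused on localhost usually means the agent was never loaded, e.g. the -javaagent flag was not added to the JAVA_OPTS of the component that was actually restarted. Two quick checks, as a hedged sketch on a standard Linux host:
ps aux | grep javaagent   # is the flag on the running JVM?
ss -ltnp | grep 8008      # is anything listening on the port?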
Sachin Mittal Consultant
01/12/2023, 5:17 AM
The records fetched by KinesisConsumer are aggregated records which were published by some other KPL.
So the consumer needs to do de-aggregation in order to process them further.
Refer: https://docs.aws.amazon.com/streams/latest/dev/kinesis-kpl-consumer-deaggregation.html
Now, I don't see this happening in Pinot's KinesisConsumer, and it looks like we are using AWS SDK v2 for the Kinesis client, so I am not sure how we can de-aggregate them.
Any thoughts?
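For reference, the KCL 2.x library (which sits on top of AWS SDK v2) ships a de-aggregation helper; a hedged sketch of de-aggregating raw SDK records outside of Pinot, assuming the amazon-kinesis-client 2.x dependency:
import java.util.List;
import java.util.stream.Collectors;
import software.amazon.awssdk.services.kinesis.model.Record;
import software.amazon.kinesis.retrieval.AggregatorUtil;
import software.amazon.kinesis.retrieval.KinesisClientRecord;

class KplDeaggregation {
  // Input: the raw Record list from a GetRecords response (placeholder).
  static List<KinesisClientRecord> deaggregate(List<Record> records) {
    List<KinesisClientRecord> converted = records.stream()
        .map(KinesisClientRecord::fromRecord)   // wrap SDK v2 records
        .collect(Collectors.toList());
    // AggregatorUtil expands KPL-aggregated records into individual user records.
    return new AggregatorUtil().deaggregate(converted);
  }
}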
Ehsan Irshad
01/12/2023, 11:21 AM
Pratik Bhadane
01/13/2023, 11:54 AM
Phil Sarkis
01/13/2023, 7:38 PM
Sidharth Sawhney
01/13/2023, 11:14 PM
pinot/bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile ingestionJobEVSpec.yml
My CSV file, when uploaded, shows up as 0s and nulls. The values from the CSV are missing, yet there is no error when I execute the above command. Does anyone know what could be the issue?
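All-zero and null values with no error often mean the reader did not line the CSV header or delimiter up with the schema columns. A hedged sketch of the CSV reader section of the job spec (class names follow the pinot-csv input-format plugin; the delimiter value is a placeholder):
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
  configs:
    delimiter: ','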
Shubham Kumar
01/16/2023, 7:08 AM
},
"device_id" : "****WAD*A*D*AS",
"os" : "Android",
"session" : null,
"advertising_id" : "null",
"source" : "SyncTimer",
"manufacturer" : "OnePlus",
"event_ts_mins" : null,
"app_name" : null,
"event_ts" : [ 1673719133085 ],
"event_date" : null,
"event_ts_hr" : null,
"event_name" : null,
"event_ts_days" : null,
"customer_id" : null,
"device" : {
"device_id" : "*****",
"os" : "Android",
"os_version" : "33",
"model" : "CPH2411",
"manufacturer" : "OnePlus"
},
"user" : {
"customer_id" : "DFH*#U*$*#Q93(***8"
},
"events" : [ {
"event_name" : "pulse_sdk_init",
"timestamp" : 1673719133085
} ],
"timestamp" : null
}
}
mahmoud elhalwany
01/16/2023, 10:56 AM
Rohit Anilkumar
01/16/2023, 11:23 AM
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.queries\"><>(\\w+)"
name: "pinot_broker_queries_$2"
But when I check Prometheus, I don't see anything that starts with pinot_broker_queries. Is it not scraping the metrics correctly? I am getting some of the other metrics, though.
I get the metrics with _exceptions, _nettyConnection, _healthChecks etc. I'm using the config provided in the documentation: https://docs.pinot.apache.org/operators/operating-pinot/monitoring
broker
rules:
# Pinot Broker
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+).authorization\"><>(\\w+)"
name: "pinot_broker_authorization_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.documentsScanned\"><>(\\w+)"
name: "pinot_broker_documentsScanned_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.entriesScannedInFilter\"><>(\\w+)"
name: "pinot_broker_entriesScannedInFilter_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.entriesScannedPostFilter\"><>(\\w+)"
name: "pinot_broker_entriesScannedPostFilter_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.freshnessLagMs\"><>(\\w+)"
name: "pinot_broker_freshnessLagMs_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.queries\"><>(\\w+)"
name: "pinot_broker_queries_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.queryExecution\"><>(\\w+)"
name: "pinot_broker_queryExecution_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.queryRouting\"><>(\\w+)"
name: "pinot_broker_queryRouting_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.reduce\"><>(\\w+)"
name: "pinot_broker_reduce_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.requestCompilation\"><>(\\w+)"
name: "pinot_broker_requestCompilation_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.scatterGather\"><>(\\w+)"
name: "pinot_broker_scatterGather_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)\\.totalServerResponseSize\"><>(\\w+)"
name: "pinot_broker_totalServerResponseSize_$2"
labels:
table: "$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)_(\\w+).groupBySize\"><>(\\w+)"
name: "pinot_broker_groupBySize_$3"
labels:
table: "$1"
tableType: "$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)_(\\w+).noServingHostForSegment\"><>(\\w+)"
name: "pinot_broker_noServingHostForSegment_$3"
labels:
table: "$1"
tableType: "$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.healthcheck(\\w+)\"><>(\\w+)"
name: "pinot_broker_healthcheck_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.helix.(\\w+)\"><>(\\w+)"
name: "pinot_broker_helix_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.helixZookeeper(\\w+)\"><>(\\w+)"
name: "pinot_broker_helix_zookeeper_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.nettyConnection(\\w+)\"><>(\\w+)"
name: "pinot_broker_nettyConnection_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.clusterChangeCheck\"\"><>(\\w+)"
name: "pinot_broker_clusterChangeCheck_$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.proactiveClusterChangeCheck\"><>(\\w+)"
name: "pinot_broker_proactiveClusterChangeCheck_$1"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.(\\w+)Exceptions\"><>(\\w+)"
name: "pinot_broker_exceptions_$1_$2"
- pattern: "\"org.apache.pinot.common.metrics\"<type=\"BrokerMetrics\", name=\"pinot.broker.routingTableUpdateTime\"><>(\\w+)"
name: "pinot_broker_routingTableUpdateTime_$1"
These are the metric names I see in Prometheus. Some of the metrics aren't getting scraped, I guess?
pinot_broker_exceptions_requestCompilation_Count
pinot_broker_exceptions_requestCompilation_FifteenMinuteRate
pinot_broker_exceptions_requestCompilation_FiveMinuteRate
pinot_broker_exceptions_requestCompilation_MeanRate
pinot_broker_exceptions_requestCompilation_OneMinuteRate
pinot_broker_exceptions_resourceMissing_Count
pinot_broker_exceptions_resourceMissing_FifteenMinuteRate
pinot_broker_exceptions_resourceMissing_FiveMinuteRate
pinot_broker_exceptions_resourceMissing_MeanRate
pinot_broker_exceptions_resourceMissing_OneMinuteRate
pinot_broker_exceptions_uncaughtGet_Count
pinot_broker_exceptions_uncaughtGet_FifteenMinuteRate
pinot_broker_exceptions_uncaughtGet_FiveMinuteRate
pinot_broker_exceptions_uncaughtGet_MeanRate
pinot_broker_exceptions_uncaughtGet_OneMinuteRate
pinot_broker_exceptions_uncaughtPost_Count
pinot_broker_exceptions_uncaughtPost_FifteenMinuteRate
pinot_broker_exceptions_uncaughtPost_FiveMinuteRate
pinot_broker_exceptions_uncaughtPost_MeanRate
pinot_broker_exceptions_uncaughtPost_OneMinuteRate
pinot_broker_healthcheck_BadCalls_Count
pinot_broker_healthcheck_BadCalls_FifteenMinuteRate
pinot_broker_healthcheck_BadCalls_FiveMinuteRate
pinot_broker_healthcheck_BadCalls_MeanRate
pinot_broker_healthcheck_BadCalls_OneMinuteRate
pinot_broker_healthcheck_OkCalls_Count
pinot_broker_healthcheck_OkCalls_FifteenMinuteRate
pinot_broker_healthcheck_OkCalls_FiveMinuteRate
pinot_broker_healthcheck_OkCalls_MeanRate
pinot_broker_healthcheck_OkCalls_OneMinuteRate
pinot_broker_helix_connected_Value
pinot_broker_helix_ookeeperReconnects_Count
pinot_broker_helix_ookeeperReconnects_FifteenMinuteRate
pinot_broker_helix_ookeeperReconnects_FiveMinuteRate
pinot_broker_helix_ookeeperReconnects_MeanRate
pinot_broker_helix_ookeeperReconnects_OneMinuteRate
pinot_broker_nettyConnection_BytesReceived_Count
pinot_broker_nettyConnection_BytesReceived_FifteenMinuteRate
pinot_broker_nettyConnection_BytesReceived_FiveMinuteRate
pinot_broker_nettyConnection_BytesReceived_MeanRate
pinot_broker_nettyConnection_BytesReceived_OneMinuteRate
pinot_broker_nettyConnection_BytesSent_Count
pinot_broker_nettyConnection_BytesSent_FifteenMinuteRate
pinot_broker_nettyConnection_BytesSent_FiveMinuteRate
pinot_broker_nettyConnection_BytesSent_MeanRate
pinot_broker_nettyConnection_BytesSent_OneMinuteRate
pinot_broker_nettyConnection_ConnectTimeMs_Value
pinot_broker_nettyConnection_RequestsSent_Count
pinot_broker_nettyConnection_RequestsSent_FifteenMinuteRate
pinot_broker_nettyConnection_RequestsSent_FiveMinuteRate
pinot_broker_nettyConnection_RequestsSent_MeanRate
pinot_broker_nettyConnection_RequestsSent_OneMinuteRate
pinot_broker_proactiveClusterChangeCheck_Count
pinot_broker_proactiveClusterChangeCheck_FifteenMinuteRate
pinot_broker_proactiveClusterChangeCheck_FiveMinuteRate
pinot_broker_proactiveClusterChangeCheck_MeanRate
pinot_broker_proactiveClusterChangeCheck_OneMinuteRate
I don't see the missing metrics on the JMX port either. I'm seeing the same issue on the controller as well.
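One way to narrow this down, as a hedged sketch (port taken from the agent flag above): dump the exporter output directly and grep for the expected bean, keeping in mind that per-table DropWizard meters such as pinot.broker.<table>.queries are typically registered lazily, so they won't appear until at least one query has hit that table. Note also that the clusterChangeCheck rule above contains a doubled quote (clusterChangeCheck\"\"), which would stop that pattern from ever matching.
curl -s localhost:8008/metrics | grep -i queries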
Pratik Bhadane
01/16/2023, 1:21 PM
Prashanth Rao
01/17/2023, 7:45 AM
Ehsan Irshad
01/17/2023, 10:15 AM
[TABLENAME_REALTIME-RealtimeTableDataManager] [HelixTaskExecutor-message_handle_thread_10] Download and move segment TABLENAME__1__169__20230104T0711Z from peer with scheme http failed.
Do the server and controller need to point to the same URI? Our config:
Controller: controller.data.dir=s3://prd-pinot-archive/prd-mimic-pinot/controller-data
Server: pinot.server.instance.segment.store.uri=s3://prd-pinot-archive/prd-mimic-pinot/server-data
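For what it's worth, the deep-store/peer-download docs describe pinot.server.instance.segment.store.uri as pointing at the same deep-store location the controller writes segments to, so the controller-data vs. server-data mismatch above looks suspect (hedged: inferred from the documented setup, not verified against this cluster). A sketch:
controller.data.dir=s3://prd-pinot-archive/prd-mimic-pinot/controller-data
pinot.server.instance.segment.store.uri=s3://prd-pinot-archive/prd-mimic-pinot/controller-data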
eywek
01/17/2023, 10:22 AM
"streamConfigs": {
"streamType": "pulsar",
"topic.consumption.rate.limit": "1500",
"stream.pulsar.fetch.timeout.millis": "30000",
"stream.pulsar.consumer.type": "lowlevel",
"stream.pulsar.topic.name": "<persistent://public/default/worker_datasource_60e5a1ab40480001009289b7_6258e7b21993b1000737be40_28>",
"stream.pulsar.decoder.class.name": "org.apache.pinot.plugin.inputformat.json.JSONMessageDecoder",
"stream.pulsar.consumer.factory.class.name": "org.apache.pinot.plugin.stream.pulsar.PulsarConsumerFactory",
"stream.pulsar.bootstrap.servers": "<pulsar://pulsar.production.internal.reelevant.io>.:6650",
"stream.pulsar.consumer.prop.auto.offset.reset": "smallest",
"realtime.segment.flush.threshold.rows": "0",
"realtime.segment.flush.threshold.segment.size": "200M"
}
I'm getting an UNHEALTHY ingestion status for the table:
{
"ingestionStatus": {
"ingestionState": "UNHEALTHY",
"errorMessage": "Did not get any response from servers for segment: worker_datasource_60e5a1ab40480001009289b7_6258e7b21993b1000737be40_28__0__1__20230117T1012Z"
}
}
And if I use the /consumingSegmentsInfo endpoint, I get the following result:
{
"_segmentToConsumingInfoMap": {
"worker_datasource_60e5a1ab40480001009289b7_6258e7b21993b1000737be40_28__0__1__20230117T1012Z": []
}
}
From what I see, the plugin is consuming data from Pulsar and I'm able to query it, but I was wondering why I get an UNHEALTHY ingestion status, since I was using it for monitoring purposes.
Thank you
abhinav wagle
01/17/2023, 11:17 PM
test_OFFLINE.
"listFields": {
"TAG_LIST": [
"test_REALTIME"
]
}