Neha Pawar
LATEST and realtime.segment.flush.threshold.time: 6h
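For context, both of those settings live under the table's streamConfigs; a minimal sketch showing only the relevant keys (the full config appears further down in the thread):
"streamConfigs": {
  "shardIteratorType": "LATEST",
  "realtime.segment.flush.threshold.time": "6h"
}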
Abhijeet Kushe
10/08/2021, 8:46 PM
No job to purge for the queue TaskQueue_RealtimeToOfflineSegmentsTask
I just restarted the entire cluster and I still see the above message. I started the table again with the shard iterator at AT_SEQUENCE_NUMBER. I see the iterator is stuck at 66K seconds ago (since we have 1 day retention). I have noticed in the past that this iterator does not shift for a long time. Will update later if I don't see it change.
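One way to watch that iterator age is the stream-level GetRecords.IteratorAgeMilliseconds metric in CloudWatch; a sketch with the AWS CLI, assuming the stream name from the table config below and an illustrative time window:
# max iterator age (ms) in 5-minute buckets for the stream
aws cloudwatch get-metric-statistics \
  --namespace AWS/Kinesis \
  --metric-name GetRecords.IteratorAgeMilliseconds \
  --dimensions Name=StreamName,Value=backend-processed-events \
  --statistics Maximum \
  --period 300 \
  --start-time 2021-10-08T00:00:00Z \
  --end-time 2021-10-08T21:00:00Z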
Rakesh Bobbala
02/10/2023, 7:54 PM
{
  "tableName": "backend_pre_processed",
  "tableType": "REALTIME",
  "segmentsConfig": {
    "schemaName": "backend_pre_processed",
    "retentionTimeUnit": "DAYS",
    "retentionTimeValue": "90",
    "replication": "1",
    "replicasPerPartition": "1",
    "timeColumnName": "spark_write_timestamp",
    "minimizeDataMovement": false
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant",
    "tagOverrideConfig": {}
  },
  "tableIndexConfig": {
    "invertedIndexColumns": [],
    "noDictionaryColumns": [],
    "autoGeneratedInvertedIndex": false,
    "createInvertedIndexDuringSegmentGeneration": false,
    "sortedColumn": [],
    "bloomFilterColumns": [],
    "loadMode": "MMAP",
    "streamConfigs": {
      "streamType": "kinesis",
      "stream.kinesis.topic.name": "backend-processed-events",
      "stream.kinesis.consumer.type": "lowlevel",
      "stream.kinesis.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
      "stream.kinesis.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kinesis.KinesisConsumerFactory",
      "realtime.segment.flush.threshold.time": "7d",
      "realtime.segment.flush.threshold.rows": "1000000",
      "region": "us-east-1",
      "maxRecordsToFetch": "3",
      "shardIteratorType": "LATEST",
      "stream.kinesis.fetch.timeout.millis": "30000"
    },
    "onHeapDictionaryColumns": [],
    "varLengthDictionaryColumns": [],
    "enableDefaultStarTree": false,
    "enableDynamicStarTreeCreation": false,
    "aggregateMetrics": false,
    "nullHandlingEnabled": false,
    "optimizeDictionaryForMetrics": false,
    "noDictionarySizeRatioThreshold": 0,
    "rangeIndexColumns": [],
    "rangeIndexVersion": 2
  },
  "metadata": {},
  "quota": {},
  "routing": {},
  "query": {},
  "ingestionConfig": {
    "segmentTimeValueCheck": true,
    "transformConfigs": [
      {
        "columnName": "spark_write_timestamp_epoch",
        "transformFunction": "FromDateTime(spark_write_time, 'yyyy-MM-dd''T''HH:mm:ss.SSS''Z')"
      },
      {
        "columnName": "spark_write_timestamp",
        "transformFunction": "TRIM(spark_write_time)"
      },
      {
        "columnName": "row_created_time",
        "transformFunction": "now()"
      },
      {
        "columnName": "m_event",
        "transformFunction": "TRIM(event)"
      }
    ],
    "continueOnError": false,
    "rowTimeValueCheck": false
  },
  "isDimTable": false
}
Also, I want to know if this is the right way to test the ingestion time.
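Since row_created_time is set to now() at ingestion time and spark_write_timestamp_epoch is the event time, and both should be epoch milliseconds, one sketch of a lag check against the table above is:
-- approximate ingestion lag in milliseconds
select avg(row_created_time - spark_write_timestamp_epoch) as avg_lag_ms,
       max(row_created_time - spark_write_timestamp_epoch) as max_lag_ms
from backend_pre_processed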
Rajat Gupta
03/01/2024, 7:33 PM
select * from <TableName>
I am getting the query response below. Can someone help answer the questions that follow?
{
  "requestId": "1862848077000000379",
  "brokerId": "Broker_pinot-broker-2.pinot-broker-headless.pinot.svc.cluster.local_8099",
  "exceptions": [],
  "numServersQueried": 3,
  "numServersResponded": 3,
  "numSegmentsQueried": 28,
  "numSegmentsProcessed": 3,
  "numSegmentsMatched": 3,
  "numConsumingSegmentsQueried": 8,
  "numConsumingSegmentsProcessed": 0,
  "numConsumingSegmentsMatched": 0,
  "numDocsScanned": 30,
  "numEntriesScannedInFilter": 0,
  "numEntriesScannedPostFilter": 270,
  "numGroupsLimitReached": false,
  "maxRowsInJoinReached": false,
  "totalDocs": 15367504,
  "timeUsedMs": 4,
  "offlineThreadCpuTimeNs": 0,
  "realtimeThreadCpuTimeNs": 0,
  "offlineSystemActivitiesCpuTimeNs": 0,
  "realtimeSystemActivitiesCpuTimeNs": 0,
  "offlineResponseSerializationCpuTimeNs": 0,
  "realtimeResponseSerializationCpuTimeNs": 0,
  "offlineTotalCpuTimeNs": 0,
  "realtimeTotalCpuTimeNs": 0,
  "brokerReduceTimeMs": 0,
  "segmentStatistics": [],
  "traceInfo": {},
  "partialResult": false,
  "numSegmentsPrunedByBroker": 0,
  "numRowsResultSet": 10,
  "minConsumingFreshnessTimeMs": 1709241001083,
  "numSegmentsPrunedByServer": 25,
  "numSegmentsPrunedInvalid": 0,
  "numSegmentsPrunedByLimit": 25,
  "numSegmentsPrunedByValue": 0,
  "explainPlanNumEmptyFilterSegments": 0,
  "explainPlanNumMatchAllFilterSegments": 0
}
1. Why is the number of docs scanned only 30, given that I haven't added any filter to the query? Is it because the records are flushed to disk and hence not queried?
2. I am seeing that some records are stuck in Kinesis and are not being read. They are only read when I restart the pinot-server, and after some time it stops reading messages again. How can I debug this? I am seeing the iterator age increase in Kinesis.
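For question 2, one place to start is the controller's consuming-segments endpoint, which reports the consumer status and latest ingested offsets per server for each consuming segment; a sketch, assuming the default controller port, with host and table name as placeholders:
# per-server consumer state for all consuming segments of the table
curl "http://<controller-host>:9000/tables/<tableName>/consumingSegmentsInfo"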