Kishore G
Kishore G
Christian Acuna
08/18/2020, 12:27 AM
Kishore G
Christian Acuna
08/18/2020, 12:28 AM
{
  "OFFLINE": {
    "tableName": "DnsForwarderServiceStatus_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "timeType": "MILLISECONDS",
      "schemaName": "olap_enriched_dns_forwarder_service_status",
      "timeColumnName": "cloud_timestamp_ms",
      "replication": "1",
      "replicasPerPartition": "1",
      "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "createInvertedIndexDuringSegmentGeneration": false,
      "noDictionaryColumns": [],
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": true,
      "nullHandlingEnabled": true,
      "loadMode": "MMAP",
      "invertedIndexColumns": [],
      "autoGeneratedInvertedIndex": false
    },
    "metadata": {
      "customConfigs": {}
    }
  },
  "REALTIME": {
    "tableName": "DnsForwarderServiceStatus_REALTIME",
    "tableType": "REALTIME",
    "segmentsConfig": {
      "timeType": "MILLISECONDS",
      "schemaName": "olap_enriched_dns_forwarder_service_status",
      "timeColumnName": "cloud_timestamp_ms",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "30",
      "replication": "1",
      "replicasPerPartition": "1",
      "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
    },
    "tenants": {
      "broker": "DefaultTenant",
      "server": "DefaultTenant"
    },
    "tableIndexConfig": {
      "createInvertedIndexDuringSegmentGeneration": false,
      "noDictionaryColumns": [],
      "enableDefaultStarTree": false,
      "enableDynamicStarTreeCreation": false,
      "aggregateMetrics": true,
      "nullHandlingEnabled": true,
      "streamConfigs": {
        "streamType": "kafka",
        "stream.kafka.consumer.type": "lowlevel",
        "stream.kafka.topic.name": "olap-enriched-dns-forwarder-service-status",
        "stream.kafka.decoder.class.name": "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
        "stream.kafka.consumer.factory.class.name": "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
        "stream.kafka.broker.list": "KAFKA_URL",
        "realtime.segment.flush.threshold.time": "30m",
        "realtime.segment.flush.threshold.size": "100000",
        "stream.kafka.consumer.prop.auto.offset.reset": "largest",
        "stream.kafka.zk.broker.url": "ZK_URL"
      },
      "loadMode": "MMAP",
      "invertedIndexColumns": [],
      "autoGeneratedInvertedIndex": false
    },
    "metadata": {
      "customConfigs": {}
    }
  }
}
Kishore G
Christian Acuna
08/18/2020, 12:30 AM
Neha Pawar
Christian Acuna
08/18/2020, 12:37 AM
Neha Pawar
Schema schema = ZKMetadataProvider.getTableSchema(_propertyStore, _offlineTableName);
Preconditions.checkState(schema != null, "Failed to find schema for table: %s", _offlineTableName);
This is throwing the exception. It needs a schema with name = DnsForwarderServiceStatus.
Christian Acuna
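A minimal sketch of why the lookup fails, assuming (as the message above states) that the schema is looked up under the raw table name, that is, the table name without its _OFFLINE or _REALTIME suffix. The class and method names are hypothetical illustrations, not Pinot's API.

```java
// Hypothetical illustration; not Pinot's actual implementation.
final class SchemaNameCheck {
    private static final String[] TYPE_SUFFIXES = {"_OFFLINE", "_REALTIME"};

    /** Strips a trailing _OFFLINE or _REALTIME suffix, if present. */
    static String rawTableName(String tableNameWithType) {
        for (String suffix : TYPE_SUFFIXES) {
            if (tableNameWithType.endsWith(suffix)) {
                return tableNameWithType.substring(0, tableNameWithType.length() - suffix.length());
            }
        }
        return tableNameWithType;
    }

    /** True when a schema stored under schemaName would be found for the table. */
    static boolean schemaFoundFor(String schemaName, String tableNameWithType) {
        return schemaName.equals(rawTableName(tableNameWithType));
    }

    public static void main(String[] args) {
        String table = "DnsForwarderServiceStatus_OFFLINE";
        // Name the lookup is assumed to use.
        System.out.println(rawTableName(table));
        // Schema name from the posted table config: does not match.
        System.out.println(schemaFoundFor("olap_enriched_dns_forwarder_service_status", table));
        // Schema named after the raw table name: matches.
        System.out.println(schemaFoundFor("DnsForwarderServiceStatus", table));
    }
}
```

Under that assumption, the schema named olap_enriched_dns_forwarder_service_status in the posted config is never found, which is consistent with the precondition failure quoted above.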
08/18/2020, 12:38 AM
Neha Pawar
Christian Acuna
08/18/2020, 12:46 AM
Christian Acuna
08/18/2020, 12:46 AM
Kishore G
Xiang Fu
Buchi Reddy
08/20/2020, 10:10 PM
DISTINCTCOUNT queries on raw data from realtime tables seem to be very slow. We tried the HLL approximation, but that didn’t help. If we are okay with approximate results, would you recommend Theta Sketches? Is that generally faster than HLL?
Mayank
Mayank
Mayank
Buchi Reddy
08/20/2020, 10:12 PM
DISTINCTCOUNT and HLL are equally slow. Are there any optimizations we can do to improve the latencies?
Mayank
Aggregating HLL T/S derived columns during consumption (cc: @Jackie)
Jackie
08/20/2020, 10:18 PM
ValueAggregator for aggregation during consumption
Jackie
08/20/2020, 10:19 PM
Elon
08/21/2020, 4:53 PM
Elon
08/21/2020, 7:32 PM
Yash Agarwal
08/24/2020, 5:12 PM
Data size: 3 Years
Daily Raw Orc Size: 4GB
Daily Segment Counts: 11
Data Full Refresh: Weekly
Data Append: Daily
Pradeep
08/26/2020, 1:09 AM
ingestionConfig for my offline table, but I don’t see epochMinutes getting populated.
Is ingestionConfig only applicable to realtime tables?
"ingestionConfig": {
  "transformConfigs": [
    {
      "columnName": "epochMinutes",
      "transformFunction": "toEpochMinutes(timestampMillis)"
    }
  ]
}
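For reference, a minimal sketch of the value the transform above is expected to produce, assuming toEpochMinutes converts epoch milliseconds to whole minutes since the epoch and drops any remainder. The helper below is a hypothetical stand-in written with the JDK's TimeUnit, not Pinot's transform function itself, and it does not address whether the transform is applied to offline tables.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the transform; not Pinot's implementation.
final class EpochMinutesExample {
    /** Converts epoch milliseconds to whole minutes since the epoch (remainder dropped). */
    static long toEpochMinutes(long epochMillis) {
        return TimeUnit.MILLISECONDS.toMinutes(epochMillis);
    }

    public static void main(String[] args) {
        long timestampMillis = 1598404140000L; // sample value chosen for illustration
        System.out.println(toEpochMinutes(timestampMillis)); // prints 26640069
    }
}
```

A quick way to check whether the column was populated is to compare a row's epochMinutes with timestampMillis / 60000 for the same row.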
Pradeep
08/26/2020, 1:11 AM
Xiang Fu