Alexander Vivas
03/12/2021, 2:20 PM
Alexander Vivas
03/12/2021, 2:20 PM
Deepak Kumar Mishra
03/16/2021, 4:22 AM
Deepak Kumar Mishra
03/16/2021, 9:54 AM
Ravikumar Maddi
03/16/2021, 4:17 PM
Ali LeClerc
03/18/2021, 3:06 AM
Ravikumar Maddi
03/19/2021, 12:43 PM
Harshvardhan Surolia
03/22/2021, 12:12 PM
Phúc Huỳnh
03/24/2021, 6:39 AM
2021/03/24 06:13:18.879 WARN [ConsumerCoordinator] [RuleLogsQC__2__1__20210319T0728Z] [Consumer clientId=consumer-12735, groupId=] Synchronous auto-commit of offsets {c1.elk.db-gamification-consumer-log.qc-2=OffsetAndMetadata{offset=17073, metadata=''}} failed: Not authorized to access group:
Ravikumar Maddi
03/24/2021, 8:09 AM
Ravikumar Maddi
03/24/2021, 9:41 AM
{
"personId": "9878",
"addresses": [
{
"doorNum": "45456",
"Street": "Washington Road",
"area": "sector-1"
},
{
"doorNum": "676756",
"Street": "Washington Road",
"area": "sector-2"
},
{
"doorNum": "768768",
"Street": "Washington Road",
"area": "sector-4"
}
]
}
{
"personId": "68768",
"addresses": [
{
"doorNum": "45456",
"Street": "Washington Road",
"area": "sector-1"
},
{
"doorNum": "676756",
"Street": "Washington Road",
"area": "sector-2"
},
{
"doorNum": "768768",
"Street": "Washington Road",
"area": "sector-4"
}
]
}
In the schema config file I defined it like this:
{
"name": "addresses",
"dataType": "STRING",
"maxLength": 2147483647,
"singleValueField": false
},
In the table config file I defined it like this:
"jsonIndexColumns": [
"addresses"
],
But I am not able to find the data in the Pinot Query Console, and I cannot find any error in any log.
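(A hedged sketch of how a JSON-indexed STRING column is usually queried; the table name mytable is an assumption, and the exact JSON_MATCH predicate syntax varies across Pinot versions.)
-- hypothetical table name; predicate syntax depends on the Pinot version
SELECT personId FROM mytable WHERE JSON_MATCH(addresses, '"$[*].area"=''sector-1''')
-- pull one value out of the JSON array to confirm the column is ingested
SELECT jsonExtractScalar(addresses, '$[0].area', 'STRING') FROM mytable LIMIT 10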
Need Help
Prashant Kumar
03/24/2021, 3:06 PM
Mohamed Sultan
03/25/2021, 7:27 AM
Charles
03/26/2021, 3:25 AM
Charles
03/31/2021, 12:29 AM
Charles
03/31/2021, 12:29 AM
Charles
03/31/2021, 12:30 AM
Ravikumar Maddi
03/31/2021, 1:23 PM
Kishore G
Alexander Vivas
03/31/2021, 2:25 PM
org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 514
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
at sun.security.ssl.Alert.createSSLException(Alert.java:131) ~[?:1.8.0_282]
( . . . )
at io.confluent.kafka.schemaregistry.client.rest.RestService.sendHttpRequest(RestService.java:212) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at io.confluent.kafka.schemaregistry.client.rest.RestService.httpRequest(RestService.java:256) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:486) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at io.confluent.kafka.schemaregistry.client.rest.RestService.getId(RestService.java:479) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getSchemaByIdFromRegistry(CachedSchemaRegistryClient.java:177) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getBySubjectAndId(CachedSchemaRegistryClient.java:256) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient.getById(CachedSchemaRegistryClient.java:235) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:107) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:79) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at io.confluent.kafka.serializers.KafkaAvroDeserializer.deserialize(KafkaAvroDeserializer.java:55) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder.decode(KafkaConfluentSchemaRegistryAvroMessageDecoder.java:114) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder.decode(KafkaConfluentSchemaRegistryAvroMessageDecoder.java:120) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at org.apache.pinot.plugin.inputformat.avro.confluent.KafkaConfluentSchemaRegistryAvroMessageDecoder.decode(KafkaConfluentSchemaRegistryAvroMessageDecoder.java:53) ~[pinot-confluent-avro-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.processStreamEvents(LLRealtimeSegmentDataManager.java:471) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager.consumeLoop(LLRealtimeSegmentDataManager.java:402) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at org.apache.pinot.core.data.manager.realtime.LLRealtimeSegmentDataManager$PartitionConsumer.run(LLRealtimeSegmentDataManager.java:538) [pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-a44d0b1bb64d00d851ea6f2d8bc46ff0ab080d3e]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
I had a look at the pods in our server instances and all of them had their certs in place; all the table configs have the right path in their properties, and even so the second table (the one we need to consume from a different Kafka cluster) doesn't work.
Elon
04/01/2021, 6:54 AM
select count(*) from mytable where (( DATETRUNC( 'hour', created_at_seconds, 'seconds')) - ( DATETRUNC( 'hour', CAST( 1.610354466173E9 as long), 'seconds'))) >= 0
does not work, but if you take the E9 away it works. It looks like the grammar only recognizes
FLOATING_POINT_LITERAL : SIGN? DIGIT+ '.' DIGIT* | SIGN? DIGIT* '.' DIGIT+;
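(A hedged workaround sketch, assuming the literal is meant to be epoch seconds: spell the value out without the exponent so the floating-point grammar above can parse it.)
-- same query with the scientific-notation literal expanded to a plain decimal (1.610354466173E9 = 1610354466.173)
select count(*) from mytable where (( DATETRUNC( 'hour', created_at_seconds, 'seconds')) - ( DATETRUNC( 'hour', CAST( 1610354466.173 as long), 'seconds'))) >= 0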
This is for Pinot 0.6.0; did this change in 0.7.0?
Brian Olsen
04/10/2021, 1:17 AM
"dateTimeFieldSpecs": [
{
"name": "cdc_case_earliest_dt",
"dataType": "STRING",
"format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyy/MM/dd",
"granularity": "1:DAYS"
},
{
"name": "cdc_report_dt",
"dataType": "STRING",
"format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyy/MM/dd",
"granularity": "1:DAYS"
},
{
"name": "pos_spec_dt",
"dataType": "STRING",
"format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyy/MM/dd",
"granularity": "1:DAYS"
},
{
"name": "onset_dt",
"dataType": "STRING",
"format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyy/MM/dd",
"granularity": "1:DAYS"
}
]
With a CSV that has various dates missing:
[ec2-user@aws ~]$ head /tmp/pinot-quick-start/covid-cases.csv
cdc_case_earliest_dt ,cdc_report_dt,pos_spec_dt,onset_dt,current_status,sex,age_group,race_ethnicity_combined,hosp_yn,icu_yn,death_yn,medcond_yn
2020/10/23,2021/01/28,2020/10/23,,Laboratory-confirmed case,Female,0 - 9 Years,"Black, Non-Hispanic",Missing,Missing,No,Missing
2020/10/23,2020/10/23,2020/10/23,,Laboratory-confirmed case,Female,0 - 9 Years,"Black, Non-Hispanic",No,Unknown,No,No
2020/10/23,2020/10/25,2020/10/23,2020/10/23,Laboratory-confirmed case,Female,0 - 9 Years,"Black, Non-Hispanic",No,Missing,Missing,Missing
2020/10/23,2020/10/25,2020/10/23,,Laboratory-confirmed case,Female,0 - 9 Years,"Black, Non-Hispanic",Missing,Missing,Missing,Missing
It looks like when parsing rows with null dates, the parser gets fed a null value. I'm tempted to update defaultNullValue in the dateTimeFieldSpec to a default date of 1970/01/01, but I'd like to just keep those values null if possible. Is there anything I'm doing wrong, or any way around this? (A sketch of the defaultNullValue idea is shown after the error below.)
Failed to generate Pinot segment for file - file:/tmp/pinot-quick-start/covid-cases.csv
java.lang.IllegalArgumentException: Invalid format: "null"
at org.joda.time.format.DateTimeParserBucket.doParseMillis(DateTimeParserBucket.java:187) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-50a4531b33475327bc9fe3c0199e7003f0a4c882]
at org.joda.time.format.DateTimeFormatter.parseMillis(DateTimeFormatter.java:826) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-50a4531b33475327bc9fe3c0199e7003f0a4c882]
at org.apache.pinot.core.segment.creator.impl.SegmentColumnarIndexCreator.writeMetadata(SegmentColumnarIndexCreator.java:555) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-50a4531b33475327bc9fe3c0199e7003f0a4c882]
at org.apache.pinot.core.segment.creator.impl.SegmentColumnarIndexCreator.seal(SegmentColumnarIndexCreator.java:514) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-50a4531b33475327bc9fe3c0199e7003f0a4c882]
at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.handlePostCreation(SegmentIndexCreationDriverImpl.java:273) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-50a4531b33475327bc9fe3c0199e7003f0a4c882]
at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.build(SegmentIndexCreationDriverImpl.java:246) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-50a4531b33475327bc9fe3c0199e7003f0a4c882]
at org.apache.pinot.plugin.ingestion.batch.common.SegmentGenerationTaskRunner.run(SegmentGenerationTaskRunner.java:111) ~[pinot-all-0.7.0-SNAPSHOT-jar-with-dependencies.jar:0.7.0-SNAPSHOT-50a4531b33475327bc9fe3c0199e7003f0a4c882]
at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.lambda$submitSegmentGenTask$1(SegmentGenerationJobRunner.java:261) ~[pinot-batch-ingestion-standalone-0.7.0-SNAPSHOT-shaded.jar:0.7.0-SNAPSHOT-50a4531b33475327bc9fe3c0199e7003f0a4c882]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_282]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_282]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_282]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_282]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]
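(A hedged sketch of the defaultNullValue idea mentioned above; the 1970/01/01 value is only an illustration matching the yyyy/MM/dd format, not a recommendation.)
{
  "name": "onset_dt",
  "dataType": "STRING",
  "format": "1:DAYS:SIMPLE_DATE_FORMAT:yyyy/MM/dd",
  "granularity": "1:DAYS",
  "defaultNullValue": "1970/01/01"
}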
Elon
04/10/2021, 1:30 AM
Try setting "nullHandlingEnabled": true in the "tableIndexConfig" section of the table config (not the schema above).
Not sure if time columns can be null though; if not, then defaultNullValue = 0 would work, otherwise it will be set to Long.MIN_VALUE, which is also not a valid value.
Aaron Wishnick
04/13/2021, 3:52 PM
Jonathan Meyer
04/16/2021, 10:45 AM
events.something.* ) ?
• Is it possible to filter on a column not part of the schema (ex: when filtering on event.this_event_type.only) ?
Thanks 😄
Surendra
04/16/2021, 11:38 PM
{
"id": "<>__0__1000__20210312T1413Z",
"simpleFields": {
"segment.crc": "1640078893",
"segment.creation.time": "1615558396792",
"segment.end.time": "-9223372036854775808",
"segment.flush.threshold.size": "100000",
"segment.flush.threshold.time": null,
"segment.index.version": "v3",
"segment.name": "<>__0__1000__20210312T1413Z",
"segment.realtime.download.url": "s3://<>/pinot/<>/<>__0__1000__20210312T1413Z",
"segment.realtime.endOffset": "62577410",
"segment.realtime.numReplicas": "1",
"segment.realtime.startOffset": "62565619",
"segment.realtime.status": "DONE",
"segment.start.time": "-9223372036854775808",
"segment.table.name": "<>_REALTIME",
"segment.time.unit": "MILLISECONDS",
"segment.total.docs": "11791",
"segment.type": "REALTIME"
},
"mapFields": {},
"listFields": {}
}
Ravikumar Maddi
04/19/2021, 1:11 PM
Ravikumar Maddi
04/20/2021, 2:44 PM