Kamal Chavda
07/12/2021, 6:04 PM
"name": "created_date",
"dataType": "STRING",
"format" : "1:MILLISECONDS:SIMPLE_DATE_FORMAT:YYYY-MM-dd HH24:MI:SS.MS",
"granularity": "1:MILLISECONDS"
Example from the CSV file: 2020-03-01 07:31:08.792457. Segment generation keeps failing with a Java IllegalArgumentException:
java.lang.IllegalArgumentException: Illegal pattern component: I
at org.joda.time.format.DateTimeFormat.parsePatternTo(DateTimeFormat.java:566) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
at org.joda.time.format.DateTimeFormat.createFormatterForPattern(DateTimeFormat.java:687) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
at org.joda.time.format.DateTimeFormat.forPattern(DateTimeFormat.java:177) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
at org.apache.pinot.spi.data.DateTimeFormatPatternSpec.<init>(DateTimeFormatPatternSpec.java:57) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
at org.apache.pinot.spi.data.DateTimeFormatSpec.<init>(DateTimeFormatSpec.java:59) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
at org.apache.pinot.core.indexsegment.generator.SegmentGeneratorConfig.setTime(SegmentGeneratorConfig.java:212) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
at org.apache.pinot.core.indexsegment.generator.SegmentGeneratorConfig.<init>(SegmentGeneratorConfig.java:138) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
at org.apache.pinot.plugin.ingestion.batch.common.SegmentGenerationTaskRunner.run(SegmentGenerationTaskRunner.java:95) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.lambda$run$0(SegmentGenerationJobRunner.java:199) ~[pinot-batch-ingestion-standalone-0.7.1-shaded.jar:0.7.1-afa4b252ab1c424ddd6c859bb305b2aa342b66ed]
Has anyone run into this issue? Do I need to convert the dates to EPOCH when generating CSV?
Xiang Fu
: need to be escaped
Kamal Chavda
07/12/2021, 6:23 PM
Jackie
07/12/2021, 6:25 PM
yyyy-MM-dd HH:mm:ss.SSS?
Kamal Chavda
07/12/2021, 6:26 PM
Jackie
07/12/2021, 6:28 PM
Kamal Chavda
07/12/2021, 6:37 PM
Jackie
07/12/2021, 6:41 PM
CSVRecordReaderConfig which contains the header info
Kamal Chavda
07/12/2021, 6:44 PM
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
Do I need to change that to CSVRecordReaderConfig instead of CSVRecordReader?
Jackie
07/13/2021, 6:06 PM
Kamal Chavda
07/13/2021, 7:21 PM
Using class: org.apache.pinot.plugin.inputformat.csv.CSVRecordReader to read segment, ignoring configured file format: AVRO
Jackie
07/13/2021, 7:30 PM
Kamal Chavda
07/13/2021, 9:07 PM
java.lang.IllegalStateException: Cannot read single-value from Object[]: [check 175, Some Name] for column: reference_id
But when I check the actual file for that value it's "check 175; Some Name". Not sure how that's happening.
Jackie
07/14/2021, 12:16 AM
; is reserved as the multi-value delimiter. E.g. a;b will be interpreted as [a, b]. One workaround would be picking an unused character as the multi-value delimiter if you don't expect multi-values within the CSV file.
Xiang Fu
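The splitting behavior described above can be illustrated with a small Python sketch. This is not Pinot's CSVRecordReader code; parse_field and the "|" replacement delimiter are hypothetical names and values chosen only to show why "check 175; Some Name" ends up as an array, and why switching to an unused delimiter leaves the value intact.

```python
def parse_field(raw, multi_value_delimiter=";"):
    """Mimic a reader that treats one character as the multi-value delimiter.

    Hypothetical helper for illustration only; not Pinot code.
    Returns a list when the delimiter occurs in the field,
    otherwise returns the raw string unchanged.
    """
    if multi_value_delimiter in raw:
        # The field is interpreted as several values.
        return raw.split(multi_value_delimiter)
    return raw


value = "check 175; Some Name"

# With ';' as the multi-value delimiter, the single value becomes a list.
print(parse_field(value))

# With a character that never appears in the data (assumed here to be '|'),
# the value stays a single string.
print(parse_field(value, multi_value_delimiter="|"))
```

The second call shows the workaround: as long as the chosen delimiter never occurs in the data, no field is split.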
Kamal Chavda
07/14/2021, 12:32 AM
Jackie
07/14/2021, 12:37 AM
Kamal Chavda
07/14/2021, 12:40 AM
Xiang Fu
delimiter and multiValueDelimiter, you can ignore others
Kamal Chavda
07/14/2021, 1:33 PM
{
"name": "created_date",
"dataType": "STRING",
"format": "1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm:ss",
"granularity": "1:SECONDS"
}
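If the CSV still carries fractional seconds such as 2020-03-01 07:31:08.792457, the values would need to be rewritten to match the yyyy-MM-dd HH:mm:ss pattern in the spec above. A minimal Python sketch of such a preprocessing step follows. The step itself is an assumption (the thread does not say how the CSV was regenerated), and to_seconds_format is a hypothetical helper name.

```python
from datetime import datetime

OUTPUT_FORMAT = "%Y-%m-%d %H:%M:%S"  # Python equivalent of yyyy-MM-dd HH:mm:ss


def to_seconds_format(value):
    """Rewrite a timestamp string to second precision, dropping any fraction.

    Hypothetical preprocessing helper; accepts values with or without
    fractional seconds and truncates (does not round) the fraction.
    """
    for fmt in ("%Y-%m-%d %H:%M:%S.%f", "%Y-%m-%d %H:%M:%S"):
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        return parsed.strftime(OUTPUT_FORMAT)
    raise ValueError(f"unrecognized timestamp: {value!r}")


print(to_seconds_format("2020-03-01 07:31:08.792457"))  # 2020-03-01 07:31:08
```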
Xiang Fu
Kamal Chavda
07/14/2021, 5:36 PM
2020-03-01 07:31:08
DATETIME_CONVERT(my_date_field, '1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm:ss', '1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm:ss', '1:HOURS')
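The DATETIME_CONVERT call above uses the same string format for input and output and asks for 1:HOURS granularity. Assuming this buckets each value down to the start of its hour, the same computation looks like this in Python. bucket_to_hour is a hypothetical name, and the sketch mirrors the assumed semantics rather than Pinot's implementation.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # Python equivalent of yyyy-MM-dd HH:mm:ss


def bucket_to_hour(value):
    """Truncate a 'yyyy-MM-dd HH:mm:ss' string to the start of its hour."""
    parsed = datetime.strptime(value, FMT)
    truncated = parsed.replace(minute=0, second=0, microsecond=0)
    return truncated.strftime(FMT)


print(bucket_to_hour("2020-03-01 07:31:08"))  # 2020-03-01 07:00:00
```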
Xiang Fu
Jackie
07/14/2021, 6:47 PM
select ... from table where created_date > '2020-01-01 00:00:00' should work
Kamal Chavda
07/14/2021, 7:11 PM
select...from table where year(create_date) >= 2020. What you posted above does work though so will modify query. Thanks for sharing the ingestion transform link, will use it to transform columns!
Jackie
07/14/2021, 7:16 PM
TIMESTAMP, year() will work
Kamal Chavda
07/16/2021, 3:04 PM
Jackie
07/16/2021, 5:31 PM
Kamal Chavda
07/16/2021, 5:32 PM
"datatype": "Timestamp",
"format" : "1:SECONDS:TIMESTAMP"
instead of
"datatype": "STRING",
"format" : "1:SECONDS:SIMPLE_DATE_FORMAT:yyyy-MM-dd HH:mm:ss"
Jackie
07/16/2021, 11:34 PM
1:MILLISECONDS:TIMESTAMP
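On the earlier question about converting dates to epoch while generating the CSV: with a millisecond-granularity timestamp column, one option is to write epoch milliseconds into the file. The Python sketch below shows that conversion under two assumptions the thread does not confirm: that the source values are in UTC, and that the column is ingested as epoch milliseconds. to_epoch_millis is a hypothetical helper name.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_millis(value):
    """Convert 'YYYY-MM-DD HH:MM:SS.ffffff' (assumed UTC) to epoch milliseconds.

    Sub-millisecond digits are truncated, not rounded.
    """
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")
    delta = parsed.replace(tzinfo=timezone.utc) - EPOCH
    # Integer arithmetic avoids floating-point rounding on large values.
    return delta.days * 86_400_000 + delta.seconds * 1_000 + delta.microseconds // 1_000


print(to_epoch_millis("1970-01-01 00:00:01.500000"))  # 1500
print(to_epoch_millis("2020-03-01 07:31:08.792457"))
```

If the source timestamps are in a local time zone rather than UTC, the tzinfo passed to replace() would need to change accordingly.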