Ryan Clark
07/28/2021, 9:24 PM
java.lang.IllegalStateException: Cannot flatten value node: null
Ryan Clark
07/28/2021, 9:24 PM
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: 's3://example-bucket/'
includeFileNamePattern: 'glob:**/*.txt.gz'
outputDirURI: 's3://another-example-bucket/batch-output/'
overwriteOutput: true
pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: 'us-east-1'
recordReaderSpec:
  dataFormat: 'json'
  className: 'org.apache.pinot.plugin.inputformat.json.JSONRecordReader'
tableSpec:
  tableName: 'tablename'
  schemaURI: 'http://example.us-east-1.elb.amazonaws.com:9000/tables/tablename/schema'
  tableConfigURI: 'http://example.us-east-1.elb.amazonaws.com:9000/tables/tablename'
pinotClusterSpecs:
  - controllerURI: 'http://example.us-east-1.elb.amazonaws.com:9000/'
Ryan Clark
07/28/2021, 9:24 PM
Driver, record read time : 0
Driver, stats collector time : 0
Driver, indexing time : 0
Tarring segment from: /var/folders/q6/7tb5ffbj621_wtx7wvf2pwmm0000gn/T/pinot-736ebe4d-6eb5-4333-a542-c994a4f92ed4/output/dimRewardProductGroupTable_OFFLINE_1627504142199_1627504142199_135 to: /var/folders/q6/7tb5ffbj621_wtx7wvf2pwmm0000gn/T/pinot-736ebe4d-6eb5-4333-a542-c994a4f92ed4/output/dimRewardProductGroupTable_OFFLINE_1627504142199_1627504142199_135.tar.gz
Size for segment: dimRewardProductGroupTable_OFFLINE_1627504142199_1627504142199_135, uncompressed: 5.36K, compressed: 1.32K
Copy /var/folders/q6/7tb5ffbj621_wtx7wvf2pwmm0000gn/T/pinot-736ebe4d-6eb5-4333-a542-c994a4f92ed4/output/dimRewardProductGroupTable_OFFLINE_1627504142199_1627504142199_135.tar.gz from local to s3://fetch-pinot-test/pinot-data/pinot-quickstart/batch-output/year=2021/month=07/day=20/hour=03/dimRewardProductGroupTable_OFFLINE_1627504142199_1627504142199_135.tar.gz
Submitting one Segment Generation Task for s3://example-bucket/year=2021/month=07/day=22/hour=03/part-00000-b5f50645-9b35-413a-a2e2-c4f2a97512fa-c000.txt.gz
Copy s3://example-bucket/year=2021/month=07/day=23/hour=03/part-00000-6d85cabe-e315-428f-9f1d-4e227f7a7772-c000.txt.gz to local /var/folders/q6/7tb5ffbj621_wtx7wvf2pwmm0000gn/T/pinot-736ebe4d-6eb5-4333-a542-c994a4f92ed4/input/part-00000-6d85cabe-e315-428f-9f1d-4e227f7a7772-c000.txt.gz
Submitting one Segment Generation Task for s3://example-bucket/year=2021/month=07/day=23/hour=03/part-00000-6d85cabe-e315-428f-9f1d-4e227f7a7772-c000.txt.gz
Copy s3://example-bucket/year=2021/month=07/day=24/hour=03/part-00000-224051fa-e495-413e-a04f-18ccc0991000-c000.txt.gz to local /var/folders/q6/7tb5ffbj621_wtx7wvf2pwmm0000gn/T/pinot-736ebe4d-6eb5-4333-a542-c994a4f92ed4/input/part-00000-224051fa-e495-413e-a04f-18ccc0991000-c000.txt.gz
Using class: org.apache.pinot.plugin.inputformat.json.JSONRecordReader to read segment, ignoring configured file format: AVRO
Finished building StatsCollector!
Collected stats for 34 documents
Created dictionary for INT column: dummy with cardinality: 1, range: 0 to 0
Using fixed length dictionary for column: sequenceNumber, size: 442
Created dictionary for STRING column: sequenceNumber with cardinality: 34, max length in bytes: 13, range: 1626797946139 to 1626797947541
Using fixed length dictionary for column: resync, size: 4
Created dictionary for STRING column: resync with cardinality: 1, max length in bytes: 4, range: true to true
Using fixed length dictionary for column: collection, size: 19
Created dictionary for STRING column: collection with cardinality: 1, max length in bytes: 19, range: RewardProductGroups to RewardProductGroups
Using fixed length dictionary for column: id, size: 816
Created dictionary for STRING column: id with cardinality: 34, max length in bytes: 24, range: 5e14b42af4f32e6c2a230aa1 to 60f6f77b259c4853012b8f72
Using fixed length dictionary for column: operation, size: 8
Created dictionary for STRING column: operation with cardinality: 1, max length in bytes: 8, range: SNAPSHOT to SNAPSHOT
Start building IndexCreator!
Failed to generate Pinot segment for file - s3://example-bucket/year=2021/month=07/day=21/hour=03/part-00000-d5213b3f-5de8-48ac-a51c-7401944db031-c000.txt.gz
java.lang.IllegalStateException: Cannot flatten value node: null
at shaded.com.google.common.base.Preconditions.checkState(Preconditions.java:518) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
at org.apache.pinot.spi.utils.JsonUtils.flatten(JsonUtils.java:246) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
at org.apache.pinot.core.segment.creator.impl.inv.json.BaseJsonIndexCreator.add(BaseJsonIndexCreator.java:90) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
at org.apache.pinot.core.segment.creator.impl.SegmentColumnarIndexCreator.indexRow(SegmentColumnarIndexCreator.java:407) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
at org.apache.pinot.core.segment.creator.impl.SegmentIndexCreationDriverImpl.build(SegmentIndexCreationDriverImpl.java:220) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
at org.apache.pinot.plugin.ingestion.batch.common.SegmentGenerationTaskRunner.run(SegmentGenerationTaskRunner.java:108) ~[pinot-all-0.7.1-jar-with-dependencies.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.lambda$run$0(SegmentGenerationJobRunner.java:199) ~[pinot-batch-ingestion-standalone-0.7.1-shaded.jar:0.7.1-e22be7c3a39e840321d3658e7505f21768b228d6]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Using class: org.apache.pinot.plugin.inputformat.json.JSONRecordReader to read segment, ignoring configured file format: AVRO
Finished building StatsCollector!
Collected stats for 27 documents
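(Editorial aside: the trace above shows the exception coming out of `JsonUtils.flatten`, called from `BaseJsonIndexCreator.add`. The sketch below is not Pinot's code. It is a hypothetical Python illustration of why a flattening routine has nothing to expand when it is handed a bare scalar or `null` instead of an object or array; the names `flatten` and `_walk` and the path format are assumptions made for the example.)

```python
import json
from typing import Any, Dict


def flatten(node: Any) -> Dict[str, Any]:
    """Flatten a JSON object or array into {path: scalar} pairs (illustration only)."""
    if not isinstance(node, (dict, list)):
        # A bare scalar or null has no keys or elements to expand.
        raise ValueError(f"Cannot flatten value node: {json.dumps(node)}")
    flat: Dict[str, Any] = {}
    _walk(node, "", flat)
    return flat


def _walk(node: Any, path: str, flat: Dict[str, Any]) -> None:
    """Recursively record every scalar leaf under its dotted/indexed path."""
    if isinstance(node, dict):
        for key, child in node.items():
            _walk(child, f"{path}.{key}" if path else key, flat)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            _walk(child, f"{path}[{index}]", flat)
    else:
        flat[path] = node
```

Under this reading, one plausible cause worth checking is a record whose JSON-indexed column is missing or null, though the thread does not confirm that.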
Ryan Clark
07/28/2021, 9:24 PM
.gz but it is JSON objects not separated by commas:
{"json1": "value", "json2": "value"}
{"json1": "value", "json2": "value"}
{"json1": "value", "json2": "value"}
{"json1": "value", "json2": "value"}
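(Editorial aside: for input shaped like the lines above, one JSON object per line inside a gzip file, a quick pre-check can show which records have a missing or null field before the ingestion job runs. This is a minimal sketch under assumptions: the function name `find_null_fields` and the field names passed to it are hypothetical, and it assumes every non-blank line is a JSON object. Whether null fields are what triggers the exception in this thread is not confirmed here.)

```python
import gzip
import json
from typing import Iterator, List, Tuple


def find_null_fields(path: str, fields: List[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line_number, field) for each record where a field is missing or null."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            record = json.loads(line)  # assumes each line is a JSON object
            for field in fields:
                if record.get(field) is None:
                    yield line_number, field
```

A call such as `list(find_null_fields("part-00000.txt.gz", ["json1"]))` (the file name is a placeholder) would list the offending line numbers, which can then be compared against the file named in the "Failed to generate Pinot segment" log line.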
Xiang Fu
Ryan Clark
07/29/2021, 5:15 PM