Mateus Oliveira
06/16/2021, 7:53 PM
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
Initializing PinotFS for scheme s3, classname org.apache.pinot.plugin.filesystem.S3PinotFS
Creating an executor service with 1 threads(Job parallelism: 0, available cores: 1.)
Listed 8 files from URI: s3://landing/bank/, is recursive: true
Got exception to kick off standalone data ingestion job -
java.lang.RuntimeException: Caught exception during running - org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:144) ~[pinot-all-0.8.0-SNAPSHOT-jar-with-dependencies.jar:0.8.0-SNAPSHOT-2de40fde8051c2c0281416c2da11c179c2190435]
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.runIngestionJob(IngestionJobLauncher.java:113) ~[pinot-all-0.8.0-SNAPSHOT-jar-with-dependencies.jar:0.8.0-SNAPSHOT-2de40fde8051c2c0281416c2da11c179c2190435]
    at org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.execute(LaunchDataIngestionJobCommand.java:132) [pinot-all-0.8.0-SNAPSHOT-jar-with-dependencies.jar:0.8.0-SNAPSHOT-2de40fde8051c2c0281416c2da11c179c2190435]
    at org.apache.pinot.tools.admin.PinotAdministrator.execute(PinotAdministrator.java:166) [pinot-all-0.8.0-SNAPSHOT-jar-with-dependencies.jar:0.8.0-SNAPSHOT-2de40fde8051c2c0281416c2da11c179c2190435]
    at org.apache.pinot.tools.admin.PinotAdministrator.main(PinotAdministrator.java:186) [pinot-all-0.8.0-SNAPSHOT-jar-with-dependencies.jar:0.8.0-SNAPSHOT-2de40fde8051c2c0281416c2da11c179c2190435]
Caused by: java.lang.IllegalArgumentException
    at sun.nio.fs.UnixFileSystem.getPathMatcher(UnixFileSystem.java:288) ~[?:1.8.0_292]
    at org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner.run(SegmentGenerationJobRunner.java:175) ~[pinot-batch-ingestion-standalone-0.8.0-SNAPSHOT-shaded.jar:0.8.0-SNAPSHOT-2de40fde8051c2c0281416c2da11c179c2190435]
    at org.apache.pinot.spi.ingestion.batch.IngestionJobLauncher.kickoffIngestionJob(IngestionJobLauncher.java:142) ~[pinot-all-0.8.0-SNAPSHOT-jar-with-dependencies.jar:0.8.0-SNAPSHOT-2de40fde8051c2c0281416c2da11c179c2190435]
    ... 4 more
this is my job
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: 's3://landing/bank/'
includeFileNamePattern: '*.json'
outputDirURI: 's3://pinot/'
overwriteOutput: true
pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: 'us-east-1'
      endpoint: 'http://10.0.220.205:9000'
      accessKey: 'access'
      secretKey: 'key'
recordReaderSpec:
  dataFormat: 'json'
  className: 'org.apache.pinot.plugin.inputformat.json.JSONRecordReader'
tableSpec:
  tableName: 'bank'
pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'
Aaron Wishnick
06/16/2021, 8:02 PM
includeFileNamePattern: 'glob:**/*.json'
Mayank
if (_spec.getIncludeFileNamePattern() != null) {
  includeFilePathMatcher = FileSystems.getDefault().getPathMatcher(_spec.getIncludeFileNamePattern());
}
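To make the cause of the `IllegalArgumentException` concrete: `FileSystem.getPathMatcher` expects a string of the form `syntax:pattern` (for example `glob:...` or `regex:...`), so a bare `'*.json'` is rejected before any file is examined. The following is a minimal standalone sketch of that behavior, not Pinot code; the class name `PathMatcherPrefixDemo` and the helper `isAccepted` are made up for illustration.

```java
import java.nio.file.FileSystems;

final class PathMatcherPrefixDemo {

    /** Returns true if the default file system accepts the given "syntax:pattern" string. */
    static boolean isAccepted(String syntaxAndPattern) {
        try {
            FileSystems.getDefault().getPathMatcher(syntaxAndPattern);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when the string is not of the form "syntax:pattern",
            // which is the case for a pattern with no "glob:" or "regex:" prefix.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isAccepted("*.json"));         // false: no syntax prefix
        System.out.println(isAccepted("glob:**/*.json")); // true
    }
}
```

This only checks that the pattern string is well-formed; whether the pattern then matches the listed files is a separate question.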
Mateus Oliveira
06/16/2021, 8:10 PM
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
Initializing PinotFS for scheme s3, classname org.apache.pinot.plugin.filesystem.S3PinotFS
Creating an executor service with 1 threads(Job parallelism: 0, available cores: 1.)
Listed 8 files from URI: s3://landing/bank/, is recursive: true
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
Initializing PinotFS for scheme s3, classname org.apache.pinot.plugin.filesystem.S3PinotFS
Listed 0 files from URI: s3://pinot/, is recursive: true
Start pushing segments: []... to locations: [org.apache.pinot.spi.ingestion.batch.spec.PinotClusterSpec@106cc338] for table bank
Xiang Fu
includeFileNamePattern: 'glob:**/*.json'
Mateus Oliveira
06/16/2021, 8:24 PM
Xiang Fu
Mateus Oliveira
06/16/2021, 8:26 PM
bank_2021_5_19_11_33_43.json
Xiang Fu
Mateus Oliveira
06/16/2021, 8:27 PM
Xiang Fu
schemaURI: 'http://localhost:9000/tables/bank/schema'
tableConfigURI: 'http://localhost:9000/tables/bank'
tableSpec:
Mateus Oliveira
06/16/2021, 8:28 PM
SegmentGenerationJobSpec:
!!org.apache.pinot.spi.ingestion.batch.spec.SegmentGenerationJobSpec
authToken: null
cleanUpOutputDir: false
excludeFileNamePattern: null
executionFrameworkSpec: {extraConfigs: null, name: standalone, segmentGenerationJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner,
  segmentMetadataPushJobRunnerClassName: null, segmentTarPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner,
  segmentUriPushJobRunnerClassName: org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner}
failOnEmptySegment: false
includeFileNamePattern: glob:*.json
inputDirURI: s3://landing/bank/
jobType: SegmentCreationAndTarPush
outputDirURI: s3://pinot/
overwriteOutput: true
pinotClusterSpecs:
- {controllerURI: 'http://localhost:9000'}
pinotFSSpecs:
- className: org.apache.pinot.plugin.filesystem.S3PinotFS
  configs: {region: us-east-1, endpoint: 'http://10.0.220.205:9000', accessKey: YOURACCESSKEY,
    secretKey: YOURSECRETKEY}
  scheme: s3
pushJobSpec: null
recordReaderSpec: {className: org.apache.pinot.plugin.inputformat.json.JSONRecordReader,
  configClassName: null, configs: null, dataFormat: json}
segmentCreationJobParallelism: 0
segmentNameGeneratorSpec: null
tableSpec: {schemaURI: 'http://localhost:9000/tables/bank/schema', tableConfigURI: 'http://localhost:9000/tables/bank',
  tableName: bank}
tlsSpec: null
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner
Initializing PinotFS for scheme s3, classname org.apache.pinot.plugin.filesystem.S3PinotFS
Creating an executor service with 1 threads(Job parallelism: 0, available cores: 1.)
Listed 8 files from URI: s3://landing/bank/, is recursive: true
Trying to create instance for class org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner
Initializing PinotFS for scheme s3, classname org.apache.pinot.plugin.filesystem.S3PinotFS
Listed 0 files from URI: s3://pinot/, is recursive: true
Start pushing segments: []... to locations: [org.apache.pinot.spi.ingestion.batch.spec.PinotClusterSpec@63f259c3] for table bank
root@pinot-controller-0:/opt/pinot#
Xiang Fu
includeFileNamePattern: glob:*.json
'glob:**/*.json'
'glob:*.json'
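The difference between the two patterns comes down to how glob syntax treats directory separators: `*` does not cross a directory boundary, while `**` does. The sketch below illustrates this with plain `java.nio`. It assumes the ingestion job matches the pattern against a path that still contains the directory part (here `bank/`), which is consistent with 8 files being listed but no segments being produced; that assumption, along with the class name `GlobDepthDemo` and helper `matches`, is illustrative and not taken from Pinot itself.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;

final class GlobDepthDemo {

    /** Returns true if the path matches the given "syntax:pattern" string. */
    static boolean matches(String syntaxAndPattern, Path path) {
        return FileSystems.getDefault().getPathMatcher(syntaxAndPattern).matches(path);
    }

    public static void main(String[] args) {
        // A path that keeps its directory component, and the bare file name.
        Path nested = Paths.get("bank", "bank_2021_5_19_11_33_43.json");
        Path bare = Paths.get("bank_2021_5_19_11_33_43.json");

        System.out.println(matches("glob:*.json", nested));    // false: '*' stops at the separator
        System.out.println(matches("glob:**/*.json", nested)); // true: '**' spans directories
        System.out.println(matches("glob:*.json", bare));      // true
        System.out.println(matches("glob:**/*.json", bare));   // false: pattern requires a separator
    }
}
```

So `glob:*.json` is a valid pattern string, but it can only match a path with no directory component, which would explain an empty segment list if the job compares against the full object path.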
Mateus Oliveira
06/16/2021, 8:35 PM
Mayank
Mateus Oliveira
06/16/2021, 8:40 PM
Mayank
Kulbir Nijjer
06/16/2021, 8:44 PM
endpoint: 'http://10.0.220.205:9000'
In case you're interested in the valid values: https://docs.aws.amazon.com/general/latest/gr/s3.html
Xiang Fu
Kulbir Nijjer
06/16/2021, 10:38 PM