Sowmya Gowda
05/31/2022, 6:42 AM
Cannot read single-value from Object[]: [Staff RN (Med Surg, Ortho/Neuro, GI/GU floor] for column: jobTitle
saurabh dubey
05/31/2022, 8:21 AM
Sowmya Gowda
05/31/2022, 8:27 AM
saurabh dubey
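05/31/2022, 11:26 AM
The semicolons in "Staff RN (Med Surg; Ortho/Neuro; GI/GU floor" are the culprits here. The ; character is the default multi-value delimiter for the CSV record reader configured in the job spec to ingest the data, so the reader splits this value into several values, which cannot be stored in the single-value column jobTitle. I was able to generate the segment correctly with: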
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: '/Users/saurabh.dubey/Downloads/test2_candidate/raw_data/'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: '/Users/saurabh.dubey/Downloads/test2_candidate/segments/'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
  configs:
    multiValueDelimiter: '\$'
tableSpec:
  tableName: 'test2_candidate'
  schemaURI: 'http://localhost:9000/tables/test2_candidate/schema'
  tableConfigURI: 'http://localhost:9000/tables/test2_candidate'
pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'
^ This spec. The key change is the
  configs:
    multiValueDelimiter: '\$'
part, which overrides the multiValueDelimiter with a different character. This may not always work, though (for example, if some strings contain the $ character). You should figure out the correct multiValueDelimiter for your data and use it in the ingestion spec. Otherwise, switch the ingestion from CSV to something more robust, like JSON.
saurabh dubey
05/31/2022, 11:27 AM
Sowmya Gowda
05/31/2022, 11:29 AM
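
The two suggestions in saurabh's last message (pick a multiValueDelimiter that never occurs in the data, or move from CSV to JSON) can be sketched in Python roughly as below. This is only an illustration, not part of Pinot: the function names, the candidate list, and the sample rows (including the `name` column and the second row) are hypothetical, and real input would be read from the files under inputDirURI.

```python
import csv
import io
import json

# Hypothetical candidate characters to consider as a multi-value delimiter.
CANDIDATE_DELIMITERS = [";", "|", "$", "~", "^", "#"]


def unused_delimiters(csv_text, candidates=CANDIDATE_DELIMITERS):
    """Return the candidates that never occur inside any field of the CSV."""
    seen = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for value in row:
            seen.update(ch for ch in value if ch in candidates)
    return [c for c in candidates if c not in seen]


def csv_to_json_lines(csv_text):
    """Convert CSV with a header row to newline-delimited JSON, one object per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return "\n".join(json.dumps(row) for row in reader)


# Made-up sample: the first row mirrors the jobTitle value from this thread,
# the second row is invented to show a value containing '$'.
sample = (
    "name,jobTitle\n"
    'A,"Staff RN (Med Surg; Ortho/Neuro; GI/GU floor"\n'
    "B,Cashier $12/hr\n"
)

# Candidates that are safe for this sample; ';' and '$' are excluded.
print(unused_delimiters(sample))

# The same rows as JSON lines, where no delimiter character is needed at all.
print(csv_to_json_lines(sample))
```

Any character reported by `unused_delimiters` is only safe for the data that was scanned; new files may contain it later, which is why JSON input is the more robust option when the columns hold free text.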