Hi Team, I'm facing an issue with Pinot datatypes....
# troubleshooting
s
Hi Team, I'm facing an issue with Pinot datatypes. I have a column jobTitle with the value "Staff RN (Med Surg, Ortho/Neuro, GI/GU floor" in my file, and the schema defines it with the string datatype only. But I'm getting an error while loading into the table:
Cannot read single-value from Object[]: [Staff RN (Med Surg,  Ortho/Neuro,  GI/GU floor] for column: jobTitle
s
Can you share your table config, schema JSON, the data format, and the data file / data JSON you're trying to ingest? Is this a realtime table or an offline table?
s
It's an offline table ingesting from a CSV file. Sharing a tar file containing the table config, schema, job specification file, and the raw_data/xab.csv file.
s
@Sowmya Gowda values like "Staff RN (Med Surg; Ortho/Neuro; GI/GU floor" are the culprits here. The ";" character is the default multi-value separator for the CSVRecordReader configured in the job spec to ingest the data, so the value gets split into several values for a single-value column. I was able to generate the segment correctly with:
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: '/Users/saurabh.dubey/Downloads/test2_candidate/raw_data/'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: '/Users/saurabh.dubey/Downloads/test2_candidate/segments/'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
  configs:
    multiValueDelimiter: '\$'
tableSpec:
  tableName: 'test2_candidate'
  schemaURI: 'http://localhost:9000/tables/test2_candidate/schema'
  tableConfigURI: 'http://localhost:9000/tables/test2_candidate'
pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'
^ This spec. Basically overriding the
configs:
    multiValueDelimiter: '\$'
part to change the multiValueDelimiter to some other character. This may not always work, though (for example, if some strings contain the $ character). You should figure out a multiValueDelimiter that never appears in your data and use that in the ingestion spec. Otherwise, change the ingestion format from CSV to something more robust, like JSON.
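One way to pick a delimiter that fits the data is to scan the CSV for candidate characters that never occur in any field. The sketch below is only an illustration, not part of Pinot: the helper name `unused_delimiters` and the candidate list are assumptions, and the sample row with `$` is made up for the demo.

```python
import csv
import io

# Hypothetical helper (not a Pinot API): report which candidate
# delimiter characters never occur in any field of the CSV text.
def unused_delimiters(csv_text, candidates=";|$~^"):
    remaining = set(candidates)
    reader = csv.reader(io.StringIO(csv_text))
    for row in reader:
        for field in row:
            # Drop every candidate that shows up inside this field.
            remaining -= set(field)
        if not remaining:
            break  # every candidate is already ruled out
    # Keep the caller's candidate order in the result.
    return [c for c in candidates if c in remaining]

# Made-up sample: one value with ';' (as in this thread) and one with '$'.
sample = (
    "id,jobTitle\n"
    '1,"Staff RN (Med Surg; Ortho/Neuro; GI/GU floor"\n'
    "2,Nurse $50/hr\n"
)
print(unused_delimiters(sample))  # ['|', '~', '^']
```

For a real file, the text could be read with `open(path, newline="").read()` and passed in. Any character the function returns is a candidate for multiValueDelimiter, as long as it is not also the CSV field delimiter.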
^@Kartik Khare for more
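If switching away from CSV is an option, a small conversion step can turn each row into one JSON object per line, so that characters like ";" stay inside the string value. This is a minimal sketch under assumptions: the function name `csv_to_json_lines` is hypothetical, every value stays a string, and the job spec's recordReaderSpec and includeFileNamePattern would still need to be switched to JSON separately.

```python
import csv
import io
import json

# Hypothetical converter (not a Pinot tool): turn CSV text with a header
# row into newline-delimited JSON, one object per data row.
def csv_to_json_lines(csv_text):
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each row becomes a dict keyed by the header names; values stay strings.
    return "\n".join(json.dumps(row) for row in reader)

sample_csv = (
    "id,jobTitle\n"
    '1,"Staff RN (Med Surg; Ortho/Neuro; GI/GU floor"\n'
)
print(csv_to_json_lines(sample_csv))
```

The semicolons survive as part of the jobTitle string because JSON has no implicit multi-value separator; a multi-value column would be written as a JSON array instead.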
s
Thank you @saurabh dubey for the quick solution. It helps me a lot!!