Sukesh Boggavarapu
03/29/2022, 11:58 AM
Mark Needham
Sukesh Boggavarapu
03/29/2022, 2:52 PM
Sukesh Boggavarapu
03/29/2022, 2:52 PM
{
  "custom.map": "{\"input.data.file.uri\":\"file:/data/customers.csv\"}",
  "segment.crc": "3320463979",
  "segment.creation.time": "1648122612684",
  "segment.index.version": "v3",
  "segment.name": "merchants_OFFLINE_0",
  "segment.offline.download.url": "http://172.28.0.4:9000/segments/merchants/merchants_OFFLINE_0",
  "segment.offline.push.time": "1648122613365",
  "segment.table.name": "merchants",
  "segment.total.docs": "100",
  "segment.type": "OFFLINE"
}
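For anyone reading this metadata later: `custom.map` is itself a JSON object stored as a string, so it has to be decoded a second time to see which input file the segment was built from. A minimal Python sketch, not from the thread; the helper name `input_file_uri` is made up for illustration and the dictionary is trimmed to a few of the fields shown above.

```python
import json

# Segment metadata as posted above, trimmed to a few fields.
segment_metadata = {
    "custom.map": "{\"input.data.file.uri\":\"file:/data/customers.csv\"}",
    "segment.name": "merchants_OFFLINE_0",
    "segment.table.name": "merchants",
    "segment.total.docs": "100",
}


def input_file_uri(metadata):
    """Return the input file URI recorded in the segment's custom.map.

    Hypothetical helper for illustration; not part of any Pinot API.
    """
    # custom.map is a JSON object serialized into a string,
    # so it needs its own json.loads call.
    custom_map = json.loads(metadata["custom.map"])
    return custom_map["input.data.file.uri"]


print(input_file_uri(segment_metadata))  # file:/data/customers.csv
```

Here the `merchants` segment reports `file:/data/customers.csv` as its input file.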
Mark Needham
Mark Needham
Mark Needham
Sukesh Boggavarapu
03/29/2022, 2:53 PM
Sukesh Boggavarapu
03/29/2022, 2:53 PM
Sukesh Boggavarapu
03/29/2022, 2:54 PM
Mark Needham
Mark Needham
Sukesh Boggavarapu
03/29/2022, 2:55 PM
Mark Needham
Sukesh Boggavarapu
03/29/2022, 2:55 PM
Sukesh Boggavarapu
03/29/2022, 2:55 PM
executionFrameworkSpec:
  name: 'standalone'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: '/data'
includeFileNamePattern: 'glob:**/*.csv'
outputDirURI: '/opt/pinot/data/members'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS
recordReaderSpec:
  dataFormat: 'csv'
  className: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReader'
  configClassName: 'org.apache.pinot.plugin.inputformat.csv.CSVRecordReaderConfig'
tableSpec:
  tableName: 'members'
pinotClusterSpecs:
  - controllerURI: 'http://localhost:9000'
Mark Needham
Mark Needham
includeFileNamePattern: 'glob:**/*.csv'
Sukesh Boggavarapu
03/29/2022, 2:56 PM
Mark Needham
Mark Needham
Mark Needham
Sukesh Boggavarapu
03/29/2022, 2:56 PM
includeFileNamePattern: 'glob:**/members.csv'
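To make the difference between the two patterns concrete, here is a rough Python sketch, not from the thread. It uses `fnmatch` as a stand-in for the job's glob matching, which is only an approximation (the two glob dialects differ in general, although they agree on these simple patterns). The file list is an assumption: only `customers.csv` and `members.csv` are named in the thread, and the `/data/members/` location is the one mentioned later. `select_files` is a made-up helper.

```python
from fnmatch import fnmatchcase

# Assumed layout under inputDirURI '/data' (illustrative only).
candidate_files = [
    "/data/customers.csv",
    "/data/members/members.csv",
]


def select_files(include_pattern, files):
    """Return the files matched by an includeFileNamePattern-style value.

    Hypothetical helper: strips the 'glob:' prefix and matches with fnmatch,
    which only approximates the glob syntax used by the ingestion job.
    """
    prefix = "glob:"
    if include_pattern.startswith(prefix):
        pattern = include_pattern[len(prefix):]
    else:
        pattern = include_pattern
    return [path for path in files if fnmatchcase(path, pattern)]


all_csv = select_files("glob:**/*.csv", candidate_files)
members_only = select_files("glob:**/members.csv", candidate_files)

print(all_csv)       # ['/data/customers.csv', '/data/members/members.csv']
print(members_only)  # ['/data/members/members.csv']
```

Under these assumptions, the broad pattern picks up every CSV below `/data`, while the narrower one keeps only `members.csv`.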
Sukesh Boggavarapu
03/29/2022, 2:57 PM
Mark Needham
Sukesh Boggavarapu
03/29/2022, 2:57 PM
Mark Needham
Sukesh Boggavarapu
03/29/2022, 2:58 PM
Sukesh Boggavarapu
03/29/2022, 2:58 PM
Sukesh Boggavarapu
04/06/2022, 2:40 PM
Sukesh Boggavarapu
04/06/2022, 2:40 PM
docker exec -it pinot-controller /opt/pinot/bin/pinot-admin.sh AddTable -tableConfigFile /config/members/members_table.json -schemaFile /config/members/members_schema.json -exec
Sukesh Boggavarapu
04/06/2022, 2:41 PM
docker exec -it pinot-controller /opt/pinot/bin/pinot-admin.sh LaunchDataIngestionJob -jobSpecFile /config/members/members_job-spec.yml
Sukesh Boggavarapu
04/06/2022, 2:42 PM
24 records with duplicate data along with invalid data
Sukesh Boggavarapu
04/06/2022, 2:42 PM
Sukesh Boggavarapu
04/06/2022, 2:43 PM
select * from member where merchant_id=123
Sukesh Boggavarapu
04/06/2022, 2:44 PM
Sukesh Boggavarapu
04/06/2022, 3:05 PM
inputDirURI: '/data/members'
includeFileNamePattern: 'glob:**/*.csv'
Sukesh Boggavarapu
04/06/2022, 3:06 PM
/data/members which is members.csv with 8 records
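One way to narrow down a mismatch like 24 rows in the table versus 8 in the file is to count the data rows in every CSV under the input directory and compare the total with what the query returns. A hedged Python sketch, not from the thread: `count_data_rows` is a made-up helper, it assumes each CSV has a single header row, `rglob("*.csv")` only approximates `glob:**/*.csv`, and the demo uses a throwaway directory with invented rows rather than the real `members.csv`.

```python
import csv
import tempfile
from pathlib import Path


def count_data_rows(input_dir, header=True):
    """Map each CSV under input_dir (recursively) to its number of data rows.

    Hypothetical helper; assumes one header row per file when header=True.
    """
    counts = {}
    for path in sorted(Path(input_dir).rglob("*.csv")):
        with path.open(newline="") as handle:
            rows = sum(1 for _ in csv.reader(handle))
        counts[str(path)] = rows - 1 if header and rows else rows
    return counts


# Demo on a temporary directory with made-up data (column names are assumptions).
with tempfile.TemporaryDirectory() as tmp:
    members_dir = Path(tmp) / "members"
    members_dir.mkdir()
    sample = "id,merchant_id\n" + "".join(f"{i},123\n" for i in range(8))
    (members_dir / "members.csv").write_text(sample)
    demo_counts = count_data_rows(members_dir)
    total_rows = sum(demo_counts.values())

print(total_rows)  # 8
```

If the total on disk matches the expected 8 but the table still returns more, the extra rows are coming from somewhere other than the files in that directory.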