Azri Jamil

07/04/2021, 5:31 AM
Hi, I'm trying to push data from GCS to Pinot. After submitting the job, it seems to do nothing and produces no output at all. This is my job spec:
executionFrameworkSpec:
    name: 'standalone'
    segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
    segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
    segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
jobType: SegmentCreationAndUriPush
inputDirURI: 'gs://mdm-datalake/ais/sentences/'
outputDirURI: '/tmp/ais-pinot/sentences/'
includeFileNamePattern: 'glob:**/**.parquet'
overwriteOutput: true
pinotFSSpecs:
  - scheme: file
    className: org.apache.pinot.spi.filesystem.LocalPinotFS 
  - scheme: gs
    className: org.apache.pinot.plugin.filesystem.GcsPinotFS
    configs:
        projectId: 'aton-analytics'
        gcpKey: '/var/pinot/controller/config/gcs-datalake-key.json'
recordReaderSpec:
    dataFormat: 'parquet'
    className: 'org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader'
tableSpec:
    tableName: 'sentence'
pinotClusterSpecs:
    - controllerURI: 'http://localhost:9000'

Ken Krugler

07/04/2021, 8:20 PM
This looks odd to me:
includeFileNamePattern: 'glob:**/**.parquet'
I think it should be:
includeFileNamePattern: 'glob:**/*.parquet'
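For illustration, here is a minimal Python sketch that compares the two patterns. It approximates only the `**` and `*` rules of Java NIO glob syntax, on the assumption that the pattern is evaluated with those semantics. The helper names and sample paths are hypothetical and do not come from this thread.

```python
import re


def glob_to_regex(glob: str) -> "re.Pattern[str]":
    """Translate a small subset of Java NIO-style glob syntax to a regex.

    Only two wildcards are handled (an approximation, not a full port):
      '**' matches any characters, including '/'
      '*'  matches any characters within a single path segment
    """
    parts = []
    i = 0
    while i < len(glob):
        if glob.startswith("**", i):
            parts.append(".*")
            i += 2
        elif glob[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(glob[i]))
            i += 1
    return re.compile("".join(parts))


def matches(pattern: str, path: str) -> bool:
    """Return True if the whole path matches the 'glob:'-prefixed pattern."""
    glob = pattern[len("glob:"):] if pattern.startswith("glob:") else pattern
    return glob_to_regex(glob).fullmatch(path) is not None


# Hypothetical object paths, for illustration only.
sample_paths = [
    "ais/sentences/2021/part-0000.parquet",
    "ais/sentences/part-0001.parquet",
    "part-0002.parquet",
    "ais/sentences/_SUCCESS",
]

for path in sample_paths:
    original = matches("glob:**/**.parquet", path)
    suggested = matches("glob:**/*.parquet", path)
    print(f"{path}: original={original} suggested={suggested}")
```

Under these assumed semantics, both patterns accept and reject the same sample paths. Both also require at least one `/` before the file name, so a path with no directory component would match neither.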

Azri Jamil

07/05/2021, 1:35 AM
I tried that one before, but got the same result: no output.
Is it because the data was too big?

Ken Krugler

07/05/2021, 5:28 PM
It shouldn’t be a data size issue. I’m assuming there are no errors from the job, right? If so, I’d check the controller log file to see what it says is happening. The most common cause I’ve found for this behavior is that there isn’t any input data, given the provided input directory and file name pattern. So just verify that, from the same server where you’re submitting this job, you can list files in
'gs://mdm-datalake/ais/sentences/'
and that these files match the *.parquet pattern.
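One way to run that check from the submitting server is sketched below in Python. It assumes the `google-cloud-storage` package is installed and that the service account key is readable on that machine. The function names are hypothetical, and the suffix filter is only a rough stand-in for the job's file name pattern.

```python
from typing import Iterable, List, Tuple


def split_gs_uri(uri: str) -> Tuple[str, str]:
    """Split 'gs://bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    if not uri.startswith("gs://"):
        raise ValueError(f"not a gs:// URI: {uri}")
    bucket, _, prefix = uri[len("gs://"):].partition("/")
    return bucket, prefix


def parquet_objects(names: Iterable[str]) -> List[str]:
    """Keep only object names ending in '.parquet'."""
    return [name for name in names if name.endswith(".parquet")]


def list_input_files(uri: str, key_path: str) -> List[str]:
    """List the .parquet objects under a gs:// URI using a service account key."""
    # Imported here so the helpers above work without the package installed.
    from google.cloud import storage

    client = storage.Client.from_service_account_json(key_path)
    bucket, prefix = split_gs_uri(uri)
    blobs = client.list_blobs(bucket, prefix=prefix)
    return parquet_objects(blob.name for blob in blobs)
```

Calling `list_input_files('gs://mdm-datalake/ais/sentences/', '/var/pinot/controller/config/gcs-datalake-key.json')` with the values from the job spec should return a non-empty list if the job can see any input. An empty list or a permissions error would point to the input directory, the pattern, or the credentials.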

Azri Jamil

07/15/2021, 12:52 PM
OK, I will take a look at it again. That does make sense.