# troubleshooting
t
Hi, having a similar but different issue with the ‘standalone’ BatchIngestionJob, where the pod ultimately runs out of ephemeral storage (the job requires downloading more than 10G of data):
The node was low on resource: ephemeral-storage. Container pinot-job-batch-ingestion was using 1944724Ki, which exceeds its request of 0.
I have mounted a persistent volume to the pod executing this job, but the job does not seem to be using it. I am currently mounting it at
/var/pinot/minion/data
and
/var/pinot/server/data
but neither is working. What directory should this volume be mounted to so that the BatchIngestionJob uses the volume instead of ephemeral storage? As a secondary question, is there a simpler way to do this within the Kubernetes cluster running Pinot, or is the standard way to use an external Spark cluster with a custom-compiled Pinot image?
Here is my Kubernetes Job definition for reference:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: batch-job-metadata-configmap
  namespace: datalake
data:
  batch-ingest-job-spec.yaml: |-
    executionFrameworkSpec:
        name: 'standalone'
        segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentGenerationJobRunner'
        segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentTarPushJobRunner'
        segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.standalone.SegmentUriPushJobRunner'
    jobType: SegmentCreationAndTarPush
    inputDirURI: 's3://<omitted>'
    outputDirURI: 's3://<omitted>'
    overwriteOutput: true
    pinotFSSpecs:
        - scheme: s3
          className: org.apache.pinot.plugin.filesystem.S3PinotFS
          configs:
            region: '<omitted>'
    recordReaderSpec:
      dataFormat: 'parquet'
      className: 'org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader'
    tableSpec:
        tableName: 'uplinkpayloadevent'
    pinotClusterSpecs:
        - controllerURI: '<omitted>'
    pushJobSpec:
      pushParallelism: 2
      pushAttempts: 2
      pushRetryIntervalMillis: 1000
      segmentUriPrefix: 's3://'
      segmentUriSuffix: pinot-offline/
---
apiVersion: batch/v1
kind: Job
metadata:
  name: pinot-batch-ingest-job
  namespace: <omitted>
spec:
  template:
    spec:
      containers:
        - name: pinot-batch-ingestion
          image: apachepinot/pinot:latest
          args:
            - "LaunchDataIngestionJob"
            - "-jobSpecFile"
            - "/var/linklabs/batch/batch-ingest-job-spec.yaml"
          env:
            - name: JAVA_OPTS
              value: "-Xms32G -Xmx64G -Dpinot.admin.system.exit=true"
          resources:
            requests:
              memory: "32Gi"
            limits:
              memory: "64Gi"
          envFrom:
            - secretRef:
                name: <omitted>
          volumeMounts:
            - name: batch-job-metadata
              mountPath: /var/linklabs/batch
            - name: data
              mountPath: /var/pinot/server/data
      restartPolicy: OnFailure
      volumes:
        - name: batch-job-metadata
          configMap:
            name: batch-job-metadata-configmap
        - name: data
          persistentVolumeClaim:
            claimName: task-pv-pinot-etl-claim
  backoffLimit: 10
```
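(Side note on the error itself: “exceeds its request of 0” is because the container declares no ephemeral-storage request. Declaring one, with hypothetical sizes below, would at least make the scheduler account for the scratch space, though it would not move the writes onto the persistent volume.)
```yaml
# Hypothetical sizes: ephemeral-storage requests/limits make the scheduler
# account for scratch space; they do not redirect writes to the PV.
resources:
  requests:
    memory: "32Gi"
    ephemeral-storage: "20Gi"
  limits:
    memory: "64Gi"
    ephemeral-storage: "40Gi"
```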
m
@Haitao Zhang ^^
h
ack
m
Also cc: @Seunghyun
k
The SegmentGenerationJobRunner code for standalone writes temp files to the directory returned by
System.getProperty("java.io.tmpdir")
. So if you can set that property to your persistent volume (or maybe a /tmp dir on that volume) then I think it would work as you want.
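For example, an untested sketch against the Job spec above (the tmp subdirectory under the existing “data” PVC mount is just an illustrative choice, and it needs to exist before the JVM starts):
```yaml
# Untested sketch: add -Djava.io.tmpdir to JAVA_OPTS so the standalone job
# runner writes its temp segment files onto the mounted PVC instead of the
# node's ephemeral storage. The tmp subdirectory path is an assumption.
env:
  - name: JAVA_OPTS
    value: "-Xms32G -Xmx64G -Dpinot.admin.system.exit=true -Djava.io.tmpdir=/var/pinot/server/data/tmp"
```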
h
I would agree with this root cause. Maybe another way of fixing the problem is to mount the volume to that dir.
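For example, a minimal sketch reusing the existing “data” PVC from the Job above:
```yaml
# Sketch: mount the PVC at /tmp, the JVM's default java.io.tmpdir, so temp
# files land on the persistent volume without touching any JVM flags.
volumeMounts:
  - name: batch-job-metadata
    mountPath: /var/linklabs/batch
  - name: data
    mountPath: /tmp
```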
t
It does seem like mounting it on the
/tmp
dir is working; the job is still running, but it hasn’t gotten this far before, so that is a good sign!