Edgaras Kryževičius
09/28/2022, 11:47 AM
Caused by: java.lang.IllegalStateException: PinotFS for scheme: abfs has not been initialized
This is the Spark command I am running:
spark-submit \
--class org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand \
--master local \
--deploy-mode client \
--conf "spark.driver.extraJavaOptions=-Dplugins.dir=${PINOT_DISTRIBUTION_DIR}/plugins" \
--conf "spark.driver.extraClassPath=${PINOT_DISTRIBUTION_DIR}/plugins-external/pinot-batch-ingestion/pinot-batch-ingestion-spark-3.2/pinot-batch-ingestion-spark-3.2-${PINOT_VERSION}-shaded.jar:${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar:${PINOT_DISTRIBUTION_DIR}/plugins/pinot-file-system/pinot-adls/pinot-adls-${PINOT_VERSION}-shaded.jar:${PINOT_DISTRIBUTION_DIR}/plugins/pinot-input-format/pinot-parquet/pinot-parquet-${PINOT_VERSION}-shaded.jar" \
--conf "spark.executor.extraClassPath=${PINOT_DISTRIBUTION_DIR}/plugins-external/pinot-batch-ingestion/pinot-batch-ingestion-spark-3.2/pinot-batch-ingestion-spark-3.2-${PINOT_VERSION}-shaded.jar:${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar:${PINOT_DISTRIBUTION_DIR}/plugins/pinot-file-system/pinot-adls/pinot-adls-${PINOT_VERSION}-shaded.jar:${PINOT_DISTRIBUTION_DIR}/plugins/pinot-input-format/pinot-parquet/pinot-parquet-${PINOT_VERSION}-shaded.jar" \
local://${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar -jobSpecFile ${PINOT_DISTRIBUTION_DIR}/SparkIngestionJob.yaml
SparkIngestionJob.yaml:
executionFrameworkSpec:
  name: 'spark'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark3.SparkSegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark3.SparkSegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark3.SparkSegmentUriPushJobRunner'
  segmentMetadataPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark3.SparkSegmentMetadataPushJobRunner'
  extraConfigs:
    stagingDir: examples/batch/airlineStats/staging
jobType: SegmentCreationAndTarPush
inputDirURI: 'abfs://fs@accountname/...'
includeFileNamePattern: 'glob:**/*.avro'
outputDirURI: 'examples/batch/airlineStats/segments'
overwriteOutput: true
pinotFSSpecs:
  - scheme: adl2
    className: org.apache.pinot.plugin.filesystem.ADLSGen2PinotFS
    configs:
      accountName: '..'
      accessKey: '..'
      fileSystemName: '..'
recordReaderSpec:
  dataFormat: 'avro'
  className: 'org.apache.pinot.plugin.inputformat.avro.AvroRecordReader'
tableSpec:
  tableName: 'airlineStats'
  schemaURI: 'http://20.207.206.121:9000/tables/airlineStats/schema'
  tableConfigURI: 'http://20.207.206.121:9000/tables/airlineStats'
segmentNameGeneratorSpec:
  type: normalizedDate
  configs:
    segment.name.prefix: 'airlineStats_batch'
    exclude.sequence.id: true
pinotClusterSpecs:
  - controllerURI: 'http://20.207.206.121:9000'
pushJobSpec:
  pushParallelism: 2
  pushAttempts: 2
  pushRetryIntervalMillis: 1000
I am also attaching my values.yml file, which is used to deploy Pinot using Helm.
Edgaras Kryževičius
09/28/2022, 1:16 PM
Mayank
Navina
09/28/2022, 6:08 PM
abfs and adl2?
Edgaras Kryževičius
09/29/2022, 3:42 PM
Mayank
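
Navina's question points at a possible mismatch: the job spec registers a file system under the scheme adl2, while inputDirURI uses abfs, which is the scheme named in the exception. The following is a minimal sketch of that check, assuming the file system is looked up by URI scheme. The helper name unregistered_schemes is hypothetical and not part of Pinot, and the values are copied from the job spec above with the elided path left as-is.

```python
from urllib.parse import urlparse


def unregistered_schemes(uris, pinot_fs_specs):
    """Return URI schemes that have no matching entry in pinotFSSpecs."""
    registered = {spec["scheme"] for spec in pinot_fs_specs}
    # Assumption: a path without a scheme is a local file and needs no spec.
    used = {urlparse(uri).scheme or "file" for uri in uris}
    return sorted(used - registered - {"file"})


# Values taken from SparkIngestionJob.yaml in this thread.
pinot_fs_specs = [{"scheme": "adl2"}]
uris = [
    "abfs://fs@accountname/...",             # inputDirURI
    "examples/batch/airlineStats/segments",  # outputDirURI
]

missing = unregistered_schemes(uris, pinot_fs_specs)
print(missing)  # prints ['abfs']
```

If the check reports a scheme, one option is to make the scheme in pinotFSSpecs and the scheme in the URIs agree, whichever of the two the deployment expects.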