Abhay Rawat
06/28/2022, 5:03 PM
Caused by: java.lang.NoSuchMethodError: 'org.apache.pinot.shaded.org.apache.commons.configuration.PropertiesConfiguration org.apache.pinot.spi.env.CommonsConfigurationUtils.fromFile(java.io.File)'
at org.apache.pinot.segment.spi.index.metadata.SegmentMetadataImpl.getPropertiesConfiguration(SegmentMetadataImpl.java:161)
Error on spark 3.2
Exception in thread "main" java.lang.ExceptionInInitializerError
Caused by: java.lang.NullPointerException
at org.apache.commons.lang3.SystemUtils.isJavaVersionAtLeast(SystemUtils.java:1626)
both on JDK 11, on AWS EMR
I think it’s this particular combination (Pinot, Spark, S3, Parquet) that’s not working. I am trying to remove some of them to narrow down the problem. Just wanted to know if this has worked for anyone.
Abhay Rawat
06/28/2022, 5:04 PM
cat > /tmp/spark_batch_job_spec.yml << EOF
executionFrameworkSpec:
  name: 'spark'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner'
  segmentTarPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentTarPushJobRunner'
  segmentUriPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentUriPushJobRunner'
  segmentMetadataPushJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentMetadataPushJobRunner'
jobType: SegmentCreationAndTarPush
inputDirURI: 's3://pinot-ec2-poc/source_data/small/mytable/'
includeFileNamePattern: 'glob:**/*.parquet'
outputDirURI: s3://pinot-ec2-poc/data/small/mytable
overwriteOutput: true
pinotFSSpecs:
  - scheme: s3
    className: org.apache.pinot.plugin.filesystem.S3PinotFS
    configs:
      region: us-west-2
recordReaderSpec:
  dataFormat: 'parquet'
  className: 'org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader'
tableSpec:
  tableName: 'mytable'
  schemaURI: 'http://<controller>:9000/tables/mytable/schema'
  tableConfigURI: 'http://<controller>:9000/tables/mytable'
pinotClusterSpecs:
  - controllerURI: 'http://<controller>:9000'
EOF
Abhay Rawat
06/28/2022, 5:04 PM
sudo spark-submit \
--class org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand \
--deploy-mode client \
--jars "/opt/pinot/plugins-external/pinot-batch-ingestion/pinot-batch-ingestion-spark-3.2/pinot-batch-ingestion-spark-3.2-0.11.0-SNAPSHOT-shaded.jar,/opt/pinot/plugins/pinot-file-system/pinot-s3/pinot-s3-0.11.0-SNAPSHOT-shaded.jar,/opt/pinot/lib/pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar,/opt/pinot/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.11.0-SNAPSHOT-shaded.jar" \
--conf "spark.driver.extraJavaOptions=-Dlog4j2.configurationFile=/opt/pinot/conf/pinot-ingestion-job-log4j2.xml" \
--conf "spark.driver.extraClassPath=/opt/pinot/plugins-external/pinot-batch-ingestion/pinot-batch-ingestion-spark-3.2/pinot-batch-ingestion-spark-3.2-0.11.0-SNAPSHOT-shaded.jar:/opt/pinot/plugins/pinot-file-system/pinot-s3/pinot-s3-0.11.0-SNAPSHOT-shaded.jar:/opt/pinot/lib/pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:/opt/pinot/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.11.0-SNAPSHOT-shaded.jar" \
--conf "spark.executor.extraClassPath=/opt/pinot/plugins-external/pinot-batch-ingestion/pinot-batch-ingestion-spark-3.2/pinot-batch-ingestion-spark-3.2-0.11.0-SNAPSHOT-shaded.jar:/opt/pinot/plugins/pinot-file-system/pinot-s3/pinot-s3-0.11.0-SNAPSHOT-shaded.jar:/opt/pinot/lib/pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar:/opt/pinot/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.11.0-SNAPSHOT-shaded.jar" \
/opt/pinot/lib/pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar -jobSpecFile /tmp/spark_batch_job_spec.yml
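The command above lists the same four jars three times: comma-separated for --jars and colon-separated for the two extraClassPath settings. When rechecking jars and confs, it may help to define the list once and derive both forms from it, so the three settings cannot drift apart. The following is only a sketch of that idea, not a fix for the errors reported here; the function names join_by and pinot_jars and the variables JARS_CSV and CLASSPATH are hypothetical, and the paths are copied from the command above.

```shell
#!/bin/sh
# Hypothetical helper: join all remaining arguments with the separator given in $1.
join_by() {
  sep="$1"
  shift
  out=""
  for item in "$@"; do
    # Prepend the separator only when something has already been collected.
    out="${out:+$out$sep}$item"
  done
  printf '%s\n' "$out"
}

# Hypothetical helper: the four jars from the spark-submit command, listed once.
# $1 is the separator to join them with.
pinot_jars() {
  join_by "$1" \
    "/opt/pinot/plugins-external/pinot-batch-ingestion/pinot-batch-ingestion-spark-3.2/pinot-batch-ingestion-spark-3.2-0.11.0-SNAPSHOT-shaded.jar" \
    "/opt/pinot/plugins/pinot-file-system/pinot-s3/pinot-s3-0.11.0-SNAPSHOT-shaded.jar" \
    "/opt/pinot/lib/pinot-all-0.11.0-SNAPSHOT-jar-with-dependencies.jar" \
    "/opt/pinot/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.11.0-SNAPSHOT-shaded.jar"
}

JARS_CSV="$(pinot_jars ,)"    # comma-separated, for --jars
CLASSPATH="$(pinot_jars :)"   # colon-separated, for the extraClassPath settings

# Print the three options so they can be compared against the command in use.
echo "--jars \"$JARS_CSV\""
echo "--conf \"spark.driver.extraClassPath=$CLASSPATH\""
echo "--conf \"spark.executor.extraClassPath=$CLASSPATH\""
```

The printed options can be pasted into the spark-submit call, or the two variables can be referenced directly if the command is run from the same script.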
Mayank
Kartik Khare
06/29/2022, 6:17 AM
Kartik Khare
06/29/2022, 6:18 AM
Mayank
Kartik Khare
06/29/2022, 3:49 PM
Mayank
Kartik Khare
06/29/2022, 3:51 PM
Mayank
Abhay Rawat
07/05/2022, 4:30 PM
java.lang.NullPointerException
on
java.lang.RuntimeException: Caught exception during running - org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentTarPushJobRunner
Rechecked everything; confs, jars etc. look fine. Any idea what would be causing this?
Kartik Khare
07/05/2022, 4:31 PM