Andy Cooper
11/03/2022, 9:06 PM v0.11.0-SNAPSHOT to v0.11.0.
Since we are using Java 8 and have a hard dependency on Spark 2.4 right now, I had to compile from source. Now, when running the same Spark ingestion job that works on v0.11.0-SNAPSHOT, we receive the following error with the new jars:
Can't construct a java object for tag:yaml.org,2002:org.apache.pinot.spi.ingestion.batch.spec.SegmentGenerationJobSpec; exception=Class not found: org.apache.pinot.spi.ingestion.batch.spec.SegmentGenerationJobSpec
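One way to narrow down a "Class not found" error like this is to confirm that a given jar actually packages the class named in the message. A minimal sketch, assuming the jar is an ordinary zip archive readable from wherever the check runs; `class_entry` and `jar_contains_class` are hypothetical helpers, not part of Pinot:

```python
import zipfile


def class_entry(class_name: str) -> str:
    """Map a fully qualified Java class name to its entry path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jar_contains_class(jar_path: str, class_name: str) -> bool:
    """Return True if the jar (a zip archive) has an entry for the class."""
    with zipfile.ZipFile(jar_path) as jar:
        return class_entry(class_name) in jar.namelist()
```

Calling `jar_contains_class(path_to_jar, "org.apache.pinot.spi.ingestion.batch.spec.SegmentGenerationJobSpec")` for each jar on the classpath shows which ones package the class. A `True` result only shows the class is packaged; it does not show that the JVM parsing the YAML can see that jar.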
We have been looking at this for a while now, and I believe we are at the end of the line and out of ideas on where to look next.
Andy Cooper
11/03/2022, 9:09 PM mvn clean install -DskipTests -Pbin-dist -Djdk.version=8
The resulting folder structure and jars look as expected.
Andy Cooper
11/03/2022, 9:11 PM jobSpecFile YAML is valid, as determined by successfully running the job with v0.11.0-SNAPSHOT.
Andy Cooper
11/03/2022, 9:14 PM spark_args we are passing to the ingestion command:
PINOT_SPARK_ARGS = {
    "spark.driver.extraJavaOptions": "-Dplugins.dir=/mnt/pinot/apache-pinot-0.11.0-bin/plugins -Dplugins.include=pinot-s3,pinot-parquet -Dlog4j2.configurationFile=/mnt/pinot/apache-pinot-0.11.0-bin/conf/pinot-ingestion-job-log4j2.xml",
    "spark.executor.extraJavaOptions": "-Dplugins.dir=/mnt/pinot/apache-pinot-0.11.0-bin/plugins -Dplugins.include=pinot-s3,pinot-parquet -Dlog4j2.configurationFile=/mnt/pinot/apache-pinot-0.11.0-bin/conf/pinot-ingestion-job-log4j2.xml",
    "spark.driver.extraClassPath": "/mnt/pinot/apache-pinot-0.11.0-bin/plugins-external/pinot-batch-ingestion/pinot-batch-ingestion-spark-2.4/pinot-batch-ingestion-spark-2.4-0.11.0-shaded.jar:/mnt/pinot/apache-pinot-0.11.0-bin/lib/pinot-all-0.11.0-jar-with-dependencies.jar:/mnt/pinot/apache-pinot-0.11.0-bin/plugins/pinot-file-system/pinot-s3/pinot-s3-0.11.0-shaded.jar:/mnt/pinot/apache-pinot-0.11.0-bin/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.11.0-shaded.jar",
    "spark.executor.extraClassPath": "/mnt/pinot/apache-pinot-0.11.0-bin/plugins-external/pinot-batch-ingestion/pinot-batch-ingestion-spark-2.4/pinot-batch-ingestion-spark-2.4-0.11.0-shaded.jar:/mnt/pinot/apache-pinot-0.11.0-bin/lib/pinot-all-0.11.0-jar-with-dependencies.jar:/mnt/pinot/apache-pinot-0.11.0-bin/plugins/pinot-file-system/pinot-s3/pinot-s3-0.11.0-shaded.jar:/mnt/pinot/apache-pinot-0.11.0-bin/plugins/pinot-input-format/pinot-parquet/pinot-parquet-0.11.0-shaded.jar",
}
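Both `extraClassPath` values are colon-separated lists of jar paths, so a quick sanity check is to split them and report any entry that is missing on disk. A minimal sketch; `missing_classpath_entries` is a hypothetical helper, and it only checks the machine it runs on, which may not be the host where the Spark driver or executors run:

```python
import os
from typing import List


def missing_classpath_entries(classpath: str) -> List[str]:
    """Return the colon-separated classpath entries that are not existing files."""
    entries = [entry for entry in classpath.split(":") if entry]
    return [entry for entry in entries if not os.path.isfile(entry)]
```

For example, `missing_classpath_entries(PINOT_SPARK_ARGS["spark.driver.extraClassPath"])` returns an empty list when every jar in the driver classpath is present.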
All files exist in the appropriate paths, as expected.
Andy Cooper
11/03/2022, 9:16 PM
Andy Cooper
11/03/2022, 9:23 PM org.yaml.snakeyaml.Yaml.loadAs method's ability to find the correct class path.
Although I do find it very strange that it is unable to load a class that should be in the same module or project (same pom.xml path) as the calling method.
Mayank
11/03/2022, 10:41 PM
Kartik Khare
11/04/2022, 2:32 PM pinot-all jar not being loaded properly. The complete command would be helpful.
Andy Cooper
11/04/2022, 4:45 PM
Kartik Khare
11/07/2022, 8:12 AM Try replacing
'file': 'hdfs:///user/pinot/pinot-all-0.11.0-jar-with-dependencies.jar'
with
'file': 'local://pinot-all-0.11.0-jar-with-dependencies.jar'
The jar file is already copied from HDFS to the local machine (since it is listed in --jars), so specifying a local path should work. For me, this fixed the issue most of the time.
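The substitution described above keeps only the jar's file name and switches the scheme to `local://`. A minimal sketch, assuming the job definition is a Python dict with a `file` key as in the snippet; `to_local_jar` is a hypothetical helper, and whether a bare file name resolves under `local://` depends on the jar having been shipped with `--jars`, as the message notes:

```python
import posixpath
from urllib.parse import urlparse


def to_local_jar(uri: str) -> str:
    """Rewrite a jar URI to local://<file name>, dropping the scheme and directories."""
    name = posixpath.basename(urlparse(uri).path)
    return "local://" + name
```

Applied to a job dict, this would look like `job["file"] = to_local_jar(job["file"])`, where `job` stands in for whatever structure holds the `file` entry.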