# general
m
Hi Team, I am trying to submit a Spark ingestion job but am getting a `main` method not found exception. I checked the `LaunchDataIngestionJobCommand` class code, and it does not have a `main` method.
```
Exception in thread "main" java.lang.NoSuchMethodException: org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand.main([Ljava.lang.String;)
	at java.base/java.lang.Class.getMethod(Class.java:2108)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:42)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:955)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1043)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1052)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```
Configuration -
```bash
#!/usr/bin/env bash

export HADOOP_CONF_DIR=/Users/$USER/servers/hadoop-2.7.1/etc/hadoop
export SPARK_DIST_CLASSPATH=$(/Users/$USER/servers/hadoop-2.7.1/bin/hadoop classpath)

export PINOT_VERSION=0.9.3
export PINOT_DISTRIBUTION_DIR="/Users/$USER/servers/pinot-${PINOT_VERSION}"

export SPARK_VERSION=3.2.0
export SPARK_HOME="/Users/$USER/servers/spark-${SPARK_VERSION}-bin-without-hadoop"

${SPARK_HOME}/bin/spark-submit \
  --class org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand \
  --master "local[2]" \
  --deploy-mode client \
  --conf "spark.driver.extraJavaOptions=-Dplugins.dir=${PINOT_DISTRIBUTION_DIR}/plugins -Dlog4j2.configurationFile=${PINOT_DISTRIBUTION_DIR}/conf/pinot-ingestion-job-log4j2.xml" \
  --conf "spark.driver.extraClassPath=${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar" local://${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar \
  -jobSpecFile ${PINOT_DISTRIBUTION_DIR}/examples/batch/airlineStats/sparkIngestionJobSpec.yaml
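A quick way to double-check that finding (assuming a local JDK with `javap` on the PATH) is to disassemble the class straight from the uber jar:
```bash
# Print LaunchDataIngestionJobCommand's members and grep for a static main();
# no match confirms why spark-submit's reflective lookup fails above.
javap -cp "${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar" \
  org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand \
  | grep 'public static void main' \
  || echo "no static main(String[]) on this class"
```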
a
I faced a similar issue. Since `LaunchDataIngestionJobCommand` has no `main` method, point `--class` at `org.apache.pinot.tools.admin.PinotAdministrator` (which does) and pass `LaunchDataIngestionJob` to it as a subcommand. Try this command:
```bash
$SPARK_HOME/bin/spark-submit \
  --class org.apache.pinot.tools.admin.PinotAdministrator \
  --master "local[3]" \
  --deploy-mode client \
  --conf "spark.driver.extraJavaOptions=-Dplugins.dir=${PINOT_DISTRIBUTION_DIR}/plugins -Dlog4j2.configurationFile=${PINOT_DISTRIBUTION_DIR}/conf/pinot-ingestion-job-log4j2.xml" \
  --conf "spark.driver.extraClassPath=${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar" \
  ${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar \
  LaunchDataIngestionJob \
  -jobSpecFile
```
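If spark-submit keeps getting in the way, the same subcommand can also be sanity-checked with Pinot's standalone launcher first (a sketch, reusing the spec path from the first message):
```bash
# Run the ingestion job through pinot-admin.sh instead of spark-submit,
# to separate job-spec/plugin problems from Spark classpath problems.
${PINOT_DISTRIBUTION_DIR}/bin/pinot-admin.sh LaunchDataIngestionJob \
  -jobSpecFile ${PINOT_DISTRIBUTION_DIR}/examples/batch/airlineStats/sparkIngestionJobSpec.yaml
```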
m
This didn’t work. There is some issue with the `pinot-all-0.9.3-jar-with-dependencies.jar` uber jar: the Java ServiceLoader blows up while loading `StartKafkaCommand`, even though I specified `LaunchDataIngestionJob` as the argument. I will try passing the individual jars on the classpath (the classpath that the pinot-admin.sh script builds).
```
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.pinot.tools.admin.command.StartKafkaCommand.<init>(StartKafkaCommand.java:51)
	at org.apache.pinot.tools.admin.PinotAdministrator.<clinit>(PinotAdministrator.java:98)
Caused by: java.util.NoSuchElementException
	at java.base/java.util.ServiceLoader$2.next(ServiceLoader.java:1309)
```
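Note the trace: the failure is in `PinotAdministrator.<clinit>`, i.e. during static class initialization, which constructs every subcommand before any of them runs. That's why the `LaunchDataIngestionJob` argument never gets a chance to matter. To see which plugin jars that ServiceLoader lookup would presumably need on the classpath (a sketch, assuming the stock distribution layout):
```bash
# List the jars under the plugins dir; the Kafka connector among these is
# what the ServiceLoader lookup in KafkaStarterUtils is searching for.
find "${PINOT_DISTRIBUTION_DIR}/plugins" -name '*.jar' | sort
```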
a
The uber jar doesn't seem to be working; it could be some classloading issue. Try adding all the jars inside `${PINOT_DISTRIBUTION_DIR}/plugins` to `spark.driver.extraClassPath`. I created a shell script to add the jars. I think this is not the best way, but it works 🙂
```bash
# Build a colon-separated classpath from every jar under the plugins dir.
PLUGINS_DIR=${PINOT_DISTRIBUTION_DIR}/plugins
PLUGIN_JARS=$(find "$PLUGINS_DIR" -name '*.jar')
for PLUGIN_JAR in $PLUGIN_JARS; do
  if [ -n "$PLUGINS_CLASSPATH" ]; then
    PLUGINS_CLASSPATH=$PLUGINS_CLASSPATH:$PLUGIN_JAR
  else
    PLUGINS_CLASSPATH=$PLUGIN_JAR
  fi
done


# Same submit command as before, with the plugin jars appended to the
# driver classpath.
$SPARK_HOME/bin/spark-submit \
  --class org.apache.pinot.tools.admin.PinotAdministrator \
  --master "local[3]" \
  --deploy-mode client \
  --conf "spark.driver.extraJavaOptions=-Dplugins.dir=${PINOT_DISTRIBUTION_DIR}/plugins -Dlog4j2.configurationFile=${PINOT_DISTRIBUTION_DIR}/conf/pinot-ingestion-job-log4j2.xml" \
  --conf "spark.driver.extraClassPath=${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar:${PLUGINS_CLASSPATH}" \
  ${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar \
  LaunchDataIngestionJob \
  -jobSpecFile
```
I am using Spark 2.4.8.
m
Yes, I tried the same but got a Spark-related exception: Spark was unable to find some Scala method, probably related to the Spark version (3.2.0-bin-without-hadoop) I am using. Thanks for sharing your Spark version and script 👍
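For what it's worth, the Scala line of a Spark build is easy to confirm. The mismatch theory above is only a guess, but as far as I know Pinot 0.9.x's Spark batch ingestion plugin targets Spark 2.x, so a Scala 2.12/2.13-based Spark 3.2.0 could plausibly surface missing-method errors:
```bash
# Print the Spark build's version banner, which includes the Scala
# version it was compiled against (e.g. "Using Scala version 2.12.x").
${SPARK_HOME}/bin/spark-submit --version
```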
a
@User: I faced a similar error to yours, where it's trying to run `StartKafkaCommand` even if I pass `LaunchDataIngestionJob`. How did you get around this issue? Any pointers would be helpful, thanks.
@User ^ FYI
```
22/01/14 18:06:54 INFO ApplicationMaster: Waiting for spark context initialization...
Exception in thread "Driver" java.lang.ExceptionInInitializerError
	at org.apache.pinot.tools.admin.command.StartKafkaCommand.<init>(StartKafkaCommand.java:51)
	at org.apache.pinot.tools.admin.PinotAdministrator.<clinit>(PinotAdministrator.java:98)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:678)
Caused by: java.util.NoSuchElementException
	at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:365)
	at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
	at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
	at org.apache.pinot.tools.utils.KafkaStarterUtils.getKafkaConnectorPackageName(KafkaStarterUtils.java:54)
	at org.apache.pinot.tools.utils.KafkaStarterUtils.<clinit>(KafkaStarterUtils.java:46)
	... 7 more
22/01/14 18:06:54 ERROR ApplicationMaster: Uncaught exception: 
java.lang.IllegalStateException: User did not initialize spark context!
	at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:485)
	at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:305)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:773)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
	at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:772)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:244)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:797)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
22/01/14 18:06:54 INFO ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: java.lang.IllegalStateException: User did not initialize spark context!
	at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:485)
	at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:305)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:773)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
	at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:772)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:244)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:797)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
```
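One difference from the earlier runs: this trace goes through the YARN `ApplicationMaster`, so the driver is not on the submitting machine, and local `spark.driver.extraClassPath` entries must exist on the cluster node. A hedged sketch of shipping the plugin jars with the application instead (the `--jars` conversion and the reuse of `PLUGINS_CLASSPATH` from the loop above are assumptions, not something verified in this thread):
```bash
# YARN cluster mode: upload the plugin jars with the app via --jars
# (comma-separated) rather than relying on node-local classpath entries.
# PLUGINS_CLASSPATH is the colon-separated list built by the loop above.
$SPARK_HOME/bin/spark-submit \
  --class org.apache.pinot.tools.admin.PinotAdministrator \
  --master yarn \
  --deploy-mode cluster \
  --jars "$(echo "${PLUGINS_CLASSPATH}" | tr ':' ',')" \
  ${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar \
  LaunchDataIngestionJob \
  -jobSpecFile
```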
a
Try the second script in the thread above, which adds all the plugin jars to the classpath: https://apache-pinot.slack.com/archives/CDRCA57FC/p1641878863065900?thread_ts=1641806567.053000&cid=CDRCA57FC
m
It's an issue with the latest Pinot version's distribution. Use Pinot 0.8.0.
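With the env-var layout from the first message, downgrading is just repointing two variables (assuming a 0.8.0 distribution unpacked alongside the others):
```bash
# Repoint the scripts above at the 0.8.0 distribution.
export PINOT_VERSION=0.8.0
export PINOT_DISTRIBUTION_DIR="/Users/$USER/servers/pinot-${PINOT_VERSION}"
```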
a
Thanks. I will try out both options @User, @User