# troubleshooting
n
👋 Attempting to debug an error running a Pinot Spark batch ingestion job. Using the Pinot 0.10.0 release with JDK 8, built via

```shell
mvn clean install -DskipTests -Pbin-dist -T 4 -Djdk.version=8
```

Getting this error:
```
Caused by: shaded.com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.10.0
	at shaded.com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
	at shaded.com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
	at shaded.com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:808)
	at org.apache.spark.util.JsonProtocol$.<init>(JsonProtocol.scala:59)
	at org.apache.spark.util.JsonProtocol$.<clinit>(JsonProtocol.scala)
	... 32 more
```
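For context on the error itself: `jackson-module-scala` checks at registration time that the Jackson core it is running against matches the version it was built for, so this failure usually means Spark's `JsonProtocol` is loading a mismatched, shaded Jackson from the Pinot fat jar. A quick, hypothetical way to confirm which shaded Jackson classes the distribution bundles (the jar name is a placeholder; adjust to your build output):

```shell
# Lists the shaded Jackson classes packed into the Pinot fat jar
# (jar name is a placeholder for your actual build artifact).
jar tf pinot-all-0.10.0-jar-with-dependencies.jar \
  | grep 'shaded/com/fasterxml/jackson' \
  | head
```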
job spec is here
```yaml
executionFrameworkSpec:
  name: 'spark'
  segmentGenerationJobRunnerClassName: 'org.apache.pinot.plugin.ingestion.batch.spark.SparkSegmentGenerationJobRunner'
  extraConfigs:
    stagingDir: 's3://nikhil-dw-dev/pinot/staging/'
    dependencyJarDir: 's3://nikhil-dw-dev/pinot/apache-pinot-incubating-0.7.1-bin/plugins'
jobType: SegmentCreation
inputDirURI: 's3://nikhil-dw-dev/pinot/pinot_input/'
includeFileNamePattern: 'glob:**/*.parquet'
outputDirURI: 's3://nikhil-dw-dev/pinot/pinot_output3/'
overwriteOutput: true
pinotFSSpecs:
  - className: org.apache.pinot.plugin.filesystem.S3PinotFS
    scheme: s3
    configs:
      region: us-east-1
recordReaderSpec:
  dataFormat: 'parquet'
  className: 'org.apache.pinot.plugin.inputformat.parquet.ParquetRecordReader'
tableSpec:
  tableName: 'students'
  schemaURI: 's3://nikhil-dw-dev/pinot/students_schema.json'
  tableConfigURI: 's3://nikhil-dw-dev/pinot/students_table.json'
```
m
@Kartik Khare ^^
k
Hi nikhil, we have made some changes in the Spark plugin to address this error. You will need to build from our master branch. It's fine if the deployed Pinot is 0.10, as master is compatible with it.
👋 1
n
got the following error when building from master:
```
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  02:42 min (Wall Clock)
[INFO] Finished at: 2022-05-21T05:31:00Z
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M2:enforce (enforce-dependency-convergence) on project pinot-spark: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed. -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M2:enforce (enforce-dependency-convergence) on project pinot-spark: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed.
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:200)
    at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:196)
    at java.util.concurrent.FutureTask.run (FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:511)
    at java.util.concurrent.FutureTask.run (FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624)
    at java.lang.Thread.run (Thread.java:748)
Caused by: org.apache.maven.plugin.MojoExecutionException: Some Enforcer rules have failed. Look above for specific messages explaining why the rule failed.
    at org.apache.maven.plugins.enforcer.EnforceMojo.execute (EnforceMojo.java:235)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:200)
    at org.apache.maven.lifecycle.internal.builder.multithreaded.MultiThreadedBuilder$1.call (MultiThreadedBuilder.java:196)
    at java.util.concurrent.FutureTask.run (FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call (Executors.java:511)
    at java.util.concurrent.FutureTask.run (FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:624)
    at java.lang.Thread.run (Thread.java:748)
[ERROR]
```
ran this command:

```shell
mvn clean install -DskipTests -Pbin-dist -T 4 -Djdk.version=8 -e
```
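Not part of the original thread, but a sketch of how one might narrow down a dependency-convergence failure like this (the module id `pinot-spark` comes from the error message above; everything else is a generic Maven technique, not something Kartik prescribed):

```shell
# Rebuild only the failing module (plus what it depends on) without -T 4,
# so the enforcer's convergence report isn't interleaved with other modules
mvn clean install -DskipTests -Djdk.version=8 -pl :pinot-spark -am -e

# Optionally inspect the conflicting versions directly; -Dverbose keeps
# the omitted duplicate/conflict entries visible in the tree
mvn dependency:tree -Dverbose -pl :pinot-spark
```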
java version:

```
openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~18.04-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode)
```
mvn version:

```
Apache Maven 3.6.0
Maven home: /usr/share/maven
Java version: 1.8.0_312, vendor: Private Build, runtime: /usr/lib/jvm/java-8-openjdk-amd64/jre
Default locale: en, platform encoding: UTF-8
OS name: "linux", version: "5.4.0-1071-aws", arch: "amd64", family: "unix"
```
k
Sent you a link to the published jars in a DM. Please try with that and let me know if it works. Also, is your master branch rebased?
n
thank you. yes, I had just pulled master
this is resolved now. @Kartik Khare shared the JDK 8 tar file for the latest master build, and using that worked: the ingestion job ran successfully. thank you Kartik for helping debug the issue 🙇
t
Hello everybody, I have the same problem here @Kartik Khare. Would you be able to provide the jars to me as well? Thank you 🙂
k
DM'd you the link.
🙏 1
t
@Kartik Khare Thanks for helping. It worked by compiling 0.11.0-SNAPSHOT and using it to push the segments. FYI, I needed to add `--jars` to the spark-submit command to propagate the plugin jars to the workers.
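The `--jars` addition mentioned above would look roughly like the following sketch, based on the spark-submit pattern in the Pinot docs. The environment variables, deploy mode, and plugin jar paths are placeholders, not the exact command from this thread:

```shell
# Sketch only: PINOT_DISTRIBUTION_DIR, PINOT_VERSION, the master/deploy-mode,
# and the plugin jar paths are placeholders for your environment.
spark-submit \
  --class org.apache.pinot.tools.admin.command.LaunchDataIngestionJobCommand \
  --master yarn --deploy-mode cluster \
  --jars "${PINOT_DISTRIBUTION_DIR}/plugins/pinot-s3-shaded.jar,${PINOT_DISTRIBUTION_DIR}/plugins/pinot-parquet-shaded.jar" \
  "${PINOT_DISTRIBUTION_DIR}/lib/pinot-all-${PINOT_VERSION}-jar-with-dependencies.jar" \
  -jobSpecFile sparkIngestionJobSpec.yaml
```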
k
Happy to help!! Thanks for providing the logs to figure out the issue. I have updated the FAQ section for future reference - https://docs.pinot.apache.org/basics/data-import/batch-ingestion/spark
🙏 2