# getting-started
b
Hi. I'm trying to set up the Spark agent on an Azure Databricks cluster. Has anyone gotten that to work? I'm getting a NullPointerException when I run my notebook. I've tried various Spark runtime versions (2.4.5, 3.2.1 and 3.3.0), all with the same result. I'm using spark.jars.packages io.acryl:datahub-spark-lineage:0.8.36 Any ideas? EDIT: Never mind. Found a Databricks-specific jar.
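For reference, the setup I was attempting looks roughly like this. This is only a minimal sketch, assuming the Maven coordinates above plus the listener and server keys from the DataHub Spark lineage docs; the GMS endpoint is a placeholder, and on Databricks these keys normally go in the cluster's Spark config rather than the notebook.
```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of the config the DataHub Spark lineage agent expects.
// On Databricks, set these in the cluster's "Spark config" box instead of
// building the session in the notebook.
val spark = SparkSession.builder()
  .appName("datahub-lineage-example")
  // Pull the agent jar from Maven Central (the thread below ends up using a
  // Databricks-specific jar from #integration-databricks-datahub instead).
  .config("spark.jars.packages", "io.acryl:datahub-spark-lineage:0.8.36")
  // Register the listener that emits lineage events.
  .config("spark.extraListeners", "datahub.spark.DatahubSparkListener")
  // Placeholder DataHub GMS endpoint; replace with your own instance.
  .config("spark.datahub.rest.server", "http://my-datahub-gms:8080")
  .getOrCreate()
```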
m
Hi @bumpy-portugal-86782: trying to understand, what was the resolution here?
Also, a gentle reminder to refrain from posting large stack traces in the main message, as it clutters up the feed; you can post a shorter main message and then add the stack trace in the thread.
b
My apologies. The solution is in #integration-databricks-datahub. Someone posted a PDF and a jar with Databricks support. It started working once I used that version instead of the jar file on Maven Central.
Moving the stack trace here for completeness / searchability.
ERROR DatahubSparkListener: java.lang.NullPointerException
	at datahub.spark.DatahubSparkListener$3.apply(DatahubSparkListener.java:255)
	at datahub.spark.DatahubSparkListener$3.apply(DatahubSparkListener.java:251)
	at scala.Option.foreach(Option.scala:407)
	at datahub.spark.DatahubSparkListener.processExecutionEnd(DatahubSparkListener.java:251)
	at datahub.spark.DatahubSparkListener.onOtherEvent(DatahubSparkListener.java:238)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:102)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:118)
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:102)
	at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:105)
	at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:105)
	at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:100)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:96)
	at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1623)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:96)