# troubleshooting
Re: Spark Batch Ingestion. Is there a recommended pattern for monitoring the health of a Spark batch ingestion job? I'm seeing a semi-regular `org.apache.spark.SparkException` in our ingestion job, which leads to the staging segments being purged and no output landing in `outputDirURI`. I see this error in the logs:

`ERROR [LaunchDataIngestionJobCommand] [Driver] Got exception to kick off standalone data ingestion job`

but the Spark application still exits as a success:

`INFO [ApplicationMaster] [Driver] Final app status: SUCCEEDED, exitCode: 0`

We have implemented a simple DQ check that verifies the presence of > 0 bytes of data in `outputDirURI` to determine whether the Spark job actually succeeded, but I'm wondering if others have a more elegant solution here.