# troubleshooting
g
Hi team, when I was running the Spark Pinot batch ingestion with S3 parquet data as input, I noticed that the job fails when there is an empty file / non-parquet file in the input folder. But it's very common for upstream data produced by Spark to have an empty _SUCCESS marker file in the folder. I wonder if it is possible to let the ingestion job ignore these non-parquet / empty files by changing some config; otherwise we will need to clean up the _SUCCESS file every time for Pinot ingestion jobs.
k
IIRC there is a file pattern config to include and exclude files
๐Ÿ‘ 1
๐Ÿ‘€ 1
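For reference, a sketch of what that could look like in the batch ingestion job spec YAML, assuming the `includeFileNamePattern` / `excludeFileNamePattern` fields are what's meant here (the S3 paths below are placeholders):

```yaml
# Fragment of a Pinot batch ingestion job spec (paths are hypothetical).
inputDirURI: 's3://my-bucket/processed/'      # placeholder input folder
# Only pick up parquet files, so _SUCCESS and other markers are skipped:
includeFileNamePattern: 'glob:**/*.parquet'
# Or instead explicitly exclude the Spark marker file:
# excludeFileNamePattern: 'glob:**/_SUCCESS'
outputDirURI: 's3://my-bucket/segments/'      # placeholder output folder
```

With an include pattern like this, the job should only enumerate matching files, so empty marker files never reach the parquet reader.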