# troubleshooting
a
I'm running a SegmentCreationAndTarPush batch ingest that has outputDirURI as a path on S3. I notice that if I drop the table, re-create the table, and run the batch ingest job again with different data (a different inputDirURI), it still ends up pushing all of the old data from the previous batch ingest job. Is this expected? Is there some way I can prevent it from happening?
k
It's because the output directory still contains all the old segments.
You can delete it before running the job.
a
Ok thanks
Easy enough 🙂 So should I think of the outputDirURI as a temporary directory for the batch ingestion job?
k
yes
This has come up multiple times. Maybe we should automatically delete it, or figure out which files were newly generated.
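For anyone landing on this thread later: a minimal sketch of the "delete the output directory before running the job" step, assuming the outputDirURI is an `s3://bucket/prefix` path and that you pass in a boto3 S3 client. The helper names `split_s3_uri` and `clear_output_dir` are made up for illustration and are not part of Pinot.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix' into (bucket, 'some/prefix/')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    prefix = parsed.path.strip("/")
    if not prefix:
        # Safety check: never wipe a whole bucket by accident.
        raise ValueError("refusing to clear an entire bucket")
    # Trailing slash so 'out' does not also match 'output2/...'.
    return parsed.netloc, prefix + "/"


def clear_output_dir(s3_client, output_dir_uri):
    """Delete every object under output_dir_uri; return how many were deleted.

    s3_client is assumed to be a boto3 S3 client (boto3.client("s3")).
    """
    bucket, prefix = split_s3_uri(output_dir_uri)
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Each page holds at most 1000 keys, which is also the
        # per-request limit of delete_objects.
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted
```

Something like `clear_output_dir(boto3.client("s3"), "s3://my-bucket/pinot/output")` (hypothetical bucket and path) would run before launching the ingestion job, so that only the newly generated segment tars are there to push. Using a fresh outputDirURI for each run should avoid the problem as well.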