# ingestion
w
Is there any mechanism that prevents ingestion events from being sent when a failure occurs in the pipeline during ingestion? I’m asking because I noticed the `process_commit` function in `pipeline.py`. It checks whether there are errors and, depending on that and the commit policy, either commits the checkpoint or skips it. https://github.com/datahub-project/datahub/blob/23b929ea10daded7447f806f8860447626[…]e573a6/metadata-ingestion/src/datahub/ingestion/run/pipeline.py However, I don’t see the same behaviour for the ingestion events themselves. This means the ingestion pipeline could publish some events via the Sink without committing the checkpoint. In my opinion, the publishing policy in the Sink should be aligned with the commit policy. WDYT?
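To make the proposal concrete, here is a minimal sketch of what "publishing aligned with the commit policy" could look like. It is not DataHub's actual API: `BufferingSink`, its `write` and `finish` methods, and this two-value `CommitPolicy` enum are hypothetical names invented for illustration. The idea is only that events are held back until the run outcome is known, and are released under the same condition that would allow the checkpoint commit.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List


class CommitPolicy(Enum):
    # Hypothetical policy values for this sketch, not the library's enum.
    ALWAYS = auto()
    ON_NO_ERRORS = auto()


@dataclass
class BufferingSink:
    """Hypothetical sink wrapper that holds events until the run outcome is known."""

    publish: Callable[[str], None]  # stand-in for the real sink's write call
    policy: CommitPolicy = CommitPolicy.ON_NO_ERRORS
    _pending: List[str] = field(default_factory=list)

    def write(self, event: str) -> None:
        # Buffer instead of publishing immediately.
        self._pending.append(event)

    def finish(self, has_errors: bool) -> bool:
        # Same decision the checkpoint commit would use.
        should_commit = self.policy is CommitPolicy.ALWAYS or not has_errors
        if should_commit:
            for event in self._pending:
                self.publish(event)
        # Drop the buffer either way so a failed run publishes nothing.
        self._pending.clear()
        return should_commit


published: List[str] = []
sink = BufferingSink(publish=published.append)
sink.write("event-1")
if sink.finish(has_errors=True):
    print("commit checkpoint")
else:
    print("skip checkpoint; nothing was published")
```

The caller would commit the checkpoint only when `finish` returns `True`, so events and checkpoint succeed or fail together. The trade-off is that buffering a whole run in memory may not be practical for large ingestions, and a publish that fails midway would still need its own handling.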
h
Ravindra Lanka [10:30 AM] Hi @witty-butcher-82399, this is a good point when stateful ingestion is turned on. We should probably roll back the ingestion on failures in this case. I'll think through this more and get back to you.
w
I haven’t checked. Has this been fixed recently?