#feedback-and-requests

Kriti (Postman)

08/31/2021, 6:43 AM
Hey everyone, we would like to use S3 as the source for a sync. We have a bunch of S3 files under a prefix; I am assuming the sync would iterate over them and write the data to the destination. How would Airbyte handle a failure while syncing one (or more) of the S3 files?

user

08/31/2021, 7:16 AM
We have a bunch of S3 files under a prefix; I am assuming the sync would iterate over them and write the data to the destination.
Yes. You can specify a path pattern: https://docs.airbyte.io/integrations/sources/s3#path-pattern
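As a rough illustration of what a path pattern does, the sketch below filters a list of object keys with a glob-style pattern. It uses Python's standard `fnmatch` module, not the S3 connector's own matcher, so the exact pattern syntax is an assumption (here `*` also crosses `/`, which the connector's glob syntax may treat differently); the keys and the `select_keys` helper are hypothetical.

```python
from fnmatch import fnmatchcase


def select_keys(keys, pattern):
    """Return the object keys that match a glob-style pattern.

    Illustration only: fnmatchcase lets '*' match any characters,
    including '/', which may differ from the connector's own rules.
    """
    return [key for key in keys if fnmatchcase(key, pattern)]


# Hypothetical object keys in a bucket.
keys = [
    "exports/2021/08/orders.csv",
    "exports/2021/08/users.csv",
    "exports/2021/08/readme.txt",
    "other/orders.csv",
]

# Keep only CSV files under the "exports/" prefix.
matched = select_keys(keys, "exports/*.csv")
print(matched)
```

Check the linked path-pattern documentation for the syntax the connector actually accepts before relying on a pattern like this.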

user

08/31/2021, 7:18 AM
Thanks for this. I was more curious about failure management: if, say, 2 out of 10 files fail (for any reason), how does Airbyte manage this?

user

08/31/2021, 7:18 AM
How would Airbyte handle a failure while syncing one (or more) of the S3 files?
It depends on the type of failure. For errors in the schema detection stage: if there are files with inconsistent schemas, the S3 source will fail immediately. For errors in the data syncing stage, see: https://docs.airbyte.io/faq/data-loading#what-happens-to-data-in-the-pipeline-if-the-destinat[…]with-duplicate-data-when-the-pipeline-is-reconnected

user

08/31/2021, 7:21 AM
So essentially, Airbyte would store a cursor specifying the S3 files that failed, which it would try to sync in the next run, along with any new files?

user

08/31/2021, 7:23 AM
Yes.
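To make the idea in this exchange concrete, here is a minimal sketch of a sync loop that remembers which files failed and retries them, together with any new files, on the next run. It is not Airbyte's implementation: `SyncState`, `run_sync`, and `read_file` are hypothetical names, and the state here is a plain in-memory object rather than a real connector cursor.

```python
from dataclasses import dataclass, field


@dataclass
class SyncState:
    """Hypothetical per-connection state kept between runs."""
    synced: set = field(default_factory=set)
    failed: dict = field(default_factory=dict)  # key -> last error message


def run_sync(keys, state, read_file):
    """Attempt every key that has not been synced yet.

    Previously failed keys are still absent from state.synced, so they
    are retried automatically alongside any new keys.
    Returns (synced_now, failed_now) for this run.
    """
    synced_now, failed_now = [], []
    for key in keys:
        if key in state.synced:
            continue  # already delivered in an earlier run
        try:
            read_file(key)
        except Exception as exc:
            state.failed[key] = str(exc)
            failed_now.append(key)
        else:
            state.synced.add(key)
            state.failed.pop(key, None)
            synced_now.append(key)
    return synced_now, failed_now


# Hypothetical reader that fails for keys listed in `broken`.
broken = {"data/b.csv"}


def read_file(key):
    if key in broken:
        raise ValueError("malformed row")
    return key


state = SyncState()
first = run_sync(["data/a.csv", "data/b.csv", "data/c.csv"], state, read_file)

broken.clear()  # the bad file gets fixed before the next run
second = run_sync(
    ["data/a.csv", "data/b.csv", "data/c.csv", "data/d.csv"], state, read_file
)
```

In this sketch the second run touches only the file that failed earlier and the newly added file; files that already synced are skipped.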

user

08/31/2021, 7:23 AM
Does Airbyte expose information about the S3 files that failed in a run? For example, a message in the UI saying "9 out of 10 files synced, 1 failed with error: some error".

user

08/31/2021, 7:26 AM
I am not sure about this one. Tagging @George Claireaux (Airbyte) here, since he is the author of the S3 source connector.