# feedback-and-requests
Hi everyone! I'm trying to evaluate if Airbyte will work for some of our workflows and was hoping to get clarity on some things:
- Can we pass parameters from Apache Airflow to an Airbyte job? For example, if my workflow involves picking up "new" files from GCS, I can have Airflow figure out which files are new, but I'm not seeing a way to pass those file names to the Airbyte job.
  - We also might have situations where the "new" file name/location could be triggered by a Pub/Sub message, which we can pick up with Airflow, but then we'd need to pass the file name/location to Airbyte for processing.
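To make the limitation above concrete: a minimal sketch of what triggering a sync from an Airflow task could look like, assuming Airbyte's `POST /api/v1/connections/sync` endpoint and a hypothetical connection ID. Note that the request body only carries a `connectionId`, so there is no obvious place to pass a per-run file name along.

```python
import json
import urllib.request


def build_sync_request(airbyte_url: str, connection_id: str) -> urllib.request.Request:
    """Build the HTTP request that triggers a sync for one Airbyte connection.

    The sync endpoint only takes a connectionId -- there is no field for
    per-run parameters such as a GCS file name, which is the limitation
    discussed above.
    """
    payload = json.dumps({"connectionId": connection_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{airbyte_url}/api/v1/connections/sync",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and connection ID; the request is built but not sent here.
req = build_sync_request("http://localhost:8000", "my-connection-uuid")
```

An Airflow task (e.g. via `PythonOperator` or the Airbyte provider's `AirbyteTriggerSyncOperator`) would send such a request, but the file list itself would have to be picked up by the connector's own configuration rather than injected per run.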
I don't think you can do this in a straightforward way. @Joseph Dalton, there is some work in progress to release the GCS file source with incremental updates, which would probably solve your problem. WDYT? Or do you need to handle the connection using Airflow? @George Claireaux (Airbyte), could you give more details about the incremental file connector?
Hey @Joseph Dalton, I've been working on an abstract files source that can handle incremental syncs šŸ˜„. It should be landing in the next few days as an S3 source with CSV support, but since it's a framework, we'll be able to rapidly add other file formats (like Parquet/Avro) and create other file-based connectors (like GCS). The PR is here; I'd recommend taking a look at the proposed docs page to see if this will suit your needs (once we add GCS support). And if not, I'd be really keen to hear your feedback so we can make it as useful as possible 🚀
Hey @Joseph Dalton, I think this issue will be of interest to you! Give it a šŸ‘ to signal your support!
Sorry for the delayed response! These would definitely help! Being able to load only files that are new since the last run from GCS buckets (with glob-like pattern matching) would solve some issues for us.
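The "only new files since the last run, with glob-like matching" behavior can be sketched in a few lines of plain Python. This is an illustration, not Airbyte's internal model: `listing` is a hypothetical bucket listing of `(name, modified_unix_ts)` tuples, and the cursor is a stored timestamp from the previous sync.

```python
import fnmatch


def new_files_since(listing, pattern, last_synced_at):
    """Return files matching a glob-like pattern modified after the cursor.

    listing: list of (name, modified_unix_ts) tuples, e.g. from a GCS
    bucket listing (hypothetical shape for illustration).
    """
    return sorted(
        name
        for name, modified in listing
        if fnmatch.fnmatch(name, pattern) and modified > last_synced_at
    )


listing = [
    ("exports/2021-06-01.csv", 100),
    ("exports/2021-06-02.csv", 200),
    ("exports/readme.txt", 300),
]
# Only the CSV modified after the cursor (150) is picked up.
print(new_files_since(listing, "exports/*.csv", 150))  # -> ['exports/2021-06-02.csv']
```

After a successful sync, the connector would advance `last_synced_at` to the newest timestamp it saw, so the next run skips files it already loaded.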