# ask-community-for-troubleshooting
p
Hi, I’m new to Airbyte and am going through the documentation to find out how to provision a custom code pipeline (e.g. something org-specific like web scraping) in a container that is executed by Airbyte. I’m looking at the section on building custom connectors, but I have doubts that I’m reading the right sections. Any suggestions on Airbyte best practices for how to do that?
The plan is to extract data from web pages through web scraping and ingest the extracted data into our Snowflake data warehouse. So the sources are websites, the extract process would be our containerised web scraping scripts, and the destination would be Snowflake.
s
Hey, welcome to the community @Peter! :octavia-wave: Yeah, the docs can be misleading for this use case. Airbyte is built for moving data via API calls rather than doing the actual data collection itself. Typically the web scraping would happen outside of Airbyte, the scraped data would be stored somewhere (a database, for example), and then an Airbyte source connector can pull from that store's API and sync the data into the destination. So once your web scraper has an API to pull from, you can build a source connector and sync it to Snowflake 🙂
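For anyone landing here later, here's a minimal sketch of what that source connector could look like with the Python Airbyte CDK. The `scraper.internal` URL, the `pages` endpoint, and the `ScrapedPages` stream are all hypothetical stand-ins for wherever your scraper publishes its data:

```python
# Minimal sketch only -- the URL, endpoint, and field names below are
# hypothetical; adapt them to wherever your scraper stores its output.
from typing import Any, Iterable, List, Mapping, Optional, Tuple

import requests
from airbyte_cdk.sources import AbstractSource
from airbyte_cdk.sources.streams import Stream
from airbyte_cdk.sources.streams.http import HttpStream


class ScrapedPages(HttpStream):
    """Reads scraped records from the (hypothetical) internal scraper API."""

    url_base = "https://scraper.internal/api/"  # assumption: your storage layer's API
    primary_key = "id"

    def path(self, **kwargs) -> str:
        return "pages"  # assumption: endpoint returning a JSON array of records

    def next_page_token(self, response: requests.Response) -> Optional[Mapping[str, Any]]:
        return None  # no pagination in this sketch

    def parse_response(self, response: requests.Response, **kwargs) -> Iterable[Mapping]:
        yield from response.json()

    def get_json_schema(self) -> Mapping[str, Any]:
        # Permissive schema so the sketch runs without a schemas/ folder.
        return {"type": "object", "properties": {}}


class SourceScraper(AbstractSource):
    def check_connection(self, logger, config) -> Tuple[bool, Any]:
        try:
            requests.get(f"{ScrapedPages.url_base}pages", timeout=10).raise_for_status()
            return True, None
        except Exception as err:
            return False, str(err)

    def streams(self, config: Mapping[str, Any]) -> List[Stream]:
        return [ScrapedPages()]
```

Note that the Snowflake side needs no custom code at all: you'd pair this source with Airbyte's existing Snowflake destination connector in the connection setup.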
p
Thank you so much for explaining. You saved me many hours of experiment.
s
No prob! Let us know if you have any other questions, either here or, for a more in-depth inquiry, on our community Discourse forum. Good luck with your development!