Hi <@U02CMC2FMTM> and everyone, I'm Bayo and I am ...
# write-for-the-community
a
Hi @Ari Bajo and everyone, I'm Bayo and I am a data engineer. I started exploring Airbyte and wrote an article on it here
👏🏻 1
👏 2
airbyte rocket 3
a
Hi Adebayo, thanks for sharing! 🙂
👍🏿 1
a
Hi @Adebayo Adejare this is a very fascinating use case involving both Airbyte and dbt. One thing I noticed is that you had to create a set of tables in snowflake for both airbyte (for loading), dbt (for transforming), and finally for the materialized views from which the BI folks can use to analyze the data. This is now three sets of tables which I assume use the same XS size Snowflake data warehouse. Doesn’t that increase your Snowflake costs?
a
Yes, it does, I believe for the two sets of tables (Airbyte and dbt). Materialized view should use same table storage, so maybe less costs. Best practice is to separate loading and transformation db schemas.
a
I’m almost wondering now whether this will eventually lead to ELT falling out of favor and ETL returning. With airbyte, you could simply do ETL and you wouldn’t even need dbt. That way, you could save costs right?
a
I don't think Airbyte does ETL, isn't it just ELT? It has in-built dbt integration, was experimenting with this earlier, it seems the dbt integration feature with Airbyte is only available on local docker install (I didn't see it in Airbyte Cloud). I feel ETL never went away, it still has good use cases and if I had to guess I would say it is still the predominant architecture, I think ELT is growing because of lower costs and newer tech, it would be interesting to see the market share on each.
a
They way Airbyte compares itself to other ELT and ETL vendors makes me think that they definitely do ETL as well. I am surprised you would say that ELT has lower costs. In the use case that you just mentioned, you now have to have separate schemas for Airbyte and dbt whereas with an ETL tool, you would not. So why would you say that ELT is cheaper than ETL?
👍🏿 1
a
That is a good overview spreadsheet, it says Airbyte Focus: "ELT as a first step. Reverse-ETL coming in 2022." So I guess they'll have "Reverse-ETL" coming soon. The use case I did in the article is simplified version, it would depend on the org/data, why I said there are use cases. I think if there were any issues in running complex transformations you'd incur the costs (mostly time) of starting over from Extraction while in ELT you'd already have the data loaded and storage is cheap also. You would definitely have to weigh the trade off between maintaining a ELT vs ETL pipeline.
a
You are right that loading data into storage is very cheap from a cost perspective, but once you get to doing the “T” part, you are now using compute credits from snowflake, which is more expensive. And if you have to do any complex joins or aggregations, you now get into the issue of long running queries which will cost even more. Since you now have two snowflake schemas from Airbyte and dbt, I would think the cost is more than doubled. So how would you decide on ETL versus ELT as a trade-off when considering costs?
a
If you're comparing on costs, between ETL vs ELT, we'd have to assume transformations are the same. So the possible additional cost would not be from compute but from the separate schemas of load and transformation (storage). There's a Snowflake post on ELT vs ETL, it's more about speed and flexibility than price and costs. I imagine anything with costs would have to depend on data, and there are certainly data use cases for each one.
🧠 1