They haven't shared any details about the Snowflake cluster.
The cluster was also warmed up earlier
11/04/2021, 9:00 AM
if the result is so great why not open source the benchmark and make it reproducible? if you follow Gartner, the best data integration tool is from Informatica 😛
George Claireaux (Airbyte)
11/04/2021, 10:10 AM
@Kamil Breguła good points, I certainly wouldn't argue this proves lakehouse/databricks as the ultimate pattern but I'm excited for the future of it and thought this was an interesting milestone 😄
@Rick Radewagen the benchmark is just one of many set by a separate organisation, TPC. Gartner are actually an associate member.
11/05/2021, 8:58 AM
@George Claireaux (Airbyte), I know TPC, but I also know that a lot of tuning and tweaking can make a huge difference for benchmarking. Dealing with 100TB of data, tuning is usually needed for optimal cost-performance.
Non-open benchmarks conducted by the vendor itself are suspicious to me. I would not trust Snowflake either if they published something like that.
A good example of how it should be done: https://github.com/fivetran/benchmark