# releases-and-early-demos
r
I wrote a blog post about an ongoing project in which I'm using Airbyte to ingest open-source community data. I'm planning to follow up in a few days with a guided tutorial. https://preset.io/blog/building-an-open-source-ingestion-layer-with-airbyte
👏 5
👏🏽 1
m
This is great! Thanks @Robert Stolz for sharing
g
Thanks for sharing @Robert Stolz. 1. Do you do any additional transformations for your insights, or mostly just graphs on the reviews, issues, commits tables, etc.? 2. I find the basic normalisation a little too bloated, and with our dbt project framework it ends up producing 3 duplicate tables: AIRBYTE_RAW, the table (as source), and the dbt table (staged or final). As such, I’ve opted for no normalisation and parse the JSON in the dbt project. I guess this is just the ETL vs. ELT argument, but with duplicate data storage implications. I was curious about your thoughts either way.
s
Thanks for the mention and feedback, Rob! Your feedback on normalization is spot on, and one of my focuses in the coming weeks will be formalizing the schema change process so you don’t run into the issues you’ve mentioned going forward.
r
@gunu The idea is to build an open-source community data platform on top of which you can implement your own analytic frameworks of choice, so I'm doing some basic arrangement of the data (and implementing a framework or two of my own interest). I made the same choice re: #2. My preference has been to build my own transformations on the raw tables, to keep schema evolution more under my own control. @s Excited to see what is to come in the schema evolution department.
👍 1
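For readers following the thread, here is a minimal sketch of the "skip normalisation, parse the raw JSON yourself" approach discussed above. It is written in Python rather than dbt SQL purely for illustration, and it is not the setup either poster actually uses. It assumes the raw table layout with `_airbyte_ab_id`, `_airbyte_emitted_at`, and a JSON blob in `_airbyte_data`; the `ISSUE_FIELDS` tuple, the `parse_raw_rows` helper, and the sample rows are hypothetical.

```python
from __future__ import annotations

import json
from typing import Any, Iterable

# Hypothetical whitelist of fields we choose to expose downstream.
# New upstream fields are ignored until they are added here on purpose.
ISSUE_FIELDS = ("id", "title", "state")


def parse_raw_rows(
    raw_rows: Iterable[dict[str, Any]],
    fields: tuple[str, ...] = ISSUE_FIELDS,
) -> list[dict[str, Any]]:
    """Flatten raw rows by pulling selected keys out of the JSON payload."""
    parsed = []
    for row in raw_rows:
        payload = json.loads(row["_airbyte_data"])
        # .get() yields None for keys the source has dropped or not yet added,
        # so a schema change upstream does not break this step.
        record = {name: payload.get(name) for name in fields}
        record["emitted_at"] = row["_airbyte_emitted_at"]
        parsed.append(record)
    return parsed


# Made-up sample rows standing in for a raw table.
raw_rows = [
    {
        "_airbyte_ab_id": "a1",
        "_airbyte_emitted_at": "2022-01-01T00:00:00Z",
        "_airbyte_data": '{"id": 1, "title": "Bug", "state": "open"}',
    },
    {
        "_airbyte_ab_id": "a2",
        "_airbyte_emitted_at": "2022-01-02T00:00:00Z",
        "_airbyte_data": '{"id": 2, "title": "Docs", "state": "closed", "new_upstream_field": true}',
    },
]

issues = parse_raw_rows(raw_rows)
```

The point of the explicit field list is that the output schema only changes when you edit it, which is one way to keep schema evolution under your own control at the cost of writing the parsing step yourself.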