Max Krog
12/07/2021, 9:35 AM
create or replace table `project-name.airbyte_things._airbyte_raw_things` as (
  with base as (
    select
      *,
      -- hash the data and find the row_number
      row_number() over (partition by sha512(_airbyte_data) order by _airbyte_emitted_at) as hashed_data_rn
    from
      `project-name.airbyte_things._airbyte_raw_things`
  )
  select * except (hashed_data_rn) from base where hashed_data_rn = 1
)
Basically, it hashes the raw data in the source table, orders the entries that share a hash by `_airbyte_emitted_at`, and keeps only the earliest record for each hash, dropping the rest.
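To make the logic concrete outside of BigQuery, here is a minimal Python sketch of the same idea: group rows by the SHA-512 of their payload and keep the earliest-emitted row per group. The function name `dedupe_by_hash` and the sample rows (including the `_airbyte_ab_id` values and integer timestamps) are made up for illustration and are not from the actual table.

```python
import hashlib


def dedupe_by_hash(rows):
    """Keep the earliest-emitted row for each distinct SHA-512 of the payload.

    Hypothetical helper: mirrors `row_number() ... = 1` from the query above.
    On ties in `_airbyte_emitted_at`, the first row seen is kept.
    """
    earliest = {}
    for row in rows:
        digest = hashlib.sha512(row["_airbyte_data"].encode("utf-8")).digest()
        kept = earliest.get(digest)
        if kept is None or row["_airbyte_emitted_at"] < kept["_airbyte_emitted_at"]:
            earliest[digest] = row
    return list(earliest.values())


# Illustrative sample data (assumed, not real records).
rows = [
    {"_airbyte_ab_id": "a", "_airbyte_data": '{"id": 1}', "_airbyte_emitted_at": 2},
    {"_airbyte_ab_id": "b", "_airbyte_data": '{"id": 1}', "_airbyte_emitted_at": 1},
    {"_airbyte_ab_id": "c", "_airbyte_data": '{"id": 2}', "_airbyte_emitted_at": 3},
]
deduped = dedupe_by_hash(rows)
```

Rows "a" and "b" carry identical payloads, so only "b" (the earlier one) survives alongside "c".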
What is this process of "condensing"/"cleaning" a source table called? 🙂
[DEPRECATED] Marcos Marx