Hi Airbyte team! A few conceptual questions :sligh...
# ask-community-for-troubleshooting
a
Hi Airbyte team! A few conceptual questions 🙂 Basically, I’m trying to send data from N sources to the same destination, in the same namespace and tables (all of my sources have the same format, it’s just the data inside that’s different, and I’m trying to gather everything in the same place). This is what I understand, but I kinda need guidance...
• Overwrite can’t work, because if I use it, the last sync will erase all the previous ones, so instead of having data from my N sources I’ll only have data from one.
• Incremental with a “created_at” cursor field might not work either, right? Because all of my sources will use the same cursor field in the destination, but if I have source A with data at t=1 and 3, and then source B with data at t=2, if A syncs before B I’ll lose B’s data?
• Incremental with an “id” cursor field might work though?
Would gladly take any input you might have on this one 🙂 (otherwise I’ll do copies to N destinations and just use dbt to merge everything together, but I’d rather avoid it to keep my database as clean and simple as possible)
✅ 1
c
• Solution 3 seems to be the cleanest, since you’d be able to inject a "<name> as data_source" column when union-ing them together and track which source each row comes from.
• Solution 2 might work though. The cursor is stored in Airbyte’s internal db for your connection (a source-destination pair) and does not depend on what is in the destination.
Note you can use the prefix on your tables, so you don’t need N schemas or N destinations if you go with solution 3.
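To make the cursor point concrete, here is a toy Python model of it. This is not Airbyte's code: the `incremental_sync` function, the connection names, and the in-memory `cursors` dict are all made up for illustration. The only thing it assumes is what is said above, namely that each connection keeps its own cursor and never looks at the destination.

```python
# Toy model only: NOT Airbyte's implementation. All names here are hypothetical.
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, int]  # (source name, created_at)


def incremental_sync(
    connection: str,
    source_rows: List[Row],
    cursors: Dict[str, int],
    destination: List[Row],
) -> int:
    """Append rows newer than this connection's own cursor, then advance it."""
    last_seen: Optional[int] = cursors.get(connection)
    new_rows = [r for r in source_rows if last_seen is None or r[1] > last_seen]
    destination.extend(new_rows)
    if new_rows:
        cursors[connection] = max(r[1] for r in new_rows)
    return len(new_rows)


cursors: Dict[str, int] = {}   # one cursor per connection, kept outside the destination
destination: List[Row] = []    # shared table that every connection writes to

source_a = [("A", 1), ("A", 3)]
source_b = [("B", 2)]

# A syncs first and its cursor moves to 3; B's cursor is still unset.
incremental_sync("A->warehouse", source_a, cursors, destination)
# B syncs afterwards: its row at t=2 is compared to B's cursor, not A's.
incremental_sync("B->warehouse", source_b, cursors, destination)

print(sorted(destination, key=lambda r: r[1]))
# [('A', 1), ('B', 2), ('A', 3)]
```

In this sketch B's row at t=2 still lands in the destination even though A already synced up to t=3, because the comparison only ever uses B's own cursor.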
a
awesome, thank you! how should I ingest my “name as data_source” ? with a custom_dbt transformation ? Glad to learn that the cursor doesn’t depend on what’s in the destination, thank you 🙂 And yeah, I had intended to use the prefix exactly for that, actually 🙂 much less complexity this way!
c
with a custom_dbt transformation ?
Yes!
a
thanks a lot! 🙂 that’s a huge help!
c
how should I ingest my “name as data_source” ?
You could even write a dbt macro to automatically detect your N sources (with some regexp?) and add their name as a column: https://docs.getdbt.com/docs/building-a-dbt-project/jinja-macros#set-variables-at-the-top-of-a-model
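Roughly what I have in mind, sketched in Python with SQLite rather than as an actual dbt macro so it can be run anywhere. The table names (`alpha_users`, `beta_users`), the `<prefix>_users` naming pattern, and the columns are assumptions for the example; in dbt the same `UNION ALL` would live in a model, with a Jinja loop generating the selects.

```python
import re
import sqlite3
from typing import Iterable

# Hypothetical naming convention: each connection writes "<prefix>_users".
TABLE_PATTERN = re.compile(r"^(\w+)_users$")


def build_union_sql(table_names: Iterable[str]) -> str:
    """Build a UNION ALL that tags each row with the prefix it came from."""
    selects = []
    for table in sorted(table_names):
        match = TABLE_PATTERN.match(table)
        if match is None:
            continue  # not one of the prefixed source tables
        source = match.group(1)
        selects.append(
            f"SELECT '{source}' AS data_source, id, created_at FROM {table}"
        )
    return "\nUNION ALL\n".join(selects)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE alpha_users (id INTEGER, created_at INTEGER);
    CREATE TABLE beta_users  (id INTEGER, created_at INTEGER);
    INSERT INTO alpha_users VALUES (1, 1), (2, 3);
    INSERT INTO beta_users  VALUES (1, 2);
    """
)

# Detect the source tables from the catalog instead of listing them by hand.
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
)]
union_sql = build_union_sql(tables)
rows = conn.execute(f"{union_sql}\nORDER BY created_at").fetchall()
print(rows)
# [('alpha', 1, 1), ('beta', 1, 2), ('alpha', 2, 3)]
```

Note that `id` 1 shows up in both sources here, which is exactly the case where the `data_source` column tells the rows apart.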
a
I have another table that gives me some of the information, so I might end up relying on that instead, but good to know! Thanks a lot for your help! :)
a
Hey @Alexandre Chouraki, the solution Chris suggested would really help the community on our online discourse forum, where we're currently migrating our community support. Do you mind posting the solution you came up with on this forum? 🙏🏻
a
Hi! I just pasted my initial comment and opened the topic. I’ll leave the rest to you, and will add a comment on it when I’m done with implementing it 🙂
👍🏻 1