Hey, this might be a simple question but I didn’t ...
# ask-community-for-troubleshooting
m
Hey, this might be a simple question, but I didn’t find a clear answer anywhere. I’d like to use Airbyte to load data incrementally on a daily basis. Do I need to keep Airbyte syncing all the time, or would a new sync pick up from the latest synced record? Since the source is massive, I wouldn’t want to scan the whole thing every time I restart Airbyte or kick off a new sync. I understand the source must support Incremental Sync, but is this “state” only kept through a single run of Airbyte? Also, does an Incremental Sync read always scan the table?
h
Hey,
1. We store the state in the database, so yes: if the source supports incremental sync, you can set it to run daily.
2. We do have incremental dbt, so we don't scan the whole table.
m
Thanks for replying, @Harshith (Airbyte)! I’ve been looking into it and understand it a bit better now. I’m sending some data from MSSQL to Kinesis, and I see the “cursor” tracks the last synced record. As you said, the state (which, as I understand it, is the same as the cursor) is stored in the database. In this case, would that be in an MSSQL table?
h
Hey, we have a db instance running (you can see it in the docker-compose file), and we store all the config and state there.
m
Great! That makes more sense 🙂 Last one: if I want to set the initial cursor value, that would need to be done through the database, correct? There’s no functionality for me to set that through the current UI.
h
Yeah, you're right.
m
Thanks, Harshith!
Do I have access to the database when using Airbyte Cloud?
h
I'm not sure about that. @Marcos Marx (Airbyte), could you comment?
m
@Marcos Marx (Airbyte)? 🙂
a
Hi @Maikel Penz, no, you don't have access to the internal Airbyte database when using the Cloud offering. Just for the sake of clarity, the state is not exactly the cursor. The state stores the cursor value for an incremental stream but can also store additional custom values declared in the connector.
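
To make the state/cursor distinction concrete, here is a minimal sketch of an incremental read. It is not Airbyte's implementation: the `events` table, the JSON state file, and the `rows_synced` field are all hypothetical stand-ins, and SQLite stands in for the real source. The point is only that the state persists between runs and holds the cursor value alongside other values.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def incremental_read(conn, state_path):
    """Return rows newer than the saved cursor and persist the updated state."""
    # Load the state written by the previous run, if there was one.
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    last_cursor = state.get("cursor", 0)

    # Only rows past the saved cursor value are requested from the source.
    rows = conn.execute(
        "SELECT id, payload FROM events WHERE id > ? ORDER BY id",
        (last_cursor,),
    ).fetchall()

    if rows:
        # The state holds the cursor plus an extra, hypothetical custom value.
        state = {
            "cursor": rows[-1][0],
            "rows_synced": state.get("rows_synced", 0) + len(rows),
        }
        state_path.write_text(json.dumps(state))
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [(1, "a"), (2, "b")])
state_path = Path(tempfile.mkdtemp()) / "state.json"

first = incremental_read(conn, state_path)   # first run reads both rows
conn.execute("INSERT INTO events VALUES (3, 'c')")
second = incremental_read(conn, state_path)  # second run reads only the new row
final_state = json.loads(state_path.read_text())
```

In this sketch, a second run resumes from the saved cursor even though nothing stayed running in between, because the state lives outside the process.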
m
Hey @Augustin Lafanechere (Airbyte)! Makes sense. In my case, I’d like to load a very big table from a database but skip rows I have already put into the destination through another system. I’d like Airbyte’s first load to start from a recent row, so I don’t have to do the full load again. Either the ability to “start the load from a custom incremental cursor” or to “set the cursor manually” through the UI would be more or less what I’m after. I can tweak it when I run Airbyte locally because I have access to the database, but if we decide to go with the Cloud offering, I don’t see a way around it at the moment.
a
If you have control over the source database we usually suggest users create a view on their source table and load this view, which already has the subset of data you want to get in an initial load.
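
The view approach could look roughly like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for the real source database, and the `orders` table, the `orders_recent` view, and the cutoff `id > 2` are hypothetical; on MSSQL the `CREATE VIEW` statement would be similar, but check the exact syntax for your setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2022-01-01"), (2, "2022-03-01"), (3, "2022-06-01")],
)

# Hypothetical cutoff: rows up to id 2 were already loaded by the other system,
# so the view exposes only the rows after that point.
conn.execute(
    "CREATE VIEW orders_recent AS "
    "SELECT id, created_at FROM orders WHERE id > 2"
)

# The sync would be pointed at the view instead of the full table.
recent = conn.execute("SELECT id, created_at FROM orders_recent ORDER BY id").fetchall()

# Rows added to the base table later still show up through the view.
conn.execute("INSERT INTO orders VALUES (4, '2022-07-01')")
recent_after_insert = conn.execute(
    "SELECT id, created_at FROM orders_recent ORDER BY id"
).fetchall()
```

Because the view filters on the same column that would serve as the cursor, the initial load skips the already-migrated rows while later incremental syncs keep seeing new ones.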