I am trying to wrap my head around developing a cu...
# ask-ai
m
I am trying to wrap my head around developing a custom integration that uses incremental intervals. Can you give some insights into how to manage the state file and when it is updated?
k
Sure, I can provide some insights on how to manage the state file and when it is updated for incremental syncs in Airbyte.

In Airbyte, an incremental sync pulls only the data that has changed since the previous sync. This is usually implemented with a cursor value (such as a timestamp) that separates data that was already pulled from data that is new. A very common cursor is an `updated_at` timestamp: records whose `updated_at` value is less than or equal to the cursor have already been synced, and the next sync should only export records whose `updated_at` value is greater than the cursor [source](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/incremental-syncs).

When a stream is read incrementally, the connector outputs a state message after reading all the records, which allows for checkpointing. On the next incremental sync, the prior state message is used to determine the next set of records to read [source](https://docs.airbyte.com/connector-development/config-based/understanding-the-yaml-file/incremental-syncs). For example, in the case of the `source-exchange-rates-tutorial`, the `read` operation outputs state messages like this:
```json
{
  "type": "STATE",
  "state": {
    "data": {
      "rates": {
        "date": "2022-07-15"
      }
    }
  }
}
```
Here the date ("2022-07-15") should be replaced by today's date. A state file containing the last state produced by the `read` operation is created and used to simulate incremental syncs. Running the `read` operation with that state file will then only read data for dates later than the given state [source](https://docs.airbyte.com/connector-development/config-based/tutorial/incremental-reads#supporting-incremental-syncs).

In some cases, you might want to update the state incrementally during replication. However, as of now, some parts of Airbyte are only updated once the complete sync finishes [source](https://discuss.airbyte.io/t/3704). If you're having trouble with the `state_checkpoint_interval` property, you might find this [example](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/connectors/source-amplitude/source_amplitude/api.py) helpful, where a GA connector checkpoints every 10 records [source](https://discuss.airbyte.io/t/3935).

I hope this helps! Let me know if you have any other questions.
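
To make the cursor and checkpointing idea more concrete, here is a minimal sketch in plain Python. It deliberately does not use the Airbyte CDK: `read_incremental`, `state_message`, `latest_state`, and the `checkpoint_interval` argument are hypothetical names for illustration only, and the stream name `rates` and cursor field `date` are borrowed from the state message shown earlier. It also assumes each record has a unique, ISO-formatted cursor value, so that plain string comparison orders the dates correctly.

```python
from typing import Dict, Iterable, Iterator, List, Optional

STREAM_NAME = "rates"   # assumption: taken from the example state message
CURSOR_FIELD = "date"   # assumption: ISO dates, so string comparison is safe


def state_message(cursor: str) -> Dict:
    """Build a STATE message shaped like the example state message."""
    return {"type": "STATE", "state": {"data": {STREAM_NAME: {CURSOR_FIELD: cursor}}}}


def read_incremental(
    records: Iterable[Dict],
    state: Optional[Dict[str, str]],
    checkpoint_interval: int = 10,  # hypothetical knob, not the CDK property itself
) -> Iterator[Dict]:
    """Emit only records newer than the cursor, with periodic STATE checkpoints."""
    cursor = (state or {}).get(CURSOR_FIELD)
    emitted = 0
    # Process in ascending cursor order so a mid-sync checkpoint never skips data.
    for record in sorted(records, key=lambda r: r[CURSOR_FIELD]):
        if cursor is not None and record[CURSOR_FIELD] <= cursor:
            continue  # already synced in a previous run
        yield {"type": "RECORD", "record": {"stream": STREAM_NAME, "data": record}}
        cursor = record[CURSOR_FIELD]
        emitted += 1
        if emitted % checkpoint_interval == 0:
            yield state_message(cursor)  # checkpoint part-way through the read
    if cursor is not None:
        yield state_message(cursor)  # final state after all records are read


def latest_state(messages: List[Dict]) -> Optional[Dict[str, str]]:
    """Return the stream state from the last STATE message, if there is one."""
    states = [m for m in messages if m["type"] == "STATE"]
    return states[-1]["state"]["data"][STREAM_NAME] if states else None
```

Feeding the output of `latest_state` back in as the `state` argument of the next call simulates two consecutive incremental syncs, and comparing the state before and after a run is one simple way to see whether it advanced.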
m
How do I write a test that checks whether the state has been updated or not?