Nico Wright
05/28/2024, 12:36 PM
kapa.ai
05/28/2024, 12:36 PM
`updated_at` values for your new connection, you can follow these steps:

1. Copy the State from the Existing Connection:
   - Navigate to the existing connection you are duplicating.
   - Go to the `Settings` tab and open the `Advanced` settings to see the connection state.
   - Copy the state to your clipboard.

   This step ensures that you have the latest state information, which includes the `updated_at` values.

   Source: [Upgrading to Destinations V2 - Additional Steps for Incremental Sync Modes](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2#additional-steps-for-incremental-sync-modes)

2. Update the State in the New Connection:
   - Go to your newly created connection.
   - Replace the state with the copied contents from the previous step.
   - Click `Update State`.

   This will ensure historical data is not replicated again.

   Source: [Upgrading to Destinations V2 - Additional Steps for Incremental Sync Modes](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2#additional-steps-for-incremental-sync-modes)

3. Use the API to Update the State:
   - You can also use the Airbyte API to update the connection state. Here is an example of how to do it using a `POST`
request to `/api/v1/state/create_or_update`:

```bash
curl -X POST -u airbyte:password \
  -H "Content-Type: application/json" \
  -d '{
    "connectionId": "your-new-connection-id",
    "connectionState": {
      "stateType": "stream",
      "connectionId": "your-new-connection-id",
      "streamState": [
        {
          "streamDescriptor": { "name": "products" },
          "streamState": { "seed": -1, "updated_at": "2023-08-03T14:47:19+00:00" }
        },
        {
          "streamDescriptor": { "name": "users" },
          "streamState": { "seed": -1, "updated_at": "2023-08-03T14:47:19+00:00" }
        },
        {
          "streamDescriptor": { "name": "purchases" },
          "streamState": { "seed": -1, "updated_at": "2023-08-04T14:47:19+00:00" }
        }
      ]
    }
  }' \
  https://local.airbyte.dev/api/v1/state/create_or_update | jq
```
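For scripting, the same request could be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_state_payload` and `build_request` are hypothetical, and the base URL, credentials, and connection ID are the placeholders from the curl example.

```python
import base64
import json
import urllib.request


def build_state_payload(connection_id, stream_states):
    """Build the request body shown in the curl example.

    `stream_states` maps a stream name to its state dict,
    e.g. {"users": {"seed": -1, "updated_at": "..."}}.
    """
    return {
        "connectionId": connection_id,
        "connectionState": {
            "stateType": "stream",
            "connectionId": connection_id,
            "streamState": [
                {"streamDescriptor": {"name": name}, "streamState": state}
                for name, state in stream_states.items()
            ],
        },
    }


def build_request(base_url, payload, username, password):
    """Prepare (but do not send) the POST request with basic auth."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/v1/state/create_or_update",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values copied from the curl example.
    payload = build_state_payload(
        "your-new-connection-id",
        {
            "products": {"seed": -1, "updated_at": "2023-08-03T14:47:19+00:00"},
            "users": {"seed": -1, "updated_at": "2023-08-03T14:47:19+00:00"},
            "purchases": {"seed": -1, "updated_at": "2023-08-04T14:47:19+00:00"},
        },
    )
    request = build_request("https://local.airbyte.dev", payload, "airbyte", "password")
    # To actually send it against a reachable instance, uncomment:
    # with urllib.request.urlopen(request) as response:
    #     print(json.load(response))
    print(json.dumps(payload, indent=2))
```

Building the payload separately from sending it makes it easy to inspect the state before it overwrites anything on the connection.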
Replace `"your-new-connection-id"` with the actual ID of your new connection and update the `updated_at` values as needed.

Source: [GitHub Issue - Connection State should be Editable](https://github.com/airbytehq/airbyte/issues/29016)

By following these steps, you can ensure that your new connection includes the latest state information, preventing the re-syncing of already replicated data.
Nico Wright
05/28/2024, 12:50 PM
kapa.ai
05/28/2024, 12:51 PM
`customers` and `metafield_customers` streams.

```json
[
  {
    "streamDescriptor": { "name": "customers" },
    "streamState": { "updated_at": "2023-06-14T09:40:41+02:00" }
  },
  {
    "streamDescriptor": { "name": "metafield_customers" },
    "streamState": {
      "customers": { "updated_at": "2023-03-30T05:32:13+02:00" },
      "updated_at": "2021-09-01T17:57:41+02:00"
    }
  }
]
```
In this example:
- The `customers` stream has an `updated_at` timestamp of `2023-06-14T09:40:41+02:00`.
- The `metafield_customers` stream has two `updated_at` timestamps: one for the `customers` object (`2023-03-30T05:32:13+02:00`) and one for the `metafield_customers` stream itself (`2021-09-01T17:57:41+02:00`).

This state information helps the connector determine which records have been updated or created since the last sync, ensuring that only new or modified data is fetched during incremental syncs.

For more details, you can refer to the original [GitHub issue](https://github.com/airbytehq/airbyte/issues/27355) where this example was discussed.
Nico Wright
05/28/2024, 12:51 PM
kapa.ai
05/28/2024, 12:51 PM
kapa.ai
05/28/2024, 12:52 PM
`customers`, `metafield_customers`, `orders`, and `products`. This state is used to track the progress of incremental syncs for different streams.

```json
[
  {
    "streamDescriptor": { "name": "customers" },
    "streamState": { "updated_at": "2023-06-14T09:40:41+02:00" }
  },
  {
    "streamDescriptor": { "name": "metafield_customers" },
    "streamState": {
      "customers": { "updated_at": "2023-03-30T05:32:13+02:00" },
      "updated_at": "2021-09-01T17:57:41+02:00"
    }
  },
  {
    "streamDescriptor": { "name": "orders" },
    "streamState": { "updated_at": "2023-07-01T12:00:00+02:00" }
  },
  {
    "streamDescriptor": { "name": "products" },
    "streamState": { "updated_at": "2023-07-15T08:30:00+02:00" }
  }
]
```
In this example:
- The `customers` stream has an `updated_at` timestamp of `2023-06-14T09:40:41+02:00`.
- The `metafield_customers` stream has two `updated_at` timestamps: one for the `customers` object (`2023-03-30T05:32:13+02:00`) and one for the `metafield_customers` stream itself (`2021-09-01T17:57:41+02:00`).
- The `orders` stream has an `updated_at` timestamp of `2023-07-01T12:00:00+02:00`.
- The `products` stream has an `updated_at` timestamp of `2023-07-15T08:30:00+02:00`.

This state information helps the connector determine which records have been updated or created since the last sync, ensuring that only new or modified data is fetched during incremental syncs.

For more details, you can refer to the original [GitHub issue](https://github.com/airbytehq/airbyte/issues/27355) where this example was discussed.
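
As a rough illustration of how a state like this can be read, the sketch below flattens each stream's state into cursor paths and compares a record timestamp against a cursor. The function names (`flatten_cursors`, `cursors_by_stream`, `is_newer`) are hypothetical, and the strict "greater than" comparison is an assumption made for the example; the actual connector logic may differ.

```python
import json
from datetime import datetime

# Two entries taken from the state example above.
STATE_JSON = """
[
  {"streamDescriptor": {"name": "customers"},
   "streamState": {"updated_at": "2023-06-14T09:40:41+02:00"}},
  {"streamDescriptor": {"name": "metafield_customers"},
   "streamState": {"customers": {"updated_at": "2023-03-30T05:32:13+02:00"},
                   "updated_at": "2021-09-01T17:57:41+02:00"}}
]
"""


def flatten_cursors(stream_state, prefix=""):
    """Flatten nested state into {"path.to.cursor": value} pairs."""
    cursors = {}
    for key, value in stream_state.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object, e.g. the `customers` cursor inside metafield_customers.
            cursors.update(flatten_cursors(value, path))
        else:
            cursors[path] = value
    return cursors


def cursors_by_stream(state):
    """Map each stream name to its flattened cursor values."""
    return {
        entry["streamDescriptor"]["name"]: flatten_cursors(entry["streamState"])
        for entry in state
    }


def is_newer(record_updated_at, cursor):
    """Assumed rule: a record counts as new if it is strictly later than the cursor."""
    return datetime.fromisoformat(record_updated_at) > datetime.fromisoformat(cursor)


if __name__ == "__main__":
    cursors = cursors_by_stream(json.loads(STATE_JSON))
    for stream, values in cursors.items():
        print(stream, values)
```

Because the timestamps carry UTC offsets, `datetime.fromisoformat` yields timezone-aware values, so comparisons stay correct even when a record and a cursor use different offsets.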