Greetings, all. I am trying to get a sync going wi...
# ask-community-for-troubleshooting
s
Greetings, all. I am trying to get a sync going with the Connector Builder. I set a datetime cursor with the correct field and datetime format. The dates are being read, but the connector is not continuing to request data until there is no more data…and I am at a loss. Any pointers would be appreciated. This API does not require an end date. I’m just expecting the start date to be continually incremented based on the latest timestamp from the previous response (which is sorted correctly by datetime). What hidden magic configuration is required for the connector to continue making requests?
k
Based on the Airbyte documentation, for incremental syncs using the Connector Builder, the connector will continue to make requests as long as it is configured to do so and the API supports the required mechanism. Here are the key points and possible issues to check:

1. Cursor Field and Format: You mentioned you set the correct cursor field and datetime format, and dates are being read. This is the first requirement.
2. API Time Filtering Capabilities: If your API does not require an end date and only needs a start date, you should ensure that the "API time filtering capabilities" is set appropriately. If the API only allows filtering by a start date (not a range), make sure the builder is not expecting to inject an end date. If the API is a feed (newest to oldest), you may need to set "No filter" and rely on pagination and the cursor to stop fetching when the cutoff is reached. The connector will stop when it encounters a record with a cursor value less than or equal to the last synced value. This is described in the "Incremental sync without time filtering" section. If the data is sorted correctly, this should work as expected, but if not, the sync may stop too early or too late. Double-check that your data is sorted from newest to oldest if using this mode [Incremental sync without time filtering](https://docs.airbyte.com/platform/1.8/connector-development/connector-builder-ui/incremental-sync#incremental-sync-without-time-filtering).
3. Pagination: Make sure you have enabled and correctly configured pagination in the Connector Builder. If pagination is not set up, the connector may only fetch the first page of results and stop. The builder supports several pagination strategies (offset, page, cursor, or custom). If your API uses a "next page" token or similar, ensure the pagination section is enabled and configured to follow the API's pagination mechanism [Pagination](https://docs.airbyte.com/platform/connector-development/connector-builder-ui/pagination).
4. State Management: The connector should automatically update the start date for the next sync based on the latest cursor value from the previous sync. If this is not happening, check that the state is being saved and used correctly. The most recent cursor value should be stored and used as the new start date for subsequent syncs [Incremental sync example](https://docs.airbyte.com/platform/connector-development/connector-builder-ui/incremental-sync#example).
5. Testing and Debugging: Use the "Testing values" in the builder to simulate different start dates and verify that the requests are being constructed as expected. If the connector is not making further requests, it may be due to a misconfiguration in pagination or the incremental sync settings.

Summary of what to check:
- Pagination is enabled and matches your API's mechanism.
- "API time filtering capabilities" is set to match your API (range, start only, or no filter).
- The cursor field and format are correct and top-level.
- The state is being updated and used for subsequent syncs.

If all of these are set correctly and the issue persists, there may be a limitation or bug in the builder, or the API may have an undocumented requirement.

For more details, see:
- [Incremental sync configuration](https://docs.airbyte.com/platform/connector-development/connector-builder-ui/incremental-sync)
- [Pagination configuration](https://docs.airbyte.com/platform/connector-development/connector-builder-ui/pagination)
- [Incremental sync without time filtering](https://docs.airbyte.com/platform/connector-development/connector-builder-ui/incremental-sync#incremental-sync-without-time-filtering)

If you need to share more specifics (such as your pagination setup or API response structure), I can help you troubleshoot further.
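To make the "No filter" mode from point 2 a bit more concrete, here is a rough Python sketch of the stopping rule described there: read a newest-to-oldest feed and stop at the first record whose cursor value is less than or equal to the last synced value. This is only an illustration, not the Builder's actual code; the function name and the record shape (`id`, `updated_at` as ISO 8601 strings) are made up for the example.

```python
def read_until_cutoff(records: list[dict], last_synced: str) -> list[dict]:
    """Collect records from a newest-to-oldest feed until the cutoff is reached.

    Assumes `updated_at` values are ISO 8601 strings in one fixed format,
    so plain string comparison orders them chronologically.
    """
    new_records = []
    for record in records:
        # Stop at the first record that is not newer than the saved cursor.
        if record["updated_at"] <= last_synced:
            break
        new_records.append(record)
    return new_records


# Hypothetical feed, sorted newest first.
feed = [
    {"id": 3, "updated_at": "2025-03-03T00:00:00Z"},
    {"id": 2, "updated_at": "2025-03-02T00:00:00Z"},
    {"id": 1, "updated_at": "2025-03-01T00:00:00Z"},
]
fresh = read_until_cutoff(feed, "2025-03-02T00:00:00Z")
print([r["id"] for r in fresh])
```

If the feed were sorted oldest first, the loop would stop on the very first record, which is why the sort order matters in this mode.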
s
I’m aiming for start only, so the connector can continue requesting until it reaches no data.
@kapa.ai Do I need to set cursor granularity and step? Is it not enough to just provide a start datetime?
k
You do not always need to set both `cursor_granularity` and `step`; it depends on your use case and the API's requirements.

- If you only provide a `start_datetime` (and optionally an `end_datetime`), Airbyte will request all records from the start date to the end date in a single request, as long as the API supports this and you have not configured interval splitting.
- The `cursor_granularity` and `step` settings are used when you want to split the sync into smaller intervals (for example, to reduce the risk of sync failures or to comply with API chunking requirements). When these are set, Airbyte will partition the time range into windows and make multiple requests, one for each window. This is especially useful for large datasets or APIs that require data to be fetched in chunks.

If you do not set `step` and `cursor_granularity`, the connector will not split up the time range and will just request all records for the entire target time range in one go. This is the default and works for many APIs, but may not be optimal for large data volumes or APIs with strict limits.

In summary:
- For most simple incremental syncs, providing just a `start_datetime` is sufficient.
- Set `cursor_granularity` and `step` only if you want to split the sync into intervals.

See the documentation for more details and examples: [Connector Builder UI - Advanced Settings](https://docs.airbyte.com/platform/connector-development/connector-builder-ui/incremental-sync#advanced-settings) [Config-based YAML - Incremental Syncs](https://docs.airbyte.com/platform/connector-development/config-based/understanding-the-yaml-file/incremental-syncs#datetimebasedcursor)
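As a rough illustration of the window splitting described above, the following Python sketch partitions a time range using a step and a cursor granularity. It is a simplified model, not the CDK's actual implementation: the function name is hypothetical, and it assumes each window ends one granularity unit before the next one starts so that consecutive windows do not overlap.

```python
from datetime import datetime, timedelta


def split_into_windows(
    start: datetime,
    end: datetime,
    step: timedelta,
    cursor_granularity: timedelta,
) -> list[tuple[datetime, datetime]]:
    """Partition [start, end] into consecutive, non-overlapping windows."""
    windows = []
    window_start = start
    while window_start <= end:
        # Each window covers `step`, minus one granularity unit so the
        # next window can begin right after this one without overlap.
        window_end = min(window_start + step - cursor_granularity, end)
        windows.append((window_start, window_end))
        window_start = window_end + cursor_granularity
    return windows


windows = split_into_windows(
    start=datetime(2025, 1, 1),
    end=datetime(2025, 1, 20),
    step=timedelta(days=7),
    cursor_granularity=timedelta(days=1),
)
for window_start, window_end in windows:
    print(window_start.date(), "->", window_end.date())
```

With these example values the range is covered by three windows (Jan 1 to Jan 7, Jan 8 to Jan 14, Jan 15 to Jan 20), and each window would become its own request.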
s
If anyone in the community has any experience with this, I’d appreciate some input.
j
it could be that it just isn't paginating correctly. Check your preview and make sure there are additional pages being returned after the first one. If not, check the pagination settings and update as needed. There are also some APIs that have a pagination set limit (Pardot, for example, which only returns 100K records per pagination set, and you have to start a new pagination set to get more). If this is the case, the fix is rather complex, as you have to catch the end of the pagination set and trick Airbyte into starting a new one (again, Pardot source would be a good reference for this). That case is less common though, so check the test in Builder first and make sure it's paginating, and then check the API docs for limits on records returned, max date ranges you can pull, quotas, etc., which would be more normal to see.
s
This sounds silly, but I would need pagination enabled as well as the datetime cursor? I weakly derived from the docs that using a datetime cursor would paginate on its own until there was no more data if I just used a start date.
I’m getting the first request but no subsequent requests.
j
unless the API returns all results, yes, you need both. Pagination is a separate concept:
• Incremental Syncs will only request new records based on the max cursor value on subsequent syncs
• Pagination is used to keep requesting the results from that request until there aren't any more

So the difference is resuming from where you were at on the next sync vs. continuing to the next page in the current sync. Similar language, so can get confusing. In almost all cases, you'll use both
s
Do they automatically work together such that the latest datetime will be used if pagination is enabled?
j
Yes, with the vast majority of APIs you don't have to do anything special. There are a handful of APIs where the subsequent pagination requests remove all other query parameters (which requires some conditional logic in Builder). But this is rare (Pardot being an example of this pattern too)
s
I tried pagination with the No Pagination strategy but only got one page. I’m trying cursor pagination, but some of the configuration overlaps with the datetime based cursor under incremental sync. The API with which I am syncing has an updated_at field that provides the value and a filters[updated_date][gte] query param for providing the date on each iteration. I’ve got all the datetime formatting figured out, I think.
I really appreciate your input.
j
does the API you're looking at have public documentation or something you can share either here or by PM?
most APIs either use a page number, offset, or provide a pagination token back in the response
s
It does have a `page` param which I could use. I’ll try that next, but I’m trying to drive it off of `filters[updated_date][gte]`, which I at one point got working in the older Python CDK. I’m trying to use the builder this time, though.
Faster to “develop” connectors in the Builder.
I just don’t know how to configure Airbyte to feed the latest date in.
j
yep, set a query parameter of `per-page` to `50` (the max) and set the Pagination Strategy to Page Increment and inject the `page` query parameter into the outgoing request.
it's often also helpful to set the sort to be in the order of the cursor field (ascending). In this case that would be a query parameter of `sort` with a value of `updated_at` according to their docs
so effectively the incremental is setting where to start, and pagination is taking that in chunks (and by default will stop when it doesn't return any more values, meaning you've reached the end)
s
All of that makes sense. I do have it sorting by `updated_at`, which is working properly. I am just confused why the requests don’t advance automatically with the datetime cursor. Is it possible to use the pagination setting along with a starting date? I’m trying to sync a smaller portion of the data for testing.
j
What you put in `Start Datetime`→`Value` in the Incremental section is what will be used when there isn't yet a cursor. You can also feed this from a user config option if you want to be able to configure it when making a new source. But a static value works great for testing.
s
Apologies for my density; slowly getting it. So I can still have incremental sync enabled with a datetime cursor?
I would essentially have both the page param and filters[updated_date][gte]?
j
yes. sometimes it helps to actually look at the requests being sent in the builder preview. But let's take a hypothetical setup. here's my premise:
• let's say there are a total of 125 records in the source this year
• your Incremental sync is set to use `updated_at` as the cursor field and inject it as a query param as `filters[updated_date][gte]` and the start date value is set to `2025-01-01` (just ignore formatting for this hypothetical)
• you're injecting `page` through the pagination component using the Page Increment Pagination strategy
• you're setting `per-page` to `50` as a query parameter

These are the requests that Airbyte would send:
1. `https://example.com/api/endpoint?filters[updated_date][gte]=2025-01-01` (returns rows 1-50)
2. `https://example.com/api/endpoint?filters[updated_date][gte]=2025-01-01&page=2` (returns rows 51-100)
3. `https://example.com/api/endpoint?filters[updated_date][gte]=2025-01-01&page=3` (returns rows 101-125)
4. `https://example.com/api/endpoint?filters[updated_date][gte]=2025-01-01&page=4` (returned nothing, so Airbyte knows the pagination is done)
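The same hypothetical can be sketched as a small Python simulation: the date filter stays fixed for the whole sync while the page number increments until a page comes back empty. This is only a model of the behavior described above, not Airbyte's code. `fake_api` is a made-up stand-in for the real endpoint, and the `per-page` parameter is left out of the URLs to match the listed requests.

```python
from urllib.parse import urlencode

BASE_URL = "https://example.com/api/endpoint"
TOTAL_RECORDS = 125  # total rows in the hypothetical source
PER_PAGE = 50        # page size from the hypothetical


def fake_api(page: int) -> list[int]:
    """Hypothetical stand-in for the API: returns row numbers for a page."""
    first = (page - 1) * PER_PAGE + 1
    last = min(page * PER_PAGE, TOTAL_RECORDS)
    return list(range(first, last + 1))  # empty once `first` exceeds `last`


def paginate(start_date: str) -> list[tuple[str, int]]:
    """Return (url, row_count) for every request in a single sync."""
    requests_sent = []
    page = 1
    while True:
        # The incremental filter is the same on every page of this sync.
        params = {"filters[updated_date][gte]": start_date}
        if page > 1:
            params["page"] = page
        url = f"{BASE_URL}?{urlencode(params, safe='[]')}"
        rows = fake_api(page)
        requests_sent.append((url, len(rows)))
        if not rows:
            break  # an empty page means pagination is done
        page += 1
    return requests_sent


result = paginate("2025-01-01")
for url, row_count in result:
    print(url, row_count)
```

Note that the start date never changes between pages. The cursor only moves once the sync finishes and the latest `updated_at` value is saved as state for the next sync.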
s
Ah, so the expected behavior is the datetime is stored so that it is the starting point for the next sync.
I see. That is a paradigm shift in my mind but makes sense. I will test that out momentarily!
j
yep, exactly. and when you look on the Settings Screen and expand the Connection State, you'll see that it's just JSON that keeps track of those cursor values so it knows where to resume the next sync:
and then just to close the loop on the concept, Lookback Window just takes the stored cursor and reduces it by the amount you specify before starting the next sync (so it'll re-pull a certain amount of data). This is useful when the data may be updated after it's returned, or when the API only supports filtering with greater than (so you can back it off by a second or day to ensure there aren't any gaps). Similarly Split Up Interval just takes the stored cursor and breaks it into pieces (e.g. 1, 7, 30 days) and kicks off separate sets of requests for that. This can be useful when you're using start AND end date filtering but the initial request would be over a limit imposed by the API. Basically it just lets you start a bunch of smaller batches. Most of the time you don't need either of these, but it's important to understand how they interact when you do
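For the Lookback Window part, a minimal Python sketch of the arithmetic might look like this. The state shape (a single `updated_at` key) and the datetime format are assumptions made up for the example; the real Connection State JSON is structured differently, so treat this purely as an illustration of "stored cursor minus lookback".

```python
import json
from datetime import datetime, timedelta

# Assumed datetime format for the example; the real connector's format may differ.
DATETIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def start_for_next_sync(state_json: str, lookback: timedelta) -> str:
    """Back the stored cursor off by the lookback window."""
    state = json.loads(state_json)
    stored_cursor = datetime.strptime(state["updated_at"], DATETIME_FORMAT)
    return (stored_cursor - lookback).strftime(DATETIME_FORMAT)


# Hypothetical, simplified state: just the latest cursor value seen so far.
saved_state = '{"updated_at": "2025-03-15T12:30:00Z"}'

next_start = start_for_next_sync(saved_state, timedelta(days=1))
print(next_start)
```

With a one-day lookback, the next sync would start from `2025-03-14T12:30:00Z` rather than the stored cursor itself, so the last day of data gets pulled again.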
s
I love it, thank you. I will report back shortly with results. STOP to unsubscribe. 😊
@Justin Beasley Your knowledge proves correct! I’m now getting pages of data that appear sequential! Much appreciated!! I ran into the 1k / 5 page testing limit, but that is expected. I’m going to save and test with a real sync.
I am filing away the lookback window info but am thinking this time the incrementing `updated_at` should get me what I need. It either comes through my initial pages or would show up next time. I’ll keep an eye on it.
Success!! I’ve got 836 records and a state that matches the latest timestamp. I am very appreciative!
It didn’t seem to use the state value on a subsequent sync, so I’m wondering if I need to configure the incremental settings differently.
j
A few things to confirm:
1. On the connection Schema tab, is the stream you're looking at set to be Incremental for sync mode? (if it was Full Refresh before, it may still be set that way; when a field has an Incremental config, you can use it either way)
2. In the synced data, do you see the `updated_at` column populated with valid date/datetime values?
3. On the Settings tab of the Connection, if you expand Connection State, do you see a value for that stream?
4. What does the Incremental Section in Builder look like? Can you post a screenshot?
5. Can you confirm that the filter query param is NOT set anywhere besides the Incremental section?
s
💯
It was the sync mode, which I had to change to incremental. Nothing was selected, and at the time I set up the sync I hadn’t gotten incremental working. I just synced twice. The first time synced all records, the second synced nothing.