# help-connector-development
l
hi! i'm working on the Convex source connector, which is an incremental HttpStream. Can someone describe how pagination is supposed to work, with the `state` property, `stream_state` argument, and `next_page_token`? We only have one cursor, which I was updating in `read_records`. But then `check_availability` started calling `read_records` and advancing the cursor but discarding results, which makes us skip records. From the code it looks like checkpoint state is determined by the (deprecated) `get_updated_state` after calling `read_records`, which means `read_records` must advance the cursor. But the availability check calls `read_records` and discards the result, which means `read_records` must not advance the cursor. This looks impossible to implement correctly.
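To make the conflict concrete, here is a minimal sketch in plain Python that does not use the Airbyte CDK. `ToyStream`, `check_availability`, and `sync` are hypothetical stand-ins, and the real CDK classes may behave differently. It only illustrates how a cursor advanced inside `read_records` can skip records when a caller reads one record and discards it.

```python
from typing import Iterator, List


class ToyStream:
    """Hypothetical stand-in for an incremental stream with a single cursor."""

    def __init__(self, rows: List[int], page_size: int = 2) -> None:
        self.rows = rows
        self.page_size = page_size
        self.cursor = 0  # index of the next unread row

    def read_records(self) -> Iterator[int]:
        # Fetch one page and advance the cursor eagerly, before yielding.
        page = self.rows[self.cursor : self.cursor + self.page_size]
        self.cursor += len(page)
        yield from page


def check_availability(stream: ToyStream) -> bool:
    # Assumed behaviour of the availability check: read one record, discard it.
    next(stream.read_records(), None)
    return True


def sync(stream: ToyStream) -> List[int]:
    # Read page after page until the cursor reaches the end of the data.
    out: List[int] = []
    while stream.cursor < len(stream.rows):
        out.extend(stream.read_records())
    return out


stream = ToyStream([10, 20, 30, 40, 50])
check_availability(stream)
synced = sync(stream)
print(synced)  # [30, 40, 50]: the first page (10, 20) was consumed by the check
```

In this toy model the first page is lost because the cursor is shared between the check and the sync.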
l
I'm considering updating the cursor in `next_page_token`, but `next_page_token` is called by `read_records`. it happens to work because it's called after the record is yielded, so if you discard the iterator immediately after the first result, as the availability check does, then it's correct. but this argument seems sketchy
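A rough sketch of why that ordering happens to work, again in plain Python with hypothetical names (`LazyCursorStream` is not a CDK class). A generator body only runs up to the `yield` that produced the last value, so code placed after the yields never runs if the caller drops the iterator early.

```python
from typing import Iterator, List


class LazyCursorStream:
    """Hypothetical stream that advances its cursor only after a page is yielded."""

    def __init__(self, rows: List[int], page_size: int = 2) -> None:
        self.rows = rows
        self.page_size = page_size
        self.cursor = 0  # index of the next unread row

    def next_page_token(self, page: List[int]) -> None:
        # Assumption for this sketch: the cursor is advanced here.
        self.cursor += len(page)

    def read_records(self) -> Iterator[int]:
        page = self.rows[self.cursor : self.cursor + self.page_size]
        yield from page
        # Reached only if the caller consumes the whole page.
        self.next_page_token(page)


# Case 1: take one record and discard the iterator, as the availability check does.
peeked = LazyCursorStream([10, 20, 30])
gen = peeked.read_records()
first = next(gen)
gen.close()  # the generator stops at the yield; next_page_token never runs
cursor_after_peek = peeked.cursor

# Case 2: consume the whole page, as a normal read would.
drained = LazyCursorStream([10, 20, 30])
page = list(drained.read_records())
cursor_after_full = drained.cursor

print(first, cursor_after_peek)  # 10 0
print(page, cursor_after_full)   # [10, 20] 2
```

The fragility is visible here too: a caller that pulled past the end of the page and then threw the results away would still advance the cursor.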
i sent https://github.com/airbytehq/airbyte/pull/27226 with an idea for a fix. i don't like how it relies on an implementation detail of `HttpAvailabilityStrategy` (specifically it relies on `next(stream.read_records())` only being called once). i looked at other connectors and could not determine how they avoid this issue
@Marcos Marx (Airbyte) we talked in office hours last week. do you have any updates?