hi i m working on the Convex source connector which is an in Airbyte #help-connector-development

Lee Danilek

06/09/2023, 4:43 PM

hi! i'm working on the Convex source connector, which is an incremental HttpStream. Can someone describe how pagination is supposed to work, with the `state` property, `stream_state` argument, and `next_page_token`? We only have one cursor, which I was updating in `read_records`. But then `check_availability` started calling `read_records` and advancing the cursor but discarding results, which makes us skip records. From the code it looks like checkpoint state is determined by the (deprecated) `get_updated_state` after calling `read_records`, which means `read_records` must advance the cursor. But the availability check calls `read_records` and discards the result, which means `read_records` must not advance the cursor. This looks impossible to implement correctly.

Lee Danilek

06/09/2023, 5:08 PM

I'm considering updating the cursor in `next_page_token`, but `next_page_token` is called by `read_records`. it happens to work because it's called after the record is yielded, so if you discard the iterator immediately after the first result, as the availability check does, then it's correct. but this argument seems sketchy
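
Why this works, and why it feels sketchy, comes down to generator laziness: code placed after a `yield` does not run until the consumer asks for the next item. The sketch below is plain Python under stated assumptions, not the CDK's implementation: `DATA`, `fetch_page`, and `LazyCursorStream` are hypothetical, and the ordering (yield the page's records, then call `next_page_token`) is taken from the description in the message.

```python
from itertools import islice
from typing import Iterator, List, Optional

# Hypothetical stand-in for the source API: six records with increasing ids.
DATA = [{"id": i} for i in range(1, 7)]


def fetch_page(cursor: int, page_size: int = 2) -> List[dict]:
    """Return up to page_size records whose id is greater than cursor."""
    return [r for r in DATA if r["id"] > cursor][:page_size]


class LazyCursorStream:
    """Toy stream that advances its cursor in next_page_token instead."""

    def __init__(self) -> None:
        self.cursor = 0

    def next_page_token(self, page: List[dict]) -> Optional[dict]:
        if not page:
            return None
        # Side effect: the cursor only moves once a whole page was yielded.
        self.cursor = page[-1]["id"]
        return {"cursor": self.cursor}

    def read_records(self) -> Iterator[dict]:
        while True:
            page = fetch_page(self.cursor)
            yield from page
            # Runs only if the consumer asks for more after the last record.
            if self.next_page_token(page) is None:
                return


# Case 1: one record is read and discarded, as the availability check does.
safe = LazyCursorStream()
next(safe.read_records())
cursor_after_check = safe.cursor
safe_ids = [r["id"] for r in safe.read_records()]

# Case 2: a caller reads past the first page and then discards everything.
fragile = LazyCursorStream()
list(islice(fragile.read_records(), 3))
cursor_after_three = fragile.cursor
fragile_ids = [r["id"] for r in fragile.read_records()]

print(cursor_after_check, safe_ids)
print(cursor_after_three, fragile_ids)
```

In case 1 the generator is dropped while paused inside the first page, so the cursor never moves and the later sync sees every record. In case 2 the caller crosses a page boundary before discarding, the cursor advances past records that were thrown away, and the later sync skips them. The approach is only correct as long as every discarding caller stops within the first page.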

Lee Danilek

06/10/2023, 12:25 AM

i sent https://github.com/airbytehq/airbyte/pull/27226 with an idea for a fix. i don't like how it relies on an implementation detail of `HttpAvailabilityStrategy` (specifically it relies on `next(stream.read_records())` only being called once). i looked at other connectors and could not determine how they avoid this issue

Lee Danilek

06/20/2023, 5:48 PM

@Marcos Marx (Airbyte) we talked in office hours last week. do you have any updates?
