Lee Danilek
06/09/2023, 4:43 PM
`state` property, `stream_state` argument, and `next_page_token`? We only have one cursor, which I was updating in `read_records`. But then `check_availability` started calling `read_records` and advancing the cursor but discarding results, which makes us skip records.
From the code it looks like checkpoint state is determined by the (deprecated) `get_updated_state` after calling `read_records`, which means `read_records` must advance the cursor. But the availability check calls `read_records` and discards the result, which means `read_records` must not advance the cursor. This looks impossible to implement correctly.
kapa.ai
06/09/2023, 4:43 PM
kapa.ai
06/09/2023, 4:44 PM
Lee Danilek
06/09/2023, 5:08 PM
`next_page_token`, but `next_page_token` is called by `read_records`. It happens to work because it's called after the record is yielded, so if you discard the iterator immediately after the first result, as the availability check does, then it's correct. But this argument seems sketchy.
Lee Danilek
06/10/2023, 12:25 AM
`HttpAvailabilityStrategy` (specifically, it relies on `next(stream.read_records())` only being called once). I looked at other connectors and could not determine how they avoid this issue.
Lee Danilek
06/20/2023, 5:48 PM
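A minimal sketch of the generator behavior discussed in this thread, using a toy class rather than the Airbyte CDK. `ToyStream`, `advance_before_yield`, and `sync` are hypothetical names invented for illustration; only the pattern of calling `next(stream.read_records())` once and discarding the iterator comes from the messages above. It contrasts advancing the cursor before the `yield` with advancing it after.

```python
from typing import Iterator, List


class ToyStream:
    """Hypothetical stand-in for a stream with a single cursor (not the Airbyte CDK)."""

    def __init__(self, records: List[int], advance_before_yield: bool) -> None:
        self.records = records
        self.cursor = 0  # index of the next record to emit
        self.advance_before_yield = advance_before_yield

    def read_records(self) -> Iterator[int]:
        while self.cursor < len(self.records):
            record = self.records[self.cursor]
            if self.advance_before_yield:
                # Cursor moves as soon as the record is fetched.
                self.cursor += 1
                yield record
            else:
                yield record
                # Runs only when the consumer asks for another item,
                # so an iterator discarded after one item never gets here.
                self.cursor += 1


def check_availability(stream: ToyStream) -> bool:
    """Mimics the availability check: pull one record, then drop the iterator."""
    try:
        next(stream.read_records())
    except StopIteration:
        pass  # an empty stream still counts as available in this sketch
    return True


def sync(stream: ToyStream) -> List[int]:
    """Run the availability check first, then read everything that is left."""
    check_availability(stream)
    return list(stream.read_records())


eager = sync(ToyStream([1, 2, 3], advance_before_yield=True))
lazy = sync(ToyStream([1, 2, 3], advance_before_yield=False))
print(eager)  # [2, 3]     the first record was consumed by the check
print(lazy)   # [1, 2, 3]  the check left the cursor untouched
```

In this sketch the "advance after yield" variant is safe only because the check pulls exactly one item. Requesting a second item from the same iterator would resume the generator and move the cursor, which is the fragility the thread points out.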