Subodh (Airbyte)

05/07/2021, 12:08 PM
does anyone know why we don't use an ORDER BY clause in our select query here?

user

05/07/2021, 4:46 PM
What did you want to order by here?

user

05/10/2021, 3:56 AM
I suppose if we don't order by the cursor, there might be a risk of us reading a later cursor before an earlier one, saving that to state and skipping some records?
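
A minimal sketch of the risk described above, in plain Python rather than the connector's actual code. The `sync` and `run_with_crash` helpers are hypothetical, and the sketch assumes state is checkpointed after every record, which is not how the current implementation behaves:

```python
def sync(rows, state, crash_after=None):
    """One sync attempt: emit rows with cursor > state, in the order given.

    Assumption: state is checkpointed after every record, so a crash leaves
    the last cursor seen as the saved state.
    """
    emitted = []
    pending = [c for c in rows if state is None or c > state]
    for i, cursor in enumerate(pending):
        if crash_after is not None and i == crash_after:
            break  # simulate the sync dying mid-read
        emitted.append(cursor)
        state = cursor  # checkpoint the last cursor seen
    return emitted, state


def run_with_crash(rows):
    """Crash after two records, then resume from the saved state."""
    first, state = sync(rows, None, crash_after=2)
    second, _ = sync(rows, state)
    return first + second


unordered = [1, 5, 2, 3]  # cursor values in the order the DB happened to return them
skipped_run = run_with_crash(unordered)
ordered_run = run_with_crash(sorted(unordered))
```

With the unordered input, the first attempt saves 5 as state before crashing, so the resumed attempt never emits 2 or 3. With the sorted input, all four records are emitted across the two attempts.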

user

05/10/2021, 3:57 AM
guess the tradeoff here is that user-specified cursors don't have to be orderable

user

05/10/2021, 3:58 AM
Do we actually output state for a part of the result of the select call?

user

05/10/2021, 3:59 AM
If we do output state after processing only part of it, what you’re saying makes a lot of sense.

user

05/10/2021, 4:03 AM
good question - double checked, and our current impl emits state only after all the new records for that incremental run are read

user

05/10/2021, 4:04 AM
so that's okay for now. how will this work with the checkpointing we are introducing?

user

05/10/2021, 4:07 AM
I imagine Charles will have to add ordering?

user

05/10/2021, 4:09 AM
I summon @charles to enlighten me whenever convenient

user

05/10/2021, 4:19 AM
good call. it hasn't mattered until now, but we will need to change this.

user

05/10/2021, 4:19 AM
the original thought was that the ORDER BY forces the database to run a "reduce" (a full sort) if there is not already an index on the column being ordered by.

user

05/10/2021, 4:20 AM
so i thought it would be more efficient to skip the order by, but with the new checkpointing, i agree it makes sense to add it.
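
One way the ordered read with checkpointing could look, sketched with Python's built-in `sqlite3` rather than the real connector. The `records` table, the `updated_at` cursor column, the `read_incremental` function, and the `checkpoint_every` parameter are all made up for illustration:

```python
import sqlite3


def read_incremental(conn, state, checkpoint_every=2):
    """Yield ("RECORD", row) and ("STATE", cursor) messages in cursor order.

    Assumption: cursor values are unique. With ties, a checkpoint taken
    between two equal cursor values would need extra handling.
    """
    query = "SELECT id, updated_at FROM records"
    params = ()
    if state is not None:
        query += " WHERE updated_at > ?"
        params = (state,)
    query += " ORDER BY updated_at"  # ordering makes mid-read checkpoints safe
    count = 0
    for row in conn.execute(query, params):
        yield ("RECORD", row)
        state = row[1]
        count += 1
        if count % checkpoint_every == 0:
            yield ("STATE", state)  # intermediate checkpoint
    yield ("STATE", state)  # final state once everything is read


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, updated_at INTEGER)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [(1, 10), (2, 30), (3, 20), (4, 40)],  # inserted out of cursor order
)

messages = list(read_incremental(conn, None))
cursors_read = [row[1] for kind, row in messages if kind == "RECORD"]
states = [value for kind, value in messages if kind == "STATE"]

# Resuming from the first intermediate checkpoint picks up only later records.
resumed = [row[1] for kind, row in read_incremental(conn, states[0]) if kind == "RECORD"]
```

Because rows arrive in cursor order, any emitted state is a safe resume point: everything at or below it has already been read.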

user

05/10/2021, 5:01 AM
yep, not requiring an ORDER BY previously makes sense. this path forward adds ordering just as we start to benefit from it

user

05/10/2021, 5:02 AM
seems fine as long as we document that users should create an index on the cursor column when using incremental, to prevent high db cpu usage
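
A rough illustration of why the index matters, using SQLite through Python's `sqlite3` as a stand-in. The `records` table, the `updated_at` column, and the index name are hypothetical, and other databases have their own EXPLAIN syntax and plan output:

```python
import sqlite3

QUERY = "SELECT id, updated_at FROM records WHERE updated_at > ? ORDER BY updated_at"


def plan(conn):
    """Return the detail text of each step in SQLite's query plan for QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (0,)).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


def needs_sort(details):
    """True if the plan contains an explicit sort step for the ORDER BY."""
    return any("TEMP B-TREE" in detail for detail in details)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, updated_at INTEGER)")

plan_without_index = plan(conn)

# Index on the (hypothetical) cursor column.
conn.execute("CREATE INDEX idx_records_updated_at ON records (updated_at)")
plan_with_index = plan(conn)

print("sort step without index:", needs_sort(plan_without_index))
print("sort step with index:", needs_sort(plan_with_index))
```

Without the index, SQLite reports a temporary B-tree to satisfy the ORDER BY, which is the extra sort work discussed above. With the index, it can walk the index in cursor order instead.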