does anyone know why we dont use ORDER BY clause i...
# contributing-to-airbyte
s
does anyone know why we don't use an ORDER BY clause in our select query here?
u
What did you want to order by here?
u
I suppose if we don't order by the cursor, there might be a risk of us reading a later cursor before an earlier one, saving that to state and skipping some records?
u
guess the tradeoff here is that user-specified cursors don't have to be orderable
u
Do we actually output state for a part of the result of the select call?
u
If we do output state after only processing a part of it what you’re saying makes a lot of sense.
u
good question - double checked and our current impl emits state only after all the new records for that incremental run are read
u
so that's okay for now. how will this work with the checkpointing we are introducing?
u
I imagine Charles will have to add ordering?
u
I summon @charles to enlighten me whenever convenient
u
good call. it hasn't mattered until now, but will need to change this.
u
the original thought was that the ORDER BY forces the database to run a "reduce" (a sort over the result set) if there is not already an index on the select column.
u
so i thought it would be more efficient to skip the order by, but with the new checkpointing, i agree it makes sense to add it.
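
To make the concern in this thread concrete, here is a minimal sketch of why checkpointing mid-read needs ordered results. It is a toy simulation, not Airbyte's implementation: `run_sync`, `sync_with_one_crash`, the checkpoint interval, and the "save the highest cursor seen so far" rule are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def run_sync(rows: List[int], start_state: int, checkpoint_every: int,
             crash_after: Optional[int] = None) -> Tuple[List[int], int]:
    """Emit cursor values greater than start_state, in the order given.

    State is checkpointed every `checkpoint_every` records using the highest
    cursor seen so far. `crash_after` simulates a failure after that many
    records; only the last checkpointed state survives the crash.
    """
    emitted: List[int] = []
    saved_state = start_state
    max_seen = start_state
    for cursor in rows:
        if cursor <= start_state:
            continue  # stands in for: WHERE cursor > :state
        if crash_after is not None and len(emitted) == crash_after:
            return emitted, saved_state
        emitted.append(cursor)
        max_seen = max(max_seen, cursor)
        if len(emitted) % checkpoint_every == 0:
            saved_state = max_seen
    return emitted, max_seen  # full read finished, so final state is safe


def sync_with_one_crash(rows: List[int]) -> List[int]:
    """Run a sync that crashes after two records, then resume from saved state."""
    first, state = run_sync(rows, 0, checkpoint_every=2, crash_after=2)
    second, _ = run_sync(rows, state, checkpoint_every=2)
    return sorted(set(first + second))


print(sync_with_one_crash([1, 5, 2, 3, 4]))  # [1, 5]: records 2, 3, 4 are lost
print(sync_with_one_crash([1, 2, 3, 4, 5]))  # [1, 2, 3, 4, 5]: nothing skipped
```

With state emitted only after the whole read (the current behaviour described above), the unordered case is harmless; it only breaks once state is saved part-way through.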
u
yep, not requiring an ORDER BY previously makes sense. this path forward adds ordering just as we'll benefit from it
u
seems fine as long as we add documentation that users should create an index on the cursor column if using incremental to prevent high db cpu usage
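
As a rough illustration of the index advice, the sketch below uses SQLite from Python's standard library as a stand-in database and checks whether the query plan includes a separate sort step before and after indexing the cursor column. The `users` table, the `updated_at` cursor column, and the index name are hypothetical, and other databases report their plans differently (and may plan the query differently), so treat this as a way to reason about the cost rather than a statement about any particular source.

```python
import sqlite3

# Hypothetical incremental query: filter on the cursor column and order by it.
QUERY = "SELECT id, updated_at FROM users WHERE updated_at > ? ORDER BY updated_at"


def plan_details(conn: sqlite3.Connection) -> list:
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (0,)).fetchall()
    return [row[3] for row in rows]


def needs_sort(conn: sqlite3.Connection) -> bool:
    """True if SQLite plans a temporary B-tree, i.e. an explicit sort step."""
    return any("TEMP B-TREE" in detail for detail in plan_details(conn))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, updated_at INTEGER)")

sort_before = needs_sort(conn)  # no index on the cursor column yet
conn.execute("CREATE INDEX idx_users_updated_at ON users (updated_at)")
sort_after = needs_sort(conn)   # index can serve both the filter and the order

print(sort_before, sort_after)
```

In SQLite, the plan without the index should show a `USE TEMP B-TREE FOR ORDER BY` step, and that step should disappear once the index exists, which is the extra work the ORDER BY would otherwise add on every incremental run.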