Hi. I'm testing Airbyte to ingest data into our S3 data lake from several sources (mainly Postgres databases). I noticed that the Parquet files have some extra fields and that updated rows are appended rather than overwritten (which is good, as it could provide time-travel capabilities, like Iceberg). How do I select the most recent version of each row? Is there a Python library that makes it easier to access the files? And is there a way to periodically prune the Parquet files so they contain only the latest rows?
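
To make the first question concrete, here is a minimal sketch of the deduplication I have in mind. It rests on assumptions: that each table has a primary-key column (called `id` here, hypothetically), and that one of the extra Airbyte fields is a timestamp I can order by (called `_airbyte_emitted_at` here; the actual name may differ depending on the Airbyte version). The helper names `latest_rows` and `load_rows` are made up for illustration, and `load_rows` assumes `pyarrow` is installed.

```python
from typing import Any, Dict, Iterable, List

Row = Dict[str, Any]


def latest_rows(
    rows: Iterable[Row],
    key: str = "id",                          # assumed primary-key column
    emitted_at: str = "_airbyte_emitted_at",  # assumed Airbyte timestamp column
) -> List[Row]:
    """Keep only the most recently emitted row for each key value."""
    latest: Dict[Any, Row] = {}
    for row in rows:
        k = row[key]
        # Replace the stored row when this one is at least as recent.
        if k not in latest or row[emitted_at] >= latest[k][emitted_at]:
            latest[k] = row
    return list(latest.values())


def load_rows(path: str) -> List[Row]:
    """Read one Parquet file into a list of dicts (requires pyarrow)."""
    import pyarrow.parquet as pq  # imported lazily so the sketch runs without it

    return pq.read_table(path).to_pylist()


# Small in-memory example standing in for rows read from the Parquet files.
sample = [
    {"id": 1, "name": "old", "_airbyte_emitted_at": 100},
    {"id": 2, "name": "only", "_airbyte_emitted_at": 105},
    {"id": 1, "name": "new", "_airbyte_emitted_at": 200},
]
result = latest_rows(sample)
```

On real data I would call `latest_rows(load_rows(path))` per file, but this keeps everything in memory, so I am not sure it scales, which is part of why I am asking about a library or a pruning job.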