Hi,
I am just exploring this project and have a question about ingesting data from S3 into Pinot.
At our company, new data arrives as JSON/CSV files every minute or every hour. We currently use Postgres, which is hard to scale, so we are looking for a performant, horizontally scalable OLAP solution, ideally one that runs on Kubernetes.
My question is whether it is possible to sync an S3 bucket with Pinot. That is, if we add new CSV/JSON files to the bucket, Pinot should automatically ingest only the new files into its segment store, without any duplicates.
I expect this is doable using S3 events, but I couldn’t find out whether something like this is already in place.
If not, we will have to cook up our own solution using S3 events, or set up a Kafka cluster to stream data to Pinot.
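To make the S3-events idea concrete, here is a rough Python sketch of what I have in mind for a home-grown solution. It only covers the "pick up new files once" part: it reads the standard S3 event notification JSON, keeps object-created events for CSV/JSON files, and skips keys it has already seen. The names `new_object_keys`, `handle_event`, and the `ingest` callback are hypothetical, the in-memory `seen` set stands in for whatever durable store would really be needed, and the actual Pinot ingestion step is deliberately left as a placeholder since I don't know what the right mechanism is.

```python
from urllib.parse import unquote_plus


def new_object_keys(event, seen):
    """Return (bucket, key) pairs from an S3 event that were not handled before.

    `event` is a parsed S3 event notification (a dict with a "Records" list).
    `seen` is a set of (bucket, key) pairs; it is updated in place.
    Assumption: a real deployment would keep `seen` in durable storage.
    """
    fresh = []
    for record in event.get("Records", []):
        # Only react to object-creation events (e.g. "ObjectCreated:Put").
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys in S3 notifications are URL-encoded.
        key = unquote_plus(record["s3"]["object"]["key"])
        # We only care about CSV and JSON data files.
        if not key.endswith((".csv", ".json")):
            continue
        # Skip files that were already handed over for ingestion.
        if (bucket, key) in seen:
            continue
        seen.add((bucket, key))
        fresh.append((bucket, key))
    return fresh


def handle_event(event, seen, ingest):
    """Call the hypothetical `ingest(bucket, key)` once per new file."""
    for bucket, key in new_object_keys(event, seen):
        ingest(bucket, key)
```

The `ingest` callback is where the file would be pushed into Pinot (or a batch job triggered); that is the part I am hoping already exists.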
Thanks!