#general

Mus

05/04/2021, 11:29 PM
Hi! Is there a way to have the data streamed from Kafka and then put into S3 in Parquet format?

Mayank

05/04/2021, 11:30 PM
You mean using Pinot? No.
With Pinot you can consume from Kafka and store the data in S3, but it will be in Pinot index format, not Parquet.

Mus

05/04/2021, 11:32 PM
Got it, thanks! Is there any other service you know of that can handle this at a very large scale?

Mayank

05/04/2021, 11:39 PM
This is usually done via ETL pipelines (as you may need transforms on the Kafka topic before storing). Depending on your specific requirements, there may be standard solutions out there you can try.
👍 1

Kishore G

05/05/2021, 12:36 AM
@User Gobblin can do that.
There are many projects that can move data from Kafka to S3; Kafka Connect is one as well.

Mus

05/05/2021, 12:42 AM
Thanks @User, I'll check it out. I have identified https://github.com/pinterest/secor, and @User pointed out https://github.com/apache/flink. I'll review them and pick according to our needs. Thanks a lot!

Laxman Ch

05/05/2021, 12:33 PM
@User: Kafka Connect is another option. The S3 plugin is open source too.
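
For anyone landing on this thread later, here is a rough sketch of what the Kafka Connect route could look like. The thread does not say which S3 plugin is meant; this assumes Confluent's S3 sink connector and its Parquet format class. The connector name, topic, bucket, and region are made-up placeholders, and `build_s3_parquet_sink` is just a hypothetical helper. It only builds the JSON payload you would POST to the Kafka Connect REST API (`/connectors`); it does not send anything.

```python
import json


def build_s3_parquet_sink(name, topic, bucket, region, flush_size=10000):
    """Build a Kafka Connect connector payload (hypothetical helper).

    Assumes Confluent's S3 sink connector is installed on the Connect
    workers. All argument values are placeholders chosen by the caller.
    """
    return {
        "name": name,
        "config": {
            # Assumption: Confluent S3 sink connector with Parquet output.
            "connector.class": "io.confluent.connect.s3.S3SinkConnector",
            "storage.class": "io.confluent.connect.s3.storage.S3Storage",
            "format.class": "io.confluent.connect.s3.format.parquet.ParquetFormat",
            "topics": topic,
            "s3.bucket.name": bucket,
            "s3.region": region,
            # Number of records written per S3 object before a flush.
            "flush.size": str(flush_size),
            "tasks.max": "1",
        },
    }


if __name__ == "__main__":
    # Placeholder values, not taken from the thread.
    payload = build_s3_parquet_sink(
        "events-to-s3", "events", "my-data-lake", "us-east-1"
    )
    print(json.dumps(payload, indent=2))
```

As far as I know, Parquet output needs records that carry a schema (for example Avro via a schema-aware converter), so converter settings would likely need to be added to this config depending on how the topic is serialized.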