# general
I want to do some benchmarking with this combination: Cassandra as storage + analytics on Pinot
One of Pinot's strengths is its indexing techniques, which are part of its own storage component.
If you can provide more details around your use case, perhaps we can suggest better alternatives?
hmm, in fact I'm only learning Cassandra and looking for ways to integrate it with any OLAP system. I don't have a use case right now, just wondering about the ease of integration between Cassandra and Pinot. by the way we are based in Kazakhstan lol 🙂
Nice that Pinot is also reaching Kazakhstan 🙂. So typically, for Pinot, you ingest data either from realtime streams (e.g. Kafka) and/or batch (i.e. offline data push).
As I mentioned, Pinot is fast partly because of its own storage + execution engines. Using another storage system is probably not going to get you what you want. I am not sure I see the value in using Cassandra as storage and Pinot as query execution.
If your end goal is analytics, then it's better to use just Pinot by itself.
@User you mean integrating Cassandra and Pinot is not a good idea? but where does Pinot get its data from? or does Pinot just connect to wherever the data is stored (S3, HDFS)?
I am saying I am unsure what the value add is for connecting the two (unless you have a strong business case).
to me Cassandra is storage, so I assumed that any other Apache tool (Druid, Pinot) can simply connect to it and I can run my OLAP queries directly against Cassandra... wrong?
Pinot is also storage.
The question is: do you need OLTP storage, or are you good with OLAP storage?
@User oh ok, i was just trying to integrate them for the sake of seeing how well they glue together.
If you need OLTP, then many folks use flows like OLTP -> Debezium (or similar systems) to push changes to a messaging system (Kafka) -> Pinot
yeah I should probably focus on an actual biz case, thx mayank
yeah, that would define your requirements, and help us suggest better solutions / usage of Pinot
you can replace MySQL in that diagram with any other storage - Cassandra, MongoDB etc
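To make the OLTP -> Debezium -> Kafka -> Pinot flow a bit more concrete, here is a minimal Python sketch of the step between Kafka and Pinot: flattening a Debezium-style change event into a flat row that an OLAP table could ingest. It assumes the common Debezium envelope fields (`before`, `after`, `op`, `ts_ms`), optionally wrapped in a `payload` key by the JSON converter. The function name, the `_op` / `_ts_ms` column names, and the `order_id` / `amount` fields are hypothetical, and the exact shape of `before` / `after` depends on which connector you use, so treat this as an illustration rather than a ready-made transform.

```python
import json

# Debezium operation codes: c = create, u = update, d = delete, r = snapshot read.
OP_NAMES = {"c": "insert", "u": "update", "d": "delete", "r": "snapshot"}


def flatten_change_event(raw: str) -> dict:
    """Turn one Debezium-style change event (JSON string) into a flat row.

    Hypothetical helper: the output column names (_op, _ts_ms) are assumptions.
    """
    event = json.loads(raw)
    # With schemas enabled, the JSON converter wraps the envelope in "payload".
    payload = event.get("payload", event)
    op = payload["op"]
    # Deletes carry the last known row in "before"; other ops carry it in "after".
    row = payload["before"] if op == "d" else payload["after"]
    flat = dict(row or {})
    flat["_op"] = OP_NAMES.get(op, op)
    flat["_ts_ms"] = payload["ts_ms"]
    return flat


# Hypothetical sample event; real events also carry connector-specific "source" metadata.
SAMPLE_EVENT = json.dumps({
    "before": None,
    "after": {"order_id": 1, "amount": 25.5},
    "source": {"table": "orders"},
    "op": "c",
    "ts_ms": 1700000000000,
})

if __name__ == "__main__":
    print(flatten_change_event(SAMPLE_EVENT))
```

In a real pipeline this kind of flattening may not need custom code at all, since it can often be handled by the connector or by ingestion-time transforms; the sketch only shows what the data looks like before and after.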
@User hmm thx, where can Kafka Connect be placed? in the cloud?