i wanna do some benchmarking with this combination...
# general
e
I want to do some benchmarking with this combination: Cassandra as storage + analytics on Pinot
m
One of the powers of Pinot is its indexing techniques (its own storage component).
If you can provide more details around your use case, perhaps we can suggest better alternatives?
e
Hmm, in fact I'm only learning Cassandra and looking for ways to integrate it with any OLAP system. I don't have a use case right now, just wondering about the ease of integration between Cassandra and Pinot. By the way, we are based in Kazakhstan lol 🙂
m
Nice that Pinot is also reaching Kazakhstan 🙂. So typically, for Pinot, you ingest data from realtime streams (e.g. Kafka) and/or in batch (i.e. offline data push).
As I mentioned, Pinot is fast partly because of its own storage + execution engines. Using another storage system is probably not going to get you what you want. I am not sure I see the value in using Cassandra as storage and Pinot as query execution.
If your end goal is analytics, then it's better to use just Pinot by itself.
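To make the realtime path a bit more concrete, here is a rough sketch of what a Kafka-backed REALTIME table config can look like, built as a Python dict and dumped to JSON. The table name, topic, broker address and time column are made-up placeholders, and the exact keys and plugin class names depend on your Pinot version, so check them against the docs for your release before using any of it.

```python
import json


def build_realtime_table_config(table_name, topic, broker_list, time_column):
    """Build a Pinot-style REALTIME table config as a plain dict.

    All argument values are illustrative; key and class names are assumptions
    that should be verified against the Pinot version in use.
    """
    return {
        "tableName": table_name,
        "tableType": "REALTIME",
        "segmentsConfig": {
            "timeColumnName": time_column,
            "schemaName": table_name,
            "replicasPerPartition": "1",
        },
        "tenants": {},
        "tableIndexConfig": {
            "loadMode": "MMAP",
            "streamConfigs": {
                "streamType": "kafka",
                "stream.kafka.topic.name": topic,
                "stream.kafka.broker.list": broker_list,
                # Plugin class names vary between Pinot releases (assumption).
                "stream.kafka.consumer.factory.class.name":
                    "org.apache.pinot.plugin.stream.kafka20.KafkaConsumerFactory",
                "stream.kafka.decoder.class.name":
                    "org.apache.pinot.plugin.stream.kafka.KafkaJSONMessageDecoder",
            },
        },
        "metadata": {},
    }


# Hypothetical names, for illustration only.
config = build_realtime_table_config(
    table_name="orders",
    topic="orders-topic",
    broker_list="localhost:9092",
    time_column="ts_ms",
)
print(json.dumps(config, indent=2))
```

The resulting JSON is what you would hand to the Pinot controller when creating the table, together with a matching schema. Batch (offline) ingestion uses a separate OFFLINE table and an ingestion job instead of `streamConfigs`.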
e
@User you mean integrating Cassandra and Pinot is not a good idea? But where does Pinot get its data from? Or does Pinot just connect to wherever the data is stored (S3, HDFS)?
m
I am saying I am unsure what the value add is of connecting the two (unless you have a strong business case).
e
To me Cassandra is storage, so I assumed that any other Apache tool (Druid, Pinot) could simply connect to it and I could run my OLAP queries directly against Cassandra... wrong?
m
Pinot is also storage.
The question is: do you need OLTP storage, or are you good with OLAP storage?
e
@User oh ok, I was just trying to integrate them for the sake of seeing how well they glue together.
m
If you need OLTP, then many folks use flows like OLTP -> Debezium (or similar systems) to push changes to a messaging system (Kafka) -> Pinot
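As a rough illustration of the middle of that flow, here is a minimal sketch of flattening a Debezium-style change event into a flat row that an OLAP table could ingest. The `before` / `after` / `op` / `ts_ms` fields follow Debezium's change event envelope; the function name, the `_op`, `_deleted` and `_change_ts_ms` columns, and the sample `orders` data are all hypothetical.

```python
from typing import Any, Dict, Optional


def flatten_change_event(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Turn a Debezium-style change event into a flat row.

    Returns None when the event carries no row image.
    """
    # With schemas enabled the envelope sits under "payload"; otherwise it is top level.
    payload = event.get("payload", event)
    op = payload["op"]  # "c" create, "u" update, "d" delete, "r" snapshot read

    # Deletes only carry the old row image; everything else carries the new one.
    row_image = payload["before"] if op == "d" else payload["after"]
    if row_image is None:
        return None

    row = dict(row_image)
    # Hypothetical bookkeeping columns so downstream queries can filter by change type.
    row["_op"] = op
    row["_deleted"] = op == "d"
    row["_change_ts_ms"] = payload["ts_ms"]
    return row


# Made-up sample event, for illustration only.
sample_event = {
    "before": None,
    "after": {"order_id": 42, "status": "PAID", "amount": 19.99},
    "source": {"connector": "mysql", "table": "orders"},
    "op": "c",
    "ts_ms": 1700000000000,
}
print(flatten_change_event(sample_event))
```

In a real pipeline this kind of flattening usually happens in the connector or in a stream processor before the rows land on the Kafka topic that Pinot consumes, rather than in a standalone script.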
e
Yeah, I should probably focus on an actual business case. Thx Mayank
m
Yeah, that would define your requirements and help us suggest better solutions / usage of Pinot.
You can replace MySQL in that diagram with any other storage: Cassandra, MongoDB, etc.
e
@User hmm thx, where can Kafka Connect be placed? In the cloud?