Any suggestions for data stores that efficiently s...
# troubleshooting
a
Any suggestions for data stores that efficiently serve both as bounded and unbounded sources? I have experience with writing connectors, but wanted to know if there was any community insight that would obviate my doing a broad survey.
d
Maybe Kafka?
I do think Kafka is most often used for this.
a
Kafka has two big problems. 1) It can’t produce a change stream, so you’re forced to hold all data in normalization tables. 2) Lack of pushdown projection and filtering horrendously increases I/O and deserialization costs.
That's in addition to the need to set an aggressive roll/compaction policy.
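To make the pushdown point concrete, here's a rough sketch of what client-side filtering costs. It doesn't use a real Kafka client: `records` is a made-up stand-in for a topic's contents, and `client_side_filter` is a hypothetical helper. It just counts how much has to be read and decoded to get a small slice back.

```python
import json

# Hypothetical topic contents: each record is an opaque byte payload,
# which is all the reader can see before deserializing it.
records = [
    json.dumps({"id": i, "region": "eu" if i % 10 == 0 else "us", "payload": "x" * 100}).encode()
    for i in range(1000)
]

def client_side_filter(raw_records, region):
    """Without pushdown, every record is read and fully decoded before filtering."""
    bytes_read = 0
    decoded = 0
    matches = []
    for raw in raw_records:
        bytes_read += len(raw)       # the whole payload crosses the wire
        row = json.loads(raw)        # the whole payload is deserialized
        decoded += 1
        if row["region"] == region:  # the filter only runs after the full decode
            matches.append(row["id"])  # projection to one field happens last
    return matches, decoded, bytes_read

matches, decoded, bytes_read = client_side_filter(records, "eu")
print(f"kept {len(matches)} of {decoded} decoded records, {bytes_read} bytes read")
```

Only a tenth of the records match, but all of them are read and decoded. A store with pushdown could, in principle, skip the rest.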
I almost feel like I need to create my own sharded, indexed DB on top of RocksDB to get mixed bounded/unbounded access, but that seems crazy. Ignite seemed promising, but obnoxiously it can’t guarantee that the initial load is synced with the change stream.
I suppose I could create progressive Kafka Debezium topics representing manual compaction. 🤢
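For reference, this is roughly the guarantee I mean by "initial load synced with the change stream". It's a minimal in-memory sketch, not any real store's API: `VersionedStore`, `snapshot`, and `changes_since` are all hypothetical names. The point is that the bounded read returns the sequence number it reflects, so the unbounded read can resume exactly after it.

```python
from dataclasses import dataclass, field

@dataclass
class VersionedStore:
    """Hypothetical store: every write is appended to a change log with a sequence number."""
    data: dict = field(default_factory=dict)
    log: list = field(default_factory=list)  # entries are (seq, key, value), seq starts at 1

    def put(self, key, value):
        self.data[key] = value
        self.log.append((len(self.log) + 1, key, value))

    def snapshot(self):
        """Bounded read: a copy of the current state plus the sequence number it reflects."""
        return dict(self.data), len(self.log)

    def changes_since(self, seq):
        """Unbounded read (a finite slice here): every change with a sequence number > seq."""
        return self.log[seq:]

store = VersionedStore()
store.put("a", 1)
store.put("b", 2)

state, seq = store.snapshot()  # initial load, tagged with seq == 2

store.put("a", 3)              # writes that land after the snapshot
store.put("c", 4)

# Resume from the snapshot's sequence number: no gap, no replayed duplicates.
for _, key, value in store.changes_since(seq):
    state[key] = value

print(state == store.data)
```

If the snapshot can't report a position in the change stream, the reader has to either over-read and deduplicate or risk missing writes that landed during the load.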
d
You're right, there is a lot of overhead there.