Apache Flink

Any suggestions for data stores that efficiently serve both as bounded and unbounded sources?
I have experience with writing connectors, but wanted to know if there was any community insight that would obviate my doing a broad survey.

I do think Kafka is most often used for this.

Kafka has two big problems.
1) It can’t produce a change stream, so you’re forced to hold all data in normalization tables.
2) Lack of pushdown projection and filtering horrendously increases I/O and deserialization costs.
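
To make point 2 concrete, here is a minimal, self-contained sketch of the filtering half of the problem (projection is not modeled). It does not use Kafka at all: `make_log`, `scan_log`, and `IndexedStore` are hypothetical names, and the "log" is just a list of JSON bytes. It only shows how many records have to be deserialized when the filter runs in the consumer versus in the store.

```python
import json
from collections import defaultdict


def make_log(n):
    """Build n serialized records; every 10th one belongs to region 'eu'."""
    return [
        json.dumps(
            {"id": i, "region": "eu" if i % 10 == 0 else "us", "payload": "x" * 100}
        ).encode("utf-8")
        for i in range(n)
    ]


def scan_log(log, region):
    """No pushdown: the consumer must decode every record to test the filter."""
    decoded = 0
    ids = []
    for raw in log:
        record = json.loads(raw)
        decoded += 1
        if record["region"] == region:
            ids.append(record["id"])
    return ids, decoded


class IndexedStore:
    """Hypothetical store that indexes by region at write time (assumption)."""

    def __init__(self, log):
        self._log = log
        self._by_region = defaultdict(list)
        for pos, raw in enumerate(log):
            # Indexing cost is paid once on ingest, not on every read.
            self._by_region[json.loads(raw)["region"]].append(pos)

    def scan(self, region):
        """Filter pushdown: only matching records are read and decoded."""
        decoded = 0
        ids = []
        for pos in self._by_region.get(region, []):
            ids.append(json.loads(self._log[pos])["id"])
            decoded += 1
        return ids, decoded
```

With 1,000 records and a filter that matches one in ten, the plain log scan decodes all 1,000 records while the indexed scan decodes 100, and both return the same ids.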

That's in addition to the need to set an aggressive roll/compaction policy.
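
For reference, a rough sketch of what an "aggressive" policy could look like as Kafka topic-level settings. The keys are real Kafka topic configs; every value is an illustrative assumption, not a recommendation, and `as_cli_arg` is a hypothetical helper that only formats the settings as a comma-separated `key=value` string.

```python
# Illustrative values only (assumptions); tune for your own workload.
AGGRESSIVE_COMPACTION = {
    "cleanup.policy": "compact",                    # keep the latest record per key
    "segment.ms": str(10 * 60 * 1000),              # roll segments every 10 minutes
    "min.cleanable.dirty.ratio": "0.1",             # compact when 10% of the log is dirty
    "max.compaction.lag.ms": str(60 * 60 * 1000),   # force eligibility after 1 hour
    "delete.retention.ms": str(60 * 60 * 1000),     # keep tombstones for 1 hour
}


def as_cli_arg(config):
    """Format topic settings as a comma-separated key=value string."""
    return ",".join(f"{key}={value}" for key, value in config.items())
```

Rolling matters because the active segment is never compacted, so a long `segment.ms` delays cleanup no matter how low the dirty ratio is set.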

I almost feel like I need to create my own sharded, indexed DB on top of RocksDB to get mixed bounded/unbounded reads, but that seems crazy.
Ignite seemed promising, but obnoxiously it can’t guarantee that the initial load is synced with the change stream.
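
To spell out the guarantee being asked for here, a minimal in-memory sketch. This is not Ignite's or Kafka's API: `Store`, `Change`, and `materialize` are hypothetical names. The snapshot comes back together with the sequence number of the last change it reflects, so the reader can resume the change stream from exactly that point with no gaps and no duplicates.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Change:
    seq: int     # position in the change log, starting at 1
    key: str
    value: Any   # None means the key was deleted


class Store:
    """Toy store whose snapshot is tied to a position in its change log."""

    def __init__(self):
        self._data = {}
        self._log = []

    def put(self, key, value):
        self._log.append(Change(len(self._log) + 1, key, value))
        if value is None:
            self._data.pop(key, None)
        else:
            self._data[key] = value

    def snapshot(self):
        """Return (state copy, seq of the last change included in it)."""
        return dict(self._data), len(self._log)

    def changes_after(self, seq):
        """Return every change with a sequence number greater than seq."""
        return self._log[seq:]


def materialize(snapshot, changes):
    """Bounded load first, then apply the unbounded tail on top of it."""
    state = dict(snapshot)
    for change in changes:
        if change.value is None:
            state.pop(change.key, None)
        else:
            state[change.key] = change.value
    return state
```

If the snapshot and the stream position cannot be obtained atomically like this, the reader either misses changes made during the load or has to replay and deduplicate them.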

I suppose I could create progressive Kafka Debezium topics representing manual compaction. :nauseated_face:

You're right, there is a lot of overhead there.