# troubleshooting
hey all! i've been considering Flink as a platform for a variety of IoT data processing use cases. One question i keep coming back to is what first-class capabilities Flink provides for dealing with "gaps" in data. Specifically, i'm considering the behavior in the following scenario:

1. Device "X" publishes data regularly, every minute. A stream processing job processes this data using a sliding window, keyed by the unique device ID.
2. Device "X" stops publishing for, say, 20 minutes, while the sliding window evaluation is 5 minutes.

What happens in scenario (2)? I would expect that a transient outage shorter than the sliding window is generally handled (the window simply accumulates less data, with potentially late arrivals on reconnect). But what happens when the key disappears entirely from the stream for the duration of the window? My guess is that Flink simply "does nothing": no window is accumulated or closed for the device ID in question, since it wasn't present in the stream for that duration, and Flink doesn't seem to have an upfront expectation of which keys to expect in the first place.

Is my understanding correct? If so, what patterns are typically applied for this type of situation?
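To make the scenario concrete, here's a rough sketch of the mental model behind my guess. It's plain Java with no Flink dependencies, so it says nothing authoritative about what Flink actually does. Assumptions on my part: the 5 minutes is the window size, the slide is 1 minute, timestamps are whole minutes, and a window for a key only comes into existence when an element for that key lands in it. The class and method names (`KeyedWindowGapSketch`, `windowStartsFor`, etc.) are made up for illustration, and device "Y" is a hypothetical second device that never drops out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class KeyedWindowGapSketch {
    static final long SIZE_MIN = 5;   // window size in minutes (from the scenario)
    static final long SLIDE_MIN = 1;  // slide in minutes (assumption, not stated above)

    // Start minutes of every sliding window [start, start + SIZE_MIN) that contains ts.
    static List<Long> windowStartsFor(long ts) {
        List<Long> starts = new ArrayList<>();
        long lastStart = ts - Math.floorMod(ts, SLIDE_MIN);
        for (long start = lastStart; start > ts - SIZE_MIN; start -= SLIDE_MIN) {
            starts.add(start);
        }
        return starts;
    }

    // Windows that ever exist for one key: only those touched by at least one event.
    static TreeSet<Long> materializedWindows(List<Long> eventMinutes) {
        TreeSet<Long> windows = new TreeSet<>();
        for (long ts : eventMinutes) {
            windows.addAll(windowStartsFor(ts));
        }
        return windows;
    }

    // Helper: one event per minute in [from, toExclusive).
    static List<Long> minutes(long from, long toExclusive) {
        List<Long> out = new ArrayList<>();
        for (long m = from; m < toExclusive; m++) {
            out.add(m);
        }
        return out;
    }

    // Device X: publishes minutes 0..9, goes silent for 20 minutes, resumes at 30..34.
    static List<Long> deviceXEvents() {
        List<Long> events = minutes(0, 10);
        events.addAll(minutes(30, 35));
        return events;
    }

    // Device Y (hypothetical): publishes every minute without a gap.
    static List<Long> deviceYEvents() {
        return minutes(0, 35);
    }

    // Window starts that exist for Y but never come into existence for X.
    static List<Long> windowsMissingForX() {
        TreeSet<Long> missing = new TreeSet<>(materializedWindows(deviceYEvents()));
        missing.removeAll(materializedWindows(deviceXEvents()));
        return new ArrayList<>(missing);
    }

    public static void main(String[] args) {
        System.out.println("Window starts with no window at all for X: " + windowsMissingForX());
    }
}
```

Under this model, every window that falls entirely inside X's silent period never exists for X, so nothing downstream would ever see an "empty window" result for it. That's the behavior i'm trying to confirm (or have corrected).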