Hi all, I am trying to implement a low latency window system using keyed process functions like that is described here:
https://flink.apache.org/2020/07/30/advanced-flink-application-patterns-vol.3-custom-window-processing I am using it for a counting process function. The function keeps track of a count state for every window, I am using timers firing to decrement the count after the window has passed, but the timer behavior is not working how I would like. Specifically timers don’t fire until the watermark passes their time. But some of the keys of the keyed process function don’t receive values that often, so the watermark might not be advanced for a while, and timer fires AFTER the event that advances the watermark, even if the timer was for a time much earlier than the latest event that advances the watermark. eg. I have a timer to trigger today at 5pm, and no events happen until today at 6pm, so the count seen when the 6pm event comes in is overcounting, because it should have decremented at 5pm, but instead it decrements after the 6pm event. Is there a way to get timers to fire before the event that advances them, provided the event is AFTER the timer (if the timer time and the event time are equal, the event should go first).
I am seeing this behavior in a unit test, so I’m also wondering if its something that only will happen in unit tests because I’m creating a stream from a list, and there is not actual time involved, and a realtime running job won’t have this issue?