I’m curious if anyone has worked with window sizes...
# troubleshooting
m
I’m curious if anyone has worked with very large window sizes for counts, like year-long counters? Or long timers: say I wanted to increment a count on an event and then set a timer to decrement the counter after 1 year. Is there a way to do that without keeping everything in Flink state (heap or RocksDB)?
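For what it's worth, here is a minimal sketch of the increment-now, decrement-after-a-year idea in plain Python (not Flink API). It assumes day-level precision is good enough, so instead of one timer per event it keeps one integer per day and drops days that fall out of a 365-day window. The names `BucketedYearCounter`, `add`, `expire`, and `total` are hypothetical, made up for illustration.

```python
DAY_MS = 24 * 60 * 60 * 1000
WINDOW_DAYS = 365  # assumption: "a year" approximated as 365 days


class BucketedYearCounter:
    """Rolling count over the last `window_days` days, one integer per day."""

    def __init__(self, window_days=WINDOW_DAYS):
        self.window_days = window_days
        self.buckets = {}  # day index -> number of events seen that day

    def add(self, event_time_ms, n=1):
        # Increment the bucket for the day the event belongs to.
        day = event_time_ms // DAY_MS
        self.buckets[day] = self.buckets.get(day, 0) + n

    def expire(self, now_ms):
        # Drop whole days older than the window; this replaces the
        # per-event "decrement after 1 year" timer.
        oldest_kept = now_ms // DAY_MS - self.window_days + 1
        for day in [d for d in self.buckets if d < oldest_kept]:
            del self.buckets[day]

    def total(self, now_ms):
        # Current count = sum of the buckets still inside the window.
        self.expire(now_ms)
        return sum(self.buckets.values())


counter = BucketedYearCounter()
counter.add(0)
counter.add(0)
counter.add(10 * DAY_MS)
total_on_day_364 = counter.total(364 * DAY_MS)  # day 0 is still inside the window
total_on_day_365 = counter.total(365 * DAY_MS)  # day 0 has aged out
```

The point of the design is that the counter never holds more than 365 small entries per key, however many events arrive. Porting it to Flink would presumably mean keeping the buckets in keyed state and firing one timer per day rather than one per event, but that part is an assumption and untested here.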
s
I’m not sure of the underlying mechanism but I think this might be what you want: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/metrics/ Off the top of my naive head, you’d register a counter or a gauge as a metric. They get exposed to the jobmanager UI. It’s fairly trivial to do this for a simple count of events.
For a year-long count tho, idk how metrics behave after restarts… I wonder if persistence of some kind would be necessary. Even with state, the counter would start over when the process restarts unless the state is restored from a checkpoint or savepoint. Wondering if @Mikayla Harvey might have insight. Pretty sure they’re focused on monitoring and metrics.
m
Thanks @Scott Robertson, I’ll message you @Michael LeGore.
s
np. Kinda curious myself what the answer is, if you come up with one. 🙂
m
I mean like a year-long window
s
Still curious on this: how much data is processed within the year-long window? Is the window actually kept open for a year? Or is this, like, historic data which will be processed in a much shorter amount of time? I could see utility in a batch job which processes a year's worth of data in, say, a day or a week. It's hard to imagine all the implications of leaving a window open for an entire year... esp if you had to fix bad data: 😬 https://medium.com/confluent/preventing-and-fixing-bad-data-in-event-streams-part-1-27bf2a99b48e