Hello Team,
I am working on a use case where data is aggregated over a given window and then published to a sink. This does not need to be a keyed window aggregation, and I see that windowAll executes with a parallelism of just 1. Any suggestions for achieving non-keyed windowing?
Rashmin Patel
04/04/2023, 8:27 AM
So you want to run the aggregate operator (non-keyed) with a parallelism greater than one?
Sumit Nekar
04/04/2023, 3:53 PM
That's right! Just the aggregation, without doing a keyBy.
Rashmin Patel
04/05/2023, 6:11 AM
One approach to parallelizing this is to first do keyed pre-aggregations and then run windowAll over those pre-aggregated results, if the single parallelism of windowAll is a bottleneck for you.
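A minimal sketch of the idea, written as a plain-Java simulation with no Flink dependency so it can be run on its own. The class name `TwoStagePreAggregation`, the round-robin bucket assignment, and the choice of a sum are all assumptions for illustration. In an actual job, stage 1 would roughly correspond to a keyBy on a synthetic bucket key followed by a windowed reduce, and stage 2 to a windowAll with the same window that merges the partial results. This only works for aggregations whose partial results can be merged (sum, count, min, max and similar).

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of two-stage window aggregation (not Flink API code).
class TwoStagePreAggregation {

    // Stage 1: spread one window's records over `buckets` synthetic keys
    // (round-robin here, an illustrative choice) and keep one partial sum per bucket.
    // In a Flink job, each bucket could be processed by a different parallel subtask.
    static long[] partialSums(List<Long> windowRecords, int buckets) {
        long[] partials = new long[buckets];
        for (int i = 0; i < windowRecords.size(); i++) {
            partials[i % buckets] += windowRecords.get(i);
        }
        return partials;
    }

    // Stage 2: merge the partial sums. Only `buckets` values reach this
    // single-parallelism step, instead of every record in the window.
    static long finalSum(long[] partials) {
        long total = 0;
        for (long partial : partials) {
            total += partial;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> records = new ArrayList<>();
        for (long v = 1; v <= 10; v++) {
            records.add(v);
        }
        long[] partials = partialSums(records, 4);
        System.out.println(finalSum(partials)); // prints 55
    }
}
```

The final step still runs with a parallelism of 1, but it receives one value per bucket per window, so the heavy work happens in the parallel first stage. The first stage still involves a keyBy and therefore a shuffle.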
Sumit Nekar
04/05/2023, 7:32 AM
How does that solve the problem with windowAll? My requirement is non-keyed windowing. keyBy is an expensive operation and results in shuffling, which I want to avoid.