Hi Everyone,
I have a question about performing lots of calculations. We have a few queries where we need to calculate the sum of each column after applying a series of where conditions. (10k+ columns with Boolean values)
When we have many AND conditions the query is quite fast, as we collect the data and retrieve all the sums on a relatively flat dataset.
However when we have a very limited number of AND conditions the query slows way down, as for each column we need to calculate the SUM for millions of rows. This has a propensity to take a very long time or worse, throw an out of memory error for the java heap space.
I don't think that pre-aggregations will work here as they are for an entire column and not an entire column with a condition. (please correct me if I'm wrong)
Furthermore, our data filtering and retrieval seems to be quite fast, it doesn't feel like it's an issue of retrieving the data but more aggregating the data.
Any thoughts on increasing performance? Any ideas or thoughts are welcome.