Luis Fernandez
08/22/2022, 7:28 PM
select sum(impression_count) from metrics where user_id = xxx and product_id = xxxx
This query, even though it is more selective, is slower than this one:
select sum(impression_count) from metrics where user_id = xxx
We currently partition on user_id and have a bloom filter on product_id and user_id.
Is it because of the partitioning? We also see numEntriesScannedInFilter go up when we add product_id to the query, but without it the value is pretty much 0. What do you recommend in this case?
Kishore G
Kishore G
Luis Fernandez
08/22/2022, 7:34 PM
Kishore G
Luis Fernandez
08/22/2022, 7:35 PM
Luis Fernandez
08/22/2022, 7:38 PM
Luis Fernandez
08/22/2022, 7:39 PM
Kishore G
Kishore G
Luis Fernandez
08/22/2022, 7:44 PM
user_id and product_id
Kishore G
Luis Fernandez
08/22/2022, 8:11 PM
numEntriesScannedInFilter?
Luis Fernandez
08/22/2022, 8:11 PM
Zhuangda Z
07/20/2023, 3:53 PM
product_id is not a partition key, so the bloom filter wouldn’t prune many segments? Is it a rule of thumb that a bloom filter should only be enabled for columns that are part of the primary key in general?
Zhuangda Z
07/20/2023, 3:56 PM