Hi. I have question regarding using DISTINCTCOUNTH...
# general
b
Hi. I have a question about using DISTINCTCOUNTHLL.
1. What's the error rate for this aggregation, since it is described as an "approximate distinct count"? On smaller data sizes I don't see any difference from DISTINCTCOUNT.
2. What is the intended use case for it?
3. If I want to use a star-tree index, DISTINCTCOUNTHLL seems to be the closest thing to DISTINCTCOUNT. What issues could come from using DISTINCTCOUNTHLL with a star-tree index? I ask because I do care whether the result is accurate.
m
1. You can find error rates for HLL here.
2. The use case is when you want lower latency for count-distinct queries and are OK with approximations.
3. With a star-tree index or standalone, HLL is an approximation algorithm, so you should check whether the error margin is within your tolerance. The error margin also depends on the storage used, so you can tune that.
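To make the storage versus error trade-off concrete, here is a minimal sketch based on the standard HyperLogLog estimate that a sketch with m = 2^log2m registers has a relative standard error of roughly 1.04 / sqrt(m). The function names are hypothetical and written only for this illustration. The thread does not confirm how Pinot exposes or defaults this setting, so treat `log2m` here as the generic HLL precision parameter and check the Pinot documentation for the real configuration.

```python
import math


def hll_relative_standard_error(log2m: int) -> float:
    """Approximate relative standard error of an HLL sketch with 2**log2m registers.

    Uses the textbook HyperLogLog estimate 1.04 / sqrt(m). This is a typical
    deviation, not a hard upper bound on the error of any single query.
    """
    if log2m < 1:
        raise ValueError("log2m must be a positive integer")
    registers = 2 ** log2m
    return 1.04 / math.sqrt(registers)


def smallest_log2m_for(target_error: float, max_log2m: int = 30) -> int:
    """Smallest log2m whose estimated standard error is at or below target_error.

    Hypothetical helper for sizing a sketch; max_log2m is an arbitrary cap
    chosen for this example.
    """
    for log2m in range(1, max_log2m + 1):
        if hll_relative_standard_error(log2m) <= target_error:
            return log2m
    raise ValueError("target_error is too small for the allowed log2m range")


# Each extra bit of log2m doubles the register count and shrinks the
# estimated error by a factor of about sqrt(2).
for log2m in (8, 10, 12, 14):
    error = hll_relative_standard_error(log2m)
    print(f"log2m={log2m:2d}  registers={2 ** log2m:6d}  std error ~ {error:.2%}")
```

For example, `smallest_log2m_for(0.01)` returns the smallest precision whose estimated standard error is within 1%. Because this is a statistical estimate rather than a guarantee, a workload that needs exact counts still calls for DISTINCTCOUNT.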