Ken Krugler
03/11/2021, 11:47 PMgroup by
, I assume currently we have to do a separate distinctcount
or distinctcounthll
, right? But if the group by uses multiple columns, what’s the best approach to getting this total group count?Ken Krugler
03/12/2021, 12:34 AMselect distinctcount(concat(col1, col2, '|'))
and that works, but isn’t very fast.Jackie
03/12/2021, 12:39 AMLIMIT
for the GROUP BY
query or DISTINCT
query to get all the groupsKen Krugler
03/12/2021, 12:41 AMJackie
03/12/2021, 12:45 AMKen Krugler
03/12/2021, 12:45 AMJackie
03/12/2021, 12:46 AMdistinctCount
family only support single column, we can probably extend that to support multiple columnsJackie
03/12/2021, 12:46 AMdistinctcounthll(concat(col1, col2, '|'))
should be fasterKen Krugler
03/12/2021, 12:47 AMKen Krugler
03/12/2021, 12:48 AMJackie
03/12/2021, 12:51 AMcount(distinct col1, col2)
to get the distinct countJackie
03/12/2021, 12:52 AMDISTINCT(col1, col2)
with high limit in PinotKishore G
distinctcounthll(concat(col1, col2, '|'))
will suffer from Concat scalar udfKishore G
Ken Krugler
03/12/2021, 1:38 AMKishore G
Kishore G
Ken Krugler
03/12/2021, 4:45 PM