k

Ken Krugler

07/16/2021, 9:59 PM
I’m trying to understand an interesting anomaly with the results of a query. I do
``select sum(metric), key, min(date) as firstSeen, max(date) as lastSeen from table where date >= <lowDate> AND date <= <highDate> group by key order by firstSeen desc limit 1``
. I get a single row as expected, but with a
``firstSeen``
and
``lastSeen``
both equal to
``<highDate>``
. I was expecting the
``firstSeen``
result to be equal to
``<lowDate>``
. If I then run the exact same query, but add in
``AND key = '<key value from the previous result>'``
, I get the a single row with the requested key value, but now the
``firstSeen``
result is equal to
``<lowDate>``
(as expected), and
``sum(metric)``
is larger (also as expected). Any ideas what is going on?
j

Jackie

07/16/2021, 10:06 PM
``select date from table where date >= <lowDate> AND date <= <highDate> AND key = '<key value from the previous result>'``
?
Is
``key``
a very high cardinality column? If so, some groups might be dropped due to Pinot's approximation algorithm for group-by queries
k

Ken Krugler

07/16/2021, 10:09 PM
It’s high cardinality, but
``numGroupsLimitReached: false``
in response. And the difference is for a specific key value, not a missing key value.
If I do
``select date,metric from table WHERE date >= <lowDate> AND date <= <highDate> AND key = '<key value from previous result>'``
, I get four rows. The sum of the metric values is the larger (expected) value. The one row with
``date = <highDate>``
has the
``metric``
value = the
``sum(metric)``
returned from the first query.
This is with Pinot 0.7.1
Hmm, if I do the first query, but change to
``order by lastSeen``
``firstSeen``
``firstSeen``
``sum(metric)``
``AND key = 'specific key value'``