Say I have a table `t1` with 200+ columns `A`, `B`...
# general
m
Say I have a table
t1
with 200+ columns
A
,
B
, `C1`…`C200`. When I do
select max(A) from table group by B'
, I get the max A for each B. Is there to get the corresponding `C1`…`C200` for each max(A) per B? That is to say, how to achieve `select max(A), C1, ..., C200 from table group by B`without aggregating on
C
or grouping by
C
?
a
Group by B, C?
m
I don’t want to group by C. Group by C would basically return every record, which is in millions. In addition, I have 200 columns like C in the table, it would be tedious to do group by on all of them.
a
I don't think it actually makes sense. When you say group by B, you are actually aggregating all A values which have the same B values. Since you are neither grouping by C nor specifying a function on C, what values are you expecting?
m
A can be the timestamp, B can be the stock ticker and C can be the price. I would like to get last price for each stock within a timeframe
a
You need to order by A, then. There may be a way to use DISTINCT within MAX(A) iff A's values are unique, but that has to be tried
j
Can you try:
select max(A), lastwithtime(C, A, 'datatype') from table group by B
. Further details about lastwithtime can be found at https://docs.pinot.apache.org/users/user-guide-query/supported-aggregations
a
Yep, that should work. I forgot about lastwithtime :)
m
A
is not unique in our case
a
Shouldn't matter, if you are anyways getting the max value? Just do a distinct on A and use as a temp column
m
it does matter. Say two values for B: x and y both have the same value max(A), different values for C, but
lastwithtime(C, A), 'datatype'
would return the same value
j
I set up three values for B (symbol), all the same value max(A) (event_ts), different values for C (price). The results seem to be as expected.
a
Yes, I would imagine so,too
m
Thanks @User for the confirmation! Much appreciated. You know, there are hundreds of
C
columns in my case, do I have to resort to hundreds of
lastwithtim()
? Is there a more efficient way to do this in Pinot?
j
I’m not aware of a more efficient way to do this in Pinot. I noticed that this same topic was also discussed over at https://apache-pinot.slack.com/archives/CDRCA57FC/p1638549391405300. It would be worthwhile to open a Github issue as suggested by Kishore, if one doesn’t already exist
m