https://pinot.apache.org/ logo
#general
Title
# general
m

Map

01/04/2022, 1:35 PM
Say I have a table
t1
with 200+ columns
A
,
B
, `C1`…`C200`. When I do
select max(A) from table group by B'
, I get the max A for each B. Is there to get the corresponding `C1`…`C200` for each max(A) per B? That is to say, how to achieve `select max(A), C1, ..., C200 from table group by B`without aggregating on
C
or grouping by
C
?
a

Atri Sharma

01/04/2022, 1:42 PM
Group by B, C?
m

Map

01/04/2022, 1:46 PM
I don’t want to group by C. Group by C would basically return every record, which is in millions. In addition, I have 200 columns like C in the table, it would be tedious to do group by on all of them.
a

Atri Sharma

01/04/2022, 1:59 PM
I don't think it actually makes sense. When you say group by B, you are actually aggregating all A values which have the same B values. Since you are neither grouping by C nor specifying a function on C, what values are you expecting?
m

Map

01/04/2022, 2:05 PM
A can be the timestamp, B can be the stock ticker and C can be the price. I would like to get last price for each stock within a timeframe
a

Atri Sharma

01/04/2022, 2:17 PM
You need to order by A, then. There may be a way to use DISTINCT within MAX(A) iff A's values are unique, but that has to be tried
j

Jeff Moszuti

01/04/2022, 2:19 PM
Can you try:
select max(A), lastwithtime(C, A, 'datatype') from table group by B
. Further details about lastwithtime can be found at https://docs.pinot.apache.org/users/user-guide-query/supported-aggregations
a

Atri Sharma

01/04/2022, 2:21 PM
Yep, that should work. I forgot about lastwithtime :)
m

Map

01/04/2022, 2:31 PM
A
is not unique in our case
a

Atri Sharma

01/04/2022, 2:48 PM
Shouldn't matter, if you are anyways getting the max value? Just do a distinct on A and use as a temp column
m

Map

01/04/2022, 3:23 PM
it does matter. Say two values for B: x and y both have the same value max(A), different values for C, but
lastwithtime(C, A), 'datatype'
would return the same value
j

Jeff Moszuti

01/05/2022, 1:16 PM
I set up three values for B (symbol), all the same value max(A) (event_ts), different values for C (price). The results seem to be as expected.
a

Atri Sharma

01/05/2022, 1:17 PM
Yes, I would imagine so,too
m

Map

01/06/2022, 4:07 PM
Thanks @User for the confirmation! Much appreciated. You know, there are hundreds of
C
columns in my case, do I have to resort to hundreds of
lastwithtim()
? Is there a more efficient way to do this in Pinot?
j

Jeff Moszuti

01/06/2022, 4:37 PM
I’m not aware of a more efficient way to do this in Pinot. I noticed that this same topic was also discussed over at https://apache-pinot.slack.com/archives/CDRCA57FC/p1638549391405300. It would be worthwhile to open a Github issue as suggested by Kishore, if one doesn’t already exist
m

Map

01/06/2022, 5:42 PM