https://pinot.apache.org/ logo
j

Jonathan Meyer

05/11/2021, 3:14 PM
Sorry for bumping the question, maybe it got shadowed by following ones.
Hello
What is the recommended approach to getting the "last non-null value" ?
Use a UDF ?
k

Kishore G

05/11/2021, 3:24 PM
Last non null value for a key?
Or across lot of keys
j

Jonathan Meyer

05/11/2021, 3:26 PM
Across lot of keys Like in a "periodic" timeseries (ex: 1 data point per day), where some values (ex: some days) are null. How can we efficiently fetch the last non-null value ?
Date - Value 05/01/2021 - null 04/01/2021 - null 03/01/2021 - 3.0 02/01/2021 - 1.0 Output: 3.0
k

Kishore G

05/11/2021, 3:28 PM
Ah, got it
j

Jonathan Meyer

05/11/2021, 3:28 PM
Maybe, it'd be simpler not to store null values and get the
last
value ? Not sure that'd be more efficient that way
k

Kishore G

05/11/2021, 3:30 PM
Select date, Val from T where F value >0 order by date desc limit 1 should work
Writing a udf might be more efficient but see if this works for now
1
j

Jonathan Meyer

05/11/2021, 3:31 PM
Makes sense indeed ! Do you expect this query to be highly performant ? Should a specific index be added ?
Actually, what about negative values ? We'd be interested in those too ^^ I know there's some support for null values, as in IS NOT NULL - but iirc, there's a performance penalty associated with this feature
k

Kishore G

05/11/2021, 3:34 PM
No, not null index is very fast
So you can use that
j

Jonathan Meyer

05/11/2021, 3:37 PM
So, enable this feature (
nullHandlingEnabled
) And query like
SELECT value FROM table WHERE value IS NOT NULL ORDER BY date LIMIT 1
Sounds right ?
k

Kishore G

05/11/2021, 3:46 PM
yes
j

Jonathan Meyer

05/11/2021, 3:46 PM
Great, thank you @Kishore G 🙂 Have a good day !