#general

Diogo Baeder

11/12/2021, 6:31 PM
Hi folks! A question about `dateTimeFieldSpecs`: is the `TIMESTAMP` type something that will stay in Pinot? @User gave me a nice hint about it, but we don't see it in the docs, so I'm unsure whether I can safely use it and trust that it won't be removed in future Pinot versions.

Neha Pawar

11/12/2021, 6:34 PM
It is here to stay. It’s part of the data types listed on this page: https://docs.pinot.apache.org/basics/components/schema#data-types

Diogo Baeder

11/12/2021, 6:34 PM
Ah, nice! Thanks!

Neha Pawar

11/12/2021, 6:35 PM
good catch 🙂 will fix it, and maybe also add some examples

Diogo Baeder

11/12/2021, 6:37 PM
Awesome, thanks a lot! ❤️
@User what format should be used, though, when publishing values through Kafka? I'm using, for example, `2020-04-04 00:00:00 UTC` as a string, and it's not being saved in Pinot.

Neha Pawar

11/12/2021, 6:56 PM
It should be:
```
"dateTimeFieldSpecs": [
  {
    "name": "time_col_name",
    "dataType": "STRING",
    "format": "1:MILLISECONDS:SIMPLE_DATE_FORMAT:yyyyMMdd HH:mm:ss z",
    "granularity": "1:MILLISECONDS"
  }
]
```
https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html
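
A `SIMPLE_DATE_FORMAT` pattern has to match the published string character for character, including separators such as dashes. The sketch below checks that idea outside Pinot, using Python's `strptime` directives as a rough stand-in for the Java pattern letters. That mapping is an assumption, and `matches` is a hypothetical helper, not part of Pinot.

```python
from datetime import datetime


def matches(value: str, pattern: str) -> bool:
    """Hypothetical helper: True if `value` parses with the strptime `pattern`."""
    try:
        datetime.strptime(value, pattern)
        return True
    except ValueError:
        return False


# The value from the question above.
sample = "2020-04-04 00:00:00 UTC"

# Approximate strptime equivalents of Java patterns (not Pinot's own parser).
no_dashes = "%Y%m%d %H:%M:%S %Z"      # roughly yyyyMMdd HH:mm:ss z
with_dashes = "%Y-%m-%d %H:%M:%S %Z"  # roughly yyyy-MM-dd HH:mm:ss z

print(matches(sample, no_dashes))
print(matches(sample, with_dashes))
```

Under these assumptions the first check fails and the second succeeds, which suggests that a value written with dashes needs a pattern written with dashes.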

Diogo Baeder

11/12/2021, 8:21 PM
Oh... I thought the dataType would be `TIMESTAMP`, am I wrong?

Neha Pawar

11/12/2021, 9:39 PM
@User does this `2020-04-04 00:00:00 UTC` string also work for `TIMESTAMP`?

Jackie

11/12/2021, 9:40 PM
No, you need to remove `UTC`

Neha Pawar

11/12/2021, 9:40 PM
so it’s `yyyyMMdd HH:mm:ss` and epoch millis that can use `TIMESTAMP`?

Jackie

11/12/2021, 9:40 PM
Yes
Standard `TIMESTAMP` format would be `yyyy-MM-dd HH:mm:ss.SSS`
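
For a producer that currently emits strings such as `2020-04-04 00:00:00 UTC`, one option is to convert each value before publishing it to Kafka. This is a minimal sketch, assuming the input always carries a literal ` UTC` suffix and really is in UTC. `to_pinot_timestamp` is a hypothetical helper name, and it returns both forms discussed above: the `yyyy-MM-dd HH:mm:ss.SSS` string and epoch millis.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_pinot_timestamp(raw: str) -> Tuple[str, int]:
    """Hypothetical helper: return (timestamp string, epoch millis) for a UTC value."""
    text = raw.strip()
    # Assumption: the only zone suffix we ever see is a literal " UTC".
    if text.endswith(" UTC"):
        text = text[: -len(" UTC")]
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    # Integer division by one millisecond avoids float rounding.
    millis = (parsed - EPOCH) // timedelta(milliseconds=1)
    formatted = parsed.strftime("%Y-%m-%d %H:%M:%S") + f".{parsed.microsecond // 1000:03d}"
    return formatted, millis


formatted, millis = to_pinot_timestamp("2020-04-04 00:00:00 UTC")
print(formatted, millis)
```

Either returned value could then go into the Kafka message. Which one to send is a design choice that depends on what the other consumers of the topic expect.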

Diogo Baeder

11/12/2021, 9:40 PM
Oh, just `2020-04-04 00:00:00` then? And in terms of performance, how does that compare to using `SIMPLE_DATE_FORMAT` with a date-time format?
I mean, comparing the dataTypes, it would be `TIMESTAMP` vs `STRING`; would there be a significant performance difference?

Jackie

11/12/2021, 9:41 PM
`TIMESTAMP` is stored as long (millis since epoch), so it has better performance than string
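
For comparison with the `STRING` spec shown earlier, a `TIMESTAMP` version of the same field might look like the fragment built below. This is a sketch under assumptions: the column name is reused from the earlier example, and the `1:MILLISECONDS:EPOCH` format value was not given in this thread, so check it against the schema docs before relying on it.

```python
import json

# Column name reused from the earlier example.
# The "format" value is an assumption, not something confirmed in this thread.
timestamp_field_spec = {
    "name": "time_col_name",
    "dataType": "TIMESTAMP",
    "format": "1:MILLISECONDS:EPOCH",
    "granularity": "1:MILLISECONDS",
}

schema_fragment = {"dateTimeFieldSpecs": [timestamp_field_spec]}

# Print the fragment as JSON so it can be pasted into a schema file.
print(json.dumps(schema_fragment, indent=2))
```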

Diogo Baeder

11/12/2021, 9:42 PM
Ah, that's very relevant! 🙂
You guys rock so hard! Thank you very very much! ❤️

Jackie

11/12/2021, 9:42 PM
The performance improvement depends on the queries, but long should always be faster than string
Pleasure to help

Diogo Baeder

11/12/2021, 9:42 PM
Indeed, cause it's numeric and all that... awesome!
It worked fine here for me, thanks! The only odd thing I found is that, when getting the value from Pinot (not sure if it's because I'm using it through SQLAlchemy), I'm getting a quoted string, like `"2020-01-01 00:00:00.000"`, instead of just the string with the datetime format. No big deal really, just a strange difference I found compared to when I was using `SIMPLE_DATE_FORMAT` earlier today. Thanks a lot, have a great weekend, folks!
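
If the extra quotes get in the way on the client side, they can be stripped before parsing. This is a minimal sketch, assuming the value arrives as a string wrapped in literal double quotes exactly as in the example above. `parse_pinot_timestamp` is a hypothetical helper, not part of SQLAlchemy or any Pinot client.

```python
from datetime import datetime


def parse_pinot_timestamp(value: str) -> datetime:
    """Hypothetical helper: strip surrounding double quotes, then parse the timestamp."""
    cleaned = value.strip().strip('"')
    # %f accepts the three-digit millisecond part.
    return datetime.strptime(cleaned, "%Y-%m-%d %H:%M:%S.%f")


print(parse_pinot_timestamp('"2020-01-01 00:00:00.000"'))
```

The same helper also accepts an unquoted value, so it should keep working if the quoting turns out to come from only one layer of the client stack.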