Another question, somewhat related to the previous...
# troubleshooting
d
Another question, somewhat related to the previous one: is it possible to have a date column? And to set it up as the
timeColumn
? I was looking at using
DAYS
as the type for this, but it doesn't seem correct - I want to use real dates and not "days since Epoch".
m
You can use DateTimeFieldSpec
d
Is this for display purposes only? Or does it also affect the storing of values?
n
if you use the sql standard date (yyyy-MM-dd HHmmss), then you ca use dataType
TIMESTAMP
, and internally it’ll be stored as long, but displayed as your date string. If you want to use any other format (like ISO), then you would use dataType
STRING
, it will be stored as string.
for performance reasons, usually it’s better to have millisSinceEpoch (rounded to a nearest hour/day granularity as needed), so we can store as long
d
Ah, thanks @User, that's what I wanted to know! Like, what's the best way to store dates for performance, partitioning, rollups etc. But what if I define the schema as data type
LONG
, and format
1:DAYS:SIMPLE_DATE_FORMAT:yyyyMMdd
? Does this work, too?
n
yup this is fine. this uses LONG and also has coarse granularity 👍
d
Would it work with
INT
, too, for less memory/space usage?
n
it should work for yyyyMMdd with INT
for performance, partitioning, rollups, what you have will work fine. For extensibility, millisSinceEpoch works best. You have day granularity right now, so you would have values rolled up to the nearest day. In future if you want to make the granularity finer, helps to have millisSinceEpoch . You could go from day granularity to finer granularity anytime, without having to change anything else
d
Got it. But there's absolutely zero chance we'll need finer granularity, that data we have has always been and will always be daily - it's a large ecosystem that depends on this data being daily. So I guess that would be the best approach for this case, right?
n
yes it would in that case
d
Awesome, thanks! (And sorry for the long delay)