# troubleshooting
a
__time is stored as a timestamp. No time zone information is stored with the __time column.
When you query the __time column, you can use the timestamp_format function, which lets you specify the time zone.
r
Got it! So actually I have a JSON payload:
{
  "eventTime": "2023-05-03T06:11:34+05:30"
}
How can I ingest data in Druid with time zone info? For example, __time in this case should be 2023-05-03T06:11:34 instead of 2023-05-03T00:41:34. And where should I store the time zone?
a
so you are getting these values in different timezones and you want to be able to show each value in the same timezone that you received it in?
r
Yes.
Not all values will be in the same timezone.
a
you could store two columns - one is __time (timestamp type) and the second is the string representation of this time column.
r
Should the producer be sending both the string and the epoch time in that case? Parsing the string (eventTime) is leading to an unexpected time.
g
__time is stored in a timezone-independent manner; internally it's milliseconds since the UTC epoch (1970-01-01T00:00:00Z)
your eventTime values are all stored correctly in the sense that they represent the correct instant, although the timezone info is not persisted
if you just need to store the correct instant and retrieve it in an arbitrary timezone, then you should be good as-is
if you need to retrieve in the original eventTime timezone, which may vary from row to row, rather than a timezone determined by the query, then yeah you gotta store the timezone as a separate column
hope this is clear.
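A minimal Python sketch of the point above, using the eventTime value from the payload earlier in the thread. It only illustrates what "correct instant, timezone not persisted" means; it is not Druid's own parsing code, and the variable names are made up for the example.

```python
from datetime import datetime, timezone

raw = "2023-05-03T06:11:34+05:30"

# Parsing keeps the offset, so the value identifies one exact instant.
parsed = datetime.fromisoformat(raw)

# The same instant expressed in UTC, which is what a timezone-independent
# timestamp effectively represents.
instant_utc = parsed.astimezone(timezone.utc)

# Milliseconds since the UTC epoch for that instant.
epoch_millis = int(parsed.timestamp() * 1000)

# The original offset in minutes. This is the piece that is lost unless it
# is stored somewhere else.
offset_minutes = int(parsed.utcoffset().total_seconds() // 60)

print(instant_utc.isoformat(), epoch_millis, offset_minutes)
```

The "unexpected" 00:41:34 is the UTC rendering of 06:11:34 at +05:30, so the stored instant is right and only the display timezone differs.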
a
does storing the timezone alone as a separate column work? I thought about it, but I am not sure whether our timestamp functions support an expression as the timezone argument or only literals
g
i think tz needs to be a literal, at least for TIME_FORMAT, although it could be extended to support expressions for the tz
however you could store the tz offset as milliseconds in a second column
eh nvm that wouldn't work
yeah i think we'd need to extend TIME_FORMAT
this isn't a very common ask; most people use a timezone determined by the user at query time, rather than using the original event timezone
however it would still be nice to support it.
a
For now, I think storing eventTime as a whole, including the tz, in a separate column is a good enough workaround.
To your question, @Ravi Jain, the producer doesn't need to send both. You can map the same input column to two columns in your ingestion spec.
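One way the two-column mapping might look, written as a Python dict that serializes to a fragment of an ingestion spec. This is a sketch under assumptions, not something confirmed in this thread: the column name eventTimeRaw is hypothetical, and using an expression transform to copy the raw string is just one possible way to do the mapping.

```python
import json

# Fragment of a dataSchema (sketch only, not a complete ingestion spec).
data_schema_fragment = {
    # eventTime is parsed into __time; the offset is applied, then dropped.
    "timestampSpec": {"column": "eventTime", "format": "iso"},
    # Assumption: an expression transform copies the raw input string,
    # offset included, into a second column.
    "transformSpec": {
        "transforms": [
            {
                "type": "expression",
                "name": "eventTimeRaw",  # hypothetical column name
                "expression": "\"eventTime\"",
            }
        ]
    },
    # The copied string is kept as an ordinary string dimension.
    "dimensionsSpec": {"dimensions": ["eventTimeRaw"]},
}

print(json.dumps(data_schema_fragment, indent=2))
```

With a layout like this, __time would hold the instant for filtering and granularity, while the string column would keep the original offset for display.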
r
got it! let me try storing this in two columns and see how it goes. Thank you for the quick support here. Also would be happy to learn and contribute in case we plan to bring this functionality natively to druid.
a
that will be nice. you should first look at the TimeFormatOperatorConversion class in druid.
r
great! will start from the TimeFormatOperatorConversion class.
d
It is best if the data producer always normalizes to UTC time. And if you can't control the producer, why not create a simple consumer that pre-processes the time column into UTC and then writes it to another Kafka topic for Druid to consume?
r
@Didip Kerabat I have control over the producer. So basically it is ideal if the producer sends data like the one below.
{
  "eventTime": "epochTime",
  "timezone": "+05:30"
}
The time zone is then needed to convert to the right local time at visualization time.
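A rough Python sketch of what that conversion could look like on the visualization side. It assumes eventTime is sent as epoch milliseconds and timezone as a "+HH:MM" or "-HH:MM" offset string; the helper names parse_offset and render_local are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_offset(offset: str) -> timezone:
    """Turn an offset string such as '+05:30' into a fixed-offset tzinfo."""
    sign = -1 if offset.startswith("-") else 1
    hours, minutes = offset.lstrip("+-").split(":")
    return timezone(sign * timedelta(hours=int(hours), minutes=int(minutes)))


def render_local(epoch_millis: int, offset: str) -> str:
    """Render a UTC instant in the timezone the event was produced in."""
    tz = parse_offset(offset)
    return datetime.fromtimestamp(epoch_millis / 1000, tz=tz).isoformat()


# Assumed payload shape: epoch milliseconds plus the producer's offset.
event = {"eventTime": 1683074494000, "timezone": "+05:30"}
local = render_local(event["eventTime"], event["timezone"])
print(local)
```

A fixed offset is enough to reproduce the original wall-clock time of a single event; it carries no daylight-saving rules, which would need a named zone instead.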
d
ah, yes. if your front end can handle that, that’s best.
r
Got it! Thanks for the input.