Elon

02/02/2021, 10:54 PM
Hi, we have a hybrid table that is not serving realtime data unless _REALTIME is explicitly used. We noticed that the time boundary is given in ms, but the time column is a datetime field whose input is in ms from another field (the base column in Kafka) and whose output granularity is seconds. Has anyone else experienced this? Is there a way to set the time boundary?

Mayank

02/02/2021, 10:55 PM
Yes, likely the root cause is incorrect time boundary.

Elon

02/02/2021, 10:55 PM
Nice! Is there a way we can reset it?

Mayank

02/02/2021, 10:56 PM
An incorrect time boundary is caused by incorrect setup/config; fixing that should solve the problem. Afaik, there isn't a way to explicitly set the time boundary.

Elon

02/02/2021, 10:57 PM
Ah ok, checking - you mean table config or server config? Or both? Checking table configs and comparing for offline vs realtime

Mayank

02/02/2021, 10:57 PM
Table config
They should have the same unit
Also, there might be a min granularity (seconds?) needed for hybrid tables. @Jackie?

Elon

02/02/2021, 10:59 PM
The segment configs use the same column - it's a datetime field
both realtime and offline:
"OFFLINE": {
    "tableName": "enriched_orders_OFFLINE",
    "tableType": "OFFLINE",
    "segmentsConfig": {
      "schemaName": "enriched_orders",
      "segmentPushFrequency": "daily",
      "segmentPushType": "APPEND",
      "timeColumnName": "order_timestamp_seconds",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "365",
      "replication": "3",
      "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy"
    },
...
"REALTIME": {
    "tableName": "enriched_orders_REALTIME",
    "tableType": "REALTIME",
    "segmentsConfig": {
      "schemaName": "enriched_orders",
      "segmentPushFrequency": "daily",
      "segmentPushType": "APPEND",
      "timeColumnName": "order_timestamp_seconds",
      "retentionTimeUnit": "DAYS",
      "retentionTimeValue": "100",
      "segmentAssignmentStrategy": "BalanceNumSegmentAssignmentStrategy",
      "replicasPerPartition": "3"
    },
...
and here is the fieldspec from the schema:
"dateTimeFieldSpecs": [
    {
      "name": "order_timestamp_seconds",
      "dataType": "LONG",
      "defaultNullValue": 0,
      "transformFunction": "toEpochSeconds(order_timestamp_ms)",
      "format": "1:MILLISECONDS:EPOCH",
      "granularity": "1:SECONDS"
    }
note that the source column order_timestamp_ms is not included in the table.

Mayank

02/02/2021, 11:01 PM
Hmm, why do you see time boundary in ms?

Elon

02/02/2021, 11:01 PM
Not sure:) I just did the curl from the broker
and the value, if converted to seconds, is as expected: it's the newest offline timestamp minus 24 hours

Mayank

02/02/2021, 11:03 PM
Can you select min time from RT and max time for OFFLINE?

Elon

02/02/2021, 11:03 PM
there are realtime records newer than that
Yep, already did:

Mayank

02/02/2021, 11:03 PM
what's the unit there? seconds or ms?

Elon

02/02/2021, 11:03 PM
--offline:  min: 1601424056 max: 1612051039
--realtime: min: 1607816472 max: 1612317600 
--time boundary in ms: 1611964639000
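
The numbers above can be checked by hand. This is a minimal sketch of the arithmetic only, assuming the boundary is the newest offline timestamp minus 24 hours, as described earlier in the thread. The helper name boundary_from_offline_max is hypothetical and is not a Pinot API.

```python
# Values quoted in the message above.
OFFLINE_MAX_SECONDS = 1612051039       # newest offline timestamp, epoch seconds
REPORTED_BOUNDARY_MS = 1611964639000   # time boundary returned by the broker
SECONDS_PER_DAY = 24 * 60 * 60


def boundary_from_offline_max(offline_max_seconds: int) -> int:
    """Hypothetical helper: newest offline timestamp minus 24 hours, in epoch seconds."""
    return offline_max_seconds - SECONDS_PER_DAY


boundary_seconds = boundary_from_offline_max(OFFLINE_MAX_SECONDS)
print(boundary_seconds)                                 # 1611964639
print(boundary_seconds * 1000 == REPORTED_BOUNDARY_MS)  # True
```

So the reported boundary is the expected value in seconds, multiplied by 1000.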

Mayank

02/02/2021, 11:04 PM
Interesting

Elon

02/02/2021, 11:04 PM
the unit for the time column should be seconds - that's what it is when we select
maybe the code expects that granularity unit == format unit?

Neha Pawar

02/02/2021, 11:16 PM
"transformFunction": "toEpochSeconds(order_timestamp_ms)",
"format": "1:MILLISECONDS:EPOCH",
"granularity": "1:SECONDS"
this is incorrect ^. toEpochSeconds will convert it to millis/1000, but the format here says MILLISECONDS
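
A rough sketch of why this mismatch could hide realtime rows. It assumes that, for a hybrid table, the realtime side is only asked for rows with a time value greater than the boundary. That routing rule is an assumption for illustration and is not stated in this thread. The helper realtime_range_served is hypothetical; the numbers are the ones posted earlier.

```python
# Realtime min/max are epoch seconds; the boundary was reported in milliseconds.
REALTIME_MIN_SECONDS = 1607816472
REALTIME_MAX_SECONDS = 1612317600
BOUNDARY_MS = 1611964639000
BOUNDARY_SECONDS = BOUNDARY_MS // 1000


def realtime_range_served(boundary: int, rt_min: int, rt_max: int):
    """Hypothetical helper: range of realtime values that pass `time > boundary`.

    Returns (low, high), or None when no realtime value can pass the filter.
    """
    low = max(rt_min, boundary + 1)
    return (low, rt_max) if low <= rt_max else None


# Boundary in ms compared against a column holding seconds: nothing passes.
print(realtime_range_served(BOUNDARY_MS, REALTIME_MIN_SECONDS, REALTIME_MAX_SECONDS))
# None

# Boundary in the same unit as the column: the newest realtime rows pass.
print(realtime_range_served(BOUNDARY_SECONDS, REALTIME_MIN_SECONDS, REALTIME_MAX_SECONDS))
# (1611964640, 1612317600)
```

Under that assumption, a boundary in ms is larger than every value in a seconds column, which would match the symptom of realtime data only showing up when _REALTIME is queried directly.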

Elon

02/02/2021, 11:17 PM
but granularity is seconds - it seems correct when we do selects on the datetime field (both realtime and offline)
I thought format should be what the input is and granularity is what the output is, right?

Neha Pawar

02/02/2021, 11:18 PM
not really, they both represent what the final value in the time column should be
granularity is not used anywhere atm
in your case, millis will get converted using function millis/1000, so your format should be 1:SECONDS:EPOCH

Elon

02/02/2021, 11:19 PM
Oh wow, I didn't know that, thanks:) I tried using millis/1000 and got an error, but when I use the epoch* functions it works

Neha Pawar

02/02/2021, 11:20 PM
if you want to keep it in millis, but simply round to nearest seconds, use "round(order_timestamp_ms, 1000)"
if you do this ^, then your current datetimespec is correct
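
The two options can be compared with plain integer arithmetic. This is a sketch, not Pinot code: to_epoch_seconds mirrors the millis/1000 conversion described above, and round_to_second assumes the round function truncates down to a multiple of 1000 (whether it truncates or rounds to nearest is an assumption here). Both function names and the sample value are hypothetical.

```python
def to_epoch_seconds(millis: int) -> int:
    """Value ends up in seconds, so the format would be 1:SECONDS:EPOCH."""
    return millis // 1000


def round_to_second(millis: int) -> int:
    """Value stays in millis at a whole-second multiple, so the format
    would remain 1:MILLISECONDS:EPOCH. Assumes truncation, not nearest."""
    return (millis // 1000) * 1000


sample_ms = 1612317600123  # hypothetical input timestamp in milliseconds

print(to_epoch_seconds(sample_ms))  # 1612317600
print(round_to_second(sample_ms))   # 1612317600000
```

Either way, the format should describe the unit of the value that is actually stored in the time column.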

Elon

02/02/2021, 11:20 PM
thx!
I think they want it in seconds - so should I just change format to 1:SECONDS:EPOCH?

Neha Pawar

02/02/2021, 11:21 PM
yeah
is this a new table? i'm wondering how any segments got created. the segment creation should've failed due to inconsistent value and spec

Elon

02/02/2021, 11:22 PM
No, it has tons of offline and realtime segments
and they are correctly named as well
not sure if that matters
so right now we can consider granularity to be meaningless? i.e. unused anywhere?

Neha Pawar

02/02/2021, 11:24 PM
yes

Elon

02/02/2021, 11:24 PM
thanks for clearing that up @Neha Pawar!
If I change the table spec now would it cause any issues?

Neha Pawar

02/02/2021, 11:25 PM
not quite sure about that..

Elon

02/02/2021, 11:25 PM
ok, we'll try it in staging 🙂 And let you know. Thanks again!

Neha Pawar

02/03/2021, 7:05 PM
how did that go?

Elon

02/03/2021, 7:06 PM
We are still just doing basic testing of Pinot 0.6.0 - probably today/tomorrow we will try it. I'll let you know.
Hi @Neha Pawar, just wanted to give you an update. I updated the schema, and it now reports the new format for the time column, but there is no change in behavior.
I disabled the table and am reloading all segments to see if that works.
Reloading did not work, and neither did restarting the servers, but restarting the brokers did! Now we see the data being served as it should.
Thanks for all the help @Neha Pawar! Maybe this should go in a doc somewhere? lmk where I can take a stab at it.

Neha Pawar

02/11/2021, 8:09 PM
there’s a schema evolution page, maybe add the notes from your observations there as tips?

Elon

02/11/2021, 8:17 PM
Sure, I'll look for it