I had a question about enabling Upsert on a hybrid...
# general
j
I had a question about enabling upsert on a hybrid table. We currently have separate REALTIME-only and OFFLINE-only tables that we use to maintain two dashboards with different refresh rates, but we often find them diverging. To address this, I’d like to set them up as a single hybrid table (with upsert on the REALTIME subtable). Is that possible? I see the following considerations:
1. We will query the tables with the _OFFLINE and _REALTIME suffixes to avoid the broker returning duplicate results.
2. Will the offline table creation ignore the primaryKeyColumns in the schema?
3. Is there a mechanism to set common parameters in the table spec, like indexes? Otherwise it looks like the star-tree indexes, streamConfigs, routing, and upsertConfig rules could easily be set only on the REALTIME segments.
Will this work?
j
I'm a little bit confused here. If you are going to query the OFFLINE and REALTIME tables separately, what benefit do you get from modeling them as one hybrid table?
Upsert doesn't work on hybrid tables as of now, so only the records from the REALTIME table can be invalidated.
j
For #2 we benefit from being able to model it with a single schema definition that we can share.
And for #1 and #3 it’s more because we have automation that encapsulates the creation of the schema and the table spec. I was wondering whether creating a hybrid table would get the desired results before going and changing the tooling too much.
So are the upsert specific settings ignored by offline tables altogether?
j
Upsert cannot be enabled for an OFFLINE table; otherwise the table validation will throw an exception.
I wouldn't suggest using a hybrid table if the only benefit is to share a single schema.
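For the automation side of the question, here is a minimal Python sketch of one way the tooling could share common settings while keeping upsert and routing on the REALTIME table only. The table name, column names, and the `build_table_config` helper are hypothetical, and the check inside it only mirrors the rule stated above (no upsert on OFFLINE); it is not Pinot's own validator. Stream settings are left out for brevity.

```python
import copy

# Settings both table types can share (hypothetical names and values).
BASE_CONFIG = {
    "tableName": "events",
    "segmentsConfig": {"timeColumnName": "ts", "replication": "2"},
    "tableIndexConfig": {"invertedIndexColumns": ["customer_id"]},
}

# Settings that should only ever be applied to the REALTIME table.
REALTIME_ONLY = {
    "upsertConfig": {"mode": "FULL"},
    "routing": {"instanceSelectorType": "strictReplicaGroup"},
}


def build_table_config(table_type, overrides=None):
    """Merge the shared base with per-type overrides into a new dict."""
    config = copy.deepcopy(BASE_CONFIG)
    config["tableType"] = table_type
    config.update(copy.deepcopy(overrides or {}))
    # Local guard that mirrors the rule from this thread: fail early
    # instead of waiting for the server-side validation to reject it.
    if table_type == "OFFLINE" and "upsertConfig" in config:
        raise ValueError("upsertConfig is not allowed on an OFFLINE table")
    return config


offline_config = build_table_config("OFFLINE")
realtime_config = build_table_config("REALTIME", REALTIME_ONLY)
```

With this shape the index settings stay identical across both tables, and passing `REALTIME_ONLY` to an OFFLINE build fails in the tooling rather than at table creation.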