# general
r
@User gave a very interesting Kafka Summit APAC talk recently entitled "Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot"
👍 5
k
Thanks Yupeng! Amazing work on upsert /cc @User
r
I’m curious about why upsert isn’t compatible with a few index types (e.g. StarTree).
y
That’s because the star-tree index is built with pre-aggregation
r
Oh, right. What about Kafka’s log compaction? Would that help at all?
m
Not really. Think of it this way: some indexes are stored in a form (say, the pre-materialized form used by star-tree) for which upsert is either impossible or difficult.
r
I didn't think so. I was just considering a potential workaround, similar to what I've done with Redshift (which doesn't have upsert).
Pinot's upsert seems more like an index than a true upsert
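A rough way to picture that "index, not in-place update" idea, as a toy sketch only: rows are appended and never rewritten, while a primary-key map tracks which row is the latest and a set of valid doc ids decides what queries see. The class and method names here (`UpsertView`, `ingest`, `scan`) are hypothetical and are not Pinot's API.

```python
class UpsertView:
    """Toy model: append-only rows plus a 'latest row per key' index."""

    def __init__(self):
        self.rows = []     # append-only storage, old versions are kept
        self.latest = {}   # primary key -> doc id of the newest row
        self.valid = set() # doc ids that queries are allowed to see

    def ingest(self, key, row):
        doc_id = len(self.rows)
        self.rows.append(row)
        old = self.latest.get(key)
        if old is not None:
            # The older version stays on disk; it is only hidden from queries.
            self.valid.discard(old)
        self.latest[key] = doc_id
        self.valid.add(doc_id)

    def scan(self):
        # Queries read only the rows marked valid, in ingestion order.
        return [self.rows[i] for i in sorted(self.valid)]
```

Ingesting the same key twice leaves both rows in `rows`, but `scan()` returns only the newer one.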
y
The star-tree index is meant for pre-aggregation, so the per-record info is already lost by the time an upsert arrives, i.e. you don't know how much of the pre-aggregated value should be discounted
👍 1
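A small illustration of the point above, as a sketch with made-up data rather than anything taken from Pinot: once amounts are summed per dimension value, a later update to one record cannot be applied, because the sum no longer says how much that record contributed. The event tuples and function names are assumptions for the example.

```python
from collections import defaultdict

# (primary key, country, amount); the third event updates order-1.
events = [
    ("order-1", "US", 10),
    ("order-2", "US", 5),
    ("order-1", "US", 7),
]

def preaggregate(events):
    """Sum amounts per country as records arrive, dropping the primary key."""
    totals = defaultdict(int)
    for _key, country, amount in events:
        totals[country] += amount
    return dict(totals)

def upsert_then_aggregate(events):
    """Keep only the latest record per primary key, then sum per country."""
    latest = {}
    for key, country, amount in events:
        latest[key] = (country, amount)
    totals = defaultdict(int)
    for country, amount in latest.values():
        totals[country] += amount
    return dict(totals)

stale = preaggregate(events)             # old order-1 amount is still counted
correct = upsert_then_aggregate(events)  # only the newest order-1 is counted
```

Here `stale` is `{"US": 22}` while `correct` is `{"US": 12}`; the pre-aggregated total has no record of the 10 that would need to be subtracted.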
r
That makes sense. I've only heard of StarTree at a very high level. Planning to review https://docs.pinot.apache.org/basics/indexing/star-tree-index
k
btw, @User and @User had some cool ideas of supporting star-tree index for upsert as well.