# general
a
and what about Upserts?
c
@Jackie ^^ star-tree support in case of upserts? Should we track this somewhere?
a
yep, does a table with upserts generate a star-tree?
c
you can - but the question is do the results make sense 🙂
as in - de-dup would likely not happen
a
ahh, so - it will be created but the results will be incorrect?
j
Star-tree should not be applied to upsert use cases as the records can be invalidated
There is also no way to invalidate records that are already pre-aggregated in the star-tree
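To make the problem concrete, here is a toy sketch in plain Python. It is not Pinot code: the record layout and the `pre_aggregate` and `upsert_view` helpers are made up for illustration. It only shows that a sum computed over every ingested row keeps counting a row after an upsert has superseded it.

```python
from collections import defaultdict

# Hypothetical records: (primary_key, dimension, metric), in arrival order.
records = [
    ("order-1", "US", 10),
    ("order-2", "US", 5),
    ("order-1", "US", 7),  # upsert: supersedes the earlier order-1 row
]

def pre_aggregate(rows):
    """Sum the metric per dimension over every row, as a pre-aggregation would."""
    sums = defaultdict(int)
    for _pk, dim, metric in rows:
        sums[dim] += metric
    return dict(sums)

def upsert_view(rows):
    """Keep only the latest row per primary key, then sum per dimension."""
    latest = {}
    for pk, dim, metric in rows:
        latest[pk] = (dim, metric)  # later rows overwrite earlier ones
    sums = defaultdict(int)
    for dim, metric in latest.values():
        sums[dim] += metric
    return dict(sums)

stale = pre_aggregate(records)   # still includes the invalidated order-1 row
correct = upsert_view(records)   # counts only the latest version of each key
```

The two results differ because the pre-aggregated sum has no record-level handle left with which to drop the invalidated row.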
@Chinmay Soman Maybe we should add this to the validation
a
we should. I was chatting with Elon about it, maybe there is another way to build/rebuild the star-tree for upserts, but it will require some thinking 🙂
j
I don't see an easy way to make star-tree mutable. Whenever a record gets updated, we need to remove it from the star-tree, which will be very hard
Also, upsert won't work properly with very large tables, as the primary keys are maintained in heap memory, so not sure how much value star-tree can provide even if we support it
c
I’ll add the validation check for now
k
@Jackie you can use star-tree with upsert if you negate the previous value and add the new value for the metrics (create a new column for the aggregate metric)
j
@Kishore G Conceptually yes, but that will be very hard to implement and has the following challenges:
• Some aggregated metrics cannot be negated such as HLL, MIN, MAX
• Won't work on aggregated metrics with variable length (no way to update in-place)
• Thread-safety issue
We don't have star-tree for mutable segment for similar reasons
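A small sketch of the negation idea and of the first challenge, again in plain Python with made-up names (`apply_update_sum` is hypothetical, not a Pinot API). Subtracting the old value and adding the new one keeps a SUM correct, but the same trick has no equivalent for MAX: once the old maximum is replaced, the aggregate alone does not say what the next-largest value is.

```python
def apply_update_sum(agg_sum, old_value, new_value):
    """Negate the previous value and add the new one to a running SUM."""
    return agg_sum - old_value + new_value

# Hypothetical latest values per primary key.
values = {"a": 10, "b": 5}
agg_sum = sum(values.values())
agg_max = max(values.values())

# Upsert: key "a" changes from 10 to 3.
old, new = values["a"], 3
values["a"] = new

# SUM can be patched from (old, new) alone.
agg_sum = apply_update_sum(agg_sum, old, new)

# MAX cannot: the only incremental option is max(agg_max, new),
# which keeps the stale maximum that no longer exists.
naive_max = max(agg_max, new)

# Getting the right answer needs a rescan of the underlying values,
# which is exactly what a pre-aggregated structure no longer has.
true_max = max(values.values())
```

HLL sketches are in the same position as MAX here: there is no operation that removes a previously added value.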
y
fyi, there is a validation for star-tree with upsert: https://github.com/apache/incubator-pinot/pull/6153/files
c
Thanks for pointing that out, Yupeng