# general
s
Hello. What happens with upsert during real-time ingestion when two records have the same primary key and the same event time? The documentation says: "When two records of the same primary key are ingested, the record with the greater event time (as defined by the time column) is used." But what happens when there is a tie?
k
@User @User will know the exact answer. As part of the partial upsert work we are doing, we will make the merging logic pluggable/configurable.
y
Currently the behavior is undefined, so it depends on the implementation: right now the message with the largest offset wins. However, there are caveats. For example, when the records are sorted by some column, the order is not deterministic.
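To make the tie case concrete, here is a minimal sketch of that comparison in Python. It is not Pinot's actual code: the `Record` class and `upsert` function are hypothetical names, and the `>=` comparison is an assumption that models "on a tie, the larger offset wins" as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str         # primary key
    event_time: int  # value of the time column
    offset: int      # position in the stream partition
    payload: str


def upsert(records):
    """Return the surviving record per primary key (simplified model)."""
    latest = {}
    # Records are consumed in offset order within a partition.
    for rec in sorted(records, key=lambda r: r.offset):
        current = latest.get(rec.key)
        # Assumption: ">=" means a tie on event time is won by the record
        # consumed later, i.e. the one with the larger offset.
        if current is None or rec.event_time >= current.event_time:
            latest[rec.key] = rec
    return latest
```

With two records that share a key and an event time, the one with the higher offset is kept. A record with a strictly older event time never replaces a newer one, even if it arrives later.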
m
@User mind adding to docs/FAQ?
s
I have an upsert scenario that I can't get my head around. We have created an orders table (configuration in the file below) with the primary key being "kid" and the time column being "_order_date". We have duplicated rows (same "kid" and "_order_date") in our table, but upsert still keeps the duplicates. Depending on the filter in the query, it returns duplicate items to me. See the screenshots below.
j
I don't see upsert enabled in the table config. Please follow the instructions here to enable upsert: https://docs.pinot.apache.org/basics/data-import/upsert
Please also make sure the Kafka stream is partitioned on the primary key.
s
Sorry @User. Here is the updated file! The one I had sent was from the previous version. The error happens with the table created with this schema below.
j
Is the Kafka stream partitioned with the primary key?
Can you try this query: select kid, $hostName from orders2 where kid = 'ever max-100000009'
y
@User sure thing
s
@User. The results of the query are in the image below. I don't know if the Kafka stream is partitioned on the primary key. I will check and get back to you. Tks
j
@User The Kafka stream is not properly partitioned, and the same kid shows up in 2 different partitions.
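A rough Python sketch of why this produces duplicates, assuming upsert state is tracked separately for each stream partition. The function names are hypothetical, and `zlib.crc32` is only a stand-in for whatever hash the producer really uses (Kafka's default partitioner uses a different hash).

```python
import zlib
from collections import defaultdict


def partition_for_key(key, num_partitions):
    # Stand-in hash: the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def visible_rows(records, assign_partition):
    """Deduplicate within each partition only (simplified model)."""
    per_partition = defaultdict(dict)
    for i, (key, event_time) in enumerate(records):
        p = assign_partition(i, key)
        current = per_partition[p].get(key)
        if current is None or event_time >= current[1]:
            per_partition[p][key] = (key, event_time)
    # A query sees the surviving rows of every partition.
    return [row for rows in per_partition.values() for row in rows.values()]


# Two messages with the same kid and the same event time.
records = [("ever max-100000009", 1), ("ever max-100000009", 1)]

# Not partitioned on the key: the two messages land in different partitions.
round_robin = visible_rows(records, lambda i, key: i % 2)

# Partitioned on the key: both messages land in the same partition.
keyed = visible_rows(records, lambda i, key: partition_for_key(key, 2))
```

In this model `round_robin` keeps both rows, because each partition only sees one of them, while `keyed` keeps a single row. That matches the symptom above: the same kid is served from two places, which is what the `$hostName` query helps reveal.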
s
Tks, @User.