# general
a
Hi team, is it a requirement to enable partitioning in Pinot to use the upsert feature?
I mean, if I set "routing": { "instanceSelectorType": "strictReplicaGroup" } and "upsertConfig": { "mode": "FULL" } in the table config and set the primary key in the schema, will upsert take effect without setting segmentPartitionConfig?
m
Yes, partitioning is a requirement
a
At the moment, does Pinot set a default segmentPartitionConfig for the upsert feature if it's missing from the table config?
m
no
a
I think there is no segmentPartitionConfig in this example?
m
SegmentPartitionConfig is separate from upsert
For upsert, the requirement is that the upstream is partitioned by the upsert primary key
SegmentPartitionConfig is for specifying the partitioning that was done upstream (which function was chosen, etc.). It is used during query execution to query only the partitions for the key in the query. It is separate from upsert, and not needed for upsert.
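To make that concrete, here is a rough sketch of the upsert-related config fragments from the question, written as Python dicts and with no segmentPartitionConfig at all. The table name `orders` and the columns `a` and `b` are made up for illustration, and `primaryKeyColumns` is the schema field name as I understand it from the Pinot docs, so please check it against your Pinot version.

```python
import json

# Hypothetical table name; only the routing and upsertConfig entries come
# from the question in this thread.
upsert_table_config = {
    "tableName": "orders",
    "tableType": "REALTIME",
    "routing": {"instanceSelectorType": "strictReplicaGroup"},
    "upsertConfig": {"mode": "FULL"},
}

# Hypothetical schema fragment; "primaryKeyColumns" is assumed to be the
# field that declares the upsert primary key.
schema_fragment = {
    "schemaName": "orders",
    "primaryKeyColumns": ["a", "b"],
}

# Note that nothing here mentions segmentPartitionConfig: the partitioning
# requirement is on the upstream stream, not on this config.
table_config_json = json.dumps(upsert_table_config, indent=2)
schema_json = json.dumps(schema_fragment, indent=2)
```

These are fragments only, not a complete table config or schema.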
a
I see. Thanks.
The following config is just for querying?
m
You need to push data to Kafka with a key that is also the primary key in the Pinot schema. For example, if there are two columns `a` and `b` in primaryKeys of the Pinot schema, then in your Kafka producer (may be Flink, Spark or any other job), you need to use both the `a` and `b` attributes of the Kafka message as the partitioning key, so that messages land in the same Kafka partition for specific values of `a` and `b`.
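A minimal sketch of the idea, using only the Python standard library rather than a real Kafka client: build the message key from the primary-key columns in a fixed order, then let the partitioner hash that key. The helper names `make_partition_key` and `pick_partition` are hypothetical, and the CRC32 hash is only a stand-in for illustration (the Java client's default partitioner uses murmur2 on the key bytes, as far as I know).

```python
import json
import zlib


def make_partition_key(record, primary_keys):
    """Serialize the primary-key values, in a fixed order, into key bytes."""
    values = [record[name] for name in primary_keys]
    return json.dumps(values, separators=(",", ":")).encode("utf-8")


def pick_partition(key, num_partitions):
    """Stand-in partitioner: stable hash of the key modulo partition count.

    This is NOT Kafka's actual algorithm; it only illustrates that equal
    keys always map to the same partition.
    """
    return zlib.crc32(key) % num_partitions


PRIMARY_KEYS = ["a", "b"]  # same columns as the primary key in the Pinot schema

first = {"a": "user-1", "b": 42, "value": 10}
update = {"a": "user-1", "b": 42, "value": 99}  # same primary key, new value

key_first = make_partition_key(first, PRIMARY_KEYS)
key_update = make_partition_key(update, PRIMARY_KEYS)

# Both records carry the same key bytes, so they land in the same partition.
same_partition = pick_partition(key_first, 8) == pick_partition(key_update, 8)
```

With a real producer you would pass these key bytes as the message key and leave partition selection to the client's partitioner.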
a
Yes, we’re working on it. Thanks. @User