# troubleshooting
e
We have an upsert table and are trying to use pool-based instance assignment. Only 1 instance from each pool contains all the consuming segments, and we get duplicate rows if we set `COMPLETED` segments to have `numInstancesPerPartition` = 0 (so it can use all instances). Is upsert compatible with pool-based instance assignment?
Here is the instanceAssignmentConfig:
```json
"routing": {
  "instanceSelectorType": "strictReplicaGroup"
},
"upsertConfig": {
  "mode": "FULL"
},
"instanceAssignmentConfigMap": {
  "CONSUMING": {
    "tagPoolConfig": {
      "tag": "UpsertAnalytics_REALTIME",
      "poolBased": true,
      "numPools": 3
    },
    "replicaGroupPartitionConfig": {
      "replicaGroupBased": true,
      "numReplicaGroups": 3,
      "numInstancesPerPartition": 1,
      "numPartitions": 1
    }
  },
  "COMPLETED": {
    "tagPoolConfig": {
      "tag": "UpsertAnalytics_REALTIME",
      "poolBased": true,
      "numPools": 3
    },
    "replicaGroupPartitionConfig": {
      "replicaGroupBased": true,
      "numReplicaGroups": 3,
      "numInstancesPerPartition": 0,
      "numPartitions": 1
    }
  }
}
```
We also tried `numPartitions` = the number of Kafka partitions for `COMPLETED` segments (e.g. 12), and the completed segments spread evenly across instances, but we still get duplicate records.
We're using the Pinot 0.8.0 release, and this is for a realtime-only upsert table.
cc @Jackie, let me know if we should not use pool-based assignment with upsert. Thanks!
cc @Mingfeng Tan
j
@Elon In order for upsert to work, all the segments from the same partition must be served by the same instance. In other words, the `COMPLETED` segments cannot be relocated to another instance.
Upsert is compatible with pool-based instance assignment, but it must have `numInstancesPerPartition` set to 1, and should not relocate the `COMPLETED` segments.
e
Thanks! We tried that but noticed that the segments were only on 1 instance in each pool, i.e. we had 3 instances per pool and 2 of them were empty.
i.e. when we tried `numInstancesPerPartition` = 1, then pool0-instance0, pool1-instance0, and pool2-instance0 had all the segments, while pool0-instance1, pool0-instance2, pool1-instance1, pool1-instance2, pool2-instance1, and pool2-instance2 had no segments.
w
@Jackie Do you have any documentation about why upsert tables are designed this way?
e
Hopefully it's our config. @Jackie does the config above, but with `COMPLETED` `numInstancesPerPartition` = 1, look correct?
That's the config that had all segments on only 1 node in each pool, leaving the others empty.
What's the minimum number of nodes used in typical configs? We are trying in staging with 9 nodes, 3 in each of pools 0, 1, and 2 respectively.
j
The `COMPLETED` config should be removed, as the segments should not be relocated.
💡 1
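Applying that suggestion, the instance assignment section would keep only the `CONSUMING` block, so completed segments stay on the instance that consumed them. A sketch adapted from the config posted earlier in the thread (illustrative, not a verified working config):

```json
"instanceAssignmentConfigMap": {
  "CONSUMING": {
    "tagPoolConfig": {
      "tag": "UpsertAnalytics_REALTIME",
      "poolBased": true,
      "numPools": 3
    },
    "replicaGroupPartitionConfig": {
      "replicaGroupBased": true,
      "numReplicaGroups": 3,
      "numInstancesPerPartition": 1,
      "numPartitions": 1
    }
  }
}
```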
e
thanks, I'll try that!
j
How many partitions do you have in the kafka stream?
e
12
j
Then `numPartitions` should be 12, and all the servers will be used.
e
I thought (from the doc) that realtime implicitly sets it to 1; iirc when I tried setting `CONSUMING` `numPartitions` to 12 it gave an error, but I will retry now and update, thanks!
@Jackie I got this error when rebalancing:
```
Caught exception while calculating target assignment: java.lang.IllegalStateException: Instance partitions: enriched_customer_orders_v1_16_2_upsert_CONSUMING should contain 1 partition
```
Here's the table config for routing and instance assignment:
```json
"routing": {
  "instanceSelectorType": "strictReplicaGroup"
},
"instanceAssignmentConfigMap": {
  "CONSUMING": {
    "tagPoolConfig": {
      "tag": "UpsertAnalytics_REALTIME",
      "poolBased": true,
      "numPools": 3
    },
    "replicaGroupPartitionConfig": {
      "replicaGroupBased": true,
      "numInstances": 0,
      "numReplicaGroups": 3,
      "numInstancesPerReplicaGroup": 0,
      "numPartitions": 12,
      "numInstancesPerPartition": 1
    }
  }
}
```
j
@Elon You are correct, `numPartitions` should not be configured for a realtime table.
Can you try removing the `numInstancesPerPartition` config as well and see if all the servers are assigned segments?
e
This works, but then I get dups, though only ~70-80 out of 18M records. Is the PR you did something we can try to reduce the dups? We're on the Pinot 0.8.0 release, maybe we should upgrade?
Oh! That's about the same number of dups we get from unpooled upsert.
Is there a way to get rid of those dups?
It's a very small percentage though.
l
@Jackie Why do we allow upsert tables to be configured with segment relocation if it isn't supported? I had just enabled segment relocation for our upsert tables as well, but reverted it after seeing this thread. Perhaps it would be a good idea to fail the configuration if segment relocation is detected when upsert is enabled?
j
@Lars-Kristian Svenøy Good point. Can you please help create a GitHub issue for this request? We should not allow relocating segments or the realtime-to-offline task for upsert tables. Contributions are also very welcome.
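The guard Lars-Kristian proposes might be sketched like this. All class and method names here are hypothetical illustrations, not Pinot's actual validation internals; the point is just to reject configs that both enable upsert and configure `COMPLETED` segment relocation:

```java
import java.util.Set;

// Hypothetical sketch of the proposed table-config validation.
public class UpsertConfigValidator {

  /**
   * @param upsertMode value of upsertConfig.mode, e.g. "FULL", "PARTIAL", or "NONE"
   * @param instanceAssignmentTypes keys of instanceAssignmentConfigMap, e.g. "CONSUMING"
   */
  public static void validate(String upsertMode, Set<String> instanceAssignmentTypes) {
    boolean upsertEnabled = upsertMode != null && !"NONE".equalsIgnoreCase(upsertMode);
    if (upsertEnabled && instanceAssignmentTypes.contains("COMPLETED")) {
      // Relocating COMPLETED segments breaks the invariant that all segments of a
      // partition are served by the same instance, which is what causes duplicates.
      throw new IllegalStateException(
          "Upsert tables must not relocate COMPLETED segments; "
              + "remove the COMPLETED instance assignment config");
    }
  }

  public static void main(String[] args) {
    validate("FULL", Set.of("CONSUMING")); // accepted
    try {
      validate("FULL", Set.of("CONSUMING", "COMPLETED"));
    } catch (IllegalStateException expected) {
      System.out.println("rejected as expected");
    }
  }
}
```

A real implementation would read these values from the parsed table config and could also check for a configured realtime-to-offline task, which moves segments for the same reason.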
l
Done @Jackie https://github.com/apache/pinot/issues/8973 I also asked for your comment here
🙌 1