Hi Team, Considering I am using `replicaGroupStrat...
# general
j
Hi Team, considering I am using `replicaGroupStrategyConfig` to use Partitioned Replica-Group Segment Assignment, and I give a column and the number of instances per partition, I have two questions:
1. What is the method used for doing this partitioning?
2. If I am partitioning based on a column and I want to partition a particular value of the column separately, how can this be achieved?
m
1. You can use the Murmur partition function, but note that the implementation of the function has to match on the Pinot side and upstream, where you are partitioning your input.
2. What do you mean by partitioning a particular value of the column separately? Currently, partitioning is done on a column (a function is applied to the column value). If you have a specific partitioning scheme in mind, I believe you can use your own as a plug-in.
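To illustrate point 1, here is a minimal sketch of deriving a partition id from a column value with a murmur hash, so that the same function can be run upstream. It assumes the 32-bit murmur2 variant and seed used by Kafka's default partitioner and UTF-8 encoding of the value; whether this matches Pinot's Murmur partition function bit for bit should be checked against the Pinot source. `murmur2` and `partition_for` are illustrative names, not Pinot APIs.

```python
def murmur2(data: bytes) -> int:
    """32-bit murmur2, using the seed of Kafka's default partitioner (assumption)."""
    mask = 0xFFFFFFFF          # emulate 32-bit integer overflow
    m = 0x5BD1E995
    r = 24
    length = len(data)
    h = (0x9747B28C ^ length) & mask

    # Mix the input four bytes at a time (little-endian).
    for i in range(0, length - length % 4, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * m) & mask
        k ^= k >> r
        k = (k * m) & mask
        h = (h * m) & mask
        h ^= k

    # Fold in the trailing 1 to 3 bytes, if any.
    tail = length % 4
    base = length - tail
    if tail == 3:
        h ^= data[base + 2] << 16
    if tail >= 2:
        h ^= data[base + 1] << 8
    if tail >= 1:
        h ^= data[base]
        h = (h * m) & mask

    # Final avalanche.
    h ^= h >> 13
    h = (h * m) & mask
    h ^= h >> 15
    return h


def partition_for(value: str, num_partitions: int) -> int:
    """Map a column value to a partition id in [0, num_partitions)."""
    positive = murmur2(value.encode("utf-8")) & 0x7FFFFFFF  # drop the sign bit
    return positive % num_partitions
```

If the upstream producer and Pinot disagree on any detail (hash variant, seed, string encoding, or the number of partitions), the same `tenantId` can land in different partitions on each side, which is the mismatch the note above warns about.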
j
I don't see an option to set the partition function in the `replicaGroupStrategyConfig`.
j
From this doc I see that `replicaGroupStrategyConfig` is placed under `segmentsConfig`, so I am assuming `segmentPartitionConfig` under `tableIndexConfig` is mandatory for using `replicaGroupStrategyConfig`?
m
No, partitioning and replicaGroup are orthogonal to each other.
You can also have just replica groups, without partitioning.
j
I am aiming to achieve something like this
We have lots of tenants, and I want to classify these tenants into 2 replica groups with 10 partitions and 1 server under each partition. Going forward, if the load increases, I can either increase the server count or split the org that is causing the increase in latency into a separate partition.
So I was hoping to use something like this in the table config:
{
  "instanceAssignmentConfigMap": {
    "OFFLINE": {
      "tagPoolConfig": {
        "tag": "Tenants_OFFLINE"
      },
      "replicaGroupPartitionConfig": {
        "replicaGroupBased": true,
        "numReplicaGroups": 2,
        "numPartitions": 10,
        "numInstancesPerPartition": 1
      }
    }
  },
  "segmentsConfig": {
    "replicaGroupStrategyConfig": {
      "partitionColumn": "tenantId",
      "numInstancesPerPartition": 1
    },
    ...
  },
  ...
}
But I was confused about how I can set the partitioning function here and how I can move a particular tenant into its own partition. So based on the doc you shared, if I set the `segmentPartitionConfig` with a custom plugin implementation, will that suffice?
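For reference, the mapping such a custom partition function would need to compute can be sketched as follows. This is only an illustration of the logic in Python, not a Pinot plug-in (an actual plug-in would be written in Java against Pinot's partition function interface, which is not shown here). The tenant name, the dedicated partition id, and the use of CRC32 as a stand-in hash are all assumptions made for the example.

```python
import zlib

# Hypothetical setup: 10 shared partitions (ids 0..9) plus one dedicated
# partition (id 10) for a single heavy tenant.
NUM_SHARED_PARTITIONS = 10
DEDICATED_PARTITIONS = {"tenant-big": 10}  # tenantId -> its own partition id


def tenant_partition(tenant_id: str) -> int:
    """Return the partition id for a tenant.

    Tenants listed in DEDICATED_PARTITIONS get their own partition; all
    others are hashed across the shared partitions. CRC32 is a stand-in
    hash here, chosen only because it is in the standard library.
    """
    if tenant_id in DEDICATED_PARTITIONS:
        return DEDICATED_PARTITIONS[tenant_id]
    return zlib.crc32(tenant_id.encode("utf-8")) % NUM_SHARED_PARTITIONS


def total_partitions() -> int:
    """Total number of partitions the table would need to declare."""
    return NUM_SHARED_PARTITIONS + len(DEDICATED_PARTITIONS)
```

As with the built-in functions, the same mapping would have to be applied upstream when the input data is partitioned, and moving a tenant to a dedicated partition changes the partition id of its future segments, so existing segments for that tenant would likely need to be regenerated.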