# pinot-realtime-table-rebalance
  • j

    Jackie

    10/16/2020, 5:58 PM
    Seems the problem is that there are 12 kafka partitions (streaming partition * replication), but only 9 servers
  • j

    Jackie

    10/16/2020, 6:00 PM
So 3 servers will have 2 partitions to consume, while the other 6 have 1 partition each
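A quick back-of-the-envelope sketch of the placement described above (4 stream partitions × replication factor 3 spread over 9 servers, per the numbers given in this thread):

```python
# Sketch of the placement above: 12 consuming partition-replicas
# (4 stream partitions x replication factor 3) spread over 9 servers.
partitions, replicas, servers = 4, 3, 9
units = partitions * replicas            # 12 partition-replicas to place
base, extra = divmod(units, servers)     # base=1, extra=3
print(f"{extra} servers consume {base + 1} partitions, "
      f"{servers - extra} servers consume {base}")
```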
  • t

    Ting Chen

    10/16/2020, 6:02 PM
    yes. so the 9 servers' perf is in fact worse than 6 servers.
  • y

    Yupeng Fu

    10/16/2020, 6:02 PM
    no. this topic has 4 partitions
  • y

    Yupeng Fu

    10/16/2020, 6:02 PM
but replication factor 3
  • t

    Ting Chen

    10/16/2020, 6:02 PM
I also observed the document distribution is not even -- the latter 3 servers have 50% more docs than the original 6.
  • t

    Ting Chen

    10/16/2020, 6:02 PM
    is that expected?
  • j

    Jackie

    10/16/2020, 6:03 PM
    The difference between LLC and realtime is that all the segments for one partition will be hosted on the same server, so think of partition as the smallest unit of the table
  • t

    Ting Chen

    10/16/2020, 6:04 PM
    what do you mean by the diff between LLC and realtime? I thought LLC is realtime?
  • j

    Jackie

    10/16/2020, 6:04 PM
    Sorry, LLC and offline
  • j

    Jackie

    10/16/2020, 6:07 PM
Performance wise, 9 servers should be similar to 6 servers because the server load on the 3 new servers is the same as before
  • j

    Jackie

    10/16/2020, 6:07 PM
    Do you use partitioning or replica-group routing for this table?
  • y

    Yupeng Fu

    10/16/2020, 6:19 PM
    we use default
  • t

    Ting Chen

    10/16/2020, 6:21 PM
    we want to use replica-group routing for this tenant (right now it has 12 servers)
  • t

    Ting Chen

    10/16/2020, 6:21 PM
otherwise each query gets fanned out to all 12 now -- so latency depends on the slowest server.
  • j

    Jackie

    10/16/2020, 6:23 PM
    For LLC table, because of the nature of the streaming partition, the segments are already assigned into replica-groups
  • t

    Ting Chen

    10/16/2020, 6:23 PM
```json
{
  "tableName": "pinotTable",
  "tableType": "REALTIME",
  "routing": {
    "instanceSelectorType": "replicaGroup"
  }
  ..
}
```
  • j

    Jackie

    10/16/2020, 6:23 PM
Simply enabling replica-group routing should do the trick
  • j

    Jackie

    10/16/2020, 6:23 PM
    Yes, correct
  • t

    Ting Chen

    10/16/2020, 6:23 PM
    so we just need to add the above and restart broker?
  • j

    Jackie

    10/16/2020, 6:24 PM
    Let me check, I think we have an API to avoid restarting broker
  • j

    Jackie

    10/16/2020, 6:25 PM
    You can use the broker rebuild routing API to enable it:
```java
@PUT
@Produces(MediaType.TEXT_PLAIN)
@Path("/routing/{tableName}")
```
  • j

    Jackie

    10/16/2020, 6:26 PM
(table name here is the full table name, e.g. `pinotTable_REALTIME`)
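The annotated endpoint above takes a plain HTTP PUT. A minimal sketch of invoking it (the broker host and port here are assumptions; point it at your actual broker):

```python
# Sketch: trigger the broker's rebuild-routing endpoint without a restart.
# The broker address below is an assumption; adjust for your deployment.
import urllib.request

BROKER = "http://localhost:8099"  # assumed Pinot broker address

def rebuild_routing_url(full_table_name: str) -> str:
    """Build the PUT URL for the rebuild-routing endpoint shown above."""
    return f"{BROKER}/routing/{full_table_name}"

# Against a live broker, issue the PUT (network call, so commented out here):
# req = urllib.request.Request(rebuild_routing_url("pinotTable_REALTIME"),
#                              method="PUT")
# print(urllib.request.urlopen(req).read())
print(rebuild_routing_url("pinotTable_REALTIME"))
```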
  • n

    Neha Pawar

    10/16/2020, 6:27 PM
though this won’t solve your issue of some servers seeing twice the load. 1 server in each replica group is still going to have the same behavior
  • j

    Jackie

    10/16/2020, 6:27 PM
    I think they already scaled up the cluster to 12 servers
  • n

    Neha Pawar

    10/16/2020, 6:28 PM
    o okay
  • t

    Ting Chen

    10/16/2020, 6:29 PM
    yes.
  • t

    Ting Chen

    10/16/2020, 6:30 PM
looks like 6->9 was not a good idea. 6->12 is.
  • y

    Yupeng Fu

    10/16/2020, 6:31 PM
    Thanks @User @User for the help
  • r

    Raunak Binani

    12/10/2024, 3:28 AM
Hi everyone... I am using realtime ingestion. The Kafka source has a retention period of 3 days, but the lag in Pinot showed 4-5 days and was increasing. When I restarted the servers the lag dropped to 2-2.5 days, kept falling quite drastically, and overnight dissipated. Can you please help me debug this?