# troubleshooting
d
Hi guys, I need help. We made some changes to our Pinot cluster deployment values (JVM tuning and liveness probes), redeployed the instances, and now we've lost all our tables (from what's showing in the Incubator UI) and they're not receiving anything from Kafka. What's going on, and how can we recover our tables and segments?
We noticed that the ZooKeeper version was bumped in the Helm chart, and in ZK we can't find the schemas and segments anymore
m
How many ZK instances did you have? Seems like you upgraded ZK in a way that you lost all the metadata (it should be a rolling restart along with snapshots to ensure you don’t run into this issue).
d
Only 1, should we have had more than that?
We didn't mean to restart ZK, but it seems that the version got bumped in the official Pinot Helm chart, which caused it to be redeployed, and at that point we seem to have lost the data
m
Yes ZK is the persistent metadata store. Do you have any ZK snapshot that you can restore? Also @Xiang Fu for any other ideas here.
d
I'm not sure I have, no... we have an EBS persistent volume mounted on it, shouldn't that be enough?
@Mayank if I just recreate the tables and schemas, will that overwrite the previous ones? Or will that "re-bind" the Pinot cluster to the data we already have on the S3 deep store?
m
EBS should have the snapshot. Just recreating the table is not enough, there is metadata stored that needs to be restored
d
OK, and by recovering the EBS volume would we have the metadata back?
m
You restore the metadata in ZK (start it with snapshot).
In the worst case, recreate the table and repush the data from deepstore
d
Cool, so I just:
1. Restore the EBS snapshot
2. Restart the ZK pod
Should this be enough?
"In the worst case, recreate the table and repush the data from deepstore" - how do I do that?
m
You can curl + post the segments to controller
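For example, something along these lines (a rough sketch: hostname, table name and segment file name are placeholders, and the exact multipart form field is an assumption - this is essentially what the admin CLI does under the hood):

# Sketch: push one segment tarball to the controller's v2 upload endpoint.
# Controller host, table name and file name below are placeholders.
curl -X POST \
  -F "segment=@bb8_analyses_logs__0__0.tar.gz" \
  "http://pinot-controller.pinot.svc.cluster.local:9000/v2/segments?tableName=bb8_analyses_logs"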
d
How do I find out the segments names? Can I find that in S3? Because in ZK we don't have them anymore
Is there any documentation about how we could have a more reliable deployment of ZK instances? We're thinking of running replicas of it now, to avoid facing this issue again.
Ah, it seems like the files stored in S3 are named after the segments themselves, great!
m
Yes, the S3 bucket will have it. Typically you should be running ZK outside of the Pinot cluster so you don't end up deploying ZK unnecessarily. The Helm chart in the docs is only for the QuickStart, as a sample.
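If you do move ZK out, the override would look roughly like this (a sketch only - the value keys here are assumptions, so check the chart's values.yaml for the actual names):

# Hypothetical sketch -- verify the key names against the chart's values.yaml.
helm upgrade pinot pinot/pinot -n pinot \
  --set zookeeper.enabled=false \
  --set pinot.zkAddress="zk-0.my-zk:2181,zk-1.my-zk:2181,zk-2.my-zk:2181"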
d
Got it. Good idea. Does the same apply to Kafka as well - running it outside?
m
Yes.
d
Got it. Thanks!
We don't have EBS snapshots, as we had no backups enabled for it or anything like that. So I'll have to reload the segments - will try to do that now, after creating the tables and schemas
👍 1
So, after I create the schemas and tables again, which endpoint, specifically, do I use to recover the segments? Does it work if I just trigger the /segments/{table}/reload endpoint for each table?
m
Not that API. There are multiple options, but if your data is small (IIRC), you can just do pinot-admin.sh uploadSegment in a loop.
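A single invocation would look something like this (a sketch; the controller host and segment directory are placeholders):

# Sketch of one upload with the admin CLI -- host and directory are placeholders.
pinot-admin.sh UploadSegment \
  -controllerHost pinot-controller.pinot.svc.cluster.local \
  -controllerPort 9000 \
  -segmentDir /tmp/segments-to-upload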
d
I'm not sure how I can get the size of the data we currently have, but I do know that we have at least about 200M rows. Let me try to find some docs for that CLI command; I'm trying to figure out a way to upload segments without having to download them to an intermediary server first.
@Mayank I think the CLI option might not be viable to us, can't we just rebuild the metadata somehow in ZK? Maybe if we could just get a list of the segments and then recreate the metadata based on that, and then expect the servers to redownload as needed from the deep store?
m
Yeah that is an option, but requires some code. But it will be much easier to simply upload. What’s the issue with that?
d
The issue is that there's a lot of data to download and then re-upload. I'm worried that this takes days to recover, but maybe I'm overly pessimistic.
Does uploadSegment support setting an S3 bucket with a path as the directory, instead of a local directory?
m
200M rows should not be that much data (maybe a few GB). Can you check the S3 bucket size?
d
6.2 GB
m
That should be pretty fast, why do you think it will take days?
d
It was just a guess. Let's hope it's less than that, then. I'm trying to get the CLI running on a server in EKS; hopefully it will be fast enough there and I can quickly recover those segments. The painful part now is writing a script that can do that based on the list of segments available in the bucket (which is why I was hoping the CLI tool could do that for me, but apparently it can't).
m
That’s also just a for loop something like:
for segment in $segments;
do 
  pinot-admin.sh uploadSegment...
done;
d
There's more to it: managing AWS creds, listing the available segments in S3, not re-uploading already existing segments, etc. But no worries, I got this 🙂
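For reference, a rough sketch of such a script (bucket, prefix and controller host are placeholders, and it assumes the AWS CLI is already authenticated):

#!/usr/bin/env bash
# Rough sketch: list segment files in the deep store, download each one,
# push it to the controller, then remove it locally. All names are placeholders.
set -euo pipefail

BUCKET="s3://my-pinot-deepstore/bb8_analyses_logs"
CONTROLLER="pinot-controller.pinot.svc.cluster.local"
WORKDIR="$(mktemp -d)"

aws s3 ls "${BUCKET}/" | awk '{print $NF}' | while read -r segment; do
  aws s3 cp "${BUCKET}/${segment}" "${WORKDIR}/${segment}"
  pinot-admin.sh UploadSegment \
    -controllerHost "${CONTROLLER}" \
    -controllerPort 9000 \
    -segmentDir "${WORKDIR}"
  rm -f "${WORKDIR}/${segment}"   # keep the work dir empty so each pass uploads only one file
done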
m
It is ok to upload already existing segments (if it makes life easy), Pinot is smart enough to figure out duplicate segment push.
d
But does it end up transferring the file? Or does it stop before transferring when it's duplicated?
@Mayank I got the segment files, but uploading is not working for me:
Sending request: http://a57e987c9a6844455840c036c14a9fa5-74048ef705b9aafd.elb.eu-west-1.amazonaws.com:9000/v2/segments?tableName=bb8_analyses_logs_REALTIME to controller: pinot-controller-1.pinot-controller-headless.pinot.svc.cluster.local, version: Unknown
org.apache.pinot.common.exception.HttpErrorStatusException: Got error status code: 404 (Not Found) with reason: "Failed to find table config for table: bb8_analyses_logs_OFFLINE" while sending request: http://a57e987c9a6844455840c036c14a9fa5-74048ef705b9aafd.elb.eu-west-1.amazonaws.com:9000/v2/segments?tableName=bb8_analyses_logs_REALTIME to controller: pinot-controller-1.pinot-controller-headless.pinot.svc.cluster.local, version: Unknown
Does it only work for OFFLINE tables?
m
Was the table offline or real-time? Newer release supports pushing to real-time also
If real-time, what's the retention of the upstream?
d
It's REALTIME (and was before, too). Retention is infinite.
I'm using the Pinot binaries for version 0.10.0
Is it fine to use 0.11.0 to upload to a 0.10.0 cluster?
m
Oh if the retention for upstream/kafka is infinite, then all you need to do is re-consume from smallest offset (table config), and you are done. No need to repush
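That would be a table config change along these lines (a sketch, assuming the realtime table config is edited and re-applied through the controller API; the file name and host are placeholders):

# Sketch: re-apply the realtime table config after setting, under "streamConfigs":
#   "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
curl -X PUT \
  -H "Content-Type: application/json" \
  -d @table-config.json \
  "http://pinot-controller.pinot.svc.cluster.local:9000/tables/bb8_analyses_logs"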
d
I'm not sure if the Kafka topics were kept though, I had to recreate them
But no worries, I'm one command away from pushing the segments, I just need to be sure if I can use the 0.11.0 CLI tool with a 0.10.0 cluster - don't want to break it by running incompatible versions
m
It is not the CLI, it is the cluster that needs to be latest release.
But let me check if the 0.10 already has that
d
Oh... so I'm not able to re-upload segments in a 0.10.0? How can I recover them, then? Or is there no other way?
m
Checking
Let me find and dm
d
Problem solved! I created one OFFLINE table for each REALTIME table, and then re-uploading segments worked 🙂
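For anyone following along, this is roughly what creating those OFFLINE table configs could look like via the controller API (the config file name and host are placeholders):

# Sketch: create an OFFLINE table config matching the existing schema, then re-run the upload.
curl -X POST \
  -H "Content-Type: application/json" \
  -d @bb8_analyses_logs_offline_table_config.json \
  "http://pinot-controller.pinot.svc.cluster.local:9000/tables"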
m
Glad that it worked @Diogo Baeder. Moving forward, my recommendation is to have at least a replication of 3 for all components, and to run ZK/Kafka outside of the Pinot deployment.
d
Yeah, I should have done that right from the start; I didn't realize how failure-prone my cluster was...
🙏 1
r
@Mayank We have run into a similar issue again, but we had retained the persistent volumes. Once we reconnected the old persistent volume to ZK it came back online; however, for whatever reason, all of the data for the REALTIME table has found its way into the OFFLINE table, and we have a bad status on all tables. All REALTIME tables have missing segments, and all OFFLINE tables have extra segments that won't load because the table type (REALTIME) is wrong.
We were down a server and a controller and when we put it back online, it resolved itself.
Sorry for the trouble. All is well now.
thanks 1
👍 1