# troubleshooting
d
Hi guys, I need help. We made some changes to our Pinot cluster deployment values (JVM tuning and liveness probes), redeployed the instances, and now we've lost all our tables (from what's showing in the Incubator UI) and they're not receiving anything from Kafka. What's going on, and how can we recover our tables and segments?
We noticed that the ZooKeeper version was bumped in the Helm chart, and in ZK we can't find the schemas and segments anymore
m
How many ZK instances did you have? Seems like you upgraded ZK in a way that you lost all the metadata (it should be a rolling restart along with snapshots to ensure you don’t run into this issue).
d
Only 1, should we have had more than that?
We didn't mean to restart ZK, but it seems that the version got bumped in the official Pinot Helm chart, which caused it to be redeployed, and at that point we seem to have lost the data
m
Yes ZK is the persistent metadata store. Do you have any ZK snapshot that you can restore? Also @Xiang Fu for any other ideas here.
d
I'm not sure I have, no... we have an EBS persistent volume mounted on it, shouldn't that be enough?
@Mayank if I just recreate the tables and schemas, will that overwrite the previous ones? Or will that "re-bind" the Pinot cluster to the data we already have on the S3 deep store?
m
EBS should have the snapshot. Just recreating the table is not enough, there is metadata stored that needs to be restored
d
OK, and by recovering the EBS volume would we have the metadata back?
m
You restore the metadata in ZK (start it with snapshot).
In the worst case, recreate the table and repush the data from deepstore
d
Cool, so I just:
1. Restore the EBS snapshot
2. Restart the ZK pod
Should this be enough?
"In the worst case, recreate the table and repush the data from deepstore" - how do I do that?
m
You can curl + post the segments to controller
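For example, something along these lines (a rough sketch: hostname, table name and segment file name are placeholders, and the exact multipart form field is an assumption - this is essentially what the admin CLI does under the hood):

# Sketch: push one segment tarball to the controller's v2 upload endpoint.
# Controller host, table name and file name below are placeholders.
curl -X POST \
  -F "segment=@bb8_analyses_logs__0__0.tar.gz" \
  "http://pinot-controller.pinot.svc.cluster.local:9000/v2/segments?tableName=bb8_analyses_logs"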
d
How do I find out the segments names? Can I find that in S3? Because in ZK we don't have them anymore
Is there any documentation about how we could have a more reliable deployment of ZK instances? We're thinking of running replicas of it now, to avoid facing this issue again.
Ah, it seems like the files stored in S3 are named after the segments themselves, great!
m
Yes, the S3 bucket will have it. Typically you should be running ZK outside of the Pinot cluster so you don't end up deploying ZK unnecessarily. The Helm chart in the docs is only for the QuickStart, as a sample.
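If you do move ZK out, the override would look roughly like this (a sketch only - the value keys here are assumptions, so check the chart's values.yaml for the actual names):

# Hypothetical sketch -- verify the key names against the chart's values.yaml.
helm upgrade pinot pinot/pinot -n pinot \
  --set zookeeper.enabled=false \
  --set pinot.zkAddress="zk-0.my-zk:2181,zk-1.my-zk:2181,zk-2.my-zk:2181"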
d
Got it. Good idea. Does the same apply to Kafka as well - running it outside?
m
Yes.
d
Got it. Thanks!
We don't have EBS snapshots, as we had no backups enabled for it or anything like that. So I'll have to reload the segments - will try to do that now, after creating the tables and schemas
👍 1
So, after I create the schemas and tables again, which endpoint, specifically, do I use to recover the segments? Does it work if I just trigger the /segments/{table}/reload endpoint for each table?
m
Not that API. There are multiple options, but if your data is small (IIRC), you can just do pinot-admin.sh uploadSegment in a loop.
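A single invocation would look something like this (a sketch; the controller host and segment directory are placeholders):

# Sketch of one upload with the admin CLI -- host and directory are placeholders.
pinot-admin.sh UploadSegment \
  -controllerHost pinot-controller.pinot.svc.cluster.local \
  -controllerPort 9000 \
  -segmentDir /tmp/segments-to-upload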
d
I'm not sure how I can get the size of the data we currently have, but I do know that we have at least about 200M rows. Let me try to find some docs for that CLI command; I'm trying to figure out a way to upload segments without having to download them to an intermediary server first.
@Mayank I think the CLI option might not be viable to us, can't we just rebuild the metadata somehow in ZK? Maybe if we could just get a list of the segments and then recreate the metadata based on that, and then expect the servers to redownload as needed from the deep store?
m
Yeah that is an option, but requires some code. But it will be much easier to simply upload. What’s the issue with that?
d
The issue is that there's a lot of data to download and then re-upload. I'm worried that this takes days to recover, but maybe I'm overly pessimistic.
Does uploadSegment support setting an S3 bucket with a path as the directory, instead of a local directory?
m
200M rows should not be that much data (maybe a few GB). Can you check the S3 bucket size?
d
6.2 GB
m
That should be pretty fast, why do you think it will take days?
d
It was just a guess. Let's hope it's less than that, then. I'm trying to get the CLI running on a server in EKS; hopefully it will be fast enough there and I can quickly recover those segments. The painful part now is writing a script that can do that based on the list of segments available in the bucket (which is why I was hoping the CLI tool could do that for me, but apparently it can't).
m
That’s also just a for loop something like:
for segment in $segments;
do 
  pinot-admin.sh uploadSegment...
done;
d
There's more to it: managing AWS creds, listing the available segments in S3, not re-uploading already existing segments, etc. But no worries, I got this 🙂
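For reference, a rough sketch of such a script (bucket, prefix and controller host are placeholders, and it assumes the AWS CLI is already authenticated):

#!/usr/bin/env bash
# Rough sketch: list segment files in the deep store, download each one,
# push it to the controller, then remove it locally. All names are placeholders.
set -euo pipefail

BUCKET="s3://my-pinot-deepstore/bb8_analyses_logs"
CONTROLLER="pinot-controller.pinot.svc.cluster.local"
WORKDIR="$(mktemp -d)"

aws s3 ls "${BUCKET}/" | awk '{print $NF}' | while read -r segment; do
  aws s3 cp "${BUCKET}/${segment}" "${WORKDIR}/${segment}"
  pinot-admin.sh UploadSegment \
    -controllerHost "${CONTROLLER}" \
    -controllerPort 9000 \
    -segmentDir "${WORKDIR}"
  rm -f "${WORKDIR}/${segment}"   # keep the work dir empty so each pass uploads only one file
done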
m
It is ok to upload already existing segments (if it makes life easy), Pinot is smart enough to figure out duplicate segment push.
d
But does it end up transferring the file? Or does it stop before transferring when it's duplicated?
@Mayank I got the segment files, but uploading is not working for me:
Sending request: http://a57e987c9a6844455840c036c14a9fa5-74048ef705b9aafd.elb.eu-west-1.amazonaws.com:9000/v2/segments?tableName=bb8_analyses_logs_REALTIME to controller: pinot-controller-1.pinot-controller-headless.pinot.svc.cluster.local, version: Unknown
org.apache.pinot.common.exception.HttpErrorStatusException: Got error status code: 404 (Not Found) with reason: "Failed to find table config for table: bb8_analyses_logs_OFFLINE" while sending request: http://a57e987c9a6844455840c036c14a9fa5-74048ef705b9aafd.elb.eu-west-1.amazonaws.com:9000/v2/segments?tableName=bb8_analyses_logs_REALTIME to controller: pinot-controller-1.pinot-controller-headless.pinot.svc.cluster.local, version: Unknown
Does it only work for OFFLINE tables?
m
Was the table offline or real-time? Newer release supports pushing to real-time also
If real-time, what's the retention of the upstream?
d
It's REALTIME (and was before, too). Retention is infinite.
I'm using the Pinot binaries for version 0.10.0
Is it fine to use 0.11.0 to upload to a 0.10.0 cluster?
m
Oh if the retention for upstream/kafka is infinite, then all you need to do is re-consume from smallest offset (table config), and you are done. No need to repush
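That would be a table config change along these lines (a sketch, assuming the realtime table config is edited and re-applied through the controller API; the file name and host are placeholders):

# Sketch: re-apply the realtime table config after setting, under "streamConfigs":
#   "stream.kafka.consumer.prop.auto.offset.reset": "smallest"
curl -X PUT \
  -H "Content-Type: application/json" \
  -d @table-config.json \
  "http://pinot-controller.pinot.svc.cluster.local:9000/tables/bb8_analyses_logs"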
d
I'm not sure if the Kafka topics were kept though, I had to recreate them
But no worries, I'm one command away from pushing the segments, I just need to be sure if I can use the 0.11.0 CLI tool with a 0.10.0 cluster - don't want to break it by running incompatible versions
m
It is not the CLI, it is the cluster that needs to be latest release.
But let me check if the 0.10 already has that
d
Oh... so I'm not able to re-upload segments in a 0.10.0? How can I recover them, then? Or is there no other way?
m
Checking
Let me find and dm
d
Problem solved! I created one OFFLINE table for each REALTIME table, and then re-uploading segments worked 🙂
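For anyone following along, this is roughly what creating those OFFLINE table configs could look like via the controller API (the config file name and host are placeholders):

# Sketch: create an OFFLINE table config matching the existing schema, then re-run the upload.
curl -X POST \
  -H "Content-Type: application/json" \
  -d @bb8_analyses_logs_offline_table_config.json \
  "http://pinot-controller.pinot.svc.cluster.local:9000/tables"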
m
Glad that it worked @Diogo Baeder. Moving forward, my recommendation is to have at least a replication of 3 for all components, and to run ZK/Kafka outside of the Pinot deployment.
d
Yeah, I should have done that right from the start; I didn't realize how failure-prone my cluster was...
🙏 1
r
@Mayank We have run into a similar issue again, but we had retained the persistent volumes. Once we reconnected the old persistent volume to ZK it came back online; however, for whatever reason, all of the data for the REALTIME table has found its way into the OFFLINE table, and we have a bad status on all tables. All REALTIME tables have missing segments, and all OFFLINE tables have extra segments that won't load because the table type (REALTIME) is wrong.
We were down a server and a controller and when we put it back online, it resolved itself.
Sorry for the trouble. All is well now.
thanks 1
👍 1