# orm-help
a
Hey, @Amit and I have been having connection pool issues (it happens rarely).
PrismaClientKnownRequestError: Timed out fetching a new connection from the connection pool. (More info: http://pris.ly/d/connection-pool, Current connection limit: 1)
This happened multiple times to different Node processes on the same container. After closing and restarting the container, the issue didn't reoccur.
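For context, the pool size and the time a request waits for a free connection can be tuned through connection-string parameters. A minimal sketch, assuming the datasource in schema.prisma is named db and DATABASE_URL has no query string yet (the values are illustrative, not recommendations):

```typescript
import { PrismaClient } from '@prisma/client'

// connection_limit: maximum number of pooled connections per Prisma Client instance
// pool_timeout: seconds a query waits for a free connection before throwing the
//               "Timed out fetching a new connection from the connection pool" error
const prisma = new PrismaClient({
  datasources: {
    db: {
      url: `${process.env.DATABASE_URL}?connection_limit=5&pool_timeout=30`,
    },
  },
})
```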
p
hasn't worked for me even once since yesterday
very frustrating 😞
j
We have had similar experiences, also resolved by restarting the container(s). It is difficult to reproduce though, so we can't be certain how to prevent it from happening or how to debug it at the moment
r
@potatoxchip This should not happen if you are using the Data Proxy. @Amit Goldfarb Could you share more about your setup, i.e. where you have deployed your app and the Prisma version?
a
We are using Prisma 3.2.0, deployed on an AWS EC2 instance. We are running on this Docker image -> node:12.22.1-buster-slim
a
Our gut feeling, which might not be accurate, is that there's some scenario where the connection is not returned to the connection pool. Does that make sense? Is there a timeout on the connection pool such that a connection is taken back into the pool after said timeout?
j
@Amit, we have reached a similar conclusion. We have autoscaling on our Fargate tasks (Docker-like containers on AWS). When one closes after scaling down again, it seems like the connections are not returned (hence our application is very slow)
a
Thanks @Jonathan, did you find some hack to mitigate it in the meantime? I'm thinking about something like resetting the pool once in a while? It's annoying and brings me back to how we fixed memory leaks in Ruby five years ago
j
Not yet, we disabled autoscaling for now and just add more resources to one container. We plan on finding out how to reproduce / fix it but haven't had time and resources yet
It's not ideal though, and it is a pretty big hazard for us to solve. I would be interested to see whether other users have solved this somehow
It seems that Prisma disconnects on a SIGINT signal, so maybe your Node process is not being closed with this signal?
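A graceful-shutdown handler along those lines, as a sketch assuming a long-running Node service (not a Lambda) with a single shared Prisma Client:

```typescript
import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()

async function shutdown(signal: string) {
  console.log(`received ${signal}, disconnecting Prisma`)
  // Returns the pooled connections to the database before the process exits.
  await prisma.$disconnect()
  process.exit(0)
}

// Fargate/ECS sends SIGTERM on scale-down; SIGINT covers a local Ctrl+C.
process.on('SIGTERM', () => void shutdown('SIGTERM'))
process.on('SIGINT', () => void shutdown('SIGINT'))
```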
a
Is this still relevant with N-API?
j
I am not certain to be honest. Maybe @Ryan knows?
prisma.$on('beforeExit', async () => { console.log('beforeExit hook') })
We are going to add this debug statement to see if it exits cleanly after autoscaling down
r
Ideally it should. Have you tried performing a disconnect on SIGINT and checking if that changes anything?
j
We only did one for SIGTERM since that is the signal we get from the Fargate task manager (container). I wasn't certain whether a SIGTERM sent to 'node app.js' would also be recognized by Prisma in some way. But perhaps not?
a
The problem with debugging this is that at least we (@Amit Goldfarb and I) aren't sure when it happens, so for whatever we tried or will try to implement, we can only say that it might have fixed it. It would be way easier if we at least knew the situation in which a connection is not returned to the pool
For us, BTW, it also happened in the past on a λ function. This was particularly bad because it was a function from our provisioned concurrency pool. In that case, we verified with AWS support that there was indeed some intermittent network issue that might have caused it
j
One thing that helped us in some way is to run
select * from pg_stat_activity
and check how many connections our Prisma client has to our database at that moment
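The same check can be run from the application itself with a raw query, so the count can be logged on a schedule. A sketch for PostgreSQL; the query text and grouping are illustrative:

```typescript
import { PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()

async function logConnectionCounts() {
  // Counts server-side connections to the current database, grouped by state
  // (active, idle, idle in transaction, ...).
  const rows = await prisma.$queryRaw<{ state: string | null; count: bigint }[]>`
    select state, count(*) as count
    from pg_stat_activity
    where datname = current_database()
    group by state`
  console.log('pg_stat_activity:', rows)
}
```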
a
That's a good idea, are you naming your connections?
j
I wasn't aware that is possible? I only count the number of connections, and then count them again after autoscaling. I notice that it is capped and can't grow afterwards
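Connections can be "named" by setting application_name, which is a standard PostgreSQL connection parameter; whether it can simply be appended to the Prisma connection string depends on the setup, so treat this sketch as an assumption to verify:

```typescript
import { PrismaClient } from '@prisma/client'

// Assumes the datasource is named db and DATABASE_URL has no query string yet.
const prisma = new PrismaClient({
  datasources: {
    db: { url: `${process.env.DATABASE_URL}?application_name=api-worker` },
  },
})

// pg_stat_activity rows could then be attributed per service:
//   select application_name, count(*) from pg_stat_activity
//   where datname = current_database()
//   group by application_name;
```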
a
Interesting
t
Just had this happen. Really don't know how to debug it. Did you guys get any further? Had to destroy the AWS ECS task and let it restart
a
@Timo, just now saw your message. For our λ instances, we basically
exit 1
whenever we bump into something like that. This is very high level; of course it includes error handling, specific use cases, replays, etc. But the fastest way we have found to deal with it is to just destroy the entire λ container and restart it (done automatically)
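A sketch of that exit-on-pool-timeout approach (user is a placeholder model; P2024 should be the error code behind the pool-timeout message, but it is worth verifying against your Prisma version):

```typescript
import { Prisma, PrismaClient } from '@prisma/client'

const prisma = new PrismaClient()

async function runQuery() {
  try {
    return await prisma.user.findMany()
  } catch (e) {
    if (e instanceof Prisma.PrismaClientKnownRequestError && e.code === 'P2024') {
      // Connection pool timed out: exit so the platform (Lambda/ECS) replaces
      // this instance with a fresh one instead of serving slow/failed requests.
      console.error('connection pool timeout, recycling this instance', e)
      process.exit(1)
    }
    throw e
  }
}
```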
t
Interesting, @Amit. Where do you catch the error?