# help
j
hey guys, I’m getting an error that only shows when I am NOT live debugging. I made a change to my auth trigger lambda to update my DB during sign in. When I live debug it works perfectly: the DB is updated and a token is given to the client. However, when I deploy and test again, the DB query hangs and the lambda times out after 10s. Any ideas? Thanks so much for your help! 🙏
t
What DB are you using?
j
the issue is on my dev/staging envs, which use RDS postgres. I am not seeing the issue on uat/prod envs, which use aurora; even after I deploy (not live debugging), those envs are working well
t
When you say RDS postgres do you mean our new sst.RDS construct?
j
looks like it’s standard cdk rds.DatabaseInstance
t
and when running locally what db does it connect to?
j
same one, rds.DatabaseInstance, and I can see the query is successful and the auth flow succeeds
t
This database is public and exposed to the world?
is it possible this isn't the case when deployed in non-local mode?
j
yes, exposed to world, and I don’t have any conditional logic to change my stack if I am local
so I assumed that what runs when I local debug is identical to what runs when I deploy
I’m looking through CloudWatch to see if there’s any clues, also trying to get sentry working in case it’s easier to see anything there
I don’t know how to troubleshoot since it’s working on local
ok I found an X-Ray trace ID and I can see successful select statements, then a start transaction and a commit, then nothing until 10s later when it is killed
t
are you missing an await anywhere?
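A minimal sketch of what a missing await can look like, for anyone following along. `fakeUpdate` and both handlers are hypothetical stand-ins, not the handler from this thread. Without the await, the handler returns before the update finishes. In a deployed Lambda the execution environment can be frozen once an async handler's promise resolves, while a long-lived local process keeps running and lets the pending work complete, which is one way local and deployed behavior can differ.

```typescript
// Shared log so the order of events is visible.
const log: string[] = [];

// Hypothetical stand-in for a DB update: resolves after `ms` milliseconds.
function fakeUpdate(label: string, ms: number): Promise<void> {
  return new Promise((resolve) =>
    setTimeout(() => {
      log.push(`${label}: update finished`);
      resolve();
    }, ms)
  );
}

// Bug: the promise is not awaited, so the handler returns first.
async function handlerWithoutAwait(): Promise<string> {
  fakeUpdate("no-await", 20);
  log.push("no-await: handler returned");
  return "done";
}

// Fix: await the update so it completes before the handler returns.
async function handlerWithAwait(): Promise<string> {
  await fakeUpdate("with-await", 20);
  log.push("with-await: handler returned");
  return "done";
}
```

In the first handler the log shows "handler returned" before "update finished"; in the second the order is reversed.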
j
I don’t think so, would that behavior be different with local debug vs deployed?
s
10s sounds like a default lambda timeout. I suspect something is awry with how you're connecting to the DB and that connection is timing out
Is the request making it to your lambda (verifiable via cloudwatch?)
j
definitely, but why does it succeed in local debug and fail after deploying? and how do I troubleshoot when it works in local debug?
ya the xray trace id shows the lambda logging that it is running sql queries
s
oh, interesting. So it's actually connected to the DB and executing queries. That is odd
j
i see the sql logged, but not responses, but I am assuming they are successful since it moves onto the next query instead of throwing
s
can you share the handler logic?
j
it is curious that the last query is an update
yes I will try to share a simplified failing example
t
can you add more logs to see where it's stuck?
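One possible way to do that, as a sketch: wrap each step so the logs show when it starts, when it finishes, and whether it blew past a time budget. `traced`, `step`, and `budgetMs` are made-up names for illustration. Picking a budget below the Lambda timeout means the failure gets logged before the function is killed.

```typescript
// Runs `work`, logging start/done/failed lines for the named step.
// Rejects if the step takes longer than `budgetMs` milliseconds.
async function traced<T>(
  step: string,
  work: () => Promise<T>,
  budgetMs: number,
  log: (line: string) => void = console.log
): Promise<T> {
  log(`start ${step}`);

  // The promise executor runs synchronously, so rejectTimeout is set
  // before the timer below can fire.
  let rejectTimeout: (err: Error) => void = () => {};
  const timeout = new Promise<never>((_, reject) => {
    rejectTimeout = reject;
  });
  const timer = setTimeout(
    () => rejectTimeout(new Error(`${step} exceeded ${budgetMs}ms`)),
    budgetMs
  );

  try {
    // Whichever settles first wins: the real work or the time budget.
    const result = await Promise.race([work(), timeout]);
    log(`done ${step}`);
    return result;
  } catch (err) {
    log(`failed ${step}: ${(err as Error).message}`);
    throw err;
  } finally {
    clearTimeout(timer);
  }
}
```

Usage would look something like `await traced("update user", () => runUpdate(), 5000)`, where `runUpdate` is whatever query function the handler already has.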
j
my rds db has a vpc, with vpcSubnets set to vpc.publicSubnets and publiclyAccessible: true, but I don’t suspect it’s an issue because my normal lambdas all succeed. It’s only the auth lambda updating the db that hangs
@thdxr yep, that’s what I’m doing now
s
Yup, the VPC was my first suspicion
Your local dev env probably has some sort of VPN running that is allowing you access to the DB when running locally
but your lambda is running in a different environment when deployed
I ran into this a lot when connecting to RDS for Postgres (non-serverless, non-aurora) within Lambda. Huge PITA to get that configured properly
j
my api stack is working well though, so must be something different about the auth stack or auth trigger lambdas?
f
I can think of a couple of things worth checking, e.g. is this set?
context.callbackWaitsForEmptyEventLoop = false;
But none of what I can think of explains why some of your Lambdas work but others don’t.
If you can’t get an answer from Sentry or X-Ray, maybe try running the queries that are timing out in the authorizer inside your API Lambdas, and see if they succeed there. This would help rule out whether the problem is with the Lambda or with the queries.
Either way, I’d try to find out why the authorizer’s behavior is different.
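For context on the callbackWaitsForEmptyEventLoop setting mentioned above, here is a rough sketch of where it would go in a callback-style handler. `LambdaContextLike` and `Callback` are simplified local stand-ins for the real Lambda typings, and the handler just echoes the event back as a placeholder. With the default of true, a callback-style handler keeps the invocation open until the event loop is empty, so something like an open DB connection can hold it until the timeout. Setting it to false lets the response go out as soon as the callback is called.

```typescript
// Simplified stand-in for the Lambda context object (assumption: only the
// one field this sketch needs).
interface LambdaContextLike {
  callbackWaitsForEmptyEventLoop: boolean;
}

// Simplified stand-in for the Lambda callback signature.
type Callback = (err: Error | null, result?: unknown) => void;

// Hypothetical trigger handler, not the one from this thread.
function handler(
  event: unknown,
  context: LambdaContextLike,
  callback: Callback
): void {
  // Do not wait for open sockets or timers before sending the response.
  context.callbackWaitsForEmptyEventLoop = false;

  // ... DB work would go here ...

  // Placeholder: hand the event back unchanged.
  callback(null, event);
}
```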
j
hey all! thank you everyone for your help, I figured it out! The db instance was set up with only vpc.publicSubnets; changing it to public and private subnets fixed the problem. So I guess that means:
1. my API lambdas access the db through public subnets
2. when local debugging, my auth lambdas also access the db through public subnets
3. when deployed, my auth lambdas access the db through private subnets
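A rough sketch of what that subnet change might look like in CDK. This is a guess at the shape, not the actual stack from this thread: it assumes CDK v2 imports (`aws-cdk-lib`), a default `ec2.Vpc`, and made-up construct names (`DbStack`, `Vpc`, `Db`) and engine version.

```typescript
import { Stack, StackProps } from "aws-cdk-lib";
import * as ec2 from "aws-cdk-lib/aws-ec2";
import * as rds from "aws-cdk-lib/aws-rds";
import { Construct } from "constructs";

// Hypothetical stack for illustration only.
export class DbStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // Assumption: a default VPC with both public and private subnets.
    const vpc = new ec2.Vpc(this, "Vpc");

    new rds.DatabaseInstance(this, "Db", {
      engine: rds.DatabaseInstanceEngine.postgres({
        version: rds.PostgresEngineVersion.VER_13, // assumed version
      }),
      vpc,
      // Before: vpcSubnets: { subnets: vpc.publicSubnets }
      // After: include both public and private subnets.
      vpcSubnets: {
        subnets: [...vpc.publicSubnets, ...vpc.privateSubnets],
      },
      publiclyAccessible: true,
    });
  }
}
```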