Anyone know how to make eventbridge retry lambda t...
# help
g
Does anyone know how to make EventBridge retry Lambda targets? Currently it only runs twice when it errors, and it seems that's because it's invoking the Lambda asynchronously, which means that according to EventBridge the Lambda succeeded even though it didn't.
j
This is the problem with eventbridge unfortunately. It considers invocation as a success. It does not care about your lambda’s output. This is why you may have to use SQS as a target and have it retry the lambda, or even a step function with a retry policy.
CORRECTION: I am wrong, please ignore the above. I googled it and found https://aws.amazon.com/blogs/compute/improved-failure-recovery-for-amazon-eventbridge/
g
Hmm seems pretty pointless to have lambda targets for eventbridge then 🤔
Looks like even with that article you still have to use DLQs; the Lambda itself won't retry
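For reference, the two retry layers being discussed can be sketched in plain CDK. This is a rough sketch, not the setup anyone in the thread is running: it assumes `aws-cdk-lib` v2, and the stack name, construct IDs, asset path and queue names are all made up for illustration. The EventBridge retry policy and DLQ on the target only cover delivery failures (for example throttling or the target being unavailable). Errors thrown by the function after an asynchronous invoke are handled by Lambda's own async retry settings, which allow at most 2 retries.

```typescript
import { Duration, Stack, StackProps } from "aws-cdk-lib";
import * as events from "aws-cdk-lib/aws-events";
import * as targets from "aws-cdk-lib/aws-events-targets";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as destinations from "aws-cdk-lib/aws-lambda-destinations";
import * as sqs from "aws-cdk-lib/aws-sqs";
import { Construct } from "constructs";

// Hypothetical stack; all names here are illustrative assumptions.
export class RetryLayersStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    const fn = new lambda.Function(this, "Handler", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("src"), // assumed asset directory
    });

    const deliveryDlq = new sqs.Queue(this, "DeliveryDlq");
    const executionDlq = new sqs.Queue(this, "ExecutionDlq");

    // Layer 1: Lambda's async invocation settings. These govern what happens
    // when the function itself throws. retryAttempts can only be 0, 1 or 2.
    fn.configureAsyncInvoke({
      retryAttempts: 2,
      maxEventAge: Duration.hours(1),
      onFailure: new destinations.SqsDestination(executionDlq),
    });

    // Layer 2: EventBridge's target retry policy. This only applies when
    // EventBridge cannot deliver the event to the target at all.
    new events.Rule(this, "Rule", {
      eventPattern: { source: ["tempest"] },
      targets: [
        new targets.LambdaFunction(fn, {
          retryAttempts: 185,
          maxEventAge: Duration.hours(24),
          deadLetterQueue: deliveryDlq,
        }),
      ],
    });
  }
}
```

With this split, a function that throws is retried by Lambda (not EventBridge) and then lands in the on-failure queue, which matches the "only runs a couple of times" behaviour described above.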
j
Yes, I guess this is just a case of designing systems well and testing them so that they don’t have errors in the first place, and then adding monitoring tools and recovery strategies, for example EventBridge archive replays.
g
Errors are bound to happen and I would rather have it auto retry... we still have monitoring for when errors happen, but if it can fix itself by retrying that's ideal.
j
This is why I prefer SQS as that will literally not give up retrying
You can probably rely on EventBridge delivering to SQS targets since it is not your code blocking it. Then from there you don’t have to worry about it dropping the message, and it will keep retrying until the message's retention period expires (or a redrive policy moves it to a DLQ)
g
Yeah I just have
eventbridge -> sqs -> lambda
now at least attempting
A bit annoying that I have to route to an sqs for retry support when eventbridge is supposed to have retry support and then if retries fail go to a DLQ but
j
You can consider SNS as well
g
Any route I go, I have to use an intermediary between EventBridge and Lambda, and queues are at least natively supported as targets in sst so that's a bit easier. EventBridge is 100% necessary for receiving the events though
j
I have a few lambda event targets where the lambda’s sole job is one specific thing, i.e. to send a task response to a step function. Anything else I am relying on step functions to retry things and handle failures.
Since these have strict event schemas, straightforward defined tasks, and sufficient unit test coverage - I can sleep easy at night knowing this is not going to fail in production. After all, most companies like Amazon have written code once and then it stays deployed for years without things breaking. Because they were designed with rigid requirements and tested against all parameters. Sure, an AWS service could go down - but then that’s when your recovery strategy comes in rather than something automated.
t
I use the EventBridge to sqs pattern a lot
It's very simple in sst
It's also nice because your lambdas can be unaware of the queue and consumers
g
Yeah it was simple to do just annoying that I can't just have eventbridge manage the retries to a lambda
t
Yeah it would be ideal to not need the queue in between, but because you can define everything inline it kind of disappears
props.ship.bus.addRules(this, {
  TempestLinkFound: {
    eventPattern: {
      source: ["tempest"],
      detailType: ["link.found"],
    },
    targets: [
      new sst.Queue(this, "processQueue", {
        consumer: "services/links/process.handler",
      }),
    ],
  },
})
you can almost forget the queue is in here
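As a rough idea of what the consumer behind that queue could look like, here is a minimal sketch of an SQS-triggered handler. It is not the actual `services/links/process.handler` from the snippet above: the local types, `processLink`, and the `url` field are assumptions for illustration. It also assumes the queue's event source mapping has `ReportBatchItemFailures` enabled. Without that setting Lambda ignores the return value and treats the whole batch as successful, so in that case you would rethrow instead of catching.

```typescript
// Local stand-ins for the SQS event shapes Lambda passes in (assumed, minimal).
interface SqsRecord {
  messageId: string;
  body: string;
}
interface SqsEvent {
  Records: SqsRecord[];
}
interface BatchResponse {
  batchItemFailures: { itemIdentifier: string }[];
}

// Hypothetical business logic: fails when the event detail has no url.
async function processLink(detail: { url?: string }): Promise<void> {
  if (!detail.url) {
    throw new Error("missing url");
  }
}

export async function handler(event: SqsEvent): Promise<BatchResponse> {
  const batchItemFailures: { itemIdentifier: string }[] = [];

  for (const record of event.Records) {
    try {
      // When EventBridge targets a queue, the message body is the
      // EventBridge event envelope serialized as JSON.
      const envelope = JSON.parse(record.body);
      await processLink(envelope.detail ?? {});
    } catch (err) {
      // Report only this message as failed so SQS redelivers it,
      // while the rest of the batch is deleted from the queue.
      batchItemFailures.push({ itemIdentifier: record.messageId });
    }
  }

  return { batchItemFailures };
}
```

Failed messages become visible again after the visibility timeout and are retried until they succeed, hit the queue's `maxReceiveCount` (if a DLQ is configured), or expire.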
j
Oh wow that is clean with SST
Saves so much code
t
Yeah it's funny because up until I wrote this yesterday, I mentally was like - wow it's really annoying to need to have a queue every time I want a retry mechanism. Thought that for months but then when I actually went to do it, it was surprisingly simple
g
That code you put is basically exactly what I have yeah got it working fine
j
I find myself repeating a lot of patterns as well now. Maybe they should become constructs. The problem I ran into was mixing non-AWS constructs (i.e. my own) with SST. SST was not working. I have a demo repo as proof
t
tell me more
j
1. So I made this construct: https://github.com/joekendal/cognito-passwordless (it is available on npm)
2. I then installed it in my SST project
3. Tried to instantiate it inside an SST stack
Expected behaviour: for it to work like any other CDK library
Actual behaviour: a bunch of errors related to TypeScript and it extending cdk instead of sst. Couldn’t use it. In the end I had to just write the construct using sst instead of cdk.
t
these should probably be peer dependencies
I have an external construct I use with sst
And all of sst's internal constructs just extend cdk.Construct
We should probably put together a sample of making your own constructs + distributing them