# help
t
I have a use case where I want a queue of jobs and then invoke a lambda one by one with a very low (1-2) concurrency. If a job fails it should be retried ~5 times before going to a DLQ. After some research it seems a simple SQS queue is not a good fit for this (messages will be rejected while the lambda can't be invoked, so they'll reach their max retries quickly). Can anyone advise a good pattern to solve this?
a
you’ll need to use SQS with step-functions to achieve this.
t
Clarification: the jobs don't depend on each other. The concurrency is low because I have to call an external API for every job, with the same credentials, and the API has a rate limit.
a
I get that, but being able to manually move failures to a DLQ after x number of attempts requires either a state machine or an event-based approach.
if you need ordering, you'll need an SQS FIFO queue; that will take care of the ordering, but then you need to ensure you don't pick up more than a couple of jobs at a time. For that you'll need to play with the lambda processing and delay adding jobs to the queue for some time. It's going to be complicated to achieve all of this at once, so go ahead and solve it one step at a time.
d
Just use a CRON + SQS. You can poll for messages and process them at whatever pace you want; otherwise they just sit there.
For retries, just put a prop on the message that you increment on failures.
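Rough sketch of what I mean (the `attempts` attribute name and the QUEUE_URL/DLQ_URL env vars are placeholders I made up, and the receive side is shown further down):
```ts
// Sketch: on failure, requeue the message with an incremented "attempts"
// attribute, or park it in the DLQ once it has failed ~5 times.
import {
  SQSClient,
  SendMessageCommand,
  DeleteMessageCommand,
  type Message,
} from "@aws-sdk/client-sqs";

const sqs = new SQSClient({});
const MAX_ATTEMPTS = 5;

// Called when processing of a received message has failed.
export async function requeueOrDeadLetter(msg: Message): Promise<void> {
  const attempts =
    Number(msg.MessageAttributes?.attempts?.StringValue ?? "0") + 1;

  // After ~5 failed attempts, send the job to the DLQ instead of the work queue.
  const target =
    attempts >= MAX_ATTEMPTS ? process.env.DLQ_URL! : process.env.QUEUE_URL!;

  await sqs.send(
    new SendMessageCommand({
      QueueUrl: target,
      MessageBody: msg.Body ?? "",
      MessageAttributes: {
        attempts: { DataType: "Number", StringValue: String(attempts) },
      },
    })
  );

  // Delete the original delivery so SQS doesn't redeliver it on top of the copy.
  await sqs.send(
    new DeleteMessageCommand({
      QueueUrl: process.env.QUEUE_URL!,
      ReceiptHandle: msg.ReceiptHandle!,
    })
  );
}
```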
a
@Derek Kershner how do you deal with limiting the number of items processed per cron trigger? Like, suppose there are 40 items in the queue and only 2 should be processed per trigger?
d
ReceiveMessageCommandInput | SQS Client - AWS SDK for JavaScript v3 (amazon.com)
Just set MaxNumberOfMessages on that to however many you want to process per trigger.
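Something like this for the cron-triggered poller (QUEUE_URL and the schedule that invokes the handler are assumed, not shown):
```ts
// Sketch: a scheduled handler that pulls at most 2 messages per trigger.
import { SQSClient, ReceiveMessageCommand } from "@aws-sdk/client-sqs";

const sqs = new SQSClient({});

export const handler = async (): Promise<void> => {
  const { Messages = [] } = await sqs.send(
    new ReceiveMessageCommand({
      QueueUrl: process.env.QUEUE_URL!,
      MaxNumberOfMessages: 2, // only take as many as one trigger should process
      MessageAttributeNames: ["attempts"], // carry the retry counter along
      WaitTimeSeconds: 5, // short long-poll so empty runs return quickly
    })
  );

  for (const msg of Messages) {
    // process msg.Body here; on failure, hand it to requeueOrDeadLetter()
    // from the earlier snippet, otherwise delete the message
  }
};
```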
a
Nice! This will definitely do the trick. My only concern is that if the number of events blows out of proportion, this will create serious back-pressure. That's why I try my best not to cap processing; I'd instead try to cap the delivery notifications.
Retrying by incrementing a retry count on the message also seems kinda confusing, because it's not clear whether the failed attempts get retried first or the ones already in the queue. This would really be simplified a lot using event sourcing or step-functions with SQS FIFO. Or I might just be overcomplicating it, as I tend to. 😅
d
No doubt on the backpressure, but that's sort of the idea as proposed.
This one is architecture simple, code complex. Step functions would be architecture complex and code simple. Tradeoffs.
This is also quite a bit cheaper, FWIW, but neither is expensive here since you could use express step functions.
a
yep, the express ones are dirt cheap lol!
I'd actually implement this using Dynamo and cron. Primarily, I'd keep persisting all events sorted by date and time. Then I'd use cron to check a GSI for unprocessed items and pull as many as I want to process, based on some .env config or SSM param. I'd then process them either with express step-functions or in the same lambda triggered by the cron, which would handle the retries too. I think this could work great, with no worries about dropping events because the queue is full. What do you think?
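Roughly what I have in mind (the table name, index name, key attributes and the BATCH_SIZE env var are all made up for illustration):
```ts
// Sketch: cron-triggered handler that queries a GSI for the oldest
// unprocessed jobs, capped at a configurable batch size.
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, QueryCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export const handler = async (): Promise<void> => {
  const limit = Number(process.env.BATCH_SIZE ?? "2"); // could also come from an SSM param

  // Oldest unprocessed jobs first, capped at the configured batch size.
  const { Items = [] } = await ddb.send(
    new QueryCommand({
      TableName: "jobs",
      IndexName: "status-createdAt-index", // GSI: partition key "status", sort key "createdAt"
      KeyConditionExpression: "#s = :unprocessed",
      ExpressionAttributeNames: { "#s": "status" },
      ExpressionAttributeValues: { ":unprocessed": "UNPROCESSED" },
      ScanIndexForward: true, // ascending by createdAt
      Limit: limit,
    })
  );

  for (const job of Items) {
    // process the job here (or kick off an express Step Function execution),
    // then flip its status so the next cron run skips it
  }
};
```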
d
oh, I see, you are concerned about retries being at the back of the pack forever. Dynamo would give you way more ordering abilities, and no real downside other than cost, which would still be quite low. I got the sense this was merely about spreading out bursts, though, and that throughput in general was not a concern, only that it exceeded boundaries on rare occasion. I'd probably stick to SQS for simplicity myself, just stick an alarm on queue length and call it a day, but nothing stopping ya!
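The alarm part is just something like this (queue name, threshold and the SNS topic ARN are placeholders; in real life I'd define it in CDK/CloudFormation rather than a one-off script):
```ts
// Sketch: alarm when the queue backlog stays high for a while.
import { CloudWatchClient, PutMetricAlarmCommand } from "@aws-sdk/client-cloudwatch";

const cw = new CloudWatchClient({});

async function main(): Promise<void> {
  await cw.send(
    new PutMetricAlarmCommand({
      AlarmName: "jobs-queue-backlog",
      Namespace: "AWS/SQS",
      MetricName: "ApproximateNumberOfMessagesVisible",
      Dimensions: [{ Name: "QueueName", Value: "jobs-queue" }],
      Statistic: "Maximum",
      Period: 300, // evaluate 5-minute windows...
      EvaluationPeriods: 3, // ...and require 3 of them in breach
      Threshold: 100, // "backlog is getting out of hand" level, pick your own
      ComparisonOperator: "GreaterThanThreshold",
      AlarmActions: ["arn:aws:sns:us-east-1:123456789012:ops-alerts"], // placeholder topic
    })
  );
}

main().catch(console.error);
```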
a
yep, thanks for the discussion, very insightful.