# help
p
Has anybody written a way for developers to easily create sample data? And to easily destroy and recreate it? How do you do it?
g
Could you explain your usage scenario in more detail? Meanwhile, you could take a look at sst.Script.
p
Sure. I'm writing an application from scratch, and to be able to write the API and the frontend I need sample data. My app is a directory of questions (https://www.conversation.guru), so as a developer, when I create a dev stage, I need the database to have a few hundred questions for the app to behave realistically.
But pretty much every time I've written an application, I've written a script that creates what I call "sample data", so that any developer can run it, get a variety of data covering lots of different conditions, and be up and running quickly.
At some point, for this particular app, I might want to take snapshots from prod and use that, maybe.
Right now prod doesn't exist.
I am not sure how Script would help me exactly.
Right now I have the code written as a Lambda function that I can call from an API, but DynamoDB doesn't offer a good way of clearing the database.
So it looks like my current process for refreshing my sample data, something I'm doing frequently right now, would be: 1. Stop SST. 2. Delete the DynamoDB table. 3. Wait... wait... wait... 4. Delete the Storage stack. 5. Wait... wait... wait... 6. Run SST. 7. Call the API to generate sample data. Pre-SST, working with a local PostgreSQL, I would just truncate it and it would be very fast.
g
Through sst.Script you could load data at deploy time. Honestly I didn't quite understand what you are trying to build, whether it's a setup for integration testing or a sandbox environment for your application. Anyway, taking a snapshot of your production data is something I wouldn't suggest in any scenario; stages, for me, must be isolated from each other. If you need to load/clean data fast in DynamoDB, you could save some data in JSON files (stored in your codebase, or in S3 if the data is heavy) and use BatchWrite to write and to delete. BatchWrite is fast; you can write/delete up to 25 items in one request.
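A minimal sketch of what I mean (assuming AWS SDK v3; the questions.json file, the TABLE_NAME env var, and the "id" key are made up):

```ts
// Minimal sketch: seed and clear a DynamoDB table with BatchWrite.
// Assumes AWS SDK v3; questions.json, TABLE_NAME, and the "id" key
// are hypothetical. Retrying UnprocessedItems is omitted for brevity.
import { readFileSync } from "fs";
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, BatchWriteCommand } from "@aws-sdk/lib-dynamodb";

const db = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const TABLE = process.env.TABLE_NAME!;

const questions: { id: string }[] = JSON.parse(
  readFileSync("./questions.json", "utf8")
);

// BatchWriteItem accepts at most 25 put/delete requests per call.
const chunks = <T>(arr: T[], size = 25): T[][] =>
  Array.from({ length: Math.ceil(arr.length / size) }, (_, i) =>
    arr.slice(i * size, (i + 1) * size)
  );

export async function seed() {
  for (const batch of chunks(questions)) {
    await db.send(new BatchWriteCommand({
      RequestItems: { [TABLE]: batch.map((Item) => ({ PutRequest: { Item } })) },
    }));
  }
}

export async function clear() {
  for (const batch of chunks(questions)) {
    await db.send(new BatchWriteCommand({
      RequestItems: {
        [TABLE]: batch.map(({ id }) => ({ DeleteRequest: { Key: { id } } })),
      },
    }));
  }
}
```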
p
I had around 300 PutItems and the 10-second timeout on Dynamo wasn't enough. That's somewhat worrying. Well... I think that was the problem; I'm not 100% sure, since I got no error message.
If the sample data is loaded at deploy time, how do you trigger a deployment of a stack that hasn't changed?
g
sst.Script might not be a solution for your specific problem. If you want to load/clean the dataset on demand, you can declare a dedicated sst.Function for it. Making 300 individual put requests is too time-expensive; you could load or delete in 12 requests using batch operations.
I use sst.Script to load a dataset used for integration testing, or to load assets to an S3 bucket.
p
Yeah, right now they are not batch operations, so that they can use the same "create" function the app itself uses, for consistency. I don't mind it taking long, or running it from my local machine. I just can't find the right building blocks here.
j
@Pablo Fernandez so is the problem the DynamoDB timeouts?
p
@Jay I didn't want to point to a particular technical problem, but the general problem of sample data generation for development.
Generally, for example, I have a function that creates a user. The tests for users use it, and the sample data generation also uses it to create sample users, so the code is shared. Sometimes it's the same function the system calls when a user is created. That's normally how I design things, but I can be flexible.
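For instance, a minimal sketch of that pattern (the createQuestion name and shape are hypothetical, not from my actual codebase):

```ts
// Hypothetical sketch of the shared-"create" pattern: the API handler,
// the tests, and the sample-data generator all call the same function.
import { randomUUID } from "crypto";

export interface Question {
  id: string;
  title: string;
  body: string;
}

// The one place question creation happens (persistence omitted here).
export async function createQuestion(
  input: Omit<Question, "id">
): Promise<Question> {
  const question: Question = { id: randomUUID(), ...input };
  // e.g. a DynamoDB PutItem would go here
  return question;
}

// The sample-data script is just a loop over fixtures.
export async function seedQuestions(fixtures: Omit<Question, "id">[]) {
  for (const fixture of fixtures) {
    await createQuestion(fixture);
  }
}
```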
The timeouts I was referring to were Lambda timeouts, not DynamoDB timeouts. And I'm just guessing, since I didn't see any timeout errors. I just noticed the Lambda function didn't finish: it ran for 10.4s, and when I changed the timeout to 20s it ran for 20.4s, and when I changed it to 30s it ran for 30.4s. It's a bit worrying that DynamoDB seems to be so slow, but one thing at a time.
j
@Frank any thoughts on this?
g
What is the average size of a single PutItem?
f
@Pablo Fernandez Yeah a couple of thoughts: 1. Are you using the `Script` construct? The `onCreate` function has a 900s timeout. I wonder if you still see the timeout. 2. If you have a lot of data to seed the DynamoDB table, you can have the `onCreate` function spawn multiple Lambda functions, each seeding a chunk. 3. If you see the X-Ray id in your Lambda log, you can take a look at the X-Ray trace and see which `aws-sdk` call is taking long.
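A rough sketch of the fan-out in 2 (assuming AWS SDK v3; SEEDER_FUNCTION_NAME and the chunk size are made up):

```ts
// Sketch: a parent function fans the dataset out to worker Lambdas,
// each invoked asynchronously with one chunk.
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});

export async function fanOutSeed(items: unknown[], chunkSize = 100) {
  for (let i = 0; i < items.length; i += chunkSize) {
    await lambda.send(new InvokeCommand({
      FunctionName: process.env.SEEDER_FUNCTION_NAME!,
      InvocationType: "Event", // async: don't wait for the worker
      Payload: Buffer.from(JSON.stringify({ items: items.slice(i, i + chunkSize) })),
    }));
  }
}
```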
p
@gio each PutItem is tiny, less than 1k.
@Frank 1) no, I'm not using Script; it's a REST API that I can call to regenerate the data. 2) The onCreate function? Can you give me a bit more context? onCreate where? 3) I'll look into X-Ray.
f
Ah I see. Here's what I'd do: 1. First look into what's causing the timeout, and fix any issues there. 2. Then don't place the Lambda behind a REST API, because the API has a maximum timeout of 30s. You can just leave it as a standalone function, ie. `new sst.Function(…)`, set the timeout to 300s, and trigger it through the SST Console. 3. If you want to automatically run this function on deploying to a new stage, you can use the `sst.Script` construct and hook this function up to `onCreate`.
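Roughly like this, a sketch assuming the @serverless-stack/resources stack API; the handler path is hypothetical:

```ts
// Stack sketch: run the seed function once when the stage is first
// deployed, via sst.Script's onCreate hook.
import * as sst from "@serverless-stack/resources";

export class SampleDataStack extends sst.Stack {
  constructor(scope: sst.App, id: string, props?: sst.StackProps) {
    super(scope, id, props);

    new sst.Script(this, "SeedScript", {
      onCreate: {
        handler: "src/seed.handler",
        timeout: 300, // seconds; well under Script's 900s ceiling
      },
    });
  }
}
```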
p
1. I'm not sure what you mean by causing the timeout. Do you mean what's causing this to take so long? That I don't know. The timeout is just the standard sst.Api timeout. 2. Ok. 3. Ah, I see.
g
1. It's not clear to me whether you're talking about latency or a timeout. If it's a timeout, you're already inside the Lambda, so what I'd like to understand is which task (the DynamoDB puts?) is taking so much time.
2. If you don't care about this operation being synchronous, you can create an sst.Function as Frank suggested and trigger it from the API in "Event" mode; that way the API returns fast while a Lambda (with a 300s timeout) processes in the background.
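For example, a sketch of that "Event" invocation from inside an API handler (assuming AWS SDK v3; SEED_FUNCTION is a made-up env var):

```ts
// API handler sketch: kick off the long-running seed function
// asynchronously and return 202 right away, so the ~30s API limit
// doesn't apply.
import { APIGatewayProxyHandlerV2 } from "aws-lambda";
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});

export const handler: APIGatewayProxyHandlerV2 = async () => {
  await lambda.send(new InvokeCommand({
    FunctionName: process.env.SEED_FUNCTION!,
    InvocationType: "Event", // fire-and-forget
  }));
  return { statusCode: 202, body: "Sample data generation started" };
};
```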
p
@gio the HTTP request ends because of the sst.Api timeout.