Cloudflare #workers-discussions

zszszsz

03/14/2023, 7:32 PM

Yeah I see this is workers not pages

mobileengagement

03/14/2023, 11:21 PM

Hi everyone. I will be eternally grateful if we can solve this issue. Rather than using one KV write per user, we add all user JSON records to KV as a JSON array (one per audience). We identify the audience from the URL making the request, parse the JSON from KV, then look up the specific record. This works great but takes time and CPU resources. Interestingly, 90% of the processing time is spent parsing the JSON (JSON.parse), not finding the record in the array. So what I want to do is speed up the parsing across worker invocations. Imagine 10 million invocations each taking a second to parse the data. It would be great if the data was already parsed. Of course we can make audiences smaller, but we would prefer not to place these limits on our customers. Any ideas? Thank you.

James

03/14/2023, 11:42 PM

If the KV value is stored as JSON, you can do

await env.KV.get('key', {type: 'json'})

which will handle the JSON parsing for you. I doubt it'll be any faster though, honestly
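A minimal sketch of the lookup with `{type: 'json'}`, for illustration only. The `KVLike` interface, the `UserRecord` shape with its `id` field, and the `findUser` helper are assumptions, not part of the original setup; only the `get(key, {type: 'json'})` call comes from the message above. The whole array is still parsed before the record is found, so this mainly tidies the code.

```typescript
// Minimal structural type (an assumption) so the sketch stands alone;
// in a Worker this would be the KV namespace binding.
interface KVLike {
  get(key: string, options: { type: "json" }): Promise<unknown>;
}

// Hypothetical record shape: the real records are not shown in the thread.
interface UserRecord {
  id: string;
  [field: string]: unknown;
}

// Fetch the audience array as parsed JSON, then pick one user's record.
// Returns undefined when the key is missing or the user is not in the array.
async function findUser(
  kv: KVLike,
  audienceKey: string,
  userId: string
): Promise<UserRecord | undefined> {
  const records = (await kv.get(audienceKey, { type: "json" })) as UserRecord[] | null;
  return records?.find((record) => record.id === userId);
}
```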

James

03/14/2023, 11:43 PM

How large is the data though? I would not expect parsing JSON to take a full second, even if you have like a megabyte of JSON

Tom Sherman

03/14/2023, 11:49 PM

> we add all user json records to kv as a JSON Array

The solution is not to do this 😅

Tom Sherman

03/14/2023, 11:50 PM

But yeah, interested in hearing how large the value is

Tom Sherman

03/14/2023, 11:51 PM

Another thing you could do is switch to a format that is better suited to being read as a stream. That should let you skip CPU cycles on the chunks of data you don't care about

mobileengagement

03/15/2023, 12:27 AM

It's not so much the size of the data (25 MB KV limit), it's the number of records. 100,000 records take 1s and 200ms CPU to process. Clients can have millions of records, which are all updated at least once a day, sometimes more. It's not just about reading the data either. We have to get all of the data into KV in the first place and have a 3 minute window to do so. Please note time doesn't matter when we can use Bundled Workers, which work for up to 10,000 records, leaving enough of the 50ms CPU budget to carry out our other tasks. Typical audiences are much bigger. We then need to use Unbundled, where time costs money. Does the suggested approach still use the Worker CPU, or does KV do the work? If KV does the work at no CPU cost, that would be awesome. Because the JSON data changes every day and millions of records change at a time, using 1 KV write per user is not financially viable. Hence the need to use batches. Assuming the suggested approach is processed by Workers, takes just as long, and consumes just as much CPU, is there another way? I am happy to pre-process the data so it is ready for use in each invocation; I am looking to make the process more efficient. Of course, using an /audience/random prefix/ in the URL can make each KV JSON array smaller (max 10k per prefix), but I am looking for a more elegant solution. Please explain how a stream would work with node.js; I haven't used streams before. How would I get the same data in and out of KV? Thanks

Tom Sherman

03/15/2023, 8:27 AM

In this case I'd look into moving away from JSON. JSON is slow here because you spend a lot of time parsing the records you don't care about. As a naive and simple approach, imagine each record was delimited by a newline and the first 8 bytes denoted the user id. You can stream the data from KV and check the first 8 bytes of each line for the relevant user. You can discard every record that isn't the correct ID, and every record after it (stop reading). From there you could look to sort the audience batch (the value you store in KV), so that you can skip over whole chunks of the streamed data instead of checking each record line by line
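A rough sketch of the newline-delimited idea above, under stated assumptions: each record is an 8-character zero-padded ASCII user ID followed by the payload and a trailing `\n`, and payloads contain no raw newlines. `RecordScanner`, `findRecord`, and `ID_WIDTH` are hypothetical names. The stream could come from `env.KV.get(key, {type: 'stream'})`.

```typescript
// Hypothetical layout: 8 ASCII bytes of zero-padded user ID, payload, then "\n".
const ID_WIDTH = 8;
const NEWLINE = 0x0a;

class RecordScanner {
  private carry = new Uint8Array(0); // bytes of an incomplete trailing line
  private readonly idBytes: Uint8Array;
  private readonly decoder = new TextDecoder();

  constructor(userId: string) {
    this.idBytes = new TextEncoder().encode(userId.padStart(ID_WIDTH, "0"));
  }

  // Feed one chunk. Returns the payload if the wanted record is complete
  // in the data seen so far, otherwise null. Only the match is decoded.
  push(chunk: Uint8Array): string | null {
    const buf = new Uint8Array(this.carry.length + chunk.length);
    buf.set(this.carry);
    buf.set(chunk, this.carry.length);
    let start = 0;
    let end = buf.indexOf(NEWLINE, start);
    while (end !== -1) {
      if (end - start >= ID_WIDTH && this.matches(buf, start)) {
        return this.decoder.decode(buf.subarray(start + ID_WIDTH, end));
      }
      start = end + 1;
      end = buf.indexOf(NEWLINE, start);
    }
    this.carry = buf.slice(start); // keep the partial line for the next chunk
    return null;
  }

  private matches(buf: Uint8Array, offset: number): boolean {
    for (let i = 0; i < ID_WIDTH; i++) {
      if (buf[offset + i] !== this.idBytes[i]) return false;
    }
    return true;
  }
}

// Read the stream chunk by chunk and stop as soon as the record is found.
async function findRecord(
  stream: ReadableStream<Uint8Array>,
  userId: string
): Promise<string | null> {
  const scanner = new RecordScanner(userId);
  const reader = stream.getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) return null;
    const hit = scanner.push(value);
    if (hit !== null) {
      await reader.cancel(); // discard everything after the match
      return hit;
    }
  }
}
```

Only the matching payload is ever decoded (and could then be passed to `JSON.parse`); every other record costs a byte comparison on its ID.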

Tom Sherman

03/15/2023, 9:09 AM

With sorted data you can do a binary search on each chunk streamed in
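A sketch of what a per-chunk binary search could look like, assuming something the thread does not specify: fixed-width records (`RECORD_WIDTH` bytes each, the first `KEY_WIDTH` bytes a zero-padded ASCII user ID), sorted ascending by ID, with each chunk already re-aligned to record boundaries. Stream chunks generally will not arrive aligned, so a real reader would need to buffer first. All names here are hypothetical.

```typescript
// Hypothetical fixed-width layout, sorted ascending by key.
const KEY_WIDTH = 8;
const RECORD_WIDTH = 32;

// Compare the key of record `index` in `chunk` with `key`.
// Negative: record sorts before key. Zero: equal. Positive: after.
function compareKey(chunk: Uint8Array, index: number, key: Uint8Array): number {
  const base = index * RECORD_WIDTH;
  for (let i = 0; i < KEY_WIDTH; i++) {
    const diff = chunk[base + i] - key[i];
    if (diff !== 0) return diff;
  }
  return 0;
}

type SearchResult =
  | { kind: "found"; record: Uint8Array }
  | { kind: "keepReading" } // key sorts after every record in this chunk
  | { kind: "absent" }; // key would fall inside this chunk but is not there

function searchChunk(chunk: Uint8Array, key: Uint8Array): SearchResult {
  const count = Math.floor(chunk.length / RECORD_WIDTH);
  // Skip the whole chunk with one comparison against its last record.
  if (count === 0 || compareKey(chunk, count - 1, key) < 0) {
    return { kind: "keepReading" };
  }
  let lo = 0;
  let hi = count - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    const cmp = compareKey(chunk, mid, key);
    if (cmp === 0) {
      const start = mid * RECORD_WIDTH;
      return { kind: "found", record: chunk.subarray(start, start + RECORD_WIDTH) };
    }
    if (cmp < 0) lo = mid + 1;
    else hi = mid - 1;
  }
  return { kind: "absent" };
}
```

On `keepReading` the caller moves to the next chunk; on `absent` it can stop reading, since sorted order means the ID cannot appear later.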

JatinGundabathula

03/15/2023, 12:02 PM

Hi guys, need some help. Is there any option in Workers to use stateful memory that can be accessed and updated externally and is shared across all incoming requests, apart from using KV or the Cache API?

Tom Sherman

03/15/2023, 12:39 PM

durable objects can do this 😄
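For a sense of the shape, a minimal sketch of a Durable Object holding one shared value. `SharedCounter`, the `"value"` key, and the local `StorageLike`/`StateLike` types are assumptions for illustration; they mirror only the `state.storage.get`/`put` calls used here. In a real Worker the class would be exported and bound in the project configuration.

```typescript
// Minimal structural types (assumptions) so the sketch type-checks on its own.
interface StorageLike {
  get<T>(key: string): Promise<T | undefined>;
  put<T>(key: string, value: T): Promise<void>;
}
interface StateLike {
  storage: StorageLike;
}

// One instance of this class serves every request routed to the same object ID,
// so its storage acts as state shared across those requests.
class SharedCounter {
  constructor(private readonly state: StateLike) {}

  async fetch(request: Request): Promise<Response> {
    let value = (await this.state.storage.get<number>("value")) ?? 0;
    if (request.method === "POST") {
      // External callers update the shared state with a POST.
      value += 1;
      await this.state.storage.put("value", value);
    }
    // Any other method just reads the current value.
    return new Response(String(value));
  }
}
```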

Soham

03/15/2023, 1:40 PM

Cloudflare seems to place a huge emphasis on Workers, but I could never quite get exactly what they do. Could someone explain?

Tom Sherman

03/15/2023, 1:52 PM

have you read this article? https://developers.cloudflare.com/workers/learning/how-workers-works/

noil

03/15/2023, 1:52 PM

Hey guys and gals, is there a way to see the incoming request count for previous months? It shows the current month in the top right corner, but I would like to compare it to previous months.

Soham

03/15/2023, 1:55 PM

ah seemed to have missed this

Soham

03/15/2023, 1:55 PM

thanks

Erisa | Support Engineer

03/15/2023, 2:04 PM

You can check the previous month under Profile > Billing > Billable usage at least for a paid Workers plan, not sure if you can on free

noil

03/15/2023, 2:04 PM

I do have a paid plan yes

Erisa | Support Engineer

03/15/2023, 2:05 PM

Then billable usage will show you nice breakdowns over time going back to the previous month

Erisa | Support Engineer

03/15/2023, 2:06 PM

and you can also check your invoices for past months and see a count on there

noil

03/15/2023, 2:06 PM

It shows me zones, and that is 1, so every day in the graph is 1

Erisa | Support Engineer

03/15/2023, 2:07 PM

can you share a screenshot?

Erisa | Support Engineer

03/15/2023, 2:07 PM

should have a graph for workers requests:

Erisa | Support Engineer

03/15/2023, 2:08 PM

and at the top you can change to last month

zelnaut

03/15/2023, 2:27 PM

Hi 👋 , looking for clarification on costs with unbound usage. I'm planning on using a worker to call an external API, then (optionally) stream an object back to the client. I wouldn't be billed for the duration of the external API call, or for streaming the response, right? Since the worker would be considered idle in both cases?

kian

03/15/2023, 2:40 PM

Duration of the API call, yes

kian

03/15/2023, 2:40 PM

Streaming the response, no

zelnaut

03/15/2023, 2:43 PM

Even if the response isn't modified? E.g. like the diagram here https://blog.cloudflare.com/workers-optimization-reduces-your-bill

kian

03/15/2023, 2:43 PM

Oh, I meant yes as in you will be billed for it