# durable-objects
  • u

    Unsmart | Tech debt

    11/29/2022, 7:21 PM
    Colo caches? As in an entire data center will share cache vs being per server? 🤔
  • u

    Unsmart | Tech debt

    11/29/2022, 7:24 PM
    Also, afaik IAD isn't core, so it could just be a testing ground for a new cache system
  • u

    Unsmart | Tech debt

    11/29/2022, 7:26 PM
    But it's also not canary afaik, so it would be an interesting place to test a new cache thing, but 🤷
  • n

    nclevenger

    11/29/2022, 7:26 PM
    I didn't mean core to Cloudflare ... but it is core to the internet given it's the largest location for most cloud and hosting providers
  • b

    brett

    11/29/2022, 7:29 PM
    Yeah, when we refer to Cache or Cache API we mean the colo cache
  • b

    brett

    11/29/2022, 7:29 PM
    But IAD is a PoP made of multiple colos that have their own cache
  • u

    Unsmart | Tech debt

    11/29/2022, 7:29 PM
    Interesting, I thought cache was per server, but it's actually per colo?
  • b

    brett

    11/29/2022, 7:30 PM
    Yeah
  • b

    brett

    11/29/2022, 7:30 PM
    There is no per-server cache
  • u

    Unsmart | Tech debt

    11/29/2022, 7:30 PM
    TIL
  • z

    zegevlier

    11/29/2022, 7:30 PM
    Is there a blog post that explains how that works?
  • b

    brett

    11/29/2022, 7:30 PM
    Anyway, these colos are in their own failure domains, so they don't share cache with each other
  • b

    brett

    11/29/2022, 7:31 PM
    Which part?
  • n

    nclevenger

    11/29/2022, 7:32 PM
    would a DO in IAD POP bounce between the different sub-colos? Or does it have a default and only change if there is a failure?
  • z

    zegevlier

    11/29/2022, 7:32 PM
    Requests come in on different machines, so how do those machines know where to look for the cached object? Is there some central thing per colo that manages that?
  • n

    nclevenger

    11/29/2022, 7:33 PM
    https://blog.cloudflare.com/why-we-started-putting-unpopular-assets-in-memory/
  • b

    brett

    11/29/2022, 7:33 PM
    It's a typical distributed hash table, I don't know if we have a post about it. Think of (or look up) multi-node memcache (that's not what we're using but same idea)
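The "distributed hash table" idea can be sketched as a consistent-hash ring: every machine in the colo runs the same hash, so each key deterministically maps to one cache node with no central directory. This is only an illustrative sketch of the general technique brett names (multi-node memcache style routing); the node names and hash function are made up, and Cloudflare's actual implementation is not public.

```typescript
// FNV-1a: a simple, deterministic 32-bit string hash.
function fnv1a(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Consistent-hash ring: each server gets several "virtual" points so keys
// spread evenly, and only ~1/N of keys move when a node joins or leaves.
class HashRing {
  private points: { pos: number; node: string }[] = [];

  constructor(nodes: string[], vnodes = 64) {
    for (const node of nodes) {
      for (let v = 0; v < vnodes; v++) {
        this.points.push({ pos: fnv1a(`${node}#${v}`), node });
      }
    }
    this.points.sort((a, b) => a.pos - b.pos);
  }

  // Walk clockwise from the key's position to the first virtual node.
  nodeFor(key: string): string {
    const h = fnv1a(key);
    for (const p of this.points) {
      if (p.pos >= h) return p.node;
    }
    return this.points[0].node; // wrap around the ring
  }
}

const ring = new HashRing(["cache-1", "cache-2", "cache-3"]);
// Every machine computes the same owner for a given URL, so there is no
// central thing to ask -- the hash itself is the directory.
const owner = ring.nodeFor("https://example.com/asset.js");
```

This answers zegevlier's question above: no per-colo coordinator is needed, because any machine can recompute which node owns a key.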
  • b

    brett

    11/29/2022, 7:34 PM
    For the most part they won't move between colos in the same location at all. In the rare cases where some do, it'd be due to failure yeah
  • b

    brett

    11/29/2022, 7:35 PM
    Anyway, all of the above is mostly to say that you should treat cache like a cache and not a datastore 😛
  • s

    Skye

    11/29/2022, 7:37 PM
    I'll add a little more info to my approach to see if there's a specific recommendation.

    Condition 1: 99% of requests come from IAD and are reads of data that lives in D1. To make it a little faster, I'm able to cache some of this data in KV: when a request comes in, I check KV first, and on a miss I read from D1 and then put the result into KV.

    Condition 2: the other portion of requests are writes, which can come from anywhere, which is why I don't/can't use the Cache API - that wouldn't affect the cached value anywhere other than wherever the write happened. These writes I put to both D1 & KV.

    My thought process is that it may be faster for condition 1 to replace KV with durable object storage, given that almost all of the reads come from one place. This allows me to still have the writes coming from anywhere, while (in theory) having the fastest possible read.
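Skye's condition-1 read path is a classic read-through cache. A minimal sketch of that flow, with the KV and D1 bindings mocked as async Maps so it runs outside a Worker (the store interface and names here are illustrative, not the Workers API):

```typescript
// Minimal async key/value surface shared by the mocked "KV" and "D1".
interface AsyncStore {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

// Map-backed mock that also counts reads, so we can see cache hits.
function mapStore(seed?: [string, string][]): AsyncStore & { reads: number } {
  const m = new Map(seed);
  return {
    reads: 0,
    async get(k) { this.reads++; return m.get(k) ?? null; },
    async put(k, v) { m.set(k, v); },
  };
}

// Condition 1: check KV, fall back to D1, backfill KV.
// (Condition 2, writes, would put to both stores.)
async function readThrough(kv: AsyncStore, d1: AsyncStore, key: string) {
  const cached = await kv.get(key);
  if (cached !== null) return cached;
  const fresh = await d1.get(key);              // authoritative source
  if (fresh !== null) await kv.put(key, fresh); // backfill the cache
  return fresh;
}
```

The first read for a key pays the D1 round trip; every later read is served from the cache layer until it expires or is overwritten.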
  • s

    Skye

    11/29/2022, 7:38 PM
    Oh and each individual key fits within about 1kb if that helps
  • n

    nclevenger

    11/29/2022, 7:39 PM
    Why not then just make sure the DO is in IAD, write to the DO but always read through a cache (and you could do a stale while refresh pattern)
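The stale-while-refresh pattern nclevenger suggests can be sketched roughly like this: always answer from cache, and once an entry is older than a TTL, serve the stale value immediately while a background refresh fetches a new one from the origin (the DO, in this thread). The class and parameter names (`SwrCache`, `ttlMs`, `fetchFresh`) are illustrative, not the swr.do implementation linked below.

```typescript
type Entry = { value: string; storedAt: number };

class SwrCache {
  private entries = new Map<string, Entry>();

  constructor(
    private ttlMs: number,
    private fetchFresh: (key: string) => Promise<string>, // origin lookup
  ) {}

  async get(key: string, now = Date.now()): Promise<string> {
    const hit = this.entries.get(key);
    if (!hit) {
      // Cold miss: this is the only case that waits on the origin.
      const value = await this.fetchFresh(key);
      this.entries.set(key, { value, storedAt: now });
      return value;
    }
    if (now - hit.storedAt > this.ttlMs) {
      // Stale: serve the old value now, refresh in the background.
      void this.fetchFresh(key).then((value) =>
        this.entries.set(key, { value, storedAt: now }),
      );
    }
    return hit.value;
  }
}
```

The trade-off: readers never block on the DO after the first fill, at the cost of one TTL's worth of staleness.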
  • u

    Unsmart | Tech debt

    11/29/2022, 7:40 PM
    Knowing it's a per-colo cache is nice though. I thought it might be better to use KV for any high-read-throughput values and a DO for the transactional updates, but given it's per colo, I think using a DO with a cache will probably be just fine even if requests come from a wide variance of colos 🙂
  • b

    brett

    11/29/2022, 7:40 PM
    The downside of using KV there is that you'll have stale reads for a while after a write. The downside of DOs is that there's a maximum req/sec a single DO can handle, so it depends on how often you have to fall back to it.
  • s

    Skye

    11/29/2022, 7:40 PM
    I'm fine with the stale read time of ~60s of KV at the moment
  • n

    nclevenger

    11/29/2022, 7:40 PM
    https://github.com/drivly/swr.do/blob/main/worker.js
  • s

    Skye

    11/29/2022, 7:40 PM
    That would be the plan
  • u

    Unsmart | Tech debt

    11/29/2022, 7:43 PM
    I feel like putting KV in front of a DO might actually be a bad idea; just use the Cache API in front of the DO. Especially given the possibility of intermittent KV write errors, where something you write doesn't get pushed through to KV, which might mean stale values for a lot longer than you want
  • s

    Skye

    11/29/2022, 7:43 PM
    I would use either KV (current) or the DO (considering)
  • u

    Unsmart | Tech debt

    11/29/2022, 7:46 PM
    So right now you cache a D1 database using KV? I feel like just using the cache api in front of D1 would be better. Basically just completely remove KV from the equation
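The "Cache API in front of D1" shape being suggested looks roughly like the handler below. In a real Worker you'd use `caches.default` and a D1 binding; here both are mocked with an assumed minimal `match`/`put` surface so the control flow is runnable anywhere (the helper names `MockCache`, `queryD1`, and `handle` are all made up for the sketch).

```typescript
type Row = { id: string; body: string };

// Mock of the small slice of the Cache API surface this flow uses.
class MockCache {
  private store = new Map<string, string>();
  async match(url: string): Promise<Response | undefined> {
    const body = this.store.get(url);
    return body === undefined ? undefined : new Response(body);
  }
  async put(url: string, res: Response): Promise<void> {
    this.store.set(url, await res.text());
  }
}

// Mock of a D1-style lookup by id, with a query counter for visibility.
const db = new Map<string, Row>([["1", { id: "1", body: "hello" }]]);
let d1Queries = 0;
async function queryD1(id: string): Promise<Row | null> {
  d1Queries++;
  return db.get(id) ?? null;
}

const cache = new MockCache();

// Handler shape: try the colo-local cache, fall back to D1, cache the result.
async function handle(url: string, id: string): Promise<string> {
  const hit = await cache.match(url);
  if (hit) return hit.text();
  const row = await queryD1(id);
  const body = JSON.stringify(row);
  await cache.put(url, new Response(body));
  return body;
}
```

Since the colo cache is shared across machines (per the earlier discussion), repeat reads in the same colo skip D1 entirely, with no KV layer to keep consistent.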