# durable-objects
  • brett (07/26/2021, 4:50 PM)
    Going back to this though, are you hitting some kind of timeout in your own code? I'm not sure I follow why hitting more instances would have an effect other than possibly making it slower. Are you saying at 16 you received an error, or?
  • john.spurlock (07/26/2021, 5:01 PM)
    It's all there in the scrollback from over the weekend, but I'll try to tl;dr. There were two behavioral changes noticed after last week's platform update with the big storage change:
    1) Certain storage `list` calls would hang, never return, and deadlock all other callers indefinitely. Kenton said he may have found the reason (a call that would return more than 16 MB), and a fix for that will be included in the next platform update.
    2) Data object instances that had been working fine previously started getting way slower: storage access times up by about 10x with the same data. Each object does an initial load from storage on first access, and I noticed that every access was now the first access, even on immediate subsequent requests. In other words, the object was getting reset after every call. This makes my whole project unusable, since the initial load is meant to be the cold-start slow path, not something expected on every call. No errors were thrown, even with the clever "create a hanging request and see what it throws" trap that Kenton suggested. In fact, instead of trapping, I was able to have multiple requests out to two versions of the same DO instance, which should never happen.
  • john.spurlock (07/26/2021, 5:06 PM)
    So I started a minimal repro project, but it doesn't repro with only one DO instance. Then I noticed that in my real worker, it only starts occurring at the full count of 16 instances, because like 8 or 9 of the instances are shoved into the same process! It sounds like what you're saying is that this is by design? If so, it makes DO kind of an unreliable building block, as you really can't count on any of the RAM being there, and the perf will vary wildly based on how the instances are mapped to processes.
  • brett (07/26/2021, 5:38 PM)
    Yeah, 1) should be fixed after the release in a couple of days. 2) is odd: were you able to verify the object was reset by setting some simple in-memory state and observing that it vanished after every request? Or were you purely going by the storage latency?
  • brett (07/26/2021, 5:40 PM)
    I wouldn't say it's by design so much as it's still a TODO we intend to fix
  • john.spurlock (07/26/2021, 5:45 PM)
    Yeah, I can tell when the object is reset by keeping state in the object (https://github.com/johnspurlock/workers-do-memory-issue/blob/master/memory_do.ts#L10) and also by tracking static DO state (https://github.com/johnspurlock/workers-do-memory-issue/blob/master/memory_do.ts#L150). It is both going through the slow path, and the `list` calls (there are now multiple calls, to work around issue 1) are slower to complete when there are too many instantiations inside one process.
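A minimal sketch of the two reset signals being described, per-instance state versus isolate-wide static state; names are illustrative, loosely modeled on the linked repro:

```ts
// Module-level state survives as long as the isolate does, so it is shared
// by every object instance that lands in the same isolate.
let isolateRequests = 0;

export class MemoryDO {
  // Instance-level state vanishes whenever this object is reset.
  instanceRequests = 0;

  async fetch(request: Request): Promise<Response> {
    const snapshot = {
      instanceRequests: this.instanceRequests++, // always 0? object is being reset
      isolateRequests: isolateRequests++,        // always 0? whole isolate is being reset
    };
    return new Response(JSON.stringify(snapshot));
  }
}
```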
  • john.spurlock (07/26/2021, 5:49 PM)
    That's awesome to hear - but I'm not sure what to do in the meantime; my prototype was working really well before last week. Could you add something to the stub call similar to jurisdiction, where you can specify that this instance should not be shared with others? Or maybe only overmap like 2 or 3 max? Or could you bump up the memory ceiling in proportion to how many instances are created inside a single process?
  • zifera (07/27/2021, 11:02 AM)
    Hmm, suddenly my DO stopped working; haven't changed anything for a while
  • zifera (07/27/2021, 11:02 AM)
    Anyone else having issues?
  • zifera (07/27/2021, 11:02 AM)
    Worker threw exception error 1101
  • zifera (07/27/2021, 12:33 PM)
    Seems to work again an hour later, strange!
  • john.spurlock (07/27/2021, 2:27 PM)
    When multiple DO instances are shoved into a single process, is the Bundled CPU limit also shared, or tracked precisely per incoming request? I'm seeing requests that were not hitting cpu limits before now hitting cpu limits when too many instances are stuffed into one process.
  • vans163 (07/27/2021, 2:55 PM)
    is there a way to load+run a .wasm blob inside a durable object?
  • brett (07/27/2021, 2:56 PM)
    CPU should be accurately accounted for down to the individual Object
  • brett (07/27/2021, 2:57 PM)
    Sorry, I got busy yesterday, I think it'd be interesting to see what happened if you leaned more on the new storage cache, rather than pulling things into object memory. But I do hope to fix the balancing of Objects around a colo soon
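A sketch of what "leaning on the storage cache" might look like: instead of holding a copy of the data in object memory, read through `state.storage` on every request and let the runtime's cache make repeat reads of hot keys cheap. This is an assumption about the intended usage, not a confirmed pattern:

```ts
// Read-through variant: no in-memory copy of the data set, so nothing is
// lost when the object is reset; repeat reads of the same key should be
// served from the runtime's storage cache rather than disk.
export class CachedReadDO {
  constructor(private state: DurableObjectState) {}

  async fetch(request: Request): Promise<Response> {
    const key = new URL(request.url).pathname.slice(1);
    const value = await this.state.storage.get(key);
    return new Response(JSON.stringify(value ?? null));
  }
}
```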
  • brett (07/27/2021, 2:57 PM)
    I can't think of any platform reasons you couldn't, is there a specific issue?
  • vans163 (07/27/2021, 2:57 PM)
    I never tried, I am just asking: are you aware of any examples of how to set up the project? Following the rollup template for DO projects
  • brett (07/27/2021, 3:00 PM)
    I just know of this https://github.com/cloudflare/rustwasm-worker-template
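As far as I know, Workers doesn't allow compiling wasm from raw bytes at runtime, so the usual route is to bundle the module with the script and instantiate the imported `WebAssembly.Module` inside the object. A rough sketch; `./add.wasm` and its `add` export are hypothetical:

```ts
// The bundler/upload step turns this import into a WebAssembly.Module.
import wasmModule from "./add.wasm";

export class WasmDO {
  instance?: WebAssembly.Instance;

  async fetch(request: Request): Promise<Response> {
    if (!this.instance) {
      // Instantiating a Module (rather than raw bytes) is permitted at runtime.
      this.instance = await WebAssembly.instantiate(wasmModule);
    }
    const add = this.instance.exports.add as (a: number, b: number) => number;
    return new Response(String(add(2, 3)));
  }
}
```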
  • vans163 (07/27/2021, 3:01 PM)
    awesome, thank you
  • john.spurlock (07/27/2021, 3:12 PM)
    Storage API calls are $$ though, so app-specific in-memory caching still seems to make more sense for mostly-read scenarios
  • john.spurlock (07/27/2021, 3:17 PM)
    Even a more sane limit would get me unstuck. Sometimes a parallel request for 20 objects that were not already active puts all 20 objects in the same process; a limit of 4 or 5 would be workable for now. It's true that not all scenarios are the same: for small objects that do light storage access, you could get away with 20 in a process. This is why it would be great to be able to specify the desired "sharing" level, either at the class-definition level (in the REST API?) or when making the stub call.
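For reference, the jurisdiction precedent being pointed at looks like this; the `sharing` option in the comment is purely hypothetical, sketching the kind of knob being requested, and does not exist:

```ts
// jurisdiction is a real option that restricts where the object's id can live.
function getEuStub(ns: DurableObjectNamespace): DurableObjectStub {
  const id = ns.newUniqueId({ jurisdiction: "eu" }); // real API
  // const id = ns.newUniqueId({ sharing: "exclusive" }); // hypothetical analog
  return ns.get(id);
}
```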
  • kenton (07/27/2021, 3:18 PM)
    hmm, but isn't reading the entire contents of storage into memory every time the object starts also going to be expensive? Does the object always end up using the whole data set?
  • john.spurlock (07/27/2021, 3:20 PM)
    In my case, yes. Since there is basically only one index on each object's storage, I bet many scenarios will require lots of reading/preloading.
  • john.spurlock (07/27/2021, 3:22 PM)
    Or lots of additional storage calls $$ for maintaining alternate indices : )
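A sketch of the alternate-index approach being priced out here: each record write pays for an extra `put` to keep a secondary lookup key in sync. The key layout is illustrative:

```ts
// Primary record under "rec:<id>", secondary index under "by-email:<email>".
// Every save costs one extra billed put for the index entry.
async function saveUser(
  storage: DurableObjectStorage,
  user: { id: string; email: string }
): Promise<void> {
  await storage.put(`rec:${user.id}`, user);
  await storage.put(`by-email:${user.email}`, user.id);
}

async function findByEmail(
  storage: DurableObjectStorage,
  email: string
): Promise<unknown> {
  const id = await storage.get<string>(`by-email:${email}`);
  return id === undefined ? undefined : storage.get(`rec:${id}`);
}
```

The two writes could also be combined into a single batched `storage.put({...})` call with multiple keys.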
  • kenton (07/27/2021, 3:24 PM)
    I guess it kind of sucks that pricing ends up warping the implementation like this
  • ckoeninger (07/27/2021, 3:31 PM)
    https://github.com/cloudflare/durable-objects-typescript-rollup-esm
  • kenton (07/27/2021, 3:33 PM)
    It's true we need to improve the spread of objects across isolates. But at the same time, any in-memory cache implementation that caches a large data set really needs to track memory usage and prune the cache to stay within limits -- and it needs to account for the possibility of multiple objects in the same isolate, because eventually if you have enough objects, even if we evenly distribute them across the colo, some will land in the same isolate. Tracking memory pressure is admittedly kind of hard to do in JavaScript, but not impossible -- you could count memory usage in a global variable, so that it tracks across all objects in the isolate. Or if you rely on the built-in cache, it'll be taken care of for you. But yeah, that might incur additional billing... (I don't yet know if cache hits will count for billing, but it's possible.)
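A sketch of the global-accounting idea, assuming a crude per-entry size estimate and an arbitrary budget; this is one way to do it, not a prescribed implementation:

```ts
// Module-level accounting spans every object in the isolate, which is the
// point: the budget is per-isolate, not per-object. Numbers are illustrative.
const MEMORY_BUDGET = 64 * 1024 * 1024; // bytes
let bytesUsed = 0;
const cache = new Map<string, { value: string; size: number }>(); // insertion order ~ age

function cachePut(key: string, value: string): void {
  const size = value.length * 2; // crude UTF-16 estimate of string footprint
  const prior = cache.get(key);
  if (prior) {
    bytesUsed -= prior.size;
    cache.delete(key); // re-inserting moves the key to "newest"
  }
  cache.set(key, { value, size });
  bytesUsed += size;
  // Prune oldest entries until back under budget.
  for (const [k, v] of cache) {
    if (bytesUsed <= MEMORY_BUDGET) break;
    cache.delete(k);
    bytesUsed -= v.size;
  }
}
```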
  • john.spurlock (07/27/2021, 3:43 PM)
    Isn't the unpredictable sharing of objects within a single isolate a larger problem than just storage caching scenarios tho? You have the same problem without storage in the picture. Any data structures allocated by the app (ws-connection-level stuff, encryption artifacts, things coming in from input or external fetch) are going to use memory, and it won't be good if your scenario is architected to work in dev on Monday, runs fine in production on Tuesday, but then fails on Wednesday because more object instances are now put into the same isolate. Say what you want about lambda, I've used it from the very beginning, and one nice property is that it's predictable about what you get in the container you specify, and you can architect within the advertised limits.
  • kenton (07/27/2021, 3:55 PM)
    So, an important design goal of durable objects is that it should scale to extremely large numbers of fine-grained objects. We strongly encourage developers to design for finer granularity, and we intentionally don't bill on the number of objects because we'd rather see people building apps around more, smaller objects. If we said that each object gets its own isolate, that would unfortunately blow up this goal, since even though isolates are much cheaper than containers, they are still much more expensive than what we'd like to see for fine-grained objects. What we really want to be able to do is adaptively adjust the isolate count to match what the app needs, so apps don't need to worry about it. We're still working on that, though, and it's a lot harder for large / coarse-grained objects compared to fine-grained.
  • vans163 (07/27/2021, 3:55 PM)
    is there a way to get chunks of a `content-encoding: chunked` response inside a DO?