# durable-objects
  • d

    Deebster

    05/30/2021, 3:14 AM
    These are great questions. I haven't seen any unload event/destructor support and I think the idea is that you use the storage for anything you want to keep while workers aren't connected. I'm finding the old DO is terminated when I update the code and the websocket drops with closeEvent.code = 1006. In my case I reconnect with a clientId+auth (sent to the client in the socket setup) that the DO uses to ensure there's only one active socket per client - currently I don't store this session stuff in the storage and so the reconnection "login" fails when I update an active DO.
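    A minimal sketch of that pattern, assuming an in-memory sessions Map and a clientId query parameter (both illustrative names, not Deebster's actual code), with the auth check left as a stub:

    ```js
    // Sketch only: keep at most one active socket per clientId inside the DO.
    // Persisting the clientId/auth pair with state.storage would let the
    // reconnection "login" survive a DO restart.
    export class Room {
      constructor(state, env) {
        this.state = state;
        this.sessions = new Map(); // clientId -> WebSocket, in-memory only
      }

      async fetch(request) {
        const url = new URL(request.url);
        const clientId = url.searchParams.get("clientId");
        // TODO: verify clientId + auth token sent during socket setup.

        const pair = new WebSocketPair();
        const [client, server] = Object.values(pair);
        server.accept();

        // If this client already has a socket, drop it so only one stays active.
        const existing = this.sessions.get(clientId);
        if (existing) existing.close(1000, "replaced by a newer connection");
        this.sessions.set(clientId, server);

        server.addEventListener("close", () => {
          if (this.sessions.get(clientId) === server) this.sessions.delete(clientId);
        });

        return new Response(null, { status: 101, webSocket: client });
      }
    }
    ```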
  • v

    Vanessa🦩

    05/30/2021, 10:03 PM
    client: pz9e54qse4c
    12:18:27.918 <<< INFO Error: This script has been upgraded. Please send a new request to connect to the new version.
        at Worker.broadcast (./index.mjs:254:36)
        at Worker.TICK (./index.mjs:229:14)
        at ./index.mjs:346:28
        at ./index.mjs:26:28
    12:18:31.438 <<< RECV [22793837,288,"<meta ...>"]
    12:18:33.850 <<< RECV [22796245,289,"<msg #1>","ouabam4xhhp",1622315913842,0]
    12:18:35.319 <<< RECV [22797711,290,"<msg #2>","ouabam4xhhp",1622315915308,7]
    12:18:35.898 <<< RECV [22798288,291,"<msg #3>","ouabam4xhhp",1622315915880,12]
    12:18:36.662 websocket closed with code: 1006
    
    client: ouabam4xhhp
    12:18:31.410 websocket connected
    12:18:31.410 >>> JOIN {"client":"ouabam4xhhp"}
    12:18:31.422 <<< SYNC @22793637#287
    12:18:31.438 <<< RECV [22793837,288,"<meta ...>"]
    12:18:33.842 >>> SEND [0,0,"<msg #1>","ouabam4xhhp",1622315913842,0]
    12:18:33.848 <<< RECV [22796245,289,"<msg #1>","ouabam4xhhp",1622315913842,0]
    12:18:35.308 >>> SEND [0,0,"<msg #2>","ouabam4xhhp",1622315915308,7]
    12:18:35.320 <<< RECV [22797711,290,"<msg #2>","ouabam4xhhp",1622315915308,7]
    12:18:35.880 >>> SEND [0,0,"<msg #3>","ouabam4xhhp",1622315915880,12]
    12:18:35.894 <<< RECV [22798288,291,"<msg #3>","ouabam4xhhp",1622315915880,12]
    12:18:36.661 websocket closed with code: 1006
  • v

    Vanessa🦩

    05/30/2021, 10:03 PM
    This is a log of 2 websockets connected to the same DO. Timestamps are client-side, but both clients run on the same machine, so the timestamps are reliable.
    12:18:27.918: the DO tries a socket.send() to the first client pz9e54qse4c and gets the error "This script has been upgraded. Please send a new request to connect to the new version."
    12:18:31.410: the DO accepts a new websocket from client ouabam4xhhp and receives a JOIN message.
    12:18:31.422: the DO successfully sends SYNC back to ouabam4xhhp.
    12:18:31.438: the DO successfully sends [22793837,288,"<meta ...>"] to both clients; both RECV it.
    12:18:33.842: client ouabam4xhhp sends [0,0,"<msg #1>", ...].
    12:18:33.848: the DO sends [22796245,289,"<msg #1>", ...], received by ouabam4xhhp.
    12:18:33.850: the DO sends [22796245,289,"<msg #1>", ...], received by pz9e54qse4c.
    ... this repeats twice with "<msg #2>" and "<msg #3>".
    12:18:36.661: the websocket for ouabam4xhhp is closed.
    12:18:36.662: the websocket for pz9e54qse4c is closed.
    Also, it's not visible in these logs, but I confirmed that the DO successfully stored all the messages, even after the "This script has been upgraded" error. The websockets were only closed about 8 seconds after that error. What would be the correct way to handle that error? If I close all websockets, can I be sure new connections will go to the new DO instance? Is there something like location.reload to force a fresh start of this DO?
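    One way to react to that error, consistent with the handling Vanessa says she added later in the thread, would be to close every websocket as soon as a send() fails with it, so clients reconnect and get routed to the new instance. The sketch below is only illustrative: sessions is an assumed Map of clientId to WebSocket, and matching on the error text is a guess rather than a documented API.

    ```js
    // Illustrative sketch, not the actual code from the thread: broadcast to every
    // socket; if a send() fails with the "script has been upgraded" error, close all
    // sockets so clients reconnect and get routed to the new DO instance.
    // `sessions` is an assumed Map of clientId -> WebSocket.
    function broadcast(sessions, message) {
      for (const [clientId, socket] of sessions) {
        try {
          socket.send(message);
        } catch (err) {
          if (String(err).includes("This script has been upgraded")) {
            for (const s of sessions.values()) {
              try { s.close(1001, "script upgraded, please reconnect"); } catch {}
            }
            sessions.clear();
            return;
          }
          // Any other send failure: drop just this client.
          sessions.delete(clientId);
        }
      }
    }
    ```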
  • v

    Vanessa🦩

    05/30/2021, 10:06 PM
    Is there documentation for all worker errors somewhere? I did not find anything about this particular error anywhere.
  • m

    MrHalzy

    05/31/2021, 1:53 PM
    Hello! Is there a way to interact with DO storage other than from within a DO?
  • m

    MrHalzy

    05/31/2021, 4:09 PM
    Hello, I have an issue where my script that calls into a Durable Object is returning a 504 Gateway Time-out. Then, on the Workers graphs in the dashboard, it seems to be using 3.75 GB-s.
  • g

    Greg-McKeon

    05/31/2021, 6:55 PM
    Yes,
  • g

    Greg-McKeon

    05/31/2021, 7:01 PM
    this sounds like a bug - I'm talking with the team about the reliability of setTimeout in general, I'll let you know when I have more. What are you using the setTimeouts to do?
  • g

    Greg-McKeon

    05/31/2021, 7:03 PM
    There isn't a way to do something before unload today, since we couldn't make it reliable. Yes, in the case of a code upgrade we forcibly disconnect old websockets and reconnect them to the new DO. This is not atomic, though, so a new client could connect to the new DO while an old client was still finishing up its work on the old DO.
  • v

    Vanessa🦩

    05/31/2021, 7:04 PM
    It's sending a clock signal at regular intervals, but the signal needs to be restarted whenever another message arrives, so I can't just use setInterval
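    A rough sketch of that pattern (assuming setTimeout behaves reliably, which is the open question here): the tick reschedules itself, but any incoming message restarts the countdown, which is why a plain setInterval doesn't fit. TICK_MS and both handlers are placeholders.

    ```js
    // Resettable clock sketch: the tick reschedules itself, and every message restarts it.
    const TICK_MS = 1000;                      // placeholder interval
    let clockTimer = null;

    function sendClockSignal() { /* send the clock frame to all sockets */ }
    function handleMessage(msg) { /* normal message handling */ }

    function scheduleTick() {
      if (clockTimer !== null) clearTimeout(clockTimer);
      clockTimer = setTimeout(() => {
        sendClockSignal();
        scheduleTick();                        // keep ticking until the next reset
      }, TICK_MS);
    }

    // Every incoming message restarts the clock from zero.
    function onMessage(msg) {
      handleMessage(msg);
      scheduleTick();
    }
    ```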
  • g

    Greg-McKeon

    05/31/2021, 7:06 PM
    Not today - what are you looking to do?
  • g

    Greg-McKeon

    05/31/2021, 7:07 PM
    We're looking at a more comprehensive way to wake a Durable Object on a schedule, which would definitely help here. Not something we're currently building, unfortunately.
  • v

    Vanessa🦩

    05/31/2021, 7:08 PM
    my clock signal could be sending at 60 Hz for some apps 😉 Anything from 60/s to 1 per 30s (can't go longer than 30 secs to detect broken connections)
  • m

    MrHalzy

    05/31/2021, 7:09 PM
    I'm looking to clean up storage that was added during testing.
  • g

    Greg-McKeon

    05/31/2021, 7:09 PM
    The best way to do this is to call deleteAll from within the DO. We're adding a way to list your current DOs, so you could iterate the list and call deleteAll across them.
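    A minimal sketch of that suggestion; the /__purge route and the Cleanup class name are made up for illustration, while state.storage.deleteAll() is the documented call:

    ```js
    // Sketch: let a request to the DO wipe its own storage. The route name is illustrative.
    export class Cleanup {
      constructor(state, env) {
        this.state = state;
      }

      async fetch(request) {
        const url = new URL(request.url);
        if (url.pathname === "/__purge") {
          await this.state.storage.deleteAll(); // removes every key in this DO's storage
          return new Response("storage cleared");
        }
        return new Response("not found", { status: 404 });
      }
    }
    ```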
  • v

    Vanessa🦩

    05/31/2021, 7:14 PM
    But storage ops are atomic, yes? Meaning it is impossible that a storage op succeeds in the new DO instance and, after that, a storage op from the old instance still succeeds (in that ~10-second window where both overlap)?
  • b

    brett

    06/01/2021, 4:04 PM
    that's right
  • v

    Vanessa🦩

    06/01/2021, 4:07 PM
    Can you comment on how to properly handle that error?
  • b

    brett

    06/01/2021, 4:12 PM
    I'm a little confused at what's going on there -- is that saying that pz9e54qse4c received an error but was still able to use the websocket?
  • v

    Vanessa🦩

    06/01/2021, 4:14 PM
    Correct.
  • v

    Vanessa🦩

    06/01/2021, 4:16 PM
    The second client even connected after that error, and both received messages, and storage still worked.
  • b

    brett

    06/01/2021, 4:28 PM
    Thanks, asking my teammate who worked on this, sec
  • e

    eidam | SuperSaaS

    06/01/2021, 4:29 PM
    Just a small heads up: it seems like the "network connection lost" errors are pretty much gone for our use-case ❤️ Thank you! 👏
  • b

    brett

    06/01/2021, 4:30 PM
    sweet
  • m

    matt

    06/01/2021, 4:49 PM
    * Can you share more about your client, in particular the websocket library being used? I'm wondering if there's some sort of auto-reconnect behavior happening, where the same client is reconnecting after receiving a hard disconnect.
    * Can you share more about the logging setup here? Are these logs coming entirely from the client, and if so, how is the "Error: This script has been upgraded" message being transmitted to the client?
    * We currently aren't disconnecting drained DOs from storage, but we will start doing so after next week's release.
    * Can you reliably reproduce this problem on code updates?
  • v

    Vanessa🦩

    06/01/2021, 5:00 PM
    * Vanilla JS, no automatic reconnect. This is the same code running in two tabs of the same browser.
    * Logging is client-side; server-side errors come in via websocket (I am working on proper cloud logging, but seeing the DO's console.log in wrangler tail would be awesome).
    * Disconnecting storage sounds like a good idea.
    * Haven't tried to reproduce yet, and I have now added handling for that error by closing all websockets and disconnecting storage.
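    For illustration, the relay Vanessa describes might look roughly like the sketch below: the DO catches a server-side error and pushes its stack down every websocket it still has, which is how lines like "<<< INFO Error: This script has been upgraded ..." end up in the client log above. reportError() and the INFO prefix are assumptions, not her actual code.

    ```js
    // Illustrative sketch: forward server-side errors to connected clients so they
    // show up in the client-side log. `sessions` is an assumed Map of sockets.
    function reportError(sessions, err) {
      const line = "INFO " + (err.stack || String(err));
      for (const socket of sessions.values()) {
        try {
          socket.send(line);
        } catch {
          // This socket can't take the report; skip it.
        }
      }
    }

    // Usage: wrap DO work so failures still reach the client logs.
    async function runReporting(sessions, work) {
      try {
        await work();
      } catch (err) {
        reportError(sessions, err);
      }
    }
    ```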
  • m

    matt

    06/01/2021, 5:26 PM
    Thanks! If you have a minimal code sample that would reproduce this, it'd be very helpful.
    * Are you writing errors received from websockets somewhere so you can transmit them later? Our websocket disconnection mechanism should prevent you from sending the "Error: This script has been upgraded. Please send a new request to connect to the new version." message down the websocket that was disconnected, as that send() should fail with the same error. Once any websocket method returns that error, any other methods should continue to fail with the same error.
    * Improving the DO dev experience is definitely on our mind -- it's a complex problem, though, as we foresee larger apps potentially having dozens of DO classes with hundreds of instances, and it's not clear what folks would expect from wrangler tail on a large project like that.
  • v

    Vanessa🦩

    06/01/2021, 5:46 PM
    No, errors are not stored (yet). The send of the error.stack did not fail, and subsequent sends on the same websocket connection also succeeded, as did storage puts in response to messages received from the client that connected after the error.
  • v

    Vanessa🦩

    06/01/2021, 6:02 PM
    Re wrangler tail on DOs: being able to see the logs of a single DO given its id (or, even more conveniently, the binding and name passed into binding.idFromName) would be great for a start. Down the road, a list of names to stream logs from simultaneously would be great, but running multiple tails in parallel would be good enough. Also, a command to list all DOs would be nice. Or even a little "watch logs" button in the dashboard next to the list of all DOs with their storage inspectors ... a girl can dream, right? 😉
  • e

    Erwin

    06/02/2021, 2:48 AM
    You and me both are dreaming about this 😄 I am about to finish experimental support for DOs in the Honeycomb integration, which works for me, but someone else is having issues with it. Maybe it could help you debug for now? https://github.com/cloudflare/workers-honeycomb-logger/