magic-wormhole/magic-wormhole-mailbox-server

don't send ERROR:crowded in response to CLOSE

warner opened this issue · 4 comments

In magic-wormhole/magic-wormhole#408 I think a client received ERROR(crowded) (eventually, because of #19, so only after the server was rebooted and the client reconnected, and re-sent the OPEN), and then tried to release the mailbox by sending CLOSE, but was thwarted because the server sent back another ERROR(crowded). I think the client then got stuck, unable to exit (because it thinks it still has a hold on the mailbox), not sending a new CLOSE (because it already sent one and it's just waiting for CLOSED).

The fix is probably to have handle_close() (in the clause that does self._app.open_mailbox(), because this connection didn't have one already) react to CrowdedError by doing most of the rest of the function, instead of raising Error("crowded"). We can't do anything that touches self._mailbox, since we don't have one, but we can set self._did_close and (most importantly) do self.send("closed"). That should satisfy the unsettled ghost client and allow it to finally exit.

not sending a new CLOSE (because it already sent one and it's just waiting for CLOSED)

IMO that part is a bug on the client side: it sent a message and got an error response, thus it should not expect that message to have passed.

The relevant server side code appears to be this:

        if not self._mailbox:
            try:
                self._mailbox = self._app.open_mailbox(mailbox_id, self._side,
                                                       server_rx)
            except CrowdedError:
                raise Error("crowded")

I don't understand: why are we opening a new mailbox in the first place? If there's no mailbox on close, simply ignore it as already closed?

This seems like it could be related to what Im seeing right now on the public wormhole service.

I recent was reusing the same wormhole code during some testing. After about 12+ sends using the same code I got a waiting for confirmation and then after ctrl-c since it seemed to hang I got subsequent crowded errors. These go away if I just get rid of the reused code.

The following behavior is what Im see this morning:

$ wormhole send --code 2-businessman-pheasant openwrt-ath79-generic-netgear_wndr3800ch-squashfs-factory.img                                                                                       
Sending 12.0 MB file named 'openwrt-ath79-generic-netgear_wndr3800ch-squashfs-factory.img'                                                                                                                          
Wormhole code is: 2-businessman-pheasant                                                                                                                                                                            
On the other computer, please run:                                                                                                                                                                                  
                                                                                                                                                                                                                    
wormhole receive 2-businessman-pheasant                                                                   
                                                                                                                                                                                                                    
Key established, waiting for confirmation... 

Then the following crowded error:

$ wormhole send --code 2-businessman-pheasant openwrt-ath79-generic-netgear_wndr3800ch-squashfs-factory.img
Sending 12.0 MB file named 'openwrt-ath79-generic-netgear_wndr3800ch-squashfs-factory.img'
Wormhole code is: 2-businessman-pheasant
On the other computer, please run:

wormhole receive 2-businessman-pheasant

Traceback (most recent call last):
--- <exception caught here> ---
  File "/nix/store/g9bwgzbj86sjyszyanzpxqjc0db3kbbp-python3.9-magic-wormhole-0.12.0/lib/python3.9/site-packages/wormhole/cli/cli.py", line 122, in _dispatch_command
    yield maybeDeferred(command)
  File "/nix/store/g9bwgzbj86sjyszyanzpxqjc0db3kbbp-python3.9-magic-wormhole-0.12.0/lib/python3.9/site-packages/wormhole/cli/cmd_send.py", line 93, in go
    yield d
  File "/nix/store/g9bwgzbj86sjyszyanzpxqjc0db3kbbp-python3.9-magic-wormhole-0.12.0/lib/python3.9/site-packages/wormhole/cli/cmd_send.py", line 139, in _go
    yield w.get_unverified_key()
wormhole.errors.ServerError: crowded
ERROR: crowded

Again I can just use a new code but figured it was worth reporting since its fresh. Happy to open up another issue if thats more appropriate.

I can confirm that reusing codes multiple times causes spurious "crowded" issues which could be related to this issue. I have no specific reproduction recipe, but I strongly suspect there is a race hazard involved.

@sarcasticadmin please make sure that you know the security consequences of using a code multiple times and don't use this in production without counter-measures.

@piegamesde thanks for the clarification

@sarcasticadmin please make sure that you know the security consequences of using a code multiple times and don't use this in production without counter-measures.

Yep, absolutely. It was just for testing and pure laziness so I didnt have to keep updating inputs.