nrepl/drawbridge

GET floods

I'm not sure where this could be coming from (repl-y, lein, drawbridge, nrepl, or something else), but I saw it while using drawbridge, so I'll report it here and hope for the best.

When doing a very simple compojure hello world following the drawbridge README, and then connecting via:

lein repl :connect http://localhost:8080/repl

Everything seems functional, but a packet sniff shows that the server is getting pummeled by empty GET requests to that URL, every 8ms.

The GETs seem identical, and take this form:

GET /repl HTTP/1.1
Connection: close
accept-encoding: gzip, deflate
cookie: ring-session=e868f195-cc61-4e4b-9dc0-619446e17ffa
Content-Length: 0
Host: localhost:8080
User-Agent: Apache-HttpClient/4.2.2 (java 1.5)

HTTP/1.1 200 OK
Date: Mon, 28 Jan 2013 00:53:17 GMT
Content-Type: application/json;charset=ISO-8859-1
Connection: close
Server: Jetty(7.6.1.v20120215)

[

]

There are POSTs amongst this haystack that carry actual forms and their evaluation results, so it's working. But that's a lot of spam packets, apparently just no-ops, and one every 8ms seems a bit excessive.

This is using Leiningen 2.0.0-RC2 on Java 1.7.0_03 OpenJDK 64-Bit Server VM, and drawbridge 0.0.6.

After sleeping on this, it seems pretty obvious to me that the problem is coming from reply. Will open the issue there.

Yeah, drawbridge doesn't actively poll or anything, though it also doesn't throttle reads coming from whatever is calling recv (part of the nREPL transport API). It might make sense to add such a mechanism, since this constraint is something that e.g. reply would have no knowledge of. (Although 8ms is a bit overboard, regardless of the transport being used. :-)

Reopening, if only to ponder a throttling mechanism. Maybe @trptcolin has input?

Sure, REPLy could add a sleep before or after hitting recv: https://github.com/trptcolin/reply/blob/master/src/clj/reply/eval_modes/nrepl.clj#L148-158

It already has the dependency on drawbridge, so that seems reasonable to do, though I'm not sure what the timeout should be. For drawbridge/HTTP, 1 second might not feel so bad, but for a local socket connection I feel like I'd want much less latency (100ms?).

Throttling in drawbridge would be cool. Am I reading this right, that multiple http requests will happen in quick succession until the first response comes in (or the timeout expires)? https://github.com/cemerick/drawbridge/blob/master/src/cemerick/drawbridge/client.clj#L37-42 Not sure if that actually causes bursts of requests in practice, but thought I'd mention it while I'm thinking about it.

Oooh, looking again, and I think @kenrestivo might be right. That recv fn looks like it'll busy-loop, alternately checking the incoming queue and kicking off a request...

Will look at this more closely shortly.
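For anyone following along, here is a rough sketch of the difference a throttle would make. It is Python rather than Clojure and is not drawbridge's actual code: `recv`, `poll`, and `min_interval` are hypothetical names, with `poll` standing in for "do one HTTP GET and return whatever messages came back". Without the sleep at the bottom, the loop would fire a GET as fast as the round trip allows, which is the flood seen in the packet capture.

```python
import queue
import time


def recv(incoming, poll, timeout, min_interval=0.1):
    """Return the next message, or None once `timeout` seconds have passed.

    incoming     -- queue.Queue of messages already fetched
    poll         -- hypothetical callable: performs one GET, returns a list
    min_interval -- assumed minimum pause between two GETs, in seconds
    """
    deadline = time.monotonic() + timeout
    while True:
        # Hand back anything that is already waiting.
        try:
            return incoming.get_nowait()
        except queue.Empty:
            pass

        # Nothing queued: ask the server once.
        for msg in poll():
            incoming.put(msg)
        if not incoming.empty():
            continue

        # Still nothing. Pause before the next GET instead of spinning.
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        time.sleep(min(min_interval, remaining))
```

With a 100ms interval, an idle connection costs roughly 10 GETs per second instead of one every 8ms. The right interval is a judgment call, as noted above: something close to 1 second may be fine over HTTP, while a local socket would want much less.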

Yep, that sure looks like it.

I'm surprised that drawbridge is reading from the server in a loop. I naively assumed that repl-y was driving the bus, the loop was actually in the UI somewhere, and that drawbridge was callback-based or maybe just blocked waiting for the user to hit RET to send a form to it, then sent the form to the server, and blocked waiting for the server to send the result of the evaluation. What it actually does looks quite a bit different from that, and I don't fully understand it, but I sure can see that "recur" in there all right, and the timeout too, so that looks like the source of the GET flood.

Thanks for looking at this.

Yeah, it's all quite stupid, a result of my wanting to bash out a proof of concept more than anything else.

The easy next step would be to just long-poll for new responses, the code for which will actually be far simpler than the janky business I wrote. I'm afraid I'm going to be occupied elsewhere for some time; I'll get to it sooner or later, but patches are welcome in the interim.
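To make the long-poll idea concrete, here is a minimal sketch of what the server side of one GET could look like. It is Python, not drawbridge code, and `long_poll`, `responses`, and `hold_seconds` are made-up names for illustration. The handler blocks until at least one response is available (or a hold time expires), then returns everything queued so far, so an idle client ends up with one open request instead of a stream of empty ones.

```python
import queue


def long_poll(responses, hold_seconds=30.0):
    """Handle one GET: wait for a response, then drain the queue.

    responses    -- queue.Queue filled by the evaluation side
    hold_seconds -- assumed upper bound on how long to hold the request
    """
    try:
        # Block here rather than answering immediately with an empty list.
        batch = [responses.get(timeout=hold_seconds)]
    except queue.Empty:
        # Hold time expired with nothing to send; the client re-issues the GET.
        return []

    # Something arrived: include any other responses that are already queued.
    while True:
        try:
            batch.append(responses.get_nowait())
        except queue.Empty:
            return batch
```

The hold time would need to stay below whatever idle timeout the client and any proxies in between enforce; 30 seconds is only a placeholder.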

Today I noticed the same issue (CPU hog/GET flood) while playing with drawbridge on http-kit.

http-kit supports WebSockets, so there is a way to replace the GET requests with a WebSocket connection. (I am just thinking aloud at the moment, so no patches yet.)