http4s/http4s-servlet

BlockingServletIo assumes the request's InputStream is unmolested...


I'm not very familiar with the HTTP servlet standard, so perhaps http4s is technically in the right to assume the ServletInputStream has not yet been read from. However, it would certainly make the library more user-friendly (and would have saved me hours) if BlockingServletIo raised something like an IllegalStateException, with a pertinent error message, when the input stream's isFinished method returns true right out of the gate.

In my case, I had a javax.servlet.Filter registered that was calling getParameterMap on the HttpServletRequest (to do some primitive malicious-request rejection) before passing the request object off to the http4s stack... It looks like this was invoking Jetty's implementation of getParameterMap, which reads "POST" parameters (so to speak) from the request body.

The result is that I would end up with an empty Map when doing something like req.decode[UrlForm] { f => val params = f.values; /* ... */ } and no sign of what had gone wrong.
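The effect described above can be reproduced without a servlet container. The following is a minimal sketch, assuming a plain java.io.InputStream stands in for the request body and a hypothetical parseForm helper stands in for both the filter's getParameterMap call and the downstream form decoder (it does no URL decoding):

```scala
import java.io.{ByteArrayInputStream, InputStream}
import scala.io.Source

object DrainedBodyDemo {
  // Hypothetical helper: reads the whole stream and splits "k=v&k2=v2" pairs.
  // It stands in for anything that parses form parameters from the body.
  def parseForm(in: InputStream): Map[String, String] = {
    val body = Source.fromInputStream(in, "UTF-8").mkString
    body.split("&").filter(_.nonEmpty).flatMap { pair =>
      pair.split("=", 2) match {
        case Array(k, v) => Some(k -> v)
        case _           => None
      }
    }.toMap
  }

  // Returns what the first reader (the filter) and the second reader
  // (the downstream decoder) each see when sharing one stream.
  def run(): (Map[String, String], Map[String, String]) = {
    val body = new ByteArrayInputStream("a=1&b=2".getBytes("UTF-8"))
    val seenByFilter = parseForm(body)     // drains the stream
    val seenDownstream = parseForm(body)   // nothing left to read
    (seenByFilter, seenDownstream)
  }
}
```

The second reader gets an empty Map and no error, which matches the silent failure reported here.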

That behavior of consuming the stream is unfortunately correct per the servlet spec.

One thing I'm not sure of is whether isFinished returns true on an empty stream before an attempt to read. If not, we could probably do something like you suggest (at the cost of a boolean check per request for everybody). If empty bodies start out with isFinished returning true, then I don't think there is anything we can do here.

Good question. I'll try to get you an answer -- what does isFinished return for an empty stream (empty-body request) before any attempt to read from the stream has been made?

Sorry I didn't follow up on this sooner. It appears that isFinished returns false even on an empty-bodied HTTP request, so long as read hasn't been called on the InputStream.

I set up an experiment like this in a servlet Filter:

println("isFinished: " + request.getInputStream.isFinished)
request.getInputStream.read()
println("isFinished: " + request.getInputStream.isFinished)

If I then issue a HEAD request using curl (e.g., curl --head http://localhost:8080/) the first call to isFinished returns false and the second true.

Okay, that's encouraging. In Jetty, there's a synchronization, but it should be pretty cheap in the grand scheme of things. I'd be fine with adding this check. Would you like to give it a try?
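For reference, the check under discussion might look roughly like the sketch below. This is not the actual BlockingServletIo code. FinishedAware is a hypothetical stand-in for the isFinished method of ServletInputStream, so the example runs without the servlet API. TrackingStream is a hypothetical stream that mimics the behavior observed in the experiment: isFinished stays false until a read hits end-of-stream. The name requireUnread and the error message are likewise illustrative.

```scala
import java.io.{ByteArrayInputStream, InputStream}

// Hypothetical stand-in for the one ServletInputStream method used here.
trait FinishedAware { def isFinished: Boolean }

// Hypothetical stream mimicking the observed behavior: isFinished is
// false until a read() has returned end-of-stream, even for empty bodies.
final class TrackingStream(bytes: Array[Byte]) extends InputStream with FinishedAware {
  private val underlying = new ByteArrayInputStream(bytes)
  private var sawEof = false

  override def read(): Int = {
    val b = underlying.read()
    if (b == -1) sawEof = true
    b
  }

  def isFinished: Boolean = sawEof
}

object BodyGuard {
  // Fails fast if the body was already consumed before we got the request.
  def requireUnread(in: InputStream with FinishedAware): InputStream = {
    if (in.isFinished)
      throw new IllegalStateException(
        "Request InputStream is already finished; was the body read " +
          "upstream (e.g. by a Filter calling getParameterMap)?")
    in
  }
}
```

With this shape, a fresh empty-bodied request passes the guard, while a stream that an upstream component has already read to the end raises the exception instead of yielding a silently empty body.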