zalando/logbook

Possible OOM with large servlet response

wintermute0 opened this issue · 6 comments

Hi,

I'm reading the source code and just wondering that why there is no size limit when writing to the branched OutputStream? What if there is a really large HTTP response? Then the whole thing will be stuck inside a ByteArrayOutputStream which will easily use up all your memory. Also there is a configuration property called "logbook.write.max-body-size". You would think that it should be able to put an upper limit to the memory footprint no matter the size of the actual HTTP response body, but looks like it did not.

You have a point here, as the response is copied in LocalResponse classes, the allocation of memory is doubled. We could use the max-body-size to enforce a limit, let's say, counting the worst case scenario, that 4 bytes are used per character.
I'm not sure how big of a risk this is. Who would enable response body logs for cases when it can be that big? Even truncation of the body doesn't make sense here.

Well, here is our story.

We use logbook in a legacy Spring 4.x system and we have a LogFilter for URL pattern "/*". We also have a max-body-size in place. Everything was fine for a couple of years, util a recent deployment. A file downloading functionality was added by one of our team members. And then with some users downloading files > 2GB, our server just exploded :(

I think the real risk is that the "max-body-size" configuration made everybody thought that it should be safe to Filter /*. But actually it's not.

After some internal discussions with the team, it doesn't look like we'll be able to address this fully without changing how Logbook works entirely. The problem is that Logbook buffers the requests and responses not only in servlets implementation, but for all the HTTP clients and other server types.

Another problem is that Logbook predicates evaluate the request/response already after this buffering is made, as the request/response are converted to the internal types at the same time. So you can't exclude the paths, where you send the big payloads from the buffering path, really.

What you could potentially do for your case for now is just excluding the paths that are using the heavy payload from Logbook (probably you already did that) writing your own Filter implementation, where you exclude certain paths before calling Logbook.

Ah, @whiskeysierra is right, the buffering will be used only if the request is eligible for the logging.

Do you think it would help if we mention that logbook.write.max-body-size only affect the size of the output, but does not prevent the buffering of the body?

Another possible suggestion for @wintermute0 I can think of is to define a custom Strategy, if you still want to log response when customers download files, but to not log the body.

As per the defaults to exclude binary content types, as far as I can tell, FilteredHttpResponse still retrieves the body from the org.zalando.logbook.HttpResponse when getBodyAsString is called. So the buffering would still happen, even through the body will not be logged, if the ResponseFilter is configured to skip the body.

I think it would definitely help to mention the buffering behavior in the documentation for logbook.write.max-body-size. As for our particular use case we have already found a way to exclude the download request from logging. Thanks for your help :)