eclipse-ee4j/jersey

Buffer overflow due to timeout with Netty + Jersey

jerseyrobot opened this issue · 13 comments

When returning a large response using the Netty connector, a "Buffer overflow" error occurs. Switching from jersey-container-netty-http to jersey-container-grizzly2-http resolves the issue, so it appears to be specific to the netty connector.

Testing with a 521k response and Jersey 2.25.1, the buffer overflow occurs in org.glassfish.jersey.netty.connector.internal.JerseyChunkedInput#write at line 224. The revelant code is:

private void write(Provider<ByteBuffer> bufferSupplier) throws IOException {

        checkClosed();

        try {
            boolean queued = queue.offer(bufferSupplier.get(), WRITE_TIMEOUT, TimeUnit.MILLISECONDS);
            if (!queued) {
throw new IOException("Buffer overflow.");
            }

        } catch (InterruptedException e) {
            throw new IOException(e);
        }
    }

The queue has a capacity of 8, and the offer timeout is 1000ms, which seems low.

Relevant portion of the stack trace:

Caused by: java.io.IOException: Buffer overflow.
	at org.glassfish.jersey.netty.connector.internal.JerseyChunkedInput.write(JerseyChunkedInput.java:224)
	at org.glassfish.jersey.netty.connector.internal.JerseyChunkedInput.write(JerseyChunkedInput.java:204)
	at org.glassfish.jersey.message.internal.CommittingOutputStream.write(CommittingOutputStream.java:229)
	at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$UnCloseableOutputStream.write(WriterInterceptorExecutor.java:299)
	at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:221)

Environment

OpenJDK 1.8.0_121

Affected Versions

[2.25.1]

@glassfishrobot Commented
Reported by jekh

@glassfishrobot Commented
@pavelbucek said:
thanks for your report.

do you have any opinion about what are reasonable defaults here?

Also, proper solution to this one is to make these settings configurable..

@glassfishrobot Commented
jekh said:
Would it make sense to have an unlimited wait when offering? Users would use a javax.ws.rs.core.StreamingOutput to avoid storing large amounts of data in memory, right?

@glassfishrobot Commented
This issue was imported from java.net JIRA JERSEY-3228

@brettkail-wk Commented
To fix, I think HttpChunkedInput needs to be flushed so that it attempts to start reading input immediately. Otherwise, all writes are buffered until the response is fully written, which overflows the buffer if enough data is sent.

--- server/src/main/java/org/glassfish/jersey/netty/httpserver/NettyResponseWriter.java
+++ server/src/main/java/org/glassfish/jersey/netty/httpserver/NettyResponseWriter.java
@@ -144,7 +144,7 @@ class NettyResponseWriter implements ContainerResponseWriter {
             JerseyChunkedInput jerseyChunkedInput = new JerseyChunkedInput(ctx.channel());
 
             if (HttpUtil.isTransferEncodingChunked(response)) {
-                ctx.write(new HttpChunkedInput(jerseyChunkedInput)).addListener(FLUSH_FUTURE);
+                ctx.writeAndFlush(new HttpChunkedInput(jerseyChunkedInput));
             } else {
                 ctx.write(new HttpChunkedInput(jerseyChunkedInput)).addListener(FLUSH_FUTURE);
             }

This only works if the client is reading at least one chunk per 10sec (perhaps reasonable). To fully fix, JerseyChunkedInput presumably needs to be reworked to block indefinitely when there's no space but to allow the ChunkedInput.close to waken the writer and throw an IOException. (The existing close implements both the OutputStream.close and ChunkedInput.close, but it appears to be implemented as the latter to abort the output stream. In that case, the removeLast/add appears to race with concurrent writes.)

@jpommerening Commented
@brettkail-wk, @pavelbucek: Is there some progress here? I'd volunteer to help, if someone can get me started (and if that's at all acceptable).

Brett, did you come up with a workaround that does not require patching jersey-container-netty-http?

@brettkail-wk Commented
@jpommerening No, I don't think there is a workaround. In the meantime, my company has agreed to the CLA, so I'll submit a PR with a variation of the fix above.

@jpommerening Commented
@brettkail-wk Thanks for the quick update! That's some promising news! :)

This is now a blocking issue for me, is there a workaround? I am having to drop and filter data because I can't get the buffer to stop overflowing.

Not as far as I know. jersey/jersey#3791 would need to be applied. I'm no longer pursuing a fix since we stopped using Netty with Jersey.

This affected us so badly we have now done the same and swapped to Grizzly for Jersey. I will keep watch on this ticket as swapping back (which is desirable as we also use gRPC) is fairly easy at the moment.

Unfortunately, any PR made to to the original repository needs to be made again to this repository under the new conditions.