High CPU usage in gRPC over curvetls
When running a gRPC listener protected by curvetls encryption, CPU usage skyrockets after the first connection.
I've added the pprof endpoint, and the profile points to a spinning call to Read or ReadFrame in curvetls, which repeatedly calls down into an expensive syscall.
The number of file descriptors stays capped at around 8 or 9; gRPC is (I believe) multiplexing connections over these. As more requests hit the same hpt server, performance does seem to degrade. We should check for a goroutine leak.
After profiling with grmon, it does look like a goroutine leak. As I send more simple requests, more and more goroutines get stuck in the Read loop, which is expensive due to the syscall overhead. I'm guessing these goroutines should be released after a while.
A clue: the buffered reader delegates to Read on the curvetls EncryptedConn, but at first glance there doesn't seem to be a code path that returns io.EOF. Without EOF, I imagine that standard library utilities like io.ReadFull will just spin forever.
This seems to be fixed with a patch to curvetls. Read did not return any errors, so the caller (an http2 transport created for each incoming connection) never returned, never released the connection, and kept a goroutine spinning in a busy read loop for each connection. That's exactly what the profiling shows: tons of reads.
Returning io.EOF and io.ErrUnexpectedEOF signals the caller that there's nothing more to read; without them, it tries to read forever. Fixes here
CPU usage for an hpt daemon is now really low in top, and there's no goroutine leak from what I can tell. Will need to try this scenario with a longer-lived bidirectional stream at some point.