linkerd/linkerd-tcp

Improve socket reuse to avoid using too many file descriptors


If I run slow_cooker with 500 clients, we can run out of file descriptors quickly.

slow_cooker -host "perf-cluster" -qps 20 -concurrency 500 -interval 10s http://proxy-test-4d:7474

results in a panic.

$ RUST_LOG=error RUST_BACKTRACE=yes ./linkerd-tcp-1490585634 example.yml
Listening on http://127.0.0.1:9989.
thread 'main' panicked at 'could not run proxies: Error { repr: Os { code: 24, message: "Too many open files" } }', /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libcore/result.rs:868
stack backtrace:
   1:     0x557763c6f7ac - std::sys::imp::backtrace::tracing::imp::write::hf33ae72d0baa11ed
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:42
   2:     0x557763c72abe - std::panicking::default_hook::{{closure}}::h59672b733cc6a455
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:351
   3:     0x557763c726c4 - std::panicking::default_hook::h1670459d2f3f8843
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:367
   4:     0x557763c72f5b - std::panicking::rust_panic_with_hook::hcf0ddb069e7beee7
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:555
   5:     0x557763c72df4 - std::panicking::begin_panic::hd6eb68e27bdf6140
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:517
   6:     0x557763c72d19 - std::panicking::begin_panic_fmt::hfea5965948b877f8
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:501
   7:     0x557763c72ca7 - rust_begin_unwind
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:477
   8:     0x557763c9f34d - core::panicking::panic_fmt::hc0f6d7b2c300cdd9
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libcore/panicking.rs:69
   9:     0x5577639d7642 - core::result::unwrap_failed::h52f3f53af574d319
  10:     0x5577639dcf41 - linkerd_tcp::main::h2f95da4c40bc36fe
  11:     0x557763c79f7a - __rust_maybe_catch_panic
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libpanic_unwind/lib.rs:98
  12:     0x557763c736c6 - std::rt::lang_start::hd7c880a37a646e81
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panicking.rs:436
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/panic.rs:361
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-linux/build/src/libstd/rt.rs:57
  13:     0x7fe85559a3f0 - __libc_start_main
  14:     0x5577639d5f68 - <unknown>
  15:                0x0 - <unknown>

We might want to play with SO_LINGER on socket shutdown to free up descriptors more quickly when the socket is closed. I know Twisted uses something like this for its TCP servers too. I've only got a D snippet handy, but I'm sure it's similar in Rust.

                // Set SO_LINGER to 1,0 which, by convention, causes a
                // connection reset to be sent when close is called,
                // instead of the standard FIN shutdown sequence.
                int[2] option = [ 1, 0 ];
                this.socket.handle.setsockopt(SOL_SOCKET, SO_LINGER, &option, option.sizeof);
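For reference, here is a rough Rust equivalent of that D snippet. This is a minimal sketch assuming the libc crate and a plain std::net::TcpStream (linkerd-tcp's streams are tokio-based, but the underlying setsockopt call on the raw fd is the same); it is not code from this repo.

```rust
use std::net::TcpStream;
use std::os::unix::io::AsRawFd;

/// Hypothetical helper: set SO_LINGER to (on, 0s) so that close() sends a
/// connection reset and releases the descriptor immediately, instead of
/// going through the standard FIN shutdown sequence.
fn set_linger_reset(stream: &TcpStream) -> std::io::Result<()> {
    let linger = libc::linger { l_onoff: 1, l_linger: 0 };
    let ret = unsafe {
        libc::setsockopt(
            stream.as_raw_fd(),
            libc::SOL_SOCKET,
            libc::SO_LINGER,
            &linger as *const _ as *const libc::c_void,
            std::mem::size_of::<libc::linger>() as libc::socklen_t,
        )
    };
    if ret != 0 {
        return Err(std::io::Error::last_os_error());
    }
    Ok(())
}
```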

Sorry I didn't clarify the use case. It should be used for sudden/abnormal connection loss.

This is a straight-up bug; socket options won't fix this. I think I know how to fix it...

Problem

linkerd-tcp does not close destination connections when the source client closes a connection.

Reproduction Case

Run linkerd-tcp:

lb=:; cargo run -- example.yaml
...

Run a web server:

web=:; twistd -n web -p 8880
...
2017-03-31 21:47:29+0000 [HTTPChannel,1014,127.0.0.1] 127.0.0.1 - - [31/Mar/2017:21:47:29 +0000] "GET / HTTP/1.1" 200 199 "-" "curl/7.43.0"
2017-03-31 21:47:29+0000 [-] Malformed file descriptor found.  Preening lists.
2017-03-31 21:47:29+0000 [-] bad descriptor <HTTPChannel #1015 on 8880>
2017-03-31 21:47:30+0000 [-] Malformed file descriptor found.  Preening lists.
2017-03-31 21:47:30+0000 [-] bad descriptor <HTTPChannel #1016 on 88

Monitor linkerd-tcp's connections:

netstat=:; while true ; do netstat -an |awk '$4 ~ /127\.0\.0\.1\.7474/ { print $4" "$6 }; $5 ~ /127\.0\.0\.1\.8880/ { print $5" "$6 }' | sort |uniq -c |sort -rn ; sleep 10 ;echo ; done
 944 127.0.0.1.8880 ESTABLISHED
 943 127.0.0.1.7474 CLOSE_WAIT
   1 127.0.0.1.7474 LISTEN
   1 127.0.0.1.7474 ESTABLISHED

 974 127.0.0.1.8880 ESTABLISHED
 973 127.0.0.1.7474 CLOSE_WAIT
   1 127.0.0.1.7474 LISTEN
   1 127.0.0.1.7474 ESTABLISHED

1003 127.0.0.1.8880 ESTABLISHED
1002 127.0.0.1.7474 CLOSE_WAIT
   1 127.0.0.1.7474 LISTEN
   1 127.0.0.1.7474 ESTABLISHED

Monitor linkerd-tcp's metrics:

metrics=:; while true ; do curl -s http://localhost:9989/metrics | grep conns_ | sort ; sleep 10 ; echo ; done
conns_active{proxy="default"} 944
conns_established{proxy="default"} 0
conns_pending{proxy="default"} 1

conns_active{proxy="default"} 974
conns_established{proxy="default"} 1
conns_pending{proxy="default"} 0

Then run a crappy client that doesn't tell the web server to tear down the connection:

crapclient=:; while true ; do curl -s "localhost:7474" >/dev/null && echo -n . ; done
......

Note that this behavior is not observed with slow_cooker.

We observe that the connections to linkerd-tcp are in CLOSE_WAIT, indicating that
linkerd-tcp has not closed its half of the connection. Furthermore, linkerd-tcp has not
attempted to close the connections to the destination either, as those connections are
still ESTABLISHED.

Solution

tokio_io's AsyncWrite provides a shutdown method. We need to make sure that server-side
shutdowns tear down the duplex stream.

After further digging, I've learned that AsyncWrite::shutdown has no relationship to TcpStream::shutdown. (Surprise!)

We need to use TcpStream::shutdown instead.
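For illustration, a minimal blocking sketch of the intended behavior, using plain std::net types rather than linkerd-tcp's actual tokio-based duplex code: when the source side hits EOF, propagate the close to the destination with a TCP-level shutdown so neither side is left in CLOSE_WAIT or ESTABLISHED.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpStream};

/// Hypothetical one-direction copy loop. On EOF from `src` (the client closed
/// its half), shut down the write half of `dst` so the destination sees a FIN.
/// Note this is TcpStream::shutdown, not AsyncWrite::shutdown, which only
/// flushes buffered data and does not close the socket.
fn copy_and_shutdown(src: &mut TcpStream, dst: &mut TcpStream) -> io::Result<()> {
    let mut buf = [0u8; 8192];
    loop {
        let n = src.read(&mut buf)?;
        if n == 0 {
            dst.shutdown(Shutdown::Write)?;
            return Ok(());
        }
        dst.write_all(&buf[..n])?;
    }
}
```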