RedisLabs/memtier_benchmark

local ports exhaust quickly due to TCP `TIME_WAIT` when `reconnect_interval` is small

minhuw opened this issue · 1 comment

minhuw commented

I found that when `--reconnect-interval` is small, local ports are exhausted before the experiment completes, as the log below shows.

$ memtier_benchmark -s 192.168.1.2 -t 1 -p 7777 -c 128 -n 10000 --json-out-file experiment.json --reconnect-interval 1
Json file experiment.json created...
Writing results to stdout
[RUN #1] Preparing benchmark client...
[RUN #1] Launching threads now...
[RUN #1 1%,   0 secs]  1 threads:       14335 ops,   14340 (avg:   14340) ops/sec, 611.74KB/sec (avg: 611.74KB/sec),  5.19 (avg:  5.19) msec latency
<some logs omitted>
[RUN #1 2%,  20 secs]  1 threads:       27477 ops,     692 (avg:    1373) ops/sec, 28.74KB/sec (avg: 58.36KB/sec), 47.35 (avg: 28.27) msec latency
connect failed, error = Cannot assign requested address
memtier_benchmark: shard_connection.cpp:470: void shard_connection::process_response(): Assertion `ret == 0' failed.
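
As a rough sanity check on the log above, the arithmetic below estimates how many reconnects a client can sustain before running out of local ports. It is only a sketch under assumptions that are not stated in the report: the client uses the Linux default ephemeral port range (32768 to 60999), each closed connection holds its port in TIME_WAIT for 60 seconds, and every request opens a new connection to the same server address and port, as `--reconnect-interval 1` implies. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Number of local ports in the inclusive ephemeral range [lo, hi]. */
long ephemeral_port_count(long lo, long hi)
{
    assert(lo <= hi);
    return hi - lo + 1;
}

/* Highest sustained rate of new connections per second to ONE destination
 * ip:port, if every closed connection keeps its local port busy for
 * time_wait_secs seconds. */
double max_reconnects_per_sec(long ports, long time_wait_secs)
{
    assert(time_wait_secs > 0);
    return (double)ports / (double)time_wait_secs;
}

/* Print the budget for the ASSUMED Linux defaults (not taken from the report). */
void print_port_budget(void)
{
    long ports = ephemeral_port_count(32768, 60999); /* assumed default range */
    long time_wait_secs = 60;                        /* assumed TIME_WAIT length */

    printf("ephemeral ports: %ld\n", ports);
    printf("max sustained reconnects/sec: %.1f\n",
           max_reconnects_per_sec(ports, time_wait_secs));
}
```

Under these assumptions there are 28232 usable ports, so anything much above about 470 reconnects per second to a single server endpoint cannot be sustained. The roughly 27,500 operations reported shortly before the failure are in the same range as that port count, which fits the TIME_WAIT explanation.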

I found that SO_LINGER is not enabled (`l_onoff` is 0 in the code below), so closed TCP connections enter the TIME_WAIT state instead of releasing their local ports immediately.

struct linger ling = {0, 0};
int flags = 1;
int error = setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, (void *) &flags, sizeof(flags));
assert(error == 0);
error = setsockopt(sockfd, SOL_SOCKET, SO_LINGER, (void *) &ling, sizeof(ling));
assert(error == 0);

It works if I enable SO_LINGER with a zero timeout as follows, which aborts the connection immediately when it is closed:

-        struct linger ling = {0, 0};
+        struct linger ling = {1, 0};
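
For readers who want to try the same change outside memtier_benchmark, here is a minimal standalone sketch of the technique. The helper names are hypothetical and are not part of memtier_benchmark. With `l_onoff = 1` and `l_linger = 0`, `close()` discards any unsent data and resets the connection rather than performing the normal shutdown, so this is only reasonable where losing in-flight data is acceptable, such as a benchmark client.

```c
#include <assert.h>
#include <sys/socket.h>
#include <unistd.h>

/* Request an abortive close: close() resets the connection instead of
 * going through the normal FIN handshake, so the local port does not
 * linger in TIME_WAIT. Returns 0 on success, -1 on error (errno is set). */
int enable_abortive_close(int sockfd)
{
    struct linger ling;

    assert(sockfd >= 0);
    ling.l_onoff = 1;  /* enable SO_LINGER ...            */
    ling.l_linger = 0; /* ... with a zero-second timeout  */
    return setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling));
}

/* Read the option back. Returns 1 if abortive close is active,
 * 0 if it is not, and -1 if getsockopt fails. */
int abortive_close_enabled(int sockfd)
{
    struct linger ling;
    socklen_t len = sizeof(ling);

    if (getsockopt(sockfd, SOL_SOCKET, SO_LINGER, &ling, &len) != 0)
        return -1;
    return ling.l_onoff != 0 && ling.l_linger == 0;
}

/* Self-check on a fresh, unconnected TCP socket: the option should be
 * off by default and on after the call. Returns 0 if that holds. */
int abortive_close_self_check(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int before = abortive_close_enabled(fd);
    int rc = enable_abortive_close(fd);
    int after = abortive_close_enabled(fd);
    close(fd);

    return (before == 0 && rc == 0 && after == 1) ? 0 : -1;
}
```

Calling `enable_abortive_close(sockfd)` right after the socket is created has the same effect as the one-line change in the diff above.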

Is there any reason SO_LINGER is not enabled? Is there a workaround that would let me test scenarios where reconnect_interval is very small?

@minhuw I believe tuning tcp_fin_timeout together with tcp_tw_reuse / tcp_tw_recycle will help you reuse TIME_WAIT connections and also reduce the total number of connections in TIME_WAIT.

However, it's essential to carefully test and evaluate the impact of enabling these parameters in your specific environment, as their behavior can vary depending on the network configuration and application requirements.
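
Before changing any of these settings, it can help to check what the client machine currently uses. The sketch below reads the values from procfs on Linux; the helper names are hypothetical, and the paths assume the standard `/proc/sys/net/ipv4/` layout. Two caveats worth keeping in mind: tcp_tw_recycle was removed in Linux 4.12, so the file may simply be absent, and tcp_fin_timeout is documented as governing the FIN_WAIT_2 state, so it may not shorten TIME_WAIT itself.

```c
#include <assert.h>
#include <stdio.h>

/* Read the first integer from a text file such as a procfs sysctl entry.
 * Returns 0 and stores the value in *out on success, -1 on failure. */
int read_first_long(const char *path, long *out)
{
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int matched = fscanf(f, "%ld", out);
    fclose(f);
    return matched == 1 ? 0 : -1;
}

/* Print the TIME_WAIT-related settings mentioned above. A missing file
 * (for example tcp_tw_recycle on newer kernels) is reported, not fatal. */
void print_time_wait_settings(void)
{
    static const char *const paths[] = {
        "/proc/sys/net/ipv4/tcp_tw_reuse",
        "/proc/sys/net/ipv4/tcp_fin_timeout",
        "/proc/sys/net/ipv4/tcp_tw_recycle",
    };

    for (size_t i = 0; i < sizeof(paths) / sizeof(paths[0]); i++) {
        long value;
        if (read_first_long(paths[i], &value) == 0)
            printf("%s = %ld\n", paths[i], value);
        else
            printf("%s: not available\n", paths[i]);
    }
}
```

Calling `print_time_wait_settings()` on the benchmark client shows which of the three knobs exist on that kernel and what they are set to, which makes it easier to evaluate the effect of any change.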