cloudfoundry/guardian

sysctl net.ipv4.tcp_keepalive_time setting needed

cwsteve2117 opened this issue · 5 comments

Description

The Azure Load Balancer does not issue a TCP reset when it drops an idle connection. The Linux tcp_keepalive_time therefore needs to be below the Azure LB idle timeout (4 minutes, i.e. 240 seconds) to avoid issues observed with applications running in Garden on a Diego cell. If the Azure LB drops the connection, applications that keep a connection open but idle for longer than 4 minutes will eventually hang and time out.

BOSH stemcells were updated to address this, but the values are not propagated to the Garden container. Stemcell settings added include:

net.ipv4.tcp_keepalive_intvl = 30
net.ipv4.tcp_keepalive_probes = 8
net.ipv4.tcp_keepalive_time = 120
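
For reference, here is a minimal Go sketch (not part of the original report) that reads these values straight from /proc. Run on the Diego cell host it should print the stemcell values above; run inside a Garden container the files may not exist at all, as the output below shows.

// keepalive_sysctls.go — a hypothetical helper for checking the keepalive sysctls.
package main

import (
	"fmt"
	"io/ioutil"
	"strings"
)

func main() {
	// The three keepalive settings added to the stemcell.
	for _, name := range []string{
		"tcp_keepalive_time",
		"tcp_keepalive_intvl",
		"tcp_keepalive_probes",
	} {
		data, err := ioutil.ReadFile("/proc/sys/net/ipv4/" + name)
		if err != nil {
			// Inside the container this is where "No such file or directory" shows up.
			fmt.Printf("net.ipv4.%s: %v\n", name, err)
			continue
		}
		fmt.Printf("net.ipv4.%s = %s\n", name, strings.TrimSpace(string(data)))
	}
}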

Logging and/or test output

$ /sbin/sysctl net.ipv4.tcp_keepalive_time
sysctl: cannot stat /proc/sys/net/ipv4/tcp_keepalive_time: No such file or directory

These are the net.ipv4.tcp settings we see in the Garden container:
$ /sbin/sysctl -a | grep "net.ipv4.tcp"
net.ipv4.tcp_base_mss = 1024
net.ipv4.tcp_ecn = 2
net.ipv4.tcp_ecn_fallback = 1
net.ipv4.tcp_fwmark_accept = 0
net.ipv4.tcp_mtu_probing = 0
net.ipv4.tcp_probe_interval = 600
net.ipv4.tcp_probe_threshold = 8

Steps to reproduce

Our test environment is running Pivotal Cloud Foundry 1.9.4 on Azure. The 1.9.x release of PCF includes:
garden-runc 1.1.1

https://github.com/scottfrederick/http_client is a Go app that hits the Cloud Controller on demand, with connection reuse. After a period of inactivity a request hangs for 15 mins before timing out.

There is no README yet, but the steps are:

  1. git clone
  2. modify the target URL https://github.com/scottfrederick/http_client/blob/master/http_client.go#L16
  3. go build
  4. cf push http_client -b https://github.com/cloudfoundry/go-buildpack

Once running, you can curl the app's route and it will echo the /info endpoint of the configured environment.

It needs the master branch of go-buildpack for Go 1.8 support.
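
For anyone who doesn't want to dig through the repo, a minimal sketch of what the app does is below (an approximation; the real http_client.go, its target URL, and port handling may differ).

// main.go — rough sketch of the reproduction app: a shared http.Client reuses
// keep-alive connections to the Cloud Controller, which is what eventually hangs.
package main

import (
	"io"
	"log"
	"net/http"
	"os"
)

// Assumed placeholder; the real app hard-codes its Cloud Controller /info URL.
const targetURL = "https://api.example.com/info"

// One shared client so idle connections are reused between incoming requests.
var client = &http.Client{}

func handler(w http.ResponseWriter, r *http.Request) {
	resp, err := client.Get(targetURL)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()
	io.Copy(w, resp.Body) // echo the /info response back to the caller
}

func main() {
	port := os.Getenv("PORT") // Cloud Foundry sets PORT for pushed apps
	if port == "" {
		port = "8080"
	}
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":"+port, nil))
}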

Hi there!

We use Pivotal Tracker to provide visibility into what our team is working on. A story for this issue has been automatically created.

The current status is as follows:

  • #141310829 sysctl net.ipv4.tcp_keepalive_time setting needed

This comment, as well as the labels on the issue, will be automatically updated as the status in Tracker changes.

Hi @cwsteve2117,

The Garden team had a discussion about this over in Slack and here are our thoughts:

It appears that the net.ipv4.tcp_keepalive* settings are not namespaced in the 4.4 kernel, which is why you aren't able to see them when running /sbin/sysctl -a | grep "net.ipv4.tcp" in the container. Given that these settings are not namespaced, the values set on the host should apply inside the container.

We've validated that these settings are namespaced in later kernel versions (tested on 4.8). All Garden could really suggest is changing the setting in the stemcell (which I can see you've already done ...), so it looks like something else is at play here.

Have you tried a more focused experiment to measure the keepalive time inside a CF app container? We think it should be the same as set on the host, but if not that's a bug!
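
One way to do that (a minimal sketch, assuming a throwaway Go program you can run inside the container): open a TCP socket and read TCP_KEEPIDLE, which the kernel should initialise from the effective net.ipv4.tcp_keepalive_time.

// keepidle.go — hypothetical check for the effective keepalive idle time.
package main

import (
	"fmt"
	"syscall"
)

func main() {
	// A fresh, unconnected TCP socket is enough; TCP_KEEPIDLE defaults to the
	// kernel's tcp_keepalive_time for the socket's network namespace.
	fd, err := syscall.Socket(syscall.AF_INET, syscall.SOCK_STREAM, 0)
	if err != nil {
		panic(err)
	}
	defer syscall.Close(fd)

	idle, err := syscall.GetsockoptInt(fd, syscall.IPPROTO_TCP, syscall.TCP_KEEPIDLE)
	if err != nil {
		panic(err)
	}
	fmt.Printf("effective tcp_keepalive_time: %d seconds\n", idle)
}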

Is there anything else you'd like to discuss or are you happy for us to close this off?

Cheers,
Ed & @craigfurman

I think we're good for now. Thank you for the consideration here. We'll keep up our investigation and let you know if something else potentially Garden-related comes up.

emalm commented

@cwsteve2117 Does the keepalive timing arithmetic for those kernel settings on the stemcell actually meet the timing requirement for the Azure LB? Based on http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html, it looks like it would take 120 + (8 - 1)*30 = 330 seconds for the connection to be considered dead on the local side, which is greater than the 240-second LB timeout.

Thanks,
Eric

@cwsteve2117 is there a follow-up issue, or does @ematpl's question give us a reason to reopen this one?