Handle idle connections
Closed this issue · 4 comments
Problem statement
We have an HTTP server exposing metrics to Telegraf. Telegraf scrapes the metrics over HTTP at 60-second intervals. It opens keep-alive connections, but under certain conditions (we have no repro, it happens only in prod) it does not reuse established connections and opens a new connection on every scrape. As a result the number of open connections keeps growing, and every connection is a fiber that lives forever.
Possible solution
Introduce new http.server config options:
- maximum number of idle connections: after reaching this limit, http.server could start closing idle connections
- idle timeout: the maximum amount of time an idle (keep-alive) connection may remain open before http.server closes it
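
To make the two proposed options concrete, here is a minimal, language-agnostic sketch written in Python. It is not the tarantool http API: `IdleConnectionTracker`, `max_idle`, `idle_timeout` and the method names are all hypothetical and only illustrate the bookkeeping a server would need. The clock is injectable so the behaviour can be checked without sleeping.

```python
import time
from collections import OrderedDict


class IdleConnectionTracker:
    """Hypothetical bookkeeping for idle keep-alive connections.

    Not part of any real http module; names are illustrative only.
    """

    def __init__(self, max_idle, idle_timeout, clock=time.monotonic):
        self.max_idle = max_idle          # cap on simultaneously idle connections
        self.idle_timeout = idle_timeout  # seconds a connection may stay idle
        self.clock = clock                # injectable time source
        self._idle = OrderedDict()        # conn_id -> idle since, oldest first

    def mark_idle(self, conn_id):
        """A connection finished a request and now waits for the next one."""
        self._idle.pop(conn_id, None)
        self._idle[conn_id] = self.clock()

    def mark_active(self, conn_id):
        """A connection received a new request, or was closed by the peer."""
        self._idle.pop(conn_id, None)

    def collect(self):
        """Return the ids of connections the server should close now."""
        now = self.clock()
        to_close = []
        # Entries are ordered by the time they became idle, so we can stop
        # at the first connection that has not yet reached the timeout.
        for conn_id, since in list(self._idle.items()):
            if now - since < self.idle_timeout:
                break
            to_close.append(conn_id)
            del self._idle[conn_id]
        # Enforce the cap by evicting the longest-idle survivors.
        while len(self._idle) > self.max_idle:
            conn_id, _ = self._idle.popitem(last=False)
            to_close.append(conn_id)
        return to_close
```

In a real server, `collect()` would presumably be called periodically (or whenever a connection becomes idle), and each returned connection would be closed so that the fiber serving it can exit.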
This is supposed to be fixed in httpng.
@kyukhin I do not agree with "wishlist". The lack of an idle connection timeout has been exploding every cluster our customer has for a week, and that's a lot of clusters. The only solution we have is to configure Telegraf; we have no way to protect ourselves from within tarantool. If it is fixed in httpng, then it must be ported to http: we have no resources to migrate every application we have to the new http module.
It is in our backlog now. Further prioritization may be done via the product team.
It is possible to set a timeout in the Keep-Alive header:
https://tools.ietf.org/id/draft-thomson-hybi-http-timeout-01.html#rfc.section.2
Currently the http module sets the Keep-Alive header without a timeout, see https://github.com/tarantool/http/blob/master/http/server/init.lua#L238-L260
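
For illustration, this is roughly what the header would look like with a timeout. The sketch is in Python and does not reflect the code of the http module; `keep_alive_headers` and its parameters are hypothetical. It only shows the `timeout=<seconds>` (and optional `max=<requests>`) parameter syntax of the `Keep-Alive` header.

```python
def keep_alive_headers(timeout_s, max_requests=None):
    """Build response headers that advertise a keep-alive idle timeout.

    Hypothetical helper: timeout_s is the idle timeout in whole seconds,
    max_requests optionally limits requests per connection.
    """
    if timeout_s < 0:
        raise ValueError("timeout must be non-negative")
    params = ["timeout=%d" % timeout_s]
    if max_requests is not None:
        params.append("max=%d" % max_requests)
    return {
        "Connection": "keep-alive",
        "Keep-Alive": ", ".join(params),
    }


# Example: advertise a 75-second idle timeout.
headers = keep_alive_headers(75)
```

Note that the header only informs the client. If the client ignores it or simply abandons the connection, as Telegraf seems to do here, the server still has to close the idle socket itself once the timeout elapses.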