gRPC stream is closed after 60 seconds of idle even with timeout annotations set
What happened:
The gRPC bi-directional stream is interrupted after 60 seconds of idle even though the necessary annotations are set.
Annotations:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/backend-protocol: GRPCS
I verified that these values are set correctly by exec'ing into the pod and checking nginx.conf directly:
proxy_connect_timeout 300s;
proxy_send_timeout 300s;
proxy_read_timeout 300s;
proxy_next_upstream error timeout;
proxy_next_upstream_timeout 0;
grpc_connect_timeout 300s;
grpc_send_timeout 300s;
grpc_read_timeout 300s;
However, the bi-directional stream between the server and the agent is still closed after 60 seconds.
What you expected to happen:
I expected the idle stream to be kept open for 5 minutes (300 s) before being closed.
I think the default value of 60 s is used whenever an annotation value is greater than 60 s. If I set these three annotations to a value less than 60, the timeout is applied properly: for instance, I set them to "10" and the stream was interrupted after 10 seconds of idle.
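One place such a fixed 60 s cutoff can come from is nginx's http-level client timeouts, which are separate from the per-location proxy_*/grpc_* directives shown above and are not controlled by these annotations. On a stock ingress-nginx install the rendered nginx.conf typically contains something like the following excerpt (a sketch; the actual values are driven by the controller ConfigMap, and both default to 60 s):

# http-level defaults in the rendered /etc/nginx/nginx.conf (sketch)
client_header_timeout 60s;
client_body_timeout 60s;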
NGINX Ingress controller version (exec into the pod and run /nginx-ingress-controller --version):
ingress-nginx-controller-6df48c5677-cjpgv:/etc/nginx$ /nginx-ingress-controller --version
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: v1.11.3
Build: 0106de65cfccb74405a6dfa7d9daffc6f0a6ef1a
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.25.5
-------------------------------------------------------------------------------
Kubernetes version (use kubectl version): v1.29.10
Environment:
- Cloud provider or hardware configuration: Managed AKS
- OS (e.g. from /etc/os-release): -
- Kernel (e.g. uname -a): -
- Install tools: Helm
- Basic cluster related info: Managed AKS v1.29.10, Public Azure Cloud
How was the ingress-nginx-controller installed:
$ helm ls -A | grep -i ingress
ingress-nginx ingress-nginx 1 2024-11-29 15:59:19.017422556 +0100 CET deployed ingress-nginx-4.11.3 1.11.3
$ helm -n ingress-nginx get values ingress-nginx
USER-SUPPLIED VALUES:
null
- Current State of the controller:
$ kubectl describe ingressclasses
Name: azure-application-gateway
Labels: addonmanager.kubernetes.io/mode=Reconcile
app=ingress-appgw
app.kubernetes.io/component=controller
Annotations: <none>
Controller: azure/application-gateway
Events: <none>
Name: nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.11.3
helm.sh/chart=ingress-nginx-4.11.3
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: ingress-nginx
Controller: k8s.io/ingress-nginx
Events: <none>
$ kubectl -n ingress-nginx get all -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/ingress-nginx-controller-6df48c5677-cjpgv 1/1 Running 0 124m 10.244.2.16 aks-nodepool1-19682194-vmss000003 <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/ingress-nginx-controller LoadBalancer 10.0.217.178 <redacted> 80:32223/TCP,443:31568/TCP 124m app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
service/ingress-nginx-controller-admission ClusterIP 10.0.24.224 <none> 443/TCP 124m app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/ingress-nginx-controller 1/1 1 1 124m controller registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/ingress-nginx-controller-6df48c5677 1 1 1 124m controller registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx,pod-template-hash=6df48c5677
$ kubectl -n ingress-nginx describe po ingress-nginx-controller-6df48c5677-cjpgv
Name: ingress-nginx-controller-6df48c5677-cjpgv
Namespace: ingress-nginx
Priority: 0
Service Account: ingress-nginx
Node: aks-nodepool1-19682194-vmss000003/10.224.0.6
Start Time: Fri, 29 Nov 2024 15:59:47 +0100
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.11.3
helm.sh/chart=ingress-nginx-4.11.3
pod-template-hash=6df48c5677
Annotations: <none>
Status: Running
IP: 10.244.2.16
IPs:
IP: 10.244.2.16
Controlled By: ReplicaSet/ingress-nginx-controller-6df48c5677
Containers:
controller:
Container ID: containerd://c9fec8b39fbde912c1f7daf9e151fb32bc9fa4ab754b26908b873476f1a6d6a2
Image: registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7
Image ID: registry.k8s.io/ingress-nginx/controller@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7
Ports: 80/TCP, 443/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
SeccompProfile: RuntimeDefault
Args:
/nginx-ingress-controller
--publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
--election-id=ingress-nginx-leader
--controller-class=k8s.io/ingress-nginx
--ingress-class=nginx
--configmap=$(POD_NAMESPACE)/ingress-nginx-controller
--validating-webhook=:8443
--validating-webhook-certificate=/usr/local/certificates/cert
--validating-webhook-key=/usr/local/certificates/key
--enable-metrics=false
State: Running
Started: Fri, 29 Nov 2024 15:59:56 +0100
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 90Mi
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-controller-6df48c5677-cjpgv (v1:metadata.name)
POD_NAMESPACE: ingress-nginx (v1:metadata.namespace)
LD_PRELOAD: /usr/local/lib/libmimalloc.so
Mounts:
/usr/local/certificates/ from webhook-cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vtb8k (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
webhook-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-nginx-admission
Optional: false
kube-api-access-vtb8k:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal RELOAD 13m (x6 over 125m) nginx-ingress-controller NGINX reload triggered due to a change in configuration
$ kubectl -n ingress-nginx describe svc ingress-nginx-controller
Name: ingress-nginx-controller
Namespace: ingress-nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.11.3
helm.sh/chart=ingress-nginx-4.11.3
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: ingress-nginx
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.0.217.178
IPs: 10.0.217.178
LoadBalancer Ingress: <redacted>
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 32223/TCP
Endpoints: 10.244.2.16:80
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 31568/TCP
Endpoints: 10.244.2.16:443
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal UpdatedLoadBalancer 33m (x3 over 51m) service-controller Updated load balancer with new hosts
- Current state of ingress object, if applicable:
$ kubectl describe ing -n <ns> <ing-name>
Name: <ing-name>
Namespace: <ns>
Address: <redacted>
Ingress Class: nginx
Default backend: <default>
Rules:
Host Path Backends
---- ---- --------
*
envoy-grpcapi:443 (10.244.0.29:8080)
Annotations: nginx.ingress.kubernetes.io/backend-protocol: GRPCS
nginx.ingress.kubernetes.io/proxy-connect-timeout: 300
nginx.ingress.kubernetes.io/proxy-read-timeout: 300
nginx.ingress.kubernetes.io/proxy-send-timeout: 300
nginx.ingress.kubernetes.io/ssl-redirect: true
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Sync 15m (x7 over 55m) nginx-ingress-controller Scheduled for sync
How to reproduce this issue:
Anything else we need to know:
This issue is currently awaiting triage.
If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/remove-kind bug
Can you please write a step-by-step guide that someone can copy/paste from to reproduce this on a kind cluster, including the gRPC application?
Sure. I will work on a sample app and share the details soon.
Try setting client-body-timeout.
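For context, client-body-timeout is a controller-wide ConfigMap key, not a per-Ingress annotation; it maps to nginx's client_body_timeout, which defaults to 60 seconds. The key goes into the controller's ConfigMap, the one referenced by the --configmap flag in the pod args above. A minimal sketch of the relevant entry:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # matches --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
  namespace: ingress-nginx
data:
  client-body-timeout: "300"   # seconds; controller default is 60

The controller watches this ConfigMap and reloads nginx when it changes (see the RELOAD events above), so no pod restart should be needed.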
/kind bug
/priority backlog
/triage needs-information
Let us know if the client body timeout works. I am also seeing that the client header timeout should likely be set as well.
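Since the controller was installed via Helm with no user-supplied values (see the helm get values output above), both keys can also be set through the chart's controller.config, which renders into the same ConfigMap. A minimal sketch for the ingress-nginx chart (hypothetical values file; both timeouts are in seconds):

# values.yaml for the ingress-nginx Helm chart (sketch)
controller:
  config:
    client-body-timeout: "300"     # nginx client_body_timeout, default 60
    client-header-timeout: "300"   # nginx client_header_timeout, default 60

Applied with something like: helm upgrade ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx -f values.yaml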