gRPC stream is closed after 60 seconds of idle even with timeout annotations set
What happened:
The gRPC bi-directional stream is interrupted after 60 seconds of idle even though the necessary annotations are set.
Annotations:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "300"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "300"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/backend-protocol: GRPCS
I verified that these values are set correctly by exec'ing into the pod and checking nginx.conf directly:
proxy_connect_timeout 300s;
proxy_send_timeout 300s;
proxy_read_timeout 300s;
proxy_next_upstream error timeout;
proxy_next_upstream_timeout 0;
grpc_connect_timeout 300s;
grpc_send_timeout 300s;
grpc_read_timeout 300s;
However, the bi-directional stream between the server and the agent is still closed after 60 seconds.
What you expected to happen:
I expected the idle stream to be kept open for 5 minutes (300 s) before being closed.
I think the default value of 60 s is used whenever an annotation value is greater than 60 s. If I set these three annotations to a value less than 60, the timeout is applied properly: for instance, I set them to "10" and the stream was interrupted after 10 seconds of idle.
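One place such a fixed 60 s cutoff can come from is nginx's http-level client timeouts, which are separate from the per-location proxy_*/grpc_* directives shown above and are not controlled by these annotations. On a stock ingress-nginx install the rendered nginx.conf typically contains something like the following excerpt (a sketch; the actual values are driven by the controller ConfigMap, and both default to 60 s):

# http-level defaults in the rendered /etc/nginx/nginx.conf (sketch)
client_header_timeout 60s;
client_body_timeout 60s;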
NGINX Ingress controller version (exec into the pod and run /nginx-ingress-controller --version):
ingress-nginx-controller-6df48c5677-cjpgv:/etc/nginx$ /nginx-ingress-controller --version
-------------------------------------------------------------------------------
NGINX Ingress controller
Release: v1.11.3
Build: 0106de65cfccb74405a6dfa7d9daffc6f0a6ef1a
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.25.5
-------------------------------------------------------------------------------
Kubernetes version (use kubectl version): v1.29.10
Environment:
- Cloud provider or hardware configuration: Managed AKS
- OS (e.g. from /etc/os-release): -
- Kernel (e.g. uname -a): -
- Install tools: Helm
- Basic cluster related info: Managed AKS v1.29.10, Public Azure Cloud
How was the ingress-nginx-controller installed:
$ helm ls -A | grep -i ingress
ingress-nginx ingress-nginx 1 2024-11-29 15:59:19.017422556 +0100 CET deployed ingress-nginx-4.11.3 1.11.3
$ helm -n ingress-nginx get values ingress-nginx
USER-SUPPLIED VALUES:
null
- Current State of the controller:
$ kubectl describe ingressclasses
Name: azure-application-gateway
Labels: addonmanager.kubernetes.io/mode=Reconcile
app=ingress-appgw
app.kubernetes.io/component=controller
Annotations: <none>
Controller: azure/application-gateway
Events: <none>
Name: nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.11.3
helm.sh/chart=ingress-nginx-4.11.3
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: ingress-nginx
Controller: k8s.io/ingress-nginx
Events: <none>
$ kubectl -n ingress-nginx get all -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod/ingress-nginx-controller-6df48c5677-cjpgv 1/1 Running 0 124m 10.244.2.16 aks-nodepool1-19682194-vmss000003 <none> <none>
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTOR
service/ingress-nginx-controller LoadBalancer 10.0.217.178 <redacted> 80:32223/TCP,443:31568/TCP 124m app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
service/ingress-nginx-controller-admission ClusterIP 10.0.24.224 <none> 443/TCP 124m app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
NAME READY UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR
deployment.apps/ingress-nginx-controller 1/1 1 1 124m controller registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
NAME DESIRED CURRENT READY AGE CONTAINERS IMAGES SELECTOR
replicaset.apps/ingress-nginx-controller-6df48c5677 1 1 1 124m controller registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7 app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx,pod-template-hash=6df48c5677
$ kubectl -n ingress-nginx describe po ingress-nginx-controller-6df48c5677-cjpgv
Name: ingress-nginx-controller-6df48c5677-cjpgv
Namespace: ingress-nginx
Priority: 0
Service Account: ingress-nginx
Node: aks-nodepool1-19682194-vmss000003/10.224.0.6
Start Time: Fri, 29 Nov 2024 15:59:47 +0100
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.11.3
helm.sh/chart=ingress-nginx-4.11.3
pod-template-hash=6df48c5677
Annotations: <none>
Status: Running
IP: 10.244.2.16
IPs:
IP: 10.244.2.16
Controlled By: ReplicaSet/ingress-nginx-controller-6df48c5677
Containers:
controller:
Container ID: containerd://c9fec8b39fbde912c1f7daf9e151fb32bc9fa4ab754b26908b873476f1a6d6a2
Image: registry.k8s.io/ingress-nginx/controller:v1.11.3@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7
Image ID: registry.k8s.io/ingress-nginx/controller@sha256:d56f135b6462cfc476447cfe564b83a45e8bb7da2774963b00d12161112270b7
Ports: 80/TCP, 443/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
SeccompProfile: RuntimeDefault
Args:
/nginx-ingress-controller
--publish-service=$(POD_NAMESPACE)/ingress-nginx-controller
--election-id=ingress-nginx-leader
--controller-class=k8s.io/ingress-nginx
--ingress-class=nginx
--configmap=$(POD_NAMESPACE)/ingress-nginx-controller
--validating-webhook=:8443
--validating-webhook-certificate=/usr/local/certificates/cert
--validating-webhook-key=/usr/local/certificates/key
--enable-metrics=false
State: Running
Started: Fri, 29 Nov 2024 15:59:56 +0100
Ready: True
Restart Count: 0
Requests:
cpu: 100m
memory: 90Mi
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-controller-6df48c5677-cjpgv (v1:metadata.name)
POD_NAMESPACE: ingress-nginx (v1:metadata.namespace)
LD_PRELOAD: /usr/local/lib/libmimalloc.so
Mounts:
/usr/local/certificates/ from webhook-cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-vtb8k (ro)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
webhook-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-nginx-admission
Optional: false
kube-api-access-vtb8k:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal RELOAD 13m (x6 over 125m) nginx-ingress-controller NGINX reload triggered due to a change in configuration
$ kubectl -n ingress-nginx describe svc ingress-nginx-controller
Name: ingress-nginx-controller
Namespace: ingress-nginx
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=ingress-nginx
app.kubernetes.io/part-of=ingress-nginx
app.kubernetes.io/version=1.11.3
helm.sh/chart=ingress-nginx-4.11.3
Annotations: meta.helm.sh/release-name: ingress-nginx
meta.helm.sh/release-namespace: ingress-nginx
Selector: app.kubernetes.io/component=controller,app.kubernetes.io/instance=ingress-nginx,app.kubernetes.io/name=ingress-nginx
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.0.217.178
IPs: 10.0.217.178
LoadBalancer Ingress: <redacted>
Port: http 80/TCP
TargetPort: http/TCP
NodePort: http 32223/TCP
Endpoints: 10.244.2.16:80
Port: https 443/TCP
TargetPort: https/TCP
NodePort: https 31568/TCP
Endpoints: 10.244.2.16:443
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal UpdatedLoadBalancer 33m (x3 over 51m) service-controller Updated load balancer with new hosts
- Current state of ingress object, if applicable:
$ kubectl describe ing -n <ns> <ing-name>
Name: <ing-name>
Namespace: <ns>
Address: <redacted>
Ingress Class: nginx
Default backend: <default>
Rules:
Host Path Backends
---- ---- --------
*
envoy-grpcapi:443 (10.244.0.29:8080)
Annotations: nginx.ingress.kubernetes.io/backend-protocol: GRPCS
nginx.ingress.kubernetes.io/proxy-connect-timeout: 300
nginx.ingress.kubernetes.io/proxy-read-timeout: 300
nginx.ingress.kubernetes.io/proxy-send-timeout: 300
nginx.ingress.kubernetes.io/ssl-redirect: true
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Sync 15m (x7 over 55m) nginx-ingress-controller Scheduled for sync
How to reproduce this issue:
Anything else we need to know:
This issue is currently awaiting triage.
If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
The triage/accepted label can be added by org members by writing /triage accepted in a comment.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/remove-kind bug
Can you please write a step-by-step guide that someone can copy/paste from to reproduce this on a kind cluster, including the gRPC application?
Sure. I will work on a sample app and share the details soon.
Try setting client-body-timeout.
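For context, client-body-timeout is a controller-wide ConfigMap key, not a per-Ingress annotation; it maps to nginx's client_body_timeout, which defaults to 60 seconds. The key goes into the controller's ConfigMap, the one referenced by the --configmap flag in the pod args above. A minimal sketch of the relevant entry:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # matches --configmap=$(POD_NAMESPACE)/ingress-nginx-controller
  namespace: ingress-nginx
data:
  client-body-timeout: "300"   # seconds; controller default is 60

The controller watches this ConfigMap and reloads nginx when it changes (see the RELOAD events above), so no pod restart should be needed.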
/kind bug
/priority backlog
/triage needs-information
Let us know if the client body timeout works. I am also seeing that the client header timeout should likely be set as well.
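Since the controller was installed via Helm with no user-supplied values (see the helm get values output above), both keys can also be set through the chart's controller.config, which renders into the same ConfigMap. A minimal sketch for the ingress-nginx chart (hypothetical values file; both timeouts are in seconds):

# values.yaml for the ingress-nginx Helm chart (sketch)
controller:
  config:
    client-body-timeout: "300"     # nginx client_body_timeout, default 60
    client-header-timeout: "300"   # nginx client_header_timeout, default 60

Applied with something like: helm upgrade ingress-nginx ingress-nginx/ingress-nginx -n ingress-nginx -f values.yaml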