/ingress-gce

Ingress controller for Google Cloud

GLBC

GLBC is a GCE L7 load balancer controller that manages external loadbalancers configured through the Kubernetes Ingress API.

A word to the wise

Please read the beta limitations doc before using this controller. In summary:

  • This is a work in progress.
  • It relies on a beta Kubernetes resource.
  • The loadbalancer controller pod is not aware of your GCE quota.

If you are running a cluster on GKE and are interested in trying out alpha releases of the GLBC before they are officially released, please visit the deploy/glbc/ directory.

Overview

GCP HTTP(S) Load Balancer: Google Cloud Platform does not have a single resource that represents an L7 loadbalancer. When a user request comes in, it is first handled by the global forwarding rule, which sends the traffic to an HTTP proxy service, which in turn sends the traffic to a URL map that parses the URL to see which backend service will handle the request. Each backend service is assigned a set of virtual machine instances grouped into instance groups.

Services: A Kubernetes Service defines a set of pods and a means by which to access them, such as a single stable IP address and corresponding DNS name. This IP defaults to a cluster VIP in a private address range. You can direct ingress traffic to a particular Service by setting its Type to NodePort or LoadBalancer. NodePort opens up a port on every node in your cluster and proxies traffic to the endpoints of your service, while LoadBalancer allocates an L4 cloud loadbalancer.

What is an Ingress Controller?

Configuring a webserver or loadbalancer is harder than it should be. Most webserver configuration files are very similar. There are some applications that have weird little quirks that tend to throw a wrench in things, but for the most part you can apply the same logic to them and achieve a desired result.

The Ingress resource embodies this idea, and an Ingress controller is meant to handle all the quirks associated with a specific "class" of Ingress (be it a single instance of a loadbalancer, or a more complicated setup of frontends that provide GSLB, DDoS protection, etc).

An Ingress Controller is a daemon, deployed as a Kubernetes Pod, that watches the apiserver's /ingresses endpoint for updates to the Ingress resource. Its job is to satisfy requests for Ingresses.

L7 Load balancing on Kubernetes

To achieve L7 loadbalancing through Kubernetes, we employ a resource called Ingress. The Ingress is consumed by this loadbalancer controller, which creates the following GCE resource graph:

Global Forwarding Rule -> TargetHttpProxy -> URL Map -> Backend Service -> Instance Group

The controller (GLBC) manages the lifecycle of each component in the graph. It uses the Kubernetes resources as a spec for the desired state, and the GCE cloud resources as the observed state, and drives the observed to the desired. If an edge is disconnected, it fixes it. Each Ingress translates to a new GCE L7, and the rules on the Ingress become paths in the GCE URL Map. This allows you to route traffic to various backend Kubernetes Services through a single public IP, which is in contrast to Type=LoadBalancer, which allocates a public IP per Kubernetes Service. For this to work, the Kubernetes Service must have Type=NodePort.
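
The "drive the observed state to the desired state" idea can be sketched in a few lines. This is a toy model under stated assumptions, not GLBC's code: the resource labels, `desired_edges`, and `reconcile` are hypothetical names, and the graph is reduced to a single chain of edges.

```python
# Hypothetical, simplified model of the reconcile loop; not GLBC's actual code.
CHAIN = [
    "forwarding-rule",
    "target-proxy",
    "url-map",
    "backend-service",
    "instance-group",
]


def desired_edges(chain):
    """Return the edges the resource graph should contain."""
    return set(zip(chain, chain[1:]))


def reconcile(observed_edges, chain=CHAIN):
    """Return (repaired, missing): the fixed edge set and the edges re-added."""
    wanted = desired_edges(chain)
    missing = wanted - set(observed_edges)
    # Sort so the list of repairs is deterministic.
    return set(observed_edges) | missing, sorted(missing)


if __name__ == "__main__":
    observed = {("forwarding-rule", "target-proxy"), ("url-map", "backend-service")}
    repaired, missing = reconcile(observed)
    for src, dst in missing:
        print(f"reconnecting {src} -> {dst}")
```

Running `reconcile` a second time on the repaired set reports nothing missing, which is the property a periodic sync relies on.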

The Ingress

An Ingress in Kubernetes is a REST object, similar to a Service. A minimal Ingress might look like:

01. apiVersion: extensions/v1beta1
02. kind: Ingress
03. metadata:
04.  name: hostlessendpoint
05. spec:
06.  rules:
07.  - http:
08.      paths:
09.      - path: /hostless
10.        backend:
11.          serviceName: test
12.          servicePort: 80

POSTing this to the Kubernetes API server would cause GLBC to create a GCE L7 that routes all traffic sent to http://ip-of-loadbalancer/hostless to :80 of the service named test. If the service doesn't exist yet or isn't of type NodePort, GLBC will allocate an IP and wait until such a Service exists. Once the Service shows up, it will create the required path rules to route traffic.

Lines 1-4: Resource metadata used to tag GCE resources. For example, if you go to the console you would see a URL Map called k8-um-default-hostlessendpoint, where default is the namespace and hostlessendpoint is the name of the resource. The Kubernetes API server ensures that namespace/name is unique, so there will never be any collisions.

Lines 5-7: Ingress Spec has all the information needed to configure a GCE L7. Most importantly, it contains a list of rules. A rule can take many forms, but the only rule relevant to GLBC is the http rule.

Lines 8-9: Each HTTP rule contains the following information: A host (eg: foo.bar.com, defaults to * in this example), a list of paths (eg: /hostless) each of which has an associated backend (test:80). Both the host and path must match the content of an incoming request before the L7 directs traffic to the backend.

Lines 10-12: A backend is a service:port combination. It selects a group of pods capable of servicing traffic sent to the path specified in the parent rule. The port is the desired spec.ports[*].port from the Service Spec -- Note, though, that the L7 actually directs traffic to the port's corresponding NodePort.

Global Parameters: For the sake of simplicity the example Ingress has no global parameters. However, one can specify a default backend (see examples below). In its absence, requests that don't match a path in the spec are sent to the default backend of GLBC.
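
To make the host, path, and default-backend rules concrete, here is a minimal sketch of the matching logic. It is an illustration only: `match_backend` and the tuple layout are hypothetical, it uses simple first-match ordering, and the real GCE URL map has its own matching rules.

```python
def match_backend(rules, host, path, default_backend):
    """Pick a backend for a request; rules is a list of (host, path, backend).

    A rule host of "*" matches any hostname. A rule path ending in "/*"
    matches by prefix; any other rule path must match exactly.
    """
    for rule_host, rule_path, backend in rules:
        if rule_host not in ("*", host):
            continue
        if rule_path.endswith("/*"):
            # "/fs/*" becomes the prefix "/fs/".
            if path.startswith(rule_path[:-1]):
                return backend
        elif path == rule_path:
            return backend
    # Nothing matched: fall through to the default backend.
    return default_backend


if __name__ == "__main__":
    rules = [("*", "/hostless", "test:80")]
    print(match_backend(rules, "foo.bar.com", "/hostless", "glbc-default"))
    print(match_backend(rules, "foo.bar.com", "/other", "glbc-default"))
```

Both the host and the path have to match before a rule's backend is chosen, which mirrors the description of lines 8-9 above.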

Load Balancer Management

You can manage a GCE L7 by creating, updating, or deleting the associated Kubernetes Ingress.

Creation

Before you can start creating Ingresses, you need to start up GLBC. We can use examples/deployment/gce-ingress-controller.yaml:

$ kubectl create -f examples/deployment/gce-ingress-controller.yaml
replicationcontroller "glbc" created
$ kubectl get pods
NAME                READY     STATUS    RESTARTS   AGE
glbc-6m6b6          2/2       Running   0          21s

A couple of things to note about this controller:

  • It has an intentionally long terminationGracePeriod. This is only required with the --delete-all-on-quit flag (see Deletion).
  • Don't start 2 instances of the controller in a single cluster; they will fight each other.

The loadbalancer controller will watch for Services, Nodes and Ingress. Nodes already exist (the nodes in your cluster). We need to create the other 2. For example, create the Service with examples/multi-path/svc.yaml and the Ingress with examples/multi-path/gce-multi-path-ingress.yaml.

A couple of things to note about the Service:

  • It creates a Replication Controller for a simple "echoserver" application, with 1 replica.
  • It creates 2 services for the same application pod: echoheaders[x, y]

Something to note about the Ingress:

  • It creates an Ingress with 2 hostnames and 3 endpoints (foo.bar.com{/foo} and bar.baz.com{/foo, /bar}) that access the given service

$ kubectl create -f examples/multi-path/svc.yaml -f examples/multi-path/gce-multi-path-ingress.yaml
$ kubectl get svc
NAME                 CLUSTER_IP     EXTERNAL_IP   PORT(S)   SELECTOR          AGE
echoheadersx         10.0.126.10    nodes         80/TCP    app=echoheaders   16m
echoheadersy         10.0.134.238   nodes         80/TCP    app=echoheaders   16m
kubernetes           10.0.0.1       <none>        443/TCP   <none>            21h

$ kubectl get ing
NAME      RULE          BACKEND                 ADDRESS
echomap   -             echoheadersx:80
          foo.bar.com
          /foo          echoheadersx:80
          bar.baz.com
          /bar          echoheadersy:80
          /foo          echoheadersx:80

You can tail the logs of the controller to observe its progress:

$ kubectl logs --follow glbc-6m6b6 l7-lb-controller
I1005 22:11:26.731845       1 instances.go:48] Creating instance group k8-ig-foo
I1005 22:11:34.360689       1 controller.go:152] Created new loadbalancer controller
I1005 22:11:34.360737       1 controller.go:172] Starting loadbalancer controller
I1005 22:11:34.380757       1 controller.go:206] Syncing default/echomap
I1005 22:11:34.380763       1 loadbalancer.go:134] Syncing loadbalancers [default/echomap]
I1005 22:11:34.380810       1 loadbalancer.go:100] Creating l7 default-echomap
I1005 22:11:34.385161       1 utils.go:83] Syncing e2e-test-beeps-minion-ugv1
...

When it's done, it will update the status of the Ingress with the IP of the L7 it created:

$ kubectl get ing
NAME      RULE          BACKEND                 ADDRESS
echomap   -             echoheadersdefault:80   107.178.254.239
          foo.bar.com
          /foo          echoheadersx:80
          bar.baz.com
          /bar          echoheadersy:80
          /foo          echoheadersx:80

Go to your GCE console and confirm that the following resources have been created through the HTTPLoadbalancing panel:

  • Global Forwarding Rule
  • URL Map
  • TargetHTTPProxy
  • Backend Services (one for each Kubernetes NodePort service)
  • An Instance Group (with ports corresponding to the Backend Services)

The HTTPLoadBalancing panel will also show you whether your backends have responded to the health checks; wait until they do. This can take a few minutes. If you see "Health status will display here once configuration is complete.", the L7 is still bootstrapping. Wait until you see "Healthy instances: X". Even though the GCE L7 is driven by our controller, which notices the Kubernetes health checks of a pod, we still need to wait for the first GCE L7 health check to complete. Once your backends are up and healthy:

$ curl --resolve foo.bar.com:80:107.178.254.239 http://foo.bar.com/foo
CLIENT VALUES:
client_address=('10.240.29.196', 56401) (10.240.29.196)
command=GET
path=/echoheadersx
real path=/echoheadersx
query=
request_version=HTTP/1.1

SERVER VALUES:
server_version=BaseHTTP/0.6
sys_version=Python/3.4.3
protocol_version=HTTP/1.0

HEADERS RECEIVED:
Accept=*/*
Connection=Keep-Alive
Host=107.178.254.239
User-Agent=curl/7.35.0
Via=1.1 google
X-Forwarded-For=216.239.45.73, 107.178.254.239
X-Forwarded-Proto=http

You can also edit /etc/hosts instead of using --resolve.

Updates

Say you don't want a default backend and you'd like to allow all traffic hitting your loadbalancer at /foo to reach your echoheaders backend service, not just the traffic for foo.bar.com. You can modify the Ingress Spec:

spec:
  rules:
  - http:
      paths:
      - path: /foo
..

and replace the existing Ingress:

$ kubectl replace -f examples/multi-path/gce-multi-path-ingress.yaml
ingress "echomap" replaced

$ curl http://107.178.254.239/foo
CLIENT VALUES:
client_address=('10.240.143.179', 59546) (10.240.143.179)
command=GET
path=/foo
real path=/foo
...

$ curl http://107.178.254.239/
<pre>
INTRODUCTION
============
This is an nginx webserver for simple loadbalancer testing. It works well
for me but it might not have some of the features you want. If you would
...

A couple of things to note about this particular update:

  • An Ingress without a default backend inherits the backend of the Ingress controller.
  • An IngressRule without a host gets the wildcard. This is controller specific; some loadbalancer controllers do not respect anything but a DNS subdomain as the host. You cannot set the host to a regular expression.
  • You never want to delete then re-create an Ingress, as it will result in the controller tearing down and recreating the loadbalancer.

Paths

Until now, our examples were simplified in that they hit an endpoint with a catch-all path regular expression. Most real-world backends have sub-resources. Let's create a service to test how the loadbalancer handles paths:

apiVersion: v1
kind: ReplicationController
metadata:
  name: nginxtest
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: nginxtest
    spec:
      containers:
      - name: nginxtest
        image: bprashanth/nginxtest:1.0
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginxtest
  labels:
    app: nginxtest
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    name: http
  selector:
    app: nginxtest

Running kubectl create against this manifest will give you a service with multiple endpoints:

$ kubectl get svc nginxtest -o yaml | grep -i nodeport:
    nodePort: 30404
$ curl nodeip:30404/
ENDPOINTS
=========
 <a href="hostname">hostname</a>: An endpoint to query the hostname.
 <a href="stress">stress</a>: An endpoint to stress the host.
 <a href="fs/index.html">fs</a>: A file system for static content.

You can put the nodeip:port into your browser and play around with the endpoints so you're familiar with what to expect. We will test the /hostname and /fs/files/nginx.html endpoints. Modify/create your Ingress:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginxtest-ingress
spec:
  rules:
  - http:
      paths:
      - path: /hostname
        backend:
          serviceName: nginxtest
          servicePort: 80

And check the endpoint (you will have to wait until the update takes effect, which could take a few minutes):

$ kubectl replace -f ingress.yaml
$ curl loadbalancerip/hostname
nginx-tester-pod-name

Note what just happened: the endpoint exposes /hostname, and the loadbalancer forwarded the entire matching URL to the endpoint. This means that if you had put the service behind a /foo prefix in the Ingress and tried to reach its hostname endpoint through it, your endpoint would have received /foo/hostname and not known how to route it. Now update the Ingress to access static content via the /fs endpoint:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginxtest-ingress
spec:
  rules:
  - http:
      paths:
      - path: /fs/*
        backend:
          serviceName: nginxtest
          servicePort: 80

As before, wait a while for the update to take effect, and try accessing loadbalancerip/fs/files/nginx.html.

Deletion

Deleting a loadbalancer controller pod will not affect the loadbalancers themselves. This way, your backends won't suffer a loss of availability if the scheduler pre-empts your controller pod. Deleting a single loadbalancer is as easy as deleting an Ingress via kubectl:

$ kubectl delete ing echomap
$ kubectl logs --follow glbc-6m6b6 l7-lb-controller
I1007 00:25:45.099429       1 loadbalancer.go:144] Deleting lb default-echomap
I1007 00:25:45.099432       1 loadbalancer.go:437] Deleting global forwarding rule k8-fw-default-echomap
I1007 00:25:54.885823       1 loadbalancer.go:444] Deleting target proxy k8-tp-default-echomap
I1007 00:25:58.446941       1 loadbalancer.go:451] Deleting url map k8-um-default-echomap
I1007 00:26:02.043065       1 backends.go:176] Deleting backends []
I1007 00:26:02.043188       1 backends.go:134] Deleting backend k8-be-30301
I1007 00:26:05.591140       1 backends.go:134] Deleting backend k8-be-30284
I1007 00:26:09.159016       1 controller.go:232] Finished syncing default/echomap
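
The resource names in the log above follow a visible pattern: a short prefix per resource kind, then the namespace and the Ingress name, with backends keyed by NodePort. The helper below only reproduces that observed pattern. The function names are hypothetical, and it ignores any length limits or extra suffixes a real controller may apply.

```python
# Prefixes copied from the controller log lines in this document.
# The helpers themselves are hypothetical and for illustration only.
PREFIXES = {
    "forwarding_rule": "k8-fw",
    "target_proxy": "k8-tp",
    "url_map": "k8-um",
}


def lb_resource_name(kind, namespace, name):
    """Build a name such as k8-um-default-echomap."""
    return f"{PREFIXES[kind]}-{namespace}-{name}"


def backend_name(node_port):
    """Build a backend name such as k8-be-30301."""
    return f"k8-be-{node_port}"


if __name__ == "__main__":
    for kind in PREFIXES:
        print(lb_resource_name(kind, "default", "echomap"))
    print(backend_name(30301))
```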

Note that it takes ~30 seconds per ingress to purge cloud resources. This may not be a sufficient cleanup because you might have deleted the Ingress while GLBC was down, in which case it would leak cloud resources. You can delete the GLBC and purge cloud resources in two more ways:

The dev/test way: If you want to delete everything in the cloud when the loadbalancer controller pod dies, start it with the --delete-all-on-quit flag. When a pod is killed, it's first sent a SIGTERM, followed by a grace period (set to 10 minutes for loadbalancer controllers), followed by a SIGKILL. The controller pod uses this time to delete cloud resources. Be careful with --delete-all-on-quit: if you're running a production glbc and the scheduler re-schedules your pod for some reason, it will result in a loss of availability. You can do this because your rc.yaml has:

args:
# auto quit requires a high termination grace period.
- --delete-all-on-quit=true

So simply delete the replication controller:

$ kubectl get rc glbc
CONTROLLER   CONTAINER(S)           IMAGE(S)                                      SELECTOR                    REPLICAS   AGE
glbc         default-http-backend   gcr.io/google_containers/defaultbackend:1.0   k8s-app=glbc,version=v0.5   1          2m
             l7-lb-controller       gcr.io/google_containers/glbc:0.9.7

$ kubectl delete rc glbc
replicationcontroller "glbc" deleted

$ kubectl get pods
NAME                    READY     STATUS        RESTARTS   AGE
glbc-6m6b6              1/1       Terminating   0          13m

The prod way: If you didn't start the controller with --delete-all-on-quit, you can execute a GET on the /delete-all-and-quit endpoint. This endpoint is deliberately not exported.

$ kubectl exec -it glbc-6m6b6  -- wget -q -O- http://localhost:8081/delete-all-and-quit
..Hangs till quit is done..

$ kubectl logs glbc-6m6b6  --follow
I1007 00:26:09.159016       1 controller.go:232] Finished syncing default/echomap
I1007 00:29:30.321419       1 controller.go:192] Shutting down controller queues.
I1007 00:29:30.321970       1 controller.go:199] Shutting down cluster manager.
I1007 00:29:30.321574       1 controller.go:178] Shutting down Loadbalancer Controller
I1007 00:29:30.322378       1 main.go:160] Handled quit, awaiting pod deletion.
I1007 00:29:30.321977       1 loadbalancer.go:154] Creating loadbalancers []
I1007 00:29:30.322617       1 loadbalancer.go:192] Loadbalancer pool shutdown.
I1007 00:29:30.322622       1 backends.go:176] Deleting backends []
I1007 00:30:00.322528       1 main.go:160] Handled quit, awaiting pod deletion.
I1007 00:30:30.322751       1 main.go:160] Handled quit, awaiting pod deletion

You just instructed the loadbalancer controller to quit. However, if it had actually exited, the replication controller would have just created another pod, so it waits around until you delete the rc.

Health checks

Currently, all service backends must satisfy either of the following requirements to pass the HTTP(S) health checks sent to it from the GCE loadbalancer:

  1. Respond with a 200 on '/'. The content does not matter.
  2. Expose an arbitrary URL as a readiness probe on the pods backing the Service.

The Ingress controller looks for a compatible readiness probe first. If it finds one, it adopts it as the GCE loadbalancer's HTTP(S) health check. If there's no readiness probe, or the readiness probe requires special HTTP headers, the Ingress controller points the GCE loadbalancer's HTTP health check at '/'. An example of an Ingress that adopts the readiness probe from the endpoints as its health check is available in the project's examples.
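
The selection rule just described can be written down as a small decision function. This is a sketch of the described behaviour, not the controller's implementation: `choose_health_check_path` is a hypothetical name, and the probe is assumed to be a dict shaped like a pod's httpGet readiness probe.

```python
def choose_health_check_path(readiness_probe):
    """Return the HTTP path the GCE health check should request.

    readiness_probe is None or a dict shaped like an httpGet probe,
    for example {"path": "/healthz", "httpHeaders": []}.
    """
    if readiness_probe is None:
        # No readiness probe: fall back to '/'.
        return "/"
    if readiness_probe.get("httpHeaders"):
        # Probes that need special headers are treated as incompatible.
        return "/"
    return readiness_probe.get("path") or "/"


if __name__ == "__main__":
    print(choose_health_check_path({"path": "/healthz"}))
    print(choose_health_check_path(None))
```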

Frontend HTTPS

For encrypted communication between the client and the load balancer, you need to specify a TLS private key and certificate to be used by the ingress controller.

As of version 1.1, GLBC supports (as a beta feature) using more than one SSL certificate in a single Ingress for request termination (aka Multiple-TLS). Keep in mind that GCP limits this to 10 certificates. Take a look at GCP's documentation on SSL certificates for more information on how they are supported in L7 load balancing.

The Ingress controller can read the private key and certificate from 2 sources:

  • A Kubernetes Secret
  • A GCP SSL certificate

Currently the Ingress only supports a single TLS port, 443, and assumes TLS termination.

Secret

For the ingress controller to use the certificate and private key stored in a Kubernetes secret, the user needs to specify the secret name in the TLS configuration section of their ingress spec. The secret is assumed to exist in the same namespace as the ingress.

The TLS secret must contain keys named tls.crt and tls.key that contain the certificate and private key to use for TLS, eg:

$ kubectl create secret tls testsecret --key /tmp/tls.key --cert /tmp/tls.crt

Alternatively, define the secret in a manifest:

apiVersion: v1
kind: Secret
metadata:
  name: testsecret
  namespace: default
type: Opaque
data:
  tls.crt: base64 encoded cert
  tls.key: base64 encoded key

Referencing this secret in an Ingress will tell the Ingress controller to secure the channel from the client to the loadbalancer using TLS.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: no-rules-map
spec:
  tls:
  - secretName: testsecret
  backend:
    serviceName: s1
    servicePort: 80

This creates 2 GCE forwarding rules that use a single static IP. Both :80 and :443 will direct traffic to your backend, which serves HTTP requests on the target port mentioned in the Service associated with the Ingress.

Specifying multiple secrets can be done as follows:

apiVersion: extensions/v1beta1
kind: Ingress
spec:
  tls:
  - secretName: svc1-certificate
  - secretName: svc2-certificate
  backend:
    serviceName: svc1
    servicePort: svc1-port
  rules:
  - host: svc1.example.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: svc1
          servicePort: svc1-port
  - host: svc2.example.com
    http:
      paths:
      - path: /*
        backend:
          serviceName: svc2
          servicePort: svc2-port

In this example, ideally svc1-certificate will contain the hostname svc1.example.com and svc2-certificate will contain the hostname svc2.example.com. Therefore, when a client request indicates a hostname of svc1.example.com, the certificate contained in secret svc1-certificate will be served.

Keep in mind that if you downgrade to a version that does not support Multiple-TLS (< 1.1), then you will need to manually clean up the created certificates in GCP.

GCP SSL Cert

For the ingress controller to use the certificate and private key stored in a GCP SSL cert, the user needs to specify the SSL cert name using the ingress.gcp.kubernetes.io/pre-shared-cert annotation. The certificate in this case is managed by the user, and it is their responsibility to create and delete it. The Ingress controller assigns the SSL certificate with this name to the target proxies of the Ingress.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: no-rules-map
  annotations:
      ingress.gcp.kubernetes.io/pre-shared-cert: 'my-certificate'
spec:
...

Multiple pre-shared certs can be specified as follows:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: no-rules-map
  annotations:
      ingress.gcp.kubernetes.io/pre-shared-cert: "my-certificate-1, my-certificate-2, my-certificate-3"
spec:
...

It is important to point out that certificates specified via the annotation take precedence over certificates specified via the secret. In other words, if both methods are used, the certificates specified via the annotation will be used while the ones specified via the secret are ignored.
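
The precedence rule can be sketched as follows. The annotation key is the one named above; `certificates_for` and its return shape are hypothetical, and the comma-splitting simply follows the multi-cert example.

```python
PRE_SHARED_CERT_KEY = "ingress.gcp.kubernetes.io/pre-shared-cert"


def certificates_for(annotations, tls_secret_names):
    """Return (source, names) for the certificates an Ingress would use.

    Pre-shared certs named in the annotation win; secrets are used only
    when the annotation is absent or empty.
    """
    raw = annotations.get(PRE_SHARED_CERT_KEY, "")
    pre_shared = [name.strip() for name in raw.split(",") if name.strip()]
    if pre_shared:
        return "pre-shared", pre_shared
    return "secret", list(tls_secret_names)


if __name__ == "__main__":
    annotations = {PRE_SHARED_CERT_KEY: "my-certificate-1, my-certificate-2"}
    print(certificates_for(annotations, ["testsecret"]))
    print(certificates_for({}, ["testsecret"]))
```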

Ingress cannot redirect HTTP to HTTPS

The GCP HTTP Load Balancer does not support redirect rules. Your application must perform the redirection. With an nginx server, this is as simple as adding the following lines to your config:

# Replace '_' with your hostname.
server_name _;
if ($http_x_forwarded_proto = "http") {
    return 301 https://$host$request_uri;
}
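
For a backend that is not fronted by nginx, the same check can be made in application code. The function below is a framework-neutral sketch: `https_redirect` is a hypothetical helper, and it assumes the request headers are available as a plain dict keyed by the canonical header name.

```python
def https_redirect(headers, host, request_uri):
    """Return (status, location) for plain-HTTP requests, else None.

    The load balancer sets X-Forwarded-Proto to the scheme the client used.
    """
    if headers.get("X-Forwarded-Proto", "").lower() == "http":
        return 301, f"https://{host}{request_uri}"
    return None


if __name__ == "__main__":
    print(https_redirect({"X-Forwarded-Proto": "http"}, "foo.bar.com", "/foo?x=1"))
    print(https_redirect({"X-Forwarded-Proto": "https"}, "foo.bar.com", "/foo"))
```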

Blocking HTTP

You can block traffic on :80 through an annotation. You might want to do this if all your clients are only going to hit the loadbalancer through HTTPS and you don't want to waste the extra GCE forwarding rule, eg:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: test
  annotations:
    kubernetes.io/ingress.allow-http: "false"
...

And curling :80 should just 404:

$ curl 130.211.10.121
...
  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>
  <p><b>404.</b> <ins>That’s an error.</ins>

$ curl https://130.211.10.121 -k
...
SERVER VALUES:
server_version=nginx: 1.9.11 - lua: 10001

Backend HTTPS

For encrypted communication between the load balancer and your Kubernetes service, you need to decorate the service's port as expecting HTTPS. There's an alpha Service annotation for specifying the expected protocol per service port. Upon seeing the protocol as HTTPS, the ingress controller will assemble a GCP L7 load balancer with an HTTPS backend-service with an HTTPS health check.

The annotation value is a JSON map of port-name to "HTTPS" or "HTTP". If you do not specify the port, "HTTP" is assumed.

apiVersion: v1
kind: Service
metadata:
  name: my-echo-svc
  annotations:
      service.alpha.kubernetes.io/app-protocols: '{"my-https-port":"HTTPS"}'
  labels:
    app: echo
spec:
  type: NodePort
  ports:
  - port: 443
    protocol: TCP
    name: my-https-port
  selector:
    app: echo
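
Reading that annotation amounts to parsing a JSON map and defaulting to "HTTP". A minimal sketch, where the annotation key comes from the example above and `protocol_for_port` is a hypothetical helper:

```python
import json

APP_PROTOCOLS_KEY = "service.alpha.kubernetes.io/app-protocols"


def protocol_for_port(annotations, port_name):
    """Return "HTTPS" or "HTTP" for a named Service port.

    Ports missing from the annotation (or Services without it) default to HTTP.
    """
    raw = annotations.get(APP_PROTOCOLS_KEY)
    protocols = json.loads(raw) if raw else {}
    return protocols.get(port_name, "HTTP")


if __name__ == "__main__":
    annotations = {APP_PROTOCOLS_KEY: '{"my-https-port":"HTTPS"}'}
    print(protocol_for_port(annotations, "my-https-port"))
    print(protocol_for_port(annotations, "some-other-port"))
```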

Troubleshooting:

This controller is complicated because it exposes a tangled set of external resources as a single logical abstraction. It's recommended that you are at least aware of how one creates a GCE L7 without a kubernetes Ingress. If weird things happen, here are some basic debugging guidelines:

  • Check loadbalancer controller pod logs via kubectl. A typical sign of trouble is repeated retries in the logs:
I1006 18:58:53.451869       1 loadbalancer.go:268] Forwarding rule k8-fw-default-echomap already exists
I1006 18:58:53.451955       1 backends.go:162] Syncing backends [30301 30284 30301]
I1006 18:58:53.451998       1 backends.go:134] Deleting backend k8-be-30302
E1006 18:58:57.029253       1 utils.go:71] Requeuing default/echomap, err googleapi: Error 400: The backendService resource 'projects/kubernetesdev/global/backendServices/k8-be-30302' is already being used by 'projects/kubernetesdev/global/urlMaps/k8-um-default-echomap'
I1006 18:58:57.029336       1 utils.go:83] Syncing default/echomap

This could be a bug or a quota limitation. In the case of the former, please head over to Slack or GitHub.

  • If you see a GET hanging, followed by a 502 with the following response:
<html><head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>502 Server Error</title>
</head>
<body text=#000000 bgcolor=#ffffff>
<h1>Error: Server Error</h1>
<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>
<h2></h2>
</body></html>

The loadbalancer is probably bootstrapping itself.

  • If a GET responds with a 404 and the following response:
  <a href=//www.google.com/><span id=logo aria-label=Google></span></a>
  <p><b>404.</b> <ins>That’s an error.</ins>
  <p>The requested URL <code>/hostless</code> was not found on this server.  <ins>That’s all we know.</ins>

It means you have lost your IP somehow, or just typed in the wrong IP.

  • If you see requests taking an abnormal amount of time, run the echoheaders pod and look for the client address
CLIENT VALUES:
client_address=('10.240.29.196', 56401) (10.240.29.196)

Then head over to the GCE node with internal IP 10.240.29.196 and check that the Service is functioning as expected. Remember that the GCE L7 is routing you through the NodePort service, and try to trace back.

  • Check if you can access the backend service directly via nodeip:nodeport
  • Check the GCE console
  • Make sure you only have a single loadbalancer controller running
  • Make sure the initial GCE health checks have passed
  • A crash loop looks like:
$ kubectl get pods
glbc-fjtlq             0/1       CrashLoopBackOff   17         1h

If you hit that, it means the controller isn't even starting. Re-check your input flags, especially the required ones.

GLBC Implementation Details

For the curious, here is a high level overview of how the GCE LoadBalancer controller manages cloud resources.

The controller manages cloud resources through a notion of pools. Each pool is the representation of the last known state of a logical cloud resource. Pools are periodically synced with the desired state, as reflected by the Kubernetes api. When you create a new Ingress, the following happens:

  • Updates instance groups to reflect all nodes in the cluster.
  • Creates Backend Service for each Kubernetes service referenced in the ingress spec.
  • Adds named-port for each Backend Service to each instance group.
  • Creates a URL Map, TargetHttpProxy, and ForwardingRule.
  • Updates the URL Map according to the Ingress.

Periodically, each pool checks that it has a valid connection to the next hop in the above resource graph. So for example, the backend pool will check that each backend is connected to the instance group and that the node ports match, the instance group will check that all the Kubernetes nodes are a part of the instance group, and so on. Since Backend Services are a limited resource, they're shared (well, everything is limited by your quota, this applies doubly to Backend Services). This means you can set up N Ingresses exposing M services through different paths and the controller will only create M backends. When all the Ingresses are deleted, the backend pool GCs the backend.
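
The sharing and garbage-collection behaviour can be modelled with a small reference-tracking pool. This is a toy model under stated assumptions: `BackendPool` and its methods are hypothetical, backends are keyed by NodePort as in the k8-be-<port> names seen in the logs, and an Ingress that stops referencing a port is not handled.

```python
class BackendPool:
    """Toy model: one backend per NodePort, shared by every Ingress using it."""

    def __init__(self):
        self._users = {}  # node port -> set of Ingress names

    def sync(self, ingress, node_ports):
        """Record that an Ingress references the given node ports."""
        for port in node_ports:
            self._users.setdefault(port, set()).add(ingress)

    def delete_ingress(self, ingress):
        """Drop an Ingress; return the node ports whose backends were GC'd."""
        collected = []
        for port in sorted(self._users):
            self._users[port].discard(ingress)
            if not self._users[port]:
                collected.append(port)
        for port in collected:
            del self._users[port]
        return collected

    def backends(self):
        """Return the names of the backends that currently exist."""
        return sorted(f"k8-be-{port}" for port in self._users)


if __name__ == "__main__":
    pool = BackendPool()
    pool.sync("default/a", [30301, 30284])
    pool.sync("default/b", [30301])
    print(pool.backends())
    print(pool.delete_ingress("default/a"))
    print(pool.delete_ingress("default/b"))
```

Two Ingresses that reference the same NodePort share a single backend, and that backend is only collected once the last Ingress using it is gone.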

Wish list:

  • More E2e, integration tests
  • Better events
  • Detect leaked resources even if the Ingress has been deleted when the controller isn't around
  • Specify health checks (currently we just rely on kubernetes service/pod liveness probes and force pods to have a / endpoint that responds with 200 for GCE)
  • Async pool management of backends/L7s etc
  • Retry back-off when GCE quota is exhausted
  • GCE Quota integration
