flannel is a network fabric for containers, designed for Kubernetes

Primary language: Go. License: Apache License 2.0 (Apache-2.0).

flannel

flannel is a virtual network that gives a subnet to each host for use with container runtimes.

Platforms like Kubernetes assume that each container (pod) has a unique, routable IP inside the cluster. The advantage of this model is that it reduces the complexity of doing port mapping.

Theory of Operation

flannel runs an agent, flanneld, on each host. The agent is responsible for allocating a subnet lease out of a preconfigured address space. flannel uses etcd to store the network configuration, allocated subnets, and auxiliary data (such as the host's IP). Packets are forwarded using one of several strategies known as backends. The simplest backend is udp, which uses a TUN device to encapsulate every IP fragment in a UDP packet, forming an overlay network. The following diagram demonstrates the path a packet takes as it traverses the overlay network:

[Diagram: Life of a packet]
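
As a rough illustration of how subnet leases drive forwarding, the sketch below maps a destination container IP to the host that holds the lease for its subnet. The lease table, the host addresses, and the function name host_for are hypothetical and not taken from flannel's source; it only shows the lookup idea, not how any particular backend moves packets.

```python
import ipaddress

# Hypothetical lease table: per-host subnet -> public IP of the host holding it.
LEASES = {
    ipaddress.ip_network("10.1.15.0/24"): "192.168.0.100",
    ipaddress.ip_network("10.1.20.0/24"): "192.168.0.200",
}


def host_for(container_ip, leases=LEASES):
    """Return the public IP of the host whose leased subnet contains container_ip.

    Returns None when no lease covers the address.
    """
    addr = ipaddress.ip_address(container_ip)
    for subnet, host_ip in leases.items():
        if addr in subnet:
            return host_ip
    return None


print(host_for("10.1.20.3"))
```

A packet addressed to 10.1.20.3 would, under these assumed leases, be sent towards the host at 192.168.0.200.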

Building flannel

The most reliable way to build flannel is by using Docker.

Building in a Docker container

To build flannel in a container, run make dist/flanneld-amd64. You will then have a flanneld-amd64 binary in the dist directory.

Building manually

  • Step 1: Make sure you have the required dependencies installed on your machine.
    • On Ubuntu, run sudo apt-get install linux-libc-dev golang gcc.
    • On Fedora/Redhat, run sudo yum install kernel-headers golang gcc.
  • Step 2: Git clone the flannel repo. It MUST be placed in your GOPATH under github.com/coreos/flannel: cd $GOPATH/src; git clone https://github.com/coreos/flannel.git
  • Step 3: Run the build script, ensuring that CGO_ENABLED=1: cd flannel; CGO_ENABLED=1 make dist/flanneld

Configuration

flannel reads its configuration from etcd. By default, it reads the configuration from /coreos.com/network/config (this can be overridden via --etcd-prefix). You can use the etcdctl utility to set values in etcd. The value of the config is a JSON dictionary with the following keys:

  • Network (string): IPv4 network in CIDR format to use for the entire flannel network. This is the only mandatory key.

  • SubnetLen (integer): The size of the subnet allocated to each host. Defaults to 24 (i.e. /24) unless the Network was configured to be smaller than a /24 in which case it is one less than the network.

  • SubnetMin (string): The beginning of the IP range from which subnet allocation should start. Defaults to the first subnet of Network.

  • SubnetMax (string): The end of the IP range at which subnet allocation should end. Defaults to the last subnet of Network.

  • Backend (dictionary): Type of backend to use and specific configurations for that backend. The list of available backends and the keys that can be put into this dictionary are listed below. Defaults to the "udp" backend.

The lease on a subnet is hard-coded to 24h (see subnetTTL). Subnet leases are renewed within 1h of their expiration (this can be overridden via --subnet-lease-renew-margin).
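
The lease timing can be worked through with a small calculation. This is a minimal sketch that assumes the renewal window simply opens one margin before expiry; the function name renewal_time is hypothetical, and flanneld's actual scheduling may differ in detail.

```python
from datetime import datetime, timedelta

SUBNET_TTL = timedelta(hours=24)  # lease duration stated above


def renewal_time(acquired_at, renew_margin_minutes=60):
    """Return the moment at which a lease becomes due for renewal.

    renew_margin_minutes mirrors --subnet-lease-renew-margin (default 60).
    """
    expires_at = acquired_at + SUBNET_TTL
    return expires_at - timedelta(minutes=renew_margin_minutes)


acquired = datetime(2016, 1, 1, 12, 0)
print(renewal_time(acquired))  # 2016-01-02 11:00:00
```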

Backends

  • udp: use UDP to encapsulate the packets.

    • Type (string): udp
    • Port (number): UDP port to use for sending encapsulated packets. Defaults to 8285.
  • vxlan: use in-kernel VXLAN to encapsulate the packets.

    • Type (string): vxlan
    • VNI (number): VXLAN Identifier (VNI) to be used. Defaults to 1.
    • Port (number): UDP port to use for sending encapsulated packets. Defaults to kernel default, currently 8472.
    • GBP (boolean): Enable VXLAN Group Based Policy. Defaults to false.
  • host-gw: create IP routes to subnets via remote machine IPs. Note that this requires direct layer2 connectivity between hosts running flannel.

    • Type (string): host-gw
  • aws-vpc: create IP routes in an Amazon VPC route table.

    • Requirements:
      • Running on an EC2 instance that is in an Amazon VPC.
      • Permissions required: CreateRoute, DeleteRoute, DescribeRouteTables, ModifyInstanceAttribute, DescribeInstances [optional]
    • Type (string): aws-vpc
    • RouteTableID (string): [optional] The ID of the VPC route table to add routes to. The route table must be in the same region as the EC2 instance that flannel is running on. flannel can automatically detect the ID of the route table if the optional DescribeInstances permission is granted to the EC2 instance.

    Authentication is handled via either environment variables or the node's IAM role. If the node has insufficient privileges to modify the VPC routing table specified, ensure that appropriate AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, and optionally AWS_SECURITY_TOKEN environment variables are set when running the flanneld process.

    Note: Currently, AWS limits the number of entries per route table to 50.

  • gce: create IP routes in a Google Compute Engine Network

    Command to create a compute instance with the correct permissions and IP forwarding enabled:
    $ gcloud compute instances create INSTANCE --can-ip-forward --scopes compute-rw

    Note: Currently, GCE limits the number of routes for every project to 100.

  • alloc: only perform subnet allocation (no forwarding of data packets).

    • Type (string): alloc
  • ali-vpc: create IP routes in an Alicloud VPC route table.

    • Requirements:
      • Running on an ECS instance that is in an Alicloud VPC.
      • Permissions required: accessid and keysecret
    • Type (string): ali-vpc
    • AccessKeyID (string): API access key ID. Can also be configured with the ACCESS_KEY_ID environment variable.
    • AccessKeySecret (string): API access key secret. Can also be configured with the ACCESS_KEY_SECRET environment variable.

    Note: Currently, AliVPC limits the number of entries per route table to 50.
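
The documented defaults for the simpler backends can be summarized in a short sketch. The helper name with_defaults and the table layout are hypothetical; the values are the ones listed above, and the vxlan port is the kernel default quoted there rather than something flannel fixes itself.

```python
# Defaults as listed in the backend descriptions above.
BACKEND_DEFAULTS = {
    "udp": {"Port": 8285},
    "vxlan": {"VNI": 1, "Port": 8472, "GBP": False},  # 8472 is the kernel default
    "host-gw": {},
    "alloc": {},
}


def with_defaults(backend=None):
    """Return a copy of a Backend dictionary with documented defaults filled in.

    A missing Backend falls back to the udp backend, as described above.
    """
    backend = dict(backend or {"Type": "udp"})
    backend_type = backend.get("Type", "udp")
    merged = {"Type": backend_type}
    merged.update(BACKEND_DEFAULTS.get(backend_type, {}))
    merged.update(backend)  # explicitly configured keys win over defaults
    return merged


print(with_defaults({"Type": "vxlan", "VNI": 7}))
```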

Example configuration JSON

The following configuration illustrates the use of most options with the udp backend.

{
	"Network": "10.0.0.0/8",
	"SubnetLen": 20,
	"SubnetMin": "10.10.0.0",
	"SubnetMax": "10.99.0.0",
	"Backend": {
		"Type": "udp",
		"Port": 7890
	}
}
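
To see what the example configuration above implies, the following sketch enumerates the per-host subnets it allows. It assumes that SubnetMin and SubnetMax are both inclusive and that subnets are aligned on SubnetLen boundaries; the function name candidate_subnets is hypothetical and this is not flannel's allocator.

```python
import ipaddress
import json

EXAMPLE_CONFIG = json.loads("""
{
    "Network": "10.0.0.0/8",
    "SubnetLen": 20,
    "SubnetMin": "10.10.0.0",
    "SubnetMax": "10.99.0.0",
    "Backend": {"Type": "udp", "Port": 7890}
}
""")


def candidate_subnets(cfg):
    """Yield each per-host subnet from SubnetMin to SubnetMax (assumed inclusive)."""
    network = ipaddress.ip_network(cfg["Network"])
    subnet_len = cfg.get("SubnetLen", 24)
    step = 2 ** (32 - subnet_len)  # IPv4 addresses per host subnet
    first = int(network.network_address)
    last = int(network.broadcast_address) - step + 1
    if "SubnetMin" in cfg:
        first = int(ipaddress.ip_address(cfg["SubnetMin"]))
    if "SubnetMax" in cfg:
        last = int(ipaddress.ip_address(cfg["SubnetMax"]))
    for base in range(first, last + 1, step):
        yield ipaddress.ip_network((base, subnet_len))


subnets = list(candidate_subnets(EXAMPLE_CONFIG))
print(len(subnets), subnets[0], subnets[-1])
```

Under these assumptions the example leaves room for 1425 hosts, each receiving a /20 between 10.10.0.0/20 and 10.99.0.0/20.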

Firewalls

When using the udp backend, flannel uses UDP port 8285 by default for sending encapsulated packets. When using the vxlan backend, the kernel uses UDP port 8472 by default for sending encapsulated packets. Make sure that your firewall rules allow this traffic for all hosts participating in the overlay network.

Running

Once you have pushed configuration JSON to etcd, you can start flanneld. If you published your config at the default location, you can start flanneld with no arguments. flannel will acquire a subnet lease, configure its routes based on other leases in the overlay network and start routing packets. Additionally it will monitor etcd for new members of the network and adjust the routes accordingly.

After flannel has acquired the subnet and configured the backend, it will write out an environment variable file (/run/flannel/subnet.env by default) with the subnet address and the MTU that it supports.
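
A program that needs these values can read the file as plain KEY=VALUE lines. The sketch below is one hedged way to do that; the sample contents are hypothetical (real values depend on the lease flannel acquired), and the variable names FLANNEL_SUBNET and FLANNEL_MTU are the ones used in the Docker integration section.

```python
def parse_subnet_env(text):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Hypothetical file contents, for illustration only.
SAMPLE = """\
FLANNEL_SUBNET=10.1.15.1/24
FLANNEL_MTU=1472
"""

env = parse_subnet_env(SAMPLE)
print(env["FLANNEL_SUBNET"], int(env["FLANNEL_MTU"]))
```

In practice the text would come from reading the file named by --subnet-file rather than from an inline string.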

Client/Server mode (EXPERIMENTAL)

Please see Documentation/client-server.md.

Multi-network mode (EXPERIMENTAL)

Multi-network mode allows a single flannel daemon to join multiple networks. Each network is independent of the others and has its own configuration, IP space, and interfaces. To configure three networks -- in this example named blue, green, and red -- start by publishing their configurations to etcd in different locations:

$ etcdctl set /coreos.com/network/blue/config  '{ "Network": "10.1.0.0/16", "Backend": { "Type": "vxlan", "VNI": 1 } }'
$ etcdctl set /coreos.com/network/green/config '{ "Network": "10.2.0.0/16", "Backend": { "Type": "vxlan", "VNI": 2 } }'
$ etcdctl set /coreos.com/network/red/config   '{ "Network": "10.3.0.0/16", "Backend": { "Type": "vxlan", "VNI": 3 } }'

Next, start the flannel daemon, specifying the networks to join:

$ flanneld --networks=blue,green,red

Instead of writing out a single /run/flannel/subnet.env file with flannel parameters, it will create a .env file for each network in the directory /run/flannel/networks:

$ ls /run/flannel/networks/
blue.env  green.env  red.env
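
The naming pattern in the walkthrough above can be written down as a small helper. This is only a sketch of the layout seen in the example commands: the function name network_paths is hypothetical, and it assumes the per-network key always sits directly under the etcd prefix.

```python
import posixpath


def network_paths(name, etcd_prefix="/coreos.com/network",
                  env_dir="/run/flannel/networks"):
    """Return (etcd config key, .env file path) for one named network."""
    return (
        posixpath.join(etcd_prefix, name, "config"),
        posixpath.join(env_dir, name + ".env"),
    )


for name in ("blue", "green", "red"):
    key, env_file = network_paths(name)
    print(key, env_file)
```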

Important: In multi-network mode, flannel will not notify systemd that it is ready upon initialization. This is because some networks may initialize slower than others (or never). Use systemd.path files for unit synchronization.

Note: Multi-network mode can work in conjunction with the client/server mode. The --networks flag is only passed to the client:

# Server daemon
$ flanneld --listen=0.0.0.0:8888

# Client daemon
$ flanneld --remote=10.0.0.3:8888 --networks=blue,green

Key command line options

--public-ip="": IP accessible by other nodes for inter-host communication. Defaults to the IP of the interface being used for communication.
--etcd-endpoints=http://127.0.0.1:4001: a comma-delimited list of etcd endpoints.
--etcd-prefix=/coreos.com/network: etcd prefix.
--etcd-keyfile="": SSL key file used to secure etcd communication.
--etcd-certfile="": SSL certificate file used to secure etcd communication.
--etcd-cafile="": SSL Certificate Authority file used to secure etcd communication.
--kube-subnet-mgr: Contact the Kubernetes API for subnet assignment instead of etcd or flannel-server.
--iface="": interface to use (IP or name) for inter-host communication. Defaults to the interface for the default route on the machine.
--subnet-file=/run/flannel/subnet.env: filename where env variables (subnet and MTU values) will be written to.
--subnet-lease-renew-margin=60: subnet lease renewal margin, in minutes.
--ip-masq=false: setup IP masquerade for traffic destined for outside the flannel network. Flannel assumes that the default policy is ACCEPT in the NAT POSTROUTING chain.
--listen="": if specified, will run in server mode. Value is IP and port (e.g. `0.0.0.0:8888`) to listen on or `fd://` for [socket activation](http://www.freedesktop.org/software/systemd/man/systemd.socket.html).
--remote="": if specified, will run in client mode. Value is IP and port of the server.
--remote-keyfile="": SSL key file used to secure client/server communication.
--remote-certfile="": SSL certificate file used to secure client/server communication.
--remote-cafile="": SSL Certificate Authority file used to secure client/server communication.
--networks="": if specified, will run in multi-network mode. Value is a comma-separated list of networks to join.
-v=0: log level for V logs. Set to 1 to see messages related to data path.
--version: print version and exit

Environment variables

The command line options outlined above can also be specified via environment variables. For example, --etcd-endpoints=http://10.0.0.2:2379 is equivalent to the FLANNELD_ETCD_ENDPOINTS=http://10.0.0.2:2379 environment variable. Any command line option can be turned into an environment variable by prefixing it with FLANNELD_, stripping leading dashes, converting to uppercase, and replacing all other dashes with underscores.
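
The naming rule can be expressed in a few lines. The helper name flag_to_env is hypothetical; it only applies the transformation described above and does not check that the flag actually exists.

```python
def flag_to_env(flag):
    """Convert a flanneld flag such as '--etcd-endpoints' to its environment variable name."""
    name = flag.lstrip("-")  # strip leading dashes
    return "FLANNELD_" + name.upper().replace("-", "_")


print(flag_to_env("--etcd-endpoints"))  # FLANNELD_ETCD_ENDPOINTS
```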

Zero-downtime restarts

When running with a backend other than udp, the kernel provides the data path and flanneld acts as the control plane. As such, flanneld can be restarted (even to do an upgrade) without disturbing existing flows. However, in the case of the vxlan backend, the restart needs to complete within a few seconds, because ARP entries can start to time out, requiring the flannel daemon to refresh them. Also, to avoid interruptions during restart, the configuration must not be changed (e.g. VNI, --iface values).

Docker integration

The Docker daemon accepts the --bip argument to configure the subnet of the docker0 bridge. It also accepts --mtu to set the MTU for docker0 and the veth devices that it creates. Since flannel writes the acquired subnet and MTU values into a file, the script starting Docker can source the values and pass them to the Docker daemon:

source /run/flannel/subnet.env
docker daemon --bip=${FLANNEL_SUBNET} --mtu=${FLANNEL_MTU} &

Systemd users can use the EnvironmentFile directive in the .service file to pull in /run/flannel/subnet.env.

CoreOS integration

CoreOS ships with flannel integrated into the distribution. See https://coreos.com/docs/cluster-management/setup/flannel-config/ for more information.

Running on Vagrant

Vagrant has a tendency to give the default interface (the one with the default route) a non-unique IP (often 10.0.2.15). This causes flannel to register multiple nodes with the same IP. To work around this issue, use the --iface option to specify the interface that has a unique IP. If you're running on CoreOS, use cloud-config to set coreos.flannel.interface to $public_ipv4.

Release Process

  • Create a release on GitHub and use it to create a tag
  • Check the tag out and run
    • make release
  • Attach all the files in dist to the GitHub release
  • Run make docker-push-all to push all the images to a registry