The microservice paradigm continues to gain popularity among developers of scalable and fault-tolerant systems.
Although such systems are usually deployed in the cloud, their building blocks still need to be deployed, developed and debugged locally on developers' workstations. The heavy use of containers and container orchestration frameworks such as Kubernetes and OpenShift drives the need to approximate the properties of the cloud infrastructure when making local deployments on developers' workstations.
Such an approach requires solving a whole set of problems: using reduced datasets, scaled-down configurations and simplified network topologies, and replacing certain building blocks that are part of the cloud provider infrastructure (such as load balancers) with open-source alternatives (e.g. Nginx or HAProxy), to name a few.
There are many well-known ways to do this. We at Alliedium use a combination of Docker and Minikube to replicate the container cloud infrastructure.
All this works for those building blocks that just need to be launched to create a sandbox environment for the one or two microservices a software developer is currently working on (changing or debugging their source code).
In such cases the developer usually runs the microservice component directly on the workstation, without any container isolation, and it is only natural to expect that this component will continue to communicate normally with the rest of the sandbox infrastructure still running inside multiple containers. This requires careful configuration of routing between the host machine and Docker/Minikube, including exposing certain ports from inside containers to the host machine (the workstation in our case).
Moreover, all software developers' workstations are usually placed in the same local subnet. Some of the building blocks of the system might use IP multicast for auto-discovery (as Apache Ignite nodes do in our case). In such cases it is important to make sure that local system deployments on each development workstation are isolated at the network level to prevent all sorts of unexpected behaviors and problems.
All of the challenges above become clearer if we use the Alliedium AIssistant Cloud app as an example.
The app is built as a plugin for Atlassian Jira Cloud and actively uses Apache Ignite (along with many other system components — see the notes of our talk at Apache Ignite Meetup about Boosting Jira Cloud App Development with Apache Ignite) as both a distributed database and as a computational grid.
In case we change or debug the source code of components communicating with Ignite, we have to expose all ports necessary for Apache Ignite connectivity from the containers running Ignite server nodes.
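For illustration, here is a minimal sketch (not taken from our actual deployment scripts) of how such ports could be published from a container to the host, assuming the stock apacheignite/ignite image and the default Ignite port assignments (47500-47509 for discovery, 47100-47109 for communication, 10800 for thin clients):
# Illustrative only: publish the default Ignite ports of a containerized server node
docker run -d --name ignite-server \
  -p 47500-47509:47500-47509 \
  -p 47100-47109:47100-47109 \
  -p 10800:10800 \
  apacheignite/ignite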
In this case Ignite client nodes, running both within the containers and outside of them on the host (the software development workstation), are able to connect to all other nodes through a properly configured discovery mechanism. Apache Ignite supports a few different node discovery mechanisms, but for the local deployment scenario the natural choice is TCP/IP Discovery, implemented through the DiscoverySpi interface.
The latter uses the TcpDiscoveryIpFinder interface, which allows finding the IP addresses of all other Ignite nodes to form the cluster. There are several implementations of the TcpDiscoveryIpFinder interface, but by default TcpDiscoveryMulticastIpFinder, based on multicast, is used.
And this is where we risk running into the problem of Ghost nodes:
This is one of the most common problems people encounter when launching a new Ignite node. You have just started a single node and discover that you already have two server Ignite nodes in your topology. Most often this happens if you are working on a home office network and one of your colleagues runs an Ignite server node at the same time. The fact is that, by default, Ignite uses the multicast protocol to discover and communicate with other nodes. During startup, Ignite searches for all other nodes that are in the same multicast group and located in the same subnetwork, and if it finds any, it tries to connect to them.
Certainly, there is a way to fix this by configuring static IP addresses, but we may want to retain the multicast IP finder. The first reason is to make our configuration more production-like; the second is to avoid writing down all the IP addresses and ports we are going to connect to. We expect that it should be possible to configure the deployment environment automatically, via scripts. Otherwise we would need to make a strong assumption (which frequently doesn't hold) that each member of the software development team is sufficiently qualified to set everything up correctly by hand. Finally, there may be similar issues with other components of the app backend apart from Ignite nodes and clients. Thus, we can formulate our problem as a) having to isolate different software development workstations from each other and b) providing just enough communication between all components deployed within each isolated workstation.
For this purpose we have created a simple command-line Python-based network isolation tool that automatically configures iptables and Linux kernel parameters to achieve the desired level of isolation. The tool is designed for Arch Linux and Manjaro Linux, as we heavily use the latter at Alliedium for software development (explaining the choice of Manjaro Linux is a topic for a separate article).
Our tool achieves the necessary network isolation by allowing all outbound traffic while disabling all inbound connections except for those we deliberately permit. In addition to that, the tool
- is flexible in configuration
- is safe to use (it makes a backup of the previous iptables configuration, displays warnings and asks for user confirmation)
- configures rules not only for the host itself, but also for Docker
- automatically persists all the changes
The tool is implemented in Python 3, which is already preinstalled in Manjaro, and uses the config_iptables.sh shell script as an entry point. When the script is run without input arguments it displays the following help and does nothing (to prevent inadvertent changes to system parameters):
usage: config_iptables.py [-h] [--profile-file PROFILE_FILE] [--noconfirm] [--nopersist] [--persist] --run
Configure network isolation of localhost by changing system parameters and modifying iptables rules
optional arguments:
-h, --help show this help message and exit
--profile-file PROFILE_FILE, -p PROFILE_FILE
Path to profile file (DEFAULT VALUE: default.profile),
this file should be simple text file with lines
having the following format:
<scope> <protocol> <port(s)>
here <scope> equals either to one of `localhost', `docker', `all',
<protocol> may be given by number or name (e.g. `tcp', `udp')
while <port(s)> contains either a single port or a port range
given as <start-port>:<end-port> or a comma-separated list of
ports and port ranges, but their number should not exceed 15
(see multiport module of iptables for details), each port is allowed
for external access in INPUT, DOCKER-USER chains or both of them
--noconfirm Modify system parameters and iptables rules without confirmation
--nopersist Modify system parameters and iptables rules by the way not
persisting between boots, may be used for validation,
if something goes wrong it is sufficient to reboot to
restore previous parameters
--persist Modify system parameters and iptables rules with persistence
between boots (in the case --noconfirm is passed while
--nopersist is not passed, the behavior is the same as if
--persist was given)
--run Should be given to proceed with modifications,
otherwise help is displayed, this is just for safety
config_iptables.py: error: the following argument is required: --run
This is done for safety reasons, to prevent the tool from being launched inadvertently.
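For example, a typical invocation (with the profile path given just for illustration) looks like this:
./config_iptables.sh --profile-file default.profile --run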
The argument --profile-file allows passing the path to a profile file with additional ports for which external inbound traffic is to be allowed (the single port that is allowed unconditionally is TCP/22 used for SSH; moreover, the SSH connection is additionally protected from attacks by a rate limit). Each line in this file should contain three space-separated values, namely scope, protocol and port(s). The first of them has three possible values:
- localhost — rules are applied only to the host, but not to Docker containers
- docker — rules are applied only to Docker containers, but not to the host itself
- all — rules are applied both to the host and to Docker containers
The value of protocol is usually tcp or udp (but in fact any protocol supported by iptables is possible, for instance, UDP-Lite). Finally, the third value, port(s), should be given, as can be seen from the help above, either as a single port (passed through the --destination-port argument of iptables) or as several ports (in the format of the --destination-ports argument of the multiport module).
The default profile is contained in the file default.profile with the following contents:
localhost tcp 5900
localhost udp 5900
all tcp 18000:18300
This means that port 5900 for VNC is opened for both TCP and UDP protocols, but only for the host and not for Docker containers, while a special port range used for our internal development needs is opened both for the host and for Docker.
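Roughly speaking, the localhost lines above correspond to ACCEPT rules in the INPUT chain, while the all line additionally produces an analogous rule in the DOCKER-USER chain (discussed below). A hedged sketch of the INPUT part (the exact rules generated by the tool may differ in details):
# Rough correspondence of the default profile to INPUT rules (illustrative only)
iptables -A INPUT -p tcp --destination-port 5900 -j ACCEPT
iptables -A INPUT -p udp --destination-port 5900 -j ACCEPT
iptables -A INPUT -p tcp -m multiport --destination-ports 18000:18300 -j ACCEPT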
The meaning of the other input parameters of the tool is clear from the help above. In any case, all the information on how to roll back the changes made is displayed in the log. If --noconfirm is not given, then the tool asks for confirmation before every step, displaying detailed information on what is to be changed and how to roll back these changes.
For example, the very first question is as follows:
!!!!!!!!!!!!!!!!!!!!!!!!!! ATTENTION !!!!!!!!!!!!!!!!!!!!!!!!!!
This tool should used very carefully because it modifies system
parameters and iptables, the next steps are in-memory only, so
that all the changes can be revoked by rebooting the computer
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Are you sure you want to continue? [y/N]:
When neither the --nopersist nor the --persist flag is passed and persistence between boots is to be achieved by the steps that follow, the question looks like this:
!!!!!!!!!!!!!!!!!!!!!!!!!! ATTENTION !!!!!!!!!!!!!!!!!!!!!!!!!!
The next step is to make the modified system parameters and
iptables rules persistent between boots, to do this created are
/etc/sysctl.d/999-net_isolation_123456789.conf setting
necessary system parameters (to rollback these changes you will
need just to delete this file and to reboot the computer) and
/etc/iptables/iptables.rules with all modified iptables rules,
old file /etc/iptables/iptables.rules will be copied to backup
file /etc/iptables/iptables.rules.backup (to rollback these
changes just replace new /etc/iptables/iptables.rules by
/etc/iptables/iptables.rules.backup and reboot the computer)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Are you sure you really want to make the above changes? [y/N]:
Thus, the tool keeps the user informed about the changes that are about to be made to the system. This gives the user a chance to prevent unwanted changes.
Finally, the tool checks some prerequisites before it starts to make changes. It checks, for example, that all the necessary tools are installed and that firewall services like firewalld are either not installed or at least deactivated. The tool assumes that nothing except the iptables service is running. This helps to avoid conflicts with other firewall tools.
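Similar checks can be performed manually; the commands below are hypothetical illustrations rather than the tool's exact checks:
# Make sure firewalld is not active and only the plain iptables service is enabled
systemctl is-active firewalld    # expected: inactive (or an error if not installed)
systemctl is-enabled iptables    # the iptables service should be the firewall in use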
When it comes to Linux kernel parameters, their values may be changed either by echoing into files in subfolders of the /proc/sys folder (which is not persistent between boots) or via sysctl. The latter tool not only allows changing system parameters at runtime, but is also able to automatically load configuration files with the .conf extension from certain system folders, namely /etc/sysctl.d, /run/sysctl.d, /usr/local/lib/sysctl.d and /usr/lib/sysctl.d, in order of precedence (see the manual for details).
In case the same parameter is set in more than one file, the one with the lexicographically latest name takes precedence. Thus, our tool scans the mentioned folders and automatically creates a configuration file whose name is lexicographically the latest among all configuration files, to make sure that the values of the system parameters are really set as the tool intends.
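As a hypothetical illustration (the exact file name and the parameters written by the tool differ), such a drop-in configuration file can be created and applied like this:
# The parameter below is purely illustrative, not necessarily one the tool sets
echo 'net.ipv4.icmp_echo_ignore_broadcasts = 1' | \
  sudo tee /etc/sysctl.d/999-net_isolation_example.conf
sudo sysctl --system    # reload all sysctl configuration files in precedence order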
Network traffic rules for the host are configured to allow all outbound traffic (through the OUTPUT chain of the filter table), while inbound traffic (configured through the INPUT chain of the filter table) is restricted (see also the documentation on iptables for details, especially the sections 'Basic concepts' and 'Chains').
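The resulting host-level rules are similar in spirit to the following simplified sketch; the actual rules are generated from the profile and include the SSH rate limit mentioned above:
# Simplified sketch of host-level filtering (illustrative, not the tool's exact rules)
iptables -P OUTPUT ACCEPT                                         # allow all outbound traffic
iptables -A INPUT -i lo -j ACCEPT                                 # allow loopback traffic
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT  # allow replies to outbound connections
iptables -A INPUT -p tcp --dport 22 -m state --state NEW -m limit --limit 3/min -j ACCEPT  # SSH, rate-limited
iptables -P INPUT DROP                                            # drop everything else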
Network traffic regulation for Docker is a bit more nuanced. According to the official Docker documentation, restricting inbound traffic to Docker containers should be done through the DOCKER-USER chain of the filter table; Docker uses this custom chain along with the built-in FORWARD chain of the filter table. See also the article Be careful, Docker might be exposing ports to the world. But there is a caveat described in one of the comments to that article: if we do all this exactly as stated in the official Docker documentation, thus restricting connections to Docker containers to the host only and configuring allowed ports analogously to what is done for the INPUT chain above, then Docker containers lose their connection to the Internet! To fix the problem it is necessary to add the following rule as the first one in the DOCKER-USER chain:
iptables -I DOCKER-USER -m state --state ESTABLISHED,RELATED -j ACCEPT
It is important to prohibit only the inbound connections that are in the NEW state, while accepting all connections in the ESTABLISHED and RELATED states. Here is what the official iptables documentation says:
- NEW — meaning that the packet has started a new connection, or otherwise associated with a connection which has not seen packets in both directions
- ESTABLISHED — meaning that the packet is associated with a connection which has seen packets in both directions
- RELATED — meaning that the packet is starting a new connection, but is associated with an existing connection, such as an FTP data transfer, or an ICMP error
This means that a Docker container can start a new connection, and iptables will allow the responses to arrive back into the container; that is, previously initiated and accepted exchanges bypass rule checking not only for the INPUT chain, but also for the DOCKER-USER chain.
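Putting it all together, the DOCKER-USER chain configured by the tool is similar in spirit to the following simplified sketch (the actual rules are generated from the profile, and the external interface name eth0 is purely illustrative):
# Simplified sketch of the DOCKER-USER chain (illustrative, not the tool's exact rules)
iptables -F DOCKER-USER
iptables -A DOCKER-USER -m state --state ESTABLISHED,RELATED -j ACCEPT                 # keep container-initiated connections working
iptables -A DOCKER-USER -p tcp -m multiport --destination-ports 18000:18300 -j ACCEPT  # ports allowed by the profile
iptables -A DOCKER-USER -i eth0 -m state --state NEW -j DROP                           # drop other new connections from the LAN
iptables -A DOCKER-USER -j RETURN                                                      # hand remaining packets back to Docker's rules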
To summarize, we have shown that Arch Network Isolator combines rich functionality and safety (in terms of both networking and being cautious when making changes to system configurations) while remaining very easy to use.
- Apache Ignite
- Bhuiyan Shamim, A Simple Checklist for Apache Ignite Beginners
- Arch Network Isolator Tool
- Fast Firewall Setup repo and video
- iptables in Arch Linux
- sysctl in Arch Linux
- Docker and iptables
- Geerling Jeff, Be careful, Docker might be exposing ports to the world
- iptables docs
- Alliedium AIssistant Cloud app