Linux Isolation Basics
Note: This is part one of a two-part series. Read part two.
In the complex world of modern app deployment solutions, containers have been gaining traction as a popular distribution method. But what are they, and why are people so excited about them? This two-part series will look into some of the benefits they offer.
First, we’ll look at how isolation is generally used to solve a whole class of problems. Next, we’ll look at how containers, specifically, make isolation more manageable. An intermediate familiarity with UNIX-like systems is assumed throughout.
User/Group Privilege Isolation
User and group access restrictions are one of the most basic forms of isolating what a particular application can do.
Users are often warned to avoid running applications as the admin user (in most cases root) as much as possible. To accomplish this, a program can drop its privileges to an unprivileged user at execution time through the setuid and setgid system calls.
For example, take this simple echo server in C, modified to bind to a port that requires root privileges:
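The listing itself is not reproduced in this text. As a rough stand-in, here is a minimal sketch of what such a server might look like. The function names make_listener and echo_until_eof are ours, chosen for illustration, and are not taken from the original code.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket listening on the given port.
 * Returns the listening descriptor, or -1 on failure.
 * Binding to a port below 1024 normally requires root. */
int make_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Allow quick restarts without waiting for old sockets to time out. */
    int yes = 1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Echo everything received on conn back to the sender until EOF.
 * Returns 0 on a clean EOF, -1 on a read or write error. */
int echo_until_eof(int conn)
{
    assert(conn >= 0);

    char buf[1024];
    ssize_t n;
    while ((n = read(conn, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(conn, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
    }
    return n < 0 ? -1 : 0;
}
```

A main function would call make_listener with a low port (80, say, which is our assumption rather than the original's choice), then loop on accept, passing each connection to echo_until_eof and closing it afterwards.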
If this is run as-is, it will require root access for binding to a low-numbered port. If we want to improve on this, we can require that the program drop its privileges after binding to the port with privileged access:
65534 is the user ID for the nobody user on the particular system this is running on. Practical solutions, as opposed to example code, should accept user names as strings as well as numeric user IDs.
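The privilege drop described above, together with the lookup by user name, might be sketched as follows. This is an illustration under our own assumptions, not the original listing: lookup_user and drop_privileges are hypothetical helper names, and the check that root cannot be regained is an extra safety step we added.

```c
#include <assert.h>
#include <pwd.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve a user name (for example "nobody") to its UID and primary GID.
 * Returns 0 on success, -1 if the user is unknown. */
int lookup_user(const char *name, uid_t *uid, gid_t *gid)
{
    assert(name != NULL && uid != NULL && gid != NULL);

    struct passwd *pw = getpwnam(name);
    if (pw == NULL)
        return -1;
    *uid = pw->pw_uid;
    *gid = pw->pw_gid;
    return 0;
}

/* Switch the process to the given user and group.
 * Returns 0 on success, -1 on failure. */
int drop_privileges(uid_t uid, gid_t gid)
{
    /* Change the group first: once the UID has been dropped, the
     * process no longer has permission to change its GID. */
    if (setgid(gid) != 0)
        return -1;
    if (setuid(uid) != 0)
        return -1;

    /* Extra check (our addition): after dropping to a non-root user,
     * an attempt to become root again must fail. */
    if (uid != 0 && setuid(0) == 0)
        return -1;
    return 0;
}
```

In the echo server, these calls would go after bind has succeeded and before the first connection is accepted. A fuller version would also clear the supplementary group list with setgroups before calling setgid.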
When the program is run with this new setuid addition:
This process is now running as the nobody user while still being able to properly accept connections. However, as it stands, the server has the potential to access any files on the system that the nobody user can access. While not too much of a concern for this simple echo server, it could cause security issues in a production environment, should the server be compromised.
Filesystem isolation can be used to prevent this.
chroot is a basic form of isolation at the filesystem level; the name is an abbreviation of "change root". Essentially, it changes the root of the filesystem for one process and all of its child processes.
A few practical uses of chroots include:
- Development environments
- Isolation of services
- Restricted user SSH access
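A program can also change its own root with the chroot system call, which requires root privileges. The following is a minimal sketch of that idea; enter_chroot is a hypothetical helper name of ours, not something from the original code.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <unistd.h>

/* Make new_root the root directory of the calling process.
 * Returns 0 on success, -1 on failure (for example when the
 * directory does not exist or the caller is not privileged). */
int enter_chroot(const char *new_root)
{
    assert(new_root != NULL);

    if (chroot(new_root) != 0)
        return -1;

    /* chroot does not change the working directory. Move into the
     * new root so relative paths cannot reach outside of it. */
    if (chdir("/") != 0)
        return -1;
    return 0;
}
```

If this is combined with a privilege drop, the chroot call has to come first, because an unprivileged user is not allowed to call chroot.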
Let’s continue working with our echo server. First, we need to create a basic directory structure, and then copy over some essential shared libraries:
Now that these files are copied over, it’s time to attempt to run the program in the isolated filesystem:
Even though filesystem isolation is present, there’s more we can do. The server is still using the host’s resources without any real limitation. This might be a problem if there are other processes running on this host that are competing for resources. What we need to do now is to isolate those resources.
Control Group Resource Isolation
Control groups, or cgroups for short, are a way to isolate shared resources. These resources include block IO, memory, CPU, and so on.
Let’s look at IO for a second. For a disk on AWS EC2, hdparm shows:
So IO is around 126 MB/sec.
Now, let’s throttle that to 1 MB/sec using control groups.
First a control group needs to be created:
blkio is the name of the subsystem (block IO) we’re going to restrict and throttled-io is the name of the control group we’re creating.
Throttling works on specific devices, so the major/minor identifier of the device needs to be obtained:
In this case it is 202, 0.
cgset is used to set the actual throttling:
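Under the hood, a throttling rule of this kind ends up as a line written to a control file inside the cgroup's directory. The sketch below shows that step directly in C, under several assumptions of ours: a cgroup v1 blkio hierarchy, the blkio.throttle.read_bps_device parameter (hdparm measures reads), and 1048576 bytes per second as the meaning of 1 MB/sec. The function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a throttle rule as "<major>:<minor> <bytes_per_second>".
 * Returns 0 on success, -1 if the buffer is too small. */
int format_throttle_rule(char *buf, size_t size, unsigned int maj,
                         unsigned int min, unsigned long long bps)
{
    assert(buf != NULL && size > 0);

    int n = snprintf(buf, size, "%u:%u %llu", maj, min, bps);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Write the rule to a cgroup control file, for example (assumed path)
 * /sys/fs/cgroup/blkio/throttled-io/blkio.throttle.read_bps_device.
 * Returns 0 on success, -1 on failure. */
int write_throttle_rule(const char *control_file, unsigned int maj,
                        unsigned int min, unsigned long long bps)
{
    assert(control_file != NULL);

    char rule[64];
    if (format_throttle_rule(rule, sizeof rule, maj, min, bps) != 0)
        return -1;

    FILE *f = fopen(control_file, "w");
    if (f == NULL)
        return -1;

    size_t len = strlen(rule);
    int ok = fwrite(rule, 1, len, f) == len;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

For the device above, the rule would be formatted from 202, 0 and the chosen rate. The exact location of the control file depends on where the blkio hierarchy is mounted on the system.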
Now we can run hdparm inside this new control group:
As shown, the IO rate is now throttled to around 1 MB/sec. Success!
This is just one example of the many cgroup subsystems available for resource management. Read more about the other cgroups in the Red Hat docs.
However, the service has the potential to see process information it really shouldn’t. This leads into the next form of isolation: namespaces.
Namespaces are a way to isolate areas like network and process space.
Due to the rather complex nature of network namespaces with isolated applications and chroots, we’ll discuss that in more detail in part two of this series instead. For now we’ll focus on process space isolation.
So, we need to modify the code for this. The full code is available in this gist, but the important parts are here:
First, in the child process, the PID is printed out so that we can verify the process namespace is working properly. If it is, the PID will show as 1, which is normally the init process on the host system, but will be our top level process once it’s isolated. The clone() function creates the new namespace and will execute the server with it. This new namespace will have an entirely isolated process space.
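Since the gist is not reproduced in this text, here is a minimal sketch of the technique, not the original code. It assumes Linux with glibc and enough privileges to create a PID namespace (normally root). Instead of printing the PID, the child reports through its exit status whether it saw itself as PID 1. The names run_in_new_pid_namespace and sees_pid_one are ours.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define CHILD_STACK_SIZE (1024 * 1024)

/* Child entry point: exit status 0 if this process is PID 1 in its
 * own namespace, 1 otherwise. */
int sees_pid_one(void *arg)
{
    (void)arg;
    return getpid() == 1 ? 0 : 1;
}

/* Run fn(arg) as the first process of a new PID namespace.
 * Returns the child's exit status, or -1 if the namespace could not
 * be created (for example EPERM without privileges) or the child
 * did not exit normally. */
int run_in_new_pid_namespace(int (*fn)(void *), void *arg)
{
    assert(fn != NULL);

    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downward on most architectures, so clone is
     * given a pointer to the top of the allocated region. */
    pid_t pid = clone(fn, stack + CHILD_STACK_SIZE,
                      CLONE_NEWPID | SIGCHLD, arg);
    if (pid < 0) {
        free(stack);
        return -1;
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    free(stack);

    if (waited != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In the real server, the function passed to clone would start the echo loop rather than just checking its PID.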
We can see that by running:
The process has a PID of 1, showing that the process space isolation is working.
We built a fairly simple echo server, and took steps to isolate its privileges, filesystem, allocated resources, and process space. Of course, the methods used here serve as an introduction only. For more information on what was presented, you can read more about: setuid, chroot, Control Groups, and Linux Namespaces.
In part two, we’ll look at network isolation. We’ll also introduce containers and discuss how containers are the modern way to do all of this stuff effortlessly.
Addendum: I’ve received some comments regarding various access control systems as a form of isolation. While technologies such as AppArmor and SELinux do provide fine-tunable access controls, their proper implementation is a more involved discussion than the simplified isolation overview that this article was meant to provide. For those looking at more advanced ways to secure systems, I recommend starting with the Wikipedia article on security models. With that in mind, these sorts of systems will not be touched on in detail in this series.