Fleet on CoreOS, Part One

11 Mar 2016

Servers crash all the time. But it is important to make sure applications, and hence the business, don’t suffer. This is why service availability is one of the biggest concerns for operational engineers deploying applications in the cloud.

Fleet—a CoreOS tool—solves this problem and frees you from worry by automatically routing application execution to healthy nodes.

So, how does this work?

How does Fleet know if a node is down? How does the rerouting happen?

We covered this in detail in a previous post. But if you’re in a hurry, I will recap.

Each node in a CoreOS cluster runs the fleet daemon, which keeps tabs on the node’s health and is responsible for communicating with other nodes. The daemons coordinate to elect a leader during cluster startup, or when the current leader fails. The leader schedules services on the nodes whenever a new service request is submitted to the cluster, or when a node goes down, taking services with it.

In this miniseries, we’ll get some services up and running on a cluster, then take down a node to see how fleet reshuffles things. We’ll then move on and take a closer look at some additional fleet functionality.

Get Started

Let’s take a look at Fleet in action. Specifically, let’s look at how the rerouting takes place when a node goes down.

To get started, you should have a CoreOS cluster running. If you’re not sure how to do this, check out this tutorial on how to install CoreOS on AWS. Note: in this tutorial, I have used a three-node CoreOS cluster running on AWS EC2.

Once you have your cluster ready, connect to the cluster.

That should look something like this:

ssh -i /Path/to/keyfile/keyfile.pem core@ec-2-server-path.compute.amazonaws.com

Make sure to change the path to the actual key file path and the public hostname of your EC2 server.

After you’re connected, check you have fleet installed by running:

$ fleetctl

You should see something like this:


    fleetctl - fleetctl is a command-line interface to fleet, the cluster-wide CoreOS init system.


    fleetctl [global options] <command> [command options] [arguments...]
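Since every node runs the fleet daemon, it can also be useful to check how many machines fleet currently sees before scheduling anything. The sketch below wraps fleetctl list-machines in a small helper; the function name machine_count is hypothetical, and it assumes the command prints a single header row followed by one row per machine.

```shell
# machine_count
# Prints how many machines `fleetctl list-machines` reports.
# Assumes the first line of output is a header row and skips it.
machine_count() {
  fleetctl list-machines | awk 'NR > 1 { n++ } END { print n + 0 }'
}

# Example (run on a cluster node):
#   machine_count
```

On a three-node cluster like the one used here, this should report 3 once all nodes have joined.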


Define Your Services

While fleet comes pre-installed on CoreOS, it doesn’t start automatically.

To start fleet, we’ll need to add at least one unit file. A unit file describes a service you want to run on your cluster.

Here’s an example unit file:


[Unit]
Description=MyApp
After=docker.service
Requires=docker.service
[Service]
ExecStartPre=-/usr/bin/docker kill busybox1
ExecStartPre=-/usr/bin/docker rm busybox1
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
ExecStop=/usr/bin/docker stop busybox1

We can name this file myapp.service.

The Description shows up in the logs, so it is a good idea to set this to something you’ll understand later. After=docker.service and Requires=docker.service mean this unit will only start after docker.service is active.

ExecStart allows you to specify a command to run when the unit is started. ExecStartPre specifies commands to be executed before the command specified by ExecStart. This can be used to do cleanup, or setup, and so on.

Do not run docker containers with -d as this will prevent the container from starting as a child of this process. In that case, systemd will think the process has exited, and the unit will be stopped.

ExecStop specifies commands that will run when this unit is considered failed or when it is stopped.

To start this service, move or create the myapp.service file on the node you’re already connected to.
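One way to create the file directly on the node is a shell heredoc. This is only a sketch: it assumes a unit definition like the one described above, and the Description value is a placeholder you would replace with your own.

```shell
# Write the unit file into the current directory.
# The quoted 'EOF' stops the shell from expanding anything inside the file.
cat > myapp.service <<'EOF'
[Unit]
Description=MyApp
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=-/usr/bin/docker kill busybox1
ExecStartPre=-/usr/bin/docker rm busybox1
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name busybox1 busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
ExecStop=/usr/bin/docker stop busybox1
EOF
```

Copying the file over with scp would work just as well; the heredoc simply avoids leaving the SSH session.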

Then run:

$ fleetctl start myapp.service

You will see that the service starts.

Confirm this by running:

$ fleetctl list-unit-files

This command lists all unit files in the cluster, along with their desired state, current state, and the target machine each one is scheduled on.

Now that we’ve got one service running on the cluster, we can get multiple services running by repeating the process. Just remember that unit file names need to be unique across the cluster.
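Repeating the process mostly means copying the unit file under a new name and giving the container a new name too, so the two services don't clash. A minimal sketch of that, assuming the container name is the only thing that needs to change (the helper name clone_unit is hypothetical):

```shell
# clone_unit SRC DST OLD NEW
# Copies unit file SRC to DST, replacing every occurrence of the container
# name OLD with NEW. Assumes both names are plain alphanumeric strings,
# since they are passed to sed unescaped.
clone_unit() {
  src=$1; dst=$2; old=$3; new=$4
  if [ ! -f "$src" ]; then
    echo "clone_unit: $src not found" >&2
    return 1
  fi
  sed "s/$old/$new/g" "$src" > "$dst"
}

# Example (run on a cluster node that already has myapp.service):
#   clone_unit myapp.service anotherapp.service busybox1 busybox2
#   fleetctl start anotherapp.service
```

Because the pattern is the full container name (busybox1), the busybox image name in the docker pull and docker run lines is left alone.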

Once done, run fleetctl list-unit-files again.

It should look something like this:

$ fleetctl list-unit-files
UNIT               HASH    DSTATE   STATE    TARGET
myapp.service      d4c61bf launched launched 85c0c595.../
anotherapp.service e55c0ae launched launched 113f16a7.../
someapp.service    391d247 launched launched a0b7a5f7.../

Note that we have three services (myapp.service, anotherapp.service, and someapp.service) running on three different hosts, as shown by the three distinct entries under the TARGET column.
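If you only care about where a single unit landed, the TARGET column can be pulled out with a little awk. This is a sketch that assumes the five-column layout shown above; the helper name unit_target is hypothetical.

```shell
# unit_target UNIT
# Prints the TARGET column for UNIT from `fleetctl list-unit-files`.
# The header row never matches, because its first field is "UNIT"
# rather than a unit file name.
unit_target() {
  fleetctl list-unit-files | awk -v unit="$1" '$1 == unit { print $5 }'
}

# Example (run on a cluster node):
#   unit_target someapp.service
```

Running it for the same unit at two different times gives a quick way to tell whether fleet has moved that unit to another machine.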

Fleet in Action

As mentioned at the start, the best bit about fleet is that not only does it schedule units across the cluster as and when requests are submitted, but it also automatically reroutes application execution to healthy nodes.

To see this automatic rerouting in action, let’s take down one node.

I did this via the AWS console, by stopping one of my EC2 instances. But you can use whatever management console you’re running your virtual machines with.

After you’ve taken down the node, run:

$ fleetctl list-unit-files

You should see output like this:

UNIT               HASH    DSTATE   STATE    TARGET
myapp.service      d4c61bf launched launched 85c0c595.../
anotherapp.service e55c0ae launched launched 113f16a7.../
someapp.service    391d247 launched launched 113f16a7.../

Before, someapp.service was running on machine a0b7a5f7..., but now it’s running on 113f16a7..., along with anotherapp.service. So what’s happened here is that a0b7a5f7... was removed from the cluster, and someapp.service was moved to the healthy node running anotherapp.service.

Fleet did this automatically, with no interaction from us.

Pretty cool, huh?


In this post, we learnt:

  • How to create unit files that describe a service we want to run on our cluster
  • How to launch a service on our cluster using fleet
  • How fleet detects node failure and automatically shuffles downed services onto healthy nodes

In the next post in this miniseries, we’ll look at more advanced ways to interact with and use fleet.
