Fleet on CoreOS, Part Two

12 Apr 2016

In my previous post, we learnt how fleet keeps your app available by automatically rescheduling services across your cluster whenever a node fails. More specifically, the services running on the failed node are moved to one of the healthy nodes in the cluster, and from the outside, your app continues to run smoothly.

If you’re interested in understanding more about how fleet fits into CoreOS, we go into that in a previous post about self-sufficient containers.

In this post, I explain the commands you can use to interact with fleet. This will lay the foundation for more advanced uses of fleet in subsequent posts. But before diving into commands, let's revisit unit files. This is important because most of the fleet commands are about handling unit files.

Unit Files

As you may know, unit files define services you want to run on your cluster. Think of them as a way to manage applications via fleet.

You might be wondering: "Can I run applications by hand?" Yes, you can. But if you do, fleet won’t know about the application and hence can’t manage it. So we need to define our applications via services, using fleet unit files.

So how do we work with unit files and fleet?

First, let’s take a look at the commands that help us load and run unit files. Then we’ll move on to other fleet commands.

Starting a Service

There are several steps involved in running a service via fleet.

The first step is to upload the unit file into fleet. The unit file must then be scheduled onto a specific machine in the cluster. Only then can it be started. The fleetctl CLI tool has commands for all these steps.
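As a rough sketch, the three steps can be chained in a small shell function. The function name deploy_unit and the FLEETCTL override variable are my own inventions for illustration (they are not part of fleet); the override simply makes it possible to dry-run the sequence with a stand-in command.

```shell
# Sketch: run the three fleet steps in order for one unit file.
# FLEETCTL is a hypothetical override used for dry runs; it defaults
# to the real fleetctl binary.
FLEETCTL="${FLEETCTL:-fleetctl}"

deploy_unit() {
    unit_file="$1"                          # path to the unit file, e.g. myapp.service
    unit_name="$(basename "$unit_file")"    # fleet refers to the unit by its file name

    "$FLEETCTL" submit "$unit_file" &&      # 1. upload the unit file into fleet
    "$FLEETCTL" load "$unit_name" &&        # 2. schedule it onto a machine
    "$FLEETCTL" start "$unit_name"          # 3. start it on that machine
}
```

You would call it as deploy_unit myapp.service. Because the steps are joined with &&, the function stops at the first step that fails.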

Let's start with the submit command.

Submit

The submit command submits a unit file to fleet. Fleet will then read the file contents into memory, making it available for further actions.

It looks like this:

$ fleetctl submit myapp.service

Your myapp.service file is now known to fleet.

You can then use the list-unit-files command to see the unit files that have been submitted.

Run it like this:

$ fleetctl list-unit-files
UNIT           HASH     DSTATE    STATE     TARGET
myapp.service  0d1c468  inactive  inactive  -

As you can see, the unit file has been submitted, but has not been scheduled on any node.

To see the contents of a unit file that has already been submitted, you can type:

$ fleetctl cat myapp.service
[Unit]
Description=My Service
After=docker.service

[Service]
TimeoutStartSec=0
ExecStartPre=-/usr/bin/docker kill hello
ExecStartPre=-/usr/bin/docker rm hello
ExecStartPre=/usr/bin/docker pull busybox
ExecStart=/usr/bin/docker run --name hello busybox /bin/sh -c "while true; do echo Hello World; sleep 1; done"
ExecStop=/usr/bin/docker stop hello

Note: fleet will not update the in-memory unit file if you re-submit it. To update a unit file, you must remove it completely and then re-submit it.

Load

Once your unit file has been submitted, the next step is to schedule it on a machine.

When we schedule a unit, fleet decides which machine in the cluster should run it. To do this, fleet looks at the content of the unit file and the current workload of each machine in the cluster. After fleet makes the decision, it passes the unit file to the target machine and loads it into the local systemd.

You can load and schedule a unit by using the load command, like so:

$ fleetctl load myapp.service
Unit myapp.service loaded on 43cb4dc3.../172.31.9.57

Now, if you list the unit files again, you’ll see that it has been loaded. You’ll also see the target node IP address.

Do that like so:

$ fleetctl list-unit-files
UNIT           HASH     DSTATE  STATE   TARGET
myapp.service  0d1c468  loaded  loaded  43c.../172.31.9.57

We can now also use the list-units command to show any running or scheduled units and their statuses.

Like so:

$ fleetctl list-units
UNIT           MACHINE                  ACTIVE    SUB
myapp.service  43cb4dc3.../172.31.9.57  inactive  dead

Start

To start a unit, you need to use the start command. This will start the unit on the machine it has been loaded onto. Starting the unit results in the execution of start commands defined in the unit file.

Start a unit like so:

$ fleetctl start myapp.service
Unit myapp.service launched on 43cb4dc3.../172.31.9.57

You can now check the list of unit files again:

$ fleetctl list-unit-files
UNIT           HASH     DSTATE    STATE     TARGET
myapp.service  0d1c468  launched  launched  43cb.../172.31.9.57

As you can see, the unit has been launched. Note: the DSTATE column indicates the desired state and the STATE column indicates the actual state. If these two match, this usually means that the action was successful.
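If you want to script that comparison, here is a minimal sketch. It assumes the default five-column layout shown above (UNIT, HASH, DSTATE, STATE, TARGET), and the function name states_match is hypothetical.

```shell
# Sketch: succeed only if DSTATE and STATE agree for the named unit.
# Expects the default `fleetctl list-unit-files` output on stdin.
states_match() {
    awk -v unit="$1" '
        # Skip the header row, then compare column 3 (DSTATE) with column 4 (STATE).
        NR > 1 && $1 == unit { found = 1; ok = ($3 == $4) }
        # Exit 0 when the unit was found and the states match, 1 otherwise.
        END { exit !(found && ok) }
    '
}
```

You could use it like this: fleetctl list-unit-files | states_match myapp.service && echo "in sync". A unit that does not appear in the listing is treated as a mismatch.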

You can also check the list of units:

$ fleetctl list-units
UNIT           MACHINE                  ACTIVE  SUB
myapp.service  43cb4dc3.../172.31.9.57  active  running

Note that while list-unit-files reports state from fleet's perspective, list-units reports the systemd state, i.e. it is collected directly from the local daemon running on whatever machine the unit has been scheduled on. So this is a better picture of how the local system sees the service state.

The ACTIVE column is a generalized state of the unit, while SUB is a more low-level description.
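To pull those two columns out for a single unit, something like the following sketch should do. It assumes the default four-column layout shown above (UNIT, MACHINE, ACTIVE, SUB); the function name unit_state is hypothetical.

```shell
# Sketch: print "ACTIVE/SUB" for the named unit.
# Expects the default `fleetctl list-units` output on stdin.
unit_state() {
    # Skip the header row; column 3 is ACTIVE and column 4 is SUB.
    awk -v unit="$1" 'NR > 1 && $1 == unit { print $3 "/" $4 }'
}
```

With the listing above, fleetctl list-units | unit_state myapp.service prints active/running. It prints nothing if the unit is not in the list.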

Removing a Service

Each of the commands we just learnt has a counter-command that reverses the action.

Stop

To stop a service from running, use the stop command. This will cause the local machine's systemd instance to execute the stopping commands defined in the unit file.

You can do that like so:

$ fleetctl stop myapp.service
Unit myapp.service loaded on 43cb4dc3.../172.31.9.57

As you can see, the service has reverted to the loaded state. It is still loaded in the machine's systemd, but it is not currently running.

We can confirm that like so:

$ fleetctl list-unit-files
UNIT           HASH     DSTATE  STATE   TARGET
myapp.service  0d1c468  loaded  loaded  43cb.../172.31.9.57

Unload

To remove the unit from the target machine's systemd, but keep it available in fleet, you can unload the unit.

If the unit is currently active, it will be stopped prior to being unloaded.

Here’s how you do that:

$ fleetctl unload myapp.service
Unit myapp.service inactive

Now, you can check the state again:

$ fleetctl list-unit-files
UNIT           HASH     DSTATE    STATE     TARGET
myapp.service  0d1c468  inactive  inactive  -

Here you see the unit now marked as inactive. Additionally, it does not have a target machine listed.

Destroy

If you’d rather remove the unit from fleet entirely, you can use the destroy command. This will stop and unload the unit (if necessary) and then remove the unit from fleet.

Run it like so:

$ fleetctl destroy myapp.service
Destroyed myapp.service

As noted above: If you modify a unit file, you must destroy the current unit in fleet before submitting and starting it again. This is because unit files, once submitted to fleet, are static and cannot be updated.
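Here is a minimal sketch of that update cycle as a shell function. The name redeploy_unit and the FLEETCTL override variable are hypothetical, added only for illustration, and the function does not wait or verify anything between steps.

```shell
# Sketch: replace a unit in fleet with an edited copy of its unit file.
# FLEETCTL is a hypothetical override used for dry runs; it defaults
# to the real fleetctl binary.
FLEETCTL="${FLEETCTL:-fleetctl}"

redeploy_unit() {
    unit_file="$1"                          # path to the edited unit file
    unit_name="$(basename "$unit_file")"    # fleet refers to the unit by its file name

    "$FLEETCTL" destroy "$unit_name" &&     # remove the old copy from fleet
    "$FLEETCTL" submit "$unit_file" &&      # upload the edited file
    "$FLEETCTL" load "$unit_name" &&        # schedule it onto a machine
    "$FLEETCTL" start "$unit_name"          # start it again
}
```

Call it as redeploy_unit myapp.service after editing the file. Keep in mind that the service is down between the destroy and the start.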

Getting Service Status

We've looked at two commands that can get us status information: list-units and list-unit-files. The list-units command lists all units currently scheduled on a machine. The list-unit-files command shows us all unit files fleet knows about, along with their desired and actual states.

But what if that’s not enough?

Fortunately, there are two more commands you can use to get specific information about an individual service.

Status

The status command relays back the systemctl status for the service on the host that is running the unit.

It looks like this:

$ fleetctl status myapp.service
● myapp.service - My Service
   Loaded: loaded (/run/fleet/units/myapp.service; linked-runtime; vendor preset: disabled)
   Active: active (running) since Fri 2016-03-25 14:02:24 UTC; 51min ago
  Process: 2965 ExecStartPre=/usr/bin/docker pull busybox (code=exited, status=0/SUCCESS)
  Process: 2954 ExecStartPre=/usr/bin/docker rm hello (code=exited, status=0/SUCCESS)
  Process: 2945 ExecStartPre=/usr/bin/docker kill hello (code=exited, status=1/FAILURE)
 Main PID: 2977 (docker)
   Memory: 11.4M
      CPU: 435ms
   CGroup: /system.slice/myapp.service
           └─2977 /usr/bin/docker run --name hello busybox /bin/sh -c while true; do echo Hello World; sleep 1; done

Mar 25 14:53:35 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 14:53:36 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 14:53:37 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 14:53:38 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 14:53:39 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 14:53:40 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 14:53:41 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 14:53:42 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 14:53:43 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 14:53:44 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World

As you can see, we finally get to verify our unit is working as expected!

Journal

If you wish to see just the journal entries for the unit, without the rest of the status output, you can use the journal command by itself.

It looks like this:

$ fleetctl journal myapp.service

-- Logs begin at Fri 2016-03-25 10:08:44 UTC, end at Fri 2016-03-25 16:19:04 UTC. --
Mar 25 16:18:55 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 16:18:56 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 16:18:57 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 16:18:58 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 16:18:59 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 16:19:00 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 16:19:01 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 16:19:02 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 16:19:03 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World
Mar 25 16:19:04 ip-172-31-9-57.ap-southeast-1.compute.internal docker[2977]: Hello World

By default, journal shows the last 10 lines.

You can adjust the number of lines shown with the --lines argument:

$ fleetctl journal --lines 20 myapp.service

You can also use -f, for follow:

$ fleetctl journal -f myapp.service

This functions like tail -f and lets you monitor changes to the journal as they occur.

Conclusion

In this post, we looked at:

  • Commands to submit, load, and start units
  • Commands to stop, unload, and destroy units
  • Commands to inspect the status of units

This was the second post in this fleet miniseries. My first post introduces units and takes a look at fleet in action. In the next post in this miniseries, we’ll learn about fleet APIs.

Posted in Series: Fleet on CoreOS, Fleet, CoreOS
