Docker Overview, Part Two
In part one of this miniseries looking at Docker, we looked at what makes Docker special, the difference between Virtual Machines and containers, and the primary components that make up Docker.
In this post, we'll work directly with some containers. Specifically, we'll show you how to launch a container, how to build an image with a Dockerfile, how to work with registries, and the basics of data volumes.
Before launching a container you might pull it from the registry:
Launching a container is as simple as running docker run, followed by the name of an image and a command. The command is whatever you want to run inside the container.
If the image doesn't exist, Docker attempts to fetch it from the public image registry. This happens automatically, but you should expect a time delay.
It's important to note that containers are designed to stop after the command executed within them has exited. For example, if you run
/bin/echo hello world as your command, the container starts, prints "hello world" and then stops.
For example, run:
You should see something like this:
Let's launch an Alpine Linux based container and install
openssh inside of it using the ash prompt. First, run this:
The -t flag allocates a pseudo-TTY, and the -i flag keeps STDIN open even if not attached. Together they let you use the container like a traditional VM for as long as the ash prompt is running.
Once you are at the ash prompt, install openssh like so:
If we exit the container, the changes we made to the disk are not saved to the image. So the next time we launch a container from that image, openssh will not be installed.
If we want to save our changes, we have to commit them.
But first we need to find the container ID. To do this, open a new shell in your terminal and look for the ID:
You should see something like this:
Now we can commit our container:
Here, 2511d433bb01 is our container ID, and my/alpine is our repository.
You can check this worked by running:
You should see something like this:
Building Images With Dockerfiles
Committing a running container is fine for experimentation. But, for production scenarios, it is much better to use Dockerfiles where you can record the necessary commands as instructions for the
docker build command. These files can then be versioned in Git. The Docker daemon runs your commands one-by-one, committing the result to a new image if necessary, before finally outputting the ID of your new image.
Create a file called Dockerfile, and put this in it:
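A minimal sketch of such a Dockerfile, written from the shell with a heredoc. It assumes Alpine's nginx package and follows the steps described in this section (Alpine base, install nginx, expose ports 80 and 443, run nginx). The mkdir step is a precaution for older Alpine releases, where nginx expects /run/nginx to exist:

```shell
# Write the Dockerfile into the current directory
cat > Dockerfile <<'EOF'
# Small base image
FROM alpine
# Install nginx and make sure its runtime directory exists
RUN apk add --no-cache nginx && mkdir -p /run/nginx
# Document the ports the container listens on
EXPOSE 80 443
# Keep nginx in the foreground so the container stays running
CMD ["nginx", "-g", "daemon off;"]
EOF
```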
Now, let’s build the Docker image:
The -t mynginx argument names the image.
Now we can run it:
The -p 8080:80 argument maps port 8080 on the local host to port 80 of the container.
In your browser, navigate to:
And you should see:
We built a simple nginx image!
As the base image, we used Alpine Linux. We installed nginx, then we exposed ports 80 and 443. Finally, we ran nginx.
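The whole cycle can also be scripted end to end. The sketch below makes several assumptions: it feeds an equivalent Dockerfile to docker build on stdin, runs the container detached with -d under the hypothetical name mynginx-demo, and uses curl in place of a browser.

```shell
# Build the image from a Dockerfile supplied on stdin (no build context needed)
docker build -t mynginx - <<'EOF'
FROM alpine
RUN apk add --no-cache nginx && mkdir -p /run/nginx
EXPOSE 80 443
CMD ["nginx", "-g", "daemon off;"]
EOF

# Run detached so the shell stays free, mapping host port 8080 to container port 80
docker run -d --name mynginx-demo -p 8080:80 mynginx

# Give nginx a moment to start, then print only the HTTP status code
sleep 2
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:8080/

# Stop and remove the demo container
docker rm -f mynginx-demo
```

The status code depends on the default site configuration that the nginx package ships, so no expected value is given here.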
The reason for using Alpine Linux is that it has a very small footprint. This makes the nginx image much smaller and quicker to build, and also quicker to push to and pull from the registry.
For more information about Dockerfile commands, see the documentation.
Working With a Registry
Now that we know how to build Docker images, let’s learn how to push images to the Docker registry.
In this example, we’ll use the Docker Hub registry.
You need to open an account there to be able to push Docker images. For more information, see the user guide.
In the previous example, we built the Docker image with the -t mynginx argument. That builds the Docker image, but because the name mynginx carries no username or registry prefix, there is nowhere to push it, so it can only be used on the host it was built on.
To push to the Docker Hub registry (or any other hosted registry) you need to add your username to the Docker image name:
$ docker build -t yourname/mynginx .
Before using docker push, make sure that you have run docker login. This saves your credentials to the .dockercfg file in your home folder (newer Docker releases use ~/.docker/config.json instead).
To push to the Docker registry, run:
$ docker push yourname/mynginx
To pull from the Docker registry, run:
$ docker pull yourname/mynginx
If you have self-hosted a private Docker registry, the commands look like this:
$ docker push your_registry:5000/mynginx
$ docker pull your_registry:5000/mynginx
Docker containers are stateless. However, in some cases, such as when using a database, you need to have persistent data. Docker supports two ways to persist data: data volumes and data volume containers. We'll cover both now.
A data volume is a specially designated directory within one or more containers that bypasses the Union file system.
Data volumes provide several useful features for persistent or shared data:
- Volumes are initialized when a container is created. If the container’s base image contains data at the specified mount point, that existing data is copied into the new volume upon volume initialization.
- Data volumes can be shared and reused among containers.
- Changes to a data volume are made directly.
- Data volumes are unaffected by updates to the image, even if those updates modify paths that would live at the same location as the mounted data volume.
- Data volumes persist even if the container itself is deleted.
Data volumes are designed to persist data, independent of the container’s life cycle. Docker therefore never automatically deletes volumes when you remove a container, nor does it "garbage collect" volumes that are no longer referenced by a container.
You can add a host folder as a data volume into a container using the -v flag with the docker run command. You can use -v multiple times to mount multiple data volumes.
Let’s mount a single volume in our web application container:
$ docker run -p 8080:80 -v /data/share/nginx/html:/usr/share/nginx/html mynginx
The command above mounts the host folder /data/share/nginx/html into the container at /usr/share/nginx/html. Anything we put in the host folder is seen instantly by the container.
Note: if the path /usr/share/nginx/html already exists inside the container’s image, its contents are hidden (not deleted) by the contents of /data/share/nginx/html on the host, in order to stay consistent with the expected behavior of mount.
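To see the host folder show through, here is a small sketch. It assumes a scratch html directory under the current working directory (instead of /data/share/nginx/html), a local Docker daemon, and the stock alpine image, so that it does not depend on mynginx having been built:

```shell
# Create a host folder with one file in it
mkdir -p "$PWD/html"
echo "hello from the host" > "$PWD/html/index.html"

# Mount it at the nginx document root and read the file from inside a container.
# Prints: hello from the host
docker run --rm -v "$PWD/html:/usr/share/nginx/html" alpine \
  cat /usr/share/nginx/html/index.html
```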
Data Volume Containers
If you have some persistent data that you want to share between containers or want to use from non-persistent containers, it’s best to create a named data volume container, and then to mount the data from it. This avoids potential permissions problems that might arise from mounting a host directory on some Linux OSes.
Let’s create a new, named container with a volume to share. We use the busybox image because the other official images, like the postgres image, are themselves based off of the busybox image. Because our images then have Union FS layers in common, Docker can save disk space by only keeping one copy of those layers.
Run this command:
This creates a data volume container called dbdata, using the busybox base image, and creates a data directory at
You can then use the
--volumes-from flag to mount the
/dbdata volume in another container:
In this case, if the postgres image contained a directory called /dbdata, then mounting the volumes from the dbdata container hides the /dbdata files from the postgres image. The result is that only the files from the dbdata container are visible.
You can use multiple
--volumes-from parameters to bring together multiple data volumes from multiple containers.
If you remove containers that mount volumes, including the initial dbdata container, the volumes will not be deleted. To delete the volume from disk, you must explicitly call
docker rm -v against the last container with a reference to the volume. This allows you to upgrade, or effectively migrate data volumes between containers.
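Putting the pieces together, the following sketch walks through the life of a data volume container. It assumes busybox for both the data container and the consumers (in place of postgres, to keep the download small), a hypothetical file name note.txt, and /bin/true as a placeholder command for a container that is created but never started:

```shell
# Create (but do not start) a container that owns a volume at /dbdata
docker create -v /dbdata --name dbdata busybox /bin/true

# Write into the shared volume from one short-lived container...
docker run --rm --volumes-from dbdata busybox sh -c 'echo persisted > /dbdata/note.txt'

# ...and read it back from another; the data outlives both.
# Prints: persisted
docker run --rm --volumes-from dbdata busybox cat /dbdata/note.txt

# Remove the last container referencing the volume, and the volume with it
docker rm -v dbdata
```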
In part one of this miniseries, we took a high-level view of Docker, what makes it special, and how it's architected. In this post, we looked at building images and launching containers, layered filesystems, and how to persist data.
Stay tuned for a Kubernetes overview next!