Multi-stage Docker builds let you define several build stages in a single Dockerfile and ship only the last one as your image. Build tools stay out of the runtime image, which makes images smaller and builds easier to manage. They are most useful when an application needs compilers, bundlers, or other tooling at build time but not at run time.


Multi-stage Docker builds let you write Dockerfiles with multiple FROM statements. This means a single build can draw on several base images while shipping only the last one, which can help cut the size of your final image.

Docker images are created by selecting a base image using the FROM statement. You then add layers to that image by adding instructions to your Dockerfile.

With multi-stage builds, you can split your Dockerfile into multiple sections. Each stage has its own FROM statement so you can involve more than one image in your builds. Stages are built sequentially and can reference their predecessors, so you can copy the output of one stage into the next.
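
The shape of a multi-stage build can be sketched with a deliberately trivial two-stage Dockerfile. The stage name builder, the file path and the alpine:3.19 tag are placeholders chosen for illustration, not part of any real project.

```dockerfile
# Stage 1: produce an artifact in a throwaway stage (hypothetical name "builder")
FROM alpine:3.19 AS builder
RUN echo "built output" > /artifact.txt

# Stage 2: the final image; only what is explicitly copied in survives
FROM alpine:3.19
COPY --from=builder /artifact.txt /artifact.txt
CMD ["cat", "/artifact.txt"]
```

Nothing from the first stage reaches the final image except the file named in the COPY instruction.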

Multi-Stage Builds in Action

Let’s look at how you can create a multi-stage build. We’re working with a barebones PHP project which uses Composer for its dependencies and Sass for its stylesheets.

Here’s a multi-stage Dockerfile which encapsulates our entire build:
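
A minimal sketch of such a Dockerfile, consistent with the walkthrough that follows, might look like this. The /example working directory, the index.php and src/ paths, the composer:2 tag, the node-sass version pin and the unzip install step are assumptions added so the sketch can build; only the node:14, php:8.0-apache, example.scss and example.css names come from the text.

```dockerfile
# Stage 1: compile Sass to CSS using a Node.js base image
FROM node:14 AS sass
WORKDIR /example
# Assumption: pin a node-sass 4.x release that supports Node 14;
# --unsafe-perm lets its install script run as root under npm 6
RUN npm install -g --unsafe-perm node-sass@4.14.1
COPY example.scss .
RUN node-sass example.scss example.css

# Stage 2: the final application image
FROM php:8.0-apache
# Assumption: Composer typically needs unzip (or git) to fetch packages
RUN apt-get update \
    && apt-get install -y --no-install-recommends unzip \
    && rm -rf /var/lib/apt/lists/*
# Copy the Composer binary straight out of the official Composer image
COPY --from=composer:2 /usr/bin/composer /usr/bin/composer
COPY composer.json composer.lock ./
RUN composer install --no-dev
# Pull the compiled stylesheet from the named "sass" stage
COPY --from=sass /example/example.css example.css
# Application source from the local build context (hypothetical paths)
COPY index.php .
COPY src/ src/
```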

Straightaway, you’ll observe we’ve got two FROM statements which split our Dockerfile into two logical sections. The first stage is dedicated to compiling the Sass, while the second one focuses on combining everything together in the final container.

We’re using the node-sass implementation of Sass. We therefore start with a Node.js base image, within which we install node-sass globally from npm. We then use node-sass to compile our stylesheet example.scss into the pure CSS example.css. The high-level summary of this stage is that we take a base image, run a command and obtain an output we’d like to use later in our build (example.css).

The next stage introduces the base image for our application: php:8.0-apache. The last FROM statement in your Dockerfile defines the image your containers will end up running. Our earlier node image is ultimately irrelevant to our application’s containers; it’s used purely as a build-time convenience tool.

We next use Composer to install our PHP dependencies. Composer is PHP’s package manager but it’s not included with the official PHP Docker images. We therefore copy the binary into our image from the dedicated Composer image.

We didn’t need a FROM statement to do this. As we’re not running any commands against the Composer image, we can use the --from flag with COPY to reference the image. Ordinarily, COPY copies files from your local build context into your image; with --from and an image name, Docker pulls that image if necessary and copies the specified file out of its filesystem.

Later on, our Dockerfile uses COPY --from again, this time in a different form. Back at the top, we wrote our first FROM statement as FROM node:14 AS sass. The AS clause created a named stage called sass.

We now reference the filesystem produced by this stage using COPY --from=sass. This allows us to copy our built CSS into our final image. The remainder of the steps are routine COPY operations, used to obtain our source code from our local working directory.

Advantages of Multi-Stage Builds

Multi-stage builds let you create complex build routines with a single Dockerfile. Prior to their introduction, it was common for complex projects to use multiple Dockerfiles, one for each stage of their build. These then needed to be orchestrated by manually written shell scripts.

With multi-stage builds, our entire build system can be contained in a single file. You don’t need any wrapper scripts to take your project from raw codebase to final application image. A regular docker build -t my-image:latest . is sufficient.

This simplification also provides opportunities to improve the efficiency of your images. Docker images can become large, especially if you’re using a language runtime as your base.

Take the official golang image: it’s close to 300MB. Traditionally, you might copy your Go source into the image and use it to compile your binary. You’d then copy your binary back to your host machine before starting another build. This one would use a Dockerfile with a lightweight base image such as alpine (about 10MB). You’d add your binary back in, resulting in a much smaller image than if you’d used the original golang base to run your containers.

With multi-stage builds, this kind of system is much easier to implement:
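
A minimal sketch of this approach, assuming a single-file program called app.go that uses only the standard library. The file name, the /src and /app paths, the my-binary output name and the pinned image tags are all illustrative choices.

```dockerfile
# Stage 1: compile the binary using the full Go toolchain
FROM golang:1.22 AS build
WORKDIR /src
COPY app.go .
# CGO_ENABLED=0 yields a static binary that runs on Alpine's musl libc
RUN CGO_ENABLED=0 go build -o my-binary app.go

# Stage 2: ship only the compiled binary on a small base image
FROM alpine:3.19
WORKDIR /app
COPY --from=build /src/my-binary .
CMD ["./my-binary"]
```

The Go toolchain exists only in the build stage; the final image holds the Alpine base plus the one binary.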

In eight lines, we’ve managed to achieve a procedure that would previously have needed at least three files – a golang Dockerfile, an alpine Dockerfile and a shell script to manage the intermediary steps.

Conclusion

Multi-stage builds can dramatically simplify the construction of complex Docker images. They let you involve multiple interconnected build steps which can pass output artifacts forwards.

The model also promotes build efficiency. The ease with which you can reference different base images helps developers ensure the final output is as small as possible. You’ll benefit from reduced storage and bandwidth costs, which can be significant when using Docker within a CI/CD system.