Docker image best practices

When building your own Docker images, there are several best practices you can follow to improve the security and reduce the size of your images. I came across a video on YouTube that covers 8 of them, and I think they make sense.

The presenter explains each practice quite nicely.
Obviously you can just watch the video, but if you prefer a written reference (I do), just read along.

Best practice 1: Use an official and verified image as base

For most images you’ll use some other image as the base. That base image will have an OS installed and possibly some utilities. But it may also contain malware. So, for images from unknown or non-verified publishers, review the Dockerfile to see what the image contains. Conversely, official and verified images let you trust their content (if you trust the site that verifies images and determines what is official, of course…).

Best practice 2: Use specific Docker image versions

A (software) configuration management best practice is to make everything explicit. So, state exactly which versions of which dependencies are needed to make your application work.

For Docker images, this means you shouldn’t use images tagged :latest. You never know which version you will get, and a new version may introduce a breaking change for your application. Moving to a new version of the base image must be a conscious choice, in a process that also includes testing your application with the updated base image.
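
As a small illustration, the difference comes down to the tag on the FROM line. The tag below is the one also used in the multi-stage example further down; treat it as an example value, not a recommendation.

```dockerfile
# Avoid: "latest" can resolve to a different image every time you build.
# FROM openjdk:latest

# Prefer: a fully specified version tag, changed only as a deliberate, tested step.
FROM openjdk:11.0.11-jre-slim
```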

Best practice 3: Use a base image that is most specific to your need

When choosing a base image, choose the one that is most specific to your need. For example, when you need an image that supports Node.js, you could take an Ubuntu image as the base and install Node.js yourself. But picking a dedicated Node.js image has a few advantages:

Dedicated images are usually optimized and follow best practices for their purpose.
Dedicated images are usually smaller.
Dedicated images have fewer moving parts and therefore a smaller attack surface.

Additionally, dedicated images usually come in several variants with a different underlying OS, such as Ubuntu, Debian or Alpine. The choice depends on the functionality you need versus size and attack surface. Alpine-based images are usually the smallest on both counts.
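
To make the comparison concrete, here is a minimal sketch of both options. The ubuntu:20.04 tag, the /app directory and the index.js entry point are assumptions chosen for illustration.

```dockerfile
# Option A (generic base): you install and maintain Node.js yourself.
# FROM ubuntu:20.04
# RUN apt-get update && apt-get install -y nodejs npm

# Option B (dedicated base): Node.js is already installed, on a small Alpine OS.
FROM node:10-alpine
WORKDIR /app
COPY . .
CMD ["node", "index.js"]
```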

Best practice 4: Optimize caching of image layers

Docker images consist of layers that together form a filesystem. Because layers are stacked, you can build on top of other layers, even layers made by other authors. A good example of this is the base image you reference with the FROM keyword. Your image builds on top of that image by adding your own layers.

Another advantage of layers is that unchanged layers can be reused, because they are cached. This also speeds up building the image.

Each instruction in a Dockerfile adds an entry to the image history. Instructions that change the filesystem (such as RUN, COPY and ADD) add a layer with content, while metadata instructions such as ENV or CMD show up with a size of 0B. Use the docker history command to see the layers and the commands that created them:

➜  docker history mysql:5.7
IMAGE          CREATED       CREATED BY                                      SIZE      COMMENT
c20987f18b13   3 weeks ago   /bin/sh -c #(nop)  CMD ["mysqld"]               0B        
<missing>      3 weeks ago   /bin/sh -c #(nop)  EXPOSE 3306 33060            0B        
<missing>      3 weeks ago   /bin/sh -c #(nop)  ENTRYPOINT ["docker-entry…   0B        
<missing>      3 weeks ago   /bin/sh -c ln -s usr/local/bin/docker-entryp…   34B       
<missing>      3 weeks ago   /bin/sh -c #(nop) COPY file:345a22fe55d3e678…   14.5kB    
<missing>      3 weeks ago   /bin/sh -c #(nop)  VOLUME [/var/lib/mysql]      0B        
<missing>      3 weeks ago   /bin/sh -c {   echo mysql-community-server m…   313MB     
<missing>      3 weeks ago   /bin/sh -c echo 'deb http://repo.mysql.com/a…   55B       
<missing>      3 weeks ago   /bin/sh -c #(nop)  ENV MYSQL_VERSION=5.7.36-…   0B        
<missing>      3 weeks ago   /bin/sh -c #(nop)  ENV MYSQL_MAJOR=5.7          0B        
<missing>      3 weeks ago   /bin/sh -c set -ex;  key='A4A9406876FCBD3C45…   1.84kB    
<missing>      3 weeks ago   /bin/sh -c apt-get update && apt-get install…   52.2MB    
<missing>      3 weeks ago   /bin/sh -c mkdir /docker-entrypoint-initdb.d    0B        
<missing>      3 weeks ago   /bin/sh -c set -eux;  savedAptMark="$(apt-ma…   4.17MB    
<missing>      3 weeks ago   /bin/sh -c #(nop)  ENV GOSU_VERSION=1.12        0B        
<missing>      3 weeks ago   /bin/sh -c apt-get update && apt-get install…   9.34MB    
<missing>      3 weeks ago   /bin/sh -c groupadd -r mysql && useradd -r -…   329kB     
<missing>      3 weeks ago   /bin/sh -c #(nop)  CMD ["bash"]                 0B        
<missing>      3 weeks ago   /bin/sh -c #(nop) ADD file:bd5c9e0e0145fe33b…   69.3MB

As mentioned, layers are stacked. This means the stacking order is important. Say you have 5 layers and the 2nd one changes: then layers 2 to 5 all need to be rebuilt. If that same layer had been stacked as the 5th layer instead, only that last layer would have to be rebuilt when it changes. So, when designing your image, make sure that the layers that change most often are stacked last.
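
The ordering rule above can be sketched for a Node.js application as follows. This is only an illustration: the /app directory, the package.json and package-lock.json manifests and the index.js entry point are assumptions, not something taken from the video.

```dockerfile
FROM node:10-alpine
WORKDIR /app

# The dependency manifests change rarely, so copy only these first.
# As long as they are unchanged, this layer and the install layer below come from the cache.
COPY package.json package-lock.json ./
RUN npm install

# The application source changes most often, so it is copied last.
# Editing a source file invalidates the cache from this instruction onward only.
COPY . .

CMD ["node", "index.js"]
```

Had the source been copied before running npm install, every code change would trigger a full reinstall of the dependencies.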

Best practice 5: Use a `.dockerignore` file

When building your application and the Docker image containing it, intermediate build artifacts are often created. Additionally, files like documentation and README.md are usually not needed in the image.

You can keep these files out of the image by copying only the desired files. An alternative, and usually easier, way is to exclude the unwanted files from being copied. Docker, like git, uses an ignore file to specify these exclusions. The .dockerignore file must be placed in the root of the build context, which is usually the directory that contains the Dockerfile.
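
As a rough sketch of how the two files work together: the ignore patterns listed in the comments are assumed examples for a typical project, and would go into the .dockerignore file one per line.

```dockerfile
# Assumed contents of the .dockerignore file in the build context root:
#   .git
#   node_modules
#   target
#   docs
#   README.md

FROM node:10-alpine
WORKDIR /app

# Copies the build context, minus everything matched by the patterns in .dockerignore.
COPY . .
```

With the ignore file in place, a broad COPY stays safe, and the excluded files are not even sent to the Docker daemon as part of the build context.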

Best practice 6: Use multi-stage builds

Sometimes you need tools to build your app and image that you don’t need when running the image as a container. A good example is a Java application. Building it requires a JDK, often combined with Maven or Gradle. Once the jar/war is built, those are no longer needed and a JRE suffices.

A solution is to use multi-stage builds: one stage does the actual building, and the result is then copied into the final image that will be deployed. Both stages can be combined in a single Dockerfile:

## build stage
FROM maven:3.8-openjdk-11 AS build
COPY src /home/app/src
COPY pom.xml /home/app
RUN mvn -B -f /home/app/pom.xml clean package

## run stage
FROM openjdk:11.0.11-jre-slim
COPY --from=build /home/app/target/app-1.0-SNAPSHOT.jar /usr/local/lib/app.jar
COPY .config.yaml /.config.yaml
EXPOSE 8080
ENTRYPOINT ["java","-jar","/usr/local/lib/app.jar"]

Best practice 7: Use the least privileged user

By default, Docker containers run as user root. Often the application doesn’t need this. It poses a security risk, because a process running as root inside the container could potentially have root permissions on the Docker host as well. An attacker who breaks out of the container onto the host would then have root privileges there. This is called privilege escalation.

So, use the least privileged user possible for your application. Create a dedicated group and user for your application, set the required (file) permissions and switch to that user with the USER directive:

RUN groupadd -r app_user && useradd -g app_user app_user && chown -R app_user:app_user /app
USER app_user
CMD node index.js
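
The groupadd and useradd commands above are available on Debian- and Ubuntu-based images. Alpine-based images ship BusyBox instead, which provides addgroup and adduser. A minimal sketch of the same idea on Alpine, assuming the alpine:3.15 tag, the app_user name from above and an /app directory:

```dockerfile
FROM alpine:3.15

# -S creates a system group/user; -G adds the new user to the given group.
RUN addgroup -S app_user && adduser -S -G app_user app_user

WORKDIR /app
# Give the copied files to the unprivileged user right away.
COPY --chown=app_user:app_user . .

# Every instruction after this line, and the container itself, runs as app_user.
USER app_user
```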

Base images may already have a least privileged user that you can use:

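FROM node:10-alpine
WORKDIR /app
COPY . .
# The official node images already ship with an unprivileged user named "node"
RUN chown -R node:node /app
USER node
CMD ["node", "index.js"]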

Best practice 8: Scan your image for security vulnerabilities

Docker Hub can scan an image for known vulnerabilities when you push it, provided that scanning is enabled for the repository. This makes it easy to check for known issues.

Conclusion

With the best practices mentioned above, the Docker images you create will be smaller, easier to cache, less vulnerable and more reliable.

January 9, 2022