I've written about these things before, but as a note to self I just put a compilation here 😊

Containers are widely used as a method of deployment because they simplify the installation of an application.
A container image packages everything an application needs to run, so no additional dependencies have to be installed.

This also ensures that the same version of a container runs the same in every environment.

Containers are not the isolated sandboxes you might think they are, nor are they virtual machines.
You can consider them a compressed package of files that gets decompressed and runs in a simulated separate environment.

Choosing the right base image

While this is not a primer on how to create containers, there are a few things worth noting.

Everything starts with a base image - indicated by the FROM instruction.
When choosing this base image, make sure it is either an official image from a well-known provider or at least an actively maintained image.

FROM nginx:latest 
COPY nginx/default.conf /etc/nginx/conf.d 
EXPOSE 8080

Should security vulnerabilities be detected, you want your base image to be updated by the maintainer as soon as possible.

You can scan the image yourself using, for example, trivy.
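
As a rough sketch of what such a scan could look like in a build script, the function below wraps trivy's image subcommand and fails on HIGH or CRITICAL findings. The function name scan_image and the choice of severities are assumptions for illustration, not something trivy prescribes.

```shell
# Hypothetical helper: scan an image and fail on serious findings.
scan_image() {
  image="$1"
  if ! command -v trivy >/dev/null 2>&1; then
    echo "trivy is not installed, cannot scan $image" >&2
    return 127
  fi
  # --exit-code 1 makes trivy return non-zero when matching vulnerabilities are found
  trivy image --severity HIGH,CRITICAL --exit-code 1 "$image"
}
```

Calling scan_image nginx:latest before pushing would then stop the script whenever the scan reports findings at those severities.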

Important: Once a new base image is available with security fixes, update your Dockerfile and re-build and re-deploy your solution.

If the base image is unknown to you, do your due diligence and check out:

  • Is this a well-known publisher?
  • Does it have a public repository? If so, check whether it's actively maintained
  • If available, check download count - if an image is hardly used by anyone, perhaps you shouldn't either?
  • Run a security scan on the image with tools such as trivy

The latest tag

Many examples will tell you to simply docker pull my_image:latest to get the latest version.

And this makes sense because it will (hopefully) contain all the latest and greatest security fixes and features.

The problem with this is that most container services will by default not pull a new version of a container image if the tag has not changed.
So, if you have my_image:latest specified in your configuration and you restart the container expecting that to trigger a new download, it won't.

In most cases you can configure this behavior by telling the service to always pull, for example with docker run --pull=always or imagePullPolicy: Always in Kubernetes.

A better way is to be very explicit about the version you are using - that way you are certain which version is deployed and can roll back more easily.
There is no way to say "roll back latest to .. the almost-latest", but you can say "roll back to 1.0 from 1.1".
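
One way to enforce explicit versions is a small check in a deployment script. The sketch below is a hypothetical helper (is_pinned is not a Docker command) that treats an image reference as pinned only when it carries a tag other than latest, or a digest.

```shell
# Hypothetical lint: succeed only if the image reference names an explicit version.
is_pinned() {
  ref="$1"
  case "$ref" in
    *@sha256:*) return 0 ;;   # pinned by digest
  esac
  name="${ref##*/}"           # drop registry/namespace, which may contain a port colon
  case "$name" in
    *:latest) return 1 ;;     # explicit but floating tag
    *:*)      return 0 ;;     # explicit tag such as 1.4-12
    *)        return 1 ;;     # no tag at all means latest is implied
  esac
}
```

For example, is_pinned my_image:1.4-12 succeeds, while is_pinned my_image:latest and is_pinned my_image both fail.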

Keep unwanted files out of your container

Just like Git repositories have .gitignore, container builds have a .dockerignore file.

The functionality is the same and it will keep the specified files and folders away from the container build process.

When creating container images, it's easy to just do something like this:

FROM node:16
WORKDIR /src
# just copy everything - we need all the files to run the app, yes?
COPY . .
CMD ["npm", "start"]

If you have any hidden files in your directory, such as SSH keys or config files with credentials, they will also be copied into the container by the COPY . . instruction.

This is because the docker build process will copy everything in the build context - your directory - to a staging area and subsequently add all those files to your container if you have not either:

  • specified exactly which files to add to the container in the COPY instruction
  • added the sensitive files to .dockerignore
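
As an illustration of the second option, the snippet below writes a starter .dockerignore into the build context. The file and folder names in it are assumptions about what a typical project might contain, and CONTEXT_DIR is a hypothetical variable that defaults to the current directory.

```shell
# Directory used as the build context (assumed to be the current directory).
context_dir="${CONTEXT_DIR:-.}"

cat > "$context_dir/.dockerignore" <<'EOF'
# version control and locally installed dependencies
.git
node_modules
# credentials and local configuration (example names)
.env
.ssh
**/*.pem
**/secrets.json
EOF
```

Anything matching these patterns is left out of the build context, so COPY . . can no longer pick it up.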

Running as root

By default, most containers will run as root.

While some applications require this, most don't.
This is bad practice because if you mount a host filesystem into your container (or the process somehow gains access to the underlying host), the container user has the same privileges as the root user on the server your container runs on.

Any container image can replace the user, even if the base image uses root.
Just add something like this to your Dockerfile:

RUN useradd --uid 999999 non-root
USER non-root

The reason the chosen uid is so high (999999) is, again, that if your container user bleeds into the server that runs the container, the same user id will be used there as well.
So, if a user on that system already has the same uid as the container user, the container user will have the same access as that user.

Remember that anything you did before changing the user will have root as the owner, so make sure you fix the permissions if needed or your application might not run.

You can add files to the container and specify permissions in one go:

COPY --chown=non-root /my/project /container
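
To catch an image that still runs as root by accident, a guard like the one below could sit at the top of an entrypoint script. This is only a sketch: running_as_root is a hypothetical helper, and the optional uid argument exists solely so the check can be tried without switching users.

```shell
# Hypothetical guard for the top of an entrypoint script.
# Uses the given uid if one is passed, otherwise the uid of the current process.
running_as_root() {
  uid="${1:-$(id -u)}"
  [ "$uid" -eq 0 ]
}

# Example use inside an entrypoint:
#   if running_as_root; then
#     echo "refusing to start as root" >&2
#     exit 1
#   fi
```

With the USER non-root instruction in place the guard stays silent, and it fails loudly if the instruction is ever dropped.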

Capabilities (Advanced)

So, we don't want to run as root, but we need to be able to do certain things that only root can - how?

Note: Docker Desktop has a few conveniences built in - such as being able to bind to privileged ports (<1024) as non-root.

Keep in mind that while this is handy when working locally, this most likely won't be the case when you deploy your container in a different environment.

Linux capabilities split the privileges of root into smaller units that can be granted to a process individually.
These can be added and removed as needed.

For example, you want to run as non-root and run a webserver on port 80, but only root can open ports below 1024.
Using capabilities we can drop all privileges and only add back in NET_BIND_SERVICE (which is needed to open privileged ports):

docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE my_image:1.4-12
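
To check what a process actually ended up with, you can read the capability mask that Linux exposes in /proc. The sketch below decodes that hex mask; the helper names are made up for illustration, and it relies on CAP_NET_BIND_SERVICE being capability number 10.

```shell
# CAP_NET_BIND_SERVICE is capability number 10 in the Linux capability list.
CAP_NET_BIND_SERVICE=10

# cap_is_set <hex mask> <capability number>
# Succeeds when the bit for that capability is set in the mask.
cap_is_set() {
  mask=$((0x$1))
  [ $(( (mask >> $2) & 1 )) -eq 1 ]
}

# Effective capabilities as seen by a process started from this shell (Linux only).
current_caps() {
  awk '/^CapEff:/ { print $2 }' /proc/self/status
}

# Example:
#   cap_is_set "$(current_caps)" "$CAP_NET_BIND_SERVICE" && echo "can bind low ports"
```

Run inside the container started above, this shows whether NET_BIND_SERVICE really is in the effective set of your process rather than assuming it is.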

See more about this at HackTricks and Official Docker docs

Privileged containers

Running in --privileged mode disables a lot of security checks on a container and is used in a few special cases such as running Docker in Docker.

If you want to run a privileged container, ask yourself why?
And if you still want to - ask yourself another question: Why? (notice the capital W?)

Let's say you happen to mount the entire file system inside the container (who would do that, you ask? It happens 🤷‍♂️)

Here's how we get full control of the host:

docker run -it --rm --privileged -v /:/host ubuntu:22.04 bash

root@2397c8618719:/# chroot /host bash

Any command you now run from inside the container affects the host file system as root - such as resetting the root password, creating a user, etc.

This is because running tools like passwd affects /etc/shadow and by running chroot you said "make /host the root directory of the process bash" so that all commands will write to the host files instead of the container.

Secrets

Be careful with how you build your Dockerfile and include secrets.

Given the following Dockerfile:

FROM ubuntu:22.04 
COPY secrets.json . 
RUN do_upload --credentials-file secrets.json && rm secrets.json

The resulting container image does not have a secrets.json - or does it?

Instructions in your Dockerfile that change the file system (such as COPY and RUN) each result in a layer, and the resulting container image is a combination of these layers.
One layer adds the file, then the next one removes it - but all layers are still part of the container.

We can actually do this to find the secrets.json file:

docker save my_image:v1 > my_image.tar 
mkdir temp_files && tar -xf my_image.tar -C temp_files 
cd temp_files && find . -name "*.tar" -exec tar -xf {} \;
find . -name "secrets.json" -exec cat {} \;
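
If you want to see the mechanism without Docker, the sketch below mimics two layers with plain tar archives. It is a simplified model, not the real image format: the paths and the dummy secret are made up, and the only detail borrowed from actual images is that a deletion is stored as a whiteout marker (.wh.<name>) in the later layer.

```shell
work="$(mktemp -d)"
mkdir "$work/layer1" "$work/layer2"

# Layer 1: the COPY instruction adds the file.
echo '{"token": "example-not-a-real-secret"}' > "$work/layer1/secrets.json"
tar -cf "$work/layer1.tar" -C "$work/layer1" .

# Layer 2: the rm is recorded as a whiteout marker; layer 1 is left untouched.
: > "$work/layer2/.wh.secrets.json"
tar -cf "$work/layer2.tar" -C "$work/layer2" .

# Anyone holding the archives can still read the file from the first layer.
recovered="$(tar -xOf "$work/layer1.tar" ./secrets.json)"
echo "$recovered"
```

The second archive only hides the file when the layers are stacked; the first archive still contains it in full.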

By using multi-stage container builds, we can avoid this, as only the final stage specified in the Dockerfile (and its files) is part of the completed container.

FROM ubuntu:22.04 AS build 
WORKDIR /src 
COPY secrets.json . 
RUN do_upload --credentials-file secrets.json && rm secrets.json 

FROM ubuntu:22.04 AS final 
WORKDIR /app
COPY --from=build /src .

In this version of the container, secrets.json will not be part of the final container's file layers 👍

Build arguments

Build arguments are useful for providing values when building the container (they are not part of the final container and cannot be used when running the container).
That means it's tempting to think it's safe to pass in secrets like that - keeping them outside of the container and only available at build-time.

Given the following Dockerfile:

FROM ubuntu:22.04 
ARG API_KEY 
RUN apt update && apt install -y curl
RUN curl -H "X-API-KEY:${API_KEY}" https://mysite.com/build-configuration.json

You can then build this image using docker build --build-arg API_KEY=my_secret_key -t my_image:1.0 . and curl will do its thing using the API_KEY build argument.
Once the image is built, you can run the image and find no trace of the build argument.

Safe, yes?

Actually, turns out we can inspect the history of commands that created the image by executing docker history.

docker history my_image:1.0 
IMAGE          CREATED          CREATED BY                                      SIZE      COMMENT 
323198ade7d2   21 seconds ago   RUN |1 API_KEY=my_secret_key /bin/sh -c curl…   0B        buildkit.dockerfile.v0 
<missing>      22 seconds ago   RUN |1 API_KEY=my_secret_key /bin/sh -c apt …   47MB      buildkit.dockerfile.v0 
<missing>      22 seconds ago   ARG API_KEY                                     0B        buildkit.dockerfile.v0 
<missing>      4 weeks ago      /bin/sh -c #(nop)  CMD ["bash"]                 0B
<missing>      4 weeks ago      /bin/sh -c #(nop) ADD file:29c72d5be8c977aca…   77.8MB

Suddenly, our secret re-appears!
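
A check like this can be automated before an image is pushed. The helper below is a hypothetical sketch (leaks_secret is not part of Docker): it reads history text on stdin and succeeds if the given value appears in it.

```shell
# Hypothetical check: succeed if the value in $1 appears in the text on stdin.
leaks_secret() {
  grep -qF -- "$1"
}

# Example use against a real image (needs Docker, so it is left as a comment):
#   docker history --no-trunc --format '{{.CreatedBy}}' my_image:1.0 \
#     | leaks_secret "my_secret_key" && echo "secret is visible in image history"
```

The --no-trunc flag matters here, since the default output shortens the CREATED BY column and could hide a match.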

If you use multi-stage container builds, the build arguments only show up if they are used in the final stage - which becomes the actual completed container.
If you use your build argument in an earlier build stage, it will not show up in the docker history output, as that stage has been discarded before assembling the final container.

FROM ubuntu:22.04 AS build 
ARG API_KEY 
RUN apt update && apt install -y curl 
RUN curl -H "X-API-KEY:${API_KEY}" https://mysite.com/build-configuration.json 

FROM ubuntu:22.04 AS final

Running docker history on the rebuilt image now shows only the entries of the base image:

docker history my_image:1.0
IMAGE          CREATED       CREATED BY                                      SIZE      COMMENT 
b9db0844ba2c   4 weeks ago   /bin/sh -c #(nop)  CMD ["bash"]                 0B
<missing>      4 weeks ago   /bin/sh -c #(nop) ADD file:29c72d5be8c977aca…   77.8MB

Additional information

For more tips, check out the OWASP Docker Security Cheat Sheet.