How to resolve cgroup error when running docker container inside a docker container?

I am trying to run some multi-container build tests inside a running Ubuntu Docker container that I use to build my application (the overall context is a GitLab CI setup).

I've found that when trying to run containers that specify a memory limit, I encounter errors like this:

ERROR: for <service-name>  Cannot start service <service-name>: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:385: applying cgroup configuration for process caused: cannot enter cgroupv2 "/sys/fs/cgroup/docker" with domain controllers -- it is in threaded mode: unknown

Minimal Working Example

Here is a (nearly) minimal working example:

# start from ubuntu base image
docker run -it --privileged ubuntu:18.04 /bin/bash

# once inside the container, install docker
apt-get update
apt-get remove docker docker-engine docker.io containerd runc
apt-get install -y apt-transport-https ca-certificates curl gnupg lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo \
  "deb [arch=amd64 signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu \
  $(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
apt-get update
apt-get install -y docker-ce docker-ce-cli containerd.io

# start docker daemon
/etc/init.d/docker stop # should already be stopped
dockerd -H unix:///var/run/docker.sock -H tcp://0.0.0.0:2375 &

# run some container -- fails
docker run --memory=1gb eclipse-mosquitto:1.6

# run some container -- works
docker run eclipse-mosquitto:1.6

What I receive as an output (after pulling the image) is:

time="2022-01-27T01:23:20.018095900Z" level=info msg="starting signal loop" namespace=moby path=/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/424ce744b789f06b7f5ff94331df19b995e5de3ace50d4307b35886c9052f2a6 pid=4697
INFO[2022-01-27T01:23:20.064529100Z] shim disconnected                             id=424ce744b789f06b7f5ff94331df19b995e5de3ace50d4307b35886c9052f2a6
ERRO[2022-01-27T01:23:20.064613000Z] copy shim log                                 error="read /proc/self/fd/13: file already closed"
ERRO[2022-01-27T01:23:20.069022100Z] stream copy error: reading from a closed fifo 
ERRO[2022-01-27T01:23:20.072130600Z] stream copy error: reading from a closed fifo 
ERRO[2022-01-27T01:23:20.122636800Z] 424ce744b789f06b7f5ff94331df19b995e5de3ace50d4307b35886c9052f2a6 cleanup: failed to delete container from containerd: no such container 
ERRO[2022-01-27T01:23:20.123051000Z] Handler for POST /v1.41/containers/424ce744b789f06b7f5ff94331df19b995e5de3ace50d4307b35886c9052f2a6/start returned error: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:385: applying cgroup configuration for process caused: cannot enter cgroupv2 "/sys/fs/cgroup/docker" with domain controllers -- it is in an invalid state: unknown 
docker: Error response from daemon: OCI runtime create failed: container_linux.go:380: starting container process caused: process_linux.go:385: applying cgroup configuration for process caused: cannot enter cgroupv2 "/sys/fs/cgroup/docker" with domain controllers -- it is in an invalid state: unknown.
ERRO[0004] error waiting for container: context canceled 
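
The error names cgroup v2 and /sys/fs/cgroup/docker, so one way to see what the inner daemon is working with is a small diagnostic like the sketch below. The function name `describe_cgroup` and its optional path argument are made up for illustration; the files it reads (`cgroup.controllers`, `cgroup.subtree_control`, `cgroup.procs`) are the standard cgroup v2 interface files.

```shell
#!/bin/sh
# describe_cgroup [ROOT]: print the cgroup layout visible under ROOT.
# ROOT defaults to /sys/fs/cgroup; the parameter is a hypothetical addition
# so the function can also be pointed at a scratch directory.
describe_cgroup() {
    root="${1:-/sys/fs/cgroup}"
    # cgroup.controllers at the top of the mount only exists on cgroup v2.
    if [ -f "$root/cgroup.controllers" ]; then
        echo "mode: cgroup v2"
        echo "available controllers: $(cat "$root/cgroup.controllers")"
        # Controllers listed here are the ones child cgroups (e.g. docker/) can use.
        echo "enabled for children: $(cat "$root/cgroup.subtree_control" 2>/dev/null)"
        # Number of processes still attached directly to this cgroup.
        echo "processes here: $(grep -c '' "$root/cgroup.procs" 2>/dev/null)"
    else
        echo "mode: cgroup v1 or hybrid (no cgroup.controllers at $root)"
    fi
}
```

Running `describe_cgroup` with no argument inside the outer container, before starting `dockerd`, would show whether controllers such as `memory` are enabled for child cgroups and whether processes are still attached to the top-level cgroup.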

Possible Solution

One option I've come across is to mount /var/run/docker.sock into the base container when starting it, i.e.:

docker run -it -v /var/run/docker.sock:/var/run/docker.sock --privileged ubuntu:18.04 /bin/bash

which, as I understand it, makes the docker CLI inside the container talk to the host machine's Docker daemon instead of a nested one (my understanding may not be quite right here). However, as mentioned above, I am using a GitLab CI setup, and mounting this volume into the runner's container is not a practical solution for me, as it requires runner-specific configuration.
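
A rough way to check that reading is to compare the host name the daemon reports with the container's own. This is only a sketch: the helper name `which_daemon` and the `DOCKER` override variable are hypothetical, and it assumes the daemon's host name differs from the container's, which is the usual case but not guaranteed.

```shell
#!/bin/sh
# which_daemon: report whether the docker CLI appears to talk to a daemon
# running outside this container.
# DOCKER is a hypothetical override for the CLI binary (defaults to "docker").
which_daemon() {
    # "docker info" exposes the daemon's host name in the Name field.
    daemon_name=$("${DOCKER:-docker}" info --format '{{.Name}}') || return 1
    # Host name of the environment this shell runs in (the container).
    local_name=$(uname -n)
    if [ "$daemon_name" = "$local_name" ]; then
        echo "daemon reports the same host name ($daemon_name): likely a nested daemon"
    else
        echo "daemon host is $daemon_name, this shell runs on $local_name: likely the outer daemon"
    fi
}
```

With the socket mounted, `which_daemon` would be expected to print the second message, since the CLI is then talking to the daemon that also started the base container.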

Another Alternative

I've also come across the more "standard" docker-in-docker (dind) approach, which again works fine provided I mount the docker.sock volume into the container, i.e.:

# start from dind base image
docker run -it -v /var/run/docker.sock:/var/run/docker.sock --privileged docker:dind /bin/sh

# run mqtt -- works
docker run --memory=1gb eclipse-mosquitto:1.6

My Request

Is there any solution that would allow me to get this multi-container setup to work under the following constraints?

  1. I cannot mount /var/run/docker.sock:/var/run/docker.sock into the base container.
  2. I cannot remove the memory limit in the inside containers.
Painting answered 27/1, 2022 at 1:45 Comment(2)
Can you start this sequence from outside a container? The setup required to run a second Docker daemon inside a container is fairly complex, and the canonical advice is to avoid it in CI environments. - Varioloid
In principle, yes - I am experimenting with using the GitLab "shell" executor instead of a docker executor for the CI runner. But the shared runner I would like to use only provides a docker executor, so the goal of this question is to see if it's possible to fix this issue within that existing setup (i.e., where I can't change anything about the CI runner configuration). - Painting

See https://github.com/containerd/containerd/issues/6659, especially https://github.com/moby/moby/blob/38805f20f9bcc5e87869d6c79d432b166e1c88b4/hack/dind#L28-L38:

# cgroup v2: enable nesting
if [ -f /sys/fs/cgroup/cgroup.controllers ]; then
    # move the processes from the root group to the /init group,
    # otherwise writing subtree_control fails with EBUSY.
    # An error during moving non-existent process (i.e., "cat") is ignored.
    mkdir -p /sys/fs/cgroup/init
    xargs -rn1 < /sys/fs/cgroup/cgroup.procs > /sys/fs/cgroup/init/cgroup.procs || :
    # enable controllers
    sed -e 's/ / +/g' -e 's/^/+/' < /sys/fs/cgroup/cgroup.controllers \
        > /sys/fs/cgroup/cgroup.subtree_control
fi

(With modern v2 cgroups, you have to enable nesting for this to work.)
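
To apply this to the setup in the question, the same steps could be wrapped in a function and run inside the outer container before `dockerd` is started. This is a sketch rather than a tested fix: the function name `enable_cgroup_nesting` and its optional path argument are hypothetical, and the body just repeats the `hack/dind` logic quoted above.

```shell
#!/bin/sh
# enable_cgroup_nesting [ROOT]: repeat the hack/dind nesting steps under ROOT.
# ROOT defaults to /sys/fs/cgroup; the parameter is a hypothetical addition
# so the logic can be tried against a scratch directory first.
enable_cgroup_nesting() {
    root="${1:-/sys/fs/cgroup}"
    # Nothing to do unless this is a cgroup v2 hierarchy.
    [ -f "$root/cgroup.controllers" ] || return 0
    # Move every process out of the top-level cgroup, one PID per write;
    # otherwise writing cgroup.subtree_control fails with EBUSY.
    mkdir -p "$root/init"
    xargs -rn1 < "$root/cgroup.procs" > "$root/init/cgroup.procs" || :
    # Turn "cpu memory" into "+cpu +memory" and enable those controllers
    # for child cgroups such as docker/.
    sed -e 's/ / +/g' -e 's/^/+/' < "$root/cgroup.controllers" \
        > "$root/cgroup.subtree_control"
}
```

In the question's reproduction, calling `enable_cgroup_nesting` with no argument would go right before the `dockerd -H unix:///var/run/docker.sock ...` line; the `docker run --memory=1gb` command can then be retried.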

Forecastle answered 13/6, 2023 at 23:39 Comment(2)
I'm struggling to enable this with dind. I checked both links, and I see that the dind image ships a dind binary that includes this code. When I run it, I see a bunch of "echo: write error: Not supported" messages. Any idea what is going on? - Ogive
@StrahinjaDjurić are you running inside a user namespace or something like that? (rootless host Docker, perhaps?) - Forecastle
