How to properly run a container with containerd's ctr using --uidmap/--gidmap and the --net-host option

I'm running a container with ctr. In addition to using user namespaces to map the container's root user to another user on the host, I want to make the host network available to the container, so I'm using the --net-host option. Using a very simple test container

$ cat Dockerfile
FROM alpine
ENTRYPOINT ["/bin/sh"]

I try it with

sudo ctr run --rm --uidmap "0:1000:999" --gidmap "0:1000:999" --net-host docker.io/library/test:latest test

which gives me the following error

ctr: OCI runtime create failed: container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"sysfs\\\" to rootfs \\\"/run/containerd/io.containerd.runtime.v2.task/default/test/rootfs\\\" at \\\"/sys\\\" caused \\\"operation not permitted\\\"\"": unknown

Everything works fine if I either

  1. remove the --net-host flag or
  2. remove the --uidmap/--gidmap arguments

I tried adding the host user with uid=1000 to the netdev group, but I still get the same error. Do I need to use network namespaces?

EDIT:

Meanwhile, I found out that the issue is within runc. If I use user namespaces by adding the following to the config.json

    "linux": {
        "uidMappings": [
            {
                "containerID": 0,
                "hostID": 1000,
                "size": 999
            }
        ],
        "gidMappings": [
            {
                "containerID": 0,
                "hostID": 1000,
                "size": 999
            }
        ],

and additionally do not use a network namespace, which means leaving out the entry

            {
                "type": "network"
            },

within the "namespaces" section, I get the following error from runc:

$ sudo runc run test
WARN[0000] exit status 1
ERRO[0000] container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"sysfs\\\" to rootfs \\\"/vagrant/test/rootfs\\\" at \\\"/sys\\\" caused \\\"operation not permitted\\\"\""
container_linux.go:349: starting container process caused "process_linux.go:449: container init caused \"rootfs_linux.go:58: mounting \\\"sysfs\\\" to rootfs \\\"/vagrant/test/rootfs\\\" at \\\"/sys\\\" caused \\\"operation not permitted\\\"\""
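
The failing combination described above (ID mappings present, no "network" entry under "namespaces", and a sysfs mount) can be checked for in a config.json before handing it to runc. The following Python sketch is only an illustration: the function name has_sysfs_hostnet_conflict and the example_spec dictionary are made up for this example, and it treats a missing "network" namespace entry as "shares the host network", which ignores specs that join an existing namespace via "path".

```python
def has_sysfs_hostnet_conflict(spec):
    """Return True if the spec maps IDs via a user namespace, has no
    "network" namespace entry, and still asks for a sysfs mount."""
    linux = spec.get("linux", {})
    # uidMappings/gidMappings only make sense with a user namespace.
    has_id_mappings = bool(linux.get("uidMappings") or linux.get("gidMappings"))
    # Assumption: no "network" entry means the host network namespace is used.
    namespace_types = {ns.get("type") for ns in linux.get("namespaces", [])}
    uses_host_network = "network" not in namespace_types
    mounts_sysfs = any(m.get("type") == "sysfs" for m in spec.get("mounts", []))
    return has_id_mappings and uses_host_network and mounts_sysfs


# Hypothetical, trimmed-down spec mirroring the fragments shown above.
example_spec = {
    "linux": {
        "uidMappings": [{"containerID": 0, "hostID": 1000, "size": 999}],
        "gidMappings": [{"containerID": 0, "hostID": 1000, "size": 999}],
        "namespaces": [{"type": "pid"}, {"type": "mount"}, {"type": "user"}],
    },
    "mounts": [
        {
            "destination": "/sys",
            "type": "sysfs",
            "source": "sysfs",
            "options": ["nosuid", "noexec", "nodev", "ro"],
        },
    ],
}

print(has_sysfs_hostnet_conflict(example_spec))
```

For a real bundle, the dictionary would come from json.load() on the bundle's config.json instead of the inline example.
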
Interceptor answered 9/3, 2021 at 13:8

I finally found the answer in this issue in runc. It comes down to a kernel restriction: mounting sysfs requires CAP_SYS_ADMIN in the user namespace that owns the current network namespace. The host network namespace was not created by the host user that the container's root user is mapped to, so that user does not have CAP_SYS_ADMIN there, and the sysfs mount fails with "operation not permitted".

From the discussion in the runc issue, I see the following options for now:

  1. remove mounting of sysfs.

    Within the config.json that runc uses, remove the following section within "mounts":

        {
            "destination": "/sys",
            "type": "sysfs",
            "source": "sysfs",
            "options": [
                "nosuid",
                "noexec",
                "nodev",
                "ro"
            ]
        },
    

    In my case, I also couldn't mount /etc/resolv.conf. After removing both mounts, the container ran fine and had host network access. This does not work with ctr, though.

  2. set up a bridge from the host network namespace to the network namespace of the container (see here and slirp4netns).

  3. use docker or podman if possible; they seem to use slirp4netns for this purpose. There is also an old moby issue that might be interesting.
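
Option 1 can also be scripted rather than edited by hand. The following is a minimal Python sketch, assuming a runc bundle whose config.json contains mounts with the destinations /sys and /etc/resolv.conf; the helper names drop_mounts and patch_config, and the example mounts, are hypothetical and only serve to illustrate the idea.

```python
import json


def drop_mounts(spec, destinations):
    """Return a shallow copy of spec without the mounts whose
    "destination" is listed in destinations."""
    patched = dict(spec)
    patched["mounts"] = [
        m for m in spec.get("mounts", [])
        if m.get("destination") not in destinations
    ]
    return patched


def patch_config(path, destinations=("/sys", "/etc/resolv.conf")):
    """Rewrite the config.json at path in place, dropping the given mounts."""
    with open(path) as f:
        spec = json.load(f)
    patched = drop_mounts(spec, set(destinations))
    with open(path, "w") as f:
        json.dump(patched, f, indent=4)
        f.write("\n")
    return patched


# Illustrative mounts only; a real config.json contains more entries.
example = {
    "mounts": [
        {"destination": "/proc", "type": "proc", "source": "proc"},
        {"destination": "/sys", "type": "sysfs", "source": "sysfs"},
        {"destination": "/etc/resolv.conf", "type": "bind",
         "source": "/etc/resolv.conf"},
    ]
}
remaining = drop_mounts(example, {"/sys", "/etc/resolv.conf"})
print([m["destination"] for m in remaining["mounts"]])
```

Calling patch_config("config.json") inside the bundle directory before sudo runc run test would apply the same change to the file on disk. It is worth keeping a backup of the original file.
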

Interceptor answered 10/3, 2021 at 14:59