error: cgroup namespace 'freezer' not mounted. aborting

Trying to run slurmd:

sudo systemctl start slurmd

Checking the daemon's status shows the following error:

>>sudo systemctl status slurmd
● slurmd.service - Slurm node daemon
   Loaded: loaded (/lib/systemd/system/slurmd.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2020-06-29 18:13:06 MSK; 2s ago
     Docs: man:slurmd(8)
  Process: 13402 ExecStart=/usr/sbin/slurmd $SLURMD_OPTIONS (code=exited, status=1/FAILURE)

июн 29 18:13:06 ecm systemd[1]: Starting Slurm node daemon...
июн 29 18:13:06 ecm slurmd-ecm[13402]: Message aggregation disabled
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: cgroup namespace 'freezer' not mounted. aborting
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: unable to create freezer cgroup namespace
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: Couldn't load specified plugin name for proctrack/cgroup: Plugin init() callback failed
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: cannot create proctrack context for proctrack/cgroup
июн 29 18:13:06 ecm systemd[1]: slurmd.service: Control process exited, code=exited, status=1/FAILURE
июн 29 18:13:06 ecm slurmd-ecm[13402]: error: slurmd initialization failed
июн 29 18:13:06 ecm systemd[1]: slurmd.service: Failed with result 'exit-code'.
июн 29 18:13:06 ecm systemd[1]: Failed to start Slurm node daemon.

I don't know how to fix this and would appreciate any help. I am using Slurm 18.08.05 on Debian 10.

UPD: I changed the ProctrackType value in slurm.conf to proctrack/linuxproc:

ProctrackType=proctrack/linuxproc

Now everything works.
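
To double-check which process-tracking plugin a node is configured with, the value can be read back from the config file. This is only a sketch: get_proctrack_type is a made-up helper name, and the default path /etc/slurm-llnl/slurm.conf is an assumption about the Debian package layout, so pass your own path if it differs.

```shell
# Print the ProctrackType value(s) set in a slurm.conf-style file.
# get_proctrack_type is a hypothetical helper; the default path is an assumption.
get_proctrack_type() {
    conf="${1:-/etc/slurm-llnl/slurm.conf}"
    # Match uncommented ProctrackType= lines (case-insensitive),
    # then strip everything up to and including the "=".
    grep -i '^[[:space:]]*ProctrackType[[:space:]]*=' "$conf" 2>/dev/null \
        | sed 's/^[^=]*=[[:space:]]*//'
}

get_proctrack_type
```

slurmd needs a restart (sudo systemctl restart slurmd) after the value is changed.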

Lattimer answered 29/6, 2020 at 15:19

Contrary to what the documentation (man cgroup.conf) says, the default value of the CgroupMountpoint parameter does not work here. Set it explicitly:

echo CgroupMountpoint=/sys/fs/cgroup >> /etc/slurm-llnl/cgroup.conf

After that you can set ProctrackType back to proctrack/cgroup. Tested on Debian 10.7 with slurmd from slurm-wlm 18.08.5-2.
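
To see where the freezer controller is actually mounted, the mount table can be inspected. A minimal sketch, assuming a cgroup v1 layout; the function name cgroup_controller_mounted and its optional file argument are made up for illustration:

```shell
# Print the mount point(s) of a cgroup v1 controller (default: freezer).
# Returns 0 if at least one matching mount is found, 1 otherwise.
# cgroup_controller_mounted is a hypothetical helper, not part of Slurm.
cgroup_controller_mounted() {
    controller="${1:-freezer}"
    mounts_file="${2:-/proc/mounts}"
    # Field 2 is the mount point, field 3 the filesystem type,
    # field 4 the comma-separated mount options (controllers are listed there).
    awk -v c="$controller" '
        $3 == "cgroup" {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == c) { print $2; found = 1 }
        }
        END { exit (found ? 0 : 1) }
    ' "$mounts_file"
}

cgroup_controller_mounted freezer || echo "freezer controller not found in the mount table"
```

If this prints something like /sys/fs/cgroup/freezer, the parent directory (/sys/fs/cgroup) is the value that CgroupMountpoint should point to.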

Zygoma answered 9/12, 2020 at 22:46

I had the same error on my cluster because my cgroup.conf wasn't configured.

A simple /etc/slurm/cgroup.conf with:

CgroupAutomount=yes
ConstrainCores=no
ConstrainRAMSpace=no

then:

systemctl restart slurmd

India answered 20/4, 2022 at 15:26

In my case, this happened because I didn't create and configure my cgroup.conf on the nodes running slurmd. Once this was added to the same directory as slurm.conf, it worked fine. CgroupMountpoint did not need to be defined as the default was sufficient.
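
A quick way to check a node for this, sketched under assumptions: has_cgroup_conf is a made-up helper, and the default path /etc/slurm/slurm.conf is only a guess, so pass the real location of your slurm.conf as the argument.

```shell
# Succeed if a cgroup.conf file sits in the same directory as the given slurm.conf.
# has_cgroup_conf is a hypothetical helper; the default path is an assumption.
has_cgroup_conf() {
    slurm_conf="${1:-/etc/slurm/slurm.conf}"
    [ -f "$(dirname "$slurm_conf")/cgroup.conf" ]
}

if has_cgroup_conf; then
    echo "cgroup.conf found next to slurm.conf"
else
    echo "cgroup.conf is missing next to slurm.conf"
fi
```

Run it on each node where slurmd runs, since the file has to exist there and not only on the controller.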

Rioux answered 24/5, 2021 at 18:12
