Using Slurm in Containers
Use Slurm Commands in Containers
By default, Slurm commands are not available inside container jobs. To use Slurm commands in container jobs, either install Slurm in your container images (recommended) or mount the host Slurm installation into your containers (not recommended). If you decide to use the bare metal/host Slurm installation in your container jobs, take care to make sure your container environments can support it. Shown below are detailed procedures for each method.
Install Slurm in Containers (Recommended Best Practice)
Check the version of the Slurm installation. Shown below is an example command and its output.
[sleong@c2-login-001 ~]$ sinfo --version
slurm 23.02.5
Install munge in your containers. Shown below is an example Ubuntu munge installation command.
apt-get install libmunge2 libmunge-dev
Install the same version of Slurm in your containers.
Add the slurm user to the /etc/passwd file in your containers. Shown below is an example /etc/passwd entry for the slurm user.
slurm:x:450:450::/cm/local/apps/slurm:/bin/bash
Start your container jobs. Shown below is an example command.
srun --pty --container-mounts=/run/munge --container-env=SLURM_CONF --container-image=mycontainer-with-slurm /bin/bash
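Two of the steps above are easy to get subtly wrong: matching the Slurm version and adding the slurm user entry only once. The following is a minimal sketch of shell helpers for both, not an official procedure. The function names (parse_slurm_version, versions_match, ensure_slurm_user) are hypothetical and not part of Slurm, and the sketch assumes the version output has the form "slurm 23.02.5" as in the example above.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of Slurm.

# Extract the version number from `sinfo --version` output,
# assuming the form "slurm 23.02.5" (second whitespace-separated field).
parse_slurm_version() {
    printf '%s\n' "$1" | awk '{print $2}'
}

# Succeed only when two version strings report the same Slurm version.
# $1: version output from the host; $2: version output from the container.
versions_match() {
    [ "$(parse_slurm_version "$1")" = "$(parse_slurm_version "$2")" ]
}

# Append the slurm user entry to a passwd file unless an entry for
# the slurm user already exists, so repeated image builds stay idempotent.
# $1: path to the passwd file; $2: the full passwd entry line.
ensure_slurm_user() {
    passwd_file=$1
    entry=$2
    if ! grep -q '^slurm:' "$passwd_file"; then
        printf '%s\n' "$entry" >> "$passwd_file"
    fi
}
```

One possible use: record the output of sinfo --version on the login node, then call versions_match inside the container with that recorded string and the container's own sinfo --version output. During an image build, ensure_slurm_user /etc/passwd 'slurm:x:450:450::/cm/local/apps/slurm:/bin/bash' would add the example entry shown above; the UID, GID, and home directory should be taken from your own cluster rather than copied from this example.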