Docker and the RIS Compute Service
How Docker Relates to the HPC Environment
Why Docker?
In the past, HPC environments were built upon "static operating system images", which is to say that each execution node of a cluster had the same operating system image, with a set of applications that were curated and installed by the managing IT team. Each application was therefore exposed to every application's "dependency tree". For example, if Application A required Library Z version 1, but Application B required Library Z version 2, the applications conflicted with each other. Various methods were devised over the years to try to isolate application environments from each other. The development of "modules" that would use environment variables and shared filesystems was an attempt to solve this problem. In the end, the modules and the environments needed to be built by the cluster managers. End users had limited ability, if any at all, to deploy the software they wanted "on the fly".
Container technologies are not exactly "new". The "chroot" system call has been around since 1979. FreeBSD has had "jails" since 2000. Solaris Containers were introduced in 2004. There have been many. But with the advent of Docker, the ecosystem of support around the container technology has finally made it relatively easy for end users to build these portable runtime environments.
Docker allows users to build their own software environments independently of anyone else. The cluster management team no longer needs to be the gatekeeper controlling the available software.
What is Docker?
As described in What is a Container?,
A Docker container image is a lightweight, standalone, executable package of software that includes everything needed to run an application: code, runtime, system tools, system libraries and settings.
A container image is built and pushed to a container registry for later use. In the RIS Compute Service, a user submits a job to the IBM Spectrum LSF job scheduler. When the job is dispatched, the Docker container image is pulled to an execution node and run there on behalf of the user.
The Relationship Between Users, LSF, and Docker
The following diagram represents relationships between the user and portions of the cluster:
The user submits jobs from the LSF client using the bsub command. The jobs enter the scheduler and traverse the following states:
When the job is dispatched to the execution node there is a wrapper script that constructs the docker run command.
Recall the date command used in the Quick Start:
bsub -Is -G ${group_name} -q general-interactive -a 'docker(alpine)' date
Here, the user, a member of the Active Directory group named ${group_name}, is submitting a job to the general-interactive queue. When the job lands on the execution node, the wrapper script will "pull" the Docker container image named "alpine":
docker pull alpine
The wrapper then constructs a shell script to execute the desired command:
#! /bin/sh
date
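The script-construction step can be sketched roughly as follows. This is a minimal illustration and not the RIS wrapper itself: the function name write_job_script and the use of mktemp are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF or the RIS wrapper): write the
# submitted command into a small /bin/sh script, as the step above describes.
write_job_script() {
    script=$(mktemp) || return 1
    # First line names the interpreter; second line is the command as submitted.
    printf '#! /bin/sh\n%s\n' "$*" > "$script"
    chmod +x "$script"
    # Print the script's path so the caller can hand it to the next step.
    echo "$script"
}

job_script=$(write_job_script date)
cat "$job_script"     # shows the two-line script
rm -f "$job_script"
```

Because the script starts with a /bin/sh interpreter line, the image must provide /bin/sh for it to run.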
The wrapper then constructs a docker run command to execute the script:
The important thing to know here is that the user is using Docker, but not directly. There exists both a "job scheduler" and a "wrapper script" between the user and the docker run command.
This matters because many of the features RIS offers users are mediated by this wrapper script, and some Docker features are disallowed, usually for security reasons.
Most features of the docker wrapper are exposed through environment variables. See Docker Wrapper Environment Variables for the list of them and their use.
Public vs Private Registries
Sometimes a user does not want, or is forbidden, to put code into a public registry. RIS suggests use of Docker Hub's private registries as WashU does not currently have a private container registry. See Build Images Using Compute for how to log into and build using private registries on compute.
Locate Preexisting Docker Images
There are many sources of preexisting Docker images, primarily within Docker Hub's public registry. Before building an image from scratch, search for an existing image that meets your requirements.
Build Images Using Compute
If you are unable to locate an image on a public registry that meets all of your needs, you will need to build one, either from scratch or on top of an existing image, and push it to a registry for future use.
RIS recommends building Docker images on a workstation or other computer you have local access to as it makes debugging the build process easier. However, some build processes may require more resources than you have available locally. For these situations, the compute cluster can be used.
The docker_build LSF application accepts all the same arguments as the command-line docker build, except for Windows-specific options such as --security-opt. It works with build contexts that are subdirectories or URLs that are git repos or tarballs.
1. Log Into Public or Private Registry
The docker_build application uses various environment variables to interact with public and private repositories. Registry credentials are stored in the file $HOME/.docker/config.json. Rather than passing the password in as an environment variable, it's a better practice to first log into the registry interactively with LSB_DOCKER_LOGIN_ONLY enabled. The credentials will automatically be populated into the config file and used when pushing to that registry.
If the docker login process would ask for a password, be sure to run the login build with bsub's interactive flag, for example:
Docker Hub:
Other repository:
2. Build and Push Image
Once logged in, repositories accessible with those credentials, both public and private, will be accessible.
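To confirm that the login step populated the config file, a quick check along these lines may help. It is a sketch under assumptions: the function name has_registry_login is made up for this example, it does a plain text search rather than parsing JSON, and it only shows that an entry exists, not that the stored credentials are still valid. https://index.docker.io/v1/ is the key Docker conventionally uses for Docker Hub entries.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the Docker config file mentions the registry.
# $1 is the registry key to look for; $2 optionally overrides the config path.
has_registry_login() {
    registry=$1
    config=${2:-$HOME/.docker/config.json}
    # -F treats the registry as a fixed string, -q suppresses output.
    [ -f "$config" ] && grep -Fq "\"$registry\"" "$config"
}

if has_registry_login "https://index.docker.io/v1/"; then
    echo "Docker Hub entry found"
else
    echo "no Docker Hub entry found; log in first"
fi
```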
For example, this command will submit a job to build a container based on files in the "my_container" subdirectory and tag it with 1.1.4 and 1.1. All the tags that appear, both as a "docker_build" argument and "--tag" option, must also be valid to push to with "docker push", and all tags are pushed after the build completes.
Take note of the -- that separates the arguments to bsub from the arguments to "docker build". Without it, bsub will try to interpret all the arguments for itself and generate an error. Also, the "build context" directory path must be the final argument. Other arguments recognized by "docker build" must appear before the build context argument.
This command will submit a job where the Dockerfile is in a non-standard place and is not named Dockerfile
Docker Images Must Include /bin/sh
By default, the IBM Spectrum LSF software job scheduler requires that the Docker containers launched have a /bin/sh present. Users may observe that Docker uses hello-world as an example in documentation. This container is an example of one that does not include a /bin/sh. Launching a container that does not supply /bin/sh results in a "no such file or directory" error:
This requirement can be worked around with an LSF variable, so that Docker images that do not supply /bin/sh can still be run on the Compute Platform. Please see our LSF env variable documentation for more information.
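Before submitting a job, you can check on a machine with Docker installed whether an image provides /bin/sh. The sketch below is one hedged way to do that. The function name image_has_sh and the DOCKER override variable are invented for this example, and the check pulls the image if it is not already present locally.

```shell
#!/bin/sh
# Hypothetical helper: succeed if /bin/sh can be executed inside the image.
image_has_sh() {
    # Replace the image's ENTRYPOINT with /bin/sh and run a no-op.
    # Any non-zero status (including a failed pull) is treated as "no /bin/sh".
    # DOCKER is an assumed override so the docker binary can be swapped out.
    ${DOCKER:-docker} run --rm --entrypoint /bin/sh "$1" -c 'exit 0' >/dev/null 2>&1
}

# Example use (requires a working Docker installation):
#   image_has_sh alpine      && echo "alpine provides /bin/sh"
#   image_has_sh hello-world || echo "hello-world lacks /bin/sh"
```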
The Job Execution Wrapper Script
The wrapper process that builds the docker run command does several things in order to construct a "safe" docker run command. The following is an example of job submission in order to demonstrate what the wrapper script is doing:
Let us zoom in on the docker run command that is being built:
What does all this mean?
Important takeaways from the above exploration include:
We take care of users and groups.
We set SELinux contexts. SELinux is complex. We will return to this topic later.
We set OS Capabilities, and you do not have access to all of them.
Jobs run as you and never as root or any other user. This may matter for pre-built containers that expect specific users.
You are never allowed to pass in volumes that are not GPFS volumes, i.e. storage1 or scratch1. You are not to see the disk of the execution node.
You are not to see the execution node's /tmp, but rather your job's /tmp.
We automatically pass in the "current working directory", which overrides a container's default WORKDIR. Often this is $HOME, but pay attention to what you need it to be.
We require a Docker ENTRYPOINT that is /bin/sh or omitted. This is a limitation of the IBM Spectrum LSF Docker integration.
We set the hostname within the container to match the execution node it runs on.
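Some of these points map onto standard docker run options. The dry-run sketch below only prints a command line and never executes it. It is not the RIS wrapper's actual command: the function name sketch_docker_run is hypothetical, and SELinux contexts, capabilities, volume mounts, and /tmp handling are left out because their exact form is not shown here.

```shell
#!/bin/sh
# Hypothetical dry run: print, but do not execute, a docker run command line.
# $1 is the image name; $2 is the command to run inside the container.
sketch_docker_run() {
    image=$1
    job_command=$2
    cmd="docker run --rm"
    cmd="$cmd --user $(id -u):$(id -g)"   # run as the submitting user, never root
    cmd="$cmd --hostname $(uname -n)"     # container hostname matches this node
    cmd="$cmd --workdir $PWD"             # current directory overrides WORKDIR
    cmd="$cmd --entrypoint /bin/sh"       # the job is started through /bin/sh
    cmd="$cmd $image -c '$job_command'"
    printf '%s\n' "$cmd"
}

sketch_docker_run alpine date
```

Printing the command instead of running it keeps the example safe to try on any machine, with or without Docker installed.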