You can get RIS support through our service desk portal. RIS also offers 15-minute virtual office hours sessions Monday through Thursday.
Compute Platforms
Compute platforms, or research computing environments, use a batch processing engine (BPE). This engine facilitates resource sharing among thousands of users. The RIS compute1 platform uses IBM’s LSF; the compute2 platform uses SLURM. They are different software products, but they achieve the same goal. Other batch processing engines include PBS/Torque, SGE, and LoadLeveler.
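Because the two engines offer equivalent commands under different names, a script can check which one is present before it submits work. The sketch below is illustrative only: the function name detect_scheduler is hypothetical, and it assumes the scheduler's client commands (sbatch for SLURM, bsub for LSF) are on the PATH once you are logged in to the platform.

```shell
#!/bin/bash
# Hypothetical helper: report which batch processing engine's client
# tools are available in the current shell.
detect_scheduler() {
    if command -v sbatch >/dev/null 2>&1; then
        echo "slurm"    # SLURM: submit with sbatch, list with squeue, cancel with scancel
    elif command -v bsub >/dev/null 2>&1; then
        echo "lsf"      # LSF: submit with bsub, list with bjobs, cancel with bkill
    else
        echo "none"     # neither scheduler's commands were found on PATH
    fi
}

scheduler=$(detect_scheduler)
echo "Detected batch processing engine: ${scheduler}"
```

On a machine with neither scheduler installed, such as a laptop, the script reports "none".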
A typical batch processing workflow looks like this:
1. A user submits a “job”. A job is a text file that contains the resource requirements and the commands that run your application. The job is submitted to a queue, which some batch processing engines, including SLURM, call a partition.
An example of a SLURM batch job:
#!/bin/bash
#SBATCH --job-name=python-test
#SBATCH --output=test.log
#SBATCH -p general-cpu
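# Print a greeting and the current date; the output goes to test.log
echo 'hello today is' ; date
# Keep the job alive until it is cancelled or reaches its time limit
sleep infinity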
2. The job sits in the queue in the “pending” state. The “scheduler” in the BPE compares the available resources with those the job requests. If enough resources are available and the job’s turn has arrived, the job is allocated those resources.
3. The job moves to an intermediate state called “dispatching”, where the BPE does housekeeping work such as preparing the job environment, setting up software, executing the commands in .bashrc, loading default modules, and running other pre-processing scripts defined by the administrators.
4. After the resources are allocated, the job executes the command(s) defined in the job file (the last two lines of the example above).
5. Finally, once execution finishes, the job status is marked as completed and the resources are released for other users.
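The workflow above can be followed from the command line. The sketch below writes a shortened variant of the example job, submits it, and queries its state. It is a minimal illustration under stated assumptions, not an official RIS procedure: the file name hello.sbatch is hypothetical, the general-cpu partition is taken from the example above, and the submission is skipped on machines where sbatch is not installed.

```shell
#!/bin/bash
# Write a small job file (hypothetical name: hello.sbatch).
job_file="hello.sbatch"
cat > "$job_file" <<'EOF'
#!/bin/bash
#SBATCH --job-name=python-test
#SBATCH --output=test.log
#SBATCH -p general-cpu
echo 'hello today is' ; date
EOF

if command -v sbatch >/dev/null 2>&1; then
    # Submit the job. --parsable prints only the job ID
    # (optionally followed by ";cluster", which cut removes).
    job_id=$(sbatch --parsable "$job_file" | cut -d';' -f1)
    echo "Submitted job ${job_id}"

    # While the job is queued or running, show its ID and state
    # (for example PENDING or RUNNING).
    squeue -j "$job_id" -o "%i %T"

    # After the job has left the queue, the accounting records
    # report its final state (for example COMPLETED).
    sacct -j "$job_id" --format=JobID,JobName,State
else
    echo "sbatch not found; wrote ${job_file} without submitting it"
fi
```

A job this short may finish before squeue is called, in which case squeue may no longer list it and sacct is the place to look. Note that sacct only returns results on clusters where job accounting is enabled.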