Python and Python Environments with Micromamba
Cluster users can use Micromamba to install and manage Python environments.
Micromamba, part of the Mamba project, is a Linux executable that, unlike Conda, resides outside of a Python/Anaconda installation and does not require a "base" environment to build from. This avoids the inconvenience of having Anaconda initialize the base environment at login when that is not strictly necessary.
On the ENGR cluster, Micromamba is already in your PATH at /project/compute/bin/micromamba.
Enabling Micromamba in your shell
You can enable Micromamba by adding this line to your .bashrc in your home directory:
eval "$(micromamba shell hook --shell bash)"
Unlike Anaconda, this will not immediately drop you into a base environment, nor will it activate any configured environment. You can also execute that same command directly to initialize Micromamba immediately in your current shell.
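If your .bashrc is also read on hosts where Micromamba is not installed, the hook line can be wrapped in a guard. This is a minimal sketch for bash; the guard is an assumption added here, not part of the cluster's own setup.

```shell
# ~/.bashrc fragment: load the Micromamba hook only when the executable is on PATH.
if command -v micromamba >/dev/null 2>&1; then
    eval "$(micromamba shell hook --shell bash)"
fi
```

With the guard in place, a shell on a host without Micromamba starts without an error message.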
Creating a Micromamba Environment
Setting a .condarc
Micromamba uses the file ~/.condarc to determine which channels are available to it. If you don't have one from another source, a common default might be:
channel_priority: strict
channels:
- pytorch
- conda-forge
- defaults
Environments can be housed in your cluster home directory, or under /project/python01/conda. If you have access to one, you can create environments in RIS storage as well.
Your home directory has a default 15GB quota, while /project/python01/conda grants each user a 25GB quota, so the latter may be more appropriate for larger or multiple environments.
/project/python01/conda is only available to research cluster users.
/project/python01 is not backed up and is not appropriate for data storage.
If your code downloads data at runtime, such as HuggingFace AI models, make sure it does not store those files in the environment location. Modify the code to write them to RIS storage if they are needed permanently, or to /scratch on the execution node otherwise.
Make a directory under /scratch at job runtime:
mkdir /scratch/mywustlkey.$LSB_JOBID
This creates a directory specific to the job you are running.
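The scratch directory and the download location can be set up together at the start of a job. The following is a sketch under stated assumptions: $USER is taken to match your WUSTL Key login name, HF_HOME is the cache variable read by the Hugging Face libraries, and the fallback to a temporary directory is only there so the snippet can be tried outside a job.

```shell
# Sketch only: per-job scratch directory with the model cache redirected into it.
scratch_root="/scratch"
# Assumption: fall back to a temporary directory when /scratch is missing or
# not writable, for example when trying this outside a job.
if [ ! -d "$scratch_root" ] || [ ! -w "$scratch_root" ]; then
    scratch_root="${TMPDIR:-/tmp}"
fi

# LSB_JOBID is set by the scheduler inside a job; fall back to the shell PID.
job_dir="$scratch_root/${USER:-mywustlkey}.${LSB_JOBID:-$$}"
mkdir -p "$job_dir/hf_cache"

# Model downloads now land in the job directory rather than in the environment.
export HF_HOME="$job_dir/hf_cache"

echo "Job scratch directory: $job_dir"
```

For models that must persist between jobs, point HF_HOME at a directory in RIS storage instead.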
First, choose your location and create a directory for the environment. This document will proceed using the /project/python01/conda location. "mywustlkey" should be your own WUSTL Key login name.
mkdir /project/python01/conda/mywustlkey
creates the directory. You can then create a blank environment and activate it with:
micromamba create -p /project/python01/conda/mywustlkey/mambatest
micromamba activate /project/python01/conda/mywustlkey/mambatest
That same activate command will be used to re-activate the environment later, either during an interactive session or before code is executed in a batch job.
Once activated, you are in a blank environment, with no packages or Python executable.
You can install packages, starting with the desired Python version as a base:
micromamba install python==3.12
There will be a flurry of activity after this command, and you'll be prompted to accept the changes. Micromamba will then install the minimal set of packages needed to support a basic Python installation.
Executing the Micromamba Environment
To activate in the future, in a terminal or batch job:
micromamba activate /project/python01/conda/mywustlkey/mambatest
You can also execute a command via the "run" subcommand, which may be helpful in batch jobs, especially where multiple environments might be called in sequence:
micromamba run -p /project/python01/conda/mywustlkey/mambatest mycommand
For example:
[user@node21 ~]$ micromamba run -p /project/python01/conda/mywustlkey/mambatest python --version
Python 3.12.0
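When a batch job calls several environments in sequence, a small wrapper keeps each step explicit. The sketch below is illustrative only: the second environment name (othertest) and the function name run_in_env are hypothetical, and python --version stands in for your own commands.

```shell
# Hypothetical layout: two environments under one personal directory.
env_base="/project/python01/conda/mywustlkey"

# Run a command inside the environment at the given path.
run_in_env() {
    env_path="$1"
    shift
    if ! command -v micromamba >/dev/null 2>&1; then
        echo "micromamba not found on PATH" >&2
        return 127
    fi
    micromamba run -p "$env_path" "$@"
}

# Call each environment in turn and stop at the first failure.
for name in mambatest othertest; do
    if ! run_in_env "$env_base/$name" python --version; then
        echo "Step failed in environment: $name" >&2
        break
    fi
done
```

Because each step goes through "run", the job never has to activate or deactivate an environment between steps.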
Micromamba Environment Roots
You can manage multiple environments under a root environment directory, much as you would with Conda. Choose a location for your environment root, make the directory, and set the MAMBA_ROOT_PREFIX environment variable:
mkdir -p /project/python01/conda/mywustlkey/micromamba
export MAMBA_ROOT_PREFIX=/project/python01/conda/mywustlkey/micromamba
You will want to add the "export" command to your .bashrc to make that selection permanent. You can manage multiple roots by changing that prefix definition as well.
Create an environment with:
micromamba create -n rooted-env python==3.12
and activate it with:
micromamba activate rooted-env
or, again, run a command in the environment with:
micromamba run -n rooted-env mycommand
Jupyter Notebook Micromamba Environments
Assuming you have the ipython and ipykernel packages (and their dependencies) in your Micromamba environment, you can make the environment available as a kernel in the Open OnDemand (OOD) Jupyter notebook by activating the environment and installing the kernel:
micromamba activate /project/python01/conda/mywustlkey/mambatest
ipython kernel install --user --name=mambatest
The installed kernel will show up in the kernel dropdown in Jupyter, and choosing it will start a notebook in that environment.
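To confirm that the kernel was registered before opening Jupyter, you can look for its spec file. This sketch assumes Jupyter's default per-user data directory on Linux (~/.local/share/jupyter, unless JUPYTER_DATA_DIR or XDG_DATA_HOME is set); the variable names are illustrative.

```shell
# Name given to --name when the kernel was installed.
kernel_name="mambatest"

# Assumption: default per-user Jupyter data directory on Linux.
jupyter_data="${JUPYTER_DATA_DIR:-${XDG_DATA_HOME:-$HOME/.local/share}/jupyter}"
kernel_dir="$jupyter_data/kernels/$kernel_name"

if [ -f "$kernel_dir/kernel.json" ]; then
    kernel_status="found"
    echo "Kernel spec found: $kernel_dir/kernel.json"
else
    kernel_status="missing"
    echo "No kernel spec at $kernel_dir; re-run the install in the activated environment."
fi
```

The kernel.json file records the path of the Python interpreter used for the kernel, so it is also a quick way to check which environment a kernel points at.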
Runtime Micromamba Environments
Micromamba is typically much faster than Conda at resolving and installing environments. If you have a defined environment file, you can create the environment on the fly during a job, adding the "-y" flag to the create command to automatically accept the confirmation prompt:
mkdir -p /scratch/mywustlkey.$LSB_JOBID/mambaenv
micromamba create -y -p /scratch/mywustlkey.$LSB_JOBID/mambaenv -f environment.yml
micromamba activate /scratch/mywustlkey.$LSB_JOBID/mambaenv
Placing it under /scratch makes it disposable: the system will eventually erase it, and any new job will download a fresh copy.
A simple PyTorch environment:
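Put together, the body of a batch job might look like the sketch below. It assumes bash, an environment.yml in the job's working directory, and a hypothetical script named train.py. It loads the shell hook explicitly because a batch shell does not necessarily read your .bashrc, and it does nothing when Micromamba or the environment file is missing.

```shell
env_file="environment.yml"
job_dir="/scratch/${USER:-mywustlkey}.${LSB_JOBID:-manual}"
env_dir="$job_dir/mambaenv"

if command -v micromamba >/dev/null 2>&1 && [ -f "$env_file" ]; then
    # Make "micromamba activate" available in this non-interactive shell.
    eval "$(micromamba shell hook --shell bash)"

    mkdir -p "$job_dir"
    # -y accepts the confirmation prompt so the job does not wait for input.
    micromamba create -y -p "$env_dir" -f "$env_file"
    micromamba activate "$env_dir"

    # train.py is a placeholder for your own code.
    python train.py
else
    echo "Skipping: micromamba or $env_file not available" >&2
fi
```

If activation is not needed, the last two steps can be replaced by a single "micromamba run -p" call against the same path.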
name: mambaenv
channels:
- conda-forge
- nvidia
- pytorch
dependencies:
- cudatoolkit=11.1
- python=3.8
- pytorch
This environment took about 5 minutes to build, most of which was spent downloading CUDA. Specify package versions so that the environment is the same every time.
As these environments are ephemeral and local to the node they are created on, they are not appropriate for use with the Open OnDemand Jupyter notebook.
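One way to keep the environment stable between jobs is to give every dependency an explicit version. The sketch below writes such a file from the shell. The version numbers are an illustrative combination, not a recommendation, and the output path is a hypothetical location.

```shell
# Hypothetical location; in practice keep the file with your project code.
env_file="${TMPDIR:-/tmp}/environment.pinned.yml"

# Each dependency names a version so the main packages do not drift between builds.
cat > "$env_file" <<'EOF'
name: mambaenv
channels:
  - conda-forge
  - nvidia
  - pytorch
dependencies:
  - python=3.8
  - cudatoolkit=11.1
  - pytorch=1.8.0
EOF

echo "Wrote $env_file"
```

The job then passes this file to the create command with -f, exactly as with an unpinned file.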