
Parabricks Quickstart: Earlier Versions


This page is a quick start guide for earlier versions of Parabricks that remain available but are no longer directly supported. Please refer to the latest version for a supported release.

v3.6

Getting Started

  • Connect to a compute client.

ssh wustlkey@compute1-client-1.ris.wustl.edu
  • Prepare the computing environment before submitting a job.

# Use scratch file system for temp space
export SCRATCH1=/scratch1/fs1/${COMPUTE_ALLOCATION}

# Use Active storage for input and output data
export STORAGE1=/storage1/fs1/${STORAGE_ALLOCATION}/Active

# Mapping for the Parabricks license(s) is required
export LSF_DOCKER_VOLUMES="/scratch1/fs1/ris/application/parabricks-license:/opt/parabricks $SCRATCH1:$SCRATCH1 $STORAGE1:$STORAGE1 $HOME:$HOME"
export PATH="/opt/miniconda/bin:$PATH"

# Use host level communications for the GPUs
export LSF_DOCKER_NETWORK=host

# Use debug flag when trying to figure out why your job failed to launch on the cluster
#export LSF_DOCKER_RUN_LOGLEVEL=DEBUG

# Override the container entrypoint: the Parabricks image defines its own, but the cluster requires /bin/sh by default
export LSF_DOCKER_ENTRYPOINT=/bin/sh

# Create tmp dir
export TMP_DIR="${STORAGE1}/parabricks-tmp"
[ ! -d "$TMP_DIR" ] && mkdir "$TMP_DIR"
  • Submit the job. Basic commands:

    • V100 Hardware

    bsub -n 16 -M 64GB -R 'gpuhost rusage[mem=64GB] span[hosts=1]' -q general -gpu "num=1:gmodel=TeslaV100_SXM2_32GB:j_exclusive=yes" -a 'docker(gcr.io/ris-registry-shared/parabricks)' pbrun command options
    • A100 Hardware

    bsub -n 16 -M 64GB -R 'gpuhost rusage[mem=64GB] span[hosts=1]' -q general -gpu "num=1:gmodel=NVIDIAA100_SXM4_40GB:j_exclusive=yes" -a 'docker(gcr.io/ris-registry-shared/parabricks_ampere)' pbrun command options
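
As a concrete end-to-end example, the following submits a fq2bam alignment run on one V100. The reference and FASTQ paths are hypothetical placeholders; point them at your own data under $STORAGE1.

bsub -n 16 -M 64GB -R 'gpuhost rusage[mem=64GB] span[hosts=1]' -q general \
    -gpu "num=1:gmodel=TeslaV100_SXM2_32GB:j_exclusive=yes" \
    -a 'docker(gcr.io/ris-registry-shared/parabricks)' \
    pbrun fq2bam \
        --ref ${STORAGE1}/refs/GRCh38.fa \
        --in-fq ${STORAGE1}/fastq/sample_R1.fastq.gz ${STORAGE1}/fastq/sample_R2.fastq.gz \
        --out-bam ${STORAGE1}/results/sample.bam \
        --tmp-dir ${TMP_DIR}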

Compute Group

  • If you are a member of more than one compute group, you will need to specify an LSF user group, either with -G group_name on the bsub command line or by setting the LSB_SUB_USER_GROUP environment variable.
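
Either form works; the group name compute-mylab below is a hypothetical placeholder for your own LSF group.

# Per-job: pass the group with -G on the bsub command line
bsub -G compute-mylab -n 16 -M 64GB -R 'gpuhost rusage[mem=64GB] span[hosts=1]' -q general -gpu "num=1:gmodel=TeslaV100_SXM2_32GB:j_exclusive=yes" -a 'docker(gcr.io/ris-registry-shared/parabricks)' pbrun command options

# Per-session: set the variable once and omit -G
export LSB_SUB_USER_GROUP=compute-mylab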

Known Issues

  • VQSR does not support gzipped files. As a workaround, decompress inputs before running VQSR (see the sketch after this list).

  • CNVKit --count-reads does not work as expected. A separate CNVKit Docker image can be used as an alternative to this option.
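
A minimal workaround sketch for the VQSR limitation, assuming the gzipped input is a VCF under $STORAGE1 (the path is a hypothetical placeholder):

# Decompress the gzipped VCF first, keeping the original (-k)
gunzip -k ${STORAGE1}/vcf/sample.vcf.gz

# ...then pass the resulting ${STORAGE1}/vcf/sample.vcf to pbrun vqsr,
# using the options described in the official Parabricks documentation.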

Additional Information

  • You may need to adjust your cores (-n) and memory (-M and mem) depending on your data set; a scaled example appears at the end of this section.
    • A 1-GPU job should request 64GB of CPU RAM and at least 16 CPU threads.

    • A 2-GPU job should request 100GB of CPU RAM and at least 24 CPU threads.

    • A 4-GPU job should request 196GB of CPU RAM and at least 32 CPU threads.

  • You can run this interactively (-Is) or in batch mode in the general or general-interactive queues.

  • You will probably want to keep the GPUs at 4 and RAM at 196GB unless your data set is smaller than the 5GB test data set.

  • There are diminishing returns from using more GPUs on small data sets.

  • Replace command with any of the pbrun commands such as fq2bam, bqsr, applybqsr, or haplotypecaller.

  • Please refer to the official Parabricks documentation for additional details.
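
As a scaled example, the earlier V100 command adjusted to four GPUs per the sizing guidance above would look like this (pbrun command options remain placeholders, as before):

bsub -n 32 -M 196GB -R 'gpuhost rusage[mem=196GB] span[hosts=1]' -q general -gpu "num=4:gmodel=TeslaV100_SXM2_32GB:j_exclusive=yes" -a 'docker(gcr.io/ris-registry-shared/parabricks)' pbrun command options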
