This profile configures Snakemake installed within a Singularity container to run on the SLURM Workload Manager.
The project integrates the SLURM Snakemake profile and singreqrun.
singslurm2 also supports spawning extra Singularity containers using Snakemake's Singularity support; however, this requires Snakemake 6.10.0, and only rules written with the shell directive are supported.
Installation is performed by downloading a packaged release and then unzipping it.
Assuming the latest release is $VERSION, you can run:
$ cd ~
$ wget https://github.com/frankier/singslurm2/releases/download/$VERSION/singslurm2.pkg.zip
$ unzip singslurm2.pkg.zip
After performing these steps, the entrypoint is ~/singslurm2/run.sh. Arguments are passed using environment variables. Compulsory arguments are shown in bold:
- Options relating to Singularity:
  - $SIF_PATH: Path to the Singularity SIF file within which both the Snakemake control job and the execution of the rules will occur on the cluster.
  - $SING_EXTRA_ARGS: Extra arguments to pass to singularity exec and singularity shell. Most often you might use this to pass in extra --bind options.
- Options relating only to Snakemake:
  - $SNAKEFILE: Path within the container to the directory containing the Snakefile.
- Options relating to SLURM/Snakemake:
  - $CLUSTER_CONFIG: Path within the container to a file mapping rules to resource requirements.
  - $CLUSTER_NAME: Some HPCs define multiple SLURM clusters. In this case you may need to specify the cluster using this environment variable.
  - $NUM_JOBS: Maximum number of jobs at the Snakemake level. Each may include many SLURM tasks. 128 by default.
  - $SBATCH_DEFAULTS: Default arguments to pass to sbatch.
  - $RESTART_TIMES: Maximum number of times to restart a failing job. 3 by default.
  - $LATENCY_WAIT: The number of seconds to wait for a rule's output before concluding it has failed. 30 by default.
- Other options:
  - $TRACE: Trace the script; useful for debugging.
  - $PRE_COORDINATOR_SCRIPT and $PRE_JOB_SCRIPT: Script fragments, typically setting or modifying environment variables, which will run at the beginning of the coordinator and job processes respectively. See "Customisation for different cluster environments" for typical usage.
Anything passed as actual arguments to run.sh is passed on to Snakemake within the container.
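As a rough illustration of how the environment variables and pass-through arguments combine, a direct invocation might look like the sketch below. All paths and values are hypothetical placeholders, and the choice of which variables to set is an assumption; --dry-run and --printshellcmds are ordinary Snakemake flags that are simply forwarded.

```shell
# Hypothetical paths and values: replace them with your own.
export SIF_PATH="$HOME/images/myproj.sif"         # image on the host
export SNAKEFILE="/opt/myproj/workflow"           # directory inside the container
export CLUSTER_CONFIG="/opt/myproj/cluster.yaml"  # file inside the container
export NUM_JOBS=64                                # overrides the default of 128

RUN_SH="$HOME/singslurm2/run.sh"
if [ -x "$RUN_SH" ]; then
  # Everything after run.sh is handed to Snakemake inside the container.
  "$RUN_SH" --dry-run --printshellcmds
else
  echo "singslurm2 entrypoint not found at $RUN_SH" >&2
fi
```

Starting with a dry run is a cheap way to check that the container, Snakefile directory and cluster configuration are all found before any SLURM jobs are submitted.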
If you want to run the control job on a cluster node, rather than a login node, just put your environment variable arguments and the execution of ~/singslurm2/run.sh in a script run_myproj.sh and submit it manually, e.g.:
$ sbatch --time 5-00:00:00 ./run_myproj.sh
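One way to create the run_myproj.sh wrapper described above is sketched here. The paths inside it are hypothetical placeholders, and forwarding "$@" to run.sh is a design choice of this sketch rather than something singslurm2 requires.

```shell
# Write the wrapper; the quoted 'EOF' keeps $HOME and $@ unexpanded until
# the wrapper itself runs.
cat > run_myproj.sh <<'EOF'
#!/bin/bash
# Hypothetical wrapper: every path below is a placeholder.
export SIF_PATH="$HOME/images/myproj.sif"
export SNAKEFILE="/opt/myproj/workflow"
export CLUSTER_CONFIG="/opt/myproj/cluster.yaml"
# Forward any arguments given to this wrapper on to Snakemake.
exec "$HOME/singslurm2/run.sh" "$@"
EOF
chmod +x run_myproj.sh

# Check the syntax locally before submitting the script with sbatch.
bash -n run_myproj.sh && echo "run_myproj.sh parses cleanly"
```

Because the control job stays alive for the whole workflow, the time limit given to sbatch needs to cover the full expected run, not just a single rule.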
Some cluster computing environments have different directory layouts. They might, for example, have separate project and scratch data directory trees. In some cases, scratch directories are only available on SLURM job nodes. You can use $PRE_COORDINATOR_SCRIPT and $PRE_JOB_SCRIPT to customise things, for example to set --bind within $SING_EXTRA_ARGS at the last moment, inside the environment where those directories and the environment variables pointing at them are actually available.
A typical usage might be to put this cluster-specific setup into a script, which you then source in your job-running script, i.e. put source ~/myclustersetup.sh at the top of run_myproj.sh. Examples for CSC (the national Finnish HPC provider) are available in contrib/csc_mahti.sh and contrib/csc_puhti.sh. These might be useful as a starting point for other providers. To use one, you would add for example /path/to/contrib/csc.sh to the beginning of all your job scripts.
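A minimal sketch of what such a cluster setup script could contain follows. It assumes that $PRE_JOB_SCRIPT is evaluated as shell code at the start of each job process, as its description suggests; the JOB_SCRATCH variable and the /scratch mount point are hypothetical, and the final eval only simulates a job node locally.

```shell
# Hypothetical contents of ~/myclustersetup.sh.
# Single quotes defer expansion until the fragment runs in the job process,
# where the node-local scratch directory actually exists.
export PRE_JOB_SCRIPT='
if [ -n "${JOB_SCRATCH:-}" ]; then
  export SING_EXTRA_ARGS="${SING_EXTRA_ARGS:-} --bind ${JOB_SCRATCH}:/scratch"
fi
'

# Local simulation only: pretend we are on a job node with scratch available.
JOB_SCRATCH=/tmp/job_scratch
eval "$PRE_JOB_SCRIPT"
echo "SING_EXTRA_ARGS is now: $SING_EXTRA_ARGS"
```

Deferring the expansion matters because a variable such as the scratch path may be unset or different on the login node where the setup script is first sourced.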
More information can be found in the documentation of the SLURM Snakemake profile and singreqrun.