Workflow management¶
Job dependencies¶
Establishing dependencies between different SLURM jobs is a good way to split up parts of a pipeline that have very different resource requirements. This way, resources are used more efficiently and the scheduler can allocate the jobs more easily, which can ultimately shorten the time you wait in the queue.
The way to work with dependent SLURM jobs is to launch them with the --dependency directive and specify the condition that has to be met so that the depending job can start.
Dependent jobs will be allocated in the queue but they will not start until the specified condition is met and the resources are available.
| Condition | Explanation |
|---|---|
| after:jobid[:jobid] | job begins after specified jobs have started |
| afterany:jobid[:jobid] | job begins after specified jobs have terminated |
| afternotok:jobid[:jobid] | job begins after specified jobs have failed |
| afterok:jobid[:jobid] | job begins after specified jobs have finished successfully |
| singleton | job begins after all previously launched jobs with the same name and user have ended |
Note
Job arrays can also be submitted with dependencies, and a job can depend on an array job. In the latter case, the job will start executing once all tasks in the job array meet the dependency criterion (e.g., have terminated, for afterany).
Practical examples¶
Assume you have a series of job scripts, job1.sh, job2.sh, …, job9.sh, that depend on each other in some way.
The first job to be launched has no dependencies. It only needs a standard sbatch command, and we store its job ID in a variable that will be used for the jobs that depend on job1:
Info
To make it easier to capture the job ID at submission time, we add the --parsable flag to SLURM jobs that other jobs depend on. The --parsable option makes sbatch print only the job ID.
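Putting the two together, the first job is submitted and its ID captured like so (this is the same command used in the "Do it yourself" section below):

```shell
# submit job1 and store its job ID for later dependent jobs
jid1=$(sbatch --parsable job1.sh)
```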
Multiple jobs can depend on a single job. If job2 and job3 depend on job1 to finish, no matter the status, we can launch them with the following commands:
jid2=$(sbatch --parsable --dependency=afterany:$jid1 job2.sh)
jid3=$(sbatch --parsable --dependency=afterany:$jid1 job3.sh)
Similarly, a single job can depend on multiple jobs. If job4 depends directly on job2 and job3 (thus indirectly on job1) to finish, we can launch it with:
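A sketch mirroring the commands above (afterany waits for job2 and job3 to terminate, whatever their status):

```shell
# job4 starts once both job2 and job3 have terminated
jid4=$(sbatch --parsable --dependency=afterany:$jid2:$jid3 job4.sh)
```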
Job arrays can also be submitted with dependencies. If job5 is a job array that depends on job4, we can launch it like this:
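A sketch of that submission; the array range 1-10 here is purely illustrative:

```shell
# submit job5 as a 10-task array that waits for job4 to terminate
jid5=$(sbatch --parsable --dependency=afterany:$jid4 --array=1-10 job5.sh)
```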
A single job can depend on an array job. Here, job6 will start when all array jobs from job5 have finished successfully:
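A sketch of that submission, using only flags already shown above:

```shell
# job6 starts only if every task of array job5 finished successfully
jid6=$(sbatch --parsable --dependency=afterok:$jid5 job6.sh)
```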
A single job can depend on all jobs by the same user with the same name. Here, job7 and job8 depend on job6 finishing successfully, and both are launched with the same name ("dtest"). We then make job9 wait for both of them by making it depend (via singleton) on all jobs named "dtest" having ended.
jid7=$(sbatch --parsable --dependency=afterok:$jid6 --job-name=dtest job7.sh)
jid8=$(sbatch --parsable --dependency=afterok:$jid6 --job-name=dtest job8.sh)
sbatch --dependency=singleton --job-name=dtest job9.sh
Finally, you can show the dependencies of your queued jobs like so:
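One way to do this is with squeue's %E format field, which prints the remaining dependencies of each job; the exact format string below is illustrative:

```shell
# list your queued jobs together with their outstanding dependencies
squeue --user=$USER --format="%.10i %.20j %.10T %E"
```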
Tip
It is possible to make a job depend on more than one dependency type.
For example, in the following, job4 launches once job2 has finished successfully and job3 has failed:
jid4=$(sbatch --parsable --dependency=afterok:$jid2,afternotok:$jid3 job4.sh)
Separating the dependency types with ',' means that all dependencies must be met. Separating them with '?' means that satisfying any one of them suffices.
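For instance, a job that may start when either condition holds could be submitted as follows (note the quotes, which stop the shell from interpreting the '?'):

```shell
# job starts when job2 succeeded OR job3 failed
sbatch --dependency="afterok:$jid2?afternotok:$jid3" job4.sh
```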
Do it yourself
The following is a simple set of scripts to test the concepts showcased above. Each script is set up to run for about 15 seconds.
job1.sh:
#! /bin/sh
sleep 15
ls . > job1_output.txt
job2.sh:
#! /bin/sh
## takes input from job1
sleep 15
wc -l job1_output.txt > job2_output.txt
job3.sh:
#! /bin/sh
## takes input from job1
sleep 15
wc -c job1_output.txt > job3_output.txt
job4.sh:
#! /bin/sh
## takes input from job2 and job3
sleep 15
cat job2_output.txt job3_output.txt > job4_output.txt
sbatch job1.sh outputs Submitted batch job 53481198
sbatch --dependency=afterok:53481198 job2.sh outputs Submitted batch job 53481217
squeue -u $USER reveals:
JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
53481217 scicore job2.sh duchem00 PD 0:00 1 (Dependency)
53481198 scicore job1.sh duchem00  R 0:13 1 sca29
In practice it is often more convenient to output just the job ID using the --parsable flag:
jid1=$(sbatch --parsable job1.sh)
jid2=$(sbatch --parsable --dependency=afterok:$jid1 job2.sh)
You can have more than one job depend on a single job:
jid2=$(sbatch --parsable --dependency=afterok:$jid1 job2.sh)
jid3=$(sbatch --parsable --dependency=afterok:$jid1 job3.sh)
And you can have a job depend on more than one job:
jid4=$(sbatch --parsable --dependency=afterok:$jid2:$jid3 job4.sh)
Snakemake¶
Important note
Running a pipeline on the cluster often involves keeping a master process alive on the login node for the duration of the pipeline.
To reduce the load on the login node, and to prevent your pipeline from being interrupted if your session is, we recommend running pipelines on the vscode node (vscode.scicore.unibas.ch) inside a persistent tmux session.
Alternatively, you can put your snakemake master process in a SLURM script.
Snakemake is a workflow management system tool to create reproducible and scalable data analyses.
In order to run snakemake workflows on SciCORE, you need to set up a config.yaml profile where you specify:
- executor: slurm: runs the rules as SLURM jobs
- use-envmodules: true: enables the module system
- the resources for each rule, e.g. set-resources: myrule:mem=500MB
You also need to specify the modules to load for each rule in the Snakefile using the envmodules option.
Follow the relevant documentation of the slurm-executor-plugin and snakemake to see exactly which options to set for each use case.
Note
We recommend running snakemake -n to perform a dry run of the workflow and identify which rules will be invoked.
Here is an example config.yaml file for a workflow with 2 rules (fastqc and multiqc), each needing their own module:
executor: slurm # sets up rules execution in SLURM
jobs: 10 # maximum number of concurrent jobs
use-envmodules: true # sets up module system
latency-wait: 30 # time, in seconds, before checking for result
# files after a job finishes successfully.
# This is useful when there is a bit of
# filesystem latency.
set-resources:
multiqc:
mem: 500MB # reserved memory
runtime: 10 # reserved runtime in minutes
threads: 1 # reserved cpus
slurm_extra: "--qos=30min" # any extra slurm options,
# here used to set the queue of service
fastqc:
mem: 1GB
runtime: 10
threads: 1
slurm_extra: "--qos=30min"
And the Snakefile specifies the modules to use for each rule:
...
rule fastqc:
...
envmodules:
"FastQC/0.12.1-Java-21" # module to load at rule start-up
...
rule multiqc:
...
envmodules:
"MultiQC/1.22.3-foss-2024a" # module to load at rule start-up
...
...
Presuming the Snakefile, input data, and config.yaml are in the same folder, you can launch the workflow with: snakemake --workflow-profile .
Warning
It may happen that the pipeline fails because of filesystem latency issues,
in which case you should typically see a line like this in your error message:
Job 5 completed successfully, but some output files are missing.
In that case consider increasing the latency-wait option to a higher number.
Warning
Beware that snakemake exports your environment when submitting jobs. This means that any environment variable you have defined in your session will get passed down to the various steps of the workflow.
Rules running locally
Some rules are unsuited to run as jobs on the cluster, for instance when a task requires internet access.
For these, snakemake provides the localrule keyword in the Snakefile.
It can either be defined at the rule level:
rule foo:
...
localrule: True # rule foo will be executed locally rather than in a SLURM job
...
Or several rules can be specified at once:
localrules: foo, bar # rules foo and bar will be executed locally rather than in a SLURM job
rule foo:
...
rule bar:
...
You can read more on the subject in the snakemake documentation.
Do it yourself:
The configuration we show above corresponds to a simple bioinformatics pipeline where we generate some html Quality Control reports from a set of sequencing result files.
Here are the steps to execute if you want to test it for yourself:
First, create a new folder on sciCORE and move there:
mkdir snakemake_test
cd snakemake_test
Then download some data (NB: these are small files):
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R2.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R2.fastq.gz
Next, create a file called Snakefile with the following content:
SAMPLES = ["sample1_R1",
"sample1_R2",
"sample2_R1",
"sample2_R2"]
rule all:
input:
"results/multiqc/multiqc_report.html"
rule fastqc:
input:
"{sample}.fastq.gz"
output:
"results/fastqc/{sample}_fastqc.html"
envmodules:
"FastQC/0.12.1-Java-21" # module to load at rule start-up
shell:
"fastqc {input} -o results/fastqc/"
rule multiqc:
input:
fqc=expand("results/fastqc/{sample}_fastqc.html", sample=SAMPLES)
output:
"results/multiqc/multiqc_report.html"
envmodules:
"MultiQC/1.22.3-foss-2024a" # module to load at rule start-up
shell:
"multiqc results/fastqc/ -o results/multiqc/"
Also create a file named config.yaml, with the content:
executor: slurm # sets up rules execution in SLURM
jobs: 10 # maximum number of concurrent jobs
use-envmodules: true # sets up module system
latency-wait: 30 # time, in seconds, before checking for result
# files after a job finishes successfully.
# This is useful when there is a bit of
# filesystem latency.
set-resources:
multiqc:
mem: 500MB # reserved memory
runtime: 10 # reserved runtime in minutes
threads: 1 # reserved cpus
slurm_extra: "--qos=30min" # any extra slurm options,
# here used to set the queue of service
fastqc:
mem: 1GB
runtime: 10
threads: 1
slurm_extra: "--qos=30min"
Load the snakemake module:
ml snakemake/9.3.5-foss-2025a
Finally, run the pipeline with:
snakemake --workflow-profile .
Where the option --workflow-profile . specifies the folder where you have the config.yaml file (here, the working directory .).
Nextflow¶
Important note
Running a pipeline on the cluster often involves keeping a master process alive on the login node for the duration of the pipeline.
To reduce the load on the login node, and to prevent your pipeline from being interrupted if your session is, we recommend running pipelines on the vscode node (vscode.scicore.unibas.ch) inside a persistent tmux session.
Alternatively, you can put your nextflow master process in a SLURM script.
Nextflow is a workflow system for creating scalable, portable, and reproducible workflows.
In order to run nextflow workflows on SciCORE, you need a nextflow.config file specifying the SLURM configuration for each process, such as which resources to reserve or which modules to load.
Here is an example nextflow.config file for a workflow with 2 processes (FASTQC and MULTIQC), each needing to load their own module:
process.executor = 'slurm'
process {
withName: FASTQC {
module = 'FastQC/0.11.8-Java-1.8'
cpus = 1
memory = 1.GB
clusterOptions = '--qos 30min'
}
}
process {
withName: MULTIQC {
module = 'MultiQC/1.14-foss-2022a'
cpus = 1
memory = 1.GB
clusterOptions = '--qos 30min'
}
}
Where:
- cpus corresponds to SLURM's cpus-per-task
- queue corresponds to SLURM's partition
- clusterOptions is a generic way of adding options to the SLURM submission; here we use it to specify the qos.
Tip
You can specify multiple options by separating them with spaces. e.g.: clusterOptions = '--qos 30min --mail-type=END,FAIL --mail-user=<my.name>@unibas.ch'.
Nextflow also lets you set a queue option, which corresponds to the SLURM partition.
More info:
- SLURM executor
- specify process specific config (eg, resources)
- loading modules
- nextflow configuration
Tip
To attach multiple modules to the same process, you can use something like: module = ['FastQC','MultiQC']
Do it yourself:
Get the data
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R2.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R2.fastq.gz
simpleQC.nf
params.report_id = "multiqc_report"
process FASTQC {
publishDir "results/fastqc"
input:
path reads
output:
path "${reads.simpleName}_fastqc.zip", emit: zip
path "${reads.simpleName}_fastqc.html", emit: html
script:
"""
fastqc $reads
"""
}
process MULTIQC {
publishDir "results/multiqc"
input:
path '*'
val output_name
output:
path "${output_name}.html", emit: report
path "${output_name}_data", emit: data
script:
"""
multiqc . -n ${output_name}.html
"""
}
// Workflow block
workflow {
ch_fastq = channel.fromPath(params.fq) // Create a channel using parameter input
FASTQC(ch_fastq) // fastqc
MULTIQC(
FASTQC.out.zip.mix(FASTQC.out.html).collect(),
params.report_id
)
}
Workflow preview and configuration
ml Nextflow
nextflow run simpleQC.nf --fq "sample*.fastq.gz" -preview
output:
Nextflow 25.04.6 is available - Please consider updating your version to it
N E X T F L O W ~ version 24.10.4
Launching `simpleQC.nf` [big_mercator] DSL2 - revision: 0c7da237e5
[- ] FASTQC -
[- ] MULTIQC -
We are going to create a config file to specify all cluster relevant information to nextflow.
If that file is named nextflow.config and is in your current directory when you launch the workflow, then it will automatically be applied to the run.
Otherwise you can specify it to nextflow run with option -c.
nextflow.config
process.executor = 'slurm'
process {
withName: FASTQC {
module = 'FastQC/0.11.8-Java-1.8'
cpus = 1
memory = 1.GB
clusterOptions = '--qos 30min'
}
}
process {
withName: MULTIQC {
module = 'MultiQC/1.14-foss-2022a'
cpus = 1
memory = 1.GB
clusterOptions = '--qos 30min'
}
}
where:
- cpus: SLURM's cpus-per-task
- queue: SLURM's partition
- clusterOptions: a generic way of adding options to the SLURM submission; here we use it to specify the qos. You can specify multiple options by separating them with spaces, e.g.: clusterOptions = '--qos 30min --mail-type=END,FAIL --mail-user=<my.name>@unibas.ch'
More info
- SLURM executor
- specify process specific config (eg, resources)
- loading modules
- to attach multiple modules, you can use something like: module = ['FastQC','MultiQC']
- nextflow configuration
Actually running the pipeline
nextflow run simpleQC.nf --fq "sample*.fastq.gz" -with-timeline -with-trace -with-report -with-dag
The options -with-timeline -with-trace -with-report -with-dag produce text and html reports, which are all useful; in particular:
- report gives you usage details
- trace gives you the job IDs (column native_id, useful for debugging) and resource usage
Alternatively, the pipeline could be run from a SLURM script which you would submit with sbatch:
#!/bin/bash
#SBATCH --job-name=nextflow-test
#SBATCH --cpus-per-task=1 #Number of cores to reserve
#SBATCH --mem-per-cpu=1G #Amount of RAM/core to reserve
#SBATCH --time=06:00:00 #Maximum allocated time
#SBATCH --qos=6hours #Selected queue to allocate your job
#SBATCH --output=nextflow_test.o
ml Nextflow
nextflow run simpleQC.nf --fq "sample*.fastq.gz" -with-timeline -with-trace -with-report -with-dag
nf-core¶
This subsection draws inspiration from the genotoul cluster nextflow-course.
Important note
This method requires running a master process on the node where you logged in, for the duration of the pipeline.
In order to reduce the load on the login node, and to prevent interruption of your pipeline should your session be interrupted, we recommend running the pipelines on the vscode node (vscode.scicore.unibas.ch) with a tmux persistent session.
TL;DR
- specify process.executor = 'slurm' in your nextflow.config
- use -profile apptainer to handle the containers
- read about managing workflow resources; also follow the recommendations listed above
nf-core is a global community effort to collect a curated set of open‑source analysis pipelines built using Nextflow.
We will demonstrate how to use nf-core on sciCORE with the nf-core/demo pipeline, which chains three simple processes (seqtk_trim, fastqc and multiqc).
On the login node, we can start by inspecting the pipeline:
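A plausible command for this step (the --help flag makes Nextflow pull the pipeline and print its parameters):

```shell
# download nf-core/demo and display its usage options
nextflow run nf-core/demo --help
```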
This will download the workflow files and display usage options.
nf-core pipelines generally have a test profile that specifies some simple input data.
This is useful when configuring the workflow:
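The process/container listing shown below can be obtained with Nextflow's inspect command; the --outdir value here is illustrative (the pipeline requires one even for inspection):

```shell
# list the pipeline's processes and their containers, using the test profile
nextflow inspect nf-core/demo -profile test --outdir results
```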
output:
{
"processes": [
{
"name": "NFCORE_DEMO:DEMO:MULTIQC",
"container": "quay.io/biocontainers/multiqc:1.29--pyhdfd78af_0"
},
{
"name": "NFCORE_DEMO:DEMO:SEQTK_TRIM",
"container": "quay.io/biocontainers/seqtk:1.4--he4a0461_1"
},
{
"name": "NFCORE_DEMO:DEMO:FASTQC",
"container": "quay.io/biocontainers/fastqc:0.12.1--hdfd78af_0"
}
]
}
So there are 3 processes, and each is linked to a container on quay.io.
This means that we will not need to load specific modules for fastqc, multiqc and seqtk, but rather just need to make sure that a compatible container runtime is available.
In the case of SciCORE, apptainer is directly available without having to load additional modules. We just need to pass this to Nextflow as another profile option.
But first, you have to setup a file named nextflow.config containing:
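Following the TL;DR above, a minimal nextflow.config for this run is:

```groovy
// run all processes as SLURM jobs
process.executor = 'slurm'
```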
And then you can run:
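A plausible invocation combining the test data profile with apptainer; the output directory name is illustrative:

```shell
# run the demo pipeline on its test data, using apptainer for the containers
nextflow run nf-core/demo -profile test,apptainer --outdir results
```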
You can then look up the execution trace, DAG, report, and timeline in the folder pipeline_info/.
In particular, you can inspect the workflow's resource usage and extrapolate how much you may need for your own data.
The workflows are configured to request relatively sensible resources (generally in their conf/base.config file), but we highly recommend you have a look at their configuration.
See more on that subject in the nf-core documentation and in the section above.