
Workflow management

Job dependencies

Establishing dependencies between different SLURM jobs is a good way to split up the parts of a pipeline that have very different resource requirements. This way, resources are used more efficiently and the scheduler can allocate the jobs more easily, which can ultimately shorten the time you have to wait in the queue.

The way to work with dependent SLURM jobs is to launch them with the --dependency directive and specify the condition that has to be met before the dependent job can start.

Dependent jobs will be allocated in the queue but they will not start until the specified condition is met and the resources are available.

Condition                 Explanation
after:jobid[:jobid]       job begins after the specified jobs have started
afterany:jobid[:jobid]    job begins after the specified jobs have terminated
afternotok:jobid[:jobid]  job begins after the specified jobs have failed
afterok:jobid[:jobid]     job begins after the specified jobs have finished successfully
singleton                 job begins after all previously launched jobs with the same name and user have ended

Note

Job arrays can also be submitted with dependencies, and a job can depend on an array job. In the latter case, the job will start executing once all tasks in the job array have met the dependency criterion (e.g., terminating for afterany).

Practical examples

Assume you have a series of job scripts, job1.sh, job2.sh, …, job9.sh, that depend on each other in some way.

The first job to be launched has no dependencies. It is submitted with a standard sbatch command, and we store its job ID in a variable for use by the jobs that depend on job1:

jid1=$(sbatch --parsable job1.sh)

Info

To make it easier to capture the job ID at submission time, we add the --parsable flag to the SLURM jobs that other jobs depend on. The --parsable option makes sbatch return only the job ID.
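As a quick illustration of the difference (the job ID below is made up), this is what you would otherwise have to do to recover the ID from the default sbatch output:

```shell
# Default sbatch output looks like this (the job ID is illustrative):
msg="Submitted batch job 53481198"
# Without --parsable, the ID must be cut out of the sentence yourself:
jid=$(echo "$msg" | awk '{print $4}')
echo "$jid"
```

With --parsable, sbatch prints that number directly, so no parsing is needed.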

Multiple jobs can depend on a single job. If job2 and job3 depend on job1 to finish, no matter the status, we can launch them with the following commands:

jid2=$(sbatch --parsable --dependency=afterany:$jid1 job2.sh)
jid3=$(sbatch --parsable --dependency=afterany:$jid1 job3.sh)

Similarly, a single job can depend on multiple jobs. If job4 depends directly on job2 and job3 (thus indirectly on job1) to finish, we can launch it with:

jid4=$(sbatch --parsable --dependency=afterany:$jid2:$jid3 job4.sh)

Job arrays can also be submitted with dependencies. If job5 is a job array that depends on job4, we can launch it like this:

jid5=$(sbatch --parsable --dependency=afterany:$jid4 job5.sh)

A single job can depend on an array job. Here, job6 will start when all array jobs from job5 have finished successfully:

jid6=$(sbatch --parsable --dependency=afterok:$jid5 job6.sh)

A single job can depend on all jobs by the same user with the same name. Here, job7 and job8 depend on job6 to finish successfully, and both are launched with the same name (“dtest”). We make job9 depend on job7 and job8 by making it depend on any job with the name “dtest”.

jid7=$(sbatch --parsable --dependency=afterok:$jid6 --job-name=dtest job7.sh)
jid8=$(sbatch --parsable --dependency=afterok:$jid6 --job-name=dtest job8.sh)
sbatch --dependency=singleton --job-name=dtest job9.sh

Finally, you can show the dependencies of your queued jobs like so:

squeue -u $USER -o "%.8A %.4C %.10m %.20E"

Tip

It is possible to make a job depend on more than one dependency type. For example, in the following, job4 starts once job2 has finished successfully and job3 has failed:

jid4=$(sbatch --parsable --dependency=afterok:$jid2,afternotok:$jid3 job4.sh)

Separating the dependency types with ',' means that all dependencies must be met. Separating them with '?' means that satisfying any one of them suffices.
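A sketch of the '?' form (the job IDs are illustrative; note the quoting, so the shell does not treat '?' as a glob character):

```shell
# Illustrative job IDs; on the cluster these come from earlier sbatch calls
jid2=1001
jid3=1002
# Quote the dependency string so the shell does not try to glob the '?'
dep="afterok:${jid2}?afternotok:${jid3}"
echo "$dep"
# On the cluster you would then submit with:
#   jid4=$(sbatch --parsable --dependency="$dep" job4.sh)
```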

Do it yourself

The following proposes a simple set of scripts to test the concepts showcased above. Each script is set up to run for about 15 seconds.

job1.sh:

#! /bin/sh
sleep 15
ls . > job1_output.txt

job2.sh:

#! /bin/sh
## takes input from job1
sleep 15
wc -l job1_output.txt > job2_output.txt

job3.sh:

#! /bin/sh
## takes input from job1
sleep 15
wc -c job1_output.txt > job3_output.txt

job4.sh:

#! /bin/sh
## takes input from job2 and job3
sleep 15
cat job2_output.txt job3_output.txt > job4_output.txt

sbatch job1.sh outputs Submitted batch job 53481198

sbatch --dependency=afterok:53481198 job2.sh outputs Submitted batch job 53481217

squeue -u $USER reveals:

   JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
53481217   scicore  job2.sh duchem00 PD       0:00      1 (Dependency)
53481198   scicore  job1.sh duchem00  R       0:13      1 sca29

In practice it is usually more convenient to output just the job ID, using the --parsable flag:

jid1=$(sbatch --parsable job1.sh)
jid2=$(sbatch --parsable --dependency=afterok:$jid1 job2.sh)

You can have more than one job dependent on a single job:

jid2=$(sbatch --parsable --dependency=afterok:$jid1 job2.sh)
jid3=$(sbatch --parsable --dependency=afterok:$jid1 job3.sh)

And you can have a job depend on more than one job:

jid4=$(sbatch --parsable --dependency=afterok:$jid2:$jid3 job4.sh)
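For completeness, the whole chain can go into one launcher script. The sketch below stubs out sbatch with a fake job-ID generator when SLURM is not on the PATH, so the chaining logic can be dry-tested anywhere; on the cluster the real sbatch is used:

```shell
#!/bin/bash
# Outside the cluster, stub out sbatch so the chaining logic can be dry-tested
if ! command -v sbatch >/dev/null 2>&1; then
    sbatch() {
        # fake job ID: the digits of the script name (e.g. job1.sh -> 1)
        for _arg in "$@"; do :; done        # keep only the last argument
        printf '%s\n' "$_arg" | tr -dc '0-9'
        echo
    }
fi

jid1=$(sbatch --parsable job1.sh)
jid2=$(sbatch --parsable --dependency=afterok:$jid1 job2.sh)
jid3=$(sbatch --parsable --dependency=afterok:$jid1 job3.sh)
jid4=$(sbatch --parsable --dependency=afterok:$jid2:$jid3 job4.sh)
echo "chain: $jid1 -> ($jid2, $jid3) -> $jid4"
```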

Snakemake

Important note

Running a pipeline on the cluster often involves keeping a master process running on the node where you logged in for the entire duration of the pipeline.

In order to reduce the load on the login node, and to prevent interruption of your pipeline should your session be interrupted, we recommend running the pipelines on the vscode node (vscode.scicore.unibas.ch) with a tmux persistent session.

Alternatively, you can put your snakemake master process in a SLURM script.

Snakemake is a workflow management system for creating reproducible and scalable data analyses.

In order to run snakemake workflows on SciCORE, you need to set up a config.yaml profile where you specify:

  • executor: slurm : sets up rule execution in SLURM
  • use-envmodules: true : sets up the module system
  • the resources for each rule: e.g. set-resources: myrule:mem=500MB

You also need to specify the modules to load for each rule in the Snakefile, using the envmodules option.

Follow the relevant documentation for the slurm-executor-plugin and snakemake to see exactly which options to set for each use case.

Note

We recommend running snakemake -n to perform a dry run of the workflow and identify which rules will be invoked.

Here is an example config.yaml file for a workflow with 2 rules (fastqc and multiqc), each needing their own module:

executor: slurm        # sets up rule execution in SLURM
jobs: 10               # maximum number of concurrent jobs
use-envmodules: true   # sets up module system

latency-wait: 30       # time, in seconds, to wait for result
                       # files after a job finishes successfully.
                       # This is useful when there is a bit of
                       # filesystem latency.


set-resources:
    multiqc:
        mem: 500MB                 # reserved memory
        runtime: 10                # reserved runtime in minutes
        threads: 1                 # reserved cpus
        slurm_extra: "--qos=30min" # any extra slurm options,
                                   # here used to set the queue of service
    fastqc:
        mem: 1GB
        runtime: 10
        threads: 1
        slurm_extra: "--qos=30min"

And the Snakefile specifies the modules to use for each rule:

...

rule fastqc:
    ...
    envmodules:
        "FastQC/0.12.1-Java-21"      # module to load at rule start-up
    ...

rule multiqc:
    ...
    envmodules:
        "MultiQC/1.22.3-foss-2024a"  # module to load at rule start-up
    ...
...

Presuming the Snakefile, the input data, and config.yaml are in the same folder, you launch the workflow with: snakemake --workflow-profile .

Warning

It may happen that the pipeline fails because of filesystem latency issues, in which case you should typically see a line like this in your error message: Job 5 completed successfully, but some output files are missing.

In that case, consider increasing the latency-wait option to a higher value.

Warning

Beware that snakemake exports your environment when submitting jobs. This means that any environment variable you have defined in your session will get passed down to the various steps of the workflow.
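As a hypothetical illustration (the path below is made up):

```shell
# Hypothetical illustration: a variable exported in your login session...
export PYTHONPATH=/home/user/my_old_project
# ...is inherited by every job the workflow submits, where it can shadow
# packages provided by the modules a rule loads
echo "PYTHONPATH=$PYTHONPATH"
```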

Rules running locally

Some rules are sometimes unsuited to become a job on the cluster, for instance when a task requires internet access. For this, snakemake uses the localrule keyword in the Snakefile. It can either be defined at the rule level:

rule foo:
    ...
    localrule: True  # rule foo will be executed locally rather than in a SLURM job
    ...

Or several rules can be specified at once:

localrules: foo, bar # rules foo and bar will be executed locally rather than in a SLURM job

rule foo:
    ...

rule bar:
    ...

You can read more on the subject in the snakemake documentation.

Do it yourself

The configuration we show above corresponds to a simple bioinformatics pipeline where we generate HTML quality-control reports from a set of sequencing result files.

Here are the steps to execute if you want to test it for yourself:

First, create a new folder on sciCORE and move there:

mkdir snakemake_test
cd snakemake_test

Then download some data (NB: these are small files):

wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R2.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R2.fastq.gz

Next, create a file called Snakefile with the following content:

SAMPLES = ["sample1_R1",
           "sample1_R2",
           "sample2_R1",
           "sample2_R2"]

rule all:
    input:
        "results/multiqc/multiqc_report.html"

rule fastqc:
    input:
        "{sample}.fastq.gz"
    output:
        "results/fastqc/{sample}_fastqc.html"
    envmodules:
        "FastQC/0.12.1-Java-21"      # module to load at rule start-up
    shell:
        "fastqc {input} -o results/fastqc/"

rule multiqc:
    input:
        fqc=expand("results/fastqc/{sample}_fastqc.html", sample=SAMPLES)
    output:
        "results/multiqc/multiqc_report.html"
    envmodules:
        "MultiQC/1.22.3-foss-2024a"  # module to load at rule start-up
    shell:
        "multiqc results/fastqc/ -o results/multiqc/"

Also create a file named config.yaml, with the content:

executor: slurm        # sets up rule execution in SLURM
jobs: 10               # maximum number of concurrent jobs
use-envmodules: true   # sets up module system

latency-wait: 30       # time, in seconds, to wait for result
                       # files after a job finishes successfully.
                       # This is useful when there is a bit of
                       # filesystem latency.

set-resources:
    multiqc:
        mem: 500MB                 # reserved memory
        runtime: 10                # reserved runtime in minutes
        threads: 1                 # reserved cpus
        slurm_extra: "--qos=30min" # any extra slurm options,
                                   # here used to set the queue of service
    fastqc:
        mem: 1GB
        runtime: 10
        threads: 1
        slurm_extra: "--qos=30min"

Load the snakemake module:

ml snakemake/9.3.5-foss-2025a

Finally, run the pipeline with:

snakemake --workflow-profile .

Where the option --workflow-profile . specifies the folder where you have the config.yaml file (here, the working directory .).

Nextflow

Important note

Running a pipeline on the cluster often involves keeping a master process running on the node where you logged in for the entire duration of the pipeline.

In order to reduce the load on the login node, and to prevent interruption of your pipeline should your session be interrupted, we recommend running the pipelines on the vscode node (vscode.scicore.unibas.ch) with a tmux persistent session.

Alternatively, you can put your nextflow master process in a SLURM script.

Nextflow is a workflow system for creating scalable, portable, and reproducible workflows.

In order to run nextflow workflows on SciCORE, you need a nextflow.config file to specify the SLURM configuration for each process, such as which resources to reserve or which modules to load.

Here is an example nextflow.config file for a workflow with 2 processes (FASTQC and MULTIQC), each needing to load their own module:

process.executor = 'slurm'

process {
    withName: FASTQC {
        module = 'FastQC/0.11.8-Java-1.8'
        cpus = 1
        memory = 1.GB
        clusterOptions = '--qos 30min'
    }
    withName: MULTIQC {
        module = 'MultiQC/1.14-foss-2022a'
        cpus = 1
        memory = 1.GB
        clusterOptions = '--qos 30min'
    }
}

Where:

  • cpus corresponds to SLURM’s cpus-per-task
  • memory corresponds to SLURM’s mem
  • clusterOptions is a generic way of adding options to the SLURM submission; here we use it to specify the qos.

Tip

You can specify multiple options by separating them with spaces. e.g.: clusterOptions = '--qos 30min --mail-type=END,FAIL --mail-user=<my.name>@unibas.ch'.

Nextflow also lets you set a queue option, which corresponds to the SLURM partition.


Tip

To attach multiple modules to the same process, you can use something like: module = ['FastQC','MultiQC']

Do it yourself

Get the data

wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R1.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample1_R2.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R1.fastq.gz
wget https://raw.githubusercontent.com/nf-core/test-datasets/viralrecon/illumina/amplicon/sample2_R2.fastq.gz

simpleQC.nf

params.report_id = "multiqc_report"

process FASTQC {

    publishDir "results/fastqc"

    input:
    path reads

    output:
    path "${reads.simpleName}_fastqc.zip", emit: zip
    path "${reads.simpleName}_fastqc.html", emit: html

    script:
    """
    fastqc $reads
    """
}

process MULTIQC {

    publishDir "results/multiqc"

    input:
    path '*'
    val output_name

    output:
    path "${output_name}.html", emit: report
    path "${output_name}_data", emit: data

    script:
    """
    multiqc . -n ${output_name}.html
    """
}

// Workflow block
workflow {
    ch_fastq = channel.fromPath(params.fq)   // Create a channel using parameter input
    FASTQC(ch_fastq)       // fastqc
    MULTIQC(
        FASTQC.out.zip.mix(FASTQC.out.html).collect(),
        params.report_id
        )

}

Workflow preview and configuration

ml Nextflow
nextflow run simpleQC.nf --fq "sample*.fastq.gz" -preview

output:

Nextflow 25.04.6 is available - Please consider updating your version to it

 N E X T F L O W   ~  version 24.10.4

Launching `simpleQC.nf` [big_mercator] DSL2 - revision: 0c7da237e5

[-        ] FASTQC  -
[-        ] MULTIQC -

We are going to create a config file to give nextflow all the cluster-relevant information.

If that file is named nextflow.config and is in your current directory when you launch the workflow, it will automatically be applied to the run. Otherwise, you can pass it to nextflow run with the -c option.

nextflow.config

process.executor = 'slurm'

process {
    withName: FASTQC {
        module = 'FastQC/0.11.8-Java-1.8'
        cpus = 1
        memory = 1.GB
        clusterOptions = '--qos 30min'
    }
    withName: MULTIQC {
        module = 'MultiQC/1.14-foss-2022a'
        cpus = 1
        memory = 1.GB
        clusterOptions = '--qos 30min'
    }
}

where:

  • cpus : SLURM cpus-per-task
  • memory : SLURM mem
  • clusterOptions : generic way of adding options to the SLURM submission, here we use it to specify the qos.
    • you can specify multiple options by separating them with spaces, e.g.: clusterOptions = '--qos 30min --mail-type=END,FAIL --mail-user=<my.name>@unibas.ch'


Actually running the pipeline

nextflow run simpleQC.nf --fq "sample*.fastq.gz" -with-timeline -with-trace -with-report -with-dag

The options -with-timeline -with-trace -with-report -with-dag produce text and HTML report files, which are all useful; in particular:

  • report gives you usage details
  • trace gives you the job IDs (column native_id, useful for debugging) and the resource usage
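For instance, the name and native_id columns can be pulled out of the trace file with a little awk. The two-line trace.txt below is synthetic, for illustration only; a real one is produced by running with -with-trace:

```shell
# Synthetic one-entry trace file for illustration
# (a real trace.txt comes from running with -with-trace)
printf 'task_id\tname\tnative_id\n1\tFASTQC (1)\t53481198\n' > trace.txt

# Look up the 'name' and 'native_id' columns by their header names,
# so the snippet keeps working even if the column order changes
awk -F'\t' '
    NR == 1 { for (i = 1; i <= NF; i++) { if ($i == "name") nm = i; if ($i == "native_id") id = i }; next }
    { print $nm, $id }
' trace.txt
```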

Alternatively, the pipeline could be run from a SLURM script which you would submit with sbatch:

#!/bin/bash
#SBATCH --job-name=nextflow-test
#SBATCH --cpus-per-task=1    #Number of cores to reserve
#SBATCH --mem-per-cpu=1G     #Amount of RAM/core to reserve
#SBATCH --time=06:00:00      #Maximum allocated time
#SBATCH --qos=6hours         #Selected queue to allocate your job
#SBATCH --output=nextflow_test.o

ml Nextflow
nextflow run simpleQC.nf --fq "sample*.fastq.gz" -with-timeline -with-trace -with-report -with-dag

nf-core

This subsection draws inspiration from the genotoul cluster nextflow-course.

Important note

This method requires running a master process on the node where you logged in for the duration of the pipeline.

In order to reduce the load on the login node, and to prevent interruption of your pipeline should your session be interrupted, we recommend running the pipelines on the vscode node (vscode.scicore.unibas.ch) with a tmux persistent session.

TL;DR

  • specify process.executor = 'slurm' in your nextflow.config
  • use -profile apptainer to handle the containers
  • read about managing workflow resources; also use the recommendations listed above

nf-core is a global community effort to collect a curated set of open‑source analysis pipelines built using Nextflow.

We will demonstrate how to use nf-core on sciCORE with the nf-core/demo pipeline, which chains three processes (seqtk trim, FastQC, and MultiQC) on a small test dataset.

On the login node, we can start by inspecting the pipeline:

ml Nextflow

nextflow run nf-core/demo --help

This will download the workflow files and display usage options.

nf-core pipelines generally have a test profile which specifies some simple input data.

This is useful when configuring the workflow:

nextflow inspect nf-core/demo -profile test --outdir results

output:

{
    "processes": [
        {
            "name": "NFCORE_DEMO:DEMO:MULTIQC",
            "container": "quay.io/biocontainers/multiqc:1.29--pyhdfd78af_0"
        },
        {
            "name": "NFCORE_DEMO:DEMO:SEQTK_TRIM",
            "container": "quay.io/biocontainers/seqtk:1.4--he4a0461_1"
        },
        {
            "name": "NFCORE_DEMO:DEMO:FASTQC",
            "container": "quay.io/biocontainers/fastqc:0.12.1--hdfd78af_0"
        }
    ]
}

So there are 3 processes, each linked to a container on quay.io.

This means that we will not need to load specific modules for fastqc, multiqc and seqtk, but rather just make sure that a compatible container runtime is available.

In the case of SciCORE, apptainer is directly available without having to load additional modules. We will just need to specify this to Nextflow as another profile option.

But first, you have to set up a file named nextflow.config containing:

process.executor = 'slurm'

And then you can run:

nextflow run nf-core/demo --outdir . -profile test,apptainer

You can then look up the execution trace, DAG, report, and timeline in the folder pipeline_info/.

In particular, you can inspect the workflow's resource usage and extrapolate how much you may need for your own data.

The workflows are configured to request relatively sensible resources (generally set in their conf/base.config file), but we highly recommend having a look at their configuration.

See more on that subject in the nf-core documentation and in the section above.