Specific Application How-Tos
Here is a collection of instructions to run some applications of interest in the cluster.
Tmux for persistent shell sessions
Tmux is a “terminal multiplexer” which allows multiple terminal sessions to be accessed simultaneously in a single window. Processes can be detached from their controlling terminals, allowing remote sessions to remain active without being visible. This is particularly useful to protect programs running on a remote server (e.g. a sciCORE login node) from connection drops (e.g. when going through a tunnel on the train).
This software is already installed on all sciCORE login nodes. We recommend checking the official manual, and provide here a quick cheat sheet of common operations in Tmux:
tmux new -s <name> # create new session
tmux ls # list active tmux sessions
tmux a -t <name> # attach to the target session
ctrl-b # prefix key that precedes the commands below
ctrl-b c # create new tab in session (asterisk indicates the active tab)
ctrl-b n # next tab
ctrl-b p # previous tab
ctrl-b [ # enter scroll mode (position indicator like [0/199]); press q to quit
ctrl-b & # kill the currently active tab
ctrl-b d # detach and leave tabs running
AlphaFold
AlphaFold is a protein structure prediction program (and associated database) developed by DeepMind using deep learning techniques.
You can run the inference pipeline of AlphaFold on the sciCORE cluster. To explore the available AlphaFold versions, run:
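For example, with the Lmod module tools used on the cluster (the exact versions listed will vary):
$ ml spider AlphaFold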
Once you choose a version and load it with ml AlphaFold/<version>, two scripts will be available to you: run_alphafold.sh and run_alphafold_multimer.sh.
For instance:
$ ml AlphaFold/2.3.2
$ which run_alphafold.sh
/scicore/soft/easybuild/apps/AlphaFold/2.3.2/bin/run_alphafold.sh
Under the hood, these scripts run AlphaFold as if it were run via a Docker container. See the official documentation for information on the parameters you can pass to the script.
use_gpu_relax error
For AlphaFold > 2.1.1 you need to specify whether the relax step should run on CPU or GPU. Running on GPU is faster but is not hardcoded in the scripts, so you need to specify it when running AlphaFold by providing either --use_gpu_relax=true for GPU or --use_gpu_relax=false for CPU.
The monomer preset is the original model used at CASP14, with no ensembling. Here’s an example SLURM script to run AlphaFold with the monomer model:
#!/bin/bash
#SBATCH --job-name=AlphaFold-Monomer-Example
#SBATCH --time=01:00:00
#SBATCH --mem=64G
#SBATCH --cpus-per-task=8
#SBATCH --partition=a100
#SBATCH --gres=gpu:1
module load AlphaFold/2.3.2
run_alphafold.sh \
--fasta_paths /scicore/home/group/user/path/to/fasta/fasta.fa \
--output_dir /scicore/home/group/user/path/to/output/ \
--max_template_date 2021-09-01 \
--use_gpu_relax=true # required for AlphaFold > 2.1.1; set to false to relax on CPU
Tip
db_preset can be changed by updating the environment variable DB_PRESET (defaults to full_dbs). Example:
export DB_PRESET="reduced_dbs"
run_alphafold.sh <remaining parameters>
Similarly, model_preset can be changed by updating the environment variable MODEL_PRESET_MONOMER (defaults to monomer).
The multimer preset is AlphaFold’s Multimer model. Here’s an example SLURM script to run AlphaFold with the multimer model:
#!/bin/bash
#SBATCH --job-name=AlphaFold-Multimer-Example
#SBATCH --time=01:00:00
#SBATCH --mem=64G
#SBATCH --cpus-per-task=8
#SBATCH --partition=a100
#SBATCH --gres=gpu:1
module load AlphaFold/2.2.0
run_alphafold_multimer.sh \
--fasta_paths /scicore/home/group/user/path/to/fasta/fasta.fa \
--output_dir /scicore/home/group/user/path/to/output/ \
--max_template_date 2021-09-01 \
--use_gpu_relax=true # required for AlphaFold > 2.1.1; set to false to relax on CPU
Info
For multimers, model_preset currently only works with “multimer”.
Loading and using LLMs on the cluster
In general, the compute nodes do not have access to the Internet, so you may have run into errors when trying to follow an online LLM tutorial because the model files could not be fetched at runtime.
The simple solution to that is to download the model files you need to the cluster beforehand. Then point to these local files when running your code from a compute node.
For example, if you have huggingface-cli installed in your (virtual) environment and want to download the Qwen/Qwen3-4B model, you can run the following from a transfer node:
MODEL=Qwen/Qwen3-4B
OUTDIR=path/to/your/local/Qwen3-4B
mkdir -p ${OUTDIR}
huggingface-cli download ${MODEL} config.json model*safetensors* tokenizer* \
--local-dir ${OUTDIR} \
--local-dir-use-symlinks False
Info
Adapt this command to the virtual environment manager you are using. For example, if using uv, prepend huggingface-cli with uv run.
This will download the model files to the specified OUTDIR.
Then, when running your code from a compute node, you can point to the local files like this:
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "path/to/your/local/Qwen3-4B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
Note
This strategy applies in general for any framework that tries to download files at runtime. Make sure to check the framework’s documentation for instructions on how to pre-fetch files to the local machine.
SRA Toolkit
The Sequence Read Archive (SRA) Toolkit is a collection of tools and libraries for using data in the International Nucleotide Sequence Database Collaboration (INSDC).
The SRA Toolkit is available via the cluster’s module system. To explore the available versions, run:
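For example, again using Lmod’s spider subcommand:
$ ml spider SRA-Toolkit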
Once you load a version with ml SRA-Toolkit/<version>, commands like fastq-dump and fasterq-dump will be available to you. Check the official documentation for more information on commands.
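As a quick illustration, a typical pattern is to fetch a run by accession and convert it to FASTQ. The accession below is only a placeholder; adjust the output directory and thread count to your job:
prefetch SRR000001 # download the run to the local SRA cache
fasterq-dump SRR000001 --outdir fastq/ --threads 8 # convert it to FASTQ files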
Running PostgreSQL on the cluster
PostgreSQL is a popular object-relational database system, but running it on the cluster requires some configuration due to the lack of root privileges. Below are two approaches that run PostgreSQL in an isolated environment within your user space.
Note
This guide only shows how to run PostgreSQL on the cluster. For information on commands and workflows, please refer to the official documentation or to PostgreSQL tutorials.
Via Apptainer
The Apptainer container system is available by default to all users:
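You can confirm this from a login node (the version shown will depend on the installed release):
$ apptainer --version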
Info
See this page for more information on running containers on the cluster.
To run PostgreSQL with Apptainer, you must first pull a PostgreSQL image from a container registry. For example, you can use the official PostgreSQL image from Docker Hub:
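For example, to pull the latest official image (pulling a large image can take a few minutes):
$ apptainer pull docker://postgres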
Tip
If you need a specific version of PostgreSQL, you can specify the version tag in the image name, e.g., docker://postgres:17.5.
This should create a file named postgres_latest.sif in your current directory. This is the image the PostgreSQL commands will run from.
Now, create folders to store your PostgreSQL databases and temporary files:
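For example (the paths below are only a suggestion; the variable names match the bind mounts used in the next step):
export PG_DBS_PATH=$HOME/postgres/data
export PG_TMP_PATH=$HOME/postgres/run
mkdir -p $PG_DBS_PATH $PG_TMP_PATH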
Pointing to the image you pulled, you can start a PostgreSQL server instance on the login node:
$ apptainer instance start \
-B $PG_DBS_PATH:/var/lib/postgresql/data \
-B $PG_TMP_PATH:/var/run/postgresql \
postgres_latest.sif \
postgres
INFO: instance started successfully
Note
Remember to change the image path (postgres_latest.sif) if you used a different version or save path.
You can verify that the instance is running with:
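One way is Apptainer’s built-in instance listing:
$ apptainer instance list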
Now, connect to the PostgreSQL server from inside the container:
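One possible way is to open a shell in the running instance (assuming you kept the instance name postgres used above):
$ apptainer shell instance://postgres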
Inside the container, all PostgreSQL commands are available to you. For example, you can initialize a new database with:
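A minimal sketch, assuming the bound data directory is still empty (initdb sets up a fresh database cluster there):
initdb -D /var/lib/postgresql/data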
When you’re done working, you should stop the PostgreSQL instance with:
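For example, using the instance name chosen above:
$ apptainer instance stop postgres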
Info
If you stop the postgres container but do not delete the data in $PG_DBS_PATH, your databases will be preserved and you can reuse them in the future by booting the container again.
Via pixi
PostgreSQL is available on conda-forge, so if you have the pixi package manager in your user space, you can install PostgreSQL as a global tool with the following command:
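A sketch of the install command (the conda-forge package is named postgresql; adjust if you need a specific version):
$ pixi global install postgresql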
This exposes commands like createdb, initdb, pg_ctl, psql, … to the user’s PATH:
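You can check where the commands resolve from, for example:
$ which psql initdb pg_ctl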
You can then use PostgreSQL commands as you would normally do if you had root access:
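A minimal, hypothetical workflow under these assumptions: a fresh database cluster in your home directory and the default port 5432 free on the machine (if it is already in use, set a different port in $HOME/pgdata/postgresql.conf and pass it to the client tools with -p):
initdb -D $HOME/pgdata # create a new database cluster in your home directory
pg_ctl -D $HOME/pgdata -l $HOME/pgdata/server.log start # start the server, logging to a file
createdb -h localhost mydb # create a database on the running server
psql -h localhost mydb # open an interactive session against it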
Tip
This strategy of installing global tools via pixi is generalizable to any software available on conda-forge. The tool will be installed in its own isolated “sandbox” and will not interfere with other tools or packages in your user space. See pixi’s Global Tools page for more information.
Similarly, for software available on PyPI you can use the ‘tool’ concept of the uv package manager.