A-Cluster
Linux cluster with currently 13 compute nodes (CPUs: 416 cores; GPUs: 8× RTX 2080 + 18× RTX 3090), purchased by Ana Vila Verde and Christopher Stein.
Login
External hostname is a-cluster.physik.uni-due.de (134.91.59.16), the internal hostname is stor2.
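A login from outside might then look like this (the user name is a placeholder for your own cluster account):

  # Connect to the external head node; replace <your-login> with your account name
  ssh <your-login>@a-cluster.physik.uni-due.de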
Queueing system: Slurm
- There are two queues (partitions in Slurm terminology):
  - CPUs being the default
  - GPUs to be selected via -p GPUs for jobs which involve a GPU
- sinfo displays the cluster's total load.
- squeue shows running jobs. You can modify its output via the option -o. To make that permanent, put something like alias squeue='squeue -o "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R %C %o"' into your .bashrc.
- In the simplest cases, jobs are submitted via sbatch -n n script-name; a complete example script is shown after this list. The number n of CPUs is available within the script as $SLURM_NTASKS. It is not necessary to pass it on to mpirun, since the latter evaluates it on its own anyway.
- To allocate GPUs as well, add -G n or --gpus=n with n ∈ {1,2}. You can specify the type as well by prepending rtx2080: or rtx3090: to n.
- Don't use background jobs (&), unless you wait for them before the end of the script.
- srun is intended for interactive jobs (stdin, stdout, and stderr stay attached to the terminal), and its -n doesn't just reserve n cores but starts n jobs. (Those shouldn't contain mpirun, otherwise you'd end up with n² busy cores.)
- For an interactive shell with n reserved cores on a compute node: srun --pty -c n bash
- The assignment of cores can be non-trivial (cf. also task affinity); some rules:
  - GROMACS: Don't use its -pin options.
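As referenced in the submission item above, a minimal job script might look like the following sketch; the job name, the core count, and the program name my_md_program are placeholders, and the two GPU-related lines only apply to jobs in the GPUs partition:

  #!/bin/bash
  #SBATCH --job-name=example     # placeholder job name
  #SBATCH -n 16                  # number of CPU cores, available inside the job as $SLURM_NTASKS
  #SBATCH -p GPUs                # omit for pure CPU jobs (CPUs is the default partition)
  #SBATCH --gpus=rtx3090:1       # omit for pure CPU jobs; -G 1 or rtx2080: work analogously

  # mpirun picks the allocated core count up from Slurm on its own,
  # so $SLURM_NTASKS does not have to be passed explicitly.
  mpirun my_md_program           # placeholder program name

Submit it with sbatch script-name; the #SBATCH lines are equivalent to passing the same options on the sbatch command line.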
Simulation Software
... installed (on the compute nodes)
The module system is not involved. Instead, scripts provided by the software set the environment.
AMBER
/usr/local/amber18
/usr/local/amber20 (provides parmed as well)
Script to source therein (assuming bash): amber.sh
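For example, to set up AMBER 20 in a bash session or job script (AMBER 18 works analogously):

  # Load the AMBER 20 environment into the current bash session
  source /usr/local/amber20/amber.sh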
GROMACS
(not all tested)
/usr/local/gromacs-2018.3
/usr/local/gromacs-2020.4
/usr/local/gromacs-3.3.4
/usr/local/gromacs-4.6.4
/usr/local/gromacs-5.0.1
/usr/local/gromacs-5.1.1
Script to source therein (assuming bash): bin/GMXRC.bash
Ana provided an example script to be submitted via sbatch.
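Correspondingly for GROMACS, e.g. the 2020.4 installation (the other versions work the same way, with the caveat that not all of them have been tested):

  # Load the GROMACS 2020.4 environment into the current bash session
  source /usr/local/gromacs-2020.4/bin/GMXRC.bash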
Intel Compiler & Co.
- is located in /opt/intel/oneapi
- must be made available via module use /opt/intel/oneapi/modulefiles (unless you include /opt/intel/oneapi/modulefiles in your MODULEPATH); then module avail lists the available modules.
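A typical session could therefore start as follows; the module name compiler in the last line is only an illustrative guess, the actually available names are whatever module avail prints:

  # Make the Intel oneAPI modulefiles known to the module system
  module use /opt/intel/oneapi/modulefiles

  # List the available modules, then load what is needed
  module avail
  module load compiler    # illustrative name; pick one from the output of 'module avail'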