A-Cluster: Difference between revisions
Revision as of 15 October 2021, 23:01
Linux cluster with currently 13 compute nodes (CPUs: 416 cores, GPUs: 8x RTX 2080 + 18x RTX 3090), purchased by Ana Vila Verde and Christopher Stein
Login
The external hostname is a-cluster.physik.uni-due.de (134.91.59.16), the internal hostname is stor2.
Queueing system: Slurm
- sinfo displays the cluster's total load.
- squeue shows running jobs.
- Currently, there's just one partition: "a-cluster".
- In the simplest cases, jobs are submitted via sbatch -n n script-name. The number n of CPUs is available within the script as $SLURM_NTASKS. It's not necessary to pass it on to mpirun, since the latter evaluates it on its own anyway.
- Don't use background jobs (&) unless you wait for them before the end of the script.
- srun is intended for interactive jobs (stdin, stdout and stderr stay attached to the terminal), and its -n doesn't only reserve n cores but also starts n jobs. (Those shouldn't contain mpirun, otherwise you'd end up with n² busy cores.)
- The assignment of cores can be non-trivial (cf. also task affinity). Some rules:
  - gromacs: Don't use its -pin options.
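As an illustration of the submission rules in this section, a batch script might look like the following minimal sketch. The program name my_mpi_program and the helper function are hypothetical placeholders, and the fallback to 1 task is only an assumption that allows a dry run outside Slurm.

```shell
#!/bin/bash
# job.sh: sketch of a batch script. Submit with, e.g.:  sbatch -n 8 job.sh

# Slurm exports the -n value as SLURM_NTASKS; fall back to 1 outside Slurm.
ntasks="${SLURM_NTASKS:-1}"
echo "Running with ${ntasks} task(s)"

# Hypothetical MPI binary. mpirun gets no explicit task count here,
# because it evaluates the Slurm allocation on its own.
program="./my_mpi_program"
if [ -x "${program}" ]; then
    mpirun "${program}"
fi

# Hypothetical helper steps, started as background jobs.
log="$(mktemp)"
helper() { sleep 1; echo "helper $1 done" >> "${log}"; }
helper 1 &
helper 2 &

# Wait for all background jobs before the script ends.
wait
finished=$(grep -c 'done' "${log}")
echo "${finished} helper(s) finished"
```

Submitted as sbatch -n 8 job.sh, the script would see SLURM_NTASKS=8. The MPI step is skipped as long as the placeholder binary does not exist.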
Simulation Software
... installed (on the compute nodes)
The module system is not involved. Instead, scripts provided by the software set the environment.
AMBER
/usr/local/amber18
/usr/local/amber20 (provides parmed as well)
Script to source therein (assuming bash): amber.sh
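Setting up the AMBER environment could be sketched as follows, assuming bash and the amber20 installation. The check for the file is an addition for illustration, and AMBERHOME is assumed here to be the variable that amber.sh exports.

```shell
# Installation directory taken from the list of AMBER versions.
amber_root="/usr/local/amber20"
amber_env="${amber_root}/amber.sh"

if [ -f "${amber_env}" ]; then
    # Sourcing (not executing) is needed so that the variables
    # end up in the current shell.
    source "${amber_env}"
    echo "AMBERHOME is ${AMBERHOME:-unset}"
else
    echo "AMBER environment script not found: ${amber_env}" >&2
fi
```

In a batch script, these lines would go before the first AMBER command.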
GROMACS
(not all tested)
/usr/local/gromacs-2018.3
/usr/local/gromacs-2020.4
/usr/local/gromacs-3.3.4
/usr/local/gromacs-4.6.4
/usr/local/gromacs-5.0.1
/usr/local/gromacs-5.1.1
Script to source therein (assuming bash): bin/GMXRC.bash
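Since several GROMACS versions are installed side by side, a small helper for choosing one can be convenient. The following is a sketch only: the function name use_gromacs is hypothetical and not provided on the cluster.

```shell
# Hypothetical helper: load the environment of one installed GROMACS version.
use_gromacs() {
    local version="$1"
    local gmxrc="/usr/local/gromacs-${version}/bin/GMXRC.bash"
    if [ -f "${gmxrc}" ]; then
        # Source the script so that the environment changes affect this shell.
        source "${gmxrc}"
    else
        echo "No GROMACS ${version} found at ${gmxrc}" >&2
        return 1
    fi
}

# Example: select version 2020.4 from the list of installations.
use_gromacs 2020.4 || echo "GROMACS environment not loaded" >&2
```

The function returns a non-zero status for a version that is not installed, so a batch script can stop early instead of running with the wrong environment.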
Intel Compiler & Co.
- is located in /opt/intel/oneapi
- must be made available via module use /opt/intel/oneapi/modulefiles (unless you include /opt/intel/oneapi/modulefiles in your MODULEPATH), then module avail lists the available modules.
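The MODULEPATH variant could be sketched as follows, for instance in ~/.bashrc. The variable name oneapi_modules is illustrative, and the check for the module command is an assumption, since it is not necessarily defined in every non-interactive shell.

```shell
oneapi_modules="/opt/intel/oneapi/modulefiles"

# Append the oneAPI module directory to MODULEPATH, keeping existing entries.
# Interactively, `module use /opt/intel/oneapi/modulefiles` serves the same purpose.
export MODULEPATH="${MODULEPATH:+${MODULEPATH}:}${oneapi_modules}"

# List the available modules, but only where the module command exists.
if type module >/dev/null 2>&1; then
    module avail
fi
```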