A-Cluster

Linux cluster with currently 13 compute nodes (CPUs: 416 cores; GPUs: 8× RTX 2080 + 18× RTX 3090) and 2 × 251 TiB of disk storage, purchased by Ana Vila Verde and Christopher Stein.


The external hostname is a-cluster.physik.uni-due.de (134.91.59.16); the internal hostname is stor2.
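Access is via SSH to the external hostname; a login from outside might look like this (<user> stands for your cluster account name):

ssh <user>@a-cluster.physik.uni-due.de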

Queueing system: Slurm

• There are two queues (partitions in Slurm terminology) named:
• CPUs, the default
• GPUs, to be selected via -p GPUs for jobs which involve a GPU
• In the CPUs queue, two cores remain reserved on each node for GPU jobs, leaving 30 cores available per node.
• sinfo displays the cluster's total load.
• squeue shows running jobs. You can modify its output via the option -o. To make that permanent, put something like alias squeue='squeue -o "%.18i %.9P %.8j %.8u %.2t %.10M %.6D %R %C %o"' into your .bashrc.
• In the simplest cases, jobs are submitted via sbatch -n n script-name. The number n of CPUs is available within the script as $SLURM_NTASKS. It's not necessary to pass it on to mpirun, since the latter evaluates it on its own anyway. (A minimal example script is sketched after this list.)
• To allocate GPUs as well, add -G n or --gpus=n with n ∈ {1,2}. You can also specify the GPU type by prepending rtx2080: or rtx3090: to n.
• Don't use background jobs (&) unless you wait for them before the end of the script.
• srun is intended for interactive jobs (stdin, stdout and stderr stay attached to the terminal), and its -n doesn't merely reserve n cores but starts n jobs. (Those shouldn't contain mpirun, otherwise you'd end up with n² busy cores.)
• For an interactive shell with n reserved cores on a compute node: srun --pty -cn bash
• The assignment of cores can be non-trivial (cf. also task affinity); some rules:
• gromacs: Don't use its -pin options.
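To illustrate the options above, here is a minimal, hypothetical job script; my_mpi_prog and input.dat are placeholders, not software actually installed on the cluster. It could be submitted e.g. via sbatch -n 8 jobscript.sh:

#!/bin/bash
# the core count requested via sbatch -n is available as $SLURM_NTASKS
echo "running on $SLURM_NTASKS cores"
# mpirun determines the core count from Slurm on its own,
# so it needs no explicit core-count argument
mpirun my_mpi_prog input.dat

For a GPU job, the same script could be submitted with sbatch -p GPUs --gpus=rtx3090:1 -n 8 jobscript.sh.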
Scientific Software

... installed (on the compute nodes)

AMBER

The module system is not involved. Instead, scripts provided by the software set the environment.
• /usr/local/amber18
• /usr/local/amber20 (provides parmed as well)
Script to source therein (assuming bash): amber.sh

GROMACS

The module system is not involved. Instead, scripts provided by the software set the environment.
Versions (not all tested):
• /usr/local/gromacs-2018.3
• /usr/local/gromacs-2020.4
• /usr/local/gromacs-3.3.4
• /usr/local/gromacs-4.6.4
• /usr/local/gromacs-5.0.1
• /usr/local/gromacs-5.1.1
Script to source therein (assuming bash): bin/GMXRC.bash
Ana provided an example script to be submitted via sbatch.

OpenMolcas

(compiled with Intel compiler and MKL)
Minimal example script to be sbatched:

#!/bin/bash
export MOLCAS=/usr/local/openmolcas
# job-specific scratch directory
export MOLCAS_WORKDIR=/tmp/$USER-$SLURM_JOB_NAME-$SLURM_JOB_ID
mkdir $MOLCAS_WORKDIR
export PATH=$PATH:$MOLCAS
export LD_LIBRARY_PATH=/opt/intel/oneapi/compiler/latest/linux/compiler/lib/intel64_lin:/opt/intel/oneapi/mkl/latest/lib/intel64
export OMP_NUM_THREADS=${SLURM_NTASKS:-1}

pymolcas the_input.inp

# remove the scratch directory again
rm -rf $MOLCAS_WORKDIR

If you want/need to use the module system instead of setting LD_LIBRARY_PATH manually:

shopt -s expand_aliases
source /etc/profile.d/modules.sh
module use /opt/intel/oneapi/modulefiles
module -s load compiler/latest
module -s load mkl/latest

Intel Compiler & Co.

• is located in /opt/intel/oneapi
• must be made available via module use /opt/intel/oneapi/modulefiles (unless you include /opt/intel/oneapi/modulefiles in your MODULEPATH); then module avail lists the available modules.
• Module mkl/latest also contains FFT routines.
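As a sketch of how this can be used to build against MKL (my_prog.c is a placeholder; the exact compiler drivers and flags depend on the installed oneAPI version):

module use /opt/intel/oneapi/modulefiles
module load compiler/latest mkl/latest
# icx is the oneAPI C compiler driver; -qmkl links against MKL
# (older oneAPI releases use -mkl instead)
icx -qmkl -o my_prog my_prog.c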
Backups

A backup of the users' home directories is taken nightly. To access the backups, first log in to the cluster. Then:
• Users in /home/stor.vd1: Last night's backup is in /export/vd1/$USER.
• Users in /home/stor1.lv0: You actually have seven backups, corresponding to the last seven days, in /exports/lv0/snapshots/days.D/stor1/home/stor1.lv0/$USER with D ∈ {0,…,6}.
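As a sketch, restoring a single file might look like this (some_file is a placeholder; pick the day index D that matches the snapshot you need):

# users in /home/stor.vd1: copy last night's version back
cp /export/vd1/$USER/some_file ~/some_file.restored
# users in /home/stor1.lv0: e.g. the snapshot with day index 3
cp /exports/lv0/snapshots/days.3/stor1/home/stor1.lv0/$USER/some_file ~/some_file.restored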