A-Cluster

From IT Physics
Revision as of 17 September 2021, 12:11 by Brendel (Queueing system: Slurm: task affinity)

Linux cluster with currently 13 compute nodes (CPUs: 416 cores, GPUs: 8x RTX 2080 + 18x RTX 3090), purchased by Ana Vila Verde and Christopher Stein

Login

The external address is 134.91.59.31 (this will change soon, and the machine will then get a hostname). The internal hostname is stor2.

Queueing system: Slurm

  • Currently, there's just one partition: "a-cluster"
  • In the simplest case, jobs are submitted via sbatch -n n script-name, where n is the number of tasks (by default one core each).
  • srun is intended for interactive jobs (stdin, stdout and stderr stay attached to the terminal). Its -n does not just reserve n cores but starts n copies of the command. (The command should therefore not contain mpirun, otherwise you'd end up with n² busy cores.)
  • Assigning cores to jobs can be non-trivial: task affinity
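
As an illustration of the sbatch usage described above, a minimal batch script might look like the following sketch. The task count of 4, the job name and the program name ./my_mpi_program are assumptions chosen for the example and are not part of this cluster's setup. The #SBATCH lines are plain comments to the shell, so the script also runs by hand outside Slurm.

```shell
#!/bin/bash
#SBATCH --partition=a-cluster   # the only partition at present
#SBATCH --ntasks=4              # same as "sbatch -n 4"; 4 is an arbitrary example
#SBATCH --job-name=demo         # hypothetical job name

# Slurm sets SLURM_NTASKS inside a job; fall back to 1 when run by hand.
ntasks="${SLURM_NTASKS:-1}"
echo "Job has ${ntasks} task(s) on $(uname -n)"

# Start the MPI program once; mpirun launches the ranks itself.
# ./my_mpi_program is a hypothetical placeholder, hence commented out.
# mpirun -n "${ntasks}" ./my_mpi_program
```

Saved under a name such as job.sh (also an assumption), this would be submitted with sbatch job.sh. Options given on the sbatch command line, such as -n, take precedence over the #SBATCH lines in the script.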

Intel Compiler & Co.

  • is located in /opt/intel/oneapi
  • must be made available via module use /opt/intel/oneapi/modulefiles (unless you include /opt/intel/oneapi/modulefiles in your MODULEPATH), then module avail lists the available modules.
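
The MODULEPATH alternative mentioned above could be set up roughly as in the following sketch, for example in a shell startup file such as ~/.bashrc. The variable name oneapi_modules is made up for the example, and the check for the module command is only there so that the snippet does no harm on machines without environment modules.

```shell
# Directory with the oneAPI modulefiles (path taken from the text above).
oneapi_modules=/opt/intel/oneapi/modulefiles

# Add the directory to MODULEPATH unless it is already listed.
case ":${MODULEPATH:-}:" in
  *":${oneapi_modules}:"*) ;;
  *) export MODULEPATH="${MODULEPATH:+${MODULEPATH}:}${oneapi_modules}" ;;
esac

# List the available modules if the module command exists in this shell.
if command -v module >/dev/null 2>&1; then
  module avail
fi
```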