High Performance Computing
Ulysses v2
The Ulysses v2 cluster is available for scientific computation to all SISSA users. If you have an active SISSA account, please write to helpdesk-hpc@sissa.it to have it enabled on Ulysses.
Access
Ulysses v2 can be (provisionally) accessed via the login node at frontend2.hpc.sissa.it from the SISSA network or from the SISSA VPN. In the meantime, frontend1 remains available as an access point to the old cluster. More access options might be made available in due time.
Hardware and Software
Available compute nodes include:
- (old) IBM nodes: Xeon E5-2680 v2 (2 sockets, 10 cores per socket, 2 threads per core), most with 40GB RAM, a subset with 160GB, a much smaller subset with 320GB
- (old) IBM GPU nodes: Xeon E5-2680 v2 (same as above), 64GB RAM, 2 Tesla K20m GPUs
- (new) HP nodes: Xeon E5-2683 v4 (2 sockets, 16 cores per socket, 2 threads per core), 64GB RAM
- (new) HP GPU nodes: same as above, plus 2 Tesla P100 GPUs
All nodes are connected to an InfiniBand QDR fabric.
The software tree is the same as on the SISSA Linux workstations, with the same Lmod modules system; the only exception is desktop-oriented software packages, which are not installed on the cluster.
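A typical Lmod session looks like this (the module name below is illustrative; run module avail on the cluster to see what is actually installed):

  module avail        # list all available modules
  module load gcc     # load a module (illustrative name)
  module list         # show currently loaded modules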
Queue System
The queue system is now SLURM (https://slurm.schedmd.com/documentation.html), so if you were used to TORQUE on the old Ulysses you will need to adapt your job scripts somewhat.
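As a rough starting point, the most common TORQUE commands map to SLURM as follows:

  qsub job.sh      ->  sbatch job.sh      # submit a job script
  qstat            ->  squeue             # list queued/running jobs
  qstat -u $USER   ->  squeue -u $USER    # list only your own jobs
  qdel <jobid>     ->  scancel <jobid>    # cancel a job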
Available partitions (or “queues” in TORQUE old-speak) include:
- regular1 (old nodes) and regular2 (new nodes): max 16 nodes, max 12h
- wide1 and wide2: max 32 nodes, max 8h
- long1 and long2: max 8 nodes, max 48h
- gpu1 and gpu2: max 4 nodes, max 12h
All nodes have hyperthreading enabled (2 threads per core, as listed above); if your application works best with one task per physical core, the --hint=nomultithread --cpu-bind=cores options to srun/sbatch will help.
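A minimal job script using these options might look as follows (partition, task count, memory and time limit are illustrative; adjust them to your job):

  #!/bin/bash
  #SBATCH --partition=regular2
  #SBATCH --nodes=1
  #SBATCH --ntasks=32           # one task per physical core on an HP node
  #SBATCH --time=02:00:00
  #SBATCH --mem=48G

  srun --hint=nomultithread --cpu-bind=cores ./my_program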
Job scheduling is fair-share based, so the scheduling priority of your jobs depends both on their waiting time in the queue AND on the amount of resources consumed by your other jobs. If you urgently need to start a single job ASAP (e.g. for debugging), you can use the fastlane QoS, which gives your job a substantial priority boost (before you ask: to prevent abuse, only one job per user can use fastlane at a time, and you will “pay” for the priority boost with a lower priority for your subsequent jobs).
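To submit a job through fastlane, pass the QoS on the command line:

  sbatch --qos=fastlane job.sh

or, equivalently, add #SBATCH --qos=fastlane to the job script.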
You should always use the --mem or --mem-per-cpu Slurm options to specify the amount of memory needed by your job. This is especially important if your job doesn't use all available CPUs on a node (40 threads on IBM nodes, 64 on HP); failing to do so will negatively impact the scheduling performance.
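For example, a 10-task job that needs about 2GB per task could request memory per CPU like this (values are illustrative):

  #!/bin/bash
  #SBATCH --partition=regular1
  #SBATCH --ntasks=10
  #SBATCH --mem-per-cpu=2G      # memory per allocated CPU
  #SBATCH --time=06:00:00

  srun ./my_program

Alternatively, --mem=20G would request the total amount of memory per node.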