High Performance Computing

The Ulysses cluster v2 is available for scientific computation to all SISSA users. If you have an active SISSA account, please write to helpdesk-hpc@sissa.it in order to have it enabled on Ulysses.

Access

Ulysses v2 can (provisionally) be accessed via the login node at frontend2.hpc.sissa.it, either from the SISSA network or through the SISSA VPN. In the meantime, frontend1 remains available as an access point to the old cluster. More access options might be made available in due time.

Hardware and Software

Available compute nodes include:

  • (old) IBM nodes: Xeon E5-2680 v2 (2 sockets, 10 cores per socket, 2 threads per core), most of them with 40GB RAM, a subset with 160GB, a much smaller subset with 320GB
  • (old) IBM GPU nodes: Xeon E5-2680 v2 (same as above), 64GB RAM, 2 Tesla K20m
  • (new) HP nodes: Xeon E5-2683 v4 (2 sockets, 16 cores per socket, 2 threads per core), 64GB RAM
  • (new) HP GPU nodes: same as above, with 2 Tesla P100

All nodes are connected to an InfiniBand QDR fabric.

The software tree is the same as on your Linux workstation, with the same Lmod module system; the only exception is desktop-oriented software packages.

Queue System

The queue system is now SLURM (https://slurm.schedmd.com/documentation.html), so if you were used to TORQUE on the old Ulysses, you will need to adapt your job scripts.
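For readers porting a TORQUE script, a minimal SLURM batch script might look like the sketch below. The job name, the resource values and the `uname -n` payload are placeholders chosen for illustration, not site recommendations; the comments give the closest TORQUE equivalents.

```shell
#!/bin/bash
#SBATCH --job-name=example        # TORQUE: #PBS -N example
#SBATCH --partition=regular2      # TORQUE: #PBS -q <queue>
#SBATCH --nodes=1                 # TORQUE: #PBS -l nodes=1:ppn=4
#SBATCH --ntasks-per-node=4
#SBATCH --time=01:00:00           # TORQUE: #PBS -l walltime=01:00:00
#SBATCH --mem=4G                  # illustrative value
#SBATCH --output=example-%j.out   # %j expands to the job ID

# SLURM starts the job in the submission directory by default, so the
# usual TORQUE 'cd $PBS_O_WORKDIR' is not required. SLURM_SUBMIT_DIR is
# set by SLURM; outside a job we fall back to the current directory.
workdir="${SLURM_SUBMIT_DIR:-$PWD}"
cd "$workdir" || exit 1

# Launch through srun inside an allocation; run the payload directly
# when the script is tried out elsewhere (e.g. on a workstation).
if [ -n "${SLURM_JOB_ID:-}" ]; then
    launcher="srun"
else
    launcher=""
fi
$launcher uname -n                # placeholder payload: prints the node name
```

Such a script would be submitted with `sbatch`, where TORQUE used `qsub`.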

Available partitions (or “queues” in TORQUE old-speak) include:

  • regular1 (old nodes) and regular2 (new nodes): max 16 nodes, max 12h
  • wide1 and wide2: max 32 nodes, max 8h
  • long1 and long2: max 8 nodes, max 48h
  • gpu1 and gpu2: max 4 nodes, max 12h

Please note that hyperthreading is enabled on all nodes (it was disabled on old Ulysses). If you do not want to use hyperthreading, the --hint=nomultithread and --cpu-bind=cores options to srun/sbatch will help.
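As a sketch of how these options might be combined, the script below asks for one task per physical core on a single new HP node. The core counts come from the hardware list above; the time, the memory request and the `uname -n` payload are illustrative assumptions. Here `--hint=nomultithread` is given as a batch directive and `--cpu-bind=cores` on the `srun` line.

```shell
#!/bin/bash
#SBATCH --partition=regular2      # new HP nodes
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32      # one task per physical core (2 x 16)
#SBATCH --hint=nomultithread      # do not use the extra hardware threads
#SBATCH --time=02:00:00           # below the 12h limit of regular2
#SBATCH --mem=32G                 # illustrative value

# Core and thread counts of an HP node, from the hardware list.
sockets=2
cores_per_socket=16
threads_per_core=2
physical_cores=$((sockets * cores_per_socket))
hardware_threads=$((physical_cores * threads_per_core))
echo "physical cores: $physical_cores, hardware threads: $hardware_threads"

# Bind each task to a core; only attempted inside a SLURM allocation.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun --cpu-bind=cores uname -n   # placeholder payload
fi
```

On an old IBM node the same reasoning gives 2 x 10 = 20 physical cores, so `--ntasks-per-node=20` with a partition such as regular1 would be the analogous request.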

Job scheduling is based on fair share, so the scheduling priority of your jobs depends both on their waiting time in the queue and on the amount of resources consumed by your other jobs. If you urgently need to start a single job (e.g. for debugging), you can use the fastlane QoS, which gives your job a substantial priority boost. Before you ask: to prevent abuse, only one job per user can use fastlane at a time, and you will “pay” for the priority boost with a lower priority for your subsequent jobs.
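A QoS is normally selected with the `--qos` option of `sbatch` (or an equivalent `#SBATCH --qos=...` line in the job script). The sketch below assumes that the QoS is named exactly `fastlane`, as in the text, and uses a hypothetical job script name; it prints the command instead of submitting when `sbatch` or the script is not available.

```shell
#!/bin/bash
# Hypothetical job script name; replace it with your own.
job_script="debug_job.sh"

# --qos selects the Quality of Service for this submission.
submit_cmd="sbatch --qos=fastlane $job_script"

if command -v sbatch >/dev/null 2>&1 && [ -f "$job_script" ]; then
    $submit_cmd
else
    echo "Would run: $submit_cmd"
fi
```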

You should always use the --mem or --mem-per-cpu SLURM options to specify the amount of memory needed by your job. This is especially important if your job doesn't use all available CPUs on a node (40 threads on IBM nodes, 64 on HP); failing to do so will negatively impact the scheduling performance.
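A minimal sketch of a job that uses only part of a node and states its memory needs explicitly is given below. The task count, the time and the memory value are illustrative assumptions. `--mem` requests memory per node and `--mem-per-cpu` per allocated CPU, and the two options are mutually exclusive. Whether SLURM counts a "CPU" as a core or as a hardware thread depends on the site configuration, so it is worth checking the resulting allocation.

```shell
#!/bin/bash
#SBATCH --partition=regular2
#SBATCH --nodes=1
#SBATCH --ntasks=8                # only part of a 64-thread HP node
#SBATCH --mem-per-cpu=900M        # illustrative; alternatively e.g. --mem=8G
#SBATCH --time=04:00:00

# Nominal memory per hardware thread, from the hardware list:
# 40GB over 40 threads (most IBM nodes), 64GB over 64 threads (HP nodes).
# The memory actually available to jobs is somewhat lower.
ibm_mb_per_thread=$((40 * 1024 / 40))
hp_mb_per_thread=$((64 * 1024 / 64))
echo "IBM: ${ibm_mb_per_thread} MB per thread, HP: ${hp_mb_per_thread} MB per thread"

# Placeholder payload; only launched inside a SLURM allocation.
if [ -n "${SLURM_JOB_ID:-}" ]; then
    srun uname -n
fi
```

Requesting roughly the nominal per-thread share or less leaves the rest of the node's memory available to other jobs.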
