SSH access to Ulysses v2 is provided via the login nodes at ''frontend1.hpc.sissa.it'' or ''frontend2.hpc.sissa.it'' from the SISSA network or from the SISSA [[:vpn|VPN]].
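A minimal connection sketch, assuming a standard OpenSSH client; ''myuser'' is a placeholder for your SISSA username:

<code bash>
# connect to one of the Ulysses v2 login nodes (works from the SISSA network or VPN)
ssh myuser@frontend1.hpc.sissa.it
</code>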
===== Hardware and Software =====
  
The software tree is the same as on the Linux workstations, with the same [[services:modules|Lmod modules]] system (with the only exception of desktop-oriented software packages).
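A minimal sketch of working with Lmod on a login node; the module name below is hypothetical, so check ''module avail'' for what is actually installed:

<code bash>
module avail        # list the software modules available on Ulysses
module load gcc     # hypothetical module name: pick one from the "module avail" output
module list         # show the modules currently loaded in this shell
</code>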
  
<del>A small number of POWER9-based nodes are also available (2 sockets, 16 cores, 4 threads per core; 256GB RAM) with 2 or 4 Tesla V100. Please note that you cannot run x86 code on POWER9. For an interactive shell on a P9 machine, please type ''p9login'' on frontend[12].</del>
  
===== Queue System =====
  * **''long1''** and **''long2''**: max 8 nodes, max 48h, max 6 concurrently running jobs per user
  * **''gpu1''** and **''gpu2''**: max 4 nodes, max 12h (a submission sketch follows this list)
  * <del>**''power9''**: max 4 nodes, max 24h</del>
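As an illustration of staying within the per-partition limits above, here is a minimal submission sketch; the script name is a placeholder, and ''%%--gres=gpu:1%%'' assumes a standard Slurm GRES setup for requesting a GPU:

<code bash>
# one gpu1 node for at most 12 hours (the partition limit), with one GPU
sbatch --partition=gpu1 --nodes=1 --time=12:00:00 --gres=gpu:1 my_gpu_job.sh
</code>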
  
<note tip>Please note that hyperthreading is enabled on all nodes (it was disabled on old Ulysses). If you **do not** want to use hyperthreading, the ''%%--hint=nomultithread%%'' option to srun/sbatch will help.
  
  
===== Simplest possible job =====
This is a single-core job with default time and memory limits (1 hour and 0.5GB).
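A minimal sketch of such a job script, assuming only standard sbatch directives; the job name and the command are placeholders:

<code bash>
#!/bin/bash
#SBATCH --job-name=simplest   # placeholder job name
#SBATCH --ntasks=1            # a single core/task
# no --time or --mem requested, so the defaults apply (1 hour, 0.5GB)

hostname                      # placeholder for the actual work
</code>

Submit it with ''sbatch simplest.sh'' (the file name is again a placeholder).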
  
  
<note warning>Please note that MPI jobs are only supported if they allocate all available cores/threads on each node (so 20c/40t on *1 partitions and 32c/64t on *2 partitions). In this context, //not supported// means that jobs using fewer cores/threads than available may or may not work, depending on how cores //not// allocated to your job are used.</note>
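A minimal sketch of an MPI job that satisfies the full-node rule on a *1 partition; the node count, time limit and executable are placeholders:

<code bash>
#!/bin/bash
#SBATCH --partition=long1      # a *1 partition: 20 cores / 40 threads per node
#SBATCH --nodes=2              # placeholder node count
#SBATCH --ntasks-per-node=40   # allocate every hardware thread on each node
#SBATCH --time=02:00:00        # placeholder time limit

# to use only the 20 physical cores per node instead, set --ntasks-per-node=20
# and add: #SBATCH --hint=nomultithread

srun ./my_mpi_program          # placeholder MPI executable
</code>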

==== Access to hardware-based performance counters ====

Access to hardware-based performance counters is disabled by default for security reasons. It can be enabled on request, and only for node-exclusive jobs (i.e. for allocations where a single job is allowed to run on each node): use ''sbatch -C hwperf --exclusive ...''
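A hypothetical usage sketch; ''my_job.sh'' is a placeholder for a batch script that runs your profiling tool of choice on the application:

<code bash>
# request the hwperf feature together with exclusive node access,
# both of which are required for the hardware counters to be usable
sbatch -C hwperf --exclusive my_job.sh
</code>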

===== Using conda env for PyTorch with CUDA support =====
If you want to use Python AI libraries, chances are they will be distributed via the conda packaging system. To learn how to use conda environments on the Ulysses GPU nodes, please refer to the [[services:computing:hpc:conda|HPC conda]] page.

===== Filesystem Usage and Backup Policy =====