The software tree is the same you have on Linux workstations, with the same [[services:modules|Lmod modules]] system (with the only exception of desktop-oriented software packages).
  
A small number of POWER9-based nodes are also available (2 sockets, 16 cores, 4 threads per core; 256GB RAM) with 2 or 4 Tesla V100 GPUs. Please note that you cannot run x86 code on POWER9. For an interactive shell on a POWER9 machine, please type ''p9login'' on frontend[12].
  
===== Queue System =====
  
  * **''regular1''** (old nodes) and **''regular2''** (new nodes): max 16 nodes, max 12h
  * **''wide1''** and **''wide2''**: max 32 nodes, max 8h, max 2 concurrently running jobs per user
  * **''long1''** and **''long2''**: max 8 nodes, max 48h, max 6 concurrently running jobs per user
  * **''gpu1''** and **''gpu2''**: max 4 nodes, max 12h
  * **''power9''**: max nodes, max 24h
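As a rough sketch, a batch job for one of these partitions can be requested with directives like the following (partition, node count and program name are examples only, to be adapted to your needs):

<code>
#!/bin/bash
#SBATCH --partition=regular1   # one of the partitions listed above
#SBATCH --nodes=2              # within the 16-node limit of regular1
#SBATCH --time=06:00:00        # within the 12h limit of regular1

srun ./my_program              # hypothetical executable
</code>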
  
<note tip>Please note that hyperthreading is enabled on all nodes (it was disabled on old Ulysses). If you **do not** want to use hyperthreading, the ''%%--hint=nomultithread%%'' option to srun/sbatch will help.</note>
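For example, to get one task per physical core instead of one per hardware thread (a sketch; the task count refers to a 20-core *1 node):

<code>
# 20 tasks on a 20-core node, ignoring the extra hardware threads
srun --hint=nomultithread --nodes=1 --ntasks=20 ./my_program
</code>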
Job scheduling is fair share-based, so the scheduling priority of your jobs depends on the waiting time in the queue AND on the amount of resources consumed by your other jobs. If you have an urgent need to start a **single** job ASAP (e.g. for debugging), you can use the ''fastlane'' QoS, which will give your job a substantial priority boost (to prevent abuse, only one job per user can use fastlane at a time, and you will "pay" for the priority boost with a lower priority for your subsequent jobs).
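A sketch of submitting a single urgent job through the ''fastlane'' QoS (the script name is hypothetical):

<code>
# priority boost for one debugging job; subsequent jobs will "pay" for it
sbatch --qos=fastlane debug_job.sh
</code>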
  
You //should// always use the ''%%--mem%%'' slurm option to specify the amount of memory needed by your job; ''%%--mem-per-cpu%%'' is also possible, but not recommended due to the scheduler configuration. This is especially important if your job doesn't use all available CPUs on a node (40 threads on IBM nodes, 64 on HP), and failing to do so will negatively impact scheduling performance.

<note warning>Please note that ''%%--mem=0%%'' (i.e. "all available memory") is **not** recommended, since the amount of memory actually available on each node may vary (e.g. in case of hardware failures).</note>
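For instance, a job that uses only part of a node might request memory like this (all values are examples only):

<code>
#!/bin/bash
#SBATCH --ntasks=8        # only 8 of the node's CPUs
#SBATCH --mem=16G         # total memory per node for this job
#SBATCH --time=02:00:00

srun ./my_program         # hypothetical executable
</code>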
  
  
  
===== Simplest possible job =====
This is a single-core job with default time and memory limits (1 hour and 0.5GB).
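A minimal sketch of such a script (script and program names are hypothetical; with no ''%%--time%%'' or ''%%--mem%%'' options the defaults above apply):

<code>
#!/bin/bash
#SBATCH --job-name=hello

./my_serial_program    # runs on a single core with default limits
</code>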
  
  
<note warning>Please note that MPI jobs are only supported if they allocate all available cores/threads on each node (so 20c/40t on *1 partitions and 32c/64t on *2 partitions). In this context, //not supported// means that jobs using fewer cores/threads than available may or may not work, depending on how cores //not// allocated to your job are used.</note>
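A sketch of an MPI job allocating all physical cores on each node of a *2 partition (the program name is hypothetical):

<code>
#!/bin/bash
#SBATCH --partition=regular2
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=32   # all 32 physical cores of each *2 node

srun ./my_mpi_program
</code>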
==== Access to hardware-based performance counters ====

Access to hardware-based performance counters is disabled by default for security reasons. It can be enabled on request, but only for node-exclusive jobs (i.e. for allocations where a single job is allowed to run on each node): use ''sbatch -C hwperf --exclusive ...''

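For example (the job script name is hypothetical):

<code>
# node-exclusive allocation with performance counters enabled
sbatch -C hwperf --exclusive my_profiling_job.sh
</code>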
===== Filesystem Usage and Backup Policy =====
  
Daily backups are taken of ''/home'', while no backup is available for ''/scratch''. If you need to recover some deleted or damaged file from a backup set, please write to [[helpdesk-hpc@sissa.it]]. Daily backups are kept for one week, a weekly backup is kept for one month, and monthly backups are kept for one year.
  
Due to their inherent volatility, some directories can be excluded from the backup set. At this time, the list of excluded directories includes only ''/home/$USER/.cache'' and ''/home/$USER/.singularity/cache''.

===== Job E-Mail =====
You can enable e-mail notifications at various stages of each job's life with the ''--mail-type=TYPE'' option, where ''TYPE'' can be a comma-separated list such as ''BEGIN,END,FAIL'' (more details are available in ''man sbatch''). The notification recipient is your SISSA e-mail address by default, but you can select a different address with ''--mail-user''. The **end-job** notification includes a summary of consumed resources (CPU time and memory), both as absolute values and as a percentage of the requested resources. Please note that memory usage is sampled at 30-second intervals, so if your job is terminated by an out-of-memory condition arising from a very large failed allocation, the reported value can be grossly underestimated.
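For example (the address and script name are placeholders):

<code>
sbatch --mail-type=BEGIN,END,FAIL --mail-user=user@example.org my_job.sh
</code>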
==== Energy Accounting ====
An experimental energy accounting system has been enabled on Ulysses, and energy usage estimates are reported in the end-job notification. This is intended as a very rough estimate of the energy impact of your job, but is **not** accurate enough to be used for proper cost/energy/environmental accounting. Known limits of the energy accounting system in use include:
  * very small values are completely unreliable (and are not included at all in the end-job notification, so for a very short or "mostly idle" job you will find no value at all)
  * only CPU and memory energy usage is considered, while energy consumed by other devices (network cards, disk controllers, service processors, power supplies) is not accounted for; energy used "outside" the compute nodes (network devices, external storage, UPS, HVAC) is not considered either, so even for a CPU-intensive job the "real" energy consumption can easily be twice as much as reported
  * on the other hand, //if your job doesn't use all available cores on each allocated node//, energy consumption can be overestimated

===== Periodic Summary Reports from Slurm =====

You can enable the generation of periodic reports on your cluster usage, delivered to your email address on a daily, weekly and/or monthly basis.

Each summary report includes the number of jobs that completed their lifecycle during the selected interval, along with the total amount of CPU*hours consumed and an estimate of total energy consumption; the number of jobs in each partition; and the final states of completed jobs (usually one of ''COMPLETED'', ''TIMEOUT'', ''CANCELLED'', ''FAILED'' or ''OUT_OF_MEMORY''). Optionally, a detailed listing of all jobs can be included as an attachment (a zipped CSV file that can be further processed with your software of choice, but is also human-readable).

To enable the reports with the default options (no daily report; weekly report with job details and monthly report, delivered to your_username@sissa.it), just create an empty ''.slurm_report'' file in your home directory on Ulysses:
<code>
touch $HOME/.slurm_report
</code>

If you need to tune some parameters (e.g. enable daily reports, enable/disable job details, change the mail delivery address), copy the default configuration file to your home directory
<code>
cp /usr/local/etc/slurm_report.ini $HOME/.slurm_report
</code>
and edit the local copy. If your account has no "@sissa.it" email, it is recommended that you edit the ''mailto='' line.

<note tip>Since 2022-05-12, Slurm reports are enabled for all new accounts; if you want to disable the reports, just delete the config file ''$HOME/.slurm_report''.</note>

==== How to read the detailed report ====

The detailed report, if requested, is attached as a Zip-compressed CSV file. You should be able to decompress it on any modern computing platform, and the CSV file is both human- and machine-readable. Timestamps are in ISO 8601 format with an implicit local time zone (YYYY-MM-DDThh:mm:ss, e.g. 2022-03-04T09:30:00 is "half past nine in the morning of March 4th, 2022"). Four timestamps are provided for each job: **submit** (when the job was created with sbatch or a similar command), **eligible** (when the job became runnable, i.e. there were no conflicting conditions, such as dependencies on other jobs or exceeded user limits), **start** and **end** (when the job actually began and ended execution).
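Assuming GNU ''date'' (as found on Linux), the waiting time between, say, **submit** and **start** can be computed from two such timestamps (the values here are made up):

<code>
submit=2022-03-04T09:30:00
start=2022-03-04T10:05:30
echo $(( $(date -d "$start" +%s) - $(date -d "$submit" +%s) ))  # seconds spent in queue
</code>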
  
===== Reporting Issues =====
When reporting issues with Ulysses, please keep to the following guidelines:
  
  * write to [[helpdesk-hpc@sissa.it]], not to personal email addresses: this way your request will be seen by more than one person
  * please use a clear and descriptive subject for your message: "missing library libwhatever.so.12 from package whatever-libs" is OK, "missing software" is less useful, "Ulysses issues" is definitely not useful
  * please open one ticket for each issue; **do not** reply to old, closed tickets for unrelated issues