This page contains a virtual high performance computing (HPC, or more precisely, cluster computing) kickstart course. It is not part of the main Hands-on Scientific Computing flow, but is an expanded version of the “D” level material.
This page currently contains an online course from Aalto University (Aalto Scientific Computing), so the exact examples may not work on other clusters, but the theory and concepts will carry over. You need to combine this outline with documentation from your own site.
In the future, this page will be reorganized to present the best topics from all courses combined, in the best order. Material may therefore be mixed and matched, so transitions between lessons will not always be smooth, but the overall result should be more effective.
The two tracks below can be followed in whatever order suits you, or you can watch the intro and then go on.
“How to connect and use software/data” track:
Accounts, ssh, ssh keys, different operating systems, Jupyter, remote desktop environments
About storage, different storage locations and properties, quotas, access on other computers, remote access
How to use other software, common applications, Singularity containers, requesting new software
The module command, searching for modules, loading modules, module versions, module collections.
“How to actually run stuff” track. This goes into detail about the batch system and accessing resources:
Scheduling systems, Slurm, requesting resources, running interactive jobs whose output you can see directly.
Jobs that run without your interaction, scripting jobs, checking output, viewing history, cancelling jobs.
Checking actual resource usage of jobs (CPU/memory/GPU) while they run and after they finish, adjusting resource requirements, reducing resource wastage.
Types of parallelism, shared memory (OpenMP), message passing (MPI), multiprocessing, how to run each of them, monitoring performance (does not cover writing new parallel programs).
What an array job is, doing the same thing many times, serial job → array job, various tips and examples.
GPU programs, machine learning frameworks, compiling CUDA code, requesting a GPU, monitoring efficiency, common efficiency traps.
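As a rough illustration of the serial job → array job idea from this track, the sketch below shows how a single script might pick its own piece of work inside an array job. It assumes the scheduler is Slurm, which sets the `SLURM_ARRAY_TASK_ID` environment variable for each array task. The function name `select_input`, the file names, and the fallback to index 0 outside an array job are hypothetical choices made for this example, not part of the course material.

```python
import os


def select_input(inputs, env=None):
    """Return the input assigned to this array task.

    Assumption: Slurm exports SLURM_ARRAY_TASK_ID for each array task.
    Outside an array job the variable is absent, so this sketch falls
    back to index 0 (a hypothetical choice, not a Slurm convention).
    """
    env = os.environ if env is None else env
    task_id = int(env.get("SLURM_ARRAY_TASK_ID", "0"))
    if not 0 <= task_id < len(inputs):
        raise IndexError(f"task id {task_id} has no matching input")
    return inputs[task_id]


if __name__ == "__main__":
    # Hypothetical input files; each array task would process one of them.
    inputs = ["data_0.csv", "data_1.csv", "data_2.csv"]
    print(select_input(inputs))
```

If such a script were submitted with an array range that matches the list (for example `sbatch --array=0-2`), each task would see a different index and so handle a different file. Check your own site's documentation for the recommended submission options.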
These special topics can be used in whatever order suits you, if they are relevant to your interests.
Currently available resources at CSC, Finland: The above material is mostly about what you can find on a cluster at one university (though even bigger clusters use the same interface). This topic covers other resources available at a national computing center (other countries will be somewhat similar). (Video, Reading, Q&A)
Cluster etiquette: We have covered what you can do, but what should you do to avoid annoying others on the cluster? See more in Research Software Hour. (Video)
“How to tame the cluster”: mostly the same material as this whole course, compressed into one hour, with a complete worked example. (Video)