f. Get to know your Cluster

Now that you are connected to the head node, familiarize yourself with the cluster structure by running the following set of commands.

SLURM

SLURM from SchedMD is one of the batch schedulers that you can use in AWS ParallelCluster. For an overview of the SLURM commands, see the SLURM Quick Start User Guide.

  • List existing partitions and nodes per partition. You should see two nodes if you run this command right after creating your cluster, and zero nodes if you run it 10 minutes or more after creation. Ten minutes is the default cooldown period for AWS ParallelCluster, after which idle compute nodes are terminated, so you don’t pay for what you don’t use.
sinfo
  • List queued and running jobs. There won’t be any, since we have not submitted anything…yet!
squeue
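
If you want a compact node count per partition instead of the full sinfo table, the following is a minimal sketch. It assumes the standard sinfo format fields %P (partition name) and %D (node count); the helper name count_nodes is made up for this example and is not part of SLURM.

```shell
# Hypothetical helper: sum node counts per partition.
# Reads "PARTITION NODES" lines on stdin, e.g. from: sinfo -h -o "%P %D"
count_nodes() {
  # sinfo marks the default partition with a trailing "*"; strip it before summing.
  awk '{ sub(/\*$/, "", $1); n[$1] += $2 } END { for (p in n) print p, n[p] }' | sort
}

# Only query the scheduler when sinfo is actually available on this machine.
if command -v sinfo >/dev/null 2>&1; then
  sinfo -h -o "%P %D" | count_nodes
fi
```

Running this right after cluster creation and again after the cooldown period is a quick way to watch the node count change.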

Module Environment

Lmod is a fairly standard tool in HPC that is used to dynamically change your environment (env vars, PATH).

  • List available modules
module av
  • Load a particular module. In this case, the first command loads Intel MPI into your environment and the second checks the path of mpirun.
module load intelmpi
which mpirun
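
The same check can be wrapped in a small script, for example to confirm that a module really put a tool on your PATH. This is only a sketch: check_tool is a hypothetical helper name, and the module name intelmpi is taken from the step above.

```shell
# Hypothetical helper: print where a command resolves, or fail with a message.
check_tool() {
  if resolved=$(command -v "$1"); then
    echo "$1 -> $resolved"
  else
    echo "$1 not found on PATH" >&2
    return 1
  fi
}

# "module" is a shell function provided by Lmod; skip the load when it is absent.
if command -v module >/dev/null 2>&1; then
  module load intelmpi
  check_tool mpirun
fi
```

On the head node, running check_tool mpirun before and after module load intelmpi should show the difference the module makes.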

NFS Shares

  • List the NFS volumes exported by the head node. A few volumes are shared by the head node and are mounted on compute instances when they boot. Both /shared and /home are accessible from all nodes.
showmount -e localhost
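
To get only the exported paths, without the header line and the client ranges, you can filter the output. The sketch below assumes the usual showmount -e layout, a header line followed by one line per export that starts with the path; list_exports is a hypothetical helper name.

```shell
# Hypothetical helper: print only the exported paths from "showmount -e" output on stdin.
# The header line ("Export list for ...") does not start with "/", so it is skipped.
list_exports() {
  awk '/^\// { print $1 }'
}

# Only query the NFS server when showmount is available on this machine.
if command -v showmount >/dev/null 2>&1; then
  showmount -e localhost | list_exports
fi
```

On the head node you would expect /shared and /home to appear in this list.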

Next, you can run your first job!