In my first two tutorials, I set up a five-node Raspberry Pi cluster, installed Slurm for scheduling jobs, and configured users, storage, and administrative tools. In this post, I’m going to walk through running small sample programs on the cluster and point to the getting-started guides I referenced when learning parallel programming.

There are two ways to run jobs on a Slurm-managed cluster. The sbatch command submits jobs to the queue, where they run in priority order once resources are available. The srun command runs jobs and programs interactively, and can be combined with the salloc command to first reserve an allocation of resources (e.g. nodes, memory). Let’s try both.

Check Node Status

If this part fails, try rebooting all nodes and reviewing the notes from the first two tutorials.

  1. SSH to the master node.
  2. Run sinfo to see your partition information.
  3. Run a job on four nodes to make them print their hostname:
    1. srun --nodes=4 hostname
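
If sinfo lists many nodes, it can help to filter the output down to the ones that need attention. The following is a minimal sketch rather than anything from the original setup: the function name unhealthy_nodes is made up for illustration, and it assumes the two-column "node state" layout produced by sinfo -N -h -o '%N %T'.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm): reads "node state" pairs on stdin
# and prints every node whose state is not exactly idle, allocated, mixed,
# or completing. A trailing "*" (node not responding) is therefore reported.
unhealthy_nodes() {
    awk '$2 !~ /^(idle|allocated|mixed|completing)$/ { print $1 " " $2 }'
}

# Only query the scheduler when sinfo is actually installed.
if command -v sinfo >/dev/null 2>&1; then
    # -N: one line per node, -h: no header, %N: node name, %T: node state
    sinfo -N -h -o '%N %T' | unhealthy_nodes
fi
```

No output means every node reported one of the healthy states. Anything printed is a candidate for the reboot suggested above.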

Download Sample Programs and Compile Code

  1. cd /scratch
  2. git clone
  3. cd mpitutorial/tutorials/mpi-hello-world/code
  4. make

Run Hello World Program Interactively

  1. salloc -N 4 #this requests a 4 node allocation
  2. mpiexec -n 4 mpi_hello_world
  3. exit #exits node allocation

Create Slurm Submission Scripts

  1. nano
    1. #!/bin/bash
       #SBATCH --job-name=test_mpi1
       #SBATCH --time=1:00
       #SBATCH -N 4
       #SBATCH --ntasks-per-node=4
       mpiexec -n 16 mpi_hello_world
  2. chmod 700
  3. nano
    1. #!/bin/bash
       #SBATCH --job-name=test_mpi1
       #SBATCH --time=1:00
       #SBATCH -N 2
       #SBATCH --ntasks-per-node=4
       mpiexec -n 8 mpi_hello_world
  4. chmod 700
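
The two scripts above differ only in the node count and in the rank total passed to mpiexec, which is nodes times tasks per node (4 x 4 = 16 and 2 x 4 = 8). As a rough sketch of that relationship, the hypothetical function below (make_mpi_script is an invented name, not a Slurm tool) prints a submission script for any combination and derives the rank count itself so the two numbers cannot drift apart.

```shell
#!/bin/bash
# Hypothetical generator: prints a Slurm batch script for the hello-world
# binary. Arguments: job name, node count, tasks per node.
make_mpi_script() {
    local name="$1" nodes="$2" per_node="$3"
    local ranks=$(( nodes * per_node ))   # total MPI ranks in the allocation
    cat <<EOF
#!/bin/bash
#SBATCH --job-name=${name}
#SBATCH --time=1:00
#SBATCH -N ${nodes}
#SBATCH --ntasks-per-node=${per_node}
mpiexec -n ${ranks} mpi_hello_world
EOF
}

# Print the four-node variant to the terminal.
make_mpi_script test_mpi1 4 4
```

Redirecting the output to a file of your choice gives a script that can be made executable and submitted like the hand-written ones.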

Submit Batch Jobs

  1. sbatch
  2. sbatch
  3. sbatch
  4. sbatch
  5. sbatch
  6. squeue #check the queue, although the jobs will likely finish too quickly to show up in it
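
Because the jobs may finish before squeue can show them, it can be handy to record each job ID at submission time. This is a sketch under two assumptions: that sbatch prints its usual "Submitted batch job <id>" confirmation line, and that the default slurm-<id>.out output name is in use. The function names job_id_from and submit_all are invented for this example.

```shell
#!/bin/bash
# Hypothetical helper: extracts the numeric ID from sbatch's confirmation
# line, which normally reads "Submitted batch job <id>".
job_id_from() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*$/\1/p'
}

# Hypothetical helper: submits every script given as an argument and prints
# which output file each one is expected to write.
submit_all() {
    local script id
    for script in "$@"; do
        id="$(sbatch "$script" | job_id_from)"
        echo "$script -> slurm-${id}.out"
    done
}
```

Calling submit_all with the paths of your submission scripts as arguments prints one line per job, which makes it easier to match output files to scripts afterwards.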

View Output

  1. ls
    1. slurm-35.out slurm-36.out slurm-37.out slurm-38.out slurm-39.out slurm-40.out
  2. cat slurm-XX.out
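
With several output files, reading them one cat at a time gets tedious. The sketch below prints all of them in one pass with a header per file. The function name show_outputs is made up for illustration, and it assumes the default slurm-<id>.out naming seen in the listing above.

```shell
#!/bin/bash
# Print every slurm-*.out file in a directory (default: the current one),
# each preceded by a header line naming the file.
show_outputs() {
    local dir="${1:-.}" f
    for f in "$dir"/slurm-*.out; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        printf '==> %s <==\n' "$(basename "$f")"
        cat "$f"
    done
}

# Show the output files in the current directory.
show_outputs
```

Files are listed in the shell's glob order, which is alphabetical rather than numeric, so slurm-100.out would appear before slurm-35.out.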