Running jobs on an interactive shell
Sometimes users may want to run a program on SCIAMA and interact with it or with its graphical user interface (e.g. Mathematica). For that you can create an interactive shell on a compute node:
sinteractive [<options>]
e.g.
sinteractive -n2 -c4 -t 1:00:00
which allocates 2 tasks with 4 cores each to run the interactive job (sinteractive takes the same arguments for resource allocation as e.g. salloc and srun).
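As a rough sketch of a typical session (the module versions are reused from the job script examples below and the script name my_analysis.py is just a placeholder), you might request the resources, set up your environment and then run the program directly:
# request 1 task with 4 cores, 16 GB of RAM and a 2-hour wall clock limit
sinteractive -n1 -c4 --mem=16G -t 2:00:00
# once the shell opens on the compute node, load your modules and run the program
module purge
module load system/intel64
module load python/3.7.1
python my_analysis.py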
Submit/run a batch job to/on SCIAMA
In most cases, computation-heavy codes do not require any interaction with the user. They simply run on their own until the results are calculated and written back to disk.
To submit such a batch job, use the command sbatch followed by the name of your job script:
sbatch <path/to/jobscript>
To check on the status of a job, use the command squeue. The -u option lets you see only the jobs you have submitted:
squeue -u <username>
To cancel a job, use scancel with the job number (listed in the output of squeue):
scancel <jobnumber>
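Putting these together, a typical sequence might look like the following (the job script name, username and job number are placeholders):
# submit the job script; SLURM replies with the assigned job number
sbatch ~/jobs/my_job.slurm
# list only your own jobs (R = running, PD = pending)
squeue -u jbloggs
# cancel the job using the number reported by sbatch/squeue
scancel 123456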
Example jobscripts
This is an example job script for a simple serial job that could be submitted on SCIAMA using the sbatch command:
#!/bin/bash
# This is an example job script for running a serial program
# these lines are comments
# SLURM directives are shown below
# Configure the resources needed to run my job, e.g.
# job name (default: name of script file)
#SBATCH --job-name=my_job
# resource limits: cores, max. wall clock time during which job can be running
# and maximum memory (RAM) the job needs during run time:
#SBATCH --ntasks=1
#SBATCH --time=1:30:00
#SBATCH --mem=8G
# define log files for output on stdout and stderr
#SBATCH --output=some_output_logfile
#SBATCH --error=some_error_logfile
# choose system/queue for job submission (default: sciama2.q)
# for more information on queues, see related articles
#SBATCH --partition=training.q
# set up the software environment to run your job in
# first remove all pre-loaded modules to avoid any conflicts
module purge
# load your system module e.g.
module load system/intel64
# now load all modules (e.g. libraries or applications) that are needed
# to run the job, e.g.
module load intel_comp/2019.2
module load python/3.7.1
# now execute all commands/programs that you want to run sequentially, e.g.
cd ~/some_dir
srun python do_something.py
srun ./myapplication -i ~/data/some_inputfile
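Assuming the script above is saved as my_serial_job.slurm (the file name is just an example), it would be submitted with:
sbatch ~/some_dir/my_serial_job.slurm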
Lines starting with "#" are treated as comments, unless they begin with the queuing system directive "#SBATCH". And here is an example job script for running a parallel code (MPI and/or multi-threaded) on more than one core on SCIAMA:
#!/bin/bash
# This is an example job script for running a parallel program
#SBATCH --job-name=my_parallel_job
# define number of nodes/cores needed to run job
# here: 65 nodes with 12 processes per node (=780 MPI processes in total)
#SBATCH --nodes=65
#SBATCH --ntasks-per-node=12
#SBATCH --time=1:30:00
#SBATCH --output=some_output_logfile
#SBATCH --error=some_error_logfile
#SBATCH --partition=training.q
module purge
# don't forget to load the system module & compiler/MPI library you compiled your code with, e.g. OpenMPI
module load system/intel64
module load intel_comp/2016.2
module load openmpi/4.0.1
# now execute all commands/programs that you want to run sequentially
# on each allocated processor; for MPI applications add '--mpi=pmi2' (for OpenMPI)
cd ~/some_dir
srun --mpi=pmi2 ./my_parallel_application -i ~/data/some_inputfile
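For a purely multi-threaded (e.g. OpenMP) code that stays on a single node, you would request one task with several cores per task instead of many tasks. The following sketch is only an illustration: the executable name is hypothetical and the module versions are reused from the examples above:
#!/bin/bash
# Example job script for a multi-threaded (OpenMP) program on a single node
#SBATCH --job-name=my_threaded_job
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12
#SBATCH --time=1:30:00
#SBATCH --mem=8G
#SBATCH --output=some_output_logfile
#SBATCH --error=some_error_logfile
#SBATCH --partition=training.q
module purge
module load system/intel64
module load intel_comp/2019.2
# match the number of OpenMP threads to the allocated cores
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
cd ~/some_dir
srun ./my_threaded_application -i ~/data/some_inputfile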