gromacs.qsub – utilities for batch submission systems
The module helps with writing submission scripts for various batch submission
queuing systems. The known ones are stored as QueuingSystem instances in
queuing_systems; append new ones to this list.
The working paradigm is that template scripts are provided (see
gromacs.config.templates) and only a few place holders are substituted
(using gromacs.cbook.edit_txt()).
User-supplied template scripts can be stored in gromacs.config.qscriptdir
(by default ~/.gromacswrapper/qscripts) and they will be picked up before
the package-supplied ones.
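The lookup order described above (user directory first, package directory second) can be sketched as follows. This is only an illustration of the idea, not GromacsWrapper's actual lookup code; the function name `find_template` and the way the search directories are passed in are assumptions made for this example.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_template(name: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first file called `name` found in `search_dirs`, or None.

    Hypothetical helper: directories are tried in the order given, so
    listing the user directory first lets user templates shadow the
    package-supplied ones.
    """
    for directory in search_dirs:
        candidate = Path(directory).expanduser() / name
        if candidate.is_file():
            return candidate
    return None
```

A call such as `find_template("local.sh", [Path("~/.gromacswrapper/qscripts"), package_dir])` would then return the user's copy if one exists and fall back to the packaged one otherwise (`package_dir` is a placeholder for wherever the packaged templates live).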
The Manager handles setup and control of jobs in a queuing system on a
remote system via ssh.
At the moment, some of the functions in gromacs.setup use this module,
but it is fairly independent and could conceivably be used for a wider range of
projects.
Queuing system templates
The queuing system scripts are highly specific and you will need to add
your own. Templates should be shell scripts. Some parts of the
templates are modified by the generate_submit_scripts() function. The “place
holders” that can be replaced are shown in the table below. Typically,
the place holders are either shell variable assignments or batch
submission system commands. The table shows SGE commands but PBS and
LoadLeveler have similar constructs (e.g. PBS commands start with #PBS
and LoadLeveler uses #@ with its own command keywords).
place holder | default | replacement | description | regex |
---|---|---|---|---|
#$ -N | GMX_MD | sgename | job name | /^#.*(-N|job_name)/ |
#$ -l walltime= | 00:20:00 | walltime | max run time | /^#.*(-l walltime|wall_clock_limit)/ |
#$ -A | BUDGET | budget | account | /^#.*(-A|account_no)/ |
DEFFNM= | md | deffnm | default gmx name | /^ *DEFFNM=/ |
STARTDIR= | . | startdir | remote jobdir | /^ *STARTDIR=/ |
WALL_HOURS= | 0.33 | walltime h | mdrun’s -maxh | /^ *WALL_HOURS=/ |
NPME= | | npme | PME nodes | /^ *NPME=/ |
MDRUN_OPTS= | “” | mdrun_opts | more options | /^ *MDRUN_OPTS=/ |
Lines with place holders should not have any white space at the
beginning. The regular expression pattern (“regex”) is used to find
the lines for the replacement, and the literal default values
(“default”) are replaced. (Exception: any value that follows an equals
sign “=” is replaced regardless of the default value in the table,
except for MDRUN_OPTS, where only “” will be replaced.) Not all
place holders have to occur in a template; for instance, if a queue
has no run time limitation then one would probably not include the
walltime and WALL_HOURS place holders.
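The replacement rules in the table can be illustrated with a small standalone sketch that uses only Python's `re` module. It does not call gromacs.cbook.edit_txt() and makes no claim about how that function is implemented; the helper `substitute` and the replacement values are invented for this example.

```python
import re


def substitute(template, rules):
    """Apply (line_regex, search_regex, replacement) rules line by line.

    A rule only touches lines matched by `line_regex`; within such a line
    every match of `search_regex` is replaced by `replacement`.
    """
    out = []
    for line in template.splitlines():
        for line_rx, search_rx, repl in rules:
            if re.search(line_rx, line):
                # A function is used as replacement so that backslashes
                # in `repl` are inserted literally.
                line = re.sub(search_rx, lambda m, r=repl: r, line)
        out.append(line)
    return "\n".join(out)


template = """#PBS -N GMX_MD
#PBS -l walltime=00:20:00
WALL_HOURS=0.33
DEFFNM=md
MDRUN_OPTS=""
"""

rules = [
    # directives: only the literal default value is replaced
    (r"^#.*(-N|job_name)", r"GMX_MD", "MD_run1"),
    (r"^#.*(-l walltime|wall_clock_limit)", r"00:20:00", "01:00:00"),
    # shell assignments: everything after '=' is replaced
    (r"^ *WALL_HOURS=", r"=.*", "=1.0"),
    (r"^ *DEFFNM=", r"=.*", "=md_prod"),
    # MDRUN_OPTS: only the empty string "" is replaced
    (r"^ *MDRUN_OPTS=", r'""', '"-v -dlb yes"'),
]

result = substitute(template, rules)
print(result)
```

Keeping the line-selecting pattern separate from the text that is actually replaced is what allows a commented line such as `## MDRUN_OPTS="-npme 24 $MDRUN_OPTS"` in a template to stay untouched.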
The line # JOB_ARRAY_PLACEHOLDER can be replaced by generate_submit_array()
to produce a “job array” (also known as a “task array”) script that runs a
large number of related simulations under the control of a single queuing
system job. The individual array tasks are run from different
subdirectories. Only queuing system scripts that use the bash shell are
supported for job arrays at the moment.
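To make the job array idea concrete, the sketch below builds a bash fragment that maps the task ID exported by the queuing system to a working directory and changes into it. This is a guess at the general shape of such a fragment, not the text that generate_submit_array() actually splices in; the function name `array_snippet` and the bash variable names are hypothetical.

```python
import shlex


def array_snippet(directories, array_variable="SGE_TASK_ID"):
    """Return a bash fragment that enters the directory for the current task.

    Assumption: task IDs start at 1 and `array_variable` is the environment
    variable that the queuing system exports for array jobs.
    """
    lines = ["declare -a jobdirs"]
    for task_id, directory in enumerate(directories, start=1):
        # shlex.quote protects directory names that contain spaces etc.
        lines.append("jobdirs[%d]=%s" % (task_id, shlex.quote(directory)))
    lines.append('wdir="${jobdirs[${' + array_variable + '}]}"')
    lines.append('cd "$wdir" || { echo "ERROR: cannot enter $wdir"; exit 1; }')
    return "\n".join(lines)


print(array_snippet(["run_300K", "run_310K", "run_320K"]))
```

Because the fragment relies on bash arrays, it would not work in a csh or plain sh template, which matches the restriction stated above.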
A queuing system script must have the appropriate suffix to be properly recognized, as shown in the table below.
Queuing system | suffix | notes |
---|---|---|
Sun Gridengine | .sge | Sun’s Sun Gridengine |
Portable Batch queuing system | .pbs | OpenPBS and PBS Pro |
LoadLeveler | .ll | IBM’s LoadLeveler |
bash script | .bash, .sh | Advanced bash scripting |
csh script | .csh | avoid csh |
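The suffix table above amounts to a simple lookup from file extension to queuing system. The following standalone sketch shows that idea; the dictionary and the function `guess_queuing_system` are made up for illustration and are not part of gromacs.qsub.

```python
import os

# Suffix-to-system mapping copied from the table above.
SUFFIX_TO_SYSTEM = {
    ".sge": "Sun Gridengine",
    ".pbs": "Portable Batch queuing system",
    ".ll": "LoadLeveler",
    ".bash": "bash script",
    ".sh": "bash script",
    ".csh": "csh script",
}


def guess_queuing_system(scriptname):
    """Return the system name for `scriptname` based on its suffix, or None."""
    suffix = os.path.splitext(scriptname)[1]
    return SUFFIX_TO_SYSTEM.get(suffix)


print(guess_queuing_system("supercomputer.somewhere.fr_64core.pbs"))
```

Only the part after the last dot counts as the suffix, so host names with dots in the file name do not interfere with detection.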
Example queuing system script template for PBS
The following script is a usable PBS script for a super computer. It contains almost all of the replacement tokens listed in the table (indicated by ++++++).
#!/bin/bash
# File name: ~/.gromacswrapper/qscripts/supercomputer.somewhere.fr_64core.pbs
#PBS -N GMX_MD
# ++++++
#PBS -j oe
#PBS -l select=8:ncpus=8:mpiprocs=8
#PBS -l walltime=00:20:00
# ++++++++
# host: supercomputer.somewhere.fr
# queuing system: PBS
# set this to the same value as walltime; mdrun will stop cleanly
# at 0.99 * WALL_HOURS
WALL_HOURS=0.33
# ++++
# deffnm line is possibly modified by gromacs.setup
# (leave it as it is in the template)
DEFFNM=md
# ++
TPR=${DEFFNM}.tpr
OUTPUT=${DEFFNM}.out
PDB=${DEFFNM}.pdb
MDRUN_OPTS=""
# ++
# If you always want to add additional MDRUN options in this script then
# you can either do this directly in the mdrun commandline below or by
# constructs such as the following:
## MDRUN_OPTS="-npme 24 $MDRUN_OPTS"
# JOB_ARRAY_PLACEHOLDER
#++++++++++++++++++++++ leave the full commented line intact!
# avoids some failures
export MPI_GROUP_MAX=1024
# use hard coded path for time being
GMXBIN="/opt/software/SGI/gromacs/4.0.3/bin"
MPIRUN=/usr/pbs/bin/mpiexec
APPLICATION=$GMXBIN/mdrun_mpi
$MPIRUN $APPLICATION -stepout 1000 -deffnm ${DEFFNM} -s ${TPR} -c ${PDB} -cpi $MDRUN_OPTS -maxh ${WALL_HOURS} > $OUTPUT
rc=$?
# dependent jobs will only start if rc == 0
exit $rc
Save the above script in ~/.gromacswrapper/qscripts under the name
supercomputer.somewhere.fr_64core.pbs. This will make the script
immediately usable. For example, in order to set up a production MD run with
gromacs.setup.MD() for this super computer one would use
gromacs.setup.MD(..., qscripts=['supercomputer.somewhere.fr_64core.pbs', 'local.sh'])
This will generate submission scripts based on
supercomputer.somewhere.fr_64core.pbs and also the default local.sh
that is provided with GromacsWrapper.
In order to modify MDRUN_OPTS one would use the additional mdrun_opts
argument, for instance:
gromacs.setup.MD(..., qscripts=['supercomputer.somewhere.fr_64core.pbs', 'local.sh'],
                 mdrun_opts="-v -npme 20 -dlb yes -nosum")
Currently there is no good way to specify the number of processors when creating run scripts. You will need to provide scripts with different numbers of cores hard coded or set them when submitting the scripts with command line options to qsub.
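One way to pass the core count at submission time is to assemble the qsub command line in Python before handing it to a shell or to `subprocess`. The sketch below only builds the argument list and does not submit anything. The helper `qsub_command` is hypothetical, the `-l select=...` syntax is taken from the PBS example script above, and whether command line options take precedence over the directives inside the script should be checked against the local queuing system's documentation.

```python
def qsub_command(script, resource_list=None, extra_options=()):
    """Build (but do not run) an argument list for submitting `script`.

    `resource_list` is passed through unchanged after `-l`; its syntax
    depends on the queuing system and site configuration.
    """
    cmd = ["qsub"]
    if resource_list:
        cmd += ["-l", resource_list]
    cmd += list(extra_options)
    cmd.append(script)
    return cmd


cmd = qsub_command(
    "supercomputer.somewhere.fr_64core.pbs",
    resource_list="select=16:ncpus=8:mpiprocs=8",
)
print(" ".join(cmd))
```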
Classes and functions
class gromacs.qsub.QueuingSystem(name, suffix, qsub_prefix, array_variable=None, array_option=None)
Class that represents minimum information about a batch submission system.
Define a queuing system’s functionality.
Arguments:
- name
name of the queuing system, e.g. ‘Sun Gridengine’
- suffix
suffix of input files, e.g. ‘sge’
- qsub_prefix
prefix string that starts a qsub flag in a script, e.g. ‘#$’
Keywords:
- array_variable
environment variable exported for array jobs, e.g. ‘SGE_TASK_ID’
- array_option
qsub option format string to launch an array (e.g. ‘-t %d-%d’)
- array(directories)
Return multiline string for simple array jobs over directories.
Warning
The string is in bash and hence the template must also be bash
(and not csh or sh).
- array_flag(directories)
Return string to embed the array launching option in the script.
- flag(*args)
Return string for qsub flag args prefixed with the appropriate in-script prefix.
- has_arrays()
True if it is known how to do job arrays.
- isMine(scriptname)
Primitive queuing system detection; only looks at the suffix at the moment.
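The documented constructor arguments and methods can be pictured with a small stand-in class. It is deliberately named `SketchQueuingSystem` to make clear that it is not the real gromacs.qsub.QueuingSystem; the output formats of `flag` and `array_flag`, and the assumption that tasks are numbered from 1, are guesses based only on the descriptions and examples above.

```python
import os


class SketchQueuingSystem:
    """Stand-in mirroring the documented arguments (not the real class)."""

    def __init__(self, name, suffix, qsub_prefix,
                 array_variable=None, array_option=None):
        self.name = name
        self.suffix = suffix
        self.qsub_prefix = qsub_prefix
        self.array_variable = array_variable
        self.array_option = array_option

    def flag(self, *args):
        # Assumed format: prefix followed by the space-separated arguments.
        return " ".join((self.qsub_prefix,) + tuple(str(a) for a in args))

    def has_arrays(self):
        return self.array_variable is not None

    def array_flag(self, directories):
        if not self.has_arrays():
            raise ValueError("no job array support known for " + self.name)
        # Assumed: one task per directory, numbered 1..N.
        return self.flag(self.array_option % (1, len(directories)))

    def is_mine(self, scriptname):
        # Suffix-only detection, as described for isMine().
        return os.path.splitext(scriptname)[1] == "." + self.suffix


sge = SketchQueuingSystem("Sun Gridengine", "sge", "#$",
                          array_variable="SGE_TASK_ID",
                          array_option="-t %d-%d")
print(sge.flag("-N", "GMX_MD"))
print(sge.array_flag(["run1", "run2", "run3"]))
```

With the real module, a new system would be described by an instance constructed with the same five arguments and appended to queuing_systems.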
gromacs.qsub.generate_submit_scripts(templates, prefix=None, deffnm='md', jobname='MD', budget=None, mdrun_opts=None, walltime=1.0, jobarray_string=None, startdir=None, npme=None, **kwargs)
Write scripts for queuing systems.
This sets up queuing system run scripts with a simple search and replace in templates. See gromacs.cbook.edit_txt() for details. Shell scripts are made executable.
Arguments:
- templates
Template file or list of template files. The “files” can also be names or symbolic names for templates in the templates directory. See gromacs.config for details and rules for writing templates.
- prefix
Prefix for the final run script filename; by default the filename will be the same as the template. [None]
- dirname
Directory in which to place the submit scripts. [.]
- deffnm
Default filename prefix for mdrun -deffnm [md]
- jobname
Name of the job in the queuing system. [MD]
- budget
Which budget to book the runtime on [None]
- startdir
Explicit path on the remote system (for run scripts that need to cd into this directory at the beginning of execution) [None]
- mdrun_opts
String of additional options for mdrun.
- walltime
Maximum runtime of the job in hours. [1]
- npme
number of PME nodes
- jobarray_string
Multi-line string that is spliced in for job array functionality (see gromacs.qsub.generate_submit_array(); do not use manually)
- kwargs
all other kwargs are ignored
Returns: list of generated run scripts
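Since walltime is given in hours while the batch directive in the template uses the HH:MM:SS form, some conversion between the two is needed. The following sketch shows one way to do that arithmetic; the function `hours_to_walltime` is hypothetical and is not claimed to be the conversion used inside generate_submit_scripts().

```python
def hours_to_walltime(hours):
    """Convert a run time in hours to an HH:MM:SS string (rounded to seconds)."""
    total_seconds = int(round(hours * 3600))
    h, remainder = divmod(total_seconds, 3600)
    m, s = divmod(remainder, 60)
    return "%02d:%02d:%02d" % (h, m, s)


for hours in (0.33, 1.0, 1.5, 24.0):
    print(hours, "->", hours_to_walltime(hours))
```

Note that 0.33 hours is 19 minutes and 48 seconds, slightly less than the 00:20:00 default shown in the place holder table, so mdrun's -maxh limit stays just below the queue limit.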
gromacs.qsub.generate_submit_array(templates, directories, **kwargs)
Generate an array job.
For each work_dir in directories, the array job will
- cd into work_dir
- run the job as detailed in the template
It will use all the queuing system directives found in the template. If more complicated setups are required, then this function cannot be used.
Arguments:
- templates
Basic template for a single job; the job array logic is spliced into the position of the line
# JOB_ARRAY_PLACEHOLDER
The appropriate commands for common queuing systems (Sun Gridengine, PBS) are hard coded here. The queuing system is detected from the suffix of the template.
- directories
List of directories under dirname. One task is set up for each directory.
- dirname
The array script will be placed in this directory. The directories must be located under dirname.
- kwargs
See gromacs.qsub.generate_submit_scripts() for details.
gromacs.qsub.detect_queuing_system(scriptfile)
Return the queuing system for which scriptfile was written.
gromacs.qsub.queuing_systems = [<Sun Gridengine QueuingSystem instance>, <PBS QueuingSystem instance>, <LoadLeveler QueuingSystem instance>]
Pre-defined queuing systems (SGE, PBS, LoadLeveler). Add your own here.
See also
gromacs.manager for classes to manage jobs remotely.