9. Running MAST for real¶
9.1. General notes¶
Depending on your cluster, you might find it necessary to nice your processes:
nice -n 19 mast -i input.inp
nice -n 19 mast
Nice-ing allows the headnode to put its regular functions before the MAST processes. MAST should start running within several seconds.
9.2. Inputting an input file¶
To parse an input file, use
mast -i input.inp
or
mast -i /full/path/to/input/file/myinput.inp
If your input file specifies any POSCAR or CIF files:
- Those files must be in the same path as the original input file.
- Those files may not be moved until the recipe is complete.
The input file will be parsed and a recipe directory should be created inside the $MAST_SCRATCH directory, with the appropriate ingredient subdirectories.
Look at the input.inp, archive_input_options.txt, and archive_recipe_plan.txt files in the recipe directory to see if the setup agrees with what you think it should be.
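As a quick way to confirm that parsing produced the expected setup files, a small helper along these lines may be useful. This is only a sketch: the function name check_recipe_setup is hypothetical and not part of MAST, and the recipe directory path is whatever MAST created under $MAST_SCRATCH.

```shell
# Hypothetical helper (not part of MAST): report which of the three setup
# files named above exist in a recipe directory.
# Returns 0 if all are present, 1 if any is missing.
check_recipe_setup() {
    recipe_dir="$1"
    missing=0
    for f in input.inp archive_input_options.txt archive_recipe_plan.txt; do
        if [ -f "$recipe_dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

It would be called as check_recipe_setup "$MAST_SCRATCH/<recipe directory>", after which the listed files can be opened and read by hand.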
9.3. Running MAST¶
Running MAST is separate from inputting input files. Use this command:
mast
This command will do two things:
- Submit all ingredient runs listed in the $MAST_CONTROL/submitlist list to the queue.
  - The submission command (sbatch, qsub, etc.) is based on the platform chosen when you set $MAST_PLATFORM. See Installation.
  - The exact commands can be found in your MAST installation path under submit/platforms/<platform name>/queue_commands.py.
  - Individual ingredients’ submission scripts are created automatically through a combination of The Ingredients section in the input file and the template submission script for your platform.
    - The template submission script is found in your MAST installation path under submit/platforms/<platform name>/submit_template.sh.
- Spawn a MAST monitor, or mastmon, process on the queue.
  - The mastmon_submit.sh and runmast.py files are responsible for submitting this process. They are originally located in the submit/platforms/<platform name> and submit folders of your MAST installation path, respectively, and are copied into $MAST_CONTROL when you first run mast.
  - The script should be set up to use the shortest, fastest turnover queue available (e.g. a serial queue with a maximum walltime of 4 hours, or morganshort on bardeen).
  - You may make changes directly in $MAST_CONTROL/mastmon_submit.sh.
The mastmon process will generate additional entries on $MAST_CONTROL/submitlist, but these entries will not be submitted to the queue until MAST is called again.
9.3.1. The MAST monitor¶
The MAST monitor, or mastmon, process goes through the $MAST_SCRATCH directory.
- It looks at the recipe directories under $MAST_SCRATCH.
- For each recipe directory, the MAST monitor builds a recipe plan object from information in the recipe directory, using a combination of the input.inp and status.txt files in the recipe directory.
- MAST then uses the recipe plan object to assess the next steps appropriate for the recipe, creating objects for the separate Ingredients and evaluating them.
9.3.2. Troubleshooting in a recipe directory¶
For human troubleshooting of a recipe, the archive_recipe_plan.txt file gives information about which ingredients are parents/children of which other ingredients, and which method each parent should use to update each of its child ingredients.
The status.txt file gives the status of each ingredient.
Ingredient statuses are:
- I = initialized: The ingredient has just been created from inputting the input file, but nothing has been run.
- W = waiting: The ingredient is waiting for parents to complete before it can be staged.
- S = staged: All parents have updated this child, but the ingredient is not yet ready to run.
- P = proceed: The ingredient has written its input files, all parents have updated it, and its run method has been called. The run method usually adds the ingredient to the list at $MAST_CONTROL/submitlist, to be submitted to the queue the next time mast is called. There is no MAST status change between an ingredient proceeding to the submitlist and being submitted to the queue off of the submitlist. However, $MAST_CONTROL/submitted can be used to see which ingredients were just submitted to the queue.
- C = complete: The ingredient is complete.
- E = error: The ingredient has errored out, and mast_auto_correct was set to False in the input file (the default is True).
- skip = skip: You can set ingredients to skip in the status.txt file by manually editing the file.
The MAST monitor checks the status of all ingredients whose status is not yet complete. The MAST monitor updates each ingredient status in the recipe plan.
Each non-complete ingredient is checked to see if it is complete. (This is a redundant fast-forward check, since sometimes it is useful to copy over previously completed runs into a MAST ingredient directory.)
If complete, the ingredient updates its children and is changed to Complete.
For each Initialized ingredient:
- If the ingredient has any parents, it is given status Waiting
- Otherwise, it is given status Staged
For each Proceed-to-run ingredient:
- If the ingredient is now complete, it updates its children and is changed to Complete
For each Waiting ingredient:
- If all parents are now marked complete, the ingredient is changed to Staged
For each Staged ingredient:
- If the ingredient is not already ready to run, its write method is called for it to write its input files.
- The ingredient’s run method is called, which usually adds its folder to $MAST_CONTROL/submitlist, except in the case of special run methods like run_defect (to induce a defect).
- The ingredient’s status is changed to Proceed.
When all ingredients in a recipe are complete, the entire recipe folder is moved from $MAST_SCRATCH to $MAST_ARCHIVE.
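The status transitions described in this section can be summarized as a small state function. The sketch below is an illustration of the rules as written here, not MAST's actual code: the function name next_status and its yes/no arguments are hypothetical, it advances an ingredient by one step only (a real monitor pass may move an ingredient through several statuses), and it leaves E and skip untouched because this section does not describe transitions out of them.

```shell
# Illustration only (not MAST's implementation).
# Arguments: current status, has_parents, parents_complete, run_complete
# (the last three are "yes" or "no"). Prints the next status.
next_status() {
    status="$1"; has_parents="$2"; parents_complete="$3"; run_complete="$4"

    # Fast-forward check: a not-yet-complete ingredient whose run is
    # already finished becomes Complete.
    case "$status" in
        I|W|S|P)
            if [ "$run_complete" = yes ]; then echo C; return; fi ;;
    esac

    case "$status" in
        # Initialized: wait for parents if there are any, otherwise stage.
        I) if [ "$has_parents" = yes ]; then echo W; else echo S; fi ;;
        # Waiting: stage once all parents are complete.
        W) if [ "$parents_complete" = yes ]; then echo S; else echo W; fi ;;
        # Staged: input files are written, the run method is called,
        # and the status becomes Proceed.
        S) echo P ;;
        # P (still running), C, E, and skip are left unchanged here.
        *) echo "$status" ;;
    esac
}
```

For example, next_status I yes no no prints W, and a later call next_status W yes yes no prints S.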
9.3.2.1. Errors in a recipe directory¶
Errors in a recipe which cause the recipe to fail out completely are logged to a MAST_ERROR file.
These errors will need to be addressed manually. Until then, MAST will skip over the recipe directory and log a warning to the mast.log file.
Once the error has been addressed, delete the MAST_ERROR file, and the recipe should be picked up on the next mast command.
To get more information about why the error may have been generated, set the MAST_DEBUG environment variable, e.g. export MAST_DEBUG=1, delete the MAST_ERROR file, and rerun MAST.
The error should be re-logged, and the $MAST_CONTROL/mast.log file will now also contain DEBUG-level information.
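The debug-and-retry sequence above might be wrapped up as follows. This is a sketch under two assumptions: the MAST_ERROR file sits directly in the recipe directory, and the helper name reset_recipe_error is hypothetical (it is not a MAST command).

```shell
# Hypothetical helper (not part of MAST): show and remove the MAST_ERROR
# marker in a recipe directory, then enable DEBUG-level logging for any
# mast command run later from this shell.
reset_recipe_error() {
    recipe_dir="$1"
    if [ -f "$recipe_dir/MAST_ERROR" ]; then
        echo "--- contents of $recipe_dir/MAST_ERROR ---"
        cat "$recipe_dir/MAST_ERROR"
        rm "$recipe_dir/MAST_ERROR"
    else
        echo "no MAST_ERROR file in $recipe_dir"
    fi
    export MAST_DEBUG=1
}

# Afterwards, rerun MAST by hand and read the log:
#   mast
#   less "$MAST_CONTROL/mast.log"
```

Printing the old error before deleting it keeps a record on screen to compare against the re-logged error.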
9.3.3. The CONTROL folder¶
The $MAST_CONTROL folder houses several files:
- errormast: Contains any queue errors from running the MAST monitor on the queue
- mastoutput: Contains all queue output from running the MAST monitor on the queue, including a printout of the ingredient statuses for all recipes in the $MAST_SCRATCH directory
- submitlist: The list of all ingredient folders to be submitted to the queue
- submitted: A list of all ingredients submitted to the queue the last time the MAST monitor ran
- mast.log and archive.<timestamp>.log: Contain MAST runtime information. The default setting is INFO level. To also see DEBUG-level information, set the environment variable MAST_DEBUG, for example, export MAST_DEBUG=1.
Every file except submitlist can be periodically deleted to save space.
The errormast file is written when there is an error, and will need to be deleted for MAST to continue running.
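A periodic cleanup could look like the sketch below. The function name clean_mast_control is hypothetical. As a cautious design choice it deletes only the log and output files named in the list above, so submitlist and anything else in the folder (such as mastmon_submit.sh) is left alone.

```shell
# Hypothetical helper (not part of MAST): delete the log and output files
# in the CONTROL folder while leaving submitlist, and any file not named
# here, in place.
clean_mast_control() {
    control_dir="${1:-${MAST_CONTROL:-}}"
    if [ -z "$control_dir" ]; then
        echo "usage: clean_mast_control <control directory>" >&2
        return 1
    fi
    for f in errormast mastoutput submitted mast.log; do
        rm -f "$control_dir/$f"
    done
    # Timestamped archive logs, e.g. archive.<timestamp>.log
    rm -f "$control_dir"/archive.*.log
}
```

Running clean_mast_control with no argument falls back to $MAST_CONTROL.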
9.3.4. The SCRATCH folder¶
The $MAST_SCRATCH folder houses all recipe folders. It also houses a mast.write_files.lock file while the MAST monitor is running, in order to prevent several instances of MAST from running at once and simultaneously checking and writing ingredients.
- Occasionally, MAST may report that it is locked. If there is no mastmon process running or queued on the queue, you may delete the mast.write_files.lock file manually.
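Removing a stale lock can be done by hand or with a small helper such as the one sketched here. The name remove_stale_lock is hypothetical, and the function deliberately does not query the queue: how to list queued jobs depends on your platform, so confirm first that no mastmon process is running or queued.

```shell
# Hypothetical helper (not part of MAST): remove the lock file from the
# SCRATCH folder. Check the queue for a mastmon process BEFORE calling this.
remove_stale_lock() {
    scratch_dir="${1:-${MAST_SCRATCH:-}}"
    if [ -z "$scratch_dir" ]; then
        echo "usage: remove_stale_lock <scratch directory>" >&2
        return 1
    fi
    lock="$scratch_dir/mast.write_files.lock"
    if [ -e "$lock" ]; then
        rm -f "$lock"
        echo "removed $lock"
    else
        echo "no lock file in $scratch_dir"
    fi
}
```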
9.3.4.1. Skipping recipes or ingredients in the SCRATCH folder¶
If a certain recipe has some sort of flaw, or if you want to stop tracking it halfway through, you may have MAST skip over this recipe:
- Create an empty (or not, the contents do not matter) file named MAST_SKIP in the recipe directory.
- Go through $MAST_CONTROL/submitlist and delete all ingredients associated with that recipe to keep them from being submitted during the next MAST run.
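The two steps for skipping a recipe can be sketched as one helper. The function name skip_recipe is hypothetical, and the sketch assumes that submitlist holds one ingredient folder per line, written as a path that begins with the recipe directory path exactly as you pass it in. Inspect your own submitlist before relying on this.

```shell
# Hypothetical helper (not part of MAST).
# Usage: skip_recipe <recipe directory> <control directory>
# ASSUMPTION: submitlist has one ingredient folder path per line, and each
# path starts with the recipe directory path as given here.
skip_recipe() {
    recipe_dir="${1%/}"     # drop a trailing slash, if any
    control_dir="$2"

    # Step 1: mark the recipe to be skipped (file contents do not matter).
    touch "$recipe_dir/MAST_SKIP"

    # Step 2: drop this recipe's ingredients from the submitlist.
    if [ -f "$control_dir/submitlist" ]; then
        # grep -v exits non-zero when nothing is left; that is not an error.
        grep -v -F "$recipe_dir/" "$control_dir/submitlist" \
            > "$control_dir/submitlist.tmp" || true
        mv "$control_dir/submitlist.tmp" "$control_dir/submitlist"
    fi
}
```

A call would look like skip_recipe "$MAST_SCRATCH/<recipe directory>" "$MAST_CONTROL".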
If you would like to skip certain ingredients of a single recipe, edit the recipe’s status.txt file and replace ingredients to be skipped with the status skip (use the whole word).
To un-skip these ingredients, set them back to W for waiting for parents in status.txt.
- Be careful if deleting any files for skipped ingredients.
- Do not delete the metadata.txt file.
- If deleting a file that was obtained from a parent, like a POSCAR file, also set the parent ingredient back to P when you un-skip the child ingredient.
No recipe can be considered complete by MAST if it includes skipped ingredients. However, if you consider the recipe complete, you can move the entire recipe directory out of $MAST_SCRATCH and into $MAST_ARCHIVE or another directory.
9.3.5. The ARCHIVE folder¶
When all ingredients in a recipe are complete, the entire recipe directory is moved from $MAST_SCRATCH to $MAST_ARCHIVE.
9.4. Running MAST repeatedly¶
The command mast needs to be run repeatedly in order to move the status of the recipe forward. In order to run mast automatically, use a crontab.
Important notes:
- Some clusters may not allow the use of cron. Please check the cluster policy before setting up cron.
- Be ready for a lot of notification emails. Crontab on a well-behaved system should send you an email each time it runs, giving you what would have been the output on the screen.
- Include . $HOME/.bashrc or a similar line to get your MAST environment variables and your usual path setup.
Crontab commands are as follows:
- crontab -e to edit your crontab
- crontab -l to view your crontab
- crontab -r to remove your crontab
This crontab line will run mast every hour at minute 15, and is usually suitable for everyday use:
15 * * * * . $HOME/.bashrc; nice -n 19 mast
This crontab line will run mast every 15 minutes and is ONLY suitable for short testing:
*/15 * * * * . $HOME/.bashrc; nice -n 19 mast
9.5. Modifying recipes¶
Occasionally it is convenient to add ingredients onto an existing recipe that is completed or nearly completed.
For example, it may be helpful to add an additional charge state, calculate phonons, make additional defects on a relaxed structure, or calculate additional NEBs.
The MAST “modify recipe” functionality allows new ingredient branches to be added onto an existing recipe in an existing recipe directory.
Instructions are as follows:
In the recipe directory in $MAST_SCRATCH, modify the input file as you would want it. (If the recipe directory is not in $MAST_SCRATCH, move it there.)
- For example, if the $recipe section uses the <N> <S> <Q> etc. tags, then the $defects section could add an additional begin defectname ... end subsection, or a charge designation within a defect subsection could be expanded.
Remove the $personal_recipe section of the input file. (That is, remove the $personal_recipe line, all lines in between, and the $end line.)
From within the recipe directory, run the command
mast -m modifyrecipe
These steps may be accomplished over multiple recipes using a shell script, but with caution.
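Removing the $personal_recipe section is the step most easily scripted. The sketch below uses awk to drop every line from the $personal_recipe line through the next $end line and prints the result to standard output. The function name strip_personal_recipe is hypothetical; it assumes the section markers start their lines (optionally after whitespace) and does not touch the original file.

```shell
# Hypothetical helper (not part of MAST): print an input file with its
# $personal_recipe ... $end section removed. The input file is not modified.
strip_personal_recipe() {
    awk '
        # Start skipping at the $personal_recipe line.
        /^[[:space:]]*\$personal_recipe/ { skipping = 1; next }
        # Stop skipping at the $end line that closes that section.
        skipping && /^[[:space:]]*\$end/ { skipping = 0; next }
        # Print everything outside the skipped section.
        !skipping { print }
    ' "$1"
}

# Example use from within a recipe directory (keeps a backup copy):
#   cp input.inp input.inp.bak
#   strip_personal_recipe input.inp.bak > input.inp
#   mast -m modifyrecipe
```

Writing to standard output rather than editing in place makes it easy to compare the result with the original before running mast -m modifyrecipe, which fits the caution advised above.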
9.5.1. Example¶
My charged supercell isn’t charged! What happened?
My input file had charge=2,2 in the $defects section, but it did not have the charge tag <Q> in the $recipe section.
The metadata.txt file wasn’t getting written correctly, and the checker wasn’t looking for a charge label, either.
Remove the $personal_recipe section. Redo the $recipe section to have the <Q> tags.
Run mast -m modifyrecipe
The uncharged supercell calculations were fine; move their data to folders with a <Q> tag for q=p0 (no charge).
Run mast (especially mast -m monitoronly) until the status.txt file catches up. Now mast will run a new arm of charged supercell calculations.
9.5.2. Caveats¶
If ingredient names in the $recipe section are changed, some data may need to be moved around (see the example above).
An already-complete ingredient is not necessarily rerun, depending on how its completion is evaluated. It may not get any new parent information from a newly added ingredient.
The recipe’s status.txt file is reset so that all ingredients are at status Initialized.
- Each ingredient, whether previously completed or not, gets its state re-evaluated when MAST is called (using the normal mast command).
- This procedure may require several mast calls until the recipe is caught up again.
- This procedure is necessary in order to update all parent-child relationships and to establish the correct data transfer among the existing and new ingredients.