StochPy uses the PySCeS MDL, an ASCII text based input file to describe a cellular system in terms of it’s stoichiometry, kinetics, compartments and parameters. Input files may have any filename with the single restriction that, for cross platform compatibility, they must end with the extension .psc. In this document we describe the PySCeS Model Description Language (MDL).
The basic description of a kinetic model in the PySCeS MDL contains the following information:
Although it is in principle possible to define a model without reactions or free species, for practical purposes StochPy requires a minimum of a single reaction. Once this information is obtained it can be organised and written as a StochPy input file. While this list is the minimum information required for a StochPy input file the MDL allows the definition of advanced models that contain compartments, global units, functions, rate and assignment rules.
In StochPy it is now possible to define keywords that specify model information. Keywords have the general form
<keyword>: <value>
The Modelname (optional) keyword, containing only alphanumeric characters (or _), describes the model filename (typically used when the model is exported via the PySCeS interface module) while the Description keyword is a (short) single line model description.
Modelname: rohwer_sucrose1
Description: Sucrose metabolism in sugar cane (Johann M. Rohwer)
Two keywords are available for use (optional) with models that have one or more compartments defined. Both take a boolean (True/False) as their value:
- Species_In_Conc specifies whether the species symbols used in the rate equations represent a concentration (True) or an amount (False, default).
- Output_In_Conc tells StochPy to output the results of numerical operations in concentrations (True, not supported) or in amounts (False, default).
Species_In_Conc: False
Output_In_Conc: False
StochPy supports the (optional) definition of a set of global units. In doing so we have chosen to follow the general approach used in the Systems Biology Modelling Language (SBML L2V3) specification. The general definition of a PySCeS unit is: `<UnitType>: <kind>, <multiplier>, <scale>, <exponent>` where kind is a string describing the base unit (for SBML compatibility this should be an SI unit) e.g. mole, litre, second or metre. The base unit is modified by the multiplier, scale and index using the following relationship: <multiplier> * (<kind> * 10**<scale>)**<index>. The default unit definitions are:
UnitSubstance: mole, 1, 0, 1
UnitVolume: litre, 1, 0, 1
UnitTime: second, 1, 0, 1
UnitLength: metre, 1, 0, 1
UnitArea: metre, 1, 0, 2
Please note that defining these values does not affect the numerical analysis of the model in any way.
Symbol names (i.e. reaction, species, compartment, function, rule and parameter names etc.) must start with either an underscore or letter and be followed by any combination of alpha-numeric characters or an underscore. Like all other elements of the input file names are case sensitive:
R1
_subA
par1b
ext_1
Explicit access to the “current” time in a time simulation is provided by the special symbol _TIME_. This is useful in the definition of events and rules (see chapter on advanced model construction for more details).
Comments can be placed anywhere in the input file in one of two ways, as single line comment starting with a # or as a multi-line triple quoted comment “”“<comment>”“”:
# everything after this is ignored
"""
This is a comment
spread over a
few lines.
"""
By default StochPy assumes that the model exists in a single unit volume compartment. In this case it is not necessary to define a compartment and the ODE’s therefore describe changes in concentration per time. However, if a compartment is defined, StochPy assumes that the propensities describe changes in substance amount per time. Doing this affects how the model is defined in the input file (especially with respect to the definitions of rate equations and species) and the user is strongly advised to read the Users Guide before building models in this way. The compartment definition is as follows Compartment: <name>, <size>, <dimensions>, where <name> is the unique compartment id, <size> is the size of the compartment (i.e. length, volume or area) defined by the number of <dimensions> (e.g. 1,2,3):
Compartment: Cell, 2.0, 3
Compartment: Memb, 1.0, 2
A new addition to the PySCeS MDL is the ability to define SBML styled functions. Simply put these are code substitutions that can be used in rate equation definitions to, for example, simplify the kinetic law. The general syntax for a function is Function: <name>, <args> {<formula>} where <name> is the unique function id, <arglist> is one or more comma separated function arguments. The <formula> field, enclosed in curly brackets, may only make use of arguments listed in the <arglist> and therefore cannot reference model attributes directly. If this functionality is required a forcing function (assignment rule) may be what you are looking for.
Function: rmm_num, Vf, s, p, Keq {
Vf*(s - p/Keq)
}
Function: rmm_den, s, p, Ks, Kp {
s + Ks*(1.0 + p/Kp)
}
The syntax for function definitions has been adapted from Frank Bergmann and Herbert Sauro’s “Human Readable Model Definition Language” (Draft 1).
Boundary species, also known as fixed or external species, are a special class of parameter used when modelling biological systems. The PySCeS MDL fixed species are declared on a single line as FIX: <fixedlist>. The <fixedlist> is a space separated list of symbol names which should be initialised like any other species or parameter:
FIX: Fru_ex Glc_ex ATP ADP UDP phos glycolysis Suc_vac
If no fixed species are present in the model then this declaration should be omitted entirely.
The reaction stoichiometry and rate equation are defined together as a single reaction step. Each step in the system is defined as having a name (identifier), a stoichiometry (substrates are converted to products) and rate equation (the catalytic activity, described in terms of species and parameters). All reaction definitions should be separated by an empty line. The general format of a reaction in a model with no compartments is:
<name>:
<stoichiometry>
<rate equation>
The <name> argument follows the syntax as discussed in a previous section, however, when more than one compartment has been defined it is important to locate the reaction in its specific compartment. This is done using the @ operator:
<name>@<compartment>:
<stoichiometry>
<rate equation>
Where <compartment> is a valid compartment name. In either case this then followed either directly (or on the next line) by the reaction stoichiometry.
Each <stoichiometry> argument is defined in terms of reaction substrates, appearing on the left hand side and products on the right hand side of an identifier which labels the reaction as either reversible (=) or irreversible (>). While the PySCeS MDL supports both reversible and irreversible reactions, StochPy accepts only irreversible reactions. If required each reagent’s stoichiometric coefficient (StochPy accepts both integer and floating point) should be included in curly braces {} immediately preceding the reagent name. If these are omitted a coefficient of one is assumed:
{2.0}Hex_P = Suc6P + UDP # reversible reaction (not supported within StochPy)
Fru_ex > Fru # irreversible reaction
species_5 > $pool # a reaction to a sink
The PySCeS MDL also allows the use of the $pool token that represents a placeholder reagent for reactions that have no net substrate or product. Reversibility of a reaction is only used when exporting the model to other formats (such as SBML) and in the calculation of elementary modes. It does not affect the numerical evaluation of the rate equations in any way.
Central to any reaction definition is the <rate equation> (SBML kinetic law). This should be written as valid Python expression and may fall across more than one line. Standard Python operators + - * / ** are supported (note the Python power e.g. 2^4 is written as 2**4). There is no shorthand for multiplication with a bracket so -2(a+b)^h would be written as -2*(a+b)**h} and normal operator precedence applies:
+, - addition, subtraction *, / multiplication, division +x,-x positive, negative ** exponentiation
Operator precedence increase from top to bottom and left to right (adapted from the Python Reference Manual).
The PySCeS MDL parser has been developed to parse and translate different styles of infix into Python/Numpy based expressions, the following functions are supported in any mathematical expression:
- log, log10, ln, abs
- pow, exp, root, sqrt
- sin, cos, tan, sinh, cosh, tanh
- arccos, arccosh, arcsin, arcsinh, arctan, arctanh
- floor, ceil, ceiling, piecewise
- notanumber, pi, infinity, exponentiale
Logical operators are supported in rules, events etc but not in rate equation definitions. The PySCeS MDL parser understands Python infix as well as libSBML and NumPy prefix notation.
- and or xor not
- > gt(x,y) greater(x,y)
- < lt(x,y) less(x,y)
- >= ge(x,y) geq(x,y) greater_equal(x,y)
- <= le(x,y) leq(x,y) less_equal(x,y)
- == eq(x,y) equal(x,y)
- != neq(x,y) not_equal(x,y)
Note that currently the MathML delay and factorial functions are not supported. Delay is handled by simply removing it from any expression, e.g. delay(f(x), delay) would be parsed as f(x).
A reaction definition when no compartments are defined:
R1:
X > $pool
Mu*X
When compartments are defined note how now the reaction is now given a location and that because the propensities formed from these reactions must be in changes in substance per time the rate equation is multiplied by its compartment size:
R1@Cell:
X > $pool
Mu*X
If Species_In_Conc: True the location of the species is defined when it is initialised and will be explained later in this manual.
The general form of any species (fixed, free) and parameter is simply:
property = value
Initialisations can be written in any order anywhere in the input file but for human readability purposes these are usually placed after the reaction that uses them or grouped at the end of the input file. Both decimal and scientific notation is allowed with the following provisions that neither floating point (1. ) nor scientific shorthand (1.e-3) syntax should be used, instead use the full form (1.0e-3), (0.001) or (1.0).
Variable or free species are initialised differently depending on whether compartments are present in the model. While in essence the variables are set by the system parameters the
Although the variable species concentrations are determined by the parameters of the system, their initial values are used in various places, calculating total moiety concentrations (if present), time simulation initial values (e.g. time=zero) and as initial guesses for the steady-state algorithms. If an empty initial species pool is required it is not recommended to initialise these values to zero (in order to prevent potential divide-by-zero errors) but rather to a small value (e.g. 10**-8).
For a model with no compartments these initial values assumed to be concentrations:
NADH = 0.001
ATP = 2.3e-3
sucrose = 1
In a model with compartments it is expected that the species are located in a compartment (even if Species_In_Conc: False) this is done useing the @ symbol:
s1@Memb = 0.01
s2@Cell = 2.0e-4
A word of warning, the user is responsible for making sure that the units of the initialised species match those of the model. Please keep in mind that all species (and anything that depends on them) is defined in terms of the Species_In_Conc keyword. For example, if the preceding initialisations were for R1 (see Reaction section) then they would be concentrations (as Species_In_Conc: True). However, in the next example, we are initialising species for R4 and they are therefore in amounts (Species_In_Conc: False):
s3@Memb = 1.0
s4@Cell = 2.0
Fixed species are defined in a similar way and although technically a parameter, they should be given a location in compartmental models:
# InitExt
X0 = 10.0
X4@Cell = 1.0
However, fixed species are true parameters in the sense that their associated compartment size does not affect their value when it changes size. If compartment size dependent behaviour is required an assignment or rate rule should be considered.
Finally, the parameters should be initialised. StochPy checks if a parameter is defined that is not present in the rate equations and if such parameter initialisations are detected a harmless warning is generated. If, on the other hand, an uninitialised parameter is detected a warning is generated and a value of 1.0 assigned:
# InitPar
Vf2 = 10.0
Ks4 = 1.0
Assignment rules or forcing functions are used to set the value of a model attribute before the ODE’s are evaluated. This model attribute can either be a parameter used in the rate equations (this is traditionally used to describe an equilibrium block) a compartment or an arbitrary parameter (commonly used to define some sort of tracking function). Assignment rules can access other model attributes directly and have the generic form !F <par> = <formula>. Where <par> is the parameter assigned the result of <formula>. Assignment rules can be defined anywhere in the input file:
!F S_V_Ratio = Mem_Area/Vcyt
!F sigma_test = sigma_P*Pmem + sigma_L*Lmem
These rules would set the value of <par> which whose value can be followed with using the simulation and steady state extra_data functionality.
Time dependant events may now be defined whose definition follows the event framework described in the SBML L2V1 specification. The general form of an event is Event: <name>, <trigger>, <delay> { <assignments> }. As can be seen an event consists of essentially three parts, a conditional <trigger>, a set of one or more <assignments> and a <delay> between when the trigger is fired (and the assignments are evaluated) and the eventual assignment to the model. Assignments have the general form <par> = <formula>. Events have access to the “current” simulation time using the _TIME_ symbol:
Event: event1, _TIME_ > 10 and A > 150.0, 0 {
V1 = V1*vfact
V2 = V2*vfact
}
The following event illustrates the use of a delay of ten time units as well as the prefix notation (used by libSBML) for the trigger (StochPy understands both notations):
Event: event2, geq(_TIME_, 15.0), 10 {
V3 = V3*vfact2
}
# Event definitions
Event: reset, P2 > 30, 0.0 { P2 = 0 P = 100 }
Some models contain reactions which are defined as only have substrates or products:
R1: A + B >
R2: > C + D
The implication is that the relevant reagents appear or disappear from or into a constant pool. Unfortunately the PySCeS parser does not accept such an unbalanced reaction definition and requires these pools to be represented as a $pool token:
R1: A + B > $pool
R2: $pool > C + D
$pool is neither counted as a reagent nor does it ever appear in the stoichiometry (think of it as dev/null) and no other $<str> tokens are allowed.
StochPy test model BirthDeath.psc:
# Stochastic Simulation Algorithm input file
# --> mRNA -->
# Reactions
R1:
mRNA > {2} mRNA
Ksyn*mRNA
R2:
mRNA > $pool
Kdeg*mRNA
# Fixed species
# Variable species
mRNA = 100
# Parameters
Ksyn = 2.9
Kdeg = 3
Test suite model dsmts-003-04.xml.psc:
# Generated by PySCeS 0.8.0 (2012-02-28 14:09)
# Keywords
Description: Dimerisation model (003), variant 04
Modelname: Dimerisation04
Output_In_Conc: False
Species_In_Conc: False
# GlobalUnitDefinitions
UnitVolume: litre, 1.0, 0, 1
UnitLength: metre, 1.0, 0, 1
UnitSubstance: item, 1.0, 0, 1
UnitArea: metre, 1.0, 0, 2
UnitTime: second, 1.0, 0, 1
# Compartments
Compartment: Cell, 1.0, 3
# Reactions
Dimerisation@Cell:
{2.0}P > P2
k1*P*(P-1.0)/2.0
Disassociation@Cell:
P2 > {2.0}P
k2*P2
# Event definitions
Event: reset, gt(P2, 30), 0.0
{
P2 = 0
P = 100
}
# Fixed species
# Variable species
P2@Cell = 0.0
P@Cell = 100.0
# Parameters
k1 = 0.001
k2 = 0.01
Although it may be slightly more complicated than the basic model described above it is still, by our definition, certainly human readable.