mass - molecular masses and isotope distributions
Summary
This module defines general functions for mass and isotope abundance calculations. For most of the functions, a user may specify the substance under study in various formats, but all of them are reduced to a Composition object describing its chemical composition.
Classes
Composition - a class storing chemical composition of a substance.
Unimod - a class representing a Python interface to the Unimod database.
Mass calculations
calculate_mass() - a general routine for mass and m/z calculation. It can calculate the mass of a polypeptide sequence, chemical formula or elemental composition. When supplied with an ion type and charge, the function calculates m/z.
fast_mass() - a less powerful but much faster function for polypeptide mass calculation.
Isotopic abundances
isotopic_composition_abundance() - calculate the relative abundance of a given isotopic composition.
most_probable_isotopic_composition() - find the most abundant isotopic composition for a molecule defined by a polypeptide sequence, chemical formula or elemental composition.
Data
nist_mass - a dict with exact masses of the most abundant isotopes.
std_aa_comp - a dict with the elemental compositions of the standard twenty amino acid residues.
std_ion_comp - a dict with the relative elemental compositions of the standard peptide fragment ions.
std_aa_mass - a dict with the monoisotopic masses of the standard twenty amino acid residues.
- Composition.__init__(*args, **kwargs)
A Composition object stores the chemical composition of a substance. It is essentially a dict whose keys are the names of chemical elements and whose values are the integer numbers of the corresponding atoms in the substance.
The main improvement over dict is that Composition objects allow addition and subtraction.
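The addition and subtraction described here can be illustrated with a small stand-in class. This is a minimal sketch, not the library's implementation: the class name ElementCounts is hypothetical, and it only shows how element-wise arithmetic on a dict of atom counts behaves.

```python
from collections import Counter


class ElementCounts(dict):
    """Hypothetical stand-in for Composition: element symbol -> atom count."""

    def __add__(self, other):
        result = Counter(self)
        result.update(other)  # adds counts element by element
        return ElementCounts({el: n for el, n in result.items() if n != 0})

    def __sub__(self, other):
        result = Counter(self)
        result.subtract(other)  # subtracts counts element by element
        return ElementCounts({el: n for el, n in result.items() if n != 0})


water = ElementCounts({"H": 2, "O": 1})
glycine = ElementCounts({"C": 2, "H": 5, "N": 1, "O": 2})

# Condensing two glycine molecules into a dipeptide releases one water.
diglycine = glycine + glycine - water
print(diglycine)
```

Dropping zero counts after each operation keeps the result comparable to a plain dict of the atoms that are actually present.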
A Composition object can be initialized with one of the following arguments: formula, sequence, parsed_sequence or split_sequence.
If none of these are specified, the constructor will look at the first positional argument and try to build the object from it. Without positional arguments, a Composition will be constructed directly from keyword arguments.
If there’s an ambiguity, i.e. the argument is both a valid sequence and a formula (such as ‘HCN’), it will be treated as a sequence. You need to provide the ‘formula’ keyword to override this.
Warning
Be careful when supplying a list with a parsed sequence or a split sequence as a keyword argument. It must be obtained with the show_unmodified_termini option enabled. When supplied as a positional argument, the option doesn’t matter, because the positional argument is always converted to a sequence prior to any processing.
Parameters : formula : str, optional
A string with a chemical formula. All elements must be present in mass_data.
sequence : str, optional
A polypeptide sequence string in modX notation.
parsed_sequence : list of str, optional
A polypeptide sequence parsed into a list of amino acids.
split_sequence : list of tuples of str, optional
A polypeptide sequence parsed into a list of tuples (as returned by pyteomics.parser.parse() with ‘split=True’).
aa_comp : dict, optional
A dict with the elemental composition of the amino acids (the default value is std_aa_comp).
mass_data : dict, optional
A dict with the masses of chemical elements (the default value is nist_mass). It is used for formulae parsing only.
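To illustrate how a formula string can be reduced to element counts, and why every element must be present in the mass table, here is a minimal sketch. It is an assumption-laden simplification: parse_formula and KNOWN_ELEMENTS are hypothetical names, and the parser handles only flat formulas with optional signed counts, not isotope labels or parentheses.

```python
import re

# Illustrative subset of a mass table (element -> monoisotopic mass in Da).
KNOWN_ELEMENTS = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.99491461956,
}

# An element symbol followed by an optional signed count: "C2", "O", "H-2".
_ATOM = re.compile(r"([A-Z][a-z]*)([+-]?\d+)?")


def parse_formula(formula, known_elements=KNOWN_ELEMENTS):
    """Return {element: count} for a flat formula; reject unknown elements."""
    if not re.fullmatch(f"(?:{_ATOM.pattern})*", formula):
        raise ValueError(f"Invalid formula: {formula!r}")
    counts = {}
    for symbol, number in _ATOM.findall(formula):
        if symbol not in known_elements:
            raise ValueError(f"Unknown element: {symbol!r}")
        # A missing count means a single atom.
        counts[symbol] = counts.get(symbol, 0) + int(number or 1)
    return counts


print(parse_formula("C2H5OH"))
```

Repeated symbols are accumulated, so the two hydrogen groups in "C2H5OH" end up in a single entry.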
- class pyteomics.mass.Unimod(source='http://www.unimod.org/xml/unimod.xml')
A class for the Unimod database of modifications. The list of all modifications can be retrieved via the mods attribute. The methods by_title and by_name provide convenient searching. For more elaborate filtering, iterate manually over the list.
Methods
by_name(name[, strict])    Search modifications by name.
by_title(title[, strict])    Search modifications by title.
- by_name(name, strict=True)
Search modifications by name. If a single modification is found, it is returned. Otherwise, a list will be returned.
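The return convention (a single record when exactly one matches, a list otherwise) can be sketched without touching the real database. Everything here is hypothetical: the function search_by_field, the record fields, and the reading of strict as exact match versus substring match are assumptions for illustration, since this page does not define them.

```python
def search_by_field(mods, field, value, strict=True):
    """Return one record when exactly one matches, otherwise a list of matches."""
    if strict:
        # Assumed meaning of strict: the field must equal the query.
        matches = [m for m in mods if m.get(field) == value]
    else:
        # Assumed meaning of non-strict: the query may be a substring.
        matches = [m for m in mods if value in m.get(field, "")]
    return matches[0] if len(matches) == 1 else matches


# Hypothetical records; real Unimod entries carry many more fields.
example_mods = [
    {"title": "Acetyl", "full_name": "Acetylation"},
    {"title": "Phospho", "full_name": "Phosphorylation"},
    {"title": "Methyl", "full_name": "Methylation"},
]

print(search_by_field(example_mods, "full_name", "Acetylation"))
print(search_by_field(example_mods, "full_name", "ylation", strict=False))
```

Because the return type depends on the number of hits, calling code should check whether it received a list before indexing into the result.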
- pyteomics.mass.calculate_mass(*args, **kwargs)
Calculates the monoisotopic mass of a polypeptide defined by a sequence string, parsed sequence, chemical formula or Composition object.
Supply at most one of the following keyword arguments: formula, sequence, parsed_sequence, split_sequence or composition. All arguments given are used to create a Composition object, unless an existing one is passed as a keyword argument.
Note that if a sequence string is supplied, the mass is calculated for a polypeptide with standard terminal groups (H- and -OH).
Warning
Be careful when supplying a list with a parsed sequence. It must be obtained with the show_unmodified_termini option enabled.
Parameters : formula : str, optional
A string with a chemical formula.
sequence : str, optional
A polypeptide sequence string in modX notation.
parsed_sequence : list of str, optional
A polypeptide sequence parsed into a list of amino acids.
composition : Composition, optional
A Composition object with the elemental composition of a substance.
average : bool, optional
If True then the average mass is calculated. Note that mass is not averaged for elements with specified isotopes.
ion_type : str, optional
If specified, then the polypeptide is considered to be in the form of the corresponding ion. Do not forget to specify the charge state!
charge : int, optional
If not 0 then m/z is calculated: the mass is increased by the corresponding number of proton masses and divided by z.
aa_comp : dict, optional
A dict with the elemental composition of the amino acids (the default value is std_aa_comp).
mass_data : dict, optional
A dict with the masses of the chemical elements (the default value is nist_mass).
ion_comp : dict, optional
A dict with the relative elemental compositions of peptide ion fragments (default is std_ion_comp).
Returns : mass : float
Monoisotopic or average mass, or m/z if a charge is given, of the substance.
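The charge rule stated above (add one proton mass per charge, then divide by z) can be written out directly. This is a minimal sketch under stated assumptions: mass_from_composition is a hypothetical helper, the mass table is a four-element excerpt, and ion types and average masses are left out.

```python
PROTON_MASS = 1.00727646677  # Da

# Monoisotopic masses of the most abundant isotopes (illustrative subset).
MONOISOTOPIC = {
    "H": 1.00782503207,
    "C": 12.0,
    "N": 14.0030740048,
    "O": 15.99491461956,
}


def mass_from_composition(composition, charge=0, masses=MONOISOTOPIC):
    """Neutral mass for charge 0, otherwise m/z of the protonated ion."""
    neutral = sum(masses[element] * n for element, n in composition.items())
    if not charge:
        return neutral
    return (neutral + charge * PROTON_MASS) / charge


glycine = {"C": 2, "H": 5, "N": 1, "O": 2}
print(mass_from_composition(glycine))            # neutral monoisotopic mass
print(mass_from_composition(glycine, charge=1))  # m/z of the singly protonated ion
```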
- pyteomics.mass.fast_mass(sequence, ion_type=None, charge=None, **kwargs)
Calculate the monoisotopic mass of an ion using the fast algorithm. It may be used only if the amino acid residues are represented in one-letter code.
Parameters : sequence : str
A polypeptide sequence string.
ion_type : str, optional
If specified, then the polypeptide is considered to be in the form of the corresponding ion. Do not forget to specify the charge state!
charge : int, optional
If not 0 then m/z is calculated: the mass is increased by the corresponding number of proton masses and divided by z.
mass_data : dict, optional
A dict with the masses of chemical elements (the default value is nist_mass).
aa_mass : dict, optional
A dict with the monoisotopic masses of amino acid residues (default is std_aa_mass).
ion_comp : dict, optional
A dict with the relative elemental compositions of peptide ion fragments (default is std_ion_comp).
Returns : mass : float
Monoisotopic mass or m/z of a peptide molecule/ion.
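One plausible reason a residue-mass lookup is faster is that it skips building an elemental composition and simply sums precomputed residue masses. The sketch below illustrates that idea only. It is not the library's code: residue_sum_mass is a hypothetical name, the table covers a handful of residues with masses rounded to five decimals, and ion types are ignored.

```python
WATER_MASS = 2 * 1.00782503207 + 15.99491461956  # H- and -OH terminal groups
PROTON_MASS = 1.00727646677

# Monoisotopic residue masses in Da (illustrative subset, rounded).
RESIDUE_MASS = {
    "G": 57.02146,
    "P": 97.05276,
    "T": 101.04768,
    "I": 113.08406,
    "D": 115.02694,
    "E": 129.04259,
}


def residue_sum_mass(sequence, charge=0, residue_mass=RESIDUE_MASS):
    """Sum one-letter residue masses, add water, optionally convert to m/z."""
    mass = sum(residue_mass[aa] for aa in sequence) + WATER_MASS
    if charge:
        mass = (mass + charge * PROTON_MASS) / charge
    return mass


print(round(residue_sum_mass("PEPTIDE"), 4))
print(round(residue_sum_mass("PEPTIDE", charge=2), 4))
```

The one-letter restriction follows naturally from this approach: each character of the string is used directly as a dictionary key.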
- pyteomics.mass.isotopic_composition_abundance(*args, **kwargs)
Calculate the relative abundance of a given isotopic composition of a molecule.
Parameters : formula : str, optional
A string with a chemical formula.
composition : Composition, optional
A Composition object with the isotopic composition of a substance.
mass_data : dict, optional
A dict with the masses of chemical elements (the default value is nist_mass).
Returns : relative_abundance : float
The relative abundance of a given isotopic composition.
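For a fully specified isotopic composition, the relative abundance can be modelled as a product of multinomial probabilities, one per element. The following is a sketch of that model under stated assumptions: isotopologue_abundance and the ABUNDANCE table are hypothetical, the abundances are illustrative natural values for two elements, and the handling of unspecified isotopes is not covered.

```python
from math import factorial

# Illustrative natural abundances: element -> {mass number: abundance}.
ABUNDANCE = {
    "C": {12: 0.9893, 13: 0.0107},
    "H": {1: 0.999885, 2: 0.000115},
}


def isotopologue_abundance(isotope_counts, abundance=ABUNDANCE):
    """isotope_counts maps element -> {mass number: atom count}."""
    result = 1.0
    for element, counts in isotope_counts.items():
        # Multinomial coefficient: the number of ways to place the isotopes.
        ways = factorial(sum(counts.values()))
        for mass_number, n in counts.items():
            ways //= factorial(n)
            result *= abundance[element][mass_number] ** n
        result *= ways
    return result


# Two carbon atoms, exactly one of which is carbon-13.
print(isotopologue_abundance({"C": {12: 1, 13: 1}}))
```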
- pyteomics.mass.most_probable_isotopic_composition(*args, **kwargs)
Calculate the most probable isotopic composition of a peptide molecule/ion defined by a sequence string, parsed sequence, chemical formula or Composition object.
Note that if a sequence string is supplied then the isotopic composition is calculated for a polypeptide with standard terminal groups (H- and -OH).
For each element, only the two most abundant isotopes are considered.
Parameters : formula : str, optional
A string with a chemical formula.
sequence : str, optional
A polypeptide sequence string in modX notation.
parsed_sequence : list of str, optional
A polypeptide sequence parsed into a list of amino acids.
composition : Composition, optional
A Composition object with the elemental composition of a substance.
elements_with_isotopes : list of str
A list of elements to be considered in the isotopic distribution (by default, every element has an isotopic distribution).
aa_comp : dict, optional
A dict with the elemental composition of the amino acids (the default value is std_aa_comp).
mass_data : dict, optional
A dict with the masses of chemical elements (the default value is nist_mass).
ion_comp : dict, optional
A dict with the relative elemental compositions of peptide ion fragments (default is std_ion_comp).
Returns : out : tuple (Composition, float)
A tuple with the most probable isotopic composition and its relative abundance.
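With only two isotopes per element, the number of atoms of the minor isotope follows a binomial distribution, so its most likely count is the binomial mode. The sketch below uses that textbook rule. It is an assumption that this matches the library's own rounding, and most_probable_isotopes and TWO_ISOTOPES are hypothetical names.

```python
from math import floor

# The two most abundant isotopes per element: (mass number, abundance).
TWO_ISOTOPES = {
    "C": ((12, 0.9893), (13, 0.0107)),
    "H": ((1, 0.999885), (2, 0.000115)),
}


def most_probable_isotopes(composition, isotopes=TWO_ISOTOPES):
    """Return element -> {mass number: count} for the likeliest isotopologue."""
    result = {}
    for element, n in composition.items():
        (major, _), (minor, p_minor) = isotopes[element]
        # Mode of Binomial(n, p_minor), capped at the number of atoms.
        n_minor = min(n, floor((n + 1) * p_minor))
        result[element] = {major: n - n_minor, minor: n_minor}
    return result


print(most_probable_isotopes({"C": 100, "H": 200}))
```

For small molecules the result contains no minor isotopes at all. A carbon-13 atom first appears in the most probable composition at roughly a hundred carbon atoms.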
- pyteomics.mass.nist_mass
A dict with the exact element masses downloaded from the NIST website: http://www.nist.gov/pml/data/comp.cfm. Each element has an entry containing the masses and relative abundances of several abundant isotopes, plus a separate entry with key 0 for the unspecified isotope, which holds the mass of the most abundant isotope and an abundance of 1.0.
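The layout described above can be pictured as a nested dict. The sketch below assumes each isotope entry is a (mass, abundance) tuple keyed by mass number, which this page does not spell out. MASS_TABLE and the two helper functions are hypothetical and cover two elements only.

```python
# Assumed layout: element -> {mass number: (mass in Da, relative abundance)}.
# Key 0 stands for the unspecified isotope.
MASS_TABLE = {
    "C": {
        0: (12.0, 1.0),
        12: (12.0, 0.9893),
        13: (13.0033548378, 0.0107),
    },
    "H": {
        0: (1.00782503207, 1.0),
        1: (1.00782503207, 0.999885),
        2: (2.0141017778, 0.000115),
    },
}


def monoisotopic_mass(element, table=MASS_TABLE):
    """Mass stored under key 0, i.e. that of the most abundant isotope."""
    return table[element][0][0]


def average_mass(element, table=MASS_TABLE):
    """Abundance-weighted mean over the explicitly listed isotopes."""
    return sum(m * a for iso, (m, a) in table[element].items() if iso != 0)


print(monoisotopic_mass("C"), round(average_mass("C"), 4))
```

Keeping the key 0 entry separate lets a lookup for an element without an isotope label return the monoisotopic mass directly, while the labelled entries supply what is needed for average masses and abundances.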
- pyteomics.mass.std_aa_comp
A dictionary with elemental compositions of the twenty standard amino acid residues and standard H- and -OH terminal groups.
- pyteomics.mass.std_aa_mass
A dictionary with monoisotopic masses of the twenty standard amino acid residues.
- pyteomics.mass.std_ion_comp
A dict with the relative elemental compositions of the standard peptide fragment ions. The elemental composition of a fragment ion is calculated as the difference between the total elemental composition of the ion and the sum of the elemental compositions of its constituent amino acid residues.
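A relative composition is applied by adding it to the composition built from the residues. The sketch below takes the reference to be the neutral peptide including its H- and -OH terminal groups, which is an assumption of this example. The names ion_composition, RESIDUE_COMP and ION_DELTA and the delta values shown are illustrative, and charge carriers are not included.

```python
from collections import Counter

# Elemental compositions of two residues (illustrative subset).
RESIDUE_COMP = {
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "A": {"C": 3, "H": 5, "N": 1, "O": 1},
}
TERMINI = {"H": 2, "O": 1}  # H- and -OH together

# Assumed relative compositions: a b ion lacks one water relative to the
# neutral peptide; M and y match it.
ION_DELTA = {"M": {}, "y": {}, "b": {"H": -2, "O": -1}}


def ion_composition(sequence, ion_type="M"):
    """Residues plus terminal groups plus the ion's relative composition."""
    total = Counter(TERMINI)
    for aa in sequence:
        total.update(RESIDUE_COMP[aa])
    total.update(ION_DELTA[ion_type])  # negative counts subtract atoms
    return {element: n for element, n in total.items() if n != 0}


print(ion_composition("GA"))
print(ion_composition("GA", ion_type="b"))
```

Storing only the difference keeps the ion table independent of the peptide, so the same small dict applies to a sequence of any length.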