Monosaccharide
Represents individual saccharide residues and their associated functions. These are the basic units of structural representation, possessing graph node-like properties.
-
class glypy.structure.monosaccharide.Monosaccharide(anomer=None, configuration=None, stem=None, superclass=None, ring_start=None, ring_end=None, modifications=None, links=None, substituent_links=None, composition=None, reduced=None, id=None, fast=False)
Bases: glypy.structure.base.SaccharideBase
Represents a single monosaccharide molecule and its relationships with other molecules through Link objects.

Link objects stored in links are for connections to other Monosaccharide instances, building a Glycan structure as a graph of Monosaccharide objects. Link objects connecting the Monosaccharide instance to Substituent objects are stored in substituent_links.

Both links and substituent_links are instances of OrderedMultiMap, where the key is the index of the carbon atom in the carbohydrate backbone that hosts the bond. An index of x or -1 represents an unknown location.

Warning
While Monosaccharide objects expose their modifications, links, and substituent_links attributes as mutable, you should treat them as read-only. The methods for altering their contents, add_substituent(), add_monosaccharide(), add_modification(), drop_substituent(), drop_monosaccharide(), and drop_modification(), are all responsible for handling these mutations for you. Link methods like Link.apply(), Link.break_link(), and Link.reconnect() are used internally.

Attributes
anomer: Anomer
    An entry of Anomer that corresponds to the linkage type of the carbohydrate backbone. Is an entry of a class based on Enum.
superclass: SuperClass
    An entry of SuperClass that corresponds to the number of carbons in the carbohydrate backbone of the monosaccharide. Controls the base composition of the instance and the number of positions open to be linked to or modified. Is an entry of a class based on Enum.
configuration: Configuration or {‘d’, ‘l’, ‘x’, ‘missing’, None}
    An entry of Configuration which corresponds to the optical stereoisomer state of the instance. Is an entry of a class based on Enum. May possess more than one value.
stem: Stem
    Corresponds to the bond conformation of the carbohydrate backbone. Is an entry of a class based on Enum. May possess more than one value.
ring_start: int
    The index of the carbon of the carbohydrate backbone that starts a ring. A value of -1, 'x', or None corresponds to an unknown start. A value of 0 refers to a linear chain.
ring_end: int
    The index of the carbon of the carbohydrate backbone that ends a ring. A value of -1, 'x', or None corresponds to an unknown end. A value of 0 refers to a linear chain.
reducing_end: int
    The index of the carbon which hosts the reducing end.
modifications: OrderedMultiMap
    The mapping of sites to Modification entries. Directly modifies the instance’s composition.
links: OrderedMultiMap
    The mapping of sites to Link entries that refer to other Monosaccharide instances.
substituent_links: OrderedMultiMap
    The mapping of sites to Link entries that refer to Substituent instances.
composition: Composition
    An instance of Composition corresponding to the elemental composition of self and its immediate modifications. If not provided, this will be inferred from field values.
reduced: ReducedEnd
    An instance of ReducedEnd, or the value True, represents a reduced sugar. May be inferred from modifications if “aldi” is present.
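The site-keyed mappings described above (modifications, links, and substituent_links) can be pictured with a small stand-in. The sketch below is not glypy's OrderedMultiMap; SiteMultiMap, UNKNOWN, and the string values are hypothetical names chosen for illustration. It only shows the idea of several entries sharing one backbone position, with -1 reserved for an undetermined site.

```python
from collections import OrderedDict

UNKNOWN = -1  # hypothetical constant for an undetermined backbone position


class SiteMultiMap:
    """Hypothetical stand-in for an ordered multimap: one key, many values."""

    def __init__(self):
        self._store = OrderedDict()

    def add(self, position, value):
        # Several entries may share one backbone position.
        self._store.setdefault(position, []).append(value)

    def __getitem__(self, position):
        # Missing positions read as empty instead of raising KeyError.
        return list(self._store.get(position, []))

    def items(self):
        # Yield flat (position, value) pairs, keys in first-insertion order.
        for position, values in self._store.items():
            for value in values:
                yield position, value

    def __len__(self):
        return sum(len(values) for values in self._store.values())


links = SiteMultiMap()
links.add(4, "link to residue A")
links.add(4, "link to residue B")
links.add(UNKNOWN, "link at an undetermined site")
```

Reading links[4] in this sketch returns both entries at carbon 4, which is the behavior that makes a multimap a natural fit for branching structures.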
__eq__(other)
Test for equality between Monosaccharide instances. First try scalar equality of fields, and then compare descendants.
-
__getitem__(position)
Gets the collection of alterations made to the carbohydrate backbone at position. This queries modifications, links, and substituent_links.
Returns: dict
-
__setstate__(state)
Does some testing to upgrade outdated, but equivalent modification models.
-
_fast_reduce(value)
Expedite adding a reducing end to this monosaccharide. Assumes that value is a ReducedEnd and that this monosaccharide is not already reduced.
Parameters:
    value: ReducedEnd
-
_flat_equality(other, lengths=True)
Test for equality of all scalar-ish features that do not require recursively comparing links, which in turn compare their connected units.
-
_match_substituents(other)
Helper method for matching substituents in an order-independent fashion. Used by topological_equality().
-
add_modification(modification, position, max_occupancy=0)
Adds a modification instance to modifications at the site given by position. This directly modifies composition, consequently changing mass().
Parameters:
    position: int or ‘x’
        The location to add the Modification to.
    modification: str or Modification
        The modification to add. If passed a str, it will be translated into an instance of Modification.
    max_occupancy: int, optional
        The maximum number of items acceptable at position. Defaults to 0.
Returns: self, for chain calls
Raises:
    IndexError
        position exceeds the bounds set by superclass.
    ValueError
        position is occupied by more than max_occupancy elements.
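The occupancy rule and the chain-call return value described for add_modification() can be sketched in isolation. This is a simplified model under stated assumptions, not glypy's implementation: Backbone, sites, and the item strings are hypothetical, and the real class also updates composition and creates Link objects.

```python
class Backbone:
    """Hypothetical model of a carbon backbone with per-site occupancy limits."""

    def __init__(self, size):
        self.size = size   # number of backbone carbons
        self.sites = {}    # position -> list of attached items

    def is_occupied(self, position):
        # -1 stands for an undetermined site and skips the bounds check.
        if position != -1 and not 1 <= position <= self.size:
            raise IndexError("position %r is outside the backbone" % (position,))
        return len(self.sites.get(position, []))

    def add(self, item, position, max_occupancy=0):
        # Refuse the attachment when the site already holds more than allowed.
        if self.is_occupied(position) > max_occupancy:
            raise ValueError("position %r is already occupied" % (position,))
        self.sites.setdefault(position, []).append(item)
        return self  # returning self is what allows chained calls


# Chained calls: each add() hands back the same object.
hexose = Backbone(6).add("deoxy", 6).add("keto", 2)
```

With the default max_occupancy of 0, a second add() at position 6 raises ValueError in this sketch, and any position above 6 raises IndexError.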
-
add_monosaccharide(monosaccharide, position=-1, max_occupancy=0, child_position=-1, parent_loss=None, child_loss=None)
Adds a Monosaccharide and associated Link to links at the site given by position.

>>> from glypy import monosaccharides
>>> hexnac = monosaccharides.HexNAc
>>> hex = monosaccharides.Hex
>>> hexnac.add_monosaccharide(hex, 1)
RES 1b:x-xx-HEX-1:5 2s:n-acetyl LIN 1:1d(2+1)2n
>>> hexnac.links[1][0].child
RES 1b:x-xx-HEX-1:5

Parameters:
    monosaccharide: Monosaccharide
        The monosaccharide to add.
    position: int or ‘x’
        The location to add the Monosaccharide link to links. Defaults to -1.
    child_position: int
        The location to add the link to in monosaccharide’s links. Defaults to -1.
    max_occupancy: int, optional
        The maximum number of items acceptable at position. Defaults to 0.
    parent_loss: Composition or str
        The elemental composition removed from self.
    child_loss: Composition or str
        The elemental composition removed from monosaccharide.
Returns: self, for chain calls
Raises:
    IndexError
        position exceeds the bounds set by superclass.
    ValueError
        position is occupied by more than max_occupancy elements.
-
add_substituent(substituent, position=-1, max_occupancy=0, child_position=1, parent_loss=None, child_loss=None)
Adds a Substituent and associated Link to substituent_links at the site given by position. This new substituent is included when calculating mass with substituents included.

>>> from glypy import monosaccharides
>>> hex = monosaccharides.Hex
>>> hexnac = monosaccharides.HexNAc
>>> hex.add_substituent("n-acetyl", 2, parent_loss="OH")
RES 1b:x-xx-HEX-1:5 2s:n-acetyl LIN 1:1d(2+1)2n
>>> hexnac == hex
True

Parameters:
    substituent: str or Substituent
        The substituent to add. If passed a str, it will be translated into an instance of Substituent.
    position: int or ‘x’
        The location to add the Substituent link to substituent_links. Defaults to -1.
    child_position: int
        The location to add the link to in substituent’s links. Defaults to 1. Substituent indices are currently not checked.
    max_occupancy: int, optional
        The maximum number of items acceptable at position. Defaults to 0.
    parent_loss: Composition or str
        The elemental composition removed from self.
    child_loss: Composition or str
        The elemental composition removed from substituent.
Returns: self, for chain calls
Raises:
    IndexError
        position exceeds the bounds set by superclass.
    ValueError
        position is occupied by more than max_occupancy elements.
-
children()
Returns an iterator over the Monosaccharide instances which are considered the descendants of self. Alias for __iter__.

>>> from glypy import glycans
>>> n_linked_core = glycans["N-Linked Core"]
>>> ch = n_linked_core.root.children()
>>> ch[0]
(4, RES 1b:b-dglc-HEX-1:5 2s:n-acetyl LIN 1:1d(2+1)2n)

Returns: list of
    position: int
        Location of the bond to the child Monosaccharide
    child: Monosaccharide
        Monosaccharide at position
-
clone(prop_id=False, fast=True, monosaccharide_type=None)
Copies just this Monosaccharide and its Substituent objects, creating a separate instance with the same data. All mutable data structures are duplicated and distinct from the original.

Does not copy any links, as this would cause recursive duplication of the entire Glycan graph.
Returns: Monosaccharide
-
drop_modification(position, modification)
Remove the modification at position.
Parameters:
    position: int
        The position to drop the modification from.
    modification: Modification
        The Modification to remove.
Returns: self, for chain calls
Raises:
    IndexError
        If position is not a valid carbohydrate backbone position.
    ValueError
        If modification is not found at position.
-
drop_monosaccharide(position, refund=True)
Remove the glycosidic bond at position, detaching a connected Monosaccharide.

If there is more than one glycosidic bond at position, an error will be raised.

>>> from glypy import glycans
>>> n_linked_core = glycans["N-Linked Core"]
>>> n_linked_core.root.drop_monosaccharide(4)
RES 1b:b-dglc-HEX-1:5 2s:n-acetyl LIN 1:1d(2+1)2n
>>> n_linked_core.mass()
221.08993720321

Parameters:
    position: int
        The position to drop the monosaccharide link from.
    refund: bool
        Passed to break_link().
Returns: self, for chain calls
Raises:
    IndexError
        If position is not a valid carbohydrate backbone position.
    ValueError
-
drop_substituent(position, substituent=None, refund=True)
Remove the substituent at position.

If substituent is None, then the first substituent found at position is removed.

>>> from glypy import monosaccharides
>>> hex = monosaccharides.Hex
>>> hexnac = monosaccharides.HexNAc
>>> hexnac.drop_substituent(2)
RES 1b:x-xx-HEX-1:5
>>> hexnac == hex
True

Parameters:
    position: int
        The position to drop the substituent from.
    substituent: Substituent
        The Substituent to remove. If None, the first substituent found at position will be removed.
    refund: bool
        Passed to break_link().
Returns: self, for chain calls
Raises:
    IndexError
        If position is not a valid carbohydrate backbone position.
    ValueError
        If substituent is not found at position.
-
exact_ordering_equality(other, substituents=True, visited=None)
Performs equality testing between two monosaccharides where the exact position (and ordering by sort) of links must match between the input Monosaccharide objects.
Returns: bool
-
is_occupied(position)
Checks to see if a particular backbone position is occupied by a Modification, Substituent, or Link to another Monosaccharide.
Parameters:
    position: int
        The position to check for occupancy. Passing -1 checks for undetermined attachments.
Returns:
    int
        The number of occupants at position.
Raises:
    IndexError
        When the position is less than 1 or exceeds the limits of the carbohydrate backbone’s size.
-
mass(average=False, charge=0, mass_data=None, substituents=True)
Calculates the total mass of self.
Parameters:
    average: bool, optional, defaults to False
        Whether or not to use the average isotopic composition when calculating masses. When average == False, masses are calculated using monoisotopic mass.
    charge: int, optional, defaults to 0
        If charge is non-zero, m/z is calculated, where m is the theoretical mass and z is charge.
    mass_data: dict, optional
        If mass_data is None, standard NIST mass and isotopic abundance data are used. Otherwise the contents of mass_data are assumed to contain elemental mass and isotopic abundance information. Defaults to None.
    substituents: bool, optional, defaults to True
        Whether or not to include substituents’ masses.
Returns: float
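The charge parameter switches the result from a neutral mass to m/z. The source does not say which charge carrier is used, so the sketch below assumes protonation, a common convention in mass spectrometry; PROTON_MASS and mass_to_charge are hypothetical names, and the neutral mass is borrowed from the drop_monosaccharide() example.

```python
PROTON_MASS = 1.00727646688  # Da; assumed charge carrier, not taken from the source


def mass_to_charge(neutral_mass, charge=0):
    """Return the neutral mass when charge is 0, otherwise m/z."""
    if charge == 0:
        return neutral_mass
    # Add one proton per positive charge, or remove one per negative charge,
    # then divide by the magnitude of the charge.
    return (neutral_mass + charge * PROTON_MASS) / abs(charge)


neutral = 221.08993720321
singly_charged = mass_to_charge(neutral, charge=1)
doubly_charged = mass_to_charge(neutral, charge=2)
```

If the library uses a different carrier or sign convention, only the constant and the numerator change; the division by the absolute charge is what turns a mass into m/z.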
-
open_attachment_sites(max_occupancy=0)
When attaching Monosaccharide instances to other objects, bonds are formed between the carbohydrate backbone and the other object. If a site is already bound, the occupying object fills that space on the backbone and prevents other objects from binding there.

Currently only cares about the availability of the hydroxyl group. As there is not a hydroxyl attached to the ring-ending carbon, that should not be considered an open site.

If any existing attached units have unknown positions, we can’t provide any known positions, in which case the list of open positions will be a list of -1s of the length of open sites.
Parameters:
    max_occupancy: int
        The number of objects that may already be bound at a site before it is considered unavailable for attachment.
Returns:
    list
        The positions open for binding.
    int
        The number of bound but unknown locations on the backbone.
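The selection rules for open sites can be sketched as a plain function. This is an interpretation of the description, not the library's code: open_sites and its occupancy argument (a dict from position to number of bound objects, with -1 for unknown) are hypothetical, and subtracting the unknown attachments from the placeholder count is this sketch's own reading of "the length of open sites".

```python
def open_sites(size, ring_end, occupancy, max_occupancy=0):
    """Return (open positions, count of attachments at unknown positions).

    occupancy maps a backbone position (or -1 for unknown) to the number
    of objects already bound there.
    """
    unknowns = occupancy.get(-1, 0)
    candidates = [
        pos for pos in range(1, size + 1)
        # The ring-ending carbon carries no free hydroxyl, so skip it.
        if pos != ring_end and occupancy.get(pos, 0) <= max_occupancy
    ]
    if unknowns:
        # An unknown attachment could sit on any candidate, so no position
        # is certain: report -1 placeholders for the sites that remain.
        remaining = max(len(candidates) - unknowns, 0)
        return [-1] * remaining, unknowns
    return candidates, unknowns


# A six-carbon backbone closing its ring at carbon 5, with carbon 2 occupied.
known = open_sites(6, 5, {2: 1})
uncertain = open_sites(6, 5, {2: 1, -1: 1})
```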
-
parents()
Returns an iterator over the Monosaccharide instances which are considered the ancestors of self.
Returns: list of
    position: int
        Location of the bond to the parent Monosaccharide
    parent: Monosaccharide
        Monosaccharide at position
-
reducing_end
Return the reducing end type of self, or None if self is not reduced. The reducing end value can also be found in modifications.

If Modification.aldi is present, it will be converted into an instance of ReducedEnd with default arguments.

TODO: Remove redundancy between aldi check and reduction.
Returns: ReducedEnd or None
-
ring_type
The size of the ring-shape of the carbohydrate, as computed by ring_end - ring_start.
Returns:
    EnumValue
        The appropriate value of RingType.
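The ring_end - ring_start computation can be illustrated with the usual carbohydrate conventions: a span of four carbons (for example 1:5) closes a six-membered pyranose ring through the ring oxygen, and a span of three (for example 2:5) closes a five-membered furanose ring. RingShape and ring_shape below are hypothetical names and do not reproduce the members of the library's RingType.

```python
from enum import Enum


class RingShape(Enum):  # hypothetical names, not the library's RingType
    open_chain = "open"
    furanose = "furanose"
    pyranose = "pyranose"
    unknown = "unknown"


def ring_shape(ring_start, ring_end):
    """Classify a ring from the backbone carbons that start and end it."""
    unknown_markers = (-1, "x", None)
    if ring_start in unknown_markers or ring_end in unknown_markers:
        return RingShape.unknown
    if ring_start == 0 and ring_end == 0:
        # A value of 0 refers to a linear chain.
        return RingShape.open_chain
    span = ring_end - ring_start
    if span == 4:
        return RingShape.pyranose   # five carbons plus the ring oxygen
    if span == 3:
        return RingShape.furanose   # four carbons plus the ring oxygen
    return RingShape.unknown
```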
-
substituents()
Returns an iterator over all substituents attached to self by a Link object stored in substituent_links.
Returns: list of
    position: int
        Location of the bond to the substituent
    substituent: Substituent
        Substituent at position
-
-
glypy.structure.monosaccharide._get_standard_composition(monosaccharide)
Used to get the initial composition for a given monosaccharide SuperClass and modifications.

Used during initialization of a Monosaccharide.
Parameters:
    monosaccharide: Monosaccharide
        The Monosaccharide object to read attributes from.
Returns: The baseline composition from monosaccharide.superclass + monosaccharide.modifications
-
glypy.structure.monosaccharide._traverse_debug(monosaccharide, visited=None, apply_fn=identity)
A low-level depth-first traversal method for unwrapped residue graphs when the id attribute may be masking duplicate residues.
Parameters:
    monosaccharide: Monosaccharide
        Residue to start traversing from.
    visited: set or None
        The collection of node ids to ignore, having already visited them. If None, it defaults to the empty set.
    apply_fn: function
        Function to apply to each residue before yielding them.
Yields:
-
glypy.structure.monosaccharide.depth(monosaccharide, visited=None)
Calculate the distance from monosaccharide to its furthest descendant node.
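A recursive longest-path computation is one way to picture what depth() measures. The sketch below works on a plain adjacency dict rather than Monosaccharide objects, and it counts nodes on the longest downward path, so a leaf has depth 1. Whether the library counts nodes or edges is not stated in the source; tree_depth and the node labels are hypothetical.

```python
def tree_depth(node, children, visited=None):
    """Number of nodes on the longest path from node down to a descendant."""
    if visited is None:
        visited = set()
    if node in visited:
        # Already counted on another path; contributes nothing further.
        return 0
    visited.add(node)
    deepest_child = max(
        (tree_depth(child, children, visited) for child in children.get(node, [])),
        default=0,
    )
    return 1 + deepest_child


# root branches into a and b; the a branch continues through c to d.
tree = {"root": ["a", "b"], "a": ["c"], "c": ["d"]}
longest = tree_depth("root", tree)
```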
-
glypy.structure.monosaccharide.graph_clone(monosaccharide, visited=None)
Low-level depth-first duplication method for unwrapped residue graphs.
Parameters:
    monosaccharide: Monosaccharide
        The root of the graph to clone.
    visited: set or None
        The collection of node ids to ignore, having already visited them. If None, it defaults to the empty set.
Returns: The root of a newly duplicated and identical residue graph
-
glypy.structure.monosaccharide.release(monosaccharide)
Break all monosaccharide-monosaccharide links on monosaccharide, returning them as a list. Breaking is done with refund=True.
Parameters:
    monosaccharide: Monosaccharide
        Monosaccharide to break all links on.
Returns: list of tuple(link, (link.parent, link.child))
-
glypy.structure.monosaccharide.toggle(monosaccharide)
A simple generator for declaratively masking and unmasking a residue’s links. The first iteration masks all links. The second unmasks them. Calls release().
Parameters:
    monosaccharide: Monosaccharide
        Monosaccharide to mask links on.
-
glypy.structure.monosaccharide.traverse(monosaccharide, visited=None, apply_fn=identity)
A low-level depth-first traversal method for unwrapped residue graphs.
Parameters:
    monosaccharide: Monosaccharide
        Residue to start traversing from.
    visited: set or None
        The collection of node ids to ignore, having already visited them. If None, it defaults to the empty set.
    apply_fn: function
        Function to apply to each residue before yielding them.
Yields:
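The traversal pattern described here, depth-first with a visited set of node ids and a function applied to each node before it is yielded, can be sketched without the library. Residue, walk, and the local identity helper are hypothetical stand-ins; the real function operates on Monosaccharide objects and their links.

```python
def identity(x):
    return x


class Residue:
    """Hypothetical minimal node: an id plus a list of child residues."""

    def __init__(self, id, children=()):
        self.id = id
        self.children = list(children)


def walk(residue, visited=None, apply_fn=identity):
    """Yield apply_fn(node) for each reachable node, depth-first."""
    if visited is None:
        visited = set()
    stack = [residue]
    while stack:
        node = stack.pop()
        if node.id in visited:
            continue  # skip ids the caller or an earlier step already saw
        visited.add(node.id)
        yield apply_fn(node)
        # Reverse so the first child is popped, and so visited, first.
        stack.extend(reversed(node.children))


leaf = Residue(4)
root = Residue(1, [Residue(2, [leaf]), Residue(3)])
order = list(walk(root, apply_fn=lambda node: node.id))
```

Passing a pre-filled visited set prunes whole branches: with visited={2} the walk above skips node 2 and never reaches node 4.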