Monosaccharide

Represents individual saccharide residues and their associated functions. These are the basic units of structural representation, possessing graph node-like properties.

class glypy.structure.monosaccharide.Monosaccharide(anomer=None, configuration=None, stem=None, superclass=None, ring_start=None, ring_end=None, modifications=None, links=None, substituent_links=None, composition=None, reduced=None, id=None, fast=False)[source]

Bases: glypy.structure.base.SaccharideBase

Represents a single monosaccharide molecule and its relationships with other molecules through Link objects. Link objects for connections to other Monosaccharide instances are stored in links, building a Glycan structure as a graph of Monosaccharide objects. Link objects connecting the Monosaccharide instance to Substituent objects are stored in substituent_links.

Both links and substituent_links are instances of OrderedMultiMap objects where the key is the index of the carbon atom in the carbohydrate backbone that hosts the bond. An index of x or -1 represents an unknown location.
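As a rough illustration of the multimap idea described above (not glypy's own OrderedMultiMap), the sketch below uses a plain defaultdict of lists keyed by backbone carbon index, with -1 standing in for an unknown position. The names site_map, UNKNOWN, and entries_at, and the string values, are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for a multimap: each carbon index maps to a list,
# so a single site can hold more than one entry.
UNKNOWN = -1

site_map = defaultdict(list)
site_map[2].append("n-acetyl")        # something bound at carbon 2
site_map[4].append("link-to-child")   # a bond hosted by carbon 4
site_map[UNKNOWN].append("sulfate")   # an attachment whose site is not known


def entries_at(mapping, position):
    """Return a copy of the entries stored at position (empty list if none)."""
    return list(mapping.get(position, []))
```

Looking up a position that has no entries returns an empty list rather than raising, which is the behaviour that makes a multimap convenient for sparse backbone sites.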

Warning

While Monosaccharide objects expose their modifications, links, and substituent_links attributes as mutable, you should treat them as read-only. The methods for altering their contents, add_substituent(), add_monosaccharide(), add_modification(), drop_substituent(), drop_monosaccharide(), and drop_modification() are all responsible for handling these mutations for you. Link methods like Link.apply(), Link.break_link(), and Link.reconnect() are used internally.

Attributes

anomer: Anomer An entry of Anomer that corresponds to the linkage type of the carbohydrate backbone. Is an entry of a class based on Enum.
superclass: SuperClass An entry of SuperClass that corresponds to the number of carbons in the carbohydrate backbone of the monosaccharide. Controls the base composition of the instance and the number of positions open to be linked to or modified. Is an entry of a class based on Enum.
configuration: Configuration or {‘d’, ‘l’, ‘x’, ‘missing’, None} An entry of Configuration which corresponds to the optical stereoisomer state of the instance. Is an entry of a class based on Enum. May possess more than one value.
stem: Stem Corresponds to the bond conformation of the carbohydrate backbone. Is an entry of a class based on Enum. May possess more than one value.
ring_start: int The index of the carbon of the carbohydrate backbone that starts a ring. A value of -1, 'x', or None corresponds to an unknown start. A value of 0 refers to a linear chain.
ring_end: int The index of the carbon of the carbohydrate backbone that ends a ring. A value of -1, 'x', or None corresponds to an unknown end. A value of 0 refers to a linear chain.
reducing_end: int The index of the carbon which hosts the reducing end.
modifications: OrderedMultiMap The mapping of sites to Modification entries. Directly modifies the instance’s composition.
links: OrderedMultiMap The mapping of sites to Link entries that refer to other Monosaccharide instances.
substituent_links: OrderedMultiMap The mapping of sites to Link entries that refer to Substituent instances.
composition: Composition An instance of Composition corresponding to the elemental composition of self and its immediate modifications. If not provided, this will be inferred from field values.
reduced: ReducedEnd An instance of ReducedEnd, or the value True, represents a reduced sugar. May be inferred from modifications if “aldi” is present.
__eq__(other)[source]

Test for equality between Monosaccharide instances. First try scalar equality of fields, and then compare descendants.

__getitem__(position)[source]

Gets the collection of alterations made to the carbohydrate backbone at position. This queries modifications, links, and substituent_links.

Returns:dict
__setstate__(state)[source]

Does some testing to upgrade outdated, but equivalent modification models.

_fast_reduce(value)[source]

Expedite adding a reducing end to this monosaccharide. Assumes that value is a ReducedEnd and that this monosaccharide is not already reduced.

Parameters:

value : ReducedEnd

The ReducedEnd instance to attach.

_flat_equality(other, lengths=True)[source]

Test for equality of all scalar-ish features that do not require recursively comparing links which in turn compare their connected units.

_match_substituents(other)[source]

Helper method for matching substituents in an order-independent fashion. Used by topological_equality()

add_modification(modification, position, max_occupancy=0)[source]

Adds a modification instance to modifications at the site given by position. This directly modifies composition, consequently changing mass()

Parameters:

position: int or ‘x’

The location to add the Modification to.

modification: str or Modification

The modification to add. If passed a str, it will be translated into an instance of Modification

max_occupancy: int, optional

The maximum number of items acceptable at position. Defaults to 0.

Returns:

Monosaccharide:

self, for chain calls

Raises:

IndexError

position exceeds the bounds set by superclass.

ValueError

position is occupied by more than max_occupancy elements
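The two error conditions above can be sketched as a small validation step. This is a minimal illustration under stated assumptions, not the library's code: check_site, occupants, and backbone_size are hypothetical names, and -1 is assumed to denote an unknown position that bypasses the bounds check.

```python
def check_site(occupants, position, backbone_size, max_occupancy=0):
    """Validate a site before attaching something to it.

    occupants maps a carbon index to the number of items already bound there.
    Hypothetical helper: -1 is treated as an unknown position.
    """
    if position != -1 and not (1 <= position <= backbone_size):
        raise IndexError("position %r is outside the backbone" % (position,))
    if occupants.get(position, 0) > max_occupancy:
        raise ValueError("position %r is already occupied" % (position,))
    return True
```

With max_occupancy=0, any existing occupant at the site triggers the ValueError; raising the limit to 1 tolerates a single occupant.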

add_monosaccharide(monosaccharide, position=-1, max_occupancy=0, child_position=-1, parent_loss=None, child_loss=None)[source]

Adds a Monosaccharide and associated Link to links at the site given by position.

>>> from glypy import monosaccharides
>>> hexnac = monosaccharides.HexNAc
>>> hex = monosaccharides.Hex
>>> hexnac.add_monosaccharide(hex, 1)
RES 1b:x-xx-HEX-1:5 2s:n-acetyl LIN 1:1d(2+1)2n
>>> hexnac.links[1][0].child
RES 1b:x-xx-HEX-1:5
Parameters:

monosaccharide: Monosaccharide

The monosaccharide to add.

position: int or ‘x’

The location to add the Monosaccharide link to links. Defaults to -1

child_position: int

The location to add the link to in monosaccharide's links. Defaults to -1.

max_occupancy: int, optional

The maximum number of items acceptable at position. Defaults to 0.

parent_loss: Composition or str

The elemental composition removed from self

child_loss: Composition or str

The elemental composition removed from monosaccharide

Returns:

Monosaccharide:

self, for chain calls

Raises:

IndexError

position exceeds the bounds set by superclass.

ValueError

position is occupied by more than max_occupancy elements

add_substituent(substituent, position=-1, max_occupancy=0, child_position=1, parent_loss=None, child_loss=None)[source]

Adds a Substituent and associated Link to substituent_links at the site given by position. This new substituent is included when calculating mass with substituents included.

>>> from glypy import monosaccharides
>>> hex = monosaccharides.Hex
>>> hexnac = monosaccharides.HexNAc
>>> hex.add_substituent("n-acetyl", 2, parent_loss="OH")
RES 1b:x-xx-HEX-1:5 2s:n-acetyl LIN 1:1d(2+1)2n
>>> hexnac == hex
True
Parameters:

substituent: str or Substituent

The substituent to add. If passed a str it will be translated into an instance of Substituent.

position: int or ‘x’

The location to add the Substituent link to substituent_links. Defaults to -1

child_position: int

The location to add the link to in substituent's links. Defaults to 1. Substituent indices are currently not checked.

max_occupancy: int, optional

The maximum number of items acceptable at position. Defaults to 0.

parent_loss: Composition or str

The elemental composition removed from self

child_loss: Composition or str

The elemental composition removed from substituent

Returns:

Monosaccharide:

self, for chain calls

Raises:

IndexError

position exceeds the bounds set by superclass.

ValueError

position is occupied by more than max_occupancy elements

children()[source]

Returns an iterator over the Monosaccharide instances which are considered the descendants of self.

Alias for __iter__

>>> from glypy import glycans
>>> n_linked_core = glycans["N-Linked Core"]
>>> ch = n_linked_core.root.children()
>>> ch[0]
(4, RES 1b:b-dglc-HEX-1:5 2s:n-acetyl LIN 1:1d(2+1)2n)
Returns:

list of

position: int

Location of the bond to the child Monosaccharide

child: Monosaccharide

Monosaccharide at position

clone(prop_id=False, fast=True, monosaccharide_type=None)[source]

Copies just this Monosaccharide and its Substituents, creating a separate instance with the same data. All mutable data structures are duplicated and distinct from the original.

Does not copy any links as this would cause recursive duplication of the entire Glycan graph.

Returns:Monosaccharide
drop_modification(position, modification)[source]

Remove the modification at position

Parameters:

position: int

The position to drop the modification from

modification: Modification

The Modification to remove.

Returns:

Monosaccharide:

self, for chain calls

Raises:

IndexError:

If position is not a valid carbohydrate backbone position

ValueError:

If modification is not found at position

drop_monosaccharide(position, refund=True)[source]

Remove the glycosidic bond at position, detaching a connected Monosaccharide.

If there is more than one glycosidic bond at position, an error will be raised.

>>> from glypy import glycans
>>> n_linked_core = glycans["N-Linked Core"]
>>> n_linked_core.root.drop_monosaccharide(4)
RES 1b:b-dglc-HEX-1:5 2s:n-acetyl LIN 1:1d(2+1)2n
>>> n_linked_core.mass()
221.08993720321
Parameters:

position: int

The position to drop the monosaccharide from

refund: bool

Passed to break_link()

Returns:

Monosaccharide:

self, for chain calls

Raises:

IndexError:

If position is not a valid carbohydrate backbone position

ValueError:

If no Link or more than one Link is found at position

drop_substituent(position, substituent=None, refund=True)[source]

Remove the substituent at position.

If substituent is None, then the first substituent found at position is removed.

>>> from glypy import monosaccharides
>>> hex = monosaccharides.Hex
>>> hexnac = monosaccharides.HexNAc
>>> hexnac.drop_substituent(2)
RES 1b:x-xx-HEX-1:5
>>> hexnac == hex
True
Parameters:

position: int

The position to drop the substituent from

substituent: Substituent

The Substituent to remove. If None, the first substituent found at position will be removed

refund: bool

Passed to break_link()

Returns:

Monosaccharide:

self, for chain calls

Raises:

IndexError:

If position is not a valid carbohydrate backbone position

ValueError:

If substituent is not found at position

exact_ordering_equality(other, substituents=True, visited=None)[source]

Performs equality testing between two monosaccharides where the exact position (and ordering by sort) of links must match between the input Monosaccharide objects.

Returns:bool
is_occupied(position)[source]

Checks to see if a particular backbone position is occupied by a Modification, Substituent, or Link to another Monosaccharide.

Parameters:

position: int

The position to check for occupancy. Passing -1 checks for undetermined attachments.

Returns:

int:

The number of occupants at position

Raises:

IndexError:

When the position is less than 1 or exceeds the limits of the carbohydrate backbone’s size.

mass(average=False, charge=0, mass_data=None, substituents=True)[source]

Calculates the total mass of self.

Parameters:

average: bool, optional, defaults to False

Whether or not to use the average isotopic composition when calculating masses. When average == False, masses are calculated using monoisotopic mass.

charge: int, optional, defaults to 0

If charge is non-zero, m/z is calculated, where m is the theoretical mass, and z is charge

mass_data: dict, optional

If mass_data is None, standard NIST mass and isotopic abundance data are used. Otherwise the contents of mass_data are assumed to contain elemental mass and isotopic abundance information. Defaults to None.

substituents: bool, optional, defaults to True

Whether or not to include substituents’ masses.

Returns:float
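To make the monoisotopic versus average distinction concrete, here is a small self-contained sketch that sums atomic masses over an elemental composition. It does not call glypy; the mass tables hold commonly tabulated values for four elements only, formula_mass is a hypothetical helper, and the HexNAc formula C8H15NO6 is supplied by hand.

```python
# Atomic masses for the four elements needed here (illustrative tables only).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}
AVERAGE = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def formula_mass(composition, average=False):
    """Sum atomic masses over an element -> count mapping."""
    table = AVERAGE if average else MONOISOTOPIC
    return sum(table[element] * count for element, count in composition.items())


hexnac = {"C": 8, "H": 15, "N": 1, "O": 6}       # N-acetylhexosamine
mono = formula_mass(hexnac)                      # monoisotopic mass
avg = formula_mass(hexnac, average=True)         # average mass
```

The monoisotopic result agrees, to within floating-point rounding, with the 221.0899... value printed in the drop_monosaccharide() example, where only the root HexNAc remains.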

open_attachment_sites(max_occupancy=0)[source]

When attaching Monosaccharide instances to other objects, bonds are formed between the carbohydrate backbone and the other object. If a site is already bound, the occupying object fills that space on the backbone and prevents other objects from binding there.

Currently only cares about the availability of the hydroxyl group. As there is not a hydroxyl attached to the ring-ending carbon, that should not be considered an open site.

If any existing attached units have unknown positions, we can’t provide any known positions, in which case the list of open positions will be a list of -1 values whose length is the number of open sites.

Parameters:

max_occupancy: int

The number of objects that may already be bound at a site before it is considered unavailable for attachment.

Returns:

list:

The positions open for binding

int:

The number of bound but unknown locations on the backbone.
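The rules above can be approximated as follows. This is a sketch under assumptions, not the library's implementation: open_sites is a hypothetical function, occupancy is a plain dict mapping carbon index to the number of bound objects, and -1 is the key for attachments at unknown positions.

```python
def open_sites(backbone_size, ring_end, occupancy, max_occupancy=0):
    """Return (open positions, count of attachments at unknown positions).

    Hypothetical sketch: occupancy maps carbon index -> bound count, -1 = unknown.
    """
    unknown = occupancy.get(-1, 0)
    candidates = [
        pos for pos in range(1, backbone_size + 1)
        # The ring-ending carbon has no free hydroxyl, so it is never open.
        if pos != ring_end and occupancy.get(pos, 0) <= max_occupancy
    ]
    if unknown:
        # Real positions can no longer be trusted, so report placeholders only.
        candidates = [-1] * len(candidates)
    return candidates, unknown
```

For a six-carbon backbone closed at carbon 5 with carbon 1 occupied, the sketch reports positions 2, 3, 4, and 6 as open. If instead one attachment sits at an unknown position, every reported entry becomes -1.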

order(deep=False)[source]

Return the “graph theory” order of this molecule

Returns:int
parents()[source]

Returns an iterator over the Monosaccharide instances which are considered the ancestors of self.

Returns:

list of

position: int

Location of the bond to the parent Monosaccharide

parent: Monosaccharide

Monosaccharide at position

reducing_end

Return the reducing end type of self or None if self is not reduced. The reducing end value can also be found in modifications.

If Modification.aldi is present, it will be converted into an instance of ReducedEnd with default arguments.

TODO: Remove Redundancy between aldi check and reduction.

Returns:ReducedEnd or None
ring_type

The size of the ring-shape of the carbohydrate, as computed by ring_end - ring_start.

Returns:

EnumValue:

The appropriate value of RingType
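A minimal sketch of the ring_end - ring_start rule, using plain strings instead of RingType entries. The function and the returned labels are illustrative assumptions; the chemistry is standard: a ring spanning carbons 1 to 5 is six-membered (pyranose) once the ring oxygen is counted, and a span of three carbons gives a five-membered furanose.

```python
def ring_type(ring_start, ring_end):
    """Classify a ring from its start and end carbons (labels are illustrative)."""
    unknown_markers = (None, -1, "x")
    if ring_start in unknown_markers or ring_end in unknown_markers:
        return "unknown"
    if ring_start == 0 and ring_end == 0:
        return "open"          # linear chain, no ring
    span = ring_end - ring_start
    if span == 4:
        return "pyranose"      # five carbons plus the ring oxygen
    if span == 3:
        return "furanose"      # four carbons plus the ring oxygen
    return "unknown"
```

The HEX-1:5 residues shown in the examples on this page would fall into the pyranose case under this rule.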

substituents()[source]

Returns an iterator over all substituents attached to self by a Link object stored in substituent_links

Returns:

list of

position: int

Location of the bond to the substituent

substituent: Substituent

Substituent at position

topological_equality(other, substituents=True, visited=None)[source]

Performs equality testing between two monosaccharides where the exact ordering of child links does not have to match between the input Monosaccharide objects, so long as an exact match of the subtrees is found.

Returns:bool
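The difference between order-sensitive and order-independent comparison can be shown on toy trees. The sketch below is an assumption-laden illustration, not glypy's algorithm: each node is a (name, children) tuple, exact_equal compares children pairwise in stored order, and topological_equal looks for any one-to-one matching of subtrees.

```python
def exact_equal(a, b):
    """Children must match pairwise, in the order they are stored."""
    if a[0] != b[0] or len(a[1]) != len(b[1]):
        return False
    return all(exact_equal(x, y) for x, y in zip(a[1], b[1]))


def topological_equal(a, b):
    """Children may appear in any order, as long as every subtree finds a partner."""
    if a[0] != b[0] or len(a[1]) != len(b[1]):
        return False
    unmatched = list(b[1])
    for child in a[1]:
        for i, candidate in enumerate(unmatched):
            if topological_equal(child, candidate):
                del unmatched[i]   # each subtree of b can be used only once
                break
        else:
            return False           # no partner found for this child
    return True


left = ("Man", [("Gal", []), ("Fuc", [])])
right = ("Man", [("Fuc", []), ("Gal", [])])
```

Here left and right hold the same subtrees in a different order, so they are topologically equal but not equal under exact ordering.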
total_composition()[source]

Computes the sum of the composition of self and each of its linked Substituent objects.

Returns:Composition
glypy.structure.monosaccharide._get_standard_composition(monosaccharide)[source]

Used to get initial composition for a given monosaccharide SuperClass and modifications.

Used during initialization of a Monosaccharide.

Parameters:

monosaccharide: Monosaccharide

The Monosaccharide object to read attributes from

Returns:

Composition:

The baseline composition from monosaccharide.superclass + monosaccharide.modifications

glypy.structure.monosaccharide._traverse_debug(monosaccharide, visited=None, apply_fn=identity)[source]

A low-level depth-first traversal method for unwrapped residue graphs when the id attribute may be masking duplicate residues

Parameters:

monosaccharide: Monosaccharide

Residue to start traversing from

visited: set or None

The collection of node ids to ignore, having already visited them. If None, it defaults to the empty set.

apply_fn: function

Function to apply to each residue before yielding them

Yields:

Monosaccharide

glypy.structure.monosaccharide.depth(monosaccharide, visited=None)[source]

Calculate the distance from monosaccharide to its furthest descendant node.

glypy.structure.monosaccharide.graph_clone(monosaccharide, visited=None)[source]

Low-level depth-first duplication method for unwrapped residue graphs

Parameters:

monosaccharide: Monosaccharide

The root of the graph to clone

visited: set or None

The collection of node ids to ignore, having already visited them. If None, it defaults to the empty set.

Returns:

Monosaccharide:

The root of a newly duplicated and identical residue graph

glypy.structure.monosaccharide.release(monosaccharide)[source]

Break all monosaccharide-monosaccharide links on monosaccharide, returning them as a list. Breaking is done with refund=True

Parameters:

monosaccharide : Monosaccharide

Monosaccharide to break all links on

Returns:

list of tuple(link, (link.parent, link.child))

glypy.structure.monosaccharide.toggle(monosaccharide)[source]

A simple generator for declaratively masking and unmasking a residue’s links. The first iteration masks all links. The second unmasks them. Calls release().

Parameters:

monosaccharide : Monosaccharide

Monosaccharide to mask links on
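The two-step generator pattern described above can be sketched with a plain list standing in for a residue's links. Everything here is hypothetical and independent of glypy: the first next() call masks by removing the links, the second restores them.

```python
def toggle(links):
    """Two-step generator: the first next() masks, the second unmasks.

    links is a plain list standing in for a residue's links (hypothetical).
    """
    saved = list(links)
    links.clear()          # step 1: mask by removing every link
    yield True
    links.extend(saved)    # step 2: restore the saved links
    yield False


links = ["a-b", "a-c"]
masker = toggle(links)
next(masker)
masked_view = list(links)      # snapshot taken while the links are masked
next(masker)
restored_view = list(links)    # snapshot taken after the links are restored
```

Because the generator body only runs up to the next yield, the caller controls exactly when masking ends, which is what makes the pattern useful for temporarily isolating a node.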

glypy.structure.monosaccharide.traverse(monosaccharide, visited=None, apply_fn=identity)[source]

A low-level depth-first traversal method for unwrapped residue graphs

Parameters:

monosaccharide: Monosaccharide

Residue to start traversing from

visited: set or None

The collection of node ids to ignore, having already visited them. If None, it defaults to the empty set.

apply_fn: function

Function to apply to each residue before yielding them

Yields:

Monosaccharide
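A depth-first traversal with a visited set and an apply_fn hook can be sketched as below. This is an illustration under assumptions rather than the library's code: Node is a hypothetical stand-in for a residue, carrying only an id and a list of children, and the traversal is written iteratively with an explicit stack.

```python
class Node:
    """Hypothetical stand-in for a residue: an id plus child nodes."""

    def __init__(self, id, children=()):
        self.id = id
        self.children = list(children)


def identity(x):
    return x


def traverse(node, visited=None, apply_fn=identity):
    """Depth-first walk that skips nodes whose id has already been seen."""
    if visited is None:
        visited = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current.id in visited:
            continue
        visited.add(current.id)
        yield apply_fn(current)
        # Push children in reverse so the first child is visited first.
        stack.extend(reversed(current.children))


leaf = Node(3)
root = Node(1, [Node(2, [leaf]), leaf])   # leaf is reachable by two paths
order = [n.id for n in traverse(root)]
```

The shared leaf is yielded once even though two paths reach it, because its id is recorded in visited on the first encounter.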