Heuristic Similarity

A collection of routines for fuzzy matching of monosaccharides.

glypy.algorithms.similarity.build_unique_index_pairs(pairs)

Generate all unique, non-overlapping sets of pairs from the candidate pairs given in pairs.
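To make "non-overlapping sets of pairs" concrete, here is a minimal sketch in plain Python. It is not glypy's implementation: the function name unique_pair_sets is hypothetical, and it assumes that each left-hand item must be paired exactly once and that no right-hand item may be reused within a set.

```python
from collections import defaultdict


def unique_pair_sets(pairs):
    """Enumerate sets of pairs in which no left or right item is reused.

    Hypothetical stand-in for illustration only, not glypy's code.
    """
    # Group the candidate right-hand items under each left-hand item,
    # keeping first-seen order and dropping duplicate candidates.
    candidates = defaultdict(list)
    for left, right in pairs:
        if right not in candidates[left]:
            candidates[left].append(right)

    # Extend every partial set with each still-unused right-hand item.
    partial = [()]
    for left, rights in candidates.items():
        extended = []
        for chosen in partial:
            used = {r for _, r in chosen}
            for right in rights:
                if right not in used:
                    extended.append(chosen + ((left, right),))
        partial = extended
    return [list(p) for p in partial]


example = unique_pair_sets([(0, 0), (0, 1), (1, 0), (1, 1)])
print(example)
```

With two items on each side and every pairing allowed, the sketch yields the two possible one-to-one matchings. A partial set is dropped as soon as some left-hand item has no unused candidate left, which follows from the "paired exactly once" assumption above.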

glypy.algorithms.similarity.monosaccharide_similarity(node, target, include_substituents=True, include_modifications=True, include_children=False, exact=True, ignore_reduction=False, visited=None, short_circuit_after=None)

A heuristic for measuring similarity between monosaccharide instances

Compares:
  1. ring_start and ring_end
  2. superclass
  3. configuration
  4. stem
  5. anomer
  6. If include_modifications, each modification
  7. If include_substituents, each substituent
  8. If include_children, each child Monosaccharide

Parameters:

node: Monosaccharide

Object to compare with

target: Monosaccharide

Object to compare against

include_substituents: bool

Include substituents in the comparison (defaults to True)

include_modifications: bool

Include modifications in the comparison (defaults to True)

include_children: bool

Include children in the comparison (defaults to False)

exact: bool

Penalize unmatched attachments (defaults to True)

Returns:

res: int

Number of actual matching traits

qs: int

Number of expected matching traits assuming perfect equality
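The pair of counts (res, qs) can be read as "traits matched" out of "traits expected". The sketch below shows that idea on a toy data structure. It does not use glypy and does not reproduce its scoring: ToyResidue and toy_similarity are hypothetical names, and the handling of exact here is only one plausible reading of "penalize unmatched attachments".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyResidue:
    """Hypothetical, simplified stand-in for a monosaccharide."""
    superclass: str
    stem: str
    anomer: str
    substituents: tuple = ()


def toy_similarity(node, target, include_substituents=True, exact=True):
    """Return (res, qs): traits that matched, and traits expected to match."""
    res = 0
    qs = 0
    # Each scalar trait is expected to match once.
    for field in ("superclass", "stem", "anomer"):
        qs += 1
        if getattr(node, field) == getattr(target, field):
            res += 1
    if include_substituents:
        remaining = list(target.substituents)
        for sub in node.substituents:
            qs += 1
            if sub in remaining:
                remaining.remove(sub)
                res += 1
        if exact:
            # Assumption: leftover attachments on target raise the
            # expected count, which lowers the res / qs ratio.
            qs += len(remaining)
    return res, qs


glcnac = ToyResidue("hex", "glc", "beta", ("n_acetyl",))
glc = ToyResidue("hex", "glc", "beta")

res, qs = toy_similarity(glc, glcnac)
print(res, qs, res / qs)
```

Dividing res by qs gives a normalized score in [0, 1], which is convenient when comparing residues that carry different numbers of attachments.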

glypy.algorithms.similarity.optimal_assignment(assignments, score_fn)

Given a set of possibly overlapping matches, find the optimal solution by brute force. Each pairing in assignments is evaluated with score_fn.
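A brute-force search of this kind can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than glypy's code: the name brute_force_optimal is hypothetical, assignments is taken to be a list of (left, right) pairs of hashable items, score_fn is taken to accept the two items of a pair, and "optimal" is taken to mean the highest total score.

```python
from itertools import combinations


def brute_force_optimal(assignments, score_fn):
    """Return (best_score, best_subset) over all non-overlapping subsets.

    Hypothetical sketch: tries every subset of candidate pairs, so the
    cost grows exponentially with the number of candidates.
    """
    assignments = list(assignments)
    best_score, best_subset = 0, []
    for size in range(1, len(assignments) + 1):
        for subset in combinations(assignments, size):
            lefts = {a for a, _ in subset}
            rights = {b for _, b in subset}
            if len(lefts) < size or len(rights) < size:
                continue  # some item is used by more than one pair
            total = sum(score_fn(a, b) for a, b in subset)
            if total > best_score:
                best_score, best_subset = total, list(subset)
    return best_score, best_subset


# Hypothetical scores for each candidate pairing.
scores = {(0, 0): 3, (0, 1): 1, (1, 0): 2, (1, 1): 3}
best = brute_force_optimal(scores.keys(), lambda a, b: scores[(a, b)])
print(best)
```

Exhaustive enumeration is only practical for small inputs, which is typically the case when matching the handful of children attached to a single residue.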