Semantic Similarity
This module contains the classes used to evaluate semantic similarity.
Refer to the individual classes for details on the implemented measures.
SemSim.TermSemSim
This class provides the prototype for Term semantic similarity (TSS) measures.
There are two types of TSS measures: groupwise measures (G_TSS), which evaluate the semantic similarity between two sets of terms, and pairwise measures (P_TSS), which can only evaluate the similarity between pairs of GO terms. Each class extending TermSemSim should declare whether it is groupwise or pairwise.
TermSemSim relies on SemSimUtils for many tasks (e.g. evaluating term IC or finding common ancestors).
A SemSimUtils object can be passed to the constructor as input. Otherwise, a new instance is created. Sharing a single SemSimUtils instance reduces time and space requirements and is strongly recommended.
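The following is a minimal sketch of the two ideas above: subclasses declaring whether they are pairwise or groupwise, and several measures sharing one helper object so that cached data is computed once. All class names (ToyUtils, ToyMeasure, ToyPairwise, ToyGroupwise), the scoring rule and the toy ontology are hypothetical stand-ins for illustration; they are not the fastsemsim classes.

```python
class ToyUtils:
    """Hypothetical helper that caches ancestor sets for reuse."""

    def __init__(self, parents):
        self.parents = parents   # term -> set of direct parents
        self._cache = {}
        self.lookups = 0         # number of ancestor sets built from scratch

    def ancestors(self, term):
        if term not in self._cache:
            self.lookups += 1
            found = {term}
            for parent in self.parents.get(term, ()):
                found |= self.ancestors(parent)
            self._cache[term] = found
        return self._cache[term]


class ToyMeasure:
    P_TSS = "Pairwise"
    G_TSS = "Groupwise"
    SS_type = None               # each subclass declares its type

    def __init__(self, parents, util=None):
        # Reuse the shared helper when one is given, otherwise build a private one.
        self.util = util if util is not None else ToyUtils(parents)


class ToyPairwise(ToyMeasure):
    SS_type = ToyMeasure.P_TSS

    def SemSim(self, term1, term2):
        # Scores exactly two terms by the overlap of their ancestor sets.
        a, b = self.util.ancestors(term1), self.util.ancestors(term2)
        return len(a & b) / len(a | b)


class ToyGroupwise(ToyMeasure):
    SS_type = ToyMeasure.G_TSS

    def SemSim(self, terms1, terms2):
        # Scores two sets of terms directly, without pairing them up.
        a = set().union(*(self.util.ancestors(t) for t in terms1))
        b = set().union(*(self.util.ancestors(t) for t in terms2))
        return len(a & b) / len(a | b) if a | b else 0.0


parents = {"root": set(), "a": {"root"}, "b": {"root"}, "c": {"a"}}
shared = ToyUtils(parents)
pair = ToyPairwise(parents, util=shared)
group = ToyGroupwise(parents, util=shared)

pair_score = pair.SemSim("c", "b")
group_score = group.SemSim(["c"], ["a", "b"])
```

Because both measures receive the same helper, the groupwise call finds every ancestor set already cached and `shared.lookups` does not grow.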
-
exception fastsemsim.SemSim.TermSemSim.MissingAcException(message)
Bases: exceptions.Exception
-
class fastsemsim.SemSim.TermSemSim.TermSemSim(ontology, ac=None, util=None, do_log=False)
Bases: object
-
G_TSS = 'Groupwise'
-
IC_based = None
-
P_TSS = 'Pairwise'
-
SS_type = None
-
SemSim(term1, term2, ontology=None)
-
format_and_check_data = True
-
setSanityCheck(en)
SemSim.ObjSemSim
This class provides the prototype for a generic Object Semantic Similarity measure.
-
class fastsemsim.SemSim.ObjSemSim.ObjSemSim(ontology, ac, TSS=None, MSS=None, util=None, do_log=False)
Bases: object
-
SemSim(obj1, obj2, root=None)
SemSim.ObjSetSemSim
This class provides the prototype for a generic Object Set Semantic Similarity measure (PSS).
-
class fastsemsim.SemSim.ObjSetSemSim.ObjSetSemSim(ontology, ac, TSS=None, MSS=None, util=None, do_log=False)
-
SemSim(obj1, obj2, root=None)
SemSim.SetSemSim
This class provides the prototype for a generic Pairwise Object Semantic Similarity measure.
-
class fastsemsim.SemSim.SetSemSim.SetSemSim(ontology, ac=None, TSS=None, MSS=None, util=None, do_log=False)
-
SemSim(obj1, obj2, root=None)
Specific Semantic Similarity measures
SemSim.ResnikSemSim
Resnik Semantic Similarity Measure
Reference: Resnik, P. (1999). Semantic similarity in a taxonomy: An information-based measure and its application to problems of ambiguity in natural language. Journal of Artificial Intelligence Research, 11, 95-130.
-
class fastsemsim.SemSim.ResnikSemSim.ResnikSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = True
-
SS_type = 'Pairwise'
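As a rough illustration of an IC-based pairwise measure, Resnik similarity is usually described as the information content (IC) of the most informative common ancestor (MICA) of two terms. The sketch below assumes a toy ontology and made-up term probabilities; the names PARENTS, PROB, ancestors, ic and resnik are hypothetical and do not reflect how ResnikSemSim is implemented. The base-2 logarithm is also an arbitrary choice here.

```python
import math

# Toy ontology: term -> set of direct parents (hypothetical identifiers).
PARENTS = {"root": set(), "a": {"root"}, "b": {"root"}, "c": {"a"}, "d": {"a"}}
# Made-up probabilities of observing each term in an annotation corpus.
PROB = {"root": 1.0, "a": 0.5, "b": 0.5, "c": 0.25, "d": 0.125}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    result = {term}
    for parent in PARENTS[term]:
        result |= ancestors(parent)
    return result


def ic(term):
    """Information content: rarer terms carry more information."""
    return math.log2(1.0 / PROB[term])


def resnik(term1, term2):
    """IC of the most informative common ancestor of the two terms."""
    common = ancestors(term1) & ancestors(term2)
    return max(ic(t) for t in common)
```

With these numbers, `resnik("c", "d")` is the IC of "a", while `resnik("c", "b")` falls back to the root and is zero.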
SemSim.CosineSemSim
Cosine Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.CosineSemSim.CosineSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = False
-
SS_type = 'Groupwise'
-
dotprod(vector1, vector2)
-
extend_annotations = True
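A minimal sketch of a cosine measure over term sets, assuming each set is encoded as a binary vector over the union of terms and that the sets have already been extended with ancestors (one plausible reading of extend_annotations). The helper names dot and cosine_term_sets are hypothetical and are not the class methods listed above.

```python
import math


def dot(v1, v2):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(v1, v2))


def cosine_term_sets(terms1, terms2):
    # Fix an ordering of all terms so both sets map to vectors of equal length.
    vocabulary = sorted(terms1 | terms2)
    v1 = [1 if t in terms1 else 0 for t in vocabulary]
    v2 = [1 if t in terms2 else 0 for t in vocabulary]
    norm = math.sqrt(dot(v1, v1) * dot(v2, v2))
    # An empty set has no direction; this sketch returns 0.0 in that case.
    return dot(v1, v2) / norm if norm else 0.0


set1 = {"root", "a", "c", "e"}
set2 = {"root", "a", "b", "d"}
score = cosine_term_sets(set1, set2)
```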
SemSim.CzekanowskiDiceSemSim
Czekanowski and Dice Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.CzekanowskiDiceSemSim.CzekanowskiDiceSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = False
-
SS_type = 'Groupwise'
-
extend_annotations = True
SemSim.DiceSemSim
Dice Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.DiceSemSim.DiceSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = False
-
SS_type = 'Groupwise'
-
extend_annotations = True
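For reference, the Dice coefficient of two sets is twice the size of their intersection divided by the sum of their sizes. The sketch below applies it to two term sets that are assumed to be already extended with ancestors; the function name dice and the term identifiers are hypothetical.

```python
def dice(terms1, terms2):
    """Dice coefficient: 2 * |A & B| / (|A| + |B|)."""
    total = len(terms1) + len(terms2)
    # Two empty sets give 0.0 here by choice, to avoid dividing by zero.
    return 2 * len(terms1 & terms2) / total if total else 0.0


dice_score = dice({"root", "a", "c"}, {"root", "a", "b", "d", "e"})
```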
SemSim.GSESAMESemSim
G-SESAME Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.GSESAMESemSim.GSESAMESemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = False
-
SS_type = 'Pairwise'
-
generic_score = 0.5
-
is_a_score = 0.8
-
neg_regulates_score = 0.6
-
part_of_score = 0.6
-
pos_regulates_score = 0.6
-
regulates_score = 0.6
-
score_ancestors(term)
-
score_edge(tp, t)
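The edge scores above suggest the usual G-SESAME scheme, in which every ancestor of a term receives an S-value: 1.0 for the term itself, and otherwise the best product of edge weights along any path from the term up to that ancestor. The following sketch works under that assumption with a toy graph; EDGES, WEIGHTS, s_values and gsesame are hypothetical names, and only the is_a (0.8) and part_of (0.6) weights are taken from the attributes listed above.

```python
# Toy DAG: child -> list of (parent, relation); identifiers are hypothetical.
EDGES = {
    "root": [],
    "a": [("root", "is_a")],
    "b": [("root", "is_a")],
    "c": [("a", "is_a"), ("b", "part_of")],
}
WEIGHTS = {"is_a": 0.8, "part_of": 0.6}


def s_values(term):
    """S-value of the term (1.0) and of each of its ancestors."""
    scores = {term: 1.0}
    stack = [term]
    while stack:
        child = stack.pop()
        for parent, relation in EDGES[child]:
            candidate = WEIGHTS[relation] * scores[child]
            # Keep the best contribution and revisit the parent when it improves.
            if candidate > scores.get(parent, 0.0):
                scores[parent] = candidate
                stack.append(parent)
    return scores


def gsesame(term1, term2):
    """Shared S-values divided by the total S-value of both terms."""
    s1, s2 = s_values(term1), s_values(term2)
    shared = s1.keys() & s2.keys()
    numerator = sum(s1[t] + s2[t] for t in shared)
    return numerator / (sum(s1.values()) + sum(s2.values()))
```

In this toy graph the root reaches "c" through two paths, and the is_a path through "a" wins because 0.8 * 0.8 exceeds 0.6 * 0.8.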
SemSim.JaccardSemSim
Jaccard Index based Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.JaccardSemSim.JaccardSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = False
-
SS_type = 'Groupwise'
-
extend_annotations = True
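The Jaccard index of two sets is the size of their intersection divided by the size of their union. A minimal sketch for term sets, assuming they have already been extended with ancestors; the function name jaccard and the term identifiers are hypothetical.

```python
def jaccard(terms1, terms2):
    """Jaccard index: |A & B| / |A | B|."""
    union = terms1 | terms2
    # Two empty sets give 0.0 here by choice, to avoid dividing by zero.
    return len(terms1 & terms2) / len(union) if union else 0.0


jaccard_score = jaccard({"root", "a", "c"}, {"root", "a", "b"})
```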
SemSim.JiangConrathSemSim
Jiang and Conrath Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.JiangConrathSemSim.JiangConrathSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = True
-
SS_type = 'Pairwise'
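The Jiang and Conrath measure is usually stated as a distance: the IC of both terms minus twice the IC of their most informative common ancestor (MICA). Several conventions exist for turning that distance into a similarity; the sketch below uses 1 / (1 + distance), which is only one of them and is not necessarily the one used by this class. The function names and the IC values are hypothetical, and the IC of the MICA is assumed to be precomputed.

```python
def jiang_conrath_distance(ic1, ic2, ic_mica):
    """Distance grows as the terms diverge from their shared ancestor."""
    return ic1 + ic2 - 2.0 * ic_mica


def jiang_conrath_similarity(ic1, ic2, ic_mica):
    """One common conversion to a similarity in (0, 1]; an assumption here."""
    return 1.0 / (1.0 + jiang_conrath_distance(ic1, ic2, ic_mica))


jc_distance = jiang_conrath_distance(2.0, 3.0, 1.0)
jc_similarity = jiang_conrath_similarity(2.0, 3.0, 1.0)
```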
SemSim.LinSemSim
Lin Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.LinSemSim.LinSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = True
-
SS_type = 'Pairwise'
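Lin similarity is usually defined as twice the IC of the most informative common ancestor (MICA) divided by the sum of the ICs of the two terms. A minimal sketch, assuming the three IC values are already available; the function name lin and the handling of the zero-IC case are illustrative choices rather than the behaviour of this class.

```python
def lin(ic1, ic2, ic_mica):
    """Lin similarity: 2 * IC(MICA) / (IC(t1) + IC(t2))."""
    total = ic1 + ic2
    # The ratio is undefined when both terms have zero IC; return 0.0 by choice.
    return 2.0 * ic_mica / total if total else 0.0


lin_score = lin(2.0, 3.0, 1.0)
```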
SemSim.SimGICSemSim
SimGIC Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.SimGICSemSim.SimGICSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = True
-
SS_type = 'Groupwise'
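SimGIC is commonly described as a Jaccard index in which every term is weighted by its IC: the summed IC of the shared terms divided by the summed IC of all terms in either set. The sketch below assumes term sets already extended with ancestors and a made-up IC table; IC and simgic are hypothetical names.

```python
# Made-up information content values for a handful of terms.
IC = {"root": 0.0, "a": 1.0, "c": 2.0, "d": 3.0}


def simgic(terms1, terms2):
    """IC-weighted Jaccard index of two term sets."""
    shared = sum(IC[t] for t in terms1 & terms2)
    total = sum(IC[t] for t in terms1 | terms2)
    # Sets that only contain zero-IC terms give 0.0 here by choice.
    return shared / total if total else 0.0


simgic_score = simgic({"root", "a", "c"}, {"root", "a", "d"})
```

Compared with a plain Jaccard index, the uninformative root term no longer inflates the score.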
SemSim.SimICNDSemSim
ICND Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.SimICNDSemSim.ICNDSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = True
-
SS_type = 'Pairwise'
-
generic_score = 1.0
-
is_a_score = 1.0
-
neg_regulates_score = 1.0
-
part_of_score = 1.0
-
pos_regulates_score = 1.0
-
regulates_score = 1.0
-
score_ancestors(term)
-
score_edge(tp, t)
SemSim.SimICNPSemSim
ICNP Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.SimICNPSemSim.ICNPSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = True
-
SS_type = 'Pairwise'
-
generic_score = 1.0
-
is_a_score = 1.0
-
neg_regulates_score = 1.0
-
part_of_score = 1.0
-
pos_regulates_score = 1.0
-
regulates_score = 1.0
-
score_ancestors(term)
-
score_edge(tp, t)
SemSim.SimICSemSim
Information Content Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.SimICSemSim.SimICSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = True
-
SS_type = 'Pairwise'
-
use_Lin = True
SemSim.SimNTOSemSim
Normalized Term Overlap Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.SimNTOSemSim.SimNTOSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = False
-
SS_type = 'Groupwise'
-
extend_annotations = True
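Term overlap is generally described as the number of terms shared by two annotation sets once ancestors are included, and the normalized variant divides that count by the size of the smaller set. The sketch below follows that description; the function names and term identifiers are hypothetical, and the input sets are assumed to be already extended.

```python
def term_overlap(terms1, terms2):
    """Number of terms the two sets have in common."""
    return len(terms1 & terms2)


def normalized_term_overlap(terms1, terms2):
    """Term overlap divided by the size of the smaller set."""
    smaller = min(len(terms1), len(terms2))
    return term_overlap(terms1, terms2) / smaller if smaller else 0.0


to_score = term_overlap({"root", "a", "c"}, {"root", "a", "b", "d"})
nto_score = normalized_term_overlap({"root", "a", "c"}, {"root", "a", "b", "d"})
```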
SemSim.SimRelSemSim
SimRel Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.SimRelSemSim.SimRelSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = True
-
SS_type = 'Pairwise'
-
use_Lin = True
SemSim.SimTOSemSim
SimTO Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.SimTOSemSim.SimTOSemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = False
-
SS_type = 'Groupwise'
SemSim.SimUISemSim
SimUI Semantic Similarity Measure
Reference:
-
class fastsemsim.SemSim.SimUISemSim.SimUISemSim(ontology, ac=None, util=None, do_log=False)
Bases: fastsemsim.SemSim.TermSemSim.TermSemSim
-
IC_based = False
-
SS_type = 'Groupwise'
SemSim.MixSemSim
This class provides the prototype for a generic mixing strategy for pairwise Term Semantic Similarity measures.
-
class fastsemsim.SemSim.MixSemSim.MixSemSim(ontology, ac, util=None, do_log=False)
Bases: object
-
SemSim(set1, set2, TSS)
SemSim.avgSemSim
Average (avg) mixing strategy for pairwise term Protein Semantic Similarity measures
-
class fastsemsim.SemSim.avgSemSim.avgSemSim(ontology, ac, util=None, do_log=False)
Bases: fastsemsim.SemSim.MixSemSim.MixSemSim
SemSim.BMASemSim
Best Match Average (BMA) mixing strategy for pairwise term Protein Semantic Similarity measures
-
class fastsemsim.SemSim.BMASemSim.BMASemSim(ontology, ac, util=None, do_log=False)
Bases: fastsemsim.SemSim.MixSemSim.MixSemSim
-
fair = True
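A mixing strategy turns the scores of a pairwise term measure into a single score for two sets of terms. The sketch below contrasts a plain average over all pairs with one common form of Best Match Average, which averages the best match of each term in either direction. The function names, the toy score table and the choice of BMA variant are assumptions; the role of the fair attribute is not documented here and is not modelled.

```python
# Made-up pairwise scores between terms of two sets (hypothetical identifiers).
SCORES = {
    ("x", "p"): 1.0, ("x", "q"): 0.5, ("x", "r"): 0.0,
    ("y", "p"): 0.25, ("y", "q"): 0.5, ("y", "r"): 0.25,
}


def pair_sim(term1, term2):
    """Stand-in for a pairwise term measure."""
    return SCORES[(term1, term2)]


def average_all_pairs(set1, set2, tss):
    """Mean of the pairwise scores over every combination of terms."""
    values = [tss(a, b) for a in set1 for b in set2]
    return sum(values) / len(values)


def best_match_average(set1, set2, tss):
    """Mean of the two directional averages of best matches."""
    best1 = [max(tss(a, b) for b in set2) for a in set1]
    best2 = [max(tss(a, b) for a in set1) for b in set2]
    return 0.5 * (sum(best1) / len(best1) + sum(best2) / len(best2))


avg_score = average_all_pairs(["x", "y"], ["p", "q", "r"], pair_sim)
bma_score = best_match_average(["x", "y"], ["p", "q", "r"], pair_sim)
```

The plain average is pulled down by unrelated pairs, whereas the best-match form only looks at the closest counterpart of each term.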
SemSim.SemSimUtils
This class provides some routines to calculate basic properties used by different SS measures.
In particular, this class provides code for evaluating:
- term IC
- term frequency within an annotation corpus
- a term’s ancestors
- a term’s offspring
- a term’s children
- a term’s parents
- MICA/DCA/LCA
- the distance between terms
-
class fastsemsim.SemSim.SemSimUtils.SemSimUtils(ontology, ac=None)
Bases: object
-
det_IC(term)
-
det_IC_table()
-
det_MICA(term1, term2)
-
det_ancestors_union(term1, term2)
-
det_common_ancestors(term1, term2)
-
difference(set1, set2)
-
get_ancestors(term1)
-
int_det_IC(term_id)
-
int_det_IC_table()
-
int_det_ancestors(goid, temp_intra)
-
int_det_ancestors_table()
-
int_det_freq(term_id)
-
int_det_freq_table()
-
int_det_lineage()
-
int_det_offspring(goid, temp_intra)
-
int_det_offspring_table()
-
int_det_p(term_id)
-
int_det_p_table()
-
int_merge_sets(set1, set2)
-
intersection(set1, set2)
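To make the relationship between frequency, probability and IC concrete, here is a minimal sketch that derives an IC table from a toy annotation corpus. It assumes that an object annotated with a term also counts towards all ancestors of that term, that frequencies are counted per object, and that IC is the base-2 logarithm of the inverse probability. The names PARENTS, ANNOTATIONS, ancestors, frequency_table and ic_table are hypothetical and do not correspond to the methods listed above.

```python
import math

# Toy ontology (term -> direct parents) and annotation corpus (object -> terms).
PARENTS = {"root": set(), "a": {"root"}, "b": {"root"}, "c": {"a"}}
ANNOTATIONS = {"obj1": {"c"}, "obj2": {"a"}, "obj3": {"b"}, "obj4": {"c", "b"}}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    result = {term}
    for parent in PARENTS[term]:
        result |= ancestors(parent)
    return result


def frequency_table(annotations):
    """Count, for each term, the objects annotated with it or a descendant."""
    freq = {term: 0 for term in PARENTS}
    for terms in annotations.values():
        extended = set()
        for term in terms:
            extended |= ancestors(term)
        for term in extended:
            freq[term] += 1
    return freq


def ic_table(freq, root="root"):
    """IC = log2(1 / p), with p = freq(term) / freq(root)."""
    total = freq[root]
    return {t: math.log2(total / n) for t, n in freq.items() if n > 0}


freq = frequency_table(ANNOTATIONS)
ic = ic_table(freq)
```

Terms that never occur in the corpus are left out of the IC table, since their probability is zero and the logarithm is undefined.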