Gene Ontology (go)
Provides access to Gene Ontology and its gene annotations.
-
class
orangecontrib.bio.go.
Ontology
(filename=None, progress_callback=None, rev=None)¶ Ontology is the class representing a gene ontology.
Parameters: - filename (str) – The name of an .obo formatted file.
- progress_callback – An optional function of type float -> None, called to report progress.
- rev (str) – A CVS revision specifier (see GO web CVS interface).
Example:
>>> # Load the current ontology (downloading it if necessary)
>>> ontology = Ontology()
>>> # Load the ontology at the specified CVS revision.
>>> ontology = Ontology(rev="5.2092")
Ontology supports a subset of the Mapping protocol:
>>> term_ids = list(ontology)
>>> term = ontology[term_ids[0]]
-
__contains__
(termid)¶ Return True if a term with termid is present in the ontology.
-
__getitem__
(termid)¶ Return the Term object with the id termid.
Parameters: termid (str) – The id of a Term in the ontology.
Return type: Term
-
__iter__
()¶ Iterate over all term ids in ontology.
-
__len__
()¶ Return number of terms in ontology.
-
defined_slims_subsets
()¶ Return a list of defined subsets in the ontology.
Return type: list-of-str
-
extract_sub_graph
(terms)¶ Return all sub terms of terms.
Parameters: terms (list) – A list of term IDs.
-
extract_super_graph
(terms)¶ Return all super terms of terms up to the most general one.
Parameters: terms (list) – A list of term IDs.
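The two traversals above can be pictured on a toy ontology. This is a minimal sketch, assuming a plain dict that maps each term ID to its parent IDs; the data, the `IS_A` name and both helper functions are hypothetical and are not the library's internals. The sketch includes the input terms in the result, which may or may not match what the library methods return.

```python
from collections import deque

# Hypothetical toy ontology: term ID -> list of parent term IDs (is_a links).
IS_A = {
    "T:root": [],
    "T:a": ["T:root"],
    "T:b": ["T:root"],
    "T:a1": ["T:a"],
    "T:ab": ["T:a", "T:b"],
}


def super_terms(terms, is_a=IS_A):
    """Return the given terms plus every ancestor reachable via is_a."""
    seen = set(terms)
    queue = deque(terms)
    while queue:
        current = queue.popleft()
        for parent in is_a.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def sub_terms(terms, is_a=IS_A):
    """Return the given terms plus every descendant (inverse of is_a)."""
    # Invert the parent map once so children can be looked up directly.
    children = {}
    for term, parents in is_a.items():
        for parent in parents:
            children.setdefault(parent, []).append(term)
    seen = set(terms)
    queue = deque(terms)
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen
```

For example, `super_terms(["T:ab"])` walks up both parents of `T:ab` to the root, while `sub_terms(["T:a"])` collects `T:a1` and `T:ab`.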
-
named_slims_subset
(subset)¶ Return all term IDs in a named subset.
Parameters: subset (str) – A string naming a subset in the ontology.
Return type: list-of-str
-
set_slims_subset
(subset)¶ Set the slims_subset term subset to subset.
Parameters: subset (set) – A subset of GO term IDs. subset may also be a string, in which case the call is equivalent to
ont.set_slims_subset(ont.named_slims_subset(subset))
-
slims_for_term
(term)¶ Return a list of slim term IDs for term.
This is a list of most specific slim terms to which term belongs.
Parameters: term (str) – Term ID.
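One way to read "most specific slim terms" is: take every slim term that is the term itself or one of its ancestors, then drop any slim term that is an ancestor of another one in that set. The sketch below follows that reading on toy data. The names `ancestors` and `most_specific_slims` and the sample ontology are hypothetical, and the library may compute the result differently.

```python
def ancestors(term, is_a):
    """Return all proper ancestors of term by following is_a links."""
    found = set()
    stack = list(is_a.get(term, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(is_a.get(current, []))
    return found


def most_specific_slims(term, slims, is_a):
    """Return the sorted slim terms closest to term in the hierarchy."""
    # Slim terms among the term itself and all of its ancestors.
    candidates = ({term} | ancestors(term, is_a)) & set(slims)
    # A candidate is redundant if a more specific candidate lies below it.
    covered = set()
    for candidate in candidates:
        covered |= ancestors(candidate, is_a)
    return sorted(candidates - covered)


# Hypothetical toy ontology and slim subset.
TOY_IS_A = {
    "T:root": [],
    "T:a": ["T:root"],
    "T:b": ["T:root"],
    "T:a1": ["T:a"],
    "T:a1x": ["T:a1"],
    "T:ab": ["T:a", "T:b"],
}
TOY_SLIMS = {"T:root", "T:a", "T:b"}
```

With this data, `most_specific_slims("T:a1x", TOY_SLIMS, TOY_IS_A)` keeps only `T:a`, because `T:root` is an ancestor of `T:a` and therefore less specific.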
-
class
orangecontrib.bio.go.
Term
(stanza=None, ontology=None)¶ -
id
¶ The term id.
-
namespace
¶ The namespace of the term.
-
def_
¶ The term definition (note the trailing underscore, which avoids a conflict with the Python keyword def).
-
is_a
¶ List of term ids this term is a subterm of (parent terms).
List of (rel_type, term_id) tuples with rel_type specifying the relationship type with term_id.
-
-
class
orangecontrib.bio.go.
Annotations
(filename_or_organism=None, ontology=None, genematcher=None, progress_callback=None, rev=None)¶ An Annotations object holds the annotations.
Parameters: - filename_or_organism (str) – The name of a GAF formatted annotations file (e.g. gene_annotations.goa_human) or an organism specifier (e.g. 'goa_human' or '9606'). In the latter case the annotations for that organism will be loaded.
- ontology (Ontology) – Ontology object for the annotations.
- rev (str) – An optional CVS revision string. If filename_or_organism is an organism code, the annotations will be retrieved for that revision (see GO web CVS).
-
gene_annotations
= None¶ A dictionary mapping a gene name (DB_Object_Symbol) to a set of all annotations of that gene.
-
term_anotations
= None¶ A dictionary mapping a GO term id to a set of annotations that are directly annotated to that term
-
annotations
= None¶ A list of all AnnotationRecord instances.
-
add_annotation
(a)¶ Add a single AnnotationRecord instance to this object.
-
get_gene_names_translator
(genes)¶ Return a dictionary mapping canonical names (DB_Object_Symbol) to genes.
-
get_all_annotations
(id)¶ Return a set of all annotations (instances of AnnotationRecord) for GO term id and all its subterms.
Parameters: id (str) – GO term id
-
get_all_genes
(id, evidence_codes=None)¶ Return a list of genes annotated, with the specified evidence_codes, to GO term id and all its subterms.
Parameters: - id (str) – GO term id
- evidence_codes (list-of-strings) – List of evidence codes to consider when matching annotations to terms.
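The idea of collecting genes for a term "and all its subterms", optionally filtered by evidence code, can be sketched on toy data. Everything here (the `Ann` tuple, `CHILDREN`, `ANNOTATIONS` and `all_genes`) is hypothetical and only illustrates the lookup; it is not how the library stores annotations.

```python
from collections import namedtuple

# Hypothetical, simplified annotation: gene symbol, GO term, evidence code.
Ann = namedtuple("Ann", ["gene", "go_id", "evidence"])

# Hypothetical toy hierarchy: term ID -> direct subterms.
CHILDREN = {"T:a": ["T:a1", "T:a2"], "T:a1": [], "T:a2": []}

ANNOTATIONS = [
    Ann("g1", "T:a", "IDA"),
    Ann("g2", "T:a1", "IEA"),
    Ann("g3", "T:a2", "IDA"),
]


def all_genes(term_id, evidence_codes=None):
    """Return sorted genes annotated to term_id or any of its subterms."""
    # Gather the term and all of its descendants.
    terms, stack = set(), [term_id]
    while stack:
        current = stack.pop()
        if current not in terms:
            terms.add(current)
            stack.extend(CHILDREN.get(current, []))
    # Keep annotations on those terms that pass the evidence filter.
    return sorted({
        a.gene
        for a in ANNOTATIONS
        if a.go_id in terms
        and (evidence_codes is None or a.evidence in evidence_codes)
    })
```

Here `all_genes("T:a")` picks up genes annotated to the subterms as well, and `all_genes("T:a", ["IDA"])` drops the gene whose only annotation has evidence code IEA.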
-
get_enriched_terms
(genes, reference=None, evidence_codes=None, slims_only=False, aspect=None, prob=<orangecontrib.bio.utils.stats.Binomial object>, use_fdr=True, progress_callback=None)¶ Return a dictionary of enriched terms, with term ids as keys and (list_of_genes, p_value, reference_count) tuples as values. The p-values are FDR adjusted if use_fdr is True (default).
Parameters: - genes – List of genes
- reference – List of genes (if None all genes included in the annotations will be used).
- evidence_codes – List of evidence codes to consider.
- slims_only – If True return only slim terms.
- aspect – Which aspects to use. Use all by default. “P”, “F”, “C” or a set containing these elements.
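The shape of the returned dictionary, and the effect of use_fdr, can be illustrated with a small self-contained sketch. It assumes an upper-tail binomial test (suggested by the default prob argument) and a Benjamini-Hochberg adjustment for the FDR step. Both are assumptions about the procedure, and the helper names and the `term_genes` mapping are hypothetical.

```python
from math import comb


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enriched_terms(genes, reference, term_genes, use_fdr=True):
    """Map term ID -> (matched_genes, p_value, reference_count).

    term_genes is a hypothetical dict: term ID -> set of annotated genes.
    """
    reference = set(reference)
    genes = set(genes) & reference
    results = {}
    for term, annotated in term_genes.items():
        ref_hits = annotated & reference
        hits = sorted(annotated & genes)
        if not hits:
            continue
        # Chance of seeing at least this many hits at the reference rate.
        p = binomial_tail(len(hits), len(genes), len(ref_hits) / len(reference))
        results[term] = (hits, p, len(ref_hits))
    if use_fdr and results:
        terms = list(results)
        adjusted = benjamini_hochberg([results[t][1] for t in terms])
        for term, p_adj in zip(terms, adjusted):
            hits, _, ref_count = results[term]
            results[term] = (hits, p_adj, ref_count)
    return results
```

A term annotated to few reference genes but to most of the query genes gets a small p-value; with use_fdr the values are scaled up to account for the number of terms tested.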
-
get_annotated_terms
(genes, direct_annotation_only=False, evidence_codes=None, progress_callback=None)¶ Return all terms that are annotated by genes with evidence_codes.
-
add
(line)¶ Add one annotation
-
append
(line)¶ Add one annotation
-
extend
(lines)¶ Add multiple annotations
-
class
orangecontrib.bio.go.
AnnotationRecord
¶ An annotation record mapping a gene to a term.
See http://geneontology.org/GO.format.gaf-2_0.shtml for a description of the individual fields.
-
Annotation_Extension
¶ Alias for field number 15
-
Aspect
¶ Alias for field number 8
-
Assigned_By
¶ Alias for field number 14
-
DB
¶ Alias for field number 0
-
DB_Object_ID
¶ Alias for field number 1
-
DB_Object_Name
¶ Alias for field number 9
-
DB_Object_Symbol
¶ Alias for field number 2
-
DB_Object_Synonym
¶ Alias for field number 10
-
DB_Object_Type
¶ Alias for field number 11
-
DB_Reference
¶ Alias for field number 5
-
Date
¶ Alias for field number 13
-
Evidence_Code
¶ Alias for field number 6
-
GO_ID
¶ Alias for field number 4
-
Gene_Product_Form_ID
¶ Alias for field number 16
-
Qualifier
¶ Alias for field number 3
-
Taxon
¶ Alias for field number 12
-
With_From
¶ Alias for field number 7
-
classmethod
from_string
(string)¶ Create an instance from a line in an annotations (GAF 2.0 format) file.
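The field numbers listed above correspond to the 0-based positions of the tab-separated columns in a GAF 2.0 line. A minimal sketch of such a record, using a plain namedtuple, is shown below. `ToyRecord`, `record_from_string` and the sample line are hypothetical stand-ins with made-up values; they are not the library's AnnotationRecord.

```python
from collections import namedtuple

# Field order follows the aliases listed above (0-based positions).
GAF_FIELDS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
    "DB_Reference", "Evidence_Code", "With_From", "Aspect",
    "DB_Object_Name", "DB_Object_Synonym", "DB_Object_Type", "Taxon",
    "Date", "Assigned_By", "Annotation_Extension", "Gene_Product_Form_ID",
]

ToyRecord = namedtuple("ToyRecord", GAF_FIELDS)  # hypothetical stand-in


def record_from_string(line):
    """Split one tab-separated GAF 2.0 line into a ToyRecord."""
    values = line.rstrip("\n").split("\t")
    # Pad short lines so missing trailing columns become empty strings.
    values += [""] * (len(GAF_FIELDS) - len(values))
    return ToyRecord(*values[: len(GAF_FIELDS)])


# Made-up sample line for illustration only (not real annotation data).
SAMPLE_LINE = "\t".join([
    "DB1", "ID0001", "GENE1", "", "GO:0000001", "REF:1", "IDA", "", "P",
    "Example protein", "ALIAS1", "gene", "taxon:0", "20200101", "DB1", "", "",
])

record = record_from_string(SAMPLE_LINE)
print(record.DB_Object_Symbol, record.GO_ID, record.Evidence_Code)
```

Fields can then be read either by name (`record.GO_ID`) or by position (`record[4]`), which is what "Alias for field number 4" refers to.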
-
Example
Load the ontology and print out some terms:
from orangecontrib.bio import go
ontology = go.Ontology()
term = ontology["GO:0097194"] # execution phase of apoptosis
# print a term
print(term)
# access fields by name
print(term.id, term.name)
# note the trailing underscore, which avoids a conflict with the Python keyword def
print(term.def_)
Searching the annotations (part of code/go_gene_annotations.py):
from orangecontrib.bio import go
ontology = go.Ontology()
# Print names and definitions of all terms with "apoptosis" in the name
apoptosis = [term for term in ontology.terms.values()
             if "apoptosis" in term.name.lower()]
for term in apoptosis:
    print(term.name + " " + term.id)
    print(term.def_)
# Load annotations for yeast.
annotations = go.Annotations("sgd", ontology=ontology)
res = annotations.get_enriched_terms(["YGR270W", "YIL075C", "YDL007W"])
gene = annotations.alias_mapper["YIL075C"]
print(gene + " (YIL075C) directly annotated to the following terms:")
for a in annotations.gene_annotations[gene]:
    print(ontology[a.GO_ID].name + " with evidence code " + a.Evidence_Code)
# Get all genes annotated to the same terms as YIL075C
ids = set([a.GO_ID for a in annotations.gene_annotations[gene]])
for termid in ids:
    ants = annotations.get_all_annotations(termid)
    genes = set([a.DB_Object_Symbol for a in ants])
    print(", ".join(genes) + " annotated to " + termid + " " + ontology[termid].name)
Term enrichment (part of code/go_enrichment.py):
res = annotations.get_enriched_terms(["YGR270W", "YIL075C", "YDL007W"])
print("Enriched terms:")
for go_id, (genes, p_value, ref) in res.items():
    if p_value < 0.05:
        print(ontology[go_id].name + " with p-value: %.4f " % p_value + ", ".join(genes))
# And again for slims
ontology.set_slims_subset("goslim_yeast")
res = annotations.get_enriched_terms(["YGR270W", "YIL075C", "YDL007W"],
                                     slims_only=True)
print("Enriched slim terms:")
for go_id, (genes, p_value, _) in res.items():
    if p_value < 0.05:
        print(ontology[go_id].name + " with p-value: %.4f " % p_value + ", ".join(genes))