Gene Ontology (go)

Provides access to Gene Ontology and its gene annotations.

class orangecontrib.bio.go.Ontology(filename=None, progress_callback=None, rev=None)

Ontology is the class representing a gene ontology.

Parameters:
  • filename (str) – A filename of an .obo formatted file.
  • progress_callback – Optional callable that takes a float (the progress) and returns None.
  • rev (str) – A CVS revision specifier (see GO web CVS interface).

Example:

>>> # Load the current ontology (downloading it if necessary)
>>> ontology = Ontology()
>>> # Load the ontology at the specified CVS revision.
>>> ontology = Ontology(rev="5.2092")

Ontology supports a subset of the Mapping protocol:

>>> term_ids = list(ontology)
>>> term = ontology[term_ids[0]]
__contains__(termid)

Return True if a term with termid is present in the ontology.

__getitem__(termid)

Return a Term object with termid.

Parameters:termid (str) – The id of a Term in the ontology.
Return type:Term
__iter__()

Iterate over all term ids in ontology.

__len__()

Return number of terms in ontology.

defined_slims_subsets()

Return a list of defined subsets in the ontology.

Return type:list-of-str
extract_sub_graph(terms)

Return all subterms (descendants) of the given terms.

Parameters:terms (list) – A list of term IDs.
extract_super_graph(terms)

Return all superterms (ancestors) of the given terms, up to the most general one.

Parameters:terms (list) – A list of term IDs.
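
The traversal that extract_sub_graph and extract_super_graph describe can be sketched on a toy parent map. This is only an illustration: PARENTS, super_terms and sub_terms are hypothetical names, the term IDs are made up, and including the input terms in the result is an assumption rather than documented behaviour.

```python
from collections import deque

# Toy hierarchy (child -> list of parents); IDs are made up for illustration.
PARENTS = {
    "T:root": [],
    "T:a": ["T:root"],
    "T:b": ["T:root"],
    "T:c": ["T:a", "T:b"],
    "T:d": ["T:c"],
}


def super_terms(terms, parents=PARENTS):
    """Return the given terms plus every ancestor reachable via parent links."""
    seen = set()
    queue = deque(terms)
    while queue:
        term = queue.popleft()
        if term in seen:
            continue
        seen.add(term)
        queue.extend(parents[term])
    return seen


def sub_terms(terms, parents=PARENTS):
    """Return the given terms plus every descendant."""
    # Invert the parent map once so we can walk downwards.
    children = {}
    for child, term_parents in parents.items():
        for parent in term_parents:
            children.setdefault(parent, []).append(child)
    seen = set()
    queue = deque(terms)
    while queue:
        term = queue.popleft()
        if term in seen:
            continue
        seen.add(term)
        queue.extend(children.get(term, []))
    return seen


print(sorted(super_terms(["T:c"])))
print(sorted(sub_terms(["T:a"])))
```
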
named_slims_subset(subset)

Return all term IDs in a named subset.

Parameters:subset (str) – A string naming a subset in the ontology.
Return type:list-of-str
set_slims_subset(subset)

Set the slims_subset term subset to subset.

Parameters:subset (set) – A subset of GO term IDs.

subset may also be a string, in which case the call is equivalent to ont.set_slims_subset(ont.named_slims_subset(subset)).

slims_for_term(term)

Return a list of slim term IDs for term.

This is a list of most specific slim terms to which term belongs.

Parameters:term (str) – Term ID.
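
One way to read "most specific slim terms" is: climb from the term towards the root and stop on each path at the first term that is in the slim subset. The sketch below implements that reading on a toy hierarchy. nearest_slims and PARENTS are hypothetical names, and the library's actual traversal (for example, which relationship types it follows) may differ.

```python
# Toy hierarchy (child -> list of parents); IDs are made up for illustration.
PARENTS = {
    "T:root": [],
    "T:a": ["T:root"],
    "T:b": ["T:root"],
    "T:c": ["T:a", "T:b"],
    "T:d": ["T:c"],
}


def nearest_slims(term, slims, parents=PARENTS):
    """Return the first slim terms met on each upward path from term."""
    stack = [term]
    visited = set()
    found = set()
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        if current in slims:
            found.add(current)  # stop climbing along this path
        else:
            stack.extend(parents[current])
    return sorted(found)


# T:d is not a slim term itself; its closest slim ancestors are T:a and T:b.
print(nearest_slims("T:d", {"T:a", "T:b", "T:root"}))
```
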
class orangecontrib.bio.go.Term(stanza=None, ontology=None)
id

The term id.

namespace

The namespace of the term.

def_

The term definition (note the trailing underscore, which avoids a conflict with the Python keyword def).

is_a

List of term ids this term is a subterm of (parent terms).

related

List of (rel_type, term_id) tuples with rel_type specifying the relationship type with term_id.
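
To show where these attributes come from, here is a rough sketch that reads a single [Term] stanza of an .obo file into a dictionary with the same field names. The stanza content is made up, parse_stanza is a hypothetical helper (not the Term constructor), and keeping is_a links out of related is an assumption of this sketch.

```python
# A made-up [Term] stanza in OBO flat-file syntax.
STANZA = """[Term]
id: T:0003
name: example child term
namespace: example_namespace
def: "A made-up definition." [SRC:example]
is_a: T:0001 ! example parent
is_a: T:0002 ! another parent
relationship: part_of T:0004 ! example whole
"""


def parse_stanza(stanza):
    """Parse 'tag: value' lines of one stanza into a dict of fields."""
    fields = {"is_a": [], "related": []}
    for line in stanza.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(": ")
        value = value.split(" ! ")[0].strip()  # drop the trailing "! comment"
        if tag == "is_a":
            fields["is_a"].append(value)
        elif tag == "relationship":
            rel_type, term_id = value.split(None, 1)
            fields["related"].append((rel_type, term_id))
        elif tag == "def":
            fields["def_"] = value  # 'def' is a Python keyword
        else:
            fields[tag] = value
    return fields


term = parse_stanza(STANZA)
print(term["id"], term["namespace"])
print(term["is_a"])
print(term["related"])
```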

class orangecontrib.bio.go.Annotations(filename_or_organism=None, ontology=None, genematcher=None, progress_callback=None, rev=None)

Annotations object holds the annotations.

Parameters:
  • filename_or_organism (str) – A filename of a GAF formatted annotations file (e.g. gene_annotations.goa_human) or an organism specifier (e.g. 'goa_human' or '9606'). In the latter case the annotations for that organism will be loaded.
  • ontology (Ontology) – Ontology object for annotations.
  • rev (str) – An optional CVS revision string. If filename_or_organism is an organism specifier, the annotations will be retrieved for that revision (see GO web CVS).
gene_annotations = None

A dictionary mapping a gene name (DB_Object_Symbol) to a set of all annotations of that gene.

term_anotations = None

A dictionary mapping a GO term id to a set of annotations that are directly annotated to that term.

annotations = None

A list of all AnnotationRecord instances.

ontology

Ontology object for annotations.

add_annotation(a)

Add a single AnnotationRecord instance to this object.
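
The three containers described above (annotations, gene_annotations and term_anotations) can be pictured as one list plus two indexes that are updated together when a record is added. The sketch below is a toy model of that idea: ToyAnnotations and Record are hypothetical stand-ins, and the real class may do more bookkeeping than this.

```python
from collections import defaultdict, namedtuple

# Reduced stand-in for AnnotationRecord with only the fields used here.
Record = namedtuple("Record", ["DB_Object_Symbol", "GO_ID", "Evidence_Code"])


class ToyAnnotations:
    def __init__(self):
        self.annotations = []                     # every record, in insertion order
        self.gene_annotations = defaultdict(set)  # gene symbol -> records
        self.term_anotations = defaultdict(set)   # term id -> records (spelling as documented)

    def add_annotation(self, a):
        """Store the record and update both indexes."""
        self.annotations.append(a)
        self.gene_annotations[a.DB_Object_Symbol].add(a)
        self.term_anotations[a.GO_ID].add(a)


store = ToyAnnotations()
store.add_annotation(Record("GENE1", "T:a", "IDA"))
store.add_annotation(Record("GENE1", "T:b", "IEA"))
store.add_annotation(Record("GENE2", "T:a", "IMP"))

print(len(store.annotations))
print(sorted(r.GO_ID for r in store.gene_annotations["GENE1"]))
print(sorted(r.DB_Object_Symbol for r in store.term_anotations["T:a"]))
```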

get_gene_names_translator(genes)

Return a dictionary mapping canonical names (DB_Object_Symbol) to genes.

get_all_annotations(id)

Return a set of all annotations (instances of AnnotationRecord) for GO term id and all its subterms.

Parameters:id (str) – GO term id
get_all_genes(id, evidence_codes=None)

Return a list of genes annotated by the specified evidence_codes to GO term id and all its subterms.

Parameters:
  • id (str) – GO term id
  • evidence_codes (list-of-strings) – List of evidence codes to consider when matching annotations to terms.
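
As a rough picture of what get_all_annotations and get_all_genes compute, the sketch below gathers direct annotations from a term and all of its subterms, then filters by evidence code. CHILDREN, DIRECT and both functions are toy stand-ins with made-up IDs and genes; treating evidence_codes=None as "no filtering" is an assumption.

```python
# Toy data: term -> child terms, and term -> directly annotated (gene, evidence code).
CHILDREN = {"T:a": ["T:c"], "T:c": ["T:d"], "T:d": []}
DIRECT = {
    "T:a": {("geneA", "IDA")},
    "T:c": {("geneB", "IEA")},
    "T:d": {("geneC", "IDA"), ("geneA", "IMP")},
}


def all_annotations(term_id):
    """Collect direct annotations of term_id and of every subterm."""
    collected = set()
    seen = set()
    stack = [term_id]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        collected |= DIRECT.get(current, set())
        stack.extend(CHILDREN.get(current, []))
    return collected


def all_genes(term_id, evidence_codes=None):
    """Return sorted gene names, optionally restricted to some evidence codes."""
    wanted = None if evidence_codes is None else set(evidence_codes)
    return sorted({gene for gene, code in all_annotations(term_id)
                   if wanted is None or code in wanted})


print(all_genes("T:a"))
print(all_genes("T:a", evidence_codes=["IDA"]))
```
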
get_enriched_terms(genes, reference=None, evidence_codes=None, slims_only=False, aspect=None, prob=<orangecontrib.bio.utils.stats.Binomial object>, use_fdr=True, progress_callback=None)

Return a dictionary of enriched terms that maps each term id to a (list_of_genes, p_value, reference_count) tuple. P-values are FDR adjusted if use_fdr is True (the default).

Parameters:
  • genes – List of genes
  • reference – List of genes (if None all genes included in the annotations will be used).
  • evidence_codes – List of evidence codes to consider.
  • slims_only – If True return only slim terms.
  • aspect – Which aspects to use: “P”, “F”, “C”, or a set containing these elements. All aspects are used by default.
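
To make the p_value and use_fdr parts of the result more concrete, here is a small self-contained sketch of one plausible computation: an upper-tail binomial test per term, followed by a Benjamini-Hochberg adjustment. Both functions are illustrative; the exact test performed by the default Binomial object and the FDR procedure used by the library are not specified here, so treat the choice of method as an assumption.

```python
from math import comb


def binomial_p_value(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): chance of at least k hits among n genes."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def fdr_bh(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy input: 3 query genes; per term, (hits among the query genes, share of
# reference genes annotated to that term). All numbers are made up.
terms = {"T:a": (3, 0.1), "T:b": (1, 0.5), "T:c": (2, 0.2)}
raw = [binomial_p_value(hits, 3, share) for hits, share in terms.values()]
for term_id, p_raw, p_adj in zip(terms, raw, fdr_bh(raw)):
    print(term_id, round(p_raw, 4), round(p_adj, 4))
```
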
get_annotated_terms(genes, direct_annotation_only=False, evidence_codes=None, progress_callback=None)

Return all terms that are annotated by genes with evidence_codes.

add(line)

Add one annotation.

append(line)

Add one annotation.

extend(lines)

Add multiple annotations.

class orangecontrib.bio.go.AnnotationRecord

An annotation record mapping a gene to a term.

See http://geneontology.org/GO.format.gaf-2_0.shtml for a description of the individual fields.

Annotation_Extension

Alias for field number 15

Aspect

Alias for field number 8

Assigned_By

Alias for field number 14

DB

Alias for field number 0

DB_Object_ID

Alias for field number 1

DB_Object_Name

Alias for field number 9

DB_Object_Symbol

Alias for field number 2

DB_Object_Synonym

Alias for field number 10

DB_Object_Type

Alias for field number 11

DB_Reference

Alias for field number 5

Date

Alias for field number 13

Evidence_Code

Alias for field number 6

GO_ID

Alias for field number 4

Gene_Product_Form_ID

Alias for field number 16

Qualifier

Alias for field number 3

Taxon

Alias for field number 12

With_From

Alias for field number 7

classmethod from_string(string)

Create an instance from a line in an annotations file (GAF 2.0 format).
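
The "Alias for field number N" entries above are what a named tuple produces, so a record can be pictured as a 17-field tuple built from one tab-separated line. The sketch below follows the field order listed above. ToyRecord is a hypothetical stand-in, the sample line is made up, and padding missing trailing columns is an assumption rather than documented behaviour of from_string.

```python
from collections import namedtuple

# Field order taken from the alias numbers listed above (0 to 16).
GAF_FIELDS = [
    "DB", "DB_Object_ID", "DB_Object_Symbol", "Qualifier", "GO_ID",
    "DB_Reference", "Evidence_Code", "With_From", "Aspect",
    "DB_Object_Name", "DB_Object_Synonym", "DB_Object_Type", "Taxon",
    "Date", "Assigned_By", "Annotation_Extension", "Gene_Product_Form_ID",
]


class ToyRecord(namedtuple("ToyRecord", GAF_FIELDS)):
    __slots__ = ()

    @classmethod
    def from_string(cls, string):
        """Build a record from one tab-separated annotation line."""
        values = string.rstrip("\r\n").split("\t")
        # Pad missing trailing columns so short lines still parse (assumption).
        values += [""] * (len(GAF_FIELDS) - len(values))
        return cls(*values[:len(GAF_FIELDS)])


# A made-up line; none of these values refer to a real annotation.
line = "\t".join([
    "SGD", "S000000000", "GENE1", "", "GO:0000000", "PMID:0", "IDA", "", "P",
    "made-up gene product", "ALIAS1", "gene", "taxon:0", "20200101", "SGD",
    "", "",
])
rec = ToyRecord.from_string(line + "\n")
print(rec.DB_Object_Symbol, rec.GO_ID, rec.Evidence_Code)
print(rec[4] == rec.GO_ID)  # field 4 and the GO_ID alias are the same value
```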

Example

Load the ontology and print out some terms:

from orangecontrib.bio import go
ontology = go.Ontology()
term = ontology["GO:0097194"] # execution phase of apoptosis

# print a term
print(term)

# access fields by name
print(term.id, term.name)
# note the trailing underscore, which avoids a conflict with the Python def keyword
print(term.def_)

Searching the annotations (part of code/go_gene_annotations.py):

from orangecontrib.bio import go

ontology = go.Ontology()

# Print names and definitions of all terms with "apoptosis" in the name
apoptosis = [term for term in ontology.terms.values()
             if "apoptosis" in term.name.lower()]
for term in apoptosis:
    print(term.name, term.id)
    print(term.def_)

# Load annotations for yeast.
annotations = go.Annotations("sgd", ontology=ontology)

res = annotations.get_enriched_terms(["YGR270W", "YIL075C", "YDL007W"])

gene = annotations.alias_mapper["YIL075C"]
print(gene + " (YIL075C) directly annotated to the following terms:")
for a in annotations.gene_annotations[gene]:
    print(ontology[a.GO_ID].name + " with evidence code " + a.Evidence_Code)

# Get all genes annotated to the same terms as YIL075C
ids = {a.GO_ID for a in annotations.gene_annotations[gene]}
for termid in ids:
    ants = annotations.get_all_annotations(termid)
    genes = {a.DB_Object_Symbol for a in ants}
    print(", ".join(genes) + " annotated to " + termid + " " + ontology[termid].name)

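Term enrichment (part of code/go_enrichment.py):

from orangecontrib.bio import go

# Load the ontology and the yeast annotations, as in the previous example.
ontology = go.Ontology()
annotations = go.Annotations("sgd", ontology=ontology)

res = annotations.get_enriched_terms(["YGR270W", "YIL075C", "YDL007W"])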
print("Enriched terms:")
for go_id, (genes, p_value, ref) in res.items():
    if p_value < 0.05:
        print(ontology[go_id].name + " with p-value: %.4f " % p_value + ", ".join(genes))

# And again for slims
ontology.set_slims_subset("goslim_yeast")

res = annotations.get_enriched_terms(["YGR270W", "YIL075C", "YDL007W"],
                                     slims_only=True)
print("Enriched slim terms:")
for go_id, (genes, p_value, _) in res.items():
    if p_value < 0.05:
        print(ontology[go_id].name + " with p-value: %.4f " % p_value + ", ".join(genes))