Gene sets (geneset)

This module can load either gene sets distributed with Orange or custom gene sets in the GMT file format.

The available gene set collections can be listed with list_all.
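list_all is documented as returning tuples of (hierarchy, organism, available_locally). The sketch below does not call Orange at all. It uses a made-up list with that shape and a hypothetical helper, filter_available, to illustrate what filtering by organism and local availability could look like; the hierarchy names and availability flags are invented for illustration.

```python
# Hypothetical sample shaped like the documented list_all() result:
# (hierarchy, organism, available_locally). All values are made up.
available = [
    (("GO", "biological_process"), "10090", True),
    (("GO", "molecular_function"), "10090", False),
    (("KEGG", "pathways"), "10090", True),
    (("GO", "biological_process"), "9606", False),
]


def filter_available(entries, org=None, local=None):
    """Keep entries that match the organism and/or local-availability filter.

    A filter set to None is ignored. This helper is an illustration,
    not part of orangecontrib.bio.geneset.
    """
    return [
        (hierarchy, organism, is_local)
        for hierarchy, organism, is_local in entries
        if (org is None or organism == org)
        and (local is None or is_local == local)
    ]


# Mouse gene sets that are already available locally.
mouse_local = filter_available(available, org="10090", local=True)
for hierarchy, organism, is_local in mouse_local:
    print(hierarchy, organism, is_local)
```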

The collections function loads gene sets. Gene sets provided with Orange are organized hierarchically. Although the GO hierarchy includes subsets, all of them can be loaded at once by passing only the top level of the hierarchy (the organism here is mouse, NCBI taxonomy ID 10090):

orangecontrib.bio.geneset.collections((("GO",), "10090"))

To open multiple gene set collections at once, for example, KEGG and GO, try:

orangecontrib.bio.geneset.collections((("KEGG",), "10090"), (("GO",), "10090"))

You can also open a file with gene sets. The following line opens specific.gmt from the current working directory:

orangecontrib.bio.geneset.collections("specific.gmt")

The above examples combined:

orangecontrib.bio.geneset.collections((("KEGG",), "10090"), (("GO",), "10090"), "specific.gmt")

Furthermore, all gene sets for a specific organism can be opened with an empty hierarchy:

orangecontrib.bio.geneset.collections((tuple(), "10090"))
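One way to picture the hierarchy argument in the calls above is as a tuple prefix: ("GO",) selects every hierarchy that starts with "GO", and the empty tuple selects everything. The sketch below models that reading on plain tuples. It is an assumption about the selection rule drawn from the examples, not the library's actual implementation, and the helper matches and the hierarchy names are hypothetical.

```python
def matches(requested, hierarchy):
    """Return True if `requested` is a prefix of `hierarchy` (both tuples)."""
    return hierarchy[:len(requested)] == requested


# Made-up hierarchies used only for illustration.
hierarchies = [
    ("GO", "biological_process"),
    ("GO", "cellular_component"),
    ("KEGG", "pathways"),
]

# ("GO",) is a prefix of both GO hierarchies, so both are selected.
go_only = [h for h in hierarchies if matches(("GO",), h)]

# The empty tuple is a prefix of every tuple, so everything is selected.
everything = [h for h in hierarchies if matches(tuple(), h)]

print(go_only)
print(everything)
```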

Loading gene sets

orangecontrib.bio.geneset.list_all(org=None, local=None)

Return the gene sets available in the local and ServerFiles repositories as a list of tuples (hierarchy, organism, available_locally).

Results can be filtered with the following parameters.

Parameters:
  • org (str) – Organism NCBI taxonomy ID.
  • local (bool) – Whether the gene sets are available locally.

orangecontrib.bio.geneset.collections(*args)

Load gene sets from various sources: GMT file, GO, KEGG, and others. Return an instance of GeneSets.

Each argument specifies gene sets to load and can be one of:

  • a filename of a GMT file,
  • a tuple (hierarchy, organism) (for example (("KEGG",), "10090")), or
  • an instance of GeneSets
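For readers unfamiliar with the GMT format accepted as a filename argument: each line of a GMT file describes one gene set as tab-separated fields, with the set name first, a description second, and the genes after that. The following is a minimal, standalone sketch of reading such a file. The function parse_gmt and the sample contents are hypothetical and are not how Orange itself parses GMT files.

```python
import io


def parse_gmt(lines):
    """Parse GMT lines into {name: (description, set_of_genes)}.

    Each non-empty line is expected to hold tab-separated fields:
    name, description, gene1, gene2, ...
    """
    gene_sets = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError("GMT line needs a name and a description: %r" % line)
        name, description, genes = fields[0], fields[1], fields[2:]
        # Drop empty fields left behind by trailing tabs.
        gene_sets[name] = (description, {g for g in genes if g})
    return gene_sets


# Made-up file contents; a real file would be opened with open("specific.gmt").
sample = io.StringIO(
    "cell_cycle_demo\tmade-up example set\tCdk1\tCcnb1\tCdc20\n"
    "apoptosis_demo\tmade-up example set\tBax\tCasp3\n"
)
parsed = parse_gmt(sample)
for name, (description, genes) in parsed.items():
    print(name, description, sorted(genes))
```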

Supporting functionality

class orangecontrib.bio.geneset.GeneSets(input=None)

Bases: set

A collection of gene sets: contains GeneSet objects.

common_hierarchy()

Return a common hierarchy.

common_org()

Return a common organism.

hierarchies()

Return all hierarchies.

set_hierarchy(hierarchy)

Set the hierarchy for all gene sets.

split_by_hierarchy()

Split gene sets by hierarchies. Return a list of GeneSets objects.
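The hierarchy-related methods above can be pictured with plain Python data. The sketch below groups made-up (id, hierarchy) pairs by hierarchy and computes the longest prefix shared by all hierarchies, which is one plausible reading of "common hierarchy". It assumes that reading and does not use the GeneSets class; the helper names group_by_hierarchy and longest_common_prefix are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-ins for gene sets: (gene set id, hierarchy) pairs.
sets = [
    ("set_a", ("GO", "biological_process")),
    ("set_b", ("GO", "cellular_component")),
    ("set_c", ("GO", "biological_process")),
]


def group_by_hierarchy(items):
    """Group gene set ids by their hierarchy tuple."""
    groups = defaultdict(list)
    for set_id, hierarchy in items:
        groups[hierarchy].append(set_id)
    return dict(groups)


def longest_common_prefix(hierarchies):
    """Return the longest tuple prefix shared by all given hierarchies."""
    hierarchies = list(hierarchies)
    if not hierarchies:
        return ()
    prefix = []
    # zip stops at the shortest hierarchy, so the prefix never overruns one.
    for parts in zip(*hierarchies):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


groups = group_by_hierarchy(sets)
shared = longest_common_prefix(groups)  # iterates over the hierarchy keys

print(groups)
print(shared)
```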

to_odict()

Return gene sets in the old dictionary format.

class orangecontrib.bio.geneset.GeneSet(genes=[], name=None, id=None, description=None, link=None, organism=None, hierarchy=None, pair=None)

A single set of genes.

cname(source=True, name=True)

Return a gene set name with hierarchy.

description = None

Gene set description.

genes = None

A set of genes. Genes are strings.

hierarchy = None

Hierarchy should be formatted as a tuple, for example ("GO", "biological_process").

id = None

Short gene set ID.

link = None

Link to further information about this gene set.

name = None

Gene set name.

organism = None

Organism as an NCBI taxonomy ID.

to_odict(source=True, name=True)

For backward compatibility. Return a gene set as a tuple (id, list of genes).

orangecontrib.bio.geneset.register(genesets, serverFiles=None)

Register the given gene sets locally. The gene sets are registered by their common hierarchy or organism (None if the organisms are different).

Parameters:
  • genesets (GeneSets) – Gene sets to register.
  • serverFiles – If serverFiles is an authenticated ServerFiles connection, the input gene sets are uploaded to the repository.