Protein-protein interactions (ppi)

PPIDatabase is an abstract class defining a common interface for accessing protein-protein interaction databases.

Classes implementing this interface are BioGRID, STRING, and STRINGDetailed.

The common interface

class orangecontrib.bio.ppi.PPIDatabase

A general interface for protein-protein interaction database access.

An example:

>>> ppidb = MySuperPPIDatabase()
>>> ppidb.organisms() # List all organisms (taxids)
['...

>>> ppidb.ids() # List all protein ids
['...

>>> ppidb.ids(taxid="9606") # List all human protein ids.
['...

>>> ppidb.all_edges() # List all edges
[('...

organisms()

Return all organism NCBI taxonomy ids contained in this database.

ids(taxid=None)

Return a list of all protein ids. If taxid (as returned by organisms()) is not None, limit the results to ids from this organism only.

synonyms(id)

Return a list of synonyms for primary id (as returned by ids()).

all_edges(taxid=None)

Return a list of all edges. If taxid is not None return the edges for this organism only.

edges(id1, id2=None)

Return a list of all edges (a list of 3-tuples (id1, id2, score)).
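Because edges are returned as plain 3-tuples, they can be post-processed with ordinary list operations. The following is a minimal sketch, assuming that a higher score means a more confident interaction (the score scale is not stated here); filter_edges and the sample tuples are hypothetical and not part of the library.

```python
def filter_edges(edges, min_score):
    """Keep only (id1, id2, score) tuples whose score is at least min_score."""
    return [(id1, id2, score) for id1, id2, score in edges if score >= min_score]


# Made-up edges in the documented (id1, id2, score) shape.
sample_edges = [("A", "B", 900), ("A", "C", 150), ("B", "C", 700)]
strong = filter_edges(sample_edges, 700)
```

With a real database object, the list returned by edges() or all_edges() would take the place of sample_edges.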

all_edges_annotated(taxid=None)

Return a list of all edges annotated. If taxid is not None return the edges for this organism only.

edges_annotated(id=None)

Return a list of all edges annotated.

search_id(name, taxid=None)

Search the database for protein name. Return a list of matching primary ids. Use taxid to limit the results to a single organism.

extract_network(ids)

classmethod download_data()

Download the latest PPI data for local work.
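As a rough illustration of the common interface, the following self-contained sketch mimics the method names on a tiny in-memory data set. It does not import orangecontrib.bio: ToyPPIDatabase, its protein ids, taxid assignments, synonyms, and scores are all made up for the example, and the rule used by all_edges (both endpoints must belong to the organism) is an assumption rather than documented behaviour.

```python
class ToyPPIDatabase(object):
    """Hypothetical in-memory stand-in mirroring the PPIDatabase method names."""

    def __init__(self):
        # protein id -> (taxid, synonyms); every value here is made up.
        self._proteins = {
            "P1": ("9606", ["alpha", "A1"]),
            "P2": ("9606", ["beta"]),
            "P3": ("10090", ["gamma"]),
        }
        # (id1, id2, score) 3-tuples; scores are made up.
        self._edges = [("P1", "P2", 0.9), ("P1", "P3", 0.4)]

    def organisms(self):
        return sorted(set(taxid for taxid, _ in self._proteins.values()))

    def ids(self, taxid=None):
        return sorted(pid for pid, (tax, _) in self._proteins.items()
                      if taxid is None or tax == taxid)

    def synonyms(self, id):
        return list(self._proteins[id][1])

    def all_edges(self, taxid=None):
        if taxid is None:
            return list(self._edges)
        # Assumption: an edge belongs to an organism if both ends do.
        members = set(self.ids(taxid))
        return [e for e in self._edges if e[0] in members and e[1] in members]

    def edges(self, id1, id2=None):
        # All edges touching id1, optionally restricted to those also touching id2.
        return [e for e in self._edges
                if id1 in e[:2] and (id2 is None or id2 in e[:2])]

    def search_id(self, name, taxid=None):
        # Match either the primary id itself or one of its synonyms.
        return [pid for pid in self.ids(taxid)
                if name == pid or name in self._proteins[pid][1]]


ppidb = ToyPPIDatabase()
human_ids = ppidb.ids(taxid="9606")
```

Code written against these method names should, in principle, work with any class implementing the interface, which is the point of having a common base class.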

PPI databases

class orangecontrib.bio.ppi.BioGRID

Bases: orangecontrib.bio.ppi.PPIDatabase

Access BioGRID PPI data.

Example

>>> biogrid = BioGRID()
>>> print biogrid.organisms() # Print a list of all organism NCBI taxids in BioGRID
[u'10090',...

>>> print biogrid.ids(taxid="9606") # Print a list of all human protein ids
[u'110004'

>>> print biogrid.synonyms("110004") # Print a list of all synonyms for protein id '110004' as reported by BioGRID
[u'3803', u'CU464060.2', u'CD158b', u'p58.2', u'CD158B1', u'NKAT6']

ids(taxid=None)

Return a list of all protein ids (biogrid_id_interactors). If taxid is not None, limit the results to ids from this organism only.

synonyms(id)

Return a list of synonyms for primary id.

all_edges(taxid=None)

Return a list of all edges. If taxid is not None return the edges for this organism only.

edges(id)

Return a list of all interactions where id is a participant (a list of 3-tuples (id_a, id_b, score)).
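Since id may appear in either position of a returned tuple, collecting the interaction partners of a protein means checking both ends. A minimal sketch of that step follows; partners and the sample tuples are hypothetical, and the scores are set to None only because their meaning is not described here.

```python
def partners(edges, protein_id):
    """Return the sorted ids that share an edge with protein_id.

    A self-interaction (both ends equal to protein_id) yields protein_id itself.
    """
    found = set()
    for id_a, id_b, _score in edges:
        if id_a == protein_id:
            found.add(id_b)
        elif id_b == protein_id:
            found.add(id_a)
    return sorted(found)


# Made-up edges in the documented (id_a, id_b, score) shape.
sample_edges = [("a", "b", None), ("c", "a", None), ("b", "c", None)]
neighbours = partners(sample_edges, "a")
```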

all_edges_annotated(taxid=None)

Return a list of all edges annotated. If taxid is not None return the edges for this organism only.

edges_annotated(id)

Return a list of all annotated edges where id is a participant.

search_id(name, taxid=None)

Search the database for protein name. Return a list of matching primary ids. Use taxid to limit the results to a single organism.
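Because search_id returns a list, a name may match zero, one, or several primary ids, and callers usually need to handle all three cases. The sketch below shows one way to do so; resolve_names is a hypothetical helper, and fake_search_id is a made-up stand-in with the same (name, taxid=None) signature so the example runs without a database.

```python
def resolve_names(search_id, names, taxid=None):
    """Split names into uniquely resolved, unmatched, and ambiguous groups."""
    resolved, unmatched, ambiguous = {}, [], {}
    for name in names:
        hits = search_id(name, taxid)
        if len(hits) == 1:
            resolved[name] = hits[0]
        elif hits:
            ambiguous[name] = hits
        else:
            unmatched.append(name)
    return resolved, unmatched, ambiguous


def fake_search_id(name, taxid=None):
    # Made-up lookup table standing in for a real database search.
    table = {"alpha": ["P1"], "beta": ["P2", "P5"]}
    return table.get(name, [])


resolved, unmatched, ambiguous = resolve_names(
    fake_search_id, ["alpha", "beta", "delta"])
```

With a real database object, its bound search_id method would be passed in place of fake_search_id.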

classmethod download_data(address)

Download the BioGRID data for local work. Pass the address of the latest BIOGRID-ALL release (in tab2 format).

classmethod init_db(filepath)

Initialize the SQLite database from a file in the BIOGRID-ALL.*tab2.txt format.

init_db_index()

Create indexes in the database (if not already present) for faster searching by primary ids.

extract_network(ids)

class orangecontrib.bio.ppi.STRING(taxid=None, database=None)

Bases: orangecontrib.bio.ppi.PPIDatabase

Access the STRING PPI database.

organisms()

Return all organism taxids contained in this database.

ids(taxid=None)

Return a list of all protein ids. If taxid is not None, limit the results to ids from this organism only.

synonyms(id)

Return a list of synonyms for primary id as reported by STRING (proteins.aliases.{version}.txt file).

synonyms_with_source(id)

Return a list of synonyms for primary id, each along with its source, as reported by STRING (proteins.aliases.{version}.txt file).
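When synonyms come with their sources, it is often convenient to group them per source. The sketch below assumes that each returned item is a (synonym, source) pair; the exact return shape is not specified here, so treat this as an assumption. group_by_source and the sample pairs are hypothetical.

```python
from collections import defaultdict


def group_by_source(pairs):
    """Group (synonym, source) pairs into a dict of source -> list of synonyms."""
    grouped = defaultdict(list)
    for synonym, source in pairs:
        grouped[source].append(synonym)
    return dict(grouped)


# Made-up pairs; the (synonym, source) layout is an assumption.
sample_pairs = [("alpha", "src1"), ("A1", "src2"), ("alpha-1", "src1")]
by_source = group_by_source(sample_pairs)
```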

all_edges(taxid=None)

Return a list of all edges. If taxid is not None return the edges for this organism only.

Note

This may take some time (and memory).

edges(id)

Return a list of all edges (a list of 3-tuples (id1, id2, score)).

classmethod download_data(version, taxids=None)

Download the PPI data for local work (this may take some time). Pass the version of the STRING release, for example v9.1.

extract_network(ids)

class orangecontrib.bio.ppi.STRINGDetailed(taxid=None, database=None, detailed_database=None)

Bases: orangecontrib.bio.ppi.STRING

Access the STRING PPI database. This class also allows access to subscores per channel.

Note

This data is released under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

If you want to use this data for commercial purposes, you must get a license from STRING.

all_edges(taxid=None)

Return a list of all edges. If taxid is not None return the edges for this organism only.

Note

This may take some time (and memory).

edges(id)

Return a list of all edges (a list of 3-tuples (id1, id2, score)).

extract_network(ids)

ids(taxid=None)

Return a list of all protein ids. If taxid is not None, limit the results to ids from this organism only.

organisms()

Return all organism taxids contained in this database.

synonyms(id)

Return a list of synonyms for primary id as reported by STRING (proteins.aliases.{version}.txt file).

synonyms_with_source(id)

Return a list of synonyms for primary id, each along with its source, as reported by STRING (proteins.aliases.{version}.txt file).