Dictyostelium discoideum databases (dicty)
The following example downloads experiments from the PIPA database, specifically “RPKM + mapability expression (polyA) - R6” results for all public experiments on Dictyostelium discoideum (dd) at time point 16.
import orangecontrib.bio.dicty

pipa = orangecontrib.bio.dicty.PIPAx()
results = pipa.results_list("R6")

# keep only Dictyostelium discoideum experiments at time point 16
dd16 = [(i, d) for i, d in results.items()
        if d["tp"] == '16' and d["species_id"] == "dd"]

# group similar experiments with sorting
dd16 = sorted(dd16, key=lambda x: (x[1]["treatment"], x[1]["replicate"]))

data = pipa.get_data([i for i, d in dd16], exclude_constant_labels=True,
                     allowed_labels=["id", "treatment", "replicate"])

def print_data(data):
    # one line per sample (feature), then the first ten genes (rows)
    for at in data.domain.attributes:
        print("%s treatment: %s replicate: %s" %
              (at.name, at.attributes["treatment"], at.attributes["replicate"]))
    print("")
    for a in data[:10]:
        print(a)

print_data(data)
print("")

datar = orangecontrib.bio.dicty.join_replicates(data)
print_data(datar)
PIPAx database
- class orangecontrib.bio.dicty.PIPAx(address='https://pipa.biolab.si/pipax/api.py', cache=None, username=None, password=None)
  An interface to the PIPAx API.
- __init__(address='https://pipa.biolab.si/pipax/api.py', cache=None, username=None, password=None)
  Parameters:
  - address (str) – The address of the API.
  - username (str), password (str) – Login info; None for public access.
  - cache (CacheSQLite) – A cache that stores results locally (a CacheSQLite instance).
- genomes(reload=False, bufver='0')
  Return a list of available genomes as a list of (genome_id, genome_name) tuples.
- get_data(ids=None, result_type=None, exclude_constant_labels=False, average=<function median>, callback=None, bufver='0', transform=None, allowed_labels=None, reload=False)
  Return data in an Orange.data.Table. Each feature represents a sample and each row is a gene. The feature’s .attributes contain annotations.
  Parameters:
  - ids (list) – List of ids as returned by results_list if result_type is None; list of ids as returned by mappings if result_type is set.
  - result_type (str) – Result template type id as returned by result_types.
  - exclude_constant_labels (bool) – If a label has the same value across the whole data table, remove it.
  - average (function) – Function that combines multiple readings of the same gene on a chip. If None, no averaging is done. The function should take a list of floats and return an “averaged” float (the default function returns the median).
  - transform (function) – A function that transforms individual values. It should take and return a float. Example use: logarithmic transformation. Default: None.
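The average and transform parameters of get_data accept ordinary Python callables. The following is a minimal sketch, assuming a log2(x + 1) transformation and an arithmetic mean are wanted instead of the defaults. The helper names log2_transform, mean_average and fetch_log_data are hypothetical and are not part of orangecontrib.bio.dicty; only the average and transform keyword arguments come from the signature documented above.

```python
import math


def log2_transform(value):
    """Transform one expression value; takes and returns a float."""
    # The +1 offset is an assumption, chosen so that zero maps to zero.
    return math.log2(value + 1.0)


def mean_average(readings):
    """Combine several readings of the same gene into a single float."""
    return sum(readings) / len(readings)


def fetch_log_data(pipa, ids):
    """Hypothetical helper: `pipa` is assumed to be a PIPAx instance."""
    return pipa.get_data(ids, average=mean_average, transform=log2_transform)
```

With a PIPAx instance at hand, fetch_log_data(pipa, ids) would pass both callables through to get_data, where ids are result ids as returned by results_list.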
- mappings(reload=False, bufver='0')
  Return available mappings as a dictionary of { mapping_id: dictionary_of_annotations }, where the keys of dictionary_of_annotations are “id”, “data_id”, “data_name” and “genomes_id”.
- result_types(reload=False, bufver='0')
  Return a list of available result types.
- results_list(rtype, reload=False, bufver='0')
  Return the available gene expression results for a specific result type, as a dictionary whose keys are result ids and whose values are dictionaries of sample annotations.
  Parameters:
  - rtype (str) – Result type to use (see result_types).
DictyExpress database
- class orangecontrib.bio.dicty.DictyExpress(address='http://bcm.fri.uni-lj.si/microarray/api/index.php?', cache=None)
  Access the DictyExpress data API.
- __init__(address='http://bcm.fri.uni-lj.si/microarray/api/index.php?', cache=None)
  Parameters:
  - address (str) – The address of the API.
  - cache – A cache that stores results locally (an instance of CacheSQLite).
- annotationOptions(ao=None, onlyDiff=False, **kwargs)
  Return annotation options for the given query. Return all possible annotations if the query is omitted.
  If ao is chosen, only return options for that object id.
- annotationTypes()
  Return all annotation types.
- annotations(type, ids=None, all=False)
  Return annotations for the specified type and ids.
  Parameters:
  - type – Object type (see objects).
  - ids – If set, only annotations corresponding to the given ids are returned. Annotations are in the same order as the input ids.
  - all – If False (default), only annotations for “meaningful” annotation types are returned. If True, return annotations for all annotation types.
- get_data(type='norms', exclude_constant_labels=False, average=<function median>, ids=None, callback=None, format='short', transform=None, allowed_labels=None, **kwargs)
  Return data in an Orange.data.Table. Each feature is a sample and each row (Orange.data.Instance) is a gene. The feature’s .attributes contain annotations.
  Parameters:
  - ids (list) – A list of chip ids. If absent, make a search. In this case any additional keyword arguments are treated as in search.
  - exclude_constant_labels – Remove labels if they have the same value for the whole table.
  - format (str) – If “short”, use short format for downloads.
  - average (function) – Function that combines multiple readings of the same gene on a chip. If None, no averaging is done. The function should take a list of floats and return an “averaged” float (the default function returns the median).
  - transform (function) – A function that transforms individual values. It should take and return a float. Example use: logarithmic transformation. Default: None.
  Returns: Chips with given ids in a single data table.
  Return type: Orange.data.Table
- objects()
  Return all object types.
- search(type, **kwargs)
  Search the database. Search is case insensitive.
  Parameters:
  - type – Annotation type (list them with DictyExpress().saoids.keys()).
  - kwargs – In the form annotation=values. Values are either strings or lists of strings (a list is interpreted as an OR operator between its elements).
  The following example lists ids of normalized entries where platform is minichip and sample is abcC3-:
    search("norms", platform='minichip', sample='abcC3-')
  The following example lists ids of normalized entries where platform is minichip and sample is abcC3- or abcG15-:
    search("norms", platform='minichip', sample=['abcC3-', 'abcG15-'])
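The matching rules described for search (case-insensitive comparison, a list of values acting as OR, several keyword arguments all having to hold) can be illustrated locally without contacting the server. The sketch below is not the DictyExpress implementation: the matches function, the chip ids and the annotation values are made up for illustration, and whole-string equality is an assumption, since the source does not say whether the server matches whole values or substrings.

```python
def matches(annotations, **criteria):
    """Return True if the annotations satisfy every criterion."""
    for key, wanted in criteria.items():
        # A single string is treated like a one-element list.
        options = [wanted] if isinstance(wanted, str) else list(wanted)
        actual = str(annotations.get(key, "")).lower()
        # OR across the listed values, compared case-insensitively.
        if not any(actual == option.lower() for option in options):
            return False
    return True


# Made-up annotations keyed by made-up chip ids.
chips = {
    "chip1": {"platform": "minichip", "sample": "abcC3-"},
    "chip2": {"platform": "minichip", "sample": "abcG15-"},
    "chip3": {"platform": "minichip", "sample": "other"},
}

selected = sorted(
    chip_id for chip_id, ann in chips.items()
    if matches(ann, platform="MiniChip", sample=["abcC3-", "abcG15-"])
)
```

Here selected contains the two chips whose sample is either of the listed values; the third chip is dropped because its sample matches neither.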
Auxiliary functionality
- class orangecontrib.bio.dicty.CacheSQLite(filename, compress=True)
  An SQLite-based cache.
- __init__(filename, compress=True)
  Opens an existing cache or creates a new one if it does not exist.
  Parameters:
  - filename (str) – The filename.
  - compress (bool) – Whether to use on-the-fly compression.
- add(addr, con, version='0', autocommit=True)
  Inserts an element into the cache.
  Parameters:
  - addr – Element address.
  - con – Contents.
  - version – Version.
- clear()
  Remove all entries.
- contains(addr)
  Return the element’s version, or False if the element does not exist.
  Parameters:
  - addr – Element address.
- get(addr)
  Loads an element from the cache.
  Parameters:
  - addr – Element address.
- list()
  List all element addresses in the cache.
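To make the cache semantics above concrete (versioned entries, contains returning the version or False, optional compression), here is a small stand-in built only on the Python standard library. It is a sketch under stated assumptions and not the code of CacheSQLite: the class name ToyCache, the table layout, and the use of pickle and zlib are all choices made for this illustration.

```python
import pickle
import sqlite3
import zlib


class ToyCache:
    """Illustrative stand-in; NOT orangecontrib.bio.dicty.CacheSQLite."""

    def __init__(self, filename=":memory:", compress=True):
        self.compress = compress
        self.conn = sqlite3.connect(filename)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS cache "
            "(addr TEXT PRIMARY KEY, version TEXT, con BLOB)"
        )

    def add(self, addr, con, version="0"):
        blob = pickle.dumps(con)
        if self.compress:
            blob = zlib.compress(blob)
        with self.conn:  # commits the insert on success
            self.conn.execute(
                "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
                (addr, version, blob),
            )

    def contains(self, addr):
        row = self.conn.execute(
            "SELECT version FROM cache WHERE addr = ?", (addr,)
        ).fetchone()
        return row[0] if row else False

    def get(self, addr):
        row = self.conn.execute(
            "SELECT con FROM cache WHERE addr = ?", (addr,)
        ).fetchone()
        if row is None:
            raise KeyError(addr)
        blob = zlib.decompress(row[0]) if self.compress else row[0]
        return pickle.loads(blob)

    def list(self):
        rows = self.conn.execute("SELECT addr FROM cache ORDER BY addr")
        return [row[0] for row in rows]

    def clear(self):
        with self.conn:
            self.conn.execute("DELETE FROM cache")
```

A caller would typically check contains(addr) against the version it expects and fall back to a fresh download when the entry is missing or stale. Raising KeyError from get for a missing address is a choice of this sketch; the source does not say how CacheSQLite.get behaves in that case.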