BioMart (biomart)
Access BioMart MartService.
>>> from orangecontrib.bio.biomart import *
>>> connection = BioMartConnection(
...     "http://www.biomart.org/biomart/martservice")
>>> reg = BioMartRegistry(connection)
>>> for mart in reg.marts():
...     print(mart.name)
...
ensembl...
>>> dataset = BioMartDataset(
...     mart="ensembl", internalName="hsapiens_gene_ensembl",
...     virtualSchema="default", connection=connection)
>>> for attr in dataset.attributes()[:10]:
...     print(attr.name)
...
Ensembl Gene ID...
>>> data = dataset.get_data(
...     attributes=["ensembl_gene_id", "ensembl_peptide_id"],
...     filters=[("chromosome_name", "1")])
>>> query = BioMartQuery(reg.connection, virtualSchema="default")
>>> query.set_dataset("hsapiens_gene_ensembl")
>>> query.add_attribute("ensembl_gene_id")
>>> query.add_attribute("ensembl_peptide_id")
>>> query.add_filter("chromosome_name", "1")
>>> count = query.get_count()
Interface
class orangecontrib.bio.biomart.BioMartConnection(address=None, timeout=30)
    A connection to a BioMart martservice server.

    >>> connection = BioMartConnection(
    ...     "http://www.biomart.org/biomart/martservice")
    >>> response = connection.registry()
    >>> response = connection.datasets(mart="ensembl")
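To make the two calls above more concrete, the following sketch builds the kind of request URLs a martservice endpoint is usually queried with, using only the standard library. It assumes the common BioMart MartService convention of `type=registry` and `type=datasets&mart=...` query parameters. The helper `martservice_url` is hypothetical and is not part of `orangecontrib.bio.biomart`; how `BioMartConnection` builds its requests internally is not documented here.

```python
from urllib.parse import urlencode

# Address taken from the examples in this document.
MARTSERVICE = "http://www.biomart.org/biomart/martservice"


def martservice_url(address, **params):
    """Return `address` with `params` appended as a URL query string.

    Hypothetical helper for illustration; not part of the library.
    """
    if not params:
        return address
    return address + "?" + urlencode(params)


# Assumed MartService parameters: type=registry and type=datasets&mart=...
registry_url = martservice_url(MARTSERVICE, type="registry")
datasets_url = martservice_url(MARTSERVICE, type="datasets", mart="ensembl")
print(registry_url)
print(datasets_url)
```

Keeping URL construction in one small function makes it easy to inspect a request before any network traffic happens.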
class orangecontrib.bio.biomart.BioMartRegistry(stream)
    A class representing a BioMart registry.

    Parameters: stream – A file-like object containing the XML registry, or a BioMartConnection instance.

    >>> registry = BioMartRegistry(connection)
    >>> for schema in registry.virtual_schemas():
    ...     print(schema.name)
    ...
    default

    dataset(internalName, virtualSchema=None)
        Return a BioMartDataset instance that matches the internalName.

    datasets()
        Return a list of all datasets (BioMartDataset) from all marts, regardless of their virtual schemas.

    links_between(exporting, importing, virtualSchema='default')
        Return all links between the exporting and importing datasets in the virtualSchema.

    mart(name)
        Return a named mart.

    marts()
        Return a list of all ‘mart’ instances (BioMartDatabase), regardless of their virtual schemas.

    classmethod parse(stream, parser=None)
        Parse the registry file-like object and return a DOM-like description (XMLNode).

    query(**kwargs)
        Return an initialized BioMartQuery with registry set to self. Additional keyword arguments are passed to BioMartQuery.__init__.

    virtual_schema(name)
        Return a named virtual schema.

    virtual_schemas()
        Return a list of BioMartVirtualSchema instances representing each schema.
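As a rough picture of the registry document these methods work from, the sketch below parses a small registry-like XML string with the standard library and groups mart names by virtual schema. The attribute values are the `BioMartDatabase` defaults listed in this document; the element names `MartRegistry` and `MartURLLocation` are an assumption based on typical BioMart registry responses, and `marts_by_schema` is a hypothetical helper, not the library's parser.

```python
import xml.etree.ElementTree as ET

# Sample registry entry. Attribute values mirror the BioMartDatabase
# defaults in this document; the element names are an assumption.
REGISTRY_XML = """\
<MartRegistry>
  <MartURLLocation name="ensembl" database="ensembl_mart_60" default="1"
      displayName="ENSEMBL GENES 60 (SANGER UK)" host="www.biomart.org"
      includeDatasets="" martUser="" path="/biomart/martservice" port="80"
      serverVirtualSchema="default" visible="1" />
</MartRegistry>
"""


def marts_by_schema(xml_text):
    """Group mart names by their serverVirtualSchema attribute."""
    root = ET.fromstring(xml_text)
    schemas = {}
    for location in root.iter("MartURLLocation"):
        schema = location.get("serverVirtualSchema", "default")
        schemas.setdefault(schema, []).append(location.get("name"))
    return schemas


schemas = marts_by_schema(REGISTRY_XML)
print(schemas)
```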
class orangecontrib.bio.biomart.BioMartQuery(registry, virtualSchema='default', dataset=None, attributes=[], filters=[], count=False, uniqueRows=False, format='TSV')
    Construct a query to run on a BioMart server.

    >>> query = BioMartQuery(connection,
    ...                      dataset="hsapiens_gene_ensembl",
    ...                      attributes=["ensembl_transcript_id",
    ...                                  "chromosome_name"],
    ...                      filters=[("chromosome_name", ["22"])])
    >>> count = query.get_count()
    >>> print(count)
    1221

    >>> # Equivalent to
    >>> query = BioMartQuery(connection)
    >>> query.set_dataset("hsapiens_gene_ensembl")
    >>> query.add_filter("chromosome_name", "22")
    >>> query.add_attribute("ensembl_transcript_id")
    >>> query.add_attribute("chromosome_name")
    >>> count = query.get_count()
    >>> print(count)
    1221
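For orientation, a query like the one above is typically sent to a martservice as a small XML document. The sketch below assembles such a document with the standard library. It is an illustration of the general BioMart query XML layout (a `Query` element containing a `Dataset` with `Filter` and `Attribute` children), not the code `BioMartQuery` uses; the function `build_query_xml`, the `header` value and the `datasetConfigVersion` value are assumptions.

```python
import xml.etree.ElementTree as ET


def build_query_xml(dataset, attributes, filters,
                    virtual_schema="default", formatter="TSV",
                    unique_rows=False, count=False):
    """Serialize one dataset query as BioMart-style query XML.

    Hypothetical helper for illustration; not part of the library.
    """
    query = ET.Element("Query", {
        "virtualSchemaName": virtual_schema,
        "formatter": formatter,
        "header": "0",                        # assumed: no header row
        "uniqueRows": "1" if unique_rows else "0",
        "count": "1" if count else "",
        "datasetConfigVersion": "0.6",        # assumed version string
    })
    ds = ET.SubElement(query, "Dataset", name=dataset, interface="default")
    for name, value in filters:
        # A list of filter values is sent as one comma-separated string.
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        ET.SubElement(ds, "Filter", name=name, value=value)
    for name in attributes:
        ET.SubElement(ds, "Attribute", name=name)
    return ET.tostring(query, encoding="unicode")


xml_text = build_query_xml(
    "hsapiens_gene_ensembl",
    attributes=["ensembl_transcript_id", "chromosome_name"],
    filters=[("chromosome_name", ["22"])],
    count=True,
)
print(xml_text)
```

Both construction styles shown in the doctests (constructor arguments or `set_dataset`/`add_filter`/`add_attribute` calls) describe the same dataset, filter and attribute set, so they would map to the same document in this sketch.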
class orangecontrib.bio.biomart.BioMartDataset(mart='ensembl', internalName='hsapiens_gene_ensembl', virtualSchema='default', connection=None, datasetType='TableSet', displayName='', visible='1', assembly='', date='', **kwargs)
    A BioMart dataset (returned by BioMartDatabase).

    configuration(parser=None)
        Return the configuration tree for this dataset (DatasetConfig).

    count(filters=[], unique=False)
        Construct and run a BioMartQuery and count the number of returned lines.

    get_data(attributes=[], filters=[], unique=False)
        Construct and run a BioMartQuery and return its results.
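The type of the object returned by `get_data` is not stated here. Assuming the raw server response is tab-separated text (consistent with the default `format='TSV'` of `BioMartQuery`), the sketch below shows one way such text could be turned into rows keyed by attribute name. The sample rows are placeholders, not real Ensembl identifiers, and `parse_tsv` is a hypothetical helper.

```python
import csv
import io

# Placeholder response text; real rows would come from the server.
SAMPLE_TSV = "GENE_ID_1\tPEPTIDE_ID_1\nGENE_ID_2\t\nGENE_ID_3\tPEPTIDE_ID_3\n"


def parse_tsv(text, columns):
    """Turn tab-separated text into a list of dicts keyed by column name."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [dict(zip(columns, row)) for row in reader if row]


rows = parse_tsv(SAMPLE_TSV, ["ensembl_gene_id", "ensembl_peptide_id"])
# Keep only the rows that carry a peptide ID.
with_peptide = [r for r in rows if r["ensembl_peptide_id"]]
print(len(rows), len(with_peptide))
```

The column names are passed in the same order as the requested attributes, since a response without a header row carries no names of its own.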
class orangecontrib.bio.biomart.BioMartVirtualSchema(locations=None, name='default', connection=None)
    A virtual schema.

    dataset(internalName)
        Return a dataset with matching internalName.

    datasets()
        Return a list of all datasets (BioMartDataset) from all marts in this schema.

    links()
        Return a list of (linkName, linkVersion) tuples defined by datasets in the schema.

    links_between(exporting, importing)
        Return a list of link names from the exporting dataset to the importing dataset.

    marts()
        Return a list of all ‘mart’ instances (BioMartDatabase) in this schema.

    query(**kwargs)
        Return an initialized BioMartQuery with registry and virtualSchema set to self. Additional keyword arguments are passed to BioMartQuery.__init__.
class orangecontrib.bio.biomart.BioMartDatabase(name='ensembl', virtualSchema='default', connection=None, database='ensembl_mart_60', default='1', displayName='ENSEMBL GENES 60 (SANGER UK)', host='www.biomart.org', includeDatasets='', martUser='', path='/biomart/martservice', port='80', serverVirtualSchema='default', visible='1', **kwargs)
    A BioMart ‘mart’ instance.

    Parameters:
    - name (str) – Name of the mart instance.
    - virtualSchema (str) – Name of the virtualSchema of the dataset.
    - connection (BioMartConnection) – An optional BioMartConnection instance.

    dataset_attributes(dataset, **kwargs)
        Return a list of dataset attributes.

    dataset_filters(dataset, **kwargs)
        Return a list of dataset filters.

    dataset_query(dataset, filters=[], attributes=[])
        Return a dataset query based on dataset, filters and attributes.

    datasets()
        Return a list of all datasets (BioMartDataset) in this database.
class orangecontrib.bio.biomart.Attribute
    An attribute in a BioMart dataset.

    description
        Human-readable description.

    format
        Attribute format.

    internalName
        Attribute’s internal name.

    internal_name
        Attribute’s internal name.

    name
        Human-readable name.
class orangecontrib.bio.biomart.Filter
    A filter on a BioMart dataset.

    description
        Filter description.

    internal_name
        Internal name.

    name
        Filter name.

    values
        List of possible filter values.
class orangecontrib.bio.biomart.DatasetConfig(registry, *args, **kwargs)