BioMart (biomart)

Access BioMart MartService.

>>> from orangecontrib.bio.biomart import *
>>> connection = BioMartConnection(
...     "http://www.biomart.org/biomart/martservice")
...
>>> reg = BioMartRegistry(connection)
>>> for mart in reg.marts():
...    print(mart.name)
...
ensembl...
>>> dataset = BioMartDataset(
...     mart="ensembl", internalName="hsapiens_gene_ensembl",
...     virtualSchema="default", connection=connection)
...
>>> for attr in dataset.attributes()[:10]:
...    print(attr.name)
...
Ensembl Gene ID...
>>> data = dataset.get_data(
...    attributes=["ensembl_gene_id", "ensembl_peptide_id"],
...    filters=[("chromosome_name", "1")])
...
>>> query = BioMartQuery(reg.connection, virtualSchema="default")
>>> query.set_dataset("hsapiens_gene_ensembl")
>>> query.add_attribute("ensembl_gene_id")
>>> query.add_attribute("ensembl_peptide_id")
>>> query.add_filter("chromosome_name", "1")
>>> count = query.get_count()

Interface

class orangecontrib.bio.biomart.BioMartConnection(address=None, timeout=30)

A connection to a BioMart martservice server.

>>> connection = BioMartConnection(
...     "http://www.biomart.org/biomart/martservice")
>>> response = connection.registry()
>>> response = connection.datasets(mart="ensembl")
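
The two calls above map naturally onto martservice HTTP requests. The following is a minimal sketch of how such request URLs could be assembled, assuming the usual martservice convention of a `type` query parameter (plus `mart` for dataset listings). The helper `martservice_url` is hypothetical and is not part of this module; how BioMartConnection actually builds its requests is not shown on this page.

```python
from urllib.parse import urlencode

BASE = "http://www.biomart.org/biomart/martservice"


def martservice_url(base, **params):
    """Append params to base as a query string (hypothetical helper)."""
    # Sort the parameters so the resulting URL is deterministic.
    return base + "?" + urlencode(sorted(params.items()))


# Assumed equivalents of connection.registry() and
# connection.datasets(mart="ensembl").
registry_url = martservice_url(BASE, type="registry")
datasets_url = martservice_url(BASE, type="datasets", mart="ensembl")

print(registry_url)
print(datasets_url)
```

Nothing is sent over the network here; the sketch only shows the shape of the requests.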
class orangecontrib.bio.biomart.BioMartRegistry(stream)

A class representing a BioMart registry.

Parameters: stream – A file-like object containing the XML registry, or a BioMartConnection instance.

>>> registry = BioMartRegistry(connection)
>>> for schema in registry.virtual_schemas():
...    print(schema.name)
...
default

databases()

Same as marts.

dataset(internalName, virtualSchema=None)

Return a BioMartDataset instance that matches the internalName.

datasets()

Return a list of all datasets (BioMartDataset) from all marts regardless of their virtual schemas.

Return all links between exporting and importing datasets in the virtualSchema.

mart(name)

Return a named mart.

marts()

Return a list of all ‘mart’ instances (BioMartDatabase) regardless of their virtual schemas.

classmethod parse(stream, parser=None)

Parse the registry file-like object and return a DOM-like description (XMLNode).
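
To illustrate what parsing a registry involves, here is a rough sketch that reads mart entries from a registry document with the standard library. It assumes a `MartRegistry` root with `MartURLLocation` children, whose attributes mirror the BioMartDatabase constructor arguments documented on this page. The sample XML and the `list_marts` helper are made up for illustration; this is not the module's own parser and it does not produce XMLNode objects.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical registry document (assumed layout, not fetched from a server).
REGISTRY_XML = """<MartRegistry>
  <MartURLLocation name="ensembl" database="ensembl_mart_60"
      displayName="ENSEMBL GENES 60 (SANGER UK)" host="www.biomart.org"
      path="/biomart/martservice" port="80"
      serverVirtualSchema="default" visible="1" />
</MartRegistry>"""


def list_marts(stream):
    """Return one attribute dict per mart entry in a registry stream."""
    root = ET.parse(stream).getroot()
    return [dict(node.attrib) for node in root.iter("MartURLLocation")]


marts = list_marts(io.StringIO(REGISTRY_XML))
for mart in marts:
    print(mart["name"], mart["displayName"])
```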

query(**kwargs)

Return an initialized BioMartQuery with registry set to self. Pass additional arguments to BioMartQuery.__init__ with keyword arguments.

virtual_schema(name)

Return a named virtual schema.

virtual_schemas()

Return a list of BioMartVirtualSchema instances representing each schema.

class orangecontrib.bio.biomart.BioMartQuery(registry, virtualSchema='default', dataset=None, attributes=[], filters=[], count=False, uniqueRows=False, format='TSV')

Construct a query to run on a BioMart server.

>>> query = BioMartQuery(connection,
...                      dataset="hsapiens_gene_ensembl",
...                      attributes=["ensembl_transcript_id",
...                                  "chromosome_name"],
...                      filters=[("chromosome_name", ["22"])])
...
>>> count = query.get_count()
>>> print(count) 
1221
>>> # Equivalent to
>>> query = BioMartQuery(connection)
>>> query.set_dataset("hsapiens_gene_ensembl")
>>> query.add_filter("chromosome_name", "22")
>>> query.add_attribute("ensembl_transcript_id")
>>> query.add_attribute("chromosome_name")
>>> count = query.get_count()
>>> print(count)  
1221
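
For orientation, the sketch below builds the kind of XML document a martservice query typically consists of, using the same dataset, attributes and filter as the example above. It assumes the common BioMart query layout (a `Query` element wrapping a `Dataset` with `Filter` and `Attribute` children). The helper `build_query_xml` and the exact attribute set are assumptions for illustration and may differ from what BioMartQuery actually emits.

```python
import xml.etree.ElementTree as ET


def build_query_xml(dataset, attributes, filters, virtual_schema="default",
                    count=False, unique_rows=False, fmt="TSV"):
    """Build a martservice-style XML query string (hypothetical helper)."""
    query = ET.Element("Query", {
        "virtualSchemaName": virtual_schema,
        "formatter": fmt,
        "header": "0",
        "uniqueRows": "1" if unique_rows else "0",
        "count": "1" if count else "",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    for name, value in filters:
        # Assumption: several filter values are sent comma-separated.
        if isinstance(value, (list, tuple)):
            value = ",".join(value)
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


xml_text = build_query_xml(
    "hsapiens_gene_ensembl",
    attributes=["ensembl_transcript_id", "chromosome_name"],
    filters=[("chromosome_name", ["22"])],
    count=True,
)
print(xml_text)
```

Setting `count=True` here corresponds to asking the server for a row count rather than the rows themselves, which is what `get_count()` is used for in the example above.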
class orangecontrib.bio.biomart.BioMartDataset(mart='ensembl', internalName='hsapiens_gene_ensembl', virtualSchema='default', connection=None, datasetType='TableSet', displayName='', visible='1', assembly='', date='', **kwargs)

A BioMart dataset (returned by BioMartDatabase).

attributes()

Return a list of available attributes for this dataset (Attribute).

configuration(parser=None)

Return the configuration tree for this dataset (DatasetConfig).

count(filters=[], unique=False)

Construct and run a BioMartQuery and count the number of returned lines.

filters()

Return a list of available filters for this dataset (Filter).

get_data(attributes=[], filters=[], unique=False)

Construct and run a BioMartQuery and return its results.
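
The return type of get_data is not specified here. If the raw server reply is tab-separated text, as the `format='TSV'` default of BioMartQuery suggests, it could be turned into rows along these lines. This is only a sketch: the sample text is invented (not real Ensembl output) and `parse_tsv` is a hypothetical helper, not part of this module.

```python
import csv
import io

# Made-up response body: one line per result row, columns in the
# order the attributes were requested.
SAMPLE_TSV = "GENE_A\tPEPTIDE_A1\nGENE_A\tPEPTIDE_A2\nGENE_B\t\n"


def parse_tsv(text, attributes):
    """Map each tab-separated line onto the requested attribute names."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [dict(zip(attributes, row)) for row in reader]


rows = parse_tsv(SAMPLE_TSV, ["ensembl_gene_id", "ensembl_peptide_id"])
for row in rows:
    print(row)
```

An empty column, as in the last sample line, comes back as an empty string rather than being dropped.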

class orangecontrib.bio.biomart.BioMartVirtualSchema(locations=None, name='default', connection=None)

A virtual schema.

databases()

Same as marts.

dataset(internalName)

Return a dataset with matching internalName.

datasets()

Return a list of all datasets (BioMartDataset) from all marts in this schema.

Return a list of (linkName, linkVersion) tuples defined by datasets in the schema.

Return a list of link names from exporting dataset to importing dataset.

marts()

Return a list of all ‘mart’ instances (BioMartDatabase) in this schema.

query(**kwargs)

Return an initialized BioMartQuery with registry and virtualSchema set to self. Pass additional arguments to BioMartQuery.__init__ with keyword arguments.

class orangecontrib.bio.biomart.BioMartDatabase(name='ensembl', virtualSchema='default', connection=None, database='ensembl_mart_60', default='1', displayName='ENSEMBL GENES 60 (SANGER UK)', host='www.biomart.org', includeDatasets='', martUser='', path='/biomart/martservice', port='80', serverVirtualSchema='default', visible='1', **kwargs)

A BioMart ‘mart’ instance.

Parameters:
  • name (str) – Name of the mart instance.
  • virtualSchema (str) – Name of the virtualSchema of the dataset.
  • connection (BioMartConnection) – An optional BioMartConnection instance.

dataset_attributes(dataset, **kwargs)

Return a list of dataset attributes.

dataset_filters(dataset, **kwargs)

Return a list of dataset filters.

dataset_query(dataset, filters=[], attributes=[])

Return a dataset query based on dataset, filters and attributes.

datasets()

Return a list of all datasets (BioMartDataset) in this database.

class orangecontrib.bio.biomart.Attribute

An attribute in a BioMart data-set.

description

Human-readable description.

format

Attribute format.

internalName

Attribute’s internal name.

internal_name

Attribute’s internal name.

name

Human-readable name.

class orangecontrib.bio.biomart.Filter

A filter on a BioMart data-set.

description

Filter description.

internal_name

Internal name.

name

Filter name.

values

List of possible filter values.

class orangecontrib.bio.biomart.DatasetConfig(registry, *args, **kwargs)