The classes in this module enable random access to a variety of file formats (BAM, bigWig, bigBed, BED) using a uniform syntax, and allow you to compute coverage across many features in parallel or just a single feature.
Using classes in the metaseq.integration and metaseq.minibrowser modules, you can connect these objects to matplotlib figures that show a window into the data, making exploration easy and interactive.
Generally, the genomic_signal() function is all you need – just provide a filename and the format and it will take care of the rest, returning a genomic signal of the proper type.
Adding support for a new format is straightforward:
- Write a new adapter for the format in metaseq.filetype_adapters.
- Subclass one of the existing classes below, setting the adapter attribute to an instance of this new adapter.
- Add the new class to the _registry dictionary to enable support for the file format.
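The three steps above can be sketched with stand-in classes. This is a minimal illustration of the adapter/subclass/registry pattern, not metaseq's actual code: ToyWigAdapter, ToySignal, ToyWigSignal, toy_registry, toy_genomic_signal, and toy_supported_formats are all hypothetical names, and the "toywig" format is made up.

```python
class ToyWigAdapter:
    """Step 1: a hypothetical adapter for an imaginary 'toywig' format."""

    def __init__(self, fn):
        self.fn = fn


class ToySignal:
    """Stand-in for one of the existing signal classes."""

    def __init__(self, fn):
        self.fn = fn
        self.adapter = None


class ToyWigSignal(ToySignal):
    """Step 2: subclass and set the adapter attribute to an adapter instance."""

    def __init__(self, fn):
        super().__init__(fn)
        self.adapter = ToyWigAdapter(fn)


# Step 3: register the new class under its format name.
toy_registry = {"toy": ToySignal}
toy_registry["toywig"] = ToyWigSignal


def toy_genomic_signal(fn, kind):
    """Hypothetical factory: look up the class for `kind` and instantiate it."""
    try:
        cls = toy_registry[kind.lower()]
    except KeyError:
        raise ValueError(
            "No support for %r; choices are %s" % (kind, sorted(toy_registry))
        )
    return cls(fn)


def toy_supported_formats():
    """Hypothetical helper: list the registered format names."""
    return sorted(toy_registry)


sig = toy_genomic_signal("reads.toywig", "ToyWig")
print(type(sig).__name__, toy_supported_formats())
```

Keeping the lookup in one dictionary means the factory needs no changes when a format is added; only the registry grows.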
Note that to support parallel processing and to avoid repeating code, these classes delegate their local_coverage methods to the metaseq.array_helpers._local_coverage() function.
Functions
metaseq._genomic_signal.genomic_signal | Factory function that makes the right class for the file format. |
metaseq._genomic_signal.supported_formats | Returns list of formats supported by metaseq’s genomic signal objects. |
Classes
metaseq._genomic_signal.BaseSignal | Base class to represent objects from which genomic signal can be calculated/extracted. |
metaseq._genomic_signal.IntervalSignal | Abstract class for BED, BAM and bigBed files. |
metaseq._genomic_signal.BigWigSignal | Class for operating on bigWig files. |
metaseq._genomic_signal.BamSignal | Class for operating on BAM files. |
metaseq._genomic_signal.BigBedSignal | Class for operating on bigBed files. |
metaseq._genomic_signal.BedSignal | Class for operating on BED files. |
Classes
metaseq.results_table.ResultsTable | Wrapper around a pandas.DataFrame that adds additional functionality. |
metaseq.results_table.DESeqResults | Class for working with results from DESeq. |
metaseq.results_table.DESeq2Results | Class for working with results from DESeq2. |
metaseq.results_table.EdgeRResults | Class for working with results from edgeR. |
metaseq.results_table.LazyDict | Dictionary-like object that lazily-loads ResultsTable objects. |
Module that ties together various parts of metaseq
Classes
metaseq.integration.chipseq.Chipseq | Class for visualizing and interactively exploring ChIP-seq data. |
Functions
metaseq.integration.signal_comparison.compare | Compares two genomic signal objects and outputs results as a bedGraph file. |
Module with handy utilities for plotting genomic signal
Functions
metaseq.plotutils.imshow | Do-it-all function to help with plotting heatmaps |
metaseq.plotutils.add_labels_to_subsets | Helper function for adding labels to subsets within a heatmap. |
metaseq.plotutils.clustered_sortind | Uses MiniBatch k-means clustering to cluster matrix into groups. |
metaseq.plotutils.calculate_limits | Calculate limits for a group of arrays in a flexible manner. |
metaseq.plotutils.ci_plot | Plots the mean and 95% CI for the given array on the given axes. |
metaseq.plotutils.ci | Column-wise confidence interval. |
metaseq.plotutils.tip_zscores | Calculates the “target identification from profiles” (TIP) zscores from Cheng et al. |
metaseq.plotutils.tip_fdr | Returns adjusted TIP p-values for a particular alpha. |
metaseq.plotutils.nice_log | Uses a log scale but with negative numbers. |
metaseq.plotutils.prepare_logged | Transform x and y to a log scale while dealing with zeros. |
metaseq.plotutils.matrix_and_line_shell | Helper function to construct an empty figure that has space for a matrix, a summary line plot directly below it, a colorbar axis, and an optional “strip” axis that parallels the matrix (and shares its y-axis) where data can be added to create callbacks. |
metaseq.plotutils.input_ip_plots | All-in-one plotting function to make a 5-panel figure. |
Classes
metaseq.plotutils.MarginalHistScatter | Class to enable incremental appending of scatterplots, each of which generate additional marginal histograms. |
Module to handle custom colormaps.
cmap_powerlaw_adjust, cmap_center_adjust, and cmap_center_point_adjust are from https://sites.google.com/site/theodoregoetz/notes/matplotlib_colormapadjust
Functions
metaseq.colormap_adjust.color_test | Figure filled in with color; useful for troubleshooting or experimenting |
metaseq.colormap_adjust.smart_colormap | Creates a “smart” colormap that is centered on zero, and accounts for asymmetrical vmin and vmax by matching saturation/value of high and low colors. |
metaseq.colormap_adjust.cmap_discretize | |
metaseq.colormap_adjust.cmap_powerlaw_adjust | Returns a new colormap based on the one given but adjusted via power-law, newcmap = oldcmap**a. |
metaseq.colormap_adjust.cmap_center_adjust | Returns a new colormap based on the one given |
metaseq.colormap_adjust.cmap_center_point_adjust | Converts center to a ratio between 0 and 1 of the range given and calls cmap_center_adjust(). |
Module for spawning mini genome browsers using a plugin structure, making it possible to build rather complex mini-browsers. The goal is to point the mini-browser to some data, and call its plot() method with a feature. This will spawn a new figure showing the data for that interval.
MiniBrowser classes are just a general way of mapping data-manipulation or data-visualization methods to an Axes on which the data should be displayed.
To make a new subclass:
Create one or more methods that accept an Axes object and a pybedtools Interval object and return a feature. The simplest do-nothing method would be:
    def my_panel(self, ax, feature):
        return feature
A more useful method might be one that plots genomic signal over the region:
    def my_panel(self, ax, feature):
        # For simplicity, use only the first genomic signal object
        gs = self.genomic_signal_objs[0]
        x, y = gs.local_coverage(feature, bins=100)
        ax.plot(x, y)
        ax.axis('tight')
        return feature
Then, override the panels() method. This method:
- Creates Axes as needed; assumes that self.make_fig() has already been called so that self.fig is available.
- Returns a list of (ax, method) tuples. This list maps each created Axes to the method that should operate on it (like the my_panel method above).
For example:
    def panels(self):
        ax = self.fig.add_subplot(111)
        return [(ax, self.my_panel)]
A figure is spawned by calling the plot() method with a pybedtools Interval, e.g.:
    s = SignalMiniBrowser([ip, control])
    s.plot(feature)
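To show how the pieces above fit together without requiring metaseq or matplotlib, here is a self-contained sketch. ToyMiniBrowser and RecordingAxes are hypothetical stand-ins (not BaseMiniBrowser or a real Axes), and features are plain (chrom, start, stop) tuples rather than pybedtools Interval objects. Passing the feature returned by one panel method on to the next is an assumption suggested by the "return a feature" convention.

```python
class RecordingAxes:
    """Stand-in for a matplotlib Axes; it only records what was plotted."""

    def __init__(self, name):
        self.name = name
        self.calls = []

    def plot(self, x, y):
        self.calls.append((list(x), list(y)))


class ToyMiniBrowser:
    """Hypothetical stand-in for the dispatch a mini-browser performs."""

    def make_fig(self):
        self.fig = object()  # placeholder for a real Figure

    def panels(self):
        raise NotImplementedError

    def plot(self, feature):
        self.make_fig()
        # Assumed behaviour: each method receives its Axes and the feature
        # returned by the previous method.
        for ax, method in self.panels():
            feature = method(ax, feature)
        return feature


class TwoPanelBrowser(ToyMiniBrowser):
    def panels(self):
        self.top = RecordingAxes("signal")
        self.bottom = RecordingAxes("midpoint")
        return [(self.top, self.signal_panel),
                (self.bottom, self.midpoint_panel)]

    def signal_panel(self, ax, feature):
        chrom, start, stop = feature
        xs = list(range(start, stop))
        ax.plot(xs, [1] * len(xs))  # flat dummy signal
        return feature

    def midpoint_panel(self, ax, feature):
        chrom, start, stop = feature
        ax.plot([(start + stop) // 2], [0])
        return feature


browser = TwoPanelBrowser()
browser.plot(("chr1", 10, 14))
print(browser.top.calls, browser.bottom.calls)
```

Because panels() only returns (ax, method) pairs, adding a third panel is a matter of creating one more Axes and appending one more tuple.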
Classes
metaseq.minibrowser.BaseMiniBrowser | Base class for plotting a genomic region. |
metaseq.minibrowser.SignalMiniBrowser | Base class for plotting genomic signal. |
metaseq.minibrowser.GeneModelMiniBrowser | Mini-browser to show a signal panel on top and gene models on the bottom. |
This module provides classes that make a file format conform to a uniform API. End users generally do not need these classes directly; they are used internally by higher-level code like metaseq.genomic_signal.
File-type adapters accept a filename of the appropriate format (which is not checked) as the only argument to their constructor.
Subclasses must define __getitem__ to accept a pybedtools.Interval and return an iterator of pybedtools.Interval objects.
Subclasses must define make_fileobj(), which returns an object to be iterated over in __getitem__.
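The two requirements above can be sketched as follows. ToyBaseAdapter and PlainBedAdapter are hypothetical stand-ins, Interval is a namedtuple substituting for pybedtools.Interval, and caching the result of make_fileobj() on a fileobj attribute is an assumption. The linear scan is only for illustration; an adapter backed by an index (such as Tabix) would avoid reading every record.

```python
import os
import tempfile
from collections import namedtuple

# Stand-in for pybedtools.Interval (assumption: only these three fields).
Interval = namedtuple("Interval", "chrom start stop")


class ToyBaseAdapter:
    """Hypothetical base: takes a filename, which is not checked."""

    def __init__(self, fn):
        self.fn = fn
        self.fileobj = None

    def make_fileobj(self):
        raise NotImplementedError

    def __getitem__(self, key):
        raise NotImplementedError


class PlainBedAdapter(ToyBaseAdapter):
    def make_fileobj(self):
        # Returns the object that __getitem__ iterates over.
        intervals = []
        with open(self.fn) as handle:
            for line in handle:
                if not line.strip():
                    continue
                chrom, start, stop = line.split()[:3]
                intervals.append(Interval(chrom, int(start), int(stop)))
        return intervals

    def __getitem__(self, key):
        # Generator: yields every stored interval overlapping `key`.
        if self.fileobj is None:
            self.fileobj = self.make_fileobj()
        for item in self.fileobj:
            if (item.chrom == key.chrom
                    and item.start < key.stop
                    and item.stop > key.start):
                yield item


with tempfile.NamedTemporaryFile("w", suffix=".bed", delete=False) as tmp:
    tmp.write("chr1\t0\t10\nchr1\t20\t30\nchr2\t5\t15\n")

adapter = PlainBedAdapter(tmp.name)
hits = list(adapter[Interval("chr1", 5, 25)])
os.remove(tmp.name)
print(hits)
```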
Classes
metaseq.filetype_adapters.BaseAdapter | Base class for filetype adapters |
metaseq.filetype_adapters.BamAdapter | Adapter that provides random access to BAM objects using Pysam |
metaseq.filetype_adapters.BedAdapter | Adapter that provides random access to BED files via Tabix |
metaseq.filetype_adapters.BigBedAdapter | Adapter that provides random access to bigBed files via bx-python |