Bases: object
Wrapper around a pandas.DataFrame that adds additional functionality.
The underlying pandas.DataFrame is always available through the data attribute.
Any attribute not explicitly defined on this class is looked up on the underlying pandas.DataFrame.
| Parameters: | data : string or pandas.DataFrame |
|---|---|
| | db : string or gffutils.FeatureDB |
| | import_kwargs : dict |
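As a rough illustration of the attribute fallback described above, the sketch below uses Python's `__getattr__` hook, which is only consulted when normal attribute lookup fails. This is an assumption about the mechanism, not the class's actual code. The Wrapper class and its describe method are hypothetical, and a plain dict stands in for the pandas.DataFrame so that the example runs without pandas.

```python
class Wrapper(object):
    """Hypothetical stand-in for a class that wraps another object."""

    def __init__(self, data):
        # The wrapped object stays reachable through the `data` attribute.
        self.data = data

    def __getattr__(self, name):
        # Only called when normal lookup on the wrapper fails, so
        # attributes defined on the wrapper itself always take precedence.
        if name == "data":
            # Avoid infinite recursion if `data` has not been set yet.
            raise AttributeError(name)
        return getattr(self.data, name)

    def describe(self):
        # Defined on the wrapper, so it is never delegated.
        return "wrapper around %s" % type(self.data).__name__


w = Wrapper({"geneA": 1.5, "geneB": -0.2})
names = sorted(w.keys())   # keys() is not on Wrapper, so the dict handles it
summary = w.describe()     # found on Wrapper directly
```

One consequence of this pattern is that a typo in an attribute name surfaces as an AttributeError raised by the wrapped object rather than by the wrapper.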
Methods
| Method | Description |
|---|---|
| TSS([upstream, downstream]) | Creates a BED/GFF file of the 5’ end of each feature represented in the table and returns the resulting pybedtools.BedTool object. |
| TTS([upstream, downstream]) | Creates a BED/GFF file of the 3’ end of each feature represented in the table and returns the resulting pybedtools.BedTool object. |
| align_with(other) | Align the dataframe’s index with another. |
| attach_db(db) | Attach a gffutils.FeatureDB for access to features. |
| copy() | |
| features([ignore_unknown]) | Generator of features. |
| five_prime([upstream, downstream]) | Creates a BED/GFF file of the 5’ end of each feature represented in the table and returns the resulting pybedtools.BedTool object. |
| genes_in_common(other) | Convenience method for getting the genes found in both dataframes. |
| genes_with_peak(peaks[, transform_func, ...]) | Returns a boolean index of genes that have a peak nearby. |
| radviz(column_names[, transforms]) | Radviz plot. |
| reindex_to(x[, attribute]) | Returns a copy that only has rows corresponding to feature names in x. |
| scatter(x, y[, xfunc, yfunc, xscale, ...]) | Do-it-all method for making annotated scatterplots. |
| strip_unknown_features() | Remove features not found in the gffutils.FeatureDB. |
| three_prime([upstream, downstream]) | Creates a BED/GFF file of the 3’ end of each feature represented in the table and returns the resulting pybedtools.BedTool object. |
| update(dataframe) | Updates the current data with a new dataframe. |
TSS([upstream, downstream])
Creates a BED/GFF file of the 5’ end of each feature represented in the table and returns the resulting pybedtools.BedTool object. Needs an attached database.
| Parameters: | upstream, downstream : int |
|---|---|
TTS([upstream, downstream])
Creates a BED/GFF file of the 3’ end of each feature represented in the table and returns the resulting pybedtools.BedTool object. Needs an attached database.
| Parameters: | upstream, downstream : int |
|---|---|
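A minimal sketch of how a strand-aware window around the 5’ or 3’ end of a feature might be computed. It assumes 0-based, half-open (BED-style) coordinates and assumes that upstream and downstream are distances in bp measured relative to the feature's strand; neither is confirmed by this page. The function end_window and its which argument are hypothetical. The actual methods operate on features from the attached database and return a pybedtools.BedTool.

```python
def end_window(start, end, strand, which="5", upstream=0, downstream=0):
    """Return (start, stop) of a window around one end of a feature.

    Coordinates are 0-based and half-open; `which` is "5" or "3".
    """
    if which not in ("5", "3"):
        raise ValueError("which must be '5' or '3'")
    minus = strand == "-"
    # The 5' end is the left edge on "+" and the right edge on "-";
    # the 3' end is the opposite edge in each case.
    use_left_edge = (which == "5") != minus
    anchor = start if use_left_edge else end
    if minus:
        # On the minus strand, "upstream" points toward larger coordinates.
        lo, hi = anchor - downstream, anchor + upstream
    else:
        lo, hi = anchor - upstream, anchor + downstream
    # Clamp so the window never extends past the start of the chromosome.
    return max(0, lo), max(0, hi)


plus_tss = end_window(100, 200, "+", which="5", upstream=10, downstream=5)
minus_tss = end_window(100, 200, "-", which="5", upstream=10, downstream=5)
```

Under these assumptions the same upstream and downstream values produce mirrored windows for plus-strand and minus-strand features.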
attach_db(db)
Attach a gffutils.FeatureDB for access to features.
Useful if you want to attach a db after this instance has already been created.
| Parameters: | db : gffutils.FeatureDB |
|---|---|
features([ignore_unknown])
Generator of features.
If a gffutils.FeatureDB is attached, yields a pybedtools.Interval for every feature in the dataframe’s index.
| Parameters: | ignore_unknown : bool |
|---|---|
five_prime([upstream, downstream])
Creates a BED/GFF file of the 5’ end of each feature represented in the table and returns the resulting pybedtools.BedTool object. Needs an attached database.
| Parameters: | upstream, downstream : int |
|---|---|
genes_with_peak(peaks[, transform_func, ...])
Returns a boolean index of genes that have a peak nearby.
| Parameters: | peaks : string or pybedtools.BedTool |
|---|---|
| | transform_func : callable |
| | intersect_kwargs : dict |
| | id_attribute : str |
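To make the idea of a boolean index of genes with a nearby peak concrete, here is a small pure-Python sketch. It is not the method's implementation, which works with pybedtools objects. The helpers overlaps, peak_flags and extend_1kb are hypothetical, intervals are assumed to be 0-based half-open (chrom, start, end) tuples, and "nearby" is modelled by letting a transform function widen each gene interval before the overlap test.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def peak_flags(genes, peaks, transform_func=None):
    """Return one bool per gene, in the order the genes were given."""
    flags = []
    for _name, interval in genes:
        if transform_func is not None:
            # Reshape the gene interval first, e.g. to extend it.
            interval = transform_func(interval)
        flags.append(any(overlaps(interval, peak) for peak in peaks))
    return flags


def extend_1kb(interval):
    """Widen an interval by 1 kb on each side, clamped at position 0."""
    chrom, start, end = interval
    return chrom, max(0, start - 1000), end + 1000


genes = [
    ("g1", ("chr1", 1000, 2000)),
    ("g2", ("chr1", 10000, 11000)),
    ("g3", ("chr2", 1000, 2000)),
]
peaks = [("chr1", 2500, 2600)]

strict = peak_flags(genes, peaks)                # direct overlap only
nearby = peak_flags(genes, peaks, extend_1kb)    # within 1 kb of the gene
```

The resulting list lines up with the gene order, so it can be used the way a boolean index is used to select rows.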
radviz(column_names[, transforms])
Radviz plot.
A radviz plot shows multivariate data in 2D, which makes it useful for exploratory visualization. Conceptually, the variables (here, specified in column_names) are distributed evenly around the unit circle. Then each point (here, each row in the dataframe) is attached to each variable by a spring, where the stiffness of the spring is proportional to the value of the corresponding variable. The final position of a point represents the equilibrium position with all springs pulling on it.
In practice, each variable is normalized to 0-1 (by subtracting the minimum and dividing by the range).
This is a very exploratory plot. The order of column_names will affect the results, so it’s best to try a few different orderings. For other caveats, see [1].
Additional kwargs are passed to self.scatter, so subsetting, callbacks, and other configuration can be performed using options for that method (e.g., genes_to_highlight is particularly useful).
| Parameters: | column_names : list |
|---|---|
| | transforms : dict |
| | ax : matplotlib.Axes |
| | kwargs : dict |
Notes
This method adds two new variables to self.data: “radviz_x” and “radviz_y”. It then calls the self.scatter method, using these new variables.
The data transformation was adapted from the pandas.tools.plotting.radviz function.
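The spring model described above can be sketched in plain Python as follows. This is an illustration, not the code used by this method or by pandas. It assumes min-max scaling (which is what maps each column onto 0-1), and the names minmax and radviz_xy are hypothetical. Placing rows whose scaled values sum to zero at the origin is a choice made for this sketch only.

```python
import math


def minmax(values):
    """Scale a list of numbers onto the 0-1 range."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant column carries no information; treat it as all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def radviz_xy(columns):
    """Map rows to 2D points.

    `columns` maps each column name to a list of values; all lists must
    have the same length. Returns one (x, y) tuple per row.
    """
    names = list(columns)
    m = len(names)
    # Anchors: one per variable, spread evenly around the unit circle.
    anchors = [
        (math.cos(2 * math.pi * j / m), math.sin(2 * math.pi * j / m))
        for j in range(m)
    ]
    scaled = [minmax(columns[name]) for name in names]
    points = []
    for row in zip(*scaled):
        total = sum(row)
        if total == 0:
            points.append((0.0, 0.0))
            continue
        # Equilibrium position: anchor positions weighted by spring stiffness.
        x = sum(w * ax for w, (ax, _ay) in zip(row, anchors)) / total
        y = sum(w * ay for w, (_ax, ay) in zip(row, anchors)) / total
        points.append((x, y))
    return points


points = radviz_xy({"a": [0.0, 1.0], "b": [1.0, 0.0]})
```

Because the anchors are assigned in the order the columns are given, reordering the columns moves the anchors and therefore changes where the points land.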
References
[2] http://www.agocg.ac.uk/reports/visual/casestud/brunsdon/radviz.htm
[3] http://pandas.pydata.org/pandas-docs/stable/visualization.html#radviz
reindex_to(x[, attribute])
Returns a copy that only has rows corresponding to feature names in x.
| Parameters: | x : str or pybedtools.BedTool |
|---|---|
| | attribute : str |
scatter(x, y[, xfunc, yfunc, xscale, ...])
Do-it-all method for making annotated scatterplots.
| Parameters: | x, y : array-like |
|---|---|
| | xfunc, yfunc : callable |
| | xlab, ylab : string |
| | ax : None or Axes object |
| | general_kwargs : dict |
| | genes_to_highlight : list of (index, dict) tuples |
| | callback : callable |
| | one_to_one : None or dict |
| | label_kwargs : dict |
| | offset_kwargs : dict |
| | xlab_prefix, ylab_prefix : str |
| | hist_size : float |
| | hist_pad : float |
| | nan_offset, pos_offset, neg_offset : float |
| | linelength : float |
strip_unknown_features()
Remove features not found in the gffutils.FeatureDB. This will typically include ‘ambiguous’, ‘no_feature’, etc., but can also be useful if the database was created from a different source than the one used to create the table.