Models

cruzdb.models contains the per-feature objects that give you attributes on each row of a table.

The module is used to create a Model with the appropriate methods from a UCSC table. It uses SQLAlchemy reflection to do the lifting.

class cruzdb.models.ABase[source]

Base object that wraps returned database rows

Methods

bed(*attrs, **kwargs) return a bed formatted string of this feature
bed12([score, rgb]) return a bed12 (http://genome.ucsc.edu/FAQ/FAQformat.html#format1) representation of this interval
blat([db, sequence, seq_type]) make a request to the genome browser's BLAT interface
distance([other_or_start, end, features]) check the distance between this and another interval
downstream(distance) return the (start, end) of the region after the gene end
features(other_start, other_end) return e.g. “intron;exon” if the other_start, end overlap introns and exons
globalize(position[, cdna])
is_downstream_of(other) return a boolean indicating whether this feature is downstream of other
is_upstream_of(other) return a boolean indicating whether this feature is upstream of other
localize(*positions, **kwargs) convert global coordinate(s) to local taking introns into account
ncbi_blast([db, megablast, sequence]) perform an NCBI blast against the sequence of this feature
promoter([up, down]) Return a start, end tuple of positions for the promoter region of this gene
sequence([per_exon]) Return the sequence for this feature.
tss([up, down]) Return a start, end tuple of positions around the transcription-start site
upstream(distance) return the (start, end) of the region before the geneStart
bed(*attrs, **kwargs)[source]

return a bed formatted string of this feature

bed12(score='0', rgb='.')[source]

return a bed12 (http://genome.ucsc.edu/FAQ/FAQformat.html#format1) representation of this interval
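To make the BED12 layout concrete, here is a minimal, self-contained sketch of how such a line can be assembled from exon coordinates. It does not call cruzdb: the function name bed12_line and its arguments are hypothetical, and the output of the real bed12() method may differ in detail.

```python
def bed12_line(chrom, name, strand, exons, cds_start, cds_end,
               score="0", rgb="."):
    """Build a tab-separated BED12 line (hypothetical helper, not cruzdb API).

    exons is a sorted list of (start, end) pairs in 0-based, half-open
    genomic coordinates.
    """
    tx_start = exons[0][0]
    tx_end = exons[-1][1]
    # blockSizes: length of each exon; blockStarts: offsets from chromStart.
    block_sizes = ",".join(str(end - start) for start, end in exons)
    block_starts = ",".join(str(start - tx_start) for start, _ in exons)
    fields = [chrom, tx_start, tx_end, name, score, strand,
              cds_start, cds_end, rgb, len(exons), block_sizes, block_starts]
    return "\t".join(str(field) for field in fields)


line = bed12_line("chr1", "geneA", "+", [(100, 200), (300, 400)], 150, 350)
print(line)
```

The twelve fields follow the order given in the UCSC BED format description linked above.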

bins[source]

return the bins for efficient querying
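For background, UCSC tables carry a bin column so that range queries can skip most rows. The sketch below implements the standard UCSC binning scheme (five levels, finest bin 128 kb) in plain Python. It is an illustration of the general technique, not cruzdb's code; the function names are hypothetical and the values returned by the bins attribute may be organized differently.

```python
# Standard UCSC binning constants: five levels, finest level covers 2**17 bases.
BIN_OFFSETS = (585, 73, 9, 1, 0)  # 512+64+8+1, 64+8+1, 8+1, 1, 0
FIRST_SHIFT = 17
NEXT_SHIFT = 3


def bin_from_range(start, end):
    """Smallest bin that fully contains the half-open range [start, end)."""
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError("range %d-%d is outside the 512 Mb scheme" % (start, end))


def overlapping_bins(start, end):
    """Every bin, at every level, that may hold a feature overlapping [start, end)."""
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    bins = []
    for offset in BIN_OFFSETS:
        bins.extend(range(offset + start_bin, offset + end_bin + 1))
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    return bins


print(bin_from_range(0, 1000))
print(overlapping_bins(0, 1000))
```

A query for features overlapping a region can then restrict itself to rows whose bin is in overlapping_bins(start, end) before comparing coordinates.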

blat(db=None, sequence=None, seq_type='DNA')[source]

Make a request to the genome browser's BLAT interface. sequence is one of None, “mrna”, or “cds”. Returns a list of features that are hits to this sequence.

cds[source]

just the parts of the exons that are translated

cds_sequence[source]

a list of genomic sequences for the CDSs

cdss

just the parts of the exons that are translated

coding_exons[source]

includes the entire exon as long as any of it is > cdsStart and < cdsEnd
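The difference between cds and coding_exons can be sketched as follows, assuming exons are (start, end) pairs and the coding region is bounded by cds_start and cds_end. The two helper functions are hypothetical stand-ins written for illustration; they are not the cruzdb implementation.

```python
def cds_parts(exons, cds_start, cds_end):
    """Clip each exon to the coding region, dropping exons left empty."""
    parts = []
    for start, end in exons:
        clipped_start = max(start, cds_start)
        clipped_end = min(end, cds_end)
        if clipped_start < clipped_end:
            parts.append((clipped_start, clipped_end))
    return parts


def coding_exons(exons, cds_start, cds_end):
    """Keep whole exons that overlap the coding region at all."""
    return [(start, end) for start, end in exons
            if end > cds_start and start < cds_end]


exons = [(100, 200), (300, 400), (500, 600)]
print(cds_parts(exons, 150, 350))     # exon pieces inside the coding region
print(coding_exons(exons, 150, 350))  # the same exons, but unclipped
```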

distance(other_or_start=None, end=None, features=False)[source]

check the distance between this and another interval

Parameters :

other_or_start : Interval or int

either an integer or an Interval with a start attribute indicating the start of the interval

end : int

if other_or_start is an integer, this must be an integer indicating the end of the interval

features : bool

if True, the features, such as CDS, intron, etc. that this feature overlaps are returned.

downstream(distance)[source]

return the (start, end) of the region after the gene end

exons[source]

return a list of exons [(start, stop)] for this object if appropriate

features(other_start, other_end)[source]

return e.g. “intron;exon” if the interval (other_start, other_end) overlaps introns and exons

gene_features[source]

return a list of the gene features of this object, such as exons, introns, and UTRs.

is_downstream_of(other)[source]

return a boolean indicating whether this feature is downstream of other taking the strand of other into account

is_gene_pred[source]

http://genome.ucsc.edu/FAQ/FAQformat.html#format9

is_upstream_of(other)[source]

return a boolean indicating whether this feature is upstream of other taking the strand of other into account
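What "taking the strand of other into account" means can be illustrated with a small standalone sketch. It assumes both features are on the same chromosome and are given as (start, end) pairs; the function signatures are hypothetical and simplified relative to the methods documented here.

```python
def is_upstream_of(feature, other, other_strand="+"):
    """True if feature lies entirely before other, in other's reading direction."""
    f_start, f_end = feature
    o_start, o_end = other
    if other_strand == "+":
        return f_end <= o_start
    # On the minus strand, "upstream" means higher genomic coordinates.
    return f_start >= o_end


def is_downstream_of(feature, other, other_strand="+"):
    """True if feature lies entirely after other, in other's reading direction."""
    f_start, f_end = feature
    o_start, o_end = other
    if other_strand == "+":
        return f_start >= o_end
    return f_end <= o_start


print(is_upstream_of((100, 200), (300, 400), "+"))
print(is_upstream_of((100, 200), (300, 400), "-"))
```

The same pair of intervals therefore gives opposite answers depending on the strand of the reference feature.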

localize(*positions, **kwargs)[source]

convert global coordinate(s) to local coordinates, taking introns into account; positions are counted from the cds or tx start, depending on the cdna kwarg
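The idea of a local coordinate that skips introns can be sketched without a database. The helper below is hypothetical and deliberately simplified: it takes a list of exons directly, returns None for intronic positions, and handles the minus strand by mirroring the offset. How the real localize() treats strand, cdna and out-of-range positions is not shown here and may differ.

```python
def localize_position(pos, exons, strand="+"):
    """Map a genomic position to a 0-based offset along the spliced exons.

    exons is a sorted list of half-open (start, end) pairs. Returns None
    when pos falls in an intron or outside the feature.
    """
    spliced_length = 0
    local = None
    for start, end in exons:
        if start <= pos < end:
            local = spliced_length + (pos - start)
        spliced_length += end - start
    if local is None:
        return None
    if strand == "-":
        # Count from the other end of the spliced sequence.
        return spliced_length - 1 - local
    return local


exons = [(100, 200), (300, 400)]
print(localize_position(150, exons))
print(localize_position(300, exons))
print(localize_position(250, exons))
```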

mrna_sequence[source]

a list of genomic sequences for the mRNAs

ncbi_blast(db='nr', megablast=True, sequence=None)[source]

perform an NCBI blast against the sequence of this feature

position[source]

a chrom:start-stop representation of this feature

promoter(up=2000, down=0)[source]

Return a start, end tuple of positions for the promoter region of this gene

Parameters :

up : int

the distance upstream that is considered the promoter

down : int

the strand is used to add this many downstream bases into the gene.

sequence(per_exon=False)[source]

Return the sequence for this feature. If per_exon is True, return an array of exon sequences. This sequence is never reverse complemented.
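Because the returned sequence is never reverse complemented, callers working with minus-strand features may want to do that step themselves. A minimal sketch using only the standard library follows; the function name is hypothetical and the handling of ambiguity codes is limited to N.

```python
# Translation table for DNA bases in both cases; N maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ATGC"))
```

For a per-exon list on the minus strand, the exon order would also need to be reversed after complementing each element.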

tss(up=0, down=0)[source]

Return a start, end tuple of positions around the transcription-start site

Parameters :

up : int

if greater than 0, the strand is used to add this many upstream bases in the appropriate direction

down : int

if greater than 0, the strand is used to add this many downstream bases into the gene.
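The way up and down interact with the strand can be shown with a standalone sketch. The functions below are hypothetical and take the transcript coordinates explicitly. Treating the promoter as a TSS window with up=2000 and down=0 is an assumption based on the documented default arguments, not a statement about the library's internals.

```python
def tss_window(tx_start, tx_end, strand, up=0, down=0):
    """Return (start, end) around the transcription-start site.

    up extends away from the gene, down extends into it. Both are
    interpreted relative to the strand.
    """
    if strand == "+":
        site = tx_start
        start, end = site - up, site + down
    else:
        # On the minus strand the TSS is the higher coordinate.
        site = tx_end
        start, end = site - down, site + up
    # Never return negative coordinates.
    return max(0, start), max(0, end)


def promoter_window(tx_start, tx_end, strand, up=2000, down=0):
    """Assumed promoter definition: a TSS window with promoter defaults."""
    return tss_window(tx_start, tx_end, strand, up=up, down=down)


print(tss_window(1000, 5000, "+", up=100, down=50))
print(tss_window(1000, 5000, "-", up=100, down=50))
print(promoter_window(3000, 5000, "+"))
```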

upstream(distance)[source]

return the (start, end) of the region before the geneStart
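A strand-aware flanking region can be sketched as below. The helpers are hypothetical and return plain tuples; the assumption is that "upstream" means before the start on the plus strand and after the end on the minus strand, with "downstream" as the mirror image.

```python
def upstream_window(start, end, strand, distance):
    """Region of the given size immediately before the feature's 5' end."""
    if strand == "+":
        return max(0, start - distance), start
    return end, end + distance


def downstream_window(start, end, strand, distance):
    """Region of the given size immediately after the feature's 3' end."""
    if strand == "+":
        return end, end + distance
    return max(0, start - distance), start


print(upstream_window(1000, 2000, "+", 500))
print(upstream_window(1000, 2000, "-", 500))
print(downstream_window(1000, 2000, "+", 500))
```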

utr3[source]

return the 3’ UTR if appropriate

utr5[source]

return the 5’ UTR if appropriate

class cruzdb.models.Interval(start, end, chrom=None, name=None)[source]

Interval class for convenience

Parameters :

start : int

end : int

chrom : str

name : str

optional name for the interval

Methods

distance([other_or_start, end, features]) check the distance between this and another interval
is_upstream_of(other) check if this is upstream of the other interval taking the strand of the other interval into account
overlaps(other) check for overlap with the other interval
distance(other_or_start=None, end=None, features=False)[source]

check the distance between this and another interval

Parameters :

other_or_start : Interval or int

either an integer or an Interval with a start attribute indicating the start of the interval

end : int

if other_or_start is an integer, this must be an integer indicating the end of the interval

features : bool

if True, the features, such as CDS, intron, etc. that this feature overlaps are returned.

is_upstream_of(other)[source]

check if this is upstream of the other interval taking the strand of the other interval into account

overlaps(other)[source]

check for overlap with the other interval
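The overlap and distance checks on intervals can be illustrated with a small standalone sketch. SimpleInterval is a hypothetical stand-in for cruzdb.models.Interval, and half-open coordinates are assumed; the real class may treat touching intervals or the features flag differently.

```python
from collections import namedtuple

# Hypothetical stand-in for Interval, holding only coordinates.
SimpleInterval = namedtuple("SimpleInterval", ["start", "end"])


def overlaps(a, b):
    """True if the half-open intervals a and b share at least one base."""
    return a.start < b.end and b.start < a.end


def distance(a, b):
    """Gap between a and b in bases; 0 if they overlap or touch."""
    if overlaps(a, b):
        return 0
    if b.start >= a.end:
        return b.start - a.end
    return a.start - b.end


first = SimpleInterval(100, 200)
second = SimpleInterval(250, 300)
print(overlaps(first, second))
print(distance(first, second))
```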
