Frequently Asked Questions

Here are common questions we get. If there are any questions about pyDNase or general DNase-seq analysis, either raise an issue on GitHub or email me on j.piper@warwick.ac.uk

How can I identify hypersensitive sites in DNase-seq data?

To identify DNase I hypersensitive sites in DNase-seq data, we recommend using HOMER‘s findPeaks with the parameters: findPeaks -region -size 500 -minDist 50 -o auto -tbp 0, converting the HOMER peaks to a BED file using pos2bed.pl and then merging the overlapping regions with:

$ bedtools sort -i <input.bed> | bedtools merge -i - > <output.bed>

We find the results are almost exactly the same as the HOTSPOT method employed by ENCODE. See the HOMER documentation for detailed information on how to carry out this procedure.

pyDNase won’t install/import or gives weird errors

The most common issue here is that you have old versions of the dependencies - namely scipy, numpy, or pysam installed - try updating these to their latest version. pyDNase is built against Python 2.6, 2.7 and 3.0 by the Travis contious integration (CI) system, so we’re very confident in the deployability of the codebase.

These footprints from are too stringent for my liking

This is a common question - if you have low read depths you might need to adjust the -fdrlimit parameter to something less stringent like "-10" or "-5", which sets the mimimum amounts of evidence required to support the alternate hypothesis of there being a footprint. You can set this to 0 if you want to disable this feature altogether, and then sort the footprints by their Wellington scores (e.g. sort -nk 5 <fp.bed> > <out.bed>) and choose your threshold this way if you like.