Changes
Version 1.3.3
- Disabled spliced alignments by default
Version 1.3.2
- Optimized the mapping and annotation steps
Version 1.3.1
- Homopolymer mismatches are now a parameter
- PolyN stretches are now removed as well (parameter)
Version 1.3.0
- Added more methods to cluster UMIs
- Optimized the UMI counting algorithm
- Optimized the memory use
Version 1.2.6
- Take into account soft-clipped bases when computing start/end positions
Version 1.2.5
- Changed the limit range of some parameters
Version 1.2.4
- Fixed small bugs
- Small improvements in st_qa.py and convertEnsemblToNames.py
Version 1.2.3
- Bumped TaggD version
- Added more stats to the dataset output
- Added scripts to compute stats
- Added new option for TaggD
Version 1.2.2
- Fixed bugs in convertEnsemblToNames
- Added some parameters for TaggD demultiplexing
- Bumped version of TaggD
Version 1.2.1
- Made homopolymers filters enabled by default
- Added a test dataset to the docs
Version 1.2.0
- Fixed a small bug in the deletion of the tmp folder
Version 1.1.7
- Make sure to remove tmp files even if an error happens
Version 1.1.6
- Fixed bug that would leave some files in /tmp
- The number of allowed mismatches when removing adaptors is now 2
Version 1.1.5
- Removed some unnecessary parameters
Version 1.1.1
- Simplified the two pass mode
Version 1.1.0
- Added flag to discard reads mapping to anti-sense strand
- Added parameters for the GC content filter instead of reusing the AT content filter value
- Fixed a small bug in the logging of some parameters
Version 1.0.4
- Allow up to 3 mismatches when removing adaptors (homopolymer stretches)
- Added GC content filter (same % as AT content)
Version 1.0.3
- Fixed a minor bug in the counting of UMIs on the - strand
Version 1.0.2
- If no temp folder is given, a new unique one is created on top of the execution folder
- Integrated createDataset.py into the code of the pipeline
- Adjusted some parameter names and descriptions (no UMI is the default)
- Added sliding window when counting unique molecules
- Added support for bzip
Version 1.0.1
- Fixed small bug in the parsing of the umi quality parameter
Version 1.0.0
- Added option to check for UMI quality
- Optimized the UMI template check code
- Optimized how the unique molecules are counted
- Better stats for the quality filter step
- Updated convertEnsemblToNames script
- Updated docstrings
Version 0.9.9
- Small bug fixes
Version 0.9.6
- Fixed a bug with the non ambiguous option
- Fixed a bug in the saturation computation
Version 0.9.5
- When an R2 read is trimmed, its corresponding R1 read is trimmed as well
Version 0.9.4
- Fixed a stupid bug in the compute saturation option
Version 0.9.3
- Changed the rRNA filter so the BAM output does not need to be sorted
Version 0.9.2
- Fixed a bug in the parsing of parameters
Version 0.9.1
- Fixed a small bug with the location of discarded files
Version 0.9.0
- Replaced JSON with a data frame as the output format
- Replaced Python gzip with a system call (faster)
- Changed the logic of how the filenames are stored and handled
Version 0.8.9
- Improved the error messages and error handling
Version 0.8.8
- Removed barcodes IDs from the output file
Version 0.8.7
- Updated comments, manual and license
- Small improvements
Version 0.8.5
- Fixed a bug in the computation of saturation curves
Version 0.8.4
- Added a normal hash with INT keys to increase speed and reduce memory
- Using the gene_id for annotation again
Version 0.8.3
- Added parameter for strandness in annotation (yes by default)
- Simplified the quality trimming step a bit (it no longer accounts for user-input trimmed bases)
Version 0.8.2
- Added stats for annotated reads
- Replaced shelve dict with sqldict
- Fixed some small bugs in the annotation
Version 0.8.1
- Removed the pair mode keep option
- Removed unnecessary pair mode and mapped checks after alignment
Version 0.8.0
- Added option to do the STAR 2 pass mode
- Removed option to run pipeline without IDs
- Speed improvements
- Perform demultiplex after mapping
- The barcode is no longer attached to reverse reads
- Removed some parameters
- Some improvements in stDataPlotter
- Option to use BAM format
- Removed annotation filtering step
- Removed forward trimming parameters
- Output gene names even with ENSEMBL
Version 0.7.7
- Small memory improvements
- Updates in plotting script
Version 0.7.6
- End coordinates now contain the whole read length
- Make annotation strand aware (reverse)
- Updated to STAR 2.5
Version 0.7.5
- Fixed a small bug
Version 0.7.4
- Added some memory improvements
Version 0.7.3
- Added parameters for inverse trimming
- Memory and speed optimizations in createDatasets
- Added option for low_memory use
Version 0.7.2
- Added unique genes to saturation points
- Added option to keep non-annotated reads
Version 0.7.1
- Fixed some small bugs
Version 0.7.0
- Fixed a bug in the saturation points
- Removed counttrie as option for clustering
- Updated and improved CTTS scripts
- Updated data plotter color list
Version 0.6.9
- Fixed a bug in the saturation points
Version 0.6.8
- Improved speed and memory in createDatasets
- Changed saturation points to fixed values that grow exponentially
- Improved speed in computation of saturation points
- Small bug fixes
- Upgraded json2Scatter with many improvements
- Renamed json2scatter to stDataPlotter
Version 0.6.7
- Fixed a bug in the hierarchical clustering
- Added the input parameter to qa_stats
- Append experiment name to output files
- Added option to compute saturation points
- Added tool to plot stdata and clusters with aligned image
Version 0.6.6
- Fixed a bug in the hierarchical clustering
- Fixed a bug in the printed stats
Version 0.6.5
- Fixed a bug in retrieving the version of the software
- Added time stamps in different steps
- Added a UMI template quality filter
Version 0.6.4
- Fixed a bug in counttrie clustering method
- Improved sorting of molecular barcodes prior to clustering
- Added hierarchical clustering option
Version 0.6.3
- Removed reads.json
- Added qa_stats.json to the output
- Restored old versioning system
- Removed hadoop related stuff
- Added support for gzipped input files
Version 0.6.2
- Improved the log a bit
- Added parameters for max and min intron size and max gap size
Version 0.6.1
- Fixed some bugs in the prefix tree
Version 0.5.9
- Added an option to find molecular barcode clusters using a prefix tree
Version 0.5.8
- Fixed a bug in the function to retrieve the pipeline version
Version 0.5.7
- Fixed a bug with --disable-multimap option
Version 0.5.6
- Fixed a typo in a parameter
- Fixed a bug that caused some parameters to not work
Version 0.5.5
- Added some extra debugging info in createDatasets
- Output the read name in the BED output file
- Renamed --allowed-kimera to --allowed-kmer
- Added version as parameter and log message
Version 0.5.4
- Added parameter to disable soft clipping in mapping
- Disabled soft clipping in the rRNA filter
- Make sure that discarded reads after rRNA filter are replaced by Ns
- Improved stats info a bit
Version 0.5.3
- Bumped Taggd to 0.2.2
Version 0.5.2
- Fixed a bug in the rRNA filter that caused rRNA-mapped reads not to be discarded
Version 0.5.1
- Added check when UMI is the same as barcode
- Added more stats
- Added percentile distribution stats for createDataset
- Added support for BAM and SAM (not functional now)
- Added option to disable multiple aligned reads
- Fixed a bug in the bed file
Version 0.5.0
- Added AT content filter in quality trimming
- Added min mapped length filter after mapping
- Make sure one of the multiple aligned reads is set as not multiple aligned so it can be annotated
- Discard the other multiple aligned reads after mapping
- Disable sorting
- Restored back to use gene_id as column for annotation
Version 0.4.9
- Changed naming convention
- Added support for normal RNA analysis
Version 0.4.8
- Improved STAR configuration
- Added mapping post processing to filter out and adjust reversed reads
- Changed to use gene_name for annotation
- Fixed some bugs and made some improvements
- Fixed bugs in the trimming
Version 0.4.7
- Improved stats
- Fixed a bug that would remove original input files
- Added a script to convert ENSEMBL ids to gene names
Version 0.4.6
- Fixed a bug that would not compute the number of discarded reads when using molecular barcodes
Version 0.4.5
- Fixed a bug in the barcodes JSON output
Version 0.4.4
- Fixed a bug in the molecular barcodes algorithm
- Fixed a bug that would keep the original fastq reads in the system
- Update taggd version
Version 0.4.3
- Small improvements with error checking and log in the mapping
- Fixed a bug that would remove the file after filtering annotated reads
- Sort by name instead of by position due to a bug in htseq-count
Version 0.4.2
- Fixed a bug in the capture of parameters
Version 0.4.1
- Improved the logs
- Fixed a few bugs
Version 0.4.0
- Added back taggd
- Added BED file to output
- Added STAR
- Optimized workflow
- Do rRNA filter first
- Optimized annotation
- Optimized trimming
- Output reads do not contain duplicates
Version 0.3.9
- Allow molecular barcodes to be placed before the barcodes
Version 0.3.8
- Added back findIndexes
Version 0.3.7
- Removed cutadapt dependency
Version 0.3.6
- Fixed a bug in the installation
Version 0.3.5
- Added options to remove PolyC and fixed bugs in adaptor removal
Version 0.3.4
- Added test for STAR and STAR binary to dependencies
- Added TAGGD and removed findIndexes
- Improved install script
- Added options to remove adaptors (PolyA, PolyT and PolyG)
- Replaced Bowtie with STAR as the primary mapper
Version 0.3.3
- Added option to keep files with discarded reads/barcodes
- Internal refactoring and optimization
Version 0.3.2
- The output reads JSON now only contains the portion of the read that was used for mapping
- Cutadapt is integrated but only using the quality trimming for now
- Internal refactoring and optimizations
Version 0.3.1
- Added small unit-test for molecular barcodes
- Added more molecular barcodes algorithms (using a naive one for now)
- Fixed small issues in JSON parsing libraries
Version 0.3.0
- Rewrite createDatasets.py
- Clean up repository and deprecated files
- Change the unit-test library and structure
- Refactor the unit-test (use pipeline API instead of command line calls)
- Ensure unit-test remove tmp files when failing
- Add better error handling
- Add unit-test for Molecular Barcodes
- Add Molecular Barcodes functionality
- General refactor and clean up
- Add invoke options (clean, build, install)
- Fix an important bug in createDatasets that caused incorrect computation of reads counts
Version 0.2.5
- Improved installers
- Small bug fixes
- Added basic unit-test to do a run of the pipeline
Version 0.2.4
- Some optimizations and bug fixes
Version 0.2.3
- Fixed an error with the new version of HTSeq-count that would discard more reads
Version 0.2.2
- Added extra parameters
- Fixed some typos
- Fixed a bug that removed some bases from the barcode ID in the reverse reads
Version 0.2.1
- Refactored and modularized the code
- Added argparse for parameter parsing
- Added API for Amazon EMR and terminal version
- Better error handling
- Optimized code
- New version of FindIndexes
- Removed dependencies
- Added proper installers and documentation