0.2.0a1 (2015-08-13)

changes since 0.1.7 (2014-05-10)

Bug Fixes

  • Removed previous Ensembl data; <0.3% of previous sequences were inconsistent with newer ensembl releases

Changes

  • Added liftover mechanism to bootstrap new schemas from existing data
  • Standardize on “NCBI” as origin name
  • When RefSeq transcript exon structures differ between chromosome and patch versions, use chromosome version [0926cf34e173]
  • Use explicit sequence sources when loading sequences
  • Bumped schema version to 1.1 for addition of associated_accessions and tx_similarity_v

New Features

  • New data: ensembl-79, refseq 2015-07-31
  • Added RefSeq NR (non-coding) transcripts
  • Added geneacs (associated accessions)
  • Added transcript similarity view [40a4e734d606] with multiple similarity metrics (experimental)
  • Added Change Log to documentation
  • Developed and documented release flow w/tools
  • Implement gene-based data subsets for testing loading
  • Explicitly set cds_{start,end}_i to None when not defined in txinfo (for non-coding transcripts) [e82011d1b02c]

Other Changes

  • Added uta subcommand and make rule to update materialized views [b1f2ef185320]
  • Add load-origin from tsv file; update ensembl Makefile rules [dc421df5ef19]
  • Added “mega-test” and “the-whole-kielbasa” test targets
  • Added indexes to tx_def_summary_mv [c1cda1cfd4c9]
  • Enforce FK relationship from exon_set.tx_ac to transcript.ac [ad5309895b5c]
  • Facilitate restartability in ensembl-fetch by creating canonical gene order and batching fetches
  • Standardize on schema name as uta_1_1 for dev, then rename on release
  • Updated code to use new eutils [#171, d831e02f51fa]
  • Update to use bioutils instead of bdi [769eea9241e4]
  • Updated to use the aligner from uta-tools-align (f.k.a. locus-lib-bio) [a1f492a32f5e]