File formats usable with OBITools

The sequence files

Sequences can be stored in various formats, several of which OBITools can read. The central format for sequence files manipulated by OBITools scripts is the fasta format. OBITools extends the fasta format by specifying a syntax for including, in the definition line, data that qualify the sequence. All file formats use the IUPAC codes to encode nucleotides and amino acids.
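
To illustrate what data in the definition line can look like, here is a minimal sketch in Python. It assumes the annotations are written as `key=value;` pairs between the sequence identifier and the free-text definition; that layout, the function name `parse_header`, and the sample record are assumptions made for illustration, not a description of the OBITools parser itself.

```python
def parse_header(line):
    """Split an annotated fasta definition line into (id, annotations, definition)."""
    # Drop the leading '>' and separate the identifier from the rest of the line.
    ident, _, rest = line.lstrip(">").strip().partition(" ")
    annotations = {}
    # Assumed syntax: "key=value;" pairs; the text after the last ';' is the definition.
    *pairs, definition = rest.split(";")
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if sep:  # ignore fragments that are not key=value pairs
            annotations[key.strip()] = value.strip()
    return ident, annotations, definition.strip()


ident, annotations, definition = parse_header(">seq1 count=3; sample=A; my description")
```

Values are kept as plain strings in this sketch; a real reader would probably need to convert them to numbers or other types where appropriate.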

The taxonomy files

Many OBITools programs can take taxonomic data into account. In general, this is done by specifying either a directory containing all the NCBI taxonomy dump files or an obitaxonomy-formatted database.
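
For readers unfamiliar with the NCBI taxonomy dump, the sketch below shows how the parent links in a `nodes.dmp` file can be read in Python. It relies on the dump's field separator (a tab, a vertical bar, a tab) and on the root taxon being its own parent. The helper names `parse_nodes` and `lineage` and the three sample rows are hypothetical, and this is not how OBITools loads the taxonomy internally.

```python
def parse_nodes(lines):
    """Map each taxid to (parent taxid, rank) from nodes.dmp-style lines."""
    nodes = {}
    for line in lines:
        # Fields are separated by "\t|\t"; strip the row terminator first.
        fields = line.rstrip("\t|\n").split("\t|\t")
        taxid, parent, rank = int(fields[0]), int(fields[1]), fields[2]
        nodes[taxid] = (parent, rank)
    return nodes


def lineage(taxid, nodes):
    """Return the list of taxids from the given taxon up to the root."""
    path = [taxid]
    # The root taxon is recorded as its own parent, which ends the walk.
    while nodes[taxid][0] != taxid:
        taxid = nodes[taxid][0]
        path.append(taxid)
    return path


# Hypothetical three-row excerpt, reduced to the first three columns.
sample = [
    "1\t|\t1\t|\tno rank\t|\n",
    "131567\t|\t1\t|\tno rank\t|\n",
    "2\t|\t131567\t|\tsuperkingdom\t|\n",
]
nodes = parse_nodes(sample)
path = lineage(2, nodes)
```

Only the first three columns are used here; the real file carries additional columns that this sketch ignores.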