FASTA files
===========

FASTA files have been around for quite a long time, and remain
in use today. Their success might have to do with their relatively
simple structure and the fact that they are plain text (ASCII) files:

- A header line (starting with a ``>``)

- An arbitrary number of lines for the sequence 

- Repeat the above if necessary
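
The layout above is simple enough that a reader fits in a few lines.
The following is a minimal sketch for illustration only, not the parser
shipped with this package: the function name ``read_fasta`` is hypothetical,
and the sketch assumes well-formed input held in a text-mode file handle.

.. code-block:: python

   import io

   def read_fasta(handle):
       """Yield (header, sequence) tuples from a text-mode file handle."""
       header, chunks = None, []
       for line in handle:
           line = line.rstrip()
           if line.startswith('>'):
               # A new header closes the previous entry, if there is one
               if header is not None:
                   yield header, ''.join(chunks)
               header, chunks = line[1:], []
           elif header is not None and line:
               # Sequence lines are concatenated until the next header
               chunks.append(line)
       # Emit the last entry in the file
       if header is not None:
           yield header, ''.join(chunks)

   # In-memory example with two entries, the first spanning two lines
   example = io.StringIO('>seq1 first entry\nACGT\nTTGA\n>seq2\nGGCC\n')
   entries = list(read_fasta(example))

Writing the reader as a generator means only one entry is held in memory
at a time, which matters for genome-sized files.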

Countless FASTA parsers have been implemented, but given how simple it is
to write one, we offer one here as well, so that a third-party package
is not required for this alone.

.. code-block:: python

   from ngs_plumbing import fasta
   fn = 'mygenome.fa'
   fa = fasta.FastaFile(fn)

   for entry in fa:
       print(entry.header)


What we add here is a twist: a way to handle binary FASTA, with the associated
benefits of smaller storage space, shorter loading times, and shorter access times
when retrieving a specific entry.

.. code-block:: python

   from ngs_plumbing import fasta
   fn_a = 'mygenome.fa'
   fn_b = 'mygenome.fab'
   fasta.FastabFile.from_fastafile(fn_a, fn_b)
   
   fb = fasta.FastabFile(fn_b)

Iterating through the file can be achieved with:

.. code-block:: python

   for entry in fb:
       print(entry.header)

.. note:: 

   The sequence is 2-bit encoded, and the function :func:`ngs_plumbing.dna.bytes_frombit2bytes`
   should be used to obtain the DNA.
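
To give an idea of where the savings in storage space come from, here is a
minimal sketch of 2-bit packing in general. It is not the implementation
used by this package: the helper names ``pack_2bit`` and ``unpack_2bit``,
the mapping of bases to codes, and the bit order within a byte are all
assumptions made for illustration, and only the four letters A, C, G, T
are handled.

.. code-block:: python

   # Assumed mapping; the encoding used by the package may differ
   CODES = {'A': 0, 'C': 1, 'G': 2, 'T': 3}
   BASES = 'ACGT'

   def pack_2bit(seq):
       """Pack a DNA string into bytes, four bases per byte."""
       out = bytearray()
       for i in range(0, len(seq), 4):
           chunk = seq[i:i + 4]
           byte = 0
           for base in chunk:
               byte = (byte << 2) | CODES[base]
           # Left-align a final chunk shorter than four bases
           byte <<= 2 * (4 - len(chunk))
           out.append(byte)
       return bytes(out)

   def unpack_2bit(packed, length):
       """Recover a DNA string of the given length from packed bytes."""
       bases = []
       for byte in packed:
           for shift in (6, 4, 2, 0):
               bases.append(BASES[(byte >> shift) & 0b11])
       # Drop the padding added to the last byte
       return ''.join(bases[:length])

   packed = pack_2bit('ACGTTG')
   restored = unpack_2bit(packed, 6)

Six bases fit in two bytes here instead of six, which is the roughly
four-fold reduction one can expect over ASCII. The sequence length has to
be stored separately, since padding in the last byte is otherwise
indistinguishable from real bases.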