FASTA files
===========

FASTA files have been around for quite a long time, and remain in use today.
Their success might have to do with their relatively simple structure and the
fact that they are plain (ASCII) text files:

- A header line (starting with a ``>``)

- An arbitrary number of lines for the sequence

- Repeat the above if necessary

Countless FASTA parsers have been implemented, but given how simple one is to
write, we offer one here as well so that a third-party package is not required
for this alone.

.. code-block:: python

   from ngs_plumbing import fasta

   fn = 'mygenome.fa'
   fa = fasta.FastaFile(fn)
   # Print the header of each entry in the file
   for entry in fa:
       print(entry.header)

The twist offered here is a way to handle binary FASTA files, with the
associated benefits of smaller storage requirements, shorter loading times,
and shorter access times when retrieving a specific entry.

.. code-block:: python

   from ngs_plumbing import fasta

   fn_a = 'mygenome.fa'   # existing text FASTA file
   fn_b = 'mygenome.fab'  # binary FASTA file to create
   fasta.FastabFile.from_fastafile(fn_a, fn_b)
   fb = fasta.FastabFile(fn_b)

Iterating through the file can be achieved with:

.. code-block:: python

   for entry in fb:
       print(entry.header)

.. note::

   The sequence is 2-bit encoded, and the function
   :func:`ngs_plumbing.dna.bytes_frombit2bytes` should be used to obtain
   the DNA.
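
To give an idea of why 2-bit encoding saves space, here is a minimal,
self-contained sketch of the general technique: each of the four bases is
mapped to a 2-bit code, so four bases fit in one byte. The names
``pack_2bit`` and ``unpack_2bit``, the base-to-code mapping, and the bit order
are assumptions made for illustration only; they are not part of
``ngs_plumbing`` and do not necessarily match the layout of its binary files.

.. code-block:: python

   # Hypothetical mapping, chosen for illustration; the codes actually used
   # by ngs_plumbing may differ.
   CODES = {'A': 0, 'C': 1, 'G': 2, 'T': 3}
   BASES = 'ACGT'

   def pack_2bit(seq):
       """Pack a DNA string (A, C, G, T only) into bytes, four bases per byte."""
       packed = bytearray((len(seq) + 3) // 4)
       for i, base in enumerate(seq.upper()):
           # The first base of each group of four goes in the two highest bits.
           shift = 6 - 2 * (i % 4)
           packed[i // 4] |= CODES[base] << shift
       return bytes(packed)

   def unpack_2bit(packed, length):
       """Recover the first `length` bases from bytes produced by pack_2bit."""
       bases = []
       for i in range(length):
           shift = 6 - 2 * (i % 4)
           bases.append(BASES[(packed[i // 4] >> shift) & 0b11])
       return ''.join(bases)

   seq = 'ACGTTGCA'
   packed = pack_2bit(seq)
   print(len(seq), len(packed))                 # 8 bases stored in 2 bytes
   print(unpack_2bit(packed, len(seq)) == seq)  # round trip preserves the DNA

The sequence length has to be stored separately, because the last byte may be
only partly filled. Symbols outside of A, C, G, and T (such as ``N``) have no
code in this sketch and raise a ``KeyError``.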