obicomplement: reverse-complements sequences

obicomplement reverse-complements the sequence records.

Tip

The identifier of each reverse-complemented sequence record is modified by appending the _CMP suffix.

Tip

An attribute with key complemented and value True is added to each reverse-complemented sequence record.

By using the selection option set, it is possible to reverse-complement only a subset of the sequence records in the input file. The selected sequences are reverse-complemented; the others are written to the output without modification.
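The behaviour described above can be sketched in a few lines of Python. This is an illustration only, not the OBITools implementation: the dictionary record layout, the function names, and the is_selected predicate are hypothetical, and the complement table covers only the unambiguous bases A, C, G and T.

```python
# Illustrative sketch; record layout and function names are hypothetical.
# Only unambiguous bases are handled (no IUPAC ambiguity codes).
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Complement each base, then reverse the resulting string."""
    return seq.translate(_COMPLEMENT)[::-1]


def complement_record(record):
    """Return a new record with the _CMP suffix and the complemented attribute."""
    return {
        "id": record["id"] + "_CMP",
        "seq": reverse_complement(record["seq"]),
        "attributes": {**record["attributes"], "complemented": True},
    }


def process(records, is_selected=lambda record: True):
    """Reverse-complement selected records; pass the others through unchanged."""
    for record in records:
        yield complement_record(record) if is_selected(record) else record


records = [
    {"id": "seq1", "seq": "aacgt", "attributes": {}},
    {"id": "seq2", "seq": "ggggc", "attributes": {}},
]

# Hypothetical selection rule: only sequences ending with "t" are selected.
output = list(process(records, is_selected=lambda r: r["seq"].endswith("t")))
for r in output:
    print(r["id"], r["seq"])
# prints:
# seq1_CMP acgtt
# seq2 ggggc
```

The first record is selected, so it is reverse-complemented, renamed and tagged; the second is written out as it was read.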

Example 1:

> obicomplement seq.fasta > seqRC.fasta

Reverse-complements all sequence records from the seq.fasta file and stores the result in the seqRC.fasta file.

Example 2:

> obicomplement -s 'A{10,}$' seq.fasta > seqRC.fasta

Reverse-complements sequence records from the seq.fasta file only if they end with at least 10 A. Other sequences are written to the output without modification.
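To see which sequences the pattern of Example 2 would select, the same regular expression can be tried with Python's re module. This is a minimal sketch: the helper name is hypothetical, and case-insensitive matching is an assumption made here so that lowercase sequences are also covered.

```python
import re

# Same pattern as in Example 2: a run of at least 10 A anchored at the end.
# re.IGNORECASE is an assumption of this sketch.
POLY_A_TAIL = re.compile(r"A{10,}$", re.IGNORECASE)


def ends_with_poly_a(seq):
    """Return True if the sequence ends with 10 or more consecutive A."""
    return POLY_A_TAIL.search(seq) is not None


print(ends_with_poly_a("ACGT" + "A" * 10))  # True: tail of exactly 10 A
print(ends_with_poly_a("ACGT" + "A" * 9))   # False: tail is one A too short
print(ends_with_poly_a("A" * 12 + "C"))     # False: the run is not at the end
```

The $ anchor is what restricts the match to the end of the sequence; without it, any internal run of 10 A would also select the record.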

Options to specify input format

Restrict the analysis to a sub-part of the input file

--skip <N>

The first N sequence records of the file are discarded from the analysis and not reported to the output file.

--only <N>

Only the next N sequence records of the file are analyzed. The following sequences in the file are neither analyzed nor reported to the output file. This option can be used in conjunction with the --skip option.
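The combined effect of --skip and --only amounts to taking a window of records from the input. The following Python sketch shows that reading of the two options; the select_window function and the list of identifiers are hypothetical and stand in for a stream of sequence records.

```python
from itertools import islice


def select_window(records, skip=0, only=None):
    """Drop the first `skip` records, then keep at most `only` records."""
    stop = None if only is None else skip + only
    return islice(records, skip, stop)


# Hypothetical record identifiers standing in for sequence records.
ids = ["r1", "r2", "r3", "r4", "r5", "r6"]

print(list(select_window(ids, skip=2, only=3)))  # ['r3', 'r4', 'r5']
print(list(select_window(ids, skip=4)))          # ['r5', 'r6']
print(list(select_window(ids, only=2)))          # ['r1', 'r2']
```

With both values set, the first two records are skipped and the following three are kept; everything after the window is dropped from the output.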

Sequence annotated format

--genbank

Input file is in GenBank format.

--embl

Input file is in EMBL format.

Specifying the sequence type

--nuc

Input file contains nucleic acid sequences.

--prot

Input file contains protein sequences.

Common options

-h, --help

Shows this help message and exits.

--DEBUG

Sets logging to debug mode.