`ecotaxspecificity`: Evaluates barcode resolution

The ecotaxspecificity command evaluates barcode resolution at different taxonomic ranks.

As inputs, it takes a sequence record file annotated with taxids in the sequence header, and a database formatted as an ecopcr database (see obitaxonomy) or an NCBI taxdump (see NCBI ftp site).

An example of output is reported below:

Number of sequences added in graph: 284
Number of nodes in all components: 269
Number of sequences lost: 15!
rank                      taxon_ok      taxon_total     percent
order                            8               8        100.00
superfamily                      1               1        100.00
parvorder                        1               1        100.00
subkingdom                       1               1        100.00
superkingdom                     1               1        100.00
kingdom                          3               3        100.00
phylum                           5               5        100.00
infraorder                       1               1        100.00
subfamily                        3               3        100.00
class                            6               6        100.00
species                         35             176         19.89
superorder                       1               1        100.00
suborder                         1               1        100.00
subtribe                         1               1        100.00
subclass                         3               3        100.00
genus                            9              15         60.00
superclass                       1               1        100.00
family                          10              10        100.00
tribe                            2               2        100.00
subphylum                        1               1        100.00

In this example, the input sequence file contains 284 sequence records, but only 269 have been examined, because taxonomic information could not be recovered for the remaining 15.

“taxon_total” is the number of different taxa observed at a given rank in the sequence record file (counting only records for which taxonomic information is available at that rank), and “taxon_ok” is the number of those taxa that the barcode sequence identifies unambiguously in the taxonomic database. In this example, the sequence records correspond to 176 different species, but only 35 of these have specific barcodes. “percent” is the percentage of unambiguously identified taxa among the total number of taxa (taxon_ok/taxon_total*100).
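The arithmetic behind the summary lines and the percent column can be checked with a few lines of Python. This is only a sketch of the formula quoted above, not the tool's own code: the function name `percent_resolved` is hypothetical, and the numbers are copied from the example output.

```python
def percent_resolved(taxon_ok: int, taxon_total: int) -> float:
    """Percentage of taxa at one rank that have a specific barcode."""
    if taxon_total <= 0:
        raise ValueError("taxon_total must be positive")
    return 100.0 * taxon_ok / taxon_total


# Summary lines from the example output.
sequences_in_file = 284
sequences_examined = 269
sequences_lost = sequences_in_file - sequences_examined

# (taxon_ok, taxon_total) pairs copied from the example table.
example_rows = {"species": (35, 176), "genus": (9, 15), "family": (10, 10)}

if __name__ == "__main__":
    print(f"Number of sequences lost: {sequences_lost}")
    for rank, (ok, total) in example_rows.items():
        print(f"{rank:<20}{ok:>12}{total:>16}{percent_resolved(ok, total):>12.2f}")
```

For the species row this gives 35/176*100, which rounds to the 19.89 shown in the table.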

`ecotaxspecificity` specific options

-e INT, --errors=<INT>

Two sequences are considered as different if they have INT or more differences (default: 1).

Example:

> ecotaxspecificity -d my_ecopcr_database -e 5 seq.fasta
This command considers that two sequences with fewer than 5 differences correspond to the same barcode.
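The threshold rule can be illustrated with a small Python sketch. The text above does not say how differences are counted, so this example assumes, purely for illustration, that sequences have equal length and that a difference is a mismatching position (Hamming distance); the actual tool may count differences another way. The names `count_differences` and `are_different` are hypothetical.

```python
def count_differences(a: str, b: str) -> int:
    """Count mismatching positions (assumption: equal-length sequences)."""
    if len(a) != len(b):
        raise ValueError("this sketch only handles equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def are_different(a: str, b: str, errors: int = 1) -> bool:
    """Two sequences are different if they have `errors` or more differences."""
    return count_differences(a, b) >= errors


if __name__ == "__main__":
    s1, s2 = "ACGTACGT", "ACGTACGA"  # one mismatch, at the last position
    print(are_different(s1, s2, errors=1))  # distinct with the default threshold
    print(are_different(s1, s2, errors=5))  # same barcode with -e 5
```

With the default of 1, any mismatch makes two sequences distinct; raising the threshold merges near-identical sequences into one barcode.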

Options to specify input format

Restrict the analysis to a sub-part of the input file

--skip <N>: The first N sequence records of the file are discarded from the analysis and not reported in the output file.

--only <N>: Only the next N sequence records of the file are analyzed. The following sequences in the file are neither analyzed nor reported in the output file. This option can be used together with the --skip option.
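The combined effect of the two options can be pictured as a slice over the stream of records. The sketch below is a minimal illustration of the semantics described above, not the tool's implementation; the helper name `restrict` is hypothetical, and plain integers stand in for sequence records.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def restrict(records: Iterable[T], skip: int = 0, only: Optional[int] = None) -> Iterator[T]:
    """Drop the first `skip` records, then keep at most `only` records."""
    stop = None if only is None else skip + only
    return islice(records, skip, stop)


if __name__ == "__main__":
    records = range(10)  # stand-in for ten sequence records
    print(list(restrict(records, skip=2, only=3)))
```

Here the equivalent of `--skip 2 --only 3` keeps the third, fourth and fifth records and ignores the rest.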

Sequence annotated format

--genbank: Input file is in GenBank format.

--embl: Input file is in EMBL format.

Specifying the sequence type

--nuc: Input file contains nucleic acid sequences.

--prot: Input file contains protein sequences.

Common options

-h, --help: Shows this help message and exits.

--DEBUG: Sets logging in debug mode.

`ecotaxspecificity` used sequence attribute

taxid
