NCBITaxa class
-
class NCBITaxa(dbfile=None)
Bases: object
New in version 2.3.
Provides a local, transparent connector to the NCBI taxonomy database.
-
annotate_tree(t, taxid_attr='name', tax2name=None, tax2track=None, tax2rank=None)
Annotates a tree whose leaf names are taxids by adding the attributes ‘taxid’, ‘sci_name’, ‘lineage’, ‘named_lineage’ and ‘rank’.
Parameters:
- t – a Tree (or Tree-derived) instance.
- taxid_attr (name) – name of the node attribute that holds the taxid number associated with each node (e.g. species in PhyloTree instances).
- tax2name, tax2track, tax2rank – optional pre-calculated dictionaries translating taxid numbers into names, lineage tracks and ranks, respectively.
-
get_broken_branches(t, taxa_lineages, n2content=None)
Returns a list of NCBI lineage names that are not monophyletic in the provided tree, as well as the list of affected branches and their size.
CURRENTLY EXPERIMENTAL
-
get_common_names(taxids)
-
get_descendant_taxa(parent, intermediate_nodes=False, rank_limit=None, collapse_subspecies=False, return_tree=False)
Given a parent taxid or scientific species name, returns a list of all its descendant taxids. If intermediate_nodes is set to True, internal nodes are also included.
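As a rough illustration of the leaf-only versus intermediate_nodes=True behaviour described above, the following sketch walks a tiny in-memory taxonomy. It does not use the library or its database: the `PARENT` map, its ids and the `descendant_taxa` helper are hypothetical, and the rank_limit, collapse_subspecies and return_tree options are not modelled.

```python
from collections import defaultdict

# Toy taxonomy as a child -> parent map. The ids are hypothetical
# placeholders, NOT real NCBI taxids.
PARENT = {1: None, 2: 1, 3: 2, 4: 3, 5: 4, 6: 4, 7: 8, 8: 3}


def descendant_taxa(parent_id, parent_map, intermediate_nodes=False):
    """Return the descendants of parent_id (leaves only by default)."""
    # Invert the child -> parent map into parent -> children.
    children = defaultdict(list)
    for child, par in parent_map.items():
        if par is not None:
            children[par].append(child)

    found = []
    stack = list(children.get(parent_id, []))
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        # Internal nodes are reported only when explicitly requested.
        if not kids or intermediate_nodes:
            found.append(node)
        stack.extend(kids)
    return found


print(sorted(descendant_taxa(3, PARENT)))
print(sorted(descendant_taxa(3, PARENT, intermediate_nodes=True)))
```

In this toy version a leaf has no descendants, so it yields an empty list; the real method may treat that case differently.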
-
get_fuzzy_name_translation(name, sim=0.9)
Given an inexact species name, returns the best match in the NCBI database of taxa names.
Parameters: sim (0.9) – minimum word similarity required to report a match (from 0 to 1).
Returns: taxid, species-name-match, match-score
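The role of the sim threshold can be sketched with the standard library alone. This is only an illustration of threshold-based fuzzy matching: it swaps in `difflib.SequenceMatcher` as the similarity measure, which is not necessarily the metric the library uses, and the `NAMES` table, its ids and the `fuzzy_name_translation` function are hypothetical stand-ins for the local database.

```python
from difflib import SequenceMatcher

# Hypothetical name -> taxid table standing in for the local database.
# The ids are placeholders, NOT real NCBI taxids.
NAMES = {"Homo sapiens": 101, "Pan troglodytes": 102, "Mus musculus": 103}


def fuzzy_name_translation(name, names=NAMES, sim=0.9):
    """Return (taxid, matched_name, score) for the closest name.

    If no candidate reaches the `sim` threshold, taxid and matched_name
    are None (a choice made for this sketch).
    """
    best_name, best_score = None, 0.0
    for candidate in names:
        # Case-insensitive similarity in the range 0..1.
        score = SequenceMatcher(None, name.lower(), candidate.lower()).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    if best_name is None or best_score < sim:
        return None, None, best_score
    return names[best_name], best_name, best_score


print(fuzzy_name_translation("Homo sapeins"))      # misspelled query
print(fuzzy_name_translation("Escherichia coli"))  # nothing close enough
```

Lowering sim makes the lookup more tolerant of typos at the cost of more false matches.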
-
get_lineage(taxid)
Given a valid taxid number, returns its corresponding lineage track as a hierarchically sorted list of parent taxids.
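What a "hierarchically sorted" lineage track looks like can be shown with a toy child-to-parent map. The sketch below is not the library's implementation: `PARENT`, its ids and the `lineage` helper are hypothetical, and it assumes the track runs from the root down to (and including) the queried taxid.

```python
# Toy taxonomy as a child -> parent map. The ids are hypothetical
# placeholders, NOT real NCBI taxids; the root has parent None.
PARENT = {1: None, 2: 1, 3: 2, 4: 3, 5: 4, 6: 4, 7: 8, 8: 3}


def lineage(taxid, parent_map):
    """Return the path from the root down to taxid (root first)."""
    track = []
    node = taxid
    while node is not None:
        track.append(node)
        node = parent_map[node]  # climb one level towards the root
    return track[::-1]


print(lineage(5, PARENT))
print(lineage(7, PARENT))
```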
-
get_name_translator(names)
Given a list of scientific names, returns a dictionary translating them into their corresponding taxids.
An exact name match is required for translation.
-
get_rank(taxids)
Given a list of taxids, returns a dictionary mapping each taxid to its corresponding NCBI taxonomy rank.
-
get_taxid_translator(taxids)
Given a list of taxids, returns a dictionary with their corresponding scientific names.
-
get_topology(taxids, intermediate_nodes=False, rank_limit=None, collapse_subspecies=False, annotate=True)
Given a list of taxid numbers, returns the minimal pruned NCBI taxonomy tree containing all of them.
Parameters:
- intermediate_nodes (False) – If True, single-child nodes representing the complete lineage of leaf nodes are kept. Otherwise, the tree is pruned to contain the first common ancestor of each group.
- rank_limit (None) – If a valid NCBI rank name is provided, the tree is pruned at that level. For instance, use rank_limit="species" to get rid of sub-species or strain leaf nodes.
- collapse_subspecies (False) – If True, any item under the species rank will be collapsed into the species upper node.
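The effect of intermediate_nodes on the pruned tree can be sketched on a toy taxonomy. This is an illustration of the idea rather than the library's algorithm: `PARENT`, its ids and `minimal_topology` are hypothetical, the result is returned as a plain node-to-parent dictionary instead of a tree object, and rank_limit, collapse_subspecies and annotate are not modelled.

```python
from collections import defaultdict

# Toy taxonomy as a child -> parent map. The ids are hypothetical
# placeholders, NOT real NCBI taxids; the root has parent None.
PARENT = {1: None, 2: 1, 3: 2, 4: 3, 5: 4, 6: 4, 7: 8, 8: 3}


def minimal_topology(taxids, parent_map, intermediate_nodes=False):
    """Return a node -> parent map of the pruned tree spanning taxids."""
    # 1. Collect every node on the path from each taxid up to the root.
    nodes = set()
    for taxid in taxids:
        node = taxid
        while node is not None and node not in nodes:
            nodes.add(node)
            node = parent_map[node]

    if intermediate_nodes:
        kept = nodes  # keep complete lineages, single-child nodes included
    else:
        # 2. Keep only requested taxids and branching points.
        n_children = defaultdict(int)
        for node in nodes:
            if parent_map[node] is not None:
                n_children[parent_map[node]] += 1
        kept = set(taxids) | {n for n in nodes if n_children[n] >= 2}

    # 3. Re-attach each kept node to its nearest kept ancestor.
    pruned = {}
    for node in kept:
        anc = parent_map[node]
        while anc is not None and anc not in kept:
            anc = parent_map[anc]
        pruned[node] = anc
    return pruned


print(minimal_topology([5, 6, 7], PARENT))
print(minimal_topology([5, 6, 7], PARENT, intermediate_nodes=True))
```

With the default setting, the root of the pruned tree in this sketch is the first common ancestor of the requested taxids (node 3 here), and the single-child node 8 above taxid 7 is dropped.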
-
translate_to_names(taxids)
Given a list of taxid numbers, returns another list with their corresponding scientific names.
-
update_taxonomy_database(taxdump_file=None)
Updates the NCBI taxonomy database by downloading and parsing the latest taxdump.tar.gz file from the NCBI FTP site.
Parameters: taxdump_file (None) – an alternative location of the taxdump.tar.gz file.
-