Taxonomylite
A simple one-file solution for those times when you want to check whether one organism is descended from another, but don’t need a full phylogenetic tree manipulation library.
The library is just a single file that depends only upon the standard library. You can easily embed it in another library by copying this script.
from taxonomylite import Taxonomy
# Create a new database from NCBI sources
# This process may take some time.
taxa_db = Taxonomy.from_source("taxonomy.db")
# Later... in a new session
from taxonomylite import Taxonomy
taxa_db = Taxonomy("taxonomy.db")
tid = taxa_db.name_to_tid("Felidae")
immediate_children = taxa_db.children(tid)
# [338151, 338152, 338153, 339610]
all_children = taxa_db.children(tid, deep=True)
# [9682, 9683, 9685, 9687, 9688, 9689, 9690, 9691, 9692, 9693, ...]
print(taxa_db.tid_to_rank(taxa_db.name_to_tid("Panthera leo leo")))
# subspecies
-
taxonomylite.SEP_TOKEN = 'zzz'
The separator used to tokenize the in-database lineage string
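To make the idea of a separator-tokenized lineage string concrete, here is a minimal sketch. It assumes the stored string is simply the taxonomic ids joined by the separator, which this documentation does not confirm. The helpers `join_lineage` and `split_lineage` are hypothetical and not part of taxonomylite, and the ids are made up.

```python
# Hypothetical sketch: mirrors the documented default value,
# but is not imported from taxonomylite.
SEP_TOKEN = "zzz"


def join_lineage(tids):
    """Encode a list of taxonomic ids as one separator-delimited string."""
    return SEP_TOKEN.join(str(tid) for tid in tids)


def split_lineage(encoded):
    """Decode a separator-delimited string back into a list of integer ids."""
    return [int(token) for token in encoded.split(SEP_TOKEN) if token]


encoded = join_lineage([1, 20, 300])
print(encoded)                 # 1zzz20zzz300
print(split_lineage(encoded))  # [1, 20, 300]
```

An alphabetic separator cannot collide with the decimal digits of an id, which is one plausible reason for a token like 'zzz'.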
-
taxonomylite.SOURCE_URL = 'ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz'
The default location to download taxonomy information from
-
class taxonomylite.Taxonomy(store_path)
Bases: object
Operate on taxonomic hierarchies downloaded from the NCBI Taxonomy database using a compact SQLite database.
Parameters: store_path (str) – Path to the SQLite database containing the hierarchies
-
connection (sqlite3.Connection)
The underlying connection to the SQLite database
-
children(tid, deep=False)
Retrieve the taxonomic id numbers of the direct children of tid. If deep is True, retrieve all descendants instead.
Parameters:
- tid (int)
- deep (bool) – Retrieve all descendants, not just direct children
Return type: list of ints
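The difference between direct children and all descendants can be sketched on a toy hierarchy. This is an illustration of the described behavior, not the library's implementation. The `PARENT_OF` mapping and its ids are hypothetical, and the sorted output order is a choice made here for readability.

```python
from collections import defaultdict

# Hypothetical toy hierarchy as child -> parent; the ids are made up.
PARENT_OF = {2: 1, 3: 1, 4: 2, 5: 2, 6: 4}

# Invert the mapping so each parent lists its direct children.
CHILDREN_OF = defaultdict(list)
for child, parent in PARENT_OF.items():
    CHILDREN_OF[parent].append(child)


def children(tid, deep=False):
    """Return direct children of tid, or every descendant when deep is True."""
    direct = sorted(CHILDREN_OF.get(tid, []))
    if not deep:
        return direct
    found = []
    stack = list(direct)
    while stack:
        node = stack.pop()
        found.append(node)
        stack.extend(CHILDREN_OF.get(node, []))
    return sorted(found)


print(children(1))             # [2, 3]
print(children(1, deep=True))  # [2, 3, 4, 5, 6]
print(children(6))             # []
```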
-
close()
Close the underlying database connection.
See sqlite3.Connection.close()
-
commit()
Save pending changes to the underlying database.
See sqlite3.Connection.commit()
-
execute(stmt, args='')
Execute raw SQL against the underlying database.
See sqlite3.Connection.execute()
-
executemany(stmt, args='')
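Since execute() defers to sqlite3.Connection.execute(), the calling pattern can be shown with the standard library alone. The sketch below assumes that stmt and args are forwarded unchanged, and that executemany() mirrors sqlite3.Connection.executemany() in the same way; neither is stated here. The `demo_taxa` table is hypothetical and is not the taxonomylite schema.

```python
import sqlite3

# An in-memory database with a made-up table; not the taxonomylite schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_taxa (tid INTEGER PRIMARY KEY, name TEXT)")

# executemany runs one statement once per parameter tuple.
rows = [(1, "root"), (2, "example genus"), (3, "example species")]
conn.executemany("INSERT INTO demo_taxa (tid, name) VALUES (?, ?)", rows)
conn.commit()

# "?" placeholders keep values out of the SQL text itself.
cursor = conn.execute("SELECT name FROM demo_taxa WHERE tid = ?", (2,))
result = cursor.fetchone()
print(result)  # ('example genus',)
conn.close()
```

Passing values through placeholders rather than string formatting avoids quoting problems with names that contain apostrophes.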
-
classmethod from_source(store_path='taxonomy.db', url='ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz')
Construct a new Taxonomy instance and associated database file from source data downloaded from NCBI’s FTP servers.
If url is None, it looks for the source information in the current directory in a file named “taxdump.tar.gz”.
Parameters:
- store_path (str) – Path to construct the database at. Defaults to “taxonomy.db” in the current directory
- url (str) – The URL to download the taxonomy information from. Defaults to SOURCE_URL
Return type: Taxonomy
-
is_child(child_tid, parent_tid)
Test if child_tid is a child taxon of parent_tid
Parameters:
- child_tid (int)
- parent_tid (int)
Return type: bool
-
is_parent(child_tid, parent_tid)
Test if parent_tid is a parent taxon of child_tid
Parameters:
- child_tid (int)
- parent_tid (int)
Return type: bool
-
lineage(db, tid)
Construct the taxonomic “path” from tid to the root of the phylogenetic hierarchy
Parameters: tid (int)
Return type: list of ints
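A path to the root can be sketched by following parent pointers on a toy hierarchy. This is an illustration only: the `PARENT_OF` mapping and its ids are hypothetical, and the assumptions that the path starts with tid itself and ends at the root are choices made here, since the entry above does not specify the order. The helper `is_descendant` is also hypothetical and shows how a lineage supports a descent check.

```python
# Hypothetical toy hierarchy as child -> parent; 1 plays the role of the root.
PARENT_OF = {2: 1, 3: 1, 4: 2, 5: 2, 6: 4}


def lineage(tid):
    """Return the ids on the path from tid up to the root, starting with tid."""
    path = [tid]
    while path[-1] in PARENT_OF:
        path.append(PARENT_OF[path[-1]])
    return path


def is_descendant(child_tid, ancestor_tid):
    """True if ancestor_tid appears above child_tid on its lineage."""
    return ancestor_tid in lineage(child_tid)[1:]


print(lineage(6))           # [6, 4, 2, 1]
print(is_descendant(6, 2))  # True
print(is_descendant(3, 2))  # False
```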
-
name_to_tid(name)
Translate the scientific name string name into its equivalent taxonomic id number
Parameters: name (str) – A scientific name like “Homo sapiens”
Returns: tid
Return type: int
-
nearest_common_ancestor(a, b)
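This entry carries no description. Going by the name alone, it presumably returns the deepest taxon that lies on the lineages of both a and b. The sketch below shows that idea on a toy hierarchy; it is not the library's implementation. The `PARENT_OF` mapping and its ids are hypothetical, and treating a taxon as its own ancestor is an assumption made here.

```python
# Hypothetical toy hierarchy as child -> parent; 1 plays the role of the root.
PARENT_OF = {2: 1, 3: 1, 4: 2, 5: 2, 6: 4}


def lineage(tid):
    """Return the ids on the path from tid up to the root, starting with tid."""
    path = [tid]
    while path[-1] in PARENT_OF:
        path.append(PARENT_OF[path[-1]])
    return path


def nearest_common_ancestor(a, b):
    """Return the first id on a's path to the root that also lies on b's path."""
    on_b_path = set(lineage(b))
    for tid in lineage(a):
        if tid in on_b_path:
            return tid
    return None  # only reachable if the two ids do not share a root


print(nearest_common_ancestor(6, 5))  # 2
print(nearest_common_ancestor(6, 3))  # 1
print(nearest_common_ancestor(6, 4))  # 4
```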
-
parent(tid)
Extract the taxonomic id number of the parent of tid
Parameters: tid (int)
Return type: int
-
relatives(tid, degree=1)
Retrieve relatives of tid out to degree steps removed
Parameters:
- tid (int)
- degree (int)
Return type: list of ints
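What counts as a "step" is not spelled out above. One possible reading is that each step is a single parent or child link, so that relatives are all taxa within degree links of tid. The sketch below implements that reading with a breadth-first search on a toy hierarchy. The interpretation, the `PARENT_OF` mapping, its ids, and the exclusion of tid from the result are all assumptions, and the library may behave differently.

```python
from collections import defaultdict, deque

# Hypothetical toy hierarchy as child -> parent; the ids are made up.
PARENT_OF = {2: 1, 3: 1, 4: 2, 5: 2, 6: 4}

# Treat every parent/child link as an undirected edge.
NEIGHBORS = defaultdict(set)
for child, parent in PARENT_OF.items():
    NEIGHBORS[child].add(parent)
    NEIGHBORS[parent].add(child)


def relatives(tid, degree=1):
    """Return ids reachable from tid in at most `degree` parent/child steps."""
    distance = {tid: 0}
    queue = deque([tid])
    while queue:
        node = queue.popleft()
        if distance[node] == degree:
            continue  # do not expand past the requested number of steps
        for neighbor in NEIGHBORS.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return sorted(t for t in distance if t != tid)


print(relatives(4))            # [2, 6]
print(relatives(4, degree=2))  # [1, 2, 5, 6]
```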
-
siblings(tid)
Extract the taxonomic id numbers of the siblings (same parent) of tid
Parameters: tid (int)
Return type: list of ints
-
tid_to_name(tid)
Translate the taxonomic id number tid into its equivalent scientific name
Parameters: tid (int) – A taxonomic id number like 9606
Returns: name – A scientific name like “Homo sapiens”
Return type: str
-
tid_to_rank(tid)
-