New in version 2.1.

NeXML

NeXML(http://nexml.org) is an exchange standard for representing phyloinformatic data inspired by the commonly used NEXUS format, but more robust and easier to process.

Reading NeXML projects

Nexml projects are handled through the Nexml base class. To load a NexML file, the Nexml.build_from_file() method can be used.

from ete2 import Nexml

nexml_prj = Nexml()
nexml_prj.build_from_file("/path/to/nexml_example.xml")

Note that the ETE parser will read the provided XML file and convert all elements into python instances, which will be hierarchically connected to the Nexml root instance.

Every NeXML XML element has its own python class. Content and attributes can be handled through the “set_” and “get_” methods existing in all objects. Nexml classes can be imported from the ete2.nexml module.

from ete2 import Nexml, nexml
nexml_prj = Nexml()
nexml_meta = nexml.LiteralMeta(datatype="double", property="branch_support", content=1.0)
nexml_prj.add_meta(nexml_meta)
nexml_prj.export()

# Will produce:
#
# <Nexml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="Nexml">
#    <meta datatype="double" content="1.0" property="branch_support" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="LiteralMeta"/>
# </Nexml>

NeXML trees

NeXML tree elements are automatically converted into PhyloTree instances, containing all ETE functionality (traversing, drawing, etc) plus normal NeXML attributes.

In the Nexml standard, trees are represented as plain lists of nodes and edges. ETE will convert such lists into tree topologies, in which every node will contain a nexml_node and nexml_edge attribute. In addition, each tree node will have a nexml_tree attribute (i.e. NEXML->FloatTree) , which can be used to set the nexml properties of the subtree represented by each node. Note also that node.dist and node.name features will be linked to node.nexml_edge.length and node.nexml_node.label, respectively.

from ete2 import Nexml
# Create an empty Nexml project 
nexml_project = Nexml()

# Load content from NeXML file
nexml_project.build_from_file("trees.xml")

# All XML elements are within the project instance.
# exist in each element to access their attributes.
print "Loaded Taxa:"
for taxa in  nexml_project.get_otus():
    for otu in taxa.get_otu():
        print "OTU:", otu.id

# Extracts all the collection of trees in the project
tree_collections = nexml_project.get_trees()
# Select the first collection
collection_1 = tree_collections[0]

# print the topology of every tree 
for tree in  collection_1.get_tree():
    # trees contain all the nexml information in their "nexml_node",
    # "nexml_tree", and "nexml_edge" attributes.
    print "Tree id", tree.nexml_tree.id
    print tree
    for node in tree.traverse():
        print "node", node.nexml_node.id, "is associated with", node.nexml_node.otu, "OTU"


# Output:
# ==========
# Loaded Taxa:
# OTU: t1
# OTU: t2
# OTU: t3
# OTU: t4
# OTU: t5
# Tree id tree1
#  
#                /-n5(n5)
#           /---|
#          |     \-n6(n6)
#      /---|
#     |    |     /-n8(n8)
# ----|     \---|
#     |          \-n9(n9)
#     |
#      \-n2(n2)
# node n1 is associated with None OTU
# node n3 is associated with None OTU
# node n2 is associated with t1 OTU
# node n4 is associated with None OTU
# node n7 is associated with None OTU
# node n5 is associated with t3 OTU
# node n6 is associated with t2 OTU
# node n8 is associated with t5 OTU
# node n9 is associated with t4 OTU
# Tree id tree2
#  
#                /-tree2n5(n5)
#           /---|
#          |     \-tree2n6(n6)
#      /---|
#     |    |     /-tree2n8(n8)
# ----|     \---|
#     |          \-tree2n9(n9)
#     |
#      \-tree2n2(n2)
# node tree2n1 is associated with None OTU
# node tree2n3 is associated with None OTU
# node tree2n2 is associated with t1 OTU
# node tree2n4 is associated with None OTU
# node tree2n7 is associated with None OTU
# node tree2n5 is associated with t3 OTU
# node tree2n6 is associated with t2 OTU
# node tree2n8 is associated with t5 OTU
# node tree2n9 is associated with t4 OTU

[Download tolweb.xml example] || [Download script]

Node meta information is also available:

from ete2 import Nexml

# Creates and empty NeXML project
p = Nexml()
# Fill it with the tolweb example 
p.build_from_file("tolweb.xml")

# extract the first collection of trees
tree_collection = p.trees[0]
# and all the tree instances in it
trees = tree_collection.tree

# For each loaded tree, prints its structure and some of its
# meta-properties
for t in trees:
    print t
    print 
    print "Leaf node meta information:\n"
    print
    for meta in  t.children[0].nexml_node.meta:
        print  meta.property, ":", (meta.content)


# Output 
# ==========
# 
# ---- /-node3(Eurysphindus)
#  
# Leaf node meta information:
#  
#  
# dc:description : 
# tbe:AUTHORITY : Leconte
# tbe:AUTHDATE : 1878
# tba:ANCESTORWITHPAGE : 117851
# tba:CHILDCOUNT : 0
# tba:COMBINATION_DATE : null
# tba:CONFIDENCE : 0
# tba:EXTINCT : 0
# tba:HASPAGE : 1
# tba:ID : 117855
# tba:INCOMPLETESUBGROUPS : 0
# tba:IS_NEW_COMBINATION : 0
# tba:ITALICIZENAME : 1
# tba:LEAF : 0
# tba:PHYLESIS : 0
# tba:SHOWAUTHORITY : 0
# tba:SHOWAUTHORITYCONTAINING : 1

[Download tolweb.xml example] || [Download script]

Creating Nexml project from scratch

Nexml base class can also be used to create projects from scratch in a programmatic way. Using the collection of NeXML classes provided by the:mod:ete2.nexml module, you can populate an empty project and export it as XML.

import sys
# Note that we import the nexml module rather than the root Nexml
#  class.  This module contains a python object for each of the
#  nexml elements declared in its XML schema.
from ete2 import nexml

# Create an empty Nexml project 
nexml_project = nexml.Nexml()
tree_collection = nexml.Trees()

# NexmlTree is a special PhyloTree instance that is prepared to be
# added to NeXML projects. So lets populate a random tree
nexml_tree = nexml.NexmlTree()
# Random tree with 10 leaves
nexml_tree.populate(10, random_branches=True) 
# We add the tree to the collection 
tree_collection.add_tree(nexml_tree)

# Create another tree from a newick string
nexml_tree2 = nexml.NexmlTree("((hello, nexml):1.51, project):0.6;")
tree_collection.add_tree(nexml_tree2)

# Tree can be handled as normal ETE objects
nexml_tree2.show()

# Add the collection of trees to the NexML project object
nexml_project.add_trees(tree_collection)

# Now we can export the project containing our two trees 
nexml_project.export()

[Download script]

Writing NeXML objects

Every NexML object has its own export() method. By calling it, you can obtain the XML representation of any instance contained in the Nexml project structure. Usually, all you will need is to export the whole project, but individual elements can be exported.

import sys
from ete2 import Nexml
# Create an empty Nexml project
nexml_project = Nexml()

# Upload content from file
nexml_project.build_from_file("nexml_example.xml")

# Extract first collection of trees
tree_collection =  nexml.get_trees()[0]

# And export it
tree_collection.export(output=sys.stdout, level=0)

NeXML tree manipulation and visualization

NeXML trees contain all ETE PhyloTree functionality: orthology prediction, topology manipulation and traversing methods, visualization, etc.

For instance, tree changes performed through the visualization GUI are kept in the NeXML format.

from ete2 import nexml
nexml_tree = nexml.NexMLTree("((hello, nexml):1.51, project):0.6;")
tree_collection.add_tree(nexml_tree)
nexml_tree.show()