`ecoxipy.pyxom` - Pythonic XML Object Model (PyXOM)¶

This module implements the Pythonic XML Object Model (PyXOM) for the representation of XML structures. To conveniently create PyXOM data structures use ecoxipy.pyxom.output; for indexing use ecoxipy.pyxom.indexing (if Document.element_by_id and Document.elements_by_name are not enough for you).

Examples¶

XML Creation¶

If you use the constructors directly, be sure to supply the right data types. Otherwise use the create() methods or ecoxipy.MarkupBuilder, which take care of conversion.

>>> from ecoxipy import MarkupBuilder
>>> from ecoxipy.pyxom import (Document, Element, Text, Comment,
...     ProcessingInstruction)
>>> b = MarkupBuilder()
>>> document = Document.create(
...     b.article(
...         b.h1(
...             b & '<Example>',
...             data='to quote: <&>"\''
...         ),
...         b.p(
...             {'umlaut-attribute': u'äöüß'},
...             'Hello', Element.create('em', ' World',
...                 attributes={'count':1}), '!'
...         ),
...         None,
...         b.div(
...             Element.create('data-element', Text.create(u'äöüß <&>')),
...             b(
...                 '<p attr="value">raw content</p>Some Text',
...                 b.br,
...                 (i for i in range(3))
...             ),
...             (i for i in range(3, 6))
...         ),
...         Comment.create('<This is a comment!>'),
...         ProcessingInstruction.create('pi-target', '<PI content>'),
...         ProcessingInstruction.create('pi-without-content'),
...         b['foo:somexml'](
...             b['foo:somexml']({'foo:bar': 1, 't:test': 2}),
...             b['somexml']({'xmlns': ''}),
...             b['bar:somexml'],
...             {'xmlns:foo': 'foo://bar', 'xmlns:t': '',
...                 'foo:bar': 'Hello', 'id': 'foo'}
...         ),
...         {'xmlns': 'http://www.w3.org/1999/xhtml/'}
...     ), doctype_name='article', omit_xml_declaration=True
... )

Enforcing Well-Formedness¶

Using the create() methods or passing the parameter check_well_formedness as True to the appropriate constructors enforces that the element, attribute and document type names are valid XML names, and that processing instruction target and content as well as comment contents conform to their constraints:

>>> from ecoxipy import XMLWellFormednessException
>>> def catch_not_well_formed(cls, *args, **kargs):
...     try:
...         return cls.create(*args, **kargs)
...     except XMLWellFormednessException as e:
...         print(e)

>>> t = catch_not_well_formed(Document, [], doctype_name='1nvalid-xml-name')
The value "1nvalid-xml-name" is not a valid XML name.
>>> t = catch_not_well_formed(Document, [], doctype_name='html', doctype_publicid='"')
The value "\"" is not a valid document type public ID.
>>> t = catch_not_well_formed(Document, [], doctype_name='html', doctype_systemid='"\'')
The value "\"'" is not a valid document type system ID.

>>> t = catch_not_well_formed(Element, '1nvalid-xml-name', [], {})
The value "1nvalid-xml-name" is not a valid XML name.
>>> t = catch_not_well_formed(Element, 't', [], attributes={'1nvalid-xml-name': 'content'})
The value "1nvalid-xml-name" is not a valid XML name.

>>> t = catch_not_well_formed(ProcessingInstruction, '1nvalid-xml-name')
The value "1nvalid-xml-name" is not a valid XML processing instruction target.
>>> t = catch_not_well_formed(ProcessingInstruction, 'target', 'invalid PI content ?>')
The value "invalid PI content ?>" is not a valid XML processing instruction content because it contains "?>".

>>> t = catch_not_well_formed(Comment, 'invalid XML comment --')
The value "invalid XML comment --" is not a valid XML comment because it contains "--".
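
The content constraints checked above follow from the XML 1.0 grammar: a comment must not contain "--" (or end with "-"), and processing instruction content must not contain "?>". The following standard-library sketch illustrates only those two rules. The function names are hypothetical and this is not ecoxipy's implementation, which also validates XML names and document type IDs.

```python
def is_valid_comment(content):
    """Return True if `content` may appear between <!-- and -->."""
    # XML 1.0 forbids "--" inside a comment and "-" as its last character.
    return '--' not in content and not content.endswith('-')


def is_valid_pi_content(content):
    """Return True if `content` may appear inside a processing instruction."""
    # "?>" would terminate the processing instruction prematurely;
    # None stands for a processing instruction without content.
    return content is None or '?>' not in content


comment_ok = is_valid_comment('<This is a comment!>')
comment_bad = is_valid_comment('invalid XML comment --')
pi_bad = is_valid_pi_content('invalid PI content ?>')
```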

Navigation¶

Use list semantics to retrieve child nodes and attribute access to retrieve node information:

>>> print(document.doctype.name)
article
>>> print(document[0].name)
article
>>> print(document[0].attributes['xmlns'].value)
http://www.w3.org/1999/xhtml/
>>> print(document[0][-3].target)
pi-target
>>> document[0][1].parent is document[0]
True
>>> document[0][0] is document[0][1].previous and document[0][1].next is document[0][2]
True
>>> document.parent is None and document[0].previous is None and document[0].next is None
True
>>> document[0].attributes.parent is document[0]
True

You can retrieve iterators for navigation through the tree:

>>> list(document[0][0].ancestors)
[ecoxipy.pyxom.Element['article', {...}], ecoxipy.pyxom.Document[ecoxipy.pyxom.DocumentType('article', None, None), True, 'UTF-8']]

>>> list(document[0][1].children())
[ecoxipy.pyxom.Text('Hello'), ecoxipy.pyxom.Element['em', {...}], ecoxipy.pyxom.Text('!')]
>>> list(document[0][2].descendants())
[ecoxipy.pyxom.Element['data-element', {...}], ecoxipy.pyxom.Text('\xe4\xf6\xfc\xdf <&>'), ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Text('raw content'), ecoxipy.pyxom.Text('Some Text'), ecoxipy.pyxom.Element['br', {...}], ecoxipy.pyxom.Text('0'), ecoxipy.pyxom.Text('1'), ecoxipy.pyxom.Text('2'), ecoxipy.pyxom.Text('3'), ecoxipy.pyxom.Text('4'), ecoxipy.pyxom.Text('5')]

>>> list(document[0][-2].preceding_siblings)
[ecoxipy.pyxom.ProcessingInstruction('pi-target', '<PI content>'), ecoxipy.pyxom.Comment('<This is a comment!>'), ecoxipy.pyxom.Element['div', {...}], ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Element['h1', {...}]]
>>> list(document[0][2][-1].preceding)
[ecoxipy.pyxom.Text('4'), ecoxipy.pyxom.Text('3'), ecoxipy.pyxom.Text('2'), ecoxipy.pyxom.Text('1'), ecoxipy.pyxom.Text('0'), ecoxipy.pyxom.Element['br', {...}], ecoxipy.pyxom.Text('Some Text'), ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Element['data-element', {...}], ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Element['h1', {...}]]

>>> list(document[0][0].following_siblings)
[ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Element['div', {...}], ecoxipy.pyxom.Comment('<This is a comment!>'), ecoxipy.pyxom.ProcessingInstruction('pi-target', '<PI content>'), ecoxipy.pyxom.ProcessingInstruction('pi-without-content', None), ecoxipy.pyxom.Element['foo:somexml', {...}]]
>>> list(document[0][1][0].following)
[ecoxipy.pyxom.Element['em', {...}], ecoxipy.pyxom.Text('!'), ecoxipy.pyxom.Element['div', {...}], ecoxipy.pyxom.Comment('<This is a comment!>'), ecoxipy.pyxom.ProcessingInstruction('pi-target', '<PI content>'), ecoxipy.pyxom.ProcessingInstruction('pi-without-content', None), ecoxipy.pyxom.Element['foo:somexml', {...}]]

Descendants and children can also be retrieved in reverse document order:

>>> list(document[0][1].children(True)) == list(reversed(list(document[0][1].children())))
True
>>> list(document[0][2].descendants(True))
[ecoxipy.pyxom.Text('5'), ecoxipy.pyxom.Text('4'), ecoxipy.pyxom.Text('3'), ecoxipy.pyxom.Text('2'), ecoxipy.pyxom.Text('1'), ecoxipy.pyxom.Text('0'), ecoxipy.pyxom.Element['br', {...}], ecoxipy.pyxom.Text('Some Text'), ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Text('raw content'), ecoxipy.pyxom.Element['data-element', {...}], ecoxipy.pyxom.Text('\xe4\xf6\xfc\xdf <&>')]

Normally descendants() traverses the XML tree depth-first, but you can also use breadth-first traversal:

>>> list(document[0][2].descendants(depth_first=False))
[ecoxipy.pyxom.Element['data-element', {...}], ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Text('Some Text'), ecoxipy.pyxom.Element['br', {...}], ecoxipy.pyxom.Text('0'), ecoxipy.pyxom.Text('1'), ecoxipy.pyxom.Text('2'), ecoxipy.pyxom.Text('3'), ecoxipy.pyxom.Text('4'), ecoxipy.pyxom.Text('5'), ecoxipy.pyxom.Text('\xe4\xf6\xfc\xdf <&>'), ecoxipy.pyxom.Text('raw content')]
>>> list(document[0][2].descendants(True, False))
[ecoxipy.pyxom.Text('5'), ecoxipy.pyxom.Text('4'), ecoxipy.pyxom.Text('3'), ecoxipy.pyxom.Text('2'), ecoxipy.pyxom.Text('1'), ecoxipy.pyxom.Text('0'), ecoxipy.pyxom.Element['br', {...}], ecoxipy.pyxom.Text('Some Text'), ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Element['data-element', {...}], ecoxipy.pyxom.Text('raw content'), ecoxipy.pyxom.Text('\xe4\xf6\xfc\xdf <&>')]

descendants() can also be given a depth limit:

>>> list(document[0].descendants(max_depth=2))
[ecoxipy.pyxom.Element['h1', {...}], ecoxipy.pyxom.Text('<Example>'), ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Text('Hello'), ecoxipy.pyxom.Element['em', {...}], ecoxipy.pyxom.Text('!'), ecoxipy.pyxom.Element['div', {...}], ecoxipy.pyxom.Element['data-element', {...}], ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Text('Some Text'), ecoxipy.pyxom.Element['br', {...}], ecoxipy.pyxom.Text('0'), ecoxipy.pyxom.Text('1'), ecoxipy.pyxom.Text('2'), ecoxipy.pyxom.Text('3'), ecoxipy.pyxom.Text('4'), ecoxipy.pyxom.Text('5'), ecoxipy.pyxom.Comment('<This is a comment!>'), ecoxipy.pyxom.ProcessingInstruction('pi-target', '<PI content>'), ecoxipy.pyxom.ProcessingInstruction('pi-without-content', None), ecoxipy.pyxom.Element['foo:somexml', {...}], ecoxipy.pyxom.Element['foo:somexml', {...}], ecoxipy.pyxom.Element['somexml', {...}], ecoxipy.pyxom.Element['bar:somexml', {...}]]
>>> list(document[0].descendants(depth_first=False, max_depth=2))
[ecoxipy.pyxom.Element['h1', {...}], ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Element['div', {...}], ecoxipy.pyxom.Comment('<This is a comment!>'), ecoxipy.pyxom.ProcessingInstruction('pi-target', '<PI content>'), ecoxipy.pyxom.ProcessingInstruction('pi-without-content', None), ecoxipy.pyxom.Element['foo:somexml', {...}], ecoxipy.pyxom.Text('<Example>'), ecoxipy.pyxom.Text('Hello'), ecoxipy.pyxom.Element['em', {...}], ecoxipy.pyxom.Text('!'), ecoxipy.pyxom.Element['data-element', {...}], ecoxipy.pyxom.Element['p', {...}], ecoxipy.pyxom.Text('Some Text'), ecoxipy.pyxom.Element['br', {...}], ecoxipy.pyxom.Text('0'), ecoxipy.pyxom.Text('1'), ecoxipy.pyxom.Text('2'), ecoxipy.pyxom.Text('3'), ecoxipy.pyxom.Text('4'), ecoxipy.pyxom.Text('5'), ecoxipy.pyxom.Element['foo:somexml', {...}], ecoxipy.pyxom.Element['somexml', {...}], ecoxipy.pyxom.Element['bar:somexml', {...}]]
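
To make the difference between the traversal orders and the effect of a depth limit concrete, here is a small sketch over xml.etree.ElementTree elements from the standard library. It is an illustration only, not PyXOM's code: the descendants() function below is a hypothetical stand-in that mimics the depth_first and max_depth parameters shown above and visits elements only.

```python
from collections import deque
import xml.etree.ElementTree as ET


def descendants(element, depth_first=True, max_depth=None):
    """Yield the descendant elements of `element` in document order,
    either depth-first or breadth-first, optionally limited in depth."""
    # Each pending entry is a (node, depth) pair; children have depth 1.
    if depth_first:
        # A stack: push children reversed so they pop in document order.
        pending = [(child, 1) for child in reversed(list(element))]
        while pending:
            node, depth = pending.pop()
            yield node
            if max_depth is None or depth < max_depth:
                pending.extend(
                    (child, depth + 1) for child in reversed(list(node)))
    else:
        # A queue: all nodes of one level are visited before the next level.
        pending = deque((child, 1) for child in element)
        while pending:
            node, depth = pending.popleft()
            yield node
            if max_depth is None or depth < max_depth:
                pending.extend((child, depth + 1) for child in node)


root = ET.fromstring('<a><b><d/><e/></b><c><f><g/></f></c></a>')
depth_first_tags = [node.tag for node in descendants(root)]
breadth_first_tags = [node.tag for node in descendants(root, depth_first=False)]
limited_tags = [node.tag for node in descendants(root, max_depth=2)]
```

With max_depth=2 only children and grandchildren are visited, so the element g (depth 3) is left out of limited_tags.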

Namespaces¶

PyXOM supports the interpretation of Namespaces in XML. Namespace prefix and local names are calculated from Element and Attribute names:

>>> document[0].namespace_prefix == None
True
>>> print(document[0].local_name)
article
>>> print(document[0][-1].namespace_prefix)
foo
>>> print(document[0][-1].local_name)
somexml
>>> attr = document[0][-1].attributes['foo:bar']
>>> print(attr.namespace_prefix)
foo
>>> print(attr.local_name)
bar

The namespace URI is available as Element.namespace_uri and Attribute.namespace_uri (originally defined as NamespaceNameMixin.namespace_uri). These properties look up the namespace prefix of the node in the parent elements; the result is cached, so repeated retrieval is cheap:

>>> xhtml_namespace_uri = u'http://www.w3.org/1999/xhtml/'
>>> document[0][1].namespace_uri == xhtml_namespace_uri
True
>>> document[0][1][1].namespace_uri == xhtml_namespace_uri
True
>>> document[0][-1][0].namespace_uri == u'foo://bar'
True
>>> document[0][-1][0].attributes['foo:bar'].namespace_uri == u'foo://bar'
True

The namespace prefixes active on an element are available as the iterator Element.namespace_prefixes:

>>> prefixes = sorted(list(document[0][-1][0].namespace_prefixes),
...     key=lambda value: '' if value is None else value)
>>> prefixes[0] == None
True
>>> print(u', '.join(prefixes[1:]))
foo, t
>>> document[0][-1][0].get_namespace_uri(u'foo') == u'foo://bar'
True
>>> print(list(document[0].namespace_prefixes))
[None]
>>> document[0].get_namespace_uri(None) == u'http://www.w3.org/1999/xhtml/'
True

If an element or attribute is in no namespace, namespace_uri is None:

>>> document[0][-1][0].attributes['t:test'].namespace_uri == None
True
>>> document[0][-1][1].namespace_uri == None
True

If an undefined namespace prefix is used, the namespace_uri is False:

>>> document[0][-1][2].namespace_uri == False
True
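
The lookup rules demonstrated above can be summarised in a short standalone sketch. It assumes that namespace declarations are available as one dictionary per element, ordered from the innermost element outwards; both function names are hypothetical, and the code only mirrors the None/False convention shown in the examples rather than reproducing ecoxipy's implementation.

```python
def split_qualified_name(name):
    """Split 'prefix:local' into (prefix, local); prefix is None if absent."""
    prefix, separator, local_name = name.partition(':')
    if separator:
        return prefix, local_name
    return None, name


def resolve_namespace_uri(prefix, scopes):
    """Look up `prefix` in `scopes`, a list of declaration dicts ordered from
    the innermost element outwards. The key None stands for the default
    namespace (a plain xmlns attribute)."""
    for declarations in scopes:
        if prefix in declarations:
            # An empty declaration (xmlns="" or xmlns:t="") means "no namespace".
            return declarations[prefix] or None
    # The prefix is not declared on any enclosing element.
    return False


# Declarations as in the example document, innermost element first.
scopes = [
    {'foo': 'foo://bar', 't': ''},             # declared on foo:somexml
    {None: 'http://www.w3.org/1999/xhtml/'},   # declared on article
]
prefix, local_name = split_qualified_name('foo:bar')
foo_uri = resolve_namespace_uri(prefix, scopes)
t_uri = resolve_namespace_uri('t', scopes)
bar_uri = resolve_namespace_uri('bar', scopes)
```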

Indexes¶

On Document instances ecoxipy.pyxom.indexing.IndexDescriptor attributes are defined for fast retrieval (after initially building the index).

Use element_by_id to get elements by the value of their id attribute:

>>> document.element_by_id['foo'] is document[0][-1]
True
>>> 'bar' in document.element_by_id
False

elements_by_name allows retrieval of elements by their name:

>>> document[0][-1] in list(document.elements_by_name['foo:somexml'])
True
>>> 'html' in document.elements_by_name
False
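
Conceptually, such indexes are mappings built in a single pass over the tree. The following rough analogue uses xml.etree.ElementTree from the standard library instead of PyXOM; the variable names merely echo the attributes above and the code says nothing about how ecoxipy's indexer classes work internally.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<article><h1 id="title"/><p/><p id="intro"/></article>')

element_by_id = {}                    # id value -> the element carrying it
elements_by_name = defaultdict(list)  # element name -> elements with that name

# One pass over the element and all of its descendants fills both mappings.
for element in root.iter():
    elements_by_name[element.tag].append(element)
    if 'id' in element.attrib:
        element_by_id[element.get('id')] = element

intro = element_by_id['intro']
paragraph_count = len(elements_by_name['p'])
```

After the pass, lookups are plain dictionary accesses, which is the reason an index pays off when the same document is queried repeatedly.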

Retrieve elements and attributes by their namespace data by using nodes_by_namespace:

>>> from functools import reduce
>>> elements_and_attributes = set(
...     filter(lambda node: isinstance(node, Element),
...         document.descendants()
...     )
... ).union(
...     reduce(lambda x, y: x.union(y),
...         map(lambda node: set(node.attributes.values()),
...             filter(lambda node: isinstance(node, Element),
...                 document.descendants()
...             )
...         )
...     )
... )
>>> set(document.nodes_by_namespace()) == set(filter(
...     lambda node: node.namespace_uri is not False,
...     elements_and_attributes
... ))
True
>>> set(document.nodes_by_namespace('foo://bar')) == set(filter(
...     lambda node: node.namespace_uri == u'foo://bar',
...     elements_and_attributes
... ))
True
>>> set(document.nodes_by_namespace(local_name='bar')) == set(filter(
...     lambda node: node.local_name == u'bar',
...     elements_and_attributes
... ))
True
>>> set(document.nodes_by_namespace('foo://bar', 'bar')) == set(filter(
...     lambda node: node.namespace_uri == u'foo://bar' and node.local_name == u'bar',
...     elements_and_attributes
... ))
True

Manipulation and Equality¶

All XMLNode instances have attributes which allow for modification. Document and Element instances also allow modification of their contents like sequences.

Duplication and Comparisons¶

Use XMLNode.duplicate() to create a deep copy of an XML node:

>>> document_copy = document.duplicate()
>>> document is document_copy
False

Equality and inequality recursively compare XML nodes:

>>> document == document_copy
True
>>> document != document_copy
False

Attributes¶

The attributes of an Element instance are available as Element.attributes. This is an Attributes instance which contains Attribute instances:

>>> document_copy[0][0].attributes['data']
ecoxipy.pyxom.Attribute('data', 'to quote: <&>"\'')
>>> old_data = document_copy[0][0].attributes['data'].value
>>> document_copy[0][0].attributes['data'].value = 'foo bar'
>>> document_copy[0][0].attributes['data'].value == u'foo bar'
True
>>> 'data' in document_copy[0][0].attributes
True
>>> document == document_copy
False
>>> document != document_copy
True
>>> document_copy[0][0].attributes['data'].value = old_data
>>> document == document_copy
True
>>> document != document_copy
False

Attributes instances allow for creation of Attribute instances:

>>> somexml = document_copy[0][-1]
>>> foo_attr = somexml[0].attributes.create_attribute('foo:foo', 'bar')
>>> foo_attr is somexml[0].attributes['foo:foo']
True
>>> foo_attr == somexml[0].attributes['foo:foo']
True
>>> foo_attr != somexml[0].attributes['foo:foo']
False
>>> 'foo:foo' in somexml[0].attributes
True
>>> foo_attr.namespace_uri == u'foo://bar'
True

Attributes may be removed:

>>> somexml[0].attributes.remove(foo_attr)
>>> 'foo:foo' in somexml[0].attributes
False
>>> foo_attr.parent == None
True
>>> foo_attr.namespace_uri == False
True

You can also add an attribute to an element’s attributes; it is automatically moved if it belongs to another element’s attributes:

>>> somexml[0].attributes.add(foo_attr)
>>> 'foo:foo' in somexml[0].attributes
True
>>> foo_attr.parent == somexml[0].attributes
True
>>> foo_attr.parent != somexml[0].attributes
False
>>> foo_attr.namespace_uri == u'foo://bar'
True
>>> del somexml[0].attributes['foo:foo']
>>> 'foo:foo' in somexml[0].attributes
False
>>> attr = document[0][-1].attributes['foo:bar']
>>> attr.name = 'test'
>>> attr.namespace_prefix is None
True
>>> print(attr.local_name)
test

Documents and Elements¶

>>> document_copy[0].insert(1, document_copy[0][0])
>>> document_copy[0][0] == document[0][1]
True
>>> document_copy[0][0] != document[0][1]
False
>>> document_copy[0][1] == document[0][0]
True
>>> document_copy[0][1] != document[0][0]
False
>>> p_element = document_copy[0][0]
>>> document_copy[0].remove(p_element)
>>> document_copy[0][0].name == u'h1' and p_element.parent is None
True
>>> p_element in document_copy[0]
False
>>> p_element.namespace_uri == False
True
>>> document_copy[0][0].append(p_element)
>>> document_copy[0][0][-1] is p_element
True
>>> p_element in document_copy[0][0]
True
>>> p_element.namespace_uri == u'http://www.w3.org/1999/xhtml/'
True
>>> p_element in document[0]
False
>>> document[0][1] in document_copy[0][0]
False
>>> document[0][1] is document_copy[0][0][-1]
False
>>> document[0][1] == document_copy[0][0][-1]
True
>>> document[0][1] != document_copy[0][0][-1]
False
>>> document[0][-1].name = 'foo'
>>> document[0][-1].namespace_prefix is None
True
>>> print(document[0][-1].local_name)
foo

Indexes and Manipulation¶

If a document is modified, the indexes should be deleted. This can be done by using the del statement on the index attribute or by calling delete_indexes().

>>> del document_copy[0][-1]
>>> document_copy.delete_indexes()
>>> 'foo' in document_copy.element_by_id
False
>>> 'foo:somexml' in document_copy.elements_by_name
False
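
Why deletion is needed becomes clearer with a model of a lazily built index. The sketch below is a hypothetical descriptor (LazyIndex and Catalog are invented names, not ecoxipy classes) that builds its index on first access, keeps returning the cached result, and rebuilds only after the attribute has been deleted.

```python
class LazyIndex(object):
    """Hypothetical descriptor: builds an index on first access and
    discards it when the attribute is deleted on the instance."""

    def __init__(self, build):
        self._build = build
        self._key = '_index_' + build.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        try:
            return instance.__dict__[self._key]
        except KeyError:
            # First access (or first access after deletion): build and cache.
            index = instance.__dict__[self._key] = self._build(instance)
            return index

    def __delete__(self, instance):
        # Forget the cached index; the next access rebuilds it.
        instance.__dict__.pop(self._key, None)


class Catalog(object):
    def __init__(self, items):
        self.items = list(items)  # (id, value) pairs

    @LazyIndex
    def by_id(self):
        return dict(self.items)


catalog = Catalog([('foo', 1)])
built = 'foo' in catalog.by_id          # builds the index
catalog.items.append(('bar', 2))        # modification after indexing
stale = 'bar' in catalog.by_id          # the cached index is out of date
del catalog.by_id
fresh = 'bar' in catalog.by_id          # rebuilt, now sees the change
```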

XML Serialization¶

First we remove the embedded non-HTML XML: that element has multiple attributes, and the order in which they are rendered is nondeterministic, which makes the output hard to compare:

>>> del document[0][-1]

Getting the Unicode value of a document yields the XML document serialized as a Unicode string:

>>> document_string = u"""<!DOCTYPE article><article xmlns="http://www.w3.org/1999/xhtml/"><h1 data="to quote: &lt;&amp;&gt;&quot;'">&lt;Example&gt;</h1><p umlaut-attribute="äöüß">Hello<em count="1"> World</em>!</p><div><data-element>äöüß &lt;&amp;&gt;</data-element><p attr="value">raw content</p>Some Text<br/>012345</div><!--<This is a comment!>--><?pi-target <PI content>?><?pi-without-content?></article>"""
>>> import sys
>>> if sys.version_info[0] < 3:
...     unicode(document) == document_string
... else:
...     str(document) == document_string
True

Getting the bytes() value of a Document creates a byte string of the serialized XML, using the encoding specified on creation of the instance (the default is “UTF-8”):

>>> bytes(document) == document_string.encode('UTF-8')
True
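
The escaping visible in document_string (for example &lt;Example&gt; and the data attribute) is the usual XML escaping of text and attribute values. As an illustration that is independent of how ecoxipy produces its output, the standard library's xml.sax.saxutils helpers yield the same forms for these two values:

```python
from xml.sax.saxutils import escape, quoteattr

# Text content: &, < and > are replaced by entity references.
text = escape(u'<Example>')

# Attribute values are escaped and quoted; because this value contains both
# quote characters, double quotes are used and the inner " becomes &quot;.
attribute = quoteattr(u'to quote: <&>"\'')
```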

XMLNode instances can also generate SAX events, see XMLNode.create_sax_events() (note that the default xml.sax.ContentHandler is xml.sax.saxutils.XMLGenerator, which does not support comments):

>>> document_string = u"""<?xml version="1.0" encoding="UTF-8"?>\n<article xmlns="http://www.w3.org/1999/xhtml/"><h1 data="to quote: &lt;&amp;&gt;&quot;'">&lt;Example&gt;</h1><p umlaut-attribute="äöüß">Hello<em count="1"> World</em>!</p><div><data-element>äöüß &lt;&amp;&gt;</data-element><p attr="value">raw content</p>Some Text<br></br>012345</div><?pi-target <PI content>?><?pi-without-content ?></article>"""
>>> import sys
>>> from io import BytesIO
>>> string_out = BytesIO()
>>> content_handler = document.create_sax_events(out=string_out)
>>> string_out.getvalue() == document_string.encode('UTF-8')
True
>>> string_out.close()
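
For readers unfamiliar with SAX, the following sketch drives an xml.sax.saxutils.XMLGenerator by hand, using only the standard library. It is not what create_sax_events() does internally; it merely shows the kind of events a content handler receives and why an empty element appears as a start tag followed by an end tag in the output above.

```python
from io import BytesIO
from xml.sax.saxutils import XMLGenerator

out = BytesIO()
handler = XMLGenerator(out, encoding='UTF-8')

handler.startDocument()                       # writes the XML declaration
handler.startElement('p', {'attr': 'value'})
handler.characters('a < b')                   # text is escaped on output
handler.startElement('br', {})                # empty element: start tag ...
handler.endElement('br')                      # ... followed by end tag
handler.endElement('p')
handler.endDocument()

result = out.getvalue()
```

Since XMLGenerator by default does not abbreviate empty elements, the br element is written as <br></br>, matching the serialized document above.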

You can also create indented XML when calling XMLNode.create_sax_events() by supplying the indent_incr argument:

>>> indented_document_string = u"""\
... <?xml version="1.0" encoding="UTF-8"?>
... <article xmlns="http://www.w3.org/1999/xhtml/">
...     <h1 data="to quote: &lt;&amp;&gt;&quot;'">
...         &lt;Example&gt;
...     </h1>
...     <p umlaut-attribute="äöüß">
...         Hello
...         <em count="1">
...              World
...         </em>
...         !
...     </p>
...     <div>
...         <data-element>
...             äöüß &lt;&amp;&gt;
...         </data-element>
...         <p attr="value">
...             raw content
...         </p>
...         Some Text
...         <br></br>
...         012345
...     </div>
...     <?pi-target <PI content>?>
...     <?pi-without-content ?>
... </article>
... """
>>> string_out = BytesIO()
>>> content_handler = document.create_sax_events(indent_incr='    ', out=string_out)
>>> string_out.getvalue() == indented_document_string.encode('UTF-8')
True
>>> string_out.close()

Classes¶

Document¶

class ecoxipy.pyxom.Document(doctype_name, doctype_publicid, doctype_systemid, children, omit_xml_declaration, encoding, check_well_formedness=False)¶

A ContainerNode representing an XML document.

Parameters:

doctype_name (Unicode string) – The document type root element name or None if the document should not have a document type declaration.
doctype_publicid (Unicode string) – The public ID of the document type declaration or None.
doctype_systemid (Unicode string) – The system ID of the document type declaration or None.
children – The document root XMLNode instances.
encoding (Unicode string) – The encoding of the document. If it is None UTF-8 is used.
omit_xml_declaration (bool()) – If True the XML declaration is omitted.
check_well_formedness (bool()) – If True the document element name will be checked to be a valid XML name.

Raises ecoxipy.XMLWellFormednessException:

If check_well_formedness is True and doctype_name is not a valid XML name, doctype_publicid is not a valid public ID or doctype_systemid is not a valid system ID.

static create(*children, **kargs)¶

Creates a document and converts parameters to appropriate types.

Parameters:

children – The document root nodes. All items that are not XMLNode instances create Text nodes after they have been converted to Unicode strings.
kargs – The same parameters as the constructor has (except children) are recognized. The items doctype_name, doctype_publicid, doctype_systemid, and encoding are converted to Unicode strings if they are not None. omit_xml_declaration is converted to boolean.

Returns:

The created document.

Return type:

Document

Raises ecoxipy.XMLWellFormednessException:

If doctype_name is not a valid XML name, doctype_publicid is not a valid public ID or doctype_systemid is not a valid system ID.

doctype¶

The DocumentType instance of the document.

On setting one of the following occurs:

If the value is None, the document type’s attributes are set to None.
If the value is a byte or Unicode string, the document type document element name is set to this value (a byte string will be converted to Unicode). The document type public and system IDs will be set to None.
If the value is a mapping, the items identified by the strings 'name', 'publicid' or 'systemid' define the respective attributes of the document type, the others are assumed to be None.
If the value is a sequence, the item at position zero defines the document type document element name, the item at position one defines the public ID and the item at position two defines the system ID. If the sequence is shorter than three, non-available items are assumed to be None.

The document type values are converted to appropriate values and their validity is checked if check_well_formedness is True.

Example:

>>> doc = Document.create()
>>> doc.doctype
ecoxipy.pyxom.DocumentType(None, None, None)
>>> doc.doctype = {'name': 'test', 'systemid': 'foo bar'}
>>> doc.doctype
ecoxipy.pyxom.DocumentType('test', None, 'foo bar')
>>> doc.doctype = ('html', 'foo bar')
>>> doc.doctype
ecoxipy.pyxom.DocumentType('html', 'foo bar', None)
>>> doc.doctype = 'foo'
>>> doc.doctype
ecoxipy.pyxom.DocumentType('foo', None, None)
>>> doc.doctype = None
>>> doc.doctype
ecoxipy.pyxom.DocumentType(None, None, None)
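
The four accepted kinds of values can be thought of as being normalized to a (name, publicid, systemid) triple. The function below is a hypothetical sketch of that normalization for Python 3, not the setter's actual code; in particular, decoding byte strings as UTF-8 is an assumption, as the text above does not state which encoding is used.

```python
from collections.abc import Mapping


def normalize_doctype(value):
    """Return a (name, publicid, systemid) tuple for the given value."""
    if value is None:
        return (None, None, None)
    if isinstance(value, bytes):
        value = value.decode('UTF-8')  # assumed encoding, see lead-in
    if isinstance(value, str):
        # A string only defines the name; both IDs become None.
        return (value, None, None)
    if isinstance(value, Mapping):
        # Missing keys are treated as None.
        return (value.get('name'), value.get('publicid'),
                value.get('systemid'))
    # Any other sequence: pad with None up to three items.
    items = list(value)[:3]
    return tuple(items + [None] * (3 - len(items)))


from_mapping = normalize_doctype({'name': 'test', 'systemid': 'foo bar'})
from_sequence = normalize_doctype(('html', 'foo bar'))
from_string = normalize_doctype('foo')
```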

omit_xml_declaration¶: If True the XML declaration is omitted.

encoding¶: The encoding of the document. On setting, if the value is None it is set to UTF-8, otherwise it is converted to a Unicode string.

create_sax_events(content_handler=None, out=None, out_encoding='UTF-8', indent_incr=None)¶

Creates SAX events.

Parameters:

content_handler (xml.sax.ContentHandler) – If this is None, an xml.sax.saxutils.XMLGenerator is created and used as the content handler. If in this case out is not None, it is used for output.
out – The output to write to if no content_handler is given. It should have a write() method like files.
out_encoding – The output encoding or None for Unicode output.
indent_incr (str()) – If this is not None, pretty printing is activated; the value should be a string, which is used as the indentation increment.

Returns:

The content handler used.

duplicate()¶: Return a deep copy of the XML node, and its descendants if it is a ContainerNode instance.

element_by_id¶

A ecoxipy.pyxom.indexing.IndexDescriptor instance using a ecoxipy.pyxom.indexing.ElementByUniqueAttributeValueIndexer for indexing.

Use it like a mapping to retrieve the element having an attribute id with the value being equal to the requested key, possibly throwing a KeyError if such an element does not exist.

Important: If the document’s children are relevantly modified (i.e. an id attribute was created, modified or deleted), delete_indexes() should be called or this attribute should be deleted on the instance, which deletes the index.

elements_by_name¶

A ecoxipy.pyxom.indexing.IndexDescriptor instance using a ecoxipy.pyxom.indexing.ElementsByNameIndexer for indexing.

Use it like a mapping to retrieve an iterator over elements having a name equal to the requested key, possibly throwing a KeyError if such an element does not exist.

Important: If the document’s children are relevantly modified (i.e. new elements were added or deleted, elements’ names were modified), delete_indexes() should be called or this attribute should be deleted on the instance, which deletes the index.

nodes_by_namespace¶

A ecoxipy.pyxom.indexing.IndexDescriptor instance using a ecoxipy.pyxom.indexing.NamespaceIndexer for indexing.

Important: If the document’s children are relevantly modified (i.e. new elements/attributes were added or deleted, elements’/attributes’ names were modified), delete_indexes() should be called or this attribute should be deleted on the instance, which deletes the index.

delete_indexes()¶: A shortcut to delete the indexes of element_by_id and elements_by_name.

class ecoxipy.pyxom.DocumentType(name, publicid, systemid, check_well_formedness)¶

Represents a document type declaration of a Document. It should not be instantiated directly.

Parameters:

name (Unicode string) – The document element name.
publicid (Unicode string) – The document type public ID or None.
systemid (Unicode string) – The document type system ID or None.
check_well_formedness (bool()) – If True the document element name will be checked to be a valid XML name.

name¶: The document element name or None. On setting, if the value is None, publicid and systemid are also set to None. Otherwise the value is converted to a Unicode string; an ecoxipy.XMLWellFormednessException is thrown if it is not a valid XML name and check_well_formedness is True.

publicid¶: The document type public ID or None. On setting, if the value is not None it is converted to a Unicode string; an ecoxipy.XMLWellFormednessException is thrown if it is not a valid doctype public ID and check_well_formedness is True.

systemid¶: The document type system ID or None. On setting, if the value is not None it is converted to a Unicode string; an ecoxipy.XMLWellFormednessException is thrown if it is not a valid doctype system ID and check_well_formedness is True.

Element¶

class ecoxipy.pyxom.Element(name, children, attributes, check_well_formedness=False)¶

Represents an XML element. It inherits from ContainerNode and NamespaceNameMixin.

Parameters:

name (Unicode string) – The name of the element to create.
children (iterable of items) – The child XMLNode instances of the element.
attributes – Defines the attributes of the element. Must be usable as the parameter of dict and should contain only Unicode strings as key and value definitions.
check_well_formedness (bool()) – If True the element name and attribute names will be checked to be valid XML names.

Raises ecoxipy.XMLWellFormednessException:

If check_well_formedness is True and the name is not a valid XML name.

static create(name, *children, **kargs)¶

Creates an element and converts parameters to appropriate types.

Parameters:

children – The element child nodes. All items that are not XMLNode instances create Text nodes after they have been converted to Unicode strings.
kargs – The item attributes defines the attributes and must have a method items() (like dict) which returns an iterable of 2-tuple() instances containing the attribute name as the first and the attribute value as the second item. Attribute names and values are converted to Unicode strings.

Returns:

The created element.

Return type:

Element

Raises ecoxipy.XMLWellFormednessException:

If the name is not a valid XML name.

namespace_prefixes¶: An iterator over all namespace prefixes defined in the element and its parents. Duplicate values may be retrieved.

get_namespace_prefix_element(prefix)¶: Calculates the element the namespace prefix is defined in; this is None if the prefix is not defined.

get_namespace_uri(prefix)¶: Calculates the namespace URI for the prefix; this is False if the prefix is not defined.

name¶: The name of the element. On setting, the value is converted to a Unicode string; an ecoxipy.XMLWellFormednessException is thrown if it is not a valid XML name and check_well_formedness is True.

attributes¶: An Attributes instance containing the element’s attributes.

duplicate()¶: Return a deep copy of the XML node, and its descendants if it is a ContainerNode instance.

class ecoxipy.pyxom.Attribute(parent, name, value, check_well_formedness)¶

Represents an item of an Element’s Attributes. It inherits from NamespaceNameMixin and should not be instantiated directly; use Attributes.create_attribute() instead.

parent¶: The parent Attributes.

name¶: The attribute’s name. On setting, the value is converted to a Unicode string; if there is already another attribute with the same name on the parent Attributes instance, a KeyError is raised.

value¶: The attribute’s value.

class ecoxipy.pyxom.Attributes(parent, attributes, check_well_formedness)¶

This mapping, containing Attribute instances identified by their names, represents the attributes of an Element. It should not be instantiated directly.

create_attribute(name, value)¶

Create a new Attribute as part of the instance.

Parameters:

name – the attribute’s name
value – the attribute’s value

Returns:

the created attribute

Return type:

Attribute

Raises KeyError:

If an attribute with name already exists in the instance.

add(attribute)¶

Add an attribute to the instance. If the attribute is contained in an Attributes instance it is first removed from that.

Parameters:

attribute (Attribute) – the attribute to add

Raises:

ValueError – If attribute is not an Attribute instance.
KeyError – If an attribute with the attribute’s name already exists in the instance.

remove(attribute)¶

Remove the given attribute.

Parameters:

attribute (Attribute) – the attribute to remove

Raises:

KeyError – If no attribute with the name of attribute is contained in the instance.
ValueError – If there is an attribute with the name of attribute contained, but it is not attribute.
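
The contract of add() and remove() described above (moving an attribute between containers, KeyError for name clashes, ValueError for a different attribute of the same name) can be modelled in a few lines. Attr and AttrMap are hypothetical stand-ins written for this illustration and are not the Attribute and Attributes classes themselves.

```python
class Attr(object):
    """Hypothetical attribute: a name, a value and the owning container."""

    def __init__(self, name, value):
        self.name, self.value, self.parent = name, value, None


class AttrMap(object):
    """Hypothetical container holding Attr instances by name."""

    def __init__(self):
        self._items = {}

    def __contains__(self, name):
        return name in self._items

    def add(self, attribute):
        if not isinstance(attribute, Attr):
            raise ValueError('not an Attr instance')
        if attribute.name in self._items:
            raise KeyError(attribute.name)
        if attribute.parent is not None:
            # Move semantics: detach from the previous container first.
            attribute.parent.remove(attribute)
        self._items[attribute.name] = attribute
        attribute.parent = self

    def remove(self, attribute):
        current = self._items[attribute.name]  # KeyError if name is unknown
        if current is not attribute:
            raise ValueError('a different attribute has this name')
        del self._items[attribute.name]
        attribute.parent = None


first, second = AttrMap(), AttrMap()
attr = Attr('id', 'foo')
first.add(attr)
second.add(attr)  # automatically removed from `first`
moved = 'id' not in first and 'id' in second and attr.parent is second
```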

parent¶: The parent Element.

to_dict()¶: Creates a dict from the instance’s Attribute instances. The keys are the attributes’ names, identifying the attributes’ values.

Other Nodes¶

class ecoxipy.pyxom.Text(content)¶

A ContentNode representing a node of text.

duplicate()¶: Return a deep copy of the XML node, and its descendants if it is a ContainerNode instance.

class ecoxipy.pyxom.Comment(content, check_well_formedness=False)¶

A ContentNode representing a comment node.

Raises ecoxipy.XMLWellFormednessException:

If check_well_formedness is True and content is not valid.

static create(content)¶

Creates a comment node.

Parameters:

content – The content of the comment. This will be converted to a Unicode string.

Returns:

The created comment node.

Return type:

Comment

Raises ecoxipy.XMLWellFormednessException:

If content is not valid.

duplicate()¶: Return a deep copy of the XML node, and its descendants if it is a ContainerNode instance.

content¶: The node content. On setting, the value is converted to a Unicode string.

class ecoxipy.pyxom.ProcessingInstruction(target, content, check_well_formedness=False)¶

A ContentNode representing a processing instruction.

Parameters:	target – The `target`. content – The `content` or `None`. check_well_formedness (`bool()`) – If `True` the target will be checked to be a valid XML name.
Raises:	ecoxipy.XMLWellFormednessException – If `check_well_formedness` is `True` and either the `target` or the `content` is not valid.

static create(target, content=None)¶

Creates a processing instruction node and converts the parameters to appropriate types.

Parameters:	target – The `target`, will be converted to a Unicode string. content – The `content`, if it is not `None` it will be converted to a Unicode string.
Returns:	The created processing instruction.
Return type:	`ProcessingInstruction`
Raises:	ecoxipy.XMLWellFormednessException – If either the `target` or the `content` is not valid.

target¶: The processing instruction target.

content¶: The node content. On setting, the value is converted to a Unicode string.

duplicate()¶: Return a deep copy of the XML node, and its descendants if it is a ContainerNode instance.
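The content constraints for comments and processing instructions originate in the XML 1.0 specification: a comment must not contain `--` or end with `-`, a processing instruction target must not be `xml` in any letter case, and processing instruction content must not contain `?>`. The following is a minimal sketch of such checks, not the library's code. The function names are hypothetical, `ValueError` stands in for `ecoxipy.XMLWellFormednessException`, and the full XML name check of the target is left out.

```python
def check_comment(content):
    """Reject comment content that XML 1.0 does not allow (sketch)."""
    if '--' in content or content.endswith('-'):
        raise ValueError('invalid comment content: %r' % content)
    return content


def check_processing_instruction(target, content=None):
    """Reject a reserved target or content containing '?>' (sketch).

    Assumption: the target is otherwise a valid XML name; that check
    is omitted here for brevity.
    """
    if target.lower() == 'xml':
        raise ValueError('the target "xml" is reserved')
    if content is not None and '?>' in content:
        raise ValueError('invalid processing instruction content')
    return target, content


checked_comment = check_comment('<This is a comment!>')
checked_pi = check_processing_instruction('pi-target', '<PI content>')
```

Both calls use values from the creation example at the top of this page and pass the checks.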

Base Classes¶

class ecoxipy.pyxom.XMLNode¶

Base class for XML node objects.

Retrieving the byte string from an instance yields a byte string encoded as UTF-8.

parent¶: The parent ContainerNode or None if the node has no parent.

previous¶: The previous XMLNode or None if the node has no preceding sibling.

next¶: The next XMLNode or None if the node has no following sibling.

ancestors¶: Returns an iterator over all ancestors.

preceding_siblings¶: Returns an iterator over all preceding siblings.

following_siblings¶: Returns an iterator over all following siblings.

preceding¶: Returns an iterator over all preceding nodes.

following¶: Returns an iterator over all following nodes.

create_str(out=None, encoding='UTF-8')¶

Creates a string containing the XML representation of the node.

Parameters:	out – An `ecoxipy.string_output.StringOutput` instance or `None`. If it is the latter, a new `ecoxipy.string_output.StringOutput` instance is created. encoding – The output encoding or `None` for Unicode output. It is only taken into account if `out` is `None`.

create_sax_events(content_handler=None, out=None, out_encoding='UTF-8', indent_incr=None)¶

Creates SAX events.

Parameters:

content_handler (xml.sax.ContentHandler) – If this is None, an xml.sax.saxutils.XMLGenerator is created and used as the content handler. If in this case out is not None, it is used for output.
out – The output to write to if no content_handler is given. It should have a write() method like files.
out_encoding – The output encoding or None for Unicode output.
indent_incr (str()) – If this is not None this activates pretty printing. In this case it should be a string and it is used for indenting.

Returns:

The content handler used.
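To show what a content handler does with SAX events, the sketch below drives an `xml.sax.saxutils.XMLGenerator` by hand using only the standard library. It does not call `create_sax_events()`; the element name `p`, the attribute `id` and the text are made up for the illustration.

```python
import io
from xml.sax.saxutils import XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# A file-like object with a write() method receives the output.
out = io.StringIO()
handler = XMLGenerator(out, encoding='utf-8')

# Each call below is one SAX event; the generator serializes it as XML.
handler.startDocument()
handler.startElement('p', AttributesImpl({'id': 'x'}))
handler.characters('Hello <World>')  # markup characters are escaped
handler.endElement('p')
handler.endDocument()

result = out.getvalue()
```

Any other `xml.sax.ContentHandler` implementation could receive the same sequence of events, which is why passing a custom `content_handler` is useful for feeding a node tree into existing SAX-based code.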

duplicate(test=None)¶: Return a deep copy of the XML node, and its descendants if it is a ContainerNode instance.

class ecoxipy.pyxom.ContainerNode(children)¶

An XMLNode containing other nodes with sequence semantics.

Parameters:	children (`list()`) – The nodes contained in the node.

children(reverse=False)¶

Returns an iterator over the children.

Parameters:	reverse – If this is `True` the children are returned in reverse document order.
Returns:	An iterator over the children.

descendants(reverse=False, depth_first=True, max_depth=None)¶

Returns an iterator over all descendants.

Parameters:	reverse – If this is `True` the descendants are returned in reverse document order. depth_first – If this is `True` the descendants are returned depth-first, if it is `False` breadth-first traversal is used. max_depth (`int()`) – The maximum depth, if this is `None` all descendants will be returned.
Returns:	An iterator over the descendants.
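The difference between depth-first and breadth-first traversal, and the effect of `max_depth`, can be sketched on a tiny tree. This is an illustration only: the `Node` class and the free function `descendants()` are hypothetical stand-ins, direct children are assumed to have depth 1, and the `reverse` option is left out.

```python
from collections import deque


class Node:
    """Hypothetical minimal tree node with a name and child list."""
    def __init__(self, name, *children):
        self.name = name
        self.children = list(children)


def descendants(node, depth_first=True, max_depth=None):
    """Yield all descendants of `node`; direct children have depth 1."""
    if depth_first:
        # Stack of (node, depth); reversed so the first child is popped first.
        stack = [(child, 1) for child in reversed(node.children)]
        while stack:
            current, depth = stack.pop()
            yield current
            if max_depth is None or depth < max_depth:
                stack.extend(
                    (child, depth + 1) for child in reversed(current.children))
    else:
        # Queue of (node, depth); nodes are visited level by level.
        queue = deque((child, 1) for child in node.children)
        while queue:
            current, depth = queue.popleft()
            yield current
            if max_depth is None or depth < max_depth:
                queue.extend((child, depth + 1) for child in current.children)


root = Node('root', Node('a', Node('a1'), Node('a2')), Node('b', Node('b1')))
depth_first_names = [n.name for n in descendants(root)]
breadth_first_names = [n.name for n in descendants(root, depth_first=False)]
shallow_names = [n.name for n in descendants(root, max_depth=1)]
```

Here `depth_first_names` is `['a', 'a1', 'a2', 'b', 'b1']`, `breadth_first_names` is `['a', 'b', 'a1', 'a2', 'b1']`, and `shallow_names` is `['a', 'b']`.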

insert(index, child)¶: Insert child before index.

remove(child)¶: Remove child.

class ecoxipy.pyxom.ContentNode(content)¶

An XMLNode with content.

Parameters:	content (Unicode string) – Becomes the `content` attribute.

classmethod create(content)¶

Creates an instance of the ContentNode implementation and converts content to a Unicode string.

Parameters:	content – The content of the node. This will be converted to a Unicode string.
Returns:	The created `ContentNode` implementation instance.

content¶: The node content. On setting, the value is converted to a Unicode string.

class ecoxipy.pyxom.NamespaceNameMixin¶

Contains functionality implementing Namespaces in XML.

namespace_prefix¶: The namespace prefix (the part before :) of the node’s name.

local_name¶: The local name (the part after :) of the node’s name.

namespace_uri¶: The namespace URI the namespace_prefix refers to. It is None if there is no namespace prefix and it is False if the prefix lookup failed.
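The three possible outcomes of `namespace_uri` (a URI, `None`, or `False`) can be sketched as follows. The code follows the description above literally and is not the library's implementation: `split_name()` and `lookup_namespace_uri()` are hypothetical helpers, the `declarations` dict stands in for the prefix declarations in scope, and default namespace (`xmlns`) handling is deliberately left out.

```python
def split_name(name):
    """Split a qualified name into (namespace prefix, local name)."""
    prefix, separator, local_name = name.partition(':')
    if not separator:
        return None, name  # no colon: the whole name is the local name
    return prefix, local_name


def lookup_namespace_uri(name, declarations):
    """Return the URI for the name's prefix, None or False (sketch).

    `declarations` maps prefixes to URIs, as collected from in-scope
    `xmlns:prefix` attributes (an assumption of this sketch).
    """
    prefix, _ = split_name(name)
    if prefix is None:
        return None  # no namespace prefix
    return declarations.get(prefix, False)  # False: lookup failed


declarations = {'foo': 'foo://bar'}
found = lookup_namespace_uri('foo:somexml', declarations)
unprefixed = lookup_namespace_uri('somexml', declarations)
failed = lookup_namespace_uri('bar:somexml', declarations)
```

Distinguishing `None` from `False` lets calling code tell "there is no prefix" apart from "the prefix is not declared", so identity checks (`is None`, `is False`) are safer than a plain truth test.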
