ecoxipy.parsing - Parsing XML

This package contains xml.sax handlers to parse XML into ecoxipy.MarkupBuilder structures.

Examples

>>> from ecoxipy.string_output import StringOutput
>>> from ecoxipy.parsing import MarkupHandler, XMLFragmentParser
>>> output = StringOutput()
>>> handler = MarkupHandler(output)
>>> doc = handler.parse(b'<test><foo bar="test">Hello World!</foo></test>')
>>> print(doc)
<test><foo bar="test">Hello World!</foo></test>
>>> handler = XMLFragmentParser(output)
>>> fragment = handler.parse(u'<foo bar="test">Hello World!</foo><test/>')
>>> for item in fragment:
...     print(item)
<foo bar="test">Hello World!</foo>
<test/>

Classes

class ecoxipy.parsing.MarkupHandler(output)

A SAX handler to create ecoxipy markup. By implementing your own ecoxipy.Output class you can use this handler to parse XML into a representation of your choice.

Parameters:output (ecoxipy.Output) – The output instance to use.

reset()

Reset the current state. This should be called before parsing a new document after an exception occurred while parsing. It is called automatically in endDocument() once a document has been processed.
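The reset-after-failure pattern described above can be sketched with the standard library alone. This is a minimal illustration and not ecoxipy's implementation; CountingHandler and its attributes are hypothetical names.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class CountingHandler(ContentHandler):
    """Hypothetical handler that counts elements and resets itself."""

    def __init__(self):
        super().__init__()
        self.last_count = None
        self.reset()

    def reset(self):
        # Discard any partial state left over from an aborted parse.
        self._count = 0

    def startElement(self, name, attrs):
        self._count += 1

    def endDocument(self):
        # Publish the result, then reset automatically for the next document.
        self.last_count = self._count
        self.reset()


handler = CountingHandler()
try:
    xml.sax.parseString(b"<a><b></a>", handler)  # not well-formed
except xml.sax.SAXException:
    # The parse was aborted, so stale state is cleared by hand.
    handler.reset()

xml.sax.parseString(b"<a><b/></a>", handler)
print(handler.last_count)
```

Without the explicit reset() in the except branch, the elements counted during the failed parse would leak into the result of the next document.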

parse(source, parser=None)

Parses the given XML source and returns data in the representation of the ecoxipy.Output instance given on creation.

Parameters:
Raises:xml.sax.SAXException if the XML is not well-formed.
Returns:the created XML data of the output representation.
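The contract of parse() (XML source in, result out, xml.sax.SAXException on malformed input, optional parser) can be illustrated with a plain xml.sax handler. The sketch below assumes nothing about ecoxipy internals; parse_names and NameCollector are hypothetical helpers, and the result here is just a list of element names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler


class NameCollector(ContentHandler):
    """Hypothetical handler recording element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_names(source, parser=None):
    """Parse XML bytes and return the element names encountered."""
    handler = NameCollector()
    if parser is None:
        # Fall back to the default SAX parser when none is supplied.
        parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(source))
    return handler.names


print(parse_names(b'<test><foo bar="test">Hello World!</foo></test>'))

try:
    parse_names(b"<test><foo></test>")
except xml.sax.SAXException as error:
    print("not well-formed:", error)
```') == ["test", "foo"]
try:
    parse_names(b"<test><foo></test>")
    failed = False
except xml.sax.SAXException:
    failed = True
assert failed
assert parse_names(b"<a/>", parser=xml.sax.make_parser()) == ["a"]
</test>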

document

The document processed. This is only available after successfully parsing an XML document. This attribute is deletable but not assignable.
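An attribute that can be deleted but not assigned is typically a property with a getter and a deleter but no setter. The following sketch shows that general Python technique; Holder and its stored value are hypothetical and do not reflect how ecoxipy stores the document.

```python
class Holder:
    """Hypothetical class with a deletable, non-assignable attribute."""

    def __init__(self):
        self._document = "parsed result"

    @property
    def document(self):
        # Raises AttributeError once the value has been deleted.
        return self._document

    @document.deleter
    def document(self):
        del self._document


holder = Holder()
print(holder.document)

try:
    holder.document = "other"
except AttributeError:
    print("assignment rejected")

del holder.document
print(hasattr(holder, "document"))
```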

notationDecl(name, publicId, systemId)

Handle a notation declaration event.

unparsedEntityDecl(name, publicId, systemId, ndata)

Handle an unparsed entity declaration event.
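To see when these two DTD events fire, a handler can be registered with setDTDHandler() on a standard xml.sax parser. This is a sketch under the assumption that the default expat-based parser is in use; DeclarationRecorder and the sample document are made up for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler


class DeclarationRecorder(ContentHandler, DTDHandler):
    """Hypothetical handler recording DTD declaration events."""

    def __init__(self):
        super().__init__()
        self.notations = []
        self.unparsed_entities = []

    def notationDecl(self, name, publicId, systemId):
        self.notations.append((name, publicId, systemId))

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.unparsed_entities.append((name, publicId, systemId, ndata))


# Internal DTD subset declaring one notation and one unparsed entity.
DOCUMENT = b"""<!DOCTYPE doc [
<!NOTATION gif SYSTEM "image/gif">
<!ENTITY logo SYSTEM "logo.gif" NDATA gif>
]>
<doc/>"""

recorder = DeclarationRecorder()
parser = xml.sax.make_parser()
parser.setContentHandler(recorder)
parser.setDTDHandler(recorder)
parser.parse(io.BytesIO(DOCUMENT))

print(recorder.notations)
print(recorder.unparsed_entities)
```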

startDocument()

Receive notification of the beginning of a document.

The SAX parser will invoke this method only once, before any other methods in this interface or in DTDHandler (except for setDocumentLocator).

endDocument()

Receive notification of the end of a document.

The SAX parser will invoke this method only once, and it will be the last method invoked during the parse. The parser shall not invoke this method until it has either abandoned parsing (because of an unrecoverable error) or reached the end of input.

startElement(name, attrs)

Signals the start of an element in non-namespace mode.

The name parameter contains the raw XML 1.0 name of the element type as a string and the attrs parameter holds an instance of the Attributes class containing the attributes of the element.

endElement(name)

Signals the end of an element in non-namespace mode.

The name parameter contains the name of the element type, just as with the startElement event.
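A common way to turn paired startElement/endElement events into a nested structure is to keep a stack of open elements. The sketch below uses only xml.sax; TreeBuilder and its tuple layout are hypothetical and unrelated to the structures ecoxipy.Output implementations produce.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TreeBuilder(ContentHandler):
    """Hypothetical handler building (name, attributes, children) tuples."""

    def __init__(self):
        super().__init__()
        self.root = None
        self._stack = []

    def startElement(self, name, attrs):
        # Copy the attributes, since the parser may reuse the attrs object.
        node = (name, dict(attrs.items()), [])
        if self._stack:
            # Attach the new node to the innermost open element.
            self._stack[-1][2].append(node)
        self._stack.append(node)

    def endElement(self, name):
        node = self._stack.pop()
        if not self._stack:
            # The outermost element has been closed.
            self.root = node


builder = TreeBuilder()
xml.sax.parseString(b'<test><foo bar="test"/></test>', builder)
print(builder.root)
```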

characters(content)

Receive notification of character data.

The parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
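Because one run of text may arrive as several characters() calls, a handler usually buffers the chunks and joins them when the next element boundary is reached. The following is a minimal sketch of that buffering; TextCollector is a hypothetical name.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler merging adjacent character chunks."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self._chunks = []

    def _flush(self):
        # Join whatever was buffered into a single text item.
        if self._chunks:
            self.texts.append("".join(self._chunks))
            self._chunks = []

    def characters(self, content):
        # Only buffer here, since more chunks of the same run may follow.
        self._chunks.append(content)

    def startElement(self, name, attrs):
        self._flush()

    def endElement(self, name):
        self._flush()


collector = TextCollector()
xml.sax.parseString(b"<p>Hello &amp; goodbye<br/>World</p>", collector)
print(collector.texts)
```

The result is the same whether the parser reports "Hello & goodbye" in one chunk or in several.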

ignorableWhitespace(content)

Receive notification of ignorable whitespace in element content.

Validating parsers must use this method to report each chunk of ignorable whitespace (see the W3C XML 1.0 recommendation, section 2.10). Non-validating parsers may also use this method if they are capable of parsing and using content models.

SAX parsers may return all contiguous whitespace in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity, so that the Locator provides useful information.

processingInstruction(target, data)

Receive notification of a processing instruction.

The parser will invoke this method once for each processing instruction found. Note that processing instructions may occur before or after the main document element.

A SAX parser should never report an XML declaration (XML 1.0, section 2.8) or a text declaration (XML 1.0, section 4.3.1) using this method.
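The behaviour described above can be observed with a small xml.sax handler: processing instructions before, inside, and after the document element are reported, while the XML declaration is not. PICollector is a hypothetical name used only for this sketch.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class PICollector(ContentHandler):
    """Hypothetical handler recording processing instructions."""

    def __init__(self):
        super().__init__()
        self.instructions = []

    def processingInstruction(self, target, data):
        self.instructions.append((target, data))


SOURCE = (
    b'<?xml version="1.0"?>'
    b'<?style href="a.css"?>'
    b"<doc><?note inside?></doc>"
    b"<?after done?>"
)

collector = PICollector()
xml.sax.parseString(SOURCE, collector)
for target, data in collector.instructions:
    print(target, data)
```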

comment(content)

Receive notification of a comment.
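In xml.sax, comments are not part of ContentHandler; they are delivered to an object registered through the lexical handler property. The sketch below assumes the default expat-based parser, which supports that property, and does not claim to mirror how MarkupHandler registers itself. CommentCollector is a hypothetical name.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class CommentCollector(ContentHandler):
    """Hypothetical handler that also implements the lexical events."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def comment(self, content):
        self.comments.append(content)

    # The remaining lexical events are not needed for this sketch.
    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass

    def startCDATA(self):
        pass

    def endCDATA(self):
        pass


collector = CommentCollector()
parser = xml.sax.make_parser()
parser.setContentHandler(collector)
# Register the same object for lexical events such as comments.
parser.setProperty(property_lexical_handler, collector)
parser.parse(io.BytesIO(b"<doc><!-- first --><a/><!--second--></doc>"))
print(collector.comments)
```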

exception ecoxipy.parsing.XMLFragmentParsedException(xml_fragment)

Indicates that an XML fragment has been parsed by XMLFragmentParser.

xml_fragment

The parsed XML fragment, a list instance.

class ecoxipy.parsing.XMLFragmentParser(output, parser=None)

A SAX handler to create XML fragments (lists of XML nodes) from Unicode strings and output ecoxipy data. If used as an xml.sax.handler.ContentHandler it raises an XMLFragmentParsedException when the root element is closed.

Parameters:
endElement(name)

Signals the end of an element in non-namespace mode.

The name parameter contains the name of the element type, just as with the startElement event.

parse(xml_fragment)

Parses the given XML fragment and returns data in the representation of the ecoxipy.Output instance given on creation.

Parameters:xml_fragment (Unicode string) – The XML fragment to parse.
Raises:xml.sax.SAXException if the XML is not well-formed.
Returns:the created XML data of the output representation.
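One way to parse a fragment with several top-level nodes using SAX is to wrap it in a synthetic root element and signal completion with an exception when that root closes. The sketch below illustrates that idea with the standard library only. Whether XMLFragmentParser wraps its input this way is an assumption, and FragmentParsed, FragmentHandler and parse_fragment are hypothetical names. For brevity it collects only the names of top-level elements and ignores top-level text.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class FragmentParsed(Exception):
    """Hypothetical signal carrying the collected top-level items."""

    def __init__(self, items):
        super().__init__("fragment parsed")
        self.items = items


class FragmentHandler(ContentHandler):
    def __init__(self):
        super().__init__()
        self._depth = 0
        self._items = []

    def startElement(self, name, attrs):
        if self._depth == 1:
            # Direct children of the synthetic root are the fragment's items.
            self._items.append(name)
        self._depth += 1

    def endElement(self, name):
        self._depth -= 1
        if self._depth == 0:
            # Root closed: hand the result to the caller and stop parsing.
            raise FragmentParsed(self._items)


def parse_fragment(xml_fragment):
    """Return the names of the top-level elements of a Unicode fragment."""
    wrapped = "<root>" + xml_fragment + "</root>"
    try:
        xml.sax.parseString(wrapped.encode("utf-8"), FragmentHandler())
    except FragmentParsed as signal:
        return signal.items
    raise AssertionError("root element was never closed")


print(parse_fragment('<foo bar="test">Hello World!</foo><test/>'))
```

A fragment that is not well-formed still surfaces as xml.sax.SAXException, because only the custom signal is caught.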
