intessa.conneg.default — Default codecs for common content types

This module provides a global default codec register (DEFAULT_REGISTER) and codecs for several commonly used media types, all of which are already registered in it.

intessa.conneg.default.DEFAULT_REGISTER

An instance of CodecRegister, pre-populated with common codecs and used by default within intessa for content negotiation.

Codecs

The default codecs defined in this module are:

Type             Media Types                         Aliases                       Codec
Text             text/plain                          text                          TextCodec
HTML Forms       application/x-www-form-urlencoded   form                          SimpleFormCodec
Multipart Forms  multipart/form-data                 multipart                     MultipartFormCodec
JSON             application/json, text/javascript   json, json-js (respectively)  JSONCodec
XML              application/xml                     xml                           XMLCodec
class intessa.conneg.default.text.TextCodec[source]

Default text codec.

static decode(c_type, bytes)[source]

Decode a bytestring to unicode, using the content type’s charset.

>>> TextCodec.decode(ContentType('text/plain; charset=utf-8'),
...                  'H\xc3\xa9llo W\xc3\xb6rld')
u'H\xe9llo W\xf6rld'
>>> TextCodec.decode(ContentType('text/plain; charset=latin1'),
...                  'H\xe9llo W\xf6rld')
u'H\xe9llo W\xf6rld'

If no charset is present, this method assumes the input is UTF-8:

>>> TextCodec.decode(ContentType('text/plain'),
...                  'H\xc3\xa9llo W\xc3\xb6rld')
u'H\xe9llo W\xf6rld'

The decoder always uses ‘strict’ error handling:

>>> TextCodec.decode(ContentType('text/plain; charset=us-ascii'), 
...                  'H\xc3\xa9llo W\xc3\xb6rld')
Traceback (most recent call last):
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
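
The charset lookup, the UTF-8 fallback, and the strict error handling described above can be sketched with the standard library alone. This is a minimal Python 3 illustration, not intessa's implementation: decode_text is a hypothetical helper, and it takes the content type as a plain header string rather than a ContentType instance.

```python
from email.message import Message


def decode_text(content_type, data, default_charset="utf-8"):
    """Decode bytes using the charset parameter of a Content-Type value.

    Hypothetical helper for illustration; not part of intessa.
    """
    # Message is used only as a convenient stdlib header-parameter parser.
    header = Message()
    header["Content-Type"] = content_type
    # Fall back to UTF-8 when the content type carries no charset.
    charset = header.get_param("charset") or default_charset
    # errors="strict" makes undecodable bytes raise UnicodeDecodeError.
    return data.decode(charset, errors="strict")
```

Calling decode_text('text/plain; charset=us-ascii', ...) with non-ASCII bytes raises UnicodeDecodeError, mirroring the traceback shown above.
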
static encode(media_type, string, encoding='utf-8', errors='strict')[source]

Encode a unicode string as a bytestring using an encoding.

Parameters:
  • encoding – The encoding to use (default: 'utf-8').
  • errors – The strategy for handling encoding errors (default: 'strict'). See the documentation on the built-in unicode.encode() for more information about this option.

>>> TextCodec.encode('text/plain', u"Héllo Wörld")
(ContentType('text/plain; charset=utf-8'), 'H\xc3\xa9llo W\xc3\xb6rld')
>>> TextCodec.encode('text/plain', u"Héllo Wörld", encoding='latin1')
(ContentType('text/plain; charset=latin1'), 'H\xe9llo W\xf6rld')
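
The shape of the return value, a content type carrying the chosen charset paired with the encoded bytes, can be sketched as follows. This is a Python 3 approximation under stated assumptions: encode_text is a hypothetical helper, and it returns the content type as a plain string where intessa returns a ContentType instance.

```python
def encode_text(media_type, string, encoding="utf-8", errors="strict"):
    """Return (content type with charset, encoded bytes).

    Hypothetical helper for illustration; not part of intessa.
    """
    # Record the encoding in the content type so a receiver can decode it.
    content_type = "%s; charset=%s" % (media_type, encoding)
    # The errors argument is handed straight to str.encode().
    return content_type, string.encode(encoding, errors)
```

Passing errors='replace' with a narrow encoding such as 'ascii' substitutes '?' for unencodable characters instead of raising.
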
    
class intessa.conneg.default.forms.SimpleFormCodec[source]

Codec for simple application/x-www-form-urlencoded HTML forms.

static decode(c_type, bytes)[source]

Decode the encoded form, returning a MultiDict.

For content types without a charset, returns an instance of webob.multidict.MultiDict with bytestrings:

>>> form_type = ContentType('application/x-www-form-urlencoded')
>>> SimpleFormCodec.decode(form_type, 'a=b&c=d')
MultiDict([('a', 'b'), ('c', 'd')])

If a charset is given, the input is decoded using it, returning a UnicodeMultiDict instead:

>>> utf8_form_type = ContentType(
...     'application/x-www-form-urlencoded; charset=utf-8')
>>> SimpleFormCodec.decode(utf8_form_type, 'h%C3%A9llo=w%C3%B8rld')
UnicodeMultiDict([(u'h\xe9llo', u'w\xf8rld')])
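
The two decoding modes above (raw bytestrings without a charset, text with one) can be approximated with urllib.parse. This is a Python 3 sketch, not intessa's implementation: decode_form is a hypothetical helper, it takes the charset directly instead of a ContentType, and it returns a plain list of pairs rather than a MultiDict.

```python
from urllib.parse import parse_qsl, unquote_to_bytes


def decode_form(data, charset=None):
    """Decode an urlencoded form body into a list of (name, value) pairs.

    Hypothetical helper for illustration; not part of intessa.
    """
    if charset is not None:
        # With a charset, percent-escapes are decoded to text.
        return parse_qsl(data.decode("ascii"), keep_blank_values=True,
                         encoding=charset, errors="strict")
    # Without a charset, keep names and values as bytestrings.
    pairs = []
    for field in data.split(b"&"):
        if not field:
            continue
        name, _, value = field.partition(b"=")
        pairs.append((unquote_to_bytes(name.replace(b"+", b" ")),
                      unquote_to_bytes(value.replace(b"+", b" "))))
    return pairs
```
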
static encode(media_type, obj, encoding='utf-8', errors='strict')[source]

Encode a simple dictionary of strings to an HTML form.

Parameters:
  • encoding – The encoding to use (default: 'utf-8').
  • errors – The strategy for handling encoding errors (default: 'strict'). See the documentation on the built-in unicode.encode() for more information about this option.

The simplest case, dictionaries of bytestrings, works:

>>> SimpleFormCodec.encode('application/x-www-form-urlencoded', 
...     {'a': 'b', 'c': 'd'})
(ContentType(...), 'a=b&c=d')

You can also pass in lists of (key, value) pairs:

>>> SimpleFormCodec.encode('application/x-www-form-urlencoded', 
...     [('a', 'b'), ('c', 'd')])
(ContentType(...), 'a=b&c=d')

Unicode strings will be encoded according to the encoding and errors parameters:

>>> SimpleFormCodec.encode('application/x-www-form-urlencoded', 
...     [(u'héllo', u'wørld')])
(ContentType('...; charset=utf-8'), 'h%C3%A9llo=w%C3%B8rld')
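
The same encoding rules can be sketched with urllib.parse.urlencode, which accepts either a mapping or a sequence of pairs. This is a Python 3 illustration under stated assumptions: encode_form is a hypothetical helper, the content type is a plain string, and the sketch always appends a charset (the source elides the content type returned for bytestring input, so that detail is not reproduced here).

```python
from urllib.parse import urlencode


def encode_form(obj, encoding="utf-8", errors="strict"):
    """Encode a dict or a list of pairs as an urlencoded form body.

    Hypothetical helper for illustration; not part of intessa.
    """
    # Accept both mappings and sequences of (key, value) pairs.
    pairs = list(obj.items()) if hasattr(obj, "items") else list(obj)
    # Text is encoded with the given encoding, then percent-escaped.
    body = urlencode(pairs, encoding=encoding, errors=errors).encode("ascii")
    content_type = "application/x-www-form-urlencoded; charset=%s" % encoding
    return content_type, body
```
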
class intessa.conneg.default.multipart.MultipartFormCodec[source]

Codec for multipart HTML forms (i.e. file uploads).

static decode(c_type, bytes)[source]

Decode a multipart form (not yet implemented).

static encode(media_type, params)[source]

Encode given params as multipart/form-data.

Rather than returning a bytestring as the encoded body (as most other codecs do), this codec returns a StreamingBody instance. For large file uploads, this means files are streamed from disk in manageable chunks rather than loaded into memory all at once.

>>> MultipartFormCodec.encode('multipart/form-data', {'key': 'value'}) 
(ContentType('multipart/form-data; boundary=...'), <StreamingBody(...)>)

The provided data is passed directly to poster.encode.multipart_encode(); consult the poster docs for more information.
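
To illustrate what streaming a multipart body in chunks means, here is a minimal Python 3 sketch of a generator-based encoder. It is not how poster or StreamingBody are implemented: iter_multipart, its boundary and chunk_size parameters, and the fixed application/octet-stream part type are all assumptions made for the example.

```python
import io
import uuid


def iter_multipart(fields, boundary=None, chunk_size=8192):
    """Return (content type, generator yielding the body piece by piece).

    Hypothetical helper for illustration; not part of intessa or poster.
    """
    boundary = boundary or uuid.uuid4().hex

    def chunks():
        for name, value in fields.items():
            yield ("--%s\r\n" % boundary).encode("ascii")
            if hasattr(value, "read"):
                # File-like values are read in fixed-size blocks,
                # so the whole file is never held in memory.
                yield ('Content-Disposition: form-data; name="%s"; '
                       'filename="%s"\r\n'
                       "Content-Type: application/octet-stream\r\n\r\n"
                       % (name, getattr(value, "name", name))).encode("utf-8")
                while True:
                    block = value.read(chunk_size)
                    if not block:
                        break
                    yield block
            else:
                yield ('Content-Disposition: form-data; name="%s"\r\n\r\n'
                       % name).encode("utf-8")
                yield str(value).encode("utf-8")
            yield b"\r\n"
        # The closing delimiter ends the multipart body.
        yield ("--%s--\r\n" % boundary).encode("ascii")

    return "multipart/form-data; boundary=%s" % boundary, chunks()


# Usage: iterate over the body instead of building one large bytestring.
content_type, body = iter_multipart(
    {"key": "value", "upload": io.BytesIO(b"x" * 20000)})
total = sum(len(chunk) for chunk in body)
```

Because the body is a generator, a caller can write each chunk to a socket as it is produced, which is the property the StreamingBody description above relies on.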

class intessa.conneg.default.json.JSONCodec[source]

Default JSON codec, powered by simplejson.

static decode(content_type, bytes)[source]

Decode a JSON bytestring using simplejson.loads().

>>> JSONCodec.decode('application/json', '{"a": 1}')
{'a': 1}
static encode(media_type, obj, **params)[source]

Encode an object as JSON using simplejson.dumps().

Additional parameters will be passed to simplejson.dumps() as-is.

>>> JSONCodec.encode('application/json', {"a": 1})
('application/json', '{"a": 1}')
>>> JSONCodec.encode('application/json', {"a": 1}, indent=True)
('application/json', '{\n "a": 1\n}')
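
The keyword pass-through can be sketched with the standard library's json module, which is used here in place of simplejson (the two share the dumps()/loads() interface for these calls). This is a Python 3 illustration; encode_json and decode_json are hypothetical helpers, not intessa functions.

```python
import json


def encode_json(media_type, obj, **params):
    """Return (media type, JSON text); extra keywords go to json.dumps().

    Hypothetical helper for illustration; not part of intessa.
    """
    # Keyword arguments such as indent or sort_keys are forwarded untouched.
    return media_type, json.dumps(obj, **params)


def decode_json(data):
    """Parse a JSON document given as bytes or str."""
    return json.loads(data)
```

For example, encode_json('application/json', obj, sort_keys=True) produces output with a stable key order, which is useful when comparing encoded bodies.
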
class intessa.conneg.default.xml.XMLCodec[source]

Default XML codec, using lxml.objectify.

static decode(content_type, bytes)[source]

Decode an XML bytestring to a Python object, using lxml.objectify.

For more information on these objects, see http://lxml.de/objectify.html.

>>> doc = XMLCodec.decode('application/xml', '<obj><attr>value</attr></obj>')
>>> doc 
<Element obj at 0x...>
>>> doc.tag
'obj'
>>> doc.attr
'value'
static encode(media_type, etree, **params)[source]

Encode an lxml ElementTree using lxml.etree.tostring().

By default, the output includes an XML declaration and is encoded as UTF-8. To override this, pass the xml_declaration (True or False, default True) and encoding (default 'utf-8') keywords.

See http://lxml.de/api/lxml.etree-module.html#tostring for a detailed overview of available options.

>>> tree = lxml.etree.fromstring('<obj><attr>value</attr></obj>')
>>> tree.find('attr').text = u"vålúè"
>>> XMLCodec.encode('application/xml', tree)
('application/xml', "<?xml version='1.0' encoding='utf-8'?>\n<obj><attr>v\xc3\xa5l\xc3\xba\xc3\xa8</attr></obj>")
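
The default-with-override behaviour of the two keywords can be sketched as below. To keep the example free of third-party dependencies it uses the standard library's xml.etree.ElementTree rather than lxml, so it is an analogue and not the codec's code; it needs Python 3.8 or later for the xml_declaration keyword, and encode_xml is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def encode_xml(media_type, element, **params):
    """Serialise an element, defaulting to UTF-8 with an XML declaration.

    Hypothetical helper for illustration; not part of intessa.
    """
    # setdefault() applies the defaults only when the caller gave no value.
    params.setdefault("encoding", "utf-8")
    params.setdefault("xml_declaration", True)
    return media_type, ET.tostring(element, **params)


# Usage: the declaration is present by default and can be switched off.
tree = ET.fromstring("<obj><attr>value</attr></obj>")
with_decl = encode_xml("application/xml", tree)[1]
without_decl = encode_xml("application/xml", tree, xml_declaration=False)[1]
```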