This module contains a global default codec register (DEFAULT_REGISTER) and codecs for some commonly used media types (which are already present in the default register).
DEFAULT_REGISTER is an instance of CodecRegister, pre-populated with common codecs and used by default within intessa for content negotiation.
The default codecs defined in this module are:
Type | Media Types | Aliases | Codec |
---|---|---|---|
Text | text/plain | text | TextCodec |
HTML Forms | application/x-www-form-urlencoded | form | SimpleFormCodec |
Multipart Forms | multipart/form-data | multipart | MultipartFormCodec |
JSON | application/json, text/javascript | json, json-js (respectively) | JSONCodec |
XML | application/xml | xml | XMLCodec |
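
To make the media type and alias columns concrete, here is a minimal sketch of how a register could resolve either form to the same codec. It is not intessa's CodecRegister: the `SketchRegister` class and its `register` and `lookup` methods are hypothetical names, and codecs are represented by plain strings.

```python
class SketchRegister:
    """Hypothetical stand-in for a codec register; not intessa's CodecRegister."""

    def __init__(self):
        self._by_media_type = {}
        self._aliases = {}

    def register(self, media_type, codec, alias=None):
        # Store the codec under its media type and remember the optional alias.
        self._by_media_type[media_type] = codec
        if alias is not None:
            self._aliases[alias] = media_type

    def lookup(self, name):
        # Resolve an alias to its media type first; otherwise treat the
        # name as a media type already.
        media_type = self._aliases.get(name, name)
        return self._by_media_type[media_type]


sketch = SketchRegister()
sketch.register("text/plain", "TextCodec", alias="text")
sketch.register("application/json", "JSONCodec", alias="json")
sketch.register("text/javascript", "JSONCodec", alias="json-js")
```

With this layout, `sketch.lookup("json")` and `sketch.lookup("application/json")` resolve to the same entry.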
Default text codec.
Decode a bytestring to unicode, using the content type’s charset.
>>> TextCodec.decode(ContentType('text/plain; charset=utf-8'),
... 'H\xc3\xa9llo W\xc3\xb6rld')
u'H\xe9llo W\xf6rld'
>>> TextCodec.decode(ContentType('text/plain; charset=latin1'),
... 'H\xe9llo W\xf6rld')
u'H\xe9llo W\xf6rld'
If no charset is present, this method assumes the input is UTF-8:
>>> TextCodec.decode(ContentType('text/plain'),
... 'H\xc3\xa9llo W\xc3\xb6rld')
u'H\xe9llo W\xf6rld'
The decoder always uses ‘strict’ error handling:
>>> TextCodec.decode(ContentType('text/plain; charset=us-ascii'),
... 'H\xc3\xa9llo W\xc3\xb6rld')
Traceback (most recent call last):
...
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
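The charset handling shown in these examples can be approximated with the Python 3 standard library alone. The following is a minimal sketch, not intessa's implementation: `decode_text` is a hypothetical helper, and parsing the header with `email.message.Message` is an assumption made for illustration.

```python
from email.message import Message


def decode_text(content_type, data):
    """Hypothetical helper: decode bytes using the charset in a Content-Type header."""
    header = Message()
    header["Content-Type"] = content_type
    # Fall back to UTF-8 when the header carries no charset parameter.
    charset = header.get_content_charset() or "utf-8"
    # 'strict' raises UnicodeDecodeError instead of silently replacing bad bytes.
    return data.decode(charset, "strict")


latin1_text = decode_text("text/plain; charset=latin1", b"H\xe9llo W\xf6rld")
default_text = decode_text("text/plain", b"H\xc3\xa9llo W\xc3\xb6rld")
```

Both calls yield the same text, because the second falls back to UTF-8 when no charset is declared.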
Encode a unicode string as a bytestring using an encoding.
Codec for simple application/x-www-form-urlencoded HTML forms.
Decode the encoded form, returning a MultiDict.
For content types without a charset, returns an instance of webob.multidict.MultiDict with bytestrings:
>>> form_type = ContentType('application/x-www-form-urlencoded')
>>> SimpleFormCodec.decode(form_type, 'a=b&c=d')
MultiDict([('a', 'b'), ('c', 'd')])
If a charset is given, the input is decoded using it, returning a UnicodeMultiDict instead:
>>> utf8_form_type = ContentType(
... 'application/x-www-form-urlencoded; charset=utf-8')
>>> SimpleFormCodec.decode(utf8_form_type, 'h%C3%A9llo=w%C3%B8rld')
UnicodeMultiDict([(u'h\xe9llo', u'w\xf8rld')])
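The two decoding modes can be sketched with `urllib.parse.parse_qsl` from the Python 3 standard library. This is an illustration under stated assumptions rather than intessa's code: `decode_form` is a hypothetical helper, and it returns a plain list of pairs instead of a webob MultiDict.

```python
from urllib.parse import parse_qsl


def decode_form(body, charset=None):
    """Hypothetical helper: parse a urlencoded body into (key, value) pairs."""
    if charset is None:
        # Bytes in, bytes out: no text decoding is attempted.
        return parse_qsl(body, keep_blank_values=True)
    # Percent-escapes are plain ASCII, so decode the raw body as ASCII first,
    # then let parse_qsl interpret the unescaped bytes in the declared charset.
    return parse_qsl(body.decode("ascii"), keep_blank_values=True,
                     encoding=charset, errors="strict")


byte_pairs = decode_form(b"a=b&c=d")
text_pairs = decode_form(b"h%C3%A9llo=w%C3%B8rld", charset="utf-8")
```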
Encode a simple dictionary of strings to an HTML form.
The simplest case, dictionaries of bytestrings, works:
>>> SimpleFormCodec.encode('application/x-www-form-urlencoded',
... {'a': 'b', 'c': 'd'})
(ContentType(...), 'a=b&c=d')
You can also pass in lists of key, value pairs:
>>> SimpleFormCodec.encode('application/x-www-form-urlencoded',
... [('a', 'b'), ('c', 'd')])
(ContentType(...), 'a=b&c=d')
Unicode strings will be encoded according to the encoding and errors parameters:
>>> SimpleFormCodec.encode('application/x-www-form-urlencoded',
... [(u'héllo', u'wørld')])
(ContentType('...; charset=utf-8'), 'h%C3%A9llo=w%C3%B8rld')
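For comparison, the same three inputs can be encoded with `urllib.parse.urlencode` in Python 3. This is a sketch only: `encode_form` is a hypothetical helper, its `encoding` and `errors` defaults are assumptions, and unlike the codec it returns just the body without a content type.

```python
from urllib.parse import urlencode


def encode_form(data, encoding="utf-8", errors="strict"):
    """Hypothetical helper: urlencode a dict or a list of (key, value) pairs."""
    # urlencode accepts both mappings and sequences of pairs; non-ASCII text
    # is encoded with the given charset before being percent-escaped.
    return urlencode(data, encoding=encoding, errors=errors)


from_dict = encode_form({"a": "b", "c": "d"})
from_pairs = encode_form([("a", "b"), ("c", "d")])
from_unicode = encode_form([("h\u00e9llo", "w\u00f8rld")])
```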
Codec for multipart HTML forms (i.e. file uploads).
Encode given params as multipart/form-data.
Unlike most other codecs, which return a bytestring, this codec returns a StreamingBody instance. For large file uploads, files are therefore streamed from disk in manageable chunks rather than being loaded into memory all at once.
>>> MultipartFormCodec.encode('multipart/form-data', {'key': 'value'})
(ContentType('multipart/form-data; boundary=...'), <StreamingBody(...)>)
The provided data is passed directly to poster.encode.multipart_encode(); consult the poster docs for more information.
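The idea behind a streamed multipart body can be sketched as a generator that yields the body piece by piece. This is neither poster's nor intessa's implementation: `iter_multipart`, its argument layout, the fixed `application/octet-stream` part type, and the 8192-byte chunk size are all assumptions chosen for illustration.

```python
import io


def iter_multipart(fields, files, boundary, chunk_size=8192):
    """Hypothetical generator: yield a multipart/form-data body in pieces."""
    delimiter = b"--" + boundary.encode("ascii") + b"\r\n"
    for name, value in fields:
        yield delimiter
        yield ('Content-Disposition: form-data; name="%s"\r\n\r\n'
               % name).encode("utf-8")
        yield value.encode("utf-8") + b"\r\n"
    for name, filename, fileobj in files:
        yield delimiter
        yield ('Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
               'Content-Type: application/octet-stream\r\n\r\n'
               % (name, filename)).encode("utf-8")
        # Read the file in fixed-size chunks so it is never held in memory whole.
        while True:
            chunk = fileobj.read(chunk_size)
            if not chunk:
                break
            yield chunk
        yield b"\r\n"
    # The closing delimiter carries a trailing "--".
    yield b"--" + boundary.encode("ascii") + b"--\r\n"


upload = io.BytesIO(b"x" * 20000)  # stand-in for a file opened in binary mode
chunks = list(iter_multipart([("key", "value")],
                             [("file", "big.bin", upload)],
                             "sketch-boundary"))
```

A consumer can write each yielded piece to the socket as it arrives, so peak memory use is bounded by the chunk size rather than by the file size.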
Default JSON codec, powered by simplejson.
Decode a JSON bytestring using simplejson.loads().
>>> JSONCodec.decode('application/json', '{"a": 1}')
{'a': 1}
Encode an object as JSON using simplejson.dumps().
Additional parameters will be passed to simplejson.dumps() as-is.
>>> JSONCodec.encode('application/json', {"a": 1})
('application/json', '{"a": 1}')
>>> JSONCodec.encode('application/json', {"a": 1}, indent=True)
('application/json', '{\n "a": 1\n}')
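The keyword pass-through described here can be sketched in Python 3 with the standard library's `json` module, which stands in for simplejson in this example. `encode_json` and `decode_json` are hypothetical helpers, not intessa's JSONCodec, and `indent=1` is used in place of `indent=True`.

```python
import json


def encode_json(media_type, obj, **kwargs):
    """Hypothetical helper: serialize obj, forwarding extra keywords to json.dumps()."""
    return media_type, json.dumps(obj, **kwargs)


def decode_json(media_type, data):
    """Hypothetical helper: parse JSON text or a UTF-8 encoded bytestring."""
    return json.loads(data)


compact = encode_json("application/json", {"a": 1})
indented = encode_json("application/json", {"a": 1}, indent=1)
parsed = decode_json("application/json", b'{"a": 1}')
```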
Default XML codec, using lxml.objectify.
Decode an XML bytestring to a Python object, using lxml.objectify.
For more information on these objects, see http://lxml.de/objectify.html.
>>> doc = XMLCodec.decode('application/xml', '<obj><attr>value</attr></obj>')
>>> doc
<Element obj at 0x...>
>>> doc.tag
'obj'
>>> doc.attr
'value'
Encode an lxml ElementTree using lxml.etree.tostring().
By default, the output includes an XML declaration and is UTF-8 encoded. These defaults can be overridden by passing the xml_declaration (True or False, default True) and encoding (default 'utf-8') keywords.
See http://lxml.de/api/lxml.etree-module.html#tostring for a detailed overview of available options.
>>> tree = lxml.etree.fromstring('<obj><attr>value</attr></obj>')
>>> tree.find('attr').text = u"vålúè"
>>> XMLCodec.encode('application/xml', tree)
('application/xml', "<?xml version='1.0' encoding='utf-8'?>\n<obj><attr>v\xc3\xa5l\xc3\xba\xc3\xa8</attr></obj>")
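The effect of the xml_declaration and encoding keywords can be tried without lxml. The sketch below uses `xml.etree.ElementTree` from the Python 3 standard library (3.8 or later, where `tostring()` accepts `xml_declaration`) as a stand-in; `encode_xml` is a hypothetical helper rather than intessa's XMLCodec, and the standard library's output may differ from lxml's in small details.

```python
import xml.etree.ElementTree as ET


def encode_xml(media_type, tree, encoding="utf-8", xml_declaration=True):
    """Hypothetical helper: serialize an element to bytes, declaration optional."""
    body = ET.tostring(tree, encoding=encoding, xml_declaration=xml_declaration)
    return media_type, body


tree = ET.fromstring("<obj><attr>value</attr></obj>")
tree.find("attr").text = "v\u00e5l\u00fa\u00e8"

with_declaration = encode_xml("application/xml", tree)
without_declaration = encode_xml("application/xml", tree, xml_declaration=False)
```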