Wheezy HTTP is a lightweight WSGI library that aims to get the most out of the standard Python library. It runs on Python 2.4 up to the most cutting-edge Python 3.
Configuration options are a Python dictionary passed to WSGIApplication during initialization. These options are shared across various parts of the application, including the middleware factory, HTTP request, cookies, etc.
options = {}
main = WSGIApplication([
    bootstrap_http_defaults,
    lambda ignore: router_middleware
], options)
No options are required before use, since they all fall back to defaults defined in the config module. Options are checked for missing values by the bootstrap_http_defaults() middleware factory (a middleware factory is executed only once, at application start-up).
See the full list of available options in the config module.
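The fallback behaviour described above can be sketched as a middleware factory that fills in only the missing options. This is a minimal sketch, not the code of bootstrap_http_defaults(): the option names and default values below are hypothetical, the real ones are defined in the config module.

```python
# Hypothetical option names and values, chosen for illustration only.
DEFAULTS = {
    'ENCODING': 'UTF-8',
    'CONTENT_TYPE': 'text/html; charset=UTF-8',
}


def bootstrap_defaults_sketch(options):
    """Middleware factory that fills in options missing from the dictionary."""
    for name, value in DEFAULTS.items():
        # setdefault keeps any value the application already supplied.
        options.setdefault(name, value)
    # Returning None means this factory adds no middleware to the chain.
    return None


options = {'ENCODING': 'iso-8859-4'}
bootstrap_defaults_sketch(options)
```

Because the factory mutates the shared dictionary, every part of the application that receives options later sees the same defaults.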
WSGI is the Web Server Gateway Interface. It is a specification for web/application servers to communicate with web applications. It is a Python standard, described in detail in PEP 3333.
An instance of WSGIApplication is the entry point of your WSGI application. You instantiate it by supplying a list of desired middleware factories and global configuration options. Here is a snippet from the Hello World example:
options = {}
main = WSGIApplication([
    bootstrap_http_defaults,
    lambda ignore: router_middleware
], options)
An instance of WSGIApplication is a callable that responds to the standard WSGI call. This callable is passed to application/web server. Here is an integration example with the web server from python standard wsgiref package:
from wsgiref.simple_server import make_server
try:
    print('Visit http://localhost:8080/')
    make_server('', 8080, main).serve_forever()
except KeyboardInterrupt:
    pass
print('\nThanks!')
Integration with other WSGI application servers varies. However, the principle of the WSGI entry point is the same across those implementations.
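To make the shared entry point concrete, here is a plain WSGI callable as described by PEP 3333, called directly the way a server would call it. It uses only the standard library; simple_app and the captured dictionary are illustrative names, and nothing here is specific to Wheezy HTTP.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A small WSGI callable: takes environ and start_response, returns an iterable of bytes."""
    body = b'Hello World!!!'
    start_response('200 OK', [
        ('Content-Type', 'text/plain; charset=UTF-8'),
        ('Content-Length', str(len(body))),
    ])
    return [body]


# Build a minimal environ and call the application as a server would.
environ = {}
setup_testing_defaults(environ)
captured = {}


def start_response(status, headers, exc_info=None):
    captured['status'] = status
    captured['headers'] = headers


result = b''.join(simple_app(environ, start_response))
```

An instance of WSGIApplication plays the role of simple_app here: whatever server you choose only needs a callable with this signature.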
The presence of middleware, in general, is transparent to the application and requires no special support. Middleware is usually characterized by playing the following roles within an application:
Middleware can be any callable of the following form:
def middleware(request, following):
    if following is not None:
        response = following(request)
    else:
        response = ...
    return response
A middleware callable accepts as a first argument an instance of HTTPRequest and as second argument (following) the next middleware in the chain. It is up to middleware to decide whether to call the next middleware callable in the chain. It is expected that middleware returns an instance of HTTPResponse class or None.
Middleware usually requires some initialization before use: configuration variables, preparation, verification, etc. A middleware factory serves this purpose.
Middleware factory can be any callable of the following form:
def middleware_factory(options):
    return middleware
A middleware factory is initialized with the configuration options; this is the same dictionary used during WSGIApplication initialization. A middleware factory returns a particular middleware implementation or None (this can be useful for initialization that needs to run during application bootstrap, e.g. setting some defaults, see bootstrap_http_defaults()).
In case the last middleware in the chain returns None, it is equivalent to returning an HTTP response not found (HTTP status code 404).
Middleware is initialized and executed in a certain order. Let's set up a simple application with the following middleware chain:
app = WSGIApplication(middleware=[
    a_factory,
    b_factory,
    c_factory
])
Initialization and execution order is the same - from first element in the list to the last:
a_factory => b_factory => c_factory
If a factory returns None, it is skipped from the middleware list. Let's assume b_factory returns None, so the middleware chain becomes:
a => c
It is up to middleware a to call c before or after its own processing. WSGIApplication does not prescribe this; it only chains them. This gives the middleware developer control over each particular use case.
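The chaining rules above (factories called once in list order, None results skipped, each middleware receiving the next one as following) can be sketched in plain Python. This is an illustration under those stated rules, not WSGIApplication's actual code; build_chain and _bind are hypothetical helpers, and strings stand in for request and response objects.

```python
def _bind(middleware, following):
    # Fix the "following" argument so the result takes only a request.
    return lambda request: middleware(request, following)


def build_chain(factories, options):
    """Call each factory once, drop None results, link the rest in order."""
    middleware = [m for m in (f(options) for f in factories) if m is not None]
    following = None
    # Wrap from last to first so the first middleware ends up outermost.
    for m in reversed(middleware):
        following = _bind(m, following)
    return following


calls = []


def a_factory(options):
    def a(request, following):
        calls.append('a')
        return following(request) if following is not None else None
    return a


def b_factory(options):
    return None  # initialization only, contributes no middleware


def c_factory(options):
    def c(request, following):
        calls.append('c')
        return 'response from c'
    return c


chain = build_chain([a_factory, b_factory, c_factory], {})
result = chain('request')
```

Here a decides to call c before doing anything else with the response; it could equally do its own work first, or not call c at all.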
Handler is any callable that accepts an instance of HTTPRequest and returns HTTPResponse:
def handler(request):
    return response
Here is an example:
def welcome(request):
    response = HTTPResponse()
    response.write('Hello World!!!')
    return response
Wheezy HTTP does not provide HTTP handler implementations (see wheezy.web for this purpose).
Decorator accept_method accepts only particular HTTP request method if its argument (constraint) is a string:
@accept_method('GET')
def my_view(request):
    ...
or one of multiple HTTP request methods if the argument (constraint) is a list or tuple:
@accept_method(('GET', 'POST'))
def my_view(request):
    ...
The method constraint must be in uppercase.
The decorator responds with HTTP status code 405 (Method Not Allowed) when the incoming HTTP request method does not match the constraint.
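A decorator with this behaviour could look roughly like the sketch below. It is not the implementation of accept_method: Request is a stand-in for HTTPRequest that models only a method attribute, and plain status strings stand in for response objects.

```python
from functools import wraps


class Request:
    """Stand-in for HTTPRequest; only the method attribute is modeled."""

    def __init__(self, method):
        self.method = method


def accept_method_sketch(constraint):
    """Allow one method (string) or several (list or tuple)."""
    if isinstance(constraint, str):
        allowed = (constraint,)
    else:
        allowed = tuple(constraint)

    def decorate(handler):
        @wraps(handler)
        def guarded(request):
            # The comparison is exact, hence the uppercase requirement.
            if request.method not in allowed:
                return '405 Method Not Allowed'
            return handler(request)
        return guarded
    return decorate


@accept_method_sketch(('GET', 'POST'))
def my_view(request):
    return '200 OK'
```

Calling my_view(Request('DELETE')) never reaches the handler body, which is the point of putting the check in a decorator.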
HTTPRequest is a wrapper around the WSGI environ dictionary. It provides access to all variables stored within the environ and also provides several handy methods for daily use.
HTTPRequest includes the following useful attributes (they are evaluated only once during processing):
While working with request form/query you get a dictionary. The dictionary keys are the unique form variable names and the values are lists of values for each name. There usually exists just one value, so working with list is not that convenient. You can use get_param or first_item_adapter or last_item_adapter (see wheezy.core):
>>> from wheezy.core.collections import last_item_adapter
...
>>> request.query['a']
['1', '2']
>>> query = last_item_adapter(request.query)
>>> query['a']
'2'
>>> request.get_param('a')
'2'
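The idea behind such an adapter can be shown with the standard library alone. The sketch below parses a query string into a dictionary of lists and wraps it in a small read-only view; LastItem is a hypothetical class written for illustration, not the last_item_adapter from wheezy.core.

```python
from urllib.parse import parse_qs


class LastItem:
    """Read-only view that returns the last value of each list."""

    def __init__(self, mapping):
        self._mapping = mapping

    def __getitem__(self, key):
        return self._mapping[key][-1]

    def get(self, key, default=None):
        values = self._mapping.get(key)
        return values[-1] if values else default


# parse_qs keeps every value of a repeated name, in order.
query = parse_qs('a=1&a=2&b=3')
adapted = LastItem(query)
```

The underlying dictionary is left untouched, so code that needs all values of a repeated name can still read them from query.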
While you can initialize your application models by requesting certain values from the form or query, the separate Python package wheezy.validation is the recommended way to add forms support to your application. It includes both model binding and a number of validation rules.
Supported content types: application/x-www-form-urlencoded, application/json and multipart/form-data.
HTTPResponse correctly maps the following HTTP response status codes (according to rfc2616):
# see http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
# see http://en.wikipedia.org/wiki/List_of_HTTP_status_codes
HTTP_STATUS = {
    # Informational
    100: '100 Continue',
    101: '101 Switching Protocols',
    # Successful
    200: '200 OK',
    201: '201 Created',
    202: '202 Accepted',
    203: '203 Non-Authoritative Information',
    204: '204 No Content',
    205: '205 Reset Content',
    206: '206 Partial Content',
    207: '207 Multi-Status',
    # Redirection
    300: '300 Multiple Choices',
    301: '301 Moved Permanently',
    302: '302 Found',
    303: '303 See Other',
    304: '304 Not Modified',
    305: '305 Use Proxy',
    307: '307 Temporary Redirect',
    # Client Error
    400: '400 Bad Request',
    401: '401 Unauthorized',
    402: '402 Payment Required',
    403: '403 Forbidden',
    404: '404 Not Found',
    405: '405 Method Not Allowed',
    406: '406 Not Acceptable',
    407: '407 Proxy Authentication Required',
    408: '408 Request Timeout',
    409: '409 Conflict',
    410: '410 Gone',
    411: '411 Length Required',
    412: '412 Precondition Failed',
    413: '413 Request Entity Too Large',
    414: '414 Request-Uri Too Long',
    415: '415 Unsupported Media Type',
    416: '416 Requested Range Not Satisfiable',
    417: '417 Expectation Failed',
    # Server Error
    500: '500 Internal Server Error',
    501: '501 Not Implemented',
    502: '502 Bad Gateway',
    503: '503 Service Unavailable',
    504: '504 Gateway Timeout',
    505: '505 Http Version Not Supported'
}
You instantiate HTTPResponse and initialize it with content_type and encoding:
>>> r = HTTPResponse()
>>> r.headers
[('Content-Type', 'text/html; charset=UTF-8')]
>>> r = HTTPResponse(content_type='image/gif')
>>> r.headers
[('Content-Type', 'image/gif')]
>>> r = HTTPResponse(content_type='text/plain; charset=iso-8859-4',
... encoding='iso-8859-4')
>>> r.headers
[('Content-Type', 'text/plain; charset=iso-8859-4')]
HTTPResponse has two methods to buffer output: write and write_bytes.
The write method lets you buffer the response before it is actually passed to the application server. It encodes the input chunk to bytes according to the response encoding.
Method write_bytes buffers output bytes.
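The difference between the two methods can be illustrated with a small stand-in class. BufferedResponse below is hypothetical and models only the buffering described here; it is not HTTPResponse.

```python
class BufferedResponse:
    """Stand-in for a response that buffers output chunks as bytes."""

    def __init__(self, encoding='UTF-8'):
        self.encoding = encoding
        self.buffer = []

    def write(self, chunk):
        # Text is encoded with the response encoding before buffering.
        self.buffer.append(chunk.encode(self.encoding))

    def write_bytes(self, chunk):
        # Bytes are buffered untouched.
        self.buffer.append(chunk)


r = BufferedResponse(encoding='iso-8859-4')
r.write('Hello')
r.write_bytes(b' World')
body = b''.join(r.buffer)
```

Keeping the buffer as a list of byte chunks means nothing has to be joined or copied until the body is finally handed over.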
Here are some attributes available in HTTPResponse:
There are a number of handy preset redirect responses:
Browsers handle redirect responses to AJAX requests incorrectly, so status code 207 is used instead; JavaScript can receive it and perform the browser redirect.
Here is an example for jQuery:
$.ajax({
    // ...
    success: function(data, textStatus, jqXHR) {
        if (jqXHR.status == 207) {
            window.location.replace(
                jqXHR.getResponseHeader('Location'));
        } else {
            // ...
        }
    }
});
If AJAX response status code is 207, browser navigates to URL specified in HTTP response header Location.
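On the server side, the choice of status code might be sketched as follows. This is an assumption-laden illustration, not the library's redirect helpers: is_ajax and redirect_sketch are hypothetical names, and detecting AJAX through the X-Requested-With header (which jQuery sends) is an assumption about how such a check is commonly done.

```python
def is_ajax(environ):
    # Assumption: the client marks AJAX calls with X-Requested-With,
    # which WSGI exposes as HTTP_X_REQUESTED_WITH.
    return environ.get('HTTP_X_REQUESTED_WITH') == 'XMLHttpRequest'


def redirect_sketch(environ, location):
    """Return (status, headers) for a redirect to location."""
    status = '207 Multi-Status' if is_ajax(environ) else '302 Found'
    return status, [('Location', location)]


ajax_result = redirect_sketch(
    {'HTTP_X_REQUESTED_WITH': 'XMLHttpRequest'}, '/welcome')
plain_result = redirect_sketch({}, '/welcome')
```

Both variants carry the same Location header, so the client-side script and the browser end up at the same URL.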
There are a number of handy preset client error responses:
There is integration with wheezy.core package in json object encoding.
Here is a simple example:
from datetime import datetime
from wheezy.http import bad_request
from wheezy.http import json_response
def now_handler(request):
    if not request.ajax:
        return bad_request()
    return json_response({'now': datetime.now()})
Requests other than AJAX are rejected; AJAX requests receive a JSON response with the current server time.
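The standard json module cannot serialize datetime objects on its own, which is why an encoding helper is needed. A minimal sketch with the standard library follows; encode_default is a hypothetical helper, and the ISO 8601 output format is an assumption made for illustration (the encoder in wheezy.core may format dates differently).

```python
import json
from datetime import datetime


def encode_default(obj):
    """Fallback for objects json does not serialize natively."""
    if isinstance(obj, datetime):
        # Assumption: ISO 8601 text is an acceptable wire format.
        return obj.isoformat()
    raise TypeError('%r is not JSON serializable' % obj)


payload = json.dumps({'now': datetime(2011, 9, 20, 15, 0)},
                     default=encode_default)
```

json.dumps calls the default function only for values it cannot handle itself, so ordinary strings, numbers and lists are unaffected.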
HTTPCookie is implemented according to rfc2109. Here is a typical usage:
response.cookies.append(HTTPCookie('a', value='123', options=options))
In case you would like to delete a certain cookie:
response.cookies.append(HTTPCookie.delete('a', options=options))
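For comparison, here is what setting and deleting a cookie amounts to at the header level, shown with the standard library's http.cookies module rather than HTTPCookie. Deleting by sending an empty value with an expiry date in the past is the common convention; that HTTPCookie.delete does exactly this is an assumption.

```python
from http.cookies import SimpleCookie

# Set cookie 'a' with a path and the HttpOnly flag.
cookie = SimpleCookie()
cookie['a'] = '123'
cookie['a']['path'] = '/'
cookie['a']['httponly'] = True
set_header = cookie['a'].OutputString()

# Delete it: same name and path, empty value, expiry in the past.
expired = SimpleCookie()
expired['a'] = ''
expired['a']['path'] = '/'
expired['a']['expires'] = 'Thu, 01 Jan 1970 00:00:00 GMT'
delete_header = expired['a'].OutputString()
```

The path (and domain, if one was used) must match the original cookie, otherwise the browser treats the deletion as a different cookie.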
While the idea behind secure cookies is to protect the value (via some sort of encryption, hashing, etc.), this task is out of scope for this package. However, you can use Ticket from the wheezy.security package for this purpose; it supports encryption, hashing, expiration and verification.
Transforms are a way to manipulate a handler's response according to some algorithm. Typical use cases include runtime minification, hardening readability, gzip, etc. While middleware is applied to the whole application, a transform, in contrast, is applied to a particular handler only.
Transform is any callable of this form:
def transform(request, response):
    return response
There is a general decorator capable of applying several transforms to a response. You can use it in the following way:
from wheezy.http.transforms import gzip_transform
from wheezy.http.transforms import response_transforms
@response_transforms(gzip_transform(compress_level=9))
def handler(request):
    return response
If you need to apply several transforms to a handler, here is how you can do that:
@response_transforms(a_transform, b_transform)
def handler(request):
    return response
Transforms are applied in order, from the first argument to the last:
a_transform => b_transform
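The ordering can be demonstrated with a small decorator written for this purpose. response_transforms_sketch is a hypothetical stand-in, not the decorator from wheezy.http.transforms, and lists stand in for response objects so the order of application is visible.

```python
from functools import wraps


def response_transforms_sketch(*transforms):
    """Apply transforms to the handler's response, first argument first."""
    def decorate(handler):
        @wraps(handler)
        def wrapper(request):
            response = handler(request)
            for transform in transforms:
                response = transform(request, response)
            return response
        return wrapper
    return decorate


def a_transform(request, response):
    return response + ['a']


def b_transform(request, response):
    return response + ['b']


@response_transforms_sketch(a_transform, b_transform)
def handler(request):
    return ['handler']


result = handler(None)
```

Each transform receives the response produced by the previous one, so b_transform sees the output of a_transform.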
It is not always effective to apply gzip encoding to the whole application. Since in most cases WSGI applications are deployed behind a reverse proxy web server, it is more effective to use the proxy's response compression (10-20% performance gain with nginx). On the other hand, gzipped responses stored in cache are even better, since compression is done once, before the response is added to the cache. This is why there is a gzip transform.
Here is a definition:
def gzip_transform(compress_level=6, min_length=1024, vary=False):
compress_level - the compression level, between 1 and 9, where 1 is the least compression (fastest) and 9 is the most (slowest)
min_length - sets the minimum length, in bytes, of the first chunk in response that will be compressed. Responses shorter than this byte-length will not be compressed.
vary - enables response header “Vary: Accept-Encoding”.
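How compress_level and min_length interact can be sketched with the standard gzip module. gzip_body_sketch is a hypothetical function, not gzip_transform itself: it works on a whole body of bytes rather than on the first chunk of a response, and it takes the Accept-Encoding value as a plain string.

```python
import gzip


def gzip_body_sketch(body, accept_encoding, compress_level=6,
                     min_length=1024):
    """Return (body, extra_headers); compress only when it pays off."""
    if 'gzip' not in accept_encoding or len(body) < min_length:
        # Client cannot decode gzip, or the body is too short to bother.
        return body, []
    compressed = gzip.compress(body, compresslevel=compress_level)
    return compressed, [('Content-Encoding', 'gzip')]


small, small_headers = gzip_body_sketch(b'x' * 10, 'gzip, deflate')
large, large_headers = gzip_body_sketch(b'x' * 4096, 'gzip, deflate')
```

Short bodies are passed through because the gzip header and trailer can make the compressed form larger than the original.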
HTTPCachePolicy controls cache specific http headers: Cache-Control, Pragma, Expires, Last-Modified, ETag, Vary.
While the particular set of valid HTTP cache headers depends on the use case, three of them are distinguished:
HTTPCachePolicy includes the following useful methods:
You can use extend(headers) method to update headers with this cache policy (this is what HTTPResponse does when cache attribute is set):
>>> headers = []
>>> p = HTTPCachePolicy('no-cache')
>>> p.extend(headers)
>>> headers
[('Cache-Control', 'no-cache'),
('Pragma', 'no-cache'),
('Expires', '-1')]
Public caching headers:
>>> from datetime import datetime, timedelta
>>> from wheezy.core.datetime import UTC
>>> when = datetime(2011, 9, 20, 15, 00, tzinfo=UTC)
>>> headers = []
>>> p = HTTPCachePolicy('public')
>>> p.last_modified(when)
>>> p.expires(when + timedelta(hours=1))
>>> p.etag('abc')
>>> p.vary()
>>> p.extend(headers)
>>> headers
[('Cache-Control', 'public'),
('Expires', 'Tue, 20 Sep 2011 16:00:00 GMT'),
('Last-Modified', 'Tue, 20 Sep 2011 15:00:00 GMT'),
('ETag', 'abc'),
('Vary', '*')]
While you do not directly make a call to extend headers from cache policy, it is still useful to experiment within a python console.
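The date values in Expires and Last-Modified follow the HTTP date format. If you want to reproduce them in a console without the library, the standard email.utils module can format a UTC datetime this way; the snippet below is only a standard-library illustration and does not use HTTPCachePolicy.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

when = datetime(2011, 9, 20, 15, 0, tzinfo=timezone.utc)
headers = [
    ('Cache-Control', 'public'),
    # usegmt=True renders the zone as "GMT", as HTTP expects.
    ('Expires', format_datetime(when + timedelta(hours=1), usegmt=True)),
    ('Last-Modified', format_datetime(when, usegmt=True)),
]
```

format_datetime with usegmt=True requires a datetime whose tzinfo is timezone.utc, which keeps local times from leaking into headers by accident.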
CacheProfile combines a number of settings applicable to http cache policy as well as server side cache.
CacheProfile supports the following list of valid cache locations:
Here is a map between cache profile cacheability and http cache policy:
CACHEABILITY = {
    'none': 'no-cache',
    'server': 'no-cache',
    'client': 'private',
    'both': 'private',  # server and client
}
The cache profile method cache_policy is adapted according to the above map.
You create a cache profile by instantiating CacheProfile and passing in the following arguments:
Here is an example:
cache_profile = CacheProfile('client', duration=timedelta(minutes=15))
cache_profile = CacheProfile('both', duration=15)
It is recommended to define cache profiles in a separate module and import them as needed into the various parts of the application. This gives you better control, with a single place of change.
Content caching is the most effective type of cache. Your application code does not have to do any processing to determine a valid response for the user; instead, the response is returned from the cache. Since there is no heavy processing, just a simple operation to get an item from the cache, it should be very fast. However, not every request can be cached, and whether it can depends entirely on your application.
If you show a list of goods and it has not changed in any way (the price is the same, etc.), why regenerate the page every time it is requested, possibly several times per second? You can apply a cache profile to the response and it will be cached according to the profile's rules.
What happens if the price has been changed, but the list of goods cacheability was set to 15 mins? How to invalidate the cache? This is where CacheDependency comes to the rescue. The core feature of cache dependency is implemented in package wheezy.caching, however http module supports its integration.
The cache contract requires: get(key, namespace), set(key, value, time, namespace), set_multi(mapping, time, namespace) and incr(key, delta=1, namespace=None, initial_value=None). Look at the wheezy.caching package for more details.
response_cache() decorator is used to apply cache feature to handler. Here is an example that includes also CacheDependency:
from wheezy.caching.patterns import Cached
from wheezy.http import CacheProfile
from wheezy.http import none_cache_profile
from wheezy.http import response_cache
from myapp import cache
cached = Cached(cache, time=15)
cache_profile = CacheProfile('server', duration=15)
@response_cache(cache_profile)
def list_of_goods(request):
    ...
    response.cache_dependency.append('list_of_goods:%s:' % catalog_id)
    return response
@response_cache(none_cache_profile)
def change_price(request):
    ...
    cached.dependency.delete('list_of_goods:%s:' % catalog_id)
    return response
While list_of_goods is being cached, change_price handler effectively invalidates list_of_goods cache result, so next call will fetch an updated list.
Note, cache dependency keys must not end with a number.
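The general idea of invalidating through a dependency key can be sketched with an in-memory cache. This is a deliberately simplified illustration: DictCache and DependencySketch are hypothetical classes, and storing a list of dependent keys under the master key is an assumption made for clarity, not a description of how wheezy.caching stores dependencies.

```python
class DictCache:
    """Tiny in-memory stand-in for a cache client."""

    def __init__(self):
        self.items = {}

    def get(self, key):
        return self.items.get(key)

    def set(self, key, value):
        self.items[key] = value

    def delete(self, key):
        self.items.pop(key, None)


class DependencySketch:
    """Track which cache keys depend on a master key."""

    def __init__(self, cache):
        self.cache = cache

    def add(self, master_key, key):
        keys = self.cache.get(master_key) or []
        self.cache.set(master_key, keys + [key])

    def delete(self, master_key):
        # Remove every dependent entry, then the master key itself.
        for key in self.cache.get(master_key) or []:
            self.cache.delete(key)
        self.cache.delete(master_key)


cache = DictCache()
dependency = DependencySketch(cache)
cache.set('GET /goods/1', '<cached page>')
dependency.add('list_of_goods:1:', 'GET /goods/1')
dependency.delete('list_of_goods:1:')
```

After the delete call the cached page is gone, so the next request for the list has to be rendered again, which mirrors what change_price achieves above.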
The response_cache() decorator is applied to a handler. That is pretty far from the WSGI entry point: there are a number of middlewares as well as routing in between (all of which are relatively time consuming, especially routing). What if we were able to determine the cache profile for a given request earlier, in the first middleware in the chain? This is where HTTPCacheMiddleware comes onto the scene.
HTTPCacheMiddleware serves exactly this purpose. It is initialized with two arguments:
Here is an example:
cache = ...
options = {
    'http_cache': cache
}
main = WSGIApplication([
    http_cache_middleware_factory
], options)
middleware_vary is an instance of RequestVary. By default it varies the cache key by HTTP method and path. Let's assume we would like to vary the middleware key by HTTP scheme:
options = {
    ...
    'http_cache_middleware_vary': RequestVary(
        environ=['wsgi.url_scheme'])
}
RequestVary is designed to compose a key depending on a number of values, including headers, query, form and environ. It always varies by request method and path.
Here is a list of arguments that can be passed during initialization:
The following example will vary incoming request by request url query parameter q:
request_vary = RequestVary(query=['q'])
Note that you can vary by HTTP headers via environ names. A missing value is distinguished from an empty one.
RequestVary is used by CacheProfile and HTTPCacheMiddleware internally.
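A key built along these lines might look like the sketch below. vary_key_sketch is hypothetical and covers only method, path and query; the key layout (the 'Q' and 'N' markers, the space separator) is invented for illustration and is not the format RequestVary produces.

```python
def vary_key_sketch(method, path, query=None, vary_query=()):
    """Compose a cache key from method, path and selected query values."""
    parts = [method, path]
    query = query or {}
    for name in sorted(vary_query):
        if name in query:
            # Present values, even empty ones, are marked with 'Q'.
            parts.append('Q%s=%s' % (name, ','.join(query[name])))
        else:
            # A missing name gets its own marker, so it never
            # collides with an empty value.
            parts.append('N%s' % name)
    return ' '.join(parts)


key_wsgi = vary_key_sketch('GET', '/search', {'q': ['wsgi']}, ['q'])
key_python = vary_key_sketch('GET', '/search', {'q': ['python']}, ['q'])
key_missing = vary_key_sketch('GET', '/search', {}, ['q'])
key_empty = vary_key_sketch('GET', '/search', {'q': ['']}, ['q'])
key_extra = vary_key_sketch('GET', '/search',
                            {'q': ['wsgi'], 'page': ['2']}, ['q'])
```

Parameters that are not listed in vary_query do not influence the key, so requests that differ only in those parameters share one cache entry.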
Wheezy HTTP provides middleware adapters to be used for integration with other WSGI applications:
See the demo example in the wsgi_adapter application.