Pluggdapps

Component system. Web framework. And more ...

parsehttp – Library functions to parse HTTP messages.

Utility functions to parse and manipulate HTTP messages.

Module contents

class pluggdapps.utils.parsehttp.HTTPHeaders

Dictionary of HTTP headers for both request and response. Each key is a lower-cased string, and its value is represented as a byte-string of comma-separated values. Refer to RFC 2616 for more information.

pluggdapps.utils.parsehttp.port_for_scheme(scheme)

Calculate port based on scheme name. If scheme and port match, port is left empty. Otherwise port is explicitly set to the port number and returned as a string.

scheme, byte-string.

pluggdapps.utils.parsehttp.parse_startline(startline)

Every HTTP request starts with a start line specifying method, uri and version. Parse them and return a tuple of (method, uri, version).

startline is expected in bytes and all the elements in returned tuple will be in byte-strings as well.
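
As an illustration of the input and output shapes described above, here is a minimal sketch written against plain bytes. The name split_startline is hypothetical and this is not the library's implementation; it only shows a bytes start line going in and a tuple of byte-strings coming out.

```python
def split_startline(startline):
    """Split a request start line into (method, uri, version), all bytes.

    Hypothetical helper for illustration; not part of pluggdapps.
    """
    # strip() drops the trailing CRLF, split() breaks on runs of whitespace
    parts = startline.strip().split()
    if len(parts) != 3:
        raise ValueError("malformed start line: %r" % (startline,))
    method, uri, version = parts
    return method, uri, version


print(split_startline(b"GET /index.html?x=1 HTTP/1.1\r\n"))
```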

pluggdapps.utils.parsehttp.parse_url(uri, host=None, scheme=None)

Using stdlib’s urllib.parse.urlsplit() API, parse uri into its component parts.

uri,
byte-string of request-url, decoded using ‘utf-8’ encoding.
host,

byte-string from HTTP Host header. Many times uri, as found in the request startline, have abs_path alone, in which case, optional host name as found in the Host header can be supplied. It will be applied on the urlsplit() result.

Note that as per RFC definition Host header can also contain port address.

scheme,
Default scheme to use while parsing the url. Directly passed to urllib.parse.urlsplit().

Returns a UserDict with the following keys,
scheme, netloc, path, query, fragment, username, password, hostname, port, script.
Among these key values,
  • path value will be unquoted using urllib.parse.unquote()
  • query value will be parsed and unquoted using urllib.parse.parse_qs()
  • and, all values are available as strings.

Refer to Section 5.2 of RFC 2616.
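
The steps above can be sketched with the standard library alone. The function name split_request_url, the plain dict return value, the default scheme "http" and the reduced set of keys are assumptions made for illustration; the real function returns a UserDict and may treat the query value differently.

```python
from urllib.parse import urlsplit, unquote, parse_qs


def split_request_url(uri, host=None, scheme="http"):
    """Hypothetical sketch: split a request uri and apply the Host header."""
    parts = urlsplit(uri.decode("utf-8"), scheme=scheme)
    netloc, hostname, port = parts.netloc, parts.hostname, parts.port
    if not netloc and host:
        # uri carried abs_path only, so take host (and optional port)
        # from the Host header value
        netloc = host.decode("utf-8")
        from_host = urlsplit("//" + netloc)
        hostname, port = from_host.hostname, from_host.port
    return {
        "scheme": parts.scheme,
        "netloc": netloc,
        "path": unquote(parts.path),
        # parse_qs returns a dict of lists; kept as-is in this sketch
        "query": parse_qs(parts.query),
        "fragment": parts.fragment,
        "hostname": hostname or "",
        "port": "" if port is None else str(port),
    }


info = split_request_url(b"/a%20b/c?x=1&x=2", host=b"example.com:8080")
print(info["path"], info["query"], info["hostname"], info["port"])
```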

pluggdapps.utils.parsehttp.make_url(baseurl, path, query, fragment)

Using baseurl and the remaining variable parts of a url, namely path, query and fragment, construct a full url that can be sent in a response and interpreted by clients.

baseurl,
string of base-url with scheme and netlocation. Otherwise None, in which case relative-url is returned.
path,
string of URL path value, will be quoted using urllib.parse.quote() before generating the final URL
query,
dictionary of key, value pairs, each value being a list. Will be encoded using urllib.parse.urlencode()
fragment,
string of fragment portion of the url.

Return relative-url or absolute-url as a string.
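
A minimal sketch of the same assembly using only urllib.parse. The name build_url is hypothetical, and the choices to pass doseq=True to urlencode() and to leave the fragment unquoted are assumptions, not confirmed behaviour of make_url().

```python
from urllib.parse import quote, urlencode


def build_url(baseurl, path, query, fragment):
    """Hypothetical sketch: join baseurl, path, query and fragment."""
    url = quote(path)  # "/" is left intact by default
    if query:
        # doseq=True expands list values into repeated key=value pairs
        url += "?" + urlencode(query, doseq=True)
    if fragment:
        url += "#" + fragment
    if baseurl:
        # assumes path starts with "/", so drop a trailing one on baseurl
        url = baseurl.rstrip("/") + url
    return url


print(build_url("http://example.com", "/a b/c", {"x": ["1", "2"]}, "top"))
print(build_url(None, "/a b/c", {}, ""))
```

With baseurl set to None the sketch returns a relative url, matching the description of the baseurl argument.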

pluggdapps.utils.parsehttp.compare_url(url1, url2)

Compare two URLs url1 and url2 based on RFC2616 specification.

pluggdapps.utils.parsehttp.parse_netpath(netpath)

Parse netpath string containing host-name and script-path into a tuple of (netloc, script-path). If script-path is absent, return (netloc, '').

pluggdapps.utils.parsehttp.parse_formbody(content_type, body)

HTML form values can be submitted via POST or PUT methods, in which case the request Content-Type will be set appropriately. This function supports the application/x-www-form-urlencoded and multipart/form-data media-types. Note that files are submitted using the multipart/form-data media-type. Returns a dictionary of arguments.

content_type,
Value as returned by parse_content_type().
body,
Byte string of HTTP request body.

pluggdapps.utils.parsehttp.parse_connection(value)

Return a list of lowercased token values, in byte-strings, from Connection header field.
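
For example, the described result can be produced along these lines. The name connection_tokens is hypothetical and the sketch assumes the header value is a plain comma-separated list of tokens.

```python
def connection_tokens(value):
    """Hypothetical sketch: lower-cased byte-string tokens of a Connection value."""
    tokens = []
    for token in value.split(b","):
        token = token.strip().lower()
        if token:  # ignore empty elements such as a trailing comma
            tokens.append(token)
    return tokens


print(connection_tokens(b"Keep-Alive, Upgrade"))
```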

pluggdapps.utils.parsehttp.parse_date(value)

HTTP applications have historically allowed three different formats for the representation of date/time stamps:

Sun, 06 Nov 1994 08:49:37 GMT  ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov  6 08:49:37 1994       ; ANSI C's asctime() format

The first format is preferred as an Internet standard. This function heuristically parses the date (from a request header) and normalizes it to RFC 1123 format. Returns a datetime object.
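
One way to handle the three formats is to try a strptime() pattern for each in turn. This is only a sketch of that idea; the name parse_http_date, the str input and the returned naive datetime are assumptions, and the actual heuristic used by parse_date() may differ.

```python
from datetime import datetime

# One strptime pattern per historical format, preferred format first.
# Day and month names are matched using the current (default C) locale.
HTTP_DATE_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # RFC 822, updated by RFC 1123
    "%A, %d-%b-%y %H:%M:%S GMT",  # RFC 850
    "%a %b %d %H:%M:%S %Y",       # ANSI C asctime()
)


def parse_http_date(value):
    """Hypothetical sketch: return a naive datetime, understood as GMT."""
    for fmt in HTTP_DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # try the next format
    raise ValueError("unrecognised HTTP date: %r" % (value,))


for text in ("Sun, 06 Nov 1994 08:49:37 GMT",
             "Sunday, 06-Nov-94 08:49:37 GMT",
             "Sun Nov  6 08:49:37 1994"):
    print(parse_http_date(text))
```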

pluggdapps.utils.parsehttp.http_fromdate(dtime, tzinfo=None)

Convert timestamp dtime to a date string in RFC 1123 format, adjusting it to GMT. Returns a string.

pluggdapps.utils.parsehttp.http_todate(datestr)

Convert date-time string, RFC 1123 normalized format, to python datetime object.
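
The round trip between a datetime and an RFC 1123 string can be sketched as below. The names to_http_date and from_http_date are hypothetical, and treating a naive datetime as already being in GMT is an assumption; how http_fromdate() uses its tzinfo argument is not shown here.

```python
from datetime import datetime, timedelta, timezone

RFC1123 = "%a, %d %b %Y %H:%M:%S GMT"


def to_http_date(dtime):
    """Hypothetical sketch: format a datetime as an RFC 1123 string."""
    if dtime.tzinfo is not None:
        # aware datetimes are shifted to GMT before formatting
        dtime = dtime.astimezone(timezone.utc)
    return dtime.strftime(RFC1123)


def from_http_date(datestr):
    """Hypothetical sketch: parse an RFC 1123 string into a naive datetime."""
    return datetime.strptime(datestr, RFC1123)


ist = timezone(timedelta(hours=5, minutes=30))
stamp = to_http_date(datetime(1994, 11, 6, 14, 19, 37, tzinfo=ist))
print(stamp)
print(from_http_date(stamp))
```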

pluggdapps.utils.parsehttp.parse_transfer_encoding(value=b'')

Parse Transfer-Encoding header value,

transfer-coding      = "chunked" | transfer-extension
transfer-extension   = token *( ";" parameter )
value,
byte-string of Transfer-Encoding header value to parse.

Returns [ ( token, param ), ... ], where, token and param are byte-strings. token is also lower-cased.
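
The return shape can be illustrated with a small sketch. The name transfer_codings is hypothetical; the sketch keeps everything after the first ";" as the param byte-string and does not handle commas or semicolons inside quoted strings.

```python
def transfer_codings(value=b""):
    """Hypothetical sketch: [(token, param), ...] from a Transfer-Encoding value."""
    codings = []
    for item in value.split(b","):
        item = item.strip()
        if not item:
            continue
        # token comes before the first ";", parameters (if any) after it
        token, _, param = item.partition(b";")
        codings.append((token.strip().lower(), param.strip()))
    return codings


print(transfer_codings(b"gzip;level=5, Chunked"))
```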

pluggdapps.utils.parsehttp.parse_accept(value=b'')

Parse Accept header value,

Accept         = "Accept" ":"
                    #( media-range [ accept-params ] )

media-range    = ( "*/*"
                 | ( type "/" "*" )
                 | ( type "/" subtype )
                 ) *( ";" parameter )
accept-params  = ";" "q" "=" qvalue *( accept-extension )
accept-extension = ";" token [ "=" ( token | quoted-string ) ]
value,
byte-string of Accept header value to parse.

Returns [ ( '<type>/<subtype>', q, param ), ... ], where, param is a byte-string representing media-type parameters and q is quality value in float for media-type.
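
To make the returned triples concrete, here is a minimal sketch that separates the q parameter from the remaining parameters. The name accept_ranges is hypothetical; the sketch returns the media range as a byte-string, keeps the list in header order, and does no error handling for a malformed qvalue, all of which are assumptions.

```python
def accept_ranges(value=b""):
    """Hypothetical sketch: [(media_range, q, param), ...] from an Accept value."""
    ranges = []
    for item in value.split(b","):
        fields = [field.strip() for field in item.split(b";")]
        media = fields[0].lower()
        if not media:
            continue
        q, params = 1.0, []  # q defaults to 1 when absent
        for field in fields[1:]:
            name, _, val = field.partition(b"=")
            if name.strip().lower() == b"q":
                q = float(val.decode("ascii"))
            else:
                params.append(field)
        ranges.append((media, q, b";".join(params)))
    return ranges


print(accept_ranges(b"text/html;level=1;q=0.8, */*;q=0.1, text/plain"))
```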

pluggdapps.utils.parsehttp.parse_accept_charset(value=b'')

Parse Accept-Charset header value,

Accept-Charset = "Accept-Charset" ":"
          1#( ( charset | "*" )[ ";" "q" "=" qvalue ] )
value,
byte-string of Accept-Charset header value to parse.

Returns [ (charset, qvalue), ... ], where charset is a string and qvalue is a float. The returned list is sorted by qvalue.
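
A sketch of the sorted (charset, qvalue) list follows. The name accept_charsets is hypothetical, and sorting in descending order of qvalue (most preferred first) is an assumption, since the sort direction is not stated.

```python
def accept_charsets(value=b""):
    """Hypothetical sketch: [(charset, qvalue), ...] sorted by qvalue."""
    items = []
    for item in value.decode("ascii").split(","):
        name, _, param = item.strip().partition(";")
        name = name.strip().lower()
        if not name:
            continue
        q = 1.0  # qvalue defaults to 1 when absent
        key, _, val = param.partition("=")
        if key.strip().lower() == "q" and val.strip():
            q = float(val)
        items.append((name, q))
    # sorted() is stable, so equal qvalues keep their header order
    return sorted(items, key=lambda pair: pair[1], reverse=True)


print(accept_charsets(b"unicode-1-1;q=0.8, iso-8859-5, *;q=0.5"))
```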

pluggdapps.utils.parsehttp.parse_accept_encoding(value=b'')

Parse Accept-Encoding header value,

Accept-Encoding  = "Accept-Encoding" ":"
                      1#( codings [ ";" "q" "=" qvalue ] )
codings          = ( content-coding | "*" )
value,
byte-string of Accept-Encoding header value to parse.

Returns [ (content-coding, qvalue), ... ], where content-coding is a string and qvalue is a float. The returned list is sorted by qvalue.

pluggdapps.utils.parsehttp.parse_content_length(value)

Return length as integer type.

pluggdapps.utils.parsehttp.parse_content_type(value)

Parse content type using grammar,

Content-Type   = "Content-Type" ":" media-type
media-type     = type "/" subtype *( ";" parameter )
type           = token
subtype        = token
parameter      = attribute "=" value
attribute      = token
value          = token | quoted-string

Returns [ type, subtype, [ (attr, value), ... ] ], where all elements are byte-strings.
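
The grammar above maps onto the returned list roughly as follows. The name content_type_parts is hypothetical; lower-casing the type, subtype and attribute names and stripping the quotes of a quoted-string are assumptions, and semicolons inside quoted strings are not handled.

```python
def content_type_parts(value):
    """Hypothetical sketch: [type, subtype, [(attr, value), ...]] in bytes."""
    media, *params = [part.strip() for part in value.split(b";")]
    mtype, _, subtype = media.partition(b"/")
    pairs = []
    for param in params:
        if not param:
            continue
        attr, _, val = param.partition(b"=")
        # drop the surrounding quotes of a quoted-string value
        pairs.append((attr.strip().lower(), val.strip().strip(b'"')))
    return [mtype.lower(), subtype.lower(), pairs]


print(content_type_parts(b'text/html; charset="UTF-8"'))
print(content_type_parts(b"multipart/form-data; boundary=----abc"))
```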

pluggdapps.utils.parsehttp.parse_content_disposition(value)

Parse content disposition using grammar,

content-disposition = "Content-Disposition" ":"
                      disposition-type *( ";" disposition-parm )
disposition-type = "attachment" | disp-extension-token
disposition-parm = filename-parm | disp-extension-parm
filename-parm = "filename" "=" quoted-string
disp-extension-token = token
disp-extension-parm = token "=" ( token | quoted-string )

Returns, ( token, [ (attr, value), ... ] ), where all elements are in byte-string.