Utility classes and methods for dealing with URLs.
Represent a URL as a named-tuple object. Instances are immutable: their fields cannot be changed after creation.
The following read-only attributes are defined on objects of class Url.
Attribute | Index | Value | Value if not present |
---|---|---|---|
scheme | 0 | URL scheme specifier | empty string |
netloc | 1 | Network location part | empty string |
path | 2 | Hierarchical path | empty string |
query | 3 | Query component | empty string |
hostname | 4 | Host name (lower case) | None |
port | 5 | Port number as integer (if present) | None |
username | 6 | User name | None |
password | 7 | Password | None |
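For comparison, the standard library's urllib.parse.urlsplit returns a named tuple with similarly named read-only attributes. The sketch below uses only that standard-library call, not the Url class itself, and the example URL is made up for illustration. Note that the stdlib tuple places fragment at index 4, so only indices 0 to 3 line up with the table above.

```python
from urllib.parse import urlsplit

# Illustrative URL (an assumption, not taken from the library's docs).
parts = urlsplit('http://user:secret@WWW.Example.org:8080/data?x=1')

# The first four fields are reachable by index or by name.
scheme, netloc, path, query = parts[0], parts[1], parts[2], parts[3]
print(scheme, netloc, path, query)

# Derived attributes are parsed out of netloc; hostname is lower-cased.
print(parts.hostname, parts.port, parts.username, parts.password)

# Parts that are absent fall back to '' or None.
bare = urlsplit('/tmp/foo')
print(repr(bare.netloc), bare.hostname, bare.port)

# Named tuples are immutable: assigning to a field raises AttributeError.
try:
    parts.scheme = 'https'
    mutable = True
except AttributeError:
    mutable = False
print('mutable:', mutable)
```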
There are two ways of constructing Url objects:
By passing a string urlstring:
>>> u = Url('http://www.example.org/data')
>>> u.scheme
'http'
>>> u.netloc
'www.example.org'
>>> u.path
'/data'
The default URL scheme is file:
>>> u = Url('/tmp/foo')
>>> u.scheme
'file'
>>> u.path
'/tmp/foo'
Please note that extra leading slashes ‘/’ are interpreted as the beginning of a network location:
>>> u = Url('//foo/bar')
>>> u.path
'/bar'
>>> u.netloc
'foo'
>>> Url('///foo/bar').path
'/foo/bar'
See RFC 3986 (http://tools.ietf.org/html/rfc3986) for details.
If force_abs is True (default), then the path attribute is made absolute, by calling os.path.abspath if necessary:
>>> u = Url('foo/bar', force_abs=True)
>>> os.path.isabs(u.path)
True
Otherwise, if force_abs is False, then the path attribute stores the passed string unchanged:
>>> u = Url('foo', force_abs=False)
>>> os.path.isabs(u.path)
False
>>> u.path
'foo'
Other keyword arguments can specify defaults for missing parts of the URL:
>>> u = Url('/tmp/foo', scheme='file', netloc='localhost')
>>> u.scheme
'file'
>>> u.netloc
'localhost'
>>> u.path
'/tmp/foo'
By passing keyword arguments only, to construct a Url object with exactly those values for the named fields:
>>> u = Url(scheme='http', netloc='www.example.org', path='/data')
In this form, the force_abs parameter is ignored.
See also: http://docs.python.org/library/urlparse.html#urlparse-result-object
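The string-parsing rules described above (default file scheme, leading double slash as network location, optional absolutization of the path) can be approximated with the standard library alone. This is a minimal sketch, not the Url constructor: the function name parse_sketch and its default_scheme parameter are hypothetical, and applying force_abs only to file URLs is an assumption of this sketch.

```python
import os.path
from urllib.parse import urlsplit

def parse_sketch(urlstring, force_abs=True, default_scheme='file'):
    """Hypothetical helper mimicking the documented parsing behaviour."""
    # urlsplit's `scheme` argument is used only when the string has none.
    parts = urlsplit(urlstring, scheme=default_scheme)
    path = parts.path
    # Assumption: only local (file) paths are made absolute.
    if force_abs and parts.scheme == 'file' and not os.path.isabs(path):
        path = os.path.abspath(path)
    return parts._replace(path=path)

# A leading '//' starts a network location; '///' means an empty one.
u = parse_sketch('//foo/bar', force_abs=False)
print(u.netloc, u.path)
print(parse_sketch('///foo/bar', force_abs=False).path)

# Relative paths are kept verbatim unless force_abs is set.
print(parse_sketch('foo', force_abs=False).path)
print(os.path.isabs(parse_sketch('foo/bar').path))
```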
Return a new Url, constructed by appending relpath to the path section of this URL.
Example:
>>> u0 = Url('http://www.example.org')
>>> u1 = u0.adjoin('data')
>>> str(u1)
'http://www.example.org/data'
>>> u2 = u1.adjoin('moredata')
>>> str(u2)
'http://www.example.org/data/moredata'
Even if relpath starts with /, it is still appended to the path in the base URL:
>>> u3 = u2.adjoin('/evenmore')
>>> str(u3)
'http://www.example.org/data/moredata/evenmore'
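The appending behaviour differs from urllib.parse.urljoin, which treats a leading / in the second argument as a reset to the server root. A rough stand-alone equivalent of the semantics shown above, assuming plain strings instead of Url objects (the function name adjoin_sketch is hypothetical), might look like this:

```python
import posixpath
from urllib.parse import urljoin, urlsplit, urlunsplit

def adjoin_sketch(url, relpath):
    """Hypothetical helper: always append relpath to the path of url."""
    parts = urlsplit(url)
    base = parts.path or '/'
    # Strip leading slashes so relpath can never replace the base path.
    new_path = posixpath.join(base, relpath.lstrip('/'))
    return urlunsplit(parts._replace(path=new_path))

u1 = adjoin_sketch('http://www.example.org', 'data')
u2 = adjoin_sketch(u1, 'moredata')
u3 = adjoin_sketch(u2, '/evenmore')
print(u3)

# For contrast, urljoin resolves '/evenmore' against the server root.
print(urljoin(u2, '/evenmore'))
```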
A dictionary class enforcing that all keys are URLs.
Strings and/or objects returned by urlparse can be used as keys. Setting a string key automatically translates it to a URL:
>>> d = UrlKeyDict()
>>> d['/tmp/foo'] = 1
>>> for k in d.keys(): print (type(k), k.path)
(<class '....Url'>, '/tmp/foo')
Retrieving the value associated with a key works with either the string or the Url form of the key:
>>> d['/tmp/foo']
1
>>> d[Url('/tmp/foo')]
1
Membership tests likewise accept either the string or the Url form:
>>> '/tmp/foo' in d
True
>>> Url('/tmp/foo') in d
True
>>> 'file:///tmp/foo' in d
True
>>> 'http://example.org' in d
False
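One way to obtain this kind of behaviour is to normalize every key before it reaches the underlying dict. The sketch below is an illustration under stated assumptions, not the UrlKeyDict implementation: the names to_key and KeyNormalizingDict are hypothetical, and keys are reduced to a (scheme, netloc, path) tuple rather than a Url object.

```python
from urllib.parse import urlsplit

def to_key(value):
    """Hypothetical normalizer: string -> (scheme, netloc, path) tuple."""
    if isinstance(value, tuple):
        return value
    parts = urlsplit(value, scheme='file')  # 'file' is the default scheme
    return (parts.scheme, parts.netloc, parts.path)

class KeyNormalizingDict(dict):
    """Dict that normalizes keys on item access and membership tests."""

    def __setitem__(self, key, value):
        super().__setitem__(to_key(key), value)

    def __getitem__(self, key):
        return super().__getitem__(to_key(key))

    def __contains__(self, key):
        return super().__contains__(to_key(key))

d = KeyNormalizingDict()
d['/tmp/foo'] = 1
print(d['file:///tmp/foo'])        # same key after normalization
print('/tmp/foo' in d, 'http://example.org' in d)
```

A complete class would also need to override get, pop, setdefault, update and the constructor, since dict does not route those through __setitem__ or __getitem__.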
Class UrlKeyDict supports initialization by copying items from another dict instance or from an iterable of (key, value) pairs:
>>> d1 = UrlKeyDict({ '/tmp/foo':'foo', '/tmp/bar':'bar' })
>>> d2 = UrlKeyDict([ ('/tmp/foo', 'foo'), ('/tmp/bar', 'bar') ])
>>> d1 == d2
True
Unlike dict, initialization from keyword arguments alone is not supported:
>>> d3 = UrlKeyDict(foo='foo', bar='bar')
Traceback (most recent call last):
...
TypeError: __init__() got an unexpected keyword argument 'foo'
An empty UrlKeyDict instance is returned by the constructor when called with no parameters:
>>> d0 = UrlKeyDict()
>>> len(d0)
0
If force_abs is True, then all paths are converted to absolute ones in the dictionary keys:
>>> d = UrlKeyDict(force_abs=True)
>>> d['foo'] = 1
>>> for k in d.keys(): print(os.path.isabs(k.path))
True
>>> d = UrlKeyDict(force_abs=False)
>>> d['foo'] = 2
>>> for k in d.keys(): print(os.path.isabs(k.path))
False
A dictionary class enforcing that all values are URLs.
Strings and/or objects returned by urlparse can be used as values. Setting a string value automatically translates it to a URL:
>>> d = UrlValueDict()
>>> d[1] = '/tmp/foo'
>>> d[2] = Url('file:///tmp/bar')
>>> for v in d.values(): print (type(v), v.path)
(<class '....Url'>, '/tmp/foo')
(<class '....Url'>, '/tmp/bar')
Retrieving the value associated with a key always returns the URL-type value, regardless of how it was set:
>>> repr(d[1])
"Url(scheme='file', netloc='', path='/tmp/foo', hostname=None, port=None, username=None, password=None)"
Class UrlValueDict supports initialization by any of the methods that work with a plain dict instance:
>>> d1 = UrlValueDict({ 'foo':'/tmp/foo', 'bar':'/tmp/bar' })
>>> d2 = UrlValueDict([ ('foo', '/tmp/foo'), ('bar', '/tmp/bar') ])
>>> d3 = UrlValueDict(foo='/tmp/foo', bar='/tmp/bar')
>>> d1 == d2
True
>>> d2 == d3
True
In particular, an empty UrlValueDict instance is returned by the constructor when called with no parameters:
>>> d0 = UrlValueDict()
>>> len(d0)
0
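Supporting every dict-style initialization takes some care, because dict.__init__ stores items without calling __setitem__. The following sketch shows one possible approach; it is not the UrlValueDict implementation, the names to_url and ValueNormalizingDict are hypothetical, and values are converted to the standard library's SplitResult rather than to Url.

```python
from urllib.parse import SplitResult, urlsplit

def to_url(value):
    """Hypothetical normalizer: string -> SplitResult with default scheme."""
    if isinstance(value, SplitResult):
        return value
    return urlsplit(value, scheme='file')

class ValueNormalizingDict(dict):
    """Dict that converts every stored value with to_url()."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        # Build a plain dict first, then re-insert through __setitem__
        # so that every initial value is normalized as well.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __setitem__(self, key, value):
        super().__setitem__(key, to_url(value))

d1 = ValueNormalizingDict({'foo': '/tmp/foo', 'bar': '/tmp/bar'})
d2 = ValueNormalizingDict([('foo', '/tmp/foo'), ('bar', '/tmp/bar')])
d3 = ValueNormalizingDict(foo='/tmp/foo', bar='/tmp/bar')
print(d1 == d2 == d3)
print(d1['foo'].scheme, d1['foo'].path)
print(len(ValueNormalizingDict()))
```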
If force_abs is True, then all paths are converted to absolute ones in the dictionary values:
>>> d = UrlValueDict(force_abs=True)
>>> d[1] = 'foo'
>>> for v in d.values(): print(os.path.isabs(v.path))
True
>>> d = UrlValueDict(force_abs=False)
>>> d[2] = 'foo'
>>> for v in d.values(): print(os.path.isabs(v.path))
False