bridgedb.parse
Package containing modules for parsing data.
InvalidBase64
Bases: exceptions.ValueError
Raised if parsing or decoding cannot continue due to invalid base64.
padBase64(b64string)
Re-add any stripped equals-sign padding to a base64 string.
Parameters: | b64string (string) – A base64-encoded string which might have had its trailing equals sign (=) padding removed. |
---|---|
Raises ValueError: | if there was any error while manipulating the string. |
Returns: | A properly-padded (according to the base64 spec: RFC 4648) string. |
parseUnpaddedBase64(field)
Parse an unpadded, base64-encoded field.
The field will be re-padded, if need be, and then base64 decoded.
Parameters: | field (str) – Should be some base64-encoded thing, with any trailing = characters removed. |
---|---|
Raises InvalidBase64: | if there is an error in either re-padding or decoding field. |
Return type: | str |
Returns: | The base64-decoded field. |
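The re-padding and decoding steps described above can be sketched with the standard library. This is a minimal illustration, not BridgeDB's implementation: the names pad_base64 and parse_unpadded_base64 are hypothetical, the sketch targets Python 3 (where decoding returns bytes), and the local InvalidBase64 class only stands in for the exception documented above.

```python
import base64
import binascii


class InvalidBase64(ValueError):
    """Stand-in for the documented exception (defined locally for this sketch)."""


def pad_base64(b64string):
    """Re-add stripped '=' padding so the length is a multiple of four."""
    if len(b64string) % 4 == 1:
        # No valid base64 string has a length of 4n + 1, even when unpadded.
        raise ValueError("Invalid base64-encoded string: %r" % b64string)
    return b64string + "=" * (-len(b64string) % 4)


def parse_unpadded_base64(field):
    """Re-pad ``field`` if needed, then base64-decode it."""
    try:
        return base64.b64decode(pad_base64(field), validate=True)
    except (ValueError, binascii.Error) as error:
        raise InvalidBase64(str(error))
```

Passing validate=True makes the decoder reject characters outside the base64 alphabet instead of silently discarding them.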
Utilities for parsing IP and email addresses.
parse.addr
| |_ extractEmailAddress() - Validate an RFC 2822 email address.
| |_ isIPAddress() - Check if an arbitrary string is an IP address.
| |_ isIPv4() - Check if an arbitrary string is an IPv4 address.
| |_ isIPv6() - Check if an arbitrary string is an IPv6 address.
| \_ isValidIP() - Check that an IP address is valid.
|
|_ PortList - A container class for validated port ranges.
The following terms define addresses which are not valid. All other addresses are taken to be valid.
These address ranges are reserved by IANA for private intranets, and not routable to the Internet:
10.0.0.0 - 10.255.255.255 (10.0.0.0/8)
172.16.0.0 - 172.31.255.255 (172.16.0.0/12)
192.168.0.0 - 192.168.255.255 (192.168.0.0/16)
For additional information, see RFC 1918.
Current network (only valid as source address). See RFC 1122. An Unspecified Address in the context of firewalls means “all addresses of the local machine”. In a routing context, it is usually termed the Default Route, and it means the default route (to “the rest of” the internet). See RFC 1700. For example:
0.0.0.0/8
::/128
Reserved for loopback and IPC on the localhost. See RFC 1122. Example:
127.0.0.0
Loopback IP addresses (refers to self). See RFC 5735. Examples include:
127.0.0.1 - 127.255.255.254 (127.0.0.0/8)
::1
These are the link-local blocks, used for communication between hosts on a single link. See RFC 3927. Examples:
169.254.0.0/16
fe80::/64
Reserved for multicast addresses. See RFC 3171. For example:
224.0.0.0 - 239.255.255.255 (224.0.0.0/4)
Reserved for private networks. See RFC 1918. Some examples include:
10.0.0.0/8
172.16.0.0/12
192.168.0.0/16
Reserved (former Class E network). See RFC 1700, RFC 3232, and RFC 5735. The one exception to this rule is the Limited Broadcast Address, 255.255.255.255, for which packets at the IP layer are not forwarded to the public internet. For example:
240.0.0.0 - 255.255.255.255 (240.0.0.0/4)
Limited broadcast address (limited to all other nodes on the LAN). See RFC 919. For IPv4, 255 in any part of the IP is reserved for broadcast addressing to the local LAN, e.g.:
255.255.255.255
Warning
The ipaddr module (as of version 2.1.10) does not understand the following reserved addresses:
Reserved for IETF protocol assignments. See RFC 5735. Example:
192.0.0.0/24
IPv6 to IPv4 relay. See RFC 3068. Example:
192.88.99.0/24
Network benchmark tests. See RFC 2544. Example:
198.18.0.0/15
Reserved for use in documentation and example code. It is often used in conjunction with domain names example.com or example.net in vendor and protocol documentation. See RFC 1166. For example:
192.0.2.0/24
TEST-NET-2. See RFC 5737. Example:
198.51.100.0/24
TEST-NET-3. See RFC 5737. Example:
203.0.113.0/24
Shared Address Space. See RFC 6598. Example:
100.64.0.0/10
Similar uses to Limited Broadcast Address. For IPv6, everything becomes convoluted and complicated, and then redefined. See RFC 4193, RFC 3879, and RFC 3513. The ipaddr.IPAddress.is_site_local() method only checks to see if the address is a Unique Local Address vis-à-vis RFC 3513 §2.5.6, e.g.:
ff00::0/8
fec0::/10
ASPECIAL = u'-_+/=_~'
BadEmail(msg, email)
Bases: exceptions.Exception
Exception raised when we get a bad email address.
InvalidPort
Bases: exceptions.ValueError
Raised when a given port number is invalid.
UnsupportedDomain
Bases: exceptions.ValueError
Raised when we get an email address from an unsupported domain.
canonicalizeEmailDomain(domain, domainmap)
Decide if an email was sent from a permitted domain.
Parameters: |
|
---|---|
Raises UnsupportedDomain: | if the domain portion of the email address is not within the map of alternate to canonical allowed domain names. |
Return type: | |
Returns: | The canonical domain name for the email address. |
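The lookup described above, from an alternate domain name to its canonical name, can be sketched as a dictionary access. This is only an illustration under stated assumptions: the function name canonicalize_email_domain, the DOMAIN_MAP contents, and the local UnsupportedDomain class are hypothetical stand-ins, and the sketch assumes every permitted domain (canonical or alternate) appears as a key in the map.

```python
class UnsupportedDomain(ValueError):
    """Stand-in for the documented exception (defined locally for this sketch)."""


def canonicalize_email_domain(domain, domainmap):
    """Return the canonical name for ``domain`` or raise UnsupportedDomain."""
    permitted = domainmap.get(domain)
    if permitted is None:
        raise UnsupportedDomain("Domain not permitted: %r" % domain)
    return permitted


# Hypothetical map: alternate names point at their canonical domain.
DOMAIN_MAP = {
    "example.com": "example.com",
    "mail.example.com": "example.com",
}
```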
extractEmailAddress(emailaddr)
Given an email address, obtained, for example, via a From: or Sender: email header, try to extract and parse (according to RFC 2822) the local and domain portions.
We only allow the following form:
LOCAL_PART := DOTATOM
DOMAIN := DOTATOM
ADDRSPEC := LOCAL_PART "@" DOMAIN
In particular, we are disallowing: obs-local-part, obs-domain, comment, and obs-FWS. Other forms exist, but none of the incoming services we recognize support them.
Parameters: | emailaddr – An email address to validate. |
---|---|
Raises BadEmail: | if the emailaddr couldn’t be validated or parsed. |
Returns: | A tuple of the validated email address, containing the mail local part and the domain: (LOCAL_PART, DOMAIN) |
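The DOTATOM "@" DOTATOM form above can be sketched with a regular expression. This is a hedged illustration, not BridgeDB's parser: the name extract_addr_spec is hypothetical, the character class follows the RFC 2822 atext definition (the set BridgeDB actually accepts may be narrower), and the sketch handles only a bare address, not a full From: header value with a display name or angle brackets.

```python
import re

# RFC 2822 "atext": letters, digits, and a fixed set of printable specials.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]+"
# A dot-atom is one or more atext runs joined by single dots.
DOT_ATOM = ATEXT + r"(?:\." + ATEXT + r")*"
ADDR_SPEC = re.compile("(" + DOT_ATOM + ")@(" + DOT_ATOM + ")")


class BadEmail(Exception):
    """Stand-in for the documented exception (defined locally for this sketch)."""

    def __init__(self, msg, email):
        super().__init__(msg)
        self.email = email


def extract_addr_spec(emailaddr):
    """Return (LOCAL_PART, DOMAIN) for a bare dot-atom address."""
    match = ADDR_SPEC.fullmatch(emailaddr.strip())
    if match is None:
        raise BadEmail("Not a valid dot-atom address", emailaddr)
    return match.group(1), match.group(2)
```

Quoted local parts, comments, and folding whitespace fail to match, which mirrors the disallowed forms listed above.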
isIPAddress(ip, compressed=True)
Check if an arbitrary string is an IP address, and that it’s valid.
Parameters: |
|
---|---|
Return type: | A |
Returns: | The IP, as a string or a class, if it passed the checks. Otherwise, returns False. |
isIPv(version, ip)
Check if ip is a certain version (IPv4 or IPv6).
Parameters: |
|
---|---|
Return type: | boolean |
Returns: |
|
isIPv4(ip)
Check if an address is IPv4.
Attention
This does not check validity. See isValidIP().
Parameters: | ip (basestring or int) – The IP address to check. |
---|---|
Return type: | boolean |
Returns: | True if the address is an IPv4 address. |
isIPv6(ip)
Check if an address is IPv6.
Attention
This does not check validity. See isValidIP().
Parameters: | ip (basestring or int) – The IP address to check. |
---|---|
Return type: | boolean |
Returns: | True if the address is an IPv6 address. |
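A version check of this kind can be sketched with Python 3's standard ipaddress module. Note the assumptions: the documented functions rely on the third-party ipaddr module, and is_ip_version is a hypothetical name used only for this example. As with isIPv4() and isIPv6(), the sketch says nothing about whether the address is reserved or routable.

```python
import ipaddress


def is_ip_version(version, ip):
    """Return True if ``ip`` parses as an address of the given IP version."""
    try:
        return ipaddress.ip_address(ip).version == version
    except ValueError:
        # Not parseable as either IPv4 or IPv6.
        return False
```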
isValidIP(ip)
Check that an IP (v4 or v6) is valid.
The IP address, ip, must not be any of the following:
- A Link-Local Address,
- A Loopback Address or Localhost Address,
- A Multicast Address,
- An Unspecified Address or Default Route,
- Any other Private Address, or address within a privately allocated space, such as the IANA-reserved Shared Address Space.
If it is an IPv6 address, it also must not be:
- A Site-Local Address or a Unique Local Address.
>>> from bridgedb.parse.addr import isValidIP
>>> isValidIP('1.2.3.4')
True
>>> isValidIP('1.2.3.255')
True
>>> isValidIP('1.2.3.256')
False
>>> isValidIP('1')
False
>>> isValidIP('1.2.3')
False
>>> isValidIP('xyzzy')
False
Parameters: | ip (An ipaddr.IPAddress, ipaddr.IPv4Address, ipaddr.IPv6Address, or str) – An IP address. If it is a string, it will be converted to an ipaddr.IPAddress. |
---|---|
Return type: | boolean |
Returns: | True, if ip passes the checks; False otherwise. |
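The exclusion rules listed above can be approximated with Python 3's standard ipaddress module. This is a sketch under assumptions, not the documented implementation (which uses the ipaddr module): is_valid_ip and SHARED_ADDRESS_SPACE are hypothetical names, and the exact set of ranges that ipaddress treats as private or reserved varies somewhat between Python versions.

```python
import ipaddress

# RFC 6598 Shared Address Space, checked explicitly in this sketch.
SHARED_ADDRESS_SPACE = ipaddress.ip_network("100.64.0.0/10")


def is_valid_ip(ip):
    """Return True only if ``ip`` parses and is outside the excluded ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False
    if (addr.is_link_local or addr.is_loopback or addr.is_multicast
            or addr.is_unspecified or addr.is_private or addr.is_reserved):
        return False
    if addr.version == 4 and addr in SHARED_ADDRESS_SPACE:
        return False
    # is_site_local exists only on IPv6 addresses (deprecated fec0::/10).
    if addr.version == 6 and addr.is_site_local:
        return False
    return True
```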
normalizeEmail(emailaddr, domainmap, domainrules, ignorePlus=True)
Normalise an email address according to the processing rules for its canonical originating domain.
The email address, emailaddr, will be parsed and validated, and then checked, via the canonicalizeEmailDomain() function, to confirm that it originated from one of the domains allowed to email requests for bridges to the EmailDistributor.
Parameters: |
|
---|---|
Raises: |
|
Return type: | |
Returns: | The validated, normalised email address, if it was from a permitted domain. Otherwise, returns an empty string. |
PortList(*args, **kwargs)
Bases: object
A container class for validated port ranges.
Variables: | ports (set) – All ports which have been added to this PortList. |
---|
Create a PortList.
Parameters: | args – Should match the portspec defined above. |
---|---|
Raises: | InvalidPort, if one of args doesn’t match port as defined above. |
PORTSPEC_LEN = 16
The maximum number of allowed ports per IP address.
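A container along these lines can be sketched as follows. Everything specific here is an assumption made for illustration, since the portspec grammar is not reproduced in this section: the class name PortListSketch, the accepted inputs (integers or comma-separated strings), the 0 to 65535 range, and the choice to raise InvalidPort when more than PORTSPEC_LEN ports are added.

```python
class InvalidPort(ValueError):
    """Stand-in for the documented exception (defined locally for this sketch)."""


class PortListSketch(object):
    PORTSPEC_LEN = 16  # documented maximum number of ports per IP address

    def __init__(self, *args):
        self.ports = set()
        for arg in args:
            self.add(arg)

    def add(self, spec):
        # Assumption: a spec is an int or a comma-separated string of ints.
        pieces = spec.split(",") if isinstance(spec, str) else [spec]
        for piece in pieces:
            try:
                port = int(piece)
            except (TypeError, ValueError):
                raise InvalidPort("Not a port number: %r" % (piece,))
            if not 0 <= port <= 65535:  # assumed valid range
                raise InvalidPort("Port out of range: %d" % port)
            self.ports.add(port)
        if len(self.ports) > self.PORTSPEC_LEN:
            # Assumption: exceeding the limit is treated as an error.
            raise InvalidPort("More than %d ports" % self.PORTSPEC_LEN)
```

Storing the ports in a set means that adding the same port twice has no effect.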
Parsers for Tor Bridge descriptors, including bridge-networkstatus documents, bridge-server-descriptors, and bridge-extrainfo descriptors.
DescriptorWarning - Raised when we parse a very odd descriptor.
deduplicate - Deduplicate a container of descriptors, keeping only the newest
descriptor for each router.
parseNetworkStatusFile - Parse a bridge-networkstatus document generated and
given to us by the BridgeAuthority.
parseServerDescriptorsFile - Parse a file containing
bridge-server-descriptors.
parseExtraInfoFiles - Parse (multiple) file(s) containing bridge-extrainfo
descriptors.
DescriptorWarning
Bases: exceptions.Warning
Raised when we parse a very odd descriptor.
parseNetworkStatusFile(filename, validate=True, skipAnnotations=True, descriptorClass=<class 'stem.descriptor.router_status_entry.RouterStatusEntryV3'>)
Parse a file which contains an @type bridge-networkstatus document.
See ticket #12254 for why networkstatus-bridges documents don’t look anything like the networkstatus v2 documents that they are purported to look like. They are missing all headers, and the entire footer (including authority signatures).
Parameters: |
|
---|---|
Raises: |
|
Return type: | |
Returns: | A list of
|
parseServerDescriptorsFile(filename, validate=True)
Open and parse filename, which should contain @type bridge-server-descriptor.
Note
We have to lie to Stem, pretending that these are @type server-descriptor, not @type bridge-server-descriptor. See ticket #11257.
Parameters: | |
---|---|
Return type: | |
Returns: | A list of stem.descriptor.server_descriptor.RelayDescriptor objects. |
deduplicate(descriptors, statistics=False)
Deduplicate some descriptors, returning only the newest for each router.
Note
If two descriptors for the same router are discovered, AND both descriptors have the same published timestamp, then the router’s fingerprint WILL BE LOGGED ON PURPOSE, because we assume that router to be broken or malicious.
Parameters: |
|
---|---|
Return type: | |
Returns: | A dictionary mapping router fingerprints to their newest available descriptor. |
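The "keep only the newest descriptor per router" rule can be sketched with a dictionary keyed on fingerprint. This is an illustration only: the Descriptor namedtuple and the name deduplicate_sketch are hypothetical stand-ins for parsed descriptor objects, and the sketch omits both the statistics option and the logging of same-timestamp duplicates described in the note above.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical stand-in for a parsed descriptor.
Descriptor = namedtuple("Descriptor", ["fingerprint", "published"])


def deduplicate_sketch(descriptors):
    """Map each fingerprint to its most recently published descriptor."""
    newest = {}
    for descriptor in descriptors:
        current = newest.get(descriptor.fingerprint)
        if current is None or descriptor.published > current.published:
            newest[descriptor.fingerprint] = descriptor
    return newest
```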
parseExtraInfoFiles(*filenames, **kwargs)
Open filenames and parse any @type bridge-extrainfo-descriptor contained within.
Warning
This function will not check that the router-signature at the end of the extrainfo descriptor is valid. See bridgedb.bridges.Bridge._verifyExtraInfoSignature for a method for checking the signature. The signature cannot be checked here, because to do so, we would need the latest, valid, corresponding signing-key for the Bridge.
Note
This function will call deduplicate() to deduplicate the extrainfo descriptors parsed from all filenames.
Kwargs validate: | If there is a 'validate' keyword argument, its value will be passed along as the 'validate' argument to stem.descriptor.extrainfo_descriptor.BridgeExtraInfoDescriptor. The 'validate' keyword argument defaults to True, meaning that the hash digest stored in the router-digest line will be checked against the actual contents of the descriptor and the extrainfo document’s signature will be verified. |
---|---|
Return type: | dict |
Returns: | A dictionary mapping bridge fingerprints to their corresponding, deduplicated stem.descriptor.extrainfo_descriptor.RelayExtraInfoDescriptor. |
Utility functions for converting between various relay fingerprint formats, and checking their validity.
toHex - Convert a fingerprint from its binary representation to hexadecimal.
fromHex - Convert a fingerprint from hexadecimal to binary.
isValidFingerprint - Validate a fingerprint.
HEX_FINGERPRINT_LEN = 40
The required length for hexadecimal representations of the hash digest of a Tor relay’s public identity key (a.k.a. its fingerprint).
toHex()
(callable) Convert a value from binary to hexadecimal representation.
fromHex()
(callable) Convert a value from hexadecimal to binary representation.
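The conversions and the length check can be sketched with the standard binascii module. The names to_hex, from_hex, and is_valid_fingerprint are hypothetical and the bodies are assumptions for illustration, since toHex and fromHex are documented above only as callables and the body of isValidFingerprint is not shown.

```python
import binascii
import re

HEX_FINGERPRINT_LEN = 40  # as documented above


def to_hex(raw):
    """Convert bytes to a lowercase hexadecimal string."""
    return binascii.hexlify(raw).decode("ascii")


def from_hex(hexstr):
    """Convert a hexadecimal string back to bytes."""
    return binascii.unhexlify(hexstr)


def is_valid_fingerprint(fingerprint):
    """Return True for exactly HEX_FINGERPRINT_LEN hexadecimal characters."""
    pattern = "[0-9A-Fa-f]{%d}" % HEX_FINGERPRINT_LEN
    return re.fullmatch(pattern, fingerprint) is not None
```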
Parsers for HTTP and Email headers.
parseAcceptLanguage - Parse the contents of a client 'Accept-Language' header
parseAcceptLanguage(header)
Parse the contents of a client ‘Accept-Language’ header.
Parse the header in the following manner:
- If header is None or an empty string, return an empty list.
- Split the header string on any commas.
Parameters: | header (string) – The contents of an ‘Accept-Language’ header, i.e. as if taken from twisted.web.server.Request.getHeader. |
---|---|
Return type: | list |
Returns: | A list of language codes (with and without locales), in order of preference. |
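The parsing steps listed above can be sketched as follows. Only the empty-header check and the comma split come from the description above; dropping the ";q=" quality suffix, removing duplicates, and adding the bare language after each localized code are assumptions made so the result matches the documented return value (codes with and without locales). The name parse_accept_language is hypothetical, and the sketch treats header order as preference order instead of sorting by quality value.

```python
def parse_accept_language(header):
    """Return language codes from an Accept-Language value, in header order."""
    if not header:
        return []
    codes = []
    for item in header.split(","):
        # Assumption: discard any ";q=..." quality suffix.
        code = item.split(";", 1)[0].strip()
        if code and code not in codes:
            codes.append(code)
    result = []
    for code in codes:
        # Assumption: offer the bare language right after a localized code.
        for candidate in (code, code.split("-", 1)[0]):
            if candidate not in result:
                result.append(candidate)
    return result
```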
Parsers for bridge nicknames.
nicknames
|_ isValidRouterNickname - Determine if a nickname is according to spec
InvalidRouterNickname
Bases: exceptions.ValueError
Router nickname doesn’t follow tor-spec.
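A nickname check can be sketched with a regular expression. The Tor directory specification defines a nickname as between 1 and 19 alphanumeric characters; the function name is_valid_router_nickname and its boolean return value are assumptions for this example, since the signature of isValidRouterNickname is not shown in this section.

```python
import re

# Between 1 and 19 alphanumeric characters, per the Tor directory spec.
NICKNAME = re.compile(r"[A-Za-z0-9]{1,19}")


def is_valid_router_nickname(nickname):
    """Return True if ``nickname`` has the form the spec allows."""
    return NICKNAME.fullmatch(nickname) is not None
```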
Parsers for BridgeDB commandline options.
bridgedb.parse.options
|__ setConfig()
|__ getConfig() - Set/Get the config file path.
|__ setRundir()
|__ getRundir() - Set/Get the runtime directory.
|__ parseOptions() - Create the main options parser for BridgeDB.
|
\_ BaseOptions - Base options, included in all other options menus.
||
|\__ findRundirAndConfigFile() - Find the absolute path of the config
| file and runtime directory, or find
| suitable defaults.
|
|__ SIGHUPOptions - Menu to explain SIGHUP signal handling and usage.
|__ SIGUSR1Options - Menu to explain SIGUSR1 handling and usage.
|
|__ MockOptions - Suboptions for creating fake bridge descriptors for
| testing purposes.
\__ MainOptions - Main commandline options parser for BridgeDB.
setConfig(path)
Set the absolute path to the config file.
See BaseOptions.postOptions().
Parameters: | path (string) – The path to set. |
---|
getConfig()
Get the absolute path to the config file.
Return type: | string |
---|---|
Returns: | The path to the config file. |
setRundir(path)
Set the absolute path to the runtime directory.
See BaseOptions.postOptions().
Parameters: | path (string) – The path to set. |
---|
getRundir()
Get the absolute path to the runtime directory.
Return type: | string |
---|---|
Returns: | The path to the runtime directory. |
parseOptions()
Create the main options parser and its subcommand parsers.
Any UsageErrors raised due to invalid options are caught: the error message is printed and the program exits.
Return type: | MainOptions |
---|---|
Returns: | The main options parsing class, with any commandline arguments already parsed. |
BaseOptions
Bases: twisted.python.usage.Options
Base options included in all main and sub options menus.
Create an options parser. All flags, parameters, and attributes of this base options parser are inherited by all child classes.
longdesc = u'BridgeDB is a proxy distribution system for\n private relays acting as bridges into the Tor network. See `bridgedb\n <command> --help` for addition help.'
optParameters = [[u'config', u'c', None, u'Configuration file [default: <rundir>/bridgedb.conf]'], [u'rundir', u'r', None, u"Change to this directory before running. [default: `os.getcwd()']\n\n All other paths, if not absolute, should be relative to this path.\n This includes the config file and any further files specified within\n the config file.\n "]]
opt_q()
Decrease verbosity
opt_v()
Increase verbosity
findRundirAndConfigFile(rundir=None, config=None)
Find the absolute path of the config file and runtime directory, or find suitable defaults.
Attempts to set the absolute path of the runtime directory. If the config path is relative, its absolute path is set relative to the runtime directory path, unless it starts with ‘.’ or ‘..’, in which case it is interpreted relative to the current working directory. If the path to the config file is absolute, it is left alone.
Parameters: |
|
---|---|
Raises: | twisted.python.usage.UsageError if either the runtime directory or the config file cannot be found. |
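The path-resolution rules described above can be sketched with os.path. This is an illustration under assumptions: resolve_paths is a hypothetical name, the defaults (the current working directory and <rundir>/bridgedb.conf) are taken from the optParameters help text, and the sketch does not check that the paths exist, so it never raises UsageError.

```python
import os


def resolve_paths(rundir=None, config=None):
    """Return (absolute rundir, absolute config path) per the rules above."""
    rundir = os.path.abspath(rundir or os.getcwd())
    config = config or "bridgedb.conf"
    if os.path.isabs(config):
        # Absolute config paths are left alone.
        return rundir, config
    if config.startswith("."):
        # './x' and '../x' are interpreted relative to the working directory.
        return rundir, os.path.abspath(config)
    # Any other relative path is interpreted relative to the rundir.
    return rundir, os.path.join(rundir, config)
```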
postOptions()
Automatically called by parseOptions().
Determines appropriate values for the ‘config’ and ‘rundir’ settings.
MockOptions
Bases: bridgedb.parse.options.BaseOptions
Suboptions for creating necessary conditions for testing purposes.
Create an options parser. All flags, parameters, and attributes of this base options parser are inherited by all child classes.
optParameters = [[u'descriptors', u'n', 1000, u'Generate <n> mock bridge descriptor sets\n (types: netstatus, extrainfo, server)']]
SIGHUPOptions
Bases: bridgedb.parse.options.BaseOptions
Options menu to explain usage and handling of SIGHUP signals.
Create an options parser. All flags, parameters, and attributes of this base options parser are inherited by all child classes.
longdesc = u'If you send a SIGHUP to a running BridgeDB process, the\n servers will parse and reload all bridge descriptor files into the\n databases.\n\n Note that this command WILL NOT handle sending the signal for you; see\n signal(7) and kill(1) for additional help.'
SIGUSR1Options
Bases: bridgedb.parse.options.BaseOptions
Options menu to explain usage and handling of SIGUSR1 signals.
Create an options parser. All flags, parameters, and attributes of this base options parser are inherited by all child classes.
longdesc = u'If you send a SIGUSR1 to a running BridgeDB process, the\n servers will dump all bridge assignments by distributor from the\n databases to files.\n\n Note that this command WILL NOT handle sending the signal for you; see\n signal(7) and kill(1) for additional help.'
MainOptions
Bases: bridgedb.parse.options.BaseOptions
Main commandline options parser for BridgeDB.
Create an options parser. All flags, parameters, and attributes of this base options parser are inherited by all child classes.
optFlags = [[u'dump-bridges', u'd', u'Dump bridges by hashring assignment into files'], [u'reload', u'R', u'Reload bridge descriptors into running servers']]
subCommands = [[u'mock', None, <class 'bridgedb.parse.options.MockOptions'>, u'Generate a testing environment'], [u'SIGHUP', None, <class 'bridgedb.parse.options.SIGHUPOptions'>, u'Reload bridge descriptors into running servers'], [u'SIGUSR1', None, <class 'bridgedb.parse.options.SIGUSR1Options'>, u'Dump bridges by hashring assignment into files']]
Parsers for Tor version number strings.
Version - Holds, parses, and does comparison operations for package
version numbers.
InvalidVersionStringFormat
Bases: exceptions.ValueError
Raised when a version string is not in a parseable format.
Version(version, package=None)
Bases: twisted.python.versions.Version
Holds, parses, and does comparison operations for version numbers.
Attr str package: | The package name, if available. |
---|---|
Attr int major: | The major version number. |
Attr int minor: | The minor version number. |
Attr int micro: | The micro version number. |
Attr str prerelease: | The prerelease specifier isn’t always present, though when it is, it’s usually separated from the main major.minor.micro part of the version string with a -, +, or # character. Sometimes the prerelease is another number, although often it can be a word specifying the release state, e.g. +alpha, -rc2, etc. |
Create a version object.
Comparisons may be computed between instances of Version.
>>> from bridgedb.parse.versions import Version
>>> v1 = Version("0.2.3.25", package="tor")
>>> v1.base()
'0.2.3.25'
>>> v1.package
'tor'
>>> v2 = Version("0.2.5.1-alpha", package="tor")
>>> v2
Version(package=tor, major=0, minor=2, micro=5, prerelease=1-alpha)
>>> v1 == v2
False
>>> v2 > v1
True
Parameters: |
---|
base()
Get the base version number (with prerelease).
Return type: | string |
---|---|
Returns: | A version number, without the package/program name, and with the prefix (if available). For example: ‘0.2.5.1-alpha’. |
getPrefixedPrerelease(separator='.')
Get the prerelease string, prefixed by the separator.
Parameters: | separator (string) – The separator to use between the rest of the version string and the prerelease string. |
---|---|
Return type: | string |
Returns: | The separator plus the prerelease string, e.g. ‘.1-alpha’. |
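The way a version string splits into the attributes above can be sketched in a few lines. This is an illustration, not the Version class: parse_version and prefixed_prerelease are hypothetical names, the split follows the doctest earlier in this section (where "0.2.5.1-alpha" yields major=0, minor=2, micro=5, prerelease=1-alpha), and returning an empty string when there is no prerelease is an assumption.

```python
def parse_version(version):
    """Split 'major.minor.micro[.prerelease]' into its four parts."""
    major, minor, micro, *rest = version.split(".", 3)
    prerelease = rest[0] if rest else ""
    return int(major), int(minor), int(micro), prerelease


def prefixed_prerelease(prerelease, separator="."):
    """Return the separator plus the prerelease, or '' if there is none."""
    return separator + prerelease if prerelease else ""
```

Comparing the integer parts as a tuple gives the same ordering as the doctest, where 0.2.5 sorts after 0.2.3.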