pyff Package

pyFF is a SAML metadata aggregator.

constants Module

Useful constants for pyFF. Mostly XML namespace declarations.

decorators Module

Various decorators used in pyFF.

pyff.decorators.cached(typed=False, ttl=None, hash_key=None)
pyff.decorators.deprecated(logger=<pyff.logs.PyFFLogger object at 0x2ba775975e50>)

This is a decorator which can be used to mark functions as deprecated. It will result in a warning being emitted when the function is used.

pyff.decorators.retry(ex, tries=4, delay=3, backoff=2, logger=<pyff.logs.PyFFLogger object at 0x2ba775975e50>)

Retry calling the decorated function using exponential backoff.

  • ex (Exception or tuple) – the exception to check; may be a tuple of exceptions to check
  • tries (int) – number of times to try (not retry) before giving up
  • delay (int) – initial delay between retries in seconds
  • backoff (int) – backoff multiplier e.g. value of 2 will double the delay each retry
  • logger (logging.Logger instance) – logger to use. If None, print
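
The parameters above can be illustrated with a minimal sketch of a retry decorator. This is not pyFF's own code: it is a stand-alone approximation that assumes the documented semantics (tries counts all attempts, the delay is multiplied by backoff after each failure) and assumes the logger exposes a warning() method.

```python
import functools
import time


def retry(ex, tries=4, delay=3, backoff=2, logger=None):
    """Illustrative retry decorator; an approximation, not pyFF's implementation."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            # All attempts except the last swallow the exception and sleep.
            for _ in range(tries - 1):
                try:
                    return func(*args, **kwargs)
                except ex as err:
                    msg = "%s, retrying in %s seconds" % (err, wait)
                    if logger is not None:
                        logger.warning(msg)  # assumed logger method
                    else:
                        print(msg)
                    time.sleep(wait)
                    wait *= backoff
            # Final attempt: any exception propagates to the caller.
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

With tries=4, delay=3 and backoff=2 the sketch sleeps 3, 6 and 12 seconds between the four attempts.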

index Module

class pyff.index.EntitySet(initial=None)

Bases: _abcoll.MutableSet

class pyff.index.MDIndex

Bases: object

Interface for metadata index providers


Index the entity

get(a, v)

Obtains a list of entities that have a=v.

  • a
  • v


Removes the entity from the index.
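
To make the index interface concrete, here is a small in-memory sketch of an attribute index with add, get and remove operations. The class name TinyIndex, the method names add and remove, and the use of plain strings and dicts in place of EntityDescriptor elements are all assumptions made for illustration; only get(a, v) is taken from the documentation above.

```python
from collections import defaultdict


class TinyIndex:
    """Hypothetical attribute index: maps (attribute, value) to entity ids."""

    def __init__(self):
        # attribute name -> attribute value -> set of entity ids
        self._index = defaultdict(lambda: defaultdict(set))

    def add(self, entity_id, attributes):
        """Index an entity under each of its attribute/value pairs."""
        for a, v in attributes.items():
            self._index[a][v].add(entity_id)

    def get(self, a, v):
        """Return the sorted ids of all entities that have a=v."""
        return sorted(self._index.get(a, {}).get(v, ()))

    def remove(self, entity_id):
        """Drop an entity from every bucket it appears in."""
        for values in self._index.values():
            for ids in values.values():
                ids.discard(entity_id)
```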

class pyff.index.MemoryIndex

Bases: pyff.index.MDIndex

get(a, v)
pyff.index.hash_id(entity, hn='sha1', prefix=True)

locks Module - Read-Write lock thread lock implementation

See the class documentation for more info.

Copyright (C) 2007, Heiko Wundram. Released under the BSD-license.

class pyff.locks.ReadWriteLock

Bases: object

Read-Write lock class. A read-write lock differs from a standard threading.RLock() by allowing multiple threads to simultaneously hold a read lock, while allowing only a single thread to hold a write lock at the same point of time.

When a read lock is requested while a write lock is held, the reader is blocked; when a write lock is requested while another write lock is held or there are read locks, the writer is blocked.

Writers are always preferred by this implementation: if there are blocked threads waiting for a write lock, current readers may request more read locks (which they eventually should free, as they starve the waiting writers otherwise), but a new thread requesting a read lock will not be granted one, and block. This might mean starvation for readers if two writer threads interweave their calls to acquireWrite() without leaving a window only for readers.

In case a current reader requests a write lock, this can and will be satisfied without giving up the read locks first, but only one thread may perform this kind of lock upgrade, as a deadlock would otherwise occur. After the write lock has been granted, the thread will hold a full write lock, and not be downgraded after the upgrading call to acquireWrite() has been matched by a corresponding release().

acquireRead(blocking=True, timeout=None)

Acquire a read lock for the current thread, waiting at most timeout seconds or doing a non-blocking check in case timeout is <= 0.

  • In case timeout is None, the call to acquireRead blocks until the lock request can be serviced.
  • In case the timeout expires before the lock could be serviced, a RuntimeError is thrown.

Acquire a write lock for the current thread, waiting at most timeout seconds or doing a non-blocking check in case timeout is <= 0.

  • In case the write lock cannot be serviced due to the deadlock condition mentioned above, a ValueError is raised.
  • In case timeout is None, the call to acquireWrite blocks until the lock request can be serviced.
  • In case the timeout expires before the lock could be serviced, a RuntimeError is thrown.

Yields a read lock


Release the currently held lock.

  • In case the current thread holds no lock, a ValueError is thrown.

Yields a write lock
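
The writer-preference rule described for this class can be sketched with a condition variable. The class below is a deliberately simplified illustration, not the pyff.locks code: the name SimpleReadWriteLock and its snake_case methods are hypothetical, it does not support re-entrant locks or read-to-write upgrades, and read locks are counted globally rather than per thread.

```python
import threading


class SimpleReadWriteLock:
    """Writer-preferring read-write lock (no re-entrancy, no upgrades)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0          # number of read locks currently held
        self._writer = None        # ident of the thread holding the write lock
        self._waiting_writers = 0  # writers currently blocked in acquire_write

    def _wait_until(self, ready, timeout):
        # Caller holds self._cond; timeout=None blocks, timeout<=0 only checks.
        if not self._cond.wait_for(ready, timeout):
            raise RuntimeError("lock acquisition timed out")

    def acquire_read(self, timeout=None):
        with self._cond:
            # New readers yield to active writers and to waiting writers.
            self._wait_until(
                lambda: self._writer is None and self._waiting_writers == 0,
                timeout)
            self._readers += 1

    def acquire_write(self, timeout=None):
        with self._cond:
            self._waiting_writers += 1
            try:
                self._wait_until(
                    lambda: self._writer is None and self._readers == 0,
                    timeout)
            finally:
                self._waiting_writers -= 1
                # Wake readers that were held back by this waiting writer.
                self._cond.notify_all()
            self._writer = threading.get_ident()

    def release(self):
        with self._cond:
            if self._writer == threading.get_ident():
                self._writer = None
            elif self._readers > 0:
                self._readers -= 1
            else:
                raise ValueError("no lock held")
            self._cond.notify_all()
```

As in the documented class, a timeout raises RuntimeError and releasing without holding a lock raises ValueError.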

logs Module

class pyff.logs.PyFFLogger

Bases: object

class pyff.logs.SysLogLibHandler(facility)

Bases: logging.Handler

A logging handler that emits messages to syslog.syslog.

priority_map = {0: 5, 40: 3, 10: 5, 50: 2, 20: 5, 30: 4}

mdrepo Module

This is the implementation of the active repository of SAML metadata. The ‘local’ and ‘remote’ pipes operate on this.

class pyff.mdrepo.Event(dict=None, **kwargs)

Bases: UserDict.UserDict

class pyff.mdrepo.MDRepository(metadata_cache_enabled=False, min_cache_ttl='PT5M', store=None)

Bases: pyff.mdrepo.Observable

A class representing a set of SAML Metadata. Instances present as dict-like objects where the keys are URIs and values are EntitiesDescriptor elements containing sets of metadata.

annotate(e, category, title, message, source=None)
Add an ATOM annotation to an EntityDescriptor or an EntitiesDescriptor. This is a simple way to add non-normative text annotations to metadata, eg for the purpose of generating reports.
  • e – An EntityDescriptor or an EntitiesDescriptor element
  • category – The ATOM category
  • title – The ATOM title
  • message – The ATOM content
  • source – An optional source URL. It is added as a <link> element with @rel=’saml-metadata-source’
check_signature(t, key)
display(entity, langs=None)

Utility-method for computing a displayable string for a given entity.

Parameters:entity – An EntityDescriptor element
entity_set(entities, name, cacheDuration=None, validUntil=None, validate=True)
  • entities – a set of entity specifiers (lookup is used to find entities from this set)
  • name – the @Name attribute
  • cacheDuration – an XML timedelta expression, eg PT1H for 1 hour
  • validUntil – a relative time, eg 2w 4d 1h for 2 weeks, 4 days and 1 hour from now.

Produce an EntityDescriptors set from a list of entities. Optional Name, cacheDuration and validUntil are affixed.
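
The relative-time form used for validUntil can be read as a sum of unit-suffixed integers. The helper below is a hypothetical sketch, not part of pyFF; it handles only the three units shown in the example above (w, d, h) and rejects anything else.

```python
import re
from datetime import timedelta

# Only the units that appear in the documented example are supported here.
_UNITS = {"w": "weeks", "d": "days", "h": "hours"}


def parse_relative(expr):
    """Convert an expression such as '2w 4d 1h' into a timedelta."""
    delta = timedelta()
    for token in expr.split():
        match = re.fullmatch(r"(\d+)([wdh])", token)
        if match is None:
            raise ValueError("unrecognised token: %r" % token)
        amount, unit = int(match.group(1)), match.group(2)
        delta += timedelta(**{_UNITS[unit]: amount})
    return delta
```

Adding the result to the current time gives the absolute validUntil instant.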

ext_display(entity, langs=None)

Utility-method for computing a displayable string for a given entity.

Parameters:entity – An EntityDescriptor element

Return a list of the Extensions elements in the EntityDescriptor

Parameters:e – an EntityDescriptor
Returns:a list
fetch_metadata(resources, max_workers=5, stats=None, timeout=120, max_tries=5, validate=False)

Fetch a series of metadata URLs and optionally verify signatures.

  • resources – A list of tuples (url, cert-or-fingerprint, id, post-callback)
  • max_workers – The maximum number of parallel downloads to run
  • stats – A dictionary used for storing statistics. Useful for cherrypy cpstats
  • validate – Turn on or off schema validation

The list of resources is processed by first downloading the URL of each entry. If a cert-or-fingerprint is supplied it is used to validate the signature on the received XML. Two forms of XML are supported: SAML Metadata and XRD.

SAML metadata is (if valid and contains a valid signature) stored under the ‘id’ identifier (which defaults to the URL unless provided in the resource entry).

XRD elements are processed thus: for all <Link> elements that contain a ds:KeyInfo element with an X509Certificate and where the <Rel> element contains the string ‘urn:oasis:names:tc:SAML:2.0:metadata’, the corresponding <URL> element is downloaded and verified.
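
The interplay of max_workers, max_tries and the default id can be sketched with a thread pool. This is an illustration only: fetch_all and its fetch argument are hypothetical, the download and signature check are delegated to the caller-supplied callable, and each resource is reduced to a (url, cert-or-fingerprint, id) entry.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(resources, fetch, max_workers=5, max_tries=5):
    """Fetch every (url, verify, rid) entry in parallel.

    `fetch(url, verify)` is supplied by the caller and is expected to
    return the downloaded document or raise IOError on failure.
    """
    def attempt(resource):
        url, verify, rid = resource
        last_error = None
        for _ in range(max_tries):
            try:
                # The id defaults to the URL when none is provided.
                return (rid or url), fetch(url, verify)
            except IOError as err:
                last_error = err
        raise last_error

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(attempt, resources))
```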

filter_invalids(t, base_url, validation_errors)
load_dir(directory, ext='.xml', url=None, validate=False, post=None, description=None)
  • directory – A directory to walk.
  • ext – Include files with this extension (default .xml)

Traverse a directory tree looking for metadata. Files ending in the specified extension are included. Directories starting with ‘.’ are excluded.

lookup(member, xp=None)

Lookup elements in the working metadata repository

  • member (basestring) – A selector (cf below)
  • xp (basestring) – An optional xpath filter

An iterable of EntityDescriptor elements

Return type:


Selector Syntax

  • selector “+” selector
  • [sourceID] “!” xpath
  • attribute=value or {attribute}value
  • entityID
  • sourceID (@Name)
  • <URL containing one selector per line>

The first form results in the intersection of the results of doing a lookup on the selectors. The second form results in the EntityDescriptor elements from the source (defaults to all EntityDescriptors) that match the xpath expression. The attribute-value forms result in the EntityDescriptors that contain the specified entity attribute pair. If none of these forms apply, the lookup is done using either source ID (normally @Name from the EntitiesDescriptor) or the entityID of single EntityDescriptors. If member is a URI but isn’t part of the metadata repository then it is fetched and treated as a list of selectors (one per line). If all else fails an empty list is returned.
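
The order in which the selector forms are tried can be sketched as a small classifier. The function classify_selector is hypothetical and much cruder than a real lookup: it only splits the string, performs no repository access, and will misclassify entityIDs or URLs that happen to contain “+” or “=”.

```python
def classify_selector(member):
    """Return (kind, parts) for a selector string, trying the forms in order."""
    if "+" in member:
        # selector "+" selector: intersection of the individual lookups
        return "intersection", [s.strip() for s in member.split("+")]
    if "!" in member:
        # [sourceID] "!" xpath: an empty source means all EntityDescriptors
        source, xpath = member.split("!", 1)
        return "xpath", [source or None, xpath]
    if member.startswith("{") and "}" in member:
        # {attribute}value
        attribute, value = member[1:].split("}", 1)
        return "attribute", [attribute, value]
    if "=" in member:
        # attribute=value
        attribute, value = member.split("=", 1)
        return "attribute", [attribute, value]
    # Anything else is treated as an entityID, a sourceID or a URL.
    return "id", [member]
```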

merge(t, nt, strategy=pyff.merge_strategies.replace_existing, strategy_name=None)
  • t – The EntitiesDescriptor element to merge into
  • nt – The EntitiesDescriptor element to merge from
  • strategy – A callable implementing the merge strategy pattern
  • strategy_name – The name of a strategy to import. Overrides the callable if present.

Two EntitiesDescriptor elements are merged - the second into the first. For each element in the second collection that is present (using the @entityID attribute as key) in the first the strategy callable is called with the old and new EntityDescriptor elements as parameters. The strategy callable thus must implement the following pattern:

  • old_e – The EntityDescriptor from t
  • e – The EntityDescriptor from nt

A merged EntityDescriptor element

Before each call to strategy old_e is removed from the MDRepository index and after merge the resultant EntityDescriptor is added to the index before it is used to replace old_e in t.
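
The strategy pattern can be illustrated without pyFF by using the standard library's ElementTree. The sketch below leaves out the index bookkeeping described above, and it assumes that entities found only in nt are appended to t, which the text does not state explicitly.

```python
import xml.etree.ElementTree as ET

MD = "urn:oasis:names:tc:SAML:2.0:metadata"


def replace_existing(old_e, e):
    """Strategy: the incoming EntityDescriptor wins."""
    return e


def merge(t, nt, strategy=replace_existing):
    """Merge the EntityDescriptor children of nt into t, keyed on @entityID."""
    tag = "{%s}EntityDescriptor" % MD
    existing = {e.get("entityID"): e for e in t.findall(tag)}
    for e in nt.findall(tag):
        old_e = existing.get(e.get("entityID"))
        if old_e is None:
            t.append(e)  # assumption: unseen entities are simply added
        else:
            # Replace old_e in place with whatever the strategy returns.
            position = list(t).index(old_e)
            t.remove(old_e)
            t.insert(position, strategy(old_e, e))
    return t
```

A different policy, such as keeping the old element, only requires passing another callable with the same (old_e, e) signature.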

parse_metadata(source, key=None, base_url=None, fail_on_error=False, filter_invalid=True, validate=True, validation_errors=None, expiration=None, post=None)
Parse a piece of XML and split it up into EntityDescriptor elements. Each such element
is stored in the MDRepository instance.
  • source – a file-like object containing SAML metadata
  • key – a certificate (file) or a SHA1 fingerprint to use for signature verification
  • base_url – use this base url to resolve relative URLs for XInclude processing
  • fail_on_error – (default: False)
  • filter_invalid – (default True) remove invalid EntityDescriptor elements rather than raise an error
  • validate – (default: True) set to False to turn off all XML schema validation
  • post – A callable that will be called to modify the parse-tree before any validation (but after xinclude processing)


A very basic test for sanity. An empty metadata set is probably not a sane output of any process.

Returns:True iff there is at least one EntityDescriptor in the active set.
search(query=None, path=None, page=None, page_limit=10, entity_filter=None, related=None)
  • query – A string to search for.
  • path – The repository collection (@Name) to search in - None for search in all collections
  • page – When using paged search, the page index
  • page_limit – When using paged search, the maximum number of entries per page
  • entity_filter – An optional lookup expression used to filter the entries before search is done.
  • related – an optional ‘+’-separated list of related domain names for prioritizing search results

Returns a list of dicts, one for each EntityDescriptor present in the metadata store such that any of the DisplayName, ServiceName, OrganizationName or OrganizationDisplayName elements match the query (that is, contain the query as a substring).

Each dict in the list contains three items:

  • title – A displayable string, useful as a UI label
  • value – The entityID of the EntityDescriptor
  • id – A sha1-ID of the entityID, of the form {sha1}<sha1-hash-of-entityID>
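
The {sha1} identifier form can be reproduced with hashlib. The helper name sha1_id is hypothetical, and the sketch assumes that the hash is the hexadecimal SHA-1 digest of the UTF-8 encoded entityID.

```python
import hashlib


def sha1_id(entity_id, prefix=True):
    """Return the hex SHA-1 of an entityID, optionally prefixed with {sha1}."""
    digest = hashlib.sha1(entity_id.encode("utf-8")).hexdigest()
    return "{sha1}" + digest if prefix else digest
```
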
set_entity_attributes(e, d, nf='urn:oasis:names:tc:SAML:2.0:attrname-format:uri')

Set an entity attribute on an EntityDescriptor


Raises MetadataException unless e is an EntityDescriptor element.

set_pubinfo(e, publisher=None, creation_instant=None)
set_reginfo(e, policy=None, authority=None)
Parameters:uri – An EntitiesDescriptor URI present in the MDRepository
Returns:an information dict

Returns a dict object with basic information about the EntitiesDescriptor

class pyff.mdrepo.Observable

Bases: object


mdx Module

An implementation of draft-lajoie-md-query

Usage: pyffd <options> {pipeline-files}+

        Turn off caching
-p <pidfile>
        Write a pidfile at the specified location
        Run in foreground
        Restart pyffd if any of the pipeline files change
--log=<log> | -l<log>
        Set to either a file or syslog:<facility> (eg syslog:auth)
--error-log=<log> | --access-log=<log>
        As --log but only affects the error or access log streams.
        Set logging level
        Listen on the specified port
        Listen on the specified interface
        Use redis-based store
        Wake up every <seconds> and run the update pipeline. By
        default the frequency is set to 600.
        Add the mapping 'name: uri' to the toplevel URL alias
        table. This causes URLs on the form http://server/<name>/x
        to be processed as http://server/metadata/{uri}x. The
        default alias table is presented at http://server
        Chdir into <dir> after the server starts up.
        The service is running behind a proxy - respect the X-Forwarded-Host header.
-m <module>|--modules=<module>
        Load a module

        One or more pipeline files
class pyff.mdx.DirPlugin(bus, d=None)

Bases: cherrypy.process.plugins.SimplePlugin

class pyff.mdx.EncodingDispatcher(prefixes, enc, next_dispatcher=<cherrypy._cpdispatch.Dispatcher object at 0x2ba775b4cc90>)

Bases: object

Cherrypy ass-u-me-s a lot about how requests are processed. In particular it is difficult to send something that contains ‘/’ and ‘:’ (like a URL) through the standard dispatchers. This class provides a workaround by base64-encoding the troubling stuff and sending the result through the normal dispatch pipeline. At the other end base64-encoded data is unpacked.
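
The workaround amounts to a reversible encoding of a path segment. The two helpers below are a hypothetical sketch of that idea; they use the URL-safe base64 alphabet, which is an assumption and may differ from the encoding the dispatcher actually applies.

```python
import base64


def encode_segment(value):
    """Encode a string so it contains no '/' or ':' characters."""
    return base64.urlsafe_b64encode(value.encode("utf-8")).decode("ascii")


def decode_segment(segment):
    """Reverse encode_segment at the receiving end."""
    return base64.urlsafe_b64decode(segment.encode("ascii")).decode("utf-8")
```

An entityID such as a full URL can then travel as a single path segment and be unpacked before the lookup is performed.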

class pyff.mdx.MDRoot(server)

Bases: object

The root application of pyFF. The root application assembles the MDStats and WellKnown classes with an MDServer instance.


The ‘about’ page. Contains links to statistics etc.

default(*args, **kwargs)

The default request processor unpacks base64-encoded requests and passes them on to the MDServer.request handler.


Process an MDX request with Content-Type hard-coded to application/xml. Regardless of the suffix you will get XML back from /entities/...


Returns the pyff icon (the alchemic symbol for sublimation).


Alias for /metadata

memory = <pyff.mdx.NotImplementedFunction object at 0x2ba775b69b50>

The main request entry point. Any requests are subject to content negotiation based on Accept headers and based on file name extension. Requesting /metadata/foo.xml gets you (signed) XML (assuming your pipeline contains that mode), requesting /metadata/foo.json gets you json, and /metadata/foo.ds gets you a discovery interface based on the IdPs found in ‘foo’. Here ‘foo’ is any supported lookup expression.


The /reset page clears all local browser settings for the device. After visiting this page users of the discovery service will see a “new device” page.


Returns a robots.txt that disables all robots.

search(paged=False, query=None, page=0, page_limit=10, entity_filter=None, related=None)

Search the active set for matching entities.

  • paged – page the result when True
  • query – the string query
  • page – the page to return of the paged result
  • page_limit – the number of results per page
  • entity_filter – an optional filter to apply to the active set before searching
  • related – an optional ‘+’-separated list of related domain names for prioritizing search results

Returns:a JSON-formatted search result

The /settings page documents the (non) use of cookies.

static(*a, **kw)
stats = <pyff.mdx.MDStats object at 0x2ba775b69a50>
class pyff.mdx.MDServer(pipes=None, autoreload=False, frequency=600, aliases=None, cache_enabled=True, observers=None, store=None)

Bases: object

The MDServer class is the business logic of pyFF. This class is isolated from the request-decoding logic of MDRoot and from the ancillary classes like MDStats and WellKnown.

class MediaAccept

Bases: object


The main request processor. This code implements all rendering of metadata.

class pyff.mdx.MDStats

Bases: cherrypy.lib.cpstats.StatsPage

Renders the standard stats page with pyFF style decoration. We use the lxml html parser to locate the body and replace it with a ‘<div>’. The result is passed as the content using the ‘basic’ template.

class pyff.mdx.MDUpdate(bus, frequency=600, server=None)

Bases: cherrypy.process.plugins.Monitor

class pyff.mdx.NotImplementedFunction(message)

Bases: object

class pyff.mdx.WellKnown(server=None)

Bases: object

Implementation of the .well-known URL namespace for pyFF. In particular this contains the webfinger implementation which returns information about up- and downstream metadata.

webfinger(resource=None, rel=None)
An implementation of the webfinger protocol in order to provide information about up- and downstream metadata available at this pyFF instance.


# curl http://localhost:8080/.well-known/webfinger?resource=http://localhost:8080

This should result in a JSON structure that looks something like this:

{"expires": "2013-04-13T17:40:42.188549",
 "links": [
    {"href": "", "rel": "urn:oasis:names:tc:SAML:2.0:metadata"},
    {"href": "", "rel": "disco-json"}],
 "subject": ""}

Depending on which version of pyFF you’re running and the configuration you may also see downstream metadata listed using the ‘role’ attribute to the link elements.
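
A client can pick the links it needs out of such a response by their rel value. The sketch below parses the sample structure shown above exactly as printed (including its empty href values); the helper names links_by_rel and expiry are hypothetical.

```python
import json
from datetime import datetime

SAMPLE = """{"expires": "2013-04-13T17:40:42.188549",
 "links": [
    {"href": "", "rel": "urn:oasis:names:tc:SAML:2.0:metadata"},
    {"href": "", "rel": "disco-json"}],
 "subject": ""}"""


def links_by_rel(document, rel):
    """Return the href of every link in the response with the given rel."""
    links = json.loads(document).get("links", [])
    return [link["href"] for link in links if link.get("rel") == rel]


def expiry(document):
    """Parse the 'expires' timestamp of the response into a datetime."""
    return datetime.strptime(json.loads(document)["expires"],
                             "%Y-%m-%dT%H:%M:%S.%f")
```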


The main entrypoint for the pyffd command.

merge_strategies Module

Merge strategies

pyff.merge_strategies.remove(e1, e2)
pyff.merge_strategies.replace_existing(e1, e2)

stats Module

pyFF statistics module

pyff.stats.set_metadata_info(name, info)

utils Module

This module contains various utilities.

class pyff.utils.EntitySet(initial=None)

Bases: object

exception pyff.utils.MetadataException

Bases: exceptions.Exception

exception pyff.utils.MetadataExpiredException

Bases: pyff.utils.MetadataException

exception pyff.utils.PyffException

Bases: exceptions.Exception

class pyff.utils.ResourceResolver

Bases: lxml.etree.Resolver

resolve(system_url, public_id, context)

Resolves URIs using the resource API

pyff.utils.avg_domain_distance(d1, d2)
pyff.utils.ddist(a, b)
pyff.utils.dumptree(t, pretty_print=False, xml_declaration=True)

Return a string representation of the tree, optionally pretty_print(ed) (default False)

Parameters:t – An ElementTree to serialize

Parameters:t – An EntitiesDescriptor or EntityDescriptor element

Returns the list of contained EntityDescriptor elements

pyff.utils.filter_lang(elts, langs=None)
pyff.utils.find_entity(t, e_id, attr='entityID')
pyff.utils.has_tag(t, tag)
pyff.utils.hash_id(entity, hn='sha1', prefix=True)
pyff.utils.hex_digest(data, hn='sha1')

Timestamp in ISO format


Current time in ISO format

pyff.utils.parse_xml(io, base_url=None)
pyff.utils.render_template(name, **kwargs)
pyff.utils.resource_filename(name, pfx=None)

Attempt to find and return the filename of the resource named by the first argument in the first location of:

  • as name in the current directory
  • as name in the pfx subdirectory of the current directory, if provided
  • as name relative to the package
  • as pfx/name relative to the package

The last two alternatives are used to locate resources distributed in the package. This includes certain XSLT and XSD files.

  • name – The string name of a resource
  • pfx – An optional prefix to use in searching for name
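
The search order can be sketched as a list of candidate paths that are probed in turn. The function find_resource and its package_dir argument are hypothetical: here the package location is passed in explicitly as a directory, whereas the real function resolves it from the installed package.

```python
import os


def find_resource(name, pfx=None, package_dir=None):
    """Return the first existing candidate path for name, or None."""
    candidates = [name]
    if pfx:
        candidates.append(os.path.join(pfx, name))
    if package_dir:
        candidates.append(os.path.join(package_dir, name))
        if pfx:
            candidates.append(os.path.join(package_dir, pfx, name))
    for path in candidates:
        if os.path.exists(path):
            return path
    return None
```
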
pyff.utils.resource_string(name, pfx=None)

Attempt to load and return the contents (as a string) of the resource named by the first argument in the first location of:

  • as name in the current directory
  • as name in the pfx subdirectory of the current directory, if provided
  • as name relative to the package
  • as pfx/name relative to the package

The last two alternatives are used to locate resources distributed in the package. This includes certain XSLT and XSD files.

  • name – The string name of a resource
  • pfx – An optional prefix to use in searching for name
pyff.utils.safe_write(fn, data)

Safely write data to a file with name fn.

  • fn – a filename
  • data – some data to write

Returns:True or False depending on the outcome of the write
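
One common way to make a write safe is to write to a temporary file in the same directory and then rename it over the target. The sketch below shows that technique under the assumption that data is text; the name atomic_write is hypothetical and the code is not taken from pyff.utils.

```python
import os
import tempfile


def atomic_write(fn, data):
    """Write data to fn via a temporary file; return True on success."""
    dirname = os.path.dirname(os.path.abspath(fn))
    tmp_path = None
    try:
        # The temporary file lives next to the target so the rename
        # stays on one filesystem.
        with tempfile.NamedTemporaryFile("w", dir=dirname, delete=False) as tmp:
            tmp_path = tmp.name
            tmp.write(data)
        os.replace(tmp_path, fn)
        return True
    except (OSError, IOError):
        if tmp_path and os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return False
```

Readers of fn then see either the old contents or the new contents, never a partially written file.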

pyff.utils.totimestamp(dt, epoch=datetime.datetime(1970, 1, 1, 0, 0))
pyff.utils.truncate_filter(s, max_len=10)
pyff.utils.xml_error(error_log, m=None)
pyff.utils.xslt_transform(t, stylesheet, params=None)