Notes on the repoze.urispace Parser Implementation

Parser Implementation Notes

  • The root node of a URISpace is not required to be any particular element, nor even to be in the URISpace namespace (see, for instance, the first example in “Appendix C” of the URISpace specification). For regularity, the root node is always mapped to a repoze.urispace.selectors.TrueSelector.
  • Create selectors for nodes based on their QNames, using a dictionary. Create predicates for the selectors (where required) from the urispace:match attribute.
  • Any node whose QName does not map to a selector type should be treated as an operator. The default operator type is replace, with overrides coming from the urispace:op attribute.
  • The QName of an operator element is used to look up a converter, which is then passed the entire element (including children), and must return a (key, value) pair.
  • The default converter, used if no other is registered for the operator node’s QName, returns the node’s QName as the key, and the node’s text as the value.
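
The operator and converter rules above can be sketched in a few lines of Python. This is only an illustration, not the code in repoze.urispace: the CONVERTERS registry, the function names, and the placeholder namespace URI are all assumptions made for the example, and the "append" value is just an illustrative override of the default operator type.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace for the sketch; NOT the real URISpace namespace URI.
NS = "urn:example:urispace"

# Hypothetical registry mapping an operator element's QName to a converter.
CONVERTERS = {}


def default_converter(element):
    """Return the element's QName as the key and its text as the value."""
    return element.tag, (element.text or "").strip()


def convert(element):
    """Look up a converter by QName, falling back to the default one."""
    converter = CONVERTERS.get(element.tag, default_converter)
    return converter(element)


def operator_type(element):
    """Return the operator type: 'replace' unless the op attribute overrides it."""
    return element.get("{%s}op" % NS, "replace")


doc = ET.fromstring(
    '<root xmlns:u="urn:example:urispace">'
    "<theme>blue</theme>"
    '<skin u:op="append">dark</skin>'
    "</root>"
)
theme, skin = doc[0], doc[1]
pair = convert(theme)
```

Here neither element has a registered converter, so both fall through to the default; only the second element overrides the operator type.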

Implementing the Spec

  • repoze.urispace implements “Scheme Selectors” (section 3.1) by combining a selector and a predicate:

    • repoze.urispace.selectors.PredicateSelector
    • repoze.urispace.predicates.SchemePredicate
  • Of the “Authority Selectors” (section 3.2), repoze.urispace implements the “Host” variant (section 3.2.2) by combining a selector and a predicate:

    • repoze.urispace.selectors.PredicateSelector
    • repoze.urispace.predicates.NethostPredicate

    repoze.urispace does not implement selectors for “Authority Name” (section 3.2.1) or “User” (section 3.2.3) at this time.

  • repoze.urispace implements “Path Segment Selectors” (section 3.3) by combining a selector and a predicate:

    • repoze.urispace.selectors.PredicateSelector
    • repoze.urispace.predicates.PathFirstPredicate

Note

The semantics of the path segment selector in the spec require matching only on the first element of the current path. repoze.urispace provides extensions which allow for matches on the last element of the current path, and for matches on any element of the current path. See Extending the Spec.
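
The difference between the three path matching modes can be shown with a minimal sketch. The function names below are hypothetical and are not the predicate classes themselves; the sketch also assumes plain string equality on segments and leaves out any wildcard handling.

```python
def split_path(path):
    """Split a URI path into its non-empty segments."""
    return [segment for segment in path.split("/") if segment]


def matches_first(pattern, segments):
    """Spec behaviour: compare only the first element of the current path."""
    return bool(segments) and segments[0] == pattern


def matches_last(pattern, segments):
    """Extension: compare only the last element of the current path."""
    return bool(segments) and segments[-1] == pattern


def matches_any(pattern, segments):
    """Extension: succeed if any element of the current path is equal."""
    return pattern in segments


segments = split_path("/news/2009/index.html")
```

With the path above, "news" matches only in first or any mode, "index.html" only in last or any mode, and "2009" only in any mode.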

  • repoze.urispace implements “Query Selectors” (section 3.4) by combining a selector and one of two predicates, based on whether the match string includes an =:
    • repoze.urispace.selectors.PredicateSelector
    • repoze.urispace.predicates.QueryKeyPredicate
    • repoze.urispace.predicates.QueryValuePredicate
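
The choice between the two query predicates can be sketched as a small factory that inspects the match string. How QueryKeyPredicate and QueryValuePredicate actually test the query string is not described here, so the matching logic below (key presence versus exact key and value) is an assumption, and make_query_predicate is a hypothetical name.

```python
from urllib.parse import parse_qs


def make_query_predicate(match):
    """Build a predicate over a raw query string from a match string.

    A match string containing '=' is treated as key=value; otherwise it is
    treated as a bare key whose presence is enough.
    """
    if "=" in match:
        key, value = match.split("=", 1)

        def value_predicate(query):
            # True when the key carries the expected value at least once.
            return value in parse_qs(query, keep_blank_values=True).get(key, [])

        return value_predicate

    def key_predicate(query):
        # True when the key appears at all, even with an empty value.
        return match in parse_qs(query, keep_blank_values=True)

    return key_predicate


has_debug = make_query_predicate("debug")
lang_is_en = make_query_predicate("lang=en")
```

Passing keep_blank_values=True keeps a bare key such as "debug" visible to the key test.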

Extending the Spec

The URISpace specification contemplates extension via what it calls “External Selectors” (see chapter 4). repoze.urispace in fact uses this facility to provide additional selectors:

  • repoze.urispace implements an extension to “Path Segment” selectors (section 3.3), allowing a match on the last element of the current path:
    • repoze.urispace.selectors.PredicateSelector
    • repoze.urispace.predicates.PathLastPredicate
  • repoze.urispace implements an extension to “Path Segment” selectors (section 3.3), allowing a match on any element of the current path:
    • repoze.urispace.selectors.PredicateSelector
    • repoze.urispace.predicates.PathAnyPredicate
  • repoze.urispace.selectors.TrueSelector always dispatches to contained elements; its primary use is to represent the root node of a URISpace.
  • repoze.urispace.selectors.FalseSelector never dispatches to contained elements. Its primary use is in “commenting out” sections of the URISpace.
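
The way a selector, a predicate, and the always-true and always-false cases fit together can be sketched as follows. The classes and functions here are hypothetical stand-ins written for this illustration; they do not reproduce the signatures of the repoze.urispace classes, and the dictionary-based request is an assumption.

```python
class SketchSelector:
    """Dispatches to its children only when its predicate accepts the request."""

    def __init__(self, predicate, children=()):
        self.predicate = predicate
        self.children = list(children)

    def visit(self, request, found):
        if self.predicate(request):
            for child in self.children:
                child.visit(request, found)


class SketchOperator:
    """Leaf node that stores a (key, value) pair, replacing any earlier value."""

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def visit(self, request, found):
        found[self.key] = self.value


def always(request):
    """Stand-in for the always-true case: children are always visited."""
    return True


def never(request):
    """Stand-in for the always-false case: children are never visited."""
    return False


def is_https(request):
    return request["scheme"] == "https"


tree = SketchSelector(always, [
    SketchOperator("theme", "plain"),
    SketchSelector(is_https, [SketchOperator("theme", "secure")]),
    SketchSelector(never, [SketchOperator("theme", "disabled")]),
])


def resolve(request):
    """Walk the tree and return the collected key/value pairs."""
    found = {}
    tree.visit(request, found)
    return found
```

The subtree guarded by never is effectively commented out: its operator is still part of the tree but can never contribute a value.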