September 06, 2010

shabti_auth_rdfalchemy – RDF-based auth’n’auth

This is essentially the shabti_auth – boilerplate authentication template, but using RDFAlchemy instead of SQLAlchemy. The identity model is expressed in RDF and persisted in an RDF triple store. Web app auth’n’auth behaviour remains consistent with that of the other Shabti auth’n’auth templates.

There is still work to be done here, and some further, if limited, exploration would be useful. Authentication does function, and the output is coherent and more or less as expected: a good starting position.

Note

The shabti_auth_rdfalchemy source code is in the Bitbucket code repository.

About RDFAlchemy

The goal of RDFAlchemy is to provide Python users with an object-type API access to an RDF triple store. In the same way that SQLAlchemy is an ORM (Object Relational Mapper) for relational database users, RDFAlchemy is an ORM (Object RDF Mapper) for RDF triple store users.
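
To make the "object-type API over triples" idea concrete, here is a toy sketch in plain Python. It is not RDFAlchemy's implementation: the `Single`, `Subject` and `Person` names are hypothetical, and the "graph" is just an in-memory set of (subject, predicate, object) tuples. It only illustrates how a descriptor can map an attribute to an RDF predicate.

```python
# Toy illustration only: NOT RDFAlchemy's implementation.
# The "graph" is a plain set of (subject, predicate, object) tuples.

FOAF = "http://xmlns.com/foaf/0.1/"


class Single:
    """Descriptor mapping one Python attribute to one RDF predicate."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        for s, p, o in obj.graph:
            if s == obj.uri and p == self.predicate:
                return o
        return None

    def __set__(self, obj, value):
        # Drop any existing triple for this subject/predicate, then add one.
        stale = {t for t in obj.graph
                 if t[0] == obj.uri and t[1] == self.predicate}
        obj.graph.difference_update(stale)
        obj.graph.add((obj.uri, self.predicate, value))


class Subject:
    graph = set()  # shared in-memory triple store (hypothetical)

    def __init__(self, uri, **attrs):
        self.uri = uri
        for name, value in attrs.items():
            setattr(self, name, value)

    @classmethod
    def filter_by(cls, **attrs):
        """Yield instances whose mapped attributes equal the given values."""
        for uri in sorted({s for s, _, _ in cls.graph}):
            candidate = cls(uri)
            if all(getattr(candidate, k) == v for k, v in attrs.items()):
                yield candidate


class Person(Subject):
    name = Single(FOAF + "name")
    mbox = Single(FOAF + "mbox")


admin = Person("urn:example:admin", name="admin", mbox="admin@example.org")
found = list(Person.filter_by(name="admin"))
```

Setting `admin.name` writes a triple into the set, and reading it back performs a lookup, which is the same shape of interaction the template's model classes have with a real triple store.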

RDFAlchemy supports several different methods of connecting to triple stores (either as the exclusive persistence mechanism or in parallel with SQLAlchemy connections to conventional relational tables).

Dependencies

As the overview diagram below suggests, the shabti_auth_rdfalchemy template requires certain dependencies to be satisfied. In much the same way that SQLAlchemy relies on psycopg2 to connect to PostgreSQL, RDFAlchemy relies on the rdflib RDF library to connect to triple stores.

../_images/rdfa_overview_l1.png

Satisfying the rdflib dependency can present problems. The recommended approach is to follow the rdflib web site’s explicit easy_install instructions:

$ easy_install -U "rdflib>=2.4,<=3.0a"

Note

rdflib is somewhat “bleeding-edge” and not for the faint-hearted. A sphinx-generated version of the rdflib docs is available.

The bottom line of the overview diagram shows that the choice of back-end storage is constrained by support limitations. When using rdflib to maintain triple stores, SQLite is the minimal functional requirement for persistence. However, performance is likely to become an early issue, because triple stores tend to expand alarmingly quickly and performance is critically dependent on fast indexing.

The shabti_auth_rdfalchemy template is intended to flatten the otherwise rather steep on-ramp to using Pylons as a foundation for exploring Semantic Web technology. However, there is some preliminary groundwork to be done. This template requires rdflib and one or more of the following packages: bsddb3, ZODB3 or MySQLdb in order to persist the identity model, or indeed to run at all.

From a practical perspective, MySQL is well-supported, ZODB and bsddb3 (Sleepycat) have proved to be unexpectedly quick (perhaps unhindered by RDBMS management processing) and PostgreSQL (untuned) is disappointingly slow. Neither of the Java-based triple stores, Jena and Sesame2, provides library access from Python, but both offer a REST API that a Pylons app can use.

Using the template

After successfully installing Shabti, additional paster templates will be available. Simply create a Shabti-configured project by specifying that paster should use the shabti_auth_rdfalchemy template:

$ paster create -t shabti_auth_rdfalchemy myproj

These are the option dialogue choices appropriate for the shabti_auth_rdfalchemy template, which uses mako templates and also offers an optional SQLAlchemy configuration (the two are orthogonal, use both if you wish) ...

(mako/genshi/jinja/etc: Template language) ['mako']:
(True/False: Include SQLAlchemy 0.4 configuration) [False]: True
(True/False: Setup default appropriate for Google App Engine) [False]:

Once the project has been created, navigate to the project directory.

The next step is to initialise the RDF Graph store by running the project setup script. Before that can happen, an appropriate RDF Graph store has to be selected by choosing (and uncommenting) one of the storage options listed in the files development.ini and setup.ini.

# rdfalchemy.dburi = zodb:///%(here)s/data/graph/Data.fs
# rdfalchemy.dburi = mysql://username@localhost/shabti_triplestore
# rdfalchemy.dburi = sleepycat:///%(here)s/data/graph
# rdfalchemy.create = True

Fairly obviously, the same store has to be selected in each file: you can’t have sleepycat in one and zodb in the other. setup.ini initialises the store and so has rdfalchemy.create = True. development.ini simply re-uses the pre-existing store, so it has the graph creation directive commented out to avoid expunging the data stored in the setup phase.
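
A quick way to check that the two files agree is to read the setting from both and compare. The following is a minimal sketch using only the standard library; the `[app:main]` section name, the `/srv/myproj` value substituted for `%(here)s` and the `read_store_settings` helper are assumptions for illustration, not part of the template.

```python
import configparser

# Hypothetical excerpts of the two ini files, with the sleepycat store chosen.
SETUP_INI = """
[app:main]
rdfalchemy.dburi = sleepycat:///%(here)s/data/graph
rdfalchemy.create = True
"""

DEVELOPMENT_INI = """
[app:main]
rdfalchemy.dburi = sleepycat:///%(here)s/data/graph
# rdfalchemy.create = True
"""


def read_store_settings(ini_text, here="/srv/myproj"):
    """Return (dburi, create) from the [app:main] section of an ini file."""
    # 'here' is supplied as a default so that %(here)s interpolates.
    parser = configparser.ConfigParser(defaults={"here": here})
    parser.read_string(ini_text)
    section = parser["app:main"]
    dburi = section.get("rdfalchemy.dburi")
    create = section.getboolean("rdfalchemy.create", fallback=False)
    return dburi, create


setup_uri, setup_create = read_store_settings(SETUP_INI)
dev_uri, dev_create = read_store_settings(DEVELOPMENT_INI)

# Both files must point at the same store; only setup.ini creates it.
same_store = setup_uri == dev_uri
```

In a real project the text would come from the files on disk (for example via `parser.read("setup.ini")`) rather than from inline strings.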

I recommend that you try to stay with the ZODB or Sleepycat options. Both are easy-installable (“ZODB3” and “bsddb3”). Either is a good solution for an RDF triplestore, as neither carries the added baggage of RDBMS management.

If you plan on storing a lot of triples (and that situation might arrive much faster than you expect), you might want to get a ZEO server running (for ZODB) or see if the mysql option will work for you.

Note

This setup-app incantation is different: it uses setup.ini rather than development.ini.

$ paster setup-app setup.ini

If successful, the setup script will print an XML rendering of the rdflib Graph to stdout:

<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
   xmlns:dc="http://purl.org/dc/elements/1.1/"
   xmlns:foaf="http://xmlns.com/foaf/0.1/"
   xmlns:purl="http://purl.org/dc/terms/"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns:usr="http://islab.hanyang.ac.kr/damls/User.daml#"
   xmlns:xmls="http://www.w3.org/2001/XMLSchema#"
>
  <rdf:Description rdf:nodeID="KAxsUVwF12">
    <purl:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime"
        >2009-01-31T14:47:00.285621</purl:created>
    <foaf:mbox>admin@example.org</foaf:mbox>
    <usr:Password>d033e22ae348aeb5660fc2140aec35850c4da997</usr:Password>
    <foaf:name>admin</foaf:name>
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/>
    <xmls:boolean rdf:datatype="http://www.w3.org/2001/XMLSchema#int"
        >1</xmls:boolean>
  </rdf:Description>
  <rdf:Description rdf:nodeID="KAxsUVwF14">
    <purl:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime"
        >2009-01-31T14:47:00.291638</purl:created>
    <foaf:Person rdf:nodeID="KAxsUVwF12"/>
    <foaf:Person rdf:nodeID="KAxsUVwF13"/>
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Group"/>
    <xmls:boolean rdf:datatype="http://www.w3.org/2001/XMLSchema#int"
        >1</xmls:boolean>
    <dc:description>Administration group</dc:description>
  </rdf:Description>
  <rdf:Description rdf:nodeID="KAxsUVwF13">
    <purl:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime"
        >2009-01-31T14:47:00.289455</purl:created>
    <foaf:mbox>gjh@example.org</foaf:mbox>
    <usr:Password>5ca27e75aea3e5e83a04c6cfa5f1b63d358cd03d</usr:Password>
    <foaf:name>gjh</foaf:name>
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/>
    <xmls:boolean rdf:datatype="http://www.w3.org/2001/XMLSchema#int"
        >1</xmls:boolean>
  </rdf:Description>
</rdf:RDF>
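
As a sanity check on that output, the agents in the graph can be listed with the standard library's XML parser. This is only a sketch over a trimmed copy of the output above; the `agent_names` helper is hypothetical, and a real RDF parser such as rdflib would be the more robust tool for anything beyond a quick look.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Trimmed copy of the setup output: two agents and one group.
SAMPLE = """<rdf:RDF
   xmlns:foaf="http://xmlns.com/foaf/0.1/"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:nodeID="KAxsUVwF12">
    <foaf:name>admin</foaf:name>
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/>
  </rdf:Description>
  <rdf:Description rdf:nodeID="KAxsUVwF14">
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Group"/>
  </rdf:Description>
  <rdf:Description rdf:nodeID="KAxsUVwF13">
    <foaf:name>gjh</foaf:name>
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Agent"/>
  </rdf:Description>
</rdf:RDF>"""


def agent_names(rdf_xml):
    """Return the foaf:name of every node typed as foaf:Agent."""
    root = ET.fromstring(rdf_xml)
    resource_attr = "{%s}resource" % NS["rdf"]
    names = []
    for node in root.findall("rdf:Description", NS):
        rdf_type = node.find("rdf:type", NS)
        if rdf_type is None:
            continue
        if rdf_type.get(resource_attr) == NS["foaf"] + "Agent":
            names.append(node.findtext("foaf:name", namespaces=NS))
    return names


names = agent_names(SAMPLE)
```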

After the graph has been initialised, start the Pylons web app with:

$ paster serve --reload development.ini

The Shabti RDFAlchemy auth’n’auth template’s variant on the standard Pylons welcome screen is browsable at http://localhost:5000/ ...

Welcome screen

../_images/shabti_auth_rdfalchemy_welcome.jpg

RDF Graph details (ZODB3 store)

This and the next screenshot show some metadata about the graph and the full graph serialized in n3, styled with pygments (using Phil Cooper’s n3 lexer). In the distribution, the Python code for this appears in the index action of controllers/demo.py but is commented out.

import pygments
from pygments import lexers, formatters
formatter = formatters.HtmlFormatter(lineseparator='<br />')
lexer = lexers.get_lexer_by_name('n3')
c.n3 = pygments.highlight(c.graph.serialize(format="n3"),
                          lexer, formatter)

../_images/shabti_auth_rdfalchemy_zodb.jpg

RDF Graph details (Sleepycat store)

Just for proof of concept. Switching between back-ends is as simple as switching the rdfalchemy.dburi value.
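
The scheme at the front of the rdfalchemy.dburi value is what selects the back-end. As a rough sketch of that idea (not RDFAlchemy's own parsing code), the standard library can split a dburi into its parts; the `describe_store` helper is hypothetical, and the scheme-to-package table simply restates the packages named in the Dependencies section.

```python
from urllib.parse import urlsplit

# Scheme -> Python package named in the Dependencies section.
REQUIRED_PACKAGE = {
    "zodb": "ZODB3",
    "sleepycat": "bsddb3",
    "mysql": "MySQLdb",
}


def describe_store(dburi):
    """Split a dburi into scheme, required package and store location."""
    parts = urlsplit(dburi)
    return {
        "scheme": parts.scheme,
        "package": REQUIRED_PACKAGE.get(parts.scheme),
        "location": parts.netloc + parts.path,
    }


zodb = describe_store("zodb:///tmp/graph/Data.fs")
mysql = describe_store("mysql://username@localhost/shabti_triplestore")
```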

../_images/shabti_auth_rdfalchemy_sleepycat.jpg

RDF Model data

Confirming authenticated access.

../_images/shabti_auth_rdfalchemy_model.jpg

Notes on the template

config/environment.py

Configuration bindings are retrieved, objects created and bound to keys in the pylons.config dictionary.

# Setup RDFAlchemy database engine
config['rdfalchemy.ra_engine'] = rdf_engine_from_config(config, 'rdfalchemy.')
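
The second argument is a key prefix. SQLAlchemy's engine_from_config uses the same convention: keys beginning with the prefix are collected and the prefix is stripped. Assuming rdf_engine_from_config behaves along the same lines, the selection step can be sketched as follows (the `options_with_prefix` helper is hypothetical):

```python
def options_with_prefix(config, prefix):
    """Collect the keys that start with ``prefix``, with the prefix stripped."""
    return {
        key[len(prefix):]: value
        for key, value in config.items()
        if key.startswith(prefix)
    }


# Hypothetical flattened ini settings, as they might appear in pylons.config.
config = {
    "rdfalchemy.dburi": "zodb:///tmp/data/graph/Data.fs",
    "rdfalchemy.create": "True",
    "sqlalchemy.url": "sqlite:///dev.db",
}

opts = options_with_prefix(config, "rdfalchemy.")
```

Because only prefixed keys are picked up, the rdfalchemy.* and sqlalchemy.* settings can sit side by side in the same ini file without interfering with each other.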

Pylons’ extremely low impedance means that, apart from the switch from relational to RDF store, there are only a few differences between the shabti_auth – boilerplate authentication and the shabti_auth_rdfalchemy – RDF-based auth’n’auth templates.

The following is the output from a diff of the demo controller showing that little change is involved in switching the identity model’s implementation from relational to graph:

controllers/demo.py

@@ -1,10 +1,14 @@
 # -*- coding: utf-8 -*-
 from pylons import tmpl_context as c
 from pylons.templating import render_mako
 from foobar.lib.base import *
-from foobar.model import *
+from foobar.model.rdfmodel import *
 from foobar.lib.decorators import authorize
 from foobar.lib.auth.permissions import SignedIn

 class DemoController(BaseController):

@@ -16,18 +20,24 @@
         pass

     def index(self):
-        c.users = Session.query(User).all()
-        c.groups = Session.query(Group).all()
-        c.permissions = Session.query(Permission).all()
+        c.users = User.filter_by(active=1)
+        c.groups = Group.filter_by(active=1)
+        c.permissions = []
         c.title = 'Test'
         return render('default.mak')

     # Need to protect just a single action?
     # Do it like this ....
     @authorize(SignedIn())
     def privindex(self):
-        c.users = Session.query(User).all()
-        c.groups = Session.query(Group).all()
-        c.permissions = Session.query(Permission).all()
+        c.users = User.filter_by(active=1)
+        c.groups = Group.filter_by(active=1)
+        c.permissions = []
         c.title = 'Test'
         return render('test.mak')

lib/base.py

The RDF graph is retrieved from the config and bound to convenience variables on each invocation of BaseController. The model.Session.remove() call in the finally clause is likely to be superfluous here, and the statement probably ought to be removed.

class BaseController(WSGIController):

    def __call__(self, environ, start_response):
        """Invoke the Controller"""

        # For convenience
        c.graph = rdfSubject.db = config['rdfalchemy.ra_engine']

        try:
            return WSGIController.__call__(self, environ, start_response)
        finally:
            model.Session.remove()

model/rdfmodel.py

This looks slightly different from the usual relational approach: the db (graph) is bound, namespaces are established and then the modelling may begin. Instead of choosing from a limited range of named database types, RDF modelling matches the semantics of each field with a concept drawn from an appropriate ontology (another graph, but of concepts instead of instances). Hence a ‘username’ is a foaf.name and a ‘created date’ is a purl.created (the Dublin Core terms “created” property).

foaf = Namespace("http://xmlns.com/foaf/0.1/")
usr = Namespace("http://islab.hanyang.ac.kr/damls/User.daml#")
purl = Namespace("http://purl.org/dc/terms/")
dc = Namespace("http://purl.org/dc/elements/1.1/")
cc = Namespace("http://web.resource.org/cc/")

class User(rdfSubject):
    rdf_type = foaf.Agent
    username = rdfSingle(foaf.name)
    password = rdfSingle(usr.Password)
    password_check = rdfSingle(usr.Password)
    email = rdfSingle(foaf.mbox)
    created = rdfSingle(purl.created)
    active = rdfSingle(URIRef('http://www.w3.org/2001/XMLSchema#boolean'))

    @classmethod
    def authenticate(cls, username, password):
        try:
            user = cls.get_by(username=username)
            if user and encrypt_value(password) == user.password:
                return user
        except Exception:
            raise NotAuthenticated
        raise NotAuthenticated
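
The 40-character hexadecimal usr:Password values in the setup output look like SHA-1 digests. Assuming that is what encrypt_value produces (an assumption, since the helper's source is not shown here), the comparison inside authenticate can be sketched with the standard library. Both functions below are hypothetical stand-ins, not the template's code.

```python
import hashlib
import hmac


def encrypt_value(value):
    """Hypothetical stand-in: hex SHA-1 digest of the UTF-8 encoded value."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


def password_matches(candidate, stored_digest):
    """Compare digests in constant time rather than with ``==``."""
    return hmac.compare_digest(encrypt_value(candidate), stored_digest)


# usr:Password of the 'admin' agent in the sample setup output.
stored = "d033e22ae348aeb5660fc2140aec35850c4da997"

ok = password_matches("admin", stored)
rejected = password_matches("not-the-password", stored)
```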

lib/auth/__init__.py

The presence of RDF in the equation prompts a couple of changes, shown here as a diff from the standard auth’n’auth file. On login, login stores the n3-serialized resUri of the User entity in the session, where get_user picks it up and retrieves the entity directly from the graph.

--- __init__.py_tmpl
+++ (clipboard)
@@ -10,14 +10,13 @@
     if _auth_user_environ_key not in request.environ:
         user_id = session.get(_auth_user_session_key)
         if user_id:
-            user = model.User.get_by(id = user_id, active = True)
-            request.environ[_auth_user_environ_key] = user
+            request.environ[_auth_user_environ_key] = User(user_id)
         else:
             request.environ[_auth_user_environ_key] = None
     return request.environ[_auth_user_environ_key]

 def login(user):
-    session[_auth_user_session_key] = str(user.id)
+    session[_auth_user_session_key] = user.resUri.n3()
     session.save()

 def logout():
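
In n3, a URI reference is written as `<uri>` and a blank node as `_:id`, so the stored string carries enough information to identify the node again. The round trip can be sketched without rdflib; the `to_n3` and `from_n3` helpers, the `(kind, value)` pairs and the `auth_user` session key are all hypothetical, and a plain dict stands in for the Pylons session.

```python
def to_n3(identifier):
    """Render a (kind, value) pair the way n3 does: <uri> or _:blank."""
    kind, value = identifier
    return "<%s>" % value if kind == "uri" else "_:%s" % value


def from_n3(text):
    """Parse the n3 form back into a (kind, value) pair."""
    if text.startswith("<") and text.endswith(">"):
        return ("uri", text[1:-1])
    if text.startswith("_:"):
        return ("bnode", text[2:])
    raise ValueError("not an n3 identifier: %r" % text)


session = {}  # stand-in for the Pylons session

# login: store the n3 form of the user's node identifier.
user_id = ("bnode", "KAxsUVwF12")
session["auth_user"] = to_n3(user_id)

# get_user: recover the identifier from the session on a later request.
restored = from_n3(session["auth_user"])
```

Storing a plain string keeps the session serializable, and the n3 form distinguishes blank nodes from URI references, which matters because the users in the sample output are blank nodes.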

lib/auth/permissions.py

Used unchanged from shabti_auth – boilerplate authentication.

author: Graham Higgins <gjh@bel-epa.com>
