OpenCyc for the Semantic Web

From the OpenCyc web site:

Now it is even easier to use the rich and diverse collection of real-world concepts in OpenCyc to bring meaning to your semantic web applications! The full OpenCyc content is now available both as downloadable OWL ontologies as well as via semantic web endpoints (i.e., permanent URIs). These URIs return RDF representations of each Cyc concept as well as a human-readable version when accessed via a Web Browser.

This module brings OpenCyc piecemeal into ORDF/RDFLib applications.

class ordf.vocab.opencyc.Concept(*av, **kw)[source]

Bases: ordf.graph.Graph

OpenCyc is very big. For most purposes we don’t want to have to store the entire knowledge base in our local store, but for inferencing purposes we often will want to store some of the relevant concepts.

OpenCyc handily provides a search interface that gives results in XML, and we use this to retrieve a concept if we don’t know its URI beforehand.

Initialisation of Concept takes, as with any other Graph, an optional ‘store’ argument. If the concept in question does not exist in the store, it will be fetched and added. The data returned by OpenCyc will typically include some other resources; these are filtered out, and only the blank node closure of the requested resource is added to the store.
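To make "blank node closure" concrete, here is a minimal sketch of the idea, not the module's actual implementation. It uses plain tuples as triples and a hypothetical `BNode` marker class (real code would use RDFLib's `BNode` terms); the identifiers such as `cyc:Lamb` and `ex:note` are illustrative only.

```python
from collections import deque


class BNode(str):
    """Hypothetical marker type standing in for a blank node identifier."""


def bnode_closure(triples, resource):
    """Return the triples whose subject is `resource`, plus the triples
    describing any blank nodes reachable from it via object positions."""
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((s, p, o))

    closure = []
    seen = {resource}
    queue = deque([resource])
    while queue:
        subject = queue.popleft()
        for s, p, o in by_subject.get(subject, []):
            closure.append((s, p, o))
            # Only blank nodes are followed; named resources are left out.
            if isinstance(o, BNode) and o not in seen:
                seen.add(o)
                queue.append(o)
    return closure


# Illustrative data: the response mentions cyc:Sheep as well as cyc:Lamb.
triples = [
    ("cyc:Lamb", "rdfs:label", "lamb"),
    ("cyc:Lamb", "rdfs:subClassOf", "cyc:Sheep"),
    ("cyc:Lamb", "ex:note", BNode("_:b1")),
    (BNode("_:b1"), "rdfs:comment", "a juvenile sheep"),
    ("cyc:Sheep", "rdfs:label", "sheep"),
]
kept = bnode_closure(triples, "cyc:Lamb")
```

In this sketch the statements about `cyc:Sheep` are dropped, while the blank node hanging off `cyc:Lamb` is kept because it has no identity outside the requested resource.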

The Concept class has methods for walking the ontology tree, documented below. These can be useful for adding relevant resources to the store in an automated way, but they can be quite slow because each step may trigger further HTTP requests, and the total number of requests can be large.

>>> from ordf.vocab.opencyc import Concept
>>> lamb, = Concept.search("lamb", max=1)
>>> print lamb.cycAnnot()
(JuvenileFn Sheep)
>>> for parent in lamb.parents():
...     print parent.cycAnnot()
...
Sheep
JuvenileAnimal
>>>
classmethod search(name, max=1, exact=True, store='IOMemory')[source]

The OpenCyc search interface is not documented anywhere obvious, but a very small amount of reverse engineering of the JavaScript code on the website and analysis of the XML it returns is sufficient to implement this method.

Parameters:
  • name – a text string to search on. This might be the name in English of the concept that is of interest
  • max – maximum number of results to return
  • exact – exact matches only
  • store – the RDFLib Store to which any results should be added; returned graphs are initialised with this store. The default is the string "IOMemory"
Returns:

an iterator over populated Concept graphs for each of the search results

parents(restrict=False, seen=set([]))[source]

Walk one step up the class hierarchy by following ‘rdfs:subClassOf’ links.

Parameters:
  • restrict – boolean indicating whether parents returned should be restricted to the OpenCyc namespace.
  • seen – set of identifiers that have already been processed and are therefore not returned again, in order to avoid needless recursion
Returns:

an iterator yielding Concept for each of the parent concepts.

ancestors(restrict=False, seen=set([]))[source]

Walk to the top of the class hierarchy recursively using parents(). The parameters are the same as for that method.
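The interplay of `restrict` and `seen` in parents() and ancestors() can be sketched over an in-memory hierarchy. This is an illustration under stated assumptions rather than the module's code: the `SUBCLASS_OF` table, the `cyc:` and `owl:` prefixes, and the free functions below are hypothetical stand-ins for graphs fetched over HTTP.

```python
# Hypothetical prefix standing in for the OpenCyc namespace.
OPENCYC_PREFIX = "cyc:"

# Toy rdfs:subClassOf table; the real methods read this from fetched RDF.
SUBCLASS_OF = {
    "cyc:Lamb": ["cyc:Sheep", "cyc:JuvenileAnimal"],
    "cyc:Sheep": ["cyc:Animal"],
    "cyc:JuvenileAnimal": ["cyc:Animal"],
    "cyc:Animal": ["owl:Thing"],
}


def parents(node, restrict=False, seen=None):
    """Yield direct superclasses of `node` that have not been seen yet."""
    seen = set() if seen is None else seen
    for parent in SUBCLASS_OF.get(node, []):
        if parent in seen:
            continue  # already processed elsewhere in the walk
        if restrict and not parent.startswith(OPENCYC_PREFIX):
            continue  # outside the namespace of interest
        seen.add(parent)
        yield parent


def ancestors(node, restrict=False, seen=None):
    """Yield every superclass reachable from `node`, each one once."""
    seen = set() if seen is None else seen
    for parent in parents(node, restrict, seen):
        yield parent
        for ancestor in ancestors(parent, restrict, seen):
            yield ancestor


all_ancestors = list(ancestors("cyc:Lamb"))
cyc_only = list(ancestors("cyc:Lamb", restrict=True))
```

Sharing one `seen` set across the recursion is what keeps `cyc:Animal` from being visited twice, once via each parent. The sketch creates a fresh set per top-level call by defaulting `seen` to `None`, since a mutable default value in Python is shared between calls.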

cycAnnot()[source]

Return the Cyc annotation of the current resource, that is, its representation in the Cyc language, for example (JuvenileFn Sheep) for the concept “lamb”.

ordf.vocab.opencyc.rdf_data()[source]

Data fixture for OpenCyc. It starts with the top-level predicate and recurses through all predicates (‘owl:DatatypeProperty’ and ‘owl:ObjectProperty’) and classes present in the returned RDF, whether they appear as subjects or as the value of ‘rdfs:domain’ or ‘rdfs:range’. In this way it builds up a basic ontology that can be used for reasoning.

This function may take some time to complete: at the time of writing it yields some 520 distinct graphs.
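The traversal described for rdf_data() can be pictured as a worklist crawl. The following is a rough sketch under assumptions, not the function's real code: `FAKE_REMOTE`, `fetch`, `referenced` and `crawl` are hypothetical names, the `ex:` identifiers are invented, and the real function may well recurse rather than use a queue.

```python
from collections import deque

# Types whose subjects are followed, and predicates whose objects are followed.
FOLLOW_TYPES = {"owl:DatatypeProperty", "owl:ObjectProperty", "owl:Class"}
FOLLOW_PREDICATES = {"rdfs:domain", "rdfs:range"}

# Hypothetical stand-in for what the remote endpoint would return per resource.
FAKE_REMOTE = {
    "ex:relatedTo": [
        ("ex:relatedTo", "rdf:type", "owl:ObjectProperty"),
        ("ex:relatedTo", "rdfs:domain", "ex:Thing"),
        ("ex:partOf", "rdf:type", "owl:ObjectProperty"),
    ],
    "ex:partOf": [
        ("ex:partOf", "rdf:type", "owl:ObjectProperty"),
        ("ex:partOf", "rdfs:range", "ex:Thing"),
    ],
    "ex:Thing": [
        ("ex:Thing", "rdf:type", "owl:Class"),
    ],
}


def fetch(identifier):
    """Pretend to retrieve the RDF for one resource."""
    return FAKE_REMOTE.get(identifier, [])


def referenced(triples):
    """Collect identifiers worth fetching next from one returned graph."""
    found = []
    for s, p, o in triples:
        if p == "rdf:type" and o in FOLLOW_TYPES:
            found.append(s)
        elif p in FOLLOW_PREDICATES:
            found.append(o)
    return found


def crawl(start):
    """Yield (identifier, triples) for each distinct resource reached."""
    seen = {start}
    queue = deque([start])
    while queue:
        identifier = queue.popleft()
        triples = fetch(identifier)
        yield identifier, triples
        for ref in referenced(triples):
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)


order = [identifier for identifier, _ in crawl("ex:relatedTo")]
```

Each yielded item corresponds to one fetched graph, which is why the number of graphs, and with it the running time, grows with the number of distinct predicates and classes reachable from the starting point.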
