Generating Websites with a Reasoner

The website www.andonyar.com is created from a description of things in RDF. First, a reasoner derives information about the pages to be generated. Then a planner writes a shell script which finally builds all required files. This turns out to be an elegant solution for small websites.

Currently, generating most pages on www.andonyar.com is done along the following lines:

  1. Describe persons, topics, languages, articles and other things using RDF.
  2. Run the reasoner, which uses rules written beforehand to derive information about the files that have to be generated.
  3. Run the planner. It outputs a shell script.
  4. Build all files by running the shell script.
  5. Upload the files to the webserver.

An implementation of this strategy is SemPipe. It provides the planner and some additional tools, and it uses CWM from the W3C as its reasoner.

In this article I explain the basic idea. A detailed description with complete examples can be found among SemPipe's documentation (once it's written).

Describing Things

As an example we take the following description of people, located in a file we call descriptions.n3. It describes three people, each with their name (foaf:name) and email address (foaf:mbox).

@prefix :     <http://example.org/people/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

:HarryMoleman a foaf:Person ;
    foaf:name "Harry Moleman" ;
    foaf:mbox <mailto:harry@toybox.org> .

:AbrahamLincoln a foaf:Person ;
    foaf:name "Stone Head of Abraham Lincoln" ;
    foaf:mbox <mailto:abe@givemeyourfirstborn.gov> .

:Stinky a foaf:Person ;
    foaf:name "Stinky" ;
    foaf:mbox <mailto:stinky@stinkys.food> .

Defining Rules

Now assume we want our website to consist of a profile page for every person and a list of all persons known to us. We put the necessary rules in rules.n3.

First we define some namespaces. log: and string: contain functions built into CWM. semp: is recognized by SemPipe's planner. We use voc: for our own internal properties.

@prefix :       <#> .
@prefix voc:    <http://www.example.org/voc#> .
@prefix log:    <http://www.w3.org/2000/10/swap/log#>.
@prefix string: <http://www.w3.org/2000/10/swap/string#>.
@prefix foaf:   <http://xmlns.com/foaf/0.1/> .
@prefix semp:   <http://www.andonyar.com/rec/2012/sempipe/voc#> .

List of all Persons

The rules to generate a list of all persons are

<http://example.org/profiles/> a semp:Resource ;
    semp:buildVar [ semp:name "rules"; semp:value "rules.n3" ] ;
    semp:buildVar [ semp:name "descriptions"; semp:value "descriptions.n3" ] ;
    semp:build (
        "mkdir -p build/"
        "cwm.py $rules $descriptions --think --rdf | xsltproc list.xsl - > build/index.xhtml"
    ) .

A semp:Resource is recognized by SemPipe as something that has to be built. When the planner creates the shell script for such a resource, it first declares all the variables given by a semp:buildVar and then appends the commands given by semp:build. What these commands do and the involved XSL transformations are discussed in a later section. For this resource, the shell script will look like

rules=rules.n3
descriptions=descriptions.n3
mkdir -p build/
cwm.py $rules $descriptions --think --rdf | xsltproc list.xsl - > build/index.xhtml
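
To make the planner's job concrete, here is a minimal Python sketch of the step just described: declare the variables first, then append the commands. It is not SemPipe's actual code. The function name plan_resource and the use of shlex.quote are my own assumptions for illustration.

```python
import shlex

def plan_resource(build_vars, commands):
    """Return the script text for one resource: variables first, then commands.

    build_vars is a list of (name, value) pairs standing in for semp:buildVar,
    commands is a list of strings standing in for the semp:build list.
    (Hypothetical helper, not part of SemPipe.)
    """
    lines = [f"{name}={shlex.quote(value)}" for name, value in build_vars]
    lines.extend(commands)
    return "\n".join(lines) + "\n"

# The list resource from this section, written out by hand.
script = plan_resource(
    [("rules", "rules.n3"), ("descriptions", "descriptions.n3")],
    [
        "mkdir -p build/",
        "cwm.py $rules $descriptions --think --rdf | xsltproc list.xsl - > build/index.xhtml",
    ],
)
print(script, end="")
```

Quoting the values is a design choice of the sketch: it leaves simple values such as rules.n3 untouched but keeps the script valid if a value ever contains a space.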

A Profile Page for each Person

Our rules should stay the same even if new persons are added to descriptions.n3. Therefore, we have to write an N3 rule that derives a profile page for every person. Also, the rule needs to coin a new URL for every profile page.

First of all, we denote a profile page by :Profile and we assert that a profile page is also a semp:Resource to be built by SemPipe.

{ ?profile a :Profile }
=>
{ ?profile a semp:Resource } .

=> is an implication in the mathematical sense: If the statement to the left of => is true, then the right side is true as well. Identifiers beginning with a question mark are a short notation for universally quantified variables. So, this rule says: For all ?profile it holds: If ?profile is a :Profile, then ?profile is also a semp:Resource.

The next step is to derive a profile page for every person. We construct the profile's URL by taking the part after the last / of the person's URL and putting http://example.org/profiles/ in front of it.

{
    ?person a foaf:Person .
    (?person.log:uri "/([^/]+)$") string:scrape ?id .
    ?profile log:uri ("http://example.org/profiles/" ?id).string:concatenation .
}
=>
{
    ?profile a :Profile .
    ?profile foaf:primaryTopic ?person .
    ?profile :uniqueIdentifier ?id .
}.

Note that on the left-hand side of an implication, CWM automatically tries to compute the values of the built-in properties it knows, such that the statement becomes true. ?person.log:uri denotes the URI of ?person as a string. string:scrape takes a list consisting of a string and a regular expression on the left side and a string on the right side. The statement is true when the right side is the part of the string captured by the group of the regular expression. If the left side is known, CWM can compute the right side such that the statement becomes true. The third statement says that the URI of ?profile is the concatenation of the strings http://example.org/profiles/ and ?id.

In words, the rule says: For every ?person which is a foaf:Person, every ?id being the part of the ?person's URL after the last / (computed automatically), and every ?profile being the concatenation of http://example.org/profiles/ and ?id as URI (computed automatically) the statements ?profile a :Profile, ?profile foaf:primaryTopic ?person, ?profile :uniqueIdentifier ?id are true.
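
For readers who think more easily in a procedural language, the two computed statements can be mimicked in a few lines of Python. This is only an illustration of what the built-ins compute, not how CWM evaluates them. The function name profile_for is made up for this sketch.

```python
import re

PROFILE_BASE = "http://example.org/profiles/"

def profile_for(person_uri):
    """Mimic the rule body: scrape ?id from the person's URI, then build the profile URI.

    Returns (id, profile_uri), or None when the regular expression does not
    match, in which case the rule would simply not fire for that person.
    """
    match = re.search(r"/([^/]+)$", person_uri)  # same pattern as in the rule
    if match is None:
        return None
    person_id = match.group(1)                   # plays the role of ?id
    return person_id, PROFILE_BASE + person_id   # plays the role of string:concatenation

result = profile_for("http://example.org/people/Stinky")
```

Here result holds the identifier Stinky together with the profile URL http://example.org/profiles/Stinky.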

So far, we have a rule that derives profiles from persons and a rule that says that a profile is also a semp:Resource. We still need to say how to build the profile pages.

{
    ?profile a :Profile .
    ?profile foaf:primaryTopic ?person .
    ?profile :uniqueIdentifier ?id .
}
=>
{
    ?person voc:profilePage ?profile .

    ?profile semp:buildVar [ semp:name "person"; semp:value ?person ] .
    ?profile semp:buildVar [ semp:name "rules"; semp:value "rules.n3" ] .
    ?profile semp:buildVar [ semp:name "descriptions"; semp:value "descriptions.n3" ] .
    ?profile semp:buildVar [ semp:name "id"; semp:value ?id ] .

    ?profile semp:build ( 
        "mkdir -p build/"
        "cwm.py $rules $descriptions --think --rdf | xsltproc --stringparam person $person profile.xsl - > build/$id.xhtml"
    ) .
} .

The above rule derives the variables and build instructions for every profile. The statement ?person voc:profilePage ?profile is used by the XSL transformation list.xsl shown later.

Running the Reasoner

Now we feed the reasoner with the descriptions and rules given above.

cwm.py rules.n3 descriptions.n3 --think

Due to the option --think, cwm tries to apply the rules as well as it can. (There are theoretical and implementation-specific limitations.) It writes the result to standard output. Here I only show part of the output and I made the notation a bit more readable:

@prefix : <http://www.andonyar.com/rec/2012/sempipe/voc#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix p: <http://example.org/profiles/> .
@prefix peo: <http://example.org/people/> .
@prefix voc: <http://www.example.org/voc#> .

p: 
    a :Resource;
    :buildVar [ :name "descriptions"; :value "descriptions.n3" ];
    :buildVar [ :name "rules"; :value "rules.n3" ] ;
    :build (
        "mkdir -p build/"
        "cwm.py $rules $descriptions --think --rdf | xsltproc list.xsl - > build/index.xhtml"
    ).

p:AbrahamLincoln
    a <rules.n3#Profile>;
    a :Resource;
    foaf:primaryTopic peo:AbrahamLincoln ;
    <rules.n3#uniqueIdentifier> "AbrahamLincoln";
    :buildVar [ :name "person"; :value peo:AbrahamLincoln ];
    :buildVar [ :name "rules"; :value "rules.n3" ];
    :buildVar [ :name "descriptions"; :value "descriptions.n3" ];
    :buildVar [ :name "id"; :value "AbrahamLincoln" ];
    :build (
        "mkdir -p build/"
        "cwm.py $rules $descriptions --think --rdf | xsltproc --stringparam person $person profile.xsl - > build/$id.xhtml" 
    ).

p:HarryMoleman
    a <rules.n3#Profile>;
    a :Resource;
    foaf:primaryTopic peo:HarryMoleman ;
    :buildVar [ :name "person"; :value peo:HarryMoleman ];
    :buildVar [ :name "rules"; :value "rules.n3" ];
    :buildVar [ :name "descriptions"; :value "descriptions.n3" ];
    :buildVar [ :name "id"; :value "HarryMoleman" ];
    <rules.n3#uniqueIdentifier> "HarryMoleman";
    :build (
        "mkdir -p build/"
        "cwm.py $rules $descriptions --think --rdf | xsltproc --stringparam person $person profile.xsl - > build/$id.xhtml"
    ).

p:Stinky 
    a <rules.n3#Profile>;
    a :Resource;
    foaf:primaryTopic peo:Stinky ;
    <rules.n3#uniqueIdentifier> "Stinky";
    :buildVar [ :name "person"; :value peo:Stinky ];
    :buildVar [ :name "rules"; :value "rules.n3" ];
    :buildVar [ :name "descriptions"; :value "descriptions.n3" ];
    :buildVar [ :name "id"; :value "Stinky" ];
    :build (
        "mkdir -p build/"
        "cwm.py $rules $descriptions --think --rdf | xsltproc --stringparam person $person profile.xsl - > build/$id.xhtml" 
    ).

As you can see, it produced descriptions on how to build each resource we originally wanted to have.
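
Conceptually, applying the rules "as well as it can" means applying them again and again until no new statements appear. The following Python sketch shows this idea with simplified versions of the first two rules. It assumes nothing about CWM's internals. Statements are plain tuples, prefixed names such as foaf:Person are used as plain strings, and all function names are invented for the sketch.

```python
import re

A = "a"  # stands in for rdf:type

def derive_profiles(facts):
    """Simplified second rule: every foaf:Person gets a :Profile with a coined URL."""
    new = set()
    for subject, predicate, obj in facts:
        if predicate == A and obj == "foaf:Person":
            match = re.search(r"/([^/]+)$", subject)
            if match:
                profile = "http://example.org/profiles/" + match.group(1)
                new.add((profile, A, ":Profile"))
    return new

def profiles_are_resources(facts):
    """First rule: every :Profile is also a semp:Resource."""
    return {(s, A, "semp:Resource") for s, p, o in facts if p == A and o == ":Profile"}

def think(facts, rules):
    """Apply all rules repeatedly until a pass adds nothing new."""
    facts = set(facts)
    while True:
        new = set()
        for rule in rules:
            new |= rule(facts)
        if new <= facts:
            return facts
        facts |= new

facts = think(
    {("http://example.org/people/Stinky", A, "foaf:Person")},
    [profiles_are_resources, derive_profiles],
)
```

With the rules listed in this order, the first pass only finds the profile and the second pass finds that it is a semp:Resource, which is why a single pass over the rules would not be enough.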

Planning

Running this through SemPipe's planner with the command

cwm.py rules.n3 descriptions.n3 --think | sempipe-plan.py

yields the following shell script:

rules=rules.n3
descriptions=descriptions.n3
mkdir -p build/
cwm.py $rules $descriptions --think --rdf | xsltproc list.xsl - > build/index.xhtml

descriptions=descriptions.n3
person=http://example.org/people/AbrahamLincoln
id=AbrahamLincoln
rules=rules.n3
mkdir -p build/
cwm.py $rules $descriptions --think --rdf | xsltproc --stringparam person $person profile.xsl - > build/$id.xhtml

rules=rules.n3
person=http://example.org/people/Stinky
descriptions=descriptions.n3
id=Stinky
mkdir -p build/
cwm.py $rules $descriptions --think --rdf | xsltproc --stringparam person $person profile.xsl - > build/$id.xhtml

descriptions=descriptions.n3
id=HarryMoleman
rules=rules.n3
person=http://example.org/people/HarryMoleman
mkdir -p build/
cwm.py $rules $descriptions --think --rdf | xsltproc --stringparam person $person profile.xsl - > build/$id.xhtml

Executing the Plan

Running the script with a shell is all we need to do:

cwm.py rules.n3 descriptions.n3 --think | sempipe-plan.py | sh

This creates the four files build/index.xhtml, build/AbrahamLincoln.xhtml, build/Stinky.xhtml, and build/HarryMoleman.xhtml, shown below in that order:

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:voc="http://www.example.org/voc#">
  <head>
    <title>User Profiles</title>
  </head>
  <body>
    <ul>
      <li>
        <a href="http://example.org/profiles/HarryMoleman">Harry Moleman</a>
      </li>
      <li>
        <a href="http://example.org/profiles/Stinky">Stinky</a>
      </li>
      <li>
        <a href="http://example.org/profiles/AbrahamLincoln">Stone Head of Abraham Lincoln</a>
      </li>
    </ul>
  </body>
</html>
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <head>
    <title>Profile Page of Stone Head of Abraham Lincoln</title>
  </head>
  <body>
    <h1>Profile Page of Stone Head of Abraham Lincoln</h1>
    <dl>
      <dt>name</dt>
      <dd>Stone Head of Abraham Lincoln</dd>
      <dt>e-mail</dt>
      <dd>
        <a href="mailto:abe@givemeyourfirstborn.gov">mailto:abe@givemeyourfirstborn.gov</a>
      </dd>
    </dl>
  </body>
</html>
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <head>
    <title>Profile Page of Stinky</title>
  </head>
  <body>
    <h1>Profile Page of Stinky</h1>
    <dl>
      <dt>name</dt>
      <dd>Stinky</dd>
      <dt>e-mail</dt>
      <dd>
        <a href="mailto:stinky@stinkys.food">mailto:stinky@stinkys.food</a>
      </dd>
    </dl>
  </body>
</html>
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <head>
    <title>Profile Page of Harry Moleman</title>
  </head>
  <body>
    <h1>Profile Page of Harry Moleman</h1>
    <dl>
      <dt>name</dt>
      <dd>Harry Moleman</dd>
      <dt>e-mail</dt>
      <dd>
        <a href="mailto:harry@toybox.org">mailto:harry@toybox.org</a>
      </dd>
    </dl>
  </body>
</html>

The XSL Transformations

I still owe you the stylesheets list.xsl and profile.xsl. The main difficulty is that an RDF graph has no unique RDF/XML representation, which makes it inherently hard to process RDF/XML with XSLT. Fortunately, CWM's RDF/XML output follows some fixed rules and can even be configured. The rules specify that, in order to build a page, the RDF data is obtained from CWM and piped into xsltproc:

cwm.py $rules $descriptions --think --rdf | xsltproc --stringparam person $person profile.xsl - > build/$id.xhtml

Of course, instead of running CWM for every resource as I do here, it is recommended to cache CWM's output.
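
One possible way to do such caching is sketched below in Python. It is an illustration under my own assumptions, not something SemPipe provides. The helper cached_output and the file name derived.rdf are hypothetical, and the reasoner is replaced by a stand-in function so that the sketch runs without CWM.

```python
import tempfile
from pathlib import Path

def cached_output(cache_path, inputs, produce):
    """Return produce()'s text, reusing cache_path while it is at least as new as all inputs."""
    cache = Path(cache_path)
    if cache.exists():
        cache_mtime = cache.stat().st_mtime
        if all(Path(p).stat().st_mtime <= cache_mtime for p in inputs):
            return cache.read_text(encoding="utf-8")
    text = produce()  # the expensive step, e.g. one run of the reasoner
    cache.write_text(text, encoding="utf-8")
    return text

# Demonstration with a stand-in for the reasoner (it only counts its calls).
calls = []

def fake_reasoner():
    calls.append(1)
    return "<rdf:RDF/>"

with tempfile.TemporaryDirectory() as tmp:
    rules = Path(tmp, "rules.n3")
    rules.write_text("", encoding="utf-8")
    cache = Path(tmp, "derived.rdf")
    first = cached_output(cache, [rules], fake_reasoner)
    second = cached_output(cache, [rules], fake_reasoner)
```

In a real build, the stand-in would be replaced by something that runs the cwm.py command shown above once, and every xsltproc call would then read the cached file instead of standard input.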

Here are the transformations: list.xsl creates the list of all persons.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    version = "1.0"
    xmlns:xsl   = "http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf   = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf  = "http://xmlns.com/foaf/0.1/"
    xmlns:voc   = "http://www.example.org/voc#"
    xmlns       = "http://www.w3.org/1999/xhtml"
>

<xsl:output
     method="xml"
     encoding="UTF-8"
     indent="yes" />

<xsl:template match="/rdf:RDF">
    <html>
    <head>
        <title>User Profiles</title>
    </head>
    <body>

    <ul>
        <xsl:apply-templates select="foaf:Person">
            <xsl:sort select="foaf:name"/>
        </xsl:apply-templates>
    </ul>

    </body>
    </html>
</xsl:template>

<xsl:template match="foaf:Person">
    <li><a href="{voc:profilePage/@rdf:resource}"><xsl:value-of select="foaf:name"/></a></li>
</xsl:template>

</xsl:stylesheet>

profile.xsl creates a profile page. The parameter person on the command line indicates the person.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    version = "1.0"
    xmlns:xsl   = "http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf   = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf  = "http://xmlns.com/foaf/0.1/"
    xmlns       = "http://www.w3.org/1999/xhtml"
>

<xsl:param name="person"/>

<xsl:output
     method="xml"
     encoding="UTF-8"
     indent="yes" />

<xsl:template match="/rdf:RDF">
    <xsl:apply-templates select="foaf:Person[@rdf:about=$person]"/>
</xsl:template>

<xsl:template match="foaf:Person">
    <html>
    <head>
        <title>Profile Page of <xsl:value-of select="foaf:name"/></title>
    </head>
    <body>

    <h1>Profile Page of <xsl:value-of select="foaf:name"/></h1>

    <dl>
        <dt>name</dt>
        <dd><xsl:value-of select="foaf:name"/></dd>
        <dt>e-mail</dt>
        <dd><a href="{foaf:mbox/@rdf:resource}"><xsl:value-of select="foaf:mbox/@rdf:resource"/></a></dd>
    </dl>

    </body>
    </html>
</xsl:template>

</xsl:stylesheet>
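
Both stylesheets silently depend on the shape of CWM's RDF/XML: every person must appear as a foaf:Person element directly below rdf:RDF, with the URI in rdf:about and the properties as child elements. The Python sketch below spells out that assumption by performing the same selection as profile.xsl on a hand-written sample. The sample is inferred from the XPath expressions above and is not actual CWM output.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

# Hand-written sample in the shape the stylesheets expect (an assumption).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://example.org/people/Stinky">
    <foaf:name>Stinky</foaf:name>
    <foaf:mbox rdf:resource="mailto:stinky@stinkys.food"/>
  </foaf:Person>
  <foaf:Person rdf:about="http://example.org/people/HarryMoleman">
    <foaf:name>Harry Moleman</foaf:name>
    <foaf:mbox rdf:resource="mailto:harry@toybox.org"/>
  </foaf:Person>
</rdf:RDF>"""

def person_fields(rdf_xml, person):
    """Mirror foaf:Person[@rdf:about=$person] and return (name, mbox), or None."""
    root = ET.fromstring(rdf_xml)
    for node in root.findall("foaf:Person", NS):
        if node.get(RDF + "about") == person:
            name = node.findtext("foaf:name", namespaces=NS)
            mbox = node.find("foaf:mbox", NS).get(RDF + "resource")
            return name, mbox
    return None

fields = person_fields(SAMPLE, "http://example.org/people/Stinky")
```

If the same graph were serialized differently, for example with rdf:Description elements and a separate rdf:type property, neither this lookup nor the stylesheets would find the person. This is exactly the fragility mentioned at the beginning of this section.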

Usage Scenario: Dashboard

Here is a more complicated example which could be implemented using SemPipe:

Imagine a website that informs its users about the status of services: whether a service is working as expected, is overloaded, or is down. Every time a problem turns up, an administrator would describe the issue in RDF. The reasoner would then derive from all issues the current status for each service. In order to construct the website, the reasoner would derive descriptions of the semp:Resources to be created (maybe one for each kind of service) and SemPipe would be used to build them.
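
To give an idea of the kind of derivation meant here, the following Python sketch computes a status per service from a list of open issues. Everything in it is hypothetical: the service names, the rule that the worst reported state wins, and the function name. In the scenario above, this logic would be expressed as N3 rules instead.

```python
# Hypothetical ordering of the three states named above, from best to worst.
SEVERITY = {"working": 0, "overloaded": 1, "down": 2}

def current_status(services, open_issues):
    """Assume each service is working unless an open issue reports something worse."""
    status = {service: "working" for service in services}
    for service, reported in open_issues:
        if SEVERITY[reported] > SEVERITY[status[service]]:
            status[service] = reported
    return status

# Made-up example data.
status = current_status(
    ["mail", "web"],
    [("web", "overloaded"), ("web", "down")],
)
```

In this made-up example, web ends up as down and mail stays working.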

Dynamic Pages

We discussed generating static web pages by starting with the descriptions and rules and deriving which pages have to be generated and how they have to be generated. For a webserver the situation is reversed. It gets an HTTP request containing a URL and it has to find out how to produce the corresponding resource. The question that comes naturally is whether SemPipe can be used for this too.

The setting is as follows: The descriptions and the rules are given. For a request URL, we need SemPipe to find out how to build the corresponding semp:Resource, should it exist. The reasoner only needs to prove that the requested URL is a semp:Resource and to find out how to build it. Unlike what we did above, it does not need to derive all statements it can, but only those that are related to the task. To facilitate this, one would have to adapt the rules slightly. (Remember string:scrape: The string on the left side would have to be computed from the one on the right. This is not feasible, because there are too many solutions. However, this can be worked around.)
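
One conceivable workaround, sketched here in Python, is to go the other way round: take the identifier from the requested URL and look for a known person whose URL ends with it, instead of trying to invert the regular expression. This only illustrates the idea and does not describe an existing SemPipe feature. The function name is invented for the sketch.

```python
PROFILE_PREFIX = "http://example.org/profiles/"

def person_for_request(url, person_uris):
    """Return the person whose profile page is requested, or None if there is none."""
    if not url.startswith(PROFILE_PREFIX):
        return None
    wanted = url[len(PROFILE_PREFIX):]       # the would-be ?id
    if not wanted or "/" in wanted:
        return None
    for uri in person_uris:
        if uri.rsplit("/", 1)[-1] == wanted:  # same part that string:scrape extracts
            return uri
    return None

people = [
    "http://example.org/people/HarryMoleman",
    "http://example.org/people/AbrahamLincoln",
    "http://example.org/people/Stinky",
]
found = person_for_request("http://example.org/profiles/Stinky", people)
```

Because the set of known persons is finite, the search has only as many candidates as there are descriptions, whereas inverting string:scrape directly would have infinitely many.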

The main concern would be efficiency. Rules which would slow down reasoning would have to be avoided. However, only a real-life experiment could tell whether this approach is feasible.