Clone the Git repository:
git clone https://github.com/uholzer/pkb4unix.git
The following things happened several times to me:
These events all sound like they should be solvable with currently available software. Maybe so, but I am the kind of person who wants to build his own tailored solution.
Let us look at the requirements. The data I want to store is manifold: I have problems, solutions, tasks, websites, papers and topics. Furthermore, I will likely be tempted to introduce more and more properties for all of them. Take papers as an example. There are many BibTeX properties: authors, publisher, date, journal (issue, page and so on), title, URL, DOI. The list is virtually open-ended. One is also tempted to add further properties, for example one that records in which way the paper has been useful in the past. The things I add to my knowledge base should be related to each other: a paper is related to a topic and an author. An author is related to her email address and her university. A solution is related to the problem it solves. I want to be able to build the knowledge base gradually and introduce new concepts and properties when needed.
Clearly, it is very difficult to do all this using relational databases, since every new concept or property would mean changing the schema. Semantic Web enthusiasts already know what the right thing is in this case. I will build upon Semantic Web standards and plug together readily available software. I also hope to benefit from integration with the Semantic Web. For example, many publishers provide metadata about papers in RDF, so I don't need to enter that data myself.
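To make the idea of gradually adding properties and relations concrete, here is a minimal sketch in plain Python that models the knowledge base as a set of subject-predicate-object triples, which is the shape RDF gives to data. All identifiers (paper1, alice, usefulFor and so on) are made up for illustration; real RDF would use full URIs and a proper RDF library rather than tuples of strings.

```python
# A toy triple store: each fact is a (subject, predicate, object) tuple.
# All names below are hypothetical placeholders, not real vocabulary terms.
triples = {
    ("paper1", "type", "Paper"),
    ("paper1", "title", "An Example Paper"),
    ("paper1", "author", "alice"),
    ("paper1", "topic", "semantic-web"),
    ("alice", "email", "alice@example.org"),
    ("alice", "university", "Example University"),
}


def objects(subject, predicate):
    """Return all objects of triples matching subject and predicate, sorted."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def add_property(subject, predicate, obj):
    """Introducing a new property needs no schema change, just another triple."""
    triples.add((subject, predicate, obj))


# A property nobody planned for in advance can be added at any time.
add_property("paper1", "usefulFor", "solution42")
```

Calling objects("paper1", "author") then follows the relation from the paper to its author, and objects("alice", "email") follows the next one from the author to her address.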
This page is organized as a blog. I will report here from time to time on how my knowledge base evolves. Everything mentioned above will be fleshed out, and I will show how it works in practice.
The requirements for my knowledge base are not very clear yet. In
fact, I have to explicitly allow them to change over time. Because of
this, it is important to stick to the UNIX philosophy. Among its
principles are "Separate mechanism from policy", "Write simple
parts connected by clean interfaces", "Fold knowledge into data,
so program logic can be stupid and robust", and "Prototype before
polishing". I will adhere to these principles like this:
This time, I'll install a SPARQL endpoint. I decided on Sesame 2.7.0-beta2.
Installation instructions can be found in Sesame's documentation.
Sesame is a Java application and needs a Servlet container in order
to run as a SPARQL endpoint. I'll use Tomcat from my Debian
distribution, found in the package tomcat7-user. This
package is special in that it allows a user to easily set up and run
Tomcat on his machine for testing purposes. Setting up a
Tomcat instance is very easy:
urs@speedy:~/p/knowledge$ tomcat7-instance-create --help
Usage: tomcat7-instance-create [options] <directoryname>
  directoryname: name of the tomcat instance directory to create
Options:
  -h, --help       Display this help message
  -p httpport      HTTP port to be used by Tomcat (default is 8080)
  -c controlport   Server shutdown control port (default is 8005)
  -w magicword     Word to send to trigger shutdown (default is SHUTDOWN)
urs@speedy:~/p/knowledge$ tomcat7-instance-create tomcat-instance
You are about to create a Tomcat instance in directory 'tomcat-instance'
nc: unable to connect to address localhost, service 8080
nc: unable to connect to address localhost, service 8005
* New Tomcat instance created in tomcat-instance
* You might want to edit default configuration in tomcat-instance/conf
* Run tomcat-instance/bin/startup.sh to start your Tomcat instance
Unpacking Sesame reveals two WAR files, one for Sesame itself
and one for the Sesame Workbench. These must be dropped into Tomcat's
webapps directory.
urs@speedy:~/p/knowledge$ cd sesame/
urs@speedy:~/p/knowledge/sesame$ tar -xzf openrdf-sesame-2.7.0-beta2-sdk.tar.gz
urs@speedy:~/p/knowledge/sesame$ ls openrdf-sesame-2.7.0-beta2/war/
openrdf-sesame.war  openrdf-workbench.war
urs@speedy:~/p/knowledge/sesame$ cd ..
urs@speedy:~/p/knowledge$ cp sesame/openrdf-sesame-2.7.0-beta2/war/* tomcat-instance/webapps/
This does the trick. Starting and stopping Tomcat is easy with
./tomcat-instance/bin/startup.sh and ./tomcat-instance/bin/shutdown.sh.
Note that Sesame's data is stored in ~/.aduna/.
After starting, Sesame's Workbench can be accessed via
http://localhost:8080/openrdf-workbench/
and the server is at
http://localhost:8080/openrdf-sesame.
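Whether the server really answers at that address can also be checked from a script. The following is a minimal sketch using only Python's standard library. It assumes that the server lists its repositories under the path repositories, as described in Sesame's HTTP protocol documentation, and that it can return this list as SPARQL JSON results with an id binding per repository. I have not verified either against 2.7.0-beta2, and the function names are my own.

```python
import json
from urllib.request import Request, urlopen

# Server address from the text above.
SERVER = "http://localhost:8080/openrdf-sesame"


def repositories_url(server=SERVER):
    """Build the URL of the repository list (path assumed from Sesame's HTTP protocol)."""
    return server.rstrip("/") + "/repositories"


def list_repositories(server=SERVER, timeout=10):
    """Return the IDs of all repositories the server reports.

    Assumes the server answers with SPARQL JSON results in which each
    row carries an "id" binding. Requires a running Tomcat instance.
    """
    request = Request(
        repositories_url(server),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=timeout) as response:
        data = json.load(response)
    return [row["id"]["value"] for row in data["results"]["bindings"]]
```

With Tomcat running, print(list_repositories()) should show the repository IDs known to the server. A connection error most likely means that Tomcat is not started or listens on a different port.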
Using the workbench, one can create new repositories. For my Personal
Knowledge Base I create one of type "Native Java Store RDF Schema"
with ID "pkb" and title "Personal Knowledge Base".
Using the workbench, one can load and query graphs easily. It is useful when doing simple things, but for complex things, I will have to write my own tools.
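As a first step towards such tools, here is a minimal sketch of how a script could send a SPARQL SELECT query to the pkb repository over HTTP, again using only Python's standard library. It assumes that the repository is reachable at repositories/pkb below the server address, that queries are passed in the query parameter of a GET request as the SPARQL protocol specifies, and that the server can answer in the SPARQL JSON results format. The function names are my own, not part of Sesame.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint of the repository with ID "pkb" created above.
REPOSITORY = "http://localhost:8080/openrdf-sesame/repositories/pkb"


def build_query_request(sparql, repository=REPOSITORY):
    """Build a GET request that carries the query in the 'query' parameter."""
    url = repository + "?" + urlencode({"query": sparql})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def select(sparql, repository=REPOSITORY, timeout=10):
    """Run a SELECT query and return the result rows as a list of dicts.

    Requires a running server; each row maps variable names to bindings.
    """
    request = build_query_request(sparql, repository)
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

For example, select("SELECT * WHERE { ?s ?p ?o } LIMIT 5") would fetch five arbitrary statements from the repository, which is a quick way to see whether loading data through the workbench worked.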