Tuesday, May 6, 2008

Server-side storage of OWL syntax.

One concern I have with jOWL, a JavaScript library I maintain that parses and reasons with OWL-RDFS documents, is its scalability.


Scalability is, as we all know, of vital importance in a web setting. A potential drawback of the current jOWL JavaScript approach is that the user is required to download the entire ontology before being able to do anything with it. This might be fine and sensible for small sample ontologies such as the wine ontology (still 79kB) that has been used in previous semantic web demos at Ontology Online. Sadly, it doesn't take much imagination to see that loading could quickly become painfully slow once we make things a little more interesting.


So it dawned on me that it would be great to have a more dynamic load mechanism, where you only request what you really need. And it just so happens that Ajax (a web 2.0 technology), combined with JavaScript, is perfectly suited for that. So I have been working hard on some server-side code that responds to Ajax calls and sends back pieces of OWL-DL syntax. The image below roughly explains what I wish to achieve.
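To make the idea a bit more concrete, here is a minimal sketch of what on-demand loading could look like on the client. It is not jOWL code: `createLazyOntology`, `fetchFragment` and the in-memory `fakeServer` are names made up for this illustration, and the real server protocol may well end up looking different.

```javascript
// Sketch: load OWL fragments on demand instead of the whole ontology up front.
// `fetchFragment` is a hypothetical transport that returns the OWL syntax for
// one rdf:ID (in a browser, this would be an Ajax request to the server).
function createLazyOntology(fetchFragment) {
  const cache = new Map(); // rdf:ID -> Promise resolving to an OWL fragment

  return {
    // Request the fragment for `id`, hitting the server at most once per ID.
    get(id) {
      if (!cache.has(id)) {
        const pending = new Promise((resolve) => resolve(fetchFragment(id)))
          .catch((err) => {
            cache.delete(id); // forget failures so a later call can retry
            throw err;
          });
        cache.set(id, pending);
      }
      return cache.get(id);
    },
    // Number of distinct fragments currently cached or in flight.
    size() {
      return cache.size;
    }
  };
}

// In-memory stand-in for the server, so the sketch runs without a network.
const fakeServer = {
  Wine: '<owl:Class rdf:ID="Wine"/>',
  Merlot: '<owl:Class rdf:ID="Merlot"/>'
};
const ontology = createLazyOntology((id) => fakeServer[id]);

ontology.get('Merlot').then((owl) => console.log(owl));
ontology.get('Merlot'); // served from the cache, no second request
```

In a browser the transport could be something along the lines of `(id) => fetch('/fragment?id=' + encodeURIComponent(id)).then((res) => res.text())`, where the `/fragment` URL is purely an assumption. Caching the promise rather than the result means that two requests for the same object made in quick succession still cause only one round trip.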





At the same time I decided to rewrite the database code that stores ontology information, and to take a more XML-hybrid approach. The limitation of my older code is that it attempts to translate every aspect of OWL-RDF into a relational database table format, i.e. concrete row-column values. As I try to put the richer logic of OWL-DL to work, representing all intersections, unions and other advanced constructs in this format becomes a real pain and a huge coding overhead. You just can't seem to beat native OWL-XML syntax (yay for the OWL people).


The new approach I'm undertaking stores the full OWL syntax directly into the database. The database doesn't store the ontology in one bulky blob, but slices it up into small digestible units, each unit corresponding to one defined ontology object (e.g. an object referenced by a unique rdf:ID).
The only real modification to the native OWL syntax is a slight compression to reduce verbosity. In addition, some indexes (on terms, etc.) are created to allow (I hope) quick access.
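The storage idea can be sketched as follows, with an in-memory map standing in for the database. This is only an illustration under assumptions: `createUnitStore`, the record shape and the choice to index rdf:IDs and rdfs:label text are mine for the example, not the actual schema, and the fragments are simplified ones in the style of the wine ontology.

```javascript
// Sketch only: an in-memory stand-in for the database described above.
// One entry per ontology object, keyed by rdf:ID, plus a simple term index.
function createUnitStore() {
  const units = new Map();     // rdf:ID -> OWL fragment for that object
  const termIndex = new Map(); // lower-cased term -> Set of rdf:IDs

  function indexTerm(term, id) {
    const key = term.toLowerCase();
    if (!termIndex.has(key)) termIndex.set(key, new Set());
    termIndex.get(key).add(id);
  }

  return {
    // Store one unit and index its ID plus any rdfs:label text it contains.
    put(id, owlFragment) {
      units.set(id, owlFragment);
      indexTerm(id, id);
      const labelPattern = /<rdfs:label[^>]*>([^<]*)<\/rdfs:label>/g;
      let match;
      while ((match = labelPattern.exec(owlFragment)) !== null) {
        indexTerm(match[1].trim(), id);
      }
    },
    // Fetch the stored OWL syntax for a single object.
    get(id) {
      return units.get(id);
    },
    // Return the IDs of all units whose ID or label equals `term`.
    find(term) {
      const ids = termIndex.get(term.toLowerCase());
      return ids ? Array.from(ids) : [];
    }
  };
}

const store = createUnitStore();
store.put('Wine',
  '<owl:Class rdf:ID="Wine"><rdfs:label xml:lang="fr">vin</rdfs:label></owl:Class>');
store.put('Merlot',
  '<owl:Class rdf:ID="Merlot"><rdfs:subClassOf rdf:resource="#Wine"/></owl:Class>');

console.log(store.find('vin'));   // IDs of units labelled "vin"
console.log(store.get('Merlot')); // the OWL syntax for one object
```

The point of the design is that a lookup by term or by ID returns one small piece of native OWL syntax, so nothing has to be reassembled from row-column values.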


For those interested, the very first results of this labor can be seen at the jOWL server test page I have set up. Be warned: it's primordial and doesn't come with much explanation. There is also no concrete integration with jOWL just yet; that is for later. I guess I still have a long way to go.


You will notice that I'm also sticking with the wine ontology as a benchmark. Not that I'm a complete wine devotee (with all these wine-related demos I can see how people might come to think so). But seeing that the W3C originally used this ontology to illustrate the different aspects of OWL (cf. the OWL language guide), I figured that for testing out the syntax I might as well continue the tradition. Not sure how this will scale under somewhat heavier load, but those are problems for later :).