Showing posts with label semantic web. Show all posts

Wednesday, February 11, 2009

jOWL 1.0: Faster, Better, Smarter

I am very happy to be able to announce the release of jOWL version 1.0. Over the last months I have been making some significant changes to the underlying engine, enough to warrant a major version change.



Faster: Modification of Internal Indices


I performed some speed tests on the internal indices jOWL maintains, comparing native JavaScript arrays with associative arrays. It became quite clear that, for jOWL's purposes, associative arrays (key + value) are a whole lot faster to query. The jOWL indices have been converted into such arrays. Implementation-wise, this means that in many cases I see the load time for OWL files decrease by a factor of 10 or more(!). Accessing specific ontology resources also executes much faster as a result. To give you an example, the initial load for the 2 MB OBI ontology now takes about 109 ms (roughly 0.1 second), a drastic improvement over previous jOWL versions. Whether this makes it possible to load bigger ontology files, I cannot say.
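To illustrate the difference between the two kinds of index, here is a minimal sketch. It is not jOWL's actual code: the resource objects and the function names (findInArray, buildIndex, findInIndex) are assumptions made up for this example. It only shows why a keyed lookup avoids scanning the whole list.

```javascript
// Hypothetical resources; names and shape are assumptions for illustration only.
const resources = [
  { id: 'Wine', label: 'Wine' },
  { id: 'RedWine', label: 'Red wine' },
  { id: 'WhiteWine', label: 'White wine' }
];

// Array-based index: every lookup scans the list (linear time).
function findInArray(list, id) {
  for (let i = 0; i < list.length; i++) {
    if (list[i].id === id) return list[i];
  }
  return null;
}

// Associative index: built once, then queried by key.
function buildIndex(list) {
  const index = Object.create(null); // no inherited keys such as "toString"
  for (const resource of list) index[resource.id] = resource;
  return index;
}

function findInIndex(index, id) {
  return id in index ? index[id] : null;
}

const index = buildIndex(resources);
console.log(findInArray(resources, 'RedWine').label);
console.log(findInIndex(index, 'RedWine').label);
```

Both calls return the same resource; the second one simply does not depend on how many resources the ontology contains.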


Smarter: Reasoning on intersections


I improved the way jOWL deals with intersections. In OWL, if you declare the class "Red Wine" to be the intersection of a wine and an item that has red color, then it naturally follows that all wines with red color are red wines, even if you do not explicitly declare them as such. In other words, OWL relies on open world assumptions. Previous versions were already able to reason with this a little, but with jOWL 1.0 I made some significant steps forward. This has an immediate impact on the visualization of hierarchies. Take for example the hierarchy of wine, whitewine and redwine (see http://jowl.ontologyonline.org). In jOWL 1.0 fewer wines are direct descendants of "Wine" (18), but more wines are direct descendants of redwine and whitewine respectively. This makes jOWL significantly smarter than before. It is even able to pick up some duplicate intersections (inconsistencies), such as the ones that exist in the wine ontology between DryWine and TableWine: both are declared as the intersection of wine + hasSugar dry.
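The idea can be sketched roughly as follows. This is a simplified illustration under my own assumptions, not the jOWL engine: classes are plain objects, the class and property names are made up or borrowed loosely from the wine ontology, and the helper names (satisfies, inferParents, findDuplicates) are hypothetical.

```javascript
// Defined classes: anything satisfying every condition belongs to the class.
const definitions = {
  RedWine:   { parents: ['Wine'], restrictions: { hasColor: 'Red' } },
  WhiteWine: { parents: ['Wine'], restrictions: { hasColor: 'White' } },
  DryWine:   { parents: ['Wine'], restrictions: { hasSugar: 'Dry' } },
  TableWine: { parents: ['Wine'], restrictions: { hasSugar: 'Dry' } }
};

// Asserted classes: only their explicit statements are known.
const asserted = {
  Merlot:     { parents: ['Wine'], restrictions: { hasColor: 'Red' } },
  Chardonnay: { parents: ['Wine'], restrictions: { hasColor: 'White' } },
  Port:       { parents: ['Wine'], restrictions: { hasSugar: 'Sweet' } }
};

// True when the candidate meets every parent and restriction of the definition.
function satisfies(candidate, definition) {
  const hasParents = definition.parents.every(p => candidate.parents.includes(p));
  const hasRestrictions = Object.keys(definition.restrictions).every(
    prop => candidate.restrictions[prop] === definition.restrictions[prop]
  );
  return hasParents && hasRestrictions;
}

// Prefer inferred (more specific) parents over the asserted ones.
function inferParents(name) {
  const candidate = asserted[name];
  const inferred = Object.keys(definitions).filter(d => satisfies(candidate, definitions[d]));
  return inferred.length > 0 ? inferred : candidate.parents;
}

// Two definitions with identical conditions are reported as duplicates.
function findDuplicates(defs) {
  const seen = {};
  const duplicates = [];
  for (const name of Object.keys(defs)) {
    const restrictions = Object.entries(defs[name].restrictions)
      .sort(([a], [b]) => a.localeCompare(b));
    const key = JSON.stringify([[...defs[name].parents].sort(), restrictions]);
    if (seen[key]) duplicates.push([seen[key], name]);
    else seen[key] = name;
  }
  return duplicates;
}

console.log(inferParents('Merlot'));      // [ 'RedWine' ]
console.log(inferParents('Port'));        // [ 'Wine' ]
console.log(findDuplicates(definitions)); // [ [ 'DryWine', 'TableWine' ] ]
```

In this toy data set, Merlot moves from being a direct child of Wine to being a child of RedWine, which is the effect described above for the hierarchy view.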


There is one minor downside to this. Calling all descendants on rich hierarchies, in an ontology relying on a lot of intersections/open world assumptions (take for example the wine ontology), can now be a little slower than before, but only on the first query. Subsequent queries are cached and happen lightning-fast. For example, the SPARQL-DL query Type(?thing, Wine) takes a little time to complete on the first try.
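The "slow the first time, fast afterwards" behaviour is typical of result caching. A minimal sketch of that pattern, assuming a made-up child map and a hypothetical descendants function (jOWL's own cache is not shown here):

```javascript
// Hypothetical hierarchy: each class maps to its direct children.
const children = {
  Wine: ['RedWine', 'WhiteWine'],
  RedWine: ['Merlot'],
  WhiteWine: ['Chardonnay']
};

const cache = new Map();
let computations = 0; // counts how often real work is done

function descendants(name) {
  if (cache.has(name)) return cache.get(name); // answered from the cache
  computations++;
  const result = [];
  for (const child of children[name] || []) {
    result.push(child, ...descendants(child));
  }
  cache.set(name, result);
  return result;
}

console.log(descendants('Wine')); // first call walks the whole hierarchy
console.log(descendants('Wine')); // second call is served from the cache
console.log(computations);        // 5: one computation per class, none repeated
```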


Better: overall refactoring of code and completion of documentation


In previous versions, relation lookup functions on Individuals and Classes were unsatisfactorily implemented (chaotic). I replaced the different functions with one common function that accepts similar arguments for both Individuals and Classes: sourceof. This function filters relations on the class or individual by property, by target and possibly some additional settings. If you look at the source code you may also see that everything has been formatted in a more readable manner.


Documentation: I took some time to complete the documentation, which should be a great help in understanding the library and writing your own extension/implementation for it. I also tried to make the documentation a little more visually pleasing. It also uses jOWL to present a treeview of the index, something which seems quite useful to me.


SPARQL-DL: I more or less rewrote the engine. It is now more flexible and powerful, with a better and more complete implementation of SubClassOf, Type, Class and other queries. See the SPARQL-DL test page.


Styling: I adapted the User Interface components a little so that they now operate nicely with jQuery Themeroller generated custom themes. This can make styling the UI components a lot easier, but of course you can always implement your own styling as well.


Out of curiosity, I also tested whether jOWL can be used in an Adobe AIR application. While I do not have any working examples online, I can say it does. This may allow for some interesting possibilities as well...



Some minor bugs were squashed; see also the changelogs. I hope you enjoy it, and I am looking forward to some feedback. You can download the 1.0 version at the Google Code site. Make sure to also download the jOWLBrowser html & css files (version 1.0) that go with it if you are trying to set up a generic ontology browser.


Yours, David

Wednesday, January 7, 2009

Experimental TouchGraph Visualization

I had a little fun trying to implement a TouchGraph-like, but JavaScript-based, visualization of OWL data. To make it even more challenging (what fun would it be otherwise :)), I decided not to use Canvas or SVG elements nor Flash, which are usually used to draw/embed more advanced graphics in HTML pages. I wanted to rely on pure DOM-HTML manipulations. It's very experimental, but I'm quite happy with the result, which I got working on the major browsers.



Straight to the Example: See the jOWL TouchGraph visualization demo on the site, which uses jOWL to interact with the wine ontology, once again the source of data for this demo.



Readers may know that I recently fiddled a little (see previous blog post) with the Javascript Information Visualization Toolkit in an attempt to create a hyperbolic tree view for jOWL. The hyperbolic tree view demo was quite easy to set up, but even so I wasn't entirely happy with the result. The reason lies in the nature of hyperbolic tree visualizations, which only seem to work well if you visualize pure hierarchies (each node has at most one parent).
The majority of ontologies, however, are what is known as Directed Acyclic Graphs, which basically means that any given Class can have multiple parents. This multiple parenting is one of the things that make ontologies so powerful, but unfortunately it obscures a hyperbolic visualization a little.



The TouchGraph model ('force-directed graph layout') seems to have no problems with Directed Acyclic Graphs, and the results are therefore pretty cool and dynamic (see again: Demo).
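For readers unfamiliar with force-directed layouts, here is a minimal sketch of one simulation step. It is a generic textbook-style formulation (pairwise repulsion plus spring attraction along edges), not the code used in the demo; the function name step and the constants repulsion, spring and restLength are assumptions chosen for illustration.

```javascript
// One iteration: all node pairs repel each other, connected nodes attract.
function step(nodes, edges, { repulsion = 1000, spring = 0.05, restLength = 50 } = {}) {
  const forces = nodes.map(() => ({ x: 0, y: 0 }));

  // Repulsion, inversely proportional to the squared distance.
  for (let i = 0; i < nodes.length; i++) {
    for (let j = i + 1; j < nodes.length; j++) {
      const dx = nodes[i].x - nodes[j].x;
      const dy = nodes[i].y - nodes[j].y;
      const dist = Math.max(Math.hypot(dx, dy), 0.01); // avoid division by zero
      const f = repulsion / (dist * dist);
      forces[i].x += (dx / dist) * f; forces[i].y += (dy / dist) * f;
      forces[j].x -= (dx / dist) * f; forces[j].y -= (dy / dist) * f;
    }
  }

  // Spring attraction along each edge [a, b], pulling towards restLength.
  for (const [a, b] of edges) {
    const dx = nodes[b].x - nodes[a].x;
    const dy = nodes[b].y - nodes[a].y;
    const dist = Math.max(Math.hypot(dx, dy), 0.01);
    const f = spring * (dist - restLength);
    forces[a].x += (dx / dist) * f; forces[a].y += (dy / dist) * f;
    forces[b].x -= (dx / dist) * f; forces[b].y -= (dy / dist) * f;
  }

  return nodes.map((n, i) => ({ ...n, x: n.x + forces[i].x, y: n.y + forces[i].y }));
}

// A class with two parents (a DAG, not a tree) poses no special problem.
let layout = [
  { id: 'RedWine', x: 0, y: 0 },
  { id: 'DryWine', x: 120, y: 0 },
  { id: 'DryRedWine', x: 60, y: 200 }
];
const edges = [[0, 2], [1, 2]]; // both parents link to the same child
for (let i = 0; i < 50; i++) layout = step(layout, edges);
console.log(layout);
```

Because every node is positioned only by the forces acting on it, a node with several parents simply settles somewhere between them, which is why multiple parenting is less of an issue here than in a hyperbolic tree.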



I must definitely credit Mathieu 'P01' HENRI for his wonderful examples & scripts on how to create diagonal lines by manipulating HTML only, and Sean McCullough for that illuminating blog post on force-directed graph layouts, complete with code examples.



Conclusion: It can be improved a lot, and it's an experiment, but a fun one with great-looking results. One downside: loading lots of nodes turns the application into a slideshow, but if we keep it reasonable, it seems to render quite fast.

Monday, July 14, 2008

jOWL status update


I packaged the latest development version of jOWL into a 0.5 release, available at Google Code. jOWL is an AJAX/javascript extension to jQuery that I am developing. The jOWL library parses and reasons with OWL-DL documents. Supported browsers for this release are Internet Explorer 7 and Firefox 2 & 3.


This release is accompanied by several new and impressive demos (in my humble opinion). These make use of the new functionality that has been incorporated so far. Below are some important highlights.


Reasoning over Individuals: Demo 2: jOWL integration with Simile Exhibit


Simile Exhibit is an MIT project that allows complex data visualisations to be defined entirely on the client-side (javascripted). This means no server-side scripting is needed (no PHP, Ruby, JSP, ASP, ...) and no database access is required. In Exhibit, data is defined in a rich, but flat, JSON format.

The jOWL library is designed with similar ideas and goals in mind, but focuses on OWL semantics and vertical search instead. So on the one hand jOWL takes into account subsumption/inheritance, but on the other hand Exhibit has powerful visualisations. I believed it would be an interesting experiment to try and reconcile the two. The results of this labor can be seen at the jOWL - Exhibit demo page that has been set up. jOWL loads the OWL file, grabs all instances of the specified class (wine), grabs all restrictions (relations) applicable for these, and converts this into a JSON data format that Exhibit can read. Exhibit then takes care of the further visualisations and filtering.
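The conversion step can be sketched roughly like this. The input objects and the function name toExhibitJson are assumptions for illustration (the demo's real code works on jOWL objects); the output shape, an object with an "items" array whose entries carry "label" and "type", follows the Exhibit JSON convention as I understand it.

```javascript
// Hypothetical individuals with their relations already collected.
const individuals = [
  { name: 'Meursault', type: 'Wine', relations: { hasColor: 'White', hasSugar: 'Dry' } },
  { name: 'ChateauMargaux', type: 'Wine', relations: { hasColor: 'Red', hasSugar: 'Dry' } }
];

// Flatten each individual into one Exhibit item: label, type, then one key per relation.
function toExhibitJson(list) {
  return {
    items: list.map(individual => ({
      label: individual.name,
      type: individual.type,
      ...individual.relations
    }))
  };
}

console.log(JSON.stringify(toExhibitJson(individuals), null, 2));
```

The flat items lose the class hierarchy, which is why the demo hooks back into jOWL when more ontological context is needed.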


The Exhibit results also hook back to some jOWL functionality, which allows visualisation of a little more supporting information. Clicking on Meursault in the below example (image) shows the ontological classification of this type of wine.



jOWL Exhibit Demo - screenshot


Ontology Browsing


One item on my todo list was to expand the tests and demos to include more diverse types of OWL-DL files. Even though OWL-DL is a standard, there are alternative ways to express things. In general, ontology elements are uniquely referenced by the attribute 'rdf:ID'. It is however also possible to use external declarations only, by means of the 'rdf:about' attribute and external URIs. This has some particular benefits, most importantly that the identity of the ontology elements always points to one and the same location, no matter the location of the OWL file, and it is therefore common practice in building ontologies. The external URI should then declare the ontology elements as well at the specified location, but funnily enough this is often overlooked. Previously jOWL only indexed elements referenced by 'rdf:ID', but this has now been expanded to also include elements only referenced through 'rdf:about'. The concrete result of this rewrite is that ontologies like the Basic Formal Ontology or the Ontology of Biomedical Investigations can now also be properly visualised with jOWL.
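A minimal sketch of how both styles of reference can end up in one index. This is not the jOWL parser: elements are represented as plain attribute objects, and the function names resolveIdentity and buildIndex as well as the example URIs are made up for illustration. The only RDF/XML rule relied on is that rdf:ID="x" identifies the resource at the base URI followed by "#x", while rdf:about gives the URI directly (resolved against the base if relative).

```javascript
// Work out the full URI of an element from its attributes.
function resolveIdentity(attributes, baseUri) {
  if (attributes['rdf:ID']) {
    return baseUri + '#' + attributes['rdf:ID'];
  }
  if (attributes['rdf:about']) {
    return new URL(attributes['rdf:about'], baseUri).href;
  }
  return null; // anonymous element, nothing to index
}

// Index every identifiable element by its URI.
function buildIndex(elements, baseUri) {
  const index = new Map();
  for (const element of elements) {
    const uri = resolveIdentity(element, baseUri);
    if (uri !== null) index.set(uri, element);
  }
  return index;
}

const base = 'http://example.org/wine.owl';
const elements = [
  { 'rdf:ID': 'Wine' },
  { 'rdf:about': 'http://example.org/other.owl#Grape' },
  {} // anonymous, skipped
];
const uriIndex = buildIndex(elements, base);
console.log([...uriIndex.keys()]);
```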


Another change is that the foundations for a new User Interface component are being drafted. This new component is meant to visualise descriptions, labels, disjoints, outgoing non-hierarchical relations, etc. The prototype (under development) can be seen, for example, in the BFO (Basic Formal Ontology) demo. Hand in hand with this component comes the ability to use permalinks, links that allow you to externally reference a given owl:Class. Examples of the permalink functionality are also visible in that demo.



Documentation, and visualisation of RDFa embedded OWL syntax


Documentation was somewhat lacking in the past, but I put in an effort to create it. Even better, jOWL is used to visualize it. Initially I defined the documentation in an OWL-DL file. I kept that (now outdated) OWL file accessible, because it demonstrates how to include HTML syntax in rdfs:comment elements (by wrapping it in CDATA tags). But in the end I decided to take another approach with this documentation. Instead of loading an OWL-DL file, all the documentation resides in an HTML file, which is enriched with RDFa markup. The RDFa indicates which ontology classes exist, along with their descriptions ('rdfs:comment') and more. jOWL then extracts an OWL-DL file directly from this RDFa-enriched HTML page, and adapts the page to visualize it the jOWL way instead. Basically this is a real application of what I previously wrote about in the blog post 'Embedding OWL syntax in HTML with RDFa'.


Some people may have doubts about this approach, and I must admit I haven't entirely made up my mind on its applicability either. For one, it is only suitable if the OWL syntax remains simple, as for example in the case of documentation, where hierarchical information (vertical representation of data) is enough. But it does come with some major advantages:


  • A big plus, just being practical: content is accessible to search engine spiders and to people who disable JavaScript. Ergo it solves the search engine access issues always seen with any other AJAX implementation.

  • It is much simpler and more user-friendly than writing an OWL-DL file. It still requires some effort, but (in combination with jOWL) it seriously lowers the adoption barrier for using OWL.

  • The flat representation of the documentation is hidden, and instead the user is presented with a more dynamic/intelligent view of that data.



And finally: the link to this documentation, which serves as a demo in its own right.



Hope you enjoy the new demos,

David.

Tuesday, May 6, 2008

Server-side storage of OWL syntax.

One concern I have with jOWL, a JavaScript library I am maintaining that parses and reasons with OWL-RDFS documents, is its scalability.


Scalability is, as we all know, of vital importance in a web setting. A potential drawback of the current jOWL JavaScript approach is that the user is required to download the entire ontology before being able to do anything with it. This might be fine and sensible in the case of small sample ontologies such as the wine ontology (still 79 kB) that has been used in previous semantic web demos at Ontology Online. Sadly, it doesn't take much imagination to see that it could quickly become gruesome to load when we make things a little more interesting.


So it dawned on me that it would be great to have a more dynamic load mechanism, where you only request or get what you really need. And it just so happens that Ajax (web 2.0 technology), combined with javascript, is perfectly suited for that. So I have been working hard on creating some server side code that is able to respond to Ajax calls and send back pieces of OWL-DL syntax. The image below kind of explains what I wish to achieve.





At the same time I decided to rewrite the database code that allows me to store ontology information, and take a more XML-hybrid approach. The limitation of the older code I have is that it attempts to translate every aspect of OWL-RDF into a relational database table format, i.e. concrete row-column values. As I try to put the richer logic of OWL-DL to work, it becomes a real pain and a huge coding overhead to represent all intersections, unions or other advanced constructs in this format. You just can't seem to beat native OWL-XML syntax (yay for the OWL people).


The new approach I'm undertaking stores the full OWL syntax directly into the database. The database doesn't store the ontology in one bulky blob, but slices it up into small digestible units, each unit corresponding to one defined ontology object (e.g. an object referenced by a unique rdf:ID).
The only real modification to the native OWL syntax is a slight compression / reduction in verbosity. In addition, to allow (I hope) quick access, some indexes are created (terms, etc.).
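The storage idea (one unit of OWL syntax per ontology object, plus a term index for quick access) can be sketched as below. This is an in-memory stand-in written for illustration only: the real server stores the units in a database, and the class name OntologyStore, its methods and the sample snippets are all assumptions.

```javascript
// In-memory stand-in for the database: one OWL snippet per rdf:ID, plus a term index.
class OntologyStore {
  constructor() {
    this.units = new Map();     // rdf:ID -> OWL syntax for that object
    this.termIndex = new Map(); // lower-cased term -> set of rdf:IDs
  }

  add(id, owlSyntax, terms = []) {
    this.units.set(id, owlSyntax);
    for (const term of terms) {
      const key = term.toLowerCase();
      if (!this.termIndex.has(key)) this.termIndex.set(key, new Set());
      this.termIndex.get(key).add(id);
    }
  }

  // Return only the requested unit instead of the whole ontology.
  get(id) {
    return this.units.has(id) ? this.units.get(id) : null;
  }

  findByTerm(term) {
    return [...(this.termIndex.get(term.toLowerCase()) || [])];
  }
}

const store = new OntologyStore();
store.add('Wine', "<owl:Class rdf:ID='Wine'/>", ['wine', 'vin']);
store.add('RedWine', "<owl:Class rdf:ID='RedWine'/>", ['red wine']);

console.log(store.findByTerm('Vin'));
console.log(store.get('RedWine'));
```

A client can then request a single unit by its identifier, which is the kind of on-demand loading the Ajax approach is meant to enable.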


For those interested, the very first results of this labor can be seen at the jOWL server test page I have set up. Be warned: it's primitive and doesn't come with much explanation. There is also no concrete integration with jOWL just yet; that is for later. I guess I still have a long way to go.


You will notice that I'm also sticking to the wine ontology as a benchmark. Not that I'm such a complete wine devotee (with all these wine-related demos I can see how people might come to think that). But seeing that this ontology was originally used by the W3C to illustrate the different aspects of OWL (cf. OWL language guide), and consequently to test out the syntax, I figured I might as well continue the tradition. Not sure how this will scale under somewhat heavier load, but those are problems for later :).

Thursday, March 6, 2008

Semantics in the wild: new jOWL wine Demo

I've prepared a more ambitious jOWL demo that shows off some of the reasoning capability. For the unacquainted, jOWL is a 'semantic' javascript library, under development by OntologyOnline.org, with the intention to make OWL-RDFS files (see also the previous article on OWL) a little more accessible, and thereby bring web3.0 technology another step closer to the user.

The thing about the demo that cheers up the geek within is this: there is absolutely no server-side scripting involved and no database access is required; all the reasoning is taken care of by your own computer.
It's a great example of how added semantics might influence your future browsing experience. The visuals don't change (web2.0 got that packed to go for us), but the user experience could. There is suddenly some 'smartness' in browsing, and it is something one might quickly get accustomed to. Like all great technology, you should hardly notice it until it's gone; most of it happens under the hood.

I'm still polishing some of the details, and will continue to do so for quite some time, adding more features as we go, but the code is cross-browser compatible (tested on Firefox and Internet Explorer) and I wanted to share that with you.
Expect more demos in the future as well.

Comments are possible at our discussion group.

Thursday, January 3, 2008

Semantic Annotation with RDFa: a simple experiment

This article will demonstrate an application for semantic annotation with RDFa. This is more or less a follow-up to our previous article 'Embedding OWL-RDFS syntax in XHTML with RDFa'. If you need additional information on RDFa and semantic annotation, see that article.


Goal


The idea is to enrich the markup of certain sections of a webpage with an annotation, in such a way that computers can recognize it for what it is (semantic annotation, invisible to the user) and do something with it.
In this case, the goal was to return a set of language terms related to an annotation: either direct synonyms, or language terms from parents, or sources or targets of relations. The resulting procedure generates an ontological table of contents for that page, focused around the given annotation, and renders it directly within the original web page. In particular, it adds a jQuery-enabled AJAX script to the page that retrieves data from the OntologyOnline server and visualizes the results directly within the page you are browsing. It takes the form of an information box containing a clickable, ontological index of related terms occurring in the page, plus a description (if present) for the given concept. This box can be dragged across the screen. The script also highlights the terms it found in a manner similar to Google search highlighting. The results can be discarded if no longer needed.
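As a rough idea of the highlighting step only, here is a small sketch that wraps known terms in a span. It works on plain text rather than on a live DOM, and it is not the Operator script itself; the function names escapeRegExp and highlightTerms and the CSS class name are assumptions for illustration.

```javascript
// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Wrap every whole-word occurrence of a term in a highlight span (case-insensitive).
function highlightTerms(text, terms) {
  if (terms.length === 0) return text;
  // Longer terms first, so "immature neutrophil" wins over "neutrophil".
  const ordered = [...terms].sort((a, b) => b.length - a.length);
  const pattern = new RegExp('\\b(' + ordered.map(escapeRegExp).join('|') + ')\\b', 'gi');
  return text.replace(pattern, '<span class="highlight">$1</span>');
}

console.log(highlightTerms('An immature neutrophil is a cell.', ['neutrophil']));
// An immature <span class="highlight">neutrophil</span> is a cell.
```

A real page would need to walk the DOM text nodes instead, so that markup and attributes are left untouched.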






Figure 1: Information box on a semantic annotation, rendered directly within the annotated page.



It requires at least the following components:

  1. A web page containing a semantic RDFa annotation.

  2. The ability of the browser or a browser extension (Firefox - Operator plugin) to pick up the semantic annotation and, if the user desires to do so, trigger an AJAX call (a request for information).

  3. A service to which the AJAX call is directed (OntologyOnline.org service) and which returns useful (ontological) information about the annotated construct.



1. Semantic Annotation: RDFa markup


The idea is to mark up a section of a web page with a reference to an ontological class that describes what that section is about. We do this by using RDFa.
The RDFa markup makes use of the about attribute and the class attribute. This syntax is not entirely optimal as it does not really extract as correct OWL-DL syntax (and the class and about attributes should probably be switched around as well, but that does not visualize well in the plugin).
A combination with the instanceof attribute might be more appropriate (cf. RDFa primer: @instanceof) to declare what kind of ontology concept the given entry is an instance of, but this RDFa attribute is not yet supported by the current release of the Operator plugin (see below), so for the time being we will have to stick with 'about'.



<div class="owl:Thing"
about="celltypeontology:immature+neutrophil">text section ...</div>


Figure 2: Embedding instance information by using the class & about attributes.


Also mind that it is always a good idea (but in this case not an absolute necessity) to declare the namespaces you used in the document as well:


<div
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:celltypeontology
= "http://ontologyonline.org/visualisation/c/Directory/">...
</div>


Figure 3: Including namespaces.





2. The Operator action script


The Firefox extension Operator by Michael Kaply and Elias Torres recognizes RDFa in XHTML pages, and allows users to define custom actions (an Action User script) for these annotations that can be triggered specifically by the person browsing the page.
We've created an Action User script for the Operator extension to firefox that makes an AJAX call to an ontologyonline JSONP service (see section below). This script can be downloaded at: OntologyOnline.org/scripts/OntologyOnline_Operator.js.

To set up the script:

  1. Get Firefox if you do not have it already.

  2. Install the Operator plugin.

  3. Download the Operator user script to your computer (Right-Click OntologyOnline_Operator.js, select save as).

  4. Browse to the Options section of the operator plugin, select user scripts tab, hit new, and select the script you downloaded above. Close options.

  5. Re-open options, select the actions tab, hit new, and select 'create a topic map - ontology online'. Close options.

  6. Restart the Browser.

  7. Browse to a semantically annotated page. The 'Resources' button should be highlighted; select the resource, and then select the 'topic map' action that should be visible.
    Demo Pages:








Figure 4: The topic map action.



3. The Ontology Online service


We've set up an AJAX - JSONP service that returns term information on a semantic annotation, if the concept is known to the Ontology Online semantic database. If you wonder what JSONP is: it is a technique that enables you to perform some AJAX calls (requests for data) across domains (i.e. hypothetically any web page can access the data). See Remote JSON for more information.
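For those who have not seen JSONP before, the mechanism can be sketched as follows. This is a generic illustration, not the Ontology Online service: the function names jsonpResponse and simulateScriptLoad, the callback name handleTerms and the payload are all made up. In a browser the response would be loaded through a script tag; here that step is simulated so the sketch can run on its own.

```javascript
// Server side (sketch): wrap the JSON payload in a call to the requested callback.
function jsonpResponse(callbackName, data) {
  // Only accept plain identifiers, so the callback name cannot inject code.
  if (!/^[A-Za-z_$][\w$]*$/.test(callbackName)) {
    throw new Error('Invalid callback name');
  }
  return callbackName + '(' + JSON.stringify(data) + ');';
}

// Client side (simulated): a browser would execute the response as a script,
// which calls the named global callback with the data.
function simulateScriptLoad(scriptText, callbacks) {
  const names = Object.keys(callbacks);
  new Function(...names, scriptText)(...names.map(name => callbacks[name]));
}

let received = null;
const body = jsonpResponse('handleTerms', {
  concept: 'immature neutrophil',
  terms: ['neutrophil', 'granulocyte']
});
simulateScriptLoad(body, { handleTerms: data => { received = data; } });
console.log(received.terms);
```

Because the response is executed as script rather than read as data, the usual same-domain restriction on AJAX requests does not apply, which is what makes the cross-domain call possible.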


Some last remarks


In my opinion, this experiment also shows one of the advantages of using RDFa as opposed to regular microformats. By using RDFa we do not impose content limits or restraints on an annotation; it is possible to use any semantic construct, as long as it is known to an ontology service like the Ontology Online Topicmap service.
The set-up has deliberately been kept simple. Improvements can (and may in the future) be made on several counts, but I wanted to share this with you anyway.
Be aware that for large web documents the script may have some more computation work to do. I'm also hoping to come up with some more elaborate use cases in the distant future.




Comment on this article: Discussion group.













Tuesday, November 27, 2007

Embedding OWL-RDFS syntax in XHTML with RDFa

Short introduction to RDFa, OWL and Microformats


The OWL language (Web Ontology Language) is a recommendation by the W3C. It is a language that allows proper definition and representation of ontologies. It is supposed to form the foundation of the next generation of the internet: the semantic web (a.k.a. web3.0). Unfortunately, even though it has been a recommendation since 2004, it has gained very little traction in the online community.
There are several reasons for this, and some of them are discussed in this article.

Microformats are small structural mark-ups of HTML, with a very consistent format. They are somewhat of a web 2 and a half approach, allowing machine readability through consistent structure, but lacking the flexibility to capture more than what is defined in the limited Microformat standard. Microformats have more or less emerged spontaneously. Through their ease of use, Microformats have gained a lot of momentum, but as mentioned, they lack the expressivity required to really push the boundaries of the current internet experience.

This article is about coming up with a solution that reconciles the ease of use of Microformats with the expressivity of a language like OWL.
Some problems hindering OWL adoption will be highlighted, and a first experiment with the use of RDFa mark-up to embed OWL data directly into an XHTML page will be demonstrated, a solution that can be considered a step higher than Microformats on the evolutionary ladder of the web.


Terminology frequently used in this article:
Instance

A real world occurrence of a class (a bordeaux wine you just bought in the store).

Class

A unit of meaning (wine as a 'category').

Parent

A class with a meaning a little broader than the given Class (less specific), for example 'Potable (drinkable) Liquid' in the case of wine.

Semantic items

Human and machine understandable content.



OWL-RDFS


There are three sublanguages of OWL, named OWL-Full, OWL-DL and OWL-Lite. Funnily enough, and contrary to what you might think, OWL-Lite is the most strict of the three and hardest to achieve. OWL-Full is the least interesting of the three dialects; it is often gibberish to reasoners (programs that interpret your semantic constructs) because you can intermingle anything with everything. Usually, the goal is to produce OWL-DL syntax at the least.
All OWL syntax examples in this document validate as OWL-Lite; copy-paste the syntax into the OWL Validator to check. For a more complete introduction to OWL, see the W3C OWL language guide.


Why OWL-RDFS has such poor web presence.


There are several reasons why OWL syntax is nowhere to be seen on the web, and some of these problems are as follows:

1. OWL is Verbose

It really is, but, in my opinion, this does not constitute that much of a problem. Verbosity can be good if it improves readability.
See next point for some considerations however.
On verbosity: below is an example of the minimal OWL-DL compatible syntax required to declare one class, with one parent specified and one outgoing relation, no comments or labels included, and no advanced relations. As you can see in the figure below, that is already quite a mouthful.


<rdf:RDF
xmlns ='http://this.page'
xml:base ='http://this.page'
xmlns:owl ='http://www.w3.org/2002/07/owl#'
xmlns:rdf ='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:rdfs ='http://www.w3.org/2000/01/rdf-schema#'
xmlns:xsd ='http://www.w3.org/2001/XMLSchema#'>
<owl:Ontology rdf:about='http://example.com/dummy_ontology'/>
<owl:Class rdf:ID='id_for_this_class'>
<rdfs:subClassOf>
<owl:Class rdf:about='http://example.com/dummy_parent' />
</rdfs:subClassOf>
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty>
<owl:ObjectProperty
rdf:about='http://example.com/dummy_property'/>
</owl:onProperty>
<owl:someValuesFrom>
<owl:Class
rdf:about='http://example.com/dummy_target' />
</owl:someValuesFrom>
</owl:Restriction>
</rdfs:subClassOf>
</owl:Class>
</rdf:RDF>


Figure 1: OWL Verbosity: Expressing one class, including one parent and one outgoing relation.


2. OWL is Incomprehensible

Well, not really, there is a lot of logic in OWL. At least computers should have no problem understanding. Honestly!?
But to the average human, it is a bit like learning to read bits and bytes. Part of the reason why Microformats are successful is that every web designer understands CSS and HTML. Before you delve into OWL, you should have a good understanding of RDF, XML namespaces, URIs, ... and even though all OWL documents are RDF documents, the language is in many ways nothing like RDF. Finally, you also need to become familiar with ontological jargon (classes, instances, subsumption, disjoints, ...) and methodology, and how it impacts your ontology. This, in many ways, resembles learning algebra. The methodology is not something people are accustomed to, and there is a general 'why bother' attitude, but once learned it paves the road to new insights. So there is a bit of a learning curve.


Perhaps the most significant problem, however, is that there are multiple ways of expressing one and the same thing. Consider the examples below. For both, ONLY the syntax varies from what is expressed in Figure 1; the content remains the same (mind that this is still about the very simple set-up: one class, one parent and one relation). They both validate as OWL-Lite as well. The bottom example is the one that most resembles an original RDF graph, but it is also the hardest one to read. It is an important example however, seeing that this is more or less what RDF extracted from an RDFa-XHTML page looks like (more on that later).



<rdf:RDF
xmlns ='http://this.page'
xml:base ='http://this.page'
xmlns:owl ='http://www.w3.org/2002/07/owl#'
xmlns:rdf ='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:rdfs ='http://www.w3.org/2000/01/rdf-schema#'
xmlns:xsd ='http://www.w3.org/2001/XMLSchema#'>
<owl:Ontology rdf:about='http://example.com/dummy_ontology'/>
<owl:Class rdf:ID='id_for_this_class'>
<rdfs:subClassOf rdf:resource='dummy_parent' />
<rdfs:subClassOf>
<owl:Restriction>
<owl:onProperty rdf:resource='dummy_property' />
<owl:someValuesFrom rdf:resource='dummy_target' />
</owl:Restriction>
</rdfs:subClassOf>
</owl:Class>
<owl:Class rdf:about="dummy_parent"/>
<owl:ObjectProperty rdf:about="dummy_property"/>
<owl:Class rdf:about="dummy_target"/>
</rdf:RDF>

<!--
reference to parent & to target: must be declared
in this document as being of the owl:Class type to be
owl-DL compatible.
reference to property: must be declared as being of type
owl:ObjectProperty in this document to be owl-DL compatible.
-->


<rdf:RDF
xmlns ='http://this.page'
xml:base ='http://this.page'
xmlns:owl ='http://www.w3.org/2002/07/owl#'
xmlns:rdf ='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns:rdfs ='http://www.w3.org/2000/01/rdf-schema#'
xmlns:xsd ='http://www.w3.org/2001/XMLSchema#'>
<rdf:Description rdf:about="http://example.com/dummy_ontology">
<rdf:type
rdf:resource="http://www.w3.org/2002/07/owl#Ontology"/>
</rdf:Description>
<rdf:Description rdf:ID="id_for_this_class">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
<rdfs:subClassOf rdf:resource="#dummy_parent" />
<rdfs:subClassOf rdf:resource="#restriction1" />
</rdf:Description>
<rdf:Description rdf:ID="restriction1">
<rdf:type
rdf:resource="http://www.w3.org/2002/07/owl#Restriction"/>
<owl:onProperty rdf:resource='#dummy_property' />
<owl:someValuesFrom rdf:resource='#dummy_target' />
</rdf:Description>
<rdf:Description rdf:ID="dummy_parent">
<rdf:type
rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:ID="dummy_target">
<rdf:type rdf:resource="http://www.w3.org/2002/07/owl#Class"/>
</rdf:Description>
<rdf:Description rdf:ID="dummy_property">
<rdf:type
rdf:resource="http://www.w3.org/2002/07/owl#ObjectProperty"/>
</rdf:Description>
</rdf:RDF>


Figure 2: OWL variancy: The two examples above only differ syntactically from the OWL expression found in figure 1, the content is the same, and both validate as OWL-Lite as well. Cfr. OWL Validator.


3. No explored approaches to align / integrate OWL with current web content.

In my opinion, this is the biggest roadblock that stands in the way of public OWL adoption.
For ontological data to truly be useful, you need to somehow tie current web content with semantic classes and instances. OWL has failed miserably in this respect so far. It is much like a far away island of Eden, and a man without a canoe. All the important data could be there but no-one knows how to reach it. Granted, OWL does define a standard format for data interchange between applications, but this limited scope cannot be what the semantic vision is about.

What we need is semantic annotation. You need to be able to tag sections of your content with explicit ontology classes and even relations, without it hindering the display of your content. If you wrote a piece on a certain bordeaux for example, you could mark up the section as being about an instance of wine, perhaps even with some properties defined (or even more amazing, just by knowing it is about wine properties can be extracted automatically, a mention of red is bound to be about the wine color).

Take a look at our Freebase widget for the desired search engine functionality. You should be able to look for wine (example) by specifying color, origin, vintage year and more. The functionality present on that page is thanks to the Freebase service.
Freebase (big thumbs up) really understands the need to couple instance data (they call it 'Topics') and content (images, text) to semantic classes ('Type' in freebase lingo).

The amazing observation here is that you would not need a service like freebase to pull out this kind of data if:

  • It were possible to directly associate your content with certain units of meaning (semantic classes), something which is referred to as 'semantic annotation'.

  • Search engines would index these annotations.


To achieve this goal we need several things to happen:

  • The ability to easily reference ontological classes; at the least there should be one or more highly accessible repositories of ontological data to which one can refer. This is part of the rationale behind ontologyonline.org and why it offers one page per semantic class: it can be seen as an ontological tag index.

  • The ability to embed semantic content directly and invisibly within (X)HTML.


This brings us back to the topic of this section: the current divide between semantic data and web content. There is hope on the horizon, however. There seem to be possibilities, and I'd like to demonstrate one of them in this article: the use of RDFa.
Roughly speaking, RDFa is a next-generation Microformat. It is a structured mark-up that allows direct inclusion of RDF into XHTML, and seeing that OWL is more or less an extension to RDF, an interesting thought would be to try and use RDFa to embed OWL in a web page.
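To make the idea of annotating ordinary content a bit more concrete, here is a minimal Python sketch. It is not a real RDFa processor: it only reads the `about`, `property`, `rel` and `href` attributes of a flat fragment. The `wine:` vocabulary, the `example.org` URLs and the function name `extract_annotations` are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML fragment about a Bordeaux. The "wine:" prefix and the
# example.org URLs are assumptions made for this sketch only.
FRAGMENT = """
<p xmlns:wine="http://example.org/wine#"
   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   about="#my-bordeaux">
  Last night we opened a <span property="wine:hasColor">red</span>
  <a rel="rdf:type" href="http://example.org/wine/Wine">Bordeaux</a>
  from <span property="wine:vintageYear">2003</span>.
</p>
"""

def extract_annotations(xhtml):
    """Return (subject, predicate, object) tuples from a flat annotated fragment."""
    root = ET.fromstring(xhtml)
    subject = root.get("about")
    triples = []
    for el in root.iter():
        if el.get("property"):
            # 'property' marks a literal value: the element's own text.
            triples.append((subject, el.get("property"), (el.text or "").strip()))
        elif el.get("rel") and el.get("href"):
            # 'rel' plus 'href' marks a link to another resource.
            triples.append((subject, el.get("rel"), el.get("href")))
    return triples

TRIPLES = extract_annotations(FRAGMENT)
for triple in TRIPLES:
    print(triple)
```

The visible text of the paragraph is untouched; everything a search engine would need sits in attributes.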

Integration into XHTML markup: the promise of RDFa


This article does not attempt to provide a full introduction to RDFa; for that, see W3C's RDFa Primer. As mentioned above, the use case for the majority of people would be to mark up content with reference to semantic classes (e.g. tag your content as being an instance of semantic class Y). In this experiment, however, we focus on embedding OWL class information and ontology information directly within XHTML. Embedding instance information (as required for the above) should be easier and may be discussed in a later Online Ontology Visualisation blog entry.


As said before, part of Ontology Online's design is to offer one page per semantic class. This makes referencing a lot easier: instead of having to use URIs (like a URL, but may also point to sections of a certain page or a given part of a document), you can just use a page URL to retrieve a reference to an ontological class. As will be demonstrated, it also makes the RDFa approach a lot cleaner. Of note, the mark-up has been simplified a little (information not required for this demonstration has been stripped) for easier reading. The figure below demonstrates what the content that has RDFa embedded looks like visually (the RDFa is completely invisible to the user).



Figure: Visual output of the RDFa-embedded XHTML.



Step 0. Being Standards-compliant


This is the only pain point of embedding RDFa. If you wish to get your page to validate against the W3C Validator service, you must set up your page properly. Besides needing valid XHTML if you wish to extract the RDFa from the page, you also need to change the DOCTYPE declaration of the page and the content type you serve, and declare any namespaces you use in your document. Beware that changing the DOCTYPE may influence the display of your page a little (CSS is applied more strictly) and that changing the content type may affect JavaScript functionality, so this may not even be an option for you. Fortunately, this is only a long-term goal to strive for: the RDFa will still be extractable even with a less properly defined DOCTYPE and content type (hence step 0).



<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN"
"http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:dc="http://purl.org/dc/elements/1.1/">
<head>
<meta http-equiv="Content-Type"
content="application/xhtml+xml; charset=UTF-8"/>
...


Figure: Declarations you need if you wish to be Standards-compliant.
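The namespace declarations matter because every prefixed value used in the following steps (owl:Class, rdfs:label, ...) is shorthand for a full URI. A minimal Python sketch of that expansion, using the prefixes declared above (the function name `expand_curie` is just an illustrative choice):

```python
# Prefix-to-namespace mapping, copied from the xmlns declarations above.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def expand_curie(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as 'owl:Class' into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise KeyError("undeclared prefix in %r" % curie)
    return prefixes[prefix] + local

print(expand_curie("owl:Class"))
print(expand_curie("rdfs:label"))
```

A value with an undeclared prefix raises an error here, which mirrors why a namespace you use but forget to declare leaves the annotation meaningless.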


You might notice that when you run Ontology Online concept pages through the W3C validator, one error still persists: 'Conflict between Mime Type and Document Type'. This is because the server does not yet serve the correct content type (it seems to affect JavaScript functionality, and that is something we do not wish to deal with at this point).

Step 1. Declaring the ontology


On any page describing a semantic class (concept page) the ontology is referenced by a 'rev' attribute on an anchor element. 'rev' expresses an inverse relationship in RDFa. In this case we say that the ontology (defined at the target of the 'href' attribute) has an owl:Class (designates a class of the ontology) which is described on this page. The mechanism to declare the ontology itself is not discussed in this article, but can easily be seen by examining the source of any of the ontology HTML pages at ontologyonline.org.


<a rev="owl:Class"
href="http://ontologyonline.org/visualisation/c/CellTypeOntology/">
Cell Type Ontology</a>


Figure: Declaring the ontology on a concept page.
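The difference between 'rel' and 'rev' is only the direction of the resulting statement. A small Python sketch of that rule follows; the concept page URL for 'mature B cell' is an assumption made for the example, and `link_triple` is an illustrative helper, not part of any library.

```python
def link_triple(page_url, attribute, predicate, href):
    """Build the (subject, predicate, object) statement for a link on a page."""
    if attribute == "rel":
        # Forward relation: this page -> link target.
        return (page_url, predicate, href)
    if attribute == "rev":
        # Inverse relation: link target -> this page.
        return (href, predicate, page_url)
    raise ValueError("expected 'rel' or 'rev', got %r" % attribute)

ONTOLOGY = "http://ontologyonline.org/visualisation/c/CellTypeOntology/"
# Assumed URL of the concept page; only the ontology URL appears in the figure.
PAGE = ONTOLOGY + "mature_B_cell"

print(link_triple(PAGE, "rev", "owl:Class", ONTOLOGY))
```

With 'rev' the ontology ends up as the subject, so the statement reads "the ontology has an owl:Class described on this page".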


Step 2. Declaring the Class


In RDFa an 'about' attribute is used when talking about something that does not refer to the entire page. Seeing that the resource here is indeed the entire page (the entire page equals the semantic class) we do not use an 'about' attribute. The class is defined simply by adding a class attribute with value owl:Class in the markup. Notice how the markup does not affect the content itself, nor its display. (Hint: for those who did not know, it is possible to use multiple classes in HTML markup by providing a space-separated list, for example class='owl:Class concept blue'; CSS will take all of them into account.) The property attribute with value rdfs:label provides a nicer representation name than just 'this page URL'. Ideally an xml:lang attribute would be specified as well, denoting for example that the label is in English (xml:lang="en", work in progress).


<h1 class="owl:Class" property="rdfs:label">mature B cell</h1>

Figure: Declaring the semantic class.
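Because the class attribute can carry both styling classes and semantic ones, an extractor has to tell them apart. A minimal Python sketch, assuming that a token counts as semantic when it has the form prefix:name with a declared prefix (this rule, and the extra 'concept blue' classes, are assumptions for illustration):

```python
import xml.etree.ElementTree as ET

# Prefixes assumed to be declared on the page (see step 0).
KNOWN_PREFIXES = {"rdf", "rdfs", "owl", "dc"}

HEADING = '<h1 class="owl:Class concept blue" property="rdfs:label">mature B cell</h1>'

def semantic_types(element):
    """Return the class tokens that look like prefix:name with a declared prefix."""
    tokens = element.get("class", "").split()
    return [t for t in tokens if ":" in t and t.split(":", 1)[0] in KNOWN_PREFIXES]

h1 = ET.fromstring(HEADING)
TYPES = semantic_types(h1)
LABEL = (h1.get("property"), h1.text)

print(TYPES)   # semantic classes only; 'concept' and 'blue' are plain CSS classes
print(LABEL)
```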


The parents are declared on other pages, hence a link (anchor element) is used to reference them. By wrapping the anchor element in an additional HTML element we can specify two relations. First, that the class expressed on this page has a subclass relationship to something (as expressed by the 'rel' attribute with value 'rdfs:subClassOf'). Second, that this 'something' is an actual owl:Class, which is very important if we wish to achieve OWL-DL syntax and state that this concerns a parent. We do this by setting the 'rel' attribute on the anchor element to 'owl:Class'. In the figure below two parents are declared:


<div>
<span rel='rdfs:subClassOf'>
<a rel='owl:Class' href='c/CellTypeOntology/B_cell'>B cell</a>
</span>
<span rel='rdfs:subClassOf'>
<a rel='owl:Class'
href='c/CellTypeOntology/professional_antigen_presenting_cell'>
professional antigen presenting cell</a>
</span>
</div>

Figure: Declaring Parent(s).
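As a rough check of the nesting described above, the following Python sketch walks the wrapper elements and collects the parents. It is a simplified reading of this particular markup pattern, not a general RDFa parser, and it uses the RDFS spelling `rdfs:subClassOf`; the function name `parent_classes` is illustrative.

```python
import xml.etree.ElementTree as ET

PARENTS = """
<div>
  <span rel="rdfs:subClassOf">
    <a rel="owl:Class" href="c/CellTypeOntology/B_cell">B cell</a>
  </span>
  <span rel="rdfs:subClassOf">
    <a rel="owl:Class" href="c/CellTypeOntology/professional_antigen_presenting_cell">professional antigen presenting cell</a>
  </span>
</div>
"""

def parent_classes(xhtml):
    """Return (href, label) pairs for every parent declared in the fragment."""
    root = ET.fromstring(xhtml)
    parents = []
    for wrapper in root.iter():
        # The wrapper states "this page is a subclass of something".
        if wrapper.get("rel") != "rdfs:subClassOf":
            continue
        # The anchor directly inside states that the 'something' is an owl:Class.
        for link in wrapper.findall("a"):
            if link.get("rel") == "owl:Class":
                parents.append((link.get("href"), (link.text or "").strip()))
    return parents

FOUND = parent_classes(PARENTS)
for href, label in FOUND:
    print(label, "->", href)
```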


Outgoing relations are specified by taking a similar, yet slightly more complex, approach. The outgoing relation requires a property declaration (type of relation) and a target declaration, another class of the ontology to which the class on this page relates. At this time properties at Ontology Online do not have their own page just yet, so they are referenced only in text. For the target of a relation a link (anchor element) is used as reference. The top HTML element with attribute 'rel' and value 'rdfs:subClassOf' again denotes that the class expressed on this page has another subclass relationship to something. The element underneath it tells us that this subclass relationship is a kind of outgoing relationship (instead of a parent relationship) by having an attribute 'rel' with value 'owl:Restriction'. This element wraps around two elements. The first designates the property (recognized by having the attribute 'property' with value 'owl:onProperty'). The second is an anchor element for which the href attribute specifies the target, while the 'rel' attribute 'owl:someValuesFrom' adds some additional logic to the relationship (see W3C's OWL language guide for details on someValuesFrom).

Of note, to be OWL-DL compliant it should also be expressed that the property is of the type owl:ObjectProperty and the target is of the type owl:Class. Again, this is work in progress (still deciding on the best way to convey this...).


<div rel='rdfs:subClassOf'>
<div rel='owl:Restriction'>
<span property='owl:onProperty'>develops_from</span>
<a rel='owl:someValuesFrom'
href='c/CellTypeOntology/immature_B_cell'>immature B cell</a>
</div>
</div>

Figure: Declaring Outgoing Relations.
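The same kind of walk can pull the property, the quantifier and the target out of a restriction. Again a minimal Python sketch that assumes exactly the markup pattern above (the dictionary keys and the function name `restrictions` are illustrative choices):

```python
import xml.etree.ElementTree as ET

RELATION = """
<div rel="rdfs:subClassOf">
  <div rel="owl:Restriction">
    <span property="owl:onProperty">develops_from</span>
    <a rel="owl:someValuesFrom" href="c/CellTypeOntology/immature_B_cell">immature B cell</a>
  </div>
</div>
"""

def restrictions(xhtml):
    """Return one dict per owl:Restriction block found in the fragment."""
    root = ET.fromstring(xhtml)
    found = []
    for block in root.iter("div"):
        if block.get("rel") != "owl:Restriction":
            continue
        # The property is given as text, since properties have no page of their own.
        prop = next(
            ((child.text or "").strip() for child in block
             if child.get("property") == "owl:onProperty"),
            None,
        )
        # The anchor carries the quantifier ('rel') and the target class ('href').
        for child in block:
            if child.tag == "a" and child.get("href"):
                found.append({
                    "property": prop,
                    "quantifier": child.get("rel"),
                    "target": child.get("href"),
                })
    return found

RESULT = restrictions(RELATION)
print(RESULT)
```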


The last thing to do is to add a description and perhaps some additional language terms that define the class described on the page. This is done through 'rdfs:comment' and 'rdfs:label' properties (when the target is text on the same page, the attribute 'property' is used in RDFa). Again, it would be better to also add an xml:lang attribute specifying the language of the comment or language term.


<div property='rdfs:comment'>"A mature form of a B cell, a type of
lymphocyte whose defining characteristic is the expression of an
immunoglobulin complex." [GOC:add, ISBN:0781735149]</div>
<span property='rdfs:label'>mature B lymphocyte</span>
<span property='rdfs:label'>mature B-cell</span>
<span property='rdfs:label'>mature B-lymphocyte</span>

Figure: Declaring description and additional language terms.



Testing out: Operator Plugin



It is possible to see what the RDFa extracted from any of the concept pages looks like by using a browser plugin: the latest version of Mike Kaply's Operator plugin recognizes RDFa data embedded within XHTML pages.


Browsing any of the concept pages on OntologyOnline.org should highlight the 'Resources' button, which should show both the ontology to which the concept belongs (if on a concept page) and OWL-like information about the concept itself:

Browsing an Ontology information page (e.g. Cell Type Ontology) should reveal one resource recognized, the ontology itself.



@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://ontologyonline.org/visualisation/c/CellTypeOntology/>
owl:Class
<http://ontologyonline.org/visualisation/c/CellTypeOntology/neutrophilic_promyelocyte>
.

<http://ontologyonline.org/visualisation/c/CellTypeOntology/neutrophilic_promyelocyte>
rdf:type owl:Class ;
rdfs:label "neutrophilic promyelocyte" ;
rdfs:label "neutrophilic premyelocyte" ;
rdfs:label "neutrophilic progranulocyte" ;
rdfs:comment ""A neutrophil precursor in the granulocytic series, being a cell
intermediate in development between a myeloblast and myelocyte, and containing
a few, as yet undifferentiated, cytoplasmic granules."
[GOC:add, ISBN:0721601464]" ;
rdfs:SubclassOf [
owl:Restriction [
owl:onProperty "develops_from" ;
owl:someValuesFrom <c/CellTypeOntology/neutrophilic_myeloblast>
]
] ;
rdfs:SubclassOf [
owl:Class <c/CellTypeOntology/promyelocyte>
] .


Figure: OWL-DL based syntax extracted from an Ontology Online concept page by the Firefox Operator plugin.



Conclusion


To my knowledge, this is a first attempt to finally reconcile OWL-RDFS with standard XHTML web pages.
The extracted RDF (as tested with RDFa Distiller) does not validate as OWL-DL just yet. But this initial venture into OWL-enabled XHTML shows promise. There is room for improvement, so RDF-Guru or not: feedback and suggestions are greatly appreciated.









Monday, September 10, 2007

Why we need the semantic web

Two search strings with the same syntax can have dramatically different semantics. What if you're looking for pages about a less popular topic which happens to share the same syntax as a highly popular yet 'totally irrelevant to you' topic? Finding what you need will take time and ingenuity. The newly introduced concept feeds (external feeds matching the concept of the given page), for example those associated with the concept page on Cells (in its biological sense), demonstrate our dire need for semantically enabled search engines.

'Cell', as an English term, applies to a wide variety of concepts: 'cell' in the biological sense, 'cell' as in 'cell phone', 'cell' as in prison cell, 'cell' as in an aggregation of people, and many more, including some combinatorial uses (e.g. a title for games, novels, etc.). No search engine at the moment is capable of disambiguating between these different meanings and filtering results accordingly. The books feed will display a novel by Stephen King and a book about 9/11. The blog feed lists a police break-up of a Nazi cell, many entries about cell phones and one about solar cells. The Digg news feed is again mainly about cell phones. If one of them offered the ability to filter for results pertaining only to cell in the biological sense, you might imagine it becoming the next-gen search engine.
And that's exactly what ontologies do: provide a means for disambiguating syntactically equal but semantically different items. Ontologies tell you that there are in fact different concepts (OWL classes), one being a designed artifact, another a part of an organism, yet another a certain aggregation of people, and so on, which all share the same language term 'cell'. Knowing this is already half the work.

You could argue that dictionaries might point out the different semantics to you as well, yet by browsing the ontology you have easy access along a variety of axes (horizontal & hierarchical) to a plethora of related concepts, such as organisms and cell structures for cell in its biological sense, telephony in the case of cell phones, and social groups and people in the case of 'cell' as an aggregation of people. You get the picture. When using an ontology as a backbone for indexing, you might be able to figure out which concept applies by examining the context in which it is found.

Tagging/Indexing with concepts, as opposed to language terms, might be a core requirement for future search engines.
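To illustrate how context could pick the right concept, here is a toy Python sketch. It scores each candidate concept by how many of its related terms occur in the surrounding text. The concept names and related-term sets are invented for the example; in practice they would come from the ontology's hierarchy and relations, and a real system would need far more than word overlap.

```python
# Hypothetical concepts sharing the term 'cell', each with a few related terms.
CONCEPTS = {
    "cell (biology)": {"organism", "membrane", "nucleus", "tissue", "protein"},
    "cell phone": {"phone", "mobile", "call", "battery", "network"},
    "prison cell": {"prison", "inmate", "guard", "jail", "sentence"},
}

def disambiguate(text, concepts=CONCEPTS):
    """Return the concept whose related terms overlap most with the text, or None."""
    words = set(text.lower().split())
    scores = {name: len(words & related) for name, related in concepts.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

print(disambiguate("the nucleus of the cell is enclosed by a membrane"))
print(disambiguate("my cell battery died during the call"))
print(disambiguate("cell"))
```

Without any context the sketch returns nothing, which is exactly the situation a purely term-based search engine is in.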





Sunday, September 9, 2007

Concept Feed Widget, a new level of ontology interaction.

We are pretty excited about some new functionality we added to the Ontology Concept Information Pages. By selecting the appropriate tab button you can now retrieve a list of the most relevant books, blog entries and Digg news items (if any) about the concept itself.
For a quick example, see Domain Ontology - Ontology Directory. This gives you some amazingly focused search results just one click away.


The new Concept feed widget retrieving books about 'domain ontology'.

Monday, July 16, 2007

Code overhaul

It's been quiet for some time on the Ontology Online news front. The reason is that we are performing a major overhaul of the Java code. Currently, each imported ontology for which the content should be preserved as is requires its own database schema. This imposes some strong limitations, and we are therefore recoding to centralise ontologies into one unique MySQL database schema, which is done by implementing ontology-specific namespaces. It will open up a lot of exciting possibilities once finished (which may take some time). Some results will be a greatly improved search engine, the possibility to offer a lot more content, etc.

Monday, June 25, 2007

A true semantic web application

OntologyOnline.org was launched in April 2007. It offers one of the first true semantic web applications (often referred to as web 3.0) by allowing users to interact with ontologies at runtime. Ontology Online's own search engine crawls the available ontologies for exact phrases, but also for synonyms and even word fragments. The querying ability is limited only by the content itself, not by the semantics.

As an example, when searching organisms with the OntologyOnline search engine, the search parameters 'velvet worm', 'spitting worm', 'onychophoran', or even the word fragment 'onyocho' will all lead to the same page describing the Onychophora (i.e. velvet worms) phylum.
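The behaviour described here, where several names and fragments lead to one concept, can be sketched in a few lines of Python. This is plain case-insensitive substring matching over a concept's name and synonyms, which is an assumption for illustration and not a description of how the OntologyOnline search engine is actually implemented.

```python
# Hypothetical index: one concept with its synonyms.
CONCEPT_TERMS = {
    "Onychophora": ["onychophoran", "velvet worm", "spitting worm"],
}

def search(query, index=CONCEPT_TERMS):
    """Return the concepts whose name or synonyms contain the query string."""
    q = query.strip().lower()
    if not q:
        return []
    hits = []
    for concept, terms in index.items():
        names = [concept.lower()] + [t.lower() for t in terms]
        if any(q in name for name in names):
            hits.append(concept)
    return hits

print(search("velvet worm"))
print(search("spitting worm"))
print(search("onycho"))       # a fragment such as 'onycho' matches as well
print(search("tardigrade"))
```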

The search engine is also accessible by means of a Google Gadget.

Sunday, June 17, 2007

All about OWL - RDF tags.

We have started adding OWL - RDF element tag descriptions and definitions to the Ontology Directory Project at Ontology Online. You might have seen that the official W3C documentation on OWL tends to be quite verbose. With the Ontology Directory Project we wish to provide a focused, comprehensive and quickly accessible description of the OWL language.
The benefit of this project is that by offering documentation in the form of an ontology you can deliver in-depth information without having to resort to lengthy text descriptions which repeat a lot of the same information.

For an example see owl:class definition, where the hierarchy around this concept and the horizontal relations attributed to this concept provide a level of detail you would only be able to express textually by writing several paragraphs of information.

Monday, May 14, 2007

The Ontology directory

I've started putting together an ontological directory of existing third-party ontologies, controlled vocabularies, terminologies, and more.
While it is still incomplete, I expect it to grow substantially in the near future.

To browse: Ontology Directory