Thursday, January 3, 2008

Semantic Annotation with RDFa: a simple experiment

This article demonstrates an application of semantic annotation with RDFa. It is more or less a follow-up to our previous article, 'Embedding OWL-RDFS syntax in XHTML with RDFa'. If you need background on RDFa and semantic annotation, see that article.


Goal


The idea is to enrich the markup of certain sections of a web page with an annotation, in such a way that computers can recognize it for what it is (a semantic annotation, invisible to the user) and do something with it.
In this case, the goal was to return a set of language terms related to an annotation: direct synonyms, terms from parent concepts, or terms from the sources or targets of relations. The resulting procedure generates an ontological table of contents for the page, centered on the given annotation, and renders it directly within the original web page. In particular, it adds a jQuery-enabled AJAX script to the page that retrieves data from the OntologyOnline server and visualizes the results within the page you are browsing. The result takes the form of an information box containing a clickable, ontological index of related terms occurring in the page, plus a description of the given concept if one is present. This box can be dragged across the screen. The script also highlights the terms it found, in a manner similar to Google search highlighting. The results can be dismissed when no longer needed.
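The term matching and highlighting described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: it works on a plain text string rather than on the page's DOM, and the function names (findTermsInText, highlightTerms) and the "highlight" class name are hypothetical, not taken from the actual script.

```javascript
// Escape characters that have a special meaning in regular expressions.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the subset of terms that occur in the text (case-insensitive,
// whole words only). This is the raw material for the clickable index.
function findTermsInText(text, terms) {
  return terms.filter(function (term) {
    return new RegExp("\\b" + escapeRegExp(term) + "\\b", "i").test(text);
  });
}

// Wrap every occurrence of a found term in a <span> so that a stylesheet
// can highlight it. Longer terms are tried first so that a multi-word
// term wins over a shorter term it contains.
function highlightTerms(text, terms) {
  const found = findTermsInText(text, terms);
  if (found.length === 0) return text;
  const pattern = found
    .slice()
    .sort(function (a, b) { return b.length - a.length; })
    .map(escapeRegExp)
    .join("|");
  return text.replace(
    new RegExp("\\b(" + pattern + ")\\b", "gi"),
    '<span class="highlight">$1</span>'
  );
}
```

A real implementation would walk the text nodes of the page instead of a single string, so that existing markup is left untouched.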






Figure 1: Information box on a semantic annotation, rendered directly within the annotated page.



It requires at least the following components:

  1. A web page containing a semantic RDFa annotation.

  2. A browser or browser extension (here: Firefox with the Operator plugin) that can pick up the semantic annotation and, if the user so desires, trigger an AJAX call (a request for information).

  3. A service to which the AJAX call is directed (the OntologyOnline.org service) and which returns useful (ontological) information about the annotated construct.



1. Semantic Annotation: RDFa markup


The idea is to mark up a section of a web page with a reference to an ontological class that describes what that section is about. We do this by using RDFa.
The RDFa markup makes use of the about attribute and the class attribute. This syntax is not entirely optimal, as it does not really extract as correct OWL-DL syntax (the class and about attributes should probably be switched around as well, but that does not visualize well in the plugin).
A combination with the instanceof attribute might be more appropriate for declaring what kind of ontology concept the given entry is an instance of (cf. the RDFa Primer: @instanceof), but this RDFa attribute is not yet supported by the current release of the Operator plugin (see below), so for the time being we will have to stick with 'about'.



<div class="owl:Thing"
about="celltypeontology:immature+neutrophil">text section ...</div>


Figure 2: Embedding instance information by using the class and about attributes.


Also note that it is always a good idea (though in this case not an absolute necessity) to declare the namespaces you use in the document as well:


<div
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:celltypeontology="http://ontologyonline.org/visualisation/c/Directory/">...
</div>


Figure 3: Including namespaces.
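
To make the role of these namespace declarations concrete, here is a minimal JavaScript sketch of how a prefixed value such as celltypeontology:immature+neutrophil could be expanded into a full URI by simple concatenation. The helper name expandCurie and the prefix map are illustrative assumptions; this is not how the Operator plugin is known to do it internally.

```javascript
// Prefix-to-namespace map mirroring the xmlns declarations in Figure 3.
const namespaces = {
  owl: "http://www.w3.org/2002/07/owl#",
  celltypeontology: "http://ontologyonline.org/visualisation/c/Directory/",
};

// Hypothetical helper: expand "prefix:reference" into a full URI.
// Returns null when there is no prefix or the prefix is not declared.
function expandCurie(curie, prefixes) {
  const colon = curie.indexOf(":");
  if (colon === -1) return null;
  const prefix = curie.slice(0, colon);
  const reference = curie.slice(colon + 1);
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) return null;
  return prefixes[prefix] + reference;
}
```

For example, expandCurie("owl:Thing", namespaces) yields the full URI of the OWL Thing class, while an undeclared prefix yields null, which is one reason why declaring the namespaces is good practice.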





2. The Operator action script


The Firefox extension Operator, by Michael Kaply and Elias Torres, recognizes RDFa in XHTML pages and allows users to define custom actions (an Action User script) for these annotations, which can be triggered explicitly by the person browsing the page.
We've created an Action User script for the Operator extension that makes an AJAX call to an Ontology Online JSONP service (see the section below). This script can be downloaded at: OntologyOnline.org/scripts/OntologyOnline_Operator.js.

To set up the script:

  1. Get Firefox if you do not have it already.

  2. Install the Operator plugin.

  3. Download the Operator user script to your computer (right-click OntologyOnline_Operator.js and select 'Save as').

  4. Browse to the Options section of the Operator plugin, select the User Scripts tab, hit New, and select the script you downloaded above. Close Options.

  5. Re-open Options, select the Actions tab, hit New, and select 'create a topic map - ontology online'. Close Options.

  6. Restart the browser.

  7. Browse to a semantically annotated page. The 'Resources' button should be highlighted. Select the resource, then select the 'topic map' action that should now be visible.
    Demo Pages:








Figure 4: The topic map action.



3. The Ontology Online service


We've set up an AJAX/JSONP service that returns term information on a semantic annotation, provided the concept is known to the Ontology Online semantic database. In case you wonder what JSONP is: it is a technique that lets you perform certain AJAX calls (requests for data) across domains, so that, hypothetically, any web page can access the data. See Remote JSON for more information.
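
For readers unfamiliar with the pattern, the following is a generic sketch of how a JSONP request works; it is not the code of the Ontology Online script. The endpoint URL and the parameter names "concept" and "callback" are assumptions made for illustration, and the document and global scope are passed in as arguments only to keep the sketch self-contained.

```javascript
// Build the request URL. The parameter names are hypothetical.
function buildJsonpUrl(endpoint, concept, callbackName) {
  const query =
    "concept=" + encodeURIComponent(concept) +
    "&callback=" + encodeURIComponent(callbackName);
  return endpoint + (endpoint.includes("?") ? "&" : "?") + query;
}

// Issue a JSONP request by injecting a <script> element. The server is
// expected to answer with JavaScript of the form callbackName({...}),
// which the browser executes, handing the data to the callback.
// Script elements are not bound by the same-origin restriction that
// applies to ordinary AJAX calls, which is why this works cross-domain.
function jsonpRequest(doc, scope, endpoint, concept, onData) {
  const callbackName = "jsonp_cb_" + Date.now();
  const script = doc.createElement("script");
  scope[callbackName] = function (data) {
    delete scope[callbackName]; // the callback is used only once
    if (script.parentNode) script.parentNode.removeChild(script);
    onData(data);
  };
  script.src = buildJsonpUrl(endpoint, concept, callbackName);
  doc.head.appendChild(script);
  return callbackName;
}
```

In a browser this would be called along the lines of jsonpRequest(document, window, "http://example.org/service", "celltypeontology:immature+neutrophil", handler), where example.org stands in for the real service address. Libraries such as jQuery wrap the same mechanism in a single call.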


Some last remarks


In my opinion, this experiment also shows one of the advantages of RDFa over regular microformats. By using RDFa we do not impose content limits or constraints on an annotation: any semantic construct can be used, as long as it is known to an ontology service such as the Ontology Online Topicmap service.
The set-up has deliberately been kept simple. Improvements can be made on several counts (and may be in the future), but I wanted to share the experiment with you anyway.
Be aware that for large web documents the script has more computation to do and may therefore take longer. I'm also hoping to come up with some more elaborate use cases in the distant future.




Comment on this article: Discussion group.