BBC adopting the Semantic Web

Jan 28
2009

As a major content owner, the BBC is using Semantic Web technologies to manage its content more efficiently and to improve the services it offers to the public.

I met Richard Wright of the BBC Archive at the AXMEDIS conference in Barcelona in November 2007. At the time he was showing a demo of their system, which uses the GATE web services for NLP (Natural Language Processing) on textual news items. News items are analyzed to extract named entities defined in an ontology, such as persons, companies and locations. Attributes of the entities are also extracted, e.g. the position a person holds within a company. The demo also showed video indexing using various techniques.
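To give a rough idea of what "extracting named entities according to an ontology" means, here is a toy sketch in Python. It is not how GATE or the BBC system works: it simply looks up names from a tiny hand-written list (a gazetteer). The `ONTOLOGY` table, the `Entity` class and the `extract_entities` function are all made up for this illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-ontology: class name -> known instances (a "gazetteer").
ONTOLOGY = {
    "Person": ["Richard Wright", "Matthew Shorter"],
    "Organization": ["BBC", "CNET UK"],
    "Location": ["Barcelona"],
}


@dataclass
class Entity:
    text: str   # the matched name
    cls: str    # the ontology class it belongs to
    start: int  # character offset of the match in the input


def extract_entities(text):
    """Return every gazetteer match in `text`, ordered by position."""
    found = []
    for cls, names in ONTOLOGY.items():
        for name in names:
            # Word boundaries avoid matching inside longer words.
            pattern = r"\b" + re.escape(name) + r"\b"
            for match in re.finditer(pattern, text):
                found.append(Entity(name, cls, match.start()))
    return sorted(found, key=lambda e: e.start)


sample = "Richard Wright of the BBC gave a demo in Barcelona."
entities = extract_entities(sample)
for e in entities:
    print(e.cls, "->", e.text)
```

A real system would go well beyond list lookup, for instance by using context to find names it has never seen and to extract attributes such as a person's position, but the output has the same shape: spans of text labelled with ontology classes.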

The BBC Artists pages were recently launched, using Semantic Web technologies to enrich artists’ profiles and link them to external resources, such as Wikipedia entries. Matthew Shorter, the BBC’s interactive editor for music, told CNET UK that “this is part of a general movement that’s going on at the BBC to move away from pages that are built in a variety of legacy content production systems to actually publishing data that we can use in a more dynamic way across the Web.”

Blogging with Calais

Jan 23
2009

The Calais initiative by Thomson Reuters is an excellent example of Semantic Web technologies being smoothly incorporated into common web activities, such as blogging. It uses Natural Language Processing to analyze text and extract named entities (e.g. persons, companies), facts (e.g. employee positions), and events (e.g. mergers, acquisitions).

I have been using Calais on this blog through the Tagaroo plugin for WordPress. While I type a post, Tagaroo analyzes the text using the Calais web service and suggests relevant tags and Flickr images. So far the plugin has worked very well, without any glitches. Its suggestions are usually on target, and most of the tags on this blog have been created this way.
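I don't know how Tagaroo actually picks its suggestions, but the general idea of turning extracted entities into tags can be sketched in a few lines of Python. Everything here is an assumption for illustration: the `suggest_tags` function, the `(name, kind)` input format and the sample data are hypothetical, and the ranking is a plain frequency count rather than anything Calais or Tagaroo documents.

```python
from collections import Counter


def suggest_tags(mentions, max_tags=5):
    """Rank entity names by how often they are mentioned; return the top ones."""
    counts = Counter(name for name, _kind in mentions)
    return [name for name, _count in counts.most_common(max_tags)]


# Hypothetical analysis result: one (name, kind) pair per mention in a post.
mentions = [
    ("Calais", "Product"),
    ("Thomson Reuters", "Company"),
    ("Calais", "Product"),
    ("WordPress", "Product"),
    ("Calais", "Product"),
    ("Thomson Reuters", "Company"),
]

tags = suggest_tags(mentions, max_tags=2)
print(tags)  # ['Calais', 'Thomson Reuters']
```

The author still decides which suggestions to keep; the plugin only proposes candidates.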

Here is a screenshot with the tags and images suggested by Tagaroo for this post:

[Image: tagaroo-screenshot]

Calais can currently analyze texts only in English and French, but more languages are on the way. Let’s hope we see support for Greek soon!