Galway eyes smarter web

TECHNOLOGY: A research unit in NUI Galway leads the way in developing smarter computers, writes KARLIN LILLINGTON

THE PROBLEM with computers is they aren’t very bright. They are great for storing and processing data, but as far as a PC is concerned, information – whether a photo or a document, a video or webpage, the name of a city or an equation – is only just so many 0s and 1s linked together in a sequence.

But gradually, computers are learning to recognise what a piece of data actually is, and to find meaningful relationships between files held on individual computers, within a closed network, or on the internet.

All of this is thanks to semantics – the area of computer research that focuses on teaching computers to be smarter and to see the relationships humans innately understand. And as a result of State-supported research investment building on initial expertise, a considerable body of leading-edge international research in this area is being done in Ireland.

“Semantics is about enabling the computer to understand more about the information it is dealing with, so it can help you,” says Dr Manfred Hauswirth, vice director of the Digital Enterprise Research Institute (Deri) at NUI Galway.

“Semantics builds a lot on ontologies [classification systems] that add meaning to a text string [a word or set of words],” Hauswirth says. “A computer can then build a network of attributes.” He says: “For example, if you type in ‘Galway’, a semantic system will know that is the name of a city and not a dog or a recipe,” and can then find other relevant information related to Galway, the city.
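The idea Hauswirth describes – an ontology that lets software build "a network of attributes" around a word like "Galway" – can be sketched with plain subject-predicate-object triples, the basic unit of semantic web data. The terms and facts below are invented purely for illustration:

```python
# Hypothetical mini-ontology as subject-predicate-object triples.
# A real semantic system would use RDF and far richer vocabularies.
TRIPLES = [
    ("Galway", "is_a", "City"),
    ("Galway", "located_in", "Ireland"),
    ("City", "is_a", "Place"),
]

def attributes_of(term, triples):
    """Collect everything the knowledge base asserts about a term."""
    return {(pred, obj) for subj, pred, obj in triples if subj == term}

# The system now "knows" Galway is a city, not a dog or a recipe.
print(attributes_of("Galway", TRIPLES))
```

Once a term is typed this way, an application can follow the links outward (City → Place, Galway → Ireland) to pull in related information.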

Deri is the largest applied research organisation in the world working on technologies to develop the semantic web – the next generation of the web – and other areas of semantic computing involving virtual “semantic information spaces” and “semantic reality”, where sensors and devices generate data that can be used in semantic applications.

Established six years ago this month as one of Science Foundation Ireland’s initial group of CSETs (Centres for Science, Engineering and Technology), Deri has grown to house 130 researchers and counts among its scientific advisers Sir Tim Berners-Lee, the inventor of the World Wide Web and a leading proponent of the semantic web and of a new area of “web science”.

Berners-Lee has been touting the semantic web as the next-generation web for several years. But semantics is only now emerging from the research lab into the real world, because more and more information held on computers is being classified with tags that define what it is, and help semantic applications find, retrieve and process it and related information. Still, a major challenge for proponents of semantic systems is finding enough classified data in the first place to make semantic applications valuable.

While some data lends itself to automatic classification because of how it is used – for example, data held within business applications that is easily identifiable as a product number or a delivery date, or e-mails with their readily identifiable subject headings and sender and recipient fields – in most cases, classifying tags must be added by people willing to take the time to do so.

Within the semantics universe, tags are a bit like classifying books in a library – a kind of Dewey decimal system for computer data.
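The library analogy can be made concrete: once files carry classifying tags, software can shelve and retrieve them by category rather than by raw bytes. A minimal sketch, with file names and tags invented for illustration:

```python
# Hypothetical tag index: each file carries a set of classifying tags,
# much as a library book carries a Dewey classification.
tagged_files = {
    "cliffs.jpg": {"photo", "landscape", "ireland"},
    "report.doc": {"document", "finance"},
    "demo.mp4": {"video", "product"},
}

def find_by_tag(tag, index):
    """Return every file carrying the given tag, in sorted order."""
    return sorted(name for name, tags in index.items() if tag in tags)

print(find_by_tag("photo", tagged_files))
```

The hard part, as the article notes, is not the lookup but getting the tags applied in the first place.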

Deri’s been getting some international notice lately. A recent coup for the organisation and researcher John Breslin was to have a semantic program he created, called Sioc, picked up by the Obama administration in the US for use in conjunction with the US government’s economics-focused Recovery.gov website.

Sioc (pronounced “shuck”) is a semantic application that can extract and analyse data from online discussion areas such as bulletin boards and blogs to identify keywords and “trending” topics.
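The core of keyword and trend spotting of the kind Sioc performs can be sketched as a simple term count over posts – a toy stand-in, not Sioc’s actual method, with example posts invented here:

```python
from collections import Counter

# Common words to ignore when counting (a tiny illustrative list).
STOPWORDS = {"the", "a", "is", "to", "and", "of", "in", "on"}

def trending(posts, top_n=3):
    """Count non-stopword terms across posts; the most common are 'trending'."""
    counts = Counter(
        word
        for post in posts
        for word in post.lower().split()
        if word not in STOPWORDS
    )
    return [word for word, _ in counts.most_common(top_n)]

posts = [
    "Semantic web talk in Galway",
    "Galway conference on the semantic web",
    "New semantic search demo",
]
print(trending(posts))
```

A real semantic system would go further, linking each keyword to typed concepts (people, places, topics) rather than counting raw strings.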

Other Deri projects include a semantic desktop that runs on Linux and can do intelligent searches; Sigma, a semantic information “mash-up” that performs semantic web searches; and Coraal, a program that does associative semantic searches of document libraries, such as repositories of scholarly publications like medical papers.

Coraal recently took second place in the Elsevier Grand Challenge, an annual competition for prototype applications and tools supported by the scientific publisher Elsevier.

Hauswirth’s personal research interests lie in the area of semantic reality, especially the use of sensors to feed data into semantic applications. One prototype project Deri has developed uses a sensor worn on the finger to feed cardiac data back to a medical application that could monitor heart rates in at-risk patients and warn the patient by text message if their heart rate rises too high.
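The monitoring logic described – readings streamed from a sensor, checked against a limit, with an alert on breach – can be sketched as follows. This is an illustrative toy, not Deri’s system, and the threshold is an invented placeholder:

```python
ALERT_BPM = 140  # hypothetical upper limit for an at-risk patient

def check_readings(readings_bpm, limit=ALERT_BPM):
    """Return an alert message for the first reading over the limit, else None."""
    for reading in readings_bpm:
        if reading > limit:
            return f"ALERT: heart rate {reading} bpm exceeds {limit} bpm"
    return None

# In the prototype, a message like this would be sent to the patient by text.
print(check_readings([72, 80, 150]))
```

The semantic layer’s job is everything around this check: knowing what the sensor’s numbers mean, which patient they belong to, and who should be told.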

Or, a patient living at home could be monitored by a nurse or doctor remotely. Hauswirth recognises such new ways of sifting and filtering information will require special attention to privacy issues, not least because people don’t tend to think through possible misuse of personal data, even when they place it on the internet voluntarily. “But this is a general emerging problem, and more a social problem than a technical problem,” he says. That’s why a big picture view is needed: “The web is not only a technical phenomenon, but a social phenomenon and an economic phenomenon, and looking at it only as a technical phenomenon is incomplete.”

Hence, Berners-Lee is now arguing for recognition of “web science” as an area of science in its own right, says Hauswirth: “Web science is making researchers in other areas of study realise they need to work together.”

For researchers at the forefront of semantic computing and the semantic web, that inevitably makes a lot of sense. The growing associative links between all types of data, enabled by semantic work in computing, make a cross-disciplinary web science approach to the web the obvious next step.