Connecting with the media

DCU researchers are developing mobile phones that can translate and computers that understand human speech, writes Dick Ahlstrom.

Computers that understand spoken English and mobile phones that can provide simultaneous translation between languages are just two possibilities from a new research consortium established at Dublin City University.

Announced last week, DCU's new Centre for Science, Engineering and Technology (Cset) will employ 100 researchers from within the university and also from academic and industrial collaborators.

The aim is to build the next generation of automatic language translation systems, explains the centre's new director, Prof Josef van Genabith. It is all about "localisation", he says.

Van Genabith is professor of computing and the current director of the National Centre for Language Technology, a position he will leave shortly to become director of the new centre.

He expects to have the centre up and running very quickly, possibly by the beginning of December. Up to 80 new researchers will be brought in, mainly PhD students and postdoctoral research fellows, he says.

In its simplest sense, localisation is about taking content, say computer documentation suited for use within Ireland, and adapting it for distribution in another country.

The documents would need to be translated, but would also ideally reflect the cultural norms of the receiving country, explains Prof van Genabith.

Localisation has become a huge industry in Ireland, which has 800 software firms employing 24,000 people and producing €17 billion in exports.

"Localisation as an industrial process was developed in Ireland," he says. "We have a unique concentration of university and industry-based research and development expertise in language technologies, machine translation, speech processing, digital content management and localisation."

The new Cset will greatly enhance that capacity, bringing together both the academics and the companies involved in these activities.

Partners in the work will include academics from University College Dublin, the University of Limerick and Trinity College Dublin. Companies involved include IBM, Microsoft, Symantec, Dai Nippon Printing and Idiom Technologies, along with Irish firms Alchemy, VistaTech, SpeechStorm and Traslan.

Their capacity to conduct research will be greatly aided by the €16.8 million awarded by Science Foundation Ireland (SFI), and the industrial partners are contributing a further €13.6 million in materials, services and money.

The work will be interdisciplinary. The bulk of the researchers will be computer scientists, but participants will also include linguists, experts in information retrieval and extraction, and specialists in systems architecture, says Prof van Genabith.

"It involves integrating machine translation and the work flow involving human translators. It is also about speech technology."

There are significant pressures pushing towards automated translation, with a shortage of human translators a key worry, he says. There is also a trend towards applying language translation to mobile phones, PDAs and other "on the move" technologies, he adds.

And there is also a move towards personalisation, where the huge amount of multilingual content is filtered for the user automatically by a system that learns the person's preferences over time and selects accordingly.

A capacity for learning is central to the new systems to be developed by the Cset, Van Genabith stresses. The automated translators will be based on "machine learning", in which the software's ability to deliver an accurate translation improves over time.

Learning begins when software programs that already contain vocabularies and grammar are given "bi-texts", matching documents in, say, English and French.

"It learns from these bi-texts and then it gets a new selection and can use what it has learned from the earlier texts," he explains.
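For readers curious about what learning from bi-texts can look like in practice, here is a minimal sketch in Python. It uses a classic textbook word-alignment method (IBM Model 1) purely as an illustration; the article does not say which methods the Cset will use. The tiny English-French corpus and the function names are invented for the example, and unlike the systems described above it starts with no built-in vocabulary or grammar.

```python
from collections import defaultdict

# Hypothetical toy bi-text: each entry pairs an English sentence with its French match.
BITEXT = [
    ("the house", "la maison"),
    ("the book", "le livre"),
    ("a house", "une maison"),
    ("a book", "un livre"),
]

def train_word_translations(bitext, iterations=10):
    """Estimate word translation probabilities with IBM Model 1 (illustrative only)."""
    pairs = [(e.lower().split(), f.lower().split()) for e, f in bitext]
    foreign_vocab = {f for _, fs in pairs for f in fs}
    uniform = 1.0 / len(foreign_vocab)
    # Start with no knowledge: every co-occurring word pair is equally likely.
    t = {(f, e): uniform for es, fs in pairs for f in fs for e in es}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for es, fs in pairs:
            for f in fs:
                # Share each foreign word among the source words in the same sentence.
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # Re-estimate the probabilities from the shared-out counts.
        t = {(f, e): c / total[e] for (f, e), c in count.items()}
    return t

def best_translation(t, source_word):
    """Return the most probable translation learned so far, or None if unseen."""
    candidates = {f: p for (f, e), p in t.items() if e == source_word}
    return max(candidates, key=candidates.get) if candidates else None

table = train_word_translations(BITEXT)
print(best_translation(table, "house"))
print(best_translation(table, "book"))
```

With each pass over the matched sentences, words that keep appearing together gain probability, which is the sense in which the accuracy of such a system can improve as it sees more bi-texts.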

By extension, the Cset will also look at areas such as voice recognition by computers, which could free computer users from the keyboard, and new kinds of security systems based on voice.