Tech advance sparks prospect of 'vocal terrorism'


Speech duplication: Computers should be able to duplicate human speech and perfectly imitate any voice in as little as 10 years. With this, however, comes the prospect of "vocal terrorism" through the misuse of voice simulators.

"I believe in 10 to 15 years' time I will be able to synthesise anybody's voice saying anything," stated the University of York's Prof David Howard.

He was addressing a session at the festival, which draws to a close tomorrow. He heads York's media engineering research group and has worked with a forensic scientist to analyse the human voice. His unusual approach involves modelling the physical shapes that the "vocal tube" makes to produce sounds, rather than synthesizing words.

This approach ensures that the sounds produced are much more human-like. "We can get diphthongs that sound very human," Prof Howard stated.


He has already captured vowel sounds and will progress to the consonants. "We don't have the picture on top of the box but we do have a lot of the puzzle pieces."

Copying speech and singing by modelling the way we ourselves make sounds yields a more natural result, but it also means specific voices could be imitated.

Prof Howard believes computer voice synthesizers will be able to mimic known individuals better than British impressionist Rory Bremner, famous for his close reproduction of former prime minister Tony Blair's speech.

This, however, raises the possibility of misuse of the technology, Prof Howard suggested. "Verbal terrorism is a possible scenario in the future."

People could be misled by synthesized voices on the phone pretending to be family members or, say, a bank manager. A political leader's voice might be copied and used to broadcast false information.

He argued there were "social responsibilities" for scientists working in this and other research areas.

"I am convinced that a debate should start now on how we approach research which, on the one hand, could prove enormously beneficial, but on the other could cause immeasurable damage to society."

There were significant challenges to be overcome, such as developing natural-sounding speech that incorporated varying pitch and rhythm, he said. This capability was already being built into the voice synthesizer currently under development in his lab.

He also described work under way in Denver that is attempting to produce a virtual larynx. The larynx, also known as the voice box, produces the source sound that is then formed into words by the vocal tract. He believes a model larynx would give his voice synthesizer a much more natural sound.