Computer programme beats gamers at ‘Space Invaders’

‘Artificial agent’ learns from its mistakes to master Breakout, Pong and Boxing

A computer programme that learns from its mistakes can teach itself how to play classic computer games such as Space Invaders, performing better than skilled human competitors at many of them.

Referred to as an “artificial agent” and named DQN by its developers, the programme starts with no knowledge of a game and learns it from the ground up.

The agent learned to play Space Invaders and Breakout as well as a professional human games tester, along with other classics such as Tennis, Pong and Boxing.

It had access only to the movement of pixels on the screen and to changes in the score, but it “understands” that a rising score is desirable. Over time it learns what earns points and what doesn’t, say the researchers at DeepMind, a Google-owned company in London.

“It is not like IBM’s Deep Blue chess-playing computer. That was preprogrammed with the ability to play chess,” said company founder Demis Hassabis.

“We didn’t preprogramme the agent; it only had pixel inputs and the game score, and it learns how to get points and how to play the game directly through experience.”

The researchers developed two mathematical computer programmes, or algorithms, for the agent, both of which mimic in a primitive way how a person learns to play a computer game.

It used reinforcement learning, in effect learning from its mistakes. It also had an artificial neural network, the “thinking” part of the agent that could process what was happening in the game environment.
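The reinforcement-learning idea can be illustrated with a toy sketch. The example below is our own illustration, not DeepMind’s code: it replaces DQN’s neural network with a simple lookup table of action values, and the one-step “game” (where one of two actions earns a point) is invented for the example. The names `train_q_table` and its parameters are likewise illustrative.

```python
import random

def train_q_table(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Toy reinforcement learning: discover which of two actions earns a point.

    A Q-table stands in for DQN's neural network; alpha is the learning
    rate and epsilon the chance of trying a random action (exploration).
    """
    rng = random.Random(seed)
    q = {0: 0.0, 1: 0.0}  # estimated value of each action, initially unknown
    for _ in range(episodes):
        # Occasionally explore; otherwise pick the best-known action.
        if rng.random() < epsilon:
            a = rng.choice([0, 1])
        else:
            a = max(q, key=q.get)
        # In this invented game, only action 1 scores a point.
        reward = 1.0 if a == 1 else 0.0
        # One-step update: nudge the estimate toward the observed reward,
        # i.e. learn from experience which actions raise the score.
        q[a] += alpha * (reward - q[a])
    return q

q = train_q_table()
```

After training, the value estimated for the scoring action dominates the other, which is the sense in which the agent has “learned from its mistakes” which moves raise the score. DQN applies the same principle, but with a neural network estimating values from raw pixels rather than a table.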

The team published their findings Wednesday as the cover story in the journal Nature.

DQN was challenged with 49 classic arcade video games played on the Atari 2600. Like any player, the agent did better on some games than on others; it struggled with Wizard of Wor and Chopper Command, for example.

But it played better than rival artificial agents and matched or bettered the skills of an experienced human player in most of the games.

“These types of systems are more human-like when they learn,” said Mr Hassabis. “We learn from experience and our brain makes models of the world to make plans. That is the kind of system we were trying to produce.”

There is a huge potential market for agents like DQN and not just in games competitions.

“The system we have developed is just an example of what it can do; it can be applied to any type of problem, even a highly complicated one,” said Koray Kavukcuoglu of DeepMind. “It is a general learning algorithm.”

Applications include trip planning, machine translation and processing large sequential datasets.

The researchers used a laptop to train DQN, said Mr Vlad Mnih of DeepMind. They let the agent play different games for up to two weeks to achieve high skill levels, “but some games only needed a couple of days’ training. The training would be faster with more computing power,” he added.