Chris Horn: ‘We are on cusp of an exciting new age of innovation’

By building simulations of the real world, and letting Google’s DeepMind experiment within them, the machine can learn new skills


Can machines think? Well, what does it mean “to think”? Thinking about thought is a journey into philosophy. Instead, Alan Turing asked a different question in 1950: can machines do what humans (who are thinking entities) do?

Earlier this month, a computer – Google’s DeepMind – beat Lee Sedol, the world-champion Go player, in four out of five games. Go is disarmingly trivial to play: you alternately place pieces on a board with a 19x19 grid of lines, trying to surround your opponent’s pieces and so capture them and their territory. Although its rules are much simpler than those of chess, Go’s triviality paradoxically leads to games of great complexity. Bertrand Russell observed that deep thinking starts with something trivial and hardly worth stating, and ends with something paradoxically complex. In China, mastery of Go has been considered one of four intellectual qualities since the Tang dynasty.
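
The capture rule is simple enough to show in a few lines of code. Here is a minimal sketch in Python (my own illustrative board representation, nothing to do with DeepMind’s software): a group of stones is captured when it has no liberties, that is, no empty points adjacent to it.

```python
# Minimal sketch of Go's capture rule (illustrative only).
# '.' is an empty point, 'B' a black stone, 'W' a white stone.

def group_and_liberties(board, row, col):
    """Flood-fill the group of stones containing (row, col) and collect
    its liberties: the empty points adjacent to the group."""
    colour = board[row][col]
    size = len(board)
    group, liberties, stack = set(), set(), [(row, col)]
    while stack:
        r, c = stack.pop()
        if (r, c) in group:
            continue
        group.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                if board[nr][nc] == '.':
                    liberties.add((nr, nc))
                elif board[nr][nc] == colour:
                    stack.append((nr, nc))
    return group, liberties

# A lone white stone surrounded on all four sides has no liberties left.
board = [['.'] * 19 for _ in range(19)]
board[3][3] = 'W'
for r, c in ((2, 3), (4, 3), (3, 2), (3, 4)):
    board[r][c] = 'B'
group, libs = group_and_liberties(board, 3, 3)
print(len(group), len(libs))  # 1 0 -> the white group is captured
```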

Chess has 20 opening moves. Go has 361. A computer – IBM’s Deep Blue – first beat a chess grandmaster, Garry Kasparov, in 1996. It has taken a further 20 years for a machine to beat a Go grandmaster.

Early approaches to machine intelligence transcribed rules into a computer. Medical diagnosis is an example: a computer can follow preprogrammed rules that suggest medical tests and then deduce a patient’s condition. Mastering chess by computer took a similar approach: evaluate board positions, apply hand-crafted rules to improve your position, and plan “what-if” counters to possible opponent moves up to 20 plays ahead.
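
That “what-if” look-ahead is classically implemented as minimax search. The sketch below is a deliberately trivial toy, not Deep Blue’s actual method: a position is just a number, each move adds 1, 2 or 3, and the hand-written evaluation is the number itself.

```python
# Toy minimax sketch of the classical approach: a hand-written
# evaluation function plus "what-if" look-ahead over replies.

def evaluate(position):
    return position                 # stand-in for hand-crafted rules

def legal_moves(position):
    return [1, 2, 3] if position < 10 else []

def minimax(position, depth, maximising):
    """Look `depth` plies ahead, assuming the opponent always chooses
    the reply that is worst for us."""
    moves = legal_moves(position)
    if depth == 0 or not moves:
        return evaluate(position)
    results = [minimax(position + m, depth - 1, not maximising)
               for m in moves]
    return max(results) if maximising else min(results)

print(minimax(0, 4, True))          # value after looking four plies ahead
```

Real chess engines replace the toy evaluation with thousands of hand-tuned rules and prune the search tree aggressively, but the skeleton is the same.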

The number of possible chess positions is about 1 followed by 40 zeros, compared with about 1 followed by 80 zeros for the number of atoms in the observable universe. However, there are about 1 followed by 170 zeros possible Go positions.
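
A quick back-of-envelope calculation shows how such numbers arise from the branching factors. The average branching factors used below (about 35 for chess, about 250 for Go) are rough figures commonly quoted in the games literature, assumed here purely for illustration.

```python
# Back-of-envelope arithmetic: move sequences after d plies, using rough
# average branching factors (assumed: about 35 for chess, 250 for Go).
for plies in (2, 4, 8):
    chess_sequences = 35 ** plies
    go_sequences = 250 ** plies
    print(f"{plies} plies: chess ~ 10^{len(str(chess_sequences)) - 1}, "
          f"Go ~ 10^{len(str(go_sequences)) - 1}")
```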

Mastering the sheer complexity of Go’s extreme simplicity has required a completely new approach to computing.

Google’s DeepMind has no preprogrammed rules and is instead inspired by biological brains. As in a brain, DeepMind’s fundamental unit is a neuron. A neuron responds – produces an output – when the combined signal across its inputs exceeds a threshold, known as its bias. Crucially, a neuron’s bias can be adjusted, so that it gradually learns to respond to specific patterns on its inputs. Neurons can be arranged into layers, each neuron operating simultaneously with its peers. Layers can then be stacked, so that the outputs of one layer feed the inputs of the next.
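
In code, such a neuron is little more than a weighted sum compared against its bias. The following is a toy formulation of my own, purely for illustration; DeepMind’s actual networks are vastly larger and use smoother activation functions.

```python
# A toy artificial neuron: it fires (outputs 1) when the weighted sum
# of its inputs exceeds its adjustable bias threshold.

def neuron(inputs, weights, bias):
    activation = sum(x * w for x, w in zip(inputs, weights))
    return 1 if activation > bias else 0

def layer(inputs, weight_rows, biases):
    """Every neuron in a layer sees the same inputs; in vectorised code
    or hardware they would all respond simultaneously."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Two stacked layers: the outputs of the first feed the second.
hidden = layer([0.5, 1.0], [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.6])
output = layer(hidden, [[1.0, 1.0]], [0.5])
print(hidden, output)  # [0, 1] [1]
```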

The resulting neural network is trained by being given examples of how to respond. It might be shown photos, some of which are, for example, of a cat. As it learns, the network automatically adjusts the bias in each of its neurons, so that it increasingly tends to respond positively to cat photos. Once trained, it can recognise whether or not a cat is present in any new photo.
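
Training can be sketched in the same toy terms: show the neuron labelled examples and nudge its weights and bias whenever it responds wrongly (a perceptron-style update). The two “features” per photo and the labels below are invented for illustration; real image networks learn from raw pixels.

```python
# Perceptron-style training sketch: after each wrong response, nudge the
# weights and bias so the neuron increasingly fires for "cat" examples.
# The features and labels are invented for illustration.
examples = [([0.9, 0.8], 1), ([0.8, 0.9], 1),   # cat photos
            ([0.1, 0.3], 0), ([0.2, 0.1], 0)]   # not cats
weights, bias, rate = [0.0, 0.0], 0.0, 0.1

for _ in range(20):                  # repeated passes over the examples
    for features, label in examples:
        summed = sum(x * w for x, w in zip(features, weights))
        fired = 1 if summed > bias else 0
        error = label - fired        # +1: should have fired; -1: should not
        weights = [w + rate * error * x for w, x in zip(weights, features)]
        bias -= rate * error         # lowering the bias makes firing easier

print(weights, bias)  # a new photo can now be classified the same way
```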

DeepMind builds on a neural network to learn the consequences of interacting with its environment. For the game of Go, its only permissible action is to place a piece somewhere (in a legitimate position) on the game board.

From that action, and from the subsequent moves taken by its opponent and by itself, it eventually either wins or loses the game. Based on that outcome, it automatically adjusts its neural network.
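
A toy version of that outcome-driven adjustment might look like the following. This is a crude, policy-gradient-flavoured sketch with an invented three-move “game”; DeepMind’s actual training is far more sophisticated.

```python
import math
import random

# Toy outcome-driven learning: keep a preference score per move, play a
# whole "game", and only then nudge the score of the move used: up after
# a win, down after a loss.

prefs = {"a": 0.0, "b": 0.0, "c": 0.0}      # invented three-move game
WIN_PROB = {"a": 0.2, "b": 0.5, "c": 0.8}   # hidden from the learner

def choose_move():
    """Pick a move with probability proportional to exp(preference)."""
    weights = {m: math.exp(p) for m, p in prefs.items()}
    r = random.uniform(0, sum(weights.values()))
    for move, w in weights.items():
        r -= w
        if r <= 0:
            return move
    return move                              # guard against rounding

for _ in range(5000):
    move = choose_move()
    won = random.random() < WIN_PROB[move]   # only the outcome is observed
    prefs[move] += 0.1 * (1.0 if won else -1.0)

print(max(prefs, key=prefs.get))  # converges on "c", the strongest move
```

Run repeatedly, the preference scores drift towards the moves that win most often, even though the learner never sees why they win.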

By being shown 160,000 games played by human Go masters, and then playing against itself by trial and error many thousands of times, DeepMind has taught itself to progress from faltering clumsiness to being better than our best human player.

Strategies

Lee Sedol himself expressed deep surprise: DeepMind played moves and strategies that he had never seen from human opponents. From humble beginnings – a network of simple neurons – DeepMind embodies Bertrand Russell’s paradox: something trivial and hardly worth stating has ended in something paradoxically complex. DeepMind, a machine, has discovered new approaches to the game of Go, untainted by human emotion. When playing Go, the machine not only thinks like a human (a thinking entity), but thinks beyond a human.

However, DeepMind cannot (at least not yet) explain why it thinks the way it does. It “feels it in its bones” – or rather, in the biases learnt within its neural network. But if we examine this network ourselves, we do not easily uncover the new strategies DeepMind has found for playing Go. DeepMind can teach itself, and it can show us how it plays, but it cannot explain its insights to us. Instead we simply have to watch, and try to understand for ourselves the new strategies it has demonstrably discovered.

As yet, we have no theory for DeepMind. We cannot predict how many neurons, layers and teaching sessions DeepMind needs to become proficient. We cannot understand what DeepMind discovers by merely looking at its innards. Engineers can build it, train it and then see what it does. But computer scientists have yet to give us a theoretical model that enables us to understand and predict its behaviour.

Go is just a game, and DeepMind requires many trials to learn. But by building simulations of the real world, and then letting DeepMind experiment within them, it can learn new skills.

However, while humans have transferable skills, DeepMind lacks “common sense”: it cannot yet apply what it has learnt in one skill to an entirely new situation.

We are now on the cusp of a new age of innovation.

If a computer can now innovate entirely new game strategies by itself, what else could it create?