The software uses neural networks to learn from experience. For example, to train for its Go match, the program studied 30 million Go board positions from human games, then played itself again and again to improve its skills.
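That two-stage recipe, supervised learning from human games followed by reinforcement through self-play, can be sketched with a toy example. Everything here is illustrative: the three-move "game", the update rule and all function names are stand-ins, not AlphaGo's actual architecture.

```python
import random

MOVES = [0, 1, 2]  # toy action space, a stand-in for Go moves

def supervised_init(human_moves):
    """Stage 1 (sketch): seed move weights by counting 'human-expert' moves."""
    weights = {m: 1.0 for m in MOVES}  # smoothing so no move starts at zero
    for m in human_moves:
        weights[m] += 1.0
    return weights

def sample(weights, rng):
    """Pick a move with probability proportional to its weight."""
    r = rng.random() * sum(weights.values())
    for m, w in weights.items():
        r -= w
        if r <= 0:
            return m
    return MOVES[-1]

def self_play_update(weights, games=5000, seed=0):
    """Stage 2 (sketch): play against a copy of itself; reinforce winning moves."""
    rng = random.Random(seed)
    for _ in range(games):
        a, b = sample(weights, rng), sample(weights, rng)
        if a == b:
            continue            # draw: no learning signal
        winner = max(a, b)      # toy rule: the higher move wins
        weights[winner] += 0.1  # strengthen whatever won
    return weights

# Hypothetical 'human' data favours move 1; self-play then discovers
# that move 2 is in fact the strongest.
w = supervised_init([0, 1, 1, 2])
w = self_play_update(w)
best = max(w, key=w.get)
```

The point of the sketch is the interaction Bengio describes: the human data gives the policy a head start, but repeated self-play supplies the win/loss feedback that lets it improve beyond its teachers.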
At a conference last month, DeepMind’s founder and chief executive Demis Hassabis mentioned the possibility of training a version of AlphaGo using self-play alone, omitting the knowledge from human-expert games. In 2015, the firm created a program that learned to play less complex arcade games in this manner. Without a head start, AlphaGo would probably take much longer to learn, says Bengio, and might never beat the best human. But it’s an important step, he says, because humans learn with so little guidance.
DeepMind, based in London, also plans to venture beyond games. In February the company founded DeepMind Health and launched a collaboration with the UK National Health Service: its algorithms could eventually be applied to clinical data to improve diagnoses or treatment plans. Such applications pose different challenges from games, says Oren Etzioni, chief executive of the non-profit Allen Institute for Artificial Intelligence in Seattle, Washington. “The universal thing about games is that you can collect an arbitrary amount of data,” he says — and that the program is constantly getting feedback on what’s a good or bad move by playing many games. But, in the messy real world, data — on rare diseases, say — might be scarcer, and even with common diseases, labelling the consequences of a decision as ‘good’ or ‘bad’ may not be straightforward.
Hassabis has said that DeepMind’s algorithms could give smartphone personal assistants a deeper understanding of users’ requests. And AI researchers see parallels between human dialogue and games: “Each person is making a play, and we have a sequence of turns, and each of us has an objective,” says Bengio. But they also caution that language and human interaction involve a lot more uncertainty.