identification is the only piece remaining, the only obstacle to someone making an autonomous weapon in their garage. Unfortunately, that technology is not far off. In fact, as I stood in the basement of the building watching Shield AI’s quadcopter autonomously navigate from room to room, autonomous target recognition was literally being demonstrated right outside, just above my head.
This is a tricky problem for a rules-based approach. Neural networks sidestep it entirely. Instead of following handcrafted rules, they learn from vast amounts of data, tens of thousands or even millions of examples. As the network churns through the data, it continually adjusts its internal structure until it performs well at a programmer-specified goal. The goal could be distinguishing an apple from a tomato, playing an Atari game, or some other task.
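To make that concrete, here is a minimal sketch of such a training loop, written in PyTorch; the two made-up features, the tiny network, and the numbers are all illustrative stand-ins, not drawn from any real system:

```python
import torch
import torch.nn as nn

# Toy stand-in for "vast amounts of data": each example is two invented
# features (say, surface waxiness and stem length), labeled 0 = apple,
# 1 = tomato. Real systems train on thousands to millions of examples.
X = torch.tensor([[0.9, 0.8], [0.8, 0.7], [0.2, 0.1], [0.1, 0.3]])
y = torch.tensor([0, 0, 1, 1])

# A small neural network. Its "internal structure" is the weights inside
# the Linear layers, which training will repeatedly adjust.
net = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 2))
loss_fn = nn.CrossEntropyLoss()  # measures progress toward the goal
opt = torch.optim.SGD(net.parameters(), lr=0.1)

for step in range(500):        # churn through the data, over and over
    opt.zero_grad()
    loss = loss_fn(net(X), y)  # how wrong is the network right now?
    loss.backward()            # compute how to adjust each weight
    opt.step()                 # adapt the internal structure slightly
```

Each pass through the loop nudges the weights a little closer to satisfying the programmer-specified goal; no rule for telling apples from tomatoes is ever written down.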
In one of the most powerful examples of how neural networks can be used to solve difficult problems, the Alphabet (formerly Google) AI company DeepMind trained a neural network to play go, a Chinese strategy game akin to chess, better than any human player. Go is an excellent game for a learning machine because the sheer complexity of the game makes it very difficult to program a computer to play at the level of a professional human player based on a rules-based strategy alone.
The rules of go are simple, but from these rules flows vast complexity.
Go is played on a grid of 19 by 19 lines, and players take turns placing stones—black for one player and white for the other—on the intersection points of the grid. The objective is to use one’s stones to encircle areas of the board. The player who controls more territory wins. From these simple rules comes an almost unimaginably large number of possibilities. There are more possible positions in go than there are atoms in the known universe, making go 10¹⁰⁰ (one followed by a hundred zeroes) times—literally a googol—more complex than chess.
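The arithmetic behind that claim is easy to check roughly. The few lines of Python below compute a crude upper bound on the number of go board configurations by counting every way to mark each intersection empty, black, or white; most of those configurations are not legal positions, but even the legal count, around 2 × 10¹⁷⁰, dwarfs the roughly 10⁸⁰ atoms in the observable universe:

```python
# Crude upper bound on go positions: each of the 19 x 19 = 361
# intersections is either empty, black, or white.
go_positions = 3 ** (19 * 19)
atoms_in_universe = 10 ** 80  # commonly cited rough estimate

print(f"go configurations ~ {go_positions:.1e}")  # ~1.7e+172
print(f"atoms (estimate)  ~ {atoms_in_universe:.1e}")
```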
Humans at the professional level play go based on intuition and feel. Go takes a lifetime to master. Prior to DeepMind, attempts to build go-playing AI software had fallen woefully short of human professional players. To craft its AI, called AlphaGo, DeepMind took a different approach. They built an AI composed of deep neural networks and fed it 30 million moves from expert human games of go. As explained in a DeepMind blog post, “These neural networks take a description of the Go board as an input and process it through 12 different network layers containing millions of neuron-like connections.” Once the neural network was trained on human games of go, DeepMind then took the network to the next level by having it play itself.
“Our goal is to beat the best human players, not just mimic them,” as explained in the post. “To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks, and adjusting the connections using a trial-and-error process
known as reinforcement learning.” AlphaGo used the 30 million human moves as a starting point, but by playing against itself it could reach levels of play beyond even the best human players.
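AlphaGo itself paired deep networks with a policy-gradient form of reinforcement learning, which is far more involved than anything that fits in a few lines. The toy sketch below, using a simple value table and tic-tac-toe as a hypothetical stand-in for go, illustrates only the core idea: play yourself, then adjust toward whatever led to wins.

```python
import random
from collections import defaultdict

# Learned value table: (board, move) -> how good that move has proven.
Q = defaultdict(float)
ALPHA, EPSILON = 0.5, 0.1  # learning rate; chance of exploring randomly

WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, s in enumerate(board) if s == "."]

def choose_move(board):
    # Trial and error: usually play the best-known move, sometimes explore.
    if random.random() < EPSILON:
        return random.choice(legal_moves(board))
    return max(legal_moves(board), key=lambda m: Q[(board, m)])

def self_play_game():
    board, player, history = "." * 9, "X", []
    while winner(board) is None and legal_moves(board):
        move = choose_move(board)
        history.append((board, move, player))
        board = board[:move] + player + board[move + 1:]
        player = "O" if player == "X" else "X"
    result = winner(board)
    # "Adjusting the connections," crudely: nudge each move's value
    # toward the final outcome for the player who made it.
    for state, move, p in history:
        reward = 0.0 if result is None else (1.0 if p == result else -1.0)
        Q[(state, move)] += ALPHA * (reward - Q[(state, move)])

for _ in range(50_000):  # thousands of games against itself
    self_play_game()
```

After enough games, moves that tend to produce wins accumulate high values and get chosen more often; nothing in the loop encodes tic-tac-toe strategy directly, just as nothing in AlphaGo’s self-play encoded go strategy directly.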
This superhuman game play was on display in AlphaGo’s 4–1 victory over the world’s top-ranked human go player, Lee Sedol, in March 2016. AlphaGo won the first game solidly, but in game 2 it demonstrated its virtuosity. Partway through game 2, on move 37, AlphaGo made a move so surprising, so un-human, that it stunned professional players watching the match. Seemingly ignoring a contest between white and black stones under way in one corner of the board, AlphaGo played a black stone far away, in a nearly empty part of the board. The move was so unlike anything seen in professional games that one commentator remarked, “I thought it was a mistake.” Lee Sedol was so taken by surprise that he got up and left the room. After he returned, it took him fifteen minutes to formulate his response. AlphaGo’s move wasn’t a mistake. European go champion Fan Hui, who had lost to AlphaGo a few months earlier in a closed-door match, said the move surprised him at first as well, and then he saw its merit. “It’s not a human move,” he said. “I’ve never seen a human play this move. So beautiful.” Not only did the move feel like a move no human player would make, it almost certainly was one: AlphaGo itself rated the odds that a human would have played that move at 1 in 10,000. AlphaGo played it anyway. AlphaGo went on to win game 2, and afterward Lee Sedol said, “I really feel that AlphaGo played the near perfect game.” After losing game 3, which gave AlphaGo the win for the match, Lee Sedol told the audience at a press conference, “I kind of felt powerless.”
AlphaGo’s triumph over Lee Sedol has implications far beyond the game of go. More than just another realm of competition in which AIs now top humans, the way DeepMind trained AlphaGo is what really matters. As explained in the DeepMind blog post, “AlphaGo isn’t just an ‘expert’
system built with hand-crafted rules; instead it uses general machine learning techniques to figure out for itself how to win at Go.” DeepMind didn’t program rules for how to win at go. They simply fed a neural network massive amounts of data and let it learn all on its own, and some of the things it learned were surprising.
In 2017, DeepMind surpassed their earlier success with a new version of AlphaGo. With an updated algorithm, AlphaGo Zero learned to play go
without any human data to start. With access only to the board and the rules of the game, AlphaGo Zero taught itself to play. Within a mere three days of self-play, AlphaGo Zero had eclipsed the previous version that had beaten Lee Sedol, defeating it 100 games to 0.
These deep learning techniques can solve a variety of other problems. In 2015, even before DeepMind debuted AlphaGo, DeepMind trained a neural network to play Atari games. Given only the pixels on the screen and the game score as input and told to maximize the score, the neural network was able to learn to play Atari games at the level of a professional human video game tester. Most importantly, the same neural network architecture could be applied across a vast array of Atari games—forty-nine games in all. Each game had to be individually learned, but the same neural network architecture applied to any game; the researchers didn’t need to create a customized network design for each game.
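The pixels-in, score-out design is concrete enough to sketch. The layer sizes below follow the architecture DeepMind published in its 2015 Atari paper; everything else here, including the example action counts, should be read as an illustrative reconstruction rather than the actual research code:

```python
import torch
import torch.nn as nn

class AtariQNetwork(nn.Module):
    """Maps raw screen pixels to an estimated future score per action."""

    def __init__(self, num_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            # Input: a stack of 4 recent 84x84 grayscale frames --
            # "only the pixels on the screen."
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            # One output per joystick action: the network's estimate of
            # the score each action leads to -- "maximize the score."
            nn.Linear(512, num_actions),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.net(frames)

# The same architecture serves any of the games; only the number of
# available actions changes from game to game.
breakout_net = AtariQNetwork(num_actions=4)
pong_net = AtariQNetwork(num_actions=6)
```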
The AIs being developed for go or Atari are still narrow AI systems.
Once trained, the AIs are purpose-built tools to solve narrow problems.
AlphaGo can beat any human at go, but it can’t play a different game, drive a car, or make a cup of coffee. Still, the techniques used to train AlphaGo are general ones that can be used to build any number of special-purpose narrow AIs. Deep neural networks have cracked other thorny problems that bedeviled the AI community for years, notably speech recognition and visual object recognition.
A deep neural network was the tool used by the research team I watched autonomously find the crashed helicopter. The researcher on the project explained that he had taken an existing neural network already trained on object recognition, stripped off the top few layers, and retrained the network to identify helicopters, which hadn’t been in its original image dataset. The neural network he was using ran on a laptop connected to the drone, but it could just as easily have run on a Raspberry Pi, a $40 credit-card-sized computer, riding on board the drone itself.
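What the researcher described has a name, transfer learning, and it is routine enough to sketch. The example below uses torchvision’s ResNet-18, trained on the ImageNet dataset, as a hypothetical stand-in for whatever network and dataset he actually used:

```python
import torch.nn as nn
from torchvision import models

# Start from a network someone else already trained on a large image
# dataset (here, ResNet-18 trained on ImageNet).
net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the early layers: they already encode generic visual features
# such as edges and textures, which carry over to new kinds of objects.
for param in net.parameters():
    param.requires_grad = False

# "Strip off the top few layers": replace the final classifier with a
# fresh one for the new task, helicopter vs. not-helicopter.
net.fc = nn.Linear(net.fc.in_features, 2)

# From here, only the new head gets trained, on a (hypothetical) labeled
# set of helicopter images -- far cheaper than training from scratch.
```

Because most of the network arrives pretrained, the retraining step needs comparatively little data and compute, which is how a lone researcher with a laptop could field it.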
All of these technologies are coming from outside the defense sector.
They are being developed at places like Google, Microsoft, IBM, and university research labs. In fact, programs like DARPA’s TRACE are not necessarily intended to invent new machine learning techniques, but rather import existing techniques into the defense sector and apply them to military problems. These methods are widely available to those who know
how to use them. I asked the researcher behind the helicopter-hunting drone: Where did he get the initial neural network that he started with, the one that was already trained to recognize other images that weren’t helicopters? He looked at me like I was either half-crazy or stupid. He got it online, of course.