It may sound like a guy with a really curious last name gave his child an even weirder first name, but AlphaZero is actually the name of a computer program developed by DeepMind (owned by Google). This year, the program shocked the chess world by beating Stockfish 10, which used to be the strongest chess engine in the world. To make this performance even more impressive, it took AlphaZero only four hours to become this good at chess. Checkmate, humans: your time is over.
AlphaZero vs. Stockfish
This is not DeepMind’s first impressive accomplishment. AlphaZero is programmed in such a way that, besides chess, it can also teach itself to play two other board games, shogi and Go. In 2016, its predecessor AlphaGo beat Lee Sedol, one of the world’s best Go players, and a year later AlphaZero crushed Stockfish at chess. However, in that match the programs had only one minute of ‘thinking’ time per move, which people deemed disadvantageous for Stockfish. This time, both programs had more than three hours per game. The results are astonishing: in a 1000-game match, AlphaZero won 155 games and lost only six. Afterwards, the two programs also played time-odds matches, in which AlphaZero had far less thinking time than Stockfish. Here Stockfish only began to outscore AlphaZero when the odds reached 30-to-1, as can be seen in the following image, where green indicates wins for AlphaZero and red wins for Stockfish. It is interesting to see that, when playing white with the same amount of time, AlphaZero didn’t lose a single game.
The results of the time-odds matches indicate that AlphaZero, besides being stronger than any other chess engine, is also much more efficient. How did AlphaZero become so skilled and efficient?
Monte Carlo tree search
Chess engines have been improving for the last few decades, as described in a previous article by Pieter Dilg. By 2005 the best chess program had already surpassed the level of every human. Stockfish 10 has been the best for quite some time now, but that seems to be over with the rise of ‘new guy’ AlphaZero. This chess engine learns chess by playing lots of games against itself, each time improving by recognizing mistakes that it has previously made. At the base of its self-learning algorithm lies pattern recognition, similar to the way people solve their problems. Each move can be regarded as an operations research (OR) problem like the ones all econometricians have dealt with in class: you try to optimize your position after your move. However, if you think multiple moves ahead, an incredibly large number of potential moves has to be examined. In order to handle this efficiently, AlphaZero uses a Monte Carlo tree search, which concentrates its effort on the most promising moves and spends little time on options that clearly have less utility than similar moves. In this way, according to DeepMind, while Stockfish looks at 60 million positions per second, AlphaZero examines only 60,000, having already filtered out the obviously wrong moves. The results are exhilarating for any chess fan, as AlphaZero not only manages to win, but quite often does so in a very surprising way, giving away pieces in exchange for a better position in the most courageous, unexpected ways. For the proper nerdy fans out there, here’s an analysis of one of the games between AlphaZero and Stockfish 10, where AlphaZero sacrifices its knight in the most awesome way.
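To give a feel for how a Monte Carlo tree search works, here is a minimal sketch in Python. It is not AlphaZero's code: AlphaZero guides its search with a neural network, whereas this sketch uses the plain textbook variant (UCT with random playouts). Chess is also swapped for a toy game, assumed here purely for illustration: players alternately take 1 to 3 stones from a pile, and whoever takes the last stone wins. The names Node, uct_child, rollout and mcts_best_move are made up for this example.

```python
import math
import random


class Node:
    """One position in the search tree of the toy stone-taking game."""

    def __init__(self, stones, parent=None, move=None):
        self.stones = stones      # stones left; the player to move takes 1-3
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node


def uct_child(node, c=1.4):
    """Pick the child with the best UCB1 score (win rate plus exploration bonus)."""
    return max(
        node.children,
        key=lambda ch: ch.wins / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def rollout(stones, rng):
    """Play random moves to the end; True if the player to move here wins."""
    player = 0                    # 0 = the player to move at the start
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player == 0    # whoever took the last stone wins
        player ^= 1
    return False                  # no stones left: the player to move already lost


def mcts_best_move(stones, iterations=2000, seed=0):
    """Return the most-visited move from a pile of `stones` (assumes stones >= 1)."""
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: walk down while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: try one move that has not been explored yet.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: a random playout estimates how good the position is.
        mover_wins = not rollout(node.stones, rng)
        # 4. Backpropagation: update statistics, flipping the viewpoint each level.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if mover_wins else 0.0
            mover_wins = not mover_wins
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move


if __name__ == "__main__":
    # From 3 stones, taking all of them wins on the spot.
    print(mcts_best_move(3))
    # From 10 stones, the winning idea is to leave a multiple of four.
    print(mcts_best_move(10, iterations=5000))
```

The exploration bonus in uct_child is what keeps the search from examining every line equally: moves that keep losing in the playouts are revisited less and less often, so most of the effort goes to the promising ones. In AlphaZero, roughly speaking, the random playout and the choice of which moves to try first are replaced by the judgement of its self-trained network.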
DeepMind presented the code behind AlphaZero as a ‘general learning algorithm’, but that statement seems somewhat exaggerated. AlphaZero has managed to become the best at chess, Go and shogi, but these are all games where the rules are exactly known and there is only one opponent. Hence, there is still a big step to take before it can be called truly general. But DeepMind has already moved on and is currently applying its algorithms to more complex, multiplayer games. It seems only a matter of time until DeepMind-based algorithms are used in real-world practical applications, such as self-driving cars and self-flying drones.
This article was written by Sjors Keet