115 ISSN 2085-1944
Gunawan , I Made Bakti Kurniawan 1
Program Pascasarjana Institut Teknologi Sepuluh Nopember Kampus ITS Keputih Sukolilo Surabaya
2
Sekolah Tinggi Teknik Surabaya Ngagel Jaya Tengah 73-77 Surabaya
Email: [email protected] 1, [email protected] 2
ABSTRACT
Expectimax-N is an algorithm created by combining the expectimax algorithm, which is used for stochastic game playing, with the Max-N algorithm, which is usually used for multi-player perfect-information games. The algorithm searches a game tree to a fixed depth and applies the Static Board Evaluator (SBE) at the nodes where the search stops.
The algorithm has been tested on a multiplayer stochastic game, Carcassonne: Hunter and Gatherers, and the results indicate that the algorithm can solve the problem.
Keywords: Expectimax-N, Expectimax, Max-N, Game Playing.
1 INTRODUCTION
This paper investigates how computers can play multi-player games of chance. We focus on games where the entire state of the game is visible to all players. This property is known under the more technical name of perfect information. The element of chance means that we are considering stochastic domains. One popular multi-player, stochastic, perfect-information game is Carcassonne. It is stochastic because game tiles are drawn at random, and it is a perfect-information game because the game state is completely visible at all times.
A stochastic game can be solved with the expectimax algorithm. A multiplayer stochastic game, however, cannot be handled by expectimax alone, so we need to modify the algorithm to handle one value per player.
2 GAME PLAYING
This section discusses prior work that serves as the foundation for solving multiplayer stochastic games. Two algorithms are used as that foundation: the first is expectimax and the second is max-n.
2.1 Expectimax
Expectimax is a brute force, depth first game tree search algorithm that generalises the minimax concept to games of chance. By looking into the future, we can compute the expected value of a particular game state. This means averaging the minimax values across all of the possible chance events, taking into account the probability of each event occurring. This generalisation amounts to adding a chance node type to the minimax tree. The successors of a chance node are all of the possible stochastic events that can occur according to the game rules.
Figure 1. Pseudo-code for expectimax
ChanceSearch(board,depth)
    if (board.needChanceEvent())
        return Expectimax(board,depth)
    else
        return minimax(board,depth)

Expectimax(board,depth)
    if ((terminalNode()) or (depth == 0))
        return evaluate(board)
    val <- 0
    sum <- 0
    for I <- 1,2,3 … numChanceEvent() do
        board.doChance(I)
        val <- ChanceSearch(board,depth-1)
        board.undoChance(I)
        sum <- sum + Probability(I) * val
    return sum
Figure 1 shows the pseudo-code for the expectimax algorithm. As we can see, the algorithm works like minimax except at chance nodes, where the value of the node is computed as a probability-weighted sum over the possible events. The formula can be written as:

\[ \text{value} = \sum_{e} \text{Probability}(e) \times \text{utility}(b) \]

Here Probability(e) is the probability that event e occurs, and utility(b) is the value of the node b reached when event e occurs.
Figure 2. Expectimax game tree
There are three different non-leaf node types: max, min and chance. The value of a max node is defined to be the highest value of its children. Similarly, the value of a min node is defined to be the lowest value of its children. The value of a chance node is defined to be the weighted sum of its children.
Figure 2 demonstrates the expectimax algorithm. Each node in the tree is drawn as a square, a circle, or a spike. A square denotes a terminal node, a circle denotes a non-terminal minimax node, and a spike denotes a chance node. At node (b) the value 0.35 comes from the calculation (0.1 × 1/4) + (0.5 × 1/4) + (0.4 × 1/2). At node (c) the value 0.1 comes from the calculation 0.1 × 1. At node (a) player 1 chooses between 0.35 and 0.1, and picks 0.35 to maximize the outcome.
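The Figure 2 walk-through can be reproduced with a minimal Python sketch. The `Node` class, its `kind` labels, and the tree layout are assumptions made for illustration and are not taken from the paper's implementation. The sketch omits the depth cut-off and board evaluator of Figure 1 and reads leaf values directly. It also assumes that in each product such as 0.1 × 1/4 the first factor is the leaf value and the second is the event probability.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                                   # "max", "min", "chance", or "leaf"
    value: float = 0.0                          # used only by leaf nodes
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)  # one probability per child of a chance node


def expectimax(node: Node) -> float:
    """Return the expectimax value of a node in an explicit game tree."""
    if node.kind == "leaf":
        return node.value
    values = [expectimax(child) for child in node.children]
    if node.kind == "max":
        return max(values)
    if node.kind == "min":
        return min(values)
    # Chance node: probability-weighted sum of the child values.
    return sum(p * v for p, v in zip(node.probs, values))


# Tree assumed from the Figure 2 description.
node_b = Node("chance",
              children=[Node("leaf", 0.1), Node("leaf", 0.5), Node("leaf", 0.4)],
              probs=[0.25, 0.25, 0.5])
node_c = Node("chance", children=[Node("leaf", 0.1)], probs=[1.0])
node_a = Node("max", children=[node_b, node_c])

value_b = expectimax(node_b)
value_c = expectimax(node_c)
value_a = expectimax(node_a)
print(round(value_b, 2), round(value_c, 2), round(value_a, 2))  # 0.35 0.1 0.35
```

Values are rounded only for display, since the weighted sum is computed in floating point.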
2.2 Max-N
Max-N (Luckhardt & Irani 1986) is the generalization of minimax to any number of players; in a two-player, zero-sum game it returns the same result as minimax. The values at the leaves of a max-n tree (max-n values) are n-tuples, where the i-th value in the tuple corresponds to the score or utility of a particular outcome for player i. The max-n value of a node where player i is to move is the value of the child node for which the i-th component is maximal. In the case of a tie, any outcome may be selected.
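As a minimal sketch of the max-n rule described above, the following Python code backs up score tuples through an explicit tree. The `Node` class and the 0-based `player` index are assumptions made for illustration, and the sketch has no depth limit or board evaluator. Ties are broken to the left, which is one of the choices the rule allows. The example values are the two outcomes that the paper discusses for node (a) of Figure 4.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Scores = Tuple[float, ...]


@dataclass
class Node:
    player: int = 0                             # 0-based index of the player to move (unused at leaves)
    children: List["Node"] = field(default_factory=list)
    scores: Scores = ()                         # score tuple, meaningful only at leaves


def max_n(node: Node) -> Scores:
    """Return the max-n value (a score tuple) of a node."""
    if not node.children:
        return node.scores
    best: Optional[Scores] = None
    for child in node.children:
        value = max_n(child)
        # Strict comparison keeps the leftmost child when scores tie.
        if best is None or value[node.player] > best[node.player]:
            best = value
    return best


# Player 2 (index 1) gets 4 from either outcome, so the left one is kept.
node_a = Node(player=1, children=[Node(scores=(6, 4, 0)), Node(scores=(1, 4, 5))])
result_a = max_n(node_a)
print(result_a)  # (6, 4, 0)
```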
Figure 3. Pseudo-code for max-n algorithm
Max-N(board,depth,turn)
    if ((terminalNode()) or (depth == 0))
        return evaluation(board)
    Score <- -infinite
    for I <- 1,2,3 … numberOfMove do
        board.doMove(I)

Figure 3 shows the pseudo-code for the max-n algorithm. The algorithm proceeds just like minimax, but each node holds an array of scores, one per player, and the player to move maximizes their own entry in that array.
Figure 4 demonstrates the max-n algorithm. Each node in the tree is a square, inside of which is the player to move at that node. At node (a) Player 2 can choose between two outcomes, (6, 4, 0) and (1, 4, 5). Because Player 2 gets 4 from either choice, we arbitrarily break the tie to the left and return the value (6, 4, 0). At node (b) Player 2 will choose (3, tree, and all leaf values are known, the resulting strategies will be in equilibrium, meaning that no player can do better by changing their strategy. However, this analysis does not provide a worst-case guarantee. A player, for instance, may be able to change their strategy in a way that decreases another player's score without causing their own score to decrease. In fact, mistaken analysis at even a single node of a max-n tree can arbitrarily affect the payoff of the resulting strategy.
Figure 4. Max-N game tree
3 MOTIVATING EXAMPLE
In this paper we use a game called Carcassonne: Hunter and Gatherers (CHG) as an example of a multiplayer stochastic game. CHG is a board game similar to Carcassonne and can be played by 2 to 5 players. For this research, we consider the three-player version of the game, with no partnerships. The rules are the same as in the original game. Each tile is modeled as a 7x7 matrix that holds the values of the tile (type of tile, score value, and owner value).
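One possible way to hold such a tile in memory is sketched below in Python. This is only an illustration of the 7x7 matrix idea: the `Cell` class, the `make_tile` helper, the integer score, and the example layout are all hypothetical and are not taken from the paper's implementation. The terrain letters follow the tile codes used in the paper's figures (P for plain, R for river, F for forest).

```python
from dataclasses import dataclass
from typing import List, Optional

TILE_SIZE = 7  # each tile is a 7x7 matrix, as stated in the text


@dataclass
class Cell:
    terrain: str                  # "P" plain, "R" river, "F" forest
    score: int = 0                # score value of this cell (assumed to be an integer)
    owner: Optional[int] = None   # 0-based index of the owning player, None if unclaimed


def make_tile(rows: List[str]) -> List[List[Cell]]:
    """Build a 7x7 grid of cells from seven strings of seven terrain codes."""
    if len(rows) != TILE_SIZE or any(len(row) != TILE_SIZE for row in rows):
        raise ValueError("a tile needs exactly 7 rows of 7 terrain codes")
    return [[Cell(terrain=code) for code in row] for row in rows]


# Hypothetical tile: forest on top, a river through the middle, plain below.
tile = make_tile([
    "FFFFFFF",
    "FFFFFFF",
    "FFFFFFF",
    "RRRRRRR",
    "PPPPPPP",
    "PPPPPPP",
    "PPPPPPP",
])
tile[0][0].owner = 0  # the first player claims the forest cell in the corner
```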
This game is a stochastic multiplayer game because a tile is drawn at random on each player's turn. It is a perfect-information game because all game states are visible to every player.
4 EXPECTIMAX-N
This section discusses the approach used to solve a stochastic multiplayer game and traces a simple example in CHG.
4.1 Expectimax-N Approaches
Expectimax-N is an algorithm created by merging two algorithms from prior work: expectimax and max-n. It inherits its architecture from max-n and, at chance nodes, uses the same calculation as expectimax.
Figure 5 shows the pseudo-code for the expectimax-n algorithm. The pseudo-code contains a call to MAX-N, which is the same max-n routine shown in Figure 3.
Figure 5. Pseudo-code Expectimax-N
Figure 6 illustrates the expectimax-n algorithm. Each node in the tree is either a square or a circle. A square holds the player to move at that node, and a circle is a chance node. At node A player 2 can choose between two outcomes, (6, 3, 0) and (3, 6, 0), and chooses (3, 6, 0) to get 6 rather than 3. At node B player 2 chooses (4, 5, 1) to get 5. At node C player 2 can choose between two outcomes, (2, 1, 5) and (0, 1, 6). Because player 2 gets 1 from either choice, we arbitrarily break the tie to the left and return the value (2, 1, 5). At node D player 2 chooses (2, 3, 1) in the same way as at node C.
Node E is a chance node, so player 1 does not choose anything there. Instead, the value of node E is the weighted sum of its children, where each child's value is multiplied by the probability of the chance event that leads to it. The value of node E is therefore ((3, 6, 0) × 0.4) + ((4, 5, 1) × 0.6) = (3.6, 5.4, 0.6). The same calculation at node F gives (2, 2.2, 2.6). At node G player 1 chooses (3.6, 5.4, 0.6) to get 3.6 rather than (2, 2.2, 2.6) to get 2.
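The Figure 6 trace can be checked with a short Python sketch that combines the max-n selection rule with the expectimax weighted sum. The `Node` class, its `kind` labels, and the 0-based player indices are assumptions made for illustration, not the paper's code. Nodes B and D are entered directly as their backed-up values because their rejected alternatives are not given in the text. The probabilities 0.4 and 0.6 at node F are also an assumption: they are the ones stated for node E, and they reproduce the reported value (2, 2.2, 2.6).

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Scores = Tuple[float, ...]


@dataclass
class Node:
    kind: str                                   # "move", "chance", or "leaf"
    player: int = 0                             # 0-based mover index, used by "move" nodes
    scores: Scores = ()                         # score tuple, used by "leaf" nodes
    children: List["Node"] = field(default_factory=list)
    probs: List[float] = field(default_factory=list)  # used by "chance" nodes


def expectimax_n(node: Node) -> Scores:
    """Back up score tuples: max-n at move nodes, weighted sum at chance nodes."""
    if node.kind == "leaf":
        return node.scores
    values = [expectimax_n(child) for child in node.children]
    if node.kind == "chance":
        width = len(values[0])
        return tuple(sum(p * v[i] for p, v in zip(node.probs, values))
                     for i in range(width))
    best = values[0]
    for value in values[1:]:
        # Strict comparison breaks ties to the left.
        if value[node.player] > best[node.player]:
            best = value
    return best


def leaf(*scores: float) -> Node:
    return Node("leaf", scores=scores)


node_A = Node("move", player=1, children=[leaf(6, 3, 0), leaf(3, 6, 0)])
node_B = leaf(4, 5, 1)   # backed-up value only; alternatives not given in the text
node_C = Node("move", player=1, children=[leaf(2, 1, 5), leaf(0, 1, 6)])
node_D = leaf(2, 3, 1)   # backed-up value only; alternatives not given in the text
node_E = Node("chance", children=[node_A, node_B], probs=[0.4, 0.6])
node_F = Node("chance", children=[node_C, node_D], probs=[0.4, 0.6])  # assumed probabilities
node_G = Node("move", player=0, children=[node_E, node_F])

value_E = tuple(round(x, 1) for x in expectimax_n(node_E))
value_F = tuple(round(x, 1) for x in expectimax_n(node_F))
value_G = tuple(round(x, 1) for x in expectimax_n(node_G))
print(value_E, value_F, value_G)  # (3.6, 5.4, 0.6) (2.0, 2.2, 2.6) (3.6, 5.4, 0.6)
```

Rounding is applied only when displaying the tuples, so the choice at node G is made on the unrounded values.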
Figure 6. Expectimax-N game tree
In this example we take the first state of the game, right after it starts. The board holds one starting tile, as shown in Figure 7, and player 1 draws from the deck the tile shown in Figure 8. The game is played by 3 players, and player 1's move is generated with the expectimax-n algorithm, which calculates where player 1 should place the tile.
Figure 10 demonstrates the expectimax-n algorithm on CHG. The game tree is not completely expanded for lack of space. The left value on each node is the value while the algorithm is still searching, and the right value (bold) is the value once the algorithm has finished evaluating all of the node's children. At node (a) player 2 chooses the node with value [80, 290, 0] (the first one). At node (b) player 2 chooses the node with value [50, 255, 0] (the first one). At node (c) the weighted value of the chance node is calculated, giving [80, 271.7, 0]. The same calculation happens at nodes (d), (e) and (f). At node (a) player 1 chooses [80, 271.7, 0] because player 1 gets 80 from that node.
Figure 7. Example of first state game CHG
Figure 8. The tile drawn by player 1
The right-hand pictures in Figures 7 and 8 show the code for each tile: P denotes plain, R denotes river, and F denotes forest.
4.3 Experiment Results in Carcassonne: Hunter and Gatherers (CHG)
We carried out an experiment with this algorithm by implementing the game Carcassonne: Hunter and Gatherers. We played the game with 3 players (all three controlled by the computer) and collected score statistics from 20 games. The results are shown in Figure 9. From these results we conclude that the algorithm can be used to solve a multiplayer stochastic game.
5 CONCLUSION AND DISCUSSION
In this paper we introduced the expectimax-n algorithm for incorporating models of opponents in stochastic n-player games. Pruning methods may be added to the algorithm in future experiments. Even without them, the algorithm itself can be used to solve multiplayer stochastic games such as Carcassonne.
Figure 9. Statistic Score
REFERENCES
[1] Luckhardt C, Irani K (1986) An Algorithmic Solution of N-person Games. In AAAI-86, volume 1, 158–162.
[2] Veness J (2006) Expectimax enhancements for stochastic games players. Ph.D. Thesis, The University of New South Wales.
[3] Sturtevant NR (2003) Multiplayer games: algorithms and approaches. Ph.D. Thesis, The University of California.
[4] Rio Grande Games (2002) Rule Game Carcassonne: Hunter And Gatherers [online]. Available at: http://www.riograndegames.com/uploads/Game/Game_49_gameRules.pdf
Figure 10. CHG game tree using expectimax-n
(Alternative for Figure 10).
Carcassonne: Hunter and Gatherers game tree using expectimax-n. In this picture we use symbols,