Game playing
Artificial Intelligence
Lecture 5, IT-EEPIS
Why study games?
• The criteria for winning or losing are clear
• They provide well-defined problems to study
• They are fun
• They usually have a large search space (for example, chess has about 35^100 nodes in its search tree and about 10^40 legal states)
How good are computer game players?
– Chess:
• Deep Blue defeated Garry Kasparov in 1997
• Garry Kasparov vs. Deep Junior (Feb 2003): draw
– Checkers:
• Chinook is the world champion
– Go:
• Computer players are very strong
– Bridge:
• Computer players play at "expert level"
Deep Blue Chess Matches
• Deep Blue was a chess computer built by IBM.
• Deep Blue was the first computer to win a chess game against a world champion (Garry Kasparov) under standard tournament time controls. Its first win (in the first game of the match) came on 10 February 1996 and is a very famous game. However, Kasparov went on to win 3 of the remaining games and draw the other 2, defeating Deep Blue 4-2.
Deep Blue Chess Matches
• Deep Blue was then heavily upgraded and played Kasparov again in May 1997. Deep Blue won that six-game match 3.5-2.5; the final game ended on 11 May. Deep Blue thus became the first computer to defeat a reigning world champion.
• The computer has since been "retired" and is on display at the National Museum of American History in the United States.
Deep Blue Chess Matches
Garry Kasparov and Deep Blue. © 1997,
GM Gabriel Schwartzman's Chess Camera, courtesy IBM.
Ratings of human and computer chess
champions
January/February 2003
Common characteristics of games
• 2 players
• Players take turns
• Zero-sum: one player's loss is the other player's gain
• Perfect information: both players know the complete state of the game
• Examples: Tic-Tac-Toe, Checkers, Chess, Go, Nim, Othello
• No element of chance (such as dice)
• This excludes games such as Bridge, Solitaire, Backgammon, and the like
How do we play a game?
• How to play:
– Consider all possible moves
– Assign a value to each possible move
– Make the move with the best value
– Wait for the opponent to move
– Repeat the steps above
• Key problems:
– Represent the "board" or "state"
– Generate the legal next boards
– Evaluate a position
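The "consider, score, pick the best" loop above can be sketched in a few lines. This is only an illustration under stated assumptions: `choose_move` and the three callbacks are hypothetical names, and the toy game (add 1, 2, or 3 to a running total, trying to land on 10) is invented here to keep the example runnable.

```python
def choose_move(state, legal_moves, apply_move, evaluate):
    """Return the legal move whose resulting position scores highest."""
    return max(legal_moves(state),
               key=lambda move: evaluate(apply_move(state, move)))


# Hypothetical toy game: the state is a running total and we want to reach 10.
TARGET = 10

def legal_moves(total):
    return [1, 2, 3]               # "generate the legal next boards"

def apply_move(total, move):
    return total + move            # the new state after the move

def evaluate(total):
    return -abs(TARGET - total)    # closer to the target is better

print(choose_move(7, legal_moves, apply_move, evaluate))
```

This looks only one move ahead. It does not yet account for the opponent's reply.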
Evaluation function
• An evaluation function, or static evaluator, is used to estimate how good a position is
• The zero-sum assumption lets us use a single evaluation function to describe the value of a position
– f(n) >> 0: position n is good for me and bad for the opponent
– f(n) << 0: position n is bad for me and good for the opponent
– f(n) near 0: position n is neutral (roughly a draw)
– f(n) = +infinity: I win
– f(n) = -infinity: the opponent wins
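As an illustration of a single zero-sum evaluation function, here is a sketch for tic-tac-toe. The scoring rule is an assumption chosen for the example, not one given on these slides: it counts the lines still open to me minus the lines still open to the opponent. The board encoding (a 9-character string with "." for empty squares) and the name `evaluate` are also assumptions.

```python
import math

# The 8 winning lines on a 3x3 board, squares indexed 0..8 row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def evaluate(board, me="X", opp="O"):
    """Static evaluation of a board from the point of view of `me`."""
    for a, b, c in LINES:
        if board[a] == board[b] == board[c] == me:
            return math.inf        # f(n) = +infinity: I win
        if board[a] == board[b] == board[c] == opp:
            return -math.inf       # f(n) = -infinity: the opponent wins
    # A line is "open" for a player if the other player has no mark on it.
    open_me = sum(all(board[i] != opp for i in line) for line in LINES)
    open_opp = sum(all(board[i] != me for i in line) for line in LINES)
    return open_me - open_opp


print(evaluate("....X...."))   # X alone in the centre
print(evaluate("O...X...."))   # O corner, X centre
```

Positive values favour X, negative values favour O, and an empty board scores 0.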
First three levels of tic-tac-toe state space reduced
by symmetry
The “most wins” heuristic
Heuristically reduced state
space for tic-tac-toe
Consider this position
We are playing X, and it is now our turn.
X = Computer, O = opponent
Let’s write out all possibilities
Each number represents a position after each legal move we have.
X move
Now let’s look at their options
Here we are looking at all of the opponent responses to the first possible move we could make.
O move
Now let’s look at their options
Opponent options after our second
possibility. Not good again…
Now let’s look at their options
Struggling…
More interesting case
Now they don’t have a way to win on their next
move. So now we have to consider our responses to
their responses.
Our options
We have a win for any move they make.
So the original position in purple is an X win.
Finishing it up…
They win again if we take our fifth move.
Summary of the Analysis
So which move should we make? ;-)
The game of Nim
• The game starts with a single pile of sticks
• On each turn, a player must split one pile into 2 piles that are non-empty and of unequal size
A variant of the game nim
• A number of tokens are placed on a table between the two opponents
• A move consists of dividing a pile of tokens into two nonempty piles of different sizes
• For example, 6 tokens can be divided into piles of 5 and 1 or 4 and 2, but not 3 and 3
• The first player who can no longer make a move loses the game
• For a reasonable number of tokens, the state space can be exhaustively searched
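The move rule above translates directly into a successor function, and the state space can then be enumerated exhaustively. The following is a minimal sketch. The representation (a tuple of pile sizes sorted in descending order) and the names `successors` and `reachable` are assumptions made for the example.

```python
def successors(state):
    """All states reachable in one move: split one pile into two
    non-empty piles of different sizes."""
    result = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # small < large guarantees the two new piles differ in size
        for small in range(1, (pile - 1) // 2 + 1):
            large = pile - small
            result.add(tuple(sorted(rest + (large, small), reverse=True)))
    return sorted(result, reverse=True)


def reachable(start):
    """Exhaustively collect every state reachable from `start`."""
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


print(successors((6,)))        # 5-1 and 4-2, but not 3-3
print(len(reachable((7,))))    # number of distinct states from 7 tokens
```

Storing states as sorted tuples makes repeated positions such as 4-2-1 collapse into one entry, which is the same simplification as drawing the state space as a graph.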
State space for a variant of nim
• Note that state 4-2-1 is repeated. We can simplify the structure by drawing a general graph.
State space for a variant of nim
Search techniques for 2-person games
• The search tree is slightly different: It is a
two-ply tree where levels alternate between players
• Canonically, the first level is “us” or the player whom we want to win.
• Each final position is assigned a payoff:
– win (say, 1) – lose (say, -1) – draw (say, 0)
• We would like to maximize the payoff for the first player, hence the names MAX & MINIMAX
Minimax
• In 1944, John von Neumann described a search algorithm for games, known as Minimax, which maximizes the player's position and minimizes the opponent's position
The search algorithm
• The root of the tree is the current board position; it is MAX's turn to play
• MAX generates as much of the tree as it can and picks the best move, assuming that MIN will also choose the best moves for herself.
• This is the Minimax algorithm, which was invented by Von Neumann and Morgenstern in 1944 as part of game theory.
• The same problem arises as with other search trees: the tree grows very quickly, so exhaustive search is usually impossible.
Special technique
• MAX generates the full search tree (up to the leaves, i.e. terminal nodes or final game positions) and chooses the best move: a win or a tie
• To choose the best move, values are propagated upward from the leaves:
– MAX chooses the maximum
– MIN chooses the minimum
• This assumes that the full tree is not prohibitively big
• It also assumes that the final positions are easily identifiable
• We can make these assumptions for now, so let’s look at an example
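[Figure: minimax example. Levels from the root are MAX, MIN, MAX; the leaves are terminal positions. The legend distinguishes terminal positions, agent nodes, and opponent nodes.]
A (MAX) = 1
– B (MIN) = 1
  • D (MAX) = 4, from leaves 4 and -5
  • E (MAX) = 1, from leaves -5 and 1
– C (MIN) = -3
  • F (MAX) = 2, from leaves -7 and 2
  • G (MAX) = -3, from leaves -3 and -8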
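The backing-up of values in this example can be reproduced with a short recursive function. This is a sketch under assumptions: the tree is written as nested Python lists (an inner list is an internal node, a number is a terminal payoff), and the name `minimax` and its signature are chosen for the example.

```python
def minimax(node, maximizing):
    """Back up values from the leaves: MAX takes the maximum of its
    children, MIN takes the minimum."""
    if not isinstance(node, list):     # terminal position: return its payoff
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


# Root A is a MAX node; B and C are MIN nodes; D, E, F, G are MAX nodes.
tree = [
    [[4, -5], [-5, 1]],    # B -> D, E
    [[-7, 2], [-3, -8]],   # C -> F, G
]

print(minimax(tree, True))   # value backed up to the root A
```

The function assumes the whole tree fits in memory and that terminal positions are easy to recognise, which are the two assumptions stated above.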
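[Figure: two-level minimax example, built up step by step. A MAX root has two MIN children.]
– Static evaluator values at the leaves: 2 and 7 under the first MIN node, 1 and 8 under the second
– MIN values: 2 and 1
– MAX value at the root: 2
– Move chosen by minimax: the branch to the MIN node with value 2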
Minimax applied to a hypothetical
state space (Fig. 4.15)
Assumptions
• MIN moves first
• Evaluation function:
– 0: MIN wins
– 1: MAX wins
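Under these assumptions, the value of the 7-token starting position can be computed directly. The sketch below is one possible encoding, not the slides' own code: states are tuples of pile sizes in descending order, and `successors` and `minimax` are names chosen for the example. A player who cannot move loses, so a terminal state scores 0 when MAX is to move and 1 when MIN is to move.

```python
from functools import lru_cache


def successors(state):
    """Split one pile into two non-empty piles of different sizes."""
    result = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        for small in range(1, (pile - 1) // 2 + 1):
            piles = rest + (pile - small, small)
            result.add(tuple(sorted(piles, reverse=True)))
    return sorted(result, reverse=True)


@lru_cache(maxsize=None)
def minimax(state, max_to_move):
    """Return 1 if MAX wins with best play from here, 0 if MIN wins."""
    moves = successors(state)
    if not moves:
        # The player to move is stuck and therefore loses.
        return 0 if max_to_move else 1
    values = [minimax(nxt, not max_to_move) for nxt in moves]
    return max(values) if max_to_move else min(values)


# MIN moves first from a single pile of 7 tokens.
print(minimax((7,), False))
```

The cache is safe because the value of a position depends only on the piles and on whose turn it is.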
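Complete State Space for Nim
[Figure: complete state space for Nim starting from 7 tokens, with MIN moving first. Each state is followed by its backed-up minimax value in parentheses.]
MIN: 7 (1)
MAX: 6-1 (1), 5-2 (1), 4-3 (1)
MIN: 5-1-1 (0), 4-2-1 (1), 3-2-2 (0), 3-3-1 (1)
MAX: 4-1-1-1 (0), 3-2-1-1 (1), 2-2-2-1 (0)
MIN: 3-1-1-1-1 (0), 2-2-1-1-1 (1)
MAX: 2-1-1-1-1-1 (0)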
Minimax for Tic Tac Toe
• In our tic-tac-toe example,
– player 1 is 'X'
– player 2 is 'O'
• The only three scores we will have are
– +1 for a win by 'X',
– -1 for a win by 'O',
– 0 for a draw.
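With these three scores, minimax for tic-tac-toe can be sketched as follows. The board encoding (a 9-character string, "." for an empty square) and the names `winner` and `minimax` are assumptions made for the example. Memoisation with `lru_cache` is an addition that keeps the full search fast; it is not part of the basic algorithm.

```python
from functools import lru_cache

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None


@lru_cache(maxsize=None)
def minimax(board, player):
    """Score of `board` with `player` to move: +1 X wins, -1 O wins, 0 draw."""
    won = winner(board)
    if won == "X":
        return 1
    if won == "O":
        return -1
    if "." not in board:
        return 0
    other = "O" if player == "X" else "X"
    values = [minimax(board[:i] + player + board[i + 1:], other)
              for i, square in enumerate(board) if square == "."]
    # 'X' is MAX and 'O' is MIN.
    return max(values) if player == "X" else min(values)


print(minimax(".........", "X"))   # value of the empty board
print(minimax("XX.OO....", "X"))   # X can complete the top row
```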
Minimax for Tic Tac Toe (ex 1)
[Figure: minimax tree for a tic-tac-toe position, with alternating MAX and MIN levels]
Minimax for Tic Tac Toe (ex 2)
[Figure: minimax tree for a second tic-tac-toe position; leaf scores of +1, 0, and -1 are backed up through alternating MAX and MIN levels]
Special technique
• Use alpha-beta pruning
• Basic idea: if a portion of the tree is obviously good (bad) don’t explore further to see how
terrific (awful) it is
• Remember that the values are propagated upward. Highest value is selected at MAX’s level, lowest value is selected at MIN’s level
• Call the values at MAX levels α values, and the
values at MIN levels β values
The rules
• Search can be stopped below any MIN node having a beta value less than or equal to the alpha value of any of its MAX ancestors (MIN node: β ≤ α)
• Search can be stopped below any MAX node having an alpha value greater than or equal to the beta value of any of its MIN ancestors (MAX node: α ≥ β)
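The two rules can be sketched as a recursive function that carries α and β down the tree. This is an illustration under assumptions: the tree is a nested Python list (numbers are leaf values), the leaf values are made up for the example, and the `visited` list is a hypothetical extra that records which leaves were actually evaluated.

```python
import math


def alphabeta(node, alpha, beta, maximizing, visited):
    """Minimax value of `node`, skipping branches that cannot matter."""
    if not isinstance(node, list):
        visited.append(node)           # this leaf was actually evaluated
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:          # MAX node: alpha >= beta of a MIN ancestor
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if beta <= alpha:              # MIN node: beta <= alpha of a MAX ancestor
            break
    return value


# Illustrative tree: MAX root, two MIN nodes, four MAX nodes, eight leaves.
tree = [[[6, 5], [8, 9]], [[2, 1], [7, 4]]]
visited = []
print(alphabeta(tree, -math.inf, math.inf, True, visited))
print(visited)   # leaves evaluated, in order
```

The result is the same as plain minimax would give. Only the number of leaves examined changes.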
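Example with MAX
[Figure: a MAX root (α ≥ 3) with two MIN children. The first MIN child has leaves 3, 4, 5, so β = 3. The first leaf generated under the second MIN child is 2, so β ≤ 2.]
• As soon as the node with value 2 is generated, we know that the beta value of the second MIN node will be at most 2, which is less than 3, so we don't need to generate its remaining children (or the subtrees below them)
• (Some of) the nodes under the first MIN child still need to be looked at
• Cutoff condition at the MAX node: α > β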
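Example with MIN
[Figure: a MIN root (β ≤ 5) with two MAX children. The first MAX child has leaves 3, 4, 5, so α = 5. The first leaf generated under the second MAX child is 6, so α ≥ 6.]
• As soon as the node with value 6 is generated, we know that the alpha value of the second MAX node will be at least 6, which is larger than 5, so we don't need to generate its remaining children (or the subtrees below them)
• (Some of) the nodes under the first MAX child still need to be looked at
• Cutoff condition at the MIN node: β < α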
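[Figure: alpha-beta pruning traced step by step on a tree with MAX root A, MIN nodes B and C, MAX nodes D, E, F, G, and leaves H to M. The legend distinguishes agent and opponent nodes.]
– D evaluates leaves 6 and 5, so D = 6 and B ≤ 6
– E sees a leaf of 8, so E ≥ 8; since B ≤ 6, the rest of E is pruned and B = 6
– At the root, A ≥ 6
– F evaluates leaves 2 and 1, so F = 2 and C ≤ 2; since A ≥ 6, G is pruned and C = 2
– Final value: A = 6
– The two pruned branches are labelled "alpha cutoff" and "beta cutoff"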
Alpha-beta Pruning
[Figure: alpha-beta pruning on a larger tree. Recoverable node annotations: α=3, β≤3, α≥5, α=0, β≤0, α≥3, α=2, β≤2, with cutoffs marked "MIN node β<α" and "MAX node α>β".]