• Tidak ada hasil yang ditemukan

InfiniteDynamic Games

N/A
N/A
Protected

Academic year: 2024

Membagikan "InfiniteDynamic Games"

Copied!
26
0
0

Teks penuh

(1)

هﺪﻨﻨﮐ ﻪﺋارا :

دﺮﻓﻮﮑﯿﻧ ﻦﯿﺴﺣﺮﯿﻣا

ﺮﯿﺼﻧ ﻪﺟاﻮﺧ هﺎﮕﺸﻧاد ﺮﺗﻮﯿﭙﻣﺎﮐ و قﺮﺑ ﯽﺳﺪﻨﻬﻣ

(2)

Material

Dynamic Non-cooperative Game Theory: Second Edition

Chapter5: Sections 5:1–5:3.

InfiniteDynamic Games

(3)

Zero sum games

Non-zero sum games

Infinite Games

Infinite Dynamic Games

Dynamic games in discrete time

Information structures

InfiniteDynamic Games

(4)

What does ‘dynamic’ refer to?

Multi-stage games: Players gain dynamic information throughout the decision process

Stages/Levels/Time: These games require a notion of time

Discrete-time Dynamic Games: Finite or countably infinite number of stages/levels of play

Continuous-time Dynamic Games: continuum of stages/levels of play

Static v/s Dynamic(Recap)

(5)

Consider a two-player multi-stage game in extensive form like the one in Figure (a) and suppose that we use the following notation:

For each stage,

1. xk denotes the node at which the game enters the kth stage,

2. uk denotes the action of player P1 at the kth stage,

3. dk denotes the action of player P2 at the kth stage.

Dynamic games in discrete time

{1, 2, , K}

k

(6)

In this case, the overall tree structure can be mathematically described by equations of the form:

(1)

Equation (1) describes the tree itself, but not the outcomes or the information sets.

Evolution of Decision Process: dynamic state evolution with cost

Dynamic games in discrete time

1 2

1

" " ' '

1

( , , )

k k k k k

entry node dynamics at entry node P s action P s action at at stage k stage k at stage k at stage k stage k

x f x u d

(x , u ,d )

i

k k k

J

(7)

This type of description actually allows for games that are more general that those that are typically described in extensive form.

For example,

games described by graphs that are not trees, such as the one in Figure (b);

games with infinitely many stages ( );

games with action spaces that are not

finite sets.

Notation. A tree is (connected) graph that has no cycles.

Dynamic games in discrete time

k  

(8)

Games whose evolution is represented by an equation such as (1) are called dynamic games and the equation (1) is often called the

dynamics of the game. The set X where the state xktakes values is called the state-space of the game.

The outcome Ji for a particular player Pi, in a multi-stage game in extensive form like the one in Figure (a) is a function of the state of the game at the last stage K and the actions taken by the players at this stage:

However, when the game is described by a graph that is not a tree, one may have different outcomes depending on how one got to the end of the game. In this case, the outcome Ji may depend on all the decisions made by both players from the start of the game:

Dynamic games in discrete time

1 1 1

(x , u ,d , , u ,d )

i

k k

J

(x , u ,d )

i

k k k

J

(9)

A dynamic game is said to have a stage additive cost when the outcome Ji to be minimized can be written as

when all gk are equal to zero, except for the last one gk, the game is said to have a terminal cost.

Often one regards the stage index k as time and calls these

discrete-time dynamic games, in contrast with differential games that take place in continuous times

Dynamic games in discrete time

1

(x , u ,d )

K i

k k k k

k

g

(10)

We need to model the process:

Let’s loop it around to get:

Loop model

(11)

Basic elements of the loop model:

Set of Players:

Stages:

State Space:

Action Space:

Loop model

, , : [1, , ]

P ii N N N , : [1, , ]

k K K K

k 1

x X for k K K

i i , i i

k k k k

u U u U

1 1 1

: , ,

i i i N

k k k k k

U U U U U k K i N

(12)

Information in the Loop Model:

Information Set: The information available to each player is

Information Space:

Loop model

1 1

1 1 1 1

i k N N

k k k

N X U U U U

1 1

1 1 1 1 1

{ , , ; , , ; ; , , } , i N

i N N

k x x uk uk u uk

i Ni

k k

(13)

Functionals in the loop model:

State Equation: Describes the state evolution

Strategies: Pi uses strategy

Cost Functional: The real-valued cost for Pi is given by

Loop model

1 1

1 1 1

( , , , ; ; , , , )

i N N

k K K

L x u u x u u

1 ( , N, , N ) , ,

k k k k k k

x f x u u k K x X { ,....,1 },

i i i i i

K for

     ( );

i i i i i

k k k k k

u    

(14)

Multi-agent Formation Game: N-robots, with first order dynamics

begin from an initial position

Each robot tries to get to a position , such that the position corresponds to one where the N robots are evenly spaced1m apart, over K time instants.

Example of a Nonzero sum Game

i

x D 1

i i i

k k k

x x h u

1

1 [ 1, , 1N ] x x x [ 1 , , N ]

D D D

x x x

(15)

Multi-agent Formation Game using the Loop Model:

Basic Elements:

N players , K stages

State Space:

Action Space

Information: Suppose that each robot has a perfect measurement of its state, and transmits this information to all other robots

instantaneously. Then

Missing actions?

No

Example of a Nonzero sum Game

[ 1,1], [ 1,1] 1

i i N

k k

u   u  

; [ 1 , , ]

i N N

k k k k

x R x x x R

{ ,1 , } , i N

i

k x xk

(16)

Multi-agent Formation Game using the Loop Model:

Functionals:

State Equation:

Strategies: In general, . One example could be

Cost Function:

Example of a Nonzero sum Game

( )

i i i

k k k

u  

1

1 1

( ) ( ) ( ) ( ) ( )

K K

i i i T i i i T i

k D x k D k u k

k k

L x x Q x x u Q u

1

1 [ ,..., N T]

k k k k

x x h u u

i ji j

k k k

j N

u k x

(17)

Zero sum games

Non-zero sum games

Infinite Games

Infinite Dynamic Games

Dynamic games in discrete time

Information structures

InfiniteDynamic Games

(18)

Dynamic games can have a wide range of information structures, but we will consider mostly some structures (such as open loop , (perfect) state feedback and etc.)

In open-loop (OL) dynamic games, players do not gain any information as the game is played (other than the current stage) and must make their decisions solely based on a-priori

information

Information Structures

(19)

In terms of an extensive form representation, each player has a single information set per stage, which contains all the nodes for that player at that stage, as in the game shown in Figure (a). For open-loop dynamic games, one typically represents policies as functions of the initial state.

Information Structures

1 ( )

1

,

( )

i k

i i OL i

k k k

x k K

u x

 

(20)

In (perfect) state-feedback (FB) games, the players know exactly the state xk of the game at the entry of the current stage and can use this information to choose their actions at that stage.

However, they must make these decisions without knowing each others choice (i.e., we have simultaneous play at each stage).

Information Structures

(21)

In terms of an extensive form representation, at each stage of the game there is exactly one information set for each entry-point to that stage, as in the game is shown in Figure (b). For state-

feedback games, one typically represents policies as functions of the current state:

Information Structures

1 ( )

{ ,...., } , ( )

i

k k

i i FB i

k k k

x x k K

u

 

(22)

What if the full state is not available to the players?

Observations (Measurements):

Information Structure (Imperfect State Information):

The qualifier "perfect" is sometime used to emphasize that the players know exactly the current state xk, in contrast to

situations in which the players only have access to estimates of xk, possibly constructed from noisy measurements.

Information Structures

i i

k k

y Y

( ) , k ,i N

i i

k k k

y h x K

1 1

1 1

1 1

1 1

{ ,...., ; ; ,...., ;

, , ; ; , , }

i N N

k k k

N N

k k

y y y y

u u u u

(23)

The robots use a low-quality GPS sensor to measure their positions:

Observations (Measurements):

Information Pattern: Now, the robots transmit the measurements instantly to all other robots. Thus

Missing Actions?

No (under suitable assumptions)

Information Structures (Example)

, k ,i N

i i i

k k k

y C x K

1 1

1 1

{ ,...., ; ; ,...., }

i N N

k y yk y yk i N

(24)

Forgetful Players: Players forget some of the stored

states/measurements/actions. For example: Memoryless players use only the current values of the state/observation to determine their actions.

Information Set for Forgetful Players: In general -

Missing Actions? Possibly! Can be compensated by transmitting actions to other players as well.

Example:

Information Structures (Example)

1 1

1 1

{ ,...., ; ; ,...., }

i N N

k y yk y yk

{ ,1 } ,

i i i

k y yk k K

(25)

Delayed Information: The measured state/observation (or transmitted actions) are delayed by time steps.

Information Set for Delayed Information: In general -

Missing Actions? Possibly!

Example:

Information Structures (Example)

1 1

1 1

{ ,...., ; ; ,...., } , 1 1

i N N

k y yk y yk k

  

1 1

1 1

{ ,...., ; ; ,...., }

i N N

k y yk y yk

(26)

A few typical information structures (IS):

Open Loop IS:

Closed-Loop Perfect State IS:

Closed-Loop Imperfect State IS:

Memoryless IS:

Feedback IS:

One step delayed IS:

Examples of Information Structures

1 ,

i

k x k K

{ ,....,1 } ,

i

k x x k k K

{ 1 ,...., } ,

i i i

k y y k k K

1 1

{ , } , { , } ,

i i i i

k x x k OR k y y k k K

, ,

i i i

k x k OR k y k k K

1 1 1 1

{ , , } , { , , } ,

i i i i

k x x k OR k y y k k K

Referensi

Dokumen terkait

They proved to be (perfectly, perfect) (exact, exactly) measurements.. The stillness of the tomb was

I show that if there is sufficient uncertainty, the game has an equilibrium (essentially a subgame perfect equilibrium) in which two players enter at distinct positions simul-

Read on and find out just how to choose the perfect shape, size and style of women handbag that will fit your body type and personal style.. Look at

Division Zygomycotina includes the fungi that do not have motile cells or zoospores during any stage of their life cycle, and the perfect state spores are present in the form of

3. For more than two players, suppose that there is a common prior on the set of states of the world, that at some state, the structure of the game and rationality of the players

Supports HDL Design Entry Techniques Altera’s tri-state emulation supports HDL design entry techniques, allowing designers to describe designs at the behavioral level and rely on

execution/evaluation loop • user establishes the goal • formulates intention • specifies actions at interface • executes action • perceives system state • interprets system state

Students who are active players in the PUBG online game at the Information Technology Study Program, FKIP Unisri for the 2018/2019 academic year, whose score index is affected by