InfiniteDynamic Games

(1)

هﺪﻨﻨﮐ ﻪﺋارا :

دﺮﻓﻮﮑﯿﻧ ﻦﯿﺴﺣﺮﯿﻣا

ﺮﯿﺼﻧ ﻪﺟاﻮﺧ هﺎﮕﺸﻧاد ﺮﺗﻮﯿﭙﻣﺎﮐ و قﺮﺑ ﯽﺳﺪﻨﻬﻣ

(2)

Material

 Dynamic Non-cooperative Game Theory: Second Edition

 Chapter5: Sections 5:1–5:3.

InfiniteDynamic Games

(3)

Zero sum games

Non-zero sum games

Infinite Games

Infinite Dynamic Games

Dynamic games in discrete time

Information structures

InfiniteDynamic Games

(4)

What does ‘dynamic’ refer to?

 Multi-stage games: Players gain dynamic information throughout the decision process

Stages/Levels/Time: These games require a notion of time

Discrete-time Dynamic Games: Finite or countably infinite number of stages/levels of play

Continuous-time Dynamic Games: continuum of stages/levels of play

Static v/s Dynamic(Recap)

(5)

Consider a two-player multi-stage game in extensive form like the one in Figure (a) and suppose that we use the following notation:

For each stage,

1. x_k denotes the node at which the game enters the kth stage,

2. u_k denotes the action of player P₁ at the kth stage,

3. d_k denotes the action of player P₂ at the kth stage.

Dynamic games in discrete time

{1, 2, , K}

k  

(6)

In this case, the overall tree structure can be mathematically described by equations of the form:

(1)

Equation (1) describes the tree itself, but not the outcomes or the information sets.

Evolution of Decision Process: dynamic state evolution with cost

Dynamic games in discrete time

    

1 2

1

" " ' '

1

( , , )

k k k k k

entry node dynamics at entry node P s action P s action at at stage k stage k at stage k at stage k stage k

x _ f x u d





(x , u ,d )

i

k k k

J

(7)

This type of description actually allows for games that are more general that those that are typically described in extensive form.

For example,

 games described by graphs that are not trees, such as the one in Figure (b);

games with infinitely many stages ( );

games with action spaces that are not

finite sets.

Notation. A tree is (connected) graph that has no cycles.

Dynamic games in discrete time

k  

(8)

Games whose evolution is represented by an equation such as (1) are called dynamic games and the equation (1) is often called the

dynamics of the game. The set X where the state x_ktakes values is called the state-space of the game.

The outcome J_i for a particular player P_i, in a multi-stage game in extensive form like the one in Figure (a) is a function of the state of the game at the last stage K and the actions taken by the players at this stage:

However, when the game is described by a graph that is not a tree, one may have different outcomes depending on how one got to the end of the game. In this case, the outcome J_i may depend on all the decisions made by both players from the start of the game:

Dynamic games in discrete time

1 1 1

(x , u ,d , , u ,d )

i

k k

J 

(x , u ,d )

i

k k k

J

(9)

A dynamic game is said to have a stage additive cost when the outcome J_i to be minimized can be written as

when all g_k are equal to zero, except for the last one g_k, the game is said to have a terminal cost.

Often one regards the stage index k as time and calls these

discrete-time dynamic games, in contrast with differential games that take place in continuous times

Dynamic games in discrete time

1

(x , u ,d )

K i

k k k k

k

g





(10)

 We need to model the process:

 Let’s loop it around to get:

Loop model

(11)

Basic elements of the loop model:

Set of Players:

Stages:

State Space:

Action Space:

Loop model

, , : [1, , ]

P ii  N N   N , : [1, , ]

k K K   K

k 1

x  X for k  K  K 

i i , i i

k k k k

u U u^ U ^

1 1 1

: , ,

i i i N

k k k k k

U ^ U U ^ U ^ U k K i N

(12)

Information in the Loop Model:

Information Set: The information available to each player is

Information Space:

Loop model

1 1

1 1 1 1

i k N N

k k k

N  X U U _ U U _

1 1

1 1 1 1 1

{ , , ; , , ; ; , , } , i N

i N N

k x x uk uk u uk

    _   _ 

i Ni

k k

 

(13)

Functionals in the loop model:

State Equation: Describes the state evolution

Strategies: P_i uses strategy

Cost Functional: The real-valued cost for P_i is given by

Loop model

1 1

1 1 1

( , , , ; ; , , , )

i N N

k K K

L x u  u  x u  u

1 ( , ^N, , ^N ) , ,

k k k k k k

x _  f x u  u k K x  X { ,....,1 },

i i i i i

K for

       ( );

i i i i i

k k k k k

u     

(14)

Multi-agent Formation Game: N-robots, with first order dynamics

begin from an initial position

Each robot tries to get to a position , such that the position corresponds to one where the N robots are evenly spaced1m apart, over K time instants.

Example of a Nonzero sum Game

i

x D 1

i i i

k k k

x _  x  h u

1

1 [ 1, , 1^N ] x  x  x [ 1 , , ^N ]

D D D

x  x  x

(15)

Multi-agent Formation Game using the Loop Model:

 Basic Elements:

 N players , K stages

State Space:

Action Space

Information: Suppose that each robot has a perfect measurement of its state, and transmits this information to all other robots

instantaneously. Then

Missing actions?

No

Example of a Nonzero sum Game

[ 1,1], [ 1,1] 1

i i N

k k

u   u^   ^

; [ 1 , , ]

i N N

k k k k

x R x  x  x R

{ ,1 , } , i N

i

k x xk

   

(16)

Multi-agent Formation Game using the Loop Model:

 Functionals:

 State Equation:

Strategies: In general, . One example could be

Cost Function:

Example of a Nonzero sum Game

( )

i i i

k k k

u   

1

1 1

( ) ( ) ( ) ( ) ( )

K K

i i i T i i i T i

k D x k D k u k

k k

L x x Q x x u Q u



 





  





1

1 [ ,..., ^{N T}]

k k k k

x _  x  h u u

i ji j

k k k

j N

u k x







(17)

Zero sum games

Non-zero sum games

Infinite Games

Infinite Dynamic Games

Dynamic games in discrete time

Information structures

InfiniteDynamic Games

(18)

Dynamic games can have a wide range of information structures, but we will consider mostly some structures (such as open loop , (perfect) state feedback and etc.)

In open-loop (OL) dynamic games, players do not gain any information as the game is played (other than the current stage) and must make their decisions solely based on a-priori

information

Information Structures

(19)

In terms of an extensive form representation, each player has a single information set per stage, which contains all the nodes for that player at that stage, as in the game shown in Figure (a). For open-loop dynamic games, one typically represents policies as functions of the initial state.

Information Structures

1 ( )

1

,

( )

i k

i i OL i

k k k

x k K

u x



 

 

 

(20)

In (perfect) state-feedback (FB) games, the players know exactly the state x_k of the game at the entry of the current stage and can use this information to choose their actions at that stage.

However, they must make these decisions without knowing each others choice (i.e., we have simultaneous play at each stage).

Information Structures

(21)

In terms of an extensive form representation, at each stage of the game there is exactly one information set for each entry-point to that stage, as in the game is shown in Figure (b). For state-

feedback games, one typically represents policies as functions of the current state:

Information Structures

1 ( )

{ ,...., } , ( )

i

k k

i i FB i

k k k

x x k K

u



 

 



(22)

What if the full state is not available to the players?

Observations (Measurements):

Information Structure (Imperfect State Information):

The qualifier "perfect" is sometime used to emphasize that the players know exactly the current state x_k, in contrast to

situations in which the players only have access to estimates of x_k, possibly constructed from noisy measurements.

Information Structures

i i

k k

y Y

( ) , k ,i N

i i

k k k

y  h x K 

1 1

{ ,...., ; ; ,...., ;

, , ; ; , , }

i N N

k k k

N N

k k

y y y y

u u u u

  

  

(23)

The robots use a low-quality GPS sensor to measure their positions:

Observations (Measurements):

Information Pattern: Now, the robots transmit the measurements instantly to all other robots. Thus

Missing Actions?

No (under suitable assumptions)

Information Structures (Example)

, k ,i N

i i i

k k k

y  C x K 

1 1

{ ,...., ; ; ,...., }

i N N

k y yk y yk i N

   

(24)

Forgetful Players: Players forget some of the stored

states/measurements/actions. For example: Memoryless players use only the current values of the state/observation to determine their actions.

Information Set for Forgetful Players: In general -

Missing Actions? Possibly! Can be compensated by transmitting actions to other players as well.

Example:

Information Structures (Example)

1 1

{ ,...., ; ; ,...., }

i N N

k y yk y yk

  

{ ,1 } ,

i i i

k y yk k K

  

(25)

Delayed Information: The measured state/observation (or transmitted actions) are delayed by time steps.

Information Set for Delayed Information: In general -

Missing Actions? Possibly!

Example:

Information Structures (Example)

1 1

{ ,...., ; ; ,...., } , 1 1

i N N

k y yk _ y yk _ k

  _  _   

1 1

{ ,...., ; ; ,...., }

i N N

k y yk _ y yk _

  _  _

(26)

A few typical information structures (IS):

Open Loop IS:

Closed-Loop Perfect State IS:

Closed-Loop Imperfect State IS:

Memoryless IS:

Feedback IS:

One step delayed IS:

Examples of Information Structures

1 ,

i

k x k K

  

{ ,....,1 } ,

i

k x x k k K

  

{ 1 ,...., } ,

i i i

k y y k k K

  

1 1

{ , } , { , } ,

i i i i

k x x k OR k y y k k K

    

, ,

i i i

k x k OR k y k k K

    

1 1 1 1

{ , , } , { , , } ,

i i i i

k x x k OR k y y k k K

   _    _ 