Eco514—Game Theory

1. Game Theory = Multiperson Decision Theory; Zero-Sum Games

Marciano Siniscalchi
September 16, 1999

Administrative Stuff

Class: Tue-Thu 10:40-12:10, Room 317, Bendheim. OH by appointment.

The Big Picture

Most of you will already have used some of the tools of GT in your core courses. You will probably be familiar with the notions of simultaneous vs. extensive-form game, perfect vs. imperfect information, complete vs. incomplete information, Nash equilibrium, Perfect Bayesian equilibrium, Sequential equilibrium, etc. (but don’t worry if you are not).

This course has two main objectives: (1) introduce additional, perhaps more sophisticated analytical tools, which you will most likely find useful in your own work; (2) enhance your understanding of the main ideas (as opposed to the main tools) of GT, which includes (among other things) a critical appraisal of the applicability of the techniques you have learned.

Under point (1), I will discuss, among other issues: sophisticated models of incomplete-information games which, for instance, allow one to think clearly about currency speculation, bank runs and similar phenomena having to do with “market sentiment”; forward induction, a powerful idea which pops up in many applied models, from political economy to industrial organization; non-equilibrium solution concepts, which we may view as a way to test the robustness of strategic predictions to deviations from “textbook” settings; reputation phenomena, which may shed some light on, say, a certain software behemoth’s undue response to new entrants; and so on.
However, in my mind, (2) is much more important than (1). The reason is that, in my perhaps-not-so-humble opinion, GT is best viewed not as a “bag of tricks”, but as a methodology—a select collection of internally consistent, general principles which helps you structure your own reasoning about strategic interaction in a given setting.

Games as Multiperson Decision Problems

Decision theory enjoys an enviable status in economics. It is often seen as the prototypical example of a rigorous, well-founded theory, which is easy to apply to the analysis of a wide variety of relevant decision situations in the social sciences: from Econ 101-type models to GE and finance.

The link with GT might not be apparent. Indeed, it has been somewhat downplayed in the past, because certain conclusions one can draw from strategic reasoning seem to be so much at odds with decision theory that the very existence of such a link might appear questionable (think about: “more information can hurt you”). However, the link is definitely out there, and recent research has brought it to the fore. What might be surprising is that the link was quite evident in the early days of GT, when people were mostly interested in zero-sum games. I shall presently state my case—and, in the process, review a related mathematical tool whose importance I simply cannot overemphasize: linear programming.

Making the connection

First, I need to define a decision problem.

Definition 1 A decision problem is a tuple {Ω, C, F}, where Ω is a set of possible states of the world, C is a set of consequences, and F ⊂ C^Ω is a set of acts.

The idea is that, if the decision-maker (DM) chooses act f ∈ F, and the prevailing state is ω ∈ Ω, then the consequence f(ω) ∈ C obtains. Consequences are supposed to represent all the DM cares about. The DM does not observe the prevailing state of the world: that is, there is uncertainty about it.
Still, she might be assumed to exhibit preferences (represented by a binary relation ≽ ⊂ F × F) among available acts. The classical “representation” result exhibits a set of joint assumptions about ≽ (and F) which are equivalent to the existence of a probability measure µ ∈ ∆(Ω) and a utility function u : C → R (unique up to a positive affine transformation) which represent ≽:

∀f, g ∈ F:  f ≽ g  ⇔  ∫_Ω u(f(ω)) µ(dω) ≥ ∫_Ω u(g(ω)) µ(dω)
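To fix ideas, here is a minimal numerical sketch of the representation (the states, acts, belief and utility below are all hypothetical, chosen only for illustration): the DM ranks two acts by comparing their expected utilities under her subjective belief µ.

```python
# Hypothetical illustration of the SEU representation: a finite state space,
# acts mapping states to monetary consequences, a subjective belief mu, and a
# utility u; f is weakly preferred to g iff EU(f) >= EU(g).

from math import sqrt

omega = ["rain", "sun"]
mu = {"rain": 0.3, "sun": 0.7}      # subjective belief mu in Delta(Omega)

def u(c):
    return sqrt(c)                   # illustrative (risk-averse) utility

f = {"rain": 100, "sun": 0}          # acts: state -> consequence
g = {"rain": 25, "sun": 25}          # a constant (riskless) act

def expected_utility(act):
    return sum(mu[w] * u(act[w]) for w in omega)

# Here EU(f) = 0.3 * 10 = 3.0 and EU(g) = 5.0, so the DM prefers g.
print(expected_utility(f), expected_utility(g))
```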
Next, let us define simultaneous games.

Definition 2 A finite simultaneous game is a tuple {N, (Ai, ui)i∈N}, where N is a finite set of players, and for each player i ∈ N, Ai is a finite set of actions and ui is a payoff function ui : Ai × A−i → R.

(I use the conventional notation A−i = ∏_{j≠i} Aj and A = ∏_{i∈N} Ai.) We could be more general and assume that each player is endowed with a preference relation over action profiles, but let’s not worry about this right now.

The above is the “standard” definition of a game. As you may see, it is not quite consistent with the definition of a decision problem, so we need to reinterpret things a bit. Indeed, the most important difference is that a game implicitly defines not just one, but N decision problems (with a conventional abuse of notation, I denote by N both the set of players and its cardinality |N|).

Let’s take the point of view of Player i. She has no control over her opponents’ choices, and she will not find out what they were until the game is over. Thus, A−i seems a good candidate for the state space: Ω = A−i.

Next, utilities are attached to points (ai, a−i) ∈ A according to Definition 2. This amounts to saying that there is a one-to-one correspondence between the actual outcome of the game (which is what players care about) and action profiles. Thus, we may as well imagine that action profiles are the relevant consequences: C = A.

By a process of elimination, acts must be actions: “F = Ai”. Formally, action ai ∈ Ai defines an act f_{ai} : A−i → Ai × A−i (i.e. f_{ai} : Ω → C) by f_{ai}(a−i) = (ai, a−i).

However, the definition of a game incorporates additional information: specifically, it includes a utility function for each player. This implies that we have a complete description of players’ preferences among consequences. However, in view of the representation results cited above, this is only one part of the description of players’ preferences among acts, because the probabilities are missing.
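The reinterpretation above can be sketched in code for a hypothetical 2×2 game (the action labels and payoff numbers are invented for illustration): from Player 1’s viewpoint, Ω = A2, C = A1 × A2, each action induces an act, and ranking actions requires supplying the “missing” belief µ ∈ ∆(A2).

```python
# A game as a decision problem for Player 1 (hypothetical 2x2 payoffs):
# states are opponent actions, consequences are action profiles.

A1 = ["T", "B"]
A2 = ["L", "R"]

u1 = {("T", "L"): 2, ("T", "R"): 0, ("B", "L"): 0, ("B", "R"): 1}

def act(a1):
    """The act induced by action a1: maps states (opponent actions) to profiles."""
    return {a2: (a1, a2) for a2 in A2}

def expected_payoff(a1, mu):
    # Expected utility of the act f_{a1} under belief mu over Omega = A2.
    return sum(mu[a2] * u1[act(a1)[a2]] for a2 in A2)

# The game itself does not pin down mu; here we supply one for illustration.
mu = {"L": 0.5, "R": 0.5}
print(expected_payoff("T", mu), expected_payoff("B", mu))   # T is the best reply
```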
This trivial observation is actually crucial: in some sense, the traditional role of game theory has been that of specifying those probabilities in a manner consistent with assumptions or intuitions concerning strategic rationality.

Nash Equilibrium

To make this more concrete, recall what you may have heard about Nash equilibrium: a player’s strategy doubles as her opponents’ beliefs about her strategy. Thus, in a 2-player game, a Nash equilibrium is a specification of beliefs (α1, α2) for both players, with the property that if Player i’s belief assigns positive probability to an action aj ∈ Aj (j ≠ i), then the act f_{aj} is preferred by j to any other act, given that j’s preferences are represented by the utility function uj : A → R and the
probability measure αi ∈ ∆(Ai) (note the indices!). In short, we say “aj is a best reply to αi”.

Note: at first sight, Nash equilibrium seems to say that players’ beliefs are required to be “consistent with rationality”. However, note that, since the rationality of an act can only be assessed with respect to a given (utility function and) probability measure over states of the world, the definition really says that players’ beliefs are required to be “consistent with rationality, given the equilibrium beliefs.” We shall be more specific about this point in a subsequent lecture.

Also, note that this interpretation of Nash equilibrium has no behavioral content: it is simply a restriction on beliefs. If we really wish to push this to the extreme, we can say that, if players’ beliefs satisfy the restrictions in the definition of a Nash equilibrium, then the players may choose an action that is a best reply to those beliefs. Again, we will return to this point later in the course.

Zero-Sum games and the minmax solution

Game theorists were originally interested in a much simpler class of games, namely two-player games in which interests were diametrically opposed:

Definition 3 A two-player simultaneous game {{1, 2}, (Ai, ui)i∈{1,2}} is zero-sum iff, for all a ∈ A, u1(a) + u2(a) = 0.

To put this in perspective, think about board games such as chess or checkers, or war. In the latter case, it appears that doing one’s best in the worst-case scenario might be the “right” thing to do. I am told this is what they teach you in military academy, anyway.

In a zero-sum game, the worst-case scenario for Player 1 is also the best scenario for Player 2, and vice versa. Thus, for Player 1 to assume that Player 2 is “out to get her” is not inconceivable—after all, Player 2 has all the right incentives to do so. This idea leads to the notion of maxmin and minmax strategies.
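To see what is at stake, consider worst-case reasoning with pure strategies only, in Matching Pennies (a standard example, not taken from the text above). A short computation shows the two worst-case values disagree; randomization, introduced next, is exactly what closes the gap.

```python
# Pure-strategy worst-case reasoning in Matching Pennies (standard example;
# payoffs are not from the text above). Without randomization, the
# maxmin and minmax values differ.

A = ["H", "T"]
u1 = {("H", "H"): 1, ("H", "T"): -1, ("T", "H"): -1, ("T", "T"): 1}

# maxmin: Player 1 picks a1 to maximize her payoff in the worst case over a2
maxmin = max(min(u1[(a1, a2)] for a2 in A) for a1 in A)
# minmax: Player 2 picks a2 to hold Player 1 down, anticipating 1's best reply
minmax = min(max(u1[(a1, a2)] for a1 in A) for a2 in A)

print(maxmin, minmax)   # -1 vs 1: the gap mixed strategies will close
```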
Suppose first that both players can randomize—commit to spinning roulette wheels or something like that, so as to choose strategies according to any pre-specified probability αi ∈ ∆(Ai). We will get back to this later in the course: for now, let’s just assume that this is possible; the reason why this is convenient will become clear momentarily. Again using a conventional notation, let

u1(α1, α2) := Σ_{(a1, a2) ∈ A} u1(a1, a2) α1(a1) α2(a2)

for any (α1, α2) ∈ ∆(A1) × ∆(A2).

Consider first a pair of mixed strategies α¹ᵢ ∈ ∆(Ai), i = 1, 2, such that

u1(α¹₁, α¹₂) = min_{α2 ∈ ∆(A2)} max_{α1 ∈ ∆(A1)} u1(α1, α2)
In words, u1(α¹₁, α¹₂) is the minimum payoff that Player 1 can secure; equivalently, it is the minimum payoff to Player 1 when Player 2 attempts to damage her as much as possible, while taking into account the fact that Player 1 best-responds. Note that, since max −f = −min f,

u2(α¹₁, α¹₂) = −u1(α¹₁, α¹₂) = max_{α2 ∈ ∆(A2)} min_{α1 ∈ ∆(A1)} u2(α1, α2)

i.e. (α¹₁, α¹₂) also yields Player 2 the maximum payoff he can obtain if Player 1 chooses her mixed strategy so as to hold Player 2’s payoff down. Such a pair of mixed strategies can always be found (why?).

Similarly, one can consider the symmetric problem for Player 2, and define a new pair (α²₁, α²₂) such that

u2(α²₁, α²₂) = min_{α1 ∈ ∆(A1)} max_{α2 ∈ ∆(A2)} u2(α1, α2)

so u2(α²₁, α²₂) is the minimum payoff that Player 2 can secure; as above,

u1(α²₁, α²₂) = max_{α1 ∈ ∆(A1)} min_{α2 ∈ ∆(A2)} u1(α1, α2)

is the maximum payoff that Player 1 can obtain if Player 2 attempts to hold Player 1’s payoff down.

It may be helpful to think about this as a “pseudo-dynamic” story: in order to define (α¹₁, α¹₂), one imagines that Player 2 chooses first, and Player 1 moves next; the latter best-responds to Player 2’s choice, and the former, anticipating 1’s best response, attempts to minimize Player 1’s payoff. On the other hand, to define (α²₁, α²₂), one imagines that Player 1 chooses first, and Player 2 moves next; the latter observes 1’s choice and attempts to minimize 1’s payoff, while the former, anticipating this, attempts to maximize her own payoff. [Of course there is a dual to this story, with 1’s and 2’s roles inverted.]

It is natural to ask whether Player 1’s resulting payoff under the two definitions is the same. The answer is affirmative:

Theorem 0.1 (a.k.a. “the minmax theorem”): there exists a pair (α*₁, α*₂) ∈ ∆(A1) × ∆(A2) such that

u1(α*₁, α*₂) = max_{α1 ∈ ∆(A1)} min_{α2 ∈ ∆(A2)} u1(α1, α2) = min_{α2 ∈ ∆(A2)} max_{α1 ∈ ∆(A1)} u1(α1, α2)

The easiest proof of this theorem uses Linear Programming.
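Before the proof, the equality in Theorem 0.1 can be checked numerically for Matching Pennies (again a standard example; the grid approximation is my own device, not part of the text). Since u1 is linear in each mixed strategy separately, the inner optimizations may be restricted to pure strategies.

```python
# Numerical sanity check of maxmin = minmax in Matching Pennies: the inner
# problems are solved over pure strategies (linearity), the outer ones over
# a grid of mixing probabilities. The value of this game is 0.

import numpy as np

U = np.array([[1.0, -1.0],
              [-1.0, 1.0]])          # Player 1's payoff matrix

grid = np.linspace(0.0, 1.0, 1001)   # probability on the first action

def payoff(p, q):
    a1 = np.array([p, 1 - p])
    a2 = np.array([q, 1 - q])
    return a1 @ U @ a2               # bilinear mixed extension u1(alpha1, alpha2)

# max over alpha1 of (min over pure a2), and min over alpha2 of (max over pure a1)
maxmin = max(min(payoff(p, 1.0), payoff(p, 0.0)) for p in grid)
minmax = min(max(payoff(1.0, q), payoff(0.0, q)) for q in grid)
print(maxmin, minmax)                # both equal 0, attained at p = q = 1/2
```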
Let ni be the cardinality of Ai and number its elements from 1 to ni; define the n1 × n2 matrix U by letting Ukℓ be the payoff u1 at the k-th element of A1 and the ℓ-th element of A2. Then Player 1’s (resp. 2’s) beliefs about Player 2’s (resp. 1’s) choices may be represented as nonnegative vectors α2 ∈ R^{n2} (resp. α1 ∈ R^{n1}) whose elements add up to one.
Consider the following problem:

min_{α2, u} u   s.to  −U α2 + 1u ≥ 0,   1′α2 = 1

which we can interpret as follows: Player 2 chooses α2 so that every strategy of Player 1 yields u or less; indeed, Player 2 attempts to make u as small as possible. Note that u is unconstrained in sign.

The dual program is

max_{α1, v} v   s.to  −α1′U + v1′ ≤ 0,   α1′1 = 1

i.e. Player 1 chooses α1 so that each of Player 2’s strategies yields at least v to Player 1 (think about it!), i.e. at most −v to Player 2. Again, v is unconstrained.

Note that both programs are feasible (easy); also, both programs are bounded, because, say, in the dual program, v cannot be larger than the largest entry in the matrix U (which, if you think about it, is also obvious!). But then, by the Existence-Duality theorem of LP, both programs have optimal solutions, the optimal values are the same (so u = v), and the optimal solutions are complementary. In particular, this means that α1(a1) > 0 implies u1(a1, α2) = u, and similarly for 2. Thus, only actions which achieve the maximal payoff u against α2 receive positive probability; in other words, α1 only assigns positive probabilities to best responses (and similarly for α2). Hence, the optimal solutions of the dual LPs identify a Nash equilibrium; it is then easy to see that any Nash equilibrium satisfies the equality in Theorem 0.1 (see Proposition 22.2 in OR if you don’t see this), and we are done.
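As a concrete sketch of the primal program above, here it is solved for Matching Pennies (using SciPy’s linprog, which is an assumption about available tooling, not something the text prescribes). The decision vector stacks α2 and the free scalar u; the optimal value is the value of the game, 0, attained at α2 = (1/2, 1/2).

```python
# Primal LP for the zero-sum game, solved numerically (assumes SciPy installed):
#   min u   s.t.  U alpha2 - 1 u <= 0,   1' alpha2 = 1,   alpha2 >= 0, u free.

import numpy as np
from scipy.optimize import linprog

U = np.array([[1.0, -1.0],
              [-1.0, 1.0]])                      # Matching Pennies
n1, n2 = U.shape

c = np.concatenate([np.zeros(n2), [1.0]])        # objective: minimize u
A_ub = np.hstack([U, -np.ones((n1, 1))])         # U alpha2 <= 1 u
b_ub = np.zeros(n1)
A_eq = np.concatenate([np.ones(n2), [0.0]]).reshape(1, -1)   # 1' alpha2 = 1
b_eq = np.array([1.0])
bounds = [(0, None)] * n2 + [(None, None)]       # u unconstrained in sign

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
alpha2, value = res.x[:n2], res.x[n2]
print(alpha2, value)                             # equilibrium belief and game value
```

The dual program can be set up symmetrically with the roles of the players (and the inequality direction) reversed; LP duality guarantees the two optimal values coincide.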