Game Theory (Economic Theory), Lecture 3: Nash equilibrium

which is exactly what we want.

In any case, to further simplify things, we introduce the following shorthand notation for best replies to beliefs concentrated on a single action profile:

Definition 1 Fix a game $G = (N, (A_i, T_i, u_i)_{i \in N})$. For every $i \in N$, Player $i$'s Nash best-reply correspondence $\rho_i : A_{-i} \rightrightarrows A_i$ is defined by

$$\rho_i(a_{-i}) = r_i(\delta_{a_{-i}}) \quad \forall a_{-i} \in A_{-i}.$$

That is, $\rho_i(a_{-i})$ is the set of Player $i$'s best replies to beliefs concentrated on $a_{-i}$. Note that this corresponds to OR's "$B_i(\cdot)$"; I wish to reserve the letter $B_i$ for something else.

We are ready for the definition of Nash equilibrium:

Definition 2 Fix a game $G = (N, (A_i, T_i, u_i)_{i \in N})$. The profile of actions $(a_i)_{i \in N}$ is a Nash equilibrium of $G$ iff $a_i \in \rho_i(a_{-i})$ for all $i \in N$.

The literal interpretation is clear: in a Nash equilibrium, the action of each player is a best reply to the (belief concentrated on the) actions of her opponents.

Existence

Note that the above definition applies to both finite and infinite games. Moreover, as stated, it does not guarantee existence in finite games. For example, Matching Pennies (see Lecture 1) does not have an equilibrium according to the above definition.

At this juncture, the theory treats finite and infinite games differently. This might seem odd, and to some extent it so appears to this writer, too. However, the theory has developed as follows: for finite action sets, one employs a "trick" which (in some sense) guarantees existence in arbitrary games; for infinite games, theorists have developed conditions on the action spaces and payoff functions under which Nash equilibria exist. There is no (known) "trick" which guarantees existence in arbitrary infinite games.

I shall focus on existence (with the "trick") in finite games, although this requires a brief detour on infinite games and some ancillary notions. We shall only need to consider very special, well-behaved infinite games, and I will provide details only for those; however, I will indicate how the basic ideas and results extend to more general settings.

Upper Hemicontinuity

Let us take a step back. Recall that a function $f : X \to Y$, where $(X, T_X)$ and $(Y, T_Y)$ are topological spaces, is continuous iff $f^{-1}(V) = \{x : f(x) \in V\} \in T_X$ whenever $V \in T_Y$.

However, extending this definition to correspondences is problematic, because the meaning of "$f^{-1}(V)$" is ambiguous. The following definition presents two alternatives.

Definition 3 Let $(X, T_X)$ and $(Y, T_Y)$ be topological spaces, and consider a correspondence $f : X \rightrightarrows Y$. For any $V \subset Y$, the upper inverse of $f$ at $V$ is $f^u(V) = \{x : f(x) \subset V\}$; the lower inverse of $f$ at $V$ is $f^\ell(V) = \{x : f(x) \cap V \neq \emptyset\}$.

Clearly, $f^u(V) \subset f^\ell(V)$, and the two definitions coincide for singleton-valued correspondences, i.e. for functions. Each notion gives rise to a corresponding definition of continuity:

Definition 4 Let $(X, T_X)$ and $(Y, T_Y)$ be topological spaces, and consider a correspondence $f : X \rightrightarrows Y$. Then $f$ is upper hemicontinuous (uhc) iff, for every $V \in T_Y$, $f^u(V) \in T_X$; $f$ is lower hemicontinuous (lhc) iff, for every $V \in T_Y$, $f^\ell(V) \in T_X$; $f$ is continuous iff it is both uhc and lhc.

Definition 4 highlights the connection with the notion of continuity for functions, but is somewhat hard to apply. However, the following characterization is useful:[1]

Theorem 0.1 Let $(X, T_X)$ be metrizable, and $(Y, T_Y)$ be compact and metrizable. Then a correspondence $f : X \rightrightarrows Y$ is:

(1) upper hemicontinuous iff, for every pair of convergent sequences $\{x^n\}_{n \geq 0} \to x$ in $X$ and $\{y^n\}_{n \geq 0} \to y$ in $Y$ such that $y^n \in f(x^n)$, $y \in f(x)$;

(2) lower hemicontinuous iff, for any sequence $\{x^n\}_{n \geq 0} \to x$ in $X$, and for every $y \in f(x)$, there exists a subsequence $\{x^{n_k}\}_{k \geq 0}$ in $X$ and a sequence $\{y^{n_k}\}_{k \geq 0}$ in $Y$ such that $y^{n_k} \in f(x^{n_k})$ for all $k \geq 0$, and $y^{n_k} \to y$.

For infinite games, under our assumptions, the Nash best-reply correspondence is upper hemicontinuous as a consequence of the Maximum Theorem. However, in order to define Nash equilibrium for finite games (with the "trick" I mentioned above), a direct proof is easy to provide.

Existence of Nash Equilibrium

To establish the desired result, we need two ingredients: a "Big Mathematical Hammer," the Kakutani-Fan-Glicksberg Fixed-Point Theorem, and a trick. Let us start with the former.[2]

[1] For the more mathematically inclined, the domain of the correspondence in the next theorem need only be first-countable.

[2] For the more mathematically inclined, the theorem actually applies to locally convex Hausdorff topological vector spaces, such as e.g. the set of real-valued functions on a nonempty set $X$ (endowed with the product topology).

Theorem 0.2 Let $K$ be a nonempty, compact and convex subset of Euclidean space, and $f : K \rightrightarrows K$ be a correspondence. If $f$ is nonempty- and convex-valued, and upper hemicontinuous, then the set of its fixed points is nonempty and compact.

The trick is contained in the following definition:

Definition 5 Fix a finite game $G = (N, (A_i, u_i)_{i \in N})$. The mixed extension of $G$ is the game $\Gamma = (N, (\Delta(A_i), U_i)_{i \in N})$, where, for any collection $(\alpha_i)_{i \in N} \in \prod_{i \in N} \Delta(A_i)$,

$$U_i(\alpha_1, \ldots, \alpha_N) = \sum_{(a_j)_{j \in N} \in A} \left[ \prod_{j \in N} \alpha_j(a_j) \right] u_i(a_1, \ldots, a_N).$$

For every $i \in N$, denote by $\rho^\Gamma_i$ Player $i$'s Nash best-reply correspondence.

The idea is to consider the modified game in which players have the physical possibility to randomize among their actions, in a stochastically independent fashion. I shall return to the interpretation of this assumption momentarily; formally, note that the set $K = \prod_{i \in N} \Delta(A_i)$, as a product of simplices, is a nonempty, compact, convex subset of Euclidean space, as required by Theorem 0.2. Moreover, define a correspondence $f : K \rightrightarrows K$ by

$$f(\alpha_1, \ldots, \alpha_N) = \{(\alpha'_i)_{i \in N} : \forall i \in N,\ \alpha'_i \in \rho^\Gamma_i(\alpha_{-i})\}.$$

It is easy to see that $f(\cdot)$ is nonempty, convex-valued and uhc if each $\rho^\Gamma_i$ is. So, we need:

Proposition 0.3 For any finite game $G$, the Nash best-reply correspondence of its mixed extension $\Gamma$ is nonempty, convex-valued and upper hemicontinuous.

Proof: Nonemptiness follows because the best-reply correspondence of the original finite game is nonempty (and $\Delta(A_i)$ contains all degenerate mixed strategies).

Now note that $\alpha_i \in \rho^\Gamma_i(\alpha_{-i})$ iff $\alpha_i$ assigns positive probability only to actions that are best replies against the belief $\prod_{j \neq i} \alpha_j$ (why?). This immediately implies convexity.

To prove uhc, note that we can use the characterization in Theorem 0.1. Consider two convergent sequences $(\alpha^n_{-i})_{n \geq 1} \to \alpha_{-i}$ and $(\alpha^n_i)_{n \geq 1} \to \alpha_i$ such that, for every $n \geq 1$, $\alpha^n_i \in \rho^\Gamma_i(\alpha^n_{-i})$. This means that, for every $\alpha'_i \in \Delta(A_i)$,

$$U_i(\alpha^n_i, \alpha^n_{-i}) \geq U_i(\alpha'_i, \alpha^n_{-i}).$$

Now note that, by definition, the function $U_i(\cdot, \cdot)$ is jointly continuous in its arguments. Thus, as $n \to \infty$, $U_i(\alpha^n_i, \alpha^n_{-i}) \to U_i(\alpha_i, \alpha_{-i})$ and $U_i(\alpha'_i, \alpha^n_{-i}) \to U_i(\alpha'_i, \alpha_{-i})$. Since weak inequalities are preserved in the limit, $U_i(\alpha_i, \alpha_{-i}) \geq U_i(\alpha'_i, \alpha_{-i})$ for every $\alpha'_i \in \Delta(A_i)$, i.e. $\alpha_i \in \rho^\Gamma_i(\alpha_{-i})$. The result now follows.
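To make Definition 5 and the best-reply characterization in the proof concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names (`mixed_payoff`, `degenerate`, `is_mixed_equilibrium`), the dictionary encoding of payoffs, and the Matching Pennies payoff convention (Player 1 wins on a match) are my own choices, not part of the notes. It computes $U_i$ by summing over pure profiles and checks equilibrium of the mixed extension by testing deviations to degenerate mixed actions only, which suffices because $U_i$ is linear in $\alpha_i$.

```python
from itertools import product


def mixed_payoff(u, alphas):
    """U_i(alpha): sum over pure profiles a of [prod_j alpha_j(a_j)] * u_i(a)."""
    total = 0.0
    for profile in product(*(range(len(alpha)) for alpha in alphas)):
        prob = 1.0
        for j, a_j in enumerate(profile):
            prob *= alphas[j][a_j]
        total += prob * u[profile]
    return total


def degenerate(k, n):
    """Mixed action over n pure actions that puts probability one on action k."""
    return [1.0 if a == k else 0.0 for a in range(n)]


def is_mixed_equilibrium(payoffs, alphas, tol=1e-9):
    """True iff no player gains by deviating to any pure action.

    U_i is linear in alpha_i, so checking pure deviations is enough.
    """
    for i, u in enumerate(payoffs):
        current = mixed_payoff(u, alphas)
        for k in range(len(alphas[i])):
            deviation = list(alphas)
            deviation[i] = degenerate(k, len(alphas[i]))
            if mixed_payoff(u, deviation) > current + tol:
                return False
    return True


# Matching Pennies (assumed convention): actions 0 = H, 1 = T;
# Player 1 wins on a match, Player 2 wins on a mismatch.
u1 = {(a1, a2): (1 if a1 == a2 else -1) for a1 in range(2) for a2 in range(2)}
u2 = {profile: -value for profile, value in u1.items()}

uniform = [[0.5, 0.5], [0.5, 0.5]]
both_heads = [[1.0, 0.0], [1.0, 0.0]]

print(is_mixed_equilibrium([u1, u2], uniform))     # True
print(is_mixed_equilibrium([u1, u2], both_heads))  # False
```

Under these assumptions, uniform mixing passes the check while the pure profile (H, H) does not, which is consistent with the earlier remark that Matching Pennies has no equilibrium in pure actions.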

         L        R
  T    1, 1     0, 0
  B    1, 0    -1, 2

Figure 1: Correct beliefs

This game has a unique Nash equilibrium, (T,L). Suppose that we assume that: (1) Players are rational; (2.1) Player 1 expects Player 2 to choose L; and (2.2) Player 2 expects Player 1 to choose T. Are these assumptions sufficient to conclude that Player 1 will choose T and Player 2 will choose L? The answer is no, because a rational Player 1 might equally well choose B if she thinks that Player 2 will choose L. Thus, "correct beliefs" must imply something more than this. We need to consider the following assumption:

(2.1') Player 1 expects Player 2 to play whatever action he actually chooses;

(2.2') Player 2 expects Player 1 to play whatever action she actually chooses.

This will be crystal-clear when we develop a model where assumptions about beliefs can be formalized. However, the basic idea should still be easy to grasp: what is needed is a restriction that relates beliefs and actual behavior. The problem with this assumption is that, of course, it does not go much beyond the definition of Nash equilibrium!

Yet, it delivers the required result, for although (B,L) is consistent with assumptions (1), (2.1) and (2.2), it fails (2.2'): for Player 2 to choose L, it must be the case that he expects Player 1 to choose T with high enough probability, but in fact Player 1 chooses B with probability one. Again, these considerations will be clearer once we develop a model of interactive beliefs.

By way of comparison, recall that the assumptions we used to justify (correlated) rationalizability (following the eductive approach) were of the following form: (1) Players are rational; (2) Players believe that their opponents are rational; (3) Players believe that their opponents believe that their own opponents are rational; and so on. Note that assumptions (2), (3), ... do not involve specific actions (as our assumptions (2.1) and (2.2) above do), and do not in any way refer to what players may actually do in a given "state of the world." Thus, these assumptions are truly decision-theoretical in nature; those required to motivate Nash equilibrium are somewhat of a hybrid (indeed, fixpoint) nature.

One could correctly argue, however, that the above point is only valid for equilibria in which players have multiple best replies to their equilibrium beliefs; that is, for non-strict equilibria in game-theoretic parlance. One could then observe that in "generic" games (i.e. games with payoffs in general position) all pure-action equilibria are strict.

This is a valid point. However, note that even generic games need not have pure-action equilibria (generic variants of Matching Pennies prove this point). And, even taking the mixed extension of a finite game seriously, no mixed-action equilibrium can be strict (why?). Hence, for games which admit equilibria only in their mixed representation, one really needs the full force of the non-decision-theoretic "correct beliefs" assumption.

The learning approach, by and large, takes the point of view that players know how to best-respond to a belief, but it postulates that beliefs are based on the players' experience from past strategic interactions.

The simplest such model is that of fictitious play. Each player $i \in N$ is endowed with a "weighting function" $w_i : A_{-i} \to \{0, 1, 2, \ldots\}$ which counts how many times a certain action profile was observed. Before playing the game, some arbitrary weights are assigned, and upon completing each play of the game, if $a_{-i} \in A_{-i}$ was observed, $w_i(a_{-i})$ is increased by 1. Finally, at each stage players best-respond to the belief $\alpha^w_{-i}$ defined by

$$\alpha^w_{-i}(a_{-i}) = \frac{w_i(a_{-i})}{\sum_{a'_{-i} \in A_{-i}} w_i(a'_{-i})},$$

i.e. they play some action in $r_i(\alpha^w_{-i})$.

The attractiveness of this approach lies in the intuitive notion that perhaps learning might be the reason why beliefs are correct. That is, intuitively players might learn how to "coordinate" on some Nash equilibrium.

Indeed, several results relate the steady states of this process, or the long-run frequencies, i.e. the long-run $\alpha^w_{-i}$'s, to Nash equilibrium and other solution concepts. For instance, it can be shown that, in two-player games, if a steady state exists, it must be a strict Nash equilibrium (this is not hard to prove). Also, if the long-run frequencies converge, they represent equilibria of the mixed extension of the game under consideration.

The above paragraphs are simply meant to convey the flavor of the learning approach: I am certainly not doing any justice to the vast and insightful literature on the subject.[3] Scant as they are, however, the above remarks do illustrate at least one difficulty in applying the ideas from the learning literature to justify Nash equilibrium: convergence to a steady state, or convergence of the long-run frequencies, is not guaranteed at all: it only occurs under appropriate conditions. Indeed, most of these conditions are either very restrictive, or sufficient to enable a fully decision-theoretic eductive analysis, without invoking the full strength of the "correct beliefs" assumption.

The recent learning literature uses this approach not to justify existing solution concepts, but to provide a foundation for new, interesting ones (e.g. Fudenberg and Levine's self-confirming equilibrium) which typically involve departures from "correct beliefs."

[3] Fudenberg and Levine's book is an excellent reference, if you are interested.
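The fictitious-play procedure described above can be sketched in a few lines of Python for the two-player case. This is a minimal illustration, not a reference implementation: the names (`belief`, `best_reply`, `fictitious_play`), the nested-list payoff encoding, the initial weights of 1 on every action, the rule that ties go to the lowest-indexed action, and the Matching Pennies payoff convention are all assumptions of mine; the notes only require that players pick some best reply and that initial weights be arbitrary.

```python
def belief(weights):
    """Normalize the counts w_i(a_{-i}) into the belief alpha^w_{-i}."""
    total = sum(weights)
    return [w / total for w in weights]


def best_reply(expected_payoffs):
    """Index of a best reply; ties go to the lowest index (an assumption)."""
    return max(range(len(expected_payoffs)), key=lambda k: expected_payoffs[k])


def fictitious_play(u1, u2, w1, w2, rounds):
    """Two-player fictitious play.

    u1[a1][a2], u2[a1][a2]: payoffs of Players 1 and 2.
    w1: Player 1's counts of Player 2's actions; w2: Player 2's counts of
    Player 1's actions. Returns the final counts and the profiles played.
    """
    w1, w2 = list(w1), list(w2)
    history = []
    for _ in range(rounds):
        b1, b2 = belief(w1), belief(w2)
        # Expected payoff of each own action against the current belief.
        ev1 = [sum(b1[a2] * u1[a1][a2] for a2 in range(len(b1)))
               for a1 in range(len(u1))]
        ev2 = [sum(b2[a1] * u2[a1][a2] for a1 in range(len(b2)))
               for a2 in range(len(u2[0]))]
        a1, a2 = best_reply(ev1), best_reply(ev2)
        w1[a2] += 1  # Player 1 observed Player 2's action
        w2[a1] += 1  # Player 2 observed Player 1's action
        history.append((a1, a2))
    return w1, w2, history


# Matching Pennies (assumed convention): 0 = H, 1 = T; Player 1 wins on a match.
mp_u1 = [[1, -1], [-1, 1]]
mp_u2 = [[-1, 1], [1, -1]]

final_w1, final_w2, history = fictitious_play(mp_u1, mp_u2, [1, 1], [1, 1], 5000)
print(belief(final_w1), belief(final_w2))
```

In this run play never settles on a single profile, but the empirical beliefs should drift toward (1/2, 1/2), the equilibrium of the mixed extension. That matches the remark that long-run frequencies, when they converge, represent equilibria of the mixed extension; it does not show that convergence holds in general.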

Thus, it seems that, if we wish to invoke the notion of Nash equilibrium to draw behavioral predictions, i.e. to say what players will in fact do, we are forced to accept the full force of the "correct beliefs" assumption. As I have observed, this assumption might be unpalatable, and does not add much to our understanding of the formal definition.

Mixed actions

I will be more concise with respect to the possibility of randomization. There is no doubt that mixing was introduced merely as a matter of technical convenience.

This is not to deny that, in many situations (e.g. the game of Rock, Scissors, Paper, or my own personal favorite, Matching Pennies), randomization is an appealing option.

Rather, this is to say that, when we represent Matching Pennies as we conventionally do (each player has two actions, H and T), we are really writing down an incomplete representation of the actual situation. Perhaps I would not go so far as to say that players can randomize with any probability between H and T, but I would not feel bad about assuming that, for instance, they can flip a coin and choose an action based on the outcome of the coin toss.[4]

Thus, if we feel that randomization is an important strategic option, then perhaps it might be more appropriate to model it explicitly.

An alternative exists, however. We can interpret mixed strategies as beliefs held by the players about each other. Under this interpretation, to say that $(\alpha_1, \alpha_2)$ is a Nash equilibrium is to say that $\alpha_1$, viewed as a belief held by Player 2 about Player 1, is a belief which satisfies two consistency conditions:

(1) Player 2 believes that Player 1 is rational;

(2) Player 2 believes that Player 1's beliefs are given by $\alpha_2$.

Similar considerations hold for $\alpha_2$, viewed as a belief held by Player 1 about Player 2.

Again, the details will only be clear once we have a full-blown model of interactive beliefs (note that (2) is not an assumption we know how to formulate yet); however, observe that (1) and (2) are statements that have to do with beliefs, not with behavior. They are, so to speak, purely decision-theoretic statements, although they have no direct behavioral implication, and adding the assumption that players are indeed rational may lead to the difficulties highlighted above with reference to the game in Figure 1.

[4] In practice, a player may physically generate a sequence of coin tosses prior to playing the game, then memorize it and use it to choose an action each time she has to play.
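As a mechanical check on the two examples used in these notes, the following Python sketch applies Definition 2 directly to two-player games in pure actions and also tests whether an equilibrium is strict. The function names (`pure_nash_equilibria`, `is_strict`), the nested-list encoding, and the Matching Pennies payoff convention are assumptions made for illustration; the Figure 1 payoffs are those given in the figure, with rows T, B indexed 0, 1 and columns L, R indexed 0, 1.

```python
def pure_nash_equilibria(u1, u2):
    """All pure profiles (a1, a2) in which each action is a best reply to the other."""
    rows, cols = range(len(u1)), range(len(u1[0]))
    equilibria = []
    for a1 in rows:
        for a2 in cols:
            best_for_1 = all(u1[a1][a2] >= u1[b1][a2] for b1 in rows)
            best_for_2 = all(u2[a1][a2] >= u2[a1][b2] for b2 in cols)
            if best_for_1 and best_for_2:
                equilibria.append((a1, a2))
    return equilibria


def is_strict(u1, u2, a1, a2):
    """True iff each player's action is the unique best reply to the other's."""
    rows, cols = range(len(u1)), range(len(u1[0]))
    strict_for_1 = all(u1[a1][a2] > u1[b1][a2] for b1 in rows if b1 != a1)
    strict_for_2 = all(u2[a1][a2] > u2[a1][b2] for b2 in cols if b2 != a2)
    return strict_for_1 and strict_for_2


# Figure 1: rows T, B for Player 1; columns L, R for Player 2.
fig1_u1 = [[1, 0], [1, -1]]
fig1_u2 = [[1, 0], [0, 2]]

# Matching Pennies (assumed convention): Player 1 wins on a match.
mp_u1 = [[1, -1], [-1, 1]]
mp_u2 = [[-1, 1], [1, -1]]

print(pure_nash_equilibria(fig1_u1, fig1_u2))  # [(0, 0)], i.e. (T, L)
print(is_strict(fig1_u1, fig1_u2, 0, 0))       # False: Player 1 is indifferent between T and B
print(pure_nash_equilibria(mp_u1, mp_u2))      # []
```

The output agrees with the text: (T, L) is the only pure-action equilibrium of the Figure 1 game and it is not strict, which is exactly what makes the "correct beliefs" discussion bite, while Matching Pennies has no pure-action equilibrium at all.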