Games and Economic Behavior 49 (2004) 401–423 www.elsevier.com/locate/geb
An experimental study of commitment in Stackelberg games with observation costs
John Morgan a,∗, Felix Várdy b
a Haas School of Business and Department of Economics, University of California, Berkeley, USA
b International Monetary Fund, Washington, DC, USA
Received 16 October 2001 Available online 19 August 2004
Abstract

We report on experiments examining the value of commitment in Stackelberg games where the follower chooses whether to pay some cost to perfectly observe the leader’s action. Várdy [Games Econ. Behav. (2004)] shows that in the unique pure-strategy subgame perfect equilibrium of this game, the value of commitment is lost completely; however, there exists a mixed-strategy subgame perfect equilibrium where the value of commitment is fully preserved. In the data, the value of commitment is largely preserved when the cost of looking is small, while it is lost when the cost is large. Nevertheless, for small observation costs, equilibrium behavior is clearly rejected. Instead, subjects persistently play non-equilibrium strategies in which the probability of the follower choosing to observe the leader’s action is a decreasing function of the observation cost.

© 2004 Elsevier Inc. All rights reserved.

JEL classification: C72; C91; D82; D83; D84

Keywords: Stackelberg duopoly; Experiments; Observation costs; Commitment; Costly leader games
* Corresponding author at: 545 Student Services Building #1900, Berkeley, CA 94720-1900, USA.
E-mail address:
[email protected] (J. Morgan). 0899-8256/$ – see front matter 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.geb.2004.04.005
J. Morgan, F. Várdy / Games and Economic Behavior 49 (2004) 401–423
1. Introduction

One key insight of game theory is the value of commitment. A standard way of illustrating the value of commitment is by showing that in markets where firms compete in quantities, a firm can gain a strategic advantage if it can commit to its production quantity ahead of its rival. The reasoning is straightforward: Having observed the first-mover’s commitment to produce a large quantity, the best response of the rival firm is to cut its own production. This leads to a situation where the first-mover, or ‘leader’, gains market share and profit at the expense of the second-mover, or ‘follower.’

Suppose, however, that to observe the leader’s choice, the follower must undertake some investigative activity, perhaps at very low cost. Absent undertaking this activity, he remains in the dark about the leader’s action. How does this option affect the strategic choices in the Stackelberg game? And how does it affect the leader’s value of commitment?¹

To fix ideas, consider the normal form game, g:

L\F        s             c
S      (500, 200)    (300, 100)
C      (600, 300)    (400, 400)
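As a check on the matrix of g, the following minimal sketch solves the game twice: once by dominance (simultaneous moves) and once by backward induction (the follower observes the leader for free). The function and variable names are illustrative choices, not notation from the paper; payoffs are read as (leader, follower).

```python
# Payoffs of game g as (leader, follower); the leader picks the row, the follower the column.
PAYOFFS = {
    ("S", "s"): (500, 200), ("S", "c"): (300, 100),
    ("C", "s"): (600, 300), ("C", "c"): (400, 400),
}
LEADER_ACTIONS = ("S", "C")
FOLLOWER_ACTIONS = ("s", "c")


def follower_best_reply(leader_action):
    """Follower action that maximizes the follower's payoff against a known leader action."""
    return max(FOLLOWER_ACTIONS, key=lambda f: PAYOFFS[(leader_action, f)][1])


def strictly_dominant_leader_action():
    """Leader action that is strictly better than every alternative against every follower action."""
    for a in LEADER_ACTIONS:
        others = [b for b in LEADER_ACTIONS if b != a]
        if all(PAYOFFS[(a, f)][0] > PAYOFFS[(b, f)][0]
               for b in others for f in FOLLOWER_ACTIONS):
            return a
    return None


def simultaneous_outcome():
    """Dominance solution: the leader's dominant action, then the follower's reply to it."""
    a = strictly_dominant_leader_action()
    if a is None:
        raise ValueError("leader has no strictly dominant action")
    return (a, follower_best_reply(a))


def sequential_outcome():
    """Backward induction when the follower sees the leader's action at no cost."""
    best = max(LEADER_ACTIONS, key=lambda a: PAYOFFS[(a, follower_best_reply(a))][0])
    return (best, follower_best_reply(best))
```

Comparing the leader's payoff at `sequential_outcome()` with that at `simultaneous_outcome()` gives the first-mover advantage in points; the same comparison for the follower gives what that advantage costs him.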
In this game, two firms, designated L(eader) and F(ollower), are competing with one another. Here, the choices of the leader, S and C, correspond to the Stackelberg and Cournot outputs, respectively. Likewise for the choices of the follower. If firms choose their actions simultaneously, the game is dominance solvable and yields the unique rationalizable outcome (C, c). In contrast, if L moves first followed by F and L’s choice is fully observable, then the unique subgame perfect equilibrium of the game is (S, s). Thus, the power of commitment yields L an additional 100 points at the expense of 200 points lost on the part of F.

Next, consider a variation of this game that Várdy (2004) refers to as the ‘costly leader game.’ In this game L chooses first. Then F decides whether to spend an amount ε > 0 to perfectly observe L’s choice. If he does not spend ε, player F obtains no information about L’s choice. Following this, F chooses s or c and payoffs are realized. The extensive form of the costly leader game is depicted in Fig. 1.

If one restricts attention to pure-strategy subgame perfect equilibria of the costly leader game, F never pays to observe L’s choice and the outcome of the game is (C, c). In other words, the value of being a first-mover is completely undermined even for arbitrarily small costs of observing L’s choice. The intuition is that, since F fully anticipates L’s choice in a pure-strategy equilibrium, there is no point in spending anything merely to confirm these beliefs. Of course, L anticipates this behavior on F’s part and thus cannot hope to influence F’s choice through making the first move. The game essentially collapses to its simultaneous-move version and the Cournot outcome results.²

¹ In this paper, we use the terms ‘value of commitment’ and ‘first-mover advantage’ interchangeably. Both terms refer to the extra payoff the leader gets from moving first, as compared to his payoff when the players move simultaneously.
² This loss of commitment value on the part of the leader can also arise as part of a mixed-strategy equilibrium in the costly leader game, provided that the costs of observing the leader’s action are not too large. We refer to such an equilibrium as a ‘noisy Cournot equilibrium’ and describe its properties in more detail in the sequel.

Fig. 1. Extensive form of costly leader game.

When the cost of observing the leader’s action is sufficiently low, there also exists a subgame perfect equilibrium in mixed strategies in which the value of commitment of the leader is perfectly preserved. Specifically, for any ε < 50, a subgame perfect equilibrium of this game is where the leader chooses S with probability 1 − ε/100 and F observes L’s action exactly half of the time. If F has chosen not to observe L’s action, he takes action s with probability one. In this equilibrium, which Várdy refers to as the ‘noisy Stackelberg equilibrium,’ the leader’s payoff is 500, exactly the same as in the usual Stackelberg
game.³ This equilibrium suggests that the presence of an option not to observe the leader’s choice may not impair the first-mover advantage of the leader at all.

These properties are not unique to this example. Várdy analyzes a generic class of duopoly Stackelberg games and shows that the properties of pure-strategy and noisy Stackelberg equilibria hold quite generally. Thus, one is faced with equilibria arising in the costly leader game that differ dramatically in their implications as to the value of commitment. One approach to resolving these differing predictions is to apply equilibrium refinements to rule out certain of the equilibria in this game. Várdy offers some limited analysis along these lines, suggesting that it is the pure-strategy equilibrium that survives.

By contrast, the approach of this paper is empirical. We conducted controlled laboratory experiments where subjects played the game given in Fig. 1. Of particular interest is how the value of commitment varies with F’s cost of observing L’s choice. Our main treatment was to set this cost at five different levels, ε ∈ {1, 15, 30, 45, 60}, and observe how this affected outcomes.

When the cost of observing L’s choice is low (ε = 1 or ε = 15), the Stackelberg outcome occurs about 79% of the time and the leader retains more than 84% of the value of being a first-mover. In contrast, when the cost of observing L’s choice is high (ε = 45 or ε = 60), the Cournot outcome occurs about 77% of the time, and the leader retains only 20% of the value of being a first-mover. When the cost is intermediate (ε = 30), neither Cournot nor Stackelberg constitutes more than 50% of the outcomes, and the leader retains 51% of the value of being a first-mover. Thus, it appears that varying the cost of observing L’s choice significantly affects the value of commitment and, seemingly, is an important determinant of equilibrium selection.⁴

However, subject choices are not consistent with equilibrium play. Specifically, the ‘noisy’ equilibria (noisy Stackelberg and noisy Cournot) share the property that F pays to observe L’s choice half of the time independent of ε. This is inconsistent with what we observe. When the cost is low, followers choose to observe the leader’s action about 75% of the time. Further, the frequency with which followers choose to observe decreases monotonically as the observation cost increases. In short, despite finding that the value of commitment is largely preserved when the observation cost is relatively small, it does not appear that subjects are playing strategies approximating the noisy Stackelberg equilibrium to support this outcome. However, we show that a solution concept incorporating some bounded rationality on the part of subjects, namely, ‘agent quantal response equilibrium’ (McKelvey and Palfrey, 1998), does quite well in explaining subject behavior.

The remainder of the paper proceeds as follows: In Section 2, we review the relevant theoretical and empirical literature related to the robustness of commitment in duopoly. Section 3 outlines the procedures used in the experiment. Section 4 reports the results of the experiment and compares these to the theoretical predictions. In Section 5, we show that agent quantal response equilibrium might help to explain discrepancies between the theory and our results. Finally, Section 6 concludes. The instructions used in the experiment are contained in Appendix A.

³ It is interesting to note that F’s expected payoff is 200 + ε, which is higher than in the usual Stackelberg game.
⁴ More precisely, it seems to be an important determinant for all treatments except ε = 60. In this treatment, the Cournot outcome is the unique rationalizable outcome.
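The noisy Stackelberg profile described in the introduction can be checked numerically. The sketch below, with illustrative names that are not taken from the paper, computes the expected payoff of every pure strategy against the candidate profile (L plays S with probability 1 − ε/100, F observes with probability 1/2 and plays s when uninformed), using exact fractions so that indifference shows up as exact equality.

```python
from fractions import Fraction

# Payoffs of g as (leader, follower), indexed by (leader action, follower action).
PAYOFFS = {
    ("S", "s"): (500, 200), ("S", "c"): (300, 100),
    ("C", "s"): (600, 300), ("C", "c"): (400, 400),
}


def noisy_stackelberg_check(eps):
    """Expected payoff of each pure strategy against the candidate noisy Stackelberg profile."""
    eps = Fraction(eps)
    q = 1 - eps / 100      # probability that L plays S
    p = Fraction(1, 2)     # probability that F pays to observe

    def leader_payoff(action):
        informed_reply = "s" if action == "S" else "c"   # F's best reply once he has looked
        return (p * PAYOFFS[(action, informed_reply)][0]
                + (1 - p) * PAYOFFS[(action, "s")][0])   # an uninformed F plays s

    def follower_blind(reply):
        # F does not pay and plays `reply` against L's mixture
        return q * PAYOFFS[("S", reply)][1] + (1 - q) * PAYOFFS[("C", reply)][1]

    # F pays eps, then matches whatever L chose
    follower_look = q * PAYOFFS[("S", "s")][1] + (1 - q) * PAYOFFS[("C", "c")][1] - eps
    return {
        "L plays S": leader_payoff("S"),
        "L plays C": leader_payoff("C"),
        "F observes, best-responds": follower_look,
        "F does not observe, plays s": follower_blind("s"),
        "F does not observe, plays c": follower_blind("c"),
    }
```

For the treatment values below 50 the leader is indifferent between S and C, and the follower is indifferent between observing and playing s blind, which is what a mixed-strategy equilibrium requires. At ε = 60 the same profile breaks down because playing c without observing becomes strictly better for the follower.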
2. Literature review

In an important paper, Bagwell (1995) points out the fragility of the first-mover advantage in Stackelberg duopoly when the follower receives a noisy signal about the action taken by the leader. That is, in a Stackelberg duopoly model where there is a small chance that the follower receives an incorrect signal about the leader’s choice, Bagwell shows that the set of pure-strategy Nash equilibria in the ‘noisy leader game’ coincides with the set of pure-strategy equilibria in the simultaneous move Cournot version of the game. Bagwell’s point is that the value of commitment is lost entirely even when the noise in the signal received by the follower becomes arbitrarily small.

This observation led to a lively debate about the robustness of the first-mover advantage in Stackelberg duopoly. See, for instance, van Damme and Hurkens (1997), Oechssler and Schlag (2000), Güth et al. (1998), and Huck and Müller (2000). The first and last of these papers merit particular attention. The paper by van Damme and Hurkens shows that in addition to the pure-strategy equilibria identified by Bagwell, there always exists a mixed-strategy equilibrium that converges to the Stackelberg outcome when the signal noise vanishes. The paper then offers an equilibrium selection procedure that selects this mixed-strategy equilibrium over the pure-strategy equilibrium identified by Bagwell.

Huck and Müller (2000) use laboratory experiments to investigate outcomes in noisy leader games. They find that when the probability of the follower receiving an incorrect signal is small, outcomes are close to Stackelberg. They argue that the pure strategies identified by Bagwell are not behaviorally relevant. On the relevance of the noisy Stackelberg equilibrium, however, they are not able to draw firm conclusions. In part, this has to do with the relatively small size of their data set.

More important, however, is an intrinsic difficulty in differentiating sophisticated mixed-strategy equilibrium play in the noisy leader game from alternative hypotheses. For instance, one alternative might be that followers just ignore the small possibility of noise in the signal about the leader’s choice and simply best-respond to the observed signal. The only opportunity to distinguish this ‘naïve’ behavior from strategically sophisticated mixed-strategy equilibrium play arises when followers must make a choice after receiving the unexpected Cournot signal. In that case, naïve followers are predicted to play Cournot while sophisticated followers are predicted to mix. But when the probability of getting an incorrect signal is very low, this almost never happens and sophisticated noisy Stackelberg equilibrium play becomes virtually observationally equivalent to ‘noised up’ naïve play. In other words, while Huck and Müller observe that the leader’s first-mover advantage is largely preserved when the noise is small, they cannot easily tell whether this is for all the ‘right’ (equilibrium) reasons, or for all the ‘wrong’ (non-equilibrium) reasons.

This makes it difficult for Huck and Müller to establish conclusively that subjects are in fact playing the mixed-strategy equilibrium identified by van Damme and Hurkens. They write: “However, this observation [i.e., acceptance of the null hypothesis of mixed-strategy equilibrium play] is not really valid, as there are still too few observations. Hence, it is too early to draw a final conclusion regarding the claim of van Damme and Hurkens, and more testing has to be done.”

Huck and Müller (2000) represent the nearest antecedent to the present paper; thus it is useful to distinguish the contributions of our study from theirs. First, the theoretical model underlying the experiments differs in the economic reason for the potential fragility of commitment. Our paper focuses on how observation costs can undermine commitment, whereas the existing literature is concerned with how noise in the communication technology can undermine commitment.⁵ Second, in an experiment, the costly leader game readily offers a way to observationally distinguish sophisticated mixed-strategy equilibrium play from naïve non-equilibrium play. In particular, the frequency with which followers are predicted to observe the leader’s choice in a noisy Stackelberg equilibrium is constant at 50% for all ε < 50.
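The constant 50% observation frequency follows from the leader's indifference condition. A short derivation, using the payoffs of g from the introduction, writing p for the probability that F observes, and assuming (as in the noisy Stackelberg profile) that an informed F best-responds while an uninformed F plays s:

```latex
\begin{align*}
u_L(S) &= p \cdot 500 + (1 - p) \cdot 500 = 500, \\
u_L(C) &= p \cdot 400 + (1 - p) \cdot 600 = 600 - 200\,p, \\
u_L(S) = u_L(C) &\iff 600 - 200\,p = 500 \iff p = \tfrac{1}{2}.
\end{align*}
```

The observation cost ε does not enter either expression, since it is paid by F and not by L. This is why the predicted observation frequency does not move with ε, and why a monotone decline in observed frequencies is evidence against this equilibrium.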
⁵ There is a technical difference between the solution concepts underlying the analysis of costly leader games versus noisy leader games. In costly leader games, subgame perfect equilibrium remains the appropriate solution concept, as in the original Stackelberg duopoly. In noisy leader games, Nash equilibrium is the appropriate solution concept.

3. Experiment

In this section, we describe the design of the experiment and offer some justification for key design choices. We begin with a formal analysis of equilibria arising in the game we studied.

3.1. The game

We study a modified version of the standard Stackelberg game Γ in which follower F gets to observe leader L’s action before choosing s or c, if and only if he spends an amount ε on information gathering. This ‘costly leader game,’ Γ^ε, is shown in Fig. 1. Player L’s pure strategies in Γ^ε are the same as in g and Γ, i.e., S and C. Player F’s decision whether to expend ε and observe L’s choice is denoted by o = y, n. Here, we write y if F decides to observe L’s choice, and n if not. We denote F’s pure strategy of never observing L’s choice and always playing s by (n, s). The pure strategy (n, c) is defined analogously. Player F’s pure strategy of observing and best responding to L’s choice is denoted by (y, b). Finally, F’s pure strategies ‘observe and always play s,’ ‘observe and always play c,’ and ‘observe and play c upon observing S and s upon observing C’ are denoted by (y, s), (y, c) and (y, cs), respectively. Note that none of these last three strategies may be part of a subgame perfect equilibrium. The probability with which F observes L’s choice is denoted by p^ε. It is therefore equal to the sum of the probabilities that F plays (y, b), (y, s), (y, c) and (y, cs), where only the first strategy is subgame perfect.

In Γ^ε, we define the outcome of a strategy profile to be the probability distribution that this profile induces on {S, C} × {s, c}. Hence, an outcome is not concerned with whether F observes L’s action. We use this more restricted definition to preserve comparability of outcomes of Γ^ε with outcomes of g, Γ, and the noisy leader game.

Finally, the payoffs in the costly leader game Γ^ε are as follows. For each pure outcome Ss, Cc, Sc and Cs, the players’ payoffs in Γ^ε are in principle the same as in g. For player F, however, the payoff in Γ^ε also depends on whether he pays to observe L’s choice. With these payoffs we may use the results obtained by Várdy to characterize the set of subgame perfect equilibria (SPE) of Γ^ε:

• Noisy Stackelberg equilibrium: For all ε ∈ [0, 50], there exists a SPE characterized by Pr_L(S) = 1 − ε/100, Pr_F(y, b) = 1/2 and Pr_F(n, s) = 1/2. (Hence, p = 1/2.)
• Noisy Cournot equilibrium: For all ε ∈ [0, 50], there exists a SPE characterized by Pr_L(C) = 1 − ε/100, Pr_F(y, b) = 1/2 and Pr_F(n, c) = 1/2. (Again, p = 1/2.)
• Pure-strategy Cournot equilibrium: In the unique pure-strategy SPE of this game, L always plays C and F always plays (n, c), for all ε > 0. (Hence, p = 0.)
• Continuum of equilibria: In the special case where ε = 50, there exists a continuum of SPE characterized by Pr_L(S) = 1/2, Pr_F(y, b) = 1/2, and Pr_F(n, s) ∈ [0, 1/2]. (Hence, p = 1/2.)

In our laboratory implementation of this game, we let ε take on five different values: 1, 15, 30, 45, and 60. To get a sense for the relative magnitude of these observation costs, it is useful to keep in mind that the most F stands to gain from observing L’s choice is the 100 point difference between “matching” L’s choice (i.e., choosing strategy s when L chose S and c when L chose C) and not matching. Thus, one can interpret ε as the percentage of the possible gain that must be expended to observe L’s choice.

3.2. Experimental design

We conducted six sessions in total. Three sessions (Sessions 1–3) were run at Princeton University in January and February of 2001.
Subjects participating in these sessions were recruited from the undergraduate population through a number of e-mail lists. Ten subjects participated in each session and no subject appeared in more than one session. Subjects were seated in the same room at separate computer terminals and given a written set of instructions, which were read aloud by the experimenter. Direct communication between the subjects was strictly forbidden and, to the best of our knowledge, did not occur. No subject had any previous experience with the experiment.

Three additional sessions (Sessions 4–6) were run at UC Berkeley in April of 2003 under virtually identical conditions. These subjects were also recruited via e-mail and had never played the game before. Again, no subject appeared in more than one session.

In each session, the costly leader game was played 100 times in succession. At the beginning of each round, subjects were randomly paired with one another and randomly assigned the roles of leader and follower. Finally, the cost of observing the leader’s action, ε, was randomly generated. It could take on any of five values: 1, 15, 30, 45, or 60, and was displayed to all subjects during the round.

First, player A (the leader) chose between actions labeled “U” (up) and “D” (down), which correspond to S and C, respectively, in game Γ^ε. After the leader made his choice,
player B (the follower) indicated whether he wished to pay the cost ε to observe the leader’s action. Player B did this by selecting the “Yes” or “No” buttons on the screen. If player B chose “Yes,” then A’s choice was displayed on his screen. If player B chose “No,” he received no additional information. Following this, player B chose between actions labeled “L” (left) and “R” (right), which correspond to s and c, respectively. At the end of the round, each player saw a result screen which displayed all of the choices made by him and his opponent in that period. Players did not receive feedback about the choices of players with whom they were not matched.

Figure 2 presents screenshots of the input and results screens that subjects observed when making their choices. Panel (a) depicts a typical screen for a leader. Panel (b) depicts a typical screen for a follower choosing whether to pay to observe L’s choice. Panel (c) depicts the information available to a follower who observed L’s choice. Notice the arrow depicting that choice. Finally, panel (d) displays the results of a round of the experiment. The instructions used in the experiment are contained in Appendix A.

Each session lasted about an hour and each subject’s earnings were calculated from the total points he or she earned during the experiment. For every 1000 points they earned 50 cents. Total earnings were between $16.25 and $21.00, with an average of $18.45. Though the subjects had not been told in advance, the actual payments were rounded up to the nearest quarter to simplify the money handling. All subjects were paid in cash and in private.

In arriving at these procedures, we considered experimental design issues concerning learning, fairness, repeated interaction, and collusion. We discuss these issues briefly below. The large number of periods (100) was chosen to give subjects ample opportunity to experiment with different strategies and learn what works well and what does not.
In this way we wanted to ensure that there was a reasonable chance that convergence to equilibrium play would occur over the course of the experiment. The random assignment of the roles of leader and follower in each round served two purposes. First, subjects would have to think about the strategic effects of their choices from both perspectives, which we thought might speed up learning. Second, since in the noisy Stackelberg equilibrium the leader increases his payoff at the expense of the follower, with fixed roles, fairness considerations might affect subject behavior. The combination of random matching of subjects and random rotation of roles meant that the probability that a given pair of subjects would be matched in consecutive periods in the same roles was very small. Thus, the design reduced the possibility that repeated interaction played a significant role in subject choices. Of course, having more subjects per session would have further reduced this possibility; however, the physical constraints of the laboratory did not allow for this. Nevertheless, absent any information about the identity of your opponent in a given round, combined with the randomness of the matching and role schemes and the randomness in the cost of observation, there was relatively little possibility for subjects to coordinate on repeated game (collusive) strategies during the experiment. Further, we saw no evidence of this type of behavior in the data and none of the responses to our post-experiment questionnaires made any reference to this type of strategy.
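To put a rough number on how unlikely a repeat encounter is, the sketch below enumerates all pairings of the ten subjects in a session. It rests on assumptions that go slightly beyond the text: that every pairing of the ten subjects is equally likely, that roles within a pair are assigned by a fair coin, and that draws are independent across rounds. The function names are illustrative.

```python
from fractions import Fraction


def perfect_matchings(players):
    """Yield every way of splitting `players` (a list of even length) into unordered pairs."""
    if not players:
        yield []
        return
    first, rest = players[0], players[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in perfect_matchings(remaining):
            yield [(first, partner)] + tail


def rematch_same_roles_probability(n_subjects=10):
    """Chance that two given subjects are paired again next round in the same roles."""
    players = list(range(n_subjects))
    total = 0
    together = 0
    for matching in perfect_matchings(players):
        total += 1
        if (0, 1) in matching:      # subject 0 is always listed first in its pair
            together += 1
    p_paired = Fraction(together, total)   # assumption: uniform over all pairings
    p_same_roles = Fraction(1, 2)          # assumption: fair coin for roles within a pair
    return p_paired * p_same_roles
```

Under these assumptions, `rematch_same_roles_probability()` is the probability that a pair who have just played meet again in the very next round with the same role assignment. Changing `n_subjects` shows how a larger session would shrink it further.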
Fig. 2. Screenshots of experiment, panels (a)–(d).
4. Results

4.1. Overview

Table 1 presents the distribution of outcomes over the six sessions.⁶ Since our primary interest is in the equilibrium predictions of the model, these statistics are based on choices made in rounds 41–100 only, the point at which we determined that subject behavior became fairly stationary. As the table reveals, the cost of observing the leader’s action has a strong effect on whether leaders and followers arrive at Stackelberg or Cournot outcomes, or even coordinate at all.

When ε = 1 or ε = 15, which we shall refer to as the low cost treatments, subjects coordinate on the Stackelberg outcome more than 75% of the time. Moreover, coordination failure (i.e., (C, s) or (S, c) outcomes) occurs relatively infrequently, less than 11% of the time. In contrast, when ε = 45 or ε = 60, which we shall refer to as the high cost treatments, the focus shifts to the Cournot outcome, which occurs more than 76% of the time. In these treatments subjects are somewhat worse at coordinating, with coordination failures happening 21% of the time. The superior coordination in the low cost treatments relative to the high cost treatments seems to stem from the frequency with which followers chose to observe the leader’s action, which is given in the column labeled “Observe %” in Table 1. Followers choose to pay to observe the leader’s action over 60% of the time in the low cost treatments, but less than 15% of the time in high cost treatments. Finally, when ε = 30, which we shall refer to as the medium cost treatment, the Cournot outcome occurs most often, but no outcome occurs more than 50% of the time. Further, coordination failures occur in 33% of the observations.

Next, we turn to the underlying decisions of the subjects. The four panels of Table 2 present summary statistics of their choices.
Table 1
Distribution of outcomes and frequency of observing by followers, rounds 41–100, all sessions

           Outcome
Epsilon    (S, s)   (C, c)   (S, c)   (C, s)   Observe %
1              81       13        2        3          84
15             75       14        5        6          61
30             25       42       13       20          39
45              3       76        9       12          14
60              1       77        7       14           9

Note. All results expressed in percentages.

⁶ Notice that while the values of ε were randomly generated, we used an identical seed value for all sessions. Hence, the number of observations under each treatment in Table 1 should, in principle, be equal for all sessions. The exceptions occur in Session 6, where a computer glitch caused the loss of 6 observations and Session 5, where we eliminated all data associated with Subject 2. This subject always paid to observe the leader’s action when he or she was assigned the role of follower and then chose a non-best-response to the leader’s action 100% of the time. This behavior was sufficiently bizarre that we felt justified in excluding choices associated with this subject from the data.

Table 2
Summary of choices by session (rounds 41–100)

         Session
Epsilon       1 # Obs.       2 # Obs.       3 # Obs.       4 # Obs.       5 # Obs.       6 # Obs. Overall # Obs.

(a) Percentage of Stackelberg play by leaders
1            89     80      95     80      95     80      76     80      53     64      87     79      84    463
15           96     55      93     55      80     55      71     55      43     44      89     55      80    319
30           43     65      43     65      37     65      29     65      40     52      35     65      38    377
45           15     60      10     60       0     60       8     60      23     48      16     57      12    345
60           13     40       3     40       3     40       5     40      16     32      13     38       8    230

(b) Frequency followers chose to observe leader’s action
1            80     80      85     80      84     80      80     80      88     64      89     79      84    463
15           75     55      60     55      75     55      51     55      34     44      67     55      61    319
30           48     65      49     65      42     65      34     65      21     52      37     65      39    377
45           17     60      12     60      10     60      27     60      10     48      11     57      14    345
60            8     40       0     40      10     40      13     40       6     32      16     38       9    230

(c) Percentage of Stackelberg play by followers after not observing leader’s action
1            88     16     100     12     100     13      94     16      25      8     100      9      88     74
15           86     14      95     22      86     14      70     27      52     29      83     18      76    124
30           47     34      67     33      45     38      56     43      32     41      34     41      46    230
45           20     50      13     53       6     54      18     44      33     43      14     51      17    295
60           11     37       8     40       6     36      34     35      23     30      16     32      16    210

(d) Percentage of non-best response play by followers after observing leader’s action
1             3     64       0     68       0     67       0     64      11     56       0     70       2    389
15            2     41       0     33       0     41       0     28       0     15       0     37       1    195
30            3     31       0     32       0     27       0     22       0     11       0     24       1    147
45            0     10       0      7       0      6       0     16       0      5       0      6       0     50
60            0      3    n.a.      0       0      4       0      5       0      2       0      6       0     20

Panel (a) displays aggregate choice behavior by subjects playing the role of leader, which can be summarized by the percentage
of Stackelberg choices these subjects made. Panel (b) displays the frequency with which followers chose to pay to observe the leader’s choice. Panels (c) and (d) display the percentage of Stackelberg choices by followers when they chose not to pay to observe the leader’s choice and the percentage of non-best-response choices by followers when they did choose to pay to observe the leader’s action. The number of observations associated with each cell in the table is listed to the right of each of these percentages.

As panel (a) reveals, the choices of leaders were strongly influenced by the observation cost of followers. In low cost treatments, leaders chose Stackelberg play 80% of the time or more. While there is some variation across sessions, in every session leaders chose Stackelberg play in low cost treatments more often than when observation costs were higher. In high cost treatments subjects chose Cournot play 88% of the time or more. Again, there is some variation across sessions but in every session subjects chose Cournot play more often in the high cost treatments than when the cost of observation was lower. Leader choices in the medium cost treatment lie between these two extremes. In short, leaders appeared to be aware of the strategic implications of differences in cost and appeared to take these into account in choosing between Stackelberg and Cournot.

Panel (b) displays a similar monotonic pattern in the decisions of followers to pay to observe the leader’s action as a function of the cost of observation. In every session, followers chose to observe more often in low cost treatments than when costs were higher. While this pattern of choices is intuitive, it is inconsistent with equilibrium predictions. Both the noisy Cournot and noisy Stackelberg equilibria (when they exist) predict that followers will pay to observe the leader’s action exactly half of the time, regardless of the cost of observation.
We will return to this issue in more detail in Section 4.3. Panel (c) shows that when followers chose not to observe the leader’s action, there is a monotonic relationship between the frequency with which followers chose to play Stackelberg and the cost of observing the leader’s action. Under the low cost treatments followers chose the Stackelberg action over 75% of the time conditional on not having observed the leader’s action. In contrast, under the high cost treatments followers chose the Cournot action over 83% of the time under the same condition. As in panel (a), the medium cost treatment lies between these two extremes. While these effects are fairly consistent across sessions, in Session 5 followers chose Stackelberg much less often than subjects in other sessions, particularly for low cost treatments. While part of this difference may be a result of the small number of observations (8) of followers choosing not to observe the leader’s action under the ε = 1 treatment, the tendency toward less Stackelberg play also occurs in the ε = 15 and ε = 30 treatments. In fact, this was not an especially costly strategy for followers since leaders in Session 5 were also less likely to play Stackelberg in the first place. Indeed, as long as leaders choose Stackelberg play less than half the time, as was the case for the ε = 15 and ε = 30 treatments in Session 5, the best response for uninformed followers is to play Cournot. Finally, panel (d) is purely a test of the rationality of followers. As the panel shows, in the vast majority of instances followers best responded to the choice of the leader when they had observed it.
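Two of the calculations behind this overview can be sketched from the published numbers. The first combines the outcome shares in Table 1 with the leader's payoffs in g to approximate how much of the first-mover advantage (the gap between 500 and 400) the average leader kept; the second gives the uninformed follower's best reply as a function of how often leaders play S. Function names are illustrative, and because Table 1 is rounded to whole percentages the results are approximations that need not match the figures quoted in the introduction exactly.

```python
# Payoffs of g, indexed by (leader action, follower action).
LEADER_PAYOFF = {("S", "s"): 500, ("S", "c"): 300, ("C", "s"): 600, ("C", "c"): 400}
FOLLOWER_PAYOFF = {("S", "s"): 200, ("S", "c"): 100, ("C", "s"): 300, ("C", "c"): 400}

# Outcome shares in percent, copied from Table 1 (rounds 41-100, all sessions).
TABLE_1_SHARES = {
    1:  {("S", "s"): 81, ("C", "c"): 13, ("S", "c"): 2,  ("C", "s"): 3},
    15: {("S", "s"): 75, ("C", "c"): 14, ("S", "c"): 5,  ("C", "s"): 6},
    30: {("S", "s"): 25, ("C", "c"): 42, ("S", "c"): 13, ("C", "s"): 20},
    45: {("S", "s"): 3,  ("C", "c"): 76, ("S", "c"): 9,  ("C", "s"): 12},
    60: {("S", "s"): 1,  ("C", "c"): 77, ("S", "c"): 7,  ("C", "s"): 14},
}

COURNOT_PAYOFF = LEADER_PAYOFF[("C", "c")]        # leader's payoff without commitment
STACKELBERG_PAYOFF = LEADER_PAYOFF[("S", "s")]    # leader's payoff with full commitment


def commitment_value_retained(shares):
    """Fraction of the leader's first-mover advantage kept, given outcome shares."""
    total = sum(shares.values())    # rounded shares can sum to 99 rather than 100
    mean_payoff = sum(LEADER_PAYOFF[o] * w for o, w in shares.items()) / total
    return (mean_payoff - COURNOT_PAYOFF) / (STACKELBERG_PAYOFF - COURNOT_PAYOFF)


def uninformed_best_reply(prob_leader_plays_s):
    """Best reply of a follower who did not observe; ties are resolved in favor of c."""
    q = prob_leader_plays_s
    payoff_s = q * FOLLOWER_PAYOFF[("S", "s")] + (1 - q) * FOLLOWER_PAYOFF[("C", "s")]
    payoff_c = q * FOLLOWER_PAYOFF[("S", "c")] + (1 - q) * FOLLOWER_PAYOFF[("C", "c")]
    return "s" if payoff_s > payoff_c else "c"
```

The second function makes the Session 5 remark concrete: with leaders playing S in 43% (ε = 15) or 40% (ε = 30) of rounds, `uninformed_best_reply` returns `"c"`, since playing s blind only pays when leaders choose S more than half the time.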
4.2. Learning

In designing the experiment, we anticipated that there would be an initial ‘learning’ period as subjects experimented with different strategies to determine their relative effectiveness. Table 3 divides the data into 20-period ‘waves’ and examines the aggregate choices of subjects. Panel (a) of Table 3 displays the percentage of Stackelberg choices by leaders. For both the low and high cost treatments, there is a trend toward more uniformity in play. In the low cost treatments, Stackelberg play by leaders becomes increasingly common, while in the high cost treatments it becomes increasingly less common as the game progresses. The medium cost treatment also displays a slight upward trend in Stackelberg play. Panel (b) of Table 3 depicts the frequency with which followers paid to observe the leader’s choice. In all cases, the trend is toward less observation over time on the part of followers. This seems intuitive in that the value of obtaining feedback as to the leader’s strategy is likely to be greater during the early ‘experimentation’ phase of the game. Panel (c) depicts the frequency of Stackelberg play on the part of followers when not observing the leader’s choice. The trends here mirror those in panel (a). In low cost treatments, followers increasingly play Stackelberg strategies and coordination between the leaders’ and followers’ strategies increases over the course of the session. In high cost treatments the trend is toward Cournot play, which again leads to greater coordination with the choices of leaders. In the medium cost treatment, choices of followers converge to an even mixture between Stackelberg and Cournot strategies. Finally, panel (d) of Table 3 shows the fraction of ‘mistakes’ by followers in choosing an action after observing the leader’s choice. As might be expected, there is a strong downward trend in mistakes as the session proceeds. The vast majority of mistakes were made early in the session (rounds 1–20) and very few thereafter.

Table 3
Summary of choices by round

Epsilon   Rounds 1–20   21–40   41–60   61–80   81–100

(a) Percentage of Stackelberg play by leaders
1         66            71      85      85      81
15        58            78      81      80      77
30        27            43      37      39      38
45        23            19      10      18      8
60        26            15      9       3       9

(b) Frequency followers chose to observe leader’s action
1         93            94      85      90      77
15        75            72      62      59      62
30        53            40      40      39      38
45        29            19      14      15      15
60        24            4       13      14      4

(c) Percentage of Stackelberg play by followers after not observing leader’s action
1         60            100     76      94      90
15        52            73      73      72      85
30        30            48      50      42      50
45        38            23      30      12      15
60        43            29      18      16      14

(d) Percentage of non-best response play by followers after observing leader’s action
1         6             1       2       2       2
15        0             0       0       2       0
30        8             0       1       0       0
45        5             0       0       0       0
60        7             0       0       0       0

4.3. Hypothesis tests

As the preceding tables have shown, behavior in high cost treatments was markedly different from that in low cost treatments. We now take a more detailed look at behavior relative to equilibrium predictions. Since, under the theory, subject choices in each round are all Bernoulli random variables, an exact binomial test is appropriate. In cases where the theory predicts pure strategies, any deviation trivially rejects the theory. To perform a ‘fairer’ test, we test against the null hypothesis of theory plus a one percentage point error rate in these cases. Results of these tests are reported in Table 4.⁷ Panel (a) presents the results of hypothesis tests pertaining to the frequency of Stackelberg play on the part of leaders. Notice that in all treatments, Stackelberg play occurs too infrequently compared to the theory predictions of the noisy Stackelberg equilibrium. At the same time, the noisy Cournot equilibrium does not do any better at fitting the data. As the hypothesis tests show, we reject the theory predictions of either noisy Cournot or noisy Stackelberg equilibria at the one percent significance level for all treatments save ε = 15, where we obtain a p-value of 0.02.
Panel (b) presents the results of hypothesis tests pertaining to the frequency with which followers pay to observe the leader’s action. Recall the rather striking theoretical prediction of both the noisy Stackelberg and the noisy Cournot equilibrium that followers will pay to observe the leader’s action exactly half of the time for all treatments, save for ε = 60. For all treatments we reject this null hypothesis at the one percent significance level. Finally, panel (c) presents the results of hypothesis tests of predictions about Stackelberg play by followers who chose not to observe the leader’s action. The noisy Stackelberg prediction is that followers will choose the Stackelberg action 100% of the time. This prediction does poorly, even allowing for the one percentage point error rate, and is rejected at the one percent significance level in every case. The noisy Cournot equilibrium predicts that followers in this situation will never choose the Stackelberg action. As the table shows, this hypothesis too is rejected in every case.

⁷ The p-values in Table 4 are computed as follows. Let v denote the actual number of ‘successes’ for the choice variable of interest. Let µ denote the expected number of successes under the null hypothesis, and let Δ represent the absolute value of the difference between the expected and actual number of successes. Then, the p-values are equal to
\[ 1 - \Pr(\mu - \Delta < \tilde{v} < \mu + \Delta), \]
where ṽ is a binomial random variable under the null hypothesis.
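The computation described in footnote 7 can be sketched as follows. This is an illustration under stated assumptions rather than the code used for Table 4: the interval is taken with strict inequalities, the function names are ours, and the number of observations n is not reported in this section, so any value passed in is hypothetical.

```python
from math import comb


def binom_pmf(n, k, p0):
    """Probability of exactly k successes in n Bernoulli(p0) trials."""
    return comb(n, k) * p0**k * (1 - p0) ** (n - k)


def exact_binomial_p_value(n, v, p0, tol=1e-9):
    """Two-sided p-value 1 - Pr(mu - delta < X < mu + delta), where
    X ~ Binomial(n, p0), mu = n * p0 and delta = |mu - v|.

    For a pure-strategy prediction, p0 would be set to 0.01 or 0.99
    to allow for the one percentage point error rate.
    """
    mu = n * p0
    delta = abs(mu - v)
    # Sum the probability of outcomes strictly closer to mu than v is;
    # tol guards the strict inequality against floating-point rounding.
    inside = sum(
        binom_pmf(n, k, p0)
        for k in range(n + 1)
        if abs(k - mu) < delta - tol
    )
    return 1.0 - inside
```

As a small check of the logic, with n = 2 trials, p0 = 0.5 and v = 2 observed successes, only the outcome of one success lies strictly inside the interval, so the p-value is 0.5.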
Table 4
Hypothesis tests

                    Noisy Stackelberg           Noisy Cournot*
Epsilon   Actual    Theory   Binomial test      Theory   Binomial test

(a) Percentage of Stackelberg play by leaders
1         84        99       0.00               1        0.00
15        80        85       0.02               15       0.00
30        38        70       0.00               30       0.00
45        12        55       0.00               45       0.00
60        8         -        -                  0        0.00

(b) Frequency followers chose to observe leader’s action
1         84        50       0.00               50       0.00
15        61        50       0.00               50       0.00
30        39        50       0.00               50       0.00
45        14        50       0.00               50       0.00
60        9         -        -                  0        0.00

(c) Percentage of Stackelberg play by followers after not observing leader’s action
1         88        100      0.00               0        0.00
15        76        100      0.00               0        0.00
30        46        100      0.00               0        0.00
45        17        100      0.00               0        0.00
60        16        -        -                  0        0.00

* For ε = 60, theory predictions are based on pure-strategy Cournot, since this is the unique equilibrium.
To sum up, formal statistical tests of the theory offer little overall support for the predictions of either the noisy Stackelberg or the noisy Cournot equilibrium. Further, the systematic pattern in the follower’s decision to observe the leader’s action as a function of the cost of observation is contrary to the equilibrium predictions. Nonetheless, in qualitative terms, subject choice behavior does somewhat resemble the theoretical predictions at the aggregate level. This is particularly true for the ε = 60 treatment, where subject choices are roughly as in the pure Cournot equilibrium.

4.4. Individual play

Next, we offer a rudimentary analysis of choices at the individual level. For each individual in a given treatment we count the number of Stackelberg choices this individual made in the role of leader. If this individual were playing an i.i.d. mixed strategy in each period, then this count would be a binomial random variable with k trials, where k is the number of times the individual played the role of leader for a given treatment. We then compute the sample standard deviation of this random variable over all individuals for each treatment. This offers one measure of the heterogeneity of individual play. The results of this analysis are given in panel (a) of Table 5 in the column labeled ‘Actual.’ We perform the same analysis to obtain sample standard deviations of the frequency with which followers chose to observe the leader’s choice under each treatment. The results are listed in panel (b).⁸

To obtain a benchmark for the degree of heterogeneity in individual play, we compare these sample standard deviations to two theoretical data generating processes. The first, labeled ‘Theory,’ assumes that individuals make i.i.d. choices using mixed-strategy probabilities implied by equilibrium.⁹ Notice that for low and medium cost treatments, the sample standard deviation is considerably higher than theory predicts, indicating substantial heterogeneity in individual choices. The second benchmark, labeled ‘Upper bound,’ assumes that subjects randomize over all actions with equal probability. This induces the maximum possible dispersion in individual choices conditional on all individuals playing the same i.i.d. mixed strategy. Notice that even compared to this upper bound, sample standard deviations are higher in low and medium cost treatments.

To get a feel for the sensitivity of individual strategies to changes in ε, it is useful to note that while most subjects showed considerable variability in play, there were occasional subjects playing the same pure strategy regardless of the treatment. For instance, three subjects played the pure strategy of never choosing the Stackelberg action when playing the role of leader, for all ε. One subject chose to never pay to observe the leader’s choice, while another chose to observe in all but one instance. Finally, five subjects never chose the Stackelberg action in the role of follower conditional on not having observed the leader’s action.

Table 5
Standard deviation of individual play by treatment

Epsilon   Actual   Theory   Upper bound

(a) Stackelberg play by leaders
1         2.83     0.28     1.41
15        1.97     0.84     1.17
30        2.24     1.17     1.27
45        1.20     1.22     1.22
60        0.63     0.00     1.00

(b) Followers choosing to observe leader’s action
1         2.56     1.41     1.41
15        2.22     1.17     1.17
30        2.62     1.27     1.27
45        1.67     1.22     1.22
60        0.80     0.00     1.00

⁸ We do not include a panel for the frequency with which followers chose the Stackelberg action conditional on not having observed the leader’s choice, because this confounds two forms of heterogeneity: differences in the propensity to observe the leader’s action and differences in choices following non-observation of the leader’s action. This confounding produces a relatively uninformative statistic about follower play.

⁹ Note that the theoretical standard deviations from noisy Cournot and noisy Stackelberg play are identical; hence we need not distinguish between the two.
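The two benchmarks in Table 5 follow from the standard deviation of a binomial count, sqrt(k·π·(1 − π)), where π is the per-period choice probability. The sketch below illustrates this; the values k = 8 and k = 6 are assumptions backed out from the ‘Upper bound’ column (sqrt(k)/2 ≈ 1.41 and 1.22), since the number of periods per individual is not reported in this section, and the function names are ours.

```python
from math import sqrt


def binomial_count_sd(k, prob):
    """Standard deviation of the number of successes in k i.i.d. trials,
    each succeeding with probability prob."""
    return sqrt(k * prob * (1 - prob))


def upper_bound_sd(k):
    """Benchmark in which both actions are chosen with equal probability,
    which maximizes prob * (1 - prob)."""
    return binomial_count_sd(k, 0.5)


# Leaders, epsilon = 1: equilibrium Stackelberg probability 0.99
# (0.01 under noisy Cournot gives the same value), assumed k = 8.
theory_eps1 = round(binomial_count_sd(8, 0.99), 2)   # 0.28
bound_eps1 = round(upper_bound_sd(8), 2)             # 1.41

# Leaders, epsilon = 45: equilibrium probability 0.55, assumed k = 6.
theory_eps45 = round(binomial_count_sd(6, 0.55), 2)  # 1.22
bound_eps45 = round(upper_bound_sd(6), 2)            # 1.22
```

With these assumed values of k, the sketch reproduces the ‘Theory’ and ‘Upper bound’ entries of panel (a) for ε = 1 and ε = 45.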
5. Quantal response equilibrium

In this section we study the (logit) agent quantal response equilibrium (AQRE) of our game and examine whether this fixed point in ‘trembling’ strategies can explain the data. McKelvey and Palfrey (1995) developed the concept of quantal response equilibrium for normal form games. The AQRE extends this solution concept to extensive form games. The theory is contained in McKelvey and Palfrey (1998). The interested reader may want to consult these papers for details.

As we shall see, the AQRE can explain many aspects of the data but not all. Under the maximum likelihood choice of the error parameter λ, the predicted probability that a follower pays to observe the leader’s choice does indeed decrease as the cost of observation grows. Further, the observed switch from the prevalence of Stackelberg outcomes for low cost treatments to Cournot outcomes for high cost treatments is well described by the AQRE. However, for ε = 1, the AQRE fails to predict correctly the high frequency with which the follower observes the leader’s action. In fact, with an overall maximum of 72%, the predicted observation frequency remains too low relative to the observed 84%, regardless of the choice of error parameter λ.

We now turn to the computation and estimation of the AQRE model. We begin by specifying the follower’s mixtures. Let t denote the probability of a follower playing s conditional on observing that the leader has played S. Let w denote the analogous probability of F playing c conditional on observing that L has played C. Then, under the logistic specification of the error generating process with parameter λ, we have (after some simplification):
\[ t = w = \frac{e^{100\lambda}}{1 + e^{100\lambda}}. \]
Next, let q denote the frequency with which the follower plays the Stackelberg action s conditional on not having observed the leader’s action. Then, after some algebra, it can be shown that
\[ q = \frac{1}{1 + e^{100\lambda(1 - 2r)}}, \]
where r is the probability that the leader plays the Stackelberg action S. Finally, let p denote the probability that the follower chooses to pay to observe the leader’s action. Then:
\[ p = \frac{1}{1 + e^{100\lambda((q - t)r + (1 - r)(1 - q - w) + \varepsilon)}}. \]
Turning to the leader’s strategy, it simply consists of the probability of playing the Stackelberg action S, which we have denoted by r. After simplification, the expression for r becomes:
\[ r = \frac{1}{1 + e^{100\lambda(2p(1 - w - t) + 1)}}. \]
While this system of equations does not have closed-form solutions in terms of p, q, r, t, w as a function of λ, one can readily compute solutions numerically. Figure 3 displays the results of these computations for differing values of the cost parameter ε. Notice that in addition to the parameter values of ε used in the experiment, Fig. 3 also includes a plot for ε = 46. This is indirectly related to the fact that the AQRE solution concept also acts as an equilibrium selection device, since there exists a unique connected path selection of the logit-AQRE correspondence leading from λ = 0 to λ = ∞. For ε < 44, this AQRE path converges to the noisy Stackelberg equilibrium when λ → ∞, while for ε > 45, the AQRE path converges to the noisy Cournot equilibrium. For values of ε around the switch point the behavior of the AQRE is somewhat unstable, or ‘singular,’ in the sense that the AQRE solution as a function of λ is not smooth for λ around 3.9. By pretending that ε was 46 instead of 45, we move sufficiently far away from the singularity to restore the smoothness of the AQRE solution, likelihood function and maximum likelihood estimate of λ.

Fig. 3. AQRE predictions as function of error parameter λ and observed frequencies at λ = 3.2.

Table 6 reports the maximum likelihood estimates λ̂ of error parameter λ. In panel (a), all sessions are pooled and likelihood estimates are made for each observation cost treatment separately, as well as over the entire dataset. When considering a particular observation cost treatment ε in isolation, the (normalized) likelihood function Lε is:
\[
L_\varepsilon(\lambda) = \frac{1}{k_\varepsilon} \sum_{i=1}^{k_\varepsilon}
\left[
\begin{aligned}
& 1_{\{S_i=1\}} \log r + 1_{\{S_i=0\}} \log(1-r) \\
& {} + 1_{\{y_i=0\}} \bigl( \log(1-p) + 1_{\{s_i=1\}} \log q + 1_{\{s_i=0\}} \log(1-q) \bigr) \\
& {} + 1_{\{y_i=1\}} \log p \\
& {} + 1_{\{S_i=1\}} 1_{\{y_i=1\}} \bigl( 1_{\{s_i=1\}} \log t + 1_{\{s_i=0\}} \log(1-t) \bigr) \\
& {} + 1_{\{S_i=0\}} 1_{\{y_i=1\}} \bigl( 1_{\{s_i=0\}} \log w + 1_{\{s_i=1\}} \log(1-w) \bigr)
\end{aligned}
\right].
\]

Here, kε denotes the total number of observations in which the observation cost was equal to ε, and Si is an indicator variable whose value is equal to 1 (instead of 0) if and only if the leader’s action in observation i was the Stackelberg action S. The indicator variables yi and si are defined analogously (for the follower’s decision to pay to observe and for the follower’s choice of the Stackelberg action s, respectively), and 1{..} is the indicator function whose value is equal to 1 (instead of 0) if and only if the logical expression in {..} is true. To obtain the maximum likelihood estimate λ̂ of λ for the entire data set, while giving equal weight to each treatment, we maximize the sum, L, of the normalized likelihood functions Lε:
\[ L(\lambda) = \sum_{\varepsilon \in \{1, 15, 30, 45, 60\}} L_\varepsilon(\lambda). \]

Table 6
Maximum likelihood estimates of λ

(a) Data pooled by treatment
Treatment                   ε = 1   ε = 15   ε = 30   ε = 45*   ε = 60   All
Lambda estimates            3.8     3.9      1.9      4.4       3.2      3.2
Log-likelihood              −1.1    −1.4     −1.9     −1.5      −1.1     −7.2
Log-likelihood at λ = 3.2   −1.1    −1.5     −2.0     −1.6      −1.1     −7.2

(b) Data pooled by session
Session                     1       2        3        4         5        6
Lambda estimates            3.3     4.5      4.0      2.8       2.1      3.1
Log-likelihood              −6.9    −5.8     −6.2     −7.8      −8.7     −7.1

(c) Data pooled by “wave”
Rounds                      1–20    21–40    41–60    61–80     81–100
Lambda estimates            2.3     2.9      3.1      3.2       3.3
Log-likelihood              −8.2    −7.5     −7.4     −7.1      −7.1

* All estimates use ε = 46. See discussion in the text for details.
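To make the structure of the normalized likelihood concrete, the following sketch evaluates it for a list of observations given candidate probabilities p, q, r, t, w. It is an illustration only: the data layout (triples of 0/1 indicators) and the function name are assumptions of ours, and the probabilities are taken as inputs rather than computed from λ.

```python
from math import log


def normalized_log_likelihood(observations, p, q, r, t, w):
    """Average log-likelihood of one treatment's observations.

    observations: list of (S, y, s) triples of 0/1 indicators, where
      S = 1 if the leader played the Stackelberg action,
      y = 1 if the follower paid to observe,
      s = 1 if the follower played the Stackelberg action.
    p, q, r, t, w: choice probabilities as defined in Section 5.
    """
    total = 0.0
    for S, y, s in observations:
        # Leader's choice.
        total += log(r) if S == 1 else log(1 - r)
        if y == 0:
            # Follower did not observe, then chose s with probability q.
            total += log(1 - p)
            total += log(q) if s == 1 else log(1 - q)
        else:
            # Follower observed, then best responded with probability
            # t (after S) or w (after C).
            total += log(p)
            if S == 1:
                total += log(t) if s == 1 else log(1 - t)
            else:
                total += log(w) if s == 0 else log(1 - w)
    return total / len(observations)
```

Each observation contributes exactly three log terms (leader's action, decision to observe, follower's action), so with all probabilities equal to 1/2, as at λ = 0, the normalized value is 3 log(1/2) ≈ −2.08 whatever the data.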
In Fig. 3, the empirical distributions of the subjects’ choices are displayed, evaluated at λ = λ̂ = 3.2. The diamond symbol corresponds to the empirical frequency of best responses by followers conditional on having observed the leader’s action. The circle symbol corresponds to Stackelberg play by followers not having observed the leader’s action. The star symbol denotes the empirical frequency with which followers pay to observe the leader’s choice and the plus symbol denotes the empirical frequency of Stackelberg play by leaders.
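Predicted frequencies of the kind plotted in Fig. 3 come from solving the logit response equations of this section numerically. The sketch below uses plain damped fixed-point iteration started from uniform play; this is not necessarily the method used for the figure. We assume that ε is entered on the same scale as the bracketed payoff terms in the equation for p (the text does not state the scaling), and the names `aqre_map`, `solve_aqre`, `scale` and `damping` are ours.

```python
from math import exp


def logistic(x):
    """Numerically stable 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    z = exp(x)
    return z / (1.0 + z)


def aqre_map(state, lam, eps, scale=100.0):
    """Apply the response equations once to state = (p, q, r).

    Returns the updated (p, q, r) together with t and w, which depend
    on lam only. Note that 1 / (1 + e^x) equals logistic(-x).
    """
    p, q, r = state
    a = scale * lam
    t = w = logistic(a)
    q_new = logistic(-a * (1 - 2 * r))
    p_new = logistic(-a * ((q - t) * r + (1 - r) * (1 - q - w) + eps))
    r_new = logistic(-a * (2 * p * (1 - w - t) + 1))
    return (p_new, q_new, r_new), t, w


def solve_aqre(lam, eps, damping=0.5, tol=1e-10, max_iter=10000):
    """Damped iteration from uniform play; returns (p, q, r, t, w)."""
    state = (0.5, 0.5, 0.5)
    for _ in range(max_iter):
        new, t, w = aqre_map(state, lam, eps)
        step = max(abs(n - s) for n, s in zip(new, state))
        state = tuple(s + damping * (n - s) for n, s in zip(new, state))
        if step < tol:
            return state + (t, w)
    raise RuntimeError("damped iteration did not converge")
```

At λ = 0 every probability equals 1/2, which gives a simple check of the implementation. For larger λ, and in particular near the non-smooth region discussed above, simple iteration of this kind may fail to converge or may leave the connected branch, so a path-following method starting from λ = 0 would be the more careful choice.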
Consistent with the results reported in Table 6, Fig. 3 shows that the AQRE estimates achieve their best fit for the extreme treatments (ε = 1, 60). The fit is somewhat worse for the ε = 15 and ε = 45 treatments, and worst for the medium cost treatment, ε = 30. Apart from having a lower normalized log-likelihood, the ε = 30 treatment also generates a smaller maximum likelihood estimate λ̂ of the error parameter λ (corresponding to more noise) than the other treatments. This is consistent with our earlier finding that as outcomes moved from predominantly Stackelberg for low costs to predominantly Cournot for high costs, failure to coordinate on an equilibrium outcome occurred relatively frequently for ε = 30.

Though, overall, the maximum likelihood AQRE is a relatively good fit for the data when ε = 1, it is noteworthy that the predicted frequency with which followers pay to observe the leader’s choice, namely 72%, remains significantly below the actual frequency of 84%. In fact, this is true for all possible values of λ. Formally, the null hypothesis that the empirical frequency with which followers chose to pay to observe the leader’s action for ε = 1 is generated by an AQRE is rejected with a p-value of 0.000, for all λ ∈ [0, ∞). Thus, the AQRE cannot fully replicate the behavioral feature that followers very frequently pay to observe when observation costs are very small.

In panel (b) of Table 6, estimates of the error parameter are reported for each session separately. As the panel shows, the sessions conducted at Princeton (sessions 1–3) all have higher normalized log-likelihoods and higher estimates for the error parameter (i.e., lower levels of ‘trembles’ or noise) than the sessions conducted at Berkeley. Indeed, the highest estimate for the error parameter in the Berkeley sessions (λ̂ = 3.1) is lower than the lowest estimate in the Princeton sessions (λ̂ = 3.3). Treating the maximum likelihood estimate of λ by session as the unit of observation and the location as the treatment, a Wilcoxon rank-sum test rejects the null hypothesis of no location effect in favor of the one-sided alternative that the Berkeley sessions yielded lower error parameters, i.e., more noise, than did the Princeton sessions.

Finally, in panel (c) of Table 6, separate estimates of the error parameter are reported for each 20-round ‘wave’ in the pooled data. As the panel reveals, both the normalized log-likelihood and the estimated error parameter increase from one wave to the next, but the rate of increase lessens after round 40. This suggests that play moved closer to equilibrium over time as subjects presumably learned more about the game and about other subjects’ behavior.

To sum up, our tables and graphs suggest that the AQRE can explain certain underlying patterns of the data, such as the switch from a predominance of Stackelberg outcomes for low costs to Cournot outcomes for high costs, but cannot correctly predict the probability of observation by followers when the cost is very small.
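The location comparison in this section can be illustrated with an exact rank-sum calculation. The sketch below enumerates all assignments of ranks to the two groups; the session estimates are those we read from panel (b) of Table 6, the assignment of sessions 4–6 to Berkeley is our reading of the text, and the function name is ours.

```python
from itertools import combinations


def rank_sum_exact_p_lower(sample_low, sample_high):
    """Exact one-sided p-value of the rank-sum test (no ties assumed).

    Returns the probability, when every assignment of ranks to the two
    groups is equally likely, that the rank sum of `sample_low` is at
    most as large as the observed one.
    """
    pooled = sorted(sample_low + sample_high)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    observed = sum(rank[v] for v in sample_low)
    all_ranks = range(1, len(pooled) + 1)
    sums = [sum(c) for c in combinations(all_ranks, len(sample_low))]
    return sum(1 for s in sums if s <= observed) / len(sums)


princeton = [3.3, 4.5, 4.0]  # sessions 1-3, as read from Table 6(b)
berkeley = [2.8, 2.1, 3.1]   # sessions 4-6 (assumed to be the Berkeley sessions)
p_value = rank_sum_exact_p_lower(berkeley, princeton)
```

Because every Berkeley estimate lies below every Princeton estimate, the Berkeley sessions receive ranks 1 to 3, and only one of the 20 possible rank assignments has a rank sum that small. Under these assumptions the exact one-sided p-value is 1/20 = 0.05, the smallest value attainable with three sessions per location.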
6. Conclusion Our results suggest that, at least for small observation costs, the leader’s value of commitment is robust. Once the follower’s cost of observing the leader’s action grows larger, however, the theoretical possibility of preservation of the first-mover advantage is not borne
out in the data. In a nutshell, variations in observation cost are a key determinant in whether commitment is valuable or not. While the first-mover advantage is largely preserved in low cost treatments, it does not appear to be supported by equilibrium play. In particular, we can confidently reject the hypothesis that the empirical frequencies are being generated by the noisy Stackelberg equilibrium. Moreover, as the cost of becoming informed changes, the followers’ empirical observation frequency changes systematically; a finding at odds with equilibrium theory. The data is better fit in a model where agent quantal response equilibrium is used as the solution concept. Under AQRE, follower play is correctly predicted to depend on the cost of being informed. However, certain aspects of the data cannot be explained by AQRE. Specifically, we confidently reject the hypothesis that for very low cost of becoming informed the empirical frequency of followers choosing to observe the leader’s action derives from the maximum likelihood AQRE. Indeed, this hypothesis is rejected for every possible error parameter of that model. Thus, we are left mostly with negative conclusions. Clearly, selecting equilibria on the basis of pure strategies does not lead to a good description of behavior. That said, the mixed-strategy noisy Stackelberg equilibrium is not a good description of behavior either. Adding other factors such as risk aversion actually moves the theory further from the data. AQRE addresses most of these problems but cannot account for the high frequency with which followers choose to observe the leader’s action when the observation cost is very small. It is this aspect of the data that seems to present the greatest difficulty for the existing theory. In our view, the results reported here suggest a change in the focus of the debate on the value of commitment. 
Specifically, it appears to us that the emphasis on equilibrium selection between Cournot and noisy Stackelberg equilibrium does not truly go to the heart of the matter. For the costly leader game, a satisfactory theory must deal with the systematic changes in looking behavior occurring as the cost parameter changes and with the high frequencies with which followers choose to observe the leader’s choice when the cost of doing so is low. This remains a task for future research.
Acknowledgments The authors thank Don Dale, Avinash Dixit, Hugo Sonnenschein, the Associate Editor, and an anonymous referee for their valuable comments and suggestions. The first author thanks the National Science Foundation for its financial support of this project. The views presented in this paper are those of the authors and do not necessarily reflect the position of the International Monetary Fund.
Appendix A. Instructions Thank you for participating in this experiment on the economics of decision making. If you follow the instructions carefully and make good decisions you can earn a considerable
amount of money. At the end of the experiment you will be paid in cash and in private. The experiment will take about one hour and 15 minutes. There are 10 people participating in this session. They have all been recruited in the same way that you have and are reading the same instructions that you are for the first time. Please refrain from talking to the other participants during the experiment. You are about to play the same game 100 times in succession. In each round you will be paired with a co-player. These pairings are random. Neither you nor your co-player will know who you are paired with. And since the pairings are random your co-player will change from one round to the next. In each round, you will be randomly assigned the role of player A or the role of player B. Whatever role you are assigned, your co-player is assigned the opposite role. In any given round, it is equally likely that you are a player A or a player B. The time line in Fig. A.1 summarizes the play of the game. First, player A chooses U(p) or D(own). Then, player B decides whether he wants to observe player A’s choice, i.e., he chooses Yes or No. If B chose Yes, then player B is informed of A’s choice. Finally, player B chooses either L(eft) or R(ight). The cost to player B of choosing Yes is listed at the top of the screen. It will change from round to round. If player B chooses to observe player A’s choice, player A’s choice will be marked by an arrow. The payoffs in this game are summarized in the matrix of Fig. A.2. The first number in a cell represents player A’s payoff, while the second number represents player B’s payoff. This matrix should be read as follows. Suppose player A has chosen U(p) and player B has chosen L(eft). Then, player A gets a payoff of 500, while player B gets a payoff of 200. The other matrix entries have a similar interpretation. Thus, if player A chooses D(own) and player B chooses R(ight), player A receives 400 and player B receives 400. 
If player A chooses U(p) and player B chooses R(ight), player A receives 300 and player B receives 100. Finally, if player A chooses D(own) and player B chooses L(eft), player A receives 600 and player B receives 300. In addition, player B’s payoff depends on whether or not he chose to observe player A’s choice. If player B observed A’s choice, his payoff is reduced by the cost listed at the top of the screen.
Fig. A.1. Time line.

                          Player B’s choice
                          L            R
Player A’s choice    U    500, 200     300, 100
                     D    600, 300     400, 400

Fig. A.2. Payoff matrix.
Player B’s cost of observing A’s choice can take on any of five possible values: 1, 15, 30, 45, or 60 points. The value in a particular round is determined at random. All values are equally likely to occur in any given round. Your cash earnings are calculated from the total points you earned during the experiment. For every 1000 points you earn 50 Cents. For example, if you score a total of 20,000 points, you earn $10.
References

Bagwell, K., 1995. Commitment and observability in games. Games Econ. Behav. 8, 271–280.
Damme, E. van, Hurkens, S., 1997. Games with imperfectly observable commitment. Games Econ. Behav. 21, 282–308.
Güth, W., Kirchsteiger, G., Ritzberger, K., 1998. Imperfectly observable commitments in n-player games. Games Econ. Behav. 23, 54–74.
Huck, S., Müller, W., 2000. Perfect versus imperfect observability: An experimental test of Bagwell’s result. Games Econ. Behav. 31, 174–190.
McKelvey, R., Palfrey, T., 1995. Quantal response equilibria for normal form games. Games Econ. Behav. 10, 6–38.
McKelvey, R., Palfrey, T., 1998. Quantal response equilibria for extensive form games. Exper. Econ. 1, 9–41.
Oechssler, J., Schlag, K., 2000. Loss of commitment: An evolutionary analysis of Bagwell’s example. Int. Game Theory Rev. 2, 83–96.
Várdy, F., 2004. The value of commitment in Stackelberg games with observation costs. Games Econ. Behav. 49, 374–400.