Anomalous Behavior in Public Goods Experiments: How Much and Why?

By THOMAS R. PALFREY AND JEFFREY E. PRISBREY*

We report the results of voluntary contributions experiments where subjects are randomly assigned different rates of return from their private consumption. These random assignments are changed round to round, enabling the measurement of individual player contribution rates as a function of that player's investment cost. We directly test these response functions for the presence of warm-glow and/or altruism effects. We find significant evidence for heterogeneous warm-glow effects that are, on average, low in magnitude. We statistically reject the presence of an altruism effect. (JEL C92, C92, H41)

* Palfrey: Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA 91125; Prisbrey: Federal Communications Commission, Mass Media Bureau, Policy and Rules Division, 2000 M St., N.W., Washington, DC 20554. The financial support of the National Science Foundation is gratefully acknowledged. We thank Roger Gordon, R. Mark Isaac, John Ledyard, Jimmy Walker, Nat Wilcox, and four referees for offering helpful suggestions and comments. The views expressed are those of the authors and do not necessarily reflect the views of the California Institute of Technology or the Federal Communications Commission.

There is a growing body of experimental data from voluntary contribution, public goods environments with a single public good and a single private good. Among the many features of the data that are difficult to explain is the apparent frequent use of strictly dominated strategies. Subjects not only give away money when free-riding is a dominant strategy (R. Mark Isaac et al. [1984, 1994], Isaac and James M. Walker [1988, 1989], and elsewhere), but they also often fail to contribute when it is in their own best interests to do so (Tatsuyoshi Saijo and Hideki Nakamura, 1995). Furthermore, individual behavior over time exhibits erratic patterns, with many subjects alternating back and forth between generosity and selfishness. John O. Ledyard's (1995) excellent survey documents these and several other anomalies. The anomalies might be cause for a serious reexamination of the theory, as they signal trouble for current economic models of selfish behavior. However, the range of environments for which these experimental results have been reported is very narrow, and more importantly the designs employed make it difficult, if not impossible, to estimate the actual strategies underlying subject behavior. Our design, by changing both the information structure and the distribution of preferences, allows the estimation of strategies at both the group and the individual level. As a result, we are able to clearly identify the different sources of some of these anomalies. The different environment also provides a chance to see if previous anomalous findings are robust.

We employed the following basic design, which is a variation on the Voluntary Contributions Mechanism (Isaac et al., 1984). Each subject was given an endowment which could voluntarily be contributed toward a public good, or kept to be consumed as a private good. The consumption value of the public good depended linearly upon the total contributions of the group. All the subjects in a group had the same commonly known marginal value for the public good. But individual subjects were randomly assigned different marginal values for the private good from a commonly known distribution. In such a setup, subjects whose value for the private good is less than their value for the public good have a dominant strategy to contribute all of their endowment; subjects whose value for the private good is greater than their value for the public good have a dominant strategy to keep all of their endowment or to free ride. Subjects repeated the game several
830 THE AMERICAN ECONOMIC REVIEW DECEMBER 1997

times, each time being randomly reassigned a new value for the private good.

Specifically, our laboratory environment consists of $N$ individuals, each endowed with $w_i$ discrete units of a private good. The marginal rate of transformation between the public good and the private good is one-for-one, and individual monetary payoff functions are of the form:

$$U_i(x_i, x_{-i}) = V \sum_j x_j + r_i (w_i - x_i),$$

where $x_i$ is the individual's contribution. We refer to $V$ as the marginal value of the public good, and it is the same for all individuals. The marginal value of the private good is $r_i$, and it is private information.

Essentially all of what we think we know about behavior in this game is based on experiments in which the marginal valuations of the private good are identical in all periods for all participants in the experiments. With one exception,¹ the private good valuations exceed the public good valuation, so all subjects have a dominant strategy to free ride. The central findings from these experiments are summarized below.

• Most players in this game violate their one-shot dominant strategy, with many contributing upwards of half their endowment. They do so even when the marginal valuation of the private good is three or more times that of the public good.
• As the marginal valuation of the private good gets closer to the marginal valuation of the public good, more violations of the dominant strategy are observed.
• Subjects can be roughly categorized according to their tendency to violate the dominant strategy.
• Violations of dominant strategies diminish both with repetition and with experience (playing a second sequence of games with a new group).
• Violations of dominant strategies to contribute, i.e., when $r_i < V$, appear to be as prevalent as violations of dominant strategies to free ride (Saijo and Nakamura, 1995).
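The payoff function and the dominant strategies it implies can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names `payoff` and `dominant_strategy` and the four-person example group are hypothetical and do not come from the experiment's software.

```python
def payoff(i, contributions, token_values, endowments, V):
    """Monetary payoff U_i = V * sum_j x_j + r_i * (w_i - x_i)."""
    public_part = V * sum(contributions)
    private_part = token_values[i] * (endowments[i] - contributions[i])
    return public_part + private_part


def dominant_strategy(r_i, V):
    """Payoff-maximizing choice for a subject who cares only about money."""
    if r_i < V:
        return "contribute"
    if r_i > V:
        return "keep"
    return "indifferent"


# Hypothetical group (illustrative numbers only): V = 6, one token each.
V = 6
tokens = [2, 9, 6, 15]   # private token values r_i
endow = [1, 1, 1, 1]     # endowments w_i
x = [1, 0, 0, 0]         # contributions x_i

print([payoff(i, x, tokens, endow, V) for i in range(4)])
print([dominant_strategy(r, V) for r in tokens])
```

Because the payoff is linear in $x_i$, with slope $V - r_i$, the selfish optimum is always at a corner: contribute everything or nothing.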
¹ Saijo and Nakamura (1995).

Several possible explanations have been offered for why there is so much more cooperation than the standard theory predicts. The explanations that have thus far received the most attention are:

(a) altruistic preferences;
(b) warm-glow preferences;
(c) repeated game effects, including reputation building; and
(d) subject confusion.

The first two explanations are similar because they both suggest that subjects have a nonmonetary component in their utility function that is difficult for the experimenter to control, and that works in the opposite direction of the monetary incentive to free ride. By altruistic preferences we mean that a subject's utility is increasing, not only in his or her own payoff, but also in the total group payoff. Warm-glow preferences mean that the act of contributing, independent of how much it increases group payoffs, increases a subject's utility by a fixed amount.

At first blush, these two effects would appear to be the same, but in fact they are not. Unlike the warm-glow explanation, the altruism explanation predicts that increases in group size and/or in the value of the public good should have very large effects on contribution rates. The warm-glow explanation does not depend upon group size or the marginal value of the public good.

The latter two explanations, (c) and (d), are suggested by the tendency for contributions to decline with repetition and with experience. The declines may be consistent with learning or endgame effects.

It is possible that the typical act of contribution is motivated, perhaps to differing degrees, by each of these explanations. One purpose of these experiments is to accurately measure subject behavior in order to cleanly separate between these explanations and ascertain their relative importance. To do so requires major design innovations relative to the standard public goods experiment.
In the typical past experiments, all subjects within a group had the same $r_i$; here different subjects have different $r_i$'s.² In the past, all subjects

² There are a few exceptions, notably Isaac et al. (1985) and Joseph R. Fisher et al. (1995), both of whom consider environments with two incentive types. The latter provides subjects with identical information about other subjects' preferences as in parallel homogeneous preference experiments. The former has several other different features, including nonlinearities, and does not conduct any base-
VOL. 87 NO. 5 PALFREY AND PRISBREY: PUBLIC GOODS EXPERIMENTS

usually had a dominant strategy to free ride, while here the subjects sometimes have a dominant strategy to contribute. In the past, subjects repeated the decision with the same incentives each period; here the subject's incentives change each period.

In earlier experiments, a subject who contributed because of confusion or decision error could not be differentiated from a subject who contributed because of altruism or warm glow. Because $r_i$ was always bigger than $V$, subjects never had an incentive to contribute, and therefore every contribution could be called a decision error. Behavior motivated by altruism or a warm glow, although potentially observed, could not be separately identified. Furthermore, it was impossible even to observe noncontribution when $r_i$ [...]

I. Experimental Design and Procedures

[...] periods (one decision per period), each ten-period sequence with a different group of three other subjects.³ The first two such sequences have the same value of $V$. The last two sequences also have the same value of $V$, but different from the value in the first two sequences. This allows us to identify experience effects. The first sequence with each value of $V$ is coded as inexperienced, and the second sequence as experienced.⁴ All four sequences occur in a single session that lasts approximately 90 minutes. Each session includes 16 subjects.

2. In all our environments, subjects receive $r_i$'s that are randomly assigned according to a uniform distribution between 1 and 20 in unit increments. We sometimes refer to these as token values. Each time a subject is to make a new decision, he or she is independently and randomly assigned a new $r_i$ for that decision. Subjects do not know the other subjects' assignments of $r_i$'s, but the distribution is publicly announced at the beginning of the experiment. The value of $V$ is also publicly announced. Therefore, the data contain multiple observations of the choice behavior of each individual at different values of $r_i$,
and permit the estimation of response functions at both the individual and aggregate levels.

3. We vary the value of the public good, $V$, between experiments. We have an equal number of observations for each of the four different values of $V \in \{3, 6, 10, 15\}$ (see Table 1). One value, $V = 3$, has the feature that group efficiency is not maximized when all subjects contribute in every decision period. In that condition, on average, 40 percent of the time subjects are assigned a token value that is worth more than four times the individual

line experiments with homogeneous preferences. Gerald Marwell and Ruth E. Ames (1980) and D. S. Brookshire et al. (1989) have also conducted experiments with heterogeneous preferences, but these are not comparable for other reasons. None of these experiments varied individual incentives across decisions, nor did they provide explicit information about the distribution of incentives in the population. Palfrey and Howard Rosenthal (1991) use an environment similar to the one explored here, but the public good technology is step-level, not linear.

³ Fixing the groups for a sequence of ten periods was done to maintain comparability with past experiments. We also conducted a replication of one of the Isaac et al. (1984) treatments, using our instructions, computer protocol, and subject pool. We obtained results, reported in Palfrey and Prisbrey (1993), that were similar to Isaac et al. (1984).

⁴ Alternative ways of coding experience produce similar results.
TABLE 1. SESSION NUMBER AND SEQUENCE NUMBERS FOR EACH OF THE EIGHT TREATMENTS

                        Endowment
            1 token                   9 tokens
  V    Session #  Sequence #'s   Session #  Sequence #'s
  3        1          1, 2           2          1, 2
  6        3          1, 2           4          1, 2
 10        1          3, 4           4          3, 4
 15        3          3, 4           2          3, 4

Notes: The experiment consisted of four sessions, each with four ten-period sequences. This table indicates session number and sequence numbers for each of the eight treatments.

marginal value of the public good. In these cases, contribution reduces group efficiency.

4. We vary the endowment. In one condition, everyone is endowed with one indivisible unit of the private good. In the other condition, everyone is endowed with nine discrete units, and can contribute any number between zero and nine in each period (see Table 1).

5. All sessions were conducted at the Caltech Laboratory for Experimental Economics and Political Science, using a collection of computers that are linked together in a network.

6. Each subject was paid cash for each point he or she earned in the session. On average, each individual subject earned approximately $15 in a session.

II. Data Analysis

We focus mainly on two aspects of the data. The first has to do with attempting to identify what we call errors or background noise: behavior that is grossly inconsistent with standard theory. Second, we attempt to measure response functions, the analog to bidding functions in auctions. The response functions address the question: How do contribution decisions depend on the private token values and the public good value, and how do these functions change with our treatment variables, such as experience? We estimate response functions at both the aggregate and individual levels, using probit models.

One can interpret our analysis in the context of a random utility model, of the sort found in Daniel McFadden (1982), G. S. Maddala (1983), and elsewhere, for the analysis of data with limited dependent variables.
For example, in the treatment where subjects have a single indivisible unit of the private good, they face a simple binary decision. We then model the statistical structure by assuming that utility functions have both uncontrolled fixed components (other than the monetary payoff) that we estimate, and an independent Normally distributed random component. Consistent with terminology elsewhere, we call the fixed components the altruism and warm-glow effects, which we differentiate below. The altruism effect measures the additional utility a subject gains from increasing the monetary payoff to other subjects by one unit (Ledyard, 1995). Formally, an altruist's utility is modeled as a convex combination of the group payoff and his private payoff. The warm-glow effect measures the additional utility a subject gains from just the act of contributing a unit of his endowment (James Andreoni, 1988).

Altruistic behavior is present in our data if contributions increase with the public good value, other factors held constant. Warm-glow effects are present if contributions increase with an increase in the difference between the public good value and the token value, other factors held constant. Because we separately vary both the token values for individuals and the public good values, we can identify the effects on contribution rates of these two components of the utility function. This is described in detail in Section II, subsection C.

A. Some Baselines

We first present three different baseline error rates. This gives a rough calibration of a lower bound on the amount of noise in the
TABLE 2. THE FREQUENCY OF SPLITTING WHEN THE ENDOWMENT IS NINE AND DIFF > 0

Early Late
Inexperienced Experienced
0.36 (182) 0.21 (180) 0.19 (176) 0.07 (170)

experiment. By noise, we mean the percent of observed decisions that appear incongruous with nearly any currently accepted theory of rational decision-making. We also compare our baseline with baselines observed elsewhere.

Splitting. By splitting, we mean that a subject contributes some fraction of his or her endowment, but not all of it. This is only a possibility in half of our data, the data where subjects have a divisible endowment. Because of the linear structure of the environment, such behavior is not rational even if a subject is altruistic or experiences an additive warm-glow effect. A subject who plays optimally in this environment will always contribute either all or none of his or her endowment, the choice depending on $r_i - V$.⁵

Table 2 shows the frequency of splitting in the experiments where subjects could split. One can see two striking features: first, splitting is more prominent among inexperienced subjects and in the early periods of each ten-period game; second, splitting almost never occurs when subjects have $r_i - V < 0$. Most splitting can be accounted for by inexperienced subjects who have a dominant strategy to free ride.⁶

These findings contrast somewhat with those of Isaac et al. (1984), who observe splitting in well over half the decisions in their data. Furthermore, in some of their experiments the frequency of splitting does not decline over the course of the ten periods. (See Palfrey and Prisbrey [1993] for details.)

⁵ There are possible rationalizations for splitting that we do not consider here. Kay-Yut Chen (1994) constructs a model in which subjects do not know the payoff they will get from their contribution decisions until they have made their choice. In that case, splitting serves a diversification role. It may also be possible to rationalize splitting if the warm-glow (or altruism) effect is nonlinear in contributions.

⁶ Splitting is heavily concentrated among a few subjects. Only three of the subjects account for 30 percent of all observations of splitting, and six of the subjects account for over half of such observations. At the other end of the scale, nearly 40 percent of the subjects either never split or split only one time (out of 40 chances).

Spite. If cooperative behavior (altruism, warm glow, or reputation building) is the main driving force behind the past findings of overcontribution, then we should not observe free-riding from subjects with $r_i - V < 0$. To the extent that violations of dominant strategies to contribute are observed, they might be attributed to effectively random behavior.⁷ This gives us a second kind of baseline, called spite (Saijo and Nakamura, 1995). In our experiments, 4 percent of the decisions violate the dominant strategy to contribute when $r_i - V < 0$. This number is quite stable across periods and across the experience treatment.

Sacrifice. In one treatment, $V = 3$, the group optimum does not always occur when everyone contributes. In particular, the group payoff is maximized when subjects contribute if and only if $r_i < 4V = 12$. A subject who contributes when $r_i - 12 > 0$ sacrifices more than the entire group benefits. It is hard to imagine any circumstances in which such behavior can be rationalized, except, perhaps, if the warm-glow effects from contributing far outweigh private incentives. Surely such behavior cannot be rationalized for altruists, whose utility is a convex combination of group benefits and private benefits. The frequency of this type of contribution also provides, in a slightly different way, a lower bound on the amount of noise.
Among inexperienced subjects, sacrifice occurs with the same frequency as spite, but virtually disappears with experience (1 observation out of 129).

In summary, the kind of behavior that cannot be explained easily with simple models of warm glow or altruism occurs only rarely in our data, and mostly disappears with experience.

⁷ However, as we show, some of this may be attributable to a negative warm-glow effect in some individuals.
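The three baselines can be expressed as simple rules on a single decision. The Python sketch below is illustrative only: the function name `classify` is hypothetical, and coding spite as contributing nothing when $r_i < V$ is an assumption, since the text does not say how a partial contribution with a divisible endowment should be counted.

```python
def classify(r_i, V, contribution, endowment, group_size=4):
    """Label one decision with the baseline categories it falls into."""
    labels = []
    # Splitting: some, but not all, of the endowment is contributed.
    if 0 < contribution < endowment:
        labels.append("splitting")
    # Spite (assumed coding): nothing contributed although r_i < V.
    if r_i < V and contribution == 0:
        labels.append("spite")
    # Sacrifice: contributing although the token is worth more than the
    # whole group gains, i.e. r_i > N * V.
    if r_i > group_size * V and contribution > 0:
        labels.append("sacrifice")
    return labels


# Share of token values (uniform on 1..20) above 4V = 12 when V = 3.
share_above = sum(1 for r in range(1, 21) if r > 4 * 3) / 20

print(classify(15, 3, 1, 1))   # token worth more than 4V, contributed
print(classify(2, 6, 0, 1))    # dominant strategy to contribute, kept
print(classify(10, 6, 4, 9))   # partial contribution of nine tokens
print(share_above)
```

The last value reproduces the 40 percent figure quoted for the $V = 3$ condition: eight of the twenty equally likely token values exceed 12.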
[Figure 1: plot of the deviation frequency $q$ (vertical axis, 0 to 0.55) against the hypothetical warm-glow effect $g$ (horizontal axis, -15 to 20).]

FIGURE 1. CUTPOINT ANALYSIS: FREQUENCY OF DEVIATIONS FROM THE $g$-OPTIMAL DECISION RULE

Notes: For each hypothetical warm-glow effect, $g$, the graph shows the frequency of deviations from the $g$-optimal decision rule, $q$. The value $g = 1$ has the lowest associated $q$.

B. A Simple Model

For a first look at the data, consider the following very simple model of behavior. Assume that all subjects are identical and that they contribute if and only if the difference between their token value and the public good value is less than or equal to some critical value, or cutpoint, $g$, but that they randomly deviate from this decision rule some fraction of the time, $q$. Call $g$ the warm-glow effect: e.g., if $g > 0$, then the interpretation is that a subject gains $g$ solely from the act of contribution. Given a fixed value of $g$, a subject's $g$-optimal decision rule is:

$$\begin{cases} \text{contribute} & \text{if } (r_i - V) < g \\ \text{keep} & \text{if } (r_i - V) > g \\ \text{keep or contribute} & \text{if } (r_i - V) = g. \end{cases}$$

Despite its simplicity, this class of $(g, q)$ models encompasses a variety of behavior, from completely random decisions ($q = 1$) to the standard model of completely selfish behavior with no error at all ($g = 0$, $q = 0$).

From our data, we can estimate the maximum-likelihood values of $(g, q)$ simply by finding that value of $g$ for which the observed frequency of deviations from the $g$-optimal decision rule is minimized. Within this very simple class of models such a value of $g$ best describes the data. Figure 1 graphs the observed frequency of deviation from the $g$-optimal decision rule, for each integer value of $g$ in the range between -15 and 20. The best estimate is $g = 1$, at which the deviation rate is $q = 0.11$. The standard "selfish" model, $g = 0$, is nearly as good, with a deviation rate of $q = 0.12$.⁸
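The cutpoint search described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' estimation code: the function names and the eight toy decisions are hypothetical, decisions are assumed to be coded as binary (contributed or not), and ties are broken in favor of the smallest $g$.

```python
def deviation_rate(decisions, g):
    """Share of decisions that violate the g-optimal rule.

    decisions: list of (diff, contributed) pairs, with diff = r_i - V
    and contributed a bool. A decision with diff == g never counts as
    a deviation, because either choice is g-optimal there.
    """
    deviations = 0
    for diff, contributed in decisions:
        if diff < g and not contributed:
            deviations += 1
        elif diff > g and contributed:
            deviations += 1
    return deviations / len(decisions)


def best_cutpoint(decisions, candidates=range(-15, 21)):
    """Return (g, q): the candidate g with the lowest deviation rate q."""
    rates = {g: deviation_rate(decisions, g) for g in candidates}
    g_hat = min(rates, key=rates.get)  # first minimum, i.e. smallest g
    return g_hat, rates[g_hat]


# Toy data (illustrative only): (diff, contributed)
toy = [(-5, True), (-2, True), (0, True), (1, True),
       (3, False), (6, False), (2, True), (-4, False)]

print(best_cutpoint(toy))
```

With real data, `decisions` would hold one pair per observed choice, and the resulting `rates` dictionary corresponds to the curve plotted in Figure 1.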
The implication of this very simple analysis is that an aggregate warm-glow effect exists, but it is small in magnitude.⁹ There is overcontribution relative to the selfish theory, but much, if not all, of this overcontribution seems to be explainable as

⁸ Even though the difference in the deviation rate is small, a likelihood ratio test rejects the $g = 0$ model in favor of the $g = 1$ model. The $\chi^2$ statistic is 107.47 with 1 degree of freedom and $n = 2{,}560$.

⁹ The dollar equivalent of the difference between $g = 1$ and $g = 0$ is one cent, in the sense that $g = 1$ corresponds, in experimental payoffs, to behavior in which a subject is willing to contribute his or her endowment if and only if the value of the endowment exceeds the value of the public good by no more than one cent.
noise rather than some systematic component of the decision rule. In the next sections, we examine the nature of the decision rule in detail, giving more consideration to the structure of errors generating deviations, to possible heterogeneity across individuals, and to the role of other factors such as experience and altruism, that are likely to affect contribution decisions.

C. The Probit Model

The probit model provides a standard way to measure the probability of contribution as a function of the different treatment variables, such as the individually assigned token values, the public good value, and experience. The structural model underlying this analysis is the following. We assume that the utility player $i$ gets in period $t$ from contributing $x_{it}$ units of the private good is:

$$U_{it}(x_{it}, x_{-it}) = V_t \sum_j x_{jt} + (g_i - r_{it}) x_{it} + r_{it} w_{it} + a_i \Big[ (N - 1) V_t \sum_j x_{jt} + \sum_{j \neq i} \big[ (g_j - r_{jt}) x_{jt} + r_{jt} w_{jt} \big] \Big],$$

where $V_t$ is the public good value in period $t$, $g_i$ is player $i$'s warm-glow term, $r_{it}$ is player $i$'s token value in period $t$, $w_{it}$ is player $i$'s endowment of tokens in period $t$, $a_i$ is player $i$'s altruism term, and $N$ is the number of players in $i$'s group.

Finally, in order to estimate the model we assume that for each of subject $i$'s decisions at period $t$ there is a random component, $\varepsilon_{it}$, that is added to the warm-glow term. This error term represents some random added propensity for the subject to either contribute or not contribute. We assume that the $\varepsilon_{it}$'s are independent, identical, Normally distributed random variables with mean zero and variance $\sigma^2$. A subject contributes if and only if

$$\varepsilon_{it} \geq (r_{it} - V_t) - g_i - a_i (N - 1) V_t,$$

where the right-hand side contains all the elements of the subject's utility function that determine his or her choice $x_{it}$.

Accordingly, we estimate a probit model, where the probability of contributing a unit of the endowment is given by the cumulative Normal transformation of a linear function of the independent variables in the model.
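Under the decision rule above, the probability of contributing is the probability that the Normal error exceeds the threshold $(r_{it} - V_t) - g_i - a_i (N - 1) V_t$. The Python sketch below evaluates that probability using only the standard library. It is an illustration under stated assumptions: the function names are hypothetical, `sigma` stands for the standard deviation of the error term, and the parameter values in the example are made up rather than estimated.

```python
import math


def normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def contribution_probability(r, V, g, a, sigma, n_players=4):
    """P(eps >= (r - V) - g - a*(N-1)*V) for eps ~ Normal(0, sigma^2)."""
    threshold = (r - V) - g - a * (n_players - 1) * V
    return 1.0 - normal_cdf(threshold / sigma)


# Illustrative parameter values only (not estimates from the paper).
for r in (2, 7, 12):
    print(r, round(contribution_probability(r, V=6, g=1.0, a=0.0, sigma=2.0), 3))
```

The probability equals one half exactly where diff $= g + a(N - 1)V$, which is why a cutpoint can be read off the estimated coefficients; a positive altruism term shifts that cutpoint upward in proportion to $V$, whereas a warm-glow term shifts it by a constant.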
Given our specification of the decision rule of the subject, our independent variables are:

• a constant term, which we call constant;
• the difference $(r_{it} - V_t)$, which we call diff; and
• the value of the public good, V.

In addition, we include three other variables that were controlled in the experiment:

• exper, for experience, which takes on a value of zero for decisions in the first ten-period sequence with a given public good value, and one for decisions in the second ten-period sequence of the same public good value;
• endow, which takes on a value of zero if the endowment is indivisible and one if it is divisible; and
• period, which takes on values from one to ten.

D. The Representative Subject Model

We present estimates from two probit models which differ only in which independent variables are included. Note that in these representative subject models, the warm-glow and altruism effects are implicitly assumed to be the same across individuals. An observation is a contribution decision in a single period.¹⁰

¹⁰ We pool observations across all experiments. Decisions in the divisible endowment treatment (endow = 1) are coded as either 0 or 1, depending on whether subjects contributed less than half or more than half their endowment of tokens in a given period, respectively. Similar conclusions obtain when the two endowment treatment samples are estimated separately. This is addressed in detail in the next section, where some minor differences are also discussed.
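As an illustration of the decision rule in Section C, the following minimal Python sketch evaluates the contribution probability it implies, namely the cumulative Normal of (g - (r - V) + a(N - 1)V)/σ. The function name and all numerical parameter values are hypothetical and chosen only for illustration; they are not estimates from the experiments.

```python
from statistics import NormalDist


def contribution_probability(r, v, g, a, n, sigma):
    """Probability that a subject contributes under the random-utility rule.

    A subject contributes when eps >= (r - v) - g - a*(n - 1)*v, with
    eps ~ Normal(0, sigma^2), so the probability is Phi(index / sigma).
    """
    index = g - (r - v) + a * (n - 1) * v
    return NormalDist().cdf(index / sigma)


# Hypothetical parameter values (illustration only, not estimates).
params = dict(g=2.0, a=0.0, n=4, sigma=4.0)

for token_value in (6, 12, 18):
    p = contribution_probability(r=token_value, v=10, **params)
    print(f"r = {token_value:2d}, diff = {token_value - 10:+d}: P(contribute) = {p:.3f}")
```

With the altruism term set to zero, the probability equals one half exactly when diff equals g, and it falls as the token value rises relative to the public good value.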
836 THE AMERICAN ECONOMIC REVIEW DECEMBER 1997

TABLE 3-ESTIMATES FROM PROBIT MODELS

                                        Probit models
Variable                      1                  2                  3
diff                          -0.25 (-27.63)     -0.17 (-8.12)      -0.27 (-9.60)
exper.d                                          -0.059 (-3.23)     -0.077 (-3.45)
endow.d                                          -0.034 (-1.91)     0.017 (0.66)
period.d                                         -0.0079 (-2.53)    -0.0096 (-2.51)
constant                      0.55 (6.31)        0.52 (3.94)        See Figures 2 and 3
exper                                            0.010 (0.12)       0.025 (0.26)
endow                                            -0.046 (-0.55)
V                             0.0077 (0.90)      0.0089 (1.02)      -0.0020 (-0.19)
period                                           0.0070 (0.46)      0.0058 (0.34)
log likelihood                -810.23            -796.87            -588.92
observations                  2,560              2,560              2,560
percent correctly predicted   86.45              86.60              91.48

Notes: In each probit model the dependent variable is the binary investment decision variable. The asymptotic t-statistic is shown in parentheses next to each coefficient. Variables appended with .d are interactions with diff. Probit models 1 and 2 assume identical fixed effects for all individuals (homogeneity). Probit model 3 estimates separate individual fixed effects for each of the 64 subjects (heterogeneity). These individual effects are displayed in Figures 2 and 3.

The first column of Table 3 reports the results of estimating the probit equation including only the variables constant, diff, and V. Given the specification of the individual utility functions, the coefficient of constant is an estimate of the warm-glow effect divided by the standard deviation of the error term, or $g/\sigma$. The coefficients of diff and V are estimates of $-1/\sigma$ and $a(N-1)/\sigma$. Thus, through algebraic manipulation, we can directly obtain an estimate of the warm-glow effect, $g$, and the altruism effect, $a$.

The results are clear. Both estimates have the predicted positive sign, but the coefficient of V is so small that the altruism parameter is not significantly different from zero. The coefficients of constant and diff are both highly significant, indicating a significantly positive warm-glow effect with $g$ approximately equal to 2.21.¹¹

The estimate of the warm-glow term can be interpreted in the following way.
Define a cutpoint as the difference between the token value of a subject and the public good value at which our prediction of subject behavior switches from noncontribution to contribution, given specific values of the other independent variables in the model.¹² Such a computation gives a cutpoint of approximately 2.5 token value units if V = 10. In other words, on average, with all other variables held fixed at these levels, subjects can be expected to contribute half their endowment when diff = 2.5.

It is instructive to contrast this estimate to the one in the previous section, based on the very simple, two-parameter (g, q) model. With that model the probability of contribution is $1 - q$ if diff < g and equals $q$ if diff > g. Since the cutpoints estimated under the two models differ, i.e., 1 for the (g, q) model versus 2.5 for the probit model, an obvious question is: which model is better?

This is also a question that is relevant to other recent efforts to estimate models of subject decision errors in experiments. One class of models that has been explored is of the (g, q) variety. These are called constant error models because the probability of a decision error is assumed to be independent of other variables in the model.¹³ Another class of models, one that includes the probit model, assumes that decision errors occur more frequently when subjects are nearly indifferent between choices.¹⁴

¹¹ In fact, the t-statistic for $g$ depends on the variances of both the coefficients constant and diff, and the t-statistic for $a$ depends on the variances of both the coefficients diff and V. These can be obtained using Taylor series approximations as explained in Jan Kmenta (1971 p. 444). The resulting t-statistic for $g$ is 5.6010, and the t-statistic for $a$ is 0.9097.

¹² That is, the estimated cutpoint will depend on V in this model.

¹³ See, for example, Richard D. McKelvey and Palfrey (1992), Richard T. Boylan and Mahmoud A. El-Gamal (1993), and David W. Harless and Colin F. Camerer (1994).

¹⁴ For example, quantal response equilibrium as defined by McKelvey and Palfrey (1995). Notice that the probit model we propose to explain the data is formally equivalent to a probit response specification of quantal response equilibrium.
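The algebra that maps the probit coefficients into the warm-glow effect and the cutpoint can be sketched in a few lines of Python. The sketch below uses the rounded column 1 coefficients from Table 3, so its output only approximates the reported figures. The function names are hypothetical, and the covariance between the two coefficient estimates is not reported in the text, so the default of zero is an assumption made purely for illustration; the resulting t-statistic therefore need not match the 5.60 reported in the footnote.

```python
import math


def structural_parameters(b_const, b_diff, b_v):
    """Recover sigma, g, and a*(N - 1) from the probit coefficients.

    The coefficients estimate g/sigma, -1/sigma, and a*(N - 1)/sigma.
    """
    sigma = -1.0 / b_diff
    return sigma, b_const * sigma, b_v * sigma


def cutpoint(b_const, b_diff, b_v, v):
    """Value of diff at which the predicted contribution probability is 1/2."""
    return -(b_const + b_v * v) / b_diff


def warm_glow_t_stat(b_const, b_diff, se_const, se_diff, cov=0.0):
    """First-order Taylor (delta-method) t-statistic for g = -b_const / b_diff.

    cov is the covariance of the two estimates; zero is an assumption.
    """
    g = -b_const / b_diff
    d_const = -1.0 / b_diff           # partial derivative of g w.r.t. b_const
    d_diff = b_const / b_diff ** 2    # partial derivative of g w.r.t. b_diff
    var_g = (d_const ** 2 * se_const ** 2
             + d_diff ** 2 * se_diff ** 2
             + 2.0 * d_const * d_diff * cov)
    return g / math.sqrt(var_g)


# Rounded estimates from column 1 of Table 3.
b_const, b_diff, b_v = 0.55, -0.25, 0.0077
sigma, g, a_times_n_minus_1 = structural_parameters(b_const, b_diff, b_v)
cut_at_v10 = cutpoint(b_const, b_diff, b_v, v=10)

# Standard errors implied by the reported t-statistics (6.31 and -27.63).
t_g = warm_glow_t_stat(b_const, b_diff, 0.55 / 6.31, 0.25 / 27.63)

print(f"sigma = {sigma:.2f}, g = {g:.2f}, a(N-1) = {a_times_n_minus_1:.4f}")
print(f"cutpoint at V = 10: {cut_at_v10:.3f}")
print(f"t-statistic for g with zero covariance: {t_g:.2f}")
```

The cutpoint exceeds g by a(N - 1)V, which is why the estimated cutpoint depends on the public good value in this model.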
Here we see that the estimated warm-glow term is more than twice as large in magnitude in the probit model compared with the constant error (g, q) model. Which estimate is better? There are several ways to conduct such a test, and in all those that we tried, a likelihood ratio test shows the probit model to be the clear winner, at highly significant levels. For example, we conducted a likelihood ratio test between the probit model including only the constant and diff variables and the (g, q) model with g = 1 and q = 0.105. To give the (g, q) model the benefit of the doubt, we assign the likelihood of contribution at the cutpoint (diff = 1) to simply equal the empirical frequency. The likelihood ratio is equal to 93.36. Since the two models are strictly nonnested, we use the Quang H. Vuong (1989) adjustment (rather than a standard chi-square test) to conduct a formal statistical test. This produces a z-statistic of 7.30 (significant at $p < 10^{-11}$).

We next run a probit including the additional control variables exper, endow, and period, and also including the interaction of these variables with diff.¹⁵ (See column 2 of Table 3.) The interaction coefficients measure the effect of the variables on the coefficient of diff, with a negative coefficient indicating that the variance of the random utility term is getting smaller. Behaviorally, this lower variance translates into more predictable behavior by subjects, or steeper probit response curves. Not surprisingly, the interaction coefficients for both the experience variable and the period variable show such an effect, indicating that subject behavior is becoming more predictable over time. Also of interest is the fact that none of the noninteraction coefficients are significant.
Jointly, this implies that the overall effect of experience and repetition is to reduce aggregate contributions, but that this reduction effect is indirect and due to the combination of reduced variance and the fact that the warm-glow level is positive. The estimated difference between cutpoints for inexperienced subjects in round one and experienced subjects in round ten is quite large, with the estimated warm-glow term dropping by nearly 50 percent from 2.7 to 1.6. This suggests that subject confusion may indeed account for a large portion of the contributions by inexperienced subjects.¹⁶

The bottom lines from the aggregate probit analysis are: (1) there is strong evidence for a warm-glow effect leading to voluntary contribution, and (2) there is no significant evidence for an altruism effect. The results also show that much of the decline in contribution from experience and repetition is due to decline in error rates rather than a change in the underlying decision rule. As such, the decline in error rates is a possible explanation for the decay effects observed in some past experiments, an explanation that avoids any recourse to models of reputation building or repeated games.¹⁷

E. The Heterogeneous Subjects Model

The analysis in the previous section assumes that individuals are identical. In fact, there are indications of heterogeneity in our data. Similar indications have also been noted in many other economics and decision experiments (McKelvey and Palfrey, 1992; El-Gamal and David M. Grether, 1995) and in public goods experiments (Isaac et al., 1984). Here, the aggregate analysis of the previous section is broken down at the individual level by including a dummy variable for each individual, from which we can estimate the actual distribution of individual warm-glow effects.¹⁸ The last column of Table 3 reports the coefficients for the included variables.

¹⁵ Interactions with V are not included because the effect of V is insignificant.

¹⁶ The coefficient on the endow treatment variable is insignificant and the coefficient on the interaction between endow and diff is very small (<.01) and barely significant at the 5-percent level. In the later analysis with individual effects, this small effect vanishes.

¹⁷ This also provides a possible explanation for Andreoni's (1988) finding that in a random matching design, there is less decay than in the standard repeated-group design. This could happen if subject learning occurs more slowly in the random matching design, which is plausible since the random matching protocol introduces another source of noise in the feedback received by subjects after each period of play. See Palfrey and Prisbrey (1996) for additional evidence for this explanation.

¹⁸ There are other conceivable sources of heterogeneity in these experiments, including cohort effects, nonlinear warm-glow terms, different varieties of altruistic preferences, or differential error rates across subjects, but an exploration of multidimensional heterogeneity is well
beyond the scope of this paper. For example, simple classification analysis in Palfrey and Prisbrey (1993) suggests the possibility of differential error rates, which we have chosen not to model explicitly here. Nevertheless, we are confident that our specification captures the key component of subject heterogeneity in these experiments. As with any estimation, our results are subject to the usual caveat about other additional (unmeasured) sources of heterogeneity.

¹⁹ The confidence intervals were derived using an estimate of the variance of $g_i$. The estimate was created using a Taylor series approximation as described in Kmenta (1971 p. 444).

[Figure 2: estimated individual warm-glow effects plotted by subject i, with 95-percent confidence intervals.]

FIGURE 2. INDIVIDUAL WARM-GLOW EFFECTS: PERIOD 1, INEXPERIENCED

Note: The estimated individual warm-glow effects, $g_i$, for our 64 subjects (inexperienced/period 1).

The coefficients reported there exclude those for the 64 individual warm-glow effects. A likelihood ratio test shows clearly that the individual effects are statistically significant at any reasonable level of significance. The $\chi^2$ statistic is 415.9 with 63 degrees of freedom. Thus, we reject the representative subject model in favor of the heterogeneous subject model.

The information from the individual coefficients is summarized in Figures 2 and 3, which graph $g_i$, with 95-percent confidence intervals, for inexperienced and experienced subjects, respectively. Each individual cutpoint is calculated from the probit coefficients, in a manner similar to the computation of the aggregate cutpoint in the previous section.¹⁹

The median warm-glow effect is 2.3 for inexperienced subjects in period one and 1.4 for experienced subjects in period ten, which is very close to the aggregate results of the previous section.
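The likelihood ratio statistic reported above can be checked against the log likelihoods in Table 3. The following Python sketch assumes the comparison is between probit models 2 and 3, an inference from the fact that twice the difference of their log likelihoods reproduces 415.9. The function names are hypothetical, and the critical value uses the Wilson-Hilferty approximation to the chi-square quantile rather than an exact table value.

```python
from statistics import NormalDist


def likelihood_ratio_statistic(loglik_restricted, loglik_unrestricted):
    """Twice the gain in log likelihood from relaxing the restrictions."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def chi2_critical_value(df, alpha):
    """Wilson-Hilferty approximation to the upper-alpha chi-square quantile."""
    z = NormalDist().inv_cdf(1.0 - alpha)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3


# Log likelihoods from Table 3: model 2 (homogeneous) and model 3 (heterogeneous).
lr = likelihood_ratio_statistic(loglik_restricted=-796.87,
                                loglik_unrestricted=-588.92)
crit = chi2_critical_value(df=63, alpha=0.01)
reject_homogeneity = lr > crit

print(f"LR statistic = {lr:.1f}, approximate 1% critical value = {crit:.1f}")
print(f"Reject the representative subject model: {reject_homogeneity}")
```

The statistic is several times larger than the approximate 1-percent critical value, consistent with rejecting the representative subject model.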
Considerably less than half the subjects have a warm-glow term that is significantly greater than zero. No subject has one that is significantly negative. The distribution of cutpoints in the experienced, period-ten treatment is clearly less dispersed and has a lower median than the inexperienced distribution, which simply reflects the significant effect of those variables on reducing error rates, as discussed earlier. The decisions are moving in the direction of the predictions of the selfish model, where the warm-glow effect is zero. The confidence interval around each individual cutpoint is bigger because of the compounded effect of the variance of period.d.

The endow variable was excluded because otherwise the model is not identified, i.e., endow, V, and the individual dummies are collinear. To test for any effects due to the endowment, we separately estimate model 3 for the two subsamples defined by the endow = 0 (binary endowment) and endow = 1 (divisible endowment) treatments. The results are reported in Table 4, and the estimated in-