THE JOURNAL OF FINANCE • VOL. LIV, NO. 6 • DECEMBER 1999

A Unified Theory of Underreaction, Momentum Trading, and Overreaction in Asset Markets

HARRISON HONG and JEREMY C. STEIN*

ABSTRACT

We model a market populated by two groups of boundedly rational agents: "newswatchers" and "momentum traders." Each newswatcher observes some private information, but fails to extract other newswatchers' information from prices. If information diffuses gradually across the population, prices underreact in the short run. The underreaction means that the momentum traders can profit by trend-chasing. However, if they can only implement simple (i.e., univariate) strategies, their attempts at arbitrage must inevitably lead to overreaction at long horizons. In addition to providing a unified account of under- and overreactions, the model generates several other distinctive implications.

OVER THE LAST SEVERAL YEARS, a large volume of empirical work has documented a variety of ways in which asset returns can be predicted based on publicly available information. Although different studies have used a host of different predictive variables, many of the results can be thought of as belonging to one of two broad categories of phenomena. On the one hand, returns appear to exhibit continuation, or momentum, in the short to medium run. On the other hand, there is also a tendency toward reversals, or fundamental reversion, in the long run.1

It is becoming increasingly clear that traditional asset-pricing models, such as the capital asset pricing model (CAPM) of Sharpe (1964) and Lintner (1965), Ross's (1976) arbitrage pricing theory (APT), or Merton's (1973) intertemporal capital asset pricing model (ICAPM), have a hard time explaining the growing set of stylized facts. In the context of these models, all of the predictable patterns in asset returns, at both short and long horizons, must ultimately be traced to differences in loadings on economically meaningful risk factors. And there is little affirmative evidence at this point to suggest that this can be done.

* Stanford Business School and MIT Sloan School of Management and NBER. This research is supported by the National Science Foundation and the Finance Research Center at MIT. We are grateful to Denis Gromb, René Stulz, an anonymous referee, and seminar participants at MIT, Michigan, Wharton, Duke, UCLA, Berkeley, Stanford, and Illinois for helpful comments and suggestions. Thanks also to Melissa Cunniffe for help in preparing the manuscript.

1 We discuss this empirical work in detail and provide references in Section I below.
As an alternative to these traditional models, many are turning to "behavioral" theories, where "behavioral" can be broadly construed as involving some departure from the classical assumptions of strict rationality and unlimited computational capacity on the part of investors. But the difficulty with this approach is that there are a potentially huge number of such departures that one might entertain, so it is hard to know where to start.

In order to impose some discipline on the process, it is useful to articulate the criteria that a new theory should be expected to satisfy. There seems to be broad agreement that to be successful, any candidate theory should, at a minimum: (1) rest on assumptions about investor behavior that are either a priori plausible or consistent with casual observation; (2) explain the existing evidence in a parsimonious and unified way; and (3) make a number of further predictions that can be subject to "out-of-sample" testing and that are ultimately validated. Fama (1998) puts particular emphasis on the latter two criteria: "Following the standard scientific rule, market efficiency can only be replaced by a better model. . . . The alternative has a daunting task. It must specify what it is about investor psychology that causes simultaneous underreaction to some types of events and overreaction to others. . . . And the alternative must present well-defined hypotheses, themselves potentially rejectable by empirical tests."

A couple of recent papers take up this challenge. Both Barberis, Shleifer, and Vishny (1998) and Daniel, Hirshleifer, and Subrahmanyam (1998) assume that prices are driven by a single representative agent, and then posit a small number of cognitive biases that this representative agent might have. They then investigate the extent to which these biases are sufficient to simultaneously deliver both short-horizon continuation and long-horizon reversals.2

In this paper, we pursue the same goal as Barberis et al. (1998) and Daniel et al. (1998), that of building a unified behavioral model. However, we adopt a fundamentally different approach. Rather than trying to say much about the psychology of the representative agent, our emphasis is on the interaction between heterogeneous agents. To put it loosely, less of the action in our model comes from particular cognitive biases that we ascribe to individual traders, and more of it comes from the way these traders interact with one another.

2 We have more to say about these and other related papers in Section V below.

3 Although the model is simpler with just these two types of traders, the results are robust to the inclusion of a set of risk-averse, fully rational arbitrageurs, as shown in Section III.B.

More specifically, our model features two types of agents, whom we term "newswatchers" and "momentum traders." Neither type is fully rational in the usual sense. Rather, each is boundedly rational, with the bounded rationality being of a simple form: each type of agent is only able to "process" some subset of the available public information.3 The newswatchers make forecasts based on signals that they privately observe about future fundamentals; their limitation is that they do not condition on current or past
prices. Momentum traders, in contrast, do condition on past price changes. However, their limitation is that their forecasts must be "simple" (i.e., univariate) functions of the history of past prices.4

In addition to imposing these two constraints on the information processing abilities of our traders, we make one further assumption, which is more orthodox in nature: Private information diffuses gradually across the newswatcher population. All our conclusions then flow from these three key assumptions. We begin by showing that when only newswatchers are active, prices adjust slowly to new information: there is underreaction but never overreaction. As is made clear later, this result follows naturally from combining gradual information diffusion with the assumption that newswatchers do not extract information from prices.

Next, we add the momentum traders. It is tempting to conjecture that because the momentum traders can condition on past prices, they arbitrage away any underreaction left behind by the newswatchers; with sufficient risk tolerance, one might expect that they would force the market to become approximately efficient. However, it turns out that this intuition is incomplete if momentum traders are limited to simple strategies. For example, suppose that a momentum trader at time t must base his trade only on the price change over some prior interval, say from t - 2 to t - 1. We show that in this case, momentum traders' attempts to profit from the underreaction caused by newswatchers lead to a perverse outcome: The initial reaction of prices in the direction of fundamentals is indeed accelerated, but this comes at the expense of creating an eventual overreaction to any news. This is true even when momentum traders are risk neutral.
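The acceleration-then-overshoot pattern described above can be illustrated with a small numerical sketch. The code below is not the model solved later in the paper; it is a stylized illustration under assumptions that are ours alone: a single unit of good news reaches a fraction 1/z of newswatchers each period, the newswatcher-only price equals the fraction informed, and the momentum order at time t is a fixed, hypothetical multiple `phi` of the price change from t - 2 to t - 1 that passes one-for-one into the price. The function names `news_only_path` and `momentum_path` are likewise hypothetical.

```python
def news_only_path(z, horizon):
    """Price response to a unit shock when news reaches 1/z of the
    newswatchers per period (assumed linear diffusion; long-run value is 1)."""
    return [min((t + 1) / z, 1.0) for t in range(horizon)]


def momentum_path(z, phi, horizon):
    """Same shock, plus a momentum order at time t equal to phi times the
    price change from t-2 to t-1 (phi is an assumed, exogenous coefficient)."""
    news = news_only_path(z, horizon)
    prices = []
    for t in range(horizon):
        p_lag1 = prices[t - 1] if t >= 1 else 0.0  # price at t-1 (0 before the news)
        p_lag2 = prices[t - 2] if t >= 2 else 0.0  # price at t-2 (0 before the news)
        prices.append(news[t] + phi * (p_lag1 - p_lag2))
    return prices


if __name__ == "__main__":
    z, phi, horizon = 4, 0.5, 12
    paths = zip(news_only_path(z, horizon), momentum_path(z, phi, horizon))
    for t, (n, p) in enumerate(paths):
        print(f"t={t:2d}  news-only={n:.3f}  with-momentum={p:.3f}")
```

With z = 4 and phi = 0.5, the news-only path rises monotonically to 1 and never exceeds it, whereas the path with momentum trading moves toward 1 faster, peaks above it at t = 3, and then oscillates back toward the long-run value. In the paper the intensity of momentum trading is determined in equilibrium rather than fixed in advance, so the numbers here carry no quantitative meaning.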
Again, the key to this result is the assumption that momentum traders use simple strategies; that is, they do not condition on all public information. Continuing with the example, if a momentum trader's order at time t is restricted to being a function of just the price change from t - 2 to t - 1, it is clear that it must be an increasing function. On average, this simple trend-chasing strategy makes money. But if one could condition on more information, it would become apparent that the strategy does better in some circumstances than in others. In particular, the strategy earns the bulk of its profits early in the "momentum cycle" (by which we mean shortly after substantial news has arrived to the newswatchers) and loses money late in the cycle, by which time prices have already overshot long-run equilibrium values.

4 The constraints that we put on traders' information-processing abilities are arguably not as well-motivated by the experimental psychology literature as the biases in Barberis et al. (1998) or Daniel et al. (1998), and so may appear to be more ad hoc. However, they generate new and clear-cut asset-pricing predictions, some of which have already been supported in recent tests. See Section IV below.

To see this point, suppose that there is a single dose of good news at time t and no change in fundamentals after that. The newswatchers cause prices to jump at time t, but not far enough, so that they are still below their
long-run values. At time t + 1 there is a round of momentum purchases, and those momentum buyers who get in at this time make money. But this round of momentum trading creates a further price increase, which sets off more momentum buying, and so on. Later momentum buyers (i.e., those buying at t + i for some i) lose money, because they get in at a price above the long-run equilibrium.

Thus a crucial insight is that "early" momentum buyers impose a negative externality on "late" momentum buyers.5 Ideally, one uses a momentum strategy because a price increase signals that there is good news about fundamentals out there that is not yet fully incorporated into prices. But sometimes, a price increase is the result not of news but just of previous rounds of momentum trade. Because momentum traders cannot directly condition on whether or not news has recently arrived, they do not know whether they are early or late in the cycle. Hence they must live with this externality, and accept the fact that sometimes they buy when earlier rounds of momentum trading have pushed prices past long-run equilibrium values.

Although we make two distinct bounded-rationality assumptions, our model can be said to "unify" underreaction and overreaction in the following sense. We begin by modeling a tendency for one group of traders to underreact to private information. We then show that when a second group of traders tries to exploit this underreaction with a simple arbitrage strategy, they only partially eliminate it, and in so doing, create an excessive momentum in prices that inevitably culminates in overreaction. Thus, the very existence of underreaction sows the seeds for overreaction, by making it profitable for momentum traders to enter the market. Or, said differently, the unity lies in the fact that our model gets both underreaction and overreaction out of just one primitive type of shock: Gradually diffusing news about fundamentals. There are no other exogenous shocks to investor sentiment and no liquidity-motivated trades.

In what follows, we develop a simple infinite-horizon model that captures these ideas. We begin in Section I by giving an overview of the empirical evidence that motivates our work. In Section II, we present and solve the basic model, and do a number of comparative statics experiments. Section III contains several extensions. In Section IV, we draw out the model's empirical implications. Section V discusses related work, and Section VI concludes.

5 As we discuss below, this "momentum externality" is reminiscent of the herding models of Banerjee (1992), Bikhchandani, Hirshleifer, and Welch (1992), and Scharfstein and Stein (1990).

I. Evidence of Continuation and Reversals

A. Continuation

The continuation evidence can be decomposed along the following lines. First, returns tend to exhibit unconditional positive serial correlation at horizons on the order of three to twelve months. This is true both in cross
sections of individual stocks (Jegadeesh and Titman (1993)) and for a variety of broad asset classes (Cutler, Poterba, and Summers (1991)).6 One possible interpretation of this unconditional evidence, which fits with the spirit of the model below, is that information which is initially private is incorporated into prices only gradually.

Second, conditional on observable public events, stocks tend to experience post-event drift in the same direction as the initial event impact. The types of events that have been examined in detail and that fit this pattern include: Earnings announcements (perhaps the most-studied type of event in this genre; see, e.g., Bernard (1992) for an overview); stock issues and repurchases; dividend initiations and omissions; and analyst recommendations.7 Recent work by Chan, Jegadeesh, and Lakonishok (1996) shows that these two types of continuation are distinct: In a multiple regression, both past returns and public earnings surprises help to predict subsequent returns at horizons of six months and one year.

B. Reversals and Fundamental Reversion

One of the first and most influential papers in the reversals category is DeBondt and Thaler (1985), who find that stock returns are negatively correlated at long horizons. Specifically, stocks that have had the lowest returns over any given five-year period tend to have high returns over the subsequent five years, and vice versa.8 A common interpretation of this result is that when there is a sustained streak of good news about an asset, its price overshoots its "fundamental value" and ultimately must experience a correction. More recent work in the same spirit has continued to focus on long-horizon predictability, but has used what are arguably more refined indicators of fundamental value, such as book-to-market and cash flow-to-price ratios. (See, e.g., Fama and French (1992) and Lakonishok, Shleifer, and Vishny (1994).)9

6 Rouwenhorst (1998, 1999) finds that Jegadeesh and Titman's (1993) U.S. results carry over to many other developed and emerging markets, though they are not statistically significant for every country individually (see, e.g., Haugen and Baker (1996) on weak momentum in Japan).

7 References include: Bernard and Thomas (1989, 1990) on earnings announcements; Loughran and Ritter (1995) and Spiess and Affleck-Graves (1995) on stock issues; Ikenberry, Lakonishok, and Vermaelen (1995) on repurchases; Michaely, Thaler, and Womack (1995) on dividend initiations and omissions; and Womack (1996) on analyst recommendations.

8 These results have been controversial, but seem to have stood up to scrutiny (Chopra, Lakonishok, and Ritter (1992)). There are also direct analogs in the time series of aggregate market returns, although the statistical power is lower. See Fama and French (1988), Poterba and Summers (1988), and Cutler, Poterba, and Summers (1991).

9 These results have also been found to be robust in international data (Fama and French (1998), Rouwenhorst (1999)). And again, there are analogous fundamental reversion patterns in the time-series literature on aggregate market predictability (Campbell and Shiller (1988)).

C. Is It Risk?

In principle, the patterns noted above could be consistent with traditional models, to the extent that they reflect variations in risk, either over time or across assets. Fama and French (1993, 1996) argue that many of the long-
horizon results (such as return reversals, the book-to-market effect, and the cashflow-to-price effect) can be largely subsumed within a three-factor model that they interpret as a variant of the APT or ICAPM. However, this position has been controversial, since there is little affirmative evidence that the Fama-French factors correspond to economically meaningful risks. Indeed, several recent papers demonstrate that the contrarian strategies that exploit long-horizon overreaction are not significantly riskier than average.10

There seems to be more of a consensus that the short-horizon underreaction evidence cannot be explained in terms of risk. Bernard and Thomas (1989) reject risk as an explanation for post-earnings-announcement drift, and Fama and French (1996) remark that the continuation results of Jegadeesh and Titman (1993) constitute the "main embarrassment" for their three-factor model.

II. The Model

A. Price Formation with Newswatchers Only

As mentioned above, our model features two classes of traders, newswatchers and momentum traders. We begin by describing how the model works when only the newswatchers are present. At every time t, the newswatchers trade claims on a risky asset. This asset pays a single liquidating dividend at some later time T. The ultimate value of this liquidating dividend can be written as: $D_T = D_0 + \sum_{j=0}^{T} \epsilon_j$, where all the $\epsilon$'s are independently distributed, mean-zero normal random variables with variance $\sigma^2$. Throughout, we consider the limiting case where T goes to infinity. This simplifies matters by allowing us to focus on steady-state trading strategies, that is, strategies that do not depend on how close we are to the terminal date.11 In order to capture the idea that information moves gradually across the newswatcher population, we divide this population into z equal-sized groups. We also assume that every dividend innovation $\epsilon_j$ can be decomposed into z independent subinnovations, each with the same variance $\sigma^2/z$: $\epsilon_j = \epsilon_j^1 + \cdots + \epsilon_j^z$.
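As a numerical check on the decomposition just described, the sketch below draws each dividend innovation as the sum of z independent normal subinnovations, each with variance sigma^2 / z, and confirms that the sum has variance close to sigma^2. The function names `draw_innovation` and `liquidating_dividend`, the parameter values, and the random seed are illustrative assumptions rather than anything specified in the text.

```python
import random
import statistics


def draw_innovation(z, sigma, rng):
    """Return the z independent subinnovations of one dividend innovation.
    Each subinnovation is normal with mean zero and variance sigma^2 / z."""
    sub_sd = sigma / z ** 0.5
    return [rng.gauss(0.0, sub_sd) for _ in range(z)]


def liquidating_dividend(d0, innovations):
    """D_T = D_0 plus the sum of all innovations, each given as a list of
    its subinnovations."""
    return d0 + sum(sum(subs) for subs in innovations)


rng = random.Random(0)   # fixed seed so the check is reproducible
z, sigma = 4, 1.0        # illustrative values only
draws = [draw_innovation(z, sigma, rng) for _ in range(20000)]

# Variance of the summed innovation; should be close to sigma^2.
sample_var = statistics.pvariance([sum(subs) for subs in draws])

if __name__ == "__main__":
    print(f"sample variance of innovations: {sample_var:.3f}")
    print(f"D_T over the first 10 draws, D_0 = 100: "
          f"{liquidating_dividend(100.0, draws[:10]):.3f}")
```

Because the subinnovations are independent, their variances add, which is why each one must have variance sigma^2 / z for the total to have variance sigma^2.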
The timing of information release is then as follows.At timet,news about e+-1 begins to spread.Specifically,at time t,newswatcher group 1 observes e+-1, group 2 observes e2-1,and so forth,through group z,which observes e+-1. Thus at time t each subinnovation ofe+has been seen by a fraction 1/z of the total population. Next,at time t +1,the groups "rotate,"so that group 1 now observes e-1,group 2 observes e1,and so forth,through group z,which now observes e-1.Thus at time t+1 the information has spread further,and 10 See Lakonishok et al.(1994)and MacKinlay (1995).Daniel and Titman(1997)directly dispute the idea that the book-to-market effect can be given a risk interpretation. 11 A somewhat more natural way to generate an infinite-horizon formulation might be to allow the asset to pay dividends every period.The only reason we push all the dividends out into the infinite future is for notational simplicity.In particular,when we consider the strat- egies of short-lived momentum traders below,our approach allows us to have these strategies depend only on momentum traders'forecasts of price changes,and we can ignore their forecasts of interim dividend payments
horizon results—such as return reversals, the book-to-market effect, and the cashflow-to-price effect—can be largely subsumed within a three-factor model that they interpret as a variant of the APT or ICAPM. However, this position has been controversial, since there is little affirmative evidence that the Fama–French factors correspond to economically meaningful risks. Indeed, several recent papers demonstrate that the contrarian strategies that exploit long-horizon overreaction are not significantly riskier than average.10 There seems to be more of a consensus that the short-horizon underreaction evidence cannot be explained in terms of risk. Bernard and Thomas ~1989! reject risk as an explanation for post-earnings-announcement drift, and Fama and French ~1996! remark that the continuation results of Jegadeesh and Titman ~1993! constitute the “main embarrassment” for their three-factor model. II. The Model A. Price Formation with Newswatchers Only As mentioned above, our model features two classes of traders, newswatchers and momentum traders. We begin by describing how the model works when only the newswatchers are present. At every time t, the newswatchers trade claims on a risky asset. This asset pays a single liquidating dividend at some later time T. The ultimate value of this liquidating dividend can be written as: DT 5 D0 1 (j50 T ej , where all the e’s are independently distributed, mean-zero normal random variables with variance s2. Throughout, we consider the limiting case where T goes to infinity. This simplifies matters by allowing us to focus on steady-state trading strategies—that is, strategies that do not depend on how close we are to the terminal date.11 In order to capture the idea that information moves gradually across the newswatcher population, we divide this population into z equal-sized groups. 
We also assume that every dividend innovation $\varepsilon_j$ can be decomposed into $z$ independent subinnovations, each with the same variance $\sigma^2/z$: $\varepsilon_j = \varepsilon_j^1 + \cdots + \varepsilon_j^z$.

The timing of information release is then as follows. At time $t$, news about $\varepsilon_{t+z-1}$ begins to spread. Specifically, at time $t$, newswatcher group 1 observes $\varepsilon_{t+z-1}^1$, group 2 observes $\varepsilon_{t+z-1}^2$, and so forth, through group $z$, which observes $\varepsilon_{t+z-1}^z$. Thus at time $t$ each subinnovation of $\varepsilon_{t+z-1}$ has been seen by a fraction $1/z$ of the total population. Next, at time $t+1$, the groups "rotate," so that group 1 now observes $\varepsilon_{t+z-1}^2$, group 2 observes $\varepsilon_{t+z-1}^3$, and so forth, through group $z$, which now observes $\varepsilon_{t+z-1}^1$. Thus at time $t+1$ the information has spread further, and each subinnovation of $\varepsilon_{t+z-1}$ has been seen by a fraction $2/z$ of the total population. This rotation process continues until time $t+z-1$, at which point every one of the $z$ groups has directly observed each of the subinnovations that comprise $\varepsilon_{t+z-1}$. So $\varepsilon_{t+z-1}$ has become totally public by time $t+z-1$. Although this formulation may seem unnecessarily awkward, the rotation feature is useful because it implies that even as information moves slowly across the population, on average everybody is equally well-informed.¹² This symmetry makes it transparently simple to solve for prices, as is seen momentarily.

In this context, the parameter $z$ can be thought of as a proxy for the (linear) rate of information flow; higher values of $z$ imply slower information diffusion. Of course, the notion that information spreads slowly is more appropriate for some purposes than others. In particular, this construct is fine if our goal is to capture the sort of underreaction that shows up empirically as unconditional positive correlation in returns at short horizons. However, if we are also interested in capturing phenomena like post-earnings-announcement drift, where there is apparently underreaction even to data that is made available to everyone simultaneously, we need to embellish the model. We discuss this embellishment later; for now it is easiest to think of the model as only speaking to the unconditional evidence on underreaction.

All the newswatchers have constant absolute risk aversion (CARA) utility with the same risk-aversion parameter, and all live until the terminal date $T$. The riskless interest rate is normalized to zero, and the supply of the asset is fixed at $Q$. So far, all these assumptions are completely orthodox. We now make two that are less conventional. First, at every time $t$, newswatchers formulate their asset demands based on the static-optimization notion that they buy and hold until the liquidating dividend at time $T$.¹³ Second, and more critically, while newswatchers can condition on the information sets described above, they do not condition on current or past prices. In other words, our equilibrium concept is a Walrasian equilibrium with private valuations, as opposed to a fully revealing rational expectations equilibrium.

As suggested in the Introduction, these two unconventional assumptions can be motivated based on a simple form of bounded rationality. One can think of the newswatchers as having their hands full just figuring out the implications of the $\varepsilon$'s for the terminal dividend $D_T$. This leaves them unable to also use current and past market prices to form more sophisticated forecasts of $D_T$ (our second assumption); it also leaves them unable to make any forecasts of future price changes, and hence unable to implement dynamic strategies (our first assumption).

¹⁰ See Lakonishok et al. (1994) and MacKinlay (1995). Daniel and Titman (1997) directly dispute the idea that the book-to-market effect can be given a risk interpretation.

¹¹ A somewhat more natural way to generate an infinite-horizon formulation might be to allow the asset to pay dividends every period. The only reason we push all the dividends out into the infinite future is for notational simplicity. In particular, when we consider the strategies of short-lived momentum traders below, our approach allows us to have these strategies depend only on momentum traders' forecasts of price changes, and we can ignore their forecasts of interim dividend payments.

¹² Contrast this with a simpler setting where group 1 always sees all of $\varepsilon_{t+z-1}$ first, then group 2 sees it second, etc. In this case, group 1 newswatchers are better-informed than their peers.

¹³ There is an element of time-inconsistency here, since in fact newswatchers may adjust their positions over time. Ignoring the dynamic nature of newswatcher strategies is more significant when we add momentum traders to the model, so we discuss this issue further in Section II.B.
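The rotation schedule for a single dividend innovation can be sketched in a few lines of Python. This is an illustrative sketch rather than anything taken from the paper: the function names `observed_pieces` and `fraction_informed` are hypothetical, groups and subinnovations are both indexed 1 through z, and `periods_elapsed = 0` is assumed to mean the period in which news about the innovation first appears.

```python
def observed_pieces(group, periods_elapsed, z):
    """Subinnovation indices (1..z) that `group` has seen so far.

    Assumed convention: in the first period group g sees piece g, and each
    later period it sees the next piece, wrapping around from z back to 1.
    """
    seen = min(periods_elapsed + 1, z)  # nothing new remains after z periods
    return {((group - 1 + s) % z) + 1 for s in range(seen)}


def fraction_informed(piece, periods_elapsed, z):
    """Share of the z equal-sized groups that have observed `piece`."""
    informed = [
        g for g in range(1, z + 1)
        if piece in observed_pieces(g, periods_elapsed, z)
    ]
    return len(informed) / z


if __name__ == "__main__":
    z = 4
    for k in range(z):
        print(k, fraction_informed(1, k, z))
```

Under these conventions the share of the population that has seen any given subinnovation rises by 1/z each period and reaches one after z periods, which is the linear diffusion the text describes.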
Given these assumptions, and the symmetry of our setup, the conditional variance of fundamentals is the same for all newswatchers, and the price at time $t$ is given by

$$P_t = D_t + \{(z-1)\varepsilon_{t+1} + (z-2)\varepsilon_{t+2} + \cdots + \varepsilon_{t+z-1}\}/z - \theta Q, \tag{1}$$

where $\theta$ is a function of newswatchers' risk aversion and the variance of the $\varepsilon$'s. For simplicity, we normalize the risk aversion so that $\theta = 1$ hereafter. In words, equation (1) says that the new information works its way linearly into the price over $z$ periods. This implies that there is positive serial correlation of returns over short horizons (of length less than $z$). Note also that prices never overshoot their long-run values, or, equivalently, that there is never any negative serial correlation in returns at any horizon.

Even given the eminently plausible assumption that private information diffuses gradually across the population of newswatchers, the gradual-price-adjustment result in equation (1) hinges critically on the further assumption that newswatchers do not condition on prices. For if they did (and as long as $Q$ is nonstochastic) the logic of Grossman (1976) would imply a fully revealing equilibrium, with a price $P_t^*$ following a random walk given by (for $\theta = 1$):¹⁴

$$P_t^* = D_{t+z-1} - Q. \tag{2}$$

We should therefore stress that we view the underreaction result embodied in equation (1) to be nothing more than a point of departure. As such, it raises an obvious next question: Even if newswatchers are too busy processing fundamental data to incorporate prices into their forecasts, cannot some other group of traders focus exclusively on price-based forecasting, and in so doing generate an outcome close to the rational expectations equilibrium of equation (2)? It is to this central question that we turn next, by adding the momentum traders into the mix.

B. Adding Momentum Traders to the Model

Momentum traders also have CARA utility. Unlike the newswatchers, however, they have finite horizons.
In particular, at every time $t$, a new generation of momentum traders enters the market. Every trader in this generation takes a position, and then holds this position for $j$ periods, that is, until time $t+j$. For modeling purposes, we treat the momentum traders' horizon $j$ as an exogenous parameter.

The momentum traders transact with the newswatchers by means of market orders. They submit quantity orders, not knowing the price at which these orders will be executed. The price is then determined by the competition among the newswatchers, who double as market makers in this setup. Thus, in deciding the size of their orders, the momentum traders at time $t$ must try to predict $(P_{t+j} - P_t)$. To do so, they make forecasts based on past price changes. We assume that these forecasts take an especially simple form: The only conditioning variable is the cumulative price change over the past $k$ periods; that is, $(P_{t-1} - P_{t-k-1})$.

As it turns out, the exact value of $k$ is not that important, so in what follows we simplify things by setting $k = 1$, and using $(P_{t-1} - P_{t-2}) \equiv \Delta P_{t-1}$ as the time-$t$ forecasting variable.¹⁵ What is more significant is that we restrict the momentum traders to making univariate forecasts based on past price changes. If, in contrast, we allow them to make forecasts using $n$ lags of price changes, giving different weights to each of the $n$ lags, we suspect that for sufficiently large $n$, many of the results we present below would go away. Again, the motivation is a crude notion of bounded rationality: Momentum traders simply do not have the computational horsepower to run complicated multivariate regressions.

¹⁴ Strictly speaking, this result also requires that there be an initial "date 0" at which everybody is symmetrically informed.

With $k = 1$, the order flow from generation-$t$ momentum traders, $F_t$, is of the form

$$F_t = A + \phi \Delta P_{t-1}, \tag{3}$$

where the constant $A$ and the elasticity parameter $\phi$ have to be determined from optimization on the part of the momentum traders. This order flow must be absorbed by the newswatchers. We assume that the newswatchers treat the order flow as an uninformative supply shock. This is consistent with our prior assumption that the newswatchers do not condition on prices. Given that the order flow is a linear function of past price changes, if we allowed the newswatchers to extract information from it, we would be indirectly allowing them to learn from prices.

To streamline things, the order flow from the momentum traders is the only source of supply variation in the model. Given that there are $j$ generations of momentum traders in the market at any point in time, the aggregate supply $S_t$ absorbed by the newswatchers is given by:

$$S_t = Q - \sum_{i=1}^{j} F_{t+1-i} = Q - jA - \sum_{i=1}^{j} \phi \Delta P_{t-i}. \tag{4}$$
We continue to assume that, at any time $t$, the newswatchers act as if they buy and hold until the liquidating dividend at time $T$. This implies that prices are given exactly as in equation (1), except that the fixed supply $Q$ is replaced by the variable $S_t$, yielding

$$P_t = D_t + \{(z-1)\varepsilon_{t+1} + (z-2)\varepsilon_{t+2} + \cdots + \varepsilon_{t+z-1}\}/z - Q + jA + \sum_{i=1}^{j} \phi \Delta P_{t-i}. \tag{5}$$

¹⁵ In the NBER working paper version, we provide a detailed analysis of the comparative statics properties of the model with respect to $k$.
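First-differencing equation (5) removes the constants $Q$ and $jA$ and leaves a recursion in price changes: $\Delta P_t = \frac{1}{z}\sum_{i=0}^{z-1}\varepsilon_{t+i} + \phi\,\Delta P_{t-1} - \phi\,\Delta P_{t-j-1}$. The Python sketch below iterates that recursion. It is illustrative only: the function name `momentum_price_changes`, the choice to set price changes before the start of the sample to zero, and the parameter values in the example are assumptions, not part of the paper.

```python
def momentum_price_changes(eps, z, j, phi):
    """Price changes implied by first-differencing equation (5).

    eps[t] is the dividend innovation that becomes fully public at time t.
    Price changes dated before the sample start are assumed to be zero.
    """
    n = len(eps) - (z - 1)  # need eps[t .. t+z-1] for every t
    dp = []
    for t in range(n):
        news = sum(eps[t:t + z]) / z                 # gradual diffusion term
        prev = dp[t - 1] if t >= 1 else 0.0          # newest generation's signal
        old = dp[t - j - 1] if t >= j + 1 else 0.0   # generation that just exited
        dp.append(news + phi * prev - phi * old)
    return dp


if __name__ == "__main__":
    # Hypothetical example: one unit shock, immediate diffusion (z = 1).
    changes = momentum_price_changes([1.0, 0.0, 0.0, 0.0], z=1, j=1, phi=0.5)
    level, path = 0.0, []
    for c in changes:
        level += c
        path.append(level)
    print(changes, path)
```

With `phi = 0` the recursion collapses to the z-period moving average implied by equation (1). In the hypothetical example (z = 1, j = 1, phi = 0.5), the first three price changes are 1, 0.5, and -0.25, so the cumulative response peaks at 1.5, above the unit shock, and then begins to reverse.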
In most of the analysis, the constants $Q$ and $A$ play no role, so we disregard them when it is convenient to do so.

As noted previously, newswatchers' behavior is time-inconsistent. Although at time $t$ they base their demands on the premise that they do not retrade, they violate this to the extent that they are active in later periods. We adopt this time-inconsistent shortcut because it dramatically simplifies the analysis. Otherwise, we face a complex dynamic programming problem, with newswatcher demands at time $t$ depending not only on their forecasts of the liquidating dividend $D_T$ but also on their predictions for the entire future path of prices.

Two points can be offered in defense of this time-inconsistent simplification. First, it fits with the basic spirit of our approach, which is to have the newswatchers behave in a simple, boundedly rational fashion. Second, we have no reason to believe that it colors any of our important qualitative conclusions. Loosely speaking, we are closing down a "frontrunning" effect, whereby newswatchers buy more aggressively at time $t$ in response to good news, since they know that the news will kick off a series of momentum trades and thereby drive prices up further over the next several periods.¹⁶ Such frontrunning by newswatchers may speed the response of prices to information, thereby mitigating underreaction, but in our setup it can never wholly eliminate either underreaction or overreaction.¹⁷

C. The Nature of Equilibrium

With all of the assumptions in place, we are now ready to solve the model. The only task is to calculate the equilibrium value of $\phi$. Disregarding constants, optimization on the part of the momentum traders implies

$$\phi \Delta P_{t-1} = \gamma E_M(P_{t+j} - P_t)/\mathrm{var}_M(P_{t+j} - P_t), \tag{6}$$

where $\gamma$ is the aggregate risk tolerance of the momentum traders, and $E_M$ and $\mathrm{var}_M$ denote the mean and variance given their information, which is just $\Delta P_{t-1}$. We can rewrite equation (6) as

$$\phi = \gamma\, \mathrm{cov}(P_{t+j} - P_t, \Delta P_{t-1})/\{\mathrm{var}(\Delta P)\, \mathrm{var}_M(P_{t+j} - P_t)\}. \tag{7}$$
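The right-hand side of equation (7) can be evaluated on any series of price changes by replacing population moments with sample moments. The Python sketch below does this under two assumptions that are not stated in the paper: price changes are treated as jointly normal, so the conditional variance $\mathrm{var}_M(P_{t+j} - P_t)$ is taken to be $\mathrm{var}(P_{t+j} - P_t) - \mathrm{cov}^2/\mathrm{var}(\Delta P_{t-1})$, and moments are computed by dividing by the sample size. The function name `implied_phi` is hypothetical. The sketch only evaluates the formula for a given series; it does not search for the fixed point that defines equilibrium.

```python
def implied_phi(dp, j, gamma):
    """Sample analogue of the right-hand side of equation (7).

    dp[t] is the price change at time t; the signal is dp[t-1] and the
    forecast target is P_{t+j} - P_t = dp[t+1] + ... + dp[t+j].
    """
    xs, ys = [], []
    for t in range(1, len(dp) - j):
        xs.append(dp[t - 1])
        ys.append(sum(dp[t + 1:t + j + 1]))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two usable observations")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

    # Conditional variance under the joint-normality assumption.
    cond_var = var_y - cov ** 2 / var_x
    return gamma * cov / (var_x * cond_var)
```

One way to use such a function would be to simulate price changes under a trial value of $\phi$, evaluate `implied_phi` on the simulated series, and check whether the two values agree; a value of $\phi$ at which they agree would be a candidate fixed point in the sense described in the text.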
¹⁶ This sort of frontrunning effect is at the center of DeLong et al. (1990).

¹⁷ See the NBER working paper version for a fuller treatment of this frontrunning issue.

The definition of equilibrium is a fixed point such that $\phi$ is given by equation (7), while at the same time price dynamics satisfy equation (5). We restrict ourselves to studying covariance-stationary equilibria. In Appendix A we prove that a necessary condition for a conjectured equilibrium process to be covariance stationary is that $|\phi| < 1$. Such an equilibrium may not exist for arbitrary parameter values, and we are also unable to generically rule out the possibility of multiple equilibria. However, we prove in the appendix that existence is guaranteed so long as the risk tolerance $\gamma$ of the