Some Empirical Tests of the Theory of Arbitrage Pricing
Nai-Fu Chen
The Journal of Finance, Vol. 38, No. 5 (Dec., 1983), pp. 1393-1414.
Stable URL: http://links.jstor.org/sici?sici=0022-1082%28198312%2938%3A5%3C1393%3ASETOTT%3E2.0.CO%3B2-2
The Journal of Finance is currently published by the American Finance Association.
THE JOURNAL OF FINANCE, VOL. XXXVIII, NO. 5, DECEMBER 1983

Some Empirical Tests of the Theory of Arbitrage Pricing

NAI-FU CHEN*

ABSTRACT

We estimate the parameters of Ross's Arbitrage Pricing Theory (APT). Using daily return data during the 1963-78 period, we compare the evidence on the APT and the Capital Asset Pricing Model (CAPM) as implemented by market indices and find that the APT performs well. The theory is further supported in that estimated expected returns depend on estimated factor loadings, and variables such as own variance and firm size do not contribute additional explanatory power to that of the factor loadings.

THE ARBITRAGE PRICING THEORY (APT), originally formulated by Ross [35, 36] and extended by Huberman [23] and Connor [13], is an asset pricing model that explains the cross-sectional variation in asset returns. Like the Capital Asset Pricing Model (CAPM) of Sharpe [39], Lintner [26], and Black [2], the APT begins with an assumption on the return generating process: each asset return is linearly related to several, say k, common "global" factors plus its own idiosyncratic disturbance. Then in a well-diversified, frictionless, and perfectly competitive economy, the no arbitrage condition requires that the expected return vector must lie (asymptotically) in the k + 1 dimensional vector space spanned by a vector of all ones and the k vectors of asset response amplitudes (to the k common global factors).

The initial empirical evidence on the model has been rather encouraging (see Gehr [19] and Roll and Ross [34]). In this paper, we shall compare the empirical performance of the APT with that of the CAPM. We shall also test whether the APT can explain some of the empirical "anomalies" related to the CAPM in recent years.

The paper has six sections. In Sections I and II, some basic results related to the APT are given so that testable implications of the model can be clearly identified and the parameters of the model can be estimated.
Section III contains the cross-sectional results of the APT. In Sections IV and V, we attempt to reject the APT by looking at variables that are known to be highly correlated with returns to see if they have any additional explanatory power after the APT parameters are included. We summarize our findings in Section VI. Some mathematical derivations and the estimation procedure for the factors are described in the Appendix.

* Graduate School of Business, University of Chicago. I thank my dissertation committee chairman, Richard Roll, for his guidance and encouragement; Eugene Fama for his suggestions; and Glenn Graves for his assistance in the mathematical programming. I have also benefited from the comments of Armen Alchian, George Constantinides, Tom Copeland, Robert Geske, Jack Hirshleifer, Robert Hamada, Herb Johnson, Edward Leamer, Robert Litzenberger, David Mayers, Merton Miller, Marc Reinganum, Fred Weston, Arnold Zellner, and especially Stephen Brown and Michael Gibbons. This research was partially supported by the Research Program in Competition and Business Policy at the University of California, Los Angeles, and a University of California, Santa Barbara Academic Senate grant 8581557074277.

I. The Arbitrage Pricing Theory and Its Implications

A. A Brief Review of the APT

Assume that asset markets are perfectly competitive and frictionless and that individuals believe that returns on assets are generated by a k-factor model, so that the return on the ith asset can be written as:

$$\tilde r_i = E_i + b_{i1}\tilde\delta_1 + \cdots + b_{ik}\tilde\delta_k + \tilde\epsilon_i \qquad (1)$$

where $E_i$ is the expected return; $\tilde\delta_j$, $j = 1, \ldots, k$, are the mean zero factors common to all assets; $b_{ij}$ is the sensitivity of the return on asset i to the fluctuations in factor j; and $\tilde\epsilon_i$ is the "nonsystematic" risk component idiosyncratic to the ith asset with $E(\tilde\epsilon_i \mid \tilde\delta_j) = 0$ for all j. In a well diversified economy with no arbitrage opportunity, the equilibrium expected return on the ith asset is given by

$$E_i = \lambda_0 + \lambda_1 b_{i1} + \lambda_2 b_{i2} + \cdots + \lambda_k b_{ik} \qquad (2)$$

If there exists a riskless (or a "zero beta") asset, its return will be $\lambda_0$. The other parameters, $\lambda_1, \ldots, \lambda_k$, can be interpreted as risk premiums corresponding to risk factors $\tilde\delta_1, \ldots, \tilde\delta_k$. In other words, $\lambda_1$ is the expected return per unit of long investment of a portfolio with zero net investment and $b_{p1} = 1$ and $b_{p2} = \cdots = b_{pk} = 0$.

B. Factor Analysis and the Estimation of the Factor Loadings

The procedure to estimate factor loadings (i.e., the $b_{ij}$'s) for all assets corresponding to the same set of common factors is quite involved and expensive. We first do a factor analysis on an initial subset of assets, and then we extend the factor structure of the subset to the entire sample. This is accomplished via a large scale mathematical programming exercise. Section II contains a brief outline of the procedure.

It is clear that the development of the theory of arbitrage pricing is quite separate from the factor analysis. We use factor analysis here only as a statistical tool to uncover the pervasive forces (factors) in the economy by examining how asset returns covary together.
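As an illustration of the interpretation of the risk premiums in Section I.A, the following Python sketch builds a zero-net-investment portfolio with unit loading on the first factor and zero loadings on the others, and checks that its expected return under the pricing relation $E_i = \lambda_0 + \sum_j \lambda_j b_{ij}$ equals $\lambda_1$. The loadings, the premiums, and the helper `factor_portfolio` are hypothetical and simulated, not taken from the paper, and the sketch ignores the normalization per unit of long investment.

```python
import numpy as np

rng = np.random.default_rng(0)
n_assets, k = 50, 3                       # hypothetical sizes
B = rng.normal(size=(n_assets, k))        # simulated factor loadings b_ij
lam = np.array([0.0003, 0.0005, -0.0002, 0.0001])  # made-up lambda_0..lambda_k

# Pricing relation: E_i = lambda_0 + sum_j lambda_j * b_ij
E = lam[0] + B @ lam[1:]

def factor_portfolio(B, j):
    """Minimum-norm weights with zero net investment, unit loading on
    factor j, and zero loading on every other factor."""
    n, k = B.shape
    A = np.vstack([np.ones(n), B.T])      # (k + 1) x n constraint matrix
    target = np.zeros(k + 1)
    target[1 + j] = 1.0                   # row 0 is the zero-net-investment constraint
    w, *_ = np.linalg.lstsq(A, target, rcond=None)
    return w

w1 = factor_portfolio(B, 0)
premium = w1 @ E                          # expected return of the portfolio
print(premium, lam[1])
```

Because the constraints are linear, any portfolio that satisfies them earns $\lambda_1$; the minimum-norm solution returned by `lstsq` is only one convenient choice.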
As with any statistical method, the result of factor analysis is meaningful only when the method is applied to a representative sample. In the present context, the initial subset to which the factor analysis is applied should consist of a large random sample of securities of net positive supply in the economy; thus the sample would be closely representative of the risks borne by investors. In a recent article, Shanken [37] points out some of the potential pitfalls of testing the APT when the factored covariance matrix is unrepresentative of the covariation of assets in the economy. By forming portfolios from any given set of assets, Shanken demonstrates that factor analysis can produce many different factor structures from the manipulated portfolios. In the extreme case where the constructed portfolios are mutually uncorrelated, factor analysis produces no common factor. Of course, forming uncorrelated portfolios by longing and shorting securities merely repackages the risk bearing and potential reward associated with the original securities and does not alter the fundamental forces and characteristics inherent in the economy. However, as a statistical tool, factor analysis can no longer detect those pervasive forces from such manipulated portfolios. This, of course, should not be construed as a criticism of the theory or of the testability of the APT, but rather should serve as a reminder of the potential problems involved in doing statistical analysis on unrepresentative samples. In this paper, we select 180 securities for each of the initial factor analyses. If we miss an important factor because of unrepresentativeness, all the tests that follow will be biased against the APT.

1 See Ross [35, 36], Huberman [23], and Connor [13] for the formal development. See also Ingersoll [24], Chen and Ingersoll [10], Grinblatt and Titman [22], and Dybvig [16].

C. Testable Hypothesis of the APT

We regard Equation (2) as the main result of the APT that explains the cross-sectional differences in asset returns, and it is (2) that will be tested in the following sections.

A logical first step in testing (2) would be to look for priced factors. However, the task of finding priced factors turns out to be not particularly straightforward. If we have determined that k factors exist (in the sense of Connor [13]) in the generating process of asset returns in the economy, then the number of priced factors (as long as there is at least one) is not well defined.
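The claim that the number of priced factors is not well defined can be checked numerically. The sketch below, written in Python with simulated loadings and made-up premiums (nothing here comes from the paper's data), uses a Householder reflection as one possible orthogonal transformation: it maps a premium vector `lam` to a vector `u` of the same length that has a single nonzero entry, and confirms that expected returns in excess of $\lambda_0$ are unchanged. The function name `householder_to` is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_assets, k = 40, 5
B = rng.normal(size=(n_assets, k))            # simulated loadings on the original factors
lam = np.array([0.4, -0.2, 0.1, 0.3, -0.1])   # hypothetical premia, arbitrary units

def householder_to(lam, u):
    """Orthogonal matrix H with H @ lam == u; requires lam'lam == u'u."""
    v = lam - u
    if np.allclose(v, 0.0):
        return np.eye(len(lam))
    v = v / np.linalg.norm(v)
    return np.eye(len(lam)) - 2.0 * np.outer(v, v)

# Target premium vector: same length as lam, but only one priced factor.
u = np.zeros(k)
u[0] = np.linalg.norm(lam)

H = householder_to(lam, u)
B_new = B @ H.T          # loadings on the rotated factors H @ delta

# Excess expected returns are identical under both factor sets.
print(np.allclose(B @ lam, B_new @ u))
```

Choosing a different `u` with the same norm, for example one with two nonzero entries, gives two priced factors by the same construction.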
Intuitively, the indeterminacy in the number of priced factors can be most easily seen by noting that a k factor pricing equation can always be collapsed into a single beta equation via mean-variance efficient set mathematics (and with an additional orthogonal transformation similar to that described below, the number of priced factors can be arbitrary). Mathematically, let $\lambda$ be the risk premium vector for a particular set of risk factors and let $u$ be any vector of the same dimension with $u'u = \lambda'\lambda$. Then there exists an orthogonal transformation that will transform the original set of factors to a new set of orthogonal factors whose associated risk premium vector is $u$. In other words, if it has been established that k factors are present, the number of priced factors can be any number between 1 and k. It should be emphasized here, however, that those factors that are not priced are just as important as those that are priced in an individual's investment decision (see Breeden [4], Constantinides [12], and Roll [32] for related issues). They are irrelevant only in predicting expected return. This should be borne in mind when interpreting the cross-sectional results in Section III.

A question that naturally arises in this investigation is how the APT fares against other asset pricing models. It is immediately apparent from Equations (1) and (2) that any model that predicts a linear relation between "risk" and return is potentially consistent with the APT. For example, if we let $\tilde\delta_1$ be the return on the market and $\tilde\delta_2$ be changes in interest rate, we obtain Merton's [27] two factor pricing equation. Therefore, unless the "market" and/or the "factors" can be properly identified, accepting the APT (i.e., not being able to reject the APT) does not necessarily reject the other pricing equations. However, the results of the APT can be compared with other models as popularly implemented. In Section III, we examine the cross-sectional results of the APT and the CAPM.

It has been suggested that the APT is so general that it is not rejectable. Fortunately, this is not true. The most important result of the APT is that only those risks that are reflected in the covariance matrix are priced, nothing else. So if we are able to find a variable that is priced even after the factor loadings (FL) are accounted for, the APT would be rejected. For example, if the daily return to every small firm were increased by 1% per day, the returns to small firms would be statistically significantly higher than the returns to large firms no matter how many factors were extracted from the covariance matrix to account for risk ex ante, as long as the number of factors is small relative to the number of securities. In Sections IV and V, we attempt to reject the APT using variables such as own variance and size of firm equity. Those variables are chosen because of the well documented high correlation between them and the average returns.

2 The term "pervasive forces" was made popular by Connor [13]. See Shanken [37] for his interpretation of Connor's result and its implications to the APT. Contrary to some beliefs, Shanken's results were not driven by the idiosyncratic terms in a finite sample or the approximate nature of Ross' original formulation. The no common factor result can be obtained in an economy with exactly k factors and no idiosyncratic risks.

II. Estimation of the Factor Loadings

A. The Data

The data are described in Table I. All the parameters in Sections III, IV, and V are computed using only data within each subperiod so that we may have four independent tests of each hypothesis. The computation of the $b_{ij}$'s, the factor loadings (FL), uses only data from odd days within each subperiod.
Even day returns are reserved for testing purposes.

3 It is often asserted that the CAPM is a special case of the APT. This is true if there exists a rotation of the factors such that one of the factors is the "market." Ex ante, there is no reason to assume this is the case in our finite economy. Ex post, we discover in this study as well as subsequent studies [7, 11] that the first factor is highly correlated with a measure of the "market" […] weighted NYSE stock index. However, throughout this study, we maintain the ex ante position that the CAPM and the APT are nonnested for hypothesis testing.

4 […] the number of pervasive factors that influence the […] is no more than a few and certainly less than, say, 20. We insert "small" […] pathological case where the number of factors is equal to the number of […] pattern of returns can be explained. In the present study, the maximum number […] 180, the size of the initial factor analysis, while the number of securities in each […]

B. Methodology

The details of the estimation procedure are described in the Appendix; the following is a brief outline.

(i) The first 180 stocks in the sample (alphabetically) are selected and their sample covariance matrix computed. The choice 180 is the upper limit placed by the processing capacity of the IBM 3033 used (in 1980) for the computation of the factor loadings.
(ii) The first ten factor loadings for each stock are obtained with the computer software package EFAP II.
(iii) Five portfolios are formed using linear programming so that the resultant portfolios will balance estimation errors with other desirable properties.⁵ The time series of the five portfolios will contain linear combinations of the $\tilde\delta_1, \ldots, \tilde\delta_k$ (i.e., the factor scores).
(iv) The first five factor loadings are produced for every stock in the sample by solving a matrix equation (Equation (A1) in the Appendix).

Table I
Data Description

Source: Center for Research in Security Prices, University of Chicago […]
Time period: 1963-78 inclusive. The entire period is divided into four subperiods: I. 1963-66, II. 1967-70, III. 1971-74, IV. 1975-78.
Selection criterion: All the securities that do not have missing data during […]. Their average daily return in absolute value is less than 0.01, to eliminate outliers. (Only Burma Mining Inc. was excluded with this criterion, during the first subperiod.)
Basic data unit: Return adjusted for all capital changes and including […]
Number of selected securities (total sample, by subperiod): […] II: 1522 […]

One of the difficulties in empirically testing the APT is that it does not tell us what the number of common factors should be. Since ex post data are being used to test for an ex ante relation, the number of factors to be included must be independently determined and prespecified in order to avoid potential data mining and to give the alternative hypothesis a fair chance ex post. Five were selected based on previous empirical studies (see Roll and Ross [34] and Reinganum [30]). A study by Brown and Weinstein [5] also confirms that the number of pervasive factors is probably no greater than five.⁶

5 This is the GUB programming within the elastic programming in the XS mathematical programming system developed by Glenn Graves, UCLA.

6 Based on the analysis and the plot of eigenvalues, for each period five factors also look sufficient.

III. Cross-Sectional Results

A. The APT and the CAPM

To see how well the data support the models, we examine the result of cross-sectional regression of assets' returns on the hypothesized parameters in each of the subperiods. The independent variables will be the FL for the APT and the betas for the CAPM (both computed on the odd days):

$$r_i = \lambda_0 + \lambda_1 b_{i1} + \cdots + \lambda_5 b_{i5} + \epsilon_i \quad \text{(APT)} \qquad (3)$$

$$r_i = \lambda_0 + \lambda_1 \beta_i + \eta_i \quad \text{(CAPM)} \qquad (4)$$

The returns are computed on the even days of each subperiod. The betas are computed with market proxies: (1) the S&P 500 index, (2) the value weighted stock index, and (3) the equally weighted stock index. The returns on the indices are taken from the CRSP tape index file.

The result of the regression is given in Table II, Parts A and B. The adjusted $R^2$ comes from the cross-sectional regression of assets' average (even day) returns over the entire period on the independent variables. The estimated coefficients and their t statistics come from the time series of cross-sectional regression coefficients as in the studies of Black, Jensen, and Scholes (BJS) [3] and Fama and MacBeth (FM) [17]. In our case, we first compute the average of every five (even) day returns and perform a cross-sectional regression on each of them, thus generating a time series for each estimated coefficient. The mean and the t statistic are then derived from the time series.⁷ Almost all the serial correlation coefficients are insignificant; therefore, the time series sample may be treated as essentially independent. A nonparametric test is also performed on the time series of each coefficient to hedge against nonnormality of the population. Here the "sign test" is used to test whether the median is zero. Since the power of the nonparametric test is in general lower than parametric tests, both significance at the 0.1 level and at the 0.05 level are reported next to the estimated coefficient. The Hotelling $T^2$ (see Morrison [28]) in Table II is computed from the time series of $\lambda_1, \ldots, \lambda_5$. The $T^2$ statistics are reported alongside the F's because it is easier to add up $T^2$ (which asymptotically approaches $\chi^2$). The interpretation of these statistics follows.

In looking at the results in Table II, Part A, recall the rotation indeterminacy associated with factor analysis (Section I.C above).
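The time series of cross-sectional regressions described above can be sketched as follows. This Python sketch, run on simulated data, averages returns in blocks of five days, runs one cross-sectional least-squares regression per block, and takes the mean and t statistic of each coefficient from the resulting time series; it also counts signs as the raw input to a sign test. The function names, sample sizes, and premiums are assumptions for illustration, and the even-day selection and odd-day loading estimation are taken as already done.

```python
import numpy as np

def fama_macbeth(returns, loadings, block=5):
    """Mean and t statistic of cross-sectional regression coefficients.

    returns  : (T, N) array of daily returns on the test days
    loadings : (N, k) array of factor loadings (or a single beta column)
    """
    T, N = returns.shape
    n_blocks = T // block
    # Average each run of `block` consecutive test days.
    avg = returns[: n_blocks * block].reshape(n_blocks, block, N).mean(axis=1)
    X = np.column_stack([np.ones(N), loadings])
    # One cross-sectional OLS per block; lstsq solves all blocks at once.
    coefs, *_ = np.linalg.lstsq(X, avg.T, rcond=None)
    series = coefs.T                                  # (n_blocks, k + 1)
    mean = series.mean(axis=0)
    se = series.std(axis=0, ddof=1) / np.sqrt(n_blocks)
    return mean, mean / se, series

def sign_counts(x):
    """Counts of positive and negative values, the inputs to a sign test."""
    return int((x > 0).sum()), int((x < 0).sum())

# Simulated example: one priced factor out of five (all numbers made up).
rng = np.random.default_rng(2)
N, k, T = 200, 5, 500
B = rng.normal(size=(N, k))
lam = np.array([0.0004, 0.0010, 0.0, 0.0, 0.0, 0.0])
factors = rng.normal(scale=0.01, size=(T, k))
noise = rng.normal(scale=0.02, size=(T, N))
R = lam[0] + B @ lam[1:] + factors @ B.T + noise

mean, tstat, series = fama_macbeth(R, B)
pos, neg = sign_counts(series[:, 1])
print(mean.round(5), tstat.round(2), pos, neg)
```

In each block the estimated premium equals the true premium plus the realized factor return for that block, which is why the estimates are noisy even with many securities.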
Because of this indeterminacy, comparison of the $\hat\lambda_j$'s across time periods is not meaningful. The only exceptions are the $\hat\lambda_0$'s, which are the estimated expected return of a zero-FL asset. It also happens that the first FL of each asset is highly correlated with the $\beta$ of CAPM. The simple correlation between the $b_{i1}$ and the $\beta_i$ (for each market proxy) is in the neighborhood of 0.95. Almost all the $b_{i1}$'s are negative; therefore one would expect the risk premium (i.e., the estimated $\lambda_1$) to be negative. Indeed, all the estimated $\lambda_1$'s are negative (even for the period 1967-70 when the estimated market premium is negative, see Table II, Part B); however, only the first and the fourth periods' estimated $\lambda_1$'s are significantly different from zero. This is consistent with the result in Table II, Part B, where the CAPM $\beta$ is priced only in those two periods. As for the other estimated $\lambda$'s, there is no a priori information on their signs; therefore one can judge only by their significance level whether that factor is priced. From Table II, Part A, it is interesting to note that there is at least one significant factor besides $\lambda_1$ for every period.

7 The error covariance matrix of the estimated premia is $(X'X)^{-1} X' \Sigma X (X'X)^{-1}$, where $\Sigma$ is the cross-sectional covariance matrix of the idiosyncratic terms under the null hypothesis $\lambda_1 = \lambda_2 = \cdots = \lambda_5 = 0$. If the $b_{ij}$ are estimated with error, the error matrix for the risk premia remains the same under the null hypothesis in some special cases. See Gibbons [21] or Shanken [38].

[Table II, Parts A and B: cross-sectional regression results for the APT (Part A) and the CAPM (Part B). The table entries are not recoverable from the source.]

To see whether the APT has any explanatory power in cross-section, we test the null hypothesis that $\lambda_1 = \lambda_2 = \cdots = \lambda_5 = 0$. This is equivalent to the hypothesis that the expected returns of all assets are the same and equal to $\lambda_0$. The Hotelling $T^2$ is a multivariate test (see Morrison [28]) appropriate for this hypothesis. However, it is tricky to apply the test to the APT because, as mentioned previously, the APT only requires at least one of the factors to be priced. If we then choose only the first few most significant factors to be included in the Hotelling $T^2$, obviously the Type I error is underestimated and there is a bias against the null hypothesis. If we always take all five, the test will be weak and the Type II error will be large. Despite the relatively low power of the test when all five are included, the F statistic is still significant at least at the 0.1 level for every period (see Table II, Part A), so the overall significance level would be very high. If we assume independence across the four test periods, we can sum up the $T^2$'s ($\sum T^2 = 60.96$), which asymptotically approaches $\chi^2$ with 20 degrees of freedom, with a critical value of 45.3 at the 0.001 level. Therefore, with the FL we can confidently reject the null hypothesis of constant expected return across assets.

B. A Comparison between the APT and the CAPM¹⁰

The next question that comes up naturally is: which of the two competing models, CAPM or APT, do the data favor? To answer this, one is tempted to do a regression with both CAPM beta and the FL as independent variables. However, this is not done because: (i) this specification is not justified on theoretical grounds and the two models are nonnested (see also footnote 3), and (ii) the betas and the FL are intended to measure the same thing, namely risk. Thus, to put them together in a regression would mean including the same variable twice on the right-hand side.
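The Hotelling $T^2$ computation used in Section III.A can be sketched as below. The function `hotelling_t2` (a hypothetical name) applies the standard one-sample formula $T^2 = n\,\bar x' S^{-1} \bar x$ and the usual conversion $F = \frac{n-p}{p(n-1)}\,T^2$ to a time series of coefficients. As a rough check on the summed statistic, the last lines apply the same conversion to $\sum T^2 = 60.96$ with $p = 20$; the value $n = 98$ is an inference from the degrees of freedom (20, 78) reported in a footnote to the text, not a number stated in the paper, and whether the paper used exactly this conversion is an assumption.

```python
import numpy as np

def hotelling_t2(series):
    """One-sample Hotelling T^2 for H0: the mean vector is zero.

    series : (n, p) array, e.g. a time series of p estimated premia.
    Returns (T^2, F), where F ~ F(p, n - p) under multivariate normality.
    """
    series = np.asarray(series, dtype=float)
    n, p = series.shape
    xbar = series.mean(axis=0)
    S = np.atleast_2d(np.cov(series, rowvar=False))   # sample covariance, ddof=1
    t2 = n * xbar @ np.linalg.solve(S, xbar)
    f = (n - p) / (p * (n - 1)) * t2
    return float(t2), float(f)

# With a single variable, T^2 reduces to the squared t statistic.
t2_demo, f_demo = hotelling_t2(np.array([[1.0], [2.0], [3.0], [4.0]]))

# Rough check on the summed statistic quoted in the text.
t2_sum, p_total = 60.96, 20
n_obs = 98            # assumption: inferred from the (20, 78) degrees of freedom
f_overall = (n_obs - p_total) / (p_total * (n_obs - 1)) * t2_sum
print(round(f_overall, 2))   # about 2.45, close to the 2.46 quoted in the footnote
```

The small gap between 2.45 and 2.46 may reflect rounding in the reported statistics or a slightly different conversion.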
The high degree of multicollinearity (intertwined with measurement error) tends to produce regression coefficients that make no sense.

One possible way to discriminate among nonnested alternative models was suggested in Davidson and MacKinnon [14]. Let $\hat r_{APT}$ and $\hat r_{CAPM}$ be the expected returns generated by the APT and the CAPM, i.e., they are obtained from the cross-sectional regressions of (3) and (4) (on even days) without the error terms. Then if we estimate $\alpha$ in

$$r_i = \alpha\,\hat r_{APT,i} + (1-\alpha)\,\hat r_{CAPM,i} + e_i \qquad (5)$$

(cf. [14], Equation (4)), we would expect $\alpha$ to be close to 1 if the APT is the correct model relative to the CAPM. We analyze (5) rather than many of its close alternatives (see [14]) because of the symmetric treatment of $\hat r_{APT}$ and $\hat r_{CAPM}$. The penalty for regressing (5) is that the asymptotic standard error of $\alpha$ is underestimated. A multiplicative adjustment is necessary. However, since there is so much data in the stock returns, the mean and standard error of $\alpha$ can be estimated directly from its time series, which is obtained by performing (5) in subintervals within each period. Each subinterval contains five (even) days. The point estimate of $\alpha$, which is contained in Table III, is persuasive in favor of the APT even though in many cases the estimated $\alpha$ is significantly different from 1.

8 Gibbons [20] showed that a much more efficient estimate of the risk premium is possible by pooling cross-sectional time series data in a seemingly unrelated regression. The standard error in his study was roughly one third of the BJS-FM standard error. Unfortunately, there are over a thousand securities in our cross-sectional sample. Thus the SUR, which requires inversion of the sample covariance matrix, is not feasible. Furthermore, the probability limit of the Hotelling $T^2$ statistic is biased downward in the presence of nonsystematic measurement error.

9 An approximate F statistic reconstructed from the overall $T^2$ statistic is 2.46 with (20, 78) df.

10 I am indebted to Stephen Brown for a discussion on this topic.

Table III
Estimated Weights of the Expected Return from the APT and CAPM

$$r_i = \alpha\,\hat r_{APT,i} + (1-\alpha)\,\hat r_{CAPM,i} + e_i$$

[Most table entries are not recoverable from the source. Legible fragments: 0.994, (0.014), (0.010); 1975-78: 0.953, 0.970, 0.994.]

Another comparison, based on theoretically sound foundations, is suggested by the Bayesians. If the residuals of (3) and (4) satisfy the i.i.d. multivariate normal assumption, posterior odds ratios can be computed to provide a selection rule. With diffuse prior and some assumptions, the formula for posterior odds in favor of model 1 over model 0 is given by

$$R = \left[\frac{ESS_0}{ESS_1}\right]^{N/2} N^{(k_0 - k_1)/2}$$

(cf. Leamer [25, p. 114]), where ESS is the error sum of squares, N is the number of observations, and k is the dimension of the respective models.

The posterior odds thus computed are, with one exception, overwhelmingly in favor of the APT over the CAPM as implemented by the three market indices. The odds ratios in favor of the APT are never less than 3.64E+3 for all four periods and all three market indices and are as high as 5.17E+19, except for the equally weighted stock index in the period 1975-78 (R = 2.94E-4).

Unfortunately, while we try to extract all the cross-sectional covariance through factor analysis in the case of APT, the same cannot be said about CAPM. In fact, it is well known that the residuals across firms tend to be positively

11 For a survey of posterior odds methods, see Zellner [41].

12 See Zellner [40], Chapter 10, pp. 306-312 for details. The side assumptions are those that reduce the posterior odds ratio to goodness of fit. See related issues in Gaver and Geisel [18] and Leamer [25]. I thank Edward Leamer and Arnold Zellner for a discussion on this topic.
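The two model comparisons of Section III.B, the weight $\alpha$ in Equation (5) and the posterior odds ratio $R$, can be sketched in Python as follows. The data are simulated so that two factors are priced while the "beta" is a noisy copy of the first loading only; the function names, the sample size, and all numbers are assumptions for illustration, and the odds are reported in base-10 logarithms to avoid overflow. In-sample fitted values are used here, which is a simplification.

```python
import numpy as np

def ols_fit(X, y):
    """Fitted values and error sum of squares from least squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    resid = y - fitted
    return fitted, float(resid @ resid)

def mixing_weight(r, fit_apt, fit_capm):
    """alpha in r = alpha * fit_apt + (1 - alpha) * fit_capm + e,
    estimated by regressing (r - fit_capm) on (fit_apt - fit_capm)."""
    d = fit_apt - fit_capm
    return float(d @ (r - fit_capm) / (d @ d))

def log10_posterior_odds(ess0, ess1, n, k0, k1):
    """log10 of R = (ESS0 / ESS1)^(N/2) * N^((k0 - k1)/2),
    the odds in favor of model 1 over model 0."""
    return (n / 2) * np.log10(ess0 / ess1) + ((k0 - k1) / 2) * np.log10(n)

# Simulated cross-section (arbitrary units, not the paper's data).
rng = np.random.default_rng(3)
N = 1000
B = rng.normal(size=(N, 5))                      # five factor loadings
beta = B[:, 0] + 0.3 * rng.normal(size=N)        # proxy beta, noisy first loading
r = 0.05 + 0.10 * B[:, 0] + 0.08 * B[:, 1] + 0.05 * rng.normal(size=N)

ones = np.ones(N)
fit_apt, ess_apt = ols_fit(np.column_stack([ones, B]), r)
fit_capm, ess_capm = ols_fit(np.column_stack([ones, beta]), r)

alpha = mixing_weight(r, fit_apt, fit_capm)
log_odds = log10_posterior_odds(ess_capm, ess_apt, N, k0=2, k1=6)
print(round(alpha, 3), round(float(log_odds), 1))
```

The term $N^{(k_0 - k_1)/2}$ penalizes the larger model, so the odds favor the five-factor specification only when its reduction in the error sum of squares outweighs the four extra parameters.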