Data-Snooping Biases

ordered but correspond to the ordering of the $X_{i:N}$'s.⁷ For example, if $X_i$ is firm size and $\hat{\alpha}_i$ is the intercept from a market-model regression of firm $i$'s excess return on the excess market return, then $\hat{\alpha}_{[j:N]}$ is the $\hat{\alpha}$ of the $j$th smallest of the $N$ firms. We call this procedure induced ordering of the $\hat{\alpha}_i$'s.

It is apparent that if we construct a test statistic by choosing $n$ securities according to the ordering (3), the sampling theory cannot be the same as that of $n$ securities selected independently of the data. From the following remarkably simple result by Yang (1977), an asymptotic sampling theory for test statistics based on induced order statistics may be derived analytically:⁸

Theorem 1.1. Let the vectors $[X_i \;\; \hat{\alpha}_i]'$, $i = 1, \ldots, N$, be independently and identically distributed and let $1 < i_1 < i_2 < \cdots < i_n < N$ be sequences of integers such that, as $N \to \infty$, $i_k/N \to \xi_k \in (0, 1)$ $(k = 1, 2, \ldots, n)$. Then

$$\lim_{N \to \infty} \Pr\left(\hat{\alpha}_{[i_1:N]} < a_1, \ldots, \hat{\alpha}_{[i_n:N]} < a_n\right) = \prod_{k=1}^{n} \Pr\left(\hat{\alpha}_k < a_k \mid F_X(X_k) = \xi_k\right), \tag{4}$$

where $F_X(\cdot)$ is the marginal cumulative distribution function of $X_i$.

Proof. See Yang (1977).
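The content of Theorem 1.1 can be checked numerically. The following is a minimal Monte Carlo sketch, not part of the original analysis. It assumes, purely for illustration, that $(X_i, \hat{\alpha}_i)$ are standard bivariate normal with correlation $\rho$. The function names, the sample sizes, and the choice of the 27th and 45th percentiles are hypothetical. The sketch repeatedly draws $N$ pairs, sorts on $X$, and records the $\hat{\alpha}$'s sitting at the two chosen relative ranks.

```python
from statistics import NormalDist

import numpy as np


def induced_order_statistics(x, alpha, ranks):
    """Return the alpha values of the units whose x has the given 1-based ranks."""
    order = np.argsort(x)  # order[j-1] is the index of the j-th smallest x
    return alpha[order[np.asarray(ranks) - 1]]


def limiting_means(rho, fractions):
    """Limiting means rho * Phi^{-1}(xi_k) under the assumed bivariate normal model."""
    return np.array([rho * NormalDist().inv_cdf(f) for f in fractions])


def simulate(n_firms=1000, n_reps=2000, rho=0.5, fractions=(0.27, 0.45), seed=0):
    """Draw the induced order statistics at the given relative ranks, n_reps times."""
    rng = np.random.default_rng(seed)
    ranks = [int(round(f * n_firms)) for f in fractions]
    draws = np.empty((n_reps, len(ranks)))
    for r in range(n_reps):
        x = rng.standard_normal(n_firms)  # characteristic X_i (assumed normal)
        noise = rng.standard_normal(n_firms)
        alpha = rho * x + np.sqrt(1.0 - rho**2) * noise  # corr(X_i, alpha_i) = rho
        draws[r] = induced_order_statistics(x, alpha, ranks)
    return draws


if __name__ == "__main__":
    rho, fractions = 0.5, (0.27, 0.45)
    draws = simulate(rho=rho, fractions=fractions)
    print("sample correlation:", np.corrcoef(draws, rowvar=False)[0, 1])
    print("sample means:      ", draws.mean(axis=0))
    print("limiting means:    ", limiting_means(rho, fractions))
    print("sample variances:  ", draws.var(axis=0), "vs limit", 1.0 - rho**2)
```

Under the assumed model, the right-hand side of (4) is a product of normal distributions with means $\rho\,\Phi^{-1}(\xi_k)$ and common variance $1 - \rho^2$. The sample correlation between the two recorded $\hat{\alpha}$'s should therefore be close to zero for large $N$, even though each is selected by sorting on a correlated characteristic.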
This result gives the large-sample joint distribution of a finite subset of induced order statistics whose identities are determined solely by their relative rankings $\xi_k$ (as ranked according to the order statistics $X_{i:N}$). From (4) it is evident that the $\hat{\alpha}_{[i_k:N]}$'s are mutually independent in large samples. If $X_i$ were the market value of equity of the $i$th company, Theorem 1.1 shows that the $\hat{\alpha}_i$ of the security with size at, for example, the 27th percentile is asymptotically independent of the $\hat{\alpha}_j$ of the security with size at the 45th percentile.⁹ If the characteristics $\{X_i\}$ and $\{\hat{\alpha}_i\}$ are statistically independent, the joint distribution of

⁷ If the vectors are independently and identically distributed and $X_i$ is perfectly correlated with $\hat{\alpha}_i$, then the $\hat{\alpha}_{[i:N]}$ are also order statistics. But as long as the correlation coefficient $\rho$ is strictly between $-1$ and $1$, then, for example, $\hat{\alpha}_{[N:N]}$ will generally not be the largest $\hat{\alpha}_i$.

⁸ See also David and Galambos (1974) and Watterson (1959). In fact, Yang (1977) provides the exact finite-sample distribution of any finite collection of induced order statistics, but even assuming bivariate normality does not yield a tractable form of this distribution.

⁹ This is a limiting result and implies that the identities of the stocks with 27th and 45th percentile sizes will generally change as $N$ increases.