Data-Snooping Biases

examples of "data-snooping statistics," a term used by Aldous (1989, p. 252) to describe the situation "where you have a family of test statistics T(a) whose null distribution is known for fixed a, but where you use the test statistic T = T(a) for some a chosen using the data." In our application the quantity a may be viewed as a vector of zeros and ones that indicates which securities are to be included in or omitted from a given portfolio. If the choice of a is based on the data, then the sampling distribution of the resulting test statistic is generally not the same as the null distribution with a fixed a; hence, the actual size of the test may differ substantially from its nominal value under the null. Under plausible assumptions our calculations show that this kind of data snooping can lead to rejections of the null hypothesis with probability 1 even when the null hypothesis is true!

Although the term "data snooping" may have an unsavory connotation, our usage neither implies nor infers any sort of intentional misrepresentation or dishonesty. That prior empirical research may influence the way current investigations are conducted is often unavoidable, and this very fact results in what we have called data snooping. Moreover, it is not at all apparent that this phenomenon necessarily imparts a "bias" in the sense that it affects inferences in an undesirable way. After all, the primary reason for publishing scientific discoveries is to add to a store of common knowledge on which future research may build.
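The size distortion that arises when a is chosen using the data can be illustrated with a small Monte Carlo sketch. It is a deliberately simplified illustration and not the calculation carried out in this paper. It assumes that each security has an independent standard normal statistic under the null, and that the data-driven rule selects the securities with the largest realized statistics, which is an extreme form of snooping. The function name `rejection_rate` and all of its parameters are hypothetical.

```python
import random
from statistics import NormalDist


def rejection_rate(n_assets=100, n_selected=10, snoop=True,
                   n_trials=2000, level=0.05, seed=0):
    """Estimate the actual size of a nominal `level` two-sided test.

    Assumption (illustrative only): under the null, each asset has an
    independent N(0, 1) statistic. For a FIXED selection a, the scaled
    sum of the selected statistics is N(0, 1).
    """
    rng = random.Random(seed)
    # Two-sided critical value from the fixed-a null distribution.
    crit = NormalDist().inv_cdf(1 - level / 2)
    rejections = 0
    for _ in range(n_trials):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_assets)]
        if snoop:
            # a chosen using the data: keep the largest statistics.
            chosen = sorted(z, reverse=True)[:n_selected]
        else:
            # a fixed in advance: always the first n_selected assets.
            chosen = z[:n_selected]
        stat = sum(chosen) / n_selected ** 0.5
        if abs(stat) > crit:
            rejections += 1
    return rejections / n_trials


if __name__ == "__main__":
    print("fixed a:   ", rejection_rate(snoop=False))
    print("snooped a: ", rejection_rate(snoop=True))
```

With a fixed selection the estimated rejection rate should lie near the nominal level, whereas with the data-driven selection it should approach 1 even though the null is true in every trial. Selection on a characteristic that is only partially correlated with the statistic would produce a milder distortion than this extreme case.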
But when scientific discovery is statistical in nature, we must weigh the significance of newly discovered relations in view of past inferences. This is recognized implicitly in many formal statistical circumstances, as in the theory of sequential hypothesis testing. But it is considerably more difficult to correct for the effects of specification searches in practice since such searches often consist of sequences of empirical studies undertaken by many individuals over many years.2 For example, as a consequence of the many investigations relating the behavior of stock returns to size, Chen, Roll, and Ross (1986, p. 394) write: "It has been facetiously noted that size may be the best theory we now have of expected returns. Unfortunately, this is less of a theory than an empirical observation." Then, as Merton (1987, p. 107) asks in a related context: "Is it reasonable to use the standard t-statistic as a valid measure of significance when the test is conducted on the same data used by many earlier studies whose results influenced the choice of theory to be tested?" We rephrase this question

2 Statisticians have considered a closely related problem, known as the "file drawer problem," in which the overall significance of several published studies must be assessed while accounting for the possibility of unreported insignificant studies languishing in various investigators' file drawers. An excellent review of the file drawer problem and its remedies, which has come to be known as "meta-analysis," is provided by Iyengar and Greenhouse (1988).