Bayes Factors: What They Are and What They Are Not

Michael LAVINE and Mark J. SCHERVISH

Michael Lavine is Associate Professor, Institute of Statistics and Decision Sciences, Duke University, Durham, NC 27708-0251 (Email: michael@stat.duke.edu). Mark J. Schervish is Professor, Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213.

Bayes factors have been offered by Bayesians as alternatives to P values (or significance probabilities) for testing hypotheses and for quantifying the degree to which observed data support or conflict with a hypothesis. In an earlier article, Schervish showed how the interpretation of P values as measures of support suffers a certain logical flaw. In this article, we show how Bayes factors suffer that same flaw. We investigate the source of that problem and consider what are the appropriate interpretations of Bayes factors.

KEY WORDS: Measure of support; P values.

1. INTRODUCTION

Consider tosses of a coin known to be either fair, two-headed, or two-tailed. There are six nontrivial hypotheses about θ, the probability of heads:

H1: θ = 1    H2: θ = 1/2    H3: θ = 0
H4: θ ≠ 1    H5: θ ≠ 1/2    H6: θ ≠ 0.

Jeffreys (1960) introduced a class of statistics for testing hypotheses that are now commonly called Bayes factors. The Bayes factor for comparing a hypothesis H to its complement, the alternative A, is the ratio of the posterior odds in favor of H to the prior odds in favor of H.

To make this more precise, let Ω be the parameter space and let Ω_H ⊂ Ω be a proper subset. Let μ be a probability measure over Ω and, for each θ ∈ Ω, let f_{X|Θ}(x|θ) be the density function (or probability mass function) for some observable X given Θ = θ. The predictive density of X given H: Θ ∈ Ω_H is f_H(x), equal to the average of f_{X|Θ}(x|θ) with respect to μ restricted to Ω_H. Similarly, the predictive density of X given A: Θ ∉ Ω_H is f_A(x), equal to the average of f_{X|Θ}(x|θ) with respect to μ restricted to Ω_A (the complement of Ω_H). That is,

$$f_H(x) = \frac{\int_{\Omega_H} f_{X|\Theta}(x \mid \theta)\, d\mu(\theta)}{\mu(\Omega_H)}$$

and

$$f_A(x) = \frac{\int_{\Omega_A} f_{X|\Theta}(x \mid \theta)\, d\mu(\theta)}{\mu(\Omega_A)}.$$

If p is the prior probability that H is true, that is, p = μ(Ω_H), then the posterior odds in favor of H is the ratio p f_H(x)/[(1 − p) f_A(x)]. The Bayes factor is the ratio f_H(x)/f_A(x).

Example 1. Consider four tosses of the coin mentioned earlier, and suppose they all land heads. Let μ be the prior over the parameter space Ω = {0, 1/2, 1}, where a point in Ω gives the probability of heads. If the hypothesis of interest is H2: θ = 1/2, then

$$f_{H_2}(x) = \frac{1}{16} \quad\text{and}\quad f_{H_5}(x) = \frac{\mu(\{1\})}{\mu(\{0, 1\})}.$$

The Bayes factor in favor of H2 is

$$\frac{f_{H_2}(x)}{f_{H_5}(x)} = \frac{\mu(\{0, 1\})}{16\, \mu(\{1\})}.$$

Suppose that a Bayesian observes data X = x and tests a hypothesis H using a loss function that says the cost of type II error is some constant b over the alternative and the cost of type I error is constant over the hypothesis and is c × b. The posterior expected cost of rejecting H is then cb Pr(H is true | X = x), while the posterior expected cost of accepting H is b(1 − Pr(H is true | X = x)). The formal Bayes rule is to reject H if the cost of rejecting is smaller than the cost of accepting. This simplifies to rejecting H if its posterior probability is less than 1/[1 + c], which is equivalent to rejecting H if the posterior odds in its favor are less than 1/c. This, in turn, is equivalent to rejecting H if the Bayes factor in favor of H is less than some constant k implicitly determined by c and the prior odds.
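To make the definitions concrete, here is a minimal numeric sketch (ours, not the article's) of the predictive densities and the Bayes factor for Example 1. The uniform prior is an illustrative choice; the article leaves μ unspecified.

```python
from fractions import Fraction

# Example 1 in code: four tosses, all heads; parameter space {0, 1/2, 1}.
# The uniform prior below is an illustrative assumption.
prior = {Fraction(0, 1): Fraction(1, 3),
         Fraction(1, 2): Fraction(1, 3),
         Fraction(1, 1): Fraction(1, 3)}

def likelihood(theta, heads=4, tosses=4):
    # P(data | theta) for a fixed sequence containing `heads` heads.
    return theta**heads * (1 - theta)**(tosses - heads)

def predictive(hypothesis):
    # f_H(x): the mu-weighted average of the likelihood over Omega_H.
    mass = sum(prior[t] for t in hypothesis)
    return sum(prior[t] * likelihood(t) for t in hypothesis) / mass

H2 = {Fraction(1, 2)}                    # theta = 1/2
H5 = {Fraction(0, 1), Fraction(1, 1)}    # theta != 1/2
bayes_factor = predictive(H2) / predictive(H5)

p = sum(prior[t] for t in H2)            # prior probability of H2
posterior_odds = p / (1 - p) * bayes_factor
print(bayes_factor, posterior_odds)      # 1/8 and 1/16
```

Under this prior the formula above gives μ({0, 1})/(16 μ({1})) = (2/3)/(16/3) = 1/8, which the sketch reproduces.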
It would seem then that a Bayesian could decline to specify prior odds, interpret the Bayes factor as "the weight of evidence from the data in favour of the ... model" (O'Hagan 1994, p. 191); "a summary of the evidence provided by the data in favor of one scientific theory ... as opposed to another" (Kass and Raftery 1995, p. 777); or the "odds for H0 to H1 that are given by the data" (Berger 1985, p. 146), and test a hypothesis "objectively" by rejecting H if the Bayes factor is less than some constant k. In fact, Schervish (1995, p. 221) said "The advantage of calculating a Bayes factor over the posterior odds ... is that one need not state a prior odds ..." and then (p. 283) that Bayes factors are "ways to quantify the degree of support for a hypothesis in a data set." Of course, as these authors clarified, such an interpretation is not strictly justified. While the Bayes factor does not depend on the prior odds, it does depend on "how the prior mass is spread out over the two hypotheses" (Berger 1985, p. 146). Nonetheless, it sometimes happens that the Bayes factor "will be relatively insensitive to reasonable choices" (Berger 1985, p. 146), and then a common opinion would be that "such an interpretation is reasonable" (Berger 1985, p. 147).

We show, by example, that such informal use of Bayes factors suffers a certain logical flaw that is not suffered by using the posterior odds to measure support.
The removal of the prior odds from the posterior odds to produce the Bayes factor has consequences that affect the interpretation of the resulting ratio.

2. BAYES FACTORS ARE NOT MONOTONE IN THE HYPOTHESIS

Example 2. Consider once again the four coin tosses that all came up heads, let the parameter space be Ω = {0, 1/2, 1} (as in Example 1), and define a prior distribution μ by μ({1}) = .01, μ({1/2}) = .98, and μ({0}) = .01. The six predictive probabilities are f_{H1}(x) = 1, f_{H2}(x) = .0625, f_{H3}(x) = 0, f_{H4}(x) ≈ .0619, f_{H5}(x) = .5, and f_{H6}(x) ≈ .072, and the six nontrivial Bayes factors are f_{H1}(x)/f_{H4}(x) ≈ 16.16, f_{H2}(x)/f_{H5}(x) = .125, f_{H3}(x)/f_{H6}(x) = 0, and their inverses.

Suppose that we use the Bayes factors to test the corresponding hypotheses. That is, we reject a hypothesis if the Bayes factor in its favor is less than some fixed number k. If we choose k ∈ (.0619, .125), then we reject H4 because 1/16.16 ≈ .0619 < k, but we accept H2 because .125 > k. That is, we face the apparent contradiction of accepting θ = .5 but rejecting θ ∈ {0, .5}. This problem does not arise if we choose to test the hypotheses by rejecting when the posterior odds is less than some number k′. The posterior odds in favor of H2 is never more than the posterior odds in favor of H4.

In Example 2, we were testing two hypotheses, H2 and H4, such that H2 implies H4. Gabriel (1969) introduced a criterion for simultaneous tests of nested hypotheses. The tests of H2 and H4 are coherent if rejecting H4 entails rejecting H2. One typical use of a measure of support for hypotheses is to reject those hypotheses (that we want to test) that have small measures of support. We can translate the coherence condition into a requirement for any measure of support for hypotheses. Since any support for H2 must a fortiori be support for H4, the support for H2 must be no greater than the support for H4. Using the Bayes factor as a measure of support violates the coherence condition. Schervish (1996) showed that using P values as measures of support also violates the coherence condition. Examples of coherent measures are the posterior probability, the posterior odds, and various forms of the likelihood ratio test statistic

$$\mathrm{LR}(H) = \frac{\sup_{\theta \in \Omega_H} f_{X|\Theta}(x \mid \theta)}{\sup_{\theta \in \Omega} f_{X|\Theta}(x \mid \theta)} \quad\text{and}\quad \mathrm{LR}'(H) = \frac{\sup_{\theta \in \Omega_H} f_{X|\Theta}(x \mid \theta)}{\sup_{\theta \in \Omega_A} f_{X|\Theta}(x \mid \theta)}. \tag{1}$$

The nonmonotonicity (incoherence) of Bayes factors is actually very general. Suppose that there are three nonempty, disjoint, and exhaustive hypotheses H1, H2, and H3 as in Examples 1 and 2. Let H4 be the complement of H1 (the union of H2 and H3) as in the examples, so that H2 implies H4. Straightforward algebra shows that if f_{H3}(x) < min{f_{H2}(x), f_{H1}(x)}, then the Bayes factor in favor of H4 will be smaller than the Bayes factor in favor of H2 regardless of the prior probabilities of the three hypotheses H1, H2, and H3. For instance, the nonmonotonicity will occur in Example 2 no matter what one chooses for the (strictly positive) prior distribution μ. What happens is that the Bayes factor penalizes H4 for containing additional parameter values (those in H3) that make the observed data less likely than all of the other hypotheses under consideration.
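The numbers in Example 2, and the nonmonotonicity itself, are easy to check numerically. The short Python sketch below (ours, not the article's) reproduces the six predictive probabilities and the offending pair of Bayes factors.

```python
# A quick check of Example 2's numbers (the article reports the same
# values to three or four decimal places).
prior = {0.0: 0.01, 0.5: 0.98, 1.0: 0.01}

def lik(theta, heads=4):
    return theta**heads  # four tosses, all heads

def predictive(hyp):
    mass = sum(prior[t] for t in hyp)
    return sum(prior[t] * lik(t) for t in hyp) / mass

f = {
    "H1": predictive({1.0}),        # 1
    "H2": predictive({0.5}),        # 0.0625
    "H3": predictive({0.0}),        # 0
    "H4": predictive({0.0, 0.5}),   # ~0.0619
    "H5": predictive({0.0, 1.0}),   # 0.5
    "H6": predictive({0.5, 1.0}),   # ~0.072
}
bf_H2 = f["H2"] / f["H5"]   # 0.125
bf_H4 = f["H4"] / f["H1"]   # ~0.0619
# H2 implies H4, yet the Bayes factor in favor of H4 is the smaller one:
assert bf_H4 < bf_H2
```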
An applied example of this phenomenon was encountered by Olson (1997), who was comparing three modes of inheritance in the species Astilbe biternata. All three modes are represented by simple hypotheses concerning the distribution of the observable data. One hypothesis, H1, is called tetrasomic inheritance, while the other two hypotheses, H2 and H3 (those which happen to have the largest and smallest likelihoods, respectively), together form a meaningful category, disomic inheritance. The Bayes factor in favor of H2 will be larger than the Bayes factor in favor of H2 ∪ H3 no matter what strictly positive prior one places over the three hypotheses because H3 has the smallest likelihood.

3. BAYES FACTORS ARE MEASURES OF CHANGE IN SUPPORT

The fact that Bayes factors are not coherent as measures of support does not mean that they are not useful summaries. It only means that one must be careful how one interprets them. What the Bayes factor actually measures is the change in the odds in favor of the hypothesis when going from the prior to the posterior. In fact, Bernardo and Smith (1994, p. 390) said "Intuitively, the Bayes factor provides a measure of whether the data x have increased or decreased the odds on Hi relative to Hj." In terms of log-odds, the posterior log-odds equals the prior log-odds plus the logarithm of the Bayes factor. So, for example, if one were to use log-odds to measure support (a coherent measure), then the logarithm of the Bayes factor would measure how much the data change the support for the hypothesis.

Testing hypotheses by comparing Bayes factors to prespecified standard levels (like 3 or 1/3 to stand for 3-to-1 for or 1-to-3 against) is similar to confusing Pr(A|B) with Pr(B|A). In Example 2, even though the Bayes factor f_{H4}(x)/f_{H1}(x) ≈ .0619 is small, the posterior odds Pr[H4|x]/Pr[H1|x] = (.99/.01) × .0619 ≈ 6.13 is large and implies Pr[H4|x] ≈ .86. The small Bayes factor says that the data will lower the probability of H4 a large amount relative to where it starts (.99), but it does not imply that H4 is unlikely.
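In code, the distinction between the change in odds and the final odds is one line of arithmetic. A small sketch (ours), using the Example 2 numbers:

```python
import math

# The Bayes factor measures the *change* in odds, not the final odds.
prior_odds_H4 = 0.99 / 0.01          # prior odds in favor of H4
bayes_factor_H4 = 0.0619             # f_H4(x) / f_H1(x): "small"

posterior_odds_H4 = prior_odds_H4 * bayes_factor_H4              # ~6.13
posterior_prob_H4 = posterior_odds_H4 / (1 + posterior_odds_H4)  # ~0.86

# Equivalently, on the log scale the data contribute additively:
# posterior log-odds = prior log-odds + log(Bayes factor).
assert math.isclose(
    math.log(posterior_odds_H4),
    math.log(prior_odds_H4) + math.log(bayes_factor_H4),
)
print(round(posterior_odds_H4, 2), round(posterior_prob_H4, 2))  # 6.13 0.86
```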
Table 1.Partitions of the Set of Possible x Values by Two If we subtract these two we get R(0,)-R(0,)= Pairs of Tests Po(C)g(0),where 2 2 1 0 0 g(0)=L1(0,0)-L2(0,0)-L1(0,1)+L2(0,1) 0 Q 0 1 CUE Our assumptions imply that g()>0 for all E 2i and it is 0 for all other 0.Since Pe(C)>0 for some 6e2, fore,we assume that a typical application of a measure is inadmissible.From the Bayesian perspective,if x C, of support will be to reject hypotheses that have low sup- the posterior risk of is f(L(0,0)+L2(0,1)]duex(), port.Hence,we will justify coherence as a criterion for where uelx is the posterior distribution.The posterior risk simultaneous tests.Consider the most general loss function of is f[L(0,1)+L2(0,0)]duex().The difference be- L that is conducive to hypothesis testing.That is,let the tween these two posterior risks is easily seen to equal the action space have two points,0 and 1,where 0 means ac- integral of g()with respect to the posterior distribution.If cept H and 1 means reject H,and let H:e be C,then the two rules make the same decision;hence, the hypothesis.We assume that L(,0)>L(,1)for all they have the same posterior risk.So long as the posterior 0g H and L(0,0)0.We can now show that dominates o and that When at least one of the hypotheses is composite,inter- it has smaller posterior risk.The risk functions of the two pretations are not so simple.One might choose either to pairs of tests are maximize,to sum,or to average over composite hypothe- ses.Users of the likelihood ratio statistic maximize:they find the value of 0 within each hypothesis that best explains the data.Users of posterior probabilities sum:the posterior probability of a hypothesis is the sum (or integral)of the R(0,)=L1(0,0)Pa(CUF)+L2(0,0)Pa(EUF) posterior probabilities of all the 0's within it.Users of Bayes +L1(0,1)P(DUE) factors average:the Bayes factor is the ratio of fxe() +L2(0,1)Pe(CUD), averaged with respect to the conditional prior given each R(0,)=L1(0,0)Pa(F)+L2(0,0)Pa(CUEUF) hypothesis.But averaging has at least two potential draw- backs.First,it requires a prior to average with respect to, +L1(0,1)Pa(CUDUE) and second,it penalizes a hypothesis for containing values +L2(0,1)Pa(D). with small likelihood.As we noted at the end of Section The American Statistician,May 1999,Vol.53,No.2 121
4. WHY COHERENCE?

Is coherence a compelling criterion to require of a measure of support? Aside from the heuristic justification given earlier, there is a decision-theoretic justification. As before, we assume that a typical application of a measure of support will be to reject hypotheses that have low support. Hence, we will justify coherence as a criterion for simultaneous tests. Consider the most general loss function L that is conducive to hypothesis testing. That is, let the action space have two points, 0 and 1, where 0 means accept H and 1 means reject H, and let H: Θ ∈ Ω_H be the hypothesis. We assume that L(θ, 0) > L(θ, 1) for all θ ∉ Ω_H and L(θ, 0) < L(θ, 1) for all θ ∈ Ω_H. Now suppose that we test two nested hypotheses simultaneously, H_i: Θ ∈ Ω_i for i = 1, 2 with Ω_1 ⊂ Ω_2, using loss functions L_1 and L_2 of this form whose losses add, and suppose that the difference L_i(θ, 1) − L_i(θ, 0) is the same for i = 1, 2 at every θ at which the two hypotheses are both true or both false. Let φ = (φ_1, φ_2) be an incoherent pair of tests: on some set C of possible x values, φ rejects H_2 while accepting H_1. Let ψ = (ψ_1, ψ_2) be the pair of tests that agrees with φ except on C, where ψ instead rejects H_1 and accepts H_2. Table 1 shows the sets of x values on which each pair of tests reaches each possible pair of decisions.

Table 1. Partitions of the Set of Possible x Values by Two Pairs of Tests

             φ_2 = 0    φ_2 = 1                ψ_2 = 0    ψ_2 = 1
  φ_1 = 0       F          C        ψ_1 = 0       F          ∅
  φ_1 = 1       E          D        ψ_1 = 1     C ∪ E        D

We can now show that ψ dominates φ and that it has smaller posterior risk. The risk functions of the two pairs of tests are

$$R(\theta, \phi) = L_1(\theta, 0) P_\theta(C \cup F) + L_2(\theta, 0) P_\theta(E \cup F) + L_1(\theta, 1) P_\theta(D \cup E) + L_2(\theta, 1) P_\theta(C \cup D)$$

and

$$R(\theta, \psi) = L_1(\theta, 0) P_\theta(F) + L_2(\theta, 0) P_\theta(C \cup E \cup F) + L_1(\theta, 1) P_\theta(C \cup D \cup E) + L_2(\theta, 1) P_\theta(D).$$

If we subtract these two we get R(θ, φ) − R(θ, ψ) = P_θ(C) g(θ), where

$$g(\theta) = L_1(\theta, 0) - L_2(\theta, 0) - L_1(\theta, 1) + L_2(\theta, 1).$$

Our assumptions imply that g(θ) > 0 for all θ ∈ Ω_2 \ Ω_1 and that it is 0 for all other θ. Since P_θ(C) > 0 for some θ ∈ Ω_2 \ Ω_1, φ is inadmissible. From the Bayesian perspective, if x ∈ C, the posterior risk of φ is ∫ [L_1(θ, 0) + L_2(θ, 1)] dμ_{Θ|X}(θ | x), where μ_{Θ|X} is the posterior distribution. The posterior risk of ψ is ∫ [L_1(θ, 1) + L_2(θ, 0)] dμ_{Θ|X}(θ | x). The difference between these two posterior risks is easily seen to equal the integral of g(θ) with respect to the posterior distribution. If x ∉ C, then the two rules make the same decision; hence, they have the same posterior risk. So long as the posterior risks are finite and Ω_2 \ Ω_1 has positive posterior probability, φ cannot be a formal Bayes rule.
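The identity R(θ, φ) − R(θ, ψ) = P_θ(C) g(θ) is pure bookkeeping over the four cells of Table 1, which a few lines of Python can spot-check. The losses and cell probabilities below are arbitrary made-up values, not quantities from the article.

```python
import random

# Numeric spot-check of R(theta, phi) - R(theta, psi) = P_theta(C) g(theta)
# at a fixed theta.  L*_acc and L*_rej stand for L_i(theta, 0) and
# L_i(theta, 1); all values here are random placeholders.
random.seed(0)
for _ in range(1000):
    L1_acc, L1_rej, L2_acc, L2_rej = (random.random() for _ in range(4))
    raw = [random.random() for _ in range(4)]
    pC, pD, pE, pF = (w / sum(raw) for w in raw)  # P_theta of C, D, E, F

    R_phi = (L1_acc * (pC + pF) + L2_acc * (pE + pF)
             + L1_rej * (pD + pE) + L2_rej * (pC + pD))
    R_psi = (L1_acc * pF + L2_acc * (pC + pE + pF)
             + L1_rej * (pC + pD + pE) + L2_rej * pD)
    g = L1_acc - L2_acc - L1_rej + L2_rej
    assert abs((R_phi - R_psi) - pC * g) < 1e-12
```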
5. DISCUSSION

Coherence is a property of tests of two or more nested hypotheses considered jointly, but we can gain some insight into it by considering a single test on its own. When comparing two hypotheses it is useful to rephrase the question as: How well, relative to each other, do the hypotheses explain the data? In the case of comparing two simple hypotheses, there is wide agreement on how this should be done. As Berger (1985, p. 146) pointed out, the Bayes factor is the same as the likelihood ratio LR′ from (1) in this case. Also, in the case of two simple hypotheses, the P value is just the probability in the tail of one of the distributions beyond the observed likelihood ratio; hence it is a monotone function of the Bayes factor. So the Bayes factor and the P value really can measure the support that the data offer for one simple hypothesis relative to another, and in a way that is acceptable to Bayesians and non-Bayesians alike. One should also note that coherence is not an issue in the case of two simple hypotheses because there do not exist two nonempty distinct nested hypotheses with nonempty complements. On the other hand, as we noted at the end of Section 3, just because the data increase the support for a hypothesis H relative to its complement does not necessarily make H more likely than its complement; it only makes H more likely than it was a priori.

When at least one of the hypotheses is composite, interpretations are not so simple. One might choose to maximize, to sum, or to average over composite hypotheses. Users of the likelihood ratio statistic maximize: they find the value of θ within each hypothesis that best explains the data. Users of posterior probabilities sum: the posterior probability of a hypothesis is the sum (or integral) of the posterior probabilities of all the θ's within it. Users of Bayes factors average: the Bayes factor is the ratio of f_{X|Θ}(x|θ) averaged with respect to the conditional prior given each hypothesis. But averaging has at least two potential drawbacks. First, it requires a prior to average with respect to, and second, it penalizes a hypothesis for containing values with small likelihood. As we noted at the end of Section 2, interpreting the Bayes factor as a measure of support is incoherent because of the second drawback.
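To make the maximize/sum/average contrast concrete, the sketch below (ours; it reuses the Example 2 prior and data) computes all three summaries for the composite hypothesis H4: θ ∈ {0, 1/2}. Adding the low-likelihood point θ = 0 cannot decrease the posterior sum and does not change the sup, but it lowers the prior-weighted average likelihood, which is the Bayes factor's numerator.

```python
# Three ways of summarizing the composite hypothesis H4: theta in {0, 1/2},
# under the Example 2 prior, after four heads in four tosses.
prior = {0.0: 0.01, 0.5: 0.98, 1.0: 0.01}
lik = {t: t**4 for t in prior}          # likelihood of four heads

H4 = {0.0, 0.5}

# Maximize (likelihood ratio user): best-case explanation within H4.
lr_sup = max(lik[t] for t in H4)        # 0.0625

# Sum (posterior probability user): total posterior mass on H4.
marginal = sum(prior[t] * lik[t] for t in prior)
post_H4 = sum(prior[t] * lik[t] for t in H4) / marginal   # ~0.86

# Average (Bayes factor user): prior-weighted mean likelihood on H4,
# which theta = 0 drags down despite its tiny prior mass.
avg_H4 = sum(prior[t] * lik[t] for t in H4) / sum(prior[t] for t in H4)
print(lr_sup, post_H4, avg_H4)          # 0.0625, ~0.860, ~0.0619
```

Note that the posterior sum reproduces Pr[H4|x] ≈ .86 from Section 3, while the average reproduces f_{H4}(x) ≈ .0619.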
[Received January 1997. Revised September 1997.]

REFERENCES

Berger, J. O. (1985), Statistical Decision Theory and Bayesian Analysis (2nd ed.), New York: Springer-Verlag.

Bernardo, J., and Smith, A. F. M. (1994), Bayesian Theory, New York: Wiley.

Gabriel, K. R. (1969), "Simultaneous Test Procedures-Some Theory of Multiple Comparisons," Annals of Mathematical Statistics, 40, 224-250.

Jeffreys, H. (1960), Theory of Probability (3rd ed.), Oxford: Clarendon Press.

Kass, R., and Raftery, A. (1995), "Bayes Factors," Journal of the American Statistical Association, 90, 773-795.

O'Hagan, A. (1994), Kendall's Advanced Theory of Statistics, Vol. 2B: Bayesian Inference, Cambridge: University Press.

Olson, M. (1997), "Application of Bayesian Analyses to Discriminate Between Disomic and Tetrasomic Inheritance in Astilbe biternata," Technical report, Duke University, Department of Botany.

Schervish, M. J. (1995), Theory of Statistics, New York: Springer-Verlag.

Schervish, M. J. (1996), "P-values: What They Are and What They Are Not," The American Statistician, 50, 203-206.