14.4 Contingency Table Analysis of Two Distributions

H takes on its maximum value when all the $p_i$'s are equal, in which case the question is sure to eliminate all but a fraction $1/I$ of the remaining possibilities. The value H is conventionally termed the entropy of the distribution given by the $p_i$'s, a terminology borrowed from statistical physics.
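As a quick check of that claim (a worked line added here, not part of the original text), take the uniform case $p_i = 1/I$ in the entropy definition from the preceding page, $H = -\sum_i p_i \ln p_i$:

$$
H = -\sum_{i=1}^{I} \frac{1}{I} \ln \frac{1}{I} = \ln I ,
$$

the largest value H can attain for $I$ possible answers; whatever the answer turns out to be, only a fraction $1/I$ of the possibilities survives.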
So far we have said nothing about the association of two variables; but suppose we are deciding what question to ask next in the game and have to choose between two candidates, or possibly want to ask both in one order or another. Suppose that one question, x, has I possible answers, labeled by i, and that the other question, y, has J possible answers, labeled by j. Then the possible outcomes of asking both questions form a contingency table whose entries $N_{ij}$, when normalized by dividing by the total number of remaining possibilities N, give all the information about the p's. In particular, we can make contact with the notation (14.4.1) by identifying

$$
\begin{aligned}
p_{ij} &= \frac{N_{ij}}{N} \\
p_{i\cdot} &= \frac{N_{i\cdot}}{N} \qquad \text{(outcomes of question $x$ alone)} \\
p_{\cdot j} &= \frac{N_{\cdot j}}{N} \qquad \text{(outcomes of question $y$ alone)}
\end{aligned}
\tag{14.4.8}
$$

The entropies of the questions x and y are, respectively,

$$
H(x) = -\sum_i p_{i\cdot} \ln p_{i\cdot}
\qquad\qquad
H(y) = -\sum_j p_{\cdot j} \ln p_{\cdot j}
\tag{14.4.9}
$$

The entropy of the two questions together is

$$
H(x,y) = -\sum_{i,j} p_{ij} \ln p_{ij}
\tag{14.4.10}
$$

Now what is the entropy of the question y given x (that is, if x is asked first)? It is the expectation value over the answers to x of the entropy of the restricted y distribution that lies in a single column of the contingency table (corresponding to the x answer):

$$
H(y \mid x) = -\sum_i p_{i\cdot} \sum_j \frac{p_{ij}}{p_{i\cdot}} \ln \frac{p_{ij}}{p_{i\cdot}}
            = -\sum_{i,j} p_{ij} \ln \frac{p_{ij}}{p_{i\cdot}}
\tag{14.4.11}
$$

Correspondingly, the entropy of x given y is

$$
H(x \mid y) = -\sum_j p_{\cdot j} \sum_i \frac{p_{ij}}{p_{\cdot j}} \ln \frac{p_{ij}}{p_{\cdot j}}
            = -\sum_{i,j} p_{ij} \ln \frac{p_{ij}}{p_{\cdot j}}
\tag{14.4.12}
$$

We can readily prove that the entropy of y given x is never more than the entropy of y alone, i.e., that asking x first can only reduce the usefulness of asking y.
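One way to see why (a sketch added here, taking the sums over cells with $p_{ij} > 0$; the book's own argument continues beyond this excerpt) is to write the difference of the two entropies as a single sum and apply Jensen's inequality to the concave function $\ln$:

$$
H(y) - H(y \mid x)
  = \sum_{i,j} p_{ij} \ln \frac{p_{ij}}{p_{i\cdot}\,p_{\cdot j}}
  = -\sum_{i,j} p_{ij} \ln \frac{p_{i\cdot}\,p_{\cdot j}}{p_{ij}}
  \ge -\ln \sum_{i,j} p_{i\cdot}\,p_{\cdot j}
  \ge -\ln 1 = 0 ,
$$

with equality precisely when $p_{ij} = p_{i\cdot}\,p_{\cdot j}$ in every cell, that is, when the answers to the two questions are completely independent.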
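Since the book's own routine for these quantities does not appear in this excerpt, the following is a minimal, self-contained C sketch of equations (14.4.8) through (14.4.12); the 2 by 3 table of counts, the fixed array sizes, and all of the names are illustrative assumptions rather than anything taken from Numerical Recipes:

#include <stdio.h>
#include <math.h>

#define NI 2                 /* number of answers to question x (illustrative) */
#define NJ 3                 /* number of answers to question y (illustrative) */

int main(void)
{
    /* made-up table of counts N_ij; i indexes x answers, j indexes y answers */
    double nij[NI][NJ] = { { 10.0, 20.0,  5.0 },
                           {  8.0,  2.0, 15.0 } };
    double n = 0.0, pi[NI] = { 0.0 }, pj[NJ] = { 0.0 };
    double hx = 0.0, hy = 0.0, hxy = 0.0, hygx = 0.0, hxgy = 0.0;
    int i, j;

    /* total N and marginal counts N_i., N_.j */
    for (i = 0; i < NI; i++)
        for (j = 0; j < NJ; j++) {
            n     += nij[i][j];
            pi[i] += nij[i][j];
            pj[j] += nij[i][j];
        }

    /* marginal probabilities, equation (14.4.8) */
    for (i = 0; i < NI; i++) pi[i] /= n;          /* p_i. = N_i./N */
    for (j = 0; j < NJ; j++) pj[j] /= n;          /* p_.j = N_.j/N */

    /* entropies of x and y alone, equation (14.4.9); zero cells contribute 0 */
    for (i = 0; i < NI; i++) if (pi[i] > 0.0) hx -= pi[i] * log(pi[i]);
    for (j = 0; j < NJ; j++) if (pj[j] > 0.0) hy -= pj[j] * log(pj[j]);

    /* joint and conditional entropies, equations (14.4.10)-(14.4.12) */
    for (i = 0; i < NI; i++)
        for (j = 0; j < NJ; j++) {
            double p = nij[i][j] / n;             /* p_ij = N_ij/N */
            if (p > 0.0) {
                hxy  -= p * log(p);               /* H(x,y)  (14.4.10) */
                hygx -= p * log(p / pi[i]);       /* H(y|x)  (14.4.11) */
                hxgy -= p * log(p / pj[j]);       /* H(x|y)  (14.4.12) */
            }
        }

    printf("H(x)   = %g   H(y)   = %g   H(x,y) = %g\n", hx, hy, hxy);
    printf("H(y|x) = %g   H(x|y) = %g\n", hygx, hxgy);
    return 0;
}

Compiled with something like cc entropy.c -o entropy -lm, this prints the five entropies for the sample table; the output can be used to check numerically that H(y|x) never exceeds H(y), the inequality asserted above.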