正在加载图片...
2.4.RELATIVE ENTROPY AND MUTUAL INFORMATION 19 Proof. I(X:YZ)=H(YZ)-H(YZX) =H(Y)+H(Z)-[H(X)+H()] =[H(Y)-H(YIX】+[H(ZY)-H(ZXY】 =I(X:Y)+I(X:ZIY (2.20) Theorem 2.4.1(Chain rule for mutual information). I.Y) CI(.Xn-1) Proof. I(X1,X2.XN:Y)=H(X1,X2,.XN)-H(X1:X2.XNY) =∑HXlK,X,Xn-)-∑Hx1x,X,Xn-1,y n=1 =∑I(Xn;Y1X1,Xn-) (2.21 n 2.4.4 Data processing inequality ndom ariables X,Z,Y inendnt X→Z→YifP(,)=Pe)P(zr)P(z) Theorem 2.4.2 (Data processing inequality).If XZY,then I(X;Z)≥I(X;Y) Proof.By the chain rule,we obtain I(X:YZ)=I(X:Y)+I(X:ZY) =I(X;Z)+I(X;YIZ) Since X and Y are independent given Z,we have I(X:YZ)=0.Thus I(X:Z)=I(X:Y)+IX;ZY) (2.22) From the non-negativity of mutual information,I(X;Z)20.Thus,(2.22)implies that I(X:Z)≥I(X:Y)2.4. RELATIVE ENTROPY AND MUTUAL INFORMATION 19 Proof. I(X; Y Z) = H(Y Z) − H(Y Z|X) = H(Y ) + H(Z|Y ) − [H(Y |X) + H(Z|XY )] = [H(Y ) − H(Y |X)] + [H(Z|Y ) − H(Z|XY )] = I(X; Y ) + I(X;Z|Y ) (2.20) Theorem 2.4.1 (Chain rule for mutual information). I(X1, X2, . . . , XN ; Y ) = ∑ N n=1 I(Xn; Y |X1, . . . , Xn−1) Proof. I(X1, X2, . . . , XN ; Y ) = H(X1, X2, . . . , XN ) − H(X1, X2, . . . , XN |Y ) = ∑ N n=1 H(Xn|X1, X2, . . . , Xn−1) − ∑ N n=1 H(Xn|X1, X2, . . . , Xn−1, Y ) = ∑ n I(Xn; Y |X1, . . . , Xn−1) (2.21) 2.4.4 Data processing inequality The data processing inequality can be used to show that no clever manipulation of the data can improve the inferences that can be made from the data. Definition 2.4.4. Random variables X, Z, Y are said to form a Markov Chain in that order (denoted by X → Z → Y ) if the conditional distribution of Y depends only on Z and is conditionally independent of X. Specifically, X → Z → Y ifP(x, z, y) = P(x)P(z|x)P(y|z) Theorem 2.4.2 (Data processing inequality). If X → Z → Y , then I(X;Z) ≥ I(X; Y ) Proof. By the chain rule, we obtain I(X; Y Z) = I(X; Y ) + I(X;Z|Y ) = I(X;Z) + I(X; Y |Z) Since X and Y are independent given Z, we have I(X; Y |Z) = 0. Thus, I(X;Z) = I(X; Y ) + I(X;Z|Y ) (2.22) From the non-negativity of mutual information, I(X;Z|Y ) ≥ 0. Thus, (2.22) implies that I(X;Z) ≥ I(X; Y )
<<向上翻页向下翻页>>
©2008-现在 cucdc.com 高等教育资讯网 版权所有