From the symmetry, it follows that

    I(Y; Z) ≥ I(X; Y)

This theorem demonstrates that no processing of Z, deterministic or random, can increase the information that Z contains about X.

Corollary 2.4.3. In particular, if Y = f(Z), we have I(X; Z) ≥ I(X; f(Z)).

Proof. X → Z → f(Z) forms a Markov chain. ◇

Corollary 2.4.4. If X → Z → Y, then I(X; Z | Y) ≤ I(X; Z).

Proof. Using the fact that I(X; Y) ≥ 0, the corollary follows immediately from (2.22).

Expressing the data processing inequality in terms of entropies, we have

    H(X) − H(X|Z) ≥ H(X) − H(X|Y)  ⇒  H(X|Z) ≤ H(X|Y)        (2.23)

The average uncertainty H(X|Z) about the input of a channel given the output is called the equivocation on the channel, and thus the above inequality yields the intuitively satisfying result that this uncertainty, or equivocation, can never decrease as we go further from the input on a sequence of cascaded channels.

Example 2.4.1. Suppose that the random vector [X, Y, Z] is equally likely to take on values in {[000], [010], [100], [101]}. Then

    H(X) = H_2(1/2) = 1 bit.

Note that P_{Y|X}(0|1) = 1, so that H(Y|X = 1) = 0. Similarly, P_{Y|X}(0|0) = 1/2, so that

    H(Y|X = 0) = H_2(1/2) = 1 bit.

Thus H(Y|X) = (1/2) × 1 = 1/2 bit. Also,

    H(Z|XY) = Σ_{x,y} P(x, y) H(Z|X = x, Y = y) = (1/4)(0) + (1/4)(0) + (1/2)(1) = 1/2 bit,

and

    H(XYZ) = H(X) + H(Y|X) + H(Z|XY) = 1 + 1/2 + 1/2 = 2 bits.

Because P_Y(1) = 1/4 and P_Y(0) = 3/4, we have

    H(Y) = H_2(1/4) = 0.811 bits,

and we see that H(Y|X) = 1/2 < H(Y). However, H(Y|X = 0) = 1 > H(Y). Furthermore,

    I(X; Y) = H(Y) − H(Y|X) = 0.811 − 0.5 = 0.311 bits.

In words, the first component of the random vector [X, Y, Z] gives 0.311 bits of information about the second component, and vice versa.
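The figures quoted in Example 2.4.1 can be checked directly from the joint distribution. The short Python sketch below is not part of the original text; it simply recomputes the entropies from the uniform joint distribution over {000, 010, 100, 101}, using the chain-rule identities H(Y|X) = H(XY) − H(X) and H(Z|XY) = H(XYZ) − H(XY).

    # Numerical check of Example 2.4.1 (illustrative sketch, not from the text).
    from collections import defaultdict
    from math import log2

    # [X, Y, Z] uniform over the four listed outcomes, probability 1/4 each.
    joint = {(0, 0, 0): 0.25, (0, 1, 0): 0.25, (1, 0, 0): 0.25, (1, 0, 1): 0.25}

    def entropy(dist):
        """Entropy in bits of a dict mapping outcomes to probabilities."""
        return -sum(p * log2(p) for p in dist.values() if p > 0)

    def marginal(dist, idxs):
        """Marginal distribution over the coordinates listed in idxs."""
        out = defaultdict(float)
        for outcome, p in dist.items():
            out[tuple(outcome[i] for i in idxs)] += p
        return dict(out)

    H_XYZ = entropy(joint)                      # 2 bits
    H_X   = entropy(marginal(joint, (0,)))      # 1 bit
    H_Y   = entropy(marginal(joint, (1,)))      # H_2(1/4) ≈ 0.811 bits
    H_XY  = entropy(marginal(joint, (0, 1)))    # 1.5 bits
    H_Y_given_X  = H_XY - H_X                   # 0.5 bit
    H_Z_given_XY = H_XYZ - H_XY                 # 0.5 bit
    I_XY = H_Y - H_Y_given_X                    # ≈ 0.311 bits

    print(H_X, H_Y_given_X, H_Z_given_XY, H_XYZ, round(I_XY, 3))

Running the sketch reproduces H(X) = 1, H(Y|X) = 0.5, H(Z|XY) = 0.5, H(XYZ) = 2, and I(X; Y) ≈ 0.311 bits, matching the hand calculation above.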