It is sometimes called the information divergence/cross-entropy/Kullback entropy between $P(x)$ and $Q(x)$.

• In general, $D(P\|Q) \neq D(Q\|P)$, so relative entropy does not have the symmetry required of a true "distance" measure.

• The divergence inequality: $D(P\|Q) \ge 0$, with equality iff $P(x) = Q(x)$ for all $x \in \mathcal{X}$.

Proof. Let $S_X = \{x : P(x) > 0\}$ be the support set of $P(x)$. Then
\begin{align}
-D(P\|Q) &= \sum_{x \in S_X} P(x) \log \frac{Q(x)}{P(x)} \tag{2.14}\\
&\le \sum_{x \in S_X} P(x) \left[ \frac{Q(x)}{P(x)} - 1 \right] \log e \qquad \text{(by the IT-inequality)}\\
&= \left[ \sum_{x \in S_X} Q(x) - \sum_{x \in S_X} P(x) \right] \log e\\
&\le \left[ \sum_{x \in \mathcal{X}} Q(x) - \sum_{x \in \mathcal{X}} P(x) \right] \log e = 0.
\end{align}

• Conditional relative entropy: For joint PMFs $P(x,y)$ and $Q(x,y)$, the conditional relative entropy is
\begin{equation}
D(P(y|x)\|Q(y|x)) = \sum_x P(x) \sum_y P(y|x) \log \frac{P(y|x)}{Q(y|x)} = E_{P(x,y)}\!\left[ \log \frac{P(y|x)}{Q(y|x)} \right] \tag{2.15}
\end{equation}

• Chain rule for relative entropy:
\[
D(P(x,y)\|Q(x,y)) = D(P(x)\|Q(x)) + D(P(y|x)\|Q(y|x))
\]

• Example: Entropy written as relative entropy

Suppose that $X$ takes values in $\mathcal{X}$ with $|\mathcal{X}| = K$. Let $U$ have the uniform distribution over $\mathcal{X}$. Then
\[
D(P_X\|P_U) = \sum_x P_X(x) \log \frac{P_X(x)}{P_U(x)} = \sum_x P_X(x) \log \frac{P_X(x)}{1/K} = \log K - H(X).
\]
It follows that $H(X) \le \log |\mathcal{X}|$; the identities above are also checked numerically in the sketch following this list.
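The properties above are easy to verify numerically. Below is a minimal sketch, assuming NumPy and base-2 logarithms (so all quantities are in bits); the PMFs P, U, Pxy, Qxy and the helper functions rel_entropy and entropy are made-up examples, not part of the text. It checks the divergence inequality, the asymmetry of $D(\cdot\|\cdot)$, the chain rule, and the identity $D(P_X\|P_U) = \log K - H(X)$.

```python
import numpy as np

def rel_entropy(p, q):
    """D(p||q) in bits, with the sum restricted to the support of p."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

def entropy(p):
    """H(p) in bits."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Hypothetical PMFs on an alphabet of size K = 4; U is uniform.
P = np.array([0.7, 0.1, 0.1, 0.1])
U = np.full(4, 0.25)

# Divergence inequality: D(P||U) >= 0, with equality iff the PMFs coincide.
assert rel_entropy(P, U) >= 0 and np.isclose(rel_entropy(U, U), 0.0)

# Asymmetry: D(P||U) != D(U||P) in general.
print(rel_entropy(P, U), rel_entropy(U, P))   # two different values

# Entropy as relative entropy to the uniform PMF: D(P||U) = log K - H(P).
K = len(P)
assert np.isclose(rel_entropy(P, U), np.log2(K) - entropy(P))

# Chain rule, checked on a hypothetical 2x2 joint PMF against a uniform joint.
Pxy = np.array([[0.3, 0.2],
                [0.1, 0.4]])
Qxy = np.full((2, 2), 0.25)
Px, Qx = Pxy.sum(axis=1), Qxy.sum(axis=1)
# D(P(y|x)||Q(y|x)) = sum_x P(x) * D(P(.|x)||Q(.|x))
D_cond = sum(Px[i] * rel_entropy(Pxy[i] / Px[i], Qxy[i] / Qx[i])
             for i in range(2))
assert np.isclose(rel_entropy(Pxy.ravel(), Qxy.ravel()),
                  rel_entropy(Px, Qx) + D_cond)
```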