CSCI5370 Quantum Computing                                December 2, 2013
Lecture 12: Quantum Information IV - Channel Coding
Lecturer: Shengyu Zhang                              Scribe: Hing Yin Tsang

12.1 Shannon's channel coding theorem

A classical (discrete memoryless) channel is described by a transition matrix $p(y|x)$. For such a channel, if the encoder sends a message $x^n \in \mathcal{X}^n$, the decoder receives the sequence $y^n \in \mathcal{Y}^n$ with probability $\prod_{i=1}^n p(y_i|x_i)$. A rate $R$ is achievable if there exists a sequence of encoding and decoding functions $C_n : [2^{nR}] \to \mathcal{X}^n$ and $D_n : \mathcal{Y}^n \to [2^{nR}]$ such that the error probability for every message $M$ tends to $0$ as $n \to \infty$. The channel capacity is defined as the maximum achievable rate. A celebrated result of Shannon gives a complete characterization of the channel capacity in terms of mutual information:

Theorem 12.1 (Shannon's channel coding theorem). Let $p(y|x)$ be a discrete memoryless channel. Then
$$C = \max_{p(x)} I(X;Y).$$

Proof sketch of achievability. The proof is by the probabilistic method. Fixing $p(x)$, we generate $2^{nR}$ codewords independently according to the distribution $p(x^n) = \prod_{i=1}^n p(x_i)$. Both the encoder and the decoder know the codebook. Let $y^n_M$ be the sequence received by the decoder when $x^n_M$ is sent. The decoder outputs any $M'$ such that $(x^n_{M'}, y^n_M)$ are jointly typical. An error occurs only when $(x^n_M, y^n_M)$ are not jointly typical, or when there is some $M' \neq M$ such that $(x^n_{M'}, y^n_M)$ are jointly typical. The first event happens with very small probability by the law of large numbers. The second event also happens with small probability: it can be shown that for independent $\tilde{X}^n$ and $\tilde{Y}^n$ with the same marginals as $p(x^n, y^n)$, the probability that $(\tilde{X}^n, \tilde{Y}^n)$ are jointly typical is roughly $2^{-nI(X;Y)}$, so a simple union bound works when the number of codewords is smaller than $2^{nI(X;Y)}$. It follows that there exists a codebook with small average error probability.
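As a quick numerical sanity check of Theorem 12.1, one can maximize $I(X;Y)$ over input distributions for a small channel. The sketch below is my own illustration, not part of the notes: the binary symmetric channel, the grid search, and all function names are assumptions.

```python
import numpy as np

def binary_entropy(p):
    """H(p) = -p log2(p) - (1-p) log2(1-p), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

def mutual_information(px, channel):
    """I(X;Y) in bits, given input distribution px and transition matrix channel[x][y]."""
    px = np.asarray(px, dtype=float)
    channel = np.asarray(channel, dtype=float)
    pxy = px[:, None] * channel          # joint distribution p(x, y)
    py = pxy.sum(axis=0)                 # output marginal p(y)
    mask = pxy > 0                       # avoid log(0) terms, which contribute 0
    return float(np.sum(pxy[mask] * np.log2(pxy[mask] / (px[:, None] * py)[mask])))

# Binary symmetric channel with flip probability f.
f = 0.11
bsc = [[1 - f, f], [f, 1 - f]]

# C = max_{p(x)} I(X;Y), approximated by a grid search over p(X = 0).
capacity = max(mutual_information([q, 1 - q], bsc) for q in np.linspace(0, 1, 10001))
```

For the binary symmetric channel the maximum is attained by the uniform input distribution, recovering the closed form $C = 1 - H(f)$.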
By removing the worst half of the codewords, the maximum error probability becomes small as well, which completes the proof. □

Proof of converse. Let $M$ be a uniformly random message and let $\hat{M}$ be the output of the decoder. Let $p = \Pr[M \neq \hat{M}]$ be the average error probability. We have
$$nR = H(M) = H(M|\hat{M}) + I(M;\hat{M}) \le H(M|\hat{M}) + I(X^n;Y^n) \le H(M|\hat{M}) + nC \le H(p) + npR + nC,$$
where the first inequality is the data processing inequality, the second follows from the fact that the channel is memoryless together with the definition of $C$, and the last follows from Fano's inequality. Hence if $R > C$, then $p \ge (R - C)/R - H(p)/(nR)$, which is bounded away from $0$ as $n \to \infty$. □

12.2 Quantum channel coding theorem

For the quantum analogue, we are given a quantum channel instead of a classical channel, so the codewords can be quantum states. The channel is now described by a trace-preserving quantum operation $\mathcal{E}$ defined on some space $\mathcal{H}$. In this case, if the encoder sends $\rho = \rho_1 \otimes \cdots \otimes \rho_n \in \mathcal{H}^{\otimes n}$, the decoder receives the state $\sigma = \sigma_1 \otimes \cdots \otimes \sigma_n = \mathcal{E}(\rho_1) \otimes \cdots \otimes \mathcal{E}(\rho_n)$. Abusing notation, we also write $\sigma = \mathcal{E}(\rho)$. In the quantum case, decoding is done by performing a measurement. A rate $R$ is achievable if there exist a sequence of product states $\{\rho_M\}_{M=1,\dots,2^{nR}}$ and a measurement described by POVM elements $\{E_M\}_{M=1,\dots,2^{nR}}$ such that
$$\lim_{n\to\infty} \max_M \bigl(1 - \mathrm{Tr}(\mathcal{E}(\rho_M) E_M)\bigr) = 0.$$
The product state capacity of $\mathcal{E}$, denoted $C^{(1)}(\mathcal{E})$, is defined as the maximum achievable rate of $\mathcal{E}$. The quantum analogue of Shannon's channel coding theorem, stated next, characterizes $C^{(1)}(\mathcal{E})$.
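As a concrete instance of a trace-preserving operation $\mathcal{E}$ and of the product form $\sigma = \mathcal{E}(\rho_1) \otimes \cdots \otimes \mathcal{E}(\rho_n)$, here is a small sketch using the qubit depolarizing channel. This particular channel and all names below are my stand-in example, not something discussed in the notes.

```python
import numpy as np

# Kraus representation of the qubit depolarizing channel
#   E(rho) = (1 - p) rho + p I/2,
# a standard example of a trace-preserving quantum operation.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def depolarize(rho, p):
    """Apply E via its Kraus operators; trace preservation follows from
    sum_k K_k^dagger K_k = I."""
    kraus = [np.sqrt(1 - 3 * p / 4) * I2, np.sqrt(p / 4) * X,
             np.sqrt(p / 4) * Y, np.sqrt(p / 4) * Z]
    return sum(K @ rho @ K.conj().T for K in kraus)

p = 0.3
rho1 = np.diag([1.0, 0.0]).astype(complex)   # |0><0|
sigma = depolarize(rho1, p)                  # equals diag(1 - p/2, p/2)

# On a product input rho1 (x) rho2 the channel acts factor by factor,
# matching sigma = E(rho_1) (x) ... (x) E(rho_n) above; the result is
# again a valid (trace-one) state.
rho2 = 0.5 * np.ones((2, 2), dtype=complex)  # |+><+|
lhs = np.kron(depolarize(rho1, p), depolarize(rho2, p))
```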
Theorem 12.2 (Holevo-Schumacher-Westmoreland theorem).
$$C^{(1)}(\mathcal{E}) = \max_{\{p_j,\rho_j\}} \left[ S\Bigl(\mathcal{E}\Bigl(\sum_j p_j \rho_j\Bigr)\Bigr) - \sum_j p_j S(\mathcal{E}(\rho_j)) \right],$$
where the maximum is over all ensembles $\{p_j, \rho_j\}$ of possible input states $\rho_j$ to the channel $\mathcal{E}$. We denote the maximum on the right-hand side by $\chi(\mathcal{E})$.

Proof sketch of achievability. The idea is to argue that randomly generated codewords (note that a codeword is now a quantum state) achieve small average error probability, and then to use the trick from Shannon's proof to turn average error probability into maximum error probability by removing bad codewords. The rate is not affected by much in this step.

Specifically, let $\{p_j, \rho_j\}$ be an ensemble. For each message $M$, we generate a state $\rho_M = \rho_{M_1} \otimes \cdots \otimes \rho_{M_n}$, where the $M_i$ are independent and $M_i = j$ with probability $p_j$. This is our codebook.

Before defining the measurement, we need several definitions. Let $\sigma_{M_i} = \mathcal{E}(\rho_{M_i})$, $\sigma_M = \sigma_{M_1} \otimes \cdots \otimes \sigma_{M_n}$, and $\bar{\sigma} = \sum_j p_j \sigma_j$. Let $P$ be the projector onto the $\varepsilon$-typical subspace of $\bar{\sigma}^{\otimes n}$. By the typical subspace theorem, for any $\delta > 0$ and $n$ sufficiently large,
$$\mathrm{Tr}(\bar{\sigma}^{\otimes n}(I - P)) \le \delta.$$
Let $\sigma_j = \sum_k \lambda^j_k |e^j_k\rangle\langle e^j_k|$ be the spectral decomposition. Then
$$\sigma_M = \sum_K \lambda^M_K |E^M_K\rangle\langle E^M_K|,$$
where $K = (K_1, \dots, K_n)$, $\lambda^M_K = \prod_{i=1}^n \lambda^{M_i}_{K_i}$, and $|E^M_K\rangle = |e^{M_1}_{K_1}\rangle \cdots |e^{M_n}_{K_n}\rangle$. Let $\bar{S} = \sum_j p_j S(\sigma_j)$,
$$T_M = \left\{ K = (K_1, \dots, K_n) : \left| \frac{1}{n} \log \frac{1}{\lambda^M_K} - \bar{S} \right| \le \varepsilon \right\},$$
and let $P_M$ be the projector onto the subspace spanned by $\{|E^M_K\rangle : K \in T_M\}$. Note that the $P_M$ are random, since the $|E^M_K\rangle = |e^{M_1}_{K_1}\rangle \cdots |e^{M_n}_{K_n}\rangle$ are random. It can be shown (by the law of large numbers again) that for any $\delta > 0$ and $n$ large enough, $\mathbb{E}(\mathrm{Tr}(\sigma_M P_M)) \ge 1 - \delta$. Moreover, $\mathbb{E}(\mathrm{Tr}(P_M)) \le 2^{n(\bar{S} + \varepsilon)}$.

Now we define the measurement $\{E_M\}$. Let
$$E_M = \Bigl(\sum_{M'} P P_{M'} P\Bigr)^{-1/2} P P_M P \Bigl(\sum_{M'} P P_{M'} P\Bigr)^{-1/2}$$
(with the inverse taken on the support) and $E_0 = I - \sum_M E_M$. For this measurement $\{E_M\}$, it is possible to show that the average error probability $p_{\mathrm{ave}}$, defined by
$$p_{\mathrm{ave}} = \frac{\sum_M \bigl(1 - \mathrm{Tr}(\sigma_M E_M)\bigr)}{2^{nR}},$$
can be upper bounded by
$$p_{\mathrm{ave}} \le \frac{1}{2^{nR}} \sum_M \Bigl[ 3\,\mathrm{Tr}(\sigma_M(I - P)) + \sum_{M' \neq M} \mathrm{Tr}(P \sigma_M P\, P_{M'}) + \mathrm{Tr}(\sigma_M(I - P_M)) \Bigr].$$
Taking expectation over all random codes, since $\mathbb{E}(\sigma_M) = \bar{\sigma}^{\otimes n}$ and $\sigma_M$ and $P_{M'}$ are independent for $M \neq M'$, we have
$$\mathbb{E}(p_{\mathrm{ave}}) \le 3\,\mathrm{Tr}(\bar{\sigma}^{\otimes n}(I - P)) + (2^{nR} - 1)\,\mathrm{Tr}\bigl(P \bar{\sigma}^{\otimes n} P\, \mathbb{E}(P_1)\bigr) + \mathbb{E}\bigl(\mathrm{Tr}(\sigma_M(I - P_M))\bigr) \le 4\delta + (2^{nR} - 1)\,\mathrm{Tr}\bigl(P \bar{\sigma}^{\otimes n} P\, \mathbb{E}(P_1)\bigr).$$
Finally, observe that $P \bar{\sigma}^{\otimes n} P \le 2^{-n(S(\bar{\sigma}) - \varepsilon)} I$, since every eigenvalue of $\bar{\sigma}^{\otimes n}$ on the typical subspace is at most $2^{-n(S(\bar{\sigma}) - \varepsilon)}$. Thus we have
$$\mathrm{Tr}\bigl(P \bar{\sigma}^{\otimes n} P\, \mathbb{E}(P_1)\bigr) \le 2^{-n(S(\bar{\sigma}) - \varepsilon)}\, \mathbb{E}(\mathrm{Tr}(P_1)) \le 2^{-n(S(\bar{\sigma}) - \bar{S} - 2\varepsilon)}.$$
Hence $\mathbb{E}(p_{\mathrm{ave}}) \le 4\delta + (2^{nR} - 1) 2^{-n(S(\bar{\sigma}) - \bar{S} - 2\varepsilon)}$, which tends to $0$ whenever $R < S(\bar{\sigma}) - \bar{S} - 2\varepsilon$; taking the maximizing ensemble and $\varepsilon$ small shows that every rate $R < \chi(\mathcal{E})$ is achievable. □

Proof of converse. We show that $p_{\mathrm{ave}}$ is bounded away from $0$ whenever $R > \chi(\mathcal{E})$, which will complete the proof. Note that
$$p_{\mathrm{ave}} = \frac{\sum_M \bigl(1 - \mathrm{Tr}(\sigma_M E_M)\bigr)}{2^{nR}} = \Pr[M \neq Y],$$
where $Y$ denotes the measurement outcome when the message $M$ is uniformly random. Now consider the ensemble $\{1/2^{nR}, \sigma_M\}$ and let $\bar{\sigma} = \sum_M \sigma_M / 2^{nR}$. By Holevo's bound, we have
$$S(\bar{\sigma}) - \sum_M \frac{S(\sigma_M)}{2^{nR}} \ge I(M;Y) = H(M) - H(M|Y) = nR - H(M|Y) \ge nR - H(p_{\mathrm{ave}}) - p_{\mathrm{ave}} \log(2^{nR} - 1) \ge nR - H(p_{\mathrm{ave}}) - n p_{\mathrm{ave}} R,$$
using Fano's inequality in the second-to-last step. Since the $\sigma_M$ are product states, $S(\sigma_M) = \sum_{i=1}^n S(\sigma_M^i)$, where $\sigma_M^i$ denotes the $i$-th factor of $\sigma_M$. By the subadditivity of entropy, $S(\bar{\sigma}) \le \sum_{i=1}^n S(\bar{\sigma}^i)$, where $\bar{\sigma}^i = \sum_M \sigma_M^i / 2^{nR}$. By the definition of $\chi(\mathcal{E})$, $S(\bar{\sigma}^i) - \sum_M S(\sigma_M^i)/2^{nR} \le \chi(\mathcal{E})$ for all $i$, and hence we have
$$n\chi(\mathcal{E}) \ge nR - H(p_{\mathrm{ave}}) - n p_{\mathrm{ave}} R.$$
As $n \to \infty$,
$$p_{\mathrm{ave}} \ge \frac{R - \chi(\mathcal{E})}{R} - \frac{H(p_{\mathrm{ave}})}{nR} \to \frac{R - \chi(\mathcal{E})}{R},$$
which is bounded away from $0$ when $R > \chi(\mathcal{E})$, and the proof is completed. □
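The Holevo quantity maximized in Theorem 12.2 is easy to evaluate numerically for small examples. The sketch below is my own illustration, not from the notes: the depolarizing channel and the two-state ensemble are assumptions chosen for concreteness. It computes $S(\mathcal{E}(\sum_j p_j \rho_j)) - \sum_j p_j S(\mathcal{E}(\rho_j))$ for one ensemble.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr(rho log2 rho), computed from the eigenvalues."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]         # 0 log 0 = 0 convention
    return float(-np.sum(evals * np.log2(evals)))

def depolarize(rho, p):
    """Qubit depolarizing channel E(rho) = (1 - p) rho + p I/2 (stand-in example)."""
    return (1 - p) * rho + p * np.eye(2) / 2

# Ensemble {1/2, |0><0|; 1/2, |1><1|} sent through the channel.
probs = [0.5, 0.5]
states = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]
p = 0.3
outputs = [depolarize(rho, p) for rho in states]

# chi = S(E(sum_j p_j rho_j)) - sum_j p_j S(E(rho_j)); since E is linear,
# the average of the outputs equals the output of the average input.
avg_out = sum(q * s for q, s in zip(probs, outputs))
chi = von_neumann_entropy(avg_out) - sum(
    q * von_neumann_entropy(s) for q, s in zip(probs, outputs))

def H2(x):
    """Binary entropy in bits."""
    return float(-x * np.log2(x) - (1 - x) * np.log2(1 - x))
```

For this symmetric ensemble the quantity comes out to $1 - H(p/2)$ in closed form; maximizing over all ensembles would give $C^{(1)}(\mathcal{E})$ itself.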