信息论与编码理论基础 (Fundamentals of Information Theory and Coding Theory)
December 4, 2007
XDU, Fall 2007 Lecture Notes

Introduction to Channel Coding

I. Error control coding (overview)

Shannon showed that reliable communication can be achieved by proper coding of the information to be transmitted, provided that the rate of information transmission is below the channel capacity.

Coding is achieved by adding properly designed redundancy to each message before its transmission. The added redundancy is used for error control. The redundancy may appear in the form of extra symbols (or bits), in the form of channel signal-set expansion, or as a combination of both.

Coding may be designed and performed separately from modulation, or designed in conjunction with modulation as a single entity. In the former case, redundancy appears in the form of extra symbols, normally called parity-check symbols. Coding achieved by adding extra redundant digits is known as conventional coding, in which error control (or coding gain) is achieved at the expense of bandwidth expansion or data-rate reduction. Therefore, conventional coding is suitable for error control in power-limited channels, such as the deep-space channel.

In the case that coding is designed in conjunction with modulation, the redundancy comes from channel signal-set expansion. This combination of coding and modulation is usually known as coded modulation, which allows us to achieve error control (or coding gain) without compromising bandwidth efficiency. We refer to this technique as bandwidth-efficient coding.
Historical notes:
  Hamming codes (1950)
  Reed-Muller codes (1954)
  BCH codes (by Bose, Ray-Chaudhuri and Hocquenghem, 1959)
  Reed-Solomon codes (1960)
    (both decodable with the BM (Berlekamp-Massey) algorithm, 1968)
  Low-density parity-check codes (by Gallager in 1962, rediscovered in the 1990s)
  Convolutional codes (by Elias, 1955)
  Viterbi algorithm (1967)
  Concatenated codes (by Forney, 1966)
  Trellis-coded modulation (by Ungerboeck, 1982)
  Turbo codes (by Berrou, 1993)
  Space-time codes (by Vahid Tarokh, 1998)

Applications:
  Deep space, satellite, mobile communications, voice modem, data networks, etc.

Two simple examples:
Repetition codes:
  0 → 000000
  1 → 111111
  (n, 1) code

Single parity-check codes (each message followed by its even-parity check bit):
  0000  0
  0001  1
  0010  1
  0011  0
   ...
  (n, n-1) code

References:
[1] Shu Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Applications, 2nd ed. Prentice-Hall, 2004.
[2] 王新梅, 肖国镇, 纠错码——原理与方法 (Error-Correcting Codes: Principles and Methods). Xi'an: Xidian University Press, 1991.
[3] D. J. Costello, J. Hagenauer, H. Imai, and S. B. Wicker, "Applications of error-control coding," IEEE Trans. Inform. Theory, vol. 44, no. 6, pp. 2531-2560, Oct. 1998.
[4] D. J. Costello and G. D. Forney, "Channel coding: The road to channel capacity," Proceedings of the IEEE, vol. 95, no. 6, pp. 1150-1177, June 2007.

II. Block coding

In block coding, the information sequence is divided into messages of k information bits (or symbols) each. Each message is mapped into a structured sequence of n bits (with n > k), called a codeword:

  (u_0, u_1, ..., u_{k-1})  <-->  (c_0, c_1, ..., c_{n-1})
         message                        codeword

The mapping operation is called encoding. Each encoding operation is independent of past encodings. The collection of all codewords is called an (n, k) block code, where n and k are the length and dimension of the code, respectively. In the process of encoding, n-k redundant bits are added to each message for protection against transmission errors.

For example, consider a (5, 2) binary code of size M = 2^k = 4:

  00 <--> 10101 = c_1
  01 <--> 10010 = c_2
  10 <--> 10010 = c_3
  11 <--> 11110 = c_4

  C = {c_1, c_2, c_3, c_4}

An important class of block codes is the class of linear block codes. A block code is said to be linear if the vector sum of two codewords is also a codeword:

  c_i = (c_{i,0}, c_{i,1}, ..., c_{i,n-1}) ∈ C,  c_j = (c_{j,0}, c_{j,1}, ..., c_{j,n-1}) ∈ C
  ⇒  c_i ⊕ c_j = (c_{i,0} ⊕ c_{j,0}, c_{i,1} ⊕ c_{j,1}, ..., c_{i,n-1} ⊕ c_{j,n-1}) ∈ C

More generally, a linear code is a subspace of GF(q)^n (closed under vector addition and scalar multiplication).
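The closure property can be checked mechanically. The following is a minimal sketch (the tiny (5, 2) generator matrix here is my own illustration, not the code from the notes): encode every message as u·G over GF(2) and verify that the set of codewords is closed under componentwise modulo-2 addition, i.e., forms a subspace.

```python
from itertools import product

# A hypothetical (5,2) linear code for illustration: the rows of G are
# basis codewords, so every codeword is a GF(2) combination of them.
G = [
    [1, 0, 1, 0, 1],
    [0, 1, 0, 1, 1],
]

def encode(u):
    """c = u.G over GF(2) for a 2-bit message u."""
    return tuple(sum(u[i] * G[i][j] for i in range(2)) % 2 for j in range(5))

code = {encode(u) for u in product([0, 1], repeat=2)}

# Linearity: the XOR (vector sum over GF(2)) of any two codewords
# is again a codeword, so the code is a subspace of GF(2)^5.
for x in code:
    for y in code:
        assert tuple(a ^ b for a, b in zip(x, y)) in code
```

Note that the all-zero word is always a codeword of a linear code (it is the sum of any codeword with itself), which the check above exercises implicitly.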
Linear block codes are normally put in systematic form:

  (c_0, c_1, ..., c_{n-1}) = (c_0, c_1, ..., c_{n-k-1}, u_0, u_1, ..., u_{k-1})
                              \-- parity-check part --/ \--- message part ---/

Each parity-check bit is a linear sum of the message bits, i.e.,

  c_j = Σ_{i=0}^{k-1} p_{ij} u_i,   j = 0, 1, ..., n-k-1,

where p_{ij} = 0 or 1. The n-k equations which give the parity-check bits are called the parity-check equations. They specify the encoding rule.

For an (n, k) block code, the ratios R = k/n and η = (n-k)/n are called the code rate and the redundancy, respectively.

An example of a block code: Let n = 7 and k = 4. Consider the (7, 4) linear systematic block code with

  Message:  (u_0, u_1, u_2, u_3)
  Codeword: (c_0, c_1, c_2, c_3, c_4, c_5, c_6) = (c_0, c_1, c_2, u_0, u_1, u_2, u_3)

Here,

  c_0 = u_0 + u_1 + u_2
  c_1 = u_1 + u_2 + u_3
  c_2 = u_0 + u_1 + u_3

In matrix form:

                             [ 1 0 1 1 0 0 0 ]
  c = (u_0, u_1, u_2, u_3) · [ 1 1 1 0 1 0 0 ] = u·G
                             [ 1 1 0 0 0 1 0 ]
                             [ 0 1 1 0 0 0 1 ]

Encoder circuit:
[Figure: encoder circuit for the (7, 4) code — the message bits are held in a message register, and three modulo-2 adders (⊕) form the parity bits c_0, c_1, c_2 for output to the channel.]

III. Convolutional Coding

During each unit of time, the input to a convolutional encoder is a k-bit message block and the corresponding output is an n-bit coded block, with k < n. (To avoid confusion with block-code parameters, k_0 and n_0 are sometimes used instead.) Each coded n-bit output block depends not only on the corresponding k-bit input message block at the same time unit but also on the m previous message blocks, so the encoder has memory.

[Figure: a convolutional encoder — the input data stream is shifted into a shift register one k-bit data frame at a time; combinational logic over the current and the m previous frames produces each n-bit codeword frame.]

The code rate is defined as R = k/n. The parameter m is called the memory order of the code.
In the process of coding, the information sequence u is divided into data frames of length k. These subsequences of length k are applied to the k input terminals of the encoder, producing coded sequences of length n.

An example: Let n = 2, k = 1 and m = 2. Consider a rate-1/2 (2, 1, 2) convolutional code which is specified by the following two generator sequences:

  g^(1) = (1 0 1),   g^(2) = (1 1 1)

[Figure: the (2, 1, 2) encoder — the input u feeds a two-stage shift register, and two modulo-2 adders form the outputs c^(1) and c^(2).]

Note: g^(1) and g^(2) can be viewed as the two impulse responses of the encoder, obtained from the input u = δ = (1 0 0 ...). An impulse response lasts at most m+1 time units and can be written as

  g^(1) = (g_0^(1), g_1^(1), ..., g_m^(1)),   g^(2) = (g_0^(2), g_1^(2), ..., g_m^(2)).

- Let u = (u_0, u_1, ...) be the input message sequence. Then the two output sequences are

  c^(1) = u * g^(1)
  c^(2) = u * g^(2)

(the encoding equations: convolution of the input with the impulse responses).

- At the l-th time unit, the input is a single bit u_l. The corresponding output is a block of two bits, (c_l^(1), c_l^(2)), given by

  c_l^(j) = Σ_{i=0}^{m} u_{l-i} g_i^(j) = u_l g_0^(j) + u_{l-1} g_1^(j) + ... + u_{l-m} g_m^(j)

  ⇒  c_l^(1) = u_l + u_{l-2}
      c_l^(2) = u_l + u_{l-1} + u_{l-2}

(the terms u_{l-1} and u_{l-2} come from the encoder memory).

- The output codeword is given by c = (c_0^(1) c_0^(2), c_1^(1) c_1^(2), ...).

For u = (1 0 1 1 1 0 0 ...), c = (11, 01, 00, 10, 01, 10, 11, ...).
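The encoding equations above can be sketched directly as GF(2) convolutions; this reproduces the example output (a minimal sketch, with my own function names):

```python
# Encoder for the (2,1,2) code: each output stream is the GF(2) convolution
# of the input sequence with the corresponding generator sequence.

G1 = [1, 0, 1]   # g(1) = (101): c(1)_l = u_l + u_{l-2}
G2 = [1, 1, 1]   # g(2) = (111): c(2)_l = u_l + u_{l-1} + u_{l-2}

def conv_encode(u, gens=(G1, G2)):
    """Return the output frames (c(1)_l, c(2)_l) for the input bit list u."""
    frames = []
    for l in range(len(u)):
        frames.append(tuple(
            sum(g[i] * u[l - i] for i in range(len(g)) if l - i >= 0) % 2
            for g in gens))
    return frames

# Reproduces the example from the notes: u = 1011100 -> 11 01 00 10 01 10 11
u = [1, 0, 1, 1, 1, 0, 0]
assert conv_encode(u) == [(1,1), (0,1), (0,0), (1,0), (0,1), (1,0), (1,1)]
```

The guard `l - i >= 0` implements the convention that the encoder memory is initialized to zero (bits before time 0 are taken as 0).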
State Diagram: Since the encoder is a linear sequential circuit, its behavior can be described by a state diagram. The encoder state at time l is represented by the message bits stored in the memory units. The encoder of the (2, 1, 2) convolutional code given in the example has 4 possible states, and its state diagram is shown in the figure below.

[Figure: state diagram with states 00, 01, 10, 11 and branches labeled input/output: from 00: 0/00 to 00, 1/11 to 10; from 01: 0/11 to 00, 1/00 to 10; from 10: 0/01 to 01, 1/10 to 11; from 11: 0/10 to 01, 1/01 to 11.]

Trellis diagram: The state diagram can be expanded in time to display the state transitions of a convolutional encoder over time. This expansion in time results in a trellis diagram.

[Figure: trellis diagram for the (2, 1, 2) code.]

The encoding of a message sequence u is equivalent to tracing a path through the trellis. The trellis structure is very useful in decoding a convolutional code.
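To illustrate how the trellis is used in decoding, here is a minimal hard-decision Viterbi sketch for the (2, 1, 2) code (my own illustration — the decoding algorithm itself is not developed in these notes): each survivor tracks the trellis path with the smallest accumulated Hamming distance to the received frames.

```python
# Hard-decision Viterbi decoding over the 4-state trellis of the (2,1,2) code
# with c(1)_l = u_l + u_{l-2} and c(2)_l = u_l + u_{l-1} + u_{l-2}.

def step(state, u):
    """State = (u_{l-1}, u_{l-2}); returns (next_state, output frame)."""
    s1, s2 = state
    return (u, s1), ((u + s2) % 2, (u + s1 + s2) % 2)

def viterbi(received):
    """received: list of 2-bit frames; returns the most likely input bits."""
    # survivors: state -> (accumulated Hamming metric, decoded bits so far)
    survivors = {(0, 0): (0, [])}
    for r in received:
        nxt = {}
        for state, (metric, bits) in survivors.items():
            for u in (0, 1):
                ns, out = step(state, u)
                m = metric + (out[0] != r[0]) + (out[1] != r[1])
                if ns not in nxt or m < nxt[ns][0]:
                    nxt[ns] = (m, bits + [u])
        survivors = nxt
    return min(survivors.values())[1]   # best metric over all final states

c = [(1,1), (0,1), (0,0), (1,0), (0,1), (1,0), (1,1)]  # encoding of u = 1011100
assert viterbi(c) == [1, 0, 1, 1, 1, 0, 0]             # error-free: u recovered
c[2] = (1, 0)                                          # one channel bit error
assert viterbi(c) == [1, 0, 1, 1, 1, 0, 0]             # still recovered
```

Keeping only the best path into each state at every step is exactly the pruning the trellis structure makes possible: the complexity grows with the number of states, not with the number of message sequences.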
IV. Conventional Coding

1. Types of codes

Codes can be classified in several ways:
- based on structure: block codes (e.g., linear codes, cyclic codes) vs. convolutional codes
- random-error-correcting codes vs. burst-error-correcting codes
- binary codes vs. nonbinary codes
- error-correction codes vs. error-detection codes

2. Error correcting capacity

The error correcting capacity of a code C depends on its distance structure.

The Hamming distance between two codewords x and y in a code, denoted by d_H(x, y), is defined as the number of places in which they differ:

  d_H(x, y) = Σ_{i=1}^{n} d_H(x_i, y_i),  where d_H(x_i, y_i) = 1 if x_i ≠ y_i and 0 if x_i = y_i,

or equivalently d_H(x, y) = |{i : x_i ≠ y_i}|.

For example, d_H(010, 111) = 2 and d_H(30102, 21103) = 3.

Hamming distance satisfies the axioms for a distance metric:
1) d_H(x, y) ≥ 0, with equality iff x = y
2) d_H(x, y) = d_H(y, x)  (symmetry)
3) d_H(x, y) ≤ d_H(x, z) + d_H(z, y)  (triangle inequality)

The minimum Hamming distance of a code C is defined as

  d_min = min{ d_H(x, y) | x, y ∈ C, x ≠ y }.

For a convolutional code, this minimum Hamming distance is usually called the minimum free distance, denoted by d_free.

An (n, k) block code with minimum Hamming distance d_min is capable of correcting
t = ⌊(d_min - 1)/2⌋ or fewer random errors over a block of n digits (using the minimum distance decoding rule). This parameter t is called the error correcting capacity of the code.

[Figure: decoding spheres of radius t centered on the codewords; the spheres are disjoint provided d_min ≥ 2t + 1.]

The minimum Hamming distance of a linear block code depends on the choice of parity-check equations and on the number of parity bits, n - k.

3. Important codes

1) Algebraic block codes
- Hamming codes
- BCH codes: a large class of powerful multiple random-error-correcting codes, rich in algebraic structure, with algebraic decoding algorithms available.
- Golay (23, 12) code: a perfect triple-error-correcting code, widely used and generated by
    g(x) = 1 + x^2 + x^4 + x^5 + x^6 + x^10 + x^11
- Reed-Muller codes
- Reed-Solomon codes: nonbinary, correcting symbol errors or burst errors; the codes most widely used for error control in data communications and data storage.

2) Convolutional codes: e.g., the (2, 1, 6) code generated by
    g^(1) = (1101101),  g^(2) = (1001111)
  This code has d_free = 10.

3) Codes (defined) on graphs: low-density parity-check codes and turbo codes (capacity-approaching codes).

4. Types of error control schemes

- Forward error correction (FEC): an error-correction code is used.
- Automatic repeat request (ARQ): an error-detection code is used. If the presence of errors is detected in a received word, a retransmission is requested. The request signal is sent to the transmitter through a feedback channel. Retransmission continues until no errors are detected.
- Hybrid ARQ: a proper combination of FEC and ARQ.

5. Decoding
Based on the received sequence, the encoding rules, and the noise characteristics of the channel, the receiver makes a decision on which message was actually transmitted. This decision-making operation is called decoding.

Hard decision:
When binary coding is used, the modulator has only binary inputs (M = 2). If binary demodulator output quantization is used (Q = 2), the decoder has only binary inputs. In this case, the demodulator is said to make hard decisions. Decoding based on the hard decisions made by the demodulator is called hard-decision decoding.

Soft decision:
If the output of the demodulator consists of more than two quantization levels (Q > 2) or is left unquantized, the demodulator is said to make soft decisions. Decoding based on soft decisions is called soft-decision decoding.

[Figure: demodulator followed by a Q-level quantizer feeding the decoder; Q = 2 corresponds to hard decisions, Q > 2 to soft decisions.]

Hard-decision decoding is much easier to implement than soft-decision decoding. However, soft-decision decoding offers a significant performance improvement over hard-decision decoding. See Figure 2.

[Figure 2: Channel capacity under soft-decision and hard-decision decoding — capacity (bits/symbol) versus Eb/N0 (dB), showing the Shannon bound, the BPSK soft-decision bound, and the BSC (hard-decision) bound.]

Optimal decoding
Given that y is received, the conditional error probability of decoding is defined as

  P(E | y) ≜ P(ĉ ≠ c | y).

Then the error probability of decoding is

  P(E) = Σ_y P(E | y) P(y).

A decoding rule that minimizes P(E) is referred to as an optimal decoding rule. Since minimizing P(ĉ ≠ c | y) is equivalent to maximizing P(ĉ = c | y), we have the

  MAP rule: ĉ = argmax_c P(c | y)

Maximum-likelihood decoding (MLD): Noting that P(c | y) = P(c) P(y | c) / P(y), we have the

  ML rule: ĉ = argmax_c P(y | c)   (supposing all messages are equally likely)

6. MLD for a BSC

In coding for a BSC, every codeword and every received word are binary sequences. Suppose some codeword is transmitted and the received word is y = (y_1, y_2, ..., y_n). For a codeword c_i, the conditional probability P(y | c_i) is

  P(y | c_i) = p^{d_H(y, c_i)} (1 - p)^{n - d_H(y, c_i)}.

For p < 1/2, P(y | c_i) > P(y | c_j) iff d_H(y, c_i) < d_H(y, c_j).

MLD:
1) Compute d_H(y, c_i) for all c_i ∈ C.
2) Take c_i as the transmitted codeword if d_H(y, c_i) < d_H(y, c_j) for all j ≠ i.
3) Decode c_i into its message u_i.

This is called minimum distance (nearest neighbor) decoding.

7. Performance measure and coding gain

Block-error probability: the probability that a decoded word is in error.
Bit-error probability: the probability that a decoded bit is in error.

The usual figure of merit for a communication system is the ratio of energy per information bit to noise power spectral density, Eb/N0, that is required to achieve a