
Chapter 7: Lossless Compression Algorithms

7.1 Introduction
7.2 Basics of Information Theory
7.3 Run-Length Coding
7.4 Variable-Length Coding (VLC)
7.5 Dictionary-based Coding
7.6 Arithmetic Coding
7.7 Lossless Image Compression
7.8 Further Exploration

7.1 Introduction

• Compression: the process of coding that will effectively reduce the total number of bits needed to represent certain information.

Fig. 7.1: A General Data Compression Scheme. Input data → Encoder (compression) → Storage or networks → Decoder (decompression) → Output data.

Introduction (cont'd)

• If the compression and decompression processes induce no information loss, then the compression scheme is lossless; otherwise, it is lossy.

• Compression ratio:

$$\text{compression ratio} = \frac{B_0}{B_1} \qquad (7.1)$$

$B_0$ – number of bits before compression
$B_1$ – number of bits after compression
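As a quick illustration (not from the slides), here is a minimal Python sketch of Eq. (7.1); the function name compression_ratio is hypothetical:

    def compression_ratio(bits_before: int, bits_after: int) -> float:
        """Eq. (7.1): B0 / B1, the ratio of bit counts before and after compression."""
        return bits_before / bits_after

    # e.g., a 1,000,000-bit file compressed down to 250,000 bits:
    # compression_ratio(1_000_000, 250_000)  ->  4.0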

7.2 Basics of Information Theory

• The entropy $\eta$ of an information source with alphabet $S = \{s_1, s_2, \ldots, s_n\}$ is:

$$\eta = H(S) = \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i} \qquad (7.2)$$

$$= -\sum_{i=1}^{n} p_i \log_2 p_i \qquad (7.3)$$

$p_i$ – probability that symbol $s_i$ will occur in $S$.

$\log_2 \frac{1}{p_i}$ – indicates the amount of information (self-information as defined by Shannon) contained in $s_i$, which corresponds to the number of bits needed to encode $s_i$.
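A minimal Python sketch of Eq. (7.3) (the helper name entropy is an assumption for illustration, not from the text):

    import math

    def entropy(probabilities):
        """Eq. (7.3): eta = -sum(p_i * log2(p_i)); zero-probability symbols contribute nothing."""
        return -sum(p * math.log2(p) for p in probabilities if p > 0)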

Distribution of Gray-Level Intensities

Fig. 7.2: Histograms for Two Gray-level Images. Both panels plot $p_i$ against gray level $i \in [0, 255]$: (a) a flat histogram with $p_i = 1/256$ everywhere; (b) a histogram concentrated on two gray levels, with probabilities 2/3 and 1/3.

• Fig. 7.2(a) shows the histogram of an image with uniform distribution of gray-level intensities, i.e., $\forall i$, $p_i = 1/256$. Hence, the entropy of this image is:

$$\eta = \log_2 256 = 8 \qquad (7.4)$$
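Using the entropy sketch from Section 7.2 (the probabilities {2/3, 1/3} for panel (b) are read off the histogram, so treat that value as an assumption):

    # Fig. 7.2(a): uniform distribution over 256 gray levels
    entropy([1/256] * 256)   # -> 8.0 bits per symbol, matching Eq. (7.4)

    # Fig. 7.2(b): two gray levels with probabilities 2/3 and 1/3
    entropy([2/3, 1/3])      # -> 0.918... bits per symbol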

Entropy and Code Length

• As can be seen in Eq. (7.3), the entropy $\eta$ is a weighted sum of the terms $\log_2 \frac{1}{p_i}$; hence it represents the average amount of information contained per symbol in the source $S$.

• The entropy $\eta$ specifies the lower bound for the average number of bits needed to code each symbol in $S$, i.e.,

$$\eta \leq \bar{l} \qquad (7.5)$$

$\bar{l}$ – the average length (measured in bits) of the codewords produced by the encoder.

For example, for the uniform image of Fig. 7.2(a), $\eta = 8$: no lossless code can average fewer than 8 bits per pixel, and the ordinary fixed 8-bit representation already attains this bound.

7.3 Run-Length Coding

• Memoryless Source: an information source that is independently distributed, i.e., the value of the current symbol does not depend on the values of previously appeared symbols.

• Instead of assuming a memoryless source, Run-Length Coding (RLC) exploits the memory present in the information source.

• Rationale for RLC: if the information source has the property that symbols tend to form continuous groups, then each such symbol and the length of its group can be coded, as in the sketch below.
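A minimal run-length coding sketch in Python (the names rle_encode and rle_decode are assumptions for illustration; a real RLC format would also fix how the symbol/length pairs are packed into bits):

    def rle_encode(data):
        """Replace each run of identical symbols with a (symbol, run_length) pair."""
        runs = []
        for symbol in data:
            if runs and runs[-1][0] == symbol:
                runs[-1][1] += 1          # extend the current run
            else:
                runs.append([symbol, 1])  # start a new run
        return [(s, n) for s, n in runs]

    def rle_decode(runs):
        """Expand (symbol, run_length) pairs back into the original string."""
        return "".join(s * n for s, n in runs)

    # rle_encode("aaaabbc")  ->  [('a', 4), ('b', 2), ('c', 1)]
    # rle_decode([('a', 4), ('b', 2), ('c', 1)])  ->  "aaaabbc"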

7.4 Variable-Length Coding (VLC)

Shannon-Fano Algorithm — a top-down approach:

1. Sort the symbols according to the frequency count of their occurrences.
2. Recursively divide the symbols into two parts, each with approximately the same number of counts, until all parts contain only one symbol (a sketch implementation follows the example below).

An Example: coding of "HELLO"

Symbol   H   E   L   O
Count    1   1   2   1

Frequency count of the symbols in "HELLO".
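A minimal Python sketch of the two steps above (the function name shannon_fano is an assumption; when two cuts balance the counts equally well, the earliest one is taken, which reproduces the splits of Fig. 7.3):

    def shannon_fano(counts):
        """counts: {symbol: frequency} -> {symbol: bit string}."""
        codes = {s: "" for s in counts}

        def split(group):
            # Step 2: divide the group where the two parts' total counts
            # are as nearly equal as possible (earliest cut on ties).
            if len(group) <= 1:
                return
            total = sum(counts[s] for s in group)
            acc, cut, best = 0, 1, float("inf")
            for i in range(1, len(group)):
                acc += counts[group[i - 1]]
                diff = abs(2 * acc - total)
                if diff < best:
                    cut, best = i, diff
            for s in group[:cut]:
                codes[s] += "0"
            for s in group[cut:]:
                codes[s] += "1"
            split(group[:cut])
            split(group[cut:])

        # Step 1: sort the symbols by descending frequency count.
        split(sorted(counts, key=counts.get, reverse=True))
        return codes

    # shannon_fano({"H": 1, "E": 1, "L": 2, "O": 1})
    # -> {"L": "0", "H": "10", "E": "110", "O": "111"}   (cf. Table 7.1)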

Fig. 7.3: Coding Tree for HELLO by Shannon-Fano. (a) First division: the root (5) splits into L:(2) and H,E,O:(3). (b) Second division: H,E,O:(3) splits into H:(1) and E,O:(2). (c) Final tree: E,O:(2) splits into E:(1) and O:(1); labeling left branches 0 and right branches 1 yields the codes of Table 7.1.

Table 7.1: Result of Performing Shannon-Fano on HELLO

Symbol   Count   log_2(1/p_i)   Code   # of bits used
L        2       1.32           0      2
H        1       2.32           10     2
E        1       2.32           110    3
O        1       2.32           111    3

TOTAL number of bits: 10
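As a check against the entropy bound of Eq. (7.5) (this comparison is not on the slide): the average codeword length is 10/5 = 2 bits per symbol, while the source entropy is only about 1.92 bits, so $\eta \leq \bar{l}$ indeed holds. Using the entropy sketch from Section 7.2:

    # probabilities in "HELLO": p_L = 2/5, p_H = p_E = p_O = 1/5
    entropy([2/5, 1/5, 1/5, 1/5])   # -> 1.921..., below the 2.0 bits/symbol achieved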