EEE TRANSACTIONS ON IMAGE PROCESSING. VOL I. NO. 2. APRIL 1992 Image Coding using Wavelet Transform Marc Antonini, Michel Barlaud, Member, IEEE, Pierre Mathieu, and Ingrid Daubechies, Member, IEEE together with its implementation as described by Mallat [27], satisfies each of these conditions. wvelet s; this new method involves two steps. First, we use aa wavelet transform and a vector quantization coding der to obtain a set of biorthogonal sub- scheme. The wavelet coefficients are coded considering a asses of images; the original image is decomposed at different noise shaping bit allocation procedure. This technique ex maintains constant the number of pixels required to describe in the image data, enabling bit rate reduction. %N s an Section lI describes the wavelet transforms used in this ory, the wavelet coefficients are vector quantized using a multi- paper. After a quick review of wavelets in general, we resolution codebook. Furthermore, to encode the wavelet coef- explain in more detail the properties and construction of hich assumes that details at high resolution are less visible to re regular biorthogonal wavelet bases. We then extend tI m ize a picture as quickiy as possible at minimum cost we scheme with separable filters. The new coding scher wavelet t ransforesi is tranimission scheme It is shown that the next presented in Section ll. We focus particularly in this transmission ients, on the asymptotic coding gain that can be achieved orthogonal wavelet, multiscale py. using vector quantization in the subimages, and on the algorithm, vector quantization, noise shaping, pro- optimal allocation across the subimages. Experimental re- sults are given in Section IV for images taken within and outside of the training Se ACTIc different fields, digitized images are replacing A. A Short Review of Wavelet Analysis onal analog images as photograph or x-rays Wavelets are functions generated from one single func costly. The information contained in the images must 1/ therefore, be compressed by extracting only the visible elements, which are then encoded, The quantity of data (For this introduction we assume t is a one-dimen- nvolved is thus reduced substantially sional variable). The mother wavelet v has to satisfy A fundamental goal of data compression is to reduce dx v(x)=0, which implies at least the bit rate for transmission or storage while maintaining (Technically speaking, the condition on y should an acceptable fidelity or image quality. Compression can dw |Y(w)l2-I or wider frequency domains. To avoid redundancy, which hinders width compression, the transform must be at least biorthogonal The basic idea of the wavelet transform is to represent and lastly, in order to save CPU time, the corresponding any arbitrary function f as a superposition of wavelets algorithm must be fast. The two-dimensional wavelet Any such superposition decomposes f into different scale transform defined by Meyer and Lemarie [31], [24], [25], levels, where each level is then further decomposed with a resolution adapted to the level. One way to achieve such ipt received February 7, 1990: revised March 26 a decomposition writes f as an integral over a and b of chies is with at&T Bell Laboratories Murray Hill. N) 07974. tice, one prefers to write f as a discrete superpose ition(sum IEEE Log Number 9106073 rather than integral). Therefore, one introduces a discre 1057714992s3.00@1992IEEE
IEEE TRANSACTIONS ON IMAGE PROCESSING, vOL L. NO 2. APRIL 199 tization,a=0,b= nboao, with m, nEZ, and ao >1, for decomposition. Such filters are well known since the bo >0 fixed. The wavelet decomposition is then work of Smith and Barnwell [35] and of Vetterli [37] .The ∫=∑cmn(f)ψnn (1) xtra ingredient in the orthonormal tion is that it writes the signal to be decomposed as a st compositions of this type were studied in [14], [15]. For blocks. The filters must satisfy the additional condition a0=2, bo= 1 there exist very special choices of y such that the ym. constitute an orthonormal basis, so that II H( Cm,n(f)=(Vm. n f )= dx ym. a(x)f(x) decay faster than C(1+5I in this case. Different bases of this nature were con- structed by Stromberg [36], Meyer [31], Lemarie [24 H(E)=2-1/2∑hne Battle [7, and Daubechies [16]. All these examples cor- respond to a multiresolution analysis, a mathematical tool This extra regularity requirement is usually not satisfied nvented by Mallat [27], which is particularly well adapted by the exact reconstruction filters in the ASSP literature to the use of wavelet bases in image analysis, and which gives rise to a fast computation algorithm B. Applications of Wavelet Bases to Image Analysis n a multiresolution analysis, one really has two func 1) Biorthogonal Wavelet Bases: Since images are tions: the mother wavelet y and a scaling function g. One mostly smooth(except for occasional edges)it seems ap- lso introduces dilated and translated versions of the scal ing function, m, n (x)=2-m/2p(2-x-n). For fixed m propriate that an exact reconstruction subband coding scheme for image analysis should correspond to an or the m n are orthonormal. We denote by m the space thonormal basis with a reasonably smooth mother wave spanned by the m, n; these spaces Vm describe successive let. In order to have fast computation, the filters should each with resolutic For each m,theψ be short(short filters lead to less smoothness, however a space wm which is exactly the orthogonal complement so they cannot be too short), On the other hand it is de- in Vm-1 of Vm; the coefficients (vm nf>, therefore. de- sirable that the FIR filters used be linear phase, since such scribe the information lost when going from an ap- without the need for phase compensation. Unfortunately ation of f with resolution 2"I to the coarser ap- there are no nontrivial orthonormal linear phase FIR fil nation with resolution 2m. All this is translated ters with the exact ruction property [351, regard into the following algorithm for the computation of the less of any regularity considerations. The only symmetric cmn(f)=〈ψm,n,f〉( for more details,se27]): xact reconstruction filters are those corresponding to the cm.(f)=282n-K@m-L&() Haar basis with all other h,, gn=0 ann(f)=∑h2Aam-1.(f) One can preserve linear phase(corresponding to sym- (2)metry for the wavelet)by relaxing the orthonormality re- quirement, and using biorthogonal bases. It is then still here g,=(-1)h-I+I and h=2/2dxo(x-n)o(2x). possible to construct examples where the mother wavelets In fact the am, n(f)are coefficients characterizing the pro- have arbitrarily high regularity ction of f onto Vm. If the functionf is given in sampled In such a scheme, we still decompose as in(2),but form, then one can take these samples for the highest or. reconstruction Ines der resolution approximation coefficients ao. n, and (2)de scribes a subband coding algorithm on these sampled val an-1.()=∑[2n-1an(f)+gm=1Cm.(f)(4) ues, with low-pass filter h and high-pass filter g. Because where the filters h, g may be different from h, g. In order of their association with orthonormal wave\nf oses, these to have exact reconstruction, we impose filters give exact reconstruction, i.e gn=(-1)"h an-1.(f)=2[hn-1am(f)+82n-1Cmn( 8n=(-1)hn+ Most of the orthonormal wavelet bases have infinitely g differently from supported y, corresponding to filters h and g with infi- the usual exact reconstruction ding scheme nitely many taps. The construction in [16] gives v with with synthesis filters different fr ecomposition fil finite support, and therefore, corresponds to FIR filters. ters. If the filters satisfy the additional condition that It follows that the orthonormal bases in [16] correspond to a subband coding scheme with exact reconstruction property, using the same FiR filters for reconstruction as II A(2-5)and II H(2-5)
ANTONINI et al.: IMAGE CODING USING WAVELET TRANSFORM decay faster than C(1+|)-sas|→∞, for some ∈>0, where H(o) B(5)=212∑hne)H()=2-12∑hne ) then we can give the following interpretation to(2)and (4). Define functions c and o by d(x)=∑hnd(2x-n)ando(x)=∑hnd(2x-n) Their Fourier transforms are exactly the infinite products Feauveau explores the construction from the point of view Fable functions, compactly supported if the filters h and ically one has two hierarchies of spaces in the bior- thogonal case, each corresponding to one pair of filters (x)=∑gn(2x-n)andv(x)=∑gn(2x-n) It is shown in [12] that arbitrarily high regularity can be achieved by both y and y. provided one chooses suf- Then, the a,mn(/) and cm. n() in(2)can be rewritten ficiently long filters. In particular, if the functions and Y are, respectively. (k-1)and(k- 1)times continu ously differentiable, then the trigonometric polynomials am. ()=(om. m/)=2 m/ dx om. n(a)f() (1 +e"f, respectively. so that the length of the corre m(f)=(vm, n,f>=2-m/ dx vom n(x)f(r) By(5). divisibility of A(E)by(1+e-5)means that y will have k consecutive moments zero and reconstruction is simply f=∑(ψmn,f)n,n xrv(x)=0, for I=0,I For more details concerning this discussion, see [12 The filter bank structure with the associating wavelets It is well known (and it can easily be checked by using nd scaling functions is depicted on the following sub band coding scheme(Fig. 1) Taylor expansions)that if y has k moments zero, then the If the infinite products in(6a)decay even faster than Coefficients (Vm. n, f)will represent functions f. which imposed above, then o and o and consequently y and y re k times differentiable, with a high compression poten will be reasonably smooth. Note that (7)is very similar tial(many coefficients will be negligibly small) o the orthonormal decomposition described in Section Many examples of biorthogonal wavelet bases with rea II-A; the only difference is that the expansion of f with sonably regular v and y can be constructed; for our ap- respect to the basis yo. n uses coefficients computed via ym n. which is linked to the number of zero moments of the dual basis tmn with y different from y. This interpre, 4, is more important than the regularity of the yo,. or the oding schemes; in particular, convergence of the infinite number of zero moments of y. Within the limits imposed tation is not possible for all exact reconstruction subband by the support widths, we will, therefore, try to choose as larg ∑hn=21/2and∑hn=21/2 、气 n terms of trigonometric polynomials H()and() he exact reconstruction requirement condition on h and Moreover, (7) can only hold h given in(5)reduces to(for symmetric filters) ∑(-1)yhn=0and∑(-1)yhn=0 H()B()+HE+)HE+丌)=1 Most exact reconstruction subband coding schemes do Together with divisibility of H and H, respectively. by ot satisfy these conditions (1+e-i 'and(1+e/5). this leads to(see [12]) Biorthogonal bases of wavelets hat Herley and Vetterli [38]. Reference [12] contains a de sin(/2)}+sin(/2)yR)(9) onditions stated above. the wavelets do indeed constitute merically stable bases(Riesz bases)and a discussion of where R(E) is an odd polynomial in cos(E), and where necessary and sufficient conditions for regularity. In [18] 2/=k+k(symmetry of h and h forces k+ k to be even)
EEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 1. No. 2. APRIL 1992 TABLE I FILTER COEFFICIENTS FOR THE SPLINE FILTERS W n 19/64 1/8 3/643/128 1/2 1/4 Many examples are possible. We have studied in par cular the following three examples, which belong to three different families 2)Spline Filters: One can choose, e. g,R=0,with A(E)=cos(E/2 e kE/2 where x=0 if k is even,K=I if k is odd. This corresponds to the filters called"spline filters"in[12](because the corresponding function is a B-spline function)or"binomial filters"'in[38](because the h are simply fficients). It then follows hat H(5)= ∑ We have looked at one example from this family; it responds to /=3, k=2. The coefficients h,and are listed in Table I; the corresponding scaling functions -1s and wavelets are plotted in Fig. 2 It is clear that the two filters in the first example have very uneven length. This is typical for all the examples in this family of"spline filters 3)A Spline variant with Less Dissimilar Lengths:This family still uses R =0 in(9), but factorizes the right hand side of (9), breaking up the polynomial of degree sin(E/2)with real coefficients, one to be allocated to H the other to H, so as to make the lengths of h and h as close as possible The example presented here is the"smallest',one in 4 this family(shortest h and h ); it corresponds to I-4 and k= 4. The filter coefficients are listed in Table II: the corresponding scaling functions and wavelets are plotted Note that, unlike examples I and 3 where the 2-1/h 2-1/h, are rational the entries in Table lI are truncated decimal expansions of irrational numbers. The functions in examples I and 2 look very similar(compare Figs 2(a)and 3(a)); a more detailed analysis shows that the one in example 2 is more regular, however. Both correspond to 4 vanishing moments for y 4)Filters Close to Orthonormal Filters: Finally, there Xist many e for which R*0. In particular ther very close to each other, and both very close to an or. Fiter 3 w ith ling function. 2 and wavelet un cti for exam ple in pline conormal wavelet filter Surprisingly, for the first example of this series, one of the two filters is a Laplacian pyramid filter pro- wavelets are plotted in Fig. 4. It is clear that the scaling rery similar R(E)=48 cos(5)/175. The filter coefficients are listed similar y and 4. Note that in this case, the filter coef in Table Ill; the corresponding scaling functions and cients are again rationa
ANTONINI et aL.: IMAGE CODING USING WAVELET TRANSFORM TABLE I FILTER COEFFICIENTS FOR THE SPLINE VARIANT WITH LESS DISSIMILAR ± 2-1/h,0.6029490.266864-0.078223-0.0168640.02674 2-120.5575430.295636-0.028 0.045636 d etsψ. ngths: /=4-kk=4).(a) Scaling function o.(b)Scaling function o(c)Wavelet 4. (d) Wavelet 2 various extensions of the one-dimensional wavelet FILTER COEFFICIENTS FOR EXAMPLE 3. THE ENTRIES ARE RATIONAL, AND transform to higher dimensions. We follow Mallat [27 LAPLACIAN PYRAMID FILTER PROPOSED IN [9]. IN THIS CASE and use a two-dimensional wavelet transform in which =2=k,E= horizontal and vertical orientations are considered pre 0 ±4 esential In two-dimensional wavelet analysis one introduces like o in the one-dimensional case, a scaling function o(x. y) 17/28 such that o(x, v)= o(x)o( v) (11) The two biorthogonal filters in this example are both close to an orthonormal wavelet filter of length 6 con- where o(r) is a one-dimensional scaling function structed in [17], where it was called a:coiflet. Being Let v(x)be the one-dimensional wavelet associated with an orthonormal wavelet filter, the coiflet is nonsymme- the scaling function (x). Then, the three two-dimer tric. The filters in this example are shorter than in exam- sional wavelets are defined as ples I and 2, but k is also smaller. The next example in this family corresponds to k= 4(and /=4); the filters h ψ"(x,y)=o(x)ψ(y) and h then have length 9 and 15: they are both close to a coiflet of length 12 (x,y)=ψ(x)q(y) 5) Extension to the Two-Dimensional Case: There ex ψ(x,y)=ψ(x)(y) (12)
EEE TRANSACTIONS ON IMAGE PROCESSING. VOL 1. NO. 2. APRIL 192 Fig. 5 represents one stage in a multiscale pyramidal In Fig. 8(a)we can see the normalized detail subimages decomposition of an image: wavelet coefficients of the at different resolution levels m=1, m=2, and m =3 Image are computed, as in the one-dimensional case(Sec- (wavelet coefficients)and in Fig. 8(b)the low resolution tions I-A and II-B. 1), using a subband coding algorithm. level subimages The filters h and g are one-dimensional filters. This de- composition provides subimages corresponding to differ- III. IMAGE CODING APPLICATI ent resolution levels and orientations(see Fig. 6). The A. Statistical Properties of Wavelet Coefficients reconstruction scheme of the image is presented Fig. 7 To compare the three different filters presented in this and direction can be determined by the statistics of the each of these filters. The results are presented in Fig. 8. function (PDF/ abimage, i.e., its probability density have decomposed the image Lena(Fig. 16)with corresponding
ANTONINI er al.: IMAGE CODING USING WAVELET TRANSFORM ow resolution Horiranto Resolution m-2 Resolution m-2orientation sub-image sub amage Resolution m=l Vertical Diagonal orientation sub-imageorientation sub-image Fig. 6. Image decomposition COLUMNS 2} [ 21 Console with alter x Dnt column e A typical PDF and different approximations are .Im.d 2 leads to the well-known Gaussian PDF in Fig. 9, where we plot the true pDF for resolution I leads to a Laplacian PDF m=I and direction d vertical together with three model functions: a Gaussian, a Laplacian, and an intermediate The variance of this approximation model is set equal function, the so-called generalized Gaussian [2] to the variance of the corresponding subimage. Thus the This generalized Gaussian law is given explicitly by parameter rm, d is computed in order to match the real PDF Pm. d(r)=am. d exp(-bm,dx rm.d) using the well-known chi-squared test. In this case the optimum parameter was 0.7. Other experiments for other resolutions (except the lowest resolution) lead to very 3 Fig 9 that the real PDF(scale dm. d and vertical orientation) is closely eralized Gaussian law with parame B. Encoding of Wavelet Coefficients Using Vector where om, d is the standard deviation of the subimage Different techniques involving vector or scalar quanti- (m, d), and r( )is the usual Gamma function zation can be used to encode wavelet coefficients The general formula(13)contains the other two ex- According to Shannons rate distortion theory, better amples as particular cases: sults are al ways obtained when vectors rather than sca
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL, 1. NO, 2, APRIL 1992 m=1 128x128pix 32x 32 pix 5 128x128pix 配 subimages. (a) Comparison among detail subimages.(b)Comparison among the low resolution evel s vector quantization erefore, the present application uses gorithm with a mean squared error (MSE)criterion 1. Principle of Vector Quantization: Developed re- a training set comprised of vectors belonging to diffe cently by Gersho and Gray (1980)[20], [21], vector quan- images; it converges iteratively toward a locally optimal tization has proven to be a powerful tool for digital image codebook compression [4],[29],[30], [321, [39]. The principle in- Each of the vectors in the codebook is indexed. At the volves encoding a sequence of samples(vector)rather than encoding stage, the index of the vector in the codebook encoding each sample individually. Encoding is per- most closely describing(in terms of MSE criterion) the formed by approximating the sequence to be coded by a sample set to be encoded is selected to represent this set vector belonging to a catalogue of shapes, usually known Of course, in order to reconstruct the sample set, the de- as a codebook coder must have the same codebook as the coder
ANTONINI ef al. IMAGE CODING USING WAVELET TRANSFORM Real PDF. legalized r=0.7) Laplace Wavelet coefficients Fig. 9. Real PDF of subimage at scale m a I for vertical orientation. and its different approximations. CODER index 2)Comparative Performances of vector quantization (vQ) and Scalar Quantization ( ccording to I [13]. [191, [43], [30] the asymptotic lower bound distor tion gain obtained when VQ, rather than SQ, is applied 之 (c I)A(k,md, c) Lpm.dx) J Ipm ()-/c +im a dr Fig. 11, Asymptotic lower bound distortion gain Gw,= function (km., and generalized Gaussian approximation laws, and for a for a subimage corresponding to resolution m and direc- subimage at scale m 1 and vertical orientation. Exper tion d. Pm, d(x) is the PDF of wavelet coefficients of the imental results are closely matched by the theoretical re- subimage with resolution m and direction d sults for a generalized Ga ian law with md=0.7 Here, the MSE criterion is used as a distortion measure cept for the lower subband. Therefore, all computations (c-2). The values of A(km. d, 2)used are the upper based on this approximation law show that, in each sub bounds of the MSE computed and tabulated by Conway band, VQ outperforms SQ(see Fig. 11) and Sloane for vector size km, d [13]. This formula gives In summary vQ performs better for coding wavelet an indication of the minimum theoretical gain that can be coefficients 3)Generation of a Multiresolution Codebook: The However, this approximation is valid only for small preceding paragraph explained why Q outperforms other quantization erro for a high bit rate Rm,, d. Thus the methods. Nonetheless, major problems are encountered an as mitotic indication in the vQ of images In Fig. 11, the of G m d are plotted as a function It is impossible to create a universal codebook(effi of the vector dimension km, d for the Laplacian, Gaussian, cient for each image to be encoded)
IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL I, NO. 2. APRIL 1992 The LBG algorithm smooths high frequencies(loss aumont? Sub codebook There is a trade-off between low distortion and high compression rate(computational cost) It is not easy to take into account the properties of the human visual system [28], [33] diagonal Diagonal The use of the wavelet transform (i. e, multiresolution) s is one way of becoming these different problems The wavelet decomposition of an image enables the Fig. 12, Multiresolution codebook generation of a codebook containing two-dimensiona vectors for each resolution level and preferential direc- for all coefficients x belonging to the subimage, g(r)being tion(horizontal, vertical, and diagonal). Each of thes subcodebooks(see Fig. 12)is generated using the LBG quantization of x. algorithm Total distortion of the image for a total rate of R, bits The training set is comprised of vectors belonging to per pixel is then given by different images corresponding to the resolution and ori- D,(R)=3M DS(RS)+232m 2 Dm. d(R,wd) entation under consideration The initial codebook is generated by splitting centroid(center of gravity) of this training set [211 A multiresolution codebook can thus be obtained by as- where DM(rM )corresponds to the distortion in the sub embling all of these resulting subcodebooks. Each sub age of lowest resolution M(texture subimage) codebook has a low distortion level and contains few The problem of finding an optimal bit assignment(in words, which clearly facilitates the search for the best bits per pixel)for each subimage vector quantizer is then coding vector; the coding computational load is reduced, formulated as because only the appropriate subcodebook(resolution di rection)of the multiresolution codebook is checked for each input vector. In addition, the quality of the coded R, mage is better. The multiresolution codebook is depicted +∑,∑Dnd(Rnd)×Bnd Global codebook design has drawbacks in that it results in edge smoothing while the proposed method preserves edges. In fact, each subcodebook contains the shape of subject R the wavelet coefficients which are most highly represen tative in terms of the mse criterion where RM corresponds to the bit allocation, in bits per Since the spatial and frequency aspects of the image are pixel, of lowest resolution M subimage sification and search during the encoding of a subimage human eye is not equally sensitive to sion th taken into account in the wavelet decomposition, the clas- Assignment of the weights is based on the fact that the least mean squares. This frees us lo.e criterion such as frequencies On the basis of contrast sensitivity data col ector can be achieved using a si om using distortion lected by Campbell and Robson [10], and to obtain a con measurements such as weighted least mean squares or trolled degree of noise shaping across the subimages, we other measurements involving perceptual factors. These consider a function Bm, d such that algorithms are indeed costly in computation time Bmd="log(oa:“) (19) C. Optimal Bit Allocation where om, d is the standard deviation corresponding to sub image(m, d) and the values of y and Bm, d are chosen Multiresolution exploits the eye's masking effects, and experimentally in order to match human visi therefore, enables us to refine and select the type of cod- DF(R,)is the total weighted encoding distortion func ng according to the resolution level and the contour ori- tion and M is the lowest resolution considered entation. Although a flat noise shape minimizes the MSe The expression of dm. (Rm.) is given by (191 criterion, it is generally not optimal for a subjective qual ity of image. To apply noise shaping across the vQ sub D (P,C) images, we define a total weighted MSE distortion DF(R,) with ((I7))for a total bit rate R((18)). Let us define Dm. (Rm. d )the average distortion in the m, d(P, c)=A(k,m, d, c) coding of the subimage(m, d)for Rm. d bits per pixel (c+km.d Dn.d(Rmd)=E(x-q(x)°)=d(x,qx)c≥