Liu YL,Xu XG,Guo YW et al.Pores-preserving face cleaning based on improved empirical mode decomposition.JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 24(3):557-567 May 2009 Pores-Preserving Face Cleaning Based on Improved Empirical Mode Decomposition Yan-Li Liul,2(刘艳丽),Xiao-Gang Xu(徐晓刚)2,Yam-Wen Guo3(郭延文),Jin Wang2(王进) Xin Duan!(段鑫),Xi Chen1(陈曦),and Qun-Sheng Peng1,2(彭群生),Senior Member,CCF State Key Lab of CAD CG,Zhejiang University,Hangzhou 310027,China 2Department of Mathematics,Zhejiang University,Hangzhou 310027,China 2Department of Equipment Automatization,Dalian Naval Academy,Dalian 116026,China 3State Key Lab of Novel Software Technology,Nanjing University,Nanjing 210000,China E-mail:liuyanli@cad.zju.edu.cn:ywguo@nju.edu.cn:jwang@cad.zju.edu.cn Received May 29,2008;revised February 28,2009 Abstract In this paper,we propose a novel method of cleaning up facial imperfections such as bumps and blemishes that may detract from a pleasing digital portrait.Contrasting with traditional methods which tend to blur facial details, our method fully retains fine scale skin textures (pores etc.)of the subject.Our key idea is to find a quantity,namely normalized local energy,to capture different characteristics of fine scale details and distractions,based on empirical mode decomposition,and then build a quantitative measurement of facial skin appearance which characterizes both imperfections and facial details in a unified framework.Finally,we use the quantitative measurement as a guide to enhance facial skin.We also introduce a few high-level,intuitive parameters for controlling the amount of enhancement.In addition,an adaptive local mean and neighborhood limited empirical mode decomposition algorithm is also developed to improve in two respects the performance of empirical mode decomposition.It can effectively avoid the gray spots effect commonly associated with traditional empirical mode decomposition when dealing with high-nonstationary images. Keywords image enhancement,empirical mode decomposition,normalized local energy Introduction to preserve the former and remove the latter make face cleaning a challenging problem.For approaches to Facial skin not only exhibits various fine scale de- eliminating distractions with denoising techniques-3, tails,e.g.,pores,but also sometimes exhibits distrac- which are widely adopted by commercial software,fa- tions such as blemishes,bumps,acne scarring etc.From cial details such as pores are often accounted as noise many existing photo enhancement softwares that can and thereby removed. be used to remove facial distractions,a common char- In this paper,based on empirical mode decomposi- acteristic is observed:the resulting face is often over- tion](EMD),we set a quantity,i.e.,normalized local smoothed,of which the natural pores disappear com- energy (NLE),to reveal the characteristics of fine scale pletely.However,human skin is perception sensitive, and distractions.Building upon NLE,we further de- in which fine scale texture accounts for an important velop a quantitative characterization of facial skin ap- part in overall appearance.Over-smoothed face not pearance,namely imperfect degree,to depict the visual only looks unnatural but also dissolves the individua- perception of facial skin.Finally,we employ imperfect lity of a person.In this paper,we focus our attention degree as a guide to enhance facial skin.In addition on retaining fine skin texture features,e.g.,pores struc- we also propose an algorithm called adaptive local mean ture of the skin,while simultaneously removing facial and neighborhood limited empirical mode decomposi- imperfections such as scars,bumps and blemishes. tion (ALNEMD)to reduce the gray spots effect com- Since the pores and some distractions of imperfect monly associated with traditional EMD when dealing faces all belong to high frequency component of images with high nonstationary images. Regular Paper This work is supported by the National Natural Science Foundation of China under Grant Nos.60403038 and 60703084,the Natural Science Foundation of Jiangsu Province under Grant No.BK2007571,and the Natural Science Foundation of Liaoning Province under Grant No.20082176
Liu YL, Xu XG, Guo YW et al. Pores-preserving face cleaning based on improved empirical mode decomposition. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 24(3): 557–567 May 2009 Pores-Preserving Face Cleaning Based on Improved Empirical Mode Decomposition Yan-Li Liu1,2 (刘艳丽), Xiao-Gang Xu (徐晓刚) 2 , Yan-Wen Guo3 (郭延文), Jin Wang1 (王 进) Xin Duan1 (段 鑫), Xi Chen1 (陈 曦), and Qun-Sheng Peng1,2 (彭群生), Senior Member, CCF 1State Key Lab of CAD & CG, Zhejiang University, Hangzhou 310027, China 2Department of Mathematics, Zhejiang University, Hangzhou 310027, China 2Department of Equipment Automatization, Dalian Naval Academy, Dalian 116026, China 3State Key Lab of Novel Software Technology, Nanjing University, Nanjing 210000, China E-mail: liuyanli@cad.zju.edu.cn; ywguo@nju.edu.cn; jwang@cad.zju.edu.cn Received May 29, 2008; revised February 28, 2009. Abstract In this paper, we propose a novel method of cleaning up facial imperfections such as bumps and blemishes that may detract from a pleasing digital portrait. Contrasting with traditional methods which tend to blur facial details, our method fully retains fine scale skin textures (pores etc.) of the subject. Our key idea is to find a quantity, namely normalized local energy, to capture different characteristics of fine scale details and distractions, based on empirical mode decomposition, and then build a quantitative measurement of facial skin appearance which characterizes both imperfections and facial details in a unified framework. Finally, we use the quantitative measurement as a guide to enhance facial skin. We also introduce a few high-level, intuitive parameters for controlling the amount of enhancement. In addition, an adaptive local mean and neighborhood limited empirical mode decomposition algorithm is also developed to improve in two respects the performance of empirical mode decomposition. It can effectively avoid the gray spots effect commonly associated with traditional empirical mode decomposition when dealing with high-nonstationary images. Keywords image enhancement, empirical mode decomposition, normalized local energy 1 Introduction Facial skin not only exhibits various fine scale details, e.g., pores, but also sometimes exhibits distractions such as blemishes, bumps, acne scarring etc. From many existing photo enhancement softwares that can be used to remove facial distractions, a common characteristic is observed: the resulting face is often oversmoothed, of which the natural pores disappear completely. However, human skin is perception sensitive, in which fine scale texture accounts for an important part in overall appearance. Over-smoothed face not only looks unnatural but also dissolves the individuality of a person. In this paper, we focus our attention on retaining fine skin texture features, e.g., pores structure of the skin, while simultaneously removing facial imperfections such as scars, bumps and blemishes. Since the pores and some distractions of imperfect faces all belong to high frequency component of images, to preserve the former and remove the latter make face cleaning a challenging problem. For approaches to eliminating distractions with denoising techniques[1−3] , which are widely adopted by commercial software, facial details such as pores are often accounted as noise and thereby removed. In this paper, based on empirical mode decomposition[4] (EMD), we set a quantity, i.e., normalized local energy (NLE), to reveal the characteristics of fine scale and distractions. Building upon NLE, we further develop a quantitative characterization of facial skin appearance, namely imperfect degree, to depict the visual perception of facial skin. Finally, we employ imperfect degree as a guide to enhance facial skin. In addition, we also propose an algorithm called adaptive local mean and neighborhood limited empirical mode decomposition (ALNEMD) to reduce the gray spots effect commonly associated with traditional EMD when dealing with high nonstationary images. Regular Paper This work is supported by the National Natural Science Foundation of China under Grant Nos. 60403038 and 60703084, the Natural Science Foundation of Jiangsu Province under Grant No. BK2007571, and the Natural Science Foundation of Liaoning Province under Grant No. 20082176
558 J.Comput.Sci.Technol.,May 2009,Vol.24.No.3 In the process of facial image cleaning,our algorithm In the field of face recognition,Lin et al.o]used only requests the user to tune a few high level.intuitive scale-invariant feature transform (SIFT)framework to parameters to interactively control the amount of en- detect irregular skin details.Later,Pierrard et al.] hancement.The advantage of our technique is that employed 3D morphable model reconstruction to rec- while effectively removing facial imperfections,it does ognize facial moles.Their techniques are effective in not blur fine scale facial details.Another important detecting moles with relatively fixed pattern,for exam- feature of our technique is:it is a general model which ple,circular shape.However,as these methods detect characterizes both imperfections and facial details in a and localize facial moles in spatial domain,they are unified framework,thereby it does not require user to not suitable for detecting and localizing general skin ir- interactively mark imperfections on facial images. regularities,due to a large number of possible spatial The rest of this paper is structured as follows.Sec- variations of distractions.Instead of explicitly detect- tion 2 reviews some related work about face cleaning ing and localizing distractions,our new approach sug- and EMD.Section 3 describes our improved EMD al- gests a quantitative characterization of both pores and gorithm.The algorithm of face cleaning is introduced distractions,and uses it to enhance facial skin. in detail in Section 4.Experimental results are demon- Among the current image filtering techniques,bilat- strated in Section 5.Finally.Section 6 concludes the eral filteringlil is one of the most powerful filters and whole paper and highlights future work. has been used in various fields.For the typical case in which the spatial and intensity weighting functions are 21 Related Work and Background Gaussian,there are two important parameters,namely geometric spread and photometric spread.While small 2.1 Related Work geometric spread corresponds to filtering small intensity Face Cleaning.The problem of facial image editing changes,large photometric spread would preserve edges has received much attention both from the computer with large discontinuity.For face cleaning,of which the graphics and image processing communities.As a re- aim is to smooth large discontinuity and preserve small sult,many techniques have been developed to achieve scale texture,the settings for the two parameters are various facial appearance effects.Leyvand et al.5]pro- difficult:when the photometric spread is set to a very posed an approach to enhanceing the facial attractive- large value in order to smooth imperfections,the bilate- ness by adjusting facial features.Nguyen et al.6]pre- ral filter nearly degenerates into a Gaussian filter.In sented a layer extraction method to remove and syn- such a case,the smoothing effect of bilateral filter only thesize beard.Using reflectance transfer,Peers et al7 relies on the geometric spread.Apparently,the geo- produced face relighting in the post-production process. metric spread cannot be set to a large value,otherwise Most recently,Bitouks introduced an algorithm that the pores of skin will be removed.On the other hand, can automatically replace faces in photographs.More- the edges of bumps or scars in the skin are left aside if over.the algorithm also allows the user to interactively it is set to a small value. edit the illumination and colors of face. To the best of our knowledge,the most related work However,little research work has been published on to ours is developed by Matsui et al.2.This work also face cleaning in literature.Nevertheless,in practice, aims at removing spots and at the same time preserv- several methods have been employed to clean or en- ing skin natural roughness.The method uses e-filters to hance facial images.While cleaning face with denois- decompose image into several different frequency com- ing techniques is popular,an alternative approach to ponents.It assumes spots are of medium frequency serving this problem is the so-called interactive cut- and pores of high frequency,and then discards medium and-paste method,of which poison image editing] frequency component and retains the high frequency and healing brush in Adobe Photoshop are well-known component.The major drawback of this approach is: This approach repairs imperfections on the facial im- since spots may exist also in low frequency and high ages by seamless cloning,with little effects on preserv frequency components,it cannot remove spots com- ing skin details.Moreover,for facial images with many pletely.Moreover,the discarded layers are determined imperfections,it would incur a lot of user interactions by several parameters,which are not intuitive,hence to mark out source and destination areas.Since a great requiring heavy user interactions.The users also need deal of skill is demanded to achieve satisfactory results to use unsharp mask to avoid image blurring,which it is a suitable tool only for trained designers.Our goal might magnify image noise. is to develop a technique with a few high level,intuitive EMD.Traditional energy-frequency analysis parameters that can be easily adopted by naive users. are based on Fourier transformation and wavelet
558 J. Comput. Sci. & Technol., May 2009, Vol.24, No.3 In the process of facial image cleaning, our algorithm only requests the user to tune a few high level, intuitive parameters to interactively control the amount of enhancement. The advantage of our technique is that while effectively removing facial imperfections, it does not blur fine scale facial details. Another important feature of our technique is: it is a general model which characterizes both imperfections and facial details in a unified framework, thereby it does not require user to interactively mark imperfections on facial images. The rest of this paper is structured as follows. Section 2 reviews some related work about face cleaning and EMD. Section 3 describes our improved EMD algorithm. The algorithm of face cleaning is introduced in detail in Section 4. Experimental results are demonstrated in Section 5. Finally, Section 6 concludes the whole paper and highlights future work. 2 Related Work and Background 2.1 Related Work Face Cleaning. The problem of facial image editing has received much attention both from the computer graphics and image processing communities. As a result, many techniques have been developed to achieve various facial appearance effects. Leyvand et al.[5] proposed an approach to enhanceing the facial attractiveness by adjusting facial features. Nguyen et al. [6] presented a layer extraction method to remove and synthesize beard. Using reflectance transfer, Peers et al. [7] produced face relighting in the post-production process. Most recently, Bitouk[8] introduced an algorithm that can automatically replace faces in photographs. Moreover, the algorithm also allows the user to interactively edit the illumination and colors of face. However, little research work has been published on face cleaning in literature. Nevertheless, in practice, several methods have been employed to clean or enhance facial images. While cleaning face with denoising techniques is popular, an alternative approach to serving this problem is the so-called interactive cutand-paste method, of which poison image editing[9] and healing brush in Adobe Photoshop are well-known. This approach repairs imperfections on the facial images by seamless cloning, with little effects on preserving skin details. Moreover, for facial images with many imperfections, it would incur a lot of user interactions to mark out source and destination areas. Since a great deal of skill is demanded to achieve satisfactory results, it is a suitable tool only for trained designers. Our goal is to develop a technique with a few high level, intuitive parameters that can be easily adopted by naive users. In the field of face recognition, Lin et al.[10] used scale-invariant feature transform (SIFT) framework to detect irregular skin details. Later, Pierrard et al.[11] employed 3D morphable model reconstruction to recognize facial moles. Their techniques are effective in detecting moles with relatively fixed pattern, for example, circular shape. However, as these methods detect and localize facial moles in spatial domain, they are not suitable for detecting and localizing general skin irregularities, due to a large number of possible spatial variations of distractions. Instead of explicitly detecting and localizing distractions, our new approach suggests a quantitative characterization of both pores and distractions, and uses it to enhance facial skin. Among the current image filtering techniques, bilateral filtering[1] is one of the most powerful filters and has been used in various fields. For the typical case in which the spatial and intensity weighting functions are Gaussian, there are two important parameters, namely geometric spread and photometric spread. While small geometric spread corresponds to filtering small intensity changes, large photometric spread would preserve edges with large discontinuity. For face cleaning, of which the aim is to smooth large discontinuity and preserve small scale texture, the settings for the two parameters are difficult: when the photometric spread is set to a very large value in order to smooth imperfections, the bilateral filter nearly degenerates into a Gaussian filter. In such a case, the smoothing effect of bilateral filter only relies on the geometric spread. Apparently, the geometric spread cannot be set to a large value, otherwise the pores of skin will be removed. On the other hand, the edges of bumps or scars in the skin are left aside if it is set to a small value. To the best of our knowledge, the most related work to ours is developed by Matsui et al.[12]. This work also aims at removing spots and at the same time preserving skin natural roughness. The method uses ε-filters to decompose image into several different frequency components. It assumes spots are of medium frequency and pores of high frequency, and then discards medium frequency component and retains the high frequency component. The major drawback of this approach is: since spots may exist also in low frequency and high frequency components, it cannot remove spots completely. Moreover, the discarded layers are determined by several parameters, which are not intuitive, hence requiring heavy user interactions. The users also need to use unsharp mask to avoid image blurring, which might magnify image noise. EMD. Traditional energy-frequency analysis are based on Fourier transformation and wavelet
Yan-Li Liu et al.:Pores-Preserving Face Cleaning 559 transformation.The Fourier transform is designed to simplicity,we call it directional BEMD.The methods of work with liner and stationary signals.The wavelet the second category are genuine sense 2D EMD which transform,on the other hand,is well-suited to handle use radial basis functions(RBF)or cubic spline to in- non-stationary data,but is poor at processing non- terpolate envelopeslis-201.Although they have been linear data.However,few energy-frequency data are successfully used in some cases,these algorithms suffer truly linear and stationary.To address this problem from the problem of overshoot or undershoot because the Hilbert-Huang transform (HHT)has been recently of their interpolation scheme.In this paper,we pro- developedl4l.The HHT comprises two steps:EMD pose two solutions,namely limiting minimum frequency generates a finite number of intrinsic mode functions and adaptive local mean,to improve the performance (IMF),and then the Hilbert transform is applied to of EMD. IMF.The combination of IMF and its Hilbert trans- form forms an analytic signal,which can be used to 2.21 Background of EMD generate a "time-frequency-energy"representation of The principle of EMD is to identify the intrinsic os- the data. cillatory modes embedded in the signals based on their In this paper,we choose EMD and HHT rather than characteristic time scales,and then to decompose the wavelet transform due to the following considerations. signals into a sum of IMFs.In order to permit phys- First,imperfect facial images are in general nonlinear ically meaningful instantaneous frequency to be de- and nonstationary in nature,EMD is suitable to pro- fined over it,an IMF must satisfy the following two cess these data.Second,based on the local time scale conditionsl4l:first,the numbers of extrema and zero- of the data,EMD decomposes a signal into IMFs adap- crossings must differ by at most one;second,it is sym- tively without using a prior basis which do not neces- metric with respect to the local mean of the data.Ide- sarily match the varying nature of the signals.With- ally,the requirement of the second condition should be out the need of pre-specifying a decomposition level as "the local mean of the data is zero".Since it is difficult requested by wavelet decomposition to extract out all to define a local averaging time scale in general case,in the image's oscillatory modes,EMD allows us to use practice,the local mean of the data is usually replaced all the oscillatory modes to further define imperfect de- by the mean of the envelopes defined by interpolating gree.Thus,by exploiting EMD,we can obtain adap- the local maxima and local minima. tive multi-resolution representation of image.Due to EMD is an iterative process.To illustrate its con- unpredictable appearances of various kinds of distrac- cept,here we describe the algorithm of EMD in 1D tions,such an adaptive multi-resolution representation case.Given a signal x(t),the algorithm of EMD can be is favorable.Third,the IMF is generally in good agree- summarized as follows. ment with intuitive and physical signal interpretations, 1)Identify all the local extrema of x(t) and has well-defined instantaneous frequency,allowing 2)Interpolate between local minima (resp.local us to define local energy to capture the characteris- maxima),ending up with the envelope emin(t)(resp. tics of distractions and pores.On the contrary.the emax(t)). local energy extracted by Fourier or wavelet3-15]suf 3)Compute the mean m(t)=(emin(t)+emax(t))/2. fers from the problem of energy leakage,due to the 4)Extract the detail d(t)=x(t)-m(t).Check prop- infinite or limited finite length of the basis.The energy erty of d(t): leakage makes the quantitative definition of the energy- if it meets the above two conditions,an IMF frequency-space distribution of image difficult is derived,and replace x(t)with the residual r(t)= Due to the nature of image data,when process- x(t)-d(t): ing images with idea of EMD,bidimensional empir- if it is not an IMF,replace r(t)with d(t). ical mode decomposition (BEMD)is necessary.Ac- 5)Repeat Steps 1)~4)until the residual satisfies cordingly,to perform 2D space-frequency analysis of some stopping criterion. IMFs,Riesz transform,an isotropic generalization of Finally,x(t)is represented as the sum of a finite the Hilbert transform to multiple dimensions is also number of IMFs (both amplitude and frequency mod- needed[16).For BEMD,existing methods can be di- ulated)and a residual trend.The process of iteratively vided into two categories.The first category of the extracting an IMF is also called sifting process.One can approaches divide an image into one-dimensional data. observe from the above formulation that there is no pre- Then the 1D EMD is applied to a limited number of ori- fixed basis involving in the sifting process,instead,the entations:two (horizontal and vertical)or morel171.For decomposition proceeds depending on the data itself
Yan-Li Liu et al.: Pores-Preserving Face Cleaning 559 transformation. The Fourier transform is designed to work with liner and stationary signals. The wavelet transform, on the other hand, is well-suited to handle non-stationary data, but is poor at processing nonlinear data. However, few energy-frequency data are truly linear and stationary. To address this problem, the Hilbert-Huang transform (HHT) has been recently developed[4]. The HHT comprises two steps: EMD generates a finite number of intrinsic mode functions (IMF), and then the Hilbert transform is applied to IMF. The combination of IMF and its Hilbert transform forms an analytic signal, which can be used to generate a “time-frequency-energy” representation of the data. In this paper, we choose EMD and HHT rather than wavelet transform due to the following considerations. First, imperfect facial images are in general nonlinear and nonstationary in nature, EMD is suitable to process these data. Second, based on the local time scale of the data, EMD decomposes a signal into IMFs adaptively without using a prior basis which do not necessarily match the varying nature of the signals. Without the need of pre-specifying a decomposition level as requested by wavelet decomposition to extract out all the image’s oscillatory modes, EMD allows us to use all the oscillatory modes to further define imperfect degree. Thus, by exploiting EMD, we can obtain adaptive multi-resolution representation of image. Due to unpredictable appearances of various kinds of distractions, such an adaptive multi-resolution representation is favorable. Third, the IMF is generally in good agreement with intuitive and physical signal interpretations, and has well-defined instantaneous frequency, allowing us to define local energy to capture the characteristics of distractions and pores. On the contrary, the local energy extracted by Fourier or wavelet[13−15] suffers from the problem of energy leakage, due to the infinite or limited finite length of the basis. The energy leakage makes the quantitative definition of the energyfrequency-space distribution of image difficult. Due to the nature of image data, when processing images with idea of EMD, bidimensional empirical mode decomposition (BEMD) is necessary. Accordingly, to perform 2D space-frequency analysis of IMFs, Riesz transform, an isotropic generalization of the Hilbert transform to multiple dimensions is also needed[16]. For BEMD, existing methods can be divided into two categories. The first category of the approaches divide an image into one-dimensional data. Then the 1D EMD is applied to a limited number of orientations: two (horizontal and vertical) or more[17]. For simplicity, we call it directional BEMD. The methods of the second category are genuine sense 2D EMD which use radial basis functions (RBF) or cubic spline to interpolate envelopes[18−20]. Although they have been successfully used in some cases, these algorithms suffer from the problem of overshoot or undershoot because of their interpolation scheme. In this paper, we propose two solutions, namely limiting minimum frequency and adaptive local mean, to improve the performance of EMD. 2.2 Background of EMD The principle of EMD is to identify the intrinsic oscillatory modes embedded in the signals based on their characteristic time scales, and then to decompose the signals into a sum of IMFs. In order to permit physically meaningful instantaneous frequency to be de- fined over it, an IMF must satisfy the following two conditions[4]: first, the numbers of extrema and zerocrossings must differ by at most one; second, it is symmetric with respect to the local mean of the data. Ideally, the requirement of the second condition should be “the local mean of the data is zero”. Since it is difficult to define a local averaging time scale in general case, in practice, the local mean of the data is usually replaced by the mean of the envelopes defined by interpolating the local maxima and local minima. EMD is an iterative process. To illustrate its concept, here we describe the algorithm of EMD in 1D case. Given a signal x(t), the algorithm of EMD can be summarized as follows. 1) Identify all the local extrema of x(t). 2) Interpolate between local minima (resp. local maxima), ending up with the envelope emin(t) (resp. emax(t)). 3) Compute the mean m(t) = (emin(t) + emax(t))/2. 4) Extract the detail d(t) = x(t)−m(t). Check property of d(t): • if it meets the above two conditions, an IMF is derived, and replace x(t) with the residual r(t) = x(t) − d(t); • if it is not an IMF, replace x(t) with d(t). 5) Repeat Steps 1)∼4) until the residual satisfies some stopping criterion. Finally, x(t) is represented as the sum of a finite number of IMFs (both amplitude and frequency modulated) and a residual trend. The process of iteratively extracting an IMF is also called sifting process. One can observe from the above formulation that there is no pre- fixed basis involving in the sifting process, instead, the decomposition proceeds depending on the data itself
560 J.Comput.Sci.Technol.,May 2009,Vol.24.No.3 3 Improved EMD Sk=2k+1+1. .Initialize hk.o(p)=rk-1(p),11. Theoretically,there are two factors incurring the drawback of classic EMD mentioned in the above Detect all local extreme points of hk_1.Estab- lish the local maximum point set and local minimum section.First,since there is no bandwidth constraint during the decomposition process in classical EMD, point set respectively.For every pixel p of hk.1,if over-iterating on the whole signal for a better local ap- I(p)is strictly higher or lower than all its 8-connected neighbors,it is a local extreme point.Otherwise,it is proximation has the drawback of contaminating other parts of the signal,in particular,in uniformizing the not a local extreme point. Compute the geometric distance (Euclidean dis- amplitude and in over-decomposing it by spreading out its components over adjacent modes21.Second,clas- tance)between every two adjacent local maximum sical EMD uses the mean of envelopes interpolated by points row-wisely and then column-wisely in For the local maxima and minima respectively as a sub- example,suppose two adjacent local maximum points stitute for the local mean of data.Unfortunately,the in the same column are (1,y),(2,y)and 1 2, interpolation method,for example spline fitting,often then the distance S between them is z2-z1.If S>S, overshoots or undershoots the strongly nonlinear and then put the pixels(1+ix S,y),i=1,2,...,(S/S)] nonstationary data.Such overshoots and undershoots into The purpose of supplementing is to make the distance between every two adjacent local maxi- would be amplified through many times of spline fitting in the sifting process,and eventually result in diver- mum points not more than S,ensuring the distribution gence of IMF.Therefore,unless a better approximation of extreme points is not too sparse.Furthermore,to re- duce the boundary effects,we add extreme surrounding to local mean is conducted,EMD cannot be improved significantly. the original image boundaries by mirror symmetry.The For each of the above two problems,we propose a so- same operations are also performed on. For every pixel p,compute its local mean lution to get around it.The first solution is related with time-frequency uncertainty principle.As it is known, emean,1-1(p)using the adaptive local mean algorithm de- scribed below. uncertainty principle formulated as ·hk,l(p)=hk,l-1(p)-Cmean,.l-1(p): 7f×7t≥1/2 (1) If the stop criterion is met4,imf(p)=hk.(p), and stop the sifting process;else I=I+1. defines the relation between the width of frequency win- 3)rk(p)=rk-1(p)-imfk(p). dow Vf and that of the time window Vt.According 4)If the r(p)is a monotonic or constant signal, to (1),if we set Vt s a (a 0),then it must have terminate the decomposition process.Otherwise go to Vf >b(b=1/2a).That is,if we limit the maximal Step 2). temporal resolution,a minimal frequency resolution can The image I is finally expressed as the sum of finite be confined correspondingly.In the paper,we specify IMFs and a residue,as shown in the following equation: a maximum neighborhood in sifting process,as a re- sult,the minimal frequency of the resulting IMF is de- fined.Our solution to the second problem is:instead of I(p)= im时fk(p)+rK(p) (2) k=1 endeavoring to develop a fitting method which always cause overshoot or undershoot,we directly use the local The adaptive local mean algorithm is shown in Ta- mean of the image data to ascertain symmetry.To do ble 1:the function NumErtrema(p)returns the num- so,the only issue is to define a local space scale.Fortu- ber of local extreme points in p.r.Symmetry(p) nately,the above specified maximum neighborhood is tests whether the local extreme points in p.r are spa- a good candidate for defining such a local space scale. tially symmetric,and is defined as follows.The col- Taking computation cost and locality into account,an umn and row of p equally divide pr into four quad- adaptive local mean algorithm is developed. rants.Symmetry(p)is true if the following two condi- Integrating the above two strategies into one frame- tions are both satisfied:first,in each of the four quad- work.for a given image I with size of M x N in pixels, rants one local extreme point can be found;second, our ALNEMD algorithm is formulated as the following among these four local extreme points,two of them are steps. local maximum points and the other two are local min- 1)Initialize ro(p)=I(p). imum points.Otherwise Symmetry(p)is false.The 2)Extract k-th IMF imf&. threshold of ur is used to guarantee that the number ·Set the size of maximum neighborhood Sx×Sk, of local extreme points is not too few and typically set
560 J. Comput. Sci. & Technol., May 2009, Vol.24, No.3 3 Improved EMD Theoretically, there are two factors incurring the drawback of classic EMD mentioned in the above section. First, since there is no bandwidth constraint during the decomposition process in classical EMD, over-iterating on the whole signal for a better local approximation has the drawback of contaminating other parts of the signal, in particular, in uniformizing the amplitude and in over-decomposing it by spreading out its components over adjacent modes[21]. Second, classical EMD uses the mean of envelopes interpolated by the local maxima and minima respectively as a substitute for the local mean of data. Unfortunately, the interpolation method, for example spline fitting, often overshoots or undershoots the strongly nonlinear and nonstationary data. Such overshoots and undershoots would be amplified through many times of spline fitting in the sifting process, and eventually result in divergence of IMF. Therefore, unless a better approximation to local mean is conducted, EMD cannot be improved significantly. For each of the above two problems, we propose a solution to get around it. The first solution is related with time-frequency uncertainty principle. As it is known, uncertainty principle formulated as ∇f × ∇t > 1/2 (1) defines the relation between the width of frequency window ∇f and that of the time window ∇t. According to (1), if we set ∇t 6 a (a > 0), then it must have ∇f > b (b = 1/2a). That is, if we limit the maximal temporal resolution, a minimal frequency resolution can be confined correspondingly. In the paper, we specify a maximum neighborhood in sifting process, as a result, the minimal frequency of the resulting IMF is de- fined. Our solution to the second problem is: instead of endeavoring to develop a fitting method which always cause overshoot or undershoot, we directly use the local mean of the image data to ascertain symmetry. To do so, the only issue is to define a local space scale. Fortunately, the above specified maximum neighborhood is a good candidate for defining such a local space scale. Taking computation cost and locality into account, an adaptive local mean algorithm is developed. Integrating the above two strategies into one framework, for a given image I with size of M × N in pixels, our ALNEMD algorithm is formulated as the following steps. 1) Initialize r0(p) = I(p). 2) Extract k-th IMF imf k . • Set the size of maximum neighborhood Sk × Sk, Sk = 2k+1 + 1. • Initialize hk,0(p) = rk−1(p), l = 1. • Detect all local extreme points of hk,l−1. Establish the local maximum point set A and local minimum point set B respectively. For every pixel p of hk,l−1, if I(p) is strictly higher or lower than all its 8-connected neighbors, it is a local extreme point. Otherwise, it is not a local extreme point. • Compute the geometric distance (Euclidean distance) between every two adjacent local maximum points row-wisely and then column-wisely in A . For example, suppose two adjacent local maximum points in the same column are (x1, y), (x2, y) and x1 Sl , then put the pixels (x1+i×Sl , y), i = 1, 2, . . . , b(S/Sl)c into A . The purpose of supplementing A is to make the distance between every two adjacent local maximum points not more than Sl , ensuring the distribution of extreme points is not too sparse. Furthermore, to reduce the boundary effects, we add extreme surrounding the original image boundaries by mirror symmetry. The same operations are also performed on B. • For every pixel p, compute its local mean emean,l−1(p) using the adaptive local mean algorithm described below. • hk,l(p) = hk,l−1(p) − emean,l−1(p). • If the stop criterion is met[4] , imf k (p) = hk,l(p), and stop the sifting process; else l = l + 1. 3) rk(p) = rk−1(p) − imf k (p). 4) If the rk(p) is a monotonic or constant signal, terminate the decomposition process. Otherwise go to Step 2). The image I is finally expressed as the sum of finite IMFs and a residue, as shown in the following equation: I(p) = X K k=1 imf k (p) + rK(p). (2) The adaptive local mean algorithm is shown in Table 1: the function NumExtrema(p) returns the number of local extreme points in Ωp,T . Symmetry(p) tests whether the local extreme points in Ωp,T are spatially symmetric, and is defined as follows. The column and row of p equally divide Ωp,T into four quadrants. Symmetry(p) is true if the following two conditions are both satisfied: first, in each of the four quadrants one local extreme point can be found; second, among these four local extreme points, two of them are local maximum points and the other two are local minimum points. Otherwise Symmetry(p) is false. The threshold of µT is used to guarantee that the number of local extreme points is not too few and typically set
Yan-Li Liu et al.:Pores-Preserving Face Cleaning 561 as(T2-1)/3. 4.1 Extract the Local Property Compared with interpolation-based EMD,in our adaptive local mean algorithm,the approximation to Since our characterization and cleaning operation the local mean of image data is much more accurate, are only performed on skin parts of face,for a given therefore the problem of overshoot or undershoot of image,we first need to know which regions on the im- interpolation-based EMD is reduced.On the other age are skin parts.This information can be obtained hand,by setting a maximum neighborhood in sifting either by requiring users to indicate the skin regions, process,the minimal frequency of IMF is kept un- or by using facial feature detection algorithms to au- der control and hence image overdecomposition is pro- tomatize the process.In the paper,we manually mark hibited.Integrating both adaptive local mean algo- out the contour of the face,eyes and lips that are es- rithm and neighborhood limitation schemes,more ac- sential to the character of individuals which we do not curate frequency and amplitude can be achieved by intend to adjust.In the following,the operations are our method,laying a foundation to make quantitative all performed on the skin parts.Assuming that we energy-space analysis of facial images. have decomposed I into K IMFs with the ALNEMD algorithm described in the above section,now we apply Table 1.Adaptive Local Mean Algorithm Riesz transform to each IMF. Defining x=(1,r2)T,r is the pixel set The l-th Decomposition in Sifting Process of I,the Riesz kernel in the spatial domain is given for every pixel p of hk.1-1 as R(z)=z/2m3.We can see from the kernel that set T=3,flag false,ur; the integration in the Reisz transform is extremely lo- let p.r be the T x T neighborhood of p cal,which is a crucial requirement for analyzing non- while Tr and Symmetry(p)==tue fu(z)be the k-th IMF of the image,the corresponding Cmeam))=TxT∑qen.rhk-1(g monogenic signal thus takes the following form: flag =true; fk.M()=fr()+(R*fk)()=fk()+fk.R() break; else (6(t)+t/2)fk(-t)dt. (3) T=T+2: end if From the monogenic signal,meaningful local ampli- end while tude and local phase can be extracted.The local am- if flag==false plitude of fi.M()is: 1 Cmean(p)=x5∑qe.rhkI-1(gh end if f,M(川=VfM() end for =Vf(x)+lfk.R()2 (4) 4 Algorithm of Face Cleaning As local amplitude offers image's energetic informa- We will present in this section our face cleaning al- tion,we use the square of local amplitude,namely local energy,to characterize the oscillatory degree of every gorithm.First,we give an overview of the algorithm pixel x at level k. Formally,for the facial image I to be processed,we first use ALNEMD to decompose it into a series of both fre- LEk()=lfk.M()2. (5) quency and amplitude modulated IMFs.Next,Riesz transform is applied to every IMF,generating a corre- According to the theory of EMD.with the increase sponding monogenic signal22]which is the extension of of k,LEk(r)decreases.In order to study the behavior the analytical signal from 1D to 2D.Built upon mono- of LEL(x),for a given pixel x,across different levels k genic signals,the meaningful local energy of every pixel in a unified fashion,we normalize LEk(r),x E at ev- is extracted.By analyzing normalized local energy,we ery level,getting the normalized local energy NLE(). propose a function to quantitatively measure the im- Being a relative value,NLEk(z)reflects the relative perfect degree of image.The cleaning result is finally weight of local energy of pixel z at level k.Figs.1(b), produced after the coefficients of IMFs are adaptively 1(c),1(d)show the first three normalized local energy adjusted under the guide of imperfect degree. maps of Fig.1(f),a selected part of Fig.1(a)
Yan-Li Liu et al.: Pores-Preserving Face Cleaning 561 as (T 2 − 1)/3. Compared with interpolation-based EMD, in our adaptive local mean algorithm, the approximation to the local mean of image data is much more accurate, therefore the problem of overshoot or undershoot of interpolation-based EMD is reduced. On the other hand, by setting a maximum neighborhood in sifting process, the minimal frequency of IMF is kept under control and hence image overdecomposition is prohibited. Integrating both adaptive local mean algorithm and neighborhood limitation schemes, more accurate frequency and amplitude can be achieved by our method, laying a foundation to make quantitative energy-space analysis of facial images. Table 1. Adaptive Local Mean Algorithm The l-th Decomposition in Sifting Process for every pixel p of hk,l−1 set T = 3, flag = false, µT ; let Ωp,T be the T × T neighborhood of p; while T µT and Symmetry(p) == ture emean(p) = 1 T × T P q∈Ωp,T hk,l−1(q); flag = true; break; else T = T + 2; end if end while if flag == false emean(p) = 1 Sl × Sl P q∈Ωp,T hk,l−1(q); end if end for 4 Algorithm of Face Cleaning We will present in this section our face cleaning algorithm. First, we give an overview of the algorithm. Formally, for the facial image I to be processed, we first use ALNEMD to decompose it into a series of both frequency and amplitude modulated IMFs. Next, Riesz transform is applied to every IMF, generating a corresponding monogenic signal[22] which is the extension of the analytical signal from 1D to 2D. Built upon monogenic signals, the meaningful local energy of every pixel is extracted. By analyzing normalized local energy, we propose a function to quantitatively measure the imperfect degree of image. The cleaning result is finally produced after the coefficients of IMFs are adaptively adjusted under the guide of imperfect degree. 4.1 Extract the Local Property Since our characterization and cleaning operation are only performed on skin parts of face, for a given image, we first need to know which regions on the image are skin parts. This information can be obtained either by requiring users to indicate the skin regions, or by using facial feature detection algorithms to automatize the process. In the paper, we manually mark out the contour of the face, eyes and lips that are essential to the character of individuals which we do not intend to adjust. In the following, the operations are all performed on the skin parts. Assuming that we have decomposed I into K IMFs with the ALNEMD algorithm described in the above section, now we apply Riesz transform to each IMF. Defining x = (x1, x2) T, x ∈ Ω, Ω is the pixel set of I, the Riesz kernel in the spatial domain is given as R(x) = x/2π|x| 3 . We can see from the kernel that the integration in the Reisz transform is extremely local, which is a crucial requirement for analyzing nonstationary signals. The combination of a signal and its Riesz transformed is called the monogenic signal. Let fk(x) be the k-th IMF of the image, the corresponding monogenic signal thus takes the following form: fk,M(x) = fk(x) + (R ∗ fk)(x) = fk(x) + fk,R(x) = Z Ω (δ(t) + t/2π|t| 3 )fk(x − t)dt. (3) From the monogenic signal, meaningful local amplitude and local phase can be extracted. The local amplitude of fk,M(x) is: |fk,M(x)| = q f 2 k,M(x) = q f 2 k (x) + |fk,R(x)| 2. (4) As local amplitude offers image’s energetic information, we use the square of local amplitude, namely local energy, to characterize the oscillatory degree of every pixel x at level k. LEk(x) = |fk,M(x)| 2 . (5) According to the theory of EMD, with the increase of k, LEk(x) decreases. In order to study the behavior of LEk(x), for a given pixel x, across different levels k in a unified fashion, we normalize LEk(x), x ∈ Ω at every level, getting the normalized local energy NLEk(x). Being a relative value, NLEk(x) reflects the relative weight of local energy of pixel x at level k. Figs. 1(b), 1(c), 1(d) show the first three normalized local energy maps of Fig.1(f), a selected part of Fig.1(a)
562 J.Comput.Sci.Technol.,May 2009,Vol.24.No.3 Zero Level 0 goes up sharply compared with Fig.1(b),and that of .8 pores begins to attenuate.Finally,in Fig.1(d).while the NLE of pores drastically decreases,the NLE of the 0.6 300 distractions continues to grow.At this level,the dif- 0.4 50 ferentiation between pores and distractions are justi- 400 0.2 fied.According to NLE behaviors from the first to third 80 130180230280 level,all pixels in the image fall into three categories. 50 0.0 1)Pixels with NLE relatively large at all levels, (a) 0 mainly relate to some sharp imperfections,e.g.,pixels One Level Two Level in purple rectangles in Figs.1(b)~1(d). 2)Pixels with NLE from small to large mainly re- late to gently rolling distractions,e.g.,pixels in green 0.6 0.6 rectangles in Figs.1(b)1(d). 300 0.4 3)Pixels with NLE from large to small mainly relate 350 0.4 to skin details,e.g.,pixels in yellow rectangles in Figs. 02 0.2 1(b)1(d). 130180230280 50 80 10 130180230280 As the residual image of EMD is nearly constant. meaning that all the oscillatory modes of original im- g age have been picked out,it is reasonable to use NLE IPD of all IMFs to define pixels'imperfect degree.We de- fine an imperfect degree for every pixel z of I as the weighted sum of NLE of all IMF levels and their differ- ences between adjacent levels: 00 NLEk(), k=0 130180230280 K-1 B=>(NLEk()-NLEk()) k=0 Fig.1.(f)is an image patch cropped from (a)(green rectangle). IPD(z)=Xea+(1-)e-8. (6) (b)~(d)are the first,second and third NLE (=0,1,2)of (f). In the above formula,the first term rewards pixels Pixels in purple,green and yellow rectangle correspond to cases in case 1),and the second term punishes those pixels 1),2),3)respectively.(e)IPD of(f).In (e)and (f),the corre- in case 3)and rewards pixels in case 2).A is a weight spondences between pixels in IPD and the image are marked out to balance two terms and usually set between 0.2~0.5 with different colors The IPD defined as above quantitatively measures 4.2 Analyze the Normalized Local Property the imperfect degree of every pixel.The larger the IPD is,the more imperfect the facial skin is.As seen from Having extracted the NLE of every pixel at different Figs.1(e),1(f),while pixels with large IPD exactly cor- levels,in this subsection,we make an analysis of NLE respond to distractions as we have marked with green and then propose a function of every pixel to charac- and purple rectangles,pixels with small IPD directly re- terize the energy distribution of the facial image. late to image regions with fine scale details,see yellow From an overall view of Figs.1(b)~1(d),we can rectangles.The accurate correspondences show that clearly see:the pores and imperfections (bumps,scars) our IPD indeed delivers an intuitive measure of degree exhibit different characteristics from the first level to of visual imperfection. the third level.For the former,NLE decreases,and for 4.3 Adjust the Coefficient the latter,NLE increases.More specifically,in Fig.1(b) the NLE of the first level,both the pores and bumps all Once the IPD for every pixel has been calculated, have high levels of energy,especially the pores and some the next step is to adjust the coefficients of every IMF distractions.Note that,the part with most highest to meet our goal based on the hints of IPD.It is recog- NLE is on the upper right in Fig.1(f)where the pores nized that pixels with large IPD generally correspond can be clearly seen.In Fig.1(c),the NLE of distractions to imperfect skin such as large bumps which we intend
562 J. Comput. Sci. & Technol., May 2009, Vol.24, No.3 Fig.1. (f) is an image patch cropped from (a) (green rectangle). (b)∼(d) are the first, second and third NLE (k = 0, 1, 2) of (f). Pixels in purple, green and yellow rectangle correspond to cases 1), 2), 3) respectively. (e) IPD of (f). In (e) and (f), the correspondences between pixels in IPD and the image are marked out with different colors. 4.2 Analyze the Normalized Local Property Having extracted the NLE of every pixel at different levels, in this subsection, we make an analysis of NLE, and then propose a function of every pixel to characterize the energy distribution of the facial image. From an overall view of Figs. 1(b)∼1(d), we can clearly see: the pores and imperfections (bumps, scars) exhibit different characteristics from the first level to the third level. For the former, NLE decreases, and for the latter, NLE increases. More specifically, in Fig.1(b), the NLE of the first level, both the pores and bumps all have high levels of energy, especially the pores and some distractions. Note that, the part with most highest NLE is on the upper right in Fig.1(f) where the pores can be clearly seen. In Fig.1(c), the NLE of distractions goes up sharply compared with Fig.1(b), and that of pores begins to attenuate. Finally, in Fig.1(d), while the NLE of pores drastically decreases, the NLE of the distractions continues to grow. At this level, the differentiation between pores and distractions are justi- fied. According to NLE behaviors from the first to third level, all pixels in the image fall into three categories. 1) Pixels with NLE relatively large at all levels, mainly relate to some sharp imperfections, e.g., pixels in purple rectangles in Figs. 1(b)∼1(d). 2) Pixels with NLE from small to large mainly relate to gently rolling distractions, e.g., pixels in green rectangles in Figs. 1(b)∼1(d). 3) Pixels with NLE from large to small mainly relate to skin details, e.g., pixels in yellow rectangles in Figs. 1(b)∼1(d). As the residual image of EMD is nearly constant, meaning that all the oscillatory modes of original image have been picked out, it is reasonable to use NLE of all IMFs to define pixels’ imperfect degree. We de- fine an imperfect degree for every pixel x of I as the weighted sum of NLE of all IMF levels and their differences between adjacent levels: α = X K k=0 NLEk(x), β = K X−1 k=0 (NLEk(x) − NLEk+1(x)), IPD(x) = λeα + (1 − λ)e −β . (6) In the above formula, the first term rewards pixels in case 1), and the second term punishes those pixels in case 3) and rewards pixels in case 2). λ is a weight to balance two terms and usually set between 0.2∼0.5. The IPD defined as above quantitatively measures the imperfect degree of every pixel. The larger the IPD is, the more imperfect the facial skin is. As seen from Figs. 1(e), 1(f), while pixels with large IPD exactly correspond to distractions as we have marked with green and purple rectangles, pixels with small IPD directly relate to image regions with fine scale details, see yellow rectangles. The accurate correspondences show that our IPD indeed delivers an intuitive measure of degree of visual imperfection. 4.3 Adjust the Coefficient Once the IPD for every pixel has been calculated, the next step is to adjust the coefficients of every IMF to meet our goal based on the hints of IPD. It is recognized that pixels with large IPD generally correspond to imperfect skin such as large bumps which we intend
Yan-Li Liu et al.:Pores-Preserving Face Cleaning 563 to smooth,while pixels with small IPD relate to skin details such as pores which we need to preserve.In the following,we present a flexible filtering approach for modulating the pixels so as to remove facial imperfec- tions while preserving details. We take Gaussian filtering to smooth IMF.Nor- mally,the larger standard deviation is estimated,the larger smooth window should be employed.By experi- ment,we found that for most imperfect images a 11x 11 pixels window of Gaussian filter is sufficient.By intro- ducing a parameter h to control the degree of filtering, the standard deviation of Gaussian filtering for z at level k is formulated as ok(x)=h×(k+1)×IPD(x), (7) where IPD(x)is the imperfect degree of x,k+1 is a level-related quantity,meaning that standard deviation increases at the high level.h is the filtering parameter, large h implies high degree of smoothing,and vice versa. According to our experiments,a value of h ranging from 2 to 8 is suitable to achieve visually satisfactory results for most images.Typically,the default setting of h is 3. The resultant facial image is efficiently reconstructed by adding modified IMFs and the residual image to- gether. 5 Experiments and Discussion Fig.2.IMFs obtained by decomposing "Barbara"with direc- In this section,we show some experimental results of tional BEMD (the second row),RBF-based interpolation (the our ALNEMD and face cleaning algorithm in compari- third row)and our ALNEMD (the fourth row). sons with other relevant algorithms.We first compare ALNEMD with several other BEMD methods includ- the performance of different approaches.Specially,the ing directional BEMD.triangle-based cubic interpola- RMSE is defined as tion BEMD and RBF interpolation BEMD.The test M images are "Barbara"(the left image in the first row of Fig.2),Lena”(the middle one)and“Girl”(the right p(i,)-q(i,)2 one).In these results,when performing ALNEMD the RMSE= MN (8) threshold ur is set to(T2-1)/3 where T2 is the window size. Fig.2 shows the visual comparison between our AL Table 2.RMSEs of Three Approaches NEMD algorithm,directional BEMD and RBF inter- Image Cubic RBF Our Algor. polation BEMD.The second row shows the first three Barbara 0.0712 0.0539 0.0176 IMFs of the "Barbara"obtained by directional BEMD Lena 0.1015 0.0960 0.0180 The third row is the first three IMFs by RBF inter- Girl 0.0840 0.0703 0.0114 polation BEMD.The last row is our results.Unlike the results obtained by other two algorithms,no ob- Since the local mean of directional BEMD is not vious overdark or overbright spot is present on IMFs a truly 2D local mean,directional BEMD is not to be produced by our algorithm. compared in terms of RMSE.Table 2 shows the RMSEs We further adopt root mean square error (RMSE) of the three images by the three approaches,namely, between the local mean p(i,j),i=1,...,M,j= triangle-based cubic interpolation,RBF-based interpo- 1,...,N computed by different approaches and the true lation and our adaptive local mean algorithm.From local mean of the image (computed in a 5 x 5 neighbor- the table,we can see that the results of our algorithm hood)q(i,j),i=1,...,M,j=1,...,N to compare are much closer to the true local mean of image
Yan-Li Liu et al.: Pores-Preserving Face Cleaning 563 to smooth, while pixels with small IPD relate to skin details such as pores which we need to preserve. In the following, we present a flexible filtering approach for modulating the pixels so as to remove facial imperfections while preserving details. We take Gaussian filtering to smooth IMF. Normally, the larger standard deviation is estimated, the larger smooth window should be employed. By experiment, we found that for most imperfect images a 11×11 pixels window of Gaussian filter is sufficient. By introducing a parameter h to control the degree of filtering, the standard deviation of Gaussian filtering for x at level k is formulated as σk(x) = h × (k + 1) × IPD(x), (7) where IPD(x) is the imperfect degree of x, k + 1 is a level-related quantity, meaning that standard deviation increases at the high level. h is the filtering parameter, large h implies high degree of smoothing, and vice versa. According to our experiments, a value of h ranging from 2 to 8 is suitable to achieve visually satisfactory results for most images. Typically, the default setting of h is 3. The resultant facial image is efficiently reconstructed by adding modified IMFs and the residual image together. 5 Experiments and Discussion In this section, we show some experimental results of our ALNEMD and face cleaning algorithm in comparisons with other relevant algorithms. We first compare ALNEMD with several other BEMD methods including directional BEMD, triangle-based cubic interpolation BEMD and RBF interpolation BEMD. The test images are “Barbara” (the left image in the first row of Fig.2), “Lena” (the middle one) and “Girl” (the right one). In these results, when performing ALNEMD the threshold µT is set to (T 2−1)/3 where T 2 is the window size. Fig.2 shows the visual comparison between our ALNEMD algorithm, directional BEMD and RBF interpolation BEMD. The second row shows the first three IMFs of the “Barbara” obtained by directional BEMD. The third row is the first three IMFs by RBF interpolation BEMD. The last row is our results. Unlike the results obtained by other two algorithms, no obvious overdark or overbright spot is present on IMFs produced by our algorithm. We further adopt root mean square error (RMSE) between the local mean p(i, j), i = 1, . . . , M, j = 1, . . . , N computed by different approaches and the true local mean of the image (computed in a 5×5 neighborhood) q(i, j), i = 1, . . . , M, j = 1, . . . , N to compare Fig.2. IMFs obtained by decomposing “Barbara” with directional BEMD (the second row), RBF-based interpolation (the third row) and our ALNEMD (the fourth row). the performance of different approaches. Specially, the RMSE is defined as RMSE = vuuuut X M i=1 X N j=1 (p(i, j) − q(i, j))2 MN . (8) Table 2. RMSEs of Three Approaches Image Cubic RBF Our Algor. Barbara 0.0712 0.0539 0.0176 Lena 0.1015 0.0960 0.0180 Girl 0.0840 0.0703 0.0114 Since the local mean of directional BEMD is not a truly 2D local mean, directional BEMD is not to be compared in terms of RMSE. Table 2 shows the RMSEs of the three images by the three approaches, namely, triangle-based cubic interpolation, RBF-based interpolation and our adaptive local mean algorithm. From the table, we can see that the results of our algorithm are much closer to the true local mean of image
564 J.Comput.Sci.&Technol.,May 2009,Vol.24,No.3 (a) (b) (c) (d) (f) (g) h O (K) Fig.3.Cleaning effects comparison for the facial image.(a)Image with size 528 x 644 in pixels.(b)and (c)Results obtained by bilateral with different parameters.(d)Result of Matsui et al (e)Mask painted by the user to specify the region of cleaning.(f)Result of our method.Apparently,the bumps as well freckles are removed thoroughly,while the pores on the face are retained.(g)~(k)Five zoomed in skin patches selected from the same position (the rectangle)of Figs.3(a)~3(d)and 3(f)respectively. Subsequently,we tested our face cleaning approach removed completely in the result.In another experi- with a set of facial images.For color image,we first ment of bilateral filtering,we increase the od to remove convert it from RGB space to YCrCb space,and then bumps (at top left),it appears that pores are removed perform cleaning operation on its Y component.Fi- and the scars (at bottom left)are still present,as shown nally,we invert the YCrCb space to RGB space to pro- in Fig.3(c).Since large o,would blur the edges of face, duce the resulting image.In the following experimental when performing bilateral filtering for Fig.3(a),we also results,when computing IPD with our algorithm the A use the mask image,see Fig.3(e),to keep the eyes and is set as 0.4. mouth out of filtering in Figs.3(b)and 3(c).Fig.3(d) Fig.3 shows a comparison of our cleaning method, is the result of Matsui et al.2]with =31,2=29, bilateral filtering and the technique of Matsui et all2 E3=10.a =3.As can be seen,part of skin pores are The original image to be processed is shown in Fig.3(a), retained,but the spots and the scares on the left face and the mask image we used is shown in Fig.3(e).Our are left behind.Figs.3(g)~3(k)are five zoomed in skin ALNEMD decomposes the original image into three patches selected from the same position(the rectangle) IMFs (K=2)together with a residual.In the origi- of Figs.3(a)~3(d)and 3(f)respectively. nal image,there are a number of pronounced scars and Since the IPD is derived from the normalized local bumps,especially the scars on the left part of the face, energy,it reflects the relative energy of image pixels making the task of cleaning challenging.However,as and work in the case of noisy,blurry and different reso- can be seen from our processed image(Fig.3(f)with pa- lution images.Fig.4(a)is the original image part and rameter h=5),the pores on the face are well preserved, Fig.4(b)is the corresponding IPD.Fig.4(c)is a noisy while the large rough part of the face(at bottom left) version of Fig.4(a)with additive Gaussian white noise has been repaired.Figs.3(b)and 3(c)show the re- of variance 25,and Fig.4(d)is the IPD of Fig.4(c).As sults of bilateral filtering with different parameters.For can be seen,the IPD can still represent the visual per- Fig.3(b),the geometric spread od 1 and the photo- ception of face images.Fig.4(e)is a motion blurred metric spread or 1000.For Fig.3(c),od =2 and version of Fig.4(a)with translation 20 pixels and rota- o.=3000.It should be noted that in Fig.3(b)small od tion 10 degrees.Fig.4(f)is the corresponding IPD,we is adopted with an attempt to preserve pores and large can see that in Fig.4(e)pores are blurred,leaving be- or is adopted intending to smooth the edges of bumps hind some scars which have been represented with high and scars,but the small bumps (at top left)are not IPD values in Fig.4(f).Fig.4(g)is an image with a
564 J. Comput. Sci. & Technol., May 2009, Vol.24, No.3 Fig.3. Cleaning effects comparison for the facial image. (a) Image with size 528×644 in pixels. (b) and (c) Results obtained by bilateral with different parameters. (d) Result of Matsui et al. [12] (e) Mask painted by the user to specify the region of cleaning. (f) Result of our method. Apparently, the bumps as well freckles are removed thoroughly, while the pores on the face are retained. (g)∼(k) Five zoomed in skin patches selected from the same position (the rectangle) of Figs. 3(a)∼3(d) and 3(f) respectively. Subsequently, we tested our face cleaning approach with a set of facial images. For color image, we first convert it from RGB space to YCrCb space, and then perform cleaning operation on its Y component. Finally, we invert the YCrCb space to RGB space to produce the resulting image. In the following experimental results, when computing IPD with our algorithm the λ is set as 0.4. Fig.3 shows a comparison of our cleaning method, bilateral filtering and the technique of Matsui et al. [12] The original image to be processed is shown in Fig.3(a), and the mask image we used is shown in Fig.3(e). Our ALNEMD decomposes the original image into three IMFs (K = 2) together with a residual. In the original image, there are a number of pronounced scars and bumps, especially the scars on the left part of the face, making the task of cleaning challenging. However, as can be seen from our processed image (Fig.3(f) with parameter h = 5), the pores on the face are well preserved, while the large rough part of the face (at bottom left) has been repaired. Figs. 3(b) and 3(c) show the results of bilateral filtering with different parameters. For Fig.3(b), the geometric spread σd = 1 and the photometric spread σr = 1000. For Fig.3(c), σd = 2 and σr = 3000. It should be noted that in Fig.3(b) small σd is adopted with an attempt to preserve pores and large σr is adopted intending to smooth the edges of bumps and scars, but the small bumps (at top left) are not removed completely in the result. In another experiment of bilateral filtering, we increase the σd to remove bumps (at top left), it appears that pores are removed and the scars (at bottom left) are still present, as shown in Fig.3(c). Since large σr would blur the edges of face, when performing bilateral filtering for Fig.3(a), we also use the mask image, see Fig.3(e), to keep the eyes and mouth out of filtering in Figs. 3(b) and 3(c). Fig.3(d) is the result of Matsui et al. [12] with ε1 = 31, ε2 = 29, ε3 = 10, a = 3. As can be seen, part of skin pores are retained, but the spots and the scares on the left face are left behind. Figs. 3(g)∼3(k) are five zoomed in skin patches selected from the same position (the rectangle) of Figs. 3(a)∼3(d) and 3(f) respectively. Since the IPD is derived from the normalized local energy, it reflects the relative energy of image pixels and work in the case of noisy, blurry and different resolution images. Fig.4(a) is the original image part and Fig.4(b) is the corresponding IPD. Fig.4(c) is a noisy version of Fig.4(a) with additive Gaussian white noise of variance 25, and Fig.4(d) is the IPD of Fig.4(c). As can be seen, the IPD can still represent the visual perception of face images. Fig.4(e) is a motion blurred version of Fig.4(a) with translation 20 pixels and rotation 10 degrees. Fig.4(f) is the corresponding IPD, we can see that in Fig.4(e) pores are blurred, leaving behind some scars which have been represented with high IPD values in Fig.4(f). Fig.4(g) is an image with a
Yan-Li Liu et al.:Pores-Preserving Face Cleaning 565 (a) (c) (e) ⊙ IPD IPD .0 1.0 IPD 1.0 0.8 0.8 08 250 50 250 0.6 0.6 00 00 300 350 350 04 350 0.4 IPD 0.2 0.2 m1.0 400 400 125 0g0130180230280 00 0.5 50 0.0 80130180230280 50 0.0 80130180230280 50 0.0 4017 35 (b) (d) (h) Fig.4.IPD of noisy,blurred and different resolution images.(a),(c),(e),(g)Original,noisy,blurry, a quarter resolution images respectively.(b),(d),(f),(h)Corresponding IPD. (a) (b) (c) (d) (e) (f① (g) h Fig.5.Comparison of the three approaches.(a)Girl's face with image size 944 x 592 in pixels.(b)Result of bilateral filtering.(c) Result of Matsui et al(12](d)Our result.(e)(h)Four zoomed in skin patches selected from the same position (the green rectangle) in (a)(d). quarter of resolution of Fig.4(a),and Fig.4(h)is the present in Fig.5(b).Fig.5(d)is the result of Matsui corresponding IPD. et all2,the pores of girls are retained.However,the Fig.5 provides another comparison for Fig.5(a). freckles are present.Moreover,since the method uses Fig.5(b)(parameters od =5 and or =10)is produced unsharp mask to enhance the pores,the noise in dark by bilateral filtering and Fig.5(c)(h =3)by our al- regions,for example,the left face and right face near gorithm.We can see that the young girl's pores are the hairs,have been amplified. retained and meanwhile the blemishes are diminished Fig.6 demonstrates another result of our algorithm in Fig.5(c).By contrast,small distractions are still (Fig.6(d))for a patch of a young man's facial skin
Yan-Li Liu et al.: Pores-Preserving Face Cleaning 565 Fig.4. IPD of noisy, blurred and different resolution images. (a), (c), (e), (g) Original, noisy, blurry, a quarter resolution images respectively. (b), (d), (f), (h) Corresponding IPD. Fig.5. Comparison of the three approaches. (a) Girl’s face with image size 944 × 592 in pixels. (b) Result of bilateral filtering. (c) Result of Matsui et al. [12] (d) Our result. (e)∼(h) Four zoomed in skin patches selected from the same position (the green rectangle) in (a)∼(d). quarter of resolution of Fig.4(a), and Fig.4(h) is the corresponding IPD. Fig.5 provides another comparison for Fig.5(a). Fig.5(b) (parameters σd = 5 and σr = 10) is produced by bilateral filtering and Fig.5(c) (h = 3) by our algorithm. We can see that the young girl’s pores are retained and meanwhile the blemishes are diminished in Fig.5(c). By contrast, small distractions are still present in Fig.5(b). Fig.5(d) is the result of Matsui et al. [12], the pores of girls are retained. However, the freckles are present. Moreover, since the method uses unsharp mask to enhance the pores, the noise in dark regions, for example, the left face and right face near the hairs, have been amplified. Fig.6 demonstrates another result of our algorithm (Fig.6(d)) for a patch of a young man’s facial skin
566 J.Comput.Sci.Technol.,May 2009,Vol.24.No.3 (Fig.6(a))with h =3.Because there is no rugged 6 Conclusions and Future Work part on the face,the smoothing operation is evenly per- formed.In contrast with original skin texture of the We have presented a novel method to deal with im- man,the enlarged pores of skin have been refined,and perfect facial images.It is found that normalized local more delicate skin is obtained (Fig.6(d)).Figs.6(b) energy extracted from IMFs can effectively reveal dif- and 6(c)show the results of the bilateral filtering with ferent characteristics of fine scale details and distrac- different parameters,for Fig.6(b),the geometric spread tions.After analysis of normalized local energy,we pro- od =3 and photometric spread o,=10;for (c),od =5 pose a quantitative measure of facial skin,namely IPD and o,=20.It can be seen that some pores (at top Experiments show our IPD is an intuitive measure of right)in Figs.6(b)and 6(c)have been smeared out, degree of visual imperfection.Under the guide of IPD. meanwhile others(at bottom left)have been left aside we can adaptively adjust the coefficients of IMFs to im- in Figs.6(b)and 6(c).Fig.7 (h =3)shows other result prove the appearance of a facial skin.Our method not of our method. only removes distractions such as bumps and blemishes effectively,but also successfully maintains facial pores. Furthermore,our method does not request much inter- action and professional skills.In addition,an algorithm named as ALNEMD is also developed to improve in two respects the performance of EMD. There are some possible extensions to this work. Combining facial feature detection algorithms,we would like to extend our approach to enhance facial skins in video sequence which might be interesting.Be- (a) (b) ing a function of space,our IPD gives sharp identifica- tions of imbedded structures in facial images.Hence, it can be further used in other face-related work.For example,for those work which need to explicitly detect facial imperfections,e.g.,for identification5],IPD can aid in localizing skin irregularities. References [1]Tomasi C,Manduchi R.Bilateral filtering for gray and color g images.In Proc.the Sirth International Conference on Com- (d) puter Vision,Bombay,India,Jan.4-7,1998,pp.839-846. Fig.6.Result comparison for a facial image patch.(a)Image [2]Weiss B.Fast median and bilateral filtering.ACM Trans. Graphics,2006,25(3):519-526. patch with size 266 x 299 in pixels.(b)and (c)Results obtained [3]Buades A,Coll B,Morel J M.A non-local algorithm for image by bilateral filtering with different parameters.(d)Result pro- denoising.In Proc.IEEE Conference on Computer Vision duced by our method. and Pattern Recognition,San Diego,USA,June 20-25,2005, Vol.2,pp.60-65. [4]Huang NE et al.The empirical mode decomposition and Hilbert spectrum for nonlinear and nonstationary time se- ries analysis.In Proc.the Royal Society A:Mathematical, Physical and Engineering Sciences,March 8,1998,Vol.454, Pp.903-995. [5]Leyvand T,Cohen-Or D,Dror G,Lischinski D.Data-driven enhancement of facial attractiveness.ACM Trans.Graphics, 2008,27(3:38:1-38:9. [6]Nguyen M H,Lalonde J F,Efros AA,Fernando De la Torre. Image-based shaving. Computer Graphics Forum Journal. 2008.27(2):627-635. [7 Peers P,Tamura N,Matusik W,Debevec P.Post-production facial performance relighting using reflectance transfer.ACM Tmns.Graphics.2007.26(3):52:1-52:10. [8 Bitouk D,Kumar N,Dhillon S,Belhumeur P,Nayar S K. (a) (b) Face swapping:Automatically replacing faces in photographs. ACM Trans.Graphics,2008.27(3):39:1-39:8. Fig.7.Face cleaning effect (b)for a gentleman (a)with image [9]Perez P,Gangnet M,Blake A.Poisson image editing.ACM size 352 x 488 in pixels. Tans.Graphics,.2003,22(3):313-318
566 J. Comput. Sci. & Technol., May 2009, Vol.24, No.3 (Fig.6(a)) with h = 3. Because there is no rugged part on the face, the smoothing operation is evenly performed. In contrast with original skin texture of the man, the enlarged pores of skin have been refined, and more delicate skin is obtained (Fig.6(d)). Figs. 6(b) and 6(c) show the results of the bilateral filtering with different parameters, for Fig.6(b), the geometric spread σd = 3 and photometric spread σr = 10; for (c), σd = 5 and σr = 20. It can be seen that some pores (at top right) in Figs. 6(b) and 6(c) have been smeared out, meanwhile others (at bottom left) have been left aside in Figs. 6(b) and 6(c). Fig.7 (h = 3) shows other result of our method. Fig.6. Result comparison for a facial image patch. (a) Image patch with size 266 × 299 in pixels. (b) and (c) Results obtained by bilateral filtering with different parameters. (d) Result produced by our method. Fig.7. Face cleaning effect (b) for a gentleman (a) with image size 352 × 488 in pixels. 6 Conclusions and Future Work We have presented a novel method to deal with imperfect facial images. It is found that normalized local energy extracted from IMFs can effectively reveal different characteristics of fine scale details and distractions. After analysis of normalized local energy, we propose a quantitative measure of facial skin, namely IPD. Experiments show our IPD is an intuitive measure of degree of visual imperfection. Under the guide of IPD, we can adaptively adjust the coefficients of IMFs to improve the appearance of a facial skin. Our method not only removes distractions such as bumps and blemishes effectively, but also successfully maintains facial pores. Furthermore, our method does not request much interaction and professional skills. In addition, an algorithm named as ALNEMD is also developed to improve in two respects the performance of EMD. There are some possible extensions to this work. Combining facial feature detection algorithms, we would like to extend our approach to enhance facial skins in video sequence which might be interesting. Being a function of space, our IPD gives sharp identifications of imbedded structures in facial images. Hence, it can be further used in other face-related work. For example, for those work which need to explicitly detect facial imperfections, e.g., for identification[5], IPD can aid in localizing skin irregularities. References [1] Tomasi C, Manduchi R. Bilateral filtering for gray and color images. In Proc. the Sixth International Conference on Computer Vision, Bombay, India, Jan. 4–7, 1998, pp.839–846. [2] Weiss B. Fast median and bilateral filtering. ACM Trans. Graphics, 2006, 25(3): 519–526. [3] Buades A, Coll B, Morel J M. A non-local algorithm for image denoising. In Proc. IEEE Conference on Computer Vision and Pattern Recognition, San Diego, USA, June 20–25, 2005, Vol.2, pp.60–65. [4] Huang N E et al. The empirical mode decomposition and Hilbert spectrum for nonlinear and nonstationary time series analysis. In Proc. the Royal Society A: Mathematical, Physical and Engineering Sciences, March 8, 1998, Vol.454, pp.903–995. [5] Leyvand T, Cohen-Or D, Dror G, Lischinski D. Data-driven enhancement of facial attractiveness. ACM Trans. Graphics, 2008, 27(3): 38:1–38:9. [6] Nguyen M H, Lalonde J F, Efros A A, Fernando De la Torre. Image-based shaving. Computer Graphics Forum Journal, 2008, 27(2): 627–635. [7] Peers P, Tamura N, Matusik W, Debevec P. Post-production facial performance relighting using reflectance transfer. ACM Trans. Graphics, 2007, 26(3): 52:1–52:10. [8] Bitouk D, Kumar N, Dhillon S, Belhumeur P, Nayar S K. Face swapping: Automatically replacing faces in photographs. ACM Trans. Graphics, 2008, 27(3): 39:1–39:8. [9] Perez P, Gangnet M, Blake A. Poisson image editing. ACM Trans. Graphics, 2003, 22(3): 313–318