Y. W. Guo et al. / Improving Photo Composition Elegantly: Considering Image Similarity During Composition Optimization

Figure 2: Some photos taken by professional photographers.

well later optimization. We exploit the salient region detection method developed in [CZM*11]. Furthermore, we employ the Viola-Jones face detector to detect human faces. If both saliency detection and face detection fail to extract the salient regions, the user is allowed to draw outlines of the foreground subjects.

With the detected salient regions, we calculate composition aesthetics considering two aspects.
The first aspect, also widely exploited by previous work, is the distance from the center of interest to the four points, also called power points, where the lines of ROT intersect. The second is whether or not the salient objects are placed along the ROT lines, which also follows the ROT guideline. For a given image $I$, we then model $E_e(I)$ as

$$E_e(I) = \frac{1}{3} E_{sec}(I) + \frac{2}{3} E_{lin}(I), \tag{2}$$

where $E_{sec}$ and $E_{lin}$ account for the above two aspects, respectively. It is reasonable to emphasize $E_{lin}$ by assigning it the larger coefficient: when a professional photographer takes a photo with a prominent subject, he often places it along a vertical line of ROT, rather than rigidly restricting its center to one power point. Some examples are shown in Figure 2.

For $E_{sec}$, we sum up the score of each region weighted by region area,

$$E_{sec}(I) = \frac{1}{\sum_i S_i} \sum_i S_i \cdot \cos\left( \left( \frac{|p_{ix} - p^s_{ix}|}{w/3} + \frac{|p_{iy} - p^s_{iy}|}{h/3} \right) \cdot \frac{\pi}{2} \right), \tag{3}$$

where $S_i$ is the area of a salient region $R_i$, and $p_i$ and $p^s_i$ are the center of mass of $R_i$ and the closest power point to $p_i$, respectively. With the above formula, $E_{sec}$ for an image with a single subject whose center of mass lies on one of the four power points equals 1.

To compute $E_{lin}$, an optimal solution is to extract the medial axis of each salient region and compute its distance to the nearest line used in the ROT guideline. However, for most images with a prominent subject, for example a person or a tall building, the medial axis is nearly a vertical line segment. We therefore use a vertical axis as a substitute for the medial axis. Such an axis can be computed easily by finding the vertical line that divides the salient object into two parts of equal area. $E_{lin}$ is then calculated as

$$E_{lin}(I) = \frac{1}{\sum_i S_i} \sum_i S_i \cdot \cos\left( \frac{2\,dis(L_i, L^s)}{w/3} \cdot \frac{\pi}{2} \right), \tag{4}$$

where $L_i$ and $L^s$ are the vertical axis of $R_i$ and the nearest vertical ROT line to $L_i$, respectively, and $dis(\cdot,\cdot)$ is the Euclidean distance.

To ease exposition, we only explore the effect of salient regions following the guideline of rule of thirds.
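Equations (2)-(4) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that w and h are the image width and height in pixels, that each salient region is supplied as a hypothetical tuple of precomputed values (area, center of mass, axis position), and that the factor 2 inside the cosine of Eq. (4) is read correctly from the printed formula. The function names are illustrative only.

```python
import math

def equal_area_axis(mask):
    """x-coordinate of the vertical line that halves the area of a binary mask.

    mask is a list of rows of 0/1 values; column x covers the interval [x, x + 1).
    """
    col_area = [sum(col) for col in zip(*mask)]
    half = sum(col_area) / 2
    acc = 0
    for x, a in enumerate(col_area):
        if a > 0 and acc + a >= half:
            return x + (half - acc) / a  # interpolate inside this column
        acc += a
    raise ValueError("mask contains no salient pixels")

def nearest_third(v, size):
    """Nearest rule-of-thirds line position (size/3 or 2*size/3) to coordinate v."""
    return min((size / 3, 2 * size / 3), key=lambda t: abs(t - v))

def e_sec(regions, w, h):
    """Eq. (3): regions is a list of (area, cx, cy) tuples."""
    total = sum(s for s, _, _ in regions)
    score = 0.0
    for s, cx, cy in regions:
        # The closest power point is the pair of nearest vertical and horizontal lines.
        dx = abs(cx - nearest_third(cx, w)) / (w / 3)
        dy = abs(cy - nearest_third(cy, h)) / (h / 3)
        score += s * math.cos((dx + dy) * math.pi / 2)
    return score / total

def e_lin(regions, w):
    """Eq. (4): regions is a list of (area, axis_x) tuples."""
    total = sum(s for s, _ in regions)
    score = 0.0
    for s, axis_x in regions:
        # Distance between two vertical lines is their horizontal offset.
        d = abs(axis_x - nearest_third(axis_x, w))
        score += s * math.cos(2 * d / (w / 3) * math.pi / 2)
    return score / total

def e_e(sec, lin):
    """Eq. (2): weighted combination favoring the line term."""
    return sec / 3 + 2 * lin / 3
```

With these definitions, a single region whose center of mass sits on a power point yields a sector score of 1, while a region centered in the image yields a score of approximately 0 for both terms. With several regions, the larger ones dominate the weighted average.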
Region area is taken as the coefficient in the above formulations, emphasizing the influence of large salient objects. For images such as landscapes or seascapes that lack distinct foreground objects, it is intuitive to compute the composition aesthetics by detecting the prominent lines and computing the score using $E_{lin}$. It is worth noting that previous techniques have used a learnt support vector regression model [BSS10] or a computational means [LCWCO10] to capture image aesthetics by taking more aesthetic perspectives into account. Such models can be seamlessly integrated into our framework for evaluating composition aesthetics.

3.2. Image Similarity

To improve composition aesthetics, retargeting techniques are often employed to adjust the positions of distinct foreground objects. During this process, images are subject to visual distortions, especially those with complex background structures. To keep such distortions within an acceptable tolerance, a similarity measure should be used to quantify the visual difference between the optimized image and the original one. Traditional quality metrics such as mean squared error (MSE), although simple to calculate, are not well matched to perceived visual quality. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, a measure of structural similarity, called SSIM, which compares local patterns of pixel intensities that have been normalized for luminance and contrast, was developed in [WBSS04]. Experiments on several publicly available subject-rated image databases show that SSIM values exhibit much better consistency with the qualitative visual appearance. We therefore adopt SSIM to measure the similarity between the improved image and the input.
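The contrast between MSE and a structural measure can be made concrete with a small Python sketch. It is an illustration under simplifying assumptions, not the measure as used in this paper: it evaluates a single global window over a flat list of intensities instead of local sliding windows, it uses the standard luminance, contrast, and structure terms and stabilizing constants from [WBSS04], and it sets all three exponents to 1 by default. The names `mse` and `ssim_global` are hypothetical.

```python
import math

def mse(x, y):
    """Mean squared error between two equally sized intensity lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def ssim_global(x, y, alpha=1.0, beta=1.0, gamma=1.0, dynamic_range=255.0):
    """Single-window SSIM: luminance^alpha * contrast^beta * structure^gamma."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally sized signals with at least 2 samples")
    # Stabilizing constants as in [WBSS04]: C1 = (0.01 L)^2, C2 = (0.03 L)^2, C3 = C2 / 2.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    c3 = c2 / 2
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    sx, sy = math.sqrt(vx), math.sqrt(vy)
    lum = (2 * mx * my + c1) / (mx ** 2 + my ** 2 + c1)
    con = (2 * sx * sy + c2) / (vx + vy + c2)
    struct = (cov + c3) / (sx * sy + c3)
    # Note: struct can be negative, so non-integer gamma values are not safe here.
    return lum ** alpha * con ** beta * struct ** gamma

# Toy signals: two distortions of the same original with identical MSE.
original = [10, 50, 90, 130]
shifted = [v + 20 for v in original]   # uniform brightness shift
alternating = [30, 30, 110, 110]       # +20 / -20 alternating perturbation
```

In this toy case both distorted signals have the same MSE with respect to the original, yet the uniform brightness shift, which preserves the intensity pattern, receives a higher score from `ssim_global` than the alternating perturbation, which flattens local structure.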
SSIM is defined as

$$\mathrm{SSIM}(I_r, I_o) = [l(I_r, I_o)]^{\alpha} \cdot [c(I_r, I_o)]^{\beta} \cdot [s(I_r, I_o)]^{\gamma}, \tag{5}$$

where $l(\cdot,\cdot)$, $c(\cdot,\cdot)$, and $s(\cdot,\cdot)$ compare the luminance, contrast, and structure between $I_r$ and $I_o$, respectively. $\alpha$, $\beta$, and $\gamma$ are parameters that control the relative importance of the three components. In order to simplify the expression, they can be

© 2012 The Author(s)
© 2012 The Eurographics Association and Blackwell Publishing Ltd.