Y. Guo et al. / Computers & Graphics 38 (2014) 174–182
Viewing sphere: We can generate novel images under key viewpoints around the viewpoint of the input image, with given viewing angles. This enables us to design an interface through which the user can watch the scene by changing viewpoints smoothly on a viewing sphere, mimicking a 3D browsing experience. As shown in Fig. 13, the images on the viewing sphere under new viewpoints are interpolated from the original input and the newly rendered images under four key viewpoints. Please refer to our accompanying files for the live demo.

Limitations: Our system creates a partial reconstruction from a single image.
Although our system generates visually pleasing results for a variety of images with standard or non-standard cuboid structures, its applicability is limited. Furthermore, the new viewpoints are restricted to a certain range around the viewpoint of the input image. Changing the viewpoint dramatically may lead to dis-occlusions or holes that need to be filled. Some previously occluded parts may shift from invisible to visible, which ultimately leads to visual artifacts because a single image provides only incomplete scene information. Fig. 14 shows such an example, where the roof of the house is not visible in the input and its corresponding part looks hollow in the right result. A possible solution is to incorporate a second input image with a different viewpoint, which would be interesting to explore in the future.

This example also reveals another limitation of our approach. The result is initially generated with non-regular borders. Cropping it to regular borders inevitably discards some image content near the image boundaries, sacrificing image resolution. This is especially true for images whose cuboid structure occupies nearly the whole image. In addition, we use line constraints to preserve the shape of objects that span two faces of the cuboid structure. Our method might not produce optimal results in such cases.

Besides, the soft constraints in our optimization do not carry any 3D information. Perspective may be violated in image regions that are not reached by the cuboid structure. This limitation is exposed in Fig. 10 (right), where the tall building at the top-left is parallel to the main editing target in the input, but this relation is destroyed in the result.

6. Conclusion

We have presented an algorithm for manipulating the viewpoints of cuboid-structured images and generating new images realistically.
Our framework creates a partial scene reconstruction with minimal user interaction, and we show that such an approximate reconstruction is sufficient to re-render the image under a new viewpoint via a triangular mesh deformation scheme. The mesh deformation energy is optimized efficiently by solving a sparse linear system. In addition to generating images with novel viewpoints, we provide a user interface that allows users to interactively watch the scene under new viewpoints on a viewing sphere.

In the current implementation, the user needs to manually specify lines of the latent cuboid structure on the input image, although the Hough transform and the Canny detector can be used to assist in this operation. Recent research efforts on localizing 3D cuboids in single-view images may facilitate automation of this process [21,6]. In the future, we plan to explore integrating automatic detection and analysis of cuboid structures into our framework.

Acknowledgments

We greatly thank the anonymous reviewers for their positive and constructive comments. This work was supported in part by the National Science Foundation of China under Grants 61073098, 61021062, and 61373059, and the National Basic Research Program of China (2010CB327903).

References

[1] Carroll R, Agarwala A, Agrawala M. Image warps for artistic perspective manipulation. ACM Trans Graph (Siggraph) 2010;29(4):127:1–9.
[2] Du S, Hu S, Martin R. Changing perspective in stereoscopic images. IEEE Trans Visualization Comput Graph 2013;19(8):1288–97.
[3] Lee H, Shechtman E, Wang J, Lee S. Automatic upright adjustment of photographs. In: IEEE conference on computer vision and pattern recognition (CVPR); 2012. p. 877–84.
[4] Karsch K, Hedau V, Forsyth D, Hoiem D. Rendering synthetic objects into legacy photographs. ACM Trans Graph 2011;30(6):157:1–12.
[5] Jiang N, Tan P, Cheong L-F. Symmetric architecture modeling with a single image. ACM Trans Graph (Siggraph Asia) 2009;28(5):113:1–8.
[6] Zheng Y, Chen X, Cheng M-M, Zhou K, Hu S-M, Mitra NJ. Interactive images: cuboid proxies for smart image manipulation. ACM Trans Graph (Siggraph) 2012;31(4):99:1–99:11.
[7] Horry Y, Anjyo KI, Arai K. Tour into the picture: using a spidery mesh interface to make animation from a single image. In: Siggraph; 1997. p. 225–32.
[8] Hoiem D, Efros AA, Hebert M. Automatic photo pop-up. ACM Trans Graph (Siggraph) 2005;24(3):577–84.
[9] Igarashi T, Moscovich T, Hughes JF. As-rigid-as-possible shape manipulation. ACM Trans Graph 2005;24(3):1134–41.
[10] Schaefer S, McPhail T, Warren J. Image deformation using moving least squares. ACM Trans Graph 2006;25(3):533–40.
[11] Liu F, Gleicher M. Automatic image re-targeting with fisheye-view warping. In: ACM UIST; 2005. p. 153–62.
[12] Gal R, Sorkine O, Cohen-Or D. Feature-aware texturing. In: 17th eurographics workshop on rendering; 2006. p. 297–304.
[13] Wang Y-S, Tai C-L, Sorkine O, Lee T-Y. Optimized scale-and-stretch for image resizing. ACM Trans Graph (Siggraph Asia) 2008;27(5):118:1–8.
[14] Guo Y, Liu F, Shi J, Zhou Z-H, Gleicher M. Image retargeting using mesh parametrization. IEEE Trans Multimedia 2009;11(4):856–67.
[15] Jin Y, Liu L, Wu Q. Nonhomogeneous scaling optimization for realtime image resizing. Vis Comput (Proc CGI) 2010;26(6–8):769–78.
[16] Carroll R, Agrawala M, Agarwala A. Optimizing content-preserving projections for wide-angle images. ACM Trans Graph 2009;28(3):1–9.
[17] Kopf J, Lischinski D, Deussen O, Cohen-Or D, Cohen MF. Locally adapted projections to reduce panorama distortions. Comput Graph Forum 2009;28(4):1083–9.
[18] Bhattacharya S, Sukthankar R, Shah M. A framework for photo-quality assessment and enhancement based on visual aesthetics. In: ACM multimedia; 2010. p. 271–80.
[19] Chen T, Cheng M-M, Tan P, Shamir A, Hu S-M. Sketch2photo: internet image montage. ACM Trans Graph (Siggraph Asia) 2009;28(5):124:1–10.
[20] Liu L, Chen R, Wolf L, Cohen-Or D. Optimizing photo composition.
Comput Graph Forum (Eurographics) 2010;29(2):469–78.
[21] Xiao J, Russell B, Torralba A. Localizing 3d cuboids in single-view images. In: Neural information processing systems (NIPS); 2012. p. 755–63.
[22] Zhang G-X, Cheng M-M, Hu S-M, Martin RR. A shape-preserving approach to image resizing. Comput Graph Forum 2009;28(7):1897–906.
[23] Huang Q-X, Mech R, Carr N. Optimizing structure preserving embedded deformation for resizing images and vector art. Comput Graph Forum 2009;28(7):1887–96.
[24] Lévy B, Petitjean S, Ray N, Maillot J. Least squares conformal maps for automatic texture atlas generation. ACM Trans Graph (Siggraph) 2002;21(3):362–71.

Fig. 14. A failure case. The roof of the house is not visible in the left input and its corresponding part looks hollow in the right result due to visibility shifting caused by viewpoint change.
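
The viewing-sphere discussion above states that images under new viewpoints are interpolated from the original input and the images rendered under four key viewpoints (Fig. 13), but it does not spell out the interpolation scheme. The Python sketch below shows one plausible reading and should not be taken as our actual implementation. It assumes that the input view sits at (0, 0) in yaw/pitch, that the four key views sit at (±max_yaw, 0) and (0, ±max_pitch), and that blending uses barycentric weights inside the triangle formed by the input view and the two nearest key views. The names blend_weights, interpolate, max_yaw, and max_pitch are hypothetical, and the blended values may stand for pixel intensities or mesh vertex coordinates.

```python
from typing import Dict, List

# Assumed key layout (not specified in the text): "center" is the input view,
# the other four are the key viewpoints on the viewing sphere.
KEYS = ("center", "left", "right", "up", "down")


def blend_weights(yaw: float, pitch: float,
                  max_yaw: float, max_pitch: float) -> Dict[str, float]:
    """Barycentric weights of a viewpoint with respect to the nearest keys.

    The viewpoint must lie inside the diamond |yaw|/max_yaw + |pitch|/max_pitch <= 1,
    which mirrors the restricted viewpoint range discussed under Limitations.
    """
    u = abs(yaw) / max_yaw      # fraction of the way to the horizontal key
    v = abs(pitch) / max_pitch  # fraction of the way to the vertical key
    if u + v > 1.0 + 1e-12:
        raise ValueError("viewpoint lies outside the range covered by the keys")
    weights = {key: 0.0 for key in KEYS}
    weights["center"] = max(0.0, 1.0 - u - v)
    weights["right" if yaw >= 0 else "left"] = u
    weights["up" if pitch >= 0 else "down"] = v
    return weights


def interpolate(values: Dict[str, List[float]],
                weights: Dict[str, float]) -> List[float]:
    """Weighted sum of per-view value lists (pixels or vertex coordinates)."""
    out = [0.0] * len(values["center"])
    for key, w in weights.items():
        if w == 0.0:
            continue  # keys on the far side of the sphere do not contribute
        for i, x in enumerate(values[key]):
            out[i] += w * x
    return out


if __name__ == "__main__":
    w = blend_weights(yaw=5.0, pitch=-2.5, max_yaw=10.0, max_pitch=10.0)
    print(w)
```

Because the weights are non-negative and sum to one, the interpolated result reproduces each key image exactly at its own viewpoint and varies continuously in between, which is the property a smooth browsing interface needs. A simple cross-dissolve of pixels would ghost where the keys disagree, so interpolating mesh vertex positions and re-rendering is likely the better use of these weights.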