High-Resolution Image Inpainting with Iterative Confidence Feedback and Guided Upsampling
European Conference on Computer Vision (ECCV), pp. 1–17, 2020.
Abstract:
Existing image inpainting methods often produce artifacts when dealing with large holes in real applications. To address this challenge, we propose an iterative inpainting method with a feedback mechanism. Specifically, we introduce a deep generative model which not only outputs an inpainting result but also a corresponding confidence map.
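As a concrete reading of the abstract, here is a minimal sketch of the iterative confidence-feedback loop, assuming a hypothetical `generator(image, mask)` that returns an inpainted image and a per-pixel confidence map, and a hard confidence threshold `tau` (the paper's actual update rule may differ):

```python
import torch

def iterative_inpaint(image, mask, generator, n_iters=4, tau=0.5):
    """image: (B,3,H,W); mask: (B,1,H,W) with 1 = hole, 0 = known."""
    output = image
    for _ in range(n_iters):
        # The model predicts a result and a confidence map (assumed interface).
        output, confidence = generator(image, mask)
        # Trust only high-confidence predictions inside the current hole.
        trusted = (confidence > tau).float() * mask
        # Reuse trusted pixels as known content for the next iteration.
        image = image * (1 - trusted) + output * trusted
        mask = mask * (1 - trusted)
        if mask.sum() == 0:  # hole fully filled
            break
    # Remaining low-confidence pixels fall back to the last prediction.
    return image * (1 - mask) + output * mask
```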
Introduction
- Image inpainting is an important problem in computer vision and an essential functionality in many imaging and graphics applications, e.g., object removal, image restoration, manipulation, re-targeting, compositing, and image-based rendering [9,22,29]
- Classical inpainting methods such as [13,21,9] typically rely on the principle of borrowing example patches from known regions or external database images and pasting them into the holes.
Highlights
- Image inpainting is the task of reconstructing missing regions in an image
- We aim to address the challenge of filling large holes in high resolution images for real image editing applications, e.g., object removal
- We introduce a new procedure to synthesize training data for building deep generative models for real object removal applications (see the sketch after this list)
- Comparison with state-of-the-art methods (Sec. 4.2): we evaluate quantitative scores and visual quality of two variants of our method, i.e., Ours*: the iterative inpainting model running on the original input without guided upsampling, and Ours: the iterative inpainting model running on 2× downsampled input and using the guided upsampling model to obtain results at the original size
- We propose a high-resolution image inpainting method for large object removal
- Experiments show that our method significantly outperforms existing methods in both quantitative and qualitative evaluations
- Experiments show that our method outperforms existing methods on realistic testing samples and achieves better visual quality on real object removal requests from the Web
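The data-synthesis procedure referenced in the list above is not spelled out in this summary; below is a hedged sketch of the general idea, assuming object masks sampled from a segmentation dataset are punched as object-shaped holes into ordinary scene images (the mask source and random placement are illustrative assumptions):

```python
import random
import numpy as np

def synthesize_sample(image, object_masks):
    """image: (H,W,3) uint8 target; object_masks: list of (h,w) binary arrays."""
    H, W = image.shape[:2]
    m = random.choice(object_masks).astype(np.float32)
    h, w = min(m.shape[0], H), min(m.shape[1], W)
    top = random.randint(0, H - h)
    left = random.randint(0, W - w)
    mask = np.zeros((H, W), dtype=np.float32)
    mask[top:top + h, left:left + w] = m[:h, :w]   # object-shaped hole
    # The holed image is the network input; the original image is the target.
    holed = image * (1 - mask[..., None])
    return holed.astype(np.uint8), mask
```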
Methods
- Quantitative comparison on object-shaped holes, irregular holes (Places2), and square holes (Places2), reported as L1 loss (lower is better), PSNR, and SSIM (higher is better):

| Method | Object-shaped: L1 | PSNR | SSIM | Irregular (Places2): L1 | PSNR | SSIM | Square (Places2): L1 | PSNR | SSIM |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| [9] | .0273 | 25.64 | .8780 | .0288 | 22.87 | .8549 | .0432 | 19.19 | .7922 |
| [18] | .0292 | 24.23 | .8653 | .0385 | 20.95 | .8185 | .0386 | 20.16 | .7950 |
| [35] | .0243 | 26.07 | .8803 | .0245 | 24.31 | .8718 | .0430 | 19.08 | .7984 |
| [27] | .0246 | 26.24 | .8871 | .0221 | 24.78 | .8701 | .0368 | 20.30 | .8017 |
| Ours* | .0194 | 28.20 | .8985 | .0203 | 25.43 | .8828 | .0361 | 20.21 | .8130 |
| Ours | .0205 | 27.67 | .8949 | .0220 | 24.70 | .8744 | .0384 | 19.69 | .8063 |

- Ours*: the iterative inpainting model running on the original input without guided upsampling; Ours: the iterative inpainting model running on 2× downsampled input, using the guided upsampling model to obtain results at the original size (sketched below)
- Comparison with more methods can be found in the supplementary material
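Read together with the table, the two variants amount to the pipeline sketched below, reusing `iterative_inpaint` from the earlier sketch and a hypothetical `guided_upsample(low_res, image, mask)` standing in for the guided upsampling network:

```python
import torch.nn.functional as F

def inpaint_ours_star(image, mask, generator):
    """Ours*: iterative inpainting directly at the original resolution."""
    return iterative_inpaint(image, mask, generator)

def inpaint_ours(image, mask, generator, guided_upsample):
    """Ours: inpaint at 2x-downsampled resolution, then guided upsampling."""
    small_img = F.interpolate(image, scale_factor=0.5, mode='bilinear',
                              align_corners=False)
    small_mask = (F.interpolate(mask, scale_factor=0.5) > 0).float()
    low_res = iterative_inpaint(small_img, small_mask, generator)
    # Full resolution is reconstructed with high-res patches from the input.
    return guided_upsample(low_res, image, mask)
```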
Results
- Experiments show that the method significantly outperforms existing methods in both quantitative and qualitative evaluations.
Conclusion
- The authors propose a high-resolution image inpainting method for large object removal.
- To improve visual quality for high-resolution inputs, the authors first obtain a low-resolution inpainting result and then reconstruct it with high-resolution neural patches, yielding a high-resolution output (see the sketch after this list).
- By decoupling high-level understanding and low-level reconstruction, the method can provide results that are both semantically reasonable and visually realistic.
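A minimal, single-scale sketch of what such neural-patch reconstruction could look like: patches of the low-resolution result are matched against known-region patches, and the corresponding 2× larger patches are copied from the high-resolution input. The hard-argmax matching, fixed patch size, and zeroing (rather than excluding) of hole-overlapping candidates are simplifying assumptions; this is not the paper's exact guided upsampling module.

```python
import torch
import torch.nn.functional as F

def borrow_patches(low_res, high_res, mask, p=3):
    """low_res: (1,C,h,w) inpainted result; high_res: (1,C,2h,2w) input;
    mask: (1,1,h,w) with 1 = hole. Returns a (1,C,2h,2w) reconstruction."""
    # Match every low-res patch against patches from the known region.
    queries = F.normalize(F.unfold(low_res, p), dim=1)              # (1, C*p*p, N)
    keys = F.normalize(F.unfold(low_res * (1 - mask), p), dim=1)
    match = torch.einsum('bcn,bcm->bnm', queries, keys).argmax(-1)  # (1, N)
    # Copy the matched patches at twice the size from the high-res image.
    hi = F.unfold(high_res, 2 * p, stride=2)                        # (1, C*4*p*p, N)
    hi = hi.gather(2, match.unsqueeze(1).expand(-1, hi.size(1), -1))
    out_size = (high_res.size(2), high_res.size(3))
    # Fold overlapping patches back and normalize by the overlap count.
    num = F.fold(hi, out_size, 2 * p, stride=2)
    den = F.fold(torch.ones_like(hi), out_size, 2 * p, stride=2)
    return num / den.clamp(min=1)
```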
Tables
- Table1: Quantitative evaluation and user preference of various methods. P.c.: preference count in user study
- Table2: Effect of each component. IT: iterative inpainting; CF: confidence feedback
Related work
- Earlier image inpainting methods rely on existing content to fill the holes. Diffusion-based methods [8,10] propagate neighboring appearances into the target holes, but they often generate significant artifacts when the holes are large or the texture variation is severe. Patch-based methods [13,21,9] search for the most similar patches from valid regions to complete missing regions. Drori et al. [12] propose to iteratively fill missing regions from high to low confidence with similar patches; although they also use a map to determine the region to fill in each iteration, the map is predefined based on spatial distances from unknown pixels to their closest valid pixels. The above methods use real image patches sampled from the input to fill the holes and can often generate high-quality results. However, they lack high-level structural understanding and cannot generate entirely new content that does not exist in the input image, so their results may not be semantically consistent with the regions surrounding the holes.
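To make the classical patch-borrowing principle concrete, here is an illustrative greedy sketch in the spirit of methods like [13,21,9], not any one of them: hole pixels are filled by pasting the best-matching fully-known patch under SSD. The naive fill order, candidate set, and the assumption that at least one fully-known patch exists are simplifications.

```python
import numpy as np

def greedy_patch_fill(img, mask, p=7, stride=4):
    """img: (H,W,3) float array; mask: (H,W) bool, True = hole. Fills in place."""
    H, W = mask.shape
    r = p // 2
    # Candidate patches sampled from fully-known regions.
    cands = [img[y - r:y + r + 1, x - r:x + r + 1].copy()
             for y in range(r, H - r, stride)
             for x in range(r, W - r, stride)
             if not mask[y - r:y + r + 1, x - r:x + r + 1].any()]
    while mask.any():
        ys, xs = np.nonzero(mask)                     # naive fill order
        y = min(max(int(ys[0]), r), H - r - 1)
        x = min(max(int(xs[0]), r), W - r - 1)
        tgt = img[y - r:y + r + 1, x - r:x + r + 1]
        known = ~mask[y - r:y + r + 1, x - r:x + r + 1]
        # Pick the candidate with the lowest SSD over the known pixels.
        errs = [(((c - tgt) ** 2).sum(-1) * known).sum() for c in cands]
        best = cands[int(np.argmin(errs))]
        hole = mask[y - r:y + r + 1, x - r:x + r + 1]
        tgt[hole] = best[hole]                        # paste into the hole
        mask[y - r:y + r + 1, x - r:x + r + 1] = False
    return img
```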
Study subjects and analysis
- Real object removal cases: 25. To evaluate visual quality, a user study is conducted on 25 real object removal cases collected from object removal requests on the Web; the method's results are similar to PatchMatch in terms of texture but more reasonable in terms of structure. All images are resized so that the short side equals 512.
- Users: 11. Each input image, with a marked region to remove, is shown together with the results of the different methods in random order to 11 users, who are asked to select the single best result. Each combination of input and results is shown twice, and a valid vote is counted only when a user selects the same result both times (see the sketch after this list).
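A tiny sketch of this double-vote rule; the data layout is an assumption for illustration, only the counting rule itself comes from the study description:

```python
from collections import Counter

def preference_counts(votes):
    """votes: {(user, case): [first_pick, second_pick]} -> Counter of methods."""
    tally = Counter()
    for first, second in votes.values():
        if first == second:          # valid vote: consistent choice both times
            tally[first] += 1
    return tally

# Example: two users on one case; only user A's vote is valid.
print(preference_counts({("A", 1): ["Ours", "Ours"], ("B", 1): ["Ours", "[27]"]}))
# Counter({'Ours': 1})
```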
References
- can someone please remove my co-worker on the right and just leave the rhino and my friend on the left? https://www.reddit.com/r/PhotoshopRequest/comments/82v6x1/specific_can_someone_please_remove_my_coworker_on/
- Can someone please remove the backpack and ’lead’ from my sons back? would love to have this picture of my kids without it! https://www.reddit.com/r/PhotoshopRequest/comments/6szh1i/specific_can_someone_please_remove_the_backpack/
- can someone please remove the people holding the balloons and their shadows from this engagement photo? https://www.reddit.com/r/PhotoshopRequest/comments/8d12tw/specific_can_someone_please_remove_the_people/
- Can someone remove the woman in purple please? will give reddit gold! https://www.reddit.com/r/PhotoshopRequest/comments/6ddjg3/paid_specific_can_someone_remove_the_woman_in/
- Could someone help me remove background people - especially the guys head? will venmo $5. https://www.reddit.com/r/PhotoshopRequest/comments/b2y0o5/specific_paid_could_someone_help_me_remove/
- Could someone please remove the people in the background if at all possible! https://www.reddit.com/r/PhotoshopRequest/comments/6f0g4k/specific_could_someone_please_remove_the_people/
- If possible, can anyone help me remove the people on the side, esp the people facing towards the camera:) thank you. https://www.reddit.com/r/PhotoshopRequest/comments/anizco/specific_if_possible_can_anyone_help_me_remove/
- Ballester, C., Bertalmio, M., Caselles, V., Sapiro, G., Verdera, J.: Filling-in by joint interpolation of vector fields and gray levels. IEEE Transactions on Image Processing 10(8), 1200–1211 (2001)
- Barnes, C., Shechtman, E., Finkelstein, A., Goldman, D.B.: PatchMatch: A randomized correspondence algorithm for structural image editing. ACM Transactions on Graphics 28(3), 24 (2009)
- Bertalmio, M., Sapiro, G., Caselles, V., Ballester, C.: Image inpainting. In: The 27th Annual Conference on Computer Graphics and Interactive Techniques (2000)
- Caelles, S., Pont-Tuset, J., Perazzi, F., Montes, A., Maninis, K.K., Van Gool, L.: The 2019 davis challenge on vos: Unsupervised multi-object segmentation. arXiv:1905.00737 (2019)
- Drori, I., Cohen-Or, D., Yeshurun, H.: Fragment-based image completion. ACM Transactions on Graphics 22(3), 303–312 (2003)
- Efros, A.A., Freeman, W.T.: Image quilting for texture synthesis and transfer. In: The 28th Annual Conference on Computer Graphics and Interactive Techniques. pp. 341–346. ACM (2001)
- Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The pascal visual object classes (voc) challenge. International Journal of Computer Vision 88(2), 303–338 (2010)
- Fan, D.P., Cheng, M.M., Liu, J.J., Gao, S.H., Hou, Q., Borji, A.: Salient objects in clutter: Bringing salient object detection to the foreground. In: European Conference on Computer Vision (2018)
- Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Advances in Neural Information Processing Systems (2014)
- Guo, Z., Chen, Z., Yu, T., Chen, J., Liu, S.: Progressive image inpainting with full-resolution residual network. In: Proceedings of the 27th ACM International Conference on Multimedia. ACM (2019)
- Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion. ACM Transactions on Graphics (ToG) 36(4), 107 (2017)
- Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 1125–1134 (2017)
- Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
- Kwatra, V., Essa, I., Bobick, A., Kwatra, N.: Texture optimization for example-based synthesis. ACM Transactions on Graphics 24(3), 795–802 (2005)
- Levin, A., Zomet, A., Peleg, S., Weiss, Y.: Seamless image stitching in the gradient domain. In: European Conference on Computer Vision. pp. 377–389 (2004)
- Li, G., Yu, Y.: Deep contrast learning for salient object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
- Liang, X., Liu, S., Shen, X., Yang, J., Liu, L., Dong, J., Lin, L., Yan, S.: Deep human parsing with active template regression. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(12), 2402–2414 (2015)
- Liu, G., Reda, F.A., Shih, K.J., Wang, T.C., Tao, A., Catanzaro, B.: Image inpainting for irregular holes using partial convolutions. In: European Conference on Computer Vision (2018)
- Miyato, T., Kataoka, T., Koyama, M., Yoshida, Y.: Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957 (2018)
- Nazeri, K., Ng, E., Joseph, T., Qureshi, F., Ebrahimi, M.: Edgeconnect: Generative image inpainting with adversarial edge learning. arXiv preprint arXiv:1901.00212 (2019)
- Oh, S.W., Lee, S., Lee, J.Y., Kim, S.J.: Onion-peel networks for deep video completion. In: IEEE International Conference on Computer Vision (2019)
- Park, E., Yang, J., Yumer, E., Ceylan, D., Berg, A.C.: Transformation-grounded image generation network for novel 3d view synthesis. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 3500–3509 (2017)
- Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.A.: Context encoders: Feature learning by inpainting. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
- Wang, J., Jiang, H., Yuan, Z., Cheng, M.M., Hu, X., Zheng, N.: Salient object detection: A discriminative regional feature integration approach. International Journal of Computer Vision 123(2), 251–268 (2017)
- Xiong, W., Yu, J., Lin, Z., Yang, J., Lu, X., Barnes, C., Luo, J.: Foreground-aware image inpainting. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)
- Yang, C., Lu, X., Lin, Z., Shechtman, E., Wang, O., Li, H.: High-resolution image inpainting using multi-scale neural patch synthesis. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (July 2017)
- Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 5505–5514 (2018)
- Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Free-form image inpainting with gated convolution. In: IEEE International Conference on Computer Vision (2019)
- Zeng, Y., Fu, J., Chao, H., Guo, B.: Learning pyramid-context encoder network for high-quality image inpainting. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 1486–1494 (2019)
- Zhang, H., Hu, Z., Luo, C., Zuo, W., Wang, M.: Semantic image inpainting with progressive generative networks. In: 2018 ACM Multimedia Conference on Multimedia Conference. pp. 1939–1947. ACM (2018)
- Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(6), 1452–1464 (2017)