High-Resolution Image Inpainting with Iterative Confidence Feedback and Guided Upsampling

European Conference on Computer Vision (ECCV), pp. 1-17, 2020.


Abstract:

Existing image inpainting methods often produce artifacts when dealing with large holes in real applications. To address this challenge, we propose an iterative inpainting method with a feedback mechanism. Specifically, we introduce a deep generative model which outputs not only an inpainting result but also a corresponding confidence map.
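The iterative confidence-feedback loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `generator` is a hypothetical stand-in for the deep generative model, and the dummy confidence values and threshold are assumptions.

```python
import numpy as np

def generator(image, mask):
    """Hypothetical stand-in for the deep generative model.
    Returns an inpainted image and a per-pixel confidence map in [0, 1]."""
    filled = np.where(mask[..., None] > 0, image.mean(axis=(0, 1)), image)
    confidence = np.where(mask > 0, 0.9, 1.0)  # known pixels fully trusted
    return filled, confidence

def iterative_inpaint(image, mask, threshold=0.8, max_iters=5):
    """Iteratively fill the hole: keep only high-confidence hole pixels
    each round and revisit the remaining pixels in the next iteration."""
    current, remaining = image.copy(), mask.copy()
    for _ in range(max_iters):
        filled, conf = generator(current, remaining)
        trusted = (remaining > 0) & (conf >= threshold)   # confident hole pixels
        current = np.where(trusted[..., None], filled, current)
        remaining = remaining * ~trusted                  # shrink the hole
        if remaining.sum() == 0:
            break
    return current
```

With the dummy generator above, a uniform image is recovered in one pass; a real model would leave low-confidence pixels for later iterations.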

Introduction
  • Image inpainting is an important problem in computer vision and an essential functionality in many imaging and graphics applications, e.g., object removal, image restoration, manipulation, re-targeting, compositing, and image-based rendering [9,22,29]
  • Classical inpainting methods such as [13,21,9] typically rely on the principle of borrowing example patches from known regions or external database images and pasting them into the holes.
Highlights
  • Image inpainting is a task of reconstructing missing regions in an image
  • We aim to address the challenge of filling large holes in high resolution images for real image editing applications, e.g., object removal
  • We introduce a new procedure to synthesize training data for building deep generative models for real object removal applications
  • 4.2 Comparison with state-of-the-art methods: we evaluate quantitative scores and visual quality of two variants of our method, i.e., Ours*, the iterative inpainting model run on the original input without guided upsampling, and Ours, the iterative inpainting model run on 2× downsampled input with guided upsampling to restore the original size
  • We propose a high-resolution image inpainting method for large object removal
  • Experiments show that our method significantly outperforms existing methods in both quantitative and qualitative evaluations
  • Experiments show that our method outperforms existing methods on realistic testing samples and achieves better visual quality on real object removal requests from the Web
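The training-data synthesis highlighted above can be illustrated schematically. This is an assumed sketch, not the paper's exact procedure: it cuts an object-shaped, slightly dilated hole out of a clean image so the network learns to recover content behind removed objects; the dilation width and the zero fill value are illustrative choices.

```python
import numpy as np

def make_object_removal_pair(image, object_mask, dilate=2):
    """Synthesize an object-removal training pair: an object-shaped mask
    (e.g. from a segmentation dataset) is slightly dilated to cover object
    boundaries, then erased from a clean image. The model is trained to
    recover the clean image from the holed input."""
    m = object_mask.astype(bool)
    for _ in range(dilate):  # naive binary dilation with a 4-neighbourhood
        grown = m.copy()
        grown[1:, :] |= m[:-1, :]
        grown[:-1, :] |= m[1:, :]
        grown[:, 1:] |= m[:, :-1]
        grown[:, :-1] |= m[:, 1:]
        m = grown
    holed = image.copy()
    holed[m] = 0.0  # erase the object-shaped region
    return holed, m.astype(np.float32), image  # input, hole mask, target
```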
Methods
  • Table 1 results (L1 loss: lower is better; PSNR and SSIM: higher is better):

                 Object-shaped holes      Irregular holes (Places2)   Square holes (Places2)
    Method       L1     PSNR   SSIM       L1     PSNR   SSIM          L1     PSNR   SSIM
    [9]          .0273  25.64  .8780      .0288  22.87  .8549         .0432  19.19  .7922
    [18]         .0292  24.23  .8653      .0385  20.95  .8185         .0386  20.16  .7950
    [35]         .0243  26.07  .8803      .0245  24.31  .8718         .0430  19.08  .7984
    [27]         .0246  26.24  .8871      .0221  24.78  .8701         .0368  20.30  .8017
    Ours*        .0194  28.20  .8985      .0203  25.43  .8828         .0361  20.21  .8130
    Ours         .0205  27.67  .8949      .0220  24.70  .8744         .0384  19.69  .8063

    Ours*: the iterative inpainting model run on the original input without guided upsampling. Ours: the iterative inpainting model run on 2× downsampled input, using the guided upsampling model to obtain results at the original size.
  • Comparisons with more methods can be found in the supplementary material
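For reference, the L1 loss and PSNR columns in the table above follow their standard definitions for images scaled to [0, 1]; SSIM requires windowed statistics and is omitted from this sketch.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error over all pixels, images in [0, 1]."""
    return np.abs(pred - target).mean()

def psnr(pred, target, peak=1.0):
    """Peak signal-to-noise ratio in dB for images in [0, peak]."""
    mse = ((pred - target) ** 2).mean()
    return 10.0 * np.log10(peak ** 2 / mse)
```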
Results
  • Experiments show that the method significantly outperforms existing methods in both quantitative and qualitative evaluations.
Conclusion
  • The authors propose a high-resolution image inpainting method for large object removal.
  • To improve visual quality for high-resolution inputs, the authors first obtain a low-resolution inpainting result and then reconstruct the high-resolution output from it using matching high-resolution neural patches.
  • By decoupling high-level understanding and low-level reconstruction, the method can provide results that are both semantically reasonable and visually realistic.
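A toy sketch of the guided-upsampling idea, under strong simplifying assumptions (grayscale arrays, non-overlapping axis-aligned patches, brute-force L2 matching): each hole patch in the low-resolution result is matched to the most similar known low-resolution patch, and the corresponding high-resolution patch is pasted into the output. The actual model matches neural feature patches rather than raw pixels.

```python
import numpy as np

def guided_upsample(low_res, hole_mask, high_res, patch=4, scale=2):
    """Fill hole regions of the high-res output by copying the high-res
    patches whose low-res counterparts best match the low-res inpainting
    result (nearest neighbour in L2 distance)."""
    out = high_res.copy()
    H, W = hole_mask.shape
    # candidate sources: low-res patches that contain no hole pixels
    sources = [(y, x)
               for y in range(0, H - patch + 1, patch)
               for x in range(0, W - patch + 1, patch)
               if not hole_mask[y:y + patch, x:x + patch].any()]
    for y in range(0, H - patch + 1, patch):
        for x in range(0, W - patch + 1, patch):
            if not hole_mask[y:y + patch, x:x + patch].any():
                continue  # patch is fully known; keep original pixels
            q = low_res[y:y + patch, x:x + patch]
            sy, sx = min(sources, key=lambda s: float(
                ((low_res[s[0]:s[0] + patch, s[1]:s[1] + patch] - q) ** 2).sum()))
            out[y * scale:(y + patch) * scale, x * scale:(x + patch) * scale] = \
                high_res[sy * scale:(sy + patch) * scale, sx * scale:(sx + patch) * scale]
    return out
```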
Tables
  • Table1: Quantitative evaluation and user preference of various methods. P.c.: preference count in user study
  • Table2: Effect of each component. IT: iterative inpainting; CF: confidence feedback
Related work
  • Earlier image inpainting methods rely on existing content to fill the holes. Diffusion-based methods [8,10] propagate neighboring appearances into the target holes, but they often generate significant artifacts when the holes are large or texture variation is severe. Patch-based methods [13,21,9] search for the most similar patches from valid regions to complete missing regions. Drori et al. [12] propose to iteratively fill missing regions from high to low confidence with similar patches. Although they also use a map to determine the region to fill in each iteration, the map is predefined based on spatial distances from unknown pixels to their closest valid pixels. The above methods use real image patches sampled from the input to fill the holes and can often generate high-quality results. However, they lack high-level structural understanding and cannot generate entirely new content that does not exist in the input image. Thus, their results may not be semantically consistent with the regions surrounding the holes.
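The predefined, distance-based fill order attributed to Drori et al. above can be sketched with a simple chamfer-style distance map; the 4-neighbour relaxation and the 1/(1+d) confidence mapping are illustrative assumptions, not the paper's formulas.

```python
import numpy as np

def distance_confidence(hole_mask):
    """Predefined confidence map: confidence decreases with distance from
    the nearest valid pixel, so the hole is filled from the boundary inward."""
    dist = np.where(hole_mask > 0, np.inf, 0.0)
    changed = True
    while changed:  # Jacobi-style relaxation of Manhattan distance
        prev = dist.copy()
        dist[1:, :] = np.minimum(dist[1:, :], prev[:-1, :] + 1)
        dist[:-1, :] = np.minimum(dist[:-1, :], prev[1:, :] + 1)
        dist[:, 1:] = np.minimum(dist[:, 1:], prev[:, :-1] + 1)
        dist[:, :-1] = np.minimum(dist[:, :-1], prev[:, 1:] + 1)
        changed = not np.array_equal(prev, dist)
    return 1.0 / (1.0 + dist)  # higher confidence nearer known pixels
```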
Study subjects and analysis
Real object removal cases: 25

The method's results are similar to PatchMatch in terms of texture but more reasonable in terms of structure. To evaluate visual quality, we conduct a user study on 25 real object removal cases collected from object removal requests on the Web. All images are resized so that the short side equals 512.

Users: 11

Each input image, with a marked region to remove, and the results of the different methods are shown in random order to 11 users, who are asked to select the single best result. Each combination of input and results is shown twice, and a valid vote is counted only when a user selects the same result both times.
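The vote-validation rule described above is straightforward to express in code; the ballot format (a pair of picks per user per image) is an assumption for illustration.

```python
from collections import Counter

def count_valid_votes(ballots):
    """Each user sees every input/result combination twice; a vote is
    valid only if the user picks the same method both times."""
    prefs = Counter()
    for first_pick, second_pick in ballots:
        if first_pick == second_pick:
            prefs[first_pick] += 1
    return prefs
```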

Reference
  • can someone please remove my co-worker on the right and just leave the rhino and my friend on the left? https://www.reddit.com/r/PhotoshopRequest/comments/82v6x1/specific_can_someone_please_remove_my_coworker_on/
  • Can someone please remove the backpack and ’lead’ from my sons back? would love to have this picture of my kids without it! https://www.reddit.com/r/PhotoshopRequest/comments/6szh1i/specific_can_someone_please_remove_the_backpack/
  • can someone please remove the people holding the balloons and their shadows from this engagement photo? https://www.reddit.com/r/PhotoshopRequest/comments/8d12tw/specific_can_someone_please_remove_the_people/
  • Can someone remove the woman in purple please? will give reddit gold! https://www.reddit.com/r/PhotoshopRequest/comments/6ddjg3/paid_specific_can_someone_remove_the_woman_in/
  • Could someone help me remove background people - especially the guys head? will venmo $5. https://www.reddit.com/r/PhotoshopRequest/comments/b2y0o5/specific_paid_could_someone_help_me_remove/
  • Could someone please remove the people in the background if at all possible! https://www.reddit.com/r/PhotoshopRequest/comments/6f0g4k/specific_could_someone_please_remove_the_people/
  • If possible, can anyone help me remove the people on the side, esp the people facing towards the camera:) thank you. https://www.reddit.com/r/PhotoshopRequest/comments/anizco/specific_if_possible_can_anyone_help_me_remove/
  • Ballester, C., Bertalmio, M., Caselles, V., Sapiro, G., Verdera, J.: Filling-in by joint interpolation of vector fields and gray levels. IEEE Transactions on Image Processing 10(8), 1200–1211 (2001)
  • Barnes, C., Shechtman, E., Finkelstein, A., Goldman, D.B.: PatchMatch: A randomized correspondence algorithm for structural image editing. ACM Transactions on Graphics 28(3), 24 (2009)
  • Bertalmio, M., Sapiro, G., Caselles, V., Ballester, C.: Image inpainting. In: The 27th Annual Conference on Computer Graphics and Interactive Techniques (2000)
  • Caelles, S., Pont-Tuset, J., Perazzi, F., Montes, A., Maninis, K.K., Van Gool, L.: The 2019 DAVIS challenge on VOS: Unsupervised multi-object segmentation. arXiv:1905.00737 (2019)
  • Drori, I., Cohen-Or, D., Yeshurun, H.: Fragment-based image completion. ACM Transactions on Graphics 22(3), 303–312 (2003)
  • Efros, A.A., Freeman, W.T.: Image quilting for texture synthesis and transfer. In: The 28th Annual Conference on Computer Graphics and Interactive Techniques. pp. 341–346. ACM (2001)
  • Everingham, M., Van Gool, L., Williams, C.K., Winn, J., Zisserman, A.: The PASCAL Visual Object Classes (VOC) challenge. International Journal of Computer Vision 88(2), 303–338 (2010)
  • Fan, D.P., Cheng, M.M., Liu, J.J., Gao, S.H., Hou, Q., Borji, A.: Salient objects in clutter: Bringing salient object detection to the foreground. In: European Conference on Computer Vision (2018)
  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial nets. In: Advances in Neural Information Processing Systems (2014)
  • Guo, Z., Chen, Z., Yu, T., Chen, J., Liu, S.: Progressive image inpainting with full-resolution residual network. In: Proceedings of the 27th ACM International Conference on Multimedia. ACM (2019)
  • Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion. ACM Transactions on Graphics (TOG) 36(4), 107 (2017)
  • Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 1125–1134 (2017)
  • Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  • Kwatra, V., Essa, I., Bobick, A., Kwatra, N.: Texture optimization for example-based synthesis. ACM Transactions on Graphics 24(3), 795–802 (2005)
  • Levin, A., Zomet, A., Peleg, S., Weiss, Y.: Seamless image stitching in the gradient domain. In: European Conference on Computer Vision. pp. 377–389.
  • Li, G., Yu, Y.: Deep contrast learning for salient object detection. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
  • Liang, X., Liu, S., Shen, X., Yang, J., Liu, L., Dong, J., Lin, L., Yan, S.: Deep human parsing with active template regression. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(12), 2402–2414 (2015)
  • Liu, G., Reda, F.A., Shih, K.J., Wang, T.C., Tao, A., Catanzaro, B.: Image inpainting for irregular holes using partial convolutions. In: European Conference on Computer Vision (2018)
  • Miyato, T., Kataoka, T., Koyama, M., Yoshida, Y.: Spectral normalization for generative adversarial networks. arXiv preprint arXiv:1802.05957 (2018)
  • Nazeri, K., Ng, E., Joseph, T., Qureshi, F., Ebrahimi, M.: EdgeConnect: Generative image inpainting with adversarial edge learning. arXiv preprint arXiv:1901.00212 (2019)
  • Oh, S.W., Lee, S., Lee, J.Y., Kim, S.J.: Onion-peel networks for deep video completion. In: IEEE International Conference on Computer Vision (2019)
  • Park, E., Yang, J., Yumer, E., Ceylan, D., Berg, A.C.: Transformation-grounded image generation network for novel 3d view synthesis. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 3500–3509 (2017)
  • Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.A.: Context encoders: Feature learning by inpainting. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
  • Wang, J., Jiang, H., Yuan, Z., Cheng, M.M., Hu, X., Zheng, N.: Salient object detection: A discriminative regional feature integration approach. International Journal of Computer Vision 123(2), 251–268 (2017)
  • Xiong, W., Yu, J., Lin, Z., Yang, J., Lu, X., Barnes, C., Luo, J.: Foreground-aware image inpainting. In: IEEE Conference on Computer Vision and Pattern Recognition (2019)
  • Yang, C., Lu, X., Lin, Z., Shechtman, E., Wang, O., Li, H.: High-resolution image inpainting using multi-scale neural patch synthesis. In: IEEE Conference on Computer Vision and Pattern Recognition (2017)
  • Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Generative image inpainting with contextual attention. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 5505–5514 (2018)
  • Yu, J., Lin, Z., Yang, J., Shen, X., Lu, X., Huang, T.S.: Free-form image inpainting with gated convolution. In: IEEE International Conference on Computer Vision (2019)
  • Zeng, Y., Fu, J., Chao, H., Guo, B.: Learning pyramid-context encoder network for high-quality image inpainting. In: IEEE Conference on Computer Vision and Pattern Recognition. pp. 1486–1494 (2019)
  • Zhang, H., Hu, Z., Luo, C., Zuo, W., Wang, M.: Semantic image inpainting with progressive generative networks. In: ACM Multimedia Conference. pp. 1939–1947. ACM (2018)
  • Zhou, B., Lapedriza, A., Khosla, A., Oliva, A., Torralba, A.: Places: A 10 million image database for scene recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(6), 1452–1464 (2017)