Deep Photo Style Transfer

CVPR, 2017.

Abstract:

This paper introduces a deep-learning approach to photographic style transfer that handles a large variety of image content while faithfully transferring the reference style. Our approach builds upon the recent work on painterly transfer that separates style from the content of an image by considering different layers of a neural network. …

Introduction
  • This paper introduces a deep-learning approach to photographic style transfer that handles a large variety of image content while faithfully transferring the reference style.
  • The authors' approach builds upon the recent work on painterly transfer that separates style from the content of an image by considering different layers of a neural network.
  • The authors' contribution is to constrain the transformation from the input to the output to be locally affine in color space, and to express this constraint as a custom, fully differentiable energy term (see the sketch after this list).
  • The authors show that this approach successfully suppresses distortion and yields satisfying photorealistic style transfers in a broad variety of scenarios, including transfer of the time of day, weather, season, and artistic edits.
  • Because the output is constrained to be a locally affine function of the input, the set of affine combinations of the RGB channels spans a broad set of photometric variations, yet edges cannot move: an edge is located at the same place in all three channels, so every affine combination preserves its position.
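Concretely, the paper builds this energy term on the Matting Laplacian of Levin et al [9]: the photorealism regularization is $\mathcal{L}_m = \sum_{c=1}^{3} V_c[O]^\top \mathcal{M}_I V_c[O]$, where $V_c[O]$ is the vectorized channel $c$ of the output $O$ and $\mathcal{M}_I$ is the Matting Laplacian computed from the input $I$. Below is a minimal NumPy/SciPy sketch of both pieces; the vectorized window construction and the function names are ours, not the authors' released code, and it assumes a float image in [0, 1] small enough to hold the sparse Laplacian in memory.

    import numpy as np
    import scipy.sparse as sp

    def matting_laplacian(img, eps=1e-7, win_rad=1):
        # Closed-form Matting Laplacian of Levin et al [9], built from all fully
        # contained (2*win_rad+1)^2 windows. img: (H, W, 3) float array in [0, 1].
        h, w, c = img.shape
        win_size = (2 * win_rad + 1) ** 2
        n = h * w
        ids = np.arange(n).reshape(h, w)
        # win_inds[k] lists the flat pixel indices inside window k.
        win_inds = np.stack([
            ids[y:y + h - 2 * win_rad, x:x + w - 2 * win_rad]
            for y in range(2 * win_rad + 1) for x in range(2 * win_rad + 1)
        ], axis=-1).reshape(-1, win_size)
        win_pix = img.reshape(n, c)[win_inds]                    # (K, 9, 3)
        mu = win_pix.mean(axis=1, keepdims=True)                 # window means
        d = win_pix - mu
        var = np.einsum('kic,kid->kcd', d, d) / win_size         # window covariances
        inv = np.linalg.inv(var + (eps / win_size) * np.eye(c))  # regularized inverses
        quad = np.einsum('kic,kcd,kjd->kij', d, inv, d)          # (I_i-mu)^T S^-1 (I_j-mu)
        vals = np.eye(win_size) - (1.0 + quad) / win_size        # per-window entries of L
        rows = np.repeat(win_inds, win_size, axis=1).ravel()
        cols = np.tile(win_inds, (1, win_size)).ravel()
        return sp.coo_matrix((vals.ravel(), (rows, cols)), shape=(n, n)).tocsr()

    def photorealism_loss(M, out):
        # L_m = sum_c V_c[O]^T M_I V_c[O]; out: (H, W, 3) candidate output image.
        return sum(float(out[..., ch].ravel() @ (M @ out[..., ch].ravel()))
                   for ch in range(3))

Because $\mathcal{M}_I$ is sparse, symmetric, and fixed during the optimization, the gradient of this term with respect to each output channel is simply $2\,\mathcal{M}_I V_c[O]$, so the regularization adds little cost to each iteration.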
Highlights
  • This paper introduces a deep-learning approach to photographic style transfer that handles a large variety of image content while faithfully transferring the reference style
  • We introduce a deep-learning approach to photographic style transfer that is at the same time broad and faithful, i.e., it handles a large variety of image content while accurately transferring the reference style
  • We demonstrate the effectiveness of our approach with satisfying photorealistic style transfers for a broad variety of scenarios including transfer of the time of day, weather, season, and artistic edits
  • We introduce an optional guidance to the style transfer process based on semantic segmentation of the inputs to avoid the content-mismatch problem, which greatly improves the photorealism of the results
  • We compare our method with Gatys et al [5] (Neural Style for short) and Li et al [10] (CNNMRF for short) across a series of indoor and outdoor scenes in Figure 4. Both techniques produce results with painting-like distortions, which are undesirable in the context of photographic style transfer
  • We introduce a deep-learning approach that faithfully transfers style from a reference image for a wide variety of image content
Methods
  • The authors' algorithm takes two images: an input image, which is usually an ordinary photograph, and a stylized and retouched picture, the reference style image.
  • The authors seek to transfer the style of the reference to the input while keeping the result photorealistic.
  • The authors introduce an optional guidance to the style transfer process based on semantic segmentation of the inputs to avoid the content-mismatch problem, which greatly improves the photorealism of the results (a sketch of this segmentation-guided style loss follows this list).
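The guidance restricts style statistics to regions that share a semantic label, so that, for example, sky texture is only ever matched to sky. Below is a minimal NumPy sketch of such a masked Gram-matrix style loss at a single network layer; the function names are ours, and the normalization follows the standard Gram-based style loss, which may differ slightly from the paper's exact constants.

    import numpy as np

    def gram(F):
        # Gram matrix G = F F^T of an (N, D) feature map
        # (N filters, D spatial positions).
        return F @ F.T

    def masked_style_loss(F_out, F_sty, masks_out, masks_sty):
        # Segmentation-augmented style loss at one layer: one Gram term per
        # semantic class. masks_out / masks_sty are lists of (D,) soft masks,
        # one per common label, downsampled to this layer's resolution.
        N = F_out.shape[0]
        loss = 0.0
        for m_o, m_s in zip(masks_out, masks_sty):
            G_o = gram(F_out * m_o)    # mask the output features per class
            G_s = gram(F_sty * m_s)    # mask the style features per class
            loss += ((G_o - G_s) ** 2).sum() / (2.0 * N ** 2)
        return loss

Since the masks are computed once from the segmentations of the input and style images and merely downsampled per layer, the per-class Gram matrices add little overhead to the optimization.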
Results
  • Results and comparison: the authors have performed a series of experiments to validate the approach.
  • The authors compare the method with Gatys et al [5] (Neural Style for short) and Li et al [10] (CNNMRF for short) across a series of indoor and outdoor scenes in Figure 4.
  • Both techniques produce results with painting-like distortions, which are undesirable in the context of photographic style transfer.
  • The authors' photorealism regularization and semantic segmentation prevent these artifacts, and the results look visually more satisfying.
Conclusion
  • The authors introduce a deep-learning approach that faithfully transfers style from a reference image for a wide variety of image content.
  • Semantic segmentation further drives more meaningful style transfer, yielding satisfying photorealistic results in a broad variety of scenarios, including transfer of the time of day, weather, season, and artistic edits.
Summary
  • This paper introduces a deep-learning approach to photographic style transfer that handles a large variety of image content while faithfully transferring the reference style.
  • One plausible approach is to match each input neural patch with the most similar patch in the style image to minimize the chances of an inaccurate transfer.
  • Local style transfer algorithms based on spatial color mappings are more expressive and can handle a broad class of applications such as time-of-day hallucination [4, 15], transfer of artistic edits [1, 14, 17], weather and season change [4, 8], and painterly stylization [5, 6, 10, 13].
  • We propose a photorealism regularization term in the objective function during the optimization, constraining the reconstructed image to be represented by locally affine color transformations of the input to prevent distortions.
  • We introduce an optional guidance to the style transfer process based on semantic segmentation of the inputs to avoid the content-mismatch problem, which greatly improves the photorealism of the results.
  • We summarize the Neural Style algorithm by Gatys et al [5] that transfers the reference style image S onto the input image I to produce an output image O by minimizing the objective function $\mathcal{L}_{\text{total}} = \sum_{\ell=1}^{L} \alpha_\ell \mathcal{L}_c^\ell + \Gamma \sum_{\ell=1}^{L} \beta_\ell \mathcal{L}_s^\ell$, where $\mathcal{L}_c^\ell$ and $\mathcal{L}_s^\ell$ are the content and style losses at layer $\ell$ of the network, $\alpha_\ell$ and $\beta_\ell$ are layer weights, and $\Gamma$ trades off content preservation against style fidelity (see the code sketch after this summary).
  • We address this problem with an approach akin to Neural Doodle [2] and a semantic segmentation method [3] to generate image segmentation masks for the input and reference images for a set of common labels.
  • Global techniques such as Reinhard's [12] and Pitié's [11] apply a global color mapping to match the color statistics between the input image and the style image, which limits the faithfulness of their results when the transfer requires spatially-varying color transformation.
  • The two results look drastically different because our algorithm directly reproduces the style of the reference style image whereas Shih’s is an analogy-based technique that transfers the color change observed in a time-lapse video.
  • Figure 9a shows that CNNMRF and Neural Style produce nonphotorealistic results, which confirms our observation that these techniques introduce painting-like distortions.
  • We compared against several global methods in our second study, including Reinhard's statistics transfer [12].
    [Figure: (b) reference style image, (c) our result, (d) Shih et al [15]]
  • The study shows that our algorithm produces the most faithful style transfer results more than 80% of the time (Fig. 9b).
  • Users were shown a style image and four transferred outputs (the three previously mentioned global methods and our technique) and asked to choose the image whose style was most similar to the reference style.
  • Semantic segmentation further drives more meaningful style transfer yielding satisfying photorealistic results in a broad variety of scenarios, including transfer of the time of day, weather, season, and artistic edits.
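To make the summarized objective concrete, here is a short NumPy sketch of the Gatys et al [5] losses at one layer and the layer-weighted total. The loss formulas follow the equations above; the driver function and the default value of Gamma are illustrative choices of ours rather than values quoted from the paper.

    import numpy as np

    def content_loss(F_out, F_in):
        # L_c^l = 1 / (2 N_l D_l) * sum_ij (F_l[O] - F_l[I])_ij^2
        N, D = F_out.shape
        return ((F_out - F_in) ** 2).sum() / (2.0 * N * D)

    def style_loss(F_out, F_sty):
        # L_s^l = 1 / (2 N_l^2) * sum_ij (G_l[O] - G_l[S])_ij^2, with G = F F^T.
        N = F_out.shape[0]
        G_o, G_s = F_out @ F_out.T, F_sty @ F_sty.T
        return ((G_o - G_s) ** 2).sum() / (2.0 * N ** 2)

    def neural_style_objective(feats_out, feats_in, feats_sty,
                               alphas, betas, Gamma=100.0):
        # L_total = sum_l alpha_l L_c^l + Gamma * sum_l beta_l L_s^l
        Lc = sum(a * content_loss(Fo, Fi)
                 for a, Fo, Fi in zip(alphas, feats_out, feats_in))
        Ls = sum(b * style_loss(Fo, Fs)
                 for b, Fo, Fs in zip(betas, feats_out, feats_sty))
        return Lc + Gamma * Ls

Deep Photo Style Transfer keeps this structure but swaps the style term for the segmentation-augmented loss and adds the photorealism regularization, giving $\mathcal{L}_{\text{total}} = \sum_{\ell} \alpha_\ell \mathcal{L}_c^\ell + \Gamma \sum_{\ell} \beta_\ell \mathcal{L}_{s+}^\ell + \lambda \mathcal{L}_m$.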
Related work
  • Global style transfer algorithms process an image by applying a spatially-invariant transfer function. These methods are effective and can handle simple styles like global color shifts (e.g., sepia) and tone curves (e.g., high or low contrast). For instance, Reinhard et al [12] match the means and standard deviations between the input and reference style image after converting them into a decorrelated color space. Pitié et al [11] describe an algorithm to transfer the full 3D color histogram using a series of 1D histograms. As we shall see in the results section, these methods are limited in their ability to match sophisticated styles (a minimal sketch of the statistics-matching step appears at the end of this section).

    Local style transfer algorithms based on spatial color mappings are more expressive and can handle a broad class of applications such as time-of-day hallucination [4, 15], transfer of artistic edits [1, 14, 17], weather and season change [4, 8], and painterly stylization [5, 6, 10, 13]. Our work is most directly related to the line of work initiated by Gatys et al [5] that employs the feature maps of discriminatively trained deep convolutional neural networks such as VGG-19 [16] to achieve groundbreaking performance for painterly style transfer [10, 13]. The main difference with these techniques is that our work aims for photorealistic transfer, which, as we previously discussed, introduces a challenging tension between local changes and large-scale consistency. In that respect, our algorithm is related to the techniques that operate in the photo realm [1, 4, 8, 14, 15, 17]. But unlike these techniques that are dedicated to a specific scenario, our approach is generic and can handle a broader diversity of style images.
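To ground that comparison, here is a minimal sketch of the statistics-matching step of Reinhard et al [12]. OpenCV's Lab conversion stands in here for the paper's decorrelated lαβ space; the matching itself (shift and rescale each channel to the reference mean and standard deviation) is the same idea.

    import cv2
    import numpy as np

    def reinhard_transfer(input_bgr, style_bgr):
        # Global color transfer: match per-channel mean and standard deviation
        # in a decorrelated color space (Lab here, l-alpha-beta in [12]).
        # Inputs are uint8 BGR images as loaded by cv2.imread.
        src = cv2.cvtColor(input_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
        ref = cv2.cvtColor(style_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
        for ch in range(3):
            s_mu, s_sd = src[..., ch].mean(), src[..., ch].std() + 1e-8
            r_mu, r_sd = ref[..., ch].mean(), ref[..., ch].std()
            # Shift to zero mean, rescale to the reference spread, re-center.
            src[..., ch] = (src[..., ch] - s_mu) * (r_sd / s_sd) + r_mu
        out = np.clip(src, 0, 255).astype(np.uint8)
        return cv2.cvtColor(out, cv2.COLOR_LAB2BGR)

Because the same per-channel affine map is applied at every pixel, such a transfer cannot reproduce spatially-varying styles, which is exactly the limitation the local, neural approaches above are designed to overcome.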
Funding
  • This research is supported by a Google Faculty Research Award and NSF awards IIS 1617861 and 1513967.
Study subjects and analysis
user studies: 2

We have performed a series of experiments to validate our approach. We first discuss visual comparisons with previous work before reporting the results of two user studies. We compare our method with Gatys et al [5] (Neural Style for short) and Li et al [10] (CNNMRF for short) across a series of indoor and outdoor scenes in Figure 4.

We conducted two user studies to validate our work. First, we assessed the photorealism of several techniques: ours, the histogram transfer of Pitié et al [11], CNNMRF [10], and Neural Style [5].

Reference
  • [1] S. Bae, S. Paris, and F. Durand. Two-scale tone management for photographic look. In ACM Transactions on Graphics (TOG), volume 25, pages 637–645. ACM, 2006.
  • [2] A. J. Champandard. Semantic style transfer and turning two-bit doodles into fine artworks. arXiv preprint, Mar 2016.
  • [3] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. arXiv preprint arXiv:1606.00915, 2016.
  • [4] J. R. Gardner, M. J. Kusner, Y. Li, P. Upchurch, K. Q. Weinberger, K. Bala, and J. E. Hopcroft. Deep manifold traversal: Changing labels with convolutional features. CoRR, abs/1511.06421, 2015.
  • [5] L. A. Gatys, A. S. Ecker, and M. Bethge. Image style transfer using convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2414–2423, 2016.
  • [6] A. Hertzmann, C. E. Jacobs, N. Oliver, B. Curless, and D. H. Salesin. Image analogies. In Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques, pages 327–340. ACM, 2001.
  • [7] J. Johnson. neural-style. https://github.com/jcjohnson/neural-style, 2015.
  • [8] P.-Y. Laffont, Z. Ren, X. Tao, C. Qian, and J. Hays. Transient attributes for high-level understanding and editing of outdoor scenes. ACM Transactions on Graphics, 33(4), 2014.
  • [9] A. Levin, D. Lischinski, and Y. Weiss. A closed-form solution to natural image matting. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(2):228–242, 2008.
  • [10] C. Li and M. Wand. Combining Markov random fields and convolutional neural networks for image synthesis. arXiv preprint arXiv:1601.04589, 2016.
  • [11] F. Pitié, A. C. Kokaram, and R. Dahyot. N-dimensional probability density function transfer and its application to color transfer. In Tenth IEEE International Conference on Computer Vision (ICCV'05), volume 2, pages 1434–1439. IEEE, 2005.
  • [12] E. Reinhard, M. Adhikhmin, B. Gooch, and P. Shirley. Color transfer between images. IEEE Computer Graphics and Applications, 21(5):34–41, 2001.
  • [13] A. Selim, M. Elgharib, and L. Doyle. Painting style transfer for head portraits using convolutional neural networks. ACM Transactions on Graphics (TOG), 35(4):129, 2016.
  • [14] Y. Shih, S. Paris, C. Barnes, W. T. Freeman, and F. Durand. Style transfer for headshot portraits. ACM Transactions on Graphics (TOG), 33(4), 2014.
  • [15] Y. Shih, S. Paris, F. Durand, and W. T. Freeman. Data-driven hallucination of different times of day from a single outdoor photo. ACM Transactions on Graphics (TOG), 32(6):200, 2013.
  • [16] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • [17] K. Sunkavalli, M. K. Johnson, W. Matusik, and H. Pfister. Multi-scale image harmonization. ACM Transactions on Graphics (TOG), 29(4):125, 2010.
  • [18] E. W. Weisstein. Gram matrix. From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/GramMatrix.html.
  • [19] F. Wu, W. Dong, Y. Kong, X. Mei, J.-C. Paul, and X. Zhang. Content-based colour transfer. In Computer Graphics Forum, volume 32, pages 190–203. Wiley Online Library, 2013.