Domain Generalization With Adversarial Feature Learning

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), pp.5400-5409, (2018)

Citations: 425 | Views: 87

Abstract

In this paper, we tackle the problem of domain generalization: how to learn a generalized feature representation for an "unseen" target domain by taking advantage of data from multiple seen source domains. We present a novel framework based on adversarial autoencoders to learn a generalized latent feature representation across domains for d...

Introduction
  • In some computer vision applications, it is often the case that there are only some unlabeled training data in the domain of interest (a.k.a. the target domain), while there are plenty of labeled training data in some related domain(s) (a.k.a. the source domain(s)).
  • Many studies have been conducted to leverage the unlabeled data of the target domain given in advance for adapting the model learned with the source-domain labeled data to the target domain
  • These are referred to as domain adaptation methods [28, 13, 9].
  • A key research issue is how to learn a representation of good generalization for the unseen target domain from some related source domains
Highlights
  • We propose a novel framework for domain generalization that aims to learn a universal representation across domains by minimizing the difference between the seen source domains and by matching the distribution of the data under the learned representation to a prior distribution
  • We evaluate the performance on other vision problems, such as object recognition on Caltech [8], PASCAL VOC2007 [6], LabelMe [29] and SUN09 [2], as well as action recognition from different viewing angles on IXMAS [38]
  • We propose a novel framework for domain generalization, denoted the Maximum Mean Discrepancy-adversarial autoencoder (MMD-AAE)
  • The main idea is to learn a feature representation by jointly optimizing a multi-domain autoencoder regularized by the Maximum Mean Discrepancy (MMD) distance, a discriminator and a classifier in an adversarial training manner
  • Extensive experimental results on handwritten digit recognition, object recognition and action recognition demonstrate that our proposed Maximum Mean Discrepancy-adversarial autoencoder is able to learn domain-invariant features, which lead to state-of-the-art performance for domain generalization
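To make the MMD regularizer mentioned above more concrete, here is a minimal NumPy sketch of a multi-domain MMD penalty. It is an illustration only, not the authors' implementation: the Gaussian RBF kernel, the bandwidth `sigma`, the biased estimator, the averaging over domain pairs, and the function names `rbf_kernel`, `mmd2` and `multi_domain_mmd` are all assumptions made for this example.

```python
import numpy as np


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mmd2(x, y, sigma=1.0):
    """Biased estimate of the squared MMD between samples x and y."""
    k_xx = rbf_kernel(x, x, sigma).mean()
    k_yy = rbf_kernel(y, y, sigma).mean()
    k_xy = rbf_kernel(x, y, sigma).mean()
    return k_xx + k_yy - 2.0 * k_xy


def multi_domain_mmd(features, sigma=1.0):
    """Average pairwise squared MMD over a list of per-domain feature arrays."""
    n = len(features)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(mmd2(features[i], features[j], sigma) for i, j in pairs) / len(pairs)


# Hypothetical "hidden features" for three source domains (synthetic data).
rng = np.random.default_rng(0)
domain_a = rng.normal(0.0, 1.0, size=(200, 8))
domain_b = rng.normal(0.0, 1.0, size=(200, 8))  # same distribution as domain_a
domain_c = rng.normal(2.0, 1.0, size=(200, 8))  # mean-shifted distribution

aligned = multi_domain_mmd([domain_a, domain_b], sigma=4.0)
shifted = multi_domain_mmd([domain_a, domain_c], sigma=4.0)
print(f"aligned domains: {aligned:.4f}, shifted domains: {shifted:.4f}")
```

In this sketch the penalty is small when the per-domain feature distributions coincide and grows when one domain is shifted, which is the behavior a domain-alignment regularizer relies on.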
Methods
  • The authors compare the proposed MMD-AAE with baseline methods for domain generalization in terms of classification accuracy.
  • To make a fair comparison, the authors set the dimension of hidden layer to be 500 for handwritten digit recognition, and 2,000 for object and action recognition
Conclusion
  • As reviewed in Section 2, there exist many deep learning based domain adaptation methods that use either MMD minimization [21, 22] or adversarial training [34] to align the source-domain distribution(s) to the target domain distribution.
  • The authors' proposed method aims to align distributions of the seen source domains by jointly imposing the multi-domain MMD distance as well as adversarial loss to a prior distribution.
  • The main idea is to learn a feature representation by jointly optimizing a multi-domain autoencoder regularized by the MMD distance, a discriminator and a classifier in an adversarial training manner.
  • Extensive experimental results on handwritten digit recognition, object recognition and action recognition demonstrate that the proposed MMD-AAE is able to learn domain-invariant features, which lead to state-of-the-art performance for domain generalization
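One plausible way to write the joint training described above is as a weighted minimax objective. This is a sketch under stated assumptions: the symbols $Q$ (encoder), $P$ (decoder), $C$ (classifier), $D$ (discriminator), the loss names and the trade-off weights $\lambda_0, \lambda_1, \lambda_2$ are introduced here for illustration, and the exact loss definitions are not given in this summary.

```latex
\min_{Q,\,P,\,C}\ \max_{D}\quad
  \mathcal{L}_{\text{cls}}(C, Q)
  + \lambda_0\, \mathcal{L}_{\text{ae}}(Q, P)
  + \lambda_1\, \mathcal{R}_{\text{mmd}}(Q)
  + \lambda_2\, \mathcal{J}_{\text{adv}}(Q, D)
```

Here $\mathcal{L}_{\text{cls}}$ stands for the classification loss on labeled source data, $\mathcal{L}_{\text{ae}}$ for the autoencoder reconstruction loss, $\mathcal{R}_{\text{mmd}}$ for the multi-domain MMD regularizer on the hidden features, and $\mathcal{J}_{\text{adv}}$ for the adversarial term that matches the hidden features to the prior distribution.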
Tables
  • Table 1: Performance on handwritten digit recognition. The best performance is highlighted in boldface
  • Table 2: Performance on object recognition
  • Table 3: Performance on action recognition
  • Table 4: Impact of different components on performance
  • Table 5: Performance with different prior distributions
  • Table 6: Performance with different parameters
Related Work
  • As mentioned at the beginning of the previous section, both domain adaptation and domain generalization aim to learn a precise classifier for the target domain by leveraging labeled data from the source domain(s). The difference between them is that domain adaptation uses some unlabeled data, and sometimes a few labeled data, from the target domain to capture properties of the target domain for model adaptation [27, 9, 33, 4, 13, 22, 35, 21]. While numerous approaches have been proposed for domain adaptation, less attention has been paid to domain generalization. Some representative works have been reviewed in the previous section.

    Our work is also related to the Generative Adversarial Network (GAN) [14], which has been explored for generative tasks. In a GAN, there are two types of networks: a generative model G that aims to capture the distribution of the training data for data generation, and a discriminative model D that aims to distinguish between the instances drawn from G and the original data sampled from the training dataset. The generative model G and the discriminative model D are jointly trained in a competitive fashion: 1) train D to distinguish the true instances from the fake instances generated by G; 2) train G to fool D with its generated instances. Recently, many GAN-style algorithms have been proposed. For example, Li et al. [20] proposed a generative model in which MMD is employed to match the hidden representations generated from training data and from random noise. Makhzani et al. [23] proposed the adversarial autoencoder (AAE), which trains the encoder and the decoder using an adversarial learning strategy.
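    The competitive training of G and D described above is commonly written as the two-player minimax game of the original GAN formulation [14]. The notation below ($p_{\text{data}}$ for the data distribution and $p_z$ for the noise distribution fed to G) is the standard one and is not taken from this summary.

```latex
\min_{G}\ \max_{D}\quad V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

    Step 1 in the text corresponds to a gradient ascent step on $V$ with respect to D, and step 2 to a descent step with respect to G.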
Funding
  • The ROSE Lab is supported by the National Research Foundation, Singapore, and the Infocomm Media Development Authority, Singapore
  • Pan acknowledges support from the NTU Singapore Nanyang Assistant Professorship (NAP) grant M4081532.020 and the Singapore MOE AcRF Tier-1 grant 2016-T1-001-159
References
  • [1] Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.
  • [2] M. J. Choi, J. J. Lim, A. Torralba, and A. S. Willsky. Exploiting hierarchical context on a large database of object categories. In CVPR, 2010.
  • [3] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell. DeCAF: A deep convolutional activation feature for generic visual recognition. In ICML, 2014.
  • [4] L. Duan, I. W. Tsang, D. Xu, and T. Chua. Domain adaptation from multiple sources via auxiliary classifiers. In ICML, pages 289–296, 2009.
  • [5] V. Dumoulin, I. Belghazi, B. Poole, A. Lamb, M. Arjovsky, O. Mastropietro, and A. Courville. Adversarially learned inference. arXiv preprint arXiv:1606.00704, 2016.
  • [6] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2):303–338, 2010.
  • [7] C. Fang, Y. Xu, and D. N. Rockmore. Unbiased metric learning: On the utilization of multiple datasets and web images for softening bias. In ICCV, pages 1657–1664, 2013.
  • [8] L. Fei-Fei, R. Fergus, and P. Perona. Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories. Computer Vision and Image Understanding, 106(1):59–70, 2007.
  • [9] B. Fernando, A. Habrard, M. Sebban, and T. Tuytelaars. Unsupervised visual domain adaptation using subspace alignment. In ICCV, 2013.
  • [10] Y. Ganin, E. Ustinova, H. Ajakan, P. Germain, H. Larochelle, F. Laviolette, M. Marchand, and V. Lempitsky. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(59):1–35, 2016.
  • [11] M. Ghifary, W. Bastiaan Kleijn, M. Zhang, and D. Balduzzi. Domain generalization for object recognition with multi-task autoencoders. In ICCV, 2015.
  • [12] M. Ghifary, W. B. Kleijn, M. Zhang, D. Balduzzi, and W. Li. Deep reconstruction-classification networks for unsupervised domain adaptation. In ECCV, 2016.
  • [13] B. Gong, Y. Shi, F. Sha, and K. Grauman. Geodesic flow kernel for unsupervised domain adaptation. In CVPR, 2012.
  • [14] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In NIPS, 2014.
  • [15] A. Gretton, K. M. Borgwardt, M. Rasch, B. Schölkopf, and A. J. Smola. A kernel method for the two-sample-problem. In NIPS, 2006.
  • [16] A. Gretton, D. Sejdinovic, H. Strathmann, S. Balakrishnan, M. Pontil, K. Fukumizu, and B. K. Sriperumbudur. Optimal kernel choice for large-scale two-sample tests. In NIPS, 2012.
  • [17] D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [18] Y. LeCun. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/, 1998.
  • [19] D. Li, Y. Yang, Y.-Z. Song, and T. M. Hospedales. Deeper, broader and artier domain generalization. In ICCV, 2017.
  • [20] Y. Li, K. Swersky, and R. Zemel. Generative moment matching networks. In ICML, 2015.
  • [21] M. Long, Y. Cao, J. Wang, and M. Jordan. Learning transferable features with deep adaptation networks. In ICML, 2015.
  • [22] M. Long, H. Zhu, J. Wang, and M. I. Jordan. Unsupervised domain adaptation with residual transfer networks. In NIPS, 2016.
  • [23] A. Makhzani, J. Shlens, N. Jaitly, I. Goodfellow, and B. Frey. Adversarial autoencoders. ICLR Workshop, 2016.
  • [24] X. Mao, Q. Li, H. Xie, R. Y. Lau, Z. Wang, and S. P. Smolley. Least squares generative adversarial networks. arXiv preprint arXiv:1611.04076, 2016.
  • [25] S. Motiian, M. Piccirilli, D. A. Adjeroh, and G. Doretto. Unified deep supervised domain adaptation and generalization. In ICCV, 2017.
  • [26] K. Muandet, D. Balduzzi, and B. Schölkopf. Domain generalization via invariant feature representation. In ICML, 2013.
  • [27] S. J. Pan, I. W. Tsang, J. T. Kwok, and Q. Yang. Domain adaptation via transfer component analysis. IEEE Transactions on Neural Networks, 22(2):199–210, 2011.
  • [28] S. J. Pan and Q. Yang. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345–1359, 2010.
  • [29] B. C. Russell, A. Torralba, K. P. Murphy, and W. T. Freeman. LabelMe: A database and web-based tool for image annotation. International Journal of Computer Vision, 77(1):157–173, 2008.
  • [30] K. Saenko, B. Kulis, M. Fritz, and T. Darrell. Adapting visual category models to new domains. In ECCV, 2010.
  • [31] A. J. Smola, A. Gretton, L. Song, and B. Schölkopf. A Hilbert space embedding for distributions. In ALT, pages 13–31, 2007.
  • [32] B. K. Sriperumbudur, K. Fukumizu, A. Gretton, G. R. G. Lanckriet, and B. Schölkopf. Kernel choice and classifiability for RKHS embeddings of probability distributions. In NIPS, pages 1750–1758, 2009.
  • [33] B. Sun, J. Feng, and K. Saenko. Return of frustratingly easy domain adaptation. In AAAI, 2016.
  • [34] E. Tzeng, J. Hoffman, K. Saenko, and T. Darrell. Adversarial discriminative domain adaptation. In CVPR, 2017.
  • [35] E. Tzeng, J. Hoffman, N. Zhang, K. Saenko, and T. Darrell. Deep domain confusion: Maximizing for domain invariance. arXiv preprint arXiv:1412.3474, 2014.
  • [36] R. Wan, B. Shi, L.-Y. Duan, A. H. Tan, W. Gao, and A. C. Kot. Region-aware reflection removal with unified content and gradient priors. In IEEE Transactions on Image Processing.
  • [37] H. Wang, A. Kläser, C. Schmid, and C.-L. Liu. Dense trajectories and motion boundary descriptors for action recognition. International Journal of Computer Vision, 103(1):60–79, 2013.
  • [38] D. Weinland, R. Ronfard, and E. Boyer. Free viewpoint action recognition using motion history volumes. Computer Vision and Image Understanding, 104(2):249–257, 2006.
  • [39] Z. Xu, W. Li, L. Niu, and D. Xu. Exploiting low-rank structure from latent domains for domain generalization. In ECCV, 2014.
  • [40] P. Yang and W. Gao. Multi-view discriminant transfer learning. In IJCAI, 2013.