A Domain Adaptation Regularization for Denoising Autoencoders

ACL (2016)

This paper proposes to combine the domain prediction regularization idea of (Ganin and Lempitsky, 2015) with denoising autoencoders.

Abstract

Finding domain invariant features is critical for successful domain adaptation and transfer learning. However, in the case of unsupervised adaptation, there is a significant risk of overfitting on source training data. Recently, a regularization for domain adaptation was proposed for deep models by (Ganin and Lempitsky, 2015). We build on...

Introduction
  • The domain adaptation problem arises whenever labeled data in one or more related source domains must be leveraged to learn a classifier for unseen data in a target domain.
  • It has been studied for more than a decade, with applications in statistical machine translation, opinion mining, part-of-speech tagging, named entity recognition and document ranking (Daumé and Marcu, 2006; Pan and Yang, 2010; Zhou and Chang, 2014).
  • There are several extensions of topic models and matrix factorization techniques where the latent factors are shared by source and target collections (Chen and Liu, 2014; Chen et al., 2013).
Highlights
  • The domain adaptation problem arises whenever we need to leverage labeled data in one or more related source domains to learn a classifier for unseen data in a target domain.
  • We build on stacked Marginalized Denoising Autoencoders (sMDA; Chen et al., 2012), which can be learned efficiently with a closed-form solution (see the sketch after this list). We show that such a domain adaptation regularization keeps the benefits of the sMDA and yields results competitive with the state-of-the-art results of (Ganin and Lempitsky, 2015).
  • Despite using a single layer and a Logistic Regression (LR) classifier trained on the source only, the MDA baseline (80.15% on average) is very close to the G-sMDA results obtained with a 5-layer sMDA and a 6-times-larger feature set (80.18%).
  • This paper proposes a domain adaptation regularization for denoising autoencoders, in particular for marginalized ones.
  • One limitation of our model is the linearity assumption for the domain classifier, but for textual data, linear classifiers are the state-of-the-art technique.
  • As new words and expressions become frequent in a new domain, a dropout regularization that forces the reconstruction of the initial objects to resemble target-domain objects pays off.
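
Since the closed-form solution is central to the approach, the following is a minimal NumPy sketch of a single marginalized denoising autoencoder layer, following the construction of Chen et al. (2012); the variable names and the small ridge term reg are our additions, not the paper's.

    import numpy as np

    def mda_layer(X, p=0.5, reg=1e-5):
        """One marginalized denoising autoencoder (mDA) layer, following
        Chen et al. (2012). X is d x n (columns are documents); p is the
        feature-corruption probability. Returns the d x (d+1) mapping W
        minimizing the expected reconstruction loss E||X - W Xtilde||^2
        in closed form."""
        d, n = X.shape
        Xb = np.vstack([X, np.ones((1, n))])       # append a bias row
        q = np.full(d + 1, 1.0 - p); q[-1] = 1.0   # the bias is never corrupted
        S = Xb @ Xb.T                              # scatter matrix X X^T
        Q = S * np.outer(q, q)                     # E[Xtilde Xtilde^T], off-diagonal part
        np.fill_diagonal(Q, q * np.diag(S))        # diagonal keeps a single factor q_i
        P = S[:d, :] * q                           # E[X Xtilde^T]
        # W solves W Q = P; a small ridge keeps Q well conditioned.
        return np.linalg.solve(Q + reg * np.eye(d + 1), P.T).T

    def mda_transform(W, X):
        """Hidden representation h = tanh(W [X; 1]); stacking feeds h into
        the next layer, as in the sMDA framework."""
        Xb = np.vstack([X, np.ones((1, X.shape[1]))])
        return np.tanh(W @ Xb)
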
Methods
  • The authors conduct unsupervised domain adaptation experiments on two standard collections: the Amazon reviews (Blitzer et al., 2011) and the 20Newsgroups (Pan and Yang, 2010) datasets.
  • From the Amazon dataset they consider the four most-used domains: dvd (D), books (B), electronics (E) and kitchen (K), and adopt the settings of (Ganin et al., 2015), with the 5000 most frequent common features selected for each adaptation task and tf-idf weighting (a feature-preparation sketch follows this list).
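
As a rough illustration of this feature preparation, here is a scikit-learn sketch; the exact tokenization and feature selection of (Ganin et al., 2015) may differ, so TfidfVectorizer(max_features=5000) is only an approximation of "5000 most frequent common features", and the tiny example texts are placeholders.

    from sklearn.feature_extraction.text import TfidfVectorizer

    source_texts = ["great book, loved it", "boring plot"]    # e.g. books reviews
    target_texts = ["the blender broke after a week"]         # e.g. kitchen reviews

    # Fit one vocabulary on the union of both domains so source and
    # target share the same feature space, then apply tf-idf weighting.
    vectorizer = TfidfVectorizer(max_features=5000)
    vectorizer.fit(source_texts + target_texts)
    Xs = vectorizer.transform(source_texts).toarray().T       # d x n_source
    Xt = vectorizer.transform(target_texts).toarray().T       # d x n_target
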
Results
  • Despite using a single layer and an LR classifier trained on the source only, the MDA baseline (80.15% on average) is very close to the G-sMDA results obtained with a 5-layer sMDA and a 6-times-larger feature set (80.18%).
Conclusion
  • This paper proposes a domain adaptation regularization for denoising autoencoders, in particular for marginalized ones.
  • One limitation of the model is the linearity assumption for the domain classifier, but for textual data, linear classifiers are the state-of-the-art technique.
  • The main advantage of the new model is its closed-form solution (see the sketch below).
  • It is unsupervised, as it does not require labeled target examples, and yields performance comparable with the current state of the art.
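
The closed form survives because the loss stays quadratic in the mapping W. The exact MDA+TR objective is not spelled out on this page, so the sketch below only illustrates the recipe under our own assumptions: a frozen linear domain classifier u (e.g., a ridge or logistic model trained beforehand on raw features, with source coded as -1 and target as +1) scores reconstructions, and a term lam * sum_n (1 - u^T W xtilde_n)^2 pulls denoised points toward the target side. Here lam, u and the +1 coding are assumptions of this sketch, not the paper's notation.

    import numpy as np

    def mda_target_reg_layer(Xs, Xt, u, p=0.5, lam=1.0, reg=1e-5):
        """Hypothetical domain-regularized mDA layer (a sketch, not the
        paper's exact MDA+TR). Minimizes, in closed form,
            E||X - W Xtilde||^2 + lam * sum_n (1 - u^T W xtilde_n)^2,
        where u is a frozen linear domain classifier scoring target as +1,
        so denoised points are pulled toward the target domain."""
        d = Xs.shape[0]
        X = np.hstack([Xs, Xt])                    # pool both domains, no labels needed
        n = X.shape[1]
        Xb = np.vstack([X, np.ones((1, n))])       # append a bias row
        q = np.full(d + 1, 1.0 - p); q[-1] = 1.0   # the bias is never corrupted
        S = Xb @ Xb.T
        Q = S * np.outer(q, q)                     # E[Xtilde Xtilde^T]
        np.fill_diagonal(Q, q * np.diag(S))
        P = S[:d, :] * q                           # E[X Xtilde^T]
        m = q * Xb.sum(axis=1)                     # sum_n E[xtilde_n]
        # Stationarity condition: (I + lam u u^T) W Q = P + lam u m^T
        B = np.linalg.solve(Q + reg * np.eye(d + 1),
                            (P + lam * np.outer(u, m)).T).T
        return np.linalg.solve(np.eye(d) + lam * np.outer(u, u), B)

Stacking then proceeds as in sMDA: apply tanh(W [X; 1]) and feed the result to the next layer.
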
Tables
  • Table 1: Accuracies of MDA, MDA+TR, G-sMDA and DANN on the Amazon review dataset. Underline indicates improvement over the baseline MDA; bold indicates the highest value.
  • Table 2: Accuracies of MDA and MDA+TR on 20Newsgroups adaptation tasks.
Reference
  • Shai Ben-David, John Blitzer, Koby Crammer, and Fernando Pereira. 2007. Analysis of representations for domain adaptation. In Advances in Neural Information Processing Systems 19 (NIPS 2006), Vancouver, British Columbia, Canada, December 4-7, 2006.
  • John Blitzer, Ryan McDonald, and Fernando Pereira. 2006. Domain adaptation with structural correspondence learning. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP, Sydney, Australia, 22-23 July 2006.
  • John Blitzer, Sham Kakade, and Dean P. Foster. 2011. Domain adaptation with coupled subspaces. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, AISTATS, Fort Lauderdale, USA, April 11-13, 2011.
  • Danushka Bollegala, Takanori Maehara, and Ken-ichi Kawarabayashi. 2015. Unsupervised cross-domain word representation learning. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics, ACL, Beijing, China, July 26-31, 2015, volume 1.
  • Zhiyuan Chen and Bing Liu. 2014. Topic modeling using topics from many domains, lifelong learning and big data. In Proceedings of the 31st International Conference on Machine Learning, ICML, Beijing, China, 21-26 June 2014.
  • M. Chen, Z. Xu, K. Q. Weinberger, and F. Sha. 2012. Marginalized denoising autoencoders for domain adaptation. ICML, arXiv:1206.4683.
  • Zhiyuan Chen, Arjun Mukherjee, Bing Liu, Meichun Hsu, Malu Castellanos, and Riddhiman Ghosh. 2013. Leveraging multi-domain prior knowledge in topic models. In Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence, IJCAI '13, pages 2071–2077. AAAI Press.
  • S. Chopra, S. Balakrishnan, and R. Gopalan. 2013. DLID: Deep learning for domain adaptation by interpolating between domains. In Proceedings of the 30th International Conference on Machine Learning, ICML, Atlanta, USA, 16-21 June 2013.
  • H. Daumé III and D. Marcu. 2006. Domain adaptation for statistical classifiers. JAIR, 26:101–126.
  • H. Daumé III. 2009. Frustratingly easy domain adaptation. CoRR, arXiv:0907.1815.
  • Yaroslav Ganin and Victor S. Lempitsky. 2015. Unsupervised domain adaptation by backpropagation. In Proceedings of the 32nd International Conference on Machine Learning, ICML, Lille, France, 6-11 July 2015, pages 1180–1189.
  • Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, Francois Laviolette, Mario Marchand, and Victor S. Lempitsky. 2015. Domain-adversarial training of neural networks. CoRR, abs/1505.07818.
  • X. Glorot, A. Bordes, and Y. Bengio. 2011. Domain adaptation for large-scale sentiment classification: A deep learning approach. In Proceedings of the 28th International Conference on Machine Learning, ICML, Bellevue, Washington, USA, June 28 - July 2, 2011.
  • M. Long, Y. Cao, J. Wang, and M. Jordan. 2015. Learning transferable features with deep adaptation networks. In Proceedings of the 32nd International Conference on Machine Learning, ICML 2015, Lille, France, 6-11 July 2015.
  • S. J. Pan and Q. Yang. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering, 22(10):1345–1359.
  • Sinno Jialin Pan, Xiaochuan Ni, Jian-Tao Sun, Qiang Yang, and Zheng Chen. 2010. Cross-domain sentiment classification via spectral feature alignment. In Proceedings of the 19th International Conference on World Wide Web, WWW, New York, NY, USA. ACM.
  • P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol. 2008. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning, ICML, Helsinki, Finland, July 5-9, 2008.
  • Stefan Wager, Sida I. Wang, and Percy Liang. 2013. Dropout training as adaptive regularization. In Advances in Neural Information Processing Systems 26, NIPS, Lake Tahoe, Nevada, United States, December 5-8, 2013.
  • Mianwei Zhou and Kevin C. Chang. 2014. Unifying learning to rank and domain adaptation: Enabling cross-task document scoring. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, pages 781–790, New York, NY, USA. ACM.