Topic Adaptation For Statistical Machine Translation
2017 25th Iranian Conference on Electrical Engineering (ICEE) (2017)
Abstract
We present new methods for Farsi-to-English topic adaptation in statistical machine translation. We incorporate topic information into the phrase table in the form of sparse phrasal features, and we make use of sparse lexical features by determining the topic distribution of source sentences in the development and test corpora. These sparse features cover many topic-related source-to-target translations. We also develop systems with features that measure the topical similarity between the source sentence and each hypothesis. These include features based on distributional profiles and two types of features that use bilingual topic models to measure the similarity of the source sentence and the hypothesis using topic vectors in the source and target languages. Domain and topic adaptation are also combined to improve translation quality. Experiments are carried out on the Farsi-to-English Verbmobil and CNN datasets. The BLEU score improves by up to 2.0 points on the Verbmobil dataset. On the CNN dataset, we observe an improvement of up to 1.17 BLEU along with several corrections to individual translations.
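As a rough illustration of the topical-similarity idea described in the abstract, the sketch below scores each translation hypothesis by how closely its topic vector matches that of the source sentence. The abstract does not say which similarity measure the authors use, so cosine similarity is an assumption here, as are the function names `cosine_similarity` and `rank_hypotheses` and the toy topic vectors. It further assumes that a bilingual topic model has already mapped source and target text into a shared topic space.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length topic vectors.

    Cosine is an assumed choice; the abstract does not name the measure.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector carries no topic information
    return dot / (norm_u * norm_v)


def rank_hypotheses(source_topics, hypotheses):
    """Sort (text, topic_vector) pairs by similarity to the source, best first.

    Returns a list of (score, text) tuples. Hypothetical helper for illustration.
    """
    scored = [
        (cosine_similarity(source_topics, topics), text)
        for text, topics in hypotheses
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy topic distributions over three topics (made-up values, not from the paper).
source_topics = [0.7, 0.2, 0.1]
hypotheses = [
    ("hypothesis A", [0.6, 0.3, 0.1]),
    ("hypothesis B", [0.1, 0.1, 0.8]),
]

for score, text in rank_hypotheses(source_topics, hypotheses):
    print(f"{text}: {score:.3f}")
```

In a real decoder such a score would typically be one feature among many in the log-linear model, with its weight tuned on the development set, rather than a standalone ranking criterion.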
Keywords
statistical machine translation, topic adaptation, topic models, sparse features, phrase table, topical similarity, domain adaptation