Statistical Machine Translation of Parliamentary Proceedings Using Morpho-Syntactic Knowledge
msra(2006)
Abstract
This paper presents an overview of the University of Washington statistical machine translation system developed for the 2006 TCSTAR evaluation campaign. We use a statistical phrase-based system with multiple decoding passes and a log-linear probability model. Our main focus was on exploring the possibility of using morpho-syntactic knowledge (lemmas and part-of-speech tags) for word alignment, language modeling, processing out-of-vocabulary words, and reordering. Use of these knowledge sources led to substantial improvements for translation from English into Spanish and minor improvements for the opposite translation direction. In addition, we investigated hidden-event n-gram models for postprocessing of machine translation output.