Soft Syntactic Constraints for Hierarchical Phrase-Based Translation Using Latent Syntactic Distributions.

EMNLP '10: Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing (2010)

Abstract
In this paper, we present a novel approach to enhancing hierarchical phrase-based machine translation systems with linguistically motivated syntactic features. Rather than directly using treebank categories as in previous studies, we automatically learn a set of linguistically-guided latent syntactic categories from a source-side parsed, word-aligned parallel corpus, based on the hierarchical structure among phrase pairs as well as the syntactic structure of the source side. In our model, each X nonterminal in an SCFG rule is decorated with a real-valued feature vector computed from its distribution over latent syntactic categories. These feature vectors are used at decoding time to measure the similarity between the syntactic analysis of the source side and the syntax of the SCFG rules that are applied to derive translations. Our approach maintains the advantages of hierarchical phrase-based translation systems while naturally incorporating soft syntactic constraints.
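The soft-constraint idea can be illustrated with a minimal sketch. The abstract does not specify how the similarity is computed, so the dot product used here is an assumption, and the function names, the three latent categories, and all counts are hypothetical values chosen only for illustration.

```python
def normalize(counts):
    """Turn non-negative counts over latent categories into a distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must not all be zero")
    return [c / total for c in counts]


def similarity(rule_vec, span_vec):
    """Dot product of two equal-length feature vectors.

    Assumption: the dot product stands in for whatever similarity
    measure the paper actually uses.
    """
    if len(rule_vec) != len(span_vec):
        raise ValueError("vectors must have the same length")
    return sum(r * s for r, s in zip(rule_vec, span_vec))


# Hypothetical feature vector attached to an X nonterminal in an SCFG rule:
# the rule was mostly extracted over spans of latent category 0.
rule_x = normalize([8, 1, 1])

# Hypothetical distributions for two source spans proposed at decoding time.
matching_span = normalize([6, 2, 2])     # mostly category 0
mismatching_span = normalize([1, 1, 8])  # mostly category 2

score_match = similarity(rule_x, matching_span)
score_mismatch = similarity(rule_x, mismatching_span)

# The score would enter the model as a feature rather than a hard filter,
# so the mismatching application is penalized but not forbidden.
print(score_match > score_mismatch)
```

Because the score is a real-valued feature and not a filter, a rule whose X nonterminal disagrees with the source parse remains available to the decoder; it is only ranked lower, which is what makes the constraint soft.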
Keywords
SCFG rule, source side, latent syntactic category, linguistically motivated syntactic feature, linguistically-guided latent syntactic category, soft syntactic constraint, syntactic analysis, syntactic structure, hierarchical phrase-based machine translation, hierarchical phrase-based translation system, latent syntactic distribution