Bilingual Contexts From Comparable Corpora To Mine For Translations Of Collocations
COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, (CICLING 2016), PT II(2018)
摘要
Due to the limited availability of parallel data in many languages, we propose a methodology that benefits from comparable corpora to find translation equivalents for collocations (as a specific type of difficult-to-translate multi-word expressions). Finding translations is known to be more difficult for collocations than for words. We propose a method based on bilingual context extraction and build a word (distributional) representation model drawing on these bilingual contexts (bilingual English-Spanish contexts in our case). We show that the bilingual context construction is effective for the task of translation equivalent learning and that our method outperforms a simplified distributional similarity baseline in finding translation equivalents.
更多查看译文
关键词
Collocations, Word vector representation, Distributional similarity, Comparable corpora
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要