A Constraint Approach To Lexicon Induction For Low-Resource Languages

SERVICES COMPUTING FOR LANGUAGE RESOURCES(2018)

引用 0|浏览7
暂无评分
摘要
Bilingual lexicon is a useful language resource, but such data rarely available for lower-density language pairs, especially for those that are closely related. The lack or absence of parallel and comparable corpora makes bilingual lexicon extraction becomes a difficult task. Using a third language to link two other languages is a well-known solution in low-resource situation, which usually requires only two input bilingual lexicons to automatically induce the new one. This approach, however, is weak in measuring semantic distance between bilingual word pairs because it has never been demonstrated to utilize the complete structures of the input bilingual lexicons as dropped meanings negatively influence the result. This research discuss a constraint approach to pivot-based lexicon induction in case the target language pair are closely related. We create constraints from language similarity and model the structures of the input dictionaries as an optimization problem whose solution produces optimally correct target bilingual lexicon. In addition, we enable created bilingual lexicons of low-resource languages accessible through service grid federation.
更多
查看译文
关键词
Low-resource languages, Bilingual dictionary induction, Pivot language, Weighted partial max-SAT, Constraint satisfaction problem
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要