Cross-language headline generation for Hindi.
ACM Transactions on Asian Language Information Processing(2003)
摘要
This paper presents new approaches to headline generation for English newspaper texts, with an eye toward the production of document surrogates for document selection in cross-language information retrieval. This task is difficult because the user must make decisions about relevance based on (often poor) translations of retrieved documents. To facilitate the decision-making process we need translations that can be assessed rapidly and accurately; our approach is to provide an English headline for the non-English document. We describe two approaches to headline generation and their application to the recent DARPA TIDES-2003 Surprise Language Exercise for Hindi. For comparison, we also implemented an alternative method for surrogate generation: a system that produces topic lists for (Hindi) articles. We present the results of a series of experiments comparing each of these approaches. We demonstrate in both automatic and human evaluations that our linguistically motivated approach outperforms two other surrogate-generation methods: a statistical system and a topic discovery system.
更多查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要