Optimal Estimation Of Rejection Thresholds For Topic Spotting

ICASSP (4)(2007)

Abstract
In many applications of topic spotting technology, especially those that require a human review of in-topic documents, a low false alarm rate is a key requirement. Topic spotting techniques typically include a rejection scheme to filter out off-topic documents. In this paper we present a robust methodology for rejecting off-topic messages that, in addition to modeling the topics of interest, uses a so-called alternate model for topics that are not included in the set of topics of interest. Specifically, we introduce two novel techniques for estimating topic-specific rejection thresholds: a parametric technique that can be viewed as a transformation of topic-independent thresholds, and a non-parametric technique based on constrained optimization of false rejections subject to a pre-specified number of false acceptances. Our experiments on newsgroup messages demonstrate that, when adequate training data is available, topic-specific threshold estimation techniques can outperform topic-independent thresholds in terms of the ROC curve.
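The non-parametric idea mentioned above (minimize false rejections subject to a fixed budget of false acceptances) can be illustrated with a minimal sketch for a single topic. This is not the paper's implementation: the function names, the example scores, and the convention that a document is accepted when its score is strictly greater than the threshold are all assumptions made for illustration. The scores could be, for example, a log-likelihood ratio between the topic model and the alternate model, but that choice is also an assumption.

```python
import math


def estimate_threshold(off_topic_scores, max_false_accepts):
    """Return the lowest threshold that admits at most `max_false_accepts`
    off-topic documents, assuming a document is accepted when score > threshold.

    Lowering the threshold can only reduce false rejections, so the lowest
    threshold that respects the false-acceptance budget is the constrained
    optimum for this simple one-dimensional case (hypothetical formulation).
    """
    ranked = sorted(off_topic_scores, reverse=True)
    if max_false_accepts >= len(ranked):
        # Budget covers every off-topic sample: nothing needs to be rejected.
        return -math.inf
    # Everything strictly above the (K+1)-th largest off-topic score is accepted,
    # which is at most K off-topic documents even when scores are tied.
    return ranked[max_false_accepts]


def count_errors(in_topic_scores, off_topic_scores, threshold):
    """Count (false_rejections, false_acceptances) at a given threshold."""
    false_rejects = sum(s <= threshold for s in in_topic_scores)
    false_accepts = sum(s > threshold for s in off_topic_scores)
    return false_rejects, false_accepts


# Made-up held-out scores for one topic (illustrative only).
in_topic = [2.1, 1.4, 0.3, 1.9]
off_topic = [0.5, -0.2, 1.6, -1.0, 0.1]

threshold = estimate_threshold(off_topic, max_false_accepts=1)
errors = count_errors(in_topic, off_topic, threshold)
print(threshold, errors)
```

Running the same estimate once per topic, each on that topic's own held-out scores, would give topic-specific thresholds; pooling the scores of all topics before estimating would give a single topic-independent threshold for comparison.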
Keywords
topic classification,rejection algorithms,hidden Markov models