Using Latent Topic Features For Named Entity Extraction In Search Queries

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5(2011)

Abstract
Search is one of the fastest-growing applications in the mobile market. As people rely more on portable devices for search, it becomes increasingly important to analyze user queries in order to achieve more targeted results over a broad set of search entities. While most previous work has relied on lexico-syntactic features and handcrafted knowledge sources, this paper investigates methods for learning latent semantic features from unlabelled user-generated content. We extract word-topic associations by training a Latent Dirichlet Allocation model on a corpus of online reviews, and show that this information improves named-entity classification performance on broad-domain search queries. We believe that topical features provide a rich source of information from data with minimal manual effort and no dependency on a specific language.
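The idea of turning word-topic associations into classifier features can be illustrated with a minimal sketch. It is not the paper's implementation: the collapsed Gibbs sampler, the toy review corpus, the hyperparameters, and the function names `train_lda` and `word_topic_features` are all assumptions made for illustration, and the abstract does not say how the topic model was trained or how the features were encoded.

```python
import random
from collections import defaultdict


def train_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return word -> per-topic counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]        # tokens per (doc, topic)
    word_topic = defaultdict(lambda: [0] * num_topics)  # tokens per (word, topic)
    topic_total = [0] * num_topics                      # tokens per topic
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            word_topic[w][k] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                word_topic[w][k] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (word_topic[w][t] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                word_topic[w][k] += 1
                topic_total[k] += 1
    return dict(word_topic)


def word_topic_features(word, word_topic, num_topics, beta=0.01):
    """Smoothed p(topic | word); uniform for words unseen in training."""
    counts = word_topic.get(word, [0] * num_topics)
    total = sum(counts) + num_topics * beta
    return [(c + beta) / total for c in counts]


# Hypothetical toy corpus standing in for the online reviews.
reviews = [
    "great pizza and pasta at this italian restaurant".split(),
    "the pasta was fresh and the pizza crust was crisp".split(),
    "friendly hotel staff and a clean room with a view".split(),
    "the hotel room was quiet and the staff were helpful".split(),
]
NUM_TOPICS = 2
word_topic = train_lda(reviews, NUM_TOPICS)

# One feature vector per query token, ready to append to other features.
query = "pizza near hotel".split()
features = {w: word_topic_features(w, word_topic, NUM_TOPICS) for w in query}
```

In this sketch each query token receives a topic-distribution vector that could be concatenated with lexical features before named-entity classification. Because the vectors are learned from unlabelled text, no annotated data or language-specific resources are needed to produce them.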
Keywords
named entity extraction, spoken language processing, topic models, latent dirichlet allocation