Phrase-Based Statistical Language Generation Using Graphical Models and Active Learning
ACL (2010)
Abstract
Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents Bagel, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that Bagel can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data.
Keywords
phrase-based statistical language generation, handcrafted generator, trainable language generation, human evaluation, statistical model, statistical language generator, semantically-aligned data, certainty-based active learning, graphical model, human gold standard, generation performance, generation decision process, active learning