On-Line Cumulative Learning of Hierarchical Sparse n-grams

msra (2004)

Abstract
We present a system for on-line, cumulative learning of hierarchical collections of frequent patterns from unsegmented data streams. Such learning is critical for long-lived intelligent agents in complex worlds. Learned patterns enable prediction of unseen data and serve as building blocks for higher-level knowledge representation. We introduce a novel sparse n-gram model that, unlike pruned n-grams, learns on-line by stochastic search for frequent n-tuple patterns. Adding patterns as data arrives complicates probability calculations. We discuss an EM approach to this problem and introduce hierarchical sparse n-grams, a model that uses a better solution based on a new method for combining information across levels. A second new method for combining information from multiple granularities (n-gram widths) enables these models to search more effectively for frequent patterns (an on-line, stochastic analog of pruning in association rule mining). The result is an example of a rare combination: unsupervised, on-line, cumulative, structure learning. Unlike prediction suffix tree (PST) mixtures, the model learns with no size bound but uses less space than the data. It does not repeatedly iterate over the data (unlike MaxEnt feature construction). It discovers repeated structure on-line and (unlike PSTs) uses this to learn larger patterns. The type of repeated structure is limited (e.g., compared to hierarchical HMMs) but still useful, and these are important first steps towards learning repeated structure in more expressive representations, an area that has seen little progress, especially in unsupervised, on-line contexts.
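The abstract does not include code; the following is a minimal Python sketch of one plausible reading of the on-line stochastic search it describes. It tracks a sparse set of candidate n-grams, stochastically proposes wider candidates only when their (n-1)-width prefix is already frequent (the on-line analog of Apriori-style pruning mentioned above), and evicts candidates that fail to gather support. The class name SparseNGramLearner and all parameters (propose_prob, min_count, prune_every) are hypothetical, and the sketch deliberately omits the paper's probability model (the EM treatment and the hierarchical combination of levels).

```python
import random
from collections import defaultdict

class SparseNGramLearner:
    """Hypothetical sketch of an on-line sparse n-gram learner.

    Rather than counting every n-gram, it maintains a sparse set of
    candidate patterns. New candidates are proposed stochastically by
    extending patterns already frequent at width n-1, and infrequent
    candidates are periodically evicted.
    """

    def __init__(self, max_width=4, propose_prob=0.05,
                 min_count=5, prune_every=1000):
        self.max_width = max_width        # largest pattern width tracked
        self.propose_prob = propose_prob  # chance of adopting a new candidate
        self.min_count = min_count        # support threshold for pruning
        self.prune_every = prune_every    # symbols between pruning sweeps
        self.counts = defaultdict(int)    # pattern (tuple) -> count
        self.buffer = []                  # sliding window of recent symbols
        self.seen = 0                     # total symbols observed

    def observe(self, symbol):
        """Consume one symbol from the unsegmented stream."""
        self.seen += 1
        self.buffer.append(symbol)
        if len(self.buffer) > self.max_width:
            self.buffer.pop(0)

        # Count every unigram; they anchor the hierarchy.
        self.counts[(symbol,)] += 1

        # For wider patterns, count only adopted candidates, and
        # stochastically propose new ones whose prefix is frequent.
        for width in range(2, len(self.buffer) + 1):
            pattern = tuple(self.buffer[-width:])
            if pattern in self.counts:
                self.counts[pattern] += 1
            elif (self.counts.get(pattern[:-1], 0) >= self.min_count
                  and random.random() < self.propose_prob):
                self.counts[pattern] = 1  # adopt as a new candidate

        if self.seen % self.prune_every == 0:
            self._prune()

    def _prune(self):
        """Evict candidates that failed to accumulate enough support."""
        for pattern in [p for p, c in self.counts.items()
                        if len(p) > 1 and c < self.min_count]:
            del self.counts[pattern]

    def frequent_patterns(self):
        """Patterns (width > 1) that meet the support threshold."""
        return {p: c for p, c in self.counts.items()
                if len(p) > 1 and c >= self.min_count}


if __name__ == "__main__":
    random.seed(0)
    learner = SparseNGramLearner()
    # Synthetic unsegmented stream with an embedded repeated motif "abc".
    stream = "".join(random.choice("abcxyz") if random.random() < 0.5
                     else "abc" for _ in range(5000))
    for ch in stream:
        learner.observe(ch)
    for pattern, count in sorted(learner.frequent_patterns().items(),
                                 key=lambda kv: -kv[1])[:5]:
        print("".join(pattern), count)
```

On the synthetic stream the embedded motif "abc" and its sub-patterns surface among the top frequent patterns, while the stochastic proposal step keeps the table sparse, consistent with the paper's claim of using less space than the data.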
Keywords
intelligent agent, association rule mining, cumulant