Defining diff as a data mining primitive

Ramesh Subramonian

KDD'98: Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (1998)

Abstract
The emphasis on discovery in the knowledge discovery process, while important in its own right, has distracted from the equally important process of knowledge representation and maintenance. For a system to indicate what is new or different, it must have an understanding of what is old, well understood, or expected. In this paper, we propose diff as a fundamental data mining primitive. We show how it can be used to capture knowledge, either as a set of representative instances or as a set of rules, in a framework that is tightly integrated with the knowledge discovery process. We show how it can be applied to both discrete and continuous attributes and to association rules. Lastly, we show how it enables the user to pinpoint high-level differences between two data sets that share the same attributes.
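The abstract does not give the paper's formal definition of diff, so the following is only an illustrative sketch of the general idea of comparing two data sets that share the same attributes. It assumes discrete attributes, rows stored as dictionaries, and total variation distance as the measure of change. The names `attribute_diff`, `rank_attributes`, `OLD_ROWS`, and `NEW_ROWS` are hypothetical and do not come from the paper.

```python
from collections import Counter

# Hypothetical sample data: two data sets with the same attributes.
OLD_ROWS = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "M"},
    {"color": "blue", "size": "S"},
    {"color": "blue", "size": "M"},
]
NEW_ROWS = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "M"},
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "M"},
]


def attribute_diff(old_rows, new_rows, attribute):
    """Return the change in relative frequency for each value of one attribute.

    Illustrative assumption only; not the paper's definition of diff.
    """
    old_counts = Counter(row[attribute] for row in old_rows)
    new_counts = Counter(row[attribute] for row in new_rows)
    old_total = sum(old_counts.values())
    new_total = sum(new_counts.values())
    diff = {}
    for value in sorted(set(old_counts) | set(new_counts)):
        old_freq = old_counts[value] / old_total if old_total else 0.0
        new_freq = new_counts[value] / new_total if new_total else 0.0
        diff[value] = new_freq - old_freq
    return diff


def rank_attributes(old_rows, new_rows, attributes):
    """Order attributes by total variation distance between the two data sets."""
    scores = {}
    for attribute in attributes:
        per_value = attribute_diff(old_rows, new_rows, attribute)
        # Total variation distance: half the sum of absolute frequency changes.
        scores[attribute] = 0.5 * sum(abs(delta) for delta in per_value.values())
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(attribute_diff(OLD_ROWS, NEW_ROWS, "color"))
    print(rank_attributes(OLD_ROWS, NEW_ROWS, ["color", "size"]))
```

In this sketch, ranking attributes by how much their value distribution shifted is one simple way to surface high-level differences first; a continuous attribute would need a different comparison, such as binning or a distance between summary statistics.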