Maintenance Of Maximal Frequent Itemsets In Large Databases
SAC07: The 2007 ACM Symposium on Applied Computing, Seoul, Korea, March 2007
Abstract
There have been many studies on the efficient discovery of maximal frequent itemsets in large databases. However, maintaining such discovered itemsets is nontrivial when more and more data is inserted into the database, as the insertions may invalidate some existing maximal frequent itemsets and also create new ones. In this paper, we address the relationships between old and new maximal frequent itemsets and propose an algorithm, IMFI, which uses these relationships to reuse previously discovered knowledge. The algorithm follows a top-down mechanism rather than the traditional bottom-up methods in order to produce fewer candidates. Moreover, we integrate the SG-tree into IMFI to improve counting efficiency, which makes counting faster than in methods based on a vertical bitmap database representation. Evaluations of IMFI have been performed using both synthetic and real databases. Preliminary results show that IMFI is always much faster than an available incremental MFI mining algorithm, especially when it is equipped with the SG-tree.
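The maintenance problem described in the abstract can be illustrated with a small sketch. It is not the IMFI algorithm and it does not use an SG-tree: it is a brute-force, bottom-up enumeration, assuming a relative minimum support of 50% and a toy database. All names (`frequent_itemsets`, `maximal_frequent_itemsets`, `old_db`, `inserted`) are hypothetical and chosen only for illustration. It shows how inserting transactions can invalidate an existing maximal frequent itemset and create new ones.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of frequent itemsets (illustration only).

    min_support is a relative threshold (assumed here): an itemset is
    frequent if it occurs in at least min_support * len(transactions)
    transactions.
    """
    items = sorted(set().union(*transactions))
    threshold = min_support * len(transactions)
    frequent = []
    for k in range(1, len(items) + 1):
        level = [
            frozenset(c)
            for c in combinations(items, k)
            if sum(1 for t in transactions if set(c) <= t) >= threshold
        ]
        if not level:
            # Anti-monotonicity: no frequent k-itemset means no larger one.
            break
        frequent.extend(level)
    return frequent


def maximal_frequent_itemsets(transactions, min_support):
    """Frequent itemsets that have no frequent proper superset."""
    frequent = frequent_itemsets(transactions, min_support)
    return {s for s in frequent if not any(s < other for other in frequent)}


# Toy database (hypothetical data, not taken from the paper).
old_db = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b"}, {"d"}]
inserted = [{"a", "d"}, {"b", "d"}]

old_mfi = maximal_frequent_itemsets(old_db, 0.5)
new_mfi = maximal_frequent_itemsets(old_db + inserted, 0.5)

print("invalidated:", old_mfi - new_mfi)
print("created:    ", new_mfi - old_mfi)
```

In this toy case the old maximal frequent itemset {a, b, c} falls below the threshold after the insertion, its subset {a, b} becomes maximal, and {d} becomes newly frequent. Recomputing everything from scratch, as the sketch does, is the cost that an incremental method aims to avoid by reusing the previously discovered itemsets.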