A Statistical Decision Tree Algorithm For Data Stream Classification

ICEIS: PROCEEDINGS OF THE 15TH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL 1 (2013)

Abstract
A large amount of data is generated daily. Credit card transactions, monitoring networks, sensors, and telecommunications are a few of the many applications that generate large volumes of data automatically. Storage and knowledge extraction techniques for data streams differ from those used on traditional data. In the context of data stream classification, many incremental techniques have been proposed. In this paper we present an incremental decision tree algorithm called StARMiner Tree (ST), which is based on the Very Fast Decision Tree (VFDT) system. ST deals with numerical data and uses a statistics-based heuristic both to decide when to split a node and to choose the best attribute for the test at that node. We applied ST to four datasets, two synthetic and two real-world, and compared its performance to that of VFDT. In all experiments ST achieved better accuracy, handling noisy data well and describing the data well from the earliest examples. However, in three of the four experiments ST created a larger tree. The results indicate that ST is a good classifier on both large and smaller datasets, maintaining good accuracy and execution time.
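For context, the split decision in the VFDT baseline can be sketched as follows. This is a minimal illustration of the standard Hoeffding-bound test that VFDT is known for, not the statistical heuristic used by ST, which the abstract does not specify. The function names, the tie threshold default, and the sample values are assumptions made for illustration only.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R) is the range of the split measure, e.g. log2(#classes)
    for information gain; delta is the allowed error probability; n is the
    number of examples seen at the leaf.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, value_range: float,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """VFDT-style test (hypothetical helper, not the ST heuristic).

    Split when the best attribute beats the runner-up by more than epsilon,
    or when epsilon is so small that the two are effectively tied.
    """
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


# Two classes, so information gain has range R = log2(2) = 1.
few_examples = should_split(0.30, 0.15, 1.0, 1e-7, n=100)
many_examples = should_split(0.30, 0.15, 1.0, 1e-7, n=1000)
```

With the same observed gains, the leaf is not split after 100 examples but is split after 1000, because the bound shrinks as more examples arrive. ST keeps this incremental structure but, according to the abstract, replaces the decision rule with a statistics-based one.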
Keywords
Data Stream Mining, Classification, Decision Tree, VFDT, StARMiner Tree, Anytime Algorithm