Region compatibility based stability assessment for decision trees.
Expert Systems with Applications (2018)
Abstract
Decision tree learning algorithms are known to be unstable: small changes in the training data can produce very different decision trees. An important issue is therefore how to quantify decision tree stability. Two types of stability are defined in the literature: structural and semantic. However, existing structural stability measures are meaningless when applied to apparently different decision trees, and semantic stability focuses only on prediction accuracy without considering structural information. This paper proposes a region compatibility based structural stability measure for decision trees that considers the structural distribution of leaves from the perspective of basic probability assignments in evidence theory. To the best of our knowledge, we are the first to use basic probability assignments to quantify decision tree stability. We prove convergence for region compatibility, and show that apparently different decision trees have some inherent similarity when viewed through region compatibility. We also clarify the meaning of region compatibility for measuring decision tree stability, and derive a method for selecting a relatively stable learning algorithm for a given dataset. Experimental results validate that region compatibility is effective for quantifying the stability of decision tree learning algorithms.
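The observation that apparently different trees can share an inherent similarity can be illustrated with a small sketch. This is not the paper's region compatibility measure, which the abstract does not define; it is a plain prediction-agreement check on a grid, and the names `tree_a`, `tree_b`, `tree_c`, and `agreement` are hypothetical, introduced only for illustration.

```python
from itertools import product

# Hand-written trees over two features (x, y), each in [0, 1).
# tree_a splits on x at the root; tree_b splits on y at the root.
# Their structures differ, but both label a point "pos" exactly
# when x >= 0.5 and y >= 0.5, so they induce the same regions.
def tree_a(x, y):
    if x < 0.5:
        return "neg"
    return "pos" if y >= 0.5 else "neg"

def tree_b(x, y):
    if y < 0.5:
        return "neg"
    return "pos" if x >= 0.5 else "neg"

# tree_c has the same shape as tree_a but a shifted threshold on x,
# so its "pos" region is smaller.
def tree_c(x, y):
    if x < 0.7:
        return "neg"
    return "pos" if y >= 0.5 else "neg"

def agreement(f, g, points):
    """Fraction of points on which two classifiers predict the same label."""
    same = sum(f(x, y) == g(x, y) for x, y in points)
    return same / len(points)

# Cell centres of a 10 x 10 grid over the unit square (assumed input space).
grid = [(i / 10 + 0.05, j / 10 + 0.05) for i, j in product(range(10), repeat=2)]

print(agreement(tree_a, tree_b, grid))  # different roots, identical regions
print(agreement(tree_a, tree_c, grid))  # same shape, different regions
```

In this toy setting, a comparison based only on tree shape would call `tree_a` and `tree_b` different and `tree_a` and `tree_c` alike, while a comparison based on the regions the leaves cover says the opposite.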
Keywords
Machine learning, Decision tree, Stability measurement, Region compatibility, Evidence theory