# Theoretical Comparison between the Gini Index and Information Gain Criteria

Ann. Math. Artif. Intell., no. 1 (2004): 77–93



• Early work in the field of decision tree construction focused mainly on the definition and realization of classification systems.
• Once a certain number of algorithms had been defined, much research was dedicated to comparing them.
• This is a relatively hard task, as the different systems evolved from different backgrounds: information theory, discriminant analysis, encoding techniques, etc.

• The sign of the difference of the Gini Index functions corresponding to two tests T and T', and of the difference of the Information Gain functions, is established for the six possible situations.
• We present a formal comparison of the behavior of two of the most popular split functions, namely the Gini Index function and the Information Gain function.
• The situations where the two split functions agree or disagree on the selected split are mathematically characterized. Based on these characterizations, we were able to analyze the frequency of agreement and disagreement between the Gini Index function and the Information Gain function.
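As a concrete reading of the two criteria (a minimal sketch; the class counts below are invented for illustration and are not taken from the paper), both can be expressed as the reduction in node impurity produced by a test, differing only in the impurity measure used:

```python
from math import log2

def gini(counts):
    """Gini impurity of a node, given its per-class example counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def entropy(counts):
    """Shannon entropy (base 2) of a node, given its per-class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def split_gain(parent, children, impurity):
    """Impurity reduction achieved by a test: impurity of the parent
    minus the size-weighted impurity of the children it produces."""
    n = sum(parent)
    weighted = sum(sum(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted

# Hypothetical two-class parent node (400 examples per class) that a
# candidate test splits into two children; counts are illustrative only.
parent = [400, 400]
children = [[300, 100], [100, 300]]

gini_gain = split_gain(parent, children, gini)      # Gini Index criterion
info_gain = split_gain(parent, children, entropy)   # Information Gain criterion
```

For this example split, the Gini gain is 0.125 and the information gain is about 0.189; the paper's comparison concerns when the two criteria *rank* a pair of candidate tests differently, not their absolute values.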

• The sign of the difference of the Gini Index functions corresponding to two tests T and T', and of the Information Gain functions, is established for each of the six possible situations.
• The authors present the details for one case as an illustration.

## Conclusions and Future Work

In this paper, the authors presented a formal comparison of the behavior of two of the most popular split functions, namely the Gini Index function and the Information Gain function.
• The situations where the two split functions agree or disagree on the selected split were mathematically characterized.
• Based on these characterizations, the authors were able to analyze the frequency of agreement and disagreement between the Gini Index function and the Information Gain function.
• The authors emphasize that the methodology introduced in this paper is not limited to the two analyzed split criteria; they used it successfully to formalize and compare other split criteria.
• Preliminary results can be found in [17].
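The agreement/disagreement characterization can be illustrated numerically (a sketch with invented class counts, not an example from the paper). Since both criteria subtract the same parent impurity, ranking two tests by gain is equivalent to ranking them by the size-weighted impurity of their children, and for some pairs of tests the two criteria order them differently:

```python
from math import log2

def gini(counts):
    """Gini impurity of a node, given its per-class example counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def entropy(counts):
    """Shannon entropy (base 2) of a node, given its per-class counts."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)

def weighted_impurity(children, impurity):
    """Size-weighted impurity of the children produced by a test.
    Lower is better; the parent term cancels when comparing two tests."""
    n = sum(sum(child) for child in children)
    return sum(sum(child) / n * impurity(child) for child in children)

# Two hypothetical tests on the same parent node [400, 400], described
# by the class counts of the two children each test produces.
test_T  = [[8, 0], [392, 400]]
test_T2 = [[201, 160], [199, 240]]

g_T  = weighted_impurity(test_T,  gini)
g_T2 = weighted_impurity(test_T2, gini)
h_T  = weighted_impurity(test_T,  entropy)
h_T2 = weighted_impurity(test_T2, entropy)

# The Gini Index criterion prefers T' (g_T2 < g_T), while the
# Information Gain criterion prefers T (h_T < h_T2): the signs of the
# two criterion differences disagree for this pair of tests.
```

Such disagreeing pairs exist but are rare, which is consistent with the paper's finding that the two split functions select the same test in the large majority of situations.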

This work was supported by grant number 2100-056986.99 from the Swiss National Science Foundation.

• A. Babic, E. Krusinska, and J. E. Stromberg. Extraction of diagnostic rules using recursive partitioning systems: A comparison of two approaches. Artificial Intelligence in Medicine, 20(5):373–387, October 1992.
• L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and regression trees. Wadsworth International Group, 1984.
• R. López de Mántaras. A distance-based attribute selection measure for decision tree induction. Machine Learning, 6(1):81–92, 1991.
• J. Gama and P. Brazdil. Characterization of classification algorithms. In C. Pinto-Ferreira and N. Mamede, editors, EPIA-95: Progress in Artificial Intelligence, 7th Portuguese Conference on Artificial Intelligence, pages 189–200. Springer Verlag, 1995.
• Igor Kononenko. On biases in estimating multi-valued attributes. In Chris Mellish, editor, IJCAI-95: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pages 1034–1040, Montreal, Canada, August 1995. Morgan Kaufmann Publishers Inc., San Mateo, CA.
• Tjen-Sien Lim, Wei-Yin Loh, and Yu-Shan Shih. A comparison of prediction accuracy, complexity and training time of thirty-three old and new classification algorithms. Machine Learning, 1999.
• John Mingers. An empirical comparison of selection measures for decision tree induction. Machine Learning, 3:319–342, 1989.
• Masahiro Miyakawa. Criteria for selecting a variable in the construction of efficient decision trees. IEEE Transactions on Computers, 35(1):133–141, January 1986.
• B. M. Moret. Decision trees and diagrams. Computing Surveys, 14(4):593–623, 1982.
• Kolluru Venkata Sreerama Murthy. On Growing Better Decision Trees from Data. PhD thesis, The Johns Hopkins University, Baltimore, Maryland, 1995.
• G. Pagallo. Adaptive Decision Tree Algorithms for Learning from Examples. PhD thesis, University of California, Santa Cruz, 1990.
• J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, 1993.
• John Ross Quinlan. Simplifying decision trees. International Journal of Man-Machine Studies, 27:221–234, 1987.
• Laura E. Raileanu. Theoretical comparison between the Gini Index and Information Gain functions. Technical report, Faculté de droit et des sciences économiques, Université de Neuchâtel, 2000.
• S. R. Safavian and D. Landgrebe. A survey of decision tree classifier methodology. IEEE Transactions on Systems, Man and Cybernetics, 21(3):660–674, 1991.
• M. Sahami. Learning non-linearly separable boolean functions with linear threshold unit trees and madaline-style networks. In Proceedings of the Eleventh National Conference on Artificial Intelligence, pages 335–341. AAAI Press, 1993.
• Kilian Stoffel and Laura E. Raileanu. Selecting optimal split-functions for large datasets. In Research and Development in Intelligent Systems XVII, BCS Conference Series, 2000.
• Ricardo Vilalta and Daniel Oblinger. A quantification of distance-bias between evaluation metrics in classification. In Proceedings of the 17th International Conference on Machine Learning. Stanford University, 2000.
• Allan P. White and Wei Zhang Liu. Bias in information-based measures in decision tree induction. Machine Learning, 15(3):321–328, June 1994.