Towards Convergence Rate Analysis of Random Forests for Classification

NIPS 2020 (2020)

Abstract

Random forests have been one of the most successful ensemble algorithms in machine learning. The basic idea is to construct a large number of random trees individually and make predictions based on an average of their predictions. These great successes have attracted much attention to the consistency of random forests, mostly focusing on regression…

Introduction
  • Since the pioneering work [12], random forests have been recognized as one of the most successful algorithms for classification and regression: they construct a large number of random trees individually and make predictions based on an average of their predictions (a minimal code sketch of this aggregation scheme follows this list).
  • The authors first present the following relationship between the convergence rate of the random forests classifier and that of an individual random tree classifier; the detailed proof is given in Appendix A.
  • Lemma 1: Let f_m(x) be the random forests classifier given by Eqn. (1), and let f_{S_n,Θ}(x) denote the classifier of an individual tree with respect to the random vector Θ.
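The following is a minimal sketch of this aggregation scheme (a hypothetical illustration, not the paper's Algorithm 1; the randomized scikit-learn trees serve only as stand-ins for the pure random trees analyzed in the paper): m trees with at most k leaves are grown independently on the sample S_n, and the forest predicts by majority vote over their outputs, in the spirit of Eqn. (1).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

def random_forest_vote(X_train, y_train, X_test, m=100, k=32, seed=0):
    """Majority vote over m individually grown randomized trees, each with at most k leaves."""
    rng = np.random.RandomState(seed)
    votes = np.zeros((m, len(X_test)), dtype=int)
    for j in range(m):
        # splitter="random" plays the role of the random vector Theta_j;
        # max_leaf_nodes=k plays the role of the leaves parameter k.
        tree = DecisionTreeClassifier(splitter="random", max_features=1,
                                      max_leaf_nodes=k,
                                      random_state=rng.randint(2**31 - 1))
        tree.fit(X_train, y_train)
        votes[j] = tree.predict(X_test)
    # Binary labels in {0, 1}: the forest outputs the majority class.
    return (votes.mean(axis=0) >= 0.5).astype(int)

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
y_hat = random_forest_vote(X[:400], y[:400], X[400:])
```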
Highlights
  • Since the pioneering work [12], random forests have been recognized as one of the most successful algorithms for classification and regression: they construct a large number of random trees individually and make predictions based on an average of their predictions.
  • We present a convergence rate of pure random forests with midpoint splits for classification as follows. Theorem 2: Let f_m(x) be the random forests classifier obtained by applying pure random trees with midpoint splits, each with k leaves (k ≥ 2), to the training data S_n (a toy sketch of such a midpoint-split tree follows this list).
  • Our work presents convergence rates of random forests for classification based on different analysis techniques, and it would be interesting to study the convergence rates of other variants of random forests along the lines of our analysis.
  • We present the first finite-sample convergence rate O(n^{-1/(8d+2)}) for pure random forests, as well as a convergence rate O(n^{-1/(d+2)}(ln n)^{1/(d+2)}) for the simplified variant of Breiman's original random forests [12], which reaches the minimax rate of the optimal plug-in classifier under the L-Lipschitz assumption, up to a factor of (ln n)^{1/(d+2)}.
  • It would be interesting to extend our work to multi-class learning, where the challenges lie in the theoretical analysis of the predictions f(x, y) − max_{i≠y} f(x, i) and Lipschitz assumptions over multiple class-conditional distributions.
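Below is a hypothetical sketch of how a pure random tree with midpoint splits on [0, 1]^d can be grown: the partition is built without looking at the labels (the split feature is drawn uniformly at random and the split point is the midpoint of the chosen cell), and each leaf predicts the majority label of the training points falling into it. This only illustrates the construction named in Theorem 2; the paper's exact procedure may differ, e.g., in how cells are chosen for splitting.

```python
import numpy as np

def grow_midpoint_tree(lower, upper, k, rng):
    """Grow k leaves by recursive data-independent splits: pick a random cell,
    pick a split feature uniformly at random, split the cell at its midpoint."""
    cells = [(lower.copy(), upper.copy())]
    while len(cells) < k:
        lo, hi = cells.pop(rng.randint(len(cells)))   # cell chosen to be split
        f = rng.randint(len(lo))                      # split feature, uniform at random
        mid = (lo[f] + hi[f]) / 2.0                   # midpoint split
        hi_left, lo_right = hi.copy(), lo.copy()
        hi_left[f], lo_right[f] = mid, mid
        cells += [(lo, hi_left), (lo_right, hi)]
    return cells

def predict_majority(cells, X_train, y_train, X_test):
    """Each leaf (cell) predicts the majority label of the training points it contains."""
    def cell_of(x):
        for i, (lo, hi) in enumerate(cells):
            if np.all(x >= lo) and np.all(x <= hi):
                return i
        return 0
    train_cells = np.array([cell_of(x) for x in X_train])
    preds = np.empty(len(X_test), dtype=int)
    for t, x in enumerate(X_test):
        labels = y_train[train_cells == cell_of(x)]
        preds[t] = int(labels.mean() >= 0.5) if len(labels) else 0
    return preds

rng = np.random.RandomState(0)
X = rng.rand(200, 2)                       # training points in [0, 1]^2
y = (X[:, 0] > 0.5).astype(int)            # a simple noiseless labeling
cells = grow_midpoint_tree(np.zeros(2), np.ones(2), k=8, rng=rng)
y_hat = predict_majority(cells, X[:150], y[:150], X[150:])
```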
Results
  • Theorem 1: Let f_m(x) be the random forests classifier obtained by applying pure random trees, each with k leaves (k ≥ 2), to the training data S_n.
  • The authors obtain a convergence rate O(n^{-1/(8d+2)}) of pure random forests for classification by selecting the leaves parameter k = O(n^{4d/(4d+1)}) (see the numerical illustration after this list).
  • The authors first derive the convergence rate of the individual random tree classifier f_{S_n,Θ}(x), and complete the proof by combining it with Lemma 1.
  • The authors present a convergence rate of pure random forests with midpoint splits for classification as follows. Theorem 2: Let f_m(x) be the random forests classifier obtained by applying pure random trees with midpoint splits, each with k leaves (k ≥ 2), to the training data S_n.
  • The authors obtain a convergence rate O(n^{-1/(3.87d+2)}) of pure random forests with midpoint splits for classification by selecting the leaves parameter k = O(n^{3.87d/(3.87d+2)}).
  • The authors present a convergence rate of the simplified variant of random forests for classification as follows. Theorem 3: For k ≥ 2 and n ≥ 4, let f_m(x) be the random forests classifier obtained by applying Algorithm 1 to the training data S_n, with k leaves per tree.
  • The authors' simplified variant of random forests reaches the minimax convergence rate of the optimal plug-in classifiers, up to a logarithmic factor of order (ln n)^{1/(1+d)}, even though random forests are not plug-in classifiers: random forests take a majority vote over the predictions of individual random trees rather than estimating the conditional probability.
  • The authors achieve a tighter convergence rate O(ln n/n) for the simplified variant of random forests for classification, which is independent of the dimension d.
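As a small numerical illustration of the scalings stated above (with all constants in the O(·) notation suppressed, which is an assumption of this sketch), the snippet below evaluates the leaves parameter k and the corresponding convergence rates of Theorems 1 and 2 for a few sample sizes.

```python
# Constants in the O(.) statements are suppressed; this only shows the scalings.
def pure_rf_rate(n, d):
    """Theorem 1: k ~ n^{4d/(4d+1)} gives excess-risk rate n^{-1/(8d+2)}."""
    return n ** (4 * d / (4 * d + 1)), n ** (-1 / (8 * d + 2))

def midpoint_rf_rate(n, d):
    """Theorem 2: k ~ n^{3.87d/(3.87d+2)} gives excess-risk rate n^{-1/(3.87d+2)}."""
    return n ** (3.87 * d / (3.87 * d + 2)), n ** (-1 / (3.87 * d + 2))

for n in (10**3, 10**5, 10**7):
    k1, r1 = pure_rf_rate(n, d=5)
    k2, r2 = midpoint_rf_rate(n, d=5)
    print(f"n={n:>8}: pure k~{k1:,.0f}, rate~{r1:.3f} | midpoint k~{k2:,.0f}, rate~{r2:.3f}")
```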
Conclusion
  • Mourtada et al. [40] established the consistency of online Mondrian forest classifiers according to [22, Theorem 6.1], and derived the minimax rate O(n^{-1/(d+2)}) for plug-in classifiers based on estimating the conditional probability; that is, they averaged the conditional probabilities computed by individual Mondrian trees.
  • The authors present the first finite-sample convergence rate O(n^{-1/(8d+2)}) for pure random forests, as well as a convergence rate O(n^{-1/(d+2)}(ln n)^{1/(d+2)}) for the simplified variant of Breiman's original random forests [12], which reaches the minimax rate of the optimal plug-in classifier under the L-Lipschitz assumption, up to a factor of (ln n)^{1/(d+2)}.
  • This is a purely theoretical work with no particular application foreseen.
Related Work
  • For random forests, a large number of variants have been developed for different problems and settings over the past decades. Geurts et al. [27] introduced extremely randomized trees, and Amaratunga et al. [1] provided enriched random forests for DNA microarray data with huge numbers of features. Menze et al. [38] presented oblique random forests for multivariate trees by explicitly learning the optimal split directions with linear discriminative models. Clémençon et al. [14] introduced ranking forests, based on aggregation and feature-randomization principles, for bipartite ranking. Athey et al. [4] developed a flexible and computationally efficient algorithm for generalized random forests. A general framework on various splitting criteria for random forests based on loss functions is presented in [53]. Zhou and Feng [55, 56] proposed gcForest, whose performance is highly competitive with deep neural networks. Online random forests have also been developed with strong theoretical guarantees [19, 33, 40, 49].
Funding
  • This research was supported by the NSFC (61921006, 61876078) and the Fundamental Research Funds for the Central Universities (14380003).
References
  • [1] D. Amaratunga, J. Cabrera, and Y.-S. Lee. Enriched random forests. Bioinformatics, 24(18):2010–2014, 2008.
  • [2] Y. Amit and D. Geman. Shape quantization and recognition with randomized trees. Neural Computation, 9(7):1545–1588, 1997.
  • [3] S. Arlot and R. Genuer. Analysis of purely random forests bias. CoRR/Abstract, 1407.3939, 2014.
  • [4] S. Athey, J. Tibshirani, and S. Wager. Generalized random forests. Annals of Statistics, 47(2):1148–1178, 2019.
  • [5] J.-Y. Audibert and A. Tsybakov. Fast learning rates for plug-in classifiers. Annals of Statistics, 35(2):608–633, 2007.
  • [6] S. Basu, K. Kumbier, J. Brown, and B. Yu. Iterative random forests to discover predictive and stable high-order interactions. Proceedings of the National Academy of Sciences, 115(8):1943–1948, 2018.
  • [7] M. Belgiu and L. Dragut. Random forest in remote sensing: A review of applications and future directions. ISPRS Journal of Photogrammetry and Remote Sensing, 114:24–31, 2016.
  • [8] G. Biau. Analysis of a random forests model. Journal of Machine Learning Research, 13:1063–1095, 2012.
  • [9] G. Biau, L. Devroye, and G. Lugosi. Consistency of random forests and other averaging classifiers. Journal of Machine Learning Research, 9:2015–2033, 2008.
  • [10] G. Biau and E. Scornet. A random forest guided tour. Test, 25(2):197–227, 2016.
  • [11] L. Breiman. Some infinity theory for predictor ensembles. Technical Report 579, Statistics Department, UC Berkeley, Berkeley, CA, 2000.
  • [12] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
  • [13] L. Breiman. Consistency for a simple model of random forests. Technical Report 670, Statistics Department, UC Berkeley, Berkeley, CA, 2004.
  • [14] S. Clémençon, M. Depecker, and N. Vayatis. Ranking forests. Journal of Machine Learning Research, 14:39–73, 2013.
  • [15] T. Cover and P. Hart. Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 13(1):21–27, 1967.
  • [16] A. Criminisi and J. Shotton. Decision Forests for Computer Vision and Medical Image Analysis. Springer Science & Business Media, 2013.
  • [17] A. Criminisi, J. Shotton, and E. Konukoglu. Decision forests: A unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning. Foundations and Trends in Computer Graphics and Vision, 7(2-3):81–227, 2012.
  • [18] D. Cutler, T. Edwards Jr, K. Beard, A. Cutler, K. Hess, J. Gibson, and J. Lawler. Random forests for classification in ecology. Ecology, 88(11):2783–2792, 2007.
  • [19] M. Denil, D. Matheson, and N. Freitas. Consistency of online random forests. In Proceedings of the 30th International Conference on Machine Learning, pages 1256–1264, Atlanta, GA, 2013.
  • [20] M. Denil, D. Matheson, and N. De Freitas. Narrowing the gap: Random forests in theory and in practice. In Proceedings of the 31st International Conference on Machine Learning, pages 665–673, Beijing, China, 2014.
  • [21] L. Devroye. A note on the height of binary search trees. Journal of the ACM, 33(3):489–498, 1986.
  • [22] L. Devroye, L. Gyorfi, and G. Lugosi. A Probabilistic Theory of Pattern Recognition. Springer, New York, 1996.
  • [23] T. G. Dietterich. An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40(2):139–157, 2000.
  • [24] M. Fernández-Delgado, E. Cernadas, S. Barro, and D. Amorim. Do we need hundreds of classifiers to solve real world classification problems? Journal of Machine Learning Research, 15(1):3133–3181, 2014.
  • [25] R. Genuer. Variance reduction in purely random forests. Journal of Nonparametric Statistics, 24(3):543–562, 2012.
  • [26] R. Genuer, J. Poggi, and C. Tuleau. Random forests: Some methodological insights. CoRR/Abstract, 0811.3619, 2008.
  • [27] P. Geurts, D. Ernst, and L. Wehenkel. Extremely randomized trees. Machine Learning, 63(1):3–42, 2006.
  • [28] J. Goetz, A. Tewari, and P. Zimmerman. Active learning for non-parametric regression using purely random trees. In Advances in Neural Information Processing Systems 31, pages 2537–2546. MIT Press, Cambridge, MA, 2018.
  • [29] T. K. Ho. The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8):832–844, 1998.
  • [30] J. Kazemitabar, A. Amini, A. Bloniarz, and A. Talwalkar. Variable importance using decision trees. In Advances in Neural Information Processing Systems 30, pages 426–435. MIT Press, Cambridge, MA, 2017.
  • [31] J. Klusowski. Sharp analysis of a simple model for random forests. CoRR/Abstract, 1805.02587, 2018.
  • [32] S. Kwok and C. Carter. Multiple decision trees. In Proceedings of the 4th Annual Conference on Uncertainty in Artificial Intelligence, pages 327–338, Minneapolis, MN, 1988.
  • [33] B. Lakshminarayanan, D. Roy, and Y. Teh. Mondrian forests: Efficient online random forests. In Advances in Neural Information Processing Systems 27, pages 3140–3148. MIT Press, Cambridge, MA, 2014.
  • [34] X. Li, Y. Wang, S. Basu, K. Kumbier, and B. Yu. A debiased MDI feature importance measure for random forests. In Advances in Neural Information Processing Systems 32, pages 8047–8057. MIT Press, Cambridge, MA, 2019.
  • [35] Y. Lin and Y. Jeon. Random forests and adaptive nearest neighbors. Journal of the American Statistical Association, 101(474):578–590, 2006.
  • [36] G. Louppe, L. Wehenkel, A. Sutera, and P. Geurts. Understanding variable importances in forests of randomized trees. In Advances in Neural Information Processing Systems 26, pages 431–439. MIT Press, Cambridge, MA, 2013.
  • [37] N. Meinshausen. Quantile regression forests. Journal of Machine Learning Research, 7:983–999, 2006.
  • [38] B. Menze, M. Kelm, D. Splitthoff, U. Koethe, and F. Hamprecht. On oblique random forests. In Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 453–469, Athens, Greece, 2011.
  • [39] M. Mitzenmacher and E. Upfal. Probability and Computing: Randomized Algorithms and Probabilistic Analysis. Cambridge University Press, 2005.
  • [40] J. Mourtada, S. Gaïffas, and E. Scornet. Universal consistency and minimax rates for online Mondrian forests. In Advances in Neural Information Processing Systems 30, pages 3758–3767. MIT Press, Cambridge, MA, 2017.
  • [42] B. Reed. The height of a random binary search tree. Journal of the ACM, 50(3):306–332, 2003.
  • [43] J. Rodriguez, L. Kuncheva, and C. Alonso. Rotation forest: A new classifier ensemble method. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(10):1619–1630, 2006.
  • [44] E. Scornet. On the asymptotics of random forests. Journal of Multivariate Analysis, 146:72–83, 2016.
  • [45] E. Scornet, G. Biau, and J. Vert. Consistency of random forests. Annals of Statistics, 43(4):1716–1741, 2015.
  • [46] S. Shalev-Shwartz and S. Ben-David. Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, Cambridge, 2014.
  • [47] J. Shotton, T. Sharp, A. Kipman, A. Fitzgibbon, M. Finocchio, A. Blake, M. Cook, and R. Moore. Real-time human pose recognition in parts from single depth images. Communications of the ACM, 56(1):116–124, 2013.
  • [48] V. Svetnik, A. Liaw, C. Tong, J. Culberson, R. Sheridan, and B. Feuston. Random forest: A classification and regression tool for compound classification and QSAR modeling. Journal of Chemical Information and Computer Sciences, 43(6):1947–1958, 2003.
  • [49] M. Taddy, R. Gramacy, and N. Polson. Dynamic trees for learning and design. Journal of the American Statistical Association, 106(493):109–123, 2011.
  • [50] C. Tang, D. Garreau, and U. von Luxburg. When do random forests fail? In Advances in Neural Information Processing Systems 31, pages 2983–2993. MIT Press, Cambridge, MA, 2018.
  • [51] S. Wager. Asymptotic theory for random forests. CoRR/Abstract, 1405.0352, 2014.
  • [52] S. Wager, T. Hastie, and B. Efron. Confidence intervals for random forests: The jackknife and the infinitesimal jackknife. Journal of Machine Learning Research, 15(1):1625–1651, 2014.
  • [53] B.-B. Yang, W. Gao, and M. Li. On the robust splitting criterion of random forest. In Proceedings of the 19th IEEE International Conference on Data Mining, pages 1420–1425, Beijing, China, 2019.
  • [54] Y. Yang. Minimax nonparametric classification - part I: Rates of convergence. IEEE Transactions on Information Theory, 45(7):2271–2284, 1999.
  • [55] Z.-H. Zhou and J. Feng. Deep forest: Towards an alternative to deep neural networks. In Proceedings of the 26th International Joint Conference on Artificial Intelligence, pages 3553–3559, Melbourne, Australia, 2017.
  • [56] Z.-H. Zhou and J. Feng. Deep forest. National Science Review, 6(1):74–86, 2019.