Computer Science ›› 2014, Vol. 41 ›› Issue (7): 283-289. doi: 10.11896/j.issn.1002-137X.2014.07.059

• Artificial Intelligence •

New Ensemble Learning Approach

GUO Hua-ping, YUAN Jun-hong, ZHANG Fan, WU Chang-an and FAN Ming

  1. Xinyang Normal University, Xinyang 464000, China; 2. Zhengzhou University, Zhengzhou 450052, China
  • Online: 2018-11-14  Published: 2018-11-14
  • Funding:
    This work was supported by the National 863 Program project "Large-scale Chinese Word Sense Knowledge Feature Extraction and Construction" (2012AA011101) and the Key Project of the Department of Science and Technology of Henan Province "Energy-efficient Coverage of Sensor Networks Based on an Adaptive Ant Colony Algorithm" (12A520035).


Abstract: This paper proposes a new decision-tree-based ensemble learning method called FL (Forest Learning). Unlike traditional ensemble learning methods such as bagging and AdaBoost, FL does not use sampling or weighted sampling; instead, it learns a forest directly on the full training set as the ensemble. Also unlike the traditional practice of training each base classifier independently and then combining them for prediction, FL trains each base classifier while taking its influence on the ensemble into account as much as possible. FL first builds the first decision tree of the forest with a conventional algorithm, then constructs new decision trees one by one and adds them to the forest; while a new tree is being grown, every node split is chosen with regard to its effect on the ensemble. Experimental results show that, compared with traditional ensemble learning methods, FL builds ensembles with better performance on most datasets.
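The abstract describes a greedy, ensemble-aware construction. Below is a minimal illustrative sketch of that idea, not the authors' implementation: trees (reduced here to one-level stumps for brevity) are added to the forest one at a time, all trained on the full training set, and each candidate split is scored by the accuracy of the majority vote of the ensemble that would contain it. Ensemble accuracy stands in for the paper's margin-based "contribution gain", which is not reproduced here, and all names below are hypothetical.

```python
# Sketch of ensemble-aware forest growing (assumptions noted above).
import numpy as np


class Stump:
    """A one-level decision tree: a single (feature, threshold) split."""

    def __init__(self, feature, threshold, left_label, right_label):
        self.feature = feature
        self.threshold = threshold
        self.left_label = left_label      # majority class where x[f] <= t
        self.right_label = right_label    # majority class where x[f] >  t

    def predict(self, X):
        left = X[:, self.feature] <= self.threshold
        return np.where(left, self.left_label, self.right_label)


def ensemble_accuracy(forest, X, y, n_classes):
    """Accuracy of the forest's unweighted majority vote."""
    votes = np.zeros((len(X), n_classes))
    for tree in forest:
        votes[np.arange(len(X)), tree.predict(X)] += 1
    return float(np.mean(votes.argmax(axis=1) == y))


def grow_forest(X, y, n_trees=10):
    """Add trees one by one; pick each split by its effect on the ensemble.

    y must hold integer class labels 0..n_classes-1. For the first tree the
    forest is empty, so the criterion degenerates to the stump's own
    accuracy, loosely matching FL's conventionally built first tree.
    """
    n_classes = int(y.max()) + 1
    forest = []
    for _ in range(n_trees):
        best, best_acc = None, -1.0
        for f in range(X.shape[1]):
            for t in np.unique(X[:, f]):
                left = X[:, f] <= t
                if left.all() or not left.any():
                    continue  # degenerate split, nothing on one side
                ll = int(np.bincount(y[left], minlength=n_classes).argmax())
                rl = int(np.bincount(y[~left], minlength=n_classes).argmax())
                cand = Stump(f, t, ll, rl)
                # Score the candidate by whole-ensemble performance, not by
                # a local impurity measure such as information gain.
                acc = ensemble_accuracy(forest + [cand], X, y, n_classes)
                if acc > best_acc:
                    best, best_acc = cand, acc
        forest.append(best)
    return forest


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)
    forest = grow_forest(X, y, n_trees=5)
    print("train accuracy:", ensemble_accuracy(forest, X, y, 2))
```

Replacing the stumps with depth-limited trees and the accuracy criterion with the margin-based contribution gain described in the paper would recover the full FL scheme.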

Key words: Forest learning, Margin-based theory, Contribution gain, Feature transformation    CLC number: TP181    Document code: A
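For context, the "Margin-based theory" keyword presumably refers to the classical voting-margin analysis of ensembles (Schapire et al., 1998). The standard margin of a labeled example (x, y) under a vote of base classifiers h_1, ..., h_T with weights α_t is:

```latex
% Classical voting margin of a labeled example (x, y); this is the standard
% definition from margin theory for voting classifiers, not the paper's
% exact "contribution gain" formula.
\[
  \operatorname{margin}(x, y) =
  \frac{\displaystyle\sum_{t=1}^{T} \alpha_t \,\mathbb{1}\!\left[h_t(x) = y\right]
        \;-\; \max_{y' \neq y} \sum_{t=1}^{T} \alpha_t \,\mathbb{1}\!\left[h_t(x) = y'\right]}
       {\displaystyle\sum_{t=1}^{T} \alpha_t}
\]
```

Larger training-set margins yield better generalization bounds for voting classifiers, which is the usual motivation for split criteria that account for the whole ensemble rather than a single tree.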



