2014 Volume 29
RESEARCH ARTICLE   Open Access    

A survey of commonly used ensemble-based classification techniques

  • Abstract: The combination of multiple classifiers, commonly referred to as a classifier ensemble, has previously been shown to improve classification accuracy in many application domains, and as a result the area has attracted a significant amount of research in recent years. The aim of this paper is therefore to provide a state-of-the-art review of the most well-known ensemble techniques, with a main focus on bagging, boosting and stacking, and to trace recent attempts to improve their performance. We present and compare an updated view of the different modifications of these techniques that specifically aim to address some of their drawbacks, namely the low-diversity problem in bagging and the over-fitting problem in boosting. In addition, we review ensemble selection methods based on both static and dynamic approaches, and highlight some new directions in the area of classifier ensembles drawn from a range of recently published studies. Finally, to provide deeper insight into ensembles themselves, we review a range of existing theoretical studies.
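The abstract's central idea, that aggregating the votes of many unstable base classifiers can improve accuracy, can be illustrated with a minimal sketch of bagging (bootstrap aggregating) over one-dimensional decision stumps. This is our own toy illustration, not code from the survey; the `train_stump` learner and the sample data are invented for the demonstration:

```python
import random
from collections import Counter

def train_stump(points):
    """Fit a 1-D decision stump: the threshold and direction that
    minimize training error on the given (x, label) pairs."""
    best = None  # (error, threshold, sign)
    for t in sorted(x for x, _ in points):
        for sign in (1, -1):
            err = sum(1 for x, y in points
                      if (1 if sign * (x - t) >= 0 else 0) != y)
            if best is None or err < best[0]:
                best = (err, t, sign)
    _, t, sign = best
    return lambda x, t=t, sign=sign: 1 if sign * (x - t) >= 0 else 0

def bagging(points, n_models, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the
    training data, and combine them by majority vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap: draw len(points) examples *with replacement*.
        sample = [rng.choice(points) for _ in points]
        models.append(train_stump(sample))
    def predict(x):
        votes = Counter(m(x) for m in models)
        return votes.most_common(1)[0][0]
    return predict

# Toy 1-D data: class 0 on the left, class 1 on the right.
data = [(0.0, 0), (1.0, 0), (2.0, 0), (5.0, 1), (6.0, 1), (7.0, 1)]
predict = bagging(data, n_models=15)
```

Because every stump sees a different bootstrap resample, the individual thresholds vary; the majority vote smooths out that variance, which is exactly the effect the bias-variance analyses reviewed in the paper attribute to bagging.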

  • Cite this article

    Anna Jurek, Yaxin Bi, Shengli Wu, Chris Nugent. 2014. A survey of commonly used ensemble-based classification techniques. The Knowledge Engineering Review 29(5): 551–581. doi: 10.1017/S0269888913000155


    • Copyright © Cambridge University Press 2013