Neurocomputing

Volume 69, Issues 7–9, March 2006, Pages 686-700

Evolving hybrid ensembles of learning machines for better generalisation

https://doi.org/10.1016/j.neucom.2005.12.014

Abstract

Ensembles of learning machines have been shown, both formally and empirically, to outperform (generalise better than) single predictors in many cases. Evidence suggests that ensembles generalise better when their members form a diverse and accurate set. In addition, a multitude of theories exists on how diversity can be enforced within a combined predictor setup. We recently attempted to integrate these theories into a co-evolutionary framework with a view to synthesising new evolutionary ensemble learning algorithms, exploiting the fact that multi-objective evolutionary optimisation is a formidable ensemble construction technique. This paper explicates the intricacies of the proposed framework and presents detailed empirical results and comparisons with a wide range of algorithms from the machine learning literature. The framework treats diversity and accuracy as evolutionary pressures exerted at multiple levels of abstraction and is shown to be effective.

Introduction

One of the main issues in machine learning research is that of generalisation. Generalisation refers to the prediction ability of a base learner (or learning machine): the better a predictor performs on unseen data, the better it is said to generalise. The ‘bias–variance dilemma’ is a theoretical result which illustrates the importance of generalisation in machine learning research.

An ensemble or a committee of learning machines has been shown, both theoretically and empirically, to outperform (generalise better than) single learners in many cases [12]; Tumer and Ghosh [46] present a formal proof of this. According to Dietterich [21], ensembles form one of the main research directions in machine learning. Brown et al. [12] give an extensive survey of different ensemble methods, together with a theoretical account of why ensembles perform better than single learners.

Although ensembles perform better than their members in many cases, constructing them is not an easy task. As Dietterich [21] points out, the key to successful ensemble methods is to construct base models (individual predictors) which individually perform better than random guessing and which are at least somewhat uncorrelated in the errors they make on the training set. In other words, for an ensemble to work properly it should have a diverse and accurate set of predictors (also noted in one of the seminal works on diversity in classifier ensembles by Hansen and Salamon [28]). Krogh and Vedelsby [32] formally show that an ideal ensemble consists of highly accurate predictors which at the same time disagree as much as possible (i.e. exhibit substantial diversity amongst members). This has also been tested and empirically verified [38], [39]. Thus, diversity and accuracy are two key issues that should be taken care of when constructing ensembles, and there exists a trade-off between the two, as mentioned in [48].
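The result of Krogh and Vedelsby [32] can be summarised by the ambiguity decomposition. For a convex-combination ensemble and a target d, the squared error of the ensemble decomposes as follows (a standard statement of the result, written here in our own notation rather than that of the paper):

\[
\bigl(f_{\mathrm{ens}}(x)-d\bigr)^2
= \underbrace{\sum_i w_i \bigl(f_i(x)-d\bigr)^2}_{\bar{E}\ \text{(average member error)}}
- \underbrace{\sum_i w_i \bigl(f_i(x)-f_{\mathrm{ens}}(x)\bigr)^2}_{\bar{A}\ \text{(average ambiguity)}},
\qquad
f_{\mathrm{ens}}(x)=\sum_i w_i f_i(x),\ \ w_i\ge 0,\ \ \sum_i w_i = 1.
\]

Since \(\bar{A}\ge 0\), the ensemble error can never exceed the average member error, and the gain grows with the disagreement amongst the members, provided they remain individually accurate.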

Given that ensembles generalise better than a single predictor, ensemble research has become an active area, attracting many researchers who have proposed a myriad of algorithms aimed at improving the prediction ability of such aggregate systems. Brown et al. [12] give a taxonomy of methods for creating diversity. Yates and Partridge [53] show that there exists a hierarchical relationship between the ways in which diversity has been, and can be, enforced while creating ensembles, with each method having its own diversity-generating potential. Additionally, Dietterich [21] states that one of the promising and open research areas is that of combining ensemble learning methods to give rise to new ensemble learning algorithms. Moreover, incorporating evolution into machine learning has been widely studied (e.g. evolving neural networks [8], [45], [52], evolving neural network ensembles [35]), with evolution treated as another fundamental form of adaptation in addition to learning [52]. Evolution enables neural systems to adapt to a dynamic environment effectively and efficiently [52].

A more recent approach to ensemble learning views learning as a multi-objective optimisation problem [3]. We have proposed an algorithm called DIVACE (DIVerse and Accurate Ensemble Learning Algorithm) [16], [18] which draws on ideas from Negative Correlation Learning (NCL) [34] and the Memetic Pareto Artificial Neural Network (MPANN) algorithm [1], [3], and formulates ensemble learning explicitly as a multi-objective problem within an evolutionary setup, aiming at finding a good trade-off between diversity and accuracy. One strong motivation for the use of evolutionary multi-criterion optimisation in ensemble creation, in both DIVACE [16], [18] and MPANN [3], is that, due to the presence of multiple conflicting objectives, the evolutionary approach yields a set of near-optimal solutions. The presence of more than one optimal solution means that, by using multi-objectivity while creating ensembles, one can automatically generate an ensemble whose member networks are themselves near optimal [13], [16].
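To illustrate what a bi-objective (accuracy, diversity) evaluation of ensemble members might look like, the Python sketch below scores each member on its training error and on an NCL-style correlation penalty. It is a minimal sketch for a regression setting under our own assumptions (simple-average ensemble, NumPy, illustrative function name); it is not the exact objective pair used in DIVACE.

import numpy as np

def member_objectives(preds, targets):
    # preds: (n_members, n_samples) array of member outputs; targets: (n_samples,).
    # Returns an (n_members, 2) array of objective values, both to be minimised.
    f_ens = preds.mean(axis=0)                       # simple-average ensemble output
    errors = ((preds - targets) ** 2).mean(axis=1)   # objective 1: mean squared error
    # NCL-style penalty p_i = E[(f_i - f_ens) * sum_{j != i} (f_j - f_ens)];
    # with a simple-average ensemble this reduces to -E[(f_i - f_ens)^2],
    # so minimising it rewards disagreement with the rest of the ensemble.
    dev = preds - f_ens
    penalty = (dev * (dev.sum(axis=0) - dev)).mean(axis=1)   # objective 2: diversity
    return np.column_stack([errors, penalty])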

We recently attempted to integrate the aforementioned ideas into a co-evolutionary framework [17] with a view to synthesising new evolutionary ensemble learning algorithms, stressing the fact that multi-objective evolutionary optimisation is a useful ensemble construction technique. DIVACE [16], [18], as an idea, gives us a perfect base on which to develop a framework wherein various ensemble methods can be combined and diversity enforcement can be tackled at multiple levels of abstraction.

The evolutionary framework provides a simple yet effective means for the development of new ensemble learning algorithms. Extending [17], we describe this framework in greater detail in the sections to follow, covering the ideas that led to it and the problems it tries to account for. We also show how this framework can be instantiated and present two resulting algorithms (one of which is from [17]) which essentially differ in their replacement schemes. These algorithms are empirically shown to be very promising. The new results presented in this paper further demonstrate the effectiveness of the framework.

Section snippets

Ensembles tackling the bias–variance dilemma and the trade-off between diversity and accuracy

In order for a learned machine/predictor to exhibit good input–output mappings for unseen data, the predictor should ‘know’ the exact mapping function it is trying to model. In practice, however, this is not possible, i.e. a predictor cannot learn the exact function it wants to learn, mainly due to the scarcity of the data available for training and due to noise in this data. The idea behind training is simply to inculcate a statistical model (a statistical oracle) within a base
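For reference, the textbook bias–variance decomposition alluded to in this section can be written as follows (a standard identity, stated here in our own notation, for a predictor \(\hat{f}_D\) trained on a random dataset D and evaluated against the regression function \(\bar{y}(x)=\mathbb{E}[y\mid x]\)):

\[
\mathbb{E}_D\!\bigl[(\hat{f}_D(x)-\bar{y}(x))^2\bigr]
= \underbrace{\bigl(\mathbb{E}_D[\hat{f}_D(x)]-\bar{y}(x)\bigr)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}_D\!\bigl[(\hat{f}_D(x)-\mathbb{E}_D[\hat{f}_D(x)])^2\bigr]}_{\text{variance}},
\]

with an additional irreducible noise term when the error is measured against the observed target y rather than \(\bar{y}(x)\).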

Multi-objective evolution for ensembles

Multi-objectivity in ensembles, as an area of research, has not been explored extensively yet. To our knowledge, the idea of designing neural networks within a multi-objective setup was first considered by Kottathra and Attikiouzel [31], who used a branch and bound method to determine the number of hidden neurons in feedforward neural networks (the second objective being the mean square error). Recently, Abbass [4] proposed an evolutionary multi-objective neural network
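To make the multi-objective selection step concrete, the sketch below extracts the non-dominated (Pareto) set from a population scored on objectives to be minimised, for instance the (error, diversity) pairs computed earlier. This is a generic non-dominated filter written under our own assumptions; it is not the selection operator of any particular algorithm cited here.

def dominates(a, b):
    # True if objective vector a Pareto-dominates b (all objectives to be minimised).
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(scores):
    # Indices of non-dominated individuals, given one objective vector per individual.
    return [i for i, a in enumerate(scores)
            if not any(dominates(b, a) for j, b in enumerate(scores) if j != i)]

# Example: members 0 and 2 are mutually non-dominated, member 1 is dominated by member 0.
# pareto_front([(0.10, -0.02), (0.12, -0.01), (0.08, -0.005)])  ->  [0, 2]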

A framework for the evolution of hybrid ensembles

Designing hybrid ensembles as a field of research is still in its infancy. There are many issues which one should consider while tackling this relatively new paradigm in ensemble research.

One reason for the lack of substantial literature in this area, or the relative inactivity, could be that hybrid ensembles, as the name implies, deal with the combination of base learners that are trained using different training methodologies (algorithms). Brute force or even a search
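As a minimal illustration of the idea, the sketch below assembles a hybrid ensemble from base learners trained with different methodologies (a multi-layer perceptron, a support vector machine and a decision tree) combined by majority voting, using scikit-learn. The particular learner types and hyperparameters are our own arbitrary choices, not those investigated in the paper.

from sklearn.ensemble import VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# A hybrid ensemble: members differ in their training methodology, not merely in
# their training data or initial conditions.
hybrid = VotingClassifier(
    estimators=[
        ("mlp", MLPClassifier(hidden_layer_sizes=(10,), max_iter=500)),
        ("svm", SVC(kernel="rbf")),
        ("tree", DecisionTreeClassifier(max_depth=5)),
    ],
    voting="hard",  # plain majority vote over member predictions
)
# Usage: hybrid.fit(X_train, y_train); predictions = hybrid.predict(X_test)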

Instantiation of the framework

Here we present our algorithm, DIVACE-II (a successor of DIVACE [16], [18]), which can be thought of as one instance of the framework. In DIVACE-II, we try to incorporate all the levels mentioned in the framework presented in the previous section. However, given the computationally intensive nature of evolutionary methods, one should limit the ensemble construction approach to as few levels as possible, depending on domain knowledge. We model all three levels in our algorithm
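Schematically, an instantiation that exerts evolutionary pressure at several levels of abstraction can be organised as in the outline below. This is only our schematic reading of such a multi-level setup (learner type, architecture/hyperparameters, trained parameters), not a faithful reproduction of DIVACE-II; all names are illustrative, and pareto_front refers to the earlier sketch.

import random

def random_member():
    # Illustrative multi-level representation of one ensemble member:
    # level 1: training methodology, level 2: architecture/hyperparameters,
    # level 3: trained parameters (filled in by the training step).
    return {"kind": random.choice(["mlp", "svm", "rbf"]),    # level 1
            "config": {"size": random.randint(2, 20)},       # level 2
            "params": None}                                   # level 3

def evolve(population, evaluate, generations=50):
    # Generic generational loop: score members on (accuracy, diversity),
    # keep the non-dominated set, and refill the population by varying survivors.
    for _ in range(generations):
        scores = evaluate(population)                  # e.g. train, then apply member_objectives
        survivors = [population[i] for i in pareto_front(scores)]
        while len(survivors) < len(population):
            child = dict(random.choice(survivors))     # copy a survivor
            child["config"] = {"size": max(2, child["config"]["size"] + random.randint(-2, 2))}
            survivors.append(child)                    # placeholder for crossover/mutation
        population = survivors
    return population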

Experimental results and discussion

DIVACE-II was tested on two benchmark datasets (the Australian credit card assessment dataset and the Diabetes dataset), available by anonymous ftp from ics.uci.edu in /pub/machine-learning-databases. Apart from the two versions of DIVACE-II discussed above, we also experimented with the multi-objective formulation. Pairwise Failure Crediting (PFC) [18] was recently proposed as a diversity measure which credits individuals in the ensemble with differences in the failure patterns, taking each pair of
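The description of PFC above is truncated; the sketch below encodes one plausible reading of it (each member is credited with the extent to which its failure pattern differs from that of every other member, averaged over pairs). The exact crediting scheme defined in [18] may differ; the function name and NumPy usage are our own.

import numpy as np

def pairwise_failure_credit(failures):
    # failures: boolean (n_members, n_samples) matrix; failures[i, k] is True
    # if member i misclassifies sample k. Each member is credited with the
    # fraction of samples, averaged over all pairs it belongs to, on which its
    # failure pattern differs from the other member's (an assumption; the exact
    # PFC crediting scheme in [18] may differ).
    n_members = failures.shape[0]
    credit = np.zeros(n_members)
    for i in range(n_members):
        for j in range(n_members):
            if i != j:
                credit[i] += np.mean(failures[i] != failures[j])
    return credit / (n_members - 1)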

Conclusion and directions for further research

All ensemble learning methods are essentially based on a very simple idea and strive towards the same goal: having diverse and accurate members, which help the ensemble to outperform single learners. Recently, we also proposed an algorithm called DIVACE [16], [18], the idea behind which was to enforce diversity and accuracy within the ensemble explicitly in a multi-objective evolutionary setup. This was found to be promising and so we tried to incorporate as much

Acknowledgements

This research was undertaken as part of the EPSRC funded project on Market-Based Control of Complex Computational Systems (GR/T10671/01). This is a collaborative project involving the Universities of Birmingham, Liverpool and Southampton and BAE Systems, BT and HP.

References (53)

  • C.M. Bishop, Neural Networks for Pattern Recognition (1995)
  • E. Boers, M. Borst, I. Sprinkhuizen-Kuyper, Evolving artificial neural networks using the “Baldwin effect”, Technical...
  • L. Breiman, Bagging predictors, Mach. Learning (1996)
  • L. Breiman, Bias, variance, and arcing classifiers, Technical Report 460, Statistics Department, University of...
  • G. Brown, Diversity in neural network ensembles, Ph.D. Thesis, School of Computer Science, University of Birmingham,...
  • A. Chandra, Evolutionary approach to tackling the trade-off between diversity and accuracy in neural network ensembles,...
  • A. Chandra, Evolutionary framework for the creation of diverse hybrid ensembles for better generalisation, Master's...
  • A. Chandra, H. Chen, X. Yao, Multi-objective machine learning, Trade-off between diversity and accuracy in ensemble...
  • A. Chandra, X. Yao, DIVACE: diverse and accurate ensemble learning algorithm, in: Proceedings of the Fifth International...
  • A. Chandra et al., Evolutionary framework for the construction of diverse hybrid ensembles
  • A. Chandra, X. Yao, Ensemble learning using multi-objective evolutionary algorithms, J. Math. Modelling Algorithms...
  • P.J. Darwen, X. Yao, Every niching method has its niche: fitness sharing and implicit sharing compared, in: Proceedings...
  • K. Deb, Multi-Objective Optimization Using Evolutionary Algorithms (2001)
  • T.G. Dietterich, Machine-learning research: four current directions, AI Mag. (1998)
  • V. Faber, Clustering and the continuous k-means algorithm
  • S. Forrest et al., Using genetic algorithms to explore pattern recognition in the immune system, Evol. Comput. (1993)

    Arjun Chandra is a Ph.D. student at the School of Computer Science, University of Birmingham, UK. He received his M.Sc. degree in Natural Computation from the University of Birmingham, UK, in December 2004 and his B.Tech. in Computer Science and Engineering from Dr. Ram Manohar Lohia Avadh University, Faizabad, India, in 2002. His research interests include multi-objective evolutionary algorithms, co-evolutionary learning, ensemble learning, probabilistic modeling and game theory.

    Xin Yao received the B.Sc. degree from the University of Science and Technology of China (USTC), Hefei, the M.Sc. degree from the North China Institute of Computing Technologies (NCI), Beijing, and the Ph.D. degree from the USTC, in 1982, 1985, and 1990, respectively, all in computer science.

He is currently a Professor of Computer Science and the Director of the Centre of Excellence for Research in Computational Intelligence and Applications (CERCIA), University of Birmingham, U.K. He is also a Distinguished Visiting Professor of USTC and a Cheung Kong Scholar awarded by the Ministry of Education of China. He was a Lecturer, Senior Lecturer, and an Associate Professor at University College, the University of New South Wales, the Australian Defence Force Academy (ADFA), Canberra, Australia, between 1992 and 1999. He held Postdoctoral Fellowships from the Australian National University (ANU), Canberra, and the Commonwealth Scientific and Industrial Research Organization (CSIRO), Melbourne, between 1990 and 1992. His major research interests include evolutionary computation, neural network ensembles, global optimization, computational time complexity, and data mining.

    He is an IEEE Fellow, the Editor-in-Chief of IEEE Transactions on Evolutionary Computation and the recipient of the 2001 IEEE Donald G. Fink Prize Paper Award. He has given more than 35 invited keynote and plenary speeches at various conferences worldwide. He has more than 200 publications in evolutionary computation and computational intelligence.
