Evolving hybrid ensembles of learning machines for better generalisation
Introduction
One of the main issues in machine learning research is generalisation. Generalisation refers to the prediction ability of a base learner (or learning machine): the better a predictor performs on unseen data, the better it is said to generalise. The ‘bias–variance dilemma’ is a theoretical result that illustrates the importance of generalisation in machine learning research.
An ensemble or a committee of learning machines has been shown to outperform (generalise better than) single learners both theoretically and empirically in many cases [12]. Tumer and Ghosh [46] present the formal proof of this. According to Dietterich [21], ensembles form one of the main research directions as far as machine learning research is concerned. Brown et al. [12] give an extensive survey of different ensemble methods. A theoretical account of why ensembles perform better than single learners is also presented in [12].
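The variance-reduction effect underlying these results can be illustrated with a toy Monte Carlo sketch (illustrative only, not from the paper; all constants are arbitrary): averaging M unbiased estimators with independent errors cuts the error variance by roughly a factor of M.

```python
import random

random.seed(0)

TRUE_VALUE = 1.0
M = 25            # ensemble size
TRIALS = 2000     # Monte Carlo repetitions
NOISE = 0.5       # std of each member's independent error

def member_prediction():
    # one base learner: an unbiased but noisy estimate of the target
    return TRUE_VALUE + random.gauss(0.0, NOISE)

def mse(preds):
    return sum((p - TRUE_VALUE) ** 2 for p in preds) / len(preds)

single = [member_prediction() for _ in range(TRIALS)]
ensemble = [sum(member_prediction() for _ in range(M)) / M
            for _ in range(TRIALS)]

print(mse(single))    # ~ NOISE**2     = 0.25
print(mse(ensemble))  # ~ NOISE**2 / M = 0.01
```

The sketch assumes uncorrelated member errors; when errors are correlated the gain shrinks, which is precisely why diversity among members matters.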
Although ensembles perform better than their members in many cases, constructing them is not an easy task. As Dietterich [21] points out, the key to successful ensemble methods is to construct base models (individual predictors) which individually perform better than random guessing and which are at least somewhat uncorrelated in the errors they make on the training set. This statement essentially says that, for an ensemble to work properly, it should comprise a diverse and accurate set of predictors (a point also made in one of the seminal works on diversity in classifier ensembles, by Hansen and Salamon [28]). Krogh and Vedelsby [32] formally show that an ideal ensemble is one consisting of highly accurate predictors which at the same time disagree as much as possible (i.e. exhibit substantial diversity amongst members). This has also been tested and empirically verified [38], [39]. Thus, diversity and accuracy are two key issues that should be addressed when constructing ensembles; there exists a trade-off between the two, as noted in [48].
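The diversity–accuracy trade-off is made precise by the ambiguity decomposition of Krogh and Vedelsby [32]; for a convex-combination regression ensemble (weights summing to one) with target d, it can be written as:

```latex
% Ambiguity decomposition (Krogh and Vedelsby):
% ensemble error = weighted member error - weighted member disagreement
\left(\bar{f}(x) - d\right)^2
  = \sum_i w_i \left(f_i(x) - d\right)^2
  - \sum_i w_i \left(f_i(x) - \bar{f}(x)\right)^2,
\qquad \bar{f}(x) = \sum_i w_i f_i(x)
```

Since the second (ambiguity) term is non-negative, the ensemble error never exceeds the weighted average member error, and increasing disagreement without hurting individual accuracy directly lowers the ensemble error.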
Given that ensembles generalise better than a single predictor, ensemble research has become an active research area and has in recent years seen an influx of researchers proposing a myriad of algorithms to improve the prediction ability of such aggregate systems. Brown et al. [12] give a taxonomy of methods for creating diversity. Yates and Partridge [53] show that there exists a hierarchical relationship between the ways in which diversity has been, and can be, enforced while creating ensembles, with each method having its own diversity-generating potential. Additionally, Dietterich [21] states that one promising and open research area is that of combining ensemble learning methods to give rise to new ensemble learning algorithms. Moreover, incorporating evolution into segments of machine learning has been a widely studied area (e.g. evolving neural networks [8], [45], [52], evolving neural network ensembles [35]), with evolution treated as another fundamental form of adaptation in addition to learning [52]. Evolution makes neural systems adapt to a dynamic environment effectively and efficiently [52].
A more recent approach to ensemble learning views learning as a multi-objective optimisation problem [3]. We have proposed an algorithm called DIVACE (DIVerse and Accurate Ensemble Learning Algorithm) [16], [18] which combines key ideas from Negative Correlation Learning (NCL) [34] and the Memetic Pareto Artificial Neural Network (MPANN) [1], [3] algorithm, and formulates the ensemble learning problem as a multi-objective problem explicitly within an evolutionary setup, aiming at finding a good trade-off between diversity and accuracy. One very strong motivation for the use of evolutionary multi-criterion optimisation in the creation of an ensemble, in both DIVACE [16], [18] and MPANN [3], is that due to the presence of multiple conflicting objectives, the evolutionary approach engenders a set of near-optimal solutions. The presence of more than one optimal solution indicates that, by using multi-objectivity while creating ensembles, one can automatically generate an ensemble whose member networks are themselves near optimal [13], [16].
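The core mechanism of keeping only non-dominated trade-offs between accuracy and diversity can be sketched as follows (a minimal illustration of Pareto filtering; the objective values and names are hypothetical, not the actual DIVACE objectives):

```python
# Each candidate network is scored on two objectives:
# training error (to minimise) and a diversity score (to maximise).
# Only Pareto non-dominated candidates survive the selection step.

def dominates(a, b):
    """True iff candidate a dominates b: no worse on both objectives
    and strictly better on at least one. Candidates are tuples
    (error, diversity); lower error and higher diversity are better."""
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def pareto_front(candidates):
    # keep every candidate that no other candidate dominates
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o is not c)]

candidates = [
    (0.10, 0.2),  # accurate but not diverse
    (0.30, 0.9),  # diverse but inaccurate
    (0.15, 0.6),  # a good trade-off
    (0.35, 0.5),  # dominated by (0.15, 0.6)
]
print(pareto_front(candidates))  # the first three survive
```

Because no single candidate wins on both objectives, the front itself is the ensemble: a set of mutually non-dominated, near-optimal members.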
We recently attempted to integrate the aforementioned ideas into a co-evolutionary framework [17] with a view to synthesising new evolutionary ensemble learning algorithms, stressing the fact that multi-objective evolutionary optimisation is a useful ensemble construction technique. DIVACE [16], [18], as an idea, gives us a perfect base on which to develop a framework wherein various ensemble methods can be combined and diversity enforcement can be tackled at multiple levels of abstraction.
The evolutionary framework gives a simple yet effective means for the development of new ensemble learning algorithms. Extending [17], we describe this framework in greater detail in the sections to follow, covering the ideas that led to it and the problems it addresses. We also show how this framework can be instantiated and present two resulting algorithms (one of which is from [17]) which essentially differ in their replacement schemes. These algorithms are empirically shown to be very promising. The new results presented in this paper further demonstrate the effectiveness of the framework.
Section snippets
Ensembles tackling the bias–variance dilemma and the trade-off between diversity and accuracy
In order for a learned machine/predictor to exhibit good input–output mappings for unseen data, the predictor should ‘know’ the exact mapping function it is trying to model. In practice, however, this is not possible, i.e. a predictor cannot learn the exact function it wants to learn, mainly due to the dearth of data available for training and to noise in these data. The idea behind training is simply to inculcate a statistical model (a statistical oracle) within a base
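The decomposition at issue can be demonstrated with a minimal Monte Carlo sketch (illustrative only; the shrinkage ‘learner’ is hypothetical): the expected squared error of an estimator splits exactly into squared bias plus variance.

```python
import random

random.seed(1)

# Bias-variance decomposition: expected squared error = bias^2 + variance.
# The toy "learner" estimates a target value TRUE from N noisy samples,
# but is deliberately biased by shrinking its estimate towards zero
# (shrinkage raises bias while lowering variance).

TRUE = 2.0
NOISE = 1.0     # std of the observation noise
N = 10          # training-set size per trial
SHRINK = 0.8    # shrinkage factor: introduces bias, reduces variance
TRIALS = 20000

estimates = []
for _ in range(TRIALS):
    sample = [TRUE + random.gauss(0.0, NOISE) for _ in range(N)]
    estimates.append(SHRINK * sum(sample) / N)

mean_est = sum(estimates) / TRIALS
bias_sq = (mean_est - TRUE) ** 2
variance = sum((e - mean_est) ** 2 for e in estimates) / TRIALS
expected_sq_error = sum((e - TRUE) ** 2 for e in estimates) / TRIALS

# bias^2 ~ (0.2 * TRUE)^2 = 0.16, variance ~ SHRINK^2 * NOISE^2 / N = 0.064
print(bias_sq, variance, expected_sq_error)
```

The identity holds exactly over the sampled estimates, which makes the trade-off concrete: stronger shrinkage lowers the variance term at the cost of a larger bias term.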
Multi-objective evolution for ensembles
Multi-objectivity in ensembles, as an area of research, has not been explored extensively yet. To our knowledge, the idea of designing neural networks within a multi-objective setup was first considered by Kottathra and Attikiouzel [31], who used a branch and bound method to determine the number of hidden neurons (the second objective being the mean square error) in feedforward neural networks. Recently, Abbass [4] proposed an evolutionary multi-objective neural network
A framework for the evolution of hybrid ensembles
Designing hybrid ensembles as a field of research is still in its infancy. There are many issues which one should consider while tackling this relatively new paradigm in ensemble research.
One reason for the lack of substantial literature in this area, or the relative inactivity, could be attributed to the fact that hybrid ensembles, as the name implies, deal with the combination of base learners that are trained using different training methodologies (algorithms). Brute force or even a search
Instantiation of the framework
Here we present our algorithm, DIVACE-II (a successor of DIVACE [16], [18]), which can be thought of as one instance of the framework. In DIVACE-II, we try to incorporate all the levels mentioned in the framework presented in the previous section. However, one should limit the ensemble construction approach to as few levels as possible, depending on domain knowledge, due to the computationally intensive nature of evolutionary methods. We model all three levels in our algorithm
Experimental results and discussion
DIVACE-II was tested on two benchmark datasets (the Australian credit card assessment dataset and the Diabetes dataset), available by anonymous ftp from ics.uci.edu in /pub/machine-learning-databases. Apart from the two versions of DIVACE-II discussed above, we also experimented with the multi-objective formulation. Pairwise Failure Crediting (PFC) [18] was recently proposed as a diversity measure which credits individuals in the ensemble with differences in the failure patterns, taking each pair of
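The pairwise intuition behind PFC can be sketched as follows (an illustrative reading of pairwise failure crediting, not the exact formula from [18]): a member earns credit whenever its failure pattern differs from a teammate's, averaged over all pairs it belongs to.

```python
# Illustrative sketch of a pairwise failure-crediting diversity score.
# failures[i][k] == 1 iff member i misclassifies training example k.

def pairwise_failure_credit(failures):
    """Return one credit per member: the fraction of examples on which
    the member's failure pattern differs from a teammate's, averaged
    over all pairs containing that member."""
    m, n = len(failures), len(failures[0])
    credits = []
    for i in range(m):
        total = 0.0
        for j in range(m):
            if j == i:
                continue
            # count examples on which exactly one of the pair fails
            differing = sum(
                1 for k in range(n) if failures[i][k] != failures[j][k]
            )
            total += differing / n
        credits.append(total / (m - 1))
    return credits

# Three members, five examples: member 2 fails on different examples
# from the other two, so it receives the highest diversity credit.
failures = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
print(pairwise_failure_credit(failures))
```

Members with identical failure patterns (members 0 and 1 above) credit each other nothing, so such a score rewards exactly the kind of disagreement the diversity objective is meant to encourage.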
Conclusion and directions for further research
All ensemble learning methods are essentially based on a very simple idea and strive towards the same goal: having diverse and accurate members, which help the ensemble outperform single learners. Recently, we also proposed an algorithm called DIVACE [16], [18], the idea behind which was to enforce diversity and accuracy explicitly within a multi-objective evolutionary setup. This was found to be promising, and so we tried to incorporate as much
Acknowledgements
This research was undertaken as part of the EPSRC funded project on Market-Based Control of Complex Computational Systems (GR/T10671/01). This is a collaborative project involving the Universities of Birmingham, Liverpool and Southampton and BAE Systems, BT and HP.
Arjun Chandra is a Ph.D. student at the School of Computer Science, University of Birmingham, UK. He received his M.Sc. degree in Natural Computation from the University of Birmingham, UK, in December 2004 and his B.Tech. in Computer Science and Engineering from Dr. Ram Manohar Lohia Avadh University, Faizabad, India, in 2002. His research interests include multi-objective evolutionary algorithms, co-evolutionary learning, ensemble learning, probabilistic modeling and game theory.
References (53)
- Brown et al., Diversity creation methods: a survey and categorisation, J. Inf. Fusion (2005)
- Kottathra and Attikiouzel, A novel multicriteria optimization algorithm for the structure determination of multilayer feedforward neural networks, J. Network Comput. Appl. (1996)
- et al., Ensemble learning via negative correlation, Neural Networks (1999)
- Tumer and Ghosh, Analysis of decision boundaries in linearly combined neural classifiers, Pattern Recognition (1996)
- A memetic pareto evolutionary approach to artificial neural networks
- Pareto neuro-ensemble
- Pareto neuro-evolution: constructing ensemble of neural networks using multi-objective optimization
- Speeding up backpropagation using multiobjective evolutionary algorithms, Neural Comput. (2003)
- et al., PDE: a pareto-frontier differential evolution approach for multi-objective optimization problems
- et al., An empirical comparison of voting classification algorithms: bagging, boosting, and variants, Mach. Learning (1999)
- Neural Networks for Pattern Recognition
- Bagging predictors, Mach. Learning
- Evolutionary framework for the construction of diverse hybrid ensembles
- Multi-Objective Optimization Using Evolutionary Algorithms
- Dietterich, Machine-learning research: four current directions, AI Mag.
- Clustering and the continuous k-means algorithm
- Using genetic algorithms to explore pattern recognition in the immune system, Evol. Comput.
Xin Yao received the B.Sc. degree from the University of Science and Technology of China (USTC), Hefei, the M.Sc. degree from the North China Institute of Computing Technologies (NCI), Beijing, and the Ph.D. degree from the USTC, in 1982, 1985, and 1990, respectively, all in computer science.
He is currently a Professor of Computer Science and the Director of the Centre of Excellence for Research in Computational Intelligence and Applications (CERCIA), University of Birmingham, U.K. He is also a Distinguished Visiting Professor of USTC and a Cheung Scholar awarded by the Ministry of Education of China. He was a Lecturer, Senior Lecturer, and an Associate Professor at University College, University of New South Wales, the Australian Defence Force Academy (ADFA), Canberra, Australia, between 1992 and 1999. He held Postdoctoral Fellowships from the Australian National University (ANU), Canberra, and the Commonwealth Scientific and Industrial Research Organization (CSIRO), Melbourne, between 1990 and 1992. His major research interests include evolutionary computation, neural network ensembles, global optimization, computational time complexity, and data mining.
He is an IEEE Fellow, the Editor-in-Chief of IEEE Transactions on Evolutionary Computation and the recipient of the 2001 IEEE Donald G. Fink Prize Paper Award. He has given more than 35 invited keynote and plenary speeches at various conferences worldwide. He has more than 200 publications in evolutionary computation and computational intelligence.