shareengineer: DATA WAREHOUSING AND MINING ENGINEERING LECTURE NOTES--Ensemble Learning and Model Selection

Wednesday, September 26, 2012

Ensemble Learning and Model Selection:

Ensemble learning is the process by which multiple models, such as classifiers or experts, are strategically generated and combined to solve a particular computational intelligence problem. It is used primarily to improve the performance of a model (in classification, prediction, function approximation, etc.) or to reduce the likelihood of an unfortunate selection of a poor one. Other applications of ensemble learning include assigning a confidence to the decision made by the model, selecting optimal (or near-optimal) features, data fusion, incremental learning, nonstationary learning, and error correcting. This article focuses on classification-related applications of ensemble learning; however, all the principal ideas described below can be easily generalized to function approximation or prediction problems as well.

Model selection is perhaps the primary reason why ensemble-based systems are used in practice: what is the most appropriate classifier for a given classification problem? This question can be interpreted in two different ways: i) what type of classifier should be chosen among many competing models, such as the multilayer perceptron (MLP), support vector machines (SVM), decision trees, the naive Bayes classifier, etc.; ii) given a particular classification algorithm, which realization of this algorithm should be chosen. For example, different initializations of MLPs can give rise to different decision boundaries, even if all other parameters are kept constant. The most commonly used procedure, choosing the classifier with the smallest error on the training data, is unfortunately a flawed one. Performance on a training dataset, even when computed using a cross-validation approach, can be misleading as an indicator of classification performance on previously unseen data. Then, of all the (possibly infinitely many) classifiers that may have the same training performance, or even the same (pseudo) generalization performance as computed on validation data (a part of the training data held out for evaluating classifier performance), which one should be chosen? Everything else being equal, one may be tempted to choose at random, but with that decision comes the risk of choosing a particularly poor model. Using an ensemble of such models instead of choosing just one, and combining their outputs (for example, by simply averaging them), can reduce the risk of an unfortunate selection of a particularly poorly performing classifier. It is important to emphasize that there is no guarantee that the combination of multiple classifiers will always perform better than the best individual classifier in the ensemble. Nor can an improvement over the ensemble's average performance be guaranteed, except in certain special cases.
Hence, combining classifiers may not necessarily beat the performance of the best classifier in the ensemble, but it certainly reduces the overall risk of making a particularly poor selection.
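The "simply averaging" combination mentioned above can be sketched as follows. This is only a minimal illustration, not a method prescribed by these notes: the function names and the three score vectors are made up for the example, and each model is assumed to output one score per class for the same instance.

```python
def average_scores(per_model_scores):
    """Average the class-score vectors produced by several models for one instance."""
    n_models = len(per_model_scores)
    n_classes = len(per_model_scores[0])
    return [
        sum(scores[c] for scores in per_model_scores) / n_models
        for c in range(n_classes)
    ]


def predict_by_averaging(per_model_scores):
    """Return the index of the class with the highest averaged score."""
    avg = average_scores(per_model_scores)
    return max(range(len(avg)), key=lambda c: avg[c])


# Hypothetical outputs of three models for a single instance (scores for classes 0 and 1).
# The third model, taken alone, would pick class 1.
scores = [
    [0.9, 0.1],
    [0.6, 0.4],
    [0.2, 0.8],
]
averaged = average_scores(scores)
decision = predict_by_averaging(scores)
```

In this toy case the averaged scores favor class 0, so the decision does not hinge on which of the three models happened to be selected. As the text stresses, this lowers the risk of a poor selection but does not guarantee beating the best individual model.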

Figure 1: Combining an ensemble of classifiers for reducing classification error and/or model selection.

For this process to be effective, the individual experts must exhibit some level of diversity among themselves, as described in more detail later in this article. In the classification context, diversity among the classifiers, typically achieved by using different training parameters for each classifier, allows the individual classifiers to generate different decision boundaries. If proper diversity is achieved, each classifier makes different errors, and a strategic combination of the classifiers can then reduce the total error. Figure 1 illustrates this concept: each classifier, trained on a different subset of the available training data, makes different errors (shown as instances with dark borders), but the combination of the three classifiers provides the best decision boundary.
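The effect of diversity can be illustrated with a small sketch. It assumes majority voting as the combination rule, which is one common choice and not necessarily the one used in Figure 1. The label lists below are invented for illustration: each of the three hypothetical classifiers is wrong on a different instance.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists (all the same length) by majority vote.

    Ties, which cannot occur with three voters and two classes, would be
    resolved in favor of the label encountered first.
    """
    combined = []
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def error_count(predicted, truth):
    """Number of instances on which the predicted label differs from the true label."""
    return sum(p != t for p, t in zip(predicted, truth))


# Made-up true labels and classifier outputs for six instances.
truth = [0, 0, 1, 1, 0, 1]
clf_a = [1, 0, 1, 1, 0, 1]  # wrong on instance 0
clf_b = [0, 0, 0, 1, 0, 1]  # wrong on instance 2
clf_c = [0, 0, 1, 1, 1, 1]  # wrong on instance 4

ensemble = majority_vote([clf_a, clf_b, clf_c])
individual_errors = [error_count(c, truth) for c in (clf_a, clf_b, clf_c)]
ensemble_errors = error_count(ensemble, truth)
```

Because no two classifiers err on the same instance, every mistake is outvoted by the other two. If the classifiers made identical errors instead, voting would reproduce those errors, which is why diversity matters.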
