Neural Computing and Applications (2021) 33:11661–11672
https://doi.org/10.1007/s00521-021-05881-3

ORIGINAL ARTICLE

An advance ensemble classification for object recognition

Ebenezer Owusu¹ · Isaac Wiafe¹

Received: 24 December 2019 / Accepted: 22 February 2021 / Published online: 11 March 2021
© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2021

Abstract
The quest to improve performance accuracy and prediction speed in machine learning algorithms cannot be overemphasized, as the need for machines to outperform humans continues to grow. Accordingly, several studies have proposed methods to improve prediction performance and speed, particularly for spatio-temporal analysis. This study proposes a novel classifier that leverages ensemble techniques to improve prediction performance and speed. The proposed classifier, Ada-AdaSVM, uses an AdaBoost feature selection algorithm to select a small set of features from the input datasets for a joint support vector machine (SVM)–AdaBoost classifier. The proposition is evaluated against a selection of existing classifiers (SVM, AdaSVM and AdaBoost) using the JAFFE, Yale, Taiwanese Facial Expression Image Database (TFEID) and CK+48 datasets, with Haar features as the preferred method for feature extraction. The findings indicate that Ada-AdaSVM outperforms the SVM, AdaSVM and AdaBoost classifiers in terms of speed and accuracy.

Keywords: SVM · AdaBoost · AdaSVM · Ada-AdaSVM

1 Introduction

Existing datasets that are used for model training are characterized by class imbalances (with some target classes containing fewer examples), noise, under-sampling and over-sampling. To address these challenges, researchers continue to propose newer algorithms that outperform existing ones. For instance, recent studies have demonstrated that distributed bandit-based algorithms reduce network data and are also capable of finding clusters faster, with higher accuracies, in recommender systems [1]. Others have demonstrated that performance accuracy can be improved on imbalanced datasets by applying the Synthetic Minority Over-sampling Technique (SMOTE) with boosting procedures [2], particularly by transforming the selection problem in SMOTE into a multi-objective optimization problem [3]. Similarly, others [4–6] have described procedures that can be adopted to improve prediction accuracy. It has been demonstrated that ensembles provide a promising solution for training more accurate classification models. This is because, in ensembles, multiple classifiers are used to train the model, which allows each classifier to make use of the others' strengths, producing a robust model that is more accurate [4]. Currently, the two main methods of ensemble construction are based on boosting and bagging. Although some studies have advocated bagging as the preferred choice [7], others have argued that in cases where datasets have less noise, boosting outperforms bagging [8]. For instance, adaptive boosting (AdaBoost) generalizes well because its margins can be enlarged, and it thus performs better. AdaBoost creates a collection of classifiers by maintaining a set of weights over the training data and adaptively adjusting these weights during each boosting iteration.

Amidst all this success, the need to achieve higher levels of classification accuracy remains. This is because the fundamental trade-off between accuracy and classification speed serves as an impediment in classification model training. As such, domains that benefit from existing machine learning algorithms continue to propose methods that seek to improve prediction accuracy. A domain that benefits greatly from classification algorithms and improved modeling performance is object recognition.
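The weight-maintenance scheme just described can be made concrete with a short sketch. The following is a minimal, self-contained illustration of AdaBoost with threshold-stump weak learners on 1-D data, not the authors' implementation; the function names (`adaboost`, `predict`) and the toy data are assumptions for illustration only.

```python
import numpy as np

def adaboost(X, y, n_rounds=10):
    """Minimal AdaBoost sketch: maintain a weight D per training example
    and adaptively re-weight after each boosting round."""
    n = len(X)
    D = np.full(n, 1.0 / n)          # start with uniform weights
    ensemble = []                    # list of (alpha, threshold, polarity)
    for _ in range(n_rounds):
        # exhaustively pick the threshold stump with lowest weighted error
        best = None
        for thr in np.unique(X):
            for pol in (1, -1):
                pred = np.where(pol * (X - thr) >= 0, 1, -1)
                err = D[pred != y].sum()
                if best is None or err < best[0]:
                    best = (err, thr, pol, pred)
        err, thr, pol, pred = best
        err = max(err, 1e-10)        # guard against division by zero
        alpha = 0.5 * np.log((1 - err) / err)
        # re-weighting step: misclassified examples gain weight
        D *= np.exp(-alpha * y * pred)
        D /= D.sum()
        ensemble.append((alpha, thr, pol))
    return ensemble

def predict(ensemble, X):
    """Weighted vote of all weak stumps."""
    score = sum(a * np.where(p * (X - t) >= 0, 1, -1)
                for a, t, p in ensemble)
    return np.sign(score)
```

On a separable toy set such as `X = [0, 1, 2, 3]`, `y = [-1, -1, 1, 1]`, the ensemble reproduces the labels after a few rounds; the point of the sketch is the two weight operations (`D *= exp(...)` then renormalization), which are the adaptive adjustment described above.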
Corresponding author: Isaac Wiafe, iwiafe@ug.edu.gh
¹ Department of Computer Science, University of Ghana, Accra, Ghana

Object recognition is relevant and has potential in other areas of study, including facial recognition and expression analysis, fingerprint recognition and retina recognition. However, the domain inherits generic machine learning model training challenges. As such, the ability to classify or detect objects more accurately with less computational complexity continues to be one of the main challenges in the domain. This challenge is more visible in image recognition because image datasets are characterized by several issues, including pigmentation, aspect size and resolution, that compound the classification problem. In this regard, this study explores the strengths of using ensemble techniques with the support vector machine (SVM) and AdaBoost to improve classification accuracy without compromising classification speed. Our motivation for using SVM and AdaBoost is derived from studies conducted by Owusu et al. [9], which demonstrated that both model performance and speed can be improved by using AdaBoost feature extraction and SVM to train a classifier. In this study, it is hypothesized that model performance and training speed can be improved further by using AdaBoost for feature extraction and a fusion of a multiclass support vector machine and AdaBoost as a hybrid classifier.

In the next section, a discussion of relevant literature on algorithms for object recognition, classification and ensembles is presented. This is followed by a description of the datasets and the pre-processing stages. The proposed technique (Ada-AdaSVM) is then discussed. The model evaluation and performance measures, discussions and conclusions are then presented.

2 Related work

2.1 Algorithms for object recognition and classification

Machine learning methods and algorithms have become more widespread and more successful in recent times because computing hardware costs and processing times have reduced exponentially. Therefore, training complex algorithms is now relatively easier and faster. As mentioned earlier, the relevance of object recognition and image classification cannot be overemphasized. Accordingly, several studies have proposed methods that have proven effective, with higher accuracies, for classifying or recognizing images. Methods that leverage deep learning algorithms, Bayesian networks, decision trees, k-nearest neighbors and support vector machines, among others, have all been used in image classification and recognition research. Deep learning is effective in image classification because deep models are capable of extracting the appropriate features and discriminating at the same time [10]. They are broadly classified into four main categories: Restricted Boltzmann Machines (RBMs), Convolutional Neural Networks (CNNs), Autoencoders and Sparse Coding. Each method is adopted or applied based on needs or expectations. They require less statistical training and can detect complex nonlinear relationships between dependent and independent variables. Yet, they produce multiple solutions associated with local minima, and hence are not considered robust over different samples [11], nor are they general-purpose algorithms, since they require huge amounts of data to train. They operate as a "black box" and thus carry a greater computational burden and are also prone to overfitting [12]. Deep learning algorithms are also outperformed by tree ensembles on even typical machine learning problems. Deep learning methods have improved over the years [13], although their theoretical underpinnings are not well understood. Also, due to the complexity involved in training deep learning models, they are considered inappropriate for applications that run in "real time". Nonetheless, they are considered among the most successful spatial techniques available [9].

2.2 SVM and AdaBoost techniques

Adaptive boosting (AdaBoost) is a spatial learning algorithm that selects a restricted number of weak classifiers from a large weak-classifier hypothesis space to construct a strong classifier. The boosting process attempts to improve the model by focusing on where it is not learning properly. It is effective for reducing bias error. However, it has been argued that although it performs better than bagging, it can overfit [14]; yet some studies have proven otherwise [15]. In boosting, parameters are tuned, and the algorithms are restricted to reducing the multiclass classification problem to multiple two-class problems [16–18].

SVMs, on the other hand, provide good out-of-sample generalization. They can be made robust by choosing appropriate generalization grades in the presence of biases among training samples. SVMs can be considered maximal-margin hyperplane classification techniques that depend on the outcomes of statistical learning theories to certify good generalization performance [19]. Yet AdaBoost is considered better at generalization, and this makes it preferable for many classification problems, although it is sensitive to noisy data and outliers. Some studies have argued that SVM is superior in recognition accuracy when compared to AdaBoost; nevertheless, the latter is faster in terms of model training speed [20]. Generally, there is a trade-off between performance (in terms of accuracy) and model complexity (in terms of training speed). To address this, different classification techniques are integrated to enhance performance and possibly training speed. As mentioned earlier, recent studies conducted by Owusu et al. [9] demonstrated that SVM and AdaBoost can be ensembled to improve model performance and speed.
2.3 Research contribution

As discussed in the previous section, AdaBoost performs better in terms of model training speed, yet worse in accuracy, when compared to SVM. Thus, this study seeks to (1) use AdaBoost as an initial feature selection algorithm and (2) amalgamate AdaBoost and SVM (AdaSVM) to form a classifier. It is therefore expected that the resultant classifier will perform better in terms of both speed and accuracy.

2.4 Ensemble techniques

Research on the application of ensembles in machine learning and artificial intelligence continues to advance as the need to discover newer approaches with competitive levels of accuracy and speed, and lower computational complexity, becomes prominent. Several studies have focused on variations of SVM and AdaBoost for model performance. Some researchers have argued that, due to the complexities in SVM, an AdaBoost–SVM ensemble may not be viable [21]. Nonetheless, studies conducted by Owusu et al. [20] and Li et al. [21] have experimented with the possibilities available when SVM is ensembled with AdaBoost. Studies have concluded that AdaBoostSVM performs better when SVMs are used as component classifiers [22] and that it can improve classification performance on imbalanced data.

A key objective in the use of ensembles is their ability to balance bias (underfitting) and variance (overfitting). Ensembles employ the power of diversity to enhance training performance. However, evaluating the resultant models from ensembles is more challenging and demands more computation than for single models. That notwithstanding, the computational effort can be argued to be worthwhile, as ensembles perform better than single models. Fast algorithms (such as decision trees) benefit more from ensembling than slower algorithms. Ensembles perform better in situations where there is significant diversity between the individual base models, and they are effective in capturing both linear and nonlinear patterns. Accordingly, this study adopts an ensemble technique to train an object recognition model. The next section is a discussion of the proposed model.

3 The proposed method

The proposed procedure outperforms existing algorithms. It departs from a simple integration of data pre-processing with an unchanged bootstrap sampling technique. Compared to standard bootstrap sampling, the probability of drawing different types of examples is changed: the sampling is focused on the minority class, specifically on examples located in challenging sub-regions of the minority class. This probability depends on the neighborhood class distribution of the sample, as suggested.

The first part of the proposed method is an AdaBoost-based feature selection tool that selects a series of uncomplicated classifiers to form a strong classifier. This procedure speeds up detection and avoids misclassification. At each stage, feature extraction filters such as Gabor [20] and Haar [23] are treated as weak classifiers. This ensures that the AdaBoost feature selection algorithm selects the finest of the classifiers and boosts the weights on the examples, to weigh the errors further. Afterwards, a subsequent filter is selected to provide the most appropriate result on the errors of the preceding filter. At each step, the selected filter is uncorrelated with the output of the preceding filter. After a cycle of training, the samples are re-weighted, and the weight of the misclassified samples is accentuated. The diverse weighted weak learners are merged to produce a strong feature selection hypothesis (see Fig. 1).

The second part of the technique is the classification. This involves the fusion of the multiclass SVM and AdaBoost to form a hybrid classifier. For this optimization problem, an SVM with the radial basis function kernel (RBF SVM) is used as the weak classifier. This weak SVM classifier is trained to produce the optimum Gaussian values for the scale (kernel width) parameter and the regularization parameter. This method was adopted from [8, 18], and the decision hyperplane k is represented by the equation:

k = \frac{\sum_{\forall i \in \phi_{(+)}} D(i)\cdot\phi_{(+)}(l,v,z)}{\left|\sum_{\forall i \in \phi_{(+)}} D(i)\cdot\phi_{(+)}(l,v,z)\right|} + \frac{\sum_{\forall i \in \phi_{(-)}} D(i)\cdot\phi_{(-)}(l,v,z)}{\left|\sum_{\forall i \in \phi_{(-)}} D(i)\cdot\phi_{(-)}(l,v,z)\right|}   (1)

where i ∈ (1, 2, …, N) indexes the extracted features of the Gabor or Haar filters. An image I is represented as U_i = {(x_n, y_n)}_{n=1}^{N}, and z, l, v represent the parameters of the image. The positive set φ_(+) and the negative set φ_(−) are denoted by φ_(+) = {(x_n, y_n)}_{n=1}^{N_J}
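The two-stage design described above (AdaBoost for feature selection, then an SVM-based classifier on the reduced feature set) can be approximated with off-the-shelf components. The sketch below is an assumption-laden stand-in, not the authors' Ada-AdaSVM: the digits dataset replaces the facial-expression images, raw pixels replace Haar/Gabor responses, and keeping the top 24 features by AdaBoost importance is an arbitrary illustrative choice.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Digits stand in for face images; pixel values stand in for filter responses.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

pipe = Pipeline([
    # Stage 1: AdaBoost ranks features; keep only the most discriminative.
    # threshold=-np.inf disables the importance cutoff so exactly
    # max_features features are retained.
    ("select", SelectFromModel(
        AdaBoostClassifier(n_estimators=50, random_state=0),
        max_features=24, threshold=-np.inf)),
    # Stage 2: multiclass RBF SVM trained on the reduced feature set.
    ("svm", SVC(kernel="rbf", gamma="scale")),
])
pipe.fit(X_tr, y_tr)
print(round(pipe.score(X_te, y_te), 3))
```

The design rationale mirrors the paper's: the boosting stage prunes the feature space before the comparatively expensive SVM is trained, trading a cheap pre-pass for faster classifier training on far fewer inputs.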