Hindawi Applied Computational Intelligence and Soft Computing
Volume 2021, Article ID 6672578, 13 pages
https://doi.org/10.1155/2021/6672578

Research Article
Analysis and Implementation of Optimization Techniques for Facial Recognition

Justice Kwame Appati,1 Huzaifa Abu,1 Ebenezer Owusu,1 and Kwaku Darkwah2
1Department of Computer Science, University of Ghana, Accra, Ghana
2Department of Mathematics, Kwame Nkrumah University of Science and Technology, Kumasi, Ghana

Correspondence should be addressed to Ebenezer Owusu; ebeowusu@ug.edu.gh

Received 1 November 2020; Revised 25 February 2021; Accepted 5 March 2021; Published 12 March 2021

Academic Editor: Cheng-Jian Lin

Copyright © 2021 Justice Kwame Appati et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Amidst the wide spectrum of recognition methods proposed, there is still the challenge of these algorithms not yielding optimal accuracy against illumination, pose, and facial expression. In recent years, considerable attention has been on the use of swarm intelligence methods to help resolve some of these persistent issues. In this study, the principal component analysis (PCA) method with the inherent property of dimensionality reduction was adopted for feature selection. The resultant features were optimized using the particle swarm optimization (PSO) algorithm. For the purpose of performance comparison, the resultant features were also optimized with the genetic algorithm (GA) and the artificial bee colony (ABC). The optimized features were used for recognition with the Euclidean distance (EUD), K-nearest neighbor (KNN), and support vector machine (SVM) classifiers. Experimental results of these hybrid models on the ORL dataset reveal an accuracy of 99.25% for PSO with KNN, followed by ABC with 93.72% and GA with 87.50%.
On the contrary, experimentation with PSO, GA, and ABC on the YaleB dataset results in 100% accuracy, demonstrating their efficiency over state-of-the-art methods.

1. Introduction

Automated biometric recognition is fast gaining recognition as the most trusted security system of the 21st century. This is perhaps attributed to the recent significant advances in parallel processing techniques and also to the search for more reliable security systems occasioned by the sharp increase in crimes worldwide. The earliest biometric features that were automated for recognition include fingerprints, where the unique ridge skin patterns are utilized. Others include the retina, iris, palm, skin, and nose tip. Fingerprint, retina, and iris recognition systems are known to yield very accurate results [1], but hardened criminals, being sensitively aware of the security implications, mostly avoid presenting their biometric features to be captured into databases. Thus, automated face recognition systems are now the obvious choice [2] because people cannot hide their facial images from installed CCTV cameras all the time. This makes the technology the least intrusive and a hotbed research area, as researchers continue to propose newer algorithms that outperform existing ones.

Since automated face recognition is a newer field of study compared with fingerprint recognition and the others already stated, the problems associated with it are still prominent. For example, Zhang, Luo, Loy, and Tang [3] identified the problem of facial landmark detection, which is among the central concerns of system development. Most face detection algorithms are slow and produce poor recognition accuracies (Owusu, Zhan, and Mao, 2014). Other unresolved challenges in face recognition research have to do with occlusion, pose variation, illumination normalization, age, and gender [4]. In unconstrained environments, there is a significant decrease in recognition accuracy, making it difficult to identify faces reliably. Therefore, there is a need for techniques that improve face recognition in these environments. Tu, Li, and Zhao [5] attempted to solve the problem of illumination and pose by using the DL-Net and N-Net methods. However, this approach could not adequately account for large-scale normalized albedo images and face recognition in the wild. Another challenge in face recognition research has to do with testing the efficacy of experimental results. There is no standard dataset generally recognized by the research community for testing; the use of a specific dataset depends on the individual researcher's choice. Most of the datasets are premeditated and therefore do not represent a real-world scenario. Ethnicity poses a challenge as well: currently, there is no dataset that is well balanced for race, gender, and age.

The problem of nonuniform illumination also arises when the lighting conditions vary at different angles, so that the proportion of light reflected by the face differs. This phenomenon can lead to the misidentification of an individual [6]. Similarly, a random gyration due to individual movement can lead to misclassifications in 4D recognition: an input image and an interperson image can appear dissimilar due to the rotation of the image [7]. The main purpose of this study is to explore the popular techniques and bring forth an approach that leverages computational cost while taking illumination, pose, and facial expression into account. The proposed approach enhances the outcomes of the principal component analysis (PCA) technique using optimization techniques. Additionally, the improvement in accuracy in this research translates to a general improvement in the security and integrity of biometric locks.

This study explores which optimization algorithm is best suited to maximize recognition, which classifier suits the recommended approach, and which method utilizes the least computational resources and time. The proposed method requires the preprocessing of the image; features are then examined and extracted using PCA, followed by the augmentation of these features using PSO, ABC, and GA, with classification culminating the entire process.

2. Related Works

Face recognition is mainly performed in four phases, viz., feature extraction, face detection, face synthesis, and recognition [8]. Chihaoui et al. [9] stated that face recognition techniques fall into three main categories: procedures that use the whole face as input, approaches that consider only some features or regions of the face, and methods that use global and local facial traits simultaneously. Furthermore, numerous datasets are geared towards the solution of specific face recognition problems, and these datasets are taken under laboratory conditions; however, some datasets attempt to address multiple problems and are captured under real-world conditions [9].

Fazilov, Mirzaev, and Mirzaeva [10] examined an algorithm to enhance the classification of objects in higher dimensions. The proposed algorithm formed a subset of correlated images, and a feature representation was then elected to build elementary transformation models in the representative features' subspace. The algorithm pursues improvements in recognition accuracy, learning time, and object recognition time. The problem of low face recognition accuracy due to large sample spaces and the limited availability of training samples was addressed by He, Wu, Sun, and Tan [11] when they proposed cross-modality images for heterogeneous face recognition (HFR). The study proposed the Wasserstein CNN framework, which utilizes one network to project near-infrared (NIR) and visual (VIS) images into a Euclidean space. The proposed method is a modality-invariant deep feature learning architecture for NIR-VIS HFR. The Wasserstein distance separating the NIR and VIS distributions is computed, and a correlation constraint is then imposed on the fully connected layers to mitigate overfitting on small NIR datasets.

Similarly, Rahimzadeh, Arashloo, and Kittler [12] addressed the optimization problem of MAP inference in the Markov random field (MRF) model by utilizing the processing power of GPUs. The multiresolution analysis technique, an incremental subgradient approach, and an efficient message passing approach were used to obtain the maximum efficiency gain. Efficiency was further enhanced by using multiresolutional daisy features to attain invariance against occlusion and illumination. The proposed approach reduced the computational cost by 200% when compared with baseline methods. Likewise, Chan et al. [13] attempted the problem of training and adapting deep learning networks to different data and tasks. They offered a method of passing images into a cascaded principal component analysis (PCA) filter for training PCANet, which is subsequently used for feature extraction on the MultiPIE, Extended YaleB, AR, FERET, and LFW databases. Moreover, PCANet also serves as a reference for reviewing advanced deep learning architectures on large image classification tasks. Also, Deng, Hu, Wu, and Guo [14] put forward the creation of a face image to mitigate varying illumination and pose, respectively, using only one frontal face image to develop an extended generic elastic model (GEM) and a multidepth model. Pose-aware metric learning (PAML) was learned by means of linear regression to synthesize each pose in its corresponding metric space, and it yielded an accuracy of 100%. Chen et al. [15], on the other hand, proposed a residual-based deep face reconstruction neural network for the extraction of features from varying poses and illumination. This method changes images with varying illumination and pose into frontal face images with an average lighting condition. Comparing the proposed triplet loss and the Euclidean loss, the experimentation showed better performance for the latter over the former. However, only one database was used for this study, and there were no results against which to compare the proposed method.
Tu, Li, and Zhao [5] also addressed the problem of illumination, pose, and expression by using a DL-Net and a normalization network (N-Net). The DL-Net purges the illumination and then rebuilds the input image into an albedo image; the N-Net normalizes the albedo image and extracts features by supervised learning. Experiments on the MultiPIE database establish the efficiency of the proposed method in augmenting face recognition accuracy under illumination, expression, and varying poses. The study concludes by stating that the extracted features can improve conventional feature extraction methods. Zhang et al. [16] also proposed an emotion recognition model with better accuracy than the SOTA model. They extracted the facial expressions of seven different emotions. The extracted image is filtered through a combination of Shannon entropy and multiscale feature extraction, and the result is classified using a fuzzy support vector machine (SVM). The study used stratified cross-validation, and an overall accuracy of 96.77% was achieved. Ghazi and Ekenel [17] improved the accuracy under occlusion, variations in illumination, and misalignment of facial features by using two deep CNN models, VGG-Face and Lightened CNN, pretrained on large datasets, which were then used to extract facial features. They used five databases to attempt a solution to the problem: the AR face dataset as the analytical tool for the effects of facial obstruction, CMU PIE and the Extended Yale B dataset to analyze variation in illumination, the color FERET database for impact analysis on view invariance, and the FRGC dataset for the evaluation of multiview catalogues. The authors then used the Facial Bounding Box Extension to scan the entire head and extract deep features, thus improving the results. They compared the results of the Facial Bounding Box Extension with other methods and found a significant improvement [18]. However, Zhang et al. optimized face landmark detection by taking advantage of supplementary data from the attributes of the features. The study proposed feature extraction using four convolutional layers, each producing several feature maps that are activated using rectified linear units. The layers are then coupled using max-pooling to produce a shared vector. The Multi-Attribute Facial Landmark (MAFL), AFLW, and Caltech Occluded Faces in the Wild (COFW) datasets were subjected to mean error and failure rate validation. The study concluded that the auxiliary task is made more efficient by learning the dynamic task coefficient, which in turn makes the proposed method more robust to occluded faces and to significant view variance [19].

This approach encouraged Ding and Tao [20] to propound a homographic pose normalization approach which handles the loss of semantic correspondence, occlusion, and nonlinear facial texture warping in pose-invariant face recognition (PIFR). The proposed method first projects a lattice of three-dimensional facial landmarks onto a two-dimensional face for feature extraction. Second, an optimal warp is appraised using a homographic corrective texture deformation due to pose variation; this is performed around each landmark on the local patch. The restored occluded features are used for face recognition using established face descriptors [20]. However, Sharma and Patterh [21] proposed a technique whereby the face is identified by the Viola–Jones algorithm. The eyes, nose, and mouth are then discovered by means of the proposed hybrid PCA. Features are subsequently mined using LBP for every part found, and PCA is then applied to each extracted feature for recognition. The ORL face dataset was used with the recognition rate as the evaluation metric. The study concluded that the proposed hybrid PCA approach achieves a higher recognition rate under varying facial expressions and pose when pitted against SOTA, PCA+wavelet, CA, 2DPCA+DWT, and local binary pattern algorithms. They claimed that this approach can be extended to illumination, age, or partial occlusion problems.

Interestingly, Duong, Luu, Quach, and Bui [22] presented an approach to deep appearance models (DAM) that accurately captures shape and texture variation under large variations using the deep Boltzmann machine (DBM). DAM replaced the active appearance model (AAM). This method begins by employing the DBM to ascertain the landmark distribution points on the face data; the facial data are then vectorized as a texture model. The two layers (shape and texture) are then interpreted by constructing and using a high-level layer. The LFW, Helen, and FG-NET databases were used for the experimentation. The RMSE values of the proposed method against the controlled methods (bicubic and AAM) showed a significant improvement in the recognition rate [22].

Duan and Tan [23] also proposed a low-complexity method of learning pose-invariant features without the need for prior pose information. The proposed approach removes the pose from a face image and, in so doing, extracts local features. Self-similarity features are first generated from a face image by evaluating the distance separating the features of different nonoverlapping blocks. The linear transformation is then subtracted from the local features, and the transformation matrix is acquired by reducing the distance between pose-variant features. This matrix is created while discriminative information across persons is retained. Nevertheless, Singh, Zaveri, and Raghuwanshi [24] have proposed a rough membership classifier (RMF) for the classification of pose images. Feature extraction was performed using log-Gabor filters, and SVDs were used for the reduction of redundant features. A KNN classifier is finally applied to the reduced Gabor features. The ORL, Georgian Face, CMU PIE, and Head Pose Image databases were used with performance metrics similar to Duan and Tan [23]. The study concluded that the proposed method is best suited for mug shots in law enforcement. Moreover, it improves the recognition of face images with occlusion, and the method is augmented using modeling techniques to gain improved results. However, the use of three methods for testing reduces the optimality of the proposed methods for substantial datasets with varying images. Nevertheless, Zhao, Li, and Liu [25] have proposed MSA+PCA for pose-invariant face recognition. First, features are extracted using the affine-invariant multiscale autoconvolution (MSA) transformation. The decorrelation of these traits and the reduction of the MSA dimensions are then performed using principal component analysis. Finally, the principal components with the highest eigenvalues are classified using KNN. The experimentation points out how computationally expensive the proposed method is during the MSA feature extraction phase.
Abdalhamid and Jeberson [26] presented an able pose-invariant face recognition system via an artificial bee colony optimized K-nearest neighbor classifier (ABC-KNN). The method used video as input, converted into frames. During the preprocessing of the converted images, the adaptive Lee filter (ALF) was applied for image enhancement by removing noise. The Viola–Jones (VJ) algorithm is then used to segment the face from the eyes, nose, and mouth. Complete LBP (CLBP), center-symmetric local binary pattern (CS-LBP) features, Gabor features (GF), and patterns of oriented edge magnitudes (POEM) descriptors are used when features are extracted from the segmented image, and ABC-KNN is applied for classification. Recognition accuracy was the performance evaluation metric. Consequently, F. Zhang, Yu, Mao, Gou, and Zhan [27] propounded an approach for the PIFER framework based on feature learning using deep learning. The PCA-Net used frontal images that were not labeled during the learning process of the features. The latter are consequently used by a CNN for feature mapping across the space separating the nonfrontal and frontal faces. The novel description generated by the maps is then used to describe nonfrontal faces, thereby achieving a standard characteristic for describing arbitrary faces. The multiview robust features are then trained using a single classifier for varying poses. BU-3DFE Static FEW was used during the experimentation stage with recognition accuracy as the performance evaluation metric. When contrasted with other techniques and frameworks, the proposed process outperforms SOTA techniques. Additionally, this method can be used for pose-robust feature extraction when trained, instead of training the model for different pose variations.

Finally, Sang, Li, and Zhao's [28] method for PIFR fuses texture and depth into a framework using joint Bayesian classifiers. The output is then identified using a similarity estimator between the input and the face database. However, there is a high computational cost for the recognition of face images in large face databases. Furthermore, although the experimentation was extensive for various poses, multiple methods were not compared with the current method.
3. Research Methodology

The research design for this study includes image preprocessing, feature extraction with PCA, the optimization of these features using PSO, ABC, and GA, and finally the classification of objects using KNN, SVM, and EUD. The datasets for the study are YaleB and AT&T, popularly known as ORL. These datasets were selected with the justification that they present well-defined challenges necessary for validating the facial recognition algorithm. Subsequent sections explain the major parts of the study design in detail.

3.1. Feature Extraction. This component of the design acquires relevant biometric descriptors from a given image. In the process, a high volume of data is obtained, making it necessary to select only highly contributing descriptors. Several techniques exist for this task; however, PCA is adopted for this study due to its popularity and efficiency in this domain [29].

3.1.1. Principal Component Analysis. The primary goal of principal component analysis for facial recognition is the transformation of higher-dimensional data into a lower feature subspace known as the eigenface. This eigenspace represents the locus of the covariance matrix of the feature landmarks. Despite its usefulness, PCA is computationally expensive for higher-dimensional data. This necessitates the adoption of an alternative algorithm with properties and structure similar to PCA [30] but relatively inexpensive, known as singular value decomposition (SVD). Taking a matrix X with dimension n × m, PCA can be defined as the eigendecomposition of the covariance matrix X^T X. This yields eigenvalues λ with corresponding eigenvectors W. These eigenvectors are used as the transformation operator on X to obtain a new matrix T with the same dimension as X, as shown in

T = XW. (1)

Equation (1) assumes that all components (i.e., columns) in W are principal. In practice, however, some of these components are expected to be redundant; hence, W is ordered by λ. With the ordered W, truncation can be performed using the first r components for analysis. By implication, W_r is an m × r matrix, giving the new transformed matrix T_r shown in

T_r = XW_r. (2)

As stated earlier, the operations of PCA are expensive, and SVD, with properties mathematically identical to PCA, is preferred for implementation. Equation (3) shows the SVD of X:

X = μΣV*, (3)

where μ is the left singular vector, V* is the conjugate transpose of the right singular vector, and Σ contains the singular values on its diagonal. Substituting equation (3) into the eigendecomposition of X^T X gives X^T X = VΣ²V*, so W is identical to V, while the ordered singular values (σ1, σ2, σ3, ...) correspond to λ, with λ_i = σ_i². Again, with the property that μ and V are unitary matrices, we have

μ*μ = I, (4)

V*V = I, (5)

where I is the identity matrix. From equations (1) and (3), and noting that W is identical to V, we have

T = XV = μΣV*V = μΣI, (6)

T = μΣ. (7)

These equations further justify why SVD is computationally inexpensive compared with PCA, which computes the covariance X^T X. Taking the principal components of equation (7), we have

T_r = μ_r Σ_r. (8)

Finally, since the requirement is W and not the eigendecomposition of X^T X, SVD can be used to compute W efficiently.
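The equivalence in equations (1)–(8) can be sketched in a few lines. This is an illustrative NumPy version (the study's own implementation was in MATLAB); the image count, dimensions, and random data are placeholders:

```python
import numpy as np

# Illustrative sketch of PCA via SVD, as in equations (1)-(8).
# X holds one flattened face per row, centered by subtracting the mean face.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 96 * 84))      # 40 hypothetical face images
X = X - X.mean(axis=0)                  # subtract the "mean face"

# SVD route: X = mu @ diag(S) @ Vt, so W = Vt.T and T = mu @ diag(S)  (eq. 7)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
T_svd = U * S                           # T = mu.Sigma, scaled columns of U

# Direct PCA route for comparison: T = XW with W = V  (eq. 1).
# The eigendecomposition of X^T X would be far more expensive at full scale,
# which is why the paper prefers SVD; its eigenvalues equal sigma_i^2.
T_pca = X @ Vt.T

r = 10                                  # keep the first r principal components
T_r = T_svd[:, :r]                      # T_r = mu_r.Sigma_r  (eq. 8)
```

Both routes produce the same projected features, so the truncation in equation (8) can be read directly off the SVD factors.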
3.2. Feature Optimization. This section of the study describes the swarm intelligence algorithms used for feature optimization. Among these methods are the artificial bee colony, the genetic algorithm, and particle swarm optimization.

3.2.1. Artificial Bee Colony. The artificial bee colony (ABC) is a swarm-based algorithm designed around the foraging actions of honeybees. The four components of the behavioral model of ABC are the food source, scout bees, onlooker bees, and employed bees. The food source denotes a possible solution to the clustering problem as the scout bee carries out a global search. This search is performed stochastically, while the onlooker and employed bees search for adjacent solutions. The employed bees subsequently evaluate the precision of the solution against the solutions previously stored in memory. This information is successively passed on to the onlooker bees in the dance area. This ensures that the best food source is chosen, and stagnated food sources within an already set cycle are abandoned and replaced with new sources. This process is repeated until convergence to the optimal solution. Mathematically, we have the following steps.

Step one: randomly initialize solutions x_i for i = {1, 2, ..., FS}, where i indexes each food source and FS is the total number of food sources. Furthermore, initialize the onlooker and employed bees using a random function generator:

x_ij = x_j^min + rand(0, 1)(x_j^max − x_j^min), (9)

where x_i = [x_i1, x_i2, ..., x_iD] is a vector of length D, with x_j^max and x_j^min denoting the maximum and minimum values of the jth dimension.

Step two: new solutions are iteratively found by each employed bee using

v_ij = x_ij + φ_ij(x_ij − x_kj), (10)

where v_i = [v_i1, v_i2, ..., v_iD] signifies a new solution within the local range of x_i = [x_i1, x_i2, ..., x_iD], k indexes a randomly chosen neighboring food source, and φ_ij ∈ (−1, 1). The sum of the Euclidean distances between the sample points and their cluster midpoints is known to be inversely proportional to the fitness value of all candidate sources. In the selection of the sources, a greedy approach is employed by comparing the fitness values of the old and new positions.

Step three: the probability p_i of the solution x_i is computed using

p_i = fit_i / Σ_{n=1}^{FS} fit_n, (11)

where fit_i is the fitness value of x_i. Onlooker bees use this probability to select new x_i values by searching the local optima, following step two to calculate the fitness value.

Step four: if the onlooker and employed bees are unable to identify a new and better candidate solution through the local search after a predefined number of iterations, the solution x_i is discarded and substituted with a scout bee's new solution. The scout bees then use random global selection to search for new solutions.

Step five: steps two to four are repeated until the defined stopping criterion is met, returning the optimal output.
3.2.2. Genetic Algorithm. The genetic algorithm (GA), on the other hand, is based on genetics and the theory of natural selection. It is a stochastic algorithm that finds the best solution by effectively searching for the global optimum in a large space. A nonnegative fitness value is obtained using the fitness function; this value summarizes how close a solution is to the global best (Mahmud, Haque, Zuhori, and Pal, 2014). A GA begins by generating random numbers (called chromosomes) with population size n. Each chromosome has its fitness value computed, and the stopping criterion is checked. The GA operators that drive the chromosomes toward convergence, namely selection, crossover, and mutation, are explained below.

Selection. This operator creates offspring from an existing population by a process comparable to natural selection in biological lifeforms. Selection accentuates the better-performing individuals in the population, with the expectation that their offspring are more likely to carry the genetic information on to a successive generation. Consequently, convergence is impacted greatly by the magnitude of the selection pressure. Hence, the selection criteria should prevent premature convergence by maintaining population diversity and balance with the crossover and mutation operations.

Crossover. The crossover operator mixes information between two parents in a manner matching sexual reproduction. The objective of the crossover procedure is to give "birth" to improved offspring. This is achieved by exploring different portions of the search space.

Mutation. The mutation procedure changes the values of randomly selected bits within each string, thereby preventing the GA from getting stuck at a local minimum through the scattering of genetic data, hence maintaining variation in the population. This process is repeated until the optimal solution is achieved or the predetermined number of generations elapses.
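The three operators can be sketched on a bitstring toy problem. This is an illustrative Python example, not the study's code; the ones-max objective, population size, and mutation rate are assumptions made for the demonstration:

```python
import numpy as np

# Minimal GA sketch of selection, crossover, and mutation:
# maximize the number of 1-bits in a 20-bit chromosome.
rng = np.random.default_rng(2)
N, BITS, GENS, P_MUT = 30, 20, 60, 0.01

pop = rng.integers(0, 2, size=(N, BITS))     # random chromosomes

def fitness(pop):
    return pop.sum(axis=1)                   # nonnegative fitness values

for _ in range(GENS):
    f = fitness(pop)
    probs = f / f.sum()                      # fitness-proportional selection
    parents = pop[rng.choice(N, size=N, p=probs)]
    # single-point crossover between consecutive parent pairs
    children = parents.copy()
    for i in range(0, N - 1, 2):
        cut = rng.integers(1, BITS)
        children[i, cut:], children[i + 1, cut:] = (
            parents[i + 1, cut:].copy(), parents[i, cut:].copy())
    # bit-flip mutation keeps diversity and helps escape local minima
    flips = rng.random(children.shape) < P_MUT
    pop = np.where(flips, 1 - children, children)

best = pop[fitness(pop).argmax()]
```

With these settings the population converges toward the all-ones string; raising the mutation rate slows convergence but preserves more diversity, mirroring the balance described above.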
3.2.3. Particle Swarm Optimization. Particle swarm optimization (PSO) is also an optimization algorithm influenced by biology. It was derived by observing the collective behavior and swarming of flocks of birds and schools of fish [30]. The algorithm comprises candidate solutions known as particles, each having a series of parameters which represent a coordinate in a multidimensional space. A collection of these particles forms a population, with the particles probing the search space to find the optimal solution. Each particle keeps its former optimal solution in memory, labeling these solutions the personal best and the global best. The position of the ith particle is defined in the D-dimensional space as

x_i = [x_i1, x_i2, x_i3, ..., x_iD], (12)

and the population of the swarm as

X = [x_1, x_2, x_3, ..., x_N]. (13)

The particles then iteratively update their respective positions in the parameter space when searching for the optimal solution using

x_i(t + 1) = x_i(t) + v_i(t + 1), (14)

where v_i is the velocity component of the ith particle along the D dimensions, with t and t + 1 indicating two consecutive runs of the process. The velocity of the ith particle is defined in equation (15) with three terms: the first is inertia, which prevents the particle from drastically changing direction; the second describes the ability of the particle to return to the previously known best position; and the last describes the particle moving (swarming) closer to the best position:

v_i(t + 1) = v_i(t) + c1(p_i − x_i(t))R1 + c2(g − x_i(t))R2, (15)

where p_i is the personal best of the particle, g is the global best, and c1 and c2, in the range 0 ≤ c1, c2 ≤ 4, are the cognitive and social coefficients, respectively. Finally, R1 and R2 are two diagonal matrices randomly generated from a uniform distribution in [0, 1]. This ensures that the social and cognitive components have a random effect on the velocity update in equation (15). Since the particles are driven toward the convergence of the personal and global best solutions, the stochastic weighting of the two accelerating terms makes the trajectories semirandom. Equations (14) and (15) are iterated until a stopping criterion is met. Algorithmically, we have the following pseudocode.

3.3. PSO Algorithm

(1) Initialize N particles:
(a) Initialize the position x_i(0) ∀ i ∈ 1 : N
(b) Initialize each particle's best position to its initial position: p_i(0) = x_i(0)
(c) Calculate the fitness of each particle, and if f(x_j(0)) ≥ f(x_i(0)) ∀ i ≠ j, initialize the global best as g = x_j(0)
(2) Repeat until the stopping condition is met:
(a) Update the particle velocity in accordance with equation (15):
v_i(t + 1) = v_i(t) + c1(p_i − x_i(t))R1 + c2(g − x_i(t))R2. (16)
(b) Update the particle position using equation (14):
x_i(t + 1) = x_i(t) + v_i(t + 1). (17)
(c) Evaluate the fitness of the particle, f(x_i(t + 1)).
(d) If f(x_i(t + 1)) ≥ f(p_i), update the personal best: p_i = x_i(t + 1).
(e) If f(x_i(t + 1)) ≥ f(g), update the global best: g = x_i(t + 1).
(3) Assign the best solution to g at the end of the iterative process.

3.4. Classification. After the optimization of the extracted feature vectors, classification models are built to address the face recognition challenge. There are myriads of predefined models for this task given the feature set, among them SVM, KNN, K-means, Euclidean distance, VGGNet, and CNN. Other pretrained face classifiers, such as VGG-Face, also exist, which estimate the similarity between the face image of a subject and relevant features selected from the face images in the database. In this study, the Euclidean distance (EUD), K-nearest neighbor (KNN), and support vector machine (SVM) classifiers were used.
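The pseudocode above can be sketched directly. This is an illustrative Python version on a toy fitness function, not the study's MATLAB code; the inertia weight of 0.7 and the coefficients c1 = c2 = 1.5 are assumptions added here for stable convergence and are not taken from equation (16):

```python
import numpy as np

# Sketch of the Section 3.3 pseudocode: PSO maximizing f(x) = -||x||^2,
# i.e., minimizing the sphere function.
rng = np.random.default_rng(3)
N, D, ITERS = 20, 5, 150
W_INERTIA, c1, c2 = 0.7, 1.5, 1.5          # assumed, not from the paper

def f(x):
    return -np.sum(x**2)                   # higher fitness is better

X = rng.uniform(-5, 5, size=(N, D))        # step (1a): initial positions
V = np.zeros((N, D))
P = X.copy()                               # step (1b): personal bests
g = max(X, key=f).copy()                   # step (1c): initial global best

for _ in range(ITERS):
    for i in range(N):
        R1, R2 = rng.random(D), rng.random(D)   # random diagonal weights
        # eq. (16): inertia + cognitive + social terms
        V[i] = W_INERTIA * V[i] + c1 * R1 * (P[i] - X[i]) + c2 * R2 * (g - X[i])
        X[i] = X[i] + V[i]                 # eq. (17): position update
        if f(X[i]) >= f(P[i]):             # step (2d): personal best
            P[i] = X[i].copy()
        if f(X[i]) >= f(g):                # step (2e): global best
            g = X[i].copy()
```

After the loop, g holds the best solution found, as in step (3). In the study's pipeline this fitness function would instead score a candidate feature subset by its classification accuracy.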
Finally, R and image undergoes a series of preprocessing and subsequent1 R2 are the two diagonal matrices randomly generated from a feature selection and finally features optimization. +ese uniform distribution in [0,1].+is ensures that the social and optimized features are trained for feature matching. cognitive components have a random effect on the velocity update in equation (15). Since the particles are derived from 4.2. Environmental Setup. +e face recognition system the convergence of the personal and global best solutions, implemented in this study was developed, trained, and tested the stochastic weight of the two accelerating terms and the using Matlab R2018b on an HP desktop processor Intel trajectories are semirandom. +is requires that equations Core™ i7-770T CPU @ 2.90GHz, Linux Ubuntu 20.04 LT®S (14) and (15 are iterated until a stopping criterion is met. operating system. Algorithmically, we have the following pseudocode. 4.3. Image Preprocessing. +e first step taken in image 3.3. PSO Algorithm. analysis is the preprocessing of the image for undesirable (1) N particle initialization noise. +ese components are detrimental to the examination of the image and thus are removed via preprocessing. All (a) Initialize the position xi(0)∀ i ∈ 1: N images with dimensions more than 96-by-84 pixels are (b) Initialize the particles best position to its position downsampled. +is is followed by the conversion of all Pi(0) � xi(0) colored images to grayscale. +e outputs of the images are (c) Calculate the fitness of each particle, and if separated into training and test sets. Eighty percent of the f(xj(0))≥f(xi (0)) ∀ i≠ j , initialize the global images are considered as training sets with 20 percent as the best as g � xj (0) test set. +is preprocessing is implemented so that the (2) Repeat until condition is met complexity will be reduced and the computational time improved. (a) Update the particle velocity in accordance with equation (15) 4.4. Feature Extraction. 
+is section further illuminates on vi(t + 1) � vi(t) + c1( pi − xi(t)􏼁R1 + c2( g − xi(t)􏼁R2. the feature extraction approach used in this study. Among (16) the objectives of this study is the implementation of an offline facial recognition system with an improved and (b) Update the particle position using equation (14) robust feature extraction method using optimization Applied Computational Intelligence and Soft Computing 7 Standardized ‘Mean’ face face image 2 Face Features 10 preprocessing extraction 3 20 30 Image Feature input 40vectors 50 60 Feature 1 Face optimization 4 70 dataset 80 Feature 90 Comparison vectors 10 20 30 40 50 60 70 80 optimization Figure 2: Mean face for the AT&T dataset. Output Feature Feature matching classification 5 ‘Mean’ face 6 10 Figure 1: Implementation pipeline. 20 30 techniques. +is method will be tested using the AT&T and 40 YaleB face datasets as they contain faces with varying illu- 50 mination, different poses, occluded faces, dissimilar ex- 60 pressions, or a combination of them. +e mean of the 70 features is computed, and the feature of the first principal 80 component of each image is selected. +e mean face for 90 AT&T and YaleB datasets is shown in Figures 2 and 3, respectively. 10 20 30 40 50 60 70 80 Figure 3: Mean face the YaleB dataset. 4.5. Dimensionality Reduction and Feature Selection. Given the computed mean face of the training data, the are used. For an efficient evaluation and a valid comparison binary singleton expansion function is applied as an ele- with the existing study, the accuracy metric is selected. +e ment-wise operator. +e resultant image is decomposed recognition accuracy is computed for all the classification with the single value decomposition function to reduce the methods as applied on different datasets with varying op- coefficient used to characterize the image. +e cumulative timization methods. 
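To make the preprocessing and mean-face steps of Sections 4.3 and 4.4 concrete, a minimal sketch is given below. The helper names, the strided-slicing downsampling, the grayscale weights, and the fixed random seed are illustrative assumptions for a NumPy setting, not the authors' Matlab code:

```python
import numpy as np

def preprocess(images, target=(96, 84)):
    """Grayscale-convert and downsample each face, then flatten it
    into a row vector (hypothetical helper, crude stride-based resize)."""
    out = []
    for img in images:
        if img.ndim == 3:  # color -> grayscale (standard luma weights)
            img = img @ np.array([0.299, 0.587, 0.114])
        fy = max(1, img.shape[0] // target[0])
        fx = max(1, img.shape[1] // target[1])
        img = img[::fy, ::fx][:target[0], :target[1]]
        out.append(img.ravel())
    return np.array(out)

def split_train_test(X, train_frac=0.8, seed=0):
    """80/20 split as described in Section 4.3."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(train_frac * len(X))
    return X[idx[:cut]], X[idx[cut:]]

# Toy stand-in for a face dataset: 10 random 120x100 RGB images.
faces = [np.random.rand(120, 100, 3) for _ in range(10)]
X = preprocess(faces)
train, test = split_train_test(X)
mean_face = train.mean(axis=0)   # the "mean" face of Figures 2 and 3
centered = train - mean_face     # element-wise centering (cf. bsxfun)
```

The element-wise subtraction on the last line plays the role of Matlab's binary singleton expansion used later in Section 4.5.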
4.5. Dimensionality Reduction and Feature Selection. Given the computed mean face of the training data, the binary singleton expansion function is applied as an element-wise operator to subtract the mean from each image. The resultant matrix is decomposed with the singular value decomposition function to reduce the number of coefficients used to characterize each image. The cumulative sum of the squares of the entries of the diagonal (singular value) matrix is computed to produce the principal components, and the components corresponding to the first k eigenvalues are selected. The eigenvectors are then normalized into eigenfaces. Sample outputs of this process on the AT&T and YaleB datasets are shown in Figures 4 and 5, respectively.

Once more, the binary singleton expansion function is used to transform the test data using the mean face. These transformed train and test data are then optimized for better classification results.

5. Results and Discussion

This section describes in detail the results of the experiment and the analysis of those results. Moreover, comparisons between the optimization methods using the same databases and three different classifiers are discussed.

5.1. Numerical Results. Generally, in recording the performance of a facial recognition model, statistical metrics such as accuracy, recall, precision, and F-measure are used. For an efficient evaluation and a valid comparison with existing studies, the accuracy metric is selected. The recognition accuracy is computed for all the classification methods as applied on the different datasets with the various optimization methods. Tables 1–7 show the average, maximum, and minimum recognition accuracies for the datasets with the different classification methods. This experiment was conducted with one thousand five hundred (1500) iterations, with and without optimization of the extracted features.

5.2. Discussion. From the results shown in Section 5.1, it is observed that the model's performance on the AT&T dataset is fairly low in general. This could be attributed to the occlusion, varying pose, and expression exhibited in the face images, making the dataset naturally difficult to model. On the contrary, the model's performance on YaleB was relatively good, as that dataset contains only images with varying illumination. From Table 1, it can be seen that the accuracy is highest for KNN and SVM, at 100% each, on the YaleB dataset. Nevertheless, optimizing the features with GA saw a significant decrease of the KNN classifier to 70%, with a 9.8% reduction using the Euclidean distance method, as shown in Table 3. There is no loss, as shown in Tables 2 and 4, for KNN and SVM when ABC and PSO optimization is performed.
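The bookkeeping behind the average, maximum, and minimum accuracies reported in Tables 1–7 can be sketched as follows; the short list of per-run accuracies is a made-up stand-in for the 1500 recorded runs of one optimizer-plus-classifier pairing:

```python
# Hypothetical per-run recognition accuracies (percent) for one
# optimizer+classifier pairing; the study records 1500 such runs.
runs = [100.0, 99.123, 71.93, 100.0, 95.614, 40.351, 90.351]

average = sum(runs) / len(runs)   # "Average" row of Tables 2-7
maximum = max(runs)               # "Maximum" row
minimum = min(runs)               # "Minimum" row

print(f"Average {average:.2f}  Maximum {maximum:.2f}  Minimum {minimum:.2f}")
```

Each table cell in Tables 2–7 is one of these three statistics for a single (dataset, optimizer, classifier) combination.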
However, a 4% and 5% reduction for PSO and ABC, respectively, was noted when the EUD classifier was used. Consequently, there is a large difference in recognition accuracy on the AT&T database. Without the use of an optimization method, the AT&T database's recognition accuracies stood at 29.94%, 77.84%, and 82.05% for EUD, KNN, and SVM, respectively, as shown in Table 1. However, using the PSO optimization technique saw an improvement in the average recognition accuracy to 80.46% for KNN. There is a significant degradation when using the SVM classifier, with an average recognition accuracy of 59.85%. Again, EUD saw a 28.46% average recognition accuracy for PSO, as shown in Table 5. Conversely, the average recognition accuracy for GA and ABC reduced to 47.05% and 66.78%, respectively, when using KNN, and to 54.86% and 35.55% when using SVM, as can be separately observed in Tables 6 and 7.

Figure 4: First six eigenfaces of AT&T.

Figure 5: First three eigenfaces of YaleB.

Table 1: Recognition accuracy without the use of the optimization algorithm.

    Dataset   EUD     KNN     SVM
    AT&T      29.94   77.84   82.05
    YaleB     82.63   100     100

Table 2: Recognition accuracies for the YaleB database with the PSO algorithm.

    YaleB - particle swarm optimization (PSO)
              EUD     KNN     SVM
    Average   87.92   100     91.72
    Maximum   100     100     100
    Minimum   10.53   100     0

Table 3: Recognition accuracies for the YaleB database with the GA algorithm.

    YaleB - genetic algorithm (GA)
              EUD     KNN      SVM
    Average   70.04   70.04    99.15
    Maximum   100     100      100
    Minimum   10.53   10.523   91.23

Table 4: Recognition accuracies for the YaleB database with the ABC algorithm.

    YaleB - artificial bee colony (ABC)
              EUD     KNN     SVM
    Average   74.71   100     99.60
    Maximum   100     100     100
    Minimum   17.54   100     95.61

Table 5: Recognition accuracies for the AT&T database with the PSO algorithm.

    AT&T - particle swarm optimization (PSO)
              EUD     KNN     SVM
    Average   28.46   80.46   59.85
    Maximum   36.25   99.25   78.75
    Minimum   18.75   67.5    5

Table 6: Recognition accuracies for the AT&T database with the GA algorithm.

    AT&T - genetic algorithm (GA)
              EUD     KNN     SVM
    Average   28.78   47.05   54.86
    Maximum   40      87.5    86.25
    Minimum   17.5    7.5     15

Table 7: Recognition accuracies for the AT&T database with the ABC algorithm.

    AT&T - artificial bee colony (ABC)
              EUD     KNN     SVM
    Average   28.50   66.78   35.55
    Maximum   43.75   93.75   72.5
    Minimum   17.5    8.75    8.75

The order of experimentation is given as follows:

    PCA + EUD
    PCA + KNN
    PCA + SVM
    PCA + PSO + EUD
    PCA + PSO + KNN
    PCA + PSO + SVM
    PCA + GA + EUD
    PCA + GA + KNN
    PCA + GA + SVM
    PCA + ABC + EUD
    PCA + ABC + KNN
    PCA + ABC + SVM

By observation, PCA + PSO + EUD has 15.19% of its recognition accuracies below 79.83%, which is the recognition accuracy of PCA + EUD for the YaleB dataset. This indicates that PSO optimizes the features well, with 84.81% of the experiments producing better results. In addition, there is no change in the recognition accuracy of KNN, which points to PSO not reducing the results already achieved when the optimization technique is performed. Conversely, SVM saw less than 1% of its recognition accuracies fall below 90%. This demonstrates that over 99% of the results for PCA + PSO + SVM have recognition accuracy above 90%, with 95% of the recognition accuracies at 100%. Therefore, the 5% reduction relative to PCA + SVM can be considered insignificant. As a final point, PSO optimizes well for EUD and SVM.

Again, PCA + GA + EUD indicates a 28.78% average recognition accuracy. This is similar to the average results obtained by all three optimization methods using EUD as the classifier. However, GA and ABC achieved 47.05% and 66.78% average recognition accuracy for the KNN classifier, respectively. This illustrates an atrophy of the result from 77.84% to 47.05% for GA and to 66.78% for ABC: GA suffers a 30% degradation, while ABC saw an 11% reduction in average recognition accuracy.
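A hedged sketch of the PCA + PSO + KNN idea evaluated in these experiments is given below: PSO searches over feature weights, with 1-NN (Euclidean) accuracy as the fitness, following the velocity update of equation (16) with the random terms applied element-wise. The toy data, particle count, iteration budget, and clipping of positions to [0, 1] are assumptions for illustration, not the study's settings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy PCA-reduced features: two classes separated along dimension 0;
# dimensions 1-2 are pure noise the optimizer should downweight.
X_train = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(4, 1, (20, 3))])
X_train[:, 1:] = rng.normal(0, 5, (40, 2))
y_train = np.array([0] * 20 + [1] * 20)
X_test = np.vstack([rng.normal(0, 1, (10, 3)), rng.normal(4, 1, (10, 3))])
X_test[:, 1:] = rng.normal(0, 5, (20, 2))
y_test = np.array([0] * 10 + [1] * 10)

def knn_accuracy(w):
    """Fitness: 1-NN (Euclidean distance) accuracy with feature weights w."""
    d = np.linalg.norm(X_test[:, None] * w - X_train[None, :] * w, axis=2)
    pred = y_train[np.argmin(d, axis=1)]
    return np.mean(pred == y_test)

# PSO over feature weights, velocity update as in eq. (16):
#   v <- v + c1*R1*(p - x) + c2*R2*(g - x)
n, dim, c1, c2 = 12, 3, 2.0, 2.0
x = rng.uniform(0, 1, (n, dim))
v = np.zeros((n, dim))
p, p_fit = x.copy(), np.array([knn_accuracy(w) for w in x])
g = p[np.argmax(p_fit)].copy()

for _ in range(30):
    R1, R2 = rng.uniform(0, 1, (2, n, dim))
    v = v + c1 * R1 * (p - x) + c2 * R2 * (g - x)
    x = np.clip(x + v, 0, 1)   # position update (eq. (14)), clipped
    fit = np.array([knn_accuracy(w) for w in x])
    better = fit > p_fit
    p[better], p_fit[better] = x[better], fit[better]
    g = p[np.argmax(p_fit)].copy()

print("best weights", g.round(2), "accuracy", knn_accuracy(g))
```

On this toy problem the best particle typically learns a large weight for the discriminative dimension, mirroring how the study's PSO step reshapes the PCA features before KNN classification.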
Similarly, PCA + GA + EUD has over 71% of its results above that of PCA + EUD. KNN and SVM, however, have 60% and 100% of their results greater than those of PCA + KNN and PCA + SVM, respectively. Yet still, PCA + GA + KNN shows a significant decay of results from its default 100%. SVM, on the other hand, displays a negligible reduction in average recognition accuracy. With this, SVM seems to produce better results than both KNN and EUD with respect to the use of the GA optimization algorithm.

Moreover, the average recognition accuracy for both GA and ABC with the SVM classifier plummeted further than that of KNN. A 27% reduction in average recognition accuracy using the SVM classifier for GA supersedes that of ABC, which has a 46.5% reduction. Thus, it is concluded that GA and ABC using SVM as the classifier are not suitable for this approach. The first 20 results are shown in Table 9.

Again, the linear kernel was used for the SVM classifier when the experiment was performed. This kernel has the propensity to improve computational time compared to other SVM kernels, and it is suitable for high-dimensional data [31]. However, the linear kernel in this experiment appears to have sacrificed accuracy for computational time; thus, the kernel chosen does not produce good results. Other kernels such as the polynomial, Gaussian, radial basis function (RBF), or ANOVA kernels could be used for SVM in future research and the results compared with the proposed method.
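The kernels named above can be written directly as Gram-matrix functions; the parameter values (gamma, degree, c) below are illustrative defaults, not the study's settings, and swapping one for another in an SVM is what the suggested future experiment would amount to:

```python
import numpy as np

def linear_kernel(A, B):
    """K(a, b) = a . b  -- the kernel used in the study's SVM runs."""
    return A @ B.T

def rbf_kernel(A, B, gamma=0.1):
    """K(a, b) = exp(-gamma * ||a - b||^2), the Gaussian/RBF alternative."""
    sq = ((A[:, None] - B[None, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq)

def poly_kernel(A, B, degree=3, c=1.0):
    """K(a, b) = (a . b + c)^degree, the polynomial alternative."""
    return (A @ B.T + c) ** degree

X = np.random.rand(5, 4)
for kernel in (linear_kernel, rbf_kernel, poly_kernel):
    K = kernel(X, X)
    assert K.shape == (5, 5)
    assert np.allclose(K, K.T)   # Gram matrices are symmetric
```

Only the kernel changes; the rest of the SVM machinery is untouched, which is why the comparison the authors propose is cheap to run.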
Similarly, Table 1 indicates that SVM is the better classifier when the linear kernel is used and no optimization algorithms are utilized; thus, both the AT&T and YaleB datasets produce their best default results with SVM. Now, comparing Tables 2–4, it is perceived that a perfect recognition accuracy of 100% is achieved for the maximum of all metaheuristic algorithms and classifiers. This indicates that all the optimization methods can be used for the YaleB database regardless of the classifier. Conversely, the maximum recognition accuracies for the algorithms used for augmentation give the impression that the KNN classifier is better. This means that PSO + KNN, ABC + KNN, and GA + KNN have better recognition accuracy than their SVM counterparts, which shows that the optimization algorithms have degraded the results produced by the SVM classifier. Nonetheless, GA's maximum recognition was better than that of the default SVM (PCA + SVM). Therefore, GA should be preferred when an SVM classifier with a linear kernel is chosen. Furthermore, the algorithms improved the highest recognition accuracy achieved by the EUD classifier only. With this, PSO is selected as the ideal optimization algorithm for the YaleB and AT&T datasets.

In like manner, the YaleB dataset results for PCA + ABC + EUD give rise to 28% of the data above the default 79.83% of PCA + EUD. However, the ABC-optimized recognition for KNN and SVM revealed no significant loss of results, with average recognition accuracies of 100% and 99.6% for KNN and SVM, respectively. It can be established that the results optimized by ABC and classified using KNN are appropriate for the YaleB dataset, and that ABC optimizes well for KNN and SVM on the said dataset. Table 8 shows the first 20 results of the total experiments for the PSO, ABC, and GA optimization algorithms implemented on the YaleB dataset using the EUD, KNN, and SVM classifiers.

Nevertheless, the substantial reduction of the results observed when the AT&T dataset is used stems from the increase in parameters for recognition: the AT&T dataset contains images that are occluded, and it also has varying poses and expressions.

PCA + PSO + EUD for the AT&T face dataset produced results that are, on average, below those of PCA + EUD for the database: 51% of the 1500 results obtained were lower than the default 29.94% for PCA + EUD. The overall average recognition of PCA + PSO + EUD for the AT&T database, however, was 28.46%, as shown in Table 5. It is perceived that the deterioration of the average recognition is offset by the larger values of the other recognitions; the 48% of results above the default 29.94% is not insignificant, yet it is a small percentage for consideration. KNN, on the other hand, has 31% of its results above the 77.84% default recognition. Still, the average recognition accuracy achieved was 3% higher than the default. Thus, the 80.46% average recognition accuracy for KNN with PSO-optimized features (PCA + PSO + KNN) is the best combination for the AT&T database, since none of the results for SVM was above its 82.05% baseline recognition.

Juxtaposing the proposed method with other approaches, Table 10 shows that the offered approach is more effective than the other state-of-the-art (SOTA) methods. The culmination of this research presents the proposed optimization method and classifier for the respective datasets in Table 11.

Finally, Table 12 shows the time taken for each experiment carried out. It is seen that PSO has the lowest average times for the EUD and KNN experiments, at 1.594 s and 1.592 s, respectively, while PSO + SVM, at 55.46 s, saw the highest computational cost of all the experimentation. However, PSO required less than 2 seconds for the PSO + EUD and PSO + KNN trials.

Table 8: YaleB experiment results for PSO, GA, and ABC using the EUD, KNN, and SVM classifiers.
            YaleB - PSO              YaleB - GA               YaleB - ABC
    EUD      KNN   SVM       EUD     KNN     SVM       EUD     KNN   SVM
    100      100   100       60.526  60.526  99.123    36.842  100   100
    100      100   100       99.123  99.123  100       35.088  100   100
    71.93    100   100       51.754  51.754  100       28.07   100   100
    100      100   100       100     100     98.246    71.93   100   99.123
    95.614   100   100       41.228  41.228  98.246    86.842  100   100
    99.123   100   100       40.351  40.351  100       100     100   100
    90.351   100   100       94.737  94.737  98.246    71.053  100   99.123
    99.123   100   100       99.123  99.123  99.123    47.368  100   100
    100      100   100       35.088  35.088  99.123    53.509  100   100
    100      100   100       99.123  99.123  100       54.386  100   100
    85.965   100   100       100     100     100       100     100   100
    67.544   100   98.246    100     100     100       57.895  100   100
    82.456   100   100       46.491  46.491  100       91.228  100   100
    87.719   100   100       97.368  97.368  99.123    30.702  100   98.246
    88.596   100   100       41.228  41.228  100       92.982  100   98.246
    83.333   100   98.246    28.07   28.07   96.491    100     100   100
    93.86    100   100       100     100     98.246    50      100   100
    100      100   100       50      50      97.368    72.807  100   100
    70.175   100   99.123    100     100     100       27.193  100   100

Table 9: AT&T experiment results for PSO, GA, and ABC using the EUD, KNN, and SVM classifiers.

            AT&T - PSO               AT&T - GA                AT&T - ABC
    EUD      KNN    SVM      EUD     KNN     SVM       EUD     KNN    SVM
    30       82.5   70       28.75   53.75   51.25     27.5    71.25  53.75
    26.25    80     61.25    37.5    31.25   37.5      31.25   55     33.75
    28.75    76.25  40       32.5    78.75   76.25     28.75   38.75  20
    32.5     76.25  75       36.25   20      30        23.75   76.25  41.25
    27.5     77.5   62.5     23.75   51.25   70        31.25   10     8.75
    27.5     80     75       27.5    31.25   35        26.25   71.25  27.5
    30       86.25  55       28.75   35      38.75     31.25   55     27.5
    33.75    87.5   73.75    27.5    63.75   62.5      30      42.5   23.75
    23.75    71.25  66.25    33.75   35      40        27.5    77.5   47.5
    23.75    76.25  68.75    30      56.25   71.25     28.75   42.5   16.25
    31.25    72.5   62.5     36.25   28.75   35        26.25   50     22.5
    26.25    73.75  68.75    27.5    46.25   56.25     23.75   76.25  37.5
    32.5     86.25  7.5      27.5    46.25   60        30      77.5   52.5
    30       83.75  63.75    28.75   11.25   16.25     25      45     31.25
    30       85     65       27.5    53.75   50        27.5    67.5   37.5
    22.5     83.75  50       30      28.75   33.75     30      45     40
    31.25    78.75  63.75    26.25   67.5    65        26.25   70     33.75
    28.75    68.75  56.25    35      58.75   62.5      28.75   86.25  56.25
    31.25    87.5   70       30      57.5    60        23.75   45     10

Table 10: Recognition accuracy of other methods on the YaleB and AT&T datasets.

    Author  Method                                                      Accuracy (%)  Database
    [32]    Generalized low-rank approximation of matrices (GLRAM)      82.18         YaleB
    [33]    FDDL                                                        96.2          YaleB
    [34]    Local nonlinear multilayer contrast patterns (LNLMCP)       97.50         YaleB
    [35]    Discriminative sparse representation via l2 regularization  82.61         YaleB
    [32]    GLRAM                                                       97.25         AT&T
    [33]    Fisher discriminative dictionary learning (FDDL)            96.7          AT&T
    [31]    PSO-KNN                                                     98.75         AT&T
    [31]    PCA-LDA fusion algorithm                                    98.00         AT&T
    [35]    Discriminative sparse representation via l2 regularization  95.00         AT&T

Table 11: Proposed classification and optimization techniques for both datasets.

    Proposed selection       AT&T                               YaleB
    Optimization technique   Particle swarm optimization (PSO)  Particle swarm optimization (PSO)
    Classification method    K-nearest neighbor                 K-nearest neighbor

Table 12: Average time taken for experiments.
    Time in seconds for experimentation   PSO      ABC      GA
    Euclidean distance                    1.594    2.104    1.648
    K-nearest neighbor                    1.592    2.115    1.646
    Support vector machine                55.46    4.871    4.445

Subsequently, the ABC and GA metaheuristic algorithms produced results similar to PSO's, but PSO is computationally less expensive than both.

6. Conclusion

This study looks at how to augment PCA features with a selected optimization method to improve the accuracy of face recognition models. The proposed implementation shows that the choice of PSO as the optimization method works well in the unconstrained environments of the real world, since pose, occlusion, and expression are among the dominant face recognition problems found in such environments. The default recognition accuracy on YaleB was 100% for both the SVM and KNN classifiers. However, the ORL database did not attain perfect recognition due to the inherent nature of the dataset. Nonetheless, the use of optimization algorithms on the selected features saw an increase in recognition accuracy from 82.63% to a maximum of 100% for EUD. This indicates that all three evolutionary algorithms can be used to improve the accuracy of the results. However, because the ORL database caters for three parameters (pose, occlusion, and expression), the maximum recognition did not reach 100% but rather 99.25%, which is promising, using the PSO algorithm and the KNN classifier. Last, the PCA + PSO + KNN approach is chosen for this study due to its ability to handle the increase in parameters, and it also outperforms other SOTA algorithms. These parametric increases move the recognition closer to real-world human face recognition. Moving forward, this study can be extended by looking at other recent swarm-intelligence optimization models used in other fields that have the property of being less computationally expensive. Other private datasets with stricter challenges could also be used to further validate this model; this remains a limitation of the present study.

Data Availability

The secondary data sources used to support the findings of this study are available from the AT&T database (https://www.kaggle.com/kasikrit/att-database-of-faces) and the YaleB database (https://github.com/Suchetaaa/CS663-Assignments/tree/0426d951d0212ed3dd831377a0df11551670ab87/Assignment-4/1/CroppedYale).

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] S. Li and W. Deng, "Deep facial expression recognition: a survey," 2018, http://arxiv.org/abs/1804.08348.
[2] R. D. Labati, A. Genovese, E. Muñoz, V. Piuri, F. Scotti, and G. Sforza, "Biometric recognition in automated border control," ACM Computing Surveys, vol. 49, no. 2, pp. 1–39, 2016.
[3] F. Zhang, Y. Yu, Q. Mao, J. Gou, and Y. Zhan, "Pose-robust feature learning for facial expression recognition," Frontiers of Computer Science, vol. 10, no. 5, pp. 832–844, 2016.
[4] S. Zafeiriou, C. Zhang, and Z. Zhang, "A survey on face detection in the wild: past, present and future," Computer Vision and Image Understanding, vol. 138, pp. 1–24, 2015.
[5] H. Tu, K. Li, and Q. Zhao, "Robust face recognition with assistance of pose and expression normalized albedo images," ACM International Conference Proceeding Series, vol. 93, 2019.
[6] X. Chen, X. Lan, G. Liang, J. Liu, and N. Zheng, "Pose-and-illumination-invariant face representation via a triplet-loss trained deep reconstruction model," Multimedia Tools and Applications, vol. 76, no. 21, pp. 22043–22058, 2017.
[7] C. Ding, C. Xu, and D. Tao, "Multi-task pose-invariant face recognition," IEEE Transactions on Image Processing, vol. 24, no. 3, pp. 980–993, 2015.
[8] C. Ding and D. Tao, "A comprehensive survey on pose-invariant face recognition," ACM Transactions on Intelligent Systems and Technology, vol. 7, no. 3, 2016.
[9] M. Chihaoui, A. Elkefi, W. Bellil, and C. B. Amar, "A survey of 2D face recognition techniques," Computers, vol. 5, no. 4, pp. 41–68, 2016.
[10] S. K. Fazilov, N. M. Mirzaev, and G. R. Mirzaeva, "Modified recognition algorithms based on the construction of models of elementary transformations," Procedia Computer Science, vol. 150, pp. 671–678, 2019.
[11] R. He, X. Wu, Z. Sun, and T. Tan, "Wasserstein CNN: learning invariant features for NIR-VIS face recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 41, no. 7, pp. 1761–1773, 2019.
[12] S. Rahimzadeh Arashloo and J. Kittler, "Fast pose invariant face recognition using super coupled multiresolution Markov Random Fields on a GPU," Pattern Recognition Letters, vol. 48, pp. 49–59, 2014.
[13] T.-H. Chan, K. Jia, S. Gao, J. Lu, Z. Zeng, and Y. Ma, "PCANet: a simple deep learning baseline for image classification?" IEEE Transactions on Image Processing, vol. 24, no. 12, pp. 5017–5032, 2015.
[14] W. Deng, J. Hu, Z. Wu, and J. Guo, "From one to many: pose-aware metric learning for single-sample face recognition," Pattern Recognition, vol. 77, pp. 426–437, 2018.
[15] Z. Chen, W. Shen, and Y. Zeng, "Sparse representation for pose invariant face recognition," International Journal for Engineering Modelling, vol. 30, no. 1–4, pp. 37–47, 2017.
[16] T. Zhang, W. Zheng, Z. Cui, Y. Zong, J. Yan, and K. Yan, "A deep neural network-driven feature learning method for multi-view facial expression recognition," IEEE Transactions on Multimedia, vol. 18, no. 12, pp. 2528–2536, 2016.
[17] M. M. Ghazi and H. K. Ekenel, "A comprehensive analysis of deep learning based representation for face recognition," in Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, pp. 102–109, Las Vegas, Nevada, USA, July 2016.
[18] M. M. Ghazi and H. K. Ekenel, "Automatic emotion recognition in the wild using an ensemble of static and dynamic representations," in Proceedings of the 18th ACM International Conference on Multimodal Interaction (ICMI 2016), pp. 514–521, Tokyo, Japan, November 2016.
[19] Z. Zhang, P. Luo, C. C. Loy, and X. Tang, "Learning deep representation for face alignment with auxiliary attributes," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 5, pp. 918–930, 2016.
[20] C. Ding and D. Tao, "Pose-invariant face recognition with homography-based normalization," Pattern Recognition, vol. 66, pp. 144–152, 2017.
[21] R. Sharma and M. S. Patterh, "A new hybrid approach using PCA for pose invariant face recognition," Wireless Personal Communications, vol. 85, no. 3, pp. 1561–1571, 2015.
[22] C. N. Duong, K. Luu, K. G. Quach, and T. D. Bui, "Deep appearance models: a deep Boltzmann machine approach for face modeling," International Journal of Computer Vision, vol. 127, no. 5, pp. 437–455, 2019.
[23] X. Duan and Z.-H. Tan, "A spatial self-similarity based feature learning method for face recognition under varying poses," Pattern Recognition Letters, vol. 111, pp. 109–116, 2018.
[24] K. Singh, M. Zaveri, and M. Raghuwanshi, "Rough set based pose invariant face recognition with mug shot images," Journal of Intelligent & Fuzzy Systems, vol. 26, no. 2, pp. 523–539, 2014.
[25] Y. Zhao, L. Li, and Z. Liu, "A novel algorithm using affine-invariant features for pose-variant face recognition," Computers & Electrical Engineering, vol. 46, pp. 217–230, 2015.
[26] K. H. Abdalhamid and W. Jeberson, "Pose-invariant face recognition by means of artificial bee colony optimized KNN classifier," Journal of Advanced Research in Dynamical and Control Systems, vol. 11, no. 8, pp. 525–539, 2019.
[27] Y.-D. Zhang, Z.-J. Yang, H.-M. Lu et al., "Facial emotion recognition based on biorthogonal wavelet entropy, fuzzy support vector machine, and stratified cross validation," IEEE Access, vol. 4, pp. 8375–8385, 2016.
[28] G. Sang, J. Li, and Q. Zhao, Pose-Invariant Face Recognition via RGB-D Images, 2016.
[29] M. Haghighat, S. Zonouz, and M. Abdel-Mottaleb, "CloudID: trustworthy cloud-based and cross-enterprise biometric identification," Expert Systems with Applications, vol. 42, no. 21, pp. 7905–7916, 2015.
[30] M. Tiwari and A. K. Shukla, "An implementation of FACE recognition system (FARS) using PCA and PSO based techniques," State of the Art in Face Recognition, vol. 211007, no. 6, pp. 225–229, 2016.
[31] K. Sasirekha and K. Thangavel, "Optimization of K-nearest neighbor using particle swarm optimization for face recognition," Neural Computing and Applications, vol. 31, no. 11, pp. 7935–7944, 2019.
[32] S. Ahmadi and M. Rezghi, "Generalized low-rank approximation of matrices based on multiple transformation pairs," Pattern Recognition, vol. 108, 2020.
[33] B. B. Benuwa, B. Ghansah, and E. K. Ansah, "Kernel based locality-sensitive discriminative sparse representation for face recognition," Scientific African, vol. 7, Article ID e00249, 2020.
[34] L. Zhou, W. Li, Y. Du, B. Lei, and S. Liang, "Adaptive illumination-invariant face recognition via local nonlinear multilayer contrast feature," Journal of Visual Communication and Image Representation, vol. 64, Article ID 102641, 2019.
[35] Y. Xu, Z. Zhong, J. Yang, J. You, and D. Zhang, "A new discriminative sparse representation method for robust face recognition via l2 regularization," IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 10, pp. 2233–2242, 2017.