University of Ghana http://ugspace.ug.edu.gh

FACE RECOGNITION UNDER ANGULAR CONSTRAINT USING DISCRETE WAVELET TRANSFORM AND PRINCIPAL COMPONENT ANALYSIS WITH SINGULAR VALUE DECOMPOSITION

BY

ENOCH SAKYI-YEBOAH (10349736)

THIS THESIS IS SUBMITTED TO THE SCHOOL OF GRADUATE STUDIES, UNIVERSITY OF GHANA, IN PARTIAL FULFILMENT OF THE REQUIREMENT FOR THE AWARD OF THE MASTER OF PHILOSOPHY DEGREE IN STATISTICS

JULY 2017

DECLARATION

Candidate’s Declaration

This is to certify that this thesis is the result of my own research work and that no part of it has been presented for another degree in this University or elsewhere.

SIGNATURE: …………………………. DATE: ………………………..….
ENOCH SAKYI-YEBOAH (10349736)

Supervisors’ Declaration

We hereby certify that this thesis was prepared from the candidate’s own work and supervised in accordance with the guidelines on supervision of theses laid down by the University of Ghana.

SIGNATURE: …………………………. DATE: ………………………..….
DR. EZEKIEL NII NOYE NORTEY (Principal Supervisor)

SIGNATURE: …………………………. DATE: ………………………..….
DR. LOUIS ASIEDU (Co-Supervisor)

ABSTRACT

The complexity of a face’s components originates from the incessant variations in the facial components that occur with time. Notwithstanding these variations, humans recognise a person very easily using physical characteristics such as the face, voice, gait, etc. In human interactions, the articulation and perception of constraints such as head pose and facial expression form a communication channel, additional to voice, that carries crucial information about the mental, emotional and even physical state of a conversation partner. Automatic face recognition deals with extracting these essential features from an image, placing them into a suitable representation and performing some kind of recognition on them.
This study presents a statistical assessment of the performance of a modified Discrete Wavelet Transform and Principal Component Analysis with Singular Value Decomposition (DWT-PCA/SVD) algorithm under angular constraints (4°, 8°, 12°, 16°, 20°, 24°, 28° and 32°). Eighty head-pose images from 10 individuals were captured into the study database. The study dataset, extracted from the Massachusetts Institute of Technology database (2003-2005), was used for the recognition runs. The Friedman rank sum test was used to ascertain whether significant differences exist between the median recognition distances of the varying constraints and that of the straight pose (0°). Recognition rate and runtime were adopted as numerical evaluation metrics to assess the performance of the study algorithm. All numerical and statistical computations were done using Matlab. The results of the Friedman rank sum test show that the higher the degree of head pose, the larger the recognition distance, and that at 20° and above the recognition distances become profoundly larger compared to the 4° head pose. The numerical evaluations show that the DWT-PCA/SVD face recognition algorithm has an appreciable average recognition rate (87.5%) when used to recognise face images under angular constraints. Also, the recognition rate decreases for head poses greater than 20°. DWT is recommended as a feasible noise-removal tool that should be implemented during the image preprocessing phase.

DEDICATION

This research work (thesis) is dedicated to the Almighty God; my mother, Dorcas Akosua Mansah Adjei; my aunt, Comfort Sakyi; Felicia Sakyi; and my dad, Samuel Sakyi-Yeboah, of blessed memory.

ACKNOWLEDGEMENT

My sincerest thanks go to the Almighty God for His divine grace, mercy and favour upon my life throughout my stay in the University of Ghana. Without Him I wouldn’t have come this far.
I take this opportunity to express my deep appreciation and sincere gratitude to my hard-working supervisors, Dr. Ezekiel Nii Noye Nortey and Dr. Louis Asiedu, for their exemplary supervision, monitoring and continuous inspiration during the course of this research work. From the initial stages (deciding on and choosing a topic) to the final day of submission, I owe an immense debt of gratitude to my distinguished supervisors. Their sound advice, encouragement and careful guidance were invaluable. I am indebted to Dr. Felix Okoe Mettle and Mr. Issah Seidu for their affectionate support, valuable information, explanations and guidance, which aided me through the various stages of my research work. I am very grateful to Dr. Samuel Iddi, Mrs. Charlotte Chapman Wardy, Asaa Gyebi-Adjei, Benita Odoi, Derek K. Degbedzui, Alberta Adoma Bawuah, Jacklyn Asiedua Korankye and Abeku Atta Asare-Kumi for the assistance they provided in their relevant fields, and for their care, co-operation and support during the course of my research work. Finally, I thank my mother, aunt, sibling, cousins, roommates (Seth Attah Armah, Francis Dogodzi and Kwaku Kyere Appiah) and colleagues (Emmanuel Kojo Aidoo, Stephen Nkrumah, Obu Amponah Amoah, Fred Fosu Agyarko and Millicent Narh) for their constant encouragement and support, without which this research work would not have been possible. Thank you all for this immeasurable support. May the Almighty richly bless you all in abundance.

TABLE OF CONTENT

DECLARATION
ABSTRACT
DEDICATION
ACKNOWLEDGEMENT
TABLE OF CONTENT
LISTS OF FIGURES
LISTS OF PLATES
LISTS OF TABLES
ABBREVIATION
CHAPTER ONE
INTRODUCTION
1.1 Background of the Study
1.2 Problem Statement
1.3 Objectives
1.4 Significance of the Study
1.5 Brief Methodology
1.6 Scope
1.7 Limitation of the Study
1.8 Analytic Tools Used for the Study
1.9 Organization of the Study
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
2.2 Reviews of Face Recognition Works
2.2.1 Face Recognition Approaches
2.2.2 Face Recognition Techniques
2.3 Feature Extraction Method
2.4 Haar Transform
2.5 Conclusion
CHAPTER THREE
METHODOLOGY
3.1 Introduction
3.1.1 Data Acquisition
3.2 RECOGNITION PROCEDURE
3.2.1 Preprocessing Stage
3.2.2 Feature Extraction Technique
3.2.3 Recognition Stage
3.3 Statistical Testing Approach
3.3.1 Assessing Multivariate Normality Based on the Euclidean Distance
3.3.2 Repeated Measures Design
3.3.3 Friedman Rank Sum Test
CHAPTER FOUR
DATA ANALYSIS
4.1 Introduction
4.1.1 Data Description
4.2 Result
4.2.1 Preprocessing Stage
4.2.2 Feature Extraction Stage
4.2.3 Recognition Procedure Stage
4.3 Statistical Evaluation of the Euclidean Distance
4.3.1 Assessing for Normality
4.3.2 Test for Multivariate Normality
4.3.3 Friedman Rank Sum Test
4.3.4 Recognition Rate
4.3.5 Recognition Curve
CHAPTER FIVE
Summary, Conclusion and Recommendation
5.1 Introduction
5.2 Summary
5.3 Conclusions and Recommendations
REFERENCE
APPENDIX I
APPENDIX II

LISTS OF FIGURES

Figure 3.1: Research Design
Figure 3.2: Summary of the face recognition process of the study
Figure 4.1: A display of the covariance matrix of the train image
Figure 4.2: Image unit of eigenvector U
Figure 4.3: Image of the diagonal matrix of the train image
Figure 4.6: Proportion of the variance explained by the principal components
Figure 4.4: A graph of observed data quantiles against the normal quantiles
Figure 4.5: Display of the recognition curve
LISTS OF PLATES

Plate 3.1: Original images generated from 3D models for all ten subjects (Source: Weyrauch, Huang, Heisele & Blanz (2004))
Plate 4.1: The display of the ten train images (Source: Weyrauch, Huang, Heisele & Blanz (2004))
Plate 4.2: Example of the test image under the various angular poses (Source: Weyrauch, Huang, Heisele & Blanz (2004))
Plate 4.3: Preprocessed image from DWT
Plate 4.4: Image of the eigenfaces of the trained images
Plate 4.5: Example of a correctly recognised individual across all angular poses (Source: Weyrauch, Huang, Heisele & Blanz (2004))

LISTS OF TABLES

Table 1.1: Some applications of face recognition systems
Table 3.1: Discrete Wavelet Transform decomposition of an image using a 3-level pyramid, where 1, 2, 3 represent decomposition levels, H represents high-frequency bands and L represents low-frequency bands
Table 3.2: Data display for the Friedman two-way ANOVA by ranks
Table 4.1: The Euclidean distance for the ten images in the train dataset
Table 4.3: Friedman Test
Table 4.4: Pairwise comparisons (post hoc of the Friedman test)
Table 4.5: Recognition Rate

ABBREVIATION

AFR ……………………… Automatic Face Recognition
ANN ……………………… Artificial Neural Network
BFLD ……………………… Block-based Fisher’s Linear Discriminant
BFR ……………………… Biometric Face Recognition
CWT ……………………… Continuous Wavelet Transform
DFT ……………………… Discrete Fourier Transform
DLDA ……………………… Direct Linear Discriminant Analysis
DWT ……………………… Discrete Wavelet Transform
FFT ……………………… Fast Fourier Transform
FRVT ……………………… Face Recognition Vendor Tests
FWT ……………………… Fast Wavelet Transform
GF ……………………… Gabor Feature
IALDA ……………………… Illumination Adaptive Linear Discriminant Analysis
ICA ……………………… Independent Component Analysis
ID ……………………… Identification Number
IDWT ……………………… Inverse Discrete Wavelet Transform
KLT ……………………… Karhunen–Loève Transformation
KPCA ……………………… Kernel Principal Component Analysis
LDA ……………………… Linear Discriminant Analysis
LFA ……………………… Local Feature Analysis
LMBP ……………………… Levenberg–Marquardt Backpropagation
MIT ……………………… Massachusetts Institute of Technology
MPCA ……………………… Modified Principal Component Analysis
MRA ……………………… Multiresolution Analysis
MVN ……………………… Multivariate Normality
NDA ……………………… Non-Parametric Discriminant Analysis
PCA ……………………… Principal Component Analysis
PDF ………………………
Probability Distribution Function
PINS ……………………… Person Identification Number
SURF ……………………… Speeded Up Robust Feature
SVD ……………………… Singular Value Decomposition
SVM ……………………… Support Vector Machine
WHT ……………………… Walsh–Hadamard Transform
WMPCA ……………………… Weighted Modular Principal Component Analysis

CHAPTER ONE

INTRODUCTION

1.1 Background of the Study

The human brain contains specialised nerve cells that permit an individual to recognise features such as angles, lines, edges or movement. These nerve cells begin operating 24 hours after birth (Wagner, 2012). However, the complexity of a face’s components originates from the incessant variations in the facial components that occur with time. Notwithstanding these variations, humans recognise a person very easily using physical characteristics such as the face, voice, gait, etc. Automatic face recognition deals with extracting these essential features from a face image, placing them into a suitable representation and executing some kind of recognition on them (Wagner, 2012). Mimicking this human skill with machines can be very worthwhile; however, building an intelligent and self-learning system may necessitate the provision of adequate data for the machine to operate. A well-organised and robust face recognition technique is valuable in a number of application areas, including criminal identification, access management, law enforcement, information security and entertainment or leisure. Recently, face recognition has become a relevant technique in our day-to-day lives, and it has received blooming attention among researchers and the general public. This can be attributed to its application to incidents such as terrorism, robbery, theft, assassination, mob justice and credit card fraud committed across the globe.
However, researchers have argued that face recognition is among the most challenging problems in computer vision, and have posited that face recognition applications are far from restricted to security systems, as perceived and portrayed by the general public (Sahu and Tiwari, 2015).

A face recognition framework is used primarily for verification (in this case, the system must confirm or reject the claimed identity of the individual presented in the query, the “input face”) and identification (in this case, the system is presented with an unidentified face and must determine its identity from a database of known faces). For many years, people have been utilizing physical human attributes, for example gait (movement), facial expression, voice, height, weight and shadow, to recognise an individual. The set of facial features that relate to an individual constitutes their identity. With the introduction of computers in the twentieth century, the scientific community became interested in using computer software to identify or verify an individual’s identity. This led to face recognition studies which present an individual facial image with targeted facial characteristics (such as the mouth, eyes, ears and nose) and measure one face image at a time. Face recognition works with individuals’ images and tackles many stimulating real-world uses such as de-duplication of identity documents (for example, National Health Insurance Scheme cards, passports, voter IDs, driver licenses, national IDs and many others), access control and video surveillance (Liao, Jain, & Li, 2013).
Chellappa, Wilson, and Sirohey (1995) and Woodward Jr, Horn, Gatune, and Thomas (2003), in their respective research findings, concluded that face recognition is a fundamental part of biometrics, which aids the automatic recognition of an individual using recognised traits of that individual. Face recognition techniques fall into two categories based on differing criteria: the appearance-based (also referred to as photometric) model and the feature-based (also referred to as geometric) model. The appearance-based model utilises the entire face, or specific areas of it, for recognition and verification; it is applied by extracting facial features from the facial image. The feature-based model uses the geometrical components of the face image, which include the positions of the ears, mouth, eyes and nose; it focuses on assessing geometric distances among the positions of the eyes, and ratios of geometric distances among the positions of the nose and eyes. Thus, face recognition that employs a feature-based model of a face image is presumably the natural way to deal with face recognition. Current research reveals two approaches to face recognition systems: the first utilises facial characteristics such as the mouth, eyes, nose, ears and chin to recognise an image; the second utilises the entire face to recognise an individual's image. For the first approach, Belhumeur, Jacobs, Kriegman, and Kumar (2013) showed that local methods of face recognition calculate descriptors of the facial characteristics of the face image and assemble them into a single descriptor. Some examples of feature-based methods are Local Feature Analysis (LFA), Gabor Features (GF), Elastic Bunch Graph Matching (EBGM) and Local Binary Pattern (LBP) features.
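The geometric, feature-based idea described above can be made concrete. The following Python fragment is illustrative only (the study itself uses Matlab, and the landmark coordinates here are invented, not taken from any real image); it builds a small feature vector of inter-landmark distances plus one scale-invariant ratio:

```python
import numpy as np

# Hypothetical landmark coordinates (x, y) for one face image: the
# eyes, nose tip and mouth centre. In a real feature-based system
# these would come from a landmark detector, not be hard-coded.
landmarks = {
    "left_eye":  np.array([35.0, 40.0]),
    "right_eye": np.array([65.0, 40.0]),
    "nose":      np.array([50.0, 55.0]),
    "mouth":     np.array([50.0, 75.0]),
}

def geometric_features(pts):
    """Feature vector of pairwise distances plus one distance ratio."""
    eye_dist   = np.linalg.norm(pts["left_eye"] - pts["right_eye"])
    eye_centre = (pts["left_eye"] + pts["right_eye"]) / 2.0
    nose_dist  = np.linalg.norm(eye_centre - pts["nose"])
    mouth_dist = np.linalg.norm(eye_centre - pts["mouth"])
    # The ratio is unchanged by uniform rescaling of the image, which
    # is why geometric methods often prefer ratios to raw distances.
    return np.array([eye_dist, nose_dist, mouth_dist, nose_dist / eye_dist])

features = geometric_features(landmarks)
```

Two faces are then compared by the distance between their feature vectors rather than between raw pixels.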
In the second approach, the idea of the whole-face technique is to develop a subspace using Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Independent Component Analysis (ICA), Random Projection (RP) or Non-negative Matrix Factorization (NMF) as the feature extraction technique. These dimensionality reduction (mainly feature extraction) techniques reduce the high-dimensional space of the face image information (matrix) to an intrinsic low-dimensional space carrying approximately the same face image information for recognition purposes (Belhumeur et al., 2013). Wagner (2012) stated that the automatic face recognition approach utilises the extracted features of a face image, putting the face images into a vital representation of the extracted facial features before performing the recognition; this kind of approach improves the recognition rate of face recognition systems. Liao et al. (2013) argued that although automatic face recognition algorithms have improved recognition rates significantly, many problems remain with the performance of face recognition algorithms under environmental constraints (facial expression, pose variation, occlusion, ageing and extreme ambient illumination). Also, developing an algorithm for a face recognition system is quite a challenging task, since human faces are multifaceted and multi-dimensional and react to visual stimuli (Sahu & Tiwari, 2015). Many researchers have designed algorithms to tackle face recognition problems, but there is little work on face recognition under angular constraints. Lately, angular constraints have generated a lot of discussion in the scientific community. Since face images captured by digital devices are usually somewhat tilted, many researchers have advocated extensive research work directed towards angular constraints.
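The subspace construction described above — PCA computed through an SVD of the centred training matrix, with recognition by nearest neighbour in the reduced space — can be sketched as follows. This is a generic eigenface-style sketch in Python with random data standing in for real face images; it is not the study's own implementation (which is in Matlab and is detailed in Chapter 3):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in training set: 10 vectorised "face images" of
# 100 x 100 = 10000 pixels each (the study resizes faces to 100 x 100).
n_train, n_pixels = 10, 100 * 100
X = rng.standard_normal((n_train, n_pixels))

# Centre the images about the mean face.
mean_face = X.mean(axis=0)
A = X - mean_face

# SVD of the centred data: rows of Vt are the principal axes
# ("eigenfaces"); at most n_train - 1 of them carry information.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 9
eigenfaces = Vt[:k]                  # shape (9, 10000)

# Project every training image into the k-dimensional subspace.
weights = A @ eigenfaces.T           # shape (10, 9)

# A probe image is recognised by projecting it the same way and
# taking the nearest training image (Euclidean distance) in the
# subspace. Here the probe is a noisy copy of training image 3.
probe = X[3] + 0.01 * rng.standard_normal(n_pixels)
w = (probe - mean_face) @ eigenfaces.T
match = int(np.argmin(np.linalg.norm(weights - w, axis=1)))
```

The recognition distance reported by such a system is exactly the Euclidean distance minimised in the last line, which is the quantity the study later subjects to the Friedman rank sum test.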
The study focused on recognizing face images under angular constraints using the DWT-PCA/SVD algorithm.

1.2 Problem Statement

The problems associated with face recognition continue to attract researchers from various disciplines due to the numerous applications of face recognition. Currently, face recognition is categorised into model-based schemes derived from a number of typical algorithms. Weyrauch, Heisele, Huang, and Blanz (2004) posited that variations in the frontal-view pose mainly lead to variations in the positions of the facial components, which calls for flexibility in the geometric model. Murugan, Arumugam, Rajalakshmi, and Manish (2010) stated that, after many years of invention of various software packages and exploration, face recognition remains a challenging field of study; the authors argued that face recognition algorithms are sensitive to variations in an individual's pose position. Passalis, Perakis, Theoharis, and Kakadiaris (2011) presented a three-dimensional model to tackle frontal head pose in face recognition; their study revealed that angular constraints pose a great challenge to face recognition algorithms, especially in real-world biometric applications. Beham and Roomi (2012) suggested that numerous issues require extensive research in the area of face recognition algorithms. These issues also speak to the need for further improvement in facial recognition to address the challenges of angular variation. This study attempts to fill the gap by assessing the performance of a modified algorithm (DWT-PCA/SVD) under angular constraints.

1.3 Objectives

This study measures the performance of a modified face recognition algorithm (DWT-PCA/SVD) under angular constraints. Specifically, the study seeks to:

i. Modify the PCA/SVD algorithm using DWT at the preprocessing phase.
ii.
Assess whether a significant difference exists in the average recognition distances of the various angular constraints.
iii. Assess the performance of the modified algorithm under angular constraints through computation of the recognition rate.

1.4 Significance of the Study

An efficient and resilient face recognition algorithm is significant in addressing most of the real-life problems in the application areas displayed in Table 1.1. Thus, this study serves as a catalyst to breed interest and further research into other aspects of biometrics in Ghana.

Table 1.1: Some applications of face recognition systems

Area | Applications of Face Recognition
Information Security | Access security, data privacy, user authentication, intranet security, application security, file encryption, securing trading terminals, medical records, internet access, TV parental control, personal device log-on
Access Management | Secure access authentication, personal identification (national ID, passport, driver's licence), welfare fraud, entitlement programs, immigration, voter registration, multimedia communication (synthetic faces)
Biometric / Law Enforcement | Advanced video surveillance, forensic reconstruction of faces from remains, CCTV control, portal control, post-event analysis, shoplifting, suspect tracking and investigation
Entertainment, Leisure | Virtual reality, photo cameras, training programs, human-robot interaction, human-computer interaction and home video games

1.5 Brief Methodology

The study used a secondary database extracted from the Massachusetts Institute of Technology (2003-2005) database. This database consists of various angular faces of ten individuals (both male and female) in a morphable model. The angular face image data served as input for the proposed model. Ten face images from 10 individuals with straight head pose (0°) were captured into the training image database to be trained by the study algorithm.
The test image database consists of 80 face images from 10 individuals, each captured under eight angular constraints (4°, 8°, 12°, 16°, 20°, 24°, 28° and 32°). The facial images of these individuals were captured and resized to 100 × 100 pixel resolution for recognition. The literature reveals that face recognition is executed in three essential phases, namely the preprocessing phase, the extraction (dimension reduction) of facial features phase and the classification (recognition) stage. The research focused on running a template-based (DWT-PCA/SVD) face recognition algorithm, constructed on a frontal face database, and subsequently evaluated the performance of the study algorithm for recognition.

1.6 Scope

The study relies on recognizing the frontal face images in the adopted database under varying head poses.

1.7 Limitation of the Study

This study is limited to assessment under angular constraints. Face images captured under any constraints other than the aforementioned were not considered in the study.

1.8 Analytic Tools Used for the Study

The analytical tool used for the image training and recognition is Matlab (version R2015a). The Matlab software was also used for all numerical computation, and the R software was used for all statistical analysis.

1.9 Organization of the Study

The study of face recognition under angular constraints using DWT-PCA/SVD consists of five chapters. The first chapter introduces the study: the background, problem statement, objectives, significance (some applications of face recognition systems), brief methodology, scope, limitations, analytic software used and chapter organization, including a general overview of the face recognition algorithm used. Chapter Two covers an empirical and theoretical review of articles, journals and related issues on face recognition.
This includes automatic face recognition techniques and approaches involving PCA/SVD and the wavelet transform. Chapter Three provides the concrete theories and underlying assumptions employed in this study. This chapter entails the discussion of Discrete Wavelet Transform and Principal Component Analysis with Singular Value Decomposition (DWT-PCA/SVD) in the face recognition stages. The chapter also discusses the pre-processing techniques, recognition procedure and feature extraction methods in the face recognition algorithm. In Chapter Four, the study presents the analysis of the data, results and discussion of the facial recognition system. A computational model for the frontal view of images, based on statistical techniques, is developed. The PCA/SVD was modified and evaluated under angular constraints using the MIT database. The frontal-view face recognition algorithm is extended to angular constraints. Chapter Five presents the findings, conclusions and recommendations of the study.

CHAPTER TWO

LITERATURE REVIEW

2.1 Introduction

This chapter focuses on the empirical and theoretical appraisal of articles, journals and related issues on face recognition using Discrete Wavelet Transform and Principal Component Analysis with Singular Value Decomposition (DWT-PCA/SVD). This includes automatic face recognition techniques and approaches involving PCA/SVD, the wavelet transform and other statistical approaches.

2.2 Reviews of Face Recognition Works

The earliest study of face recognition can be traced as far back as the 1870s. Darwin (1872) initiated the earliest studies in his work on the facial expression of emotions in man and animals. Galton (1892) worked on facial profile-based biometrics aimed at personal identification. The study discovered an independent feature suitable for hereditary investigation that aided personal identification.
The study was limited to parentage and next of kin, and failed to address variations in image similarity, pose, illumination, occlusion, etc. It was around the 1950s that automated face recognition was first developed. Bruner and Tagiuri (1954) pioneered the first research studies into automated face recognition in psychological research. In engineering, Bledsoe (1964) adopted existing methods from the literature to design and implement a semi-automated system. Its shortcomings were attributed to recognition over a huge database of face images and photographs: the challenge was to select a small set of records from the image database such that one of the image records matched the photograph. In addition, Bledsoe's work received little publicity because it was fully funded by an unnamed intelligence agency. In the 1970s, Kelly (1970) and Kanade (1973) presented seminal works in which a semi-automated face recognition technique was adopted to map a person's face image onto a global template using a facial template scheme. However, their method required high computational time because locations and measurements were determined manually. The literature reveals that after the studies of Kelly (1970) and Kanade (1973), research related to face recognition systems became dormant until the mid-1980s, when researchers concentrated on automated face recognition techniques that employed facial components (such as the eyes, nose and mouth) to recognise an individual's image. Sirovich and Kirby (1987) initiated methods of representing a facial image in a picture format (dimension). Their study indicates that, within a specified framework, a face represented in picture form is unique. Further, they also stated that the characterisation of a face, within an error bound, by relatively low-dimensional vectors is limited when significant scale and illumination variations are present.
Kohonen (1989) used simple neural networks to perform face recognition on aligned and normalised face images. His study employed a computed face descriptor to approximate the eigenvectors of the face image autocorrelation matrix. Kohonen's procedure performed well in principle, but the need for precise alignment and normalisation rendered it practically unsuccessful. Later, in the 1990s, face recognition techniques saw a perceptible upsurge in the number of articles and journals across various research outlets. Different algorithms for face recognition have emerged from various disciplines such as Engineering (Electrical & Computer), Mathematics, Statistics, Computer Science, Psychology, etc. Among them are PCA, ICA, LDA, Neural Networks and their derivatives (Asiedu, Adebanji, Oduro, & Mettle, 2015). Kirby and Sirovich (1990) focused on a well-defined human face and employed the Karhunen-Loeve Transform (KLT) expansion framework. This approach proved that fewer than 100 values were essential to precisely code suitably aligned and normalised images. The outcome of the study projected faces onto an optimal basis. The shortfall of this study is that the eigenfunctions of the covariance matrix were computed manually. Turk and Pentland (1991) state that the application of information theory motivated their study to adopt an eigenface approach to provide practical solutions to the face recognition problem. The eigenface approach to face recognition is relatively simple, very fast and performs better under constrained environments. Even though the eigenface approach was restricted to constrained environments, it still created a significant framework for advancing the development of automatic face recognition systems. The development of face recognition began a century ago, but it became available commercially in the 1990s.
Public attention was drawn to a trial version of face recognition technology in January 2001 at the USA Super Bowl event, during which surveillance face images were captured and compared to a digital mugshot database. Research revealed that many people across the globe heard of face recognition technologies after the September 11, 2001 disaster (Vijg, 2011). A 2006 report of the National Science and Technology Council (NSTC) later reviewed the introduction of face recognition technology at the Super Bowl event. This work, involving face recognition technologies, initiated sustained attention on technology that supports natural needs while being considerate of public social and privacy concerns. Recent years have witnessed a growing concentration of researchers in the field of face recognition. This significant attentiveness is based on the increasing application areas of face recognition systems. Among these application areas are information security (data privacy, user authentication), biometric systems (national IDs, passport checks, school IDs, staff IDs, etc.), law enforcement (video surveillance, forensic reconstruction of faces, criminal investigation, etc.), and image and film processing. The ability of a face recognition system to suitably recognise a face image depends on a variety of variables, including illumination, head pose, lighting (Gross & Brajovic, 2003), expressions of the facial features (Georghiades, Belhumeur, & Kriegman, 2001), quality of the image (Shan, Gong, & McOwan, 2009) and tilted angular images. Currently, face recognition technologies are used to combat crimes such as passport fraud and credit/debit card fraud, support law enforcement, identify missing individuals and reduce benefit/identity fraud.
It is also an undeniable fact that many researchers have addressed the problems associated with face recognition by proposing various types of algorithms and techniques which approximate the ability of the human brain (from different points of view). Face recognition algorithms are classified into two categories: image template-based or geometry feature-based. The geometric feature-based method analyses the localised facial components and establishes a geometrical relationship among them. The geometric facial features include the eyebrows, mouth, chin, cheeks, etc. A typical example of this method is the Elastic Bunch Graph Matching algorithm. Carrera and Marques (2010) stated that the geometric approach employs a set of fiducial points for face recognition. The geometric features are generated from segments, perimeters and areas defined by the fiducial points. However, this approach is less often used, since there are developed algorithms that use the whole face (the template-based approach). Template-based methods (appearance-based methods) utilise the facial features of the entire face to perform face recognition. Template-based methods calculate a measure of correlation between original faces and a set of template models to assess the face identity (Guillamet & Vitria, 2002). Independent Component Analysis (ICA), Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Singular Value Decomposition (SVD) and Non-Negative Matrix Factorization (NMF) are some examples of template-based methods. These algorithms are typically dimension reduction (feature extraction) algorithms which seek to lessen the dimensions of the face space, thereby bringing out only the important points on the face for recognition and identification purposes.
In face recognition work, the dimensions of face images are very large, and this requires a substantial amount of computational time (runtime) for recognition (classification) (Thakur, Sing, Basu, Nasipuri, & Kundu, 2008).

2.2.1 Face Recognition Approaches

According to Brunelli and Poggio (1993) and N. A. Singh, Kumar, and Bala (2016), there are three face recognition approaches, namely:
 Geometrical (feature-based) approach
 Holistic-based approach
 Hybrid approach

2.2.1.1 Face Recognition using the Feature-Based Approach

The method of using geometrical features in face recognition implies the recognition of facial components such as the nose, eyes and hair. These facial components are partitioned into segments for faster and easier recognition of a face image. The process uses input data for recognition of a face image by matching input images with the available database images. For the feature-component-based methods, the fundamental principle is to compensate for pose variations by permitting a flexible geometrical relation among the features in the recognition phase (Reddy, Babu, & Kishore, 2011). Brunelli and Poggio (1993) performed a face recognition task by adopting matching templates of the facial features (eyes, nose and mouth) independently. The facial feature classification of the feature-based model was limited, since the procedure fails to consider the factors associated with variations arising from visual stimuli. Though the feature components perform better under varying lighting conditions, the approach falls short under environmental constraints, as the feature points are not precisely examined outside the face region.

2.2.1.2 Face Recognition using the Holistic-Based Approach

Holistic approaches, also known as appearance-based approaches, use eigenfaces and Fisherfaces in face recognition. This technique has been demonstrated to be effective in testing with huge databases.
Adopting the holistic approach requires using either entire faces as components (features) or Gabor-wavelet-filtered whole faces (Beham & Roomi, 2012). The study adopted a holistic approach for face recognition in which the facial images are pre-processed (DWT) to enhance the recognition performance. Thus, the entire face image is resized and normalised, after which it is de-noised through the Discrete Wavelet Transform (DWT) to extract facial features such as the mouth, eyes and nose.

2.2.1.3 Face Recognition using the Hybrid Method

The hybrid method in face recognition is the combination of the holistic-based approach and the feature-based approach (Reddy et al., 2011). Harandi, Ahmadabadi, and Araabi (2009) argue that the holistic approach is not the only way of recognising a face image in a database; combinations of other approaches exist. Their study presented a hybrid recognition approach which consists of holistic and geometrical component-based methods. Their method sought to determine the behaviour of face recognition by uniquely combining the facial features in the task of performing recognition. Their face recognition utilised a weighted strategy to link the outcomes of the facial component classifier and the entire image; a simulation study gave better performance compared to that of the eigenface approach. Kakade's (2016) study revealed that the hybrid method enhanced face recognition algorithms. His study assessed face recognition algorithms using statistical approaches (ANN, LDA, PCA and SVM), which yielded better recognition performance.

2.2.2 Face Recognition Techniques

According to Singh et al. (2016), face recognition techniques are classified as follows:
 Artificial Neural Network
 Template-based matching
 Geometrical feature matching
 Fisherface

2.3 Feature Extraction Method

Some literature refers to the feature extraction method as a dimension reduction technique.
Haykin (2009) states that feature extraction simply transforms the face space into a geometrical vector (feature space). In the geometrical vector, the face image database is characterised by a smaller number of facial features which retain most of the significant components of the original facial image. Bellakhdhar, Loukil, and Abid (2013) state that the performance of a face recognition algorithm depends on how it extracts the feature vector and recognises a facial image correctly. The feature extraction method is thus a vital tool in determining the recognition-rate performance of a face recognition algorithm. In this section, the study principally focuses on PCA/SVD as a face dimension reduction (feature extraction) technique. Other dimension reduction techniques are LDA, ICA, etc.

2.3.1 Face Recognition by Principal Component Analysis (PCA)

2.3.1.1 Theoretical Approach

PCA was first proposed in 1901 by Karl Pearson and in 1933 by Harold Hotelling to turn a set of possibly correlated variables into a smaller set of uncorrelated variables (Turk & Pentland, 1991). PCA is one of the most widely used holistic (appearance-based) approaches for dimension reduction, and it is applied upstream (at the feature extraction stage) of other algorithms to enhance the outcomes of applications for face recognition problems. The primary aims of PCA include feature extraction, compression of data, removal of data redundancy, etc. The fundamental idea of a face recognition system is to convert two-dimensional facial images into one-dimensional vectors of values consisting of the compact principal components of the feature space. This implies that PCA determines the vectors which best account for the face image distribution within the entire image space and aims to extract a subspace where the variance is maximised (Moghaddam & Pentland, 1997).
In face recognition systems, PCA is a statistical approach adopted to explain the variance-covariance structure of an image dataset. Based on the variance-covariance matrix generated from the image dataset, the PCA technique is employed to compute the eigenvalues and eigenvectors and to project the original image datasets onto a reduced-dimensional feature space. This involves a numerical procedure that transforms a number of correlated variables into a smaller set of uncorrelated variables called principal components, which maximise the scatter of all the projected samples. According to A. Singh, Singh, and Tiwari (2012), the PCA process can be achieved if the training (query/probe) and testing (gallery) images have the same dimensions and undergo normalisation of the face image in both datasets. Normalisation involves lining up the eyes, nose and mouth of the subjects within the face image. Each face image is then represented as the weighted sum of a geometrical space (feature vector) of eigenfaces in a one-dimensional array. The procedure compares the training image against the testing image by computing the Euclidean distances between their respective geometrical spaces (fiducial points on the face), which leads to recognising the face image in the dataset.

Mathematically, let $T_j = [T_1, T_2, \ldots, T_M]$ represent $M$ normalised training images in a dataset. Each training image $T_j$, an $N \times N$ matrix of pixel values, is represented as a feature vector by stacking its entries:

$$T_j = \begin{bmatrix} t_{11} & t_{12} & t_{13} & \cdots & t_{1N} \\ t_{21} & t_{22} & t_{23} & \cdots & t_{2N} \\ t_{31} & t_{32} & t_{33} & \cdots & t_{3N} \\ \vdots & \vdots & \vdots & & \vdots \\ t_{N1} & t_{N2} & t_{N3} & \cdots & t_{NN} \end{bmatrix} \longrightarrow \begin{bmatrix} t_{11} \\ t_{12} \\ t_{13} \\ \vdots \\ t_{NN} \end{bmatrix} \quad (2.1)$$

 Determine the mean of the vectorised training images:

$$\bar{T} = \frac{1}{M} \sum_{j=1}^{M} T_j \quad (2.2)$$

 Subtract the vectorised average (mean) face image from each original face image:

$$S_j = T_j - \bar{T} \quad (2.3)$$

 Compute the covariance matrix:

$$C = \sum_{j=1}^{M} S_j S_j^T \quad (2.4)$$

Thus, letting $A = [S_1, S_2, \ldots, S_M]$,

$$C = AA^T \quad (2.5a)$$

where $C \in \mathbb{R}^{N^2 \times N^2}$ and $A \in \mathbb{R}^{N^2 \times M}$. The dimensional space of the covariance matrix $C$ is huge, so there is a need to reduce this high-dimensional space to an intrinsic low-dimensional space. Various studies have therefore re-defined the covariance matrix based on dimension reduction as:

$$C = A^T A \quad (2.5b)$$

 The covariance matrix $C$ is then used to determine the eigenvalues and eigenvectors. This procedure can be carried out through the singular value decomposition (SVD); thus, the covariance matrix can be decomposed as:

$$C = UDV^T \quad (2.6)$$

where the columns of $U$ and $V$ are orthonormal, with respective dimensions $(m \times k)$ and $(k \times n)$, and $D$ is a $(k \times k)$ diagonal matrix with real positive values. The diagonal elements of $D$ are termed the singular values; the $m$ rows of $U$ and the $n$ rows of $V$ are termed the left-singular vectors and right-singular vectors respectively. Let $(v_1, v_2, v_3, \ldots, v_k)$ represent the right-singular vectors resulting from the SVD. It has been proven that

$$v_1 = \underset{\|v\|=1}{\operatorname{argmax}} \; \|Av\| \quad (2.6a)$$

$$v_i = \underset{\substack{\|v\|=1 \\ v \perp v_1, \ldots, v \perp v_{i-1}}}{\operatorname{argmax}} \; \|Av\|, \qquad i = 2, \ldots, k \quad (2.6b)$$

Based on the above equations, it can be deduced that $\sigma_1(A) = \|Av_1\|$ is the leading diagonal element of $D$. The eigenvectors and eigenvalues can be computed using the Jacobi transformation.

 The eigenvectors in the image space are then recovered as:

$$U_j = AV_j \quad (2.7)$$

 Compute the weight of each training image:

$$p_j = U_j^T S_j \quad (2.8)$$

 Compute the individual weight vector for each training image:

$$\eta_j = \begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ \vdots \\ p_N \end{bmatrix} \quad (2.9)$$
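The derivation in equations (2.1)-(2.9), together with the minimum-Euclidean-distance comparison described above, can be sketched as follows. This is an illustrative Python/NumPy sketch rather than the thesis's Matlab implementation, and random arrays stand in for the face images:

```python
import numpy as np

# Illustrative sketch of the PCA/SVD training steps (2.1)-(2.9).
rng = np.random.default_rng(1)
M, N = 10, 100                       # 10 training faces of N x N pixels

T = rng.random((N * N, M))           # columns: vectorised training images
mean_face = T.mean(axis=1, keepdims=True)          # eq. (2.2)
A = T - mean_face                    # mean-centred images S_j, eq. (2.3)

# Work with the small M x M covariance A^T A of eq. (2.5b) instead of the
# huge N^2 x N^2 matrix A A^T of eq. (2.5a).
C = A.T @ A
U, D, Vt = np.linalg.svd(C)          # eq. (2.6)

eigenfaces = A @ U                   # map back to image space, eq. (2.7)
eigenfaces /= np.linalg.norm(eigenfaces, axis=0)   # normalise columns

weights = eigenfaces.T @ A           # projection weights, eq. (2.8)

# Recognition: project a probe image and find the minimum Euclidean distance.
probe = rng.random((N * N, 1))
w = eigenfaces.T @ (probe - mean_face)
distances = np.linalg.norm(weights - w, axis=0)
print(int(np.argmin(distances)))     # index of the best-matching identity
```

The covariance "trick" of equation (2.5b) is what makes the computation feasible: the SVD is taken of a 10 × 10 matrix rather than a 10000 × 10000 one.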
For example, let $A$, $B$ and $C$ represent matrices of face images in a database, each of dimension $2 \times 2$:

$$A = \begin{bmatrix} 3 & 3 \\ 2 & 4 \end{bmatrix}, \qquad B = \begin{bmatrix} 9 & 5 \\ 3 & 3 \end{bmatrix}, \qquad C = \begin{bmatrix} 7 & 8 \\ 4 & 9 \end{bmatrix}$$

 Vectorise the face images in the dataset (reshape each into a column of one matrix):

$$T_j = \begin{bmatrix} 3 & 9 & 7 \\ 2 & 3 & 4 \\ 3 & 5 & 8 \\ 4 & 3 & 9 \end{bmatrix}$$

 Recall (2.2); the means of the vectorised face images are:

$$\bar{T} = \begin{bmatrix} 3 & 5 & 7 \end{bmatrix}$$

 Recall (2.3); the mean-centred images are:

$$S_j = \begin{bmatrix} 0 & 4 & 0 \\ -1 & -2 & -3 \\ 1 & 0 & 1 \\ 0 & -2 & 2 \end{bmatrix}$$

 Recall (2.4); the covariance matrix is:

$$C = \begin{bmatrix} 2 & 2 & 4 \\ 2 & 24 & 2 \\ 4 & 2 & 14 \end{bmatrix}$$

 The singular value decomposition gives:

$$U = \begin{bmatrix} -0.1248 & 0.2569 & -0.9583 \\ -0.9659 & -0.2521 & 0.0582 \\ -0.2266 & 0.9329 & 0.2796 \end{bmatrix}$$

$$D = \begin{bmatrix} 24.7278 & 0 & 0 \\ 0 & 14.5611 & 0 \\ 0 & 0 & 0.7109 \end{bmatrix}$$

$$V = \begin{bmatrix} -0.1248 & 0.2569 & -0.9583 \\ -0.9659 & -0.2521 & 0.0582 \\ -0.2266 & 0.9329 & 0.2796 \end{bmatrix}$$

 Eigenvalues:

$$\begin{bmatrix} 1.81 \times 10^{2} & 2.32 \times 10^{1} & -2.31 \times 10^{-16} \end{bmatrix}$$

 Eigenvectors:

$$\begin{bmatrix} 0.0524 & 0.1897 & 0.9804 \\ -0.9362 & 0.3508 & -0.0178 \\ 0.3473 & 0.9169 & -0.1960 \end{bmatrix}$$

2.3.1.2 Empirical Approach

Many scholarly works on face recognition systems have adopted PCA as a dimension reduction mechanism. The study considered the most recent works using PCA in a face recognition system. Nicholl and Amira (2008) adopted the PCA (eigenfaces) approach in face recognition. They presented the discrete wavelet transform with principal component analysis (DWT/PCA) to automatically select the most discriminative coefficients in a face image. The most discriminative coefficients were determined based on their respective intra-class and inter-class variations of the training set of eigenface weight vectors. Even though the method improved the performance of the face recognition algorithm, the study failed to address pose variation and illumination in the training face image datasets. Kshirsagar, Baviskar, and Gaikwad (2011) applied information theory (coding and decoding of the face image) to address face recognition problems.
The method used PCA and a Back Propagation Neural Network (BPNN) for feature extraction and recognition respectively. Their method was described as an effective approach to determining the lower-dimensional space. Their study revealed that the eigenfaces approach possesses properties that significantly extract face features and that the BPNN lessens the input size of the face image. The study concentrated on dimension reduction of the face image, but its performance is limited to constrained environments. Abdullah, Wazzan, and Bo-Saeed (2012) employed PCA (eigenvalues) as a data representation and dimension reduction technique in solving face recognition problems. The aim of their study was to measure the effect of runtime on the performance of face recognition algorithms. The results show that for a small database size the eigenfaces approach has no relationship with time complexity in face recognition. However, this approach has the drawback of high computation time, especially for large database sizes. Luo, Swamy, and Plotkin (2003) proposed a modified principal component analysis (MPCA) for face recognition and compared it with the PCA and LDA approaches. They used the Yale face database, and the results revealed that MPCA performs better than the conventional PCA and LDA approaches. The computational cost remained the same as that of the PCA, and much less than that of the LDA. Later studies revealed that both the PCA and LDA do not increase the computational cost. Paul and Al Sumam (2012) focused on applying a statistical approach to construct a face recognition system. They employed the PCA approach to reduce the high-dimensional space to an intrinsically low-dimensional space. In doing so, they split the dataset into training and test datasets. The eigenvalues (eigenvectors) were deduced from the covariance matrix of the training set, and the eigenfaces measured the linear combination of weighted eigenvectors.
Recognition was achieved by projecting a test image onto the subspace spanned by the eigenfaces, and classification was then performed by finding the minimum Euclidean distance. However, this study failed to address real-time recognition performance. Baah (2013) employed the eigenfaces approach, as an application of principal component analysis to facial images, providing a real-time solution that tackled the face recognition problem. The researcher argued that, based on the results, PCA is very fast at runtime, relatively easy, and performs well in a constrained environment. Bellakhdhar et al. (2013) used Gabor wavelets, PCA and a Support Vector Machine (SVM) to develop a face recognition algorithm. Their method focused on joining the Gabor wavelet to extract a face characteristic vector, using the PCA algorithm for recognition and the SVM to classify faces. However, this approach is limited compared to the performance of current face recognition systems in terms of reliability and real time. Dhoke and Parsai (2014) performed face recognition using PCA and a Back Propagation Neural Network (BPNN) for identification and verification purposes. Their study adopted PCA as a dimension reduction (feature extraction) technique, and the BPNN was used for recognition and classification. The results revealed that their face image recognition and classification was computationally very fast and provided a high precision rate. Kadam (2014) presented a hybrid combination of principal component analysis (PCA) and the discrete cosine transform (DCT) to address issues related to face recognition systems. The DCT/PCA technique was found to be highly efficient in extracting meaningful features with a high recognition rate. Asiedu et al. (2015) employed statistical techniques to assess the performance of two face recognition algorithms, namely PCA/SVD and FFT-PCA/SVD, under varying facial expressions.
The study findings indicated that FFT-PCA/SVD comparatively exhibits lower variation and a better recognition rate than the PCA/SVD algorithm in the recognition of facial images under varying expressions. The researchers posited that the FFT is a viable noise filtering technique that should be employed in the pre-processing stage of face recognition. Goyal and Batra (2016) adopted PCA with a neural network in a face recognition system and applied photometric normalisation for comparison. The study results indicated that the Euclidean distance classifier gives high precision using original face images, although employing histogram equalisation techniques on the face images does not have much impact on the performance of the face recognition system under an unconstrained environment. Ramalingam and Dhanushkodi (2016) focused on recognising occlusion and facial expression. Their study employed PCA and the Haar wavelet technique to detect facial features on the input image and extract them from the detected facial regions. Although this approach yielded a high recognition rate, it failed to address illumination and head pose constraints.

2.3.2 Other Forms of Feature Extraction

The study considered other forms of dimension reduction techniques.

2.3.2.1 Linear Discriminant Analysis (LDA)

LDA is mainly used as a feature extraction stage before classification and provides dimensionality reduction of feature vectors without a loss of information (Johnson & Wichern, 1998). LDA is usually used to examine linear combinations of features while maintaining the separation of classes; the resulting projections are also known as Fisherfaces. LDA is a supervised learning method that uses more than one training image for each class (Barnouti, Al-Dabbagh, Matti, & Naser, 2016).
The original aim of the Fisherface projection method is to solve the illumination problem by maximising the ratio of between-class scatter to within-class scatter. Fisherfaces are less sensitive to pose, lighting and expression variations. LDA optimises the low-dimensional representation of the objects (Bhattacharyya & Rahul, 2013). In LDA, the image variations are divided and labelled into within-class and between-class variations. Within-class scatter captures the image variations of the same individual, while between-class scatter captures the image variations among different individuals. Unlike PCA, LDA attempts to model the differences among classes. LDA obtains different projection vectors for each class. Multi-class LDA, which is widely used in biometric systems, can be performed on more than two classes (Sahu, Singh, & Kulshrestha, 2013). Theoretically, LDA divides the scatter into two matrix classes, namely the within-class scatter matrix ($S_W$) and the between-class scatter matrix ($S_B$), represented as follows:

$$S_W = \sum_{t=1}^{N} \sum_{i=1}^{l_t} (Y_i^t - \mu_t)(Y_i^t - \mu_t)' , \qquad S_B = \sum_{t=1}^{N} l_t (\mu_t - \mu)(\mu_t - \mu)' \quad (2.10)$$

where $Y_i^t$ is the $i$th observation in the sample from class $t$, $\mu_t$ is the mean of class $t$, $N$ is the number of classes, $l_t$ is the number of samples from class $t$ and $\mu$ is the mean over all classes.

The Fisherface technique is outlined in the following steps:
 Train the PCA model for $Y$ and linearly project $Y$ into a $k$-dimensional space.
 Calculate the scatter matrices $S_W$ and $S_B$ of $Y$.
 Determine the eigenvalues and eigenvectors of $S_W^{-1} S_B$ and select the leading eigenvectors as the basis for $Y$.

Liu, Zhou, and Jin (2010) employed an illumination adaptive linear discriminant analysis (IALDA) approach to address the illumination problem in face recognition. Their method was effective in dealing with illumination in face images.
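As an illustrative sketch (not the thesis's implementation), the scatter matrices of equation (2.10) and the resulting Fisher directions can be computed as follows, using synthetic two-dimensional features for three hypothetical classes:

```python
import numpy as np

# Illustrative sketch of the within-class (S_W) and between-class (S_B)
# scatter matrices of eq. (2.10), with synthetic 2-D features, 3 classes.
rng = np.random.default_rng(2)
classes = [rng.normal(loc=c, size=(5, 2)) for c in (0.0, 3.0, 6.0)]

mu = np.vstack(classes).mean(axis=0)            # overall mean
d = classes[0].shape[1]
S_W = np.zeros((d, d))
S_B = np.zeros((d, d))
for Y in classes:
    mu_t = Y.mean(axis=0)                       # class mean
    D = Y - mu_t
    S_W += D.T @ D                              # within-class scatter
    diff = (mu_t - mu).reshape(-1, 1)
    S_B += len(Y) * (diff @ diff.T)             # between-class scatter

# Fisher directions: eigenvectors of S_W^{-1} S_B (last Fisherface step).
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
w = eigvecs[:, np.argmax(eigvals.real)].real    # most discriminative axis
print(S_W.shape, S_B.shape, w.shape)
```

Projecting the data onto `w` maximises the ratio of between-class to within-class scatter, which is exactly the Fisherface criterion described above.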
Singh et al. (2016) employed a novel model that comprises LDA and Speeded Up Robust Features (SURF) to tackle illumination and pose variations. This novel method improves the quality parameters in face recognition, using SURF and LDA to optimise the results and perform feature matching respectively. The small sample size problem is a critical issue in the use of LDA. It can be resolved by applying PCA during the pre-processing stage and performing LDA afterwards; this ensures that dimensionality reduction occurs at the PCA step. Additionally, LDA has a high computational cost in the case of huge datasets (Murtaza, Sharif, Raza, & Shah, 2014).

2.3.2.2 Independent Component Analysis (ICA)

A basic problem in face recognition research is finding a suitable representation of large random vectors or multivariate data. The aim of using Independent Component Analysis (ICA) is to express the data as linear combinations of statistically independent data points and to reveal the hidden factors that generate sets of random variables, measurements or signals. ICA is defined using a statistical "latent variables" model (Hyvärinen, Karhunen, & Oja, 2004). ICA is basically a method for extracting individual signals from mixtures. For conceptual simplicity and computational purposes, a linear transformation of the original data is required. ICA is one such linear transformation, seeking to separate the original data into independent components that
The results of their study found a spatially local basis and produced a factorial face code for face images. The performance of their approach were superior to representation based on PCA for recognizing faces across days and change in facial expression. Draper, Baek, Bartlett, and Beveridge (2003) presents a comparison of the ICA and PCA algorithms in the context of a baseline face recognition system. The study stated that both the ICA and PCA algorithms’ performances depend on a subspace distance metric and evaluates the algorithms using recognition rate criterion. Kabir, Hossain, and Islam (2016) revealed that both ICA and PCA algorithms worked for face recognition, however PCA algorithm perform better that ICA algorithm. They concluded that PCA algorithm represents intelligent suction of a random search within a short time and it can detect to solve the problem. In general, the ICA is a good statistical technique for face recognition system but the variance and the order of the ICA cannot be determined. 2.4 Haar Transform Chang, Yu, and Vetterli (2000) employed the Haar transformation technique to form a wavelet. Haar wavelet transform is described as the simplest wavelet transformation method and can efficiently support the study interests. 26 University of Ghana http://ugspace.ug.edu.gh The Haar transform cross-multiplies a function against the Haar wavelet with various shifts and stretches, like the Fourier transform cross-multiplies a function against a sine wave with two phases and many stretches. m The N Haar functions can be sampled at t  , where m  0,1,..., N 1 . This constitute a N N N matrix for discrete Haar transform where t represents the signal in the Haar function. 
For example, when $N = 2$, the discrete Haar transform matrix is

$$H_2 = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad (2.11)$$

When $N = 4$,

$$H_4 = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ \sqrt{2} & -\sqrt{2} & 0 & 0 \\ 0 & 0 & \sqrt{2} & -\sqrt{2} \end{bmatrix} \qquad (2.12)$$

Furthermore, when $N = 8$,

$$H_8 = \frac{1}{\sqrt{8}}\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\ \sqrt{2} & \sqrt{2} & -\sqrt{2} & -\sqrt{2} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & \sqrt{2} & \sqrt{2} & -\sqrt{2} & -\sqrt{2} \\ 2 & -2 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & -2 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 2 & -2 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 2 & -2 \end{bmatrix} \qquad (2.13)$$

The Haar transform captures details of the signal at different frequencies and time locations. The matrix is both real and orthogonal:

$$H = H^{*}, \qquad H^{-1} = H^{T}, \qquad \text{that is, } H^{T}H = I$$

where $I$ is the identity matrix. In the case $N = 4$, we have

$$H_4^{T}H_4 = H_4H_4^{T} = \frac{1}{4}\begin{bmatrix} 1 & 1 & \sqrt{2} & 0 \\ 1 & 1 & -\sqrt{2} & 0 \\ 1 & -1 & 0 & \sqrt{2} \\ 1 & -1 & 0 & -\sqrt{2} \end{bmatrix}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ \sqrt{2} & -\sqrt{2} & 0 & 0 \\ 0 & 0 & \sqrt{2} & -\sqrt{2} \end{bmatrix} \qquad (2.14)$$

This results in

$$H_4^{T}H_4 = H_4H_4^{T} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad (2.15)$$

In general, an $N \times N$ discrete Haar matrix can be written in terms of its row vectors $h_m$, $m = 0, 1, \dots, N-1$, as

$$H = \begin{bmatrix} h_0 \\ h_1 \\ \vdots \\ h_{N-1} \end{bmatrix} \qquad (2.16)$$

Then the inverse (equal to the transpose) of the Haar matrix is

$$H^{-1} = H^{T} = \begin{bmatrix} h_0^{T} & h_1^{T} & \cdots & h_{N-1}^{T} \end{bmatrix} \qquad (2.17)$$

where $h_m^{T}$ is the transpose of the $m$th row vector of the matrix. The Haar transform of a given signal vector

$$y = \left( y_0, \dots, y_{N-1} \right)^{T} \qquad (2.18)$$

is

$$Y = Hy = \begin{bmatrix} h_0 \\ h_1 \\ \vdots \\ h_{N-1} \end{bmatrix} y \qquad (2.19)$$

with the $m$th component of $Y$ being

$$Y_m = h_m y, \qquad m = 0, 1, 2, \dots, N-1 \qquad (2.20)$$

which is the $m$th transform coefficient, the projection of the signal vector $y$ onto the $m$th row vector of the transform matrix $H$. The inverse transform is

$$y = H^{-1}Y = H^{T}Y = \begin{bmatrix} h_0^{T} & h_1^{T} & \cdots & h_{N-1}^{T} \end{bmatrix}\begin{bmatrix} Y_0 \\ \vdots \\ Y_{N-1} \end{bmatrix} \qquad (2.21)$$
   YN -1 N -1 y = H -1Y = HTY = Y mhm (2.22) m=0 That is, the signal is expressed as a linear combination of the row vectors of H . Comparing this Haar transform matrix with all transform matrices such as Fourier transform, Fast Fourier transform (FFT), Cosine transform involves Direct Cosine Transform (DCT), Continuous Cosine Transform (CCT) and Walsh-Hadamard Transform (WHT). Studies have revealed that there exist significant differences among them. The row vectors of all preceding transform methods represent different frequency (or sequency) components, including zero frequency or the average or DC component (first row m  0 ), and the progressively higher frequencies (sequencies) in the subsequent rows ( m 1,2,..., N 1 ). However, the row vectors in Haar transform matrix represent progressively smaller scales 29 University of Ghana http://ugspace.ug.edu.gh (narrower width of the square waves) and their different positions. It possesses the proficiency to represent different positions as well as different scales (corresponding different frequencies) that distinguish Haar transform from the above mentioned transforms. This capability is also the main advantage of wavelet transform over other orthogonal transforms. 2.5 Conclusion Presently, the problems of face recognition continue to attract more research effort from the scientific community, due to the wide spread of face recognition application. Although face recognition has proven to be a difficult task for frontal face images, certain algorithm has been proposed to be performing well under constrained environments. Research has revealed that eigenface method is the most prominent method used in tackling face recognition problem. Many researchers in the scientific community have applied eigenface method with various technique such as PCA-SVM, PCA-ICA, PCA-BPNN and so on. 
However, most face recognition algorithms perform well only in constrained environments and degrade under conditions such as head pose, occlusion, and the like. The literature reveals little or no work on angular constraints. This study seeks to fill that gap by assessing the performance of a face recognition algorithm on faces under angular constraints.

CHAPTER THREE

METHODOLOGY

3.1 Introduction

This chapter concentrates on the theory of the concepts used, the derivations, and the methods of analysing the available data to satisfy the research aims of the study. The underlying objective of this research work is a detailed review of a modified DWT-PCA/SVD face recognition algorithm. The face recognition procedure uses DWT for noise filtering in the preprocessing phase, PCA/SVD for feature extraction (dimension reduction), and the recognition distance in the recognition phase.

3.1.1 Data Acquisition

The study employed a secondary database extracted from the Massachusetts Institute of Technology (2003-2005) database. This database comprises images of individuals at varying angular poses. Plate 3.1 shows the original images of the individuals used in the study. The angular poses of each individual were captured along 0°, 4°, 8°, 12°, 16°, 20°, 24°, 28° and 32° rotations of the head.

Plate 3.1: Original images generated from 3D models for all ten subjects. Source: Weyrauch, Huang, Heisele & Blanz (2004)

A comprehensive description of the data used is presented in the subsequent chapter (Chapter 4).

3.2 RECOGNITION PROCEDURE

The study employed the DWT-PCA/SVD recognition technique on a created MIT face image database under angular constraints. The database, comprising the facial images in both the training and testing datasets, served as input to the face recognition module.
The input facial images are transformed into a uniform dimension and a format compatible with image processing. The research design is illustrated in Figure 3.1.

Figure 3.1: Research Design

3.2.1 Preprocessing Stage

The purpose of the preprocessing tool is to enhance the quality of the face image in order to improve the performance of the face recognition algorithm. The preprocessing stage is a useful phase in face image representation and serves as a noise removal mechanism, as stated by Asiedu et al. (2015). Preprocessing is an effective way of suppressing unwanted distortion of image features before further processing. It reduces the noise level in the face image dataset and makes the estimation process simpler and better conditioned, leading to an improved recognition rate in the face recognition system. Following the suggestion of Asiedu et al. (2015), the study employed the Discrete Wavelet Transform (DWT) at the preprocessing stage as a de-noising/filtering technique before extracting features from the face images. The preprocessing phase comprised two stages:

 Resizing of the face image
 De-noising of the face image

3.2.1.1 Resizing of the face image

Resizing involves reshaping the images to a preferred size. Each face image from the database was imported into the Matlab software and resized to a uniform dimension of 100 × 100. Reddy et al. (2011) stated that resizing face images helps to reduce the lighting effects associated with visual images, making them better suited for PCA in image processing. The study adopted this procedure to lessen the mathematical complexity of the feature extraction from a face image.

3.2.1.2 Image de-noising

Many researchers have revealed that face images by nature exhibit the characteristics of Gaussian noise due to the presence of illumination variations.
To de-noise (filter) a face image in the dataset, the study adopted a pixel-based filtering technique. The Discrete Wavelet Transform (DWT) was applied to eliminate noise in the face image while retaining the vital facial features for recognition. DWT has been proven to be a viable tool for removing noise from image data; according to Chang et al. (2000), DWT is a very easy and fast approach to de-noising a face image.

3.2.1.4.1 Wavelet Transform

Wavelet transforms are filters that localise a signal into constituent parts in both space and frequency, and are defined over a hierarchy of scales. The wavelet transform is a time-compact, doubly indexed transform. It is a time-frequency localisation analysis with the capacity to resolve local features in both the time and frequency domains. With the wavelet transform, low-frequency components are analysed over a long duration with a narrow band of the signal, and high-frequency components over a short duration with a wide band, so that high-frequency components are viewed with high time resolution (Szeliski, 2010). The wavelet transform is a very popular tool in signal decomposition, with applications in transient signal analysis, communication systems, image analysis, pattern recognition and computer vision. Daubechies, Mallat, and Willsky (1992) state that the wavelet transform primarily decomposes functions, data, or operators into different frequency components and then analyses each component with a resolution matched to its scale. The wavelet transform is given by

$$F(a, b) = \int_{-\infty}^{\infty} f(x)\, \psi^{*}_{(a,b)}(x)\, dx \qquad (3.1)$$

where $*$ denotes the complex conjugate and $\psi$ is an arbitrary function satisfying certain admissibility conditions.
The wavelet basis functions are generated as

$$\psi_{a,b}(x) = \frac{1}{\sqrt{a}}\, \psi\!\left( \frac{x - b}{a} \right) \qquad (3.2)$$

where $a$ and $b$ are real numbers that quantify the scaling and translation factors respectively (Poularikas, 1999). The wavelet transform decomposes a signal into a set of basis functions. These basis functions are obtained from a mother wavelet $\psi$ by scaling, translation and dilation. In order to reconstruct the original signal correctly, the mother wavelet has to fulfil the following conditions:

$$\int_{-\infty}^{\infty} \psi(x)\, dx = 0 \qquad (3.3)$$

$$\psi(x) \to 0 \quad \text{as } |x| \to \infty \qquad (3.4)$$

The wavelet transform can be computed with many different functions, giving an infinite set of different transforms in the time domain. This study considers only the time-frequency decomposition based on wavelet orthogonality (orthonormality). There are two forms of the wavelet transform: orthonormal wavelets (for the discrete wavelet transform) and non-orthonormal wavelets (for the continuous wavelet transform). The study considered the former, which is discussed presently.

3.2.1.4.2 Discrete Wavelet Transform (DWT)

In wavelet analysis, the Discrete Wavelet Transform (DWT) decomposes a signal into a set of mutually orthogonal wavelet basis functions. These functions differ from sinusoidal basis functions in that they are spatially localised, that is, nonzero over only part of the total signal length. Furthermore, the wavelet functions are dilated, translated and scaled versions of a common function $\psi$, known as the mother wavelet. As in Fourier analysis, the DWT is invertible, so that the original signal can be completely recovered from its DWT representation (Bultheel, 1995).
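The single-level Haar step and its exact invertibility can be made concrete with a short sketch. The thesis computations were done in Matlab; the following is an equivalent, self-contained Python illustration (the signal values are made up), pairing each two samples into a mean and a detail coefficient and then recovering the original signal:

```python
import math

def haar_pair(v):
    """Single-level Haar split of v into mean and detail coefficients."""
    s = math.sqrt(2)
    u = [(v[2*m] + v[2*m + 1]) / s for m in range(len(v) // 2)]
    b = [(v[2*m] - v[2*m + 1]) / s for m in range(len(v) // 2)]
    return u, b

def haar_invert(u, b):
    """Inverse transform: recover the original series from (u, b)."""
    s = math.sqrt(2)
    v = []
    for ui, bi in zip(u, b):
        v.append((ui + bi) / s)
        v.append((ui - bi) / s)
    return v

v = [6.0, 4.0, 5.0, 11.0, 7.0, 9.0]   # made-up even-length signal
u, b = haar_pair(v)

# The orthonormal Haar step conserves power: ||u||^2 + ||b||^2 = ||v||^2
power_f = sum(x * x for x in u) + sum(x * x for x in b)
power_v = sum(x * x for x in v)
assert abs(power_f - power_v) < 1e-9

# Exact invertibility: the DWT representation recovers the signal completely
assert all(abs(a - c) < 1e-9 for a, c in zip(v, haar_invert(u, b)))
print(u, b)
```

The means carry the smooth (low-frequency) content and the details the local variation, which is the sense in which the wavelet basis is spatially localised.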
The DWT is a linear transformation that operates on a data vector whose length is an integer power of two, transforming it into a numerically different vector of the same length (Kociołek, Materka, Strzelecki, & Szczypiński, 2001). DWT is basically a technique that transforms image pixels into wavelet coefficients, which are then used for wavelet-based compression and coding. The DWT separates data into different frequency components and then studies each component with a resolution matched to its scale (Kociołek et al., 2001). As indicated earlier, the measured advantage of the DWT is that it captures both frequency and location information. In the DWT decomposition, L denotes a low-frequency band, H a high-frequency band, and the attached subscript the decomposition level. The LL subimage is a lower-resolution approximation of the original image, while the mid- and high-frequency detail subimages HL, LH and HH represent the horizontal, vertical and diagonal edge details respectively. Most of the energy is concentrated in the low-frequency sub-band. Table 3.1 depicts the DWT decomposition of an image using a 3-level pyramid.

Table 3.1: Discrete Wavelet Transform decomposition of an image using a 3-level pyramid (1, 2, 3 denote decomposition levels; H denotes high-frequency bands; L denotes low-frequency bands)

+---------+---------+-------------------+
| LL3|LH3 |         |                   |
| HL3|HH3 |   LH2   |                   |
+---------+---------+        LH1        |
|   HL2   |   HH2   |                   |
+---------+---------+---------+---------+
|                   |                   |
|        HL1        |        HH1        |
|                   |                   |
+-------------------+-------------------+

Of the four sub-bands, only the LL component represents the approximate coefficients of the decomposition and is used to produce the next level of decomposition.
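As a minimal sketch of the sub-band decomposition just described (in Python rather than the study's Matlab, with a made-up 4 × 4 "image"), a single-level 2D Haar DWT is obtained by applying the pairwise average/difference step first to the rows and then to the columns; the four quadrants of the result are the LL, LH, HL and HH sub-bands:

```python
import math

def haar_step(v):
    """Single-level 1D Haar transform: means followed by details (orthonormal)."""
    s = math.sqrt(2)
    means = [(v[2*i] + v[2*i + 1]) / s for i in range(len(v) // 2)]
    details = [(v[2*i] - v[2*i + 1]) / s for i in range(len(v) // 2)]
    return means + details

def haar2d(img):
    """Single-level 2D Haar DWT: transform the rows, then the columns."""
    rows = [haar_step(r) for r in img]
    cols = [haar_step(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]   # transpose back to row-major order

# Toy image that is constant on 2x2 blocks
img = [[10.0, 10.0, 12.0, 12.0],
       [10.0, 10.0, 12.0, 12.0],
       [20.0, 20.0, 22.0, 22.0],
       [20.0, 20.0, 22.0, 22.0]]

out = haar2d(img)
h = len(out) // 2
LL = [row[:h] for row in out[:h]]   # low-frequency approximation
HH = [row[h:] for row in out[h:]]   # diagonal edge detail
print(LL)
print(HH)
```

Because this toy image is constant on 2 × 2 blocks, the LH, HL and HH detail sub-bands vanish and all the energy concentrates in LL, consistent with the observation above that most of the energy lies in the low-frequency sub-band.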
Since the LL sub-band contains only the low-frequency components of the image, it is relatively free of noise in comparison with the other three sub-bands. The HL sub-band represents the major facial expression features, and the LH sub-band depicts face pose features. The HH sub-band is the least stable of the sub-bands, because it is easily disturbed by noise, expressions and poses. In view of this, the LL sub-band is the most stable sub-band (Lai, Yuen, & Feng, 2001). Unlike the Discrete Fourier Transform (DFT), the DWT refers not to a single transform but to a set of transforms, each with a different set of wavelet basis functions. Two of the most common are the Haar wavelets and the Daubechies family of wavelets; other wavelets include the Morlet, Coiflet, biorthogonal, Mexican hat and Symlet wavelets. The study concentrated on the Haar transform.

3.2.1.4.2.1 Haar Transform

The Haar wavelet is the simplest type of wavelet; it is discontinuous, whereas all other Daubechies wavelets are continuous. The Haar wavelet transform is a widely used technique recognised as a simple and powerful tool for the multi-resolution decomposition of time series. In discrete form, Haar wavelets are related to a mathematical operation called the Haar transform, which serves as a prototype for all other wavelet transforms. The Haar transform can be used for compressing image signals and for removing noise. Consider a time series $v = (v_1, v_2, \dots, v_N)$ where $N$ is even. The single-level Haar transform decomposes $v$ into two signals of length $N/2$.
These are the mean coefficient vector $u^{1}$, with components

$$u_m = \frac{v_{2m-1} + v_{2m}}{\sqrt{2}}, \qquad m = 1, 2, \dots, \frac{N}{2} \qquad (3.5)$$

and the detail coefficient vector $b^{1}$, with components

$$b_m = \frac{v_{2m-1} - v_{2m}}{\sqrt{2}}, \qquad m = 1, 2, \dots, \frac{N}{2} \qquad (3.6)$$

Each term in the detail vector represents the variation between successive elements of the time series, that is, on a time scale $\Delta t$. Each term in the mean vector is an average across a time scale $2\Delta t$. These can be concatenated into another $N$-vector, which can be regarded as a linear matrix transformation of $v$:

$$f^{1} = \left( u^{1} \mid b^{1} \right) \qquad (3.7)$$

Clearly, the transform can be inverted to recover $v$ from $f$, as follows:

$$v_{2i-1} = \frac{u_i + b_i}{\sqrt{2}}, \qquad v_{2i} = \frac{u_i - b_i}{\sqrt{2}}, \qquad i = 1, 2, \dots, \frac{N}{2} \qquad (3.8)$$

Like the Fourier transform, the Haar transform is power-conserving (2-norm conserving):

$$\left\| f^{1} \right\|^{2} = \sum_{m=1}^{N/2}\left( u_m^{2} + b_m^{2} \right) \qquad (3.9)$$

$$\left\| f^{1} \right\|^{2} = \sum_{m=1}^{N/2} \frac{\left( v_{2m-1} + v_{2m} \right)^{2} + \left( v_{2m-1} - v_{2m} \right)^{2}}{2} \qquad (3.10)$$

$$\left\| f^{1} \right\|^{2} = \sum_{m=1}^{N/2}\left( v_{2m-1}^{2} + v_{2m}^{2} \right) \qquad (3.11)$$

$$\left\| f^{1} \right\|^{2} = \left\| v \right\|^{2} \qquad (3.12)$$

This implies that the Haar transform can be regarded as partitioning the power between different time scales and time ranges.

3.2.1.4.2.3 Haar Wavelet

The Haar wavelet is a sequence of rescaled square-shaped functions. For an input represented by a list of $2^{n}$ numbers, the Haar wavelet transform may be considered to pair up input values, storing the difference and passing on the sum. This process is repeated recursively, pairing up the sums to provide the next scale, which yields $2^{n} - 1$ differences and a final sum. The technical disadvantage of the Haar wavelet is that it is not continuous, and therefore not differentiable. This property can, however, be an advantage for the analysis of signals with sudden transitions, such as the monitoring of tool failure in machines. The Haar wavelet transform has a number of advantages:

 It is conceptually simple and fast.
 It is memory efficient, since it can be calculated in place without a temporary array.
 It is exactly reversible, without the edge effects that are problems with other wavelet transforms.

The detail components can be written as an inner product of the time series with the $m$th level-1 Haar wavelet $W_m^{1}$:

$$b_m = W_m^{1} \cdot v \qquad (3.13)$$

where

$$W_1^{1} = \left( \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}, 0, 0, \dots, 0, 0 \right), \quad W_2^{1} = \left( 0, 0, \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}, \dots, 0, 0 \right), \quad \dots, \quad W_{N/2}^{1} = \left( 0, 0, 0, 0, \dots, 0, 0, \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}} \right) \qquad (3.14)$$

Each level-1 Haar wavelet is:

 A translation of $W_1^{1}$ by an even number of units.
 Orthogonal to all other Haar wavelets.

Similarly, the mean components are given as

$$u_m = X_m^{1} \cdot v \qquad (3.15)$$

where

$$X_1^{1} = \left( \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, 0, 0, \dots, 0, 0 \right), \quad X_2^{1} = \left( 0, 0, \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}, \dots, 0, 0 \right), \quad \dots, \quad X_{N/2}^{1} = \left( 0, 0, 0, 0, \dots, 0, 0, \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right) \qquad (3.16)$$

are the level-1 Haar scaling signals, which are translations of each other, mutually orthogonal, and orthogonal to all the level-1 wavelets. Using this notation, the signal can be written in terms of components proportional to the wavelets and scaling signals:

$$v = U^{1} + B^{1} \qquad (3.17)$$

where

$$U^{1} = \left( \frac{u_1}{\sqrt{2}}, \frac{u_1}{\sqrt{2}}, \frac{u_2}{\sqrt{2}}, \frac{u_2}{\sqrt{2}}, \dots, \frac{u_{N/2}}{\sqrt{2}}, \frac{u_{N/2}}{\sqrt{2}} \right) = \sum_{m=1}^{N/2} u_m X_m^{1} \qquad (3.18a)$$

$$B^{1} = \left( \frac{b_1}{\sqrt{2}}, \frac{-b_1}{\sqrt{2}}, \frac{b_2}{\sqrt{2}}, \frac{-b_2}{\sqrt{2}}, \dots, \frac{b_{N/2}}{\sqrt{2}}, \frac{-b_{N/2}}{\sqrt{2}} \right) = \sum_{m=1}^{N/2} b_m W_m^{1} \qquad (3.19a)$$

These can be written compactly as

$$U^{1} = 2^{-\frac{1}{2}}\left( u_1, u_1, u_2, u_2, \dots, u_{N/2}, u_{N/2} \right) = \sum_{m=1}^{N/2} u_m X_m^{1} \qquad (3.18b)$$

$$B^{1} = 2^{-\frac{1}{2}}\left( b_1, -b_1, b_2, -b_2, \dots, b_{N/2}, -b_{N/2} \right) = \sum_{m=1}^{N/2} b_m W_m^{1} \qquad (3.19b)$$

This describes the inverse transform, recovering the original time series from the coefficients.

3.2.2 Feature Extraction Technique

In this study, dimension reduction techniques were employed after using DWT in the preprocessing phase; feature extraction is thus related to dimension reduction.
The extracted features of a face image are expected to contain the important information carried by the input data, so that the desired task can be executed on the reduced representation in place of the complete initial data. This process reduces the amount of resources required to describe a large set of data. In the feature extraction phase, the study adopted the PCA/SVD approach as the data reduction technique.

3.2.2.1 Singular Value Decomposition (SVD)

The study considers the SVD of a matrix $X$, defined as the factorisation of $X$ into the product of three matrices. Let $X$ be an $n \times m$ real-valued data matrix of rank $r$; without loss of generality $n \geq m$, and hence $r \leq m$. Let $U$ ($n \times m$) and $V$ ($m \times m$) be orthonormal matrices. Since $X$ is real, the singular values are real numbers, and the matrices $U$ and $V$ are also real. The singular value decomposition of $X$ is

$$X = U \Sigma V^{T} \qquad (3.20)$$

where $\Sigma$ is the diagonal matrix with diagonal entries $\sigma_{jj} \geq 0$, $j = 1, 2, 3, \dots, m$, ordered so that $\sigma_{11} \geq \sigma_{22} \geq \sigma_{33} \geq \dots \geq \sigma_{mm}$. Equation 3.21 displays the matrix representation of the SVD:

$$\begin{bmatrix} x_{11} & \cdots & x_{1m} \\ \vdots & \ddots & \vdots \\ x_{n1} & \cdots & x_{nm} \end{bmatrix} = \begin{bmatrix} u_{11} & \cdots & u_{1m} \\ \vdots & \ddots & \vdots \\ u_{n1} & \cdots & u_{nm} \end{bmatrix}\begin{bmatrix} \sigma_{11} & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_{mm} \end{bmatrix}\begin{bmatrix} v_{11} & \cdots & v_{1m} \\ \vdots & \ddots & \vdots \\ v_{m1} & \cdots & v_{mm} \end{bmatrix}^{T} \qquad (3.21)$$

3.2.2.1.1 Theorem

Let $X$ be a matrix of dimension $m \times n$ ($m \geq n$). Then:

 The matrices $X^{T}X$ ($n \times n$) and $XX^{T}$ ($m \times m$) are symmetric with real, nonnegative eigenvalues.
 If $\lambda$ is a nonzero eigenvalue of $X^{T}X$ with corresponding eigenvector $v$, then $\lambda$ is also an eigenvalue of $XX^{T}$ with corresponding eigenvector $Xv$. In other words, $X^{T}X$ and $XX^{T}$ have the same nonzero eigenvalues.

3.2.2.1.2 Properties of SVD

The SVD properties used in this study are as follows:

 The singular values $\sigma_{11}, \sigma_{22}, \dots, \sigma_{mm}$ are unique.
 $X^{T}X = V \Sigma^{T}\Sigma V^{T}$ and $XX^{T} = U \Sigma \Sigma^{T} U^{T}$; the matrices $V$ and $U$ are obtained from the eigenvectors of $X^{T}X$ and $XX^{T}$ respectively.
 The rank of the matrix $X$ equals the number of its non-zero singular values.

3.2.2.2 Principal Component Analysis (PCA)

The PCA technique adopted in this study focuses primarily on data reduction and interpretation: it explains the variance-covariance structure of a set of variables through a few linear combinations of those variables. The procedure converts each high-dimensional two-dimensional image into a one-dimensional vector, and the interest of the study lies in the components that explain most of the required information.

In face recognition, the eigenface approach is one of the simplest and most effective methods of extracting facial features from face images. The eigenface approach transforms face images into a small set of facial features, which constitute the principal components of the initial training dataset. Mathematically, PCA is computed as follows.

A two-dimensional facial image is represented in one dimension by concatenating its columns (or rows) into a single long, thin vector. Let $m$ be the number of vectors, each of size $M$ (rows of image × columns of image), representing a set of sampled face images, with entries given by the camera pixel values. Define the image matrix $P_i$ as

$$P_i = \left( p_{ijk} \right), \qquad j, k = 1, 2, \dots, n; \qquad i = 1, 2, \dots, m$$

$$P_i = \left( p_{i1}, p_{i2}, \dots, p_{is} \right), \qquad \text{where } p_{ik} = \left( p_{i1k}, p_{i2k}, \dots, p_{isk} \right)'$$

$$X_i = \left( p_{i1}', p_{i2}', \dots, p_{is}' \right)' \qquad (3.22)$$

where $s$ is the order of the image matrix and $m$ is the number of images to be trained. From equation (3.22), let $X_i$ be the column vector of dimension $M$ given by

$$X_i = \left( X_{ij} \right)_{M \times 1} \qquad (3.23)$$

where $X_{ij}$ replaces $p_{ijk}$ position-wise. The preprocessing phase operates on the sample $X = \left( X_1, X_2, \dots, X_m \right)$ whose members are the vectorised forms of the individual images in the study. Let the training set be denoted by $X = \left( X_1, X_2, \dots, X_m \right)$.

3.3.1.2.1 Mean Centering

The mean face is defined as the arithmetic mean of the training image vectors at each pixel point; its dimension is $M \times 1$.
This is a simple preprocessing step, performed by subtracting the mean from the data. Thus $\bar{p}_i = E\left( X_i \right)$ of the data $X_i$, $i = 1, 2, 3, \dots, m$:

$$\bar{p}_i = \frac{1}{M}\sum_{j=1}^{M} X_{ij} \qquad (3.24)$$

$$\bar{p}_i = \frac{1}{M}\sum_{j=1}^{n}\sum_{k=1}^{n} p_{ijk}, \qquad i = 1, 2, 3, \dots, m \qquad (3.25)$$

where $M = n \times n$ (length = rows of image × columns of image) for the image data $P_i$. Let $\bar{X}_i$ be a constant column vector of order $n \times n$ with all members equal to $\bar{p}_i$, $i = 1, 2, 3, \dots, m$. The mean (average) centering is denoted by $Z = \left( z_1, z_2, \dots, z_m \right)$, obtained by subtracting the mean image from each training image:

$$Z_i = X_i - \bar{X}_i \qquad (3.26)$$

Feature Extraction Stage

Mathematically, the mean-centered images are then subjected to PCA, which seeks a set of $m$ orthonormal vectors $e_i$ that best account for the distribution of the image data. The $t$th vector $e_t$ is selected such that

$$\lambda_t = \frac{1}{m}\sum_{i=1}^{m}\left( e_t^{T} z_i \right)^{2} \qquad (3.27)$$

is maximised subject to the orthonormality constraints

$$e_i^{T} e_t = \delta_{it} = \begin{cases} 1, & \text{if } i = t \\ 0, & \text{otherwise} \end{cases} \qquad (3.28)$$

The vectors $e_i$ and scalars $\lambda_i$ are, respectively, the eigenvectors and eigenvalues of the covariance matrix

$$C = \frac{1}{m}\sum_{i=1}^{m} z_i z_i^{T} = \frac{1}{m} ZZ^{T} \qquad (3.29)$$

where $Z = \left( z_1, z_2, \dots, z_m \right)$. The covariance matrix $ZZ^{T}$ is of size $M \times M$, which is huge, and computing its eigenvalues and eigenvectors directly is an intractable task for typical image sizes; for instance, 100 × 100 images create a covariance matrix of size 10000 × 10000. The literature notes that the scalars $\lambda_i$ and vectors $e_i$ can instead be obtained by computing the eigenvalues and eigenvectors of the much smaller matrix $Z^{T}Z$:

$$\mu_i \alpha_i = Z^{T}Z \alpha_i \qquad (3.30)$$

where $\mu_i$ and $\alpha_i$ are the eigenvalues and eigenvectors of the matrix $Z^{T}Z$. Pre-multiplying both sides by $Z$ gives

$$\mu_i \left( Z \alpha_i \right) = ZZ^{T}\left( Z \alpha_i \right) \qquad (3.31)$$

which implies that the first $m - 1$ eigenvectors $e_i$ and eigenvalues $\lambda_i$ of $ZZ^{T}$ are given by $Z\alpha_i$ and $\mu_i$ respectively.
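The small-matrix trick of equations (3.30) and (3.31) can be checked numerically. The sketch below (Python rather than the study's Matlab; the tiny 3-pixel "images" are hypothetical) finds the leading eigenvector $\alpha$ of the small matrix $Z^{T}Z$ by power iteration and verifies that $Z\alpha$ is an eigenvector of the large matrix $ZZ^{T}$ with the same eigenvalue:

```python
# Mean-centred image vectors as columns of Z (3 "pixels", 2 images) -- toy data
Z = [[1.0, -1.0],
     [2.0, 0.0],
     [-3.0, 1.0]]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(c) for c in zip(*A)]

Zt = transpose(Z)
ZtZ = matmul(Zt, Z)          # small m x m matrix of equation (3.30)

# Power iteration for the leading eigenpair of Z^T Z
alpha = [1.0, 1.0]
for _ in range(200):
    w = matvec(ZtZ, alpha)
    norm = sum(x * x for x in w) ** 0.5
    alpha = [x / norm for x in w]
mu = sum(a * b for a, b in zip(matvec(ZtZ, alpha), alpha))   # Rayleigh quotient

# By equation (3.31), e = Z alpha is an eigenvector of ZZ^T with eigenvalue mu
e = matvec(Z, alpha)
ZZt = matmul(Z, Zt)
lhs = matvec(ZZt, e)
rhs = [mu * x for x in e]
assert all(abs(a - b) < 1e-8 for a, b in zip(lhs, rhs))
print(round(mu, 6))
```

For real face data the same idea replaces an intractable 10000 × 10000 eigenproblem by an $m \times m$ one, where $m$ is only the number of training images.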
Thus Zi is normalised so that it become equivalent to ei and covariance matrix ranks cannot exceed m1(the negative one was deducted as a result of the average vector m ). Hence; m ei  zii  (3.32) i1 where i and zi are the column matrices of U and Z respectively and i 1,2,3,4,...,m . 46 University of Ghana http://ugspace.ug.edu.gh The principal component of the trained image set is deduced by;   eT x  p (3.33) i i i   1,2 ,...,m  (3.34) The huge dimension of the correlated face image are finally reduced to smaller uncorrelated fundamental dimension which exhibit an advantageous features of the image set. 3.2.3 Recognition Stage This segment gives a stage by stage approach to recognizing an unknown face in the trained database. After formulizing the representation of each face, the final phase is to recognise the identities of these faces. An unknown input face is passed from the phase below before recognition. The face image datasets contains 10 individuals straight image pose (0°) and the angular poses of each individual were captured along 4°, 8°, 12°, 16°, 20°, 24°, 28°and 32°rotation of the head and their frontal face component are extracted and stored in the database. Therefore when an input face image is place in dataset, the face image undergoes a preprocessing stage and dimensional reduction process, and compare its component to each face class stored in the database for recognition. 3.2.3.1 Euclidean (Recognition) Distance The most common recognition rule is the use of the Euclidean distance. Euclidean distance calculates the distance from the query image and database images. Following the procedures in the dimensional reduction phase, a new face from the test image databases is transformed into its eigenface components. 47 University of Ghana http://ugspace.ug.edu.gh Firstly, the input image is compared with the mean face image (trained images mean) in memory and their difference is multiplied with each eigenvector from ei . 
Each projection value represents a weight and is saved in a vector $\Omega$. Recall from equations (3.33) and (3.34) that

$$\omega_i = e_i^{T}\left( x - \bar{p} \right) \qquad \text{and} \qquad \Omega^{T} = \left( \omega_1, \omega_2, \dots, \omega_{m'} \right)$$

The weights account for the contribution of each eigenface in depicting the face image, treating the eigenfaces as a basis set for facial images. The Euclidean (recognition) distance between the input image and the $i$th face class is computed as

$$\varepsilon_i = \left\| \Omega - \Omega_i \right\| \qquad (3.35)$$

where $\Omega_i$ is the vector describing the $i$th face class and $\Omega$ is the feature-space representation of the input image. The minimum $\varepsilon_i$ identifies the face class at which the test image is recognised in the trained image database.

Flow Chart for PCA

Figure 3.2: Summary of the face recognition process of the study

3.2.3.2 Recognition Rate

Computationally, the face recognition algorithm is evaluated through its recognition rate and run time (computational time). The recognition rate of a face recognition algorithm is the ratio of the number of face images correctly recognised by the algorithm to the total number of face images in the test dataset for a single experimental run:

$$\text{Recognition Rate} = \frac{\text{Number of Faces Recognised } \left( n_r \right)}{\text{Number of Faces Present } \left( n_t \right)} \times 100 \qquad (3.36)$$

The average recognition rate, as stated by Thakur et al. (2008), is given as

$$R_{avg} = \frac{\sum_{j=1}^{q} n_j^{cls}}{q\, n_{tot}} \times 100 \qquad (3.37)$$

where $q$ is the total number of experimental runs, $n_{tot}$ is the total number of faces under test in each run, and $n_j^{cls}$ is the number of faces correctly recognised in the $j$th run. The mean error rate is therefore computed as

$$ER_{avg} = 100 - R_{avg} \qquad (3.38)$$

The precision of a face recognition algorithm is a vital part of the analysis in image processing; the highest similarity corresponds to the lowest distance from a database image.
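To make the recognition rule and its numerical evaluation concrete, the following Python sketch (the enrolled weight vectors and probes are made up; the thesis computations used Matlab) classifies probes by the minimum Euclidean distance of equation (3.35) and computes the recognition rate of equation (3.36):

```python
import math

# Hypothetical eigenface weight vectors (Omega_i) for four enrolled face classes
classes = {
    "subject_1": [0.9, 0.1, 0.4],
    "subject_2": [0.2, 0.8, 0.5],
    "subject_3": [0.4, 0.3, 0.9],
    "subject_4": [0.1, 0.1, 0.1],
}

def recognise(probe):
    """Return the class with the minimum Euclidean (recognition) distance."""
    def dist(omega):
        return math.sqrt(sum((p - w) ** 2 for p, w in zip(probe, omega)))
    return min(classes, key=lambda c: dist(classes[c]))

# Test probes paired with their true identities
tests = [([0.85, 0.15, 0.35], "subject_1"),
         ([0.25, 0.75, 0.55], "subject_2"),
         ([0.50, 0.50, 0.50], "subject_3"),   # ambiguous probe, may be missed
         ([0.00, 0.20, 0.10], "subject_4")]

n_r = sum(recognise(probe) == truth for probe, truth in tests)
rate = n_r / len(tests) * 100   # equation (3.36)
print(n_r, rate)   # 3 of 4 recognised -> 75.0% recognition rate
```

The ambiguous third probe lands nearer another class, illustrating how the recognition rate falls below 100% when probes (such as large head poses) drift away from their enrolled face class.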
3.3 Statistical Testing Approach

The study evaluated the face recognition algorithm using a statistical technique, employing either parametric or non-parametric measures according to whether the assumptions on the underlying distribution of the adopted statistical test hold.

3.3.1 Assessing Multivariate Normality Based on the Euclidean Distance

Given the structure of the study data, the appropriate statistical test for this objective is the repeated measures design. The repeated measures design is relatively sensitive to the assumption that the study data conform to multivariate normality; upon violation of this assumption, the nonparametric counterpart of the repeated measures design (the Friedman test) is adopted. The generalised squared distance approach was used to assess the multivariate normality assumption.

In the study dataset, eight variates are collected as the recognition distances between the various head poses (4°, 8°, 12°, 16°, 20°, 24°, 28° and 32°) and the straight pose (0°). For the face recognition algorithm, define the dataset $X_{ij}$, $j = 1, 2, \dots, q$ and $i = 1, 2, \dots, n$, where $q$ is the number of constraints aside from the straight pose and $n$ is the number of individuals in the research database. It is important to assess whether a statistically significant difference exists between the average recognition distances of the various head poses (4°, 8°, 12°, 16°, 20°, 24°, 28° and 32°) and the straight pose (0°) when recognised by the study algorithm. The reviewed literature revealed several statistical tests for multivariate normality, such as the Royston test, the Mardia test, the Henze-Zirkler test, and the chi-square plot procedure. Unfortunately, there is no uniformly most powerful (UMP) test, and it is therefore appropriate to perform several tests to assess multivariate normality.
However, the study adopted the chi-square plot to assess multivariate normality, since this procedure exhibits a statistical consistency property.

3.3.1.2 Generalised Squared Distance Approach

Define the generalised squared distance as

$$d_{ij}^{2} = \left( X_{ij} - \bar{X}_j \right)^{T} S^{-1} \left( X_{ij} - \bar{X}_j \right) \qquad (3.39)$$

where $S$ is the sample covariance matrix, $\bar{X}_j = \frac{1}{n}\sum_{i=1}^{n} X_{ij}$, $j = 1, 2, \dots, q$, and $X_1, X_2, \dots, X_q$ are the samples of recognition distances at the varying angular constraints. Here $X$ is the $n \times q$ data matrix of recognition distances of the constraints and $\bar{X}$ is the $n \times q$ matrix whose $j$th column ($j = 1, 2, \dots, q$) consists of elements all equal to the mean of the $j$th variable in the data matrix $X$. Define $d_i^{2} = d_{ij}^{2}$ for $i = j$; that is, $d_i^{2}$ is the $i$th diagonal entry of the matrix $\left( d_{ij}^{2} \right)$.

To construct a gamma (chi-square) plot:

 Order the squared distances from the smallest to the largest value: $d_{(1)}^{2} \leq d_{(2)}^{2} \leq \dots \leq d_{(n)}^{2}$.
 Plot the pairs $\left( \chi_q^{2}\!\left( \frac{i - \frac{1}{2}}{n} \right),\; d_{(i)}^{2} \right)$, where $\chi_q^{2}\!\left( \frac{i - \frac{1}{2}}{n} \right)$ is the $100\!\left( \frac{i - \frac{1}{2}}{n} \right)$ percent quantile of the chi-square distribution with $q$ degrees of freedom.
 A plot that takes the form of a straight line from the origin with unit slope is an indication of multivariate normality.
 A systematically curved pattern is an indication to reject normality. One or two points far above the line indicate large distances or outlying observations, which warrant further attention.

3.3.2 Repeated Measures Design

In a repeated measures design, each group member in an experiment is tested under multiple conditions over time or under different conditions.
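The generalised squared distance computation of equation (3.39) can be sketched as follows (in Python rather than the study's Matlab; the two-variate recognition distances are hypothetical, and the 2 × 2 sample covariance matrix is inverted in closed form):

```python
# Hypothetical recognition distances for n = 5 individuals at q = 2 poses
X = [[2.0, 3.0],
     [2.5, 3.6],
     [1.8, 2.9],
     [2.2, 3.1],
     [2.6, 3.4]]
n, q = len(X), len(X[0])

# Column means
mean = [sum(row[j] for row in X) / n for j in range(q)]

# Sample covariance matrix S (divisor n - 1)
S = [[sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in X) / (n - 1)
      for b in range(q)] for a in range(q)]

# Closed-form inverse of the 2x2 covariance matrix
det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
Sinv = [[S[1][1] / det, -S[0][1] / det],
        [-S[1][0] / det, S[0][0] / det]]

# Generalised squared distance d_i^2 = (x_i - xbar)' S^{-1} (x_i - xbar)
d2 = []
for row in X:
    c = [row[j] - mean[j] for j in range(q)]
    d2.append(sum(c[a] * Sinv[a][b] * c[b] for a in range(q) for b in range(q)))

print(sorted(d2))
```

The ordered $d_{(i)}^{2}$ would then be plotted against the chi-square quantiles with $q$ degrees of freedom. A useful check on the arithmetic is that the distances always sum to $(n-1)q$, since $\sum_i d_i^{2} = \operatorname{tr}\!\big(S^{-1}\sum_i c_i c_i^{T}\big) = (n-1)\operatorname{tr}(I_q)$.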
This test is used to compare experimental treatments by taking measurements on the same subjects over successive times or under different conditions. For this study, the treatments are the angular facial constraints in the recognition database. The rationale for using the repeated measures design is to determine whether, for the Euclidean distances produced by the study algorithm, there exists a significant difference between the minimum Euclidean distances of the tilted frontal poses (angular poses) and their straight pose. The study assumes that each subject is measured q times in a row, with each measurement corresponding to a specific angular pose. The ith observation is given as

X_i = (X_{i1}, X_{i2}, \ldots, X_{iq})' \qquad (3.40)

where i = 1, 2, \ldots, n and X_{ik} is the measure of the kth constraint on the ith individual. For comparative purposes, the study considers contrasts of the components of \mu = E(X_i). The successive difference comparison used is given as

C\mu =
\begin{pmatrix} \mu_2 - \mu_1 \\ \mu_3 - \mu_2 \\ \vdots \\ \mu_q - \mu_{q-1} \end{pmatrix}
=
\begin{pmatrix}
-1 & 1 & 0 & \cdots & 0 \\
0 & -1 & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & -1 & 1
\end{pmatrix}
\begin{pmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_q \end{pmatrix}
\qquad (3.41)

where C is called the contrast matrix because its rows are linearly independent, and each row is known as a contrast vector. The nature of the repeated measures design removes much of the influence of unit-to-unit differences on the head-pose comparisons (Johnson & Wichern, 1998). The study randomised the order in which the various head poses were presented to each individual (subject).

3.3.2.1 Assumptions

The assumptions for the multivariate repeated measures design are as follows:

- The samples must satisfy the multivariate normality assumption.
- The observations from the varying angular poses (different variables) are independent of each other.
- The samples must satisfy the sphericity assumption (homogeneity of covariance).
- There must be no outliers in the dataset.

3.3.2.2 Test for Equality of Treatments

All contrast matrices C have q − 1 linearly independent rows. A contrast matrix C can be categorised as either a baseline comparison or a successive-differences comparison. The baseline comparison is given as

C_1\mu =
\begin{pmatrix} \mu_1 - \mu_2 \\ \mu_1 - \mu_3 \\ \vdots \\ \mu_1 - \mu_q \end{pmatrix}
=
\begin{pmatrix}
1 & -1 & 0 & \cdots & 0 \\
1 & 0 & -1 & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
1 & 0 & 0 & \cdots & -1
\end{pmatrix}\mu
\qquad (3.42a)

The successive differences comparison is given as

C_2\mu =
\begin{pmatrix} \mu_2 - \mu_1 \\ \mu_3 - \mu_2 \\ \vdots \\ \mu_q - \mu_{q-1} \end{pmatrix}
=
\begin{pmatrix}
-1 & 1 & 0 & \cdots & 0 \\
0 & -1 & 1 & \cdots & 0 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & -1 & 1
\end{pmatrix}\mu
\qquad (3.42b)

3.3.2.2.1 Hypotheses

Let X \sim N_q(\mu, \Sigma) and let C be a contrast matrix.

- The null hypothesis H_0 claims that there are no significant differences in treatments (equal treatment means).
- The alternative hypothesis H_1 claims that there are significant differences in treatments.

H_0: C\mu = 0 \qquad (3.43a)
H_1: C\mu \ne 0 \qquad (3.43b)

The test statistic (Hotelling's T^2) is given as

T^2 = n(C\bar{x} - C\mu)'(CSC')^{-1}(C\bar{x} - C\mu) \qquad (3.44)

where \bar{x} = \frac{1}{n}\sum_{i=1}^{n} X_i and S = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{x})(X_i - \bar{x})' are, respectively, the arithmetic mean vector and the covariance matrix for the study algorithm.

The critical region is stated as

\frac{(n-1)(k-1)}{n-k+1} F_{\alpha,\,k-1,\,n-k+1} \qquad (3.45)

where \alpha is the level of significance and F_{\alpha,k-1,n-k+1} is the upper 100\alpha th percentile of an F-distribution with k − 1 and n − k + 1 degrees of freedom.

The decision rule states that the null hypothesis is rejected if the test statistic (Hotelling's T^2) exceeds the critical value at the \alpha level of significance. Otherwise we fail to reject the null hypothesis.
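The test for equality of treatments described above can be sketched numerically. The following is a minimal illustration (not the study's Matlab code) of Hotelling's T^2 with the successive-difference contrast matrix; numpy and scipy are assumed, and the input data are hypothetical.

```python
import numpy as np
from scipy.stats import f as f_dist

def repeated_measures_T2(X, alpha=0.05):
    """Hotelling's T^2 test of H0: C mu = 0 for a repeated measures design.

    X : (n, k) array with one row per subject and one column per condition.
    Uses the successive-difference contrast matrix C (shape (k-1, k)).
    Returns (T2, critical_value); H0 is rejected when T2 > critical_value.
    """
    n, k = X.shape
    C = np.zeros((k - 1, k))
    for j in range(k - 1):
        C[j, j], C[j, j + 1] = -1.0, 1.0       # successive differences
    xbar = X.mean(axis=0)                      # sample mean vector
    S = np.cov(X, rowvar=False)                # sample covariance matrix
    Cx = C @ xbar
    T2 = n * Cx @ np.linalg.solve(C @ S @ C.T, Cx)
    crit = ((n - 1) * (k - 1) / (n - k + 1)) * f_dist.ppf(1 - alpha, k - 1, n - k + 1)
    return T2, crit

# Hypothetical data: 12 subjects measured under 4 conditions
rng = np.random.default_rng(3)
data = rng.normal(size=(12, 4))                          # no treatment effect
T2_null, crit_null = repeated_measures_T2(data)
data_shifted = data + np.array([0.0, 5.0, 10.0, 15.0])   # strong effect
T2_eff, crit_eff = repeated_measures_T2(data_shifted)
```

With the strong, monotone shift added to the second dataset, the statistic far exceeds the critical value and the equal-means hypothesis is rejected.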
' n1k 1 T 2  nCx Cμ CSC' Cx Cμ  F (3.46)    ,k1,nk1n k 1 3.3.2.2.3 Confidence Region A confidence region for contrasts Cμ , is obtained by the set of all Cμ such that: ' n1k 1 T 2  nCx Cμ CSC' Cx Cμ  F (3.47)   ,k1,nk1n k 1 1001 % Simultaneous confidence intervals for single contrasts cμ for any contrast vector of interest is given as; n 1k 1 ciSci cμ  cix  F ,k1,nk1 (3.48) n  k 1 n where 𝑪 𝑖 = 1,2,3 … , 𝑘 − 1, is the ith𝒊 row vector of the contrast 𝑪. 3.3.3 Friedman Rank Sum Test The Friedman Rank Sum test is a non-parametric counterpart of repeated measure design. The test is used to detect difference among treatment (varying angular constraints) effects. The 55 University of Ghana http://ugspace.ug.edu.gh purpose of the test is to determine whether there exist statistical significant difference between the median recognition distances of the varying head-poses through their straight pose. The study perform calculations on ranks, which may be derived from observations measured on a higher scale or may be the original observations themselves. 3.3.3.1 Assumptions The Friedman two-way analysis of variance by ranks is based on the following assumptions;  The sample of k subjects has been randomly selected from the population it represents.  The observations within each block may be ranked in ascending/descending order.  The data consist of b mutually independent samples (blocks) each of size k .  The variable of interest is continuous.  There is no interaction between blocks and treatments. 3.3.3.2 Model The linear Friedman Rank Sum test model is given as yij   M i  j  ij , ij 1,2,...,k (3.49) th where  j is the j level of the block and M i is the i th level of the treatment. 
3.3.3.3 Hypotheses

The null hypothesis states that the median Euclidean distances are the same across the repeated measures, and the alternative hypothesis states that the median Euclidean distances across the repeated measures differ:

H_0: M_1 = M_2 = \cdots = M_k \qquad (3.50a)
H_1: M_i \ne M_j \text{ for at least one } i \ne j \qquad (3.50b)

The level of significance is \alpha = 0.05.

Table 3.2: Data display for the Friedman two-way ANOVA by ranks

Block |   1      |   2      |   3      | ... |   j      | ... |   k
1     | X_{11}   | X_{12}   | X_{13}   | ... | X_{1j}   | ... | X_{1k}
2     | X_{21}   | X_{22}   | X_{23}   | ... | X_{2j}   | ... | X_{2k}
3     | X_{31}   | X_{32}   | X_{33}   | ... | X_{3j}   | ... | X_{3k}
⋮     |          |          |          |     |          |     |
b     | X_{b1}   | X_{b2}   | X_{b3}   | ... | X_{bj}   | ... | X_{bk}

Testing Procedure

- First, rank the original observations, unless they are already ranks.
- Rank the observations within each block independently in ascending order, so that each block contains a separate set of k ranks.
- Obtain the sum of ranks R_j in each column (treatment).

The test statistic is computed as

\chi_R^2 = \frac{12}{bk(k+1)} \sum_{j=1}^{k} \left( R_j - \frac{b(k+1)}{2} \right)^2 \qquad (3.52a)

Alternatively,

W = \frac{12\sum_{i=1}^{k} R_i^2 - 3b^2 k(k+1)^2}{b^2 k (k^2 - 1)} \qquad (3.52b)

where b(k+1)/2 is the mean of the rank sums when the null hypothesis is true. The Friedman statistic \chi_R^2 is approximately chi-square distributed, and the null hypothesis is rejected if \chi_R^2 \ge \chi_{1-\alpha,\,k-1}^2. Theoretically, the study expects no ties in the ranking of the Euclidean distances under angular constraints in a face recognition system: the median recognition distance is ranked on an ordinal scale and assumed to be continuous. In practice there may be ties; where ties occur in the ranking of the median recognition distances, the study assigns the tied observations the average of the rank positions for which they are tied.
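As an illustration of the Friedman test on data shaped like the study's (10 subjects as blocks, 8 angular constraints as treatments), the following sketch uses `scipy.stats.friedmanchisquare` on hypothetical distances that drift upward with angle; it is not the study's R code.

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical recognition distances: rows = 10 subjects (blocks),
# columns = 8 angular constraints (treatments).  The distances drift
# upward with the angle, mimicking the pattern the study reports.
rng = np.random.default_rng(1)
means = 4.0 * np.arange(1, 9)                 # stand-ins for 4..32 degrees
X = means + rng.normal(scale=1.0, size=(10, 8))

# friedmanchisquare takes one sequence per treatment (i.e. per column);
# it ranks within each block and applies the tie-corrected statistic.
stat, pvalue = friedmanchisquare(*X.T)
```

A small p-value leads to rejecting the hypothesis of equal medians across the angular constraints.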
The test statistic for the tie-corrected Friedman two-way analysis of variance is defined as

W = \frac{12\sum_{i=1}^{k} R_i^2 - 3b^2 k(k+1)^2}{b^2 k (k^2 - 1) - b\sum (t^3 - t)} \qquad (3.53)

where t is the number of observations tied at a given rank within a block.

3.3.3.4 Pairwise Comparisons

The study compares all possible differences between pairs of samples, where the experiment-wise error rate is \alpha. The assumptions for the post-hoc Friedman two-way ANOVA test are a balanced design (n_1 = n_2 = \cdots = n_k = n) for each treatment (k = 10) and a ranking of the variable whose values are in the dataset. The critical difference refers to the mean rank sums |\bar{R}_i - \bar{R}_j|:

|\bar{R}_i - \bar{R}_j| > \frac{q_{\infty;k;\alpha}}{\sqrt{2}} \sqrt{\frac{k(k+1)}{6n}} \qquad (3.54)

where k refers to the number of individuals, n is the number of angular constraints in the dataset, and q_{\infty;k;\alpha} is based on the studentised range distribution with r − 1 degrees of freedom.

CHAPTER FOUR

DATA ANALYSIS

4.1 Introduction

This chapter presents the comprehensive computational work of the study. It focuses on running the DWT-PCA/SVD face recognition algorithm, discussed in the previous chapter, on the available database under angular constraints. The chapter further explains the rationale behind every method used and the outcome of the analysis.

4.1.1 Data Description

The data were extracted from the Massachusetts Institute of Technology (MIT) database (2003-2005). The study generated 3D face models of each individual, producing head-pose images at the straight pose (0°) and eight angular constraints (4°, 8°, 12°, 16°, 20°, 24°, 28°, 32°). 80 head-pose images from 10 individuals were captured into the study database, and 10 face images (the 0° head pose) from the 10 individuals were captured into the train image database. Both the train and test images were resized to a 100 × 100 dimensional unit. Recall that Plate 3.1 displayed the original frontal images of the dataset.
Plate 4.1: The display of the ten train images
Source: Weyrauch, Huang, Heisele & Blanz (2004)

70 head-pose images from the 10 individuals across (4°, 8°, 12°, 16°, 20°, 24°, 28°, 32°) were captured into the test image database.

Plate 4.2: Examples of the test images under the various angular poses
Source: Weyrauch, Huang, Heisele & Blanz (2004)

4.2 Results

The results for the two-dimensional face images are divided into three stages, as stated in the research methodology.

4.2.1 Preprocessing Stage

Prior to measuring the training and testing distances, a preprocessing stage is applied to remove noise artefacts in the database before collecting the sample image values for analysis. The reviewed literature revealed that a pre-processing phase significantly increases the consistency of visual examination and improves the performance of a face recognition algorithm. In the preprocessing stage, the study applied a discrete wavelet transform to the images in the database. Thus, the Discrete Wavelet Transform (DWT) serves as the noise-reducing mechanism, making the computation process simpler and better conditioned for recognition. Moreover, the DWT technique is easy to implement on a computer (Asiedu, Adebanji, Oduro & Mettle, 2015a). From the experimental runs of the Haar Discrete Wavelet Transform, the real components of the transformed images are extracted for the feature extraction stage, whereas the imaginary components are treated as noise. Plate 4.3 shows the DWT-preprocessed versions of the angular images shown in Plate 4.2. The Discrete Wavelet Transform (DWT) is a viable noise reduction mechanism in the image pre-processing phase, and a computationally efficient algorithm exists for computing both the DWT and its inverse, the Inverse Discrete Wavelet Transform (IDWT).
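For illustration, one analysis/synthesis level of the Haar DWT described above can be written directly in numpy. This is a minimal sketch, not the study's Matlab pipeline (which also interposes a Gaussian filter before the IDWT); the image used is hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2D Haar DWT: returns (cA, cH, cV, cD) subbands."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    cA = (a + b + c + d) / 2.0      # approximation (low-low)
    cH = (a - b + c - d) / 2.0      # horizontal detail
    cV = (a + b - c - d) / 2.0      # vertical detail
    cD = (a - b - c + d) / 2.0      # diagonal detail
    return cA, cH, cV, cD

def haar_idwt2(cA, cH, cV, cD):
    """Inverse of haar_dwt2: perfect reconstruction of the pixel grid."""
    h, w = cA.shape
    img = np.empty((2 * h, 2 * w))
    img[0::2, 0::2] = (cA + cH + cV + cD) / 2.0
    img[0::2, 1::2] = (cA - cH + cV - cD) / 2.0
    img[1::2, 0::2] = (cA + cH - cV - cD) / 2.0
    img[1::2, 1::2] = (cA - cH - cV + cD) / 2.0
    return img

# Hypothetical 8x8 "image"; zeroing the detail subbands before the
# inverse transform is a crude form of wavelet-domain noise removal.
img = np.arange(64, dtype=float).reshape(8, 8)
cA, cH, cV, cD = haar_dwt2(img)
smoothed = haar_idwt2(cA, 0 * cH, 0 * cV, 0 * cD)
reconstructed = haar_idwt2(cA, cH, cV, cD)   # equals img exactly
```

Keeping all four subbands reconstructs the image exactly; suppressing the detail subbands (or filtering them, as the study does with a Gaussian filter) removes high-frequency noise.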
The Gaussian filter was adopted to filter the noise after the DWT, before the Inverse Discrete Wavelet Transform (IDWT) was performed to reconstruct the image. Figure 4.1 illustrates the DWT and IDWT procedure.

Plate 4.3: Preprocessed images from the DWT

4.2.2 Feature Extraction Stage

Feature extraction starts after the image decomposition ends. The wavelet decomposition extracts the features for further processing. The test face dataset information is calculated according to the described process, and its principal components are extracted.

Procedures

The study displays the results for the first six train images in the database.

Step One

Consider six train images in the database. Recall from (3.32): given the training set of images X = (X_1, X_2, \ldots, X_m), define an image matrix P_i = (p_{ijk}), j, k = 1, 2, \ldots, m, for i = 1, 2, \ldots, m, and write P_i = (p_{i1}, p_{i2}, \ldots, p_{is}), where p_{ik} = (p_{i1k}, p_{i2k}, \ldots, p_{isk})', so that X_i = (p_{i1}', p_{i2}', \ldots, p_{is}')'.

Step Two

Vectorise the train images and reshape. The train images are reshaped from 2D to 1D in order to construct the face database, with each row depicting the train image of one individual. The dimension of each constructed row vector is 1 × 10000. The reshaping of an image from 2D to 1D is done by a single code line in Matlab, and the outcome is the vectorised train image. The processes in Steps 1 and 2 are repeated for all face images in the training dataset.

Step Three

Compute the mean of the reshaped train images.

Step Four

Determine the mean-centred images.
Step Five

The covariance matrix for the train images is given as

C =
[  0.0184  -0.0045  -0.0097   0.0052  -0.0090  -0.0005
  -0.0045   0.0315  -0.0125  -0.0033  -0.0063  -0.0049
  -0.0097  -0.0125   0.0331  -0.0080  -0.0000  -0.0028
   0.0052  -0.0033  -0.0080   0.0215  -0.0100  -0.0055
  -0.0090  -0.0063  -0.0000  -0.0100   0.0289  -0.0035
  -0.0005  -0.0049  -0.0028  -0.0055  -0.0035   0.0171 ]

Figure 4.1: A display of the covariance matrix of the train images

Next, a singular value decomposition (SVD) of the matrix, C = USV', was run to ascertain its eigenvalues and their corresponding eigenvectors, as seen in equation (3.34). This splits the covariance matrix C into two orthogonal matrices U and V and a diagonal matrix S.

Step Six

The singular value decomposition of the covariance matrix gives

U =
[ -0.3067   0.3718  -0.2349   0.1029   0.7316   0.4082
  -0.4473  -0.6424   0.4642  -0.0302   0.0645   0.4082
   0.6495   0.2382   0.5720  -0.1121   0.1229   0.4082
  -0.3356   0.4137  -0.0699  -0.5148  -0.5288   0.4082
   0.4132  -0.4631  -0.6204  -0.2508   0.0168   0.4082
   0.0270   0.0818  -0.1109   0.8050  -0.4069   0.4082 ]

S = diag(0.0503, 0.0369, 0.0285, 0.0223, 0.0127, 0.0000)

V =
[ -0.3067   0.3718  -0.2349   0.1029   0.7316   0.4082
  -0.4473  -0.6424   0.4642  -0.0302   0.0645   0.4082
   0.6495   0.2382   0.5720  -0.1121   0.1229   0.4082
  -0.3356   0.4137  -0.0699  -0.5148  -0.5288   0.4082
   0.4132  -0.4631  -0.6204  -0.2508   0.0168   0.4082
   0.0270   0.0818  -0.1109   0.8050  -0.4069   0.4082 ]

Figure 4.2 depicts the image of the eigenvectors of the covariance matrix C. The extent of variability carried by each component is seen in its level of brightness: the brighter the image, the greater the importance of the component to the variability. This means the bright portions contain more important information about the covariance matrix C than the dark portions.
The hierarchy of importance is determined by the level of brightness.

Figure 4.2: Image unit of the eigenvectors U

The diagonal values of the matrix S are the eigenvalues of the covariance matrix C, and their corresponding unit eigenvectors are the columns of the matrix U. Here, the eigenvalues are arranged in descending order, which accounts for why the brightness fades as the values become smaller. This is illustrated in Figure 4.3.

Figure 4.3: Image of the diagonal matrix of the train images

Step Seven

The eigenvalues for the train images are deduced from the solution of the equation |C - \lambda I| = 0, where C is the covariance matrix, I is the identity matrix and \lambda is an eigenvalue. The eigenvalues are arranged in order of magnitude:

eigenvalues = (5.05e-04, 2.72e-04, 1.62e-04, 9.90e-05, 3.21e-05, 4.26e-20)

The first eigenvalue accounts for approximately 35% of the variance in the dataset, the first three eigenvalues together explain approximately 66%, and the first four eigenvalues together approximately 92%.
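Steps Six and Seven can be reproduced with numpy's SVD. The sketch below uses a hypothetical 6 × 6 symmetric positive semi-definite matrix in place of the study's covariance matrix C; for such a matrix, the singular values returned (in descending order) are its eigenvalues.

```python
import numpy as np

# Hypothetical symmetric PSD matrix standing in for the covariance C above
rng = np.random.default_rng(2)
A = rng.normal(size=(6, 6))
C = A @ A.T / 6.0

U, s, Vt = np.linalg.svd(C)              # C = U diag(s) V'
reconstructed = U @ np.diag(s) @ Vt      # splits of C multiplied back together
```

U is orthogonal and s is already sorted from largest to smallest, which is why the brightest eigenvector images come first in the displays above.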
Figure 4.6: Proportion of the variance explained by the principal components

The eigenvectors are derived from Cv = \lambda v, where v is an eigenvector:

eigenvectors =
[  0.3066  -0.3720  -0.2334   0.1046   0.7315  -0.4085
   0.4472   0.6431   0.4636  -0.0333   0.0646  -0.4077
  -0.6505  -0.2361   0.5709  -0.1184   0.1243  -0.4070
   0.3340  -0.4122  -0.0738  -0.5169  -0.5276  -0.4091
  -0.4130   0.4641  -0.6233  -0.2437   0.0176  -0.4070
  -0.0276  -0.0839  -0.1026   0.8045  -0.4078  -0.4099 ]

Step Eight: Eigenfaces

The eigenfaces are computed as

Eigenfaces = eigenvectors × mean-centred images

Plate 4.4 shows the eigenfaces corresponding to the six train images.

Plate 4.4: Image of the eigenfaces of the train images

Step Nine: Principal Component Analysis

The principal components of the train images are given as

Faces (Γ) = eigenfaces × mean-centred images

faces =
[ -160.3687  -242.0858   335.1241  -170.9149   223.9642    14.2811
   136.4290  -238.6740    88.7261   152.3101  -169.0179    30.2267
   -63.7019   140.9452   158.3819   -18.7857  -184.9114   -31.9281
    23.8370    -4.1111   -26.2442  -114.2033   -58.2758   178.9974
    93.4786    10.7522    14.2499   -66.5693    -0.3244   -51.5870
     0.0000    -0.0000    -0.0000    -0.0000    -0.0000     0.0000 ]

Figure 4.7: Image of the principal components of the train images

Again, brightness is tied directly to the level of importance in describing the variability in the image: a high level of brightness corresponds to high variability, and therefore high importance of that point in carrying information about the image characteristics.

4.2.3 Recognition Procedure Stage

This stage gives a step-by-step approach to recognising an unknown face from the trained database. Recalling equation (3.59), the recognition distance was determined.
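The nine steps above (vectorise, mean-centre, form the small m × m covariance, decompose it with SVD, build eigenfaces and project) can be sketched end-to-end in a few lines. This is a hedged numpy illustration on hypothetical 10 × 10 "faces", not the study's Matlab implementation; the function and variable names are invented for the example.

```python
import numpy as np

def train_eigenfaces(train_imgs, n_components=5):
    """PCA/SVD training: returns the mean face, eigenfaces and train weights.

    train_imgs : (m, h, w) stack of face images.  Uses the small m x m
    covariance matrix (Step Five) rather than the huge pixel covariance.
    """
    m = train_imgs.shape[0]
    X = train_imgs.reshape(m, -1).astype(float)   # Step Two: 2D -> 1D rows
    mean = X.mean(axis=0)                         # Step Three
    A = X - mean                                  # Step Four: mean-centred
    C = A @ A.T / m                               # Step Five: m x m covariance
    U, s, _ = np.linalg.svd(C)                    # Step Six
    eigfaces = U[:, :n_components].T @ A          # Step Eight: eigenfaces
    eigfaces /= np.linalg.norm(eigfaces, axis=1, keepdims=True)
    weights = A @ eigfaces.T                      # Step Nine: projections
    return mean, eigfaces, weights

def recognise(test_img, mean, eigfaces, weights):
    """Return (best_index, euclidean_distance) to the closest train image."""
    w = (test_img.reshape(-1).astype(float) - mean) @ eigfaces.T
    dists = np.linalg.norm(weights - w, axis=1)   # recognition distances
    return int(np.argmin(dists)), float(dists.min())

# Hypothetical database: six random 10x10 "faces"
rng = np.random.default_rng(4)
faces = rng.normal(size=(6, 10, 10))
mean, eigfaces, weights = train_eigenfaces(faces)
probe = faces[2] + 0.01 * rng.normal(size=(10, 10))   # slightly noisy copy
idx, dist = recognise(probe, mean, eigfaces, weights)
```

The probe, being a lightly perturbed copy of the third train face, projects to weights closest to that face's, so the minimum Euclidean distance identifies it.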
4.2.4 Recognition (Euclidean) Distance

The recognition (Euclidean) distances of the various angular poses from their straight position are displayed in Table 4.1.

Table 4.1: The Euclidean distances for the ten images in the train dataset

 4°       8°       12°      16°      20°      24°      28°      32°
10.405   16.209   21.806   27.581   32.392   36.737   39.489   38.596
11.134   10.727   15.062   21.341   28.049   36.304   44.245   51.687
 7.5496   9.4757  12.376   16.377   21.669   27.687   34.071   40.492
 9.6804  13.774   17.912   22.623   25.976   28.416   30.41    32.614
 4.3542   8.3438  12.55    16.485   20.331   24.798   29.166   33.122
 6.9719  12.342   18.608   24.375   30.413   35.756   38.254   37.399
 8.3455  13.808   20.139   27.335   34.3     40.567   45.087   46.734
10.03    16.551   22.694   28.339   33.091   34.614   32.337   30.516
 7.3898  11.584   18.223   26.311   35.853   45.343   52.993   59.356
 7.8737   7.2926  15.341   24.917   35.449   46.21    56.213   61.881

Source: Matlab output

From Table 4.1, it is evident that as an individual increases the angular position of the head pose, the recognition (Euclidean) distance increases.

4.3 Statistical Evaluation of the Euclidean Distances

The face recognition algorithm under study is DWT-PCA/SVD with a Gaussian filter. To assess whether significant differences exist between the various angular poses and the straight pose, the repeated measures design was adopted. This statistical technique requires the study dataset to be multivariate normally distributed.

4.3.1 Assessing Normality

A normality test is a preliminary check used to identify the underlying distribution. Since the dataset has multiple angular poses, the study investigated whether the Euclidean distances from the train images follow a parametric or a non-parametric distribution.

4.3.2 Test for Multivariate Normality

The study performed a multivariate normality test on the recognition (Euclidean) distances based on their respective angular poses. The study adopted the chi-square plot to test the MVN assumption.
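The chi-square plot check can be sketched as follows, assuming numpy and scipy are available; the data here are hypothetical, not the study's recognition distances.

```python
import numpy as np
from scipy.stats import chi2

def chi_square_plot_points(X):
    """Ordered generalised squared distances vs. chi-square quantiles.

    X : (n, q) matrix of recognition distances (rows = subjects,
    columns = angular constraints).  Under multivariate normality the
    returned pairs lie roughly on a unit-slope line through the origin.
    """
    n, q = X.shape
    diff = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum('ij,jk,ik->i', diff, S_inv, diff)   # d_i^2, eq. (3.39)
    d2_ordered = np.sort(d2)
    probs = (np.arange(1, n + 1) - 0.5) / n            # (i - 1/2)/n
    quantiles = chi2.ppf(probs, df=q)
    return quantiles, d2_ordered

# Hypothetical sample: n = 10 subjects, q = 3 constraints
rng = np.random.default_rng(0)
sample = rng.normal(size=(10, 3))
qs, ds = chi_square_plot_points(sample)
r = np.corrcoef(qs, ds)[0, 1]   # correlation used in the linearity check
```

Plotting `ds` against `qs` gives the chi-square plot; a correlation well below the tabulated critical value (as the study finds next) indicates departure from multivariate normality.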
4.3.2.1 The Chi-Square Plot

Figure 4.4 shows the chi-square plot of the observed data quantiles against the normal quantiles on the Cartesian plane: the pairs \left( q_{c,q}\!\left(\frac{i-1/2}{n}\right),\; d_{(i)}^2 \right), where q_{c,q}\!\left(\frac{i-1/2}{n}\right) is the 100\left(\frac{i-1/2}{n}\right) quantile of the chi-square distribution with q degrees of freedom.

Figure 4.4: A graph of the observed data quantiles against the normal quantiles

It can be seen from Figure 4.4 that the graph consistently deviates from the unit slope. Also, using the correlation test, the correlation r between the observed data quantiles and the normal quantiles is 0.8386. This value is less than the expected 0.9198 for a sample size of n = 10 at a significance level of 5%. It can therefore be concluded that the assumption of multivariate normality is not tenable at the 5% level of significance. It should, however, be noted that there is no absolutely definitive method for testing multivariate normality. The non-parametric counterpart of the repeated measures design (the Friedman Rank Sum test) is therefore explored as a substitute.

4.3.3 Friedman Rank Sum Test

The Friedman test was used to evaluate median differences among the angular frontal poses of the train images. The results are displayed in Table 4.3.

Table 4.3: Friedman Test

Friedman Rank Sum Test
chi-square | DF | p-value
6.6367     | 7  | 7.99E-12

Source: R output

The result in Table 4.3 rejects the null hypothesis: there is enough evidence to support the claim that significant differences exist in the median Euclidean distances among the angular poses (\chi_R^2 = 6.6367; p-value = 7.99 × 10⁻¹²). The study further performed pairwise comparisons to investigate which angular pose positions cause a significant difference in the median recognition (Euclidean) distance.
The results of the pairwise comparisons (post hoc of the Friedman Rank Sum test), shown in Table 4.4, revealed that at the 5% level of significance the angular pose pairs (4°, 20°), (4°, 24°), (4°, 28°) and (4°, 32°); (8°, 24°), (8°, 28°) and (8°, 32°); (12°, 28°) and (12°, 32°); and (16°, 32°) are statistically significantly different in median recognition distance. It can therefore be concluded that the recognition distances of head poses 4° and 8° are significantly lower than those of head poses 20° and above; the recognition distances of head pose 12° are significantly lower than those of head poses 28° and above; and the recognition distances of head pose 16° are significantly lower than those of head pose 32°, when these constraints are recognised by the study algorithm. It can be inferred from these results that the higher the degree of head pose, the larger the recognition distance, and that at 20° and above the recognition distances become profoundly larger compared with the 4° head pose.
Table 4.4: Pairwise comparisons (post hoc of the Friedman test)

      4°         8°         12°        16°       20°      24°      28°
8°    0.99939    -          -          -         -        -        -
12°   0.7239     0.95797    -          -         -        -        -
16°   0.1723     0.47609    0.98494    -         -        -        -
20°   0.00635*   0.04024*   0.47609    0.95797   -        -        -
24°   0.00014*   0.00152*   0.06841**  0.47609   0.98494  -        -
28°   2.00E-06*  3.60E-05*  0.00451*   0.0877**  0.66454  0.99196  -
32°   2.50E-07*  5.40E-06*  0.00104*   0.03036*  0.41504  0.93595  0.99996

* significant difference between angular constraints at the 5% significance level
** significant difference between angular constraints at the 10% significance level

4.3.4 Recognition Rate

The Massachusetts Institute of Technology (MIT) database was used to test the recognition rate performance of the modified DWT-PCA/SVD under angular constraints. Recalling equation (3.51), the total number of correct recognitions is \sum_{j=1}^{q} n_{cls_j} = 70 for the algorithm, the total number of experimental runs is q = 20, and the total number of face images in a single experimental run is n_{tot} = 4. The average (mean) recognition rate of the face algorithm is

R_avg = 70 / (20 × 4) × 100 = 87.5%

The average error rate of the face algorithm is

E_avg = 100 − 87.5 = 12.5%

The average runtime of the face algorithm in the recognition of the 80 face images is approximately 4 seconds.

Table 4.5: Recognition Rate

4°     8°     12°    16°    20°    24°    28°    32°
100%   100%   100%   100%   100%   90%    60%    50%

Source: R output

With reference to Table 4.5, there is evidence that the modified DWT-PCA/SVD algorithm obtained 100% recognition precision for the angular constraints 4°, 8°, 12°, 16° and 20°, but declined steeply for all angular constraints exceeding 20°.

4.3.5 Recognition Curve

The study displays a graphical representation of the recognition rate with respect to the angular pose.
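The recognition-rate arithmetic of Section 4.3.4 and Table 4.5 is easy to check; the short script below reproduces the 87.5% average and 12.5% error rate from the figures reported in the text.

```python
# Average recognition and error rates as computed in Section 4.3.4:
# 70 correct recognitions over q = 20 experimental runs of n_tot = 4 images.
n_correct = 70
q_runs = 20
n_tot = 4

r_avg = n_correct / (q_runs * n_tot) * 100   # average recognition rate (%)
e_avg = 100 - r_avg                          # average error rate (%)

# The per-angle rates of Table 4.5 average to the same figure:
per_angle = [100, 100, 100, 100, 100, 90, 60, 50]
table_avg = sum(per_angle) / len(per_angle)
```

Both computations give 87.5%, confirming that the overall rate and the per-angle breakdown are mutually consistent.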
Figure 4.5: The recognition curve (recognition rate, %, against the angular constraints 4° to 32°)

This implies that, adopting the DWT-PCA/SVD algorithm, the recognition rate declines for the varying angular constraints above 20°.

Plate 4.5: Example of a correctly recognised individual across all angular poses
Source: Weyrauch, Huang, Heisele & Blanz (2004)

CHAPTER FIVE

SUMMARY, CONCLUSION AND RECOMMENDATION

5.1 Introduction

This is the final chapter of the study; it summarises the work, draws conclusions and makes recommendations for future thesis work in this field.

5.2 Summary

The study provided a concrete application of a face recognition algorithm using the Discrete Wavelet Transform and Principal Component Analysis with Singular Value Decomposition (DWT-PCA/SVD) under angular constraints. The study used ten individuals' images with angular constraints from the Massachusetts Institute of Technology database (2003-2005). From the numerical evaluations in the previous chapter, the recognition rate of Principal Component Analysis with Singular Value Decomposition (PCA/SVD) with the Discrete Wavelet Transform as a noise removal technique in the preprocessing stage (DWT-PCA/SVD) is 87.5%, with a mean error rate of 12.5%. The average runtime of the modified DWT-PCA/SVD algorithm in the recognition of the 80 face images in the dataset is approximately 4 seconds; the time used by the DWT-PCA/SVD algorithm in preprocessing accounts for much of the algorithm's runtime (speed). It can be concluded from the results of the Friedman Rank Sum test that statistically significant differences exist in the recognition distances between the varying angular constraints (4°, 8°, 12°, 16°, 20°, 24°, 28° and 32°) and their straight pose (0°).
The post hoc test revealed that significant differences in the median Euclidean distances exist whenever the angular pose positions are far apart.

5.3 Conclusions and Recommendations

The study successfully modified the Principal Component Analysis and Singular Value Decomposition (PCA/SVD) algorithm by using the Discrete Wavelet Transform as a noise removal mechanism during the face image pre-processing stage. From the study's numerical evaluation, it can be concluded that DWT-PCA/SVD has an appreciable performance when used to recognise face images under angular constraints. A non-parametric statistical technique (the Friedman Rank Sum test) was used to assess whether statistically significant differences exist between the median recognition distances of the varying angular head poses and their straight pose when recognised by DWT-PCA/SVD. From the study's findings, it is highly probable that face images captured with angular constraints less than or equal to 20° can be recognised by the study algorithm (DWT-PCA/SVD). The Discrete Wavelet Transform is therefore suggested as a feasible de-noising technique which ought to be employed during the pre-processing stages of image processing. Future studies should analyse the effect of other noise removal mechanisms. Another area for future research could be a study of other constraints, such as ageing, occlusions, face images with glasses and lighting conditions.

REFERENCES

Abdullah, M., Wazzan, M., & Bo-Saeed, S. (2012). Optimizing face recognition using PCA. arXiv preprint arXiv:1206.1515.

Asiedu, L., Adebanji, A. O., Oduro, F., & Mettle, F. O. (2015). Statistical evaluation of face recognition techniques under variable environmental constraints. International Journal of Statistics and Probability, 4(4), 93.

Baah, G. S. (2013). Face recognition using principal component analysis. KNUSTSpace.

Barnouti, N.
H., Al-Dabbagh, S. S. M., Matti, W. E., & Naser, M. A. S. (2016). Face detection and recognition using Viola-Jones with PCA-LDA and square Euclidean distance. International Journal of Advanced Computer Science and Applications (IJACSA), 7(5).

Bartlett, M. S., Movellan, J. R., & Sejnowski, T. J. (2002). Face recognition by independent component analysis. IEEE Transactions on Neural Networks, 13(6), 1450-1464.

Beham, M. P., & Roomi, S. M. M. (2012). Face recognition using appearance based approach: A literature survey. Paper presented at the Proceedings of the International Conference & Workshop on Recent Trends in Technology, Mumbai, Maharashtra, India.

Belhumeur, P. N., Jacobs, D. W., Kriegman, D. J., & Kumar, N. (2013). Localizing parts of faces using a consensus of exemplars. IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(12), 2930-2940.

Bellakhdhar, F., Loukil, K., & Abid, M. (2013). Face recognition approach using Gabor wavelets, PCA and SVM. International Journal of Computer Science Issues, 10(2), 201-207.

Bhattacharyya, S. K., & Rahul, K. (2013). Face recognition by linear discriminant analysis. International Journal of Communication Network Security, 2(2), 31-35.

Bledsoe, W. (1964). The model method in facial recognition. Panoramic Research Inc., Palo Alto, CA, Technical Report PRI:15.

Brunelli, R., & Poggio, T. (1993). Face recognition: Features versus templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(10), 1042-1052.

Bruner, J. S., & Tagiuri, R. (1954). The perception of people. In G. Lindzey (Ed.), Handbook of Social Psychology (Vol. II). Cambridge, MA: Addison-Wesley.

Bultheel, A. (1995). Learning to swim in a sea of wavelets. Bulletin of the Belgian Mathematical Society - Simon Stevin, 2(1), 1-45.

Chang, S. G., Yu, B., & Vetterli, M. (2000). Adaptive wavelet thresholding for image denoising and compression.
IEEE Transactions on Image Processing, 9(9), 1532-1546.

Chellappa, R., Wilson, C. L., & Sirohey, S. (1995). Human and machine recognition of faces: A survey. Proceedings of the IEEE, 83(5), 705-741.

Darwin, C. (1872). The expression of the emotions in man and animals. London: John Murray.

Daubechies, I., Mallat, S., & Willsky, A. S. (1992). Special issue on wavelet transforms and multiresolution signal analysis - introduction. New York, NY: IEEE.

De Carrera, P. F., & Marques, I. (2010). Face recognition algorithms. Master's thesis in Computer Science, Universidad Euskal Herriko.

Dhoke, P., & Parsai, M. (2014). A Matlab based face recognition using PCA with back propagation neural network. Int. J. Innov. Res. Comput. Commun. Eng., 2(8).

Draper, B. A., Baek, K., Bartlett, M. S., & Beveridge, J. R. (2003). Recognizing faces with PCA and ICA. Computer Vision and Image Understanding, 91(1), 115-137.

Galton, F. (1892). Finger prints. Macmillan and Company.

Georghiades, A. S., Belhumeur, P. N., & Kriegman, D. J. (2001). From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643-660.

Goyal, S., & Batra, N. (2016). Using 3G dongle, recognition and reporting based on pervasive computing. International Journal of Computer Science and Information Security, 14(10), 907.

Gross, R., & Brajovic, V. (2003). An image preprocessing algorithm for illumination invariant face recognition. Paper presented at AVBPA.

Guillamet, D., & Vitria, J. (2002). Non-negative matrix factorization for face recognition. CCIA, 2, 336-344.

Harandi, M. T., Ahmadabadi, M. N., & Araabi, B. N. (2009). Optimal local basis: A reinforcement learning approach for face recognition. International Journal of Computer Vision, 81(2), 191-204.

Haykin, S. S., Haykin, S. S., Haykin, S.
S., & Haykin, S. S. (2009). Neural networks and learning machines (Vol. 3): Pearson Upper Saddle River, NJ, USA:. Hyvärinen, A., Karhunen, J., & Oja, E. (2004). Independent component analysis (Vol. 46): John Wiley & Sons. Johnson, R. A., & Wichern, D. (1998). Multivariate analysis. Wiley StatsRef: Statistics Reference Online. Kabir, M. S., Hossain, M. S., & Islam, R. (2016). PERFORMANCE ANALYSIS BETWEEN PCA AND ICA IN HUMAN FACE DETECTION. Kadam, K. D. (2014). Face Recognition using Principal Component Analysis with DCT. International Journal of Engineering Research and General Science, ISSN, 2091-2730. Kakade, S. D. (2016). A review paper on face recognition techniques. International Journal for Research in Engineering Application & Management (IJREAM). Kanade, T. (1973). Computer Recognition of Human Faces, Birkhauser, Basel, Switzerland and Stuttgart: Germany. 80 University of Ghana http://ugspace.ug.edu.gh Kelly, M. D. (1970). Visual identification of people by computer: STANFORD UNIV CALIF DEPT OF COMPUTER SCIENCE. Kirby, M., & Sirovich, L. (1990). Application of the Karhunen-Loeve procedure for the characterization of human faces. IEEE Transactions on pattern analysis and machine intelligence, 12(1), 103-108. Kociołek, M., Materka, A., Strzelecki, M., & Szczypiński, P. (2001). Discrete wavelet transform- derived features for digital image texture analysis. Paper presented at the International Conference on Signals and Electronic Systems. Kohonen, T. (1989). Self-organization and associative memory (Vol. 8): Springer Science & Business Media. Kshirsagar, V., Baviskar, M., & Gaikwad, M. (2011). Face recognition using Eigenfaces. Paper presented at the Computer Research and Development (ICCRD), 2011 3rd International Conference on. Lai, J. H., Yuen, P. C., & Feng, G. C. (2001). Face recognition using holistic Fourier invariant features. Pattern Recognition, 34(1), 95-109. Liao, S., Jain, A. K., & Li, S. Z. (2013). 
Partial face recognition: Alignment-free approach. IEEE Transactions on pattern analysis and machine intelligence, 35(5), 1193-1205. Liu, Z., Zhou, J., & Jin, Z. (2010). Face recognition based on illumination adaptive LDA. Paper presented at the Pattern Recognition (ICPR), 2010 20th International Conference on. Luo, L., Swamy, M., & Plotkin, E. I. (2003). A modified PCA algorithm for face recognition. Paper presented at the Electrical and Computer Engineering, 2003. IEEE CCECE 2003. Canadian Conference on. Moghaddam, B., & Pentland, A. (1997). Probabilistic visual learning for object representation. IEEE Transactions on pattern analysis and machine intelligence, 19(7), 696-710. 81 University of Ghana http://ugspace.ug.edu.gh Murtaza, M., Sharif, M., Raza, M., & Shah, J. (2014). Face recognition using adaptive margin fisher’s criterion and linear discriminant analysis. International Arab Journal of Information Technology, 11(2), 1-11. Murugan, D., Arumugam, S., Rajalakshmi, K., & Manish, T. (2010). Performance evaluation of face recognition using Gabor filter, log Gabor filter and discrete wavelet transform. International journal of computer science & information Technology (IJCSIT), 2(1). Nicholl, P., & Amira, A. (2008). DWT/PCA face recognition using automatic coefficient selection. Paper presented at the Electronic Design, Test and Applications, 2008. DELTA 2008. 4th IEEE International Symposium on. Passalis, G., Perakis, P., Theoharis, T., & Kakadiaris, I. A. (2011). Using facial symmetry to handle pose variations in real-world 3D face recognition. IEEE Transactions on pattern analysis and machine intelligence, 33(10), 1938-1951. Paul, L. C., & Al Sumam, A. (2012). Face recognition using principal component analysis method. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 1(9), pp: 135-139. Poularikas, A. (1999). The Hankel transform. The Handbook of Formulas and Tables for Signal Processing. 
Ramalingam, P., & Dhanushkodi, S. (2016). Recognizing expression variant and occluded face images based on nested HMM and fuzzy rule based approach. Circuits and Systems, 7(6), 983.
Reddy, K. R. L., Babu, G., & Kishore, L. (2011). Face recognition based on eigen features of multi scaled face components and artificial neural network. International Journal of Security and Its Applications, 5(3), 23-44.
Sahu, G., & Tiwari, A. (2015). Training free face recognition and cropping using skin identification algorithm. International Journal of Scientific & Engineering Research, 6(3).
Sahu, K., Singh, Y. P., & Kulshrestha, A. (2013). A comparative study of face recognition system using PCA and LDA. International Journal of IT, Engineering and Applied Sciences Research, 2(10).
Shan, C., Gong, S., & McOwan, P. W. (2009). Facial expression recognition based on local binary patterns: A comprehensive study. Image and Vision Computing, 27(6), 803-816.
Singh, A., Singh, S. K., & Tiwari, S. (2012). Comparison of face recognition algorithms on dummy faces. The International Journal of Multimedia & Its Applications, 4(4), 121.
Singh, N. A., Kumar, M. B., & Bala, M. C. (2016). Face recognition system based on SURF and LDA technique. International Journal of Intelligent Systems and Applications, 8(2), 13.
Sirovich, L., & Kirby, M. (1987). Low-dimensional procedure for the characterization of human faces. Journal of the Optical Society of America A, 4(3), 519-524.
Szeliski, R. (2010). Computer vision: Algorithms and applications. Springer Science & Business Media.
Thakur, S., Sing, J. K., Basu, D. K., Nasipuri, M., & Kundu, M. (2008). Face recognition using principal component analysis and RBF neural networks. Paper presented at the First International Conference on Emerging Trends in Engineering and Technology (ICETET'08).
Turk, M., & Pentland, A. (1991). Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1), 71-86.
Vijg, J. (2011). The American technological challenge: Stagnation and decline in the 21st century. Algora Publishing.
Wagner, P. (2012). Face recognition with Python. Available at: www.bytefish.de [accessed 16 February 2015].
Weyrauch, B., Heisele, B., Huang, J., & Blanz, V. (2004). Component-based face recognition with 3D morphable models. Paper presented at the Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04).
Woodward Jr., J. D., Horn, C., Gatune, J., & Thomas, A. (2003). Biometrics: A look at facial recognition. Santa Monica, CA: RAND Corporation.

APPENDIX I
CODES

# Read the recognition-distance data and run a Shapiro-Wilk normality
# test on each of the eight angular constraints (one column per angle)
Imdata <- read.csv(file.choose(), header = TRUE)
Imdata <- as.matrix(Imdata)
colnames(Imdata)
for (j in 1:8) print(shapiro.test(Imdata[, j]))

# Symmetric matrix a (values as given in the original appendix; the
# entries printed there as "-0" and "-0.01" are written out in full)
a <- cbind(c( 0.0184, -0.0045, -0.0097,  0.0052, -0.0090, -0.0005),
           c(-0.0045,  0.0315, -0.0125, -0.0033, -0.0063, -0.0049),
           c(-0.0097, -0.0125,  0.0331, -0.0080,  0.0000, -0.0028),
           c( 0.0052, -0.0033, -0.0080,  0.0215, -0.0100, -0.0055),
           c(-0.0090, -0.0063,  0.0000, -0.0100,  0.0289, -0.0035),
           c(-0.0005, -0.0049, -0.0028, -0.0055, -0.0035,  0.0171))
a

# Eigen-decomposition of the symmetric matrix a, with eigenvalues
# sorted in decreasing order
Eigenvalues <- eigen(a, symmetric = TRUE)
q <- sort(Eigenvalues$values, decreasing = TRUE)
q

# Proportion of variance accounted for by each eigenvalue, and scree plot
prop <- q / sum(q)
prop
plot(cumsum(prop), type = "l",
     xlab = "No. of eigenvalues", ylab = "Variance accounted")

APPENDIX II
Univariate normality test (Shapiro-Wilk p-values) for the varying angular constraints

Angle (degrees)    4        8        12       16       20       24       28       32
p-value            0.6684   0.8023   0.628    0.1372   0.2469   0.6082   0.4598   0.2476
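The Friedman rank sum test used in this study compares each subject's recognition distances across the angular constraints by ranking the distances within subjects and testing whether the mean rank differs by angle. The following is a minimal sketch of the statistic on hypothetical toy data (not the study's measurements; the helper name friedman_statistic and the toy values are illustrative, and ties are not handled):

```python
def friedman_statistic(blocks):
    """blocks: one list per subject, each holding one distance per angle."""
    n = len(blocks)          # number of subjects (blocks)
    k = len(blocks[0])       # number of angular constraints (treatments)
    col_rank_sums = [0.0] * k
    for row in blocks:
        # rank the k distances within this subject (1 = smallest); no ties assumed
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            col_rank_sums[j] += rank
    # Friedman chi-squared statistic: 12/(n k (k+1)) * sum(R_j^2) - 3 n (k+1)
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in col_rank_sums) \
        - 3 * n * (k + 1)

# Toy example: distances grow with angle for every subject, so all
# subjects produce the same ranking and the statistic is maximal
toy = [[1, 2, 3, 4], [1.1, 2.2, 3.3, 4.4], [0.9, 1.8, 2.7, 3.6]]
print(friedman_statistic(toy))
```

With three subjects whose distances increase monotonically with angle, every subject yields the same ranking and the statistic attains its maximum n(k - 1) = 9; in practice the statistic is referred to a chi-squared distribution with k - 1 degrees of freedom, as in the test reported in the abstract.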