University of Ghana http://ugspace.ug.edu.gh

UNIVERSITY OF GHANA

RECOGNITION OF FACE IMAGES UNDER VARYING EXPRESSIONS USING DISCRETE WAVELET TRANSFORM AND PRINCIPAL COMPONENT ANALYSIS WITH SINGULAR VALUE DECOMPOSITION

BY

CHRISTOPHILIA ADJEI DANSO (10308704)

THIS THESIS IS SUBMITTED TO THE UNIVERSITY OF GHANA, LEGON, IN PARTIAL FULFILMENT OF THE REQUIREMENT FOR THE AWARD OF MPHIL STATISTICS DEGREE

JULY 2018

DECLARATION

Candidate's Declaration
This is to certify that this thesis is the result of my own research work and that no part of it has been presented for another degree in this University or elsewhere.

SIGNATURE: …………………… DATE: ……………………….
CHRISTOPHILIA ADJEI DANSO (10308704)

Supervisors' Declaration
We hereby certify that this thesis was prepared from the candidate's own work and supervised in accordance with the guidelines on supervision of theses laid down by the University of Ghana.

SIGNATURE: …………………………. DATE: ……………………….
DR. LOUIS ASIEDU (Principal Supervisor)

SIGNATURE: …………………………. DATE: ……………………….
DR. RICHARD MINKAH (Co-Supervisor)

ABSTRACT

The face is the most widely used part of the human body for recognition. Automated face recognition is all about transferring an inherent human trait to machines by supplying the machine with enough information. Face recognition has gained much attention over the past few decades, and it is used in numerous application areas, including but not limited to information privacy, law enforcement and access management. This study presents an assessment of the Discrete Wavelet Transform and Principal Component Analysis with Singular Value Decomposition under varying facial expressions. One hundred and eighty-two (182) frontal images from the Cohn-Kanade AU-Coded database were used in the study. These images were obtained from twenty-six persons and were considered for recognition runs.
A repeated measures design was used to determine whether there existed significant differences in the average recognition distances of the varying facial expressions from their neutral pose. Recognition rate and run time were employed as the numerical evaluation measures to assess the performance of the study algorithm. The numerical and statistical computations were done using Matlab. The results of the repeated measures analysis revealed that there existed significant differences in the average recognition distances from the neutral pose. The numerical evaluations also showed that DWT-PCA/SVD face recognition has a remarkable average recognition rate (98.7%) under varying expressions and an average run time of 10.5 seconds, which was influenced by the DWT used at the preprocessing stage. Therefore, DWT is recommended as a viable noise removal tool that should be implemented during the image pre-processing phase of face recognition systems.

DEDICATION

This thesis is dedicated to the Almighty God, my mother, Vida Arku, my dad, Samuel Adjei Danso, and my uncle, Kweku Appiah.

ACKNOWLEDGEMENT

My sincerest gratitude is to the Almighty God for His grace, mercy, favour and protection upon my life throughout my schooling. With Him I came this far. I also take this opportunity to express my greatest appreciation to my supervisors, Dr. Louis Asiedu and Dr. Richard Minkah, for their exemplary supervision, monitoring and continuous inspiration during the course of this thesis, from the initial stages (choosing a topic) to the final day of submission. I owe an immense gratitude to my supervisors for their sound advice, encouragement and careful guidance. I am indebted to Enoch Sakyi-Yeboah, John Essel and Felix Dzokoto for their affectionate support, valuable information and guidance, which helped me in completing this task through the various stages of my thesis.
Finally, I thank my parents, uncles, siblings and colleagues for their constant encouragement, without which this work would not have been possible. Thank you all for your immeasurable support. May the Almighty God richly bless you.

TABLE OF CONTENTS

DECLARATION .......................................................................... i
ABSTRACT ................................................................................ ii
DEDICATION ............................................................................. iii
ACKNOWLEDGEMENT ............................................................. iv
TABLE OF CONTENTS .............................................................. v
LIST OF FIGURES ..................................................................... xi
LIST OF PLATES ....................................................................... xii
LIST OF TABLES ....................................................................... xiii
ABBREVIATIONS ...................................................................... xiv
CHAPTER ONE .......................................................................... 1
INTRODUCTION ........................................................................ 1
1.1 Background of the Study .......................................................
1
1.2 Problem Statement ............................................................... 5
1.3 Objectives ............................................................................ 6
1.4 Significance of the Study ...................................................... 7
1.5 Scope ................................................................................... 8
1.6 Limitation of the Study ......................................................... 8
1.7 Organisation of the Study ..................................................... 8
CHAPTER TWO ......................................................................... 9
LITERATURE REVIEW .............................................................. 9
2.1 Introduction .......................................................................... 9
2.2 Reviews of Early Works on Face Expressions and Recognition ... 9
2.3 Face Recognition Methods .................................................... 14
2.3.1 Knowledge Based Method ................................................. 14
2.3.2 Feature Invariant Method .................................................. 16
2.3.3 Template Matching Method ..............................................
16
2.3.4 Appearance Based Method ................................................ 17
2.4 Face Recognition Techniques ............................................... 17
2.4.1 Artificial Neural Network (ANN) ....................................... 18
2.4.2 Template Matching ........................................................... 18
2.4.3 Geometrical Feature Matching .......................................... 19
2.4.4 Fisherface ........................................................................ 19
2.5 Discrete Wavelet Transform ................................................. 20
2.6 Face Recognition by Principal Component Analysis .............. 21
2.6.1 Theoretical Approach ....................................................... 21
2.6.2 Empirical Approach ......................................................... 23
2.7 Other Feature Extraction Methods ........................................ 26
2.7.1 Linear Discriminant Analysis ............................................ 26
2.7.2 Independent Component Analysis ..................................... 28
2.8 Conclusion .......................................................................... 30
CHAPTER THREE ..................................................................... 31
METHODOLOGY ...................................................................... 31
3.1 Introduction ......................................................................... 31
3.2 Data Acquisition .................................................................. 31
3.3 Recognition Procedures ....................................................... 32
3.3.1 Pre-Processing .................................................................
33
3.3.1.1 Wavelet Transform ........................................................ 34
3.3.1.2 Discrete Wavelet Transform .......................................... 36
3.3.1.3 Haar Wavelet Transform ............................................... 38
3.3.1.4 Haar Wavelet ................................................................ 41
3.3.2 Extraction of Features ...................................................... 42
3.3.2.1 Singular Value Decomposition (SVD) ............................ 43
3.3.2.1.1 Theorem .................................................................... 44
3.3.2.1.2 Properties .................................................................. 45
3.3.2.2 Principal Component Analysis (PCA) ............................ 45
3.3.3 Recognition ..................................................................... 49
3.3.3.1 Euclidean (Recognition) Distance .................................. 50
3.3.3.2 Recognition Rate .......................................................... 51
3.4 Statistical Testing Approach ................................................ 52
3.4.1 Assessment of Multivariate Normality ............................... 53
3.4.1.1 Generalised Square Distance Approach ......................... 53
3.4.2 Repeated Measures Design ............................................... 54
3.4.2.1 Assumptions ................................................................. 55
3.4.2.2 Test for Equality of Treatments ..................................... 56
3.4.2.3 Hypothesis ................................................................... 56
3.4.2.4 Confidence Region ....................................................... 57
3.4.3 Friedman Rank Sum Test ................................................. 58
3.4.3.1 Assumptions ................................................................. 58
3.4.3.2 Model ........................................................................... 58
3.4.3.3 Testing Procedure ......................................................... 59
3.4.4 Pairwise Comparison ....................................................... 60
CHAPTER FOUR ....................................................................... 61
DATA ANALYSIS ...................................................................... 61
4.1 Introduction ........................................................................ 61
4.1.1 Data Description ..............................................................
61
4.2 Result ................................................................................. 62
4.2.1 Preprocessing ................................................................... 63
4.2.2 Recognition Stage ............................................................ 64
4.2.2.1 Euclidean (Recognition) Distance .................................. 64
4.2.2.2 Recognition Rate .......................................................... 66
4.3 Numerical Assessment ........................................................ 67
4.3.1 Assessing for Normality ................................................... 67
4.3.1.1 Test for Multivariate Normality ..................................... 68
4.3.1.2 The Chi-square Plot ...................................................... 68
4.3.2 Test of Sphericity ............................................................. 69
4.3.3 Repeated Measures Design ............................................... 69
4.4 Paired Comparison .............................................................. 70
CHAPTER FIVE ........................................................................ 72
SUMMARY, CONCLUSION AND RECOMMENDATION ........... 72
5.1 Introduction ........................................................................ 72
5.2 Summary ............................................................................ 72
5.3 Conclusion and Recommendation ........................................ 73
REFERENCES ........................................................................... 74

LIST OF FIGURES

Figure 3.1: Research Design ...................................................... 32
Figure 3.2: A DWT decomposition of an image using a 2-level pyramid ... 37
Figure 3.3: A summary of the face recognition procedure ........... 51
Figure 4.1: A graph of observed data quantiles against normal quantiles ... 68

LIST OF PLATES

Plate 3.1: Examples of images used from the Cohn-Kanade database ... 32
Plate 4.1: Examples of images in the train database ................... 62
Plate 4.2: Examples of images in the test image database ........... 62
Plate 4.3: Preprocessed image from DWT ..................................
64
Plate 4.4: Samples of correctly recognised individuals under varying expressions ... 67

LIST OF TABLES

Table 4.1: The Euclidean distances of the images from their neutral pose ... 65
Table 4.2: Recognition rate of varying facial expressions ........... 66
Table 4.3: Mauchly's W test of sphericity .................................. 69
Table 4.4: Repeated measures test ............................................. 70
Table 4.5: Paired comparison of various constraints ................... 71

ABBREVIATIONS

ANN .............. Artificial Neural Network
BFLD ............. Block-based Fisher's Linear Discriminant
DFT ............... Discrete Fourier Transform
DWT .............. Discrete Wavelet Transform
FFT ................ Fast Fourier Transform
GF ................. Gabor Feature
ICA ................ Independent Component Analysis
ID .................. Identification
IDWT ............. Inverse Discrete Wavelet Transform
KPCA ............. Kernel Principal Component Analysis
LDA ............... Linear Discriminant Analysis
LFA ............... Local Feature Analysis
LMBP ............. Levenberg-Marquardt Backpropagation
MVN .............. Multivariate Normality
PCA ............... Principal Component Analysis
SURF ............. Speeded Up Robust Features
SVD ............... Singular Value Decomposition

CHAPTER ONE

INTRODUCTION

1.1 Background of the Study

Human organs, with or without an attribute, can be used as variables for authentication or verification purposes. Technology has eased these processes by providing many ways to validate the identity of a person using one or more human organs. This area of technology is what we know as "biometrics".
Biometric techniques refer to means of validating the true identity of an individual by their behavioural patterns or physical characteristics. These techniques include the use of fingerprint, keystroke, iris, palm print, DNA, voice, face and handwriting styles, among others. Using such biological traits in biometric systems makes a person's identity difficult to alter or forge. Biometric methods can be classified into physiological methods (fingerprint, face, DNA) and behavioural methods (handwriting, voice, keystroke). The physiological methods are known to be more reliable than the latter because physiological traits are often unalterable except by acute injuries. The behavioural characteristics, on the other hand, may vary due to anxiety, fatigue or illness, but also have the advantage of being less intrusive. Traditionally, the human face is still the most widely used means of identifying a person. The face is a complex and extremely dynamic structure that changes rapidly over time. The face has a lot of distinct landmarks known as nodal points. The landmarks are made from different shapes and sizes of bone structure, muscle mass and facial tissues. Humans have an inherent ability to identify a known individual by the face, with recent studies showing that facial recognition is a dedicated process in the human brain (Bentin, Allison, Puce, Perez and McCarthy, 1996). Experiments have also shown that babies as young as three days old are able to differentiate between known faces (Chiara, Viola, Macchi, Cassia and Leo, 2006). It is clear that humans are skilled in distinguishing between known faces. However, humans are not so skilled when dealing with a sizable number of unknown faces. Building an efficient and robust automatic face recognition system, with its computational speed, can be very rewarding in overcoming human limitations in this regard.
In the field of image analysis, face recognition remains a very important aspect. Face recognition has become a fast-growing research area, with researchers finding it as interesting as it is challenging. The benefits of an effective and efficient automatic face recognition system spill over into many industries. For security purposes, face recognition can be used in law enforcement and in closed-circuit television (CCTV) systems to identify persons of interest, and also in access request systems to authenticate individuals. It can also be used in the entertainment industry, virtual reality applications and voting systems. The overlapping interests of diverse industries have pushed for more research into face recognition techniques. In today's networked world, the need to maintain a high level of security cannot be overemphasized. Crime has become increasingly sophisticated, and criminals are becoming increasingly difficult to apprehend, especially when the perpetrators leave little or no trace of their identity. Governments and organizations are thus investing heavily in reliable methods or systems that explicitly identify persons of interest without running into issues of rights to privacy. Automatic face recognition is being employed to address some of these issues. Automatic face recognition is the ability to identify individuals solely by the facial characteristics that are unique to a person. It is one aspect of the broader field of pattern recognition research and technology, which involves techniques that employ statistical methods to identify and set aside patterns of data in order to compare, contrast or match them with already identified patterns that may be stored in some form of database.
Building an automatic face recognition system is all about extracting meaningful features such as lines, angles, edges or movements from an image, putting them in a useful representation and applying some kind of classification to them (Wagner, 2012). Compared with other biometric technologies, face recognition has proven to be robust, non-intrusive and less demanding of user interaction. Unlike other biometrics such as fingerprints, it is considered flexible, since it is easy to acquire an image of someone's face without consent and also from a distance. Face recognition is therefore often viewed as a 'contactless' form of surveillance identification and verification. For a face to be detected by an automatic system, facial appearance is interpreted in various distinct ways. Various facial recognition systems are currently under development, with the choice of system dependent on the particular use for which it is required. However, whether to represent the appearance of the face or its geometry remains the major difference among the various face recognition systems being developed. The two approaches have been compared, but most current systems prefer to use a combination of both a representation of the facial appearance and the geometry. Measuring the geometry of a face to a high degree of accuracy is difficult, especially from just a single image, but geometry provides more defence against aging and forgery through disguise. Information on appearance is easily obtained from an image of the face but is more prone to variations, especially from changes in expression and pose. Practically, in most cases, even systems that use the appearance-based approach often use some form of geometry in order to arrive at a 'shape-free' depiction that is devoid of pose and expression.
In order to keep track of facial characteristics, Kakumanu and Bourbakis (2006) made use of a local graph, while employing a global graph to store information on the texture of the face. Chang et al. (2006) introduced the idea of toning or harmonizing the area covered by the nose. Likewise, Gundimada and Asari (2009) selected local facial characteristics through modular kernel eigenspaces for multi-dimensional spaces. Dimensionality remains a prime aspect of facial recognition regardless of the approach used in a system, with appropriate techniques required to reduce the dimension of the studied space. Face recognition systems work in two diverse ways, using local and global features. A local face recognition system requires the detection and localization of facial characteristics such as the eyes, nose and mouth; that is, it considers the fiducial landmarks of the face to associate the face with an individual. The local-feature method computes a descriptor from the facial parts and places the information into one descriptor. Examples of local-feature techniques are Local Feature Analysis (LFA), Elastic Bunch Graph Matching (EBGM), Local Binary Pattern Features (Agrawal et al., 2014) and Gabor Features. On the other hand, in a global face recognition system, the entire facial image is encoded as a point in a high-dimensional space in order to identify the individual. This whole-face principle, an application of the effectiveness of subspace-based face recognition, requires the use of subspace learning methods, which give the global features of the observation. In this method, every pixel is taken into consideration for the processing; thus, if one pixel is varied, then the entire subspace is varied.
This can be achieved with the help of Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Independent Component Analysis (ICA), Non-negative Matrix Factorization (NMF) or Random Projection (RP). All of these are dimensionality-reduction procedures that seek to reduce the rather high-dimensional facial image data to a smaller dimension for matching. A face recognition system can work in two modes: verification or identification. In an identification system, an individual is recognised by comparing the individual's input image with the images of all users already stored in some form of database. In a verification system, on the other hand, an identity is claimed by the user, and this identity is either accepted or rejected by comparing the input image of the individual with the stored image of the claimed identity. As facial expressions vary, identifying faces under changing facial expressions is a challenging task. Thus, an effective facial recognition system will also be one that is able to understand and interpret human affective states and respond accordingly to these states in its identification and verification of facial images.

1.2 Problem Statement

Facial recognition has become very important and interesting due to its numerous advantages in different fields of study. It is used in many real-time applications, and various research areas have devoted time to the development of facial recognition systems. Notwithstanding this, there have been challenges in developing these systems. Liao et al. (2013) stated that, despite the fact that automatic face recognition algorithms have improved recognition rates significantly, there exist many problems associated with the performance and efficiency of these algorithms under environmental constraints, including but not limited to facial expression, pose variation and illumination.
The face is not a rigid object; it varies easily over time, and this contributes to the limitations of face recognition systems. Also, when face muscles contract, deformed facial features are produced (Fasel and Luettin, 2003). It has also been stated by Biswas and Ghose (2014) that facial recognition algorithms are affected by the major expressions that occur in the face of a person. Expressions can be explained as the way an individual shows his or her sentiments, and this generates some kind of vagueness in face recognition systems. They also stated that one of the drawbacks of existing works is the unreliable invariants that occur in a person's face when there are major variations in expression. According to Chin and Kim (2009) and Ekman and Friesen (2003), facial expressions behave like signals that vary with the contraction of facial features such as the lips, cheeks, eyes and eyebrows, which also affects the accuracy of the recognition algorithms. Some recently developed recognition algorithms are video-sequence based to handle these varieties of facial expressions (Gorodnichy, 2005). This is because a video sequence includes a number of consecutive images, which may contain more information than a single image. However, the traditional way of using a single image is still relevant because, in several practical settings, the video sequence would need a continual recording of face images. This would also attract the need for extra storage space, which might not be obtainable. These limitations communicate the need to further improve facial recognition systems to deal with demanding situations of facial expressions. This study employs a modified algorithm (DWT-PCA/SVD) on the seven universally accepted principal facial expressions, which are the anger, disgust, happy, fear, neutral, sad and surprise poses.
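To make the DWT-PCA/SVD preprocessing step concrete, the sketch below applies one level of a 2-D Haar-style transform (an unnormalized averaging/differencing variant) to a toy image array, separating a smooth approximation band from the detail bands where much of the noise resides. This is an illustrative Python/NumPy sketch on synthetic data, not the study's Matlab implementation.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar-style DWT: returns the approximation (LL)
    and detail (LH, HL, HH) sub-bands, each half the size of the input."""
    img = img.astype(float)
    # Along rows: average and difference of adjacent pixel pairs.
    lo = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi = (img[:, 0::2] - img[:, 1::2]) / 2.0
    # Along columns: repeat the averaging/differencing on both outputs.
    LL = (lo[0::2, :] + lo[1::2, :]) / 2.0   # approximation band
    LH = (lo[0::2, :] - lo[1::2, :]) / 2.0   # horizontal detail
    HL = (hi[0::2, :] + hi[1::2, :]) / 2.0   # vertical detail
    HH = (hi[0::2, :] - hi[1::2, :]) / 2.0   # diagonal detail
    return LL, LH, HL, HH

# A constant 4x4 "image" places all of its energy in the approximation
# band; the detail bands, which would carry noise, are exactly zero.
LL, LH, HL, HH = haar_dwt2(np.full((4, 4), 8))
```

In a recognition pipeline of this kind, only the approximation band would typically be passed on to the feature extraction stage, which both denoises and quarters the data size per level.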
1.3 Objectives

The major objective of this study is to adopt and measure the performance of a DWT-PCA/SVD algorithm in a facial recognition system under the seven varying facial expressions. The specific objectives of the study are as follows:

• To compute the recognition rate and run time in order to assess the performance of the modified algorithm under varying facial expressions.
• To assess whether a significant difference exists in the average recognition distances of the various facial expressions.

1.4 Significance of the Study

An efficient and resilient face recognition algorithm is significant in addressing many real-life problems in various application areas such as biometrics, security, law enforcement and commerce. Here are some applications of face recognition systems:

• Information protection: Data privacy (income, medical records), access security, user authentication, permission-based systems.
• Access management: Access log-in or audit trails, secure access authentication (restricted facilities).
• Biometrics: Personal identification (passports, national IDs), automatic identity verification.
• Law enforcement: Suspect identification, video surveillance, simulated aging, suspect tracking (investigation), forensic face reconstruction.
• Personal security: Home video surveillance systems.
• Entertainment: Video game systems, photo camera applications.

Again, this study serves as a catalyst to breed interest and further research into the other aspects of biometrics in Ghana. This stems from the fact that face recognition is a multifaceted phenomenon, and no single piece of research is capable of addressing it in full.

1.5 Scope

The study relies on the verification of the neutral face images in the adopted database under varying facial expressions.

1.6 Limitation of the Study

This study is limited to assessment under varying facial expressions.
Face images captured under any constraints other than the aforementioned were not considered in the study.

1.7 Organisation of the Study

This thesis consists of five chapters. The first chapter is an introduction to the study, covering the background of the study, problem statement, objectives, significance (some applications of facial recognition systems), scope and limitation of the study. Chapter Two covers an empirical and theoretical review of articles, journals and related issues on face recognition. This includes automatic face recognition techniques and approaches involving the PCA/SVD and wavelet transform. Chapter Three gives the details of the research methodology employed in the study. The methodology discusses the statistical underpinnings of the Discrete Wavelet Transform and Principal Component Analysis with Singular Value Decomposition (DWT-PCA/SVD) in face recognition. The chapter also discusses the pre-processing techniques, recognition procedure and feature extraction methods in the face recognition algorithm. Chapter Four presents the results and discussions on the facial recognition system. Chapter Five gives the findings of the study, conclusion and recommendation.

CHAPTER TWO

LITERATURE REVIEW

2.1 Introduction

This chapter provides a review of existing research on face recognition. It involves a theoretical and empirical review of articles, journals and related work on facial recognition using the Discrete Wavelet Transform and Principal Component Analysis with Singular Value Decomposition (DWT-PCA/SVD). This includes automatic face recognition techniques and approaches involving the PCA/SVD, the wavelet transform and other statistical approaches. The methodology and processes involved in the study are also explained.
2.2 Review of Early Works on Face Expressions and Recognition

Face recognition has been one of the fastest-growing research fields and has improved significantly over recent decades. Nevertheless, there are still some limitations of facial recognition that affect the performance of the various algorithms proposed. These are environmental variations such as illumination, pose variation, occlusion and facial expressions, the last of which is the main focus of this study. The history of face recognition, facial analysis and interpretation dates back to the nineteenth century. The advancement and importance of face recognition have led to its widespread use in recent times in the fields of politics, security and commercial systems. Many fields of study have researched this area, including but not limited to psychology, engineering and mathematics.

Charles Darwin was interested in the concept of man as an animated machine, and thus investigated how and why emotions express themselves in the face. He conducted a study, "The Expression of the Emotions in Man and Animals". Darwin (1872) first proposed that emotions and expressions were universal and biologically inherent; that is, they were obtained from associated habits, which were referred to as basic emotions. He also held the assumption that the visage engaging the basic emotions is universal across ethnic groups and customs. The principal view in psychology, however, was that facial expressions were culture-specific; that is, every culture or custom has its own language of facial expression, just as it has its own verbal language. Ekman and Friesen (1971) were commended for their immense contributions to the formulation of the six principal emotions, namely anger, happiness, fear, sadness, disgust and surprise. These principal facial expressions are considered distinct in their various features.
Tomkins (1962, 1963) revived Darwin's claims by suggesting that these emotions are the basis of human motivation and that emotion has its place in the face. Again, Tomkins and McCarter (1964) demonstrated that face expressions were genuinely related to specific emotion states. Subsequently, Friesen (1972) recorded that the same face expressions were produced voluntarily by people of various cultures and ethnicities in response to emotion-eliciting films. Early work on face recognition, and for that matter recognition under varying expressions, dwelt on some of these theories. One of the earliest researchers in the 1960s was Bledsoe, whom many refer to as the father of face recognition. Together with other researchers, he started Panoramic Research in California. Most of the work done by the institution in 1964 and 1965 was in the area of defence in the United States of America. In 1965, Bledsoe, Chan and Bisson conducted a study on recognizing faces with the help of computers. Much of their work was not published due to its involvement with an unnamed intelligence agency. Later, Bledsoe formulated and executed a semi-automated system where face features were chosen by human operators and computers were made to use the information gained for recognition. This system could arrange face images using the RAND tablet, which was developed at the Research And Development (RAND) Corporation. This device could be used to load vertical and horizontal coordinates on a grid by using a stylus that discharges electromagnetic pulsations. The system recorded the location coordinates of various face features manually. These features included the nose, eyes, hairline and mouth. He further stated that most of the problems of facial recognition systems are still being battled with to date. These limitations included facial expressions, variation in illumination, head pose and aging.
However, numerous studies are being conducted and algorithms proposed to curtail these problems. Goldstein, Harmon and Lesk, in the 1970s, increased the accuracy of the manual face recognition systems. They included the use of twenty-one distinct subjective face features, such as the thickness of the lips and hair colour, so that they could automatically identify faces. As with Bledsoe's system, the computations of the biometrics were still done manually. Fischler and Elschanger (1973) automatically recorded the same face features. Their system employed local template matching and a global measure of fit to detect and measure the face features. In the 1970s, others defined faces as geometric frameworks and carried out pattern recognition using the faces. Kanade (1973) developed a fully automated face recognition system. His algorithm automatically extracted sixteen facial parameters, and he compared this automated extraction to manual (human) extraction. He obtained a good recognition rate of about 45-75%. However, the study showed that accurate results were achieved when insignificant features were not included. Suwa et al. (1978) first attempted automatic facial expression recognition using image sequences. Different approaches followed actively, with some researchers trying to better the procedures used in recording the instinctive features in the 1980s. New approaches were also introduced in this decade, where facial recognition algorithms were built on artificial neural networks (Stonham, 1986). Mark Nixon (1985) proposed a geometric measurement for eye spacing. Strategies such as "deformable templates" were employed to improve the template matching approach. Sirovich and Kirby, in 1986, first mentioned eigenfaces, which were easily calculated using an algebraic manipulation they introduced in image processing.
This technique focused on using principal component analysis (PCA), where an image can be represented in a reduced dimension without losing needed information, and then reconstructing the image (Kirby & Sirovich, 1990). In the following years, this technique became a dominant approach in the development of many new algorithms. The eigenface approach saw broad recognition in the 1990s, serving as the basis for the first industrial applications. Turk and Pentland (1991) widened the scope of using eigenfaces to recognise face images. They proposed that, using the eigenfaces, the residual error could be applied to find face images in a cluster, and they also verified the specific scale and location of the faces in an image. Despite the limitations of the method, that is, its environmental and technological constraints, they provided a key breakthrough in the efficiency of automatic face recognition. They found that by merging the processes of localizing and detecting, one can easily obtain efficient recognition of images in a constrained setting. Since then, there has been an increasing number of publications each year on the improvement of face recognition. The rapid growth is due to the existence of real-time hardware and the enormous need for surveillance systems. The aptness of a face recognition system to suitably detect a face image depends on a variety of variables, including illumination, head pose, lighting (Gross & Brajovic, 2003), expressions of the facial features (Georghiades, Belhumeur, & Kriegman, 2001), and quality of the image (Shan, Gong, & McOwan, 2009). Bartlett (1999) performed a study exploring ways to automatically recognize facial expressions in a sequence of face images. He compared various techniques, which included analysis of facial expression in a sequence of facial motion.
This was done by estimating optical flow and by holistic spatial analysis, which includes independent component analysis, linear discriminant analysis and local feature analysis. He also used the Gabor wavelet representation and local principal component analysis in the recognition process. Donato et al. (1999) worked with manually aligned facial images, free from head pose variation, in image sequences. They used and compared different techniques in recognizing eight single action units and four action unit combinations. The techniques were optical flow, local feature analysis and Gabor wavelets. Fasel and Luettin (2003) were able to recognise the units of face expressions with the use of the Canny edge detector, an optimal feature extraction algorithm, applied to localised face images. Zhang, Ding, Li and Liu (2015) also presented a study which involved the use of the Fast Fourier Transform (FFT) with an improved PCA based on facial expressions. The study concluded that the algorithm was efficient under varying expressions. Asiedu et al. (2015) performed a statistical evaluation of the PCA/SVD and the Whitened PCA/SVD algorithms under varying expressions. It was concluded that the PCA/SVD algorithm is comparatively more efficient than the Whitened PCA/SVD algorithm. Face recognition undergoes three stages after acquiring an image: the pre-processing, feature extraction and recognition stages. Several methods have been proposed for facial feature extraction, and they can be broadly classified into holistic (appearance-based), model-based and hybrid methods. The holistic approach uses either the entire face as components (features) or Gabor-wavelet-filtered whole faces (Beham & Roomi, 2012). This method has proven to be very effective in testing on huge databases.
The model-based approach uses geometrical features in face recognition, which implies the use of facial components such as the nose, eyes and hair. This process uses input data for recognition of a face image by matching input images with available database images. Brunelli and Poggio (1993) performed a face recognition task by adopting matching templates of facial features (eye, nose and mouth) independently. The facial feature classification of the model-based method was limited, since the procedure fails to consider the factors associated with variations arising from visual stimuli. The hybrid method is a combination of the holistic-based approach and the model-based approach (Reddy et al., 2011). Kakade's (2016) study showed that using the hybrid method improved the face recognition algorithm. His study assessed face recognition algorithms using some statistical approaches (ANN, LDA, PCA and SVM), which yielded better recognition performance. A very important aspect in building an efficient face recognition system is dimensionality. Appropriate techniques are needed for dimension reduction of the face space, because working in higher face dimensions causes overfitting and can elongate the computational (run) time.

2.3 Face Recognition Methods

There are four categories of face recognition methods (Choi et al., 2012; Yuille et al., 1989; Dubey & Tomar, 2016):

• Knowledge based methods
• Feature invariant methods
• Template matching methods
• Appearance-based methods

2.3.1 Knowledge Based Method

These are known as rule-based methods. They mainly depend on a set of rules in the detection process (Kanade, 1997). This method takes our knowledge of face images into consideration and translates it into a set of regulations or rules. An example of a rule is a face having two eyes, a nose and a mouth, or a face having the eye area darker than the cheeks. These features are within specific distances in relation to each other.
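A rule of the kind described above can be sketched in Python. This is purely illustrative: the band positions and the synthetic 8x8 "face" are invented for the example and are not taken from any cited system.

```python
import numpy as np

def passes_face_rule(gray, eye_rows=(2, 4), cheek_rows=(5, 7)):
    """Toy knowledge-based rule: the eye band of a (hypothetical)
    grayscale face image should be darker than the cheek band."""
    eye_band = gray[eye_rows[0]:eye_rows[1], :].mean()
    cheek_band = gray[cheek_rows[0]:cheek_rows[1], :].mean()
    return eye_band < cheek_band

# A synthetic 8x8 "face" whose eye band (rows 2-3) is darker than the cheeks.
face = np.full((8, 8), 180.0)
face[2:4, :] = 60.0
print(passes_face_rule(face))  # True
```

A real detector would combine many such rules at several resolutions; this sketch only shows why a single over-general rule is fragile (a uniform non-face patch already fails it, but many non-faces would not).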
There is a limitation to this method, which is the difficulty of building an appropriate set of rules: very general rules might result in false positives, and a too-detailed set of rules might result in false negatives. This could be resolved by using hierarchical knowledge-based methods, which are efficient with simple inputs. This method is generally limited since it cannot be used to locate many faces in a complex image. Wang and Tan (2000) presented a system for images with uncomplicated backgrounds. The system's approach was to find the knowledge surrounding the geometry of face images and the design of the various features. They set rules that defined how to identify the face from a complicated background. These regulations provide an estimate of the geometry, which is later used for recognition. This method does not work effectively under varying positions or orientations, and there is also the need for a method which is able to define the human facial framework in clearly-defined and meaningful regulations.

2.3.2 Feature Invariant Method

Feature invariant approaches locate faces by extracting structural features of the face (Leung et al., 1998; Kjeldsen & Kender, 1996; Yow & Cipolla, 1996). This idea was developed to overcome the limitations of our instinctive knowledge of face images. One of the earliest algorithms was developed by Han, Liao, Yu and Chen in 1997. Normally, a statistical classifier, model or edge detector is trained and then utilized to distinguish between non-face and face sections. The method seeks to find distinct characteristics of a face image in spite of its position or angle. Its focus is to locate structural characteristics like the fiducial points, the skin texture and the colour of a face even when there are changes in head pose, lighting variations and viewpoint.
In effect, facial recognition uses different characteristics of the face such as the mouth, cheekbones, the contour of the eye socket, the nose, the zone near the cheekbones and the eyes. Research has shown that the colour of the skin is considered to be one of the significant characteristics for face recognition, since every person has a unique skin colour, and recognition is explicit when race is a criterion for detection (Ali, Mohd, Mohd & Yasaman, 2014). Feature invariant methods face a challenge when the characteristics of a face image are altered by noise, occlusion or illumination. They are also demanding where feature extraction is needed.

2.3.3 Template Matching Method

Template matching methods use parameterized face templates to locate and detect faces. Matching can be computed using the relationship between the various characteristics obtained from the input image and the predetermined estimates. The method compares test images with templates of stored images for detection. Each feature can be defined independently; for instance, the hairline can be distinguished with the use of filters or edge detectors. This technique has limitations unless face images are frontal, or when there are differences in scale, shape and pose. Nevertheless, Dubey and Tomar (2016) proposed the use of deformable templates to solve these limitations.

2.3.4 Appearance-Based Method

Appearance-based methods depend on a set of representative training images to discover facial models (Osuna et al., 1997; Rowley et al., 1998; Viola & Jones, 2001). They capture the representative variability of faces. In effect, appearance-based techniques have exhibited greater performance when compared to other techniques (Zhang & Zhang, 2010). Generally, these methods depend on skills from machine learning and statistical analysis to search for the significant features of facial images.
The method treats face detection as a two-class image classification problem, that is, the face or non-face class. Whatever is considered a face is contained in the face class, and all non-face characteristics are grouped into the non-face class. There are different recognition techniques that are based on the appearance-based methods. They include eigenfaces, LDA, SVM, PCA, etc. This method requires a large database and high-quality images for the detection process.

2.4 Face Recognition Techniques

According to Singh et al. (2016), face recognition techniques are classified as follows:

• Artificial Neural Network
• Template based matching
• Geometrical feature matching
• Fisher face

2.4.1 Artificial Neural Network (ANN)

The usage of neural networks is found attractive because of their non-linearity property, and they are also considered efficient (Kirby & Sirovich, 1990). Lawrence, Giles, Tsoi, and Back (1997) performed a study using a hybrid neural network, which comprised a self-organising map (SOM) and a convolutional neural network, on the ORL database consisting of 400 images from 40 individuals with different facial expressions. The SOM provided a reduction in dimension, while the convolutional neural network extracted successively larger characteristics in a ranked group of layers. The study concluded that the recognition rate was 96.2% and the classification time was less than 0.5 seconds, even though the time for training was 4 hours. One of the earliest artificial neural network techniques is a single-layer adaptive network called the Wilkie, Stonham and Aleksander's Recognition Device (WISARD). In general, artificial neural network approaches experience challenges as the number of individuals increases.
2.4.2 Template Matching

In template matching, a test face image, which constitutes a two-dimensional array, is compared with stored templates using acceptable measures such as the Euclidean distance, with one template representing the whole face image. The human face can also be represented by more than one template. Brunelli and Poggio (1993) chose a group of four feature templates, namely the mouth, eyes, nose and the whole face, on 188 images of 47 individuals. They compared the approach to a geometrical matching algorithm, and it came out superior with a recognition rate of 100%. One limitation of the template matching approach is that it is computationally complex.

2.4.3 Geometrical Feature Matching

This technique focuses on computing a set of geometrical features from an image. It can be described as a vector which represents the size and positions of the major fiducial points of the face image, like the mouth, eyes, nose and eyebrows. Kanade (1973) presented a study which attained a good recognition rate of 75% by using the geometric feature matching technique on a database of 20 individuals. Each individual had two images, one considered the train image and the other the test image. Also, Cox et al. (1996) achieved a recognition rate of 95% using a mixture-distance technique on a test database of six hundred and eighty-five persons. Brunelli and Poggio (1993) performed a test on a database of 47 individuals whose facial geometrical features, such as the position of the mouth, nose width and the shape of the chin, were automatically extracted. A Bayes classifier was used in recognition, which yielded a recognition rate of 90%.

2.4.4 Fisherface

Fisherface can be considered one of the most successful and widely used methods for facial recognition. It works on appearance-based methods. The Fisherface technique selects an image subspace in a way that the ratio of the between-class scatter to the within-class scatter is maximized.
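The Euclidean-distance comparison used in template matching can be sketched in Python. The tiny random "templates" below are stand-ins for stored face images, not data from the cited studies; a real system would compare full-resolution (or feature-level) templates in the same way.

```python
import numpy as np

def euclidean_distance(test_image, template):
    """Euclidean distance between a test face image and a stored template."""
    return np.linalg.norm(test_image.astype(float) - template.astype(float))

def match_face(test_image, templates):
    """Return the index of the stored template closest to the test image."""
    distances = [euclidean_distance(test_image, t) for t in templates]
    return int(np.argmin(distances))

# Toy example: three stored 4x4 "templates" and a noisy copy of the second one.
rng = np.random.default_rng(0)
templates = [rng.integers(0, 256, (4, 4)) for _ in range(3)]
test = templates[1] + rng.normal(0, 1, (4, 4))
print(match_face(test, templates))  # 1: the noisy image matches its own template
```

The computational complexity noted above comes from evaluating this distance against every stored template at full image size.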
Belhumeur et al. (1997) proposed the Fisherface technique by combining PCA and Fisher's linear discriminant analysis to determine a subspace projection matrix very similar to that of the eigenspace method. The technique was reported to solve one of the major limitations of the eigenface method on databases with varying lighting conditions.

2.5 Discrete Wavelet Transform

The Discrete Wavelet Transform (DWT) has been established as an efficient and versatile way of decomposing signals into sub-bands. DWT produces numerous wavelets whose positions and scales are distinct. The coefficients acquired for these wavelets show how closely a wavelet is correlated with a specific section of the signal. In obtaining these coefficients, a filtering process is carried out, which is reversed during the inverse transformation. Bute and Jasutkar (2012) stated that the feature which concentrates signal energy into distinct wavelet coefficients is needful in image compression. Imtiaz and Fattah (2011) also presented that the discrete wavelet transform gives ample information when analysing the original face image while also reducing the computational time. Abdulrahman et al. (2014) presented a study on facial recognition which sought to assess the performance of the eigenface approach and DWT with principal component analysis on the Yale face database. The study concluded that the DWT/PCA algorithm had a higher recognition rate, and was thus more efficient, than the eigenface algorithm. This was explained by the former converging faster than the latter. Again, the number of levels of DWT affects the efficiency of the algorithm. Wadkar and Wankhade (2012) conducted a study on face recognition using the discrete wavelet transform. Two wavelet transforms, the Haar and the bi-orthogonal 9/7 wavelets, were used on the ORL standard face database, which consisted of forty (40) individuals with each person having ten (10) different facial expressions.
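One level of the Haar decomposition mentioned above can be sketched directly in NumPy. This is a minimal stand-in for library routines such as those in PyWavelets, assuming even image dimensions; the approximation (LL) sub-band is what a DWT-based pipeline would typically pass on to feature extraction.

```python
import numpy as np

def haar_dwt2(image):
    """One level of a 2-D Haar wavelet transform.

    Returns the approximation (LL) and detail (LH, HL, HH) sub-bands,
    each half the size of the input (rows and columns must be even).
    """
    x = image.astype(float)
    # Filter rows: pairwise averages (low-pass) and differences (high-pass).
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
    # Filter columns of each intermediate result.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2)
    return ll, lh, hl, hh

img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)
print(ll.shape)  # (2, 2): a quarter-size approximation of the image
```

Because the Haar filters are orthonormal, the total energy of the four sub-bands equals that of the input, which is the energy-concentration property Bute and Jasutkar (2012) refer to: most of that energy ends up in the LL band.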
The study employed the Euclidean distance and L1 measures. It was shown that when the experiment is done with the Haar transform, the Euclidean distance measure gives a better recognition rate than the L1 distance measure. On the other hand, a better recognition rate for the L1 measure is obtained when the experiment is done with the bi-orthogonal 9/7 wavelets. The study further stated that the study algorithm would give a higher recognition rate when more multi-resolution transforms are used. Biswas and Ghose (2014) proposed an algorithm combining DWT and the scale invariant feature transform (SIFT) on the Essex-Grimace database. The database consisted of 20 different expression images of 18 individuals. It was concluded that the proposed algorithm obtained promising results on expression-invariant images. The study also stated that a control step could be added in the pre-processing stage to enhance recognition.

2.6 Face Recognition by Principal Component Analysis

2.6.1 Theoretical Approach

Principal Component Analysis (PCA) is an accepted and long-established data reduction and feature extraction method which is broadly used in image detection and recognition. It is derived from the Karhunen-Loeve transformation and is also known as the eigenspace method. PCA seeks to identify the eigenvectors of the covariance matrix, which exhibit the directions along which the principal components of the original data vary. Therefore, PCA helps to determine the directions of the differences and similarities in the data. PCA can be said to work with the entire class composition without giving heed to the underlying structure. The original size is decreased, and the reduced size is utilized for recognition. According to Zhen and Zilu (2012), images that are projected into a subspace by PCA have their first orthogonal dimension recording the highest variation among the images, whilst the last orthogonal dimension captures the least variation.
In using principal component analysis, images found in the train database are expressed as linear combinations of weighted eigenvectors known as eigenfaces, which can be computed using the covariance matrix of the train database. After the significant face features are chosen, the weights of the images can be determined. In order to recognise a test or query image, the image is projected onto the subspace and its corresponding weights are obtained. The weights of the query image are compared to the weights in the train database, and then the image can be recognised. The aim of this method is to reduce the size whilst maintaining the primary or significant features of the original image. There have been numerous studies performed to assess the performance of the different facial recognition algorithms. Studies show that the use of PCA as compared to other techniques makes the recognition process simple and efficient. Sharma and Dubey (2014) concluded that in singular variance situations, PCA does better than other algorithms. Principal component analysis has an edge over other recognition techniques because it is simple, fast and not affected by slight changes on the face. Smith (2002) found that PCA is efficient at compressing information with little loss. There are steps in applying Principal Component Analysis in image processing. These steps are listed below (Barnouti, 2016):

• An average mean is computed from a total of $M$ training images:
$$\text{Average Mean} = \frac{1}{M}\sum_{n=1}^{M} \text{Images}(n) \qquad (2.1)$$

• The average mean is subtracted from the original images:
$$S = \text{Images} - \text{Average Mean} \qquad (2.2)$$

• The covariance matrix is computed:
$$\text{Covariance} = \sum_{n=1}^{M} S(n)\,S^{T}(n) \qquad (2.3)$$

• Eigenvalues and eigenvectors of the covariance matrix are computed.
• Eigenvalues are sorted and the smallest are eliminated.
• The training samples are projected onto the eigenfaces.
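The steps above can be sketched as follows. A small synthetic training set stands in for real face images, and the dimensions ($M = 10$ images of $D = 64$ pixels) are invented for the example; in practice the eigen-decomposition is often done on the smaller $M \times M$ matrix $S S^T$ rather than the full $D \times D$ covariance.

```python
import numpy as np

rng = np.random.default_rng(1)
M, D = 10, 64                       # M training images, each flattened to D pixels
images = rng.random((M, D))

# Step 1: average mean over the M training images (Eq. 2.1).
average_mean = images.mean(axis=0)

# Step 2: subtract the mean from every image (Eq. 2.2).
S = images - average_mean

# Step 3: covariance matrix (Eq. 2.3, up to a scaling constant).
covariance = S.T @ S                # D x D

# Step 4: eigenvalues and eigenvectors of the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(covariance)

# Step 5: sort eigenvalues in descending order and keep the leading components.
order = np.argsort(eigvals)[::-1]
k = 5
eigenfaces = eigvecs[:, order[:k]]  # D x k basis of "eigenfaces"

# Step 6: project the training samples onto the eigenfaces.
weights = S @ eigenfaces            # one k-dimensional weight vector per image
print(weights.shape)
```

A query image would be recognised by subtracting `average_mean`, projecting onto `eigenfaces`, and finding the training weight vector at minimum Euclidean distance, exactly as described in the text.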
2.6.2 Empirical Approach

Many researchers from different fields have employed Principal Component Analysis as a reduction technique in face recognition. Some of the works of these scholars are listed below. Sirovich and Kirby (1990) first made use of principal component analysis to reconstruct and represent face images of humans. Turk and Pentland (1991) built upon and illustrated the popular eigenface technique in facial recognition systems. They designed an automatic facial detection system with the use of eigenfaces and determined the procedures needed to compute the eigenvectors in a manner that computers at the time could use to calculate the decomposition on huge face databases. Eigenface techniques have proven effective in identifying a smaller dimensional space of an image and competent in presenting significant components of the face features. However, Pentland, Starner, Etcoff, Masoiu, Oliyide, and Turk (1993) found that eigenfaces are greatly affected by non-homogeneous conditions. Since the eigenface technique was established, it has been built upon to entail preprocessing techniques that enhance the accuracy of recognition algorithms (Yambor, Draper & Beveridge, 2002). Mashal, Faust and Hendler (2005) conducted a survey using a multimedia database. The study used the principal component analysis method in selecting the images from the database. Nicholl and Amira (2008) conducted a study using the eigenface approach. They adopted the DWT/PCA algorithm and focused on automatically assessing the most discriminative coefficients in a set of training images with respect to their inter-class and intra-class standard deviations. Eigenface weighted vectors were selected based on the various eigenvalues. Although the method improved the performance of the face recognition algorithm, the study failed to address the various environmental constraints in the training face image datasets.
Kshirsagar, Baviskar and Gaikwad (2011) proposed a procedure based on information theory. This decomposed facial images into a few sets of significant feature images known as eigenfaces, which represent the principal components of the original training data set. The method also used PCA and a Back Propagation Neural Network (BPNN) for feature extraction and recognition, and was an effective approach to determining the lower dimensional space. The study revealed that the eigenface approach possesses properties that significantly extract face features, and that the BPNN lessens the input size of the face image. The study focused on dimension reduction of the face image, but its performance is limited to constrained environments. Ibrahim and Zin (2011) proposed an automated face recognition algorithm to evaluate the possible implementation of access control methods for office doors. The system made use of PCA and an ANN. The study indicated that the training set could be determined in two ways, either online or offline: the online mode uses facial detection and recognition of frontal images in the system, and the offline mode uses already captured and resized images in the database. However, the study reported the influence of pose and illumination variations. Paul and Al Sumam (2012) concentrated on using statistical methods to develop a face recognition algorithm. PCA was employed to reduce the dimension of the image data into a smaller intrinsic dimensional space. They achieved this by dividing the study dataset into training and test datasets. During recognition, a test image was projected onto the subspace spanned by the eigenfaces. Classifications were made by recording the Euclidean distance. The limitation of the study was that it failed to address the issue of real-time recognition rate.
Kadam (2014) proposed an algorithm combining PCA with the discrete cosine transform (DCT), where both methods sought to reduce the dimension of the data. PCA in the study was used to reduce the dimensionality, and the run time was reduced by the DCT. The combination of the techniques provided an appreciable recognition rate. Bellakhdhar et al. (2013) focused on using a Gabor wavelet, PCA and a Support Vector Machine (SVM) to develop a face recognition algorithm. Their method joined the Gabor wavelet to extract a face characteristics vector, the PCA algorithm for recognition and the SVM to classify faces. However, the method is limited compared to the performance of current face recognition systems in terms of reliability and real-time operation. Bakhshi et al. (2016) used a procedure which applied SIFT and speeded up robust features (SURF) to group face images. PCA was used to help obtain efficient matching results under varying angular pose and facial expressions. The study concluded that the proposed algorithm was efficient, with a higher recognition rate than others. Poon et al. (2016) conducted a study applying different illumination invariant techniques to examine and identify which one would work better with PCA for face recognition. A technique called Gradientfaces, which was used in the pre-processing stage, was noted to greatly improve the recognition rate. Barnouti (2016) performed a study on the ORL database to assess the recognition rate of the PCA algorithm. Preprocessing techniques were employed to increase the recognition rate. The study concluded that the recognition rate increases when image brightness is increased and also when the size is reduced by a factor of 0.3. Barnouti (2016) also presented a method which uses PCA-BPNN with DCT. This algorithm was applied on the Face94 and Grimace databases to assess its performance.
BPNN and PCA combined made the system effective at recognizing face images easily. The discrete cosine transform was employed to compress the face data sets. The system was considered efficient, with an accepted recognition rate of more than 90%. Asiedu et al. (2015) performed a statistical assessment of two recognition algorithms under varying facial expressions. The PCA/SVD and the Whitened PCA/SVD algorithms were employed in the study. It was evident from the study that the PCA/SVD algorithm is comparatively more efficient than the Whitened PCA/SVD algorithm. The PCA/SVD was adjudged superior in detecting face images under the environmental constraints. Asiedu et al. (2016) conducted an assessment of two algorithms, namely FFT-PCA/SVD and PCA/SVD, under varying facial expressions. The study findings indicated that FFT-PCA/SVD, in comparison to PCA/SVD, exhibited low variation and better performance in the recognition of facial images under varying expressions. They concluded that the FFT is a viable noise filtering technique that should be employed in the pre-processing stage of face recognition.

2.7 Other Feature Extraction Methods

There are other feature extraction methods that are reviewed in this study.

2.7.1 Linear Discrimination Analysis

Linear discrimination analysis is a powerful dimension reduction method used in recognition at the feature extraction stage. Yang, Yu and Kunz (2000) stated that the main goal of the LDA method is to look for a linear transformation such that feature clusters are most separable after the transformation, which can be achieved through scatter matrix analysis. LDA is a supervised learning method that uses more than one training image for each class (Barnouti, Al-Dabbagh, Matti & Naser, 2016). This technique can also be referred to as Fisher's Discrimination Analysis or Fisherface, and it is used in classification problems.
Linear Discrimination Analysis (LDA) focuses mainly on discrimination between classes, and this makes it more stable in classification. The Fisherface method was first introduced to maximise the ratio of between-class scatter to within-class scatter in order to resolve illumination problems. Fisherface is less reactive to lighting, pose and expression constraints. Bhattacharyya and Rahul (2013) showed that LDA optimizes the low dimensional representation of the objects. Some limitations of LDA are the singularity problem and the high computational cost in cases of huge data sets (Murtaza, Sharif, Raza & Shah, 2014). In Linear Discrimination Analysis, images are sorted and labelled into intra-class and inter-class. Intra-class captures the image variations of the same individual, while inter-class captures the variations among classes of persons. LDA seeks to model the discrepancies among classes, unlike Principal Component Analysis. There are some fundamental steps in the LDA algorithm, and they are as follows:

• Compute the intra-class scatter matrix, $S_w$:
$$S_w = \sum_{j=1}^{C}\sum_{i=1}^{N_j} \left(x_i^{j} - \mu_j\right)\left(x_i^{j} - \mu_j\right)^{T} \qquad (2.4)$$
where $x_i^{j}$ represents the $i$th sample of class $j$, $\mu_j$ represents the average of class $j$, $C$ represents the number of classes, and $N_j$ represents the number of samples in class $j$.

• Compute the inter-class scatter matrix:
$$S_b = \sum_{j=1}^{C} \left(\mu_j - \mu\right)\left(\mu_j - \mu\right)^{T} \qquad (2.5)$$
where $\mu$ is the average over the various classes.

• Compute the eigenvectors of the projection matrix.

Bhattacharyya and Rahul (2013) applied the LDA algorithm on the ORL dataset for face recognition. Forty images were used as the test images, and the test images and the trained images were compared. With the help of LDA, 37 of the 40 images were correctly identified, while the remaining 3 images were incorrectly recognized, demonstrating the efficiency of the LDA algorithm.
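The scatter matrices of Eqs. (2.4) and (2.5) can be sketched in NumPy. The feature vectors and class labels below are synthetic placeholders (12 samples of dimension 3 in 3 classes), not face data from any cited study.

```python
import numpy as np

def lda_scatter_matrices(X, labels):
    """Intra-class scatter S_w (Eq. 2.4) and inter-class scatter S_b (Eq. 2.5).

    X is an N x D matrix of feature vectors; labels is an N-vector of class ids.
    """
    overall_mean = X.mean(axis=0)
    D = X.shape[1]
    S_w = np.zeros((D, D))
    S_b = np.zeros((D, D))
    for c in np.unique(labels):
        Xc = X[labels == c]
        mu_c = Xc.mean(axis=0)
        diff = Xc - mu_c
        S_w += diff.T @ diff                 # within-class spread of class c
        d = (mu_c - overall_mean).reshape(-1, 1)
        S_b += d @ d.T                       # between-class spread, unweighted as in Eq. 2.5
    return S_w, S_b

rng = np.random.default_rng(2)
X = rng.random((12, 3))
labels = np.repeat([0, 1, 2], 4)
S_w, S_b = lda_scatter_matrices(X, labels)
# The projection directions are the leading eigenvectors of inv(S_w) @ S_b,
# which maximise the between-class to within-class scatter ratio.
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_w) @ S_b)
print(S_w.shape, S_b.shape)
```

The final eigen-problem makes the singularity limitation noted above concrete: when there are fewer samples than dimensions, `S_w` is singular and `inv(S_w)` fails, which is why PCA is often applied first.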
Murtaza et al. (2014) concluded that the small sample size issue is a critical problem in the use of LDA. The study proposed the use of PCA during the preprocessing phase and LDA afterwards. This ensures that PCA reduces the dimension of the image data so that an efficient recognition rate is attained. LDA has a high computational cost in cases of huge data. Barnouti et al. (2016) devised a face recognition system using appearance-based methods. A PCA-LDA implementation was successful in testing the performance of the system. The study revealed that if the number of training images is increased, the recognition rate would also increase. Singh et al. (2016) used a novel method comprising LDA and SURF to solve illumination and pose constraints in face recognition. Compared with other methods, the proposed algorithm improved the quality parameters, such as accuracy, error rate and mean square error, in all situations.

2.7.2 Independent Component Analysis

Independent Component Analysis (ICA) is a reduction technique employed to model data as linear combinations of statistically independent sample data points and to reveal hidden factors that bring about sets of random variables, signals and measurements. According to Comon (1994), ICA of random vectors involves looking for a linear transformation that reduces the dependency among their components. It helps to uncover hidden factors of a random variable and basically separates individual signals from mixtures. To make it conceptually simple, a linear transformation of the original data is required. Bartlett et al. (2002) presented a study involving ICA in facial recognition. The study concluded that ICA can be considered better than PCA when the similarity measures are cosines. It however showed that the performance is not significantly different with the use of Euclidean distances.
Bhat and Pujari (2015) performed a study using a combination of PCA and ICA in face recognition. The study showed that the performance of the hybrid technique was lower in accuracy, even though ICA on its own is efficient. Kabir, Hossan and Islam (2016) showed that both the ICA and PCA algorithms are effective in facial recognition systems. Nevertheless, they concluded that the PCA algorithm performs better than the ICA algorithm and found that the PCA algorithm represents an intelligent section of a random search within a short time. In general, ICA is a good statistical technique for face recognition systems, but the variance and the order of the independent components cannot be determined.

2.8 Conclusion

Face recognition has improved over the years, leading to its high use in various fields. Scholars from different fields have gained interest, and various algorithms have been proposed to improve it, since there are still limitations in facial recognition which affect the performance of the various algorithms proposed. These factors include illumination, pose variation, occlusion and facial expressions. The study focused on facial expressions, and the findings from this review have revealed that there is limited work done on recognition under facial expressions. This study seeks to fill this gap by assessing the performance of a face recognition algorithm on faces under varying facial expressions.

CHAPTER THREE
METHODOLOGY

3.1 Introduction

This chapter focuses on describing the theory of the concepts employed, their derivations and the methods of the study. It explains in detail and gives a comprehensive understanding of the Discrete Wavelet Transform/Principal Component Analysis with Singular Value Decomposition (DWT-PCA/SVD) algorithm. It also entails the description and analyses of the available data used to satisfy the research aims of the study.
The algorithms for generating processed images were written in MATLAB. The aims of the study are:

• To compute the recognition rate so as to assess how the modified algorithm performs under varying facial expressions.
• To assess whether a significant difference exists with respect to the average recognition distance for the various facial expressions.

The face recognition procedure comprises using DWT as a noise filter in the pre-processing phase, PCA/SVD in the feature extraction (dimension reduction) phase, and the recognition distance in the recognition phase.

3.2 Data Acquisition

The study made use of secondary data from the Cohn Kanade database (CK), which consisted of twenty-six individuals. The face image samples were used as inputs for the proposed algorithm. The database consisted of images in the seven universally accepted facial expressions, namely the angry, disgust, fear, happy, neutral, sad and surprise poses. Plate 3.1 shows some images from the Cohn Kanade database in the varying facial expressions.

Plate 3.1 Examples of images used from the Cohn Kanade database.

3.3 Recognition Procedures

The study employed the DWT-PCA/SVD algorithm on the Cohn Kanade database under the varying facial expressions. The facial images in the database served as inputs for the recognition module. These images (both in the training and testing datasets) are processed and transformed into a more suitable format for image processing. The research design is illustrated in Figure 3.1: the training and test images each pass through pre-processing (DWT and mean centering) and feature extraction (PCA/SVD), after which a classifier compares the extracted features against the knowledge base to output the recognised expression.

Figure 3.1: Research design

3.3.1 Pre-Processing

This is usually the first stage in face recognition. Facial images are pre-processed and enhanced to improve the recognition system's performance.
To achieve effective extraction of face features, a series of pre-processing operations is needed first. First, the background of each image in the dataset, which is redundant ('useless') in the recognition procedures, is removed by manually cropping the image into a uniform, preferred size, leaving the main face which contains the unique features such as the nose, eyes and mouth. Images from the database were imported into the MATLAB software, with each image resized into a uniform dimension of 256 × 256. According to Reddy et al. (2011), resizing images helps in reducing the lighting effect in visual images, thus making them suitable for PCA in image processing. Mairal et al. (2010) stated that images in different stages of generation, transmission and acquisition are automatically subjected to external factors which are termed interference noise. Also, many researchers have stated that images naturally exhibit the characteristics of Gaussian noise due to environmental variations. This means the cropped images still contain some level of noise. Noise in an image devalues the image's quality, suppresses some distinct features and changes the image's visual effect. Thus, image denoising is an important step in image pre-processing. It enhances the quality of the image, which is helpful in the recognition process, by removing as much noise as possible while at the same time maintaining significant features such as edges, lines and fine details. For this study, the face images were denoised by adopting a pixel-based filtering technique known as the discrete wavelet transform, which has been proven to be a viable tool for denoising. It was applied on the images while maintaining the vital facial features for recognition. Chang et al. (2000) stated that the DWT is a very fast and easy way to denoise face images.
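The resizing step described above can be sketched as follows. The study used MATLAB for this; the nearest-neighbour NumPy version below is an illustrative assumption, not the study's implementation (MATLAB's resizing uses interpolation by default).

```python
import numpy as np

def resize_nn(img, size=(256, 256)):
    """Nearest-neighbour resize of a 2-D grayscale image to a uniform
    dimension, as done before denoising. Illustrative sketch only."""
    h, w = img.shape
    # map each output pixel back to its nearest source pixel
    rows = (np.arange(size[0]) * h // size[0]).clip(0, h - 1)
    cols = (np.arange(size[1]) * w // size[1]).clip(0, w - 1)
    return img[np.ix_(rows, cols)]

face = np.random.rand(300, 240)   # stand-in for a manually cropped face image
resized = resize_nn(face)         # uniform 256 x 256 input for the pipeline
```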
3.3.1.1 Wavelet Transform

The wavelet transform has in recent times gained attention as a powerful mathematical technique utilised in transient signal analysis, communication systems, image analysis, pattern recognition, computer vision, image decomposition and image processing. It is used as a filter for extracting localised time and frequency information about an image. That is, it has the ability to represent local features in the frequency and time domains. It is classified as a time-compact wave with a double index, defined based on a hierarchy of scales. It involves representing general functions in terms of fixed, simple building blocks at different positions and scales. The fixed building blocks are obtained by translation and dilation operations from one fixed function known as the mother wavelet (Alarcon-Aquino & Barria, 2009). According to Szeliski (2010), analysing low-frequency components for a short duration with a wide band of the signal, and high-frequency components for a long duration with a narrow band, are done easily. The wavelet transform is commonly represented by the following equation:

$$F(x, y) = \int_{-\infty}^{\infty} f(t)\, \varphi^*_{(x,y)}(t)\, dt \qquad (3.10)$$

where * is the complex conjugate symbol and $\varphi$ is an arbitrary function satisfying specific rules. The wavelet transform is used to decompose signals into a set of basis functions. Such basis functions, known as wavelets, are obtained from a mother wavelet by scaling, translation and dilation of a known function. Wavelets basically constitute signals which are local in scale and time, usually having asymmetrical shapes. These signals are termed wavelets because they integrate to zero, that is, they wave up and down across the axis. Signals are easily broken down into numerous scaled and shifted forms of the primary parent wavelets. Their coefficients can later be reduced to do away with much of the detail.
Wavelets have the huge benefit of being able to separate the fine features in a signal. Smaller wavelets are employed in isolating the finest details, while larger wavelets are used to capture coarse details. The capacity of wavelets to produce spatial and frequency representations of the image at the same time propels their usage in the process of feature extraction. Some of the fundamental families of wavelets include the Symlet, Haar, Reverse Biorthogonal and Biorthogonal wavelets. The generation of the wavelet basis function can be denoted as:

$$\varphi_{(r,s)}(t) = \frac{1}{\sqrt{s}}\, \varphi\!\left(\frac{t - r}{s}\right) \qquad (3.11)$$

where r and s are real numbers which measure the translation and scaling factors respectively (Poularikas, 1999). Wavelet transforms are very significant in diverse scenarios, as mentioned earlier, based on the function employed for their calculation. Therefore, a wavelet transform corresponds to an infinite set of different transforms in the time domain. They can be classified as continuous (non-orthonormal) and discrete (orthonormal) wavelet transforms. The continuous wavelet transform uses all possible scales and translations, whilst the discrete wavelet transform uses only selected scales and translations. This study focused on time-signal decomposition based on wavelet orthogonality (orthonormality). The wavelet transform has an advantage over the Fourier transform in that it has temporal resolution and the ability to capture both frequency and location information. Wavelet analysis deals with approximations and details. The approximations usually constitute the high-scale, low-frequency parts of the signal, while the details are the low-scale, high-frequency parts.

3.3.1.2 Discrete Wavelet Transform

The Discrete Wavelet Transform is a greatly adjustable and efficient technique for sub-band separation of signals. In the Discrete Wavelet Transform technique, locations and scales are discrete.
The outcome of the discrete wavelet transform is a large number of wavelet coefficients, which are functions of location and scale. The coefficients obtained denote how closely the wavelet matches a certain part of the signal. In the DWT, the signal energy concentrates in certain wavelet coefficients. Bute and Jasutkar (2012) stated that such a feature is very useful for compressing images. The DWT is notably utilised in signal coding, so as to represent a discrete signal in a less redundant form, and generally as a prerequisite for data compression (Mallat, 1999). Also, Imtiaz and Fattah (2011) showed that the DWT yields adequate information for analysing primary images with a remarkable cut in the computation time. The DWT decomposes images through successive low-pass and high-pass filtering, done row-wise and column-wise. Subsequent to every transform pass, the low-pass part of the face is transformed again. This procedure may be repeated recursively, based on the need. By applying a one-level DWT, the face image is broken down into an approximation and three detail parts, specifically the Low-Low (LL), High-Low (HL), Low-High (LH) and High-High (HH) sub-bands, relating to the approximation and the vertical, horizontal and diagonal characteristics. The rows of the image are first transformed with one level of decomposition. This primarily separates each row into two halves, with the first keeping the mean (approximation) coefficients while the second half keeps the detail data. The procedure is carried out repeatedly on the columns, which yields the four sub-bands. The size of every single sub-band is half that of the primary image in each dimension. The LL part, which corresponds to the approximation of the decomposition, is utilised in the decomposition of the next level. The sub-band HL denotes significant facial expression characteristics.
The sub-band LH represents the vertical features of the outline and depicts face pose features (Lai, Yuen & Feng, 2001). The sub-band HH is the most unstable band amongst all the sub-bands, since it is affected by expressions, noise and pose. This occurs because it contains the high-frequency data of the face image. A 2-D DWT can be considered as a 1-D wavelet transform along the rows followed by a 1-D wavelet transform along the columns. Figure 3.2 displays a Discrete Wavelet Transform decomposition of an image using a 2-level pyramid, where 1 and 2 represent the decomposition levels, H constitutes the high-frequency bands and L the low-frequency bands.

Figure 3.2: A DWT decomposition of an image using a 2-level pyramid.

Unlike the Discrete Fourier Transform (DFT), the DWT refers not to a single transform but to a set of transforms, such that every transform has a unique set of wavelet basis functions. Among the most popular forms are the Daubechies set and the Haar wavelets. Other forms of wavelets include the Morlet, Coiflet, Biorthogonal, Mexican Hat and Symlet wavelets. The focus of this study was the Haar transform.

3.3.1.3 Haar Wavelet Transform

Haar transformation techniques are employed to produce wavelets (Chang, Chuang & Hu, 2004). The Haar wavelet is the easiest of all the wavelet transforms and is discontinuous, while all other Daubechies wavelets are continuous. The Haar transform cross-multiplies a function against the Haar wavelet with various shifts and stretches, just as the Fourier transform cross-multiplies a function against a sine wave with two phases and many stretches (Alwakeel & Shaaban, 2010). It is a widely used technique, recognised as a powerful and easy mechanism for multi-resolution decomposition of time series. The Haar wavelet can be seen as a mathematical operation which functions as a prototype for every other wavelet transform.
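The one-level 2-D decomposition into LL, LH, HL and HH sub-bands described above can be sketched as follows. The study used MATLAB; the NumPy version below is an illustrative assumption, and sub-band naming conventions (which of LH/HL is the horizontal versus vertical detail) vary between texts.

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT: filter the rows, then the columns, giving
    four sub-bands, each half the input size in each dimension."""
    def haar_pairs(x):
        # pairwise normalised sums (low-pass) and differences (high-pass)
        a = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
        d = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
        return a, d
    L, H = haar_pairs(img)        # row-wise pass
    LL, LH = haar_pairs(L.T)      # column-wise pass on the low band
    HL, HH = haar_pairs(H.T)      # column-wise pass on the high band
    return LL.T, LH.T, HL.T, HH.T

img = np.random.rand(8, 8)
LL, LH, HL, HH = haar_dwt2(img)

# the transform is orthonormal, so total energy (squared 2-norm) is preserved
energy_in = np.sum(img ** 2)
energy_out = sum(np.sum(b ** 2) for b in (LL, LH, HL, HH))
```

For multi-level decomposition, the same function would be applied again to the LL sub-band, as in the 2-level pyramid of Figure 3.2.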
Generally, the Haar transform can be employed as a noise removal mechanism and also for compressing image signals. The M Haar functions can be sampled at $t = m/M$, where $m = 0, 1, \ldots, M - 1$. This constitutes an M × M matrix for a discrete Haar transform. An example when M = 2 is:

$$M_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

Similarly, when M = 4:

$$M_4 = \frac{1}{\sqrt{4}} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ \sqrt{2} & -\sqrt{2} & 0 & 0 \\ 0 & 0 & \sqrt{2} & -\sqrt{2} \end{bmatrix}$$

The Haar transform matrices are both real and orthogonal, which implies that

$$M^{-1} M = M^{T} M = I \qquad (3.12)$$

where I represents the identity matrix. For instance:

$$M^{T} M = \left(\frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\right) \left(\frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}\right) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

Generally, an H × H discrete Haar matrix is represented in terms of its row vectors $m_x$, $x = 0, 1, 2, \ldots, H - 1$, as follows:

$$M = \begin{bmatrix} m_0 \\ m_1 \\ m_2 \\ \vdots \\ m_{H-1} \end{bmatrix} \qquad (3.13)$$

Its inverse and transpose are given as

$$M^{-1} = M^{T} = \begin{bmatrix} m_0^T & m_1^T & \cdots & m_{H-1}^T \end{bmatrix}$$

where $m_x^T$ is the xth row vector of the matrix. Consider a time series $v = v_1, v_2, v_3, \ldots, v_N$, where N is even. The single-level Haar transform decomposes v into two signals of length N/2. They are the mean coefficient vector $u^1$, with components

$$u_m = \frac{v_{2m-1}}{\sqrt{2}} + \frac{v_{2m}}{\sqrt{2}}, \qquad m = 1, 2, \ldots, \frac{N}{2}$$

and the detail coefficient vector $b^1$, with components

$$b_m = \frac{v_{2m-1}}{\sqrt{2}} - \frac{v_{2m}}{\sqrt{2}}, \qquad m = 1, 2, \ldots, \frac{N}{2}$$

The terms in the detail vector represent variations between successive elements of the time series, that is, on a time scale $\Delta t$. Each term in the mean vector is a mean across a time scale $2\Delta t$. These can be linked into another N-vector, which can be regarded as a linear transformation of v:

$$f^1 = (u^1 \mid b^1) \qquad (3.14)$$

From here the transform can be inverted to get v from $f^1$, as follows:

$$v_{2i-1} = \frac{u_i}{\sqrt{2}} + \frac{b_i}{\sqrt{2}} \qquad (3.15)$$

and

$$v_{2i} = \frac{u_i}{\sqrt{2}} - \frac{b_i}{\sqrt{2}} \qquad (3.16)$$

where $i = 1, 2, \ldots, N/2$. The Haar transform is comparable to the Fourier transform in that it is power conserving (2-norm preserving).
This means that the Haar transform can be regarded as partitioning the power between various time scales and ranges:

$$|f^1|^2 = \sum_{m=1}^{N/2} (u_m^2 + b_m^2) \qquad (3.17a)$$

$$|f^1|^2 = \sum_{m=1}^{N/2} \frac{1}{2}\left[(v_{2m-1} + v_{2m})^2 + (v_{2m-1} - v_{2m})^2\right] \qquad (3.17b)$$

$$|f^1|^2 = \sum_{m=1}^{N/2} (v_{2m-1}^2 + v_{2m}^2) \qquad (3.17c)$$

$$|f^1|^2 = |v|^2 \qquad (3.17d)$$

3.3.1.4 Haar Wavelet

The Haar wavelet can be thought of as a series of rescaled, square-shaped functions. For an input of $2^n$ values, the pairwise differences are kept and the sums are passed on. The procedure is repeated recursively, pairing up the sums to obtain the next scale, which culminates in $2^n - 1$ differences and a final sum. The technical deficiency of the Haar wavelet mechanism stems from the fact that it is discontinuous and hence not differentiable. On the other hand, this same property can be an advantage for analysing signals with abrupt transitions, such as tool failure monitoring in machines. There are numerous advantages of the Haar transform, such as:

 It is fast and conceptually straightforward.
 It is memory efficient, because it can be computed in place without a temporary array.
 It overcomes the edge effect which affects other transforms, since it is exactly reversible.

The Haar wavelet mother function is denoted by:

$$\psi(x) = \begin{cases} 1, & 0 \le x < 1/2 \\ -1, & 1/2 \le x < 1 \\ 0, & \text{otherwise} \end{cases} \qquad (3.18)$$

The Haar scaling function is described as:

$$\phi(x) = \begin{cases} 1, & 0 \le x < 1 \\ 0, & \text{otherwise} \end{cases} \qquad (3.19)$$

The properties of the Haar wavelet include:

 All functions are linear combinations of $\phi(x), \phi(2x), \phi(2^2 x), \ldots, \phi(2^k x), \ldots$ and the corresponding shifted functions.
 Any function can be represented as a linear combination of a constant function, $\psi(x), \psi(2x), \psi(2^2 x), \ldots, \psi(2^k x)$ and the corresponding shifted functions.
 The set $\{2^{m/2}\, \psi(2^m x - n);\ m, n \in \mathbb{Z}\}$ is an orthonormal basis.

3.3.2 Extraction of Features

The methods of feature extraction are applied after the pre-processing stage.
When the desired level of segmentation is obtained at the pre-processing stage, the feature extraction techniques are applied to the segment to obtain a reduced set of important features which will be useful in classification and efficient recognition of faces. The features must carry the information necessary to differentiate between classes, be insensitive to extraneous variability, and be limited in number. If the characteristics are carefully analysed, the expectation is that the features will extract the necessary information from the input image so as to obtain the preferred output. These methods are very useful in image processing, since the extracted features determine the storage space, efficiency and time consumed in classification. The main objective of this stage is to obtain the most significant data from the existing image and represent it in a lower-dimensional space. This information makes classification easier and increases the rate of recognition with the least number of elements. Examples of the most commonly used feature extraction techniques include template matching, geometric moment invariants, deformable templates, Fourier descriptors and Gabor features. Bellakhdhar, Loukil and Abid (2013) stated that the performance of a face recognition algorithm depends on how it extracts the feature vector and recognises a facial image correctly. The feature extraction method is a vital tool in determining the recognition rate of a face recognition algorithm. In this study, dimension reduction techniques were employed after using the Discrete Wavelet Transform at the pre-processing phase. The expectation is that the extracted features of the face image will contain the essential information about the input data. This implies that the required task may be executed with a smaller representation in place of the comprehensive original data.
This procedure involves a reduction in the amount of resources needed to describe a huge data set. During the feature extraction phase, the study adopted the PCA/SVD approach as a data reduction technique.

3.3.2.1 Singular Value Decomposition (SVD)

Singular value decomposition can be thought of as an effective mathematical tool arising from linear algebra. It plays fundamental roles in various applications such as image processing. In digital image processing, SVD gives a robust technique for storing large images as much smaller, more manageable ones. This is done by reconstructing the input face from a few leading non-zero singular values. It is an efficient matrix decomposition method in the least-squares sense, and it packs the greatest signal energy into as few coefficients as possible. Singular value decomposition also has the capability to adapt to variations in an image's local features. In the singular value decomposition, a matrix is broken down into a product of three other matrices. From the perspective of linear algebra, an image can be seen as an array of non-negative scalar entries forming a matrix. The SVD of a matrix N of order a × b can be written as

$$N = U \Sigma V^T \qquad (3.20)$$

where U and V are a × a and b × b orthogonal matrices respectively, and $\Sigma$ is a matrix of order a × b having the singular values of N along its leading diagonal:

$$\Sigma = \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_n \end{bmatrix}$$

The diagonal elements are referred to as the singular values and satisfy the condition

$$\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge \cdots \ge \sigma_r > \sigma_{r+1} = \cdots = \sigma_n = 0$$

Also, for every real a × b matrix N, there exist orthogonal matrices U and V such that

$$U^T_{a \times a}\, N_{a \times b}\, V_{b \times b} = W_{a \times b} = \begin{bmatrix} W_1 & 0 \\ 0 & 0 \end{bmatrix}$$

where $W_1$ is a non-singular diagonal matrix whose diagonal entries are non-negative and arranged in non-increasing order.
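The decomposition of equation (3.20) and its properties can be checked numerically. A minimal sketch, using NumPy rather than the study's MATLAB, with an arbitrary illustrative matrix:

```python
import numpy as np

# SVD of a matrix N (equation 3.20): N = U Sigma V^T, with the singular
# values on the diagonal of Sigma in non-increasing order.
N = np.array([[4.0, 0.0],
              [3.0, -5.0]])
U, s, Vt = np.linalg.svd(N)

# reconstruction: U @ diag(s) @ Vt recovers N exactly
N_rec = U @ np.diag(s) @ Vt

# the squared singular values are the eigenvalues of N^T N (here 40 and 10)
# the rank of N equals the number of non-zero singular values
rank = int(np.sum(s > 1e-10))
```

`np.linalg.svd` returns the singular values already sorted in descending order, matching the ordering condition stated above.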
3.3.2.1.1 Theorem

Let N be a matrix with dimension a × b (a ≤ b). Then:

 The matrices $(N^T N)_{b \times b}$ and $(N N^T)_{a \times a}$ are symmetric with real and non-negative eigenvalues.
 If $\beta$ is a non-zero eigenvalue of $N^T N$ corresponding to the eigenvector $x$, then $\beta$ is also an eigenvalue of $N N^T$ with corresponding eigenvector $Nx$. Thus $N N^T$ and $N^T N$ have the same non-zero eigenvalues.

3.3.2.1.2 Properties

Singular value decomposition has numerous properties. The following properties are used in the study:

 The singular values $\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_n$ of the matrix N are unique. Nevertheless, the matrices U and V are not unique.
 $N^T N = V \Sigma^T \Sigma V^T$ and $N N^T = U \Sigma \Sigma^T U^T$; the matrices U and V are calculated from the eigenvectors of $N N^T$ and $N^T N$ respectively.
 The rank of the matrix N is identical to the number of its non-zero singular values.

3.3.2.2 Principal Component Analysis (PCA)

Principal Component Analysis is a dimension reduction technique generally used in numerous fields. It is mostly used in digital image processing due to its ability to produce an optimal linear least-squares decomposition of a training set, bringing out the most significant low-dimensional structure of face patterns in large datasets. Its advantage is that the reduction process only removes information that is not useful and maintains the most important information for recognition. In this study, the PCA technique was adopted primarily for data reduction and interpretation. It explains the variance-covariance structure of a set of variables through a few linear combinations of these variables. The procedure changes every high-dimensional, two-dimensional image into a one-dimensional vector. The study's concentration lay in the components that explained most of the required information.
Mathematically, PCA is computed as follows. A two-dimensional facial image is changed into a one-dimensional vector by concatenating the columns (or rows), as shown below:

$$\begin{bmatrix} x_1 & x_3 \\ x_2 & x_4 \end{bmatrix} \rightarrow \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} \qquad (3.21)$$

The mean face is defined as the arithmetic mean of the training image vectors at each pixel point; it is also a one-dimensional vector. Let each face image be represented by the vector $\Gamma_i$:

$$\Gamma_1 = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}, \quad \Gamma_2 = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \end{bmatrix}, \quad \ldots, \quad \Gamma_N = \begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{bmatrix} \qquad (3.22)$$

The mean face is calculated as:

$$\psi = \frac{1}{N} \sum_{i=1}^{N} \Gamma_i \qquad (3.23a)$$

$$\psi = \frac{\Gamma_1 + \Gamma_2 + \cdots + \Gamma_N}{N} \qquad (3.23b)$$

Every face differs from the average by $\Phi_i = \Gamma_i - \psi$, known as the mean-centred image. A covariance matrix of size $M^2 \times M^2$ is constructed as $C = A A^T$, where $A = [\Phi_1, \Phi_2, \Phi_3, \ldots, \Phi_N]$. The eigenvectors of the covariance matrix are calculated as:

$$A^T A X_i = \lambda_i X_i \qquad (3.24a)$$

$$A A^T A X_i = A \lambda_i X_i \qquad (3.24b)$$

$$A A^T (A X_i) = \lambda_i (A X_i) \qquad (3.24c)$$

where $A X_i$ is the eigenvector and $\lambda_i$ is the eigenvalue. The eigenvectors are denoted by $U_i$; they are face images which are ghostly in nature and are known as eigenfaces. Every eigenvector maps to an eigenface in the face space, and any eigenface whose eigenvalue is zero (0) is discarded, thereby reducing the number of eigenfaces. Eigenfaces are ranked in order of their adequacy in explaining the variability among the face images. Projecting an image onto the face space is computed as:

$$\Omega_i = U^T (\Gamma_i - \psi), \qquad i = 1, \ldots, M$$

where $(\Gamma_i - \psi)$ is the mean-centred image,

$$\Omega = U^T (\Gamma - \psi) \qquad (3.25)$$

A test image $\Gamma$ is then mapped onto the image space to obtain a vector $\Omega$ such that $\Omega = U^T(\Gamma - \psi)$. An example of the PCA/SVD operations:

 Let A, B and C represent matrices of face images in a database:

$$A = \begin{bmatrix} 7 & 8 \\ 4 & 9 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & 3 \\ 2 & 4 \end{bmatrix}, \quad C = \begin{bmatrix} 9 & 5 \\ 3 & 3 \end{bmatrix}$$

 Vectorise the face images in the database (reshape into one matrix):
$$T = \begin{bmatrix} 7 & 3 & 9 \\ 4 & 2 & 3 \\ 8 & 3 & 5 \\ 9 & 4 & 3 \end{bmatrix}$$

 Mean of the vectorised face images:

$$\bar{T} = \begin{bmatrix} 7 & 4 & 5 \end{bmatrix}$$

 Mean centred:

$$S = \begin{bmatrix} 0 & -1 & 4 \\ -3 & -2 & -2 \\ 1 & -1 & 0 \\ 2 & 0 & -2 \end{bmatrix}$$

 Covariance matrix:

$$c = \begin{bmatrix} 14 & 5 & 2 \\ 5 & 6 & 0 \\ 2 & 0 & 24 \end{bmatrix}$$

 Singular value decomposition:

$$U = \begin{bmatrix} -0.2146 & 0.8728 & 0.4382 \\ -0.0581 & 0.4364 & -0.8978 \\ -0.9749 & -0.2182 & -0.0428 \end{bmatrix}$$

$$D = \begin{bmatrix} 24.4403 & 0 & 0 \\ 0 & 16 & 0 \\ 0 & 0 & 3.5596 \end{bmatrix}$$

$$V^T = \begin{bmatrix} -0.2146 & -0.0581 & -0.9749 \\ 0.8728 & 0.4364 & -0.2182 \\ 0.4382 & -0.8978 & -0.0428 \end{bmatrix}$$

 Eigenvalues: $[24.4403 \quad 16.0000 \quad 3.5596]$

 Eigenvectors:

$$\begin{bmatrix} -0.2146 & 0.8728 & 0.4382 \\ -0.0581 & 0.4364 & -0.8978 \\ -0.9749 & -0.2182 & -0.0428 \end{bmatrix}$$

3.3.3 Recognition

The last step, after an input image has been pre-processed and its features extracted and stored, is to recognise the image. This stage involves comparing the input image to an already built face database. The features of the input image are compared to each face class stored in the database. The recognition phase falls under two classifications: face identification and face verification. Verification, also known as one-to-one matching, contrasts the image against a template face image already on file. Face identification, also referred to as one-to-many matching, compares a query face image against all image templates in a face database. This phase of the study gives a stage-by-stage approach to recognising an unknown face in the trained database. After formalising the representation of the faces, the final phase is to recognise the identity of such images, as stated earlier. An unfamiliar input face image passes through the phases above before recognition. The face image datasets used contain individuals with their neutral pose, and the other six universally accepted facial expressions, namely the fear, disgust, angry, sad, happy and surprise poses, are extracted and kept in the database.
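The PCA/SVD pipeline of the worked example above (vectorise, mean-centre, form the small covariance matrix, decompose) can be sketched as follows. This NumPy version is illustrative; it computes each quantity directly from the definitions, so only structural properties are checked rather than the rounded figures listed above.

```python
import numpy as np

# the toy face images of the worked example
A = np.array([[7, 8], [4, 9]])
B = np.array([[3, 3], [2, 4]])
C = np.array([[9, 5], [3, 3]])

# vectorise each image column-wise and stack the vectors as columns of T
T = np.column_stack([M.flatten(order='F') for M in (A, B, C)]).astype(float)

psi = T.mean(axis=1, keepdims=True)   # mean face (eq. 3.23a), pixel-wise
S = T - psi                           # mean-centred images Phi_i

cov = S.T @ S                         # small covariance matrix (3 x 3)
U, d, Vt = np.linalg.svd(cov)         # eigen-decomposition via SVD
```

Working with the small matrix $S^T S$ instead of $S S^T$ is the standard eigenface trick of equations (3.24a)-(3.24c): the two share their non-zero eigenvalues, and the full-size eigenvectors are recovered by multiplying through by $S$.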
Therefore, as soon as an input image is placed in the dataset, the face image undergoes the pre-processing stage and the dimension reduction process, and its components are compared to each face class stored in the database for recognition.

3.3.3.1 Euclidean (Recognition) Distance

The recognition rule which is popularly used is the Euclidean distance. The Euclidean distance is computed between the query image and the database images. Following the procedures in the dimension reduction phase, a new image from the test set is changed into its eigenface components. The input face is first contrasted with the mean face stored in memory, and their difference is multiplied with each eigenvector. Thus, every value denotes a weight, which is saved as a vector. The Euclidean distance is denoted by:

$$\varepsilon_i = \lVert \Omega - \Omega_i \rVert \qquad (3.26)$$

where $i = 1, 2, \ldots, M$ and $\Omega_i$ is the weight vector of the ith face class. A face is recognised as being part of the ith class if the minimum $\varepsilon_i$ is below a certain threshold $\theta$; otherwise it is classified as unknown. Figure 3.3 summarises the face recognition procedures as a flow chart: the image matrix passes through the Discrete Wavelet Transform, vectorisation and mean centring (pre-processing stage); the covariance matrix, singular value decomposition, eigenface decomposition and principal component extraction (feature extraction stage); and finally knowledge storage, recognition and classification (classification stage).

Figure 3.3: A summary of the face recognition procedures

3.3.3.2 Recognition Rate

The recognition performance of an algorithm is assessed through the recognition rate and the computational time. The recognition rate of an algorithm is expressed as the ratio of the number of images recognised by the algorithm to the total number of face images in the test dataset for each experimental run. The performance of this recognition has many measurement standards.
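The Euclidean recognition rule of equation (3.26), together with a simple count-based recognition rate, can be sketched as follows. The gallery of weight vectors, the function name `recognise` and the use of Python rather than the study's MATLAB are illustrative assumptions.

```python
import numpy as np

def recognise(omega, gallery):
    """Euclidean (recognition) distance rule: return the index of the
    closest stored weight vector and the corresponding distance."""
    dists = np.linalg.norm(gallery - omega, axis=1)   # eps_i for each class
    i = int(np.argmin(dists))
    return i, dists[i]

# toy gallery of projected weight vectors, one row per face class
gallery = np.array([[0.0, 0.0],
                    [10.0, 0.0],
                    [0.0, 10.0]])
query = np.array([9.0, 1.0])          # nearest to class 1
idx, dist = recognise(query, gallery)

# recognition rate over a toy test run: correct matches / images tested * 100
predicted = np.array([0, 1, 2, 2])
truth = np.array([0, 1, 2, 1])
rate = 100.0 * np.sum(predicted == truth) / len(truth)
```

In practice the minimum distance would also be compared against the threshold $\theta$ before accepting the match, with the face labelled unknown otherwise.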
The recognition rate is defined as:

$$\text{Recognition Rate} = \frac{\text{Number of face images recognised } (n_t)}{\text{Number of face images present } (n_r)} \times 100 \qquad (3.27)$$

Thakur et al. (2008) stated the average recognition rate numerically as:

$$RR_{avg} = \frac{\sum_{k=1}^{26} n_{cls}^{k}}{q \times n_{tot}} \times 100 \qquad (3.28)$$

where q represents the total number of experimental runs, $n_{tot}$ represents the number of images under test in every run, and $n_{cls}^{k}$ represents the number of face images correctly recognised in the kth run. The mean error rate is then computed as:

$$ER_{avg} = 100 - RR_{avg} \qquad (3.29)$$

The exactness of a face recognition algorithm is a very important part of the analysis process in image processing. The highest similarity measure corresponds to the lowest distance of a resultant image.

3.4 Statistical Testing Approach

The study evaluated the face recognition algorithm based on a statistical technique. The study employed either parametric or non-parametric measures, based on the assumptions of the underlying distribution of the adopted statistical test.
There are many statistical tests for assessing the multivariate normality of a dataset, but unfortunately there is no uniformly most powerful test among them. In practice, one has to employ several tests. Examples of such tests are the Mardia, Chi-square and Henze-Zirkler tests. The study employed the Chi-square test since it has a statistical consistency property.

3.4.1.1 Generalised Square Distance Approach

This is a widely used technique derived from the distribution of the squared distances of sample points from their average:

x_i² = (y_i − ȳ)′ S⁻¹ (y_i − ȳ)    (3.30)

where

ȳ = (1/n) Σ_{j=1}^{n} Y_ij,  i = 1, 2, 3, …, m    (3.31)

Y_1, Y_2, …, Y_m are the recognition distances under the varying facial expressions, and Y is an m × n matrix of recognition distances under the constraints.

A chi-square plot can be described as a plot of the ordered squared distances d²_(i) against the 100((i − 0.5)/m)-th quantiles of the chi-square distribution with p degrees of freedom. Let d_i² = d_ij² for i = j; that is, d_i² is the i-th diagonal entry of the matrix d_ij². The procedure for constructing a chi-square plot is as follows:

a. Arrange the squared distances in ascending order, from the smallest to the largest value, as d²_(1) ≤ d²_(2) ≤ d²_(3) ≤ ⋯ ≤ d²_(m).

b. Compute the quantiles q_cp((i − 0.5)/m) corresponding to the upper percentiles of the chi-square distribution.

c. Plot the points (q_cp((i − 0.5)/m), d²_(i)) to obtain the chi-square plot. The plot indicates multivariate normality if it assumes the form of a straight line.

3.4.2 Repeated Measures Design

In a repeated measures design, every member of each group is tested several times under various conditions over a particular time frame. This test is primarily used to compare experimental treatments for which measurements are taken on the same subject over successive times or under different conditions.
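The plotting procedure above can be sketched as follows. This is an illustrative Python version (the study's computations were done in MATLAB) assuming a data matrix Y of recognition distances, one row per subject and one column per expression.

```python
import numpy as np
from scipy import stats

def chi_square_plot_points(Y):
    """Ordered squared generalised distances (Eq. 3.30) paired with
    chi-square quantiles, following steps (a)-(c) above.

    Y : data matrix of shape (m, p), one row per subject,
        one column per facial expression.
    """
    m, p = Y.shape
    ybar = Y.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(Y, rowvar=False))
    centred = Y - ybar
    # d_i^2 = (y_i - ybar)' S^{-1} (y_i - ybar) for each row y_i
    d2 = np.einsum('ij,jk,ik->i', centred, S_inv, centred)
    d2_ordered = np.sort(d2)                        # step (a)
    probs = (np.arange(1, m + 1) - 0.5) / m         # step (b)
    q = stats.chi2.ppf(probs, df=p)
    return q, d2_ordered                            # step (c): plot these
```

Plotting q against d2_ordered (or computing their correlation, as done later in the study) indicates multivariate normality when the points lie close to a straight line of unit slope.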
For this study, the treatments are the varying facial expressions (constraints) in the recognition database. The rationale for employing the repeated measures design is to examine whether there is a significant difference between the minimum Euclidean distances of the varying facial expressions from the images with the neutral pose. The study assumes that each subject is measured p times in a row, each measurement corresponding to a specific facial expression. The k-th observation is given as

X_k = (X_k1, X_k2, …, X_kp)′    (3.32)

where k = 1, 2, …, n and X_kl is the measure of the l-th constraint on the k-th individual. For comparative purposes, the study considered contrasts of the components of μ = E(X_k). The successive-difference comparison used is given as

(μ_2 − μ_1, μ_3 − μ_2, …, μ_p − μ_{p−1})′ = Cμ    (3.33)

where

C = [ −1  1  0  …  0
       0 −1  1  …  0
       ⋮
       0  0  … −1  1 ]

C is called the contrast matrix since its rows are linearly independent, and each row is known as a contrast vector. The nature of the repeated measures design removes much of the influence of unit-to-unit differences on the various comparisons (Johnson & Wichern, 1998). The study randomised the order in which the varying facial expressions were presented to each individual (subject).

3.4.2.1 Assumptions

The assumptions for carrying out a multivariate repeated measures design are as follows:

 The samples must satisfy the multivariate normality assumption.
 The observations from the varying facial expressions (different variables) are independent of each other.
 The samples must satisfy the sphericity assumption (homogeneity of covariance).
 There must be a complete absence of outliers in the datasets.

3.4.2.2 Test for Equality of Treatments

All contrast matrices C have p − 1 linearly independent rows.
The contrast matrix C can be categorised as either a baseline comparison or a successive-difference comparison. The baseline comparison is given as:

C₁μ = (μ_1 − μ_2, μ_1 − μ_3, …, μ_1 − μ_p)′    (3.34)

where

C₁ = [ 1 −1  0  …  0
       1  0 −1  …  0
       ⋮
       1  0  0  … −1 ]

The successive-difference comparison is given as:

C₂μ = (μ_2 − μ_1, μ_3 − μ_2, …, μ_p − μ_{p−1})′    (3.35)

where

C₂ = [ −1  1  0  …  0
        0 −1  1  …  0
        ⋮
        0  0  … −1  1 ]

3.4.2.3 Hypotheses

Let X ~ (μ, Σ) and let C be a contrast matrix.

 The null hypothesis (H0) claims that there exist no significant differences in treatments (equal treatment means).
 The alternative (H1) claims that there exists a significant difference in treatments.

H0: Cμ = 0    (3.36a)
H1: Cμ ≠ 0    (3.36b)

The test statistic (Hotelling's T²) is given as

T² = n(Cx̄ − Cμ)′(CSC′)⁻¹(Cx̄ − Cμ)    (3.37)

where x̄ = (1/n) Σ_{k=1}^{n} X_k and S = (1/(n−1)) Σ_{k=1}^{n} (X_k − x̄)(X_k − x̄)′ are the sample mean vector and covariance matrix respectively. The critical value is

((n − 1)(p − 1)/(n − p + 1)) F_{α, p−1, n−p+1}

where α is the level of significance and F_{α, p−1, n−p+1} is the upper (100α)-th percentile of an F-distribution with p − 1 and n − p + 1 degrees of freedom.

The decision rule is to reject the null hypothesis if the test statistic (Hotelling's T²) exceeds the critical value at the α level of significance; otherwise we fail to reject the null hypothesis:

T² = n(Cx̄)′(CSC′)⁻¹(Cx̄) > ((n − 1)(p − 1)/(n − p + 1)) F_{α, p−1, n−p+1}

3.4.2.4 Confidence Region

A confidence region for the contrasts Cμ is the set of all Cμ such that:

n(Cx̄ − Cμ)′(CSC′)⁻¹(Cx̄ − Cμ) ≤ ((n − 1)(p − 1)/(n − p + 1)) F_{α, p−1, n−p+1}

The 100(1 − α)% simultaneous confidence intervals for single contrasts c′μ, for any contrast vector of interest, are given as:

c_k′μ:  c_k′x̄ ± √( ((n − 1)(p − 1)/(n − p + 1)) F_{α, p−1, n−p+1} ) √( c_k′ S c_k / n )    (3.38)

where c_k, k = 1, 2, 3, …, p − 1, is the k-th row vector of the contrast matrix C.

3.4.3 Friedman Rank Sum Test

The Friedman rank sum test, also referred to simply as the Friedman test, is a non-parametric counterpart of the repeated measures design, used to detect differences in treatments (varying facial expressions) across multiple test attempts. The goal of the test is to ascertain whether there exists a statistically significant difference between the median recognition distances of the varying facial expressions and the neutral face. The calculations are performed either on ranks obtained from points measured on a higher scale or on the original points.

3.4.3.1 Assumptions

The Friedman two-way analysis of variance by ranks has the assumptions listed below:

 The sample of k subjects is selected randomly.
 The intra-block observations are ranked in ascending/descending order.
 The data consist of b mutually independent samples (blocks), each of size k.
 The variable under study is continuous.
 There is no interaction between blocks and treatments.

3.4.3.2 Model

The linear Friedman rank sum test model is given as:

y_ij = μ + M_i + π_j + ε_ij,  i = 1, 2, 3, …, k;  j = 1, 2, 3, …, b

where M_i is the i-th treatment effect and π_j is the j-th block effect.

3.4.3.3 Testing Procedure

 First, rank the original observations, unless they are already ranks.
 Rank the observations within each block independently in ascending order, so that every block has its own set of k ranks.
 Obtain the rank sum R_j in each column.
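The successive-difference contrast test described above can be sketched numerically as follows. This is an illustrative Python version with an assumed data matrix (the study's computations were done in MATLAB).

```python
import numpy as np
from scipy import stats

def hotelling_contrast_test(X, alpha=0.05):
    """Hotelling T^2 test of H0: C mu = 0 using the successive-difference
    contrast matrix.

    X : data matrix of shape (n, p), one row per subject, one column per
        condition (facial expression).
    Returns the statistic, the critical value and the reject decision.
    """
    n, p = X.shape
    # successive-difference contrast matrix C, shape (p-1, p)
    C = np.zeros((p - 1, p))
    for i in range(p - 1):
        C[i, i], C[i, i + 1] = -1.0, 1.0
    xbar = X.mean(axis=0)                 # sample mean vector
    S = np.cov(X, rowvar=False)           # sample covariance matrix
    Cx = C @ xbar
    T2 = n * Cx @ np.linalg.inv(C @ S @ C.T) @ Cx
    # critical value: ((n-1)(p-1)/(n-p+1)) * F_{alpha, p-1, n-p+1}
    crit = (n - 1) * (p - 1) / (n - p + 1) * stats.f.ppf(1 - alpha, p - 1, n - p + 1)
    return T2, crit, T2 > crit
```

A large T² relative to the critical value indicates that at least one pair of adjacent treatment means differs.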
The test statistic is computed as follows:

χ²_R = (12/(bk(k+1))) Σ_{j=1}^{k} [R_j − b(k+1)/2]²    (3.39)

Alternatively,

W = (12 Σ_{i=1}^{k} R_i² − 3b²k(k+1)²) / (b²k(k² − 1))    (3.40)

where b(k+1)/2 is the expected rank sum when the null hypothesis is true.

In theory, a face recognition system should produce no ties in the Euclidean distances under the various expressions, so that the median recognition distance is ranked on an ordinal measure and assumed to be continuous. In practice it does not always work that way; ties may be recorded in some cases. Where there are ties in the ranking of the median recognition distance, the study assigns the tied observations the average of the rank positions for which they are tied. The test statistic for the tie-corrected Friedman two-way analysis of variance is defined as

W = (12 Σ_{i=1}^{k} R_i² − 3b²k(k+1)²) / (b²k(k² − 1) − b(Σt³ − Σt))    (3.41)

where t is the number of observations tied at a given rank within a block.

3.4.4 Pairwise Comparison

The study considered the comparison of all possible differences between pairs of samples. It is used to ascertain whether the responses differ significantly, in this study with respect to the Euclidean distances of the various facial expressions from their neutral pose.

H0: μ = 0 (zero mean difference between pairs)
H1: μ ≠ 0

For an N_p(μ, Σ_d) population, reject H0 if the observed

T² = n d̄′ Σ_d⁻¹ d̄ > ((n − 1)p/(n − p)) F_{p, n−p}(α)    (3.42)

where F_{p, n−p}(α) is the upper (100α)-th percentile of an F-distribution with p and n − p degrees of freedom.

CHAPTER 4
DATA ANALYSIS

4.1 Introduction

This chapter presents a detailed analysis of the computational work of this study, together with the rationale behind every process conducted and a discussion of the outcomes. The purpose of this study was to recognise faces under varying expressions from an available database using the DWT-PCA/SVD algorithm.
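The Friedman statistic above can be sketched as follows; ties within a block receive average ranks, as described. This is an illustrative Python version, not the study's MATLAB code.

```python
import numpy as np
from scipy import stats

def friedman_statistic(X):
    """Friedman rank-sum statistic (Eq. 3.39) with a chi-square p-value.

    X : matrix of shape (b, k) -- b blocks (subjects), k treatments
        (facial expressions). Tied values within a block are assigned
        the average of the rank positions they occupy.
    """
    b, k = X.shape
    # rank each block (row) independently; 'average' handles ties
    ranks = np.apply_along_axis(stats.rankdata, 1, X)
    R = ranks.sum(axis=0)  # column rank sums R_j
    chi2_r = 12.0 / (b * k * (k + 1)) * np.sum((R - b * (k + 1) / 2) ** 2)
    p_value = stats.chi2.sf(chi2_r, df=k - 1)
    return chi2_r, p_value
```

In the tie-free case this agrees with scipy.stats.friedmanchisquare, which additionally applies the tie correction of (3.41) when ties occur.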
The objectives of the study were to:

• compute the recognition rate in order to assess the performance of the modified algorithm under varying facial expressions;
• assess whether a significant difference exists in the average recognition distances of the various facial expressions.

4.1.1 Data Description

The face images used for the study were obtained from the Cohn-Kanade AU-Coded Facial Expression database. It consisted of twenty-six (26) individuals, each with the seven universally accepted principal expressions (neutral, angry, disgust, fear, happy, sad and surprise). The individuals in the database were university students admitted into an introductory psychology class, with ages ranging from 18 to 30 years. In all, one hundred and eighty-two (182) face images from the twenty-six individuals were used in the study database. The database was divided into two parts: the train image database and the test image database. The train image database consisted of the 26 neutral face poses, and the test image database contained the remaining 156 expression images (angry, disgust, fear, happy, sad, surprise). Each image, in both the train and test databases, was resized to a 256×256-dimensional unit.

Plate 4.1 Examples of images (neutral poses) in the train database.

Plate 4.2 Examples of images (angry, disgust, fear, happy, sad, surprise) in the test image database.

4.2 Result

The face recognition of the images was divided into three main phases, as stated in the research methodology: the preprocessing, feature extraction and recognition phases.

4.2.1 Preprocessing

Preprocessing methods are used to prepare images for further analysis. Before the training and testing distances can be measured, a preprocessing technique is applied to eliminate noise in the various images.
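The train/test partition described above can be expressed as a small sketch. This is purely illustrative (hypothetical record structure, not the study's code): the neutral pose of each subject goes to the train database and the six expression images go to the test database.

```python
def split_database(records):
    """Partition image records into the train database (neutral poses)
    and the test database (the six expression images).

    records : list of (subject_id, expression) tuples.
    """
    train = [r for r in records if r[1] == 'neutral']
    test = [r for r in records if r[1] != 'neutral']
    return train, test

# With 26 subjects and 7 expressions each, this reproduces the study's
# 26-image train database and 156-image test database (182 images in all).
```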
This makes the input images better conditioned, since unwanted or redundant features of the images are removed. Much of the reviewed literature has shown that the preprocessing stage significantly increases the consistency of visual examination and enhances the efficiency of a face recognition algorithm. In this study, a Discrete Wavelet Transform (DWT) was applied to the images at the preprocessing stage. DWT is a viable noise removal technique that makes the computation phase simpler and better conditioned. The DWT experimental runs present the real and imaginary elements of the transformed image: the real elements of the image are extracted for the feature extraction phase, whilst the imaginary components are treated as noise. In the DWT algorithm, the Inverse Discrete Wavelet Transform (IDWT) can be applied after the decomposition. A Gaussian filter is further employed to filter the noise after the decomposition, before the reconstructed image is acquired. The decomposition of the image in DWT is based on frequency domains: the frequency distributions are transformed in each step of the DWT, where L represents the low frequency band and H the high frequency band. The LL sub-band represents the lower-resolution approximation of the original image, while the mid- and high-frequency detail sub-bands HL, LH and HH represent the horizontal, vertical and diagonal edge details respectively.

Plate 4.3 is an illustration of the DWT and IDWT procedure.

4.2.2 Recognition Stage

This is the final stage of the recognition process. It shows the procedure for recognising an unknown face using the trained database.

4.2.2.1 Euclidean (Recognition) Distance

The Euclidean distances are computed and displayed in Table 4.1. They are the recognition distances of the various facial expressions from their neutral pose.
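A single level of the 2-D decomposition into the LL, LH, HL and HH sub-bands described above can be sketched with the Haar wavelet. This is a minimal numpy illustration, not the study's MATLAB implementation; a production implementation would use a wavelet library (and sub-band naming conventions vary between libraries).

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar DWT, returning the four sub-bands.

    img : 2-D array with even dimensions.
    LL is the low-resolution approximation; the remaining sub-bands
    hold the edge (detail) information.
    """
    def analyse(rows):
        # 1-D orthonormal Haar step along axis 1: average and difference
        lo = (rows[:, 0::2] + rows[:, 1::2]) / np.sqrt(2)
        hi = (rows[:, 0::2] - rows[:, 1::2]) / np.sqrt(2)
        return lo, hi
    L, H = analyse(img)          # filter in one direction
    LL, LH = analyse(L.T)        # then filter the other direction
    HL, HH = analyse(H.T)
    return LL.T, LH.T, HL.T, HH.T
```

Because the Haar filters are orthonormal, the transform preserves the total energy of the image, and a constant image yields zero detail sub-bands, which is what makes thresholding the detail coefficients a natural de-noising step.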
Table 4.1: Euclidean distances of the images from their neutral pose.

Angry      Disgust    Fear       Happy      Sad        Surprise
642.7500   354.3400   617.6700   1297.9000  408.2900   3016.3000
437.6300   573.7400   614.8300   835.8600   858.2900   1002.8000
1199.7000  644.5900   1094.4000  783.2700   1206.7000  1841.7000
746.1400   748.8900   817.5800   1245.3000  741.4400   913.8300
900.3700   906.5800   1054.0000  607.4900   943.9600   1865.3000
592.8400   630.0700   1230.1000  410.5200   737.1600   1430.7000
1484.3000  447.0000   959.3000   841.2000   477.8300   1968.3000
539.5900   907.9100   996.7000   955.1500   479.0300   982.2700
284.4000   409.7500   1420.2000  1037.7000  1425.0000  1314.1000
587.5700   808.5900   1767.4000  888.7900   1181.9000  816.7300
664.7100   206.8600   890.9100   326.0600   594.6200   778.9400
1013.9000  859.9600   825.4700   503.3600   1126.8000  2314.6000
1637.1000  1236.9000  1675.2000  1851.4000  991.4000   1463.8000
338.8600   1337.6000  635.0700   1203.2000  468.8600   1368.0000
788.7100   914.2000   502.6100   404.9900   496.7000   948.6800
1704.7000  1784.1000  727.9300   1328.3000  1690.3000  1984.4000
1510.5000  1350.0000  1038.7000  476.6400   1876.6000  1398.5000
695.6400   589.8300   1805.3000  1069.1000  1019.4000  433.8200
1185.1000  656.2300   1092.3000  423.0300   1272.8000  800.0700
898.3800   1249.0000  983.7800   1569.8000  737.3500   1128.6000
1385.0000  454.8800   1089.2000  1743.7000  954.3700   1791.8000
1123.8000  633.2100   813.1200   695.3300   1107.6000  2008.2000
1942.1000  814.5000   1817.0000  551.6300   362.5200   1019.2000
1470.8000  936.4500   2542.1000  1384.6000  1206.8000  1996.2000
628.4100   797.4500   519.6500   569.5300   662.7300   623.7500
2683.7000  1263.9000  1147.2000  697.7800   1175.4000  1225.6000

Source: MATLAB output.

4.2.2.2 Recognition Rate

The recognition rate is used to assess the proportion of correctly recognised images made by the study algorithm. The study used the Cohn-Kanade database to determine the recognition rate of the DWT-PCA/SVD algorithm under the varying constraints.
According to Thakur et al. (2008), the average recognition rate is computed as

RR_avg = (Σ_{k=1}^{26} n_cls^k / (q × n_tot)) × 100

With the number of correctly recognised images Σ_{k=1}^{26} n_cls^k = 154, the number of experimental runs q = 26, and the total number of images in a single run n_tot = 6, the average recognition rate is

RR_avg = (154 / (26 × 6)) × 100 = 98.7%

The average error rate is then

ER_avg = 100 − 98.7 = 1.3%

The average run time (speed) was 10.5 seconds.

Table 4.2: Recognition Rate of Varying Facial Expressions

Angry   Disgust   Fear    Happy   Sad      Surprise
100%    100%      100%    100%    96.15%   96.15%

Plate 4.4: Samples of correctly recognised individuals across the different facial expressions (angry, disgust, fear, happy, sad and surprise, each paired with the neutral pose).

4.3 Numerical Assessment

The algorithm under study is DWT-PCA/SVD with a Gaussian filter. To determine whether there is a significant difference between the various facial expressions and the neutral pose, a repeated measures test is adopted. Statistically, the dataset has to satisfy the assumption of a multivariate normal distribution.

4.3.1 Assessing for Normality

Normality of the dataset is a necessary assumption for the repeated measures design. Considering that the dataset covers multiple facial expressions, the study seeks to determine whether the Euclidean distances from the train images in the database follow the assumed parametric distribution.

4.3.1.1 Test for multivariate normality

To assess the multivariate normality of the data (Euclidean distances), a Doornik-Hansen test was employed, which resulted in a p-value of 0.1445. This suggests that the hypothesis of multivariate normality cannot be rejected at the 5% level. This test is not entirely definitive, so a chi-square plot was used to validate the multivariate normality of the dataset.
4.3.1.2 The Chi-square plot

The ordered squared distances are plotted against the corresponding chi-square quantiles on the Cartesian plane in Figure 4.1; that is, the points (q_c,i((i − 0.5)/n), d²_(i)) are plotted, where q_c,i((i − 0.5)/n) is the 100((i − 0.5)/n)-th quantile of the χ² distribution with the appropriate degrees of freedom.

Figure 4.1 A graph of observed data quantiles against the chi-square quantiles.

The graph in Figure 4.1 shows steady adherence to the unit slope; that is, the data points are close to forming a straight line. Additionally, a correlation test was employed to test the MVN assumption. The r value obtained between the observed data quantiles and the chi-square quantiles is 0.9634, which exceeds the critical value of 0.9591 expected for a sample of size n = 26 at a significance level of 5%. It can therefore be concluded that the assumption of multivariate normality is tenable at the 5% level of significance, although no method of testing multivariate normality is absolutely definitive. Since the MVN assumption has been satisfied, a repeated measures test is used to examine whether there is a difference between the average recognition distances.

4.3.2 Test of Sphericity

In using the repeated measures test, the sphericity assumption must be satisfied. Mauchly's W test gave a p-value of 0.172 at a significance level of 5%, suggesting that the variance of the data is homogeneous across the varying facial expressions; that is, the sphericity assumption is satisfied. The Mauchly's W test is displayed in Table 4.2.

Table 4.2 Mauchly's W test of sphericity

Chi-Square   Degree of freedom   P-value
18.874       14                  0.172

4.3.3 Repeated Measures Design

The repeated measures design employed the six variates of the dataset, computed from the Euclidean distances obtained after recognition.
The test was used to show whether there is a significant difference in the average recognition distances of the various facial expressions from their neutral pose. The result of the test is displayed in Table 4.3.

Table 4.3: Repeated Measures Test

Hotelling trace   Degree of freedom   P-value
4.092             14                  0.009

From Table 4.3, Hotelling's trace test obtained a p-value of 0.009, which leads the study to reject the null hypothesis. It can be concluded that there is enough evidence to support the claim that there exist significant differences in the average recognition distances of the different study constraints. Further analysis was conducted to determine where exactly the differences occur among the various constraints.

4.4 Paired Comparison

A pairwise comparison (post hoc to the repeated measures test) was employed to investigate which facial expression pairs show a significant difference in average recognition distance. The results of the pairwise comparison at the 5% significance level are displayed in Table 4.4.

Table 4.4 Paired Comparison of Varying Facial Expressions

Constraints             Average difference   P-value
Angry vs Disgust        214.237              .050
Angry vs Fear           -61.193              .625
Angry vs Happy          130.195              .344
Angry vs Sad            110.879              .340
*Angry vs Surprise      -359.596             .020
*Disgust vs Fear        -275.430             .030
Disgust vs Happy        -84.042              .398
Disgust vs Sad          -103.358             .230
*Disgust vs Surprise    -573.833             .000
Fear vs Happy           191.388              .097
Fear vs Sad             172.072              .121
Fear vs Surprise        -298.403             .072
Happy vs Sad            -19.316              .866
*Happy vs Surprise      -489.791             .001
*Sad vs Surprise        -470.475             .002

* significant difference exists between constraints at the 5% significance level

From the table, all pairs with p-values less than 0.05 are considered significant.
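A post-hoc pairwise comparison of this kind can be sketched as follows. This is an illustrative Python version using paired t-tests on an assumed distance matrix; the study's exact post-hoc procedure and any p-value adjustment are not reproduced here.

```python
import numpy as np
from scipy import stats

def pairwise_comparisons(distances, labels):
    """Paired comparisons of every pair of expression columns, in the
    spirit of Table 4.4.

    distances : array of shape (n, p), recognition distances per subject
    labels    : list of p expression names
    Returns a list of (pair name, average difference, p-value) tuples.
    """
    results = []
    p = distances.shape[1]
    for i in range(p):
        for j in range(i + 1, p):
            diff = distances[:, i] - distances[:, j]
            t, pval = stats.ttest_rel(distances[:, i], distances[:, j])
            results.append((f"{labels[i]} vs {labels[j]}", diff.mean(), pval))
    return results
```

Pairs whose p-value falls below the chosen significance level are flagged as significantly different, mirroring the starred rows of Table 4.4.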
This means that the pairs (angry, surprise), (disgust, fear), (disgust, surprise), (happy, surprise) and (sad, surprise) are significantly different in their average recognition distances from their neutral pose.

CHAPTER 5
SUMMARY, CONCLUSION and RECOMMENDATION

5.1 Introduction

This chapter is the final phase of the study. It gives a brief summary of the work done and makes some recommendations for future research in the field.

5.2 Summary

The study made use of a Discrete Wavelet Transform and Principal Component Analysis with Singular Value Decomposition (DWT-PCA/SVD) algorithm under varying facial expressions. The study provided a concrete application of DWT-PCA/SVD to images of twenty-six individuals from the Cohn-Kanade AU-Coded Facial Expression database (CKFE). The algorithm was numerically assessed through its recognition rate and run time. It was also evaluated using multivariate statistical techniques to determine the performance of the recognition algorithm on the varying constraints used in the study.

The numerical evaluation in Chapter 4 showed that the recognition rate of the DWT-PCA/SVD algorithm is 98.7%, with a mean error rate of 1.3%. The algorithm also recorded an average run time of 10.5 seconds, which was influenced by the DWT technique employed at the preprocessing stage. The repeated measures design also indicated that, with the DWT-PCA/SVD algorithm, there exist significant differences in recognition distances between the varying expressions and their neutral pose.

Specifically, the post hoc analysis indicated that there are significant differences in the average recognition distances of the (angry, surprise), (disgust, fear), (disgust, surprise), (happy, surprise) and (sad, surprise) poses.
5.3 Conclusion and Recommendations

A modified Principal Component Analysis with Singular Value Decomposition algorithm was successfully implemented in this study, with a Discrete Wavelet Transform used as a noise removal technique during the preprocessing stage. The study evaluated the performance of the face recognition algorithm using multivariate methods. From the study, the Discrete Wavelet Transform is recommended as a feasible noise removal technique worthy of use during the preprocessing stage of pattern or image processing. The parametric test (repeated measures design) employed indicated that there exist significant differences between the mean (average) Euclidean distances of the (angry, surprise), (disgust, fear), (disgust, surprise), (happy, surprise) and (sad, surprise) expressions and their neutral pose. Again, according to the numerical assessment of the study, it can be concluded that DWT-PCA/SVD is an efficient algorithm, with a recognition rate of 98.7%, for the recognition of face images under varying facial expressions.

Further studies can explore the use of the study (DWT-PCA/SVD) algorithm under other constraints such as ageing, occlusion and varying lighting conditions. Other de-noising techniques should also be explored and applied under variable constraints.

REFERENCES

Abdulrahman, M., Gwadabe, T. R., Abdu, F. J., & Eleyan, A. (2014). Gabor wavelet transform based facial expression recognition using PCA and LBP. In IEEE 22nd Signal Processing and Communications Applications Conference (SIU), pp. 2265-2268.

Agrawal, S., Khatri, P., & Gupta, S. (2014). Facial expression recognition techniques: A survey. International Conference on Electrical, Electronic, Computer Science and Mathematics Physical & Management (ICEECMPE), pp. 6-11. Gwalior, India.

Alwakeel, M., & Shaaban, Z.
(2010). Face recognition based on Haar wavelet transform and principal component analysis via Levenberg-Marquardt backpropagation neural network. European Journal of Scientific Research, 42, 25-31.

Amrita, B., & Ghose, M. K. (2014). Expression invariant face recognition using DWT SIFT features. International Journal of Computer Applications, 92(2), 30-32.

Asiedu, L., Oduro, F., Adebanji, A. O., & Mettle, F. O. (2016). A statistical assessment of whitened PCA/SVD under variable environmental constraints. International Journal of Ecological Economics and Statistics, 37(1), 63-79.

Asiedu, L., Adebanji, A. O., Oduro, F., & Mettle, F. O. (2015). Statistical evaluation of face recognition techniques under variable environmental constraints. International Journal of Statistics and Probability, 4(4), 93-111.

Bakhshi, J., Ireland, V., & Gorod, A. (2016). Clarifying the project complexity construct: past, present and future. International Journal of Project Management, 37(7), 1183-1198.

Barnouti, N. H. (2016). Improve face recognition rate using different image pre-processing techniques. American Journal of Engineering Research (AJER), 5(4), 46-53.

Barnouti, N. H. (2016). Face recognition using PCA-BPNN with DCT implemented on Face94 and Grimace databases. International Journal of Computer Applications, 142(6), 8-13.

Barnouti, N. H., Al-Dabbagh, S. S. M., Matti, W. E., & Naser, M. A. S. (2016). Face detection and recognition using Viola-Jones with PCA-LDA and square Euclidean distance. International Journal of Advanced Computer Science and Applications (IJACSA), 7(5).

Bartlett, M. S., Movellan, J. R., & Sejnowski, T. J. (2002). Face recognition by independent component analysis. IEEE Transactions on Neural Networks, 13(6), 1450-1464.

Beham, M. P., & Roomi, S. M. M. (2012). Face recognition using appearance-based approach: A literature survey.
Paper presented at the Proceedings of the International Conference & Workshop on Recent Trends in Technology, Mumbai, Maharashtra, India.

Belhumeur, P., Hespanha, J., & Kriegman, D. (1997). Eigenfaces versus Fisherfaces: Recognition using class specific linear projection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), 711-720.

Bellakhdhar, F., Loukil, K., & Abid, M. (2013). Face recognition approach using Gabor wavelets, PCA and SVM. International Journal of Computer Science Issues, 10(2), 201-207.

Bentin, S., Allison, T., Puce, A., Perez, E., & McCarthy, G. (1996). Electrophysiological studies of face perception in humans. Journal of Cognitive Neuroscience, 8, 551-565.

Bhat, V. S., & Pujari, J. D. (2015). Human face recognition using combined approaches PCA and ICA. Digital Image Processing, 7(10), 301-307.

Bhattacharyya, S. K., & Rahul, K. (2013). Face recognition by linear discriminant analysis. International Journal of Communication Network Security, 2(2), 31-35.

Bledsoe, W. W., & Chan, H. (1965). A man-machine facial recognition system: some preliminary results. Technical report PRI 19a, Panoramic Research, Inc., Palo Alto, California.

Brunelli, R., & Poggio, T. (1993). Face recognition: Features versus templates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 15(10), 1042-1052.

Bute, Y. S., & Jasutkar, R. W. (2012). Implementation of discrete wavelet transform processor for image compression. International Journal of Computer Science and Network (IJCSN), vol. 1.

Chang, C.-C., Chuang, J.-C., & Hu, Y.-S. (2004). Similar image retrieval based on wavelet transformation. International Journal of Wavelets, Multiresolution and Information Processing, 2(2), 111-120.

Chang, K., Bowyer, K., & Flynn, P. (2006). Multiple nose region matching for 3D face recognition under varying facial expression. IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no.
10, pp. 1695-1700.

Chiara, T., Viola Macchi Cassia, F. S., & Leo, I. (2006). Newborns' face recognition: Role of inner and outer facial features. Child Development, 77(2), 297-311.

Chin, S., & Kim, K. (2009). Emotional intensity-based facial expression cloning for low polygonal applications. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 39(3), 315-330.

Choi, J. Y., Plataniotis, K. N., & Ro, Y. M. (2012). Face feature weighted fusion based on fuzzy membership degree for video face recognition. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 42(4), 1270-1282.

Comon, P. (1994). Independent component analysis, a new concept? Signal Processing, Elsevier, 36, 287-314. Special issue on higher-order statistics.

Cox, I. J., Ghosn, J., & Yianilos, P. N. (1996). Feature-based face recognition using mixture-distance. Computer Vision and Pattern Recognition.

Gorodnichy, D. O. (2004). Recognizing faces in video requires approaches different from those developed for face recognition in photographs. Proceedings of NATO IST-044 Workshop on Enhancing Information Systems Security through Biometrics.

Darwin, C. (1872). The expression of the emotions in man and animals. London: John Murray.

Donato, G., Bartlett, M. S., Hager, J. C., Ekman, P., & Sejnowski, T. (1999). Classifying facial actions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(10), 974-989.

Dubey, D., & Tomar, G. S. (2016). Deep perusal of human face recognition algorithms from facial snapshots. International Journal of Signal Processing, Image Processing and Pattern Recognition, 9(9), 103-112.

Ekman, P., & Friesen, W. V. (2003). Unmasking the Face: A Guide to Recognizing Emotions from Facial Clues. Cambridge, MA: Malor Books.

Ekman, P., & Friesen, W. V. (1971). Constants across cultures in the face and emotion. Journal of Personality and Social Psychology, 17(2), 124-129.

Fasel, B., & Luettin, J. (2003).
Automatic facial expression analysis: A survey. Pattern Recognition, 36(1), 259-275.

Fischler, M., & Elschlager, R. (1973). The representation and matching of pictorial structures. IEEE Transactions on Computers, C-22(1), 67-92.

Friesen, W. V. (1972). Cultural differences in facial expressions in a social situation: An experimental test of the concept of display rules. Doctoral dissertation, University of California, San.

Georghiades, A. S., Belhumeur, P. N., & Kriegman, D. J. (2001). From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643-660.

Goldstein, A. J., Harmon, L. D., & Lesk, A. B. (1971). Identification of human faces. Proceedings of the IEEE, 59(5), 748-760.

Gross, R., & Brajovic, V. (2003). An image preprocessing algorithm for illumination invariant face recognition. Paper presented at AVBPA.

Gundimada, S., & Asari, V. (2009). Facial recognition using multisensor images based on localized kernel eigen spaces. IEEE Transactions on Image Processing, 18(6), 1314-1325.

Han, C.-C., Liao, H.-Y. M., Yu, K.-C., & Chen, L.-H. (1997). Fast face detection via morphology-based pre-processing. Lecture Notes in Computer Science, vol. 1311, pp. 469-476. Springer Berlin/Heidelberg.

Ibrahim, R., & Zin, Z. M. (2011). Study of automated face recognition system for office door access control application. Communication Software and Networks (ICCSN), 2011 IEEE 3rd International Conference, pp. 132-136, 27-29 May 2011.

Imtiaz, H., & Fattah, S. A. (2011). A wavelet-domain local feature selection scheme for face recognition. IEEE International Conference on Communications and Signal Processing, pp. 448-451.

Kabir, M. S., Hossain, M. S., & Islam, R. (2016). Performance analysis between PCA and ICA in human face detection. International Journal of Project Management, 37(7), pp.
1183–1198.
Kadam, K. D. (2014). "Face Recognition using Principal Component Analysis with DCT." International Journal of Engineering Research and General Science, ISSN 2091-2730.
Kakade, S. D. (2016). "A review paper on face recognition techniques." International Journal for Research in Engineering Application & Management (IJREAM).
Kakumanu, P., & Bourbakis, N. (2006). "A Local-Global Graph Approach for Facial Expression Recognition." Proceedings of the 18th IEEE International Conference on Tools with Artificial Intelligence, Arlington, pp. 685–692.
Kanade, T. (1973). "Picture processing by computer complex and recognition of human faces." Technical report, Dept. of Information Science, Kyoto University.
Kirby, M., & Sirovich, L. (1990). "Application of the Karhunen-Loève procedure for the characterization of human faces." IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(1), 103–108.
Kjeldsen, R., & Kender, J. R. (1996). "Finding skin in color images." 2nd International Conference on Automatic Face and Gesture Recognition (FG '96), pp. 312–317.
Kshirsagar, V., Baviskar, M., & Gaikwad, M. (2011). "Face recognition using Eigenfaces." 2011 3rd International Conference on Computer Research and Development (ICCRD).
Lai, J., Yuen, P., & Feng, G. (2001). "Face recognition using holistic Fourier invariant features." Pattern Recognition, 34(1), 95–109.
Lawrence, S., Giles, C. L., Tsoi, A. C., & Back, A. D. (1997). "Face recognition: A convolutional neural-network approach." IEEE Transactions on Neural Networks, 8, 98–113.
Leung, T. K., Burl, M. C., & Perona, P. (1998). "Probabilistic affine invariants for recognition." Proc. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 678–684.
Liao, S., Jain, A. K., & Li, S. Z. (2013). "Partial face recognition: Alignment-free approach."
IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(5), 1193–1205.
Mallat, S. (1999). A Wavelet Tour of Signal Processing. New York: Academic Press.
Mashal, N., Faust, M., & Hendler, T. (2005). "The role of the right hemisphere in processing non-salient metaphorical meanings: Application of principal components analysis to fMRI data." Neuropsychologia, 43(14), 2084–2100.
Murtaza, M., Sharif, M., Raza, M., & Shah, J. (2014). "Face recognition using adaptive margin Fisher's criterion and linear discriminant analysis." International Arab Journal of Information Technology, 11(2), 1–11.
Nicholl, P., & Amira, A. (2008). "DWT/PCA face recognition using automatic coefficient selection." 4th IEEE International Symposium on Electronic Design, Test and Applications (DELTA).
Nixon, M. (1985). "Eye spacing measurement for facial recognition." Proceedings of the Society of Photo-Optical Instrumentation Engineers (SPIE), 575(37), 279–285.
Osuna, E., Freund, R., & Girosi, F. (1997). "Training support vector machines: An application to face detection." CVPR, pp. 130–136.
Paul, L. C., & Al Sumam, A. (2012). "Face recognition using principal component analysis method." International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 1(9), 135–139.
Pentland, A., Starner, T., Etcoff, N., Masoiu, N., Oliyide, O., & Turk, M. (1993). "Experiments with Eigenfaces." Proc. Looking at People Workshop, Int'l Joint Conf. on Artificial Intelligence.
Poon, B., Amin, M. A., & Yan, H. (2016). "Improved Methods on PCA Based Human Face Recognition for Distorted Images." Proceedings of the International MultiConference of Engineers and Computer Scientists (1).
Poularikas, A. (1999). "The Hankel transforms." The Handbook of Formulas and Tables for Signal Processing.
Reddy, K. R. L., Babu, G., & Kishore, L. (2011).
"Face recognition based on eigen features of multi-scaled face components and artificial neural network." International Journal of Security and Its Applications, 5(3), 23–44.
Rowley, H. A., Baluja, S., & Kanade, T. (1998). "Neural network-based face detection." IEEE Transactions on Pattern Analysis and Machine Intelligence, 20, 23–38.
Shan, C., Gong, S., & McOwan, P. W. (2005). "Robust facial expression recognition using local binary patterns." IEEE International Conference on Image Processing (ICIP), Genoa, vol. 2, pp. 370–373.
Sharifara, A., Rahim, M. S. M., & Anisi, Y. (2014). "A general review of human face detection including a study of neural networks and Haar feature-based cascade classifier in face detection." Proc. Int. Symp. on Biometrics and Security Technologies (ISBAST), pp. 73–78.
Sharma, N., & Dubey, S. K. (2014). "Face Recognition Analysis Using PCA, ICA and Neural Network." International Journal of Digital Application & Contemporary Research, 2(9).
Singh, N. A., Kumar, M. B., & Bala, M. C. (2016). "Face Recognition System based on SURF and LDA Technique." International Journal of Intelligent Systems and Applications, 8(2).
Sirovich, L., & Kirby, M. (1987). "Low-dimensional procedure for the characterization of human faces." Journal of the Optical Society of America A: Optics, Image Science, and Vision, 4(3), 519–524.
Smith, L. I. (2002). A tutorial on principal components analysis. Available at: http://www.sccg.sk/~haladova/principal_components.pdf
Stonham, T. J. (1986). "Practical face recognition and verification with WISARD." In H. D. Ellis (Ed.), Aspects of Face Processing, pp. 426–441. Kluwer Academic Publishers.
Suwa, M., Sugie, N., & Fujimora, K. (1978). "A preliminary note on pattern recognition of human emotional expression."
Proceedings of the 4th International Joint Conference on Pattern Recognition, November 7–10, 1978, Kyoto, Japan, pp. 408–410.
Thakur, S., Sing, J. K., Basu, D. K., Nasipuri, M., & Kundu, M. (2008). "Face recognition using principal component analysis and RBF neural networks." Proc. IEEE Int. Conf. on Emerging Trends in Engineering and Technology.
Tomkins, S. S. (1962). Affect, Imagery, and Consciousness (Vol. 1: The Positive Affects). New York: Springer.
Tomkins, S. S. (1963). Affect, Imagery, and Consciousness (Vol. 2: The Negative Affects). New York: Springer.
Tomkins, S. S., & McCarter, R. (1964). "What and where are the primary affects? Some evidence for a theory." Perceptual and Motor Skills, 18(1), 119–158.
Turk, M., & Pentland, A. (1991). "Face recognition using eigenfaces." Proceedings CVPR '91, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 3(1), 71–86.
Viola, P., & Jones, M. (2001). "Rapid object detection using a boosted cascade of simple features." Proc. of CVPR, pp. 511–518.
Wadkar, P. D., & Wankhade, M. (2012). "Face Recognition Using Discrete Wavelet Transforms." International Journal of Advanced Engineering Technology, 3, 1–3.
Wagner, P. (2012). Face recognition with Python. Retrieved from http://www.bytesh.de
Wang, J., & Tan, T. (2000). "A new face detection method based on shape information." Pattern Recognition Letters, 21, 463–471.
Yambor, W., Draper, B., & Beveridge, R. (2002). "Analyzing PCA-Based Face Recognition Algorithms: Eigenvector Selection and Distance Measures." In Empirical Evaluation Methods in Computer Vision. Singapore: World Scientific Press.
Yang, J., Yu, H., & Kunz, W. (2000). "An Efficient LDA Algorithm for Face Recognition." Proceedings of the Sixth International Conference on Control, Automation, Robotics and Vision.
Yow, K. C., & Cipolla, R. (1996).
"A probabilistic framework for perceptual grouping of features for human face detection." Int. Conf. on Automatic Face and Gesture Recognition, pp. 16–21.
Yuille, A. L., Cohen, D. S., & Hallinan, P. W. (1989). "Feature extraction from faces using deformable templates." Proc. CVPR, San Diego, CA.
Zhang, C., & Zhang, Z. (2010). A survey of recent advances in face detection. Research technical report.
Zhang, D., Ding, D., Li, J., & Liu, Q. (2015). "PCA Based Extracting Feature Using Fast Fourier Transform for Facial Expression Recognition." Transactions on Engineering Technologies, 413–424. Springer Netherlands.
Zhen, W., & Zilu, Y. (2012). "Facial expression recognition based on adaptive local binary pattern and sparse representation." Proceedings of the IEEE International Conference on Computer Science and Automation Engineering, 2, 440–444.