
FACIAL GESTURE RECOGNITION

Introduction: A facial recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame from a video source. One way to do this is by comparing selected facial features from the image against a facial database. In this presentation, I will be talking about facial gesture recognition, as it is one of the important components of natural human-machine interfaces; it may also be used in behavioral science, security systems and clinical practice. Although humans recognize facial expressions virtually without effort or delay, reliable expression recognition by machine is still a challenge. The problem is challenging because different individuals display the same expression differently. The algorithm considers six universal emotional categories, namely joy, anger, fear, disgust, sadness and surprise.

Methodology: 1. Firstly, the training images are used to create a low-dimensional face space. This is done by performing Principal Component Analysis (PCA) on the training image set and taking the principal components (i.e. the eigenvectors with the largest eigenvalues). PCA is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. The transformation is defined in such a way that the first principal component has the largest possible variance (that is, it accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to (i.e., uncorrelated with) the preceding components. Principal components are guaranteed to be independent only if the data set is jointly normally distributed.
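As a minimal sketch of these properties, the principal components of a small synthetic data set (invented here purely for illustration, since no concrete image set is given) can be obtained from the eigendecomposition of its covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 observations of 3 correlated variables (hypothetical data).
X = rng.normal(size=(200, 3)) @ np.array([[3.0, 0.0, 0.0],
                                          [1.0, 1.0, 0.0],
                                          [0.5, 0.2, 0.1]])
Xc = X - X.mean(axis=0)                 # mean-subtract each variable
cov = np.cov(Xc, rowvar=False)          # 3 x 3 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]       # sort descending by explained variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
scores = Xc @ eigvecs                   # project the data onto the components
```

The covariance of `scores` comes out diagonal, which is exactly the uncorrelatedness property described above, and the first component carries the largest variance.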
PCA is sensitive to the relative scaling of the original variables. It is the simplest of the true eigenvector-based multivariate analyses. Its operation can often be thought of as revealing the internal structure of the data in the way that best explains the variance in the data. If a multivariate dataset is visualized as a set of coordinates in a high-dimensional data space (one axis per variable), PCA can supply the user with a lower-dimensional picture, a "shadow" of this object when viewed from its (in some sense) most informative viewpoint. This is done by using only the first few principal components, so that the dimensionality of the transformed data is reduced.

2. Secondly, projected versions of all the training images are also created. This is done by creating a set of eigenfaces. Eigenfaces are a set of eigenvectors which are derived

from the covariance matrix of the probability distribution over the high-dimensional vector space of possible human faces. A set of eigenfaces can be generated by performing principal component analysis (PCA) on a large set of images depicting different human faces. Informally, eigenfaces can be considered a set of "standardized face ingredients" derived from statistical analysis of many pictures of faces. Any human face can be considered a combination of these standard faces. For example, one's face might be composed of the average face plus 10% of eigenface 1, 55% of eigenface 2, and even -3% of eigenface 3. Remarkably, it does not take many eigenfaces combined together to achieve a fair approximation of most faces. Also, because a person's face is then stored not as a digital photograph but as just a list of values (one value for each eigenface in the database used), much less space is taken up for each person's face.

The eigenfaces that are created will appear as light and dark areas arranged in a specific pattern. This pattern is how different features of a face are singled out to be evaluated and scored. There will be a pattern to evaluate symmetry, whether there is any style of facial hair, where the hairline is, or the size of the nose or mouth. Other eigenfaces have patterns that are less simple to identify, and the image of an eigenface may look very little like a face.
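The eigenface decomposition above can be sketched as follows; random pixel data stands in for a real face database here, and the sizes are chosen only to keep the example small:

```python
import numpy as np

rng = np.random.default_rng(1)
h = w = 16                       # tiny stand-in "faces" (real ones are larger)
n = 40                           # number of training images
faces = rng.random((n, h * w))   # synthetic data in place of a face database

mean_face = faces.mean(axis=0)
A = faces - mean_face            # mean-subtracted images, one per row
# The SVD of the data matrix yields the eigenvectors of the covariance
# matrix directly; each row of Vt is an eigenface (reshapes to h x w).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
eigenfaces = Vt

# Any training face is the mean face plus a weighted combination of
# eigenfaces; the weights are its projection coefficients.
weights = A @ eigenfaces.T
reconstruction = mean_face + weights[0] @ eigenfaces
```

With all components kept, the reconstruction recovers the original image exactly; keeping only the first few rows of `eigenfaces` gives the compressed "list of values" representation described above.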

Computing the eigenvectors:
Performing PCA directly on the covariance matrix of the images is often computationally infeasible. If small, say 100 x 100, greyscale images are used, each image is a point in a 10,000-dimensional space and the covariance matrix S has 10,000 x 10,000 = 10^8 elements. However, the rank of the covariance matrix is limited by the number of training examples: if there are N training examples, there will be at most N - 1 eigenvectors with non-zero eigenvalues. If the number of training examples is smaller than the dimensionality of the images, the principal components can be computed more easily as follows. Let T be the matrix of preprocessed training examples, where each column contains one mean-subtracted image. The covariance matrix can then be computed as S = T T^T, and the eigenvector decomposition of S is given by
S v_i = T T^T v_i = λ_i v_i
However, T T^T is a large matrix, and if instead we take the eigenvalue decomposition of
T^T T u_i = λ_i u_i
then we notice that by pre-multiplying both sides of the equation with T, we obtain
T T^T (T u_i) = λ_i (T u_i)
Meaning that, if u_i is an eigenvector of T^T T, then v_i = T u_i is an eigenvector of S. If we have a training set of 300 images of 100 x 100 pixels, the matrix T^T T is a 300 x 300 matrix, which is much more manageable than the 10,000 x 10,000 covariance matrix. Notice, however, that the resulting vectors v_i are not normalized; if normalization is required, it should be applied as an extra step.

3. Thirdly, a 2-dimensional cross-correlation is performed between the video sequence and an image consisting of only the part of the face and the expressions to be correlated in the sequence. The test image obtained by the correlation is projected onto the face space; as a result, all the test images are represented in terms of the selected principal components. Digital image correlation is an optical method that employs tracking and image registration techniques for accurate 2D and 3D measurements of deformation, displacement and strain from digital images, which makes it important for image processing. Other applications of digital image correlation are in the fields of micro- and nano-scale mechanical testing, thermomechanical property characterization, thermomechanical reliability in electronic packaging, stress management, etc.

Correlation is a mathematical operation that is very similar to convolution. Just as with convolution, correlation uses two signals to produce a third signal. This third signal is called the cross-correlation of the two input signals. If a signal is correlated with itself, the resulting signal is instead called the autocorrelation. Cross-correlation between two signals is a standard approach to feature detection. The amplitude of each sample in the cross-correlation signal is a measure of how much the received signal resembles the target signal at that location. This means that a peak will occur in the cross-correlation signal for every target signal that is present in the received signal.
In other words, the value of the cross-correlation is maximized when the target signal is aligned with the same features in the received signal.
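This peak behaviour can be illustrated with a short 1-D sketch; the signal, the target pattern, and the offset 30 are all made up for the example:

```python
import numpy as np

# A received signal containing a known target pattern at offset 30.
target = np.array([1.0, 2.0, 3.0, 2.0, 1.0])
received = np.zeros(100)
received[30:35] = target
received += 0.05 * np.random.default_rng(2).normal(size=100)  # mild noise

# 'valid' cross-correlation: slide the target along the signal and
# take the inner product at every offset.
xcorr = np.correlate(received, target, mode="valid")
peak = int(np.argmax(xcorr))   # offset where the two signals align best
```

The argmax of the cross-correlation recovers the offset at which the target is embedded, even with the added noise.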

For image-processing applications in which the brightness of the image and template can vary due to lighting and exposure conditions, the images can first be normalized. Normalized correlation is one of the methods used for template matching, a process for finding occurrences of a pattern or object within an image. The peak of the cross-correlation matrix occurs where the images are best correlated.
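A minimal sketch of normalized cross-correlation template matching follows; the image and template are synthetic, and a production system would use an optimized library routine rather than these explicit loops:

```python
import numpy as np

def norm_xcorr(image, template):
    """Normalized cross-correlation of a template slid over an image."""
    th, tw = template.shape
    t = template - template.mean()
    out = np.full((image.shape[0] - th + 1, image.shape[1] - tw + 1), -1.0)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            win = image[i:i + th, j:j + tw]
            w = win - win.mean()                  # brightness normalization
            denom = np.sqrt((w * w).sum() * (t * t).sum())
            if denom > 0:
                out[i, j] = (w * t).sum() / denom
    return out

rng = np.random.default_rng(3)
image = rng.random((20, 20))
template = image[8:12, 5:9].copy()   # cut the template out of the image itself
score = norm_xcorr(image, template)
i, j = np.unravel_index(np.argmax(score), score.shape)
```

Because the template was taken from position (8, 5) of the image, the correlation surface peaks there with a score of exactly 1.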

4. Fourthly, the Mahalanobis distance of a projected test image from each of the projected training images is calculated, and the minimum value is chosen in order to find the training image most similar to the test image. The test image is assumed to fall in the same class as the closest training image. The Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936. It is based on correlations between variables by which different patterns can be identified and analyzed, and it gauges the similarity of an unknown sample set to a known one. It differs from Euclidean distance in that it takes into account the correlations of the data set and is scale-invariant; in other words, it is a multivariate effect size. Formally, the Mahalanobis distance of a multivariate vector x from a group of values with mean μ and covariance matrix S is defined as:

D_M(x) = sqrt( (x - μ)^T S^{-1} (x - μ) )
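A sketch of this definition, using a synthetic correlated "class" of projected images in place of real training data, also demonstrates the scale-invariance mentioned above:

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """D_M(x) = sqrt((x - mean)^T cov^{-1} (x - mean))."""
    d = x - mean
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

rng = np.random.default_rng(4)
# A correlated 2-D "class" of projected training images (synthetic).
samples = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.5, 0.5]])
mean = samples.mean(axis=0)
cov = np.cov(samples, rowvar=False)
d = mahalanobis(samples[0], mean, cov)

# Rescaling a variable (e.g. changing its units by a factor of 10)
# leaves the Mahalanobis distance unchanged, unlike Euclidean distance.
scaled = samples * np.array([10.0, 1.0])
d_scaled = mahalanobis(scaled[0], scaled.mean(axis=0),
                       np.cov(scaled, rowvar=False))
```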

5. Fifthly, in order to determine the intensity of a particular expression, its Mahalanobis distance from the mean of the projected neutral images is calculated. Under this assumption, the greater the distance, the farther the expression is from neutral, and so the stronger the expression can be taken to be.
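This intensity step can be sketched as follows; the neutral-image projections and the two test projections are hypothetical placeholders for real face-space coordinates:

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical face-space projections of a set of neutral faces.
neutral = rng.normal(size=(100, 3))
mu = neutral.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(neutral, rowvar=False))

def intensity(projection):
    """Expression intensity = Mahalanobis distance from the neutral mean."""
    d = projection - mu
    return float(np.sqrt(d @ cov_inv @ d))

mild = mu + 0.5 * np.ones(3)     # close to neutral -> weak expression
strong = mu + 3.0 * np.ones(3)   # far from neutral -> strong expression
```

A projection at the neutral mean scores zero, and projections farther from it score progressively higher, matching the assumption in step 5.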

Comparison with other distance-based approaches: A reference model is formed for each gesture by generating a reference template (a mean vector and a covariance matrix) from the feature-vector representations. Each test feature vector is compared against a reference model by a distance measure or by probability estimation. Regarding the distance measure, four variations according to different usage of the covariance matrix are studied: the City block (CBD), the Euclidean (ED), the Weighted Euclidean (WED), and the Mahalanobis (MD) distance measures. Euclidean and Mahalanobis distance methods identify the interpolation regions assuming that the data is normally distributed (10, 11), while City-block distance assumes a triangular distribution. Mahalanobis distance is unique because it automatically takes into account the correlation between descriptor axes through a covariance matrix; other approaches require the additional step of PC rotation to correct for correlated axes. City-block distance is particularly useful for discrete types of descriptors. Of the four base distance measures, there appears to be a significant improvement with the Mahalanobis distance.

Disadvantages of Mahalanobis distance: The drawback of the Mahalanobis distance is the equal adding up of the variance-normalized squared distances of the features. In the case of noise-free signals this leads to the best possible performance. But if a feature is distorted by noise, the squaring of the distances means that a single feature can take such a high value that it drowns out the information provided by the other features and leads to a misclassification. Therefore, to find classification procedures that are more robust to noise, we have to find a distance measure which gives less weight to the noisy features and more weight to the clean features. This can be achieved by comparing the different input features to decide which features should be given less weight or be excluded, and which should have more weight.
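The four distance measures can be sketched side by side; the feature vectors are synthetic, and the per-feature weighting shown for WED (dividing by each feature's variance) is one common reading of the weighted Euclidean measure:

```python
import numpy as np

def city_block(x, mu, cov):          # CBD: sum of absolute differences
    return float(np.abs(x - mu).sum())

def euclidean(x, mu, cov):           # ED: ignores the covariance entirely
    return float(np.linalg.norm(x - mu))

def weighted_euclidean(x, mu, cov):  # WED: per-feature variance weighting
    return float(np.sqrt((((x - mu) ** 2) / np.diag(cov)).sum()))

def mahalanobis(x, mu, cov):         # MD: full covariance, scale-invariant
    d = x - mu
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

rng = np.random.default_rng(6)
ref = rng.normal(size=(200, 4))      # hypothetical gesture feature vectors
mu, cov = ref.mean(axis=0), np.cov(ref, rowvar=False)
x = rng.normal(size=4)               # a test feature vector
scores = {f.__name__: f(x, mu, cov) for f in
          (city_block, euclidean, weighted_euclidean, mahalanobis)}
```

Only MD (and WED, partially) uses the covariance matrix; CBD and ED ignore it, which is the difference the comparison above turns on.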
