
A Comparative Survey on Supervised Classifiers for Face Recognition

Miguel F. Arriaga-Gómez*, Ignacio de Mendizábal-Vázquez*, Rodrigo Ros-Gómez† and Carmen Sánchez-Ávila*

* Group of Biometrics, Biosignals and Security, GB2S
Centro de Domótica Integral
Universidad Politécnica de Madrid
Campus de Montegancedo. Pozuelo de Alarcón, 28223, Madrid
Email: {marriaga, imendizabal, csa}@cedint.upm.es

† ETS Ingenieros de Telecomunicación
Universidad Politécnica de Madrid
28040, Madrid
Email: rodrigo.ros@ieee.org

Abstract—During the last decades, several different techniques have been proposed for computer recognition of human faces. A further step in the development of these biometrics is to implement them in portable devices, such as mobile phones. Due to these devices' features and limitations, it is necessary to select, among the currently available algorithms, the one with the best performance in terms of overall elapsed time and correct identification rates. The aim of this paper is to offer a study complementary to previous works, focusing on the performance of different supervised classifiers, such as the Normal Bayesian Classifier, neural architectures or distance-based algorithms. In addition, we analyse the efficiency of all the proposed algorithms over public face databases (ORL, FERET, NIST and the Face Recognition Data from the University of Essex). Each of these databases contains a different number of individuals and samples, and they present variations among images from the same user (scale, pose, expression, illumination, ...). We thereby expect to simulate many of the situations which arise when dealing with face recognition on mobile phones. In order to obtain a complete comparison, all the proposed algorithms have been implemented and run over all the databases on the same computer. Different parametrizations for each algorithm have also been tested. Bayesian classifiers and distance-based algorithms turn out to be the most suitable, as their parametrization is simple, their training stage is not as time-consuming as that of the others, and their classification results are satisfying.

Keywords—Biometrics, face recognition, supervised classifiers, machine learning, PCA, LDA.

I. INTRODUCTION

In recent years many face-based algorithms have been developed, and several studies regarding their performance exist [1]. Due to the increasing functionality of mobile devices, the implementation of these systems in cellphones is becoming a general demand. In order to test and assess the developed systems, a large number of face image databases have been created and made public for general use. Furthermore, complete independent evaluation protocols for face-recognition algorithms, such as the FERET evaluation procedure [2], [3], have also been designed.

Facial feature extraction methods can be classified into two main categories, according to the way the face patterns are obtained [4]: the "holistic approach" (template matching), which considers the whole face region in an image as the system's input data and represents each face as a vector whose components codify the grey level of each face pixel; and the "feature approach" (geometric, feature-based matching), which establishes certain facial landmarks related to face elements, such as eyes, nose, mouth and ears, and computes features as distances between landmarks, relative positions or element sizes. A large majority of these systems are based on the holistic approach because, as remarked in [5], the template approach is more reliable than the feature-based one and its implementation is simpler. Many algorithms, based on different similarity measures, can be used. In most cases, a previous image equalization must be carried out to avoid the effects of face illumination.

The problem of face identification consists of two main tasks: the face detection task, in which a face is located within an image; and the face recognition task, in which the previously located face is linked to an individual who has already been enrolled in the system. These tasks are carried out in three steps:

1) The detection of a face within an image.
2) The image normalization and the face feature extraction (pattern construction).
3) The individual identification (pattern recognition).

There are many research works about face biometrics that focus on face detection algorithms, as well as on the capabilities of classical facial feature extraction methods such as Principal Component Analysis or Linear Discriminant Analysis [6]–[9]. However, we have found that previous research pays less attention to the choice of the pattern classifier. In addition, the developed techniques are often tested against small private databases, specially drawn up for testing purposes and following different evaluation protocols. In this situation, comparing algorithm performance is not meaningful.

In this work we focus on the performance of different supervised classifiers in terms of correct identification rates and overall elapsed time. These performance figures are calculated over 5 public face databases which present scale, pose, expression or illumination variations among images from the same individual.

Biometric systems can operate in two modes: identification and verification. In both cases, a sample template must be constructed by the system from a face image captured during the identification/verification process. If the user has previously been enrolled in the system, there must also be a user pattern template, stored in a database during the enrollment process. The user template construction process follows the scheme shown in Fig. 1.

Fig. 1. The template construction process comprises at least 3 stages: image preprocessing, face detection and feature reduction. An additional feature selection stage is sometimes performed.

Face templates are classified into user groups during the identification process. The proposed classifiers represent the most commonly used pattern recognition schemes (Fig. 2).

Fig. 2. The most frequently used classifiers are tested: Nearest Neighbours, Multilayer Perceptron and Normal Bayesian Classifier.

The overall structure of this paper comprises section II, in which all the implementation and parametrization details are explained; section III, where further information about the selected databases' features is offered; section IV, presenting the main results of the study; and section V, with the main conclusions of the work.


II. METHODS

A. Implementation details

All the analysed algorithms have been implemented using the OpenCV library over the Java language, and the testing cycles have been carried out on an Intel Core 2 Quad @ 2.66 GHz with 6 GB of RAM.

B. Image preprocessing

Prior to using the holistic approach to extract features from face images, some considerations must be taken into account:

• The pixel grey levels which conform the feature vectors are associated to pixel positions. It is therefore important that face elements are aligned among different face images.

• For the purpose of extracting facial features, it is recommendable that the input image contains only face pixels. In this sense, the face must be located, and the face surroundings, such as hair, clothes or background, should be removed.

• So as to improve image comparison performance, face images must have the same size.

• Differences in brightness and illumination must be normalized, and image contrast must be improved.

Hence, we propose the following image preprocessing steps (a code sketch in Python follows the list):

1) All the images in the database are converted to grey scale and normalized by using Histogram Equalization, as suggested in [10], because different lighting conditions can modify pixel grey levels and provide different feature vectors for the same face image.

2) After detecting a face within an image, the face area is cut out, and both the background and the face surroundings are suppressed to avoid extracting features from non-faces.

3) The face-region image is scaled to 92 × 112 pixels, in order to compare colour intensity between related pixels.
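The three steps above can be illustrated with the following minimal sketch. The paper's implementation uses OpenCV from Java; OpenCV's Python bindings are used here for brevity, and the cascade file and the biggest-face crop policy are assumptions for the sketch rather than the authors' exact configuration.

```python
import cv2

# Assumed path to one of the frontal-face cascades bundled with OpenCV.
CASCADE_PATH = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
face_cascade = cv2.CascadeClassifier(CASCADE_PATH)

def preprocess(image_bgr):
    """Grey conversion, histogram equalization, face crop and resize,
    following steps 1)-3) above. Returns None if no face is found."""
    grey = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)   # step 1: grey scale
    grey = cv2.equalizeHist(grey)                        # step 1: histogram equalization
    faces = face_cascade.detectMultiScale(grey, scaleFactor=1.1, minNeighbors=5)
    if len(faces) == 0:
        return None
    # step 2: keep the biggest detected face, discarding the surroundings
    x, y, w, h = max(faces, key=lambda r: r[2] * r[3])
    face = grey[y:y + h, x:x + w]
    # step 3: scale to the common 92 x 112 template size
    return cv2.resize(face, (92, 112))
```

Equalizing before detection matches the observation in section IV that Histogram Equalization also raises detection rates.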
C. Face detection

With the aim of extracting a proper user template from an image, it is quite important to locate the face area within the image. The most popular method to perform this task is the Viola-Jones algorithm, described in [11]. This algorithm is a supervised learning method which uses certain predefined features (Haar-like features in the original algorithm) to detect any object after a training phase.

For our purpose we need to locate the biggest complete and correctly aligned face in each image. To achieve this objective we have used the pretrained cascade classifiers provided by the OpenCV library: a frontal face cascade classifier based on Local Binary Patterns, and left eye, right eye and nose classifiers based on Haar-like features. A sketch of this selection logic is given below; face detection results are detailed in section IV.
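As an illustration, selecting the biggest aligned face can be sketched as follows with OpenCV's Python bindings (the paper's implementation is Java over OpenCV). The cascade file names and the eye-based alignment test are assumptions made for the sketch; the paper additionally employs a nose classifier.

```python
import cv2

# Cascade files shipped with OpenCV; the paths are assumptions and may
# need adjusting to the local installation (the paper uses the LBP
# frontal-face cascade plus Haar eye and nose cascades).
face_cascade = cv2.CascadeClassifier("lbpcascade_frontalface.xml")
left_eye_cc  = cv2.CascadeClassifier("haarcascade_lefteye_2splits.xml")
right_eye_cc = cv2.CascadeClassifier("haarcascade_righteye_2splits.xml")

def biggest_aligned_face(grey):
    """Return the biggest face rectangle that also contains both eyes
    (a rough proxy for 'complete and correctly aligned'), else None."""
    faces = face_cascade.detectMultiScale(grey, 1.1, 5)
    for (x, y, w, h) in sorted(faces, key=lambda r: -r[2] * r[3]):
        roi = grey[y:y + h, x:x + w]
        # eyes are searched for only in the upper half of the face region
        upper = roi[: h // 2, :]
        if (len(left_eye_cc.detectMultiScale(upper, 1.1, 3)) > 0 and
                len(right_eye_cc.detectMultiScale(upper, 1.1, 3)) > 0):
            return (x, y, w, h)
    return None
```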

D. Feature reduction and selection: Principal Component Analysis and Linear Discriminant Analysis

User templates are obtained from the individual's images after the preprocessing steps. Each feature represents the grey value of an image pixel; therefore, a 92 × 112 pixel image can be represented as a vector of 10304 integer numbers.

Since supervised classifiers need to be trained for each object classification, training time depends strongly on the templates' length. In the same way, classifying times increase quickly as template size grows. In addition, not all the pixels in a face image provide relevant information for classification. For those reasons, it is crucial to find a feature reduction method which preserves the most differentiating features and discards the worst ones.

One of the classical proposals, explained in [9], is the Eigenfaces model, based on the Karhunen-Loève transform. This approach, the so-called Principal Component Analysis (PCA), assumes that the most distinguishing facial features may not be related to our intuitive idea of "feature", and can instead be obtained by explaining the variance of a face image set. The spectral decomposition of the face image set provides new "artificial" features, in the sense that they do not correspond to any visual feature, but to linear combinations of pixel grey intensities which maximize the explained variance. The resulting eigenvectors provide a new vector basis onto which any face image can be projected in order to obtain its new features (the so-called Principal Components). There is a close relationship between the number of chosen components and the percentage of explained variance: when feature reduction is applied, some variance in the feature set cannot be explained; likewise, to explain all the variance, all the components must be computed.

In our proposal, we have distinguished two operation modes: the tough mode, in which 80% of the variance is explained, and a conformist mode, in which only 50% is explained.

Besides the feature reduction, a feature selection stage can be performed. Algorithms based on Fisher's Linear Discriminant Analysis (also known as FLD or LDA) construct a "feature space" as a subspace of the image space and project the images into the new space. The feature space is determined by calculating the best projection directions, in the sense of maximizing between-class variance while minimizing within-class variance, by applying LDA to the original face image set or to the feature vector set obtained after PCA reduction. According to [12], the resulting LDA-selected features are robust against lighting variations and facial expressions. A sketch of this reduction stage is given below.
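The reduction stage can be sketched with plain NumPy: PCA is computed on the training vectors and the smallest number of components reaching the target explained variance (0.8 in the tough mode, 0.5 in the conformist mode) is kept. This is an illustrative sketch, not the authors' Java/OpenCV code; the optional LDA step could then be applied to the projected vectors, for instance with scikit-learn's LinearDiscriminantAnalysis (an assumption, named here only as one possible tool).

```python
import numpy as np

def pca_reduce(X, explained=0.8):
    """PCA via SVD: keep the fewest components whose cumulative explained
    variance reaches `explained` (0.8 = tough mode, 0.5 = conformist mode).
    X holds one 10304-long face vector per row."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # economy-size SVD; singular values give the component variances
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = (S ** 2) / (len(X) - 1)
    ratio = np.cumsum(var) / var.sum()
    k = int(np.searchsorted(ratio, explained)) + 1
    W = Vt[:k].T                  # projection matrix (10304 x k)
    return mean, W, Xc @ W        # projected training features (n x k)
```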
E. Classifiers

The aim of this paper is to compare the performance of the most extended supervised pattern classification algorithms. A classifying algorithm is considered supervised if its operation is based on a previous training. During this training stage, a set of properly identified images is given to the classifier, in order to make it associate certain feature values to the corresponding user. Among supervised algorithms, the following stand out (a common training sketch follows the list):

1) Distance-based algorithms make use of the simplest way of classifying a sample template in a certain space: computing a similarity measure or distance to all the existing pattern templates (considered as vectors in a multidimensional space) in order to determine the closest one. The k-Nearest Neighbours (kNN) method is one of the most popular distance-based classifying algorithms [13]. Given a positive integer k and a sample template (feature vector), the k training templates with the smallest distance to the sample are selected, and the sample template is identified as the most repeated user label among the k selected templates. The most common distances are the Euclidean and the Mahalanobis distance (the latter frequently used with random variables, as it takes their correlation into account). A particular case of this method is the Nearest Neighbour Algorithm (NNA), in which k = 1. In this proposal, kNN with k = 5 and the NNA are tested.

2) Neural networks are adaptive learning structures composed of a fixed number of interconnected computing units called neurons, capable of approximating non-linear output signals (classification results) from input signals (feature vectors). The relevance of each connection is characterized by a parameter called weight. The weights are tuned by a supervised learning algorithm during the training stage. The Multilayer Perceptron [14] is a neural network in which the neurons are arranged in layers: an input layer, where input data (feature vectors) are provided; an output layer, where classification predictions are obtained; and a certain number of inner or hidden layers, in which the data are transformed. Information flows in one direction, from the input to the output layer (feed-forward networks). The neural structure chosen in this proposal is a Multilayer Perceptron with three layers: the input one, with as many neurons as features remaining after feature reduction; the output layer, with as many neurons as users in the database; and one hidden layer. The number of neurons nH in the hidden layer is set according to the Lippmann criterion, which establishes nH = 2·f + 1, where f is the number of input features.

3) The Normal Bayesian Classifier (NBC) is based on the idea that the feature vectors of each user are normally distributed. The data distribution function is therefore a Gaussian mixture with one component per class [15]. Mean vectors and covariance matrices are learned for each class during the training stage, and in the classification stage the trained model is used to establish the user under whose class-conditional density the sample template has the highest probability. An advantage of these models is that all the parameters (means and covariances) are easily estimated from a few training patterns.
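The sketch below trains the three classifier families above with OpenCV's ml module, again using the Python bindings for brevity rather than the authors' Java code. The kNN shown is Euclidean (the Mahalanobis and weighted variants are omitted), and everything except k = 5 and the Lippmann rule nH = 2·f + 1 is an assumed default.

```python
import numpy as np
import cv2

def train_classifiers(X, y, n_users):
    """X: float32 matrix with one reduced feature vector per row;
    y: int32 column of user labels in [0, n_users)."""
    f = X.shape[1]

    # 1) k-Nearest Neighbours over the Euclidean distance (k = 1 gives the NNA)
    knn = cv2.ml.KNearest_create()
    knn.train(X, cv2.ml.ROW_SAMPLE, y)

    # 2) Multilayer Perceptron: f input neurons, one hidden layer sized by
    #    the Lippmann criterion nH = 2*f + 1, and one output per user
    mlp = cv2.ml.ANN_MLP_create()
    mlp.setLayerSizes(np.array([f, 2 * f + 1, n_users], dtype=np.int32))
    mlp.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)
    mlp.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP)
    # termination settings are an assumed default, not the paper's values
    mlp.setTermCriteria((cv2.TERM_CRITERIA_MAX_ITER + cv2.TERM_CRITERIA_EPS,
                         300, 1e-4))
    targets = np.full((len(y), n_users), -1.0, np.float32)
    targets[np.arange(len(y)), y.ravel()] = 1.0   # one "+1" per true class
    mlp.train(X, cv2.ml.ROW_SAMPLE, targets)

    # 3) Normal Bayesian Classifier: one Gaussian fitted per user class
    nbc = cv2.ml.NormalBayesClassifier_create()
    nbc.train(X, cv2.ml.ROW_SAMPLE, y)
    return knn, mlp, nbc

def identify(knn, mlp, nbc, sample):
    """sample: 1 x f float32 template; returns one user label per classifier."""
    _, knn_res, _, _ = knn.findNearest(sample, k=5)
    _, mlp_out = mlp.predict(sample)
    _, nbc_res = nbc.predict(sample)
    return int(knn_res[0, 0]), int(mlp_out.argmax()), int(nbc_res[0, 0])
```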
F. Evaluation

The proposed system is evaluated at four milestones:

1) Face detection stage. The number of images in which a face area can be located is computed for each database, and the overall correct detection rates are presented. For those datasets containing images with both angled faces and frontal faces, frontal detection rates are calculated as well.

2) Enrollment stage. In this stage, each user's pattern is generated from a certain number, n, of feature vectors (n is, hence, the number of user samples required to complete the enrollment process). However, depending on the value of n, not all the individuals in the considered databases present at least n properly captured face images. In section IV, correct enrollment rates are presented.

3) Identification process. All the images in the testing set are classified with all the proposed classifiers. For each method, a correct identification rate is computed.

4) Verification process. In this stage, each image in the testing set is presented to the system with each of the system users' IDs. Each image-ID pair is a verification attempt. In some of these attempts the image's real ID matches the presented user ID (genuine attempt), whereas in the others it does not (intruder attempt). On each verification attempt, the presented image's template (sample) is compared with the stored user template (pattern). The verification attempt is considered "successful" by the system if the difference between these templates is smaller than a prefixed threshold. A successful attempt is called a True Positive (TP) attempt if the user is genuine, and a False Positive (FP) attempt if the user is an intruder. An unsuccessful attempt is called a True Negative (TN) attempt if the user is an intruder, and a False Negative (FN) attempt if the user is genuine. These amounts allow the calculation of the False Acceptance Rate (FAR) and the False Reject Rate (FRR) for a given threshold. By modifying the threshold value, it is possible to determine the Equal Error Rate (EER) and the optimal threshold (θ); a sketch of this search follows the list. Verification results using the optimal threshold are presented in section IV.
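The threshold search in milestone 4 reduces to a one-dimensional sweep. The sketch below is a NumPy illustration, under the assumption that the Euclidean sample-to-pattern distances of genuine and intruder attempts have already been collected; it returns the EER and the optimal threshold θ.

```python
import numpy as np

def eer_and_threshold(genuine_d, intruder_d):
    """Sweep the acceptance threshold over all observed distances and
    return the operating point where FAR and FRR cross (the EER),
    together with the corresponding optimal threshold theta."""
    thresholds = np.sort(np.concatenate([genuine_d, intruder_d]))
    gap, eer, theta = np.inf, 1.0, thresholds[0]
    for t in thresholds:
        far = np.mean(intruder_d < t)    # False Acceptance Rate: FP / (FP + TN)
        frr = np.mean(genuine_d >= t)    # False Reject Rate:     FN / (FN + TP)
        if abs(far - frr) < gap:
            gap = abs(far - frr)
            eer, theta = (far + frr) / 2.0, t
    return eer, theta
```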
III. DATABASES

During the assessment stage of new techniques, in order to compare the performance of several methods, it is recommendable to use a standard testing data set. There are many databases currently in use, and each one has been developed under a different set of requirements (a complete list can be found in [16] and http://www.face-rec.org/databases/). In this work, the proposed algorithms are tested using 5 datasets: the Essex University Collection of Face Images (EUCFI, so-called Face Recognition Data, http://cswww.essex.ac.uk/mv/allfaces/index.html), the AT&T Database of Faces (formerly known as the ORL Database of Faces, http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html) [17], the FERET (black and white) database and the Color FERET database (http://www.nist.gov/itl/iad/ig/colorferet.cfm) [2], [3], as well as the NIST Mugshot Identification Database (NIST MID, http://www.nist.gov/srd/nistsd18.cfm) [18]. Table I summarizes the most important features of these sets.

TABLE I. FACE DATABASES USED FOR TESTING THE ALGORITHMS PRESENT VARIATIONS BETWEEN USER IMAGES IN ACCESSORIES (A), BACKGROUND (B), EXPRESSIONS (E), LIGHTING (L), POSE (P), SCALE (S) AND TIME (T)

Database           Users   Image size   BW/Color          Variations
Faces 94 (EUCFI)   153     180 × 200    Color             E
Faces 95 (EUCFI)   72      180 × 200    Color             E/L/S
Faces 96 (EUCFI)   152     196 × 196    Color             B/L/S
Grimace (EUCFI)    18      180 × 200    Color             E/P
ORL                40      92 × 112     256 grey levels   A/E/L/T
FERET BW           1204    256 × 384    256 grey levels   E/L/T/P
Color FERET        1199    256 × 384    Color             E/L/T/P
NIST MID           1573    Varying      256 grey levels   E/P/S

In this work, only the subsets of frontal images of the FERET and NIST databases are used, as identification processes through mobile phones are usually collaborative.

IV. RESULTS

A. Users' enrollment from face images

To complete the enrollment process, it is necessary to obtain a certain number n of "valid face" images from the provided user's image set. An image is considered a "valid face" if, after the preprocessing and detection stages, it contains a complete, centered and properly aligned face (the sketch closing this subsection illustrates the criterion).

However, not all the face images provided by the user are useful for the enrollment. What is more, many users are not capable of enrolling into the system at all, due to many factors.

As we pointed out in section II-B, image Histogram Equalization normalizes the user features. In addition, it increases the detection algorithm's performance, as shown in Table II.

TABLE II. HISTOGRAM EQUALIZATION IMPROVES DETECTION RATES

Database      # images   # detected faces           % of improvement
                         Without HE   With HE
EUCFI         7856       7704         7743          0.49%
ORL           400        305          315           2.5%
FERET BW      14051      8112         8559          3.18%
Color FERET   11338      6049         6354          2.69%
NIST          1309       1197         1198          0.07%

Apparently, this is not a significant improvement, but an image discarded during the preprocessing phase could imply a user's enrollment failure.

Table III shows the number of users from each database who achieve a successful enrollment (i.e., who are able to provide n "valid faces" to the system), as a function of n.

TABLE III. THE AMOUNT OF CORRECTLY ENROLLED USERS DECREASES AS THE NUMBER OF ENROLLMENT REQUIRED SAMPLES GROWS

Database      # users   Using 3 faces (n = 3)   Using 5 faces (n = 5)
                        #      %                #      %
EUCFI         394       393    99.75%           393    99.75%
ORL           40        37     92.5%            32     80%
FERET BW      1204      373    30.98%           121    10.05%
Color FERET   994       244    24.55%           71     7.14%
NIST          518       82     15.83%           46     8.88%
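As mentioned at the beginning of this subsection, the enrollment criterion can be sketched on top of the hypothetical preprocess() helper from the sketch in section II-B (an assumption of this illustration, not the authors' code):

```python
def enroll(user_images, n=3):
    """Collect feature vectors from a user's images; enrollment succeeds
    only if at least n of them yield a valid face (preprocess() returns
    None for images in which no valid face is found)."""
    templates = [preprocess(img) for img in user_images]
    valid = [t.reshape(-1) for t in templates if t is not None]
    if len(valid) < n:
        return None          # enrollment fails for this user
    return valid[:n]         # n raw feature vectors, prior to PCA/LDA
```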
B. Influence of the training set size

Enrolling users in the system by using a different number of face images results in a reduction of correctly enrolled users, but also in an increase of the number of training images per enrolled user. These two factors introduce variations in the sizes of the training sets. According to this fact and the results in section IV-A, the training datasets' sizes are shown in Table IV.

TABLE IV. SIZE OF THE TRAINING DATASETS WITH DIFFERENT ENROLLMENT IMAGE AMOUNTS

Database      Using 3 images   Using 5 images
EUCFI         1179             1965
ORL           111              160
FERET BW      1119             605
Color FERET   732              355
NIST          246              230

The main consequence of the variation in training set size is the corresponding variation in training time. The dependence between dataset size and training time is not linear, as seen in Fig. 3.

Fig. 3. System training time has a non-linear dependence on the training dataset size. (Time is represented in logarithmic scale.)
C. Influence of the number of extracted features

In order to reduce system training times, a smaller number of computed features at the PCA reduction stage is considered. The system is first tested with a number of features that explains 80% of the feature variance (tough operating mode), and the obtained values are compared with a second test in which the explained feature variance is decreased to 50% (conformist operating mode).

This decrease in the number of features reduces the system training time by about 25%, but it also entails an average reduction of more than 30% in the identification rates. Table V shows these facts.

TABLE V. SYSTEM TRAINING TIME DECREASES AS THE FEATURE NUMBER DECREASES. SO DO THE CORRECT IDENTIFICATION RATES.

Enrollment   Database      Selected features      Decrease (%)
images                     Tough   Conformist     Time    Ident. rate
n = 3        EUCFI         58      9              34.75   11.45
             ORL           27      7              24.31   24.38
             FERET BW      58      6              32.57   41.65
             Color FERET   64      6              29.02   48.84
             NIST          51      8              23.57   47.33
n = 5        EUCFI         61      9              16.13   8.74
             ORL           32      7              16.28   20.72
             FERET BW      51      6              37.32   27.25
             Color FERET   50      6              28.75   37.96
             NIST          49      7              15.94   47.15

On the other hand, the reduction of the number of used features produces a decrease of the verification thresholds, simply because Euclidean distances in lower-dimensional spaces are generally smaller.

D. Influence of the database features on the selected thresholds

The value of the distance thresholds selected for the verification stage represents how well separated each user's images are from the others'. In other words, large values of this quantity mean that the images of one user form a well-determined cluster within the image set, whereas small values show that images of different users are relatively close.

However, as we previously stated, this parameter depends strongly on the number of used features. What is more, the optimal value is determined experimentally, as the Euclidean distance value for which the Equal Error Rate is achieved. Both values (threshold and EER) must therefore be analyzed together. Table VI shows the computed values for each database.

TABLE VI. EER AND OPTIMAL THRESHOLD (θ) IN THE TOUGH MODE, USING 3 ENROLLMENT IMAGES

Database      EER (%)   θ
EUCFI         3.90      4039.63
ORL           11.84     3997.70
FERET BW      4.04      4592.17
Color FERET   10.90     7200.00
NIST          10.95     11062.21

When the feature number is reduced, the discriminative power of the classification algorithm decreases, and the optimal thresholds of all the databases tend to become equal.
V. CONCLUSION

We have implemented the most extended face recognition schemes over a common architecture, and we have tested all of them over public, general-use and literature-referenced face databases in order to compare their performance rates in different environment situations. All the experiments have been carried out on the same computer in order to establish a time consumption comparison.

As a result of this exhaustive study, it is possible to establish the best performing algorithms for each situation, in terms of Correct Matching Rate (Tables VII and VIII).

TABLE VII. BEST CLASSIFIER CHOICE IN TOUGH MODE. CMR = CORRECT MATCHING RATE. Euc STANDS FOR EUCLIDEAN DISTANCE AND Mah FOR MAHALANOBIS'

Explained feature variance = 0.8
              n = 3                       n = 5
Database      Algorithm        CMR (%)   Algorithm        CMR (%)
EUCFI         LDA+NBC          97.36     LDA+NBC          98.35
ORL           LDA+NBC          80.80     LDA+MLP          88.80
FERET BW      LDA+NBC          88.04     NBC              86.83
Color FERET   kNN (Mah)        71.82     kNN (Mah)        84.57
NIST          NBC              36.72     kNN (Mah)        43.48

TABLE VIII. BEST CLASSIFIER CHOICE IN CONFORMIST MODE. CMR = CORRECT MATCHING RATE. Euc STANDS FOR EUCLIDEAN DISTANCE AND Mah FOR MAHALANOBIS'

Explained feature variance = 0.5
              n = 3                       n = 5
Database      Algorithm        CMR (%)   Algorithm        CMR (%)
EUCFI         LDA+kNN (Euc)    85.71     LDA+kNN (Euc)    89.75
ORL           kNN (Euc)        61.10     LDA+kNN (Mah)    70.4
FERET BW      LDA+kNN (Euc)    51.37     kNN (Euc)        63.17
Color FERET   LDA+kNN (Euc)    36.74     kNN (Euc)        52.47
NIST          NBC              19.34     PCA+kNN (Euc)    22.98

However, we consider it important to point out some facts:

It is not easy to reach a clear decision about the use of LDA (so often discussed in the pattern recognition literature [19], [20]) as a feature selection method.

These rates are very sensitive to the algorithm parametrization. As seen in section IV, slight changes in the number of enrollment images or extracted face features can lead to important changes in the results, since fundamental operating parameters (such as the verification distance threshold) have a huge variability.

For this reason, it would be naive to expect a single algorithm to outperform all the others in every situation. Facial expressions, lighting and pose changes between different images of an individual have a higher influence on system performance than the algorithm itself. When the holistic approach is used, a high degree of user collaboration is required. Nevertheless, distance-based and Bayesian classifiers seem to have a promising overall performance.

This leads to a future research line, aimed at defining a face representation scheme that allows a robust face template construction, less dependent on the environment and image capture circumstances.
ACKNOWLEDGMENTS

The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 610713.

Portions of the research in this paper use the FERET database of facial images collected under the FERET program, sponsored by the DOD Counterdrug Technology Development Program Office.

The authors would also like to thank AT&T Laboratories Cambridge for the use of the ORL Database, as well as Libor Spacek for the use of Essex University's Collection of Facial Images (Face Recognition Data).
REFERENCES

[1] J. P. Maurya and S. Sharma, "A Survey on Face Recognition Techniques," Computer Engineering and Intelligent Systems, vol. 4, no. 6, pp. 11–17, 2013.
[2] P. J. Phillips, H. Wechsler, J. Huang, and P. J. Rauss, "The FERET database and evaluation procedure for face-recognition algorithms," Image and Vision Computing, vol. 16, no. 5, pp. 295–306, 1998.
[3] P. J. Phillips, H. Moon, S. A. Rizvi, and P. J. Rauss, "The FERET evaluation methodology for face-recognition algorithms," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, 2000.
[4] T. Chauhan and S. Sharma, "Literature Report on Face Detection with Skin & Reorganization using Genetic Algorithm," International Journal of Advanced and Innovative Research, vol. 2, no. 2, pp. 256–262, 2013.
[5] R. Brunelli and T. Poggio, "Face Recognition: Features versus Templates," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 10, pp. 1042–1052, 1993.
[6] F. Ahmad, A. Najam, and Z. Ahmed, "Image-based Face Detection and Recognition: State of the Art," International Journal of Computer Science Issues, vol. 9, no. 6, pp. 3–6, 2013.
[7] E. Hjelmås and B. K. Low, "Face Detection: A Survey," Computer Vision and Image Understanding, vol. 83, no. 3, pp. 236–274, Sep. 2001.
[8] S. Dabbaghchian, M. P. Ghaemmaghami, and A. Aghagolzadeh, "Feature extraction using discrete cosine transform and discrimination power analysis with a face recognition technology," Pattern Recognition, vol. 43, no. 4, pp. 1431–1440, Apr. 2010.
[9] M. A. Turk and A. P. Pentland, "Face recognition using eigenfaces," in Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (CVPR'91), 1991, pp. 586–591.
[10] E. Osuna, R. Freund, and F. Girosi, "Training support vector machines: an application to face detection," in Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, 1997, pp. 130–136.
[11] P. Viola and M. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features," in Proc. IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 1, 2001, pp. 511–518.
[12] P. Belhumeur, J. Hespanha, and D. Kriegman, "Eigenfaces vs. Fisherfaces: recognition using class specific linear projection," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 7, pp. 711–720, Jul. 1997.
[13] T. Cover and P. Hart, "Nearest neighbor pattern classification," IEEE Transactions on Information Theory, vol. 13, no. 1, pp. 21–27, 1967.
[14] R. P. Lippmann, "Pattern classification using neural networks," IEEE Communications Magazine, vol. 27, no. 11, pp. 47–50, 1989.
[15] P. Langley and S. Sage, "Induction of selective Bayesian classifiers," in Proc. Tenth Int. Conf. on Uncertainty in Artificial Intelligence. Morgan Kaufmann Publishers Inc., 1994, pp. 399–406.
[16] R. Gross, "Face Databases," in Handbook of Face Recognition, S. Li and A. Jain, Eds. Springer New York, 2005, ch. 13, pp. 301–327.
[17] F. S. Samaria and A. C. Harter, "Parameterisation of a stochastic model for human face identification," in Proc. Second IEEE Workshop on Applications of Computer Vision, 1994, pp. 138–142.
[18] C. Watson, "NIST special database 18: Mugshot identification database of 8-bit gray scale images," CD-ROM & documentation, 1994.
[19] A. M. Martínez and A. C. Kak, "PCA versus LDA," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 228–233, 2001.
[20] J. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, "Face recognition using LDA-based algorithms," IEEE Transactions on Neural Networks, vol. 14, no. 1, pp. 195–200, 2003.