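$$F(u, v) = \frac{1}{\sqrt{MN}}\,\alpha(u)\,\alpha(v) \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x, y) \cos\left[\frac{(2x + 1)u\pi}{2M}\right] \cos\left[\frac{(2y + 1)v\pi}{2N}\right] \tag{5}$$
where $\alpha(w)$ is defined by:
$$\alpha(w) = \begin{cases} \dfrac{1}{\sqrt{2}}, & w = 0 \\ 1, & \text{otherwise} \end{cases} \tag{6}$$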
where f(x, y) is the image intensity function and F(u, v) is
a 2D matrix of DCT coefficients. Low-frequency coefficients
represent lighting conditions.
There are two ways to implement the DCT. In the first, the
DCT is applied directly to the entire image; in the second,
the image is divided into blocks and the DCT is calculated
for each block. The DCT coefficients with the largest magnitudes
are the DC and low-frequency AC coefficients. Low
frequencies are correlated with the illumination conditions,
while high frequencies represent noise and small variations
(details).
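As a rough illustration of the first option (the DCT applied to the whole image), the following minimal sketch computes a direct 2D DCT-II with nested loops. The names `alpha` and `dct2` are illustrative, and the orthonormal scaling used here is an assumption; it may differ from the constants in (5) and (6) by a fixed factor.

```python
import math

def alpha(w, size):
    """Orthonormal DCT-II scale factor (an assumed normalization)."""
    return math.sqrt(1.0 / size) if w == 0 else math.sqrt(2.0 / size)

def dct2(f):
    """Direct 2D DCT-II of a list-of-lists image f with M rows and N columns."""
    M, N = len(f), len(f[0])
    F = [[0.0] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            total = 0.0
            for x in range(M):
                cx = math.cos((2 * x + 1) * u * math.pi / (2 * M))
                for y in range(N):
                    cy = math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    total += f[x][y] * cx * cy
            F[u][v] = alpha(u, M) * alpha(v, N) * total
    return F

# A flat 4x4 image has no spatial variation, so all of its energy
# ends up in the DC coefficient coeffs[0][0].
flat = [[10.0] * 4 for _ in range(4)]
coeffs = dct2(flat)
```

The quadruple loop costs O(M²N²) operations and is only meant to mirror the formula; a practical implementation would likely use a separable or FFT-based transform.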
2.3 Illumination Compensation
Illumination variations in a face image can be compensated
by adding or subtracting the compensation term
$\epsilon(x, y)$ of (3) in the logarithm domain. Illumination
usually changes slowly across a face image. Since illumination
variations lie mainly in the low-frequency band, they can
be reduced by removing the low-frequency components. The
DCT is used to transform the image from the spatial domain to
the frequency domain. Low-frequency components are removed
by setting the low-frequency DCT coefficients to zero,
which acts as a high-pass filter. It follows from (3) that setting
a DCT coefficient to zero is equivalent to subtracting
the product of the DCT basis image and the corresponding
coefficient from the original image. If n low-frequency DCT
coefficients are set to zero,
$$F'(x, y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} E(u, v) - \sum_{i=1}^{n} E(u_i, v_i) = F(x, y) - \sum_{i=1}^{n} E(u_i, v_i) \tag{7}$$
where
$$E(u, v) = \alpha(u)\,\alpha(v)\,C(u, v) \cos\left[\frac{(2x + 1)u\pi}{2M}\right] \cos\left[\frac{(2y + 1)v\pi}{2N}\right]. \tag{8}$$
The term $\sum_{i=1}^{n} E(u_i, v_i)$ is the illumination compensation
term. $F'(x, y)$ in (7) is the desired normalized face in the
logarithm domain.
Fig. 1: Example of illumination compensation.
The first DCT coefficient, the DC component, determines
the overall illumination of the face image. Therefore, the desired
uniform illumination can be obtained by setting the DC
coefficient to the same value, i.e.,
$$C(0, 0) = \log \mu \cdot \sqrt{MN} \tag{9}$$
where C(0, 0) is the DC coefficient of the logarithm image.
For convenience of understanding and visualization,
Chen et al. [34] suggest a value of $\mu$ near the middle
level of the original image. It follows from (7) and (3) that
the difference between the original DC component and the
normalized DC component, together with the other discarded
low-frequency AC components, approximately makes up the
compensation term $\epsilon(x, y)$.
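The compensation procedure of this section can be sketched as follows: take the logarithm of the image, apply the DCT, overwrite the DC coefficient as in (9), zero a chosen set of low-frequency AC coefficients, and invert the transform. This is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`dct_matrix`, `compensate`, and so on) are hypothetical, the orthonormal DCT scaling is assumed (under it the DC coefficient equals the image mean times √(MN), which is what makes (9) produce a mean log level of log μ), and the toy image and the value of `mu` are made up.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix: row u, column x."""
    return [[(math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n))
             * math.cos((2 * x + 1) * u * math.pi / (2 * n))
             for x in range(n)] for u in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def dct2(f):
    """Separable 2D DCT: D_M * f * D_N^T."""
    dm, dn = dct_matrix(len(f)), dct_matrix(len(f[0]))
    return matmul(matmul(dm, f), transpose(dn))

def idct2(c):
    """Inverse of dct2 (the basis matrices are orthonormal)."""
    dm, dn = dct_matrix(len(c)), dct_matrix(len(c[0]))
    return matmul(matmul(transpose(dm), c), dn)

def compensate(image, discard, mu):
    """Return the illumination-normalized image in the logarithm domain.

    image   : list of lists of positive gray levels
    discard : (u, v) positions of low-frequency AC coefficients to zero
    mu      : target gray level for the uniform illumination
    """
    M, N = len(image), len(image[0])
    log_img = [[math.log(p) for p in row] for row in image]
    c = dct2(log_img)
    c[0][0] = math.log(mu) * math.sqrt(M * N)  # DC coefficient set as in (9)
    for u, v in discard:
        c[u][v] = 0.0                          # remove slow illumination changes
    return idct2(c)

# Toy image: brightness doubles from one column to the next (a lighting ramp).
ramp = [[20.0, 40.0, 80.0, 160.0] for _ in range(4)]
normalized = compensate(ramp, discard=[(0, 1), (1, 0)], mu=64.0)
mean_log = sum(map(sum, normalized)) / 16           # equals log(64) up to rounding
spread = max(map(max, normalized)) - min(map(min, normalized))
```

In this toy case the spread of the log image shrinks from 3·log 2 to a small residual, because most of the ramp is carried by the single coefficient at (0, 1).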
2.4 Discarding DCT Coefficients
Low-frequency DCT coefficients, which are highly related
to illumination variations, should be discarded. A problem
remains: which and how many DCT coefficients
should be discarded to obtain a well-normalized
face image? The coefficients are discarded
following the zigzag order shown in Figure 2.
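A zigzag traversal can be generated by walking the anti-diagonals of the coefficient matrix and alternating direction. The sketch below assumes the JPEG-style convention (start at the DC term, then move right along the first row); the exact direction used in Figure 2 is not recoverable from the text, so treat the starting direction as an assumption. The function name `zigzag_order` is illustrative.

```python
def zigzag_order(M, N):
    """Return all (u, v) positions of an M x N coefficient matrix in zigzag order."""
    order = []
    for d in range(M + N - 1):  # d = u + v indexes one anti-diagonal
        diag = [(u, d - u) for u in range(M) if 0 <= d - u < N]
        # Alternate the traversal direction on successive anti-diagonals.
        order.extend(diag if d % 2 else reversed(diag))
    return order

# Positions of the first six coefficients that would be discarded in an 8x8 matrix.
first_six = zigzag_order(8, 8)[:6]
```

Discarding the first n coefficients then amounts to taking `zigzag_order(M, N)[:n]` and editing those positions.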
3. Feature Extraction
In [34], after discarding the first n DCT coefficients,
the inverse DCT is performed and then the recognition process
is executed directly on the image in the logarithm domain,
i.e., the inverse logarithm transform is skipped. The authors
proved that PCA can be performed in the DCT domain with
the same results as when it is applied in the spatial
domain. We skip the inverse DCT step
in order to reduce the computational cost. The image is then
divided into blocks of 8 × 8
pixels, so each block contains 64 DCT coefficients.
As suggested by [3], the first four coefficients of each block
are arranged in a vector and used to describe the
content of the image. The features are selected from all DCT
coefficients of the partitioned blocks: after applying the DCT,
some coefficients are selected and the others are discarded in
the dimensionality reduction step. The DCT coefficients of these
blocks are used as candidate features. The proposed model
deals with the multi-class recognition problem using an
SVM classification engine.
Fig. 2: Manner of discarding DCT coefficients.
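The block-based feature vector described above can be sketched as follows. Several details are assumptions rather than facts from the text: "the first four coefficients" is read as the first four zigzag positions, the DCT uses orthonormal scaling, partial blocks at the image border are ignored, and the names `dct_coefficient`, `FIRST_FOUR` and `block_features` are hypothetical.

```python
import math

def dct_coefficient(block, u, v):
    """Single 2D DCT-II coefficient (u, v) of a square block, orthonormal scaling."""
    n = len(block)
    au = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    av = math.sqrt(1.0 / n) if v == 0 else math.sqrt(2.0 / n)
    total = 0.0
    for x in range(n):
        for y in range(n):
            total += (block[x][y]
                      * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                      * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
    return au * av * total

# Assumed reading of "the first four coefficients": the first four zigzag positions.
FIRST_FOUR = [(0, 0), (0, 1), (1, 0), (2, 0)]

def block_features(image, size=8):
    """Concatenate the first four DCT coefficients of every size x size block."""
    features = []
    for r in range(0, len(image) - size + 1, size):
        for c in range(0, len(image[0]) - size + 1, size):
            block = [row[c:c + size] for row in image[r:r + size]]
            features.extend(dct_coefficient(block, u, v) for u, v in FIRST_FOUR)
    return features

# A flat 16x16 image gives 4 blocks and therefore 16 features.
image = [[2.0] * 16 for _ in range(16)]
feats = block_features(image)
```

Computing only the needed coefficients avoids a full transform per block; the resulting vector is what a classifier such as an SVM would receive.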
Only a few low-frequency components are generally
selected as feature values for classification. The selection
of the DCT coefficients is an important part of the feature
extraction process. Most DCT-based approaches do
not give enough attention to coefficient selection. Coefficients
are usually selected with conventional methods (zigzag or
zonal masking), which are not necessarily efficient
in all applications. In this work, we propose the use of
DPA for dimensionality reduction.
4. Discrimination Power Analysis
After the DCT is applied, the coefficients
with the highest discrimination power are selected and, consequently,
the dimension is reduced. To perform this selection,
Dabbaghchian and Aghagolzadeh [9] propose a statistical approach
that analyzes the image dataset and associates with each
DCT coefficient a number that represents its Discrimination
Power (DP). This technique is called DPA and, when used,
it achieves a better recognition rate. While approaches like
PCA and LDA try to obtain a transform that maximizes the
discrimination of the features in the transformed domain,
DPA searches for the best features in the original domain.
In addition, DPA has no singularity problem and can
be used as a standalone feature reduction algorithm or
combined with other approaches.
For a coefficient to have a high DP, it should show large
variation between classes and small variation within each class.
So, having C classes and S training images for each
class, the DP of each coefficient $x_{ij}$ can be estimated as
follows.
First, build the training matrix $A_{ij}$:
$$A_{ij} = \begin{bmatrix} x_{ij}(1, 1) & x_{ij}(1, 2) & \cdots & x_{ij}(1, C) \\ x_{ij}(2, 1) & x_{ij}(2, 2) & \cdots & x_{ij}(2, C) \\ \vdots & \vdots & \ddots & \vdots \\ x_{ij}(S, 1) & x_{ij}(S, 2) & \cdots & x_{ij}(S, C) \end{bmatrix}_{S \times C} \tag{10}$$
The entries of this matrix are the DCT coefficients at position (i, j)
for all classes and all training images. Next, the mean
value of each class is calculated:
$$M_{ij}^{c} = \frac{1}{S} \sum_{s=1}^{S} A_{ij}(s, c), \quad c = 1, 2, \ldots, C \tag{11}$$
and the variance of each class:
$$V_{ij}^{c} = \sum_{s=1}^{S} \left(A_{ij}(s, c) - M_{ij}^{c}\right)^{2}, \quad c = 1, 2, \ldots, C \tag{12}$$
Then the average variance of all classes is calculated:
$$V_{ij}^{W} = \frac{1}{C} \sum_{c=1}^{C} V_{ij}^{c} \tag{13}$$
Then the mean and the variance of all training
samples are calculated:
$$M_{ij} = \frac{1}{S \times C} \sum_{c=1}^{C} \sum_{s=1}^{S} A_{ij}(s, c) \tag{14}$$
$$V_{ij}^{B} = \sum_{c=1}^{C} \sum_{s=1}^{S} \left(A_{ij}(s, c) - M_{ij}\right)^{2} \tag{15}$$
Finally, the DP is estimated for location (i, j):
$$D_{ij} = \frac{V_{ij}^{B}}{V_{ij}^{W}}, \quad 1 \le i \le M, \; 1 \le j \le N \tag{16}$$
Higher values of DP mean higher discrimination power of
the coefficients.
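The DP computation for a single coefficient position can be sketched directly from steps (11) to (16). The function names and the toy matrix are illustrative, and the guard for a zero within-class variance is an added assumption that the text does not discuss.

```python
def discrimination_power(A):
    """DP of one coefficient position from its S x C training matrix A.

    A[s][c] is the coefficient value in training image s of class c.
    """
    S, C = len(A), len(A[0])
    # (11)-(12): mean and variance of each class
    class_means = [sum(A[s][c] for s in range(S)) / S for c in range(C)]
    class_vars = [sum((A[s][c] - class_means[c]) ** 2 for s in range(S))
                  for c in range(C)]
    # (13): average variance over all classes
    v_within = sum(class_vars) / C
    # (14)-(15): mean and variance over all training samples
    mean_all = sum(sum(row) for row in A) / (S * C)
    v_between = sum((A[s][c] - mean_all) ** 2
                    for s in range(S) for c in range(C))
    # (16); the zero-variance guard is an assumption, not part of the text
    return float("inf") if v_within == 0 else v_between / v_within

def top_positions(dp, k):
    """Positions (i, j) of the k largest values in a 2D list of DP values."""
    ranked = sorted(((dp[i][j], (i, j)) for i in range(len(dp))
                     for j in range(len(dp[0]))), reverse=True)
    return [pos for _, pos in ranked[:k]]

# Two training images (rows) for each of two classes (columns).
well_separated = discrimination_power([[1.0, 5.0], [3.0, 7.0]])
overlapping = discrimination_power([[1.0, 2.0], [3.0, 4.0]])
best = top_positions([[10.0, 2.5], [0.1, 7.0]], k=2)
```

In the toy data the class means are farther apart in the first matrix, so it receives the larger DP; `top_positions` then keeps the coefficient locations with the highest values.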
Fig. 3: Example images from the CMU PIE dataset.
5. RESULTS
In this paper, we conduct two experiments. In the first
experiment, we compare our proposed method with other
illumination-invariant approaches. The Yale Face Database
B and the CMU PIE Face Database are both used to
evaluate our approach; these databases contain images with
large illumination variations. In the second experiment, we
evaluate the performance of our approach with different
expressions, frontal and profile views, and face images with
cluttered backgrounds.
The tests were analyzed using accuracy,
precision and recall. Precision is the proportion of
items classified as positive that are truly positive.
Recall is the proportion of truly positive items that were
correctly classified as positive.
$$\mathrm{Acc} = \frac{TP + TN}{TP + FN + TN + FP} \tag{17}$$
$$P = \frac{TP}{TP + FP} \tag{18}$$
$$R = \frac{TP}{TP + FN} \tag{19}$$
where TP = True Positive, TN = True Negative, FN =
False Negative, and FP = False Positive.
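The three measures in (17) to (19) can be computed from the four confusion counts as in the short sketch below. The function name and the counts are made up for illustration and are not results from the experiments.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall) from confusion counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # (17)
    precision = tp / (tp + fp)                   # (18)
    recall = tp / (tp + fn)                      # (19)
    return accuracy, precision, recall

# Hypothetical counts: 90 hits, 10 misses, 5 false alarms, 95 correct rejections.
acc, prec, rec = classification_metrics(tp=90, tn=95, fp=5, fn=10)
```

With these counts, accuracy is 185/200, precision is 90/95 and recall is 90/100, which shows how a classifier can have higher precision than recall when it misses more positives than it raises false alarms.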
5.1 Illumination Invariance Experiment
The Yale B face database [28] has 2432 frontal images
of 38 people, with different facial expressions and illumination
conditions. In the CMU PIE database, there are 68
subjects with pose, illumination and expression variations.
In our experiments, only frontal face images under different
lighting conditions are selected. Figure 3 shows some
examples of frontal views from the CMU PIE dataset.
In Table 1, we compare our method with the methods of Chen et al. [34]
and Kao et al. [3]. The first method
achieves illumination invariance by truncating the low-frequency DCT coefficients
in the logarithm domain, while the second uses
local contrast enhancement to reduce the illumination
variations. We test our method without dimensionality reduction
(Our approach) and with dimensionality reduction (Our
approach PDA). Table 1 shows that our method gives
higher mean accuracy than the other two methods on both databases. In
Table 2, we compare the methods in terms of precision and
recall. Again, our method, with and without dimensionality
reduction, achieves better results.
Table 1: Mean accuracy (Mean) and accuracy variance
(Var) for the Yale B and CMU PIE databases.
                    Yale B              PIE
Method              Mean      Var       Mean      Var
Our approach        99.875    0.026     99.923    0.012
Our approach PDA    99.938    0.017     99.962    0.008
Chen et al. [34]    96.312    0.004     95.907    0.016
Kao et al. [3]      96.730    0.163     95.647    0.018
Table 2: Results of the experiments in terms of precision (Prec)
and recall (Recall) for the Yale B and CMU PIE datasets.
                    Yale B              PIE
Method              Prec      Recall    Prec      Recall
Our approach        0.999     0.999     0.999     0.999
Our approach PDA    0.999     0.999     1.000     1.000
Chen et al. [34]    0.966     0.963     0.963     0.959
Kao et al. [3]      0.961     0.963     0.993     0.993
Figures 4 and 5 plot the values
shown in Table 2 for the Yale B and CMU PIE databases,
respectively. Our method, with and without
dimensionality reduction, obtains the best results.
Fig. 4: Comparative graph for the Yale B database.
Fig. 5: Comparative graph for the PIE database.
5.2 Pose and View Variation Experiment
To evaluate our proposed approach, the experiments
were performed on four benchmark face datasets:
the Japanese Female Facial Expression (JAFFE) database
[25], which contains 213 images showing 7
different expressions of each person; the AT&T database [26],
with 400 images of 40 individuals containing variations such as
expressions and facial details; the Sheffield database (known as
UMIST) [27], which consists of 564 images of 20 individuals,
showing them in frontal and profile views; and the Georgia
Tech face database [29], which contains images of 50 people,
each represented by 15 color images
with cluttered backgrounds.
In the second experiment, we compare our approach with
other well-known methods: Eigenface [11], Fisherface [12],
SIFT [18], Mel-cepstrum [19], Mellin-cepstrum
[19], and the method of Kao et al. [3]. Again, the
performance is evaluated using accuracy, precision and
recall. Table 3 presents the performance of the
different methods.
The largest variation in performance occurs on the Georgia
Tech dataset. This happens because this dataset has
variations not only in illumination, expression and position, but also
in the background. Despite these variations,
our approach achieves good results compared
with the other methods.
Table 4 gives a more detailed analysis, showing
the accuracy and variance of each model in the experiments.
For each method and each database, ten-fold
cross-validation was carried out. The accuracy
reported in Table 4 is the average of the 10 runs.
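The ten-fold protocol can be sketched as below. This is a minimal illustration under stated assumptions: `train_and_predict` is a hypothetical callback standing in for any classifier, the folds are contiguous (real data would normally be shuffled or stratified first), and the variance is taken as the population variance of the per-fold accuracies, which the text does not specify.

```python
import statistics

def k_fold_indices(n_samples, k=10):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(samples, labels, train_and_predict, k=10):
    """Return (mean accuracy, variance of accuracy), in percent, over k folds.

    train_and_predict(train_x, train_y, test_x) is a hypothetical callback
    that fits a classifier and returns the predicted labels for test_x.
    """
    accuracies = []
    for fold in k_fold_indices(len(samples), k):
        held_out = set(fold)
        train = [i for i in range(len(samples)) if i not in held_out]
        predicted = train_and_predict([samples[i] for i in train],
                                      [labels[i] for i in train],
                                      [samples[i] for i in fold])
        correct = sum(p == labels[i] for p, i in zip(predicted, fold))
        accuracies.append(100.0 * correct / len(fold))
    # Population variance across folds is an assumption.
    return statistics.mean(accuracies), statistics.pvariance(accuracies)

# Toy check with a "classifier" that already knows the rule label = x mod 2.
samples = list(range(20))
labels = [x % 2 for x in samples]
mean_acc, var_acc = cross_validate(
    samples, labels, lambda train_x, train_y, test_x: [x % 2 for x in test_x])
```

Each sample is held out exactly once, so the mean and variance summarize ten independent test scores per method and database.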
Table 3: Results of Eigenface, Fisherface, SIFT, Mel, Mellin, Our approach and DCT+DPA ([3]). P is precision
and R is recall.
           Eigenface    Fisherface   SIFT         Mel          Mellin       Our approach DCT+DPA
Database   P     R      P     R      P     R      P     R      P     R      P     R      P     R
JAFFE      0.99  0.98   0.97  0.97   0.99  0.99   0.99  0.99   0.98  0.98   1.00  1.00   0.98  0.98
AT&T       0.94  0.94   0.89  0.82   0.92  0.90   0.84  0.81   0.83  0.81   0.97  0.96   0.94  0.92
UMIST      0.99  0.98   0.97  0.97   0.98  0.97   0.92  0.91   0.93  0.93   0.98  0.98   0.99  0.99
Georgia    0.98  0.97   0.78  0.77   0.96  0.96   0.97  0.96   0.93  0.91   0.98  0.98   0.97  0.96
Our approach again gave the best results in three of
the four databases used in the tests. Its accuracy was the highest
for the JAFFE, Georgia and AT&T databases, and it also had
the lowest variance on JAFFE and AT&T. On the UMIST database,
the best result was obtained by DCT+DPA. Based on
our experiments, we can conclude that DCT-based methods
perform as well as or better than well-known methods, with
the advantage of a lower computational cost.
6. CONCLUSION
Analyzing the performance of all the methods, we can see
that our proposed method achieves better results with lower
computational cost and smaller feature vectors. The Eigenface
and Fisherface methods deliver good results, but their high
computational cost makes them impractical
for large datasets. This work proposes a method for face
recognition that is invariant to lighting variations and expressions,
with frontal and profile views. Based on the experiments,
we can also conclude that DCT-based methods achieve
equal or better results than methods from the literature.
Dimensionality reduction and feature selection are performed
using the DPA method. Despite the dimensionality reduction,
our approach obtains better results at lower cost.
Table 4: Accuracy (Acc) and variance (Var) of Eigenface, Fisherface, SIFT, Mel, Mellin, Entropy+DCT+PCA (Our approach) and DCT+DPA.
           Eigenface       Fisherface      SIFT            Mel             Mellin          Our approach    DCT+DPA
Database   Acc     Var     Acc     Var     Acc     Var     Acc     Var     Acc     Var     Acc     Var     Acc     Var
JAFFE      98.65   1.70    97.69   2.07    99.23   0.51    99.23   0.51    98.46   0.22    100.00  0.00    98.46   2.44
AT&T       94.10   5.94    82.00   12.80   90.70   2.06    81.70   2.96    81.00   1.8     96.1    1.49    92.70   3.86
UMIST      98.72   0.10    97.96   0.29    98.04   0.43    93.38   1.73    94.46   0.84    98.1    0.66    99.40   0.11
Georgia    97.77   0.89    77.14   504.5   96.05   0.69    96.51   0.43    91.54   0.57    97.86   1.48    97.6    0.73
References
[1] Simon Fear (2005), "Publication quality tables in LaTeX". [Online]. Available: http://www.ctan.org/tex-archive/macros/latex/contrib/booktabs/booktabs.pdf
[2] Xudong Xie and Kin-Man Lam (2005), "Face recognition under varying illumination based on a 2D face shape model", Pattern Recognition, vol. 38, number 2, pages 221-230.
[3] Wen-Chung Kao and Ming-Chai Hsu and Yueh-Yiing Yang (2010), "Local contrast enhancement and adaptive feature extraction for illumination-invariant face recognition", Pattern Recognition, vol. 43, number 5, pages 1736-1747.
[4] Ronen Basri and David W. Jacobs (2003), "Lambertian Reflectance and Linear Subspaces", IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, number 2, pages 218-233.
[5] Li, Jun-Bao and Pan, Jeng-Shyang and Lu, Zhe-Ming (2009), "Face recognition using Gabor-based complete Kernel Fisher Discriminant analysis with fractional power polynomial models", Neural Computing & Applications, Springer London, vol. 18, pages 613-621.
[6] Imtiaz, H. and Anowarul Fattah, S. (2010), "A face recognition scheme based on spectral domain feature extraction", Electrical and Computer Engineering (ICECE), 2010 International Conference on, pages 514-517.
[7] Hu, Haifeng (2008), "ICA-based neighborhood preserving analysis for face recognition", issue 3, issn 1077-3142, pages 286-295.
[8] Joung-Youn Kim and Lee-Sup Kim and Seung-Ho Hwang (2001), "An advanced contrast enhancement using partially overlapped sub-block histogram equalization", Circuits and Systems for Video Technology, IEEE Transactions on, vol. 11, number 4, pages 475-484.
[9] Dabbaghchian, Saeed and Ghaemmaghami, Masoumeh P. and Aghagolzadeh, Ali (2010), "Feature extraction using discrete cosine transform and discrimination power analysis with a face recognition technology", Pattern Recogn., vol. 43, issue 4, issn 0031-3203, pages 1431-1440.
[10] Jafri, R. and Arabnia, H. R. (2009), "A Survey of Face Recognition Techniques", Journal of Information Processing Systems, vol. 5, number 2, pages 41-68.
[11] Turk, M. and Pentland, A. (1991), "Eigenfaces for recognition", Journal of Cognitive Neuroscience, vol. 3, number 1, pages 71-86.
[12] Belhumeur, P.N. and Hespanha, J.P. and Kriegman, D.J. (1997), "Eigenfaces vs. Fisherfaces: recognition using class specific linear projection", Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 19, number 7, pages 711-720.
[13] Hu, Haifeng (2008), "Orthogonal neighborhood preserving discriminant analysis for face recognition", Pattern Recogn., vol. 41, issue 6, issn 0031-3203, pages 2045-2054.
[14] T. Jebara (1996), "3D Pose Estimation and Normalization for Face Recognition", Center for Intelligent Machines, McGill University.
[15] Cox, I.J. and Ghosn, J. and Yianilos, P.N. (1996), "Feature-based face recognition using mixture-distance", Computer Vision and Pattern Recognition, 1996. Proceedings CVPR 96, 1996 IEEE Computer Society Conference on, pages 209-216.
[16] Takeo Kanade (1973), "Picture Processing System by Computer Complex and Recognition of Human Faces", Doctoral dissertation, Kyoto University.
[17] Căleanu, Cătălin-Daniel and Mao, Xia and Pradel, Gilbert and Moga, Sorin and Xue, Yuli (2011), "Combined pattern search optimization of feature extraction and classification parameters in facial recognition", Pattern Recogn. Lett., vol. 32, issn 0167-8655, pages 1250-1255.
[18] Lowe, David G. (2004), "Distinctive Image Features from Scale-Invariant Keypoints", Int. J. Comput. Vision, vol. 60, issue 2, issn 0920-5691, pages 91-110.
[19] Cakir, Serdar and Cetin, A. Enis (2011), "Mel- and Mellin-cepstral Feature Extraction Algorithms for Face Recognition", Comput. J., vol. 54, issue 9, issn 0010-4620, pages 1526-1534.
[20] Aman Chadha and Pallavi P. Vaidya and M. Mani Roja (2011), "Face Recognition Using Discrete Cosine Transform for Global and Local Features", CoRR, vol. abs/1111.1423.
[21] Yuan-Yuan Huang and Jian-Ping Li and Jie Lin and Yu-Jie Hao and Gui-Duo Duan (2009), "Robust face recognition by combining wavelet decomposition and local hybrid projection entropy", Apperceiving Computing and Intelligence Analysis, 2009. ICACIA 2009. International Conference on, pages 325-328.
[22] Jia-Shu Zhang and Cun-Jian Chen (2008), "Local variance projection log energy entropy features for illumination robust face recognition", Biometrics and Security Technologies, 2008. ISBAST 2008. International Symposium on, pages 1-5.
[23] Vapnik, Vladimir N. (1995), "The nature of statistical learning theory", Springer-Verlag New York, Inc., isbn 0-387-94559-8.
[24] S. Alirezaee and K. Faez and H. Aghaeinia and F. Askari (2006), "An efficient algorithm for face localization", Int. Journal of Information Technology, vol. 12, pages 30-36.
[25] Kamachi, M. and Lyons, M. and Gyoba, J. (1998), "The Japanese Female Facial Expression (JAFFE) Database", http://www.kasrl.org/jaffe.html.
[26] Laboratories Cambridge (2002), "Database of faces", http://www.cl.cam.ac.uk/research/dtg/-attarchive/facedatabase.html.
[27] Graham, D. B. and Allinson, N. M. (1998), "The UMIST Database", http://www.face-rec.org/databases/.
[28] A. Georghiades and P. Belhumeur and D. Kriegman (2001), "Yale Face Database", http://vision.ucsd.edu/~leekc/ExtYaleDatabase/ExtYaleB.html.
[29] Georgia Tech Face Database (2007), http://www.aneam.com/research/face_reco.html.
[30] Imtiaz, H. and Anowarul Fattah, S. (2010), "A face recognition scheme based on spectral domain feature extraction", Electrical and Computer Engineering (ICECE), 2010 International Conference on, pages 514-517.
[31] Xuan Zou and Kittler, J. and Messer, K. (2007), "Illumination Invariant Face Recognition: A Survey", Biometrics: Theory, Applications, and Systems, 2007. BTAS 2007. First IEEE International Conference on, pages 1-8.
[32] Ramji M. and Makwana (2010), "Illumination invariant face recognition: A survey of passive methods", Procedia Computer Science, vol. 2, pages 101-110.
[33] Ruiz-del-Solar, Javier and Quinteros, Julio (2008), "Illumination compensation and normalization in eigenspace-based face recognition: A comparative study of different pre-processing approaches", Pattern Recogn. Lett., vol. 29, issue 14, pages 1966-1979.
[34] Chen, W. and Meng Joo Er and Shiqian Wu (2006), "Illumination compensation and normalization for robust face recognition using discrete cosine transform in logarithm domain", Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on, vol. 36, number 2, pages 458-466.
[35] Shermina, J. (2011), "Illumination invariant face recognition using Discrete Cosine Transform and Principal Component Analysis", Emerging Trends in Electrical and Computer Technology (ICETECT), 2011 International Conference on, pages 826-830.