ABSTRACT
Many existing works in face recognition are based solely on visible images. The
use of bimodal systems based on visible and thermal images is seldom reported in
face recognition, despite their advantage of combining the discriminative power
of both modalities under expression or pose variations. In this paper, we
investigate the combined advantages of thermal and visible face recognition in a
Principal Component Analysis (PCA) induced feature space, with PCA applied on
each spectrum separately, on a relatively new thermal/visible face database,
OTCBVS, with large pose and expression variations. The recognition is done
through two fusion schemes, based on k-Nearest Neighbors classification and on
Support Vector Machines respectively. Our findings confirm that, when a suitably
chosen classifier fusion is employed, the recognition results improve with the
aid of thermal images over the classical approaches on visible images alone.
Keywords: face recognition, fusion scheme, thermal images, PCA, k-NN, SVMs.
The weight α (between 0 and 1) in the computation of dw(x, xt) is determined
based on a validation set, so as to maximize the face recognition rate. A k-NN
classification is then applied on the weighted distances to obtain the
recognition result.

3.3.2 The features fusion scheme
The features fusion scheme is illustrated in Fig. 3, where block A illustrates
the feature extraction in the visible spectrum, block B the feature extraction
in the thermal spectrum, and block C2 the features fusion itself. The purpose of
the scheme is to fuse the projections of the visible facial image IV and the
projections of the corresponding thermal spectrum facial image IIR in a
β-weighted combination, where the weight β should be chosen to maximize the
recognition rate of the multi-class linear SVM classifier over some validation
set of examples. Let us denote the recognition rate of this classifier,
depending on β, by rate(β):

rate(β) = (number of correctly classified instances in the validation set) /
          (total number of instances in the validation set).

Then the feature vector is described by Eq. (8) and the optimal value of the
weight β is given by Eq. (9).

In order to test our fusion approach, we use the OTCBVS benchmark [32]. For both
of the proposed fusion schemes, we established a small training set and a
validation set to derive the optimal fusion weight (α or β), and several test
sets for expression and pose variations.

4.1 The face database and the evaluation design
In our experiments, we use the OTCBVS benchmark [32], which contains 4228 pairs
of visible and IR thermal images. A few samples are illustrated in Fig. 4. The
images are acquired under illumination, expressivity ("surprised", "laughing",
"angry") and pose variations (11 positions for each type of acquisition, as
illustrated in Fig. 5). We selected the OTCBVS benchmark because its images
simulate the main variations of real scenarios.

Figure 4: Samples of different subjects from OTCBVS; (A) Visible spectrum
images; (B) IR thermal images.

Figure 5: Example of a subject acquisition under pose variation; (A) Visible
spectrum images; (B) IR thermal images.

In our tests, no preprocessing is performed on the images, in order to test the
discriminative power of the IR and visible spectra both in the individual
PCA-based face recognition methods and in our fusion-based approach;
preprocessing techniques would indirectly increase the recognition rate. The
data used for the experiments is divided into three main sets:
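The score fusion scheme described in Section 3.3 (per-spectrum eigenface projections, an α-weighted sum of the two distances, then nearest-neighbor classification) can be sketched as follows. This is a minimal illustration, not the authors' code: it uses random arrays in place of the OTCBVS images, and the array sizes, names, and the Euclidean distance are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Synthetic stand-ins for the OTCBVS data: 30 subjects, 6 images per
# spectrum each, 32x32-pixel faces flattened to vectors (sizes illustrative).
n_classes, n_per_class, n_pixels = 30, 6, 32 * 32
labels = np.repeat(np.arange(n_classes), n_per_class)
X_vis = rng.normal(size=(n_classes * n_per_class, n_pixels))
X_ir = rng.normal(size=(n_classes * n_per_class, n_pixels))

# One PCA (eigenfaces) model per spectrum, as in the paper.
pca_vis = PCA(n_components=40).fit(X_vis)
pca_ir = PCA(n_components=40).fit(X_ir)
P_vis, P_ir = pca_vis.transform(X_vis), pca_ir.transform(X_ir)

def score_fusion_1nn(q_vis, q_ir, alpha):
    """Classify one visible/thermal image pair using the alpha-weighted
    distance d_w = alpha * d_visible + (1 - alpha) * d_thermal."""
    d_vis = np.linalg.norm(P_vis - pca_vis.transform(q_vis[None]), axis=1)
    d_ir = np.linalg.norm(P_ir - pca_ir.transform(q_ir[None]), axis=1)
    d_w = alpha * d_vis + (1.0 - alpha) * d_ir
    return labels[np.argmin(d_w)]  # 1-NN on the fused distances

# Querying with a training pair must return that pair's own label.
pred = score_fusion_1nn(X_vis[0], X_ir[0], alpha=0.82)
```

With k > 1 one would vote among the k smallest fused distances instead of taking the single minimum.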
training, validation and test sets. Efforts were made to test the proposed
fusion schemes in a way that reflects their performance in real-world scenarios
with high variations of pose or expressivity. Most of the results reported in
the literature are obtained on experimental images with nearly frontal poses,
after preprocessing (manual or automatic) such as localization, scaling or
rotation of the human faces, which is not practical in most face recognition
applications.

4.2 The training set
We include in the training set a total of 12 nearly frontal images for each
subject: 6 images in the IR spectrum and 6 images in the visible spectrum. For
each spectrum there are 3 images under the "surprised" expressivity (poses 4, 6
and 8) and 3 images under the "laughing" expressivity (also poses 4, 6 and 8).
C classes are defined from the training set, and each class contains 6
projections in the eigenfaces subspace for each spectrum.

4.3 The validation set
After computing the projections of the training images in the eigenfaces
subspace, we must determine the optimal values of the fusion parameters α
(weight of the score fusion scheme) and β (weight of the features fusion
scheme). This is done by tuning the weights to maximize the recognition
performance on a validation set. The validation set includes images with the
"angry" expressivity under poses 5 and 7.

The recognition rates on the validation set for values of α ranging from 0 (IR
modality only) to 1 (visible modality only) are illustrated in Fig. 6 for the
score fusion scheme. In Fig. 6, the classical PCA-based recognition rate in the
visible spectrum is drawn with the dotted blue line and that in the IR spectrum
with the dotted green line. According to the validation set, the weight of the
visible image score in the score fusion scheme is chosen as the minimum weight
that maximizes the recognition rate, namely α = 0.82. It is important to remark
from Fig. 6 that for a large range of α values, from 0.2 to 0.98, the
performance of the bimodal approach is higher than that of the classical PCA
approach on visible images, and for every α between 0.01 and 1, the bimodal
performance is higher than that of the classical PCA approach on IR thermal
images.

For the features fusion scheme, many values of β offer superior results
compared to a classical PCA-based approach on the visible or thermal spectrum
alone with a linear SVM classifier. Values of the features fusion weight that
maximize the recognition results are, for example, 0.7, 0.72 and 0.8. The value
β = 0.8 is carried forward to the performance evaluation tests, so that all the
results from both proposed fusion schemes can be compared.

4.4 Performance evaluation test
The next experiments aim to evaluate the performance of our bimodal
fusion-based approaches. To exhibit the superiority of the bimodal systems, the
single-modality performances on the visible images and on the IR thermal images
are computed for comparison. The performances are evaluated by means of k-NN
classification (with k = 1, 3, 5 and 7) for the score fusion scheme and SVM
classification for the features fusion scheme. A particular experiment uses
k-NN based only on the first neighbor (k = 1) and only on the visible spectrum
images (α = 1), which is the classical PCA-based method [2], here applied on
the OTCBVS database.

The first evaluation experiment is an expressivity test. Its test set consists
of images with the "angry" expressivity, a different expressivity from those in
the training set, and the same nearly frontal poses: 4, 6 and 8.

Figure 6: Results on the validation set for various α values in the score
fusion scheme.
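The sweep over α behind Fig. 6 can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: the distance matrices are random stand-ins, and the grid of 101 α values and the names (rate, d_vis, d_ir) are choices made here. Taking the first maximizer mirrors the paper's choice of the minimum weight that maximizes the rate.

```python
import numpy as np

def rate(alpha, d_vis, d_ir, train_labels, val_labels):
    """Recognition rate on the validation set for one alpha: fraction of
    validation images whose 1-NN under the alpha-weighted distance
    alpha * d_vis + (1 - alpha) * d_ir carries the correct label."""
    d_w = alpha * d_vis + (1.0 - alpha) * d_ir      # shape (n_val, n_train)
    preds = train_labels[np.argmin(d_w, axis=1)]
    return np.mean(preds == val_labels)

# Toy distance matrices: 4 validation images against 6 training images.
train_labels = np.array([0, 0, 1, 1, 2, 2])
val_labels = np.array([0, 1, 2, 2])
rng = np.random.default_rng(1)
d_vis = rng.random((4, 6))
d_ir = rng.random((4, 6))

# Sweep alpha from 0 (thermal only) to 1 (visible only); np.argmax returns
# the first maximizer, i.e. the smallest alpha attaining the best rate.
alphas = np.linspace(0.0, 1.0, 101)
rates = np.array([rate(a, d_vis, d_ir, train_labels, val_labels)
                  for a in alphas])
best_alpha = alphas[np.argmax(rates)]
```

The same sweep, with the SVM's validation accuracy in place of the 1-NN rate, selects the features fusion weight β.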
Table 1: Description of the image sets for the performance evaluation tests.

Table 2: Recognition rates for the score fusion scheme on OTCBVS (V-visible
spectrum; IR-thermal IR spectrum; Bi-bimodal score fusion scheme), for k-NN
classification with k = 1, 3, 5 and 7.

                 k=1                  k=3                  k=5                  k=7
            V     IR    Bi       V     IR    Bi       V     IR    Bi       V     IR    Bi
Test 2(%) 95.08 87.05 98.21    90.62 78.57 94.64    83.48 75.00 91.07    76.33 64.73 83.03
Test 3(%) 89.28 62.50 92.85    82.14 51.78 87.50    73.21 44.64 78.57    64.28 37.50 71.42
Test 4(%) 99.10 92.85 100      99.10 86.60 99.10    97.32 84.82 99.10    89.28 77.67 96.42
Test 5(%) 69.10 53.72 74.85    64.13 48.21 70.53    61.01 43.75 66.51    55.65 36.16 59.07
Test 6(%) 75.89 56.25 82.73    66.36 48.80 75.29    61.30 42.55 69.34    55.65 33.03 58.33
The second evaluation experiment is a pose test, where the test set consists of
images with the same expressivities as those in the training set ("surprised"
and "laughing") but with different poses (3, 5, 7 and 9). The third test set
includes images with a different expressivity ("angry") and also different
poses (3 and 9) from those in the training or validation sets. Finally, we
evaluate the performance of our approaches on three further test sets that
include the largest expressivity and pose variations in the data set, with
extreme pose variations (such as poses 1 and 11) and all the expression
variations. The description of the test image sets for our experiments is given
in Table 1.

4.5 Results
The performances of each individual modality and of our fusion-based approaches
are given in Table 2 for the score fusion scheme and in Table 3 for the
features fusion scheme. As can be seen, in every test and for both fusion
schemes, the recognition rate for the IR spectrum is significantly lower than
that for the visible spectrum, partly because the IR thermal images vary more
with respect to pose and even rotation than the images in the visible spectrum.
It is expected that preprocessing both the visible and the IR images would
partially remove this significant difference in the results.

For the score fusion scheme, it can be seen that for the 1-NN classification
our approach has superior results in all the tests, and it is the only one that
achieves a 100% recognition rate in some tests (i.e., Test 4). The recognition
rate of our approach is lower under expressivity variation than under pose
variation, mainly due to the training set selected. For extremely high pose
variation (i.e., poses 1 and 11), it is expected that all the rates will be
lower, as in Tests 5 and 6. Another issue that is easy to remark is the lower
recognition performance of the higher-order nearest neighbor classification
(i.e., k = 5, 7), due to the small number of samples in the training set
compared to the number of classes.

For the features fusion scheme, it can be seen that the bimodal approach also
obtains better results than the classical PCA approach on a single spectrum
with a linear SVM classifier. Due to the complexity of the SVM, the
single-spectrum results for the visible or thermal facial images are superior
to those obtained with k-NN in almost all the tests. Also, the improvements
obtained with a bimodal system in the features fusion scheme are usually better
than those of the score fusion scheme. In Test 4, the rates for the visible
spectrum and for our features fusion scheme are maximal for the weight β = 0.8.
All the results for the features fusion scheme are given in Table 3. As can be
observed, the SVM classifier exploits the features from the thermal spectrum
more favorably than the k-NN classifier does.

Table 3: Recognition rates for the features fusion scheme on OTCBVS (V-visible
spectrum; IR-thermal IR spectrum; Bi-bimodal features fusion scheme).

             V      IR     Bi
Test 1(%)  94.04  76.19  98.80
Test 2(%)  96.42  92.85  98.21
Test 3(%)  83.92  66.07  92.85
Test 4(%)  100    99.10  100
Test 5(%)  69.94  55.80  74.55
Test 6(%)  75.29  59.82  81.84

For our experiments, we chose a training set with a small number of images in
order to simulate a real practical application with difficult scenarios: few
samples for every person and high variations in appearance in the test images.
As can be seen in Table 2 and Table 3, for a difficult test which considers all
the expressions and a large set of positions of the subject (even with half of
the face hidden), the recognition result of the classical PCA-based approach is
as poor as 69.10%, that of a PCA approach classified with an SVM is close, at
69.94%, and our fusion-based approaches improve the recognition rate by almost
6% in both cases.

Figure 7: Performance of the fusion-based approaches.
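The features fusion scheme evaluated above (β-weighted combination of the per-spectrum PCA projections, classified with a linear SVM) can be sketched as follows. This is a minimal illustration, not the authors' code: the data are synthetic, the sizes and names are assumptions, and since Eq. (8) is not reproduced in this excerpt, the sketch assumes the weighted projections are concatenated rather than summed.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import LinearSVC

rng = np.random.default_rng(2)

# Synthetic stand-ins: 5 subjects, 6 image pairs each; class-dependent
# means keep the toy problem learnable (sizes are illustrative).
n_classes, n_per_class, n_pixels = 5, 6, 256
y = np.repeat(np.arange(n_classes), n_per_class)
means = rng.normal(scale=5.0, size=(n_classes, n_pixels))
X_vis = means[y] + rng.normal(size=(len(y), n_pixels))
X_ir = means[y][:, ::-1] + rng.normal(size=(len(y), n_pixels))

# Eigenface projections, one PCA model per spectrum.
pca_vis = PCA(n_components=10).fit(X_vis)
pca_ir = PCA(n_components=10).fit(X_ir)

def fused_features(Xv, Xi, beta):
    """beta-weighted feature fusion: concatenate
    [beta * visible projections, (1 - beta) * thermal projections]."""
    return np.hstack([beta * pca_vis.transform(Xv),
                      (1.0 - beta) * pca_ir.transform(Xi)])

beta = 0.8  # value selected on the validation set in the paper
F = fused_features(X_vis, X_ir, beta)
clf = LinearSVC().fit(F, y)       # multi-class linear SVM
acc = clf.score(F, y)             # accuracy on the (training) data
```

In a real run, `beta` would be chosen by the validation sweep and `acc` reported on held-out test sets, as in Tables 1 and 3.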
From the same tests, it can be seen that even if the SVM classifier offers
superior results in most of the tests, there are situations where it falls
slightly below the score fusion approach based on the k-NN classifier. In all
the experiments, we found that the performance of our score fusion based
approach with α = 0.82 and of our features fusion based approach with β = 0.8
exceeds the individual performances of the single-modality systems, sometimes
by almost 10% (see Fig. 7 for a direct comparison of the results).

5 CONCLUSIONS

Two fusion-based approaches that substantially improve the performance of the
classical PCA-based techniques are proposed in this paper. The first approach
is a score fusion system in the PCA-induced feature space with a k-NN
classification, and the second is a direct features fusion system in the same
PCA-induced space with a linear SVM classification. The PCA-based techniques in
the visible spectrum achieve high recognition rates for frontal images with low
intra-personal variation. However, in practical face recognition applications
the acquisition conditions cannot always be controlled (e.g., expressivity and
pose variations), and the performance of the classical PCA-based approaches is
highly affected. In order to minimize the intra-personal variations of the
human faces, we combine the discriminative power of the IR and visible spectra,
and provide a principled formulation of a procedure to select optimal values of
the fusion weights α and β between the two modalities. To further improve the
recognition rates, classification with a more complex non-linear SVM could be
performed, or preprocessing could be applied to reduce the scale, rotation and
illumination variations.

6 REFERENCES

[1] R. Brunelli and T. Poggio: Face recognition: features versus templates,
IEEE Trans. Patt. Anal. Mach. Intell., 15(10), pp. 1042-1052 (1993).
[2] M.A. Turk and A.P. Pentland: Eigenfaces for face recognition, Journal of
Cognitive Neuroscience, 3(1), pp. 71-86 (1991).
[3] M.R. Gupta and N.P. Jacobson: Wavelet Principal Component Analysis and its
Application to Hyperspectral Images, IEEE Int'l Conf. on Image Processing, pp.
1585-1588 (2006).
[4] W. Hu, O. Farooq and S. Datta: Wavelet Based Sub-space Features for Face
Recognition, CISP '08, Congress on Image and Signal Processing, Vol. 3, pp.
426-430 (2008).
[5] H. Wang, S. Yang and W. Liao: An Improved PCA Face Recognition Algorithm
Based on the Discrete Wavelet Transform and the Support Vector Machines, Int'l
Conf. on Computational Intelligence and Security Workshops, pp. 308-311 (2007).
[6] M. Mazloom and S. Ayat: Combinational Method for Face Recognition: Wavelet,
PCA and ANN, Digital Image Computing: Techniques and Applications, pp. 90-95
(2008).
[7] P. Parveen and B. Thuraisingham: Face recognition using multiple
classifiers, Proceedings of the 18th International Conference on Tools with
Artificial Intelligence, pp. 179-186 (2006).
[8] D. He, L. Zhang and Y. Cui: Face Recognition Using (2D)^2PCA and Wavelet
Packet Decomposition, Congress on Image and Signal Processing, Vol. 1, pp.
548-553 (2008).
[9] M. Zhao, P. Li and Z. Liu: Face Recognition Based on Wavelet Transform
Weighted Modular PCA, CISP '08, Vol. 4, pp. 589-593 (2008).
[10] G.F. Xu, S.Q. Ding, L. Huang and C.P. Liu: Recognition based on wavelet
reconstruction face, International Conference on Machine Learning and
Cybernetics, pp. 3005-3020 (2008).
[11] Y. Yoshitomi, T. Miyaura, S. Tomita and S. Kimura: Face Identification
Using Thermal Image Processing, 6th IEEE International Workshop on Robot and
Human Communication, pp. 374-379 (1997).
[12] D. Socolinsky, A. Selinger and J. Neuheisel: Face Recognition With Visible
And Thermal Infrared Imagery, Computer Vision and Image
Understanding, Vol. 91, Issue 1-2, pp. 72-114 (2003).
[13] A. Selinger and D. Socolinsky: Appearance-Based Facial Recognition Using
Visible And Thermal Imagery: A Comparative Study, Technical Report 02-01,
Equinox Corporation (2002).
[14] F.J. Prokoski, R.B. Riedel and J.S. Coffin: Identification of individuals
by means of facial thermography, Proceedings of the IEEE International
Conference on Security Technology, Crime Countermeasures, pp. 120-125 (1992).
[15] S.G. Kong, J. Heo, B.R. Abidi, J. Paik and M.A. Abidi: Recent advances in
visual and infrared face recognition - a review, Computer Vision and Image
Understanding, Vol. 97, Issue 1, pp. 103-135 (2005).
[16] X. Chen, P.J. Flynn and K.W. Bowyer: PCA-Based Face Recognition in
Infrared Imagery: Baseline and Comparative Studies, International Workshop on
Analysis and Modeling of Faces and Gestures, IEEE, Nice, France (2003).
[17] D.A. Socolinsky and A. Selinger: Thermal Face Recognition In An
Operational Scenario, CVPR 2004, Vol. 2, pp. II-1012 - II-1019 (2004).
[18] S.W. Jung, Y. Kim, A.B.J. Teoh and K.A. Toh: Robust Identity Verification
Based on Infrared Face Images, ICCIT'07, pp. 2066-2071 (2007).
[19] A.F. Abate, M. Nappi, D. Riccio and G. Sabatino: 2D And 3D Face
Recognition: A Survey, Pattern Recognition Letters, Vol. 28, Issue 14, pp.
1885-1906 (2007).
[20] Y. Yao, X. Jing and H. Wong: Face And Palmprint Feature Level Fusion For
Single Sample Biometrics Recognition, Neurocomputing, Vol. 70, Issues 7-9, pp.
1582-1586 (2007).
[21] S. Ribaric and I. Fratric: A Biometric Identification System Based On
Eigenpalm And Eigenfinger Features, IEEE Trans. on Patt. Anal. and Mach.
Intell., Vol. 27, Issue 11, pp. 1698-1709 (2005).
[22] L. Hong and A. Jain: Integrating faces and fingerprints for personal
identification, IEEE Transactions on Pattern Analysis and Machine Intelligence,
Vol. 20, Issue 12, pp. 1295-1307 (1998).
[23] X. Jing, Y. Yao, D. Zhang and M. Li: Face and Palmprint Pixel Level Fusion
And Kernel DCV-RBF Classifier For Small Sample Biometrics Recognition, Pattern
Recognition, Vol. 40, Issue 11, pp. 3209-3224 (2007).
[24] K.I. Chang, K.W. Bowyer, P.J. Flynn and X. Chen: Multi-biometrics Using
Facial Appearance, Shape and Temperature, Proceedings of the Sixth IEEE
International Conference on Automatic Face and Gesture Recognition, pp. 43-48
(2004).
[25] P. Buyssens, M. Revenu and O. Lepetit: Fusion of IR and visible light
modalities for face recognition, IEEE 3rd International Conference on
Biometrics: Theory, Applications, and Systems, pp. 1-6 (2009).
[26] M.D. Shahbe and S. Hati: Decision fusion based on voting scheme for IR and
visible face recognition, Computer Graphics, Imaging and Visualisation, pp.
358-364 (2007).
[27] S. Singh, A. Gyaourova, G. Bebis and I. Pavlidis: Infrared and visible
image fusion for face recognition, Proceedings of SPIE Defense and Security
Symposium, Vol. 5404, pp. 585-596 (2004).
[28] R. Singh, M. Vatsa and A. Noore: Integrated Multilevel Image Fusion and
Match Score Fusion of Visible and Infrared Face Images for Robust Face
Recognition, Pattern Recognition, Vol. 41, Issue 3, pp. 880-893 (2008).
[29] H. Schwenk: The diabolo classifier, Neural Computation, Vol. 10, Issue 8,
pp. 2175-2200 (1998).
[30] Notre Dame Infrared Face Database:
http://www.nd.edu/~cvrl/undbiometricsdatabase.html
[31] Equinox Infrared Face Database:
http://www.equinoxsensors.com/products/HID.html
[32] OTCBVS Thermal/Visible Face Database:
http://www.cse.ohio-state.edu/OTCBVS-BENCH/bench.html
[33] F. Smarandache and J. Dezert: Advances and applications of DSmT for
information fusion, American Research Press (2004).
[34] V. Vapnik: Statistical Learning Theory, J. Wiley, N.Y. (1998).
[35] M. Gordan, C. Kotropoulos and I. Pitas: A Support Vector Machine-Based
Dynamic Network for Visual Speech Recognition Applications, EURASIP JASP,
Special Issue on Joint Audio-Visual Speech Processing, Vol. 2002, No. 11, pp.
1248-1259 (2002).
[36] M. Gordan, A. Georgakis, O. Tsatos, G. Oltean and L. Miclea: Computational
Complexity Reduction of the Support Vector Machine Classifiers for Image
Analysis Tasks Through the Use of the Discrete Cosine Transform, Proc. of
IEEE-TTTC International Conference on Automation, Quality and Testing, Robotics
A&QT-R 2006 (THETA 15), Vol. 2, pp. 350-355 (2006).
[37] J. Milgram, M. Cheriet and R. Sabourin: "One against one" or "one against
all": which one is better for handwriting recognition with SVMs?, Tenth
International Workshop on Frontiers in Handwriting Recognition (2006).
[38] C. Lu, J. Wang and M. Qi: Multimodal Biometric Identification Approach
Based on Face and Palmprint, Second International Symposium on Electronic
Commerce and Security, Vol. 2, pp. 44-47 (2009).