I. INTRODUCTION
Face recognition plays an important role in many applications such as building/store access control, suspect identification, and surveillance [1], [2], [4]–[7], [16]–[23]. Over the
past 30 years, many different face-recognition techniques have
been proposed, motivated by the increased number of real-world
applications requiring the recognition of human faces. There are
several problems that make automatic face recognition a very
difficult task. The face image of a person input to a face-recognition system is usually acquired under different conditions from
those of the face image of the same person in the database.
Therefore, it is important that the automatic face-recognition
system be able to cope with numerous variations of images of
the same face. The image variations are mostly due to changes
in the following parameters: pose, illumination, expression, age,
disguise, facial hair, glasses, and background [18]–[23].
In many pattern-recognition systems, the statistical approach
is frequently used [18]–[23]. Although this paradigm has been
successfully applied to various problems in pattern classification, it is difficult to express structural information unless an
appropriate choice of features is possible. Furthermore, this
approach requires much heuristic information to design a classifier. Neural-network (NN)-based paradigms, as new means of
implementing various classifiers based on statistical and structural approaches, have been proven to possess many advantages.
Manuscript received January 28, 2005; revised September 22, 2005.
The authors are with the Graduate School of Science and Technology, Chiba
University, Chiba 263-8522, Japan (e-mail: yuanxue@graduate.chiba-u.jp).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TNN.2006.884678
II. PREPROCESSING
A. Facial-Image Acquisition
In our research, original images were obtained using a
charge coupled devices (CCD) camera with image dimensions
of 384 × 243 pixels encoded using 256 gray-scale levels.
In image acquisition, the subject sits 2.5 m away from a CCD
camera. On each side of the camera, two 200-W lamps are placed
at 30° angles to the camera horizontally. The original images are
shown in Fig. 1.
B. Lighting Compensation
We adjusted the locations of the lamps to change the lighting
conditions. The total energy of an image is the sum of the
squares of the intensity values. The average energy of all the
face images in the database is calculated, and each face image
is then normalized to have energy equal to this average

    Energy = Σ_x Σ_y [Intensity(x, y)]²    (1)
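The normalization above can be sketched as follows; the function names and the toy images are our own, not from the paper:

```python
import numpy as np

def image_energy(img):
    """Total energy of an image: sum of squared intensity values, as in (1)."""
    img = img.astype(np.float64)
    return float(np.sum(img ** 2))

def normalize_energy(images):
    """Scale each image so its energy equals the database's average energy."""
    avg_energy = np.mean([image_energy(im) for im in images])
    normalized = []
    for im in images:
        # Multiplying intensities by sqrt(avg/E) multiplies the energy by avg/E
        scale = np.sqrt(avg_energy / image_energy(im))
        normalized.append(im.astype(np.float64) * scale)
    return normalized

# Example: two constant toy images with different brightness
imgs = [np.full((4, 4), 10.0), np.full((4, 4), 20.0)]
out = normalize_energy(imgs)
# After normalization, both images carry the same total energy.
```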
C. Facial-Region Extraction
We adopt the face-detection method presented in [25]. The
method of detecting and extracting the facial features in a grayscale image is divided into two stages. First, the possible human
eye regions are detected by testing all the valley regions in an
image. A pair of eye candidates is selected by means of the genetic algorithm to form a possible face candidate. In our method,
a square block is used to represent the detected face region.
Fig. 2 shows an example of a selected face region based on the
location of an eye pair. The relationships between the eye pair
and the face size are defined as fixed proportions of the distance between the detected eye pair.
Let a pattern x be a two-dimensional (2-D)
array of intensity values. A pattern may also be considered as a vector
of dimension equal to the number of pixels. Denote the database of M patterns by
{x_1, x_2, ..., x_M}. Define the covariance matrix as
follows [4]:

    C = (1/M) Σ_{i=1}^{M} (x_i − x̄)(x_i − x̄)^T    (2)
where x̄ = (1/M) Σ_{i=1}^{M} x_i is the mean pattern. Then, the eigenvalues and eigenvectors of
the covariance matrix are calculated. Let u_1, u_2, ..., u_k
be the eigenvectors corresponding to the k
largest eigenvalues. Thus, for a set of patterns x_i,
their corresponding eigenface-based features y_i can be
obtained by projecting into the eigenface space as follows:

    y_i = [u_1, u_2, ..., u_k]^T (x_i − x̄)    (3)
For the PCA method, results are shown for the case of using
32 principal components. In other words, faces from a high-dimensional image space are projected to a 32-dimensional feature vector.
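A minimal sketch of the eigenface projection in (2) and (3), computing the leading eigenvectors via the SVD of the centered data (equivalent to an explicit covariance eigendecomposition, up to sign); all names and the toy data are illustrative:

```python
import numpy as np

def eigenface_features(patterns, k=32):
    """Project flattened face patterns onto the k leading eigenvectors
    of the sample covariance matrix (the eigenface space)."""
    X = np.asarray(patterns, dtype=np.float64)   # shape (M, d), one pattern per row
    mean = X.mean(axis=0)
    Xc = X - mean                                # center, as in (2)
    # Right singular vectors of Xc are the eigenvectors of the covariance
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                          # k leading eigenvectors, shape (k, d)
    features = Xc @ components.T                 # projection (3), shape (M, k)
    return features, components, mean

# 20 toy "patterns" of dimension 64, projected to 8 features
rng = np.random.default_rng(0)
faces = rng.normal(size=(20, 64))
feats, comps, mu = eigenface_features(faces, k=8)
```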
III. FUZZY CLUSTERING AND NEURAL NETWORKS
The clusters are functions that assign to each object a
number between zero and one, which is called the membership
of the object in the cluster. Objects which are similar to each
other are identified by having high membership degrees in the
same cluster. It is also assumed that the membership degrees
are chosen so that their sum for each object is one; therefore,
fuzzy clustering is also a partition of the set of objects. The most
widely used fuzzy clustering algorithm is the FCM algorithm
[13]–[15].
A. FCM
FCM is a data clustering algorithm in which each data point
is associated with a cluster through a membership degree. This
technique divides a collection of n data points into c fuzzy
groups and finds a cluster center in each group such that a cost
function of a dissimilarity measure is minimized. The algorithm
employs fuzzy partitioning such that a given data point can
belong to several groups with a degree specified by membership grades between 0 and 1. A fuzzy c-partition of the input feature vectors
X = {x_1, x_2, ..., x_n} is represented by a c × n matrix
U = [u_ij], where X is an n-element set of vectors,
each representing a 32-dimensional feature vector. The entries satisfy
the following constraints:

    u_ij ∈ [0, 1],  1 ≤ i ≤ c,  1 ≤ j ≤ n    (4)

    Σ_{i=1}^{c} u_ij = 1,  1 ≤ j ≤ n    (5)

    0 < Σ_{j=1}^{n} u_ij < n,  1 ≤ i ≤ c    (6)
Here, x_j represents the feature coordinates of the jth
data point, and u_ij is the membership degree of x_j in cluster i. A proper
partition of X may be defined by the minimization of the cost function

    J(U, c_1, ..., c_c) = Σ_{i=1}^{c} Σ_{j=1}^{n} (u_ij)^m (d_ij)²    (7)

where m > 1 is a weighting exponent and c_i denotes the center of
cluster i. Minimizing (7) subject to constraint (5) yields the membership update

    u_ij = 1 / Σ_{k=1}^{c} (d_ij / d_kj)^{2/(m−1)}    (8)

and the cluster-center update

    c_i = Σ_{j=1}^{n} (u_ij)^m x_j / Σ_{j=1}^{n} (u_ij)^m    (9)
One of the major factors that influence the determination
of appropriate clusters is the dissimilarity measure
chosen for the problem. Indeed, the computation of the membership degree u_ij depends on the definition of the distance
measure d_ij, which is the inner-product norm (quadratic norm).
The squared quadratic norm (distance) between a pattern vector
x_j and the center c_i of the ith cluster is defined as

    (d_ij)² = ||x_j − c_i||_A² = (x_j − c_i)^T A (x_j − c_i)    (10)

where A is any positive-definite matrix. The identity matrix is
the simplest and most popular choice for A.
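With the identity matrix chosen for A, the quadratic norm in (10) reduces to the squared Euclidean distance; a small sketch (the function name is ours):

```python
import numpy as np

def quadratic_norm_sq(x, c, A=None):
    """Squared quadratic norm (x - c)^T A (x - c), as in (10).
    A defaults to the identity, giving the squared Euclidean distance."""
    d = np.asarray(x, dtype=float) - np.asarray(c, dtype=float)
    if A is None:
        return float(d @ d)
    return float(d @ np.asarray(A, dtype=float) @ d)

# With A = I, (10) is the squared Euclidean distance: 3^2 + 4^2
print(quadratic_norm_sq([3.0, 4.0], [0.0, 0.0]))  # 25.0
```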
B. Distributing Algorithm of the Facial Images by FCM
The FCM algorithm consists of a series of iterations using (8)
and (9). This algorithm converges to a local minimum point of
the cost function. We use the FCM as follows to determine the cluster
centers and the membership matrix U.
Step 1) Initially, the membership matrix is constructed using
random values between 0 and 1 such that constraints
(4), (5), and (6) are satisfied.
Step 2) The membership function is computed as follows.
a) For each cluster i, the fuzzy cluster center c_i is
computed using (9).
b) All cluster centers which are too close to each
other are eliminated. For each cluster i, the
distance between its center and every other cluster
center is computed; when this distance is below
the average distance value between centers, the
two clusters concerned are merged.
c) For each cluster i, the distance d_ij is computed
using (10).
d) The cost function is computed using (7). Stop
if its improvement over the previous iteration is
below a threshold.
e) A new membership matrix U is computed using (8)
and Step 2) is repeated.
Step 3) The number of membership functions is decreased
based on defuzzification.
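The iteration of the steps above can be sketched as a generic FCM loop. This is a textbook FCM implementation under our own naming; it omits the paper's center-merging and defuzzification steps:

```python
import numpy as np

def fcm(X, c, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Generic fuzzy c-means: alternate the membership update (8) and the
    center update (9) until the cost (7) stops improving."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Step 1) random membership matrix satisfying constraints (4)-(6)
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)
    prev_cost = np.inf
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um @ X) / Um.sum(axis=1, keepdims=True)          # (9)
        d2 = ((X[None, :, :] - centers[:, None, :]) ** 2).sum(-1)   # (10), A = I
        d2 = np.fmax(d2, 1e-12)                                     # guard against zero distance
        cost = float((Um * d2).sum())                               # (7)
        if prev_cost - cost < tol:                                  # step d): stop on small improvement
            break
        prev_cost = cost
        U = d2 ** (-1.0 / (m - 1.0))                                # (8), up to normalization
        U /= U.sum(axis=0, keepdims=True)
    return centers, U

# Two well-separated toy clusters in the plane
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.1, (10, 2)), rng.normal(5.0, 0.1, (10, 2))])
centers, U = fcm(X, c=2)
```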
C. Parallel NNs
In this paper, the parallel NNs are composed of three-layer
BPNNs. A connected NN with 32 input neurons and six output
neurons has been simulated (six individuals are permitted to
belong to each subnet, which is presented in Section IV). The
structure of the proposed parallel NNs is illustrated in Fig. 5.
The number of hidden units was selected by sixfold cross
validation from 6 to 300 units [29]. The algorithm added three
nodes to the growing network at a time. The number of hidden units
is selected based on the maximum recognition rate.
1) Learning Algorithm: A standard pattern (average pattern)
is obtained from 12 patterns per registrant. Based on the FCM
algorithm, 20 standard patterns are divided into several clusters.
Similar patterns in one cluster are entered into one subnet.
Then, 12 patterns of a registrant are entered into the input
layer of the NN to which the registrant belongs. On each subnet,
the weights are adapted according to the minus gradient of the
squared Euclidean distance between the desired and obtained
outputs.
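The subnet training described above is standard backpropagation on the squared error; a minimal sketch with illustrative layer sizes, learning rate, and epoch count (not the paper's settings):

```python
import numpy as np

def train_subnet(patterns, targets, hidden=12, lr=0.1, epochs=500, seed=0):
    """Minimal three-layer backpropagation subnet: weights follow the
    negative gradient of the squared Euclidean distance between the
    desired and obtained outputs. All hyperparameters are illustrative."""
    rng = np.random.default_rng(seed)
    n_in, n_out = patterns.shape[1], targets.shape[1]
    W1 = rng.normal(0, 0.5, (n_in, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, n_out)); b2 = np.zeros(n_out)
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(epochs):
        h = sig(patterns @ W1 + b1)              # hidden layer
        y = sig(h @ W2 + b2)                     # output layer
        err = y - targets                        # gradient of 0.5 * ||y - t||^2
        dy = err * y * (1 - y)
        dh = (dy @ W2.T) * h * (1 - h)
        W2 -= lr * h.T @ dy;       b2 -= lr * dy.sum(0)
        W1 -= lr * patterns.T @ dh; b1 -= lr * dh.sum(0)
    return W1, b1, W2, b2

# Toy subnet: 12 patterns of dimension 32, 6 registrants (one-hot targets)
rng = np.random.default_rng(1)
Xp = rng.normal(size=(12, 32))
T = np.eye(6)[np.arange(12) % 6]
W1, b1, W2, b2 = train_subnet(Xp, T)
```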
2) Recognition Algorithm: When a test pattern is input into
the parallel NNs, as illustrated in Fig. 5, based on the outputs
in each subnet and the similarity values, the final result can be
obtained as follows.
Step 1) Exclusion by the negation ability of NN. First, all the
registrants are regarded as candidates. Then, only
the candidate with the maximum output remains in
each subnet. If the maximum output values are less
than the threshold value, corresponding candidates
are deleted. The threshold value is set to 0.5, which
is determined based on the maximum output value
of the patterns of the nonregistrant. Since similar individuals are distributed into one subnet, based on
this step, the candidates similar to the desired individual are excluded.
Step 2) Exclusion by the negation ability of parallel NNs.
Among the candidates remaining after Step 1), the
candidate that has been excluded in one subnet will
be deleted from other subnets. If all the candidates
are deleted, the input pattern is judged to be a nonregistrant.
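Our reading of the two rejection steps above can be sketched as follows; the candidate labels and output values are hypothetical:

```python
def reject_candidates(subnet_outputs, threshold=0.5):
    """Sketch of the rejection rules: in each subnet only the maximum-output
    candidate survives, and it is dropped if its output is below the
    threshold; any candidate excluded in one subnet is then deleted from
    the other subnets (negation across the parallel NNs)."""
    survivors, rejected = set(), set()
    for outputs in subnet_outputs:            # one dict per subnet
        best = max(outputs, key=outputs.get)  # Step 1): keep only the max output
        if outputs[best] >= threshold:
            survivors.add(best)
        else:
            rejected.add(best)                # below threshold: candidate deleted
        # every non-maximum candidate in this subnet is excluded
        rejected.update(k for k in outputs if k != best)
    # Step 2): a candidate excluded anywhere is deleted everywhere
    return survivors - rejected

# Hypothetical outputs of three subnets (candidate id -> output value)
subnets = [{"p11": 0.9, "p06": 0.2},
           {"p12": 0.8, "p18": 0.4},
           {"p11": 0.7, "p06": 0.1}]
print(sorted(reject_candidates(subnets)))  # ['p11', 'p12']
```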
TABLE I
SUBNETS DETERMINED USING FCM
A. Computation by FCM
TABLE II
SUBNETS AFTER PARTIAL UNIFICATION
3) Merging of Clusters: In order to reduce the amount of calculation, clusters are integrated (arranged). However, the maximum number of elements per cluster is six. We integrate the
clusters automatically. The algorithm for this step is presented
as follows.
for
If net-count
max-to-cluster, continue
for
for
If net-count
net-count
net-count
max-to-cluster, continue
net-count
TABLE III
OUTPUT RESULTS OF SUBNET 1 AFTER LEARNING
TABLE IV
OUTPUTS OF EACH SUBNET FOR PATTERN 1116
subnet as the first answer to each subnet. For subnet 6, no answer was obtained because all element values were lower than
the threshold of 0.5. Table V lists the results. For subnets 1–7,
patterns 6, 11, 12, and 18 were selected. Pattern 6 was excluded
from subnets 3, 5, and 6. Pattern 18 was excluded from subnets 5
and 6 as well as from the answers of all subnets. Patterns 11 and
12 remained after recognition based on the negation ability of
the parallel NNs system. The results are presented in Table VI.
Here, recognition by the similarity measure was applied. The
TABLE V
RESULTS OF EACH SUBNET FOR PATTERN 1116
TABLE X
RESULTS OF EACH SUBNET FOR PATTERN N0311
TABLE VI
RECOGNITION RESULTS OF PARALLEL NNS BASED
ON REJECTION RULES FOR PATTERN 1116
TABLE XI
OUTPUT OF EACH SUBNET FOR PATTERN N0307
TABLE VII
EXAMPLE OF SIMILARITY WITH PATTERN 1116
TABLE VIII
OUTPUTS OF EACH SUBNET FOR PATTERN N0101
TABLE IX
OUTPUTS OF EACH SUBNET FOR PATTERN N0311
TABLE XII
RESULTS OF EACH SUBNET FOR PATTERN N0307
TABLE XIII
ERROR RATES OF DIFFERENT APPROACHES
TABLE XIV
ERROR RATES OF RECENTLY PERFORMED EXPERIMENTS
ON THE ORL DATABASE
APPENDIX
COMPARISONS WITH OTHER APPROACHES
To compare our proposed recognition system against popular
face-recognition methods, we evaluated it
on the Olivetti Research Laboratory (ORL) database, Cambridge
University, Cambridge, U.K.1 All 400 patterns from the ORL
database are used to evaluate the face-recognition performance
of our proposed method. The ORL face database is composed
of 400 patterns, ten for each of 40 distinct
individuals. The patterns vary in pose, size,
time, and facial expression. All the images were taken against a
dark homogeneous background with the subjects in an upright,
frontal position, with tolerance for some tilting and rotation of
up to about 20°. There is some variation in scale of up to about
10% [27]. The spatial and gray-scale resolutions of the patterns
are 92 × 112 pixels and 256 levels, respectively.
The training set and test set are derived in the same way as
in [6], [16], [24], and [25]. A total of 200 patterns were randomly selected as the training set and another 200 patterns as the
testing set, in which each individual has five patterns. Next, the
training and testing patterns were exchanged and the experiment
was repeated one more time. Such procedures were carried out
several times. In the following experiments, the reported error rate is the average of the error rates
obtained over the repeated runs (three runs in [6], four runs in [26], six runs
in [16], and five runs in [27]).
The face-recognition procedure consists of 1) a feature-extraction step, where the feature representation of each training or
test pattern is extracted by PCA + FDA (Fisher discriminant analysis) [16], and 2) a classification step, in which each feature representation is input into the proposed fuzzy clustering
and parallel NN system. A 1% error rate was obtained when 25
features were used. This is better than the result (error rate of
1.92%) reported by Er et al. [16], where the feature-extraction
step was the same as ours and an RBFNN was used in the classification step. It should be noted that PCA + FDA is
used in step 1) because the facial patterns from the ORL database
vary in pose and facial expression. As mentioned by
Er et al. [16], PCA retains unwanted variations caused by
lighting, facial expression, and other factors; the fisherface paradigm aims at overcoming this drawback of the eigenface paradigm by integrating FDA criteria. On the other hand, Lu et al. [27] mention
that fisherfaces may lose significant discriminant information due to the intermediate PCA step. Therefore,
plain PCA is used to extract features in our aforementioned experiment, since the variation of the patterns in our own database is slight.
Comparisons with the convolutional NN (CNN) [6], RBFNN
[16], nearest feature line (NFL) [26], and direct fractional-step LDA (DF-LDA) [27] methods performed on the same ORL database are shown in Table XIV.
1The ORL database is available from http://www.cam-orl.co.uk/face-database.html
[21] Q. Liu, X. Tang, H. Lu, and S. Ma, "Face recognition using kernel scatter-difference-based discriminant analysis," IEEE Trans. Neural Netw., vol. 17, no. 4, pp. 1081–1085, Jul. 2006.
[22] W. Zheng, X. Zhou, C. Zou, and L. Zhao, "Facial expression recognition using kernel canonical correlation analysis (KCCA)," IEEE Trans. Neural Netw., vol. 17, no. 1, pp. 233–238, Jan. 2006.
[23] X. Tan, S. Chen, Z. H. Zhou, and F. Zhang, "Recognizing partially occluded, expression variant faces from single training image per person with SOM and soft k-NN ensemble," IEEE Trans. Neural Netw., vol. 16, no. 4, pp. 875–886, Jul. 2005.
[24] D. Valentin, H. Abdi, A. J. O'Toole, and G. W. Cottrell, "Connectionist models of face processing: A survey," Pattern Recognit., vol. 27, no. 9, pp. 1209–1230, 1994.
[25] K. W. Wong, K. M. Lam, and W. C. Siu, "An efficient algorithm for face detection and facial feature extraction under different conditions," Pattern Recognit., vol. 34, no. 10, pp. 1993–2004, 2001.
[26] S. Z. Li and J. Lu, "Face recognition using the nearest feature line method," IEEE Trans. Neural Netw., vol. 10, no. 2, pp. 439–443, Mar. 1999.
[27] J. Lu, K. N. Plataniotis, and A. N. Venetsanopoulos, "Face recognition using LDA-based algorithms," IEEE Trans. Neural Netw., vol. 14, no. 1, pp. 195–200, Jan. 2003.
[28] A. S. Nugroho, S. Kuroyanagi, and A. Iwata, "Efficient subspace learning using a large scale neural network CombNET-II," in Proc. 9th Int. Conf. Neural Inf. Process., Nov. 2002, vol. 1, pp. 447–451.
[29] R. Setiono, "Feedforward neural network construction using cross validation," Neural Comput., vol. 13, pp. 2865–2877, 2001.
Jianming Lu received the M.S. and Ph.D. degrees
from the Graduate School of Science and Technology, Chiba University, Chiba, Japan, in 1990 and
1993, respectively.
In 1993, he joined Chiba University as an Associate in the Department of Information and Computer
Sciences. Since 1994, he has been with the Graduate
School of Science and Technology, Chiba University,
where, in 1998, he became an Associate Professor.
His current research interests are in the theory and
applications of digital signal processing and control
theory.
Dr. Lu is a member of the Institute of Electronics, Information and Communication Engineers (IEICE, Japan), the Society of Instrument and Control Engineers (SICE, Japan), the Institute of Electrical Engineers of Japan (IEEJ), the Japan
Society of Mechanical Engineers (JSME), and the Research Institute of Signal Processing, Japan.