EMOTION RECOGNITION USING FACIAL EXPRESSIONS WITH ACTIVE APPEARANCE MODELS
Matthew S. Ratliff
Department of Computer Science
University of North Carolina Wilmington
601 South College Road
Wilmington, NC, USA
msr3520@uncw.edu
Eric Patterson
Department of Computer Science
University of North Carolina Wilmington
601 South College Road
Wilmington, NC, USA
pattersone@uncw.edu
ABSTRACT
Recognizing emotion using facial expressions is a key element in human communication. In this paper we discuss a
framework for the classification of emotional states, based
on still images of the face. The technique we present involves the creation of an active appearance model (AAM)
trained on face images from a publicly available database
to represent shape and texture variation key to expression
recognition. Parameters from the AAM are used as features for a classification scheme that is able to successfully
identify faces related to the six universal emotions. The results of our study demonstrate the effectiveness of AAMs
in capturing the important facial structure for expression
identification and also help suggest a framework for future
development.
KEY WORDS
Emotion, Facial Expression, Expression Recognition, Active Appearance Model
Introduction
Facial expressions provide a key mechanism for understanding and conveying emotion. Even the term interface
suggests the primary role of the face in communication between two entities. Studies have shown that interpreting
facial expressions can significantly alter the interpretation
of what is spoken as well as control the flow of a conversation [31]. Mehrabian has suggested that the ability of humans to interpret emotions is very important to effective communication, accounting for up to 93% of the communication in a normal conversation [23]. For ideal human-computer interfaces (HCI), we would desire that machines
have this capability as well. Computer applications could
better communicate by changing responses according to
the emotional state of human users in various interactions.
In order to work toward these capabilities, efforts
have recently been devoted to integrating affect recognition into human-computer applications [20]. Applications
exist in both emotion recognition and agent-based emotion
generation [17]. The work presented in this paper explores
the recognition of expressions, although the same research
can be useful for synthesising facial expressions to convey emotion [17] [18]. By creating machines that can understand emotion, we enhance the communication that exists
between humans and computers. This would open a variety
of possibilities in robotics and human-computer interfaces
such as devices that warn a drowsy driver, attempt to placate an angry customer, or better meet user needs in general. The field of psychology has played an important role
in understanding human emotion and in developing concepts that may aid these HCI technologies. Ekman and
Friesen have been pioneers in this area, helping to identify six basic emotions (anger, fear, disgust, joy, surprise, sadness) that appear to be universal across humanity [12].
In addition, they developed a scoring system used to systemically categorize the physical expression of emotions,
known as the Facial Action Coding System (FACS) [13].
FACS has been used in a variety of studies and applications, and has found its way into many face-based computer technologies. The facial muscle movements that FACS classifies in creating certain expressions informed our choice of landmarks for active appearance model (AAM) shape parameters. Our
work thus far has been focused on developing a framework
for emotion recognition based on facial expressions. Facial
images representing the six universal emotions mentioned
previously as well as a neutral expression were labeled in
a manner to capture expressions. An AAM was built using
training data and tested on a separate dataset. Test face images were then classified as one of the six emotion-based
expressions or a neutral expression using the AAM parameters as classification features. The technique achieved a
high level of performance in classifying these different facial expressions based on still images. This paper presents
a summary of current contributions to this area of research,
discusses our approach to the problem, and details techniques we plan to pursue for this work.
Previous Work
Data Collection
Publicly available facial expression data sets for this area have been somewhat limited, and it would be useful to encourage comparisons of techniques on the same data sets.
Feature Extraction
In order to recognize expressions of the face, a useful feature scheme and extraction method must be chosen. One
of the most famous techniques used in face recognition and
related areas is that of eigenfaces developed by Turk and
Pentland [29]. An average face and a set of basis functions for face-space are constructed using principal component analysis. Although a successful method for simple face
recognition, this technique would lack feature specificity
of underlying muscle movements appropriate to facial expressions.
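The eigenface idea can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: random pixel vectors stand in for real face images, and the number of retained components is arbitrary.

```python
import numpy as np

# Toy stand-in for a face data set: 20 "images" of 16x16 = 256 pixels each.
rng = np.random.default_rng(0)
faces = rng.normal(size=(20, 256))

# Eigenfaces: principal component analysis on mean-centred images.
mean_face = faces.mean(axis=0)
centred = faces - mean_face

# The SVD of the centred data yields the principal components (eigenfaces).
_, _, vt = np.linalg.svd(centred, full_matrices=False)
eigenfaces = vt[:10]               # keep the 10 strongest components

# A face is described by its projection weights onto face-space.
weights = centred @ eigenfaces.T   # shape (20, 10)

# Reconstruction from those weights approximates the original image.
reconstructed = mean_face + weights @ eigenfaces
```

The weight vectors, rather than raw pixels, are what a recognition scheme would compare, which is why the method captures global appearance but not localised muscle movements.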
Other feature extraction methods have been explored
including image-processing techniques such as Gabor filters and wavelets [21]. Bartlett used a similar approach for
feature extraction employing a cascade of classifiers used
to locate the best filters for feature extraction [1]. Michel
and el Kaliouby use a method similar to our approach for extracting features [24]. Their method employs a feature-point tracking system similar to active shape models. Cohen's research suggests that feature-point tracking shows on average 92% agreement with
manual FACS coding by professionals [7]. Shape information of some kind is likely one of the most important types
of data to include in any feature method.
Image-based methods have been applied in many areas of facial computing. One of the most successful recent
techniques, though, incorporates both shape and texture information from facial images. The AAM, developed initially by Cootes and Taylor [11], has shown strong potential in a variety of facial recognition technologies, but to our
knowledge has yet to be used in recognizing emotions. It
has the ability to aid in initial face-search algorithms and in
extracting important information from both the shape and
texture (wrinkles, nasolabial lines, etc.) of the face that
may be useful for communicating emotion.
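A minimal sketch of how an AAM couples shape and texture follows, in the spirit of Cootes and Taylor. Random landmark and pixel data stand in for real training faces, and the component counts and weighting scheme are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 30                                   # training faces
shapes = rng.normal(size=(n, 2 * 22))    # 22 (x, y) landmarks per face
textures = rng.normal(size=(n, 400))     # shape-normalised pixel samples

def pca(data, k):
    """Return the mean, top-k components, and per-sample parameters."""
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    comps = vt[:k]
    return mean, comps, (data - mean) @ comps.T

# Separate linear models for shape and for texture.
s_mean, s_comps, b_s = pca(shapes, 8)
g_mean, g_comps, b_g = pca(textures, 8)

# Weight the shape parameters so their variance is commensurate with the
# texture parameters' (a weighting of this kind appears in Cootes and
# Taylor), then concatenate and run a third PCA to obtain combined
# appearance parameters c.
w_s = np.sqrt(b_g.var() / b_s.var())
combined = np.hstack([w_s * b_s, b_g])
c_mean, c_comps, c = pca(combined, 10)

# c holds one compact appearance vector per face; vectors of this kind
# are what an expression classifier would consume.
```

Because the combined parameters encode shape and texture jointly, they can represent wrinkle and fold patterns that a shape-only model would miss.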
Classification Schemes
Several classification schemes have been used thus far, including support vector machines, fuzzy-logic systems, and
neural networks. For instance, Eckschlager et al. used an
ANN to identify a user's emotional state based on certain
pre-defined criteria [10]. NEmESys attempts to predict the
emotional state of a user by obtaining certain knowledge
about things that commonly cause changes in behavior. By
giving the computer prior information such as eating habits,
stress levels, sleep habits, etc. the ANN predicts the emotional state of the user and can change its responses accordingly. (One weakness of this approach is that it requires
the user to fill out a questionnaire providing the system with
the information in advance). While this system is unique,
it does not incorporate any interpretation of facial emotion,
which has been identified as one of the key sources of emotional content [12] [18] [9]. Another approach used a fuzzy,
rule-based system to match facial expressions that returned
a probable emotion based on rules of the system [25].
Several have used support vector machines (SVM) as
a classification mechanism [21] [1] [24]. In most cases
SVMs yield good separation of the clusters by projecting
the data into a higher dimension. Michel and el Kaliouby reported 93.3% successful classification when aided by AdaBoost for optimal filter selection [24]. Sebe, Lew, Cohen, Garg, and Huang [27] offer a naive Bayes approach
in emotion recognition based on a probability model of facial features given a corresponding emotional class. The
work presented in this paper does not focus on classification schemes and uses a simple Euclidean-distance classification.
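The Euclidean-distance classification used here can be sketched as a nearest-class-mean rule. The AAM parameter vectors below are synthetic stand-ins (clusters offset per emotion), so the dimensions and offsets are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
emotions = ["anger", "fear", "disgust", "joy", "surprise", "sadness", "neutral"]

# Stand-in training features: 10 parameter vectors per emotion, offset so
# each class clusters around a distinct centre.
train = {e: rng.normal(loc=3.0 * i, size=(10, 12))
         for i, e in enumerate(emotions)}

# One mean parameter vector per emotion class.
class_means = {e: feats.mean(axis=0) for e, feats in train.items()}

def classify(params):
    """Return the emotion whose class mean is nearest in Euclidean distance."""
    return min(class_means, key=lambda e: np.linalg.norm(params - class_means[e]))

# A probe drawn near the "joy" cluster centre should be labelled accordingly.
probe = rng.normal(loc=3.0 * emotions.index("joy"), scale=0.1, size=12)
label = classify(probe)
```

Its appeal is simplicity: no training beyond computing class means, which makes it a reasonable baseline before moving to SVMs or Bayesian classifiers.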
Background
[Table 1 residue: only the score column survived extraction (7.0, 7.5, 6.5, 6.0, 3.5, 9.5); the corresponding row labels were lost.]
The Experiment
Upon evaluation of the facial expression database [30], several subjects were removed from the data set used for this
work. Occlusions such as eyeglasses, hair, as well as inconsistencies in expression were the main factors that contributed to the removal of these subjects. Facial images and
their representative expression in the database were categorized based on emotion clarity, sincerity, and head movement. Emotion clarity ranks the image based on the clarity
of the emotional content. Sincerity was also chosen as a
measure to help determine how well the subject conveys
the intended emotion. Head movement is not included in
the experiment and those images exhibiting certain levels
of head movement are excluded from the training and test
sets. Also, subjects 7, 14, and 17 were removed from the
data set due to facial occlusions such as eyeglasses and hair
as well as other inconsistencies. Overall, the database stills
were evaluated and each image given an overall score as
shown in Table 1. A benchmark was set which marked the
minimum score required for inclusion in this initial experiment.
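The screening step described above can be sketched as a simple filter over per-image scores. The scores, the benchmark value of 6.0, and the key names here are hypothetical illustrations; the paper does not state its actual threshold.

```python
# Hypothetical per-image evaluation scores (emotion clarity, sincerity,
# and head movement combined into one overall score, as described above).
scores = {
    ("subject_1", "joy"): 7.5,
    ("subject_7", "anger"): 3.5,     # occluded by eyeglasses -> low score
    ("subject_14", "fear"): 4.0,
    ("subject_2", "surprise"): 8.0,
}

BENCHMARK = 6.0                      # illustrative minimum score, not the paper's
EXCLUDED_SUBJECTS = {"subject_7", "subject_14", "subject_17"}

def keep(item):
    """Retain images from non-excluded subjects that meet the benchmark."""
    (subject, _emotion), score = item
    return subject not in EXCLUDED_SUBJECTS and score >= BENCHMARK

retained = dict(filter(keep, scores.items()))
```

Applying the same rule to every image yields the training and test sets used in the experiment.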
Subject                  Percentage Correct
Subject 1                80.0%
Subject 2                74.0%
Subject 3                90.5%
Subject 4                90.9%
Subject 5                96.3%
Subject 6                79.2%
Subject 8                83.3%
Subject 9                100%
Subject 10               60.0%
Subject 11               100%
Subject 12               75.0%
Subject 13               100%
Subject 15               83.3%
Subject 16               89.7%
Subject 18               100%
Total Average Correct    91.7%
One mode of the AAM based on these parameter vectors is shown in Figure 3. The two faces on either side represent variation from the mean within the model.
Experimental Results
Emotion      Percentage Correct
Fear         90.0%
Joy          93.3%
Surprise     79.7%
Anger        63.9%
Disgust      93.3%
Sadness      63.9%
Neutral      93.3%
Conclusions
Using the AAM as a feature method has proven successful even with a simple Euclidean-distance classification
scheme. The capability of AAMs to model both the shape
and texture of faces makes them a strong tool to derive feature sets for emotion-based expression classification. It is
likely that more sophisticated classifiers such as
SVMs will provide better results on this data set. Overall,
though, this initial work has shown potential for AAMs as
a feature set for expression classification.
Future Work
We are currently expanding this initial data set to consider other classification schemes used in conjunction with
AAM parameter features, considering Bayesian classifiers
and SVMs initially. We also plan to explore dynamic expression recognition, as levels of sophistication can be improved with temporal information. This also has the possibility of strengthening methods based on FACS and scoring
schemes that are generated automatically.
A current weakness in this area of facial study,
though, is still the lack of comparable databases. We plan to
consider others in our future work [2] but would also like to
encourage the creation and use of common data sets in this
area as a means to strengthen comparison and fine-tuning
of techniques.
Acknowledgements
Special thanks to Dr. Eric Patterson for guidance and direction with the project, as well as Dr. Curry Guinn for
assistance with project scope and data collection methods.
Also, gratitude is extended to Frank Wallhoff for providing
a freely accessible database.
References
[1] Marian Stewart Bartlett, Gwen Littlewort, Ian Fasel, and Javier R. Movellan. Real time face detection and facial expression recognition: Development and applications to human computer interaction. In Pro-
[24] Philipp Michel and Rana El Kaliouby. Real time facial expression recognition in video using support vector machines, 2003.