
Am I your sibling? Inferring Kinship Cues from Facial Image Pairs

Sherin M Mathews, Department of Electrical and Computer Engineering, University of Delaware, Newark, Delaware, sherinm@udel.edu
Chandra Kambhamettu, Department of Computer and Information Sciences, University of Delaware, Newark, Delaware, chandrak@udel.edu
Kenneth E. Barner, Department of Electrical and Computer Engineering, University of Delaware, Newark, Delaware, barner@udel.edu

Abstract: Kinship inferred from pairs of facial images provides contextual information for various applications including forensics, genealogical science research, image retrieval, and image database annotation. Because automatically identifying and predicting siblings from pairs of facial images with high confidence remains a challenge in computer vision applications, we propose in this paper a robust framework for detecting siblings from a pair of images, based upon how closely one image's feature set matches that of another. In calculating similarity for a given pair of images, our algorithm predicts a sibling pair only when matched-feature vectors are above a defined similarity metric threshold (85%). We illustrate a combination of metaheuristic and support vector machine methods for recognition wherein distance-based features can be used to build a hidden Markov model. A further contribution of the work is the development of a novel classification strategy that fuses a genetic algorithm and a support vector machine in order to identify siblings.

Keywords: Kinship Classification, Context, Genetic Algorithm, Face Recognition, Discrete Cosine Transform, Support Vector Machine

I. INTRODUCTION

Because the human face provides the most salient features for biometric identification compared to other biometric features (e.g., fingerprint, palmprint, voiceprint, signature, iris or retina scans), analyzing facial images has become a leading research topic. A recent trend in image processing has been the analysis of kinship cues for sociological and psychological applications. According to the theory of inclusive fitness put forward by [1], recognizing kinship and degrees of relatedness is relevant to understanding the social behavior of animals and humans. People are often drawn to others perceived as similar, which influences their decision to choose leaders and role models [2]. Humans tend to offer more assistance to kin than to non-kin, as reported in [3], [4], [5], [6].

Research in psychology and cognitive science [7], [8], [9] has demonstrated the human face's potential as an influential cue in kinship similarity measurement. Although kinship can be established by means of DNA testing (because kin have overlapping genetics), such an expensive test is impractical for mass screening, thus creating a need for an automatic familial relationship recognition system. While facial recognition is a trivial activity for humans, it is quite a challenging task for computers, a subject of increased study in recent years. Detecting kinship from face images has potential applications in, for example, the analysis of social media, in child adoption practices, in curbing child trafficking and locating missing children, as well as in historic and forensic science research.

Existing kinship recognition systems have placed emphasis on developing frameworks that calculate the accuracy of inference of a given image pair's likely sibling relationship. In doing so, the main focus was to develop an image classification methodology based on an absolute mathematical function as represented by a set of various feature parameters. Such systems do not, however, take into consideration the relation that might exist among the parameters. For example, the shape of a facial feature (e.g., eyes, lips, nose) will always remain the same, irrespective of the number of images considered. By contrast, when features are depicted with the same sort of shape, it becomes easier both to group and to classify them. Hence, we aim to develop a recognition system that infers whether a given image pair is sibling or non-sibling, and also predicts the similarity metric between a given image pair. The steps taken to find kinship similarities in image pairs are given below:

For each dataset, we first compute its eigenfaces, and a representative vector v of each face is obtained by linear projection. Once the representative vector is formed, we implement the Discrete Cosine Transform (DCT) over the images, and retrieve the coefficients for training the Support Vector Machine (SVM). Therefore, eigenfaces are calculated not just by normal feature vector selection but by a DCT-based coefficient selection. The major advantage of using DCT-based coefficients is that the DCT is able to compact more information into a smaller dataset than is possible in traditional transform-based methods. Geometric distance-based features are used to build a Hidden Markov Model. The classification algorithm is modified to include the best features for the genetic algorithm and support vector machine (SVM) algorithm, thereby obtaining higher-confidence results. The computational complexity is also reduced due to the smaller, compact feature sets obtained by applying genetic operators in the genetic algorithm.

The rest of this paper is organized as follows: related work in Section II, proposed methodology in Section III, experimental results and discussion in Section IV, and conclusion in Section V.
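The energy-compaction property of the DCT mentioned in the introduction can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard orthonormal 2-D DCT-II (whose normalization may differ from the form given in Algorithm 1), a synthetic 8 x 8 ramp image instead of a face, and hypothetical helper names (dct_2d, low_frequency_block, energy) chosen only for this example.

```python
import math


def dct_2d(image):
    """Orthonormal 2-D DCT-II of a rectangular list-of-lists image."""
    n_rows, n_cols = len(image), len(image[0])

    def scale(k, n):
        # Orthonormal scaling: sqrt(1/n) for the DC term, sqrt(2/n) otherwise.
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n_cols for _ in range(n_rows)]
    for u in range(n_rows):
        for v in range(n_cols):
            total = 0.0
            for i in range(n_rows):
                for j in range(n_cols):
                    total += (
                        image[i][j]
                        * math.cos(math.pi * u * (2 * i + 1) / (2 * n_rows))
                        * math.cos(math.pi * v * (2 * j + 1) / (2 * n_cols))
                    )
            coeffs[u][v] = scale(u, n_rows) * scale(v, n_cols) * total
    return coeffs


def low_frequency_block(coeffs, k):
    """Keep only the top-left k x k (lowest-frequency) coefficients."""
    return [row[:k] for row in coeffs[:k]]


def energy(matrix):
    """Sum of squared entries."""
    return sum(value * value for row in matrix for value in row)


# Synthetic smooth test image (an assumption; real inputs would be face images).
image = [[float(i + j) for j in range(8)] for i in range(8)]
coeffs = dct_2d(image)
features = low_frequency_block(coeffs, 2)   # 4 of 64 coefficients
compaction = energy(features) / energy(coeffs)
print(f"fraction of energy kept by 4 of 64 coefficients: {compaction:.4f}")
```

For a smooth image such as this ramp, almost all of the signal energy falls into a handful of low-frequency coefficients, which is why a small block of DCT coefficients can serve as a compact feature vector. Real face images are less smooth, so the fraction retained for a given block size would be lower.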

978-1-4799-8428-2/15/$31.00 ©2015 IEEE


II. RELATED WORK

One of the first works to tackle the challenge of kinship verification was that of Fang et al. [10], who extracted features by means of a simplified Pictorial Structure Model and used k-Nearest Neighbors (KNN) and SVM classification schemes. Somanath et al. [11] addressed the problem of verifying kinship on a low-resolution database by using the Metric Learning approach. Xia et al. [12] used an intermediate young-parent facial image set to reduce divergence among the children for kinship verification. A neighborhood repulsed metric learning (NRML) algorithm was presented in [13], and prototype-based discriminative feature learning (PDFL) for kinship verification was presented in [14]. Fang et al. [15] extended kinship verification to kinship classification, wherein the proposed approach involved reconstructing the query face from a sparse set of samples among the candidates for family classification. In [16], a graph model-based approach that incorporates facial similarities was presented as a cue to improve the performance of kinship recognition. A method to recognize kinship from videos by means of describing facial dynamics was presented in [17], using facial features and spatio-temporal appearances. Current kinship-recognition algorithms are designed to determine the accuracy of inference as to whether a given image pair is a sibling pair or not. We intend to provide a framework to distinguish between sibling and non-sibling pairs. In addition, we find the similarity within an image pair by predicting a similarity metric.

Our algorithm has an additional application in which the aim is to identify a match for a given target image with images from a database by predicting a similarity score. In order to compute feature vectors, a number of techniques could be adopted, such as transform-based techniques and geometric-based techniques. We illustrate a combination of metaheuristic and SVM methods for recognition wherein distance-based features are used to build a Hidden Markov Model (HMM). Although several statistically motivated approaches have been proposed for classification, to the best of our knowledge the combination of a genetic algorithm and support vector machine has not before been used for kinship-recognition tasks.

Earlier databases suffered from non-uniform illumination issues, variance in expression, and dissimilar head poses. Automatic kinship recognition itself is an inherently challenging task requiring high-quality databases to avoid issues resulting from low-quality pictures and unconstrained imaging conditions. Taking this into account, we used a high-resolution sibling image database called SiblingDB collected at Politecnico di Torino [18]. To analyze the generalization capabilities of the proposed approach, we also tested our algorithm on LQfaces [18], which contains low-quality images of celebrity sibling pairs from the Internet. The approach in [18] used a combination of geometric, holistic, and textural feature attributes. An SVM classification, aided by a feature selection process, was incorporated to obtain kinship recognition results. While the outcomes were encouraging, we propose a novel approach that uses robust features for kinship recognition and correlates results to a gene pool by employing a genetic algorithm.

III. PROPOSED WORK

Our framework for sibling recognition predicts whether or not a given input image pair is a sibling pair by computing the similarity value, that is, the similarity metric. Here we present the intermediate results in each stage, explaining all modifications made.

A. Preprocessing

Typically the preprocessing stage employs a grayscale image representation followed by the DCT (Algorithm 1). Requiring either a pair of input images or a merged input image, the demonstrated methodology then converts the RGB image to gray-level by means of linear projection onto a linear space to provide eigenvalues that form the basis for building the HMM of the next stage.

A calculation of DCT coefficients comes next, as 1) higher recognition rates can be achieved with lower computational costs [19], and 2) the DCT has a strong energy compaction property, concentrating most visually significant information in just a few coefficients. A series of coherent DCT-provided coefficients F(u, v) can then be computed (Algorithm 1, step 3).

Algorithm 1 Preprocessing of the Images
Input: Image 1, denoted by Img1, is the image of the first person, and Image 2, denoted by Img2, is the image of the second person; or Image 3, denoted by Img3, is the merged image of Image 1 and Image 2.
Output: Image ready for further computation, denoted by F(u, v)
1: Read input image
2: Convert to grayscale, i.e., RGB to grayscale, by linear projection of (x, y, z)
3: Perform Discrete Cosine Transform
   $$F(u, v) = \left(\frac{2}{N}\right)^{\frac{1}{2}} \left(\frac{2}{M}\right)^{\frac{1}{2}} \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} \Lambda(i)\,\Lambda(j)\, \cos\left[\frac{\pi u}{2N}(2i + 1)\right] \cos\left[\frac{\pi v}{2M}(2j + 1)\right] f(i, j)$$
   where F(u, v) are the DCT coefficients of the M × N image and f(i, j) is the pixel intensity at (i, j)

B. Feature Extraction

Gradient features are extracted from a grayscale image with the help of the Sobel operator, essentially a discrete differentiator that performs a 2-D spatial gradient measurement on images, primarily to detect edges in both directions [20]. The Sobel edge detector uses a pair of convolution masks, one estimating the gradient for the x-direction and the other for the y-direction, to find the absolute gradient magnitude at each pixel of an input grayscale image (Algorithm 2, step 1). These gradients, however, are not merely simple tan functions of the arc of the radius, but are actually dependent upon the tangent vector passing through two different points on the image. Thus
the distances between successive edges, calculated using the Sobel operator, are used as feature vectors to build the HMM.

To begin with, the HMM consists of two interrelated processes: 1) an underlying Markov chain having a finite number of states, a state-transition probability matrix, and an initial state-probability distribution; and 2) a set of probability density functions associated with each state [21], [22]. It can be defined as the triplet (π, A, B) (Algorithm 2, step 2).

The HMM models the likelihood of a sequence of observations as a series of state transitions, which in turn are governed by a set of probabilities called transition probabilities. In any particular state an outcome or observation can only be generated according to the associated probability distribution. It is, therefore, the outcome, not the state, that is visible to an external observer, and thus states are hidden; hence the name Hidden Markov Model [23], [24].

Distance-based features are employed to build the Hidden Markov Model. The Hidden Markov Model indicates the probability that the distance between two successive edges remains constant when we move from one pixel block (one state) to another pixel block (next state) in an image. The function in Algorithm 3, step 3 represents the change in distance between the edges when moving around an image. If the distance remains constant throughout the transition from one block to another, then the solution of the equation is 1; if not, it is 0, implying that the distance is changing in a certain way.

Algorithm 2 Feature Extraction
Input: F(u, v), Img original
Output: feature vector with components indexed 1, 2, 3, 4, ..., x, where 0 < x < N − 1
1: Perform Sobel Operator
   $$G = \sqrt{(G_x)^2 + (G_y)^2}$$
2: Build Hidden Markov Model
   $(\pi, A, B)$ with $\Pr(x_t / x_{t-1})$ and $\Pr(y_t / y_{t-1})$,
   where π is the vector of initial state probabilities, A is the state transition matrix, and B is the confusion matrix
3: Feature vector component x = [F(u, v) G + (π, A, B)]

The overall process can be explained as follows: the Sobel operator calculates the distance between successive edges which, when combined with the HMM, yields an easy-to-classify function of boolean states as a series of 0s and 1s.

C. Classification

In the classification stage, we first compute the SVM classification, which is the modulus of the image distance, followed by the genetic algorithm for optimization. Next, we compare this prediction to that predicted by the HMM. Classification also takes into consideration whether the values are mixed intricately. In order to properly classify boundary values, we propose performing a Genetic Algorithm (GA)-based strategy for optimization [25].

By the use of appropriate mutation operators, we are able to successfully classify boundary points. The term X(·) (Algorithm 3, step 3) indicates the value of the expectation parameter X for a given distance feature vector. If the mutation operator is able to optimize a given value and that value is less than the DCT value, we proceed with classification. Otherwise, optimization is incomplete and the given number of values (i.e., the decision boundary passing through the given zeros and ones) is incorrect, requiring more constraints to optimize it further. The genetic algorithm invokes itself to repeat the process until the constraint is met and classification may proceed.

The accuracy of the SVM classification is guaranteed for each pair's dataset, as the classification has been optimized using a genetic algorithm. The output of this classification is then given in terms of a percentage representing how close the two images of a pair are with respect to kinship. This similarity measure is derived from the number of matched features from the merged image pair.

Algorithm 3 Classification
Input: feature vector (n_1, ..., n_x)
Output: Img out, where Img out is the classified image; O, where O is a boolean variable having value 0 or 1
1: Support Vector Machine (SVM) Classifier
   $$\min_{\beta, \beta_0} L(\beta) = \frac{1}{2}\|\beta\|^2 \quad \text{subject to} \quad y_i(\beta^T x_i + \beta_0) \ge 1 \;\; \forall i$$
2: Iteratively select best features for SVM using Genetic Algorithm
   Gene pool: X(n_1), X(n_2), and so on.
   Mutation operator: applied to X(n_1) and X(n_2), and so on.
3: If the mutated value of X(n_1) and X(n_2) is less than min F(u, v), then L is one half of the squared norm of that value
4: Next, if L meets the constraint for every i, then operator O = 1, or 0 otherwise

IV. EXPERIMENTAL RESULTS AND DISCUSSION

We evaluated the proposed algorithm by conducting a number of experiments for each pair of frontal images from the SiblingDB and LQfaces databases. The following provides details of the databases, experimental results, and discussion.

A. Database

The SiblingDB consists of images shot with a uniform background and controlled lighting, and a resolution of 4256 × 2832 pixels (Fig. 1). It is composed as follows:

1. HQ-f: frontal expressionless images of 184 subjects (92 sibling pairs);
2. HQ-fp: 158 individuals, each represented by one frontal and one profile expressionless image (79 sibling pairs);
3. HQ-fps: 112 individuals, each represented by a set of four images per individual (56 sibling pairs) [18].
Fig. 1: Examples of the HQ-frontal (HQ-f) dataset. The images are high-quality images taken under controlled lighting conditions; the top row is a sibling pair and the bottom row is a non-sibling pair.

Fig. 2: Examples of the LQ dataset. The images in the LQ dataset have disparate resolutions and lighting conditions; the top row is a sibling pair and the bottom row is a non-sibling pair.

The second database, LQfaces [18], contains 98 pairs of siblings taken from the Internet (196 individuals; mostly celebrities). The photographs had disparate resolutions and were taken under various lighting conditions (Fig. 2). Our algorithm was designed for frontal profile images, and hence we used HQ-f and LQfaces for our analysis. For each pair of images in HQ-f and LQfaces, we created merged image pairs consisting of sibling and non-sibling image pairs. Information regarding the relation between image pairs was taken from the meta-data sheet provided by the databases. Given a merged input image, our algorithm predicts whether a pair is sibling or not, in addition to its similarity metric percentage, which indicates the accuracy of the measured similarity value.

B. Results and Discussion

For the HQ-f dataset, experimental results illustrate that the framework was accurately able to distinguish sibling and non-sibling pairs, and the highest similarity metric accuracy obtained for an image pair is about 92.40%. This indicates the robustness of the algorithm, as it has a higher confidence in predicting accurate sibling pairs.

Furthermore, the results obtained are more dependable as a genetic algorithm is used for optimization. The algorithm predicts a pair of images as siblings only when similarities between chosen image pairs are greater than the threshold limit (i.e., 85%). For the LQ dataset, the experimental results show that, in addition to the reduction in testing time, the highest similarity metric accuracy obtained for an image pair is about 90.24%. Not only are our results more reliable, as the confidence in results is enhanced due to the use of genetic operators, but we gain the added advantage of reduced complexity and lessened runtime.

The similarity measure is derived from the number of matched features from a merged image pair. Feature vectors of distance are used to build an HMM that is further utilized for classification through a combination of SVM and a genetic algorithm. We define the similarity metric threshold to be 85%. Only when the two feature vectors match more than the threshold does the system predict a sibling pair.

Rationale for choosing a threshold of 85%

A similarity match between two siblings of less than 85% is possible only when the developed HMM has uncertain states. Because we use the distance between edges as features, we would have a number of geometric distances to be considered as facial features, making an 85% feature match a reasonable estimate. Our results can also predict the value of a similarity metric even greater than 100, implying a confidence-measure interval in two successive edges of more than 100%. By exploiting the similarity information, a GA provides the conclusion that both images must have the same gene pool, knowledge which will be used as a mutation operator.

V. CONCLUSION

In this paper, a new robust and effective method for recognizing kin from frontal image pairs is presented. The kinship recognition framework predicts a similarity measure for a given image pair. Eigenfaces are calculated by DCT-based coefficient selection, while an HMM calculates the probabilities of the various state transitions. Classification is performed through a novel combination of a genetic algorithm and SVM.

Not only do these experimental results demonstrate the efficacy and effectiveness of the proposed method, but they also point to a substantial reduction in error rates and a lower processing time for predicting relations. Greater reliability of results is obtained through the use of genetic operations for optimization: the algorithm predicts an image pair to be a sibling pair only when matched feature vectors are above a defined similarity-metric threshold. These results can be correlated to the gene pool as we obtain high-confidence recognition results.

Currently, we are in the stage of implementing an automated system in order to recognize sibling relations based on
a variety of facial profiles and expressions. As a genetic test may not always be practical for checking kinship, our aim is to implement an unobtrusive and rapid computer vision solution in its place.

REFERENCES

[1] W. D. Hamilton, "The genetical evolution of social behaviour II," Journal of Theoretical Biology, vol. 7, no. 1, pp. 17-52, 1964.
[2] R. F. Baumeister, "The Self," in The Handbook of Social Psychology, 1998.
[3] D. Dunning, K. Johnson, J. Ehrlinger, and J. Kruger, "Why people fail to recognize their own incompetence," Current Directions in Psychological Science, vol. 12, no. 3, pp. 83-87, 2003.
[4] S. Stewart-Williams, "Altruism among kin vs. nonkin: Effects of cost of help and reciprocal exchange," Evolution and Human Behavior, vol. 28, no. 3, pp. 193-198, 2007.
[5] R. L. Michalski and T. K. Shackelford, "Grandparental investment as a function of relational uncertainty and emotional closeness with parents," Human Nature, vol. 16, no. 3, pp. 293-305, 2005.
[6] J. Jeon and D. M. Buss, "Altruism towards cousins," Proceedings of the Royal Society B: Biological Sciences, vol. 274, no. 1614, pp. 1181-1187, 2007.
[7] L. M. DeBruine, F. G. Smith, B. C. Jones, S. C. Roberts, M. Petrie, and T. D. Spector, "Kin recognition signals in adult faces," Vision Research, vol. 49, no. 1, pp. 38-43, 2009.
[8] G. Kaminski, S. Dridi, C. Graff, and E. Gentaz, "Human ability to detect kinship in strangers' faces: effects of the degree of relatedness," Proceedings of the Royal Society B: Biological Sciences, vol. 276, no. 1670, pp. 3193-3200, 2009.
[9] A. Alvergne, R. Oda, C. Faurie, A. Matsumoto-Oda, V. Durand, and M. Raymond, "Cross-cultural perceptions of facial resemblance between kin," Journal of Vision, vol. 9, no. 6, p. 23, 2009.
[10] R. Fang, K. D. Tang, N. Snavely, and T. Chen, "Towards computational models of kinship verification," in ICIP, 2010, pp. 1577-1580.
[11] G. Somanath and C. Kambhamettu, "Can faces verify blood-relations?" in Fifth International Conference on Biometrics: Theory, Applications and Systems (BTAS). IEEE, 2012, pp. 105-112.
[12] S. Xia, M. Shao, and Y. Fu, "Kinship verification through transfer learning," in IJCAI Proceedings-International Joint Conference on Artificial Intelligence, vol. 22, no. 3, 2011, p. 2539.
[13] J. Lu, X. Zhou, Y.-P. Tan, Y. Shang, and J. Zhou, "Neighborhood repulsed metric learning for kinship verification," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36, no. 2, pp. 331-345, 2014.
[14] H. Yan, J. Lu, and X. Zhou, "Prototype-based discriminative feature learning for kinship verification," 2014.
[15] R. Fang, A. C. Gallagher, T. Chen, and A. C. Loui, "Kinship classification by modeling facial feature heredity," in ICIP, 2013, pp. 2983-2987.
[16] Y. Guo, H. Dibeklioglu, and L. v. d. Maaten, "Graph-based kinship recognition," in 22nd International Conference on Pattern Recognition (ICPR). IEEE, 2014, pp. 4287-4292.
[17] H. Dibeklioglu, A. A. Salah, and T. Gevers, "Like father, like son: Facial expression dynamics for kinship verification," in IEEE International Conference on Computer Vision (ICCV), 2013, pp. 1497-1504.
[18] T. F. Vieira, A. Bottino, A. Laurentini, and M. De Simone, "Detecting siblings in image pairs," The Visual Computer, vol. 30, no. 12, pp. 1333-1345, 2014.
[19] F. M. de S Matos, L. V. Batista et al., "Face recognition using DCT coefficients selection," in Proceedings of the ACM Symposium on Applied Computing. ACM, 2008, pp. 1753-1757.
[20] H. Liu and X. Ding, "Handwritten character recognition using gradient feature and quadratic classifier with multiple discrimination schemes," in Eighth International Conference on Document Analysis and Recognition. IEEE, 2005, pp. 19-23.
[21] M. Z. Uddin, J. Lee, and T.-S. Kim, "An enhanced independent component-based human facial expression recognition from video," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2216-2224, 2009.
[22] I. Kotsia and I. Pitas, "Facial expression recognition in image sequences using geometric deformation features and support vector machines," IEEE Transactions on Image Processing, vol. 16, no. 1, pp. 172-187, 2007.
[23] M. H. Siddiqi and S. Lee, "Human facial expression recognition using wavelet transform and hidden Markov model," in Ambient Assisted Living and Active Aging. Springer, 2013, pp. 112-119.
[24] M. Vijayalakshmi and T. Senthil, "Automatic human facial expression recognition using hidden Markov model," in International Conference on Electronics and Communication Systems (ICECS). IEEE, 2014, pp. 1-5.
[25] W. Xiaoqiang, "Study on genetic algorithm optimization for support vector machine in network intrusion detection," Advances in Information Sciences and Service Sciences, vol. 4, no. 2, pp. 282-288, 2012.
