
Reverse Video Search: A New Definition for Indexing Videos

Jamal J, Nijanthan P, and Kalichy Balamurugan

Pondicherry Engineering College, Puducherry, India
Email: jamal@pec.edu, nijanth.niji@pec.edu, kalichy.90@gmail.com

Abstract—The Internet hosts a myriad of videos with no good indexing available. The effectiveness of state-of-the-art indexing extends only as far as the person indexing the videos, since it is purely text based. The best way to index most videos is by the people featured in them. The idea is to develop a reverse video search engine that, given image(s) of the person(s) in a video, returns videos ordered by relevance (the length of time each video features the person). Videos over the Internet are indexed into the database using skin color modeling and skin region segmentation. The database contains the faceprint values of all distinct faces in the videos. At retrieval, the given image is processed and the faces in it are detected. The faceprint values of those faces are computed and matched against the values in the database. The developed system eases the video search process to a great extent and is of particular use in the investigation of various cases.

Index Terms—reverse video search, faceprint values, face recognition, skin color model, video indexing

I. INTRODUCTION

The Internet today has torrents of videos that are poorly indexed, because videos are indexed by text closely related to their content. In many cases we end up with a picture of a person(s) and make a futile attempt to retrieve the full-length video featuring that person(s). Traditional video search demands that the video seeker know the name of the person featured in the video, or at least the name of the event or something of significance related to the video, which must also happen to be in sync with the indexing done. One way to improve video search is to automate the indexing of videos over the Internet by the persons featured in them, without human intervention, in addition to the existing method of content-based indexing. This combined approach would ease the process of searching videos over the Internet. In this paper, we concentrate on indexing videos based on the person(s) featured in them, leaving the combined method as future work.

The indexing process is enhanced to account for illumination and for facial accessories that occlude the person's face. Each frame of the video is scanned for the presence of a face, and once the presence is confirmed, the face is segmented from the rest of the frame and indexed into the database. The paper uses two methods for face recognition and segmentation [1], which are explained later. Searching consists of uploading an image of the person(s), which is then processed to retrieve the videos. The results are sorted by the length of time the videos feature the person(s).

The rest of the paper is organized as follows: Section II deals with the inefficiency of the current search system, Section III details the working of the system, Section IV gives a brief note on the algorithms used, and Section V presents the experimental results.

II. A STUDY OF CURRENT VIDEO SEARCH

In this section we explore the effectiveness of the current video search system. As stated earlier, the current search system requires the video seeker to know significant information in order to retrieve a video, which is out of reach in most cases. The paper brings out this ineffectiveness using two cases.

A. Just an Image Takes You Nowhere:

Let us take the case of Aarushi Talwar, which was a hot topic discussed time and again a few months back on all the news channels in India. Consider a person, Alice, who knows little about this case, ends up with a photo (Figure 1), and wishes to see all the videos featuring the persons in the image. She uses the reverse image search system that is presently available and finds herself no better off, for all the result could provide her was a zero match [2]. The current state-of-the-art system requires Alice to do an extensive search to retrieve the videos.

Figure 1. Image of the persons whose full-length videos are to be searched

B. In the Field of Investigation:

The proposed system is of great use in the field of investigation. Suppose an investigation requires a keen study of the accused: examining the videos in which he is featured and his interactions with the people around him. The system available at present extends no help in singling out the videos in which the accused is present, demanding a manual search through all possible videos in their entirety. The proposed system removes human intervention from the process of zeroing in on the videos over the Internet in which the person under investigation appears.

III. PROCESS UNDER THE MICROSCOPE

At the highest level of abstraction, the proposed system is a two-stage process: Stage I is the Indexing Phase and Stage II is the Retrieval Phase.

A. Indexing Phase:

The process flow in this phase is pictured in Figure 2.

The Indexing Phase can be further subdivided into the following sub-phases, in order: crawling the Internet for new videos, frame extraction, alignment, face detection, face segmentation, and indexing into the database.

The indexing phase starts with selecting a video to index into the database. Each frame of the video is scanned for the presence of a face using the skin color model. Once a face is detected, the system determines the head's position, size, and pose; the face must be turned no more than 35 degrees away from the camera. Luminance is removed from the extracted image, and the image is normalized to separate the skin region from the non-skin region [1][3]. The normalized image is a gray-scale image that is thresholded with appropriate values to give a black-and-white image, which is used to distinguish the skin region from the rest of the background. The algorithm used is detailed in the next section.

The system then measures the peaks and valleys that make up the different facial features. There are about 80 such nodal points on a human face. These nodal points are measured to create a numerical code, a string of numbers that represents the face in the database; this string of values is called the faceprint. If the faceprint values are already present, the face is presumed to appear again, and the time-length of that particular person in the video is increased by unit time. At the end of indexing a particular video, the database contains the faceprint values of the distinct persons in the video along with the time-length each person is featured in the video.

B. Retrieval Phase:

The video search begins with the image of the person(s) to be searched, which is uploaded to the system. The image may contain a single person or a group of persons. The image is then processed to extract the faces it contains.
The faceprint values of the persons in the image are calculated and searched against the values in the database. The search result provides links to the videos, ordered by relevance as mentioned earlier. The proposed system can also take a group of photos as input. The retrieval process is pictured in Figure 3.
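The two phases above can be sketched as follows. This is a minimal illustration, assuming each detected face has already been reduced to a faceprint vector of nodal-point measurements; the class name, the distance threshold, and the data layout are hypothetical placeholders, not the paper's actual implementation.

```python
# Sketch of the indexing and retrieval phases over faceprint vectors.
# All names and the MATCH_THRESHOLD value are illustrative assumptions.
import math
from collections import defaultdict

MATCH_THRESHOLD = 0.5  # assumed distance below which two faceprints match

def distance(a, b):
    """Euclidean distance between two faceprint vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class FaceprintIndex:
    def __init__(self):
        # list of (faceprint, {video_id: seconds the face is on screen})
        self.entries = []

    def add_frame(self, video_id, faceprints, frame_seconds=1.0):
        """Indexing phase: credit each detected face with unit screen time."""
        for fp in faceprints:
            for stored_fp, times in self.entries:
                if distance(fp, stored_fp) < MATCH_THRESHOLD:
                    times[video_id] = times.get(video_id, 0.0) + frame_seconds
                    break
            else:  # first sighting of this face: store a new faceprint
                self.entries.append((fp, {video_id: frame_seconds}))

    def search(self, query_faceprints):
        """Retrieval phase: rank videos by total time the queried faces appear."""
        scores = defaultdict(float)
        for q in query_faceprints:
            for stored_fp, times in self.entries:
                if distance(q, stored_fp) < MATCH_THRESHOLD:
                    for video_id, secs in times.items():
                        scores[video_id] += secs
        return sorted(scores.items(), key=lambda kv: -kv[1])

index = FaceprintIndex()
index.add_frame("video_a", [[0.1, 0.2]])
index.add_frame("video_a", [[0.1, 0.2]])
index.add_frame("video_b", [[0.1, 0.21]])
print(index.search([[0.1, 0.2]]))  # video_a ranked above video_b
```

The key design point, as described in the text, is that matching a faceprint already in the database does not create a new entry but accumulates screen time, which later serves as the relevance score.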

Figure 2. Indexing of Videos

Figure 3. Retrieval Phase

IV. ALGORITHM FOR FACE DETECTION

The system revolves around skin color modeling for face recognition. Various other algorithms can be used to make the system more efficient and robust.

A. Skin Color Model [3]:

In order to separate human skin regions from non-skin regions based on color, we need a reliable skin color model that is adaptable to people of different skin colors and to different lighting conditions [4]. The common RGB representation of color images is not suitable for characterizing skin color: in RGB space, the triple (r, g, b) represents not only color but also luminance. Luminance may vary across a person's face due to ambient lighting and is not a reliable measure for separating skin from non-skin regions [5][1]. Luminance can be removed from the color representation in the chromatic color space. Chromatic colors [6], also known as "pure" colors in the absence of luminance, are defined by the normalization shown below:

r = R/(R+G+B) (1)
b = B/(R+G+B) (2)
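Equations (1) and (2) can be applied per pixel to an entire image. The sketch below uses NumPy for illustration (the paper's own implementation is in MATLAB); the small epsilon guarding division by zero is an assumption, not part of the original model.

```python
# Chromatic normalization of equations (1) and (2), sketched with NumPy.
import numpy as np

def to_chromatic(rgb):
    """Map an RGB image (H x W x 3) to chromatic (r, b) coordinates,
    discarding luminance. The epsilon avoids division by zero on black pixels."""
    rgb = rgb.astype(float)
    total = rgb.sum(axis=2) + 1e-9
    r = rgb[..., 0] / total
    b = rgb[..., 2] / total
    return r, b

pixel = np.array([[[200, 100, 100]]])
r, b = to_chromatic(pixel)
print(r[0, 0], b[0, 0])  # approximately 0.5 and 0.25
```

Because r + g + b = 1, only two of the three chromatic components need to be kept, which is why the model below works in the (r, b) plane.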

Every pixel in the normalized image has three values: normalized-red, normalized-green, and normalized-blue. The segmentation process extracts these normalized components and constructs two images. Each image is converted to black and white by applying a different threshold to the normalized input image, namely r = 0.41-0.50 and g = 0.21-0.30. Finally, we perform an 'AND' operation between these two black-and-white images, in which white pixels are skin and black pixels are non-skin.

In this approach, due to noise and distortion in the input image, the color information of some skin pixels behaves like a non-skin region and generates non-contiguous skin color regions. To solve this problem, a morphological closing operator is first used to obtain skin-color blobs, and a median filter is used to eliminate spurious pixels. The boundaries of skin-color regions are determined using a region-growing algorithm in the binary image, and regions smaller than 1% of the image size are eliminated [8]. At the end of the segmentation process, the black-and-white skin region image is multiplied by the original RGB image, leaving just the skin region.

V. EXPERIMENTS AND RESULTS

The proposed system is idealized and scaled down for the sake of experimentation. A total of 100 videos are indexed into the database. The interface for the system, which has a search-engine-like appearance, is built using HTML, and the back end is developed using PHP. We used Oracle 10g for the database and connected to it using JDBC. We implemented the entire face detection and segmentation algorithm in MATLAB R2010 [9] on a Windows 7 workstation. The system is studied for its performance using videos of various resolutions. The system successfully detects faces and indexes them into the database even when a face is rotated through an angle of up to 35 degrees.
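The segmentation steps described above (the r and g threshold bands, the 'AND' of the two binary images, the morphological closing, the median filter, and the removal of regions below 1% of the image size) can be sketched as follows. The threshold ranges are those quoted in the text; everything else is an illustrative NumPy/SciPy approximation of the paper's MATLAB routine, and the connected-component labeling stands in for the region-growing step.

```python
# Sketch of the skin-region segmentation pipeline described in the text.
import numpy as np
from scipy import ndimage

def skin_mask(rgb):
    """Return a boolean mask of skin pixels for an RGB image (H x W x 3)."""
    rgb = rgb.astype(float)
    total = rgb.sum(axis=2) + 1e-9
    r = rgb[..., 0] / total
    g = rgb[..., 1] / total
    mask_r = (r >= 0.41) & (r <= 0.50)   # normalized-red band from the text
    mask_g = (g >= 0.21) & (g <= 0.30)   # normalized-green band from the text
    mask = mask_r & mask_g               # 'AND' of the two binary images
    mask = ndimage.binary_closing(mask)  # merge noisy pixels into blobs
    mask = ndimage.median_filter(mask.astype(np.uint8), size=3).astype(bool)
    # drop connected regions smaller than 1% of the image area
    labels, n = ndimage.label(mask)
    min_size = 0.01 * mask.size
    for i in range(1, n + 1):
        if (labels == i).sum() < min_size:
            mask[labels == i] = False
    return mask
```

Multiplying the resulting mask by the original RGB image, as the text describes, then leaves only the skin region.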
The results of the various searches conducted on the proposed system are tabulated in Table 1.

Table 1. Performance analysis of the proposed system

Quality of the Video | No. of Test Videos Conducted | Average Indexing Rate | Average Match Rate | Average Face Detection Rate
Low                  | 30                           | 85%                   | 87%                | 85%
Medium               | 30                           | 90%                   | 92%                | 92%
High                 | 30                           | 95%                   | 97%                | 98%

CONCLUSIONS

The paper proposes a novel method that revolutionizes the process of searching videos. The process of indexing videos is automated, and the proposed model is of great help in the field of investigation. Future work includes using an algorithm that converts 2D images to 3D [10] for better recognition, detecting faces turned by up to 60 degrees.

The green component is redundant after normalization because r + g + b = 1. If two points P1 = [r1, g1, b1] and P2 = [r2, g2, b2] are proportional, i.e.,

r1/r2 = g1/g2 = b1/b2 (3)

then P1 and P2 have the same color but different brightness. Chromatic colors have been used effectively to segment color images in many applications [7], and they are also well suited here for segmenting skin regions from non-skin regions. The color distribution of the skin colors of different people was found to be clustered in a small area of the chromatic color space: skin colors of different people are very close, differing mainly in intensity [4]. The color histogram reveals that the distribution of skin color of different people is clustered in the chromatic color space, and a skin color distribution can be represented by a Gaussian model N(m, C), where:

Mean: m = E{x}, where x = (r, b)^T (4)

Covariance: C = E{(x - m)(x - m)^T} (5)
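Fitting the Gaussian model N(m, C) of equations (4) and (5) amounts to computing the sample mean and covariance of the chromatic (r, b) pixels of known skin samples, then scoring new pixels by their likelihood under that Gaussian. The sketch below uses NumPy with made-up sample values for illustration:

```python
# Fitting the Gaussian skin model N(m, C) from chromatic (r, b) samples.
# The sample values are invented for illustration only.
import numpy as np

samples = np.array([[0.45, 0.26], [0.47, 0.25], [0.44, 0.27], [0.46, 0.23]])

m = samples.mean(axis=0)           # mean, m = E{x}, equation (4)
C = np.cov(samples, rowvar=False)  # covariance, C = E{(x-m)(x-m)^T}, equation (5)

def skin_likelihood(x, m, C):
    """Unnormalized Gaussian likelihood exp(-0.5 (x-m)^T C^-1 (x-m))."""
    d = x - m
    return float(np.exp(-0.5 * d @ np.linalg.inv(C) @ d))

print(skin_likelihood(m, m, C))  # 1.0 exactly at the mean
```

Pixels whose chromatic coordinates lie close to the skin cluster score near 1, giving the gray-scale skin-likelihood image used by the segmentation stage below.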

B. Skin Region Segmentation:

The segmentation process removes the background of the image from the skin regions using the previously discussed skin color model. First, the input image is converted to the chromatic color space. Using the Gaussian model, a gray-scale image of skin-likelihood pixels is constructed, in which skin pixels take a roughly constant set of values for each r, g, and b component.

The model can be further enhanced to include a combined method of indexing, which provides the added advantage of both content-based (text) indexing and indexing based on the images of the persons in the video.

REFERENCES
[1] Sanjay Kr. Singh, D. S. Chauhan, Mayank Vatsa, and Richa Singh, "A Robust Skin Color Based Face Detection System," Tamkang Journal of Science and Engineering, vol. 6, no. 4, pp. 227-234, 2003.
[2] The result of searching an image (Figure 1) that is part of a video on the current reverse image search yielded the following: http://www.tineye.com/search/d96186251339a4d03aaa8c5b469aba23b949e2f8
[3] Md. Maruf Monwar, Padma Polash Paul, Md. Wahedul Islam, and Siamak Rezaei, "A Real-time Face Recognition Approach from Video Sequence Using Skin Color Model and EigenFace Method," in Canadian Conference on Electrical and Computer Engineering (CCECE '06), 2006, pp. 2181-2185.
[4] Yang and Alex Waibel, "A Real-time Face Tracker," in Proceedings of the 3rd IEEE Workshop on Applications of Computer Vision (WACV 95), Sarasota, Fla., 1996.

[5] J. Cai, A. Goshtasby, and C. Yu, "Detecting Human Faces in Color Images," in Proceedings of the International Workshop on Multimedia Database Management Systems, pp. 124, 1998.
[6] G. Wyszecki and W. S. Stiles, Color Science: Concepts and Methods, Quantitative Data and Formulae, 2nd ed., John Wiley & Sons, New York, 1982.
[7] Y. Gong and M. Sakauchi, "Detection of Regions Matching Specified Chromatic Features," Computer Vision and Image Understanding, vol. 61, no. 2, pp. 263-269, 1995.
[8] R. S. Feris, T. E. Campos, and R. M. C. Junior, "Detection and Tracking of Facial Features in Video Sequences," in Proceedings of the Mexican International Conference on Artificial Intelligence: Advances in Artificial Intelligence, pp. 127-135, 2000.
[9] Image Acquisition Toolbox 1.1, MATLAB R2010.
[10] Yi Dai, Guoqiang Xiao, and Kaijin Qiu, "Efficient Face Recognition with Variant Pose and Illumination in Video," in Proceedings of the 4th International Conference on Computer Science and Education, 2009.
