
Key Frame Extraction On MPEG by using Threshold Algorithm

CHAPTER 1 INTRODUCTION
1.1 LITERATURE REVIEW
Recent years have witnessed an enormous increase in video data on the internet. This rapid increase demands efficient techniques for the management and storage of video data. Video summarization is one of the commonly used mechanisms for building an efficient video archiving system. Video summarization methods generate summaries of videos, which are sequences of stationary or moving images (Money and Agius, 2008). Key frame extraction is a widely used method for video summarization. The key frames are the characteristic frames of the video, which render limited but meaningful information about the contents of the video (Li et al., 2001). Researchers have attempted to exploit various features for the extraction of key frames in videos, and these features have been utilized in a variety of different ways. Some of the commonly used low level features include the color histogram, frame correlation, motion information and the edge histogram (Jiang et al., 2009). Zhang et al. (1997) used the color histogram difference between the current frame and the last extracted key frame to extract key frames from the video. Gunsel and Tekalp (1998) compared the histogram of the current frame with the average color histograms of the previous frames to compute a discontinuity value. A thorough survey of existing techniques reveals that researchers have used many different visual features for the problem of key frame extraction. In our project we deal with frame difference measures based on the color histogram, frame correlation and edge orientation histogram for the extraction of key frames.

1.2 OVERVIEW OF PROJECT


Efficient key frame extraction enables efficient cataloguing and retrieval of large video collections. Video is rich in content, and this results in a tremendous amount of data to process. The task can be made easier by processing only some frames, such as the key frames of the video. In general, a key frame extraction technique must be fully automated and must use the contents of the video to generate the summary.


Theoretically, key frames should be extracted using high level features such as objects, actions and events. However, key frame extraction based on high level features is mostly specific to certain applications, so low level features are usually employed. Examples of commonly used low level features are the colour histogram, correlation, moments, edges and motion features. These low level features can in turn be used to derive high level features for domain specific applications. A common methodology is to compare consecutive frames based on some low level Frame Difference Measures (FDMs) and to extract a key frame if this difference satisfies a certain threshold condition. The low level features used in our project are (1) colour histogram, (2) frame correlation and (3) edge orientation histogram.

The basic block diagram of the elicitation of key frames in sports video based on multiple frame difference features is shown in Fig. 1.1. It consists of the extraction of frames, colour histogram, correlation, edge orientation histogram and threshold logic modules. The extraction of frames module extracts all the frames from the given input video, and the key frames are identified by the colour histogram, correlation and edge orientation histogram methods by making use of threshold logic. In our work, the results from these three methods are compared for a sample (football) video, a cricket video, a hockey video and a football video.

The colour histogram for the frames is calculated in HSV colour space. HSV stands for Hue, Saturation and Value; the HSV colour model is based on how colors appear to a human observer. From the colour histograms of these three channels, the colour histogram difference measure between frames is calculated. This measure lies between -64 and 0. Frame correlation is computed using Pearson's distance, which is defined as one minus Pearson's correlation coefficient. Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations. The Pearson correlation coefficient falls in [-1, 1] and the Pearson distance lies in [0, 2]. The third measure used is the histogram of edge orientation, computed with the Sobel operator. The edges are first computed using the horizontal and vertical Sobel operators, which are then used to find the gradient and angle of the edges. The angles are then used to build a histogram of edge orientation. The range of values for the edge orientation measure is 0 to 82.

Fig 1.1: Basic Block Diagram (input video -> extraction of all frames in the video -> colour histogram / correlation / edge orientation histogram feature extraction -> threshold logic -> key frames)

Input Video
Cricket, football and hockey videos are taken as input to this work. The video can be in .avi, .flv, .mov, .mp4, .mpg, .rm or other formats. To process a video, its frames have to be extracted. The following is a brief explanation of the video file formats found most commonly.

AVI (.avi): The AVI (Audio Video Interleave) format was developed by Microsoft. The AVI format is supported by all computers running Windows and by all the most popular web browsers.

WMV (.wmv): The Windows Media format was developed by Microsoft. Windows Media is a common format on the Internet, but Windows Media movies cannot be played on non-Windows computers without an extra (free) component installed. Some later Windows Media movies cannot play at all on non-Windows computers because no player is available.

MPEG (.mpg/.mpeg): The MPEG (Moving Pictures Expert Group) format is the most popular format on the Internet. It is cross-platform and supported by all the most popular web browsers.

QuickTime (.mov): The QuickTime format was developed by Apple. QuickTime is a common format on the Internet, but QuickTime movies cannot be played on a Windows computer without an extra (free) component installed.

Flash (.flv/.swf):

The Flash (Shockwave) format was developed by Macromedia. The Shockwave format requires an extra component to play, but this component comes preinstalled with web browsers such as Firefox and Internet Explorer.

3GP (.3gp): The 3GP format is both an audio and video format that was designed as a multimedia format for transmitting audio and video files between 3G cell phones and the Internet. It is a 3G streaming video format, mainly used to meet the high transmission speed of 3G networks, and it is currently the most common type of mobile phone video format.

RealMedia (.rm): RealMedia is a format created by RealNetworks. It contains both audio and video data and is typically used for streaming media files over the Internet. RealMedia can play on a wide variety of media players for both Mac and Windows platforms; RealPlayer is the most compatible.

MPEG-4 (.mp4): MPEG-4 is the newer format for the Internet. In fact, YouTube recommends using MP4. YouTube accepts multiple formats and then converts them all to .flv or .mp4 for distribution. More and more online video publishers are moving to MP4 as the Internet sharing format for both Flash players and HTML5.

Advanced Streaming Format (.asf): ASF was developed by Microsoft; WMV files are ASF files containing Windows Media video. It is intended for streaming and is used to support playback from digital media and HTTP servers, and to support storage devices such as hard disks. It can be compressed using a variety of video codecs. The most common file types contained within an ASF file are Windows Media Audio and Windows Media Video.

Frame Extraction
The input video is divided into frames in this section. To do this task we have used mmreader to extract the frames. The input to mmreader can be any of the above mentioned formats.
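As an illustration, a minimal MATLAB sketch of this step is given below; the file name 'football.avi' is only an assumed example, and in newer MATLAB releases mmreader has been renamed VideoReader with the same interface.

vid = mmreader('football.avi');        % assumed example file; or: VideoReader('football.avi')
nFrames = vid.NumberOfFrames;          % total number of frames in the video
frames = cell(1, nFrames);
for n = 1:nFrames
    frames{n} = read(vid, n);          % each frame is an HxWx3 uint8 RGB array
end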


Feature Extraction
The features compared between frames can be colour, edge, motion or texture features. In this work the low level features colour histogram, frame correlation and edge histogram are obtained using the corresponding frame difference measures. The frame difference values are then calculated over all extracted frames for the different videos.

Key Frame Extraction
To start the extraction process, the first frame is declared as a key frame. Then the frame difference is computed between the current frame and the last extracted key frame. If the frame difference satisfies a certain threshold condition, the current frame is selected as a key frame. This process is repeated for all frames in the video.
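The following sketch shows this selection loop in MATLAB, assuming the frames cell array from the extraction step; fdm() is a hypothetical placeholder standing in for any one of the three frame difference measures, and the threshold value is purely illustrative.

threshold = 0.4;                          % illustrative value; each measure uses its own threshold
keyIdx = 1;                               % rule: the first frame is declared a key frame
keyFrames = frames(1);
for n = 2:numel(frames)
    d = fdm(frames{keyIdx}, frames{n});   % difference vs. the last extracted key frame
    if d > threshold                      % threshold condition met: keep as key frame
        keyFrames{end+1} = frames{n};
        keyIdx = n;
    end                                   % otherwise the frame is discarded
end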

1.3 ORGANISATION OF THESIS
In view of the proposed research work, the theoretical aspects used in this work are presented in the sequence described below.
Chapter 2 describes the need for key frames and introduces the frame difference measures used for the extraction of key frames.
Chapter 3 deals with the basic colour models and the frame difference measure for the extraction of key frames based on the colour histogram.
Chapter 4 explains different correlation coefficients for the extraction of key frames.
Chapter 5 describes the fundamentals of edge detection and different edge detection operators for the extraction of key frames.
Chapter 6 deals with results and discussions.
Chapter 7 presents conclusions and future scope.

CHAPTER 2 VIDEO COMPACTION USING KEY FRAME EXTRACTION


2.1 DEFINITION
A key frame is a frame which represents salient content and information distinct from the previous frames. Key frame extraction is a widely used method for video summarization: the extracted key frames summarize the characteristics of the video. Video summarization is a method to generate a succinct version of a video by eliminating redundant frames. The scheme for video summarization is shown in Fig 2.1, and an effective way of generating key frames is shown in Fig 2.2.

Fig. 2.1: Scheme For Video Summary

Fig. 2.2: The Basic Framework Of The Key Frame Extraction Algorithm (video stream -> frame sequence -> key frame extraction)


2.2 NEED FOR KEY FRAMES


Key frame extraction is an essential part of video analysis and management, providing a suitable video summarization for video indexing, browsing and retrieval. Typical video contains 24 frames per second, so a one hour video contains around 24 x 60 x 60 = 86,400 frames. Most of these frames carry redundant information, which makes key frame extraction essential. The use of key frames reduces the amount of data required in video indexing and provides a framework for dealing with video content. A basic rule of key frame extraction is that it is better to extract too many frames than too few; it is then necessary to discard the frames with repetitive or redundant information during the extraction. To extract valid information from video, process video data efficiently and reduce the transmission load on the network, more and more attention is being paid to video processing technology. The amount of data in video processing is significantly reduced by using video segmentation and key frame extraction, and the transmission, storage and management techniques for video information become more and more important.

2.3 EXTRACTION OF KEY FRAMES USING FRAME DIFFERENCE MEASURES (FDMs)


2.3.1 INTRODUCTION TO FDMs
A common methodology for the extraction of key frames is to compare consecutive frames based on some low level Frame Difference Measures (FDMs). The frame difference is measured, and if this difference exceeds a certain threshold the frame is selected as a key frame; otherwise the frame is discarded. Some of the low level features commonly used for this purpose include the colour histogram, frame correlation, motion information and the edge histogram.

2.3.2 KEY FRAME EXTRACTION
To start the extraction process, the first frame is declared as a key frame. Instead of computing one histogram for the entire image, we divide the image shown in Fig 2.3(a) into a total of Ts sections, each of size m x m, as shown in Fig 2.3(b). This is to effectively measure the level of difference between two frames. The frame difference between the current frame and the last extracted key frame is then computed using the colour histogram, correlation and edge orientation histogram measures. The obtained frame difference is compared with a certain threshold; if the difference satisfies the threshold condition, the current frame is selected as a key frame. By repeating this procedure for all frames we extract the key frames; a sketch of the section division follows the figures below.

Fig 2.3(a): Original image

Fig 2.3(b): Division of image into sections
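A small MATLAB sketch of this section division is given below, assuming an RGB frame img whose height and width are multiples of the section size m (frames may be cropped or resized beforehand if they are not).

m = 8;                                                % each section is m x m pixels
[h, w, ch] = size(img);
sections = mat2cell(img, repmat(m, 1, h/m), repmat(m, 1, w/m), ch);
Ts = numel(sections);                                 % total number of sections per frame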


CHAPTER 3 COLOUR HISTOGRAM DIFFERENCE


3.1 INTRODUCTION
Colour histograms have been commonly used for key frame extraction in frame difference based techniques, because colour is one of the most important visual features for describing an image. Colour histograms are easy to compute and are robust to small camera motions. An image histogram is a type of histogram that acts as a graphical representation of the tonal distribution in an image: it plots the number of pixels at each tonal value. By looking at the histogram for a specific image, a viewer is able to judge the entire tonal distribution at a glance. The horizontal axis of the graph represents the tonal variations, while the vertical axis represents the number of pixels at each particular tone, that is, the size of the image area falling in each of these zones. The left side of the horizontal axis represents the black and dark areas, the middle represents medium grey, and the right hand side represents the light and pure white areas. Thus, the histogram for a very dark image will have the majority of its data points on the left side and centre of the graph. Conversely, the histogram for a very bright image with few dark areas and/or shadows will have most of its data points on the right side and centre of the graph. The representation of a colour histogram is shown in Fig. 3.1, and a small example of computing such a histogram is given after the figure.

Fig 3.1: Colour Histogram representation of image
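A minimal MATLAB illustration of such a histogram, assuming any RGB frame img, could be:

gray = rgb2gray(img);                  % tonal (gray level) version of the frame
counts = imhist(gray, 256);            % number of pixels at each of 256 tonal values
bar(counts);                           % dark images pile up on the left, bright ones on the right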



3.2 COLOR IMAGE PROCESSING

3.2.1 Color fundamentals
Basically, the colors that humans and some other animals perceive in an object are determined by the nature of the light reflected from the object. Visible light is composed of a relatively narrow band of frequencies in the electromagnetic spectrum. A body that reflects light that is balanced in all visible wavelengths appears white to the observer. However, a body that favors reflectance in a limited range of the visible spectrum exhibits some shade of color. For example, green objects reflect light with wavelengths primarily in the 500-570 nm range while absorbing most of the energy at other wavelengths.
Characterization of light is central to the science of color. If the light is achromatic (void of color), its only attribute is its intensity. Achromatic light is what viewers see on a black and white television set, and it has been an implicit component of the discussion of image processing thus far. The term gray level refers to a scalar measure of intensity that ranges from black, through grays, to white. Chromatic light spans the electromagnetic spectrum from approximately 400 to 700 nm. Three basic quantities are used to describe the quality of a chromatic light source: radiance, luminance and brightness. Radiance is the total amount of energy that flows from the light source and is usually measured in watts (W). Luminance, measured in lumens (lm), gives a measure of the amount of energy an observer perceives from the light source. For example, light emitted from a source operating in the far infrared region of the spectrum could have significant energy (radiance), but an observer would hardly perceive it; its luminance would be almost zero. Finally, brightness is a subjective descriptor that is practically impossible to measure. It embodies the achromatic notion of intensity and is one of the key factors in describing color sensation.


3.2.2 Primary colors
Cones are the sensors in the eye responsible for color vision. Cones in the human eye can be divided into three principal sensing categories, corresponding roughly to red, green and blue. Approximately 65% of all cones are sensitive to red light, 33% are sensitive to green light and only 2% are sensitive to blue (but the blue cones are the most sensitive). Due to these absorption characteristics of the human eye, colors are seen as variable combinations of the so-called primary colors Red (R), Green (G) and Blue (B). The wavelength values assigned to the three primary colors are: blue = 435.8 nm, green = 546.1 nm and red = 700 nm. The primary colors can be added to produce the secondary colors of light: magenta (red plus blue), cyan (green plus blue) and yellow (red plus green). Mixing the three primaries, or a secondary with its opposite primary color, in the right intensities produces white light. Differentiating between the primary colors of light and the primary colors of pigments or colourants is important. In the latter, a primary color is defined as one that subtracts or absorbs a primary color of light and reflects or transmits the other two. Therefore, the primary colors of pigments are magenta, cyan and yellow, and the secondary colors are red, green and blue. A proper combination of the three pigment primaries, or of a secondary with its opposite primary, produces black.

3.2.3 Hue and saturation
The characteristics generally used to distinguish one color from another are brightness, hue and saturation. Brightness embodies the achromatic notion of intensity. Hue represents the dominant color as perceived by an observer. Saturation refers to relative purity, or the amount of white light mixed with the hue. The pure spectrum colors are fully saturated. Colors such as pink (red and white) and lavender (violet and white) are less saturated, with the degree of saturation being inversely proportional to the amount of white light added. Hue and saturation taken together are called chromaticity, and therefore a color may be characterized by its brightness and chromaticity.


3.2.4 Importance of color image processing
The use of color in image processing is motivated by two principal factors. First, color is a powerful descriptor that often simplifies object identification and extraction from a scene. Second, humans can discern thousands of color shades and intensities, compared to only about two dozen shades of gray. This second factor is particularly important in manual (i.e., performed by a human) image analysis.

3.3 COLOR MODELS


3.3.1 Introduction to Color models
The purpose of a color model is to facilitate the specification of colors in some standard, generally accepted way. In essence, a color model is the specification of a coordinate system and a subspace within that system where each color is represented by a single point. Most color models in use today are oriented either towards hardware or towards applications where color manipulation is a goal. In terms of digital image processing, the hardware oriented models most commonly used in practice are the RGB (red, green, blue) model for color monitors and a broad class of color video cameras; the CMY (cyan, magenta and yellow) and CMYK (cyan, magenta, yellow and black) models for color printing; and the HSI (hue, saturation and intensity) model, which corresponds closely with the way humans describe and interpret colors. The HSI model also has the advantage that it decouples the color and gray-scale information in an image, making it suitable for many of the gray-scale techniques already developed. There are numerous color models in use today, due to the fact that color science is a broad field that encompasses many areas of application.

3.3.2 RGB color model
The RGB color model is an additive color model in which red, green and blue light are added together in various ways to reproduce a broad array of colors. The name of the model comes from the initials of the three additive primary colors: red, green and blue.


The main purpose of the RGB color model is the sensing, representation and display of images in electronic systems such as televisions and computers, though it has also been used in conventional photography. Before the electronic age, the RGB color model already had a solid theory behind it, based on human perception of colors. Typical RGB input devices are color TV and video cameras, image scanners and digital cameras. Typical RGB output devices are TV sets of various technologies (CRT, LCD, plasma, etc.), computer and mobile phone displays, video projectors, multicolor LED displays and large screens such as the Jumbotron. Color printers, on the other hand, are not RGB devices but subtractive color devices (typically using CMYK color models).

Fig. 3.2: RGB Colour Model

Fig. 3.2 shows the RGB colour model. To form a color with RGB, three colored light beams (one red, one green and one blue) must be superimposed (e.g., by emission from a black screen, or by reflection from a white screen). Each of the three beams is called a component of that color, and each of them can have an arbitrary intensity, from fully off to fully on, in the mixture.

3.3.2.1 Representation of RGB
We can represent the RGB model by using a unit cube. Each point in the cube (or each vector with the origin as its other end) represents a specific color. This model is the best for setting the electron guns of a CRT. Note that for complementary colours the sum of the values equals white light (1,1,1). For example:
Red (1,0,0) + cyan (0,1,1) = white (1,1,1)

Green (0,1,0) + magenta (1,0,1) = white (1,1,1)
Blue (0,0,1) + yellow (1,1,0) = white (1,1,1)

Fig. 3.3: Cartesian coordinates (3D)

MATLAB code for extraction of a particular component:

R = RGB(:,:,1);   % extract the red component
G = RGB(:,:,2);   % extract the green component
B = RGB(:,:,3);   % extract the blue component

3.3.3 HSV color model
The characteristics generally used to distinguish one color from another are brightness, hue and saturation. Brightness embodies the achromatic notion of intensity.
(1) Hue represents the dominant wavelength of the light wave. Thus, when we call an object red, orange or yellow, we are specifying its hue.
(2) Saturation refers to the relative purity, or the amount of white light mixed with the hue. The pure spectrum colors are fully saturated.
The HSV (hue, saturation and value) color model is more intuitive than the RGB color model. The user specifies a color (hue) and then adds white or black. There are three color parameters: hue, saturation and value. A change in the saturation parameter corresponds to adding or subtracting white, and a change in the value parameter corresponds to adding or subtracting black. The HSV model is shown in Fig. 3.4.

Fig. 3.4: HSV Color Model

HSV improves on the color cube representation of RGB by arranging the colors of each hue in a radial slice around a central axis of neutral colors, which ranges from black at the bottom to white at the top. The fully saturated colors of each hue lie on a circle, a color wheel.

MATLAB code for extraction of a particular component:

H = HSV(:,:,1);   % extract the hue component
S = HSV(:,:,2);   % extract the saturation component
V = HSV(:,:,3);   % extract the value component

Conversion from RGB to HSV:
Let r, g, b in [0, 1] be the red, green and blue coordinates, respectively, of a color in RGB space. Let max be the greatest of r, g and b, and min the least.

To find the hue angle h in [0, 360), compute:

h = 0, if max = min
h = (60 * (g - b)/(max - min) + 360) mod 360, if max = r
h = 60 * (b - r)/(max - min) + 120, if max = g
h = 60 * (r - g)/(max - min) + 240, if max = b

The values for s and v of an HSV color are defined as follows:

s = 0, if max = 0
s = (max - min)/max = 1 - (min/max), otherwise

v = max
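As a check of the formulas above, the following MATLAB sketch converts a single pixel; the sample values are arbitrary, and MATLAB's built-in rgb2hsv gives the same result with h scaled into [0, 1].

r = 0.2; g = 0.6; b = 0.4;              % arbitrary sample pixel, r, g, b in [0, 1]
mx = max([r g b]); mn = min([r g b]);
if mx == mn
    h = 0;
elseif mx == r
    h = mod(60*(g - b)/(mx - mn) + 360, 360);
elseif mx == g
    h = 60*(b - r)/(mx - mn) + 120;
else
    h = 60*(r - g)/(mx - mn) + 240;
end
if mx == 0, s = 0; else s = 1 - mn/mx; end
v = mx;                                 % compare with rgb2hsv(cat(3, r, g, b)), whose hue is h/360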

3.3.4 CMYK Color model
It is possible to achieve a large range of the colors seen by humans by combining cyan, magenta and yellow transparent dyes/inks on a white substrate. These are the subtractive primary colors. Often a fourth ink, black, is added to improve the reproduction of some dark colors; this is called the CMY or CMYK colour space. The cyan ink reflects all but the red light, the yellow ink reflects all but the blue light and the magenta ink reflects all but the green light. This is because cyan light is an equal mixture of green and blue, yellow light is an equal mixture of red and green, and magenta light is an equal mixture of red and blue. Cyan = green + blue, so light reflected from a cyan pigment has no red component, i.e., the red is absorbed by cyan. Similarly, magenta subtracts green and yellow subtracts blue. Printers usually use four colors: cyan, yellow, magenta and black, because cyan, yellow and magenta together produce a dark gray rather than a true black. The conversion between RGB and CMY is easily computed as:

C = 1 - R;   M = 1 - G;   Y = 1 - B
R = 1 - C;   G = 1 - M;   B = 1 - Y
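In MATLAB this conversion is a one-line element-wise operation, shown here as a sketch assuming an 8-bit RGB frame img:

rgb = double(img)/255;                 % normalize RGB to [0, 1]
cmy = 1 - rgb;                         % C = 1-R, M = 1-G, Y = 1-B
back = 1 - cmy;                        % recovers the original RGB values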


3.3.5 YIQ Color Model
This model was designed to separate chrominance from luminance. This was a requirement in the early days of color television, when black and white sets were expected to pick up and display what were originally color pictures. The Y channel contains the luminance information (sufficient for black and white television sets), while the I and Q channels (in-phase and in-quadrature) carry the color information. A color television set takes these three channels, Y, I and Q, and converts the information back to R, G and B levels for display on a screen.

3.3.6 HSI color model
In this color model, as in the YIQ model, the luminance or intensity (I) is decoupled from the color information, which is described by a hue channel and a saturation channel. Hue and saturation correspond closely to the way humans perceive color, and thus this model is suited for interactive manipulation of color images, where a change in each variable corresponds to a change the operator expects.

3.3.7 L*a*b* Colour Space
The L*a*b* (brightness, red-green and yellow-blue content) system gives quantitative expression to the Munsell system of colour classification. The L*a*b* colour space is best in terms of perceptual similarity, and it is not dependent on any particular device. Colours can be specified as they are perceived when operating a repro system. In the analysis, L*a*b* is divided into 7 L* levels, 5 a* levels and 5 b* levels. The problem with the L*a*b* colour space is quantization: near the edges of the space the quantization should be coarser, because the volume should be the same for each subspace, whereas in our tests the volume is smaller for values near the edges.

3.4 COLOUR HISTOGRAM DISCRIMINATION


There are several distance formulas for measuring the similarity of colour histograms. In general, techniques for comparing probability distributions, such as the Kolmogorov-Smirnov test, are not appropriate for colour histograms, because visual perception determines similarity rather than closeness of the probability distributions. Essentially, the colour distance formulas arrive at a measure of similarity between images based on the perception of colour content. Three distance formulas that have been used for image retrieval are the histogram Euclidean distance, the histogram intersection and the histogram quadratic (cross) distance.

3.4.1 Histogram Euclidean distance
Let h and g represent two colour histograms. The Euclidean distance between the colour histograms h and g can be computed as:

d^2(h, g) = sum_m (h(m) - g(m))^2

In this distance formula, only identical bins in the respective histograms are compared: two different bins may represent perceptually similar colours, but they are not compared crosswise. All bins contribute equally to the distance.

3.4.2 Histogram quadratic (cross) distance
The colour histogram quadratic distance was used by the QBIC system. The cross distance formula is given by:

d(h, g) = (h - g)^T A (h - g)

The cross distance formula considers the cross-correlation between histogram bins, based on the perceptual similarity of the colours represented by the bins. The set of all cross-correlation values is represented by a matrix A, called the similarity matrix. The element a(i, j) of the similarity matrix A is given, for RGB space, by:

a(i, j) = 1 - d_ij / d_max

where d_ij is the L2 distance between the colours i and j in RGB space and d_max is the maximum of these distances. In the case that the quantization of the colour space is not perceptually uniform, the cross term contributes to the perceptual distance between colour bins. For HSV space it is given by:

a(i, j) = 1 - (1/sqrt(5)) * sqrt( (v_i - v_j)^2 + (s_i cos h_i - s_j cos h_j)^2 + (s_i sin h_i - s_j sin h_j)^2 )
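A small MATLAB sketch of these distances is given below for two 64-bin histograms; the histograms and the similarity matrix A are random stand-ins for illustration only.

h = rand(64,1); h = h/sum(h);          % stand-in normalized histograms
g = rand(64,1); g = g/sum(g);
dEuclid = sqrt(sum((h - g).^2));       % 3.4.1 histogram Euclidean distance
A = eye(64);                           % placeholder similarity matrix a(i,j)
dQuad = (h - g)' * A * (h - g);        % 3.4.2 histogram quadratic (cross) distance
dInter = 1 - sum(min(h, g)) / min(sum(h), sum(g));   % 3.4.3 intersection distance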


3.4.3 Histogram intersection distance
The colour histogram intersection was proposed for colour image retrieval. The intersection distance of histograms h and g is given by:

d(h, g) = 1 - ( sum_m min(h(m), g(m)) ) / min(|h|, |g|)

where |h| and |g| give the magnitude of each histogram, which is equal to the number of samples. Colours not present in the user's query image do not contribute to the intersection distance, which reduces the contribution of background colours. The sum is normalized by the histogram with the fewest samples.

3.5 FORMULATION
For computing the FDM, a colour histogram is built in HSV colour space after a quantization step that reduces the number of distinct colours to 64. Instead of computing one histogram for the entire image, we divide the image into a total of Ts sections, each of size m x m; this is to effectively measure the level of difference between two frames. Each section of one frame is compared with the corresponding section of the other frame using the histogram intersection mechanism. The histogram difference HDi,j,s between two corresponding sections s of histogram His of frame i and histogram Hjs of frame j is defined as:

HDi,j,s = - sum_b min( His(b), Hjs(b) )

The histogram difference HD between two frames i and j is then calculated by taking the average of the difference measure over all sections:

HDi,j = (1/Ts) * sum_s HDi,j,s
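A MATLAB sketch of this formulation follows, assuming two equal-size RGB frames f1 and f2 whose dimensions are divisible by m; the per-section comparison is the plain histogram intersection sum, so the sign and normalization conventions of the actual implementation are not necessarily reproduced.

m = 8; nbins = 64;
hsv1 = rgb2hsv(f1); hsv2 = rgb2hsv(f2);
% quantize HSV into 8 x 2 x 4 = 64 colours, as in the 8:2:4 binning described later
h1 = min(floor(hsv1(:,:,1)*8), 7); s1 = min(floor(hsv1(:,:,2)*2), 1); v1 = min(floor(hsv1(:,:,3)*4), 3);
h2 = min(floor(hsv2(:,:,1)*8), 7); s2 = min(floor(hsv2(:,:,2)*2), 1); v2 = min(floor(hsv2(:,:,3)*4), 3);
idx1 = h1*8 + s1*4 + v1 + 1;           % colour bin index 1..64 for every pixel
idx2 = h2*8 + s2*4 + v2 + 1;
[hgt, wid] = size(idx1); HD = 0; Ts = 0;
for r = 1:m:hgt-m+1
    for c = 1:m:wid-m+1
        b1 = idx1(r:r+m-1, c:c+m-1); b2 = idx2(r:r+m-1, c:c+m-1);
        H1 = histc(b1(:), 1:nbins); H2 = histc(b2(:), 1:nbins);
        HD = HD + sum(min(H1, H2));    % histogram intersection of the section pair
        Ts = Ts + 1;
    end
end
HD = HD / Ts;                          % average over all Ts sections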


CHAPTER 4 CORRELATION DIFFERENCE HISTOGRAM


4.1 INTRODUCTION
Correlation coefficients have been a very popular scheme for finding the similarity between two data sets, and they are invariant to brightness. The cross correlation is used to determine the degree of similarity between two similar images or, with the addition of a linear offset to one of the images, the spatial shift or spatial correlation between the images. The degree of similarity between the two images is determined by the correlation coefficient. The correlation coefficient has value 1 if the two images are identical, 0 if they are completely uncorrelated, and -1 if they are completely anti-correlated.

4.2 TYPES OF CORRELATION COEFFICIENTS


4.2.1 Pearson's Correlation Coefficient (PCC)
The Pearson's correlation coefficient, r, is widely used in statistical analysis, pattern recognition and image processing. Applications of the latter include comparing two images for image registration purposes, object recognition and disparity measurement. For monochrome digital images, the Pearson's correlation coefficient is described by

r = sum_i (x_i - x̄)(y_i - ȳ) / sqrt( sum_i (x_i - x̄)^2 * sum_i (y_i - ȳ)^2 )

where x_i is the intensity of the ith pixel in the first image, y_i is the intensity of the ith pixel in the next image, and x̄ and ȳ are the mean intensities of the two images. The correlation coefficient has value 1 if the two images are identical, 0 if they are completely uncorrelated, and -1 if they are completely anti-correlated, for example if one image is the negative of the other. In theory, one would obtain a value of 1 for r if the object is intact and a value of less than 1 if alteration or movement has occurred. In practice, distortions in the imaging system, pixel noise, slight variations in the object's position relative to the camera, and other factors produce an r value less than 1, even if the object has not been moved or physically altered in any manner. For security applications, typical r values for two digital images of the same scene, one recorded immediately after the other using the same imaging system and illumination, range from 0.95 to 0.98. The interpretation of the correlation coefficient r is shown in Fig. 4.1; its value ranges from -1 to +1.
Case 1: If r = +1, the correlation between the two variables is perfect and positive.
Case 2: If r = -1, the correlation between the two variables is perfect and negative.
Case 3: If r = 0, there exists no correlation between the variables.

Fig. 4.1: Coefficient (r) of determination between x and y frames


One of the obvious advantages of Pearson's correlation coefficient is that it condenses the comparison of two two-dimensional images down to a single scalar value r. The most widely recognized disadvantage is that it is computationally intensive.
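The formula can be checked directly in MATLAB on two grayscale frames, assuming RGB frames f1 and f2; corr2 (Image Processing Toolbox) computes the same scalar.

x = double(rgb2gray(f1)); y = double(rgb2gray(f2));
xd = x - mean(x(:)); yd = y - mean(y(:));
r = sum(xd(:).*yd(:)) / sqrt(sum(xd(:).^2) * sum(yd(:).^2));
% r equals corr2(x, y): 1 for identical frames, -1 if one frame is the negative of the other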

4.2.2 Point-Biserial
The point-biserial correlation coefficient, referred to as r_pb, is a special case of the Pearson coefficient in which one variable is quantitative and the other variable is dichotomous and nominal. The calculations simplify since typically the values 1 (presence) and 0 (absence) are used for the dichotomous variable. This simplification is sometimes expressed as follows:


r_pb = (Y1 - Y0) * sqrt(pq) / σ_y

where Y0 and Y1 are the Y score means for data pairs with an x score of 0 and 1, respectively, q = 1 - p, p is the proportion of data pairs with an x score of 1, and σ_y is the population standard deviation of the y data. An example usage might be to determine whether one gender accomplished some task significantly better than the other gender.

4.2.3 Phi Coefficient
If both variables instead are nominal and dichotomous, the Pearson coefficient simplifies even further. First, we need to introduce contingency tables. A contingency table is a two dimensional table containing frequencies by category. For this situation it will be two by two, since each variable can only take on two values, but each dimension will exceed two when the associated variable is not dichotomous. In addition, column and row headings and totals are frequently appended, so that the contingency table ends up being n + 2 by m + 2, where n and m are the number of values each variable can take on.

4.2.4 Biserial Correlation Coefficient
Another measure of association, the biserial correlation coefficient, termed r_b, is similar to the point-biserial, but pits quantitative data against ordinal data with an underlying continuity that is measured discretely as two values (dichotomous). An example might be test performance vs. anxiety, where anxiety is designated as either high or low. Presumably, anxiety can take on any value in between, perhaps beyond, but it may be difficult to measure. We further assume that anxiety is normally distributed. The formula is very similar to the point-biserial, yet different:

r_b = (Y1 - Y0) * (pq/u) / σ_y

where Y0 and Y1 are the Y score means for data pairs with an x score of 0 and 1, respectively, q = 1 - p, p is the proportion of data pairs with an x score of 1, σ_y is the population standard deviation of the y data, and u is the height of the standardized normal distribution at the point z, where P(z' < z) = q and P(z' > z) = p. Since the factor involving p, q and the height is always greater than 1, the biserial is always greater than the point-biserial.


4.2.5 Tetrachoric Correlation Coefficient
The tetrachoric correlation coefficient, r_tet, is used when both variables are dichotomous, like the phi coefficient, but we also need to be able to assume that both variables really are continuous and normally distributed. Thus it is applied to ordinal vs. ordinal data which have this characteristic. Ranks are discrete, so in this manner it differs from the Spearman coefficient. The formula involves the cosine function which, in its simplest form, is the ratio of two side lengths in a right triangle: the side adjacent to the reference angle divided by the length of the hypotenuse. The formula is:

r_tet = cos( 180° / (1 + sqrt(BC/AD)) )

where A, B, C and D are the cell frequencies of the two by two contingency table.

4.2.6 Rank-Biserial Correlation Coefficient
The rank-biserial correlation coefficient, r_rb, is used for dichotomous nominal data vs. rankings (ordinal). The formula is usually expressed as

r_rb = 2 * (Y1 - Y0) / n

where n is the number of data pairs and Y0 and Y1, again, are the Y score means for data pairs with an x score of 0 and 1, respectively. These Y scores are ranks. This formula assumes no tied ranks are present. This may be the same as the Somers' D statistic, for which an online calculator is available.
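As a small worked illustration of the point-biserial formula in MATLAB (the data values are invented for the example):

x = [0 0 0 1 1 1 1];                   % dichotomous variable (0/1)
y = [10 12 11 15 14 16 15];            % quantitative variable
p = mean(x); q = 1 - p;                % proportions of 1s and 0s
Y1 = mean(y(x == 1)); Y0 = mean(y(x == 0));
sy = std(y, 1);                        % population standard deviation of y
rpb = (Y1 - Y0) * sqrt(p*q) / sy;      % identical to the Pearson r of x and y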

4.3 FORMULATION
For computing the correlation measure, we divide the frames into Ts sections of size m x m and then average the correlation values of the sections. The correlation is measured for the three colour channel values red, green and blue. The correlation difference CDp,q,s,c of a colour channel c between two corresponding sections s of frames p and q is defined as

CDp,q,s,c = sum_xy (fp,c(x,y) - f̄p,c)(fq,c(x,y) - f̄q,c) / sqrt( sum_xy (fp,c(x,y) - f̄p,c)^2 * sum_xy (fq,c(x,y) - f̄q,c)^2 )

where s = 1 ... Ts; c = red, green, blue; f̄i,c is the mean value of channel c for frame i and f̄j,c is the mean value of channel c for frame j. The correlations of all sections of frames i and j are averaged to obtain the overall correlation CDi,j,c for a colour channel:

CDi,j,c = (1/Ts) * sum_s CDi,j,s,c

Then the overall correlation difference measure CDi,j between frames i and j is obtained by averaging the values of the colour channels:

CDi,j = (1/3) * ( CDi,j,red + CDi,j,green + CDi,j,blue )
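A MATLAB sketch of this formulation is given below, assuming RGB frames f1 and f2 with dimensions divisible by m; corr2 (Image Processing Toolbox) computes the per-section Pearson coefficient, and note that a section with zero variance would yield NaN in this sketch.

m = 8; a = double(f1); b = double(f2);
[hgt, wid, ~] = size(a); CDc = zeros(1, 3);
for c = 1:3                            % red, green and blue channels
    total = 0; Ts = 0;
    for r = 1:m:hgt-m+1
        for col = 1:m:wid-m+1
            s1 = a(r:r+m-1, col:col+m-1, c);
            s2 = b(r:r+m-1, col:col+m-1, c);
            total = total + corr2(s1, s2);   % per-section Pearson coefficient
            Ts = Ts + 1;
        end
    end
    CDc(c) = total / Ts;               % overall correlation for channel c
end
CD = mean(CDc);                        % average over the three colour channels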


CHAPTER 5 EDGE ORIENTATION HISTOGRAM


5.1 INTRODUCTION
Edge detection is one of the most commonly used operations in image analysis, and there are probably more algorithms in the literature for enhancing and detecting edges than for any other single task. The reason for this is that edges form the outline of an object. An edge is the boundary between an object and the background, and it indicates the boundary between overlapping objects. This means that if the edges in an image can be identified accurately, all of the objects can be located, and basic properties such as area, perimeter and shape can be measured. Edges define the boundaries between regions in an image, which helps with segmentation and object recognition. They can show where shadows fall in an image, or any other distinct change in the intensity of an image. Edge detection is a fundamental operation of low-level image processing, and good edges are necessary for higher level processing. The problem is that, in general, edge detectors behave rather poorly: the quality of edge detection is highly dependent on lighting conditions, the presence of objects of similar intensities, the density of edges in the scene, and noise. The detection of edges is shown in Fig. 5.1.

Fig.5.1: Edge detection results


5.2 FUNDAMENTALS OF EDGE DETECTION:


Edge detection refers to the process of identifying and locating sharp discontinuities in an image. The discontinuities are abrupt changes in pixel intensity which characterize the boundaries of objects in a scene. Classical methods of edge detection involve convolving the image with an operator (a 2-D filter) which is constructed to be sensitive to large gradients in the image while returning values of zero in uniform regions. There is an extremely large number of edge detection operators available, each designed to be sensitive to certain types of edges. Variables involved in the selection of an edge detection operator include:
(1) Edge orientation: The geometry of the operator determines a characteristic direction in which it is most sensitive to edges. Operators can be optimized to look for horizontal, vertical or diagonal edges.
(2) Noise environment: Edge detection is difficult in noisy images, since both the noise and the edges contain high-frequency content. Attempts to reduce the noise result in blurred and distorted edges. Operators used on noisy images are typically larger in scope, so that they can average enough data to discount localized noisy pixels. This results in less accurate localization of the detected edges.
(3) Edge structure: Not all edges involve a step change in intensity. Effects such as refraction or poor focus can result in objects with boundaries defined by a gradual change in intensity, and the operator needs to be chosen to be responsive to such a gradual change in those cases. Newer wavelet-based techniques actually characterize the nature of the transition of each edge in order to distinguish, for example, edges associated with hair from edges associated with a face.

5.3 EDGE DETECTION OPERATORS


5.3.1 Prewitt's operator
The Prewitt operator is similar to the Sobel operator and is used for detecting vertical and horizontal edges in images. The Prewitt operator is used in image processing, particularly within edge detection algorithms. Technically, it is a discrete differentiation operator, computing an approximation of the gradient of the image intensity function. At each point in the image, the result of the Prewitt operator is either the corresponding gradient vector or the norm of this vector. The Prewitt operator is based on convolving the image with a small, separable, integer valued filter in the horizontal and vertical directions, and is therefore relatively inexpensive in terms of computations. On the other hand, the gradient approximation which it produces is relatively crude, in particular for high frequency variations in the image.

5.3.2 Canny Operator
Another approach to edge detection using colour information is simply to extend a traditional intensity based edge detector into the colour space. This method seeks to take advantage of the known strengths of the traditional edge detector and tries to overcome its weaknesses by providing more information, in the form of three colour channels rather than a single intensity channel. As the Canny edge detector is the current standard for intensity based edge detection, it seemed logical to use this operator as the basis for colour edge detection. The algorithm runs in 5 separate steps:
1. Smoothing: Blurring of the image to remove noise.
2. Finding gradients: The edges should be marked where the gradient of the image has large magnitude.
3. Non-maximum suppression: Only local maxima should be marked as edges.
4. Double thresholding: Potential edges are determined by thresholding.
5. Edge tracking by hysteresis: Final edges are determined by suppressing all edges that are not connected to a very certain (strong) edge.

5.3.3 Sobel operator


The Sobel operator is used in image processing, particularly within edge detection algorithms. Technically, it is a discrete differentiation operator, computing an approximation of the gradient of the image intensity function. At each point in the image, the result of the Sobel operator is either the corresponding gradient vector or the norm of this vector. The Sobel operator is based on convolving the image with a small, separable, integer valued filter in the horizontal and vertical directions, and is therefore relatively inexpensive in terms of computations. On the other hand, the gradient approximation that it produces is relatively crude, in particular for high frequency variations in the image. Mathematically, the operator uses two 3x3 kernels which are convolved with the original image to calculate approximations of the derivatives, one for horizontal changes and one for vertical. If we define A as the source image, and Gx and Gy as two images which at each point contain the horizontal and vertical derivative approximations, the computations are as follows:

Gx = [ +1 0 -1 ; +2 0 -2 ; +1 0 -1 ] * A
Gy = [ +1 +2 +1 ; 0 0 0 ; -1 -2 -1 ] * A

where * here denotes the 2-dimensional convolution operation. The x-coordinate is here defined as increasing in the "right" direction, and the y-coordinate as increasing in the "down" direction. At each point in the image, the resulting gradient approximations can be combined to give the gradient magnitude, using

G = sqrt( Gx^2 + Gy^2 )

Using this information, we can also calculate the gradient's direction:

Θ = arctan( Gy / Gx )

Fig 5.2(b) shows the application of the Sobel operator to the original image shown in Fig. 5.2(a).


Fig. 5.2(a): Colour picture of a steam engine

Fig. 5.2(b): Sobel operator applied to that image


5.4 FORMULATION
The purpose of edge detection in general is to significantly reduce the amount of data in an image while preserving the structural properties to be used for further image processing; edges are also fairly stable under illumination changes. The edges are first computed using the horizontal and vertical Sobel operators, which are then used to find the gradient magnitude and angle of the edges. The angles are then used to build a histogram of edge orientation. For simplicity, we defined only 72 bins for the angles. As in the case of colour histograms, we compare the histograms of corresponding sections of the two frames. The edge histogram difference ED between two frames i and j is calculated by taking the average of the difference measure between each pair of corresponding sections:

EDi,j = (1/Ts) * sum_s EDi,j,s

where EDi,j,s is the difference between the edge orientation histograms of section s of frames i and j.
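A MATLAB sketch of this measure follows, assuming grayscale frames g1 and g2 (doubles) with dimensions divisible by m; the 72 angle bins and the gradient magnitude threshold of 3 follow the text, while the per-section comparison here is a simple absolute difference of histograms, standing in for the measure used in the actual implementation.

sx = [1 0 -1; 2 0 -2; 1 0 -1];           % horizontal Sobel kernel
sy = sx';                                % vertical Sobel kernel
m = 8; edges = linspace(-180, 180, 73);  % bin edges for 72 angle bins
Gx1 = conv2(g1, sx, 'same'); Gy1 = conv2(g1, sy, 'same');
Gx2 = conv2(g2, sx, 'same'); Gy2 = conv2(g2, sy, 'same');
ang1 = atan2(Gy1, Gx1)*180/pi; ang1(sqrt(Gx1.^2 + Gy1.^2) <= 3) = NaN;  % drop weak edges
ang2 = atan2(Gy2, Gx2)*180/pi; ang2(sqrt(Gx2.^2 + Gy2.^2) <= 3) = NaN;
[hgt, wid] = size(g1); ED = 0; Ts = 0;
for r = 1:m:hgt-m+1
    for c = 1:m:wid-m+1
        a1 = ang1(r:r+m-1, c:c+m-1); a2 = ang2(r:r+m-1, c:c+m-1);
        H1 = histc(a1(:), edges); H2 = histc(a2(:), edges);   % NaNs are not counted
        ED = ED + sum(abs(H1 - H2)); Ts = Ts + 1;
    end
end
ED = ED / Ts;                            % average over all Ts sections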


CHAPTER 6 RESULTS
6.1 FLOW CHART FOR THE EXTRACTION OF KEY FRAMES

The flow chart in Fig 6.1 summarizes the procedure: frames are read from the input video, the first frame is stored in the key frame database, and every subsequent frame is compared against the last extracted key frame using the correlation difference (CD), colour histogram difference (HD) and edge orientation histogram difference (ED). Frames that satisfy the threshold logic become new key frames, and the rest are discarded; the loop stops at the last frame.

Fig 6.1: Flow chart for the extraction of key frames

6.2 ALGORITHM FOR EXTRACTING KEY FRAMES BASED ON CORRELATION
The key frame extraction method is composed of the following steps.
Step 1: All the frames are extracted from the input sports video.
Step 2: Consider the first frame as a key frame.
Step 3: Select the next subsequent frame from the extracted frames and divide the frame into a total of Ts sections, each of size m x m (8x8).
Step 4: Histogram creation.
Step 4.1: Correlation measurement: the correlation is measured for the three colour channel values red, green and blue, and the correlation values of the sections are then averaged.
Step 4.2: The correlation difference CDp,q,s,c of a colour channel c between two corresponding sections s of frames p and q is defined as in Section 4.3:

CDp,q,s,c = sum_xy (fp,c(x,y) - f̄p,c)(fq,c(x,y) - f̄q,c) / sqrt( sum_xy (fp,c(x,y) - f̄p,c)^2 * sum_xy (fq,c(x,y) - f̄q,c)^2 )

where s = 1 ... Ts; c = red, green, blue; f̄ is the mean value of channel c of the frame.
Step 4.3: The correlations of all sections of frames i and j are averaged to obtain the overall correlation CDi,j,c for a colour channel.
Step 4.4: Then the overall correlation difference measure CDi,j between frames i and j is obtained by averaging the values of the colour channels.
Step 4.5: CDi,j is compared with the threshold value to detect key frames. The frames with higher CDi,j as compared to the threshold are treated as key frames.


Step 5: To detect the key frames based on the correlation difference measure in the entire video, repeat Step 3 and Step 4.

6.3 FLOW CHART FOR CORRELATION

The flow chart in Fig 6.2 shows the computation: the current frame and the key frame from the database are each divided into Ts sections of size m x m, the correlation differences of the corresponding sections (C1, C2, ..., Cs) are calculated, and their mean gives the correlation difference CD.

Fig 6.2: Flow chart for correlation difference


6.4 ALGORITHM FOR EXTRACTING KEY FRAMES BASED ON COLOUR DIFFERENCE MEASURE
The key frame extraction method is composed of the following steps.
Step 1: All the frames are extracted from the input sports video.
Step 2: Consider the first frame as a key frame.
Step 3: Select the next subsequent frame from the extracted frames, convert it from RGB to HSV colour space, and divide the frame into a total of Ts sections, each of size m x m (8x8).
Step 4: Histogram creation.
Step 4.1: Colour histogram creation: a three dimensional colour histogram is built by subdividing the HSV colour space into 8:2:4 bins.
Step 4.2: The histogram difference HDi,j,s between two corresponding sections s of histogram His of frame i and histogram Hjs of frame j is calculated using the histogram intersection formula of Section 3.5.
Step 4.3: The histogram difference HD between two frames i and j is then calculated by taking the average of the difference measure over all sections:

HDi,j = (1/Ts) * sum_s HDi,j,s

Step 4.4: HDi,j is compared with the threshold value to detect key frames. The frames with lower HDi,j as compared to the threshold are treated as key frames.
Step 5: To detect the key frames based on the colour difference measure in the entire video, repeat Step 3 and Step 4.


6.5 FLOW CHART FOR COLOUR HISTOGRAM

The flow chart in Fig 6.3 shows the computation: the current frame and the key frame from the database are converted from RGB to HSV and divided into Ts sections of size m x m; the colour histogram differences of the corresponding sections (ch1, ch2, ..., chs) are calculated, and their mean gives the colour histogram difference HD.

Fig 6.3: Flow chart of colour histogram difference


6.6 ALGORITHM FOR EXTRACTING KEY FRAMES BASED ON EDGE DIFFERENCE MEASURE
The key frame extraction method is composed of the following steps.
Step 1: All the frames are extracted from the input sports video.
Step 2: Consider the first frame as a key frame.
Step 3: Select the next subsequent frame from the extracted frames, convert it from RGB to a gray image, and divide the frame into a total of Ts sections, each of size m x m (8x8).
Step 4: Histogram creation.
Step 4.1: Edge histogram creation: the edges are first computed using the horizontal and vertical Sobel operators, which are then used to find the gradient magnitude and angle of the edges. The gradient magnitude is given by

G = sqrt( Gx^2 + Gy^2 )

and the gradient direction is given by

Θ = arctan( Gy / Gx )

Step 4.2: The angles are computed only for those pixels where the gradient magnitude is above a certain threshold (> 3). The angles are then used to build a histogram of edge orientation. We defined only 82 bins for the angles.
Step 4.3: We compare the histograms of corresponding sections of the two frames. The edge histogram difference ED between two frames i and j is calculated by taking the average of the difference measure over all sections.
Step 4.4: EDi,j is compared with the threshold value to detect key frames. The frames with higher EDi,j as compared to the threshold are treated as key frames.
Step 5: To detect the key frames based on the edge difference measure in the entire video, repeat Step 3 and Step 4.


6.7 FLOW CHART FOR EDGE ORIENTATION HISTOGRAM

The flow chart in Fig 6.4 shows the computation: the current frame and the key frame from the database are converted from RGB to gray and divided into Ts sections of size m x m; the gradients Gx and Gy are calculated and the gradient magnitude is evaluated for all sections. Pixels whose gradient magnitude is below 3 are eliminated; for the rest, the gradient direction (Θ = arctan(Gy/Gx)) is evaluated. The edge orientation differences of the corresponding sections (e1, e2, ..., es) are calculated, and their mean gives the edge orientation histogram difference ED.

Fig 6.4: Flow chart for edge orientation histogram difference


6.8 COLOUR HISTOGRAM OUTPUT

Fig 6.5: Reading the frames from the input video

Fig 6.5 indicates the reading of frames from the video, as well as the comparison of each frame with the previous frame to find the key frames.



Fig 6.6: Frames extracted from the (sample) football video


Fig 6.7: Colour histogram difference values for the sample (football) video

Fig 6.7 indicates the colour histogram difference values between the current frame and the previous frame. In total, 19 colour histogram difference values are generated from the 20 frames of the football video. The range of the colour histogram difference values is -64 to 0. The absolute values of the colour histogram differences are compared with the set threshold value to extract key frames based on the colour histogram; the frames with a colour histogram difference value greater than the threshold are discarded.


Fig 6.8: Output graph of colour histogram

Fig 6.8 shows the graph between the frames and the colour difference value.


Fig 6.9(a): Key frames based on colour histogram for the sample (football) video with the threshold value as 35

The above figure shows the key frames extracted by the colour histogram technique with the threshold value set to 35. In total, 8 frames are obtained with this threshold value.


Fig 6.9 (b): set of key frames based on colour histogram for the sample (football) video with the threshold value as 35.


With 35 as the threshold value we obtained 8 frames as key frames based on colour histogram.


Fig 6.10(a): Key frames based on colour histogram for the sample (football) video with the threshold value as 45

The above figure shows the key frames extracted by the colour histogram technique with the threshold value set to 45. In total, 12 frames are obtained with this threshold value.


Fig 6.10(b): Set of key frames based on colour histogram for the sample (football) video with the threshold value as 45

With 45 as the threshold value, we obtained 12 key frames based on the colour histogram.


Fig 6.11(a): Key frames based on colour histogram for the sample (football) video with the threshold value as 55

The above figure shows the key frames extracted by the colour histogram technique with the threshold value set to 55. In total, 13 frames are obtained with this threshold value.


Fig 6.11(b): set of key frames based on colour histogram for the sample (football) video with the threshold value as 55.

With 55 as the threshold value, we obtained 13 frames as key frames based on the colour histogram.


6.9 CORRELATION OUTPUT

Fig 6.12: Correlation difference values for the sample (football) video

Figure 6.12 shows the correlation difference values between each current frame and its previous frame. In total, 19 correlation difference values are generated from the 20 frames of the football video. The range of the correlation difference values is 0 to 1, and each difference is compared with the set threshold value to extract key frames based on correlation. Frames whose correlation difference value is less than the threshold are discarded, since a Pearson's distance near 0 indicates near-identical frames.
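A minimal sketch of this measure, assuming NumPy: Pearson's distance is one minus Pearson's correlation coefficient over the flattened pixel values. Formally 1 - r can reach 2 for negatively correlated images, but consecutive video frames in practice stay within the reported 0 to 1 range.

```python
import numpy as np

def correlation_difference(prev_frame, curr_frame):
    """Pearson's distance between two frames treated as flat pixel vectors:
    1 - r, where r is Pearson's correlation coefficient. Identical frames
    give 0; dissimilar frames give values towards 1. Assumes the frames
    are not constant images (r would be undefined in that case)."""
    x = prev_frame.astype(np.float64).ravel()
    y = curr_frame.astype(np.float64).ravel()
    r = np.corrcoef(x, y)[0, 1]
    return 1.0 - r
```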


Fig 6.13: output graph of correlation

The above figure shows the graph of the correlation difference value against the frame number.


Fig 6.14 (a): key frames based on correlation for the sample (football) video with the threshold value as 0.4.

The above figure shows the key frames extracted by the correlation technique with the threshold value set to 0.4. A total of 4 frames are obtained with this threshold value.


Fig 6.14(b): set of key frames based on correlation for the sample (football) video with the threshold value as 0.4.

With 0.4 as the threshold value, we obtained 4 frames as key frames based on correlation.


Fig 6.15 (a): key frames based on correlation for the sample (football) video with the threshold value as 0.6.

The above figure shows the key frames extracted by the correlation technique with the threshold value set to 0.6. A total of 2 frames are obtained with this threshold value.


Fig 6.15(b): set of key frames based on correlation for the sample (football) video with the threshold value as 0.6.

With 0.6 as the threshold value, we obtained 2 frames as key frames based on correlation.


Fig 6.16 (a): key frames based on correlation for the sample (football) video with the threshold value as 0.8.

The above figure shows the key frames extracted by the correlation technique with the threshold value set to 0.8. Only one frame is obtained as a key frame with this threshold value.


Fig 6.16(b): set of key frames based on correlation for the sample (football) video with the threshold value as 0.8.

With 0.8 as the threshold value, we obtained one frame as a key frame based on correlation.


6.10 EDGE ORIENTATION HISTOGRAM OUTPUT

Fig 6.17: edge orientation histogram difference values for the sample (football) video

Figure 6.17 shows the edge orientation histogram difference values between each current frame and its previous frame. In total, 19 edge orientation histogram difference values are generated from the 20 frames of the football video. The range of the edge orientation histogram difference values is 0 to 82, and each difference is compared with the set threshold value to extract key frames based on the edge orientation histogram. Frames whose edge orientation difference value is less than the threshold are discarded, since a value near 0 indicates near-identical frames.
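For correlation and the edge orientation histogram the comparison direction is reversed relative to the colour histogram: 0 means an exact match, so frames below the threshold are discarded. A generic sketch of this selection logic, reusing the difference functions sketched earlier; seeding the summary with the first frame is again an assumption.

```python
def key_frames_by_measure(frames, difference_fn, threshold):
    """Threshold logic for measures where 0 means an exact match
    (correlation, edge orientation histogram): discard frames whose
    difference to the previous frame falls below the threshold."""
    keys = [frames[0]]   # assumption: the first frame seeds the summary
    for prev, curr in zip(frames, frames[1:]):
        if difference_fn(prev, curr) >= threshold:
            keys.append(curr)
    return keys

# e.g. key_frames_by_measure(frames, correlation_difference, threshold=0.4)
```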


Fig 6.18: output graph of edge orientation histogram

The above figure shows the graph of the edge orientation histogram difference value against the frame number.


Fig 6.19 (a): key frames based on edge orientation histogram for the sample (football) video with the threshold value as 40.

The above figure shows the key frames extracted by the edge orientation histogram technique with the threshold value set to 40. A total of 11 frames are obtained with this threshold value.


Fig 6.19(b): set of key frames based on edge orientation histogram for the sample (football) video with the threshold value as 40.

With 40 as the threshold value, we obtained 11 frames as key frames based on the edge orientation histogram.


Fig 6.20 (a): key frames based on edge orientation histogram for the sample (football) video with the threshold value as 50.


The above figure shows the key frames extracted by the edge orientation histogram technique with the threshold value set to 50. A total of 6 frames are obtained with this threshold value.


Fig 6.20(b): set of key frames based on edge orientation histogram for the sample (football) video with the threshold value as 50.

With 50 as the threshold value, we obtained 6 frames as key frames based on the edge orientation histogram.


Fig 6.21 (a): key frames based on edge orientation histogram for the sample (football) video with the threshold value as 60.

The above figure shows the key frames extracted by the edge orientation histogram technique with the threshold value set to 60. A total of 4 frames are obtained with this threshold value.


Fig 6.21(b): set of key frames based on edge orientation histogram for the sample (football) video with the threshold value as 60.

With 60 as the threshold value, we obtained 4 frames as key frames based on the edge orientation histogram.


6.11 OUTPUT

For the different sports videos, the number of key frames obtained for different threshold values based on the colour histogram, correlation and edge orientation techniques is shown below.

COLOR HISTOGRAM

Type of video        Total no. of frames    Number of key frames for the threshold value
                                            35        45        55
Sample (football)    20                     8         12        13
Cricket              455                    19        84        231
Football             121                    1         1         1
Hockey               476                    1         17        260

Table 6.1: Colour histogram key frames for different threshold values on different videos.

CORRELATION

Type of video        Total no. of frames    Number of key frames for the threshold value
                                            0.4       0.6       0.8
Sample (football)    20                     4         2         1
Cricket              455                    192       72        2
Football             121                    89        27        1
Hockey               476                    292       101       1

Table 6.2: Correlation key frames for different threshold values on different videos.


EDGE ORIENTATION HISTOGRAM

Type of video        Total no. of frames    Number of key frames for the threshold value
                                            40        50        60
Sample (football)    20                     11        6         4
Cricket              455                    106       28        3
Football             121                    22        2         1
Hockey               476                    49        6         1

Table 6.3: Edge orientation histogram key frames for different threshold values on different videos.

Frame difference measure       Exactly matched    Partially matched    Mismatch
Colour histogram               64                 35                   0
Correlation                    0                  0.6                  1
Edge orientation histogram     0                  50                   82

Table 6.4: Frame difference measures.

Table 6.4 gives the characteristic behaviour of the different frame difference measures: the (absolute) value each measure takes when consecutive frames match exactly, match partially, or mismatch completely.

6.12 PERFORMANCE MEASURES

6.12.1 Accuracy rate

Accuracy rate is defined as the ratio of the number of matched key frames from the automatic summary to the number of key frames from the user summary:

Accuracy rate = Nmas / Nus


6.12.2 Error rate

Error rate is defined as the ratio of the number of non-matched key frames from the automatic summary to the number of key frames from the user summary:

Error rate = Nnmas / Nus

where
Nmas = number of matched key frames from the automatic summary,
Nnmas = number of non-matched key frames from the automatic summary,
Nus = number of key frames from the user summary.

The value of Accuracy Rate varies from 0 to 1, with 1 being the best value, where all frames of the automated summary match frames of the user summary. The value of Error Rate ranges from 0 to Nas/Nus, where 0 is the best value (Nas is the number of frames in the automatic summary). The quality of a summary is superior if it has a high Accuracy Rate and a low Error Rate.

Measure                        Accuracy rate    Error rate
Colour histogram               0.8              0.2
Correlation                    0.7              0.3
Edge orientation histogram     1.0              0.0

Table 6.5: Comparison of accuracy and error rates of a sport (football) video

Table 6.5 shows the accuracy and error rates for the sample (football) video. The accuracy rate and error rate of the colour histogram measure are 0.8 and 0.2 with a threshold of 35. Similarly, for correlation and the edge orientation histogram, the accuracy and error rates are 0.7 and 0.3, and 1.0 and 0.0, for thresholds of 0.2 and 40 respectively. The error of 0.2 for the colour histogram measure occurs because in most sports videos the camera is concentrated on the field; in such situations the colour histograms of consecutive frames remain almost identical even when the frame content changes, so errors occur in extracting the key frames. The error in correlation is due to its pixel-wise comparison. The edge orientation histogram feature therefore works well for extracting key frames from sports videos.
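Both rates follow directly from the three counts defined above; a minimal sketch (the matching of automatic key frames against the user summary is assumed to have been done beforehand, and the example counts are purely hypothetical):

```python
def summary_quality(n_matched, n_automatic, n_user):
    """Accuracy rate = Nmas / Nus; Error rate = Nnmas / Nus,
    where Nnmas = Nas - Nmas (non-matched automatic key frames)."""
    accuracy_rate = n_matched / n_user
    error_rate = (n_automatic - n_matched) / n_user
    return accuracy_rate, error_rate

# Hypothetical counts, purely illustrative:
acc, err = summary_quality(n_matched=8, n_automatic=10, n_user=10)
print(acc, err)   # 0.8 0.2
```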

CHAPTER 7

CONCLUSION AND FUTURE SCOPE


7.1 CONCLUSION

Our proposed system is able to extract the key frames from most sports videos. The methods used are computationally simple and determine the number of key frames dynamically. Experiments on other types of videos, such as cartoons and documentaries, have shown that the method is adaptive to the video content. The experimental results show that the edge orientation histogram frame difference feature has a high accuracy rate and a low error rate.

7.2 FUTURE SCOPE

In our project we extracted key frames by using multiple frame difference features individually. In general, however, one frame difference feature alone is not enough to capture all the visual contents of the image. For instance, colour histograms have been a very popular feature for image representation and computation of key frames, yet key frame methods that use colour histograms as the FDM tend to fail in scenes with illumination changes. Conversely, in a video of a soccer game, where the camera is mostly focused on the field, edge orientation is an appropriate feature to capture the camera motion. This means that for a particular genre of videos, different visual features must be combined with varying weights, giving more weight to the visual feature (or FDM) which provides more detail about the visual content of the video. Certain low level features can therefore be combined to obtain an effective representation of a frame, as sketched below.
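A purely illustrative sketch of such a weighted combination, reusing the colour_histogram_difference, correlation_difference and eoh_difference functions sketched earlier in this chapter. The weights and the normalisation of each measure to [0, 1] are hypothetical choices, not values from this project.

```python
import cv2

def combined_difference(prev_frame, curr_frame, weights=(0.2, 0.2, 0.6)):
    """Weighted fusion of the three FDMs. Each measure is first mapped to
    [0, 1] with 0 = identical frames and 1 = mismatch, so the weights are
    comparable; a field-sport genre might favour the edge weight, as argued
    above. The weights here are hypothetical."""
    w_colour, w_corr, w_edge = weights
    # Colour histogram: reported range -64..0 with -64 = exact match.
    d_colour = 1.0 - abs(colour_histogram_difference(prev_frame, curr_frame)) / 64.0
    # Correlation: Pearson's distance, already 0 = match.
    d_corr = correlation_difference(prev_frame, curr_frame)
    # Edge orientation histogram: the earlier sketch returns the mean L1
    # distance of normalised histograms, which is at most 2.
    gray_prev = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    gray_curr = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
    d_edge = eoh_difference(gray_prev, gray_curr) / 2.0
    return w_colour * d_colour + w_corr * d_corr + w_edge * d_edge
```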
