
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 1, Number 2, Sep - Oct (2010), pp. 147-159


IAEME, http://www.iaeme.com/ijcet.html


A NOVEL APPROACH FOR SATELLITE IMAGERY STORAGE BY CLASSIFYING THE NON-DUPLICATE REGIONS
Cyju Varghese, Computer Science Department, Karunya University, India, E-Mail: jogycyju@gmail.com
John Blesswin, Computer Science Department, Karunya University, India, E-Mail: johnblesswin@gmail.com
Navitha Varghese, Computer Science Department, Karunya University, India, E-Mail: navithapullan@gmail.com
Sonia Singha, Computer Science Department, Karunya University, India, E-Mail: soniacs09@gmail.com

ABSTRACT
Satellites capture thousands of images every day, and these images need to be classified and stored properly. In this paper, we address the problem of replacing existing images in the database with newly captured ones. We provide a new solution that stores only the part of the captured image that does not already exist in the database. Although satellite images have been classified in the past using various techniques, researchers continue to look for alternative classification strategies so that the most appropriate technique can be selected for the feature extraction task at hand. To this end, we propose an efficient approach whose algorithm adopts robust kernel principal component analysis (KPCA) features to reduce the dimensionality of the image. For image clustering, we use the Fuzzy N-Means algorithm. Finally, the data is stored in the database according to its class by using a support vector machine classifier. The proposed scheme thus improves the storage efficiency of satellite images in the database, saves time, and makes the correction of satellite images more efficient.

Index Terms- Compression, Duplicate Detection, Feature Extraction, Image Clustering, Satellite Image

1. INTRODUCTION
Satellite images play an important role in many applications, especially in capturing images of the earth for environmental study and homeland security. Geo is a commonly used satellite for capturing earth images. Thousands of images are transmitted every day to the Digital Globe database. Because the topography of the earth changes every day, frequently updating the images in the database is very tedious. In current applications, each image is updated in the database in its entirety. Instead of updating the whole image, this paper employs an approach that detects duplicate and non-duplicate blocks in the captured image and updates only the non-duplicate blocks in the corresponding image in the database. The approach makes use of a duplication detection algorithm, which must be applied to avoid storing the same image content twice.

Traditional approaches to duplication detection of image objects normally partition images into several blocks. These detection methods are designed specifically to separate duplicate and non-duplicate image regions. They can detect duplication when the locations of the extracted objects are invariant to scaling, translation, or rotation. The traditional techniques used in detecting duplication include the discrete wavelet transform (DWT), principal component analysis (PCA), and the Fourier-Mellin transform (FMT). These techniques are restricted to linear features. Duplicate detection involves dividing the image into overlapping blocks, extracting features from each block, and detecting similar features. Depending on the type of duplication, various measures and mechanisms can be adopted to counter it.

A discrete wavelet transform (DWT) [3][4] maps the time-domain signal f(t) into a real-valued time-frequency domain, in which the signal is described by the wavelet coefficients. Five-scale signal decomposition is performed to ensure that all disturbance features in both high and low frequencies are extracted. Thus, the output of the wavelet transform consists of five decomposed scale signals with different levels of resolution. Principal component analysis (PCA) [5] in signal processing can be described as a transform of a given set of input vectors (variables) of the same length, formed in an n-dimensional vector. FMT [6][7] is a global transform that applies to all pixels in the same way. The Fourier-Mellin transform provides translation, scaling, and rotation invariance. To achieve these properties, the image is divided into overlapping blocks, and the Fourier transform is applied to each block to obtain features.

KPCA [1][2] is preferable to these techniques because it performs nonlinear feature extraction, whereas the other techniques extract linear features. It can detect duplication even if a particular portion of an image has been rotated in any direction. Quantitative analyses indicate that the KPCA-based feature performs very well under additive noise and lossy JPEG compression. This method uses a global geometric transformation and a labeling technique to identify such duplication. Experiments with a good number of natural images show very promising results compared with the other conventional approaches. KPCA extracts more useful features than linear PCA. The initial mapping to a high-dimensional space provides smoother dimensionality reduction than standard PCA. It does not require nonlinear optimization, only the solution of an eigenvalue problem. Although signal reconstruction is unnecessary for tampering detection, KPCA is computationally more expensive than linear PCA.

2. PROPOSED SCHEME
A satellite image is the input to this application. Before the image is stored in the database, its size is reduced so that it requires less memory. Kernel principal component analysis (KPCA) is used to reduce the dimensionality of the image. Kernel PCA is a nonlinear feature extractor, and it is used here to detect duplicate and non-duplicate regions in the satellite image. An important concern in Kernel PCA is the selection of the kernel function and the computation of the Gram matrix. Gaussian kernels can capture nonlinearity in the data and can simulate the behavior of other kernels. The Gram matrix can be found from the following equation:

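$$K(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|^2}{2\sigma^2}\right) \qquad (1)$$

where K is taken to be the Gaussian kernel in its standard form with kernel parameter $\sigma$. For the Gaussian kernel, the value of the kernel parameter is very important.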

Figure 1 Flowchart of proposed scheme

To compute the principal components, the following steps are followed:
1. Construct one training matrix and one testing matrix.
2. Compute the Gram matrix for the training matrix.
3. Center the training Gram matrix.
4. Diagonalize the centered matrix and compute its eigenvalues and eigenvectors.
5. Construct the test Gram matrix.
6. Center the test Gram matrix.
7. Compute the projection of all vectors onto the eigenvectors.

Extracting image features from the compressed images is the most important step, as it has a great impact on retrieval performance.
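The seven steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `gaussian_gram` and `kpca_fit_transform`, the parameters `sigma` and `n_components`, and the standard Gaussian kernel form are all assumptions introduced here, and each row of the training and testing matrices is assumed to be one flattened image block.

```python
import numpy as np


def gaussian_gram(A, B, sigma):
    """Gram matrix K[i, j] = exp(-||A[i] - B[j]||^2 / (2 * sigma^2)) (assumed kernel form)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))


def kpca_fit_transform(X_train, X_test, sigma=1.0, n_components=2):
    """Step 1 is the caller's job: X_train and X_test are the training and testing matrices."""
    m = X_train.shape[0]
    # Step 2: Gram matrix of the training matrix.
    K = gaussian_gram(X_train, X_train, sigma)
    # Step 3: center the training Gram matrix in feature space.
    one_m = np.full((m, m), 1.0 / m)
    Kc = K - one_m @ K - K @ one_m + one_m @ K @ one_m
    # Step 4: eigen-decomposition; keep the leading components.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale so the corresponding feature-space eigenvectors have unit norm
    # (assumes the kept eigenvalues are strictly positive).
    alphas = eigvecs / np.sqrt(eigvals)
    # Step 5: test Gram matrix (test rows against training rows).
    K_test = gaussian_gram(X_test, X_train, sigma)
    # Step 6: center the test Gram matrix with the training statistics.
    one_t = np.full((X_test.shape[0], m), 1.0 / m)
    K_test_c = K_test - one_t @ K - K_test @ one_m + one_t @ K @ one_m
    # Step 7: project training and test vectors onto the eigenvectors.
    return Kc @ alphas, K_test_c @ alphas
```

Called with the same matrix as both training and testing input, the two returned projections coincide, which is a convenient sanity check on the centering in Steps 3 and 6.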

A. Satellite Image Clustering


The fuzzy algorithm builds on the idea that a point may have significant membership in multiple classes. The points situated in the overlapping regions of different clusters are first identified and excluded from consideration during clustering. Thereafter, these points are given class labels by a support vector machine classifier that is trained on the remaining points. The well-known Fuzzy N-Means algorithm and some recently proposed genetic clustering schemes are used in the process. The image is divided into a number of blocks, and each block can have the same features or different kinds of features. Clustering is performed to group features of the same kind. This step gives an additional advantage for duplication detection in the image.

B. Satellite Image Segmentation


Using KPCA, the image is divided into a number of blocks. Each block can have the same features or different kinds of features. Here, image segmentation is performed to group features of the same kind, which gives an additional advantage for duplication detection. Image segmentation is the basis of image analysis and understanding, and it amounts to the problem of classifying the pixel set of an image. Clustering analysis therefore applies naturally to image segmentation. We use the Fuzzy N-Means algorithm for image segmentation. Fuzzy N-Means is an improved version of Fuzzy C-Means, and an outlier test is also performed to improve segmentation performance. The algorithm has two levels: the internal level calculates the new centroids and updates the fuzzy membership (subjection-level) matrix, and the external level judges whether the algorithm has converged to the estimated threshold. After the iteration finishes, the membership degree of each pixel to each cluster center can be read from the generated fuzzy membership matrix, and the category of the pixel is determined by the size of the matrix entries [8].


Algorithm:
Input: Test image
Output: Segmented image
Step 1: Initialize the parameter and also perform normalization
Step 2: For k = 1, ..., N
    Perform outlier test using equation
    Outlier test:
    Where ui(k) =
    Update centroid by: Vi(new) = vi(old) + (x - vi(old))
Step 3: Termination test
    || || > ||xk - vi||^2

The problem of segmenting the image into different clusters is handled iteratively [10] by means of a single parameter. The outlier test is performed to improve the cluster validity index. After the cluster centers are found, the fuzzy membership value can be measured at any point. Groups of clusters with similar features are thus obtained after running this algorithm. Image segmentation means that the image is represented as a set of physically [9] meaningful connected areas. In general, we achieve segmentation by analyzing such image characteristics with the Fuzzy N-Means clustering algorithm.

Table 1 List of Symbols
Symbol    Meaning
ui        Fuzzy Membership Value
Vi        Centroid Value
Xk        Pixel Values
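To make the roles of the symbols in Table 1 concrete, the following sketch implements the standard Fuzzy C-Means iteration, of which Fuzzy N-Means is described as an improved version. It is not the Fuzzy N-Means algorithm itself: the outlier test is omitted because its formula is not recoverable here, and the function name `fuzzy_c_means`, the fuzzifier `m`, the tolerance `tol`, and the random initialization are assumptions made for illustration.

```python
import numpy as np


def fuzzy_c_means(x, n_clusters=2, m=2.0, tol=1e-5, max_iter=100, seed=0):
    """Standard Fuzzy C-Means (stand-in for Fuzzy N-Means, no outlier test).

    x: array of shape (n_points, n_features), e.g. pixel values Xk.
    Returns (v, u): centroids Vi and fuzzy memberships ui of shape (n_clusters, n_points).
    """
    rng = np.random.default_rng(seed)
    u = rng.random((n_clusters, x.shape[0]))
    u /= u.sum(axis=0)  # memberships of each point sum to 1
    v = None
    for _ in range(max_iter):
        # Internal level: new centroids, then updated membership matrix.
        um = u ** m
        v = (um @ x) / um.sum(axis=1, keepdims=True)
        d = np.linalg.norm(x[None, :, :] - v[:, None, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero at a centroid
        new_u = 1.0 / d ** (2.0 / (m - 1.0))
        new_u /= new_u.sum(axis=0)
        # External level: stop once the memberships have converged.
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return v, u
```

Each pixel can then be assigned to the cluster with the largest membership value, for example with `u.argmax(axis=0)`.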


C. Duplication Detection of Satellite Image


The satellite image database contains previously captured images of the real world, which are used as training data. Every day, thousands of images are captured by satellite. To update the database, duplication detection has to be performed before an image is stored. At present, each newly captured image is stored in the database by replacing the previous one. This process is time consuming and requires additional memory space. This paper proposes a new approach that updates the existing image with the identified non-duplicate blocks. A duplication detection algorithm is proposed to find the duplicate and non-duplicate blocks in the images. The input of this algorithm is the test image block. The duplication detection steps are as follows:

Algorithm:
Input: Test image of N pixels.
Output: Non-duplicate block of image.
Step 1: Initialize the block processing parameters:
    b: Number of pixels per block
    Q: Number of quantization bins
    Rth: Number of neighboring rows to search in the lexicographically sorted matrix
    Dth: Minimum offset-magnitude threshold
    Fraction of the ignored variance along the principal axes, or the fraction of the ignored local variance of the wavelet coefficients
    M: Number of training samples for the KPCA
Step 2: Apply KPCA on each block, b, of data, and compute a transform vector of length L, which is equal to (M, Nt2) for the KPCA-based features with dimension reduction.
Step 3: Construct a data matrix, Mdata, of size Nb x L, whose row elements contain component-wise quantized features, i.e., floor(ai/Q).
Step 4: Apply lexicographic sorting to the rows of the above matrix to obtain a new matrix S. Let si be the i-th row of S, which represents the i-th block with its center coordinates (xi, yi).
Step 5: For every row si of S, select a number of adjacent rows, sj, such that |i - j| < Rth, and place all pairs of coordinates (xi, yi) and (xj, yj) for j = 0, 1, ..., (Rth - 1) onto a list Pin.
Step 6: Eliminate all pairs of points whose offset magnitude, Dof, is less than Dth. Construct a set, OF, of the various offsets (m, n) and offset frequencies (fm,n) for all elements in Pin.
Step 7: Create a refined list of point pairs, Pout, from Pin by the algorithm or by using a manual threshold, fth.

The proposed duplication detection algorithm has several parameters that must be selected and justified before use. These are the block size (b), the number of quantization bins (Q), the block-similarity threshold (Rth), the minimum offset-magnitude threshold (Dth), the offset-frequency threshold (fth), and the fraction of ignored variance. The selection of Q depends on the feature variations. The selection of Rth depends on how well lexicographic sorting arranges similar vectors (blocks) in the sorted matrix S. The parameter Dth is used to avoid false detection.
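The sorting-and-offset logic of Steps 3 to 7 can be illustrated with a small sketch. It makes several simplifying assumptions that are not part of the paper: raw pixel values stand in for the KPCA features of Step 2, blocks are identified by their top-left rather than center coordinates, opposite offsets (m, n) and (-m, -n) are counted as one, and the refined list is built with the manual threshold `fth`. The function name `find_duplicate_pairs` and the default parameter values are hypothetical.

```python
from collections import Counter
import math

import numpy as np


def find_duplicate_pairs(img, b=4, Q=16, Rth=2, Dth=4.0, fth=8):
    """Return block pairs ((xi, yi), (xj, yj)) whose offset occurs at least fth times."""
    h, w = img.shape
    rows = []
    # Steps 2-3: one quantized feature row per overlapping block.
    # Raw pixel values stand in for the KPCA features (assumption).
    for y in range(h - b + 1):
        for x in range(w - b + 1):
            feat = img[y:y + b, x:x + b].ravel()
            key = tuple((feat // Q).tolist())  # floor(a_i / Q)
            rows.append((key, (x, y)))
    rows.sort()  # Step 4: lexicographic sorting of the rows

    # Step 5: pair each row with its neighbours, |i - j| < Rth.
    # Step 6: drop pairs whose offset magnitude is below Dth.
    p_in = []
    for i in range(len(rows)):
        for j in range(i + 1, min(i + Rth, len(rows))):
            (xi, yi), (xj, yj) = rows[i][1], rows[j][1]
            off = (xj - xi, yj - yi)
            if off < (0, 0):  # treat (m, n) and (-m, -n) as the same offset
                off = (-off[0], -off[1])
            if math.hypot(*off) >= Dth:
                p_in.append(((xi, yi), (xj, yj), off))

    # Steps 6-7: count offset frequencies and keep only frequent offsets.
    freq = Counter(off for _, _, off in p_in)
    return [(p, q) for p, q, off in p_in if freq[off] >= fth]
```

Blocks that appear in no returned pair would be treated as non-duplicate blocks, which is the output the algorithm above asks for.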

D. Categorization of Satellite Image


Manually classifying satellite images from the identified blocks is a tedious process. To automate it, the computer uses numerical "signatures" for each training class. Each pixel in the image is compared with these signatures and labeled with the class it most closely resembles digitally. Supervised classifiers therefore require the user to decide which classes exist in the image and then to define training areas for these classes. An SVM not only achieves the best classification performance (e.g., accuracy) on the training data but also leaves much room for the correct classification of future data [11]. After the KPCA algorithm detects a few duplicate pixels whose similarity scores exceed the threshold, we have positive examples, namely the identified duplicate blocks in D, and negative examples, namely the remaining non-duplicate blocks in N.


Table 2 Some Samples of the Test High-Resolution Satellite Image Database
Columns: Existing image in database; Captured image; Results after applying the algorithm (Stored image, MSE, PSNR)
Sample result: MSE = 0, PSNR = 33.56

Algorithm:
Input: Duplicate regions D, non-duplicate regions N, original image I
Output: Updated image
Step 1: Train classifier C1 using D and N.
Step 2: Classify the non-duplicate region N to the corresponding class label C in the database.
Step 3: Repeat Step 2 until all the non-duplicate blocks in N are inserted into the original image I.

The duplicate blocks in D and the non-duplicate blocks in N are used to train the classifier (SVM) in order to identify where each non-duplicate block belongs in the already stored image I, thus updating the image. In this way, the satellite images are stored efficiently in the database.

The proposed scheme works as follows. The captured image is compressed using kernel principal component analysis (KPCA), and its features are extracted. The extracted features are clustered with the Fuzzy N-Means algorithm so that the duplicate detection algorithm can be performed efficiently. Duplication detection is performed by comparing the captured image with the stored image, which yields the duplicate and non-duplicate blocks. Finally, the image stored in the database is updated by bringing in the non-duplicate blocks.
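The final update step, in which only the non-duplicate blocks are written into the stored image, can be sketched as follows. This is a deliberately simplified illustration: an exact tile-by-tile comparison stands in for the KPCA-based duplication detection, and the block positions are taken directly from that comparison rather than predicted by the SVM classifier. The function names `non_duplicate_tiles` and `update_stored_image` are hypothetical.

```python
import numpy as np


def non_duplicate_tiles(stored, captured, b):
    """Top-left (x, y) of each b-by-b tile that differs between the two images.

    Exact comparison is a stand-in for the duplication detection step (assumption).
    """
    h, w = stored.shape
    return [
        (x, y)
        for y in range(0, h, b)
        for x in range(0, w, b)
        if not np.array_equal(stored[y:y + b, x:x + b], captured[y:y + b, x:x + b])
    ]


def update_stored_image(stored, captured, tiles, b):
    """Copy only the listed non-duplicate tiles from the captured image."""
    updated = stored.copy()  # the stored image itself is left untouched
    for x, y in tiles:
        updated[y:y + b, x:x + b] = captured[y:y + b, x:x + b]
    return updated
```

Only the changed tiles are written, which is the storage saving the scheme aims for; the duplicate tiles are never copied.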


Table 3 Detection Accuracy for JPEG Dataset: intra-dataset average precision (P%) and recall (R%)
Features    JPEG P    JPEG R
KPCA        73.19     40.27

The KPCA-based feature obtains the best recall (40.27%) and medium precision (73.19%) for JPEG in the compressed and noisy domain, as shown in Table 3. KPCA is performed on JPEG satellite images in our experiment. It can also be performed on BMP and SNR images. Recall varies roughly in a sigmoid fashion with increasing JPG.
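For reference, precision and recall as percentages follow the usual definitions: precision is the share of detected blocks that are truly duplicated, and recall is the share of truly duplicated blocks that were detected. The sketch below assumes blocks are identified by hashable labels; the function name `precision_recall` is hypothetical and the inputs in the usage note are made-up labels, not data from the experiments.

```python
def precision_recall(detected, truth):
    """Return (precision %, recall %) for detected versus ground-truth duplicate blocks."""
    detected, truth = set(detected), set(truth)
    true_positives = len(detected & truth)
    precision = 100.0 * true_positives / len(detected) if detected else 0.0
    recall = 100.0 * true_positives / len(truth) if truth else 0.0
    return precision, recall
```

For example, `precision_recall([1, 2, 3, 4], [1, 2, 5, 6, 7])` counts two correct detections out of four detected and five true blocks.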

3. EXPERIMENTAL RESULTS


The experiments on satellite images address four objectives. More than 100 satellite JPEG images have been tested; sample tested images are given in Table 2. The first objective is the implementation of KPCA, which reduces the dimensionality of the original image. The image is resized to 256 x 256 before the proposed duplicate detection method is applied, and the features are extracted using KPCA. The second objective is clustering the extracted features of the compressed image. The Fuzzy N-Means clustering algorithm groups similar features, and this clustered information is used to identify the duplicate and non-duplicate blocks of the image. The third objective is duplication detection. A different color is used to show the non-duplicate blocks of the image. The first set of experiments uses parameters that were empirically fixed to b=64, Q=256, Rth=50, Dth=16, =0, =1. Finally, the identified duplicate (D) and non-duplicate (N) blocks are used to train the SVM classifier, which performs the task of inserting blocks into the database.

In our scheme, the peak signal-to-noise ratio (PSNR) is used to evaluate the quality of the updated image. Similarly, we use the mean square error (MSE) to measure the difference between the updated image and the captured image. The quality of the updated image is assessed from two points of view. First, to the human visual system the updated image is almost indistinguishable from the original image. Second, the PSNR values between the updated images and the original images range from 32 to 34.5 dB. Moreover, all MSEs are equal to zero when the image is exactly updated.
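The two quality measures can be computed as in the sketch below, which uses the standard definitions MSE = mean of squared pixel differences and PSNR = 10 log10(peak^2 / MSE). The peak value of 255 assumes 8-bit images, and the function names `mse` and `psnr` as well as the nested-list image representation are choices made for this illustration only.

```python
import math


def mse(a, b):
    """Mean square error between two equally sized images given as nested lists."""
    diffs = [(p - q) ** 2 for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB; unbounded when the images are identical."""
    err = mse(a, b)
    return math.inf if err == 0 else 10.0 * math.log10(peak ** 2 / err)
```

Because the PSNR is unbounded when the MSE is zero, a finite PSNR and a zero MSE can only be reported together when they compare different image pairs, as in the text above (PSNR against the original image, MSE against the captured image).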

4. CONCLUSION
This paper has presented a method for detecting duplicated regions in satellite images. An automatic duplication detection technique has been proposed. This technique reduces false detection and eliminates an important threshold parameter. Although its time cost is high, the method performs well. A clustering method is then applied, and finally a classification method is applied to store the non-duplicate regions of the image in the database.

REFERENCES
[1] M. K. Bashar, K. Noda, N. Ohnishi, and K. Mori, Exploring Duplicated Regions in Natural Images, IEEE Transactions on Image Processing, Vol. 1, pp. 1-40, March 2010.
[2] M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience, vol. 3, no. 1, 1991.
[3] G. Li, Q. Wu, D. Tu, and S. Sun, A Sorted Neighborhood Approach for Detecting Duplicated Regions in Image Forgeries based on DWT and SVD, in Proceedings of IEEE International Conference on Multimedia and Expo, Beijing, China, July 2-5, 2007, pp. 1750-1753.
[4] W. Luo, J. Huang, and G. Qiu, Robust Detection of Region Duplication Forgery in Digital Image, in Proceedings of the 18th International Conference on Pattern Recognition, Vol. 4, 2006, pp. 746-749.
[5] C. Popescu and H. Farid, Exposing Digital Forgeries by Detecting Duplicated Image Regions, Technical Report TR2004-515, Dartmouth College, Computer Science, 2004.
[6] Sevinc Bayram, Taha Sencar, and Nasir Memon, An efficient and robust method for detecting copy-move forgery, in Proceedings of ICASSP 2009.
[7] H. Huang, W. Guo, and Y. Zhang, Detection of Copy-Move Forgery in Digital Images Using SIFT Algorithm, in Proceedings of IEEE Pacific-Asia Workshop on Computational Intelligence and Industrial Application, Vol. 2, pp. 272-276, 2008.
[8] Sang Wan Lee, Yong Soo Kim, and Zeungnam Bien, A Nonsupervised Learning Framework of Human Behavior Patterns Based on Sequential Actions, IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 4, April 2010.
[9] Z. Bien and M.-G. Chun, A Fuzzy Petri Net Model, Handbook of Fuzzy Computation, C2.4, IOP Publishing Ltd., 1998.
[10] T. Tajima et al., Development of a Marketing System for Recognizing Customer Buying Behavior Sensor, J. Japan Soc. for Fuzzy Theory and Intelligent Informatics, vol. 20, no. vol 5, pp. 18-22, Apr. 2007.
[11] Weifeng Su, Jiying Wang, and Frederick H. Lochovsky, Record Matching over Query Results from Multiple Web Databases, IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 4, April 2010.
[12] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, ACM Press, 1999.

Cyju Elizabeth Varghese received the B.E. degree in Computer Science and Engineering from CSI Institute of Technology, Thovalai, India, in 2001 and has been working since. She is currently pursuing an M.Tech in Computer Science and Engineering at Karunya University, Coimbatore. Her research interests include Web Mining and areas related to databases.

John Blesswin received the B.Tech degree in Information Technology from Karunya University, Coimbatore, India, in 2009. He passed the B.Tech examination with a gold medal. He is pursuing an M.Tech in Computer Science and Engineering at Karunya University. His research interests include visual cryptography, visual secret sharing schemes, image hiding, and information retrieval.


Navitha Varghese received the B.Tech degree in Computer Science and Engineering from Model Engineering College, Ernakulam, India, in 2009. She is currently pursuing an M.Tech in Computer Science at Karunya University, Coimbatore. Her research interests include Web Mining and Web technology.

Sonia Singha received the B.Tech degree in Computer Science and Engineering from Calcutta Institute of Technology, Kolkata, India, in 2009. She is currently pursuing an M.Tech in Computer Science at Karunya University, Coimbatore. Her research interests include Data Mining and Image Processing.
