Department of Electrical Engineering, National Institute of Technology, Silchar 788010, Assam, India. Correspondence : sudipta.it@gmail.com
Abstract:
A hybrid image compression method based on two compression techniques, namely Vector Quantization (VQ) and Discrete Cosine Transform (DCT), is proposed in this work. In this approach, a codebook is first generated from ten different images using VQ. The final codebook is then generated using the DCT matrix and is ready for use: any image can be compressed with this codebook. Appropriate codewords are generated for the selected image, and ultimately the compressed form of the image is obtained. Decompression can then be performed to reconstruct the original image. The proposed approach is tested on standard images and its performance is compared with the standard VQ method. The proposed method performs better, as evidenced by higher PSNR values with the hybrid method than with the VQ method.
Keywords:
Image compression, decompression, vector quantization, discrete cosine transform, DCT matrix, PSNR.
1. Introduction:
Image compression addresses the problem of reducing the amount of data required to represent a digital image. The underlying basis of the reduction process is the removal of redundant data. From a mathematical viewpoint, this amounts to transforming a 2-D pixel array into a statistically uncorrelated data set. The transformation is applied prior to storage or transmission of the image. Later the compressed image is decompressed to reconstruct the original image or an approximation of it (Gonzalez and Woods, 2006). The initial focus of research efforts in this field was on the development of analog methods for reducing video transmission bandwidth, a process called bandwidth compression. The advent of the digital computer and the subsequent development of advanced integrated circuits shifted interest from analog to digital compression approaches. With the relatively recent adoption of several key international image compression standards, the field has undergone significant growth through the practical application of the theoretical work that began in the 1940s, when C. E. Shannon and others first formulated the probabilistic view of information and its representation, transmission, and compression (Gonzalez and Woods, 2006; Chanda and Dutta Majumder, 2000; Taubman and Marcellin, 2002). Currently, image compression is recognized as an enabling technology: it is the natural technology for handling the increased spatial resolutions of today's imaging sensors and evolving broadcast television standards. Furthermore, image compression plays a major role in many important and diverse applications, including tele-video-conferencing, remote sensing, document and medical imaging, facsimile transmission (FAX), and the control of remotely piloted vehicles in military, space and hazardous waste management applications.
So, an ever-expanding number of applications depend on the efficient manipulation, storage, and transmission of binary, gray-scale and color images (http://www.data-compression.com/vq.shtml).
Compression techniques can be broadly classified into two categories, namely loss-less compression and lossy compression. Let g denote the digital signal and g' the decompressed form of the compressed digital signal g. Any discrepancy between g' and g is then considered error introduced by the compression technique. Usually the amount of error increases as the amount of data decreases, so the objective of a compression technique is to achieve maximum compression without introducing objectionable error (Chanda and Dutta Majumder, 2000; Taubman and Marcellin, 2002). If the amount of error introduced is zero, we call it loss-less compression; otherwise it is lossy compression. Loss-less compression is perfectly invertible: the original image can be exactly recovered from its compressed representation. Principal loss-less compression strategies are Huffman coding, run-length coding, block coding, quad-tree coding and contour coding. In the case of lossy compression, perfect recovery of the original image is not possible, but the amount of data reduction is greater than with loss-less compression. Lossy compression is useful in applications in which a certain amount of error is an acceptable trade-off for increased compression performance, such as broadcast television, videoconferencing and facsimile transmission. All image compression techniques exploit a common characteristic of most images: neighboring pixels are correlated and therefore contain redundant information. The foremost task then is to find a less correlated representation of the image. Two fundamental components of compression are redundancy reduction and irrelevancy reduction. All compression techniques attempt to remove the redundant information to the extent possible and so derive a less correlated representation of the image.
Principal lossy compression strategies are transform compression, block truncation compression and Vector Quantization (VQ) compression (Chanda and Dutta Majumder, 2000; McGowan; Linde and Gray, 1980). VQ is a powerful method for lossy compression of data such as sounds or images, because their vector representations often occupy only small fractions of their vector spaces. For example, in a 2-D gray-scale image the vector space can be visualized as the [0,0]-[255,255] square in the plane: taking the two components of each vector as XY coordinates, a dot can be plotted for each vector found in the input image. In traditional coding methods based on the DCT (Kesavan; Rao and Yip, 1990; Cabeen and Gent; Ponomarenko et al., 2002), the level of compression and the amount of loss are determined by the quantization of the DCT coefficients; losses in DCT-based compression result from this quantization, and quantization is essential for compression of the image information. The main advantage of the DCT is its energy compaction property: the entire signal energy present before applying the DCT is concentrated in only a few DCT coefficients after transforming, so most of the other coefficients become zero or negligibly small and can be ignored or truncated. Image compression research aims at reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies as much as possible. As the DC components of DCT coefficients reflect the average energy of pixel blocks and the AC components reflect pixel intensity changes, it is conceivable to index and retrieve images directly based on DCT coefficients. However, the index or representation would not be compact, as the number of DCT coefficients is equal to the number of pixels. Therefore it is proposed to use coefficients of some selected image windows.
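The energy-compaction property can be illustrated with a small sketch: for a smooth (here constant) 8-sample block, a 1-D DCT with orthonormal scaling concentrates all the energy in the DC coefficient, while the AC coefficients vanish. The input values are illustrative only.

```python
import math

def dct_1d(block):
    """1-D DCT-II with orthonormal scaling."""
    N = len(block)
    out = []
    for i in range(N):
        c = math.sqrt(1.0 / N) if i == 0 else math.sqrt(2.0 / N)
        out.append(c * sum(block[x] * math.cos((2 * x + 1) * i * math.pi / (2 * N))
                           for x in range(N)))
    return out

# A flat block: all the signal energy ends up in the first (DC) coefficient.
coeffs = dct_1d([100.0] * 8)
print(round(coeffs[0], 4))                      # large DC coefficient
print(max(abs(v) for v in coeffs[1:]) < 1e-9)   # AC coefficients are ~0 -> True
```

For a real image block the AC terms are small rather than exactly zero, which is what makes truncating them a good approximation.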
But the choice of windows affects the performance dramatically, as the objects of interest may be located anywhere in an image. Although VQ offers more compression, it is not widely implemented, for two reasons: the first is the time it takes to generate the codebook, and the second is the speed of the search. Many algorithms have been proposed to increase the speed of the search; some of them reduce the mathematics used to determine the codeword that offers the minimum distortion, while others preprocess the codewords. Hence, the idea in this work is first to compress an image using the VQ method, which retains most of the information of the image while achieving compression, and secondly to redefine the VQ codebook using the DCT matrix. This hybridization of VQ and DCT makes use of the good subjective performance of VQ and the high compression capability of the DCT, resulting in a more efficient algorithm for compression of images than VQ alone.
In view of the above, the main objectives of the present work are: 1. to generate a codebook of images using the VQ method; 2. to redefine the codebook using the DCT matrix; 3. to compare the results with the standard VQ based method. The rest of the paper is organized as follows: Section 2 introduces the concepts of vectors, vector quantization and the formation of the DCT matrix. Hybridization is described in Section 3, and Section 4 presents the results and discussion. Conclusions are drawn in Section 5.
2. Vector Quantization and the DCT Matrix
Figure 1 shows codewords in 1-dimensional space. Here, every number less than -2 is approximated by -3; all numbers between -2 and 0 are approximated by -1; every number between 0 and 2 is approximated by +1; and every number greater than 2 is approximated by +3. The approximate values are uniquely represented by 2 bits. This is a 1-dimensional, 2-bit VQ with a rate of 2 bits/dimension. In this example, the stars are called codevectors.
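This 1-D, 2-bit quantizer can be sketched in a few lines; the sample values below are illustrative.

```python
# The four codevectors of the 1-D, 2-bit VQ described above: every input is
# mapped to the nearest codevector, so each sample needs only 2 bits.
codevectors = [-3.0, -1.0, 1.0, 3.0]

def quantize(x):
    """Return the codevector nearest to x."""
    return min(codevectors, key=lambda c: abs(x - c))

samples = [-2.7, -0.4, 0.9, 5.2]
print([quantize(s) for s in samples])  # -> [-3.0, -1.0, 1.0, 3.0]
```

The decision boundaries fall at -2, 0 and 2, exactly as in the figure.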
A vector quantizer maps k-dimensional vectors in the vector space R^k into a finite set of vectors Y = {y_i : i = 1, 2, ..., N}. Each vector y_i is called a code vector or a codeword, and the set of all the codewords is called a codebook. Associated with each codeword y_i is a nearest-neighbour region called a Voronoi region, defined by

V_i = { x ∈ R^k : ||x − y_i|| ≤ ||x − y_j||, for all j ≠ i }    (1)

The set of encoding regions partitions the entire space R^k such that

∪_{i=1}^{N} V_i = R^k  and  V_i ∩ V_j = ∅ for i ≠ j

Thus the set of all encoding regions is called the partition of the space. As an example we take vectors in the two-dimensional case, without loss of generality, in Figure 2. In the figure, input vectors are marked with an x, codewords are marked with solid circles, and the Voronoi regions are separated by boundary lines. The figure shows some vectors in space. Associated with each cluster of vectors is a representative codeword, and each codeword resides in its own Voronoi region. Given an input vector, the codeword chosen to represent it is the one in the same Voronoi region: the representative codeword is the closest to the input vector in Euclidean distance. The Euclidean distance is defined by

d(x, y_i) = sqrt( Σ_{j=1}^{k} (x_j − y_ij)^2 )    (2)

where x_j is the jth component of the input vector, and y_ij is the jth component of the codeword y_i. In Figure 2 there are 13 regions and 13 solid circles, each of which can be uniquely represented by 4 bits.
The DCT equation (Eq. 3) computes the (i, j)th entry of the DCT of an image:

D(i, j) = (2/N) C(i) C(j) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} p(x, y) cos[(2x+1)iπ / 2N] cos[(2y+1)jπ / 2N]    (3)

C(u) = 1/√2  if u = 0
C(u) = 1     if u > 0    (4)

Here p(x, y) is the (x, y) element of the image represented by the matrix p, and N is the size of the block on which the DCT is done. The equation calculates one entry (i, j) of the transformed image from the pixel values of the original image matrix. For the standard 8x8 block, N equals 8 and x and y range from 0 to 7, so D(i, j) becomes

D(i, j) = (1/4) C(i) C(j) Σ_{x=0}^{7} Σ_{y=0}^{7} p(x, y) cos[(2x+1)iπ / 16] cos[(2y+1)jπ / 16]    (5)
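As a direct sketch of this computation, one entry of the 8x8 DCT can be evaluated from the double sum; the flat 128-valued block is an illustrative input.

```python
import math

def C(u):
    """Scaling factor: 1/sqrt(2) for the zero-frequency index, 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if u == 0 else 1.0

def dct_entry(p, i, j, N=8):
    """One entry D(i, j) of the N x N DCT, computed from the double sum."""
    s = sum(p[x][y]
            * math.cos((2 * x + 1) * i * math.pi / (2 * N))
            * math.cos((2 * y + 1) * j * math.pi / (2 * N))
            for x in range(N) for y in range(N))
    return (2.0 / N) * C(i) * C(j) * s

p = [[128.0] * 8 for _ in range(8)]        # a flat (one-colour) 8x8 block
print(round(dct_entry(p, 0, 0), 1))        # large DC entry
print(abs(dct_entry(p, 3, 5)) < 1e-9)      # AC entries vanish -> True
```

A one-colour block yields a single large DC value and zero AC values, as the text below describes.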
Because the DCT uses cosine functions, the resulting matrix depends on the horizontal, diagonal, and vertical frequencies. Therefore an image block with a lot of change in frequency has a very random-looking resulting matrix, while an image matrix of just one colour has a resulting matrix with a large value for the first element and zeroes for the other elements. To get the matrix form of equation (3), we may use equation (6):

T(i, j) = 1/√N,                              if i = 0
T(i, j) = √(2/N) cos[(2j+1)iπ / 2N],         if i > 0    (6)

The DCT matrix for a block of size 8x8 is listed in Table 1.

Table 1  DCT matrix
 .3536  .3536  .3536  .3536  .3536  .3536  .3536  .3536
 .4904  .4157  .2778  .0975 -.0975 -.2778 -.4157 -.4904
 .4619  .1913 -.1913 -.4619 -.4619 -.1913  .1913  .4619
 .4157 -.0975 -.4904 -.2778  .2778  .4904  .0975 -.4157
 .3536 -.3536 -.3536  .3536  .3536 -.3536 -.3536  .3536
 .2778 -.4904  .0975  .4157 -.4157 -.0975  .4904 -.2778
 .1913 -.4619  .4619 -.1913 -.1913  .4619 -.4619  .1913
 .0975 -.2778  .4157 -.4904  .4904 -.4157  .2778 -.0975

The first row (i = 0) of the matrix has all the entries equal to 1/√8 ≈ .3536, from equation (6). The columns of T form an orthogonal set, so T is an orthogonal matrix. When computing the inverse DCT the orthogonality of T is important, as the inverse of T is its transpose, which is easy to calculate.
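As a sketch, the DCT matrix T of equation (6) can be generated and its orthogonality checked numerically, matching the values listed in Table 1.

```python
import math

def dct_matrix(N):
    """Build the N x N DCT matrix T from equation (6)."""
    T = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if i == 0:
                T[i][j] = 1.0 / math.sqrt(N)
            else:
                T[i][j] = math.sqrt(2.0 / N) * math.cos((2 * j + 1) * i * math.pi / (2 * N))
    return T

T = dct_matrix(8)
print(round(T[0][0], 4), round(T[1][0], 4))  # .3536 .4904, as in Table 1

# T times its transpose should be the 8x8 identity (up to floating-point error),
# which is why the inverse DCT can simply use the transpose of T.
TTt = [[sum(T[i][k] * T[j][k] for k in range(8)) for j in range(8)] for i in range(8)]
print(all(abs(TTt[i][j] - (1.0 if i == j else 0.0)) < 1e-9
          for i in range(8) for j in range(8)))  # -> True
```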
2.7 Quantization
Our 8x8 block of DCT coefficients is now ready for compression by quantization. A remarkable and highly useful feature of this step is that varying levels of image compression and quality are obtainable through the selection of specific quantization matrices. This enables the user to choose a quality level ranging from 1 to 100, where 1 gives the poorest image quality and highest compression, while 100 gives the best quality and lowest compression. Quantization is achieved by dividing each element in the transformed image matrix D by the corresponding element in the quantization matrix, and then rounding to the nearest integer value.
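The divide-and-round step can be sketched as follows; the 2x2 matrices below are illustrative assumptions, not the standard JPEG tables.

```python
# Illustrative 2x2 corner of a DCT coefficient block D and an assumed
# quantization matrix Q; each coefficient is divided by the corresponding
# Q entry and rounded to the nearest integer.
D = [[231.4, -12.7], [8.3, 0.6]]
Q = [[16.0, 11.0], [12.0, 14.0]]

quantized = [[round(D[i][j] / Q[i][j]) for j in range(2)] for i in range(2)]
print(quantized)  # -> [[14, -1], [1, 0]]
```

Note how the small high-frequency coefficient (0.6) quantizes to zero, which is where most of the compression comes from.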
The mean square error (MSE) between an original image g and its reconstructed version g' is

MSE = (1/M) Σ_{i=1}^{M} (g_i − g'_i)²    (7)

where M is the number of elements in the image. For example, if we wanted to find the MSE between the reconstructed and the original image, then we would take the difference between the two images pixel by pixel, square the results, and average the results. The peak signal-to-noise ratio (PSNR) is then

PSNR = 10 log10[ (2ⁿ − 1)² / MSE ]    (8)

where n is the number of bits per symbol. As an example, if we want to find the PSNR between two 256-grey-level images, then we set n to 8 bits.

Subjective criteria
For subjective measurement the original image and the reconstructed image are shown to a large group of examiners. Each examiner assigns a grade to the reconstructed image with respect to the original image. These grades may be drawn from a subjective scale and may be divided as excellent, good, reasonable, poor and unacceptable. Based on the grades assigned by the examiners, an overall grade is assigned to the reconstructed image. The complement of this grade gives an idea of the subjective error.
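The objective criteria above, MSE and PSNR, can be sketched in a few lines of Python; the tiny 8-bit pixel lists (n = 8) are illustrative inputs.

```python
import math

# Illustrative original and reconstructed 8-bit pixel values.
original      = [52, 55, 61, 66, 70, 61, 64, 73]
reconstructed = [50, 56, 60, 66, 71, 60, 65, 72]

# MSE: average of squared pixel-by-pixel differences.
M = len(original)
mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / M

# PSNR in dB, with n = 8 bits per symbol (peak value 2**8 - 1 = 255).
psnr = 10 * math.log10((2 ** 8 - 1) ** 2 / mse)

print(round(mse, 3), round(psnr, 2))
```

A higher PSNR indicates a reconstruction closer to the original, which is the comparison criterion used in Section 4.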
3. Hybridization
The codebook generated by VQ is transformed by the DCT matrix to obtain the dctcodebook. Each block vector of the input image is then compared with the codewords of the dctcodebook, and the index of the codeword with the minimum Euclidean distance is stored. The index array is the compressed value of the image. To decompress the image, the index value of that image is used: from the index, we can get the values of the corresponding vectors in the dctcodebook. But the original image is the ultimate desire. For that, we can transform the dctcodebook vectors back to their original values using inverse DCT functions or, in a simpler manner, reconstruct the image directly from the codebook with the help of the index array. Thus the ultimate decompressed image is reconstructed, and the performance criterion PSNR is then calculated.
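A minimal sketch of the decompression path just described: the stored index array is mapped back through the codebook to rebuild the image blocks. The codebook and index values are illustrative, not taken from the paper.

```python
# Toy codebook of 2-element block vectors (illustrative values).
codebook = [(10.0, 12.0), (200.0, 205.0), (90.0, 95.0)]

# The compressed representation of four image blocks: just codebook indices.
index_array = [1, 0, 2, 1]

# Decompression: look up each index in the codebook to rebuild the blocks.
reconstructed = [codebook[i] for i in index_array]
print(reconstructed[0])  # -> (200.0, 205.0)
```

Because only the indices are stored, the compressed size depends on the codebook size (here 2 bits per block for 3-4 codewords), not on the block contents.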
4. Performance
The performance of the VQ based compression and the new VQ-DCT based compression approach is evaluated in terms of PSNR. The output results for the images considered are presented below, with the size of the image, the size of the block, and the number of codewords in the codebook. The results of the algorithms on different images with block size 8x8 are presented in Table 2.

Image      Size of the Image   Block Size   Number of Vectors   PSNR of VQ based compression   PSNR of the proposed approach
Airplane   512x512             8x8          256                 25.899132                      27.276513
Baboon     512x512             8x8          256                 20.858301                      22.302915
Barb       512x512             8x8          256                 23.168871                      25.725581
Boat       512x512             8x8          256                 25.358310                      27.576528
Couple     512x512             8x8          256                 25.910507                      27.983451
Kiel       512x512             8x8          256                 21.531078                      24.938610
Lake       512x512             8x8          256                 22.300915                      25.625083
Lena       512x512             8x8          256                 25.925383                      28.076503
Man        512x512             8x8          256                 24.933610                      25.925388
Peppers    512x512             8x8          256                 25.875566                      27.276513
Zelda      512x512             8x8          256                 28.670153                      29.929355
Table 2  Performance in terms of PSNR with block size 8x8

The results of the algorithms on different images with block size 4x4 are presented in Table 3.

Image      Size of the Image   Block Size   Number of Vectors   PSNR of VQ based compression   PSNR of the proposed approach
Airplane   512x512             4x4          256                 25.911146                      27.566214
Baboon     512x512             4x4          256                 20.867771                      22.675425
Barb       512x512             4x4          256                 23.907484                      25.729589
Boat       512x512             4x4          256                 25.272032                      27.546538
Couple     512x512             4x4          256                 25.953247                      27.883403
Kiel       512x512             4x4          256                 21.953107                      24.969231

Table 3  Performance in terms of PSNR with block size 4x4

It can be observed from Tables 2 and 3 that the performance of both algorithms is better with a smaller block size than with a larger one. When the proposed algorithm is compared with the standard VQ based method, the PSNR values for all the images are considerably higher with the proposed method, implying that the retrieved image quality with the proposed method is superior.
5. Conclusion
Standard images are compressed using both the standard VQ method and the proposed method with different block sizes, and the PSNR is obtained as the performance index in each case for comparison. It is observed that with the proposed image compression method the PSNR is improved for all the images, as is evident from Tables 2 and 3. Thus the quality of the reconstructed image is enhanced as the PSNR increases. The new approach may therefore be considered superior and can be used for the further development of compression and decompression tools.
References:
1. R. C. Gonzalez and R. E. Woods (2006), Digital Image Processing, Pearson Education, Second Impression.
2. http://www.data-compression.com/vq.shtml.
3. C. Christopoulos, A. Skodras and T. Ebrahimi (2000), "The JPEG2000 still image coding system: an overview", IEEE Trans. on Consumer Electronics, Vol. 46, Issue 4, pp. 1103-1127.
4. B. Chanda and D. Dutta Majumder (2000), Digital Image Processing and Analysis, Prentice Hall Pvt. Ltd.
5. D. Taubman and M. Marcellin (2002), JPEG 2000: Image Compression Fundamentals, Standards and Practice, Boston: Kluwer.
6. J. McGowan, AVI Overview: Video Compression Technologies, available at http://www.jmcgowan.com/avialgo.html.
7. H. Kesavan, Choosing a DCT Quantization Matrix for JPEG Encoding, available at www.jmcgowan.com/avialgo.html.
8. K. R. Rao and P. Yip (1990), Discrete Cosine Transform: Algorithms, Advantages, Applications, Academic Press.
9. K. Cabeen and P. Gent, Image Compression and the Discrete Cosine Transform, available at http://online.redwoods.cc.ca.us/instruct/darnold/LAPROJ/Fall98/PKen/dct.pdf.
10. Y. Linde, A. Buzo and R. M. Gray (1980), "An algorithm for vector quantizer design", IEEE Transactions on Communications, Vol. COM-28, pp. 84-95.