
CHAPTER-4 IMAGE COMPRESSION USING NEURAL NETWORKS

4.1 Image Compression


Images require large amounts of storage space, large transmission bandwidth and long transmission times. Currently, the only practical way to reduce these resource requirements is to compress images, so that they can be transmitted more quickly and then decompressed by the receiver. In grey-scale image processing there are 256 intensity levels (scales) of grey: 0 is black and 255 is white. Each level is represented by an 8-bit binary number, so black is 00000000 and white is 11111111. An image can therefore be thought of as a grid of pixels, where each pixel is represented by the 8-bit binary value of its grey level.

Figure 4.1

The resolution of an image is measured in pixels per inch, so 500 dpi means that a pixel is 1/500th of an inch across. Digitising a one-inch-square image at 500 dpi therefore requires 8 x 500 x 500 = 2 million storage bits. With this representation it is clear that image data compression is of great advantage if many images are to be stored, transmitted or processed. As one review puts it, "Image compression algorithms aim to remove redundancy in data in a way which makes image reconstruction possible." In other words, image compression algorithms try to exploit redundancies in the data: they calculate which data must be kept in order to reconstruct the original image and, therefore, which data can be thrown away. By removing the redundant data, the image can be represented in a smaller number of bits, and hence can be compressed.

But what is redundant information? Redundancy reduction is aimed at removing duplication in the image. According to Saha there are two different types of redundancy relevant to images: (i) spatial redundancy, the correlation between neighbouring pixels; and (ii) spectral redundancy, the correlation between different colour planes and spectral bands. Where there is high correlation there is also high redundancy, so it may not be necessary to record the data for every pixel. There are two parts to compression: 1. Find the image data properties: grey-level histogram, image entropy, correlation functions, etc. 2. Find an appropriate compression technique for an image with those properties.

4.1.1 Image Data Properties


In order to make meaningful comparisons of different image compression techniques it is necessary to know the properties of the image. One such property is the image entropy: a highly correlated picture will have a low entropy. For example, a very low-frequency, highly correlated image will be compressed well by many different techniques; it is the image property, more than the compression algorithm, that gives the good compression rate. Also, a compression algorithm that is good for some images will not necessarily be good for all images; it would be better if we could say what the best compression technique is, given the type of image we have. One way of calculating the entropy is as follows. If an image has G grey-levels and the probability of grey-level k is P(k), the entropy H_e is

H_e = -\sum_{k=0}^{G-1} P(k) \log_2 P(k)


Information redundancy, r, is

r = b - H_e

where b is the smallest number of bits with which the image quantization levels can be represented. Information redundancy can only be evaluated if a good estimate of the image entropy is available, but this is usually not the case because the necessary statistical information is not known. An estimate of H_e can, however, be obtained from the grey-level histogram. If h(k) is the frequency of grey-level k in an image f of size M x N, then an estimate of P(k) is

\hat{P}(k) = \frac{h(k)}{MN}

Therefore,

\hat{H}_e = -\sum_{k=0}^{G-1} \hat{P}(k) \log_2 \hat{P}(k)

The compression ratio is then

K = \frac{b}{\hat{H}_e}
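As a worked illustration of these quantities, the following sketch (assuming Python with NumPy, which the chapter itself does not use) estimates H_e from a grey-level histogram and derives r and K for an 8-bit image:

```python
import numpy as np

def entropy_and_redundancy(img, b=8):
    """Estimate image entropy, information redundancy and compression ratio
    from the grey-level histogram of an 8-bit image."""
    # h(k): frequency of each grey level k = 0..255
    h, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    # P(k) estimate: h(k) / (M*N)
    p = h / img.size
    p = p[p > 0]                      # empty bins contribute nothing (0 * log 0 = 0)
    He = -np.sum(p * np.log2(p))      # estimated entropy
    r = b - He                        # information redundancy
    K = b / He                        # compression ratio
    return He, r, K

# Illustrative usage: a flat image has He near 0, a uniform-noise image He near 8
img = np.random.randint(0, 256, size=(512, 512), dtype=np.uint8)
He, r, K = entropy_and_redundancy(img)
print(f"He = {He:.3f} bits, r = {r:.3f} bits, K = {K:.3f}")
```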

4.1.2 Compression techniques


There are many different forms of data compression. This investigation concentrates on transform coding and, more specifically, on wavelet transforms. Image data can be represented by the coefficients of a discrete image transform, and coefficients that make only small contributions to the information content can be omitted. Usually the image is split into blocks (sub-images) of 8x8 or 16x16 pixels and each block is transformed separately. However, this does not take into account any correlation between blocks and creates "blocking artifacts", which are undesirable if a smooth image is required.


The wavelet transform, however, is applied to the entire image rather than to sub-images, so it produces no blocking artifacts. This is a major advantage of wavelet compression over other transform-based compression methods.

4.2 DISCRETE WAVELET TRANSFORM


Calculating wavelet coefficients at every possible scale is a fair amount of work, and it generates an awful lot of data. If we choose only a subset of scales and positions based on powers of two (the so-called dyadic scales and positions), then our analysis will be much more efficient and just as accurate. We obtain such an analysis from the discrete wavelet transform (DWT). An efficient way to implement this scheme using filters was developed in 1988 by Mallat. The Mallat algorithm is in fact a classical scheme known in the signal processing community as a two-channel subband coder. This very practical filtering algorithm yields a fast wavelet transform: a box into which a signal passes, and out of which wavelet coefficients quickly emerge. Like the Fourier series expansion, the wavelet series expansion maps a function of a continuous variable into a sequence of coefficients. If the function being expanded is discrete, the resulting coefficients are called the discrete wavelet transform (DWT). For example, if f(n) = f(x_0 + n\Delta x) for some x_0, \Delta x and n = 0, 1, 2, \ldots, M-1, the wavelet series expansion coefficients for f(x) are defined as

W_\phi(j_0, k) = \frac{1}{\sqrt{M}} \sum_{n} f(n)\, \phi_{j_0,k}(n)

and the forward DWT coefficients for the sequence f(n) are

W_\psi(j, k) = \frac{1}{\sqrt{M}} \sum_{n} f(n)\, \psi_{j,k}(n), \quad j \ge j_0

The \phi_{j_0,k}(n) and \psi_{j,k}(n) in these equations are sampled versions of the basis functions \phi_{j_0,k}(x) and \psi_{j,k}(x).


4.2.1 DWT and sub-signal encoding


The DWT provides sufficient information for the analysis and synthesis of a signal, but is, advantageously, much more efficient. Discrete wavelet analysis is computed using the concept of filter banks: filters of different cut-off frequencies analyse the signal at different scales. Resolution is changed by the filtering; the scale is changed by down-sampling and up-sampling. If a signal is put through two filters: (i) a high-pass filter, high-frequency information is kept and low-frequency information is lost; (ii) a low-pass filter, low-frequency information is kept and high-frequency information is lost; then the signal is effectively decomposed into two parts, a detail part (high frequency) and an approximation part (low frequency). The sub-signal produced by the low-pass filter has a highest frequency equal to half that of the original. According to Nyquist sampling, this change in frequency range means that only half of the original samples need to be kept in order to perfectly reconstruct the signal; more specifically, down-sampling can be used to discard every second sample. The scale has now been doubled. The resolution has also been changed: the filtering made the frequency resolution better but reduced the time resolution. The approximation sub-signal can then be put through a filter bank again, and this is repeated until the required level of decomposition has been reached. A small code sketch of this scheme is given below; the ideas are also shown in figure 4.2.
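A minimal sketch of the scheme, assuming the PyWavelets library (the figures elsewhere in this chapter come from the MATLAB Wavelet Toolbox), shows one level of filtering and down-sampling and then a three-level decomposition; the signal and wavelet choice are illustrative:

```python
import numpy as np
import pywt

# An illustrative test signal: a slow oscillation plus fine-scale noise
t = np.linspace(0, 1, 1024)
signal = np.sin(2 * np.pi * 5 * t) + 0.1 * np.random.randn(t.size)

# One level: low-pass -> approximation cA, high-pass -> detail cD,
# each followed by down-sampling (every second sample discarded)
cA, cD = pywt.dwt(signal, 'db2')
print(len(signal), len(cA), len(cD))   # cA and cD are roughly half the original length

# Repeating the split on the approximation gives a multi-level decomposition
coeffs = pywt.wavedec(signal, 'db2', level=3)   # [cA3, cD3, cD2, cD1]
print([len(c) for c in coeffs])
```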


Figure 4.2

The DWT is obtained by collecting together the coefficients of the final approximation sub-signal and all the detail sub-signals. Overall, the filters have the effect of separating out finer and finer detail; if all the details are added back together, the original signal should be reproduced. Using a further analogy from Hubbard, this decomposition is like decomposing the ratio 87/7 into parts of increasing detail, such that 87/7 ≈ 10 + 2 + 0.4 + 0.02 + 0.008 + 0.0005. The parts can then be recombined to give 12.4285, which is an approximation of the original number 87/7 = 12.42857...
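To check the complementary claim that recombining the approximation and all the details reproduces the original, a short sketch (again assuming PyWavelets) decomposes a signal and reconstructs it to floating-point precision:

```python
import numpy as np
import pywt

signal = np.random.randn(512)

# Decompose to three levels, then rebuild from the approximation plus all details
coeffs = pywt.wavedec(signal, 'haar', level=3)
reconstructed = pywt.waverec(coeffs, 'haar')

# Reconstruction error is at floating-point precision
print(np.max(np.abs(signal - reconstructed)))   # ~1e-15
```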


4.2.2 Conservation and Compaction of Energy


An important property of wavelet analysis is the conservation of energy. Energy is defined as the sum of the squares of the values: the energy of an image is the sum of the squares of its pixel values, and the energy in the wavelet transform of an image is the sum of the squares of the transform coefficients. During wavelet analysis the energy of a signal is divided between the approximation and detail signals, but the total energy does not change. During compression, however, energy is lost, because thresholding changes the coefficient values and hence the compressed version contains less energy. The compaction of energy describes how much energy has been compacted into the approximation signal during wavelet analysis. Compaction occurs wherever the magnitudes of the detail coefficients are significantly smaller than those of the approximation coefficients. Compaction is important when compressing signals because the more energy that has been compacted into the approximation signal, the less energy can be lost during compression.

Thresholding in Wavelet Compression

For some signals, many of the wavelet coefficients are close to or equal to zero, and thresholding can modify the coefficients to produce more zeros. In hard thresholding, any coefficient whose magnitude is below a chosen threshold is set to zero. This produces many consecutive zeros, which can be stored in much less space and transmitted more quickly by using entropy coding. An important point about wavelet compression is explained by Aboufadel: "The use of wavelets and thresholding serves to process the original signal, but, to this point, no actual compression of data has occurred". Wavelet analysis does not itself compress a signal; it simply transforms the signal into a form that can be compressed well by standard entropy coding techniques, such as Huffman coding. Huffman coding works well on a signal processed by wavelet analysis because it relies on the data values being small, and in particular zero: it gives large numbers more bits and small numbers fewer bits, so long strings of zeros can be encoded very efficiently. Therefore an actual percentage compression value can only be stated in conjunction with an entropy coding technique. To compare different wavelets, the number of zeros is used: more zeros allow a higher compression rate, and many consecutive zeros give an excellent compression rate.
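The following sketch (assuming PyWavelets, an orthogonal wavelet and periodic boundary handling so that energy is conserved exactly) illustrates conservation, compaction and the effect of hard thresholding; the signal and threshold value are illustrative:

```python
import numpy as np
import pywt

energy = lambda a: float(np.sum(np.asarray(a, dtype=float) ** 2))

signal = np.cumsum(np.random.randn(1024))          # smooth-ish random walk
cA, cD = pywt.dwt(signal, 'db4', mode='periodization')

print(energy(signal))               # total energy of the signal
print(energy(cA) + energy(cD))      # conserved by the orthogonal transform
print(energy(cA) / energy(signal))  # fraction compacted into the approximation

# Hard thresholding: set small detail coefficients to zero
threshold = 0.5
cD_hard = pywt.threshold(cD, threshold, mode='hard')
print(np.count_nonzero(cD), np.count_nonzero(cD_hard))  # more zeros to entropy-code
```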

4.3 Wavelets and Compression


Wavelets are useful for compressing signals, but they also have far more extensive uses. They can be used to process and improve signals, and they are of particular use in fields such as medical imaging, where image degradation is not tolerated. For example, they can be used to remove noise from an image: if the noise lives at very fine scales, wavelets can be used to cut out those fine scales, effectively removing the noise.

4.3.1 The Fingerprint example


The FBI have been using wavelet techniques in order to store and process fingerprint images more efficiently. The problem the FBI faced was that they had over 200 million sets of fingerprints, with up to 30,000 new ones arriving each day, so searching through them was taking too long. The FBI thought that computerising the fingerprint images would be a better solution; however, it was estimated that each fingerprint would use 600 Kbytes of memory and, even worse, 2000 terabytes of storage space would be required to hold all the image data. The FBI then turned to wavelets for help, adapting a technique to compress each image into just 7% of the original space. Even more remarkably, according to Kiernan, when the images are decompressed they show "little distortion". Using wavelets the police hope to check fingerprints within 24 hours.

Earlier attempts to compress the images used the JPEG format; this breaks an image into blocks eight pixels square, uses Fourier transforms to transform the data, and then compresses it. However, this was unsatisfactory: trying to compress images this way to less than 10% of their original size caused "tiling artifacts" to occur, leaving marked boundaries in the image. As the fingerprint matching algorithm relies on accurate data to match images, using JPEG would weaken the success of the process. Wavelets, however, do not create these "tiles" or "blocks"; they work on the image as a whole, collecting detail at certain levels across the entire image. Therefore wavelets offered excellent compression ratios and little image degradation; overall they outperformed the techniques based on Fourier transforms. The basic steps used in the fingerprint compression were (see the sketch below):
1. Digitise the source image into a signal s.
2. Decompose the signal s into a sequence of wavelet coefficients w.
3. Modify the coefficients in w, using thresholding, to give a sequence w'.
4. Use quantisation to convert w' to a sequence q.
5. Apply entropy encoding to compress q into a code e.
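A rough sketch of steps 2-4 of this pipeline, assuming PyWavelets with an illustrative wavelet, threshold and quantisation step (the FBI's actual WSQ standard specifies its own filters and quantiser), could look like this:

```python
import numpy as np
import pywt

def compress_sketch(image, wavelet='bior4.4', level=4, thresh=10.0, step=5.0):
    # Step 2: decompose the image into wavelet coefficients w
    coeffs = pywt.wavedec2(image.astype(float), wavelet, level=level)
    w, slices = pywt.coeffs_to_array(coeffs)
    # Step 3: hard-threshold the coefficients (w -> w')
    w_t = pywt.threshold(w, thresh, mode='hard')
    # Step 4: uniform quantisation of w' to integer symbols q
    q = np.round(w_t / step).astype(np.int32)
    # Step 5: entropy coding (e.g. Huffman) would be applied to q here
    return q, slices

# Step 1 stand-in: a random array in place of a digitised fingerprint image
image = np.random.randint(0, 256, size=(256, 256))
q, slices = compress_sketch(image)
print(q.size, np.count_nonzero(q))    # many zeros -> good entropy coding
```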

4.3.2 2D Wavelet Analysis


Images are treated as two-dimensional signals: they change both horizontally and vertically, so 2D wavelet analysis must be used for images. 2D wavelet analysis uses the same mother wavelets but requires an extra step at every level of decomposition. The 1D analysis separated the high-frequency information from the low-frequency information at every level of decomposition, so only two sub-signals were produced at each level. In 2D, images are considered to be matrices with N rows and M columns. At every level of decomposition the rows are filtered first, and then the approximation and details produced from this are filtered along the columns.


Figure 4.3

At every level, four sub-images are obtained: the approximation, the vertical detail, the horizontal detail and the diagonal detail. Below, the Saturn image has been decomposed to one level; the wavelet analysis has found how the image changes vertically, horizontally and diagonally.

Figure 4.4 2-D Decomposition of Saturn Image to level 1


To get the next level of decomposition, the approximation sub-image is itself decomposed; this idea can be seen in figure 4.5.

Figure 4.5 A screen print from the MATLAB Wavelet Toolbox GUI showing the Saturn image decomposed to level 3.

Only the 9 detail sub-images and the final approximation sub-image are required to reconstruct the image perfectly.
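The same level-3 structure can be produced programmatically. In the sketch below (PyWavelets assumed, with a random array standing in for the Saturn image) the decomposition yields one final approximation plus 3 x 3 = 9 detail sub-images, and these alone reconstruct the image:

```python
import numpy as np
import pywt

image = np.random.rand(256, 256)          # stand-in for the Saturn image

# Level-3 2-D decomposition: [cA3, (cH3, cV3, cD3), (cH2, cV2, cD2), (cH1, cV1, cD1)]
coeffs = pywt.wavedec2(image, 'haar', level=3)

cA3 = coeffs[0]
print('final approximation:', cA3.shape)
for lvl, (cH, cV, cD) in zip((3, 2, 1), coeffs[1:]):
    print(f'level {lvl} details (horizontal, vertical, diagonal):',
          cH.shape, cV.shape, cD.shape)

# These 10 sub-images are all that is needed for perfect reconstruction
restored = pywt.waverec2(coeffs, 'haar')
print(np.allclose(image, restored))
```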

4.4 EXAMPLES OF WAVELETS


The figure below illustrates four different types of wavelet basis functions.

Fig 4.6: Different wavelet families

The different families make trade-offs between how compactly the basis functions are localized in space and how smooth they are. Within each family of wavelets (such as the Daubechies family) are wavelet subclasses distinguished by the number of filter coefficients and the level of iteration. Wavelets are most often classified within a family by the number of vanishing moments, an extra set of mathematical relationships that the coefficients must satisfy. The extent to which signals can be compacted depends on the number of vanishing moments of the wavelet function used.
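The families shown in the figure can be inspected directly; the short sketch below (PyWavelets assumed) lists the built-in families and the filter lengths of a few Daubechies wavelets, which grow with the number of vanishing moments (dbN has N vanishing moments and a filter of length 2N):

```python
import pywt

print(pywt.families())            # e.g. ['haar', 'db', 'sym', 'coif', 'bior', ...]
print(pywt.wavelist('db')[:6])    # first few Daubechies wavelets

# dbN has N vanishing moments and a decomposition filter of length 2N
for name in ('db1', 'db2', 'db4', 'db8'):
    w = pywt.Wavelet(name)
    print(name, 'filter length =', w.dec_len)
```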


4.5 A SIMPLE NEURON


An artificial neuron is a device with many inputs and one output. The neuron has two modes of operation: the training mode and the using mode. 1. In the training mode, the neuron can be trained to fire (or not) for particular input patterns. 2. In the using mode, when a taught input pattern is detected at the input, its associated output becomes the current output. If the input pattern does not belong to the taught list of input patterns, the firing rule is used to determine whether to fire or not.
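A minimal sketch of such a neuron, with illustrative weights, threshold and a simple step firing rule:

```python
import numpy as np

class SimpleNeuron:
    """Many inputs, one output; fires (outputs 1) when the weighted sum
    of its inputs exceeds a threshold."""

    def __init__(self, weights, threshold):
        self.weights = np.asarray(weights, dtype=float)
        self.threshold = threshold

    def fire(self, inputs):
        total = np.dot(self.weights, inputs)   # total weighted input
        return 1 if total > self.threshold else 0

# "Taught" to fire only when both inputs are active (an AND-like pattern)
neuron = SimpleNeuron(weights=[1.0, 1.0], threshold=1.5)
for pattern in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(pattern, '->', neuron.fire(pattern))
```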

Fig-4.7: Network Layers

4.6 TRANSFER FUNCTION


The behavior of an ANN (Artificial Neural Network) depends on both the weights and the input-output function (transfer function) that is specified for the units. This function typically falls into one of three categories:
1. Linear (or ramp): the output activity is proportional to the total weighted input.
2. Threshold: the output is set at one of two levels, depending on whether the total input is greater than or less than some threshold value.
3. Sigmoid: the output varies continuously but not linearly as the input changes.
Sigmoid units bear a greater resemblance to real neurons than do linear or threshold units, but all three must be considered rough approximations.
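The three categories can be written down directly; in the sketch below the slope and threshold values are illustrative:

```python
import numpy as np

def linear(x, slope=1.0):
    # Output proportional to the total weighted input
    return slope * x

def threshold(x, theta=0.0):
    # One of two levels, depending on which side of the threshold the input falls
    return np.where(x > theta, 1.0, 0.0)

def sigmoid(x):
    # Varies continuously but not linearly with the input
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(linear(x), threshold(x), sigmoid(x), sep='\n')
```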

Fig-4.8: Transfer Functions

To make a neural network that performs some specific task, we must choose how the units are connected to one another, and we must set the weights on the connections appropriately. The connections determine whether it is possible for one unit to influence another. The weights specify the strength of the influence. A three-layer network can be taught to perform a particular task by using the following procedure:
1. The network is presented with training examples, which consist of a pattern of activities for the input units together with the desired pattern of activities for the output units.
2. It is determined how closely the actual output of the network matches the desired output.
3. The weight of each connection can be changed so that the network produces a better approximation of the desired output.

4.6.1 LEARNING ALGORITHMS OF NEURAL NETWORKS


Learning is a process by which the free parameters of a neural network are adapted through a process of stimulation by the environment in which the network is embedded. The type of learning is determined by the manner in which the parameter changes take place. All learning methods used for neural networks can be classified into two major categories:

SUPERVISED LEARNING incorporates an external teacher, so that each output unit is told what its desired response to input signals ought to be. During the learning process, global information may be required. Paradigms of supervised learning include error-correction learning, reinforcement learning and stochastic learning. An important issue concerning supervised learning is the problem of error convergence, i.e. the minimization of the error between the desired and computed unit values. The aim is to determine a set of weights which minimizes the error. One well-known method, common to many learning paradigms, is least mean square (LMS) convergence.

UNSUPERVISED LEARNING uses no external teacher and is based only upon local information. It is also referred to as self-organization, in the sense that it self-organizes the data presented to the network and detects their emergent collective properties. Paradigms of unsupervised learning include Hebbian learning and competitive learning.
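Returning to supervised learning, the sketch below applies the LMS (Widrow-Hoff) rule to a single linear unit as a concrete instance of error-correction learning; the data, target mapping and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target linear mapping the "teacher" knows: d = 2*x1 - 1*x2 + 0.5
X = rng.normal(size=(200, 2))
d = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5

w = np.zeros(2)
b = 0.0
lr = 0.05                                  # learning rate

for epoch in range(50):
    for x, target in zip(X, d):
        y = np.dot(w, x) + b               # unit's computed output
        error = target - y                 # desired minus computed value
        w += lr * error * x                # LMS weight update
        b += lr * error

print(w, b)    # converges towards (2, -1) and 0.5
```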

4.7 THE BACK-PROPAGATION ALGORITHM


A back propagation neural network uses a feed-forward topology, supervised learning, and the back propagation learning algorithm. This algorithm was responsible in large part for the re-emergence of neural networks in the mid-1980s. Back propagation is a general-purpose learning algorithm. It is powerful but also expensive in terms of the computational requirements for training. A back propagation network with a single hidden layer of processing elements can model any continuous function to any degree of accuracy (given enough processing elements in the hidden layer).


Fig-4.9: Back Propagation Network

There are literally hundreds of variations of back propagation in the neural network literature, and all claim to be superior to basic back propagation in one way or another. Indeed, since back propagation is based on a relatively simple form of optimization known as gradient descent, mathematically astute observers soon proposed modifications using more powerful techniques such as conjugate gradients and Newton's method. However, basic back propagation is still the most widely used variant. Its two primary virtues are that it is simple and easy to understand, and that it works for a wide range of problems. The basic back propagation algorithm consists of three steps:
1. The input pattern is presented to the input layer of the network. These inputs are propagated through the network until they reach the output units. This forward pass produces the actual or predicted output pattern.
2. Because back propagation is a supervised learning algorithm, the desired outputs are given as part of the training vector. The actual network outputs are subtracted from the desired outputs and an error signal is produced.
3. This error signal is then the basis for the back propagation step, whereby the errors are passed back through the neural network by computing the contribution of each hidden processing unit and deriving the corresponding adjustment needed to produce the correct output. The connection weights are then adjusted and the neural network has just learned from an experience.
Two major learning parameters are used to control the training process of a back propagation network. The learning rate specifies whether the neural network is going to make major adjustments after each learning trial or only minor adjustments. Momentum is used to control possible oscillations in the weights, which can be caused by alternately signed error signals. While most commercial back propagation tools provide anywhere from one to ten or more parameters to set, these two usually have the most impact on neural network training time and performance.
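A compact sketch of these three steps for a single-hidden-layer network, including the learning-rate and momentum parameters discussed above (the network size, training data and parameter values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# XOR as a tiny training set: input patterns and desired outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# One hidden layer of 4 units
W1, b1 = rng.normal(scale=0.5, size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(scale=0.5, size=(4, 1)), np.zeros(1)

lr, momentum = 0.5, 0.9
vW1, vb1, vW2, vb2 = 0.0, 0.0, 0.0, 0.0   # momentum terms

for epoch in range(5000):
    # Step 1: forward pass, propagating the inputs to the output units
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)

    # Step 2: error signal = desired outputs minus actual outputs
    E = T - Y

    # Step 3: back-propagate the error and adjust the weights
    dY = E * Y * (1 - Y)                  # output-layer delta
    dH = (dY @ W2.T) * H * (1 - H)        # hidden-layer delta (each unit's contribution)

    vW2 = momentum * vW2 + lr * H.T @ dY
    vb2 = momentum * vb2 + lr * dY.sum(axis=0)
    vW1 = momentum * vW1 + lr * X.T @ dH
    vb1 = momentum * vb1 + lr * dH.sum(axis=0)
    W2 += vW2; b2 += vb2; W1 += vW1; b1 += vb1

print(Y.round(3))    # should approach [0, 1, 1, 0]
```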

4.7.1 REINFORCEMENT LEARNING


Reinforcement learning is a form of supervised learning, because the network gets some feedback from its environment. The feedback signal (a yes/no reinforcement signal) is only evaluative, not instructive: if the reinforcement signal says that a particular output is wrong, it gives no hint as to what the right answer should be. It is therefore important in a reinforcement learning network to implement some source of randomness in the network, so that the space of possible outputs can be explored until a correct value is found. Reinforcement learning is sometimes called "learning with a critic", as opposed to "learning with a teacher", which refers to more traditional learning schemes where an error signal is generated that also contains information about the direction in which the synaptic weights of the network should be changed in order to improve performance. In reinforcement learning problems it is common to think explicitly of a network functioning in an environment: the environment supplies the inputs to the network, receives its output and then provides the reinforcement signal.


4.8 APPLICATIONS OF NEURAL NETWORKS


In order to develop an integrated understanding of neural networks, we adopt the following top-down perspective, moving from application, through algorithm, to architecture:

Fig-4.10: Flow Chart for Implementation of Applications

The approach is application-motivated, theoretically based and implementation-oriented. The main applications are signal processing and pattern recognition. The algorithmic treatment represents a combination of mathematical theory and heuristic justification for neural models. The ultimate objective is the implementation of digital neurocomputers, embracing VLSI, adaptive, digital and parallel processing technologies.


4.8.1 NEURAL NETWORKS IN PRACTICE


Neural networks have broad applicability to real-world business problems. In fact, they have already been successfully applied in many industries. Since neural networks are best at identifying patterns or trends in data, they are well suited to prediction or forecasting needs, including:
1. Sales forecasting
2. Industrial process control
3. Customer research
4. Data validation
5. Risk management
6. Target marketing
To give some more specific examples, ANNs are also used in the following specific paradigms: recognition of speakers in communications; diagnosis of hepatitis; recovery of telecommunications from faulty software; interpretation of multi-meaning Chinese words; undersea mine detection; texture analysis; three-dimensional object recognition; handwritten word recognition; and facial recognition.

