
CSCI-8810 Image Processing - Project 2(By Akshay Choche)

Edge Detection

Component Labeling

Pyramid Construction

First-Order & Zero-Order Hold Scheme

This project is a continuation of Project 1. In this project I have implemented various edge detection algorithms, such as the Roberts and Kirsch operators, along with a component labeling algorithm. The project can be categorized as below.

1) Edge Detection Algorithms.
 I) Roberts Operator.
 II) Sobel Operator.
 III) Prewitt Operator.
 IV) Kirsch Operator.
 V) Finding the Laplacian of a grayscale image and displaying it on screen.
2) Component Labeling.
3) Pyramid Construction.
4) Expanding a pyramid using hold schemes.
 I) Zero-Order Hold Scheme.
 II) First-Order Hold Scheme.

1.I) Edge Detection Using the Roberts Operator: The Roberts operator is used for detecting the edges in an image. It is a differential operator which approximates the gradient of an image through discrete differentiation. The masks used are shown below.
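The report does not include implementation code, so here is a minimal sketch in Python with NumPy (the language choice is an assumption) of the differencing-and-merge step; the masks themselves are the 2x2 pair shown in Figure 1.1.a.

```python
import numpy as np

def roberts_edges(img):
    """Mask 1 is [[1, 0], [0, -1]] and Mask 2 is [[0, 1], [-1, 0]];
    the two responses are merged as |G1| + |G2|."""
    img = img.astype(float)
    h, w = img.shape
    out = np.zeros((h, w))
    for r in range(h - 1):
        for c in range(w - 1):
            g1 = img[r, c] - img[r + 1, c + 1]   # Mask 1: main-diagonal difference
            g2 = img[r, c + 1] - img[r + 1, c]   # Mask 2: anti-diagonal difference
            out[r, c] = abs(g1) + abs(g2)
    return out

# A vertical step edge from 0 to 255: the response peaks along the boundary.
step = np.zeros((5, 5))
step[:, 2:] = 255
edges = roberts_edges(step)
```

Because the masks are only 2x2, the response is confined to a one-pixel band at the step, which is part of why the later 3x3 operators look better.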

Figure 1.1.a Roberts Operator Mask 1 and Mask 2, respectively. Let's look at some examples below.

Figure 1.1.e After Merging O/P from Mask 1&2

Although this method works well, you will be astonished by the O/P of other operators like Prewitt's or Kirsch's. Let's look at some more examples of the Roberts operator before we move to the next operator.

Figure 1.1.h After Merging O/P from Mask 1&2

I know that the O/P shown above is not convincing, but I deliberately selected such an image so that I could highlight the performance of the other operators. 1.II.a) Edge Detection Using the Sobel Operator: The Sobel operator is also used for edge detection. It too is a differential operator that approximates the gradient of an image, but it uses different masks, which perform a lot better than the Roberts operator. This is evident from the examples shown below. The Sobel operator first finds horizontal and vertical edges and then merges these intermediate results into a final O/P.
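As a sketch of that two-stage process (again in Python/NumPy, an assumption about the implementation), the standard Sobel masks are applied separately and the absolute responses are merged:

```python
import numpy as np

GX = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])  # responds to vertical edges
GY = GX.T                                            # responds to horizontal edges

def apply_mask(img, mask):
    """Slide a 3x3 mask over the interior pixels of the image."""
    h, w = img.shape
    out = np.zeros((h, w))
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r, c] = np.sum(img[r - 1:r + 2, c - 1:c + 2] * mask)
    return out

def sobel_edges(img):
    img = img.astype(float)
    gx = apply_mask(img, GX)        # intermediate O/P for vertical edges
    gy = apply_mask(img, GY)        # intermediate O/P for horizontal edges
    return np.abs(gx) + np.abs(gy)  # merge into the final O/P

step = np.zeros((5, 5))
step[:, 2:] = 255
out = sobel_edges(step)
```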

Figure 1.2.g After Merging O/P from Gx & Gy

It is evident that the Sobel operator performs far better than the Roberts operator. Using the Sobel operator you not only get the boundaries of the coins but also some of the boundary information on the coins themselves (i.e. the faces of the presidents).

Now let's try this operator on the bike image (Figure 1.1.f).

Figure 1.2.j After applying Mask Gx

Figure 1.2.k After Merging O/P from Mask 1&2. We can observe that Figure 1.2.k is way better than Figure 1.1.h.

1.II.b) Edge Detection Using the Alternate Sobel Operator: The only difference between this method and the one shown above is that the masks used differ.

Figure 1.2.r After Merging O/P from Mask 1&2

1.III) Edge Detection Using the Prewitt Operator: The Prewitt operator is another operator that can be used for edge detection. Like Sobel, it first finds the horizontal and vertical edges and then uses both of these intermediate results to generate a final image. The masks used for this operator are shown below.
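The Prewitt masks weight all three rows (or columns) equally; a single-pixel sketch in Python/NumPy (an assumption, as before) shows the horizontal and vertical responses being merged:

```python
import numpy as np

GX = np.array([[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]])  # vertical-edge mask
GY = GX.T                                            # horizontal-edge mask

# A 3x3 neighbourhood straddling a vertical step from 0 to 255.
window = np.array([[0, 0, 255],
                   [0, 0, 255],
                   [0, 0, 255]], dtype=float)

gx = np.sum(window * GX)  # strong response: a vertical edge is present
gy = np.sum(window * GY)  # zero response: no horizontal edge here
g = abs(gx) + abs(gy)     # merged final value for the centre pixel
```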

Figure 1.3.a Masks used for the Prewitt Operator (here A is the original image).

Figure 1.3.b Final result generated using Gx & Gy. Let's look at some examples to illustrate the functioning of this operator.

Figure 1.3.f After Merging O/P from Mask 1&2

Again we can see that this operator, too, works better than the Roberts operator.

Now let's try this operator on the bike image (Figure 1.1.f).

Figure 1.3.j After Merging O/P from Mask 1&2

1.IV) Edge Detection Using the Kirsch Operator: Another variation among edge detection algorithms is the Kirsch operator. This variation finds the edges at positive and negative 45° angles, then uses these intermediate results to generate a final image. The masks used for this operation are shown below.
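The pattern is the same as for the previous operators, only with diagonal masks. The masks below are an illustrative diagonal pair assumed for this sketch; the masks the project actually uses are the ones shown in Figure 1.4.a and may differ.

```python
import numpy as np

# Assumed diagonal difference masks, for illustration only.
K45  = np.array([[ 0,  1, 2], [-1, 0, 1], [-2, -1, 0]])   # +45 degree edges
K135 = np.array([[-2, -1, 0], [-1, 0, 1], [ 0,  1, 2]])   # -45 degree edges

def diagonal_edges(img):
    """Merge the absolute responses of the two diagonal masks."""
    img = img.astype(float)
    h, w = img.shape
    out = np.zeros((h, w))
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            win = img[r - 1:r + 2, c - 1:c + 2]
            out[r, c] = abs(np.sum(win * K45)) + abs(np.sum(win * K135))
    return out

flat = diagonal_edges(np.full((5, 5), 7.0))            # constant image: no edges
diag = diagonal_edges(np.triu(np.ones((5, 5))) * 255)  # diagonal step edge
```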

Figure 1.4.a Masks used for the Kirsch Operator.

Figure 1.4.b Final result generated using Gx & Gy. Let's look at some examples to illustrate the functioning of this operator.

Figure 1.4.f After Merging O/P from Mask 1&2

Again we can see that this operator, too, works better than the Roberts operator.

Now let's try this operator on the bike image (Figure 1.1.f).

Figure 1.4.j After Merging O/P from Mask 1&2

1.V) Edge Detection Using the Laplacian: This edge detection technique is a second-order differential operator; instead of the gradient, it approximates the second derivative of an image through discrete differentiation. It uses the following masks to find the Laplacian of an image.
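A sketch in Python/NumPy (an assumption about the implementation) using the two common Laplacian masks, together with the scaling step needed for display since the Laplacian is signed:

```python
import numpy as np

LAP4 = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]])  # common 4-neighbour mask
LAP8 = np.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]])  # common 8-neighbour mask

def laplacian(img, mask=LAP4):
    img = img.astype(float)
    h, w = img.shape
    out = np.zeros((h, w))
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            out[r, c] = np.sum(img[r - 1:r + 2, c - 1:c + 2] * mask)
    return out

def scale_for_display(lap):
    """Shift and scale the signed Laplacian values into 0..255."""
    lo, hi = lap.min(), lap.max()
    return np.zeros_like(lap) if hi == lo else (lap - lo) * 255.0 / (hi - lo)

spot = np.zeros((5, 5))
spot[2, 2] = 100.0   # a bright point gives a strong second-derivative response
shown = scale_for_display(laplacian(spot))
```

These particular 3x3 masks are the textbook ones; the project's own masks are the ones in Figure 1.5.a,b.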

Figure 1.5.a,b Masks used for the Laplacian. Below is an example where we find the Laplacian of an image and use scaling in order to display it.

Figure 1.5.e After applying Mask b

Now let's try this operator on the bike image (Figure 1.1.f).

Figure 1.5.h After applying Mask b.

This concludes the edge detection part of this report. Let's do some experiments with these operators and see how they perform. Throughout this report I have been criticizing the Roberts operator; is there a way to improve its performance? We will try to answer that question too. Now let's use a general image and see how these operators perform.

Figure 1.6.f After Kirsch Operator

Now let's talk about how we can improve the Roberts operator. Why not apply the Roberts operator again, but this time to the O/P generated from the first stage (i.e. Figure 1.6.b)? The result is much better than the previous O/P.

Figure 1.6.g After applying the Roberts operator to Figure 1.6.b. In the first report I kept saying that smoothing or noise reduction makes the boundaries lighter and lighter, i.e. the boundaries start disappearing. Let's confirm what we said previously.

Figure 1.7.d After applying Sobel Operator to Figure 1.7.c

On comparing Figure 1.7.d and Figure 1.7.b we can see that noise reduction indeed makes the edges lighter. Let's see how edge detection works with a noisy image, both before and after noise reduction.

Figure 1.8.b Applying Sobel to Fig 1.8.a

Figure 1.8.c After applying the Sobel operator to Fig 1.8.a, but this time applying noise reduction first.

Finally, let's conclude this section by experimenting with how edge detection works on a black and white image.

Figure 1.8.g After Prewitt Operator

You can see that the O/P generated after converting the image to black and white is way better than the previous results. Compare Figures 1.8.f and 1.8.g with Figure 1.7.b.

2) Component Labeling: Component labeling allows a user to identify objects/blobs in an image. The algorithm traverses all the pixels and groups together pixels which could potentially belong to a single component. The algorithm that I have implemented uses 4-connectivity. Component labeling is sometimes also referred to as blob extraction or blob discovery. My implementation finds not only how many components are present in the image but also estimates their average size in number of pixels.
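A sketch of this left/top-neighbour scanning in pure Python (an assumption, since the report's own code is not shown), including the component count and average size that the program prints on the console:

```python
def label_components(img):
    """Two-pass 4-connectivity labeling of a binary image (nested lists of
    0/1). Each pixel looks only at its left and top neighbours."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    parent = {}  # records which provisional labels turned out to be equivalent

    def find(x):                 # follow the equivalence chain to its root
        while parent[x] != x:
            x = parent[x]
        return x

    next_label = 1
    for r in range(h):
        for c in range(w):
            if not img[r][c]:
                continue
            left = labels[r][c - 1] if c > 0 else 0
            top = labels[r - 1][c] if r > 0 else 0
            if left and top:     # both neighbours labeled: merge their labels
                a, b = find(left), find(top)
                labels[r][c] = min(a, b)
                parent[max(a, b)] = min(a, b)
            elif left or top:
                labels[r][c] = left or top
            else:                # start a brand-new component
                labels[r][c] = next_label
                parent[next_label] = next_label
                next_label += 1

    sizes = {}                   # second pass: resolve labels, count pixels
    for r in range(h):
        for c in range(w):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = root
                sizes[root] = sizes.get(root, 0) + 1
    return labels, sizes

image = [[1, 1, 0, 0],
         [0, 0, 0, 1],
         [0, 0, 1, 1]]
labels, sizes = label_components(image)
count = len(sizes)                      # number of components found
average = sum(sizes.values()) / count   # average size in pixels
```

Note that `label_components([[1, 0], [0, 1]])` reports two components even though the pixels touch diagonally, which is exactly the 4-connectivity drawback discussed at the end of this section.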

Figure 2.a 4-Connect Component Labeling. Let's look at some of the examples.

Figure 2.c The O/P generated at the console.

Figure 2.c shows the O/P that we get on the console. It identifies that there are 3 components in the original image. Below we can see the components identified by this system.

Figure 2.g Original Image. Now let's look at the O/P.

Figure 2.f O/P at the console. The components identified by the program are shown below.

(a) (b) (c) Figure 2.g. (a)(b)(c) represent the components identified by the software. One good thing about this software is that it correctly identified (b) as a single component. Looking at these results, component labeling really is a neat way of doing blob identification. So are there any drawbacks to this algorithm? There is one, but it is not serious. Let's see it with an example.

Figure 2.e Original Image with 1 component.

Figure 2.f O/P at the console. The program identifies 2 components in the image above, but we can see only one.

Figure 2.h Second Component Identified

But why does the program say that there are two components when we can actually see that there is just a single component? The 4-connect method that we use is the root cause. It only checks the left and top neighbors of a pixel, so it fails to see that these pieces are actually connected diagonally, as shown below.

Figure 2.i A magnification of the connection between the parts of the white component; the red circle highlights the connection. Is there a way to overcome this? Yes: we can go for the 8-connect method.

This discussion concludes the component labeling section of this report.

3) Pyramid Construction: Given an image, we construct multi-resolution versions of the same image by smoothing and subsampling. For example, if the image has a resolution of 512X512, we construct its equivalents in the following resolutions: 256X256, 128X128, 64X64, 32X32, 16X16. These 5 images can be considered as layers in a pyramid, with the original image representing the base of the pyramid and the others being the upper layers. Let's look at an example.
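The smooth-then-subsample step can be sketched as follows (Python/NumPy, an assumption; a 2x2 mean is used here as the smoothing filter, which the report does not specify):

```python
import numpy as np

def reduce_level(img):
    """One pyramid step: smooth with a 2x2 mean and keep one value per
    block, halving the resolution (assumes even dimensions)."""
    img = img.astype(float)
    return (img[0::2, 0::2] + img[0::2, 1::2] +
            img[1::2, 0::2] + img[1::2, 1::2]) / 4.0

def build_pyramid(img, levels=5):
    pyramid = [img.astype(float)]
    for _ in range(levels):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid

# A 32x32 base gives levels 32, 16, 8, 4, 2, 1.
pyr = build_pyramid(np.arange(32 * 32).reshape(32, 32), levels=5)
```

Because each level averages disjoint 2x2 blocks, the single pixel at the top of this pyramid equals the mean grey level of the base image.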

Figure 3.a Original Image

Figure 3.a Pyramid construction from an image. Individually the images look like this.

256X256

128X128

64X64 32X32 16X16

Figure 3.b Various levels of the pyramid. It is not strictly necessary that the image have a resolution of 512X512, but it helps in simplifying the code. Ideally, though, the resolution of the image should be a power of 2 (2^n). If the original image does not satisfy this, we can always fix that algorithmically by introducing black or white bands at the edges. So can we use the same technique to increase the resolution of an image? Theoretically the answer is yes, but the problem is that after some iterations, when we blow up the image, it starts to look bad visually and contains a large number of blocks. Let's see how that works: we will take an image with resolution 32X32 and try to blow it up/enlarge it to a resolution of 256X256.

Figure 3.c Original 32X32

64X64 128X128 256X256 Figure 3.d After increasing the resolution by the First-Order hold scheme. We can see that as we go on increasing the resolution of the image, it starts to have many BLOBs in it and is visually not that great. However, this is very important in pattern recognition. In pattern recognition you actually want BLOBs; in information-theoretic terms, BLOBs represent information. So rather than scanning the entire image, one would blow up a lower-resolution version of the image and search for BLOBs in it. Once you find a BLOB, you can restrict the search to the BLOB's location in the original image, and this saves a lot of time. The process of increasing the resolution of an image can be achieved using the First-Order hold scheme and the Zero-Order hold scheme, which will be discussed in a later section.

Now let's see what happens when we convert the levels of the pyramid to black and white and then apply an edge detection technique.

Figure 3.f Generating B/W image using Iterative Thresholding.

Figure 3.g After performing the Sobel operator on Fig 3.f. We can see in Figure 3.g how the various components in the image converge into BLOBs as we move from 256 to 128 to 64 and so on, and these BLOBs can be used for pattern recognition. I know the original image chosen for this operation is not ideal; after the last section we will do some more experimentation with pattern recognition.

4.I) Zero-Order Hold Scheme: This is a very simple way to blow up an image. Here you go on increasing the resolution of the image in powers of 2. For example, if we have an image with resolution 64X64 and we want to blow it up to 512X512, we increase it first to 128X128, then to 256X256, and finally to 512X512. The work done here is simple: we just copy each pixel from the initial image into 4 adjacent locations in the final image.
2 3
5 6

(a)

2 2 3 3
2 2 3 3
5 5 6 6
5 5 6 6

(b) Figure 4.1.a Represents the Zero-Order hold scheme, where values are repeated from (a) to (b). Now let's look at a real example: we will take a 128X128 image and blow it up to 512X512.
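The pixel-copying step amounts to repeating every pixel along both axes; a one-line sketch in Python/NumPy (an assumption about the implementation) reproduces the 2x2 example of Figure 4.1.a:

```python
import numpy as np

def zero_order_hold(img):
    """Copy every pixel into a 2x2 block, doubling both dimensions."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

a = np.array([[2, 3],
              [5, 6]])
b = zero_order_hold(a)   # matrix (b) of Figure 4.1.a
```

Applying the function repeatedly gives the 128 to 256 to 512 progression described in the iterations below.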

Figure 4.1.b Original Image 128X128 Iteration 1: We generate a 256X256 equivalent using Zero-Order hold scheme.

Figure 4.1.c Increase the resolution to 256X256 Iteration 2: Finally we blow up the image to 512X512 resolution.

Figure 4.1.d Increase the resolution to 512X512. This is how the Zero-Order hold scheme works. I mentioned earlier that though it is theoretically possible to blow up an image to an even higher resolution, we start to see a boxing effect as we go on increasing the resolution, as seen in the image above. However, although we start seeing BLOBs in the image, these BLOBs are useful in pattern recognition. Let's check out another example: let's apply the zero-order hold scheme to a 64X64 resolution picture, as shown below.

Figure 4.1.h After applying zero order hold scheme (512X512)

4.II) First-Order Hold Scheme: This is another way to blow up an image. In this case too we go on increasing the resolution in powers of 2, but instead of plainly copying pixel values into adjacent locations in the new image, we calculate the average among the neighboring pixels to determine the new pixel values.

Figure 4.2.a Initial row expansion. Looking at the image above, we simply stretch the original image array horizontally to create empty spaces between columns for new pixels and fill them with the average of their two neighbors. For example, between 8 and 4 in the first row of the original image array we insert 6, which is the average of 8 and 4. We carry out this process for each pair of neighbors in each row and get the matrix shown on the right-hand side. In the next step we do the same thing, but this time we create empty spaces between rows and fill them with the averages of the neighbors (i.e. column expansion), as shown below.

Figure 4.2.b After column expansion. As in the example shown above, we insert a new pixel between 8 and 4, namely 6, i.e. the average of 8 and 4. We repeat this for every pair of neighbors in every column until we get the matrix shown on the right-hand side of the image. Generally, the decimal values are rounded to the nearest integer. Enough with the theory; now let's look at a real example. We will take a 128X128 image and blow it up to 512X512, using the same image as in Figure 4.1.b.
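The row-then-column averaging can be sketched as follows (Python/NumPy, an assumption). Strict neighbour averaging yields 2n-1 rows and columns, so the last row and column are repeated here to reach exactly double the size; the report does not say how it handles that border, so this is an assumption.

```python
import numpy as np

def expand_rows(img):
    """Stretch horizontally and fill each new column with the average of
    its two neighbours."""
    h, w = img.shape
    out = np.zeros((h, 2 * w - 1))
    out[:, 0::2] = img                               # original columns
    out[:, 1::2] = (img[:, :-1] + img[:, 1:]) / 2.0  # inserted averages
    return out

def first_order_hold(img):
    img = img.astype(float)
    out = expand_rows(expand_rows(img).T).T          # row, then column expansion
    # Assumed border handling: repeat the last row and column so the
    # size exactly doubles.
    out = np.vstack([out, out[-1:]])
    out = np.hstack([out, out[:, -1:]])
    return out

# The 8 and 4 of the worked example: the value inserted between them is 6.
result = first_order_hold(np.array([[8, 4],
                                    [2, 6]]))
```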

Figure 4.2.c Original 128X128 image Iteration 1: We generate a 256X256 equivalent using First-Order hold scheme.

Figure 4.2.d After increasing the resolution to 256X256 using First-Order hold scheme Iteration 2: Finally we blow up the image to 512X512 resolution.

Figure 4.2.e After increasing the resolution to 512X512 using the First-Order hold scheme. Clearly we can see that the First-Order hold scheme performs much better than the Zero-Order hold scheme. To see the difference, compare Figure 4.2.e with Figure 4.1.d, and Figure 4.2.d with Figure 4.1.c. Let's check out another example: let's apply the first-order hold scheme to a 64X64 resolution picture, as shown below.

Figure 4.2.f After applying first order hold scheme (512X512)

Pattern Recognition Experiment: I have been saying that we can use the pyramid and hold schemes to perform pattern recognition. Now let's see how we can do that. Say we are given an image and asked to write code to find a ship in that image. We have been told that the ship covers about 40% of the image, and that we should use pattern recognition to achieve our goal. Here is one way of finding the ship.

Figure 5.a Original 512X512 resolution Image

Step 1) Find 5 levels of the pyramid for this image (i.e. obtain the image in the following resolutions: 256X256, 128X128, down to 16X16).

(a)

(b) Figure 5.b Five levels of Pyramid.

(c)

(d) (e)

Step 2) Convert each image into its Black and White equivalent.

(a)

(b)

(c)

(d) (e)

Figure 5.c Converting them to black and white. Step 3) Apply an edge detection algorithm, say the Sobel operator, to each of the images.

(a)

(b) (c) (d) (e) Figure 5.d Applying Sobel's Operator to above images

Step 4) Blow up each image to 256X256 resolution using the First-Order hold scheme.

to 5.d (c)

to 5.d (b)

Figure 10 Applied First-Order Hold Scheme

to 5.d (a)

We see a single blob in Figure 6 which roughly covers 40%-50% of the image. This could be what we were looking for. So we blow up Fig 5.d (e) using the first-order hold scheme to get Figure 7, which confirms the presence of a single large blob. We then narrow our search to that vicinity in the original picture, and hence it takes relatively less time to detect the presence of the ship in the original image (i.e. Figure 5.a).

References for the images used in the report:

http://inperc.com/wiki/index.php?title=Gray_scale_function

http://www.wikipedia.org