North-Holland
Abstract: A procedure for image segmentation involving no image-dependent thresholds is described. The method involves not
only detection of edges but also production of closed region boundaries. The method has been developed and tested on head
and shoulder images.
Key words: Image analysis, scene analysis, segmentation, edge detection, thresholding.
or regions of approximately the same texture. For each of these regions parameters can then be extracted which will enable the future reconstruction of the texture within that region. These parameters, together with a description of each region's boundary, can then form the final description of the image. It is required, for our purposes, that the region boundaries are 4-connected, rather than 8-connected or discontinuous (see Figure 1 and Duda and Hart (1973), p. 284).

So in this approach the first stage of processing is to detect the edges between regions. Noise occurs in various forms in a real digitised image and produces many spurious edges in processes which use purely edge-detection operators. Obviously, unless these edges can be eliminated in some way, this will lead to more storage space than necessary being taken up by the final description. Noise can also give discontinuity of edges in some schemes, making them useless for segmentation purposes.

Several schemes exist which use output from edge-detection operators to reduce the number of such spurious edge points (Pratt (1978), Rosenfeld and Thurston (1972), Heuckel (1973)), but these have not proved satisfactory. Many of these apply a threshold to some value calculated at each candidate edge point but this is not satisfactory for two reasons. Firstly, thresholds which reduce the number of spurious edge points to a reasonable level must also reject some edge points which are essential to the correct segmentation of the image, leaving discontinuous edges. It is much more difficult to find criteria for the interpolation of such broken edges than it is to find criteria for the rejection of edge points.

Secondly, thresholds are very often arbitrary in nature so that, even if a way could be found around the first problem, the best threshold for one picture could vary greatly from that for another. However, provided that the value in question is a reasonable measure of the gradient of the edge at a point, then very low thresholds can be used to reduce the number of spurious edge points slightly.

Below are descriptions of the two contrasting edge-detection operators which are used in this scheme. The first one was assumed to detect all the edges necessary for an adequate segmentation of the image. However it is sensitive to noise and so produces a large number of spurious edges. The second operator is much more robust when used on noisy images but has too many disadvantages to be used on its own.

2.1. The Marr-Hildreth operator

The basic edge-detection algorithm in this scheme involves the use of Marr and Hildreth's Edge-Detection Theory (1980). In a given image, at the boundary of a region, there is a zero-crossing in the second directional derivative of the intensity function of the image, I(x, y). However to detect relatively large-scale intensity changes, such as are required for our purposes, some form of smoothing
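The zero-crossing criterion of Section 2.1 can be illustrated with a short sketch. This is not the authors' implementation: it assumes a separable Gaussian blur controlled by a hypothetical parameter `sigma`, a five-point discrete Laplacian standing in for the second derivative, and a simple sign-change test between neighbouring pixels. All function names are illustrative.

```python
import math


def gaussian_1d(sigma):
    """Normalised 1-D Gaussian weights, truncated at 3 * sigma (assumed cut-off)."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [v / total for v in weights]


def smooth(image, sigma):
    """Separable Gaussian blur; pixels outside the image repeat the border."""
    kernel = gaussian_1d(sigma)
    radius = len(kernel) // 2
    rows, cols = len(image), len(image[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    # Blur along each row, then along each column.
    horiz = [[sum(kernel[k + radius] * image[r][clamp(c + k, cols)]
                  for k in range(-radius, radius + 1))
              for c in range(cols)] for r in range(rows)]
    return [[sum(kernel[k + radius] * horiz[clamp(r + k, rows)][c]
                 for k in range(-radius, radius + 1))
             for c in range(cols)] for r in range(rows)]


def laplacian(image):
    """Five-point Laplacian on interior pixels; perimeter pixels stay at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = (image[r - 1][c] + image[r + 1][c]
                         + image[r][c - 1] + image[r][c + 1]
                         - 4 * image[r][c])
    return out


def zero_crossings(filtered):
    """Mark a pixel when its right or lower neighbour has the opposite sign."""
    rows, cols = len(filtered), len(filtered[0])
    marks = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            v = filtered[r][c]
            if c + 1 < cols and v * filtered[r][c + 1] < 0:
                marks[r][c] = True
            if r + 1 < rows and v * filtered[r + 1][c] < 0:
                marks[r][c] = True
    return marks
```

For a vertical step edge, `zero_crossings(laplacian(smooth(image, 1.0)))` marks a single column of pixels along the step. A larger `sigma` suppresses more noise, at the cost of a wider mask.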
Volume 1, Numbers 5,6 PATTERN RECOGNITION LETTERS July 1983
to noise and so a large number of spurious zero-crossings are produced. The smaller the width the more sensitive the algorithm becomes. Two methods have been tried to reduce the number of such zero-crossings. First the width of the Gaussian filter could be increased; even small increases greatly reduce the amount of noise, however not only are some features erroneously merged but also as stated above problems arise with curvature. Another disadvantage is that the mask size is increased and so not only takes up more storage space but the process is also made more computationally complex and a much larger part of the image must be ignored in the edge-detection process. The second method makes use of the fact that in general noise produces edges of relatively small gradients. Thus the gradient at each zero-crossing could be thresholded to eliminate false edges. However as stated previously this is a dangerous operation if it is used as the only process to reduce the number of spurious edges and it has been found in this case to produce discontinuous edges.

In their work Marr and Hildreth suggest that information from a number of convolutions, using Gaussian filters of small and large widths, should be combined. There is some evidence to suggest that filters of different widths are present in the human eye, but whilst some work has been done on this in relation to stereovision it is not at all clear how the information could be combined for our purposes.

2.2. The Sobel operator

The Sobel operator (Duda and Hart (1973), p. 271) produces an approximate gradient at each pixel, except those on the perimeter of the image, by differencing the intensity function in two perpendicular directions using the 3 x 3 masks
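G_1 = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}, \quad
G_2 = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix}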
Let the inner product of the mask Gi with a 3 x 3
. (2)
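The Sobel step can be sketched as follows. This is a minimal illustration under stated assumptions: the mask signs follow the common textbook form of the Sobel operator, the two inner products are combined as a Euclidean magnitude, and the result is scaled linearly so that the largest gradient maps to 255. The names `inner_product`, `sobel_gradient` and `scale_to_byte` are hypothetical, and the paper's own combination and scaling rules may differ.

```python
import math

# Standard Sobel masks (sign convention assumed, see the lead-in).
G1 = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
G2 = [[1, 2, 1], [0, 0, 0], [-1, -2, -1]]


def inner_product(mask, image, r, c):
    """Inner product of a 3 x 3 mask with the neighbourhood centred on (r, c)."""
    return sum(mask[i][j] * image[r - 1 + i][c - 1 + j]
               for i in range(3) for j in range(3))


def sobel_gradient(image):
    """Approximate gradient magnitude; perimeter pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    grad = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            d1 = inner_product(G1, image, r, c)
            d2 = inner_product(G2, image, r, c)
            grad[r][c] = math.hypot(d1, d2)  # assumed combination rule
    return grad


def scale_to_byte(grad):
    """Linearly scale gradients to integers in 0-255 (assumed scaling rule)."""
    peak = max(max(row) for row in grad)
    if peak == 0:
        return [[0] * len(row) for row in grad]
    return [[int(255 * v / peak) for v in row] for row in grad]
```

Because the perimeter is skipped, the useful output is a window one pixel smaller on each side than the input image.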
algorithm and by using the gradients calculated in both algorithms reject whole edge segments thus leaving 4-connected edges.

3. Implementation of the edge-detection scheme

A flow chart showing the position of each process in the scheme is shown as Figure 4. The output from the modified Marr-Hildreth algorithm is an array of integer gradient values factored to the range 0-255 for ease of storage and display. Pixels at which there is no zero-crossing are given a gradient value of 0. Since the edges are two pixels thick they are reduced to one pixel by using a thinning algorithm. Similarly the output from the Sobel algorithm is an array of integer gradient values, also factored to the range 0-255. These arrays are both windows of the original image area as described in the previous section.

[Fig. 4. Flow chart of the scheme. Recoverable steps: thin the array; threshold the product at 255; chain code edge segments, rejecting those without 'probable' points; store codes for accepted segments.]

Some initial experiments, involving calculation of the product, at each pixel, of the two gradient values, showed that at least one pixel at which the product is 255 or more is included in each edge essential to the segmentation of the image. Moreover many of the false edges consisted entirely of pixels at which the product was less than 255. Thus the array of product values is thresholded twice: once to give an array of 'possible' edge points, those with product 1 or greater, and the second time to give an array of 'probable' edge points, those with product of 255 or more. The pixels with product 0 can be safely discarded since these points are either not zero-crossings in the Marr-Hildreth algorithm or give extremely small gradient values for either of the Sobel or Marr-Hildreth operators.

The final segmentation required lies somewhere between the 'possible' and the 'probable' array. The 'possible' array has too many spurious edges whilst the 'probable' array does not consist of continuous edges. However a large number of spurious edges can be removed from the 'possible' array by the use of a sorting algorithm, which removes those edge points not active in the segmentation of the image.

Advantage is now taken of a fact mentioned earlier in this section, namely that many false edges contain no 'probable' points. For an efficient description of each region's boundary, edge segments have to be coded and in this scheme a 2-bit chain code is used (see Duda and Hart (1973)). An edge segment is defined to be a 4-connected edge linking any two junction points, a junction to the perimeter of the window of the image or two pixels on the perimeter. The chain coding is done at this stage and after each segment is coded it is recorded whether or not the edge segment (including the first and last pixels) contained any 'probable' edge points.

When all the edge segments have been coded, segments with no 'probable' edge points are rejected as a whole, except for segments consisting of only two points which are not considered long enough to give trustworthy information. Some pixels which had previously been junctions may at this stage have only one edge segment either entering or leaving the junction, where there had previously been three or four. The remaining segment can then also be rejected as it is not active in the division of the image into regions. Thus each junction is continuously examined until no further segments can be removed in this way. Then the remaining segments have their codes and starting points recorded in a data-file for future reformation of the image's segmentation.

It has been previously stated that thresholding was a dangerous operation and yet it is used in this scheme on three separate occasions. However two of them are very low thresholds which are applied indirectly when the gradients from the Marr-Hildreth and Sobel operators are factored to the

Fig. 5. The earlier noisy image.

Fig. 6. The image with reduced noise.
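The double thresholding of the product array and the whole-segment rejection rule of Section 3 can be sketched as follows. This is an illustrative reading rather than the authors' code. It assumes both gradient arrays are lists of lists of integers in 0-255 and that a segment is given as an ordered list of (row, column) pixels. The particular assignment of 2-bit codes to the four directions is also an assumption, as are all function names.

```python
def classify(mh, sobel, probable_at=255):
    """Threshold the per-pixel product twice: 'possible' (>= 1), 'probable' (>= 255)."""
    possible = [[a * b >= 1 for a, b in zip(row_a, row_b)]
                for row_a, row_b in zip(mh, sobel)]
    probable = [[a * b >= probable_at for a, b in zip(row_a, row_b)]
                for row_a, row_b in zip(mh, sobel)]
    return possible, probable


# 2-bit codes for the four 4-connected moves (row step, column step).
# Which code denotes which direction is an assumption of this sketch.
MOVES = {(0, 1): 0, (-1, 0): 1, (0, -1): 2, (1, 0): 3}


def chain_code(path):
    """Encode a 4-connected pixel path as a list of 2-bit direction codes."""
    return [MOVES[(r1 - r0, c1 - c0)]
            for (r0, c0), (r1, c1) in zip(path, path[1:])]


def keep_segment(path, probable):
    """Keep a segment if it holds a 'probable' pixel or has only two points."""
    if len(path) <= 2:
        return True  # too short to give trustworthy information, so not rejected
    return any(probable[r][c] for r, c in path)
```

A segment is stored as its starting point plus `chain_code(path)`. Segments for which `keep_segment` is false are dropped as a whole, so no individual pixel is ever thresholded out of an accepted edge.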
4. Examples

The scheme has been developed and tested using real images. These images are 128 x 128 x 8 bit monochrome pictures thus having 256 grey levels. Two examples are presented here, both images consisting of head and shoulder portraits (see Figures 5 and 6). The two images were produced at different times, the earlier one, which is examined first, being very much noisier than the other. Faces represent tough challenges to edge-detection schemes because they contain very few sharp boundaries between regions. This scheme, however, works particularly well on the less noisy picture, and although understandably less well on the other, still produces an adequate segmentation in this case.

4.1. Image with a high level of noise

Obviously with such a large amount of noise present in this image, there are going to be many more unwanted edges in the final segmentation of this image than the second one. However the scheme can still discern the important features of the image in spite of the high level of noise. The amount of noise in this picture is particularly reflected in the large number of spurious edges detected in the background to the face. In the final segmentation produced by the scheme there are around 180 regions (see Figure 7).

Fig. 7. The final segmentation of the first image.

w = 3 could be used on this image because as stated in Section 2, decreasing w has many advantages. However even on this image with much less noise too many spurious edges were detected by the modified Marr-Hildreth algorithm for that mask size to be used in the scheme, so again w = 4 was used. There are only about 110 regions in the final segmentation of this image (see Figure 8). The original image in this case is certainly no less complex than the previous example and so the difference in the number of regions found in the images is explained by the reduction in the level of noise in the second image. Comparison of Figures 6 and 8 shows that the scheme has produced a reasonably accurate segmentation with not too many unwanted edges.

5. Conclusions

In this paper a scheme has been described for the
segmentation of images. An important feature of the scheme is that there are no image-dependent parameters or thresholds, thus making it suitable for automatic implementation. The aim of the method is to provide a complete decomposition of an image into its constituent regions, and therefore it involves not only detection of edges but the production of closed region boundaries. The segmentation procedure has been developed and tested using real head and shoulder images, and gives very good results even when significant amounts of noise are present.

The basis of the method consists of combining information from the Marr-Hildreth and the Sobel edge detection operators in such a way as to utilise the best qualities of each. On its own the former has the advantage of tending to produce closed contours, but was found to be too sensitive to noise unless wide masks were used resulting in loss of detail and excessive computational complexity. The Sobel operator on its own is suitable for hardware implementation but does not give closed contours. The method described here avoids the above difficulties. It is presently implemented on a VAX 11/780 computer, but real-time hardware implementation should be possible in the future using appropriate VLSI circuits.

Segmentation, as described here, forms an essential part of the feature extraction process needed for the efficient description of images, and as the front end for pattern recognition and scene understanding on complex images. As part of the ongoing research texture analysis is also being studied.

References

Duda, R.O. and P.E. Hart (1973). Pattern Classification and Scene Analysis. John Wiley, New York.
Freeman, H. (1961). On the encoding of arbitrary geometric configurations. IRE Trans. Elect. Comp. 10 (June), 260-268.
Heuckel, M. (1973). A local visual operator which recognises edges and lines. J. ACM 20(4) 634-647.
Marr, D. and E.C. Hildreth (1980). Theory of edge detection. Proc. Roy. Soc. B207, 187-217.
Pratt, W.K. (1978). Digital Image Processing. John Wiley, New York, pp. 478-499.
Rosenfeld, A. and M. Thurston (1972). Edge and curve detection: further experiments. IEEE Trans. Computers 21(7) 677-715.