
Pattern Recognition Letters 1 (1983) 435-442 July 1983

North-Holland

An automatic procedure for image segmentation


D.J. BARTLIFF
GEC Research Laboratories, Hirst Research Laboratories, Wembley, UK

Received 31 January 1983


Revised 14 April 1983

Abstract: A procedure for image segmentation involving no image-dependent thresholds is described. The method involves not
only detection of edges but also production of closed region boundaries. The method has been developed and tested on head
and shoulder images.

Key words: Image analysis, scene analysis, segmentation, edge detection, thresholding.

1. Introduction

This paper reports work carried out as part of current investigations into image-processing algorithms for the purpose of image analysis. Potential applications of this work include development of an image storage or archiving system as well as feature extraction in scene analysis and understanding.

One approach in initial processing involves segmenting an image into regions of approximately the same texture in order that each region can then be analysed individually. To this end the boundaries of the regions must be located, which involves not only the detection of edges but also the linking of those edges to form closed contours. Many techniques for edge detection have been developed but each has its disadvantages (Pratt, 1978). In many cases, for example, closed region boundaries are not produced and indeed most methods are successful only in a narrow field of application.

This paper covers work which has been carried out on 'algorithm engineering', i.e. optimisation of combinations of several basic techniques to find a scheme which improves upon individual methods and has broader applications.

To this end several combinations of algorithms were tried and rejected, but finally a scheme was evolved which combined information from the Marr-Hildreth and the Sobel edge-detection operators. Use was also made of two edge-cleaning algorithms. The edge segments which have been detected and accepted by the scheme can be finally described using a 2-bit chain code. This image-segmentation scheme has been implemented on a VAX 11/780 computer.

The next section describes briefly the Marr-Hildreth and Sobel operators and gives reasons for the combination of the two. Section 3 looks at the actual implementation of the scheme on the computer, while Section 4 presents two examples. Finally Section 5 summarises the work which has been carried out so far and outlines the areas for further investigation.

2. The edge-detection operators

Segmentation is a necessary first stage in many applications of image analysis. By way of example, in order to achieve a very efficient description of pictures it is necessary to go beyond sophisticated coding techniques applied to the raw picture data. The approach which has been adopted here consists of dividing the image into its component parts

0167-8655/83/$3.00 © 1983, Elsevier Science Publishers B.V. (North-Holland) 435


Volume 1, Numbers 5,6 PATTERN RECOGNITION LETTERS July 1983

or regions of approximately the same texture. For each of these regions parameters can then be extracted which will enable the future reconstruction of the texture within that region. These parameters, together with a description of each region's boundary, can then form the final description of the image. It is required, for our purposes, that the region boundaries are 4-connected, rather than 8-connected or discontinuous (see Figure 1 and Duda and Hart (1973), p. 284).

So in this approach the first stage of processing is to detect the edges between regions. Noise occurs in various forms in a real digitised image and produces many spurious edges in processes which use purely edge-detection operators. Obviously unless these edges can be eliminated in some way, this will lead to more storage space than necessary being taken up by the final description. Noise can also give discontinuity of edges in some schemes, making them useless for segmentation purposes.

Several schemes exist which use output from edge-detection operators to reduce the number of such spurious edge points (Pratt (1978), Rosenfeld and Thurston (1972), Heuckel (1973)), but these have not proved satisfactory. Many of these apply a threshold to some value calculated at each candidate edge point but this is not satisfactory for two reasons. Firstly, thresholds which reduce the number of spurious edge points to a reasonable level must also reject some edge points which are essential to the correct segmentation of the image, leaving discontinuous edges. It is much more difficult to find criteria for the interpolation of such broken edges than it is to find criteria for the rejection of edge points.

Secondly thresholds are very often arbitrary in nature so that, even if a way could be found around the first problem, the best threshold for one picture could vary greatly from that for another. However, provided that the value in question is a reasonable measure of the gradient of the edge at a point, then very low thresholds can be used to reduce the number of spurious edge points slightly.

Below are descriptions of the two contrasting edge-detection operators which are used in this scheme. The first one was assumed to detect all the edges necessary for an adequate segmentation of the image. However it is sensitive to noise and so produces a large number of spurious edges. The second operator is much more robust when used on noisy images but has too many disadvantages to be used on its own.

Fig. 1. The difference between 4-connected and 8-connected edges. An n-connected edge, for n = 4 or 8, is an edge along which each pixel has at least two n-neighbours as edge points. 4-neighbours are given in (a); 8-neighbours in (b). An example of a 4-connected edge is given in (c), and of an 8-connected edge in (d).

Fig. 2a. Cross-section of the Gaussian distribution G.

2.1. The Marr-Hildreth operator

The basic edge-detection algorithm in this scheme involves the use of Marr and Hildreth's Edge-Detection Theory (1980). In a given image, at the boundary of a region, there is a zero-crossing in the second directional derivative of the intensity function of the image, I(x, y). However to detect relatively large-scale intensity changes, such as are required for our purposes, some form of smoothing


of the intensity function must first be carried out. Marr and Hildreth found that the optimal smoothing filter is a 2-dimensional, rotationally symmetric Gaussian distribution (see Figure 2a), and also found there is no need to use directional derivatives. The use of the only non-directional second derivative operator, the Laplacian, leads to a great saving in computational time and so that operator is used in this implementation. There is some evidence to suggest that the human eye uses such filters in its visual processing.

In this implementation candidate edge points are located by scanning the signal

$$S(x, y) = -\nabla^2 G * I(x, y) \tag{1}$$

pixel by pixel and finding the zero-crossings. Here $G$ is the 2-dimensional Gaussian distribution parametrised by its width, $w$ (see Figure 2b), and $*$ denotes a 2-dimensional convolution over x and y.

Fig. 2b. Cross-section of $-\nabla^2 G$.

A zero-crossing is defined to be a pixel at which the signal S has opposite sign to that of the signal at one or more of its 8-neighbours (see Figure 1). $-\nabla^2 G$ can be calculated before the convolution is performed and so it can be permanently stored in the form of a rough quadrant of a circle of radius 2w (see Figure 2c). Rotational symmetry is then used to generate the rest of the mask which is used in performing the convolution. A mask of this size is used because $-\nabla^2 G$ is negligible at distances greater than 2w from the centre. However points within 2w of the perimeter of the image cannot be considered since at these points the mask would overlap the perimeter. So the edge-detection process is carried out on a window formed by ignoring a margin of width 2w around the original image. Information from points inside this margin is still used, for instance in the convolution, but no edges can be detected there.

The edges detected by this algorithm are 4-connected but are two pixels thick because both sides of a zero-crossing are marked. This differs from the implementation by Marr and Hildreth of their own theory. However the Marr-Hildreth edge-detection algorithm in its original form has been found to fail where several regions meet in a situation such as shown in Figure 3. This problem can be very noticeable when the algorithm is used on real images. Even when both sides of each zero-crossing are marked as in this implementation, the Gaussian filter used must have small width (w = 4), otherwise the problems of curvature cannot be overcome.

It may be thought that any edges thus detected will be inaccurate and also that any edges which are two pixels apart in the image will be merged by this process. However the edges, when subsequently reduced to only one pixel thick (see Section 3), will lie less than one pixel away from the zero-crossing. Compare this with the original implementation of Marr and Hildreth's theory in which zero-crossings are asymmetrically marked on the positive side.
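The zero-crossing search described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used on the VAX: the function names are hypothetical, the relation w = 2*sqrt(2)*sigma between the width and the Gaussian standard deviation is an assumption taken from the usual Marr-Hildreth convention, and the full mask is computed directly instead of being generated from a stored quadrant.

```python
import math


def neg_laplacian_of_gaussian(w):
    """Sample -del^2 G on a (4w+1) x (4w+1) grid, zero beyond radius 2w.

    Assumption: the width w equals 2*sqrt(2)*sigma. The positive constant
    scale factor is dropped because only the sign of S is used below.
    """
    sigma2 = (w / (2.0 * math.sqrt(2.0))) ** 2
    r = 2 * w
    mask = []
    for dy in range(-r, r + 1):
        row = []
        for dx in range(-r, r + 1):
            d2 = dx * dx + dy * dy
            if d2 <= r * r:
                row.append((2.0 - d2 / sigma2) * math.exp(-d2 / (2.0 * sigma2)))
            else:
                row.append(0.0)
        mask.append(row)
    return mask


def filtered_signal(image, w):
    """Return S = -del^2 G * I on the window that ignores a margin of 2w.

    The mask is symmetric, so correlation and convolution coincide.
    """
    mask = neg_laplacian_of_gaussian(w)
    r = 2 * w
    rows, cols = len(image), len(image[0])
    offsets = range(-r, r + 1)
    return [
        [
            sum(mask[dy + r][dx + r] * image[y + dy][x + dx]
                for dy in offsets for dx in offsets)
            for x in range(r, cols - r)
        ]
        for y in range(r, rows - r)
    ]


def zero_crossings(signal):
    """Mark pixels whose sign is opposite to that of at least one 8-neighbour."""
    rows, cols = len(signal), len(signal[0])
    marks = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and signal[y][x] * signal[ny][nx] < 0:
                        marks[y][x] = True
    return marks
```

Because both sides of each sign change are marked, a clean step edge yields a band two pixels thick, in line with the behaviour described above.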
Fig. 2c. Storage of the mask. The value of $-\nabla^2 G$, at the centre of each square of the stored quadrant, is stored in the mask.

It was assumed that, for a small width of the Gaussian filter (w = 4), all the region boundaries required for an adequate segmentation of the image would be represented by zero-crossing segments in the output of the algorithm. Unfortunately with this small width the algorithm is very sensitive


to noise and so a large number of spurious zero-crossings are produced. The smaller the width the more sensitive the algorithm becomes. Two methods have been tried to reduce the number of such zero-crossings. First the width of the Gaussian filter could be increased; even small increases greatly reduce the amount of noise, however not only are some features erroneously merged but also as stated above problems arise with curvature. Another disadvantage is that the mask size is increased and so not only takes up more storage space but the process is also made more computationally complex and a much larger part of the image must be ignored in the edge-detection process. The second method makes use of the fact that in general noise produces edges of relatively small gradients. Thus the gradient at each zero-crossing could be thresholded to eliminate false edges. However as stated previously this is a dangerous operation if it is used as the only process to reduce the number of spurious edges and it has been found in this case to produce discontinuous edges.

In their work Marr and Hildreth suggest that information from a number of convolutions, using Gaussian filters of small and large widths, should be combined. There is some evidence to suggest that filters of different widths are present in the human eye, but whilst some work has been done on this in relation to stereovision it is not at all clear how the information could be combined for our purposes.

2.2. The Sobel operator

The Sobel operator (Duda and Hart (1973), p. 271) produces an approximate gradient at each pixel, except those on the perimeter of the image, by differencing the intensity function in two perpendicular directions using the 3 x 3 masks
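$$G_1 = \begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}, \qquad G_2 = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{pmatrix}. \tag{2}$$

Let the inner product of the mask $G_i$ with a 3 x 3 window of pixels, centred on a pixel $P$, be $G_i(P)$ for $i = 1$ and 2; then the magnitude of the gradient at $P$ is

$$A(P) = \left( G_1(P)^2 + G_2(P)^2 \right)^{1/2}. \tag{3}$$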
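As an illustration of equations (2) and (3), the gradient magnitude can be computed as in the following sketch. The function name is hypothetical, the masks are written in the standard Sobel form (their overall signs do not affect the magnitude), and assigning 0 to perimeter pixels is an assumption made only so that the output has the same shape as the input.

```python
import math

# Sobel masks in the standard form; row index is the vertical offset.
G1 = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
G2 = ((1, 2, 1), (0, 0, 0), (-1, -2, -1))


def sobel_magnitude(image):
    """Return A(P) = sqrt(G1(P)^2 + G2(P)^2) for every interior pixel P.

    Perimeter pixels have no complete 3 x 3 window; they are left at 0.0
    here, which is an assumption of this sketch.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            g1 = sum(G1[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            g2 = sum(G2[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = math.hypot(g1, g2)
    return out
```

For a vertical step of height 10, the mask G1 responds with 10 x (1 + 2 + 1) = 40 at the two columns adjoining the step, while G2 gives 0 there.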
When this algorithm is used on its own, the method
of producing edges is to threshold the gradient
magnitudes thus calculated. Again thresholding
does not give good results for reasons given pre-
viously. Other methods of using the information
from the Sobel operator, such as tracing local
maxima of the gradient, have also not produced
satisfactory results, because of their computational complexity and their use of arbitrary parameters.
Fig. 3. (a) The true regions. (b) The edges detected by the original Marr-Hildreth algorithm.

However this operator is much less sensitive to noise than the Marr-Hildreth operator and again gives, in general, low values for the gradient at false edge points. Thus the task of the work covered here was to take the zero-crossing segments produced by the modified Marr-Hildreth


algorithm and, by using the gradients calculated in both algorithms, reject whole edge segments, thus leaving 4-connected edges.

3. Implementation of the edge-detection scheme

A flow chart showing the position of each process in the scheme is shown as Figure 4.

The output from the modified Marr-Hildreth algorithm is an array of integer gradient values factored to the range 0-255 for ease of storage and display. Pixels at which there is no zero-crossing are given a gradient value of 0. Since the edges are two pixels thick they are reduced to one pixel by using a thinning algorithm. Similarly the output

Digitised picture array
  -> Obtain the gradient values at Marr-Hildreth zero-crossings; thin the array
  -> Obtain the gradients from the Sobel operator
  -> Calculate the product of the two gradient values
  -> Threshold the product at 1; threshold the product at 255
  -> Sort the edge array
  -> Fill in 8-connected edges
  -> Chain code edge segments, rejecting those without 'probable' points
  -> Store codes for accepted segments

Fig. 4. Implementation of the scheme.


from the Sobel algorithm is an array of integer gradient values, also factored to the range 0-255. These arrays are both windows of the original image area as described in the previous section.

Some initial experiments, involving calculation of the product, at each pixel, of the two gradient values, showed that at least one pixel at which the product is 255 or more is included in each edge essential to the segmentation of the image. Moreover many of the false edges consisted entirely of pixels at which the product was less than 255. Thus the array of product values is thresholded twice: once to give an array of 'possible' edge points, those with product 1 or greater, and a second time to give an array of 'probable' edge points, those with product of 255 or more. The pixels with product 0 can be safely discarded since these points are either not zero-crossings in the Marr-Hildreth algorithm or give extremely small gradient values for either the Sobel or the Marr-Hildreth operator.

The final segmentation required lies somewhere between the 'possible' and the 'probable' array. The 'possible' array has too many spurious edges whilst the 'probable' array does not consist of continuous edges. However a large number of spurious edges can be removed from the 'possible' array by the use of a sorting algorithm, which removes those edge points not active in the segmentation of the image.

Advantage is now taken of a fact mentioned earlier in this section, namely that many false edges contain no 'probable' points. For an efficient description of each region's boundary, edge segments have to be coded and in this scheme a 2-bit chain code is used (see Duda and Hart (1973)). An edge segment is defined to be a 4-connected edge linking any two junction points, a junction to the perimeter of the window of the image or two pixels on the perimeter. The chain coding is done at this stage and after each segment is coded it is recorded whether or not the edge segment (including the first and last pixels) contained any 'probable' edge points.

When all the edge segments have been coded, segments with no 'probable' edge points are rejected as a whole, except for segments consisting of only two points which are not considered long enough to give trustworthy information. Some pixels which had previously been junctions may at this stage have only one edge segment either entering or leaving the junction, where there had previously been three or four. The remaining segment can then also be rejected as it is not active in the division of the image into regions. Thus each junction is continuously examined until no further segments can be removed in this way. Then the remaining segments have their codes and starting points recorded in a data-file for future reformation of the image's segmentation.

Fig. 5. The earlier noisy image.

Fig. 6. The image with reduced noise.

It has been previously stated that thresholding was a dangerous operation and yet it is used in this scheme on three separate occasions. However two of them are very low thresholds which are applied indirectly when the gradients from the Marr-Hildreth and Sobel operators are factored to the integer range 0-255. It is thought that there will be very few pixels at which either gradient is given the value 0 when in fact both are non-zero. The third application of a threshold is also permissible since points which have gradient product less than 255 can still be included in the picture. Moreover the threshold is not arbitrary in that it is fixed by the maximum gradients obtained from the Marr-Hildreth and Sobel algorithms.
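The double thresholding of the gradient product and the acceptance test for coded segments can be sketched as follows. This is an illustration under stated assumptions, not the original program: all function names are hypothetical, 'factoring' to 0-255 is assumed to be linear scaling by the array maximum with truncation to an integer, and the numbering of the four chain-code directions is an arbitrary choice.

```python
def scale_to_byte(values):
    """Scale non-negative values linearly to integers 0..255 (assumed rule)."""
    peak = max(max(row) for row in values)
    if peak == 0:
        return [[0] * len(row) for row in values]
    return [[int(255 * v / peak) for v in row] for row in values]


def classify_edge_points(mh_gradient, sobel_gradient):
    """Return the 'possible' (product >= 1) and 'probable' (>= 255) arrays."""
    mh = scale_to_byte(mh_gradient)
    sobel = scale_to_byte(sobel_gradient)
    product = [[a * b for a, b in zip(r1, r2)] for r1, r2 in zip(mh, sobel)]
    possible = [[p >= 1 for p in row] for row in product]
    probable = [[p >= 255 for p in row] for row in product]
    return possible, probable


# Assumed direction numbering, with y increasing downwards:
# 0 = east, 1 = north, 2 = west, 3 = south (two bits per step).
MOVES = {(1, 0): 0, (0, -1): 1, (-1, 0): 2, (0, 1): 3}


def chain_code_2bit(path):
    """Encode a 4-connected path of (x, y) pixels as a list of 2-bit codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in MOVES:
            raise ValueError("path is not 4-connected")
        codes.append(MOVES[step])
    return codes


def keep_segment(path, probable):
    """Keep two-point segments; otherwise require a 'probable' pixel,
    counting the first and last pixels of the segment."""
    return len(path) == 2 or any(probable[y][x] for x, y in path)
```

The starting point of each accepted segment would be stored alongside its code list so that the boundary can be redrawn later.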

4. Examples

The scheme has been developed and tested using real images. These images are 128 x 128 x 8 bit monochrome pictures thus having 256 grey levels. Two examples are presented here, both images consisting of head and shoulder portraits (see Figures 5 and 6). The two images were produced at different times, the earlier one, which is examined first, being very much noisier than the other. Faces represent tough challenges to edge-detection schemes because they contain very few sharp boundaries between regions. This scheme, however, works particularly well on the less noisy picture, and although understandably less well on the other, still produces an adequate segmentation in this case.

4.1. Image with a high level of noise

Obviously with such a large amount of noise present in this image, there are going to be many more unwanted edges in the final segmentation of this image than the second one. However the scheme can still discern the important features of the image in spite of the high level of noise. The amount of noise in this picture is particularly reflected in the large number of spurious edges detected in the background to the face. In the final segmentation produced by the scheme there are around 180 regions (see Figure 7).

Fig. 7. The final segmentation of the first image.

4.2. Image with reduced noise

This example is presented in more detail since the standard of the original image in this case is suitable for many applications.

It was hoped that a Gaussian filter of width w = 3 could be used on this image because, as stated in Section 2, decreasing w has many advantages. However even on this image with much less noise too many spurious edges were detected by the modified Marr-Hildreth algorithm for that mask size to be used in the scheme, so again w = 4 was used. There are only about 110 regions in the final segmentation of this image (see Figure 8). The original image in this case is certainly no less complex than the previous example and so the difference in the number of regions found in the images is explained by the reduction in the level of noise in the second image. Comparison of Figures 6 and 8 shows that the scheme has produced a reasonably accurate segmentation with not too many unwanted edges.

Fig. 8. The final segmentation of the second image.

5. Conclusions

In this paper a scheme has been described for the


segmentation of images. An important feature of the scheme is that there are no image-dependent parameters or thresholds, thus making it suitable for automatic implementation. The aim of the method is to provide a complete decomposition of an image into its constituent regions, and therefore it involves not only detection of edges but the production of closed region boundaries. The segmentation procedure has been developed and tested using real head and shoulder images, and gives very good results even when significant amounts of noise are present.

The basis of the method consists of combining information from the Marr-Hildreth and the Sobel edge detection operators in such a way as to utilise the best qualities of each. On its own the former has the advantage of tending to produce closed contours, but was found to be too sensitive to noise unless wide masks were used, resulting in loss of detail and excessive computational complexity. The Sobel operator on its own is suitable for hardware implementation but does not give closed contours. The method described here avoids the above difficulties. It is presently implemented on a VAX 11/780 computer, but real-time hardware implementation should be possible in the future using appropriate VLSI circuits.

Segmentation, as described here, forms an essential part of the feature extraction process needed for the efficient description of images, and as the front end for pattern recognition and scene understanding on complex images. As part of the ongoing research texture analysis is also being studied.

References

Duda, R.O. and P.E. Hart (1973). Pattern Classification and Scene Analysis. John Wiley, New York.
Freeman, H. (1961). On the encoding of arbitrary geometric configurations. IRE Trans. Elect. Comp. 10 (June), 260-268.
Heuckel, M. (1973). A local visual operator which recognises edges and lines. J. ACM 20(4), 634-647.
Marr, D. and E.C. Hildreth (1980). Theory of edge detection. Proc. Roy. Soc. B207, 187-217.
Pratt, W.K. (1978). Digital Image Processing. John Wiley, New York, pp. 478-499.
Rosenfeld, A. and M. Thurston (1972). Edge and curve detection: further experiments. IEEE Trans. Computers 21(7), 677-715.
