Abstract
EdgeFlow is a technique for boundary detection proposed by W.-Y. Ma [1]. It is a very efficient algorithm for boundary detection and image segmentation. The scheme utilizes a predictive coding model to identify the direction of change in color and texture at each image location at a given scale, and constructs an edge flow vector. By propagating the edge flow vectors, boundaries can be detected at image locations which encounter two opposite flow directions in the stable state. A user-defined image scale is the only significant control parameter needed by the algorithm. The scheme facilitates the integration of color and texture into a single framework for boundary detection. Segmentation results on a large and diverse collection of natural images are provided.
Keywords: Boundary detection, Gabor filtering, Image segmentation, Texture.
1. Introduction
In most computer vision applications, boundary detection and image segmentation constitute a crucial initial step before performing high-level tasks such as object recognition and scene interpretation. While considerable research and progress have been made in the area of image segmentation, the robustness and generality of the algorithms on a large variety of image data have not been established. One of the difficulties arises from the fact that most natural images are rich in color and texture, and these features need to be integrated for a good segmentation. Furthermore, image segmentation itself is an ill-posed problem. For example, scale is often an important and context-dependent parameter. Fig. 1(a) shows an image which contains five different "beans" regions. One might consider each bean as an individual object and obtain a result similar to Fig. 1(b), or might consider each "beans" region as a texture and get a segmentation like the one in Fig. 1(c).
Fig. 1. Image segmentation often requires additional information from the user in order to select a proper scale for segmenting the objects or regions of interest. (a) shows an image with five different "beans" regions, (b) is the segmentation result using a smaller scale, and (c) is the segmentation result using a larger scale.
The method described here, called EdgeFlow, is a new technique for boundary detection that requires very little parameter tuning. Traditionally, edges are located at the local maxima (see Appendix A.1) of the gradient in the intensity/image feature space. In contrast, the detection and localization of edges (or image boundaries in a more general sense) are performed indirectly in the proposed EdgeFlow method: first by identifying a flow direction at each pixel location that points to the closest boundary, then by detecting the locations that encounter two opposite directions of edge flow. Since any of the image attributes, such as color, texture, or their combination, can be used to compute the edge energy and flow direction, this scheme provides a general framework for integrating different image features for boundary detection.
The EdgeFlow method utilizes a predictive coding model to identify and integrate the direction of change in image attributes, such as color, texture, and phase discontinuities, at each image location. Towards this objective, the following values are computed: E(s, θ), which measures the edge energy at pixel s along the orientation θ; P(s, θ), which is the probability of finding an edge in the direction θ from s; and P(s, θ+π), which is the probability of finding an edge in the direction θ+π from s. These edge energies and the associated probabilities can be computed in any image feature space of interest, such as color or texture, and can be combined via equations (20)-(23). From these measurements, an edge flow vector F(s) is then computed. The magnitude of F(s) represents the total edge energy, and F(s) points in the direction of the closest boundary pixel.
The distribution of F(s) in the image forms a flow field, which is allowed to propagate. At each pixel location, the flow is in the estimated direction of the boundary pixel. A boundary location is characterized by flows in opposing directions towards it. On a discrete image grid, the flow typically takes a few iterations to converge. In this approach, the only required parameter is the scale parameter σ (the standard deviation of the Gaussian). Fig. 1(b) and (c) show the results of boundary detection at two different scales using color and texture features.
The organization of this report is as follows. In Section 2, the intensity and texture edge flows, based on Gaussian smoothing and Gabor wavelets, are described. The construction of the edge flow vector is introduced in Section 3. Section 4 describes the edge flow propagation and boundary detection techniques. Boundary connection and region merging are presented in Section 5. Experimental results are demonstrated in Section 6, and finally, the conclusion is given in Section 7.
2. Intensity and Texture Edges
The EdgeFlow method utilizes a predictive coding model to identify and integrate the direction of change in image attributes, such as color, texture, and phase discontinuities, at each image location. Consider the following:

F(s, θ) = [E(s, θ), P(s, θ), P(s, θ+π)],

where s = (x, y) is a pixel in an image. E(s, θ) is the edge energy at location s along the orientation θ (this measures changes in the local image features, such as color or texture, along the specified orientation). P(s, θ) is the probability of finding an image boundary in the direction θ from s. P(s, θ+π) is the probability of finding an image boundary in the direction θ+π from s. An edge flow vector is computed at each location s using E(s, θ), P(s, θ), and P(s, θ+π). The notation used in computing the edge flow vectors is summarized as follows. The Gaussian derivative (GD) is used in computing the edge energies, and the difference of offset Gaussians (DOOG) is used in estimating the flow directions. A 2-D isotropic¹ Gaussian function is defined as
G_σ(x, y) = (1 / (√(2π) σ)) exp[-(x² + y²) / (2σ²)].   (1)
The first derivative of the Gaussian (GD) along the x-axis is given by

GD_σ(x, y) = ∂G_σ(x, y)/∂x = -(x/σ²) G_σ(x, y),   (2)
and the difference of offset Gaussians (DOOG) along the x-axis is defined as

DOOG_σ(x, y) = G_σ(x, y) - G_σ(x + d, y),   (3)

where d is the offset between the centers of the two Gaussian kernels and is chosen proportional to σ. By rotating these two functions, a family of Gaussian derivatives and differences of offset Gaussians along different orientations can be generated as
GD_{σ,θ}(x, y) = GD_σ(x′, y′) = -(x′/σ²) G_σ(x′, y′) = -(x′/σ²) · (1/(√(2π) σ)) exp[-(x′² + y′²)/(2σ²)],

DOOG_{σ,θ}(x, y) = DOOG_σ(x′, y′)
¹ Note that the magnitude of the gradient is actually independent of the direction of the edge. Such operators are called isotropic operators.
= G_σ(x′, y′) - G_σ(x′ + d, y′),

x′ = x cos θ + y sin θ,
y′ = -x sin θ + y cos θ.   (4)
The scale parameter σ of the Gaussian is the only required parameter that the user needs to specify.
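As an illustrative sketch (the function names are my own, not from the authors' implementation), the kernels of (1)-(4) can be evaluated pointwise as follows:

```cpp
#include <cmath>

// 2-D isotropic Gaussian of (1).
double gauss(double x, double y, double sigma) {
    const double pi = std::acos(-1.0);
    return std::exp(-(x * x + y * y) / (2.0 * sigma * sigma))
           / (std::sqrt(2.0 * pi) * sigma);
}

// First derivative of the Gaussian along x, as in (2).
double gd(double x, double y, double sigma) {
    return -(x / (sigma * sigma)) * gauss(x, y, sigma);
}

// Rotated GD kernel via the coordinate change of (4).
double gdTheta(double x, double y, double sigma, double theta) {
    double xp =  x * std::cos(theta) + y * std::sin(theta);
    double yp = -x * std::sin(theta) + y * std::cos(theta);
    return gd(xp, yp, sigma);
}

// Difference of offset Gaussians of (3), rotated by theta; the offset d
// is chosen proportional to sigma.
double doogTheta(double x, double y, double sigma, double theta, double d) {
    double xp =  x * std::cos(theta) + y * std::sin(theta);
    double yp = -x * std::sin(theta) + y * std::cos(theta);
    return gauss(xp, yp, sigma) - gauss(xp + d, yp, sigma);
}
```

A rotation angle of θ = 0 recovers the x-axis kernels of (2) and (3).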
2.1. Intensity Edges
1) Computing E(s, θ): Consider the image at a given scale σ, I_σ(x, y), which is obtained by smoothing the original image I(x, y) with a Gaussian kernel G_σ(x, y). The scale parameter σ controls both the edge energy computation and the local flow direction estimation, so that only edges larger than the specified scale are detected. The edge energy E(s, θ) at scale σ is defined to be the magnitude of the gradient of the smoothed image I_σ(x, y) along the orientation θ:

E(s, θ) = |(∂/∂n) I_σ(x, y)| = |(∂/∂n) [I(x, y) * G_σ(x, y)]| = |I(x, y) * (∂/∂n) G_σ(x, y)|,   (5)

where * represents convolution (see Appendix A.2) and n represents the unit vector in the θ direction. We can rewrite (5) as
E(s, θ) = |I(x, y) * GD_{σ,θ}(x, y)|.   (6)
This edge energy indicates the strength of the intensity change. Many existing edge detectors actually use
similar operations to identify the local maxima of intensity changes as edges.
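A minimal sketch of (6), assuming a clamped image border and a kernel support of radius 3σ (both are implementation choices of mine, not specified in the text):

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Edge energy (6) at one pixel of a row-major grayscale image, computed
// by direct summation of I * GD_{sigma,theta} over the kernel support.
double edgeEnergy(const std::vector<double>& img, int w, int h,
                  int px, int py, double sigma, double theta) {
    const double pi = std::acos(-1.0);
    int r = static_cast<int>(std::ceil(3.0 * sigma));
    double acc = 0.0;
    for (int dy = -r; dy <= r; ++dy) {
        for (int dx = -r; dx <= r; ++dx) {
            // clamp image coordinates at the border
            int ix = std::min(std::max(px - dx, 0), w - 1);
            int iy = std::min(std::max(py - dy, 0), h - 1);
            // rotate the kernel coordinates as in (4), then evaluate GD_sigma
            double xp =  dx * std::cos(theta) + dy * std::sin(theta);
            double yp = -dx * std::sin(theta) + dy * std::cos(theta);
            double g  = std::exp(-(xp * xp + yp * yp) / (2.0 * sigma * sigma))
                        / (std::sqrt(2.0 * pi) * sigma);
            acc += img[iy * w + ix] * (-(xp / (sigma * sigma)) * g);
        }
    }
    return std::fabs(acc);
}
```

On a vertical step image, this response peaks at the step and vanishes in flat regions, matching the gradient-magnitude interpretation above.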
2) Computing P(s, θ): For the edge energy E(s, θ), there are two possible flow directions: the forward (θ) and the backward (θ+π). The likelihood of finding the nearest boundary along these two directions is now estimated. Consider the use of the image intensity at location s to predict its neighbor in the direction θ. The error in prediction is defined to be

Error(s, θ) = |I_σ(x + d cos θ, y + d sin θ) - I_σ(x, y)| = |I(x, y) * DOOG_{σ,θ}(x, y)|,   (7)

where d is the prediction distance, chosen proportional to σ. A large prediction error in a given direction implies a higher probability of finding a boundary in that direction, so the probabilities are assigned in proportion:

P(s, θ) = Error(s, θ) / [Error(s, θ) + Error(s, θ+π)].   (8)
Fig. 2. A comparison of the EdgeFlow technique with the conventional approach for a 1-D edge signal. (a) Traditional edge detection, which usually seeks the local maxima of the intensity gradient magnitude or the zero-crossings of the second derivative of the image intensity. (b) The EdgeFlow technique, which constructs the flow vectors with their energy equal to the magnitude of the intensity gradient and their direction computed from the prediction errors using DOOG filters, and propagates the flow vectors to localize the edge.
∂φ(x, y)/∂x = imag[ (∂O(x, y)/∂x) · O*(x, y) ] / m(x, y)²,   (16)

where * denotes the complex conjugate, O(x, y) is the complex Gabor-filtered output, and m(x, y) is its magnitude. The phase derivative with respect to any arbitrary orientation can be computed in a similar manner.
Without loss of generality, we first consider the design of a linear phase predictor along the x-axis,

φ̂(x + a, y) = φ(x, y) + a (∂/∂x) φ(x, y),   (17)

and therefore the prediction error is equal to

Error = φ(x + a, y) - φ(x, y) - a (∂/∂x) φ(x, y).   (18)
However, because the first two terms in (18) are wrapped phases, the prediction error has to be further corrected by adding or subtracting 2π so that it always lies between -π and π. Because the linear component of the phase has been removed by the first-order predictor, the magnitude of the prediction error is usually much smaller than π. As a result, the prediction error contributed by the 2π phase wrapping can be easily identified and corrected. The general form of the phase prediction error can be written as

Error(s, θ) = | φ(x + a cos θ, y + a sin θ) - φ(x, y) - a (∂/∂n) φ(x, y) + 2πk(x, y) |,   (19)

where n = (cos θ, sin θ) and k(x, y) is an integer which ensures that the prediction error lies between -π and π. One could use the second derivative of the phase to compute the corresponding phase edge energy. However, for simplicity of implementation, we directly use the sum of the prediction errors Error(s, θ) and Error(s, θ+π) to represent the phase "edge" energy E(s, θ).
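The 2πk(x, y) correction in (19) amounts to wrapping the raw prediction error back into the interval around zero; a one-line sketch:

```cpp
#include <cmath>

// Choose the integer k of (19) so that the raw phase prediction error
// is wrapped back into [-pi, pi].
double wrapError(double rawError) {
    const double pi = std::acos(-1.0);
    double k = std::round(rawError / (2.0 * pi));
    return rawError - 2.0 * pi * k;
}
```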
3. Edge Flow Vector
The edge energies and the corresponding probabilities obtained from different image attributes can be combined to form a single edge flow field for boundary detection. The total edge energy and the probabilities of the edge flow direction are defined as

E(s, θ) = Σ_{a∈A} E_a(s, θ) · w(a),   with Σ_{a∈A} w(a) = 1,   (20)

P(s, θ) = Σ_{a∈A} P_a(s, θ) · w(a),   (21)

where E_a(s, θ) and P_a(s, θ) represent the energy and probability of the edge flow computed from image attribute a, a ∈ {intensity/color, texture, phase}, and w(a) is the weighting coefficient associated with image attribute a.
Now consider the use of combined color and texture information for boundary detection. For a given color image, the intensity edge flow can be computed in each of the three color bands (R, G, B) using (6)-(8), and the texture edge flow can be calculated from the intensity I = (R + G + B)/3. The overall edge flow can then be obtained by combining them as in (20) and (21) with A = {red, green, blue, texture}. In the following experiments, w(texture) = 0.4 and w(red) = w(green) = w(blue) = 0.2.
The flow direction needs to be estimated as well. At each location s in the image, we have {[E(s, θ), P(s, θ), P(s, θ+π)] | 0 ≤ θ < π}. We first identify a continuous range of flow directions which maximizes the sum of probabilities in the corresponding half plane:

Θ(s) = arg max_θ { Σ_{θ ≤ θ′ < θ+π} P(s, θ′) }.   (22)
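On a discretized set of orientations (for instance, twelve 30-degree bins, as in the experiments), the search in (22) becomes a maximum over circular windows spanning half the bins; a sketch:

```cpp
#include <vector>

// Sketch of (22) on n discrete orientation bins covering [0, 2*pi):
// return the starting bin whose n/2 consecutive bins (a half plane)
// carry the largest total probability.
int bestHalfPlane(const std::vector<double>& p) {
    int n = static_cast<int>(p.size());
    int best = 0;
    double bestSum = -1.0;
    for (int s = 0; s < n; ++s) {
        double sum = 0.0;
        for (int i = 0; i < n / 2; ++i)
            sum += p[(s + i) % n];   // wrap around the circle
        if (sum > bestSum) { bestSum = sum; best = s; }
    }
    return best;
}
```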
Fig. 3. Edge flow propagation: (a) Edges and image pixels. Note that each image pixel I(x, y) is surrounded by two horizontal edges (H(x, y) and H(x, y+1)) and two vertical edges (V(x, y) and V(x+1, y)). (b) The stable flow field vector F and its projections a(x, y) and b(x, y) on the horizontal and vertical axes, respectively. (c) Boundary detection based on the edge flow. The shaded rectangles indicate edges which lie between image pixels.
Once the edge flow propagation reaches a stable state, we can detect the image boundaries by identifying the locations which have non-zero edge flows coming from two opposing directions. Let us first define the edges V(x, y) and H(x, y) as the vertical and horizontal edge maps between image pixels, as shown in Fig. 3(a), and let F_f(s) be the final stable edge flow vector. F_f(s) can be further represented as (a(x, y), b(x, y)) = (real(F_f(s)), imag(F_f(s))), based on its projections on the horizontal and vertical axes (see Fig. 3(b)). The edges V(x, y) and H(x, y) have nonzero energy if and only if the two neighboring edge flow vectors point at each other, and the corresponding energy is defined to be the sum of the projections of those two edge flow vectors towards the edge. We summarize as follows:

V(x, y) = a(x-1, y) - a(x, y) if and only if a(x-1, y) > 0 and a(x, y) < 0; otherwise V(x, y) = 0;
H(x, y) = b(x, y-1) - b(x, y) if and only if b(x, y-1) > 0 and b(x, y) < 0; otherwise H(x, y) = 0.
Fig. 3(c) shows an example of boundary detection.
After the edges are detected, the connected edges are used to form a boundary, whose energy is defined to be the average energy of its edges V(x, y) and H(x, y). A threshold on this energy is used to remove weak boundaries. In the experiments, this threshold is selected automatically based on the mean and standard deviation of the nonzero edges V(x, y) and H(x, y) in the image (for example, the mean plus three times the standard deviation).
5. Post-Processing: Boundary Connection and Region Merging
After boundary detection, disjoint boundaries are connected to form closed contours, resulting in a number of image regions. The basic strategy for connecting the boundaries is summarized as follows.
- For each open contour, we associate a neighborhood search size proportional to the length of the contour. This neighborhood is defined as a half ellipse with its center located at the unconnected end of the contour. The major axis of the ellipse points in the direction of the contour, so that the search favors boundaries located along a similar orientation (see Fig. 4).
- The nearest boundary element within the half ellipse is identified.
- If such a boundary element is found, a smooth boundary segment is generated to connect the open contour to that nearest boundary element.
This process is repeated a few times (typically 2-3 times).
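The half-ellipse membership test implied by the steps above can be sketched as follows (the parameterization by center, contour direction, and the two semi-axes is my own reading of the text, not the authors' code):

```cpp
#include <cmath>

// Test whether a candidate boundary element at (px, py) lies inside the
// half ellipse centered at the open contour end (cx, cy). (dx, dy) is the
// unit vector along the contour direction (the major axis), ra and rb are
// the semi-major and semi-minor axis lengths.
bool inHalfEllipse(double px, double py, double cx, double cy,
                   double dx, double dy, double ra, double rb) {
    double ux = px - cx, uy = py - cy;
    double along  =  ux * dx + uy * dy;   // component along the contour direction
    double across = -ux * dy + uy * dx;   // perpendicular component
    if (along < 0.0) return false;        // behind the contour end: wrong half
    double q = (along / ra) * (along / ra) + (across / rb) * (across / rb);
    return q <= 1.0;                       // inside the ellipse boundary
}
```

Scanning nearby boundary elements with this test and keeping the nearest accepted one implements the second step of the strategy.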
At the end, a region merging algorithm is used to merge similar regions based on a measurement that evaluates the distances between region color and texture features, the sizes of the regions, and the percentage of original boundary shared between two neighboring regions. In our experiments, we use a color histogram [3] and Gabor texture features [2] computed from each image region to measure region similarity. If the size of a region is smaller than a certain threshold, it is always merged with its most similar neighboring region. In order to preserve salient image boundaries, we do not merge regions which are separated by boundaries consisting of more than 90% of the length of the original contours.
6. Experimental Results
In this section, we show example results from each stage of the approach. Image attributes, i.e., color, intensity, texture, or a combination of these, are used to detect the boundaries. The first stage is to compute the intensity edges using the Gaussian derivative (GD) and the difference of offset Gaussians (DOOG). Fig. 5(a), (b), and (c) show an original color image, the boundary detection, and the segmentation result, while Fig. 5(d), (e), and (f) show the original image in the red, green, and blue channels, respectively. Fig. 6 shows example results of the intensity edge energy and the probabilities of the edge flow direction in RGB color. Fig. 7 demonstrates example results of the texture edge and the total edge energies with their corresponding probabilities. In Figs. 6 and 7, the results of each stage are computed at orientations of 0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, and 330 degrees. Fig. 8 shows an example result of image segmentation using the EdgeFlow technique. Figs. 9 and 10 show an edge flow vector field and a zoomed view of a boundary region of the field in Fig. 9. Figs. 11-14 show further examples of edge flow vector fields. Fig. 15 shows segmentation results with different values of the scale parameter σ.
(a) (b) (c) (d) (e) (f)
Fig. 6. Edge energy and probabilities of the edge flow direction in RGB color. (a) edge energy in the red channel (b) edge energy in the green channel (c) edge energy in the blue channel (d) probabilities of the edge flow direction in the red channel (e) probabilities of the edge flow direction in the green channel (f) probabilities of the edge flow direction in the blue channel.
(a) (b) (c) (d)
Fig. 7. Texture edge and corresponding probabilities, and total edge energies and probabilities. (a) texture edge of the intensity (b) probabilities of the texture edge flow (c) total edge energies (d) total probabilities of the edge flow.
(a) (b) (c)
Fig. 8. An example result. (a) original image (b) boundary detection using the EdgeFlow technique (c) final segmentation result.
(a)
(b)
Fig. 9. An example of edge flow. (a) original image (b) edge flow vector field.
Fig. 10. Zoomed view of a boundary in Fig. 9. Note that boundary detection can be performed by propagating the edge flow vectors and identifying the locations where flows of two opposite directions encounter each other.
(a) (b)
Fig. 11. An example of an edge flow vector field. (a) original flower image (b) edge flow vector field.
(a) (b)
Fig. 12. An example of an edge flow vector field. (a) original flower image (b) edge flow vector field.
(a) (b)
Fig. 13. An example of an edge flow vector field. (a) original boxtext image (b) edge flow vector field.
(a) (b)
Fig. 14. An example of an edge flow vector field. (a) original rgbwhite image (b) edge flow vector field.
(a) (b) (c) (d)
Fig. 15. An example of boundary detection with different σ. (a) original image (b) segmentation result using σ = 3 (c) segmentation result using σ = 4 (d) segmentation result using σ = 5.
References
1. W.-Y. Ma and B. S. Manjunath, "EdgeFlow: A Technique for Boundary Detection and Image Segmentation," IEEE Transactions on Image Processing, Vol. 9, No. 8, August 2000, pp. 1375-1388.
2. B. S. Manjunath and W.-Y. Ma, "Texture Features for Browsing and Retrieval of Image Data," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 18, August 1996, pp. 837-842.
3. M. J. Swain and D. H. Ballard, "Color Indexing," Int. J. Computer Vision, Vol. 7, No. 1, 1991, pp. 11-32.
Appendix A
A.1 Local maxima
Fig. A.1. An example of local maxima. The shaded area indicates the local maxima in an image.
A.2 Convolution
The output image h(x, y) is the convolution of the input image f(x, y) with the impulse response g(x, y), defined as

h(x, y) = f(x, y) * g(x, y) = ∬ f(x - u, y - v) g(u, v) du dv.
If f and h are images, convolution becomes the computation of weighted sums of the image pixels. The impulse response, g[i, j], is referred to as a convolution mask. For each pixel [i, j] in the image, the value h[i, j] is calculated by translating the convolution mask to pixel [i, j] and then taking the weighted sum of the pixels in the neighborhood around [i, j], where the individual weights are the corresponding values in the convolution mask.
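The weighted-sum rule above, for a 3×3 mask (a sketch; border pixels are not handled):

```cpp
#include <vector>

// Apply a 3x3 convolution mask g at interior pixel [i, j] of image f,
// taking the weighted sum of the 3x3 neighborhood around [i, j].
double convolve3x3(const std::vector<std::vector<double>>& f, int i, int j,
                   const double g[3][3]) {
    double h = 0.0;
    for (int k = -1; k <= 1; ++k)
        for (int l = -1; l <= 1; ++l)
            h += g[k + 1][l + 1] * f[i - k][j - l];
    return h;
}
```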
A.3 Gaussian kernel function
Gaussian filters are a class of linear smoothing filters with the weights chosen according to the shape of a Gaussian function. The Gaussian smoothing filter is a very good filter for removing noise drawn from a normal distribution.
A B C     p1 p2 p3
D E F     p4 p5 p6
G H I     p7 p8 p9

h[i, j] = A·p1 + B·p2 + C·p3 + D·p4 + E·p5 + F·p6 + G·p7 + H·p8 + I·p9

Fig. A.2. An example of a 3×3 convolution mask. The origin [0, 0] of the convolution mask corresponds to location E, and the weights A, B, ..., I are the values of g[-k, -l], k, l = -1, 0, +1.
The Gaussian function depends on the spread parameter σ, which determines the width of the Gaussian. Gaussian functions have five properties that make them particularly useful in early vision processing. These five properties are summarized below.
1. In two dimensions, Gaussian functions are rotationally symmetric. This means that the amount of smoothing performed by the filter is the same in all directions. In general, the edges in an image will not be oriented in some particular direction that is known in advance; consequently, there is no reason a priori to smooth more in one direction than in another. The property of rotational symmetry implies that a Gaussian smoothing filter will not bias subsequent edge detection in any particular direction.
2. The Gaussian function has a single lobe. This means that a Gaussian filter smooths by replacing each image pixel with a weighted average of the neighboring pixels, such that the weight given to a neighbor decreases monotonically with distance from the central pixel. This property is important since an edge is a local feature in an image, and a smoothing operation that gives more significance to pixels farther away will distort the features.
3. The Fourier transform of a Gaussian has a single lobe in the frequency spectrum. This property is a straightforward corollary of the fact that the Fourier transform of a Gaussian is itself a Gaussian. Images are often corrupted by undesirable high-frequency signals (noise and fine texture). The desirable image features, such as edges, will have components at both low and high frequencies. The single lobe in the Fourier transform of a Gaussian means that the smoothed image will not be corrupted by contributions from unwanted high-frequency signals, while most of the desirable signals will be retained.
4. The width, and hence the degree of smoothing, of a Gaussian filter is parameterized by σ, and the relationship between σ and the degree of smoothing is very simple. A larger σ implies a wider Gaussian filter and greater smoothing. The degree of smoothing can be adjusted to achieve a compromise between excessive blur of the desired image features (too much smoothing) and excessive undesired variation in the smoothed image due to noise and fine texture (too little smoothing).
5. Large Gaussian filters can be implemented very efficiently because Gaussian functions are separable. Two-dimensional Gaussian convolution can be performed by convolving the image with a one-dimensional Gaussian and then convolving the result with the same one-dimensional filter oriented orthogonally to the Gaussian used in the first stage. Thus, the amount of computation required for a 2-D Gaussian filter grows linearly with the width of the filter mask instead of quadratically.
In the EdgeFlow technique, the first derivative of the Gaussian (GD) and the difference of offset Gaussians (DOOG) are used to compute the edge energy and the probability of the edge flow direction. Figs. A.3 and A.4 show the 3-D and 2-D Gaussian derivative with θ = 0, respectively. Figs. A.5 and A.6 show the 3-D and 2-D plots of the difference of offset Gaussians with θ = 0, respectively.
Appendix B
This appendix covers some aspects of the C++ implementation.
Program B. Gaussian function
double Gaussian(double x, double y, double sigma)
Fig. A.4. Gaussian derivative mask. From left to right and top to bottom, the GD mask is computed at orientations of 0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, and 330 degrees.
Fig. A.6. The difference of offset Gaussians mask. From left to right and top to bottom, the DOOG mask is computed at orientations of 0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, and 330 degrees.
{
    return 0.39894228040143267794 / sigma * exp(-(x*x + y*y) / (2.0 * sigma * sigma));
}

Note: 1/√(2π) = 0.39894228040143267794.