Gabor Filter Analysis For Texture Segmentation

Technion - Computer Science Department - Technical Report CIS-2005-05 - 2005
Gabor Filter Analysis for Texture Segmentation

Michael Lindenbaum, Roman Sandler
Abstract
Gabor features are a common choice for texture analysis. There are several popular sets of Gabor filters. These sets are usually designed based on representation
considerations.
We propose here an alternative criterion for designing the filters set. We consider
a set of filters and their responses to a pairs of harmonic signals. Two signals are
considered separable if the corresponding two sets of responses are disjoint in at least
one of the responses. We look for the set of Gabor filters maximizing the fraction of
separable harmonic signals. The proposed semi-analytical algorithm calculates filters
parameters for the optimal set, given the desired number of filters and the frequency
range of possible signals. The resulting filters are significantly different from those
traditionally used.
We tested the proposed filters both in texture segmentation and texture recognition aspects with commonly used discrimination algorithms for each of the tasks. We
show that, as expected, the resulting filters perform better than the traditional ones
in discriminating synthetic and real textures. An important side effect of using the
proposed filters with the popular features distribution based methods, considering a
feature vector composed of the filters responses, is the possibility to use a more compact (a lower number of feature vector prototypes) representation of the texture classes
than using the common filters.
Introduction
Texture is a basic visual cue, helping the human visual system in segmentation and recognition tasks. Its usage in computer vision has been a very active topic in the past three decades.
Texture is defined in dictionary as the characteristic appearance of a surface having a tactile
quality. However, in computer vision, there is no universally accepted definition for texture
[4]. The following images (Figures 1, 2) are some examples to what is referred as texture
and textured objects.
A texture is not specified by the intensity (or color) in a single point. Therefore, texture
descriptors are always based on some neighborhood. It is common to characterize the texture
using a vector of scalar descriptors, which may be simply gray levels in some neighborhood
[29] or responses of different filters [12, 19, 16, 24, 10].
One of the important applications of texture analysis is image segmentation: the given
image is divided into parts with contain homogeneous texture. Texture segmentation may
1
Figure 1: Some example of textures
Figure 2: Some example of textured objects

be done in several ways. A common approach is to associate image units (pixels or larger
regions, referred to as patches) with feature vectors describing texture in these locations,
and consider the distances between the vectors as distances between the textures. Then
the algorithm search for a segmentation, for which the feature vector based distance between
points in the same segment is small and the feature based distance between points in different
regions is large. Such partition may be found, for example, by graph algorithms [19].
Consider for example figure 3. A illustration of the desirable distances between feature
vectors: The neighborhood of point P1 belongs to T exture1 , while those of P2 and P3 belong
to T exture2 . Let d(Pi, Pj ) be a distance between associated feature vectors at Pi and Pj .
Then ideally,
d(P2 , P3 ) < d(P1 , P2 ) and d(P2 , P3 ) < d(P1 , P3 ).
The validity of this relation depends on the type of metric, the feature vector characterizing
the texture patch etc. Some features and comparison methods are described in section 2.
This work considers the influence of the feature vector (and the associated distance
measure) on texture discrimination. We restrict ourselves to Gabor features (a common
choice for texture features, (see section 2.2) and propose a method for designing a Gabor filter
set to maximize texture discrimination. The design is made over a simple set of synthetic
textures, for which the discrimination is optimized by analytic and numerical methods. Then,
we show that the derived filters work well for realistic textures.
Unlike the common modern approach, we characterize a texture by a vector whose components are the responses of (Gabor) filters and not histograms of local texton cluster identities
over a region. This way, the decision whether two points belong to the same texture may be
made without additional clustering stage, which results is a faster answer to such queries.
2
50
100
150
200
250
Texture 1
50
Texture 2
100
150
200
250
300
350
400
Figure 3: Texture segmentation problem example.

We also show that using the proposed feature in the context of a clustering based approach
is advantageous as well.
Our approach is different from previous ones also by the emphasis we put on translation
invariance. To be a good texture characterization, the descriptor should not take significantly
different values in different locations of the same texture. Common methods do not pay much
attention to this demand and use filters which are optimized for texture representation with
minimal distortion (measured, say, by MSE error). This causes completely different feature
values in different location on the same texture. Although the texture can be correctly
restored from this features, several representative features should be taken to describe each
texture patch. Here, on the other hand, we directly optimize the filter to achieve maximal
discrimination using a single texture feature vector, which, not surprisingly indeed results in
superior discrimination.
The paper is organized as follows: Section 2 reviews some commonly used texture features, and then focuses on Gabor functions as a method for texture features extraction. It
described some examples of their usage, explains the motivation for the common design of
Gabor filter banks, suggests that these filter banks are not optimal for texture discrimination,
and propose an alternative design approach. Section 4 describes the framework and suggests
an intuitive (semi-analytic) derivation of the proposed filters in a simple setting. Section 5
describes numerical procedures for optimizing one dimensional filter bank(s), describes the
resulting filters, and compare them to the traditionally used filter sets. Section 6 extends
the derived filter sets into 2D and suggests a related adaptive distance measure. Section
7 describe experiments, done on both synthetic and real textures, showing the advantage
of the proposed texture features on the traditional ones. The work is concluded by drawing some relations on the obtained filter sets and suggests future directions of the research
continuation (section 8).
Background: Texture discrimination methods
Texture discrimination is needed for two main applications: classification and segmentation.
The aims of these applications are different but they often use the same texture characterization methods (See 2.1.5 for a short description of feature application methods).
2.1
Texture features
A texture feature associated with some texture patch, is a local descriptor describing some
texture-relevant characteristic of this patch. The patch is thus represented by a vector of such
descriptors. Many texture features were proposed. A simplistic (but in some context, also
effective) method to use the actual grey-levels of pixels neighborhood as texture features.
Some other approaches, described below, project the image onto another representation
space.
We shall discriminate between patch descriptors, soft neighborhood descriptors and indirect descriptors
2.1.1
Texture Patch Descriptors
Some of methods are traditionally classified as model based methods, because the pixels
neighborhood is considered to be a result of some known parametric process. Other methods
do not use such models and are called non-model based methods.
Some of the popular Model Based Features are
1. Fractal Models [23, 1]. Pentland [23] proved that the fractal dimensions [1] of any object
and its image are equal. From this proof he derived that images may be segmented
into homogenous parts using a segmentation of their fractal dimension histograms.
This approach actually claimed that textured areas may be distinguished by a single
parameter. When opposite examples were provided, additional fractal parameters such
as Lacunarity [1] and directional fractal measurement were proposed.
2. Autoregressive Models [21]. Autoregressive (AR) models consider the pixels gray level
value as a result of some random process. This way each pixels value can be represented
as a weighted sum of surrounding pixels. Region classification is done by comparing
the models built for them. Rotation invariant versions were proposed as well. AR is a
particular case of more general MRF approach.
3. Markov Random Fields [3, 29]. A texture is considered to be a realization of a stochastic, two dimensional Markov Random field. In a MRF model, the gray level in a
pixel is a random variable randomized depending on its neighbors. Therefore, some
typical neighborhoods characterizing the texture appear more frequently and may represent the texture. Alternatively, similar to AR models, the parameters specifying the
dependency may be used as a representation.
Some of the Non-model-based Features are
1. Grey-level co-occurrence [13]. These methods actually calculate two dimensional histograms of the occurrence of the pairs of grey-levels for a given displacement vector
(distance and direction). That is, the resulting matrix contains the number of the
occurrences for each possible pair of gray-levels separated by distance d in direction
for a given texture patch.
2. Frequency domain methods Commonly used features consist of sums of coefficients
within wedges, rings or sectors of two dimensional power spectra. Additional features
include the characteristics of spectral peaks.
2.1.2
Soft neighborhood descriptors
One often needs to characterize every pixel in the image with respect to the texture it
lies on. One straightforward method would be to define a patch around every such pixel
and to characterize the pixel using the patch texture features described above. A better
representation would be one which puts more weight on the image properties closer to the
pixel (relative to other locations in the patch). Features based on Gabor filters (see section
2.2), wavelet and wavelet packet decomposition [2] and other characterizations based on
mixed space-frequency descriptors are especially suitable for such tasks. One example of
the later is Laws texture energy filters [15] (which are sets of seven bandpass and high-pass
directional filters, implemented as 5 5 masks). A comparative study which consider most
of the common texture features and suggests new sets of filters for specific texture classes is
described [27].
2.1.3
Indirect, distribution based, descriptors
It often happens that textures contain several typical sub-patterns, which appear more frequently than others. These are commonly named texons. Several texture characterization
methods rely on the distribution of textons: One popular method is to cluster the (vector)
pixel (soft neighborhood) or patch descriptors vectors into several (K) groups (or clusters).
This way, every pixel is assigned to some cluster and is given an index. The distribution
of such indexes in the neighborhood of the pixel is used as a texture characterization of
this pixel. This method turned out to be very successful but note that it is not local: the
characterization depends, in principle, on all image gray levels. The feature vectors that
are clustered may be either Gabor filter responses [19] or even the gray levels in a small
neighborhood themselves [29].
Mean shift clustering is a more modern method of clustering which may be used for
indirect texture features generation as well. Here, the number of groups is found in an
iterative convergence process. The distribution of the groups in neighborhood of the pixel is
used for textures comparison [10].
2.1.4
Comparing texture descriptors
The feature vector descriptions may be compared by several methods. One way would be
to use some metric distance d (satisfying symmetry, minimality, triangular inequality and
self-similarity). However there are strong evidences that metric distance measurement is
5
not fully consistent with human perception system. For example, there is no proof that a
property of triangular inequality exists for human perception [14].
Instead of a metric distance it is possible to compare texture features using some dissimilarity measure, which does not require the Triangle inequality. Practically, many measures are used. Some common methods are Euclidean distance Ln , Mahalanobis distances,
correlation, 2 and Earth Movers distance (EMD) [24]. Some papers suggest building rulebased systems or neural network classifiers [11].
2.1.5
Application of the features
As mentioned earlier, there are two main tasks for texture discrimination systems: recognition and segmentation.
A texture recognition task is divided into two stages: training and recognition. In training
many patches of different textures are given together with their classifications. A representation of every texture is constructed. A modern example is the distribution of feature vectors.
Then, in the recognition stage, a query patch of some texture is given and its classification is
required. This may be carried out by finding the distribution of features in the given patch
and comparing it to every one of the distributions obtained in training. In this task, much
of the work is done in the feature extraction and in the distribution estimation. Texture
recognition results are usually characterized by a relatively high performance, because a lot
of features, for relatively large image areas are considered for a single decision on texture
kind [29].
Texture based segmentation usually starts by extracting a large matrix of similarities
associated with every pair of pixels (or some rougher image quantization). For an N N
image, this matrix is of size N 2 N 2 . Then, the segmentation method based on this matrix
may be of different complexities. A simple method could, for example, build a graph where
every nodes corresponds to a unique pixel. Then, graph edges are added only where the
similarity between the corresponding feature vectors is high. Then, connected components
algorithm partitions the graph (and the corresponding image) into groups. A more complex
algorithm could use less trivial graph partitioning methods such as Normalized Cut [26],
edge flow [18], Multi-grid aggregation [25], Active contours and different Byesian criteria
based methods.
2.2
Gabor features
Gabor kernels are commonly used for texture features extraction. Their popularity is motivated by the mathematical and the biological properties of Gabor functions.
In 1946, D.Gabor [9] proved that a signals specificity simultaneously in time and frequency is fundamentally limited by a lower bound on the product of its bandwidth and
duration (analogous to indeterminacy relations of quantum mechanics). The bound is
t2
1
(x)() 4
. He also proved that signals of the form s(t) = e 2 +it achieve the theoretical limit he found.
Computationally, the Gabor functions form a complete but non-orthogonal basis set and
any given function can be expanded in terms of these basic functions.
In 1985 Daugman generalized the Gabor function to the following 2D form to model the
receptive-field profiles of simple cells in the striate cortex[6]:
1
G(x, y) =
e
2
(xx0 )2
(yy0 )2
+
2
2

2i[u0 (xx0 )+v0 (yy0 )]
(2.1)
These functions, which are spatial bandpass filters with central frequency (u0 , v0 ), achieve
the theoretical limit for conjoint resolution of information in the 2D spatial and 2D Fourier
domains.
The specified properties of 2D Gabor functions inspired a search for texture discrimination
methods based on these functions. Porat and Zeevi [22] reviewed several Gabor features and
suggested a texture discrimination method based on two of them. Fogel and Sagi [8] proposed
a texture discrimination method, which is based on calculation of the dissimilarity between
Gabor power spectrum of two texture elements. In other words they compute the absolute
value of the complex Gabor-filtering result and compare the received values. Fogel and
Sagi were the first to notice that texture features received by this method are locally shift
invariant. Malik and Perona [20] explained that Gabor features should be non-linear, to
make their recognition properties consistent with human perception. They also mentioned
the Fogel and Sagi method as one of the methods which implement their findings.
2.2.1
Representing texture with oriented Gabor power filters
As it clear from eq. (2.1), Gabor kernel significantly responds to a limited range of signals
which form a repeating structure in some direction (which is usually denoted as filters
orientation or direction) and are associated with some frequencies (filters band). The obvious
question is which filters should be taken to represent all possible textures.
Textured images are commonly represented with responses to filters from some filter bank,
which usually covers the frequency-direction domain (see figure 4). Therefore each image
pixel is represented with a two dimensional vector of values specifying coarse frequency
decomposition of the texture in several directions.
Greenspan and Perona shown [12] numerically that taking complex filters in four directions is enough to represent most (97%) of the image energy. Still six directions are usually
used. The number of scales (number of considered filter frequency bands) is usually three or
four.
It is generally believed that the number of filters in the bank and their parameters
are not so important and that all reasonable choices provide almost the same quality of
representation. The commonly used filters are either filter-set designed for general image
representation or filter-set design for discriminating a set of known textures. We argue
that these approaches are not optimal for texture segmentation of general images. We shall
further show that some significant difference in discrimination performance may be achieved
as a function of filter bank choice.
2.2.2
Filter-sets for general image representation
Commonly used 2D Gabor basis functions are spatially localized, oriented, and roughly
one octave in bandwidth (see eq. 2.1). Such decompositions are inspired by biological visual
7
(a)
(b)
Figure 4: (a) The cumulative frequency responses of a bank of Gabor filters, plotted in a
2D frequency plane. Each bright spot represents a range of frequencies for which some filter
strongly respond. Note that each ring of spots corresponds to filters with the same radial
frequency. Spot with different distance from the origin but same directions correspond
to different scales. This filter bank contains 5 scales and 4 directions. (b) The features
calculated in every pixel of the given texture. Each greyscale in the right 18 squares equals
to the response magnitude of the filter associated with this square in the pixel with the same
coordinates on the original image (the most left image). The features are calculated using
filter set of three filters oriented in six different directions. For example the value at position
(14, 46) on the third square in the second row is associated with the response to the filter
with the second frequency of the set in the third direction at pixel (14,46) of the original
texture.
processing [6] and by signal representation considerations. Specifically, Field [7] claimed that
such filters are optimal for natural scene images coding: he found that the most compact
representation of (a small set of) natural images is achieved when the representation is done
with log-Gabor filters associated with 1.0 octave frequency bands.
Lee [16] showed that 2D Gabor filters with octave steps form a frame. Family of functions
forms a frame when any original signal may be reconstructed in a numerically stable way from
its decomposition coefficients, and the difference between the energy of the reconstructed
signal and the original energy is bounded. A tightness of a frame is specified as a measure
of non-variability of the difference between the original and the reconstructed energies. Lee
found that a 1.5 octave filters form a tighter frame and that they can be treated as though
it were an orthonormal base.
2.2.3
Multi-textured images
In this approach [30, 11] all the possible textures are preliminarily known, as well as the
bank of all possible Gabor filters. The issue is to select the best filters for the segmentation
task. The general method in this case is to test all the filters on textures samples and to
select the most valuable ones into the filter-set.
2.2.4
Some sample implementation of Gabor feature based texture discrimination systems
Many vision systems use texture discrimination by Gabor filters. To illustrate the usage of
these feature, we now briefly describe three representative examples.
Greenspan et al. [11] The method is based on extraction of texture features for each
8 8 block of the image and consecutive recognition from database with learning
based algorithms.
In the feature extraction stage the image processed in three scales. In each scale 4
absolute values of the responses to different orientations of the same complex logGabor filter are obtained. The Fourier transform of theses values, done relating to
the orientation axis, provides rotation invariant representation of the texture on this
scale. The transform also allows to find some textures direction which is used to
distinguish different orientations of the texture of the same kind. An additional feature
for each scale is the response to Laplacian filter of the same size as the Gabor filter.
The result of the feature extracting stage is a 15-dimensional feature vector. The scales
(the filters frequencies) are distributed in octave steps, such that the filters bandwidth
doubles for each next high-frequency band.
In the second (recognition) stage the features were clustered along each dimension
using K-means algorithm and the results were classified using different methods. The
association of the patches with texture classes is used for image segmentation and
texture classes recognition.
EMD based image retrieval Rubner [24] uses Gabor features in his database image retrieval algorithm for searching an image that includes a given texture. In the preprocessing stage of the algorithm, every image passes a feature extraction stage, the
features are clustered using K-means algorithm and K resulting features are stored
as a representation of the textures which are present in the image. In the retrieval
stage some image including only the requested texture is presented to the system, its
features are averaged and the resulting feature vector is EMD-compared (see below) to
every feature vector stored in the database in the preprocessing stage. The images that
one of their feature vectors is close to the query are added to the answer as possible
candidate.
The EMD (Earth Movers Distance) is defined as the minimum amount of work needed
to change one feature vector into the other. The work may be defined differently
for features of different nature (for example, moving energy between the features of
the same scale may be defined as cheaper then moving it to other scales). In the case
of Gabor features the resulting distance measure is more justified than the commonly
used metrics that minimize geometrical distance.
Rubner used 24-dimensional feature vectors, constructed from absolute value Gabor
responses to 4 filters frequencies in 6 directions. He also used filters organized in octavelike way and used Fourier-Mellin transform for providing rotation and scale invariance
to the resulting feature vectors.
9
Normalized cut method [26, 19] In [26] the authors acted in a more or less common way
- they calculated Gabor-like features of the input image and passed these features to the
segmentation mechanism. They used 18 DOG absolute value features, received using 3
filters in 6 directions. DOG - Difference of non-isotropic Gaussians are complex filters
which are similar to the Gabor filters with one cycle of the sine inside the meaningful
area of the filter.
In the segmentation stage a connected full graph is built, while each graph node representing a pixel of the image and every edge represents a likelihood for the two pixels
to be a part of the same object. The edge value is set as an exponential function of
(minus) the scalar product between the pixels feature vectors. This way, pixels with
similar feature vectors are getting a high edge value. The goal of graph segmentation
methods is to find a minimal cut which is a combination of edges having minimal sum
of edge values (i.e. find the least alike pairs of pixels), and removing these edges divides the graph into two unconnected subgraphs. The advantage of the normalized cut
method is that it considers two aspects of graph segmentation: minimal cut (i.e. better
separation) and preferring segments of large size. The normalized cut process divides
the weighted graph into two subgraphs, such that the cut between the subgraphs,
normalized by the total connectivity, is minimal.
In a second paper [19] the texture features are analyzed using the distribution based
texture descriptors (see 2.1.3). The Gabor feature vectors are not directly compared
as in the first article, but divided into K clusters and each image pixel is assigned to
one of these clusters. Using the Delaunay triangulation a local scale is found for every
pixel, and the histogram of clusters distribution in the neighborhood of this local scale
size is attached to each pixel as a descriptor. A likelihood for the pixels to be a part
of the same object (denoted as texture cue) is calculated by 2 test on the descriptor
histograms of the pixels. In this method in addition to the texture cues, additional,
edge detection based, contour cues are also considered. The segmentation is done using
normalized cut method, while the edge value is set to be the exponent of the product
of contour and texture cues for the pixel pair.
10
Our approach
This work is done in the context of the following common framework for texture discrimination: the texture image is filtered by several (complex) Gabor filters and a feature vector
is constructed with the absolute values of the responses as components. The discrimination
between textures uses only these feature vectors and relies on simple procedures to decide
whether two vectors are associated with the same texture. Specifically, such a procedure
decides that the texture in two points is of the same type, if the Euclidean (or Mahalanobis)
distances between the corresponding feature vectors is lower than some threshold.
This work focuses on the design of the Gabor filters set. As mentioned above, the common
choices are made for optimizing the filter design with respect to image representation (in
MSE sense). We argue that the best filters for texture discrimination are different than those
used for representation and shall indeed propose a set of Gabor filter, which is different that
the commonly used sets.
The response of a filter to some texture may take a range of values. We shall say that
a filter (or a filter set) discriminate between two textures if the ranges of values associated
with these two textures, are disjoint. We shall look for filters maximizing the discrimination
between different textures, in a well defined sense described below; see section 4.2. To that
end, we develop an analytical expression for the range of response values of a 1D Gabor
filter to simple signals (e.g. Sine waves). We use these ranges to specify the best filter sets,
which separate best between the simple signals. The filter sets are derived by numerical and
semi-analytical methods, which lead to similar results. The resulting filter sets are different
than the traditional ones.
In a sense, the design of filter sets for specific textures [30] is related to our work. One
main difference is that we model the signal and estimate the variation analytically. Another
difference is that unlike the iterative, heuristic design, suggested in [30], we use a global
process for the optimization.
Our work can be considered to be an investigation into the translation sensitivity of
Gabor filters. While the absolute value of Gabor filtered signal will never be truly shift
invariant, its variability may be bounded in relatively small ranges, such that for a relatively
far values we would be able to say that they belong to different textures.
Essentially, traditional texture discrimination methods consider the filter response in
qualitative form. For example, if the texture is simple (e.g. sinusoidal), they are interested
in the filter that react maximally to it, and not much in the actual response size. In contrast to these methods but not unlike the modern, distribution based approaches, we model
the actual response and therefore can make a finer, super resolution like, discrimination
between textures. Analogously to super-resolution methods using several measurements corresponding to the same brightness, the resulting filters turn out to be highly overlapping in
frequency coverage. This leads, as we shall see, to better discrimination.
As described above, modern texture characterization methods rely on distributions of
filter response. Such methods were proved to be insensitive to the type of texture features
[29], provided that a sufficient number of textons is allowed. We, on the other hand, are
interested in finding the best feature and therefore shall evaluate the proposed ones using
direct methods (and not using distributions). It is likely that less complex distributions
(with a lower number of textons) could be built with better features.
11
A filter response to harmonic input analysis
We start by considering the range of responses associated with one 1D filter and one simple signal. The simple signals considered as a model of texture are either Sines signals or
combination of them. The discrimination study is done in the context of specific sets of signals associated with [fmin , fmax ] frequency range and [Amin , Amax ] amplitudes range. These
ranges are latter referred to as the framework. Our goal is to maximize the number of signal
pairs which may be distinguished. The desired result is a set of filters parameters, depending
on the prespecified number of filters and the parameters fmin , fmax , Amin , Amax .
We first focus on a few simple cases, allowing intuition to the Gabor filters signal discrimination capabilities and providing computational advantages in the final optimization.
For a simple case of a single harmonic signal with known amplitude we can find the optimal
single filter analytically. Using the acquired intuition we developed a semi-analytical method
for building a filter set for discriminating between single harmonic signals with unspecified
amplitudes. The resulting filter set is very similar to the one obtained from the full numerical
search on the filter parameters domain.
We also provide an expression for Gabor filter response bounds to dual-harmonic signals
and multi-harmonic signals. Estimates based on these expressions allows us to understand
the responses behavior and imply that the filters, obtained for single harmonic signals, perform well for the multi harmonic case too. As we shall see later, these filters perform well
for distinguishing between real textures as well.
4.1
A single filter response to Sine signal

x2
1
e 22 +ikx , and a signal gA,m (x) = Asin(mx).
Consider a complex Gabor filter hk,2 (x) = 2
2
The (complex) filter response is highly variable depending on location but its absolute value
rA,m,k,2 (x) = |hk,2 (x) gA,m (x)|

(k2 +m2 ) 2 p
2
ch2 (km 2 ) cos2 (mx)
= Ae
(4.1)
is more stable (see Appendix A for derivation). Here we consider only the absolute value of
the response and refer to it simply as the response.
This response varies with location (x coordinate) and depends on the relative phase of
the Sine input signal with respect to the filter location (x = 0). This change is undesirable,
but still unavoidable. It can be shown that the range of responses is bounded by

(km)2 2
(km)2 2
(k+m)2 2
(k+m)2 2
A
A
2
2
2
2
e
e
> rA,m,k,2 (x) >
.
(4.2)
+e
e
2
2
These bounds are tight. The response varies around a middle value,
rA,m,k,2 =
A (km)2 2
2
e
,
2
with a maximal deviation

rA,m,k,2 = Ae
12
(k+m)2 2
2
(4.3)
(4.4)
0.4
0.35
0.3
r1,,k,2
0.5
0.45
0.25
r2
0.2
r1
0.15
r2
0.1
r1
0.05
0
2.5
1.5
log(1)
1
0.5
log(2) log(3)
0.5
1
1.5
log(k) log(4)log( )
5
Figure 5: The upper and lower bounds on the response of a Gabor filter hk,2 to a constant
amplitude signal, plotted against the signal frequency (in log (natural logarithm) scale).
(k = 1.6, = 1.51). The signal with m2 frequency is indistinguishable from signals with
frequencies in [m1 , m3 ] and [m4 , m5 ] ranges.
This deviation is referred to as the filters variability. It increases (exponentially) as the
frequency decreases, implying that the response to lower frequencies is more space varying;
see Figure 5.
4.2
The optimality measure - number of separated harmonic signal pairs
As described above we intend to specify filters which maximize the separation of signals.
Here we formally define the optimality measure for this separation. We intend to use it to
find the filter sets with maximal discrimination power, given the parameter of the framework.
The variables declared here are described more intuitively later. The goal of these formal
definitions is to draw guidelines for derivations in the following sections.
x2
1
e 22 +ikx and an harmonic signal
1. Range of responses For a filter hki ,i2 = 2
2
gA,m (x) = Asin(mx), the range of possible responses is:
h
i
Rki ,i2 (m, A) = min{rA,ki ,m,i2 (x)}, max{rA,ki ,m,i2 (x)} .
(4.5)
To reduce the sensitivity to measurement noises we require that the response is not
too small and is al least Rmin (see more on this in section 4.3.1) and redefine:
h
i
(x)},
max{r
(x)}
(4.6)
Rki ,i2 (m, A) = min{rA,k
2
2
A,ki ,m,
i ,m,
i
where
rA,k
2 (x)
i ,m,i
rA,ki ,m,i2 (x) rA,ki ,m,i2 (x) Rmin

,
0
rA,ki ,m,i2 (x) < Rmin
13
(4.7)
2. Signal separability. Two signals gAi ,mi (x) and gAj ,mj (x) are separated using the filter
set {hkl ,l2 |l 1, .., L}, where L is the number of filters, if their responses for some filter
are disjoint. The binary variable

1 l s.t. Rkl ,l2 (mi , Ai ) Rkl ,l2 (mj , Aj ) =
SAA (mi , mj , Ai , Aj ) =
(4.8)
0
else
is used to describe signals separability.
3. Frequency separability. Two frequencies mi , mj are separated if for all amplitudes
in the allowed range [Amin ; Amax ] the associated signals are separated. The binary
variable

1 Ai Aj SAA (mi , mj , Ai , Aj ) = 1
s(mi , mj ) =
(4.9)
0
else
is used to describe frequencies separability.
4. The measure of filter set performance. The variable Q [0, 1] specifies the
relative fraction of separable frequencies and is a proposed optimality measure for
filters hk1 ,1 , hk2 ,2 , ...:

Q (k1 , 12 ); (k2 , 22 ); ... =
R fmax R fmax
fmin
s(mi , mj )dmi dmj

fmin
R fmax R fmax
dmi dmj
fmin fmin
(4.10)
All the following derivation are made in the context of maximizing Q. In the numerical
experiments section (5) we consider a discrete form of this measure.
4.3
The design of a single filter for discriminating equal amplitude

sine signals
As described above we intend to specify filters which maximize the separation of signals.
Considering, first, the simpler, equal amplitudes, signals, we say that a filter separates two
signals associated with frequencies m1 and m2 , if the set of possible responses are disjoint,
\
(4.11)
{rA1 =1,m,k1 ,2 (x), x R} {rA2 =1,m,k2 ,2 (x), x R} =
In other words, if a value r is the response to a signal with frequency m1 at some location
x1 , it cannot be the response to a signal with frequency m2 for any x.
Figure 5 illustrates such a separation: The filters response to a signal with frequency m1
varies between R1 and R1 . The closest frequency (from above, with the same amplitude)
which can be separated from m1 is therefore m2 , for which the minimal response R2 is
larger tan R1 . In the same way the closest frequency which is separable from m2 is m3 .
The variability of the responses to m2 is smaller than the variability of the responses to m1
(consistent with eq. 4.4). Therefore m3 is closer to m2 than m2 to m1 . Note the logarithmic
scale.
14
0.5
0.9
0.45
0.8
0.4
0.7
0.35
0.6
0.3
0.5
0.25
0.4
0.2
0.3
0.15
0.2
0.1
0.1
0.05
0
3
2.5
1.5
0.5
0.5
0
3
1.5
(a)
2.5
1.5
0.5
0.5
1.5
(b)
Figure 6: Specifying the filters frequency too low is not advisable because it causes either
high variation (with low - fig. (a) - the lower bound on the response is slightly above X
axis, while the upper one is very high ) or low response for most of the frequencies in the
range - fig.(b)
4.3.1
Avoiding noise
The difference in response to high frequencies gets smaller and therefore, theoretically, even
very close frequencies may be separated. This is not true in practice because the real signals
are not pure sines and have additional noise harmonics. The variability due to the noise
is bounded by the magnitude of the noise component response (see Appendix B and section
4.5.1 for details). Therefore, even a small noise signal, with frequency close to the center of
the filter (k), makes very low responding signals indistinguishable from all the other signals
having a very low response.
Therefore, we specify a minimal response threshold Rmin , and shall not consider two
signals separated by some filter, if both responses of this filter are lower than Rmin . The
threshold Rmin is specified relative to the maximal response of the filter to a signal with
amplitude Amin . (This maximal response is obtained for a signal associated with frequency
k). Thus, Rmin may be regarded as the minimal relative amplitude of a signal which causes a
competing response. A typical value of Rmin chosen in our experiments is in 0.05. The filters
bandwidth may be redefined as a range of frequencies associated with signals for which the
response is at least Rmin .
The goal of this section is to find an optimal filter, with given Rmin and optimized central
frequency k and width , such that the fraction of separated frequency pairs in [fmin , fmax ]
is maximal.
4.3.2
A preliminary consideration
Based on the discussion above, one could intuitively argue that the best way to design the
filter would be to specify its central frequency k smaller than the minimal frequency fmin .
This way, all the frequency pairs are in the right side of the filter where the variability is small,
and therefore many pairs may be separated. Unfortunately, this approach is problematic
because:
1. A choice of small implies a very high variation causing most pairs to be indistinguishable; see figure 6(a).
15
rA,m,k,2
rA,m,k,2
rA,m,k,2
m
m
Figure 7: [m , m ] is a minimal interval of frequencies indistinguishable from signal with

frequency m. rA,m,k,2 are variability ranges for each of the three frequencies. In the
analysis we use a linear approximation, to estimate m.
2. On the other hand, a choice of large implies that the response is lower than Rmin for
most of the frequencies in the range; see figure 6(b)).
There is no intermediate value solving these problems, In fact, the graphs of figure 6(a)
correspond to the maximal satisfying that the middle response to all frequencies is above
Rmin = 0.05.
The other option of placing k is between fmin and fmax . In the next subsection we derive
the optimal values for k and for this case.
4.3.3
Optimal single filter derivation
Let [m , m ] be a minimal interval of indistinguishable frequencies (using a filter hk, (x)),
see Figure 7. Consider an asymptotic approximation where m m m. Then,

rA,m ,k,2 rA,m ,k,2 rA,m,k,2
Therefore,
r A,m ,k,2 r A,m ,k,2 =

1
rA,m ,k,2 + rA,m ,k,2 + 2rA,m,k,2 2rA,m,k,2
2
A linear local approximation implies that
(k+m)2 2
2
2rA,m,k,2
4Ae
4e2km
=
=
m
r
(km)2 2
2
|k m| 2
2e
2
A|k
m|

A,m,k,
m
where m
= m(m,
k, 2 ) = m m (see eq. (4.3), (4.4)).

16
(4.12)
A higher discriminability is obtained if m

is smaller. Therefore its value is sometimes
r
2
referred to as the filter indiscriminability. Note that near the filters frequency k, A,m,k,
m
0 and the range of non separability is wider. This problem is latter resolved using additional
filters. Note that if a frequency is non separable from any other frequency, then its m
is
still fmax fmin and not infinity. Therefore, the indistinguishability is redefined
m = min{m,
fmax fmin }
(4.13)
As mentioned earlier, our goal is to segregate the maximum possible amount of frequency
pairs m1 , m2 [fmin , fmax ]. As it is clear from figure 5 that there are two ranges of signal
frequencies which respond similarly to signal m2 - the first is m [m1 , m3 ] and the second
is m [m4 , m5 ]. For simplicity we ignore the second range of signals and consider every
signal in [fmin , fmax ] to be distinguishable from the signal with frequency m if its farther
than m(m, k, )/2 from m. I.e. in case on figure 5 we consider m2 separable from every
frequency smaller than m1 or larger than m3 . With this simplification, the fraction of
indistinguishable signal pairs is (proportional to)
Z fmax
m(m, k, 2 )dm,
(4.14)
fmin
and the filter which minimizes this measure, maximizes 4.10 and is thus optimal one.
By symmetry, adding the ignored second range of indistinguishable signals, change (4.14)
expression by a multiplicative constant, and hence does not matter for the optimal choice.
Requiring that the response is above the threshold Rmin , is approximately equivalent to
m [fmin , fmax ],
rA,m,k,2 > Rmin ,
(4.15)
which, considering the filters unimodal response, is satisfied if

r A,k,fmin = rA,k,fmax = Rmin .
(4.16)
Choosing these minimal responses in the range endpoints specifies the filters parameters as:
k=
fmin + fmax
2
2 =
8log(Rmin )
.
(fmax fmin )2
(4.17)
This choice provides filter parameters which are close to the optimal ones, minimizing
(4.14) expression. As explained below (section 5), the values fmax = , fmin = .1 and
Rmin = .05 are good choice for practical texture discrimination tasks. These choices lead to
a filter associated with k = 1.62 and 2 = 2.59, which are indeed close to the, numerically
obtained, optimal single filter parameters, which are k = 1.61 and 2 = 2.31. See Figure 5
for a plot of this filter response.
This filter turns out to be optimal due to the following considerations: Increasing
beyond the value (4.17) contradicts the constraint (4.16). Decreasing it increases m. Increasing k require to decrease to satisfy (4.16) again. In Appendix D it is shown that
increasing k and decreasing (and preserving r A,k,fmin = Rmin ) causes a bigger m for
every m, and therefore the filter can separate less frequency pairs (see 4.14).
17
.05
.1
m2
Discriminability of sin(m1x) from sin(m2x) using a single filter
.15
.2
.25
.3
.05
.1
.15
m1
.2
.25
.3
Figure 8: The discriminability of a single filter. For signal pairs, where the frequency of the
first signal is along x axis and the frequency of the second signal is along y axis, white pixel
value means that the signals are distinguishable using single filter and black pixel value means
they are not. Most of the indistinguishable signal pairs are associated with similar signal
frequencies or with signal frequencies which are symmetric around the filters frequency (as
ranges [m1 , m3 ] and [m4 , m5 ] on figure 5)
The conclusion is that any change of the filter specified by (4.17) degrades its discriminability. Thus a single filter for best discrimination of single sine signals with equal amplitudes in an interval [fmin , fmax ] for a preset Rmin response is specified by (4.17). The
separating power of a single filter is illustrated in Figure 8 where pairs of indistinguishable
frequencies are colored dark. As we shall see latter a combination of filters may improve
discriminability.
4.4
The design of a filter-set for signals with a single harmonic

and arbitrary amplitudes
Clearly, the discrimination of two arbitrary sine signals using a single Gabor power filter
is impossible because for every pair of frequencies m1 and m2 and for every point x, there
are always non-zero amplitudes A1 and A2 , such that rA1 ,m1 ,k,2 (x) = rA2 ,m2 ,k,2 (x). Adding
more filters can improve this situation, see Figure 9. A pair of signals indistinguishable by
the first filter may be distinguished by the second filter and vice versa. Our goal is to find
such organization of filter pairs that separates maximal number of signal pairs.
Consider a pair of signals g1 (x), g2 (x) where gi (x) = Ai sin(mi x). To discriminate between
the signals we look for two filters, hk1 ,12 , hk2 ,22 satisfying the following two conditions:
1.
|(hkj ,j2 gi )(x0 )| > Rmin i {1, 2}, j {1, 2}.
(4.18)
2. At least one of the filters (hkj ,j2 ) satisfies

{rA1 ,m1 ,kj ,j2 (x), x R}
\
{rA2 ,m2 ,kj ,j2 (x), x R} =
(4.19)
The noise consideration (described above, in section 4.3.3), require that the response is
above a some threshold. This threshold was specified relatively to the maximal response of
18
0.5
0.5
0.45
0.45
0.4
0.4
0.35
0.35
0.3
0.3
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
0
2.5
1.5
m1
0.5
0.5
0
2.5
1.5
m2 m3
1.5
1
m1
(a)
0.5
0.5
1.5
m2 m3
(b)
Figure 9: The discrimination of two arbitrary sine signals using a single Gabor power filter is
impossible. For example (a) signals m1 and m2 are separated for some amplitude. However
slight change in the amplitude makes m2 and even more distant m3 indistinguishable from
m1 . Adding another filter (b) resolves this problem for many signals. Although m1 and m2
are still indistinguishable with the first filter they can be separated using the second filter.
the minimal signal. Such response is obtained when the signals frequency coincide with the
filter center frequency. For mi = kj , the response of hkj ,j2 to gi is at most Ai /2. Therefore,
the maximal response to the minimal signal is Amin /2. Rmin was usually specified as a tenth
of this value, i.e. Amin /20. Satisfying the first condition thus implies that the signals are
observable (by these filters), and are not too sensitive to some small amplitude noise.
The second condition implies that at least one of the filters is able to distinguish between
the signals based on its response. This condition is satisfied if the relative responses satisfy
the (stronger) condition:
(
) (
)
\ rA2 ,m2 ,k1 ,2 (x)
rA1 ,m1 ,k1,12 (x)
1
,x R
, x R = ;
(4.20)
rA1 ,m1 ,k2,22 (x)
rA2 ,m2 ,k2 ,22 (x)
see Claim 1.1 in Appendix E for the proof.
The stronger condition is even more intuitive: suppose that two signals associated with
unit amplitudes satisfy 4.11. Increasing one of the amplitudes does not change the ratio and
hence leaves the signals separable. Therefore, this condition is independent of the amplitudes.
In the rest of this section we shall consider two signals as separable if their frequencies
satisfy 4.19, assuming that the minimal response condition (4.18) is satisfied. Later we shall
see that this condition is satisfied by specifying the filters properly.
Note also that 4.20 is a sufficient condition to 4.19. That is, it may be that two signal
have disjoint responses even if the ratio ranges are not disjoint and 4.19 is not satisfied. As
we shall see the filters are chosen so that the ranges of both the responses and their ratios
is small, implying, for most signals, that 4.20 is equivalent to 4.19 and not only a sufficient
condition.
19
The following calculations bring this condition into a simpler form, similar to that of the
discriminability measure (4.12) and allow relatively simple optimization on {~k, ~2 }1 parameters. Note, that since the rA,m,k,2 (x) = Ar1,m,k,2 (x), the ratio expression is amplitude
independent, although its derivation is based on signals with various amplitudes.
4.4.1
An expression for indistinguishability
The ratio of two Gabor filters responses to a signal gA,m (x) = Asin(mx), is bounded for any
x by the following upper (G+ ) and lower (G ) expressions, derived using (4.3)(4.4):
ch(mk1 12 )
sh(mk2 22 )
2
2 +m2 ) 2 (k2 +m2 ) 2
(k1
1
2
2 sh(mk1 1 )
2
G (m, ~k, ~2 ) = e
ch(mk2 22 )
G+ (m, ~k, ~2 ) = e
2 +m2 ) 2 (k2 +m2 ) 2

(k1
2
2
1
2
As done before (see 4.3) and (4.4), it is convenient to describe this (relative) response range
~k, ~2 ) and its variation G(m, ~k, ~2 ):
using its middle response G(m,

+
~ ~2
~ ~2
~k, ~2 ) = G (m, k, ) + G (m, k, )
G(m,
2
+
~
2
~
G (m, k, ) G (m, ~k, ~2 )
G(m, ~k, ~2 ) =
2
The sensitivity of the middle response to frequency may be quantified by its derivative
(in analogy to the derivation of (4.12))
2 +m2 ) 2 (k2 +m2 ) 2
2
2
~k, ~2 )
(k1
G(m,
1
2
2 m(2 1 )sh(2m2 ) 22 ch(2m2 )
2
= e
ch [m(1 + 2 )]
m
sh2 (2m2 )
2 +m2 ) 2 (k2 +m2 ) 2

(k1
1
2
2
2
sh(2m2 )
(1 + 2 )sh [m(1 + 2 )] ,
where i = ki i2 . The indiscriminability measure quantifying the undistinguishable difference

in frequency is:
m
=
=
G(m, ~k, ~2 )

~k,~2 )
G(m,
m
[m(22
12 )
(4.21)
ch [m(1 2 )]
.
22 coth(2m2 )] ch [m(1 + 2 )] + (1 + 2 )sh [m(1 + 2 )]
As before (see explanation for (4.13)) m = min{m,

fmax fmin }
1
We denote ~k = (k1 , k2 ) and ~2 = (12 , 22 )
20
4.4.2
Some useful approximations
R
We would like to to minimize cumulative indistinguishability (4.14) - m(m, ~k, ~2 )dm over
all filter parameters k1 , k2 , 12 , 22 . While we can look for this minimum, numerically, by a
dense search over the 4 dimensional parameters space, we prefer to find also some shortcuts,
which will be especially useful for the design of larger filter banks.
Reducing the search to two parameters - We first observed that specifying ki (i =
1, 2), the cumulative indistinguishability is minimized by choosing the parameters i2
(i = 1, 2) as the highest satisfying (4.15). This observation is consistent with the
result obtained for single amplitude Sine signal analysis and with intuition: Filters
with bigger 2 have smaller variability. Unfortunately, we were not able to formally
prove this claim, but it is verified by our numerical tests.
R
Using this observation we may plot the cumulative indistinguishability m(m, ~k, ~2 )dm
as a function of only two parameters k1 and k2 . See Figure 10(a). For the chosen frequency range m [.2, 3/14] the optimal parameters are k1 = 1.61 and k2 = 1.29 (or
vise versa due to symmetry).
Reducing the search to one parameter - Note that k1 gets the same value obtained
for a single optimal filter (see (4.17)). This is somewhat expected due to the following
argument. For large (say, larger than 5, which is common for most considered filters),
ch [m(2 + 1 )] sh [m(2 + 1 )] and coth(2m2 ) 1 leading to the approximation
1
ch [m(1 2 )]
|m(22 12 ) + (1 2 )| ch [m(1 + 2 )]
m = min{m,
fmax fmin }
m

(4.22)
The term ch [m(1 + 2 )] have a dominant influence on the size of m.

To minimize
m,
it should be larger which happens when its argument, m(1 + 2 ) is maximal.
Both 1 and 2 are bounded with (4.17) because of (4.15). They cannot get the same
value because then, they will have equal k and 2 , making m large due to the first
part of the expression. Therefore only one of 1 , 2 can get maximal value. Letting
2 = max and setting k2 , 22 according to (4.17), we are left with one parameter, 1 .
Performing a one parameter optimization over 1 which can only take values smaller
than 2 while hk1 ,12 satisfies (4.15), we varied k1 , and updated 12 accordingly, looking
for minimal value of integral (4.14).
The results obtained using this faster, approximation based method (see figure 10(b)) are
the same as obtained from the search on k1 , k2 (figure 10(a))and also similar to the results
of full search on k1 , k2 , 12 , 22 , maximizing explicitly the number of separated signal pairs,
described in the next section.
4.4.3
Three filters and more
For more than two filters the analysis seems too complex and we rely on numerical optimization. Essentially, the optimization looks for a set of filters such that for every frequency
21
3.01
2.51
k2
2.01
1.61
1.01
.51
global minimum
.51
1.01
1.61
2.01
2.51
3.01
k1
(a)
log(ffmax m(m,k,)dm)
m optimization by 1 parameter
10
10
min
(k1, k2, 21, 22)
10
.5
1.5
k*
2.5
3.5
(b)
Figure 10: The total discriminability of filter pairs hk1 ,12 (x), hk2 ,22 (x). (a) varying both 1
and 2 , (b) having k2 22 = 2 = max , the discriminability is optimized along k1 12 = 1
parameter, where k1 [fmin , fmax ]. Note that k1 = k = 1.29 is the optimal value and it is
equal to one of the optimal values from the first case
there is at least two different filters with different significant responses. This optimization
thus uses a modification of the conditions 4.18,4.19 and is now described. See Figure 11(c,d)
for the resulting filters (satisfying this demand).
Note that now, having more than two filters, not every filter has to cover the entire range.
Therefore, the constraint (4.16) is somewhat modified:
For any pair of signals g1 (x) = A1 sin(m1 x) and g2 (x) = A2 sin(m2 x), there are at least
two filters hki1 ,i2 , hki2 ,i2 in the filter set satisfying the following two conditions:
1
2

1. hki ,i2 gk (x) > Rmin k 1, 2 i i1 , i2
2. At least one of the filters (hki ,i2 ;
i i1 , i2 ) satisfies
o
o\n
n
rA2 ,m2 ,ki,i2 (x), x R =
rA1 ,m1 ,ki ,i2 (x), x R
22
(4.23)
0.45
.5
0.4
1
0.35
m2
R(m)
0.3
1.5
0.25
0.2
0.15
2.5
0.1
0.05
3
0
.5
0
0.5
1.5
2.5
1.5
3.5
(a)
2.5
(b)
Three filters set
Five filters set
0.5
0.5
0.45
0.45
0.4
0.4
0.35
0.35
0.3
R(m)
0.3
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
m1
R(m)
Discriminability of A1sin(m1x) from A2sin(m2x) using two optimal filters

Two filters set
0.5
0.5
1.5
2.5
3.5
0.5
1.5
2.5
3.5
(c)
(d)
Figure 11: Single-harmonic non-constraint amplitude optimal filter-sets. (a) 2 filters, (b)the
discriminability of a filters pair (c) 3 filters, (d) 5 filters. Note, that the signals associated
with low frequencies are the most problematic for discrimination with two filters. Intuitively,
one would add a third filter to cover these signals especially. The result of optimization is
in full agreement with this intuition.
The latter condition is (again) satisfied for hki1 ,i2 and hki2 ,i2 if the ratio condition
2
rA1 ,m1 ,ki1 ,i2 (x)

1
rA1 ,m1 ,ki2 ,i2 (x)

2
,x R
rA2 ,m2 ,ki1 ,i2 (x)

1
rA2 ,m2 ,ki2 ,i2 (x)

2
,x R
(4.24)
holds. These modified conditions allow to use filters with larger values, which distinguish
more signals pairs (in agreement with (4.12) and (4.22)).
4.5
The response to multi-harmonic signal
P
We
now
consider
the
response
of
a
single
filter
to
a
sum
of
harmonic
signals.
i Ai sin(mi x)+
P
j Aj cos(mj x). This response is a sum of responses to each harmonic and some additional
non-linear terms. The variability of the nonlinear part is of the same order of magnitude
as the harmonic response and cannot be neglected. Therefore the previous analysis cannot
be directly extended to these signals; see Appendices B and C. We approximate the number
of pairs of signals which are undistinguishable by Gabor filter using a variation on the
indistinguishability m derivation technique. This approximation shows that the filter sets
obtained for single harmonic signals are reasonable for use with multi-harmonic signals. The
23
approximations in this section are much rougher then in previous sections. Therefore all the
derivations for the multi-harmonic signals should be considered only as supporting evidences
for suggested filter set and not as exact calculations.
4.5.1
Single filter response to dual-harmonic signal

x2
1
e 22 +ikx , and a signal gA,
Consider first a complex Gabor filter hk (x) = 2
~m
2
~ (x) =
Asin(m1 x) + Bsin(m2 x) or gA,
~m
~ (x) = Asin(m1 x) + Bcos(m2 x). It can be shown (see
Appendix B) that the filter response is bounded with:
Ae
2
(k2 +m2
1 )
2
2
(k2 +m2
2 )
2
ch(km1 2 ) + Be
rA,
2 (x)
~ m,k,
~
ch(km2 2 )
(4.25)

2
(k2 +m21 )2

(k2 +m2
2 )
2
2
Ae

2
2
sh(km
Be
sh(km
)
1
2

2
For large , sh(km 2 ) ch(km 2 ) ekm the inequality may be rewritten in even simpler
form:

(km1 )2 2
(km2 )2 2
(km2 )2 2
(km1 )2 2

2
2
2
2
+ Be
2rA,
Be
(4.26)
Ae
2 (x) Ae
~ m,k,
~

and without loss of generality we can assume that for some A, B, m1 , m2

Ae
(km1 )2 2
2
Be
(km2 )2 2
2
(4.27)
and refer to m1 as the signals primary frequency. Therefore

r A,
2 =
~ m,k,
~
A
e
2
(km1 )2 2
2
rA,
2 = Be
~ m,k,
~
(km2 )2 2
2
(4.28)
Note, that the variability is now proportional to the highest response for the single sine case
and depends on other variables than the median response magnitude..
Note that according to our approximation, (B, m2 ) specifies only the responses variability,
therefore changing them does not make the signal distinguishable in all cases. For example,
(A, m1 , B, m2 ) is not distinguishable from (A, m1 , B , m2 ) for any B , m2 . Following the
derivations above (section 4.4, 4.3.3), the interval of indistinguishable signals with primary
frequency m1 6= m1 is
rA,k,
~ m,
~ 2
=
m
= r ~
m,
~ 2

A,k,
m1
Be
A
e
2
(km2 )2 2
2
(km1 )2 2
2
The condition (4.27) is equivalent to
Be
(km2 )2 2
2
Ae
(km1 )2 2
2
24
1,
|k m1 | 2
2
.
|k m1 | 2
(4.29)
Note that this expression relies on the approximation of the response by the approximated
bounds center value and variability. Still the expression provide useful intuition: the bound
on m decreases when m1 is distant from k, and substantially increases when it becomes
closer to k. Therefore, given a pair of signals where both frequencies m1 of both signals
are close to filters frequency k, it is less likely that the filter would separate these signals
(although it still may happen for some values of A, B, m2 ).
Optimal Filter Set on Linear Frequency Domain
Optimal Filter Set on Linear Frequency Domain

0.5
0.5
0.45
0.45
0.4
0.4
0.35
0.35
0.3
RA,m,k , 2
RA,m,k , 2
0.3
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
0.5
1.5
2.5
0
1
10
3.5
(a)
10
m
10
(b)
Optimal Filter Set on Logarithmic Frequency Domain
Optimal Filter Set on Logarithmic Frequency Domain

0.5
0.5
0.45
0.45
0.4
0.4
0.35
0.35
0.3
i
RA,m,k ,2
0.3
i
RA,m,k ,2
implying that the indiscriminability measure is roughly bounded with:
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
0.5
1.5
2.5
0
1
10
3.5
(c)
10
m
10
(d)
Figure 12: The optimal three filter sets as received by numerical optimization on linear(a) and
logarithmic(c) frequency domains. It is common to consider real signals to be logographically
spread on frequency domain. (b) and (d) shows (a) and (c) on log frequency space. Although
(a) and (c) are very similar, it is very important for multi-harmonic signals, that in (c) for
each filter i there exist another filter, responding low (but above Rmin ) at frequency ki .
Therefore, there should be another filter for the separation of such signals. That is,
for each filter hki ,i2 we prefer to have filter hkj ,j2 , such that the latter filter response to
frequencies near ki would be not too low (specifically above Rmin ) and, in the same time,
not too high (where m is smaller). We found that the filter set derived for single harmonic
case (on log frequency domain; see next section) satisfies the latter demand.
25
4.5.2
Multi-harmonic signals
It is shown in Appendix C that the filter design considerations in the case of multi-harmonic
signals are similar to those of the dual-harmonic case. The filters are likely to have difficulties
with recognition of signals, some of which frequencies are close to the central frequency of the
filter, as in the case of dual harmonic signals. Therefore, according to the reasons described
for dual harmonic signals, the filters derived for the single harmonic signals are suitable for
all harmonic signals. This conclusion provides some justification to apply the derived single
harmonic filters to the task of segmentation of general textured signals.
26
Building an optimal filters-set
In previous section we derived expressions for parameters of optimal single filter for discriminating harmonic signals with equal amplitudes. We also shown a semi analytical way to find
optimal filters pair for segmentation of harmonic signals with unconstrained amplitudes. In
this section we describe the numerical search we performed on the filters parameters to support the theoretical conclusions. Making the full search for many filters is computationally
expensive. We show how to extend the semi-analytical method (developed for finding two
filters set) so that multi-filter set can be found with relatively small complexity.
In order to find the optimal filter-sets we performed a numerical search. For combination
of filters parameters ~k, ~2 taken on some hierarchical grid, the value of a discrete version of
Q(~k, ~2 ) was calculated. In the case of single filter segregating harmonic signals with preset
amplitudes we calculated Rk,2 using (4.3,4.4) and then inserted it into (4.8). In the case of
several (2,3,4,5) filters for segmentation of signals with unconstrained amplitudes, (4.20) was
used to for evaluation of Rk,2 . Note, that the use of (4.20) allowed to evaluate s(mi , mj )
without considering actual amplitudes (see claim 1.1).
The frequencies of signal pairs were taken from a 2D uniformly spaced grid in either a
linear:
fmax fmin
lin
lin
filin = fmin + i fstep
, where fstep
=
N
or a logarithmic:
s
i

fmax
log
log
filog = fmin fstep
, where fstep
= N
fmin
frequency domains. The discrete versions of Q(~k, ~2 ) are defined as:
P
lin
lin
i,j[0...N ] s(fi , fj )
lin
Q =
N2
log
log
log
i,j[0...N ] s(fi , fj )
N2
(5.1)
(5.2)
Note that the analytically calculated filters correspond to the linear domain. Optimizing
over the logarithmic domain is advantageous because the number of separated frequencies
is preserved with the scaling process, which presents in every vision system. Therefore,
the separation power of the filter set doesnt depend on the distance from the camera to
the object, which is an important property for a filter set. There are also psychophysical
evidences that logarithmic domain is consistent with human perception. The receptive-field
profiles of simple cells in the striate cortex have an octave-like order of frequencies, which is
a way to cover uniformly the logarithmic domain. The results for the logarithmic domain
were somewhat different. For each type of frequency domains we tested two [fmin , fmax ]
bounds - [0.1, 2.82] and [0.1, 3.14]. 2
2
The considered fmax values are associated with the Nyquist frequency specified by the sampling of the
image, and fmin is to be the frequency associated with a wavelength equal to the maximal detectable texture
patch size, which we have set to 20
27
5.1
The technique of the numerical search
We searched for optimal filters parameters for filter sets of one filter for preset amplitude
signals and 2 to 5 filters for unconstrained amplitudes signals. Since each signal is described
by its frequency m (2 parameters in each search) and each filter is described by two parameters k and 2 (2,4,6,8 or 10 parameters), the complexity of search for the optimal filters set
varied between O(n4 ) to O(n12 ), where n is the number of possible values tested for each of
the parameters. This may be computationally hard. Therefore we used a hierarchical, grid
based, search.
We started with a widely spaced grid of k and 2 parameters (10 values of each) and
examined every combination of them against a rough frequencies m grid (305 305 values).
For the filters parameters with best Q measure we defined a finer grid around each parameter.
We took the optimal value and its two neighbors from the coarse grid and added two equally
spaced values between them (see figure 13). We repeated the search on the received finer grids
and repeated the grid redefinition around the optimal values. We repeated this operation
until the resolutions of filter parameter grids where of the same order as the resolution of
the frequencies grid.
fine gridX

HH XXX

?
z y

9

j i XX
H
y
y
i
yXX
X
:

XX

6 grid
X coarse

Figure 13: Converting coarse grid to fine

In the case when several different sets of filter parameters give the same Q value, we
increased the resolution of signals grid and repeated the calculations for these sets, trying
to find the best one.
5.2
Signals with equal amplitudes
In section 4.2 we derived theoretically a single optimal filter for segregation of harmonic
signals with equal amplitudes (4.17). This derivation was numerically confirmed with a
search over k parameter in [fmin , fmax ] domains and over 2 parameter in [1,100] range.
Each signal pair was examined vs. (4.9) test and the k, 2 parameters which maximized
Q(k, 2 ) were chosen as the optimal filter.
We considered different values of Rmin parameter between 5% and 20% of Rmax . We found
that while the optimal ki was not sensitive to Rmin , the i2 was. For small Rmin values the
optimized parameters are very similar to the theoretical ones; see (4.17). For bigger values
of Rmin it is preferred to decrease the filters coverage and to give up the discrimination of
some signal pairs in order to increase the segregation rate for the remaining ones.
5.3
Extension of the semi-analytical method for filter sets of three

and more filters for signals with unconstrained amplitudes
In section 4.3 we proposed a semi-analytic method for receiving the approximately optimal
filter parameters for segregation of single harmonic signals with unconstrained amplitudes.
28
(a)
(b)
Figure 14: The optimal single filters band for segmentation signals between minimal and
maximal frequencies for different Rmin . There are vertical lines on both ends of each plot
corresponding to the fmin and fmax frequencies. The curves in the center of each plot
correspond to the optimal values of k for different Rmin values. The curves on the left and
right sides of each plot correspond to the band bounds for the optimal filter width and
the optimal k. Note, that the left side curves are overlapping the fmin lines. (a) the filters
received from search on linearly spaced frequency domain. (b) the filters received from search
on log frequency domain.
The semi-analytic method is based on a search for the maximum of Q by the minimization
of the integral 4.14 which relies on the approximate expression for m (4.22). Numerical
optimization of the integral measure gave us the filter set parameters for two filters.
The latter method may be extended for bigger filter sets using the following heuristic
consideration: it is necessary and sufficient to have two different filters responding above
Rmin for every frequency in [fmin , fmax ] for segregating harmonic signals with unconstrained
amplitudes. On the other hand we prefer filters with 2 as large as possible (which results
with small bandwidth) in order to decrease the variability of its responses. Therefore, in the
optimal filter sets there will be exactly two filters responding above Rmin for each frequency.
Since the filters band is symmetric around its central frequency ki and its width is
proportional to i , it is enough to find the optimal frequencies of filter switch. For Nk
filters there are Nk 2 such points, due to the demand of two filters for each frequency. For
every choice of Nk , f1 = fmin and fNk = fmax . Suppose now that the switch frequencies
f2 , ..., fNk 1 are known. Then, the n filters set (n > 2) is build as follows (see also figure 15):
Construct one filter which covers the range [f1 , f2 ] and another filter which covers the
range [f1 , f3 ].
For i = 2 to Nk 2 construct a filter which covers the range [fl , fl+2 ].
Construct one filter which covers the range [fNk 1 , fNk ].

r
r r r qqq r r r
r-m
f1
f3
f2 f4
fNk 3fNk 1
fNk 2 fNk
Figure 15: Constructing Nk filters set

29
Between each two filter switch points there are exactly two filters, and their total inefficiency (or non quality), which is integrated indistinguishability for each frequency in this
range may be calculated by
Z fl+1
m(m, ki , kj , i2 , j2 )dm
NQl
fl
where l is the index of the range, fl is filter change frequency, m is defined by (4.22) and
hki ,i2 , hkj ,j2 are the filters responding above Rmin for frequencies fl m fl+1 . The total
inefficiency of filter set is given by
NQ =
N
k 1
X
NQl
l=1
where Nk 1 is the number of ranges specified by Nk 2 filter change points in [fmin , fmax ].
Note, that the signals associated with distant frequencies are always distinguished because
they are covered by different filters and therefore their responses to these filters are disjoint.
The optimal filter set is obtained by minimizing NQ.
For example finding optimal three filters set optimization is as follows:
1. Choose f2 [fmin , fmax ].
2. Calculate the parameters of the three filters, such that
r1,k1 ,fmin ,12 (x) = r1,k1 ,f2 ,12 (x) = Rmin
r1,k2 ,fmin ,22 (x) = r1,k2 ,fmax ,22 (x) = Rmin
r1,k3 ,f2 ,32 (x) = r1,k3 ,fmax ,32 (x) = Rmin
3. Calculate NQ = NQ1 + NQ2 .
4. Run 1-3 on all possible f2 values and select the one which minimizes NQ.
The complexity of the obtained numerical optimization is O(nNk 1 ) where n is the number
of tested frequencies. For Nk > 5 a dynamic programming algorithm of optimization would
be faster than the proposed one. However, filter sets of 6 and more filters are impractical
and because of this we didnt consider such algorithm.
The results of this algorithm for 3 and 4 filters are shown at figures 16(e), 16(g).
5.4
Numerical optimization of signals with unconstrained amplitudes
For the numerical optimization each signal pair was examined vs. (4.9) test and the ~k, ~2
parameters which maximized Q (4.10) were chosen as the optimal filters. The full numerical
search on 2Nk parameters of the Nk filters obtained values which were very similar to those
obtained by semi-analytically considerations.
30
0.5
0.45
0.45
0.4
0.4
0.35
0.35
0.3
RA,m,k,2
RA,m,k,2
0.3
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
0
0
0.5
1.5
2.5
3.5
0.5
1.5
(a)
0.5
0.5
0.45
0.4
0.4
0.35
0.35
3.5
3.5
3.5
3.5
RA,m,k,2
0.3
RA,m,k,2
0.3
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
0.5
1.5
2.5
3.5
0.5
1.5
(c)
2.5
(d)
Numerically optimal three filters set
0.5
0.5
0.45
0.45
0.4
0.4
0.35
0.35
0.3
RA,m,k,2
0.3
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
Optimal theoretical Three filters set
RA,m,k,2
Numerically optimal two filters set
0.45
0.5
1.5
2.5
3.5
0.5
1.5
2.5
(e)
(f)
Four theoretically optimal filters
Numerically optimal four filters set
0.5
0.5
0.45
0.45
0.4
0.4
0.35
0.35
0.3
RA,m,k,2
0.3
0.25
0.25
0.2
0.2
0.15
0.15
0.1
0.1
0.05
0.05
2.5
(b)
Optimal theoretical two filters set
RA,m,k,2
Numerically optimal single filter
Theoretically optimal single filter

0.5
0.5
1.5
2.5
3.5
0.5
1.5
2.5
(g)
(h)
Figure 16: Theoretical (left) vs. numerically optimized (right) filter sets
31
Commonly used octave filters
Commonly used octave filters
0.7
0.6
0.6
0.5
0.5
0.4
0.4
0.3
0.3
0.2
0.2
0.1
0.1
RA,m,k,2
0.7
0.5
1.5
2.5
0
1
10
3.5
(a)
10
(b)
1.5 octave filters
1.5 octave filters

0.5
0.5
0.4
RA,m,k,2
0.4
0.3
0.3
0.2
0.2
0.1
0.1
10
log(m)
RA,m,k,2
We searched for optimal filter sets on signal domains which were both linearly and logarithmically spaced. For both cases the optimal filter sets are very similar. We repeated
all the searches on two different [fmin ; fmax ] frequency ranges. The filters coverage of the
given domains was similar. The results for linearly spaced domain are shown at figure 16(b),
16(d), 16(f) and 16(h) against their semi-analytical estimation.
We considered different values of Rmin parameter between 5% and 20% from Rmax . For
small Rmin values the ki parameters tend to be more distant one from another and i2 values
are larger. Again, for bigger Rmin values it is preferred to decrease the filters coverage and
to give up some signal pairs in order to increase the segregation rate in the remaining ones.
0.5
1.5
2.5
0
1
10
10
log(m)
(c)
(d)
Figure 17: Commonly used filter sets. (a) The most popular octave set. (b) The same
set on log frequency domain. Note that the variability for lower frequencies is large. Each
filter responds strongly for all the frequencies smaller than its central frequency. (c) More
theoretically justified 1.5 octave set and (d) the same set on log domain.
5.5
Comparison with commonly used filter sets
As discussed in the introduction, most texture filters, based on Gabor filters are octave
filters set [19], [7], [11]. In addition to their empirical justification [7], these filters are
computationally efficient [11]. An alternative approach, suggesting to use 1.5 octave filter,
was suggested and justified for signals representation [16].
To build N-octave filter set one starts by choosing the fmin frequency. Then, the central
N
frequency of the first filter is specified by k1 = 2 2 fmin . The central frequencies of the
32
Octave Filters vs. Proposed Filters

0.1
0.1
0.05
0.05
0.05
0.05
0.1
10
0.1
10
10
0.2
0.2
0.1
0.1
0.1
0.1
0.2
10
0.2
10
10
0.4
0.1
0.2
0.05
0.2
0.05
0.4
10
0.1
10
10
10
10
10
Proposed Filters
Octave Filters
Figure 18: Commonly used octave filter set of three filters compared to the proposed filter
set of the same size (3 filters) covering the same frequency range.
following filters are calculated by ki+1 = 2N ki . The filters band associated with the iN
N
th filter is [2 2 ki ; 2 2 ki ]. This way, the ratio between the maximal and the minimal
frequencies covered by one filter is 2N . The filter response at this cutoff frequencies is 1e of
the response to the central frequency.
As one can see from figures 17,18 such commonly used filter sets are very different from
the sets we propose. Note that the common filters are equally spaced on log frequency space.
Note also that for many frequencies, the response of the common filters is associated with
high variability. In particular, it cannot be claimed for example that the response to low
frequency signals is high only for the first filter. The 1.5 octave filters have smaller variability,
and as we shall see later, they indeed perform better in texture discrimination. The proposed
filter respond more weakly to most frequencies, but the variability in responses is also much
smaller.
33
Extending the filters to 2D
In principle, we could extend our approach to two (or higher) dimensions by trying to find
optimal filters for maximally separating signals from some standard set (e.g. 2D harmonic
functions). We shall take the easier (and common [6]) approach: The basic 2D filter is
constructed as a product between the proposed 1D filter (along the x axis) and a Gaussian
along the y axis. The width of the Gaussian is specified as the width of the proposed
1D filter. Similarly to other methods, we created the 2D kernels in 6 directions by rotating
the 2D filter by 6 . The rotated x axis is denoted the filters direction. See Figure 19 for a
2D illustration of a typical 2D filter set (corresponding to the 1D filters described in Figure
16(f).
Figure 19: 2D filters corresponding to 1D filters on figure 16(e-f).
Variavility of Responses for Different Angles

0.055
Traditional Filter
Proposed Filter
0.05
0.045
0.04
0.035
R1,.1,k,2()
0.03
0.025
0.02
0.015
0.01
0.005
20
40
60
80
100
()
120
140
160
180
Figure 20: Response variability as a function of 2D filters direction applied to a sine signal.
Note that for octave filters the magnitude remains of the same order for all directions, while
for the proposed filters the responses differ up to 10 times.
Note that the response of these filters to a harmonic signal may be more variable than
their 1D counterpart. When the signals variation is not exactly in the direction of the filter,
the filters response is some combination of of the response of the proposed filter (which
is relatively non-variable) and a Gaussian (low-pass) filter (which varies a lot). For the
common (e.g. octave) filters this variability is comparable to their normal 1D variability
and therefore doesnt make a large difference. For the proposed filters, on the other hand,
this variability matters and decreases performance. See figure 20 for example of variability
values as function of filters direction to signals direction.
34
6.1
Response normalization
To reduce the variability associated with the low-pass directions, we examine the response
of each filter in several nearby image points along the direction of the filter. The set of
responses associated with the j-th filter in (xi , yi ) is denoted {Rj }i . Rj and Rj are the
i
i
mean and standard deviation of this set. Specifically, 2 + 1 points were sampled in unit
spacings along the line segment coinciding with the direction of the filter.
A low variance of these responses indicates that the texture direction is close to the
filter direction. A high variance indicates, on the other hand, that the texture direction
is different and is not expected to yield a informative response, with low spatial variability.
The detected variance is used to create two (related) distance measures associated with some
filter (denoted the jth filter):
1. Given a set of filter responses {Rj }1 ( Rj , Rj ) along filters direction in some location
1
(x1 , y1) and another response R2j in some other location (x2 , y2 ), the distance between
the responses is

j R2j
R

d8j ({Rj }1 , R2j ) = 1
(6.1)

Rj
1
2. Given two sets of filter responses {Rj }1 , {Rj }2 (to the same filter) along the filters
direction, corresponding to two locations. Then, the distance between the responses
sets is specified as

Rj Rj

88
j
j
1
2

(6.2)
dj ({R }1 , {R }2 ) = q

R2 j + R2 j
1
For a jet of J = 6N filters (including N filters rotated in 6 directions), the corresponding

distances between the the responses in two locations are
8
D ({R}1 , R2 ) =
J
X
d8j ({R}1 , R2 )
j=1
for measure (6.1) and

88
D ({R}1 , {R}2 ) =
J
X
j=1
d88j ({R}1 , {R}2)
for measure (6.2)

Normalizing by the variance, these distance measures are in a sense, similar to Mahalanobis distance using diagonal covariance matrix estimated locally.
In the experimental part we used the asymmetric measure (6.1) because in the clustering
procedure we try to fit the feature vectors to some known centers. The symmetric measure
is used in graph partitioning methods where each two nodes of the graph are of the same
importance.
35
7
7.1
Experiments
Testing texture descriptors
Retrieval test - This test, built around the image retrieval tasks. A (typically large) set of
images containing various textures is available. In a query, a texture patch is given, and
the images with a similar texture are sought. A query is successful if an image with the
(subjectively judge) same texture is found to be one of the k most similar (according
to the descriptor) images [24], or if all the images associated with this texture in the
database are found [29]. It is sufficient to find the image and no specification where
the query texture is present in the image is required.
Segmentation test - In this test, an image with several textured regions present in it, is
given, and the algorithm needs to correctly specify these regions [19]. The algorithm
may involve texture descriptor as well as spatial considerations and is essentially a
clustering procedure. The texture descriptor ability to provide an effective distance
measure is tested. Simpler segmentations tests involve semi synthetic images composed
of tiles where every tile contains some (synthetic or real) texture. Harder tests involve
real images, containing textured objects, where the texture is influenced by say 3D
structure, or object pose.
Focusing on the issue of descriptor stability under translation, which is important mostly
for segmentation, we tested the proposed descriptors using the segmentation test. Textures
changes associated with 3D structure and pose correspond to scaling and shear and are
beyond the scope of this work.
Moreover, spatial considerations such as region size, and preference to texture continuity
were not used as well. This is done so that these considerations, or a segmentation algorithm
relying on them, would not mask the differences associated with the usage of different filter
sets. Our experiments basically test the texture distance between different textures (and its
stability) compared with a similar distance within the same texture.
We tested the discrimination power of the proposed filter-sets as well as several other,
commonly used, filter-sets, using both the common L2 metric and the two new distance
measures proposed above. The experiments were done with synthetic sine images (biased
for our filters) and with the popular Brodatz textures.
Our results indicate that the proposed filter sets perform significantly better than the
commonly used methods.
In aspect of retrieval test, we compare the proposed filters versus common octave filters
set using the texture recognition algorithm by Varma and Zisserman [28]. The algorithm
is implemented exactly as described by the authors and tested with the original and the
proposed filter sets on the original database. The results for the algorithm parameters
reported in the article are similar for both filter sets. However the proposed filters obtain
very close results using much more compact texture representation (smaller runtime), unlike
the octave filters.
36
(a)
(b)
(c)
Figure 21: Synthetic sines image (a) and its 16-means segmentation using common (b) and
the newly proposed (c) features segmented with L2 distance measure.
7.2
Synthetic sine images
To demonstrate the stability available with the use of the proposed filter, we started with
a test image composed of sine textures; see Fig. 21(a). (This example is clearly biased
towards favoring our method, constructed with Sine signals as a model, but other example
will follow below). This image is segmented using 16-means algorithm using the simplest
distance measure: L2 distance metric between the feature vectors. The results corresponding
to a traditional one octave filters and the proposed filters (over the same frequency range),
are shown in Fig. 21(b)) and Fig. 21(c)), respectively. The high intra-texture variability
associated with the traditional filters is clear.
7.3
Brodatz images
We now turned to the standard Brodatz textures and considered the commonly used set of
16 texture patch described in Fig. 22. We tested these textures in a pairwise discrimination
task as well as in one-against all segmentation tests.
Figure 22: Patch of 16 Brodatz images we used in experiments
7.3.1
Clustering texture pairs
Here we considered the 120 different pairs of 128 128 texture patches; see Fig. 23 for an
an example of such a pair. A vector of filter responses was calculated for every point.
37
Texture patches classification success
Texture patches classification success
120
120
100
100
Number of texture pairs
Number of texture pairs
Figure 23: Example pair of patches of Brodatz textures
80
60
40
20
0
0.55
our [0.1; 2.9]

our [0.17; 2.0]
1.5 octave [0.1; 2.9]
1 octave 4 filters [0.17; 2.9]
1 octave [0.17; 2.0]
our 2 filters [0.1;2.82]
1.5 octave 2 filters
1 octave 2 filters
0.6
0.65
0.7
0.75
80
60
40
20
0.8
0.85
0.9
0.95
0
0.55
Percent of successfully classified pixels
our[0.1; 2.9]
our[.17; 2.0]
1.5 octave[0.1; 2.9]
1 octave [0.17; 2.0]
our[.17; 2.0] new metric
1 octave[.17; 2.0] new metric
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
Percent of successfully classified pixels
(a)
(b)
Figure 24: Segmentation success rates for Brodatz patches. (a) shows the results with L2
distances for different filter sets. (b) shows the results for the new distance measure compared
to L2 for best filter sets.
The vectors were clustered using the K-means (K = 2) algorithm. After convergence, we
labelled each cluster as either left of right depending on the origin of the pixels which are the
majority in this cluster. The success rate is the fraction of pixel classified into the correct
cluster. Note that, according to this definition, the worst segmentation quality is 50%.
We performed such tests for several filter-sets developed according to our design, for
the popular 1-octave filter-sets [7, 19] and for the 1.5- octave filter-set, claimed to be more
mathematically justified in [16]. For the comparison, the proposed filter sets were constructed
for the same frequency ranges associated with the traditional filters and contained the same
number of filters.
The tests were performed for both common L2 distance metric and for the asymmetric
adaptive distance measure (6.1) on all the filter-sets.
The results, described in Figure 24, show the quality of clustering associated with the
various representations. Specifically, for every level of clustering quality, measured by the
success rate, it shows the number of texture pairs which are clustered better than this level.
Ideally all pairs should be clustered perfectly so that the clusters are pure and the success rate
is 1. This never happens but the clusters obtained with the proposed filters approximate this
ideal behavior much better than the clusters constructed with 1-octave filters or 1.5-octave
filters.
The performance difference is large for the L2 distance and is even larger for the proposed
new distance. Note that the performance of common filter-sets remained the same with both
38
100
Number of Texture Pairs
Texture Patches Classification Success

120
80
60
40
20
1 octave 3 filters meanshift L2 [0.1; 2.0]
proposed 3 filters meanshift adaptive measure [.17; 2.0]
proposed 3 filters meanshift adaptive measure [.1; 3.14]
0
0.55
0.6
0.65
0.7
0.75
0.8
0.85
0.9
0.95
Percent of Successfully Classified Pixels
Figure 25: Segmentation success rates for Brodatz patches with meanshift clustering for
three filter sets.
distances, while for the proposed filters, the adaptive distance is preferred.
Note, for example, that 106 pairs out of the 120 (89%) were clustered better than 80%
with the proposed filter set and the adaptive distance. On the other hand, such accuracy
was achieved by only 49 of 120 pairs (41%) using the 1-octave filters (both filter sets were
developed over the same range and include the same number of filters (3)).
These results were obtained with kmeans procedure initialized by the average of the
texture vectors (one patch for each cluster). This is justified because we do not proposed here
the k-means as a clustering algorithm but just test the clustering properties of the various
representation. Similar results were obtained with randomly initialized seeds.
Figure 25 shows results for similar experiment where meanshift clustering was used instead of K-Means. The filters responses were clustered by meanshift to several groups and
these groups were attached to one of the two patches according to majority of its pixels. The
success rate is the ratio of the sum of the correctly clustered pixels to the total image size.
This experiment also shows the advantage of the proposed method on the traditional one.
Another important issue is that the meanshift finds much less (usually half) clusters while
using the proposed filters and adaptive distance measure. This implies that the proposed
method provides more compact texture representation using textons methods.
7.3.2
Primitive segmentation quality
In this experiment we tested how well the distance between two texture vectors serves as a
classifier telling whether they belong to the same texture. In this experiment we considered
L2 and the proposed symmetric metric (6.2).
To carry out this test, we considered the 4 4 patch image (figure 22), chose a sample
of 4 4 pixels from each patch, calculated the distance from each of these pixels to all the
others. This distance was compared to a threshold. The results was considered as positive
(same texture) if the distance was below the threshold and as negative otherwise.
The ROC graphs in figure 26 show the false negatives as a function of the false positives
for different values of the threshold. A ROC curve is better if it is closer to the axes and the
origin. It is apparent that the classifiers constructed using the proposed filter set (with the
39
our filters (0.192.0) new symmetric distance

1.0 octave L2 metric
1.5 octave L2 metric
our filters (0.192.0) L2 metric
1.0 octave new symmetric distance
0.9
0.8
0.7
False Negatives
RoC
1
0.6
0.5
0.4
0.3
0.2
0.1
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
False Positives
Figure 26: ROC curve for Brodatz patches

two distance measures) are better than those built with the common filters.
7.3.3
Stability test
In this test we choose a representative feature vector for each texture patch and attach each
pixel in the image to the closest representative among those 16 vectors. This was done with
the L2 norm and with the proposed asymetric adaptive distance (6.1).
First, the representative feature vectors were the average feature vectors in each patch;
see Figure 27 (a,b). For the proposed filter set, this choice yields much better segmentations
of the image than that obtained with the common 1-octave filters.
In the second case, which results are shown at figure 27(c-h), the representative feature
vectors were randomly chosen from each patch. This method is not deterministic because
for each run the representative vectors are different. The results show that the proposed
method provides much more robust segmentation of the image into correct patch squares.
7.4
Texture recognition test
For this test we implemented the texture recognition algorithm proposed by Varma and
Zisserman [28] and partially based on earlier work by Leung and Malik [17]. We compared
the recognition results obtained for octave filters (the filters used in the original work) to the
results obtained for the proposed filters while preserving all other components of the original
work.
40
(a)
(b)
(c)
(d)
(e)
(f)
(g)
(h)
Figure 27: Stability test for Brodatz patches. (a-b) Supervised test: each pixel was attached
to the closest average response from each patch. (a) the proposed filter set with new distance
measure. (b) one octave filter set with L2 distance metric.(c-h) Unsupervised test: each pixel
was attached to the closest random response from each patch. In the lower row (f-h) the
standard octave filter set (3 frequencies) with L2 distance measure was used. In the upper
row (c-e) the proposed filter set (3 filters, the same frequencies range) with new distance
measure was used.
41
Figure 28: Varma-Zisserman algorithm scheme.

7.4.1
Brief description of the algorithm
The scheme of the algorithm is presented in figure 28. The database, which is a subset
of CUReT (Columbia-Utrecht Reflectance and Texture Database [5]), contains 20 classes
of textures; 92 samples for each class. Each class is divided to 46 training and 46 test
images. For each of the images Gabor features are calculated using the tested filter set.
Some examples of the database images are shown in figure 29, and it is easy to see that
the database is hard for recognition task even with the human eye (the samples of the same
texture class have different scales and photographing angles). As one can see there are some
examples of the same class which have a very different appearance, while some examples of
different classes look similar.
In the learning stage, the representative K feature vectors for each class are obtained by
K-Means clustering of feature vectors of a subset of the training images. The 20K feature
vectors obtained from all classes represent, together, all the textures. Each pixel of each
training image is labelled with an index of the representative feature vector which is closest
to it. The histogram of features distribution for each training texture sample is stored with
the corresponding texture class label.
In the recognition stage, every pixel of the test images is labelled with the index of the
closest representative feature vector, and the distribution of the features is calculated. The
same features as in the learning stage are used in the recognition stage. For each test image
its features histogram is 2 compared to the histograms stored during the learning stage.
The class referred by the closest sample histogram is considered as the recognized class. The
fraction of the correctly recognized images from 46 20 testing samples is considered as the
success rate associated with the tested filter set. See [28] for more details.
42
(a)
(b)
Figure 29: CUReT images examples. The upper image contain four sample images of the
same class. The bottom image contain four samples from different classes. As one can see
sometimes the images in the same class have a very different appearance, while images in
different classes are very similar
7.4.2
Experiments setup
As first stage we repeated the recognition system exactly as it was proposed in the article
[28] and tested it on the same database. The obtained results were similar to those reported
by the authors, thus we concluded that our implementation is correct. We used a variation of
the filter set denoted RFS in the original work, with the omission of the two isotropic filters.
The best result reported by the authors for 20 classes recognition is 98.37%; we obtained
96.6% without isotropic filters.
In the second stage we replaced the filter set with the filter sets proposed here (sets with
three and five filters). We considered six texture recognition methods:
1. The original octave filters, 18 complex responses (36D feature space), L2 distance.
2. The original octave filters, absolute values of complex responses (18D feature space),
L2 distance.
3. The proposed filter set with three filters, covering frequencies range equal to the range
of the octave filters set (18D feature space), L2 distance.
4. The proposed filter set with three filters, covering frequencies range equal to the range
of the octave filters set (18D feature space), adaptive distance measure.
5. The proposed filter set with five filters, covering frequencies range [.1, 3.14] (30D feature
space), L2 distance.
43
0.95
0.95
0.9
Success rate (%)
Success rate (%)
0.85
0.8
Octave Complex L2
Octave L2
Our 3 L2
Our 3 AM
Our 5 L2
Our 5 AM
0.75
0.7
0.9
0.85
Octave3 complex L2 (38D)

Octave3 L2 (20D)
MHD3 L2 (20D)
MHD3 AM (20D)
MHD5 L2 (32D)
MHD5 AM (32D)
0.8
0.75
10
Number of features for each class
10
Number of features for each class
(a)
(b)
Figure 30: Recognition tests using Varma-Zisserman algorithm. (a) Recognition rate using
13 random training images for features extraction. (b) Recognition rate using all 46 training
images for features extraction.
6. The proposed filter set with five filters, covering frequencies range [.1, 3.14] (30D feature
space), adaptive distance measure.
In this test we checked the dependence of the recognition rate on the number of representative feature vectors, K, taken for each texture class. Figure 30 (a) shows that for
a small number of features the proposed filters provide better recognition rates than the
octave filters, but after that, for larger number of representative texture prototypes, the set
proposed in [28] (using octave set and complex responses) wins.
7.4.3
Modified representative features extraction
The image database used in these experiments contain images of the same texture taken with
different illumination, angle and scale. Therefore the frequency domain signature of different
examples of the same texture may change as a function of these three parameters. If the
initial learning group doesnt contain an example of some scale or angle of photographing, the
feature set learned for this class will not contain all possible response values, and therefore
the learned features will not be representative enough. In the case of octave filters this may
be less problematic, since the responses of octave filters for many frequencies are overlapping.
Note that the training in [28] requires all 46 images in each class, but uses only some of them
for the feature extraction. In the case of the proposed filters this is a serious problem, because
the response ranges are mostly disjoint.
To overcome this problem we repeated the experiments using all the training images
to find the most representative feature vectors for each class. Figure 30 (b) shows that in
this case the proposed filters out-perform the octave ones for small number of features and
provide the same performance for more features.
44
Conclusions
A novel method for constructing set of Gabor filters was proposed. Unlike the commonly
used filter sets which concentrate on the optimal representation (in MSE sense) of the signals, the main consideration in constructing the proposed set is maximizing the number of
distinguished signal pairs.
The method is based on analytical and semi-analytical derivations of the optimal filter
sets for separating harmonic signals. It was also analytically shown that the proposed filters
are reasonable for signals with several harmonics. The proposed filter sets are extended to
2D and a special, based on analytical properties of the filters, Mahalanobis-like distance
measure was proposed to compare the resulting features.
It was shown in experiments with real-life texture samples that the proposed filters are
superior than the commonly used ones in the task of image segmentation, even using such
primitive methods as K-means. It was also shown that for a reasonable small set of textures,
the proposed method provides a stable recognition of texture members, given representative
of each texture. Note, that the proposed filters performed better using a common L2 metric
for features comparing. Using a new adaptive distance measure improved the results for the
proposed filters and didnt change the results for the commonly used filters.
The difference in filter set building considerations caused that the obtained filter sets
are significantly different from the commonly used ones. The representation based filter sets
achieve better reconstruction of the texture from the response coefficients in each location on
the texture, but this may cause a very different responses in different locations. The response
distribution comparing methods deal with this phenomenon, but they are computationally
hard. The responses of the proposed filters to textured signals are less varying with translation than the responses to the commonly used filters. This allows to use a single feature
vector to describe each texture. In context of segmentation of natural textured images this
is a simpler and faster method.
The proposed method fits the main goal of the work: the distance between feature
vectors of pixels from the same texture is mostly smaller than the distance to pixels from
other textures. However, since the goal was to describe each texture by a single feature
vector, this method may work worse than the distribution based ones for textures with large
variability of responses.
The next step in this work seems to be an integration of the proposed filter sets into
complex segmentation algorithms, mentioned in the beginning of the work. Another direction
may be development of a new fast segmentation algorithm which would use the special
properties of the proposed features, such as a good separation using a simple threshold
criterion. Considering geometrical properties of compared pixels should increase even more
the high distinguishability power of the proposed features. The segmentation tasks usually
imply that inter texture distances are small, therefore the stability of responses in the same
texture of the proposed features should increase the segmentation quality.
45
Acknowledgements
We would like to thank the creators of the CUReT database for enabling us to use it, and
to Manik Varma, who let us use the processed version of their database as well. We would
like to thank Ilan Shimshoni for providing us an implementation of meanshift clustering
algorithm. We thank Israeli science foundation for financial support of this research.
46
A response to single harmonic signal

x2
1
e 22 +ikx response to a signal gm,A (x),
The absolute value of the Gabor filter hk,2 (x) = 2
2
is
q
|hk,2 (x) gm,A (x)| = (Re(hk,2 (x) gm,A (x)))2 + (Im(hk,2 (x) gm,A (x)))2
For gm,A (x) = Asin(mx) the Real part response is:

=
Re(hk,2 (x) gm,A (x))
Ae
x2
e 22 sin(kx) Asin(mx)
2Z2
2 2 Z
e 22 sin(k)sin(m(x )) d
2
e 22 sin(k) {sin(mx)cos(m) cos(mx)sin(m)} d
2 2
Z
Asin(mx) 22
=
e 2 sin(k)cos(m) d
2 2
Z
Acos(mx) 22

e 2 sin(k)sin(m) d
2 2
Z
Asin(mx) 22
=
e 2 {sin((k + m)) + sin((k m))} d
8 2
Z
Acos(mx) 22
e 2 {cos((k m)) cos((k + m))} d

8 2
Z
Acos(mx) 22
=
e 2 {cos((k + m)) cos((k m))} d
8 2

(k+m)2 2
(km)2 2
A
2
2
e
e
cos(mx)
=
2
o
A (k2 +m2 )2 n km2
2
2
e
e
ekm cos(mx)
=
2
where * is due to
R
0
2 x2
ea
(k2 +m2 ) 2
2
cos(mx)dx =
sh(km 2 )cos(mx)
m2
e 4a
2a
47
For gm,A (x) = Acos(mx) the Real part response is:

=
Re(hk,2 (x) gm,A (x))
x2
2 2
e 22 sin(kx) Acos(mx)
m(x )) d
2
n
o
A 22
2
e sin(k) sin( )cos(m(x )) cos( )sin(m(x )) d
=
2
2
2 2
A
e 22 sin(k) {cos(m(x ))} d
=
2 2
A 22
e 2 sin(k) {cos(mx)cos(m) sin(mx)sin(m)} d
=
2 2
Acos(mx) 22
=
e 2 sin(k)cos(m) d
2 2
Asin(mx) 22

e 2 sin(k)sin(m) d
2 2
Acos(mx) 22
=
e 2 {sin((k + m)) + sin((k m))} d
8 2
Asin(mx) 22

e 2 {cos((k m)) cos((k + m))} d
8 2
Asin(mx) 22
=
e 2 {cos((k + m)) cos((k m))} d
8 2

(k+m)2 2
(km)2 2
A
2
2
e
=
e
sin(mx)
2
o
A (k2 +m2 )2 n km2
2
2
e
e
ekm sin(mx)
=
2
where * is due to
R
0
Ae
2 x2
ea
2 2
e 22 sin(k)sin(
(k2 +m2 ) 2
2
cos(mx)dx =
sh(km 2 )sin(mx)
m2
e 4a
2a
48
For gm,A (x) = Asin(mx) the imaginary part response is:

Im(hk,2 (x) gm,A (x)) =
=
=
=
=
=
=
=
=
x2
e 22 cos(kx) Asin(mx)
2 2 Z
2
A
e 22 cos(k)sin(m(x )) d
2 2 Z
2
A
e 22 cos(k) {sin(mx)cos(m) cos(mx)sin(m)} d

2 2
Z
Asin(mx) 22
e 2 cos(k)cos(m) d
2 2
Z
Acos(mx) 22

e 2 cos(k)sin(m)) d
2 2
Z
Asin(mx) 22
e 2 {cos((k + m)) + cos((k m))} d

8 2
Z
Acos(mx) 22

e 2 {sin((k m)) + sin((k + m))}cos(mx) d
8 2
Z
Asin(mx) 22
e 2 {cos((k + m)) + cos((k m))} d

8 2

(km)2 2
(k+m)2 2
A
2
2
e
+e
sin(mx)
2
o
A (k2 +m2 )2 n km2
km2
2
e
sin(mx)
e
+e
2
= Ae
(k2 +m2 ) 2
2
ch(km 2 )sin(mx)
49
For gm,A (x) = Acos(mx) the imaginary part response is:

Im(hk,2 (x) gm,A (x)) =
=
=
=
=
=
=
=
x2
e 22 cos(kx) cos(mx)
2 2 Z
2
e 22 cos(k)sin( m(x )) d
2
2
2 Z
o
n
2
A
2
2
e cos(k) sin( )cos(m(x )) cos( )sin(m(x )) d

2
2
2 2 Z
2
A
e 22 cos(k) {cos(mx)cos(m) sin(mx)sin(m)} d

2 2
Z
Acos(mx) 22
e 2 cos(k)cos(m) d
2 2
Z
Asin(mx) 22

e 2 cos(k)sin(m) d
2 2
Z
Acos(mx) 22
e 2 {cos((k + m)) + cos((k m))} d

8 2
Z
Asin(mx) 22

e 2 {sin((k m)) + sin((k + m))} d
8 2
Z
Acos(mx) 22
e 2 {cos((k + m)) + cos((k m))} d

8 2

(km)2 2
(k+m)2 2
A
2
2
e
+e
cos(mx)
2
o
A (k2 +m2 )2 n km2
km2
2
cos(mx)
e
+e
e
2
= Ae
(k2 +m2 ) 2
2
ch(km 2 )cos(mx)
Thus, for gm,A (x) = Asin(mx) the absolute value of the response is:
q
= A e(k2 +m2 )2 sh2 (km 2 )cos2 (mx) + e(k2 +m2 )2 ch2 (km 2 )sin2 (mx)
|hk,2 (x) gm,A (x)|
(k2 +m2 ) 2 p
2
= Ae
sh2 (km 2 )cos2 (mx) + ch2 (km 2 )sin2 (mx)
r
(k2 +m2 ) 2
1
1
2
= Ae
(ch(2km 2 ) 1)cos2 (mx) + (ch(2km 2 ) + 1)sin2 (mx)
2
2
r
2
2
2
(k +m )
1
1
2
= Ae
ch(2km 2 ){cos2 (mx) + sin2 (mx)} + {sin2 (mx) cos2 (mx)}
2
2
r
2
2
2
(k +m )
1
1
2
= Ae
ch(2km 2 ) cos(2mx)
2
2
(k2 +m2 ) 2 p
2
ch2 (km 2 ) cos2 (mx)
= Ae
50
and for gm,A (x) = Acos(mx) the absolute value of the response is:
q
= A e(k2 +m2 )2 sh2 (km 2 )sin2 (mx) + e(k2 +m2 )2 ch2 (km 2 )cos2 (mx)
|hk,2 (x) gm,A (x)|
(k2 +m2 ) 2 p
2
= Ae
sh2 (km 2 )sin2 (mx) + ch2 (km 2 )cos2 (mx)
r
(k2 +m2 ) 2
1
1
2
= Ae
(ch(2km 2 ) 1)sin2 (mx) + (ch(2km 2 ) + 1)cos2 (mx)
2
2
r
2
2
2
(k +m )
1
1
2
= Ae
ch(2km 2 ){cos2 (mx) + sin2 (mx)} + {cos2 (mx) sin2 (mx)}
2
2
r
2
2
2
(k +m )
1
1
2
= Ae
ch(2km 2 ) + cos(2mx)
2
2
(k2 +m2 ) 2 p
2
ch2 (km 2 ) sin2 (mx)
= Ae
A response to dual-harmonic signal

2
The absolute value of the Gabor filter hk,2 (x) =

signal gm,
~ (x) = Asin(m1 x) + Bsin(m2 x), is
~ A
x
+ikx
1
2 2
e
2
2
response to a dual harmonic
|hk,2 (x) gm,

~ (x)| = |hk,2 (x) Asin(m1 x) + hk,2 (x) Bsin(m2 x)|
~ A
(
2
2
2
(k2 +m2
(k2 +m2
2 )
1 )
2
2
2
sh(km1 )cos(m1 x) + Be
sh(km2 )cos(m2 x)
=
Ae
2 ) 21

2 +m2 ) 2
2
(k
(k2 +m2
)
2
1
2
2
ch(km1 2 )sin(m1 x) + Be
ch(km2 2 )sin(m2 x)
+ Ae
(
2
2 !
2
2
(k2 +m2
(k2 +m2
1 )
1 )
2
2
2
2
sh(km1 )cos(m1 x) + Ae
ch(km1 )sin(m1 x)
=
Ae
+
Be
+2Ae
2
(k2 +m2
2 )
2
2
(k2 +m2
1 )
2
sh(km2 )cos(m2 x)
2
sh(km1 2 )cos(m1 x)Be
2
(k2 +m2
1 )
2
+ Be
2
(k2 +m2
2 )
2
2
(k2 +m2
2 )
2
2
(k2 +m2
2 )
2
ch(km2 )sin(m2 x)
2 !
sh(km2 2 )cos(m2 x)
2
12
+2Ae
ch(km1 )sin(m1 x)Be
ch(km2 )sin(m2 x)
n

2
2 2
2
2 2
= A2 e(k +m1 ) ch2 (km1 2 ) cos2 (m1 x) + B 2 e(k +m2 ) ch2 (km2 2 ) cos2 (m2 x)
+2Ae
2
(k2 +m2
1 )
2
Be
2
(k2 +m2
2 )
2
sh(km1 )cos(m1 x)sh(km2 )cos(m2 x) + ch(km1 2 )sin(m1 x)ch(km2 2 )sin(m2 x)
51
21
since ch(x) > 0 for all x-s,we can bound the received result with:
n
2
2 2
2
2 2
max{|h g|} = A2 e(k +m1 ) ch2 (km1 2 ) + B 2 e(k +m2 ) ch2 (km2 2 )
2
(k2 +m2
1 )
2
2
(k2 +m2
2 )
2
+Ae
Be
ch(k (m1 + m2 )) + ch(k (m1 m2 ))
n
2
2 2
2
2 2
= A2 e(k +m1 ) ch2 (km1 2 ) + B 2 e(k +m2 ) ch2 (km2 2 )
21
(k2 +m2
) 2
(k2 +m2
) 2
1
2
2
2
Be
ch(km1 2 )ch(km2 2 )
+2Ae
= Ae
2
(k2 +m2
1 )
2
ch(km1 2 ) + Be
2
(k2 +m2
2 )
2
1
2
ch(km2 2 )
n
2
2 2
2
2 2
min{|h g|} = A2 e(k +m1 ) sh2 (km1 2 ) + B 2 e(k +m2 ) sh2 (km2 2 )
1
2
2
(k2 +m2
(k2 +m2
2
2 )
1 )
2
2
2
2
Be
sh(km1 )sh(km2 )
2Ae

2

(k2 +m21 )2 2
(k2 +m2
2 )
2
2
2

2
2
sh (km1 ) Be
sh (km2 )
= Ae
2
The absolute value of the Gabor filter hk,2 (x) =
52
x
+ikx
1
2 2
e
2
2
response to a dual har-
monic signal gm,

~ (x) = Asin(m1 x) + Bcos(m2 x), is
~ A
|hk,2 (x) gm,
~ (x)| = |hk,2 (x) Asin(m1 x) + hk,2 (x) Bcos(m2 x)|
~ A
(
2
2
2
(k2 +m2
(k2 +m2
1 )
2 )
2
2
2
=
Ae
sh(km1 )cos(m1 x) + Be
sh(km2 )sin(m2 x)
2 ) 12

2 +m2 ) 2
2
(k
(k2 +m2
)
2
1
2
2
ch(km1 2 )sin(m1 x) + Be
ch(km2 2 )cos(m2 x)
+ Ae
(
2
2 !
2
2
(k2 +m2
(k2 +m2
1 )
1 )
2
2
2
2
sh(km1 )cos(m1 x) + Ae
ch(km1 )sin(m1 x)
=
Ae
+
Be
2
(k2 +m2
1 )
2
+2Ae
+2Ae
2
(k2 +m2
2 )
2
sh(km2 )sin(m2 x)
2
sh(km1 2 )cos(m1 x)Be
2
(k2 +m2
1 )
2
2 !

2
(k2 +m2
2 )
2
2
ch(km2 )cos(m2 x)
+ Be
2
(k2 +m2
2 )
2
sh(km2 2 )sin(m2 x)
2
(k2 +m2
2 )
2
21
ch(km1 )sin(m1 x)Be

ch(km2 )cos(m2 x)
n

2
2 2
2
2 2
= A2 e(k +m1 ) ch2 (km1 2 ) cos2 (m1 x) + B 2 e(k +m2 ) ch2 (km2 2 ) sin2 (m2 x)
+2Ae
2
(k2 +m2
1 )
2
Be
2
(k2 +m2
2 )
2
sh(km1 )cos(m1 x)sh(km2 )sin(m2 x) + ch(km1 2 )sin(m1 x)ch(km2 2 )cos(m2 x)
since ch(x) > 0 for all x-s,we can bound the received result with:
n
2
2 2
2
2 2
max{|h g|} =
A2 e(k +m1 ) ch2 (km1 2 ) + B 2 e(k +m2 ) ch2 (km2 2 )
21
2
2
(k2 +m2
(k2 +m2
1 )
2 )
2
2
2
2
+2Ae
Be
ch(km1 )ch(km2 )
= Ae
2
(k2 +m2
1 )
2
ch(km1 2 ) + Be
2
(k2 +m2
2 )
2
ch(km2 2 )
n
2
2 2
2
2 2
A2 e(k +m1 ) sh2 (km1 2 ) + B 2 e(k +m2 ) sh2 (km2 2 )
1
2
2
(k2 +m2
(k2 +m2
2
1 )
2 )
2
2
2
2
2Ae
Be
sh(km1 )sh(km2 )

2

(k2 +m21 )2 2
(k2 +m2
2 )
2
2
2
2
2
sh (km1 ) Be
sh (km2 )
= Ae
min{|h g|} =
53
21
A response to multi-harmonic signal
In analogy to the bounds on the response of a single sine signal to a filter, described in
previous appendix, we have developed an expression for a filter response to a signal which
is a sum of sines with different amplitudes.
P
x2
+ikx
1
2 2
Consider a complex Gabor filter hk (x) = 2
, and a signal gA,
e
~m
2
i Ai sin(mi x).
~ (x) =
It can be shown that the filter response:
rA,k,
~ m
~ (x) = hk (x) gA,m (x)
(
X

2
2 2
=
A2i e(k +mi ) ch2 (kmi 2 ) cos2 (mi x)
(C.1)
Ai e
2
(k2 +m2
i )
2
Aj e
2
(k2 +m2
j )
2
i,j;i6=j

ch[k 2 (mi + mj )]cos[(mi mj )x]
) 21
ch[k 2 (mi mj )]cos[(mi + mj )x]
The upper bound of this expression is:

)
(

2
X
X
(k2 +m2
i )

2
max hk (x)
Ai e
Ai sin(mi x) =
ch(kmi 2 )

i
A lower bound depends in a complex way on the combination of mi frequencies and

amplitudes Ai . We could not find a useful lower bound and just use the trivial one, which
is zero.
Thus,
N
2
X
(k2 +m2
i )
2
Ai e
ch(kmi 2 ) rA,
(C.2)
2 (x) 0
~ m,k,
~
i=1
Relating on the upper bound as the actual maximal response and use the trivial lower bound
implies that

2
(k2 +m2
PN
i )
1
2
2
ch(kmi )
rA~1 ,k,m,
i=1 Ai e
~ 2 2
rA,k,
~ m,
~ 2
PN
i=1 Ai e
2
(k2 +m2
i )
2
ch(kmi 2 )
Applying m estimation to this approximation:

rA,
2
~ m,k,
~
PN
2
(k2 +m2
i )
2
ch(kmi 2 )

m
=
2
2
2
(k +m )
PN

i
rA,k,
2
~ m,
|k sh(kmi 2 ) m1 ch(kmi 2 )| 2
~ 2
i=1 Ai e
i=1 Ai e
54
(C.3)
Note, that this time the middle response is multidimensional function. To obtain a linear
approximation for its change tempo we should consider the direction of maximal change,
i.e. the gradient. Note also, that the received expression is an upper bound on m. The
dividend may be only smaller. The divisor an absolute value of a gradient of some Gaussianlike function, which tails are always near zero, but the maximal value may only increase.
The absolute value of the gradient of such function may only increase. Therefore the total
fraction is bounded from above by the given expression.
2
For large (sh(kmi 2 ) sh(kmi 2 )) 12 ekmi ) , it may be simplified to
PN
(kmi )2 2
2
A
e
i=1 i
.
(C.4)
m P
(kmi )2 2
N
2
2
|k mi |
i=1 Ai e
The obtained expression gets its maximum at k = mi similarly to the two frequencies
case (4.29). Thus, the filters design considerations for dual-harmonic signals are identical to
those of the multi-harmonic case.
Optimality of the chosen single filter
If a response in fmin have to be equal Rmin , the 2 of the filter is given by:
2log(Rmin )
2 =
(k fmin )2
If we increase k by the new 2 would be:

2log(Rmin )
2 =
(k + fmin )2
The m measure for fmin and filter k, is given by:
(D.2)
4log(Rmin )kfmin
2(kfmin )2
e2kfmin
e
m =
=
min )
(k fmin ) 2
(k fmin ) 2log(R
(kfmin )2
(D.3)
) (kfmin )
fmin k C [(k+fmin
2(kfmin )2
e
C

(k+fmin )2
1
fmin k C 2(kf
2
2
min )
e
=
C
and filter k + , is given by:
2]
(D.4)
The m measure for fmin
(D.1)
(D.5)

(k++fmin )
1
fmin k C 2(k+f
2 2
min )
=
(D.6)
m = e
e
C
Since Rmin is a small number, its log is a negative number. Therefore C is a negative
constant. Both k + fmin and k fmin are above zero, therefore
2
(k++fmin )2
(k + + fmin )2
(k + fmin )2
<
(D.7)
(k + fmin )2
(k fmin )2
and thus m < m because its exponent value is less negative and its non-exponent factor
is smaller. Therefore increasing of the filters frequency and preserving the given response
at fmin would increase m in filters responses.
55
Ratio theorem
Theorem 1.1.
(
rA1 ,m1 ,k1 ,12 (x)

,x R
rA1 ,m1 ,k2 ,22 (x)
{rA1 ,k1 ,m1 ,12 (x), x R}
{rA2 ,k1 ,m2 ,12 (x), x R} =
or
T
{rA1 ,k2 ,m1 ,22 (x), x R}
rA2 ,m2 ,k1 ,12 (x)

,x R
rA2 ,m2 ,k2 ,22 (x)
{rA2 ,k2 ,m2 ,22 (x), x R} =
Proof: Suppose the claim is wrong, then w.l.g.

(
)
(
)
rA1 ,m1 ,k1 ,12 (x)
rA2 ,m2 ,k1 ,12 (x)
T
,x R
,x R =
rA1 ,m1 ,k2 ,22 (x)
rA2 ,m2 ,k2 ,22 (x)
{rA1 ,k1 ,m1 ,12 (x), x R}
{rA1 ,k2 ,m1 ,22 (x), x R}
which is:
and
T
and
T
min{rA1 ,m1 ,k1 ,12 (x)}

max{rA1 ,m1 ,k2 ,22 (x)}
{rA2 ,k1 ,m2 ,12 (x), x R} =

6
{rA2 ,k2 ,m2 ,22 (x), x R} =
6
>
max{rA2 ,m2 ,k1 ,12 (x)}

min{rA2 ,m2 ,k2 ,22 (x)}
and
min{rA1 ,m1 ,k1 ,12 (x)} < max{rA2 ,m2 ,k1 ,12 (x)}
and
min{rA2 ,m2 ,k2 ,22 (x)} < max{rA1 ,m1 ,k2 ,22 (x)}
combining the latter two expressions:
min{rA1 ,m1 ,k1 ,12 (x)}
max{rA1 ,m1 ,k2 ,22 (x)}
>
max{rA2 ,m2 ,k1 ,12 (x)}

min{rA2 ,m2 ,k2 ,22 (x)}
and
max{rA2 ,m2 ,k1 ,12 (x)}
>
min{rA2 ,m2 ,k2 ,22 (x)}
>
min{rA1 ,m1 ,k1 ,12 (x)}

max{rA1 ,m1 ,k2 ,22 (x)}
we reached a contradiction, therefore the assumption was wrong, w.s.p.
56
References
[1] B.B.Mandelbrot, The fractal geometry of nature, New York: W.H. Freeman, 1983.
[2] K. Brady, I.H. Jermyn, and J.Zerubia, Adaptive probabilistic models of wavelet packets
for the analysis and segmentation of textured remote sensing images, Proceedings of the
British Machine Vision Conference (2003).
[3] R. Chellappa and S. Chatterjee, Classification of textures using gaussian markov random
field, IEEE Transactions on Acoustics, Speech and Signal Processing 33 (1985), no. 4,
959963.
[4] G.R. Cross and A.K. Jain, Markov random field texture models, TransPAMI 5 (1983),
2539.
[5] K.J. Dana, B. van Ginneken, S.K. Nayar, and J. J. Koenderink, Reflectance and texture
of real world surfaces, ACM Transactions on Graphics 18 (1999), no. 1, 134.
[6] J.G. Daugman, Uncertainty relation for resolution , spatial frequency, and orientation
optimized by 2d visual cortical filters, Journal of the Optical Society of America A 2
(1985), 11601169.
[7] D.J.Field, Relation between the statistics of natural images and the response properties
of cortical cell, J.Opt.Soc.Am.A 4 (1987), no. 12, 23792394.
[8] I. Fogel and D. Sagi, Gabor filters as texture discriminator, BioCyber 61 (1989), 102
113.
[9] D. Gabor, Theory of communication, J.IEE 93 (1946), 429459.
[10] B. Georgescu, I. Shimshoni, and P. Meer, Mean shift based clustering in high dimensions:
A texture classification example., 9th International Conference on Computer Vision
(2003), 456463.
[11] H. Greenspan, Multi-resolution image processing and learning for texture recognition and image enhancement, Ph.d. thesis, California Institute of Technology,
Pasadena,California, May 1994.
[12] H. Greenspan, S. Belongie, P. Perona, R. Goodman, S. Rackshit, and C.H.Anderson,
Overcomplete steerable pyramid filters and rotation invariance, Proc. of the IEEE Conference on Computer Vision and Pattern Recognition (1994).
[13] R.M. Haralick, K. Shanmugam, and I. Dinstein, Textural features for image classification, SMC (1973), 611621.
[14] J.M. Jolion, Feature similarity (in principles of visual information retrieval), ch. I.5,
Springer, 2000.
[15] K.I.Laws, Textured image segmentation, Ph.D. thesis, Dept. Electrical Engineering,University of Southern California, January 1980.
57
[16] T. S. Lee, Image representation using 2d gabor wavelets, IEEE Transactions on Pattern
Analysis and Machine Intelligence 18 (1996), no. 10, 959971.
[17] T. Leung and J. Malik, Representing and recognizing the visual appearance of materials
using three-dimensional textons, International Journal of Computer Vision 43 (2001),
no. 1, 2944.
[18] W.Y. Ma and B.S. Manjunath, Edgeflow: A technique for boundary detection and image
segmentation.
[19] J. Malik, S. Belongie, T. K. Leung, and J. Shi, Contour and texture analysis for image
segmentation, International Journal of Computer Vision 43 (2001), no. 1, 727.
[20] J. Malik and P. Perona, Preattentive texture discrimination with early vision mechanism,
JOSA-A 7 (1990), no. 5, 923932.
[21] J. Mao and A.K. Jain, Texture classification and segmentation using multiresolution
simultaneous autoregressive models, Pattern recognition 25 (1992), no. 2, 173188.
[22] M.Porat and Y.Y.Zeevi, Localized texture processing in vision: Analysis and synthesis
in the gaborian space, IEEE Trans Biomed Eng (1989), 11529.
[23] A.P. Pentland, Fractal-based description of natural scenes, PAMI 6 (1984), no. 6, 661
674.
[24] Y. Rubner, Perceptual metrics for image database navigation, Ph.D. thesis, Stanford
University, 1999.
[25] E. Sharon, A. Brandt, and R. Basri, Fast multiscale image segmentation, IEEE International Conference in Computer Vision (1999), 7077.
[26] J. Shi and J. Malik Self inducing relational distance and its application to image segmentation, EVVC (1), 1998, pp. 528543.
[27] T.Randen and J.H.Husoy, Filtering for texture classification: A comparative study,
PAMI 21 (1999), no. 4, 291310.
[28] M. Varma and A. Zisserman, Classifying images of materials: Achieving viewpoint and
illumination independence, Proceedings of the 7th European Conference on Computer
Vision, Copenhagen, Denmark, vol. 3, May 2002, pp. 255271.
[29] M. Varma and A. Zisserman, Texture classification: Are filter banks necessary?, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 2,
June 2003, pp. 691698.
[30] T. P. Weldon, W. E. Higgins, and D. F. Dunn, Efficient Gabor filter design for texture
segmentation, Pattern Recognition 29 (1996), no. 12, 20052015.
58

Gabor Filter Analysis For Texture Segmentation

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Gabor Filter Analysis For Texture Segmentation

Uploaded by

Copyright:

Available Formats

Technion - Computer Science Department - Technical Report CIS-2005-05 - 2005