
Discrete Wavelet Transform

The wavelet transform was born out of a need to go beyond what Fourier transforms can do. Wavelets
transform signals in the time domain (or rather, signals assumed to be in the time domain) into a joint
time-frequency domain. The main weakness found in Fourier transforms is their lack of localized support,
which makes them run headlong into Heisenberg's uncertainty principle. In short, this means that we can get
information about the frequencies present in a signal, but not where and when those frequencies occur.
Wavelets, on the other hand, are nowhere near as subject to this limitation.

A wavelet is, as the name might suggest, a little piece of a wave. Where a sinusoidal wave, as used by Fourier
transforms, carries on repeating itself to infinity, a wavelet exists only within a finite domain and is
zero-valued elsewhere. A wavelet transform involves convolving the signal against particular instances of the
wavelet at various time scales and positions. Since we can model changes in frequency by changing the time
scale, and model changes in time by shifting the position of the wavelet, we can model both frequency and
the location of that frequency. Hence the resulting domain of the wavelet transform is dubbed a joint
time-frequency domain.

Performing these convolutions at every position and every characteristic scale is called the continuous wavelet
transform. As you can probably guess, this is a terribly costly process. Fortunately, we don't actually need to go
that far. Most signal data we'd like to process on a computer is stored discretely. Moreover, Nyquist's
theorem tells us that the highest frequency we can model with discrete signal data is half the sampling
frequency. So, at worst, we have to perform the transform at every other point.

The continuous wavelet transform is generally expressed with the following integral.
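In the conventional notation (with 'a' as the scale, 'b' as the position, and psi as the wavelet, conjugated if it
is complex-valued), it reads:

    CWT(a, b) = 1/sqrt(a) * integral over all t of [ x(t) * psi((t - b) / a) ] dt

The 1/sqrt(a) factor keeps the energy of the scaled wavelet constant across scales.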

When we actually perform the discrete wavelet transform, the common approach is the dyadic scheme, in
which we increase the scale and the step spacing between wavelets by a factor of two at each pass. For
example, in the first pass we would use wavelets of scale 1 and space them one sample apart. In the second
pass, we'd scale them to two samples and space them two samples apart.

For a visual representation --
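Roughly, the dyadic grid looks something like this (an illustrative sketch over 16 samples):

scale 1:  x x x x x x x x x x x x x x x x
scale 2:  x   x   x   x   x   x   x   x
scale 4:  x       x       x       x
scale 8:  x               x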

Everywhere you see an x is a point where the convolution is performed at that scale. Each individual
convolution takes the same number of calculations, but fewer convolutions actually need to be done in each
successive pass. And we cannot forget Nyquist's theorem: the first pass, where the wavelets are at a scale that
takes up a single sample point, is unnecessary. The smallest scale we need to consider is the one where the
wavelet takes up two sample points. This also means that the transform takes up the same amount of space as
the original data.

What we do is take a lowpass (scaling function) and a highpass (wavelet function) version of the signal and
separate the lowpass and highpass information. In short, we treat the wavelet transform as if it were a filter
bank, and we iterate downward along the lowpass subband: the lowpass data is treated as a signal in its own
right and is subdivided into its own low and high subbands.

Ultimately, we can go all the way down until the lowpass band is a single value, but whether or
not we do that depends on the application. On images, for instance, doing so may prove overkill.

I could go on for ages about actually choosing the filter banks for your wavelet and scaling functions, but that
would go a little beyond the scope of this tutorial. Moreover, I don't claim to be enough of an expert to have
every single secret in my back pocket; there's more relevant information out there than anyone on Earth could
possibly keep track of. Let's just say, at the very least, that the wavelet and scaling filters must be orthogonal
so that the highpass and lowpass information are distinct and mutually exclusive. That is, there is no overlap
between the data representations in frequency space. Also, since you're progressively dividing the signal into
two parts, the scaling filter coefficients should sum to sqrt(2) so that the transform stays properly normalized
at every depth.

There are plenty of resources where you can find dozens upon dozens of wavelet functions and their
corresponding scaling functions. If you've seen anything on wavelets before, the two filters you'll probably
instantly recognize are the Haar and Daubechies-4 banks.
Haar:
    H = {  1 / sqrt(2),  1 / sqrt(2) }
    G = {  1 / sqrt(2), -1 / sqrt(2) }

Daubechies-4 (D4):
    H = { (1 + sqrt(3)) / (4 * sqrt(2)),
          (3 + sqrt(3)) / (4 * sqrt(2)),
          (3 - sqrt(3)) / (4 * sqrt(2)),
          (1 - sqrt(3)) / (4 * sqrt(2)) }
    G[0] = H[3], G[1] = -H[2], G[2] = H[1], G[3] = -H[0]
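As a quick sanity check of the sqrt(2) condition mentioned above, the D4 scaling coefficients do sum to it:

    ( (1 + sqrt(3)) + (3 + sqrt(3)) + (3 - sqrt(3)) + (1 - sqrt(3)) ) / (4 * sqrt(2)) = 8 / (4 * sqrt(2)) = sqrt(2)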

The Haar wavelet is probably the most basic and is equivalent to an old-fashioned "sum & difference"
transformation. The Daubechies-4 (or, more accurately, the second-order member of the Daubechies family of
wavelets) is probably the one shape most associated with the word wavelet. And while I'm on the subject of
names, I should note that there can be some nomenclature disagreement. Some people will use the name
Daubechies-4 (since there are 4 elements in the filter bank) while others will say Daubechies-2 (since it's a
second-order filter). Similarly, you'll see biorthogonal wavelets like CDF(2,2)* also go by the name CDF(5,3).
Fortunately, people are usually consistent with one naming convention, so once you see the actual filter bank
data, it will be clear what someone means.

Let's assume for the sake of examples that we're going to use the Daubechies-4, which means our filter banks
will contain 4 elements.
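Concretely, in C the two banks could be set up like this (a sketch of my own; the array names h and g are
chosen to match the pseudocode below):

#include <math.h>

double h[4], g[4];   /* scaling (lowpass) and wavelet (highpass) filters */

void init_d4_filters(void)
{
    double s3   = sqrt(3.0);
    double norm = 4.0 * sqrt(2.0);
    h[0] = (1.0 + s3) / norm;
    h[1] = (3.0 + s3) / norm;
    h[2] = (3.0 - s3) / norm;
    h[3] = (1.0 - s3) / norm;
    g[0] = h[3];  g[1] = -h[2];  g[2] = h[1];  g[3] = -h[0];
}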

In this case, our code (or pseudocode) to perform one step of the DWT looks something like this --
for (i = 0; i < (n/2); ++i)
{
    /* note: indices 2*i+2 and 2*i+3 can run past the end of x; see the wraparound note below */
    yh[i] = x[2*i]*g[3] + x[2*i+1]*g[2] + x[2*i+2]*g[1] + x[2*i+3]*g[0];
    yl[i] = x[2*i]*h[3] + x[2*i+1]*h[2] + x[2*i+2]*h[1] + x[2*i+3]*h[0];
}

'x' is the input signal, and 'n' is the length of that signal. 'g' is the wavelet filter and 'h' is the scaling filter. 'yh' is
the resulting highpass data and 'yl' is the resulting lowpass data; each of these is half the size of our original data
(i.e. n/2). In practice, we'd also do a few extra tasks, such as copying the highpass and lowpass data back into a
destination buffer (the same size as the source buffer). In addition, since this is only one step, we can take the
lowpass half of the data and repeat the process with it a number of times (with n halved at each successive
step).
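To sketch what that iteration might look like, suppose the loop above has been wrapped into a function
dwt_step(x, n, yl, yh) and that yl and yh are scratch buffers of at least n/2 elements (the function name,
buffer layout, and stopping condition here are my own choices, not fixed conventions):

/* repeatedly split the lowpass band in place, lowpass half first */
void dwt_multilevel(double *buf, int n, int passes, double *yl, double *yh)
{
    int i, length;
    for (length = n; length >= 4 && passes-- > 0; length /= 2)
    {
        dwt_step(buf, length, yl, yh);      /* one forward step on the current lowpass band */
        for (i = 0; i < length / 2; ++i)
        {
            buf[i]              = yl[i];    /* lowpass half goes to the front    */
            buf[i + length / 2] = yh[i];    /* highpass half fills the back half */
        }
    }
}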

One thing worth noting is that there should actually be a wraparound through the buffer: even though the
pseudocode has indices like [2*i+3], it should really be [(2*i+3) % n]. To be more accurate, we're treating the
buffer as a circular list. So, for instance, say you have a buffer of size n = 256. If we index buf[256], it should
actually wrap around to buf[0] -- essentially, (index % n). Similarly, if we try to index buf[-1], it should wrap
to the end and read from buf[255] -- essentially, (n + index).
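A tiny helper can handle both directions of wraparound (the name wrap is just my own placeholder):

int wrap(int index, int n)
{
    /* maps any index into [0, n): n wraps to 0, -1 wraps to n - 1 */
    return ((index % n) + n) % n;
}

With that, x[2*i+3] in the forward step becomes x[wrap(2*i+3, n)], and the negative indices in the inverse
step below are handled the same way.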

Anyway, that's the idea for the forward transform, and here's the general idea for the inverse transform.
for (i = 0; i < (n/2); ++i)
{
    /* yl[i-1] and yh[i-1] wrap around to the last element (index n/2 - 1) when i == 0 */
    x[2*i]   = yl[i-1]*h[1] + yl[i]*h[3] + yh[i-1]*g[1] + yh[i]*g[3];
    x[2*i+1] = yl[i-1]*h[0] + yl[i]*h[2] + yh[i-1]*g[0] + yh[i]*g[2];
}

Of course, the examples shown are for a 1-dimensional transform, so as they stand, they are only suitable for
1-dimensional data such as audio. We can make this into a separable 2-d transform (e.g. for images) rather
simply by performing each step on each dimension individually. That is, we perform a 1-d transform on all the
horizontal lines in the image, and then perform a 1-d transform on all the vertical lines.
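A rough sketch of that separable pass, assuming a square w-by-w image stored row-major in img, a scratch
buffer col of length w, and some 1-d routine dwt_1d(data, length) (all of these names are my own placeholders):

void dwt_2d(double *img, int w, double *col)
{
    int x, y;

    /* horizontal pass: transform each row in place */
    for (y = 0; y < w; ++y)
        dwt_1d(&img[y * w], w);

    /* vertical pass: gather each column, transform it, scatter it back */
    for (x = 0; x < w; ++x)
    {
        for (y = 0; y < w; ++y) col[y] = img[y * w + x];
        dwt_1d(col, w);
        for (y = 0; y < w; ++y) img[y * w + x] = col[y];
    }
}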

To help visualize what's going on during these transforms, we can look at them in the form of a matrix
multiplication. For the sake of space, I'll again assume the D4 filter (4 elements) and a signal that's only 8
samples in length. We'll treat those 8 samples as an 8-element column vector, and the lowpass and highpass
sections as two 4-vectors stacked to make one 8-vector.
[yl(0)] = [ h3 h2 h1 h0 0 0 0 0 ][x(0)]
[yl(1)] = [ 0 0 h3 h2 h1 h0 0 0 ][x(1)]
[yl(2)] = [ 0 0 0 0 h3 h2 h1 h0 ][x(2)]
[yl(3)] = [ h1 h0 0 0 0 0 h3 h2 ][x(3)]
[yh(0)] = [ g3 g2 g1 g0 0 0 0 0 ][x(4)]
[yh(1)] = [ 0 0 g3 g2 g1 g0 0 0 ][x(5)]
[yh(2)] = [ 0 0 0 0 g3 g2 g1 g0 ][x(6)]
[yh(3)] = [ g1 g0 0 0 0 0 g3 g2 ][x(7)]

And for the inverse transform, we have something like this --


[x(0)] = [ h3 0 0 h1 g3 0 0 g1 ][yl(0)]
[x(1)] = [ h2 0 0 h0 g2 0 0 g0 ][yl(1)]
[x(2)] = [ h1 h3 0 0 g1 g3 0 0 ][yl(2)]
[x(3)] = [ h0 h2 0 0 g0 g2 0 0 ][yl(3)]
[x(4)] = [ 0 h1 h3 0 0 g1 g3 0 ][yh(0)]
[x(5)] = [ 0 h0 h2 0 0 g0 g2 0 ][yh(1)]
[x(6)] = [ 0 0 h1 h3 0 0 g1 g3 ][yh(2)]
[x(7)] = [ 0 0 h0 h2 0 0 g0 g2 ][yh(3)]

As mentioned before, we must be careful about the choice of wavelet. It is not innately true that the inverse
transform of the forward transform will always yield the original data. This property is referred to as "perfect
reconstruction," and it's not something you can assume about wavelets. A great deal of wavelet literature is
written from the mathematician's point of view rather than the computer scientist's, which generally means it
deals with wavelets applied to infinite sequences of data. And many wavelets that exhibit perfect
reconstruction on infinite domains won't translate over to finite domains.

So in the end, what is this good for? Well, if you've read this far, hopefully it means you're interested enough
to have already heard something about the applications. The most common ones are signal processing and data
compression. About the simplest thing you can do for compression is to set any coefficient below a threshold to
zero, and then run-length encode the zeros. When you do this, it helps to choose a wavelet that already
produces a lot of near-zero coefficients when transforming your kind of dataset. The highpass sections tend to
have very small values in general anyway, but the choice still often counts for something. Other, more
complicated approaches could involve things like zerotrees and so on, but that's another tutorial.
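A minimal sketch of that thresholding step (the names are placeholders, and the run-length encoding of the
resulting zeros is left out):

#include <math.h>

/* zero out transform coefficients whose magnitude falls below a threshold */
void threshold_coeffs(double *coeff, int n, double threshold)
{
    int i;
    for (i = 0; i < n; ++i)
        if (fabs(coeff[i]) < threshold)
            coeff[i] = 0.0;
}

A larger threshold produces more zeros (and better run-length encoding) at the cost of reconstruction quality.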

There are, of course, other uses for wavelets, such as techniques involving precomputed light transport.
Performing wavelet transforms on the column or row vectors of the summation matrix can make the matrix
much sparser for the sake of performance, and we can often get by preserving only a small fraction of the
coefficients when doing so.

Hopefully, the DWT no longer appears so scary. Make no mistake, some of the pure theoretical content
concerning wavelets is incredibly scary to newcomers. It gets to the point that a lot of professors don't enjoy
teaching it even if they have a personal interest in the matter. But those of us who are programmers can at
least look at it from a programmer's point of view and play around in that world.

Good luck, and happy coding!


- Parashar K.

*CDF is a biorthogonal family of wavelets. CDF stands for Cohen-Daubechies-Feauveau. The CDF(5,3) AKA CDF(2,2) filter is the
same one used in the JPEG2000 spec.

References :

http://www.bearcave.com/misl/misl_tech/wavelets/lifting/ -- Wavelet Lifting Scheme. An alternate method for
constructing and working with wavelets.
http://dmsun4.bath.ac.uk/resource/warehouse.htm -- The Bath Wavelet Warehouse, a repository full of wavelet
filter bank coefficients.
http://www-dsp.rice.edu/software/ -- DSP-related software at Rice University. Contains several resources on
wavelets as well as demonstratory Java applets.
