
1

ETHER WIND, SPECTRAL LINES, AND MICHELSON INTERFEROMETERS

The Michelson interferometer is named after Albert Abraham Michelson, who designed and built
it in 1881 to detect the ether wind caused by the Earth’s orbital motion. Michelson’s attempt
failed; his interferometer, sensitive enough to detect stamping feet 100 meters away,1 could not
detect the Earth’s orbital motion. So important and difficult to explain was this result that
Michelson and Edward Morley repeated the experiment with a larger and more sensitive
interferometer in 1887. This second attempt, which is today called the Michelson-Morley
experiment, also yielded a negative result: The Earth’s motion could not be detected. The
Michelson-Morley experiment is one of the most important negative findings of 19th-century
science; it encouraged physics to discard the idea of a luminiferous ether and prepared the way
for Einstein’s relativity theories at the beginning of the 20th century.
The idea of a luminiferous ether—a plenum pervading both (transparent) matter and empty
space—had been widely accepted ever since Young and Fresnel established around 1820 that
light behaved like a transverse vibration or wavefield as it propagated past obstacles. There were
recognized difficulties with the concept; for example, the ether provided no detectable resistance
to the motion of material bodies yet was elastic enough to transmit light vibrations without
measurable energy loss. In the 1820s and ’30s, Poisson, Cauchy, and Green, famous
mathematical scientists, derived equations of motion for transverse waves in an elastic medium,
but when these equations were applied to the already known behavior of light, the results were at
best mixed.2 In 1867 James Clerk Maxwell modified the formulas describing the interdependent
behavior of electric and magnetic fields to make them a self-consistent set of equations; he
believed himself to be constructing a mechanical analogy for the ether. After showing that the
new set of equations predicted transverse electromagnetic waves traveling at the speed of light,
Maxwell not only asserted that light was a propagating electromagnetic disturbance, but he also
used his discovery to connect electric and magnetic properties to the behavior of the luminiferous
ether. It was not until 1888 that Hertz demonstrated experimentally that propagating
electromagnetic disturbances actually exist; and the optical community itself did not
acknowledge until 1896, with the discovery of the Lorentz-Zeeman effect, that light had to be

1
A. Michelson, “The Relative Motion of the Earth and the Luminiferous Ether,” American Journal of Science 22,
Series 3 (1881), pp. 120–129.
2
E. Whittaker, A History of the Theories of Aether and Electricity, Vol. I, The Classical Theories (Thomas Nelson &
Sons, Ltd., New York, 1951), pp. 129–142.


such a propagating electromagnetic wavefield.3 So the ether concept was not only alive and well
at the time of Michelson’s experiments, but it could also be said, with the growing acceptance of
Maxwell’s equations to describe the behavior of the luminiferous ether, that it had never been
healthier.

1.1 The First Michelson Interferometer


Figure 1.1(a) is a drawing of the instrument Michelson described in his 1881 paper, and Fig.
1.1(b) shows how the interferometer works. Incident light enters from the left, as shown by the
dark solid arrow, and hits a glass plate whose back is a partly reflecting, partly transmitting
surface. Ideally, half the incident light is transmitted through to mirror C and half is reflected up
to mirror D. Mirrors C and D then return the light to the beam splitter, as shown by the dashed
arrows. At the beam splitter, the light is again half transmitted and half reflected to send two
equal-intensity beams into the observer’s telescope. The light that is first transmitted and then
reflected at the beam splitter is called beam TR, and the light that is first reflected and then
transmitted at the beam splitter is called beam RT. These beams are drawn as two side-by-side
dotted arrows, but in reality they should be thought of as lying one on top of the other, filling the
same volume of space as they travel from the beam splitter to the telescope.
Michelson, thinking then in terms of 19th-century optical theory, would have regarded light as
transverse and elastic vibrations in the ether. The ether’s plane of vibration might be horizontal,
as shown in Fig. 1.2(a), or vertical, as shown in Fig. 1.2(b). It was assumed, in fact, that the ether
could undergo transverse vibrations in any plane at all—horizontal, vertical, or something in
between, as shown in Fig. 1.2(c)—although not all at the same time. At any given point in the
light beam, there could be only one plane of vibration, with different colors of light characterized
by different wavelengths of vibration. If a “snapshot” of a light beam could be taken, the plane of
vibration could well be changing along its length, as shown in Fig. 1.3(a). At some slightly later
time, the snapshot would show the same configuration advanced in the direction of propagation,
as shown in Fig. 1.3(b). White light, then as now, was taken to be a composite beam consisting of
many different wavelengths simultaneously traveling in the same direction. Different colors of
light correspond to disturbances of different wavelengths. Combining or adding together many
different-colored disturbances produces a total transverse vibration having no particular or unique
wavelength and with the plane of vibration free to change in an irregular fashion along the length
of the beam, as shown in Fig. 1.3(c). The situation depicted in Figs. 1.3(a)–1.3(c) is actually very
close to the physical models used today to explain the behavior of light; all we need to do is
accept Maxwell’s equations—but not Maxwell’s ether—and say that the sinusoidal curves in

3
D. Goldstein, Polarized Light, 2nd ed. (Marcel Dekker, Inc., New York, 2003), p. 298.

FIGURE 1.1(a). The first Michelson interferometer.

Figs. 1.3(a)–1.3(c) describe the changing length and orientations of the tip of the wavefield’s
oscillating electric or magnetic field vectors.4
Suppose length D in Fig. 1.1(b) is adjusted until the distance from mirror C to the beam splitter
is exactly the same as the distance from mirror D to the beam splitter. When monochromatic
light—that is, light having a unique wavelength—enters the interferometer as shown in Figs.
1.4(a) and 1.4(b), then the beams reflected from C and D recombine when leaving the
interferometer in such a way that their planes of vibration, as well as their state of oscillation,
exactly match. Since the planes of vibration match, we can disregard the planes’ orientation and
just add together the two beams’ sinusoidal curves. Figure 1.5(a) shows that if the RT and TR
beams line up exactly—as they must when the distances from mirrors C and D to the beam
splitter are equal—then the summed oscillation is a maximum because the two wavefields are in
phase. If the distances from mirrors C and D to the beam splitter are unequal, then beams RT and
TR shift with respect to each other, as shown in Figs. 1.5(b)–1.5(e). The two beams can be out of
phase by any fraction of a wavelength, depending on the amount of inequality in the two distances.



4
See, for example, the discussion in Secs. 4.2 through 4.4 of Chapter 4. Figures 1.2(a) and 1.2(b) can be profitably
compared to Figs. 4.5 and 4.6 in Chapter 4.



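FIGURE 1.1(b). Layout of the interferometer: incident light meets the beam splitter (a glass plate with a partially reflective surface); the compensator plate and mirror C lie along one arm and mirror D along the other; the recombined beams go to the observing telescope. Beam RT is first reflected then transmitted at the beam splitter; beam TR is first transmitted then reflected at the beam splitter.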


FIGURE 1.2(a). Horizontal vibrations of a transverse wavefield. Labels: cut in wavefield; plane perpendicular to direction of propagation.

FIGURE 1.2(b). Vertical vibrations of a transverse wavefield. Labels: vibrations of transverse wavefield; cut in wavefield; direction of propagation; plane perpendicular to direction of propagation.


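FIGURE 1.2(c). Three different planes of vibration for a transverse wavefield, with the propagation direction indicated.

FIGURE 1.3(a). Vibration of a single wavelength, with the wavelength marked.

FIGURE 1.3(b). The same vibration at a slightly later time, with the wavelength marked.

FIGURE 1.3(c). White light: no unique wavelength.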


The closer this fraction is to one-half, the smaller the summed oscillation; and if they are out of
phase by exactly a half-wavelength, then their sum is zero and the combined beam disappears.
When one beam is shifted against the other by exactly one wavelength, and the planes of
vibration still match, then once again the monochromatic RT and TR beams are in phase and
produce a bright combined oscillation.5 There seems to be a real possibility that a
monochromatic beam cannot be used to confirm that mirrors C and D are the same distance from
the beam splitter because the recombined exit beam may look the same as it does when no shift at
all exists if one wavefield is shifted against the other by one, two, etc., wavelengths.
Suppose two monochromatic beams with two different wavelengths are sent through the
interferometer at the same time. If the distances from mirrors C and D to the beam splitter are
equal, then both the monochromatic beams, even though they have different wavelengths, must
be in phase when leaving the interferometer, producing a maximally bright oscillation in the
recombined exit beam. When the distances to the beam splitter are not exactly equal, however,
one of the monochromatic beams may end up shifted against itself by one, two, etc., wavelengths,
but there is no reason for the other beam to be shifted against itself the same way. When three
monochromatic beams are sent through the interferometer while the distances to the beam splitter
are not equal, matching all three wavetrains becomes even more unlikely. Hence, if we pass
white light containing innumerable distinct monochromatic wavetrains through the instrument,
then the RT and TR beams will recombine to produce a maximally bright output beam if and only
if the distances from mirrors C and D to the beam splitter are equal.
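
The argument above can be checked numerically. The following is a minimal sketch, not part of Michelson's own analysis. It assumes an ideal beam splitter with no phase reversal, equal-amplitude RT and TR beams, and a handful of arbitrarily chosen wavelengths standing in for white light. The function names and the wavelength list are illustrative assumptions.

```python
import math

def combined_amplitude(path_difference, wavelength):
    """Peak amplitude of two unit-amplitude sinusoids offset by path_difference.

    sin(x) + sin(x + phi) has peak amplitude 2*|cos(phi/2)|,
    with phi = 2*pi*path_difference/wavelength.
    """
    phase = 2.0 * math.pi * path_difference / wavelength
    return 2.0 * abs(math.cos(phase / 2.0))

def white_light_brightness(path_difference, wavelengths):
    """Mean intensity (amplitude squared) over several monochromatic components."""
    total = sum(combined_amplitude(path_difference, w) ** 2 for w in wavelengths)
    return total / len(wavelengths)

# Illustrative wavelengths in nanometres (assumed values, not from the text).
WAVELENGTHS_NM = [450.0, 500.0, 550.0, 600.0, 650.0]

if __name__ == "__main__":
    lam = 550.0
    # One wavelength: shifts of 0, 1/4, 1/2, 3/4, and 1 wavelength, as in Fig. 1.5.
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(fraction, combined_amplitude(fraction * lam, lam))
    # Several wavelengths: compare zero shift with shifts of 550 nm and 1100 nm.
    for shift in (0.0, 550.0, 1100.0):
        print(shift, white_light_brightness(shift, WAVELENGTHS_NM))
```

For a single wavelength the amplitude returns to its maximum at a shift of one full wavelength, which is the ambiguity described above. For the multi-wavelength list, only the zero shift brings every component to its maximum at once.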
To make the white-light beam work as intended, the interferometer needs a glass compensator
plate between mirror C and the beam splitter [see Fig. 1.1(b)]. The compensator plate must be the
same thickness and orientation—and made from the same type of glass—as the glass in front of
the beam splitter’s partially reflecting surface. Figure 1.6(a) shows how light waves reflect from
mirrors C and D; the wavelength does not change while reflecting. In Fig. 1.6(b), however, light
waves inside the glass are somewhat shorter than they are outside the glass; the wavelength of the
light with respect to the glass thickness is greatly exaggerated to show this effect.
Therefore, a given distance traveled inside the glass corresponds to more wavelengths of a
monochromatic beam than the same distance in empty space. Moreover, different colors or
wavelengths of light shrink by different amounts, and this effect was a familiar one to 19th-
century optical scientists. If the compensator plate is not present, then the RT beam in Fig. 1.1(b)
passes through the glass in the beam splitter three times, whereas the TR beam passes through the
beam-splitter glass only once. The RT beam thus contains more wavelengths than the TR beam
even though the distances between the mirrors and the beam splitter are equal. With the
compensator plate present, however, both the TR and the RT beams pass through three glass
thicknesses.

5
In fact, we now know that a strictly monochromatic beam of light must have matching planes of vibration when
shifted against itself by exactly one, two, etc., wavelengths.


FIGURE 1.4(a). Figure 1.4(a) shows a segment of radiation entering the interferometer and Fig. 1.4(b)
shows what that segment becomes when it leaves the interferometer if the distance it travels up and back
each interferometer arm is the same. Panel label: before passing through the interferometer.


FIGURE 1.4(b). Beam RT and beam TR after leaving the interferometer.


FIGURE 1.5(a). Beam TR, beam RT, and their total: in phase.

FIGURE 1.5(b). Beam TR, beam RT, and their total: out of phase by a quarter wavelength.

FIGURE 1.5(c). Beam TR, beam RT, and their total: out of phase by a half wavelength.

FIGURE 1.5(d). Beam TR, beam RT, and their total: out of phase by three-quarters wavelength.

FIGURE 1.5(e). Beam TR, beam RT, and their total: in phase.


FIGURE 1.6(a). Incident wavefield and reflected wavefield at a mirror.

FIGURE 1.6(b). Incident, reflected, and transmitted wavefields at the beamsplitting film on its glass substrate.


Now each monochromatic component has its own unique number of wavelengths in each arm
of the interferometer; thus, the blue-light component in one arm has the same number of
wavelengths as the blue-light component in the other arm, the red-light component in one arm
has the same number of wavelengths as the red-light component in the other arm, and the same
can be said about all the other colors in the white-light beam.
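
The role of the compensator plate can be illustrated with a small calculation. This is a rough sketch under stated assumptions: the glass path length per pass and the two refractive indices are illustrative placeholder values (roughly typical of crown glass), not figures from Michelson's instrument, and the helper names are hypothetical. The sketch counts only the extra wavelengths that glass adds compared with the same distance in empty space.

```python
def extra_wavelengths(passes, glass_path, wavelength, index):
    """Extra wavelengths added by `passes` traversals of glass.

    A distance glass_path holds glass_path*index/wavelength wavelengths in
    glass versus glass_path/wavelength in empty space; the excess per pass
    is glass_path*(index - 1)/wavelength.
    """
    return passes * glass_path * (index - 1.0) / wavelength

def arm_mismatch(passes_rt, passes_tr, glass_path, wavelength, index):
    """Difference in wavelength count between beams RT and TR due to glass alone."""
    return (extra_wavelengths(passes_rt, glass_path, wavelength, index)
            - extra_wavelengths(passes_tr, glass_path, wavelength, index))

# Assumed illustrative values: 5 mm of glass per pass, two colors.
GLASS_PATH_M = 5.0e-3
COLORS = {
    "red": (650e-9, 1.515),   # (wavelength in m, assumed refractive index)
    "blue": (450e-9, 1.525),
}

if __name__ == "__main__":
    for name, (lam, n) in COLORS.items():
        without = arm_mismatch(3, 1, GLASS_PATH_M, lam, n)  # no compensator
        with_plate = arm_mismatch(3, 3, GLASS_PATH_M, lam, n)  # compensator present
        print(name, without, with_plate)
```

Without the compensator the mismatch differs from color to color, so no single mirror position can remove it for every component at once. With three passes in each beam the mismatch is zero for every color.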
Michelson wanted to do more than just make the distances traveled by light going back and
forth between the C, D mirrors and the beam splitter equal; he also wanted to see how the
distances traveled by the light beams changed when he rotated the interferometer on its stand [see
Fig. 1.1(a)]. Up to now, we have assumed that mirrors C and D are exactly perpendicular to the
line of sight between their centers and the beam splitter, but nothing stops us from tilting one of
them a very slight amount, as shown in Fig. 1.7. The degree of tilt is, of course, greatly
exaggerated to show what is happening. When the tilt is imposed after the distances of mirrors C
and D to the beam splitter have been made equal, the center line of the tilted mirror remains at the
same distance from the beam splitter as it was before the tilt occurred. If the tilt is so small that
the slight change in direction of the beam can be disregarded, then that part of the beam reflecting
off the mirror’s center line still recombines with light from the other mirror in such a way as to
produce the maximally bright oscillation already discussed above. The off-center parts of the
recombined beam are, of course, dimmer because the off-center parts of the tilted mirror no
longer match up properly to the untilted mirror.6 An observer looking through the telescope
shown in Figs. 1.1(a) and 1.1(b) sees a bright central band, called a “fringe,” corresponding to the
central strip lying along the center line of the tilted mirror, with dark and less bright bands or
fringes on either side. If the distance that the light travels between the tilted mirror and the beam
splitter changes slightly, we expect the central fringe to shift as one side or another of the tilted
mirror—instead of its center line—becomes equal to the distance traveled by the light in the other
arm of the interferometer. It is exactly this sort of fringe shift that Michelson hoped to see when
he rotated the interferometer on its stand, changing the direction in space of the light going up
and back the arms of the interferometer.
One last point we need to make is that many beam splitters of the type shown in Fig. 1.1(b)
reflect differently from the glass side and the nonglass side of the partially reflecting surface,
reversing the direction of vibration in the TR beam reflecting off the nonglass side and not
reversing it in the RT beam reflecting off the glass side.7
Figure 1.5(c) shows that reversing the direction of vibration is the same as changing the phase
of the beam by one half-wavelength or 180°, so the phenomenon is often referred to as a 180°
phase shift on reflection. Michelson used this sort of phase-shifting beam splitter, so the RT and
TR beams in his interferometer did not match up the way they are shown in Fig. 1.4(b) when the
distances of mirrors C and D from the beam splitter are equal but instead match up as shown in



6
See Secs. 5.20 and 5.21 in Chapter 5 for a more detailed discussion of how to analyze a tilted mirror.
7
F. Jenkins and H. White, Fundamentals of Optics, 3rd ed. (McGraw-Hill Book Company, New York, 1957), p.
251.



FIGURE 1.7. A tilted mirror, showing the centerline of the tilted mirror, the line of sight to the beam splitter, and the angle of tilt. Note: The angle of tilt is greatly exaggerated in this diagram.


Fig. 1.8. Now the central fringe coming from the center line of the tilted mirror is dark because
all the monochromatic components of the two beams cancel out rather than add together. When
Michelson sent white light through his interferometer, he thus saw a central dark fringe with
parallel multicolored fringes on either side. The colored fringes come from the off-center strips of
the tilted mirror where one or another monochromatic wavetrain is shifted against itself by
exactly one, two, etc., wavelengths, increasing the amplitude of its oscillation with respect to the
wavetrains of other colors inside the recombined beam. In this setup, the central dark fringe is
unique, making it easy for Michelson to see how its position changes as the interferometer is
rotated.

1.2 Historical Reasoning Behind the Ether-Wind Experiment


Physical theory has changed a great deal since 1881, but it is still relatively easy to understand
the reasoning behind Michelson’s experiment. As soon as light is taken to be a wavefield in a
medium at rest, such as waves on the surface of water, and the Earth’s motion through space is
regarded as carrying the interferometer through the medium, everything falls into place.
The first point worth mentioning is that the velocity at the equator due to the Earth’s daily
rotation is 0.46 km/sec, much less than the Earth’s orbital velocity around the sun of 29.67
km/sec. Consequently, the rotational velocity of Michelson’s laboratory—well north of the
equator—was only about 1% of the orbital velocity, and Michelson did not have to pay any
attention to it. The interferometer in Fig. 1.1(a) can be rotated on its stand, so at noon and
midnight, Michelson could always arrange for one arm to be aligned with the Earth’s orbital
velocity. Figures 1.9(a) and 1.9(b) show light traveling along the arms of a Michelson
interferometer when the interferometer is viewed as moving with a velocity v through a stationary
medium—that is, a luminiferous ether—and one of the arms is aligned with v. To keep life
simple, we have dropped the compensator plate from the two diagrams. Figure 1.9(a) shows light
traveling out and back along the arm aligned with v, with the interferometer rotated so that this is
the arm holding mirror C in Fig. 1.1(b). Figure 1.9(b) shows light traveling out and back along
the arm holding mirror D in Fig. 1.1(b). The positions of mirrors C and D are adjusted so that
each one is the same distance a from the beam splitter.
Figure 1.9(a) shows the beam splitter at three different positions as a single crest of the light’s
wavefield moves through the interferometer: when the wavecrest first enters the arm of the
interferometer, when the wavecrest reflects off mirror C, and when the wavecrest returns to the
beam splitter for the second time. Mirror C is shown at the same three times—when the
wavecrest enters the arm, when it reflects off C, and when it returns to the beam splitter. The
velocity of the wavecrest with respect to the ether is c, and time t1 elapses as the wavecrest goes
from the beam splitter to mirror C. Hence, the wavecrest covers a distance a + vt1 in the
stationary ether while traveling at velocity c, with

$$a + vt_1 = ct_1 . \tag{1.1a}$$


FIGURE 1.8. Beam RT and beam TR.


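FIGURE 1.9(a). Light traveling out and back along the arm aligned with the direction of the Earth’s motion. Labels: incident light; positions of the beam splitter and positions of mirror C, separated by distances $vt_1$ and $vt_2$; arm length $a$; to telescope.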


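FIGURE 1.9(b). Light traveling out and back along the arm perpendicular to the direction of the Earth’s motion. Labels: incident light; mirror D; positions of the beam splitter, separated by distances $vt_3$; arm length $a$; to telescope.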


Time $t_2$ elapses while the wavecrest returns from mirror C to the beam splitter, and similar
reasoning shows that

$$a - vt_2 = ct_2 . \tag{1.1b}$$

Solving for $t_1$ and $t_2$ in Eqs. (1.1a) and (1.1b) gives

$$t_1 = \frac{a}{c - v}$$

and

$$t_2 = \frac{a}{c + v} .$$

The wavecrest spends time

$$t_1 + t_2 = \frac{a}{c - v} + \frac{a}{c + v} = \frac{2ac}{c^2 - v^2}$$

going out to mirror C and back to the beam splitter, and it does so while traveling at velocity c, so
it covers a total distance

$$c \cdot (t_1 + t_2) = \frac{2ac^2}{c^2 - v^2} . \tag{1.1c}$$

Figure 1.9(a) also shows the wavecrest traveling at an angle, instead of straight down, after it
reflects off the beam splitter when leaving the interferometer’s arm. This allows it to head toward
where the observing telescope will be by the time the wavecrest reaches it; there is thus no
danger of the telescope missing the wavecrest because it has moved out of position. Figures
1.10(a) and 1.10(b) show why this happens. Figure 1.10(a) shows a single wavecrest reflecting
off a 45° stationary mirror. The large dots indicate where the “corner” of the reflecting wavecrest
is now and has been in the past as it reflects from the stationary mirror. The reflected wavecrest
travels upward at 90° from its original direction, as expected. Figure 1.10(b) shows what happens
when the same type of wavecrest reflects off a moving 45° mirror. The four thin solid lines show
the positions of the mirror at four equally spaced instants in time, and the large dots again show
where the corner of the reflecting wavecrest is at these times. Connecting these dots with a thick
dashed line, we see that the wavecrest feels an effective stationary mirror that is slanted at an
angle somewhat greater than 45°. This means the reflected wavecrest does not travel straight up
as in Fig. 1.10(a) but instead moves a little off to the right.


Figure 1.9(b) shows how the wavecrest travels up and back the interferometer arm
perpendicular to velocity v. In time $t_3$, the wavecrest travels a distance $\sqrt{a^2 + v^2 t_3^2}$ from the beam
splitter to mirror D; and, because it does this at velocity c, we must have

$$ct_3 = \sqrt{a^2 + v^2 t_3^2}$$

or

$$t_3 = \frac{a}{\sqrt{c^2 - v^2}} .$$

Figure 1.9(b) shows that the total distance traveled from the beam splitter to mirror D and
back again must be

$$2ct_3 = \frac{2ac}{\sqrt{c^2 - v^2}} . \tag{1.2}$$

Even though the two interferometer arms are both of length a, if the interferometer is moving
then a single wavecrest splitting at the beam splitter does not travel the same distance in each arm
before recombining at the beam splitter. The difference $\Delta s$ between the distances traveled out and
back in each arm is, according to Eqs. (1.2) and (1.1c),

$$\Delta s = c(t_1 + t_2) - 2ct_3 = \frac{2ac}{\sqrt{c^2 - v^2}}\left[\frac{c}{\sqrt{c^2 - v^2}} - 1\right] = \frac{2a}{\sqrt{1 - v^2/c^2}}\left[\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right] .$$

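The Earth’s orbital velocity is about $10^{-4}$ of the speed of light $c$, so we can make the
approximation

$$\left(1 - \frac{v^2}{c^2}\right)^{-1/2} \cong 1 + \frac{v^2}{2c^2} .$$

This gives

$$\Delta s \cong 2a\left(1 + \frac{v^2}{2c^2}\right)\left(1 + \frac{v^2}{2c^2} - 1\right) = \frac{av^2}{c^2} + O(v^4/c^4) .$$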


FIGURE 1.10(a). An incident wavecrest enters from the right and is reflected up from a stationary
surface. The dots show where the corner of the wavecrest is at equally spaced time intervals while it is
reflecting off the surface. Labels: incident wavecrest moving to the left; reflected wavecrest moving up; reflecting surface.


FIGURE 1.10(b). The same wavecrest is shown here at four instants of time, each instant
separated from the next by a time interval of $\Delta t$, as it enters from the right and reflects off a flat
surface traveling from left to right across the page. The dots show where the corner of the wavecrest
is at these four instants of time, and the thick dashed line shows the effective slant of the surface
experienced by the wavecrest as it reflects. Labels: same incident wavecrest at four equally spaced instants of time ($t$, $t - \Delta t$, $t - 2\Delta t$, $t - 3\Delta t$); direction of travel of incident wavecrest; direction of travel of reflected wavecrest; reflecting surface at four equally spaced instants of time.


Since $v^2/c^2 \cong 10^{-8}$ and $v^4/c^4 \cong 10^{-16}$, it makes sense to neglect the $v^4/c^4$ terms and write

$$\Delta s \cong \frac{av^2}{c^2} \approx 10^{-8}\,a . \tag{1.3a}$$

It is perhaps of interest to point out that Michelson, by mistakenly assuming that the light
traveling up and back the arm perpendicular to the orbital velocity covered a distance $2a$ instead
of $2ac/\sqrt{c^2 - v^2}$, ended up with

$$\Delta s \cong \frac{2av^2}{c^2} \approx 2 \times 10^{-8}\,a \tag{1.3b}$$


in his 1881 paper. This incorrect formula did not affect Michelson’s overall analysis because, as
he explained in the paper, the data was good enough to rule out an effect ten times smaller than
what he expected to see.
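
The size of the neglected terms can be checked directly. The sketch below compares the exact expression for $\Delta s$ with Eq. (1.3a) and with Michelson's 1881 result, Eq. (1.3b). It is an illustration only: the function names and the symbols `beta` (for $v/c$) and `gamma` (for $1/\sqrt{1 - v^2/c^2}$) are conveniences introduced here, not notation from the text.

```python
import math

def delta_s_exact(a, beta):
    """Exact path difference c*(t1 + t2) - 2*c*t3 for arm length a, beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return 2.0 * a * gamma * (gamma - 1.0)

def delta_s_approx(a, beta):
    """Leading-order result of Eq. (1.3a): a*v^2/c^2."""
    return a * beta ** 2

def delta_s_michelson_1881(a, beta):
    """Leading-order result of Eq. (1.3b): 2*a*v^2/c^2."""
    return 2.0 * a * beta ** 2

if __name__ == "__main__":
    a = 1.0        # arm length in arbitrary units
    beta = 1.0e-4  # v/c for the Earth's orbital motion, as stated in the text
    print("exact      :", delta_s_exact(a, beta))
    print("Eq. (1.3a) :", delta_s_approx(a, beta))
    print("Eq. (1.3b) :", delta_s_michelson_1881(a, beta))
```

At $v/c = 10^{-4}$ the exact value and Eq. (1.3a) agree to far better than one part in a million, while Eq. (1.3b) is larger by a factor of two.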
As pointed out in Sec. 1.1, when white light passed through the interferometer with one of the
end mirrors slightly tilted, Michelson saw a central dark band or fringe from the centerline of the
tilted mirror because the centerline is the same distance from the beam splitter as the untilted
mirror. Remembering that Michelson used a beam splitter that reversed the direction of vibration
in one of the recombining beams, we know that at the center of the dark fringe each
monochromatic wavetrain in the white-light beam cancels itself out. At the first colored band or
fringe on either side of the centerline, the wavetrains go from cancelling themselves out to
reinforcing themselves, becoming bright at those positions on the tilted mirror where the length
traveled out and back the tilted mirror arm is a half-wavelength longer than at the center of the
dark band [see, for example, the transition from Fig. 1.5(c) to Fig. 1.5(e)]. Hence, for each
monochromatic wavetrain, the transition from dark to bright is halfway complete where the
length traveled out and back the tilted-mirror arm is a quarter wavelength different from what it is
at the center of the dark band. Considering the joint actions of all the monochromatic wavetrains
in the white-light beam, Michelson then knew that going from the center to the edge of the dark
fringe corresponded to shifting from a position on the tilted mirror where the length out and back
in both interferometer arms was equal to a position where the length out and back the tilted
mirror arm was different by one quarter of the average wavelength $\lambda_{av}$ of the white-light beam.
Thus the fringe widths inside the telescope’s field of view gave him an extremely fine-grained
scale for measuring the difference in distance between the two arms. For greater accuracy, a
monochromatic beam could be sent through the interferometer and the tilted mirror adjusted until
the fringes matched up with the scale marks of the telescope’s eyepiece.
If the interferometer is rotated so that the arm originally parallel to v is now perpendicular to
v, then the distance out and back one arm is shorter by $\Delta s$ and the distance out and back in the
other arm is longer by $\Delta s$, so there is, according to Eq. (1.3a), a shift of


$$2\Delta s \cong \frac{2av^2}{c^2} \approx 2 \times 10^{-8}\,a \tag{1.4}$$

of the wavefield from one arm when compared to the wavefield from the other arm. If $2\Delta s$ equals
$\lambda_{av}/4$, the dark fringe shifts until its center is located at the previous position of one of its edges;
if $2\Delta s$ is larger, then the dark fringe shifts more; and if $2\Delta s$ is smaller, then the dark fringe shifts
less. For the value of a he chose, Michelson expected the fringe to shift by approximately one-
tenth its width. To within experimental error, he did not see the dark fringe shift at all. Michelson
concluded that

the hypothesis of the stationary ether is thus shown to be incorrect, and the necessary conclusion follows that
the hypothesis is erroneous.8
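
The expected fringe shift discussed before the quotation can be estimated from Eq. (1.4). This is a hedged sketch: the arm length of 1.2 m and the mean wavelength of 570 nm are assumed illustrative inputs, not values given in the text, and the function names are hypothetical. The full width of the dark fringe is taken to correspond to a path difference of $\lambda_{av}/2$, because the text places the edge of the fringe $\lambda_{av}/4$ from its center.

```python
def rotation_shift(a, beta):
    """Eq. (1.4): change 2*delta_s ~ 2*a*v^2/c^2 in path difference on rotation."""
    return 2.0 * a * beta ** 2

def shift_in_dark_fringe_widths(a, beta, mean_wavelength):
    """Shift as a fraction of the full dark-fringe width.

    Center to edge of the dark fringe corresponds to mean_wavelength/4,
    so the full width corresponds to mean_wavelength/2.
    """
    return rotation_shift(a, beta) / (mean_wavelength / 2.0)

if __name__ == "__main__":
    arm_length_m = 1.2          # assumed arm length, for illustration only
    beta = 1.0e-4               # v/c for the Earth's orbital motion
    mean_wavelength_m = 570e-9  # assumed average white-light wavelength
    print(shift_in_dark_fringe_widths(arm_length_m, beta, mean_wavelength_m))
```

With these assumed inputs the shift comes out at roughly 0.08 of the fringe width, the same order as the one-tenth figure quoted above. Different choices of arm length or mean wavelength scale the result proportionally.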

The existence of the ether was widely accepted among scientists, so this experiment was by no
means the last word in the matter; indeed, it inaugurated 50 years of ever more painstaking
attempts to detect an ether wind using larger and more sensitive Michelson interferometers.
Michelson himself took the first step down this road when, in 1887, he collaborated with Edward
Morley to repeat his experiment; Fig. 1.11 shows the optical diagram of the interferometer they
constructed. They concluded that the velocity v of the interferometer with respect to the ether was
probably less than a sixth of the Earth’s orbital velocity, an upper limit suggested by
experimental error.9 Michelson and Morley regarded this as another negative result. Many
scientists, including Michelson, at first interpreted these experiments as showing that the Earth
dragged along a layer of ether near its surface, making it hard to say just how fast the
interferometer might be moving with respect to the ether in the laboratory. Interferometers were
set up on tops of mountains and sent up in high-altitude balloons, hoping to get outside the ether
layer dragged along by the Earth, but no one came up with any results convincingly larger than
experimental error. According to Einstein’s special theory of relativity, published in 1905, there
is no reason to expect “ether drift” at all, because the speed of light is the same in all inertial
frames of reference. After 1905, attempts to detect ether drift were basically attempts to disprove
relativity theory, and scientists who pursued them were regarded by their peers as ever more
eccentric. Perhaps the last serious attempt to detect an ether wind using a Michelson
interferometer took place on top of Mount Wilson, where Dayton Miller ran an extremely large
and sensitive Michelson experiment in the 1920s. When publishing the results in the early 1930s,
he claimed to detect ether-wind velocities on the order of 10 km/sec,10,11 but the data remained

8
Michelson, “The Relative Motion of the Earth.”
9
A. Michelson and E. Morley, “On the Relative Motion of the Earth and the Luminiferous Ether,” American Journal
of Science 34, Series 3 (1887), 333–345.
10
D. Miller, “The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth,” Reviews of
Modern Physics 5, no. 2 (July 1933), 203–242.


controversial. After his death, the results were attributed to slight but systematic temperature
changes in the instrument during the measurements.12

1.3 Monochromatic Light and Spectral Lines

The wavelength $\lambda$ of a monochromatic light wave and the frequency $f$ in cycles per unit time of
that same monochromatic light wave are connected by

$$\lambda f = c , \tag{1.5}$$

where $c$ is the velocity of light. By the second half of the 19th century, it was known that the light
emitted by free atoms, such as from the atoms inside a hot dilute gas, is often emitted at specific
frequencies called spectral lines. Equation (1.5) then requires the light from a spectral line to
have a precise wavelength $\lambda = c/f$. Michelson used these spectral lines to generate the
monochromatic light sent through his interferometer. When, for example, a spectroscope was
used to separate out the cadmium red line and send it through the interferometer, he would see a
regular pattern of red fringes; when the mercury green line was sent through, he would see
regular green fringes; and so on. Many of these lines are in reality clumped groups of spectral
lines, all having nearly the same wavelength; they masquerade as a single bright line when
observed by low-resolution spectroscopes and spectrometers.

1.4 Applying the Michelson Interferometer to Spectral Lines
After the first ether-wind experiments, Michelson demonstrated that his interferometer could also
be used both as an extremely accurate, practical ruler for measuring fundamental lengths and as
an extremely high-resolution spectrometer. To understand Michelson’s approach, we must keep
in mind that the only “optical detectors” available back then were cameras (whose images had to
be chemically developed in darkrooms) and the human eye.
When the interferometer is used as a ruler or spectrometer, one of the arms is modified so that
its mirror is easily moved, as shown in Fig. 1.12. This moving mirror and the fixed mirror on the
other arm are still slightly tilted with respect to each other; that is, when extended indefinitely,
the planes of the mirror surfaces do not meet at exactly 90°. In this discussion, we refer to the
moving mirror as being tilted and the fixed mirror as being untilted. To keep things consistent
with the discussion in Sec. 1.1, the beam splitter is assumed to be the same type used in the 1881



11
D. Miller, “The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth,” Nature
(February 3, 1934), 162–164.
12
R. Shankland, S. McCuskey, F. Leone, and G. Kuerti, “New Analysis of the Interferometer Observations of
Dayton C. Miller,” Reviews of Modern Physics, no. 2 (April 1955), 167–178.



FIGURE 1.11.


ether-wind experiment. Hence, when a white-light beam is sent through the instrument, an
observer notes a central dark fringe if the center of the tilted moving mirror is the same distance
from the beam splitter as the center of the fixed mirror. This equidistant position of the moving
mirror is today often called the position of zero-path difference (ZPD) because the light’s path up
and back each arm of the interferometer is the same when there is no tilt present.
The position and tilt of the moving mirror can be adjusted until the central dark fringe is
centered on rulings marked in the telescope’s eyepiece. When the white-light beam is replaced by
a monochromatic beam from a spectral line, the observer sees a sequence of light and dark bands
forming a regular pattern of fringes having the same color as the spectral line. The marked
position of the central dark fringe in the center of the eyepiece is now occupied by a dark null of
the monochromatic fringe pattern. This null corresponds to the centerline strip of the tilted
mirror’s surface being the same distance from the beam splitter as the untilted mirror’s surface.
The two bright fringes on either side of the marked null separate that null from the two
neighboring nulls, with the neighboring nulls corresponding to two strips of the tilted mirror’s
surface that are a half-wavelength closer to, and a half-wavelength further away from, the beam
splitter. A half-wavelength difference in distance from the beam splitter creates, of course, a full
wavelength’s difference in the distance traveled up and back the interferometer’s arm, which is
why we see another null. Depending on the configuration of the telescope, the amount of tilt in
the tilted mirror, and the wavelength of the monochromatic beam, there will be some number of
additional fringes alternating bright and dark across the field of view, with the nulls
corresponding to strips of the tilted mirror’s surface that are one half-wavelength closer to and
further away from the beam splitter, two halves or one full wavelength closer to and further away
from the beam splitter, three halves closer to and further away from the beam splitter, and so on.
The observer can slowly move the tilted mirror out along its arm, watching as the fringe
pattern moves across the telescope’s field of view. The movement occurs, of course, because the
strips of the moving mirror’s tilted surface that are 1/2, 1, 3/2, etc., wavelengths closer to or
further away from the beam splitter are now no longer where they used to be. The marked null
shifts and, after the mirror moves half a wavelength from its original position, the null that used
to be immediately to one side shifts into the marked location. The fringe pattern looks the same
as just before the mirror began moving, but the observer knows there has been a half-wavelength
shift in the position of the moving mirror because the fringes have been carefully watched as their
positions changed. As the mirror moves, old fringes move out of sight on one side of the field of
view while new fringes replace them on the other side of the field of view. The observer checks
that the tilt of the moving mirror does not change by making sure that there is always the same
number of bright-null repetitions in the fringe pattern. Since the position of the moving mirror is
always known to within a small fraction of a wavelength, the interferometer has now become an
extremely accurate way to measure distance.


FIGURE 1.12. [Diagram labels: source radiance containing spectral lines, beam splitter, compensator plate, fixed mirror, moving mirror (distance p), and output to telescope.]


Michelson did not hesitate to measure distances with his interferometer. In 1892 he
established that the standard meter bar in Paris corresponded, to an accuracy of one part in two
million, to 1,553,163.5 wavelengths of monochromatic light from the red cadmium spectral line.
At Yerkes Observatory in Wisconsin, he measured the extremely small tidal distortions of the
planet Earth due to the moon’s gravity, helping to establish that the Earth has an iron core, and
published the results in 1919. There is, however, a fundamental difficulty limiting his ability to
use the interferometer as a ruler: As the moving mirror gets further and further away from its
equidistant or ZPD position, the pattern of fringes starts to fade and eventually disappears. This
phenomenon is caused by the beam from the spectral line not being exactly monochromatic—
either because what looks like a single spectral line is in reality a group of two or more lines
having almost the same wavelength, or because the line itself has a finite spectral “width,”
simultaneously emitting light at a very large number of wavelengths all very close to each other
in value.
To see why the fade-out occurs for a closely spaced group of spectral lines, we first analyze
what happens when the light from a pair of equal-intensity, closely spaced spectral lines,
sometimes called a spectral doublet, is sent through the interferometer. Inside the interferometer,
the doublet behaves like two monochromatic beams—each having a slightly different
wavelength—simultaneously passing through the instrument. After using white light to put the
moving, tilted mirror at its ZPD position, we begin sending the doublet beam through the
interferometer. Each monochromatic beam produces a fringe pattern. To the human eye, the
fringe patterns have the same color and their nulls seem to be at exactly the same places in the
telescope’s field of view. Because the wavelengths of the beams are nearly identical, the two
fringe patterns lie almost exactly on top of each other, reinforcing each other the same way the
dashed and solid oscillations lie on top of each other to create a thicker line at the left-hand edge
of Fig. 1.13. When, for example, there is a null in one beam’s fringe pattern because that strip of
the tilted mirror’s surface is an integer number of half-wavelengths closer to or further away from
the beam splitter, the null from the other beam’s fringe pattern falls in almost exactly the same
place because it has almost exactly the same wavelength. As we shift the moving mirror further
away from ZPD and watch the fringes move, we know that when each new fringe forms at the
leading edge of the field of view, it shows that the edge of the tilted moving mirror is an ever
larger number of half-wavelengths further from the beam splitter. Sooner or later, however, the
same thing happens to the two beams’ fringe patterns that happens in Fig. 1.13 as we look away
from its left-hand edge—the oscillations get out of phase. Just as the dashed and solid lines in
Fig. 1.13 no longer match up exactly because they have slightly different repetition lengths, so do
the two fringe patterns of the two beams match up less well because they have slightly different
wavelengths. There always comes a point—perhaps when the next null is forming at 10,000 or
50,000 or more half-wavelengths from the ZPD position of the moving mirror—where the
monochromatic beam with the slightly shorter wavelength λ1 is ready to form a null somewhat
before the beam with the slightly longer wavelength λ2. The nulls and brights from one
monochromatic fringe pattern shift enough with respect to the other that we begin to notice a
change: the pattern begins to fade. Eventually, the two fringe patterns are completely out of


phase, with the brights and nulls of one pattern lying on, respectively, the nulls and brights of the
other. If the two beams are of equal intensity, then the fringe pattern fades away completely.
Suppose the λ1 set of fringes first becomes exactly out of phase with the λ2 set of fringes when
the moving mirror has traveled a distance of approximately N/2 wavelengths of the λ2 beam from
its equidistant or ZPD location. At this point, N satisfies the approximate equation

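$$\frac{1}{2} N \lambda_2 \cong \frac{1}{2}\left(N + \frac{1}{2}\right)\lambda_1, \tag{1.6a}$$

which can also be written as

$$\frac{\lambda_2 - \lambda_1}{\lambda_1} \cong \frac{1}{2N}. \tag{1.6b}$$

This gives the formula for the fractional spread

$$\frac{\lambda_2 - \lambda_1}{\lambda_1}$$
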
between the doublet’s wavelengths in terms of N. If N is too large for convenient counting and
only several digits of accuracy are needed, we can directly measure the distance p in Fig. 1.12 at
which the fringe pattern disappears. Recognizing that both sides of Eq. (1.6a) are formulas for p
at the fade-out point, we can approximate either side of Eq. (1.6a) by $N\lambda_{\mathrm{av}}/2$, where $\lambda_{\mathrm{av}}$ is the
approximate wavelength of the doublet, and write

$$\frac{N\lambda_{\mathrm{av}}}{2} \cong p. \tag{1.6c}$$

Solving for N gives the formula

$$N \cong \frac{2p}{\lambda_{\mathrm{av}}} \tag{1.6d}$$

to estimate N in terms of the known values of p and $\lambda_{\mathrm{av}}$. This approximate value of N can then
be put into Eq. (1.6b) to find the fractional spread in the doublet. Hence, we see that the fade-out
is both a “bug” and a “feature” of the interferometer—although it sets a limit on the distances that
can be measured, it also specifies the exact separation of spectral lines too close to be resolved by
other types of spectrometers. This exercise also establishes the basic idea behind Michelson-
based spectroscopy: examining the behavior of the interference signal to measure the beam’s
spectral shape.
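
As a rough numerical illustration of Eqs. (1.6b) through (1.6d), the following Python sketch converts a measured fade-out distance p into a fractional spread and, going the other way, predicts the fade-out distance of a known doublet. The function names and the sodium D-line wavelengths are illustrative assumptions of ours and do not come from the text.

```python
"""Doublet spread from the fade-out distance, following Eqs. (1.6b)-(1.6d)."""


def fringes_at_fadeout(p: float, lambda_av: float) -> float:
    """Eq. (1.6d): N ~= 2p / lambda_av (p and lambda_av in the same length unit)."""
    return 2.0 * p / lambda_av


def fractional_spread(n_fringes: float) -> float:
    """Eq. (1.6b): (lambda_2 - lambda_1) / lambda_1 ~= 1 / (2N)."""
    return 1.0 / (2.0 * n_fringes)


def fadeout_distance(lambda_1: float, lambda_2: float) -> float:
    """Invert Eqs. (1.6b) and (1.6c): mirror displacement p at the first fade-out."""
    n_fringes = lambda_1 / (2.0 * (lambda_2 - lambda_1))
    lambda_av = 0.5 * (lambda_1 + lambda_2)
    return 0.5 * n_fringes * lambda_av


if __name__ == "__main__":
    # Illustrative values only: the sodium D doublet, in meters.
    lam1, lam2 = 588.995e-9, 589.592e-9
    p = fadeout_distance(lam1, lam2)
    n = fringes_at_fadeout(p, 0.5 * (lam1 + lam2))
    print(f"fade-out distance p ~ {p * 1e3:.3f} mm")
    print(f"N ~ {n:.0f}, fractional spread ~ {fractional_spread(n):.3e}")
```

Under these assumptions the fade-out occurs after only a few hundred fringes, which is why a doublet this widely spaced is easy to count by eye, while much closer doublets call for the direct measurement of p described above.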


FIGURE 1.13. The solid oscillation represents the fringe pattern of one spectral line in the doublet and
the dashed oscillation represents the fringe pattern of the other spectral line in the doublet. The
wavelengths of both spectral lines are almost the same, so their fringe patterns slowly change from being
in-phase, to being out-of-phase, and then back to being in-phase.

[Plot annotations, left to right along the horizontal axis: strong fringes, weak fringes, no fringes, weak fringes, strong fringes.]

Now that we understand why the fringe pattern of a doublet fades, it is easy to see why the
same sort of thing happens with any size group—or multiplet—of closely spaced spectral lines.
Each line of intrinsically greater or lesser intensity generates a fringe pattern of intrinsically
greater or lesser intensity connected to its wavelength. Near ZPD, all the fringe patterns are in
phase, but as the moving mirror shifts away from ZPD, the fringe patterns, since each is produced
by a slightly different wavelength, go out of phase, causing the fringes to fade. Figure 1.14 even
suggests a quick way of understanding something about why a single, finite-width spectral line
also produces fading fringe patterns; approximating it as a closely spaced multiplet, we might
expect its fringes to behave the same way any other multiplet’s would. We should, however, be
careful about carrying this sort of reasoning too far. Figure 1.13 suggests that if, after reaching
the fade-out point, we keep moving the tilted mirror away from its ZPD position, then the
doublet’s fringe pattern starts to reappear, eventually becoming as strong as it was near ZPD. The
same sort of phenomenon should also occur for any multiplet consisting of a finite number of
exact wavelengths; if we go far enough from ZPD, then there should be a region where the fringe
patterns are all back in phase. In reality, when moving away from ZPD, there are indeed regions
where a multiplet’s fringe pattern first fades then grows stronger, but the finite width of each
spectral line inside the multiplet stops the fringes from ever regaining their full ZPD strength.
The fringes always, eventually, fade away completely. To explain this behavior, it is enough to
examine how and why the fringe pattern of a single, finite-width spectral line fades away. This is
done in the next three sections, where we show how a fringe pattern is connected to the Fourier
transform of the spectral intensity.
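
The fade-and-return behavior pictured in Fig. 1.13 can be sketched numerically. The Python below assumes, ahead of the derivation in the following sections, that each line of an equal-intensity doublet contributes a fringe pattern proportional to 1 − cos(2πχ/λ), where χ is the round-trip path difference and the minus sign places a null at ZPD as for Michelson's beam splitter. The function names, the sodium D-line wavelengths, and the visibility measure (I_max − I_min)/(I_max + I_min) are illustrative assumptions, not definitions taken from the text.

```python
import math


def doublet_intensity(chi: float, lambda_1: float, lambda_2: float) -> float:
    """Sum of two equal-intensity fringe patterns at round-trip path difference chi."""
    return (1.0 - math.cos(2.0 * math.pi * chi / lambda_1)) + (
        1.0 - math.cos(2.0 * math.pi * chi / lambda_2)
    )


def fringe_visibility(p: float, lambda_1: float, lambda_2: float,
                      samples: int = 2000) -> float:
    """(I_max - I_min) / (I_max + I_min) over about one fringe centered on chi = 2p."""
    lambda_av = 0.5 * (lambda_1 + lambda_2)
    values = [
        doublet_intensity(2.0 * p + (k / samples - 0.5) * lambda_av,
                          lambda_1, lambda_2)
        for k in range(samples + 1)
    ]
    return (max(values) - min(values)) / (max(values) + min(values))


if __name__ == "__main__":
    # Illustrative values only: the sodium D doublet, in meters.
    lam1, lam2 = 588.995e-9, 589.592e-9
    # First fade-out distance from Eqs. (1.6b) and (1.6c).
    p_fade = lam1 * 0.5 * (lam1 + lam2) / (4.0 * (lam2 - lam1))
    for label, p in (("ZPD", 0.0), ("fade-out", p_fade), ("revival", 2.0 * p_fade)):
        print(f"{label:>8}: visibility ~ {fringe_visibility(p, lam1, lam2):.3f}")
```

For an ideal doublet of two exact wavelengths the visibility returns to nearly its ZPD value at twice the fade-out distance; the finite width of real spectral lines, discussed above, is what prevents this full revival in practice.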


1.5 Interference Equation for the Ideal Michelson Interferometer


When using a Michelson interferometer for Fourier-transform spectroscopy, the end mirrors in
each arm are aligned to be perpendicular to the line of sight between their centers and the center
of the beam splitter. In effect, we remove the tilt from the moving mirror so that its central fringe
fills the detector’s field of view in Fig. 1.15. The light beam passing through the interferometer
should be collimated, shown schematically in Fig. 1.15, by putting the point source of the beam
at the focus of a thin lens. The beam leaving the interferometer is concentrated onto a detector by
another thin lens. The dashed line shows the ZPD position of the moving mirror in Figs. 1.15 and
1.16. The moving mirror is a distance p from ZPD in these two figures, with p taken to be
positive when the mirror is further away from the beam splitter than its ZPD position and
negative when it is closer to the beam splitter than its ZPD position. The moving mirror should
remain perpendicular to the line of sight between it and the beam splitter as p changes, and the
detector records the changing intensity I of the collimated beam leaving the interferometer.
Even though Michelson did not usually set up his interferometers this way, optical theory was
advanced enough then for him to predict how I depends on p. The first step is to set up an x, y, z
Cartesian coordinate system such as the one shown in Fig. 1.16, with the collimated exit beam
traveling down the z axis. There are dimensionless unit vectors x̂ , ŷ , ẑ pointing in the direction
of the positive x, y, z coordinate axes. Still treating a light beam as a transverse wavefield of the
type shown in Figs. 1.2(a)–1.2(c) and 1.3(a)–1.3(c), we assume that beam TR in Fig. 1.16 is
monochromatic light and write its transverse disturbance as

$$\vec{A}_f = \hat{x}\,U_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right) + \hat{y}\,V_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right). \tag{1.7a}$$

Here, t is the time coordinate, f is the frequency of the monochromatic disturbance, and λf is the
wavelength corresponding to frequency f. The period of the disturbance is, of course, 1/f, and Eq.
(1.5) reminds us that the wavelength λf is connected to the frequency f by

$$\lambda_f f = c,$$

where again c is the speed of light. Vector $\vec{A}_f$ has no ẑ component, allowing it to represent a
transverse disturbance in the “ether” of the type shown in Figs. 1.2(a)–1.2(c) and 1.3(a)–1.3(c).
The x̂ and ŷ components of $\vec{A}_f$ are the real-valued expressions

$$U_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right)$$


FIGURE 1.14. [Two plots of spectral intensity versus frequency f; the second plot is labeled “Spectral Multiplet.”]


FIGURE 1.15. [Diagram labels: source at focus, beam splitter, compensator plate, fixed mirror, moving mirror (distance p), and detector; angle markings of 45 deg. and 90 deg.]


and

$$V_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right),$$

respectively. These components must both oscillate at the same frequency f because the light
beam is monochromatic, but they can have different constant phase shifts $\delta_U$ and $\delta_V$. This allows
$\vec{A}_f$ to point in different directions in the x, y plane when we move along the beam, as suggested
by the changing orientations of the arrows in beams RT and TR of Fig. 1.16. The $U_f$ and $V_f$
amplitudes of the x and y oscillations do not have to be equal. To simplify the notation, and
because the concept will be routinely used in the rest of the book, we define

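$$\sigma_f = \frac{1}{\lambda_f} \tag{1.7b}$$

to be the wavenumber of the monochromatic disturbance. Now Eqs. (1.7a) and (1.5) can be
written as

$$\vec{A}_f = \hat{x}\,U_f \cos\left(2\pi\sigma_f z - 2\pi f t + \delta_U\right) + \hat{y}\,V_f \cos\left(2\pi\sigma_f z - 2\pi f t + \delta_V\right) \tag{1.7c}$$

with

$$\sigma_f = f/c. \tag{1.7d}$$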

This is the same monochromatic disturbance as before; all that changes is the notation used to
specify how its phase changes with z.
The power transported by a physical wavefield of any type is usually proportional to its
squared amplitude;13,14 and in optics it is now, as it was in Michelson’s time, customary to set the
time average of the squared amplitude equal to the intensity of the transverse wavefield.15 Visible
light has a wavelength on the order of $5 \times 10^{-7}$ meters, so by Eq. (1.5) its frequency is about

$$f \cong \frac{c}{5 \times 10^{-7}\ \text{meters}} \cong 6 \times 10^{14}\ \text{Hz} \tag{1.8a}$$

given that $c \cong 3 \times 10^{8}$ m/sec. Hence one cycle of the transverse wavefield has a period of about

13
H. Lamb, Hydrodynamics (6th edition), Dover Publications, New York, 1945 copy of the 6th edition first
published in 1879, p. 370.
14
P. Morse and K. Ingard, Theoretical Acoustics, McGraw-Hill, Inc., New York, 1968, p. 250.
15
G. Stokes, Mathematical and Physical Papers, Vol. III, Cambridge at the University Press, 1901, pp. 233-258.


FIGURE 1.16. [Diagram labels: moving mirror, fixed mirror, beam splitter, compensator plate, beam RT, beam TR, χ = 2p, and the x, y, and z axes with unit vectors x̂ and ẑ.]


$$\frac{1}{6 \times 10^{14}\ \text{Hz}} \cong 2 \times 10^{-15}\ \text{sec}. \tag{1.8b}$$

The response time of the unaided human eye is perhaps as short as $10^{-2}$ s, and $2 \times 10^{-15}$ s is
shorter than that by a factor of about $10^{13}$. The response of the fastest optical detectors available
today is on the order of $10^{-9}$ s, which is still an incredibly long time compared to $2 \times 10^{-15}$ s.
Therefore, we might as well take the time over which the squared amplitude is averaged to be
infinitely long, because compared to the wavefield’s period, that’s what it effectively is.
Following the notation of the time, the time average of a function g(t) is taken to be

$$\mathrm{j}\big(g(t)\big) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} g(t)\,dt. \tag{1.9a}$$

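For any two functions g(t) and h(t), we then have

$$\mathrm{j}\big(g(t) + h(t)\big) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} [g(t) + h(t)]\,dt = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} g(t)\,dt + \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} h(t)\,dt$$

or

$$\mathrm{j}\big(g(t) + h(t)\big) = \mathrm{j}\big(g(t)\big) + \mathrm{j}\big(h(t)\big). \tag{1.9b}$$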

Multiplying g(t) by a constant K and then averaging, we get

$$\mathrm{j}\big(K \cdot g(t)\big) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} [K g(t)]\,dt = K \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} g(t)\,dt$$

or

$$\mathrm{j}\big(K \cdot g(t)\big) = K \cdot \mathrm{j}\big(g(t)\big). \tag{1.9c}$$

The squared amplitude of the monochromatic wavefield in Eq. (1.7c) is


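$$\vec{A}_f \cdot \vec{A}_f = U_f^2 \cos^2\left(2\pi\sigma_f z - 2\pi f t + \delta_U\right) + V_f^2 \cos^2\left(2\pi\sigma_f z - 2\pi f t + \delta_V\right).$$

Time averaging both sides to get the intensity gives

$$\mathrm{j}\big(\vec{A}_f \cdot \vec{A}_f\big) = \mathrm{j}\Big(U_f^2 \cos^2\left(2\pi\sigma_f z - 2\pi f t + \delta_U\right) + V_f^2 \cos^2\left(2\pi\sigma_f z - 2\pi f t + \delta_V\right)\Big), \tag{1.10a}$$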


which becomes, applying Eqs. (1.9b) and (1.9c),


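$$\mathrm{j}\big(\vec{A}_f \cdot \vec{A}_f\big) = U_f^2\, \mathrm{j}\Big(\cos^2\left(2\pi\sigma_f z - 2\pi f t + \delta_U\right)\Big) + V_f^2\, \mathrm{j}\Big(\cos^2\left(2\pi\sigma_f z - 2\pi f t + \delta_V\right)\Big). \tag{1.10b}$$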

The average of the squared cosine is 1/2 over one of its cycles.16 As the averaging time gets
longer, it contains ever more cycles of the squared cosine, as well as—almost certainly—some
fraction of a cycle. The contribution of the squared cosine over a fractional cycle has practically
no influence compared to the squared cosine’s average value of 1/2 over a large number of
complete cycles. In the limit as T → ∞, it follows that

$$\mathrm{j}\big(\cos^2(at + b)\big) = 1/2 \tag{1.10c}$$

for all real values of b and all nonzero real values of a. Hence, the formula for the intensity of
the monochromatic beam in Eq. (1.10b) now reduces to

$$\mathrm{j}\big(\vec{A}_f \cdot \vec{A}_f\big) = \frac{1}{2}\left(U_f^2 + V_f^2\right). \tag{1.10d}$$

Although the squared cosine is always positive, the cosine itself is negative as often as it is
positive and averages to zero over one cycle. As the averaging time increases, it includes an ever
larger number of cycles as well as (probably) some leftover fraction of a cycle. Again, the
influence of the zero from the large number of complete cycles outweighs the contribution of
whatever fractional cycle may be present, and as T → ∞ in the limit

$$\mathrm{j}\big(\cos(at + b)\big) = 0 \tag{1.11}$$

for all real values of b and all nonzero real values of a.
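
The limits in Eqs. (1.10c) and (1.11) can be checked numerically by evaluating the average of Eq. (1.9a) at finite T. The sketch below is only an illustration: the midpoint-rule integrator, the name `time_average`, and the values chosen for a, b, and T are assumptions made for convenience, with a taken to be nonzero and T deliberately not a whole number of cycles.

```python
import math
from typing import Callable


def time_average(g: Callable[[float], float], T: float,
                 samples: int = 200_000) -> float:
    """Midpoint-rule estimate of (1/2T) * integral of g(t) from -T to T."""
    dt = 2.0 * T / samples
    total = sum(g(-T + (k + 0.5) * dt) for k in range(samples))
    return total / samples


if __name__ == "__main__":
    a, b = 2.0 * math.pi, 0.7  # arbitrary illustrative values; a must be nonzero
    for T in (0.3, 3.3, 300.3):
        avg_sq = time_average(lambda t: math.cos(a * t + b) ** 2, T)
        avg_cos = time_average(lambda t: math.cos(a * t + b), T)
        print(f"T = {T:6.1f}: j(cos^2) ~ {avg_sq:.4f}, j(cos) ~ {avg_cos:.4f}")
```

As T grows, the leftover fractional cycle matters less and less, so the two estimates settle toward 1/2 and 0 respectively.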
The wavefield of a beam of light containing two monochromatic wavetrains of frequencies f1
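and f2 can be written as

$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2}, \tag{1.12a}$$

where

$$\vec{A}_{f_1} = \hat{x}\,U_{f_1} \cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) + \hat{y}\,V_{f_1} \cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \tag{1.12b}$$

and

$$\vec{A}_{f_2} = \hat{x}\,U_{f_2} \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) + \hat{y}\,V_{f_2} \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right). \tag{1.12c}$$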

16
D. Griffiths, Introduction to Electrodynamics, 2nd ed. (Prentice Hall, Englewood Cliffs, NJ, 1989), p. 359.


The beam’s intensity is the time average of its squared amplitude, which is
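$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\Big(\big(\vec{A}_{f_1} + \vec{A}_{f_2}\big) \cdot \big(\vec{A}_{f_1} + \vec{A}_{f_2}\big)\Big) = \mathrm{j}\Big(\vec{A}_{f_1} \cdot \vec{A}_{f_1} + \vec{A}_{f_2} \cdot \vec{A}_{f_2} + 2\vec{A}_{f_1} \cdot \vec{A}_{f_2}\Big).$$

Equations (1.9b) and (1.9c) can be applied to get

$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1}\big) + \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_2}\big) + 2\,\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big). \tag{1.12d}$$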

Substituting Eqs. (1.12b) and (1.12c) into the cross term in Eq. (1.12d) gives
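$$\begin{aligned}
\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) = \mathrm{j}\Big(&U_{f_1} U_{f_2} \cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) \\
&+ V_{f_1} V_{f_2} \cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right)\Big).
\end{aligned}$$

Again, Eqs. (1.9b) and (1.9c) are applied to get

$$\begin{aligned}
\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) = {}& U_{f_1} U_{f_2}\, \mathrm{j}\Big(\cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right)\Big) \\
&+ V_{f_1} V_{f_2}\, \mathrm{j}\Big(\cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right)\Big).
\end{aligned} \tag{1.12e}$$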

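There is a trigonometric identity

$$(\cos\alpha)(\cos\beta) = \frac{1}{2}\cos(\alpha + \beta) + \frac{1}{2}\cos(\alpha - \beta), \tag{1.12f}$$

which shows that

$$\begin{aligned}
&\cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) \\
&\quad = \frac{1}{2}\cos\left(2\pi z(\sigma_{f_1} + \sigma_{f_2}) - 2\pi t(f_1 + f_2) + \delta_U^{(1)} + \delta_U^{(2)}\right) \\
&\qquad + \frac{1}{2}\cos\left(2\pi z(\sigma_{f_1} - \sigma_{f_2}) - 2\pi t(f_1 - f_2) + \delta_U^{(1)} - \delta_U^{(2)}\right).
\end{aligned} \tag{1.12g}$$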

Taking the time average of both sides and applying Eqs. (1.9b) and (1.9c), we see that

$$\begin{aligned}
&\mathrm{j}\Big(\cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right)\Big) \\
&\quad = \frac{1}{2}\,\mathrm{j}\Big(\cos\left(2\pi z(\sigma_{f_1} + \sigma_{f_2}) - 2\pi t(f_1 + f_2) + \delta_U^{(1)} + \delta_U^{(2)}\right)\Big) \\
&\qquad + \frac{1}{2}\,\mathrm{j}\Big(\cos\left(2\pi z(\sigma_{f_1} - \sigma_{f_2}) - 2\pi t(f_1 - f_2) + \delta_U^{(1)} - \delta_U^{(2)}\right)\Big).
\end{aligned}$$
Equation (1.11) requires both terms on the right-hand side to be zero when $f_1 \neq f_2$, which gives

$$\mathrm{j}\Big(\cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right)\Big) = 0. \tag{1.12h}$$

Replacing $\delta_U^{(1,2)}$ by $\delta_V^{(1,2)}$ in the algebra used to reach this result does not change the
conclusion, which means that

$$\mathrm{j}\Big(\cos\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \cos\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right)\Big) = 0 \tag{1.12i}$$

also. Substituting these two formulas into Eq. (1.12e) leads to


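$$\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) = 0 \tag{1.12j}$$

for any two frequencies $f_1$ and $f_2$ such that $f_1 \neq f_2$. Hence, Eq. (1.12d) can be written as

$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1}\big) + \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_2}\big). \tag{1.12k}$$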

Comparing the formula in (1.12k) for the intensity of a beam containing two monochromatic
wavefields to the left-hand side of the formula in (1.10d) for the intensity of a single
monochromatic wavefield, we note that the intensity of the beam with two monochromatic
wavefields is the sum of the intensities of each monochromatic wavefield.
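
The vanishing cross term of Eq. (1.12j) and the additivity of Eq. (1.12k) can be checked numerically for a pair of wavetrains of the form in Eqs. (1.12b) and (1.12c). The Python sketch below works in scaled units with c = 1; the frequencies, amplitudes, phase shifts, observation point z, averaging half-width T, and all function names are arbitrary illustrative assumptions.

```python
import math


def wavetrain(t, z, f, U, V, dU, dV, c=1.0):
    """(x, y) components of Eq. (1.7c), with sigma_f = f / c from Eq. (1.7d)."""
    phase = 2.0 * math.pi * (f / c) * z - 2.0 * math.pi * f * t
    return (U * math.cos(phase + dU), V * math.cos(phase + dV))


def time_average(g, T, samples=100_000):
    """Midpoint-rule estimate of the average in Eq. (1.9a) at finite T."""
    dt = 2.0 * T / samples
    return sum(g(-T + (k + 0.5) * dt) for k in range(samples)) / samples


def dot(a, b):
    """Dot product of two transverse (x, y) vectors."""
    return a[0] * b[0] + a[1] * b[1]


def two_line_intensities(T=400.3, z=0.3):
    """Return (total intensity, I_f1, I_f2, averaged cross term) for two lines."""
    line1 = dict(f=1.00, U=1.0, V=0.5, dU=0.2, dV=1.1)   # arbitrary values
    line2 = dict(f=1.25, U=0.8, V=1.2, dU=2.0, dV=-0.4)  # arbitrary values

    def a1(t):
        return wavetrain(t, z, **line1)

    def a2(t):
        return wavetrain(t, z, **line2)

    def total_squared(t):
        x1, y1 = a1(t)
        x2, y2 = a2(t)
        return (x1 + x2) ** 2 + (y1 + y2) ** 2

    i_total = time_average(total_squared, T)
    i_1 = time_average(lambda t: dot(a1(t), a1(t)), T)
    i_2 = time_average(lambda t: dot(a2(t), a2(t)), T)
    cross = time_average(lambda t: dot(a1(t), a2(t)), T)
    return i_total, i_1, i_2, cross


if __name__ == "__main__":
    i_total, i_1, i_2, cross = two_line_intensities()
    print(f"I_f1 ~ {i_1:.4f}, I_f2 ~ {i_2:.4f}, sum ~ {i_1 + i_2:.4f}")
    print(f"total ~ {i_total:.4f}, cross term ~ {cross:.5f}")
```

With these values each single-line intensity should approach (U² + V²)/2 as in Eq. (1.10d), while the cross term shrinks toward zero as T grows.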
The wavefield of a beam of light containing three monochromatic wavetrains of frequencies
f1, f2, and f3 can be written as

$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3} \tag{1.13a}$$

with $\vec{A}_{f_1}$, $\vec{A}_{f_2}$ specified by formulas (1.12b) and (1.12c) respectively and $\vec{A}_{f_3}$ specified by


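$$\vec{A}_{f_3} = \hat{x}\,U_{f_3} \cos\left(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_U^{(3)}\right) + \hat{y}\,V_{f_3} \cos\left(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_V^{(3)}\right). \tag{1.13b}$$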

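Following the same analysis as before, we note that the intensity of this three-frequency light
beam is

$$\begin{aligned}
\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) &= \mathrm{j}\Big(\big(\vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3}\big) \cdot \big(\vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3}\big)\Big) \\
&= \mathrm{j}\Big(\vec{A}_{f_1} \cdot \vec{A}_{f_1} + \vec{A}_{f_2} \cdot \vec{A}_{f_2} + \vec{A}_{f_3} \cdot \vec{A}_{f_3} + 2\vec{A}_{f_1} \cdot \vec{A}_{f_2} + 2\vec{A}_{f_1} \cdot \vec{A}_{f_3} + 2\vec{A}_{f_2} \cdot \vec{A}_{f_3}\Big) \\
&= \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1}\big) + \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_2}\big) + \mathrm{j}\big(\vec{A}_{f_3} \cdot \vec{A}_{f_3}\big) \\
&\quad + 2\,\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) + 2\,\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_3}\big) + 2\,\mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_3}\big).
\end{aligned}$$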
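Equation (1.12j) shows that

$$\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) = 0$$

for any two distinct frequencies f1 and f2. The only thing different about $\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_3}\big)$ and
$\mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_3}\big)$ is the subscripts assigned to the distinct frequencies, so the same algebra showing
that $\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big)$ is zero also shows that

$$\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_3}\big) = \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_3}\big) = 0.$$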
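Hence, the three-frequency formula for $\mathrm{j}\big(\vec{A} \cdot \vec{A}\big)$ reduces to

$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1}\big) + \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_2}\big) + \mathrm{j}\big(\vec{A}_{f_3} \cdot \vec{A}_{f_3}\big). \tag{1.13c}$$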

Here again, the intensity of the beam equals the sum of the intensities of its monochromatic
wavetrains.
This same argument can obviously be generalized to a beam consisting of N monochromatic
wavetrains. Since N may be left unspecified and can be made as large as we please, this is the
same as extending it to a beam of white light. The white-light wavefield can be written as

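$$\vec{A} = \sum_{i=1}^{N} \vec{A}_{f_i}, \tag{1.14a}$$

where

$$\vec{A}_{f_i} = \hat{x}\,U_{f_i} \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i} \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \tag{1.14b}$$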


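with $f_i \neq f_j$ whenever $i \neq j$. The intensity of this beam is

$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\left(\left(\sum_{i=1}^{N} \vec{A}_{f_i}\right) \cdot \left(\sum_{j=1}^{N} \vec{A}_{f_j}\right)\right) = \mathrm{j}\left(\sum_{i=1}^{N} \sum_{j=1}^{N} \vec{A}_{f_i} \cdot \vec{A}_{f_j}\right),$$

or, applying Eq. (1.9b),

$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \sum_{i=1}^{N} \sum_{j=1}^{N} \mathrm{j}\big(\vec{A}_{f_i} \cdot \vec{A}_{f_j}\big). \tag{1.14c}$$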
i =1 j =1

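Equation (1.12j) requires

$$\mathrm{j}\big(\vec{A}_{f_i} \cdot \vec{A}_{f_j}\big) = 0 \tag{1.14d}$$

whenever $i \neq j$, so Eq. (1.14c) reduces to

$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1}\big) + \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_2}\big) + \cdots + \mathrm{j}\big(\vec{A}_{f_N} \cdot \vec{A}_{f_N}\big) = \sum_{i=1}^{N} \mathrm{j}\big(\vec{A}_{f_i} \cdot \vec{A}_{f_i}\big) \tag{1.14e}$$

because all the $i \neq j$ terms disappear. Equation (1.14e) shows that the intensity of any beam, even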
a white-light beam, is the sum of the intensities of its monochromatic wavetrains. This is
sometimes called the principle of independent superposition,17 and can be written as

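$$I = I_{f_1} + I_{f_2} + \cdots + I_{f_N} = \sum_{i=1}^{N} I_{f_i}, \tag{1.14f}$$

where

$$I = \mathrm{j}\big(\vec{A} \cdot \vec{A}\big) \tag{1.14g}$$

is the total intensity of the beam and

$$I_{f_i} = \mathrm{j}\big(\vec{A}_{f_i} \cdot \vec{A}_{f_i}\big) \tag{1.14h}$$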

is the intensity of the beam’s monochromatic wavetrain of frequency fi.


Returning now to Fig. 1.16, we suppose that Eqs. (1.14f)–(1.14h) refer to beam TR and
consider how to write the disturbance for beam RT. In an ideal Michelson interferometer, the
only difference between beam RT and beam TR is that the wavefields in beam RT lag behind the
wavefields in beam TR by a distance χ = 2p that is usually called the optical-path difference.
Using the notation specified in Eq. (1.14b), we see that for every monochromatic wavetrain

17
J. Chamberlain, The Principles of Interferometric Spectroscopy (John Wiley & Sons, New York, 1979), p. 98.


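$$\vec{A}^{(TR)}_{f_i} = \hat{x}\,U_{f_i} \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i} \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \tag{1.15a}$$

in beam TR, there must be, according to Fig. 1.16, a corresponding monochromatic wavetrain

$$\vec{A}^{(RT)}_{f_i} = \hat{x}\,U_{f_i} \cos\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i} \cos\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_V^{(i)}\right) \tag{1.15b}$$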

in beam RT. The total disturbance for the combined beams’ $f_i$th wavetrain is then

$$\vec{A}^{(RT)}_{f_i} + \vec{A}^{(TR)}_{f_i}$$

in Fig. 1.16. We also note, however, that the beam splitter in Fig. 1.16 is evidently not the same
sort of beam splitter as the one used by Michelson because it does not reverse the direction of the
oscillation of the TR beam the way that the beam splitter in Fig. 1.8 did. For a Michelson-type beam
splitter, the total disturbance of the combined beam’s $f_i$th wavetrain should be

$$\vec{A}^{(RT)}_{f_i} - \vec{A}^{(TR)}_{f_i}$$

according to the discussion at the end of Sec. 1.1. To accommodate both possibilities, we write
the $f_i$th wavetrain of the combined beam as

$$\vec{A}^{(cb)}_{f_i} = \vec{A}^{(RT)}_{f_i} + W\vec{A}^{(TR)}_{f_i}, \tag{1.15c}$$

where parameter W is −1 for Michelson-type beam splitters and 1 for non-Michelson beam
splitters. The superscript (cb) indicates that the disturbance $\vec{A}^{(cb)}_{f_i}$ is the $f_i$th wavetrain of two
beams combined in a balanced way; that is, each beam has undergone one transmission and one
reflection at the beam splitter. The intensity of the combined $f_i$th wavetrain is
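$$\begin{aligned}
I^{(cb)}_{f_i} = \mathrm{j}\big(\vec{A}^{(cb)}_{f_i} \cdot \vec{A}^{(cb)}_{f_i}\big) &= \mathrm{j}\Big(\big(\vec{A}^{(RT)}_{f_i} + W\vec{A}^{(TR)}_{f_i}\big) \cdot \big(\vec{A}^{(RT)}_{f_i} + W\vec{A}^{(TR)}_{f_i}\big)\Big) \\
&= \mathrm{j}\Big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i} + W^2 \vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} + 2W\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\Big).
\end{aligned}$$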

Applying Eqs. (1.9b) and (1.9c) gives


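$$I^{(cb)}_{f_i} = \mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i}\big) + \mathrm{j}\big(\vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) + 2W\,\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big), \tag{1.15d}$$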


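where we have recognized that $W^2 = 1$ because W = ±1. Since both disturbances have the same $f_i$
frequency, Eq. (1.12j) cannot be used to say that $\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big)$ is zero. Substituting from
(1.15a) and (1.15b) gives

$$\begin{aligned}
\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) = \mathrm{j}\Big(&U_{f_i}^2 \cos\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_U^{(i)}\right) \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) \\
&+ V_{f_i}^2 \cos\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_V^{(i)}\right) \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right)\Big),
\end{aligned}$$

or

$$\begin{aligned}
\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) = {}& U_{f_i}^2\, \mathrm{j}\Big(\cos\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right)\Big) \\
&+ V_{f_i}^2\, \mathrm{j}\Big(\cos\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)}\right) \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right)\Big).
\end{aligned} \tag{1.15e}$$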

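Formula (1.12f) shows that

$$\begin{aligned}
&\mathrm{j}\Big(\cos\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right)\Big) \\
&\quad = \mathrm{j}\left(\frac{1}{2}\cos\left(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\right) + \frac{1}{2}\cos\left(2\pi\sigma_{f_i}\chi\right)\right).
\end{aligned}$$

Applying (1.9b) and (1.9c), we get that

$$\begin{aligned}
&\mathrm{j}\Big(\cos\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right)\Big) \\
&\quad = \frac{1}{2}\,\mathrm{j}\Big(\cos\left(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\right)\Big) + \frac{1}{2}\,\mathrm{j}\Big(\cos\left(2\pi\sigma_{f_i}\chi\right)\Big).
\end{aligned} \tag{1.15f}$$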

The time average of any time-independent quantity equals that quantity; that is,

$$\langle K \rangle = K \tag{1.15g}$$

for any constant $K$. Equation (1.11) shows that

$$\left\langle \cos\!\left[4\pi\sigma_{f_i} z - 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\right] \right\rangle = 0 .$$

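These two results can be substituted into (1.15f) to get

$$\left\langle \cos\!\left[2\pi\sigma_{f_i} z - 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right]\cos\!\left[2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right] \right\rangle = \tfrac{1}{2}\cos\!\left(2\pi\sigma_{f_i}\chi\right). \tag{1.15h}$$

Replacing $\delta_U^{(i)}$ by $\delta_V^{(i)}$ does not change the algebra used to derive (1.15h). It follows that

$$\left\langle \cos\!\left[2\pi\sigma_{f_i} z - 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)}\right]\cos\!\left[2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right] \right\rangle = \tfrac{1}{2}\cos\!\left(2\pi\sigma_{f_i}\chi\right). \tag{1.15i}$$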
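As a quick numerical sanity check of (1.15h), the following sketch averages the product of the two cosines over one full period $1/f_i$ and compares the result with $\tfrac{1}{2}\cos(2\pi\sigma_{f_i}\chi)$. The function name, the sample values, and the choice of averaging over a single period (which gives the same result as a long-time average for a periodic signal) are assumptions made for illustration only.

```python
import math

def time_average_of_product(sigma, chi, z, f, delta, n_samples=20000):
    """Average the product of the two cosines in (1.15h) over one period 1/f."""
    period = 1.0 / f
    total = 0.0
    for k in range(n_samples):
        t = (k + 0.5) * period / n_samples  # midpoint samples across one full period
        a = 2 * math.pi * sigma * (z - chi) - 2 * math.pi * f * t + delta
        b = 2 * math.pi * sigma * z - 2 * math.pi * f * t + delta
        total += math.cos(a) * math.cos(b)
    return total / n_samples

# Illustrative (assumed) values in arbitrary but consistent units.
sigma, chi, z, f, delta = 1.7, 0.23, 0.4, 5.0, 0.9
numeric = time_average_of_product(sigma, chi, z, f, delta)
expected = 0.5 * math.cos(2 * math.pi * sigma * chi)
print(numeric, expected)
```

The rapidly varying term averages away, so the two printed numbers should agree to within rounding regardless of the values chosen for $z$ and the phase.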

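Substituting (1.15h) and (1.15i) into (1.15e) now gives

$$\left\langle \vec{A}^{(RT)}_{f_i}\cdot\vec{A}^{(TR)}_{f_i} \right\rangle = \tfrac{1}{2}\left(U_{f_i}^2 + V_{f_i}^2\right)\cos\!\left(2\pi\sigma_{f_i}\chi\right), \tag{1.15j}$$

and this result can be put into (1.15d) to get

$$I^{(cb)}_{f_i} = \left\langle \vec{A}^{(RT)}_{f_i}\cdot\vec{A}^{(RT)}_{f_i} \right\rangle + \left\langle \vec{A}^{(TR)}_{f_i}\cdot\vec{A}^{(TR)}_{f_i} \right\rangle + W\left(U_{f_i}^2 + V_{f_i}^2\right)\cos\!\left(2\pi\sigma_{f_i}\chi\right). \tag{1.15k}$$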

For an ideal Michelson interferometer, the intensity of the $f_i$th monochromatic wavetrain in
the RT beam and the intensity of the $f_i$th monochromatic wavetrain in the TR beam must be
identical because they arise in a symmetric way from the $f_i$th wavetrain of the white-light beam
entering the instrument. We can imagine taking out the moving mirror from its interferometer
arm so that only the TR beam is reflected back to the beam splitter. This means that only the
$\vec{A}^{(TR)}_{f_i}$ monochromatic disturbance leaves the interferometer in the proper direction, and its
intensity is, of course, $\left\langle \vec{A}^{(TR)}_{f_i}\cdot\vec{A}^{(TR)}_{f_i} \right\rangle$. Taking out the fixed mirror in the other arm and
replacing the moving mirror in the first arm ensures that only the RT beam reflects back to the
beam splitter. Now $\left\langle \vec{A}^{(RT)}_{f_i}\cdot\vec{A}^{(RT)}_{f_i} \right\rangle$ is the intensity of the monochromatic disturbance leaving
the interferometer in the proper direction. Since we have just said that these two intensities must
be equal, it follows that

$$\left\langle \vec{A}^{(RT)}_{f_i}\cdot\vec{A}^{(RT)}_{f_i} \right\rangle = \left\langle \vec{A}^{(TR)}_{f_i}\cdot\vec{A}^{(TR)}_{f_i} \right\rangle. \tag{1.16a}$$

Equation (1.10d) holds true for any monochromatic wavetrain $\vec{A}_f$ of frequency $f$, so it must
apply to wavetrain $\vec{A}^{(TR)}_{f_i}$ of frequency $f_i$. Hence, Eq. (1.15a) must mean that

$$\left\langle \vec{A}^{(TR)}_{f_i}\cdot\vec{A}^{(TR)}_{f_i} \right\rangle = \tfrac{1}{2}\left(U_{f_i}^2 + V_{f_i}^2\right). \tag{1.16b}$$

Equation (1.10d) also applies to wavetrain $\vec{A}^{(RT)}_{f_i}$ of frequency $f_i$ in Eq. (1.15b), which
similarly leads to

$$\left\langle \vec{A}^{(RT)}_{f_i}\cdot\vec{A}^{(RT)}_{f_i} \right\rangle = \tfrac{1}{2}\left(U_{f_i}^2 + V_{f_i}^2\right). \tag{1.16c}$$

The right-hand sides of (1.16b) and (1.16c) are the same, which makes sense since the left-hand
sides of (1.16b) and (1.16c) must satisfy Eq. (1.16a).
Again taking out the moving mirror, we note that then, in an ideal interferometer, one quarter
of the entering beam’s power ends up leaving the interferometer as beam TR traveling along the z
axis in Fig. 1.16. Hence, if $I^{(0)}_{f_i}$ is the intensity of the $f_i$th monochromatic wavetrain entering this
interferometer, we must have

$$\left\langle \vec{A}^{(TR)}_{f_i}\cdot\vec{A}^{(TR)}_{f_i} \right\rangle = \tfrac{1}{4} I^{(0)}_{f_i}. \tag{1.17a}$$

Consulting Eq. (1.16a), we see that this means

$$\left\langle \vec{A}^{(RT)}_{f_i}\cdot\vec{A}^{(RT)}_{f_i} \right\rangle = \tfrac{1}{4} I^{(0)}_{f_i} \tag{1.17b}$$

and, of course, Eqs. (1.16b) and (1.16c) then reveal that

$$I^{(0)}_{f_i} = 2\left(U_{f_i}^2 + V_{f_i}^2\right). \tag{1.17c}$$

Substituting Eqs. (1.17a)–(1.17c) into (1.15k) then leads to

$$I^{(cb)}_{f_i} = \tfrac{1}{2} I^{(0)}_{f_i} + \tfrac{W}{2} I^{(0)}_{f_i}\cos\!\left(2\pi\sigma_{f_i}\chi\right)$$

or

$$I^{(cb)}_{f_i} = \tfrac{1}{2} I^{(0)}_{f_i}\left[1 + W\cos\!\left(2\pi\sigma_{f_i}\chi\right)\right]. \tag{1.17d}$$

Equation (1.17d) is the basic equation for the intensity of a monochromatic wavetrain leaving
an ideal Michelson interferometer when the intensity of the corresponding wavetrain entering the
interferometer is $I^{(0)}_{f_i}$ and the moving mirror is displaced from its ZPD position by a distance
$p = \chi/2$, as shown in Fig. 1.16. We note that for those values of $\chi = 2p$ where
$W\cos\!\left(2\pi\sigma_{f_i}\chi\right) = 1$, the intensity of the $f_i$th monochromatic wavetrain leaving the interferometer is
the same as the intensity of the $f_i$th monochromatic wavetrain entering the interferometer. This
corresponds to constructive interference of the fith monochromatic component of the RT and TR
beams. Suppose the beam entering the interferometer consists of just this one monochromatic
component. Glancing back at Fig. 1.1(b), we see that the power of the beam entering an ideal
Michelson interferometer can leave by either the combined RT and TR dotted beams or by the
two combined dash-dot beams traveling in the opposite direction to the incident beam. The dotted
beams are often called the balanced output of the interferometer, because each one has undergone
one transmission and one reflection at the beam splitter; similarly, the dash-dot beams are called
the unbalanced output, because one beam has undergone two reflections and the other beam has
undergone two transmissions. Conservation of energy requires that the power in all the
monochromatic beams leaving the ideal interferometer must equal the power in the one
monochromatic beam entering the interferometer. Hence, when constructive interference of the
balanced RT and TR beams makes their combined intensity equal to that of the beam entering the
interferometer, we know that destructive interference of the two unbalanced beams must make
their combined intensity equal to zero. Consequently, at each $\chi = 2p$ value where
$W\cos\!\left(2\pi\sigma_{f_i}\chi\right) = 1$, not only is the intensity of the balanced monochromatic beams the same as
that of the monochromatic beam entering the interferometer, but also the intensity of the
unbalanced monochromatic beams is zero. On the other hand, for moving-mirror positions where
$\chi = 2p$ has a value such that $W\cos\!\left(2\pi\sigma_{f_i}\chi\right) = -1$, the intensity of the combined monochromatic
RT and TR beams in Fig. 1.1(b) is zero according to Eq. (1.17d). At these moving-mirror
locations, the balanced output undergoes destructive interference. Conservation of energy then
requires the unbalanced output to undergo constructive interference and have the same intensity
as the monochromatic beam entering the interferometer.
This analysis can be generalized to any mirror position and value of $\chi = 2p$. If $I^{(cu)}_{f_i}$ is the
intensity of the unbalanced monochromatic wavetrain and, as before, $I^{(0)}_{f_i}$ and $I^{(cb)}_{f_i}$ are the
intensities of the incident monochromatic wavetrain and balanced monochromatic wavetrain
respectively, then conservation of energy forces us to write

$$I^{(0)}_{f_i} = I^{(cb)}_{f_i} + I^{(cu)}_{f_i}. \tag{1.18a}$$

Substituting from Eq. (1.17d), we get

$$I^{(0)}_{f_i} = \tfrac{1}{2} I^{(0)}_{f_i}\left[1 + W\cos\!\left(2\pi\sigma_{f_i}\chi\right)\right] + I^{(cu)}_{f_i},$$

which can be solved for $I^{(cu)}_{f_i}$ to get

$$I^{(cu)}_{f_i} = \tfrac{1}{2} I^{(0)}_{f_i}\left[1 - W\cos\!\left(2\pi\sigma_{f_i}\chi\right)\right]. \tag{1.18b}$$

This specifies the intensity of the fith monochromatic wavetrain in the unbalanced output of an
ideal Michelson interferometer.
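
The two monochromatic formulas can be tabulated in a few lines. The sketch below is only an illustration of Eqs. (1.17d) and (1.18b); the function names, the unit input intensity, and the wavelength of 0.5 (arbitrary length units) are assumptions, not values from the text.

```python
import math

def balanced_intensity(i0, sigma, chi, w=-1):
    """Eq. (1.17d): balanced-output intensity of one monochromatic wavetrain."""
    return 0.5 * i0 * (1.0 + w * math.cos(2 * math.pi * sigma * chi))

def unbalanced_intensity(i0, sigma, chi, w=-1):
    """Eq. (1.18b): unbalanced-output intensity of the same wavetrain."""
    return 0.5 * i0 * (1.0 - w * math.cos(2 * math.pi * sigma * chi))

# Assumed example: Michelson-type beam splitter (W = -1), unit input intensity.
i0, wavelength = 1.0, 0.5
sigma = 1.0 / wavelength
for chi in (0.0, 0.125, 0.25, 0.5):
    b = balanced_intensity(i0, sigma, chi)
    u = unbalanced_intensity(i0, sigma, chi)
    print(f"chi={chi:5.3f}  balanced={b:.3f}  unbalanced={u:.3f}  sum={b + u:.3f}")
```

Whatever path difference is chosen, the two outputs should add up to the input intensity, which is just the energy-conservation statement (1.18a).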
The dashed lines in Fig. 1.17 show the positions of the moving mirror at which

$$\chi = \ldots,\ \frac{n}{\sigma_{f_i}},\ \frac{n+1}{\sigma_{f_i}},\ \frac{n+2}{\sigma_{f_i}},\ \ldots .$$

These are the positions where $I^{(cb)}_{f_i} = 0$ in Eq. (1.17d) when $W = -1$ for an interferometer using a
Michelson-type beam splitter. Substituting from Eq. (1.7b), this can also be written as

$$\chi = \ldots,\ n\lambda_{f_i},\ (n+1)\lambda_{f_i},\ (n+2)\lambda_{f_i},\ \ldots,$$

where $\lambda_{f_i}$ is the wavelength of the $f_i$th monochromatic wavetrain. For beam splitters where
$W = 1$, of course, these dashed lines represent the moving-mirror positions at which $I^{(cb)}_{f_i} = I^{(0)}_{f_i}$. If

the moving mirror is slightly tilted, so that its surface crosses more than one dashed line, and the
beam entering the interferometer contains only the fith monochromatic wavetrain, then the
combined RT and TR beams leaving the interferometer have light and dark strips as the surface
of the tilted mirror crosses through those planes in space where an untilted mirror would produce
an all-bright or an all-dark balanced output. This connects Eq. (1.17d) to the bright and null
fringe patterns from a spectral line discussed in Sec. 1.4.
When a beam of white light passes through the interferometer—that is, a beam having many
different frequencies—the principle of independent superposition in Eq. (1.14f) requires the
intensity of the interferometer’s balanced output to be the sum of the intensities of each
monochromatic wavetrain,

$$I^{(cb)} = \sum_{i=1}^{N} I^{(cb)}_{f_i},$$

which becomes, substituting from Eq. (1.17d),

$$I^{(cb)} = \frac{1}{2}\sum_{i=1}^{N} I^{(0)}_{f_i}\left[1 + W\cos\!\left(2\pi\sigma_{f_i}\chi\right)\right]. \tag{1.19a}$$

FIGURE 1.17. [Figure not reproduced. Labels: the nth, (n + 1)st, (n + 2)nd, and (n + 3)rd crossings; dashed lines mark the positions where $\chi = n\lambda_{f_i}$, $(n+1)\lambda_{f_i}$, $(n+2)\lambda_{f_i}$, and $(n+3)\lambda_{f_i}$; the distance between dashed lines is $\lambda_{f_i}/2$.]

When describing natural sources of light, we often replace sums of discrete quantities with
integrals over continuous functions, and this transformation was perhaps even more characteristic
of late 19th-century science than it is of today’s physics. So it would be an automatic process for
Michelson and his contemporaries to define a spectral intensity function $I^{(0)}(f)$ to describe the
radiation entering the instrument. When using this sort of mathematical formalism, we say that
$I^{(0)}(f)\,df$ is the optical intensity of all the radiation having frequency values between $f$ and $f + df$
entering the interferometer. The intensity of the balanced output is then

$$I^{(cb)} = \frac{1}{2}\int_0^\infty I^{(0)}(f)\left[1 + W\cos\!\left(2\pi\sigma_f\chi\right)\right] df. \tag{1.19b}$$

The physical meaning of Eq. (1.19b) is exactly the same as Eq. (1.19a); we have just replaced
$I^{(0)}_{f_i}$ by $I^{(0)}(f)\,df$ and changed the sum to an integral. We have also relied on variable $f$ itself
instead of index $i$ to label the different frequencies. To make this last tactic work, we just assume
that $I^{(0)}(f)$ is zero for those frequencies $f$ that are not part of the original sum over $i$; this also
lets us specify the integral to be over all possible frequencies $f$ between 0 and $\infty$. The
wavenumber $\sigma_f$ can be eliminated by substituting from the formula for $f$ in (1.7d) to get

$$I^{(cb)} = \frac{1}{2}\int_0^\infty I^{(0)}(f)\left[1 + W\cos\!\left(\frac{2\pi f}{c}\chi\right)\right] df. \tag{1.19c}$$

The only problem with this equation is the unreasonably high numbers required to represent $f$
at optical frequencies: when going from one extreme to the other across the visible spectrum, for
example, frequency $f$ changes from approximately $4\times10^{14}$ Hz to $7.5\times10^{14}$ Hz. Consequently,
today’s Fourier spectroscopists often use Eq. (1.7d) to eliminate $f$ rather than $\sigma$ from Eq. (1.19b).
To do this, we differentiate both sides of (1.7d) to get

$$df = c\,d\sigma \quad\text{or}\quad d\sigma = \frac{1}{c}\,df$$

and define

$$S(\sigma) = c\,I^{(0)}(c\sigma) \tag{1.19d}$$

so that

$$S(\sigma)\,d\sigma = c\,I^{(0)}(c\sigma)\cdot\frac{1}{c}\,df$$

simplifies to

$$S(\sigma)\,d\sigma = I^{(0)}(c\sigma)\,df. \tag{1.19e}$$

Now Eq. (1.7d) can be applied to (1.19c) to get

$$I^{(cb)} = \frac{1}{2}\int_0^\infty S(\sigma)\left[1 + W\cos\!\left(2\pi\sigma\chi\right)\right] d\sigma. \tag{1.19f}$$
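
The change of variables from $f$ to $\sigma$ can be checked numerically. The sketch below evaluates the frequency form (1.19c) and the wavenumber form (1.19f) for the same source and confirms that they give the same balanced intensity. The Gaussian line shape, its center and width, the integration limits, the path difference, and all function names are assumptions chosen for illustration; only the relations $\sigma = f/c$ and $S(\sigma) = cI^{(0)}(c\sigma)$ come from the text.

```python
import math

C = 2.99792458e8  # speed of light in m/s

def spectral_intensity_f(f, f0=5.0e14, width=1.0e12):
    """Assumed Gaussian line shape I0(f), centered at f0 (Hz)."""
    return math.exp(-0.5 * ((f - f0) / width) ** 2)

def spectral_intensity_sigma(sigma):
    """Eq. (1.19d): S(sigma) = c * I0(c * sigma)."""
    return C * spectral_intensity_f(C * sigma)

def midpoint_integral(func, lo, hi, n=20000):
    """Plain midpoint-rule integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return step * sum(func(lo + (k + 0.5) * step) for k in range(n))

def balanced_from_f(chi, w=-1, f_lo=4.9e14, f_hi=5.1e14):
    """Eq. (1.19c), integrating over frequency f."""
    def integrand(f):
        return spectral_intensity_f(f) * (1 + w * math.cos(2 * math.pi * f * chi / C))
    return 0.5 * midpoint_integral(integrand, f_lo, f_hi)

def balanced_from_sigma(chi, w=-1, f_lo=4.9e14, f_hi=5.1e14):
    """Eq. (1.19f), integrating over wavenumber sigma = f / c."""
    def integrand(s):
        return spectral_intensity_sigma(s) * (1 + w * math.cos(2 * math.pi * s * chi))
    return 0.5 * midpoint_integral(integrand, f_lo / C, f_hi / C)

chi = 2.0e-6  # path difference in meters (assumed)
i_f = balanced_from_f(chi)
i_sigma = balanced_from_sigma(chi)
print(i_f, i_sigma)
```

The wavenumber form works with numbers of order $10^6\ \mathrm{m^{-1}}$ rather than $10^{14}$ Hz, which is the practical convenience the text describes.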

To get the white-light intensity formulas for the unbalanced output, we can apply to the
unbalanced monochromatic formula the same analysis used on the balanced monochromatic
formula. Comparing the unbalanced formula (1.18b) to the balanced formula (1.17d), we see that
changing the sign of W is all that needs to be done to go from the balanced formula to the
unbalanced formula. Hence, when we apply to the unbalanced formula the same algebra used on
the balanced formula, we know that all the way through the derivation—and, of course, in the
final results—the only difference would be that $W$ is replaced by $-W$. Consequently, we can write
down at once the unbalanced white-light formulas corresponding to (1.19b), (1.19c), and (1.19f)
as

$$I^{(cu)} = \frac{1}{2}\int_0^\infty I^{(0)}(f)\left[1 - W\cos\!\left(2\pi\sigma_f\chi\right)\right] df, \tag{1.20a}$$

$$I^{(cu)} = \frac{1}{2}\int_0^\infty I^{(0)}(f)\left[1 - W\cos\!\left(\frac{2\pi f}{c}\chi\right)\right] df, \tag{1.20b}$$

and

$$I^{(cu)} = \frac{1}{2}\int_0^\infty S(\sigma)\left[1 - W\cos\!\left(2\pi\sigma\chi\right)\right] d\sigma \tag{1.20c}$$


respectively. Formulas (1.19b), (1.19c), and (1.19f) contain all the basic information needed to
understand how Fourier-transform spectroscopy works, and they were derived here using only those
facts that Michelson knew over 100 years ago about the nature of light. Unfortunately, they apply
only to an ideal interferometer; not surprisingly, the 19th-century approach used to derive them is
difficult to adapt to the study of both the random and nonrandom errors present in even the most
accurate of today’s Michelson interferometers. For this reason, in Chapter 4 we return to basic
principles and rederive the formula for $I^{(cb)}$ starting from the modern form of Maxwell’s
equations, this time being careful to include all the nonideal terms needed for the error analysis.
Formula (1.19f) is, however, already good enough, if we borrow several mathematical results
from Chapter 2, to explain why the fringes from even the thinnest of spectral lines discussed in
Sec. 1.4 must eventually fade away as $\chi = 2p$ increases.

1.6 Fringe Patterns of Finite-Width Spectral Lines


Finite-width spectral lines, such as the one in the top graph of Fig. 1.18, can be represented by a
spectral intensity function $I^{(0)}(f)$. We can also follow the standard practice of Fourier
spectroscopists and represent the finite-width spectral line by the $S(\sigma)$ function defined in Eq.
(1.19d) and plotted in the bottom graph of Fig. 1.18. If the intensity of a spectral line is described
by a narrow $I^{(0)}(f)$ function such as the one in the top graph of Fig. 1.18, which is significantly
different from zero only between two very closely spaced frequencies $f_1$ and $f_2$, then the
corresponding $S(\sigma)$ curve is significantly different from zero only between the two closely spaced
wavenumbers $\sigma_1 = f_1/c$ and $\sigma_2 = f_2/c$, as shown in the bottom graph of Fig. 1.18.
The right-hand side of Eq. (1.19f) can be split up into the sum of a constant term and a term
that changes as the location coordinate $p = \chi/2$ of the moving mirror changes,

$$I^{(cb)} = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma + \frac{W}{2}\int_0^\infty S(\sigma)\cos\!\left(2\pi\sigma\chi\right) d\sigma. \tag{1.21a}$$

Since $\sigma \ge 0$ in the integrals over $d\sigma$, nothing stops us from replacing $S(\sigma)$ by $S(|\sigma|)$ in the
second term to get

$$\int_0^\infty S(\sigma)\cos\!\left(2\pi\sigma\chi\right) d\sigma = \int_0^\infty S(|\sigma|)\cos\!\left(2\pi\sigma\chi\right) d\sigma. \tag{1.21b}$$

Anticipating some of the Fourier material in Chapter 2, we note that, according to Eq. (2.11a)
in Chapter 2, function $S(|\sigma|)$ is even because

$$S(|-\sigma|) = S(|\sigma|),$$

and, of course, it is real because it represents a real physical quantity, namely the intensity of the
spectral line. Turning next to Eq. (2.34g) in Chapter 2, we see that because $S(|\sigma|)$ is a real and
even function, the cosine integral on the right-hand side of Eq. (1.21b) is one half of the Fourier
transform of $S$ [if we specify that parameter $\sigma$ in (1.21b) corresponds to variable $t$ in (2.34g) and
that parameter $\chi$ in (1.21b) corresponds to variable $f$ in (2.34g)]. Anticipating the material in
Chapter 2 one last time, we consult Eq. (2.35k) and note that if the $n$th derivative of $S$ has a
well-defined Fourier transform, then for large values of its argument the Fourier transform of $S$
approaches zero as the inverse $n$th power of the absolute value of its argument. Since $S$ describes a
spectral line, that is, a natural phenomenon, we expect it to have derivatives of all orders and
also expect those derivatives to have Fourier transforms. The argument of the Fourier transform
of $S$ is $\chi$, and we already know that the right-hand side of (1.21b) is half the Fourier transform of
$S$, so we can now conclude that

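
$$\int_0^\infty S(\sigma)\cos\!\left(2\pi\sigma\chi\right) d\sigma = \int_0^\infty S(|\sigma|)\cos\!\left(2\pi\sigma\chi\right) d\sigma \cong O\!\left(|\chi|^{-n}\right) \tag{1.21c}$$

for positive values of $n$ as $\chi \to \infty$. Applying this to Eq. (1.21a) shows that

$$I^{(cb)} = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma + O\!\left(|\chi|^{-n}\right) \tag{1.21d}$$

for large values of $\chi$. Hence, as the moving mirror gets further and further from its ZPD location,
increasing the value of $\chi = 2p$, the value of $I^{(cb)}$ eventually stops changing and approaches the
constant value

$$\lim_{\chi\to\infty} I^{(cb)} = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma. \tag{1.21e}$$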

This happens for all types of intensity curves, not just those associated with spectral lines. If S
does represent a spectral line such as the one in Fig. 1.18, the brights and nulls associated with
the dashed lines in Fig. 1.17 eventually fade away. Consequently, no matter how the moving
mirror is tilted, no fringes can be seen. If the Michelson interferometer is being used as a ruler,
the fringe counting must stop. When the spectral line is a closely spaced multiplet, each line in
the group has a finite spectral width, ensuring that—no matter how the lines interact with each
other to form bright and dim regions in the overall fringe pattern—eventually any and all fringe
traces must disappear. Every spectral line found in nature produces light having some finite
spectral width, no matter how small, so this sort of fade-out is a universal phenomenon.
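
The fade-out can be seen numerically for a simple model line. The sketch below assumes a Gaussian $S(\sigma)$; the line center, the width, the integration limits, and the function names are all illustrative assumptions. For such a line, far from $\sigma = 0$, the cosine integral has the standard closed form $\sqrt{2\pi}\,w\,e^{-2\pi^2 w^2\chi^2}\cos(2\pi\sigma_0\chi)$, so the fringe envelope falls off as a Gaussian in $\chi$. The path differences sampled are chosen to sit on fringe maxima, where the integral equals the envelope.

```python
import math

SIGMA0 = 1.0e6   # assumed line center in 1/m (about 1 micron wavelength)
WIDTH = 1.0e3    # assumed Gaussian width of the line in 1/m

def line_shape(sigma):
    """Assumed Gaussian spectral line S(sigma)."""
    return math.exp(-0.5 * ((sigma - SIGMA0) / WIDTH) ** 2)

def cosine_integral(chi, n=4000):
    """Midpoint-rule estimate of the integral of S(sigma) * cos(2*pi*sigma*chi)."""
    lo, hi = SIGMA0 - 8 * WIDTH, SIGMA0 + 8 * WIDTH
    step = (hi - lo) / n
    total = 0.0
    for k in range(n):
        s = lo + (k + 0.5) * step
        total += line_shape(s) * math.cos(2 * math.pi * s * chi)
    return step * total

def fringe_envelope(chi):
    """Closed-form envelope of the cosine integral for the Gaussian line."""
    return math.sqrt(2 * math.pi) * WIDTH * math.exp(-2 * (math.pi * WIDTH * chi) ** 2)

for chi in (0.0, 1.0e-4, 2.0e-4, 5.0e-4, 1.0e-3):
    print(f"chi={chi:.1e} m  integral={cosine_integral(chi):+.4e}  "
          f"envelope={fringe_envelope(chi):.4e}")
```

Narrowing the assumed width stretches the envelope to larger $\chi$ but never removes the decay, which matches the statement that every finite-width line eventually loses its fringes.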

1.7 Fourier-Transform Spectrometers


In Michelson’s time there was no easy way to measure the intensity of the exit beam leaving the
interferometer, so it was not practical to measure the change in $I^{(cb)}$ as a function of $\chi = 2p$ in
order to determine the $\chi$-dependent curve,

$$\int_0^\infty S(\sigma)\cos\!\left(2\pi\sigma\chi\right) d\sigma,$$

coming from the second term on the right-hand side of Eq. (1.21a). In the previous section we
found that this curve is half the Fourier transform of $S$. This means that if the curve could be
measured, then the Fourier transform could be reversed to get the shape of the $S$ spectrum
entering the interferometer. In the 1950s, both optical detectors to measure $I^{(cb)}$ and digital
computers to reverse the Fourier transform became widely available. Spectroscopists began to
design and build spectrometers based on measuring $I^{(cb)}$ as a function of $\chi$ and then reversing the
Fourier transform to find $S$. Today, these sorts of instruments are usually called Fourier-transform
spectrometers.

FIGURE 1.18. [Figure not reproduced. Top graph: spectral intensity $I^{(0)}(f)$ versus frequency $f$, nonzero between $f_1$ and $f_2$. Bottom graph: $S(\sigma) = cI^{(0)}(c\sigma)$ versus wavenumber $\sigma$, nonzero between $\sigma_1 = f_1/c$ and $\sigma_2 = f_2/c$.]
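
To make the idea of reversing the transform concrete, here is a minimal sketch that synthesizes the $\chi$-dependent curve for an assumed Gaussian line and then recovers $S(\sigma)$ with a direct cosine sum over equally spaced path differences. Because the curve is even in $\chi$, the inversion used is $S(\sigma) = 4\int_0^\infty F(\chi)\cos(2\pi\sigma\chi)\,d\chi$, where $F$ is the cosine integral above. The line parameters, sampling step, and function names are assumptions; a practical instrument would use an FFT on measured samples rather than this slow direct sum.

```python
import math

SIGMA0, WIDTH = 50.0, 2.0   # assumed line center and width (arbitrary 1/length units)

def true_spectrum(sigma):
    """Assumed Gaussian line used to synthesize the measurement."""
    return math.exp(-0.5 * ((sigma - SIGMA0) / WIDTH) ** 2)

def interferogram(chi):
    """Closed form of the cosine integral of the Gaussian line (the chi-dependent curve)."""
    return (math.sqrt(2 * math.pi) * WIDTH
            * math.exp(-2 * (math.pi * WIDTH * chi) ** 2)
            * math.cos(2 * math.pi * SIGMA0 * chi))

def recover_spectrum(sigma, chi_max=1.0, step=0.002):
    """Reverse the cosine transform with a trapezoid sum over sampled path differences."""
    n = int(round(chi_max / step))
    total = 0.5 * interferogram(0.0)   # half weight for the end point at chi = 0
    for k in range(1, n + 1):
        chi = k * step
        total += interferogram(chi) * math.cos(2 * math.pi * sigma * chi)
    return 4.0 * step * total

for sigma in (46.0, 48.0, 50.0, 52.0, 54.0):
    print(f"sigma={sigma:5.1f}  true={true_spectrum(sigma):.6f}  "
          f"recovered={recover_spectrum(sigma):.6f}")
```

The sampling step of 0.002 was picked so that the highest recoverable wavenumber, $1/(2 \times 0.002) = 250$, lies well above the assumed line; a coarser step would alias the line to the wrong wavenumber.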
Equation (1.21a) is an idealized form of the fundamental equation of Fourier-transform
spectroscopy. It describes the intensity of the beam leaving an interferometer whenever we

1) Divide the beam into equal-amplitude secondary beams, and


2) Recombine the two secondary beams after the wavefield of one is shifted a distance $\chi$
with respect to the wavefield of the other.

Although this is exactly what happens inside a standard Michelson interferometer, Figs. 1.19(a)–
1.19(d) show that there are many other combinations of beam splitters and mirrors that divide and
recombine beams in this way.18
Figure 1.19(a) shows the first and perhaps most obvious modification. Michelson put the arms
of his interferometer at right angles to maximize the fringe shift due to the ether wind thought to
exist by 19th-century scientists. If all that is desired, however, is to divide and recombine beams,
then the two arms can be at any (reasonable) angle with respect to each other, as shown in Fig.
1.19(a). The setup in Fig. 1.19(a) may in fact have some advantages over the standard Michelson
interferometer; arranging for near-normal reflections off the beam splitter usually modifies the
polarization of the wavefields less than large-angle reflections (see Sec. 4.4 of Chapter 4 for an
explanation of polarization).
Figure 1.19(b) shows that the end mirrors can be replaced by retroreflectors like corner cubes
or cat’s-eyes. For best results, both arms should have the same type of retroreflector.
The discussion following Eq. (1.17d) above explains the difference between the balanced and
unbalanced optical outputs leaving the standard Michelson interferometer. In Figs. 1.19(a) and
1.19(b), the unbalanced output cannot be detected because it goes back out along the entrance
beam, making it impossible to separate the two. The interferometer in Fig. 1.19(c), however,
shows that there are ways to keep the entrance beam separate from the unbalanced output, giving
us access to both the balanced and unbalanced optical signals. According to Eqs. (1.19f) and
(1.20c), if $I^{(cb)}$ is the intensity of the balanced output and $I^{(cu)}$ is the intensity of the unbalanced
output, then

$$I^{(cb)} - I^{(cu)} = W\int_0^\infty S(\sigma)\cos\!\left(2\pi\sigma\chi\right) d\sigma \tag{1.22a}$$

and

$$I^{(cb)} + I^{(cu)} = \int_0^\infty S(\sigma)\,d\sigma. \tag{1.22b}$$

18
To keep things simple, compensation plates and other secondary optical components have been omitted.

Equation (1.22a) shows that subtracting the output of the detectors measuring the balanced and
unbalanced signals eliminates the constant term and doubles the size of the signal component
containing the Fourier transform. Adding the detectors’ outputs in Eq. (1.22b) eliminates the
Fourier transform, producing the integrated spectral intensity of the entrance beam. This
integrated source intensity should, of course, remain constant during a spectral measurement
because Fourier-transform spectrometers are vulnerable to source fluctuations. Astronomers often
design their Fourier-transform spectrometers so that both the balanced and unbalanced outputs
are available. When they investigate the spectra of weak and fluctuating sources (such as
twinkling stars), these instruments allow them both to double the signal from—and to check the
constancy of—the radiances being measured. If the source fluctuates, formula (1.22b) can be
used to measure the fluctuation. Sometimes this allows the astronomer to rescale the Fourier
signal in (1.22a) to correct the spectral measurement.
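
One way such a rescaling could work is sketched below. The assumptions are loud ones: the source fluctuation is modeled as a single gain factor that multiplies the whole spectrum, the correction is taken to be the ratio of the difference (1.22a) to the sum (1.22b), and the cosine term is an invented stand-in for the cosine integral of $S$. None of these choices comes from the text; they only illustrate why having both outputs helps.

```python
import math

TOTAL = 1.0   # assumed integrated spectral intensity, the integral of S
W = -1        # Michelson-type beam splitter

def cosine_term(chi):
    """Assumed stand-in for the cosine integral of S (equals TOTAL at chi = 0)."""
    return math.exp(-(chi / 0.5) ** 2) * math.cos(2 * math.pi * 10.0 * chi)

def detector_outputs(chi, gain):
    """Balanced and unbalanced outputs, Eqs. (1.19f) and (1.20c), scaled by a source gain."""
    balanced = 0.5 * gain * (TOTAL + W * cosine_term(chi))
    unbalanced = 0.5 * gain * (TOTAL - W * cosine_term(chi))
    return balanced, unbalanced

def corrected_signal(balanced, unbalanced):
    """Difference (1.22a) divided by the sum (1.22b), which cancels a common gain."""
    return (balanced - unbalanced) / (balanced + unbalanced)

for gain in (0.7, 1.0, 1.3):
    b, u = detector_outputs(0.13, gain)
    print(f"gain={gain:.1f}  difference={b - u:+.4f}  sum={b + u:.4f}  "
          f"corrected={corrected_signal(b, u):+.4f}")
```

In this model the sum tracks the gain directly while the corrected ratio stays fixed, so a fluctuation that would distort the raw difference signal drops out.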
In a standard Michelson interferometer such as the one shown in Fig. 1.1(b), and in the setups
shown in Figs. 1.19(a)–1.19(c), the wavefield of one recombining beam is displaced a distance $\chi$
with respect to the wavefield of the other whenever the moving mirror or corner cube is displaced
from ZPD by a distance $\chi/2$. In Fig. 1.19(d), however, the corner cube only has to move a
distance $\chi/4$ to displace one wavefield by $\chi$ with respect to the other. Equation (5.67) in Chapter 5
shows that larger values of $\chi$ lead to more detailed spectral measurements in standard Michelson
interferometers, and the same holds true for the nonstandard interferometers discussed here. In
particular, a setup such as the one shown in Fig. 1.19(d) lets us achieve larger $\chi$ values with
smaller displacements of the corner cube. The moving corner cube is also, strictly speaking, no
longer the retroreflector; plane mirrors in both arms are used to reverse the beam directions.
During the 1950s, it was established that Fourier-transform spectrometers had two basic
advantages, often called the Jacquinot advantage and the Fellgett advantage, over contemporary
types of prism-based and grating-based spectrometers.19 These advantages meant that under
many circumstances spectra measured by Fourier-transform spectrometers had a better
signal-to-noise ratio than spectra measured by equivalent prism-based or grating-based instruments.
With the popularization of fast Fourier transform (FFT) algorithms in the 1960s, Fourier-transform
spectrometers soon established themselves as usually the first and best choice for measuring infrared
spectra (electromagnetic radiation having wavelengths between 1 and 100 μm). The growing availability
of personal and desktop computers in the late 1970s and 1980s made Fourier-transform systems
more compact, powerful, and user-friendly. Over the past two decades, there has been a tendency
to use standard Michelson configurations, such as those in Figs. 1.1(b) or 1.19(a), when
designing the optics of Fourier-transform spectrometers. Standard Michelsons are well suited to
the laser-based servo controls often used to maintain the alignment of the fixed and moving
mirrors.

19
J. Chamberlain, The Principles of Interferometric Spectroscopy, p. 16.

FIGURE 1.19(a). [Figure not reproduced. Labels: entrance beam, beam splitter, fixed mirror, moving mirror displaced by $p = \chi/2$, output to balanced signal detector.]

FIGURE 1.19(b). [Figure not reproduced. Labels: entrance beam, beam splitter, fixed corner cube, moving corner cube displaced by $p = \chi/2$, output to balanced signal detector.]

FIGURE 1.19(c). [Figure not reproduced. Labels: entrance beam, beam splitter, fixed corner cube, displacement $p = \chi/2$, outputs to unbalanced signal detector and to balanced signal detector.]

FIGURE 1.19(d). [Figure not reproduced. Labels: entrance beam, beam splitter, fixed mirror, moving corner cube displaced by $p = \chi/4$, output to balanced signal detector.]

1.8 Laser-Based Control Systems


Today’s Fourier-transform spectrometers often rely on laser-based servo systems to maintain
alignment and control the motion of the moving mirror. The average wavelength of the measured
spectra determines the standards of alignment and control required for good spectral
measurement. Systems designed to measure infrared spectra typically have lasers that work in the
visible. Not only do modest standards of alignment and control in the visible correspond to
extremely accurate standards of alignment and control in the infrared—because visible
wavelengths are much shorter than infrared wavelengths—but the infrared detectors responsible
for the spectral measurements are also easily shielded from stray laser light. The laser servo
systems follow many different designs. Figures 1.20(a) and 1.20(b) show a typical setup that may
not be exactly like any system now in use but that does present the basic ideas behind them.
In Fig. 1.20(a), a single laser beam is separated into beams A, B, and C by laser-beam
splitters. Separating one beam into three ensures that all three beams have the same wavelength.
The three beams enter the interferometer parallel to, and at the edges of, the entrance beam.
Figure 1.20(b) shows the path of beams A and B through the instrument; beam C is not shown
because it is out of the plane of the page, but it is assumed to follow a path similar to beams A
and B. The solid lines representing the laser beams are always parallel to the dotted lines showing
the path of the entrance beam through the interferometer; and the laser beams interact with the
interferometer’s beam splitter, fixed mirror, and moving mirror exactly the same way the
entrance beam does. Because all three laser beams are monochromatic wavetrains of wavelength
λ, the same reasoning used to produce Fig. 1.17 shows that we can draw a sequence of dashed
lines perpendicular to the laser beams to represent the moving-mirror positions where the laser
beams would form fringes. Just as in Fig. 1.17, each dashed line is separated from its two
nearest neighbors by λ/2. Taking the dashed lines to represent nulls, we note that if the moving
mirror has a slight tilt, as shown in Fig. 1.20(b), then the laser detector for beam B will see a near
null in the beam B fringe while the laser detector for beam A will see a near bright in the beam A
fringe. If the moving mirror is aligned in the plane of Fig. 1.20(b) but has a small out-of-plane
tilt, then the laser detector for beam C is sure to see a different fringe brightness than the laser
detectors for beams A and B. The three laser detectors send their signals to a servomechanism
that readjusts the mirror tilt until all three detectors see the same fringe intensity, keeping the
interferometer aligned while the moving mirror changes position. Often these servomechanisms
readjust the tilt of the fixed mirror instead of directly correcting the moving mirror’s tilt. It is not
difficult to design systems of this sort that can detect changes of λ/100 in the position of the
moving mirror’s surface. The A, B, and C laser detectors can also be used to count fringes as the
moving mirror changes position, keeping a record of where the moving mirror is and how fast it
is moving. This information is almost always used to sample the interferometer’s output signal at
equally spaced positions of the moving mirror, and it is often sent to a servomechanism
responsible for producing steady motion in the moving mirror.
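
The bookkeeping behind fringe counting and tilt sensing can be sketched in a few lines. The helper names below are hypothetical, the He-Ne wavelength of 632.8 nm is an assumed example of a visible laser, and the tilt estimate is a small-angle approximation that treats the fringe phase as $4\pi p/\lambda$ for a mirror position $p$; none of these details is specified in the text.

```python
import math

LASER_WAVELENGTH = 632.8e-9  # meters; a He-Ne line, assumed for illustration

def mirror_displacement(fringe_count):
    """Mirror travel for a given number of laser fringes (one fringe per lambda/2)."""
    return fringe_count * LASER_WAVELENGTH / 2.0

def path_difference(fringe_count):
    """Path difference chi = 2p corresponding to the same fringe count."""
    return 2.0 * mirror_displacement(fringe_count)

def tilt_angle(phase_a, phase_b, beam_separation):
    """Small-angle tilt (radians) from the fringe-phase difference of two laser beams.

    A phase difference dphi corresponds to a surface-height difference of
    dphi * lambda / (4 * pi) between the two laser spots on the mirror.
    """
    height_difference = (phase_a - phase_b) * LASER_WAVELENGTH / (4.0 * math.pi)
    return height_difference / beam_separation

# Example: 1000 counted fringes, and a lambda/100 height difference between
# beams A and B assumed to be 2 cm apart on the mirror.
print(mirror_displacement(1000), path_difference(1000))
print(tilt_angle(4 * math.pi / 100, 0.0, 0.02))
```

Sampling the infrared signal once per laser fringe then gives samples equally spaced by one laser wavelength in $\chi$, which is the equal-spacing property mentioned above.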

___________

Chapters 2 and 3 spell out the mathematical ideas needed to analyze the performance of
Fourier-transform spectrometers, and they also establish the notation used to describe these ideas
in subsequent chapters. Readers who are already familiar with Fourier theory and random
functions can skip ahead to Chapter 4, returning to Chapters 2 and 3 as needed to refresh their
understanding. Chapter 4 starts with Maxwell’s equations, working with them to derive the
nonideal versions of Eqs. (1.19f) and (1.20c) needed to understand both the nonrandom and
random sources of error in Fourier-transform spectrometers. We always assume a standard
Michelson configuration, such as the ones shown in Fig. 1.1(b) or 1.19(a), controlled by laser-
based metrology and alignment systems similar to the ones shown in Figs. 1.20(a) and 1.20(b).
These are arguably the most common type of Fourier-transform spectrometer in use today. Most
of the basic ideas applied here to these standard Michelson systems are also relevant to other
types of Fourier-transform spectrometers; anyone who reads and understands the analysis
presented in Chapters 4 through 8 will be able to modify the equations presented there so that
they apply to nonstandard Michelson configurations. One possible exception to this rule is the
class of Michelsons, such as the one shown in Fig. 1.19(b), that use nonstandard retroreflectors to return
the split entrance beam to the beam splitter. These sorts of systems, which are outside the scope
of this book, are spared many forms of the “tilt” misalignment possible in a standard Michelson,
which is an advantage, but on the other hand exhibit shear types of misalignments, which
standard Michelsons do not have. The equations governing shear misalignment turn out to be
similar to those for tilt misalignment, but it does not necessarily make sense to analyze them as a
source of random error, the way tilt is analyzed in Chapter 7.

FIGURE 1.20(a). [Figure not reproduced. Labels: laser, laser beam splitters, beams A, B, and C, entrance beam, interferometer beam splitter.]

FIGURE 1.20(b). [Figure not reproduced. Labels: laser, laser beam splitters, beams A, B, and C, entrance beam, interferometer beam splitter, fixed mirror, moving mirror, laser fringe positions, outputs to laser detector A, laser detector B, and the infrared detector.]

- 61 -
2
FOURIER THEORY
Many single-chapter introductions to Fourier theory follow a top-down approach, defining what a
Fourier transform is and then listing the mathematical consequences. Here, on the other hand, we
begin with more of a bottom-up approach, seeking not only to present the mathematical
formalism of Fourier transforms but also to give an intuitive feel for how they work and what
they mean. Once the basic idea is established, we need to know which data sequences and
functions have well-defined Fourier transforms. This topic is often scanted because Fourier
theory is notorious for providing no simple mathematical answers to this simple mathematical
question. Indeed, engineers, scientists, and applied mathematicians have a long tradition of using
Fourier transforms in mathematically improper—yet extremely useful—ways that usually give
the correct answer. To show why these techniques work, and also when they cannot be trusted,
there is a brief sketch of generalized function theory. This is followed by a discussion of the
Fourier series and the discrete Fourier transform, including an exact description of how they are
connected to the integral Fourier transform. The discrete Fourier transform is particularly
important because, almost without exception, the only type of Fourier transform calculated on
today’s computers is the discrete Fourier transform; without it, the Michelson interferometer
would be a much more limited instrument. The chapter then concludes with a brief discussion of
how Fourier transforms are applied to two-dimensional and three-dimensional functions.

2.1 Basic Concept of a Fourier Transform


The idea of a Fourier transform develops naturally from a simple idea for comparing the shape of
two sequences of measurements. A sequence of measurements is really just a list of numbers, so
when we compare sequences of measurements we compare the shapes of number lists graphed in
the order of their measurement. We can suppose without any loss of generality that two lists, $u_k$
and $v_k$, have the same number of members with $k = 1, 2, \ldots, N$. Figures 2.1(a) and 2.1(b) show
two lists $u_k$ and $v_k$ graphed against their index value $k$. Defining $\bar{u}$ and $\bar{v}$ to be the mean values
of $u_k$ and $v_k$,

$$\bar{u} = \frac{1}{N} \sum_{k=1}^{N} u_k \tag{2.1a}$$

and

$$\bar{v} = \frac{1}{N} \sum_{k=1}^{N} v_k , \tag{2.1b}$$


FIGURE 2.1(a). [Plot of list $u_k$ against increasing index $k$.]

FIGURE 2.1(b). [Plot of list $v_k$ against increasing index $k$.]


we form the sum S of the products of the differences from the mean,

$$S = \sum_{k=1}^{N} (u_k - \bar{u})(v_k - \bar{v}) . \tag{2.2}$$

If the graphs of $u_k$ and $v_k$ have similar shapes, so that $u_k - \bar{u} \approx v_k - \bar{v}$ for most values of $k$,
then $(u_k - \bar{u})$ and $(v_k - \bar{v})$ are very likely to have the same sign for most values of $k$. This means
few terms in the sum are negative and $S$ ends up being a large positive number. If $u_k$ and $v_k$ have
little similarity in shape, then $(u_k - \bar{u})$ and $(v_k - \bar{v})$ are as likely to have opposite signs as the
same sign and the terms in the sum are just as likely to be positive as they are to be negative.
When this happens, S is a sum of terms that tend to cancel out, and the magnitude of S is likely to
be small.
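
The behavior of the sum S in Eq. (2.2) can be illustrated with a minimal sketch. The function name `shape_similarity` and the three short lists are made up for this illustration; they are not data from the text.

```python
def shape_similarity(u, v):
    """Return S = sum_k (u_k - mean(u)) * (v_k - mean(v)), as in Eq. (2.2)."""
    if len(u) != len(v):
        raise ValueError("lists must have the same number of members")
    n = len(u)
    u_mean = sum(u) / n
    v_mean = sum(v) / n
    return sum((uk - u_mean) * (vk - v_mean) for uk, vk in zip(u, v))


# Hypothetical lists: v_similar has the same shape as u, v_unrelated does not.
u = [1.0, 3.0, 2.0, 5.0, 4.0]
v_similar = [2.0, 6.0, 4.0, 10.0, 8.0]
v_unrelated = [2.0, 3.0, 5.0, 3.0, 2.0]

s_similar = shape_similarity(u, v_similar)      # all terms non-negative
s_unrelated = shape_similarity(u, v_unrelated)  # terms of mixed sign cancel
print(s_similar, s_unrelated)
```

For these particular lists the terms of the first sum all have the same sign, while the terms of the second sum largely cancel, so the magnitude of the second result is much smaller than the first.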
The same basic idea can be applied to continuous functions u(t) and v(t). To create a formal
correspondence between functions and lists, we define an interval $\Delta t$ in $t$ and match $u_k$ and $v_k$ to
u(t) and v(t) with the equations

$$u(k\,\Delta t) = \frac{u_k}{\Delta t}$$

and

$$v(k\,\Delta t) = v_k .$$

Because u and v are continuous functions of time, we can assume that they vary in an
unsurprising manner between the isolated points at $\Delta t, 2\Delta t, \ldots, N\Delta t$ at which they have been
specified. Traditionally, the argument of functions u and v is called t and assumed to be time, but
it is worth remembering that t can stand for any relevant physical parameter, such as length,
voltage, current, etc. Now we can approximate Eq. (2.2) as

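$$S \cong \int_{\Delta t}^{N\Delta t} \bigl(u(t) - \bar{u}\bigr)\bigl(v(t) - \bar{v}\bigr)\,dt , \tag{2.3a}$$

where now

$$\bar{u} = \frac{1}{N\Delta t} \int_{\Delta t}^{N\Delta t} u(t)\,dt \tag{2.3b}$$

and

$$\bar{v} = \frac{1}{N\Delta t} \int_{\Delta t}^{N\Delta t} v(t)\,dt . \tag{2.3c}$$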

Equations (2.3b) and (2.3c) just ensure that $\bar{u}$ and $\bar{v}$ are now the average values of u(t) and
v(t) respectively. We note that the value of $\bar{u}$ has been redefined from what it was in Eq. (2.1a)
above,

$$\bar{u}_{\text{new}} = \bar{u}_{\text{old}} / \Delta t ,$$

whereas $\bar{v}$ has basically the same value as in Eq. (2.1b); the only change is to replace the sum
by the equivalent integral. At this point, the finite value of $\Delta t$ is just a distraction, because it is the
shapes of the continuous functions u(t) and v(t) that are being compared. Taking the limit as
$\Delta t \to 0$ and $N \to \infty$ in such a way that

$$\lim_{\substack{\Delta t \to 0 \\ N \to \infty}} N\,\Delta t = T_{\max} = \text{constant} , \tag{2.4a}$$

we get
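$$S = \int_{0}^{T_{\max}} \bigl(u(t) - \bar{u}\bigr)\bigl(v(t) - \bar{v}\bigr)\,dt , \tag{2.4b}$$

where

$$\bar{u} = \frac{1}{T_{\max}} \int_{0}^{T_{\max}} u(t)\,dt \tag{2.4c}$$

and

$$\bar{v} = \frac{1}{T_{\max}} \int_{0}^{T_{\max}} v(t)\,dt . \tag{2.4d}$$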

We still expect S to be large when functions u and v have similar shapes and S to be small when
they have dissimilar shapes.
Equation (2.4b) can be written as

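$$\begin{aligned}
S &= \int_{0}^{T_{\max}} \bigl(u(t) - \bar{u}\bigr)\,v(t)\,dt - \bar{v} \int_{0}^{T_{\max}} \bigl(u(t) - \bar{u}\bigr)\,dt \\
  &= \int_{0}^{T_{\max}} \bigl(u(t) - \bar{u}\bigr)\,v(t)\,dt - \bar{v} \cdot \left[ \int_{0}^{T_{\max}} u(t)\,dt - \bar{u} \cdot T_{\max} \right] \\
  &= \int_{0}^{T_{\max}} u(t)\,v(t)\,dt - \bar{u} \int_{0}^{T_{\max}} v(t)\,dt - \bar{v} \cdot \left[ \int_{0}^{T_{\max}} u(t)\,dt - \bar{u} \cdot T_{\max} \right] \\
  &= \int_{0}^{T_{\max}} u(t)\,v(t)\,dt - \bar{u} \cdot \bar{v} \cdot T_{\max} ,
\end{aligned} \tag{2.5}$$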

where in the last step (2.4c) ensures that the term in the square brackets [ ] is zero and (2.4d) is
used to replace the integral over $v$ by $\bar{v}\,T_{\max}$. To get to Fourier theory from Eq. (2.5), we suppose
v(t) to be an oscillatory function like $\sin(2\pi f t)$ or $\cos(2\pi f t)$ with $f > 0$. This makes function u
the data; that is, the value of our measurement at time t is u(t). Equation (2.4d) then reveals,
depending on whether we choose v to be a sine curve or a cosine curve, that

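$$\bar{v}\,T_{\max} = \int_{0}^{T_{\max}} \sin(2\pi f t)\,dt = \frac{1}{2\pi f} \cdot \bigl[ 1 - \cos(2\pi f T_{\max}) \bigr] \tag{2.6a}$$

or

$$\bar{v}\,T_{\max} = \int_{0}^{T_{\max}} \cos(2\pi f t)\,dt = \frac{1}{2\pi f} \cdot \sin(2\pi f T_{\max}) . \tag{2.6b}$$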

When v is a sine curve, $\bar{v}\,T_{\max}$ oscillates between $1/(\pi f)$ and 0 as $T_{\max}$ increases; and when v
is a cosine curve, $\bar{v}\,T_{\max}$ oscillates between $-1/(2\pi f)$ and $1/(2\pi f)$ as $T_{\max}$ increases. Keeping in
mind that u(t) represents a function measured in a laboratory, if we want to compare the shape of
u to either $\sin(2\pi f t)$ or $\cos(2\pi f t)$, common sense requires $T_{\max}$, the range of t over which data is
gathered, to be much greater than $1/f$, the period of the sine or cosine curve to which we want to
compare the data. Unless u entirely lacks a resemblance to the sine or cosine so that

$$\int_{0}^{T_{\max}} u(t)\,v(t)\,dt \cong 0$$

no matter how large u or $T_{\max}$ become, we expect

$$\int_{0}^{T_{\max}} u(t)\,v(t)\,dt$$

to be large when the u measurements are large, and small when the u measurements are small,
and the integral’s magnitude should also increase as $T_{\max}$ increases. So when u represents a
typical set of data that is not completely unlike v in shape, then

$$\int_{0}^{T_{\max}} u(t)\,v(t)\,dt = O(\bar{u}\,T_{\max})$$

or

$$\frac{1}{\bar{u}} \int_{0}^{T_{\max}} u(t)\,v(t)\,dt = O(T_{\max}) .$$

Equations (2.6a) and (2.6b) show that $\bar{v}\,T_{\max}$ must remain somewhere between the two values
$1/(\pi f)$ and $-1/(2\pi f)$ no matter how large $T_{\max}$ gets, which means

$$\bar{v}\,T_{\max} = O(f^{-1}) .$$

Having already concluded that $T_{\max}$ has been chosen much larger than $1/f$, we expect

$$\frac{1}{\bar{u}} \int_{0}^{T_{\max}} u(t)\,v(t)\,dt = O(T_{\max}) \gg O(f^{-1}) = \bar{v}\,T_{\max} ,$$

which, of course, reduces to

$$\frac{1}{\bar{u}} \int_{0}^{T_{\max}} u(t)\,v(t)\,dt \gg \bar{v}\,T_{\max} .$$

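Therefore, Eq. (2.5) can be approximated as

$$S = \bar{u} \cdot \left[ \frac{1}{\bar{u}} \int_{0}^{T_{\max}} u(t)\,v(t)\,dt - \bar{v} \cdot T_{\max} \right] \cong \bar{u} \cdot \frac{1}{\bar{u}} \int_{0}^{T_{\max}} u(t)\,v(t)\,dt = \int_{0}^{T_{\max}} u(t)\,v(t)\,dt . \tag{2.7}$$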

The integral in (2.7) can be regarded as assigning the number S to the similarity in shape of u and
v, when v is a sine or cosine curve of frequency ƒ. Remembering where S came from, we realize
that this number is large when u and v have similar shapes and small when u and v have
dissimilar shapes.
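
As a rough numerical illustration of Eq. (2.7), the sketch below estimates the integral of u(t)v(t) with a midpoint rule. The test signal (a 5-cycle-per-unit cosine), the value of T_max, and the name `similarity_integral` are assumptions chosen for the example, not quantities taken from the text.

```python
import math


def similarity_integral(u, f, t_max, n=20000, kind="cos"):
    """Midpoint-rule estimate of the integral of u(t)*v(t) over 0..t_max,
    where v(t) is cos(2*pi*f*t) or sin(2*pi*f*t), as in Eq. (2.7)."""
    wave = math.cos if kind == "cos" else math.sin
    dt = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += u(t) * wave(2.0 * math.pi * f * t)
    return total * dt


def u(t):
    # Hypothetical "measurement": a pure cosine at 5 cycles per unit of t.
    return math.cos(2.0 * math.pi * 5.0 * t)


T_MAX = 2.0  # much longer than the period 1/f = 0.2
s_match = similarity_integral(u, 5.0, T_MAX)               # v has the same shape as u
s_miss = similarity_integral(u, 8.0, T_MAX)                # v at a different frequency
s_sine = similarity_integral(u, 5.0, T_MAX, kind="sin")    # same frequency, sine instead of cosine
print(s_match, s_miss, s_sine)
```

With these choices S is close to T_max/2 when v matches u and close to zero otherwise, which is the large-versus-small behavior described above.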

2.2 Fourier Sine and Cosine Transforms


To make the ideas of the previous section mathematically rigorous, we define the Fourier sine
transform of function u to be

$$\mathcal{S}^{(ft)}\bigl(u(t)\bigr) = 2 \int_{0}^{\infty} u(t) \sin(2\pi f t)\,dt \tag{2.8a}$$


and the Fourier cosine transform of u to be

$$\mathcal{C}^{(ft)}\bigl(u(t)\bigr) = 2 \int_{0}^{\infty} u(t) \cos(2\pi f t)\,dt . \tag{2.8b}$$

The notation $\mathcal{S}^{(ft)}\bigl(u(t)\bigr)$ and $\mathcal{C}^{(ft)}\bigl(u(t)\bigr)$ shows that the function u(t) is being multiplied by,
respectively, the sine or cosine function having, as indicated by the superscript, an argument $ft$
multiplied by $2\pi$. The order of the $ft$ product in the superscript does not matter because it does
not matter in the arguments of the sine and cosine, so

$$\mathcal{S}^{(ft)}\bigl(u(t)\bigr) = \mathcal{S}^{(tf)}\bigl(u(t)\bigr) \quad \text{and} \quad \mathcal{C}^{(ft)}\bigl(u(t)\bigr) = \mathcal{C}^{(tf)}\bigl(u(t)\bigr) .$$

In particular we know, because t is repeated in both u(t) and the superscript of $\mathcal{S}$ and $\mathcal{C}$, that t is
the dummy variable of integration whereas $f$, which is only contained in the superscript, is an
independent parameter. This means the transforms $\mathcal{S}^{(ft)}\bigl(u(t)\bigr)$ and $\mathcal{C}^{(ft)}\bigl(u(t)\bigr)$ are themselves
functions of the parameter $f$,

$$U_{\mathcal{S}}(f) = 2 \int_{0}^{\infty} u(t) \sin(2\pi f t)\,dt \tag{2.8c}$$

and

$$U_{\mathcal{C}}(f) = 2 \int_{0}^{\infty} u(t) \cos(2\pi f t)\,dt . \tag{2.8d}$$

The “capital U” names of functions $U_{\mathcal{S}}$ and $U_{\mathcal{C}}$ show that they are mathematically associated
with the original function u(t), created from u(t) by the integrals in (2.8c) and (2.8d).
Although the upper limit of integration is now $\infty$ in Eqs. (2.8a) and (2.8b), this should not be
interpreted as taking the limit as $T_{\max} \to \infty$ in Eq. (2.7). The upper limit is put at $\infty$ just to
eliminate $T_{\max}$ as an explicit parameter, and the idea behind the presence of $T_{\max}$ (that u(t)
represents the result of a measurement) is kept alive by placing restrictions on the type of
function u can be. In particular, we expect u(t), in some sense, to diminish or get small as t gets
large, because it is impossible to measure data for all the times t out to $\infty$. It turns out that when
the right sorts of restrictions are placed on u, the Fourier sine and cosine transforms can be
inverted to recover the original functions,

$$u(t) = 2 \int_{0}^{\infty} U_{\mathcal{S}}(f) \sin(2\pi f t)\,df \tag{2.8e}$$


and

$$u(t) = 2 \int_{0}^{\infty} U_{\mathcal{C}}(f) \cos(2\pi f t)\,df \tag{2.8f}$$

for $t > 0$.
If we adopt the strictest definition of what is meant by the integral of a function between 0 and
$\infty$, then Eqs. (2.8a)–(2.8f) are true when function u(t) satisfies the following four requirements:

(I) It is absolutely integrable.


(II) It is continuous except for a finite number of jump discontinuities.
(III) It is bounded on any finite interval $0 < a < t < b < \infty$.
(IV) It has finite variation on any finite interval $0 < a < t < b < \infty$.

We now show why function u(t) naturally satisfies all these restrictions when it represents a
(possibly idealized) measurement controlled or described by a continuous parameter t.
No matter what the argument t of function u represents—time, voltage, energy, etc.—function
u(t) can only be measured over a finite range of t. Although there may be no reason to think u is
zero or negligible when measured outside this range, we obviously cannot “make up” values for
what it might be. If we extrapolate to get the unmeasured t values, the extrapolation should not
dominate the information contained in u. In general, the measurement should be carried out in
such a way that the unmeasured or extrapolated values are of negligible importance compared to
the measured values. Mathematically we might say that there exists a positive, finite value of t,
which we call $T_{\max}$, such that the important measured values of u are all at $t \le T_{\max}$. One way of
expressing this constraint is to require

$$\int_{0}^{T_{\max}} |u(t)|\,dt \cong \int_{0}^{\infty} |u(t)|\,dt . \tag{2.9a}$$

Since the left-hand integral ought to be finite, when (2.9a) is true, it follows that

$$\int_{0}^{\infty} |u(t)|\,dt < \infty . \tag{2.9b}$$

Functions u that satisfy (2.9b) are said to be absolutely integrable; clearly, all functions
representing possible measurements share this quality, satisfying requirement (I) above.
Understanding requirement (II) requires some discussion of what it means to call an
experimental measurement continuous. To assign, with negligible experimental error, a definite
value of t to a measurement u, some minimum and finite change in t must occur between adjacent
measurements. In practice, continuous measurements are constructed by connecting sequences of


adjacent but separate points. We then assume that if u were measured between these already
known points, it would equal (to within experimental error) the values selected by connecting the
points. Thus, the continuity of u is a requirement that the measurement captures all the relevant
detail. In this sense, asserting that u is continuous is a type of idealization—just another way of
saying that the measurement is accurate and representative. This takes care of the first part of
requirement (II), but there is a second part permitting u to have a finite number of jump
discontinuities. Figure 2.2 shows a jump discontinuity in u(t). Jump discontinuities represent
another type of idealization—what can occur when, for example, instruments are turned on or off
during a measurement. Because it is unrealistic to have this happen an infinite number of times
over a finite range of t, it makes sense to say that all functions u representing measurements are
continuous over any finite range of t except for a finite number of jump discontinuities.
Consequently, we can expect all functions representing measurements to satisfy requirement (II).
Standard proofs that the Fourier transform of the Fourier transform returns the original
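function u usually end up showing as their final step that

$$2 \int_{0}^{\infty} U_{\mathcal{S}}(f) \sin(2\pi f t)\,df = \lim_{\varepsilon \to 0} \frac{1}{2} \bigl[ u(t + \varepsilon) + u(t - \varepsilon) \bigr] \tag{2.9c}$$

and

$$2 \int_{0}^{\infty} U_{\mathcal{C}}(f) \cos(2\pi f t)\,df = \lim_{\varepsilon \to 0} \frac{1}{2} \bigl[ u(t + \varepsilon) + u(t - \varepsilon) \bigr] . \tag{2.9d}$$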

When u is continuous, this immediately reduces to the desired result, but when the integrals are
evaluated at a jump discontinuity, such as at $t = t_0$ in Fig. 2.2, the limits on the right-hand side of
(2.9c) and (2.9d) give u a value at the jump discontinuity that is probably different from the
original value of u at the jump discontinuity. To keep this from happening, we define the value of
u to be, for all values $t = t_{\text{jump}}$ marking the location of a jump discontinuity,

$$u(t_{\text{jump}}) = \lim_{\varepsilon \to 0} \frac{1}{2} \bigl[ u(t_{\text{jump}} + \varepsilon) + u(t_{\text{jump}} - \varepsilon) \bigr] . \tag{2.9e}$$

Modifying u this way cannot change the value of any integral whose integrand is the product of u
with another smooth function. The sine and cosine are smooth functions, so using (2.9e) to
modify the value of u at jump discontinuities does not change the values of the sine or cosine
transforms.
Measurements must be done with physically realizable equipment, which necessarily
produces finite values of u. This means there always exists a finite real number $B < \infty$ such that

$$|u(t)| < B \tag{2.9f}$$

over any finite interval $0 < a < t < b < \infty$ when function u represents a measurement. Functions
obeying this inequality are called bounded functions, so functions representing measurements
always satisfy requirement (III).

FIGURE 2.2. [Plot of $u(t)$ against $t$ showing a jump discontinuity at $t = t_0$.]

Requirement (IV) is a little bit more complicated to explain. Any function u(t) can be written
as the difference of two other functions $u_1(t)$ and $u_2(t)$, as shown in Figs. 2.3(a) and 2.3(b),

$$u(t) = u_1(t) - u_2(t) . \tag{2.9g}$$

In Fig. 2.3(a), function u is drawn with a continuous line where it is increasing and with a dashed
line where it is decreasing. In Fig. 2.3(b), we see that functions $u_1$ and $u_2$ are constructed so that
every time u increases, $u_1$ also increases while $u_2$ remains the same, and every time u decreases,
$u_2$ increases while $u_1$ remains the same. Consequently, for any function u and time values $b \ge a$,
the differences $u_1(b) - u_1(a)$ and $u_2(b) - u_2(a)$ are non-negative and can only increase, which
means that their sum

FIGURE 2.3(a). [Plot of $u(t)$ against $t$ between $a$ and $b$, with times $t_1$, $t_2$, $t_3$ marked.]

FIGURE 2.3(b). [Plots of $u_1(t)$ and $u_2(t)$ against $t$ between $a$ and $b$, with times $t_1$, $t_2$, $t_3$ marked.]

$$V_{ab}(u) = u_1(b) - u_1(a) + u_2(b) - u_2(a) \tag{2.9h}$$

is also non-negative. Functions $u_1$ and $u_2$ have been constructed so that every time u goes up and
down, the differences $u_1(b) - u_1(a)$ and $u_2(b) - u_2(a)$ increase, making the size of $V_{ab}(u)$ a
record of how many times u oscillates in the interval $a < t < b$. We define $V_{ab}(u)$ to be the
variation of u over the interval $a < t < b$, and if

$$V_{ab}(u) < \infty , \tag{2.9i}$$

we say that u has finite variation over the interval $a < t < b$. Requirement (IV), that u have finite
variation in any interval $0 < a < t < b < \infty$, means that u can only oscillate a finite number of
times in that interval. The function $\sin((t - 1)^{-1})$, for example, does not have finite variation over
any interval containing $t = 1$. If we attempted to measure a quantity that had infinite variation
inside a finite interval, we would be blocked by the realization, already discussed above in
connection with requirement (II), that adjacent measurements must be separated by some
minimum value of t. If the measurement were repeated over and over, it would seem as if u were
changing unpredictably in the region of infinite variation, leading us to wonder whether our
measurement reflected the same physical reality. Therefore, our measurements cannot have
infinite variation, and so any function u(t) representing a realistic measurement must also satisfy
requirement (IV).
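
A minimal numerical sketch can make the idea of variation more concrete. Here the variation is estimated from equally spaced samples as the sum of the absolute differences between adjacent samples; the helper name `sampled_variation`, the interval from 0 to 2, and the sample counts are assumptions made for this illustration only.

```python
import math


def sampled_variation(u, a, b, n):
    """Sum of |u(t_{k+1}) - u(t_k)| over n midpoint samples of the interval a..b.
    For a well-behaved u this approaches the variation of u as n grows."""
    h = (b - a) / n
    samples = [u(a + (k + 0.5) * h) for k in range(n)]
    return sum(abs(samples[k + 1] - samples[k]) for k in range(n - 1))


def wild(t):
    # The example from the text; the midpoint grid with even n never hits t = 1.
    return math.sin(1.0 / (t - 1.0))


# A function with finite variation: the estimate settles down as n increases.
tame_coarse = sampled_variation(math.sin, 0.0, 2.0, 1000)
tame_fine = sampled_variation(math.sin, 0.0, 2.0, 10000)

# A function with infinite variation near t = 1: the estimate keeps growing.
wild_counts = [sampled_variation(wild, 0.0, 2.0, n) for n in (100, 1000, 10000)]
print(tame_coarse, tame_fine, wild_counts)
```

The growing estimates for the second function mirror the measurement problem described above: no matter how closely spaced the samples are, more oscillations remain hidden between them.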
We see that requirements (I) through (IV) are always satisfied by functions representing
physically realizable measurements. It should be emphasized that requirements (I) through (IV)
are sufficient to ensure that Eqs. (2.8a)–(2.8f) hold true, but not necessary. It is easy to show that
there exist functions that do not meet requirements (I) through (IV) yet still satisfy Eqs. (2.8a)–
(2.8f). Consider, for example,

$$g(t) = \begin{cases} \pi & \text{for } 0 \le t < 1/(2\pi) \\ \pi/2 & \text{for } t = 1/(2\pi) \\ 0 & \text{for } t > 1/(2\pi) \end{cases} \tag{2.10a}$$

This test function clearly satisfies (I) through (IV) and so must have a Fourier cosine transform,

$$G_{\mathcal{C}}(f) = 2\pi \int_{0}^{(2\pi)^{-1}} \cos(2\pi f t)\,dt = \frac{\sin(f)}{f} \tag{2.10b}$$

such that we return to the original function g by taking the cosine transform of the $G_{\mathcal{C}}$ transform,


$$g(t) = 2 \int_{0}^{\infty} G_{\mathcal{C}}(f) \cos(2\pi f t)\,df = 2 \int_{0}^{\infty} \frac{\sin(f)}{f} \cos(2\pi f t)\,df . \tag{2.10c}$$
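
The closed form in Eq. (2.10b) can be spot-checked numerically. The sketch below is only an illustration: it applies a midpoint rule to the cosine transform of g(t) from Eq. (2.10a), and the function names and the three trial frequencies are arbitrary choices.

```python
import math


def g_cosine_transform(f, n=20000):
    """Midpoint-rule estimate of 2 * integral of g(t) cos(2*pi*f*t) dt,
    where g(t) = pi on 0 <= t < 1/(2*pi) and zero beyond, as in Eq. (2.10a)."""
    upper = 1.0 / (2.0 * math.pi)
    h = upper / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.pi * math.cos(2.0 * math.pi * f * t)
    return 2.0 * total * h


def closed_form(f):
    # Right-hand side of Eq. (2.10b).
    return math.sin(f) / f


frequencies = [0.5, 2.0, 7.0]
errors = [abs(g_cosine_transform(f) - closed_form(f)) for f in frequencies]
print(errors)
```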

We could, however, just as easily have started with the function

$$h(t) = \frac{\sin(t)}{t}$$

and taken its cosine transform to get

$$H_{\mathcal{C}}(f) = 2 \int_{0}^{\infty} \frac{\sin(t)}{t} \cos(2\pi f t)\,dt . \tag{2.10d}$$

The integral in (2.10d) is clearly the same as the first integral in (2.10c) with the variables ƒ and t
interchanged. Therefore,

$$H_{\mathcal{C}}(f) = g(f) = \begin{cases} \pi & \text{for } 0 \le f < 1/(2\pi) \\ \pi/2 & \text{for } f = 1/(2\pi) \\ 0 & \text{for } f > 1/(2\pi) \end{cases}$$

Hence we know that h(t) satisfies Eqs. (2.8b), (2.8d), and (2.8f)—it is both cosine transformable
and its cosine transform returns the original function when cosine transformed—exactly because
g(t) in (2.10a) satisfies Eqs. (2.8b), (2.8d), and (2.8f). Yet h(t), unlike g(t), does not satisfy
requirements (I) through (IV)—in particular, it violates requirement (I) because it is not
absolutely integrable. To see that this is true, note that

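$$\int_{0}^{\infty} \left| \frac{\sin(t)}{t} \right| dt = \sum_{j=1}^{\infty} \int_{(j-1)\pi}^{j\pi} \frac{|\sin(t)|}{t}\,dt \ge \sum_{j=1}^{\infty} \frac{1}{j\pi} \int_{(j-1)\pi}^{j\pi} |\sin(t)|\,dt = \frac{2}{\pi} \sum_{j=1}^{\infty} \frac{1}{j} \to \infty ,$$

where the last step uses a well-known property of the harmonic series,

$$\sum_{j=1}^{\infty} \frac{1}{j} ,$$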

that it grows large without limit. This simple example also shows that just because a function g(t)
satisfies requirements (I) through (IV), so that the transform of the transform returns the original
function g(t), it does not necessarily follow that the transform itself satisfies requirements (I) through
(IV).
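
The divergence argument for h(t) can be illustrated numerically. The following sketch is not a proof: it integrates |sin(t)/t| over the first J lobes of the sine with a midpoint rule and compares the result with the harmonic-series lower bound used above. The helper names and the values of J are assumptions for the example.

```python
import math


def abs_sinc_integral(lobes, points_per_lobe=200):
    """Midpoint-rule estimate of the integral of |sin(t)|/t from 0 to lobes*pi."""
    h = math.pi / points_per_lobe
    total = 0.0
    for k in range(lobes * points_per_lobe):
        t = (k + 0.5) * h
        total += abs(math.sin(t)) / t
    return total * h


def harmonic_bound(lobes):
    """Lower bound (2/pi) * sum_{j=1}^{lobes} 1/j from the inequality in the text."""
    return (2.0 / math.pi) * sum(1.0 / j for j in range(1, lobes + 1))


# Each tuple holds (J, integral up to J*pi, harmonic lower bound).
results = [(J, abs_sinc_integral(J), harmonic_bound(J)) for J in (10, 100, 1000)]
for J, integral, bound in results:
    print(J, integral, bound)
```

Both columns keep growing, roughly logarithmically, as J increases, which is consistent with h(t) failing requirement (I).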
Here is another example to show that, even though the transform of a function may exist, if
requirements (I) through (IV) are violated, then the transform of the transform does not
necessarily return the original function. We consider another test function,

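$$z(t) = t^{-1} , \tag{2.10e}$$

which is clearly not absolutely integrable because

$$\int_{0}^{\infty} \frac{dt}{t} = \lim_{\substack{A \to \infty \\ \varepsilon \to 0}} \int_{\varepsilon}^{A} \frac{dt}{t} = \lim_{\substack{A \to \infty \\ \varepsilon \to 0}} \bigl[ \ln(A/\varepsilon) \bigr] = \infty ,$$

violating requirement (I). The sine transform of z is

$$Z_{\mathcal{S}}(f) = 2 \int_{0}^{\infty} \frac{\sin(2\pi f t)}{t}\,dt .$$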

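Any handbook of definite integrals shows that

$$Z_{\mathcal{S}}(f) = \begin{cases} 0 & \text{for } f = 0 \\ \pi & \text{for } f > 0 \end{cases} . \tag{2.10f}$$

Therefore, the sine transform $Z_{\mathcal{S}}$ of $z(t) = t^{-1}$ exists, yet the sine transform of the sine transform
does not return z:

$$2\pi \int_{0}^{\infty} \sin(2\pi f t)\,df = \lim_{F \to \infty} 2\pi \int_{0}^{F} \sin(2\pi f t)\,df = \frac{1}{t} \lim_{F \to \infty} \bigl[ 1 - \cos(2\pi F t) \bigr] \ne \frac{1}{t} . \tag{2.10g}$$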

Clearly, if a function violates requirements (I) through (IV) yet has a well-defined sine or
cosine transform, the sine transform of the sine transform and the cosine transform of the cosine
transform must be checked explicitly to confirm that the original function is returned. The only
exception is when the transform itself satisfies (I) through (IV) even though the original test
function does not. Because we could just as easily have started with the transform itself instead of
the original test function, we can conclude that the transform of the transform of the original
function must return the original function. In general, repeatedly applying the sine or cosine


transform just takes us back and forth between the same two functions, and the transformations
are mathematically justified whenever at least one of those functions satisfies requirements (I)
through (IV).

2.3 Even, Odd, and Mixed Functions


Fourier transform theory can be extended to include functions that are evaluated for negative as
well as positive values of their arguments. To assist our analysis of these extended transforms, we
decide to classify u as an even, odd, or mixed function. An even function u satisfies the constraint

$$u(-t) = u(t) \tag{2.11a}$$

for all values of t, negative as well as positive; an odd function satisfies the constraint

$$u(-t) = -u(t) \tag{2.11b}$$

for all values of t, negative as well as positive; and a mixed function is partly even and partly odd
in the sense that it is the sum of an even function and an odd function, neither of which is
identically zero. Any function u(t), whether even, odd, or mixed, can be written as the sum of
two functions, $u_e$ and $u_o$, with $u_e$ being an even function obeying (2.11a) and $u_o$ being an odd
function obeying (2.11b),

$$u(t) = u_e(t) + u_o(t) , \tag{2.11c}$$

where

$$u_e(t) = \frac{1}{2} \bigl[ u(t) + u(-t) \bigr] \tag{2.11d}$$

and

$$u_o(t) = \frac{1}{2} \bigl[ u(t) - u(-t) \bigr] . \tag{2.11e}$$

Clearly,

$$u_e(-t) = \frac{1}{2} \bigl[ u(-t) + u(t) \bigr] = \frac{1}{2} \bigl[ u(t) + u(-t) \bigr] = u_e(t)$$

and

$$u_o(-t) = \frac{1}{2} \bigl[ u(-t) - u(t) \bigr] = -\frac{1}{2} \bigl[ u(t) - u(-t) \bigr] = -u_o(t) .$$

If u starts off as an even function, then $u = u_e$, and $u_o$ is identically zero; if u starts off as an odd
function, then $u = u_o$, and $u_e$ is identically zero; and if u starts off as a mixed function, then
neither $u_e$ nor $u_o$ is identically zero. If u is identically zero, it can be regarded as either even or
odd, according to the classifier’s convenience.
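
The decomposition in Eqs. (2.11c) through (2.11e) is easy to try out. In this small sketch the helper names `even_part` and `odd_part` are invented for the illustration, and the exponential is used as a convenient mixed function because its even and odd parts are the hyperbolic cosine and sine.

```python
import math


def even_part(u):
    """Return u_e(t) = [u(t) + u(-t)] / 2, as in Eq. (2.11d)."""
    return lambda t: 0.5 * (u(t) + u(-t))


def odd_part(u):
    """Return u_o(t) = [u(t) - u(-t)] / 2, as in Eq. (2.11e)."""
    return lambda t: 0.5 * (u(t) - u(-t))


# exp(t) is a mixed function: its even part is cosh(t), its odd part is sinh(t).
u_e = even_part(math.exp)
u_o = odd_part(math.exp)

t = 0.7
print(u_e(t), math.cosh(t))        # even part
print(u_o(t), math.sinh(t))        # odd part
print(u_e(t) + u_o(t), math.exp(t))  # the two parts sum back to u(t)
```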
Figures 2.4(a) and 2.4(b) graph examples of even and odd functions respectively, and Fig.
2.4(c) shows a mixed function that is split up into its even and odd parts. We note that $\cos(2\pi f t)$
is an even function of both $f$ and $t$ and $\sin(2\pi f t)$ is an odd function of both $f$ and $t$. One point
worth remembering is that the behavior of even and odd functions is severely constrained near
$t = 0$. For any odd function at $t = 0$, we have

$$u(0) = u(-0) = -u(0)$$

from Eq. (2.11b). Since the only number equal to its own negative value is zero, all odd functions
u(t) that have a well-defined value at $t = 0$ must be zero at $t = 0$,

$$u \big|_{t=0} = 0 \quad \text{if } u(0) \text{ exists and } u \text{ is odd.} \tag{2.12a}$$

Because $u(-t) = u(t)$ for even functions, when t is near zero the value of u (if u is continuous) is
almost constant. Therefore, when t is exactly zero the derivative of any even function u(t), if it is
well defined, must be zero,

$$\left. \frac{du}{dt} \right|_{t=0} = 0 \quad \text{if the derivative at zero exists and } u \text{ is even.} \tag{2.12b}$$

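In fact, using the definition of the derivative

$$\frac{du}{dt} = \lim_{\varepsilon \to 0} \left[ \frac{u(t + \varepsilon) - u(t)}{\varepsilon} \right] = \lim_{\varepsilon \to 0} \left[ \frac{u(t) - u(t - \varepsilon)}{\varepsilon} \right] ,$$

when u is even we see that

$$\left. \frac{du}{dt} \right|_{t=-t_0} = \lim_{\varepsilon \to 0} \left[ \frac{u(-t_0 + \varepsilon) - u(-t_0)}{\varepsilon} \right] = \lim_{\varepsilon \to 0} \left[ \frac{u(t_0 - \varepsilon) - u(t_0)}{\varepsilon} \right] = -\left. \frac{du}{dt} \right|_{t=t_0} .$$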

This shows that when u is even, the derivative of u is odd, and so from (2.12a), which states that
odd functions are zero when their argument is zero, we know that (2.12b) must be true. Similarly,
for any odd function u,

FIGURE 2.4(a). [Plot of an even function $u(t)$.]

FIGURE 2.4(b). [Plot of an odd function $u(t)$.]

FIGURE 2.4(c). [Plot of a mixed function $u(t)$ together with its even part $u_e(t)$ and odd part $u_o(t)$ for $t$ between $-2$ and $2$.]

$$\left. \frac{du}{dt} \right|_{t=-t_0} = \lim_{\varepsilon \to 0} \left[ \frac{u(-t_0 + \varepsilon) - u(-t_0)}{\varepsilon} \right] = \lim_{\varepsilon \to 0} \left[ \frac{u(t_0) - u(t_0 - \varepsilon)}{\varepsilon} \right] = \left. \frac{du}{dt} \right|_{t=t_0} ,$$

showing that when u is odd, its derivative is even. The second derivative $d^2u/dt^2$ of an even
function u is the first derivative of $du/dt$, which is odd, and so $d^2u/dt^2$ must be even; similarly, the
third derivative $d^3u/dt^3$ is the first derivative of $d^2u/dt^2$, which is even, and so must be odd.
Examining in this fashion ever higher derivatives of the even function u, we conclude that

$$\frac{d^n u}{dt^n} = \begin{cases} \text{odd function} & \text{for } n = 1, 3, 5, \ldots \\ \text{even function} & \text{for } n = 2, 4, \ldots \end{cases} \quad \text{when } u \text{ is even.} \tag{2.12c}$$

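The same reasoning applied to the derivatives of an odd function u shows that

$$\frac{d^n u}{dt^n} = \begin{cases} \text{even function} & \text{for } n = 1, 3, 5, \ldots \\ \text{odd function} & \text{for } n = 2, 4, 6, \ldots \end{cases} \quad \text{when } u \text{ is odd.} \tag{2.12d}$$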

Equation (2.12c) states that the odd-numbered derivatives of an even function are odd while the
even-numbered derivatives of an even function are even, and Eq. (2.12d) states that the odd-
numbered derivatives of an odd function are even while the even-numbered derivatives of an odd
function are odd. Therefore, an immediate consequence of (2.12a), (2.12c), and (2.12d) is that the
odd-numbered derivatives of an even function (if they exist and are well defined) are zero at
$t = 0$ and the even-numbered derivatives of an odd function (if they exist and are well defined)
are zero at $t = 0$.
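
These parity rules can be checked with finite differences. The sketch below is only an illustration under assumed inputs: the two test functions, the step size h, and the helper names are arbitrary choices, and central differences stand in for the exact derivatives.

```python
import math


def first_derivative(u, t, h=1e-4):
    """Central-difference estimate of du/dt at t."""
    return (u(t + h) - u(t - h)) / (2.0 * h)


def second_derivative(u, t, h=1e-4):
    """Central-difference estimate of the second derivative of u at t."""
    return (u(t + h) - 2.0 * u(t) + u(t - h)) / (h * h)


def even_u(t):
    # Hypothetical even test function.
    return math.exp(-t * t) * math.cos(3.0 * t)


def odd_u(t):
    # Hypothetical odd test function.
    return t * math.exp(-t * t)


t0 = 0.8
# Derivative of an even function is odd, so it vanishes at t = 0, as in (2.12b).
print(first_derivative(even_u, 0.0))
print(first_derivative(even_u, -t0), -first_derivative(even_u, t0))
# Derivative of an odd function is even; its second derivative is odd and
# therefore vanishes at t = 0.
print(first_derivative(odd_u, -t0), first_derivative(odd_u, t0))
print(second_derivative(odd_u, 0.0))
```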

2.4 Extended Sine and Cosine Transforms


We can now extend the sine and cosine transforms to include functions u(t) evaluated for
negative as well as positive values of t while generalizing requirements (I) through (IV)
previously applied to u for $t > 0$ in Sec. 2.2. The extended requirements are

(V) Function u(t) must satisfy

$$\int_{-\infty}^{\infty} |u(t)|\,dt < \infty . \tag{2.13a}$$

(VI) Function u(t) must be continuous except for a finite number of jump discontinuities
over any finite interval $-\infty < a < t < b < \infty$.
(VII) There must exist a finite positive number B such that

$$|u(t)| < B . \tag{2.13b}$$

(VIII) The non-negative variation $V_{ab}(u)$ of function u(t) as defined in Eqs. (2.9g) and (2.9h)
is finite over any finite interval $-\infty < a < t < b < \infty$,

$$V_{ab}(u) < \infty . \tag{2.13c}$$


We also define the value of u at all its jump discontinuities to be given by Eq. (2.9e). These new
requirements are clearly just the old set of requirements extended to cover negative as well as
positive values of t.
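The extended Fourier sine transform of u is

$$\mathcal{S}_E^{(ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u(t) \sin(2\pi f t)\,dt , \tag{2.14a}$$

and the extended Fourier cosine transform of u is

$$\mathcal{C}_E^{(ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u(t) \cos(2\pi f t)\,dt . \tag{2.14b}$$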

Just as in Eqs. (2.8a) and (2.8b), which define the standard sine and cosine transforms, the order of
the $ft$ product in the superscript does not matter:

$$\mathcal{S}_E^{(ft)}\bigl(u(t)\bigr) = \mathcal{S}_E^{(tf)}\bigl(u(t)\bigr)$$

and

$$\mathcal{C}_E^{(ft)}\bigl(u(t)\bigr) = \mathcal{C}_E^{(tf)}\bigl(u(t)\bigr) .$$

We can write u as the sum of even and odd functions, $u(t) = u_e(t) + u_o(t)$, as described in Eq.
(2.11c), and substitute this sum into the definitions of the extended sine and cosine transforms in
(2.14a) and (2.14b) to get

$$\mathcal{S}_E^{(ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u_e(t) \sin(2\pi f t)\,dt + \int_{-\infty}^{\infty} u_o(t) \sin(2\pi f t)\,dt \tag{2.15a}$$

and

$$\mathcal{C}_E^{(ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u_e(t) \cos(2\pi f t)\,dt + \int_{-\infty}^{\infty} u_o(t) \cos(2\pi f t)\,dt . \tag{2.15b}$$

We note that the product of an even function $u_e$ and the sine, as well as the product of an odd
function $u_o$ and the cosine, must be an odd function,

$$u_e(-t) \sin\bigl(2\pi f \cdot (-t)\bigr) = u_e(t) \cdot \bigl[ -\sin(2\pi f t) \bigr] = -\bigl[ u_e(t) \sin(2\pi f t) \bigr] , \tag{2.16a}$$

and

$$u_o(-t) \cos\bigl(2\pi f \cdot (-t)\bigr) = \bigl[ -u_o(t) \bigr] \cos(2\pi f t) = -\bigl[ u_o(t) \cos(2\pi f t) \bigr] . \tag{2.16b}$$

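The integral between $-\infty$ and $+\infty$ of any odd function $\phi_o(t)$ can be thought of as the limit of
the sum of a large number of small terms,

$$\int_{-\infty}^{\infty} \phi_o(t)\,dt = \cdots + \phi_o(-2dt) \cdot dt + \phi_o(-dt) \cdot dt + \phi_o(0) \cdot dt + \phi_o(dt) \cdot dt + \phi_o(2dt) \cdot dt + \cdots .$$

Because $\phi_o$ is odd, $\phi_o(0)$ is zero; $\phi_o(-dt) \cdot dt = -\phi_o(dt) \cdot dt$ and cancels $\phi_o(dt) \cdot dt$;
$\phi_o(-2dt) \cdot dt = -\phi_o(2dt) \cdot dt$ and cancels $\phi_o(2dt) \cdot dt$; and so on. Therefore,20

$$\int_{-\infty}^{\infty} \phi_o(t)\,dt = 0 , \tag{2.17}$$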

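and Eqs. (2.15a) and (2.15b) can be written as

$$\mathcal{S}_E^{(ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u_o(t) \sin(2\pi f t)\,dt \tag{2.18a}$$

and

$$\mathcal{C}_E^{(ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u_e(t) \cos(2\pi f t)\,dt . \tag{2.18b}$$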

The integral between $-\infty$ and $+\infty$ of any even function $\phi_e(t)$ can be thought of as

$$\int_{-\infty}^{\infty} \phi_e(t)\,dt = \cdots + \phi_e(-2dt) \cdot dt + \phi_e(-dt) \cdot dt + \phi_e(0) \cdot dt + \phi_e(dt) \cdot dt + \phi_e(2dt) \cdot dt + \cdots .$$

Because $\phi_e$ is even, $\phi_e(-dt) = \phi_e(dt)$, $\phi_e(-2dt) = \phi_e(2dt)$, and so on. Therefore, the integral over
negative t has the same value as the integral over positive t and we can write

20
Strictly speaking, we are here treating the integral between $-\infty$ and $+\infty$ as a Cauchy principal value, a concept
introduced in Sec. 2.10 below.


$$\int_{-\infty}^{\infty} \phi_e(t)\,dt = 2 \int_{0}^{\infty} \phi_e(t)\,dt . \tag{2.19}$$

The product of $u_o$ and the sine is an even function,

$$u_o(-t) \sin\bigl(2\pi f \cdot (-t)\bigr) = \bigl[ -u_o(t) \bigr] \cdot \bigl[ -\sin(2\pi f t) \bigr] = u_o(t) \sin(2\pi f t) , \tag{2.20}$$

and the product of $u_e$ and the cosine, both of them even functions, is another even function.
Consequently, the extended sine and cosine transforms in Eqs. (2.18a) and (2.18b) are, according
to (2.19), (2.8a), and (2.8b),

$$\mathcal{S}_E^{(ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u_o(t) \sin(2\pi f t)\,dt = 2 \int_{0}^{\infty} u_o(t) \sin(2\pi f t)\,dt = \mathcal{S}^{(ft)}\bigl(u_o(t)\bigr) \tag{2.21a}$$

and

$$\mathcal{C}_E^{(ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u_e(t) \cos(2\pi f t)\,dt = 2 \int_{0}^{\infty} u_e(t) \cos(2\pi f t)\,dt = \mathcal{C}^{(ft)}\bigl(u_e(t)\bigr) . \tag{2.21b}$$

Equation (2.21a) shows that the extended sine transform of a function u(t) is the unextended sine
transform of $u_o$, the odd component of u; and Eq. (2.21b) shows that the extended cosine
transform of u(t) is the unextended cosine transform of $u_e$, the even component of u. Because the
result will be needed later, we also show that the extended sine transform defined in Eq. (2.14a)
is an odd function of $f$,

$$\mathcal{S}_E^{(-ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u(t) \sin(-2\pi f t)\,dt = -\int_{-\infty}^{\infty} u(t) \sin(2\pi f t)\,dt = -\mathcal{S}_E^{(ft)}\bigl(u(t)\bigr) ; \tag{2.22a}$$

and a similar manipulation shows that the extended cosine transform defined in (2.14b) is an even
function of $f$,

$$\mathcal{C}_E^{(-ft)}\bigl(u(t)\bigr) = \int_{-\infty}^{\infty} u(t) \cos(-2\pi f t)\,dt = \int_{-\infty}^{\infty} u(t) \cos(2\pi f t)\,dt = \mathcal{C}_E^{(ft)}\bigl(u(t)\bigr) . \tag{2.22b}$$
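
Equations (2.21a), (2.22a), and (2.22b) can be spot-checked numerically. The sketch below uses an off-center Gaussian as a made-up mixed function, truncates the infinite integrals at an assumed half-width T where the Gaussian is negligible, and uses a midpoint rule; none of these choices comes from the text.

```python
import math


def midpoint(func, a, b, n):
    """Midpoint-rule estimate of the integral of func over a..b with n panels."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def u(t):
    # Hypothetical mixed function: a Gaussian centered at t = 0.5.
    return math.exp(-(t - 0.5) ** 2)


def u_odd(t):
    # Odd component of u, as in Eq. (2.11e).
    return 0.5 * (u(t) - u(-t))


T = 8.0  # assumed truncation of the infinite limits; u is negligible beyond it


def extended_sine(f):
    """Extended sine transform, Eq. (2.14a), truncated to -T..T."""
    return midpoint(lambda t: u(t) * math.sin(2.0 * math.pi * f * t), -T, T, 3200)


def extended_cosine(f):
    """Extended cosine transform, Eq. (2.14b), truncated to -T..T."""
    return midpoint(lambda t: u(t) * math.cos(2.0 * math.pi * f * t), -T, T, 3200)


def sine_of_odd_part(f):
    """Unextended sine transform of u_odd, Eq. (2.8a), truncated to 0..T."""
    return 2.0 * midpoint(lambda t: u_odd(t) * math.sin(2.0 * math.pi * f * t), 0.0, T, 1600)


f = 0.3
print(extended_sine(f), sine_of_odd_part(f))      # Eq. (2.21a)
print(extended_sine(-f), -extended_sine(f))       # Eq. (2.22a): odd in f
print(extended_cosine(-f), extended_cosine(f))    # Eq. (2.22b): even in f
```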

We now examine what happens when the extended sine and cosine transforms are applied
twice to the same function. We define


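$$U_{\mathcal{S}E}(f) = \mathcal{S}_E^{(ft)}\bigl(u(t)\bigr) = \mathcal{S}^{(ft)}\bigl(u_o(t)\bigr) \tag{2.23a}$$

and

$$U_{\mathcal{C}E}(f) = \mathcal{C}_E^{(ft)}\bigl(u(t)\bigr) = \mathcal{C}^{(ft)}\bigl(u_e(t)\bigr) , \tag{2.23b}$$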

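where the second step in Eqs. (2.23a) and (2.23b) comes from (2.21a) and (2.21b). Taking the
extended Fourier sine and cosine transforms of $U_{\mathcal{S}E}$ and $U_{\mathcal{C}E}$ respectively, we get

$$\mathcal{S}_E^{(tf)}\bigl(U_{\mathcal{S}E}(f)\bigr) = \mathcal{S}_E^{(ft)}\bigl(U_{\mathcal{S}E}(f)\bigr) = \int_{-\infty}^{\infty} U_{\mathcal{S}E}(f) \sin(2\pi f t)\,df \tag{2.24a}$$

and

$$\mathcal{C}_E^{(tf)}\bigl(U_{\mathcal{C}E}(f)\bigr) = \mathcal{C}_E^{(ft)}\bigl(U_{\mathcal{C}E}(f)\bigr) = \int_{-\infty}^{\infty} U_{\mathcal{C}E}(f) \cos(2\pi f t)\,df . \tag{2.24b}$$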

The second step in (2.24a) and (2.24b) is there just to emphasize that we are allowed to change
the order of the ft product in the superscripts.
Equation (2.22a) shows that the extended sine transform $U_{\mathcal{S}E}$ is an odd function of $f$, so its
product with the sine is an even function of $f$; and Eq. (2.22b) shows that the extended cosine
transform $U_{\mathcal{C}E}$ is an even function of $f$, so its product with the cosine is also an even function of
$f$. Hence, according to (2.19), Eqs. (2.24a) and (2.24b) become

$$\mathcal{S}_E^{(tf)}\bigl(U_{\mathcal{S}E}(f)\bigr) = 2 \int_{0}^{\infty} U_{\mathcal{S}E}(f) \sin(2\pi f t)\,df \tag{2.25a}$$

and

$$\mathcal{C}_E^{(tf)}\bigl(U_{\mathcal{C}E}(f)\bigr) = 2 \int_{0}^{\infty} U_{\mathcal{C}E}(f) \cos(2\pi f t)\,df . \tag{2.25b}$$

But Eq. (2.23a) shows that $U_{\mathcal{S}E}$ is also the unextended sine transform of $u_o$, so from (2.25a) we
see that

$$\mathcal{S}_E^{(tf)}\bigl(U_{\mathcal{S}E}(f)\bigr)$$

equals the unextended sine transform of the unextended sine transform of $u_o$, the odd component
of function u. According to Eqs. (2.8a), (2.8c), and (2.8e), the unextended sine transform of the
unextended sine transform returns the original function for positive values of t. This means that
the extended sine transform of the extended sine transform,

$$\mathcal{S}_E^{(tf)}\bigl(U_{\mathcal{S}E}(f)\bigr) ,$$

which we have just seen to be equal to the unextended sine transform of the unextended sine
transform, must return $u_o$ for positive values of t. Consequently, for positive values of t, Eq.
(2.25a) becomes

$$\mathcal{S}_E^{(tf)}\bigl(U_{\mathcal{S}E}(f)\bigr) = 2 \int_{0}^{\infty} U_{\mathcal{S}E}(f) \sin(2\pi f t)\,df = u_o(t) . \tag{2.26a}$$

Function $u_o$ is, however, defined for all values of t according to the rule for odd functions
$u_o(-t) = -u_o(t)$, and the integral

$$2 \int_{0}^{\infty} U_{\mathcal{S}E}(f) \sin(2\pi f t)\,df$$

is also an odd function of t when we allow t to be both positive and negative,

$$2 \int_{0}^{\infty} U_{\mathcal{S}E}(f) \sin\bigl(2\pi f (-t)\bigr)\,df = -2 \int_{0}^{\infty} U_{\mathcal{S}E}(f) \sin(2\pi f t)\,df .$$

Consequently, the integral exists and is well defined for negative t whenever the integral exists
and is well defined for positive t. We conclude that Eq. (2.26a) holds true for negative as well as
positive t. Hence, using Eq. (2.23a) to substitute for $U_{\mathcal{S}E}$ in Eq. (2.26a), we can write

$$\mathcal{S}_E^{(tf)}\Bigl( \mathcal{S}_E^{(ft')}\bigl(u(t')\bigr) \Bigr) = u_o(t) . \tag{2.26b}$$

This shows that taking the extended sine transform of the extended sine transform returns the odd
component $u_o$ of function u for all values of t, both positive and negative. Switching now to the
extended cosine transform $U_{CE}$, we see that Eq. (2.23b) shows the extended cosine transform $U_{CE}$
is also the unextended cosine transform of $u_e$, the even component of function u. From the
right-hand side of Eq. (2.25b), we then know that

$$ \mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big) $$

is equal to the unextended cosine transform of the unextended cosine transform of $u_e$. Equations
(2.8b), (2.8d), and (2.8f) show that the unextended cosine transform of the unextended cosine
transform returns the original function for positive values of t. Consequently, the extended cosine
transform of the extended cosine transform,

$$ \mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big), $$

which we have just seen to be equal to the unextended cosine transform of the unextended cosine
transform of $u_e$, must also equal $u_e$ for positive values of t. This means that Eq. (2.25b) becomes
(for positive values of t),

$$ \mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big) = 2\int_{0}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df = u_e(t). \tag{2.26c} $$

But $u_e(t)$ is defined for negative as well as positive values of t according to the rule
$u_e(-t) = u_e(t)$ for even functions of t, and the integral

$$ 2\int_{0}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df $$

is also an even function of t when t is allowed to be both positive and negative:

$$ 2\int_{0}^{\infty} U_{CE}(f)\cos\big(2\pi f(-t)\big)\,df = 2\int_{0}^{\infty} U_{CE}(f)\cos\big(2\pi f(t)\big)\,df. $$

Consequently, the integral exists and is well defined for negative t if it exists and is well defined
for positive t. We conclude that Eq. (2.26c) is valid for both negative and positive t and that,
substituting Eq. (2.23b) into Eq. (2.26c),

$$ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) = u_e(t). \tag{2.26d} $$

This shows that taking the extended cosine transform of the extended cosine transform returns
$u_e$, the even component of function u, for all values of t both positive and negative. Equations
(2.11d) and (2.11e), the original definitions of the even and odd components of a function u,
show that Eqs. (2.26b) and (2.26d) can be written as

$$ \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) = \frac{1}{2}\big[u(t) - u(-t)\big] \tag{2.26e} $$
and
$$ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) = \frac{1}{2}\big[u(t) + u(-t)\big]. \tag{2.26f} $$

Adding together the extended sine transform of the extended sine transform and the extended
cosine transform of the extended cosine transform then gives

$$ \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) = \frac{1}{2}\big[u(t) - u(-t)\big] + \frac{1}{2}\big[u(t) + u(-t)\big] = u(t). \tag{2.26g} $$

We conclude that for any function u(t), the sum of the extended sine transform of the extended
sine transform and the extended cosine transform of the extended cosine transform returns the
original function.
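
The result in Eq. (2.26g) can be checked numerically. The sketch below is only an illustration under stated assumptions: the infinite integrals are replaced by Riemann sums on finite grids, the test function is a shifted Gaussian (chosen because it is neither even nor odd and decays quickly), and the helper names `sine_ext` and `cosine_ext` are hypothetical.

```python
import numpy as np

def sine_ext(values, x, y):
    """Riemann-sum stand-in for the extended sine transform: sum values(x) sin(2 pi x y) dx."""
    dx = x[1] - x[0]
    return np.sin(2 * np.pi * np.outer(y, x)) @ values * dx

def cosine_ext(values, x, y):
    """Riemann-sum stand-in for the extended cosine transform: sum values(x) cos(2 pi x y) dx."""
    dx = x[1] - x[0]
    return np.cos(2 * np.pi * np.outer(y, x)) @ values * dx

# Finite grids standing in for the infinite t and f axes (an assumption of this sketch).
t = np.arange(-8.0, 8.0, 0.05)
f = np.arange(-4.0, 4.0, 0.05)

u = np.exp(-np.pi * (t - 0.5) ** 2)        # neither even nor odd
u_reversed = np.exp(-np.pi * (-t - 0.5) ** 2)  # u(-t)

odd_part = sine_ext(sine_ext(u, t, f), f, t)        # compare with (2.26e)
even_part = cosine_ext(cosine_ext(u, t, f), f, t)   # compare with (2.26f)
total = odd_part + even_part                        # compare with (2.26g)

print(np.max(np.abs(odd_part - 0.5 * (u - u_reversed))))
print(np.max(np.abs(total - u)))
```

Both printed differences should be at the level of floating-point round-off for this rapidly decaying test function; slowly decaying functions would need wider grids.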
One obvious way to proceed from this point is to define the Hartley transform

$$ \begin{aligned}
\mathcal{H}^{(ft)}\big(u(t)\big) &= \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,dt \\
&= \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt + \int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt \\
&= \mathcal{C}_E^{(ft)}\big(u(t)\big) + \mathcal{S}_E^{(ft)}\big(u(t)\big) \\
&= U_{CE}(f) + U_{SE}(f),
\end{aligned} \tag{2.26h} $$

where in the next-to-last step we use definitions (2.14a) and (2.14b) of the extended sine and
cosine transforms and in the last step Eqs. (2.23a) and (2.23b) are used to write the extended sine
and cosine transforms as functions of ƒ. The order of the ft product in the superscript is not
important because, just as for the sine and cosine transforms, we have

$$ \mathcal{H}^{(ft)}\big(u(t)\big) = \mathcal{H}^{(tf)}\big(u(t)\big). $$

Working with this definition, we see that the Hartley transform of the Hartley transform gives

$$ \begin{aligned}
\mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft')}\big(u(t')\big)\Big) &= \mathcal{H}^{(tf)}\big(U_{CE}(f) + U_{SE}(f)\big) \\
&= \int_{-\infty}^{\infty} \big[U_{CE}(f) + U_{SE}(f)\big]\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,df.
\end{aligned} \tag{2.26i} $$

According to Eqs. (2.22a) and (2.22b), the extended sine transform $U_{SE}$ is an odd function of ƒ
and the extended cosine transform $U_{CE}$ is an even function of ƒ. Using the same reasoning as in
Eqs. (2.16a) and (2.16b) above,

$$ U_{CE}(-f)\sin\big(2\pi t\cdot(-f)\big) = U_{CE}(f)\cdot\big[-\sin(2\pi ft)\big] = -\big[U_{CE}(f)\sin(2\pi ft)\big] $$
and
$$ U_{SE}(-f)\cos\big(2\pi t\cdot(-f)\big) = \big[-U_{SE}(f)\big]\cos(2\pi ft) = -\big[U_{SE}(f)\cos(2\pi ft)\big]. $$

We see that $U_{CE}(f)\sin(2\pi ft)$ and $U_{SE}(f)\cos(2\pi ft)$ are both odd functions of ƒ, and Eq. (2.17)
states that the integral between −∞ and +∞ of any odd function is zero. Therefore,

$$ \int_{-\infty}^{\infty} U_{CE}(f)\sin(2\pi ft)\,df = \int_{-\infty}^{\infty} U_{SE}(f)\cos(2\pi ft)\,df = 0. $$

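Now the Hartley transform of the Hartley transform in Eq. (2.26i) can be simplified to

$$ \begin{aligned}
\mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft')}\big(u(t')\big)\Big) &= \int_{-\infty}^{\infty} \big[U_{CE}(f) + U_{SE}(f)\big]\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,df \\
&= \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df + \int_{-\infty}^{\infty} U_{CE}(f)\sin(2\pi ft)\,df \\
&\quad + \int_{-\infty}^{\infty} U_{SE}(f)\cos(2\pi ft)\,df + \int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \\
&= \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df + \int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \\
&= \mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big) + \mathcal{S}_E^{(tf)}\big(U_{SE}(f)\big).
\end{aligned} $$

Because $U_{CE}$ and $U_{SE}$ are respectively the extended cosine and sine transforms of u [see Eqs.
(2.23a) and (2.23b)], we have

$$ \mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft')}\big(u(t')\big)\Big) = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big), $$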

which becomes, substituting from (2.26g),

$$ \mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft')}\big(u(t')\big)\Big) = u(t). \tag{2.26j} $$

We see that the Hartley transform of the Hartley transform returns the original function for both
positive and negative values of t. The Hartley transform was never very popular and is only rarely
encountered today. What is done instead, as we shall see in the next section, is to combine the
extended sine and cosine transforms into a single Fourier transform based on a complex
exponential.
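
A discrete analogue of Eq. (2.26j) is easy to try out. The sketch below assumes a finite, periodic sample grid (so it illustrates the discrete Hartley transform rather than the integral transform of the text), and the helper name `dht` is hypothetical. It builds the transform from NumPy's FFT: since the FFT kernel is cos − i·sin, the cos + sin kernel is the real part minus the imaginary part.

```python
import numpy as np

def dht(x):
    """Discrete Hartley transform of a real sequence:
    H[k] = sum_n x[n] * (cos(2 pi k n / N) + sin(2 pi k n / N))."""
    X = np.fft.fft(x)
    return X.real - X.imag

rng = np.random.default_rng(0)
x = rng.normal(size=16)

# Applying the transform twice returns the input, up to the factor N that the
# discrete sums pick up in place of the continuous integrals.
x_back = dht(dht(x)) / x.size
print(np.max(np.abs(x_back - x)))
```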

2.5 Forward and Inverse Fourier Transforms


The Fourier transform is based on the well-known identity

$$ e^{i\phi} = \cos(\phi) + i\sin(\phi), \tag{2.27} $$

where $i = \sqrt{-1}$.
For any real function u(t) satisfying requirements (V) through (VIII) in Sec. 2.4, we can add
the extended cosine transform to i times the extended sine transform to get

$$ \mathcal{C}_E^{(ft)}\big(u(t)\big) + i\cdot\mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + i\sin(2\pi ft)\big]\,dt = \int_{-\infty}^{\infty} e^{2\pi ift}\,u(t)\,dt. \tag{2.28a} $$
5 5

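From Eqs. (2.23a) and (2.23b), we have

$$ \mathcal{C}_E^{(ft)}\big(u(t)\big) = U_{CE}(f) \quad\text{and}\quad \mathcal{S}_E^{(ft)}\big(u(t)\big) = U_{SE}(f), $$

which means (2.28a) can be written as

$$ \int_{-\infty}^{\infty} e^{2\pi ift}\,u(t)\,dt = U_{CE}(f) + iU_{SE}(f). \tag{2.28b} $$

Taking the extended sine transform of both sides of (2.28b) gives

$$ \begin{aligned}
\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') &= \int_{-\infty}^{\infty} U_{CE}(f)\sin(2\pi ft)\,df + i\int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \\
&= i\int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df
\end{aligned} \tag{2.28c} $$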


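because $U_{CE}(f)\sin(2\pi ft)$ is an odd function of ƒ and integrates to zero [see discussion after Eq.
(2.26i) above]. Taking the extended cosine transform of both sides of Eq. (2.28b) gives

$$ \begin{aligned}
\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') &= \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df + i\int_{-\infty}^{\infty} U_{SE}(f)\cos(2\pi ft)\,df \\
&= \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df
\end{aligned} \tag{2.28d} $$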

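because $U_{SE}(f)\cos(2\pi ft)$ is an odd function of ƒ and integrates to zero. Substitution of Eqs.
(2.24a) and (2.24b) into (2.28c) and (2.28d) gives

$$ \int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = i\cdot\mathcal{S}_E^{(tf)}\big(U_{SE}(f)\big) \tag{2.28e} $$
and
$$ \int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big). \tag{2.28f} $$

Since $\mathcal{C}_E^{(ft)}(u(t)) = U_{CE}(f)$ and $\mathcal{S}_E^{(ft)}(u(t)) = U_{SE}(f)$ [see Eqs. (2.23a) and (2.23b)], Eqs.
(2.28e) and (2.28f) can be written as

$$ \int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = i\cdot\mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) \tag{2.28g} $$
and
$$ \int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big). \tag{2.28h} $$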

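We now multiply both sides of (2.28g) by (−i) and sum the resulting equation with Eq. (2.28h) to
get

$$ \begin{aligned}
\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') &- i\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') \\
&= \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big)
\end{aligned} $$

or, using the identity $e^{-i\phi} = \cos(\phi) - i\sin(\phi)$,

$$ \int_{-\infty}^{\infty} df\,e^{-2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big). \tag{2.28i} $$

Equation (2.26g) simplifies this to

$$ \int_{-\infty}^{\infty} df\,e^{-2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = u(t). \tag{2.28j} $$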

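If, in Eq. (2.28a), we start out by adding the extended cosine transform to (−i) times the extended
sine transform, then instead of Eqs. (2.28g) and (2.28h), we get [just replace i by (−i)
everywhere]

$$ \int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') = -i\cdot\mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) $$
and
$$ \int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big). $$

Now we must multiply the top equation by i before summing it with the bottom equation to get

$$ \begin{aligned}
\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') &+ i\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') \\
&= \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big)
\end{aligned} $$
or
$$ \int_{-\infty}^{\infty} df\,e^{2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') = u(t). \tag{2.28k} $$

Clearly, Eqs. (2.28j) and (2.28k) are basically the same identity, which can be written as

$$ \int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}u(t') = u(t). \tag{2.28$\ell$} $$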

As long as the exponent of e changes sign in the two integrals over ƒ and t, we get back the
original function. Looking at how Eqs. (2.28j) and (2.28k) are derived, we see that if the sign of
the exponent does not change, we get

$$ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) - \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) $$
instead of
$$ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big). $$
Equations (2.26e) and (2.26f) then show that

$$ \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) - \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) = u(-t), $$

which gives

$$ \int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\pm 2\pi ift'}u(t') = u(-t). \tag{2.28m} $$

This interesting result shows that when u is even so that $u(-t) = u(t)$, we still get back the
original function, and when u is odd so that $u(-t) = -u(t)$, we just have to multiply by (−1) to
retrieve u. Even when u is mixed, no information is lost; reversing the sign of the argument still
gets us back to the original function. Replacing t by −t in (2.28m) takes us back to the original
formula (2.28ℓ).
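
The same-sign result (2.28m) has a familiar discrete counterpart that can be tried directly. The sketch below is an illustration on a periodic sample grid, not the integral identity itself: applying NumPy's forward FFT twice (same sign in both exponents) returns the input with its index reversed, while following the forward FFT with the inverse FFT (opposite signs) returns the input unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16
x = rng.normal(size=n) + 1j * rng.normal(size=n)

# Same sign in both exponents: discrete analogue of (2.28m).
same_sign_twice = np.fft.fft(np.fft.fft(x)) / n

# Opposite signs: discrete analogue of the round-trip identity.
opposite_signs = np.fft.ifft(np.fft.fft(x))

# x evaluated at index -m on the periodic grid, the discrete stand-in for u(-t).
reversed_x = x[(-np.arange(n)) % n]

print(np.max(np.abs(same_sign_twice - reversed_x)))
print(np.max(np.abs(opposite_signs - x)))
```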
Up to this point, we have taken u to be real, but if Eq. (2.28ℓ) holds true when u is a real
function of a real argument, it must also hold true when u is a complex function of a real
argument. To show why this is so, we break complex functions u(t) of a real argument t into real
and imaginary parts,

$$ u(t) = u_r(t) + iu_i(t), $$

where $u_r$ and $u_i$ are both real functions of t. Substituting this complex-valued u(t) into the
left-hand side of (2.28ℓ) gives

$$ \begin{aligned}
\int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}&\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}\big[u_r(t') + iu_i(t')\big] \\
&= \int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}u_r(t') + i\int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}u_i(t').
\end{aligned} $$

Since (2.28ℓ) holds for real functions $u_r$ and $u_i$, this last expression must be equal to the
original complex function u,

$$ u_r(t) + iu_i(t) = u(t), $$


showing that Eq. (2.28ℓ) is true for complex functions of t as well as strictly real functions of t.
Similar reasoning shows that (2.28m) also holds true for complex functions of real variables.
Indeed, we can even apply this analysis to the unextended sine and cosine transforms to show that
the unextended sine transform of the unextended sine transform and the unextended cosine
transform of the unextended cosine transform return the original function (for positive values of
the argument) when the original function is complex.
We now define the Fourier transform of a complex function u with real argument t to be

$$ \mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \tag{2.29a} $$

The notation for $\mathcal{F}$ introduced in (2.29a) explicitly shows that t, being repeated inside both upper
and lower parentheses, is the dummy variable of integration; and that $\mathcal{F}$ produces a function of ƒ
because ƒ is only listed in the upper parentheses. We call (2.29a) the forward Fourier transform
and, when convenient, follow the custom of writing it with the upper-case letter of the
transformed function,

$$ U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \tag{2.29b} $$

If (2.29a) is the forward transform, then the inverse Fourier transform is

$$ \mathcal{F}^{(itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df. \tag{2.29c} $$

In both the forward and inverse transform the order of the tf product in the superscript is
irrelevant, just as it is for the sine, cosine, and Hartley transforms,

$$ \mathcal{F}^{(\pm itf)}\big(u(t)\big) = \mathcal{F}^{(\pm ift)}\big(u(t)\big) \quad\text{and}\quad \mathcal{F}^{(\pm itf)}\big(U(f)\big) = \mathcal{F}^{(\pm ift)}\big(U(f)\big). $$

What is important is the sign inside the superscript, since it determines whether the forward or
inverse transform is being performed. Equation (2.28ℓ) shows, of course, that

$$ u(t) = \mathcal{F}^{(itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = \mathcal{F}^{(itf)}\Big(\mathcal{F}^{(-ift')}\big(u(t')\big)\Big). \tag{2.29d} $$
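
The forward and inverse pair in Eqs. (2.29a) through (2.29d) can be tried out numerically. The following is a minimal sketch, assuming the integrals are approximated by Riemann sums on finite grids and using a shifted complex Gaussian as the test function; the helper name `fourier` and the grid limits are choices made for this illustration only. The closed-form transform of the Gaussian used for comparison is a standard result.

```python
import numpy as np

def fourier(values, x, y, sign):
    """Riemann-sum approximation of  integral values(x) exp(sign * 2 pi i x y) dx,
    evaluated at every point of the grid y."""
    dx = x[1] - x[0]
    return np.exp(sign * 2j * np.pi * np.outer(y, x)) @ values * dx

t = np.arange(-8.0, 8.0, 0.05)
f = np.arange(-4.0, 4.0, 0.05)

shift = 0.5
u = (1 + 0.3j) * np.exp(-np.pi * (t - shift) ** 2)   # complex-valued test function

U = fourier(u, t, f, sign=-1)        # forward transform, as in (2.29a)
u_back = fourier(U, f, t, sign=+1)   # inverse transform, as in (2.29c)

# Known closed form for the transform of the shifted Gaussian.
U_exact = (1 + 0.3j) * np.exp(-2j * np.pi * f * shift) * np.exp(-np.pi * f ** 2)

print(np.max(np.abs(U - U_exact)))
print(np.max(np.abs(u_back - u)))
```

Swapping the two `sign` arguments gives the alternative convention mentioned in the text, and the round trip still returns u.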

It is entirely a matter of convention which Fourier transform is called the forward transform and
which is called the inverse transform; all that matters is for (2.28ℓ) to be satisfied. Some authors
change the sign of the exponent $(-2\pi ift)$, defining the forward Fourier transform to be $\mathcal{F}^{(ift)}$,

$$ \mathcal{F}^{(ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{2\pi ift}\,dt, $$

and the inverse Fourier transform to be $\mathcal{F}^{(-itf)}$,

$$ \mathcal{F}^{(-itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)\,e^{-2\pi ift}\,df. $$

Clearly, this convention also satisfies (2.28ℓ), with the inverse Fourier transform of the forward
Fourier transform still returning the original function.
In physics and related disciplines, the frequency variable is often changed to $\omega = 2\pi f$, so that
(2.28ℓ) becomes

$$ \frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\,e^{\pm i\omega t}\int_{-\infty}^{\infty} dt'\,e^{\mp i\omega t'}u(t') = u(t). \tag{2.30a} $$

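Authors using the frequency variable ω allocate the factor of $1/(2\pi)$ different ways when
defining the forward and inverse Fourier transforms in terms of ω, with all reasonable
possibilities chosen at one time or another:

$$ \text{Forward Fourier transform of } u(t) \text{ is } \int_{-\infty}^{\infty} u(t)\,e^{\mp i\omega t}\,dt = U(\omega), \tag{2.30b} $$
$$ \text{Inverse Fourier transform of } U(\omega) \text{ is } \frac{1}{2\pi}\int_{-\infty}^{\infty} U(\omega)\,e^{\pm i\omega t}\,d\omega, $$

$$ \text{Forward Fourier transform of } u(t) \text{ is } \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} u(t)\,e^{\mp i\omega t}\,dt = U(\omega), \tag{2.30c} $$
$$ \text{Inverse Fourier transform of } U(\omega) \text{ is } \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} U(\omega)\,e^{\pm i\omega t}\,d\omega, $$

$$ \text{Forward Fourier transform of } u(t) \text{ is } \frac{1}{2\pi}\int_{-\infty}^{\infty} u(t)\,e^{\mp i\omega t}\,dt = U(\omega), \tag{2.30d} $$
$$ \text{Inverse Fourier transform of } U(\omega) \text{ is } \int_{-\infty}^{\infty} U(\omega)\,e^{\pm i\omega t}\,d\omega. $$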

In each of the three pairs of definitions listed above, the plus and minus signs are synchronized;
so if the top (bottom) sign is chosen for the first member of the pair then the top (bottom) sign
must also be chosen for the second member of the pair. This gives a total of six different ways of
defining the forward and inverse Fourier transforms, and all six satisfy Eq. (2.30a).
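
To make the bookkeeping concrete, the three allocations of the factor 1/(2π) can be compared numerically. The sketch below is illustrative only: it picks the upper signs (forward exponent −iωt, inverse exponent +iωt), approximates the integrals by Riemann sums on finite grids, and uses a hypothetical helper `omega_pair` whose single parameter is the factor placed in front of the forward transform. The inverse factor is then fixed by the requirement that the two factors multiply to 1/(2π).

```python
import numpy as np

def omega_pair(u, t, omega, forward_factor):
    """Forward transform scaled by forward_factor, followed by the inverse
    transform scaled by 1 / (2 pi forward_factor), using Riemann sums."""
    dt = t[1] - t[0]
    dw = omega[1] - omega[0]
    inverse_factor = 1.0 / (2 * np.pi * forward_factor)
    U = forward_factor * (np.exp(-1j * np.outer(omega, t)) @ u) * dt
    u_back = inverse_factor * (np.exp(1j * np.outer(t, omega)) @ U) * dw
    return U, u_back

t = np.arange(-8.0, 8.0, 0.05)
omega = np.arange(-12.0, 12.0, 0.1)
u = np.exp(-0.5 * (t - 0.3) ** 2)

conventions = {
    "2.30b": 1.0,                       # factor 1/(2 pi) entirely on the inverse
    "2.30c": 1 / np.sqrt(2 * np.pi),    # factor split symmetrically
    "2.30d": 1 / (2 * np.pi),           # factor entirely on the forward transform
}

errors = {}
for name, factor in conventions.items():
    U, u_back = omega_pair(u, t, omega, factor)
    errors[name] = np.max(np.abs(u_back - u))

print(errors)
```

The forward transforms differ from one convention to the next by a constant factor, but every round trip reproduces u.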
The unextended sine and cosine transforms (usually called just the sine and cosine
transforms) can also be defined in many different ways. Equations (2.8a), (2.8c), (2.8e), and
(2.8b), (2.8d), (2.8f) can be combined to write

$$ 4\int_{0}^{\infty} df\,\sin(2\pi ft)\int_{0}^{\infty} dt'\,u(t')\sin(2\pi ft') = u(t) \quad\text{for } t > 0 \tag{2.31a} $$
and
$$ 4\int_{0}^{\infty} df\,\cos(2\pi ft)\int_{0}^{\infty} dt'\,u(t')\cos(2\pi ft') = u(t) \quad\text{for } t > 0. \tag{2.31b} $$

Changing the frequency variable to $\omega = 2\pi f$, so that $df = d\omega/(2\pi)$, gives

$$ \frac{2}{\pi}\int_{0}^{\infty} d\omega\,\sin(\omega t)\int_{0}^{\infty} dt'\,u(t')\sin(\omega t') = u(t) \quad\text{for } t > 0 \tag{2.31c} $$
and
$$ \frac{2}{\pi}\int_{0}^{\infty} d\omega\,\cos(\omega t)\int_{0}^{\infty} dt'\,u(t')\cos(\omega t') = u(t) \quad\text{for } t > 0. \tag{2.31d} $$

Just like the factor of $1/(2\pi)$ in Eq. (2.30a), the factor of $2/\pi$ in (2.31c) and (2.31d) can be
allocated three different ways when defining the forward and inverse sine and cosine transforms:

$$ \text{Forward sine transform of } u(t) \text{ for } t > 0 \text{ is } \int_{0}^{\infty} u(t)\sin(\omega t)\,dt = U_S(\omega), \tag{2.31e} $$
$$ \text{Forward cosine transform of } u(t) \text{ for } t > 0 \text{ is } \int_{0}^{\infty} u(t)\cos(\omega t)\,dt = U_C(\omega), $$
$$ \text{Inverse sine transform of } U_S(\omega) \text{ is } \frac{2}{\pi}\int_{0}^{\infty} U_S(\omega)\sin(\omega t)\,d\omega = u(t) \text{ for } t > 0, $$
$$ \text{Inverse cosine transform of } U_C(\omega) \text{ is } \frac{2}{\pi}\int_{0}^{\infty} U_C(\omega)\cos(\omega t)\,d\omega = u(t) \text{ for } t > 0, $$

$$ \text{Forward sine transform of } u(t) \text{ for } t > 0 \text{ is } \sqrt{\frac{2}{\pi}}\int_{0}^{\infty} u(t)\sin(\omega t)\,dt = U_S(\omega), \tag{2.31f} $$
$$ \text{Forward cosine transform of } u(t) \text{ for } t > 0 \text{ is } \sqrt{\frac{2}{\pi}}\int_{0}^{\infty} u(t)\cos(\omega t)\,dt = U_C(\omega), $$
$$ \text{Inverse sine transform of } U_S(\omega) \text{ is } \sqrt{\frac{2}{\pi}}\int_{0}^{\infty} U_S(\omega)\sin(\omega t)\,d\omega = u(t) \text{ for } t > 0, $$
$$ \text{Inverse cosine transform of } U_C(\omega) \text{ is } \sqrt{\frac{2}{\pi}}\int_{0}^{\infty} U_C(\omega)\cos(\omega t)\,d\omega = u(t) \text{ for } t > 0, $$

$$ \text{Forward sine transform of } u(t) \text{ for } t > 0 \text{ is } \frac{2}{\pi}\int_{0}^{\infty} u(t)\sin(\omega t)\,dt = U_S(\omega), \tag{2.31g} $$
$$ \text{Forward cosine transform of } u(t) \text{ for } t > 0 \text{ is } \frac{2}{\pi}\int_{0}^{\infty} u(t)\cos(\omega t)\,dt = U_C(\omega), $$
$$ \text{Inverse sine transform of } U_S(\omega) \text{ is } \int_{0}^{\infty} U_S(\omega)\sin(\omega t)\,d\omega = u(t) \text{ for } t > 0, $$
$$ \text{Inverse cosine transform of } U_C(\omega) \text{ is } \int_{0}^{\infty} U_C(\omega)\cos(\omega t)\,d\omega = u(t) \text{ for } t > 0. $$

The reader should expect to encounter all three classes of definitions given in (2.31e)–(2.31g).
The symmetric definitions in (2.31f) are the most popular, probably because they remove the
distinction between the forward and inverse transform, letting us say that the sine transform of
the sine transform and the cosine transform of the cosine transform return the original function
for $t > 0$.
In today's optical-engineering textbooks, and in user manuals for the fast Fourier transform,
there is a tendency to choose Eqs. (2.29a)–(2.29d) as the definitions of the forward and inverse
Fourier transform, and that is the convention followed here. It is perhaps somewhat
unconventional not to use the frequency variable $\omega = 2\pi f$ when defining the sine and cosine
transforms, but using ƒ rather than ω brings their definitions into conformity with the definitions
chosen for the forward and inverse Fourier transforms.


2.6 Fourier Transform as a Linear Operation


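The forward and inverse Fourier transforms are linear operations. If α, β are any two complex
constants and u(t), v(t) are two complex-valued functions of a real variable t, then the definition
of a linear operator L is that

$$ L\big(\alpha\,u(t) + \beta\,v(t)\big) = \alpha\,L\big(u(t)\big) + \beta\,L\big(v(t)\big). \tag{2.32a} $$

Examples of linear operators are multiplication by a specified function g(t)

$$ L_1\big(u(t)\big) = g(t)\cdot u(t), $$

differentiation with respect to t

$$ L_2\big(u(t)\big) = \frac{du(t)}{dt}, $$

and integration over the interval $t_1 \le t \le t_2$

$$ L_3\big(u(t)\big) = \int_{t_1}^{t_2} u(t)\,dt. $$

We see that for these three examples

$$ L_1\big(\alpha u(t) + \beta v(t)\big) = \alpha\,g(t)u(t) + \beta\,g(t)v(t) = \alpha L_1\big(u(t)\big) + \beta L_1\big(v(t)\big), $$

$$ L_2\big(\alpha u(t) + \beta v(t)\big) = \alpha\frac{du(t)}{dt} + \beta\frac{dv(t)}{dt} = \alpha L_2\big(u(t)\big) + \beta L_2\big(v(t)\big), $$
and
$$ L_3\big(\alpha u(t) + \beta v(t)\big) = \alpha\int_{t_1}^{t_2} u(t)\,dt + \beta\int_{t_1}^{t_2} v(t)\,dt = \alpha L_3\big(u(t)\big) + \beta L_3\big(v(t)\big). $$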

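Combinations of linear operators are always linear; for example, the operator Z defined by

$$ Z\big(u(t)\big) = L_3\Big(L_1\big(u(t)\big)\Big) $$
must be linear because

$$ \begin{aligned}
Z\big(\alpha u(t) + \beta v(t)\big) &= L_3\Big(L_1\big(\alpha u(t) + \beta v(t)\big)\Big) = L_3\Big(\alpha L_1\big(u(t)\big) + \beta L_1\big(v(t)\big)\Big) \\
&= \alpha L_3\Big(L_1\big(u(t)\big)\Big) + \beta L_3\Big(L_1\big(v(t)\big)\Big) \\
&= \alpha Z\big(u(t)\big) + \beta Z\big(v(t)\big).
\end{aligned} \tag{2.32b} $$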

We note that the forward Fourier transform

$$ \mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt $$

as defined in Eq. (2.29a) is, in fact, just $L_3(L_1(u(t)))$ with $g(t) = e^{-2\pi ift}$ in the $L_1$ multiplication
and $t_1 = -\infty$, $t_2 = \infty$ in the $L_3$ integration. Similarly, the inverse Fourier transform is,
interchanging the roles of the ƒ and t variables in Eq. (2.29c),

$$ \mathcal{F}^{(ift)}\big(U(t)\big) = \int_{-\infty}^{\infty} U(t)\,e^{2\pi ift}\,dt, $$

showing it to be $L_3(L_1(U(t)))$ with $g(t) = e^{2\pi ift}$ in the $L_1$ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in
the $L_3$ integration. Equation (2.32b) thus shows that both the forward and inverse Fourier
transforms are linear. The unextended and extended sine transforms in Eqs. (2.8a) and (2.14a),

$$ \mathcal{S}^{(ft)}\big(u(t)\big) = 2\int_{0}^{\infty} u(t)\sin(2\pi ft)\,dt \quad\text{and}\quad \mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt, $$

are also both $L_3(L_1(u(t)))$: the unextended sine transform has $g(t) = 2\sin(2\pi ft)$ in the $L_1$
multiplication and $t_1 = 0$, $t_2 = \infty$ in the $L_3$ integration; and the extended sine transform has
$g(t) = \sin(2\pi ft)$ in the $L_1$ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the $L_3$ integration. The
unextended and extended cosine transforms in Eqs. (2.8b) and (2.14b),

$$ \mathcal{C}^{(ft)}\big(u(t)\big) = 2\int_{0}^{\infty} u(t)\cos(2\pi ft)\,dt \quad\text{and}\quad \mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt, $$

are, of course, identical to the unextended and extended sine transforms in being $L_3(L_1(u(t)))$;
the only change is that the sines change to cosines in the $L_1$ multiplications. From Eq. (2.32b), all
four transforms (the extended sine transform, the unextended sine transform, the extended
cosine transform, and the unextended cosine transform) are linear operations. We see that the
only other transform discussed so far, the Hartley transform

$$ \mathcal{H}^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,dt $$

in Eq. (2.26h), must also be linear because it is

$$ L_3\Big(L_1\big(u(t)\big)\Big) \quad\text{with}\quad g(t) = \cos(2\pi ft) + \sin(2\pi ft) $$

in the $L_1$ multiplication and has $t_1 = -\infty$, $t_2 = \infty$ in the $L_3$ integration.
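
The composition argument can be mirrored in a few lines of code. The sketch below is an illustration under stated assumptions: the integral in $L_3$ is replaced by a Riemann sum on a finite grid, the function names `L1`, `L3`, and `Z` simply echo the operators of the text, and the test functions and constants are arbitrary choices.

```python
import numpy as np

def L1(u_values, g_values):
    """Multiplication by a specified function g(t), sampled on the same grid."""
    return g_values * u_values

def L3(u_values, dt):
    """Integration over the grid, approximated by a Riemann sum."""
    return np.sum(u_values) * dt

def Z(u_values, t, f):
    """L3(L1(u)) with g(t) = exp(-2 pi i f t): the forward Fourier transform at one frequency f."""
    return L3(L1(u_values, np.exp(-2j * np.pi * f * t)), t[1] - t[0])

t = np.arange(-8.0, 8.0, 0.05)
u = np.exp(-np.pi * (t - 0.5) ** 2)
v = (1j + t) * np.exp(-np.pi * t ** 2)
alpha, beta = 2.0 - 1.5j, -0.7 + 0.4j

f = 0.8
lhs = Z(alpha * u + beta * v, t, f)
rhs = alpha * Z(u, t, f) + beta * Z(v, t, f)
print(abs(lhs - rhs))   # expected to be at round-off level
```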

2.7 Mathematical Symmetries of the Fourier Transform


There are a large number of symmetry relations that hold for any function u(t) and its Fourier
transform

$$ U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt. \tag{2.33a} $$

We have already seen that the inverse Fourier transform of U(f) returns the original function,

$$ \int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = \mathcal{F}^{(itf)}\big(U(f)\big) = u(t). \tag{2.33b} $$

Replacing t by −t changes this to

$$ u(-t) = \mathcal{F}^{(-itf)}\big(U(f)\big). $$

Interchanging the roles of variables ƒ and t, we get

$$ u(-f) = \mathcal{F}^{(-ift)}\big(U(t)\big), \tag{2.33c} $$

which shows that u(−f) is the forward Fourier transform of U(t). We expect, then, that U(t) is the
inverse Fourier transform of u(−f). To show this is true, we interchange the roles of variables ƒ
and t in (2.33a) and then make $f' = -f$ the new variable of integration to get


$$ \begin{aligned}
U(t) = \mathcal{F}^{(-itf)}\big(u(f)\big) &= \int_{-\infty}^{\infty} u(f)\,e^{-2\pi ift}\,df = \int_{-\infty}^{\infty} u(-f')\,e^{2\pi if't}\,df' = \int_{-\infty}^{\infty} u(-f)\,e^{2\pi ift}\,df \\
&= \mathcal{F}^{(itf)}\big(u(-f)\big).
\end{aligned} \tag{2.33d} $$

Not only does this show that U(t) is the inverse Fourier transform of u(−f) but also, by comparing
the two expressions involving the $\mathcal{F}$ operator, we see that changing the sign of the integration
variable ƒ does not change the value of the Fourier operation $\mathcal{F}$. It does, however, change its
name: the first $\mathcal{F}$ operation in (2.33d) is the forward Fourier transform of u(f) and the second $\mathcal{F}$
operation in (2.33d) is the inverse Fourier transform of u(−f). Taking the complex conjugate of all
three expressions in Eq. (2.33b) gives

$$ \int_{-\infty}^{\infty} U(f)^{*}\,e^{-2\pi ift}\,df = \mathcal{F}^{(-itf)}\big(U(f)^{*}\big) = u(t)^{*}, $$

which shows that we get the complex conjugate of operator $\mathcal{F}$ by taking the complex conjugates
of the quantities inside both parentheses. Starting with the original Fourier transform relationship
between U and u,
$$ U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big) \tag{2.33e} $$
and
$$ u(t) = \mathcal{F}^{(itf)}\big(U(f)\big), \tag{2.33f} $$

we take the complex conjugates of both sides of (2.33e),

$$ U(f)^{*} = \mathcal{F}^{(ift)}\big(u(t)^{*}\big), $$
and then change the sign of ƒ to get

$$ U(-f)^{*} = \mathcal{F}^{(-ift)}\big(u(t)^{*}\big). \tag{2.33g} $$

This shows that U(−f)* is the forward Fourier transform of u(t)*. Since U(−f)* is the forward
Fourier transform of u(t)*, we expect the inverse Fourier transform of U(−f)* to be u(t)*. To show
this is true, we just change the sign of the integration variable in Eq. (2.33f),

$$ u(t) = \mathcal{F}^{(-itf)}\big(U(-f)\big), $$

and then take the complex conjugate to get

$$ u(t)^{*} = \mathcal{F}^{(itf)}\big(U(-f)^{*}\big). \tag{2.33h} $$

Hence, u(t)* is indeed the inverse Fourier transform of U(−f)*.

When u(t) is a strictly real function, as it is for much of the Fourier-transform work done in
this book, u equals its complex conjugate so that

$$ \mathcal{F}^{(-ift)}\big(u(t)\big) = \mathcal{F}^{(-ift)}\big(u(t)^{*}\big), $$

and Eq. (2.33g) becomes

$$ U(-f)^{*} = \mathcal{F}^{(-ift)}\big(u(t)\big). $$

But $\mathcal{F}^{(-ift)}(u(t))$ is just U(f), the forward Fourier transform of u, so

$$ U(-f)^{*} = U(f) $$

or, taking the complex conjugate of both sides,

$$ U(-f) = U(f)^{*}. \tag{2.34a} $$

Functions U(f) that obey Eq. (2.34a) are called Hermitian. If u(t) is purely imaginary, so that
$u(t)^{*} = -u(t)$, then Eq. (2.33g) becomes

$$ U(-f)^{*} = \mathcal{F}^{(-ift)}\big(-u(t)\big) $$
or
$$ \mathcal{F}^{(-ift)}\big(u(t)\big) = -U(-f)^{*}, \tag{2.34b} $$

where the linearity of $\mathcal{F}$ is used to take (−1) outside the transform and shift it over to the other
side of the equation. Since $\mathcal{F}^{(-ift)}(u(t))$ is just U(f), Eq. (2.34b) shows that

$$ U(f) = -U(-f)^{*} $$
or
$$ U(-f) = -U(f)^{*} \tag{2.34c} $$

when u is purely imaginary. Functions U(f) that obey Eq. (2.34c) are called anti-Hermitian. A
special and very important case occurs when u is both real and even. Then, since U is the forward
Fourier transform of u with $U(f) = \mathcal{F}^{(-ift)}(u(t))$, we take the complex conjugate of both sides to
get

$$ U(f)^{*} = \mathcal{F}^{(ift)}\big(u(t)^{*}\big). $$
Because u is real this becomes, changing the sign of the variable of integration,

$$ U(f)^{*} = \mathcal{F}^{(ift)}\big(u(t)\big) = \mathcal{F}^{(-ift)}\big(u(-t)\big). $$

Because u is even, this simplifies to

$$ U(f)^{*} = \mathcal{F}^{(-ift)}\big(u(t)\big) = U(f) $$
so that
$$ U(f)^{*} = U(f). \tag{2.34d} $$

Hence, U equals its own complex conjugate, which shows it must be real. Because u is real, we
already know that U is Hermitian and (2.34a) must hold true; now that U is known to be real, Eq.
(2.34a) can be written as
$$ U(-f) = U(f). \tag{2.34e} $$

This shows that U must be real and even when u is real and even. Taking the real part of Eq.
(2.33a) now gives, since both U and u are known to be real,

$$ U(f) = \mathrm{Re}\left(\int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt\right) = \int_{-\infty}^{\infty} u(t)\,\mathrm{Re}\big(e^{-2\pi ift}\big)\,dt, $$

which becomes, applying Eq. (2.27),

$$ U(f) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt. \tag{2.34f} $$

Because u(t) is also even, we know that the product $u(t)\cos(2\pi ft)$ is even with respect to t,
which means that (2.34f) can be written as [see formula (2.19) above]

$$ U(f) = 2\int_{0}^{\infty} u(t)\cos(2\pi ft)\,dt. \tag{2.34g} $$

The right-hand side is the unextended cosine transform of u, showing that when u(t) is real and
even, its Fourier transform equals its cosine transform. According to Eq. (2.8f), it follows that u
must then be the cosine transform of U,

$$ u(t) = 2\int_{0}^{\infty} U(f)\cos(2\pi ft)\,df. \tag{2.34h} $$
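
The symmetry relations (2.34a), (2.34c), (2.34d), and (2.34e) have exact discrete counterparts that can be checked with NumPy's FFT. The sketch below is an illustration on a periodic sample grid (an assumption of this example, not the integral transform of the text); the index array `neg` plays the role of replacing ƒ by −ƒ, or t by −t, on that grid.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 32
neg = (-np.arange(n)) % n           # index of -f (or -t) on the periodic grid

# Real input: the transform is Hermitian, U(-f) = U(f)*.
x_real = rng.normal(size=n)
U = np.fft.fft(x_real)
hermitian_gap = np.max(np.abs(U[neg] - np.conj(U)))

# Purely imaginary input: the transform is anti-Hermitian, U(-f) = -U(f)*.
x_imag = 1j * rng.normal(size=n)
V = np.fft.fft(x_imag)
anti_hermitian_gap = np.max(np.abs(V[neg] + np.conj(V)))

# Real and even input: the transform is real and even.
x_even = 0.5 * (x_real + x_real[neg])
W = np.fft.fft(x_even)
imaginary_residue = np.max(np.abs(W.imag))
even_gap = np.max(np.abs(W[neg] - W))

print(hermitian_gap, anti_hermitian_gap, imaginary_residue, even_gap)
```

All four printed numbers should be at round-off level.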

2.8 Basic Fourier Identities


There are a number of simple Fourier identities that are true for the transforms of any function u.
One very simple identity, surprisingly easy to overlook, is that when U(f) is the forward or
inverse Fourier transform of u(t), the value of U at the origin is the total integral of u:

$$ U(f)\Big|_{f=0} = \left[\int_{-\infty}^{\infty} u(t)\,e^{\mp 2\pi ift}\,dt\right]_{f=0} $$
or
$$ U(0) = \int_{-\infty}^{\infty} u(t)\,dt. \tag{2.35a} $$

Similarly, u(0) is the total integral of U(f):

$$ u(t)\Big|_{t=0} = \left[\int_{-\infty}^{\infty} U(f)\,e^{\pm 2\pi ift}\,df\right]_{t=0} $$
or
$$ u(0) = \int_{-\infty}^{\infty} U(f)\,df. \tag{2.35b} $$
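
As a quick numerical illustration of Eqs. (2.35a) and (2.35b), the sketch below uses a shifted Gaussian whose forward transform is known in closed form (a standard result, not derived in the text) and compares Riemann-sum areas against the values at the origin. The grids and the shift are arbitrary choices for this example.

```python
import numpy as np

t = np.arange(-8.0, 8.0, 0.05)
f = np.arange(-4.0, 4.0, 0.05)
dt, df = t[1] - t[0], f[1] - f[0]

shift = 0.5
u = np.exp(-np.pi * (t - shift) ** 2)
# Closed-form forward transform of u(t) under the convention of Eq. (2.29a).
U = np.exp(-2j * np.pi * f * shift) * np.exp(-np.pi * f ** 2)

area_u = np.sum(u) * dt                 # approximates the total integral of u(t)
area_U = np.sum(U) * df                 # approximates the total integral of U(f)

U_at_0 = 1.0                            # closed-form U(0)
u_at_0 = np.exp(-np.pi * shift ** 2)    # closed-form u(0)

print(abs(area_u - U_at_0), abs(area_U - u_at_0))
```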

When U(f) is the forward Fourier transform of u(t), the nth derivative of U is

$$ \frac{d^{n}U}{df^{n}} = \frac{\partial^{n}}{\partial f^{n}}\int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt = (-2\pi i)^{n}\int_{-\infty}^{\infty} \big[t^{n}u(t)\big]\,e^{-2\pi ift}\,dt; \tag{2.35c} $$

and, because Eqs. (2.29a) and (2.29d) require u to be the inverse transform of U when U is the
forward transform of u, the nth derivative of u is

$$ \frac{d^{n}u}{dt^{n}} = \frac{\partial^{n}}{\partial t^{n}}\int_{-\infty}^{\infty} U(f)\,e^{2\pi ift}\,df = \int_{-\infty}^{\infty} \big[(2\pi i)^{n}f^{n}U(f)\big]\,e^{2\pi ift}\,df. \tag{2.35d} $$

Therefore, when both u and $d^{n}u/dt^{n}$ satisfy requirements (V) through (VIII) in Sec. 2.4 and U(f)
is the forward Fourier transform of u(t), Eq. (2.35d) shows that $[(2\pi i)^{n}f^{n}U(f)]$ must be the
forward Fourier transform of $d^{n}u/dt^{n}$ because $d^{n}u/dt^{n}$ is the inverse Fourier transform of
$[(2\pi i)^{n}f^{n}U(f)]$. Equation (2.35c) similarly shows that when u(t) and $[t^{n}u(t)]$ satisfy
requirements (V) through (VIII) in Sec. 2.4 and U(f) is the forward Fourier transform of u(t), the
forward Fourier transform of $[t^{n}u(t)]$ is

$$ \frac{1}{(-2\pi i)^{n}}\frac{d^{n}U}{df^{n}}. $$

We introduce the notation "$\leftrightarrow$" to show this sort of Fourier-transform relationship between
functions, adopting the convention that the function on the right is always the forward Fourier
transform of the function on the left and the function on the left is always the inverse Fourier
transform of the function on the right. The results of the above analysis can then be written as

$$ \frac{d^{n}u}{dt^{n}} \leftrightarrow (2\pi i)^{n}f^{n}U(f) \tag{2.35e} $$
and
$$ t^{n}u(t) \leftrightarrow \frac{1}{(-2\pi i)^{n}}\frac{d^{n}U}{df^{n}}. \tag{2.35f} $$
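
The two derivative relations can be checked numerically for n = 1. The sketch below is illustrative and rests on stated assumptions: the forward transform is approximated by a Riemann sum on a finite grid, the test function is the Gaussian $e^{-\pi t^{2}}$, whose transform $e^{-\pi f^{2}}$ and derivatives are known in closed form, and the helper name `forward` is hypothetical.

```python
import numpy as np

def forward(values, t, f):
    """Riemann-sum approximation of the forward transform  integral values(t) exp(-2 pi i f t) dt."""
    dt = t[1] - t[0]
    return np.exp(-2j * np.pi * np.outer(f, t)) @ values * dt

t = np.arange(-8.0, 8.0, 0.05)
f = np.arange(-3.0, 3.0, 0.05)

u = np.exp(-np.pi * t ** 2)
du_dt = -2 * np.pi * t * u                         # exact derivative of u
U_exact = np.exp(-np.pi * f ** 2)                  # closed-form transform of u
dU_df_exact = -2 * np.pi * f * U_exact             # exact derivative of U

# du/dt <-> (2 pi i) f U(f), the n = 1 case of the derivative relation.
derivative_gap = np.max(np.abs(forward(du_dt, t, f) - 2j * np.pi * f * U_exact))

# t u(t) <-> (1 / (-2 pi i)) dU/df, the n = 1 case of the multiplication relation.
moment_gap = np.max(np.abs(forward(t * u, t, f) - dU_df_exact / (-2j * np.pi)))

print(derivative_gap, moment_gap)
```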

For the integral of any complex function c(t), the inequality

$$ \left|\int_{a}^{b} c(t)\,dt\right| \le \int_{a}^{b} \big|c(t)\big|\,dt \tag{2.35g} $$

must hold true for any two real values of a and b where $a \le b$. When u(t) is real, so is its nth
derivative, and we can write

$$ \left|\int_{-\infty}^{\infty} \frac{d^{n}u}{dt^{n}}\,e^{-2\pi ift}\,dt\right| \le \int_{-\infty}^{\infty} \left|\frac{d^{n}u}{dt^{n}}\,e^{-2\pi ift}\right|dt = \int_{-\infty}^{\infty} \left|\frac{d^{n}u}{dt^{n}}\right|\cdot\big|e^{-2\pi ift}\big|\,dt, $$

which reduces to, since $|e^{-2\pi ift}| = 1$,

$$ \left|\int_{-\infty}^{\infty} \frac{d^{n}u}{dt^{n}}\,e^{-2\pi ift}\,dt\right| \le \int_{-\infty}^{\infty} \left|\frac{d^{n}u}{dt^{n}}\right|dt. \tag{2.35h} $$

Because we are supposing the Fourier transform of $d^{n}u/dt^{n}$ to exist, the existence requirement
in Eq. (2.13a) shows that

$$ \int_{-\infty}^{\infty} \left|\frac{d^{n}u}{dt^{n}}\right|dt $$

is finite. Hence, inequality (2.35h) requires

$$ \left|\int_{-\infty}^{\infty} \frac{d^{n}u}{dt^{n}}\,e^{-2\pi ift}\,dt\right| $$

also to be finite, which means that we can assume that it is less than or equal to some finite real
and non-negative number B for all values of ƒ:

$$ \left|\int_{-\infty}^{\infty} \frac{d^{n}u}{dt^{n}}\,e^{-2\pi ift}\,dt\right| \le B. \tag{2.35i} $$

Formula (2.35e) states that

$$ \int_{-\infty}^{\infty} \frac{d^{n}u}{dt^{n}}\,e^{-2\pi ift}\,dt = (2\pi)^{n}\,i^{n}f^{n}\,U(f), \tag{2.35j} $$

where

$$ U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt $$

is, of course, the Fourier transform of u(t). Taking the magnitude of the complex values of both
sides of (2.35j) and remembering that $|i^{n}| = 1$ shows that

$$ \left|\int_{-\infty}^{\infty} \frac{d^{n}u}{dt^{n}}\,e^{-2\pi ift}\,dt\right| = (2\pi)^{n}\,|f|^{n}\,\big|U(f)\big|, $$

which becomes, applying inequality (2.35i),

$$ B \ge (2\pi)^{n}\,|f|^{n}\,\big|U(f)\big| $$
or
$$ \big|U(f)\big| \le \frac{B}{(2\pi)^{n}}\,|f|^{-n}. \tag{2.35k} $$

Hence, when the Fourier transform of the nth derivative of u(t) exists, we know that the
magnitude $|U(f)|$ of the Fourier transform of u decreases at least as fast as $|f|^{-n}$ for large values of ƒ.
We next examine a set of identities often called the Fourier shift theorem. When U(f) is the
forward Fourier transform of u(t),

$$ U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi ift}\,dt, $$

and u(t) is shifted to the right by an amount a,

$$ u(t) \to u(t - a), $$

then the forward Fourier transform of $u(t - a)$ is, changing the variable of integration to
$t' = t - a$,

$$ \begin{aligned}
\int_{-\infty}^{\infty} u(t - a)\,e^{-2\pi ift}\,dt &= \int_{-\infty}^{\infty} u(t')\,e^{-2\pi if(t' + a)}\,dt' \\
&= e^{-2\pi ifa}\int_{-\infty}^{\infty} u(t')\,e^{-2\pi ift'}\,dt' = e^{-2\pi ifa}\,U(f).
\end{aligned} $$

Hence the forward Fourier transform of $u(t - a)$ is $e^{-2\pi ifa}U(f)$ when the forward Fourier
transform of u(t) is U(f), which we can write as

$$ \text{If } u(t) \leftrightarrow U(f) \text{ then } u(t - a) \leftrightarrow e^{-2\pi ifa}\,U(f). \tag{2.36a} $$

In terms of the Fourier $\mathcal{F}$ operator, we have

$$ \mathcal{F}^{(-ift)}\big(u(t - a)\big) = e^{-2\pi ifa}\,\mathcal{F}^{(-ift)}\big(u(t)\big). \tag{2.36b} $$

Working with the inverse Fourier transform of $U(f - f_0)$ and changing the variable of
integration to $f' = f - f_0$, we see that

$$ \int_{-\infty}^{\infty} U(f - f_0)\,e^{2\pi ift}\,df = e^{2\pi if_0 t}\int_{-\infty}^{\infty} U(f')\,e^{2\pi if't}\,df' = e^{2\pi if_0 t}\,u(t) \tag{2.36c} $$
or
$$ e^{2\pi if_0 t}\,u(t) \leftrightarrow U(f - f_0). \tag{2.36d} $$

The $\mathcal{F}$ operator lets us write this result as

$$ \mathcal{F}^{(itf)}\big(U(f - f_0)\big) = e^{2\pi if_0 t}\,\mathcal{F}^{(itf)}\big(U(f)\big) \tag{2.36e} $$
or
$$ \mathcal{F}^{(-ift)}\big(e^{2\pi if_0 t}\,u(t)\big) = U(f - f_0) = \mathcal{F}^{(-i(f - f_0)t)}\big(u(t)\big). \tag{2.36f} $$

Equations (2.36d)–(2.36f) show that multiplying u(t) by $e^{2\pi if_0 t}$ shifts U(ƒ), the forward Fourier
transform of u(t), to the right by a frequency $f_0$. By interchanging the roles of t and ƒ (and
replacing u by U and $f_0$ by a) in (2.36e) and comparing the result to (2.36b), we see the two
equations can be combined into one formula:

$$ \mathcal{F}^{(\pm ift)}\big(u(t - a)\big) = e^{\pm 2\pi ifa}\,\mathcal{F}^{(\pm ift)}\big(u(t)\big). \tag{2.36g} $$

This last result can also be written as, defining a new constant $b = -a$,

$$ \int_{-\infty}^{\infty} u(t + b)\,e^{\pm 2\pi ift}\,dt = e^{\mp 2\pi ifb}\int_{-\infty}^{\infty} u(t)\,e^{\pm 2\pi ift}\,dt \tag{2.36h} $$
or
$$ \mathcal{F}^{(\pm ift)}\big(u(t + b)\big) = e^{\mp 2\pi ifb}\,\mathcal{F}^{(\pm ift)}\big(u(t)\big). \tag{2.36i} $$
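
The shift relations can be tried out numerically. The sketch below is only an illustration: the forward transform is approximated by a Riemann sum on a finite grid, the test function and the values of a and $f_0$ are arbitrary choices, and the helper names `forward` and `u` are hypothetical.

```python
import numpy as np

def forward(values, t, f):
    """Riemann-sum approximation of the forward transform  integral values(t) exp(-2 pi i f t) dt."""
    dt = t[1] - t[0]
    return np.exp(-2j * np.pi * np.outer(f, t)) @ values * dt

def u(t):
    """Rapidly decaying test function that is neither even nor odd."""
    return (1.0 + 0.5 * t) * np.exp(-np.pi * t ** 2)

t = np.arange(-8.0, 8.0, 0.05)
f = np.arange(-3.0, 3.0, 0.05)
a, f0 = 0.75, 1.25

U = forward(u(t), t, f)

# Shifting u to the right by a multiplies the forward transform by exp(-2 pi i f a).
shifted = forward(u(t - a), t, f)
shift_gap = np.max(np.abs(shifted - np.exp(-2j * np.pi * f * a) * U))

# Multiplying u by exp(2 pi i f0 t) moves the forward transform to the right by f0.
modulated = forward(np.exp(2j * np.pi * f0 * t) * u(t), t, f)
U_moved = forward(u(t), t, f - f0)      # U evaluated at f - f0
modulation_gap = np.max(np.abs(modulated - U_moved))

print(shift_gap, modulation_gap)
```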

The next set of identities is sometimes called the Fourier scaling theorem. If U(ƒ) is the
forward Fourier transform of u(t) and the argument of u is scaled by the real constant a,

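\[
u(t) \to u(at),
\]

then the forward Fourier transform of $u(at)$ is, letting $t' = at$,

\[
\int_{-\infty}^{\infty} u(at)\,e^{-2\pi i f t}\,dt
= \frac{1}{|a|}\int_{-\infty}^{\infty} u(t')\,e^{-2\pi i \left(\frac{f t'}{a}\right)}\,dt'
= \frac{1}{|a|}\,U\!\left(\frac{f}{a}\right).
\]

This can be written as

\[
u(at) \leftrightarrow \frac{1}{|a|}\,U\!\left(\frac{f}{a}\right) \tag{2.37a}
\]

or

\[
F^{(-ift)}\bigl(u(at)\bigr) = \frac{1}{|a|}\,F^{(-i(f/a)t)}\bigl(u(t)\bigr). \tag{2.37b}
\]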

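We also have, scaling the frequency by a positive constant a and letting $f' = af$, that

\[
\int_{-\infty}^{\infty} U(af)\,e^{2\pi i f t}\,df
= \frac{1}{a}\int_{-\infty}^{\infty} U(f')\,e^{2\pi i \left(\frac{f' t}{a}\right)}\,df'
= \frac{1}{a}\,u\!\left(\frac{t}{a}\right).
\]

This can be written as

\[
\frac{1}{a}\,u\!\left(\frac{t}{a}\right) \leftrightarrow U(af) \quad \text{for } a > 0 \tag{2.37c}
\]

or

\[
F^{(itf)}\bigl(U(af)\bigr) = \frac{1}{a}\,F^{(i(t/a)f)}\bigl(U(f)\bigr) \quad \text{for } a > 0. \tag{2.37d}
\]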

Equation (2.37b) and (after interchanging the roles of f and t) Eq. (2.37d) can be combined into the single formula,

\[
F^{(\pm ift)}\bigl(u(at)\bigr) = \frac{1}{a}\,F^{(\pm i(f/a)t)}\bigl(u(t)\bigr) \quad \text{for } a > 0. \tag{2.37e}
\]
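The scaling theorem (2.37a) can be checked the same way. This sketch assumes an off-center Gaussian as the test signal (so that u is neither even nor odd and the sign of a matters) and a midpoint-rule approximation of the forward transform; the helper name `forward` and the chosen values of a and f are illustrative assumptions.

```python
import cmath
import math

def forward(u, f, t_max=40.0, n=20000):
    """Midpoint-rule estimate of the integral of u(t)*exp(-2*pi*i*f*t) dt."""
    dt = 2.0 * t_max / n
    total = 0j
    for k in range(n):
        t = -t_max + (k + 0.5) * dt
        total += u(t) * cmath.exp(-2j * math.pi * f * t)
    return total * dt

def u(t):
    # Hypothetical off-center Gaussian test signal.
    return math.exp(-math.pi * (t - 0.3) ** 2)

f = 0.3
scaling_errors = {}
for a in (0.5, -2.0):                        # one stretch, one compression with a flip
    lhs = forward(lambda t: u(a * t), f)     # transform of u(at)
    rhs = forward(u, f / a) / abs(a)         # (1/|a|) U(f/a), Eq. (2.37a)
    scaling_errors[a] = abs(lhs - rhs)
print(scaling_errors)
```

The negative value of a is included to exercise the $1/|a|$ factor; for positive a the same check covers (2.37e) with the minus sign.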

Because u(t) must satisfy requirements (V) through (VIII) in Sec. 2.4 for these results to be true (and in particular it must satisfy requirement (V) that it be absolutely integrable), there may well be only a finite region of t over which u(t) is significantly different from zero. When $0 < a < 1$, so that the range of t over which u is significantly different from zero expands, formula (2.37a) shows that the region of f over which $U(f)$ is significantly different from zero shrinks; and, of course, when $a > 1$, just the opposite occurs. For $0 < a < 1$, function $u(at)$ more closely resembles $\sin(2\pi f t)$ and $\cos(2\pi f t)$ for smaller values of f, explaining why the region of f for which U is significantly different from zero shrinks; and when $a > 1$, function $u(at)$ more closely resembles $\sin(2\pi f t)$ and $\cos(2\pi f t)$ for larger values of f, explaining why the region of f for which U is significantly different from zero expands. We also note that if $f = 1/(2\pi)$, so that $\sin(2\pi f t) = \sin(t)$ and $\cos(2\pi f t) = \cos(t)$, then the sine and cosine can change significantly in value only when t changes by at least

\[
\Delta t_{\min} \approx O(1).
\]

Suppose t must also change by at least $\Delta t_{\min} \approx O(1)$ for a significant change in u(t) to occur, which means that $\sin(2\pi f t) = \sin(t)$ and $\cos(2\pi f t) = \cos(t)$ vary about as fast with respect to t as u does; that is, $\sin(t)$ and $\cos(t)$ “resemble” u somewhat. Recalling the heuristic reasoning used in Sec. 2.1 to introduce and justify the sine and cosine integrals, we now expect $U(f)$ to be significantly different from zero when $f \approx 1/(2\pi)$. Suppose next that t changes by less than $\Delta t_{\min} \approx O(1)$ so that u does not change significantly in value, remaining almost constant. Now when f becomes significantly larger than $1/(2\pi)$, functions $\sin(2\pi f t)$ and $\cos(2\pi f t)$ oscillate ever more rapidly so that they change significantly in value for changes in t that are ever smaller than $\Delta t_{\min}$. For these larger values of f, the sine and cosine do not much resemble u(t), forcing the Fourier transform $U(f)$ to be negligible or zero for $f > O(1/(2\pi))$.

We can modify the original function u by creating a new function $u_\varepsilon(t) = u(t/\varepsilon)$ for $\varepsilon > 0$. Now t must change by at least an $O(\varepsilon)$ amount for $u_\varepsilon$ to change significantly; and when t changes by less than $O(\varepsilon)$, function $u_\varepsilon$ does not change significantly in value. We know from (2.37a) with $a = 1/\varepsilon$ that the forward Fourier transform of $u_\varepsilon$ is $U_\varepsilon(f) = \varepsilon\,U(\varepsilon f)$. Hence, when f is larger than $O(1/(2\pi\varepsilon))$, it must be true that $U_\varepsilon(f)$ is negligible or zero, since this is the same as having $f > O(1/(2\pi))$ in $U(f)$. Because $2\pi$ is often regarded as an O(1) quantity, this result can also be interpreted as showing that $U_\varepsilon(f)$ must be negligible or zero for $f > O(1/\varepsilon)$. Since the original Fourier transform pair

\[
u(t) \leftrightarrow U(f)
\]

is left unspecified, $u_\varepsilon$ in fact represents any function v(t) where t must change by at least an $O(\varepsilon)$ amount for a significant change in v to occur. Consequently, we can conclude that if t must change by at least an $O(\varepsilon)$ amount for v(t) to change significantly, then the forward Fourier transform of v(t) must be negligible or zero for $f > O(1/\varepsilon)$. The arguments leading to this conclusion work just as well when we consider the inverse Fourier transform in Eqs. (2.37c) and (2.37e). Therefore, this more general result is also true: if v(t) is a function such that t must change by at least an $O(\varepsilon)$ amount for a significant change in v to occur, then the forward or inverse Fourier transform,

\[
V(f) = \int_{-\infty}^{\infty} v(t)\,e^{\pm 2\pi i f t}\,dt,
\]

is negligible or zero for $f > O(1/\varepsilon)$.


2.9 Fourier Convolution Theorem


It is hard to overstate the importance of the Fourier convolution theorem; it plays a fundamental
role in linear signal theory and structures the thinking of many different engineering
disciplines—signal processing, electrical engineering, image analysis, and servomechanism
design, to name but a few.
We define the convolution of two functions u(t) and v(t) to be

\[
u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\,v(t-t')\,dt'. \tag{2.38a}
\]

Here, u and v may be complex functions but their argument t is assumed to be real. The convolution is commutative and associative. It is commutative because making the substitution $t'' = t - t'$ gives

\[
u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\,v(t-t')\,dt'
= -\int_{\infty}^{-\infty} u(t-t'')\,v(t'')\,dt''
= \int_{-\infty}^{\infty} v(t'')\,u(t-t'')\,dt'',
\]

showing that

\[
u(t) * v(t) = v(t) * u(t). \tag{2.38b}
\]

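The convolution is associative because for three complex functions u(t), v(t), and h(t) with real argument t we can write, changing the variable of integration to $t''' = t'' - t'$,

\[
\begin{aligned}
\bigl[u(t) * v(t)\bigr] * h(t)
&= \int_{-\infty}^{\infty} dt''\,h(t-t'') \int_{-\infty}^{\infty} dt'\,u(t')\,v(t''-t')
 = \int_{-\infty}^{\infty} dt'\,u(t') \int_{-\infty}^{\infty} dt''\,h(t-t'')\,v(t''-t') \\
&= \int_{-\infty}^{\infty} dt'\,u(t') \int_{-\infty}^{\infty} dt'''\,v(t''')\,h\bigl((t-t')-t'''\bigr) \\
&= u(t) * \bigl[v(t) * h(t)\bigr].
\end{aligned}
\]

Hence,

\[
\bigl[u(t) * v(t)\bigr] * h(t) = u(t) * \bigl[v(t) * h(t)\bigr]. \tag{2.38c}
\]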

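The convolution is a linear operation, because for any two complex constants $\alpha$ and $\beta$,

\[
\begin{aligned}
h(t) * \bigl[\alpha\,u(t) + \beta\,v(t)\bigr]
&= \int_{-\infty}^{\infty} h(t')\bigl[\alpha\,u(t-t') + \beta\,v(t-t')\bigr]\,dt' \\
&= \alpha\int_{-\infty}^{\infty} h(t')\,u(t-t')\,dt' + \beta\int_{-\infty}^{\infty} h(t')\,v(t-t')\,dt',
\end{aligned}
\]

showing that

\[
h(t) * \bigl(\alpha\,u(t) + \beta\,v(t)\bigr) = \alpha\bigl[h(t) * u(t)\bigr] + \beta\bigl[h(t) * v(t)\bigr]. \tag{2.38d}
\]

Because the convolution is commutative, the equation can also be written as

\[
\bigl[\alpha\,u(t) + \beta\,v(t)\bigr] * h(t) = \alpha\bigl[u(t) * h(t)\bigr] + \beta\bigl[v(t) * h(t)\bigr]. \tag{2.38e}
\]

This shows that the convolution is linear on both the left-hand and right-hand sides of the $*$.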
The convolution of two even functions or two odd functions is an even function. If u(t) and v(t) are both even or both odd, then we have, using $t'' = -t'$,

\[
u(-t) * v(-t) = \int_{-\infty}^{\infty} u(t')\,v(-t-t')\,dt'
= \int_{-\infty}^{\infty} u(-t'')\,v(-t+t'')\,dt''
= \int_{-\infty}^{\infty} u(t'')\,v(t-t'')\,dt''
= u(t) * v(t). \tag{2.38f}
\]

When u is even and v is odd, or u is odd and v is even, then we have

\[
u(-t) * v(-t) = \int_{-\infty}^{\infty} u(t')\,v(-t-t')\,dt'
= \int_{-\infty}^{\infty} u(-t'')\,v(-t+t'')\,dt''
= -\int_{-\infty}^{\infty} u(t'')\,v(t-t'')\,dt''
= -u(t) * v(t). \tag{2.38g}
\]

Hence, the convolution of an even and an odd function is always odd.
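The algebraic properties (2.38b), (2.38c), (2.38f), and (2.38g) have exact discrete counterparts, which makes them easy to test. The sketch below is an illustration only: it replaces the convolution integral by a sum over integer samples (an assumption, not the continuous definition (2.38a)), and the sample values are arbitrary.

```python
def convolve(u, v):
    """Full discrete convolution w[n] = sum over k of u[k] * v[n - k]."""
    w = [0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            w[i + j] += ui * vj
    return w

u, v, h = [1, 2, 3], [0, 1, -1, 2], [2, -1]
commutative = convolve(u, v) == convolve(v, u)                            # (2.38b)
associative = convolve(convolve(u, v), h) == convolve(u, convolve(v, h))  # (2.38c)

# Samples taken symmetrically about t = 0: an even sequence reads the same
# when reversed, an odd sequence changes sign when reversed.
even, odd = [1, 2, 5, 2, 1], [-3, -1, 0, 1, 3]
ee, oo, eo = convolve(even, even), convolve(odd, odd), convolve(even, odd)
even_results = ee == ee[::-1] and oo == oo[::-1]                          # (2.38f)
odd_result = eo == [-x for x in eo[::-1]]                                 # (2.38g)
print(commutative, associative, even_results, odd_result)
```

Integer samples are used so that the comparisons are exact; all four flags come out `True`.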


If u and v have more than one argument so that they are written $u(y, x_1, x_2, \ldots)$ and $v(y, x_1', x_2', \ldots)$, then we adopt the convention that the convolution

\[
u(y, x_1, x_2, \ldots) * v(y, x_1', x_2', \ldots)
\]

is over variable y rather than variables $x_1, x_1', x_2, x_2', \ldots$,

\[
u(y, x_1, x_2, \ldots) * v(y, x_1', x_2', \ldots)
= \int_{-\infty}^{\infty} u(y', x_1, x_2, \ldots)\,v(y-y', x_1', x_2', \ldots)\,dy',
\]

because y is the only argument repeated on both sides of the $*$.


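To derive the Fourier convolution theorem, we take the forward or inverse transform of $u(t) * v(t)$ to get

\[
\begin{aligned}
F^{(\pm ift)}\bigl(u(t) * v(t)\bigr)
&= \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,\bigl[u(t) * v(t)\bigr]\,dt
 = \int_{-\infty}^{\infty} dt\,e^{\pm 2\pi i f t} \int_{-\infty}^{\infty} dt'\,u(t')\,v(t-t') \\
&= \int_{-\infty}^{\infty} dt'\,u(t') \int_{-\infty}^{\infty} dt\,e^{\pm 2\pi i f t}\,v(t-t').
\end{aligned}
\]

Changing the variable of integration in the inner integral to $t'' = t - t'$ gives

\[
\begin{aligned}
F^{(\pm ift)}\bigl(u(t) * v(t)\bigr)
&= \int_{-\infty}^{\infty} dt'\,u(t')\,e^{\pm 2\pi i f t'} \int_{-\infty}^{\infty} dt''\,e^{\pm 2\pi i f t''}\,v(t'') \\
&= \left[\int_{-\infty}^{\infty} dt'\,u(t')\,e^{\pm 2\pi i f t'}\right] \cdot \left[\int_{-\infty}^{\infty} dt''\,e^{\pm 2\pi i f t''}\,v(t'')\right]
\end{aligned}
\]

or

\[
F^{(\pm ift)}\bigl(u(t) * v(t)\bigr) = F^{(\pm ift)}\bigl(u(t)\bigr) \cdot F^{(\pm ift)}\bigl(v(t)\bigr). \tag{2.39a}
\]

If $U(f)$ and $V(f)$ are the forward Fourier transforms of u(t) and v(t) respectively, we can choose the minus sign of (2.39a) to get

\[
\int_{-\infty}^{\infty} e^{-2\pi i f t}\,\bigl[u(t) * v(t)\bigr]\,dt = U(f) \cdot V(f), \tag{2.39b}
\]

which shows that

\[
u(t) * v(t) \leftrightarrow U(f) \cdot V(f). \tag{2.39c}
\]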

Equation (2.28$\ell$) can be written as, for any function g(t) after interchanging the roles of t and $t'$,

\[
F^{(\pm it'f)}\Bigl(F^{(\mp ift)}\bigl(g(t)\bigr)\Bigr) = g(t'). \tag{2.39d}
\]

We replace $F^{(\pm)}$ by $F^{(\mp)}$ on the right-hand side of Eq. (2.39a), which is just a change in the order in which the two possible signs of the exponent are listed, and then take $F^{(\pm it'f)}$ of both sides to get that, applying (2.39d) with $g(t) = u(t) * v(t)$,

\[
u(t') * v(t') = F^{(\pm it'f)}\Bigl(F^{(\mp ift)}\bigl(u(t)\bigr) \cdot F^{(\mp ift)}\bigl(v(t)\bigr)\Bigr). \tag{2.39e}
\]

Because u(t) and v(t) represent arbitrary, Fourier-transformable functions of t, $F^{(\mp ift)}(u(t))$ and $F^{(\mp ift)}(v(t))$ must be arbitrary, Fourier-transformable functions of f, which we can call $U^{(\mp)}$ and $V^{(\mp)}$ respectively,

\[
U^{(\mp)}(f) = F^{(\mp ift)}\bigl(u(t)\bigr) \tag{2.39f}
\]

and

\[
V^{(\mp)}(f) = F^{(\mp ift)}\bigl(v(t)\bigr). \tag{2.39g}
\]

Applying this notation to (2.39d), first with $g(t) = u(t)$ and then with $g(t) = v(t)$, we see that

\[
F^{(\pm it'f)}\bigl(U^{(\mp)}(f)\bigr) = u(t') \tag{2.39h}
\]

and

\[
F^{(\pm it'f)}\bigl(V^{(\mp)}(f)\bigr) = v(t'). \tag{2.39i}
\]

Hence Eq. (2.39e) can be written as

\[
F^{(\pm it'f')}\bigl(U^{(\mp)}(f')\bigr) * F^{(\pm it'f'')}\bigl(V^{(\mp)}(f'')\bigr)
= F^{(\pm it'f)}\bigl(U^{(\mp)}(f) \cdot V^{(\mp)}(f)\bigr),
\]

where the convolution is over $t'$ because it is the only argument repeated on both sides of the $*$. Since $U^{(\mp)}$ and $V^{(\mp)}$ are arbitrary, transformable functions, we can replace them by the arbitrary transformable functions u and v to get, after interchanging the roles of f and $t'$,

\[
F^{(\pm ift')}\bigl(u(t') \cdot v(t')\bigr) = F^{(\pm ift'')}\bigl(u(t'')\bigr) * F^{(\pm ift''')}\bigl(v(t''')\bigr).
\]

This can be simplified by dropping a prime from each of the t’s:

\[
F^{(\pm ift)}\bigl(u(t) \cdot v(t)\bigr) = F^{(\pm ift')}\bigl(u(t')\bigr) * F^{(\pm ift'')}\bigl(v(t'')\bigr). \tag{2.39j}
\]

If $U(f)$ and $V(f)$ are the forward Fourier transforms of u(t) and v(t) respectively, we can choose the minus sign of (2.39j) to get

\[
\int_{-\infty}^{\infty} e^{-2\pi i f t}\,u(t) \cdot v(t)\,dt = U(f) * V(f) \tag{2.39k}
\]

or

\[
u(t) \cdot v(t) \leftrightarrow U(f) * V(f). \tag{2.39$\ell$}
\]

Equation (2.39b) shows that the forward Fourier transform of the convolution of two functions
is the product of the forward Fourier transform of each function, and (2.39k) shows that the
forward Fourier transform of the product of two functions is the convolution of the forward
Fourier transform of each function. Equations (2.39a) and (2.39j) show that everything we just
said about the forward Fourier transform still holds true when we take the reverse Fourier
transform of the product of two functions or of the convolution of two functions.
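A discrete analog of (2.39b) can be verified directly. The sketch below assumes that the integrals are replaced by unit-step sums over a few samples, in which case the identity holds exactly (up to rounding) at every frequency; the helper names `convolve` and `dtft` and the sample values are illustrative assumptions.

```python
import cmath
import math

def convolve(u, v):
    """Full discrete convolution w[n] = sum over k of u[k] * v[n - k]."""
    w = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            w[i + j] += ui * vj
    return w

def dtft(x, f):
    """Sum of x[n]*exp(-2*pi*i*f*n): a unit-step sum standing in for the forward transform."""
    return sum(xn * cmath.exp(-2j * math.pi * f * n) for n, xn in enumerate(x))

u = [0.5, 1.0, -2.0, 0.25]     # arbitrary sample values
v = [1.0, 3.0, 0.5]
w = convolve(u, v)

# Transform of the convolution versus product of the transforms, Eq. (2.39b).
max_error = max(abs(dtft(w, f) - dtft(u, f) * dtft(v, f))
                for f in (0.0, 0.1, 0.37, 0.5))
print(max_error)
```

The printed value should be at rounding level. The dual statement (2.39k) has a discrete counterpart too, but it involves periodic (circular) convolution in frequency and is not attempted here.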
When using the Fourier convolution theorem, we usually regard one of the two convolved
functions as representing the undisturbed signal—that is, the true set of values for what is to be
measured—and the other—usually much more narrow—function as specifying the blurring or
smearing effect of an imperfect measurement. The blurring or smearing function has different
names in different engineering disciplines; optical engineers often call it the instrument-response
or instrument line-shape function. In Fig. 2.5(a), function u is taken to be the true signal, and in
Fig. 2.5(b) function v is the instrument-response or instrument line-shape function. The
convolution

\[
u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\,v(t-t')\,dt' = u_{\mathrm{blur}}(t)
\]

defines the new function $u_{\mathrm{blur}}(t)$ as shown in Figs. 2.5(c)–2.5(e). The function v is flipped left to right and slid along the $t'$ axis in Fig. 2.5(c) by changing the value of t. Figure 2.5(d) is a close-up of v at a specific value of t, with the shaded region being the area under the product $u(t')\,v(t-t')$. Since $u(t')\,v(t-t')$ is zero where $v(t-t')$ is zero, the area of the shaded region can be found by integrating $u(t')\,v(t-t')$ over $t'$ between $-\infty$ and $+\infty$. This is, of course, just the convolution of u and v for this particular value of t, which means the area of the shaded region must be $u_{\mathrm{blur}}(t)$ for this value of t. Figure 2.5(e) represents the complete $u_{\mathrm{blur}}(t)$ function for all values of t; clearly $u_{\mathrm{blur}}$ has less detail than the original signal u.
The v(t) function in Fig. 2.5(b) is an unusual type of instrument response because it is not an
even function of t. Figure 2.5(f) shows a typical even instrument response $v_e(t)$. When the instrument-response function is $v_e$, the blurred signal is

\[
u_{e,\mathrm{blur}}(t) = u(t) * v_e(t). \tag{2.40a}
\]

The instrument-response function is even, so $v_e(-t) = v_e(t)$ and we can write

\[
u_{e,\mathrm{blur}}(t) = \int_{-\infty}^{\infty} u(t')\,v_e(t-t')\,dt' = \int_{-\infty}^{\infty} u(t')\,v_e(t'-t)\,dt' \tag{2.40b}
\]

with the last integral in (2.40b) making it perhaps more obvious that $u_{e,\mathrm{blur}}$ is a localized and weighted average of u centered on t. Instrument-response or line-shape functions are usually
designed to be even because an even instrument-response function does not shift the center point
of isolated peaks in the true data u.
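The claim about peak positions can be illustrated with a few samples. This is a sketch under assumptions: the signal, the two three-point response functions, and the helper names `blur` and `peak` are all hypothetical, and the convolution integral is replaced by a sum with the middle kernel sample playing the role of $v(0)$.

```python
def blur(u, kernel):
    """Discrete u*v, with the kernel's middle sample taken as v(0)."""
    center = len(kernel) // 2
    out = []
    for n in range(len(u)):
        total = 0
        for j, weight in enumerate(kernel):
            k = n - (j - center)          # u index paired with v(j - center)
            if 0 <= k < len(u):
                total += weight * u[k]
        out.append(total)
    return out

def peak(x):
    """Index of the largest sample."""
    return max(range(len(x)), key=x.__getitem__)

u = [0, 0, 1, 4, 9, 4, 1, 0, 0]           # isolated peak at index 4
even_kernel = [1, 2, 1]                   # v(-1) = v(1): an even response
lopsided_kernel = [0, 1, 3]               # v(-1) = 0, v(1) = 3: not even

peak_true = peak(u)
peak_even = peak(blur(u, even_kernel))
peak_lopsided = peak(blur(u, lopsided_kernel))
print(peak_true, peak_even, peak_lopsided)   # prints: 4 4 5
```

The even response leaves the peak at index 4, while the lopsided response moves it to index 5, which is the kind of shift an even instrument-response function avoids.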
As described in the first chapter, when using Michelson interferometers, we do not much care
about the exact shape of the optical intensity signal u but are instead interested in the shape of its
transform,
\[
U(f) = F^{(-ift)}\bigl(u(t)\bigr). \tag{2.40c}
\]

In many types of interferometers, u is a signal of time t, which means U can be analyzed as a function of f, the signal frequency. The electrical circuits transmitting and recording the signal u can never do a perfect job (they always blur and smooth the original signal to some extent), so what we end up with is not u(t) and $U(f)$ but rather $u_{e,\mathrm{blur}}(t)$ and the associated Fourier transform

\[
U_{e,\mathrm{blur}}(f) = F^{(-ift)}\bigl(u_{e,\mathrm{blur}}(t)\bigr). \tag{2.40d}
\]

The relationship between $U_{e,\mathrm{blur}}$ and U must be understood to design the electrical circuits properly. Here is an important example of how to use the Fourier convolution theorem. Substitution of (2.40a) into (2.40d) gives

\[
U_{e,\mathrm{blur}}(f) = F^{(-ift)}\bigl(u(t) * v_e(t)\bigr).
\]

Using the Fourier convolution theorem as presented in Eq. (2.39a), this is rewritten as

\[
U_{e,\mathrm{blur}}(f) = F^{(-ift)}\bigl(u(t)\bigr) \cdot F^{(-ift)}\bigl(v_e(t)\bigr)
\]

or

\[
U_{e,\mathrm{blur}}(f) = U(f) \cdot V_e(f), \tag{2.40e}
\]

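where $U(f)$ comes from (2.40c) and we define

\[
V_e(f) = F^{(-ift)}\bigl(v_e(t)\bigr).
\]

[FIGURE 2.5. (a) The true signal $u(t)$. (b) The instrument-response function $v(t)$. (c) The signal $u(t')$ with the flipped function $v(t-t')$ slid along the $t'$ axis to a given value of t. (d) Close-up of $v(t-t')$ at a specific value of t, with the shaded area under the product $u(t')\,v(t-t')$. (e) The blurred signal $u_{\mathrm{blur}}(t)$. (f) An even instrument-response function $v_e(t)$.]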

Equation (2.40e) is a very reassuring result, stating that as long as $V_e(f)$ is known and not zero, we can recover the Fourier transform of the true signal $U(f)$ from $U_{e,\mathrm{blur}}(f)$ by calculating

\[
U(f) = \frac{U_{e,\mathrm{blur}}(f)}{V_e(f)}. \tag{2.40f}
\]

To design the circuits of a Michelson interferometer, we find the frequencies ƒ for which U(ƒ)
must be known and arrange for Ve to be as constant as possible—and definitely not zero—over
these frequencies. It turns out that preserving certain signal frequencies while neglecting others is
a standard problem in electrical circuit design, and it is usually easy to arrange for this to occur.
There is, in fact, a whole branch of electrical engineering called filter theory that describes
exactly how to design circuits where Ve is zero or very small at some frequencies while being
large and quasi-constant at others.
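The recovery formula (2.40f), and the reason $V_e$ must not vanish, can be seen in a small discrete example. The sketch assumes unit-step sums in place of the integrals, a hypothetical three-point even response, and illustrative helper names; it is not a model of any real interferometer circuit.

```python
import cmath
import math

def convolve(u, v):
    """Full discrete convolution w[n] = sum over k of u[k] * v[n - k]."""
    w = [0.0] * (len(u) + len(v) - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            w[i + j] += ui * vj
    return w

def dtft(x, f, origin=0):
    """Sum of x[n]*exp(-2*pi*i*f*(n - origin)); sample n sits at time n - origin."""
    return sum(xn * cmath.exp(-2j * math.pi * f * (n - origin))
               for n, xn in enumerate(x))

u = [0.0, 1.0, 4.0, 2.0, 0.5]      # hypothetical "true" signal samples
v_e = [0.25, 0.5, 0.25]            # even response centered on its middle sample
u_blur = convolve(u, v_e)          # its sample n sits at time n - 1

recovery_errors = []
for f in (0.0, 0.1, 0.25):
    V_e = dtft(v_e, f, origin=1)                   # equals (1 + cos(2*pi*f))/2
    U_rec = dtft(u_blur, f, origin=1) / V_e        # Eq. (2.40f)
    recovery_errors.append(abs(U_rec - dtft(u, f)))

V_e_at_half = abs(dtft(v_e, 0.5, origin=1))        # the response vanishes at f = 0.5
print(recovery_errors, V_e_at_half)
```

At the three test frequencies the division recovers the transform of u to rounding accuracy. At f = 0.5 this particular response is zero, so (2.40f) cannot be used there, which is why the circuits are arranged to keep $V_e$ away from zero over the frequencies of interest.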

2.10 Fourier Transforms and Divergent Integrals


Fourier-transform theory has a history of treating with extreme kindness engineers and scientists
who blindly use its formalism without worrying about whether their manipulations make
mathematical sense. The rule of thumb seems to be that if the final result is mathematically
sound—such as a finite integral or the transform of an obviously transformable function—it
almost never matters whether intermediate steps involve the transforms of functions that
obviously cannot be transformed or even, strictly speaking, are not true functions at all. Any
reasonably comprehensive table of Fourier transforms contains functions that not only violate
requirements (V) through (VIII) in Sec. 2.4 but also have transform integrals that, according to
the standard definition of integration, either diverge or have no well-defined value. This book
shows that these puzzling entries are the modest but ubiquitous legacy of mathematicians who
have extended the meaning of what is meant by an integral and what is meant by a function in
Fourier-transform theory. Their work has not only benefited many scientists and engineers who
no longer have to apologize for the way they solve Fourier-transform problems but has also
helped their students who no longer need to accept without good explanations divergent integrals
and the transforms of poorly defined functions.
The standard definition of an improper integral

\[
\int_{-\infty}^{\infty} u(t)\,dt
\]

for the function u(t) is that

\[
\int_{-\infty}^{\infty} u(t)\,dt = \lim_{\substack{T_1 \to \infty \\ T_2 \to \infty}} \int_{-T_1}^{T_2} u(t)\,dt.
\]

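If there is any singular point $t_s$ where $\lim_{t \to t_s} u(t) = \pm\infty$, the definition becomes

\[
\int_{-\infty}^{\infty} u(t)\,dt
= \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}}
\left[\int_{-T_1}^{t_s - \varepsilon_1} u(t)\,dt + \int_{t_s + \varepsilon_2}^{T_2} u(t)\,dt\right]. \tag{2.41a}
\]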

In this definition, the limits as $T_1 \to \infty$, $T_2 \to \infty$, $\varepsilon_1 \to 0$, and $\varepsilon_2 \to 0$ occur independently; no matter how $T_1$, $T_2$, $\varepsilon_1$, and $\varepsilon_2$ approach their limits, the same answer is expected if the integral exists. We now decide, in the interest of expanding Fourier-transform theory, to change this standard definition of improper integral by connecting $\varepsilon_1$ to $\varepsilon_2$ and $T_1$ to $T_2$ as we take the limit,

\[
\int_{-\infty}^{\infty} u(t)\,dt
= \lim_{\substack{T \to \infty \\ \varepsilon \to 0}}
\left[\int_{-T}^{t_s - \varepsilon} u(t)\,dt + \int_{t_s + \varepsilon}^{T} u(t)\,dt\right]. \tag{2.41b}
\]

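The limiting process in definition (2.41b) is said to give the Cauchy principal value of the integral, sometimes written as

\[
\mathrm{PV}\!\int_{-\infty}^{\infty} u(t)\,dt \quad \text{or} \quad -\!\!\!\!\!\!\int_{-\infty}^{\infty} u(t)\,dt.
\]

If u(t) has multiple singular points, the definition is expanded in the obvious way. For example, with two singular points at $t_{s1}$ and $t_{s2}$ with $t_{s1} < t_{s2}$, we have

\[
\mathrm{PV}\!\int_{-\infty}^{\infty} u(t)\,dt
= \lim_{\substack{T \to \infty \\ \varepsilon_1 \to 0 \\ \varepsilon_2 \to 0}}
\left[\int_{-T}^{t_{s1} - \varepsilon_1} u(t)\,dt + \int_{t_{s1} + \varepsilon_1}^{t_{s2} - \varepsilon_2} u(t)\,dt + \int_{t_{s2} + \varepsilon_2}^{T} u(t)\,dt\right] \tag{2.41c}
\]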

and so on for three, four, etc., interior points of singularity in u(t). If an improper integral converges to a finite value in the standard sense of (2.41a), then its Cauchy principal value also converges to the same answer, but many improper integrals that do not converge in the sense of (2.41a) nevertheless have well-defined Cauchy principal values. For this reason, it is customary in Fourier-transform theory to interpret all improper integrals, such as the forward and inverse Fourier transforms, as Cauchy principal values, and that is what we shall do from now on. There will be no special notation used to distinguish Cauchy principal values from ordinary improper integrals.

To show the relevance of the Cauchy principal value, we calculate the Fourier transform of $1/t$, an example already considered above in connection with the sine transform [see discussion following Eq. (2.10e)]. Using the identity $e^{i\theta} = \cos(\theta) + i\sin(\theta)$, we have

\[
F^{(-ift)}\bigl(t^{-1}\bigr) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\,t^{-1}\,dt
= \int_{-\infty}^{\infty} \cos(2\pi f t)\,t^{-1}\,dt - i\int_{-\infty}^{\infty} \sin(2\pi f t)\,t^{-1}\,dt. \tag{2.42a}
\]
5 5 5

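There is no problem evaluating the imaginary part of this transform. Because $[t^{-1}\sin(2\pi f t)]$ is an even function of t, we can apply formulas (2.19) and (2.10f) to get

\[
-i\int_{-\infty}^{\infty} \sin(2\pi f t)\,t^{-1}\,dt = -2i\int_{0}^{\infty} \sin(2\pi f t)\,t^{-1}\,dt = -i\pi \quad \text{for } f > 0.
\]

When $f < 0$, we have

\[
-i\int_{-\infty}^{\infty} \sin(2\pi f t)\,t^{-1}\,dt = i\int_{-\infty}^{\infty} \sin(2\pi |f| t)\,t^{-1}\,dt = i\pi,
\]

allowing us to write

\[
-i\int_{-\infty}^{\infty} \sin(2\pi f t)\,t^{-1}\,dt = -i\pi\,\mathrm{sgn}(f), \tag{2.42b}
\]

where we define

\[
\mathrm{sgn}(f) =
\begin{cases}
\;\;\,1 & \text{for } f > 0 \\
\;\;\,0 & \text{for } f = 0 \\
-1 & \text{for } f < 0
\end{cases}. \tag{2.42c}
\]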

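The specification that $\mathrm{sgn}(0) = 0$ makes $\mathrm{sgn}(f)$ a proper odd function, equal to zero at $f = 0$, even though it has a jump discontinuity there. It also, of course, makes sense considering that (2.42b) is the integral of the zero function when $f = 0$. Evaluation of the real part of the transform in (2.42a) shows the usefulness of interpreting improper integrals as Cauchy principal values. When $f = 0$, the real part of the left-hand side of (2.42a) becomes, using the standard interpretation of an improper integral in (2.41a),

\[
\begin{aligned}
\int_{-\infty}^{\infty} \frac{dt}{t}
&= \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}}
\left[\int_{-T_1}^{-\varepsilon_1} \frac{dt}{t} + \int_{\varepsilon_2}^{T_2} \frac{dt}{t}\right]
= \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}}
\left[-\int_{\varepsilon_1}^{T_1} \frac{dt}{t} + \ln\!\left(\frac{T_2}{\varepsilon_2}\right)\right] \\
&= \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}}
\left[-\ln\!\left(\frac{T_1}{\varepsilon_1}\right) + \ln\!\left(\frac{T_2}{\varepsilon_2}\right)\right] \\
&= \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}}
\left[\ln\!\left(\frac{\varepsilon_1}{\varepsilon_2}\right) + \ln\!\left(\frac{T_2}{T_1}\right)\right].
\end{aligned} \tag{2.43a}
\]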

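The expression $\ln(\varepsilon_1/\varepsilon_2)$ can be made anything we want depending on the limiting ratio chosen for $\varepsilon_1/\varepsilon_2$ as $\varepsilon_1 \to 0$ and $\varepsilon_2 \to 0$; the same is true of $\ln(T_2/T_1)$ as $T_1 \to \infty$ and $T_2 \to \infty$. Therefore, under the standard interpretation of an improper integral, the limit in (2.43a) does not exist. Comparison of (2.41a) to (2.41b) shows that (2.43a) can be converted to a Cauchy principal value by setting $\varepsilon_1 = \varepsilon_2 = \varepsilon$, $T_1 = T_2 = T$, and taking the limit as $T \to \infty$, $\varepsilon \to 0$. This leads to

\[
\lim_{\substack{T \to \infty \\ \varepsilon \to 0}}
\left[\ln\!\left(\frac{\varepsilon}{\varepsilon}\right) + \ln\!\left(\frac{T}{T}\right)\right] = 0,
\]

allowing us to give a well-defined value to the expression $\int_{-\infty}^{\infty} dt/t$.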
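The dependence on how the limits are taken is easy to see numerically. The sketch below simply evaluates the logarithms that appear in (2.43a) for a sequence of cutoffs; the function name `truncated_integral` and the particular cutoff sequences are illustrative assumptions.

```python
import math

def truncated_integral(T1, T2, eps1, eps2):
    """Integral of dt/t over [-T1, -eps1] and [eps2, T2], using the logarithms of Eq. (2.43a)."""
    left = math.log(eps1 / T1)      # integral of dt/t from -T1 to -eps1
    right = math.log(T2 / eps2)     # integral of dt/t from eps2 to T2
    return left + right

# Independent limits: here eps1 is always twice eps2, so the result stays
# near ln(2) however far the limits are pushed.
lopsided = [truncated_integral(10.0**k, 10.0**k, 2.0 * 10.0**-k, 10.0**-k)
            for k in (2, 4, 8)]

# Tied limits, as in the Cauchy principal value (2.41b): the result stays
# at zero (up to rounding) at every stage.
tied = [truncated_integral(10.0**k, 10.0**k, 10.0**-k, 10.0**-k)
        for k in (2, 4, 8)]
print(lopsided, tied)
```

Choosing a different fixed ratio for the two inner cutoffs would give a different constant in the first list, which is the sense in which the ordinary improper integral does not exist.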
In general, the Cauchy principal value of the integral of any odd function is always zero,

\[
\int_{-\infty}^{\infty} u(t)\,dt = 0 \quad \text{for any function } u \text{ such that } u(-t) = -u(t), \tag{2.43b}
\]

because when taking the limit we are always simultaneously adding $u(t)\,dt$ increments to the integral at values of t and $-t$, with the balanced addition of increments always cancelling out. Hence, interpreted as a Cauchy principal value,

\[
\int_{-\infty}^{\infty} \cos(2\pi f t)\,t^{-1}\,dt = 0 \tag{2.43c}
\]

because $[t^{-1}\cos(2\pi f t)]$ is an odd function of t. Therefore we can now assign a well-defined meaning to the forward Fourier transform of $1/t$ in (2.42a) using (2.43c) and (2.42b):

\[
F^{(-ift)}\bigl(t^{-1}\bigr) = -i\pi\,\mathrm{sgn}(f). \tag{2.43d}
\]
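The imaginary part in (2.43d) can be checked by brute force. This sketch assumes a finite cutoff T and a midpoint rule, so it only approaches $\mp\pi$ to within roughly $1/(\pi f T)$; the helper names and the cutoff are illustrative assumptions.

```python
import math

def sine_integral(f, T=2000.0, n=200000):
    """Midpoint estimate of the integral of sin(2*pi*f*t)/t over [-T, T].

    The integrand is even in t, so twice the integral over [0, T] is used;
    midpoints never land on the removable singularity at t = 0.
    """
    dt = T / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += math.sin(2.0 * math.pi * f * t) / t
    return 2.0 * total * dt

def sgn(f):
    return (f > 0) - (f < 0)

# Coefficient of i in Eq. (2.42a), to be compared with -pi*sgn(f).
results = {f: -sine_integral(f) for f in (-1.0, 0.0, 1.0)}
for f, value in results.items():
    print(f, value, -math.pi * sgn(f))
```

The real part needs no computation: the truncated cosine integral over a symmetric range cancels pairwise, in line with (2.43c).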

For this answer to be a true extension to Fourier-transform theory, however, $1/t$ must satisfy Eq. (2.28$\ell$); that is, the inverse transform

\[
F^{(itf)}\bigl(-i\pi\,\mathrm{sgn}(f)\bigr)
\]

has to give back the original function $1/t$.

Direct evaluation of the inverse transform gives

\[
\begin{aligned}
F^{(itf)}\bigl(-i\pi\,\mathrm{sgn}(f)\bigr)
&= -i\pi\int_{-\infty}^{\infty} e^{2\pi i f t}\,\mathrm{sgn}(f)\,df \\
&= -i\pi\int_{-\infty}^{\infty} \cos(2\pi f t)\,\mathrm{sgn}(f)\,df + \pi\int_{-\infty}^{\infty} \sin(2\pi f t)\,\mathrm{sgn}(f)\,df.
\end{aligned} \tag{2.43e}
\]

The cosine integral is again the integral of an odd function so its Cauchy principal value is zero, but it is still not clear what value to assign the integral of $[\sin(2\pi f t)\,\mathrm{sgn}(f)]$. As the integral of an even function, we might try applying formula (2.19) to get

\[
\pi\int_{-\infty}^{\infty} \sin(2\pi f t)\,\mathrm{sgn}(f)\,df
\overset{?}{=} 2\pi\int_{0}^{\infty} \sin(2\pi f t)\,\mathrm{sgn}(f)\,df
= 2\pi\int_{0}^{\infty} \sin(2\pi f t)\,df, \tag{2.43f}
\]

but then we have the same difficulty already encountered when trying to evaluate the sine transform

\[
2\pi\int_{0}^{\infty} \sin(2\pi f t)\,df
\]

in Eq. (2.10g). To evaluate the inverse transform of $-i\pi\,\mathrm{sgn}(f)$, we need to create a new class of mathematical entities, called generalized functions, together with a set of rules for how they behave inside integrals. This extension to Fourier-transform theory is often called distribution theory, with the generalized functions called distributions.

2.11 Generalized Functions


Generalized functions are based on the well-established mathematical concept of a functional. A
functional is a rule for assigning a complex number to each member of a set of test functions,
where each test function $\phi$ has only one number assigned to it and the same number may end up assigned to different test functions. The Fourier transform of a function $\phi(t)$ at a specific frequency $f = f_0$ is a functional because it assigns the number $\Phi(f_0) = F^{(-if_0t)}\bigl(\phi(t)\bigr)$ to the test function $\phi$. In general, we can use any complex function u(t) having a real argument t as a weighting function inside an integral to create a functional. This functional, called $\int u$, is defined to be

\[
\left(\int u\right)[\phi] = \int_{-\infty}^{\infty} dt\,u(t)\,\phi(t) = \text{complex number}. \tag{2.44}
\]

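According to this definition the functional $\int u$ is linear, like the Fourier transform, because

\[
\begin{aligned}
\left(\int u\right)[\alpha\phi_1 + \beta\phi_2]
&= \int_{-\infty}^{\infty} u(t)\bigl[\alpha\phi_1(t) + \beta\phi_2(t)\bigr]\,dt
 = \alpha\int_{-\infty}^{\infty} u(t)\,\phi_1(t)\,dt + \beta\int_{-\infty}^{\infty} u(t)\,\phi_2(t)\,dt \\
&= \alpha\left(\int u\right)[\phi_1] + \beta\left(\int u\right)[\phi_2]
\end{aligned} \tag{2.45}
\]

for any two complex constants $\alpha$, $\beta$ and test functions $\phi_1$, $\phi_2$.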
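A functional in the sense of (2.44) maps a function to a number, which in code is naturally a function that takes a function. The sketch below assumes a midpoint-rule approximation of the integral and hypothetical choices for the weighting function and the test functions; it only illustrates the linearity property (2.45).

```python
import math

def make_functional(u, t_max=10.0, n=4000):
    """Return the functional phi -> integral of u(t)*phi(t) dt (midpoint sum on [-t_max, t_max])."""
    dt = 2.0 * t_max / n
    grid = [-t_max + (k + 0.5) * dt for k in range(n)]

    def apply(phi):
        return dt * sum(u(t) * phi(t) for t in grid)

    return apply

# Hypothetical complex weighting function and test functions.
int_u = make_functional(lambda t: (1 + 2j) * math.exp(-abs(t)))
phi1 = lambda t: math.exp(-t * t)
phi2 = lambda t: t * math.exp(-t * t)

alpha, beta = 2 - 1j, 0.5 + 3j
lhs = int_u(lambda t: alpha * phi1(t) + beta * phi2(t))
rhs = alpha * int_u(phi1) + beta * int_u(phi2)
linearity_error = abs(lhs - rhs)      # Eq. (2.45)
print(linearity_error)
```

Note that `int_u` returns one complex number per test function, and that `phi2`, being odd while the weighting function is even, is sent to (numerically) zero: different test functions can share the same number.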


From the notation $\int u$, it is clear that all functions u, as long as the integral in Eq. (2.44) exists, have associated with them the functional $\int u$ defined for the test functions $\phi$. There are also functionals that behave in every way like the functionals $\int u$, but for which no corresponding true function u can be defined. We can, however, associate with these functionals a new class of mathematical objects, called generalized functions, which can be shown to have many of the properties of true functions. For this reason, it is customary to use function notation when referring to generalized functions. If an already-understood functional has no true function u(t) associated with it, we can use the properties of this already-understood functional to define a generalized function called $u_G(t)$, with the subscript G reminding us that $u_G$ is a generalized function. By analogy with the true function u(t) associated with the functional $\int u$, the generalized function and its behavior inside integrals is defined in terms of the already-known functional, which we call $\int u_G$, using the definition

\[
\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \left(\int u_G\right)[\phi] \tag{2.46}
\]

for any test function $\phi$. Since we already know what complex number the functional $\int u_G$ gives for any test function $\phi$, Eq. (2.46) is not a definition of $\int u_G$ but rather a definition of what it means to put $[u_G(t) \cdot \phi(t)]$ inside an integral. Clearly, the generalized function itself is well defined only when its product with a test function is integrated over t. Because the functional $\int u_G$ behaves in every way like the functionals $\int u$ based on the Cauchy-principal-value integration of true functions, we have established a new type of integration using the product of generalized functions $u_G(t)$ with test functions $\phi(t)$. Hence, we have not only generalized what is meant by a function but have also extended again what is meant by integration.
To handle algebraic expressions involving both generalized functions and true functions, we must define what it means to say two generalized functions $u_G(t)$ and $v_G(t)$ are equal. We say that when

\[
\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} v_G(t)\,\phi(t)\,dt \tag{2.47a}
\]

for all appropriate test functions $\phi$, then

\[
u_G(t) = v_G(t). \tag{2.47b}
\]

We also define a generalized function $u_G(t)$, which we know only from its associated functional $\int u_G$ using definition (2.46), to be equal to a true function v(t) when

\[
\left(\int u_G\right)[\phi] = \left(\int v\right)[\phi] \tag{2.48a}
\]

for all appropriate test functions $\phi$. Another way of stating this is that whenever

\[
\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} v(t)\,\phi(t)\,dt \tag{2.48b}
\]

for all the test functions $\phi$, we say that

\[
u_G(t) = v(t). \tag{2.48c}
\]

Two generalized functions $u_G(t)$ and $v_G(t)$ are defined to be equal over an interval $a < t < b$ when

\[
\left(\int u_G\right)[\phi_{ab}] = \left(\int v_G\right)[\phi_{ab}] \tag{2.48d}
\]

or

\[
\int_{-\infty}^{\infty} u_G(t)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} v_G(t)\,\phi_{ab}(t)\,dt \tag{2.48e}
\]

for all test functions $\phi_{ab}(t)$ that are identically zero for all $t < a$ and for all $t > b$. The key point here is that we are explicitly allowing $\phi_{ab}(t)$ to be nonzero only inside the interval $a < t < b$. We also say that a true function v(t) equals a generalized function $u_G(t)$ in the interval $a < t < b$,

\[
u_G(t) = v(t) \quad \text{for } a < t < b, \tag{2.48f}
\]

whenever

\[
\int_{-\infty}^{\infty} u_G(t)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} v(t)\,\phi_{ab}(t)\,dt \tag{2.48g}
\]

for all the $\phi_{ab}(t)$ test functions. In Eqs. (2.48d)–(2.48g), we allow for half-infinite intervals by permitting constant b to be $\infty$ with constant a finite and constant a to be $-\infty$ with constant b finite.
The definitions of equality between two generalized functions or between a generalized function and a true function can be, depending on the set of test functions $\phi$ chosen, either very much looser than the standard idea of equality or very much the same. Suppose, by way of analogy, we define two true functions $u_1(t)$ and $u_2(t)$ to be “equal” when

\[
\int_{-\infty}^{\infty} u_1(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_2(t)\,\phi(t)\,dt \tag{2.49}
\]

for all test functions $\phi$. If the only allowed test function is $\phi(t) = 0$, then any two functions $u_1(t)$ and $u_2(t)$ are “equal.” If, on the other hand, the allowed test functions are $\phi(t) = e^{\pm 2\pi i f t}$ for all real values of f, we are saying that $u_1(t)$ and $u_2(t)$ are “equal” when their Fourier transforms $F^{(\pm ift)}\bigl(u_1(t)\bigr)$ and $F^{(\pm ift)}\bigl(u_2(t)\bigr)$ are the same. From the Fourier inversion formulas, it then follows that $u_1(t)$ must be identical to $u_2(t)$, except possibly at jump discontinuities and isolated points, for all reasonably well-behaved functions $u_1(t)$ and $u_2(t)$. In general, we expect the set of test functions to be diverse enough that serious thought and some mathematical ingenuity are required to find two functions $u_1(t)$ and $u_2(t)$ that satisfy Eq. (2.49) yet are not basically the same function. Of course, the integrals used in Eq. (2.49), and all the other integrals involving only true functions in Eqs. (2.44) through (2.48g), for that matter, must be known to exist. Often the finiteness of these integrals and the general smoothness of the test functions are enforced by the requirement that

\[
\lim_{|t| \to \infty}\bigl[\,|t|^{N}\,\phi(t)\bigr] = 0 \quad \text{for } N = 0, 1, 2, \ldots, \tag{2.50a}
\]

with the Mth derivative, $\phi^{(M)}(t) = d^{M}\phi/dt^{M}$, satisfying

\[
\lim_{|t| \to \infty}\bigl[\,|t|^{N}\,\phi^{(M)}(t)\bigr] = 0 \quad \text{for } N = 0, 1, 2, \ldots \text{ and } M = 1, 2, \ldots. \tag{2.50b}
\]

A function such as $e^{-at^{2}}$ for $a > 0$ satisfies (2.50a) and (2.50b), and in general all functions representing physically realistic measurements can be taken to satisfy these two requirements. It turns out, however, that the most useful and popular generalized function used in Fourier theory can handle a wider variety of test functions, requiring only that the test functions $\phi$ be continuous at $t = 0$ (see Sec. 2.14 below).

Continuing to develop what is meant by the $=$ sign applied to generalized functions, we say that the product of a true function w(t) and a generalized function $u_G(t)$ is another generalized function $v_G(t)$,

\[
v_G(t) = w(t) \cdot u_G(t), \tag{2.51a}
\]

which is defined to mean that

\[
\int_{-\infty}^{\infty} v_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} w(t)\,u_G(t)\,\phi(t)\,dt
\]

for all test functions $\phi(t)$. A linear combination of true functions and generalized functions specified by

\[
w_G(t) = u_1(t)\,v_{G1}(t) + u_2(t)\,v_{G2}(t) + \cdots \tag{2.51b}
\]

is defined to mean that

\[
\int_{-\infty}^{\infty} w_G(t)\,\phi(t)\,dt
= \int_{-\infty}^{\infty} u_1(t)\,v_{G1}(t)\,\phi(t)\,dt + \int_{-\infty}^{\infty} u_2(t)\,v_{G2}(t)\,\phi(t)\,dt + \cdots
\]

for all test functions $\phi(t)$. In general, there is no difficulty assigning a meaning to equations such as

\[
\begin{aligned}
&u_1(t)\,v_{G1}(t) + u_2(t)\,v_{G2}(t) + \cdots + u_N(t)\,v_{GN}(t) \\
&\qquad = U_1(t)\,V_{G1}(t) + U_2(t)\,V_{G2}(t) + \cdots + U_M(t)\,V_{GM}(t)
\end{aligned} \tag{2.51c}
\]

for true functions $u_1(t), u_2(t), \ldots, u_N(t), U_1(t), U_2(t), \ldots, U_M(t)$ and generalized functions $v_{G1}(t), v_{G2}(t), \ldots, v_{GN}(t), V_{G1}(t), V_{G2}(t), \ldots, V_{GM}(t)$. As long as both sides of the equation are just linear combinations of generalized functions and true functions, we interpret their equality to mean that

\[
\begin{aligned}
&\int_{-\infty}^{\infty} u_1(t)\,v_{G1}(t)\,\phi(t)\,dt + \int_{-\infty}^{\infty} u_2(t)\,v_{G2}(t)\,\phi(t)\,dt + \cdots + \int_{-\infty}^{\infty} u_N(t)\,v_{GN}(t)\,\phi(t)\,dt \\
&\qquad = \int_{-\infty}^{\infty} U_1(t)\,V_{G1}(t)\,\phi(t)\,dt + \int_{-\infty}^{\infty} U_2(t)\,V_{G2}(t)\,\phi(t)\,dt + \cdots + \int_{-\infty}^{\infty} U_M(t)\,V_{GM}(t)\,\phi(t)\,dt
\end{aligned}
\]

for all test functions $\phi(t)$. Even the simplest nonlinear expressions, however, such as

\[
v_G(t) \overset{?}{=} \bigl[u_G(t)\bigr]^{2},
\]

cannot be resolved by putting both sides inside an integral, because the right-hand side of

\[
\int_{-\infty}^{\infty} v_G(t)\,\phi(t)\,dt \overset{?}{=} \int_{-\infty}^{\infty} \bigl[u_G(t)\bigr]^{2}\,\phi(t)\,dt
\]

is still undefined. We know that the left-hand side is the same as applying the already-understood functional $\int u_G$ to $\phi$,

\[
\int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt = \left(\int u_G\right)[\phi],
\]

but no definition has been given to

\[
\int_{-\infty}^{\infty} \bigl[u_G(t)\bigr]^{2}\,\phi(t)\,dt
\]

in terms of the functional $\int u_G$. It turns out that, in general, nonlinear expressions involving generalized functions cannot be given useful interpretations. Hence, generalized functions must be treated with caution unless they are used inside linear combinations of the type shown in (2.51b) and (2.51c).
Although generalized functions do have limitations, there are many things that can be done
with them. We can give meaning to $u_G(t - a)$ for any real constant $a$ by defining that

\[
\int_{-\infty}^{\infty} u_G(t - a)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t)\,\phi(t + a)\,dt \tag{2.52a}
\]

for all test functions $\phi$. This definition is, of course, consistent with what happens when the
formal substitution $t' = t - a$ is made inside the original integral,

\[
\int_{-\infty}^{\infty} u_G(t - a)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t')\,\phi(t' + a)\,dt' = \int_{-\infty}^{\infty} u_G(t)\,\phi(t + a)\,dt ,
\]

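treating $u_G(t - a)$ like a true function $u(t - a)$. We can give meaning to $u_G(at)$ for any real
constant $a$ by defining that

\[
\int_{-\infty}^{\infty} u_G(at)\,\phi(t)\,dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t)\,\phi(t/a)\,dt \tag{2.52b}
\]

for all test functions $\phi$. This definition is consistent with what happens when we make the formal
substitution $t' = at$ in the integral

\[
\int_{-\infty}^{\infty} u_G(at)\,\phi(t)\,dt
\]

and treat $u_G(at)$ like a true function,

\[
\int_{-\infty}^{\infty} u_G(at)\,\phi(t)\,dt =
\left\{
\begin{aligned}
&\frac{1}{a}\int_{-\infty}^{\infty} u_G(t')\,\phi(t'/a)\,dt' && \text{for } a > 0 \\
&\frac{1}{a}\int_{\infty}^{-\infty} u_G(t')\,\phi(t'/a)\,dt' && \text{for } a < 0
\end{aligned}
\right\}
= \frac{1}{|a|}\int_{-\infty}^{\infty} u_G(t)\,\phi(t/a)\,dt .
\]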

When the argument of $u_G$ is a linear combination $at + c$ for real constants $a$ and $c$, we
define

\[
\int_{-\infty}^{\infty} u_G(at + c)\,\phi(t)\,dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t)\,\phi\big((t - c)/a\big)\,dt \tag{2.52c}
\]

and, combining the arguments used to explain definitions (2.52a) and (2.52b), we see that
transforming the variable of integration to $t' = at + c$ gives

\[
\int_{-\infty}^{\infty} u_G(at + c)\,\phi(t)\,dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t')\,\phi\big((t' - c)/a\big)\,dt' ,
\]

justifying definition (2.52c). In general, any variable transformation that is permitted for the
argument of a true function we also permit for the argument of a generalized function unless it
results in an inappropriate test function.
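
For ordinary (true) functions, the linear change of variable behind these definitions can be checked numerically. The sketch below is an illustration added here, not part of the original derivation. It assumes the linear argument is written as $at + c$, uses Gaussian-shaped stand-ins `u` and `phi` chosen only for this example, and approximates both integrals with a simple midpoint rule.

```python
import math

def integrate(g, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

def u(t):
    # Hypothetical true function standing in for u_G (illustration only)
    return math.exp(-(t - 0.5) ** 2)

def phi(t):
    # Hypothetical smooth, rapidly decaying test function
    return (1.0 + t) * math.exp(-t * t)

a, c = -2.0, 0.7  # arbitrary nonzero scale a and offset c

# Left side: integral of u(a*t + c) * phi(t) dt
lhs = integrate(lambda t: u(a * t + c) * phi(t), -12.0, 12.0)
# Right side: (1/|a|) * integral of u(t) * phi((t - c)/a) dt
rhs = integrate(lambda t: u(t) * phi((t - c) / a), -12.0, 12.0) / abs(a)

print(lhs, rhs)
```

The two printed numbers should agree to within the accuracy of the quadrature. A negative value of `a` is used on purpose, so that the role of the absolute value $|a|$ is exercised.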
We define a generalized function $u_G(t)$ to be even if

\[
\int_{-\infty}^{\infty} u_G(t)\,\phi_o(t)\,dt = 0 \tag{2.52d}
\]

for all odd test functions $\phi_o$, and we define $u_G(t)$ to be odd if

\[
\int_{-\infty}^{\infty} u_G(t)\,\phi_e(t)\,dt = 0 \tag{2.52e}
\]

for all even test functions $\phi_e$. This gives $u_G(t)$ the same behavior it would have if it were an even
or odd true function multiplied by $\phi_o$ or $\phi_e$, respectively, and integrated over all $t$. Putting a subscript $e$ on the
generalized function $u_{Ge}(t)$ to show that it obeys the above definition for an even generalized
function, we note that, as described in Eq. (2.11c) above, any test function $\phi(t)$ can be written as
the sum of an even function $\phi_e(t)$ and an odd function $\phi_o(t)$. Hence, for any test function $\phi$ and
an even generalized function $u_{Ge}(t)$, we can write, using definition (2.52d),

\[
\begin{aligned}
\int_{-\infty}^{\infty} u_{Ge}(t)\,\phi(t)\,dt
&= \int_{-\infty}^{\infty} u_{Ge}(t)\left[\phi_e(t) + \phi_o(t)\right]dt \\
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt + \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_o(t)\,dt \\
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt .
\end{aligned}
\]

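Definition (2.52b) with $a = -1$ gives, again using that $\phi(t) = \phi_e(t) + \phi_o(t)$,

\[
\begin{aligned}
\int_{-\infty}^{\infty} u_{Ge}(-t)\,\phi(t)\,dt
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi(-t)\,dt
 = \int_{-\infty}^{\infty} u_{Ge}(t)\left[\phi_e(-t) + \phi_o(-t)\right]dt \\
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(-t)\,dt + \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_o(-t)\,dt \\
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt - \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_o(t)\,dt \\
&= \int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt ,
\end{aligned}
\]

where in the last two steps we use $\phi_o(-t) = -\phi_o(t)$, $\phi_e(-t) = \phi_e(t)$, and definition (2.52d). We see
that both

\[
\int_{-\infty}^{\infty} u_{Ge}(t)\,\phi(t)\,dt \quad\text{and}\quad \int_{-\infty}^{\infty} u_{Ge}(-t)\,\phi(t)\,dt
\]

are equal to

\[
\int_{-\infty}^{\infty} u_{Ge}(t)\,\phi_e(t)\,dt
\]

for any test function $\phi$, so by definition (2.47a) for the equality of two generalized functions, it
follows that
\[
u_{Ge}(-t) = u_{Ge}(t) \tag{2.52f}
\]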

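for any even generalized function $u_{Ge}(t)$. If $u_{Go}(t)$ is any odd generalized function, we can use
$\phi(t) = \phi_e(t) + \phi_o(t)$ and definition (2.52e) to get

\[
\int_{-\infty}^{\infty} u_{Go}(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_{Go}(t)\left[\phi_e(t) + \phi_o(t)\right]dt = \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_o(t)\,dt
\]

and definition (2.52b) to get

\[
\begin{aligned}
\int_{-\infty}^{\infty} u_{Go}(-t)\,\phi(t)\,dt
&= \int_{-\infty}^{\infty} u_{Go}(t)\,\phi(-t)\,dt
 = \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_e(-t)\,dt + \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_o(-t)\,dt \\
&= \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_e(t)\,dt + \int_{-\infty}^{\infty} \left[-u_{Go}(t)\right]\phi_o(t)\,dt \\
&= -\int_{-\infty}^{\infty} u_{Go}(t)\,\phi_o(t)\,dt
\end{aligned}
\]

or

\[
\int_{-\infty}^{\infty} \left[-u_{Go}(-t)\right]\phi(t)\,dt = \int_{-\infty}^{\infty} u_{Go}(t)\,\phi_o(t)\,dt .
\]

Clearly, $\int_{-\infty}^{\infty} u_{Go}(t)\,\phi(t)\,dt$ and $\int_{-\infty}^{\infty} \left[-u_{Go}(-t)\right]\phi(t)\,dt$ are equal to each other because they are both
equal to $\int_{-\infty}^{\infty} u_{Go}(t)\,\phi_o(t)\,dt$ for any test function $\phi$, so by definition (2.47a) we conclude that

\[
u_{Go}(t) = -u_{Go}(-t)
\]

or
\[
u_{Go}(-t) = -u_{Go}(t) . \tag{2.52g}
\]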

We define the derivative of a generalized function $u_G(t)$ to be another generalized function

\[
u_G'(t) = u_G^{(1)}(t) .
\]

The generalized function $u_G(t)$ is defined in terms of the already-known functional $\int\! u_G$, but
what functional $\int\! u_G'$ defines the generalized function $u_G'(t)$? We specify this new functional $\int\! u_G'$
with the definition
\[
\int\! u_G'\,(\phi) = \int\! u_G\,(-\phi')
\]
or
\[
\int\! u_G'\,(\phi) = \int_{-\infty}^{\infty} u_G(t)\left[-\phi'(t)\right]dt = -\int_{-\infty}^{\infty} u_G(t)\left(\frac{d\phi}{dt}\right)dt \tag{2.53a}
\]

for any test function $\phi$. Therefore, the new generalized function $u_G'(t)$ satisfies the equation

\[
\int_{-\infty}^{\infty} u_G'(t)\,\phi(t)\,dt = -\int_{-\infty}^{\infty} u_G(t)\left(\frac{d\phi}{dt}\right)dt \tag{2.53b}
\]

for any test function $\phi$. We note that this definition is consistent with a formal integration by
parts, treating $u_G'(t)$ like a true function $u'(t)$ to get

\[
\int_{-\infty}^{\infty} u_G'(t)\,\phi(t)\,dt = \Big[u_G(t)\,\phi(t)\Big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} u_G(t)\left(\frac{d\phi}{dt}\right)dt = -\int_{-\infty}^{\infty} u_G(t)\left(\frac{d\phi}{dt}\right)dt ,
\]

with the term in square brackets [ ] zero for all test functions $\phi$. We can make this first term zero
either by requiring $\phi$ to approach zero as $t \to \pm\infty$ or by having $u_G(t)$ equal a true function in the
sense of (2.48g) with the true function becoming zero as $t \to \pm\infty$. The integral involving
$\phi'(t) = d\phi/dt$ must also, of course, have a well-defined meaning for all the test functions $\phi$.
The convolution of two generalized functions $u_G(t)$ and $v_G(t)$ is defined to be another
generalized function
\[
w_G(t) = u_G(t) \ast v_G(t) . \tag{2.54a}
\]

From Eqs. (2.47a) and (2.47b), we know that (2.54a) must mean that

\[
\int_{-\infty}^{\infty} w_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \left[u_G(t) \ast v_G(t)\right]\phi(t)\,dt \tag{2.54b}
\]

for all test functions $\phi$. We now give meaning to both sides of (2.54b) by defining that, for all
test functions $\phi$,

\[
\int_{-\infty}^{\infty} w_G(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \left[u_G(t) \ast v_G(t)\right]\phi(t)\,dt = \int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt''\,v_G(t'')\,\phi(t' + t'') . \tag{2.54c}
\]

Note that the right-hand side of (2.54c) is as well defined as our previous definitions, since

\[
\phi_v(t') = \int_{-\infty}^{\infty} v_G(t'')\,\phi(t' + t'')\,dt''
\]

is just another complex number depending on the real parameter $t'$, which can be treated as
another true test function $\phi_v(t')$ inside the double integral of (2.54c),

\[
\int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt''\,v_G(t'')\,\phi(t' + t'') = \int_{-\infty}^{\infty} u_G(t')\,\phi_v(t')\,dt' .
\]

As long as $\phi(t' + t'')$ and $\phi_v(t')$ are both test functions whenever $\phi$ is a test function,
definition (2.54c) should present no difficulties. To justify this definition, we note that formally
treating $u_G(t)$ and $v_G(t)$ as true functions gives

\[
\begin{aligned}
\int_{-\infty}^{\infty} \left[u_G(t'') \ast v_G(t'')\right]\phi(t'')\,dt''
&= \int_{-\infty}^{\infty} dt''\,\phi(t'') \int_{-\infty}^{\infty} dt'\,u_G(t')\,v_G(t'' - t') \\
&= \int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt''\,\phi(t'')\,v_G(t'' - t') ,
\end{aligned}
\]

where the last step interchanges the order of integration. We now use (2.52a) to write

\[
\int_{-\infty}^{\infty} \phi(t'')\,v_G(t'' - t')\,dt'' = \int_{-\infty}^{\infty} v_G(t'')\,\phi(t'' + t')\,dt'' ,
\]

which leads to

\[
\int_{-\infty}^{\infty} \left[u_G(t'') \ast v_G(t'')\right]\phi(t'')\,dt'' = \int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt''\,v_G(t'')\,\phi(t'' + t') ,
\]

justifying the definition given in (2.54c). Note that the order of integration inside the double
integral of (2.54c) can be freely interchanged,

\[
\int_{-\infty}^{\infty} dt'\,u_G(t') \int_{-\infty}^{\infty} dt''\,v_G(t'')\,\phi(t' + t'') = \int_{-\infty}^{\infty} dt''\,v_G(t'') \int_{-\infty}^{\infty} dt'\,u_G(t')\,\phi(t' + t'') ,
\]

showing that $u_G(t) \ast v_G(t) = v_G(t) \ast u_G(t)$ for generalized functions as well as true functions.


Because the convolution itself is defined as an integral, there is no problem giving a meaning to
the convolution of a true function with a generalized function as long as the true function is an
acceptable test function. For a generalized function $u_G(t)$ and test function $\phi(t)$, we have

\[
u_G(t) \ast \phi(t) = \int_{-\infty}^{\infty} u_G(t')\,\phi(t - t')\,dt' = \int_{-\infty}^{\infty} u_G(t')\,\phi\big(-(t' - t)\big)\,dt' = \int_{-\infty}^{\infty} u_G(t - t')\,\phi(t')\,dt' , \tag{2.55a}
\]

where definition (2.52c) with $a = -1$ and $c = t$ is used in the last step of (2.55a). It clearly makes
sense to say that

\[
\int_{-\infty}^{\infty} u_G(t - t')\,\phi(t')\,dt' = \phi(t) \ast u_G(t) ,
\]

which means that

\[
u_G(t) \ast \phi(t) = \phi(t) \ast u_G(t) \tag{2.55b}
\]

for the convolution of a generalized function with any test function $\phi$.
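
The double-integral form of the convolution definition can be checked numerically when $u_G$ and $v_G$ are replaced by true functions. The sketch below is an added illustration under stated assumptions: `u`, `v`, and `phi` are all taken to be the Gaussian $e^{-t^2}$ (a choice made only for this example), the closed form of the convolution of two such Gaussians is used on the left-hand side, and the integrals are approximated with a midpoint rule.

```python
import math

def integrate(g, lo, hi, steps=400):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

def gauss(t):
    return math.exp(-t * t)

# Hypothetical stand-ins: all three functions are the same Gaussian
u = v = phi = gauss

def conv_uv(t):
    # Closed form of (u * v)(t) for u = v = exp(-t^2)
    return math.sqrt(math.pi / 2.0) * math.exp(-t * t / 2.0)

# Left side: integral of [u * v](t) phi(t) dt
lhs = integrate(lambda t: conv_uv(t) * phi(t), -8.0, 8.0)

def inner(t1):
    # Inner integral over t'' of v(t'') phi(t' + t'') for fixed t' = t1
    return integrate(lambda t2: v(t2) * phi(t1 + t2), -8.0, 8.0)

# Right side: double integral in the form used to define the convolution
rhs = integrate(lambda t1: u(t1) * inner(t1), -8.0, 8.0)

expected = math.pi / math.sqrt(3.0)  # exact value for these Gaussians
print(lhs, rhs, expected)
```

Because the inner integral depends only on $t'$, it plays the role of a new test function evaluated at $t'$, which is the observation that makes the definition well posed.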

2.12 Generalized Limits


Given a sequence of true functions $u_1(t), u_2(t), \ldots, u_n(t), \ldots$, we can form a corresponding
sequence of integrals with the test functions $\phi$,

\[
\int_{-\infty}^{\infty} u_1(t)\,\phi(t)\,dt ,\ \int_{-\infty}^{\infty} u_2(t)\,\phi(t)\,dt ,\ \ldots ,\ \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt ,\ \ldots .
\]

We define Glim, the generalized limit of the sequence of true functions $u_n(t)$, by taking the
standard limit of the sequence of integrals,

\[
\lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt ,
\]

and requiring that the generalized limit of the sequence of true functions $u_n(t)$, written as

\[
\operatorname*{G\,lim}_{n\to\infty} u_n(t) ,
\]

satisfy the equation

\[
\lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \Big[\operatorname*{G\,lim}_{n\to\infty} u_n(t)\Big]\phi(t)\,dt \tag{2.56a}
\]

for any test function $\phi$. In effect, the generalized limit Glim is what we get when we insist on
moving the standard limit inside the integral. Almost always, of course, it turns out that the
generalized limit is the same as the standard limit,

\[
\operatorname*{G\,lim}_{n\to\infty} u_n(t) = \lim_{n\to\infty} u_n(t) ,
\]

so that
\[
\lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \Big[\lim_{n\to\infty} u_n(t)\Big]\phi(t)\,dt , \tag{2.56b}
\]

but this is not always the case. If we define the $\Pi$ function (see Fig. 2.6) by

\[
\Pi(t, T) =
\begin{cases}
1 & \text{for } |t| < T \\
1/2 & \text{for } |t| = T \\
0 & \text{for } |t| > T
\end{cases} , \tag{2.56c}
\]

we can construct a sequence of true functions by

\[
u_n(t) = \frac{1}{n}\,\Pi\!\left(\frac{t}{n}, 1\right) . \tag{2.56d}
\]

Function $\Pi(t/n, 1)$ is 1 only when $-n < t < n$, so when

\[
\phi(t) = 1
\]

is an acceptable test function, it is always true that

\[
\int_{-\infty}^{\infty} u_n(t)\,dt = \frac{1}{n} \int_{-\infty}^{\infty} \Pi\!\left(\frac{t}{n}, 1\right) dt = 2 ,
\]
which makes
\[
\lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,dt = 2 . \tag{2.56e}
\]

On the other hand,

\[
\lim_{n\to\infty} u_n(t) = \lim_{n\to\infty} \left[\frac{1}{n}\,\Pi\!\left(\frac{t}{n}, 1\right)\right] = 0 ,
\]
which gives
\[
\int_{-\infty}^{\infty} \Big[\lim_{n\to\infty} u_n(t)\Big]dt = 0 . \tag{2.56f}
\]
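
The gap between the limit of the integrals and the integral of the limit can be seen directly in a few lines of code. The sketch below is an added illustration: it assumes the rectangle function is 1 for $|t| < T$, 1/2 at $|t| = T$, and 0 otherwise, and the names `Pi`, `u_n`, and `integrate` are introduced only for this example.

```python
def Pi(t, T):
    """Rectangle function: 1 for |t| < T, 1/2 for |t| = T, 0 for |t| > T."""
    a = abs(t)
    if a < T:
        return 1.0
    if a == T:
        return 0.5
    return 0.0

def u_n(t, n):
    # Sequence member u_n(t) = (1/n) * Pi(t/n, 1)
    return Pi(t / n, 1.0) / n

def integrate(g, lo, hi, steps=40000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

ns = (1, 10, 100)
# Integral of u_n against the constant test function phi(t) = 1
areas = [integrate(lambda t, n=n: u_n(t, n), -2.0 * n, 2.0 * n) for n in ns]
# Value of u_n at one fixed point, which shrinks like 1/n
pointwise = [u_n(0.3, n) for n in ns]

print(areas)
print(pointwise)
```

The area stays fixed while the value at any fixed $t$ shrinks, so the standard limit is the zero function even though the sequence of integrals never approaches zero.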

FIGURE 2.6. The function $\Pi(t, T)$ plotted against $t$, with the points $t = -T$ and $t = T$ marked.


The disagreement of (2.56e) and (2.56f) shows that there can be a very important difference
between the generalized limit and the standard limit, because Eq. (2.56b) does not always hold
true. We cannot avoid this problem by ruling out constant test functions such as $\phi(t) = 1$.
Consider, for example,
\[
\phi(t) = \frac{1}{1 + t^2}
\]

and construct a sequence of true functions

\[
u_n(t) = t \sin(t/n) .
\]

We find that²¹
\[
\int_{-\infty}^{\infty} \frac{t \sin(t/n)}{1 + t^2}\,dt = \pi e^{-1/n} , \tag{2.57a}
\]

which gives
\[
\lim_{n\to\infty} \int_{-\infty}^{\infty} \frac{t \sin(t/n)}{1 + t^2}\,dt = \pi . \tag{2.57b}
\]

This is not the same as

\[
\int_{-\infty}^{\infty} \frac{\big\{\lim_{n\to\infty} \left[t \sin(t/n)\right]\big\}}{1 + t^2}\,dt = 0 . \tag{2.57c}
\]

Once again, we have found a sequence of true functions $u_n(t)$ that does not satisfy (2.56b). This
second example can, in fact, be seen to fail (2.56b) for much the same reason as the first. Since an
even function is being integrated, we can write that [see Eq. (2.19)]

\[
\lim_{n\to\infty} \int_{-\infty}^{\infty} \frac{t \sin(t/n)}{1 + t^2}\,dt = 2 \lim_{n\to\infty} \int_{0}^{\infty} \frac{t \sin(t/n)}{1 + t^2}\,dt . \tag{2.57d}
\]

Consider what happens to the first, positive hump of the sine as n increases in the integral on the
right-hand side of Eq. (2.57d). The values of t for which $\sin(t/n)$ is significantly different from
zero, say from $n \cdot (\pi/4)$ to $n \cdot (3\pi/4)$, comprise an interval $\Delta t = n \cdot (\pi/2)$ with a width that
increases linearly with n, just like the interval 2n in (2.56d) over which $\Pi(t/n, 1)$ equals one. The
center of this hump is at $t = n \cdot (\pi/2)$, so as n increases, the hump’s center appears at ever larger
values of t. Hence, we can make the approximation that for large n

\[
\frac{t}{1 + t^2} \cong t^{-1} \approx \frac{2}{n\pi} .
\]

This means the characteristic size of

\[
\frac{t \sin(t/n)}{1 + t^2}
\]

at the hump decreases as $1/n$, while the hump’s width, $\Delta t = n \cdot (\pi/2)$, increases as n. The product
of the size and width therefore tends to a constant as n gets large, preventing the integral from
shrinking as $n \to \infty$. This is the same phenomenon that caused our first example $n^{-1}\,\Pi(t/n, 1)$ to
fail Eq. (2.56b). Up to this point, we have, of course, only discussed the contribution of the first

21 I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, edited by Alan Jeffrey, 5th ed.
(Academic Press, New York, 1994), p. 445, formula 4 in Sec. 3.723 with $a = 1/n$ and $\beta = 1$.

2.13 Fourier Transforms of Generalized Functions


For every generalized function $u_G(t)$, there is at least one sequence of true functions
$u_1(t), u_2(t), \ldots, u_n(t), \ldots$ such that
\[
\operatorname*{G\,lim}_{n\to\infty} u_n(t) = u_G(t) . \tag{2.58a}
\]

This formula should be interpreted in the sense of (2.47b) and (2.56a); that is, it means

\[
\int_{-\infty}^{\infty} \Big[\operatorname*{G\,lim}_{n\to\infty} u_n(t)\Big]\phi(t)\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} u_n(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} u_G(t)\,\phi(t)\,dt \tag{2.58b}
\]

for all test functions $\phi$. We use the sequence of true functions whose generalized limit is the
generalized function to define the Fourier transform of the generalized function. If a sequence of
true functions $w_1(t), w_2(t), \ldots, w_n(t), \ldots$ can be forward Fourier transformed to give another
sequence of true functions $W_1(f), W_2(f), \ldots, W_n(f), \ldots$ such that

\[
W_n(f) = \int_{-\infty}^{\infty} w_n(t)\,e^{-2\pi i f t}\,dt \tag{2.59a}
\]
and
\[
w_n(t) = \int_{-\infty}^{\infty} W_n(f)\,e^{2\pi i f t}\,df \tag{2.59b}
\]

for all values of n, we then define the forward Fourier transform of the generalized function

\[
w_G(t) = \operatorname*{G\,lim}_{n\to\infty} w_n(t) \tag{2.59c}
\]

to be
\[
F^{(-ift)}\big(w_G(t)\big) = \operatorname*{G\,lim}_{n\to\infty} W_n(f) . \tag{2.59d}
\]

We expect the sequence of true functions $W_1(f), W_2(f), \ldots, W_n(f), \ldots$ also to give a generalized
function when we take the generalized limit of the sequence,

\[
W_G(f) = \operatorname*{G\,lim}_{n\to\infty} W_n(f) , \tag{2.59e}
\]

and we define the inverse Fourier transform of this generalized function to be $w_G(t)$,

\[
F^{(itf)}\big(W_G(f)\big) = \operatorname*{G\,lim}_{n\to\infty} w_n(t) = w_G(t) . \tag{2.59f}
\]

The double-arrow notation $\leftrightarrow$ introduced in the discussion after Eq. (2.35d) can be used to
restate this definition more concisely. We define that whenever

w1 (t ), w2 (t ),… , wG (t )

is true, and that whenever


W1 ( f ), W2 ( f ),… , WG ( f )

is true, and that whenever

\[
w_1(t) \leftrightarrow W_1(f),\ w_2(t) \leftrightarrow W_2(f),\ \ldots,\ w_n(t) \leftrightarrow W_n(f),\ \ldots
\]

is true for all n, it must also be true that

\[
w_G(t) \leftrightarrow W_G(f) \tag{2.59g}
\]

for the generalized functions given by the generalized limits of sequences

\[
w_1(t), w_2(t), \ldots \quad\text{and}\quad W_1(f), W_2(f), \ldots .
\]

Now at last we can attach a meaning to the Fourier transform pair that could not be completed
in Eqs. (2.43d)–(2.43f). The explicit development that follows is perhaps somewhat long, but
worth doing to show how to construct the Fourier transforms of some of the functions violating
one or more of requirements (V) through (VIII) in Sec. 2.4. We create the sequence

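\[
\operatorname{sgn}(f)\,\Pi(f, 1),\ \operatorname{sgn}(f)\,\Pi(f, 2),\ \ldots,\ \operatorname{sgn}(f)\,\Pi(f, n),\ \ldots
\]

and define the generalized sgn function by

\[
\text{“}\operatorname{sgn}(f)\text{”} = \operatorname*{G\,lim}_{n\to\infty} \left[\operatorname{sgn}(f)\,\Pi(f, n)\right] , \tag{2.60a}
\]

where quotes “ ” are used to indicate that the “sgn(f)” is a generalized function instead of the
true function sgn(f) defined in Eq. (2.42c) above. The reason for this choice of sequence is
straightforward: function $[\operatorname{sgn}(f)\,\Pi(f, n)]$ satisfies requirements (V) through (VIII) in Sec. 2.4
for every finite value of n and so has a well-defined Fourier transform; as n increases, function
$[\operatorname{sgn}(f)\,\Pi(f, n)]$ resembles ever more closely the sgn(f) function to which we want to give a
Fourier transform. We note that for any test function $\phi$

\[
\begin{aligned}
\int_{-\infty}^{\infty} \phi(f)\,\text{“}\operatorname{sgn}(f)\text{”}\,df
&= \int_{-\infty}^{\infty} \phi(f)\,\operatorname*{G\,lim}_{n\to\infty} \left[\operatorname{sgn}(f)\,\Pi(f, n)\right] df \\
&= \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(f)\,\operatorname{sgn}(f)\,\Pi(f, n)\,df \\
&= \lim_{n\to\infty} \int_{-n}^{n} \phi(f)\,\operatorname{sgn}(f)\,df \\
&= \int_{-\infty}^{\infty} \phi(f)\,\operatorname{sgn}(f)\,df ,
\end{aligned}
\]
so
\[
\text{“}\operatorname{sgn}(f)\text{”} = \operatorname{sgn}(f) \tag{2.60b}
\]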

in the sense of Eq. (2.48c). This equivalence can be used to justify dropping the distinction
between “ sgn( f ) ” and sgn( f ) . Applied mathematicians who work with generalized functions
often drop the distinction between a generalized function and the true function to which it is
equivalent, and the double-quote notation introduced here is not standard usage. There is,
however, no harm in keeping track of the distinction between the two types of functions, and the
double quotes acknowledge the close relationship of the two functions while reminding us that
they are not the same.
The inverse Fourier transform of $[-i\pi \operatorname{sgn}(f)\,\Pi(f, n)]$ is, using the identity
$e^{i\theta} = \cos\theta + i \sin\theta$,

\[
F^{(itf)}\big(-i\pi \operatorname{sgn}(f)\,\Pi(f, n)\big) = -i\pi \int_{-\infty}^{\infty} e^{2\pi i f t} \operatorname{sgn}(f)\,\Pi(f, n)\,df = 2\pi \int_{0}^{n} \sin(2\pi f t)\,df .
\]

In the last step, we use that the integral of

\[
[\cos(2\pi f t) \operatorname{sgn}(f)\,\Pi(f, n)] ,
\]

which is an odd function in f, is zero according to Eq. (2.17); and the integral
between $-n$ and $n$ of $[\sin(2\pi f t) \operatorname{sgn}(f)]$, which is an even function in f, is twice the value of its
integral from zero to n according to Eq. (2.19). Making the substitution $f' = 2\pi t f$ gives

\[
F^{(itf)}\big(-i\pi \operatorname{sgn}(f)\,\Pi(f, n)\big) = -\frac{1}{t} \cos f' \,\Big|_{0}^{2\pi n t} .
\]

This shows that the inverse Fourier transform of $[-i\pi \operatorname{sgn}(f)\,\Pi(f, n)]$ is

\[
F^{(itf)}\big(-i\pi \operatorname{sgn}(f)\,\Pi(f, n)\big) = t^{-1}\left[1 - \cos(2\pi n t)\right] .
\]
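
The closed form just obtained can be compared against a direct numerical evaluation of the remaining integral over f. The sketch below is an added illustration, with function names and sample points chosen only for this example. It evaluates $2\pi \int_0^n \sin(2\pi f t)\,df$ with a midpoint rule and compares it to $t^{-1}[1 - \cos(2\pi n t)]$ at a few values of t and n.

```python
import math

def inverse_transform(t, n, steps=20000):
    """Midpoint-rule value of 2*pi * integral from 0 to n of sin(2*pi*f*t) df."""
    h = n / steps
    total = sum(math.sin(2.0 * math.pi * (k + 0.5) * h * t) for k in range(steps))
    return 2.0 * math.pi * h * total

def closed_form(t, n):
    # Closed-form result (1/t) * [1 - cos(2*pi*n*t)], valid for t != 0
    return (1.0 - math.cos(2.0 * math.pi * n * t)) / t

# Arbitrary sample points (t, n), including a negative t
samples = [(0.37, 3), (-1.25, 3), (0.05, 8)]
errors = [abs(inverse_transform(t, n) - closed_form(t, n)) for t, n in samples]

print(errors)
```

The printed differences should be small compared with the values themselves, which are of order $1/t$.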

Now we calculate the forward Fourier transform of $(1/t)[1 - \cos(2\pi n t)]$. We get

\[
\begin{aligned}
F^{(-ift)}\big(t^{-1}[1 - \cos(2\pi n t)]\big)
&= \int_{-\infty}^{\infty} e^{-2\pi i f t}\,t^{-1}\left[1 - \cos(2\pi n t)\right] dt \\
&= \int_{-\infty}^{\infty} e^{-2\pi i f t}\,\frac{dt}{t} - \int_{-\infty}^{\infty} e^{-2\pi i f t}\,\frac{1}{t} \cos(2\pi n t)\,dt \\
&= -i\pi \operatorname{sgn}(f) + i \int_{-\infty}^{\infty} \frac{1}{t} \cos(2\pi n t) \sin(2\pi f t)\,dt .
\end{aligned}
\]

In the last step, Eq. (2.43d) is used to evaluate the integral of $[e^{-2\pi i f t}\,t^{-1}]$; we also substitute
$e^{i\theta} = \cos\theta + i \sin\theta$ into the integral of $[e^{-2\pi i f t}\,t^{-1} \cos(2\pi n t)]$, discovering that the Cauchy principal
value of the integral of $[t^{-1} \cos(2\pi f t) \cos(2\pi n t)]$, which is an odd function in t, is zero [see Eq.
(2.17)]. The remaining integral over the even function

\[
[t^{-1} \sin(2\pi f t) \cos(2\pi n t)]
\]

can be simplified by applying Eq. (2.19) and then consulting a table of definite integrals,²²

\[
\begin{aligned}
\int_{-\infty}^{\infty} \frac{1}{t} \cos(2\pi n t) \sin(2\pi f t)\,dt
&= 2 \operatorname{sgn}(f) \int_{0}^{\infty} \frac{1}{t} \cos(2\pi n t) \sin(2\pi |f| t)\,dt \\
&= \pi \operatorname{sgn}(f)\,\Pi(2\pi n, 2\pi |f|) = \pi \operatorname{sgn}(f)\,\Pi(n, |f|) .
\end{aligned}
\]
We conclude that the forward Fourier transform of $(1/t)[1 - \cos(2\pi n t)]$ is

\[
\begin{aligned}
F^{(-ift)}\big(t^{-1}[1 - \cos(2\pi n t)]\big)
&= \operatorname{sgn}(f)\left[-i\pi + i\pi\,\Pi(n, |f|)\right] = -i\pi \operatorname{sgn}(f)\left[1 - \Pi(n, |f|)\right] \\
&= -i\pi \operatorname{sgn}(f)\,\Pi(f, n) .
\end{aligned}
\]

Hence, $(1/t)[1 - \cos(2\pi n t)]$ and $[-i\pi \operatorname{sgn}(f)\,\Pi(f, n)]$ are a Fourier-transform pair,

\[
\frac{1}{t}\left[1 - \cos(2\pi n t)\right] \leftrightarrow -i\pi \operatorname{sgn}(f)\,\Pi(f, n) .
\]

This confirms that there are two sequences

\[
\frac{1}{t}\left[1 - \cos(2\pi t)\right],\ \frac{1}{t}\left[1 - \cos(4\pi t)\right],\ \ldots,\ \frac{1}{t}\left[1 - \cos(2\pi n t)\right],\ \ldots \tag{2.60c}
\]
and

\[
-i\pi \operatorname{sgn}(f)\,\Pi(f, 1),\ -i\pi \operatorname{sgn}(f)\,\Pi(f, 2),\ \ldots,\ -i\pi \operatorname{sgn}(f)\,\Pi(f, n),\ \ldots
\]

such that each member of the lower sequence is the forward Fourier transform of the
corresponding member of the upper sequence and each member of the upper sequence is the
inverse Fourier transform of the corresponding member of the lower sequence.

22 I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, p. 453, formula 2 in Sec. 3.741 with
$a = 2\pi |f|$ and $b = 2\pi n$.

We know from
(2.60a) and (2.60b) that the generalized function given by the generalized limit of the lower
sequence is

\[
\operatorname*{G\,lim}_{n\to\infty} \left[-i\pi \operatorname{sgn}(f)\,\Pi(f, n)\right] = -i\pi \operatorname*{G\,lim}_{n\to\infty} \left[\operatorname{sgn}(f)\,\Pi(f, n)\right] = -i\pi\,\text{“}\operatorname{sgn}(f)\text{”} = -i\pi \operatorname{sgn}(f) , \tag{2.60d}
\]

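but what is the generalized function given by the generalized limit of the upper sequence? We
have for any test function $\phi$

\[
\begin{aligned}
\int_{-\infty}^{\infty} \phi(t)\,\operatorname*{G\,lim}_{n\to\infty} \big\{t^{-1}[1 - \cos(2\pi n t)]\big\}\,dt
&= \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t}\left[1 - \cos(2\pi n t)\right] dt \\
&= \lim_{n\to\infty} \left\{ \int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t} - \int_{-\infty}^{\infty} \phi(t) \cos(2\pi n t)\,\frac{dt}{t} \right\} \\
&= \int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t} - \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t} \cos(2\pi n t)\,dt .
\end{aligned}
\tag{2.60e}
\]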

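Working with the limit of the integral containing $\cos(2\pi n t)$, we write

\[
\begin{aligned}
\lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t} \cos(2\pi n t)\,dt = {}
& \lim_{n\to\infty} \int_{-\infty}^{-\varepsilon} \phi(t)\,\frac{1}{t} \cos(2\pi n t)\,dt \\
& + \lim_{n\to\infty} \int_{-\varepsilon}^{\varepsilon} \phi(t)\,\frac{1}{t} \cos(2\pi n t)\,dt \\
& + \lim_{n\to\infty} \int_{\varepsilon}^{\infty} \phi(t)\,\frac{1}{t} \cos(2\pi n t)\,dt ,
\end{aligned}
\tag{2.60f}
\]

where $\varepsilon$ is a small positive number. By making all the test functions $\phi(t)$ have finite variation as
in requirement (VIII) in Sec. 2.4, we recognize that the first and third integrals on the right-hand side
of (2.60f) become zero as $n \to \infty$, because eventually the cosine oscillates both positive and
negative over each infinitesimal interval while $\phi(t)/t$ barely changes at all; the integrals can be
made as small as desired by picking a large enough value of n. For future use, we note that for
any continuous, finite-variation test function $\phi$,

\[
\lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t) \sin(nt)\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t) \cos(nt)\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,e^{\pm i n t}\,dt = 0 ,
\]

so that
\[
\operatorname*{G\,lim}_{n\to\infty} \sin(nt) = \operatorname*{G\,lim}_{n\to\infty} \cos(nt) = \operatorname*{G\,lim}_{n\to\infty} e^{\pm i n t} = 0 . \tag{2.60g}
\]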
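
The vanishing of these oscillatory integrals is easy to observe numerically for a specific test function. The sketch below is an added illustration: it assumes the Gaussian test function $e^{-t^2}$ (chosen only for this example), for which the cosine integral has the standard closed form $\sqrt{\pi}\,e^{-n^2/4}$, and it uses a midpoint rule for the numerical values.

```python
import math

def integrate(g, lo, hi, steps=40000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

def phi(t):
    # Hypothetical continuous, finite-variation test function
    return math.exp(-t * t)

ns = (1, 4, 16)
# Numerical value of the integral of phi(t) * cos(n t) for increasing n
numeric = [integrate(lambda t, n=n: phi(t) * math.cos(n * t), -10.0, 10.0) for n in ns]
# Closed form for this Gaussian: sqrt(pi) * exp(-n^2 / 4)
exact = [math.sqrt(math.pi) * math.exp(-n * n / 4.0) for n in ns]

for n, a, b in zip(ns, numeric, exact):
    print(n, a, b)
```

For this particular test function the integral falls off very quickly with n; a less smooth test function would be expected to give a slower decay, but still a limit of zero.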

The middle integral in Eq. (2.60f) can be written as

\[
\int_{-\varepsilon}^{\varepsilon} \phi(t) \cos(2\pi n t)\,\frac{dt}{t} \cong \phi(0) \int_{-\infty}^{\infty} \Pi(t, \varepsilon)\,\frac{1}{t} \cos(2\pi n t)\,dt ,
\]

where we have chosen $\varepsilon$ small enough that $\phi(t)$ barely changes over the interval of integration, letting us
replace it by $\phi(0)$. Now the middle integral on the right-hand side of (2.60f) can be recognized as
the Cauchy principal value of the integral of $(1/t)\,\Pi(t, \varepsilon) \cos(2\pi n t)$, which is an odd function of t
and must be zero according to Eq. (2.17). Hence, (2.60f) becomes

\[
\lim_{n\to\infty} \int_{-\infty}^{\infty} \phi(t)\,\frac{1}{t} \cos(2\pi n t)\,dt = 0 ,
\]

which shows that (2.60e) simplifies to

\[
\int_{-\infty}^{\infty} \phi(t)\,\operatorname*{G\,lim}_{n\to\infty} \big\{t^{-1}[1 - \cos(2\pi n t)]\big\}\,dt = \int_{-\infty}^{\infty} \phi(t)\,\frac{dt}{t} \tag{2.60h}
\]

for any test function $\phi$. Since (2.60h) denotes equality in the sense of Eq. (2.48c), we can define
the generalized function “$t^{-1}$” to be

\[
\text{“}t^{-1}\text{”} = \operatorname*{G\,lim}_{n\to\infty} \big\{t^{-1}[1 - \cos(2\pi n t)]\big\} \tag{2.60i}
\]

and then note that Eq. (2.60h) now states that

\[
\text{“}t^{-1}\text{”} = t^{-1} . \tag{2.60j}
\]
Equations (2.60d) and (2.60j) show that $[-i\pi\,\text{“}\operatorname{sgn}(f)\text{”}]$ and “$t^{-1}$” are the generalized limits of the
two sequences in (2.60c). Because all the sequence members are Fourier transform pairs, we
know, according to (2.59g), that $[-i\pi\,\text{“}\operatorname{sgn}(f)\text{”}]$ and “$t^{-1}$” are a Fourier transform pair even
though $[-i\pi \operatorname{sgn}(f)]$ and $t^{-1}$ do not satisfy requirements (V) through (VIII) in Sec. 2.4 and, as
shown in Eqs. (2.43a) and (2.43f), their transforms cannot be evaluated as standard integrals. In
this sense, we can write that

\[
F^{(-ift)}\big(t^{-1}\big) = -i\pi \operatorname{sgn}(f) \tag{2.60k}
\]
and
\[
F^{(itf)}\big(-i\pi \operatorname{sgn}(f)\big) = t^{-1} . \tag{2.60$\ell$}
\]

This can also be written as, reversing the sign of f in (2.60k), the sign of t in (2.60$\ell$), and using
Eq. (2.42c) to get that $\operatorname{sgn}(-f) = -\operatorname{sgn}(f)$,

\[
F^{(ift)}\big(t^{-1}\big) = i\pi \operatorname{sgn}(f) \tag{2.60m}
\]
and
\[
F^{(-itf)}\big(i\pi \operatorname{sgn}(f)\big) = t^{-1} . \tag{2.60n}
\]

It is important to remember that Eqs. (2.60k) and (2.60m) are true only when integrals between
$-\infty$ and $+\infty$ are interpreted as Cauchy principal values and (2.60$\ell$) and (2.60n) are true only
when equality is defined as in Eq. (2.48c) using generalized function theory. Strictly speaking, it
might be better to say that the Cauchy principal value of

\[
\int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,\frac{dt}{t} \quad\text{is}\quad \pm i\pi \operatorname{sgn}(f)
\]
and that
\[
\int_{-\infty}^{\infty} e^{\pm 2\pi i f t} \left[-i\pi\,\text{“}\operatorname{sgn}(f)\text{”}\right] df = \pm\,\text{“}t^{-1}\text{”} .
\]

This is the reason that

\[
\int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,\frac{dt}{t} = \pm i\pi \operatorname{sgn}(f) \tag{2.61a}
\]

is usually not listed in standard tables of improper integrals without notation showing that it is a
Cauchy principal value, and the equality

\[
\int_{-\infty}^{\infty} e^{\pm 2\pi i f t} \operatorname{sgn}(f)\,df = \pm \frac{i}{\pi t} \tag{2.61b}
\]


is usually not listed in these tables under any circumstances. It is also true, however, that (2.61a)
and (2.61b) are constantly used either explicitly or implicitly in Fourier-transform theory; and
lists of Fourier-transform pairs often contain (2.61a) and (2.61b). Unfortunately, it is standard
practice in the Fourier-transform tables that do list these integrals to omit any explanation that
they are only true when interpreted as the Fourier transforms of generalized functions. In general,
when using tables of Fourier transforms, all those transforms that do not exist as standard
integrals or Cauchy principal values should be interpreted as the transforms of generalized
functions and used only in the context of generalized function theory.

2.14 The Delta Function


The most popular and useful generalized function is the Dirac delta function, a name usually
shortened to just the delta function. In a sense, Secs. 2.11–2.13 describing generalized
function theory are there just so we can give a mathematically exact description of the delta
function. The delta function is often inexactly described in elementary textbooks as that function
$\delta(t)$ such that
\[
\delta(t) =
\begin{cases}
\infty & \text{for } t = 0 \\
0 & \text{for } t \neq 0
\end{cases} \tag{2.62a}
\]

with
\[
\int_{a}^{b} \delta(t)\,f(t)\,dt =
\begin{cases}
f(0) & \text{for } a < 0 < b \\
0 & \text{for } a < b < 0 \text{ or } 0 < a < b
\end{cases} . \tag{2.62b}
\]

More sophisticated textbooks may define it as a standard limit, for example,

\[
\delta(t) = \lim_{n\to\infty} \left[\frac{n}{2}\,\Pi(t, n^{-1})\right] \tag{2.63a}
\]

or
\[
\delta(t) = \lim_{n\to\infty} \left(\sqrt{\frac{n}{\pi}}\,e^{-n t^2}\right) . \tag{2.63b}
\]

There are, in fact, two different, but equivalent, mathematically exact ways to define the delta
function. The first way is to create a well-defined functional $\int\!\delta$ that, when operating on a
complex-valued test function $\phi(t)$ with a real argument t, produces as its complex number $\phi(0)$,
the value of $\phi$ at t equal to zero,
\[
\int\!\delta\,(\phi) = \phi(0) . \tag{2.64a}
\]

This makes $\delta(t)$ the generalized function associated with functional $\int\!\delta$, with $\delta(t)$ having the
property that
\[
\int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt = \phi(0) \tag{2.64b}
\]

for all test functions $\phi$. The second way to define $\delta(t)$ is to say it is the generalized limit of a
sequence such as the ones specified in (2.63a) and (2.63b),

\[
\delta(t) = \operatorname*{G\,lim}_{n\to\infty} \left[\frac{n}{2}\,\Pi(t, n^{-1})\right] \tag{2.65a}
\]

or
\[
\delta(t) = \operatorname*{G\,lim}_{n\to\infty} \left(\sqrt{\frac{n}{\pi}}\,e^{-n t^2}\right) . \tag{2.65b}
\]

Although the delta function is a generalized function in every sense of the term, we follow
standard notation and do not add the G subscript, or the quotes “ ”, used to label other
generalized functions in this chapter.
Defining $\delta(t)$ with a functional, as in (2.64a), shows that this generalized function can be
used on an extremely large set of test functions: any true function that is continuous at the origin
is an acceptable and appropriate test function. The subset of test functions $\phi_{ab}$ used in Eqs.
(2.48d)–(2.48g) has $a < b$ with $\phi_{ab}(t)$ automatically set to zero when t does not lie inside the
interval $a < t < b$. These functions can be used in (2.64b) to show that

\[
\int_{-\infty}^{\infty} \delta(t)\,\phi_{ab}(t)\,dt = \phi_{ab}(0) = 0
\]

when $a < b < 0$ or $0 < a < b$. Therefore, we have

\[
\delta(t) = 0 \quad\text{for } t \neq 0 \tag{2.65c}
\]

in the sense of definition (2.48f); that is, we know that

\[
\int_{-\infty}^{\infty} \delta(t)\,\phi_{ab}(t)\,dt = \int_{-\infty}^{\infty} 0 \cdot \phi_{ab}(t)\,dt = 0
\]

for all test functions $\phi_{ab}$ where the interval $a < t < b$ does not include $t = 0$. This is a
mathematically exact way of stating the lower line of Eq. (2.62a). If $\delta(t)$ is defined using
generalized limits, as in Eqs. (2.65a) and (2.65b), then we must show why Eq. (2.64b) is true. The
sequence in (2.65b), for example, leads to

\[
\begin{aligned}
\int_{-\infty}^{\infty} \phi(t) \left[\operatorname*{G\,lim}_{n\to\infty} \sqrt{\frac{n}{\pi}}\,e^{-n t^2}\right] dt
&= \lim_{n\to\infty} \int_{-\infty}^{\infty} \sqrt{\frac{n}{\pi}}\,e^{-n t^2}\,\phi(t)\,dt
 = \lim_{n\to\infty} \phi(0) \int_{-\infty}^{\infty} \sqrt{\frac{n}{\pi}}\,e^{-n t^2}\,dt \\
&= \phi(0) \lim_{n\to\infty} \int_{-\infty}^{\infty} \sqrt{\frac{n}{\pi}}\,e^{-n t^2}\,dt \\
&= \phi(0)
\end{aligned}
\tag{2.66}
\]

for any test function $\phi$. As n gets large in (2.66), only the value of $\phi$ at $t = 0$ can contribute
significantly to the integral. Replacing $\phi(t)$ by $\phi(0)$ quickly reduces the whole expression to
$\phi(0)$, showing that the generalized limit of the sequence in (2.65b) is indeed the delta function.
Some commonly used sequences that have the delta function as their generalized limits are

\[
\delta(t) = \operatorname*{G\,lim}_{n\to\infty} \frac{n/\pi}{1 + n^2 t^2} , \tag{2.67a}
\]

\[
\delta(t) = \operatorname*{G\,lim}_{n\to\infty} \frac{\sin^2(nt)}{n \pi t^2} , \tag{2.67b}
\]

\[
\delta(t) = \operatorname*{G\,lim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t} , \tag{2.67c}
\]

and so on. Perhaps the most interesting of these sequences is (2.67c). We know from (2.65c) that
one important property of the delta function is

\[
\int_{-\infty}^{\infty} \delta(t)\,\phi_{ab}(t)\,dt = 0
\]

whenever the interval $a < t < b$ does not include $t = 0$. The reason that

\[
\int_{-\infty}^{\infty} \left[\operatorname*{G\,lim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t}\right] \phi_{ab}(t)\,dt = \lim_{n\to\infty} \int_{-\infty}^{\infty} \left[\frac{\sin(2\pi n t)}{\pi t}\right] \phi_{ab}(t)\,dt = 0
\]

when the interval $a < t < b$ does not include $t = 0$ is that for extremely large n values the sine
oscillates rapidly between +1 and −1 while $\phi_{ab}(t)/t$ stays essentially constant for $t \neq 0$, averaging
the integrand to zero. Hence,

\[
\operatorname*{G\,lim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t} = \delta(t) = 0 \quad\text{for } t \neq 0
\]

for the same reason that

\[
\operatorname*{G\,lim}_{n\to\infty} e^{\pm i n t} = 0
\]

in Eq. (2.60g). To understand the behavior near $t = 0$, we construct function $\phi_{a0b}(t)$ in which the
interval $a < t < b$ does include $t = 0$. Now we can write, transforming the variable of integration
to $t' = 2\pi n t$,

\[
\begin{aligned}
\int_{-\infty}^{\infty} \left[\operatorname*{G\,lim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t}\right] \phi_{a0b}(t)\,dt
&= \lim_{n\to\infty} \frac{1}{\pi} \int_{-\infty}^{\infty} \left[\frac{\sin(t')}{t'}\right] \phi_{a0b}\!\left(\frac{t'}{2\pi n}\right) dt' \\
&= \phi_{a0b}(0) \lim_{n\to\infty} \frac{1}{\pi} \int_{-\infty}^{\infty} \left[\frac{\sin(t')}{t'}\right] dt'
 = \phi_{a0b}(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi_{a0b}(t)\,dt ,
\end{aligned}
\]

where in the second-to-last step we use (see any handbook of definite integrals)

\[
\int_{-\infty}^{\infty} \frac{\sin(t')}{t'}\,dt' = \pi .
\]

Any arbitrary test function can be written as a function $\phi_{a0b}(t)$ whose interval of nonzero values
includes $t = 0$ plus other test functions whose intervals of nonzero values do not include $t = 0$;
that is, we can always write $\phi(t) = \phi_{a0b}(t) + [\text{other functions zero at the origin}]$. When this $\phi(t)$ is
multiplied by $\operatorname*{G\,lim}_{n\to\infty} \sin(2\pi n t)/(\pi t)$ and integrated over t between $-\infty$ and $+\infty$, we realize that the
value of the integral is $\phi_{a0b}(0) = \phi(0)$ because the other functions that are zero at the origin give
zero contribution to the integral as $n \to \infty$. Consequently,

\[
\int_{-\infty}^{\infty} \left[\operatorname*{G\,lim}_{n\to\infty} \frac{\sin(2\pi n t)}{\pi t}\right] \phi(t)\,dt = \phi(0) = \int_{-\infty}^{\infty} \delta(t)\,\phi(t)\,dt ,
\]

indicating that the generalized limit of the sequence

\[
\frac{\sin(2\pi n t)}{\pi t}
\]

equals the delta function in the only sense that two generalized functions can ever be equal: the
integral of the left-hand side with any test function $\phi$ is always the same as the integral of the
right-hand side with any test function $\phi$ [see discussion after Eq. (2.47b)]. Figures 2.7(a)–2.7(c)
and 2.8(a)–2.8(c) plot the behavior of the $\sqrt{n/\pi} \cdot e^{-n t^2}$ and $(\pi t)^{-1} \sin(2\pi n t)$ sequences, showing the
two different ways these sequences change into delta functions.
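
The two approaches to the delta function can also be compared numerically by integrating each sequence against one test function and watching the result approach $\phi(0)$. The sketch below is an added illustration under stated assumptions: the narrow Gaussian test function $\phi(t) = e^{-100 t^2}$ is chosen only for this example, and for it the two integrals have the standard closed forms $\sqrt{n/(n + 100)}$ and $\operatorname{erf}(\pi n / 10)$, which are printed alongside the midpoint-rule values.

```python
import math

def integrate(g, lo, hi, steps=40000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

def phi(t):
    # Hypothetical narrow test function with phi(0) = 1
    return math.exp(-100.0 * t * t)

def gaussian_seq(t, n):
    # Gaussian sequence member sqrt(n/pi) * exp(-n t^2)
    return math.sqrt(n / math.pi) * math.exp(-n * t * t)

def sinc_seq(t, n):
    # Oscillating sequence member sin(2 pi n t) / (pi t)
    if t == 0.0:
        return 2.0 * n  # limiting value at t = 0
    return math.sin(2.0 * math.pi * n * t) / (math.pi * t)

results = {}
for n in (1, 5, 20, 100):
    g = integrate(lambda t: gaussian_seq(t, n) * phi(t), -2.0, 2.0)
    s = integrate(lambda t: sinc_seq(t, n) * phi(t), -2.0, 2.0)
    results[n] = (g, s)
    # Closed forms for this particular phi, shown for comparison
    print(n, g, math.sqrt(n / (n + 100.0)), s, math.erf(math.pi * n / 10.0))
```

Both columns tend toward $\phi(0) = 1$ as n grows. For this test function the oscillating sequence gets there sooner, because it reproduces $\phi(0)$ as soon as its oscillations are fast compared with the width of $\phi$, whereas the Gaussian sequence must become narrow compared with $\phi$.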
We note that for any odd test function $\phi_o(t)$

\[
\int_{-\infty}^{\infty} \delta(t)\,\phi_o(t)\,dt = \phi_o(0) = 0
\]

because, according to Eq. (2.12a), odd functions are zero at the origin. Therefore, from the
definitions of even and odd generalized functions in Eqs. (2.52d) and (2.52e), we conclude that
the delta function is an even generalized function because its integral with all odd test functions is
always zero. This means we can write [see Eq. (2.52f)]

\[
\delta(-t) = \delta(t) . \tag{2.68a}
\]

From the behavior of generalized functions specified in Eq. (2.52a), we have

\[
\int_{-\infty}^{\infty} \delta(t - t_0)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta(t)\,\phi(t + t_0)\,dt = \phi(t_0)
\]

and, because the delta function equals the zero function for $t \neq 0$, this result can be written as

\[
\int_{a}^{b} \delta(t - t_0)\,\phi(t)\,dt =
\begin{cases}
0 & \text{for } a < b < t_0 \text{ or } t_0 < a < b \\
\phi(t_0) & \text{for } a < t_0 < b
\end{cases} . \tag{2.68b}
\]

From Eq. (2.52b), we have

\[
\int_{-\infty}^{\infty} \delta(c \cdot t)\,\phi(t)\,dt = \frac{1}{|c|} \int_{-\infty}^{\infty} \delta(t)\,\phi(t/c)\,dt = \frac{1}{|c|}\,\phi(0) ,
\]

from which we conclude that

\[
\delta(ct) = \frac{1}{|c|}\,\delta(t) \tag{2.68c}
\]
because
\[
\int_{-\infty}^{\infty} \delta(c \cdot t)\,\phi(t)\,dt = \frac{1}{|c|}\,\phi(0) = \int_{-\infty}^{\infty} \left[\frac{1}{|c|}\,\delta(t)\right] \phi(t)\,dt
\]

for all test functions $\phi$. We note that this last rule, Eq. (2.68c), can also be used to show that the
delta function is even, since (2.68a) is just a special case of (2.68c) with $c = -1$.

FIGURE 2.7(a)–(c). Figures 2.7(a)–2.7(c) show how $\sqrt{n/\pi}\,e^{-n t^2}$ changes into a delta function of t as n increases.

FIGURE 2.8(a)–(c). Figures 2.8(a)–2.8(c) show how $(\pi t)^{-1} \sin(2\pi n t)$ changes into a delta function of t as n increases.

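Equation (2.52c) shows that there is no difficulty handling a general linear transformation of
the delta function’s argument, because for any two real constants a and c, we have

\[
\int_{-\infty}^{\infty} \delta(a \cdot t - c)\,\phi(t)\,dt = \frac{1}{|a|} \int_{-\infty}^{\infty} \delta(t)\,\phi\big((t + c)/a\big)\,dt = \frac{1}{|a|}\,\phi\!\left(\frac{c}{a}\right) = \int_{-\infty}^{\infty} \left[\frac{1}{|a|}\,\delta\!\left(t - \frac{c}{a}\right)\right] \phi(t)\,dt
\]

for all test functions $\phi$. Consequently,

\[
\delta(a \cdot t - c) = \frac{1}{|a|}\,\delta\!\left(t - \frac{c}{a}\right) . \tag{2.68d}
\]
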
This is the same answer we would get from factoring a out of the delta function argument and
then using (2.68c) to rescale the delta function.
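
The rescaling rule for a linear argument can be checked numerically by replacing the delta function with a narrow member of a delta sequence. The sketch below is an added illustration: it assumes the Gaussian approximant $\sqrt{n/\pi}\,e^{-n x^2}$ with a large but finite n, and the test function `phi` and the constants `a` and `c` are chosen only for this example. Agreement is therefore approximate and improves as n grows.

```python
import math

def integrate(g, lo, hi, steps=40000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (k + 0.5) * h) for k in range(steps))

def delta_n(x, n):
    # Gaussian approximant to the delta function, sqrt(n/pi) * exp(-n x^2)
    return math.sqrt(n / math.pi) * math.exp(-n * x * x)

def phi(t):
    # Hypothetical smooth test function
    return (1.0 + t) * math.exp(-t * t)

a, c, n = -3.0, 1.5, 1.0e4
t0 = c / a  # location where the argument a*t - c vanishes

# The approximant is negligible outside a small window around t0
lhs = integrate(lambda t: delta_n(a * t - c, n) * phi(t), t0 - 1.0, t0 + 1.0)
rhs = phi(t0) / abs(a)

print(lhs, rhs)
```

A negative `a` is used so that the absolute value in the prefactor matters; with `a` replaced by its magnitude and `c` adjusted to keep the same `t0`, the printed values would be unchanged.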
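When the delta function is multiplied by a true function v(t), we have

\[
\int_{-\infty}^{\infty} \delta(t - t_0)\,v(t)\,\phi(t)\,dt = \int_{-\infty}^{\infty} \delta(t - t_0)\left[v(t)\,\phi(t)\right] dt = v(t_0)\,\phi(t_0) = \int_{-\infty}^{\infty} \delta(t - t_0)\,v(t_0)\,\phi(t)\,dt
\]

for any test function $\phi$, from which we conclude that

\[
v(t) \cdot \delta(t - t_0) = v(t_0) \cdot \delta(t - t_0) . \tag{2.68e}
\]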

A useful generalization of (2.68d) is, for continuous true functions u(t),

\[
\delta\big(u(t)\big)=\sum_{\text{all }k}\frac{1}{|u'(t_k)|}\,\delta(t-t_k), \tag{2.68f}
\]

where $u'(t)=du/dt$ and $t_1, t_2,\ldots$ are the values of t for which $u(t)=0$. This formula only makes sense, of course, when $u'(t_k)\neq 0$ for $t_1, t_2,\ldots$. Perhaps the easiest way to see that (2.68f) must be true is to note that the delta function equals the zero function whenever its argument is not zero.
Therefore,
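\[
\int_{-\infty}^{\infty}\delta\big(u(t)\big)\,\phi(t)\,dt=\sum_{\text{all }k}\left[\int_{t_k-\varepsilon}^{t_k+\varepsilon}\delta\big(u(t)\big)\,\phi(t)\,dt\right] \tag{2.68g}
\]

with $\varepsilon>0$ taken to be small enough that each interval $t_k-\varepsilon<t<t_k+\varepsilon$ only includes one of the $t_k$ values for which u is zero. Nothing stops us from making $\varepsilon$ as small as we please, as long as it does not become zero, and eventually each integral on the right-hand side of (2.68g) can be written as

\[
\int_{t_k-\varepsilon}^{t_k+\varepsilon}\delta\big(u(t)\big)\,\phi(t)\,dt=\int_{t_k-\varepsilon}^{t_k+\varepsilon}\delta\big((t-t_k)\,u'(t_k)\big)\,\phi(t)\,dt,
\]

where we expand u as

\[
u(t)\cong u(t_k)+(t-t_k)\,u'(t_k)=(t-t_k)\,u'(t_k)
\]

since $u(t_k)=0$. Next, we use (2.68d) to write

\[
\delta\big((t-t_k)\,u'(t_k)\big)=\frac{1}{|u'(t_k)|}\,\delta(t-t_k),
\]

so that

\[
\int_{t_k-\varepsilon}^{t_k+\varepsilon}\delta\big(u(t)\big)\,\phi(t)\,dt
=\int_{t_k-\varepsilon}^{t_k+\varepsilon}\left[\frac{1}{|u'(t_k)|}\,\delta(t-t_k)\right]\phi(t)\,dt
=\int_{-\infty}^{\infty}\left[\frac{1}{|u'(t_k)|}\,\delta(t-t_k)\right]\phi(t)\,dt.
\]

Substitution of this result back into (2.68g) gives

\[
\int_{-\infty}^{\infty}\delta\big(u(t)\big)\,\phi(t)\,dt
=\sum_{\text{all }k}\int_{-\infty}^{\infty}\left[\frac{1}{|u'(t_k)|}\,\delta(t-t_k)\right]\phi(t)\,dt
=\int_{-\infty}^{\infty}\left[\sum_{\text{all }k}\frac{1}{|u'(t_k)|}\,\delta(t-t_k)\right]\phi(t)\,dt
\]

for all test functions $\phi$. This justifies Eq. (2.68f) according to the definition for the equality of generalized functions [see Eqs. (2.47a) and (2.47b)].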
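Rule (2.68f) can be checked numerically by standing in for the delta function with one member of the Gaussian sequence of Figs. 2.7(a)–2.7(c). The sketch below is only an illustration: the choices $u(t)=t^2-1$, the test function `phi`, the value of n, and the grid are assumptions made for this example, not part of the derivation.

```python
import numpy as np

def nascent_delta(x, n):
    """Gaussian delta sequence sqrt(n/pi) * exp(-n * x**2)."""
    return np.sqrt(n / np.pi) * np.exp(-n * x**2)

def trapezoid(y, t):
    """Trapezoidal rule on a uniform grid t."""
    h = t[1] - t[0]
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def u(t):            # hypothetical u(t) with simple zeros at t = -1 and t = +1
    return t**2 - 1.0

def du(t):           # u'(t)
    return 2.0 * t

def phi(t):          # hypothetical smooth test function
    return np.exp(-0.5 * (t - 0.3)**2)

t = np.linspace(-6.0, 6.0, 1_200_001)

# Left side of (2.68f) integrated against phi, with a large but finite n
lhs = trapezoid(nascent_delta(u(t), 1.0e4) * phi(t), t)

# Right side: sum over the zeros t_k of phi(t_k) / |u'(t_k)|
roots = np.array([-1.0, 1.0])
rhs = np.sum(phi(roots) / np.abs(du(roots)))

print(lhs, rhs)
```

The two printed numbers should agree more closely as n grows, provided the grid stays fine enough to resolve the narrowing peaks.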


2.15 Derivatives of the Delta Function


We have already remarked that the set of test functions for $\delta(t)$ contains all functions that are continuous at the origin. Changing the argument of the delta function changes the set of appropriate test functions. In Eq. (2.68b), for example, the test functions must be continuous at $t=t_0$; in (2.68d) they must be continuous at $t=c/a$; and in (2.68f) they must be continuous at all $t=t_k$. When Eq. (2.53b) is used to define the derivative of a delta function, $\delta'(t)$, we have

\[
\int_{-\infty}^{\infty}\delta'(t)\,\phi(t)\,dt=-\int_{-\infty}^{\infty}\delta(t)\,\phi'(t)\,dt=-\phi'(0), \tag{2.69a}
\]

which shows that now the first derivative of all the test functions must be continuous at the origin. If we start out with a test function $\phi_{ab}(t)$ that must be identically zero for all $t<a$ and for all $t>b$, then Eq. (2.69a) becomes

\[
\int_{-\infty}^{\infty}\delta'(t)\,\phi_{ab}(t)\,dt=-\int_{-\infty}^{\infty}\delta(t)\,\phi_{ab}'(t)\,dt=-\phi_{ab}'(0)=0
\]

whenever the interval $a<t<b$ does not contain the origin. Hence, we can write

\[
\int_{-\infty}^{\infty}\delta'(t)\,\phi_{ab}(t)\,dt=0=\int_{-\infty}^{\infty}0\cdot\phi_{ab}(t)\,dt
\]

for $a<b<0$ or $0<a<b$, showing that $\delta'(t)$ equals the zero function in the sense of Eq. (2.48f) for $t\neq 0$. Equation (2.52a) can be used in conjunction with (2.53b) to evaluate $\delta'(t)$ when it is shifted from the origin by an amount $t_0$,

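\[
\int_{-\infty}^{\infty}\delta'(t-t_0)\,\phi(t)\,dt
=\int_{-\infty}^{\infty}\delta'(t)\,\phi(t+t_0)\,dt
=-\int_{-\infty}^{\infty}\delta(t)\,\phi'(t+t_0)\,dt
=-\phi'(t_0), \tag{2.69b}
\]

where now we require the first derivative of the test functions to be continuous at $t=t_0$. This result can be applied to test functions $\phi_{ab}(t)$ to get

\[
\int_{-\infty}^{\infty}\delta'(t-t_0)\,\phi_{ab}(t)\,dt
=\int_{-\infty}^{\infty}\delta'(t)\,\phi_{ab}(t+t_0)\,dt
=-\int_{-\infty}^{\infty}\delta(t)\,\phi_{ab}'(t+t_0)\,dt
=-\phi_{ab}'(t_0)=0
\]

whenever the interval $a<t<b$ does not contain $t=t_0$. Therefore,

\[
\int_{-\infty}^{\infty}\delta'(t-t_0)\,\phi_{ab}(t)\,dt=0=\int_{-\infty}^{\infty}0\cdot\phi_{ab}(t)\,dt
\]

whenever $a<b<t_0$ or $t_0<a<b$, showing that $\delta'(t-t_0)$ equals the zero function [in the sense of Eq. (2.48f)] for $t\neq t_0$. Equations (2.69a) and (2.69b) can be applied any number of times to get $\delta^{(n)}$, the nth derivative of the delta function, shifted away from the origin by an amount $t_0$. We have

\[
\int_{-\infty}^{\infty}\delta^{(n)}(t-t_0)\,\phi(t)\,dt
=-\int_{-\infty}^{\infty}\delta^{(n-1)}(t)\,\phi^{(1)}(t+t_0)\,dt
=\int_{-\infty}^{\infty}\delta^{(n-2)}(t)\,\phi^{(2)}(t+t_0)\,dt=\cdots,
\]

which eventually becomes

\[
\int_{-\infty}^{\infty}\delta^{(n)}(t-t_0)\,\phi(t)\,dt=(-1)^{n}\,\phi^{(n)}(t_0)=(-1)^{n}\left.\frac{d^{n}\phi}{dt^{n}}\right|_{t=t_0}. \tag{2.69c}
\]

Again, this latest result can be applied to test functions $\phi_{ab}(t)$ to get

\[
\int_{-\infty}^{\infty}\delta^{(n)}(t-t_0)\,\phi_{ab}(t)\,dt=(-1)^{n}\,\phi_{ab}^{(n)}(t_0)=0
\]

whenever the interval $a<t<b$ does not contain $t=t_0$. Because

\[
\int_{-\infty}^{\infty}\delta^{(n)}(t-t_0)\,\phi_{ab}(t)\,dt=0=\int_{-\infty}^{\infty}0\cdot\phi_{ab}(t)\,dt
\]

whenever $t=t_0$ lies outside this interval, we end up with [using the definition of equality in (2.48f)]

\[
\delta^{(n)}(t-t_0)=0\quad\text{for } t\neq t_0. \tag{2.69d}
\]

The test functions integrated with $\delta^{(n)}(t-t_0)$ must, of course, have their nth derivatives continuous at $t=t_0$.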
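Equation (2.69c) can be spot-checked for n = 1 and n = 2 by differentiating a Gaussian delta sequence analytically and integrating it against a smooth test function. This is a minimal sketch under stated assumptions: the Gaussian sequence, the test function sin(t), the shift `t0`, and the grid are illustrative choices only.

```python
import numpy as np

def gauss(x, n):
    """Gaussian delta sequence sqrt(n/pi) * exp(-n * x**2)."""
    return np.sqrt(n / np.pi) * np.exp(-n * x**2)

def d1_gauss(x, n):
    """First derivative of the Gaussian sequence with respect to x."""
    return -2.0 * n * x * gauss(x, n)

def d2_gauss(x, n):
    """Second derivative of the Gaussian sequence with respect to x."""
    return (4.0 * n**2 * x**2 - 2.0 * n) * gauss(x, n)

def trapezoid(y, t):
    """Trapezoidal rule on a uniform grid t."""
    h = t[1] - t[0]
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

t0 = 0.4                                   # hypothetical shift
n = 1.0e3                                  # large but finite sequence index
t = np.linspace(t0 - 2.0, t0 + 2.0, 400_001)
phi = np.sin(t)                            # test function; phi' = cos, phi'' = -sin

i1 = trapezoid(d1_gauss(t - t0, n) * phi, t)   # approximates (-1)^1 * phi'(t0)
i2 = trapezoid(d2_gauss(t - t0, n) * phi, t)   # approximates (-1)^2 * phi''(t0)

print(i1, -np.cos(t0))
print(i2, -np.sin(t0))
```

The small residual difference comes from using a finite n; it shrinks as n is increased.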




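We define the function $\Theta(t)$ to be

\[
\Theta(t)=\begin{cases}1 & \text{for } t>0\\ 1/2 & \text{for } t=0\\ 0 & \text{for } t<0\end{cases}. \tag{2.70a}
\]

Function $\Theta$ is often called the Heaviside step function. If we take

\[
\Theta^{(1)}(t)=\frac{d}{dt}\,\Theta(t) \tag{2.70b}
\]

to be the first derivative of the $\Theta$ function, then $\Theta^{(1)}(t)=0$ for all $t\neq 0$. To evaluate $\Theta^{(1)}(t)$ at the origin, we decide to turn $\Theta(t)$ and $\Theta^{(1)}(t)$ into generalized functions that we call “$\Theta(t)$” and “$\Theta^{(1)}(t)$” respectively. We define

\[
\int_{-\infty}^{\infty}\text{“}\Theta(t)\text{”}\,\phi(t)\,dt=\int_{-\infty}^{\infty}\Theta(t)\,\phi(t)\,dt
\]

for all test functions $\phi$, which means that, according to Eqs. (2.48b) and (2.48c),

\[
\text{“}\Theta(t)\text{”}=\Theta(t). \tag{2.70c}
\]

Having established the generalized function “$\Theta(t)$”, we know from Eq. (2.53b) that the generalized function “$\Theta^{(1)}(t)$” must satisfy

\[
\int_{-\infty}^{\infty}\text{“}\Theta^{(1)}(t)\text{”}\,\phi(t)\,dt=-\int_{-\infty}^{\infty}\text{“}\Theta(t)\text{”}\,\phi'(t)\,dt. \tag{2.70d}
\]

A formal integration by parts of the left-hand side gives

\[
\int_{-\infty}^{\infty}\text{“}\Theta^{(1)}(t)\text{”}\,\phi(t)\,dt=\text{“}\Theta(t)\text{”}\cdot\phi(t)\Big|_{-\infty}^{\infty}-\int_{-\infty}^{\infty}\text{“}\Theta(t)\text{”}\,\phi'(t)\,dt.
\]

This becomes, using (2.70c) to remove the double quotes,

\[
\begin{aligned}
\int_{-\infty}^{\infty}\text{“}\Theta^{(1)}(t)\text{”}\,\phi(t)\,dt
&=\left[\lim_{t\to\infty}\phi(t)\right]-\int_{0}^{\infty}\Theta(t)\,\phi'(t)\,dt\\
&=\left[\lim_{t\to\infty}\phi(t)\right]+\phi(0)-\left[\lim_{t\to\infty}\phi(t)\right]\\
&=\phi(0)=\int_{-\infty}^{\infty}\delta(t)\,\phi(t)\,dt.
\end{aligned}
\]

Hence, for all test functions $\phi$ continuous at the origin (note that they do not have to approach zero at $\pm\infty$), we have

\[
\int_{-\infty}^{\infty}\text{“}\Theta^{(1)}(t)\text{”}\,\phi(t)\,dt=\int_{-\infty}^{\infty}\delta(t)\,\phi(t)\,dt,
\]

so

\[
\text{“}\Theta^{(1)}(t)\text{”}=\frac{d}{dt}\,\text{“}\Theta(t)\text{”}=\delta(t) \tag{2.70e}
\]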

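in the sense of Eq. (2.47b). There is nothing unique about the Heaviside step function. We can also show, using the generalized function “sgn(t)” introduced in Eqs. (2.60a) and (2.60b) above, that for any test function $\phi$

\[
\int_{-\infty}^{\infty}\frac{1}{2}\,\text{“}\mathrm{sgn}^{(1)}(t)\text{”}\,\phi(t)\,dt=\int_{-\infty}^{\infty}\delta(t)\,\phi(t)\,dt, \tag{2.70f}
\]

where “$\mathrm{sgn}^{(1)}(t)$” is the first derivative of “sgn(t)”. To show this is true, we do a formal integration by parts,

\[
\int_{-\infty}^{\infty}\frac{1}{2}\,\text{“}\mathrm{sgn}^{(1)}(t)\text{”}\,\phi(t)\,dt
=\frac{1}{2}\,\text{“}\mathrm{sgn}(t)\text{”}\cdot\phi(t)\Big|_{-\infty}^{\infty}
-\frac{1}{2}\int_{-\infty}^{\infty}\text{“}\mathrm{sgn}(t)\text{”}\,\phi'(t)\,dt.
\]

This becomes, using Eqs. (2.60b) and (2.42c),

\[
\begin{aligned}
\int_{-\infty}^{\infty}\frac{1}{2}\,\text{“}\mathrm{sgn}^{(1)}(t)\text{”}\,\phi(t)\,dt
&=\frac{1}{2}\left[\lim_{t\to\infty}\phi(t)\right]+\frac{1}{2}\left[\lim_{t\to-\infty}\phi(t)\right]
+\frac{1}{2}\int_{-\infty}^{0}\phi'(t)\,dt-\frac{1}{2}\int_{0}^{\infty}\phi'(t)\,dt\\
&=\frac{1}{2}\left[\lim_{t\to\infty}\phi(t)\right]+\frac{1}{2}\left[\lim_{t\to-\infty}\phi(t)\right]
-\frac{1}{2}\left[\lim_{t\to-\infty}\phi(t)\right]+\frac{1}{2}\phi(0)+\frac{1}{2}\phi(0)-\frac{1}{2}\left[\lim_{t\to\infty}\phi(t)\right]\\
&=\phi(0)=\int_{-\infty}^{\infty}\delta(t)\,\phi(t)\,dt.
\end{aligned}
\]

This shows Eq. (2.70f) is true. Again, we get a formula

\[
\frac{1}{2}\,\text{“}\mathrm{sgn}^{(1)}(t)\text{”}=\delta(t) \tag{2.70g}
\]

in the sense of Eq. (2.47b), where the only major restriction on the test functions is that they be continuous at the origin.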
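A numerical illustration of (2.70e) and (2.70g): a smooth approximation to the step, $\tfrac12[1+\tanh(nt)]$, and the corresponding approximation $\tfrac12\tanh(nt)$ to $\tfrac12\,\mathrm{sgn}(t)$ differ only by the constant 1/2, so they share the derivative $\tfrac{n}{2}[1-\tanh^2(nt)]$. The sketch below assumes this tanh smoothing (it is not the construction used in the text) and a test function that deliberately does not approach zero at large |t|.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal rule on a uniform grid t."""
    h = t[1] - t[0]
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def smooth_step_derivative(t, n):
    """d/dt of 0.5 * (1 + tanh(n t)); also d/dt of 0.5 * tanh(n t)."""
    return 0.5 * n * (1.0 - np.tanh(n * t)**2)

def phi(t):
    """Hypothetical test function, continuous at 0 and not decaying."""
    return 2.0 + np.cos(t)

t = np.linspace(-5.0, 5.0, 1_000_001)
val = trapezoid(smooth_step_derivative(t, 200.0) * phi(t), t)

print(val, phi(0.0))   # the integral should be close to phi(0)
```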

2.16 Fourier Transform of the Delta Function


To find the Fourier transform of the delta function, we construct two sequences of functions having the relationship specified in (2.59a)–(2.59g) above. It is easiest to start with the delta-function sequence in Eq. (2.67c). Any standard table of Fourier transforms gives$^{23}$

\[
\int_{-\infty}^{\infty}e^{-2\pi ift}\,\frac{\sin(2\pi nt)}{\pi t}\,dt=\mathcal{F}^{(-ift)}\!\left(\frac{\sin(2\pi nt)}{\pi t}\right)=\Pi(f,n)
\]

and

\[
\int_{-\infty}^{\infty}e^{2\pi ift}\,\Pi(f,n)\,df=\mathcal{F}^{(ift)}\big(\Pi(f,n)\big)=\frac{\sin(2\pi nt)}{\pi t}
\]

so that

\[
\frac{\sin(2\pi nt)}{\pi t}\leftrightarrow\Pi(f,n). \tag{2.71a}
\]

Although Eq. (2.71a) holds true for all real n, it is here used only for integer values of n. We know from (2.67c) that the generalized limit as $n\to\infty$ of the left-hand side of (2.71a) is $\delta(t)$, but what is the corresponding generalized limit of the right-hand side? We have

\[
\int_{-\infty}^{\infty}\phi(f)\left[\operatorname*{G\,lim}_{n\to\infty}\Pi(f,n)\right]df
=\lim_{n\to\infty}\int_{-\infty}^{\infty}\Pi(f,n)\,\phi(f)\,df
=\lim_{n\to\infty}\int_{-n}^{n}\phi(f)\,df
=\int_{-\infty}^{\infty}1\cdot\phi(f)\,df
\]

for any test function $\phi$. This shows that

\[
\operatorname*{G\,lim}_{n\to\infty}\Pi(f,n)=1,
\]

which is no surprise. Therefore, taking the generalized limit as $n\to\infty$ of both sides of (2.71a) gives

\[
\delta(t)\leftrightarrow 1, \tag{2.71b}
\]

or, restating this result,

\[
\int_{-\infty}^{\infty}\delta(t)\,e^{-2\pi ift}\,dt=1 \tag{2.71c}
\]

and

\[
\int_{-\infty}^{\infty}e^{2\pi ift}\,df=\delta(t). \tag{2.71d}
\]

Equation (2.71c) is just what we expect from Eq. (2.64b), since

\[
e^{-2\pi if\cdot 0}=1;
\]

but Eq. (2.71d) is true only in the sense of Eq. (2.47b), and it is only safe to substitute freely from (2.71d) when the substitution takes place inside an integral.
Because the sine is an odd function of its argument, we have according to Eq. (2.17), and assuming the integral is a Cauchy principal value, that

\[
\int_{-\infty}^{\infty}\sin(2\pi ft)\,df=0.
\]

Therefore, Eq. (2.71d) becomes, using Eq. (2.19) and that the cosine is even,

\[
\int_{-\infty}^{\infty}\big[\cos(2\pi ft)+i\sin(2\pi ft)\big]\,df=2\int_{0}^{\infty}\cos(2\pi ft)\,df=\delta(t).
\]

Since the integral over the sine always disappears, we can also write

\[
\int_{-\infty}^{\infty}\big[\cos(2\pi ft)\pm i\sin(2\pi ft)\big]\,df=\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,df=\delta(t).
\]

Hence, two additional formulas for the delta function are

\[
2\int_{0}^{\infty}\cos(2\pi ft)\,df=\delta(t) \tag{2.71e}
\]

and

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,df=\delta(t). \tag{2.71f}
\]

As was the case for Eq. (2.71d), these formulas are meant to be used inside integrals.

23 Jack D. Gaskill, Linear Systems, Fourier Transforms, and Optics (John Wiley & Sons, New York, 1978), p. 201, with the sinc, rect function pair corresponding to formula (2.71a) above.
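One way to see what "meant to be used inside integrals" means in practice is to truncate the frequency integral in (2.71d) at ±n, which gives $\sin(2\pi nt)/(\pi t)$, and integrate that kernel against a test function. The sketch below assumes a shifted Gaussian test function and a finite t grid, both chosen only for illustration; it uses `numpy.sinc`, which NumPy defines as $\sin(\pi x)/(\pi x)$, to handle the point t = 0.

```python
import numpy as np

def trapezoid(y, t):
    """Trapezoidal rule on a uniform grid t."""
    h = t[1] - t[0]
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def truncated_kernel(t, n):
    """Integral of exp(2*pi*i*f*t) over -n <= f <= n, i.e. sin(2*pi*n*t)/(pi*t)."""
    return 2.0 * n * np.sinc(2.0 * n * t)

t = np.linspace(-8.0, 8.0, 160_001)
phi = np.exp(-np.pi * (t - 0.2)**2)        # hypothetical test function
target = np.exp(-np.pi * 0.2**2)           # phi evaluated at t = 0

cutoffs = [0.5, 1.0, 2.0, 4.0]
values = [trapezoid(truncated_kernel(t, n) * phi, t) for n in cutoffs]

for n, val in zip(cutoffs, values):
    print(n, val, target)
```

As the cutoff n grows, the integral settles onto the value of the test function at the origin, which is how the delta function on the right-hand side of (2.71d) is to be read.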

2.17 Fourier Convolution Theorem with Generalized Functions


Now that we have defined what is meant by the Fourier transform of a generalized function, it is surprisingly easy to show that the Fourier convolution theorem holds for the product of a generalized function and a true function.
We start with two sequences of true functions, one of them labeled with a superscript minus sign for reasons that will shortly become apparent, called

\[
v_1(t),\,v_2(t),\ldots,v_n(t),\ldots\quad\text{and}\quad V_1^{(-)}(f),\,V_2^{(-)}(f),\ldots,V_n^{(-)}(f),\ldots.
\]

If these two sequences obey the relationship

\[
\begin{aligned}
v_1(t)&\leftrightarrow V_1^{(-)}(f)\\
v_2(t)&\leftrightarrow V_2^{(-)}(f)\\
&\;\;\vdots\\
v_n(t)&\leftrightarrow V_n^{(-)}(f)\\
&\;\;\vdots
\end{aligned}
\]

we know from Eq. (2.59g) that the generalized functions $v_G(t)$ and $V_G^{(-)}(f)$ specified by

\[
v_G(t)=\operatorname*{G\,lim}_{n\to\infty}v_n(t) \tag{2.72a}
\]

and

\[
V_G^{(-)}(f)=\operatorname*{G\,lim}_{n\to\infty}V_n^{(-)}(f) \tag{2.72b}
\]

form a Fourier transform pair,

\[
v_G(t)\leftrightarrow V_G^{(-)}(f). \tag{2.72c}
\]

We also suppose that there exists a third sequence of true functions labeled with a superscript plus sign,

\[
V_1^{(+)}(t),\,V_2^{(+)}(t),\ldots,V_n^{(+)}(t),\ldots,
\]

such that

\[
\begin{aligned}
V_1^{(+)}(t)&\leftrightarrow v_1(f)\\
V_2^{(+)}(t)&\leftrightarrow v_2(f)\\
&\;\;\vdots\\
V_n^{(+)}(t)&\leftrightarrow v_n(f)\\
&\;\;\vdots
\end{aligned}
\]

If this third sequence has a generalized function as its generalized limit,

\[
V_G^{(+)}(t)=\operatorname*{G\,lim}_{n\to\infty}V_n^{(+)}(t), \tag{2.72d}
\]

then the generalized functions $V_G^{(+)}(t)$ and $v_G(f)$ are also a Fourier transform pair,

\[
V_G^{(+)}(t)\leftrightarrow v_G(f). \tag{2.72e}
\]

Definitions (2.72b) and (2.72d) taken together show that

\[
V_G^{(\pm)}(f)=\operatorname*{G\,lim}_{n\to\infty}V_n^{(\pm)}(f), \tag{2.72f}
\]

where we have replaced t by f in (2.72d); and Eqs. (2.72c) and (2.72e) taken together give

\[
V_G^{(\pm)}(f)=\mathcal{F}^{(\pm ift)}\big(v_G(t)\big), \tag{2.72g}
\]

where we have interchanged the roles of t and f in Eq. (2.72e).


From the Fourier convolution theorem for true functions [see Eq. (2.39j)], it follows that for any true function u(t)

\[
\mathcal{F}^{(\pm ift)}\big(u(t)\cdot v_n(t)\big)=\mathcal{F}^{(\pm ift')}\big(u(t')\big)\ast\mathcal{F}^{(\pm ift'')}\big(v_n(t'')\big)
\]

or

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,u(t)\,v_n(t)\,dt=\int_{-\infty}^{\infty}U^{(\pm)}(f')\,V_n^{(\pm)}(f-f')\,df',
\]

where

\[
U^{(\pm)}(f)=\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,u(t)\,dt\quad\text{and}\quad V_n^{(\pm)}(f)=\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,v_n(t)\,dt.
\]

The integral formula for $V_n^{(\pm)}(f)$ just restates the definitions given to $V_n^{(-)}$ and $V_n^{(+)}$ above. Taking the limit of both sides as $n\to\infty$ gives

\[
\lim_{n\to\infty}\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,u(t)\,v_n(t)\,dt=\lim_{n\to\infty}\int_{-\infty}^{\infty}U^{(\pm)}(f')\,V_n^{(\pm)}(f-f')\,df'
\]

or, moving the limiting process inside the integral so that it becomes a generalized limit [see discussion after Eq. (2.56a)],

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,u(t)\operatorname*{G\,lim}_{n\to\infty}v_n(t)\,dt=\int_{-\infty}^{\infty}U^{(\pm)}(f')\operatorname*{G\,lim}_{n\to\infty}V_n^{(\pm)}(f-f')\,df'.
\]

From the definitions of $v_G(t)$ and $V_G^{(\pm)}(f)$ [see Eqs. (2.72a) and (2.72f)], we get

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,u(t)\,v_G(t)\,dt=\int_{-\infty}^{\infty}U^{(\pm)}(f')\,V_G^{(\pm)}(f-f')\,df',
\]

which becomes

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,u(t)\,v_G(t)\,dt=U^{(\pm)}(f)\ast V_G^{(\pm)}(f) \tag{2.72h}
\]

or, substituting from Eq. (2.72g),

\[
\mathcal{F}^{(\pm ift)}\big(u(t)\cdot v_G(t)\big)=\mathcal{F}^{(\pm ift')}\big(u(t')\big)\ast\mathcal{F}^{(\pm ift'')}\big(v_G(t'')\big). \tag{2.72i}
\]

Consulting Eq. (2.55b) above, we note that convolution with a generalized function is commutative, just like the convolution of two standard functions, so Eqs. (2.72h) and (2.72i) can also be written as

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,u(t)\,v_G(t)\,dt=V_G^{(\pm)}(f)\ast U^{(\pm)}(f) \tag{2.72j}
\]

and

\[
\mathcal{F}^{(\pm ift)}\big(u(t)\cdot v_G(t)\big)=\mathcal{F}^{(\pm ift'')}\big(v_G(t'')\big)\ast\mathcal{F}^{(\pm ift')}\big(u(t')\big). \tag{2.72k}
\]

This establishes the generalized-function counterpart to Eq. (2.39j) whenever $e^{\pm 2\pi ift}u(t)$ and $U^{(\pm)}(f)$ qualify as acceptable test functions. Since almost all well-behaved, continuous functions are acceptable test functions when used with linear combinations of delta functions or the derivatives of delta functions, Eqs. (2.72h) and (2.72i) are valid whenever $v_G(t)$ is a linear combination of delta functions or the derivatives of delta functions.

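Establishing the Fourier convolution theorem in the other direction is even easier. We just write, making the variable substitution $t''=t-t'$ and remembering that the convolutions are commutative,

\[
\begin{aligned}
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\,[u(t)\ast v_G(t)]\,dt
&=\int_{-\infty}^{\infty}dt\;e^{\pm 2\pi ift}\int_{-\infty}^{\infty}dt'\;u(t-t')\operatorname*{G\,lim}_{n\to\infty}v_n(t')\\
&=\int_{-\infty}^{\infty}dt'\operatorname*{G\,lim}_{n\to\infty}v_n(t')\int_{-\infty}^{\infty}dt\;u(t-t')\,e^{\pm 2\pi ift}\\
&=\lim_{n\to\infty}\int_{-\infty}^{\infty}dt'\;v_n(t')\int_{-\infty}^{\infty}dt\;u(t-t')\,e^{\pm 2\pi ift}\\
&=\lim_{n\to\infty}\int_{-\infty}^{\infty}dt'\;v_n(t')\,e^{\pm 2\pi ift'}\int_{-\infty}^{\infty}dt''\;u(t'')\,e^{\pm 2\pi ift''}\\
&=\left[\int_{-\infty}^{\infty}e^{\pm 2\pi ift'}\operatorname*{G\,lim}_{n\to\infty}v_n(t')\,dt'\right]\cdot\left[\int_{-\infty}^{\infty}u(t'')\,e^{\pm 2\pi ift''}\,dt''\right].
\end{aligned}
\]

We conclude that

\[
\mathcal{F}^{(\pm ift)}\big(u(t)\ast v_G(t)\big)=\mathcal{F}^{(\pm ift')}\big(u(t')\big)\cdot\mathcal{F}^{(\pm ift'')}\big(v_G(t'')\big), \tag{2.72$\ell$}
\]

showing that Eq. (2.39a) holds true for the convolution of a true function and a generalized function as well as for the convolution of two true functions.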

2.18 The Shah Function


The shah function, often written as $\mathrm{III}$, can be defined as the generalized limit

\[
\mathrm{III}(t,T)=\frac{1}{T}\cdot\operatorname*{G\,lim}_{n\to\infty}\frac{\sin\!\left(2\pi\dfrac{t}{T}\left(n+\dfrac{1}{2}\right)\right)}{\sin\!\left(\pi\dfrac{t}{T}\right)}. \tag{2.73}
\]

For any test function $\phi(t)$, we have

\[
\int_{-\infty}^{\infty}\phi(t)\operatorname*{G\,lim}_{n\to\infty}\left[\frac{\sin\big(2\pi tT^{-1}[n+(1/2)]\big)}{\sin\big(\pi tT^{-1}\big)}\right]dt
=\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\left\{\frac{\sin\big(2\pi tT^{-1}[n+(1/2)]\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt. \tag{2.74a}
\]

As n gets large in (2.74a), the term in braces { } oscillates ever more rapidly between +1 and −1, causing the more slowly varying function $\phi$ to make only a negligible contribution to the integral. The only place this might not hold true is at the isolated t values

\[
t=0,\ \pm T,\ \pm 2T,\ldots. \tag{2.74b}
\]

It is easy to see why these isolated values are different. Suppose t differs from one of these isolated values by only a small amount $\Delta t$ so that

\[
t=\Delta t\pm mT\quad\text{for } m=0,1,2,\ldots. \tag{2.74c}
\]

Then the term in braces becomes

\[
\frac{\sin\big(2\pi(\Delta t\pm mT)T^{-1}[n+(1/2)]\big)}{\sin\big(\pi(\Delta t\pm mT)T^{-1}\big)}
=\frac{\sin\big(2\pi\Delta t\,T^{-1}[n+(1/2)]\pm 2\pi nm\pm\pi m\big)}{\sin\big(\pi\Delta t\,T^{-1}\pm\pi m\big)}
=\frac{\sin\big(2\pi\Delta t\,T^{-1}[n+(1/2)]\big)}{\sin\big(\pi\Delta t\,T^{-1}\big)}.
\]

To explain the last step, we note that the sine does not change when a ±nm number of 2π’s is added to its argument, and adding a ±m number of π’s to the sine’s argument either leaves the sine unchanged (if m is even) or multiplies it by −1 (if m is odd). Since the sine values in both the numerator and denominator have the same number of π’s added to their arguments, we do not care if m is odd because the factor of −1 cancels, leaving the sine ratio unchanged. As $\Delta t$ is taken to be ever smaller in magnitude for a fixed value of n, there comes a time when the arguments of both sines are small in magnitude, allowing each sine to be approximated by its argument. We then have

\[
\frac{\sin\big(2\pi(\Delta t\pm mT)T^{-1}[n+(1/2)]\big)}{\sin\big(\pi(\Delta t\pm mT)T^{-1}\big)}
=\frac{\sin\big(2\pi\Delta t\,T^{-1}[n+(1/2)]\big)}{\sin\big(\pi\Delta t\,T^{-1}\big)}
\cong\frac{2\pi\Delta t\,T^{-1}[n+(1/2)]}{\pi\Delta t\,T^{-1}}=2\,[n+(1/2)].
\]

Consequently, the peak values of the term in braces get ever larger at the isolated points in (2.74b) as n increases, as shown in Figs. 2.9(a)–2.9(c). We see that the triangular peaks at the isolated points in (2.74b) have widths equal to $T/(n+(1/2))$. As n gets ever larger, the term in braces oscillates so rapidly between +1 and −1 compared to the test function $\phi$ that there is no contribution made to the integral on the right-hand side of (2.74a) except at the isolated t values shown in Figs. 2.9(a)–2.9(c). At these t values, we have

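\[
\begin{aligned}
\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\left\{\frac{\sin\big(2\pi tT^{-1}[n+(1/2)]\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt
&=\cdots+\phi(-T)\,(\text{area of triangular peak})\\
&\quad+\phi(0)\,(\text{area of triangular peak})\\
&\quad+\phi(T)\,(\text{area of triangular peak})+\cdots\\
&=\frac{1}{2}\cdot\frac{T}{n+(1/2)}\cdot 2\,[n+(1/2)]\,\big(\cdots+\phi(-T)+\phi(0)+\phi(T)+\cdots\big),
\end{aligned}
\]

which simplifies to

\[
\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\left\{\frac{\sin\big(2\pi tT^{-1}[n+(1/2)]\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt=T\sum_{k=-\infty}^{\infty}\phi(kT). \tag{2.75a}
\]

But $T\sum_{k=-\infty}^{\infty}\phi(kT)$ can be thought of as what we get when evaluating the integral

\[
\int_{-\infty}^{\infty}\phi(t)\left[T\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt
=T\sum_{k=-\infty}^{\infty}\int_{-\infty}^{\infty}\delta(t-kT)\,\phi(t)\,dt
=T\sum_{k=-\infty}^{\infty}\phi(kT).
\]

This lets us write (2.75a) as

\[
\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\left\{\frac{\sin\big(2\pi tT^{-1}[n+(1/2)]\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt
=\int_{-\infty}^{\infty}\phi(t)\left[T\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt \tag{2.75b}
\]

or, using (2.56a) to take the limit inside the integral as a generalized limit,

\[
\int_{-\infty}^{\infty}\phi(t)\operatorname*{G\,lim}_{n\to\infty}\left\{\frac{\sin\big(2\pi tT^{-1}[n+(1/2)]\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt
=\int_{-\infty}^{\infty}\phi(t)\left[T\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt.
\]

Since this last result is true for any test function $\phi$, we conclude that

\[
\operatorname*{G\,lim}_{n\to\infty}\left\{\frac{\sin\big(2\pi tT^{-1}[n+(1/2)]\big)}{\sin\big(\pi tT^{-1}\big)}\right\}=T\sum_{k=-\infty}^{\infty}\delta(t-kT) \tag{2.75c}
\]

in the sense of Eq. (2.47b). Comparison of this result to the definition of the shah function in Eq. (2.73) above shows that

\[
\mathrm{III}(t,T)=\sum_{k=-\infty}^{\infty}\delta(t-kT). \tag{2.75d}
\]

We note that variable t can be replaced by f in Eq. (2.75c) to get

\[
\operatorname*{G\,lim}_{n\to\infty}\left\{\frac{\sin\big(2\pi fT^{-1}[n+(1/2)]\big)}{\sin\big(\pi fT^{-1}\big)}\right\}=T\sum_{k=-\infty}^{\infty}\delta(f-kT).
\]

Parameter T is arbitrary throughout this derivation, so nothing stops us from replacing it by $T^{-1}$ everywhere to get

\[
\operatorname*{G\,lim}_{n\to\infty}\left\{\frac{\sin\big(2\pi fT[n+(1/2)]\big)}{\sin(\pi fT)}\right\}=\frac{1}{T}\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{T}\right). \tag{2.75e}
\]

This is another useful version of the formula in Eq. (2.75d).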
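The limit in (2.75a) can be observed numerically by integrating the sine ratio against a smooth test function for increasing n and comparing with $T\sum_k\phi(kT)$. The following is a sketch under stated assumptions: the Gaussian test function, its width, the period T = 1, and the midpoint grid (chosen so that no sample lands exactly on a zero of the denominator) are illustrative choices, not part of the text.

```python
import numpy as np

def sine_ratio(t, T, n):
    """The term in braces in (2.74a): sin(2*pi*(t/T)*(n + 1/2)) / sin(pi*t/T)."""
    return np.sin(2.0 * np.pi * (t / T) * (n + 0.5)) / np.sin(np.pi * t / T)

def phi(t):
    """Hypothetical test function: a Gaussian of width 0.3."""
    return np.exp(-(t / 0.3)**2)

T = 1.0
L = 8.0                     # integrate over [-L, L]; phi is negligible outside
N = 400_000
h = 2.0 * L / N
t = -L + (np.arange(N) + 0.5) * h       # midpoints never coincide with t = k*T

orders = [1, 3, 8]
integrals = [h * np.sum(phi(t) * sine_ratio(t, T, n)) for n in orders]

k = np.arange(-8, 9)
comb_value = T * np.sum(phi(k * T))     # right-hand side of (2.75a)

for n, val in zip(orders, integrals):
    print(n, val, comb_value)
```

For this particular test function the integrals approach the comb value quickly as n increases; a broader or narrower test function would change how fast the agreement sets in.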

2.19 Fourier Transform of the Shah Function


To get the Fourier transform of the shah function, we construct the sequence of true functions $G_1(t,T), G_2(t,T),\ldots,G_n(t,T),\ldots$ such that

\[
G_n(t,T)=\sum_{k=-n}^{n}g_n(t-kT), \tag{2.76a}
\]

where

\[
g_n(t)=\frac{\sin\big(2\pi(n+1)t\big)}{\pi t}. \tag{2.76b}
\]

From Eq. (2.67c), we have

\[
\operatorname*{G\,lim}_{n\to\infty}g_{n-1}(t)=\operatorname*{G\,lim}_{n\to\infty}\frac{\sin(2\pi nt)}{\pi t}=\delta(t).
\]

Since adding one to n does not make any difference in the limit, we end up with

\[
\operatorname*{G\,lim}_{n\to\infty}g_n(t)=\delta(t); \tag{2.76c}
\]

and from (2.71a) we get, again adding one to n,

\[
\frac{\sin\big(2\pi(n+1)t\big)}{\pi t}\leftrightarrow\Pi(f,n+1)\quad\text{for } n=1,2,\ldots. \tag{2.76d}
\]

FIGURE 2.9(a)–(c). [Three plots against t.] The formula for the t interval between the arrows is $T/(n+1/2)$ in all three plots. Figures 2.9(a), 2.9(b), and 2.9(c) show how the base width of the central lobe becomes ever narrower as n increases.

To find the generalized function that is the forward Fourier transform of the generalized limit of $G_n$ as $n\to\infty$, we must evaluate the forward Fourier transform of $G_n$ for finite n,

\[
\begin{aligned}
\mathcal{F}^{(-ift)}\big(G_n(t)\big)
&=\int_{-\infty}^{\infty}e^{-2\pi ift}\,G_n(t)\,dt
=\sum_{k=-n}^{n}\int_{-\infty}^{\infty}e^{-2\pi ift}\,g_n(t-kT)\,dt\\
&=\sum_{k=-n}^{n}e^{-2\pi ifkT}\int_{-\infty}^{\infty}e^{-2\pi ift'}\,g_n(t')\,dt',
\end{aligned}
\]

where in the last step the variable of integration has been changed to $t'=t-kT$. The Fourier transform inside the sum can be done using (2.76b) and (2.76d) to get

\[
\mathcal{F}^{(-ift)}\big(G_n(t)\big)=\Pi(f,n+1)\cdot\sum_{k=-n}^{n}e^{-2\pi ifkT}. \tag{2.77a}
\]

The sum $\sum_{k=-n}^{n}e^{-2\pi ifkT}$ is just a disguised form of geometric series. We can write

\[
\sum_{k=-n}^{n}e^{-2\pi ifkT}=\sum_{k=-n}^{n}w^{k}, \tag{2.77b}
\]

where

\[
w=e^{-2\pi ifT}
\]

and define

\[
S_n=\sum_{k=-n}^{n}w^{k}=\sum_{k=-n}^{n}e^{-2\pi ifkT}.
\]

Using the standard approach for calculating the sum of a geometric series, we note that multiplying every term in the sum by w increases each power of w in the sum by one. This is the same as adding $w^{n+1}$ and subtracting $w^{-n}$ from the original sum, giving

\[
wS_n=\sum_{k=-n+1}^{n+1}w^{k}=S_n+w^{n+1}-w^{-n}
\]

or

\[
S_n=\frac{w^{n+1}-w^{-n}}{w-1}.
\]

Hence, (2.77b) becomes

\[
\sum_{k=-n}^{n}e^{-2\pi ifkT}
=\frac{e^{-2\pi ifT(n+1)}-e^{-2\pi ifT(-n)}}{e^{-2\pi ifT}-1}
=\frac{e^{-2\pi ifT(n+1/2)}-e^{2\pi ifT(n+1/2)}}{e^{-\pi ifT}-e^{\pi ifT}}
=\frac{\sin\big(2\pi fT(n+1/2)\big)}{\sin(\pi fT)}, \tag{2.77c}
\]

which means Eq. (2.77a) can be written as

\[
\mathcal{F}^{(-ift)}\big(G_n(t)\big)=\frac{\sin\big(2\pi fT(n+1/2)\big)}{\sin(\pi fT)}\,\Pi(f,n+1). \tag{2.77d}
\]

The inverse Fourier transform of the forward Fourier transform returns the original function [see Eqs. (2.29b) and (2.29d)], so this last result lets us write

\[
G_n(t)\leftrightarrow\frac{\sin\big(2\pi fT(n+1/2)\big)}{\sin(\pi fT)}\,\Pi(f,n+1). \tag{2.77e}
\]
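The closed form (2.77c) is easy to confirm numerically for a few values of f, T, and n. In the sketch below the sample values are arbitrary illustrative choices; they only need to avoid $fT$ being an integer, where the sine ratio is a 0/0 limit.

```python
import numpy as np

def exponential_sum(f, T, n):
    """Left side of (2.77c): sum over k = -n..n of exp(-2*pi*i*f*k*T)."""
    k = np.arange(-n, n + 1)
    return np.sum(np.exp(-2j * np.pi * f * k * T))

def sine_ratio(f, T, n):
    """Right side of (2.77c): sin(2*pi*f*T*(n + 1/2)) / sin(pi*f*T)."""
    return np.sin(2.0 * np.pi * f * T * (n + 0.5)) / np.sin(np.pi * f * T)

# Hypothetical sample points (f, T, n)
samples = [(0.13, 1.0, 3), (0.37, 2.5, 10), (-1.21, 0.4, 25)]

for f, T, n in samples:
    s = exponential_sum(f, T, n)
    r = sine_ratio(f, T, n)
    print(f, T, n, s.real, s.imag, r)
```

The imaginary part of the sum vanishes (up to rounding) because the terms for k and −k are complex conjugates of each other.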

From the definition of the Fourier transform of a generalized function [see (2.59g)], we know that
taking the generalized limit of both sides of (2.77e) gives a Fourier transform relationship
between two generalized functions—all that needs to be done now is to find out what these
generalized functions are.
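To find the generalized function that is the generalized limit of $G_n$ as $n\to\infty$, we write for any test function $\phi$, using Eq. (2.76a), that

\[
\begin{aligned}
\int_{-\infty}^{\infty}\phi(t)\left[\operatorname*{G\,lim}_{n\to\infty}G_n(t)\right]dt
&=\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\,G_n(t)\,dt
=\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\left[\sum_{k=-n}^{n}g_n(t-kT)\right]dt\\
&=\lim_{n\to\infty}\sum_{k=-n}^{n}\int_{-\infty}^{\infty}\phi(t)\,g_n(t-kT)\,dt.
\end{aligned} \tag{2.77f}
\]

Equation (2.76c) states that the generalized limit of $g_n$ is the delta function, so

\[
\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\,g_n(t-kT)\,dt
=\int_{-\infty}^{\infty}\phi(t)\operatorname*{G\,lim}_{n\to\infty}g_n(t-kT)\,dt
=\int_{-\infty}^{\infty}\phi(t)\,\delta(t-kT)\,dt=\phi(kT),
\]

which means that

\[
\lim_{n\to\infty}\sum_{k=-n}^{n}\int_{-\infty}^{\infty}\phi(t)\,g_n(t-kT)\,dt=\sum_{k=-\infty}^{\infty}\phi(kT).
\]

Hence, Eq. (2.77f) can be written as

\[
\int_{-\infty}^{\infty}\phi(t)\left[\operatorname*{G\,lim}_{n\to\infty}G_n(t)\right]dt=\sum_{k=-\infty}^{\infty}\phi(kT). \tag{2.77g}
\]

But, just as in the discussion following Eq. (2.75a) above, we can regard

\[
\sum_{k=-\infty}^{\infty}\phi(kT)
\]

as the result of integrating the shah generalized function

\[
\mathrm{III}(t,T)=\sum_{k=-\infty}^{\infty}\delta(t-kT)
\]

with any test function $\phi$, since

\[
\int_{-\infty}^{\infty}\mathrm{III}(t,T)\,\phi(t)\,dt
=\int_{-\infty}^{\infty}\left[\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]\phi(t)\,dt
=\sum_{k=-\infty}^{\infty}\phi(kT).
\]

Therefore, (2.77g) can be written as

\[
\int_{-\infty}^{\infty}dt\;\phi(t)\left[\operatorname*{G\,lim}_{n\to\infty}G_n(t)\right]
=\int_{-\infty}^{\infty}\left[\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]\phi(t)\,dt \tag{2.77h}
\]

for any test function $\phi$, showing that

\[
\operatorname*{G\,lim}_{n\to\infty}G_n(t)=\sum_{k=-\infty}^{\infty}\delta(t-kT)=\mathrm{III}(t,T) \tag{2.77i}
\]

in the sense of Eq. (2.47b).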


The generalized function that is the generalized limit of the right-hand side of (2.77e) is multiplied by an arbitrary test function $\phi(f)$ and integrated over all f to get

\[
\begin{aligned}
\int_{-\infty}^{\infty}\phi(f)\left\{\operatorname*{G\,lim}_{n\to\infty}\left[\frac{\sin\big(2\pi fT(n+1/2)\big)}{\sin(\pi fT)}\,\Pi(f,n+1)\right]\right\}df
&=\lim_{n\to\infty}\int_{-(n+1)}^{n+1}\phi(f)\left[\frac{\sin\big(2\pi fT(n+1/2)\big)}{\sin(\pi fT)}\right]df\\
&=\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(f)\left[\frac{\sin\big(2\pi fT(n+1/2)\big)}{\sin(\pi fT)}\right]df,
\end{aligned} \tag{2.78a}
\]

where in the last step we recognize that the behavior of the sine ratio inside the square brackets [ ] is not affected by the endpoints for the region of integration as $n\to\infty$. Equations (2.56a) and (2.75e) show that

\[
\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(f)\left\{\frac{\sin\big(2\pi fT[n+(1/2)]\big)}{\sin(\pi fT)}\right\}df
=\int_{-\infty}^{\infty}\phi(f)\left[T^{-1}\sum_{k=-\infty}^{\infty}\delta\big(f-kT^{-1}\big)\right]df,
\]

which means that (2.78a) simplifies to

\[
\int_{-\infty}^{\infty}\phi(f)\left\{\operatorname*{G\,lim}_{n\to\infty}\left[\frac{\sin\big(2\pi fT(n+1/2)\big)}{\sin(\pi fT)}\,\Pi(f,n+1)\right]\right\}df
=\int_{-\infty}^{\infty}\phi(f)\left[T^{-1}\sum_{k=-\infty}^{\infty}\delta\big(f-kT^{-1}\big)\right]df
\]

for any test function $\phi(f)$. Therefore,

\[
\operatorname*{G\,lim}_{n\to\infty}\left[\frac{\sin\big(2\pi fT(n+1/2)\big)}{\sin(\pi fT)}\,\Pi(f,n+1)\right]=\frac{1}{T}\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{T}\right) \tag{2.78b}
\]

in the sense of Eq. (2.47b). Since the right-hand side of (2.78b) is, according to (2.75d), proportional to the shah function, we end up with

\[
\frac{1}{T}\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{T}\right)=\frac{1}{T}\,\mathrm{III}\big(f,T^{-1}\big). \tag{2.78c}
\]

Equations (2.78b) and (2.77i) let us take the generalized limits as $n\to\infty$ of both sides of (2.77e) to get

\[
\sum_{k=-\infty}^{\infty}\delta(t-kT)\leftrightarrow\frac{1}{T}\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{T}\right). \tag{2.78d}
\]

According to Eq. (2.75d), this can also be written as

\[
\mathrm{III}(t,T)\leftrightarrow\frac{1}{T}\,\mathrm{III}\big(f,T^{-1}\big). \tag{2.78e}
\]

These last two results can be modified to show explicitly that both the forward and the inverse Fourier transform of the shah function produce another shah function. We first write (2.78d) as the forward and inverse Fourier transforms,

\[
\int_{-\infty}^{\infty}e^{-2\pi ift}\left[\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt=\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right) \tag{2.79a}
\]

and

\[
\int_{-\infty}^{\infty}e^{2\pi ift}\left[\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df=\sum_{k=-\infty}^{\infty}\delta(t-kT). \tag{2.79b}
\]

The discussion following Eq. (2.52c) above shows that linear transformations of the variables of integration are allowed when using generalized functions, so we can change to $t'=-t$ in Eqs. (2.79a) and (2.79b) to get

\[
\int_{-\infty}^{\infty}e^{2\pi ift'}\left[\sum_{k=-\infty}^{\infty}\delta(-t'-kT)\right]dt'=\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)
\]

and

\[
\int_{-\infty}^{\infty}e^{-2\pi ift'}\left[\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df=\sum_{k=-\infty}^{\infty}\delta(-t'-kT).
\]

The sum over index k goes over all positive and negative integers, so we can change the sum’s index to $k'=-k$ and use that the delta function is even [see Eq. (2.68a)] to get

\[
\int_{-\infty}^{\infty}e^{2\pi ift'}\left[\sum_{k'=-\infty}^{\infty}\delta(t'-k'T)\right]dt'=\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)
\]

and

\[
\int_{-\infty}^{\infty}e^{-2\pi ift'}\left[\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df=\sum_{k'=-\infty}^{\infty}\delta(t'-k'T).
\]

Dropping the primes and combining these results with Eqs. (2.79a) and (2.79b) produces the more general formulas

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\left[\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt=\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right) \tag{2.79c}
\]

and

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\left[\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df=\sum_{k=-\infty}^{\infty}\delta(t-kT). \tag{2.79d}
\]

In fact, we can easily show that Eqs. (2.79c) and (2.79d) are really the same formula. First, we interchange the j, k indices and the f, t variables in Eq. (2.79c) so that it becomes

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\left[\sum_{j=-\infty}^{\infty}\delta(f-jT)\right]df=\frac{1}{T}\sum_{k=-\infty}^{\infty}\delta\!\left(t-\frac{k}{T}\right).
\]

Parameter T is arbitrary, so, just like in the analysis following Eq. (2.75d) above, it can be replaced everywhere by $T^{-1}$ to get

\[
\int_{-\infty}^{\infty}e^{\pm 2\pi ift}\left[\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df=T\sum_{k=-\infty}^{\infty}\delta(t-kT).
\]

After dividing through by T, we see that this last result is the same as Eq. (2.79d), showing that Eqs. (2.79c) and (2.79d) are really the same formula.

2.20 Fourier Series


Integral Fourier transforms are connected in a direct and straightforward way to both the Fourier
series and the discrete Fourier transform. This section shows the connection to the Fourier series
and the next section shows the connection to the discrete Fourier transform.24
We begin with an arbitrary, nonpathological function u(t) that has a well-defined Fourier integral transform. Function u can be complex-valued but its argument t must be real, and U(f) is the forward Fourier transform of u(t), so

\[
U(f)=\mathcal{F}^{(-ift)}\big(u(t)\big)=\int_{-\infty}^{\infty}u(t)\,e^{-2\pi ift}\,dt \tag{2.80a}
\]

and

\[
u(t)\leftrightarrow U(f). \tag{2.80b}
\]

From u(t), we create a new function $u^{[\infty]}(t,T)$ that repeats forever along the t axis at intervals of T,

\[
u^{[\infty]}(t,T)=\sum_{k=-\infty}^{\infty}u(t-kT). \tag{2.81a}
\]

Although perhaps redundant, it turns out that listing T as one of the arguments of $u^{[\infty]}$ is a convenient way to keep track of the connection between u and $u^{[\infty]}$. Function $u^{[\infty]}$ is called a periodic function of period T because, for any finite positive or negative integer m,

\[
u^{[\infty]}(t+mT,T)=u^{[\infty]}(t,T). \tag{2.81b}
\]

Figures 2.10(a) and 2.10(b) show the plots for both u and $u^{[\infty]}$ as functions of t. Since function u is left unspecified, $u^{[\infty]}$ can be thought of as representing an arbitrary periodic function. We can also define a function $u^{[N]}(t,T)$ by the formula

\[
u^{[N]}(t,T)=\sum_{k=-N}^{N}u(t-kT). \tag{2.81c}
\]

Clearly,

\[
\lim_{N\to\infty}u^{[N]}(t,T)=u^{[\infty]}(t,T). \tag{2.81d}
\]

We assume that $u^{[N]}$ is well behaved with respect to the test functions $\phi$, so that

\[
\lim_{N\to\infty}\int_{-\infty}^{\infty}\phi(t)\,u^{[N]}(t,T)\,dt=\int_{-\infty}^{\infty}\phi(t)\,u^{[\infty]}(t,T)\,dt. \tag{2.81e}
\]

FIGURE 2.10(a). [Plot of $u(t)$.]
FIGURE 2.10(b). [Plot of $u^{[\infty]}(t,T)$, with the period T marked.]
Figure 2.10(a) is a plot of $u(t)$. The solid curve in Fig. 2.10(b), shifted upward from its true position, is $u^{[\infty]}(t,T)$ and the dashed curves represent $u(t)$ displaced by multiples of T.

24 The analysis in Secs. 2.20 and 2.21 is adapted from A. Papoulis, Signal Analysis (McGraw-Hill Book Company, New York, 1977), pp. 76–81.

From (2.81e) and the definition of the generalized limit [see Eq. (2.56a)], we then know that

\[
\lim_{N\to\infty}\int_{-\infty}^{\infty}\phi(t)\,u^{[N]}(t,T)\,dt
=\int_{-\infty}^{\infty}\phi(t)\left[\operatorname*{G\,lim}_{N\to\infty}u^{[N]}(t,T)\right]dt
=\int_{-\infty}^{\infty}\phi(t)\,u^{[\infty]}(t,T)\,dt,
\]

from which it follows that

\[
\operatorname*{G\,lim}_{N\to\infty}u^{[N]}(t,T)=u^{[\infty]}(t,T) \tag{2.81f}
\]

in the sense of Eq. (2.48c).
Following the pattern of the definitions in (2.81a) and (2.81c), we define

\[
\delta^{[N]}(t,T)=\sum_{k=-N}^{N}\delta(t-kT) \tag{2.82a}
\]

and

\[
\delta^{[\infty]}(t,T)=\sum_{k=-\infty}^{\infty}\delta(t-kT). \tag{2.82b}
\]

Function $\delta^{[\infty]}(t,T)$ is clearly just another way of writing the shah function $\mathrm{III}(t,T)$. [The shah function is defined in Eq. (2.73) and shown equal to $\sum_{k=-\infty}^{\infty}\delta(t-kT)$ in Eq. (2.75d).] The convolution of the generalized function

\[
\delta^{[N]}(t,T)=\sum_{k=-N}^{N}\delta(t-kT)
\]

with the true function u(t) is

\[
u(t)\ast\delta^{[N]}(t,T)
=\int_{-\infty}^{\infty}u(t')\,\delta^{[N]}(t-t',T)\,dt'
=\sum_{k=-N}^{N}\int_{-\infty}^{\infty}u(t')\,\delta\big(t'-(t-kT)\big)\,dt'
=\sum_{k=-N}^{N}u(t-kT),
\]

where the next-to-last step uses $\delta(-x)=\delta(x)$ as shown in Eq. (2.68a). The definition of $u^{[N]}$ in (2.81c) then gives

\[
u^{[N]}(t,T)=u(t)\ast\delta^{[N]}(t,T). \tag{2.82c}
\]

Taking the integral Fourier transform of both sides, using the Fourier convolution theorem [see Eq. (2.72ℓ)], and remembering that U(f) is the forward Fourier transform of u(t), we get

\[
\begin{aligned}
\mathcal{F}^{(-ift)}\big(u^{[N]}(t,T)\big)
&=\mathcal{F}^{(-ift)}\big(u(t)\big)\cdot\mathcal{F}^{(-ift)}\big(\delta^{[N]}(t,T)\big)\\
&=U(f)\cdot\sum_{k=-N}^{N}\int_{-\infty}^{\infty}e^{-2\pi ift}\,\delta(t-kT)\,dt\\
&=U(f)\sum_{k=-N}^{N}e^{-2\pi ikfT}\\
&=U(f)\,\frac{\sin\big(2\pi fT(N+1/2)\big)}{\sin(\pi fT)},
\end{aligned} \tag{2.83a}
\]

where in the last step we substitute from Eq. (2.77c) above. Having now found that

\[
\mathcal{F}^{(-ift)}\big(u^{[N]}(t,T)\big)=U(f)\,\frac{\sin\big(2\pi fT(N+1/2)\big)}{\sin(\pi fT)},
\]

we take the inverse Fourier transform of both sides to get

\[
u^{[N]}(t,T)=\int_{-\infty}^{\infty}e^{2\pi ift}\,U(f)\,\frac{\sin\big(2\pi fT(N+1/2)\big)}{\sin(\pi fT)}\,df. \tag{2.83b}
\]

Taking the limit of both sides as $N\to\infty$, we get, using (2.81d), that

\[
u^{[\infty]}(t,T)=\lim_{N\to\infty}\int_{-\infty}^{\infty}e^{2\pi ift}\,U(f)\,\frac{\sin\big(2\pi fT(N+1/2)\big)}{\sin(\pi fT)}\,df. \tag{2.83c}
\]

Equations (2.56a) and (2.75e) can now be used to write

\[
\begin{aligned}
u^{[\infty]}(t,T)
&=\int_{-\infty}^{\infty}e^{2\pi ift}\,U(f)\operatorname*{G\,lim}_{N\to\infty}\left[\frac{\sin\big(2\pi fT(N+1/2)\big)}{\sin(\pi fT)}\right]df\\
&=\int_{-\infty}^{\infty}e^{2\pi ift}\,U(f)\,\frac{1}{T}\left[\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{T}\right)\right]df
\end{aligned}
\]

or

\[
u^{[\infty]}(t,T)=\sum_{k=-\infty}^{\infty}\big[T^{-1}U(k/T)\big]\,e^{2\pi i\frac{kt}{T}}. \tag{2.83d}
\]

Equation (2.83d) specifies the Fourier series for an arbitrary periodic function $u^{[\infty]}$, showing that $u^{[\infty]}$ can be written as the infinite sum of complex exponentials multiplied by the complex constants $[T^{-1}U(k/T)]$. To get these complex constants directly from $u^{[\infty]}$, we note that for any real number $\tau$ and integer m,

\[
\begin{aligned}
\frac{1}{T}\,U\!\left(\frac{m}{T}\right)
&=\frac{1}{T}\int_{-\infty}^{\infty}u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt
=\lim_{N\to\infty}\frac{1}{T}\left\{\int_{\tau-NT}^{\tau+(N+1)T}u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt\right\}\\
&=\lim_{N\to\infty}\frac{1}{T}\Bigg\{\int_{\tau-NT}^{\tau-(N-1)T}u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt+\int_{\tau-(N-1)T}^{\tau-(N-2)T}u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt+\cdots\\
&\qquad\qquad+\int_{\tau-T}^{\tau}u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt+\int_{\tau}^{\tau+T}u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt\\
&\qquad\qquad+\int_{\tau+T}^{\tau+2T}u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt+\cdots+\int_{\tau+NT}^{\tau+(N+1)T}u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt\Bigg\}.
\end{aligned}
\]

This can be simplified to

\[
\frac{1}{T}\,U\!\left(\frac{m}{T}\right)=\lim_{N\to\infty}\frac{1}{T}\sum_{k=-N}^{N}\int_{\tau+kT}^{\tau+(k+1)T}e^{-2\pi i\frac{m}{T}t}\,u(t)\,dt. \tag{2.83e}
\]

For each value of k, we change the variable of integration to $t'=t-kT$ so that

\[
\int_{\tau+kT}^{\tau+(k+1)T}e^{-2\pi i\frac{m}{T}t}\,u(t)\,dt
=e^{-2\pi imk}\int_{\tau}^{\tau+T}e^{-2\pi i\frac{m}{T}t'}\,u(t'+kT)\,dt'
=\int_{\tau}^{\tau+T}e^{-2\pi i\frac{m}{T}t'}\,u(t'+kT)\,dt',
\]

where we use that $e^{-2\pi imk}=1$. Substituting this into (2.83e) gives

\[
\frac{1}{T}\,U\!\left(\frac{m}{T}\right)
=\lim_{N\to\infty}\frac{1}{T}\sum_{k=-N}^{N}\int_{\tau}^{\tau+T}e^{-2\pi i\frac{m}{T}t'}\,u(t'+kT)\,dt'
=\lim_{N\to\infty}\frac{1}{T}\int_{\tau}^{\tau+T}e^{-2\pi i\frac{m}{T}t'}\left[\sum_{k'=-N}^{N}u(t'-k'T)\right]dt',
\]

where in the last step we have replaced index k by index $k'=-k$. Now, taking the limit inside the integral to get the generalized limit [see Eq. (2.56a) above], we rely on (2.81f) to get

\[
\frac{1}{T}\,U\!\left(\frac{m}{T}\right)
=\frac{1}{T}\int_{\tau}^{\tau+T}e^{-2\pi i\frac{m}{T}t'}\operatorname*{G\,lim}_{N\to\infty}\left[\sum_{k'=-N}^{N}u(t'-k'T)\right]dt'
=\frac{1}{T}\int_{\tau}^{\tau+T}e^{-2\pi i\frac{m}{T}t'}\,u^{[\infty]}(t',T)\,dt'. \tag{2.83f}
\]

2 · Fourier Theory

Equations (2.83d) and (2.83f) let us put the Fourier series into its standard form. For any
periodic function
$$
v(t) = u^{[\infty]}(t,T) = \sum_{k=-\infty}^{\infty} u(t-kT)
$$
of period $T$, we have found that
$$
v(t) = \sum_{k=-\infty}^{\infty} A_k\,e^{2\pi i k \frac{t}{T}} , \tag{2.84a}
$$
where
$$
A_k = \frac{1}{T}\int_{\tau}^{\tau+T} e^{-2\pi i \frac{k}{T} t}\,v(t)\,dt \tag{2.84b}
$$

for any finite value of $\tau$. Because we did not require $u(t)$ to be real in (2.80a), Eqs. (2.83d),
(2.83f), (2.84a), and (2.84b) still hold true for complex periodic functions with real arguments $t$.
It is customary, but of course not mandatory, to choose $\tau = 0$ or $\tau = -T/2$ in (2.84b).
Using $v(t) = u^{[\infty]}(t,T)$, we know from Eqs. (2.83d), (2.83f), (2.84a), and (2.84b) that the $A_k$
coefficients can be specified in terms of the forward Fourier transform $U(f)$ of $u(t)$,

$$
A_k = \frac{1}{T}\,U\!\left(\frac{k}{T}\right) . \tag{2.85a}
$$
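
The relation (2.85a) can be checked numerically. The following is a minimal Python sketch, not part of the original derivation: it assumes a Gaussian test function $u(t) = e^{-\pi t^2}$ (whose forward transform is $U(f) = e^{-\pi f^2}$), truncates the periodizing sum to a finite number of terms, and approximates the integral in (2.84b) with a rectangle-rule sum. The function names and the parameters `k_max` and `steps` are illustrative choices.

```python
import cmath
import math


def u(t):
    """Gaussian test function (an assumption made for this sketch)."""
    return math.exp(-math.pi * t * t)


def U(f):
    """Forward Fourier transform of the Gaussian above: U(f) = exp(-pi f^2)."""
    return math.exp(-math.pi * f * f)


def periodized(t, T, k_max=20):
    """v(t) = sum over k of u(t - kT), truncated to |k| <= k_max."""
    return sum(u(t - k * T) for k in range(-k_max, k_max + 1))


def fourier_coefficient(k, T, tau=0.0, steps=400):
    """Approximate A_k of Eq. (2.84b) with a rectangle-rule sum over one period."""
    dt = T / steps
    total = 0j
    for j in range(steps):
        t = tau + j * dt
        total += cmath.exp(-2j * math.pi * k * t / T) * periodized(t, T)
    return total * dt / T


if __name__ == "__main__":
    T = 2.0
    for k in range(4):
        from_integral = fourier_coefficient(k, T, tau=0.0)
        shifted = fourier_coefficient(k, T, tau=-T / 2)
        from_transform = U(k / T) / T  # right-hand side of Eq. (2.85a)
        print(k, abs(from_integral - from_transform), abs(shifted - from_transform))
```

Both choices of $\tau$ should give differences at the level of rounding error, illustrating that the starting point of the integration interval does not matter.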

When $u$ is real, which means that $v(t) = u^{[\infty]}(t,T)$ is also real, we know from entry 7 of Table
2.1 (located at the end of this chapter) that $U(f)$ must be Hermitian so that

$$
U(-f) = U(f)^{*} .
$$

Hence, when $v(t)$ is real in (2.84a), it then follows from (2.85a) that

$$
A_{-k} = A_k^{*} \tag{2.85b}
$$

in (2.84b). This procedure can be extended to all the entries in Table 2.1, giving us the entries in
Table 2.2 (also located at the end of this chapter). To go through another example, if $u$ is
imaginary and odd, we know from entry 3 of Table 2.1 that $U$ is real and odd, so

$$
U(-f) = -U(f) \quad\text{and}\quad \operatorname{Im}[U(f)] = 0 .
$$


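Equation (2.85a) then shows that

$$
A_{-k} = -A_k \quad\text{and}\quad \operatorname{Im}(A_k) = 0 . \tag{2.85c}
$$

We can show that $v(t) = u^{[\infty]}(t,T)$ is imaginary and odd when $u$ is imaginary and odd (let
$k' = -k$),

$$
\begin{aligned}
v(-t) = u^{[\infty]}(-t,T) &= \sum_{k=-\infty}^{\infty} u(-t-kT) = \sum_{k'=-\infty}^{\infty} u(-t+k'T) = -\sum_{k'=-\infty}^{\infty} u(t-k'T) \\
&= -u^{[\infty]}(t,T) = -v(t) ,
\end{aligned}
$$
and
$$
\operatorname{Re}[v(t)] = \sum_{k=-\infty}^{\infty}\operatorname{Re}[u(t-kT)] = 0 .
$$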

This shows that we end up with (2.85c) associated with v(t) being imaginary and odd, as stated in
entry 3 of Table 2.2.
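A final point worth mentioning about Fourier series is that the $A_k$ coefficients are often
reshuffled so that the series can be written as a sum of sines and cosines. Equation (2.84a) can be
rewritten as, using $e^{i\theta} = \cos\theta + i\sin\theta$,

$$
\begin{aligned}
v(t) &= A_0 + \sum_{k=1}^{\infty}\left[A_{-|k|}\,e^{-2\pi i |k| \frac{t}{T}} + A_{|k|}\,e^{2\pi i |k| \frac{t}{T}}\right] \\
&= A_0 + \sum_{k=1}^{\infty}\left[A_{-|k|} + A_{|k|}\right]\cos\!\left(\frac{2\pi |k| t}{T}\right) + \sum_{k=1}^{\infty} i\left[A_{|k|} - A_{-|k|}\right]\sin\!\left(\frac{2\pi |k| t}{T}\right) .
\end{aligned} \tag{2.86a}
$$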

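From Eq. (2.84b), we get

$$
A_0 = \frac{1}{T}\int_{\tau}^{\tau+T} v(t)\,dt , \tag{2.86b}
$$

$$
A_{-|k|} + A_{|k|} = \frac{1}{T}\int_{\tau}^{\tau+T} v(t)\left[e^{2\pi i |k| \frac{t}{T}} + e^{-2\pi i |k| \frac{t}{T}}\right] dt = \frac{2}{T}\int_{\tau}^{\tau+T} v(t)\cos\!\left(\frac{2\pi |k| t}{T}\right) dt , \tag{2.86c}
$$

and

$$
i\left[A_{|k|} - A_{-|k|}\right] = \frac{i}{T}\int_{\tau}^{\tau+T} v(t)\left[e^{-2\pi i |k| \frac{t}{T}} - e^{2\pi i |k| \frac{t}{T}}\right] dt = \frac{2}{T}\int_{\tau}^{\tau+T} v(t)\sin\!\left(\frac{2\pi |k| t}{T}\right) dt . \tag{2.86d}
$$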


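Putting these results together, we can write

$$
v(t) = \frac{c_0}{2} + \sum_{k=1}^{\infty} c_k\cos\!\left(\frac{2\pi k t}{T}\right) + \sum_{k=1}^{\infty} s_k\sin\!\left(\frac{2\pi k t}{T}\right) , \tag{2.87a}
$$
where
$$
c_k = \frac{2}{T}\int_{\tau}^{\tau+T} v(t)\cos\!\left(\frac{2\pi k t}{T}\right) dt \quad\text{for } k = 0, 1, 2, \ldots \tag{2.87b}
$$
and
$$
s_k = \frac{2}{T}\int_{\tau}^{\tau+T} v(t)\sin\!\left(\frac{2\pi k t}{T}\right) dt \quad\text{for } k = 1, 2, 3, \ldots . \tag{2.87c}
$$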
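
As an illustration of (2.87a)-(2.87c), the short Python sketch below evaluates $c_k$ and $s_k$ for a simple test signal whose coefficients are known by inspection. The test signal, the rectangle-rule integration, and the names `trig_coefficients` and `example_signal` are assumptions made for this example only.

```python
import math


def trig_coefficients(v, T, k_max, tau=0.0, steps=1024):
    """Approximate c_k (k = 0..k_max) and s_k (k = 1..k_max) of Eqs. (2.87b), (2.87c)."""
    dt = T / steps
    ts = [tau + j * dt for j in range(steps)]
    c = []
    s = [0.0]  # placeholder so that s[k] lines up with index k; s_0 is not used in (2.87a)
    for k in range(k_max + 1):
        c.append((2.0 / T) * dt * sum(v(t) * math.cos(2 * math.pi * k * t / T) for t in ts))
        if k >= 1:
            s.append((2.0 / T) * dt * sum(v(t) * math.sin(2 * math.pi * k * t / T) for t in ts))
    return c, s


def example_signal(t, T=3.0):
    """Test signal with c_0/2 = 3, c_1 = 2, s_2 = -5 and all other coefficients zero."""
    return 3.0 + 2.0 * math.cos(2 * math.pi * t / T) - 5.0 * math.sin(4 * math.pi * t / T)


# tau = -T/2 is one of the customary choices; any other value gives the same result.
c, s = trig_coefficients(example_signal, T=3.0, k_max=3, tau=-1.5)
print([round(x, 6) for x in c], [round(x, 6) for x in s])
```

Note that the constant term of the signal is recovered as $c_0/2$, which is why $A_0$ is replaced by $c_0/2$ in (2.87a).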

The absolute value signs are dropped from index $k$ because it is defined positive in (2.87a), and
$A_0$ is replaced by $c_0/2$ so that the formula for $c_0$ can be folded into the general formula for $c_k$ in
(2.87b). Although it is still not mandatory, parameter $\tau$ is usually given the value $0$ or $-T/2$.
Nowhere has $v$ been required to be real, so Eqs. (2.87a)–(2.87c), just like Eqs. (2.84a) and
(2.84b), still hold true when $v$ is a complex-valued periodic function of (real) period $T$. Indeed, if
$v$ is a complex-valued function of a real argument $t$, both its real part

$$
v_R(t) = \operatorname{Re}[v(t)]
$$
and its imaginary part
$$
v_I(t) = \operatorname{Im}[v(t)]
$$

are real-valued periodic functions of period $T$. This means that when, for any integer $m$, we have

$$
v(t \pm mT) = v(t) \tag{2.88a}
$$

for a complex-valued function $v$ of a real argument, then

$$
v_R(t \pm mT) = v_R(t) \tag{2.88b}
$$
and
$$
v_I(t \pm mT) = v_I(t) . \tag{2.88c}
$$

Since sines and cosines of real arguments are strictly real, we can now take the real and
imaginary parts of (2.87a)–(2.87c) to get


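$$
v_R(t) = \frac{\operatorname{Re}(c_0)}{2} + \sum_{k=1}^{\infty}\left[\operatorname{Re}(c_k)\right]\cos\!\left(\frac{2\pi k t}{T}\right) + \sum_{k=1}^{\infty}\left[\operatorname{Re}(s_k)\right]\sin\!\left(\frac{2\pi k t}{T}\right) , \tag{2.89a}
$$
with
$$
\operatorname{Re}(c_k) = \frac{2}{T}\int_{\tau}^{\tau+T} v_R(t)\cos\!\left(\frac{2\pi k t}{T}\right) dt \quad\text{for } k = 0, 1, 2, \ldots \tag{2.89b}
$$
and
$$
\operatorname{Re}(s_k) = \frac{2}{T}\int_{\tau}^{\tau+T} v_R(t)\sin\!\left(\frac{2\pi k t}{T}\right) dt \quad\text{for } k = 1, 2, 3, \ldots , \tag{2.89c}
$$
as well as
$$
v_I(t) = \frac{\operatorname{Im}(c_0)}{2} + \sum_{k=1}^{\infty}\left[\operatorname{Im}(c_k)\right]\cos\!\left(\frac{2\pi k t}{T}\right) + \sum_{k=1}^{\infty}\left[\operatorname{Im}(s_k)\right]\sin\!\left(\frac{2\pi k t}{T}\right) , \tag{2.90a}
$$
with
$$
\operatorname{Im}(c_k) = \frac{2}{T}\int_{\tau}^{\tau+T} v_I(t)\cos\!\left(\frac{2\pi k t}{T}\right) dt \quad\text{for } k = 0, 1, 2, \ldots \tag{2.90b}
$$
and
$$
\operatorname{Im}(s_k) = \frac{2}{T}\int_{\tau}^{\tau+T} v_I(t)\sin\!\left(\frac{2\pi k t}{T}\right) dt \quad\text{for } k = 1, 2, 3, \ldots . \tag{2.90c}
$$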

2.21 Discrete Fourier Transform


The first step in going from the integral Fourier transform to the discrete Fourier transform is to
repeat the procedure used in Sec. 2.20 to get the Fourier series. We pick a nonpathological
function $u(t)$ having a forward Fourier transform

$$
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt \tag{2.91a}
$$

and, following the same procedure used in Eq. (2.81a) above, create a periodic function of period
$T$:
$$
u^{[\infty]}(t,T) = \sum_{k=-\infty}^{\infty} u(t-kT) . \tag{2.91b}
$$

As was shown in Sec. 2.20, we can now write the associated Fourier series as [see Eq. (2.83d)]


$$
u^{[\infty]}(t,T) = \frac{1}{T}\sum_{k=-\infty}^{\infty} U\!\left(\frac{k}{T}\right) e^{2\pi i \frac{kt}{T}} , \tag{2.91c}
$$

where, as specified in (2.91a), U is the forward Fourier transform of u.


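Next we divide the period $T$ of $u^{[\infty]}$ into $N$ equal lengths, $\Delta t = T/N$, and evaluate (2.91c) only
for $t = m\,\Delta t$ with $m = 0, 1, 2, \ldots, N-1$,

$$
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{k=-\infty}^{\infty} U\!\left(\frac{k}{T}\right) e^{2\pi i \frac{km}{N}} , \tag{2.92a}
$$

where we have used

$$
N\,\Delta t = T \tag{2.92b}
$$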

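to simplify the exponent of (2.92a). The infinite sum in (2.92a) can be split in two by making the
substitution $k = n + rN$ with $n = 0, 1, 2, \ldots, N-1$ and $r = 0, \pm 1, \pm 2, \ldots$. This gives

$$
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{r=-\infty}^{\infty}\sum_{n=0}^{N-1} U\!\left(\frac{n+rN}{T}\right) e^{2\pi i \frac{nm}{N}}\,e^{2\pi i r m} .
$$

Since $e^{2\pi i r m} = 1$ and $T = N\,\Delta t$, this becomes, making the index substitution $r' = -r$,

$$
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{n=0}^{N-1} e^{2\pi i \frac{nm}{N}}\sum_{r'=-\infty}^{\infty} U\!\left(\frac{n}{T} - \frac{r'}{\Delta t}\right)
$$
or
$$
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{n=0}^{N-1} e^{2\pi i \frac{nm}{N}}\,U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) , \tag{2.93a}
$$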

where we follow the pattern of Eqs. (2.81a) and (2.91b) and define

$$
U^{[\infty]}(f,F) = \sum_{r=-\infty}^{\infty} U(f - rF) \tag{2.93b}
$$

for any two frequencies $f$ and $F$.


Equation (2.93a) is a somewhat disguised version of the discrete Fourier transform (DFT).
Figures 2.11(a) and 2.11(b) show the relationship of the two periodic functions $u^{[\infty]}$ and $U^{[\infty]}$,
graphed with solid lines, to the two original functions u and U graphed with dashed lines. [In
graphs such as these, u(t) typically stands for data and is usually real, making it easy to represent


with a two-dimensional plot; but its transform $U(f)$ is often complex, so it makes more sense to
plot $|U(f)|$ if we just want to show where $U(f)$ is different from zero.] When function $u^{[\infty]}$ has
period $T$ and is uniformly sampled at intervals of $\Delta t$, then function $U^{[\infty]}$ has period

$$
F = \frac{1}{\Delta t} \tag{2.93c}
$$
and is uniformly sampled at intervals of
$$
\Delta f = \frac{1}{T} . \tag{2.93d}
$$

Note, of course, we could also say that $u^{[\infty]}$ has period $1/\Delta f$ and is uniformly sampled at
intervals of $1/F$ when $U^{[\infty]}$ has period $F$ and is sampled at intervals of $\Delta f$. When both $\Delta f$ and $\Delta t$
are known, we have from (2.92b) and (2.93d) that

$$
\Delta f \cdot \Delta t = \frac{1}{N} . \tag{2.93e}
$$

Figures 2.12(a) and 2.12(b) show that if $T$ and $F$ are large and functions $u(t)$ and $U(f)$ die away
relatively quickly when $|t|$ and $|f|$ are large, which means that $u$ and $U$ are localized near the $t$
and $f$ origins, then the corresponding periodic functions $u^{[\infty]}(t,T)$ and $U^{[\infty]}(f,F)$ can be used
to approximate the non-negligible regions of $u$ and $U$. Almost always when the DFT is used, its
users have in mind a situation such as that shown in Figs. 2.12(a) and 2.12(b), with $u^{[\infty]}$ and $U^{[\infty]}$
being good approximations of $u$ and $U$ for small to moderately large values of $t$ and $f$.
To complete the DFT transform pair, we define

$$
w_N = e^{\frac{2\pi i}{N}} \tag{2.94a}
$$

and write (2.93a) as

$$
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{n=0}^{N-1} w_N^{nm}\,U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) . \tag{2.94b}
$$

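Multiplying both sides by $w_N^{-mk}$ and summing over $m$ gives

$$
\sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,w_N^{-mk} = \frac{1}{T}\sum_{n=0}^{N-1}\left\{U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right)\cdot\left[\sum_{m=0}^{N-1} w_N^{m\cdot(n-k)}\right]\right\} . \tag{2.94c}
$$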


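FIGURE 2.11(a). [Graph of $u^{[\infty]}(t,T)$ versus $t$, with period $T = 1/\Delta f$ and sample spacing $\Delta t = 1/F$.]

FIGURE 2.11(b). [Graph of $U^{[\infty]}(f,F)$ versus $f$, with period $F = 1/\Delta t$ and sample spacing $\Delta f = 1/T$.]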

The sum over $m$ on the right-hand side is the sum of a geometric series,

$$
V_{n,k}^{[N]} = \sum_{m=0}^{N-1} w_N^{m(n-k)} . \tag{2.94d}
$$

This can be solved using the standard procedure for geometric sums [see the analysis following
Eq. (2.77b) above], multiplying every term in the sum by $w_N^{n-k}$ to get

$$
V_{n,k}^{[N]}\cdot w_N^{n-k} = V_{n,k}^{[N]} + w_N^{N\cdot(n-k)} - 1 . \tag{2.94e}
$$

Solving for $V_{n,k}^{[N]}$ gives

$$
V_{n,k}^{[N]} = \frac{1 - w_N^{N\cdot(n-k)}}{1 - w_N^{n-k}} = \frac{1 - e^{2\pi i (n-k)}}{1 - e^{2\pi i\left(\frac{n-k}{N}\right)}} , \tag{2.94f}
$$


where in the last step definition (2.94a) is used to eliminate $w_N$. Index $n$ goes from zero to $N-1$
for each value of $k$ [see Eqs. (2.94b) and (2.94c)]. Deciding also to restrict $k$ to one of the integers
$k = 0, 1, 2, \ldots, N-1$, we see that the denominator in (2.94f) can be zero only when $n = k$. This
looks like it could be a problem, but when $n = k$, we can return to the original formula in (2.94d),
noting that for $n = k$ the sum $V_{n,k}^{[N]}$ is equal to $N$. When $n \neq k$, the right-hand side of (2.94f) shows
that $V_{n,k}^{[N]}$ is zero because $e^{2\pi i (n-k)} = 1$. We conclude that

$$
V_{n,k}^{[N]} = \left\{\begin{array}{ll} N & \text{for } n = k \\ 0 & \text{for } n \neq k \end{array}\right\} = N\,\delta_{k,n} , \tag{2.94g}
$$

where $\delta_{k,n}$ is the Kronecker delta,

$$
\delta_{k,n} = \begin{cases} 1 & \text{for } n = k \\ 0 & \text{for } n \neq k \end{cases} . \tag{2.94h}
$$
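
The orthogonality result (2.94g) is easy to confirm numerically. The minimal Python sketch below evaluates the geometric sum of (2.94d) directly for every pair of indices; the function name `geometric_sum` and the choice $N = 8$ are illustrative assumptions.

```python
import cmath
import math


def geometric_sum(n, k, N):
    """Direct evaluation of V_{n,k}^{[N]} = sum_{m=0}^{N-1} w_N^{m(n-k)}, Eq. (2.94d)."""
    w_N = cmath.exp(2j * math.pi / N)  # Eq. (2.94a)
    return sum(w_N ** (m * (n - k)) for m in range(N))


N = 8
for n in range(N):
    row = [round(abs(geometric_sum(n, k, N)), 6) for k in range(N)]
    print(row)  # each row holds N on the diagonal and 0 elsewhere, up to rounding
```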

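Substitution of (2.94d) into (2.94c) gives

$$
\sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,w_N^{-mk} = \frac{1}{T}\sum_{n=0}^{N-1}\left\{U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right)\cdot V_{n,k}^{[N]}\right\} .
$$

Substituting from (2.94g), we get

$$
\frac{N}{T}\,U^{[\infty]}\!\left(\frac{k}{T}, \frac{1}{\Delta t}\right) = \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,w_N^{-mk} . \tag{2.94i}
$$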

This equation is the other half of the DFT [the first half is specified by Eqs. (2.94a) and (2.94b)].
Using Eqs. (2.94a) and (2.92b) to replace $w_N$ by $e^{(2\pi i)/N}$ and $N/T$ by $1/\Delta t$, we write (2.94b)
and (2.94i) as
$$
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{n=0}^{N-1} e^{2\pi i\left(\frac{mn}{N}\right)}\,U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) \tag{2.95a}
$$
and
$$
U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) = \Delta t\sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,e^{-2\pi i\left(\frac{mn}{N}\right)} , \tag{2.95b}
$$


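FIGURE 2.12(a). [Graph of $u^{[\infty]}(t,T)$ versus $t$, with period $T = 1/\Delta f$ and sample spacing $\Delta t = 1/F$; a bracket marks the region over which $u^{[\infty]} \approx u$.]

FIGURE 2.12(b). [Graph of $U^{[\infty]}(f,F)$ versus $f$, with period $F = 1/\Delta t$ and sample spacing $\Delta f = 1/T$; a bracket marks the region over which $U^{[\infty]} \approx U$.]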


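where index $k$ has been replaced by $n$ in (2.94i). This can also be written as, using Eqs. (2.93c)
and (2.93d),
$$
u^{[\infty]}(m\,\Delta t, T) = \Delta f\sum_{n=0}^{N-1} e^{2\pi i\left(\frac{mn}{N}\right)}\,U^{[\infty]}(n\,\Delta f, F) \tag{2.95c}
$$
and
$$
U^{[\infty]}(n\,\Delta f, F) = \Delta t\sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,e^{-2\pi i\left(\frac{mn}{N}\right)} . \tag{2.95d}
$$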
m 0

The inverse and forward DFTs shown in (2.95c) and (2.95d) are often written as

$$
u_m = \sum_{n=0}^{N-1} U_n\,e^{2\pi i\left(\frac{mn}{N}\right)} \tag{2.96a}
$$
and
$$
U_n = \frac{1}{N}\sum_{m=0}^{N-1} u_m\,e^{-2\pi i\left(\frac{mn}{N}\right)} . \tag{2.96b}
$$

To get Eq. (2.96a) from (2.95c), we define

$$
u_m = u^{[\infty]}(m\,\Delta t, T) \tag{2.96c}
$$
and
$$
U_n = \Delta f\cdot U^{[\infty]}(n\,\Delta f, F) , \tag{2.96d}
$$

and to get Eq. (2.96b), both sides of (2.95d) are multiplied by $\Delta f$, using (2.93e) to replace $\Delta f\cdot\Delta t$
by $1/N$. We can also define
$$
U_n = U^{[\infty]}(n\,\Delta f, F) \tag{2.97a}
$$
and
$$
u_m = \Delta t\cdot u^{[\infty]}(m\,\Delta t, T) \tag{2.97b}
$$

to transform Eqs. (2.95c) and (2.95d) into

$$
u_m = \frac{1}{N}\sum_{n=0}^{N-1} U_n\,e^{2\pi i\left(\frac{mn}{N}\right)} \tag{2.97c}
$$

and
$$
U_n = \sum_{m=0}^{N-1} u_m\,e^{-2\pi i\left(\frac{mn}{N}\right)} , \tag{2.97d}
$$
m 0

-- 187
187 --
2 · Fourier Theory

where now we have multiplied both sides of (2.95c) by $\Delta t$ before replacing $\Delta f\cdot\Delta t$ by $1/N$.
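
The transform pair (2.97c) and (2.97d) can be written out directly as two short Python functions. This is a minimal sketch that evaluates the sums term by term (an $O(N^2)$ computation, not the FFT); the function names `dft_forward` and `dft_inverse` are illustrative.

```python
import cmath
import math


def dft_forward(u):
    """U_n = sum_{m=0}^{N-1} u_m exp(-2 pi i m n / N), Eq. (2.97d)."""
    N = len(u)
    return [sum(u[m] * cmath.exp(-2j * math.pi * m * n / N) for m in range(N))
            for n in range(N)]


def dft_inverse(U):
    """u_m = (1/N) sum_{n=0}^{N-1} U_n exp(2 pi i m n / N), Eq. (2.97c)."""
    N = len(U)
    return [sum(U[n] * cmath.exp(2j * math.pi * m * n / N) for n in range(N)) / N
            for m in range(N)]


samples = [1.0, 2.0, 0.0, -1.0, 0.5]
recovered = dft_inverse(dft_forward(samples))
print([round(abs(a - b), 12) for a, b in zip(samples, recovered)])
```

Applying the forward transform and then the inverse transform returns the original samples up to rounding error, which is the practical content of the orthogonality relation (2.94g).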


Figures 2.13(a) and 2.13(b) show how the continuous functions $u^{[\infty]}$ and $U^{[\infty]}$ are sampled to
create the DFT formulas in the previous paragraph. The values of the original functions $u$ and $U$
are ignored for negative values of $t$ and $f$; instead, we sample $u^{[\infty]}$ and $U^{[\infty]}$ out to $t = T$ and $f = F$,
picking up the original $u$ and $U$ values at negative $t$ and $f$ where they repeat near $t = T$ and $f = F$.
Many times DFT plots show $u_m$ and $U_n$ with $n$ and $m$ running from $0$ to $N-1$. When this is done,
it is with the understanding that the large index values greater than $N/2$ represent $u$ and $U$ for
negative $t$ and $f$ values respectively.
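
The index convention just described can be sketched as a small Python helper that converts a DFT index into the frequency it represents. The names `grid_parameters` and `index_to_frequency` are hypothetical, and assigning the index $n = N/2$ (when $N$ is even) to the positive side is an arbitrary choice made for this example.

```python
def grid_parameters(N, delta_t):
    """Return (T, F, delta_f) for N samples spaced delta_t apart."""
    T = N * delta_t        # Eq. (2.92b)
    F = 1.0 / delta_t      # Eq. (2.93c)
    delta_f = 1.0 / T      # Eq. (2.93d)
    return T, F, delta_f


def index_to_frequency(n, N, delta_t):
    """Frequency represented by DFT index n, with indices above N/2 mapped to negative f."""
    _, _, delta_f = grid_parameters(N, delta_t)
    return n * delta_f if n <= N // 2 else (n - N) * delta_f


N, delta_t = 8, 0.125
print(grid_parameters(N, delta_t))
print([index_to_frequency(n, N, delta_t) for n in range(N)])
```

The same mapping applies to the time index $m$, with $\Delta t$ in place of $\Delta f$.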

2.22 Aliasing as an Error


The DFT is important because there is an algorithm, called the fast Fourier transform (FFT), that
allows computers to calculate the sums in Eqs. (2.96a), (2.96b), (2.97c), and (2.97d) rapidly when
$N$ is a multiple of 2. The FFT performs best when $N = 2^j$ for $j$ a positive integer. In fact, when
faced with calculating an integral Fourier transform

$$
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt
$$

over a range of $f$ values for an arbitrary function $u(t)$, it is standard practice to convert the
integral to a DFT and do the job on a computer with an FFT. As we saw in the previous section,
the DFT deals directly with $u^{[\infty]}$ and $U^{[\infty]}$ rather than $u$ and $U$. Thus, successfully using the DFT
to calculate the integral transform requires that $u^{[\infty]}$ and $U^{[\infty]}$ consist of well-separated, repetitive
regions of $u$ and $U$, as shown in Figs. 2.12(a) and 2.12(b), instead of overlapping regions of $u$ and
$U$, as shown in Figs. 2.11(a) and 2.11(b). Ensuring that $u^{[\infty]}$ consists of nonoverlapping regions
of $u$ tends to occur naturally; the shape of $u$ is already known so there is no real difficulty in
picking $T$ large enough to prevent significant amounts of overlap in $u^{[\infty]}$. The shape of $U$,
however, is not known in advance, so care must be taken to avoid significant amounts of overlap
in $U^{[\infty]}$.
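
To make the role of the FFT mentioned at the start of this section concrete, here is a minimal recursive radix-2 sketch in Python, compared against a direct evaluation of the sum in (2.97d). It is one textbook form of the Cooley-Tukey idea, offered as an illustration rather than as the specific algorithm the text has in mind; it only accepts lengths $N = 2^j$, and the function names are illustrative.

```python
import cmath
import math


def dft(u):
    """Direct O(N^2) evaluation of Eq. (2.97d)."""
    N = len(u)
    return [sum(u[m] * cmath.exp(-2j * math.pi * m * n / N) for m in range(N))
            for n in range(N)]


def fft(u):
    """Recursive radix-2 FFT of Eq. (2.97d); len(u) must be a power of 2."""
    N = len(u)
    if N == 1:
        return [complex(u[0])]
    if N % 2 != 0:
        raise ValueError("length must be a power of 2")
    even = fft(u[0::2])  # transform of the even-indexed samples
    odd = fft(u[1::2])   # transform of the odd-indexed samples
    out = [0j] * N
    for n in range(N // 2):
        twiddle = cmath.exp(-2j * math.pi * n / N) * odd[n]
        out[n] = even[n] + twiddle
        out[n + N // 2] = even[n] - twiddle
    return out


data = [0.0, 1.0, 0.5, -0.25, 2.0, -1.0, 0.0, 3.0]
print(max(abs(a - b) for a, b in zip(fft(data), dft(data))))
```

The recursion halves the problem at each level, which is why the operation count grows like $N \log_2 N$ rather than $N^2$.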
Consider what happens when the DFT is used to analyze a real signal $u(t)$ having the spectrum
$U(f)$ and we know that $U(f)$ is zero for all $f > f_{\max}$ and nonzero for $0 \le f \le f_{\max}$. Because $u$ is
real, we know from entry 7 in Table 2.1 that $U(-f) = U(f)^{*}$, ensuring that $U(f)$ is also nonzero
for negative frequency values $0 \ge f \ge -f_{\max}$; that is, for every positive $f$ at which $U$ is nonzero
there must be a $-f$ at which $U$ is nonzero, and because $U$ is zero for $f > f_{\max}$ it follows that $U$ is
zero for all $f < -f_{\max}$. Hence $U$ can be represented schematically by the solid triangle centered
on the origin of Fig. 2.14. To construct $U^{[\infty]}$, we write


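FIGURE 2.13(a). [Graph of $u^{[\infty]}(t,T)$ versus $t$, with period $T = 1/\Delta f$ and sample spacing $\Delta t = 1/F$; a bracket marks the region over which $u^{[\infty]} \approx u$.]

FIGURE 2.13(b). [Graph of $U^{[\infty]}(f,F)$ versus $f$, with period $F = 1/\Delta t$ and sample spacing $\Delta f = 1/T$; a bracket marks the region over which $U^{[\infty]} \approx U$.]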


$$
U^{[\infty]}(f,F) = \sum_{k=-\infty}^{\infty} U(f - kF) , \tag{2.98a}
$$

where the smallest we can make $F$ and still avoid overlap is, as shown by the dotted triangles in
Fig. 2.14,
$$
F = 2 f_{\max} . \tag{2.98b}
$$

From Eq. (2.93c), we see that in Fig. 2.14

$$
F = \frac{1}{\Delta t} ,
$$

where $\Delta t$ is the interval in $t$ between adjacent samples of $u(t)$. If $\Delta t$ is made smaller, then $F$
increases, moving the regions of nonzero $U$ further apart in Fig. 2.14; and if $\Delta t$ is made larger,
then $F$ decreases, forcing the regions of nonzero $U$ to overlap in Fig. 2.14. Making $\Delta t$ smaller is
wasteful, in that more effort than is needed goes into sampling $u(t)$, and making $\Delta t$ larger
damages the integrity of the $U$ calculations for large values of $f$ near $f_{\max}$. Clearly, the frequency
value $F/2$ plays an important role in DFT analysis, because optimum performance requires
$f_{\max} = F/2$. For this reason frequency $F/2$ is given a special name: the Nyquist frequency
$f_{\mathrm{Nyq}} = F/2$. From (2.93c), we see that
$$
f_{\mathrm{Nyq}} = \frac{1}{2\,\Delta t} . \tag{2.99a}
$$

A realistic system, of course, is designed with some built-in margin for error. The requirement
then becomes that $\Delta t$ be small enough to separate unexpectedly high frequencies when the
highest expected frequency is $f_{\max}$. To provide this margin, we take

$$
f_{\mathrm{Nyq}} = \frac{1}{2\,\Delta t} \ge f_{\max} \tag{2.99b}
$$
or
$$
\Delta t \le \frac{1}{2 f_{\max}} . \tag{2.99c}
$$

Now the region between $f_{\max}$ and $f_{\mathrm{Nyq}}$ is available for analysis of unexpectedly high frequencies.
Suppose $U(f)$ is negligible everywhere except at two frequencies, the positive frequency $f_0$
and the corresponding negative frequency $(-f_0)$. Since $U(f)$ is the transform of a real signal,
entry 7 of Table 2.1 requires $U(-f) = U(f)^{*}$, forcing the existence of a non-negligible transform


value at $(-f_0)$ when there is a non-negligible transform value at $f_0$. The two frequencies are
represented by wide, solid-sided arrows in Fig. 2.15. The arrows represent isolated, narrow
regions where $U$ is very large, so we can think of them as proportional to delta functions and
write $U(f)$ as
$$
U(f) = A\cdot\delta(f - f_0) + B\cdot\delta(f + f_0) .
$$

Variables $A$ and $B$ are arbitrary complex constants. We have just seen that Table 2.1 requires
$U(-f) = U(f)^{*}$. Because the delta functions are real, the equation $U(-f) = U(f)^{*}$ can be
written as
$$
A\cdot\delta(-f - f_0) + B\cdot\delta(-f + f_0) = A^{*}\cdot\delta(f - f_0) + B^{*}\cdot\delta(f + f_0)
$$

or, since the delta functions are also even [see Eq. (2.68a)],

$$
A\cdot\delta(f + f_0) + B\cdot\delta(f - f_0) = A^{*}\cdot\delta(f - f_0) + B^{*}\cdot\delta(f + f_0) .
$$

This can only be true if $A = B^{*}$ (which is, of course, the same thing as having $B = A^{*}$).
Therefore, we have the freedom to choose only one arbitrary complex constant, say $A$, and after
making that choice function $U(f)$ becomes

FIGURE 2.14. [Graph of $U^{[\infty]}(f,F)$ versus $f$: the solid triangle $U(f)$ is centered on the origin between $-f_{\max}$ and $f_{\max}$, with dotted triangles centered on $-F$ and $F$.]


$$
U(f) = A\cdot\delta(f - f_0) + A^{*}\cdot\delta(f + f_0) . \tag{2.100a}
$$

It is not difficult to figure out what happens when the DFT is used to calculate this double-delta
frequency spectrum. If the double-delta $U(f)$ is used to construct $U^{[\infty]}(f,F)$ according to formula
(2.98a), we get multiple isolated regions where $U^{[\infty]}$ is very large, as shown by the wide dashed
arrows in Fig. 2.15. The curved single arrows show which wide dashed arrows come from the
wide, solid-sided arrow at $f_0$ and which wide dashed arrows come from the wide solid-sided
arrow at $(-f_0)$. For example, the wide dashed arrow closest to $f_0$ comes from the wide solid-sided
arrow at $(-f_0)$, and the wide dashed arrow closest to $(-f_0)$ comes from the wide solid-sided
arrow at $f_0$. The two wide solid-sided arrows at $f_0$ and $-f_0$ lie a distance $a$ inside the positions of
the positive and negative Nyquist frequencies $f_{\mathrm{Nyq}}$ and $-f_{\mathrm{Nyq}}$, and the two wide dashed arrows that
are closest to $f_0$ and $-f_0$ lie a distance $a$ outside the positive and negative Nyquist frequencies $f_{\mathrm{Nyq}}$
and $-f_{\mathrm{Nyq}}$. We see that the original double-delta $U(f)$ transform can be written as [from Eq.
(2.100a)]

$$
U(f) = A\cdot\delta(f - f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f + f_{\mathrm{Nyq}} - a) , \tag{2.100b}
$$

and we can pair up the two wide dashed arrows closest to $f_0$ and $-f_0$ to create the transform

$$
U^{[1]}(f) = A\cdot\delta(f + f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f - f_{\mathrm{Nyq}} - a) . \tag{2.100c}
$$

Because the delta function $\delta(f + f_{\mathrm{Nyq}} - a) = \delta(f + f_0)$ has the coefficient $A^{*}$ in (2.100b), the
curved single arrow going from $(-f_0)$ to $f_{\mathrm{Nyq}} + a$ shows that the delta function $\delta(f - f_{\mathrm{Nyq}} - a)$
at $f_{\mathrm{Nyq}} + a$ must have the coefficient $A^{*}$ in Eq. (2.100c); similarly, the curved single arrow going
from $f_0$ to $-f_{\mathrm{Nyq}} - a$ shows that the delta function $\delta(f + f_{\mathrm{Nyq}} + a)$ at $-f_{\mathrm{Nyq}} - a$ must have the
coefficient $A$ in Eq. (2.100c). Nothing stops us from continuing out from the origin, pairing the
wide dashed arrows at $f = 3 f_{\mathrm{Nyq}} - a$ and $f = -3 f_{\mathrm{Nyq}} + a$ to get

$$
U^{[2]}(f) = A\cdot\delta(f - 3 f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f + 3 f_{\mathrm{Nyq}} - a) \tag{2.100d}
$$

and pairing the wide dashed arrows at $f = 3 f_{\mathrm{Nyq}} + a$ and $f = -3 f_{\mathrm{Nyq}} - a$ to get

$$
U^{[3]}(f) = A\cdot\delta(f + 3 f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f - 3 f_{\mathrm{Nyq}} - a) . \tag{2.100e}
$$


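FIGURE 2.15. [Frequency axis showing wide solid-sided arrows at $-f_0$ and $f_0$, each a distance $a$ inside $-f_{\mathrm{Nyq}}$ and $f_{\mathrm{Nyq}}$, wide dashed alias arrows a distance $a$ outside $\pm f_{\mathrm{Nyq}}$, and curved single arrows linking each alias to its source; $F = 2 f_{\mathrm{Nyq}}$.]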

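Each time, the curved single arrows in Fig. 2.15 are consulted to find the coefficients of the delta
functions. This can obviously be continued out to indefinitely large values of $f$, creating the
paired transforms $U^{[4]}, U^{[5]}, \ldots$, etc. The general formula for $U^{[k]}$ turns out to be

$$
U^{[k]}(f) = \begin{cases}
A\,\delta(f - f_{\mathrm{Nyq}} - k f_{\mathrm{Nyq}} + a) + A^{*}\,\delta(f + f_{\mathrm{Nyq}} + k f_{\mathrm{Nyq}} - a) & \text{for } k \text{ even} \\[4pt]
A\,\delta(f + f_{\mathrm{Nyq}} + (k-1) f_{\mathrm{Nyq}} + a) + A^{*}\,\delta(f - f_{\mathrm{Nyq}} - (k-1) f_{\mathrm{Nyq}} - a) & \text{for } k \text{ odd}
\end{cases} . \tag{2.100f}
$$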


We started out with the double-delta $U(f)$ being the forward Fourier transform of $u(t)$, which
means that $u(t)$ is the inverse Fourier transform of the double-delta $U(f)$,

$$
u(t) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi i f t}\,df .
$$

We now show that $u(t)$, the inverse transform of the double-delta $U(f)$, and $u^{[1]}(t), u^{[2]}(t), \ldots$, the
inverse transforms of $U^{[1]}, U^{[2]}, \ldots$, all have the same values at $t = m\,\Delta t$ for $m = 0, \pm 1, \pm 2, \ldots$,

$$
u(m\,\Delta t) = u^{[1]}(m\,\Delta t) = u^{[2]}(m\,\Delta t) = \cdots = u^{[k]}(m\,\Delta t) = \cdots . \tag{2.100g}
$$

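We begin by taking the inverse Fourier transform of the double-delta $U(f)$ function specified
in (2.100b),

$$
\begin{aligned}
u(t) &= \int_{-\infty}^{\infty}\left[A\cdot\delta(f - f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f + f_{\mathrm{Nyq}} - a)\right] e^{2\pi i f t}\,df \\
&= A\,e^{2\pi i t (f_{\mathrm{Nyq}} - a)} + A^{*} e^{2\pi i t (a - f_{\mathrm{Nyq}})} = 2\operatorname{Re}\!\left[A\,e^{2\pi i t (f_{\mathrm{Nyq}} - a)}\right] .
\end{aligned} \tag{2.101a}
$$

Similarly, we can take the inverse Fourier transform of $U^{[k]}(f)$ in (2.100f) to get

$$
u^{[k]}(t) = \begin{cases}
2\operatorname{Re}\!\left[A\,e^{2\pi i t (f_{\mathrm{Nyq}} + k f_{\mathrm{Nyq}} - a)}\right] & \text{for } k \text{ even} \\[4pt]
2\operatorname{Re}\!\left[A\,e^{-2\pi i t (f_{\mathrm{Nyq}} + (k-1) f_{\mathrm{Nyq}} + a)}\right] & \text{for } k \text{ odd}
\end{cases} . \tag{2.101b}
$$

Substituting $t = m\,\Delta t$ from (2.100g) and $f_{\mathrm{Nyq}} = 1/(2\,\Delta t)$ from (2.99a) into Eq. (2.101a) gives

$$
\begin{aligned}
u(m\,\Delta t) &= 2\operatorname{Re}\!\left[A\,e^{2\pi i m \Delta t\left((2\Delta t)^{-1} - a\right)}\right] = 2\operatorname{Re}\!\left[A\,e^{i\pi m} e^{-2\pi i m a \Delta t}\right] \\
&= 2\operatorname{Re}\!\left[(-1)^m A\,e^{-2\pi i m a \Delta t}\right] .
\end{aligned} \tag{2.101c}
$$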

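Making the same substitutions into Eq. (2.101b) gives

$$
u^{[k]}(m\,\Delta t) = \begin{cases}
2\operatorname{Re}\!\left[A\,e^{2\pi i m \Delta t\left((2\Delta t)^{-1} + k (2\Delta t)^{-1} - a\right)}\right] = 2\operatorname{Re}\!\left[A\,e^{i\pi m} e^{i\pi m k} e^{-2\pi i m a \Delta t}\right] & \text{for } k \text{ even} \\[4pt]
2\operatorname{Re}\!\left[A\,e^{-2\pi i m \Delta t\left((2\Delta t)^{-1} + (k-1)(2\Delta t)^{-1} + a\right)}\right] = 2\operatorname{Re}\!\left[A\,e^{-i\pi m} e^{-i\pi m (k-1)} e^{-2\pi i m a \Delta t}\right] & \text{for } k \text{ odd}
\end{cases} . \tag{2.101d}
$$

But $e^{\pm i\pi m k} = (-1)^{mk} = 1$ when $k$ is even and $e^{\pm i\pi m (k-1)} = (-1)^{m(k-1)} = 1$ when $k$ is odd, so this last
result can be written as

$$
u^{[k]}(m\,\Delta t) = \begin{cases}
2\operatorname{Re}\!\left[A\,(-1)^m e^{-2\pi i m a \Delta t}\right] & \text{for } k \text{ even} \\[4pt]
2\operatorname{Re}\!\left[A\,(-1)^m e^{-2\pi i m a \Delta t}\right] & \text{for } k \text{ odd}
\end{cases} . \tag{2.101e}
$$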

Comparing this with (2.101c), we conclude that $u(m\,\Delta t) = u^{[k]}(m\,\Delta t)$ for all values of $m$ and $k$,
showing that (2.100g) must be true. Because the $u^{[k]}$ functions have exactly the same values as
the function $u$ at $t = m\,\Delta t$ for $m = 0, \pm 1, \pm 2, \ldots$, the $u^{[k]}$ functions are called aliases of function
$u$. Figure 2.16 graphs an example of $u(t)$ and $u^{[1]}(t)$ to show how $u$ and its alias $u^{[1]}$ can have identical
values at all the sample positions on the $t$ axis.
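
The equality (2.100g) can be checked numerically. The Python sketch below evaluates $u(t)$ from (2.101a) and its aliases $u^{[k]}(t)$ from (2.101b) at the sample positions and at one point between samples. The specific values ($\Delta t = 1$, $a = 0.2\,f_{\mathrm{Nyq}}$, and the complex constant $A$) are assumptions chosen so that $u$ and $u^{[1]}$ oscillate at 0.8 and 1.2 times the Nyquist frequency.

```python
import cmath
import math


def u_original(t, A, f_nyq, a):
    """u(t) = 2 Re[A exp(2 pi i t (f_Nyq - a))], Eq. (2.101a)."""
    return 2.0 * (A * cmath.exp(2j * math.pi * t * (f_nyq - a))).real


def u_alias(t, A, f_nyq, a, k):
    """k-th alias u^[k](t) from Eq. (2.101b)."""
    if k % 2 == 0:
        phase = 2j * math.pi * t * (f_nyq + k * f_nyq - a)
    else:
        phase = -2j * math.pi * t * (f_nyq + (k - 1) * f_nyq + a)
    return 2.0 * (A * cmath.exp(phase)).real


delta_t = 1.0
f_nyq = 1.0 / (2.0 * delta_t)  # Eq. (2.99a)
a = 0.2 * f_nyq                # u oscillates at 0.8 f_Nyq, u^[1] at 1.2 f_Nyq
A = 0.5 - 0.3j                 # arbitrary complex constant

for m in range(-3, 4):
    t = m * delta_t
    print(m, round(u_original(t, A, f_nyq, a), 9), round(u_alias(t, A, f_nyq, a, 1), 9))

# Halfway between two samples the curves are clearly different.
print(u_original(0.5, A, f_nyq, a), u_alias(0.5, A, f_nyq, a, 1))
```

At every sample position the two columns agree, while the values halfway between samples differ, which is the situation described for Fig. 2.16.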
The term “alias” is an interesting one; it suggests that there is no real way to distinguish these
functions if all we know are the values of the sample points at $t = m\,\Delta t$. Yet in Figs. 2.14 and
2.15, there is really no question as to which is the correct region of $U^{[\infty]}$; spectral values whose
frequencies do not lie between $+f_{\mathrm{Nyq}}$ and $-f_{\mathrm{Nyq}}$ can clearly be disregarded. Consider, however, that
before $u(t)$ is analyzed there is no guarantee as to what the correct value of $f_{\max}$ is. Figure 2.17, for
example, shows a pattern for $U^{[\infty]}$ that seems to have well-separated regions for $U$ and all its
aliases when in fact there is a high-frequency triangle that is hidden by aliasing. The unwary
analyst might conclude that $U$ has the shape shown in Fig. 2.18(a) when its true shape is the one
shown in Fig. 2.18(b). There is really no way to be sure of the true shape of $U$ when all that is
known is the DFT of the sampled signal $u(t)$. The basic problem, which is that the DFT is the
sampled version of $U^{[\infty]}$ instead of $U$, does not disappear when $F = 1/\Delta t$ is made larger by
decreasing the sampling interval $\Delta t$; there is always the possibility that the true $U$ curve is broad
enough to overlap. Returning to Fig. 2.16, we see that no matter how small $\Delta t$ is made, the
information thrown away from between the samples inevitably allows high frequencies to
masquerade as low frequencies. There is no foolproof method for both sampling the data and
avoiding this possibility.
Fortunately, there are usually ways of avoiding this logical dead end. As is pointed out in Sec.
2.2 above [see discussion after Eq. (2.9b)], in practice all measurements are sampled and, before
representing them by continuous functions, we must know that the samples capture all the


relevant detail. In other words, there must be some way of knowing, based on past experience or
knowledge of how the data is gathered, that the sampling is rapid enough to represent faithfully
all the important high-frequency details. In terms of the notation used to discuss Fig. 2.14, we
must eventually be prepared to say that, for some specific ƒmax, no higher frequencies are present
to create aliasing—that is, we must know that if more closely spaced sampling is done all that
would be found is a smooth, quasi-linear variation between the current samples. Many times the
electronic instruments used to make the measurements cannot sense high-frequency data, so even
if high-frequency components exist, they cannot be recorded. Other times, all that can be done is
to look at the data samples and decide whether it is reasonable to suspect the presence of unseen
high-frequency components. The data in Fig. 2.19(a), for example, almost certainly do not
contain significant amounts of unseen high frequencies, whereas unseen high frequencies could
well be present in Fig. 2.19(b). There may be cases where all that can be done is to shorten $\Delta t$ and
see whether previously aliased frequency components suddenly appear. The question of whether
aliasing is present is analogous to the question of whether experimental error is present. Just as it
is always logically possible that data contain significant amounts of undetected error, so it is

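FIGURE 2.16. [Plot of two sinusoidal curves versus $t$, with the vertical axis running from $-1$ to $1$ and the horizontal axis from $-4.5$ to $4.5$; black dots mark the sample positions.]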

The solid line represents a sinusoidal oscillation at a frequency that is 0.8 times the Nyquist
frequency, and the dashed line represents a sinusoidal oscillation that is 1.2 times the
Nyquist frequency. When the curves are sampled at the rate represented by the black dots,
which in this case is twice the Nyquist frequency, there is no way to tell them apart in the sampled
data.


always logically possible that significant amounts of aliasing are being overlooked. Just as we
often expect insignificant amounts of error to occur no matter what precautions are taken, so we
often expect insignificant amounts of aliasing to occur in the calculated DFT. What is needed is
the presence of good engineering and scientific judgment; there must always be someone willing
to pick a value for $f_{\max}$, allowing us to specify the sampling interval $\Delta t < 1/(2 f_{\max})$ that prevents
significant aliasing in the DFT.

2.23 Aliasing as a Tool


The previous section presented the bad aspects of aliasing, treating it as a form of data corruption.
There are, however, occasions when aliasing is more of a feature than a bug. Many times, a real
function $u(t)$ is known to have a Fourier transform

$$
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt ,
$$

which is zero for all positive frequencies $f$ that do not lie between the two positive numbers $f_{\min}$
and $f_{\max}$; that is, $U(f)$ is zero when $0 < f < f_{\min}$ or $f > f_{\max}$. Because $u(t)$ is real, $U(f)$ must be
Hermitian (see entry 7 of Table 2.1), which means

$$
U(-f) = U(f)^{*} .
$$

This shows that $U(f)$ must also be strictly zero for negative frequencies $f$ where $-f_{\min} < f < 0$
or $f < -f_{\max}$. The $U(f)$ transform is schematically represented in Fig. 2.20 with the two blocks
showing that $U$ is zero unless $f$ lies between $(-f_{\max}, -f_{\min})$ or $(f_{\min}, f_{\max})$.
The situation shown in Fig. 2.20 describes the signal produced by Michelson interferometers.
At the beginning of this chapter, we mentioned that interferometers produce interferograms that
must then be Fourier transformed to produce the desired spectral measurement. As explained
later in Chapter 4 (see Sec. 4.10), interferometers use optical filters to block out undesired
electromagnetic frequencies, which means there always exist values of $f_{\min}$ and $f_{\max}$ such that the
transform $U(f)$ of the interferogram signal $u(t)$ is zero unless $f$ lies between $(-f_{\max}, -f_{\min})$ or
$(f_{\min}, f_{\max})$. Suppose we sample the interferogram signal with a sampling interval $\Delta t$ such that
the Nyquist frequency $f_{\mathrm{Nyq}} = (2\,\Delta t)^{-1}$ is slightly larger than $f_{\max}$. Repeating the reasoning used to
get Fig. 2.15 above, we see that
$$
U^{[\infty]}(f,F) = \sum_{k=-\infty}^{\infty} U(f - kF)
$$


FIGURE 2.17. [Graph of $U^{[\infty]}(f,F)$ versus $f$, with the axis marked at $-F = -2 f_{\mathrm{Nyq}}$, $-f_{\mathrm{Nyq}}$, $f_{\mathrm{Nyq}}$, and $F = 2 f_{\mathrm{Nyq}}$.]

FIGURE 2.18(a). [Graph of $U(f)$.]

FIGURE 2.18(b). [Graph of $U(f)$.]

The $U^{[\infty]}(f,F)$ data in Fig. 2.17 contain hidden aliasing that can lead spectral analysts to assume
that Fig. 2.18(a) rather than Fig. 2.18(b) depicts the true frequency spectrum.


FIGURE 2.19(a).

This data is relatively smooth, suggesting that it does not contain high-frequency components.

FIGURE 2.19(b).

This curve varies rapidly in three locations, suggesting the presence of high-frequency
components in the data.


now has the form shown in Fig. 2.21. Again, the solid blocks show the original $U(f)$, the dashed
blocks show the aliases created by turning $U(f)$ into $U^{[\infty]}(f,F)$, and the curved arrows drawn
show exactly how the aliased blocks are created from the original blocks. No solid blocks overlap
with the dashed blocks, so aliasing is not a problem.
Now consider what happens when we force aliasing to occur by choosing $\Delta t$ to be twice its
original size, creating the $U^{[\infty]}$ plot shown in Fig. 2.22. As in Fig. 2.21, none of the solid blocks
overlap with the dashed blocks. Because the dashed blocks come from turning $U$ into $U^{[\infty]}$, the
spectral shapes represented by the solid and dashed blocks are all identical. This means that the
aliasing does not cause spectral information to be lost; either the solid blocks or the dashed
blocks can be used to recover the true shape of $U(f)$. The electronic equipment used to sample
$u(t)$ only needs to sample half as often as before, which usually makes it less expensive to build,
and as a bonus the rate at which data flows from the interferometer ends up being cut in half. This
last point is often a significant consideration when the interferometer is on a satellite and all the
data has to be communicated to the ground. The scheme shown in Fig. 2.22 is called
undersampling. There is nothing special about undersampling by a factor of 2; if the distance
between $f_{\min}$ and $f_{\max}$ is small enough, and $f_{\min}$ is far enough from $f = 0$, we can undersample
by much higher factors. Figure 2.23 shows a scheme that undersamples by a factor of 5, with 4
aliases rather than one.
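
Whether a given sampling rate keeps the aliased blocks from overlapping can be tested with a few lines of code. The Python sketch below uses a standard band-pass sampling criterion that is not stated in the text and is assumed here: the band $(f_{\min}, f_{\max})$ must fit inside a single interval $[\,n f_{\mathrm{Nyq}}, (n+1) f_{\mathrm{Nyq}}\,]$ for some integer $n \ge 0$, where $f_{\mathrm{Nyq}} = F/2$. The function names and the numerical band are hypothetical.

```python
import math


def is_alias_free(f_min, f_max, F):
    """True when the band (f_min, f_max) lies inside one interval [n F/2, (n+1) F/2].

    Assumed criterion (standard band-pass sampling condition); blocks that only
    touch at an edge are counted as not overlapping.
    """
    f_nyq = F / 2.0
    zone = math.floor(f_min / f_nyq)  # index n of the interval containing f_min
    return f_max <= (zone + 1) * f_nyq


def slowest_alias_free_rate(f_min, f_max, candidate_rates):
    """Smallest candidate sampling rate F that passes is_alias_free, or None."""
    usable = [F for F in candidate_rates if is_alias_free(f_min, f_max, F)]
    return min(usable) if usable else None


# Hypothetical band from 8 to 10 (arbitrary frequency units).
for F in (20.0, 10.0, 6.0, 5.0, 4.0, 3.0):
    print(F, is_alias_free(8.0, 10.0, F))
print(slowest_alias_free_rate(8.0, 10.0, [20.0, 10.0, 6.0, 5.0, 4.0, 3.0]))
```

For this band the rate $F = 20$ corresponds to ordinary sampling with $f_{\mathrm{Nyq}} = f_{\max}$, while $F = 4$ is five times slower and still passes the test, so narrow bands far from $f = 0$ allow large undersampling factors.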

2.24 Sampling Theorem


We define a band-limited function $u(t)$ to be a function for which there exists a positive
frequency $f_{\max}$ such that the forward Fourier transform of $u(t)$,

$$
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt ,
$$

is strictly zero when $f < -f_{\max}$ or $f > f_{\max}$. The previous section indicated that the interferogram
of a Michelson interferometer is a special case of a band-limited function; not only is its
transform zero for $|f| > f_{\max}$, but there is also a positive frequency $f_{\min}$ such that its transform is
zero for $|f| < f_{\min}$ (see Fig. 2.20). It can be shown that whenever a continuous function $u(t)$ is
also band limited, then its samples $u(m\,\Delta t)$ (with $m = 0, \pm 1, \pm 2, \ldots$) can be used to reconstruct the
complete function, including the values of $u$ between the samples, as long as we choose

$$
\Delta t \le \frac{1}{2 f_{\max}} \tag{2.102}
$$
to prevent aliasing.
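
Before going through the derivation, it may help to see the claim at work numerically. The Python sketch below uses the standard Whittaker-Shannon interpolation formula, $u(t) = \sum_m u(m\,\Delta t)\,\mathrm{sinc}\big((t - m\,\Delta t)/\Delta t\big)$ with $\mathrm{sinc}(x) = \sin(\pi x)/(\pi x)$, which is quoted here without proof as an assumption of the example. The test function, the truncation of the sum to $|m| \le 500$, and all names are illustrative choices.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def band_limited_example(t, bandwidth=0.25):
    """sinc(bandwidth * t)^2, whose transform is zero for |f| > bandwidth."""
    return sinc(bandwidth * t) ** 2


def reconstruct(t, delta_t, sample, m_max=500):
    """Truncated Whittaker-Shannon sum built only from the samples sample(m * delta_t)."""
    return sum(sample(m * delta_t) * sinc((t - m * delta_t) / delta_t)
               for m in range(-m_max, m_max + 1))


delta_t = 0.5  # f_Nyq = 1, comfortably above f_max = 0.25, so (2.102) holds
for t in (0.3, 1.7, -2.25):
    print(t, band_limited_example(t), reconstruct(t, delta_t, band_limited_example))
```

The reconstructed values between the sample positions agree with the original function to within the small error caused by truncating the infinite sum.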
We start by forming the mathematical construct


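FIGURE 2.20. [Graph of $U(f)$ versus $f$: two blocks, one between $-f_{\max}$ and $-f_{\min}$ and one between $f_{\min}$ and $f_{\max}$.]

FIGURE 2.21. [Graph of $U^{[\infty]}(f,F)$ versus $f$, with the axis marked at $-F$, $-f_{\mathrm{Nyq}}$, $-f_{\max}$, $-f_{\min}$, $f_{\min}$, $f_{\max}$, $f_{\mathrm{Nyq}}$, and $F$.]

Frequency $F$ is twice the Nyquist frequency $f_{\mathrm{Nyq}}$ in Fig. 2.21.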


$$
v(t) = \sum_{m=-\infty}^{\infty} u(m\,\Delta t)\,\delta(t - m\,\Delta t) . \tag{2.103}
$$

Clearly, the $u(m\,\Delta t)$ sample values of function $u$ are the only data used to set up function $v(t)$.
Because $u(t)\,\delta(t - t_0) = u(t_0)\,\delta(t - t_0)$ for any continuous function $u$ [see Eq. (2.68e) above], this
can be written as
$$
v(t) = \sum_{m=-\infty}^{\infty} u(t)\,\delta(t - m\,\Delta t)
$$
or
$$
v(t) = u(t)\cdot\left[\sum_{m=-\infty}^{\infty}\delta(t - m\,\Delta t)\right] .
$$

Note that here $t$ in the function $u$ has returned to being a continuous, not a sampled, variable. Taking the Fourier
transform of both sides gives, using the Fourier convolution theorem [see Eq. (2.72i)],

$$
V(f) = U(f) * \left[\frac{1}{\Delta t}\sum_{k=-\infty}^{\infty}\delta\!\left(f - \frac{k}{\Delta t}\right)\right] , \tag{2.104a}
$$
where
5

³ v(t )e
2& ift
V( f ) dt , (2.104b)
5

³ u (t )e
2& ift
U( f ) dt , (2.104c)
5
and
ª 5 º 2& ift
5
1 5 § k ·
³5 ¬« k¦
5
 (t  k t ) »
¼
e dt ¦ ¨ f  ¸
t k 5 © t ¹
(2.104d)

from formula (2.78d). Note that here both ƒ and t are continuous, not sampled, variables. We can
now use the linearity of the convolution [see discussion after Eq. (2.38c)] and the definition of
the convolution in Eq. (2.38a) to write (2.104a) as

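\[
\begin{aligned}
\Delta t \cdot V(f) &= \sum_{k=-\infty}^{\infty} U(f) * \delta\!\left( f - \frac{k}{\Delta t} \right)
 = \sum_{k=-\infty}^{\infty} \int_{-\infty}^{\infty} U(f')\, \delta\!\left( f - \frac{k}{\Delta t} - f' \right) df' \\
&= \sum_{k=-\infty}^{\infty} U\!\left( f - \frac{k}{\Delta t} \right) = U^{[\infty]}\!\left( f, \frac{1}{\Delta t} \right) ,
\end{aligned}
\tag{2.105a}
\]

FIGURE 2.22. [Plot of $U^{[\infty]}(f, F)$ versus $f$, with $\pm f_{\min}$, $\pm f_{\max}$, $\pm f_{\rm Nyq}$, and $\pm F$ marked on the frequency axis.]

FIGURE 2.23. [Plot of $U^{[\infty]}(f, F)$ versus $f$, with $\pm f_{\min}$, $\pm f_{\max}$, $\pm f_{\rm Nyq}$, and $\pm F$ marked on the frequency axis.]

In both Figs. 2.22 and 2.23, frequency F is twice the Nyquist frequency $f_{\rm Nyq}$.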

where $U^{[\infty]}$ is as defined in Eq. (2.93b) above. Inequality (2.102) ensures that the separate
regions of U that combine to create $U^{[\infty]}$ do not overlap, giving us the graph of $U^{[\infty]}$ shown in
Fig. 2.24. Hence, we can use the $\Pi$ function defined in Eq. (2.56c) to select just the region of
nonzero $U^{[\infty]}$ between $-(2\Delta t)^{-1}$ and $(2\Delta t)^{-1}$, recreating the original U(ƒ) transform.
Multiplication of (2.105a) by $\Pi\!\left( f, (2\Delta t)^{-1} \right)$ then gives

\[ U(f) = \Pi\!\left( f, \frac{1}{2\Delta t} \right) \cdot U^{[\infty]}\!\left( f, \frac{1}{\Delta t} \right) = \Delta t \cdot V(f) \cdot \Pi\!\left( f, \frac{1}{2\Delta t} \right) . \tag{2.105b} \]


Having recovered the original U(ƒ), an inverse Fourier transform of U(ƒ) gives back the original
unsampled u(t). Using the Fourier convolution theorem again to take the inverse Fourier
transform of both sides of (2.105b), we get [applying Eq. (2.39j) after interchanging the roles of ƒ
and t]
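\[
\begin{aligned}
u(t) &= \Delta t \int_{-\infty}^{\infty} V(f) \cdot \Pi\!\left( f, \frac{1}{2\Delta t} \right) e^{2\pi i f t}\, df \\
&= \Delta t \left[ \int_{-\infty}^{\infty} V(f)\, e^{2\pi i f t}\, df \right] * \left[ \int_{-\infty}^{\infty} \Pi\!\left( f', \frac{1}{2\Delta t} \right) e^{2\pi i f' t}\, df' \right] ,
\end{aligned}
\tag{2.106a}
\]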

where the convolution between the two expressions inside square brackets [ ] is over the variable
t. From (2.104b), function V(ƒ) is the forward Fourier transform of v(t), making v(t) equal to the
inverse Fourier transform of V(ƒ) in (2.106a), with v(t) defined as

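\[ v(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\, \delta(t - m\Delta t) \]

in Eq. (2.103). From Eq. (2.71a) above, the inverse Fourier transform of $\Pi$ is

\[ \mathcal{F}^{(ift)}\!\left( \Pi\!\left( f, \frac{1}{2\Delta t} \right) \right) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, \Pi\!\left( f, \frac{1}{2\Delta t} \right) df = \frac{1}{\pi t} \sin\!\left( \frac{\pi t}{\Delta t} \right) . \]

Equation (2.106a) can now be written as

\[ u(t) = \Delta t \left[ \sum_{m=-\infty}^{\infty} u(m\Delta t)\, \delta(t - m\Delta t) \right] * \left[ \frac{1}{\pi t} \sin\!\left( \frac{\pi t}{\Delta t} \right) \right] . \tag{2.106b} \]

Again, the linearity of the convolution can be used to simplify (2.106b),

\[ u(t) = \Delta t \sum_{m=-\infty}^{\infty} \left\{ u(m\Delta t) \left[ \delta(t - m\Delta t) * \frac{1}{\pi t} \sin\!\left( \frac{\pi t}{\Delta t} \right) \right] \right\} \]

or, using that $\delta(t - t_0) * u(t) = u(t - t_0)$ for any continuous function u,

\[ u(t) = \sum_{m=-\infty}^{\infty} \left\{ u(m\Delta t) \left[ \frac{1}{\pi \left( (t - m\Delta t)/\Delta t \right)} \sin\!\left( \frac{\pi (t - m\Delta t)}{\Delta t} \right) \right] \right\} . \tag{2.106c} \]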

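FIGURE 2.24. [Plot of $U^{[\infty]}(f, 1/\Delta t)$ versus $f$. Axis marks: $-(1/\Delta t - f_{\max})$, $-1/(2\Delta t)$, $-f_{\max}$, $f_{\max}$, $1/(2\Delta t)$, and $1/\Delta t - f_{\max}$; the central region is labeled $U(f)$.]

This formula gives us u(t) everywhere in terms of the samples $u(m\Delta t)$ and the function

\[ \frac{1}{\pi (t/\Delta t)} \sin\!\left( \frac{\pi t}{\Delta t} \right) . \]

We now define the function

\[ \mathrm{sinc}(x) = \frac{\sin(x)}{x} \tag{2.106d} \]

and write (2.106c) as

\[ u(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\, \mathrm{sinc}\!\left( \frac{\pi (t - m\Delta t)}{\Delta t} \right) . \tag{2.106e} \]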

Many authors use a different definition of the sinc function, which we call here $\mathrm{sinc}_{\rm alt}$, with

\[ \mathrm{sinc}_{\rm alt}(x) = \frac{\sin(\pi x)}{\pi x} . \]

In terms of $\mathrm{sinc}_{\rm alt}$, Eq. (2.106e) becomes

\[ u(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\, \mathrm{sinc}_{\rm alt}\!\left( \frac{t - m\Delta t}{\Delta t} \right) . \]
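The reconstruction formula can be tried numerically. The Python sketch below is an illustration, not part of the original text: it truncates the infinite sum in Eq. (2.106e) to a finite range of m and uses NumPy's `np.sinc`, which follows the sinc_alt convention sin(πx)/(πx). The test signal `u`, the band limit `f_max`, and the truncation range are assumptions made for the example.

```python
import numpy as np

def sinc_reconstruct(samples, m, dt, t):
    """Truncated version of Eq. (2.106e), written with np.sinc (the sinc_alt convention)."""
    # kernel[i, j] = sinc_alt((t_i - m_j*dt) / dt)
    kernel = np.sinc((t[:, None] - m[None, :] * dt) / dt)
    return kernel @ samples

def u(t):
    # Hypothetical band-limited test signal: its transform is a triangle
    # that vanishes for |f| > 2.
    return np.sinc(2.0 * t) ** 2

f_max = 2.0
dt = 1.0 / (4.0 * f_max)        # satisfies the sampling condition dt <= 1/(2*f_max)
m = np.arange(-4000, 4001)      # finite stand-in for the infinite sum over m
samples = u(m * dt)

t = np.linspace(-1.0, 1.0, 41)  # evaluation points on and between the sample times
max_err = np.max(np.abs(sinc_reconstruct(samples, m, dt, t) - u(t)))
print(max_err)
```

The residual error here comes only from truncating the sum; a test signal that decays quickly was chosen so that the discarded terms are small.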

For the rest of this book, the symbol sinc will refer to $\sin(x)/x$ instead of $\sin(\pi x)/(\pi x)$. We also
note that the Fourier transform pair in (2.71a) can be written in terms of sinc(x) as

\[ \int_{-\infty}^{\infty} e^{-2\pi i f t}\, [\, 2F\, \mathrm{sinc}(2\pi F t)\, ]\, dt = \Pi(f, F) \]

and

\[ \int_{-\infty}^{\infty} e^{2\pi i f t}\, \Pi(f, F)\, df = 2F\, \mathrm{sinc}(2\pi F t) . \]

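Replacing ƒ by −ƒ in the top integral and t by −t in the bottom integral gives

\[ \int_{-\infty}^{\infty} e^{2\pi i f t}\, [\, 2F\, \mathrm{sinc}(2\pi F t)\, ]\, dt = \Pi(-f, F) = \Pi(f, F) \]

and

\[ \int_{-\infty}^{\infty} e^{-2\pi i f t}\, \Pi(f, F)\, df = 2F\, \mathrm{sinc}(-2\pi F t) = 2F\, \mathrm{sinc}(2\pi F t) , \]

where we have used that $\Pi(f, F)$ and $\mathrm{sinc}(2\pi F t)$ are even functions of their arguments:

\[ \mathrm{sinc}(-x) = \mathrm{sinc}(x) \tag{2.107a} \]

and

\[ \Pi(-f, F) = \Pi(f, F) . \tag{2.107b} \]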

This means we can write this Fourier relationship using the more general formulas

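\[ \mathcal{F}^{(\pm ift)}\!\left( 2F\, \mathrm{sinc}(2\pi F t) \right) = \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\, [\, 2F\, \mathrm{sinc}(2\pi F t)\, ]\, dt = \Pi(f, F) \tag{2.108a} \]

and

\[ \mathcal{F}^{(\pm ift)}\!\left( \Pi(f, F) \right) = \mathcal{F}^{(\pm itf)}\!\left( \Pi(f, F) \right) = \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\, \Pi(f, F)\, df = 2F\, \mathrm{sinc}(2\pi F t) . \tag{2.108b} \]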

2.25 Fourier Transforms in Two and Three Dimensions


The integral Fourier transform extends easily and naturally to two- and three-dimensional
functions. We can, for example, define the integral Fourier transform of any two-dimensional
function u(x,y) to be
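\[ U(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-2\pi i (x\zeta + y\eta)}\, u(x, y) . \tag{2.109a} \]

The inverse Fourier transform of U returns the original function,

\[ u(x, y) = \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\, e^{2\pi i (x\zeta + y\eta)}\, U(\zeta, \eta) . \tag{2.109b} \]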

In three dimensions we can write, for the function u( x, y, z ) , that

5 5 5
U (. ,! , 0 ) ³
5
dx ³ dy ³ dz e2& i ( x.  y!  z0 )u ( x, y, z )
5 5
(2.109c)

and
5 5 5

³ d. ³ d! ³ d0 e
2& i ( x.  y!  z0 )
u ( x, y , z ) U (. ,! , 0 ) . (2.109d)
5 5 5

This pattern of forward and inverse transforms can be extended indefinitely to functions u and U
with ever larger numbers of arguments, but for the purposes of this book there is no need to go
beyond the two- and three-dimensional transforms given in Eqs. (2.109a)–(2.109d). As a matter
of notation, we often use the standard Cartesian $\hat{x}$ and $\hat{y}$ unit vectors pointing along the x and y
axes of a Cartesian coordinate system to define vectors

\[ \vec{\rho} = x\hat{x} + y\hat{y} \quad \text{and} \quad \vec{q} = \zeta\hat{x} + \eta\hat{y} . \]

We introduce the symbol $u(\vec{\rho})$ as a shorthand for u(x,y) and the symbol $U(\vec{q})$ as a shorthand for
$U(\zeta, \eta)$. Now Eqs. (2.109a) and (2.109b) can be written as

\[ U(\vec{q}) = \iint_{-\infty}^{\infty} d^2\rho\; e^{-2\pi i \vec{\rho} \cdot \vec{q}}\, u(\vec{\rho}) \tag{2.110a} \]

and

\[ u(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2 q\; e^{2\pi i \vec{\rho} \cdot \vec{q}}\, U(\vec{q}) . \tag{2.110b} \]

We can also define vectors for the three-dimensional case,


G G
r xxˆ  yyˆ  zzˆ and s . xˆ  ! yˆ  0 zˆ ,

and then write Eqs. (2.109c) and (2.109d) as

5
G G G G
³ ³³
3 2& ir = s
U (s ) d r e u (r ) (2.110c)
5
and
5
G G G G
³ ³³d se
3 2& ir = s
u (r ) U (s ) . (2.110d)
5
Vector notation is sometimes used to group families of associated forward and inverse Fourier
transforms into a single equation. We might, for example, write the six scalar equations

5 5
G G G G G G G G
³ ³ ³ d r e u x (r ) , u x (r ) ³ ³³d
3 2& ir = s 3
U x (s ) s e 2& ir = sU x ( s ) ,
5 5

5 5
G G G G G G G G
³ ³³d re ³ ³³
3 2& ir = s 3 2& ir = s
U y (s ) u y (r ) , u y (r ) d s e U y (s ) ,
5 5
and
5 5
G G G G G G G G
³ ³ ³ d r e u z (r ) , u z (r ) ³ ³³d se
3 2& ir = s 3 2& ir = s
U z (s ) U z (s )
5 5

as the pair of vector equations


G G 5
2& ir = s G G
G G

³ ³³
3
U (s ) d r e u (r ) (2.110e)
5


and
G G G G
5
G G
³ ³³d
3
u (r ) s e 2& ir = sU ( s ) , (2.110f)
5
where
G G G G G G G G G G
u (r ) xˆu x (r )  yˆ u y (r )  zˆu z (r ) and U ( s ) xˆU x ( s )  yˆU y ( s )  zˆU z ( s ) .
G G G G G G
We call U ( s ) the vector Fourier transform of u (r ) and u (r ) the vector inverse Fourier
G G
transform of U ( s ) . Just as in the one-dimensional case, it makes no difference which Fourier
transform is labeled the forward transform and which is labeled the inverse transform as long as
there is a change in sign of the exponent of e. Following the pattern of Eq. (2.28 A ), we can also
write
5 5
G G G G G G
³³ ³³
2 9 2& i ( = q 2 B 2 & i ( 3= q
d q e d ( 3 e u ( ( 3) u ( ( ) (2.110g)
5 5
and
5 5
G G
³³ ³ d ³ ³ ³ d r3 e
G G G G
3 se 92& ir = s 3 B2& ir 3= s v (r 3) v(r ) (2.110h)
5 5

G G
for two-dimensional and three-dimensional scalar functions u ( ( ) and v(r ) . For three-
dimensional vector functions, this becomes

5 5
G G G G G G G G
³³ ³ d s e ³ ³ ³ d r3 e
3 9 2& ir = s 3 9 2& ir 3= s
v (r 3) v (r ) . (2.110i)
5 5

Many one-dimensional Fourier identities have two-dimensional and three-dimensional
counterparts. For example, the Fourier shift theorem [see Eq. (2.36h) above] in two dimensions
becomes, for a two-dimensional vector constant $\vec{a} = \hat{x} a_x + \hat{y} a_y$,

\[
\begin{aligned}
\iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}}\, u(\vec{\rho} + \vec{a})
&= \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, u(x + a_x, y + a_y) \\
&= e^{\mp 2\pi i (\zeta a_x + \eta a_y)} \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\, e^{\pm 2\pi i (x'\zeta + y'\eta)}\, u(x', y') ,
\end{aligned}
\]

where in the last step we define $x' = x + a_x$ and $y' = y + a_y$. We now see that (dropping the
primes inside the double integral)

\[ \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}}\, u(\vec{\rho} + \vec{a}) = e^{\mp 2\pi i \vec{a} \cdot \vec{q}} \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}}\, u(\vec{\rho}) . \tag{2.110j} \]

This shows the forward or inverse two-dimensional Fourier transform of $u(\vec{\rho} + \vec{a})$ to be $e^{\mp 2\pi i \vec{a} \cdot \vec{q}}$
multiplied by the forward or inverse two-dimensional Fourier transform of $u(\vec{\rho})$. Similarly in
three dimensions, we have, for a three-dimensional constant vector $\vec{b} = \hat{x} b_x + \hat{y} b_y + \hat{z} b_z$, that

G G
5 5 5 5
G G
³ ³³d re ³ dx ³ dy ³
3 92& ir = s
v(r  b ) dz e92& i ( x.  y!  z0 ) v( x  bx , y  by , z  bz )
5 5 5 5
5 5 5

³ dx3 ³ dy3 ³ dz 3e 92& i ( x3.  y3!  z30 ) v( x3, y3, z3) ,


B2& i ( bx.  by!  bz0 )
e
5 5 5

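where $x' = x + b_x$, $y' = y + b_y$, and $z' = z + b_z$. This time we find that the forward or inverse three-
dimensional Fourier transform of $v(\vec{r} + \vec{b})$ is $e^{\mp 2\pi i \vec{s} \cdot \vec{b}}$ multiplied by the forward or inverse three-
dimensional Fourier transform of $v(\vec{r})$,

\[ \iiint_{-\infty}^{\infty} d^3 r\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}}\, v(\vec{r} + \vec{b}) = e^{\mp 2\pi i \vec{s} \cdot \vec{b}} \iiint_{-\infty}^{\infty} d^3 r\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}}\, v(\vec{r}) . \tag{2.110k} \]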
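A discrete analog of the two-dimensional shift theorem can be checked with a fast Fourier transform. The Python sketch below is an illustration under stated assumptions rather than the text's own method: the continuous function is replaced by a hypothetical periodic array `u`, the shift is a whole number of samples, and `np.fft.fft2` supplies a transform whose exponent carries the minus sign, so the shift factor carries the plus sign.

```python
import numpy as np

rng = np.random.default_rng(0)
N0, N1 = 16, 12
u = rng.standard_normal((N0, N1))   # hypothetical sampled u(x, y) on a periodic grid
ax, ay = 3, 5                       # integer shift, playing the role of the constant vector a

# shifted[x, y] = u[(x + ax) % N0, (y + ay) % N1], the discrete stand-in for u(rho + a)
shifted = np.roll(u, shift=(-ax, -ay), axis=(0, 1))

# Frequencies in cycles per sample along each axis
q0 = np.fft.fftfreq(N0)[:, None]
q1 = np.fft.fftfreq(N1)[None, :]

# np.fft.fft2 uses exp(-2*pi*i*...), so the shift factor carries the opposite (+) sign
phase = np.exp(2j * np.pi * (q0 * ax + q1 * ay))
shift_err = np.max(np.abs(np.fft.fft2(shifted) - phase * np.fft.fft2(u)))
print(shift_err)
```

The printed error should be at the level of floating-point rounding, which is the discrete counterpart of the statement that a shift changes only the phase of the transform.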

There is also a two-dimensional and three-dimensional version of the one-dimensional Fourier


scaling theorem discussed in Sec. 2.8 above [see Eq. (2.37a)]. In two dimensions when we have

5
G G G G
V ( 9 ) (q ) ³³
5
d 2 ( e 9 2& i ( =qv ( () (2.110 A )

G G G G
and v( ( ) is replaced by v(( ) , where Į is a real scalar, then we can substitute ( 3 ( to get
G
5 5 § ( 3· G
G G G 1 9 2& i¨ ¸ = q G 1 G
³³d ³³d
2 9 2& i ( = q 2
(e v ( () 2 (3e © ¹
v ( ( 3) 2 V ( 9 ) (q  ) . (2.110m)
5
 5

G G G G
Suppose there is a function of ( called u ( ( ) such that ( has to change by a vector distance (
G
whose magnitude must be at least (  for there to be a significant change in the value of
G
u ( ( ) . Using the same reasoning as was applied to the one-dimensional Fourier scaling theorem
G
[see the analysis following Eq. (2.37e)], we can show that U ( 9 ) (q ) , the two-dimensional forward


G
or inverse Fourier transform of u, must be negligible or zero for all vectors q whose magnitude
G
q exceeds 1  . The Fourier scaling theorem in three dimensions starts with

5
G G G G
³ ³³
(9) 3 92& ir = s
V (s ) d r e v(r ) , (2.110n)
5

G G G
from which we discover, replacing r by r 3  r , that
G
5 5 § r3 · G
G G G 1 9 2& i¨ ¸ =s G 1 G
³ ³³ d r e ³ ³³ d
3 9 2& ir =s 3
v ( r ) 3 r3 e © ¹
v (r 3) 3 V ( 9 ) ( s  ) . (2.110o)
5  5 

G G
Again we can conclude that if there is a function u (r ) such that r must be at least ȕ for there
G
to be a significant change in u, then U ( 9 ) ( s ) , the three-dimensional forward or inverse Fourier
G G
transform of u, must be negligible or zero for all vector arguments s whose magnitude s
exceeds 1  .
The two-dimensional convolution of scalar functions u(x,y) and v(x,y) is written using the
symbol $**$ and defined to be

\[ u(x, y) ** v(x, y) = \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\, u(x', y')\, v(x - x', y - y') , \tag{2.111a} \]

or

\[ u(\vec{\rho}) ** v(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\, v(\vec{\rho} - \vec{\rho}\,') \tag{2.111b} \]

using the more concise vector notation. The vector notation may make the connection between
the one- and two-dimensional convolutions in Eqs. (2.38a) and (2.111b) easier to see. The two-
dimensional convolution, like the one-dimensional convolution, is both commutative and
associative. Using the same type of reasoning as in the analysis in Sec. 2.9, we have for the two-
G G G
dimensional functions u ( ( ) , v( ( ) , and h( ( ) that

5 5
G G G G G G G G
³ ³ ³ ³
2 2 2
u ( ( ) v( ( ) d ( 3 u ( ( 3) v ( (  ( 3)  1 d ( 33 u ( (  ( 33) v ( ( 33)
5
5
5
(2.111c)
G G G G G
³ ³d
2
( 33 v( ( 33)u ( (  ( 33) v( ( ) u ( ( )
5
and


5 5
G G G G G 2 G G G G
³³ ³³
2
u ( ( ) v( ( ) h( ( ) d ( 33 h ( (  ( 33) d ( 3 u ( ( 3) v ( ( 33  ( 3)
5 5
5 5
G G G G G G
³³ d ( 3 u ( ( 3) ³ ³d
2 2
( 33 h( (  ( 33)v( ( 33  ( 3)
5 5
(2.111d)
5 5
G G G G G G
³³ d ( 3 u ( ( 3) ³ ³d
2 2
( 333 v( ( 333) h(( (  ( 3)  ( 333)
5 5
G G G
u ( ( )   v( ( ) h( ( )  ,

where to show that the two-dimensional convolution is commutative we make the variable
G G G
substitution ( 33 (  ( 3 in (2.111c); and to show it is associative, we make the variable
G G G
substitution ( 333 ( 33  ( 3 in (2.111d). The two-dimensional convolution is also linear. For any
two complex constants Į and ȕ, we have

5
G G G G G G G G
³ ³d
2
u ( ( )   v( ( )   h( ( )  ( 3 u ( ( 3)  v( (  ( 3)   h( (  ( 3) 
5
5 5
G G G G G G
³ ³d ³ ³d
2 2
 ( 3 u ( ( 3)v( (  ( 3)   ( 3 u ( ( 3)h( (  ( 3) (2.111e)
5 5
G G G G
 u ( ( ) v( ( )   u ( ( ) h( ( ),

and because the two-dimensional convolution is commutative it follows that


G G G G G G G
 v( ( )   h( ( ) u ( ( )  v( ( ) u ( ( )    h( ( ) u ( ( ) . (2.111f)

It is easy to show that the Fourier convolution theorem holds true in two dimensions. We start
with
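\[
\begin{aligned}
\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, & e^{\pm 2\pi i (x\zeta + y\eta)}\, [\, u(x, y) ** v(x, y)\, ] \\
&= \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)} \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\, u(x', y')\, v(x - x', y - y') \\
&= \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\, u(x', y') \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, v(x - x', y - y') .
\end{aligned}
\]

Now we replace the x, y integration variables by $x'' = x - x'$ and $y'' = y - y'$, with $dx'' = dx$ and
$dy'' = dy$, so that

\[
\begin{aligned}
\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, & e^{\pm 2\pi i (x\zeta + y\eta)}\, [\, u(x, y) ** v(x, y)\, ] \\
&= \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\, u(x', y')\, e^{\pm 2\pi i (x'\zeta + y'\eta)} \int_{-\infty}^{\infty} dx'' \int_{-\infty}^{\infty} dy''\, e^{\pm 2\pi i (x''\zeta + y''\eta)}\, v(x'', y'')
\end{aligned}
\]

or

\[ \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, [\, u(x, y) ** v(x, y)\, ] = U^{(\pm)}(\zeta, \eta) \cdot V^{(\pm)}(\zeta, \eta) , \tag{2.112a} \]

where $U^{(\pm)}$ is the two-dimensional forward or inverse Fourier transform of u,

\[ U^{(\pm)}(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, u(x, y) , \tag{2.112b} \]

and $V^{(\pm)}$ is the two-dimensional forward or inverse Fourier transform of v,

\[ V^{(\pm)}(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, v(x, y) . \tag{2.112c} \]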

This gives the first half of the two-dimensional Fourier convolution theorem. To get the
second half, we reverse the transform in (2.112a). If the plus sign is used in (2.112a), take the
forward two-dimensional Fourier transform of both sides, and if the minus sign is used take the
inverse two-dimensional Fourier transform of both sides. This leads to

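\[ \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\, e^{\mp 2\pi i (x\zeta + y\eta)}\, \left[ U^{(\pm)}(\zeta, \eta) \cdot V^{(\pm)}(\zeta, \eta) \right] = u(x, y) ** v(x, y) , \tag{2.113a} \]

where, reversing the transforms in Eqs. (2.112b) and (2.112c),

\[ u(x, y) = \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\, e^{\mp 2\pi i (x\zeta + y\eta)}\, U^{(\pm)}(\zeta, \eta) \tag{2.113b} \]

and

\[ v(x, y) = \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\, e^{\mp 2\pi i (x\zeta + y\eta)}\, V^{(\pm)}(\zeta, \eta) . \tag{2.113c} \]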

The first half of the two-dimensional Fourier convolution theorem, Eqs. (2.112a)–(2.112c),
shows that the forward or inverse two-dimensional Fourier transform of the two-dimensional
convolution of two functions u and v is the product of the forward or inverse two-dimensional
Fourier transforms of u and v. Because no restrictions are placed on the nature of u and v, other
than that they are transformable, there are also no restrictions on the nature of their $U^{(\pm)}$ and $V^{(\pm)}$
transforms. This means we can think of $U^{(\pm)}$ and $V^{(\pm)}$ as arbitrary transformable functions. The
$(\pm)$ superscripts on U and V in Eqs. (2.113a)–(2.113c) then just tell us that, according to Eqs.
(2.112b) and (2.112c),

\[ U^{(\pm)}(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, u(x, y) \]

and

\[ V^{(\pm)}(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, v(x, y) . \]

We already know this, however, from looking at Eqs. (2.113b) and (2.113c): just take the
opposite-sign Fourier transform of both sides. Hence, we can drop the $(\pm)$ superscripts on U and
V in Eqs. (2.113a)–(2.113c) as long as $(\mp)$ superscripts are added to u and v to distinguish
between the two choices of sign in (2.113b) and (2.113c). Now Eqs. (2.113a)–(2.113c) become

\[ \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\, e^{\mp 2\pi i (x\zeta + y\eta)}\, \left[ U(\zeta, \eta) \cdot V(\zeta, \eta) \right] = u^{(\mp)}(x, y) ** v^{(\mp)}(x, y) , \tag{2.114a} \]

where

\[ u^{(\mp)}(x, y) = \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\, e^{\mp 2\pi i (x\zeta + y\eta)}\, U(\zeta, \eta) \tag{2.114b} \]

and

\[ v^{(\mp)}(x, y) = \int_{-\infty}^{\infty} d\zeta \int_{-\infty}^{\infty} d\eta\, e^{\mp 2\pi i (x\zeta + y\eta)}\, V(\zeta, \eta) . \tag{2.114c} \]

The letters used to label the functions and variables are, of course, arbitrary, so nothing stops us
from interchanging the letters u and U, v and V, x and ζ, y and η, and the vertical order of the ±
signs to get

\[ \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, \left[ u(x, y) \cdot v(x, y) \right] = U^{(\pm)}(\zeta, \eta) ** V^{(\pm)}(\zeta, \eta) , \tag{2.115a} \]

where

\[ U^{(\pm)}(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, u(x, y) \tag{2.115b} \]

and

\[ V^{(\pm)}(\zeta, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{\pm 2\pi i (x\zeta + y\eta)}\, v(x, y) . \tag{2.115c} \]

Equations (2.115a)–(2.115c) are the other half of the two-dimensional Fourier convolution
theorem—they show that the forward or inverse two-dimensional Fourier transform of the
product of two functions u and v is the two-dimensional convolution of the forward or inverse
two-dimensional Fourier transforms of u and v.
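Both halves of the two-dimensional convolution theorem have discrete counterparts that can be checked numerically. The Python sketch below is an illustration only: it swaps the integral transform for NumPy's discrete transform `np.fft.fft2` and the infinite-range convolution for a circular one on a small periodic grid. The helper `circular_conv2d` and the random test arrays are hypothetical names and data introduced for the example.

```python
import numpy as np

def circular_conv2d(a, b):
    """Direct circular 2-D convolution: out[x, y] = sum over (x', y') of a[x', y'] * b[x - x', y - y']."""
    out = np.zeros(a.shape, dtype=complex)
    for xp in range(a.shape[0]):
        for yp in range(a.shape[1]):
            # np.roll places b[(x - xp) % N0, (y - yp) % N1] at position [x, y]
            out += a[xp, yp] * np.roll(b, shift=(xp, yp), axis=(0, 1))
    return out

rng = np.random.default_rng(1)
u = rng.standard_normal((8, 8))
v = rng.standard_normal((8, 8))
U, V = np.fft.fft2(u), np.fft.fft2(v)

# Transform of a convolution equals the product of the transforms.
err_conv = np.max(np.abs(np.fft.fft2(circular_conv2d(u, v)) - U * V))

# Transform of a product equals the convolution of the transforms
# (the 1/u.size factor comes from the unnormalized discrete transform).
err_prod = np.max(np.abs(np.fft.fft2(u * v) - circular_conv2d(U, V) / u.size))
print(err_conv, err_prod)
```

The explicit double loop is slow but mirrors the defining sum directly, which makes it a reasonable reference against which to compare the transform-based result.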
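The three-dimensional convolution is written using the symbol $***$ and defined to be

\[ u(x, y, z) *** v(x, y, z) = \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy' \int_{-\infty}^{\infty} dz'\, u(x', y', z')\, v(x - x', y - y', z - z') \tag{2.116a} \]

or

\[ u(\vec{r}) *** v(\vec{r}) = \iiint_{-\infty}^{\infty} d^3 r'\; u(\vec{r}\,')\, v(\vec{r} - \vec{r}\,') . \tag{2.116b} \]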

Using three-dimensional vector notation, the three-dimensional convolution has the same
commutative, associative, and linearity properties as the two-dimensional convolution, as can be
seen by returning to Eqs. (2.111c)–(2.111f), mentally adding an extra  , an extra integral sign,
and replacing all the superscript 2’s by superscript 3’s.

G G G G
u ( ( )  v( ( ) v( ( )  u ( ( ) , (2.117a)

G G G G G G
u ( ( )  v( ( )  h( ( ) u ( ( )  v( ( )  h( ( ) , (2.117b)

G G G G G G G
u ( ( )   v( ( )   h( ( )   u ( ( )  v( ( )    u ( ( )  h( ( )  , (2.117c)

and
G G G G G G G
 v( ( )   h( ( )  u ( ( )  v( ( )  u ( ( )    h( ( )  u ( ( ) . (2.117d)


Looking carefully at the variable manipulations used to derive Eqs. (2.112a)–(2.112c), the first
half of the two-dimensional Fourier convolution theorem, we see that working with an extra
product z0 in the exponent of e and an extra integration over dz does not affect the end result.
We can therefore say that

5 5 5

³ dx ³ dy ³ dz e
92& i ( x.  y!  z0 )
[u ( x, y, z )  v( x, y, z )]
5 5 5
(2.118a)
(9) (9)
U (. ,! , 0 ) A V (. ,! , 0 ) ,

where
5 5 5
U ( 9 ) (. ,! , 0 ) ³ dx ³ dy ³ dz e
92& i ( x.  y!  z0 )
u ( x, y , z ) (2.118b)
5 5 5

and
5 5 5

³ dx ³ dy ³ dz e
(9) 92& i ( x.  y!  z0 )
V (. ,! , 0 ) v ( x, y , z ) . (2.118c)
5 5 5
The argument about relabeling the functions and variables used to go from (2.112a)–(2.112c) to
(2.115a)–(2.115c) works equally well here, giving us at once the other half of the three-
dimensional Fourier convolution theorem,

5 5 5

³ dx ³ dy ³ dz e
92& i ( x.  y!  z0 )
u ( x, y , z ) A v ( x, y , z )
5 5 5
(2.119a)
U ( 9 ) (. ,! , 0 ) V ( 9 ) (. ,! , 0 ) ,

where
5 5 5

³ dx ³ dy ³ dz e
(9) 92& i ( x.  y!  z0 )
U (. ,! , 0 ) u ( x, y , z ) (2.119b)
5 5 5

and
5 5 5
V ( 9 ) (. ,! , 0 ) ³ dx ³ dy ³ dz e
92& i ( x.  y!  z0 )
v ( x, y , z ) . (2.119c)
5 5 5

One last matter of notation worth mentioning is that we can create two-dimensional and three-
dimensional delta functions from the products of the already-discussed one-dimensional delta
function:

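\[ \delta(\vec{\rho}) = \delta(x) \cdot \delta(y) \tag{2.120a} \]

and

\[ \delta(\vec{r}) = \delta(x) \cdot \delta(y) \cdot \delta(z) . \tag{2.120b} \]

For any two-dimensional continuous function u(x,y), we have

\[
\begin{aligned}
\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, u(x, y)\, \delta(x - x_o)\, \delta(y - y_o)
&= \int_{-\infty}^{\infty} dx\, \delta(x - x_o) \int_{-\infty}^{\infty} dy\, u(x, y)\, \delta(y - y_o) \\
&= \int_{-\infty}^{\infty} dx\, \delta(x - x_o)\, u(x, y_o) = u(x_o, y_o) ;
\end{aligned}
\tag{2.121a}
\]

and similarly for any continuous three-dimensional function v(x, y, z), we have

\[
\begin{aligned}
\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\, & v(x, y, z)\, \delta(x - x_o)\, \delta(y - y_o)\, \delta(z - z_o) \\
&= \int_{-\infty}^{\infty} dx\, \delta(x - x_o) \int_{-\infty}^{\infty} dy\, v(x, y, z_o)\, \delta(y - y_o) \\
&= \int_{-\infty}^{\infty} dx\, \delta(x - x_o)\, v(x, y_o, z_o) = v(x_o, y_o, z_o) .
\end{aligned}
\tag{2.121b}
\]

These equations can be written in vector notation as

\[ \iint_{-\infty}^{\infty} d^2\rho\; u(\vec{\rho})\, \delta(\vec{\rho} - \vec{\rho}_o) = u(\vec{\rho}_o) \tag{2.121c} \]

and

\[ \iiint_{-\infty}^{\infty} d^3 r\; v(\vec{r})\, \delta(\vec{r} - \vec{r}_o) = v(\vec{r}_o) . \tag{2.121d} \]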

Combining Eq. (2.71f) for the one-dimensional delta function with Eqs. (2.120a) and (2.120b),
we see that in two dimensions

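\[ \int_{-\infty}^{\infty} d\zeta\, e^{\pm 2\pi i x\zeta} \int_{-\infty}^{\infty} d\eta\, e^{\pm 2\pi i y\eta} = \iint_{-\infty}^{\infty} d^2 q\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}} = \delta(\vec{\rho}) = \delta(x) \cdot \delta(y) \tag{2.122a} \]

using the vector notation $\vec{q} = \zeta\hat{x} + \eta\hat{y}$; and in three dimensions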

5 5 5
G
³ d. e ³ d! e ³ d0 e
92& ix. 92& iy! 92& iz0
 (r )  ( x) A  ( y ) A  ( z )
5 5 5
5
(2.122b)
G G

³ ³³d
3 92& ir = s
se
5

G
using the vector notation s . xˆ  ! yˆ  0 zˆ .

__________

This chapter provides both an intuitive understanding and a rigorous explanation of how
Fourier transforms work. Sine and cosine transforms are introduced as a way to measure how
much functions resemble sine and cosine curves, and these transforms are then combined to
create the standard complex Fourier transform. We describe convolutions and how they produce
new functions by blurring old ones. The Fourier convolution theorem—whose importance is
difficult to overstate—directly connects the convolution to Fourier-transform theory. Generalized
limits are explained to show in what sense some of the more puzzling functions found in lists of
Fourier transforms belong there, and a brief outline of generalized functions is presented to show
how delta functions can be described without making them sound like obvious nonsense.
Computers use discrete Fourier transforms to handle Fourier calculations, and we explain how
the discrete Fourier transform can be used to approximate the integral Fourier transform. The
discrete Fourier transform produces aliasing; we show when aliasing is desirable, when it is not
desirable, and when it can be neglected. All the major concepts explained in this chapter—the
linearity of the Fourier transform, the linearity of the convolution, the Fourier convolution
theorem, the idea of even and odd functions, and the delta function—have important roles to play
in the pages that follow.

Table 2.1

        U(f) = F^(-ift)(u(t))                   u(t) = F^(ift)(U(f))

(1)     [real, even]                            [real, even]
        Im(U(f)) = 0, U(-f) = U(f)              Im(u(t)) = 0, u(-t) = u(t)

(2)     [imag., even]                           [imag., even]
        Re(U(f)) = 0, U(-f) = U(f)              Re(u(t)) = 0, u(-t) = u(t)

(3)     [real, odd]                             [imag., odd]
        Im(U(f)) = 0, U(-f) = -U(f)             Re(u(t)) = 0, u(-t) = -u(t)

(4)     [imag., odd]                            [real, odd]
        Re(U(f)) = 0, U(-f) = -U(f)             Im(u(t)) = 0, u(-t) = -u(t)

(5)     [complex, even]                         [complex, even]
        Re(U(f)) ≠ 0 for some f                 Re(u(t)) ≠ 0 for some t
        Im(U(f)) ≠ 0 for some f                 Im(u(t)) ≠ 0 for some t
        U(-f) = U(f)                            u(-t) = u(t)

(6)     [complex, odd]                          [complex, odd]
        Re(U(f)) ≠ 0 for some f                 Re(u(t)) ≠ 0 for some t
        Im(U(f)) ≠ 0 for some f                 Im(u(t)) ≠ 0 for some t
        U(-f) = -U(f)                           u(-t) = -u(t)

(7)     [Hermitian]                             [real]
        U(-f) = U(f)*                           Im(u(t)) = 0

(8)     [real]                                  [Hermitian]
        Im(U(f)) = 0                            u(-t) = u(t)*

(9)     [anti-Hermitian]                        [imag.]
        U(-f) = -U(f)*                          Re(u(t)) = 0

(10)    [imag.]                                 [anti-Hermitian]
        Re(U(f)) = 0                            u(-t) = -u(t)*

(11)    [complex, no symmetry]                  [complex, no symmetry]

Table 2.2

        A_k = (1/T) ∫_0^T e^(-2πi(k/T)t) v(t) dt        v(t) = Σ_(k=-∞)^(∞) A_k e^(2πik(t/T))

(1)     [real, even]                            [real, even]
        Im(A_k) = 0, A_(-k) = A_k               Im(v(t)) = 0, v(-t) = v(t)

(2)     [imag., even]                           [imag., even]
        Re(A_k) = 0, A_(-k) = A_k               Re(v(t)) = 0, v(-t) = v(t)

(3)     [real, odd]                             [imag., odd]
        Im(A_k) = 0, A_(-k) = -A_k              Re(v(t)) = 0, v(-t) = -v(t)

(4)     [imag., odd]                            [real, odd]
        Re(A_k) = 0, A_(-k) = -A_k              Im(v(t)) = 0, v(-t) = -v(t)

(5)     [complex, even]                         [complex, even]
        Re(A_k) ≠ 0 for some k                  Re(v(t)) ≠ 0 for some t
        Im(A_k) ≠ 0 for some k                  Im(v(t)) ≠ 0 for some t
        A_(-k) = A_k                            v(-t) = v(t)

(6)     [complex, odd]                          [complex, odd]
        Re(A_k) ≠ 0 for some k                  Re(v(t)) ≠ 0 for some t
        Im(A_k) ≠ 0 for some k                  Im(v(t)) ≠ 0 for some t
        A_(-k) = -A_k                           v(-t) = -v(t)

(7)     [Hermitian]                             [real]
        A_(-k) = A_k*                           Im(v(t)) = 0

(8)     [real]                                  [Hermitian]
        Im(A_k) = 0                             v(-t) = v(t)*

(9)     [anti-Hermitian]                        [imag.]
        A_(-k) = -A_k*                          Re(v(t)) = 0

(10)    [imag.]                                 [anti-Hermitian]
        Re(A_k) = 0                             v(-t) = -v(t)*

(11)    [complex, no symmetry]                  [complex, no symmetry]
3
RANDOM VARIABLES, RANDOM
FUNCTIONS, AND POWER SPECTRA
Engineers and scientists are taught many statistical concepts in school, but all too often this is
done in an informal manner that does a good job of explaining how to eliminate random errors
and noise from real experimental data and a poor job of explaining how to analyze random errors
and noise in physical models. Understanding the correct way to represent random errors and
noise requires formal knowledge of the statistical concepts used to describe random signals;
otherwise, basic equations can be misunderstood and misused. For this reason, we here take a
more formal approach to the subject. Starting off with an explanation of the basics—random
functions, independent and dependent random variables, the expectation operator E , stationarity
and ergodicity—that do not require the Fourier theory discussed in the previous chapter, we then
move on to topics that do, such as autocorrelation functions, white noise, the noise-power
spectrum, and the Wiener-Khinchin theorem. The techniques explained in this chapter are used a
few times in the next chapter during the derivation of the Michelson interference equations and
then over and over again in Chapters 6, 7, and 8 to analyze the random errors and noise found in
Michelson systems.

3.1 Random and Nonrandom Variables


Random variables can be thought of as uncontrolled variables and nonrandom variables can be
thought of as controlled variables. When, for example, a computer program is being written, the
programmer controls the values of nonrandom program variables using inputs or lines of code,
but the programmer has no desire to control the program’s random variables—a pseudo-random
number generator gives them values instead. In a similar spirit, a statistician constructing a set of
model equations always ends up controlling the nonrandom variables—either directly by saying
this variable can be measured like this and that variable can be measured like that, or indirectly,
by saying these variables must solve that set of equations. Even when a statistician plots a
function against its argument, the graph is constructed by specifying the argument’s values and
then calculating the function according to its definition, which puts both the nonrandom argument
and the nonrandom value of the function under the statistician’s control. The statistician always,
on the other hand, treats random variables in a model as if they cannot be controlled. They must
be handled as if coins will be flipped, dice rolled, or needles spun on dials to determine their
values after the model is written down. All the statistician can know is the probability this
random variable takes on that value and the probability that random variable takes on this value;


that is, he knows what the chances are that the coins, dice, or needles return one set of numbers
rather than another. Most scientists and engineers do not pay much attention to the difference
between controlled and uncontrolled variables—perhaps because most of their “controlled”
variables are usually a little “uncontrolled” in the sense that they come from imperfectly accurate
measurements—but it is very convenient when analyzing a statistical model to keep careful track
of this distinction. To help us remember which variables are random and which are not, we put a
wavy line or tilde over the random variables while writing the nonrandom variables in the usual
way. As an example of how this looks, we note that $u$, $a_0$, and $z'$ are all nonrandom variables
whereas $\tilde{u}$, $\tilde{a}_0$, and $\tilde{z}'$ are all random.

3.2 Random and Nonrandom Functions


When the argument of a function is a random variable, the value of the function is also random.
If, for example, $\tilde{x}$ is a random variable and f is a function, then

\[ \tilde{y} = f(\tilde{x}) \tag{3.1a} \]

is another random variable. To give an example of how this works, we create a nonrandom time
variable t and a random angular frequency $\tilde{\omega}$, multiply them together and take the sine of their
product to get

\[ \tilde{y} = \sin(\tilde{\omega} t) . \tag{3.1b} \]

The value of $\tilde{y}$ is clearly uncontrolled; for each unpredictable value of $\tilde{\omega}$ at time t, there is a
corresponding unpredictable number $\tilde{y}$ that is given by $\sin(\tilde{\omega} t)$. This example also shows that
when a function has several arguments, its value becomes random even when only one of the
arguments is random. In Eq. (3.1b) the sine of $\tilde{\omega} t$, regarded as a function of both $\tilde{\omega}$ and t, is
random even though only one of its arguments, $\tilde{\omega}$, is random.
Many times when a function has multiple arguments, the controlled argument or arguments
are more interesting than the uncontrolled argument or arguments that make the function random.
One way to handle this situation is to list only the nonrandom arguments and say that what we
have is a random function with nonrandom arguments. To show what is going on, we put a wavy
line over the function name, indicating that even though all the listed arguments are nonrandom,
the function itself is random. If, for example, we are only interested in the nonrandom time t, we
could define

\[ \tilde{R}(t) = \sin(\tilde{\omega} t) \tag{3.2a} \]

to be a random function of the nonrandom variable t. Now whenever there is a list of time values
$t_1$, $t_2$, …, there is a corresponding list of random variables

\[ \tilde{u}_1 = \tilde{R}(t_1) = \sin(\tilde{\omega} t_1) , \quad \tilde{u}_2 = \tilde{R}(t_2) = \sin(\tilde{\omega} t_2) , \quad \ldots \tag{3.2b} \]

Although Eq. (3.2b) implicitly assumes a list of distinct and separate t values, this reasoning still
holds up when t is explicitly made a continuous variable. Nothing, for example, stops us from
saying that for each value of t between $-\infty$ and $+\infty$, there corresponds a different random variable

\[ \tilde{u}_t = \tilde{R}(t) = \sin(\tilde{\omega} t) . \tag{3.2c} \]
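The distinction between the controlled argument t and the uncontrolled frequency can be mimicked in a short simulation. The Python sketch below is only an illustration: the text does not say how the random angular frequency is distributed, so the uniform range used here, the time grid, and the helper name `draw_realization` are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
t = np.linspace(0.0, 10.0, 201)     # nonrandom (controlled) time values

def draw_realization(rng, t):
    """Return one random draw of omega and the matching realization of sin(omega * t)."""
    omega = rng.uniform(0.5, 2.0)   # assumed distribution; the text does not specify one
    return omega, np.sin(omega * t)

omega_1, R_1 = draw_realization(rng, t)
omega_2, R_2 = draw_realization(rng, t)
print(omega_1, omega_2, R_1[:3], R_2[:3])
```

Each call produces a different curve over the same time values, which is the sense in which a random function of a nonrandom argument assigns a random variable to every t.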

The idea of a random function of nonrandom arguments becomes more attractive when there is
no realistic possibility of analyzing the effect of multiple random arguments on a single
nonrandom function. We might, for example, know exactly how N random parameters r1 , r2 , …,
rN interact to cause an error e in an electrical signal s at time t. This lets us write the error as a
nonrandom function
e(t , r1 , r2 ,… , rN ) .

Rather than investigating how r1, r2, …, rN are behaving, it usually makes more sense to say that
there is a random noise
\[
\tilde{n}(t) = e(t, \tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N) \tag{3.3a}
\]
contaminating electrical signal s. Now we can put the error into our model as a random function ñ
that depends on a nonrandom parameter t instead of as a nonrandom function e that depends on t
and N random parameters r1, r2, …, rN. Sometimes the signal s in our model depends on more
than one nonrandom parameter, such as the x, y coordinates of an image point at time t. If the
corresponding error e in the signal s depends on x, y, and t as well as the random parameters r1,
r2, …, rN, then we can say there is a random noise

\[
\tilde{n}(x, y, t) = e(x, y, t, \tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N) \tag{3.3b}
\]

contaminating signal s(x, y, t). Note that we can think in terms of a signal noise ñ(t) or ñ(x,y,t)
even when we are not sure what random arguments r1 , r2 , …, rN make the nonrandom function e
behave randomly. This is, of course, why the idea of a random function is so useful. In this book,
we use the term “random function” to refer to what statisticians often prefer to call a random or
stochastic process.
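
The idea behind Eqs. (3.2a) through (3.2c) can be sketched in a few lines of Python. This is only an illustration: the uniform distribution chosen for the random argument, the name `make_realization`, and the sample times are assumptions made for the example and do not come from the text. Each draw of the random argument produces an ordinary, nonrandom function of t; the randomness lies in which function we get.

```python
import math
import random

def make_realization(rng):
    """Draw one value of the random argument omega and return the
    resulting ordinary (nonrandom) function of t, R(t) = sin(omega * t)."""
    # Assumed distribution for omega, chosen only for illustration.
    omega = rng.uniform(1.0, 2.0)
    return lambda t: math.sin(omega * t)

rng = random.Random(0)          # fixed seed so the run is repeatable
times = [0.0, 0.5, 1.0]         # a list of nonrandom time values t1, t2, t3

# Three realizations of the random function, each evaluated at the same times.
realizations = [make_realization(rng) for _ in range(3)]
table = [[R(t) for t in times] for R in realizations]

for row in table:
    print(["%.3f" % value for value in row])
```

Reading down a column of `table` gives three samples of the random variable attached to one fixed time, which is the sense in which a list of times corresponds to a list of random variables.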

3 · Random Variables, Random Functions, and Power Spectra

3.3 Probability Density Distributions: Mean, Variance, Standard Deviation
With every random variable $\tilde{r}$, we associate a nonrandom probability density distribution $p_{\tilde{r}}(x)$
such that $p_{\tilde{r}}(x)\,dx$ is the probability that the random variable $\tilde{r}$ takes on a value between x and
x + dx. The nonrandom argument x of $p_{\tilde{r}}$ is a dummy variable, and nothing stops us from calling
it r instead; in fact, that is the convention. The usual way to introduce a probability density
distribution for a random variable $\tilde{r}$ is to say that $p_{\tilde{r}}(r)\,dr$ is the probability that $\tilde{r}$ takes on a
value between r and r + dr. The dummy argument of a probability density distribution p must be
nonrandom, and the subscript of the probability density distribution p must be random; the
subscript, after all, labels p to show which random variable is being described. Since $\tilde{r}$ must
always take on some value between −∞ and +∞, the sum of all the probabilities $p_{\tilde{r}}(r)\,dr$
between −∞ and +∞ must always be one. Consequently, for any probability density distribution
$p_{\tilde{r}}(r)$, we have

\[
\int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, dr = 1 . \tag{3.4}
\]

For Eq. (3.4) to make sense, the probability density distribution $p_{\tilde{r}}(r)$ must be defined for all r
between −∞ and +∞ with the understanding that

\[
p_{\tilde{r}}(r) = 0
\]

for those values of r to which the random variable $\tilde{r}$ can never be equal.
The predicted average or mean value of $\tilde{r}$ can be written as

\[
\mu_{\tilde{r}} = \int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, r\, dr . \tag{3.5a}
\]

Note that $\mu_{\tilde{r}}$, just like $p_{\tilde{r}}$, is nonrandom even though it has a random subscript. The predicted
variance of $\tilde{r}$, which is defined to be the predicted average or mean squared difference between
$\tilde{r}$ and $\mu_{\tilde{r}}$, is another nonrandom quantity

\[
v_{\tilde{r}} = \int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, (r - \mu_{\tilde{r}})^2\, dr . \tag{3.5b}
\]

Many people prefer to characterize a random number $\tilde{r}$ by its standard deviation $\sigma_{\tilde{r}}$ instead of its
variance $v_{\tilde{r}}$. The standard deviation of a random number $\tilde{r}$ is defined to be the square root of the
variance,

\[
\sigma_{\tilde{r}} = \sqrt{v_{\tilde{r}}} . \tag{3.5c}
\]

Of course $\sigma_{\tilde{r}}$, like $v_{\tilde{r}}$, is a nonrandom quantity. In general, the probability density distribution $p_{\tilde{r}}$
lets us find the predicted average or mean value of any nonrandom function f of the random
variable $\tilde{r}$ by calculating the nonrandom quantity

\[
\text{predicted mean value of } f = \int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, f(r)\, dr . \tag{3.5d}
\]

When $f(r) = r$, this equation reduces to formula (3.5a) for $\mu_{\tilde{r}}$; and when $f(r) = (r - \mu_{\tilde{r}})^2$, this
equation reduces to formula (3.5b) for $v_{\tilde{r}}$.
Many random variables found in nature appear to obey a Gaussian, or “normal,” probability
distribution:
\[
p_{\tilde{r}}(r) = \frac{1}{\sigma_{\tilde{r}}\sqrt{2\pi}}\; e^{-\frac{(r-\mu_{\tilde{r}})^2}{2\sigma_{\tilde{r}}^2}} . \tag{3.6a}
\]

This can in part be explained as a consequence of the central limit theorem,25 which is described
in Sec. 3.11 below. It is easy to show that parameter $\mu_{\tilde{r}}$ in Eq. (3.6a) is the mean of the Gaussian
distribution. Consulting formula (3.5a) above, we see that the mean of the distribution in (3.6a)
must be
\[
\int_{-\infty}^{\infty} \frac{r}{\sigma_{\tilde{r}}\sqrt{2\pi}}\; e^{-\frac{(r-\mu_{\tilde{r}})^2}{2\sigma_{\tilde{r}}^2}}\, dr
= \frac{1}{\sigma_{\tilde{r}}\sqrt{2\pi}} \int_{-\infty}^{\infty} (r' + \mu_{\tilde{r}})\; e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' , \tag{3.6b}
\]

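where on the right-hand side the variable of integration is changed to $r' = r - \mu_{\tilde{r}}$. This becomes,
consulting Eq. (7A.3d) in Appendix 7A of Chapter 7,
\[
\begin{aligned}
\frac{1}{\sigma_{\tilde{r}}\sqrt{2\pi}} \int_{-\infty}^{\infty} (r' + \mu_{\tilde{r}})\; e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr'
&= \frac{1}{\sigma_{\tilde{r}}\sqrt{2\pi}} \int_{-\infty}^{\infty} r'\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr'
 + \frac{\mu_{\tilde{r}}}{\sigma_{\tilde{r}}\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' \\
&= \frac{1}{\sigma_{\tilde{r}}\sqrt{2\pi}} \int_{-\infty}^{\infty} r'\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' + \mu_{\tilde{r}} \cdot 1 .
\end{aligned}
\tag{3.6c}
\]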

25 Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. (McGraw-Hill, Inc., New York, 1991), p. 214.

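If we replace $r'$ by $-r'$ in
\[
g(r') = r'\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}} ,
\]
it is the same as multiplying g by −1, which makes g an odd function [see Eq. (2.11b) in Chapter
2]. Hence, according to Eq. (2.17) in Chapter 2,

\[
\int_{-\infty}^{\infty} r'\, e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' = 0
\]

because it is the integral of an odd function between −∞ and +∞. Therefore, Eq. (3.6c) simplifies
to
\[
\frac{1}{\sigma_{\tilde{r}}\sqrt{2\pi}} \int_{-\infty}^{\infty} (r' + \mu_{\tilde{r}})\; e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' = \mu_{\tilde{r}} , \tag{3.6d}
\]
which can be substituted back into (3.6b) to get

\[
\int_{-\infty}^{\infty} \frac{r}{\sigma_{\tilde{r}}\sqrt{2\pi}}\; e^{-\frac{(r-\mu_{\tilde{r}})^2}{2\sigma_{\tilde{r}}^2}}\, dr = \mu_{\tilde{r}} . \tag{3.6e}
\]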

This shows that, as claimed above, parameter $\mu_{\tilde{r}}$ is the mean of the probability distribution
specified in Eq. (3.6a). It is just as easy to show that $\sigma_{\tilde{r}}$ is the standard deviation of the
distribution in (3.6a). From (3.5b) we know that the variance of this distribution is

\[
\int_{-\infty}^{\infty} \frac{(r-\mu_{\tilde{r}})^2}{\sigma_{\tilde{r}}\sqrt{2\pi}}\; e^{-\frac{(r-\mu_{\tilde{r}})^2}{2\sigma_{\tilde{r}}^2}}\, dr
= \int_{-\infty}^{\infty} \frac{(r')^2}{\sigma_{\tilde{r}}\sqrt{2\pi}}\; e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr'
\]

when the variable of integration is changed to $r' = r - \mu_{\tilde{r}}$. According to Eq. (7A.3b) in Appendix
7A of Chapter 7, we can write
\[
\int_{-\infty}^{\infty} \frac{(r')^2}{\sigma_{\tilde{r}}\sqrt{2\pi}}\; e^{-\frac{(r')^2}{2\sigma_{\tilde{r}}^2}}\, dr' = \sigma_{\tilde{r}}^2 . \tag{3.6f}
\]

Consequently, $\sigma_{\tilde{r}}^2$ is the variance of this probability density distribution. The square root of the
variance is the standard deviation according to (3.5c). Hence, it is, as claimed, easy to see that $\sigma_{\tilde{r}}$
is the standard deviation of the probability density distribution in Eq. (3.6a).
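
The results in Eqs. (3.6e) and (3.6f), together with the normalization in Eq. (3.4), can be checked numerically. The sketch below is only a rough check under stated assumptions: the parameter values, the midpoint integration rule, the helper names `gaussian_pdf` and `expectation`, and the cutoff at ten standard deviations are choices made for this illustration rather than anything prescribed by the text.

```python
import math

def gaussian_pdf(r, mu, sigma):
    """Gaussian probability density of Eq. (3.6a)."""
    return math.exp(-(r - mu) ** 2 / (2.0 * sigma ** 2)) / (sigma * math.sqrt(2.0 * math.pi))

def expectation(f, pdf, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of pdf(r) * f(r) over [lo, hi],
    standing in for the infinite-range integral of Eq. (3.5d)."""
    dr = (hi - lo) / n
    total = 0.0
    for k in range(n):
        r = lo + (k + 0.5) * dr
        total += pdf(r) * f(r)
    return total * dr

mu, sigma = 1.5, 0.7                      # arbitrary illustrative parameters
pdf = lambda r: gaussian_pdf(r, mu, sigma)
lo, hi = mu - 10.0 * sigma, mu + 10.0 * sigma   # tails beyond this are negligible

total = expectation(lambda r: 1.0, pdf, lo, hi)                   # Eq. (3.4)
mean = expectation(lambda r: r, pdf, lo, hi)                      # Eq. (3.5a)
variance = expectation(lambda r: (r - mean) ** 2, pdf, lo, hi)    # Eq. (3.5b)
std_dev = math.sqrt(variance)                                     # Eq. (3.5c)

print(total, mean, std_dev)
```

Within the accuracy of the numerical integration, `total` should come out close to one, `mean` close to `mu`, and `std_dev` close to `sigma`.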


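When $\tilde{r}$ can only take on the values r1, r2, …, rN, then $p_{\tilde{r}}$ can be written as a sum of delta
functions. If, for example, p1 is the probability that $\tilde{r}$ is r1, p2 is the probability that $\tilde{r}$ is r2, …,
pN is the probability that $\tilde{r}$ is rN, then

\[
p_{\tilde{r}}(r) = \sum_{k=1}^{N} p_k \cdot \delta(r - r_k) . \tag{3.7a}
\]

The integral for the predicted mean value of $\tilde{r}$ in Eq. (3.5a) now reduces to

\[
\mu_{\tilde{r}} = \int_{-\infty}^{\infty} \Big[ \sum_{k=1}^{N} p_k \cdot \delta(r - r_k) \Big]\, r\, dr
= \sum_{k=1}^{N} p_k \int_{-\infty}^{\infty} \delta(r - r_k)\, r\, dr
= \sum_{k=1}^{N} p_k r_k \tag{3.7b}
\]

as we expect. Similarly, according to Eq. (3.5b), the predicted variance of $\tilde{r}$ becomes

\[
\begin{aligned}
v_{\tilde{r}} &= \int_{-\infty}^{\infty} \Big[ \sum_{k=1}^{N} p_k \cdot \delta(r - r_k) \Big] (r - \mu_{\tilde{r}})^2\, dr
= \sum_{k=1}^{N} p_k \int_{-\infty}^{\infty} \delta(r - r_k)\, (r - \mu_{\tilde{r}})^2\, dr \\
&= \sum_{k=1}^{N} p_k (r_k - \mu_{\tilde{r}})^2 ;
\end{aligned}
\tag{3.7c}
\]

and, according to Eq. (3.5d), the predicted mean value of $f(\tilde{r})$ becomes

\[
\int_{-\infty}^{\infty} \Big[ \sum_{k=1}^{N} p_k \cdot \delta(r - r_k) \Big] f(r)\, dr
= \sum_{k=1}^{N} p_k \int_{-\infty}^{\infty} f(r)\, \delta(r - r_k)\, dr
= \sum_{k=1}^{N} p_k f(r_k) . \tag{3.7d}
\]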

Again, the integral formulas reduce to the correct probability-weighted sums. Looking at the
limiting case where N = 1 and p1 = 1, we get

\[
p_{\tilde{r}}(r) = \delta(r - r_1)
\]
so that
\[
\mu_{\tilde{r}} = \int_{-\infty}^{\infty} \delta(r - r_1)\, r\, dr = r_1 \tag{3.7e}
\]

and the variance about $\mu_{\tilde{r}} = r_1$ is

\[
v_{\tilde{r}} = \int_{-\infty}^{\infty} (r - r_1)^2\, \delta(r - r_1)\, dr = (r_1 - r_1)^2 = 0 . \tag{3.7f}
\]

Results (3.7e) and (3.7f) show that the value of r is now completely controlled; it must be equal
to r1 and no longer needs to be treated like a random variable. Hence, the limiting case where
N = 1 and p1 = 1 can be regarded as changing a random variable into a nonrandom variable.
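
The probability-weighted sums in Eqs. (3.7b) through (3.7d), and the limiting case of Eqs. (3.7e) and (3.7f), are easy to try out. The sketch below assumes a fair six-sided die as the discrete random variable; the die, the helper name `discrete_expectation`, and the use of exact fractions are choices made for this example only.

```python
from fractions import Fraction

def discrete_expectation(f, values, probs):
    """Probability-weighted sum of f(r_k), as in Eq. (3.7d)."""
    return sum(p * f(r) for r, p in zip(values, probs))

# Assumed example: a fair die, so r_k = 1..6 and every p_k = 1/6.
values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

mean = discrete_expectation(lambda r: r, values, probs)                    # Eq. (3.7b)
variance = discrete_expectation(lambda r: (r - mean) ** 2, values, probs)  # Eq. (3.7c)
print(mean, variance)   # 7/2 35/12

# Limiting case N = 1, p1 = 1: the "random" variable is fully controlled.
single_mean = discrete_expectation(lambda r: r, [4], [Fraction(1)])
single_variance = discrete_expectation(lambda r: (r - single_mean) ** 2, [4], [Fraction(1)])
print(single_mean, single_variance)   # 4 0
```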

3.4 The Expectation Operator


Statisticians avoid the mathematical awkwardness of probability density distributions and their
associated integrals by defining an expectation operator E. For any nonrandom function f with a
random argument $\tilde{x}$, we say that
\[
E\big( f(\tilde{x}) \big)
\]
is the predicted mean, or average, value of $f(\tilde{x})$. We also call $E\big(f(\tilde{x})\big)$ the expectation value of
$f(\tilde{x})$. Mathematically we define

\[
E\big( f(\tilde{x}) \big) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, f(x)\, dx . \tag{3.8a}
\]

Just like before, $p_{\tilde{x}}(x)\,dx$ is the probability that the random variable $\tilde{x}$ takes on a value between
x and x + dx. We can find $E(\tilde{x})$, the expectation value of $\tilde{x}$, by choosing $f(\tilde{x}) = \tilde{x}$ in Eq. (3.8a)
to get

\[
E(\tilde{x}) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, x\, dx . \tag{3.8b}
\]

Comparing this to Eq. (3.5a) above, we see that the expectation value of $\tilde{x}$ is the same as the
predicted mean or average value of $\tilde{x}$,

\[
E(\tilde{x}) = \mu_{\tilde{x}} , \tag{3.8c}
\]

which makes good intuitive sense. Choosing $f(\tilde{x}) = (\tilde{x} - \mu_{\tilde{x}})^2$ gives

\[
E\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, (x - \mu_{\tilde{x}})^2\, dx . \tag{3.8d}
\]

Comparing this to Eq. (3.5b) above, we see that $E\big((\tilde{x} - \mu_{\tilde{x}})^2\big)$ is the variance of $\tilde{x}$,

\[
v_{\tilde{x}} = E\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) . \tag{3.8e}
\]

A notation often used for the variance of $\tilde{x}$ instead of $v_{\tilde{x}}$ is

\[
\mathrm{Var}(\tilde{x}) = E\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) . \tag{3.8f}
\]

When the E operator is applied to any sort of random variable or function, for example
$f(\tilde{x})$, the result is always a nonrandom variable or function, namely

\[
\int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, f(x)\, dx .
\]

For example, the characteristic function $\Phi_{\tilde{x}}$ of a random variable $\tilde{x}$, which is the nonrandom
Fourier transform of the probability density distribution of $\tilde{x}$,

\[
\Phi_{\tilde{x}}(\nu) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, e^{-2\pi i \nu x}\, dx , \tag{3.9a}
\]

can be written as, using the E operator,

\[
\Phi_{\tilde{x}}(\nu) = E\big( e^{-2\pi i \nu \tilde{x}} \big) . \tag{3.9b}
\]

To specify what happens when E is applied to a nonrandom variable c, we set up a random
variable $\tilde{\rho}$ that has the probability density distribution

\[
p_{\tilde{\rho}}(\rho) = \delta(\rho - c) . \tag{3.9c}
\]

According to the discussion following Eqs. (3.7e,f) above, this makes $\tilde{\rho}$ equivalent to the
nonrandom variable c. Consequently, we can say that

\[
E(c) = E(\tilde{\rho}) \tag{3.9d}
\]
and use Eq. (3.8b) above to get

\[
E(c) = \int_{-\infty}^{\infty} p_{\tilde{\rho}}(\rho)\, \rho\, d\rho = \int_{-\infty}^{\infty} \delta(\rho - c)\, \rho\, d\rho = c . \tag{3.9e}
\]

This justifies the general rule, which also makes good intuitive sense, that

\[
E(c) = c \tag{3.9f}
\]
for any nonrandom quantity c.
The expectation operator E can be applied to multiple random variables at the same time; all
that we need is the appropriate probability density distribution. Suppose, for example, that the
behavior of two random variables $\tilde{x}$ and $\tilde{X}$ is described by a two-argument probability density
distribution $p_{\tilde{x}\tilde{X}}(x, X)$, with $p_{\tilde{x}\tilde{X}}(x, X)\, dx\, dX$ being the probability that the random variable $\tilde{x}$
takes on a value between x and x + dx while the random variable $\tilde{X}$ takes on a value between X
and X + dX. No matter what the behavior of random variables $\tilde{x}$ and $\tilde{X}$, we can always
construct an appropriate probability density distribution $p_{\tilde{x}\tilde{X}}$. Since $\tilde{x}$ and $\tilde{X}$ must always take
on some values in the intervals

\[
-\infty < \tilde{x} < \infty \quad \text{and} \quad -\infty < \tilde{X} < \infty ,
\]

the same reasoning used to produce Eq. (3.4) now shows that

\[
\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\; p_{\tilde{x}\tilde{X}}(x, X) = 1 \tag{3.10a}
\]

for any probability density distribution $p_{\tilde{x}\tilde{X}}$. The expectation value of any function of the random
variables $\tilde{x}$ and $\tilde{X}$, such as $f(\tilde{x}, \tilde{X})$, is defined to be

\[
E\big( f(\tilde{x}, \tilde{X}) \big) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\; p_{\tilde{x}\tilde{X}}(x, X)\, f(x, X) . \tag{3.10b}
\]

In particular, we can always set $f(\tilde{x}, \tilde{X}) = \tilde{x}\tilde{X}$ to get the expected value of the random variables’
product,

\[
E(\tilde{x}\tilde{X}) = \int_{-\infty}^{\infty} x\, dx \int_{-\infty}^{\infty} dX\; X\, p_{\tilde{x}\tilde{X}}(x, X) . \tag{3.10c}
\]


3.5 Independent and Dependent Random Variables


When comparing two random variables such as x and X , one of the first questions that arises is
whether they are dependent or independent. When two random variables are dependent, the
random variables influence each other; and when two random variables are independent, they do
not.
Independent random variables are used to describe random quantities for which no cause-and-
effect relationship can be found. When, for example, we pick a car randomly from all the cars
sold in a given year, there is no reason to expect that the random variable representing the
brightness of the car’s headlights is associated with any particular value of the random variable
representing the car’s length. Lacking any evidence to the contrary, then, we say that these two
random variables ought to be independent. Similarly, if we pick someone at random from a
collection of adults, there is no obvious reason to assume that the random variable representing
the person’s yearly income is associated with any particular value of the person’s shoe size.
Again, we might assume that these are independent random variables. In general, when there is
no reason to connect the values of random quantities, we set them up in our models as
independent random variables.
Many times random variables turn out to be dependent in surprising ways. Returning to the
first of the previous examples, when we examine the connection between a car’s length and the
brightness of its headlights, it might turn out that very short cars are more likely to be European
sports cars frequently washed by their owners, making them more likely to have cleaner and thus
brighter headlights. Similarly, returning to the second example, a person’s shoe size and height
are connected; and statisticians have in fact shown that tall people, who are more likely to wear
large shoes, are also more likely to earn large incomes (if only because people living in the
United States, Australia, Canada, and Europe are more likely to be tall). Just as in these two
examples, many random variables that look like they ought to be unconnected and independent
turn out, after closer examination, to be dependent; in this sense, the independence of random
variables is the ideal case from which realistic random variables tend to deviate to a greater or
lesser degree.

3.6 Analyzing Independent Random Variables


When $\tilde{x}$ and $\tilde{X}$ are independent random variables, their probability density distribution can be
written as26

\[
p_{\tilde{x}\tilde{X}}(x, X) = p_{\tilde{x}}(x) \cdot p_{\tilde{X}}(X) , \tag{3.11a}
\]

where $p_{\tilde{x}}$ and $p_{\tilde{X}}$ are the standard probability density distributions for $\tilde{x}$ and $\tilde{X}$ when $\tilde{x}$ and $\tilde{X}$
are treated as solitary random variables. This means that $p_{\tilde{x}}(x)\,dx$ is the probability that $\tilde{x}$ lies
between x and x + dx regardless of the value of $\tilde{X}$, and $p_{\tilde{X}}(X)\,dX$ is the probability that $\tilde{X}$ lies
between X and X + dX regardless of the value of $\tilde{x}$. We see that, according to Eqs. (3.10c) and
(3.11a), the expectation value of the product $\tilde{x}\tilde{X}$ of two independent random variables is

\[
\begin{aligned}
E(\tilde{x}\tilde{X}) &= \int_{-\infty}^{\infty} x\, dx \int_{-\infty}^{\infty} dX\; X\, p_{\tilde{x}\tilde{X}}(x, X)
= \int_{-\infty}^{\infty} x\, dx \int_{-\infty}^{\infty} dX\; X\, p_{\tilde{x}}(x)\, p_{\tilde{X}}(X) \\
&= \Big[ \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, x\, dx \Big] \cdot \Big[ \int_{-\infty}^{\infty} p_{\tilde{X}}(X)\, X\, dX \Big] .
\end{aligned}
\]

According to Eqs. (3.8b) and (3.8c), this can be written as

\[
E(\tilde{x}\tilde{X}) = E(\tilde{x}) \cdot E(\tilde{X}) \tag{3.11b}
\]
or
\[
E(\tilde{x}\tilde{X}) = \mu_{\tilde{x}}\, \mu_{\tilde{X}} . \tag{3.11c}
\]

26 Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 132.
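
Equation (3.11b) can be illustrated with a small simulation. In the sketch below, the two distributions (uniform on 0 to 2 and Gaussian with mean 3), the sample size, and the seed are assumptions chosen for the example. Sample averages stand in for the expectation operator, so the two sides agree only to within statistical scatter.

```python
import random

rng = random.Random(1)      # fixed seed so the run is repeatable
n = 200_000                 # number of independent (x, X) pairs

# Two independently generated random variables (assumed distributions).
xs = [rng.uniform(0.0, 2.0) for _ in range(n)]   # mean 1
Xs = [rng.gauss(3.0, 1.0) for _ in range(n)]     # mean 3

mean_x = sum(xs) / n
mean_X = sum(Xs) / n
mean_product = sum(a * b for a, b in zip(xs, Xs)) / n

# For independent variables the two printed numbers should nearly agree.
print(mean_product, mean_x * mean_X)
```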

3.7 Large Numbers of Random Variables


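Our analysis of two random variables can be extended in a straightforward way to large
collections of random variables. If there are N random variables $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$, then we can
always construct a probability density distribution

\[
p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N)
\]
such that
\[
p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N)\, dx_1\, dx_2 \cdots dx_N
\]

is the probability that $\tilde{x}_1$ lies between $x_1$ and $x_1 + dx_1$, that $\tilde{x}_2$ lies between $x_2$ and $x_2 + dx_2$, …,
that $\tilde{x}_N$ lies between $x_N$ and $x_N + dx_N$. The expectation value of any function $f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)$ of
these N random variables is

\[
E\big( f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \big)
= \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\;
f(x_1, x_2, \ldots, x_N)\, p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N) . \tag{3.12a}
\]

Note that nothing has been said so far about the connections between these N random variables;
they could be either dependent or independent. If we now assume that these N random variables
are all independent with respect to one another, then

\[
p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N) = p_{\tilde{x}_1}(x_1)\, p_{\tilde{x}_2}(x_2) \cdots p_{\tilde{x}_N}(x_N) , \tag{3.12b}
\]

where $p_{\tilde{x}_1}(x_1)\,dx_1$ is the probability that $\tilde{x}_1$ lies between $x_1$ and $x_1 + dx_1$ regardless of the values
of the other N − 1 random variables, $p_{\tilde{x}_2}(x_2)\,dx_2$ is the probability that $\tilde{x}_2$ lies between $x_2$ and
$x_2 + dx_2$ regardless of the values of the other N − 1 random variables, …, $p_{\tilde{x}_N}(x_N)\,dx_N$ is the
probability that $\tilde{x}_N$ lies between $x_N$ and $x_N + dx_N$ regardless of the values of the other N − 1
random variables. The expectation value of the product of these N random variables can now be
written as, setting $f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) = \tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N$ in Eq. (3.12a),

\[
\begin{aligned}
E(\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N)
&= \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\;
[x_1 x_2 \cdots x_N]\, p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N) \\
&= \int_{-\infty}^{\infty} p_{\tilde{x}_1}(x_1)\, x_1\, dx_1 \int_{-\infty}^{\infty} p_{\tilde{x}_2}(x_2)\, x_2\, dx_2 \cdots \int_{-\infty}^{\infty} p_{\tilde{x}_N}(x_N)\, x_N\, dx_N .
\end{aligned}
\]

Again, we consult Eqs. (3.8b) and (3.8c) to get

\[
E(\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N) = E(\tilde{x}_1)\, E(\tilde{x}_2) \cdots E(\tilde{x}_N) \tag{3.12c}
\]
or
\[
E(\tilde{x}_1 \tilde{x}_2 \cdots \tilde{x}_N) = \mu_{\tilde{x}_1}\, \mu_{\tilde{x}_2} \cdots \mu_{\tilde{x}_N} . \tag{3.12d}
\]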

3.8 Single-Variable Means from Multivariable Distributions


We can calculate the predicted mean values of $\tilde{x}$ and $\tilde{X}$ by choosing $f(\tilde{x}, \tilde{X}) = \tilde{x}$ and
$f(\tilde{x}, \tilde{X}) = \tilde{X}$ in Eq. (3.10b) above. This gives
\[
\mu_{\tilde{x}} = E(\tilde{x}) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\; x\, p_{\tilde{x}\tilde{X}}(x, X) \tag{3.13a}
\]
and
\[
\mu_{\tilde{X}} = E(\tilde{X}) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\; X\, p_{\tilde{x}\tilde{X}}(x, X) . \tag{3.13b}
\]

Writing the double integrals as

\[
E(\tilde{x}) = \int_{-\infty}^{\infty} x \Big[ \int_{-\infty}^{\infty} p_{\tilde{x}\tilde{X}}(x, X)\, dX \Big] dx \tag{3.13c}
\]
and
\[
E(\tilde{X}) = \int_{-\infty}^{\infty} X \Big[ \int_{-\infty}^{\infty} p_{\tilde{x}\tilde{X}}(x, X)\, dx \Big] dX , \tag{3.13d}
\]

we compare them to the formula for the expected value of a random variable given in Eq. (3.8b).
This comparison suggests that, if we want to specify the behavior of one random variable while
disregarding the presence of the other, we can construct the single-argument probability density
distributions of $\tilde{x}$ and $\tilde{X}$ by writing

\[
p_{\tilde{x}}(x) = \int_{-\infty}^{\infty} p_{\tilde{x}\tilde{X}}(x, X)\, dX \tag{3.13e}
\]
and
\[
p_{\tilde{X}}(X) = \int_{-\infty}^{\infty} p_{\tilde{x}\tilde{X}}(x, X)\, dx . \tag{3.13f}
\]

Up to this point, none of the integrations have required assumptions about the dependence or
independence of the random variables, so Eqs. (3.13e) and (3.13f) hold true both for dependent
and independent random variables $\tilde{x}$ and $\tilde{X}$. If we specify that $\tilde{x}$ and $\tilde{X}$ are independent, then
Eq. (3.11a) can be substituted into (3.13e) and (3.13f) to get

\[
p_{\tilde{x}}(x) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, p_{\tilde{X}}(X)\, dX = p_{\tilde{x}}(x) \int_{-\infty}^{\infty} p_{\tilde{X}}(X)\, dX
\]
and
\[
p_{\tilde{X}}(X) = \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, p_{\tilde{X}}(X)\, dx = p_{\tilde{X}}(X) \int_{-\infty}^{\infty} p_{\tilde{x}}(x)\, dx .
\]

Glancing back at Eq. (3.4), we note that these last two equalities are trivially true, because in both
cases the right-most integrals must be one.
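
A discrete analogue of Eqs. (3.13a) through (3.13f) may make the construction more concrete. In the sketch below, sums over a small table of joint probabilities replace the integrals over the joint density; the table entries and variable names are invented for the illustration and describe a pair of dependent variables.

```python
from fractions import Fraction as F

x_values = [0, 1, 2]
X_values = [10, 20]

# joint[i][j] = probability that x equals x_values[i] while X equals X_values[j].
# The entries are made up for this example and sum to one, as in Eq. (3.10a).
joint = [[F(1, 10), F(2, 10)],
         [F(3, 10), F(1, 10)],
         [F(1, 10), F(2, 10)]]

# Single-variable distributions: sum the joint table over the other variable,
# the discrete counterpart of Eqs. (3.13e) and (3.13f).
p_x = [sum(row) for row in joint]
p_X = [sum(joint[i][j] for i in range(len(x_values))) for j in range(len(X_values))]

# Mean of x computed two ways: from the joint table, Eq. (3.13a),
# and from the single-variable distribution, Eq. (3.8b).
mean_x_joint = sum(x_values[i] * joint[i][j]
                   for i in range(len(x_values)) for j in range(len(X_values)))
mean_x_single = sum(x * p for x, p in zip(x_values, p_x))

print(p_x, p_X, mean_x_joint, mean_x_single)
```

The two means agree even though the table does not factor into a product of its single-variable distributions, in line with the remark that these formulas hold for dependent and independent variables alike.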

3.9 Analyzing Dependent Random Variables


Having found formulas for $\mu_{\tilde{x}}$ and $\mu_{\tilde{X}}$ that hold true for any pair of dependent or independent
random variables $\tilde{x}$ and $\tilde{X}$, we now use $\mu_{\tilde{x}}$ and $\mu_{\tilde{X}}$ to define a new random variable

\[
\tilde{y} = (\tilde{x} - \mu_{\tilde{x}})(\tilde{X} - \mu_{\tilde{X}}) . \tag{3.14a}
\]

From Eq. (3.8c), we know that

\[
E(\tilde{y}) = E\big( (\tilde{x} - \mu_{\tilde{x}})(\tilde{X} - \mu_{\tilde{X}}) \big) \tag{3.14b}
\]

is just the predicted average value of $\tilde{y}$. We can imagine, each time we acquire a random pair of
$\tilde{x}$ and $\tilde{X}$ values, comparing the sizes of $\tilde{x}$ and $\tilde{X}$ to their respective averages $\mu_{\tilde{x}}$ and $\mu_{\tilde{X}}$ by
subtracting $\mu_{\tilde{x}}$ and $\mu_{\tilde{X}}$ from them. If $\tilde{x}$ and $\tilde{X}$ are both simultaneously greater than, or both
simultaneously less than, their averages, then $\tilde{y}$ is positive; and if one is greater than its average
when the other is less than its average, then $\tilde{y}$ is negative. If there is a tendency for one of the
random variables to exceed its average whenever the other exceeds its average, or a tendency for
one of the random variables to fall below its average whenever the other falls below its average,
then $\tilde{y}$ has a greater probability of being positive than negative, so

\[
E(\tilde{y}) > 0 .
\]

If, on the other hand, there is a tendency for one of the random variables to exceed its average
when the other falls below its average, then $\tilde{y}$ has a greater probability of being negative than
positive, so
\[
E(\tilde{y}) < 0 .
\]

If $E(\tilde{y})$ is zero, it indicates that $\tilde{y}$ is just as likely to be negative as positive, which means that
knowing one variable lies above or below its average tells us nothing about the likelihood that the
other variable lies above or below its average. Writing out the integral formula for $E(\tilde{y})$ in terms
of the probability density distribution $p_{\tilde{x}\tilde{X}}(x, X)$ gives

\[
E(\tilde{y}) = E\big( (\tilde{x} - \mu_{\tilde{x}})(\tilde{X} - \mu_{\tilde{X}}) \big)
= \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\; [(x - \mu_{\tilde{x}})(X - \mu_{\tilde{X}})]\, p_{\tilde{x}\tilde{X}}(x, X) . \tag{3.14c}
\]

We say that the value of the integral in Eq. (3.14c) measures the covariance of random variables
$\tilde{x}$ and $\tilde{X}$. When
\[
E(\tilde{y}) = E\big( (\tilde{x} - \mu_{\tilde{x}})(\tilde{X} - \mu_{\tilde{X}}) \big)
\]
is greater than zero, $\tilde{x}$ and $\tilde{X}$ are said to be positively correlated; when it is less than zero, $\tilde{x}$ and
$\tilde{X}$ are said to be negatively correlated; and when it equals zero, $\tilde{x}$ and $\tilde{X}$ are said to be
uncorrelated.
Evaluating $E(\tilde{y})$ and finding it not equal to zero is a standard way of showing that two
random variables $\tilde{x}$ and $\tilde{X}$ are correlated and so cannot be independent. We cannot, however,
say that $\tilde{x}$ and $\tilde{X}$ are independent just because $E(\tilde{y})$ is zero; that is, saying that $\tilde{x}$ and $\tilde{X}$ are
uncorrelated is a weaker statement than saying that $\tilde{x}$ and $\tilde{X}$ are independent. To show why this
is so, we set up a random variable $\tilde{\varphi}$ which has a probability density distribution

\[
p_{\tilde{\varphi}}(\varphi) =
\begin{cases}
1/(2\pi) & \text{for } 0 \le \varphi < 2\pi \\
0 & \text{for } \varphi < 0 \text{ or } \varphi \ge 2\pi
\end{cases}
. \tag{3.15a}
\]

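The probability density distribution $p_{\tilde{\varphi}}$ shows that $\tilde{\varphi}$ is equally likely to take on any value
between zero and 2π, and that $\tilde{\varphi}$ never takes on values less than zero or greater than 2π. We next
define two random variables $\tilde{u}$ and $\tilde{v}$ such that

\[
\tilde{u} = \sin(\tilde{\varphi}) \tag{3.15b}
\]
and
\[
\tilde{v} = \cos(\tilde{\varphi}) . \tag{3.15c}
\]

It follows that
\[
\mu_{\tilde{u}} = E(\tilde{u}) = E(\sin\tilde{\varphi}) = \int_{-\infty}^{\infty} p_{\tilde{\varphi}}(\varphi)\, \sin(\varphi)\, d\varphi
= \frac{1}{2\pi} \int_{0}^{2\pi} \sin(\varphi)\, d\varphi = 0 , \tag{3.15d}
\]

and similar reasoning shows that

\[
\mu_{\tilde{v}} = E(\tilde{v}) = \frac{1}{2\pi} \int_{0}^{2\pi} \cos(\varphi)\, d\varphi = 0 . \tag{3.15e}
\]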

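Note that

\[
\begin{aligned}
E\big( (\tilde{u} - \mu_{\tilde{u}})(\tilde{v} - \mu_{\tilde{v}}) \big) &= E(\tilde{u}\tilde{v}) = E\big( (\sin\tilde{\varphi})(\cos\tilde{\varphi}) \big) \\
&= \frac{1}{2\pi} \int_{0}^{2\pi} \sin(\varphi)\cos(\varphi)\, d\varphi \\
&= \frac{1}{4\pi} \int_{0}^{2\pi} \sin(2\varphi)\, d\varphi = 0 ,
\end{aligned}
\tag{3.15f}
\]

which means that $\tilde{u}$ and $\tilde{v}$ are uncorrelated random variables. On the other hand, we also know
that
\[
\tilde{u}^2 + \tilde{v}^2 = \sin^2\tilde{\varphi} + \cos^2\tilde{\varphi} = 1 ,
\]

which means that whenever $\tilde{u}$ takes on a particular random value, say 1/2, then $\tilde{v}$ must take on
one of the two random values
\[
\pm\sqrt{1 - (1/2)^2} = \pm\sqrt{3}/2 .
\]

Consequently, $\tilde{u}$ and $\tilde{v}$ are by no means independent random variables even though by definition
they are uncorrelated random variables.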
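
The sine and cosine example can be explored numerically. The sketch below draws the angle uniformly between zero and 2π and uses sample averages in place of the expectation operator; the sample size, the seed, and the particular dependence check (comparing the average of u²v² with the product of the averages of u² and v², which would agree if the joint density factored as in Eq. (3.11a)) are choices made for this illustration.

```python
import math
import random

rng = random.Random(2)      # fixed seed so the run is repeatable
n = 100_000

phis = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
us = [math.sin(phi) for phi in phis]     # u = sin(phi), Eq. (3.15b)
vs = [math.cos(phi) for phi in phis]     # v = cos(phi), Eq. (3.15c)

def sample_mean(values):
    return sum(values) / len(values)

mean_u = sample_mean(us)
mean_v = sample_mean(vs)
# Sample estimate of the covariance in Eq. (3.15f); it should be near zero.
covariance = sample_mean([(u - mean_u) * (v - mean_v) for u, v in zip(us, vs)])

# Dependence check: these two numbers would agree for independent variables.
mean_u2v2 = sample_mean([(u * v) ** 2 for u, v in zip(us, vs)])
product_of_means = sample_mean([u * u for u in us]) * sample_mean([v * v for v in vs])

print(covariance, mean_u2v2, product_of_means)
```

The covariance estimate stays near zero, while the last two numbers settle near 1/8 and 1/4 respectively, which shows the dependence that the covariance alone fails to reveal.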

3.10 Linearity of the Expectation Operator


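The expectation operator is linear with respect to all random quantities. To see why, we take any
two functions f and g whose arguments are the N random variables $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$ and multiply
them by two nonrandom variables α and β. The expectation operator E applied to

\[
\alpha f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) + \beta g(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)
\]

then gives, according to Eq. (3.12a) above,

\[
\begin{aligned}
&E\big( \alpha f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) + \beta g(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \big) \\
&\quad = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\;
[\alpha f(x_1, x_2, \ldots, x_N) + \beta g(x_1, x_2, \ldots, x_N)]\, p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N) \\
&\quad = \alpha \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\;
f(x_1, x_2, \ldots, x_N)\, p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N) \\
&\qquad + \beta \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\;
g(x_1, x_2, \ldots, x_N)\, p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N) \\
&\quad = \alpha\, E\big( f(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \big) + \beta\, E\big( g(\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \big) .
\end{aligned}
\tag{3.16a}
\]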


Note that in the last step Eq. (3.12a) is applied again to return to the expectation operator.
According to Eq. (2.32a) in Chapter 2, the definition of a linear operator L is that

\[
L(\alpha f + \beta g) = \alpha L(f) + \beta L(g) \tag{3.16b}
\]

for any two functions f, g and any two constants α, β. When we think of the nonrandom variables
α and β as “constants,” we see that Eqs. (3.16a) and (3.16b) provide plenty of justification for
calling the expectation operator E a linear operator with respect to all random quantities.
The linearity of E can be used to show that multiplying any random variable $\tilde{x}$ by a
nonrandom parameter α results in the mean of $\tilde{x}$ being multiplied by α and the variance of $\tilde{x}$
being multiplied by α². Starting with Eq. (3.8c), we multiply both sides by α to get

\[
\alpha E(\tilde{x}) = \alpha \mu_{\tilde{x}} . \tag{3.16c}
\]

Because E is linear, $E(\alpha\tilde{x}) = \alpha E(\tilde{x})$, which means that Eq. (3.16c) can be written as

\[
E(\alpha \tilde{x}) = \alpha \mu_{\tilde{x}} . \tag{3.16d}
\]

This shows that multiplying $\tilde{x}$ by α changes its average value from $\mu_{\tilde{x}}$ to $\alpha\mu_{\tilde{x}}$. As for the
variance $v_{\tilde{x}}$ of random variable $\tilde{x}$, according to Eq. (3.8e) we have

\[
E\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) = v_{\tilde{x}} \tag{3.16e}
\]

from the definition of the variance of $\tilde{x}$. Multiplying both sides by α² gives

\[
\alpha^2 E\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) = \alpha^2 v_{\tilde{x}} . \tag{3.16f}
\]
Again the linearity of E lets us write

\[
\alpha^2 E\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) = E\big( \alpha^2 (\tilde{x} - \mu_{\tilde{x}})^2 \big) ,
\]

and taking α inside the square gives

\[
\alpha^2 E\big( (\tilde{x} - \mu_{\tilde{x}})^2 \big) = E\big( (\alpha\tilde{x} - \alpha\mu_{\tilde{x}})^2 \big) .
\]

This can be substituted into (3.16f) to get

\[
E\big( (\alpha\tilde{x} - \alpha\mu_{\tilde{x}})^2 \big) = \alpha^2 v_{\tilde{x}} . \tag{3.16g}
\]

Since $\alpha\tilde{x}$ is the new random variable which comes from multiplying $\tilde{x}$ by α and [according to
Eq. (3.16d)] the quantity $\alpha\mu_{\tilde{x}}$ is the mean of this new random variable, we now realize,
consulting the definition of the variance in Eq. (3.8e), that $E\big((\alpha\tilde{x} - \alpha\mu_{\tilde{x}})^2\big)$ must be the variance
of the new random variable $\alpha\tilde{x}$. Equation (3.16e) reminds us that $v_{\tilde{x}}$ is the variance of the old
random variable $\tilde{x}$. Hence, Eq. (3.16g) states that if $\tilde{x}$ is multiplied by α then its variance must
be multiplied by α².
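
The scaling rules in Eqs. (3.16d) and (3.16g) can be confirmed exactly on a small discrete distribution, using the probability-weighted sums of Sec. 3.3 in place of the integrals. The three-valued distribution, the value of α, and the helper name `mean_and_variance` are assumptions made for this example.

```python
from fractions import Fraction as F

def mean_and_variance(values, probs):
    """Mean and variance as probability-weighted sums, Eqs. (3.7b) and (3.7c)."""
    mean = sum(p * r for r, p in zip(values, probs))
    variance = sum(p * (r - mean) ** 2 for r, p in zip(values, probs))
    return mean, variance

# Assumed example distribution: x takes the values 0, 1, 4.
values = [0, 1, 4]
probs = [F(1, 2), F(1, 4), F(1, 4)]
alpha = -3                      # any nonrandom multiplier would do

mean_x, var_x = mean_and_variance(values, probs)
scaled_values = [alpha * r for r in values]   # values taken by alpha * x
mean_ax, var_ax = mean_and_variance(scaled_values, probs)

print(mean_x, var_x)      # 5/4 43/16
print(mean_ax, var_ax)    # -15/4 387/16
```

The mean is multiplied by α and the variance by α², so a negative α flips the sign of the mean but leaves the variance positive.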
The expectation operator usually can be moved inside an integral over a nonrandom variable.
Suppose function f depends on one nonrandom variable z in addition to N random variables
$\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$. Then, again using Eq. (3.12a), the expectation value of the integral

\[
\int_{z_A}^{z_B} f(z, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)\, dz
\]

is
\[
E\Big( \int_{z_A}^{z_B} f(z, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)\, dz \Big)
= \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\;
p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N) \int_{z_A}^{z_B} f(z, x_1, x_2, \ldots, x_N)\, dz .
\]

As long as we can interchange the order of these integrations, which is almost always allowed
when dealing with physically realistic integrals, the expectation value can also be written as

\[
E\Big( \int_{z_A}^{z_B} f(z, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)\, dz \Big)
= \int_{z_A}^{z_B} dz \Big[ \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\;
p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, x_2, \ldots, x_N)\, f(z, x_1, x_2, \ldots, x_N) \Big] .
\]

This can, again applying Eq. (3.12a), be written as

\[
E\Big( \int_{z_A}^{z_B} f(z, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)\, dz \Big)
= \int_{z_A}^{z_B} E\big( f(z, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \big)\, dz . \tag{3.17a}
\]


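The same reasoning can be extended to M integrals over M nonrandom variables $z_1, z_2, \ldots, z_M$.
We have

\[
\begin{aligned}
&E\Big( \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\;
f(z_1, z_2, \ldots, z_M, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \Big) \\
&\quad = \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_N\; p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, \ldots, x_N)
\int_{z_{1A}}^{z_{1B}} dz_1 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\; f(z_1, \ldots, z_M, x_1, \ldots, x_N) \\
&\quad = \int_{z_{1A}}^{z_{1B}} dz_1 \cdots \int_{z_{MA}}^{z_{MB}} dz_M
\Big[ \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_N\; p_{\tilde{x}_1\tilde{x}_2\cdots\tilde{x}_N}(x_1, \ldots, x_N)\, f(z_1, \ldots, z_M, x_1, \ldots, x_N) \Big] ,
\end{aligned}
\]

which can also be written as

\[
\begin{aligned}
&E\Big( \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\;
f(z_1, z_2, \ldots, z_M, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \Big) \\
&\quad = \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\;
E\big( f(z_1, z_2, \ldots, z_M, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N) \big) .
\end{aligned}
\tag{3.17b}
\]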

The expectation operator can even be moved inside the integral of a random function

\[
\tilde{f}(z_1, z_2, \ldots, z_M) .
\]

According to our definition of a random function in Sec. 3.2 above, we have

\[
\tilde{f}(z_1, z_2, \ldots, z_M) = f(z_1, z_2, \ldots, z_M, \tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N)
\]

for some set of random variables $\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$. Hence, we can just suppress the random variables
$\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_N$ in Eq. (3.17b) to get

\[
\begin{aligned}
&E\Big( \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\; \tilde{f}(z_1, z_2, \ldots, z_M) \Big) \\
&\quad = \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\; E\big( \tilde{f}(z_1, z_2, \ldots, z_M) \big) .
\end{aligned}
\tag{3.17c}
\]

This result is referred to more than once in the following chapters.

3.11 The Central Limit Theorem


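The central limit theorem states that if there is a random variable $\tilde{s}_N$ equal to the sum of N
independent random variables $\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N$, then

\[
\tilde{s}_N = \tilde{r}_1 + \tilde{r}_2 + \cdots + \tilde{r}_N \tag{3.18a}
\]

has a probability density distribution $p_{\tilde{s}_N}(s_N)$ that resembles a Gaussian or normal probability
density distribution more and more as N gets large,

\[
p_{\tilde{s}_N}(s_N) \cong \frac{1}{\sigma_{\tilde{s}_N}\sqrt{2\pi}}\; e^{-\frac{(s_N - \mu_{\tilde{s}_N})^2}{2\sigma_{\tilde{s}_N}^2}} . \tag{3.18b}
\]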

In Eq. (3.18b), µ sN is the mean or average value of sN and σ sN is the standard deviation of sN
about its mean. Figure 3.1 is a plot of the Gaussian distribution specified on the right-hand side of
(3.18b). For large but finite values of N, this Gaussian distribution tends to be a relatively good
approximation of psN ( sN ) for sN values near the peak in Fig. 3.1 and a not-so-good
approximation of psN ( sN ) for sN values in the tails of Fig. 3.1—that is, for sN values far from
the peak.
[FIGURE 3.1. Plot of $p_{\tilde{s}_N}(s_N)$ versus $s_N$, with $\mu_{\tilde{s}_N}$ marked on the horizontal axis and $\sigma_{\tilde{s}_N}$ marked on either side of it.]

The mean of $\tilde{s}_N$ comes from applying the expectation operator E to both sides of Eq. (3.18a).
Remembering that E is linear with respect to random quantities [see Eq. (3.16a) above], we get

\[
E(\tilde{s}_N) = E(\tilde{r}_1 + \tilde{r}_2 + \cdots + \tilde{r}_N) = E(\tilde{r}_1) + E(\tilde{r}_2) + \cdots + E(\tilde{r}_N) ,
\]

which becomes, applying Eq. (3.8c) above,

\[
\mu_{\tilde{s}_N} = \mu_{\tilde{r}_1} + \mu_{\tilde{r}_2} + \cdots + \mu_{\tilde{r}_N} . \tag{3.19a}
\]

The variance of $\tilde{s}_N$ is, according to Eq. (3.8e),

\[
v_{\tilde{s}_N} = E\big( (\tilde{s}_N - \mu_{\tilde{s}_N})^2 \big) ,
\]

which becomes, after substituting from Eqs. (3.18a) and (3.19a),

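\[
v_{\tilde{s}_N} = E\Big( \Big( \sum_{j=1}^{N} \tilde{r}_j - \sum_{j=1}^{N} \mu_{\tilde{r}_j} \Big)^2 \Big)
= E\Big( \Big( \sum_{j=1}^{N} (\tilde{r}_j - \mu_{\tilde{r}_j}) \Big)^2 \Big) .
\]

Expanding the square inside the expectation operator gives

\[
v_{\tilde{s}_N} = E\Big( \sum_{j=1}^{N} (\tilde{r}_j - \mu_{\tilde{r}_j})^2
+ \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \ne j}}^{N} [(\tilde{r}_j - \mu_{\tilde{r}_j})(\tilde{r}_k - \mu_{\tilde{r}_k})] \Big) ,
\]

and the linearity of the expectation operator with respect to random quantities then lets us write
this as

\[
v_{\tilde{s}_N} = \sum_{j=1}^{N} E\big( (\tilde{r}_j - \mu_{\tilde{r}_j})^2 \big)
+ \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \ne j}}^{N} E\big( (\tilde{r}_j - \mu_{\tilde{r}_j})(\tilde{r}_k - \mu_{\tilde{r}_k}) \big) . \tag{3.19b}
\]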
j =1 j =1 k =1
k≠ j

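Since $\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N$ are independent random quantities, so must the random quantities $\tilde{r}_1 - \mu_{\tilde{r}_1}$,
$\tilde{r}_2 - \mu_{\tilde{r}_2}$, …, $\tilde{r}_N - \mu_{\tilde{r}_N}$ also be independent. Hence, according to Eq. (3.11b), we see that when
j ≠ k

\[
E\big( (\tilde{r}_j - \mu_{\tilde{r}_j})(\tilde{r}_k - \mu_{\tilde{r}_k}) \big) = E(\tilde{r}_j - \mu_{\tilde{r}_j}) \cdot E(\tilde{r}_k - \mu_{\tilde{r}_k}) . \tag{3.19c}
\]

But, applying the linearity of the expectation operator and Eqs. (3.8c) and (3.9f), we have

\[
E(\tilde{r}_j - \mu_{\tilde{r}_j}) = E(\tilde{r}_j) - E(\mu_{\tilde{r}_j}) = \mu_{\tilde{r}_j} - \mu_{\tilde{r}_j} = 0 .
\]

Consequently, Eq. (3.19c) becomes

\[
E\big( (\tilde{r}_j - \mu_{\tilde{r}_j})(\tilde{r}_k - \mu_{\tilde{r}_k}) \big) = 0 \tag{3.19d}
\]

when j ≠ k. Substituting this into (3.19b) gives

\[
v_{\tilde{s}_N} = \sum_{j=1}^{N} E\big( (\tilde{r}_j - \mu_{\tilde{r}_j})^2 \big) ,
\]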

which becomes, after applying Eq. (3.8e),

\[
v_{\tilde{s}_N} = v_{\tilde{r}_1} + v_{\tilde{r}_2} + \cdots + v_{\tilde{r}_N} , \tag{3.19e}
\]
where
\[
E\big( (\tilde{r}_j - \mu_{\tilde{r}_j})^2 \big) = v_{\tilde{r}_j} \tag{3.19f}
\]

is the variance of $\tilde{r}_j$ for j = 1, 2, …, N. The standard deviation of a random quantity is the square
root of its variance [see Eq. (3.5c)], so formulas (3.19e) and (3.19f) can also be written as

\[
\sigma_{\tilde{s}_N}^2 = \sigma_{\tilde{r}_1}^2 + \sigma_{\tilde{r}_2}^2 + \cdots + \sigma_{\tilde{r}_N}^2 , \tag{3.19g}
\]
where
\[
\sqrt{E\big( (\tilde{r}_j - \mu_{\tilde{r}_j})^2 \big)} = \sigma_{\tilde{r}_j} \tag{3.19h}
\]

is the standard deviation of $\tilde{r}_j$ for j = 1, 2, …, N and $\sigma_{\tilde{s}_N}$ is the standard deviation of $\tilde{s}_N$.
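
Equations (3.19a) and (3.19e) can be tried out with a short simulation. The three distributions below (uniform, Gaussian, and exponential), the sample size, and the seed are assumptions chosen for the illustration; sample averages stand in for the expectation operator, so agreement is only approximate.

```python
import random

rng = random.Random(5)      # fixed seed so the run is repeatable
n = 200_000                 # number of simulated sums

def draw_sum():
    """One sample of s = r1 + r2 + r3 with independently drawn terms."""
    r1 = rng.uniform(0.0, 1.0)      # mean 1/2, variance 1/12
    r2 = rng.gauss(0.0, 2.0)        # mean 0,   variance 4
    r3 = rng.expovariate(1.0)       # mean 1,   variance 1
    return r1 + r2 + r3

sums = [draw_sum() for _ in range(n)]
sample_mean = sum(sums) / n
sample_variance = sum((s - sample_mean) ** 2 for s in sums) / n

predicted_mean = 0.5 + 0.0 + 1.0                # Eq. (3.19a)
predicted_variance = 1.0 / 12.0 + 4.0 + 1.0     # Eq. (3.19e)

print(sample_mean, predicted_mean)
print(sample_variance, predicted_variance)
```

Note that the terms need not share a distribution for the means and the variances to add; only independence is used.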
Returning to the approximation in Eq. (3.18b) used to explain the central limit theorem, we
notice that some care must be exercised in interpreting the limit as N → ∞ ; in particular, it is
clear from Eqs. (3.19a) and (3.19g) that there is a tendency for both µ sN and σ sN to become large
without limit as N increases, making the expression on the right-hand side of (3.18b) difficult to
interpret in the limit of large N. The central limit theorem can be written in terms of a
mathematically well-defined limit as N → ∞ if we are careful how the arguments of the
Gaussian or normal distribution are defined. To state the central limit theorem precisely, we
define a new random variable
sN − µ sN
zN = (3.20a)
σ s N

that has a probability density distribution pzN ( z N ) . Now we can present the central limit theorem
exactly by stating that
1 − z2 / 2
lim ª¬ pzN ( z ) º¼ = e . (3.20b)
N →∞ 2π

The right-hand side of (3.20b) is the Gaussian or normal distribution introduced above in Eq.
(3.6a) where the random variable has a mean of zero and a standard deviation of one. For any
large but finite value of N, we can recover the approximation in (3.18b) by assuming that pzN is
near its limit and then replacing z in (3.20b) by $z_N$ as defined in (3.20a). [The extra factor of
$\sigma_{s_N}$ multiplying the $\sqrt{2\pi}$ on the right-hand side of (3.18b) can be regarded as
coming from Eq. (3.4) above; if it isn't there, then the integral of the probability density
distribution between $-\infty$ and $+\infty$ does not equal one.]
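A rough numerical illustration of (3.20a) and (3.20b), under assumptions that are not in the text: the $r_j$ are taken to be uniform on [0, 1), N is 30, and the function names are hypothetical. The sketch forms $z_N$ many times and compares the fraction of samples with $|z_N|$ below a limit to the same fraction for the unit Gaussian.

```python
import math
import random


def standardized_sums(n_terms, n_trials, seed=0):
    """Draw n_trials samples of z_N = (s_N - mu_sN) / sigma_sN for uniform r_j."""
    rng = random.Random(seed)
    mu_r, var_r = 0.5, 1.0 / 12.0            # mean and variance of uniform [0, 1)
    mu_s = n_terms * mu_r                     # mean of the sum
    sigma_s = math.sqrt(n_terms * var_r)      # Eq. (3.19g) with equal variances
    return [(sum(rng.random() for _ in range(n_terms)) - mu_s) / sigma_s
            for _ in range(n_trials)]


def fraction_within(z_values, limit):
    """Fraction of samples with |z| < limit."""
    return sum(1 for z in z_values if abs(z) < limit) / len(z_values)


def gaussian_fraction_within(limit):
    """Integral of exp(-z^2/2)/sqrt(2*pi) from -limit to +limit."""
    return math.erf(limit / math.sqrt(2.0))


if __name__ == "__main__":
    z = standardized_sums(30, 20000)
    for limit in (1.0, 2.0):
        print(limit, fraction_within(z, limit), gaussian_fraction_within(limit))
```

Because $z_N$ is rescaled to zero mean and unit standard deviation, the comparison stays meaningful however large N becomes, which is the point of defining $z_N$.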

3.12 Averaging to Improve Experimental Accuracy


It is now easy to explain why averaging together many identical but independent measurements
from the same experiment improves the accuracy of the result. Suppose N independent
measurements are to be averaged together this way. We can say that each measurement is an
independent random number rj for j = 1, 2,… , N having the same mean value µ, with µ taken to
be the true value of the experimental quantity being measured. Since the measurements are all
identical, all the rj have the same standard deviation σ due to the same sorts of random errors
occurring in each independent measurement. When all the experimental results are averaged, we
create a new random number—namely, the sum of all the rj divided by N. Let’s call this new
random number a N . The work done in the previous section lets us write this as [see Eq. (3.18a)]

$a_N = \dfrac{s_N}{N}.$ (3.21a)

Applying the expectation operator E to both sides gives, using the linearity of the expectation
operator (see Sec. 3.10 above),
$E(a_N) = \dfrac{1}{N}\, E(s_N).$ (3.21b)

Since E( sN ) = µ sN , Eq. (3.19a) shows that, since all the rj have the same mean value µ,

$E(s_N) = \mu_{r_1} + \mu_{r_2} + \cdots + \mu_{r_N} = N\mu.$ (3.21c)

Hence, Eq. (3.21b) now becomes


$E(a_N) = \dfrac{1}{N}\,(N\mu) = \mu.$ (3.21d)

Equation (3.21d) states that the expected value of the experimental average a N is µ, the true
value of the experimental quantity being measured. This is no great surprise, because the
averaging process would not make sense unless it were true. The typical size of the error left after
the rj are averaged together—that is, the amount by which a N is likely to be different from its
average value—is just its standard deviation [see Eqs. (3.5c) and (3.8e) above],


$\sigma_{a_N} = \sqrt{E\big((a_N - \mu)^2\big)},$

which can also be written as, after substituting from Eq. (3.21a) and using the linearity of the
expectation operator,

$\sigma_{a_N} = \sqrt{E\Big(\big(\tfrac{1}{N}\, s_N - \mu\big)^2\Big)} = \dfrac{1}{N}\sqrt{E\big((s_N - N\mu)^2\big)}.$ (3.21e)

According to (3.21c), $N\mu$ is the mean value of $s_N$, which makes

$E\big((s_N - N\mu)^2\big)$

the variance $v_{s_N}$ of $s_N$ [see Eq. (3.8e) above]. Hence, (3.21e) can be written as

$\sigma_{a_N} = \dfrac{1}{N}\sqrt{v_{s_N}} = \dfrac{1}{N}\sqrt{\sigma_{s_N}^2}$

because the variance is the square of the standard deviation σ sN . Substituting from (3.19g) now
gives
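$\sigma_{a_N} = \dfrac{1}{N}\sqrt{v_{s_N}} = \dfrac{1}{N}\sqrt{\sigma_{r_1}^2 + \sigma_{r_2}^2 + \cdots + \sigma_{r_N}^2}.$

As already mentioned above, we can assume that all the rj have the same standard deviation σ.
Hence,

$\sigma_{a_N} = \dfrac{1}{N}\sqrt{N\sigma^2} = \dfrac{\sigma}{\sqrt{N}}.$ (3.21f)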

This shows that when the standard deviation or expected error in one measurement is σ, then the
standard deviation or expected error in the average $a_N$ of N identical but independent
measurements is $\sigma/\sqrt{N}$, a significantly smaller number. Although we use several formulas from
the previous section on the central limit theorem to get this result, there is no assumption here
that the rj obey any particular probability density distribution. In order to derive Eqs. (3.21d) and
(3.21f), all that is needed is that the rj are independent and that the probability density
distributions of the rj have the same mean and standard deviation.
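The following sketch illustrates Eqs. (3.21d) and (3.21f) by simulation. Everything specific in it is an assumption made for the example: the true value `MU_TRUE`, the uniform error distribution on [-1, 1] (chosen deliberately to be non-Gaussian), and the function name. It averages N simulated measurements many times and reports the mean and standard deviation of the resulting averages.

```python
import math
import random
import statistics

MU_TRUE = 10.0                         # hypothetical true value being measured
SIGMA_SINGLE = 1.0 / math.sqrt(3.0)    # std of a uniform error on [-1, 1]


def average_statistics(n_meas, n_trials, seed=1):
    """Return (mean, std) of the average of n_meas independent measurements."""
    rng = random.Random(seed)
    averages = []
    for _ in range(n_trials):
        # Each measurement is the true value plus an independent uniform error.
        total = sum(MU_TRUE + rng.uniform(-1.0, 1.0) for _ in range(n_meas))
        averages.append(total / n_meas)
    return statistics.fmean(averages), statistics.pstdev(averages)


if __name__ == "__main__":
    for n in (1, 4, 25, 100):
        mean_a, std_a = average_statistics(n, 20000)
        print(n, round(mean_a, 4), round(std_a, 4),
              round(SIGMA_SINGLE / math.sqrt(n), 4))
```

The last two printed columns should track each other: the spread of the averages falls as $\sigma/\sqrt{N}$ even though the individual errors are uniformly rather than normally distributed.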
When spectrometers are used to make independent measurements of the same radiance


spectra, we can extend the above analysis to the spectral measurements by regarding the
independent but identical random variables rj as random functions of the spectral wavelength or
frequency, with different values of index j now representing different spectral curves from
independent spectral measurements. We can now repeat all the algebraic manipulations used in
(3.21a)–(3.21f) above while regarding every quantity except N as a function of the spectral
wavelength or frequency and end up with the same results. If, for example, the quantities are
regarded as functions of the spectral wavelength λ, then we just need to visualize a (λ)
immediately following the relevant variables. In a sense, all that is happening is that we have
decided to repeat the algebra of Eqs. (3.21a)–(3.21f) at each spectral wavelength. Equation
(3.21d), for example, becomes

E ( a N (λ ) ) = µ (λ ) , (3.22a)

showing that the point-by-point average of the $r_j(\lambda)$ spectral curves creates another curve
$a_N(\lambda)$ whose expected value is the true spectrum µ(λ). The average spectrum $a_N(\lambda)$ is
allowed to have a different expected value µ(λ) at each wavelength λ because it is now, of course,
taken to be a function of λ. Similarly Eq. (3.21f) becomes

$\sigma_{a_N}(\lambda) = \dfrac{\sigma(\lambda)}{\sqrt{N}}.$ (3.22b)

This shows that the expected error $\sigma_{a_N}(\lambda)$ at wavelength λ of the average spectrum
$a_N(\lambda)$ is smaller by a factor of $\sqrt{N}$ than the expected error σ(λ) at wavelength λ of a
single spectral measurement. The expected error σ(λ), just like the average µ(λ), is allowed to be
different at different wavelengths. As long as the expected value µ(λ) of $a_N(\lambda)$ is the true spectral curve, Eq.
(3.22b) shows that we can approach this true spectrum as closely as we desire—that is, make the
error in our point-by-point average spectrum arbitrarily small—by making N as large as
necessary.

3.13 Mean, Autocorrelation, Autocovariance of Random Functions of Time

Using the same notation as in the discussion following Eq. (3.2a) above, we write ñ(t) to
represent a random function ñ of a nonrandom time t. As we already mentioned at the end of Sec.
3.2, ñ(t) is often called a random or stochastic process. Having specified a random function—or
stochastic process or random process—called ñ(t), we know that for each time t there is a random
variable ñ(t); and when there are two different time values t1 and t2 with t1 ≠ t2, there is no reason
to expect the random variables ñ(t1) and ñ(t2) to behave the same way.


We also know the behavior of random variables can be described by probability density
distributions. Associated with any N sequential random variables ñ(t1), ñ(t2), ..., ñ(tN) specified
by the time values $t_1 < t_2 < \cdots < t_N$ there is a probability density distribution

$p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N),$

such that

$p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N)\, dn_1\, dn_2 \cdots dn_N$

is the probability first that ñ(t1) takes on a value between n1 and n1 + dn1, and then that ñ(t2)
takes on a value between n2 and n2 + dn2, and then that ñ(t3) takes on a value between n3 and
n3 + dn3, …, and then that ñ(tN) takes on a value between nN and nN + dnN. The expectation
operator E has the same meaning as before: the expected or mean value of any function f of the
N random variables ñ(t1), ñ(t2), ..., ñ(tN) is

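$E\big(f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N))\big) = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \cdots \int_{-\infty}^{\infty} dn_N \, f(n_1, n_2, \ldots, n_N)\, p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N).$ (3.23a)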

One of the most important expectation values associated with ñ occurs when we set N = 2 and
specify that
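$f\big(\tilde{n}(t_1), \tilde{n}(t_2)\big) = \tilde{n}(t_1) \cdot \tilde{n}(t_2)$

to get the autocorrelation function

$R_{\tilde{n}\tilde{n}}(t_1, t_2) = E\big(\tilde{n}(t_1) \cdot \tilde{n}(t_2)\big) = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \, [n_1 n_2]\, p_{\tilde{n}(t_1)\tilde{n}(t_2)}(n_1, n_2).$ (3.23b)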

Other important expectation values are the mean of ñ as a function of time,

$\mu_{\tilde{n}(t)} = E\big(\tilde{n}(t)\big) = \int_{-\infty}^{\infty} n\, p_{\tilde{n}(t)}(n)\, dn,$ (3.23c)

and the autocovariance of ñ,

$C_{\tilde{n}\tilde{n}}(t_1, t_2) = E\big((\tilde{n}(t_1) - \mu_{\tilde{n}(t_1)})(\tilde{n}(t_2) - \mu_{\tilde{n}(t_2)})\big) = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \, (n_1 - \mu_{\tilde{n}(t_1)})(n_2 - \mu_{\tilde{n}(t_2)})\, p_{\tilde{n}(t_1)\tilde{n}(t_2)}(n_1, n_2).$ (3.23d)

Clearly, when $\mu_{\tilde{n}(t)} = 0$ for all t, we have

$R_{\tilde{n}\tilde{n}}(t_1, t_2) = C_{\tilde{n}\tilde{n}}(t_1, t_2).$ (3.23e)

Almost always, the random functions used to represent noise in a physical system are specified in
such a way that $\mu_{\tilde{n}(t)} = 0$, which means the distinction between the autocorrelation function and
the autocovariance function becomes irrelevant.

3.14 Ensembles
Just as random variables are often regarded as taking on one or another specific value chosen
randomly from some collection of allowed nonrandom values, so too do we often think of
random functions as becoming one or another specific, nonrandom function chosen randomly
from a collection—or ensemble—of allowed nonrandom functions. We can visualize this
situation by imagining an infinitely long row of biased and crooked slot machines, one for every
value of t on the time axis.27 The slot machines do not necessarily behave identically and they are
wired together so that they can influence each other. When a slot machine’s lever is pulled, there
is never any jackpot; all that happens is that another number appears inside its window. Each time
we simultaneously pull all the levers of the slot machines, we randomly choose another member
of the ensemble of allowed functions. The probability pn ( t ) (n) dn that random variable ñ(t) takes
on a value between n and n + dn is just the probability that the slot machine at t takes on a value
between n and n + dn , and it is also the probability that some member function randomly chosen
from the ensemble of allowed functions has a value between n and n + dn at time t. In fact, we
can say that

$p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N)\, dn_1\, dn_2 \cdots dn_N$

is the probability, after the slot machine levers are pulled, that the slot machine at t1 has a value
between n1 and n1 + dn1, that the slot machine at t2 has a value between n2 and n2 + dn2, …, and
that the slot machine at tN has a value between nN and nN + dnN. It can also, of course, be thought
of as the probability that a member function randomly chosen from the ensemble of allowed
functions has values at times $t_1 < t_2 < \cdots < t_N$ that lie between n1 and n1 + dn1, n2 and n2 + dn2,
…, nN and nN + dnN respectively.

27
An objection that could be raised here is that an infinite number of slot machines is only what is called countably
infinite whereas the number of points on the time axis is uncountably infinite, a much “larger” type of infinity. For
our purposes, the distinction between these two types of infinity is not important.

3.15 Stationary Random Functions


A random function ñ(t) is strictly stationary,28 or strict-sense stationary,29 if all its statistical
properties are unaffected when the origin of its time axis is changed (that is, when we change the
point at which t = 0). Mathematically we require, for any $t_1 < t_2 < \cdots < t_N$, that the probability
density distribution

$p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N) = p_{\tilde{n}(t_1+\tau)\tilde{n}(t_2+\tau)\cdots\tilde{n}(t_N+\tau)}(n_1, n_2, \ldots, n_N)$ (3.24a)

for any value of τ and all N = 1, 2, …, ∞. Thus, for any integrable function f with N arguments,

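$\int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \cdots \int_{-\infty}^{\infty} dn_N \, f(n_1, n_2, \ldots, n_N)\, p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N)$
$\quad = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \cdots \int_{-\infty}^{\infty} dn_N \, f(n_1, n_2, \ldots, n_N)\, p_{\tilde{n}(t_1+\tau)\tilde{n}(t_2+\tau)\cdots\tilde{n}(t_N+\tau)}(n_1, n_2, \ldots, n_N),$ (3.24b)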

where $t_1 < t_2 < \cdots < t_N$ and N = 1, 2, …, ∞. This means that, according to Eq. (3.23a),

$E\big(f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N))\big) = E\big(f(\tilde{n}(t_1+\tau), \tilde{n}(t_2+\tau), \ldots, \tilde{n}(t_N+\tau))\big)$ (3.24c)

for any integrable function f, any value of τ, and N = 1, 2, …, ∞. We note that when Eq. (3.24c)
holds true,

$E\big(f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N))\big)$

cannot depend on all the N independent time values t1, t2, …, tN as we might at first suppose. To
see why this is so, we just set τ = −t1 in (3.24c) to get

28
Paul H. Wirsching, Thomas L. Paez, and Keith Ortiz, Random Vibrations: Theory and Practice (John Wiley and
Sons, Inc., New York, 1995), p. 80.
29
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 297.


$E\big(f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N))\big) = E\big(f(\tilde{n}(0), \tilde{n}(t_2 - t_1), \tilde{n}(t_3 - t_1), \ldots, \tilde{n}(t_N - t_1))\big).$ (3.24d)

This shows that

$E\big(f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N))\big)$

must be a function of just the nonrandom time parameters (t2 − t1), (t3 − t1), …, (tN − t1), and there
are, of course, only N − 1 of these.
Equations (3.24b)–(3.24d) can be understood in terms of the following thought experiment.
We randomly pick some function from the ensemble of allowed functions and choose N time
values $t_1 < t_2 < \cdots < t_N$. The randomly picked function has values n1, n2, …, nN at times
t1 , t2 ,…, t N respectively. Next, we create some nonrandom function f that has N arguments and is
not one of those physically unreasonable abstractions that mathematicians specialize in. We
calculate and store the value of f (n1 , n2 ,… , nN ) . Randomly choosing another function from the
ensemble of allowed functions for ñ(t), we again use n1, n2, …, nN at t1, t2, …, tN to calculate and
store a new value of f(n1, n2, …, nN). Repeating this procedure enough times to get a large
collection of f values, we average them all together to get a good estimate of

$E\big(f(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N))\big).$

Shifting to a new set of time values t1 + τ, t2 + τ, …, tN + τ, we again generate another large
collection of f values, this time averaging them together to get a good estimate of

$E\big(f(\tilde{n}(t_1+\tau), \tilde{n}(t_2+\tau), \ldots, \tilde{n}(t_N+\tau))\big).$

Since ñ is strict-sense stationary, we know that no matter what the positive integer N is, and no
matter what the function f is, and no matter what the value of τ is, both collections of f values
always have approximately the same average, with the difference between the averages becoming
less and less as the collections of f values get larger and larger.
To give an example of a random function ñ(t) that is strict-sense stationary, we define

$\tilde{n}(t) = \tilde{a}\cos(\omega t) + \tilde{b}\sin(\omega t),$ (3.25a)

where $\tilde{a}$ and $\tilde{b}$ obey a probability density distribution $p_{\tilde{a}\tilde{b}}(a, b)$ such that
$p_{\tilde{a}\tilde{b}}(a, b)\, da\, db$ is the probability that $\tilde{a}$ takes on a value between a and a + da
when $\tilde{b}$ takes on a value between b and b + db. We can also, just as correctly, say that
$p_{\tilde{a}\tilde{b}}(a, b)\, da\, db$ is the probability that $\tilde{b}$ takes on a value between b and b + db
when $\tilde{a}$ takes on a value between a and a + da. We next require

$p_{\tilde{a}\tilde{b}}(a, b) = p_{\tilde{a}\tilde{b}}\big(\sqrt{a^2 + b^2}\big).$ (3.25b)

Equation (3.25b) says that $p_{\tilde{a}\tilde{b}}(a, b)$ is circularly symmetric because it depends on a and b only
through $\sqrt{a^2 + b^2}$, the “radius length” of a point whose x and y coordinates are a, b. Returning to
the slot-machine model for ñ(t) explained in Sec. 3.14, we note that randomly choosing values for
$\tilde{a}$ and $\tilde{b}$ is the same as simultaneously pulling the levers of all the slot machines representing
ñ(t) in Eq. (3.25a). Having pulled the levers and gotten, say, values a1 for $\tilde{a}$ and b1 for $\tilde{b}$, we
then know that the number in the window of the slot machine located at time value t1 is

a1 cos(ω t1 ) + b1 sin(ω t1 ) ,

we know that the number in the window of the slot machine located at time value t2 is

a1 cos(ω t2 ) + b1 sin(ω t2 ) ,

and so on. If we pull all the levers again and get values a2 for $\tilde{a}$ and b2 for $\tilde{b}$, then we know that
the slot machine at t1 has a number
a2 cos(ω t1 ) + b2 sin(ω t1 ) ,

we know the slot machine at t2 has a number

a2 cos(ω t2 ) + b2 sin(ω t2 ) ,

and so on. Because the probability density distribution $p_{\tilde{a}\tilde{b}}(a, b)$ completely determines the
statistics of random variables $\tilde{a}$ and $\tilde{b}$, we see that it must also completely determine the
statistics of ñ(t) in Eq. (3.25a).
It is not difficult to show that ñ(t) in Eq. (3.25a) is strict-sense stationary when $p_{\tilde{a}\tilde{b}}$ is
circularly symmetric.30 Picking an arbitrary time interval τ, we construct two new random
variables

30
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 301.

$\tilde{A} = \tilde{a}\cos(\omega\tau) + \tilde{b}\sin(\omega\tau)$ (3.26a)

and

$\tilde{B} = \tilde{b}\cos(\omega\tau) - \tilde{a}\sin(\omega\tau).$ (3.26b)

The reverse transformation to Eqs. (3.26a) and (3.26b) is, of course,

$\tilde{a} = \tilde{A}\cos(\omega\tau) - \tilde{B}\sin(\omega\tau)$ (3.26c)

and

$\tilde{b} = \tilde{B}\cos(\omega\tau) + \tilde{A}\sin(\omega\tau),$ (3.26d)

which we can find by solving Eqs. (3.26a) and (3.26b) for $\tilde{a}$ and $\tilde{b}$ in terms of $\tilde{A}$ and $\tilde{B}$.
Equations (3.26a) and (3.26b) state that if random variables $\tilde{a}$ and $\tilde{b}$ take on the values a and b,
then random variables $\tilde{A}$ and $\tilde{B}$ must take on the values

a cos(ωτ ) + b sin(ωτ )
and
b cos(ωτ ) − a sin(ωτ )

respectively. Similarly Eqs. (3.26c) and (3.26d) state that if random variables $\tilde{A}$ and $\tilde{B}$ take on
values A and B, then random variables $\tilde{a}$ and $\tilde{b}$ must take on values

A cos(ωτ) − B sin(ωτ)
and
B cos(ωτ) + A sin(ωτ)

respectively. Whenever there are two random variables $\tilde{x}$ and $\tilde{y}$ that have a probability density
distribution $p_{\tilde{x}\tilde{y}}(x, y)$ and we use constants α1, α2, α3, and α4 to construct from $\tilde{x}$ and $\tilde{y}$ two
new random variables

$\tilde{z} = \alpha_1 \tilde{x} + \alpha_2 \tilde{y}$ (3.27a)
and
$\tilde{w} = \alpha_3 \tilde{x} + \alpha_4 \tilde{y},$ (3.27b)

then we can find the probability density distribution $p_{\tilde{z}\tilde{w}}$ for $\tilde{z}$ and $\tilde{w}$ by calculating the reverse
transformation

$\tilde{x} = \beta_1 \tilde{z} + \beta_2 \tilde{w}$ (3.27c)

and

$\tilde{y} = \beta_3 \tilde{z} + \beta_4 \tilde{w},$ (3.27d)

and requiring that31

$p_{\tilde{z}\tilde{w}}(z, w) = \dfrac{1}{|\alpha_1\alpha_4 - \alpha_2\alpha_3|}\, p_{\tilde{x}\tilde{y}}(\beta_1 z + \beta_2 w,\; \beta_3 z + \beta_4 w).$ (3.27e)

Comparing Eqs. (3.26a)–(3.26d) to Eqs. (3.27a)–(3.27d), we see that

α1 = cos(ωτ ) , α 2 = sin(ωτ ) , α 3 = − sin(ωτ ) , α 4 = cos(ωτ )

and
β1 = cos(ωτ ) , β 2 = − sin(ωτ ) , β3 = sin(ωτ ) , β 4 = cos(ωτ ) .

Consequently,
α1α 4 − α 2α 3 = cos 2 (ωτ ) + sin 2 (ωτ ) = 1 ,

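and so the probability density distribution of $\tilde{A}$ and $\tilde{B}$ must be

$p_{\tilde{A}\tilde{B}}(A, B) = p_{\tilde{a}\tilde{b}}\big(A\cos(\omega\tau) - B\sin(\omega\tau),\; A\sin(\omega\tau) + B\cos(\omega\tau)\big).$ (3.28a)

Since $p_{\tilde{a}\tilde{b}}$ is circularly symmetric, obeying Eq. (3.25b), this becomes

$p_{\tilde{A}\tilde{B}}(A, B) = p_{\tilde{a}\tilde{b}}\big([A^2\cos^2(\omega\tau) + B^2\sin^2(\omega\tau) - 2AB\sin(\omega\tau)\cos(\omega\tau) + A^2\sin^2(\omega\tau) + B^2\cos^2(\omega\tau) + 2AB\sin(\omega\tau)\cos(\omega\tau)]^{1/2}\big)$
$\quad = p_{\tilde{a}\tilde{b}}\Big(\sqrt{A^2\big(\cos^2(\omega\tau) + \sin^2(\omega\tau)\big) + B^2\big(\cos^2(\omega\tau) + \sin^2(\omega\tau)\big)}\Big)$
$\quad = p_{\tilde{a}\tilde{b}}\big(\sqrt{A^2 + B^2}\big).$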

From Eqs. (3.26c) and (3.26d), we know that whenever $\tilde{A}$ and $\tilde{B}$ take on the values A and B,
$\tilde{a}$ and $\tilde{b}$ must then take on the values

31
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 144.

$A\cos(\omega\tau) - B\sin(\omega\tau)$
and
$B\cos(\omega\tau) + A\sin(\omega\tau).$

Hence,

$a^2 + b^2 = [A\cos(\omega\tau) - B\sin(\omega\tau)]^2 + [A\sin(\omega\tau) + B\cos(\omega\tau)]^2 = A^2 + B^2$

so that

$p_{\tilde{A}\tilde{B}}(A, B) = p_{\tilde{a}\tilde{b}}\big(\sqrt{a^2 + b^2}\big) = p_{\tilde{a}\tilde{b}}(a, b),$

where Eq. (3.25b) is reversed to make the last step in this equality. We have now shown that Eq.
(3.28a) can be written as

$p_{\tilde{A}\tilde{B}}(A, B) = p_{\tilde{a}\tilde{b}}(a, b)$ (3.28b)

where the equal probability densities do not depend on ωτ because $p_{\tilde{a}\tilde{b}}$ is circularly symmetric.
Equation (3.28b) is a very restrictive statement applied to random variables $\tilde{A}$ and $\tilde{B}$ because
it requires $\tilde{A}$ and $\tilde{B}$ to obey exactly the same statistics as $\tilde{a}$ and $\tilde{b}$. Consequently, we can set up
a random function

$\tilde{N}(t) = \tilde{A}\cos(\omega t) + \tilde{B}\sin(\omega t)$ (3.29a)

and know that it has exactly the same random behavior as ñ(t) in Eq. (3.25a). Substituting Eqs.
(3.26a) and (3.26b) into (3.29a) gives

$\tilde{N}(t) = [\tilde{a}\cos(\omega\tau) + \tilde{b}\sin(\omega\tau)]\cos(\omega t) + [\tilde{b}\cos(\omega\tau) - \tilde{a}\sin(\omega\tau)]\sin(\omega t)$
$\quad = \tilde{a}\cos\big(\omega(t + \tau)\big) + \tilde{b}\sin\big(\omega(t + \tau)\big).$ (3.29b)

According to Eq. (3.25a), this is the same as writing

$\tilde{N}(t) = \tilde{n}(t + \tau).$ (3.29c)

This means that not only does Ñ(t) have the same random behavior as ñ(t), it also has the same
random behavior as ñ(t + τ). Consequently, ñ(t) and ñ(t + τ) must both have the same random
behavior. We have made no assumptions about the value of τ; hence, Eq. (3.29c) holds true for
any τ value. We have therefore demonstrated that

$\tilde{n}(t) = \tilde{a}\cos(\omega t) + \tilde{b}\sin(\omega t)$

is strict-sense stationary when the probability density distribution $p_{\tilde{a}\tilde{b}}$ is circularly symmetric
with

$p_{\tilde{a}\tilde{b}}(a, b) = p_{\tilde{a}\tilde{b}}\big(\sqrt{a^2 + b^2}\big).$
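The thought experiment described earlier in this section can be sketched numerically for this example. The choices below are assumptions made only for illustration: $\tilde{a}$ and $\tilde{b}$ are drawn as independent zero-mean, unit-variance Gaussians (whose joint density depends on a and b only through $a^2 + b^2$, so it is circularly symmetric), `OMEGA` is an arbitrary frequency, and the test function `f` is an arbitrary nonlinear function of two arguments.

```python
import math
import random

OMEGA = 2.0  # hypothetical angular frequency


def estimate_expectation(f, times, n_members=100_000, seed=2):
    """Average f over randomly chosen ensemble members of a*cos(wt) + b*sin(wt)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_members):
        # "Pull all the levers": one draw of (a, b) fixes the whole function.
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        values = [a * math.cos(OMEGA * t) + b * math.sin(OMEGA * t)
                  for t in times]
        total += f(*values)
    return total / n_members


def f(n1, n2):
    """Arbitrary nonrandom test function with N = 2 arguments."""
    return n1 * n1 * n2 * n2


if __name__ == "__main__":
    t1, t2, tau = 0.3, 1.0, 5.0
    print(estimate_expectation(f, [t1, t2], seed=2))
    print(estimate_expectation(f, [t1 + tau, t2 + tau], seed=3))
```

The two printed estimates come from independent collections of ensemble members and should differ only by sampling error, which shrinks as `n_members` grows.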

A random function ñ(t) is called wide-sense stationary32 when

$E\big(\tilde{n}(t)\big) = \mu_{\tilde{n}}$ = same finite constant for all values of t (3.30a)

and

$E\big(\tilde{n}(t_1)\,\tilde{n}(t_2)\big) = R_{\tilde{n}\tilde{n}}(t_2 - t_1).$ (3.30b)

Other terms applied to random functions ñ(t) that satisfy these two restrictions are weakly
stationary or covariance stationary.33 Equation (3.30a) requires the average value of ñ(t) to be
finite and independent of time. We call this average $\mu_{\tilde{n}}$ instead of $\mu_{\tilde{n}(t)}$ as in Eq. (3.23c) to
emphasize that it does not depend on time. Equation (3.30b) requires the autocorrelation function
$R_{\tilde{n}\tilde{n}}(t_1, t_2)$ defined in Eq. (3.23b) to depend only on (t2 − t1), the difference between times t2 and
t1. Glancing back at the definition of $C_{\tilde{n}\tilde{n}}(t_1, t_2)$ in Eq. (3.23d), we see that when Eqs. (3.30a) and
(3.30b) are satisfied,

$C_{\tilde{n}\tilde{n}}(t_1, t_2) = E\big((\tilde{n}(t_1) - \mu_{\tilde{n}})(\tilde{n}(t_2) - \mu_{\tilde{n}})\big)$
$\quad = E\big(\tilde{n}(t_1)\tilde{n}(t_2) - \mu_{\tilde{n}}\tilde{n}(t_1) - \mu_{\tilde{n}}\tilde{n}(t_2) + \mu_{\tilde{n}}^2\big)$
$\quad = E\big(\tilde{n}(t_1)\tilde{n}(t_2)\big) - \mu_{\tilde{n}} E\big(\tilde{n}(t_1)\big) - \mu_{\tilde{n}} E\big(\tilde{n}(t_2)\big) + \mu_{\tilde{n}}^2.$

The last step uses the linearity of the expectation operator (see Sec. 3.10 above) and Eq. (3.9f).
Consequently, the formula for $C_{\tilde{n}\tilde{n}}$ becomes, using Eqs. (3.30a) and (3.30b),

$C_{\tilde{n}\tilde{n}}(t_1, t_2) = R_{\tilde{n}\tilde{n}}(t_2 - t_1) - \mu_{\tilde{n}}^2.$ (3.30c)

This result shows that the autocovariance $C_{\tilde{n}\tilde{n}}(t_1, t_2)$ of random functions that are wide-sense

32
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 298.
33
T. T. Soong, Random Differential Equations in Science and Engineering (Academic Press, New York, 1973), p.
43.


stationary also depends only on (t2 − t1 ) , the difference between times t2 and t1 . We note that
random functions that are wide-sense stationary need not be strict-sense stationary, but random
functions that are strict-sense stationary must also be wide-sense stationary. For future use, we
note that two random functions $\tilde{n}_\alpha(t)$ and $\tilde{n}_\beta(t)$ are defined to be jointly wide-sense stationary34
when each one is itself wide-sense stationary and when

$E\big(\tilde{n}_\alpha(t_1)\,\tilde{n}_\beta(t_2)\big) = R_{\tilde{n}_\alpha \tilde{n}_\beta}(t_2 - t_1),$ (3.30d)

which is called their cross-correlation function, depends only on the difference between times t1
and t2 .
Returning to the ñ(t) defined in Eq. (3.25a) above,

$\tilde{n}(t) = \tilde{a}\cos(\omega t) + \tilde{b}\sin(\omega t),$

we stop assuming that $p_{\tilde{a}\tilde{b}}(a, b)$ is circularly symmetric and examine the weaker conditions that
must be put on random variables $\tilde{a}$ and $\tilde{b}$ to make ñ wide-sense stationary.35 The expectation
value of ñ(t) must be time independent, so by the linearity of the expectation operator

$E\big(\tilde{n}(t)\big) = E(\tilde{a})\cos(\omega t) + E(\tilde{b})\sin(\omega t).$

Hence, for $E(\tilde{n}(t))$ to obey Eq. (3.30a) and so be time independent, we must have

$E(\tilde{a}) = 0$ (3.31a)
and
$E(\tilde{b}) = 0.$ (3.31b)

These are the first two restrictions that must be placed on $\tilde{a}$ and $\tilde{b}$ for ñ(t) to be wide-sense
stationary. We also know from Eq. (3.30b) that $R_{\tilde{n}\tilde{n}}$ must have the same value whenever
t2 − t1 = 0 or t2 = t1, so (remember that nothing has been said about what the value of time t2 = t1
is)

$E\big(\tilde{n}(t_3)\,\tilde{n}(t_3)\big) = E\big(\tilde{n}(t_4)\,\tilde{n}(t_4)\big)$

34
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 299.
35
This treatment is taken from Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p.
300.

must hold true for all values of t3 and t4. In particular, this must hold true when t3 = 0 and
t4 = π/(2ω). But from Eq. (3.25a)

$\tilde{n}(0) = \tilde{a}$ and $\tilde{n}\big(\pi/(2\omega)\big) = \tilde{b},$

so it must be true that

$E(\tilde{a}^2) = E(\tilde{b}^2).$ (3.31c)

This is the third restriction that must be placed on $\tilde{a}$ and $\tilde{b}$ for ñ(t) to be wide-sense stationary.
To find the fourth and last restriction, we evaluate the left-hand side of Eq. (3.30b) for t1 ≠ t2,
using (3.25a) and the linearity of the expectation operator (see Sec. 3.10) to get

$E\big(\tilde{n}(t_1)\tilde{n}(t_2)\big) = E\big([\tilde{a}\cos(\omega t_1) + \tilde{b}\sin(\omega t_1)][\tilde{a}\cos(\omega t_2) + \tilde{b}\sin(\omega t_2)]\big)$
$\quad = E\big(\tilde{a}^2\cos(\omega t_1)\cos(\omega t_2) + \tilde{a}\tilde{b}\cos(\omega t_1)\sin(\omega t_2) + \tilde{a}\tilde{b}\cos(\omega t_2)\sin(\omega t_1) + \tilde{b}^2\sin(\omega t_1)\sin(\omega t_2)\big)$
$\quad = E(\tilde{a}^2)\cdot[\cos(\omega t_1)\cos(\omega t_2)] + E(\tilde{b}^2)\cdot[\sin(\omega t_1)\sin(\omega t_2)] + E(\tilde{a}\tilde{b})\cdot[\cos(\omega t_1)\sin(\omega t_2) + \cos(\omega t_2)\sin(\omega t_1)].$

This becomes, using $E(\tilde{a}^2) = E(\tilde{b}^2)$ from Eq. (3.31c),

$E\big(\tilde{n}(t_1)\tilde{n}(t_2)\big) = E(\tilde{a}^2)\cdot\cos\big(\omega(t_2 - t_1)\big) + E(\tilde{a}\tilde{b})\cdot\sin\big(\omega(t_1 + t_2)\big).$ (3.31d)

The first term on the right-hand side of (3.31d) depends only on (t2 − t1), which is what Eq.
(3.30b) requires, but the second term on the right-hand side does not. Therefore, the last
restriction on random variables $\tilde{a}$ and $\tilde{b}$ is

$E(\tilde{a}\tilde{b}) = 0.$ (3.31e)

Equations (3.31a), (3.31b), (3.31c), and (3.31e) list all the restrictions on random variables $\tilde{a}$ and
$\tilde{b}$ needed to ensure that ñ(t) in Eq. (3.25a) is a wide-sense stationary random function.
If $\tilde{a}$ and $\tilde{b}$ are independent random variables that obey the same probability density
distribution, and this probability density distribution assigns a mean value of zero to random
variables obeying it, then Eqs. (3.31a)–(3.31c) are automatically satisfied and, since $\tilde{a}$ and $\tilde{b}$ are
independent, Eqs. (3.31a) and (3.31b) show that (3.31e) is also satisfied:

$E(\tilde{a}\tilde{b}) = E(\tilde{a}) \cdot E(\tilde{b}) = 0 \cdot 0 = 0.$

This is sufficient to make ñ(t) wide-sense stationary, but there are other ways to do the job. We
can, for example, set $\tilde{a} = \tilde{u}$ and $\tilde{b} = \tilde{v}$ where $\tilde{u}$ and $\tilde{v}$ are the random variables defined in Eqs.
(3.15b) and (3.15c) above. Equations (3.15d) and (3.15e) then show that Eqs. (3.31a) and (3.31b)
are satisfied, and Eq. (3.15f) shows that (3.31e) is satisfied. The only requirement left is (3.31c),
which can be checked now by writing


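$E(\tilde{a}^2) = E(\tilde{u}^2) = \dfrac{1}{2\pi}\int_0^{2\pi} \sin^2\phi \, d\phi = \dfrac{1}{2}$ (3.32a)

and

$E(\tilde{b}^2) = E(\tilde{v}^2) = \dfrac{1}{2\pi}\int_0^{2\pi} \cos^2\phi \, d\phi = \dfrac{1}{2}.$ (3.32b)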

Clearly, Eq. (3.31c) is also satisfied. We conclude that even though $\tilde{a} = \tilde{u}$ and $\tilde{b} = \tilde{v}$ are not, as is
pointed out in the discussion following Eq. (3.15f), independent random variables, the random
function ñ(t) in Eq. (3.25a) is still wide-sense stationary. Note that Eqs. (3.15b) and (3.15c) can
now be used to write ñ(t) as

$\tilde{n}(t) = \sin(\tilde{\phi})\cos(\omega t) + \cos(\tilde{\phi})\sin(\omega t) = \sin(\omega t + \tilde{\phi}).$ (3.32c)

In (3.32c), random variable $\tilde{\phi}$ can, according to Eq. (3.15a), be regarded as a random phase
equally likely to take on any value between zero and 2π. Adding this sort of random phase to the
argument of a sinusoidal oscillation always produces a wide-sense stationary random function.
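For the random-phase sinusoid of (3.32c), the conditions (3.30a) and (3.30b) can be checked directly. The sketch below is an illustration with hypothetical function names and an arbitrary frequency: because the phase is uniformly distributed, each expectation is an average over the phase, and that average is evaluated here on a grid of equally spaced phases (which is exact, up to rounding, for products of sinusoids like these).

```python
import math


def phase_average(g, n_phase=1000):
    """Average g(phi) over n_phase equally spaced phases in [0, 2*pi)."""
    return sum(g(2.0 * math.pi * k / n_phase) for k in range(n_phase)) / n_phase


def mean_value(omega, t):
    """E(n(t)) for n(t) = sin(omega*t + phi) with phi uniform on [0, 2*pi)."""
    return phase_average(lambda phi: math.sin(omega * t + phi))


def autocorrelation(omega, t1, t2):
    """E(n(t1) * n(t2)) for the same random-phase sinusoid."""
    return phase_average(
        lambda phi: math.sin(omega * t1 + phi) * math.sin(omega * t2 + phi))


if __name__ == "__main__":
    omega = 3.0  # hypothetical angular frequency
    print(mean_value(omega, 0.2), mean_value(omega, 7.5))
    print(autocorrelation(omega, 0.2, 0.9))      # same time difference ...
    print(autocorrelation(omega, 10.2, 10.9))    # ... shifted along the axis
    print(0.5 * math.cos(omega * 0.7))           # Eq. (3.31d) with E(a^2) = 1/2
```

The mean is zero at every time, and the autocorrelation depends only on t2 − t1, matching the first term of (3.31d) with $E(\tilde{a}^2) = 1/2$ from (3.32a).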

3.16 Gaussian Random Processes


A random function ñ(t) is called a Gaussian random process or normal process when for any N
time values $t_1 < t_2 < \cdots < t_N$ the random variables ñ(t1), ñ(t2), …, ñ(tN) obey a probability density
distribution

$p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N),$

which is multivariate Gaussian. To write this multivariate Gaussian in a reasonably compact
form, we define the vectors

$\vec{n} = (n_1, n_2, \ldots, n_N),$ (3.33a)

$\vec{\tilde{n}}(\vec{t}\,) = \big(\tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N)\big),$ (3.33b)

and

$\vec{\mu}_{\vec{\tilde{n}}(\vec{t}\,)} = E\big(\vec{\tilde{n}}(\vec{t}\,)\big) = \big(E(\tilde{n}(t_1)), E(\tilde{n}(t_2)), \ldots, E(\tilde{n}(t_N))\big).$ (3.33c)

Glancing back at Eq. (3.23c), we remember that $\mu_{\tilde{n}(t)}$ is the expected or mean value of the
random variable ñ(t), so Eq. (3.33c) can also be written as

$\vec{\mu}_{\vec{\tilde{n}}(\vec{t}\,)} = \big(\mu_{\tilde{n}(t_1)}, \mu_{\tilde{n}(t_2)}, \ldots, \mu_{\tilde{n}(t_N)}\big).$ (3.33d)

We define the covariance matrix C to be the N × N square matrix whose i,jth element is given by

$(\mathbf{C})_{ij} = E\big([\tilde{n}(t_i) - \mu_{\tilde{n}(t_i)}][\tilde{n}(t_j) - \mu_{\tilde{n}(t_j)}]\big).$ (3.33e)

Equation (3.14c) reminds us that (C)ij is measuring the covariance of the two random variables
ñ(ti) and ñ(tj). A T superscript applied to a matrix or vector specifies the transpose of that
matrix or vector; so, for example,

$\vec{n}^{\,T} = \begin{pmatrix} n_1 \\ n_2 \\ \vdots \\ n_N \end{pmatrix}.$

Now the multivariate Gaussian distribution

$p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N)$

can be written as

$p_{\tilde{n}(t_1)\tilde{n}(t_2)\cdots\tilde{n}(t_N)}(n_1, n_2, \ldots, n_N) = p_{\vec{\tilde{n}}(\vec{t}\,)}(\vec{n})$
$\quad = (2\pi)^{-N/2}\, [\det(\mathbf{C})]^{-1/2} \exp\Big(-\tfrac{1}{2}\, \big(\vec{n} - \vec{\mu}_{\vec{\tilde{n}}(\vec{t}\,)}\big) \cdot \mathbf{C}^{-1} \cdot \big(\vec{n} - \vec{\mu}_{\vec{\tilde{n}}(\vec{t}\,)}\big)^{T}\Big).$ (3.33f)

In this formula, det(C) stands for the determinant of C , and C−1 is the inverse matrix of C .
Nothing said so far about Gaussian random processes requires them to be stationary in any
sense of the term, and in fact not all Gaussian random processes are stationary. They are often
good models for the noise found in mechanical processes and electrical signals. Perhaps the most
interesting thing about them, however, is that it can be shown that if they are wide-sense


stationary, then they are also strict-sense stationary.36,37
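As a concrete instance of Eq. (3.33f), the sketch below evaluates the formula for N = 2, where the determinant and inverse of C can be written out by hand. The function name and the argument layout (plain nested lists) are choices made for this example only.

```python
import math


def gaussian_pdf_2d(n, mu, C):
    """Evaluate Eq. (3.33f) for N = 2.

    n  : point (n1, n2) at which to evaluate the density
    mu : mean vector (mu1, mu2)
    C  : 2x2 covariance matrix as nested lists
    """
    det = C[0][0] * C[1][1] - C[0][1] * C[1][0]
    if C[0][0] <= 0.0 or det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    # Inverse of a 2x2 matrix written out explicitly.
    inv = [[C[1][1] / det, -C[0][1] / det],
           [-C[1][0] / det, C[0][0] / det]]
    d = [n[0] - mu[0], n[1] - mu[1]]
    # Quadratic form (n - mu) . C^{-1} . (n - mu)^T
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    # (2*pi)^(-N/2) with N = 2 is 1/(2*pi)
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


if __name__ == "__main__":
    # Uncorrelated case: the density factors into two one-dimensional Gaussians.
    print(gaussian_pdf_2d([1.0, -2.0], [0.5, 0.5], [[4.0, 0.0], [0.0, 9.0]]))
    # Correlated case: nonzero off-diagonal covariance.
    print(gaussian_pdf_2d([1.0, -2.0], [0.5, 0.5], [[4.0, 1.5], [1.5, 9.0]]))
```

When the off-diagonal elements of C vanish, the result equals the product of two one-dimensional normal densities with variances (C)11 and (C)22, which is a quick sanity check on the formula.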

3.17 Products of Two, Three, and Four Jointly Normal Random Variables

Random variables such as ñ(t1), ñ(t2), …, ñ(tN) that obey a multivariate Gaussian distribution
such as the one in Eq. (3.33f) are often called jointly normal random variables.38 There are a
number of useful product identities that apply to groups of two, three, and four jointly normal
random variables. Since the derivation of these identities does not involve t, our notation can be
simplified by writing

$\tilde{n}(t_1) \to \tilde{n}_1, \quad \tilde{n}(t_2) \to \tilde{n}_2, \quad \ldots \quad \text{etc.}$

Each random variable is also assumed to have a mean of zero:

$\mu_{\tilde{n}_1} = 0, \quad \mu_{\tilde{n}_2} = 0, \quad \ldots \quad \text{etc.}$

We start by specifying three jointly normal, zero-mean random variables $\tilde{n}_1$, $\tilde{n}_2$, and $\tilde{n}_3$.
Consulting Eq. (3.33f) above, we note that the jointly normal probability density function for $\tilde{n}_1$,
$\tilde{n}_2$, and $\tilde{n}_3$ can be written as, by expanding the matrix product in the exponent after setting the
means vector $\vec{\mu}$ to zero,

$p_{\tilde{n}_1\tilde{n}_2\tilde{n}_3}(n_1, n_2, n_3) = K \exp\Big(-\sum_{j=1}^{3}\sum_{k=1}^{3} \alpha_{jk}\, n_j n_k\Big)$ (3.34a)

for real constants K and $\alpha_{jk}$ (with j, k = 1, 2, 3). Note that these three random variables can be
either independent or dependent random variables and still obey the probability density
distribution in (3.34a). The expected value of the triple product $\tilde{n}_1\tilde{n}_2\tilde{n}_3$ is [applying Eq. (3.12a)

36
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 300.
37
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 83.
38
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 197.

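above]

$E(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = K \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \int_{-\infty}^{\infty} dn_3 \, (n_1 n_2 n_3) \exp\Big(-\sum_{j=1}^{3}\sum_{k=1}^{3} \alpha_{jk}\, n_j n_k\Big).$ (3.34b)

Changing the dummy variables of integration to

$u_1 = -n_1, \quad u_2 = -n_2, \quad u_3 = -n_3$

gives

$E(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = K \int_{\infty}^{-\infty} (-du_1) \int_{\infty}^{-\infty} (-du_2) \int_{\infty}^{-\infty} (-du_3) \, (-u_1 u_2 u_3) \exp\Big(-\sum_{j=1}^{3}\sum_{k=1}^{3} \alpha_{jk}\, (-u_j)(-u_k)\Big)$

or

$E(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = -K \int_{-\infty}^{\infty} du_1 \int_{-\infty}^{\infty} du_2 \int_{-\infty}^{\infty} du_3 \, (u_1 u_2 u_3) \exp\Big(-\sum_{j=1}^{3}\sum_{k=1}^{3} \alpha_{jk}\, u_j u_k\Big).$ (3.34c)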

Comparing the right-hand sides of (3.34b) and (3.34c) shows that

$E(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = -E(\tilde{n}_1\tilde{n}_2\tilde{n}_3).$

The only number that is equal to (−1) times itself is zero, so we conclude that

$E(\tilde{n}_1\tilde{n}_2\tilde{n}_3) = 0$ (3.34d)

for any three distinct, jointly normal, and zero-mean random variables.
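Equation (3.34d) can be illustrated with a quick simulation. The sketch below builds three correlated, zero-mean, jointly normal variables as fixed linear combinations of independent unit Gaussians; the mixing coefficients, sample count, and function name are arbitrary assumptions made for the example.

```python
import random


def triple_product_mean(n_samples=100_000, seed=4):
    """Estimate E(n1*n2*n3) for three correlated zero-mean normal variables."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        g1 = rng.gauss(0.0, 1.0)
        g2 = rng.gauss(0.0, 1.0)
        g3 = rng.gauss(0.0, 1.0)
        # Linear combinations of independent Gaussians are jointly normal
        # and, here, mutually correlated.
        n1 = g1
        n2 = 0.6 * g1 + 0.8 * g2
        n3 = 0.5 * g1 + 0.5 * g2 + 0.7 * g3
        total += n1 * n2 * n3
    return total / n_samples


if __name__ == "__main__":
    print(triple_product_mean())
```

The printed estimate should be close to zero, and it moves closer as `n_samples` grows, even though each pair of the three variables is correlated.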
When $\tilde{n}_1$, $\tilde{n}_2$, and $\tilde{n}_3$ are not three distinct random variables (or, what amounts to the same
thing, two or more are perfectly correlated), we can redo the analysis to see what happens.
If two of the three random variables $\tilde{n}_1$, $\tilde{n}_2$, and $\tilde{n}_3$ are perfectly correlated, there are really
only two distinct, jointly normal, zero-mean random variables that we call $\tilde{n}_1$ and $\tilde{n}_2$. Their
multivariate probability density distribution can be written as

$p_{\tilde{n}_1\tilde{n}_2}(n_1, n_2) = K \exp\Big(-\sum_{j=1}^{2}\sum_{k=1}^{2} \alpha_{jk}\, n_j n_k\Big)$

for real constants K and $\alpha_{jk}$ (with j, k = 1, 2). If necessary, we renumber the random variables so
that $\tilde{n}_2$ represents the two perfectly correlated random variables that used to be distinct. Equation
(3.34b) now simplifies to

$E(\tilde{n}_1\tilde{n}_2^2) = K \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \, (n_1 n_2^2) \exp\Big(-\sum_{j=1}^{2}\sum_{k=1}^{2} \alpha_{jk}\, n_j n_k\Big).$ (3.35a)

Again the dummy variables of integration are changed, this time to

u1 = −n1 and u2 = −n2 ,


which gives
$$E(n_1 n_2^2) = K \int_{\infty}^{-\infty} (-du_1) \int_{\infty}^{-\infty} (-du_2)\,(-u_1 u_2^2)\, e^{-\sum_{j=1}^{2}\sum_{k=1}^{2} \alpha_{jk} (-u_j)(-u_k)}$$
or
$$E(n_1 n_2^2) = -K \int_{-\infty}^{\infty} du_1 \int_{-\infty}^{\infty} du_2 \,(u_1 u_2^2)\, e^{-\sum_{j=1}^{2}\sum_{k=1}^{2} \alpha_{jk} u_j u_k} . \tag{3.35b}$$

Comparing the right-hand sides of (3.35a) and (3.35b) shows that

E (n1n22 ) = −E (n1n22 ) ,

so using the same reasoning as before, that only zero can be equal to (−1) times itself, we get

E (n1n22 ) = 0 . (3.35c)

Hence, Eq. (3.34d) still holds true when any two of the jointly normal, zero-mean random
variables n1 , n2 , n3 are perfectly correlated.
When all three of these random variables are perfectly correlated, there is really just one zero-
mean random variable n1 obeying the normal probability distribution [see Eq. (3.6a) above],
$$p_{n_1}(n_1) = \frac{1}{\sigma_{n_1}\sqrt{2\pi}}\, e^{-\frac{n_1^2}{2\sigma_{n_1}^2}} .$$

The left-hand side of (3.34d) now becomes $E(n_1^3)$, which satisfies the formula

$$E(n_1^3) = \frac{1}{\sigma_{n_1}\sqrt{2\pi}} \int_{-\infty}^{\infty} n_1^3\, e^{-\frac{n_1^2}{2\sigma_{n_1}^2}}\, dn_1 . \tag{3.36a}$$


Since this is the integral between −∞ and +∞ of an odd function, it must be zero [see Eq.
(2.17) in Chapter 2]. Consequently,
$$E(n_1^3) = 0 \tag{3.36b}$$

for any zero-mean, normally distributed random variable n1 . We conclude that Eq. (3.34d) holds
for any three strictly normal and zero-mean random variables even if they are not distinct.
To construct a formula for E (n1n2 n3n4 ) for four zero-mean, jointly normal random variables
n1 , n2 , n3 , n4 , we construct a new random variable,

$$w = \omega_1 n_1 + \omega_2 n_2 + \omega_3 n_3 + \omega_4 n_4 = \sum_{j=1}^{4} \omega_j n_j . \tag{3.37a}$$

There is no requirement that n1 , n2 , n3 , and n4 be distinct random variables, but we do assume
that the real parameters ω1 , ω2 , ω3 , and ω4 can independently take on any value between −∞
and +∞. Since n1 , n2 , n3 , and n4 are jointly normal, w is also a normal variable.39 Using the
linearity of the expectation operator with respect to random variables (see Sec. 3.10 above) and
remembering that n1 , n2 , n3 , and n4 are zero mean, we have

$$E(w) = E\left(\sum_{j=1}^{4} \omega_j n_j\right) = \sum_{j=1}^{4} \omega_j E(n_j) = 0 , \tag{3.37b}$$

showing that w is also zero-mean. For future use we note, applying (3.37b) to Eq. (3.8e), that the
variance of w is

$$v_w = E(w^2) = E\left(\left(\sum_{j=1}^{4} \omega_j n_j\right)\left(\sum_{k=1}^{4} \omega_k n_k\right)\right) = \sum_{j=1}^{4}\sum_{k=1}^{4} \omega_j \omega_k E(n_j n_k) ,$$

which can also be written as, recognizing that [according to Eq. (3.5c)] the variance vw is the
square of the standard deviation σ w of w ,

$$\sigma_w^2 = \sum_{j=1}^{4}\sum_{k=1}^{4} \omega_j \omega_k E(n_j n_k) . \tag{3.37c}$$

39
This analysis is an expanded version of a treatment given in Athanasios Papoulis, Probability, Random Variables,
and Stochastic Processes, pp. 197–198.


The characteristic function of w is [see Eqs. (3.9a) and (3.9b) above]

$$E(e^{-2\pi i \nu w}) = \int_{-\infty}^{\infty} p_w(w)\, e^{-2\pi i \nu w}\, dw ,$$

where pw ( w) is the probability density distribution of random variable w . Since w obeys a
zero-mean normal distribution [defined in Eq. (3.6a)], this becomes

$$E(e^{-2\pi i \nu w}) = \frac{1}{\sigma_w\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{w^2}{2\sigma_w^2}}\, e^{-2\pi i \nu w}\, dw . \tag{3.38a}$$

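Substituting the identity $e^{i\varphi} = \cos\varphi + i\sin\varphi$, with $\varphi = -2\pi\nu w$, into (3.38a) gives

$$E(e^{-2\pi i \nu w}) = \frac{1}{\sigma_w\sqrt{2\pi}} \int_{-\infty}^{\infty} \cos(2\pi\nu w)\, e^{-\frac{w^2}{2\sigma_w^2}}\, dw \;-\; \frac{i}{\sigma_w\sqrt{2\pi}} \int_{-\infty}^{\infty} \sin(2\pi\nu w)\, e^{-\frac{w^2}{2\sigma_w^2}}\, dw .$$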

When we replace w by − w in
$$Y(w) = \sin(2\pi\nu w)\, e^{-\frac{w^2}{2\sigma_w^2}} ,$$

we see that
$$Y(-w) = \sin(-2\pi\nu w)\, e^{-\frac{(-w)^2}{2\sigma_w^2}} = -\sin(2\pi\nu w)\, e^{-\frac{w^2}{2\sigma_w^2}} = -Y(w) ,$$

showing that Y is an odd function. Hence, according to Eq. (2.17) in Chapter 2, its integral
between −∞ and +∞ is zero. The formula for $E(e^{-2\pi i \nu w})$ must then reduce to

$$E(e^{-2\pi i \nu w}) = \frac{1}{\sigma_w\sqrt{2\pi}} \int_{-\infty}^{\infty} \cos(2\pi\nu w)\, e^{-\frac{w^2}{2\sigma_w^2}}\, dw . \tag{3.38b}$$

A table of integrals40 shows that, for any two real parameters a and b,

40
Formula 679 of the Handbook of Chemistry and Physics, edited by Robert C. Weast, 51st ed. (The Chemical
Rubber Company, Cleveland, OH, 1970–1971), p. A-215.


$$\int_{0}^{\infty} e^{-a^2 x^2} \cos(bx)\, dx = \frac{\sqrt{\pi}}{2a}\, e^{-\frac{b^2}{4a^2}} .$$
Setting
$$Z(x) = \cos(bx)\, e^{-a^2 x^2} ,$$

we note that Z is an even function because


$$Z(-x) = \cos(-bx)\, e^{-a^2(-x)^2} = \cos(bx)\, e^{-a^2 x^2} = Z(x) .$$

Hence, according to Eq. (2.19) in Chapter 2, we can write

$$\int_{-\infty}^{\infty} e^{-a^2 x^2} \cos(bx)\, dx = \frac{\sqrt{\pi}}{a}\, e^{-\frac{b^2}{4a^2}} . \tag{3.38c}$$

Applying formula (3.38c) to Eq. (3.38b) by specifying that $a = 1/(\sigma_w\sqrt{2})$ and $b = 2\pi\nu$, we get

$$E(e^{-2\pi i \nu w}) = e^{-2\pi^2\nu^2\sigma_w^2} . \tag{3.38d}$$

Equation (3.38d) holds true for any value of ν; in particular, when $\nu = (2\pi)^{-1}$, it must still be
true:
$$E(e^{-iw}) = e^{-\sigma_w^2/2} . \tag{3.38e}$$

Formula (3.38e) applies to any zero-mean, normal random variable, which means it applies to w
for any set of ω 1 , ω 2 , ω 3 , ω 4 values in Eq. (3.37a) above.
We can expand the left-hand side of (3.38e) in powers of w to get, using the linearity of the
expectation operator with respect to random variables (see Sec. 3.10 above),

$$E(e^{-iw}) = E\left(1 - iw - \frac{w^2}{2} + i\frac{w^3}{6} + \frac{w^4}{24} + \cdots\right) = 1 - iE(w) - \frac{E(w^2)}{2} + i\frac{E(w^3)}{6} + \frac{E(w^4)}{24} + \cdots .$$

According to Eqs. (3.37b) and (3.36b), both $E(w)$ and $E(w^3)$ are zero [the discussion following
Eq. (3.37a) shows that w, like n1, is a zero-mean, normally distributed random variable, which
means that it must satisfy both Eqs. (3.37b) and (3.36b)]. Hence, remembering that
$E(w^2) = \sigma_w^2$ because $\sigma_w$ is the standard deviation of w and w is zero mean, we can write


$$E(e^{-iw}) = 1 - \frac{\sigma_w^2}{2} + \frac{E(w^4)}{24} + \cdots . \tag{3.39a}$$

The right-hand side of (3.38e) can be expanded in powers of $\sigma_w$ to get

$$e^{-\sigma_w^2/2} = 1 - \frac{\sigma_w^2}{2} + \frac{\sigma_w^4}{8} + \cdots . \tag{3.39b}$$

Substitution of (3.39a) and (3.39b) into (3.38e) now gives

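$$1 - \frac{\sigma_w^2}{2} + \frac{E(w^4)}{24} + \cdots = 1 - \frac{\sigma_w^2}{2} + \frac{\sigma_w^4}{8} + \cdots$$
or
$$\frac{E(w^4)}{24} + \cdots = \frac{\sigma_w^4}{8} + \cdots . \tag{3.39c}$$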

Equation (3.37c) reminds us that $\sigma_w^2$ is the weighted sum of $\omega_j\omega_k$ products, so for small ω it
follows that $\sigma_w^2$ is of order $\omega^2$. This means that $\sigma_w^4$ on the right-hand side of (3.39c) is of order
$\omega^4$. Similarly, Eq. (3.37a) reminds us that $E(w^4)$ on the left-hand side of (3.39c) is of order $\omega^4$
when the ω values are small. Formula (3.39c) must hold true for all values of $\omega_1$, $\omega_2$, $\omega_3$, and
$\omega_4$. If we choose $\omega_1$ through $\omega_4$ to be small, we must have

$$E(w^4) = 3\sigma_w^4 . \tag{3.39d}$$

If (3.39d) is false, then the higher powers of w and $\sigma_w$ in (3.39c), which are represented by
“+ ⋯” on both sides of the formula, cannot make (3.39c) hold true because these + ⋯ terms
contain only order $\omega^6$ and higher powers of $\omega_1$ through $\omega_4$, making them too small to rescue
the equality.
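Equations (3.38e) and (3.39d) can be checked numerically. The following is only a rough Monte Carlo sketch, not part of the derivation: the function name `check_gaussian_identities`, the value σ_w = 1.5, the sample count, and the seed are all assumptions chosen for illustration.

```python
import math
import random


def check_gaussian_identities(sigma=1.5, samples=200_000, seed=0):
    """Estimate E(exp(-i w)) and E(w^4) for a zero-mean normal w by sampling."""
    rng = random.Random(seed)
    sum_cos = 0.0
    sum_sin = 0.0
    sum_w4 = 0.0
    for _ in range(samples):
        w = rng.gauss(0.0, sigma)
        sum_cos += math.cos(w)    # real part of exp(-i w)
        sum_sin -= math.sin(w)    # imaginary part of exp(-i w)
        sum_w4 += w ** 4
    return sum_cos / samples, sum_sin / samples, sum_w4 / samples


re_part, im_part, fourth = check_gaussian_identities()
print("Re E(exp(-iw)):", re_part, " predicted by (3.38e):", math.exp(-1.5 ** 2 / 2))
print("Im E(exp(-iw)):", im_part, " predicted: 0")
print("E(w^4):", fourth, " predicted by (3.39d):", 3 * 1.5 ** 4)
```

The estimates agree with the formulas only to within sampling error, which shrinks roughly as the inverse square root of the number of samples.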
The next step is to expand E ( w 4 ) . Raising w to the fourth power in (3.37a) gives

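$$w^4 = (\omega_1 n_1 + \omega_2 n_2 + \omega_3 n_3 + \omega_4 n_4)^2\, (\omega_1 n_1 + \omega_2 n_2 + \omega_3 n_3 + \omega_4 n_4)^2$$

or
$$\begin{aligned}
w^4 = (&\omega_1^2 n_1^2 + \omega_2^2 n_2^2 + \omega_3^2 n_3^2 + \omega_4^2 n_4^2 \\
&+ 2\omega_1\omega_2 n_1 n_2 + 2\omega_1\omega_3 n_1 n_3 + 2\omega_1\omega_4 n_1 n_4 \\
&+ 2\omega_2\omega_3 n_2 n_3 + 2\omega_2\omega_4 n_2 n_4 + 2\omega_3\omega_4 n_3 n_4)^2 .
\end{aligned}$$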


Paying attention only to those terms whose coefficients are proportional to $\omega_1\omega_2\omega_3\omega_4$, we have

$$w^4 = \cdots + 24\,\omega_1\omega_2\omega_3\omega_4\, n_1 n_2 n_3 n_4 + \cdots . \tag{3.40a}$$

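Formula (3.37c) gives, again concentrating only on terms whose coefficients are proportional to
$\omega_1\omega_2\omega_3\omega_4$,
$$\begin{aligned}
\sigma_w^4 = [&\omega_1^2 E(n_1^2) + \omega_1\omega_2 E(n_1 n_2) + \omega_1\omega_3 E(n_1 n_3) + \omega_1\omega_4 E(n_1 n_4) \\
&+ \omega_2\omega_1 E(n_2 n_1) + \omega_2^2 E(n_2^2) + \omega_2\omega_3 E(n_2 n_3) + \omega_2\omega_4 E(n_2 n_4) \\
&+ \omega_3\omega_1 E(n_3 n_1) + \omega_3\omega_2 E(n_3 n_2) + \omega_3^2 E(n_3^2) + \omega_3\omega_4 E(n_3 n_4) \\
&+ \omega_4\omega_1 E(n_4 n_1) + \omega_4\omega_2 E(n_4 n_2) + \omega_4\omega_3 E(n_4 n_3) + \omega_4^2 E(n_4^2)]^2 ,
\end{aligned}$$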

which becomes

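$$\begin{aligned}
\sigma_w^4 = \cdots &+ 8\,\omega_1\omega_2\omega_3\omega_4\, E(n_1 n_2) E(n_3 n_4) + 8\,\omega_1\omega_2\omega_3\omega_4\, E(n_1 n_3) E(n_2 n_4) \\
&+ 8\,\omega_1\omega_2\omega_3\omega_4\, E(n_2 n_3) E(n_1 n_4) + \cdots .
\end{aligned} \tag{3.40b}$$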

Equations (3.40a) and (3.40b) can be substituted into (3.39d) to get

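$$\begin{aligned}
E(\cdots &+ 24\,\omega_1\omega_2\omega_3\omega_4\, n_1 n_2 n_3 n_4 + \cdots) \\
&= 3 \cdot [\cdots + 8\,\omega_1\omega_2\omega_3\omega_4\, E(n_1 n_2) E(n_3 n_4) + 8\,\omega_1\omega_2\omega_3\omega_4\, E(n_1 n_3) E(n_2 n_4) \\
&\qquad + 8\,\omega_1\omega_2\omega_3\omega_4\, E(n_2 n_3) E(n_1 n_4) + \cdots] ,
\end{aligned}$$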

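which simplifies to, using the linearity of the expectation operator (see Sec. 3.10),

$$\begin{aligned}
\cdots &+ 24\,\omega_1\omega_2\omega_3\omega_4\, E(n_1 n_2 n_3 n_4) + \cdots \\
&= \cdots + 24\,\omega_1\omega_2\omega_3\omega_4\, [E(n_1 n_2) E(n_3 n_4) + E(n_1 n_3) E(n_2 n_4) + E(n_2 n_3) E(n_1 n_4)] + \cdots .
\end{aligned}$$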

This must hold true for any combination of ω 1 , ω 2 , ω 3 , and ω 4 values, large or small, so the
coefficients of all the ω1ω2ω3ω4 terms must be the same on both sides of this equation. Therefore,

$$E(n_1 n_2 n_3 n_4) = E(n_1 n_2) E(n_3 n_4) + E(n_1 n_3) E(n_2 n_4) + E(n_2 n_3) E(n_1 n_4) \tag{3.40c}$$

for any collection of zero-mean, jointly normal random variables n1 , n2 , n3 , and n4 .
Equation (3.40c) requires ω 1 through ω 4 to be distinct real parameters, but it does not


require the n1 , n2 , n3 , and n4 random variables to be distinct. Consequently, if n1 and n2 are the
same, we can relabel the jointly normal random variables using

$$n_1 = n_2 = n_a , \qquad n_3 = n_b , \qquad n_4 = n_c$$
to get
$$E(n_a^2 n_b n_c) = E(n_a^2) E(n_b n_c) + 2\, E(n_a n_b) E(n_a n_c) . \tag{3.41a}$$

Similarly, if n3 and n4 are also identical, we can relabel n1 through n4 as

$$n_1 = n_2 = n_a \qquad \text{and} \qquad n_3 = n_4 = n_b ,$$
so that
$$E(n_a^2 n_b^2) = E(n_a^2) E(n_b^2) + 2\, E(n_a n_b)^2 . \tag{3.41b}$$

When all four random variables are the same, Eq. (3.40c) collapses to

$$E(\tilde{n}^4) = 3\, E(\tilde{n}^2)^2 , \tag{3.41c}$$

which holds true for any zero-mean random variable ñ obeying a normal distribution.
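As a sanity check on Eqs. (3.34d) and (3.40c), the sketch below draws samples of four correlated, zero-mean, jointly normal variables and compares sample averages with the predicted values. It is an illustration under stated assumptions: the mixing matrix `MIX`, the helper names, the sample count, and the seed are hypothetical choices, and the jointly normal variables are built as linear combinations of independent standard normal variables.

```python
import random

# Hypothetical mixing matrix: each row builds one n_j from four independent
# standard normal variables, so the n_j are zero-mean and jointly normal.
MIX = [
    [1.0, 0.0, 0.0, 0.0],
    [0.6, 0.8, 0.0, 0.0],
    [0.0, 0.5, 1.0, 0.0],
    [0.3, 0.0, -0.4, 1.0],
]


def covariance(j, k):
    """Exact E(n_j n_k) implied by MIX (indices start at 0)."""
    return sum(a * b for a, b in zip(MIX[j], MIX[k]))


def estimate_moments(samples=200_000, seed=1):
    """Sample averages of n1*n2*n3 and n1*n2*n3*n4."""
    rng = random.Random(seed)
    triple = 0.0
    quad = 0.0
    for _ in range(samples):
        g = [rng.gauss(0.0, 1.0) for _ in range(4)]
        n = [sum(c * x for c, x in zip(row, g)) for row in MIX]
        triple += n[0] * n[1] * n[2]
        quad += n[0] * n[1] * n[2] * n[3]
    return triple / samples, quad / samples


# Right-hand side of Eq. (3.40c), evaluated from the exact covariances.
predicted = (covariance(0, 1) * covariance(2, 3)
             + covariance(0, 2) * covariance(1, 3)
             + covariance(1, 2) * covariance(0, 3))
triple_est, quad_est = estimate_moments()
print("E(n1 n2 n3) estimate:", triple_est, " predicted by (3.34d): 0")
print("E(n1 n2 n3 n4) estimate:", quad_est, " predicted by (3.40c):", predicted)
```

Agreement is again only statistical; a different seed or sample count changes the estimates by amounts comparable to the sampling error.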

3.18 Ergodic Random Functions


Ergodic random functions are random functions where time averages can be used to calculate
ensemble averages. Just as stationary random functions can be stationary in many different ways,
so can ergodic random functions be ergodic in many different ways.
We start with a simple example, discussing what is meant by saying that a random function
ñ(t) is “ergodic in the mean.”41 Equation (3.23c) defines the mean of ñ(t) to be the ensemble
average created by the expectation operator,

$$\mu_n(t) = E(\tilde{n}(t)) .$$

To find the mean using a time average, we must calculate

41
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.


$$\frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt$$

and take the limit as T → ∞ . Since “ergodic” refers to using time averages to calculate ensemble
averages, we might expect that a random function that is ergodic in the mean would satisfy the
equation
$$\mu_n(t) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt . \tag{3.42a}$$

There are two problems with Eq. (3.42a). The first is that µn (t ) is allowed to be a function of time
t, whereas
$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt$$

is not. This means Eq. (3.42a) can only be true when µn (t ) does not depend on time.
Consequently, for ñ to be ergodic in the mean, we must also require ñ to be stationary in the mean
with [see Eq. (3.30a) above]

$$E(\tilde{n}(t)) = \mu_n = \text{constant with respect to time.}$$

Now Eq. (3.42a) can be written as


$$\mu_n = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt . \tag{3.42b}$$

The second problem is more difficult to deal with. We note that the value of

$$\frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt$$

must be a random value because it is proportional to the integral of a random function. Hence, we
expect
$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt$$

also to be a random value. This means Eq. (3.42b) sets a random value equal to µn , a nonrandom


value, which is in general not allowed. The way out of this impasse is to put a restriction on the
limiting process used to get the right-hand side of (3.42b). Clearly,

$$\xi(T) = \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt \tag{3.42c}$$

is a random function of T. This means there must be a probability density distribution pξ (T ) (ξ )
such that pξ (T ) (ξ ) d ξ is the probability that ξ (T ) takes on a value between ξ and ξ + dξ . We
now require the limiting random variable

$$\xi_\infty = \lim_{T\to\infty} \xi(T) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt \tag{3.42d}$$

to obey the limiting probability density distribution

$$p_{\xi_\infty}(\xi_\infty) = \delta(\xi_\infty - \mu_n) . \tag{3.42e}$$

According to the discussion following Eqs. (3.7e) and (3.7f) above, this turns ξ∞ into a random
variable that behaves like a constant, since


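$$E(\xi_\infty) = \int_{-\infty}^{\infty} \delta(\xi_\infty - \mu_n) \cdot \xi_\infty \cdot d\xi_\infty = \mu_n$$
and
$$E\big((\xi_\infty - \mu_n)^2\big) = \int_{-\infty}^{\infty} \delta(\xi_\infty - \mu_n) \cdot (\xi_\infty - \mu_n)^2 \cdot d\xi_\infty = 0 .$$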

Now we can note that, yes, strictly speaking, Eq. (3.42b) does equate a random variable to a
nonrandom variable, but this does not matter because Eq. (3.42e) makes the random variable

$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt$$

equivalent to a nonrandom quantity.


A random function ñ(t) is “ergodic in the autocorrelation function”42 if the autocorrelation
function defined as an ensemble average in Eq. (3.23b) can also be calculated with a time
average. Glancing back at (3.23b), we define t2 − t1 = τ and set the ensemble average equal to the
time average by writing
$$E\big(\tilde{n}(t_1)\, \tilde{n}(t_1 + \tau)\big) = \lim_{T\to\infty} \left( \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, \tilde{n}(t + \tau)\, dt \right) . \tag{3.43a}$$

Once again we face the same two problems: the left-hand side of this equation is allowed to be a
function of t1 whereas the right-hand side is not, and the left-hand side of this equation is
nonrandom whereas the right-hand side is random.
Dealing with the t1 problem first, we again say that

$$E\big(\tilde{n}(t_1)\, \tilde{n}(t_1 + \tau)\big)$$

does not depend on t1 , making ñ(t) stationary with respect to its autocorrelation function. Now
Eq. (3.43a) can be written as

$$E\big(\tilde{n}(t_1)\, \tilde{n}(t_1 + \tau)\big) = R_{\tilde{n}\tilde{n}}(\tau) = \lim_{T\to\infty} \left( \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, \tilde{n}(t + \tau)\, dt \right) . \tag{3.43b}$$

Both in Eqs. (3.42a) and (3.42b) describing what it means to be ergodic in the mean, and in Eqs.
(3.43a) and (3.43b) describing what it means to be ergodic in the autocorrelation function, the
time dependence that ensemble averaging preserves is lost in the time average. This is clearly
going to happen whenever some sort of ensemble average is set equal to the corresponding time
average. We conclude that when a random function is ergodic in some way, it must also be
stationary in that same way. In this sense, ergodic random functions are always stationary.43
Moving on to the second problem with Eq. (3.43a)—that of equating random and nonrandom
quantities—we follow the same procedure as before. This time the random function ξ is defined
to be
$$\xi(T, \tau) = \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, \tilde{n}(t + \tau)\, dt \tag{3.44a}$$

and the random function ξ∞ (τ ) is defined to be

42
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.
43
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.

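$$\xi_\infty(\tau) = \lim_{T\to\infty} \xi(T, \tau) . \tag{3.44b}$$

Associated with $\xi_\infty(\tau)$ is the probability density distribution $p_{\xi_\infty(\tau)}$ such that $p_{\xi_\infty(\tau)}(\xi_\infty)\, d\xi_\infty$ is the
probability that $\xi_\infty(\tau)$ has a value between $\xi_\infty$ and $\xi_\infty + d\xi_\infty$. We again require

$$p_{\xi_\infty(\tau)}(\xi_\infty) = \delta\big(\xi_\infty - R_{\tilde{n}\tilde{n}}(\tau)\big) \tag{3.44c}$$
so that
$$E\big(\xi_\infty(\tau)\big) = \int_{-\infty}^{\infty} p_{\xi_\infty(\tau)}(\xi_\infty)\, \xi_\infty\, d\xi_\infty = \int_{-\infty}^{\infty} \delta\big(\xi_\infty - R_{\tilde{n}\tilde{n}}(\tau)\big)\, \xi_\infty\, d\xi_\infty = R_{\tilde{n}\tilde{n}}(\tau) \tag{3.44d}$$
and
$$\begin{aligned}
E\big([\xi_\infty(\tau) - R_{\tilde{n}\tilde{n}}(\tau)]^2\big) &= \int_{-\infty}^{\infty} p_{\xi_\infty(\tau)}(\xi_\infty)\, [\xi_\infty - R_{\tilde{n}\tilde{n}}(\tau)]^2\, d\xi_\infty \\
&= \int_{-\infty}^{\infty} \delta\big(\xi_\infty - R_{\tilde{n}\tilde{n}}(\tau)\big)\, [\xi_\infty - R_{\tilde{n}\tilde{n}}(\tau)]^2\, d\xi_\infty = 0 .
\end{aligned} \tag{3.44e}$$

This shows, according to the discussion following Eqs. (3.7e) and (3.7f), that the random variable
$\xi_\infty(\tau)$ behaves like a nonrandom quantity. We have now solved the second problem with Eq.
(3.43a) and therefore can make sense of the idea that a random function can be ergodic in the
autocorrelation function.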
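A discrete-time sketch can make the idea of being ergodic in the autocorrelation function concrete. The example below is an assumption-laden illustration rather than anything taken from the text: it uses a hypothetical moving-average process x[k] = g[k] + 0.5 g[k−1], with g[k] independent standard normal samples, whose ensemble autocorrelation is 1.25 at lag 0, 0.5 at lag 1, and 0 at lag 2. A sum over one long record stands in for the time integral in Eq. (3.43b).

```python
import random


def moving_average_record(length, seed):
    """One member function: x[k] = g[k] + 0.5*g[k-1], with g white standard normal."""
    rng = random.Random(seed)
    g = [rng.gauss(0.0, 1.0) for _ in range(length + 1)]
    return [g[k + 1] + 0.5 * g[k] for k in range(length)]


def time_average_autocorrelation(x, lag):
    """Discrete stand-in for (1/2T) * integral of n(t) n(t + tau) dt."""
    count = len(x) - lag
    return sum(x[k] * x[k + lag] for k in range(count)) / count


record = moving_average_record(200_000, seed=2)
estimates = [time_average_autocorrelation(record, lag) for lag in range(3)]
ensemble_values = [1.25, 0.5, 0.0]
for lag, (est, exact) in enumerate(zip(estimates, ensemble_values)):
    print("lag", lag, "time average:", est, "ensemble value:", exact)
```

Using a single record here mirrors the situation described in the text: the time average over one member function is compared with an ensemble average that was obtained separately, in this case analytically.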
The pattern used in analyzing the ergodic qualities of a random function ñ(t) has by now been
set. There is some mathematically useful and reasonable function f that has N arguments. We pick
N time values t1 , t2 ,…, tN and calculate an ensemble expectation value or average

$$E\big( f( \tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N) ) \big) ,$$

which is then set equal to the time average

$$\lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde{n}(t), \tilde{n}(t + \tau_2), \tilde{n}(t + \tau_3), \ldots, \tilde{n}(t + \tau_N) \big)\, dt .$$

We define
$$\tau_2 = t_2 - t_1 , \quad \tau_3 = t_3 - t_1 , \quad \ldots , \quad \tau_N = t_N - t_1$$

and set the expectation value equal to the time average by writing

- 275 -
3 · Random Variables, Random Functions, and Power Spectra

E ( f ( n (t1 ), n (t2 ),… , n (t N ) ) )


= E ( f ( n (t1 ), n (t1 + τ 2 ), n (t1 + τ 3 ),… , n (t1 + τ N ) ) ) (3.45a)
T
1
= lim
T →∞ 2T ³ f ( n (t ), n (t + τ
−T
2 ), n (t + τ 3 ),… , n (t + τ N ) ) dt.

In order for Eq. (3.45a) to make sense, the expectation value

$$E\big( f( \tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N) ) \big) = E\big( f( \tilde{n}(t_1), \tilde{n}(t_1 + \tau_2), \tilde{n}(t_1 + \tau_3), \ldots, \tilde{n}(t_1 + \tau_N) ) \big)$$

cannot be a function of t1 . This means the right-hand side of this relationship still has the same
value when t1 is increased by any time value τ ; hence we can write, increasing t1 by τ only on
the right-hand side,

$$E\big( f( \tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N) ) \big) = E\big( f( \tilde{n}(t_1 + \tau), \tilde{n}(t_1 + \tau + \tau_2), \tilde{n}(t_1 + \tau + \tau_3), \ldots, \tilde{n}(t_1 + \tau + \tau_N) ) \big) .$$

Remembering that
τ 2 = t2 − t1 , τ 3 = t3 − t1 , …, τ N = t N − t1 ,

we eliminate τ 2 , τ 3 ,…, τ N from the equation to get

$$E\big( f( \tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N) ) \big) = E\big( f( \tilde{n}(t_1 + \tau), \tilde{n}(t_2 + \tau), \ldots, \tilde{n}(t_N + \tau) ) \big) . \tag{3.45b}$$

This is the same as Eq. (3.24c) above. We conclude that Eq. (3.24c) must be true whenever Eq.
(3.45a) is true. According to the discussion following Eq. (3.24c), whenever Eq. (3.45a) is true,
the expectation value
$$E\big( f( \tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N) ) \big)$$

must be a function of only the N − 1 independent time values

τ 2 = t2 − t1 , τ 3 = t3 − t1 , …, τ N = t N − t1 .

Consequently, the expectation values and the time integral in Eq. (3.45a) have the same number


of independent time parameters, which we can show by writing

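$$S(\tau_2, \tau_3, \ldots, \tau_N) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde{n}(t), \tilde{n}(t + \tau_2), \tilde{n}(t + \tau_3), \ldots, \tilde{n}(t + \tau_N) \big)\, dt , \tag{3.45c}$$

where
$$S(\tau_2, \tau_3, \ldots, \tau_N) = E\big( f( \tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N) ) \big) . \tag{3.45d}$$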

Equation (3.45a) needs to have one more requirement imposed on it—the random quantity on the
right-hand side must be equivalent to the nonrandom quantity on the left. This means the random
quantity
$$\xi_\infty(\tau_2, \tau_3, \ldots, \tau_N) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde{n}(t), \tilde{n}(t + \tau_2), \tilde{n}(t + \tau_3), \ldots, \tilde{n}(t + \tau_N) \big)\, dt \tag{3.45e}$$

must become equivalent to the nonrandom quantity S by having

$$E\big( \xi_\infty(\tau_2, \tau_3, \ldots, \tau_N) \big) = S(\tau_2, \tau_3, \ldots, \tau_N) \tag{3.45f}$$
and
$$E\big( [\xi_\infty(\tau_2, \tau_3, \ldots, \tau_N) - S(\tau_2, \tau_3, \ldots, \tau_N)]^2 \big) = 0 . \tag{3.45g}$$

Now, by requiring Eqs. (3.45b)–(3.45g) to hold true, we can be sure that Eq. (3.45a) is
mathematically self-consistent.
It is not difficult to relate this mathematical machinery to the analysis of what it means to say
that ñ(t) is ergodic in the mean or ergodic in the autocorrelation function. When specifying what
it means to say that ñ(t) is ergodic in the mean, we take N = 1 and define function f to be
f ( x) = x ; and when specifying what it means to say that ñ(t) is ergodic in the autocorrelation
function, we take N = 2 and define function f to be f ( x, y ) = xy . To give another example of
how to use Eqs. (3.45a)–(3.45g), we examine an often encountered type of ergodicity called
“ergodic in the variance.”44 We define ergodic in the variance for a random function ñ(t) by
setting N = 1 and f ( x) = ( x − µn ) 2 , with µn in function f being the stationary mean of ñ,

$$E(\tilde{n}(t)) = \mu_n ,$$

specified by Eq. (3.30a) above. When a random function ñ(t) is ergodic in the variance, Eq.

44
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.


(3.45a) becomes
$$E\big( [\tilde{n}(t) - \mu_n]^2 \big) = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} [\tilde{n}(t) - \mu_n]^2\, dt . \tag{3.46a}$$

The requirements imposed by Eq. (3.45b) can be written as

$$E\big( [\tilde{n}(t) - \mu_n]^2 \big) = E\big( [\tilde{n}(t + \tau) - \mu_n]^2 \big) \tag{3.46b}$$

for all values of τ , which means that

$$E\big( [\tilde{n}(t) - \mu_n]^2 \big) = v_n = \text{nonrandom variable independent of time.} \tag{3.46c}$$

Here, we write vn instead of vn ( t ) for the variance of ñ(t) to emphasize that vn does not depend
on time. Equation (3.46c) can be interpreted as saying that ñ is stationary with respect to its
variance vn . We note that variance vn is equivalent to S in Eq. (3.45d), so Eqs. (3.45e), (3.45f),
and (3.45g) now reduce to
$$\xi_\infty = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} [\tilde{n}(t) - \mu_n]^2\, dt , \tag{3.46d}$$

$$E(\xi_\infty) = v_n , \tag{3.46e}$$

and
$$E\big( [\xi_\infty - v_n]^2 \big) = 0 . \tag{3.46f}$$

A random function ñ(t) is called weakly ergodic if it is ergodic in the mean, ergodic in the
variance, and ergodic in the autocorrelation function.45 It is called strongly ergodic if Eqs.
(3.45a)–(3.45g) are satisfied for all N = 1, 2,… , ∞ and for any reasonable choice of function f.
This is equivalent to requiring that all reasonable ensemble averages of the random function ñ(t)
be equal to their corresponding time averages.
The distinction made between weakly ergodic and strongly ergodic is reminiscent of the
distinction made between wide-sense stationary and strict-sense stationary. Just as all strict-sense
stationary random functions are also wide-sense stationary, but not all wide-sense stationary
random functions are strict-sense stationary, so too are all strongly ergodic random functions also
weakly ergodic, but not all weakly ergodic random functions are strongly ergodic. The Gaussian
random processes discussed in Sec. 3.16 above are an important special case. We have already

45
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.


said that when Gaussian random processes are wide-sense stationary they must also be strict-
sense stationary; it can also be shown that whenever Gaussian random processes are weakly
ergodic they must also be strongly ergodic.46
Although we have seen that all ergodic random functions are also stationary, it is easy to show
that not all stationary random functions are ergodic. The random function

$$\tilde{n}(t) = c , \tag{3.47a}$$

where c is a random constant chosen from a probability density distribution pc (c ) , is clearly
strict-sense stationary. To see why this is so, we just observe that Eq. (3.24c) is automatically
satisfied, since


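$$\begin{aligned}
E\big( f( \tilde{n}(t_1), \tilde{n}(t_2), \ldots, \tilde{n}(t_N) ) \big) &= \int_{-\infty}^{\infty} p_c(c)\, f(c, c, \ldots, c)\, dc \\
&= E\big( f(c, c, \ldots, c) \big) = E\big( f( \tilde{n}(t_1 + \tau), \tilde{n}(t_2 + \tau), \ldots, \tilde{n}(t_N + \tau) ) \big)
\end{aligned} \tag{3.47b}$$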

for any value of τ and any integrable function f with N = 1, 2,… , ∞ arguments. On the other
hand, ñ(t) = c cannot be ergodic because once a value for c is chosen from the ensemble, it must
stay the same for all time values. Looking at even the simplest type of ergodicity, ergodicity in
the mean, we get from Eq. (3.42d)

$$\xi_\infty = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} \tilde{n}(t)\, dt = \lim_{T\to\infty} \left( \frac{1}{2T} \cdot (2Tc) \right) = c . \tag{3.47c}$$

Hence, the probability density distribution of ξ∞ is the same as the probability density
distribution pc , which, unless pc is a delta function, violates requirement (3.42e) for ergodic in
the mean.
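The contrast between a stationary but nonergodic function and one that is ergodic in the mean can be seen in a small simulation. The sketch below makes several assumptions for illustration only: the random constant c is drawn from a standard normal distribution, the comparison process is discrete white Gaussian noise with zero mean, and all function names, record lengths, and the seed are hypothetical. For each process it computes the time average of many member functions and then measures how much those time averages vary across the ensemble.

```python
import random


def time_average(member):
    """Time average of one sampled member function."""
    return sum(member) / len(member)


def constant_member(rng, length):
    """Member function of n(t) = c: one random constant held for all time."""
    c = rng.gauss(0.0, 1.0)
    return [c] * length


def white_noise_member(rng, length):
    """Member function of zero-mean white Gaussian noise."""
    return [rng.gauss(0.0, 1.0) for _ in range(length)]


def spread_of_time_averages(make_member, members=200, length=2_000, seed=3):
    """Sample variance of the time average across ensemble members."""
    rng = random.Random(seed)
    averages = [time_average(make_member(rng, length)) for _ in range(members)]
    mean = sum(averages) / members
    return sum((a - mean) ** 2 for a in averages) / (members - 1)


spread_constant = spread_of_time_averages(constant_member)
spread_noise = spread_of_time_averages(white_noise_member)
print("variance of time averages, n(t) = c:   ", spread_constant)
print("variance of time averages, white noise:", spread_noise)
```

For the constant process the time averages keep the full variance of c no matter how long the record is, which is the behavior Eq. (3.47c) describes. For the white-noise process the variance of the time average falls as the record length grows, consistent with the delta-function requirement (3.42e) in the limit.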

3.19 Experimental Noise


We almost always analyze noise in experimental signals as a random function of time ñ(t). The
signal noise in any given experiment is then a member function chosen at random from the
ensemble of allowed functions because it corresponds to pulling the levers of all the slot
machines simultaneously in Sec. 3.14 above. This suggests that the straightforward way to
calculate an expectation value or ensemble average is to acquire many different member

46
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 83.


functions by running the experiment many different times. This is, of course, unlikely to happen;
there is usually not much incentive to do the same experiment over and over in exactly the same
way, because the point of most experiments is to measure a signal, not the noise associated with
it. Sometimes repeating an experiment is literally impossible. If, for example, stock-market prices
are treated as random functions of time, there is no way to repeat last year to see what happens
this time around. Consequently, when examining random functions of time, there is usually only
one, or at best a few, member functions of the ensemble to examine. In practice, then, most
experimental statisticians are forced to assume that their random functions are ergodic as well as
stationary; otherwise, they cannot calculate the ensemble averages needed for their analysis.
Another point worth making about stationarity and ergodicity is that, strictly speaking, no
experimental data can be truly stationary or truly ergodic in even the weakest sense, because
before an experiment begins or after an experiment ends the random function representing the
noise must be strictly zero. One way of handling this is to regard the noise data as a finite-length
sample of some random function stretching between t = −∞ and t = +∞, but we should also
acknowledge that stationarity and ergodicity are ideals that experimental noise can only realize to
some degree of approximation. Just as, in Sec. 3.5 above, many pairs of independent random
variables turn out after all to depend slightly on each other, so too do many recordings of
experimental noise turn out, after close analysis, to be stationary and ergodic only to some degree
of approximation.

3.20 The Power Spectrum


A random function ñ(t) that is wide-sense stationary has an autocorrelation function $R_{\tilde{n}\tilde{n}}$, which
according to Eq. (3.30b) can be written as

$$R_{\tilde{n}\tilde{n}}(t_2 - t_1) = E\big( \tilde{n}(t_1)\, \tilde{n}(t_2) \big) \tag{3.48a}$$

for any two time values t2 and t1 . We note that

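$$E\big( \tilde{n}(t_1)\, \tilde{n}(t_2) \big) = E\big( \tilde{n}(t_2)\, \tilde{n}(t_1) \big)$$

automatically. This means that

$$R_{\tilde{n}\tilde{n}}(t_2 - t_1) = R_{\tilde{n}\tilde{n}}(t_1 - t_2)$$

or, setting τ = t2 − t1 ,
$$R_{\tilde{n}\tilde{n}}(\tau) = R_{\tilde{n}\tilde{n}}(-\tau) , \tag{3.48b}$$

making $R_{\tilde{n}\tilde{n}}$ an even function when ñ is wide-sense stationary. Since $R_{\tilde{n}\tilde{n}}$ is a function of only
the single real parameter τ , we can set up the one-dimensional Fourier transform of $R_{\tilde{n}\tilde{n}}$, getting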


$$S_{\tilde{n}\tilde{n}}(f) = \int_{-\infty}^{\infty} R_{\tilde{n}\tilde{n}}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \tag{3.48c}$$

This Fourier transform $S_{\tilde{n}\tilde{n}}(f)$ of $R_{\tilde{n}\tilde{n}}$ almost always exists, and we define it to be the power
spectrum47,48 of the random function ñ(t). Over the next few sections of this chapter, we examine
the properties of $S_{\tilde{n}\tilde{n}}$, showing as we go along why it makes sense to call it the power spectrum.

Functions ñ that have power spectra must be wide-sense stationary because we are assuming
that the autocorrelation $R_{\tilde{n}\tilde{n}}$ is a function with only a single real argument. Given that $S_{\tilde{n}\tilde{n}}$ exists,
we can always reverse the transform in Eq. (3.48c) and write the autocorrelation function of ñ as
the inverse Fourier transform of the power spectrum,

$$R_{\tilde{n}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\, e^{2\pi i f \tau}\, df . \tag{3.48d}$$

When two random functions nα (t ) and nβ (t ) are jointly wide-sense stationary, as defined in the
discussion following Eq. (3.30c), we can define their cross-power spectrum to be


$$S_{n_\alpha n_\beta}(f) = \int_{-\infty}^{\infty} R_{n_\alpha n_\beta}(\tau)\, e^{-2\pi i f \tau}\, d\tau , \tag{3.48e}$$

where
$$R_{n_\alpha n_\beta}(t_2 - t_1) = E\big( n_\alpha(t_1)\, n_\beta(t_2) \big)$$

is their cross-correlation function introduced in Eq. (3.30d).


We know that $R_{\tilde{n}\tilde{n}}$ in Eq. (3.48a) is always real because $E(\tilde{n}(t_1)\,\tilde{n}(t_2))$ is always real.
According to Eq. (3.48b), $R_{\tilde{n}\tilde{n}}$ is an even function of its argument. Therefore its Fourier
transform, the power spectrum $S_{\tilde{n}\tilde{n}}$, is the Fourier transform of a real and even function. Because
the Fourier transform of a real and even function is always another real and even function,49 it
follows that $S_{\tilde{n}\tilde{n}}$ is also real and even:

$$\operatorname{Im}\big( S_{\tilde{n}\tilde{n}}(f) \big) = 0 \tag{3.49a}$$
and
$$S_{\tilde{n}\tilde{n}}(-f) = S_{\tilde{n}\tilde{n}}(f) . \tag{3.49b}$$

47
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 124.
48
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 319.
49
See entry 1 of Table 2.1 in Chapter 2.


We note in passing that the cross-power spectrum $S_{n_\alpha n_\beta}$ in (3.48e) is not necessarily a real-valued
function. It is, however, the Fourier transform of a real-valued function $R_{n_\alpha n_\beta}$, so it must be
Hermitian,50

$$S_{n_\alpha n_\beta}(-f) = S_{n_\alpha n_\beta}(f)^{*} . \tag{3.49c}$$

Equation (3.49a) shows that $S_{\tilde{n}\tilde{n}}$ behaves like a power spectrum by being strictly real; Eq.
(3.49b) shows that $S_{\tilde{n}\tilde{n}}$ is double-sided, having the same value at +f and −f. The next step is to
show that $S_{\tilde{n}\tilde{n}}$ behaves like a power spectrum by being non-negative for all values of f, but that
has to wait until we examine what happens to $S_{\tilde{n}\tilde{n}}$ when a wide-sense stationary random function
ñ(t) is put through an arbitrary linear system.
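The real and even character of the power spectrum can be illustrated numerically. The sketch below assumes a hypothetical autocorrelation function R(τ) = exp(−|τ|), which is real and even, and approximates the Fourier integral of Eq. (3.48c) by a Riemann sum on a grid that is symmetric about τ = 0. The grid width, step size, and test frequency are arbitrary choices. For this particular R the transform is known in closed form, 2/(1 + (2πf)²), which gives something to compare against.

```python
import cmath
import math


def autocorrelation(tau):
    """Hypothetical real, even autocorrelation function R(tau) = exp(-|tau|)."""
    return math.exp(-abs(tau))


def power_spectrum(f, half_width=40.0, step=0.01):
    """Riemann-sum approximation to Eq. (3.48c) on a grid symmetric about tau = 0."""
    count = int(round(half_width / step))
    total = 0j
    for k in range(-count, count + 1):
        tau = k * step
        total += autocorrelation(tau) * cmath.exp(-2j * math.pi * f * tau)
    return total * step


s_plus = power_spectrum(0.25)
s_minus = power_spectrum(-0.25)
closed_form = 2.0 / (1.0 + (2.0 * math.pi * 0.25) ** 2)
print("S(+0.25):", s_plus)
print("S(-0.25):", s_minus)
print("closed form for this R:", closed_form)
```

The imaginary parts vanish to rounding error because the sine contributions at +τ and −τ cancel, and the values at +f and −f coincide, in line with Eqs. (3.49a) and (3.49b).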

3.21 Random Inputs and Outputs of Linear Systems


Section 2.9 in Chapter 2 describes what a convolution is and the role it plays in Fourier-transform
theory. A linear system can be represented by a convolution, with the u(t) input being convolved
with the linear system’s impulse-response function h(t) to get the v(t) output,

$$v(t) = h(t) * u(t) = u(t) * h(t) .$$

According to the definition of convolution in Chapter 2 [see Eq. (2.38a)], this can be written as

$$v(t) = \int_{-\infty}^{\infty} h(\tau')\, u(t - \tau')\, d\tau' .$$

When a random function ñ(t) is the input to a linear system characterized by an impulse-response
function h(t), the output is another random function $\tilde{m}(t)$ given by

$$\tilde{m}(t) = \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t - \tau')\, d\tau' . \tag{3.50a}$$

50
See entry 7 of Table 2.1 in Chapter 2.


We define the correlation function between $\tilde{m}(t)$ and ñ(t) to be51

$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = E\big( \tilde{m}(t_1)\, \tilde{n}(t_2) \big) . \tag{3.50b}$$

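Function $R_{\tilde{m}\tilde{n}}(t_1, t_2)$ is called the cross-correlation function of $\tilde{m}$ and ñ. Substitution of (3.50a)
gives
$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = E\left( \tilde{n}(t_2) \left[ \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t_1 - \tau')\, d\tau' \right] \right) = E\left( \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t_2)\, \tilde{n}(t_1 - \tau')\, d\tau' \right) .$$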

Using Eq. (3.17c) to move the expectation operator inside the integral, and using (3.16a) to put h
outside the expectation operator because it is a nonrandom quantity, we get

$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, E\big( \tilde{n}(t_2)\, \tilde{n}(t_1 - \tau') \big)\, d\tau' .$$

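Assuming that ñ is wide-sense stationary, we use Eq. (3.30b) to write

$$E\big( \tilde{n}(t_2)\, \tilde{n}(t_1 - \tau') \big) = R_{\tilde{n}\tilde{n}}(t_1 - \tau' - t_2)$$

so that

$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{n}\tilde{n}}(t_1 - t_2 - \tau')\, d\tau' . \tag{3.50c}$$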

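This shows that $R_{\tilde{m}\tilde{n}}$ depends only on the difference between t1 and t2 . Nothing then stops us
from regarding $R_{\tilde{m}\tilde{n}}$ as a function of τ = t2 − t1 , which gives

$$R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{n}\tilde{n}}(-\tau - \tau')\, d\tau'$$

or, using Eq. (3.48b),

$$R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{n}\tilde{n}}(\tau + \tau')\, d\tau' .$$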

51
This derivation comes from Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, pp.
323–324.


Changing the variable of integration to τ″ = −τ′ changes this into a convolution,

$$R_{\tilde{m}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} h(-\tau'')\, R_{\tilde{n}\tilde{n}}(\tau - \tau'')\, d\tau'' = h(-\tau) * R_{\tilde{n}\tilde{n}}(\tau) . \tag{3.50d}$$
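A discrete-time sketch of Eq. (3.50d) may help fix the sign conventions. It rests on assumptions that are not in the text: the input is taken to be unit-variance white Gaussian noise, so that its autocorrelation is a unit spike at zero lag and (3.50d) reduces to R_mn(τ) = h(−τ); the three-sample impulse response `H` is hypothetical; and the expectation E(m(t1) n(t1 + τ)) is estimated by a time average over one long record, which presumes ergodicity.

```python
import random

H = [0.5, 0.3, 0.2]   # hypothetical causal impulse response h[0], h[1], h[2]


def filtered(noise):
    """Discrete form of Eq. (3.50a): m = sum over j of h[j] * n[k - j]."""
    return [sum(H[j] * noise[k - j] for j in range(len(H)))
            for k in range(len(H) - 1, len(noise))]


def cross_correlation(m, n_aligned, lag):
    """Time-average estimate of E(m[k] n[k + lag]) for lag <= 0."""
    start = -lag
    total = sum(m[k] * n_aligned[k + lag] for k in range(start, len(m)))
    return total / (len(m) - start)


rng = random.Random(4)
noise = [rng.gauss(0.0, 1.0) for _ in range(200_000)]
m = filtered(noise)
n_aligned = noise[len(H) - 1:]    # n_aligned[k] is the input sample at the time of m[k]
estimates = {lag: cross_correlation(m, n_aligned, lag) for lag in (0, -1, -2, -3)}
for lag, value in estimates.items():
    print("lag", lag, "estimate:", value)
```

Under these assumptions the estimates at lags 0, −1, and −2 should approach h[0], h[1], and h[2], and the estimate at lag −3 should approach zero, because the output at time t1 depends only on inputs at t1 and earlier.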

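Equation (3.50a) can also be used to evaluate the autocorrelation function of the random
output $\tilde{m}(t)$, giving
$$\begin{aligned}
R_{\tilde{m}\tilde{m}}(t_1, t_2) = E\big( \tilde{m}(t_1)\, \tilde{m}(t_2) \big) &= E\left( \tilde{m}(t_1) \left[ \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t_2 - \tau')\, d\tau' \right] \right) \\
&= E\left( \int_{-\infty}^{\infty} h(\tau')\, \tilde{m}(t_1)\, \tilde{n}(t_2 - \tau')\, d\tau' \right) .
\end{aligned}$$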

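Again moving the expectation operator inside the integral, we use Eq. (3.50b) to write

$$R_{\tilde{m}\tilde{m}}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, E\big( \tilde{m}(t_1)\, \tilde{n}(t_2 - \tau') \big)\, d\tau' = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{m}\tilde{n}}(t_1, t_2 - \tau')\, d\tau' .$$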

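From (3.50c) we know that $R_{\tilde{m}\tilde{n}}$ depends only on the difference between times t1 and t2 , which
means we can write

$$R_{\tilde{m}\tilde{n}}(t_1, t_2) = R_{\tilde{m}\tilde{n}}(t_2 - t_1) .$$

Hence, the formula for $R_{\tilde{m}\tilde{m}}(t_1, t_2)$ simplifies to

$$R_{\tilde{m}\tilde{m}}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{m}\tilde{n}}(t_2 - t_1 - \tau')\, d\tau' . \tag{3.51a}$$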

This is an important result because it shows that the autocorrelation of the output random
function $\tilde{m}$ depends only on τ = t2 − t1 . Substituting τ for (t2 − t1 ) gives

$$R_{\tilde{m}\tilde{m}}(\tau) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde{m}\tilde{n}}(\tau - \tau')\, d\tau' = h(\tau) * R_{\tilde{m}\tilde{n}}(\tau) . \tag{3.51b}$$

Glancing back at Eqs. (3.30a) and (3.30b) above, and having shown that the autocorrelation
function $R_{\tilde{m}\tilde{m}}(t_1, t_2)$ depends only on (t2 − t1 ), we realize that $\tilde{m}$ must be wide-sense stationary if
$E(\tilde{m}(t))$ is time-independent and finite. Taking the expectation value of both sides of (3.50a)
gives
$$\begin{aligned}
E\big( \tilde{m}(t) \big) &= E\left( \int_{-\infty}^{\infty} h(\tau')\, \tilde{n}(t - \tau')\, d\tau' \right) = \int_{-\infty}^{\infty} h(\tau')\, E\big( \tilde{n}(t - \tau') \big)\, d\tau' \\
&= \mu_n \int_{-\infty}^{\infty} h(\tau')\, d\tau' ,
\end{aligned} \tag{3.51c}$$

where we have again assumed that ñ(t) is wide-sense stationary so that, according to Eq. (3.30a),

E ( n (t ) ) = µ n = same finite constant for all values of t .

Equation (3.51c) makes E ( m (t ) ) a time-independent quantity. The Fourier transform of the


impulse-response function h is called the transfer function,

³ h(t ) e
−2π ift
H( f ) = dt , (3.51d)
−∞

of the linear system. (The idea of a transfer function is discussed in greater detail below in
Appendix 5A of Chapter 5.) Therefore Eq. (3.51c) can also be written as

E ( m (t ) ) = µn ⋅ H (0) . (3.51e)

This shows that when H(0), the zero-frequency value of the transfer function, is finite, so is
E ( m (t ) ) . We conclude that the output m (t ) of the linear system is wide-sense stationary when
the input ñ(t) is wide-sense stationary and the H(0) value of the transfer function is finite.
Because the H(f) transfer function is the Fourier transform of h(t), which is a strictly real
function, we can take the complex conjugate of both sides of Eq. (3.51d) to get

\[
H(f)^{*} = \int_{-\infty}^{\infty} h(t)\,e^{2\pi i f t}\,dt = -\int_{\infty}^{-\infty} h(-t')\,e^{-2\pi i f t'}\,dt' . \tag{3.52a}
\]

In the last step of (3.52a), we change the variable of integration to $t' = -t$. Equation (3.52a) can
also be written as, dropping the prime,

\[
H(f)^{*} = \int_{-\infty}^{\infty} h(-t)\,e^{-2\pi i f t}\,dt . \tag{3.52b}
\]

Clearly, $H(f)^{*}$, the complex conjugate of the transfer function H(f), is the Fourier transform of
$h(-t)$. Since H is the Fourier transform of a real function h, it must, according to entry 7 of Table
2.1 in Chapter 2, be Hermitian,

\[
H(-f) = H(f)^{*} . \tag{3.52c}
\]
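Equations (3.51e) and (3.52c) are easy to check numerically. The following is a minimal sketch, assuming a short made-up real impulse response `h` sampled at integer times (the values and the helper name `transfer` are illustrative, not taken from the text); the transfer function is evaluated by direct summation as a discrete stand-in for Eq. (3.51d).

```python
import numpy as np

# Hypothetical real impulse response sampled at t = 0, 1, 2, 3 (illustrative values only).
h = np.array([0.5, 0.3, -0.1, 0.2])
t = np.arange(h.size)

def transfer(f):
    """Discrete analogue of Eq. (3.51d): H(f) = sum_t h(t) exp(-2*pi*i*f*t)."""
    f = np.atleast_1d(f)
    return (h * np.exp(-2j * np.pi * np.outer(f, t))).sum(axis=1)

freqs = np.linspace(-0.5, 0.5, 11)
H_pos = transfer(freqs)
H_neg = transfer(-freqs)

# Eq. (3.52c): a real h gives a Hermitian transfer function, H(-f) = H(f)*.
hermitian_ok = np.allclose(H_neg, np.conj(H_pos))

# Eq. (3.51e): an input with constant mean mu_n gives an output mean mu_n * H(0).
mu_n = 2.0
H0 = transfer(0.0)[0]
mean_out = mu_n * H0.real  # H(0) = sum(h) is real because h is real
```

Because H(0) is just the sum of the samples of `h`, the output mean is finite whenever that sum is finite, which is the discrete version of the condition stated above.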

We now define $S_{\tilde m\tilde n}(f)$ to be the Fourier transform of $R_{\tilde m\tilde n}(\tau)$, giving

\[
S_{\tilde m\tilde n}(f) = \int_{-\infty}^{\infty} R_{\tilde m\tilde n}(\tau)\,e^{-2\pi i f\tau}\,d\tau . \tag{3.53a}
\]

Function $S_{\tilde m\tilde n}(f)$ is the cross-power spectrum of $\tilde m$ and ñ [see Eq. (3.48e)]. The transform can,
of course, be reversed to get

\[
R_{\tilde m\tilde n}(\tau) = \int_{-\infty}^{\infty} S_{\tilde m\tilde n}(f)\,e^{2\pi i f\tau}\,df . \tag{3.53b}
\]

Applying the Fourier convolution theorem to Eq. (3.50d) above gives, according to Eq. (2.39a) in
Chapter 2,

[Fourier transform of $R_{\tilde m\tilde n}$] = [Fourier transform of $h(-t)$] ⋅ [Fourier transform of $R_{\tilde n\tilde n}$].

This can be written as, using Eqs. (3.53a), (3.52b), and (3.48c),

\[
S_{\tilde m\tilde n}(f) = H(f)^{*}\cdot S_{\tilde n\tilde n}(f) . \tag{3.53c}
\]

Applying Eq. (2.39a) again, this time to Eq. (3.51b), gives

\[
S_{\tilde m\tilde m}(f) = H(f)\,S_{\tilde m\tilde n}(f) , \tag{3.53d}
\]

where

\[
S_{\tilde m\tilde m}(f) = \int_{-\infty}^{\infty} R_{\tilde m\tilde m}(\tau)\,e^{-2\pi i f\tau}\,d\tau \tag{3.53e}
\]

is the Fourier transform of $R_{\tilde m\tilde m}$. Following the nomenclature introduced in Eq. (3.48c), this must
be the power spectrum of $\tilde m(t)$; and the Fourier transforms of h and $R_{\tilde m\tilde n}$ come from (3.51d) and
(3.53a) respectively. The Fourier transform in (3.53e) can, of course, be reversed to get

\[
R_{\tilde m\tilde m}(\tau) = \int_{-\infty}^{\infty} S_{\tilde m\tilde m}(f)\,e^{2\pi i f\tau}\,df . \tag{3.53f}
\]

Substitution of (3.53c) into (3.53d) gives the result we have been working toward:

\[
S_{\tilde m\tilde m}(f) = |H(f)|^{2}\,S_{\tilde n\tilde n}(f) . \tag{3.53g}
\]

This result shows that the power spectrum of the random input function ñ(t) gives, when
multiplied by the squared modulus of the transfer function, the power spectrum of the random
output function $\tilde m(t)$ of the linear system.
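The chain from Eq. (3.50d) through Eq. (3.53g) can be replayed on a discrete lag grid, where the convolutions become finite sums and the result holds exactly. The sketch below is illustrative only: the triangular input autocorrelation, the three-tap impulse response, and the helper `spectrum` are assumptions chosen for the example. Note that the cross-spectrum involves the complex conjugate of H, because $h(-\tau)$ transforms to $H(f)^{*}$ [Eq. (3.52b)].

```python
import numpy as np

# Hypothetical discrete-lag example (all values are illustrative assumptions).
K = 4
lags_nn = np.arange(-K, K + 1)
R_nn = 1.0 - np.abs(lags_nn) / (K + 1)       # an even, triangular input autocorrelation
h = np.array([1.0, -0.5, 0.25])              # real impulse response at t = 0, 1, 2
J = h.size

# Eq. (3.50d): R_mn = h(-tau) * R_nn. Reversing h represents h(-tau) on lags -(J-1)..0.
R_mn = np.convolve(h[::-1], R_nn)
lags_mn = np.arange(-(J - 1) - K, K + 1)

# Eq. (3.51b): R_mm = h(tau) * R_mn.
R_mm = np.convolve(h, R_mn)
lags_mm = np.arange(-(J - 1) - K, K + J)

def spectrum(values, lags, f):
    """Discrete Fourier sum: sum over lags of values * exp(-2*pi*i*f*lag)."""
    return (values * np.exp(-2j * np.pi * np.outer(f, lags))).sum(axis=1)

f = np.linspace(-0.5, 0.5, 201)
H = spectrum(h, np.arange(J), f)
S_nn = spectrum(R_nn, lags_nn, f)
S_mn = spectrum(R_mn, lags_mn, f)            # equals conj(H) * S_nn
S_mm = spectrum(R_mm, lags_mm, f)            # equals |H|^2 * S_nn, Eq. (3.53g)
```

The output autocorrelation `R_mm` comes out even in the lag, as it must for a real wide-sense stationary output.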

3.22 The Sign of the Power Spectrum


Equation (3.53g) can be used to show that the power spectrum $S_{\tilde n\tilde n}$ of any wide-sense stationary
random function cannot be negative. To show how this is done, we set up a linear system that has
the transfer function

\[
H_B(f) =
\begin{cases}
-i & \text{for } f_1 \le f \le f_2 \\
\phantom{-}i & \text{for } -f_2 \le f \le -f_1 \\
\phantom{-}0 & \text{for } |f| < f_1 \\
\phantom{-}0 & \text{for } |f| > f_2
\end{cases} , \tag{3.54a}
\]

where $f_1$ and $f_2$ are both non-negative frequencies. Function $H_B(f)$ is $(-i)$ when f lies
between $f_1$ and $f_2$ and $i$ when f lies between $(-f_1)$ and $(-f_2)$; otherwise it is zero. The transfer
function $H_B$ satisfies

\[
H_B(-f) = H_B(f)^{*} , \tag{3.54b}
\]

which [see Eq. (3.52c)] makes it an acceptable transfer function because it is Hermitian. By
reversing the Fourier transform in (3.51d), we find that the impulse-response function for this
linear system must be the inverse Fourier transform of the transfer function,

\[
h_B(t) = \int_{-\infty}^{\infty} H_B(f)\,e^{2\pi i f t}\,df .
\]

According to entry 7 in Table 2.1 of Chapter 2, since $H_B(f)$ is Hermitian, its inverse Fourier
transform $h_B(t)$ must be real. We can take any random function ñ(t) that is wide-sense stationary
and run it through the $H_B$ linear system. Looking at the resulting output $\tilde m(t)$, we know from the
discussion following Eq. (3.51e) that $\tilde m(t)$ must also be wide-sense stationary because $H_B(0)$ is
finite. This means that $\tilde m$ has a well-defined autocorrelation function

\[
R_{\tilde m\tilde m}(t_2-t_1) = E\big(\tilde m(t_1)\,\tilde m(t_2)\big)
\]

and a well-defined power spectrum $S_{\tilde m\tilde m}(f)$. Setting $t_1 = t_2$ in the autocorrelation function gives,
since $\tilde m$ is real,

\[
R_{\tilde m\tilde m}(0) = E\big(\tilde m(t_1)^{2}\big) \ge 0 . \tag{3.54c}
\]

From Eq. (3.53f) we know

\[
R_{\tilde m\tilde m}(0) = \int_{-\infty}^{\infty} S_{\tilde m\tilde m}(f)\,df . \tag{3.54d}
\]

Combining Eqs. (3.53g) and (3.54a) gives

\[
S_{\tilde m\tilde m}(f) = |H_B(f)|^{2}\,S_{\tilde n\tilde n}(f) .
\]

This can be substituted into (3.54d) to get, noting the definition of $H_B$ in (3.54a), that

\[
R_{\tilde m\tilde m}(0) = \int_{-f_2}^{-f_1} S_{\tilde n\tilde n}(f)\,df + \int_{f_1}^{f_2} S_{\tilde n\tilde n}(f)\,df .
\]

Equation (3.49b) reminds us that $S_{\tilde n\tilde n}$ is an even function of f, which means that this formula for
$R_{\tilde m\tilde m}(0)$ can be written as

\[
R_{\tilde m\tilde m}(0) = 2\int_{f_1}^{f_2} S_{\tilde n\tilde n}(f)\,df . \tag{3.54e}
\]

Substitution of (3.54e) into inequality (3.54c) gives

\[
\int_{f_1}^{f_2} S_{\tilde n\tilde n}(f)\,df \ge 0 . \tag{3.54f}
\]

No assumptions have been made about the values of $f_1$ and $f_2$ other than

\[
0 \le f_1 \le f_2 .
\]

Therefore, because inequality (3.54f) must hold true for all allowed values of $f_1$ and $f_2$ no
matter where they are on the positive f axis or how close together they are, we conclude that
$S_{\tilde n\tilde n}(f) \ge 0$ for all $f \ge 0$. Because

\[
S_{\tilde n\tilde n}(-f) = S_{\tilde n\tilde n}(f)
\]

in Eq. (3.49b), it then follows that

\[
S_{\tilde n\tilde n}(f) \ge 0 \tag{3.54g}
\]

for all positive and negative values of f.

We have already demonstrated that $S_{\tilde n\tilde n}$ is real and even, and now we know that it must also
be a non-negative function of frequency f. These are all attributes that a double-sided power
spectrum ought to have. The final step in justifying the label “power spectrum” for $S_{\tilde n\tilde n}$ is to show
that it satisfies a power-spectrum type of formula with regard to the random function ñ(t).
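Inequality (3.54g) also gives a practical screening test: a real, even function of τ cannot be the autocorrelation of a wide-sense stationary random function if its Fourier transform dips below zero anywhere. The sketch below applies this test to two hypothetical candidates, a triangle and a rectangle, using a Riemann-sum approximation of the Fourier transform; the grids and the helper name `fourier` are assumptions made for the illustration.

```python
import numpy as np

dtau = 0.01
tau = np.arange(-200, 201) * dtau            # symmetric lag grid on [-2, 2]
f = np.linspace(-3.0, 3.0, 301)

def fourier(R):
    """Riemann-sum approximation of the Fourier transform of R(tau)."""
    kernel = np.exp(-2j * np.pi * np.outer(f, tau))
    return (kernel * R).sum(axis=1).real * dtau   # R is even, so the transform is real

triangle = np.clip(1.0 - np.abs(tau), 0.0, None)  # candidate 1: triangular R(tau)
box = (np.abs(tau) <= 1.0).astype(float)          # candidate 2: rectangular R(tau)

S_triangle = fourier(triangle)   # close to [sin(pi f)/(pi f)]^2, never negative
S_box = fourier(box)             # close to sin(2 pi f)/(pi f), which changes sign
```

The triangle passes the test, while the rectangle fails it because its transform is negative over whole bands of frequency.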

3.23 The Power Spectrum and Fourier Transforms of Random Functions

The power spectrum $P_{zz}(f)$ of a nonrandom function z(t) can be written as⁵²

\[
P_{zz}(f) = \lim_{T\to\infty} \frac{|Z_T(f)|^{2}}{2T} . \tag{3.55a}
\]

Here, $Z_T(f)$ is the Fourier transform between times $t = -T$ and $t = T$ of a real signal z(t):

\[
Z_T(f) = \int_{-T}^{T} z(t)\,e^{-2\pi i f t}\,dt . \tag{3.55b}
\]

52
B. P. Lathi, An Introduction to Random Signals and Communication Theory (International Textbook Company,
Scranton, PA, 1968), p. 59.

We now justify the label “power spectrum” for the function $S_{\tilde n\tilde n}(f)$ defined in Eq. (3.48c) by
deriving a formula for $S_{\tilde n\tilde n}$ in terms of the random function ñ(t) that closely resembles formula
(3.55a) for the power spectrum $P_{zz}(f)$ of the nonrandom function z(t).
We define $\tilde N_T(f)$ to be the Fourier transform of the random function ñ(t) between times
$t = -T$ and $t = T$:

\[
\tilde N_T(f) = \int_{-T}^{T} \tilde n(t)\,e^{-2\pi i f t}\,dt . \tag{3.56a}
\]

In effect, $\tilde N$ is a random function of the two nonrandom variables f and T, and it could be written
as $\tilde N(f,T)$ to emphasize this fact. When ñ(t) is a random function that is wide-sense stationary,
we have, since ñ is real,

\[
\begin{aligned}
E\big(|\tilde N_T(f)|^{2}\big) = E\big(\tilde N_T(f)^{*}\cdot\tilde N_T(f)\big)
&= E\!\left(\left[\int_{-T}^{T}\tilde n(t_1)\,e^{2\pi i f t_1}\,dt_1\right]\left[\int_{-T}^{T}\tilde n(t_2)\,e^{-2\pi i f t_2}\,dt_2\right]\right) \\
&= E\!\left(\int_{-T}^{T}dt_1\int_{-T}^{T}dt_2\,\tilde n(t_1)\,\tilde n(t_2)\,e^{-2\pi i (t_2-t_1) f}\right) .
\end{aligned}
\tag{3.56b}
\]

Applying Eqs. (3.17c) and (3.16a), the expectation operator E is taken inside the double integral
to get

\[
\begin{aligned}
E\big(|\tilde N_T(f)|^{2}\big) &= \int_{-T}^{T}dt_1\int_{-T}^{T}dt_2\,E\big(\tilde n(t_1)\,\tilde n(t_2)\big)\,e^{-2\pi i (t_2-t_1) f} \\
&= \int_{-T}^{T}dt_1\int_{-T}^{T}dt_2\,R_{\tilde n\tilde n}(t_2-t_1)\,e^{-2\pi i (t_2-t_1) f} .
\end{aligned}
\tag{3.56c}
\]

In the last step, Eq. (3.30b) is used to replace

\[
E\big(\tilde n(t_1)\,\tilde n(t_2)\big)
\]

for the wide-sense stationary ñ by the autocorrelation function $R_{\tilde n\tilde n}(t_2-t_1)$.
The rightmost expression in Eq. (3.56c) is a double integral of a function

\[
\psi(t_2-t_1) = R_{\tilde n\tilde n}(t_2-t_1)\,e^{-2\pi i (t_2-t_1) f}
\]

over the square region of the $t_1, t_2$ plane specified by

\[
-T \le t_1 \le T \quad\text{and}\quad -T \le t_2 \le T .
\]

Figure 3.2 shows that the value of $\psi$ must be constant along any line given by

\[
t_2 - t_1 = \tau = \text{constant}
\]

in the $t_1, t_2$ plane. To lowest order in $d\tau$ in Fig. 3.2, the shaded area is, when $t_2 \ge t_1$ so that
$\tau \ge 0$,

\[
\frac{d\tau}{\sqrt{2}}\cdot(2T-\tau)\sqrt{2} = (2T-\tau)\,d\tau .
\]

When $t_2 < t_1$, as shown in Fig. 3.3, the value of $\tau$ is negative, so the formula for the shaded area
in Fig. 3.3 is

\[
\frac{d\tau}{\sqrt{2}}\cdot(2T-|\tau|)\sqrt{2} = (2T-|\tau|)\,d\tau .
\]

Consequently, the rightmost double integral in Eq. (3.56c) can be written as

\[
\begin{aligned}
\int_{-T}^{T}dt_1\int_{-T}^{T}dt_2\,R_{\tilde n\tilde n}(t_2-t_1)\,e^{-2\pi i (t_2-t_1) f}
&= \int_{0}^{2T} R_{\tilde n\tilde n}(\tau)\,e^{-2\pi i f\tau}\,(2T-\tau)\,d\tau
 + \int_{-2T}^{0} R_{\tilde n\tilde n}(\tau)\,e^{-2\pi i f\tau}\,(2T-|\tau|)\,d\tau \\
&= \int_{-2T}^{2T} R_{\tilde n\tilde n}(\tau)\,e^{-2\pi i f\tau}\,(2T-|\tau|)\,d\tau .
\end{aligned}
\]

Taking the factor of 2T outside the integral and substituting the result back into Eq. (3.56c) gives

\[
E\big(|\tilde N_T(f)|^{2}\big) = 2T\int_{-2T}^{2T}\left(1-\frac{|\tau|}{2T}\right) R_{\tilde n\tilde n}(\tau)\,e^{-2\pi i f\tau}\,d\tau . \tag{3.57a}
\]
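The step from the double integral in Eq. (3.56c) to the single, triangle-weighted integral in Eq. (3.57a) has an exact discrete counterpart: on an M-by-M grid of sample pairs, the number of pairs sharing a given lag is M minus the magnitude of the lag. The sketch below checks this identity; the autocorrelation $0.8^{|k|}$, the grid size, and the test frequency are assumptions chosen purely for illustration.

```python
import numpy as np

M = 16                                    # number of samples, standing in for the interval 2T
f = 0.13                                  # an arbitrary test frequency (cycles per sample)

def R(k):
    """Hypothetical even autocorrelation R(k) = 0.8**|k| (illustrative assumption)."""
    return 0.8 ** np.abs(k)

# Left side: double sum over the square 0 <= j, k < M, the discrete form of Eq. (3.56c).
j, k = np.meshgrid(np.arange(M), np.arange(M), indexing="ij")
double_sum = np.sum(R(k - j) * np.exp(-2j * np.pi * f * (k - j)))

# Right side: single sum over the lag, weighted by the number of (j, k) pairs on each
# diagonal k - j = lag, which is M - |lag| (the discrete counterpart of 2T - |tau|).
lag = np.arange(-(M - 1), M)
single_sum = np.sum((M - np.abs(lag)) * R(lag) * np.exp(-2j * np.pi * f * lag))
```

Both sums are real and positive here, as expected for the mean squared magnitude of a truncated transform.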

This can be written as

\[
\frac{1}{2T}\,E\big(|\tilde N_T(f)|^{2}\big) = \int_{-\infty}^{\infty} \Lambda(\tau,2T)\,R_{\tilde n\tilde n}(\tau)\,e^{-2\pi i f\tau}\,d\tau , \tag{3.57b}
\]

where

\[
\Lambda(t_a,t_b) =
\begin{cases}
1-\dfrac{|t_a|}{t_b} & \text{for } |t_a| \le t_b \\[1ex]
0 & \text{for } |t_a| > t_b
\end{cases} . \tag{3.57c}
\]

Function $\Lambda$ is graphed in Fig. 3.4. The Fourier transform of $\Lambda(t,2T)$ is

\[
\int_{-\infty}^{\infty} \Lambda(t,2T)\,e^{-2\pi i f t}\,dt = 2T\cdot\left[\frac{\sin(2\pi f T)}{2\pi f T}\right]^{2} . \tag{3.57d}
\]

The right-hand side of Eq. (3.57b) is the Fourier transform of the product of functions $\Lambda$ and $R_{\tilde n\tilde n}$.
According to the Fourier convolution theorem [see Eq. (2.39k) in Chapter 2], this must equal the
convolution of the Fourier transforms of $\Lambda$ and $R_{\tilde n\tilde n}$. Therefore, Eq. (3.57b) can be written as,
according to (3.57d) and (3.48c),

\[
\frac{E\big(|\tilde N_T(f)|^{2}\big)}{2T} = \left\{2T\cdot\left[\frac{\sin(2\pi f T)}{2\pi f T}\right]^{2}\right\} * S_{\tilde n\tilde n}(f) . \tag{3.57e}
\]

In the limit as $T \to \infty$, it can be shown that⁵³

\[
2T\cdot\left[\frac{\sin(2\pi f T)}{2\pi f T}\right]^{2} \to \delta(f) . \tag{3.57f}
\]

53
John B. Thomas, An Introduction to Applied Probability and Random Processes (John Wiley & Sons, Inc., New
York, 1971), p. 231. Formula (3.57f) is also a slightly disguised version of Eq. (2.67b) in Chapter 2.

FIGURE 3.2. [Figure not reproduced. It shows the square $-T \le t_1 \le T$, $-T \le t_2 \le T$ in the $t_1, t_2$ plane with a shaded strip along a line $t_2 - t_1 = \tau$ for $\tau \ge 0$; surviving annotation: $T - (\tau - T) = 2T - \tau$.]

FIGURE 3.3. [Figure not reproduced. It shows the same square with a shaded strip of width $d\tau$ along a line $t_2 - t_1 = \tau$ for $\tau < 0$.]

FIGURE 3.4. [Figure not reproduced. It graphs $\Lambda(t_a, t_b)$ against $t_a$, with peak value 1.0 and the points $t_a = -t_b$ and $t_a = t_b$ marked on the axis.]

Consequently, we can take the limit of both sides of (3.57e) as $T \to \infty$ to get [using Eq. (2.55a)
in Chapter 2]

\[
\lim_{T\to\infty}\left[\frac{E\big(|\tilde N_T(f)|^{2}\big)}{2T}\right] = \delta(f) * S_{\tilde n\tilde n}(f) = \int_{-\infty}^{\infty} \delta(f-f')\,S_{\tilde n\tilde n}(f')\,df'
\]

or

\[
S_{\tilde n\tilde n}(f) = \lim_{T\to\infty}\left[\frac{E\big(|\tilde N_T(f)|^{2}\big)}{2T}\right] . \tag{3.57g}
\]

Comparing this result to the similar formula in Eq. (3.55a) for the power spectrum of a
nonrandom function z, we see that the formulas are similar enough to justify the definition of $S_{\tilde n\tilde n}$
as the power spectrum of the random function ñ.


The $S_{\tilde n\tilde n}(f)$ power spectrum specified in Eq. (3.48c) and used later in (3.57g), (3.49b),
(3.54g), and so on, is often called the double-sided power spectrum because it is defined for both
positive and negative values of its argument f. It is typically found as a weighting function in
integrals of the form

\[
\int_{-\infty}^{\infty} S_{\tilde n\tilde n}(f)\,\varphi_e(f)\,df ,
\]

where $\varphi_e(f)$, like $S_{\tilde n\tilde n}(f)$, is an even function of f. Because the $S_{\tilde n\tilde n}(f)\,\varphi_e(f)$ product must also
be even, this integral can also be written as [see Eq. (2.19) in Chapter 2]

\[
\int_{-\infty}^{\infty} S_{\tilde n\tilde n}(f)\,\varphi_e(f)\,df = 2\int_{0}^{\infty} S_{\tilde n\tilde n}(f)\,\varphi_e(f)\,df . \tag{3.58a}
\]

Many analysts define a single-sided power spectrum $S^{(1)}_{\tilde n\tilde n}$ to be

\[
S^{(1)}_{\tilde n\tilde n}(f) = 2\,S_{\tilde n\tilde n}(f) \quad\text{for } f \ge 0 \tag{3.58b}
\]

and use it to write equations like (3.58a) as

\[
\int_{-\infty}^{\infty} S_{\tilde n\tilde n}(f)\,\varphi_e(f)\,df = \int_{0}^{\infty} S^{(1)}_{\tilde n\tilde n}(f)\,\varphi_e(f)\,df . \tag{3.58c}
\]

The motivation for this procedure is often the feeling that only positive frequencies f are
meaningful, so we ought to restrict ourselves to using power spectra with positive arguments.54
Many times articles and textbooks refer to “the” power spectrum without making it clear whether
they are referring to the double-sided or single-sided power spectrum. Casual references to power
spectra should be treated with caution until it becomes clear which type of power spectrum the
author has in mind.
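The bookkeeping in Eqs. (3.58a) through (3.58c) can be made concrete with a small numerical sketch. Everything specific here is an assumption for illustration: the Gaussian double-sided spectrum, the cosine weighting function, and the finite integration range standing in for the infinite limits.

```python
import numpy as np

def S_double(f):
    """Hypothetical double-sided spectrum: an even Gaussian (illustrative assumption)."""
    return np.exp(-f ** 2)

def S_single(f):
    """Single-sided spectrum of Eq. (3.58b): twice the double-sided one, for f >= 0 only."""
    return 2.0 * S_double(f)

def phi_e(f):
    """An even weighting function."""
    return np.cos(f)

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid depending on a NumPy version."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

f_all = np.linspace(-10.0, 10.0, 4001)    # stands in for (-inf, inf)
f_pos = np.linspace(0.0, 10.0, 2001)      # stands in for [0, inf)

two_sided = trapezoid(S_double(f_all) * phi_e(f_all), f_all)   # left side of Eq. (3.58c)
one_sided = trapezoid(S_single(f_pos) * phi_e(f_pos), f_pos)   # right side of Eq. (3.58c)
```

Mixing the two conventions, for example integrating a single-sided spectrum over both signs of f, would double the result, which is why the caution above matters in practice.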

3.24 The Multidimensional Wiener-Khinchin Theorem


Equation (3.57g) derived in Sec. 3.23 is often referred to as the Wiener-Khinchin theorem. This
theorem can easily be extended to multiple dimensions.
A random function with more than one nonrandom argument is often called a random scalar
field. We can write a random scalar field ñ as $\tilde n(t_1, t_2, \ldots, t_K)$ when it is a function of K
nonrandom arguments $t_1, t_2, \ldots, t_K$. The property for a random field that is analogous to
stationarity for a one-dimensional random function is called homogeneity. A random function ñ is
called a (wide-sense) homogeneous random field $\tilde n(t_1, t_2, \ldots, t_K)$ when there is a correlation
function $R_{\tilde n\tilde n}$ such that

\[
R_{\tilde n\tilde n}(t_1'-t_1,\,t_2'-t_2,\,\ldots,\,t_K'-t_K) = E\big(\tilde n(t_1,t_2,\ldots,t_K)\,\tilde n(t_1',t_2',\ldots,t_K')\big) . \tag{3.59a}
\]

The multidimensional Fourier transform of $R_{\tilde n\tilde n}$ is the multidimensional power spectrum of the
random field

\[
S_{\tilde n\tilde n}(f_1,f_2,\ldots,f_K)
= \int_{-\infty}^{\infty}d\tau_1\int_{-\infty}^{\infty}d\tau_2\cdots\int_{-\infty}^{\infty}d\tau_K\,
R_{\tilde n\tilde n}(\tau_1,\tau_2,\ldots,\tau_K)\,e^{-2\pi i (f_1\tau_1+f_2\tau_2+\cdots+f_K\tau_K)} . \tag{3.59b}
\]

This transform can, of course, be reversed to get

54
There is, of course, no more problem in using negative ƒ values when ƒ represents a frequency than there is in
using negative x values when x represents a length along the axis of a coordinate system. Lengths can never be
negative, so when we allow x to be negative we are implicitly talking about a length coordinate rather than a length.
Similarly, when we allow ƒ to be negative we are implicitly talking about a frequency coordinate rather than a
frequency.

\[
R_{\tilde n\tilde n}(\tau_1,\tau_2,\ldots,\tau_K)
= \int_{-\infty}^{\infty}df_1\int_{-\infty}^{\infty}df_2\cdots\int_{-\infty}^{\infty}df_K\,
S_{\tilde n\tilde n}(f_1,f_2,\ldots,f_K)\,e^{2\pi i (f_1\tau_1+f_2\tau_2+\cdots+f_K\tau_K)} . \tag{3.59c}
\]

The multidimensional Wiener-Khinchin theorem states that

\[
S_{\tilde n\tilde n}(f_1,f_2,\ldots,f_K)
= \lim_{\substack{T_1\to\infty \\ T_2\to\infty \\ \vdots \\ T_K\to\infty}}
\left[\frac{1}{(2T_1)(2T_2)\cdots(2T_K)}\,E\big(|\tilde N_{T_1T_2\cdots T_K}(f_1,f_2,\ldots,f_K)|^{2}\big)\right] , \tag{3.59d}
\]

where

\[
\tilde N_{T_1T_2\cdots T_K}(f_1,f_2,\ldots,f_K)
= \int_{-T_1}^{T_1}dt_1\int_{-T_2}^{T_2}dt_2\cdots\int_{-T_K}^{T_K}dt_K\,
\tilde n(t_1,t_2,\ldots,t_K)\,e^{-2\pi i (f_1t_1+f_2t_2+\cdots+f_Kt_K)} . \tag{3.59e}
\]

The next chapter uses the three-dimensional Wiener-Khinchin theorem with one time
coordinate t and two space coordinates x and y. Using the vector notation introduced in Chapter 2
(see Sec. 2.25), we write the random field ñ as

\[
\tilde n(x,y,t) = \tilde n(\vec\rho,t) , \tag{3.60a}
\]

with

\[
\vec\rho = x\hat x + y\hat y \tag{3.60b}
\]

being the position vector defined in terms of the $\hat x$ and $\hat y$ unit vectors corresponding to the x and
y coordinates. We also define a vector $\vec u$ with $u_x$ and $u_y$ components such that

\[
\vec u = \hat x\,u_x + \hat y\,u_y . \tag{3.60c}
\]

Here, $u_x$ and $u_y$ are the spatial frequencies corresponding to the x and y coordinates respectively.
The frequency corresponding to time t is called w. The truncated time and space Fourier
transform of $\tilde n(\vec\rho,t)$ can now be written as

\[
\tilde N_{T,A}(u_x,u_y,w) = \int_{-T}^{T}dt\iint_{\text{area }A}dx\,dy\;\tilde n(x,y,t)\,e^{-2\pi i (xu_x+yu_y+wt)}
\]

or

\[
\tilde N_{T,A}(\vec u,w) = \int_{-T}^{T}dt\iint_{\text{area }A}d^{2}\rho\;\tilde n(\vec\rho,t)\,e^{-2\pi i (\vec u\cdot\vec\rho+wt)} . \tag{3.60d}
\]

Random field $\tilde n(\vec\rho,t)$ has an autocorrelation function

\[
R_{\tilde n\tilde n}(x'-x,\,y'-y,\,t'-t) = E\big(\tilde n(x,y,t)\,\tilde n(x',y',t')\big) , \tag{3.61a}
\]

which can be written as

\[
R_{\tilde n\tilde n}(\vec\rho\,'-\vec\rho,\,t'-t) = E\big(\tilde n(\vec\rho,t)\,\tilde n(\vec\rho\,',t')\big) . \tag{3.61b}
\]

Because $R_{\tilde n\tilde n}$ depends only on the difference between the unprimed and primed coordinates, we
say that field ñ is (wide-sense) stationary and homogeneous. The corresponding power spectrum
is

\[
S_{\tilde n\tilde n}(\vec u,w) = \int_{-\infty}^{\infty}dt\iint_{-\infty}^{\infty}d^{2}\rho\;R_{\tilde n\tilde n}(\vec\rho,t)\,e^{-2\pi i (\vec u\cdot\vec\rho+wt)} . \tag{3.61c}
\]

The transform can be reversed to get

\[
R_{\tilde n\tilde n}(\vec\rho,t) = \int_{-\infty}^{\infty}dw\iint_{-\infty}^{\infty}d^{2}u\;S_{\tilde n\tilde n}(\vec u,w)\,e^{2\pi i (\vec u\cdot\vec\rho+wt)} . \tag{3.61d}
\]

Glancing back at the notation for the truncated Fourier transform of ñ in Eq. (3.60d), we see that
the three-dimensional Wiener-Khinchin theorem for this case can be stated as

\[
S_{\tilde n\tilde n}(\vec u,w) = \lim_{\substack{T\to\infty \\ A\to\infty}}\left[\frac{1}{2TA}\,E\big(|\tilde N_{T,A}(\vec u,w)|^{2}\big)\right] . \tag{3.61e}
\]
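In numerical work the bracketed quantity in Eq. (3.61e) is usually approximated from sampled data. The following is a rough sketch under several stated assumptions: it uses one time coordinate and one space coordinate instead of two, a single realization in place of the expectation E, finite record lengths in place of the limits, and an FFT Riemann sum in place of the integral. The function name `periodogram_2d` and the test field are hypothetical.

```python
import numpy as np

def periodogram_2d(field, dt, dx):
    """Single-realization estimate |N_{T,A}|^2 / (duration * length) on the FFT grid.

    field[i, j] samples n(t_i, x_j) on a uniform grid with spacings dt and dx.
    """
    n_t, n_x = field.shape
    # Riemann-sum approximation to the truncated transform (cf. Eq. (3.60d)).
    # The FFT places the time origin at the first sample rather than mid-record,
    # which changes only the phase of N, not |N|^2.
    N = np.fft.fft2(field) * dt * dx
    extent = (n_t * dt) * (n_x * dx)          # plays the role of 2T * A
    w = np.fft.fftfreq(n_t, d=dt)             # temporal frequencies
    u = np.fft.fftfreq(n_x, d=dx)             # spatial frequencies
    return np.abs(N) ** 2 / extent, w, u

# Deterministic test field (an assumption for illustration; any real array works).
t = np.arange(32) * 0.5
x = np.arange(16) * 0.25
field = np.cos(2 * np.pi * 0.25 * t)[:, None] * np.sin(2 * np.pi * 0.5 * x)[None, :]

P, w, u = periodogram_2d(field, dt=0.5, dx=0.25)
dw, du = 1.0 / (32 * 0.5), 1.0 / (16 * 0.25)
total_power = P.sum() * dw * du               # integral of the estimate over (w, u)
```

With this normalization the estimate integrates to the mean square of the sampled field, the discrete analogue of evaluating the inverse transform (3.61d) at zero lag. Averaging many such estimates over independent realizations would stand in for the expectation.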

3.25 Band-Limited White Noise


A random function ñ(t) is band-limited white noise when it is wide-sense stationary and has a
power spectrum

\[
S_{\tilde n\tilde n}(f) = W_{\tilde n\tilde n}(f) =
\begin{cases}
W_0 & \text{for } |f| \le F \\
0 & \text{for } |f| > F
\end{cases} \tag{3.62a}
\]

with

\[
E\big(\tilde n(t)\big) = 0 . \tag{3.62b}
\]

FIGURE 3.5. [Figure not reproduced. It plots $W_{\tilde n\tilde n}(f)$ against f, with the level $W_0$ and the frequencies $-F$ and $F$ marked.]
The bandwidth of this white noise is said to be F (see Fig. 3.5). Equation (3.48d) shows that the
autocorrelation function of this band-limited white noise must be

\[
R_{\tilde n\tilde n}(\tau) = W_0\int_{-F}^{F} e^{2\pi i f\tau}\,df = W_0\,\frac{\sin(2\pi F\tau)}{\pi\tau} . \tag{3.62c}
\]

Glancing back at Eq. (3.48a), we see that

\[
E\big(\tilde n(t)\cdot\tilde n(t)\big) = E\big(\tilde n(t)^{2}\big) = R_{\tilde n\tilde n}(0) ,
\]

so that, setting $\tau = 0$ in Eq. (3.62c),

\[
E\big(\tilde n(t)^{2}\big) = W_0\int_{-F}^{F} df = 2FW_0 . \tag{3.62d}
\]

According to (3.62b) ñ is a zero-mean random function, so Eq. (3.62d) shows that product 2FW0
must be the variance of ñ(t) when ñ is band-limited white noise.
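Equations (3.62c) and (3.62d) translate directly into a few lines of code. The sketch below is illustrative: the function name `autocorr_band_limited` and the values of $W_0$, F, and the test lag are assumptions. One practical point is that NumPy's `np.sinc` is the normalized sinc, sin(πx)/(πx), so the argument has to be scaled accordingly.

```python
import numpy as np

def autocorr_band_limited(tau, W0, F):
    """Eq. (3.62c): W0 * sin(2*pi*F*tau) / (pi*tau), with the tau = 0 limit handled.

    np.sinc(x) is the normalized sinc sin(pi*x)/(pi*x), so
    2*F*W0*np.sinc(2*F*tau) equals W0*sin(2*pi*F*tau)/(pi*tau).
    """
    return 2.0 * F * W0 * np.sinc(2.0 * F * np.asarray(tau, dtype=float))

W0, F = 0.5, 4.0                          # illustrative values, not from the text
variance = autocorr_band_limited(0.0, W0, F)     # Eq. (3.62d): 2 * F * W0

# Cross-check Eq. (3.62c) by integrating W0 * exp(2*pi*i*f*tau) over -F <= f <= F.
f = np.linspace(-F, F, 20001)
tau0 = 0.3
integrand = W0 * np.exp(2j * np.pi * f * tau0)
numeric = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(f)).real
```

The autocorrelation vanishes at lags that are nonzero multiples of 1/(2F), so samples of band-limited white noise taken that far apart are uncorrelated.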
Sometimes we take the limit as $F \to \infty$ in Eqs. (3.62a)–(3.62d) to get white noise that has no
band limits. Now the power spectrum of ñ(t) is

\[
W_{\tilde n\tilde n}(f) = W_0 \tag{3.63a}
\]

for all values of f. According to formula (3.62c) and Eq. (2.71f) in Chapter 2, this makes the
autocorrelation function $R_{\tilde n\tilde n}$ proportional to a delta function,

\[
R_{\tilde n\tilde n}(\tau) = W_0\int_{-\infty}^{\infty} e^{2\pi i f\tau}\,df = W_0\,\delta(\tau) , \tag{3.63b}
\]

with of course

\[
\lim_{F\to\infty}\big[E\big(\tilde n(t)^{2}\big)\big] = \infty \tag{3.63c}
\]

and

\[
E\big(\tilde n(t)\big) = 0 . \tag{3.63d}
\]

Just like the concepts of stationarity and ergodicity, the concept of white noise (even of band-
limited white noise) is an idealization that is often useful for approximating random processes
seen in nature. When a poor-quality recording is played on an audio system, the noise
contaminating it is often white in nature, showing up as unwanted hissing, crackling, and an
overall “shussing” sound. This white noise is band limited, with the band specified by the finite
range of frequencies produced by the audio system and heard by the audience. Setting a TV set to
a channel or station that does not exist, or that cannot be picked up, often produces hissing in the
speakers and a rapidly changing speckle (sometimes called snow) on the screen; both the snow
and the hissing come from quasi white-noise processes that the TV is treating like a nonrandom
signal.

3.26 Even and Odd Components of Random Functions


A useful approach often applied to random functions Ñ(t) that are wide-sense stationary is to
divide them up into even and odd components, as shown in Eqs. (2.11a)–(2.11e) in Chapter 2.
Instead of using e and o subscripts as is done in Chapter 2, this time the even component has a +
superscript and the odd component has a − superscript:

\[
\tilde N(t) = \tilde N^{(+)}(t) + \tilde N^{(-)}(t) , \tag{3.64a}
\]

where

\[
\tilde N^{(+)}(t) = \tfrac{1}{2}\big[\tilde N(t) + \tilde N(-t)\big] \tag{3.64b}
\]

and

\[
\tilde N^{(-)}(t) = \tfrac{1}{2}\big[\tilde N(t) - \tilde N(-t)\big] . \tag{3.64c}
\]

We now apply to Ñ(t) the time-limited Fourier transform shown in Eq. (3.56a),

\[
\tilde N_T(f) = \int_{-T}^{T} \tilde N(t)\,e^{-2\pi i f t}\,dt = \int_{-\infty}^{\infty} \Pi(t,T)\,\tilde N(t)\,e^{-2\pi i f t}\,dt . \tag{3.65a}
\]

Here, the $\Pi(t,T)$ function [defined in Eq. (2.56c) of Chapter 2] is used to convert the integral
between +T and −T into a true Fourier transform. Substituting (3.64a) into (3.65a) gives

\[
\tilde N_T(f) = \int_{-\infty}^{\infty} \Pi(t,T)\,\tilde N^{(+)}(t)\,e^{-2\pi i f t}\,dt + \int_{-\infty}^{\infty} \Pi(t,T)\,\tilde N^{(-)}(t)\,e^{-2\pi i f t}\,dt ,
\]

which can be written as

\[
\tilde N_T(f) = \tilde N_T^{(+)}(f) + \tilde N_T^{(-)}(f) , \tag{3.65b}
\]

where

\[
\tilde N_T^{(+)}(f) = \int_{-\infty}^{\infty} \Pi(t,T)\,\tilde N^{(+)}(t)\,e^{-2\pi i f t}\,dt \tag{3.65c}
\]

and

\[
\tilde N_T^{(-)}(f) = \int_{-\infty}^{\infty} \Pi(t,T)\,\tilde N^{(-)}(t)\,e^{-2\pi i f t}\,dt . \tag{3.65d}
\]

According to entries 1 and 4 of Table 2.1 in Chapter 2, random function $\tilde N_T^{(+)}$ must be a real and
even function of f because it is the forward Fourier transform of a real and even function of t;
and random function $\tilde N_T^{(-)}$ must be an imaginary and odd function of f because it is the forward
Fourier transform of a real and odd function of t. This means that every function in the ensemble
of functions associated with random function $\tilde N_T^{(+)}$ is real and even, and every function in the
ensemble of functions associated with random function $\tilde N_T^{(-)}$ is imaginary and odd. It also reveals
that in Eq. (3.65b) function $\tilde N_T^{(+)}$ is the real part of $\tilde N_T(f)$ and $\tilde N_T^{(-)}/i$ is the imaginary part of
$\tilde N_T(f)$. This can be written mathematically as

\[
\tilde N_T^{(+)}(f) = \mathrm{Re}\big(\tilde N_T(f)\big) \tag{3.65e}
\]

and

\[
\tilde N_T^{(-)}(f) = i\,\mathrm{Im}\big(\tilde N_T(f)\big) . \tag{3.65f}
\]
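Equations (3.64b), (3.64c), (3.65e), and (3.65f) can be verified on a single sampled realization. The sketch below is illustrative: it assumes a time grid symmetric about t = 0 (so that reversing the array maps t to −t), a seeded pseudo-random realization, and a direct Fourier sum as a discrete stand-in for Eq. (3.65a). The helper name `truncated_transform` is hypothetical.

```python
import numpy as np

M = 8
t = np.arange(-M, M + 1)                      # symmetric grid, so reversing maps t -> -t
rng = np.random.default_rng(0)
N = rng.standard_normal(t.size)               # one real realization (illustrative)

N_even = 0.5 * (N + N[::-1])                  # Eq. (3.64b)
N_odd = 0.5 * (N - N[::-1])                   # Eq. (3.64c)

def truncated_transform(x, f):
    """Discrete stand-in for Eq. (3.65a): sum over the grid of x(t) exp(-2*pi*i*f*t)."""
    return (x * np.exp(-2j * np.pi * np.outer(f, t))).sum(axis=1)

f = np.linspace(-0.5, 0.5, 21)
NT = truncated_transform(N, f)
NT_even = truncated_transform(N_even, f)      # real part of NT, Eq. (3.65e)
NT_odd = truncated_transform(N_odd, f)        # i times the imaginary part of NT, Eq. (3.65f)
```

The same relations hold for every realization, which is why they carry over to the random functions themselves.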

There is a simple connection between the expectation values of the squared magnitudes of
$\tilde N_T^{(\pm)}$ and $\tilde N_T$, that is, between

\[
E\big(|\tilde N_T^{(\pm)}(f)|^{2}\big) \quad\text{and}\quad E\big(|\tilde N_T(f)|^{2}\big) ,
\]

which is worth taking the time to analyze in detail.
We start by applying formulas (3.65c) and (3.65d) to $E(|\tilde N_T^{(\pm)}(f)|^{2})$ to get

\[
\begin{aligned}
E\big(|\tilde N_T^{(\pm)}(f)|^{2}\big) &= E\Big(\big(\tilde N_T^{(\pm)}(f)\big)\big(\tilde N_T^{(\pm)}(f)\big)^{*}\Big) \\
&= E\!\left(\left(\int_{-\infty}^{\infty}\Pi(t,T)\,\tilde N^{(\pm)}(t)\,e^{-2\pi i f t}\,dt\right)\left(\int_{-\infty}^{\infty}\Pi(t',T)\,\tilde N^{(\pm)}(t')\,e^{-2\pi i f t'}\,dt'\right)^{*}\right) .
\end{aligned}
\]

Everything inside the integral over $dt'$ is real except for $e^{-2\pi i f t'}$, so we can write this as

\[
E\big(|\tilde N_T^{(\pm)}(f)|^{2}\big) = E\!\left(\int_{-\infty}^{\infty}\Pi(t,T)\,\tilde N^{(\pm)}(t)\,e^{-2\pi i f t}\,dt\int_{-\infty}^{\infty}\Pi(t',T)\,\tilde N^{(\pm)}(t')\,e^{2\pi i f t'}\,dt'\right) .
\]

Substituting from (3.64b) and (3.64c) gives

\[
E\big(|\tilde N_T^{(\pm)}(f)|^{2}\big) = E\!\left(\frac{1}{4}\int_{-\infty}^{\infty}\Pi(t,T)\big[\tilde N(t)\pm\tilde N(-t)\big]e^{-2\pi i f t}\,dt\int_{-\infty}^{\infty}\Pi(t',T)\big[\tilde N(t')\pm\tilde N(-t')\big]e^{2\pi i f t'}\,dt'\right) ,
\]

which becomes, applying the linearity of operator E discussed in Sec. 3.10 above,

\[
E\big(|\tilde N_T^{(\pm)}(f)|^{2}\big) = \frac{1}{4}\int_{-\infty}^{\infty}dt\,\Pi(t,T)\,e^{-2\pi i f t}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)\,e^{2\pi i f t'}\,E\Big(\big[\tilde N(t)\pm\tilde N(-t)\big]\big[\tilde N(t')\pm\tilde N(-t')\big]\Big) . \tag{3.66a}
\]

The linearity of E can also be used to write

\[
\begin{aligned}
E\Big(\big[\tilde N(t)\pm\tilde N(-t)\big]\big[\tilde N(t')\pm\tilde N(-t')\big]\Big)
&= E\big(\tilde N(t)\tilde N(t') \pm \tilde N(t)\tilde N(-t') \pm \tilde N(-t)\tilde N(t') + \tilde N(-t)\tilde N(-t')\big) \\
&= E\big(\tilde N(t)\tilde N(t')\big) \pm E\big(\tilde N(t)\tilde N(-t')\big) \pm E\big(\tilde N(-t)\tilde N(t')\big) + E\big(\tilde N(-t)\tilde N(-t')\big) .
\end{aligned}
\]

Equation (3.30b), which specifies the autocorrelation function of wide-sense stationary random
functions like Ñ(t), can now be applied to get

\[
E\Big(\big[\tilde N(t)\pm\tilde N(-t)\big]\big[\tilde N(t')\pm\tilde N(-t')\big]\Big)
= R_{\tilde N\tilde N}(t'-t) \pm R_{\tilde N\tilde N}(-t'-t) \pm R_{\tilde N\tilde N}(t'+t) + R_{\tilde N\tilde N}(-t'+t) .
\]

According to Eq. (3.48b) the autocorrelation function $R_{\tilde N\tilde N}$ is even, so the right-hand side can be
simplified to

\[
E\Big(\big[\tilde N(t)\pm\tilde N(-t)\big]\big[\tilde N(t')\pm\tilde N(-t')\big]\Big) = 2R_{\tilde N\tilde N}(t-t') \pm 2R_{\tilde N\tilde N}(t+t') .
\]

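Putting this result back into Eq. (3.66a) gives

\[
\begin{aligned}
E\big(|\tilde N_T^{(\pm)}(f)|^{2}\big) ={}& \frac{1}{2}\int_{-\infty}^{\infty}dt\,\Pi(t,T)\,e^{-2\pi i f t}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)\,R_{\tilde N\tilde N}(t-t')\,e^{2\pi i f t'} \\
&\pm \frac{1}{2}\int_{-\infty}^{\infty}dt\,\Pi(t,T)\,e^{-2\pi i f t}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)\,R_{\tilde N\tilde N}(t+t')\,e^{2\pi i f t'} .
\end{aligned}
\tag{3.66b}
\]

Equation (3.48d) states that there exists a power spectrum $S_{\tilde N\tilde N}(f)$ such that

\[
R_{\tilde N\tilde N}(t\pm t') = \int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f)\,e^{2\pi i f (t\pm t')}\,df .
\]

Substituting this expression into the first term on the right-hand side of the formula for
$E(|\tilde N_T^{(\pm)}(f)|^{2})$ and moving the integral over $S_{\tilde N\tilde N}$ to the front, we get

\[
\begin{aligned}
E\big(|\tilde N_T^{(\pm)}(f)|^{2}\big) ={}& \frac{1}{2}\int_{-\infty}^{\infty}df'\,S_{\tilde N\tilde N}(f')\int_{-\infty}^{\infty}dt\,\Pi(t,T)\,e^{-2\pi i t (f-f')}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)\,e^{2\pi i t' (f-f')} \\
&\pm \frac{1}{2}\int_{-\infty}^{\infty}dt\,\Pi(t,T)\,e^{-2\pi i f t}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)\,R_{\tilde N\tilde N}(t+t')\,e^{2\pi i f t'} .
\end{aligned}
\tag{3.66c}
\]

Interchanging the roles of f, t and then replacing F by T in Eq. (2.108b) of Chapter 2 gives

\[
\int_{-\infty}^{\infty}\Pi(t,T)\,e^{-2\pi i t (f-f')}\,dt = \int_{-\infty}^{\infty}\Pi(t,T)\,e^{2\pi i t (f-f')}\,dt = 2T\,\mathrm{sinc}\big(2\pi(f-f')T\big) , \tag{3.66d}
\]

with Eq. (2.106d) showing that the definition of the sinc function is

\[
\mathrm{sinc}(x) = \frac{\sin(x)}{x} . \tag{3.66e}
\]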
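Because the sinc of Eq. (3.66e) is the unnormalized sin(x)/x, while many numerical libraries use the normalized form, a short check of Eq. (3.66d) is worthwhile. The sketch below is illustrative: the helper name `sinc_text` and the values of T and of the frequency offset `nu` (standing for $f - f'$) are assumptions.

```python
import numpy as np

def sinc_text(x):
    """The text's sinc of Eq. (3.66e), sin(x)/x, with sinc_text(0) = 1."""
    return np.sinc(np.asarray(x, dtype=float) / np.pi)   # np.sinc(y) = sin(pi*y)/(pi*y)

T, nu = 1.5, 0.4                              # illustrative values; nu plays f - f'
t = np.linspace(-T, T, 30001)
integrand = np.exp(-2j * np.pi * t * nu)
lhs = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))   # trapezoidal rule
rhs = 2.0 * T * sinc_text(2.0 * np.pi * nu * T)                     # Eq. (3.66d)
```

The integral over the window is real, so the sign of the exponent does not matter, which is the content of the first equality in Eq. (3.66d).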

Substitution of this formula into Eq. (3.66c) leads to

\[
\begin{aligned}
E\big(|\tilde N_T^{(\pm)}(f)|^{2}\big) ={}& \frac{1}{2}\int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f')\big[2T\,\mathrm{sinc}\big(2\pi(f-f')T\big)\big]^{2}\,df' \\
&\pm \frac{1}{2}\int_{-\infty}^{\infty}dt\,\Pi(t,T)\,e^{-2\pi i f t}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)\,R_{\tilde N\tilde N}(t+t')\,e^{2\pi i f t'} .
\end{aligned}
\tag{3.66f}
\]

To evaluate the integral over $df'$ in (3.66f), we assume that T is chosen large enough that

\[
\big[\mathrm{sinc}(2\pi f' T)\big]^{2} = \left[\frac{\sin(2\pi f' T)}{2\pi f' T}\right]^{2}
\]

varies rapidly as a function of $f'$ compared to $S_{\tilde N\tilde N}(f')$. Hence, if $\Delta f_S$ is the change in $f'$
required to cause a significant change in $S_{\tilde N\tilde N}(f')$, we must have

\[
\Delta f_S\cdot T \gg 1 \quad\text{or}\quad T \gg \frac{1}{\Delta f_S} . \tag{3.67a}
\]

Then we can follow the lead of (3.57f) and approximate

\[
2T\left[\frac{\sin(2\pi f' T)}{2\pi f' T}\right]^{2} = 2T\,\mathrm{sinc}^{2}(2\pi f' T) \cong \delta(f') . \tag{3.67b}
\]

Applying this approximation to the integral over $df'$ on the right-hand side of (3.66f), we replace

\[
2T\left[\frac{\sin\big(2\pi(f-f')T\big)}{2\pi(f-f')T}\right]^{2} = 2T\big[\mathrm{sinc}\big(2\pi(f-f')T\big)\big]^{2}
\]

by $\delta(f-f')$ to get

\[
\int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f')\big[2T\,\mathrm{sinc}\big(2\pi(f-f')T\big)\big]^{2}\,df' \cong 2T\int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f')\,\delta(f-f')\,df' = 2T\,S_{\tilde N\tilde N}(f) .
\]

This result can now be substituted back into Eq. (3.66f) to get

\[
E\big(|\tilde N_T^{(\pm)}(f)|^{2}\big) \cong T\,S_{\tilde N\tilde N}(f) \pm \frac{1}{2}\Lambda_T , \tag{3.67c}
\]

where we define $\Lambda_T$ to be the value of the remaining double integral,

\[
\Lambda_T = \int_{-\infty}^{\infty}dt\,\Pi(t,T)\,e^{-2\pi i f t}\int_{-\infty}^{\infty}dt'\,\Pi(t',T)\,R_{\tilde N\tilde N}(t+t')\,e^{2\pi i f t'} . \tag{3.67d}
\]

To evaluate $\Lambda_T$, we change the variable of integration in the inner integral from $t'$ to
$t'' = -(t'+t)$ to get

\[
\begin{aligned}
\Lambda_T &= \int_{-\infty}^{\infty}dt\,\Pi(t,T)\,e^{-2\pi i f t}\left[(-1)\int_{+\infty}^{-\infty}dt''\,\Pi(-t''-t,T)\,R_{\tilde N\tilde N}(-t'')\,e^{-2\pi i f (t''+t)}\right] \\
&= \int_{-\infty}^{\infty}dt\,\Pi(t,T)\,e^{-2\pi i f t}\left[\int_{-\infty}^{\infty}dt''\,\Pi\big(-(t''+t),T\big)\,R_{\tilde N\tilde N}(-t'')\,e^{-2\pi i f (t''+t)}\right] .
\end{aligned}
\]

According to Eq. (2.56c) in Chapter 2, function $\Pi(t,T)$ is an even function of t, so

\[
\Pi\big(-(t''+t),T\big) = \Pi(t''+t,T) .
\]

Similarly, according to Eq. (3.48b) above,

\[
R_{\tilde N\tilde N}(-t'') = R_{\tilde N\tilde N}(t'') .
\]

Applying these two formulas to the $\Lambda_T$ double integral gives, after interchanging the order of the
integrals over $dt$ and $dt''$,

\[
\Lambda_T = \int_{-\infty}^{\infty}dt''\,R_{\tilde N\tilde N}(t'')\,e^{-2\pi i f t''}\int_{-\infty}^{\infty}dt\,\Pi(t,T)\,\Pi(t+t'',T)\,e^{-4\pi i f t} . \tag{3.67e}
\]

To simplify the inner integral on the right-hand side of (3.67e), we note that only when both
Π (t , T ) and Π (t + t ′′, T ) are one is their product one—in other words, when either Π (t , T ) or
Π (t + t ′′, T ) is zero, then their product is zero and no contribution is made to the integral. Figure
3.6(a) shows what happens for positive values of t ′′ , and Fig. 3.6(b) shows what happens for
negative values of t ′′ .

FIGURE 3.6(a). [Figure not reproduced. It plots $\Pi(t,T)$ (solid) and $\Pi(t+t'',T)$ for $t'' > 0$ (dashed) against t, with $-T$, $T$, and the offset $t''$ marked.]

FIGURE 3.6(b). [Figure not reproduced. It plots $\Pi(t,T)$ (solid) and $\Pi(t+t'',T)$ for $t'' < 0$ (dashed) against t, with $-T$, $T$, and the offset $-t''$ marked.]

In both Figs. 3.6(a) and 3.6(b), the dark solid line is a plot of $\Pi(t,T)$ and the dashed line is a plot
of $\Pi(t+t'',T)$. When $t'' > 0$, the dashed block shifts to the left; when $t'' < 0$, the dashed block
shifts to the right. Only in the region of overlap of the solid and dashed lines in Figs. 3.6(a) and
3.6(b) does the product function

\[
\Pi(t,T)\,\Pi(t+t'',T)
\]

allow a contribution to be made to the inner integral. Hence, we can write

\[
\Pi(t,T)\,\Pi(t+t'',T) =
\begin{cases}
1 & \text{when } 0 < t'' < 2T \text{ and } -T < t < T-t'' \\
1 & \text{when } 0 > t'' > -2T \text{ and } -T-t'' < t < T \\
0 & \text{outside these regions}
\end{cases} , \tag{3.67f}
\]

disregarding the edge points of the $\Pi$ functions because these single-point values do not
contribute to the integral. Equation (3.67e) thus reduces to

\[
\begin{aligned}
\Lambda_T ={}& \int_{-2T}^{0}dt''\,R_{\tilde N\tilde N}(t'')\,e^{-2\pi i f t''}\int_{-T-t''}^{T}dt\,e^{-4\pi i f t} \\
&+ \int_{0}^{2T}dt''\,R_{\tilde N\tilde N}(t'')\,e^{-2\pi i f t''}\int_{-T}^{T-t''}dt\,e^{-4\pi i f t} .
\end{aligned}
\tag{3.67g}
\]

We note that

\[
\int_{a}^{b} e^{-4\pi i f t}\,dt = \frac{1}{4\pi i f}\big[e^{-4\pi i f a} - e^{-4\pi i f b}\big] . \tag{3.67h}
\]

Applying (3.67h) to (3.67g) gives

\[
\begin{aligned}
\Lambda_T ={}& \int_{-2T}^{0} R_{\tilde N\tilde N}(t'')\,e^{-2\pi i f t''}\left[\frac{1}{4\pi i f}\big(e^{4\pi i f (T+t'')} - e^{-4\pi i f T}\big)\right]dt'' \\
&+ \int_{0}^{2T} R_{\tilde N\tilde N}(t'')\,e^{-2\pi i f t''}\left[\frac{1}{4\pi i f}\big(e^{4\pi i f T} - e^{4\pi i f (t''-T)}\big)\right]dt'' .
\end{aligned}
\]

Changing the variable of integration in the first integral to $t''' = -t''$ leads to [remember to apply
Eq. (3.48b)]

\[
\begin{aligned}
\Lambda_T ={}& \frac{1}{4\pi i f}\int_{0}^{2T} R_{\tilde N\tilde N}(t''')\big(e^{4\pi i f T}e^{-2\pi i f t'''} - e^{-4\pi i f T}e^{2\pi i f t'''}\big)\,dt''' \\
&+ \frac{1}{4\pi i f}\int_{0}^{2T} R_{\tilde N\tilde N}(t'')\big(e^{4\pi i f T}e^{-2\pi i f t''} - e^{-4\pi i f T}e^{2\pi i f t''}\big)\,dt'' \\
={}& \frac{e^{4\pi i f T}}{2\pi i f}\int_{0}^{2T} R_{\tilde N\tilde N}(t)\,e^{-2\pi i f t}\,dt - \frac{e^{-4\pi i f T}}{2\pi i f}\int_{0}^{2T} R_{\tilde N\tilde N}(t)\,e^{2\pi i f t}\,dt ,
\end{aligned}
\]

where in the last step we have dropped the primes from the variables of integration. The second
term is the complex conjugate of the first, so this formula can be written as

\[
\Lambda_T = \mathrm{Re}\left[\frac{e^{4\pi i f T}}{\pi i f}\int_{0}^{2T} R_{\tilde N\tilde N}(t)\,e^{-2\pi i f t}\,dt\right] \tag{3.67i}
\]

because $\mathrm{Re}(c) = (c/2) + (c^{*}/2)$ for any complex number c.


The Heaviside step function is defined to be

\[
\Xi(t) =
\begin{cases}
1 & \text{for } t > 0 \\
1/2 & \text{for } t = 0 \\
0 & \text{for } t < 0
\end{cases} \tag{3.67j}
\]

in Eq. (2.70a) of Chapter 2. The integral on the right-hand side of (3.67i) can now be written as

\[
\int_{0}^{2T} R_{\tilde N\tilde N}(t)\,e^{-2\pi i f t}\,dt = \int_{-\infty}^{\infty} \Xi(t)\,\Pi(t,2T)\,R_{\tilde N\tilde N}(t)\,e^{-2\pi i f t}\,dt . \tag{3.67k}
\]

The right-hand side is the Fourier transform of

\[
\Xi(t)\,\Pi(t,2T)\,R_{\tilde N\tilde N}(t)
\]

and the Fourier-transform operator F defined in Eq. (2.29a) of Chapter 2 can be used to write it as

\[
F^{(-ift)}\big(\Xi(t)\,\Pi(t,2T)\,R_{\tilde N\tilde N}(t)\big) .
\]

The Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] can be applied to get

\[
F^{(-ift)}\big(\Xi(t)\,\Pi(t,2T)\,R_{\tilde N\tilde N}(t)\big) = F^{(-ift')}\big(\Xi(t')\,\Pi(t',2T)\big) * F^{(-ift'')}\big(R_{\tilde N\tilde N}(t'')\big) . \tag{3.68a}
\]

According to Eq. (3.48c) there exists a power spectrum $S_{\tilde N\tilde N}(f)$ such that

\[
S_{\tilde N\tilde N}(f) = \int_{-\infty}^{\infty} R_{\tilde N\tilde N}(t'')\,e^{-2\pi i f t''}\,dt'' = F^{(-ift'')}\big(R_{\tilde N\tilde N}(t'')\big) . \tag{3.68b}
\]

Evaluating $F^{(-ift')}\big(\Xi(t')\,\Pi(t',2T)\big)$ is not much more difficult. Writing the Fourier transform as an
integral gives [remember that $e^{i\varphi} = \cos(\varphi) + i\sin(\varphi)$]

\[
\begin{aligned}
F^{(-ift')}\big(\Xi(t')\,\Pi(t',2T)\big) &= \int_{0}^{2T} e^{-2\pi i f t'}\,dt' = \frac{1}{2\pi i f}\big[1 - e^{-4\pi i f T}\big] \\
&= \frac{e^{-2\pi i f T}}{2\pi i f}\big(e^{2\pi i f T} - e^{-2\pi i f T}\big) \\
&= \frac{1}{\pi f}\big[\cos(2\pi f T) - i\sin(2\pi f T)\big]\sin(2\pi f T) \\
&= \frac{1}{2\pi f}\sin(4\pi f T) - \frac{i}{\pi f}\sin^{2}(2\pi f T) ,
\end{aligned}
\]

where in the last step we use that

\[
\sin\theta\cos\theta = \frac{1}{2}\sin(2\theta) .
\]

Applying the formula for the sinc function from Eq. (3.66e), we end up with

\[
F^{(-ift')}\big(\Xi(t')\,\Pi(t',2T)\big) = 2T\,\mathrm{sinc}(4\pi f T) - i\,(2\pi f T)\big[2T\,\mathrm{sinc}^{2}(2\pi f T)\big] . \tag{3.68c}
\]
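Equation (3.68c) can be confirmed numerically by comparing it with the elementary closed form of the integral of $e^{-2\pi i f t'}$ from 0 to 2T. The following is a minimal sketch; the helper names and the test values of T and f are assumptions made for illustration.

```python
import numpy as np

def sinc_text(x):
    """The text's sinc of Eq. (3.66e), sin(x)/x, built from NumPy's normalized sinc."""
    return np.sinc(np.asarray(x, dtype=float) / np.pi)

def half_window_transform(f, T):
    """Closed form of Eq. (3.68c) for the transform of Xi(t) * Pi(t, 2T)."""
    x = 2.0 * np.pi * f * T
    return 2.0 * T * sinc_text(2.0 * x) - 1j * x * (2.0 * T * sinc_text(x) ** 2)

T = 0.75
f = np.array([-1.3, -0.2, 0.1, 0.9, 2.4])     # nonzero test frequencies (illustrative)
closed_form = half_window_transform(f, T)

# Direct evaluation of the integral of exp(-2*pi*i*f*t) over 0 <= t <= 2T.
direct = (1.0 - np.exp(-4j * np.pi * f * T)) / (2j * np.pi * f)
```

Writing the result with sinc functions, as in Eq. (3.68c), also removes the apparent singularity at f = 0, where the transform equals the window length 2T.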

Equations (3.68b) and (3.68c) are substituted into (3.68a) to get

F ( −ift ) ( Ξ(t )Π (t , 2T ) RNN {


  (t ) ) = 2Tsinc(4π fT ) − i (2π fT ) ª
¬ 2Tsinc (2π fT ) º¼ ∗ S NN
2
(f ), }

- 311 -
3 · Random Variables, Random Functions, and Power Spectra

which can then be substituted into (3.67k), giving

2T

³R 
NN { }
(t )e −2π ift dt = 2Tsinc(4π fT ) − i (2π fT ) ª¬ 2Tsinc 2 (2π fT ) º¼ ∗ S NN
(f ). (3.68d)
0

Equation (2.67c) in Chapter 2 and the discussion following it show that

$$\frac{\sin(2\pi nf)}{\pi f} \to \delta(f) \quad \text{as } n \to \infty, \tag{3.68e}$$

where t in (2.67c) is here replaced by f. We note that, working with Eq. (3.66e),

$$2T\,\mathrm{sinc}(4\pi fT) = \frac{\sin(4\pi fT)}{2\pi f} = \frac{1}{2}\cdot\frac{\sin\big(2\pi f(2T)\big)}{\pi f}.$$

Hence, applying (3.68e), we have

$$2T\,\mathrm{sinc}(4\pi fT) \to \frac{1}{2}\,\delta(f) \quad \text{as } (2T) \to \infty. \tag{3.68f}$$

As n gets large in (3.68e), the sine oscillates ever more rapidly with f. Similarly, as 2T gets large
in (3.68f)—which is, of course, the same as T getting large—the sinc oscillates ever more rapidly
with f. In order to approximate the sinc in (3.68f) by a delta function, then, we need to have the
other functions of f that are also present varying slowly compared to the original oscillation.
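Again assuming, as in the discussion following Eq. (3.66f), that T is large enough for the first sinc function on the right-hand side of Eq. (3.68d) to oscillate rapidly compared to the noise-power spectrum $S_{\tilde{N}\tilde{N}}$, we expand the convolution in (3.68d), writing it as [apply Eq. (2.38e) in Chapter 2]

$$\int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\,e^{-2\pi ift}\,dt = \left\{\left[2T\,\mathrm{sinc}(4\pi fT)\right] \ast S_{\tilde{N}\tilde{N}}(f)\right\} - i\left\{\left(2\pi fT\left[2T\,\mathrm{sinc}^2(2\pi fT)\right]\right) \ast S_{\tilde{N}\tilde{N}}(f)\right\},$$

and then apply (3.68f) to get, since $\delta(f) \ast S_{\tilde{N}\tilde{N}}(f) = S_{\tilde{N}\tilde{N}}(f)$, that

$$\int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\,e^{-2\pi ift}\,dt \cong \frac{1}{2}\,S_{\tilde{N}\tilde{N}}(f) - i\left\{\left(2\pi fT\left[2T\,\mathrm{sinc}^2(2\pi fT)\right]\right) \ast S_{\tilde{N}\tilde{N}}(f)\right\}. \tag{3.68g}$$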

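The remaining convolution on the right-hand side can be written as [see Eqs. (2.38a) and (2.38b) in Chapter 2]

$$\begin{aligned}
\left(2\pi fT\left[2T\,\mathrm{sinc}^2(2\pi fT)\right]\right) \ast S_{\tilde{N}\tilde{N}}(f) &= S_{\tilde{N}\tilde{N}}(f) \ast \left(2\pi fT\left[2T\,\mathrm{sinc}^2(2\pi fT)\right]\right) \\
&= \int_{-\infty}^{\infty} S_{\tilde{N}\tilde{N}}(f')\left\{2\pi (f - f')T\left[2T\,\mathrm{sinc}^2\big(2\pi (f - f')T\big)\right]\right\} df'.
\end{aligned}$$

Both functions $(f - f')$ and $S_{\tilde{N}\tilde{N}}(f')$ vary slowly with $f'$ compared to

$$\left[2T\,\mathrm{sinc}^2\big(2\pi (f - f')T\big)\right]$$

for large values of T, so (3.67b) can be applied to the integral to get

$$\left(2\pi fT\left[2T\,\mathrm{sinc}^2(2\pi fT)\right]\right) \ast S_{\tilde{N}\tilde{N}}(f) \cong \int_{-\infty}^{\infty} S_{\tilde{N}\tilde{N}}(f')\left\{2\pi T\,(f - f')\,\delta(f - f')\right\} df' = 0. \tag{3.68h}$$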

Substituting this into (3.68g) gives

$$\int_{0}^{2T} R_{\tilde{N}\tilde{N}}(t)\,e^{-2\pi ift}\,dt \cong \frac{1}{2}\,S_{\tilde{N}\tilde{N}}(f), \tag{3.68i}$$

which can then be put back into (3.67i) to get that [using ei cos( )  i sin( ) ]

ª cos(4& fT
ft ))iisin(4 fT)
sin(4&& ft
ft)) 11 ºº
T Re «   (( ff ))»» ..
AA SSNN
NN
¬ &&ifif 22 ¼¼

Equation (3.66e) simplifies this to

T [ 2Tsinc(4& fT )]S NN
(f ). (3.68j)

Substituting this approximation into (3.67c) lets us write, at last, that

$$E\big(|\tilde{N}_T^{(\pm)}(f)|^2\big) \cong T\,S_{\tilde{N}\tilde{N}}(f) \pm T\,\mathrm{sinc}(4\pi fT)\,S_{\tilde{N}\tilde{N}}(f) = T\,S_{\tilde{N}\tilde{N}}(f)\left[1 \pm \mathrm{sinc}(4\pi fT)\right]. \tag{3.68k}$$

The approximation in (3.68k) makes sense whenever T is large enough for $\mathrm{sinc}(2\pi fT)$ and $\mathrm{sinc}(4\pi fT)$ to oscillate rapidly with frequency f compared to $S_{\tilde{N}\tilde{N}}(f)$, which is usually true for white-noise-like power spectra. When $fT \gg 1$, the sinc function’s value in formula (3.68k) is small compared to one [see, for example, Figs. 3.7(a) and 3.7(b)] and we can write

$$E\big(|\tilde{N}_T^{(\pm)}(f)|^2\big) \cong T\,S_{\tilde{N}\tilde{N}}(f). \tag{3.69a}$$

When $f = 0$, it is of course no longer true that $fT \gg 1$. For this special case, the sinc function is one; and, according to (3.68k), no matter how large T is we have

$$E\big(|\tilde{N}_T^{(-)}(0)|^2\big) \cong 0 \tag{3.69b}$$

and

$$E\big(|\tilde{N}_T^{(+)}(0)|^2\big) \cong 2T\,S_{\tilde{N}\tilde{N}}(0). \tag{3.69c}$$

Equation (3.69b) is easy to understand after reviewing the discussion following Eq. (3.65d) above. Since $\tilde{N}_T^{(-)}$ is always an odd function of f, it must be zero at $f = 0$ according to Eq. (2.12a) of Chapter 2. To understand Eq. (3.69c), we consult Eqs. (3.65e) and (3.65f) and note that

$$|\tilde{N}_T^{(+)}(f)|^2 + |\tilde{N}_T^{(-)}(f)|^2 = \left[\mathrm{Re}\big(\tilde{N}_T(f)\big)\right]^2 + \left[\mathrm{Im}\big(\tilde{N}_T(f)\big)\right]^2 = |\tilde{N}_T(f)|^2.$$

Applying the expectation operator E to both sides and using its linearity with respect to random quantities (see Sec. 3.10 above), we get

$$E\big(|\tilde{N}_T^{(+)}(f)|^2\big) + E\big(|\tilde{N}_T^{(-)}(f)|^2\big) = E\Big(\left[\mathrm{Re}\big(\tilde{N}_T(f)\big)\right]^2\Big) + E\Big(\left[\mathrm{Im}\big(\tilde{N}_T(f)\big)\right]^2\Big) = E\big(|\tilde{N}_T(f)|^2\big)$$

or

$$E\big(|\tilde{N}_T(f)|^2\big) = E\big(|\tilde{N}_T^{(+)}(f)|^2\big) + E\big(|\tilde{N}_T^{(-)}(f)|^2\big). \tag{3.69d}$$

FIGURE 3.7(a). Plot of sinc(2πfT) versus f, with peak value 1.0 and the points f = ±1/(2T) marked on the frequency axis.

FIGURE 3.7(b). Plot of sinc(4πfT) versus f, with peak value 1.0 and the points f = ±1/(4T) marked on the frequency axis.

Glancing back at formula (3.57g), we realize, because T is assumed to be large in our analysis here, that

$$\frac{E\big(|\tilde{N}_T(f)|^2\big)}{2T}$$

is close to its limiting value as $T \to \infty$. Hence, (3.57g) lets us write

$$E\big(|\tilde{N}_T(f)|^2\big) \cong 2T\,S_{\tilde{N}\tilde{N}}(f) \tag{3.69e}$$

for large values of T. This approximation works well no matter what the value of f is. Therefore, at $f = 0$ we can substitute (3.69e) into (3.69d) to get

$$2T\,S_{\tilde{N}\tilde{N}}(0) \cong E\big(|\tilde{N}_T^{(+)}(0)|^2\big) + E\big(|\tilde{N}_T^{(-)}(0)|^2\big). \tag{3.69f}$$

Having already justified (3.69b), we can now apply it to (3.69f) to get

$$2T\,S_{\tilde{N}\tilde{N}}(0) \cong E\big(|\tilde{N}_T^{(+)}(0)|^2\big).$$

This result then justifies formula (3.69c) above.
Equation (3.69d) can also be used to justify, but only when $f > 0$, the assumption behind formula (3.69e) that the ratio

$$\frac{E\big(|\tilde{N}_T(f)|^2\big)}{2T}$$

is, for large values of T, close to its limiting value of $S_{\tilde{N}\tilde{N}}(f)$. When $f > 0$ and T is large so that $fT \gg 1$, we can substitute (3.69a) into (3.69d) to rederive (3.69e),

$$E\big(|\tilde{N}_T(f)|^2\big) \cong 2T\,S_{\tilde{N}\tilde{N}}(f).$$

According to (3.69a), then, it follows that when $fT \gg 1$ and T is large, both

$$E\big(|\tilde{N}_T^{(+)}(f)|^2\big) = E\Big(\left[\mathrm{Re}\big(\tilde{N}_T(f)\big)\right]^2\Big) \cong T\,S_{\tilde{N}\tilde{N}}(f)$$

and

$$E\big(|\tilde{N}_T^{(-)}(f)|^2\big) = E\Big(\left[\mathrm{Im}\big(\tilde{N}_T(f)\big)\right]^2\Big) \cong T\,S_{\tilde{N}\tilde{N}}(f)$$

contribute equally to $E\big(|\tilde{N}_T(f)|^2\big)$. Having arrived at the formula

$$E\big(|\tilde{N}_T(f)|^2\big) \cong 2T\,S_{\tilde{N}\tilde{N}}(f)$$

without using Eq. (3.57g), that is, without thinking about what the limiting value of the ratio

$$\frac{E\big(|\tilde{N}_T(f)|^2\big)}{2T}$$

might be as T gets large, we can now work in reverse to get that

$$\frac{E\big(|\tilde{N}_T(f)|^2\big)}{2T} \cong S_{\tilde{N}\tilde{N}}(f).$$

Not only does this result demonstrate that the ratio $E\big(|\tilde{N}_T(f)|^2\big)/(2T)$ is indeed about equal to $S_{\tilde{N}\tilde{N}}(f)$ when $fT \gg 1$ and T is large, but we have also seen, when $fT \gg 1$ and T is large, that the expected value of the squared real component of $\tilde{N}_T$ and the expected value of the squared imaginary component of $\tilde{N}_T$ contribute equally to the expected value of the squared magnitude of $\tilde{N}_T$. In other words, both

$$E\big(|\tilde{N}_T^{(+)}(f)|^2\big) = E\Big(\left[\mathrm{Re}\big(\tilde{N}_T(f)\big)\right]^2\Big)$$

and

$$E\big(|\tilde{N}_T^{(-)}(f)|^2\big) = E\Big(\left[\mathrm{Im}\big(\tilde{N}_T(f)\big)\right]^2\Big)$$

have turned out to be about half the expected value of the squared magnitude of $\tilde{N}_T$, which lets us write

$$E\big(|\tilde{N}_T(f)|^2\big) \cong 2E\big(|\tilde{N}_T^{(+)}(f)|^2\big) = 2E\Big(\left[\mathrm{Re}\big(\tilde{N}_T(f)\big)\right]^2\Big) \tag{3.69g}$$

and

$$E\big(|\tilde{N}_T(f)|^2\big) \cong 2E\big(|\tilde{N}_T^{(-)}(f)|^2\big) = 2E\Big(\left[\mathrm{Im}\big(\tilde{N}_T(f)\big)\right]^2\Big). \tag{3.69h}$$

A not-very-rigorous argument often used to derive Eqs. (3.69a), (3.69g), and (3.69h) starts out by breaking $\tilde{N}_T(f)$ into real and imaginary parts. (This step is sound; we did the same thing in our analysis above.) Writing

$$|\tilde{N}_T(f)|^2 = \left[\mathrm{Re}\big(\tilde{N}_T(f)\big)\right]^2 + \left[\mathrm{Im}\big(\tilde{N}_T(f)\big)\right]^2, \tag{3.70a}$$

we next assume that $\tilde{N}_T$ is equally likely to be real or imaginary, which means that

$$E\Big(\left[\mathrm{Re}\big(\tilde{N}_T(f)\big)\right]^2\Big) = E\Big(\left[\mathrm{Im}\big(\tilde{N}_T(f)\big)\right]^2\Big). \tag{3.70b}$$

This is the result, of course, that we have gone to some trouble to justify analytically rather than just assuming it applies; it is sometimes true and sometimes very wrong, for example, when $f = 0$ or when $S_{\tilde{N}\tilde{N}}$ varies rapidly with f. Applying the expectation operator E to both sides of (3.70a) gives, using the linearity of E explained in Sec. 3.10,

$$E\big(|\tilde{N}_T(f)|^2\big) = E\Big(\left[\mathrm{Re}\big(\tilde{N}_T(f)\big)\right]^2\Big) + E\Big(\left[\mathrm{Im}\big(\tilde{N}_T(f)\big)\right]^2\Big). \tag{3.70c}$$

Substitution of (3.70b) into (3.70c) then leads to

$$E\big(|\tilde{N}_T(f)|^2\big) = 2E\Big(\left[\mathrm{Re}\big(\tilde{N}_T(f)\big)\right]^2\Big) \tag{3.70d}$$

and

$$E\big(|\tilde{N}_T(f)|^2\big) = 2E\Big(\left[\mathrm{Im}\big(\tilde{N}_T(f)\big)\right]^2\Big). \tag{3.70e}$$

Consulting Eqs. (3.65e) and (3.65f), we see that formulas (3.70d) and (3.70e) are identical to
(3.69g) and (3.69h). Fortunately, since a more rigorous line of reasoning has already been used to
derive Eqs. (3.69g) and (3.69h), there is no need to rely on the assumption that (3.70b) is true to
establish the truth of (3.70d) and (3.70e). Having derived these results more rigorously, we also
now know that formulas (3.69g) and (3.69h) and formulas (3.70d) and (3.70e) are approximations
that should be used only when T is large, when $fT \gg 1$, and when $S_{\tilde{N}\tilde{N}}$ varies slowly with frequency f.

3.27 Analyzing the Noise in Artificially Created Even Signals


Many times in interferometer measurements we take all the data recorded for times $t > 0$ and, assuming the signal is an even function of time, use the positive-time data to specify what the data “ought to be” at $t < 0$. This means that the noise in the data for $-\infty < t < \infty$ ends up being an even function of time; that is, the real-valued random function $\tilde{n}_E(t)$ that characterizes the noise at $t > 0$ in the original recording also characterizes the noise for all negative time values because of the way we construct the data set. Mathematically we say that

$$\tilde{n}_E(-t) = \tilde{n}_E(t) \quad \text{for all } -\infty < t < \infty. \tag{3.71a}$$

Although random function $\tilde{n}_E(t)$ is neither ergodic nor stationary, we can assume that a real-valued and stationary random function $\tilde{n}(t)$ exists such that

$$\tilde{n}(t) = \tilde{n}_E(t) \quad \text{for } t \ge 0. \tag{3.71b}$$

Just like any other stationary random function, $\tilde{n}(t)$ has an autocorrelation function [see Eq. (3.30b)]

$$R_{\tilde{n}\tilde{n}}(t - t') = E\big(\tilde{n}(t')\,\tilde{n}(t)\big). \tag{3.71c}$$

Following the conventions of Sec. 3.20 above [see Eqs. (3.48a)–(3.48c)], we note that $R_{\tilde{n}\tilde{n}}$ is an even function,

$$R_{\tilde{n}\tilde{n}}(-\tau) = R_{\tilde{n}\tilde{n}}(\tau), \tag{3.71d}$$

and that autocorrelation $R_{\tilde{n}\tilde{n}}$ and the power spectrum $S_{\tilde{n}\tilde{n}}$ make up a Fourier-transform pair,

$$S_{\tilde{n}\tilde{n}}(f) = \int_{-\infty}^{\infty} R_{\tilde{n}\tilde{n}}(\tau)\,e^{-2\pi if\tau}\,d\tau \tag{3.71e}$$

and

$$R_{\tilde{n}\tilde{n}}(\tau) = \int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f)\,e^{2\pi if\tau}\,df. \tag{3.71f}$$

Following the same pattern as in Eq. (3.65a), we define

$$N_T(f) = \int_{-T}^{T} \tilde{n}(t)\,e^{-2\pi ift}\,dt = \int_{-\infty}^{\infty} \Pi(t,T)\,\tilde{n}(t)\,e^{-2\pi ift}\,dt \tag{3.72a}$$

and

$$N_{TE}(f) = \int_{-T}^{T} \tilde{n}_E(t)\,e^{-2\pi ift}\,dt = \int_{-\infty}^{\infty} \Pi(t,T)\,\tilde{n}_E(t)\,e^{-2\pi ift}\,dt. \tag{3.72b}$$

For large values of T, we can derive a simple approximation for

$$E\big(|N_{TE}(f)|^2\big),$$

the expectation value of the squared magnitude of $N_{TE}$, in terms of

$$E\big(|N_T(f)|^2\big)$$

and the power spectrum $S_{\tilde{n}\tilde{n}}(f)$.

We start by specifying the Heaviside step function to be the same as in Eq. (3.67j):

$$\Xi(t) = \begin{cases} 1 & \text{for } t > 0 \\ 1/2 & \text{for } t = 0 \\ 0 & \text{for } t < 0. \end{cases} \tag{3.73a}$$

This is the same step function defined in Eq. (2.70a) in Chapter 2. It follows that $\tilde{n}_E(t)$ can be written as [see Eqs. (3.71a) and (3.71b)]

$$\tilde{n}_E(t) = \Xi(t)\,\tilde{n}(t) + \Xi(-t)\,\tilde{n}(-t). \tag{3.73b}$$

We note that for $t > 0$, the first term has $\Xi(t) = 1$ and the second term has $\Xi(-t) = 0$, so

$$\tilde{n}_E(t) = \tilde{n}(t).$$

For $t < 0$, the first term has $\Xi(t) = 0$ and the second term has $\Xi(-t) = 1$, so

$$\tilde{n}_E(t) = \tilde{n}(-t),$$

and when $t = 0$ both $\Xi(t)$ and $\Xi(-t)$ are 1/2, so

$$\tilde{n}_E(0) = \tilde{n}(0).$$

We can now write, using Eq. (3.72b) and remembering that $\tilde{n}_E$ is real, that

$$\begin{aligned}
E\big(|N_{TE}(f)|^2\big) &= E\big(N_{TE}(f)^{\ast} \cdot N_{TE}(f)\big) \\
&= E\left(\int_{-\infty}^{\infty} \Pi(t',T)\,\tilde{n}_E(t')\,e^{-2\pi ift'}\,dt' \int_{-\infty}^{\infty} \Pi(t,T)\,\tilde{n}_E(t)\,e^{2\pi ift}\,dt\right).
\end{aligned}$$

Using the linearity of E described in Sec. 3.10 above, we bring the expectation operator inside the double integral over $dt$ and $dt'$ to get

$$E\big(|N_{TE}(f)|^2\big) = \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi ift}\,E\big(\tilde{n}_E(t')\,\tilde{n}_E(t)\big). \tag{3.73c}$$

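Equation (3.73b) shows that, again using the linearity of the expectation operator,

$$\begin{aligned}
E\big(\tilde{n}_E(t')\,\tilde{n}_E(t)\big) &= E\big([\Xi(t')\tilde{n}(t') + \Xi(-t')\tilde{n}(-t')] \cdot [\Xi(t)\tilde{n}(t) + \Xi(-t)\tilde{n}(-t)]\big) \\
&= \Xi(t')\Xi(t)\,E\big(\tilde{n}(t')\tilde{n}(t)\big) + \Xi(-t')\Xi(t)\,E\big(\tilde{n}(-t')\tilde{n}(t)\big) \\
&\quad + \Xi(t')\Xi(-t)\,E\big(\tilde{n}(t')\tilde{n}(-t)\big) + \Xi(-t')\Xi(-t)\,E\big(\tilde{n}(-t')\tilde{n}(-t)\big).
\end{aligned}$$

Substituting from Eq. (3.71c) gives

$$\begin{aligned}
E\big(\tilde{n}_E(t')\,\tilde{n}_E(t)\big) &= \Xi(t')\Xi(t)\,R_{\tilde{n}\tilde{n}}(t - t') + \Xi(-t')\Xi(t)\,R_{\tilde{n}\tilde{n}}(t + t') \\
&\quad + \Xi(t')\Xi(-t)\,R_{\tilde{n}\tilde{n}}(-t - t') + \Xi(-t')\Xi(-t)\,R_{\tilde{n}\tilde{n}}(-t + t').
\end{aligned}$$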

Because the autocorrelation is even [see Eq. (3.71d)], this simplifies to

$$\begin{aligned}
E\big(\tilde{n}_E(t')\,\tilde{n}_E(t)\big) &= [\Xi(t')\Xi(t) + \Xi(-t')\Xi(-t)]\,R_{\tilde{n}\tilde{n}}(t' - t) \\
&\quad + [\Xi(-t')\Xi(t) + \Xi(t')\Xi(-t)]\,R_{\tilde{n}\tilde{n}}(t + t').
\end{aligned} \tag{3.73d}$$

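Substituting the right-hand side of (3.73d) into the double integral in (3.73c) gives

$$\begin{aligned}
E\big(|N_{TE}(f)|^2\big) &= \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi ift}\,[\Xi(t')\Xi(t) + \Xi(-t')\Xi(-t)]\,R_{\tilde{n}\tilde{n}}(t' - t) \\
&\quad + \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi ift}\,[\Xi(-t')\Xi(t) + \Xi(t')\Xi(-t)]\,R_{\tilde{n}\tilde{n}}(t + t') \\
&= \Lambda_1 + \Lambda_2,
\end{aligned} \tag{3.73e}$$

where

$$\Lambda_1 = \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi ift}\,[\Xi(t')\Xi(t) + \Xi(-t')\Xi(-t)]\,R_{\tilde{n}\tilde{n}}(t' - t) \tag{3.73f}$$

and

$$\Lambda_2 = \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi ift}\,[\Xi(-t')\Xi(t) + \Xi(t')\Xi(-t)]\,R_{\tilde{n}\tilde{n}}(t + t'). \tag{3.73g}$$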

The dark solid line in Fig. 3.8(a) is a plot of the Heaviside step function $\Xi(t)$ and the dashed line is a plot of $\Pi(t,T)$. Disregarding the edge points whose values do not contribute to the integrals in (3.73f) and (3.73g), the product $[\Xi(t) \cdot \Pi(t,T)]$ is zero unless both $\Xi$ and $\Pi$ are one; that is, the product is zero unless t lies inside the region where both the solid and dashed plots are one in Fig. 3.8(a). Comparing this region to the plot of

$$\Pi\left(t - \frac{T}{2}, \frac{T}{2}\right)$$

in Fig. 3.8(b), we see that

$$\Xi(t) \cdot \Pi(t,T) = \Pi\left(t - \frac{T}{2}, \frac{T}{2}\right). \tag{3.74a}$$

In Fig. 3.8(c), the dashed line is again a plot of $\Pi(t,T)$, but now the dark solid line is a plot of $\Xi(-t)$. Comparing the region where both $\Xi(-t)$ and $\Pi(t,T)$ are one in Fig. 3.8(c) to the plot of

$$\Pi\left(t + \frac{T}{2}, \frac{T}{2}\right)$$

in Fig. 3.8(d), we see that

$$\Xi(-t) \cdot \Pi(t,T) = \Pi\left(t + \frac{T}{2}, \frac{T}{2}\right). \tag{3.74b}$$
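The two window identities (3.74a) and (3.74b) are easy to confirm on a grid of sample times. The Python sketch below does so under one labeled assumption: $\Pi(t,T)$ is taken to be 1 for $|t| < T$, 0 for $|t| > T$, and 1/2 at $|t| = T$ (the edge value is an assumption made here to mirror the step function of Eq. (3.73a); the text only requires that edge points not contribute to the integrals). The function names are illustrative.

```python
def step(t):
    """Heaviside step of Eq. (3.73a): 1 for t > 0, 1/2 at t = 0, 0 for t < 0."""
    if t > 0:
        return 1.0
    if t < 0:
        return 0.0
    return 0.5


def window(t, half_width):
    """Assumed form of Pi(t, T): 1 inside |t| < T, 0 outside, 1/2 at the edges."""
    if abs(t) < half_width:
        return 1.0
    if abs(t) > half_width:
        return 0.0
    return 0.5


def check_window_identities(T, samples):
    """Return True if Eqs. (3.74a) and (3.74b) hold at every sample time."""
    for t in samples:
        lhs_a = step(t) * window(t, T)
        rhs_a = window(t - T / 2.0, T / 2.0)
        lhs_b = step(-t) * window(t, T)
        rhs_b = window(t + T / 2.0, T / 2.0)
        if lhs_a != rhs_a or lhs_b != rhs_b:
            return False
    return True


if __name__ == "__main__":
    T = 2.0
    # Sample times from -3 to 3 in steps of 0.25 (all exactly representable).
    samples = [k * 0.25 for k in range(-12, 13)]
    print(check_window_identities(T, samples))
```

The sample grid deliberately includes $t = 0$ and $t = \pm T$, so the check also exercises the edge points under the assumed edge value.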

FIGURE 3.8(a)–(d). Four panels, each plotted against t with −T and T marked on the axis: (a) Ξ(t) (solid) and Π(t, T) (dashed); (b) Π(t − T/2, T/2); (c) Ξ(−t) (solid) and Π(t, T) (dashed); (d) Π(t + T/2, T/2).

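Splitting the formula in Eq. (3.73f) into two double integrals, we get that

$$\begin{aligned}
\Lambda_1 &= \int_{-\infty}^{\infty} dt'\,\Xi(t')\,\Pi(t',T)\,e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Xi(t)\,\Pi(t,T)\,e^{2\pi ift}\,R_{\tilde{n}\tilde{n}}(t' - t) \\
&\quad + \int_{-\infty}^{\infty} dt'\,\Xi(-t')\,\Pi(t',T)\,e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Xi(-t)\,\Pi(t,T)\,e^{2\pi ift}\,R_{\tilde{n}\tilde{n}}(t' - t),
\end{aligned}$$

which becomes, applying (3.74a) and (3.74b),

$$\begin{aligned}
\Lambda_1 &= \int_{-\infty}^{\infty} dt'\,\Pi\left(t' - \frac{T}{2}, \frac{T}{2}\right) e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi\left(t - \frac{T}{2}, \frac{T}{2}\right) e^{2\pi ift}\,R_{\tilde{n}\tilde{n}}(t' - t) \\
&\quad + \int_{-\infty}^{\infty} dt'\,\Pi\left(t' + \frac{T}{2}, \frac{T}{2}\right) e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi\left(t + \frac{T}{2}, \frac{T}{2}\right) e^{2\pi ift}\,R_{\tilde{n}\tilde{n}}(t' - t).
\end{aligned}$$

After changing the variables of integration in the first double integral from $t, t'$ to $\tau = t - (T/2)$ and $\tau' = t' - (T/2)$, and changing the variables of integration in the second double integral from $t, t'$ to $\tau'' = t + (T/2)$ and $\tau''' = t' + (T/2)$, we see that

$$\begin{aligned}
\Lambda_1 &= \int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{-2\pi if\left(\tau' + \frac{T}{2}\right)} \int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi if\left(\tau + \frac{T}{2}\right)}\,R_{\tilde{n}\tilde{n}}(\tau' - \tau) \\
&\quad + \int_{-\infty}^{\infty} d\tau'''\,\Pi\left(\tau''', \frac{T}{2}\right) e^{-2\pi if\left(\tau''' - \frac{T}{2}\right)} \int_{-\infty}^{\infty} d\tau''\,\Pi\left(\tau'', \frac{T}{2}\right) e^{2\pi if\left(\tau'' - \frac{T}{2}\right)}\,R_{\tilde{n}\tilde{n}}(\tau''' - \tau'').
\end{aligned}$$

Since

$$e^{-2\pi if(\pm T/2)} \cdot e^{2\pi if(\pm T/2)} = 1,$$

the double integral over $d\tau'''$ and $d\tau''$ has the same value as the double integral over $d\tau'$ and $d\tau$, which means that

$$\Lambda_1 = 2\int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{-2\pi if\tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi if\tau}\,R_{\tilde{n}\tilde{n}}(\tau' - \tau).$$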

This type of double integral has already been evaluated in Sec. 3.26 while simplifying Eq. (3.66b), but there is no harm in quickly repeating the procedure. Applying Eq. (3.71f), we get

$$\begin{aligned}
\Lambda_1 &= 2\int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{-2\pi if\tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi if\tau} \int_{-\infty}^{\infty} df'\,S_{\tilde{n}\tilde{n}}(f')\,e^{2\pi if'(\tau' - \tau)} \\
&= 2\int_{-\infty}^{\infty} df'\,S_{\tilde{n}\tilde{n}}(f') \int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{-2\pi i(f - f')\tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi i(f - f')\tau}.
\end{aligned}$$

This expression can be simplified further using Eq. (3.66d). Equation (3.66d) still holds true if T is replaced by T/2 because the original T is a dummy parameter. So, replacing T by T/2 and substituting the result in the formula for $\Lambda_1$, we get

$$\Lambda_1 = 2T\int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f')\left[T\,\mathrm{sinc}^2\big(\pi T(f - f')\big)\right] df'. \tag{3.75a}$$

According to Eq. (3.66e),

$$T\,\mathrm{sinc}^2(\pi Tf) = 2 \cdot \left(\frac{T}{2}\right) \cdot \left[\frac{\sin\left(2\pi \cdot \frac{T}{2} \cdot f\right)}{\left(2\pi \cdot \frac{T}{2} \cdot f\right)}\right]^2 = 2T'\left[\frac{\sin(2\pi T'f)}{(2\pi T'f)}\right]^2,$$

where $T' = T/2$. In the limit $T \to \infty$ we also have, of course, that $T' \to \infty$, so according to Eq. (3.57f) it follows that

$$T\,\mathrm{sinc}^2(\pi Tf) \to \delta(f) \quad \text{as } T \to \infty. \tag{3.75b}$$

Again, we assume that T is large enough to make

$$T\,\mathrm{sinc}^2\big(\pi T(f - f')\big) \cong \delta(f - f')$$

in Eq. (3.75a). Consequently,

$$\Lambda_1 \cong 2T\int_{-\infty}^{\infty} S_{\tilde{n}\tilde{n}}(f')\,\delta(f - f')\,df'$$

or

$$\Lambda_1 \cong 2T\,S_{\tilde{n}\tilde{n}}(f). \tag{3.75c}$$

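To evaluate $\Lambda_2$, we apply Eqs. (3.74a) and (3.74b) to the right-hand side of Eq. (3.73g) to get

$$\begin{aligned}
\Lambda_2 &= \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,\Xi(-t')\,e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,\Xi(t)\,e^{2\pi ift}\,R_{\tilde{n}\tilde{n}}(t + t') \\
&\quad + \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,\Xi(t')\,e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,\Xi(-t)\,e^{2\pi ift}\,R_{\tilde{n}\tilde{n}}(t + t') \\
&= \int_{-\infty}^{\infty} dt'\,\Pi\left(t' + \frac{T}{2}, \frac{T}{2}\right) e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi\left(t - \frac{T}{2}, \frac{T}{2}\right) e^{2\pi ift}\,R_{\tilde{n}\tilde{n}}(t + t') \\
&\quad + \int_{-\infty}^{\infty} dt'\,\Pi\left(t' - \frac{T}{2}, \frac{T}{2}\right) e^{-2\pi ift'} \int_{-\infty}^{\infty} dt\,\Pi\left(t + \frac{T}{2}, \frac{T}{2}\right) e^{2\pi ift}\,R_{\tilde{n}\tilde{n}}(t + t').
\end{aligned}$$

In the first double integral, the $t'$, $t$ variables of integration are replaced by $\tau' = t' + (T/2)$ and $\tau = t - (T/2)$ respectively; and in the second double integral, the $t'$, $t$ variables of integration are replaced by $\tau''' = t' - (T/2)$ and $\tau'' = t + (T/2)$ respectively. This leads to

$$\begin{aligned}
\Lambda_2 &= \int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{-2\pi if\left(\tau' - \frac{T}{2}\right)} \int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi if\left(\tau + \frac{T}{2}\right)}\,R_{\tilde{n}\tilde{n}}(\tau + \tau') \\
&\quad + \int_{-\infty}^{\infty} d\tau'''\,\Pi\left(\tau''', \frac{T}{2}\right) e^{-2\pi if\left(\tau''' + \frac{T}{2}\right)} \int_{-\infty}^{\infty} d\tau''\,\Pi\left(\tau'', \frac{T}{2}\right) e^{2\pi if\left(\tau'' - \frac{T}{2}\right)}\,R_{\tilde{n}\tilde{n}}(\tau'' + \tau''')
\end{aligned}$$

or

$$\begin{aligned}
\Lambda_2 &= e^{2\pi ifT}\int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{-2\pi if\tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi if\tau}\,R_{\tilde{n}\tilde{n}}(\tau + \tau') \\
&\quad + e^{-2\pi ifT}\int_{-\infty}^{\infty} d\tau'''\,\Pi\left(\tau''', \frac{T}{2}\right) e^{-2\pi if\tau'''} \int_{-\infty}^{\infty} d\tau''\,\Pi\left(\tau'', \frac{T}{2}\right) e^{2\pi if\tau''}\,R_{\tilde{n}\tilde{n}}(\tau'' + \tau''').
\end{aligned} \tag{3.75d}$$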

Everything on the right-hand side of (3.75d) is real except the complex exponentials, so the second term is the complex conjugate of the first term. It is easy to show that this is true. Starting with the first term we have

$$\begin{aligned}
&\left[e^{2\pi ifT}\int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{-2\pi if\tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi if\tau}\,R_{\tilde{n}\tilde{n}}(\tau + \tau')\right]^{\ast} \\
&\quad = e^{-2\pi ifT}\int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{2\pi if\tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{-2\pi if\tau}\,R_{\tilde{n}\tilde{n}}(\tau + \tau') \\
&\quad = e^{-2\pi ifT}\int_{-\infty}^{\infty} d\tau'''\,\Pi\left(\tau''', \frac{T}{2}\right) e^{-2\pi if\tau'''} \int_{-\infty}^{\infty} d\tau''\,\Pi\left(\tau'', \frac{T}{2}\right) e^{2\pi if\tau''}\,R_{\tilde{n}\tilde{n}}(\tau'' + \tau'''),
\end{aligned}$$

where in the last step we interchange the order of the double integral and replace the dummy variables of integration $\tau$, $\tau'$ by $\tau'''$, $\tau''$ respectively. Clearly, the second term in (3.75d) is the complex conjugate of the first. Since $2\,\mathrm{Re}(c) = c + c^{\ast}$ for any complex number c, it follows that Eq. (3.75d) can be written as

$$\Lambda_2 = 2\,\mathrm{Re}\left(e^{2\pi ifT}\int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{-2\pi if\tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi if\tau}\,R_{\tilde{n}\tilde{n}}(\tau + \tau')\right). \tag{3.75e}$$

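After the variable of integration of the inner integral is changed to $t'' = -(\tau + \tau')$, it can be written as

$$\int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi if\tau}\,R_{\tilde{n}\tilde{n}}(\tau + \tau') = \int_{-\infty}^{\infty} dt''\,\Pi\left(-t'' - \tau', \frac{T}{2}\right) e^{-2\pi if(t'' + \tau')}\,R_{\tilde{n}\tilde{n}}(-t''). \tag{3.75f}$$

According to Eq. (3.48b) above and Eq. (2.56c) in Chapter 2, both $\Pi$ and $R_{\tilde{n}\tilde{n}}$ are even functions, which means that

$$\Pi\left(-t'' - \tau', \frac{T}{2}\right) = \Pi\left(t'' + \tau', \frac{T}{2}\right)$$

and

$$R_{\tilde{n}\tilde{n}}(-t'') = R_{\tilde{n}\tilde{n}}(t'').$$

Substituting these two formulas into the right-hand side of (3.75f) gives

$$\int_{-\infty}^{\infty} d\tau\,\Pi\left(\tau, \frac{T}{2}\right) e^{2\pi if\tau}\,R_{\tilde{n}\tilde{n}}(\tau + \tau') = \int_{-\infty}^{\infty} dt''\,\Pi\left(t'' + \tau', \frac{T}{2}\right) e^{-2\pi if(t'' + \tau')}\,R_{\tilde{n}\tilde{n}}(t''),$$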

which can in turn be substituted into (3.75e) to get

$$\Lambda_2 = 2\,\mathrm{Re}\left(e^{2\pi ifT}\int_{-\infty}^{\infty} d\tau'\,\Pi\left(\tau', \frac{T}{2}\right) e^{-2\pi if\tau'} \int_{-\infty}^{\infty} dt''\,\Pi\left(t'' + \tau', \frac{T}{2}\right) e^{-2\pi if(t'' + \tau')}\,R_{\tilde{n}\tilde{n}}(t'')\right).$$

Interchanging the order of integration and replacing the variable $\tau'$ by t, we end up with

$$\Lambda_2 = 2\,\mathrm{Re}\left(e^{2\pi ifT}\int_{-\infty}^{\infty} dt''\,R_{\tilde{n}\tilde{n}}(t'')\,e^{-2\pi ift''} \int_{-\infty}^{\infty} dt\,\Pi\left(t, \frac{T}{2}\right) \Pi\left(t'' + t, \frac{T}{2}\right) e^{-4\pi ift}\right). \tag{3.75g}$$

Comparing (3.75g) with (3.67e), we note that the double integral in the formula for  2 can be
written as
5 5
§ T· § T · 4& ift
³5 dt Rnn  (t )e 5³ dt  ¨© t, 2 ¸¹  ¨© t 33  t, 2 ¸¹ e
2& ift 33
33 33 T / 2

with the understanding that the random function is now ñ(t) instead of Ñ(t) as in Eq. (3.67e). This
leads to a simpler—well, shorter—formula for  2 ,

 2 2 Re  e 2& ifT T / 2  . (3.75h)

We have already found the appropriate approximation for T and T / 2 when T and T/2 are large
enough to make the sinc functions oscillate rapidly with f compared to the noise-power
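spectrum. Hence, we now apply formula (3.68j) to (3.75h), which gives, after remembering to replace $\tilde{N}$ by $\tilde{n}$ and T by T/2,

$$\Lambda_2 \cong 2\,\mathrm{Re}\left(e^{2\pi ifT}\left[T\,\mathrm{sinc}(2\pi fT)\right] S_{\tilde{n}\tilde{n}}(f)\right).$$

Since

$$e^{2\pi ifT} = \cos(2\pi fT) + i\sin(2\pi fT),$$

the formula for $\Lambda_2$ can be written as

$$\Lambda_2 \cong 2T\cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)\,S_{\tilde{n}\tilde{n}}(f). \tag{3.75i}$$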

Having found good approximations for $\Lambda_1$ and $\Lambda_2$, we can substitute (3.75c) and (3.75i) into (3.73e) to get

$$E\big(|N_{TE}(f)|^2\big) \cong 2T\,S_{\tilde{n}\tilde{n}}(f) + 2T\cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)\,S_{\tilde{n}\tilde{n}}(f)$$

or

$$E\big(|N_{TE}(f)|^2\big) \cong 2T\,S_{\tilde{n}\tilde{n}}(f) \cdot \left[1 + \cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)\right]. \tag{3.76a}$$

For large values of T, so that

$$fT \gg 1, \tag{3.76b}$$

we know that [apply Eq. (3.66e)]

$$\left|\cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)\right| = \frac{\left|\cos(2\pi fT)\sin(2\pi fT)\right|}{2\pi fT} \le \frac{1}{2\pi fT} \ll 1$$

because (i) the absolute value of the product of the sine and cosine must always be less than or equal to one and (ii) the value of $1/(2\pi fT)$ must be small when fT is large. The formula in (3.76a) now simplifies to

$$E\big(|N_{TE}(f)|^2\big) \cong 2T\,S_{\tilde{n}\tilde{n}}(f). \tag{3.76c}$$

This will be a useful approximation to know when analyzing detector noise in Chapter 6.
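To get a feel for how quickly (3.76a) collapses to (3.76c), the bracketed factor $1 + \cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)$ can simply be tabulated. The Python sketch below is illustrative only: it assumes $\mathrm{sinc}(x) = \sin(x)/x$ for Eq. (3.66e), and the function name `even_noise_factor` and the sample values are hypothetical. Since $\cos x \sin x = \tfrac{1}{2}\sin 2x$, the same factor can also be written as $1 + \mathrm{sinc}(4\pi fT)$, which the sketch uses as a cross-check.

```python
import math


def sinc(x):
    """sinc(x) = sin(x)/x, the convention assumed here for Eq. (3.66e)."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def even_noise_factor(f, T):
    """Bracketed factor of Eq. (3.76a): 1 + cos(2*pi*f*T) * sinc(2*pi*f*T)."""
    x = 2.0 * math.pi * f * T
    return 1.0 + math.cos(x) * sinc(x)


if __name__ == "__main__":
    T = 5.0
    for f in (0.0, 0.01, 0.1, 1.0, 10.0):
        factor = even_noise_factor(f, T)
        alternate = 1.0 + sinc(4.0 * math.pi * f * T)  # same factor, rewritten
        print(f * T, factor, alternate)
```

The factor is largest at $f = 0$ and stays within $1/(2\pi fT)$ of one elsewhere, which is the bound used in the text to justify dropping it when $fT \gg 1$.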

__________

The basic concepts introduced in this chapter—such as random variables and functions, the
autocorrelation function, the noise-power spectrum, stationarity and ergodicity—may not be as
important as the Fourier theory covered in Chapter 2, but they turn up over and over again in the
following pages. The Wiener-Khinchin theorem is used to transform electromagnetic wavefields
into the spectral radiances that Michelson interferometers are built to measure. Stationary random
functions are added to interference signals to represent what happens when the interference
signals become contaminated by noise. The expectation operator E is applied to the products of
random quantities to turn them into autocorrelation functions, and the autocorrelation functions
are then transformed into noise-power spectra in formulas for the random-measurement error.
This chapter has explained the statistical ideas behind these procedures—and the context in
which the ideas arise—to show what the formulas mean and why they make sense.

4
FROM MAXWELL’S EQUATIONS TO
THE MICHELSON INTERFEROMETER
The interference formulas for a highly idealized version of the standard Michelson interferometer
can be derived in a page or two, and that is what is done in most textbooks. Section 1.5 of
Chapter 1 lays out the basic approach of this derivation, pointing out that all we really need is the
19th-century ether-wave theory of light because a full knowledge of Maxwell’s equations is not
required. Afterwards, these ideal interference formulas can, with some difficulty and an appeal to
ad hoc arguments, be modified to handle the measurement errors and distortions present in
nonideal instruments, but this is difficult to do in a straightforward and convincing way.
Consequently, in this chapter we prefer to start with first principles, carefully tracing the plane-
wave solutions to Maxwell’s equations through the standard Michelson interferometer and then
applying the Fourier methodology and random-signal theory explained in the previous two
chapters to describe the electromagnetic wavefields leaving the instrument. Although longer than
the standard textbook procedure, this approach leads naturally to detailed formulas describing
what happens when the optical setup is slightly misaligned, what happens when the input
radiation is polarized, and what happens when the interferometer measures an input spectrum that
is nonuniform over its field of view. We do this both for the interferometer’s balanced
interference signal and its unbalanced background signal, explaining first the reasoning behind
the formulas for the balanced input signal and then showing how the same sort of analysis
produces similar formulas for the unbalanced background signal. At the end of this process, the
reader has a detailed understanding of how the formulas describing ideal Michelson interferometers should be modified and expanded to describe nonideal instruments in an imperfect world.

4.1 Deriving the Electromagnetic Wave Equations


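In SI units, Maxwell’s equations for empty space are

$$\vec{\nabla} \times \vec{B} = \mu_o \varepsilon_o \frac{\partial \vec{E}}{\partial t}, \tag{4.1a}$$

$$\vec{\nabla} \times \vec{E} = -\frac{\partial \vec{B}}{\partial t}, \tag{4.1b}$$

$$\vec{\nabla} \cdot \vec{E} = 0, \tag{4.1c}$$

and

$$\vec{\nabla} \cdot \vec{B} = 0 \tag{4.1d}$$

where

$$\mu_o = 4\pi \cdot 10^{-7}\ \text{henry/meter}$$

and

$$\varepsilon_o = \frac{1}{\mu_o c^2}. \tag{4.1e}$$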
In these equations, $\vec{E}$ is the electric field, which is a function of position and time; $\vec{B}$ is the magnetic-induction field, which is also a function of position and time; t is the time coordinate; $\mu_o$ is the magnetic permeability of free space; $\varepsilon_o$ is the permittivity of free space; c is the velocity of light; and $\vec{\nabla}$ is the standard vector-derivative “del” operator [see Eq. (4A.7a) in Appendix 4A for a definition]. We take the curl of both sides in Eqs. (4.1a) and (4.1b) to get

$$\vec{\nabla} \times [\vec{\nabla} \times \vec{B}] = \mu_o \varepsilon_o \frac{\partial}{\partial t}\left(\vec{\nabla} \times \vec{E}\right) \tag{4.2a}$$

and

$$\vec{\nabla} \times [\vec{\nabla} \times \vec{E}] = -\frac{\partial}{\partial t}\left(\vec{\nabla} \times \vec{B}\right). \tag{4.2b}$$

But for any vector field $\vec{v}$, we have the identity

$$\vec{\nabla} \times [\vec{\nabla} \times \vec{v}] = \vec{\nabla}\left(\vec{\nabla} \cdot \vec{v}\right) - \nabla^2 \vec{v}. \tag{4.2c}$$

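Substitution of (4.2c) into (4.2a) and (4.2b) gives

$$\vec{\nabla}\left(\vec{\nabla} \cdot \vec{B}\right) - \nabla^2 \vec{B} = \mu_o \varepsilon_o \frac{\partial}{\partial t}\left(\vec{\nabla} \times \vec{E}\right),$$

$$\vec{\nabla}\left(\vec{\nabla} \cdot \vec{E}\right) - \nabla^2 \vec{E} = -\frac{\partial}{\partial t}\left(\vec{\nabla} \times \vec{B}\right),$$

or

$$\nabla^2 \vec{B} - \mu_o \varepsilon_o \frac{\partial^2 \vec{B}}{\partial t^2} = 0,$$

$$\nabla^2 \vec{E} - \mu_o \varepsilon_o \frac{\partial^2 \vec{E}}{\partial t^2} = 0,$$

where we have used $\vec{\nabla} \cdot \vec{B} = \vec{\nabla} \cdot \vec{E} = 0$ from (4.1c) and (4.1d) and

$$\vec{\nabla} \times \vec{E} = -\partial \vec{B}/\partial t, \qquad \vec{\nabla} \times \vec{B} = \mu_o \varepsilon_o\,\partial \vec{E}/\partial t$$

from (4.1a) and (4.1b) to simplify our results. The substitution $\mu_o \varepsilon_o = c^{-2}$ from (4.1e) now gives

$$\nabla^2 \vec{B} - \frac{1}{c^2}\frac{\partial^2 \vec{B}}{\partial t^2} = 0 \tag{4.3a}$$

and

$$\nabla^2 \vec{E} - \frac{1}{c^2}\frac{\partial^2 \vec{E}}{\partial t^2} = 0. \tag{4.3b}$$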
c ∂t
Equation (4.3a) is the wave equation for $\vec{B}$, the magnetic-induction field as a function of position and time; and (4.3b) is the wave equation for $\vec{E}$, the electric field as a function of position and time. Because $\vec{E}$ and $\vec{B}$ are vectors and the wave equation is usually applied to scalar fields, we now rewrite Eqs. (4.3a) and (4.3b) as a collection of six scalar wave equations to show the meaning of the two vector wave equations. The first step is to identify the $\vec{E}$ and $\vec{B}$ Cartesian field components. Figure 4.1 specifies a three-dimensional Cartesian coordinate system for the $\vec{E}$ and $\vec{B}$ field vectors located at a single point P. We use the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors of the coordinate system to write

$$\vec{E} = \hat{x}E_x + \hat{y}E_y + \hat{z}E_z \tag{4.4a}$$

and

$$\vec{B} = \hat{x}B_x + \hat{y}B_y + \hat{z}B_z, \tag{4.4b}$$

where, as shown in Fig. 4.1, $E_x$, $E_y$, $E_z$ are the real x, y, z components of the electric field and $B_x$, $B_y$, $B_z$ are the real x, y, z components of the magnetic-induction field. Both $E_{x,y,z}$ and $B_{x,y,z}$ are, of course, functions of position and time. We define a position vector

$$\vec{r} = x\hat{x} + y\hat{y} + z\hat{z} \tag{4.4c}$$

and show the dependence of the $\vec{E}$ and $\vec{B}$ fields on position and time by rewriting (4.4a) and (4.4b) as

$$\vec{E}(\vec{r}, t) = \hat{x}E_x(\vec{r}, t) + \hat{y}E_y(\vec{r}, t) + \hat{z}E_z(\vec{r}, t)$$

and

$$\vec{B}(\vec{r}, t) = \hat{x}B_x(\vec{r}, t) + \hat{y}B_y(\vec{r}, t) + \hat{z}B_z(\vec{r}, t).$$

This notation is best regarded as a shorthand for [see the discussion after Eq. (2.109d) in Sec. 2.25 of Chapter 2]

$$\vec{E}(x, y, z, t) = \hat{x}E_x(x, y, z, t) + \hat{y}E_y(x, y, z, t) + \hat{z}E_z(x, y, z, t)$$

and

$$\vec{B}(x, y, z, t) = \hat{x}B_x(x, y, z, t) + \hat{y}B_y(x, y, z, t) + \hat{z}B_z(x, y, z, t).$$

FIGURE 4.1. Cartesian x, y, z axes showing the $\vec{E}$ and $\vec{B}$ field vectors at point P, each drawn separately with its x, y, z components at the same x, y, z coordinates; in the example shown, $E_x < 0$, $E_y > 0$, $E_z > 0$ and $B_x > 0$, $B_y > 0$, $B_z < 0$.

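For any vector $\vec{v}$ we have, according to Eq. (4A.11c) in Appendix 4A,

$$\nabla^2 \vec{v} = \hat{x}\nabla^2 v_x + \hat{y}\nabla^2 v_y + \hat{z}\nabla^2 v_z$$

where $v_x$, $v_y$, $v_z$ are the real x, y, z components of real vector $\vec{v}$. It follows that substitution of Eqs. (4.4a) and (4.4b) into (4.3a) and (4.3b) gives six scalar wave equations, one for each Cartesian component of the two vector equations (4.3a) and (4.3b):

$$\nabla^2 E_x - \frac{1}{c^2}\frac{\partial^2 E_x}{\partial t^2} = \frac{\partial^2 E_x}{\partial x^2} + \frac{\partial^2 E_x}{\partial y^2} + \frac{\partial^2 E_x}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 E_x}{\partial t^2} = 0, \tag{4.5a}$$

$$\nabla^2 E_y - \frac{1}{c^2}\frac{\partial^2 E_y}{\partial t^2} = \frac{\partial^2 E_y}{\partial x^2} + \frac{\partial^2 E_y}{\partial y^2} + \frac{\partial^2 E_y}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 E_y}{\partial t^2} = 0, \tag{4.5b}$$

$$\nabla^2 E_z - \frac{1}{c^2}\frac{\partial^2 E_z}{\partial t^2} = \frac{\partial^2 E_z}{\partial x^2} + \frac{\partial^2 E_z}{\partial y^2} + \frac{\partial^2 E_z}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 E_z}{\partial t^2} = 0, \tag{4.5c}$$

$$\nabla^2 B_x - \frac{1}{c^2}\frac{\partial^2 B_x}{\partial t^2} = \frac{\partial^2 B_x}{\partial x^2} + \frac{\partial^2 B_x}{\partial y^2} + \frac{\partial^2 B_x}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 B_x}{\partial t^2} = 0, \tag{4.5d}$$

$$\nabla^2 B_y - \frac{1}{c^2}\frac{\partial^2 B_y}{\partial t^2} = \frac{\partial^2 B_y}{\partial x^2} + \frac{\partial^2 B_y}{\partial y^2} + \frac{\partial^2 B_y}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 B_y}{\partial t^2} = 0, \tag{4.5e}$$

$$\nabla^2 B_z - \frac{1}{c^2}\frac{\partial^2 B_z}{\partial t^2} = \frac{\partial^2 B_z}{\partial x^2} + \frac{\partial^2 B_z}{\partial y^2} + \frac{\partial^2 B_z}{\partial z^2} - \frac{1}{c^2}\frac{\partial^2 B_z}{\partial t^2} = 0. \tag{4.5f}$$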

Here, $\nabla^2 = \partial^2/\partial x^2 + \partial^2/\partial y^2 + \partial^2/\partial z^2$ is used to write these equations using explicit partial derivatives with respect to x, y, and z. These six equations are just the scalar wave equation for $E_x$, $E_y$, $E_z$ and $B_x$, $B_y$, $B_z$. They are not really that difficult to solve when they have simple boundary conditions. In fact, if at some time t the $\vec{E}$ and $\vec{B}$ electromagnetic fields are zero everywhere, then the solution to these equations is the trivial one that the $\vec{E}$ and $\vec{B}$ fields remain identically zero everywhere. If, however, at some time t there is a region of space where the fields are not identically zero, then we expect nontrivial solutions having nonzero values of the $\vec{E}$ and $\vec{B}$ fields.
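As a small numerical illustration of the scalar wave equation (4.5a), the Python sketch below applies central differences to a trial field that depends only on z and t, $u(z,t) = \cos\big(2\pi(\sigma z - ft)\big)$, and evaluates $\partial^2 u/\partial z^2 - (1/c^2)\,\partial^2 u/\partial t^2$ at one point. The trial function, the normalized units with $c = 1$, the sample point, and the function names are all assumptions made for the illustration; the residual should be near zero only when $\sigma = f/c$.

```python
import math


def plane_wave(z, t, sigma, f):
    """Trial field u(z, t) = cos(2*pi*(sigma*z - f*t)); illustrative only."""
    return math.cos(2.0 * math.pi * (sigma * z - f * t))


def wave_residual(z, t, sigma, f, c, h=1e-4):
    """Central-difference estimate of d2u/dz2 - (1/c**2) * d2u/dt2 at (z, t)."""
    u0 = plane_wave(z, t, sigma, f)
    d2z = (plane_wave(z + h, t, sigma, f) - 2.0 * u0
           + plane_wave(z - h, t, sigma, f)) / h ** 2
    d2t = (plane_wave(z, t + h, sigma, f) - 2.0 * u0
           + plane_wave(z, t - h, sigma, f)) / h ** 2
    return d2z - d2t / c ** 2


if __name__ == "__main__":
    c = 1.0   # normalized units (assumption): speed of light set to one
    f = 2.0   # hypothetical frequency in the same normalized units
    matched = wave_residual(0.3, 0.7, f / c, f, c)
    mismatched = wave_residual(0.3, 0.7, 1.5 * f / c, f, c)
    print(matched, mismatched)
```

The matched case leaves only finite-difference and roundoff error, while the mismatched case leaves a residual of order $4\pi^2(\sigma^2 - f^2/c^2)$ times the field value, so the trial field fails to satisfy the wave equation unless its spatial and temporal frequencies are tied together by c.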

4.2 Electromagnetic Plane Waves


Equations (4.1a)–(4.1d), (4.3a), and (4.3b) contain five different differential operators: the divergence ($\vec{\nabla} \cdot$), the curl ($\vec{\nabla} \times$), the Laplacian ($\nabla^2$), and the first and second partial derivatives with respect to time ($\partial/\partial t$, $\partial^2/\partial t^2$). All five are real linear operators as defined in Appendix 4A. According to the discussion following Eqs. (4A.19a) and (4A.19b), we can therefore find real solutions for $\vec{E}$ and $\vec{B}$ by first solving for them as complex vector fields and then, at the end, taking their real parts to get the desired real solutions. Following this procedure, we begin looking for complex solutions to (4.3a) and (4.3b) that have the form

$$\vec{E}(\vec{r}, t) = \sum_{\ell} \vec{E}_{\ell}(\vec{r})\,e^{-2\pi if_{\ell}t} \tag{4.6a}$$

and

$$\vec{B}(\vec{r}, t) = \sum_{\ell} \vec{B}_{\ell}(\vec{r})\,e^{-2\pi if_{\ell}t}, \tag{4.6b}$$

where all the $f_{\ell}$ values are real and $\vec{E}_{\ell}$, $\vec{B}_{\ell}$ may be complex vector functions of position. Substituting (4.6a) and (4.6b) into (4.3a) and (4.3b) shows that we end up with

$$\sum_{\ell}\left[\left(\nabla^2 \vec{E}_{\ell} + 4\pi^2 \sigma_{\ell}^2 \vec{E}_{\ell}\right) e^{-2\pi if_{\ell}t}\right] = 0$$

and

$$\sum_{\ell}\left[\left(\nabla^2 \vec{B}_{\ell} + 4\pi^2 \sigma_{\ell}^2 \vec{B}_{\ell}\right) e^{-2\pi if_{\ell}t}\right] = 0$$

if we define

$$\sigma_{\ell} = \frac{f_{\ell}}{c}. \tag{4.7a}$$

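The only way these sums can be identically zero for all times t is to set

$$\nabla^2 \vec{E}_{\ell} + 4\pi^2 \sigma_{\ell}^2 \vec{E}_{\ell} = 0 \tag{4.7b}$$

and

$$\nabla^2 \vec{B}_{\ell} + 4\pi^2 \sigma_{\ell}^2 \vec{B}_{\ell} = 0 \tag{4.7c}$$

for each value of $\ell$ in the sums. We next look for solutions

$$\vec{E}_{\ell}(\vec{r}) = \sum_{j} \vec{E}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j} \cdot \vec{r})} \tag{4.8a}$$

and

$$\vec{B}_{\ell}(\vec{r}) = \sum_{j} \vec{B}_{\ell j}\,e^{2\pi i(\vec{k}_{\ell j} \cdot \vec{r})}, \tag{4.8b}$$

where all the $\vec{k}_{\ell j}$ are constant, real, three-dimensional vectors and $\vec{E}_{\ell j}$, $\vec{B}_{\ell j}$ are complex, constant, three-dimensional vectors. In terms of the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors of Fig. 4.1,

$$\vec{k}_{\ell j} = \hat{x}k_{\ell jx} + \hat{y}k_{\ell jy} + \hat{z}k_{\ell jz},$$

so that, substituting from Eq. (4.4c),

$$\vec{k}_{\ell j} \cdot \vec{r} = \vec{r} \cdot \vec{k}_{\ell j} = xk_{\ell jx} + yk_{\ell jy} + zk_{\ell jz}.$$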

From Eq. (4A.12a) of Appendix 4A,

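$$\begin{aligned}
\nabla^2 \vec{E}_A(\vec{r}) &= \sum_{j} \vec{E}_{Aj}\, \nabla^2 \left[ e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})} \right] \\
&= \sum_{j} \vec{E}_{Aj} \left[ \frac{\partial^2}{\partial x^2} e^{2\pi i (x k_{Ajx} + y k_{Ajy} + z k_{Ajz})}
 + \frac{\partial^2}{\partial y^2} e^{2\pi i (x k_{Ajx} + y k_{Ajy} + z k_{Ajz})}
 + \frac{\partial^2}{\partial z^2} e^{2\pi i (x k_{Ajx} + y k_{Ajy} + z k_{Ajz})} \right] \\
&= -4\pi^2 \sum_{j} \left( k_{Ajx}^2 + k_{Ajy}^2 + k_{Ajz}^2 \right) \vec{E}_{Aj}\, e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})}
 = -4\pi^2 \sum_{j} |\vec{k}_{Aj}|^2\, \vec{E}_{Aj}\, e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})}
\end{aligned}$$

and similarly,

$$\nabla^2 \vec{B}_A(\vec{r}) = -4\pi^2 \sum_{j} |\vec{k}_{Aj}|^2\, \vec{B}_{Aj}\, e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})}.$$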

Substitution of these two results and Eqs. (4.8a) and (4.8b) into (4.7b) and (4.7c) gives

$$\sum_{j} \left[ \vec{E}_{Aj}\, e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})} \left( \sigma_A^2 - |\vec{k}_{Aj}|^2 \right) \right] = 0 \tag{4.9a}$$

and

$$\sum_{j} \left[ \vec{B}_{Aj}\, e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})} \left( \sigma_A^2 - |\vec{k}_{Aj}|^2 \right) \right] = 0. \tag{4.9b}$$

This can be true over all values of $\vec{r}$ with nonzero values of $\vec{E}_{Aj}$ and $\vec{B}_{Aj}$ only when

$$\sigma_A^2 = |\vec{k}_{Aj}|^2 \tag{4.9c}$$

for all values of A and j. Equation (4.9c) requires the real vector $\vec{k}_{Aj}$ to have a magnitude
$|\vec{k}_{Aj}| = |\sigma_A|$ that depends only on index A. This suggests that the j index specifies the different
directions taken on by the $\vec{k}_{Aj}$ vectors, giving

$$\vec{k}_{Aj} = |\sigma_A| \cdot \hat{\Omega}_{Aj}.$$

Here $\hat{\Omega}_{Aj}$ is a dimensionless unit vector, called the propagation vector, which for a specified
value of A points in different directions for different values of j. In fact, nothing stops us from
assuming that the $\hat{\Omega}_{Aj}$ propagation vectors range over the same (indefinitely large) set of j
directions for each A value; if we want to leave out some j direction for a given A, we can always
remove those directions by making both $\vec{E}_{Aj}$ and $\vec{B}_{Aj}$ zero for the unwanted values of A and j. We
can thus write

$$\vec{k}_{Aj} = |\sigma_A| \cdot \hat{\Omega}_{j}. \tag{4.9d}$$
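The condition in Eq. (4.9c) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes a single scalar component of the form $e^{2\pi i(\vec{k} \cdot \vec{r} - f t)}$ and estimates the wave-equation residual $\nabla^2 u - c^{-2}\,\partial^2 u/\partial t^2$ with central differences. The function names, the sample point, and the step sizes are illustrative choices.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def plane_wave(x, y, z, t, k, f):
    """One scalar component exp(2*pi*i*(k.r - f*t)), as in Eqs. (4.6a) and (4.8a)."""
    kx, ky, kz = k
    return cmath.exp(2j * math.pi * (kx * x + ky * y + kz * z - f * t))


def wave_equation_residual(k, f, point=(1.0e-7, 2.0e-7, 3.0e-7), t=1.0e-15):
    """|laplacian(u) - (1/c^2) d2u/dt2| divided by 4*pi^2*|k|^2, by central differences."""
    x, y, z = point
    kmag = math.sqrt(sum(comp * comp for comp in k))
    h = 1.0e-3 / kmag   # spatial step: a small fraction of one wavelength
    dt = h / C          # matching time step
    u0 = plane_wave(x, y, z, t, k, f)
    lap = (plane_wave(x + h, y, z, t, k, f) + plane_wave(x - h, y, z, t, k, f)
           + plane_wave(x, y + h, z, t, k, f) + plane_wave(x, y - h, z, t, k, f)
           + plane_wave(x, y, z + h, t, k, f) + plane_wave(x, y, z - h, t, k, f)
           - 6.0 * u0) / h ** 2
    d2t = (plane_wave(x, y, z, t + dt, k, f) + plane_wave(x, y, z, t - dt, k, f)
           - 2.0 * u0) / dt ** 2
    return abs(lap - d2t / C ** 2) / (4.0 * math.pi ** 2 * kmag ** 2)


if __name__ == "__main__":
    sigma = 2.0e6                            # wavenumber in 1/m (assumed example value)
    k = (0.6 * sigma, 0.0, 0.8 * sigma)      # |k| = sigma
    print(wave_equation_residual(k, sigma * C))        # small: f/c equals |k|
    print(wave_equation_residual(k, 1.1 * sigma * C))  # not small: f/c differs from |k|
```

The residual is close to zero only when $f/c$ matches $|\vec{k}|$, which is the content of Eqs. (4.7a) and (4.9c).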

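Substitution of (4.8a), (4.8b), (4.7a), and (4.9d) into (4.6a) and (4.6b) gives

$$\vec{E}(\vec{r}, t) = \sum_{A} \sum_{j} \vec{E}_{Aj}\, e^{2\pi i |\sigma_A| \left( \hat{\Omega}_j \cdot \vec{r} - (\sigma_A / |\sigma_A|)\, ct \right)} \tag{4.10a}$$

and

$$\vec{B}(\vec{r}, t) = \sum_{A} \sum_{j} \vec{B}_{Aj}\, e^{2\pi i |\sigma_A| \left( \hat{\Omega}_j \cdot \vec{r} - (\sigma_A / |\sigma_A|)\, ct \right)}. \tag{4.10b}$$

The phase term in Eqs. (4.10a) and (4.10b) is

$$2\pi |\sigma_A| (\hat{\Omega}_j \cdot \vec{r} - ct) \quad \text{if} \quad \sigma_A / |\sigma_A| = 1$$

and

$$2\pi |\sigma_A| (\hat{\Omega}_j \cdot \vec{r} + ct) \quad \text{if} \quad \sigma_A / |\sigma_A| = -1.$$

When $\sigma_A / |\sigma_A| = 1$, Eq. (4.9c) has been solved with

$$\sigma_A = |\vec{k}_{Aj}| \geq 0;$$

and when $\sigma_A / |\sigma_A| = -1$, Eq. (4.9c) has been solved with

$$\sigma_A = -|\vec{k}_{Aj}| \leq 0.$$

Figure 4.2 shows that the choice made here is to have the phase increasing in the direction of $\hat{\Omega}_j$
as time increases, hence the solution to (4.9c) is chosen to be

$$\sigma_A = |\vec{k}_{Aj}| \geq 0 \tag{4.10c}$$

and Eqs. (4.10a) and (4.10b) become

$$\vec{E}(\vec{r}, t) = \sum_{A} \sum_{j} \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \tag{4.11a}$$

and

$$\vec{B}(\vec{r}, t) = \sum_{A} \sum_{j} \vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)}. \tag{4.11b}$$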

The next section explains why these double sums are called electromagnetic plane waves.
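We define

$$\hat{\Omega}_j = \hat{x}\, \varepsilon_{jx} + \hat{y}\, \varepsilon_{jy} + \hat{z}\, \varepsilon_{jz} \tag{4.12a}$$

so that $\varepsilon_{jx}$, $\varepsilon_{jy}$, $\varepsilon_{jz}$ are the direction cosines of $\hat{\Omega}_j$ with respect to the x, y, z axes shown in Fig.
4.3,

$$\varepsilon_{jx} = \hat{\Omega}_j \cdot \hat{x} = \cos(\theta_{jx}), \quad \varepsilon_{jy} = \hat{\Omega}_j \cdot \hat{y} = \cos(\theta_{jy}), \quad \varepsilon_{jz} = \hat{\Omega}_j \cdot \hat{z} = \cos(\theta_{jz}). \tag{4.12b}$$

The standard relationship between direction cosines, that the sum of their squares is one, is the
same as the requirement that $\hat{\Omega}_j$ have unit length,

$$\varepsilon_{jx}^2 + \varepsilon_{jy}^2 + \varepsilon_{jz}^2 = \cos^2 \theta_{jx} + \cos^2 \theta_{jy} + \cos^2 \theta_{jz} = 1. \tag{4.12c}$$

Although we have chosen $\vec{E}$ and $\vec{B}$ to satisfy the vector wave equations (4.3a) and (4.3b),
they must also satisfy the full set of Maxwell conditions, Eqs. (4.1a)–(4.1d). Substituting (4.11a)
into (4.1c) gives, using Eq. (4A.12b) from Appendix 4A,

$$\sum_{A} \sum_{j} \vec{\nabla} \cdot \left[ \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right] = \sum_{A} \sum_{j} \vec{E}_{Aj} \cdot \vec{\nabla} \left[ e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right] = 0. \tag{4.13a}$$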

Simplifying the gradient gives

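$$\begin{aligned}
\vec{\nabla} \left[ e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right]
&= \left( \hat{x} \frac{\partial}{\partial x} + \hat{y} \frac{\partial}{\partial y} + \hat{z} \frac{\partial}{\partial z} \right) \left[ e^{2\pi i \sigma_A (x \varepsilon_{jx} + y \varepsilon_{jy} + z \varepsilon_{jz} - ct)} \right] \\
&= \left[ \hat{x} (2\pi i \sigma_A \varepsilon_{jx}) + \hat{y} (2\pi i \sigma_A \varepsilon_{jy}) + \hat{z} (2\pi i \sigma_A \varepsilon_{jz}) \right] e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \\
&= 2\pi i \sigma_A\, \hat{\Omega}_j\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)}.
\end{aligned} \tag{4.13b}$$

FIGURE 4.2. The planes of constant phase are specified by $\hat{\Omega}_j \cdot \vec{r} = ct = \text{constant}$, with each value of ct specifying
a different plane perpendicular to $\hat{\Omega}_j$.

FIGURE 4.3. [Figure: the unit vector $\hat{\Omega}_j$ and the angles $\theta_{jx}$, $\theta_{jy}$, $\theta_{jz}$ between it and the x, y, z axes.]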

Hence, Eq. (4.13a) becomes


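$$2\pi i \sum_{A} \sum_{j} \left[ \sigma_A \left( \vec{E}_{Aj} \cdot \hat{\Omega}_j \right) \right] e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} = 0. \tag{4.14a}$$

Similarly, substituting (4.11b) into (4.1d) and simplifying gives

$$2\pi i \sum_{A} \sum_{j} \left[ \sigma_A \left( \vec{B}_{Aj} \cdot \hat{\Omega}_j \right) \right] e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} = 0. \tag{4.14b}$$

The only way (4.14a) and (4.14b) can hold true for all values of $\vec{r}$ and t with nonzero $\sigma_A$ is to
require

$$\vec{E}_{Aj} \cdot \hat{\Omega}_j = 0 \tag{4.14c}$$

and

$$\vec{B}_{Aj} \cdot \hat{\Omega}_j = 0 \tag{4.14d}$$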

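for all values of A and j. Working next with Eq. (4.1a), we substitute (4.11a) and (4.11b) to get

$$\sum_{A} \sum_{j} \vec{\nabla} \times \left[ \vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right] = \mu_o \varepsilon_o \sum_{A} \sum_{j} \frac{\partial}{\partial t} \left[ \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right]$$

which becomes, using Eq. (4A.12c) in Appendix 4A,

$$-\sum_{A} \sum_{j} \vec{B}_{Aj} \times \vec{\nabla} \left[ e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right] = \mu_o \varepsilon_o \sum_{A} \sum_{j} (-2\pi i \sigma_A c) \left[ \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right].$$

Substituting from Eq. (4.13b) and using $\mu_o \varepsilon_o = c^{-2}$ [see Eq. (4.1e)] gives

$$\sum_{A} \sum_{j} 2\pi i \sigma_A\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \left[ \vec{B}_{Aj} \times \hat{\Omega}_j - \frac{1}{c} \vec{E}_{Aj} \right] = 0. \tag{4.15a}$$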

The only way this can be true for all $\vec{r}$ and t with nonzero $\sigma_A$ is if

$$c\, (\vec{B}_{Aj} \times \hat{\Omega}_j) = \vec{E}_{Aj} \tag{4.15b}$$

for all values of A and j. Similarly, substitution of (4.11a) and (4.11b) into (4.1b) gives

$$\sum_{A} \sum_{j} 2\pi i \sigma_A\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \left[ \vec{E}_{Aj} \times \hat{\Omega}_j + c \vec{B}_{Aj} \right] = 0. \tag{4.15c}$$

The only way (4.15c) can hold true for all $\vec{r}$ and t with nonzero $\sigma_A$ is if

$$\vec{E}_{Aj} \times \hat{\Omega}_j = -c \vec{B}_{Aj} \tag{4.15d}$$

for all values of A and j. It is not difficult to show that (4.15b) and (4.15d) are just different forms
of the same equation. Taking the cross product of the left-hand side of (4.15d) with $\hat{\Omega}_j$ gives,
using Eq. (4A.14) in Appendix 4A,

$$\hat{\Omega}_j \times (\vec{E}_{Aj} \times \hat{\Omega}_j) = -\hat{\Omega}_j \times (\hat{\Omega}_j \times \vec{E}_{Aj}) = -(\hat{\Omega}_j \cdot \vec{E}_{Aj})\, \hat{\Omega}_j + (\hat{\Omega}_j \cdot \hat{\Omega}_j)\, \vec{E}_{Aj} = \vec{E}_{Aj},$$

where we use $\hat{\Omega}_j \cdot \vec{E}_{Aj} = 0$ from Eq. (4.14c) and that $\hat{\Omega}_j \cdot \hat{\Omega}_j = 1$ because $\hat{\Omega}_j$ has unit length.
Therefore taking the cross product of both sides of (4.15d) with $\hat{\Omega}_j$ gives

$$\vec{E}_{Aj} = -c\, \hat{\Omega}_j \times \vec{B}_{Aj} = c\, \vec{B}_{Aj} \times \hat{\Omega}_j,$$

which is the same as Eq. (4.15b). We can also take the cross product of the left-hand side of
(4.15b) with $\hat{\Omega}_j$ and use $\hat{\Omega}_j \cdot \vec{B}_{Aj} = 0$ from (4.14d) and Eq. (4A.14) in Appendix 4A to get

$$c\, [\hat{\Omega}_j \times (\vec{B}_{Aj} \times \hat{\Omega}_j)] = c \vec{B}_{Aj}.$$

Taking the cross product of both the right-hand and left-hand sides of (4.15b) with $\hat{\Omega}_j$ now must
give

$$c \vec{B}_{Aj} = \hat{\Omega}_j \times \vec{E}_{Aj} = -\vec{E}_{Aj} \times \hat{\Omega}_j.$$

This is the same formula as Eq. (4.15d). Hence, as stated above, the restrictions placed on $\hat{\Omega}_j$ and
the complex vectors $\vec{E}_{Aj}$, $\vec{B}_{Aj}$ in Eqs. (4.14c) and (4.14d) make (4.15b) and (4.15d) the same
equality. We see that the double sums shown in (4.11a) and (4.11b) lead to acceptable complex
solutions to the vector wave equations for $\vec{E}$ and $\vec{B}$ in (4.3a) and (4.3b); and when the
restrictions (4.14c), (4.14d), and either (4.15b) or (4.15d) are placed on $\hat{\Omega}_j$, $\vec{E}_{Aj}$, and $\vec{B}_{Aj}$, the

double sums also satisfy (4.1a)–(4.1d), Maxwell’s equations for empty space. No limits are
placed on the size of these double sums. This means we can create two different double sums,
both matching the criteria of this section and so solving Maxwell’s equations, and add them
together to get one big double sum matching the criteria of this section and solving Maxwell’s
equations. In general we can add together any number of plane-wave solutions to Maxwell’s
equations to create a new and larger collection of plane waves solving Maxwell’s equations.
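The restrictions of this section can be illustrated with a short numerical sketch. It is only an example under stated assumptions: the propagation direction and the starting complex vector are arbitrary made-up values, and the helper names (`make_transverse`, `magnetic_amplitude`, `electric_from_magnetic`) are hypothetical. The code builds a complex $\vec{E}_{Aj}$ obeying Eq. (4.14c), computes $\vec{B}_{Aj}$ from Eq. (4.15d), and then recovers $\vec{E}_{Aj}$ from Eq. (4.15b).

```python
C = 299_792_458.0  # speed of light in m/s


def dot(a, b):
    """Unconjugated dot product, the one used in Eqs. (4.14c) and (4.14d)."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-component vectors (components may be complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def make_transverse(omega_hat, v):
    """Subtract the part of v along omega_hat so that Eq. (4.14c) holds."""
    along = dot(omega_hat, v)
    return tuple(vi - along * oi for vi, oi in zip(v, omega_hat))


def magnetic_amplitude(omega_hat, e_amp):
    """Eq. (4.15d) solved for B: B = (omega_hat x E) / c."""
    return tuple(comp / C for comp in cross(omega_hat, e_amp))


def electric_from_magnetic(omega_hat, b_amp):
    """Eq. (4.15b): E = c (B x omega_hat)."""
    return tuple(C * comp for comp in cross(b_amp, omega_hat))


omega_hat = (1 / 3, 2 / 3, 2 / 3)                         # unit length: 1/9 + 4/9 + 4/9 = 1
e_amp = make_transverse(omega_hat, (1 + 2j, -0.5j, 3.0))  # arbitrary complex starting vector
b_amp = magnetic_amplitude(omega_hat, e_amp)
e_back = electric_from_magnetic(omega_hat, b_amp)         # should reproduce e_amp
```

Because (4.15b) and (4.15d) are the same equality once (4.14c) and (4.14d) hold, `e_back` agrees with `e_amp` up to rounding error.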

4.3 Monochromatic Wave Trains


To show why Eqs. (4.11a) and (4.11b) are called plane-wave sums, we focus attention on a single
component of the sums in Eqs. (4.11a) and (4.11b) by assuming there to be only one nonzero pair
of $\vec{E}_{Aj}$, $\vec{B}_{Aj}$ terms. Then the formulas for $\vec{E}(\vec{r}, t)$ and $\vec{B}(\vec{r}, t)$ in (4.11a) and (4.11b) become

$$\vec{E}(\vec{r}, t) = \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \tag{4.16a}$$

and

$$\vec{B}(\vec{r}, t) = \vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \tag{4.16b}$$

with

$$\vec{E}_{Aj} \cdot \hat{\Omega}_j = \vec{B}_{Aj} \cdot \hat{\Omega}_j = 0 \quad \text{and} \quad \vec{B}_{Aj} = c^{-1} (\hat{\Omega}_j \times \vec{E}_{Aj}) \tag{4.16c}$$

from (4.14c), (4.14d), and (4.15d). Although it is customary to leave wave formulas in complex
form, strictly speaking only the real parts (or imaginary parts, see discussion at end of Appendix
4A) of the right-hand sides of (4.16a) and (4.16b) provide acceptable physical solutions to wave
Eqs. (4.3a) and (4.3b). Since an x, y, z coordinate system has not yet been specified, nothing stops
us from choosing the z axis to be parallel to $\hat{\Omega}_j$; and because both $\hat{z}$ and $\hat{\Omega}_j$ are dimensionless,
real, unit-length vectors, we then have $\hat{\Omega}_j = \hat{z}$. Equations (4.14c) and (4.14d) now show that the
complex vectors $\vec{E}_{Aj}$ and $\vec{B}_{Aj}$ have zero z components, allowing us to write

$$\vec{E}_{Aj} = \hat{x} E_{Ajx} + \hat{y} E_{Ajy} \tag{4.17a}$$

and

$$\vec{B}_{Aj} = \hat{x} B_{Ajx} + \hat{y} B_{Ajy} \tag{4.17b}$$

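where $E_{Ajx}$, $E_{Ajy}$, $B_{Ajx}$, $B_{Ajy}$ are all complex numbers. Substituting into (4.15b) gives, using
$\hat{x} \times \hat{\Omega}_j = \hat{x} \times \hat{z} = -\hat{y}$ and $\hat{y} \times \hat{\Omega}_j = \hat{y} \times \hat{z} = \hat{x}$,

$$\begin{aligned}
\hat{x} E_{Ajx} + \hat{y} E_{Ajy} &= c \left( \hat{x} \times \hat{\Omega}_j\, B_{Ajx} + \hat{y} \times \hat{\Omega}_j\, B_{Ajy} \right) \\
&= -\hat{y}\, (c B_{Ajx}) + \hat{x}\, (c B_{Ajy}),
\end{aligned} \tag{4.17c}$$

which means that

$$E_{Ajx} = c B_{Ajy} \tag{4.17d}$$

and

$$E_{Ajy} = -c B_{Ajx}. \tag{4.17e}$$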

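If we write

$$E_{Ajx} = |E_{Ajx}|\, e^{i \phi_{Ajx}} \tag{4.18a}$$

and

$$E_{Ajy} = |E_{Ajy}|\, e^{i \phi_{Ajy}} \tag{4.18b}$$

using real phase terms $\phi_{Ajx}$ and $\phi_{Ajy}$ to describe the $E_{Ajx}$, $E_{Ajy}$ complex constants, it then follows
from (4.17d) and (4.17e), because c is real, that

$$B_{Ajy} = \frac{1}{c} |E_{Ajx}|\, e^{i \phi_{Ajx}} \tag{4.18c}$$

and

$$B_{Ajx} = -\frac{1}{c} |E_{Ajy}|\, e^{i \phi_{Ajy}}. \tag{4.18d}$$

Hence, (4.17a) and (4.17b) become

$$\vec{E}_{Aj} = \hat{x}\, |E_{Ajx}|\, e^{i \phi_{Ajx}} + \hat{y}\, |E_{Ajy}|\, e^{i \phi_{Ajy}} \tag{4.18e}$$

and

$$\vec{B}_{Aj} = -\hat{x}\, \frac{1}{c} |E_{Ajy}|\, e^{i \phi_{Ajy}} + \hat{y}\, \frac{1}{c} |E_{Ajx}|\, e^{i \phi_{Ajx}}, \tag{4.18f}$$

so that taking the real part of the right-hand sides of Eqs. (4.16a) and (4.16b) gives, using $\hat{\Omega}_j = \hat{z}$
and $e^{i\psi} = \cos\psi + i \sin\psi$,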

$$\begin{aligned}
\mathrm{Re}\left[ \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right]
&= \mathrm{Re}\left[ \left( \hat{x}\, |E_{Ajx}|\, e^{i \phi_{Ajx}} + \hat{y}\, |E_{Ajy}|\, e^{i \phi_{Ajy}} \right) e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] \\
&= \hat{x}\, |E_{Ajx}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajx} \right) + \hat{y}\, |E_{Ajy}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajy} \right)
\end{aligned} \tag{4.19a}$$

and

$$\begin{aligned}
\mathrm{Re}\left[ \vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right]
&= \mathrm{Re}\left[ \left( -\hat{x}\, \frac{1}{c} |E_{Ajy}|\, e^{i \phi_{Ajy}} + \hat{y}\, \frac{1}{c} |E_{Ajx}|\, e^{i \phi_{Ajx}} \right) e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] \\
&= -\hat{x}\, \frac{1}{c} |E_{Ajy}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajy} \right) + \hat{y}\, \frac{1}{c} |E_{Ajx}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajx} \right).
\end{aligned} \tag{4.19b}$$

When z is held constant, all the x and y components of the $\vec{E}$ and $\vec{B}$ fields in (4.19a) and (4.19b)
oscillate at the same frequency $f = \sigma_A c$. We can recognize what is going on by keeping z
constant and noting that if t increases (or decreases) by $1/(\sigma_A c)$, then the phases of all the cosines
in Eqs. (4.19a) and (4.19b) decrease (or increase) by $2\pi$, leaving the fields unchanged. The wavefield specified in
(4.19a) and (4.19b) is a plane wavefield, since it depends on position only through z, so every point on a
plane specified by z = constant has the same real $\vec{E}$ field and $\vec{B}$ field at any given time t. Figure 4.4 shows
that when t is held constant in Eqs. (4.19a) and (4.19b) and z increases (or decreases) in value by $1/\sigma_A$, the
phases of all the cosines increase (or decrease) by $2\pi$. Consequently, planes in Fig. 4.4 that are separated by
$1/\sigma_A$ have phases differing by $2\pi$ and thus the same real $\vec{E}$ and $\vec{B}$ fields. This distance is called the
wavelength $\lambda$ of the plane wavefield. Parameter $\sigma_A$ is called the wavenumber, already defined in
Eq. (1.7b) of Chapter 1 to be $1/\lambda$. The plane wave is called monochromatic because it is specified
by a single frequency $f = \sigma_A c$ and wavelength $\lambda$. Its wavenumber $\sigma_A$ is $1/\lambda$, so the equality

$$f = \sigma_A c \tag{4.19c}$$

can now be interpreted as

$$\lambda f = c,$$

the classic relationship between wavelength, frequency, and velocity for any wavefield. We
conclude that Eqs. (4.19a) and (4.19b) describe a wavefield traveling in the $\hat{\Omega}_j = \hat{z}$ direction at
velocity c, the speed of light.
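As a small worked example of these relationships, the sketch below evaluates the cosine phase $2\pi\sigma_A(z - ct) + \phi$ from Eqs. (4.19a) and (4.19b) and checks how it changes over one wavelength and over one period. The wavenumber value and the function names are illustrative assumptions only.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def wavelength(sigma):
    """Wavelength in meters for a wavenumber sigma in 1/m (lambda = 1/sigma)."""
    return 1.0 / sigma


def frequency(sigma):
    """Frequency in Hz from Eq. (4.19c), f = sigma * c."""
    return sigma * C


def phase(sigma, z, t, phi=0.0):
    """Argument of the cosines in Eqs. (4.19a) and (4.19b)."""
    return 2.0 * math.pi * sigma * (z - C * t) + phi


sigma = 2.0e6                        # assumed example: 2.0e6 per meter
lam = wavelength(sigma)              # about 5.0e-7 m
period = 1.0 / frequency(sigma)      # time for one oscillation at fixed z

step_in_z = phase(sigma, lam, 0.0) - phase(sigma, 0.0, 0.0)      # close to +2*pi
step_in_t = phase(sigma, 0.0, period) - phase(sigma, 0.0, 0.0)   # close to -2*pi
```

The phase advances by $2\pi$ over one wavelength in z and falls by $2\pi$ over one period in t; a cosine is unchanged in both cases.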


This analysis obviously applies to any

$$\vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \quad \text{and} \quad \vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)}$$

pair of terms from formulas (4.11a) and (4.11b). Since the pair of sums in (4.11a) and (4.11b) is a
general solution to the vector wave equations, this sort of general solution can now be interpreted
as a sum over an arbitrary collection of monochromatic plane waves characterized by different
wavenumbers and directions of propagation, where for each wavenumber $\sigma_A$, there is a unique
frequency $c\sigma_A$.

FIGURE 4.4. [Figure: the $\vec{E}$ and $\vec{B}$ field vectors on planes of constant z, with the distance $1/\sigma_A$ marked along
the z axis and the unit vector $\hat{\Omega}_j$ pointing along z.]

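From Eqs. (4.19a) and (4.19b), we get

$$\begin{aligned}
\left\{ \mathrm{Re}\left[ \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] \right\} \cdot \left\{ \mathrm{Re}\left[ \vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] \right\}
&= -\frac{1}{c} |E_{Ajx}|\, |E_{Ajy}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajx} \right) \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajy} \right) \\
&\quad + \frac{1}{c} |E_{Ajx}|\, |E_{Ajy}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajx} \right) \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajy} \right) = 0,
\end{aligned} \tag{4.20}$$

showing that the real $\vec{E}$ and $\vec{B}$ fields of a monochromatic plane wave are always perpendicular
to each other while they oscillate. From (4.17a), (4.17b), (4.17d), and (4.17e), we get

$$\vec{E}_{Aj} \cdot \vec{B}_{Aj} = E_{Ajx} B_{Ajx} + E_{Ajy} B_{Ajy} = E_{Ajx} \left( -\frac{1}{c} E_{Ajy} \right) + E_{Ajy} \left( \frac{1}{c} E_{Ajx} \right) = 0. \tag{4.21a}$$

It follows that in Eqs. (4.16a) and (4.16b)

$$\vec{E}(\vec{r}, t) \cdot \vec{B}(\vec{r}, t) = \left( \vec{E}_{Aj} \cdot \vec{B}_{Aj} \right) e^{4\pi i \sigma_A (z - ct)} = 0. \tag{4.21b}$$

In this sense, we can say that the complex monochromatic plane wave $\vec{E}$ and $\vec{B}$ fields are also
perpendicular to each other. Another result worth deriving, again using Eqs. (4.17a), (4.17b),
(4.17d), and (4.17e), is that

$$\begin{aligned}
\vec{E}_{Aj} \times \vec{B}_{Aj}^{*} &= [\hat{x} E_{Ajx} + \hat{y} E_{Ajy}] \times [\hat{x} B_{Ajx}^{*} + \hat{y} B_{Ajy}^{*}] = \hat{z}\, E_{Ajx} B_{Ajy}^{*} - \hat{z}\, E_{Ajy} B_{Ajx}^{*} \\
&= \left[ E_{Ajx} \left( c^{-1} E_{Ajx}^{*} \right) + E_{Ajy} \left( c^{-1} E_{Ajy}^{*} \right) \right] \hat{z} \\
&= \frac{1}{c} \left( \vec{E}_{Aj} \cdot \vec{E}_{Aj}^{*} \right) \hat{z} = \frac{1}{c} \left( \vec{E}_{Aj} \cdot \vec{E}_{Aj}^{*} \right) \hat{\Omega}_j,
\end{aligned} \tag{4.21c}$$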

where we use $\hat{x} \times \hat{x} = \hat{y} \times \hat{y} = 0$ and $\hat{x} \times \hat{y} = \hat{z} = \hat{\Omega}_j$. Vector identities that, like Eqs. (4.21a) and
(4.21c), can be written using only dot products and cross products, hold true in all (proper)
coordinate systems if they hold true in any one (proper) coordinate system.55 Choosing a new
coordinate system where the $\hat{z}$ unit vector is not the same as the $\hat{\Omega}_j$ propagation vector is
geometrically equivalent to specifying a new direction for the propagation vector that is not
parallel to the original $\hat{z}$ unit vector. Since (4.21a) and (4.21c) use only dot and cross products,
they must also hold true in those coordinate systems where $\hat{\Omega}_j$ is not parallel to $\hat{z}$. Hence we can
conclude that Eqs. (4.21a) and (4.21c) must be obeyed when the A, j monochromatic plane wave
propagates in any direction, not just when it propagates parallel to the z axis. Therefore the
double sums over A and j in Eqs. (4.11a) and (4.11b) must all have coefficients $\vec{E}_{Aj}$ and $\vec{B}_{Aj}$
satisfying Eqs. (4.21a) and (4.21c), with

$$\vec{E}_{Aj} \cdot \vec{B}_{Aj} = 0 \tag{4.22a}$$

and

$$\vec{E}_{Aj} \times \vec{B}_{Aj}^{*} = \frac{1}{c} \left( \vec{E}_{Aj} \cdot \vec{E}_{Aj}^{*} \right) \hat{\Omega}_j. \tag{4.22b}$$

Similarly, the perpendicularity of the real, physical $\vec{E}$ and $\vec{B}$ fields as they oscillate in Eq. (4.20)
cannot be affected by the choice of coordinate system, which means the oscillating $\vec{E}$ and $\vec{B}$
fields stay perpendicular when $\hat{z}$ is not chosen parallel to $\hat{\Omega}_j$. Since, once again, this is
geometrically equivalent to specifying a new direction of propagation, we conclude that the real
oscillating $\vec{E}$ and $\vec{B}$ fields are perpendicular for all $\hat{\Omega}_j$ vectors; that is, they are perpendicular
no matter in what direction the wavefield propagates.

4.4 Linear Polarization of Monochromatic Plane Waves


Equations (4.19a) and (4.19b) specify an acceptable monochromatic plane wave (that is, they
specify an acceptable term in the double-sum solutions in Eqs. (4.11a) and (4.11b)) no matter
what values are given to the real constants $|E_{Ajx}|$, $|E_{Ajy}|$, $\phi_{Ajx}$, and $\phi_{Ajy}$. If we again use a Cartesian
coordinate system with $\hat{z} = \hat{\Omega}_j$ and choose $E_{Ajy} = 0$, then from Eqs. (4.18e) and (4.18a) we get

$$\vec{E}_{Aj} = \hat{x}\, |E_{Ajx}|\, e^{i \phi_{Ajx}} = \hat{x} E_{Ajx}. \tag{4.23a}$$

55
The cross product is invariant only if the coordinate systems are all chosen to be left-handed or all chosen to be
right-handed. This book uses right-handed coordinate systems, sometimes referred to as proper coordinate systems,
where the $\hat{x}$, $\hat{y}$, $\hat{z}$ vectors are always chosen so that $\hat{x} \times \hat{y} = \hat{z}$.

Since $E_{Ajy} = 0$, Eqs. (4.18f) and (4.18a) give

$$\vec{B}_{Aj} = \hat{y}\, \frac{1}{c} |E_{Ajx}|\, e^{i \phi_{Ajx}} = \hat{y}\, \frac{1}{c} E_{Ajx}. \tag{4.23b}$$

Setting $E_{Ajy} = 0$ in Eqs. (4.19a) and (4.19b) now leads to

$$\mathrm{Re}\left[ \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] = \hat{x}\, |E_{Ajx}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajx} \right) \tag{4.23c}$$

and

$$\mathrm{Re}\left[ \vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] = \hat{y}\, \frac{1}{c} |E_{Ajx}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajx} \right). \tag{4.23d}$$

Equations (4.23a)–(4.23d) describe a plane wave whose real electric-field vector always points
strictly along the x axis and whose real magnetic-induction vector always points strictly along the
y axis. Characterizing this wave by the direction of the electric-field vector, we call it linearly
polarized along the x axis, or x-polarized for short (see Fig. 4.5). Equation (4.23a) shows that in
an x-polarized plane wave the complex vector $\vec{E}_{Aj}$ is the $\hat{x}$ unit vector multiplied by a complex
constant $E_{Ajx}$, which, of course, means that in (4.23b) the complex vector $\vec{B}_{Aj}$ must be the $\hat{y}$
unit vector multiplied by the complex constant $E_{Ajx}/c$.
To get a monochromatic plane wave that is linearly polarized in the y direction, we choose
$E_{Ajx} = 0$. Then, repeating the analysis used to find Eqs. (4.23a)–(4.23d), we have

$$\vec{E}_{Aj} = \hat{y}\, |E_{Ajy}|\, e^{i \phi_{Ajy}} = \hat{y} E_{Ajy}, \tag{4.24a}$$

$$\vec{B}_{Aj} = -\hat{x}\, \frac{1}{c} |E_{Ajy}|\, e^{i \phi_{Ajy}} = -\hat{x}\, \frac{1}{c} E_{Ajy}, \tag{4.24b}$$

$$\mathrm{Re}\left[ \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] = \hat{y}\, |E_{Ajy}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajy} \right), \tag{4.24c}$$

and

$$\mathrm{Re}\left[ \vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] = -\hat{x}\, \frac{1}{c} |E_{Ajy}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajy} \right). \tag{4.24d}$$

The monochromatic plane wave described by Eqs. (4.24a)–(4.24d) has an electric-field vector
that always points along the y axis and a magnetic induction vector that always points along the
−x axis (see Fig. 4.6). Equation (4.24a) shows that y polarization can be recognized by noting that
the complex vector $\vec{E}_{Aj}$ is the $\hat{y}$ unit vector multiplied by a complex constant $E_{Ajy}$ [with,
according to (4.24b), complex vector $\vec{B}_{Aj}$ being the $\hat{x}$ unit vector multiplied by the complex
constant $(-E_{Ajy}/c)$].

FIGURE 4.5. One wavelength of a monochromatic plane wave linearly polarized in the x direction and propagating
in the z direction. [Figure: E field vectors and B field vectors.]

Writing down Eqs. (4.19a) and (4.19b) again while switching the order of addition in the
second equation gives

$$\mathrm{Re}\left[ \vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] = \hat{x}\, |E_{Ajx}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajx} \right) + \hat{y}\, |E_{Ajy}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajy} \right)$$

and

$$\mathrm{Re}\left[ \vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} \right] = \hat{y}\, \frac{1}{c} |E_{Ajx}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajx} \right) - \hat{x}\, \frac{1}{c} |E_{Ajy}| \cos\left( 2\pi \sigma_A (z - ct) + \phi_{Ajy} \right).$$

Clearly, the first term in the general formula for the E field and the first term in the general
formula for the B field can be grouped together and called an x-polarized wave, and similarly the
second terms in the general formulas can be grouped together and called a y-polarized wave. This
shows that the E field of an arbitrary monochromatic plane wave (that is, a plane wave where
neither $E_{Ajx}$ nor $E_{Ajy}$ is automatically zero) can be represented as the sum of the E field of a
monochromatic plane wave linearly polarized in the x direction and the E field of a
monochromatic plane wave linearly polarized in the y direction. Similarly, the B field of that
same monochromatic plane wave can be represented as the sum of the B field of the
corresponding x-polarized plane wave and the B field of the corresponding y-polarized plane
wave. This point is often made by stating that any monochromatic plane wave can be written as
the sum of an x-polarized plane wave and a y-polarized plane wave.

FIGURE 4.6. One wavelength of a monochromatic plane wave linearly polarized in the y direction and propagating
in the z direction. [Figure: E field vectors and B field vectors.]
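The decomposition just described can be sketched numerically. The example below is illustrative only: the amplitudes, phases, wavenumber, and sample point are made-up values, and `real_fields` is a hypothetical helper that simply evaluates Eqs. (4.19a) and (4.19b) for propagation along z.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def real_fields(sigma, z, t, ex_mag, ey_mag, phi_x, phi_y):
    """Real E and B vectors of Eqs. (4.19a) and (4.19b) for a wave traveling along z."""
    cos_x = math.cos(2.0 * math.pi * sigma * (z - C * t) + phi_x)
    cos_y = math.cos(2.0 * math.pi * sigma * (z - C * t) + phi_y)
    e_field = (ex_mag * cos_x, ey_mag * cos_y, 0.0)
    b_field = (-ey_mag * cos_y / C, ex_mag * cos_x / C, 0.0)
    return e_field, b_field


def add(u, v):
    """Component-wise sum of two vectors."""
    return tuple(a + b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


sigma, z, t = 2.0e6, 1.23e-7, 4.0e-16          # assumed sample values
e_full, b_full = real_fields(sigma, z, t, 3.0, 1.5, 0.4, -1.1)
e_xpol, b_xpol = real_fields(sigma, z, t, 3.0, 0.0, 0.4, -1.1)   # Eqs. (4.23c), (4.23d)
e_ypol, b_ypol = real_fields(sigma, z, t, 0.0, 1.5, 0.4, -1.1)   # Eqs. (4.24c), (4.24d)
e_sum, b_sum = add(e_xpol, e_ypol), add(b_xpol, b_ypol)
```

Here `e_sum` and `b_sum` reproduce `e_full` and `b_full`, and the real E and B vectors of each wave are perpendicular, as in Eq. (4.20).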

4.5 Transmitted Plane Waves


Figure 4.7 shows a monochromatic plane wave incident on a thin film of optical material placed
at an angle to the axis of propagation. Note that we have again chosen the $\hat{z}$ unit vector equal to
$\hat{\Omega}_j$, the propagation vector of the incident plane wave. This means, according to Eqs. (4.19a) and
(4.19b), that the incident plane wave can be represented by the real part of

$$\vec{E}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} = \left( \hat{x}\, |E_{Ajx}|\, e^{i \phi_{Ajx}} + \hat{y}\, |E_{Ajy}|\, e^{i \phi_{Ajy}} \right) e^{2\pi i \sigma_A (z - ct)} \tag{4.25a}$$

and the real part of

$$\vec{B}_{Aj}\, e^{2\pi i \sigma_A (\hat{z} \cdot \vec{r} - ct)} = \left( -\hat{x}\, \frac{1}{c} |E_{Ajy}|\, e^{i \phi_{Ajy}} + \hat{y}\, \frac{1}{c} |E_{Ajx}|\, e^{i \phi_{Ajx}} \right) e^{2\pi i \sigma_A (z - ct)}. \tag{4.25b}$$

The thin film divides the space in Fig. 4.7 into two regions labeled A and B. Equations (4.25a)
and (4.25b) only apply to points in region A, the region occupied by the incident wavefield. The
unit normal vector $\hat{n}$ of the surface on which the plane wave is incident lies in the y, z plane of
the coordinate system, making an angle $\psi_j$ with respect to the z axis. Angle $\psi_j$ is called the
angle of incidence, and we give it an index j because it specifies the direction of the $\hat{\Omega}_j$
propagation vector with respect to $\hat{n}$. The interaction of the plane wave with the film creates a
transmitted radiation field in region B that also propagates in the $\hat{\Omega}_j = \hat{z}$ direction, and a
reflected radiation field in region A that propagates in the direction

$$\hat{\Omega}_j^{(r)} = \hat{\Omega}_j - 2\hat{n} \left( \hat{\Omega}_j \cdot \hat{n} \right) \tag{4.26a}$$

or

$$\hat{\Omega}_j^{(r)} = \hat{z} + 2\hat{n} \left( \cos\psi_j \right). \tag{4.26b}$$
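Equations (4.26a) and (4.26b) can be tried out with a few lines of code. This is a sketch under one explicit assumption that Fig. 4.7 is taken to imply but the text does not state: the normal $\hat{n}$ points back toward the incident side, so that $\hat{z} \cdot \hat{n} = -\cos\psi_j$. With that choice, written here as $\hat{n} = (0, \sin\psi_j, -\cos\psi_j)$, the two formulas agree. The angle value and helper names are illustrative.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reflect(omega_hat, n_hat):
    """Eq. (4.26a): reflected propagation vector omega_hat - 2 n_hat (omega_hat . n_hat)."""
    d = dot(omega_hat, n_hat)
    return tuple(o - 2.0 * d * n for o, n in zip(omega_hat, n_hat))


psi = math.radians(30.0)                         # assumed angle of incidence
z_hat = (0.0, 0.0, 1.0)                          # incident propagation vector
n_hat = (0.0, math.sin(psi), -math.cos(psi))     # ASSUMPTION: normal faces the incident side

omega_r = reflect(z_hat, n_hat)                                         # Eq. (4.26a)
omega_r_b = tuple(z + 2.0 * n * math.cos(psi) for z, n in zip(z_hat, n_hat))  # Eq. (4.26b)
```

The result has unit length and no x component, so the reflected propagation vector stays in the y, z plane of incidence.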

Both the transmitted and reflected wavefields have the same $\sigma_A$ wavenumber as the incident
wave. For any wavefield incident on a flat surface, the plane of incidence is defined to be that
plane containing both the surface normal $\hat{n}$ and the incident propagation vector $\hat{\Omega}_j$. Equation
(4.26a) shows that the $\hat{\Omega}_j^{(r)}$ propagation vector of the reflected wave automatically lies in the
same plane as $\hat{n}$ and $\hat{\Omega}_j$. In Fig. 4.7, the plane of incidence is the y, z plane of the coordinate
system.

FIGURE 4.7. [Figure: regions A and B on either side of the thin film, the incident propagation vector $\hat{\Omega}_j = \hat{z}$
along the z axis, the surface normal $\hat{n}$, the angle $\psi_j$, and the reflected propagation vector $\hat{\Omega}_j^{(r)}$.]

Since the transmitted radiation field is also a monochromatic plane wave traveling down the z
axis, the E and B fields of the wave can still be found from the real parts of complex plane wave
solutions such as the ones given in Eqs. (4.16a) and (4.16b),

$$\vec{E}_{Aj}^{(t)}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} = \vec{E}_{Aj}^{(t)}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.27a}$$

and

$$\vec{B}_{Aj}^{(t)}\, e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} = \vec{B}_{Aj}^{(t)}\, e^{2\pi i \sigma_A (z - ct)}, \tag{4.27b}$$

where the (t) superscript specifies the transmitted wavefield and Eqs. (4.27a) and (4.27b) are
assumed to apply only to region B in Fig. 4.7. The complex vector $\vec{E}_{Aj}^{(t)}$ can be written as

$$\vec{E}_{Aj}^{(t)} = \hat{x} E_{Ajx}^{(t)} + \hat{y} E_{Ajy}^{(t)}$$

with the two complex numbers $E_{Ajx}^{(t)}$ and $E_{Ajy}^{(t)}$ representing its x and y components. Equations
(4.18e) and (4.18f) show that the complex vectors $\vec{E}_{Aj}^{(t)}$, $\vec{B}_{Aj}^{(t)}$ can now be written as

$$\vec{E}_{Aj}^{(t)} = \hat{x}\, |E_{Ajx}^{(t)}|\, e^{i \phi_{Ajx}^{(t)}} + \hat{y}\, |E_{Ajy}^{(t)}|\, e^{i \phi_{Ajy}^{(t)}} \tag{4.27c}$$

and

$$\vec{B}_{Aj}^{(t)} = -\hat{x}\, \frac{1}{c} |E_{Ajy}^{(t)}|\, e^{i \phi_{Ajy}^{(t)}} + \hat{y}\, \frac{1}{c} |E_{Ajx}^{(t)}|\, e^{i \phi_{Ajx}^{(t)}}, \tag{4.27d}$$

where we have used the two real constants $\phi_{Ajx}^{(t)}$ and $\phi_{Ajy}^{(t)}$ to represent the phases of $E_{Ajx}^{(t)}$ and $E_{Ajy}^{(t)}$
respectively. We require the film to be nonbirefringent, to be not optically active, and to have an index
of refraction that is constant in layers parallel to its surface; that is, the index of refraction can
only depend on the distance from the film’s surface. If the film absorbs radiant energy, we
account for it in the usual way by making its index of refraction complex.56 This sort of film turns
out to be an adequate model for the partially transmitting, partially reflecting layer of a Michelson
interferometer’s beam splitter.
56
Leonard Eyges, The Classical Electromagnetic Field (Dover Publications, Inc., New York, 1972), p. 340.

When the plane wave incident on the film has $E_{Ajy} = 0$ or $E_{Ajx} = 0$, making the wave in Eqs.
(4.25a) and (4.25b) linearly x-polarized or linearly y-polarized respectively, the transmitted wave
must have the same type of linear polarization.57 Hence, when $E_{Ajy} = 0$ in (4.25a) and (4.25b),
the transmitted plane wave must also be linearly polarized along the x axis, making $E_{Ajy}^{(t)} = 0$ in
Eqs. (4.27c) and (4.27d); and when $E_{Ajx} = 0$, the transmitted plane wave, which must be linearly
polarized along the y axis, has $E_{Ajx}^{(t)} = 0$ in (4.27c) and (4.27d).
Consulting Eqs. (4.25a) and (4.25b), we see that for linear polarization along the x axis with
$E_{Ajy} = 0$, the incident plane wave is given by the real part of

$$\hat{x}\, |E_{Ajx}|\, e^{i \phi_{Ajx}}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.28a}$$

for the electric field and the real part of

$$\hat{y}\, \frac{1}{c} |E_{Ajx}|\, e^{i \phi_{Ajx}}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.28b}$$

for the magnetic induction. The corresponding transmitted plane wave is given by the real part of

$$\hat{x}\, |E_{Ajx}^{(t)}|\, e^{i \phi_{Ajx}^{(t)}}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.29a}$$

for the electric field and the real part of

$$\hat{y}\, \frac{1}{c} |E_{Ajx}^{(t)}|\, e^{i \phi_{Ajx}^{(t)}}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.29b}$$

for the magnetic induction [see Eqs. (4.27c) and (4.27d) with $E_{Ajy}^{(t)} = 0$]. The ratio of the complex
transmitted electric field’s x component in (4.29a) to the complex incident electric field’s x
component in (4.28a) is the complex coefficient

$$t_s = \frac{|E_{Ajx}^{(t)}|}{|E_{Ajx}|}\, e^{i \left( \phi_{Ajx}^{(t)} - \phi_{Ajx} \right)}. \tag{4.30a}$$

We see by inspection that this is the same as the ratio of the two complex magnetic inductions in
(4.29b) and (4.28b). Consequently, no matter what happens inside the film to produce the
transmitted x-polarized wave, the process can be described by a complex parameter $t_s$, which in
general is a function of the wavenumber $\sigma_A$ and $\psi_j$, the angle of incidence in Fig. 4.7,

$$t_s = t_s(\sigma_A, \psi_j). \tag{4.30b}$$

57
Max Born and Emil Wolf, Principles of Optics: Electromagnetic Theory of Propagation, Interference, and
Diffraction of Light, 7th (expanded) ed. (Cambridge University Press, New York, 1999), p. 55.

The subscript s in Eqs. (4.30a) and (4.30b) is traditionally applied to incident plane waves whose
electric field is linearly polarized perpendicular to the plane of incidence, and parameter $t_s$ is
called the s-wave amplitude-transmission coefficient.58
It is important to note that $t_s$ does not depend on either $|E_{Ajx}|$ or $\phi_{Ajx}$, giving it the same value
for all monochromatic plane waves having equal wavenumbers and angles of incidence.59
Equations (4.28a), (4.28b), (4.29a), and (4.29b) and the definition of parameter $t_s(\sigma_A, \psi_j)$ in
(4.30a) let us write

$$\vec{E}_{Ajs}^{(t)}\, e^{2\pi i \sigma_A (z - ct)} = t_s(\sigma_A, \psi_j) \cdot \vec{E}_{Ajs}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.31a}$$

and

$$\vec{B}_{Ajs}^{(t)}\, e^{2\pi i \sigma_A (z - ct)} = t_s(\sigma_A, \psi_j) \cdot \vec{B}_{Ajs}\, e^{2\pi i \sigma_A (z - ct)}, \tag{4.31b}$$

where

$$\vec{E}_{Ajs} = \hat{x}\, |E_{Ajx}|\, e^{i \phi_{Ajx}}, \quad \vec{B}_{Ajs} = \hat{y}\, \frac{1}{c} |E_{Ajx}|\, e^{i \phi_{Ajx}}, \tag{4.31c}$$

and

$$\vec{E}_{Ajs}^{(t)} = \hat{x}\, |E_{Ajx}^{(t)}|\, e^{i \phi_{Ajx}^{(t)}}, \quad \text{and} \quad \vec{B}_{Ajs}^{(t)} = \hat{y}\, \frac{1}{c} |E_{Ajx}^{(t)}|\, e^{i \phi_{Ajx}^{(t)}}. \tag{4.31d}$$

58
This notation can be traced back to the German word for perpendicular, senkrecht.
59
O. S. Heavens, Optical Properties of Thin Solid Films (London, Butterworths Scientific Publications, 1955), pp.
46–95.

This shows that to get the complex formula for the transmitted plane wave linearly polarized
perpendicular to the plane of incidence, we need only multiply the complex formula for the
incident plane wave by $t_s(\sigma_A, \psi_j)$. If the plane wavefield incident on the optical film at an angle
$\psi_j$ contains more than one wavenumber (but is still polarized perpendicular to the plane of
incidence), then its electric field is given by the real part of

$$\sum_{A} \vec{E}_{Ajs}\, e^{2\pi i \sigma_A (z - ct)}$$

and its magnetic induction is given by the real part of

$$\sum_{A} \vec{B}_{Ajs}\, e^{2\pi i \sigma_A (z - ct)},$$

where an s subscript has been added to show that all the waves are linearly polarized
perpendicular to the plane of incidence. The s-wave amplitude-transmission coefficient can now
be used to write the complex formulas for the transmitted radiation fields as

$$\sum_{A} \vec{E}_{Ajs}^{(t)}\, e^{2\pi i \sigma_A (z - ct)} = \sum_{A} t_s(\sigma_A, \psi_j) \cdot \vec{E}_{Ajs}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.31e}$$

and

$$\sum_{A} \vec{B}_{Ajs}^{(t)}\, e^{2\pi i \sigma_A (z - ct)} = \sum_{A} t_s(\sigma_A, \psi_j) \cdot \vec{B}_{Ajs}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.31f}$$

because

$$\vec{E}_{Ajs}^{(t)} = t_s(\sigma_A, \psi_j) \cdot \vec{E}_{Ajs} \quad \text{and} \quad \vec{B}_{Ajs}^{(t)} = t_s(\sigma_A, \psi_j) \cdot \vec{B}_{Ajs} \tag{4.31g}$$

for all values of A.
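The bookkeeping in Eqs. (4.31e) and (4.31g) amounts to one complex multiplication per wavenumber. The sketch below shows this for the x components of an s-polarized field. The function `toy_ts` is a made-up stand-in for $t_s(\sigma_A, \psi_j)$ and is not a physical model of any film; its constants, the dictionary layout, and the sample amplitudes are all assumptions for illustration.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s


def toy_ts(sigma, psi):
    """HYPOTHETICAL s-wave amplitude-transmission coefficient, for illustration only."""
    made_up_path = 1.0e-7 / math.cos(psi)   # arbitrary length in meters
    return 0.7 * cmath.exp(2j * math.pi * sigma * made_up_path)


def transmit(incident, psi, ts=toy_ts):
    """Eq. (4.31g): multiply each complex amplitude E_Ajx by t_s(sigma_A, psi_j).

    `incident` maps each wavenumber sigma_A to its complex amplitude E_Ajx.
    """
    return {sigma: ts(sigma, psi) * amp for sigma, amp in incident.items()}


def real_field_x(amplitudes, z, t):
    """Real part of the sum over wavenumbers of amp * exp(2 pi i sigma (z - c t))."""
    total = sum(amp * cmath.exp(2j * math.pi * sigma * (z - C * t))
                for sigma, amp in amplitudes.items())
    return total.real


psi = math.radians(45.0)
incident = {1.5e6: 2.0 * cmath.exp(0.3j), 2.0e6: 1.0 + 0.0j}   # assumed sample amplitudes
transmitted = transmit(incident, psi)
```

The same multiplier applies to the magnetic-induction amplitudes, so only one coefficient per wavenumber and angle of incidence is needed.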


For linear polarization along the y axis with $E_{Ajx} = 0$, Eqs. (4.25a) and (4.25b) show that the
electric field of the incident plane wave is given by the real part of

$$\hat{y}\, |E_{Ajy}|\, e^{i \phi_{Ajy}}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.32a}$$

and the magnetic induction of the incident plane wave is given by the real part of

$$-\hat{x}\, \frac{1}{c} |E_{Ajy}|\, e^{i \phi_{Ajy}}\, e^{2\pi i \sigma_A (z - ct)}. \tag{4.32b}$$

Recalling that the corresponding transmitted plane wave must have the same type of linear
polarization as the incident wave, we set $E_{Ajx}^{(t)} = 0$ in Eqs. (4.27c) and (4.27d) to get that the
electric field of the transmitted plane wave is the real part of

$$\hat{y}\, |E_{Ajy}^{(t)}|\, e^{i \phi_{Ajy}^{(t)}}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.33a}$$

and the magnetic induction of the transmitted plane wave is the real part of

$$-\hat{x}\, \frac{1}{c} |E_{Ajy}^{(t)}|\, e^{i \phi_{Ajy}^{(t)}}\, e^{2\pi i \sigma_A (z - ct)}. \tag{4.33b}$$

The ratio of the complex transmitted electric field in (4.33a) to the complex incident electric field
in (4.32a) is

$$t_p = \frac{|E_{Ajy}^{(t)}|}{|E_{Ajy}|}\, e^{i \left( \phi_{Ajy}^{(t)} - \phi_{Ajy} \right)}. \tag{4.34a}$$

Again, this is the same as the ratio of the two complex magnetic inductions in (4.33b) and
(4.32b), so again the process of transmission is described by a single complex parameter that is a
function of $\sigma_A$ and $\psi_j$ but not of $|E_{Ajy}|$ or $\phi_{Ajy}$,

$$t_p = t_p(\sigma_A, \psi_j). \tag{4.34b}$$

The p subscript is traditionally applied to incident plane waves whose electric field is linearly
polarized parallel to the plane of incidence, and parameter $t_p$ is called the p-wave
amplitude-transmission coefficient.60 When the incident wavefield contains more than one wavenumber and
every monochromatic component is a p-type plane wave, its electric field is given by the real part
of

$$\sum_{A} \vec{E}_{Ajp}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.35a}$$

and its magnetic induction is given by the real part of

$$\sum_{A} \vec{B}_{Ajp}\, e^{2\pi i \sigma_A (z - ct)}, \tag{4.35b}$$

where

$$\vec{E}_{Ajp} = \hat{y}\, |E_{Ajy}|\, e^{i \phi_{Ajy}} \quad \text{and} \quad \vec{B}_{Ajp} = -\hat{x}\, \frac{1}{c} |E_{Ajy}|\, e^{i \phi_{Ajy}} \tag{4.35c}$$

with the p subscript showing that the waves are linearly polarized parallel to the plane of
incidence. To get the complex formula for the transmitted plane wave linearly polarized parallel
to the plane of incidence, we need only multiply the complex term for each incident plane wave
by $t_p(\sigma_A, \psi_j)$ to get

$$\sum_{A} \vec{E}_{Ajp}^{(t)}\, e^{2\pi i \sigma_A (z - ct)} = \sum_{A} t_p(\sigma_A, \psi_j) \cdot \vec{E}_{Ajp}\, e^{2\pi i \sigma_A (z - ct)} \tag{4.35d}$$

and

$$\sum_{A} \vec{B}_{Ajp}^{(t)}\, e^{2\pi i \sigma_A (z - ct)} = \sum_{A} t_p(\sigma_A, \psi_j) \cdot \vec{B}_{Ajp}\, e^{2\pi i \sigma_A (z - ct)}. \tag{4.35e}$$

60
This notation can also be traced back to German scientists, with the German word for parallel spelled the same as
in English, parallel.

The details of the mathematics used here to represent the incident and transmitted wavefields have an unfortunate tendency to conceal the basic ideas behind what is being done. No matter what the orientation of the E field in the incident monochromatic plane wave (parallel or perpendicular to the plane of incidence), terms having the form

\[ A\, e^{i(k_\ell z - \omega_\ell t)} \]

(with $k_\ell = 2\pi\sigma_\ell$ and $\omega_\ell = 2\pi\sigma_\ell c$) are used to describe the electromagnetic wavefields on the incident side of the thin film, and terms such as

\[ \tau A\, e^{i(k_\ell z - \omega_\ell t)} \]

are used to describe the electromagnetic wavefields on the transmitted side of the thin film. Here, $\tau$ is a complex number standing for either $t_s$ or $t_p$ in the above formulas; and $A$ is a complex number standing for either the x or y components of the E and B fields' complex amplitudes, for example, $E_{\ell jx}$, $B_{\ell jy}$, etc. If we write the complex $A$ value as

\[ A = |A|\, e^{i\phi_A} , \]

then

\[ A\, e^{i(k_\ell z - \omega_\ell t)} = |A|\, e^{i(k_\ell z - \omega_\ell t + \phi_A)} \]

and

\[ \tau A\, e^{i(k_\ell z - \omega_\ell t)} = \tau\, |A|\, e^{i(k_\ell z - \omega_\ell t + \phi_A)} . \]

If the incident monochromatic wavefield is shifted forward or back along the z axis (that is, along its direction of propagation) by a distance $z_0$, then $z \to z - z_0$ so that

\[ A\, e^{i(k_\ell z - \omega_\ell t)} \;\to\; |A|\, e^{i(k_\ell z - \omega_\ell t + \phi_A - k_\ell z_0)} = A\, e^{-i k_\ell z_0}\, e^{i(k_\ell z - \omega_\ell t)} . \]

To change the amplitude of the incident wavefield to some fraction of its original value, we multiply $A$ by a real number $\alpha$ between zero and one to get

\[ A\, e^{-i k_\ell z_0}\, e^{i(k_\ell z - \omega_\ell t)} \;\to\; \alpha A\, e^{-i k_\ell z_0}\, e^{i(k_\ell z - \omega_\ell t)} . \]

The complex $\tau$ parameter can be written as

\[ \tau = |\tau|\, e^{i\phi_\tau} , \]

which means the transmitted wavefield that is specified above to be

\[ \tau A\, e^{i(k_\ell z - \omega_\ell t)} \]

becomes

\[ \tau A\, e^{i(k_\ell z - \omega_\ell t)} = |\tau|\, A\, e^{i\phi_\tau}\, e^{i(k_\ell z - \omega_\ell t)} . \]

Comparing the right-hand side of this equation to the expression

\[ \alpha A\, e^{-i k_\ell z_0}\, e^{i(k_\ell z - \omega_\ell t)} \]

for the incident wavefield shifted by $z_0$ and diminished by a real factor $\alpha$, we note that

\[ \alpha = |\tau| \]

and

\[ -k_\ell z_0 = \phi_\tau . \]

Hence, all that happens when we multiply a wavefield specified to be

\[ A\, e^{i(k_\ell z - \omega_\ell t)} \]

by a complex parameter $\tau$ to get

\[ \tau A\, e^{i(k_\ell z - \omega_\ell t)} \]

is that the amplitude $|A|$ of the original wavefield changes to $|\tau|\,|A|$ and the oscillations of the wavefield are moved forward or back by a distance

\[ -\frac{\phi_\tau}{k_\ell} = -\frac{\arg(\tau)}{k_\ell} \]

along the direction of propagation. This mathematical fact—knowing what happens when the
complex expression for a monochromatic wavefield is multiplied by a complex parameter—gives
meaning to the formulas derived in the first part of this section. Monochromatic wavefields
transmitted through the thin film in Fig. 4.7 have their amplitudes diminished by ts if the E field
is perpendicular to the plane of incidence and by t p if the E field is parallel to the plane of
incidence. The oscillations of the transmitted wavefields are also moved forward or back with


respect to the incident wavefield as specified by the complex phases or arguments of ts and t p .
How much the wavefields shift and change in amplitude depends on the angle of incidence and
wavenumber—that is why t s and t p are written as functions of ψ j and σ A .
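The statement that multiplying a monochromatic term by a complex parameter only rescales its amplitude and slides its oscillations along z can be checked numerically. The sketch below is illustrative only: the names `tau`, `k`, `omega`, and the numerical values are assumptions chosen for the example, with `k` standing for 2π times the wavenumber and `omega` for `k` times c.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def wave(amplitude, k, omega, z, t):
    """Complex monochromatic term amplitude * exp(i(k z - omega t))."""
    return amplitude * np.exp(1j * (k * z - omega * t))


def equivalent_shift_and_scale(tau, k):
    """Real scale factor and z shift that reproduce multiplication by tau."""
    alpha = abs(tau)           # amplitude factor |tau|
    z0 = -np.angle(tau) / k    # shift distance -arg(tau)/k
    return alpha, z0


sigma = 1.0e5                  # illustrative wavenumber, cycles per metre
k = 2.0 * np.pi * sigma
omega = k * C
A = 0.7 * np.exp(0.4j)         # illustrative complex amplitude
tau = 0.5 * np.exp(1.2j)       # hypothetical transmission coefficient

z = np.linspace(0.0, 3.0 / sigma, 13)
t = 2.0e-15

alpha, z0 = equivalent_shift_and_scale(tau, k)
direct = tau * wave(A, k, omega, z, t)            # multiply by tau
shifted = alpha * wave(A, k, omega, z - z0, t)    # scale and shift instead
```

Under these assumptions `direct` and `shifted` agree to rounding error at every sampled z, which is the content of the argument above.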
From the work done in Sec. 4.4, we know that any monochromatic plane wave having a
propagation vector parallel to the z axis can be analyzed as the sum of a monochromatic plane
wave linearly polarized along the x axis and a monochromatic plane wave linearly polarized
along the y axis. This means that any monochromatic plane wave incident on the optical film in
Fig. 4.7 can be treated as the sum of an s-type monochromatic plane wave and a p-type
monochromatic plane wave. Consequently, we expect an arbitrary plane wavefield incident along
the z axis in region A of Fig. 4.7 to have both s-type and p-type components, with its electric field
given by the real part of
\[ \sum_{\ell} \vec{E}_{\ell js}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_{\ell} \vec{E}_{\ell jp}\, e^{2\pi i \sigma_\ell (z - ct)} \tag{4.36a} \]

and its magnetic induction given by the real part of


\[ \sum_{\ell} \vec{B}_{\ell js}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_{\ell} \vec{B}_{\ell jp}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.36b} \]

The recipe for taking this combined wavefield through the optical film into region B of Fig. 4.7 is
to multiply each s-wave component and p-wave component by the appropriate s-wave and p-
wave amplitude-transmission coefficients. Hence, the electric field for the transmitted wave in
region B is the real part of
\[ \sum_{\ell} t_s(\sigma_\ell, \psi_j)\, \vec{E}_{\ell js}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_{\ell} t_p(\sigma_\ell, \psi_j)\, \vec{E}_{\ell jp}\, e^{2\pi i \sigma_\ell (z - ct)} \tag{4.36c} \]

and the magnetic induction is the real part of


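\[ \sum_{\ell} t_s(\sigma_\ell, \psi_j)\, \vec{B}_{\ell js}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_{\ell} t_p(\sigma_\ell, \psi_j)\, \vec{B}_{\ell jp}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.36d} \]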

Thus the transmission of any plane wavefield containing many different wavenumbers—that is,
the transmission of any polychromatic plane wave—can be handled by writing each incident
monochromatic wave as the sum of an s-wave and a p-wave, as shown in (4.36a) and (4.36b), and
then multiplying each s-wave and p-wave in that sum by the correct s-wave and p-wave
amplitude-transmission coefficient, as shown in (4.36c) and (4.36d).
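The recipe in (4.36a) and (4.36c) can be sketched in a few lines: store the complex s-wave (x-polarized) and p-wave (y-polarized) amplitudes per wavenumber, multiply each by its coefficient, and take the real part of the sum. The coefficient functions `toy_t_s` and `toy_t_p` below are invented placeholders, not formulas from the text, and all numerical values are illustrative assumptions.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s


def transmit(E_s, E_p, sigma, psi, t_s, t_p):
    """Multiply every s and p component by its transmission coefficient."""
    return t_s(sigma, psi) * E_s, t_p(sigma, psi) * E_p


def real_field(E_s, E_p, sigma, z, t):
    """Real (x, y) electric field of the sum over wavenumbers at (z, t)."""
    phase = np.exp(2j * np.pi * sigma * (z - C * t))
    return np.real(np.sum(E_s * phase)), np.real(np.sum(E_p * phase))


def toy_t_s(sigma, psi):
    """Hypothetical s-wave coefficient, for illustration only."""
    return 0.9 * np.cos(psi) * np.exp(1e-6j * sigma)


def toy_t_p(sigma, psi):
    """Hypothetical p-wave coefficient, for illustration only."""
    return 0.8 * np.cos(psi) * np.exp(2e-6j * sigma)


sigma = np.array([1.0e5, 1.5e5, 2.2e5])          # wavenumbers, cycles per metre
E_s = np.array([1.0 + 0.0j, 0.3j, -0.2 + 0.1j])  # complex x amplitudes
E_p = np.array([0.5j, 0.4 + 0.0j, 0.1 - 0.3j])   # complex y amplitudes
psi = np.deg2rad(45.0)                           # angle of incidence

E_s_t, E_p_t = transmit(E_s, E_p, sigma, psi, toy_t_s, toy_t_p)
Ex_t, Ey_t = real_field(E_s_t, E_p_t, sigma, z=1.0e-3, t=0.0)
```

Because the coefficients depend on wavenumber, each spectral component is rescaled and shifted by a different amount, which is why the multiplication has to be done term by term before summing.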


4.6 Reflected Plane Waves


If the incident wavefield in region A of Figs. 4.7 and 4.8 is a monochromatic plane wave with propagation vector $\hat{\Omega}_j = \hat{z}$ and wavenumber $\sigma_\ell$, then the reflected wavefield in region A is a monochromatic plane wave with wavenumber $\sigma_\ell$ and propagation vector $\hat{\Omega}_j^{(r)}$. In Fig. 4.8, we construct a special $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system to analyze the reflected plane wave. The $z^{(r)}$ axis is set parallel to the $\hat{\Omega}_j^{(r)}$ propagation vector, so that $\hat{z}^{(r)} = \hat{\Omega}_j^{(r)}$. Note that, according to the discussion at the end of Sec. 4.2, the sum of the incident and reflected plane waves is still a solution to Maxwell’s equations in region A. We see that the $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system is just the x, y, z coordinate system rotated about the x axis to make $\hat{z}$ parallel to $\hat{\Omega}_j^{(r)}$, so the two coordinate systems have the same origin. Both coordinate systems have the same x axis, so $\hat{x}^{(r)} = \hat{x}$, and to get the y axis of the new coordinate system, we specify $\hat{y}^{(r)} = \hat{z}^{(r)} \times \hat{x}^{(r)} = \hat{\Omega}_j^{(r)} \times \hat{x}$.

When an x, y, z coordinate system is rotated by an angle $\beta$ about its x axis to create a new $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system (see Fig. 4.9), the relationship between the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors and the $\hat{x}^{(r)}$, $\hat{y}^{(r)}$, $\hat{z}^{(r)}$ unit vectors is

\[ \hat{x}^{(r)} = \hat{x} , \tag{4.37a} \]

\[ \hat{y}^{(r)} = \hat{y}\cos\beta + \hat{z}\sin\beta , \tag{4.37b} \]

and

\[ \hat{z}^{(r)} = \hat{z}\cos\beta - \hat{y}\sin\beta . \tag{4.37c} \]

Equations (4.37a)–(4.37c) provide another way of specifying the $\hat{x}^{(r)}$, $\hat{y}^{(r)}$, $\hat{z}^{(r)}$ unit vectors in terms of the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors. Comparing Figs. 4.8 and 4.9, we see that to create the desired $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system in Fig. 4.8, the original x, y, z coordinate system should be rotated around the x axis by an angle in radians of $\beta = 2\psi_j - \pi$ or $\beta = \pi + 2\psi_j$. The two values differ by $2\pi$ and so describe the same rotation.
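A short numerical check of (4.37a)–(4.37c) may help: it builds the rotated unit vectors for a given β, confirms that they satisfy the cross-product definition of the new y axis, and confirms that the two quoted values of β give the same axes. The function name `rotated_axes` and the 30-degree incidence angle are assumptions made for this sketch.

```python
import numpy as np


def rotated_axes(beta):
    """Unit vectors of the rotated system, following (4.37a)-(4.37c)."""
    x_hat = np.array([1.0, 0.0, 0.0])
    y_hat = np.array([0.0, 1.0, 0.0])
    z_hat = np.array([0.0, 0.0, 1.0])
    x_r = x_hat
    y_r = y_hat * np.cos(beta) + z_hat * np.sin(beta)
    z_r = z_hat * np.cos(beta) - y_hat * np.sin(beta)
    return x_r, y_r, z_r


psi = np.deg2rad(30.0)                       # illustrative angle of incidence
axes_a = rotated_axes(2.0 * psi - np.pi)     # beta = 2 psi - pi
axes_b = rotated_axes(np.pi + 2.0 * psi)     # beta = pi + 2 psi

x_r, y_r, z_r = axes_a
y_from_cross = np.cross(z_r, x_r)            # should reproduce y_r
```

For ψ = 30° the rotated z axis has a component of −cos 60° along the original z axis, so the reflected wave travels back toward the incident side, as expected for a reflection.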
Because the reflected plane wave is traveling down the $z^{(r)}$ axis rather than the z axis, when the E and B fields of the wave are specified by the real parts of complex expressions, such as the ones shown in Eqs. (4.16a) and (4.16b), we must replace $\hat{\Omega}_j$ and $z$ by $\hat{\Omega}_j^{(r)}$ and $z^{(r)}$ respectively,

\[ \vec{E}^{(r)}_{\ell j}\, e^{2\pi i \sigma_\ell (\hat{\Omega}_j^{(r)} \cdot \vec{r} - ct)} = \vec{E}^{(r)}_{\ell j}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} \tag{4.38a} \]

and

\[ \vec{B}^{(r)}_{\ell j}\, e^{2\pi i \sigma_\ell (\hat{\Omega}_j^{(r)} \cdot \vec{r} - ct)} = \vec{B}^{(r)}_{\ell j}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} . \tag{4.38b} \]

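FIGURE 4.8. [Figure not reproduced: the incident propagation vector $\hat{\Omega}_j$ and the reflected propagation vector $\hat{\Omega}_j^{(r)}$ at the film surface between regions A and B, with the surface normal $\hat{n}$, the angle $\psi_j$, and the $x = x^{(r)}$, $y^{(r)}$, $z^{(r)}$ axes.]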

The r superscript on the complex $\vec{E}^{(r)}_{\ell j}$ and $\vec{B}^{(r)}_{\ell j}$ vectors shows that they belong to the reflected wave. Vector $\vec{E}^{(r)}_{\ell j}$ in (4.38a) can be written as

\[ \vec{E}^{(r)}_{\ell j} = \hat{x}\, E^{(r)}_{\ell jx} + \hat{y}^{(r)} E^{(r)}_{\ell j y^{(r)}} \tag{4.38c} \]

using two complex numbers $E^{(r)}_{\ell jx}$ and $E^{(r)}_{\ell j y^{(r)}}$ to represent its $\hat{x}$ and $\hat{y}^{(r)}$ components. Although the y subscripts and unit vectors have an r superscript to show that they belong to the $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system, the x subscripts and unit vectors do not need one because $\hat{x}$ and $\hat{x}^{(r)}$ are identical in the two coordinate systems. Following the pattern of Eqs. (4.27c) and (4.27d), we write the complex vectors $\vec{E}^{(r)}_{\ell j}$ and $\vec{B}^{(r)}_{\ell j}$ as

\[ \vec{E}^{(r)}_{\ell j} = \hat{x}\, |E^{(r)}_{\ell jx}|\, e^{i\phi^{(r)}_{\ell jx}} + \hat{y}^{(r)} |E^{(r)}_{\ell j y^{(r)}}|\, e^{i\phi^{(r)}_{\ell j y^{(r)}}} \tag{4.38d} \]

and

\[ \vec{B}^{(r)}_{\ell j} = -\hat{x}\, \frac{1}{c}\, |E^{(r)}_{\ell j y^{(r)}}|\, e^{i\phi^{(r)}_{\ell j y^{(r)}}} + \hat{y}^{(r)} \frac{1}{c}\, |E^{(r)}_{\ell jx}|\, e^{i\phi^{(r)}_{\ell jx}} \tag{4.38e} \]

using the real constants $\phi^{(r)}_{\ell jx}$ and $\phi^{(r)}_{\ell j y^{(r)}}$ to represent the phases of the complex values of $E^{(r)}_{\ell jx}$ and $E^{(r)}_{\ell j y^{(r)}}$ respectively.

FIGURE 4.9. [Figure not reproduced: the x, y, z axes and the $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ axes sharing a common x axis, with the angle $\beta$ between z and $z^{(r)}$ and between y and $y^{(r)}$.]
When the plane wave incident on the optical film is linearly polarized along the x axis or y axis, the reflected wave is linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis or the $\hat{y}^{(r)}$ axis respectively.61
Equations (4.28a) and (4.28b), which give the complex formulas for an incident plane wave that is linearly x-polarized, force the reflected plane wave to be linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis. According to Eq. (4.38d), this reflected wave must have

\[ E^{(r)}_{\ell j y^{(r)}} = 0 \]

for it to be linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis. Equations (4.38a)–(4.38e) then show that the E field of the reflected wave is given by the real part of

\[ \hat{x}\, |E^{(r)}_{\ell jx}|\, e^{i\phi^{(r)}_{\ell jx}}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} \tag{4.39a} \]

and the B field of the reflected wave is given by the real part of

\[ \hat{y}^{(r)} \frac{1}{c}\, |E^{(r)}_{\ell jx}|\, e^{i\phi^{(r)}_{\ell jx}}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} . \tag{4.39b} \]

Comparing these two complex formulas to the complex formulas (4.28a) and (4.28b) for the incident wave, we note that if we consider only the scalar factors that do not depend on position or time, then the $\hat{x}^{(r)} = \hat{x}$ components of the complex E fields together with the $\hat{y}$, $\hat{y}^{(r)}$ components of the complex B fields have the same complex ratio

61
Max Born and Emil Wolf, Principles of Optics, p. 55.


\[ r_s = \frac{|E^{(r)}_{\ell jx}|}{|E_{\ell jx}|}\, e^{i(\phi^{(r)}_{\ell jx} - \phi_{\ell jx})} . \tag{4.40a} \]

Parameter rs is called the s-wave amplitude-reflection coefficient, with s again referring to the
incident plane wave’s being polarized perpendicular to the plane of incidence. In general,

rs = rs (σ A ,ψ j ) , (4.40b)

where rs , like the amplitude-transmission coefficients t s and t p , does not depend on either EAjx
or φAjx ; it is the same for all incident plane waves having the same σ A and ψ j . Comparing the x-
polarized reflected wave in (4.39a) and (4.39b) to the x-polarized incident wave in (4.28a) and
(4.28b), we see that multiplying the complex formulas in (4.28a) and (4.28b) by rs converts them
to the complex formulas in (4.39a) and (4.39b) if ŷ is replaced by yˆ ( r ) and z is replaced by z ( r ) .
Turning to the case of the y-polarized incident wave specified by the complex formulas (4.32a) and (4.32b), we remember that now the reflected wave must be polarized along the $\hat{y}^{(r)}$ axis. This forces $E^{(r)}_{\ell jx} = 0$ in Eqs. (4.38a)–(4.38e), showing the reflected E field is given by the real part of

\[ \hat{y}^{(r)} |E^{(r)}_{\ell j y^{(r)}}|\, e^{i\phi^{(r)}_{\ell j y^{(r)}}}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} \tag{4.41a} \]

and the reflected B field is given by the real part of

\[ -\hat{x}\, \frac{1}{c}\, |E^{(r)}_{\ell j y^{(r)}}|\, e^{i\phi^{(r)}_{\ell j y^{(r)}}}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} . \tag{4.41b} \]

Comparing these two formulas to (4.32a) and (4.32b) for the incident wave, we again see that if we consider only the scalar factors that do not depend on position or time, then the $\hat{y}$, $\hat{y}^{(r)}$ components of the complex E fields together with the $\hat{x}^{(r)} = \hat{x}$ components of the complex B fields have the same complex ratio

\[ r_p = \frac{|E^{(r)}_{\ell j y^{(r)}}|}{|E_{\ell jy}|}\, e^{i(\phi^{(r)}_{\ell j y^{(r)}} - \phi_{\ell jy})} . \tag{4.42a} \]


Parameter rp is called the p-wave amplitude-reflection coefficient, where again p refers to the
incident wave being polarized parallel to the plane of incidence. This coefficient, like rs , ts , and
t p , in general depends only on the wavenumber and incidence angle,

rp = rp (σ A ,ψ j ) . (4.42b)

Multiplying the complex formulas in (4.32a) and (4.32b) by rp converts them to (4.41a) and
(4.41b) if ŷ is replaced by yˆ ( r ) and z is replaced by z ( r ) .
Having analyzed how to create the reflected wavefield when the incident wavefield is a
monochromatic s-wave or monochromatic p-wave, we are now prepared to handle the reflection
of an arbitrary polychromatic plane wavefield incident along the z axis. Splitting each
monochromatic term into an s-wave component and a p-wave component as in formulas (4.36a)
and (4.36b), we can write the incident wave’s E field as the real part of
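\[ \sum_{\ell} \vec{E}_{\ell js}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_{\ell} \vec{E}_{\ell jp}\, e^{2\pi i \sigma_\ell (z - ct)} \]

or, using Eqs. (4.31c) and (4.35c), as the real part of

\[ \sum_{\ell} \hat{x}\, |E_{\ell jx}|\, e^{i\phi_{\ell jx}}\, e^{2\pi i \sigma_\ell (z - ct)} + \sum_{\ell} \hat{y}\, |E_{\ell jy}|\, e^{i\phi_{\ell jy}}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.43a} \]

Similarly, the incident wave’s B field is, using Eqs. (4.31c) and (4.35c), the real part of

\[ \sum_{\ell} \hat{y}\, \frac{1}{c}\, |E_{\ell jx}|\, e^{i\phi_{\ell jx}}\, e^{2\pi i \sigma_\ell (z - ct)} - \sum_{\ell} \hat{x}\, \frac{1}{c}\, |E_{\ell jy}|\, e^{i\phi_{\ell jy}}\, e^{2\pi i \sigma_\ell (z - ct)} . \tag{4.43b} \]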

In these latest formulas, (4.43a) and (4.43b), the first term is the sum over the s-wave components
of the incident wavefield and the second term is the sum over the p-wave components of the
incident wavefield. To get the corresponding polychromatic reflected wavefield, we follow the
just-described recipes for finding the reflected monochromatic plane waves generated by each
incident monochromatic plane wave. The electric field of the reflected wavefield is then found to
be the real part of

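\[ \sum_{\ell} r_s(\sigma_\ell, \psi_j)\, \hat{x}\, |E_{\ell jx}|\, e^{i\phi_{\ell jx}}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} + \sum_{\ell} r_p(\sigma_\ell, \psi_j)\, \hat{y}^{(r)} |E_{\ell jy}|\, e^{i\phi_{\ell jy}}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} \tag{4.43c} \]

and the magnetic-induction field of the reflected wavefield is found to be the real part of

\[ \sum_{\ell} r_s(\sigma_\ell, \psi_j)\, \hat{y}^{(r)} \frac{1}{c}\, |E_{\ell jx}|\, e^{i\phi_{\ell jx}}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} - \sum_{\ell} r_p(\sigma_\ell, \psi_j)\, \hat{x}\, \frac{1}{c}\, |E_{\ell jy}|\, e^{i\phi_{\ell jy}}\, e^{2\pi i \sigma_\ell (z^{(r)} - ct)} . \tag{4.43d} \]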

These reflected-wave formulas are, of course, the counterpart equations to (4.36c) and (4.36d) for
the transmitted wavefields.

4.7 Polychromatic Wave Fields


Having found and at least to some extent analyzed the complex E-field and B-field plane-wave
solutions in Eqs. (4.11a) and (4.11b), we can write their associated real-valued radiation fields as

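\[ \begin{aligned} \vec{E}^{(\mathrm{rad})}(\vec{r}, t) &= \mathrm{Re}\left\{ \sum_j \sum_\ell \vec{E}_{\ell j}\, e^{2\pi i \sigma_\ell (\hat{\Omega}_j \cdot \vec{r} - ct)} \right\} \\ &= \sum_j \left\{ \frac{1}{2} \sum_\ell \vec{E}_{\ell j}\, e^{2\pi i \sigma_\ell (\hat{\Omega}_j \cdot \vec{r} - ct)} + \frac{1}{2} \sum_\ell \vec{E}^{\,*}_{\ell j}\, e^{-2\pi i \sigma_\ell (\hat{\Omega}_j \cdot \vec{r} - ct)} \right\} \end{aligned} \tag{4.44a} \]

and

\[ \begin{aligned} \vec{B}^{(\mathrm{rad})}(\vec{r}, t) &= \mathrm{Re}\left\{ \sum_j \sum_\ell \vec{B}_{\ell j}\, e^{2\pi i \sigma_\ell (\hat{\Omega}_j \cdot \vec{r} - ct)} \right\} \\ &= \sum_j \left\{ \frac{1}{2} \sum_\ell \vec{B}_{\ell j}\, e^{2\pi i \sigma_\ell (\hat{\Omega}_j \cdot \vec{r} - ct)} + \frac{1}{2} \sum_\ell \vec{B}^{\,*}_{\ell j}\, e^{-2\pi i \sigma_\ell (\hat{\Omega}_j \cdot \vec{r} - ct)} \right\} . \end{aligned} \tag{4.44b} \]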
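In Eq. (4.44a), to convert the first inside sum over $\vec{E}_{\ell j}$ into an integral, we replace $\sigma_\ell \ge 0$ with the continuous variable $\sigma \ge 0$. To convert the sum over $\vec{E}^{\,*}_{\ell j}$ into an integral, we use negative values of the same continuous variable $\sigma$; that is, we replace $-\sigma_\ell$ with $\sigma < 0$. To set up these conversions, we define

\[ \Delta\sigma_\ell\, \vec{E}_j(\sigma) = \frac{1}{2} \vec{E}_{\ell j} \quad \text{for } \sigma = \sigma_\ell > 0 , \tag{4.45a} \]

and

\[ \Delta\sigma_\ell\, \vec{E}_j(\sigma) = \frac{1}{2} \vec{E}^{\,*}_{\ell j} \quad \text{for } \sigma = -\sigma_\ell < 0 \tag{4.45b} \]

with

\[ \Delta\sigma_\ell = \sigma_{\ell+1} - \sigma_\ell . \]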


A similar conversion of sums into integrals can be applied to Eq. (4.44b) if we define

\[ \Delta\sigma_\ell\, \vec{B}_j(\sigma) = \frac{1}{2} \vec{B}_{\ell j} \quad \text{for } \sigma = \sigma_\ell > 0 , \tag{4.45c} \]

and

\[ \Delta\sigma_\ell\, \vec{B}_j(\sigma) = \frac{1}{2} \vec{B}^{\,*}_{\ell j} \quad \text{for } \sigma = -\sigma_\ell < 0 . \tag{4.45d} \]

Equations (4.45a) and (4.45c) associate positive $\sigma$ arguments in $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ with the original $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$ vectors, and Eqs. (4.45b) and (4.45d) associate negative $\sigma$ arguments in $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ with the complex conjugate $\vec{E}^{\,*}_{\ell j}$ and $\vec{B}^{\,*}_{\ell j}$ vectors. In the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ values per unit wavenumber interval, Eqs. (4.44a) and (4.44b) become
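\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} \vec{E}_j(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma \tag{4.46a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} \vec{B}_j(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma . \tag{4.46b} \]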

For this limit to make sense, we have to set $\vec{E}_j(\sigma) = 0$ and $\vec{B}_j(\sigma) = 0$ in (4.45a)–(4.45d) at those wavenumbers for which there are no specified $\ell$ index values in (4.44a) and (4.44b); in effect, the indices left out of the sums are now included but assigned zero for their complex vector coefficients $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$. Although Eqs. (4.44a) and (4.44b) force vectors $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ to be real, vectors $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ are allowed to be complex.
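Equations (4.46a) and (4.46b) are a vector shorthand for the six scalar equations

\[ E_x^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} E_{jx}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \]

\[ E_y^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} E_{jy}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \]

\[ E_z^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} E_{jz}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \]

and

\[ B_x^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} B_{jx}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \]

\[ B_y^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} B_{jy}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \]

\[ B_z^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \int_{-\infty}^{\infty} B_{jz}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \]

where

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \hat{x} E_x^{(\mathrm{rad})}(\vec{r}, t) + \hat{y} E_y^{(\mathrm{rad})}(\vec{r}, t) + \hat{z} E_z^{(\mathrm{rad})}(\vec{r}, t) \]

with

\[ \vec{E}_j(\sigma) = \hat{x} E_{jx}(\sigma) + \hat{y} E_{jy}(\sigma) + \hat{z} E_{jz}(\sigma) \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \hat{x} B_x^{(\mathrm{rad})}(\vec{r}, t) + \hat{y} B_y^{(\mathrm{rad})}(\vec{r}, t) + \hat{z} B_z^{(\mathrm{rad})}(\vec{r}, t) \]

with

\[ \vec{B}_j(\sigma) = \hat{x} B_{jx}(\sigma) + \hat{y} B_{jy}(\sigma) + \hat{z} B_{jz}(\sigma) \]

for any $\hat{x}$, $\hat{y}$, $\hat{z}$ triplet of mutually perpendicular Cartesian unit vectors. The integrals in (4.46a) and (4.46b) are inverse Fourier transforms, so we can define, using $\xi = \hat{\Omega}_j \cdot \vec{r} - ct$,

\[ E_{jx}(\xi) = \int_{-\infty}^{\infty} E_{jx}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad E_{jy}(\xi) = \int_{-\infty}^{\infty} E_{jy}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad E_{jz}(\xi) = \int_{-\infty}^{\infty} E_{jz}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \]

and

\[ B_{jx}(\xi) = \int_{-\infty}^{\infty} B_{jx}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad B_{jy}(\xi) = \int_{-\infty}^{\infty} B_{jy}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad B_{jz}(\xi) = \int_{-\infty}^{\infty} B_{jz}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma . \]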

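In our shorthand vector notation, this becomes

\[ \vec{E}_j(\xi) = \int_{-\infty}^{\infty} \vec{E}_j(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \tag{4.46c} \]

and

\[ \vec{B}_j(\xi) = \int_{-\infty}^{\infty} \vec{B}_j(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \tag{4.46d} \]

where

\[ \vec{E}_j(\xi) = \hat{x} E_{jx}(\xi) + \hat{y} E_{jy}(\xi) + \hat{z} E_{jz}(\xi) \tag{4.46e} \]

and

\[ \vec{B}_j(\xi) = \hat{x} B_{jx}(\xi) + \hat{y} B_{jy}(\xi) + \hat{z} B_{jz}(\xi) . \tag{4.46f} \]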

Now Eqs. (4.46a) and (4.46b) can be written as (remember that $\xi = \hat{\Omega}_j \cdot \vec{r} - ct$)

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct) \tag{4.46g} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \sum_j \vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct) . \tag{4.46h} \]

Returning to the definitions of $\vec{E}_j$ and $\vec{B}_j$ in Eqs. (4.45a)–(4.45d), we see that

\[ \vec{E}_j(-\sigma) = \vec{E}_j(\sigma)^* \tag{4.47a} \]

and

\[ \vec{B}_j(-\sigma) = \vec{B}_j(\sigma)^* . \tag{4.47b} \]

This shows that $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ are Hermitian, and entry 7 in Table 2.1 of Chapter 2 requires the inverse Fourier transforms of Hermitian functions to be real. Consequently, because they are inverse Fourier transforms of Hermitian functions, each $\vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ and $\vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ vector function in (4.46g) and (4.46h) is real. Every $\vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ and $\vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ pair of vector functions can be thought of as the real electric and magnetic-induction fields of a single polychromatic plane wave traveling in direction $\hat{\Omega}_j$ at velocity c. Hence these two equations demonstrate that electromagnetic radiation fields in empty space can be represented as the sum of polychromatic plane waves traveling in a specified collection of different directions.
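The link between the Hermitian property and a real field can be illustrated with a small discrete example for one scalar component. The sketch below builds the two-sided set of coefficients prescribed by (4.45a) and (4.45b), stored here as the products of the spacing and the spectrum, and sums them directly instead of calling an FFT routine. The wavenumbers and amplitudes are arbitrary illustrative values, not data from the text.

```python
import numpy as np

# One-sided description: wavenumbers sigma_l > 0 with complex amplitudes E_l.
sigma_l = np.array([100.0, 250.0, 400.0])
E_l = np.array([1.0 + 2.0j, -0.5 + 0.3j, 0.2 - 1.0j])

# Two-sided Hermitian description: E_l / 2 at +sigma_l, conj(E_l) / 2 at -sigma_l.
sigma = np.concatenate([-sigma_l[::-1], sigma_l])
weights = np.concatenate([0.5 * np.conj(E_l[::-1]), 0.5 * E_l])

xi = np.linspace(-0.02, 0.02, 11)   # sample values of xi

# Discrete counterpart of the inverse transform: sum of weights * exp(2 pi i sigma xi).
two_sided = (weights[None, :] * np.exp(2j * np.pi * sigma[None, :] * xi[:, None])).sum(axis=1)

# Reference: real part of the original one-sided complex sum.
one_sided = np.real((E_l[None, :] * np.exp(2j * np.pi * sigma_l[None, :] * xi[:, None])).sum(axis=1))
```

The imaginary part of `two_sided` vanishes to rounding error and its real part matches `one_sided`, which is the discrete analogue of the statement that Hermitian spectra have real inverse transforms.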


From Eqs. (4.14c) and (4.14d), we know that $\vec{B}_{\ell j} \cdot \hat{\Omega}_j = 0$ and $\vec{E}_{\ell j} \cdot \hat{\Omega}_j = 0$. Taking the complex conjugate of these two relationships gives $\vec{B}^{\,*}_{\ell j} \cdot \hat{\Omega}_j = 0$ and $\vec{E}^{\,*}_{\ell j} \cdot \hat{\Omega}_j = 0$. We can now take the dot product of both sides of Eqs. (4.45a) and (4.45b) with $\hat{\Omega}_j$ to get

\[ \vec{E}_j(\sigma) \cdot \hat{\Omega}_j = 0 \tag{4.48a} \]

and the dot product of both sides of Eqs. (4.45c) and (4.45d) with $\hat{\Omega}_j$ to get

\[ \vec{B}_j(\sigma) \cdot \hat{\Omega}_j = 0 \tag{4.48b} \]

for all positive and negative values of $\sigma$. Taking the dot product with $\hat{\Omega}_j$ of both sides of Eqs. (4.46c) and (4.46d) gives

\[ \vec{E}_j(\xi) \cdot \hat{\Omega}_j = \int_{-\infty}^{\infty} \left[ \vec{E}_j(\sigma) \cdot \hat{\Omega}_j \right] e^{2\pi i \sigma \xi}\, d\sigma \]

and

\[ \vec{B}_j(\xi) \cdot \hat{\Omega}_j = \int_{-\infty}^{\infty} \left[ \vec{B}_j(\sigma) \cdot \hat{\Omega}_j \right] e^{2\pi i \sigma \xi}\, d\sigma \]

because $\hat{\Omega}_j$ is a constant unit vector. Substituting from Eqs. (4.48a) and (4.48b) and remembering that $\xi = \hat{\Omega}_j \cdot \vec{r} - ct$ now leads to

\[ \vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct) \cdot \hat{\Omega}_j = 0 \tag{4.49a} \]

and

\[ \vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct) \cdot \hat{\Omega}_j = 0 \tag{4.49b} \]

for any polychromatic plane wave $\vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ and $\vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$. Consequently, the E and B fields of a polychromatic plane wave, just like the E and B fields of a monochromatic plane wave, are transverse to the wave’s direction of propagation. From Eq. (4.22a) we note that, taking the complex conjugates of the original equality,

\[ \vec{E}_{\ell j} \cdot \vec{B}_{\ell j} = \vec{E}^{\,*}_{\ell j} \cdot \vec{B}^{\,*}_{\ell j} = 0 . \]

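Hence from Eqs. (4.45a) through (4.45d) it follows that

\[ \vec{E}_j(\sigma) \cdot \vec{B}_j(\sigma) = \frac{1}{4(\Delta\sigma_\ell)^2}\, \vec{E}_{\ell j} \cdot \vec{B}_{\ell j} = 0 \]

for $\sigma > 0$ and

\[ \vec{E}_j(\sigma) \cdot \vec{B}_j(\sigma) = \frac{1}{4(\Delta\sigma_\ell)^2}\, \vec{E}^{\,*}_{\ell j} \cdot \vec{B}^{\,*}_{\ell j} = 0 \]

for $\sigma < 0$. We conclude, in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ values, that

\[ \vec{E}_j(\sigma) \cdot \vec{B}_j(\sigma) = 0 \tag{4.49c} \]

for all positive and negative values of $\sigma$. We divide both sides of Eq. (4.22b) by $4(\Delta\sigma_\ell)^2$ to get

\[ \frac{1}{4(\Delta\sigma_\ell)^2}\, \vec{E}_{\ell j} \times \vec{B}^{\,*}_{\ell j} = \frac{1}{c} \left[ \frac{1}{4(\Delta\sigma_\ell)^2} \right] \left( \vec{E}_{\ell j} \cdot \vec{E}^{\,*}_{\ell j} \right) \hat{\Omega}_j . \tag{4.49d} \]

Consulting Eq. (4.45a), the complex conjugate of Eq. (4.45a), and the complex conjugate of Eq. (4.45c), we note that in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ it follows that

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^* = \frac{1}{c} \left( \vec{E}_j(\sigma) \cdot \vec{E}_j(\sigma)^* \right) \hat{\Omega}_j \tag{4.49e} \]

for $\sigma > 0$. For $\sigma < 0$ we have, using (4.45b) and the complex conjugate of (4.45d), that

\[ \frac{1}{4(\Delta\sigma_\ell)^2}\, \vec{E}^{\,*}_{\ell j} \times \vec{B}_{\ell j} = \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^* . \]

Substituting this into the complex conjugate of (4.49d) gives

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^* = \frac{1}{4c(\Delta\sigma_\ell)^2} \left( \vec{E}^{\,*}_{\ell j} \cdot \vec{E}_{\ell j} \right) \hat{\Omega}_j . \]

Remembering that $\sigma < 0$, we now use (4.45b) and the complex conjugate of (4.45b) to write, in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$, that

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^* = \frac{1}{c} \left( \vec{E}_j(\sigma)^* \cdot \vec{E}_j(\sigma) \right) \hat{\Omega}_j = \frac{1}{c} \left( \vec{E}_j(\sigma) \cdot \vec{E}_j(\sigma)^* \right) \hat{\Omega}_j . \]

Comparing the results for $\sigma > 0$ and $\sigma < 0$, we conclude that

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^* = \frac{1}{c} \left( \vec{E}_j(\sigma) \cdot \vec{E}_j(\sigma)^* \right) \hat{\Omega}_j \tag{4.49f} \]

holds true for all positive and negative values of $\sigma$. Glancing back at Eq. (4.47a), we see that this can also be written as

\[ \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^* = \frac{1}{c} \left( \vec{E}_j(\sigma) \cdot \vec{E}_j(-\sigma) \right) \hat{\Omega}_j \tag{4.49g} \]

for all positive and negative values of $\sigma$.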

4.8 Angle-Wavenumber Transforms


The next step is to convert the sums over j in Eqs. (4.46a) and (4.46b) into integrals. Remembering that the $\hat{\Omega}_j$ are defined in Eq. (4.12a) to be $\hat{\Omega}_j = \hat{x}\varepsilon_{jx} + \hat{y}\varepsilon_{jy} + \hat{z}\varepsilon_{jz}$, we require that $\varepsilon_{jz} > 0$. Now all the plane waves in Eqs. (4.46a) and (4.46b) are traveling more or less along the positive z axis of the Cartesian coordinate system; that is, the angle between $\hat{\Omega}_j$ and $\hat{z}$ is always less than $\pi/2$. We use

\[ |\hat{\Omega}_j|^2 = 1 = \varepsilon_{jx}^2 + \varepsilon_{jy}^2 + \varepsilon_{jz}^2 \]

[see Eq. (4.12c)] to write

\[ \hat{\Omega}_j = \hat{x}\varepsilon_{jx} + \hat{y}\varepsilon_{jy} + \hat{z}\sqrt{1 - \varepsilon_{jx}^2 - \varepsilon_{jy}^2} . \tag{4.50a} \]

This makes it clear that the two real parameters $\varepsilon_{jx}$ and $\varepsilon_{jy}$ specify the propagation direction $\hat{\Omega}_j$ of the jth plane wave. Consequently, each plane wave in the sums over j in Eqs. (4.46a) and (4.46b) can be specified by a single point in the $\varepsilon_x$, $\varepsilon_y$ plane. Figure 4.10 shows how this works for the sum of the five plane waves specified by the points $(\varepsilon_{1x}, \varepsilon_{1y})$, $(\varepsilon_{2x}, \varepsilon_{2y})$, $(\varepsilon_{3x}, \varepsilon_{3y})$, $(\varepsilon_{4x}, \varepsilon_{4y})$, and $(\varepsilon_{5x}, \varepsilon_{5y})$. We can construct a grid of $\varepsilon_x$, $\varepsilon_y$ values such that each plane wave is located at a node in the grid, where if necessary the grid lines are unevenly spaced as in Fig. 4.10. After numbering the grid lines, we can replace the single index j by a pair of indices m and n. The five plane waves in Fig. 4.10, for example, become

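\[ (\varepsilon_{1x}, \varepsilon_{1y}) \to (\varepsilon_{2x}, \varepsilon_{4y}) , \quad (\varepsilon_{2x}, \varepsilon_{2y}) \to (\varepsilon_{5x}, \varepsilon_{1y}) , \]

\[ (\varepsilon_{3x}, \varepsilon_{3y}) \to (\varepsilon_{3x}, \varepsilon_{2y}) , \quad (\varepsilon_{4x}, \varepsilon_{4y}) \to (\varepsilon_{4x}, \varepsilon_{5y}) , \]

and

\[ (\varepsilon_{5x}, \varepsilon_{5y}) \to (\varepsilon_{1x}, \varepsilon_{3y}) . \]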

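Replacing index j by a pair of indices m and n lets us write the sums in Eqs. (4.46a) and (4.46b) as

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \vec{E}_{nm}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_{nm} \cdot \vec{r} - ct)}\, d\sigma \tag{4.51a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \vec{B}_{nm}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_{nm} \cdot \vec{r} - ct)}\, d\sigma , \tag{4.51b} \]

where we define $\vec{E}_{nm}(\sigma) = \vec{B}_{nm}(\sigma) = 0$ for those grid points that do not correspond to propagation directions specified in the original sums over j. The new set of $\hat{\Omega}_{nm}$ propagation vectors can be written as

\[ \hat{\Omega}_{nm} = \hat{x}\varepsilon_{nx} + \hat{y}\varepsilon_{my} + \hat{z}\sqrt{1 - \varepsilon_{nx}^2 - \varepsilon_{my}^2} . \tag{4.51c} \]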

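For each m and n propagation direction in Eqs. (4.51a) and (4.51b), we now define

\[ \Delta\varepsilon_{nx}\, \Delta\varepsilon_{my}\, \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = \vec{E}_{nm}(\sigma) \tag{4.52a} \]

and

\[ \Delta\varepsilon_{nx}\, \Delta\varepsilon_{my}\, \vec{b}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = \vec{B}_{nm}(\sigma) \tag{4.52b} \]

with

\[ \Delta\varepsilon_{nx} = \varepsilon_{n+1,x} - \varepsilon_{n,x} \tag{4.52c} \]

and

\[ \Delta\varepsilon_{my} = \varepsilon_{m+1,y} - \varepsilon_{m,y} . \tag{4.52d} \]

In the limit of decreasing $\Delta\varepsilon_{nx}$, $\Delta\varepsilon_{my}$ and increasing numbers of specified propagation directions per unit interval in $\varepsilon_x$ and $\varepsilon_y$, Eqs. (4.51a) and (4.51b) can be written as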

FIGURE 4.10. [Figure not reproduced: the five plane-wave directions $(\varepsilon_{1x}, \varepsilon_{1y})$ through $(\varepsilon_{5x}, \varepsilon_{5y})$ plotted as points on an unevenly spaced grid in the $\varepsilon_x$, $\varepsilon_y$ plane.]
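\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \int_{-\infty}^{\infty} d\sigma \iint_{[\varepsilon_x^2 + \varepsilon_y^2 < 1]} d\varepsilon_x\, d\varepsilon_y\; \vec{e}(\varepsilon_x, \varepsilon_y, \sigma)\, e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)} \tag{4.53a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \int_{-\infty}^{\infty} d\sigma \iint_{[\varepsilon_x^2 + \varepsilon_y^2 < 1]} d\varepsilon_x\, d\varepsilon_y\; \vec{b}(\varepsilon_x, \varepsilon_y, \sigma)\, e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)} \tag{4.53b} \]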


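with $\hat{\Omega} = \hat{x}\varepsilon_x + \hat{y}\varepsilon_y + \hat{z}\sqrt{1 - \varepsilon_x^2 - \varepsilon_y^2}$. We single out the x and y components of $\hat{\Omega}$ and $\vec{r}$ by writing

\[ \hat{\Omega} = \vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2} \tag{4.54a} \]

and

\[ \vec{r} = \vec{\rho} + z\hat{z} , \tag{4.54b} \]

where

\[ \vec{\varepsilon} = \hat{x}\varepsilon_x + \hat{y}\varepsilon_y , \tag{4.54c} \]

\[ \varepsilon^2 = |\vec{\varepsilon}|^2 = \varepsilon_x^2 + \varepsilon_y^2 , \tag{4.54d} \]

and

\[ \vec{\rho} = x\hat{x} + y\hat{y} . \tag{4.54e} \]

As a shorthand, we write the complex vector functions $\vec{e}$ and $\vec{b}$ as

\[ \vec{e}(\varepsilon_x, \varepsilon_y, \sigma) = \vec{e}(\vec{\varepsilon}, \sigma) \quad\text{and}\quad \vec{b}(\varepsilon_x, \varepsilon_y, \sigma) = \vec{b}(\vec{\varepsilon}, \sigma) . \]

Equations (4.52a) and (4.52b) show that both $\vec{e}(\vec{\varepsilon}, \sigma)$ and $\vec{b}(\vec{\varepsilon}, \sigma)$ must be negligible or zero for values of $\vec{\varepsilon} = \hat{x}\varepsilon_x + \hat{y}\varepsilon_y$ that do not correspond to grid points contained in the original sums over j. We also require $\vec{e}(\vec{\varepsilon}, \sigma)$ and $\vec{b}(\vec{\varepsilon}, \sigma)$ to be zero for values of $\vec{\varepsilon}$ for which $|\vec{\varepsilon}| \ge 1$. Now Eqs. (4.53a) and (4.53b) become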

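\[ \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[ \vec{e}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \right] e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} \tag{4.55a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[ \vec{b}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \right] e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} , \tag{4.55b} \]

where we have singled out the z dependence of $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$, writing that

\[ \vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \vec{E}^{(\mathrm{rad})}(x, y, z, t) = \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \vec{B}^{(\mathrm{rad})}(x, y, z, t) = \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) . \]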


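From Eqs. (4.49f), (4.52a), and (4.52b), we have, replacing each j index by the appropriate m and n pair of indices,

\[ \begin{aligned} \vec{E}_{nm}(\sigma) \times \vec{B}_{nm}(\sigma)^* &= (\Delta\varepsilon_{nx}\, \Delta\varepsilon_{my})^2\; \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) \times \vec{b}(\varepsilon_{nx}, \varepsilon_{my}, \sigma)^* \\ &= \frac{1}{c}\, (\Delta\varepsilon_{nx}\, \Delta\varepsilon_{my})^2 \left( \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) \cdot \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma)^* \right) \hat{\Omega}_{nm} . \end{aligned} \]

Dropping the m and n indices, making the notation change $\varepsilon_x, \varepsilon_y \to \vec{\varepsilon}$, and dividing through by $(\Delta\varepsilon_{nx}\, \Delta\varepsilon_{my})^2$, we get, in the limit of decreasing $\Delta\varepsilon_{nx}$, $\Delta\varepsilon_{my}$ and increasing numbers of specified propagation directions, that

\[ \vec{e}(\vec{\varepsilon}, \sigma) \times \vec{b}(\vec{\varepsilon}, \sigma)^* = \frac{1}{c} \left( \vec{e}(\vec{\varepsilon}, \sigma) \cdot \vec{e}(\vec{\varepsilon}, \sigma)^* \right) \hat{\Omega} . \tag{4.56a} \]

Following the same procedure, we substitute Eqs. (4.52a) and (4.52b) into (4.49g) to get

\[ \vec{e}(\vec{\varepsilon}, \sigma) \times \vec{b}(\vec{\varepsilon}, \sigma)^* = \frac{1}{c} \left( \vec{e}(\vec{\varepsilon}, \sigma) \cdot \vec{e}(\vec{\varepsilon}, -\sigma) \right) \hat{\Omega} . \tag{4.56b} \]

We can also substitute (4.52a) into (4.48a) to get, replacing each j by appropriate m and n indices,

\[ \Delta\varepsilon_{nx}\, \Delta\varepsilon_{my}\; \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) \cdot \hat{\Omega}_{nm} = 0 , \tag{4.56c} \]

which becomes, making the same notation changes as before and taking the same limit as before,

\[ \vec{e}(\vec{\varepsilon}, \sigma) \cdot \hat{\Omega} = 0 . \tag{4.56d} \]

A similar substitution of (4.52b) into (4.48b) gives

\[ \vec{b}(\vec{\varepsilon}, \sigma) \cdot \hat{\Omega} = 0 . \tag{4.56e} \]

Equations (4.55a) and (4.55b) can be simplified by defining

\[ \vec{E}(\vec{\varepsilon}, z, \sigma) = \vec{e}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \tag{4.57a} \]

and

\[ \vec{B}(\vec{\varepsilon}, z, \sigma) = \vec{b}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \tag{4.57b} \]

to get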

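\[ \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\; \vec{E}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} \tag{4.58a} \]

and

\[ \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\; \vec{B}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} . \tag{4.58b} \]

The complex vectors $\vec{E}(\vec{\varepsilon}, z, \sigma)$ and $\vec{B}(\vec{\varepsilon}, z, \sigma)$ are called the angle-wavenumber transforms of $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ respectively. By definition [see Eqs. (4.57a) and (4.57b)], the angle-wavenumber transforms at $z_0 + \Delta z$ are given by

\[ \vec{E}(\vec{\varepsilon}, z_0 + \Delta z, \sigma) = \vec{E}(\vec{\varepsilon}, z_0, \sigma)\, e^{2\pi i \sigma \Delta z \sqrt{1 - \varepsilon^2}} \tag{4.59a} \]

and

\[ \vec{B}(\vec{\varepsilon}, z_0 + \Delta z, \sigma) = \vec{B}(\vec{\varepsilon}, z_0, \sigma)\, e^{2\pi i \sigma \Delta z \sqrt{1 - \varepsilon^2}} . \tag{4.59b} \]

These equalities show that to get $\vec{E}$ and $\vec{B}$ at $z_0 + \Delta z$ we need only multiply $\vec{E}$ and $\vec{B}$ at $z_0$ by $e^{2\pi i \sigma \Delta z \sqrt{1 - \varepsilon^2}}$. Multiplication of Eqs. (4.56d) and (4.56e) by $e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}}$ gives

\[ \vec{E}(\vec{\varepsilon}, z, \sigma) \cdot \hat{\Omega} = 0 \tag{4.59c} \]

and

\[ \vec{B}(\vec{\varepsilon}, z, \sigma) \cdot \hat{\Omega} = 0 . \tag{4.59d} \]

Multiplying both sides of (4.56a) by

\[ e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \cdot e^{-2\pi i \sigma z \sqrt{1 - \varepsilon^2}} = 1 \]

gives

\[ \left[ \vec{e}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \right] \times \left[ \vec{b}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \right]^* = \frac{1}{c} \left( \left[ \vec{e}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \right] \cdot \left[ \vec{e}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \right]^* \right) \hat{\Omega} \]

or

\[ \vec{E}(\vec{\varepsilon}, z, \sigma) \times \vec{B}(\vec{\varepsilon}, z, \sigma)^* = \frac{1}{c} \left( \vec{E}(\vec{\varepsilon}, z, \sigma) \cdot \vec{E}(\vec{\varepsilon}, z, \sigma)^* \right) \hat{\Omega} . \tag{4.59e} \]

Similar treatment of (4.56b) gives

\[ \vec{E}(\vec{\varepsilon}, z, \sigma) \times \vec{B}(\vec{\varepsilon}, z, \sigma)^* = \frac{1}{c} \left( \vec{E}(\vec{\varepsilon}, z, \sigma) \cdot \vec{E}(\vec{\varepsilon}, z, -\sigma) \right) \hat{\Omega} . \tag{4.59f} \]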
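Equations (4.59a) and (4.59b) say that moving an angle-wavenumber transform from one z plane to another is a single multiplication by a phase factor. The sketch below applies that factor to one sample of the transform; the function names `propagate` and `omega_hat`, and all numerical values, are assumptions made for illustration.

```python
import numpy as np


def omega_hat(eps):
    """Propagation unit vector built from its x and y components."""
    eps = np.asarray(eps, dtype=float)
    return np.array([eps[0], eps[1], np.sqrt(1.0 - eps @ eps)])


def propagate(E_at_z0, eps, sigma, dz):
    """Advance one transform sample by dz using the phase factor in (4.59a)."""
    eps = np.asarray(eps, dtype=float)
    eps_sq = float(eps @ eps)
    if eps_sq >= 1.0:
        raise ValueError("the transform is defined to be zero for |eps| >= 1")
    factor = np.exp(2j * np.pi * sigma * dz * np.sqrt(1.0 - eps_sq))
    return np.asarray(E_at_z0, dtype=complex) * factor


eps = np.array([0.3, -0.2])      # illustrative direction components
sigma = 1.0e5                    # illustrative wavenumber, cycles per metre
Omega = omega_hat(eps)

# A complex amplitude chosen perpendicular to Omega, as (4.59c) requires.
E0 = np.cross(Omega, [0.0, 0.0, 1.0]) * (1.0 + 0.5j)

E1 = propagate(E0, eps, sigma, 1.0e-6)
E2 = propagate(E1, eps, sigma, 2.0e-6)
E_direct = propagate(E0, eps, sigma, 3.0e-6)
```

Two successive steps give the same result as one combined step, the magnitude of each component is unchanged, and the propagated vector stays perpendicular to the propagation direction.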

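Equations (4.58a) and (4.58b) are a disguised form of the inverse Fourier transform. Writing (4.58a) using x and y for $\vec{\rho}$, $\varepsilon_x$ and $\varepsilon_y$ for $\vec{\varepsilon}$, and then making the substitutions

\[ w = -\sigma c , \tag{4.60a} \]

\[ u_x = \sigma \varepsilon_x , \tag{4.60b} \]

\[ u_y = \sigma \varepsilon_y \tag{4.60c} \]

gives

\[ \begin{aligned} \vec{E}^{(\mathrm{rad})}(x, y, z, t) &= \int_{-\infty}^{\infty} d\sigma \int_{-\infty}^{\infty} \sigma^{-1} du_x \int_{-\infty}^{\infty} \sigma^{-1} du_y\; \vec{E}(\sigma^{-1} u_x, \sigma^{-1} u_y, z, \sigma)\, e^{2\pi i (x u_x + y u_y - \sigma c t)} \\ &= -\int_{\infty}^{-\infty} \frac{dw}{c} \int_{-\infty}^{\infty} \left( -\frac{c}{w} \right) du_x \int_{-\infty}^{\infty} \left( -\frac{c}{w} \right) du_y\; \vec{E}\!\left( -\frac{c u_x}{w}, -\frac{c u_y}{w}, z, -\frac{w}{c} \right) e^{2\pi i (x u_x + y u_y + w t)} \end{aligned} \]

or

\[ \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left[ c\, w^{-2}\, \vec{E}\!\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) \right] e^{2\pi i (\vec{\rho} \cdot \vec{u} + w t)} , \tag{4.61a} \]

where in the last step we create a vector

\[ \vec{u} = \hat{x} u_x + \hat{y} u_y \tag{4.61b} \]

such that

\[ \vec{\varepsilon} = \sigma^{-1} \vec{u} = -\frac{c}{w}\, \vec{u} . \tag{4.61c} \]

The same transformation of variables applied to the triple integral over $\vec{B}$ in (4.58b) gives

\[ \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left[ c\, w^{-2}\, \vec{B}\!\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) \right] e^{2\pi i (\vec{\rho} \cdot \vec{u} + w t)} . \tag{4.61d} \]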


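According to Eq. (2.110f) in Chapter 2, we have now demonstrated that functions $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ at a specified value of z are the vector inverse Fourier transforms of

\[ c\, w^{-2}\, \vec{E}\!\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) \]

and

\[ c\, w^{-2}\, \vec{B}\!\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) . \]

Hence, the vector forward Fourier transforms of $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ must be [see Eq. (2.110e) in Chapter 2]

\[ c\, w^{-2}\, \vec{E}\!\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \tag{4.62a} \]

and

\[ c\, w^{-2}\, \vec{B}\!\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \tag{4.62b} \]

or, returning to the $\vec{\varepsilon}$ and $\sigma$ arguments,

\[ \vec{E}(\vec{\varepsilon}, z, \sigma) = c\, \sigma^2 \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} \tag{4.62c} \]

and

\[ \vec{B}(\vec{\varepsilon}, z, \sigma) = c\, \sigma^2 \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} . \tag{4.62d} \]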

Equations (4.58a), (4.58b), (4.62c), and (4.62d) are a formal transformation from the angle-wavenumber transforms to the real E and B radiation fields and back again, subject only to the constraint that $\varepsilon_z > 0$ in the propagation vector $\hat{\Omega} = \vec{\varepsilon} + \hat{z}\varepsilon_z$ and that

\[ \vec{E}(\vec{\varepsilon}, z, \sigma) = \vec{B}(\vec{\varepsilon}, z, \sigma) = 0 \quad \text{when } |\vec{\varepsilon}|^2 = \varepsilon_x^2 + \varepsilon_y^2 \ge 1 . \]

To go from Eqs. (4.58a) and (4.58b) to (4.62c) and (4.62d), we show the original angle-wavenumber transforms to be a form of three-dimensional vector Fourier transform. This lets us use Fourier transform theory to write down the integrals for the inverse transforms.
Unfortunately, the change in Eqs. (4.60a)–(4.60c) from the $\vec{\varepsilon}$, $\sigma$ variables to the $\vec{u}$, $w$ variables that reveals the transform’s Fourier nature is a somewhat awkward one. There are two reasons for this: In the physical sciences, waves conventionally travel from left to right, forcing $\sigma$ and $w$ to have opposite signs in (4.60a), and in spectroscopy, wavenumbers rather than frequencies are conventionally used to characterize monochromatic radiation. Nevertheless, the rewards of converting to the Fourier transform (immediate access to the well-known results of Fourier theory) significantly outweigh the inconvenience, and the reader can expect to see transformations between the $\vec{\varepsilon}$, $\sigma$ variables and the $\vec{u}$, $w$ variables more than once in the balance of this chapter.
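
The change of variables in Eqs. (4.60a)–(4.60c), and its inverse in Eq. (4.61c), can be checked numerically. The Python sketch below is illustrative only: the function names, the sample values, and the choice of SI units with the vacuum speed of light are assumptions made for this example and do not come from the text. It confirms that the phase $\sigma(\vec{\varepsilon} \cdot \vec{\rho} - ct)$ equals $\vec{\rho} \cdot \vec{u} + wt$ after the substitution.

```python
C = 299_792_458.0  # vacuum speed of light in m/s (assumed unit system)


def to_fourier_vars(eps_x, eps_y, sigma, c=C):
    """Map (eps_x, eps_y, sigma) to (u_x, u_y, w) using Eqs. (4.60a)-(4.60c)."""
    return sigma * eps_x, sigma * eps_y, -sigma * c


def from_fourier_vars(u_x, u_y, w, c=C):
    """Invert the map for w != 0: sigma = -w/c and eps = -(c/w) u, Eq. (4.61c)."""
    return -c * u_x / w, -c * u_y / w, -w / c


def phase_angle_wavenumber(x, y, t, eps_x, eps_y, sigma, c=C):
    """Phase (in cycles) written with the eps, sigma variables."""
    return sigma * (eps_x * x + eps_y * y - c * t)


def phase_fourier(x, y, t, u_x, u_y, w):
    """The same phase (in cycles) written with the u, w variables."""
    return x * u_x + y * u_y + w * t
```

A typical use is to convert one sample point, convert it back, and compare the two phase expressions; agreement to rounding error is what the substitution predicts.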

4.9 Beam-Chopped and Direction-Chopped Radiation


In geometric optics, a plane wave of any sort, polychromatic or monochromatic, is represented by a collection of equally spaced parallel rays (see Fig. 4.11). Returning briefly to the notation of Eqs. (4.46g) and (4.46h), we again label each plane wave’s direction of propagation with a propagation vector $\hat{\Omega}_j$. Each ray belonging to the collection of rays representing the plane wave points in the direction of $\hat{\Omega}_j$, and the plane surfaces specified by $\hat{\Omega}_j \cdot \vec{r} = \text{constant}$ are surfaces perpendicular to all the parallel rays. If the plane wave is monochromatic, then these surfaces where $\hat{\Omega}_j \cdot \vec{r} = \text{constant}$ are also surfaces of constant phase at fixed time $t$, since the monochromatic phase term is

$$2\pi\sigma_\ell\,(\hat{\Omega}_j \cdot \vec{r} - ct)$$

[see the discussion following Eq. (4.19b)].

This means, of course, that the monochromatic E field as well as the monochromatic B field is constant over any of these plane surfaces at fixed time $t$. If the plane wave is polychromatic, we review the discussion following Eq. (4.47b) and note that a single polychromatic plane wave has E and B fields specified by the vector functions

$$\vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct) \quad \text{and} \quad \vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$$

respectively. Consequently, at any fixed time $t$, the polychromatic E field as well as the polychromatic B field is constant over any plane surface where $\hat{\Omega}_j \cdot \vec{r} = \text{constant}$; that is, they are constant over any plane surface perpendicular to the rays. For both monochromatic and polychromatic plane waves, the E and B fields themselves lie in these plane surfaces because they must be perpendicular to the propagation vector $\hat{\Omega}_j$ [as shown by Eqs. (4.14c), (4.14d), (4.49a), and (4.49b)].
Figure 4.11 shows a plane wave encountering an aperture. The rays entering the aperture pass
on through, creating a beam; we say that the aperture creates a beam-chopped radiation field.


FIGURE 4.11. Apertures can be used to create beam-chopped radiation fields.

From our current point of view, the most important characteristic of beam-chopped fields is that
they obviously can be Fourier transformed in planes perpendicular to the beam’s direction of
travel. Using the x, y, z coordinate system shown in Fig. 4.11, with its origin in the center of the
beam and its z axis pointing down the beam, we drop the (rad) superscript from Eqs. (4.62a) and
(4.62b) and write
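$$
\begin{aligned}
c\, w^{-2}\, \vec{\mathcal{E}}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right)
&= \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \; \vec{E}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \\
&= \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \; \vec{E}(x, y, z, t)\, e^{-2\pi i (x u_x + y u_y + w t)}
\end{aligned}
\tag{4.63a}
$$

and

$$
\begin{aligned}
c\, w^{-2}\, \vec{\mathcal{B}}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right)
&= \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \; \vec{B}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \\
&= \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \; \vec{B}(x, y, z, t)\, e^{-2\pi i (x u_x + y u_y + w t)},
\end{aligned}
\tag{4.63b}
$$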


where $\vec{E}(x, y, z, t)$ and $\vec{B}(x, y, z, t)$ represent the E and B fields after the aperture in Fig. 4.11. In these formulas the integrals over $x$ and $y$ can be assumed to converge because the beam-chopped E and B fields are negligibly small for large values of $x$ and $y$.
Figure 4.11 suggests that a beam-chopped radiation field can travel indefinitely far to the right
with a cross-section that is always the same shape as the aperture. We know, however, that
diffraction eventually causes all beam-chopped radiation fields to spread; the smaller the
characteristic (or average) wavelength of the radiation compared to the characteristic (or average)
distance across the aperture, the farther the beam travels before significant spreading occurs.62
Michelson interferometers use apertures that are very large compared to the wavelengths of
interest, ensuring that only an insignificant amount of spreading occurs in the beam-chopped field
as it travels through the instrument.
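
To put rough numbers on this statement, a common order-of-magnitude estimate from diffraction theory is that a beam of width $D$ and wavelength $\lambda$ travels a distance of roughly $D^2/\lambda$ before spreading becomes comparable to the beam width. This rule of thumb is an assumption brought in for illustration, not a formula from this chapter, and the function name and sample values below are hypothetical.

```python
def spreading_distance(aperture_width_m, wavelength_m):
    """Rule-of-thumb distance (m) over which a beam stays roughly collimated.

    Uses the order-of-magnitude estimate D**2 / wavelength; numerical
    factors of order unity are deliberately ignored.
    """
    if aperture_width_m <= 0 or wavelength_m <= 0:
        raise ValueError("aperture width and wavelength must be positive")
    return aperture_width_m ** 2 / wavelength_m


# Example: a 5 cm aperture with 10 micrometer infrared radiation.
distance_m = spreading_distance(0.05, 10e-6)
```

For these sample values the estimate is a few hundred meters, far longer than the optical path inside a laboratory instrument, which is consistent with the claim that spreading is insignificant.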
In geometric optics, when a lens is placed perpendicular to the z axis—that is, the optical
axis—of a beam, the plane waves with propagation vectors parallel to the optical axis are focused
onto the point where the optical axis intersects a perpendicular surface called the focal plane (see
Fig. 4.12). Plane waves with propagation vectors at an angle with respect to the optical axis are
focused onto points in the focal plane that are off to the side. Figure 4.12 shows four rays
representing a plane wave propagating at a small angle to the optical axis being focused by lens A
slightly to the side of where the axis intersects the focal plane. Every propagation direction that is
at a small angle to the optical axis is focused onto a unique point in the focal plane close to the
optical axis and each point in the focal plane close to the optical axis corresponds to a unique
propagation direction at a small angle to the optical axis. Directions that differ only slightly with
respect to each other are focused at closely adjacent points. The plane wave in Fig. 4.12 has a
propagation vector at a small enough angle with respect to the optical axis that it is
focused by lens A only slightly to the side of where the axis intersects the focal plane.
Consequently, it passes through the small aperture placed in the focal plane and out to lens B,
which defocuses it back into a plane wave. Figure 4.13 gives a side view of this phenomenon.
Here there are three plane waves a, b, and c propagating in different directions with respect to the
beam’s optical axis. All the rays belonging to the plane wave a are focused at point a in the focal
plane; all the rays belonging to plane wave b are focused at point b in the focal plane; and all the
rays belonging to plane wave c are focused at point c in the focal plane. Only those plane waves
with propagation vectors at just a slight angle to the optical axis, such as plane wave b, pass
through the central aperture, allowing lens B to create a beam of plane waves propagating nearly
parallel to the optical axis. We say that the radiation leaving lens B has been direction-chopped,
meaning that it contains only a small range of propagation directions. The distance between the
focal plane and lenses A and B depends on the lenses’ index of refraction, which may in turn
depend on the radiation frequency $f = \sigma c$. If the frequency dependence is strong, then the two
lenses in Fig. 4.13 may not do a good job of creating a polychromatic direction-chopped beam.
When this is a concern, the all-reflective setup shown in Fig. 4.14 (composed of two Cassegrain
telescopes having focal-plane locations independent of frequency) is a better way to remove
unwanted propagation directions.

62
R. W. Ditchburn, Light, Vol. I, 2nd ed. (Interscience Publishers, a division of John Wiley & Sons, Inc., New York,
1963), pp. 162–166, 195.
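
The direction-chopping action of lens A and the focal-plane aperture can be sketched with a few lines of Python. The sketch assumes an ideal thin lens of focal length `focal_length`, for which a plane wave with direction cosines $(\varepsilon_x, \varepsilon_y)$ focuses at a transverse distance $f \tan\theta$ from the optical axis, and it assumes a circular aperture of radius `aperture_radius` centered on the axis. The function names and sample values are illustrative assumptions.

```python
import math


def focal_plane_point(eps_x, eps_y, focal_length):
    """Focal-plane position of a plane wave with direction cosines eps_x, eps_y.

    Ideal thin-lens model: the offset from the axis is f * tan(theta), which
    in direction cosines is f * eps / eps_z with eps_z = sqrt(1 - eps^2).
    """
    eps_z = math.sqrt(1.0 - eps_x ** 2 - eps_y ** 2)
    return focal_length * eps_x / eps_z, focal_length * eps_y / eps_z


def passes_aperture(eps_x, eps_y, focal_length, aperture_radius):
    """True if the focused spot falls inside the circular focal-plane aperture."""
    x, y = focal_plane_point(eps_x, eps_y, focal_length)
    return math.hypot(x, y) <= aperture_radius
```

With a 0.1 m focal length and a 0.5 mm aperture radius, for example, a wave tilted by about 1 mrad passes while a wave tilted by about 50 mrad is blocked, which is the behavior of plane waves b versus a and c described above.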
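Using the notation of Eq. (4.51c), we note that direction-chopped radiation can contain only propagation vectors

$$\hat{\Omega}_{nm} = \hat{x}\,\varepsilon_{nx} + \hat{y}\,\varepsilon_{my} + \hat{z}\sqrt{1 - \varepsilon_{nx}^2 - \varepsilon_{my}^2}$$

that are nearly parallel to the optical axis. This means that both

$$|\varepsilon_{nx}| \ll 1 \quad \text{and} \quad |\varepsilon_{my}| \ll 1$$

for all values of $n$, $m$ in the sum over plane waves in Eqs. (4.51a) and (4.51b).
When these sums are transformed into double integrals in (4.53a) and (4.53b), the propagation vectors

$$\hat{\Omega}_{nm} = \hat{x}\,\varepsilon_{nx} + \hat{y}\,\varepsilon_{my} + \hat{z}\sqrt{1 - \varepsilon_{nx}^2 - \varepsilon_{my}^2} \quad \text{with} \quad |\varepsilon_{nx}| \ll 1 \ \text{and} \ |\varepsilon_{my}| \ll 1$$

become, according to Eqs. (4.54a) and (4.54c),

$$\hat{\Omega} = \hat{x}\,\varepsilon_x + \hat{y}\,\varepsilon_y + \hat{z}\sqrt{1 - \varepsilon^2}.$$

Functions

$$\vec{e}(\vec{\varepsilon}, \sigma) = \vec{e}(\varepsilon_x, \varepsilon_y, \sigma) \quad \text{and} \quad \vec{b}(\vec{\varepsilon}, \sigma) = \vec{b}(\varepsilon_x, \varepsilon_y, \sigma)$$

are negligible or zero in direction-chopped beams unless both

$$|\varepsilon_x| \ll 1 \quad \text{and} \quad |\varepsilon_y| \ll 1.$$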

Since the angle-wavenumber transforms $\vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma)$ and $\vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma)$ in (4.57a) and (4.57b) are proportional to $\vec{e}(\vec{\varepsilon}, \sigma)$ and $\vec{b}(\vec{\varepsilon}, \sigma)$, they should also be negligible or zero in direction-chopped beams when $\varepsilon_x$ and $\varepsilon_y$ are not both very small. Consequently, in Eqs. (4.58a) and (4.58b) we see, dropping the (rad) superscript, that the formulas for the E and B fields of the direction-chopped beam,

$$
\vec{E}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \; \vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - c t)} \tag{4.64a}
$$

and

$$
\vec{B}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \; \vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - c t)}, \tag{4.64b}
$$

have double integrals over $d^2\varepsilon = d\varepsilon_x\, d\varepsilon_y$ that must converge. For each $\varepsilon_x$, $\varepsilon_y$ pair of values inside the double integral, the tip of the $\hat{\Omega}$ vector can be thought of as lying somewhere inside the infinitesimal area $d^2\varepsilon = d\varepsilon_x\, d\varepsilon_y$ (see Fig. 4.15). As long as only direction-chopped beams where both $\varepsilon_x$ and $\varepsilon_y$ are small are being analyzed, this $d^2\varepsilon$ infinitesimal area must be approximately perpendicular to the direction in which $\hat{\Omega}$ points. Because $\hat{\Omega}$ is of unit length and $d^2\varepsilon$ is an infinitesimal area, the formula for the solid angle subtended by $d^2\varepsilon$ becomes

$$\frac{d^2\varepsilon}{|\hat{\Omega}|^2} = d^2\varepsilon.$$

Hence, $d^2\varepsilon$ can also be regarded as an infinitesimal solid angle, and the double integrals over $d^2\varepsilon$ can be interpreted as integrals over all the solid angles that specify allowed propagation directions inside the direction-chopped beam.

FIGURE 4.12. Two matched lenses can be used to create direction-chopped radiation. Only plane waves propagating at small angles to the optical axis can make it through the aperture in the focal plane of the lenses (see also Figs. 4.13 and 4.14). [Diagram labels: Lens A, Focal Plane with Aperture, Lens B, Optical Axis.]

FIGURE 4.13. Fig. 4.12 gives a three-dimensional view of how matched lenses can be used to create direction-chopped radiation, and this diagram is the side view. Plane waves propagating at large angles to the optical axis, like the a and c plane waves in the diagram, are removed from the beam because they focus outside the aperture in the focal plane (see also Fig. 4.14). [Diagram labels: plane waves a, b, c; Lens A, Focal Plane with Aperture, Lens B, Optical Axis.]

FIGURE 4.14. Just like the lenses in Fig. 4.13, two matched Cassegrain telescopes can be used to create direction-chopped radiation. Again plane waves propagating at large angles to the optical axis are removed from the beam because they focus outside the focal-plane aperture. [Diagram labels: plane waves a, b, c; Telescope A, Focal Plane with Aperture, Telescope B, Optical Axis.]

FIGURE 4.15. [Diagram labels: unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$; propagation vector $\hat{\Omega}$; components $\varepsilon_x$, $\varepsilon_y$; infinitesimal area element $d^2\varepsilon$.]
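
The quality of the approximation that treats $d^2\varepsilon$ as a solid angle can be estimated numerically. The sketch below assumes the standard geometric result that the exact solid-angle element in direction cosines is $d^2\varepsilon / \sqrt{1 - \varepsilon_x^2 - \varepsilon_y^2}$; this result is brought in for illustration rather than derived in the text, and the function name is hypothetical.

```python
import math


def solid_angle_ratio(eps_x, eps_y):
    """Ratio (exact solid-angle element) / (d^2 eps) for a given direction.

    Assumes the exact element is d^2 eps / eps_z, where
    eps_z = sqrt(1 - eps_x^2 - eps_y^2) is the z direction cosine.
    """
    eps_z = math.sqrt(1.0 - eps_x ** 2 - eps_y ** 2)
    return 1.0 / eps_z


# For a direction-chopped beam the ratio stays very close to one.
near_axis = solid_angle_ratio(0.01, 0.01)
far_off_axis = solid_angle_ratio(0.5, 0.5)
```

For direction cosines of order 0.01 the two measures differ by about one part in ten thousand, whereas for strongly tilted directions the difference is tens of percent, so the identification is safe only inside a direction-chopped beam.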

4.10 Time-Chopped and Band-Limited Radiation


When Michelson interferometers are used to measure spectra, the act of measurement must cover
a finite interval of time. If the radiation fields drop abruptly to zero outside this time interval, the
result of the measurement is not affected as long as the field values inside the time interval do not
change. Mathematically speaking, it is often convenient to analyze the situation as if the radiation
fields do indeed drop abruptly to zero before and after the measurement interval. We say these radiation fields are time-chopped. The formulas for the angle-wavenumber transforms of time-chopped radiation fields are [dropping the (rad) superscript from Eqs. (4.62c) and (4.62d)]

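$$
\vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma) = c\,\sigma^2 \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \; \vec{E}(\vec{\rho}, z, t)\, e^{-2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - c t)} \tag{4.65a}
$$

and

$$
\vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma) = c\,\sigma^2 \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \; \vec{B}(\vec{\rho}, z, t)\, e^{-2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - c t)}. \tag{4.65b}
$$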

Since the $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ radiation fields are assumed to be time-chopped, the integrals between $-\infty$ and $+\infty$ over time must be well defined and so converge. When the E and B fields are beam-chopped, the infinite double integrals over $d^2\rho$ are also well defined and converge [see the discussion after (4.63b)], so the multiple integrals defining $\vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma)$ and $\vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma)$ are well-defined quantities when $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ represent beam-chopped and time-chopped radiation fields. Similar reasoning shows that when the angle-wavenumber transforms are calculated using the three-dimensional Fourier transforms in Eqs. (4.63a) and (4.63b),

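$$
c\, w^{-2}\, \vec{\mathcal{E}}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \; \vec{E}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \tag{4.65c}
$$

and

$$
c\, w^{-2}\, \vec{\mathcal{B}}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \; \vec{B}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)}, \tag{4.65d}
$$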

the infinite integrals over $dt$ and $d^2\rho$ are for the same reasons well defined and convergent when $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ represent beam-chopped and time-chopped radiation fields.
The inverse transforms to Eqs. (4.65a) and (4.65b) are given in (4.64a) and (4.64b),

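$$
\vec{E}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \; \vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - c t)} \tag{4.66a}
$$

and

$$
\vec{B}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \; \vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - c t)}. \tag{4.66b}
$$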

Michelson interferometers, and indeed most types of optical instruments, usually shield their detectors with filters that pass only the radiation wavelengths that the detectors are designed to measure. Hence, in (4.66a) and (4.66b), we expect $\vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma)$ and $\vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma)$ to be negligible for wavenumbers $\sigma$ corresponding to radiation wavelengths blocked by the filters. The filters are said
to define the spectral band (or bands) to which the instrument is sensitive. Even when these filters
are built into the detectors themselves, which means the actual radiation fields traversing the
instrument may contain out-of-band radiation, it is mathematically convenient to assume that only
negligible amounts of out-of-band radiation are present inside the instrument (while, of course,
retaining the correct amounts of in-band radiation measured by the detectors). The situation is
very similar to that encountered in the discussion of time-chopped radiation fields; just as we
assume the absence of radiation outside the time interval during which the measurement occurs,
so now we assume the absence of out-of-band radiation to which the detectors are insensitive.
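We must be careful to note which wavenumbers $\sigma$ correspond to the radiation band passed by the filters. Remembering that the wavenumber is one over the wavelength, and reviewing how the original sums over $\sigma_\ell$ in Eqs. (4.44a) and (4.44b) become integrals over $\sigma$ in Eqs. (4.46a) and (4.46b), we see that if only wavelengths $\lambda$ between $\lambda_a$ and $\lambda_b$ are measured by the detectors,

$$0 < \lambda_b \leq \lambda \leq \lambda_a, \tag{4.67a}$$

then $\vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma)$ and $\vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma)$ can be non-negligible only for $\sigma$ values inside the two intervals

$$-\infty < -\frac{1}{\lambda_b} \leq \sigma \leq -\frac{1}{\lambda_a} < 0$$

and

$$0 < \frac{1}{\lambda_a} \leq \sigma \leq \frac{1}{\lambda_b} < \infty.$$

These intervals can also be written as

$$-\infty < -\sigma_b \leq \sigma \leq -\sigma_a < 0 \tag{4.67b}$$

and

$$0 < \sigma_a \leq \sigma \leq \sigma_b < \infty, \tag{4.67c}$$

where

$$\sigma_a = \frac{1}{\lambda_a} \tag{4.67d}$$

and

$$\sigma_b = \frac{1}{\lambda_b}. \tag{4.67e}$$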
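
The bookkeeping in Eqs. (4.67a)–(4.67e) is easy to get backwards, because the longer wavelength $\lambda_a$ gives the smaller wavenumber $\sigma_a$. The following Python sketch computes the two wavenumber intervals from a wavelength band; the function names and the sample band are assumptions made for illustration.

```python
def wavenumber_intervals(lambda_a, lambda_b):
    """Return the negative and positive sigma intervals of Eqs. (4.67b), (4.67c).

    lambda_a is the longest and lambda_b the shortest wavelength passed by
    the filters, so that 0 < lambda_b <= lambda_a as in Eq. (4.67a).
    """
    if not 0 < lambda_b <= lambda_a:
        raise ValueError("require 0 < lambda_b <= lambda_a")
    sigma_a = 1.0 / lambda_a  # Eq. (4.67d)
    sigma_b = 1.0 / lambda_b  # Eq. (4.67e)
    return (-sigma_b, -sigma_a), (sigma_a, sigma_b)


def in_band(sigma, lambda_a, lambda_b):
    """True if sigma lies in either interval where the transforms may be non-negligible."""
    negative, positive = wavenumber_intervals(lambda_a, lambda_b)
    return negative[0] <= sigma <= negative[1] or positive[0] <= sigma <= positive[1]
```

With wavelengths given in centimeters, the intervals come out in the cm⁻¹ units customary in spectroscopy; a band from 8 to 12 micrometers, for instance, corresponds to `wavenumber_intervals(12e-4, 8e-4)`.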


A band-limited function $g(t)$ is a function whose Fourier transform

$$G(f) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\, g(t)\, dt$$

becomes strictly zero when $|f| > F$ for some positive value of $F$. There is a well-known theorem that states that when a function $g(t)$ is time-chopped, meaning that there exists some positive value of $T$ such that $g(t)$ is strictly zero whenever $|t| > T$, then there is no value of $F$ such that the Fourier transform $G(f)$ becomes strictly zero whenever $|f| > F$. In short, a nonzero function cannot be both time-chopped and band-limited.63 If the angle-wavenumber transforms $\vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma)$ and $\vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma)$ are taken to be strictly zero for wavenumbers $\sigma$ outside the intervals specified in (4.67b) and (4.67c), then, because Eqs. (4.65c) and (4.65d) show $\vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma)$ and $\vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma)$ to be proportional to the Fourier transforms of $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$, functions $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ must be band-limited functions. Therefore, according to the just-mentioned theorem, we cannot say that functions $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ are both band-limited and time-chopped. Unfortunately, we have just said in the previous two paragraphs that we expect $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ to be just that: both band-limited and time-chopped. The loophole in this situation is that Fourier transforms $G(f)$ can be negligibly small without becoming strictly zero, allowing us to create time-chopped functions $g(t)$ whose Fourier transforms $G(f)$ are only approximately zero when $|f| > F$ for some positive value of $F$. Hence it is possible for $g(t)$ to be exactly time-chopped and approximately band-limited.64 Similarly, we are free to regard the angle-wavenumber transforms $\vec{\mathcal{E}}$ and $\vec{\mathcal{B}}$ as being negligibly small rather than strictly zero for values of $\sigma$ representing out-of-band radiation when $\vec{E}$ and $\vec{B}$ represent strictly time-chopped radiation fields. Hence it does make sense to treat the radiation fields as both time-chopped and approximately band-limited, taking the $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ fields to be strictly zero for all times $t$ outside the measurement interval and assuming the angle-wavenumber transforms $\vec{\mathcal{E}}(\vec{\varepsilon}, z, \sigma)$ and $\vec{\mathcal{B}}(\vec{\varepsilon}, z, \sigma)$ to be negligible or zero for all wavenumbers $\sigma$ lying outside the intervals specified in (4.67b) and (4.67c).

63
Athanasios Papoulis, Signal Analysis, p. 188.
64
We can also create functions g(t) that are exactly band-limited and approximately time-chopped.
The same mathematical point, by the way, comes up when analyzing the relationship of beam-chopped and direction-chopped $\vec{E}$ and $\vec{B}$ radiation fields. For this reason we have been careful
in the previous section to say that beam-chopped radiation fields are negligible or zero, instead of
strictly zero, for positions outside the beam and that direction-chopped radiation fields have
angle-wavenumber transforms that are negligible or zero, instead of strictly zero, for those
propagation vectors removed from the beam. This allows the beam passing through the
interferometer to be both direction-chopped and beam-chopped without getting into mathematical
difficulties.
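
The simplest time-chopped function, a unit-height rectangular pulse that is zero for $|t| > T$, illustrates the distinction between "negligibly small" and "strictly zero." Its Fourier transform is $G(f) = \sin(2\pi f T)/(\pi f)$, a standard result that is not derived in this section. The Python sketch below uses hypothetical function names to show that $|G(f)|$ is bounded by an envelope $1/(\pi |f|)$ that keeps shrinking, while nonzero values of $G(f)$ can still be found above any proposed cutoff $F$.

```python
import math


def rect_transform(f, T):
    """Fourier transform of a unit-height pulse that is zero for |t| > T."""
    if f == 0.0:
        return 2.0 * T
    return math.sin(2.0 * math.pi * f * T) / (math.pi * f)


def envelope(f):
    """Upper bound 1 / (pi |f|) on |rect_transform(f, T)| for f != 0."""
    return 1.0 / (math.pi * abs(f))


def nonzero_sample_above(F, T):
    """Return a frequency f > F, and G(f), at which the transform is nonzero.

    With F * T an integer and f = F + 1/(4T), sin(2 pi f T) equals one,
    so G(f) sits on the envelope instead of at a zero crossing.
    """
    f = F + 0.25 / T
    return f, rect_transform(f, T)
```

The pulse is therefore exactly time-chopped but only approximately band-limited: the out-of-band part of its transform can be made as small as desired by raising the cutoff, yet it never vanishes identically.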

4.11 Top-Level Description of a Standard Michelson Interferometer


Although the operation of a standard Michelson interferometer was described in some detail in
Chapter 1, it does no harm to review the basic setup before beginning a more rigorous analysis of
how electromagnetic radiation passes through the instrument. Figure 4.16 is a top view of a
standard Michelson interferometer and Fig. 4.17 is a perspective drawing of the same
interferometer configuration. In Fig. 4.16 the radiation whose spectrum is to be measured enters
the system traveling along the z axis, with the beam splitter partially reflecting and partially
transmitting the incident beam. Rays entering the system split at the beam splitter into reflected
rays shown with dashed lines and transmitted rays shown with solid lines. The reflected rays
travel down the moving-mirror arm to the moving mirror that reflects them back to the beam
splitter. The transmitted rays travel out the fixed-mirror arm through the compensator plate to the
fixed mirror that reflects them back to the beam splitter. When both sets of rays return to the
beam splitter, they are again partially reflected and partially transmitted. The rays from the
moving-mirror arm (partially transmitted by the beam splitter) and the rays from the fixed-mirror
arm (partially reflected by the beam splitter) then travel up the z axis, combining to produce the
balanced radiation field recorded by the detector. Figure 4.16 shows neither the rays from the
moving-mirror arm that are partially reflected nor the rays from the fixed-mirror arm that are
partially transmitted, because these rays end up going back out the way they came in. Following
the convention introduced in Chapter 1, the field produced by the rays going to the
interferometer’s detector is called the balanced radiation field, and the field produced by the rays
going back out the way they came in is called the unbalanced radiation field.
The beam splitter’s partial transmissions and reflections typically occur in a thin layer of
material shown as a dark line in Fig. 4.16. The thin layer lies on the right side of a transparent
block called the beam-splitter substrate. Even though the substrate material is transparent, it does
absorb a small fraction of the electromagnetic radiation traveling through it; and, as discussed in
Appendix 4E, plane waves passing through the substrate material undergo a phase shift. The fraction absorbed usually depends on the wavenumber $\sigma$ and can also depend to some extent on the radiation’s angle of incidence and whether it consists of s-type or p-type plane waves. The phase shift is strongly dependent on the angle of incidence and $\sigma$. It can also depend on whether s-type or p-type plane waves are passing through the substrate material. Appendix 4E introduces six complex parameters $\gamma_s^{(a)}$, $\gamma_p^{(a)}$, $\gamma_s^{(b)}$, $\gamma_p^{(b)}$, $\gamma_s^{(c)}$, $\gamma_p^{(c)}$ to describe the passage of radiation through the two optical elements (the beam-splitter substrate and the compensator plate) that are made from the beam-splitter substrate material.
When the moving mirror in Fig. 4.16 is farther from the beam splitter than the fixed mirror, the rays in the moving-mirror arm travel a longer distance down and back than the rays in the fixed-mirror arm. Just as in Eq. (1.15b) of Chapter 1, we call this extra distance the optical-path difference (OPD) and represent it by the variable $\chi$. There is, of course, a position of the moving mirror for which the OPD is zero, shown by the dash-dot line in Fig. 4.16. When the moving mirror is closer to the beam splitter than this dash-dot line, the OPD value $\chi$ is taken to be negative. Just as in Sec. 1.4 of Chapter 1, the position of the dash-dot line is called the zero-path difference (ZPD) position. When the OPD is $\chi$ for the interferometer setup shown in Fig. 4.16, the moving mirror is a distance $\chi/2$ from its ZPD position.
Section 1.7 of Chapter 1 shows that there are many different ways to build a Michelson interferometer, and for some setups the moving mirror is not $\chi/2$ from its ZPD position when the OPD is $\chi$ [see, for example, Fig. 1.19(d)]. The interferometer signal in the ideal case does not, however, depend directly on the interferometer setup but rather on the OPD value $\chi$ generated by the setup. For this reason, it makes sense to unfold the interferometer as shown in Fig. 4.18. Now we see only what is common to Michelson interferometers of all configurations: the distance traveled along one path through the interferometer differs by $\chi$ from the distance traveled along the other path through the interferometer.
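
For the standard setup of Fig. 4.16, the relation between mirror position and OPD can be written as a pair of one-line functions. The Python sketch below is illustrative: the function names are hypothetical, the factor of two applies only to setups in which the light travels down and back along the mirror displacement, and the phase formula $2\pi\sigma\chi$ assumes an on-axis monochromatic plane wave with the phase convention $2\pi\sigma(\hat{\Omega} \cdot \vec{r} - ct)$ used in this chapter.

```python
import math


def opd_from_mirror_offset(offset_from_zpd):
    """OPD chi for the standard setup: the extra path is traversed twice.

    A negative offset (mirror closer to the beam splitter than the ZPD
    position) gives a negative chi, matching the sign convention above.
    """
    return 2.0 * offset_from_zpd


def mirror_offset_from_opd(chi):
    """Distance of the moving mirror from its ZPD position, chi / 2."""
    return chi / 2.0


def arm_phase_difference(sigma, chi):
    """Phase difference (radians) between the two arms for an on-axis wave."""
    return 2.0 * math.pi * sigma * chi
```

As a usage example, a mirror offset of 0.0005 cm gives an OPD of 0.001 cm, which at a wavenumber of 1000 cm⁻¹ corresponds to exactly one wavelength of extra path.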

4.12 Monochromatic Plane Waves and Michelson Interferometers


Consider a single monochromatic plane wave characterized by the incident propagation vector $\hat{\Omega}_{[i]}$ shown in Fig. 4.19. Vector $\hat{\Omega}_{[i]}$ is drawn at a greatly exaggerated angle $\theta_b$ with respect to the $\hat{z}_{[i]}$ axis, which is here the same as the optical axis; almost all interferometers are designed to make angle $\theta_b$ small, requiring the propagation vector of any monochromatic plane wave reaching the detector to be nearly parallel to the optical axis. In Fig. 4.19, the part of the $\hat{\Omega}_{[i]}$ plane wave that first transmits through the beam splitter and then, coming back from the fixed mirror, reflects off the beam splitter, ends up going toward the detector with propagation vector $\hat{\Omega}$. The part of the $\hat{\Omega}_{[i]}$ plane wave that first reflects off the beam splitter and then, coming back from the moving mirror, transmits through the beam splitter, ends up going toward the detector with propagation vector $\hat{\Omega}_d$. Vectors $\hat{\Omega}$ and $\hat{\Omega}_d$ are not the same because we allow the moving mirror to be tilted slightly out of alignment, producing a very small angle $\theta_d$ between the two propagation vectors. Angle $\theta_d$ between $\hat{\Omega}$ and $\hat{\Omega}_d$ is greatly exaggerated in Fig. 4.19; the $\hat{\Omega}_d$ unit vector is drawn much shorter than the $\hat{\Omega}$ unit vector, using perspective to show there is no reason to expect $\theta_d$ and $\theta_b$ to be co-planar angles.

FIGURE 4.16. [Top view of a standard Michelson interferometer. Diagram labels: Input Radiance, Beam Splitter, Compensator Plate, Fixed Mirror, Moving Mirror, ZPD Position, $\chi/2$; input axes $\hat{x}_{[i]}$, $\hat{y}_{[i]}$, $\hat{z}_{[i]}$ and output axes $\hat{x}$, $\hat{y}$.]

FIGURE 4.17. [Perspective drawing of the same interferometer. Diagram labels: Input Radiance, Moving Mirror, Beam Splitter, Compensator Plate, Fixed Mirror, To Detector; input axes $\hat{x}_{[i]}$, $\hat{y}_{[i]}$, $\hat{z}_{[i]}$.]

FIGURE 4.18. [Unfolded interferometer. Diagram labels: First Pass through the Beam Splitter, Second Pass through the Beam Splitter, Fixed-Mirror Arm, Moving-Mirror Arm; axes $\hat{x}_{[i]}$, $\hat{y}_{[i]}$, $\hat{z}_{[i]}$ and $\hat{x}$, $\hat{y}$, $\hat{z}$.] $\chi$ = extra distance (OPD) traveled in the moving-mirror arm.

FIGURE 4.19. Angles $\theta_b$ and $\theta_d$ are drawn much larger than they actually are. Note that propagation vector $\hat{\Omega}$, propagation vector $\hat{\Omega}_d$, and the optical axis do not necessarily all lie in the same plane. [Diagram annotations: The input plane wave propagates in direction $\hat{\Omega}_{[i]}$ at an angle $\theta_b$ with respect to the optical axis. The slightly tilted moving mirror causes the reflected plane wave to propagate in the $\hat{\Omega}_d$ direction (dashed arrow) instead of the $\hat{\Omega}$ direction (solid arrow). Other labels: Beam Splitter, Compensator Plate, Fixed Mirror.]
Michelson interferometers are, of course, designed to keep $\theta_d$ small, and as a general rule they do not work well unless $\theta_d$ is much less than the typical angle $\theta_b$ between the plane wave’s propagation vector and the optical axis,

$$\theta_d \ll \theta_b. \tag{4.68}$$

As is pointed out at the end of Appendix 4E, angle $\theta_d$ is so small that we expect neither the amplitude nor the phase shifts of monochromatic plane waves propagating through the beam-splitter substrate to be affected by it.
We note that when $\theta_b = \theta_d = 0$, the plane of incidence65 is the same for all reflections and transmissions through the beam splitter in Figs. 4.16, 4.17, and 4.19. Both the $\hat{x}_{[i]}$ and $\hat{x}$ unit vectors are normal to this plane of incidence; indeed, they are the same unit vector. If we unfold the interferometer as shown in Fig. 4.18, the

$$(\hat{x}_{[i]}, \hat{y}_{[i]}, \hat{z}_{[i]}) \quad \text{and} \quad (\hat{x}, \hat{y}, \hat{z})$$

coordinate systems are brought into alignment, with $(\hat{y}_{[i]}, \hat{y})$ and $(\hat{z}_{[i]}, \hat{z})$ also becoming the same unit vectors. Now the only difference between the two coordinate systems is the location of their origins, with the $(\hat{x}_{[i]}, \hat{y}_{[i]}, \hat{z}_{[i]})$ system having its origin on the optical axis of the input beam approaching the beam splitter and the $(\hat{x}, \hat{y}, \hat{z})$ system having its origin on the optical axis of the output beam traveling from the beam splitter to the detector. This means the two coordinate systems are essentially equivalent, allowing us to discard one and keep the other. For the rest of this chapter, we work with the unfolded interferometer and use only the $(\hat{x}, \hat{y}, \hat{z})$ coordinate system to represent the plane waves in the input beam, the fixed-mirror arm, the moving-mirror arm, and the output beam traveling from the beam splitter to the detector.
When $\theta_b$ is not zero, the tunnel-diagram analysis performed in Figs. 4E.4(a) and 4E.4(b) of Appendix 4E shows that vector $\hat{\Omega}_{[i]}$ must have the same angles with respect to $(\hat{x}_{[i]}, \hat{y}_{[i]}, \hat{z}_{[i]})$ that vector $\hat{\Omega}$ has with respect to $(\hat{x}, \hat{y}, \hat{z})$; in particular, angle $\theta_b$ is the same in both the input and output coordinate systems. Vector $\hat{\Omega}_d$ and its associated angle $\theta_d$, on the other hand, are defined in the output $(\hat{x}, \hat{y}, \hat{z})$ coordinate system after reflection off the slightly misaligned moving mirror but not, of course, in the input $(\hat{x}_{[i]}, \hat{y}_{[i]}, \hat{z}_{[i]})$ coordinate system.

65
The plane of incidence of a reflected or transmitted monochromatic plane wave is defined in Sec. 4.5 above.
From the work done in Secs. 4.3 and 4.4, we know that the input plane wave can be written using the real part of

$$\vec{E}_0\, e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)} \tag{4.69a}$$

to represent the wave’s E field and the real part of

$$\frac{1}{c} \left( \hat{\Omega} \times \vec{E}_0 \right) e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)} \tag{4.69b}$$

to represent the wave’s B field when angle $\theta_b$ is small. These formulas come from dropping the $\ell$, $j$ subscripts from Eqs. (4.16a) and (4.16b) and using (4.16c) to substitute for the B vector. In (4.69a) and (4.69b), parameter $\vec{E}_0$ is a constant complex vector, and the convention of the unfolded interferometer is used to replace the propagation vector $\hat{\Omega}_{[i]}$ by $\hat{\Omega}$ when describing the input plane wave. The wavenumber $\sigma$ is taken to be positive. According to Eq. (4.16c), the complex $\vec{E}_0$ vector satisfies

$$\vec{E}_0 \cdot \hat{\Omega} = 0. \tag{4.69c}$$

The work done in Sec. 4.3 shows that this means the plane wave’s real E field is always perpendicular to the direction of propagation $\hat{\Omega}$.
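
The transversality condition (4.69c) and the B-field amplitude in (4.69b) can be checked numerically with plain Python. In the sketch below, the helper names are hypothetical, the sample direction cosines and amplitudes are arbitrary, and the z component of $\vec{E}_0$ is chosen so that $\vec{E}_0 \cdot \hat{\Omega} = 0$ holds by construction; these are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # vacuum speed of light in m/s (assumed unit system)


def dot(a, b):
    """Unconjugated dot product, as in Eq. (4.69c)."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-component (possibly complex) vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def propagation_vector(eps_x, eps_y):
    """Unit propagation vector with direction cosines eps_x, eps_y and eps_z > 0."""
    return (eps_x, eps_y, math.sqrt(1.0 - eps_x ** 2 - eps_y ** 2))


def transverse_amplitude(e0x, e0y, omega):
    """Complete (e0x, e0y) with the z component that makes E0 . Omega = 0."""
    e0z = -(e0x * omega[0] + e0y * omega[1]) / omega[2]
    return (e0x, e0y, e0z)


def b_amplitude(omega, e0, c=C):
    """Complex B-field amplitude (1/c) Omega x E0 from Eq. (4.69b)."""
    return tuple(component / c for component in cross(omega, e0))
```

Evaluating `dot(b_amplitude(omega, e0), omega)` and `dot(b_amplitude(omega, e0), e0)` for any such pair returns zero to rounding error, so the B amplitude is perpendicular to both the propagation direction and the E amplitude.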
Since $\theta_b$ is small, we know that plane waves entering the interferometer must be propagating parallel to, or nearly parallel to, the z axis. When, as in Fig. 4.19, $\hat{\Omega}$ is tilted at a nonzero small angle $\theta_b$ to the z axis, it follows that the real E field must have a small component along the z axis. According to Fig. 4.20, the real E-field component along the z axis must be on the order of $\sin\theta_b$. Since $\theta_b$ is a small angle, we have

$$O(\sin\theta_b) = O(\theta_b). \tag{4.70}$$


Writing the complex constant vector $\vec{E}_0$ in terms of its complex $(\hat{x}, \hat{y}, \hat{z})$ components,

$$\vec{E}_0 = \hat{x} E_{0x} + \hat{y} E_{0y} + \hat{z} E_{0z}, \tag{4.71a}$$

we note that the real E field of the monochromatic plane wave must be

$$
\begin{aligned}
\mathrm{Re}\!\left[\vec{E}_0\, e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)}\right] = {}& \hat{x}\, \mathrm{Re}\!\left[E_{0x}\, e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)}\right] \\
&+ \hat{y}\, \mathrm{Re}\!\left[E_{0y}\, e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)}\right] \\
&+ \hat{z}\, \mathrm{Re}\!\left[E_{0z}\, e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)}\right].
\end{aligned}
\tag{4.71b}
$$

Looking at the special point in space $\vec{r} = 0$ at time $t = 0$, we see that, according to Fig. 4.20,

$$\mathrm{Re}[E_{0z}] = O(\theta_b). \tag{4.71c}$$

FIGURE 4.20. [Diagram labels: vector $\vec{E}$, unit vector $\hat{\Omega}$, unit vector $\hat{z}$, angle $\theta_b$.]

ˆ G
The imaginary part of E0 z e 2π iσ ( Ω•r −ct ) has no physical relevance, so it can also be specified as
G G
O(θb ) at point r = 0 when t = 0 . This means the formula for E0 can be written as
G
E0 = xE ˆ 0 y + zˆ[O(θb ) + iO(θb )] .
ˆ 0 x + yE (4.71d)

We now introduce the symbol

$$\underline{O}(\theta_b) = O(\theta_b) + iO(\theta_b) \tag{4.71e}$$

as a notational convenience to describe a complex scalar whose real and imaginary parts are both
O(θb). Then Eq. (4.71d) can be written as

$$\vec{E}_0 = \hat{x}E_{0x} + \hat{y}E_{0y} + \hat{z}\,\underline{O}(\theta_b) . \tag{4.71f}$$

The $\underline{O}(\theta_b)$ symbol, like the O(θb) symbol, is an algebraic “black hole” absorbing other finite
algebraic quantities. Some of the formal rules for manipulating $\underline{O}(\theta_b)$ are that

$$a\,\underline{O}(\theta_b) + b\,\underline{O}(\theta_b) = \underline{O}(\theta_b) \tag{4.72a}$$

for any two finite complex scalars a and b; that

$$\underline{O}(\theta_b)\cdot e^{i\alpha} = \underline{O}(\theta_b) \tag{4.72b}$$

for any real parameter α; and, of course, that

$$O(\theta_b) = \underline{O}(\theta_b) . \tag{4.72c}$$
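
The absorbing behavior of these rules can be illustrated numerically. The following is a minimal sketch, assuming one particular small complex number stands in for the underscored symbol; the helper `parts_bound` and the scalars `a`, `b` are hypothetical and chosen only for illustration.

```python
import cmath
import math

def parts_bound(z):
    """Largest of |Re z| and |Im z| for a complex number z."""
    return max(abs(z.real), abs(z.imag))

theta_b = 1.0e-3
# Hypothetical complex scalar whose real and imaginary parts are both of order theta_b
small = complex(0.7 * theta_b, -0.4 * theta_b)
# Hypothetical finite complex scalars
a, b = 3.0 - 2.0j, -1.5 + 0.5j

# Rule (4.72b): multiplying by a phase factor keeps both parts of order theta_b
rotated = [small * cmath.exp(1j * 2.0 * math.pi * k / 12) for k in range(12)]
worst_rotated = max(parts_bound(z) for z in rotated)

# Rule (4.72a): a finite linear combination is still of order theta_b
combined = a * small + b * small

print(worst_rotated / theta_b, parts_bound(combined) / theta_b)
```

Both printed ratios stay finite as θb shrinks, which is all the order symbol is meant to record.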


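From Eqs. (4.54a), (4.54c), and (4.54d) we have

$$\hat{\Omega} = \vec{\varepsilon} + \hat{z}\sqrt{1-\varepsilon^2} = \hat{x}\varepsilon_x + \hat{y}\varepsilon_y + \hat{z}\sqrt{1-\varepsilon_x^2-\varepsilon_y^2} .$$

Clearly, both ε_x and ε_y are O(sin θb) = O(θb) when Ω̂ is nearly parallel to the optical axis
(see Fig. 4.20), so

$$\hat{\Omega} = \hat{z} + \hat{x}\,O(\theta_b) + \hat{y}\,O(\theta_b) , \tag{4.73a}$$

where

$$\hat{z}\sqrt{1-\varepsilon_x^2-\varepsilon_y^2} \cong \hat{z}\left(1 - \frac{\varepsilon_x^2}{2} - \frac{\varepsilon_y^2}{2}\right) \cong \hat{z} ,$$

neglecting terms of O(θb²). From Eqs. (4.71f) and (4.73a) we have, again neglecting terms of
O(θb²), that

$$\begin{aligned}
\hat{\Omega}\times\vec{E}_0 &= [\hat{z} + \hat{x}\,O(\theta_b) + \hat{y}\,O(\theta_b)] \times [\hat{x}E_{0x} + \hat{y}E_{0y} + \hat{z}\,\underline{O}(\theta_b)] \\
&= \hat{y}E_{0x} - \hat{x}E_{0y} + \hat{z}\,\underline{O}(\theta_b) .
\end{aligned} \tag{4.73b}$$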
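
The first-order cross product in Eq. (4.73b) can be checked numerically. The following is a minimal sketch, assuming a fixed azimuth of 0.4 rad for the tilt and hypothetical amplitudes `e0x`, `e0y`; the function names are illustrative only.

```python
import math

def cross(u, v):
    """Vector cross product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def omega_hat(eps_x, eps_y):
    """Unit propagation vector x*eps_x + y*eps_y + z*sqrt(1 - eps_x^2 - eps_y^2)."""
    return (eps_x, eps_y, math.sqrt(1.0 - eps_x ** 2 - eps_y ** 2))

def approximation_error(theta_b, e0x=1.0 + 0.5j, e0y=-0.3 + 0.8j):
    """Largest component of (Omega x E0) - (y*E0x - x*E0y)."""
    eps_x = math.sin(theta_b) * math.cos(0.4)   # assumed azimuth of the tilt
    eps_y = math.sin(theta_b) * math.sin(0.4)
    omega = omega_hat(eps_x, eps_y)
    # Choose E0z so that E0 . Omega_hat = 0, as required by Eq. (4.69c)
    e0z = -(e0x * eps_x + e0y * eps_y) / omega[2]
    exact = cross(omega, (e0x, e0y, e0z))
    approx = (-e0y, e0x, 0.0)                   # y*E0x - x*E0y
    return max(abs(p - q) for p, q in zip(exact, approx))

errors = {t: approximation_error(t) for t in (1e-2, 1e-3, 1e-4)}
for t, err in errors.items():
    print(t, err, err / t)
```

The ratio `err / t` settles to a constant, so the neglected part is first order in θb and sits mainly in the ẑ component, as Eq. (4.73b) states.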

We next introduce the symbol $\vec{\underline{O}}(\theta_b)$ to represent a small complex vector, each of whose
(x̂, ŷ, ẑ) components is $\underline{O}(\theta_b)$. The symbol $\vec{\underline{O}}(\theta_b)$ is another algebraic black hole. We note that

$$a\,\vec{\underline{O}}(\theta_b) + b\,\vec{\underline{O}}(\theta_b) = \vec{\underline{O}}(\theta_b) \tag{4.74a}$$

for any two finite complex scalars a and b, that

$$\vec{\underline{O}}(\theta_b)\cdot e^{i\alpha} = \vec{\underline{O}}(\theta_b) \tag{4.74b}$$

for any real parameter α, and that

$$\vec{c}\bullet\vec{\underline{O}}(\theta_b) = \underline{O}(\theta_b) \tag{4.74c}$$

and

$$\vec{c}\times\vec{\underline{O}}(\theta_b) = \vec{\underline{O}}(\theta_b) \tag{4.74d}$$

for the vector dot and cross products with any finite complex vector $\vec{c}$. The underscore in the
symbol $\vec{\underline{O}}(\theta_b)$ can be dropped to give $\vec{O}(\theta_b)$, with this new symbol indicating a strictly real
vector, each of whose real (x̂, ŷ, ẑ) components is O(θb). Then for any two real scalars a and b,
we have

$$a\,\vec{O}(\theta_b) + b\,\vec{O}(\theta_b) = \vec{O}(\theta_b) , \tag{4.75a}$$

for any real parameter α, we have

$$\vec{O}(\theta_b)\cdot e^{i\alpha} = \vec{\underline{O}}(\theta_b) , \tag{4.75b}$$

and we have

$$\vec{c}\bullet\vec{O}(\theta_b) = \underline{O}(\theta_b) \tag{4.75c}$$

and

$$\vec{c}\times\vec{O}(\theta_b) = \vec{\underline{O}}(\theta_b) \tag{4.75d}$$

for the vector dot product and vector cross product with any finite complex vector $\vec{c}$. If $\vec{c}$ is a
finite real vector, we can, of course, drop the underscore on the right-hand sides of (4.75c) and
(4.75d) to show that the resulting small quantities must also be strictly real. The $\vec{O}(\theta_b)$ symbol
can be used to write Ω̂ in (4.73a) as

$$\hat{\Omega} = \hat{z} + \vec{O}(\theta_b) \tag{4.76a}$$

and, of course, the $\vec{\underline{O}}(\theta_b)$ symbol can be used to write the complex vectors in (4.71f) and (4.73b)
as

$$\vec{E}_0 = \hat{x}E_{0x} + \hat{y}E_{0y} + \vec{\underline{O}}(\theta_b) \tag{4.76b}$$

and

$$\hat{\Omega}\times\vec{E}_0 = \hat{y}E_{0x} - \hat{x}E_{0y} + \vec{\underline{O}}(\theta_b) . \tag{4.76c}$$

Substituting Eqs. (4.76b) and (4.76c) into the expressions for the complex E and B fields in
(4.69a) and (4.69b) gives, when angle θb is small,

$$\text{Complex E field} = (\hat{x}E_{0x} + \hat{y}E_{0y})\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) \tag{4.77a}$$

and

$$\text{Complex B field} = \frac{1}{c}\,(\hat{y}E_{0x} - \hat{x}E_{0y})\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) , \tag{4.77b}$$

where (4.74b) is used to simplify the final results and Ω̂ is given by Eq. (4.76a). If the $\vec{\underline{O}}(\theta_b)$
terms in (4.77a) and (4.77b) and the $\vec{O}(\theta_b)$ terms in (4.76a) are all exactly zero, then the plane
wave’s propagation vector is strictly parallel to the ẑ optical axis; when they are not, the plane


wave is propagating in a slightly off-axis direction. Looking at how the interferometer is unfolded
going from Fig. 4.17 to Fig. 4.18, we see that if all the $\vec{\underline{O}}(\theta_b)$ and $\vec{O}(\theta_b)$ terms are exactly zero,
then the x̂ component of E is strictly perpendicular to the plane of incidence on the beam splitter
and the ŷ component of E is strictly parallel to the plane of incidence on the beam splitter. For
now, we assume that all the $\vec{\underline{O}}(\theta_b)$ and $\vec{O}(\theta_b)$ terms are exactly zero and analyze just plane
waves that propagate parallel to the optical axis; that is, just the on-axis plane waves. From the
work done in Secs. 4.5 and 4.6, we can then predict that the on-axis monochromatic plane
wavefield transmitted through the beam splitter is

$$\text{Complex E field} = [\hat{x}E_{0x}\, t_s\gamma_s^{(a)} + \hat{y}E_{0y}\, t_p\gamma_p^{(a)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} \tag{4.77c}$$

and

$$\text{Complex B field} = \frac{1}{c}\,[\hat{y}E_{0x}\, t_s\gamma_s^{(a)} - \hat{x}E_{0y}\, t_p\gamma_p^{(a)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} , \tag{4.77d}$$

where $\gamma_s^{(a)}$ is the complex parameter introduced in Appendix 4E that describes the passage of
s-type monochromatic plane waves on their first pass through the beam-splitter substrate, and $\gamma_p^{(a)}$
is the complex parameter from Appendix 4E describing the passage of p-type monochromatic
plane waves on their first pass through the beam-splitter substrate. Both $\gamma_s^{(a)}$ and $\gamma_p^{(a)}$ are
functions of σ and the plane wave’s angle of incidence on the substrate. The plane wave reflected
off the beam splitter after passing into and out of the substrate is

$$\text{Complex E field} = [\hat{x}E_{0x}\, r_s\gamma_s^{(ab)} + \hat{y}E_{0y}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} \tag{4.77e}$$

and

$$\text{Complex B field} = \frac{1}{c}\,[\hat{y}E_{0x}\, r_s\gamma_s^{(ab)} - \hat{x}E_{0y}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} . \tag{4.77f}$$

Here, we define

$$\gamma_s^{(ab)} = \gamma_s^{(a)}\cdot\gamma_s^{(b)} \tag{4.77g}$$

and

$$\gamma_p^{(ab)} = \gamma_p^{(a)}\cdot\gamma_p^{(b)} , \tag{4.77h}$$

where $\gamma_s^{(b)}$ is the complex parameter introduced in Appendix 4E that describes the second pass of
s-type monochromatic plane waves through the beam-splitter substrate and $\gamma_p^{(b)}$ is the complex
parameter from Appendix 4E that describes the second pass of p-type monochromatic plane
waves through the substrate. Like $\gamma_s^{(a)}$ and $\gamma_p^{(a)}$, the $\gamma_{s,p}^{(b)}$ complex parameters are functions of σ
and the plane wave’s angle of incidence. The complex parameters rs, rp, ts, tp describe what


happens in the thin beam-splitter layer in Fig. 4.16, where the partial transmission and partial
reflection of the radiation fields occur. Parameters rs and ts are the s-wave amplitude-reflection
and amplitude-transmission coefficients, and parameters rp and tp are the p-wave
amplitude-reflection and amplitude-transmission coefficients. Recognizing that the amount of reflection and
transmission can depend on both wavenumber and angle of incidence, we realize that these
coefficients must also be functions of σ and the angle of incidence on the substrate. For the
on-axis plane waves characterized by Eqs. (4.77c)–(4.77f), the angle of incidence on the
beam-splitter substrate must be the same as the angle of incidence φ made by the optical axis on the
beam splitter. The unfolded model of the interferometer in Fig. 4.18 lets us use the same symbol
ŷ for both the original and reflected ŷ unit vectors and also allows us to represent both the
transmitted and reflected propagation vectors by the same symbol Ω̂.
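
The on-axis bookkeeping of Eqs. (4.77c)–(4.77h) can be sketched in a few lines. This is only an illustration under stated assumptions: the numerical values of the coefficients and of the substrate parameters are hypothetical placeholders, not values from Appendix 4E, and the function name `split_on_axis` is invented for this sketch.

```python
import cmath

C = 299_792_458.0  # speed of light in m/s

def split_on_axis(e0x, e0y, coeff_s, coeff_p):
    """Transverse (x, y) amplitudes of the complex E and B fields.

    coeff_s multiplies the s-type (x) component and coeff_p the p-type (y)
    component; the common phase factor exp(2*pi*i*sigma*(...)) is omitted.
    """
    ex = e0x * coeff_s
    ey = e0y * coeff_p
    e_field = (ex, ey)
    b_field = (-ey / C, ex / C)      # (1/c) * (y_hat*ex - x_hat*ey)
    return e_field, b_field

# Hypothetical input amplitudes and beam-splitter parameters (illustrative only)
e0x, e0y = 1.0 + 0.2j, 0.4 - 0.7j
ts, tp = 0.70 + 0.05j, 0.72 - 0.02j
rs, rp = 0.68 - 0.10j, 0.66 + 0.08j
gs_a, gp_a = cmath.exp(0.30j), cmath.exp(0.35j)   # first substrate pass
gs_b, gp_b = cmath.exp(0.31j), cmath.exp(0.36j)   # second substrate pass

# Transmitted wave, Eqs. (4.77c) and (4.77d)
e_t, b_t = split_on_axis(e0x, e0y, ts * gs_a, tp * gp_a)
# Reflected wave, Eqs. (4.77e)-(4.77h), using gamma^(ab) = gamma^(a) * gamma^(b)
e_r, b_r = split_on_axis(e0x, e0y, rs * gs_a * gs_b, rp * gp_a * gp_b)

print(e_t, b_t)
print(e_r, b_r)
```

The s-type and p-type components are scaled independently, and the B amplitudes follow from the E amplitudes by the same (1/c) ẑ× rule used for the input wave.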
Now we consider what happens to the slightly off-axis plane waves where θb is no longer
exactly zero, which means that Ω̂ is at a slight angle to the optical axis. In this situation Fig. 4.21
shows that the angle of incidence on the beam splitter changes by an O(θb) amount from the
optical axis’s angle of incidence φ. We want to show that the transmitted and reflected
wavefields can now be written as

$$\text{Complex E field} = [\hat{x}E_{0x}\, t_s\gamma_s^{(a)} + \hat{y}E_{0y}\, t_p\gamma_p^{(a)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) \tag{4.78a}$$

and

$$\text{Complex B field} = \frac{1}{c}\,[\hat{y}E_{0x}\, t_s\gamma_s^{(a)} - \hat{x}E_{0y}\, t_p\gamma_p^{(a)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) \tag{4.78b}$$

for the wavefield transmitted through the beam splitter and as

$$\text{Complex E field} = [\hat{x}E_{0x}\, r_s\gamma_s^{(ab)} + \hat{y}E_{0y}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) \tag{4.79a}$$

and

$$\text{Complex B field} = \frac{1}{c}\,[\hat{y}E_{0x}\, r_s\gamma_s^{(ab)} - \hat{x}E_{0y}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) \tag{4.79b}$$

for the wavefield reflected from the beam splitter.


Figure 4.22 shows that when θb is not exactly zero, the Ω̂ vector defines a new, slightly tilted
plane of incidence of the plane wave on the beam splitter. We choose ŝ to be the unit vector
perpendicular to this new plane of incidence and note that

$$\hat{s} = \hat{x} + \vec{O}(\theta_b) . \tag{4.80a}$$

FIGURE 4.21. Unit vector Ω̂ at angle θb to unit vector ẑ, and unit vector ẑ at angle φ to the
beam-splitter surface-normal vector n̂. The size of the angle between Ω̂ and n̂ is φ + O(θb).
Vectors n̂, Ω̂, and ẑ do not necessarily all lie in the same plane.

The unit vector perpendicular to both Ω̂ and ŝ is given by [see Fig. 4.22 and Eq. (4.76a)]

$$\hat{p} = \hat{\Omega}\times\hat{s} = [\hat{z} + \vec{O}(\theta_b)] \times [\hat{x} + \vec{O}(\theta_b)] .$$

This becomes, gathering together the $\vec{O}(\theta_b)$ terms and neglecting the $[\vec{O}(\theta_b)]^2$ terms,

$$\hat{p} = \hat{y} + \vec{O}(\theta_b) . \tag{4.80b}$$

We take components along ŝ and p̂ of the complex $\vec{E}_0$ vector used to describe the
incident plane wave in Eqs. (4.69a) and (4.69b) to get [since ŝ, p̂, and Ω̂ are mutually
perpendicular unit vectors and the complex $\vec{E}_0$ vector is, according to (4.69c), strictly
perpendicular to Ω̂]

$$\vec{E}_0 = \hat{s}E_{0s} + \hat{p}E_{0p} . \tag{4.80c}$$

Here, E0s and E0p are two complex scalars representing the components of $\vec{E}_0$ along ŝ and p̂.
Substitution of (4.80a) and (4.80b) into (4.80c) gives

$$\vec{E}_0 = \hat{x}E_{0s} + \hat{y}E_{0p} + \vec{\underline{O}}(\theta_b) .$$

Comparing this expression to Eq. (4.76b) shows that

$$E_{0s} = E_{0x} + \underline{O}(\theta_b) \tag{4.80d}$$

and

$$E_{0p} = E_{0y} + \underline{O}(\theta_b) \tag{4.80e}$$

if the two formulas for $\vec{E}_0$ are to be consistent.
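
The tilted (ŝ, p̂, Ω̂) triad can be built explicitly to see how close it stays to (x̂, ŷ, ẑ). The following is a minimal sketch under stated assumptions: the surface normal n̂ is placed in the y-z plane at angle φ from ẑ (so that x̂ is perpendicular to the on-axis plane of incidence), the sign of ŝ is chosen so that ŝ reduces to x̂ on axis, and the angles `theta_b`, `psi`, and `phi` are hypothetical values.

```python
import math

def cross(u, v):
    """Vector cross product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def unit(u):
    n = norm(u)
    return tuple(c / n for c in u)

def tilted_basis(theta_b, psi, phi):
    """Return (s_hat, p_hat, omega_hat) for a wave tilted by theta_b at azimuth psi."""
    n_hat = (0.0, math.sin(phi), math.cos(phi))        # assumed orientation of n_hat
    omega = (math.sin(theta_b) * math.cos(psi),
             math.sin(theta_b) * math.sin(psi),
             math.cos(theta_b))
    s_hat = unit(cross(n_hat, omega))   # perpendicular to the plane of n_hat and omega
    p_hat = cross(omega, s_hat)         # p_hat = Omega_hat x s_hat
    return s_hat, p_hat, omega

theta_b, psi, phi = 1.0e-3, 0.3, math.pi / 4
s_hat, p_hat, omega = tilted_basis(theta_b, psi, phi)
dev_s = norm(tuple(a - b for a, b in zip(s_hat, (1.0, 0.0, 0.0))))
dev_p = norm(tuple(a - b for a, b in zip(p_hat, (0.0, 1.0, 0.0))))
print(dev_s / theta_b, dev_p / theta_b)
```

Both deviations scale with θb, which is the content of Eqs. (4.80a) and (4.80b); the constants in front depend on the assumed φ and azimuth.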
Using the relationships Ω̂ × ŝ = p̂ and Ω̂ × p̂ = −ŝ from Fig. 4.22, we substitute (4.80c) into
(4.69a) and (4.69b) to write the incident wave as

$$\text{Complex E field} = (\hat{s}E_{0s} + \hat{p}E_{0p})\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)}$$

and

$$\text{Complex B field} = \frac{1}{c}\,(\hat{p}E_{0s} - \hat{s}E_{0p})\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} .$$

FIGURE 4.22. Unit vectors x̂, ŷ, ẑ, ŝ, p̂, and Ω̂ together with the beam-splitter surface-normal
vector n̂. The new, slightly tilted plane of incidence contains the Ω̂ and n̂ vectors. Angle θb
and two angles of size O(θb) are marked.

In effect, the original (x̂, ŷ, ẑ) coordinate system is replaced by the slightly tilted (ŝ, p̂, Ω̂)
coordinate system with E0s and E0p playing the role of E0x and E0y. Thus it has now been
shown that we can make the $\vec{\underline{O}}(\theta_b)$ terms in (4.77a) and (4.77b) equal to zero by replacing
[(x̂, ŷ), E0x, E0y] with [(ŝ, p̂), E0s, E0p] respectively. Previously x̂ and ŷ represented unit
vectors perpendicular and parallel to the plane of incidence, and now ŝ and p̂ represent unit
vectors perpendicular and parallel to the plane of incidence. Following the pattern established in
going from Eqs. (4.77a) and (4.77b) to Eqs. (4.77c)–(4.77f), we see that the wave transmitted
through the beam splitter must be

$$\text{Complex E field} = [\hat{s}E_{0s}\, t_s\gamma_s^{(a)} + \hat{p}E_{0p}\, t_p\gamma_p^{(a)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} \tag{4.80f}$$

and

$$\text{Complex B field} = \frac{1}{c}\,[\hat{p}E_{0s}\, t_s\gamma_s^{(a)} - \hat{s}E_{0p}\, t_p\gamma_p^{(a)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} , \tag{4.80g}$$

and the wave reflected off the beam splitter after passing into and out of the substrate must be

$$\text{Complex E field} = [\hat{s}E_{0s}\, r_s\gamma_s^{(ab)} + \hat{p}E_{0p}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} \tag{4.80h}$$

and

$$\text{Complex B field} = \frac{1}{c}\,[\hat{p}E_{0s}\, r_s\gamma_s^{(ab)} - \hat{s}E_{0p}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} . \tag{4.80i}$$

The $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, $\gamma_{s,p}^{(ab)}$ parameters are the same functions of σ and the angle of incidence as in Eqs.
(4.77c)–(4.77f); and the rs,p and ts,p parameters are also the same functions as they were in
(4.77c)–(4.77f). We note that even if the wavenumber σ has the same value as in Eqs. (4.77c)–
(4.77f), the work done in Appendix 4E shows that the values of $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, and $\gamma_{s,p}^{(ab)}$ are different
because these complex-valued functions are very sensitive to the slight changes in the angle of
incidence produced by nonzero values of θb. The values of rs,p and ts,p do not, however, usually
depend as sensitively on the angle of incidence. As long as θb is small, we can treat rs,p and ts,p as
complex functions that depend only on the wavenumber σ.
Substituting Eqs. (4.80a), (4.80b), (4.80d), and (4.80e) into Eqs. (4.80f)–(4.80i) and gathering
together the $\vec{\underline{O}}(\theta_b)$ terms while neglecting the O(θb²) terms gives us, as expected, Eqs. (4.78a),
(4.78b), (4.79a), and (4.79b) for the beam splitter’s transmitted and reflected waves. This
establishes that (4.78a), (4.78b), (4.79a), and (4.79b) can be used to represent monochromatic
plane waves propagating through the interferometer in a slightly off-axis direction. From now on,
we use (4.78a), (4.78b), (4.79a), and (4.79b) to represent both the on-axis and off-axis

monochromatic plane waves with the understanding, of course, that both θb and all the order-θb
terms are strictly zero for on-axis propagation.
The plane wave transmitted through the beam splitter into the fixed-mirror arm of the
interferometer reflects off the fixed mirror and returns to the beam splitter. There is no way to
distinguish between s-wave and p-wave reflections when Ω̂ is exactly parallel to the ẑ axis, so
we use the single amplitude-reflection coefficient rFM to describe normal reflection off the fixed
mirror. When Ω̂ is not exactly parallel to the ẑ axis, which means the reflection off the fixed
mirror is only nearly normal and not strictly normal, we can distinguish between s-wave and
p-wave reflections; but there is no real point to it because both the s-wave and p-wave
amplitude-reflection coefficients are approximately equal to rFM. When Ω̂ is allowed to be approximately
parallel to ẑ, the radiation fields of the plane wave after reflection off the fixed mirror are

$$\text{Complex E field} = r_{FM}\,[\hat{x}E_{0x}\, t_s\gamma_s^{(abc)} + \hat{y}E_{0y}\, t_p\gamma_p^{(abc)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) \tag{4.81a}$$

and

$$\text{Complex B field} = \frac{r_{FM}}{c}\,[\hat{y}E_{0x}\, t_s\gamma_s^{(abc)} - \hat{x}E_{0y}\, t_p\gamma_p^{(abc)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) . \tag{4.81b}$$

Here,

$$\gamma_s^{(abc)} = \gamma_s^{(ab)}\cdot\gamma_s^{(c)} = \gamma_s^{(a)}\cdot\gamma_s^{(b)}\cdot\gamma_s^{(c)} \tag{4.81c}$$

and

$$\gamma_p^{(abc)} = \gamma_p^{(ab)}\cdot\gamma_p^{(c)} = \gamma_p^{(a)}\cdot\gamma_p^{(b)}\cdot\gamma_p^{(c)} , \tag{4.81d}$$

where $\gamma_{s,p}^{(c)}$ are the complex parameters introduced in Appendix 4E to describe the third pass
through the beam-splitter substrate and the second pass through the compensator plate of the
s-type and p-type waves respectively. We note that $\gamma_{s,p}^{(b)}$ can, according to Eq. (4E.7b) in Appendix
4E, describe the first passage of a plane wave through the compensator plate as well as the second
passage through the beam-splitter substrate. Like $\gamma_{s,p}^{(a)}$ and $\gamma_{s,p}^{(b)}$, the $\gamma_{s,p}^{(c)}$ are functions of
wavenumber σ and the angle of incidence. In Eqs. (4.81a) and (4.81b), the factors of $\gamma_{s,p}^{(abc)}$ show
that the plane wave passes once through the beam-splitter substrate and twice through the
compensator plate, and the $\vec{\underline{O}}(\theta_b)$ symbol again represents complex vector components that are
too small to be worth keeping track of explicitly. Just like before, these equations reduce to the
case where Ω̂ is exactly parallel to ẑ when all the $\vec{\underline{O}}(\theta_b)$ terms are taken to be exactly equal to
zero.


The plane wave reflected off the beam splitter and into the interferometer’s moving-mirror
arm reflects off the moving mirror and returns to the beam splitter. Because it reflects normally or
near normally, we can write, following the pattern of Eqs. (4.81a) and (4.81b) and assuming the
moving mirror is at its ZPD position,

$$\text{Complex E field} = r_{MM}\,[\hat{x}E_{0x}\, r_s\gamma_s^{(abc)} + \hat{y}E_{0y}\, r_p\gamma_p^{(abc)}]\, e^{2\pi i\sigma(\hat{\Omega}_d\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) \tag{4.82a}$$

and

$$\text{Complex B field} = \frac{r_{MM}}{c}\,[\hat{y}E_{0x}\, r_s\gamma_s^{(abc)} - \hat{x}E_{0y}\, r_p\gamma_p^{(abc)}]\, e^{2\pi i\sigma(\hat{\Omega}_d\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) , \tag{4.82b}$$

where rMM is the complex amplitude-reflection coefficient for plane waves normally incident on
the moving mirror, and Ω̂ is replaced by $\hat{\Omega}_d$ because of the slightly tilted moving mirror. The
factors of $\gamma_{s,p}^{(abc)}$ now represent three passages through the beam-splitter substrate. At the end of
Appendix 4E, there is a discussion about why it makes sense to neglect the very slight change in
the angle of incidence due to the tilted moving mirror. Most interferometers use identical
reflective surfaces for the fixed and moving mirrors, so from now on we assume that

$$r_{FM} = r_{MM} = r_M \tag{4.83}$$

with the lack of an s or p subscript on rM reminding us that it represents the amplitude-reflection
coefficient of the fixed and moving mirrors, which have identical s-wave and p-wave
amplitude-reflection coefficients, instead of the beam splitter, which does not.
We have to be careful when reflecting the plane wave coming from the fixed-mirror arm off
the beam splitter because the reflection takes place outside rather than inside the substrate (see
Figs. 4.16, 4.17, and 4.19). In this type of interferometer, the beam splitter is usually designed so
that the s-wave and p-wave amplitude-reflection coefficients are (−1) times the
amplitude-reflection coefficients rs and rp for reflection inside the substrate. In some types of beam splitter,
however, these s-wave and p-wave amplitude-reflection coefficients are equal to rs and rp rather
than [–rs] and [–rp] (see discussion in Sec. 1.1 of Chapter 1). To allow for both types of
beam-splitter arrangements, we use the same parameter W introduced in the discussion following Eq.
(1.15c) of Chapter 1. Just like before, W can only equal +1 or −1. Now the expressions
Wrs and Wrp can be used to represent both types of s-wave and p-wave amplitude-reflection
coefficients off the beam splitter’s backside. Reflecting the plane wavefield in (4.81a) and (4.81b)
off the backside of the beam splitter now gives


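$$\text{Complex E field} = W r_M\,[\hat{x}E_{0x}\, t_s r_s\gamma_s^{(abc)} + \hat{y}E_{0y}\, t_p r_p\gamma_p^{(abc)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) \tag{4.84a}$$

and

$$\text{Complex B field} = \frac{W r_M}{c}\,[\hat{y}E_{0x}\, t_s r_s\gamma_s^{(abc)} - \hat{x}E_{0y}\, t_p r_p\gamma_p^{(abc)}]\, e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) , \tag{4.84b}$$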

where Eq. (4.83) is used to replace rFM by rM in the expressions for the complex E and B fields.
There is no difficulty passing the plane wave coming from the moving-mirror arm through the
beam-splitter film because it transmits the same way the original plane wave transmitted through
to the fixed-mirror arm. This means we can use the same ts,p complex parameters to describe the
change in the plane wave in Eqs. (4.82a) and (4.82b). Now, however, we also want to allow for
the possibility that the moving mirror is no longer at ZPD. In Eqs. (4.84a) and (4.84b) the
complex exponential

$$e^{2\pi i\sigma(\hat{\Omega}\bullet\vec{r}-ct)}$$

is always the correct phase term for the plane wave traveling toward the detector after passing out
and back the fixed-mirror arm. When the moving mirror is no longer at ZPD, the correct phase
term for the plane wave passing out and back the moving-mirror arm is, in Eqs. (4.82a) and (4.82b),

$$e^{2\pi i\sigma[\hat{\Omega}_d\bullet(\vec{r}+\hat{z}\chi)-ct]}$$

with $\vec{r} \rightarrow \vec{r} + \hat{z}\chi$ to account for the moving-mirror arm’s OPD (that is, to account for the extra
distance χ traveled when the moving mirror is not at its ZPD position).66 Therefore, we now write
the E and B fields of the plane wave traveling toward the detector after transmitting through the
beam splitter from the moving-mirror arm as [just put ts,p into (4.82a,b) and use (4.83)]

$$\text{Complex E field} = r_M\,[\hat{x}E_{0x}\, t_s r_s\gamma_s^{(abc)} + \hat{y}E_{0y}\, t_p r_p\gamma_p^{(abc)}]\, e^{2\pi i\sigma[\hat{\Omega}_d\bullet(\vec{r}+\hat{z}\chi)-ct]} + \vec{\underline{O}}(\theta_b) \tag{4.85a}$$

and

$$\text{Complex B field} = \frac{r_M}{c}\,[\hat{y}E_{0x}\, t_s r_s\gamma_s^{(abc)} - \hat{x}E_{0y}\, t_p r_p\gamma_p^{(abc)}]\, e^{2\pi i\sigma[\hat{\Omega}_d\bullet(\vec{r}+\hat{z}\chi)-ct]} + \vec{\underline{O}}(\theta_b) . \tag{4.85b}$$

66 The OPD is defined in Sec. 4.11 and first used in Eq. (1.15b) of Chapter 1. The ZPD is defined at the beginning of
Sec. 1.4 of Chapter 1.


In Sec. 4.11 we decided that the recombined radiation on the far side of the beam splitter
would be called the balanced radiation field. Having traced a monochromatic plane wave through
the interferometer, we can now represent its balanced E and B fields by adding together the
formulas in (4.84a), (4.84b), (4.85a), and (4.85b),

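$$\begin{aligned}
&\text{Complex balanced E field} \\
&\quad= \hat{x}\, r_M E_{0x}\, t_s r_s\gamma_s^{(abc)}\, e^{2\pi i\sigma[\hat{\Omega}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_d\bullet\hat{z})}\, e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\bullet\vec{r}}\right) \\
&\qquad+ \hat{y}\, r_M E_{0y}\, t_p r_p\gamma_p^{(abc)}\, e^{2\pi i\sigma[\hat{\Omega}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_d\bullet\hat{z})}\, e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\bullet\vec{r}}\right) \\
&\qquad+ \vec{\underline{O}}(\theta_b)
\end{aligned} \tag{4.86a}$$

and

$$\begin{aligned}
&\text{Complex balanced B field} \\
&\quad= \hat{y}\,\frac{r_M}{c}\, E_{0x}\, t_s r_s\gamma_s^{(abc)}\, e^{2\pi i\sigma[\hat{\Omega}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_d\bullet\hat{z})}\, e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\bullet\vec{r}}\right) \\
&\qquad- \hat{x}\,\frac{r_M}{c}\, E_{0y}\, t_p r_p\gamma_p^{(abc)}\, e^{2\pi i\sigma[\hat{\Omega}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_d\bullet\hat{z})}\, e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\bullet\vec{r}}\right) \\
&\qquad+ \vec{\underline{O}}(\theta_b) .
\end{aligned} \tag{4.86b}$$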
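
The common phase factor pulled out of both arms relies on an exact identity for the moving-mirror phase term. The following is a minimal numerical sketch of that identity; all numerical values (wavenumber, OPD, position, and the two nearly parallel propagation vectors) are hypothetical.

```python
import cmath
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def unit(u):
    n = math.sqrt(dot(u, u))
    return tuple(c / n for c in u)

def phase(sigma, k_hat, r, ct):
    """exp(2*pi*i*sigma*(k_hat . r - ct)) for unit vector k_hat and position r."""
    return cmath.exp(2j * math.pi * sigma * (dot(k_hat, r) - ct))

# Hypothetical values, in mutually consistent length units
sigma, chi, ct = 1000.0, 0.05, 0.7
r = (0.3, -0.2, 1.5)
z_hat = (0.0, 0.0, 1.0)
omega = unit((1.0e-3, 2.0e-3, 1.0))      # input propagation vector
omega_d = unit((1.2e-3, 1.9e-3, 1.0))    # propagation vector after the tilted mirror

# Moving-mirror phase term with r -> r + z_hat*chi
r_shifted = tuple(ri + chi * zi for ri, zi in zip(r, z_hat))
lhs = phase(sigma, omega_d, r_shifted, ct)

# Fixed-mirror phase term times the two extra exponentials in Eqs. (4.86a,b)
delta = tuple(a - b for a, b in zip(omega_d, omega))
rhs = (phase(sigma, omega, r, ct)
       * cmath.exp(2j * math.pi * sigma * chi * dot(omega_d, z_hat))
       * cmath.exp(2j * math.pi * sigma * dot(delta, r)))

print(abs(lhs - rhs))
```

The difference is at the level of floating-point rounding, so the factored form loses nothing before the approximations that follow are applied.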

According to inequality (4.68), angle θb is much greater than θd; and because the input beam is
direction-chopped, we know that θb is itself a small quantity. When the typical values of θb and
θd for standard Michelson interferometers are plugged into the phase terms of (4.86a) and (4.86b),
it can be shown that [see Eqs. (4B.5d) and (4B.10d) from Appendix 4B]

$$e^{2\pi i\sigma\chi(\hat{\Omega}_d\bullet\hat{z})} \cong e^{2\pi i\sigma\chi(\hat{\Omega}\bullet\hat{z})} \tag{4.87a}$$

and

$$e^{2\pi i\sigma(\hat{\Omega}_d-\hat{\Omega})\bullet\vec{r}} \cong e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}} . \tag{4.87b}$$

Here $\hat{n}_M$ is the dimensionless unit normal vector to the moving mirror’s surface and, following
the convention of the unfolded interferometer, ẑ points from the moving mirror to the beam
splitter. When the moving mirror is perfectly aligned, $\hat{n}_M = \hat{z}$. Substitution of these two
approximations into the formulas for the complex balanced E and B fields gives


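$$\begin{aligned}
&\text{Complex balanced E field} \\
&\quad= \hat{x}\, r_M E_{0x}\, t_s r_s\gamma_s^{(abc)}\, e^{2\pi i\sigma[\hat{\Omega}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad+ \hat{y}\, r_M E_{0y}\, t_p r_p\gamma_p^{(abc)}\, e^{2\pi i\sigma[\hat{\Omega}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad+ \vec{\underline{O}}(\theta_b)
\end{aligned} \tag{4.88a}$$

and

$$\begin{aligned}
&\text{Complex balanced B field} \\
&\quad= \hat{y}\,\frac{r_M}{c}\, E_{0x}\, t_s r_s\gamma_s^{(abc)}\, e^{2\pi i\sigma[\hat{\Omega}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad- \hat{x}\,\frac{r_M}{c}\, E_{0y}\, t_p r_p\gamma_p^{(abc)}\, e^{2\pi i\sigma[\hat{\Omega}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad+ \vec{\underline{O}}(\theta_b) .
\end{aligned} \tag{4.88b}$$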
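
For an on-axis wave (Ω̂ = ẑ) and a perfectly aligned moving mirror ($\hat{n}_M = \hat{z}$), the bracketed factor in the balanced fields reduces to W + exp(2πiσχ). The following is a minimal sketch of how its squared magnitude varies with the OPD; treating the squared magnitude as the quantity of interest is an assumption of this illustration, and the function names and numerical values are hypothetical.

```python
import cmath
import math

def modulation(w, sigma, chi):
    """|W + exp(2*pi*i*sigma*chi)|^2 for an on-axis wave and an aligned mirror."""
    return abs(w + cmath.exp(2j * math.pi * sigma * chi)) ** 2

def modulation_closed_form(w, sigma, chi):
    """Same quantity written as 2 + 2*W*cos(2*pi*sigma*chi), valid for W = +1 or -1."""
    return 2.0 + 2.0 * w * math.cos(2.0 * math.pi * sigma * chi)

sigma = 1250.0                       # hypothetical wavenumber
for w in (+1, -1):
    for chi in (0.0, 1.3e-4, 4.0e-4, 8.0e-4):
        print(w, chi, modulation(w, sigma, chi), modulation_closed_form(w, sigma, chi))
```

With W = +1 the two arms add at ZPD, and with W = −1 they cancel there, which is why the sign convention carried by W matters for the balanced field.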

4.13 Multiple Plane Waves and Michelson Interferometers


Having found the complex E and B fields for one monochromatic plane wave passing through the
interferometer, we put an index A on the wavenumber, an index j on the propagation vector,
replace $\vec{E}_0$ by $\vec{E}_{Aj}$, and take the sum over A and j to create a polychromatic input radiance field.
In place of formulas (4.69a) and (4.69b) for a monochromatic plane wave entering the
interferometer, we have

$$\begin{aligned}
\text{Complex input E field} &= \sum_A\sum_j \vec{E}_{Aj}\, e^{2\pi i\sigma_A(\hat{\Omega}_j\bullet\vec{r}-ct)} \\
&= \sum_A\sum_j (\hat{x}E_{Ajx} + \hat{y}E_{Ajy})\, e^{2\pi i\sigma_A(\hat{\Omega}_j\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b)
\end{aligned} \tag{4.89a}$$

and

$$\begin{aligned}
\text{Complex input B field} &= \sum_A\sum_j \left(\frac{1}{c}\,\hat{\Omega}_j\times\vec{E}_{Aj}\right) e^{2\pi i\sigma_A(\hat{\Omega}_j\bullet\vec{r}-ct)} \\
&= \sum_A\sum_j \frac{1}{c}\,(\hat{y}E_{Ajx} - \hat{x}E_{Ajy})\, e^{2\pi i\sigma_A(\hat{\Omega}_j\bullet\vec{r}-ct)} + \vec{\underline{O}}(\theta_b) ,
\end{aligned} \tag{4.89b}$$

where Eqs. (4.77a) and (4.77b) are used to write the sums over A and j in terms of the x and y
components of $\vec{E}_{Aj}$. Equations (4.89a) and (4.89b) apply to a collection of plane waves with $\hat{\Omega}_j$
propagation vectors parallel to, or nearly parallel to, the optical axis. Having passed through the


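interferometer, each plane wave takes on the form given in Eqs. (4.88a) and (4.88b), so that the
total balanced radiation field traveling to the detector becomes

$$\begin{aligned}
&\text{Complex balanced E field} \\
&\quad= \sum_A\sum_j \Big\{ \hat{x}\, r_{MA}\,\gamma_{sjA}^{(abc)} E_{Ajx}\, t_{sA} r_{sA}\, e^{2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad+ \hat{y}\, r_{MA}\,\gamma_{pjA}^{(abc)} E_{Ajy}\, t_{pA} r_{pA}\, e^{2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big\} \\
&\qquad+ \vec{\underline{O}}(\theta_b)
\end{aligned} \tag{4.89c}$$

and

$$\begin{aligned}
&\text{Complex balanced B field} \\
&\quad= \sum_A\sum_j \Big\{ \hat{y}\,\frac{r_{MA}\,\gamma_{sjA}^{(abc)}}{c}\, E_{Ajx}\, t_{sA} r_{sA}\, e^{2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad- \hat{x}\,\frac{r_{MA}\,\gamma_{pjA}^{(abc)}}{c}\, E_{Ajy}\, t_{pA} r_{pA}\, e^{2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big\} \\
&\qquad+ \vec{\underline{O}}(\theta_b) .
\end{aligned} \tag{4.89d}$$

Note that all the parameters depending on σ acquire A subscripts; all the parameters depending on
the angle of incidence acquire j subscripts; and all the parameters with A and j subscripts depend
on both. Specifically, we define

$\gamma_{sjA}^{(abc)} = \gamma_s^{(abc)}$ at the σ = σ_A wavenumber and at the angles of incidence
corresponding to a monochromatic plane wave with an $\hat{\Omega} = \hat{\Omega}_j$ (4.89e)
propagation vector

and

$\gamma_{pjA}^{(abc)} = \gamma_p^{(abc)}$ at the σ = σ_A wavenumber and at the angles of incidence
corresponding to a monochromatic plane wave with an $\hat{\Omega} = \hat{\Omega}_j$ (4.89f)
propagation vector.

Similarly, we define

$$r_{sA} = r_s \text{ at } \sigma = \sigma_A , \tag{4.89g}$$

$$r_{pA} = r_p \text{ at } \sigma = \sigma_A , \tag{4.89h}$$

$$t_{sA} = t_s \text{ at } \sigma = \sigma_A , \tag{4.89i}$$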


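$$t_{pA} = t_p \text{ at } \sigma = \sigma_A , \tag{4.89j}$$

and

$$r_{MA} = r_M \text{ at } \sigma = \sigma_A . \tag{4.89k}$$

Following the procedure shown in Eqs. (4.44a) and (4.44b) above, the true radiation fields can
be written as the real part of the above formulas, giving

$$\begin{aligned}
&\text{Real balanced E field} = \\
&\sum_j \Big\{ \tfrac{1}{2}\sum_A \Big[ \hat{x}\, r_{MA}\,\gamma_{sjA}^{(abc)} E_{Ajx}\, t_{sA} r_{sA}\, e^{2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad+ \hat{y}\, r_{MA}\,\gamma_{pjA}^{(abc)} E_{Ajy}\, t_{pA} r_{pA}\, e^{2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big] \\
&\quad+ \tfrac{1}{2}\sum_A \Big[ \hat{x}\, r_{MA}^{*}\,\gamma_{sjA}^{(abc)*} E_{Ajx}^{*}\, t_{sA}^{*} r_{sA}^{*}\, e^{-2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{-2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{-4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad+ \hat{y}\, r_{MA}^{*}\,\gamma_{pjA}^{(abc)*} E_{Ajy}^{*}\, t_{pA}^{*} r_{pA}^{*}\, e^{-2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{-2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{-4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big] \Big\} \\
&\quad+ \vec{O}(\theta_b)
\end{aligned} \tag{4.90a}$$

and

$$\begin{aligned}
&\text{Real balanced B field} = \\
&\sum_j \Big\{ \tfrac{1}{2}\sum_A \Big[ \hat{y}\,\frac{r_{MA}\,\gamma_{sjA}^{(abc)}}{c}\, E_{Ajx}\, t_{sA} r_{sA}\, e^{2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad- \hat{x}\,\frac{r_{MA}\,\gamma_{pjA}^{(abc)}}{c}\, E_{Ajy}\, t_{pA} r_{pA}\, e^{2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big] \\
&\quad+ \tfrac{1}{2}\sum_A \Big[ \hat{y}\,\frac{r_{MA}^{*}\,\gamma_{sjA}^{(abc)*}}{c}\, E_{Ajx}^{*}\, t_{sA}^{*} r_{sA}^{*}\, e^{-2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{-2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{-4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad- \hat{x}\,\frac{r_{MA}^{*}\,\gamma_{pjA}^{(abc)*}}{c}\, E_{Ajy}^{*}\, t_{pA}^{*} r_{pA}^{*}\, e^{-2\pi i\sigma_A[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{-2\pi i\sigma_A\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{-4\pi i\sigma_A(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big] \Big\} \\
&\quad+ \vec{O}(\theta_b) ,
\end{aligned} \tag{4.90b}$$

where the underscore has been removed from the $\vec{O}(\theta_b)$ symbol to show that only the real part of
this small uncertainty is retained. We define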


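$$\Delta\sigma_A\, E_{jx}(\sigma) = \tfrac{1}{2}\, E_{Ajx} \quad \text{for } \sigma = \sigma_A > 0 , \tag{4.91a}$$

$$\Delta\sigma_A\, E_{jy}(\sigma) = \tfrac{1}{2}\, E_{Ajy} \quad \text{for } \sigma = \sigma_A > 0 , \tag{4.91b}$$

$$\Delta\sigma_A\, E_{jx}(\sigma) = \tfrac{1}{2}\, E_{Ajx}^{*} \quad \text{for } \sigma = -\sigma_A < 0 , \tag{4.91c}$$

and

$$\Delta\sigma_A\, E_{jy}(\sigma) = \tfrac{1}{2}\, E_{Ajy}^{*} \quad \text{for } \sigma = -\sigma_A < 0 , \tag{4.91d}$$

with

$$\Delta\sigma_A = \sigma_{A+1} - \sigma_A .$$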
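
The reason for extending the amplitudes to negative wavenumbers can be seen in a short numerical check. The following is a minimal sketch, assuming a handful of hypothetical amplitudes and wavenumbers and a single scalar `x` standing in for the phase argument; half of each amplitude is assigned to +σ_A and half of its complex conjugate to −σ_A.

```python
import cmath
import math

def real_part_of_sum(amps, sigmas, x):
    """Re[ sum_A E_A exp(2*pi*i*sigma_A*x) ], the one-sided (positive sigma) form."""
    return sum(a * cmath.exp(2j * math.pi * s * x) for a, s in zip(amps, sigmas)).real

def two_sided_sum(amps, sigmas, x):
    """Sum over +sigma_A and -sigma_A with half-amplitudes E_A/2 and conj(E_A)/2."""
    total = 0j
    for a, s in zip(amps, sigmas):
        total += 0.5 * a * cmath.exp(2j * math.pi * s * x)                # sigma = +sigma_A
        total += 0.5 * a.conjugate() * cmath.exp(-2j * math.pi * s * x)   # sigma = -sigma_A
    return total

# Hypothetical amplitudes, wavenumbers, and phase argument
amps = [1.0 + 0.5j, -0.3 + 0.8j, 0.2 - 0.1j]
sigmas = [900.0, 1000.0, 1100.0]
x = 3.7e-4

one_sided = real_part_of_sum(amps, sigmas, x)
two_sided = two_sided_sum(amps, sigmas, x)
print(one_sided, two_sided)
```

The two-sided sum is automatically real and equals the real part of the one-sided sum, which is what lets the sums over A be turned into integrals running over both signs of σ.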

We set up new versions of the r, t, and γ parameters by defining the complex functions
$r, r_s, r_p, t_s, t_p, \gamma_{sj}^{(abc)}, \gamma_{pj}^{(abc)}$ to be

$$r(\sigma_A) = r_{MA} \quad\text{with}\quad r(-\sigma) = r(\sigma)^{*} , \tag{4.92a}$$

$$r_s(\sigma_A) = r_{sA} \quad\text{with}\quad r_s(-\sigma) = r_s(\sigma)^{*} , \tag{4.92b}$$

$$r_p(\sigma_A) = r_{pA} \quad\text{with}\quad r_p(-\sigma) = r_p(\sigma)^{*} , \tag{4.92c}$$

$$t_s(\sigma_A) = t_{sA} \quad\text{with}\quad t_s(-\sigma) = t_s(\sigma)^{*} , \tag{4.92d}$$

and

$$t_p(\sigma_A) = t_{pA} \quad\text{with}\quad t_p(-\sigma) = t_p(\sigma)^{*} . \tag{4.92e}$$

We also say that

$$\gamma_{sj}^{(abc)}(\sigma_A) = \gamma_{sjA}^{(abc)} \quad\text{and}\quad \gamma_{pj}^{(abc)}(\sigma_A) = \gamma_{pjA}^{(abc)} , \tag{4.92f}$$

where

$$\gamma_{sj}^{(abc)}(-\sigma) = \gamma_{sj}^{(abc)}(\sigma)^{*} \quad\text{and}\quad \gamma_{pj}^{(abc)}(-\sigma) = \gamma_{pj}^{(abc)}(\sigma)^{*} .$$

Now $r, r_s, r_p, t_s, t_p, \gamma_{sj}^{(abc)}, \gamma_{pj}^{(abc)}$ are Hermitian functions of σ [see remark following Eq. (2.34a) in
Chapter 2]. The definitions of $E_{jx}(\sigma)$ and $E_{jy}(\sigma)$ in Eqs. (4.91a)–(4.91d) require that


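$$E_{jx}(-\sigma) = E_{jx}(\sigma)^{*} \tag{4.92g}$$

and

$$E_{jy}(-\sigma) = E_{jy}(\sigma)^{*} , \tag{4.92h}$$

showing that they are also Hermitian functions. Just as in Eqs. (4.46a) and (4.46b), the sums over
A in (4.90a) and (4.90b) can be converted to integrals over σ to get

$$\begin{aligned}
&\text{Real balanced E field} \\
&\quad= \sum_j \Big\{ \int_{-\infty}^{\infty} \Big[ \hat{x}\, r(\sigma)\,\gamma_{sj}^{(abc)}(\sigma)\, E_{jx}(\sigma)\, t_s(\sigma) r_s(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad+ \hat{y}\, r(\sigma)\,\gamma_{pj}^{(abc)}(\sigma)\, E_{jy}(\sigma)\, t_p(\sigma) r_p(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big]\, d\sigma \Big\} \\
&\qquad+ \vec{O}(\theta_b)
\end{aligned} \tag{4.93a}$$

and

$$\begin{aligned}
&\text{Real balanced B field} \\
&\quad= \sum_j \Big\{ \int_{-\infty}^{\infty} \Big[ \hat{y}\,\frac{r(\sigma)\,\gamma_{sj}^{(abc)}(\sigma)}{c}\, E_{jx}(\sigma)\, t_s(\sigma) r_s(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad- \hat{x}\,\frac{r(\sigma)\,\gamma_{pj}^{(abc)}(\sigma)}{c}\, E_{jy}(\sigma)\, t_p(\sigma) r_p(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}_j\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_j\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big]\, d\sigma \Big\} \\
&\qquad+ \vec{O}(\theta_b) .
\end{aligned} \tag{4.93b}$$

The limits of integration are put at −∞ and +∞ by defining $E_{jx}(\sigma)$ and $E_{jy}(\sigma)$ to be zero for σ
values that do not correspond to allowed index values in the sums over A. In particular, we
expect the integrals to converge because $E_{jx,y}(\sigma)$ are negligible or zero for values of σ
corresponding to radiation wavelengths not measured by the interferometer’s detector [see
discussion after Eq. (4.66b) above].
Following the procedure already explained in Sec. 4.8, we replace the sum over j with a
double integral over d²ε. The first step is to convert the sum over j into a double sum over
indices m, n as in Eqs. (4.51a) and (4.51b) above,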


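$$\begin{aligned}
&\text{Real balanced E field} \\
&\quad= \int_{-\infty}^{\infty} \sum_{n,m} \Big[ \hat{x}\, r(\sigma)\,\gamma_{snm}^{(abc)}(\sigma)\, E_{nmx}(\sigma)\, t_s(\sigma) r_s(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}_{nm}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_{nm}\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad+ \hat{y}\, r(\sigma)\,\gamma_{pnm}^{(abc)}(\sigma)\, E_{nmy}(\sigma)\, t_p(\sigma) r_p(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}_{nm}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_{nm}\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big]\, d\sigma \\
&\qquad+ \vec{O}(\theta_b)
\end{aligned} \tag{4.94a}$$

and

$$\begin{aligned}
&\text{Real balanced B field} \\
&\quad= \frac{1}{c} \int_{-\infty}^{\infty} \sum_{n,m} \Big[ \hat{y}\, r(\sigma)\,\gamma_{snm}^{(abc)}(\sigma)\, E_{nmx}(\sigma)\, t_s(\sigma) r_s(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}_{nm}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_{nm}\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \\
&\qquad\quad- \hat{x}\, r(\sigma)\,\gamma_{pnm}^{(abc)}(\sigma)\, E_{nmy}(\sigma)\, t_p(\sigma) r_p(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}_{nm}\bullet\vec{r}-ct]} \left(W + e^{2\pi i\sigma\chi(\hat{\Omega}_{nm}\bullet\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M-\hat{z})\bullet\vec{r}}\right) \Big]\, d\sigma \\
&\qquad+ \vec{O}(\theta_b) ,
\end{aligned} \tag{4.94b}$$

where we define $E_{nmx}(\sigma) = E_{nmy}(\sigma) = 0$ for those m and n values that do not correspond to j
values in the original sums, the ones over propagation directions in Eqs. (4.93a) and (4.93b). As
in Sec. 4.8, the $\hat{\Omega}_{nm}$ propagation vectors can be written as [see Eq. (4.51c)]

$$\hat{\Omega}_{nm} = \hat{x}\varepsilon_{nx} + \hat{y}\varepsilon_{my} + \hat{z}\sqrt{1-\varepsilon_{nx}^2-\varepsilon_{my}^2} .$$

Unlike the situation at the beginning of Sec. 4.8, parameters $\varepsilon_{nx}$ and $\varepsilon_{my}$ are always very small
compared to one because all the j values in the original sum correspond to propagation vectors
that are parallel to, or nearly parallel to, ẑ; that is

$$|\varepsilon_{nx}| \ll 1 \tag{4.95a}$$

and

$$|\varepsilon_{my}| \ll 1 . \tag{4.95b}$$

For each m, n propagation direction, we define

$$\Delta\varepsilon_{nx}\cdot\Delta\varepsilon_{my}\cdot e_x(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = E_{nmx}(\sigma) \tag{4.96a}$$

and

$$\Delta\varepsilon_{nx}\cdot\Delta\varepsilon_{my}\cdot e_y(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = E_{nmy}(\sigma) \tag{4.96b}$$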


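with
\[
\Delta\varepsilon_{nx} = \varepsilon_{n+1,x} - \varepsilon_{n,x} \tag{4.96c}
\]
and
\[
\Delta\varepsilon_{my} = \varepsilon_{m+1,y} - \varepsilon_{m,y}\,. \tag{4.96d}
\]

We also specify that

\[
\gamma_s^{(abc)}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = \gamma_{snm}^{(abc)}(\sigma) \quad\text{and}\quad \gamma_p^{(abc)}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = \gamma_{pnm}^{(abc)}(\sigma)\,. \tag{4.96e}
\]

Since
\[
\gamma_{sj}^{(abc)}(-\sigma) = \gamma_{sj}^{(abc)}(\sigma)^{*} \quad\text{and}\quad \gamma_{pj}^{(abc)}(-\sigma) = \gamma_{pj}^{(abc)}(\sigma)^{*}
\]
in Eqs. (4.92f), it follows that when

\[
\text{index } j \to \text{indices } m, n
\]
we must have

\[
\gamma_{snm}^{(abc)}(-\sigma) = \gamma_{snm}^{(abc)}(\sigma)^{*} \quad\text{and}\quad \gamma_{pnm}^{(abc)}(-\sigma) = \gamma_{pnm}^{(abc)}(\sigma)^{*}
\]
so that

\[
\gamma_s^{(abc)}(\varepsilon_{nx}, \varepsilon_{my}, -\sigma) = \gamma_s^{(abc)}(\varepsilon_{nx}, \varepsilon_{my}, \sigma)^{*}
\quad\text{and}\quad
\gamma_p^{(abc)}(\varepsilon_{nx}, \varepsilon_{my}, -\sigma) = \gamma_p^{(abc)}(\varepsilon_{nx}, \varepsilon_{my}, \sigma)^{*}\,. \tag{4.96f}
\]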

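Just as in Eqs. (4.53a) and (4.53b), we pass to the limit of decreasing $\Delta\varepsilon_{nx}$, $\Delta\varepsilon_{my}$ in (4.94a) and (4.94b) to get

\[
\begin{aligned}
\text{Real balanced E field} = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big[ &\hat{x}\, r(\sigma)\, \gamma_s^{(abc)}(\vec{\varepsilon}, \sigma)\, e_x(\vec{\varepsilon}, \sigma)\, t_s(\sigma)\, r_s(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]} \big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \\
+\ &\hat{y}\, r(\sigma)\, \gamma_p^{(abc)}(\vec{\varepsilon}, \sigma)\, e_y(\vec{\varepsilon}, \sigma)\, t_p(\sigma)\, r_p(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]} \big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \Big] \\
+\ &\vec{O}(\theta_b)
\end{aligned}
\tag{4.97a}
\]
and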


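\[
\begin{aligned}
\text{Real balanced B field} = \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big[ &\hat{y}\, r(\sigma)\, \gamma_s^{(abc)}(\vec{\varepsilon}, \sigma)\, e_x(\vec{\varepsilon}, \sigma)\, t_s(\sigma)\, r_s(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]} \big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \\
-\ &\hat{x}\, r(\sigma)\, \gamma_p^{(abc)}(\vec{\varepsilon}, \sigma)\, e_y(\vec{\varepsilon}, \sigma)\, t_p(\sigma)\, r_p(\sigma)\, e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]} \big(W + e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \Big] \\
+\ &\vec{O}(\theta_b)\,.
\end{aligned}
\tag{4.97b}
\]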
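As in Sec. 4.8, the vector argument $\vec{\varepsilon} = \hat{x}\varepsilon_x + \hat{y}\varepsilon_y$ is used as a shorthand for the two arguments $\varepsilon_x$ and $\varepsilon_y$, so that

\[
e_x(\varepsilon_x, \varepsilon_y, \sigma) = e_x(\vec{\varepsilon}, \sigma)\,, \qquad e_y(\varepsilon_x, \varepsilon_y, \sigma) = e_y(\vec{\varepsilon}, \sigma)
\]
and
\[
\gamma_{s,p}^{(abc)}(\varepsilon_x, \varepsilon_y, \sigma) = \gamma_{s,p}^{(abc)}(\vec{\varepsilon}, \sigma)\,.
\]

This last formula lets us write rule (4.96f) as

\[
\gamma_s^{(abc)}(\vec{\varepsilon}, -\sigma) = \gamma_s^{(abc)}(\vec{\varepsilon}, \sigma)^{*} \quad\text{and}\quad \gamma_p^{(abc)}(\vec{\varepsilon}, -\sigma) = \gamma_p^{(abc)}(\vec{\varepsilon}, \sigma)^{*}\,. \tag{4.97c}
\]

Following the notation used in Eqs. (4.54a) and (4.54d), we write

\[
\hat{\Omega} = \hat{x}\varepsilon_x + \hat{y}\varepsilon_y + \hat{z}\sqrt{1 - \varepsilon_x^2 - \varepsilon_y^2} = \vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2} \tag{4.97d}
\]
with
\[
\varepsilon^2 = |\vec{\varepsilon}\,|^2 = \varepsilon_x^2 + \varepsilon_y^2\,.
\]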
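As an illustrative aside, the unit-length property of the propagation vector in Eq. (4.97d) can be checked numerically. The Python sketch below is not part of the original derivation; the function names `omega_hat` and `norm` and the sample direction parameters are assumptions introduced only for illustration.

```python
import math


def omega_hat(eps_x, eps_y):
    """Return the propagation vector of Eq. (4.97d) as an (x, y, z) tuple."""
    eps_sq = eps_x ** 2 + eps_y ** 2
    if eps_sq > 1.0:
        raise ValueError("eps_x**2 + eps_y**2 must not exceed 1")
    return (eps_x, eps_y, math.sqrt(1.0 - eps_sq))


def norm(vector):
    """Euclidean length of a 3-component vector."""
    return math.sqrt(sum(component ** 2 for component in vector))


# Hypothetical small direction parameters, in the spirit of (4.95a) and (4.95b).
example = omega_hat(0.01, -0.02)
print(example, norm(example))
```

For small direction parameters the z component stays close to one, which is the numerical counterpart of saying that the plane wave travels nearly parallel to the optical axis.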

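Vector $\vec{\rho} = x\hat{x} + y\hat{y}$ lets us write $\vec{r} = \vec{\rho} + z\hat{z}$ [see Eqs. (4.54b) and (4.54e)] so that the expressions $e_x(\vec{\varepsilon}, \sigma)\, e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]}$ and $e_y(\vec{\varepsilon}, \sigma)\, e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]}$ become

\[
e_x(\vec{\varepsilon}, \sigma)\, e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]} = e_x(\vec{\varepsilon}, \sigma)\, e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}}\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]} = E_x(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]}
\]
and
\[
e_y(\vec{\varepsilon}, \sigma)\, e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]} = e_y(\vec{\varepsilon}, \sigma)\, e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}}\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]} = E_y(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]}\,,
\]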


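where we define
\[
E_x(\vec{\varepsilon}, z, \sigma) = e_x(\vec{\varepsilon}, \sigma)\, e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}} \tag{4.98a}
\]
and
\[
E_y(\vec{\varepsilon}, z, \sigma) = e_y(\vec{\varepsilon}, \sigma)\, e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}}\,. \tag{4.98b}
\]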

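Substitution of these results into Eqs. (4.97a) and (4.97b) gives

\[
\begin{aligned}
\text{Real balanced E field} &= \vec{E}^{(\mathrm{bal})}(\vec{\rho}, z, t) \\
= \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big[ &\hat{x}\, r(\sigma)\, \gamma_s^{(abc)}(\vec{\varepsilon}, \sigma)\, E_x(\vec{\varepsilon}, z, \sigma)\, t_s(\sigma)\, r_s(\sigma)\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]} \big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \\
+\ &\hat{y}\, r(\sigma)\, \gamma_p^{(abc)}(\vec{\varepsilon}, \sigma)\, E_y(\vec{\varepsilon}, z, \sigma)\, t_p(\sigma)\, r_p(\sigma)\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]} \big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \Big] \\
+\ &\vec{O}(\theta_b)
\end{aligned}
\]
and
\[
\begin{aligned}
\text{Real balanced B field} &= \vec{B}^{(\mathrm{bal})}(\vec{\rho}, z, t) \\
= \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big[ &\hat{y}\, r(\sigma)\, \gamma_s^{(abc)}(\vec{\varepsilon}, \sigma)\, E_x(\vec{\varepsilon}, z, \sigma)\, t_s(\sigma)\, r_s(\sigma)\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]} \big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \\
-\ &\hat{x}\, r(\sigma)\, \gamma_p^{(abc)}(\vec{\varepsilon}, \sigma)\, E_y(\vec{\varepsilon}, z, \sigma)\, t_p(\sigma)\, r_p(\sigma)\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]} \big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \Big] \\
+\ &\vec{O}(\theta_b)
\end{aligned}
\]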
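with $\vec{E}^{(\mathrm{bal})}(\vec{\rho}, z, t)$ and $\vec{B}^{(\mathrm{bal})}(\vec{\rho}, z, t)$ used as a shorthand for $\vec{E}^{(\mathrm{bal})}(x, y, z, t)$ and $\vec{B}^{(\mathrm{bal})}(x, y, z, t)$. These formulas can be simplified by gathering together like terms to get

\[
\begin{aligned}
\vec{E}^{(\mathrm{bal})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{ &e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]}\, r(\sigma) \big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \\
\cdot \big[ &\hat{x}\, E_x(\vec{\varepsilon}, z, \sigma)\, \gamma_s^{(abc)}(\vec{\varepsilon}, \sigma)\, t_s(\sigma)\, r_s(\sigma) + \hat{y}\, E_y(\vec{\varepsilon}, z, \sigma)\, \gamma_p^{(abc)}(\vec{\varepsilon}, \sigma)\, t_p(\sigma)\, r_p(\sigma) \big] \Big\} \\
+\ &\vec{O}(\theta_b)
\end{aligned}
\tag{4.99a}
\]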


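and
\[
\begin{aligned}
\vec{B}^{(\mathrm{bal})}(\vec{\rho}, z, t) = \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{ &e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]}\, r(\sigma) \big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \\
\cdot \big[ &\hat{y}\, E_x(\vec{\varepsilon}, z, \sigma)\, \gamma_s^{(abc)}(\vec{\varepsilon}, \sigma)\, t_s(\sigma)\, r_s(\sigma) - \hat{x}\, E_y(\vec{\varepsilon}, z, \sigma)\, \gamma_p^{(abc)}(\vec{\varepsilon}, \sigma)\, t_p(\sigma)\, r_p(\sigma) \big] \Big\} \\
+\ &\vec{O}(\theta_b)\,.
\end{aligned}
\tag{4.99b}
\]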

In Eqs. (4.92g) and (4.92h) we see that $E_{jx}(\sigma)$ and $E_{jy}(\sigma)$ are Hermitian functions, so when the index $j$ is replaced by the pair of indices $m$, $n$, it follows that $E_{nmx}(\sigma)$ and $E_{nmy}(\sigma)$ must also be Hermitian. This forces $e_x(\vec{\varepsilon}, \sigma)$ and $e_y(\vec{\varepsilon}, \sigma)$ in Eqs. (4.96a) and (4.96b) to be Hermitian functions of $\sigma$. Changing the sign of $\sigma$ in $e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}}$ is equivalent to taking its complex conjugate, so Eqs. (4.98a) and (4.98b) show that $E_x(\vec{\varepsilon}, z, \sigma)$ and $E_y(\vec{\varepsilon}, z, \sigma)$ are also Hermitian functions of $\sigma$, giving
\[
E_x(\vec{\varepsilon}, z, -\sigma) = E_x(\vec{\varepsilon}, z, \sigma)^{*} \tag{4.100a}
\]
and
\[
E_y(\vec{\varepsilon}, z, -\sigma) = E_y(\vec{\varepsilon}, z, \sigma)^{*}\,. \tag{4.100b}
\]
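The Hermitian argument above can be illustrated with a small numerical check. The following Python sketch is an assumption-laden example rather than anything taken from the text: the amplitude `e_x` is a hypothetical Hermitian function chosen only for the demonstration (and, for simplicity, independent of the direction parameter), while `big_e_x` applies the phase factor of Eq. (4.98a).

```python
import cmath
import math


def e_x(sigma):
    """Hypothetical Hermitian amplitude: e_x(-sigma) is the conjugate of e_x(sigma)."""
    return (1.0 + 0.5j * sigma) * math.exp(-sigma ** 2)


def big_e_x(eps, z, sigma):
    """E_x of Eq. (4.98a): e_x multiplied by exp(2*pi*i*sigma*z*sqrt(1 - eps^2))."""
    phase = 2.0 * math.pi * sigma * z * math.sqrt(1.0 - eps ** 2)
    return e_x(sigma) * cmath.exp(1j * phase)


def is_hermitian_at(eps, z, sigma, tol=1e-12):
    """Check Eq. (4.100a) at one sample point."""
    return abs(big_e_x(eps, z, -sigma) - big_e_x(eps, z, sigma).conjugate()) < tol


print(is_hermitian_at(0.02, 1.5, 0.7))
```

The check succeeds because the phase factor changes into its own complex conjugate when the sign of the wavenumber is reversed, so the product of two Hermitian factors is again Hermitian.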

Returning briefly to the discussion leading up to inequalities (4.95a) and (4.95b), we see that because only plane waves traveling parallel to, or nearly parallel to, the optical axis can pass through the interferometer (that is, because the radiation passing through the interferometer is direction-chopped) both $e_x$ and $e_y$ must be zero or negligible unless $\varepsilon_x$ and $\varepsilon_y$ are small. Consequently, consulting the definitions of $E_x$ and $E_y$ in (4.98a) and (4.98b), it follows that for the direction-chopped radiation passing through the interferometer both $E_x$ and $E_y$ must be zero or negligible unless $|\vec{\varepsilon}\,| \ll 1$.
The connection between the output radiation fields $\vec{E}^{(\mathrm{bal})}(\vec{\rho}, z, t)$, $\vec{B}^{(\mathrm{bal})}(\vec{\rho}, z, t)$ and the input radiance is easy to understand because we have just created a carefully elaborated connection between $E_x(\vec{\varepsilon}, z, \sigma)$, $E_y(\vec{\varepsilon}, z, \sigma)$ and the complex EAjx, EAjy values in Eqs. (4.89a) and (4.89b) characterizing the input radiation fields. To develop a consistent notation and make the connection explicit, we apply the same process used to go from (4.89c) and (4.89d) to (4.99a) and (4.99b) to Eqs. (4.89a) and (4.89b) representing the input radiation fields. The interferometer’s input fields then become


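\[
\text{Real input E field} = \vec{E}^{(\mathrm{in})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \big[ \hat{x}\, E_x(\vec{\varepsilon}, z, \sigma) + \hat{y}\, E_y(\vec{\varepsilon}, z, \sigma) \big]\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]} + \vec{O}(\theta_b) \tag{4.101a}
\]
and
\[
\text{Real input B field} = \vec{B}^{(\mathrm{in})}(\vec{\rho}, z, t) = \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \big[ \hat{y}\, E_x(\vec{\varepsilon}, z, \sigma) - \hat{x}\, E_y(\vec{\varepsilon}, z, \sigma) \big]\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]} + \vec{O}(\theta_b)\,. \tag{4.101b}
\]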

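For future use, we note that $\vec{E}^{(\mathrm{in})}$, $\vec{B}^{(\mathrm{in})}$, $\vec{E}^{(\mathrm{bal})}$, and $\vec{B}^{(\mathrm{bal})}$ can be written as three-dimensional inverse Fourier transforms. We make the same variable substitutions used above in equations (4.60a)–(4.60c), (4.61b), and (4.61c), specifying that

\[
w = -\sigma c\,, \tag{4.102a}
\]
\[
u_x = \sigma\varepsilon_x\,, \tag{4.102b}
\]
and
\[
u_y = \sigma\varepsilon_y\,, \tag{4.102c}
\]
with
\[
\vec{u} = \hat{x}u_x + \hat{y}u_y = \sigma\vec{\varepsilon} = -\frac{w}{c}\,\vec{\varepsilon} \tag{4.102d}
\]
and
\[
u^2 = |\vec{u}\,|^2 = u_x^2 + u_y^2\,. \tag{4.102e}
\]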
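The substitutions (4.102a)–(4.102d) can be sanity-checked numerically. The sketch below is illustrative only: the helper names `to_wu`, `from_wu`, `phase_sigma`, and `phase_wu` and all sample values are assumptions, not notation from the text. It confirms that the substitution is invertible and that the phase $\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]$ equals $\vec{u}\cdot\vec{\rho} + wt$.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def to_wu(sigma, eps_x, eps_y):
    """Apply Eqs. (4.102a)-(4.102c): (sigma, eps) -> (w, u_x, u_y)."""
    return (-sigma * C, sigma * eps_x, sigma * eps_y)


def from_wu(w, u_x, u_y):
    """Invert the substitution: sigma = -w/c and eps = -c*u/w."""
    return (-w / C, -C * u_x / w, -C * u_y / w)


def phase_sigma(sigma, eps_x, eps_y, x, y, t):
    """Phase (without the 2*pi*i factor) in the sigma, eps variables."""
    return sigma * (eps_x * x + eps_y * y - C * t)


def phase_wu(w, u_x, u_y, x, y, t):
    """The same phase written in the w, u variables."""
    return u_x * x + u_y * y + w * t


# Hypothetical sample point: wavenumber in 1/m, positions in m, time in s.
sigma, eps_x, eps_y = 1.0e5, 0.01, -0.02
w, u_x, u_y = to_wu(sigma, eps_x, eps_y)
print(from_wu(w, u_x, u_y))
print(phase_sigma(sigma, eps_x, eps_y, 0.003, 0.004, 1.0e-9),
      phase_wu(w, u_x, u_y, 0.003, 0.004, 1.0e-9))
```

Because the phases agree, the exponential factor in the input-field integrals keeps its value under the change of variables; only the integration measure picks up a Jacobian factor.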

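Equations (4.101a) and (4.101b) now become

\[
\vec{E}^{(\mathrm{in})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left(\frac{c}{w^2}\right) \left[ \hat{x}\, E_x\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) + \hat{y}\, E_y\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right] e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]} + \vec{O}(\theta_b) \tag{4.103a}
\]
and
\[
\vec{B}^{(\mathrm{in})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left(\frac{1}{w^2}\right) \left[ \hat{y}\, E_x\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) - \hat{x}\, E_y\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right] e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]} + \vec{O}(\theta_b)\,. \tag{4.103b}
\]
A similar transformation converts Eqs. (4.99a) and (4.99b) to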


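\[
\begin{aligned}
\vec{E}^{(\mathrm{bal})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{ &\frac{c}{w^2} \cdot e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]}\, r\!\left(-\frac{w}{c}\right) \Big(W + e^{-\frac{2\pi i w\chi}{c}\sqrt{1 - \left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c}(\hat{n}_M - \hat{z})\cdot\vec{r}}\Big) \\
\cdot \Big[ &\hat{x}\, E_x\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \gamma_s^{(abc)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) t_s\!\left(-\frac{w}{c}\right) r_s\!\left(-\frac{w}{c}\right) \\
+\ &\hat{y}\, E_y\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \gamma_p^{(abc)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) t_p\!\left(-\frac{w}{c}\right) r_p\!\left(-\frac{w}{c}\right) \Big] \Big\} \\
+\ &\vec{O}(\theta_b)
\end{aligned}
\tag{4.104a}
\]

and

\[
\begin{aligned}
\vec{B}^{(\mathrm{bal})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{ &\frac{1}{w^2} \cdot e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]}\, r\!\left(-\frac{w}{c}\right) \Big(W + e^{-\frac{2\pi i w\chi}{c}\sqrt{1 - \left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c}(\hat{n}_M - \hat{z})\cdot\vec{r}}\Big) \\
\cdot \Big[ &\hat{y}\, E_x\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \gamma_s^{(abc)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) t_s\!\left(-\frac{w}{c}\right) r_s\!\left(-\frac{w}{c}\right) \\
-\ &\hat{x}\, E_y\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \gamma_p^{(abc)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) t_p\!\left(-\frac{w}{c}\right) r_p\!\left(-\frac{w}{c}\right) \Big] \Big\} \\
+\ &\vec{O}(\theta_b)\,.
\end{aligned}
\tag{4.104b}
\]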

4.14 Energy Flux of the Time-Chopped and Beam-Chopped Radiation Fields


Because $\vec{E}^{(\mathrm{in})}$ and $\vec{B}^{(\mathrm{in})}$ in Eqs. (4.101a) and (4.101b) represent the electric and magnetic fields
for the input radiation entering a Michelson interferometer, we know from Secs. 4.9 and 4.10 that
these fields are both time-chopped and beam-chopped.67 Up to now, there has been no need to
indicate this explicitly, but from this point on we introduce T, A subscripts to show that the
radiant fields are significantly different from zero only over a time interval −T ≤ t ≤ T and only
over a beam cross-sectional area A in the x,y plane. We also know from Sec. 4.10 that the
radiation inside the interferometer can be thought of as being approximately band-limited.

67
The radiation is also, of course, direction-chopped. The direction-chopped property is used in the discussion after
Eq. (4.119d) below.


Consequently, there exists a positive wavenumber $\sigma_{av}$, which can be thought of as the typical or “average” wavenumber of the approximately band-limited radiation, that characterizes the polychromatic wavefield passing through the interferometer. We require $T$ to be extremely long compared to the period $1/f_{av} = 1/(c\sigma_{av})$ of a typical electromagnetic wave inside the interferometer. We also require any characteristic distance across area $A$ to be extremely large compared to the wavelength $\lambda_{av} = 1/\sigma_{av}$ of a typical electromagnetic wave inside the interferometer,

\[
T \gg \frac{1}{c\,\sigma_{av}} \tag{4.105a}
\]
and
\[
\sqrt{A} \gg \frac{1}{\sigma_{av}}\,. \tag{4.105b}
\]

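To show how the $T$, $A$ subscripts are used, we rewrite Eqs. (4.99a) through (4.104b) using $T$, $A$ subscripts and neglecting all terms of $O(\theta_b)$,

\[
\vec{E}_{TA}^{(\mathrm{in})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \big[ \hat{x}\, E_{xTA}(\vec{\varepsilon}, z, \sigma) + \hat{y}\, E_{yTA}(\vec{\varepsilon}, z, \sigma) \big]\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]}\,, \tag{4.106a}
\]

\[
\vec{B}_{TA}^{(\mathrm{in})}(\vec{\rho}, z, t) = \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \big[ \hat{y}\, E_{xTA}(\vec{\varepsilon}, z, \sigma) - \hat{x}\, E_{yTA}(\vec{\varepsilon}, z, \sigma) \big]\, e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]}\,, \tag{4.106b}
\]

\[
\begin{aligned}
\vec{E}_{TA}^{(\mathrm{bal})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{ &e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]}\, r(\sigma) \big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \\
\cdot \big[ &\hat{x}\, E_{xTA}(\vec{\varepsilon}, z, \sigma)\, \gamma_s^{(abc)}(\vec{\varepsilon}, \sigma)\, t_s(\sigma)\, r_s(\sigma) + \hat{y}\, E_{yTA}(\vec{\varepsilon}, z, \sigma)\, \gamma_p^{(abc)}(\vec{\varepsilon}, \sigma)\, t_p(\sigma)\, r_p(\sigma) \big] \Big\}\,,
\end{aligned}
\tag{4.107a}
\]

\[
\begin{aligned}
\vec{B}_{TA}^{(\mathrm{bal})}(\vec{\rho}, z, t) = \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{ &e^{2\pi i\sigma[\vec{\varepsilon}\cdot\vec{\rho} - ct]}\, r(\sigma) \big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\big) \\
\cdot \big[ &\hat{y}\, E_{xTA}(\vec{\varepsilon}, z, \sigma)\, \gamma_s^{(abc)}(\vec{\varepsilon}, \sigma)\, t_s(\sigma)\, r_s(\sigma) - \hat{x}\, E_{yTA}(\vec{\varepsilon}, z, \sigma)\, \gamma_p^{(abc)}(\vec{\varepsilon}, \sigma)\, t_p(\sigma)\, r_p(\sigma) \big] \Big\}\,,
\end{aligned}
\tag{4.107b}
\]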


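\[
\vec{E}_{TA}^{(\mathrm{in})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left(\frac{c}{w^2}\right) \left[ \hat{x}\, E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) + \hat{y}\, E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right] e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]}\,, \tag{4.108a}
\]

\[
\vec{B}_{TA}^{(\mathrm{in})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left(\frac{1}{w^2}\right) \left[ \hat{y}\, E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) - \hat{x}\, E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right] e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]}\,, \tag{4.108b}
\]

\[
\begin{aligned}
\vec{E}_{TA}^{(\mathrm{bal})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{ &\frac{c}{w^2} \cdot e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]}\, r\!\left(-\frac{w}{c}\right) \Big(W + e^{-\frac{2\pi i w\chi}{c}\sqrt{1 - \left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c}(\hat{n}_M - \hat{z})\cdot\vec{r}}\Big) \\
\cdot \Big[ &\hat{x}\, E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \gamma_s^{(abc)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) t_s\!\left(-\frac{w}{c}\right) r_s\!\left(-\frac{w}{c}\right) \\
+\ &\hat{y}\, E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \gamma_p^{(abc)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) t_p\!\left(-\frac{w}{c}\right) r_p\!\left(-\frac{w}{c}\right) \Big] \Big\}\,,
\end{aligned}
\tag{4.109a}
\]

\[
\begin{aligned}
\vec{B}_{TA}^{(\mathrm{bal})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{ &\frac{1}{w^2} \cdot e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]}\, r\!\left(-\frac{w}{c}\right) \Big(W + e^{-\frac{2\pi i w\chi}{c}\sqrt{1 - \left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c}(\hat{n}_M - \hat{z})\cdot\vec{r}}\Big) \\
\cdot \Big[ &\hat{y}\, E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \gamma_s^{(abc)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) t_s\!\left(-\frac{w}{c}\right) r_s\!\left(-\frac{w}{c}\right) \\
-\ &\hat{x}\, E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \gamma_p^{(abc)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) t_p\!\left(-\frac{w}{c}\right) r_p\!\left(-\frac{w}{c}\right) \Big] \Big\}\,.
\end{aligned}
\tag{4.109b}
\]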

Equations (4.100a) and (4.100b) require the $E_{xTA}$ and $E_{yTA}$ functions inside these integrals to satisfy the Hermitian condition for their wavenumber arguments,
\[
E_{xTA}(\vec{\varepsilon}, z, -\sigma) = E_{xTA}(\vec{\varepsilon}, z, \sigma)^{*} \tag{4.110a}
\]
and
\[
E_{yTA}(\vec{\varepsilon}, z, -\sigma) = E_{yTA}(\vec{\varepsilon}, z, \sigma)^{*}\,. \tag{4.110b}
\]


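For future use, we note that this is the same thing as saying
\[
E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) = E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, \frac{w}{c}\right)^{*} \tag{4.110c}
\]
and
\[
E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) = E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, \frac{w}{c}\right)^{*}\,. \tag{4.110d}
\]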

The power flux (energy per unit area per unit time) carried by the input radiation field at any point in space is given by the Poynting vector,68

\[
\vec{S}_{TA}^{(\mathrm{in})} = \frac{1}{\mu_o}\left( \vec{E}_{TA}^{(\mathrm{in})} \times \vec{B}_{TA}^{(\mathrm{in})} \right). \tag{4.111a}
\]

The Poynting vector $\vec{S}$ is zero where $\vec{E}_{TA}^{(\mathrm{in})}$ and $\vec{B}_{TA}^{(\mathrm{in})}$ are zero, so it is given $TA$ subscripts to show that it is time-chopped and beam-chopped in the same way that $\vec{E}_{TA}^{(\mathrm{in})}$ and $\vec{B}_{TA}^{(\mathrm{in})}$ are time-chopped and beam-chopped. Equations (4.108a) and (4.108b) show that the total radiant energy entering the interferometer during a time interval $-T \le t \le T$ is

\[
\begin{aligned}
\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\vec{S}_{TA}^{(\mathrm{in})} \cdot \hat{z}\big)
= \frac{c}{\mu_o} &\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \int_{-\infty}^{\infty} dw \int_{-\infty}^{\infty} dw' \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u' \cdot w'^{-2} \cdot w^{-2} \cdot e^{2\pi i[\vec{\rho}\cdot(\vec{u} + \vec{u}') + (w + w')t]} \\
\cdot \Big[ &E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{xTA}\!\left(-\frac{c\vec{u}'}{w'}, z, -\frac{w'}{c}\right) + E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{yTA}\!\left(-\frac{c\vec{u}'}{w'}, z, -\frac{w'}{c}\right) \Big].
\end{aligned}
\tag{4.111b}
\]

The integrals over $d^2u$ and $dw$ in (4.108a) and (4.108b) are changed to integrals over $d^2u$, $d^2u'$ and $dw$, $dw'$ before they are substituted into (4.111a). This maneuver is often used to show that formulas such as the one in (4.111b) deal with integrals over independent variables of integration. We have also used the unit-vector identities $\hat{x} \times \hat{y} = -\hat{y} \times \hat{x} = \hat{z}$ and $\hat{x} \times \hat{x} = \hat{y} \times \hat{y} = 0$ to simplify the expression inside the square brackets [ ]. We note that the integrals over $dt$ and $d^2\rho$ can be extended to $-\infty$ and $+\infty$ exactly because the time-chopped and beam-chopped nature of $\vec{E}_{TA}^{(\mathrm{in})}$, $\vec{B}_{TA}^{(\mathrm{in})}$, and $\vec{S}_{TA}^{(\mathrm{in})}$ ensures that their integrals drop to zero thus correctly excluding the

68
John David Jackson, Classical Electrodynamics, 3rd ed. (John Wiley & Sons, Inc., New York, 1999), p. 259.


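electromagnetic energy at large values of $x$, $y$, and $t$ that are not part of the interferometer measurement. Moving the integrals over $dt$ and $d^2\rho$ to the inside to get

\[
\begin{aligned}
\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\vec{S}_{TA}^{(\mathrm{in})} \cdot \hat{z}\big)
= \frac{c}{\mu_o} &\int_{-\infty}^{\infty} dw \int_{-\infty}^{\infty} dw' \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u' \cdot w'^{-2} \cdot w^{-2} \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \cdot e^{2\pi i\vec{\rho}\cdot(\vec{u} + \vec{u}')}\, e^{2\pi i(w + w')t} \\
\cdot \Big[ &E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{xTA}\!\left(-\frac{c\vec{u}'}{w'}, z, -\frac{w'}{c}\right) + E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{yTA}\!\left(-\frac{c\vec{u}'}{w'}, z, -\frac{w'}{c}\right) \Big],
\end{aligned}
\]

we recognize these integrals to be forms of the delta function [see Eqs. (2.71f) and (2.122a) in Chapter 2],

\[
\int_{-\infty}^{\infty} dt\, e^{2\pi i(w + w')t} = \delta(w + w')
\]
and
\[
\iint_{-\infty}^{\infty} d^2\rho\, e^{2\pi i\vec{\rho}\cdot(\vec{u} + \vec{u}')} = \int_{-\infty}^{\infty} dx\, e^{2\pi i x(u_x + u_x')} \cdot \int_{-\infty}^{\infty} dy\, e^{2\pi i y(u_y + u_y')} = \delta(u_x + u_x') \cdot \delta(u_y + u_y') = \delta(\vec{u} + \vec{u}')\,.
\]

Substituting these delta functions back into the multiple integral gives
\[
\begin{aligned}
\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\vec{S}_{TA}^{(\mathrm{in})} \cdot \hat{z}\big)
&= \frac{c}{\mu_o} \int_{-\infty}^{\infty} dw \cdot w^{-4} \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u'\, \delta(\vec{u} + \vec{u}') \\
&\qquad \cdot \Big[ E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{xTA}\!\left(\frac{c\vec{u}'}{w}, z, \frac{w}{c}\right) + E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{yTA}\!\left(\frac{c\vec{u}'}{w}, z, \frac{w}{c}\right) \Big] \\
&= \frac{c}{\mu_o} \int_{-\infty}^{\infty} dw \cdot w^{-4} \iint_{-\infty}^{\infty} d^2u \\
&\qquad \cdot \Big[ E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, \frac{w}{c}\right) + E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, \frac{w}{c}\right) \Big].
\end{aligned}
\]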


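From Eqs. (4.110c) and (4.110d), we get

\[
E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, \frac{w}{c}\right) = \left| E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \tag{4.112a}
\]
and
\[
E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, \frac{w}{c}\right) = \left| E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 , \tag{4.112b}
\]

which shows the total radiant energy entering the interferometer during a time interval $-T \le t \le T$ to be
\[
\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\vec{S}_{TA}^{(\mathrm{in})} \cdot \hat{z}\big)
= \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \cdot \left[ \frac{c^2}{w^4} \left| E_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 + \frac{c^2}{w^4} \left| E_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right]. \tag{4.113a}
\]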
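The step from (4.111b) to (4.113a) is a continuous-variable relative of Parseval's relation: the Hermitian symmetry of the transform turns a product of amplitudes at opposite frequencies into a squared magnitude. The sketch below illustrates the same two facts in a discrete, one-dimensional setting. It is only an analogy under stated assumptions: the naive `dft` helper, the sample `signal`, and the normalization by the number of samples belong to the discrete Fourier transform, not to the integrals in the text.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform of a list of numbers."""
    n = len(samples)
    return [
        sum(samples[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


# Hypothetical real-valued signal standing in for a real field component.
signal = [0.0, 1.0, 0.5, -0.25, -1.0, 0.75, 0.1, -0.6]
spectrum = dft(signal)
n = len(signal)

# Hermitian symmetry: the value at frequency -j is the conjugate of the value at +j.
hermitian_ok = all(
    abs(spectrum[(-j) % n] - spectrum[j].conjugate()) < 1e-9 for j in range(n)
)

# Discrete Parseval relation: energy summed over samples equals energy summed over frequencies.
time_energy = sum(x * x for x in signal)
freq_energy = sum(abs(value) ** 2 for value in spectrum) / n

print(hermitian_ok, time_energy, freq_energy)
```

In the text the role of the discrete sums is played by the integrals over $t$ and $\vec{\rho}$ on one side and over $w$ and $\vec{u}$ on the other, with the delta functions doing the bookkeeping.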

The radiation fields entering the interferometer (unlike, say, the electromagnetic signal put out by television or radio stations) can be modeled as random variables because they are not under our direct control. Following the notation used in Chapter 3, we now write

\[
\tilde{\vec{S}}_{TA}^{(\mathrm{in})}\,,\quad \tilde{E}_{xTA} \quad\text{and}\quad \tilde{E}_{yTA}
\]

to show that these are random functions (see Sec. 3.2). No tilde is added to their arguments because the arguments are nonrandom variables. To find the average or expected radiant energy entering the interferometer during a time interval $2T$, which is long compared to the period

\[
1/f_{av} = 1/(c\,\sigma_{av})
\]

of a typical electromagnetic wave inside the interferometer, we apply the expectation operator $\mathrm{E}$ defined in Sec. 3.4 of Chapter 3 to both sides of Eq. (4.113a) to get

\[
\begin{aligned}
\text{Average input energy} &= \mathrm{E}\!\left[ \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\tilde{\vec{S}}_{TA}^{(\mathrm{in})} \cdot \hat{z}\big) \right] \\
&= \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \cdot \left[ \mathrm{E}\!\left( \left| \frac{c}{w^2}\, \tilde{E}_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right) + \mathrm{E}\!\left( \left| \frac{c}{w^2}\, \tilde{E}_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right) \right].
\end{aligned}
\tag{4.113b}
\]


Equations (3.17c) and (3.16a) of Chapter 3 are used when taking $\mathrm{E}$ inside the integrals over $dw$ and $d^2u$.
Although the random radiation fields are not under our direct control, the amount of radiant energy that is linearly polarized in the $x$ or $y$ direction is. We can, for example, imagine passing the radiation in (4.113b) through a polarizing filter, setting $\tilde{E}_{yTA}$ to zero without affecting $\tilde{E}_{xTA}$ or setting $\tilde{E}_{xTA}$ to zero without affecting $\tilde{E}_{yTA}$. Therefore (4.113b) can be interpreted as saying that during a time interval $2T$ in duration,

\[
\text{Average input energy polarized in } x = \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \cdot \mathrm{E}\!\left( \left| \frac{c}{w^2}\, \tilde{E}_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right) \tag{4.114a}
\]
and
\[
\text{Average input energy polarized in } y = \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \cdot \mathrm{E}\!\left( \left| \frac{c}{w^2}\, \tilde{E}_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right). \tag{4.114b}
\]

Because $\left| c\,w^{-2}\, \tilde{E}_{xTA}(-c\vec{u}/w, z, -w/c) \right|^2$ and $\left| c\,w^{-2}\, \tilde{E}_{yTA}(-c\vec{u}/w, z, -w/c) \right|^2$ are non-negative random quantities, we can then interpret

\[
\frac{c}{\mu_o w^4}\, \mathrm{E}\!\left( \left| \tilde{E}_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right) d^2u\, dw \tag{4.114c}
\]
and
\[
\frac{c}{\mu_o w^4}\, \mathrm{E}\!\left( \left| \tilde{E}_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right) d^2u\, dw \tag{4.114d}
\]
as the average or expected energy characterized by $\vec{u} = \vec{\varepsilon}\sigma$ and $w = -\sigma c$ that is carried by, respectively, the $x$-polarized and $y$-polarized radiation fields entering the interferometer during a time interval $2T$ in length. By converting the integrals over $dw$ and $d^2u$ to integrals over $d\sigma$ and $d^2\varepsilon$ using the variable transformations [see Eqs. (4.102a)–(4.102d)]
\[
\sigma = -w/c\,, \quad \vec{\varepsilon} = -c\,\vec{u}/w\,, \quad dw = -c\,d\sigma\,, \quad\text{and}\quad d^2u = (w^2/c^2)\, d^2\varepsilon\,,
\]

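we get that during a time interval $2T$ in length

\[
\begin{aligned}
\text{Average input energy polarized in } x
&= -\frac{1}{\mu_o c} \int_{\infty}^{-\infty} (c\, d\sigma) \iint_{-\infty}^{\infty} \left(\frac{w^2}{c^2}\right) d^2\varepsilon \cdot \frac{c^2}{w^4} \cdot \mathrm{E}\!\left( \left| \tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right) \\
&= \varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \cdot \frac{1}{\sigma^2} \cdot \mathrm{E}\!\left( \left| \tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right)
\end{aligned}
\tag{4.115a}
\]
and
\[
\text{Average input energy polarized in } y = \varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \cdot \frac{1}{\sigma^2} \cdot \mathrm{E}\!\left( \left| \tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right). \tag{4.115b}
\]

In (4.115a) and (4.115b), we use that $\varepsilon_o = \mu_o^{-1} c^{-2}$ from Eq. (4.1e) above. Remembering that the direction of the propagation vector $\hat{\Omega} = \vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2}$ is specified by vector $\vec{\varepsilon}$, we note that

\[
\frac{\varepsilon_o}{\sigma^2} \cdot \mathrm{E}\!\left( \left| \tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right) d\sigma\, d^2\varepsilon \tag{4.116a}
\]
and
\[
\frac{\varepsilon_o}{\sigma^2} \cdot \mathrm{E}\!\left( \left| \tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right) d\sigma\, d^2\varepsilon \tag{4.116b}
\]
can be interpreted as the average or expected energy entering the interferometer during a time interval $2T$ in length carried by, respectively, the $x$-polarized or $y$-polarized radiation fields traveling in the $\hat{\Omega}$ direction at wavenumber $\sigma$.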
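The bookkeeping of constants in the change of variables leading to (4.115a) can be checked numerically. The sketch below is illustrative: the function names `prefactor_wu` and `prefactor_sigma`, the conventional numerical value used for the vacuum permeability, and the sample wavenumber are assumptions made for the example.

```python
import math

C = 299_792_458.0              # speed of light in m/s
MU_0 = 4.0e-7 * math.pi        # conventional value of the vacuum permeability, H/m
EPS_0 = 1.0 / (MU_0 * C ** 2)  # relation quoted in the text from Eq. (4.1e)


def prefactor_wu(w):
    """(1/(mu_o c)) * c * (w^2/c^2) * (c^2/w^4): the factors collected in the first line of (4.115a)."""
    return (1.0 / (MU_0 * C)) * C * (w ** 2 / C ** 2) * (C ** 2 / w ** 4)


def prefactor_sigma(sigma):
    """eps_o / sigma^2: the factor in the second line of (4.115a)."""
    return EPS_0 / sigma ** 2


sigma = 1.0e5   # hypothetical wavenumber in 1/m
w = -sigma * C  # Eq. (4.102a)
print(prefactor_wu(w), prefactor_sigma(sigma))
```

The two printed values agree to rounding error, which is the numerical form of the statement that the factors in the first line of (4.115a) reduce to $\varepsilon_o/\sigma^2$.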
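From Appendix 4C, we see that, according to the three-dimensional Wiener-Khinchin theorem discussed in Sec. 3.24 of Chapter 3, there exist power spectra $S_x(\vec{u}, w)$ and $S_y(\vec{u}, w)$ such that [see Eqs. (4C.10a) and (4C.10b)]

\[
S_x(\vec{u}, w) = \lim_{\substack{T \to \infty \\ A \to \infty}} \left\{ \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c^2}{w^4}\, \mathrm{E}\!\left( \left| \tilde{E}_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right) \right\} \tag{4.117a}
\]
and
\[
S_y(\vec{u}, w) = \lim_{\substack{T \to \infty \\ A \to \infty}} \left\{ \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c^2}{w^4}\, \mathrm{E}\!\left( \left| \tilde{E}_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right) \right\}. \tag{4.117b}
\]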


Here, the limit as $A \to \infty$ is interpreted to be the limit as the beam cross-sectional area $A$ extends to cover the entire $x$, $y$ plane; and, of course, the limit as $T \to \infty$ means that the measurement time becomes infinitely long. We have dropped $z$ from the argument list of $S_{x,y}$ on the left-hand side of these two equations because, as is pointed out at the end of Appendix 4C, the values of $|\tilde{E}_{xTA}|^2$ and $|\tilde{E}_{yTA}|^2$ are no longer functions of $z$. According to inequalities (4.105a) and (4.105b), area $A$ has already been assumed to be much wider than the typical wavelength of the radiation fields and time interval $2T$ has already been assumed to be much longer than the typical period of the radiation fields. It is therefore plausible that in (4.117a) and (4.117b) the values of $A$ and $T$ are already large enough for the expressions inside the braces { } to be approximately equal to their limits. Assuming this to be true and multiplying both sides by $(\mu_o c)^{-1}\, d^2u\, dw$, we then get

\[
\frac{1}{\mu_o c}\, S_x(\vec{u}, w)\, d^2u\, dw \cong \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c}{\mu_o w^4}\, \mathrm{E}\!\left( \left| \tilde{E}_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right) d^2u\, dw \tag{4.118a}
\]
and
\[
\frac{1}{\mu_o c}\, S_y(\vec{u}, w)\, d^2u\, dw \cong \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c}{\mu_o w^4}\, \mathrm{E}\!\left( \left| \tilde{E}_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right) d^2u\, dw\,. \tag{4.118b}
\]
Comparing (4.114c) to (4.118a), we see that $(\mu_o c)^{-1} S_x(\vec{u}, w)\, d^2u\, dw$ is the average $x$-polarized input energy at $(\vec{u}, w)$ divided by both the time interval $2T$ during which it entered the interferometer and the area $A$ through which it entered the interferometer. This means $(\mu_o c)^{-1} S_x(\vec{u}, w)\, d^2u\, dw$ can be interpreted as the average $x$-polarized input power per unit area at values $\vec{u}$, $w$; and a similar comparison of (4.114d) to (4.118b) shows that $(\mu_o c)^{-1} S_y(\vec{u}, w)\, d^2u\, dw$ is the average $y$-polarized input power per unit area at values $(\vec{u}, w)$.
Integrating both sides of (4.118a) and (4.118b) over $dw$ and $d^2u$ gives expressions for the average $x$-polarized and $y$-polarized input power per unit area from all the $\vec{u}$ and $w$ values,

\[
\begin{aligned}
\text{Average } x\text{-polarized input power per unit area}
&= \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, S_x(\vec{u}, w) \\
&\cong \frac{c}{\mu_o} \cdot \frac{1}{2TA} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \frac{1}{w^4}\, \mathrm{E}\!\left( \left| \tilde{E}_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right)
\end{aligned}
\]
and


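\[
\begin{aligned}
\text{Average } y\text{-polarized input power per unit area}
&= \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, S_y(\vec{u}, w) \\
&\cong \frac{c}{\mu_o} \cdot \frac{1}{2TA} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \frac{1}{w^4}\, \mathrm{E}\!\left( \left| \tilde{E}_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right|^2 \right).
\end{aligned}
\]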

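Making the same variable transformations as before,

\[
\sigma = -w/c\,, \quad \vec{\varepsilon} = -c\,\vec{u}/w\,, \quad dw = -c\,d\sigma\,, \quad\text{and}\quad d^2u = (w^2/c^2)\, d^2\varepsilon\,,
\]
gives
\[
\begin{aligned}
\text{Average } x\text{-polarized input power per unit area}
&= \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[ \frac{\sigma^2}{\mu_o}\, S_x(\sigma\vec{\varepsilon}, -\sigma c) \right] \\
&\cong \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[ \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left( \left| \tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right) \right]
\end{aligned}
\tag{4.119a}
\]
and
\[
\begin{aligned}
\text{Average } y\text{-polarized input power per unit area}
&= \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[ \frac{\sigma^2}{\mu_o}\, S_y(\sigma\vec{\varepsilon}, -\sigma c) \right] \\
&\cong \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[ \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left( \left| \tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right) \right],
\end{aligned}
\tag{4.119b}
\]

where we again use $\varepsilon_o = \mu_o^{-1} c^{-2}$ from Eq. (4.1e). These last two equations suggest that

\[
\frac{\sigma^2}{\mu_o}\, S_x(\sigma\vec{\varepsilon}, -\sigma c)\, d^2\varepsilon\, d\sigma \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left( \left| \tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right) d^2\varepsilon\, d\sigma \tag{4.119c}
\]

can be interpreted as the average $x$-polarized input power per unit area traveling in direction $\hat{\Omega} = \vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2}$ at wavenumber $\sigma$, and

\[
\frac{\sigma^2}{\mu_o}\, S_y(\sigma\vec{\varepsilon}, -\sigma c)\, d^2\varepsilon\, d\sigma \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left( \left| \tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right) d^2\varepsilon\, d\sigma \tag{4.119d}
\]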

can be interpreted as the average $y$-polarized input power per unit area traveling in direction $\hat{\Omega} = \vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2}$ at wavenumber $\sigma$. From the discussion following Eq. (4.100b) in Sec. 4.13, we know that $\tilde{E}_{xTA}$ and $\tilde{E}_{yTA}$ represent direction-chopped radiation even though nothing like the $T$ or $A$ subscripts has been used to make this explicit. This means, of course, that $\tilde{E}_{xTA}$ and $\tilde{E}_{yTA}$ must be negligible or zero for $\vec{\varepsilon}$ values that do not represent propagation directions that are parallel to, or nearly parallel to, the $z$ axis. Consequently, Eqs. (4.119c) and (4.119d) show that $S_x$ and $S_y$ must also be negligible or zero for $\vec{\varepsilon}$ values not representing propagation directions parallel to or nearly parallel to the $z$ axis. From the observations made at the end of Sec. 4.9, we know that $d^2\varepsilon$ can be interpreted as an infinitesimal solid angle. Hence, we can always regard

\[
\frac{\sigma^2}{\mu_o}\, S_{x,y}\, d\sigma \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left( \left| \tilde{E}_{xTA,yTA} \right|^2 \right) d\sigma
\]
as the input power per unit area and per unit solid angle of $x$-polarized or $y$-polarized radiation respectively. The next obvious step is to drop $d\sigma$ and recognize

\[
\frac{\sigma^2}{\mu_o}\, S_{x,y} \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left( \left| \tilde{E}_{xTA,yTA} \right|^2 \right)
\]
as the input power per unit area per unit solid angle and per unit wavenumber interval of the $x$-polarized or $y$-polarized radiation respectively. It is customary in interferometric spectroscopy to define two functions $L_x(\vec{\varepsilon}, \sigma)$ and $L_y(\vec{\varepsilon}, \sigma)$ to represent the $x$-polarized and $y$-polarized radiant power per unit area per unit solid angle and per unit wavenumber interval traveling in the direction $\hat{\Omega} = \vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2}$ at wavenumber $\sigma$. Hence it now makes sense to define that

\[
L_x(\vec{\varepsilon}, \sigma) = \frac{\sigma^2}{\mu_o}\, S_x(\sigma\vec{\varepsilon}, -\sigma c) \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left( \left| \tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right) \tag{4.120a}
\]
and
\[
L_y(\vec{\varepsilon}, \sigma) = \frac{\sigma^2}{\mu_o}\, S_y(\sigma\vec{\varepsilon}, -\sigma c) \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left( \left| \tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma) \right|^2 \right). \tag{4.120b}
\]

Again, because the beam is direction-chopped, the newly defined functions $L_{x,y}$ must be negligible or zero for $\vec{\varepsilon}$ values not representing directions parallel to, or nearly parallel to, the $z$ axis. As noted in Sec. 4.10 in the discussion after Eq. (4.66b), we are never interested in the values of $\tilde{E}_{xTA}$ or $\tilde{E}_{yTA}$ at $\sigma = 0$. Consequently, we can always take the expected values of $|\tilde{E}_{xTA,yTA}|^2$ to be zero at $\sigma = 0$, preventing the factors of $\sigma^{-2}$ in the last steps of (4.120a) and (4.120b) from specifying a singularity when the wavenumber $\sigma$ is zero. The $x$-polarized and $y$-polarized power spectra specified by $L_x$ and $L_y$ are double-sided in $\sigma$ because functions $L_{x,y}$ equal $\mu_o^{-1}\sigma^2 S_{x,y}$, and the $S_{x,y}$ functions are double-sided. [We know that the $S_{x,y}$ are double-sided because, according to Eqs. (4.119a) and (4.119b), the $S_{x,y}$ must be integrated over all wavenumbers $\sigma$ between $-\infty$ and $+\infty$ to get the average power per unit area.] Equations (4.110a) and (4.110b) show that $|\tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma)|^2$ and $|\tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma)|^2$ must have the same values at $-\sigma$ that they have at $+\sigma$, requiring $L_x$ and $L_y$ to be even functions of the wavenumber argument:

\[
L_x(\vec{\varepsilon}, -\sigma) = L_x(\vec{\varepsilon}, \sigma) \tag{4.121a}
\]
and
\[
L_y(\vec{\varepsilon}, -\sigma) = L_y(\vec{\varepsilon}, \sigma)\,. \tag{4.121b}
\]

4.15 Energy Flux of the Balanced Radiation Fields


To find the energy carried by the balanced radiation fields reaching the interferometer’s detector, we repeat the procedure used in the previous section to find the energy flux of the input radiation fields. In the previous section, we decided to make $\tilde{E}_{xTA}$ and $\tilde{E}_{yTA}$ random functions, so now $\vec{E}_{TA}^{(\mathrm{bal})}$ and $\vec{B}_{TA}^{(\mathrm{bal})}$ in Eqs. (4.109a) and (4.109b) must also be random. We write these vector functions as
\[
\tilde{\vec{E}}_{TA}^{(\mathrm{bal})}(\vec{\rho}, z, t) \quad\text{and}\quad \tilde{\vec{B}}_{TA}^{(\mathrm{bal})}(\vec{\rho}, z, t)\,,
\]

where again the tilde is used to indicate that these are random functions of nonrandom variables (see Sec. 3.2 in Chapter 3). The balanced energy flux at any point in the balanced output beam is now simply the expected or average value of

\[
\tilde{\vec{S}}_{TA}^{(\mathrm{bal})} \cdot \hat{z} = \frac{1}{\mu_o}\left( \tilde{\vec{E}}_{TA}^{(\mathrm{bal})} \times \tilde{\vec{B}}_{TA}^{(\mathrm{bal})} \right) \cdot \hat{z}\,, \tag{4.122a}
\]

the $z$ component of the Poynting vector. To get the radiant energy reaching the detector, we just integrate the expected value of the Poynting vector’s $z$ component over the beam’s cross-sectional area and the time interval $-T \le t \le T$ used to collect the signal. Therefore,


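\[
\begin{aligned}
&\text{Average energy in balanced signal over time interval } 2T \text{ and beam cross-section } A \\
&\quad = \mathrm{E}\!\left( \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big( \tilde{\vec{S}}_{TA}^{(\mathrm{bal})} \cdot \hat{z} \big) \right)
= \mathrm{E}\!\left( \frac{1}{\mu_o} \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big( \tilde{\vec{E}}_{TA}^{(\mathrm{bal})} \times \tilde{\vec{B}}_{TA}^{(\mathrm{bal})} \big) \cdot \hat{z} \right).
\end{aligned}
\tag{4.122b}
\]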

In this section, we use Eqs. (4.109a) and (4.109b) to evaluate the right-hand side of (4.122b) from the inside out. Most of the massive algebraic manipulations we encounter turn out to be conceptually simple exercises in listing, and then eliminating through integration, a large number of superfluous variables.
We introduce simplifying notation before substituting (4.109a) and (4.109b) into (4.122b). According to Eq. (4B.12b) in Appendix 4B, the typical angle between $\hat{n}_M$ and $\hat{z}$ is small enough for us to neglect the $z$ component of the $(\hat{n}_M - \hat{z})$ vector in Eqs. (4.109a) and (4.109b). This means there must exist two real constants $a$ and $b$ such that

\[
2(\hat{n}_M - \hat{z}) \cong a\hat{x} + b\hat{y}\,. \tag{4.123a}
\]

To shorten the way functions $r$, $r_s$, $t_s$, $r_p$, $t_p$, and $\gamma_{s,p}^{(abc)}$ are written with primed and unprimed arguments, we define that

\[
r = r(-w/c)\,, \quad r' = r(-w'/c)\,, \tag{4.123b}
\]
\[
r_s = r_s(-w/c)\,, \quad r_s' = r_s(-w'/c)\,, \tag{4.123c}
\]
\[
t_s = t_s(-w/c)\,, \quad t_s' = t_s(-w'/c)\,, \tag{4.123d}
\]
\[
r_p = r_p(-w/c)\,, \quad r_p' = r_p(-w'/c)\,, \tag{4.123e}
\]
\[
t_p = t_p(-w/c)\,, \quad t_p' = t_p(-w'/c)\,, \tag{4.123f}
\]
and
\[
\gamma_{s,p} = \gamma_{s,p}^{(abc)}(-c\vec{u}/w, -w/c)\,, \quad \gamma_{s,p}'' = \gamma_{s,p}^{(abc)}(-c\vec{u}'/w', -w'/c)\,. \tag{4.123g}
\]

We also define that
\[
u^2 = u_x^2 + u_y^2
\]
and
\[
u'^2 = u_x'^2 + u_y'^2\,.
\]


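Now at last Eqs. (4.109a) and (4.109b) can be substituted into (4.122b). Postponing for a while the application of the expectation operator $\mathrm{E}$, we write

\[
\begin{aligned}
\frac{1}{\mu_o} &\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big( \tilde{\vec{E}}_{TA}^{(\mathrm{bal})} \times \tilde{\vec{B}}_{TA}^{(\mathrm{bal})} \big) \cdot \hat{z} \\
= \frac{c}{\mu_o} &\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} dx\,dy \int_{-\infty}^{\infty} dw \int_{-\infty}^{\infty} dw' \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u'\, \Big\{ \frac{r r'}{w^2 w'^2} \cdot e^{2\pi i[x(u_x + u_x') + y(u_y + u_y') + t(w + w')]} \\
&\cdot \Big[ W + e^{-2\pi i(w/c)\chi\sqrt{1 - (c^2u^2)/w^2}}\, e^{-2\pi i(w/c)\cdot(xa + yb)} \Big] \\
&\cdot \Big[ W + e^{-2\pi i(w'/c)\chi\sqrt{1 - (c^2u'^2)/w'^2}}\, e^{-2\pi i(w'/c)\cdot(xa + yb)} \Big] \\
&\cdot \Big[ \gamma_s\gamma_s''\, r_s r_s'\, t_s t_s'\, \tilde{E}_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \tilde{E}_{xTA}\!\left(-\frac{c\vec{u}'}{w'}, z, -\frac{w'}{c}\right) \\
&\quad + \gamma_p\gamma_p''\, r_p r_p'\, t_p t_p'\, \tilde{E}_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \tilde{E}_{yTA}\!\left(-\frac{c\vec{u}'}{w'}, z, -\frac{w'}{c}\right) \Big] \Big\}\,.
\end{aligned}
\tag{4.123h}
\]
The three double integrals over $d^2\rho$, $d^2u$, and $d^2u'$ are, of course, a shorthand for $dx\,dy$, $du_x\,du_y$, and $du_x'\,du_y'$ respectively. Moving the integral over $dt$ to the inside gives [see Eq. (2.71f) in Chapter 2]

\[
\int_{-\infty}^{\infty} dt\, e^{2\pi i t\cdot(w + w')} = \delta(w + w')\,. \tag{4.124a}
\]

We define

\[
\gamma_{s,p}' = \gamma_{s,p}^{(abc)}(c\vec{u}'/w, w/c) \tag{4.124b}
\]

and substitute (4.124a) into (4.123h) to get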


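\[
\begin{aligned}
\frac{1}{\mu_o} &\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big( \tilde{\vec{E}}_{TA}^{(\mathrm{bal})} \times \tilde{\vec{B}}_{TA}^{(\mathrm{bal})} \big) \cdot \hat{z} \\
= \frac{c}{\mu_o} &\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u' \cdot \Big\{ \frac{|r|^2}{w^4} \iint_{-\infty}^{\infty} dx\,dy\, e^{2\pi i[x(u_x + u_x') + y(u_y + u_y')]} \\
&\cdot \Big[ W^2 + W e^{-2\pi i(w/c)\chi\sqrt{1 - (c^2u^2)/w^2}}\, e^{-2\pi i(w/c)\cdot(xa + yb)} \\
&\quad + W e^{2\pi i(w/c)\chi\sqrt{1 - (c^2u'^2)/w^2}}\, e^{2\pi i(w/c)\cdot(xa + yb)} \\
&\quad + e^{2\pi i(w/c)\chi\left[\sqrt{1 - (c^2u'^2)/w^2} - \sqrt{1 - (c^2u^2)/w^2}\right]} \Big] \\
&\cdot \Big[ |r_s|^2 |t_s|^2\, \gamma_s\gamma_s'\, \tilde{E}_{xTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \tilde{E}_{xTA}\!\left(\frac{c\vec{u}'}{w}, z, \frac{w}{c}\right) \\
&\quad + |r_p|^2 |t_p|^2\, \gamma_p\gamma_p'\, \tilde{E}_{yTA}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \tilde{E}_{yTA}\!\left(\frac{c\vec{u}'}{w}, z, \frac{w}{c}\right) \Big] \Big\}\,.
\end{aligned}
\tag{4.125a}
\]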
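The four-term bracket in (4.125a) is simply the product of the two $(W + \ldots)$ factors of (4.123h) after $w'$ has been replaced by $-w$. The Python sketch below checks that expansion numerically. The function names and every numerical value are hypothetical; `k` stands for $2\pi w/c$, `root_u` and `root_up` stand for the two square roots, and `tilt` stands for $xa + yb$.

```python
import cmath


def bracket_product(W, k, chi, root_u, root_up, tilt):
    """Product of the two (W + exp(...)) factors after w' has been set to -w."""
    first = W + cmath.exp(-1j * k * (chi * root_u + tilt))
    second = W + cmath.exp(1j * k * (chi * root_up + tilt))
    return first * second


def four_terms(W, k, chi, root_u, root_up, tilt):
    """The same quantity written as the four-term sum of Eq. (4.125a)."""
    return (
        W * W
        + W * cmath.exp(-1j * k * chi * root_u) * cmath.exp(-1j * k * tilt)
        + W * cmath.exp(1j * k * chi * root_up) * cmath.exp(1j * k * tilt)
        + cmath.exp(1j * k * chi * (root_up - root_u))
    )


# Hypothetical sample values; W is restricted to +1 or -1 as in the text.
for W in (1, -1):
    difference = bracket_product(W, 12.5, 0.3, 0.9995, 0.9990, 0.002) - four_terms(
        W, 12.5, 0.3, 0.9995, 0.9990, 0.002
    )
    print(W, abs(difference))
```

Note that the tilt term $xa + yb$ cancels in the fourth term, which is why that term in (4.125a) depends only on the difference of the two square roots.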

In Eq. (4.125a), the integral over $\delta(w + w')\,dw'$ has been used to replace $w'$ by $-w$ everywhere, so that [see Eqs. (4.92a-e), (4.123g), and (4.124b)]

\[
r r' \to r(-w/c)\, r(w/c) = |r|^2\,,
\]
\[
r_s r_s' \to r_s(-w/c)\, r_s(w/c) = |r_s|^2\,,
\]
\[
t_s t_s' \to t_s(-w/c)\, t_s(w/c) = |t_s|^2\,,
\]
\[
r_p r_p' \to r_p(-w/c)\, r_p(w/c) = |r_p|^2\,,
\]
\[
t_p t_p' \to t_p(-w/c)\, t_p(w/c) = |t_p|^2\,,
\]
and
\[
\gamma_{s,p}\gamma_{s,p}'' \to \gamma_{s,p}\gamma_{s,p}'\,.
\]
Equations (4.123g), (4.124b), and (4.97c) show that when $\vec{u}' \to -\vec{u}$ in the argument lists of $\gamma_s\gamma_s'$, we get


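\[
\begin{aligned}
\gamma_s\gamma'_s &= \gamma^{(abc)}_s\!\big(-c\vec{u}/w,\,-w/c\big)\;\gamma^{(abc)}_s\!\big(c\vec{u}'/w,\, w/c\big) \\
&\to \gamma^{(abc)}_s\!\big(-c\vec{u}/w,\,-w/c\big)\;\gamma^{(abc)}_s\!\big(-c\vec{u}/w,\, w/c\big)
 = \gamma^{(abc)}_s\,\gamma^{(abc)\,*}_s = \big|\gamma^{(abc)}_s\big|^2
\end{aligned}
\tag{4.125b}
\]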

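and similarly

\[
\gamma_p\gamma'_p \to \big|\gamma^{(abc)}_p\big|^2 \tag{4.125c}
\]

when $\vec{u}' \to -\vec{u}$ in the argument lists of $\gamma_p\gamma'_p$. Equation (4.97c) also shows that, when

\[
-c\vec{u}/w = \vec{\varepsilon} \quad\text{and}\quad -w/c = \sigma ,
\]

we can write

\[
\gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma)\,\gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)
= \gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma)\,\gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma)^{*}
= \big|\gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma)\big|^2
\]

and

\[
\gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)\,\gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma)
= \gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)\,\gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)^{*}
= \big|\gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)\big|^2 .
\]

At the extreme left-hand side of these last two formulas, we find $\gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma)\,\gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)$ and $\gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)\,\gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma)$, and since it must always be true that

\[
\gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma)\,\gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)
= \gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)\,\gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma) ,
\]

it follows that, examining the extreme right-hand sides of these two formulas,

\[
\big|\gamma^{(abc)}_{s,p}(\vec{\varepsilon},\sigma)\big|^2 = \big|\gamma^{(abc)}_{s,p}(\vec{\varepsilon},-\sigma)\big|^2 . \tag{4.125d}
\]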

From the discussion following Eq. (4.83) in Sec. 4.12 above, we see that $W^2 = 1$ because $W$ must be either 1 or −1. We also note, according to Eq. (2.122a) in Chapter 2, that

\[
\iint_{-\infty}^{\infty} dx\,dy\; e^{2\pi i\,[\,x(u_x+u'_x) + y(u_y+u'_y)\,]}
= \int_{-\infty}^{\infty} dx\; e^{2\pi i x (u_x+u'_x)} \int_{-\infty}^{\infty} dy\; e^{2\pi i y (u_y+u'_y)}
= \delta(u_x+u'_x)\cdot\delta(u_y+u'_y) .
\]
Now Eq. (4.125a) can be written as

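\[
\begin{aligned}
\frac{1}{\mu_o}\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\,
\Big(\tilde{\vec{E}}{}^{(\mathrm{bal})}_{TA} \times \tilde{\vec{B}}{}^{(\mathrm{bal})}_{TA}\Big)\cdot\hat{z}
={}& \frac{2c}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; \frac{|r|^2}{w^4}
 \Big[\, |r_s|^2|t_s|^2\big|\gamma^{(abc)}_s\big|^2 \big|\tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\big|^2 \\
&\qquad\qquad + |r_p|^2|t_p|^2\big|\gamma^{(abc)}_p\big|^2 \big|\tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\big|^2 \Big] \\
&+ \frac{Wc}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u'\; \frac{|r|^2}{w^4}
 \iint_{-\infty}^{\infty} dx\,dy\; e^{2\pi i\,[\,x(u_x+u'_x - wa/c) + y(u_y+u'_y - wb/c)\,]} \\
&\quad\cdot e^{-2\pi i (w/c)\chi\sqrt{1-c^2u^2/w^2}}\,
 \Big[\, |r_s|^2|t_s|^2\gamma_s\gamma'_s\,\tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{xTA}(c\vec{u}'/w,\, z,\, w/c) \\
&\qquad\qquad + |r_p|^2|t_p|^2\gamma_p\gamma'_p\,\tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{yTA}(c\vec{u}'/w,\, z,\, w/c) \Big] \\
&+ \frac{Wc}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u'\; \frac{|r|^2}{w^4}
 \iint_{-\infty}^{\infty} dx\,dy\; e^{2\pi i\,[\,x(u_x+u'_x + wa/c) + y(u_y+u'_y + wb/c)\,]} \\
&\quad\cdot e^{2\pi i (w/c)\chi\sqrt{1-c^2u'^2/w^2}}\,
 \Big[\, |r_s|^2|t_s|^2\gamma_s\gamma'_s\,\tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{xTA}(c\vec{u}'/w,\, z,\, w/c) \\
&\qquad\qquad + |r_p|^2|t_p|^2\gamma_p\gamma'_p\,\tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{yTA}(c\vec{u}'/w,\, z,\, w/c) \Big] ,
\end{aligned}
\tag{4.125e}
\]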
where Eqs. (4.112a), (4.112b), (4.125b), and (4.125c) are used to simplify the first set of integrals on the right-hand side. Even though Eqs. (4.112a) and (4.112b) state an equality between nonrandom quantities $E_{xTA}$ and $E_{yTA}$, we know this equality is also true for the random quantities $\tilde{E}_{xTA}$ and $\tilde{E}_{yTA}$ because (4.112a) and (4.112b) must hold true for any radiation fields. The remaining double integrals over $dx\,dy$ can be written as

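\[
\iint_{-\infty}^{\infty} dx\,dy\; e^{2\pi i\,\left[\,x\left(u_x+u'_x \pm \frac{wa}{c}\right) + y\left(u_y+u'_y \pm \frac{wb}{c}\right)\right]}
= \delta\!\left(u_x+u'_x \pm \frac{wa}{c}\right)\cdot\delta\!\left(u_y+u'_y \pm \frac{wb}{c}\right) . \tag{4.125f}
\]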

From Eq. (4B.13e) in Appendix 4B, we get the approximation

\[
e^{\,2\pi i \frac{w\chi}{c}\sqrt{1-\frac{c^2}{w^2}\left(\vec{u}+\frac{w}{c}\vec{\Delta}\right)\cdot\left(\vec{u}+\frac{w}{c}\vec{\Delta}\right)}}
\;\cong\; e^{\,2\pi i \frac{w\chi}{c}\sqrt{1-\frac{u^2c^2}{w^2}}} , \tag{4.126a}
\]

where we define, in harmony with Eqs. (4.123a) and (4B.13e),

\[
\vec{\Delta} = a\hat{x} + b\hat{y} = 2(\hat{n}_M - \hat{z}) . \tag{4.126b}
\]

This new vector will make it easier to write down what happens to Eq. (4.125e) when we substitute from (4.126a). Equations (4.100a) and (4.100b) hold true for all physically possible radiation fields, so they must still be true when $E_{xTA}$ and $E_{yTA}$ are taken to be the random quantities $\tilde{E}_{xTA}$ and $\tilde{E}_{yTA}$. Therefore we can take the complex conjugate of both sides of Eqs. (4.100a) and (4.100b) to get

\[
\tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma) = \tilde{E}_{xTA}(\vec{\varepsilon}, z, -\sigma)^{*} \tag{4.126c}
\]
and
\[
\tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma) = \tilde{E}_{yTA}(\vec{\varepsilon}, z, -\sigma)^{*} , \tag{4.126d}
\]

where the T, A subscripts are added because now we are explicitly acknowledging their time-chopped and beam-chopped nature. Equation (4.124b) shows that when the argument $\vec{u}'$ of $\gamma'_{s,p}$ is replaced by $-(\vec{u} \pm w\vec{\Delta}/c)$, we get

\[
\gamma'_{s,p} \to \gamma^{(abc)}_{s,p}\!\left(-\frac{c\vec{u}}{w} \mp \vec{\Delta},\; \frac{w}{c}\right) .
\]

Examining the definition of $\vec{\Delta}$ in Eq. (4.126b), we note that the angle between $\hat{n}_M$ and $\hat{z}$ is $O(\theta_d)$, which means, according to inequality (4.68) above, that the angle between $\hat{n}_M$ and $\hat{z}$ must be much smaller than the typical size of the off-axis propagation angle $\theta_b$. Although we know from the discussion at the beginning of Appendix 4E that changing the propagation direction by $\theta_b$ can significantly affect the value of the complex $\gamma^{(abc)}_{s,p}$ parameters, the discussion at the end of Appendix 4E demonstrates that changing the direction of propagation by only an $O(\theta_d)$ amount does not significantly affect $\gamma^{(abc)}_{s,p}$. Hence, Eqs. (4.125b) and (4.125c) still specify what happens to $\gamma_s\gamma'_s$ and $\gamma_p\gamma'_p$ when $\vec{u}' \to -(\vec{u} \pm w\vec{\Delta}/c)$. Taking all this into account while changing the double integrals over $dx\,dy$ in Eq. (4.125e) into the delta functions specified by (4.125f) then leads to [applying (4.126a), (4.126c), and (4.126d)]

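\[
\begin{aligned}
\frac{1}{\mu_o}\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\,
\Big(\tilde{\vec{E}}{}^{(\mathrm{bal})}_{TA} \times \tilde{\vec{B}}{}^{(\mathrm{bal})}_{TA}\Big)\cdot\hat{z}
\cong{}& \frac{2c}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; \frac{|r|^2}{w^4}
 \Big[\, |r_s|^2|t_s|^2\big|\gamma^{(abc)}_s\big|^2 \big|\tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\big|^2 \\
&\qquad\qquad + |r_p|^2|t_p|^2\big|\gamma^{(abc)}_p\big|^2 \big|\tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\big|^2 \Big] \\
&+ \frac{Wc}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; \frac{|r|^2}{w^4}
 \Big[\, |r_s|^2|t_s|^2\big|\gamma^{(abc)}_s\big|^2\, \tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{xTA}(-c\vec{u}/w + \vec{\Delta},\, z,\, -w/c)^{*} \\
&\qquad + |r_p|^2|t_p|^2\big|\gamma^{(abc)}_p\big|^2\, \tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{yTA}(-c\vec{u}/w + \vec{\Delta},\, z,\, -w/c)^{*} \Big]\,
 e^{-2\pi i \frac{w\chi}{c}\sqrt{1-\frac{(cu)^2}{w^2}}} \\
&+ \frac{Wc}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; \frac{|r|^2}{w^4}
 \Big[\, |r_s|^2|t_s|^2\big|\gamma^{(abc)}_s\big|^2\, \tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{xTA}(-c\vec{u}/w - \vec{\Delta},\, z,\, -w/c)^{*} \\
&\qquad + |r_p|^2|t_p|^2\big|\gamma^{(abc)}_p\big|^2\, \tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{yTA}(-c\vec{u}/w - \vec{\Delta},\, z,\, -w/c)^{*} \Big]\,
 e^{2\pi i \frac{w\chi}{c}\sqrt{1-\frac{(cu)^2}{w^2}}} .
\end{aligned}
\tag{4.127a}
\]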

There is no point postponing any longer the application of the expectation operator E to both
sides of this formula. Because the expectation operator is linear with respect to nonrandom
quantities [see Eqs. (3.16a) and (3.17c) in Chapter 3], it can be taken inside all the integrals on
the right-hand side, which means Eq. (4.122b) can now be written as

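Average energy in balanced signal over time interval 2T and beam cross-section A

\[
\begin{aligned}
={}& \mathrm{E}\!\left( \frac{1}{\mu_o}\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\,
\Big(\tilde{\vec{E}}{}^{(\mathrm{bal})}_{TA} \times \tilde{\vec{B}}{}^{(\mathrm{bal})}_{TA}\Big)\cdot\hat{z} \right) \\
\cong{}& \frac{2c}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; \frac{|r|^2}{w^4}
 \Big[\, |r_s|^2|t_s|^2\big|\gamma^{(abc)}_s\big|^2\, \mathrm{E}\big( |\tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)|^2 \big) \\
&\qquad\qquad + |r_p|^2|t_p|^2\big|\gamma^{(abc)}_p\big|^2\, \mathrm{E}\big( |\tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)|^2 \big) \Big] \\
&+ \frac{Wc}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; \frac{|r|^2}{w^4}
 \Big[\, |r_s|^2|t_s|^2\big|\gamma^{(abc)}_s\big|^2\, \mathrm{E}\big( \tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{xTA}(-c\vec{u}/w + \vec{\Delta},\, z,\, -w/c)^{*} \big) \\
&\qquad + |r_p|^2|t_p|^2\big|\gamma^{(abc)}_p\big|^2\, \mathrm{E}\big( \tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{yTA}(-c\vec{u}/w + \vec{\Delta},\, z,\, -w/c)^{*} \big) \Big]\,
 e^{-2\pi i \frac{w\chi}{c}\sqrt{1-\frac{(cu)^2}{w^2}}} \\
&+ \frac{Wc}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\; \frac{|r|^2}{w^4}
 \Big[\, |r_s|^2|t_s|^2\big|\gamma^{(abc)}_s\big|^2\, \mathrm{E}\big( \tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{xTA}(-c\vec{u}/w - \vec{\Delta},\, z,\, -w/c)^{*} \big) \\
&\qquad + |r_p|^2|t_p|^2\big|\gamma^{(abc)}_p\big|^2\, \mathrm{E}\big( \tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\,\tilde{E}_{yTA}(-c\vec{u}/w - \vec{\Delta},\, z,\, -w/c)^{*} \big) \Big]\,
 e^{2\pi i \frac{w\chi}{c}\sqrt{1-\frac{(cu)^2}{w^2}}} .
\end{aligned}
\tag{4.127b}
\]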

The key terms in Eq. (4.127b) are the expectation values of the random variables

\[
\mathrm{E}\big(|\tilde{E}_{xTA}|^2\big), \quad \mathrm{E}\big(|\tilde{E}_{yTA}|^2\big),
\]
\[
\mathrm{E}\Big( \tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\;\tilde{E}_{xTA}(-c\vec{u}/w \pm \vec{\Delta},\, z,\, -w/c)^{*} \Big),
\]
and
\[
\mathrm{E}\Big( \tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\;\tilde{E}_{yTA}(-c\vec{u}/w \pm \vec{\Delta},\, z,\, -w/c)^{*} \Big).
\]

We learned how to handle terms such as $\mathrm{E}(|\tilde{E}_{xTA}|^2)$, $\mathrm{E}(|\tilde{E}_{yTA}|^2)$ in Sec. 4.14 [see Eqs. (4.120a) and (4.120b)], but what can be done with terms such as

\[
\mathrm{E}\Big( \tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c)\;\tilde{E}_{xTA}(-c\vec{u}/w \pm \vec{\Delta},\, z,\, -w/c)^{*} \Big)
\]
and
\[
\mathrm{E}\Big( \tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c)\;\tilde{E}_{yTA}(-c\vec{u}/w \pm \vec{\Delta},\, z,\, -w/c)^{*} \Big)\,?
\]

To evaluate this new type of term, we return to Eq. (4.108a) above, making the radiation field
random and taking x and y components to get

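\[
\tilde{E}^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\,
\Big[ c\,w^{-2}\,\tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c) \Big]\, e^{2\pi i\,[\vec{u}\cdot\vec{\rho} + wt]} \tag{4.128a}
\]
and
\[
\tilde{E}^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\,
\Big[ c\,w^{-2}\,\tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c) \Big]\, e^{2\pi i\,[\vec{u}\cdot\vec{\rho} + wt]} . \tag{4.128b}
\]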

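This shows that $\tilde{E}^{(\mathrm{in})}_{xTA}$ and $\tilde{E}^{(\mathrm{in})}_{yTA}$ are the inverse three-dimensional Fourier transforms of $c\,w^{-2}\tilde{E}_{xTA}$ and $c\,w^{-2}\tilde{E}_{yTA}$, which means that $c\,w^{-2}\tilde{E}_{xTA}$ and $c\,w^{-2}\tilde{E}_{yTA}$ must be the three-dimensional forward Fourier transforms of $\tilde{E}^{(\mathrm{in})}_{xTA}$ and $\tilde{E}^{(\mathrm{in})}_{yTA}$,

\[
c\,w^{-2}\,\tilde{E}_{xTA}(-c\vec{u}/w,\, z,\, -w/c) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\;
\tilde{E}^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, e^{-2\pi i\,[\vec{u}\cdot\vec{\rho} + wt]} \tag{4.129a}
\]
and
\[
c\,w^{-2}\,\tilde{E}_{yTA}(-c\vec{u}/w,\, z,\, -w/c) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\;
\tilde{E}^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, e^{-2\pi i\,[\vec{u}\cdot\vec{\rho} + wt]} . \tag{4.129b}
\]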

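We now let $\tilde{E}_{x,yTA}$ stand for either $\tilde{E}_{xTA}$ or $\tilde{E}_{yTA}$, and $\tilde{E}^{(\mathrm{in})}_{x,yTA}$ stand for either $\tilde{E}^{(\mathrm{in})}_{xTA}$ or $\tilde{E}^{(\mathrm{in})}_{yTA}$. Since the algebra is the same for the x and y components of $\tilde{\vec{E}}_{TA}$ and $\tilde{\vec{E}}{}^{(\mathrm{in})}_{TA}$, we combine Eqs. (4.129a) and (4.129b) to write, using Eq. (3.17c) of Chapter 3,

\[
\begin{aligned}
&\mathrm{E}\Big( \tilde{E}_{x,yTA}(-c\vec{u}/w,\, z,\, -w/c)\;\tilde{E}_{x,yTA}(-c\vec{u}/w \pm \vec{\Delta},\, z,\, -w/c)^{*} \Big) \\
&\quad= \mathrm{E}\bigg( \Big[ \frac{w^2}{c}\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\;
 \tilde{E}^{(\mathrm{in})}_{x,yTA}(\vec{\rho}, z, t)\, e^{-2\pi i\,[\vec{u}\cdot\vec{\rho} + wt]} \Big] \\
&\qquad\quad\cdot \Big[ \frac{w^2}{c}\int_{-\infty}^{\infty} dt' \iint_{-\infty}^{\infty} d^2\rho'\;
 \tilde{E}^{(\mathrm{in})}_{x,yTA}(\vec{\rho}\,', z, t')\, e^{2\pi i\,[(\vec{u} \mp (w/c)\vec{\Delta})\cdot\vec{\rho}\,' + wt']} \Big] \bigg) \\
&\quad= \frac{w^4}{c^2}\int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dt' \iint_{-\infty}^{\infty} d^2\rho \iint_{-\infty}^{\infty} d^2\rho'\;
 e^{-2\pi i\,[\vec{u}\cdot(\vec{\rho}-\vec{\rho}\,') + w(t-t') \pm (w/c)\vec{\Delta}\cdot\vec{\rho}\,']}\;
 \mathrm{E}\Big( \tilde{E}^{(\mathrm{in})}_{x,yTA}(\vec{\rho}, z, t)\,\tilde{E}^{(\mathrm{in})}_{x,yTA}(\vec{\rho}\,', z, t') \Big) .
\end{aligned}
\tag{4.130}
\]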

Equations (4C.4c) and (4C.4d) from Appendix 4C show that

\[
\mathrm{E}\Big( \tilde{E}^{(\mathrm{in})}_{x,yTA}(\vec{\rho}, z, t)\,\tilde{E}^{(\mathrm{in})}_{x,yTA}(\vec{\rho}\,', z, t') \Big)
\cong \Pi(t,T)\cdot\Pi(t',T)\cdot\Pi(\vec{\rho};A)\cdot\Pi(\vec{\rho}\,';A)\cdot R_{x,y}(\vec{\rho}-\vec{\rho}\,',\, t-t',\, z) . \tag{4.131a}
\]

It is important to remember, when using this approximation, that $R_{x,y}$ are the three-dimensional autocorrelation functions of the x and y radiation field components before they enter the interferometer [see Eqs. (4C.3a) and (4C.3b) in Appendix 4C]. The $\Pi(t,T)$ and $\Pi(\vec{\rho};A)$ functions are defined in Appendix 4C to be⁶⁹

\[
\Pi(t,T) = \begin{cases} 1 & \text{for } |t| \le T \\ 0 & \text{for } |t| > T \end{cases} \tag{4.131b}
\]

\[
\Pi(\vec{\rho};A) = \Pi(x,y;A) = \begin{cases}
1 & \text{when point } \vec{\rho} = (x,y) \text{ lies inside or on the edge of the beam of cross-sectional area } A \\
0 & \text{when point } \vec{\rho} = (x,y) \text{ lies outside the beam of cross-sectional area } A .
\end{cases} \tag{4.131c}
\]
These $\Pi$ functions approximate what happens to the original autocorrelation function $R_{x,y}$ when radiation enters the interferometer; they make explicit the time-chopped and beam-chopped nature of the interferometer signal (see discussion in Secs. 4.9 and 4.10 above). Substitution of (4.131a) into (4.130) gives

69
This formula for $\Pi(t,T)$ in Eq. (4.131b) is similar to the formula for $\Pi(t,T)$ given in Eq. (2.56c) of Chapter 2, differing only in the value specified for $\Pi$ at $t = \pm T$.

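\[
\begin{aligned}
&\mathrm{E}\Big( \tilde{E}_{x,yTA}(-c\vec{u}/w,\, z,\, -w/c)\;\tilde{E}_{x,yTA}(-c\vec{u}/w \pm \vec{\Delta},\, z,\, -w/c)^{*} \Big) \\
&\quad\cong \frac{w^4}{c^2}\int_{-\infty}^{\infty} \Pi(t',T)\,dt' \iint_{-\infty}^{\infty} \Pi(\vec{\rho}\,';A)\, e^{\mp 2\pi i (w/c)\,\vec{\rho}\,'\cdot\vec{\Delta}}\, d^2\rho'
 \int_{-\infty}^{\infty} \Pi(t,T)\,dt \\
&\qquad\cdot \iint_{-\infty}^{\infty} \Big[ \Pi(\vec{\rho};A)\, e^{-2\pi i\,[\vec{u}\cdot(\vec{\rho}-\vec{\rho}\,') + w(t-t')]}\, R_{x,y}(\vec{\rho}-\vec{\rho}\,',\, t-t',\, z)\, d^2\rho \Big] .
\end{aligned}
\tag{4.132a}
\]

Transforming the variables of integration to $\vec{\rho}\,'' = \vec{\rho} - \vec{\rho}\,'$ and $t'' = t - t'$ so that $dt'' = dt$ and $d^2\rho'' = d^2\rho$ changes the formula to

\[
\begin{aligned}
&\mathrm{E}\Big( \tilde{E}_{x,yTA}(-c\vec{u}/w,\, z,\, -w/c)\;\tilde{E}_{x,yTA}(-c\vec{u}/w \pm \vec{\Delta},\, z,\, -w/c)^{*} \Big) \\
&\quad\cong \frac{w^4}{c^2}\int_{-\infty}^{\infty} \Pi(t',T)\,dt' \iint_{-\infty}^{\infty} \Pi(\vec{\rho}\,';A)\, e^{\mp 2\pi i (w/c)\,\vec{\rho}\,'\cdot\vec{\Delta}}\, d^2\rho'
 \int_{-\infty}^{\infty} \Pi(t''+t',T)\,dt'' \\
&\qquad\cdot \iint_{-\infty}^{\infty} \Big[ \Pi(\vec{\rho}\,''+\vec{\rho}\,';A)\, e^{-2\pi i\,[\vec{u}\cdot\vec{\rho}\,'' + wt'']}\, R_{x,y}(\vec{\rho}\,'',\, t'',\, z)\, d^2\rho'' \Big] .
\end{aligned}
\tag{4.132b}
\]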

We note that, in the limit as $T \to \infty$ and $A \to \infty$, the inner integrals over $dt''$ and $d^2\rho''$ become the three-dimensional Fourier transform of $R_{x,y}(\vec{\rho}\,'', t'', z)$:

\[
\lim_{\substack{T\to\infty \\ A\to\infty}} \int_{-\infty}^{\infty} \Pi(t''+t',T)\,dt'' \iint_{-\infty}^{\infty}
\Big[ \Pi(\vec{\rho}\,''+\vec{\rho}\,';A)\, e^{-2\pi i\,[\vec{u}\cdot\vec{\rho}\,'' + wt'']}\, R_{x,y}(\vec{\rho}\,'', t'', z)\, d^2\rho'' \Big]
= \int_{-\infty}^{\infty} dt'' \iint_{-\infty}^{\infty} R_{x,y}(\vec{\rho}\,'', t'', z)\, e^{-2\pi i\,[\vec{u}\cdot\vec{\rho}\,'' + wt'']}\, d^2\rho'' . \tag{4.133a}
\]

According to Eqs. (4C.5a) and (4C.5b) in Appendix 4C, the three-dimensional Fourier transform of $R_{x,y}(\vec{\rho}\,'', t'', z)$ is

\[
\int_{-\infty}^{\infty} dt'' \iint_{-\infty}^{\infty} d^2\rho''\; R_{x,y}(\vec{\rho}\,'', t'', z)\, e^{-2\pi i\,(\vec{u}\cdot\vec{\rho}\,'' + wt'')} = S_{x,y}(\vec{u}, w) , \tag{4.133b}
\]


where, as was discussed at the end of Appendix 4C, functions Sx,y do not need to have z as part of
their argument list because they do not depend on that variable. Equations (4.133a) and (4.133b)
can be combined to give

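\[
\lim_{\substack{T\to\infty \\ A\to\infty}} \int_{-\infty}^{\infty} \Pi(t''+t',T)\,dt'' \iint_{-\infty}^{\infty}
\Big[ \Pi(\vec{\rho}\,''+\vec{\rho}\,';A)\, e^{-2\pi i\,[\vec{u}\cdot\vec{\rho}\,'' + wt'']}\, R_{x,y}(\vec{\rho}\,'', t'', z)\, d^2\rho'' \Big]
= S_{x,y}(\vec{u}, w) . \tag{4.133c}
\]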

Following the same reasoning used in the discussion following Eq. (4.117b) above, we assume that in a well-designed interferometer A and T are large enough (in fact, that a relatively small patch of A, say A/100 or A/1000, is large enough and a similarly small fraction of T is large enough) for the left-hand side of (4.133c) to be approximately equal to its limit:

\[
\int_{-\infty}^{\infty} \Pi(t''+t',T)\,dt'' \iint_{-\infty}^{\infty}
\Big[ \Pi(\vec{\rho}\,''+\vec{\rho}\,';A)\, e^{-2\pi i\,[\vec{u}\cdot\vec{\rho}\,'' + wt'']}\, R_{x,y}(\vec{\rho}\,'', t'', z)\, d^2\rho'' \Big]
\cong S_{x,y}(\vec{u}, w) . \tag{4.133d}
\]

Another way of looking at this is to say that only the values of $R_{x,y}$ reasonably near $\vec{\rho}\,'' = 0$ and $t'' = 0$ contribute significantly to the $S_{x,y}$ Fourier transform. When using this approximation, the inner integrals in (4.132b) no longer depend on variables $t'$ and $\vec{\rho}\,'$, except for a relatively small border region around the edge of A and relatively small time durations at the beginning and end of T. Neglecting the contribution of these small border regions and time durations, we make the approximation that

\[
\mathrm{E}\Big( \tilde{E}_{x,yTA}(-c\vec{u}/w,\, z,\, -w/c)\;\tilde{E}_{x,yTA}(-c\vec{u}/w \pm \vec{\Delta},\, z,\, -w/c)^{*} \Big)
\cong \frac{w^4}{c^2}\, S_{x,y}(\vec{u}, w) \int_{-\infty}^{\infty} \Pi(t',T)\,dt' \iint_{-\infty}^{\infty} \Pi(\vec{\rho}\,';A)\, e^{\mp 2\pi i (w/c)\,\vec{\rho}\,'\cdot\vec{\Delta}}\, d^2\rho' . \tag{4.133e}
\]

At this point, we have everything needed to put together the interferometer’s balanced-signal equations. Using Eqs. (4.102a)–(4.102d) to transform the variables of integration in Eq. (4.127b) to $\sigma = -(w/c)$ and $\vec{\varepsilon} = -c(\vec{u}/w)$ gives

Average energy in balanced signal over time interval 2T and beam cross section A


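\[
\begin{aligned}
={}& 2\varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\,
 \Big\{ |r(\sigma)|^2 \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2\, \mathrm{E}\big( |\tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma)|^2 \big) \\
&\qquad\qquad + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2\, \mathrm{E}\big( |\tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma)|^2 \big) \Big] \Big\} \\
&+ W\varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\,
 \Big\{ |r(\sigma)|^2 \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2\, \mathrm{E}\big( \tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma)\,\tilde{E}_{xTA}(\vec{\varepsilon}-\vec{\Delta}, z, \sigma)^{*} \big) \\
&\qquad\qquad + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2\, \mathrm{E}\big( \tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma)\,\tilde{E}_{yTA}(\vec{\varepsilon}-\vec{\Delta}, z, \sigma)^{*} \big) \Big]\, e^{-2\pi i \sigma\chi\sqrt{1-\varepsilon^2}} \Big\} \\
&+ W\varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\,
 \Big\{ |r(\sigma)|^2 \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2\, \mathrm{E}\big( \tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma)\,\tilde{E}_{xTA}(\vec{\varepsilon}+\vec{\Delta}, z, \sigma)^{*} \big) \\
&\qquad\qquad + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2\, \mathrm{E}\big( \tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma)\,\tilde{E}_{yTA}(\vec{\varepsilon}+\vec{\Delta}, z, \sigma)^{*} \big) \Big]\, e^{2\pi i \sigma\chi\sqrt{1-\varepsilon^2}} \Big\} .
\end{aligned}
\tag{4.134a}
\]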

Here rule (4E.6a) in Appendix 4E is used to acknowledge that $\gamma^{(abc)}_s$ and $\gamma^{(abc)}_p$ are functions only of the wavenumber $\sigma$ for on-axis and slightly off-axis plane waves, and once again Eq. (4.1e) is used to simplify the constant outside the integrals. Converting the arguments in (4.133e) to $\sigma = -(w/c)$ and $\vec{\varepsilon} = -c(\vec{u}/w)$ gives

\[
\begin{aligned}
\mathrm{E}\Big( \tilde{E}_{x,yTA}(\vec{\varepsilon}, z, \sigma)\;\tilde{E}_{x,yTA}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^{*} \Big)
&\cong c^2\sigma^4\, S_{x,y}(\sigma\vec{\varepsilon}, -\sigma c) \int_{-\infty}^{\infty} \Pi(t',T)\,dt' \iint_{-\infty}^{\infty} \Pi(\vec{\rho}\,';A)\, e^{\pm 2\pi i\,(\vec{\rho}\,'\cdot(\sigma\vec{\Delta}))}\, d^2\rho' \\
&= c^2\sigma^4\, S_{x,y}(\sigma\vec{\varepsilon}, -\sigma c)\cdot 2T\cdot \Pi_A(\mp\sigma\vec{\Delta}) ,
\end{aligned}
\tag{4.134b}
\]

where $\Pi_A$ is defined to be the two-dimensional forward Fourier transform

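\[
\Pi_A(\vec{u}) = \iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho};A)\, e^{-2\pi i\,\vec{\rho}\cdot\vec{u}} \tag{4.134c}
\]

of the beam’s pupil function $\Pi(\vec{\rho};A)$ defined in Eq. (4.131c). Because $\Pi(\vec{\rho};A)$ is strictly real, we see that

\[
\Pi_A(-\vec{u}) = \iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho};A)\, e^{2\pi i\,\vec{\rho}\cdot\vec{u}}
= \left( \iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho};A)\, e^{-2\pi i\,\vec{\rho}\cdot\vec{u}} \right)^{*}
\]
or
\[
\Pi_A(-\vec{u}) = \Pi_A(\vec{u})^{*} . \tag{4.134d}
\]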
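
The conjugate symmetry in Eq. (4.134d) holds for any real pupil function, not only for symmetric ones. A minimal numerical sketch, assuming a hypothetical off-center rectangular pupil (the corner coordinates and the function names below are illustrative and do not come from the text), evaluates the transform of Eq. (4.134c) in closed form and checks that reversing the sign of the argument conjugates the result:

```python
import cmath
import math


def segment_transform(a, b, u):
    """Closed form of the 1-D integral of exp(-2*pi*i*x*u) over a <= x <= b."""
    if u == 0.0:
        return complex(b - a)
    k = -2j * math.pi * u
    return (cmath.exp(k * b) - cmath.exp(k * a)) / k


def pupil_transform(ux, uy, x0=-1.0, x1=2.0, y0=-0.5, y1=1.5):
    """Forward 2-D transform of a rectangular pupil equal to 1 inside
    [x0, x1] x [y0, y1] and 0 outside (hypothetical, deliberately off-center)."""
    return segment_transform(x0, x1, ux) * segment_transform(y0, y1, uy)


forward = pupil_transform(0.3, -0.7)
reverse = pupil_transform(-0.3, 0.7)
area = pupil_transform(0.0, 0.0)  # the transform at u = 0 is the pupil area

print(forward, reverse.conjugate(), area)
```

Because the rectangle is not centered on the origin, its transform is genuinely complex, so the check is not trivially satisfied by a real-valued transform.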

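Equations (4.120a) and (4.120b) let us substitute for $S_{x,y}$ in (4.134b) to get

\[
\mathrm{E}\Big( \tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma)\;\tilde{E}_{xTA}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^{*} \Big)
\cong 2T\,\mu_o c^2 \sigma^2\, \mathcal{L}_x(\vec{\varepsilon}, \sigma)\, \Pi_A(\mp\sigma\vec{\Delta}) \tag{4.135a}
\]
and
\[
\mathrm{E}\Big( \tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma)\;\tilde{E}_{yTA}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^{*} \Big)
\cong 2T\,\mu_o c^2 \sigma^2\, \mathcal{L}_y(\vec{\varepsilon}, \sigma)\, \Pi_A(\mp\sigma\vec{\Delta}) . \tag{4.135b}
\]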
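These last two results, together with $\varepsilon_o \mu_o c^2 = 1$ from Eq. (4.1e), can be substituted into (4.134a) to give

Average energy in balanced signal over time interval 2T and beam cross section A

\[
\begin{aligned}
={}& 2\varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\,
 \Big\{ |r(\sigma)|^2 \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2\, \mathrm{E}\big( |\tilde{E}_{xTA}(\vec{\varepsilon}, z, \sigma)|^2 \big) \\
&\qquad\qquad + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2\, \mathrm{E}\big( |\tilde{E}_{yTA}(\vec{\varepsilon}, z, \sigma)|^2 \big) \Big] \Big\} \\
&+ 2TW \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\,
 \Big\{ |r(\sigma)|^2\, \Pi_A(\sigma\vec{\Delta}) \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2\, \mathcal{L}_x(\vec{\varepsilon}, \sigma) \\
&\qquad\qquad + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2\, \mathcal{L}_y(\vec{\varepsilon}, \sigma) \Big]\, e^{-2\pi i \sigma\chi\sqrt{1-\varepsilon^2}} \Big\} \\
&+ 2TW \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\,
 \Big\{ |r(\sigma)|^2\, \Pi_A(-\sigma\vec{\Delta}) \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2\, \mathcal{L}_x(\vec{\varepsilon}, \sigma) \\
&\qquad\qquad + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2\, \mathcal{L}_y(\vec{\varepsilon}, \sigma) \Big]\, e^{2\pi i \sigma\chi\sqrt{1-\varepsilon^2}} \Big\} .
\end{aligned}
\tag{4.135c}
\]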

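Returning again to Eqs. (4.120a) and (4.120b), this time to substitute for $\mathrm{E}(|\tilde{E}_{x,yTA}|^2)$, gives

Average energy in balanced signal over time interval 2T and beam cross section A

\[
\begin{aligned}
={}& 4TA \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\,
 \Big\{ |r(\sigma)|^2 \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2\, \mathcal{L}_x(\vec{\varepsilon}, \sigma)
 + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2\, \mathcal{L}_y(\vec{\varepsilon}, \sigma) \Big] \Big\} \\
&+ 2TW \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\,
 \Big\{ |r(\sigma)|^2 \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2\, \mathcal{L}_x(\vec{\varepsilon}, \sigma)
 + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2\, \mathcal{L}_y(\vec{\varepsilon}, \sigma) \Big] \\
&\qquad\cdot \Big[ \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma\chi\sqrt{1-\varepsilon^2}} + \Pi_A(\sigma\vec{\Delta})^{*}\, e^{2\pi i \sigma\chi\sqrt{1-\varepsilon^2}} \Big] \Big\} ,
\end{aligned}
\tag{4.135d}
\]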
G G
where Eq. (4.134d) is used to replace Ȇ A ()) by Ȇ A ()) . From the definitions of Lx,y in the
G
discussion preceding Eqs. (4.120a), we know that L x ( , ) ) d 2 is the x-polarized optical power
per unit area of the beam and per unit wavenumber interval at wavenumber ı that is inside the
G G
d 2 solid angle and traveling in the direction of the propagation vector    zˆ 1   2 . A
G
similar statement can be made about L y ( , ) ) d 2 —that it is the y-polarized optical power per
unit area of the beam and per unit wavenumber interval at wavenumber ı that is inside the d 2
G
solid angle and traveling in the direction of the propagation vector  . The discussion following
Eq. (4.120b) points out that Lx and Ly must represent direction-chopped radiation, with both Lx
G
and Ly negligible for  values specifying propagation directions that are not parallel to, or nearly
parallel to, the optical axis ẑ . Hence Lx and Ly are negligible for those propagation directions
that cannot enter the interferometer because they lie outside the interferometer’s field of view,
and we can regard the integrals over d 2 as occurring only over the interferometer’s field of
view. We define Pbal (  ) to be the time-averaged power in the balanced signal from the beam of
cross-sectional area A at an OPD OPD value value of
of Ȥ.
Ȥ. Dividing
Dividing both
both sides
sides of
of (4.135d)
(4.135d)by by2T
2Tthen s, after
thengives
gives
using that Re(c) = (c+c )/2 for any complex number c,


{{
5
G
2
[ 2 2 2
Pbal (  ) 2 A 5³ d) ³ ³ d 2 r () ) A rs () ) ts () ) Ȗs( abc ) () ) L x ( , ) )
G
³5 d) field
Pbal (  ) 2 A5 ³field³ofofdview
2 2
[ 2 2 2
 r () ) A rs () ) ts () ) Ȗs( abc ) () ) L x ( , ) )
G
view 2 2 2
 rp () ) t p () ) Ȗs( abc ) () ) L y ( , ) ) (4.135e)
G
]
2 2 ( abc ) 2
 rp () ) t p () ) Ȗs () ) L y ( , ) ) (4.135e) ]
}
W G
[
A 1  Re Ȇ A ()G )e2& i)  cos  ]

[W
A 1  A Re Ȇ A ())e2& i)  cos  ]
A
 }
- 453 -
4 · From Maxwell’s Equations to the Michelson Interferometer

where

\[
\cos\theta_\varepsilon = \sqrt{1-\varepsilon^2}
= \text{cosine of the angle between the propagation vector } \vec{\Omega} \text{ and the z axis.} \tag{4.135f}
\]

We note that by definition $\theta_\varepsilon$ is the same as angle $\theta_b$ used in Sec. 4.12 and Appendix 4B. Writing the integral over $d^2\varepsilon$ like this lets us think of $\mathcal{L}_x$ and $\mathcal{L}_y$ as representing the radiation field before it becomes direction-chopped, always assuming, of course, that direction-chopping the incident radiation does not significantly change $\mathcal{L}_x$ and $\mathcal{L}_y$. Equation (4.135e) makes it clear that the triple integral over $d\sigma$ and $d^2\varepsilon$ must be real because the quantity being integrated is always real.

4.16 Simplified Formulas for the Optical Power in the Balanced Signal
Equation (4.135e) specifies the optical power in the balanced signal when $\mathcal{L}_x \ne \mathcal{L}_y$ so that the incident radiation is polarized, when $\mathcal{L}_x$ and $\mathcal{L}_y$ depend on $\vec{\varepsilon}$ as well as $\sigma$ so that there is both spectral and intensity variation across the interferometer’s field of view, and when $\vec{\Delta} \ne 0$ so that there are small misalignments in the moving mirror. This section strips away these effects step by step, eventually to arrive at the same formula for the optical power in the balanced signal for an ideal interferometer that was presented in Eq. (1.19f) of Chapter 1.
The first step is to specify unpolarized incident radiation, which we do by setting

\[
\mathcal{L}_x(\vec{\varepsilon}, \sigma) = \mathcal{L}_y(\vec{\varepsilon}, \sigma) = \frac{1}{2}\,\mathcal{L}(\vec{\varepsilon}, \sigma) . \tag{4.136a}
\]

Here, the incident radiation is made unpolarized by splitting the total power equally between the two possibilities, x polarization and y polarization. From Eqs. (4.121a) and (4.121b), we see that both $\mathcal{L}_x$ and $\mathcal{L}_y$ are even functions of $\sigma$, requiring $\mathcal{L}$ to be another even function of $\sigma$:

\[
\mathcal{L}(\vec{\varepsilon}, -\sigma) = \mathcal{L}(\vec{\varepsilon}, \sigma) . \tag{4.136b}
\]

Equation (4.135e) can now be written as

\[
\begin{aligned}
P_{\mathrm{bal}}(\chi) = A \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\,
 \Big\{ & |r(\sigma)|^2\, \mathcal{L}(\vec{\varepsilon}, \sigma)
 \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2 + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2 \Big] \\
&\cdot \Big[ 1 + \frac{W}{A}\,\mathrm{Re}\Big( \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma\chi\cos\theta_\varepsilon} \Big) \Big] \Big\} .
\end{aligned}
\tag{4.136c}
\]

Glancing back to the definitions of $\mathcal{L}_x$ and $\mathcal{L}_y$ [see discussion preceding Eq. (4.120a) above], we recognize $\mathcal{L}(\vec{\varepsilon}, \sigma) = \mathcal{L}_x(\vec{\varepsilon}, \sigma) + \mathcal{L}_y(\vec{\varepsilon}, \sigma)$ to be the total optical power per unit cross-sectional area of this unpolarized beam per unit solid angle per unit wavenumber interval at wavenumber $\sigma$. As an argument of function $\mathcal{L}$, the wavenumber $\sigma$ takes on negative as well as positive values: $-\infty < \sigma < \infty$. This makes $\mathcal{L}$ analogous to a double-sided power spectrum [in Chapter 3, see Sec. 3.20 and the discussion following Eq. (3.57g)]. In radiometry the spectral radiance of an optical field is the transmitted optical power per unit area transverse to the direction of propagation per unit solid angle in the direction of propagation per unit wavenumber interval. This is the same meaning we have attached to $\mathcal{L}$; however, in radiometry, the wavenumber $\sigma$ is always positive.70 This makes the radiometric spectral radiance of the optical field analogous to a single-sided power spectrum. Because the radiation passing through the interferometer is direction-chopped, ensuring that all the propagation vectors are parallel to, or nearly parallel to, the $\hat{z}$ axis, a unit cross-sectional area of the beam is approximately the same as a unit area transverse to the radiation’s direction of propagation; and a solid angle $d^2\varepsilon$ in Eq. (4.135e) is approximately the same as a solid angle in the direction of propagation. Hence we could interpret $\mathcal{L}(\vec{\varepsilon}, \sigma)$ as the radiometric spectral radiance of the optical field if $\mathcal{L}$ were not in fact defined for both positive and negative wavenumbers, making it analogous to a double-sided rather than a single-sided power spectrum. Therefore we use the standard conversion for going from double-sided to single-sided power spectra [see Eq. (3.58b) in Chapter 3] to define the spectral radiance $L$ of the optical field as

\[
L(\vec{\varepsilon}, \sigma) = 2\,\mathcal{L}(\vec{\varepsilon}, \sigma) \quad\text{for } \sigma \ge 0 . \tag{4.136d}
\]

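The next step is to assume no spectral or intensity variation across the interferometer’s field of view, which means we suppress the dependence of $\mathcal{L}$ and $L$ on $\vec{\varepsilon}$ and write Eqs. (4.136a), (4.136b), and (4.136d) as

\[
\mathcal{L}_x(\vec{\varepsilon}, \sigma) = \mathcal{L}_y(\vec{\varepsilon}, \sigma) = \frac{1}{2}\,\mathcal{L}(\vec{\varepsilon}, \sigma) = \frac{1}{2}\,\mathcal{L}(\sigma) , \tag{4.136e}
\]
with
\[
\mathcal{L}(-\sigma) = \mathcal{L}(\sigma) \tag{4.136f}
\]
and
\[
L(\sigma) = 2\,\mathcal{L}(\sigma) \quad\text{for } \sigma \ge 0 . \tag{4.136g}
\]

Substituting Eq. (4.136e) into (4.136c) now gives

\[
P_{\mathrm{bal}}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\;
\eta(\sigma)\,\mathcal{L}(\sigma) \Big[ 1 + \frac{W}{A}\,\mathrm{Re}\Big( \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma\chi\cos\theta_\varepsilon} \Big) \Big] \tag{4.136h}
\]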

70
See Table 1-2 on page 1-4 in The Infrared Handbook, edited by William L. Wolfe and George J. Zissis, rev. ed.
(Infrared Information Analysis Center of the Environmental Research Institute of Michigan, 1985).


with the beam-splitter efficiency $\eta$ defined to be

\[
\eta(\sigma) = 2\,|r(\sigma)|^2 \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2 + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2 \Big] . \tag{4.136i}
\]

An ideal interferometer has

\[
|r_s(\sigma)|^2 = |r_p(\sigma)|^2 = |t_s(\sigma)|^2 = |t_p(\sigma)|^2 = \frac{1}{2}
\]
and
\[
|r(\sigma)|^2 = \big|\gamma^{(abc)}_{s,p}(\sigma)\big|^2 = 1
\]

so that $\eta = 1$. For realistic interferometers, we expect $0 < \eta < 1$; and the closer $\eta$ is to one, the more nearly ideal is the performance of the interferometer’s optical components (i.e., the beam splitter, compensator, and return mirrors in Figs. 4.16, 4.17, and 4.19).
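
As a quick numerical illustration of Eq. (4.136i), the following minimal sketch evaluates the beam-splitter efficiency from the squared magnitudes of the coefficients. The function name and the "lossy" values are hypothetical choices made only for illustration; they do not describe any particular instrument.

```python
def beam_splitter_efficiency(r2, rs2, ts2, rp2, tp2, gamma_s2, gamma_p2):
    """Efficiency of Eq. (4.136i).

    Every argument is a squared magnitude evaluated at one wavenumber:
    r2 = |r|^2, rs2 = |r_s|^2, ts2 = |t_s|^2, rp2 = |r_p|^2, tp2 = |t_p|^2,
    gamma_s2 = |gamma_s^(abc)|^2, gamma_p2 = |gamma_p^(abc)|^2.
    """
    return 2.0 * r2 * (rs2 * ts2 * gamma_s2 + rp2 * tp2 * gamma_p2)


# Ideal case quoted in the text: 50/50 splitting and lossless mirrors.
eta_ideal = beam_splitter_efficiency(1.0, 0.5, 0.5, 0.5, 0.5, 1.0, 1.0)

# Hypothetical non-ideal coefficients (assumed values, for illustration only).
eta_lossy = beam_splitter_efficiency(0.97, 0.45, 0.50, 0.40, 0.55, 1.0, 1.0)

print(eta_ideal, eta_lossy)
```

The ideal inputs reproduce $\eta = 1$, while any departure from equal splitting or from unit mirror reflectance pulls the result below one.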
Traditional interferometers have beams with a circular cross section. Equation (4D.6b) in Appendix 4D gives the formula for the two-dimensional forward Fourier transform of a circular pupil function when R is the pupil radius,

\[
\Pi_A(\vec{u})\Big|_{\text{circle of radius } R} = R\cdot\frac{J_1(2\pi R\,|\vec{u}|)}{|\vec{u}|} . \tag{4.137a}
\]

Here $J_1$ is the first-order Bessel function of the first kind. We note that this two-dimensional Fourier transform depends only on the magnitude of vector $\vec{u}$; and, since $J_1$ is always real-valued for real arguments (see Fig. 4.23), the transform is always real. Substitution of $\sigma\vec{\Delta}$ for $\vec{u}$ in (4.137a) gives

\[
\Pi_A(\sigma\vec{\Delta})\Big|_{\text{circle of radius } R}
= R\cdot\frac{J_1(2\pi R\cdot|\sigma|\cdot|\vec{\Delta}|)}{|\sigma|\cdot|\vec{\Delta}|}
= R\cdot\frac{J_1(4\pi R\cdot|\sigma|\cdot|\hat{n}_M - \hat{z}|)}{2\cdot|\sigma|\cdot|\hat{n}_M - \hat{z}|} , \tag{4.137b}
\]

where in the last step we replace $\vec{\Delta}$ with its definition from Eq. (4.126b): $\vec{\Delta} = 2(\hat{n}_M - \hat{z})$.
[Figure 4.23. Graph of $J_1(x)$ for $-30 \le x \le 30$.]

[Figure 4.24. Graph of $2J_1(x)/x$ for $-30 \le x \le 30$.]

Both $\hat{n}_M$, the normal vector to the reflecting surface of the moving mirror, and $\hat{z}$, the vector pointing down the interferometer beam along the optical axis, have unit length. Because the angle between them is always small (the moving mirror is assumed to be only slightly misaligned) it follows that $|\hat{n}_M - \hat{z}|$ is the misalignment angle of the moving mirror with respect to the optical axis and [see discussion following Eq. (4B.4h) in Appendix 4B], the angle between rays reflecting from a perfectly aligned moving mirror and rays reflecting from a slightly misaligned moving mirror is always $\theta_d = 2\,|\hat{n}_M - \hat{z}|$. We define

\[
\theta_{ma} = |\hat{n}_M - \hat{z}| \tag{4.137c}
\]

to be the misalignment angle of the interferometer’s moving mirror for a beam with a circular cross section and write (4.137b) as

\[
\Pi_A(\sigma\vec{\Delta})\Big|_{\text{circle of radius } R} = R\cdot\frac{J_1(4\pi R\cdot|\sigma|\cdot\theta_{ma})}{2\cdot|\sigma|\cdot\theta_{ma}} . \tag{4.137d}
\]

It follows that

\[
\frac{1}{A}\,\Pi_A(\sigma\vec{\Delta})\Big|_{\text{circle of radius } R} = \frac{J_1(4\pi R\cdot|\sigma|\cdot\theta_{ma})}{2\pi R\cdot|\sigma|\cdot\theta_{ma}} \tag{4.137e}
\]

because the area of a circular beam is, of course, $A = \pi R^2$. To see how this function behaves, we note that the right-hand side of (4.137e) can be written as

\[
2\,\frac{J_1(x)}{x} \quad\text{for } x = 4\pi R\,|\sigma|\,\theta_{ma}
\]

and graph this function of x in Fig. 4.24. Because

\[
J_1(-x) = -J_1(x) , \tag{4.137f}
\]

function $J_1$ is an odd function (see Sec. 2.3 of Chapter 2 for a description of what an odd function is).71 This means that

\[
\frac{J_1(4\pi R(-\sigma)\theta_{ma})}{2\pi R(-\sigma)\theta_{ma}} = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} , \tag{4.137g}
\]

which shows that

\[
\frac{1}{A}\,\Pi_A(\sigma\vec{\Delta})\Big|_{\text{circle of radius } R}
\]

is an even function of $\sigma$. Consequently the absolute value signs can be dropped from $\sigma$ so that Eq. (4.137e) becomes

\[
\frac{1}{A}\,\Pi_A(\sigma\vec{\Delta})\Big|_{\text{circle of radius } R} = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} . \tag{4.137h}
\]

71
The standard series formula for $J_1$ shows at once that it is odd. See Eq. (9.1.10) in Handbook of Mathematical Functions, edited by Milton Abramowitz and Irene A. Stegun (National Bureau of Standards, Applied Mathematics Series 55, November 1964), p. 360.
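
To get a feel for the size of the factor in Eq. (4.137h), the following minimal sketch evaluates $2J_1(x)/x$ with $x = 4\pi R\sigma\theta_{ma}$ using only the Python standard library. $J_1$ is computed here from Bessel's integral $J_1(x) = \frac{1}{\pi}\int_0^{\pi}\cos(\tau - x\sin\tau)\,d\tau$ with a trapezoid rule, which is a convenience chosen for this illustration rather than a method used in the text; the function names and the sample values of R, σ, and θ_ma are likewise assumptions.

```python
import math


def bessel_j1(x, n=2000):
    """J1(x) from Bessel's integral, evaluated with an n-panel trapezoid rule."""
    h = math.pi / n
    # End-point contributions at tau = 0 and tau = pi, each with weight 1/2.
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for k in range(1, n):
        tau = k * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi


def misalignment_factor(radius_cm, sigma_cm, theta_ma):
    """Right-hand side of Eq. (4.137h): J1(4*pi*R*sigma*theta)/(2*pi*R*sigma*theta)."""
    x = 4.0 * math.pi * radius_cm * sigma_cm * theta_ma
    if abs(x) < 1e-6:
        # Small-argument limit 2*J1(x)/x ~ 1 - x**2/8, which tends to 1 as x -> 0.
        return 1.0 - x * x / 8.0
    return 2.0 * bessel_j1(x) / x


# Hypothetical numbers: 2.5 cm beam radius, 2000 cm^-1, 10 microradian tilt.
factor = misalignment_factor(2.5, 2000.0, 1.0e-5)
factor_negative_sigma = misalignment_factor(2.5, -2000.0, 1.0e-5)
print(factor, factor_negative_sigma)
```

For these assumed values the argument is $x \approx 0.63$, and the factor sits a few percent below one; changing the sign of σ leaves it unchanged, as Eq. (4.137g) requires.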
Hence for an interferometer beam with a circular cross section, Eq. (4.136h) can be written as

\[
P_{\mathrm{bal}}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\;
\eta(\sigma)\,\mathcal{L}(\sigma) \left[ 1 + W\cdot\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}\cdot\cos(2\pi\sigma\chi\cos\theta_\varepsilon) \right] . \tag{4.137i}
\]

Returning to the original definition of $A^{-1}\Pi_A(\sigma\vec{\Delta})$ in Eq. (4.134c), we have, as $\vec{\Delta}$ goes to zero,

\[
\frac{1}{A}\,\Pi_A(\sigma\vec{\Delta})\Big|_{\vec{\Delta}=0}
= \frac{1}{A} \iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho};A)\, e^{-2\pi i\,\vec{\rho}\cdot(\sigma\vec{\Delta})}\Big|_{\vec{\Delta}\to 0}
= \frac{1}{A}\cdot A = 1 \tag{4.137j}
\]

for the pupil functions $\Pi(\vec{\rho};A)$ of beams with any shape cross section. According to Eq. (4.137c) and the discussion preceding it,

\[
|\vec{\Delta}| = 2\,|\hat{n}_M - \hat{z}| = 2\theta_{ma} .
\]

This means the limit when $\vec{\Delta} \to 0$ is the same as the limit when $\theta_{ma} \to 0$. For a beam with a circular cross section, it must then be true that [see Eqs. (4.137h) and (4.137j)]

\[
\lim_{\theta_{ma}\to 0} \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} = 1 . \tag{4.137k}
\]

Equation (4.137i) for a perfectly aligned system then becomes

\[
P_{\mathrm{bal}}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\;
\eta(\sigma)\,\mathcal{L}(\sigma) \big[ 1 + W\cos(2\pi\sigma\chi\cos\theta_\varepsilon) \big] . \tag{4.137$\ell$}
\]

If we assume that the interferometer’s field of view is sufficiently narrow that $\cos\theta_\varepsilon \cong 1$, then Eq. (4.137i), for an imperfectly aligned moving mirror, becomes

\[
P_{\mathrm{bal}}(\chi) = \frac{A\,\Delta\Omega}{2} \int_{-\infty}^{\infty}
\eta(\sigma)\,\mathcal{L}(\sigma) \left[ 1 + W\cdot\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}\cdot\cos(2\pi\sigma\chi) \right] d\sigma \tag{4.138a}
\]
where
\[
\Delta\Omega = \text{solid angle of the interferometer’s field of view.} \tag{4.138b}
\]

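Equations (4.92a)–(4.92d) above let us write that

\[
|r(-\sigma)|^2 = r(-\sigma)\,r(-\sigma)^{*} = r(\sigma)^{*}\,r(\sigma) = r(\sigma)\,r(\sigma)^{*} = |r(\sigma)|^2 , \tag{4.139a}
\]
\[
|r_s(-\sigma)|^2 = r_s(-\sigma)\,r_s(-\sigma)^{*} = r_s(\sigma)^{*}\,r_s(\sigma) = r_s(\sigma)\,r_s(\sigma)^{*} = |r_s(\sigma)|^2 , \tag{4.139b}
\]
\[
|t_s(-\sigma)|^2 = t_s(-\sigma)\,t_s(-\sigma)^{*} = t_s(\sigma)^{*}\,t_s(\sigma) = t_s(\sigma)\,t_s(\sigma)^{*} = |t_s(\sigma)|^2 , \tag{4.139c}
\]
\[
|r_p(-\sigma)|^2 = r_p(-\sigma)\,r_p(-\sigma)^{*} = r_p(\sigma)^{*}\,r_p(\sigma) = r_p(\sigma)\,r_p(\sigma)^{*} = |r_p(\sigma)|^2 , \tag{4.139d}
\]
and
\[
|t_p(-\sigma)|^2 = t_p(-\sigma)\,t_p(-\sigma)^{*} = t_p(\sigma)^{*}\,t_p(\sigma) = t_p(\sigma)\,t_p(\sigma)^{*} = |t_p(\sigma)|^2 . \tag{4.139e}
\]

Equation (4.125d) shows that, using rule (4E.6a) in Appendix 4E to drop the superfluous argument $\vec{\varepsilon}$,

\[
\big|\gamma^{(abc)}_s(-\sigma)\big|^2 = \big|\gamma^{(abc)}_s(\sigma)\big|^2 \quad\text{and}\quad \big|\gamma^{(abc)}_p(-\sigma)\big|^2 = \big|\gamma^{(abc)}_p(\sigma)\big|^2 . \tag{4.139f}
\]

Consequently, $\eta(\sigma)$ defined in Eq. (4.136i) is an even function of $\sigma$:

\[
\begin{aligned}
\eta(-\sigma) &= 2\,|r(-\sigma)|^2 \Big[\, |r_s(-\sigma)|^2|t_s(-\sigma)|^2\big|\gamma^{(abc)}_s(-\sigma)\big|^2 + |r_p(-\sigma)|^2|t_p(-\sigma)|^2\big|\gamma^{(abc)}_p(-\sigma)\big|^2 \Big] \\
&= 2\,|r(\sigma)|^2 \Big[\, |r_s(\sigma)|^2|t_s(\sigma)|^2\big|\gamma^{(abc)}_s(\sigma)\big|^2 + |r_p(\sigma)|^2|t_p(\sigma)|^2\big|\gamma^{(abc)}_p(\sigma)\big|^2 \Big] = \eta(\sigma) .
\end{aligned}
\tag{4.139g}
\]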

We also know that $\mathcal{L}$ is an even function of $\sigma$ [see Eq. (4.136f)], that $\cos(2\pi\sigma\chi)$ is an even
function of $\sigma$, and that

$$\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}$$

is an even function of $\sigma$ [see Eq. (4.137g)]. Therefore, the entire product being integrated in
(4.138a),

$$\eta(\sigma)\,\mathcal{L}(\sigma)\left[1 + W\cdot\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}\cdot\cos(2\pi\sigma\chi)\right] ,$$

is an even function of $\sigma$. Hence we can write, using the rule from Eq. (2.19) in Chapter 2 and also
that $L(\sigma) = 2\mathcal{L}(\sigma)$ from Eq. (4.136g), that the balanced-signal power specified in (4.138a) is

$$P_{\mathrm{bal}}(\chi) = \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty}\eta(\sigma)\,L(\sigma)\left[1 + W\cdot\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}\cdot\cos(2\pi\sigma\chi)\right]d\sigma . \qquad (4.140a)$$

Making the interferometer perfectly aligned by taking $\theta_{ma} = 0$, we see from the limit

$$\lim_{\theta_{ma}\to 0}\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} = 1$$

in (4.137k) that the Bessel function ratio disappears in (4.140a). Now we can write
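
The behavior of the Bessel-function ratio can be checked numerically. The following is only a sketch under stated assumptions: it uses the standard integral representation $J_1(x) = \frac{1}{\pi}\int_0^{\pi}\cos(\tau - x\sin\tau)\,d\tau$ rather than anything from this chapter, the function names `bessel_j1` and `misalignment_factor` are hypothetical, and the sample values of $R$, $\sigma$, and $\theta_{ma}$ are illustrative only.

```python
import math

def bessel_j1(x, n=2000):
    """J1(x) from (1/pi) * integral_0^pi cos(tau - x*sin(tau)) dtau, trapezoid rule."""
    h = math.pi / n
    # Endpoint terms carry half weight in the trapezoid rule.
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi - x * math.sin(math.pi)))
    for k in range(1, n):
        tau = k * h
        total += math.cos(tau - x * math.sin(tau))
    return total * h / math.pi

def misalignment_factor(R, sigma, theta_ma):
    """Ratio J1(4*pi*R*sigma*theta_ma) / (2*pi*R*sigma*theta_ma) appearing in (4.140a)."""
    half_arg = 2.0 * math.pi * R * sigma * theta_ma
    if half_arg == 0.0:
        return 1.0  # limiting value for a perfectly aligned mirror
    return bessel_j1(2.0 * half_arg) / half_arg

# Illustrative (assumed) values: R = 1 cm, sigma = 1000 cm^-1.
for theta in (0.0, 1.0e-6, 1.0e-5, 1.0e-4):
    print(theta, misalignment_factor(1.0, 1000.0, theta))
```

With these assumed values the ratio stays close to one for small tilts and falls as the tilt grows, which is the sense in which the ratio drops out of (4.140a) when $\theta_{ma} = 0$.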

$$P_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty}\eta(\sigma)\,S(\sigma)\,[1 + W\cos(2\pi\sigma\chi)]\,d\sigma , \qquad (4.140b)$$
where we define
$$S(\sigma) = A\,\Delta\Omega\,L(\sigma) \qquad (4.140c)$$

to be the total optical power per unit wavenumber interval entering the interferometer. All that
needs to be done now is to make the final idealization, $\eta = 1$, and we get the same formula for the
ideal interferometer signal given in Eq. (1.19f) in Chapter 1,

$$P_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty}S(\sigma)\,[1 + W\cos(2\pi\sigma\chi)]\,d\sigma . \qquad (4.140d)$$

The only difference is that in Chapter 1 the balanced optical power is called I ( cb ) instead of
$P_{\mathrm{bal}}(\chi)$, and that now we can get progressively less idealized formulas for the balanced optical
power by reversing the simplifications leading to (4.140d).

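In most optical textbooks it is customary to assume that $W = 1$, allowing Eq. (4.140d) to be
written as

$$P_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty}S(\sigma)\,d\sigma + \frac{1}{2}\int_{0}^{\infty}S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma = \text{constant} + \frac{1}{2}\int_{0}^{\infty}S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma .$$

Separating out the nonconstant signal component that changes with $\chi$, we give it the name

$$I_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty}S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma . \qquad (4.141a)$$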

In this book, we call $I_{\mathrm{bal}}(\chi)$ the interferogram. Comparing this last result to Eq. (2.8b) in Chapter
2, we see that the interferogram is 1/4 of the Fourier cosine transform of $S(\sigma)$. Since

$$S(\sigma) = A\cdot\Delta\Omega\cdot L(\sigma) = 2A\cdot\Delta\Omega\cdot\mathcal{L}(\sigma) \quad\text{for } \sigma \ge 0 \qquad (4.141b)$$

and $\mathcal{L}$ is an even function [see Eqs. (4.136f) and (4.136g)], the definition of $S(\sigma)$ can be extended
to negative values of $\sigma$ by making $S$ another even function:

$$S(-\sigma) = S(\sigma) . \qquad (4.141c)$$

The cosine is also even, so we can then write the interferogram as [see Eq. (2.19) in Chapter 2]

$$I_{\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty}S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma .$$

The sine is an odd function and $S(\sigma)$ is even, so the product $S(\sigma)\sin(2\pi\sigma\chi)$ is an odd function of
$\sigma$. According to Eq. (2.17) of Chapter 2, the integral between $-\infty$ and $+\infty$ of any odd function is
zero, so

$$\int_{-\infty}^{\infty}S(\sigma)\sin(2\pi\sigma\chi)\,d\sigma = 0 .$$

Hence we can write

$$\begin{aligned}
I_{\mathrm{bal}}(\chi) &= \frac{1}{4}\int_{-\infty}^{\infty}S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma \pm \frac{i}{4}\int_{-\infty}^{\infty}S(\sigma)\sin(2\pi\sigma\chi)\,d\sigma \\
&= \frac{1}{4}\int_{-\infty}^{\infty}S(\sigma)\,[\cos(2\pi\sigma\chi) \pm i\sin(2\pi\sigma\chi)]\,d\sigma
\end{aligned}$$

or

$$I_{\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty}S(\sigma)\,e^{\pm 2\pi i\sigma\chi}\,d\sigma \qquad (4.141d)$$

using $\cos(\phi) \pm i\sin(\phi) = e^{\pm i\phi}$. This shows that the interferogram of an ideal interferometer is 1/4 of
either the forward or inverse Fourier transform of $S(\sigma)$.
Equations (4.141a) and (4.141d) are important because, as pointed out at the beginning of
Sec. 1.7 of Chapter 1, they show why people build Michelson interferometers. Reversing the
complex Fourier transform gives

$$S(\sigma) = 4\int_{-\infty}^{\infty}I_{\mathrm{bal}}(\chi)\,e^{\mp 2\pi i\sigma\chi}\,d\chi , \qquad (4.141e)$$

or, using Eq. (2.8f) in Chapter 2 to reverse the cosine transform in (4.141a),

$$S(\sigma) = 8\int_{0}^{\infty}I_{\mathrm{bal}}(\chi)\cos(2\pi\sigma\chi)\,d\chi . \qquad (4.141f)$$

To find $S(\sigma)$, the radiation spectrum as a function of wavenumber, we need only measure $I_{\mathrm{bal}}(\chi)$
and take its transform.
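
As a rough numerical illustration of the transform pair (4.141a) and (4.141f), the sketch below builds an interferogram from a made-up spectrum and then recovers the spectrum from it. Everything specific here is an assumption chosen for illustration: the Gaussian band shape, its center `SIGMA0` and width `WIDTH`, the grid spacings, and the helper names. The integrals are approximated with a simple trapezoid rule.

```python
import math

def trapezoid(values, step):
    """Trapezoid-rule integral of equally spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

SIGMA0, WIDTH = 1000.0, 50.0  # assumed band center and width, in cm^-1

def spectrum(sigma):
    """Hypothetical spectrum S(sigma): a Gaussian band, negligible near sigma = 0."""
    return math.exp(-0.5 * ((sigma - SIGMA0) / WIDTH) ** 2)

D_SIGMA = 2.0
SIGMAS = [600.0 + D_SIGMA * k for k in range(401)]   # 600 to 1400 cm^-1
S_VALUES = [spectrum(s) for s in SIGMAS]
D_CHI = 5.0e-5
CHIS = [D_CHI * k for k in range(601)]               # 0 to 0.03 cm

def interferogram(chi):
    """I_bal(chi) = (1/2) * integral of S(sigma) cos(2 pi sigma chi), as in (4.141a)."""
    samples = [s_val * math.cos(2.0 * math.pi * s * chi)
               for s, s_val in zip(SIGMAS, S_VALUES)]
    return 0.5 * trapezoid(samples, D_SIGMA)

I_BAL = [interferogram(chi) for chi in CHIS]

def recovered_spectrum(sigma):
    """S(sigma) = 8 * integral_0^inf of I_bal(chi) cos(2 pi sigma chi), as in (4.141f)."""
    samples = [i_val * math.cos(2.0 * math.pi * sigma * chi)
               for chi, i_val in zip(CHIS, I_BAL)]
    return 8.0 * trapezoid(samples, D_CHI)

for sigma in (900.0, 1000.0, 1050.0):
    print(sigma, spectrum(sigma), recovered_spectrum(sigma))
```

The path-difference range is cut off at 0.03 cm only because the assumed band is broad enough for its interferogram to have died away by then; a narrower band would need a longer scan.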

4.17 Energy Flux in the Unbalanced Radiation Fields


As was explained in Chapter 1, one very common application of Michelson interferometers is as
infrared spectrometers. When interferometers are used to analyze infrared spectra, the relatively
warm optical elements used to shape and direct the signal beam, like any other type of warm
surface, can generate large amounts of infrared radiation. As we remarked when discussing the
operation of a standard interferometer in Sec. 4.11 above, the unbalanced radiation field from the
input beam is irrelevant because it goes back out the entrance aperture, never reaching the
detector; the same, however, cannot be said of the unbalanced signal from the infrared
background. Figure 4.25 shows background radiation from the detector side of the beam splitter
entering the interferometer to generate both a balanced and an unbalanced interference signal.
The balanced signal is a combination of two sets of rays, each set having been reflected once and
transmitted once at the beam splitter, while the unbalanced signal is a combination of two sets of
rays where one set has been transmitted twice at the beam splitter and one set has been reflected
twice at the beam splitter. Figure 4.25 traces this process through the interferometer, with the
balanced background signal represented by a combination of dash-dot and dashed rays and the
unbalanced signal represented by a combination of solid and dashed rays. The balanced
background signal travels out the input port, leaving the system; but the unbalanced background
signal is sent back to the detector, creating an unwanted optical signal. This unbalanced
background term is often a relatively large fraction of the total interference signal reaching the
detector. In well-designed interferometers, it can usually be eliminated by the same basic
calibration procedures used in other infrared spectrometers (with a few special twists due to the
Fourier nature of the spectral measurement—see Sec. 5.19 of Chapter 5), but when designing an
interferometer we need to calculate the background spectrum and unbalanced interferogram
because, as will be explained in Chapters 6, 7, and 8, it contributes to the noise contaminating the
interferometer measurements.
The derivation of the unbalanced signal equations is very similar to the derivation of balanced
signal equations, with every major step in the unbalanced derivation having its counterpart in the
balanced derivation. Because we have just completed a detailed, step-by-step derivation of the
balanced signal equations, there is no need for an equally detailed derivation of the unbalanced
signal equations. What we do instead is to list, with a minimum of explanation, the important
equations of the unbalanced derivation, always specifying the equations in the balanced
derivation to which they correspond. This approach avoids repeating at length points already
covered during the balanced derivation while giving the interested reader enough information to
fill in the details if so inclined.
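Just like before, we start with a monochromatic plane wave,

$$\text{Complex E field} = \left(\hat{x}\,E_{0x}^{(\mathrm{back})} + \hat{y}\,E_{0y}^{(\mathrm{back})}\right)e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r} - ct)} \qquad (4.142a)$$
and
$$\text{Complex B field} = c^{-1}\left(\hat{y}\,E_{0x}^{(\mathrm{back})} - \hat{x}\,E_{0y}^{(\mathrm{back})}\right)e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r} - ct)} . \qquad (4.142b)$$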

We assume that intelligent efforts are made to control the background radiation from the warm
optical surfaces, so that, unless Ω̂ is parallel to or nearly parallel to the optical axis ẑ , the
background radiation cannot reach the detector. This means that, as was the case for the balanced
signal, only direction-chopped radiation at relatively small angles θb can reach the detector.
Equations (4.142a) and (4.142b) correspond to (4.77a) and (4.77b) in the balanced derivation, and
the complex plane wave they specify represents radiation entering the interferometer from the
detector side of the system (neglecting terms of order θb ). Again we imagine an unfolded system
of coordinates such as the one shown in Fig. 4.18 above, only now the coordinates are unfolded
in such a way as to trace the unbalanced background signal—rather than the balanced input
signal—into and out of the interferometer. Both ways of unfolding the interferometer end up
specifying the same exit beam traveling to the detector. Therefore the x̂, ŷ, ẑ coordinate system
used for vectors Ω̂ and r̂ in Eqs. (4.142a) and (4.142b) is the same coordinate system as the one
located in the exit beam of the unfolded interferometer in Fig. 4.18. In this sense, the x̂, ŷ, ẑ
coordinate system used to specify Ω̂ and r̂ in (4.142a) and (4.142b) is the same as the x̂, ŷ, ẑ
coordinate system used to specify Ω̂ and r̂ in Eqs. (4.77a) and (4.77b).

FIGURE 4.25. Background radiance from the detector side of the interferometer. [Diagram labels: detector side of the interferometer, input side of the interferometer, beam splitter, compensator plate, fixed mirror, and moving mirror displaced by χ/2.]

The plane wave specified in Eqs. (4.142a) and (4.142b) can be decomposed into two linearly
polarized plane waves: one plane wave that has $E_{0x}^{(\mathrm{back})}$ for its complex amplitude and is linearly
polarized perpendicular to the plane of incidence on the beam splitter, and one plane wave that
has $E_{0y}^{(\mathrm{back})}$ for its complex amplitude and is linearly polarized parallel to the plane of incidence on
the beam splitter. Tracing the background rays through Fig. 4.25, we find that the unbalanced
radiation fields for the rays traveling out and back along the moving-mirror arm are (again neglecting
terms of order $\theta_b$)

$$\text{Complex E field} = r_M\left[\hat{x}\,E_{0x}^{(\mathrm{back})}\,t_s^2\,\gamma_s^{(uv)} + \hat{y}\,E_{0y}^{(\mathrm{back})}\,t_p^2\,\gamma_p^{(uv)}\right]e^{2\pi i\sigma[\hat{\Omega}_d\cdot(\vec{r} + \hat{z}\chi) - ct]} \qquad (4.143a)$$
and
$$\text{Complex B field} = \frac{r_M}{c}\left[\hat{y}\,E_{0x}^{(\mathrm{back})}\,t_s^2\,\gamma_s^{(uv)} - \hat{x}\,E_{0y}^{(\mathrm{back})}\,t_p^2\,\gamma_p^{(uv)}\right]e^{2\pi i\sigma[\hat{\Omega}_d\cdot(\vec{r} + \hat{z}\chi) - ct]} . \qquad (4.143b)$$

Appendix 4F presents the tunnel diagrams used to construct the $\gamma^{(uv)}$ parameters for the s-type
and p-type plane waves passing through the beam-splitter substrate and compensator plate; the
$t_s$, $t_p$, $r_M$, $\hat{\Omega}_d$, and $\chi$ variables all have the same meaning as before. Equations (4.143a) and
(4.143b) correspond to Eqs. (4.85a) and (4.85b) in the balanced derivation. Corresponding to Eqs.
(4.84a) and (4.84b), we have (neglecting terms of order $\theta_b$)

$$\text{Complex E field} = r_M\left[\hat{x}\,E_{0x}^{(\mathrm{back})}\,r_s^2\,\gamma_s^{(uv)} + \hat{y}\,E_{0y}^{(\mathrm{back})}\,r_p^2\,\gamma_p^{(uv)}\right]e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r} - ct)} \qquad (4.144a)$$
and
$$\text{Complex B field} = \frac{r_M}{c}\left[\hat{y}\,E_{0x}^{(\mathrm{back})}\,r_s^2\,\gamma_s^{(uv)} - \hat{x}\,E_{0y}^{(\mathrm{back})}\,r_p^2\,\gamma_p^{(uv)}\right]e^{2\pi i\sigma(\hat{\Omega}\cdot\vec{r} - ct)} . \qquad (4.144b)$$

From the discussion following Eq. (4.83) above, we know that the amplitude reflection
coefficients for plane waves reflecting off the back side of the beam splitter are $Wr_s(\sigma)$ and
$Wr_p(\sigma)$, with $W = 1$ or $W = -1$ depending on the type of beam splitter being used. The $W$
parameter occurred in Eqs. (4.84a) and (4.84b) of the balanced derivation because the $r_s$ and $r_p$
parameters appeared to the first power in the formulas. In Eqs. (4.144a) and (4.144b), on the other
hand, only the squares of $r_s$ and $r_p$ appear, which means, since $W^2 = 1$, that the $W$ parameter
disappears. The formulas for the recombined, unbalanced fields corresponding to Eqs. (4.88a)
and (4.88b) in the balanced derivation are (neglecting terms of order $\theta_b$)

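Complex unbalanced E field

$$\begin{aligned}
= {}& \hat{x}\,r_M\,E_{0x}^{(\mathrm{back})}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]}\left[r_s^2\,\gamma_s^{(uv)} + t_s^2\,\gamma_s^{(uv)}\,e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\right] \\
& + \hat{y}\,r_M\,E_{0y}^{(\mathrm{back})}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]}\left[r_p^2\,\gamma_p^{(uv)} + t_p^2\,\gamma_p^{(uv)}\,e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\right]
\end{aligned} \qquad (4.145a)$$
and

Complex unbalanced B field

$$\begin{aligned}
= {}& \hat{y}\,\frac{r_M}{c}\,E_{0x}^{(\mathrm{back})}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]}\left[r_s^2\,\gamma_s^{(uv)} + t_s^2\,\gamma_s^{(uv)}\,e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\right] \\
& - \hat{x}\,\frac{r_M}{c}\,E_{0y}^{(\mathrm{back})}\,e^{2\pi i\sigma[\hat{\Omega}\cdot\vec{r} - ct]}\left[r_p^2\,\gamma_p^{(uv)} + t_p^2\,\gamma_p^{(uv)}\,e^{2\pi i\sigma\chi(\hat{\Omega}\cdot\hat{z})}\,e^{4\pi i\sigma(\hat{n}_M - \hat{z})\cdot\vec{r}}\right] .
\end{aligned} \qquad (4.145b)$$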

For future use, we note that Eqs. (4.89g)–(4.89k), (4.92a)–(4.92e), and (4.139a)–(4.139e) already
specify how $r_s$, $r_p$, $t_s$, $t_p$, and $r_M = r$ behave as functions of wavenumber $\sigma$; and the $\gamma_{s,p}^{(uv)}$ can be
set up to behave the same way the $\gamma_{s,p}^{(abc)}$ do in Eqs. (4.97c) and (4.139f),

$$\gamma_s^{(uv)}(\vec{\varepsilon}, -\sigma) = \gamma_s^{(uv)}(\vec{\varepsilon}, \sigma)^* \quad\text{and}\quad \gamma_p^{(uv)}(\vec{\varepsilon}, -\sigma) = \gamma_p^{(uv)}(\vec{\varepsilon}, \sigma)^* \qquad (4.145c)$$

with

$$|\gamma_s^{(uv)}(-\sigma)|^2 = |\gamma_s^{(uv)}(\sigma)|^2 \quad\text{and}\quad |\gamma_p^{(uv)}(-\sigma)|^2 = |\gamma_p^{(uv)}(\sigma)|^2 . \qquad (4.145d)$$

Equation (4F.2a) in Appendix 4F points out that, like the magnitudes of the $\gamma_{s,p}^{(abc)}$ parameters, the
magnitudes of the $\gamma_{s,p}^{(uv)}$ parameters are functions only of wavenumber $\sigma$.
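The next major step in the balanced derivation was to represent the radiation entering the
system by integrals over $dw$ and $d^2u$, as in Eqs. (4.103a) and (4.103b). We now do the same for
the background radiation entering the interferometer from the detector side of the system
(neglecting terms of order $\theta_b$),

$$\vec{E}^{(\mathrm{back})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty}dw\iint_{-\infty}^{\infty}d^2u\left(\frac{c}{w^2}\right)\left[\hat{x}\,\mathcal{E}_x^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) + \hat{y}\,\mathcal{E}_y^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right)\right]e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]} \qquad (4.146a)$$

and

$$\vec{B}^{(\mathrm{back})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty}dw\iint_{-\infty}^{\infty}d^2u\left(\frac{1}{w^2}\right)\left[\hat{y}\,\mathcal{E}_x^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) - \hat{x}\,\mathcal{E}_y^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right)\right]e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]} . \qquad (4.146b)$$

We note that $\vec{E}^{(\mathrm{back})}$ and $\vec{B}^{(\mathrm{back})}$ must be real whereas $\mathcal{E}_x^{(\mathrm{back})}$ and $\mathcal{E}_y^{(\mathrm{back})}$ are allowed to be complex.
In addition $\mathcal{E}_x^{(\mathrm{back})}$ and $\mathcal{E}_y^{(\mathrm{back})}$ must satisfy all the symmetry relations that $\mathcal{E}_x$ and $\mathcal{E}_y$ satisfied for
the incident signal radiation entering the interferometer [see, for example, Eqs. (4.100a) and
(4.100b)],

$$\mathcal{E}_x^{(\mathrm{back})}(\vec{\varepsilon}, z, -\sigma) = \mathcal{E}_x^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)^* \qquad (4.146c)$$
and
$$\mathcal{E}_y^{(\mathrm{back})}(\vec{\varepsilon}, z, -\sigma) = \mathcal{E}_y^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)^* . \qquad (4.146d)$$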
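The total E and B fields for the unbalanced radiation traveling back to the detector are also
written as integrals over $dw$ and $d^2u$,

$$\begin{aligned}
\vec{E}^{(\mathrm{unb})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty}dw\iint_{-\infty}^{\infty}d^2u\,\frac{c}{w^2}\,e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]}\,r\!\left(-\frac{w}{c}\right)\Bigg\{
& \hat{x}\,\mathcal{E}_x^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right)\cdot\gamma_s^{(uv)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) \\
& \cdot\left[r_s\!\left(-\frac{w}{c}\right)^{2} + t_s\!\left(-\frac{w}{c}\right)^{2}e^{-\frac{2\pi i w\chi}{c}\sqrt{1 - (cu/w)^2}}\,e^{-\frac{4\pi i w}{c}(\hat{n}_M - \hat{z})\cdot\vec{r}}\right] \\
& + \hat{y}\,\mathcal{E}_y^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right)\cdot\gamma_p^{(uv)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) \\
& \cdot\left[r_p\!\left(-\frac{w}{c}\right)^{2} + t_p\!\left(-\frac{w}{c}\right)^{2}e^{-\frac{2\pi i w\chi}{c}\sqrt{1 - (cu/w)^2}}\,e^{-\frac{4\pi i w}{c}(\hat{n}_M - \hat{z})\cdot\vec{r}}\right]\Bigg\}
\end{aligned} \qquad (4.147a)$$
and
$$\begin{aligned}
\vec{B}^{(\mathrm{unb})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty}dw\iint_{-\infty}^{\infty}d^2u\,\frac{1}{w^2}\,e^{2\pi i[\vec{u}\cdot\vec{\rho} + wt]}\,r\!\left(-\frac{w}{c}\right)\Bigg\{
& \hat{y}\,\mathcal{E}_x^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right)\cdot\gamma_s^{(uv)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) \\
& \cdot\left[r_s\!\left(-\frac{w}{c}\right)^{2} + t_s\!\left(-\frac{w}{c}\right)^{2}e^{-\frac{2\pi i w\chi}{c}\sqrt{1 - (cu/w)^2}}\,e^{-\frac{4\pi i w}{c}(\hat{n}_M - \hat{z})\cdot\vec{r}}\right] \\
& - \hat{x}\,\mathcal{E}_y^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right)\cdot\gamma_p^{(uv)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) \\
& \cdot\left[r_p\!\left(-\frac{w}{c}\right)^{2} + t_p\!\left(-\frac{w}{c}\right)^{2}e^{-\frac{2\pi i w\chi}{c}\sqrt{1 - (cu/w)^2}}\,e^{-\frac{4\pi i w}{c}(\hat{n}_M - \hat{z})\cdot\vec{r}}\right]\Bigg\} .
\end{aligned} \qquad (4.147b)$$

These two equations correspond to Eqs. (4.104a) and (4.104b) in the balanced derivation
(neglecting terms of order $\theta_b$).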
The unbalanced background signal from the warm optical surfaces can be thought of as
traveling to the detector from the beam splitter along the same ray paths as the balanced optical
signal; consequently, it ends up being processed by the system much the same way as the
balanced optical signal. For this reason, we now give T, A subscripts to

$$\vec{E}^{(\mathrm{unb})},\ \vec{B}^{(\mathrm{unb})},\ \mathcal{E}_x^{(\mathrm{back})},\ \text{and}\ \mathcal{E}_y^{(\mathrm{back})}$$

to show that they also represent time-chopped and beam-chopped radiation fields. The
unbalanced E and B fields are time-chopped to the same 2T time interval as the balanced fields,
because the detector records both signal and background for the same length of time. Although
the effective cross-sectional area of the background beam is probably somewhat larger than that
of the input beam, in a well-designed system they are roughly the same size and can be
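represented by the same symbol A. Again we treat

$$\vec{E}^{(\mathrm{unb})},\ \vec{B}^{(\mathrm{unb})},\ \mathcal{E}_x^{(\mathrm{back})},\ \text{and}\ \mathcal{E}_y^{(\mathrm{back})}$$

as random quantities. Hence, the z component for the Poynting vector for the unbalanced
radiation fields is another random quantity given by

$$\vec{S}_{TA}^{(\mathrm{unb})}\cdot\hat{z} = \frac{1}{\mu_o}\left(\vec{E}_{TA}^{(\mathrm{unb})}\times\vec{B}_{TA}^{(\mathrm{unb})}\right)\cdot\hat{z} . \qquad (4.148a)$$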

This corresponds to Eq. (4.122a) in the balanced derivation. The average radiant energy from the
unbalanced background reaching the interferometer's detector during a time interval 2T over a
beam cross-sectional area A is now

$$\mathrm{E}\!\left(\frac{1}{\mu_o}\int_{-\infty}^{\infty}dt\iint_{-\infty}^{\infty}d^2\rho\left(\vec{E}_{TA}^{(\mathrm{unb})}\times\vec{B}_{TA}^{(\mathrm{unb})}\right)\cdot\hat{z}\right) , \qquad (4.148b)$$

which corresponds to the right-hand side of (4.122b) in the balanced derivation. Adding T, A
subscripts to $\mathcal{E}_x^{(\mathrm{back})}$ and $\mathcal{E}_y^{(\mathrm{back})}$ in Eqs. (4.147a) and (4.147b), and representing them as random
quantities, we substitute the right-hand sides of these two equations into the expression in
(4.148b) to get, after a great deal of algebra, that

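Average energy in unbalanced background signal over time 2T and beam cross section A

$$\begin{aligned}
= {}& \varepsilon_o\int_{-\infty}^{\infty}d\sigma\iint_{-\infty}^{\infty}\frac{d^2\varepsilon}{\sigma^2}\,|r(\sigma)|^2\Big\{\left[|r_s(\sigma)|^4 + |t_s(\sigma)|^4\right]|\gamma_s^{(uv)}(\sigma)|^2\,\mathrm{E}\!\left(|\tilde{\mathcal{E}}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)|^2\right) \\
& \qquad + \left[|r_p(\sigma)|^4 + |t_p(\sigma)|^4\right]|\gamma_p^{(uv)}(\sigma)|^2\,\mathrm{E}\!\left(|\tilde{\mathcal{E}}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)|^2\right)\Big\} \\
& + \varepsilon_o\int_{-\infty}^{\infty}d\sigma\iint_{-\infty}^{\infty}\frac{d^2\varepsilon}{\sigma^2}\,|r(\sigma)|^2\Big\{\Big[r_s(\sigma)^{*2}\,t_s(\sigma)^2\,|\gamma_s^{(uv)}(\sigma)|^2\,\mathrm{E}\!\left(\tilde{\mathcal{E}}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\,\tilde{\mathcal{E}}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon} - \vec{\Delta}, z, \sigma)^*\right) \\
& \qquad + r_p(\sigma)^{*2}\,t_p(\sigma)^2\,|\gamma_p^{(uv)}(\sigma)|^2\,\mathrm{E}\!\left(\tilde{\mathcal{E}}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\,\tilde{\mathcal{E}}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon} - \vec{\Delta}, z, \sigma)^*\right)\Big]\,e^{-2\pi i\sigma\chi\sqrt{1 - \varepsilon^2}}\Big\} \\
& + \varepsilon_o\int_{-\infty}^{\infty}d\sigma\iint_{-\infty}^{\infty}\frac{d^2\varepsilon}{\sigma^2}\,|r(\sigma)|^2\Big\{\Big[r_s(\sigma)^{2}\,t_s(\sigma)^{*2}\,|\gamma_s^{(uv)}(\sigma)|^2\,\mathrm{E}\!\left(\tilde{\mathcal{E}}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\,\tilde{\mathcal{E}}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon} + \vec{\Delta}, z, \sigma)^*\right) \\
& \qquad + r_p(\sigma)^{2}\,t_p(\sigma)^{*2}\,|\gamma_p^{(uv)}(\sigma)|^2\,\mathrm{E}\!\left(\tilde{\mathcal{E}}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\,\tilde{\mathcal{E}}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon} + \vec{\Delta}, z, \sigma)^*\right)\Big]\,e^{2\pi i\sigma\chi\sqrt{1 - \varepsilon^2}}\Big\} .
\end{aligned} \qquad (4.148c)$$

Here we have used Eqs. (4.102a)–(4.102d) to transform the integrals over $dw$ and $d^2u$ into
integrals over $d\sigma$ and $d^2\varepsilon$, and once again we have used $\vec{\Delta} = 2(\hat{n}_M - \hat{z})$. Equation (4.148c)
corresponds to Eq. (4.134a) in the balanced derivation.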
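Following the pattern of Eqs. (4.120a), (4.120b), (4.135a), and (4.135b), we now write

$$\frac{\varepsilon_o}{2TA\sigma^2}\,\mathrm{E}\!\left(|\tilde{\mathcal{E}}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)|^2\right) \cong \mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) , \qquad (4.149a)$$

$$\frac{\varepsilon_o}{2TA\sigma^2}\,\mathrm{E}\!\left(|\tilde{\mathcal{E}}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)|^2\right) \cong \mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) , \qquad (4.149b)$$

$$\mathrm{E}\!\left(\tilde{\mathcal{E}}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\,\tilde{\mathcal{E}}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^*\right) \cong 2T\mu_o c^2\sigma^2\,\mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)\,\Pi_A(\mp\sigma\vec{\Delta}) , \qquad (4.149c)$$

and

$$\mathrm{E}\!\left(\tilde{\mathcal{E}}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\,\tilde{\mathcal{E}}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^*\right) \cong 2T\mu_o c^2\sigma^2\,\mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)\,\Pi_A(\mp\sigma\vec{\Delta}) . \qquad (4.149d)$$

As was the case for the balanced derivation [see Eqs. (4.121a) and (4.121b)], $\mathcal{L}_x^{(\mathrm{back})}$ and $\mathcal{L}_y^{(\mathrm{back})}$ are
even functions of $\sigma$,

$$\mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, -\sigma) = \mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \qquad (4.149e)$$
and
$$\mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, -\sigma) = \mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) . \qquad (4.149f)$$

Glancing back at where these two functions came from, we see that they represent,
respectively, the x-polarized and y-polarized background optical power per unit area per unit solid
angle per unit $\sigma$ interval entering the interferometer from the detector side of the system.
Substitution of Eqs. (4.149a)–(4.149d) into (4.148c) gives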

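Average energy in unbalanced background signal over time 2T and beam cross section A

$$\begin{aligned}
= {}& 2TA\int_{-\infty}^{\infty}d\sigma\iint_{\text{field of view}}d^2\varepsilon\,|r(\sigma)|^2\Big\{\left[|r_s(\sigma)|^4 + |t_s(\sigma)|^4\right]|\gamma_s^{(uv)}(\sigma)|^2\,\mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
& \qquad + \left[|r_p(\sigma)|^4 + |t_p(\sigma)|^4\right]|\gamma_p^{(uv)}(\sigma)|^2\,\mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)\Big\} \\
& + 2TA\int_{-\infty}^{\infty}d\sigma\iint_{\text{field of view}}d^2\varepsilon\,|r(\sigma)|^2\Big\{\Big[r_s(\sigma)^{*2}\,t_s(\sigma)^2\,|\gamma_s^{(uv)}(\sigma)|^2\,\mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
& \qquad + r_p(\sigma)^{*2}\,t_p(\sigma)^2\,|\gamma_p^{(uv)}(\sigma)|^2\,\mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)\Big]\cdot\Big[\frac{1}{A}\,\Pi_A(\sigma\vec{\Delta})\,e^{-2\pi i\sigma\chi\sqrt{1 - \varepsilon^2}}\Big]\Big\} \\
& + 2TA\int_{-\infty}^{\infty}d\sigma\iint_{\text{field of view}}d^2\varepsilon\,|r(\sigma)|^2\Big\{\Big[r_s(\sigma)^{2}\,t_s(\sigma)^{*2}\,|\gamma_s^{(uv)}(\sigma)|^2\,\mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
& \qquad + r_p(\sigma)^{2}\,t_p(\sigma)^{*2}\,|\gamma_p^{(uv)}(\sigma)|^2\,\mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)\Big]\cdot\Big[\frac{1}{A}\,\Pi_A(\sigma\vec{\Delta})^*\,e^{2\pi i\sigma\chi\sqrt{1 - \varepsilon^2}}\Big]\Big\} ,
\end{aligned} \qquad (4.150)$$

where we use Eq. (4.1e) to replace $\varepsilon_o\mu_o c^2$ by one and Eq. (4.134d) to replace $\Pi_A(-\vec{u})$ by
$\Pi_A(\vec{u})^*$. This result corresponds to Eq. (4.135d) in the balanced derivation, except we have
anticipated the reasoning used to go from (4.135d) to (4.135e) by using the interferometer's field
of view to set the limits on the double integral over $d^2\varepsilon$. Strictly speaking, this should be the
field of view for the unbalanced background radiation coming from the warm optical surfaces
between the beam splitter and the detector, but in a well-designed system the two fields of view
are roughly the same. In this formula, the second triple integral over $d\sigma$ and $d^2\varepsilon$ is the complex
conjugate of the third triple integral over $d\sigma$ and $d^2\varepsilon$, ensuring that their sum is real. Since the
first triple integral is the integral of a real expression, evaluation of the right-hand side of (4.150)
produces a real number, which makes sense considering that this is the formula for the energy in
the unbalanced background signal.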

To make further simplifications in this energy formula, we break from the pattern of the
balanced derivation and use Eq. (4.150) to represent a somewhat idealized interferometer with a
nonideal beam-splitter film. From this point on to the end of this section, we are not so much
analyzing a likely type of Michelson setup as we are constructing a thought experiment to
discover hidden properties of the beam-splitter amplitude-transmission and amplitude-reflection
coefficients ts, tp, rs, and rp. The first step is to set up the interferometer so that no electromagnetic
energy enters the system through the input port—for example, by having the interferometer
entrance aperture look at a chilled nonreflective surface. This means only detector-side
background radiation enters the system. To keep things simple, we first assume that all single-pass
s-type and p-type transmissions through the beam-splitter substrate and compensator plate
are equivalent, with every single-pass transmission characterized by complex constants having
the same magnitude $|\gamma|$. Now the $\gamma_{s,p}^{(abc)}$ terms correspond to $\gamma^3$ and the $\gamma_{s,p}^{(uv)}$ terms correspond
to $\gamma^2$ so that

$$|\gamma_{s,p}^{(abc)}|^2 \to |\gamma|^6 \quad\text{and}\quad |\gamma_{s,p}^{(uv)}|^2 \to |\gamma|^4 . \qquad (4.151a)$$

This lets us assume that only negligible amounts of optical power are lost passing through the
substrate and compensator plate by saying that $|\gamma|$ is approximately equal to one; similarly, we
say that only negligible amounts of optical power are lost by reflection off the fixed and moving
mirrors by saying that $|r|$ is approximately equal to one. These assumptions can be written as

$$|\gamma(\sigma)| \cong 1 \quad\text{and}\quad |r(\sigma)| \cong 1 . \qquad (4.151b)$$

In addition, the moving mirror is taken to be in perfect alignment with $\vec{\Delta} = 0$ so that [see Eq.
(4.137j)]

$$\frac{1}{A}\,\Pi_A(0) = \frac{1}{A}\,\Pi_A(0)^* = 1 . \qquad (4.151c)$$

To keep our thought experiment simple, we force the background radiation to be x-polarized and
confined to a very narrow solid angle $\Delta\Omega_{\mathrm{back}}$, so that

$$\mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \cong 0 \qquad (4.151d)$$
and
$$\mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \cong \Delta\Omega_{\mathrm{back}}\,\delta(\vec{\varepsilon})\,\mathcal{L}_x^{(\mathrm{back})}(\sigma) = \Delta\Omega_{\mathrm{back}}\,\delta(\varepsilon_x)\,\delta(\varepsilon_y)\,\mathcal{L}_x^{(\mathrm{back})}(\sigma) . \qquad (4.151e)$$

The double-sided power function $\mathcal{L}_x^{(\mathrm{back})}(\sigma)$ for $-\infty < \sigma < \infty$ has units of optical power per unit
cross-sectional area per unit solid angle per unit $\sigma$ interval. We note that the delta function $\delta(\vec{\varepsilon})$
is explained at the end of Sec. 2.25 of Chapter 2. It has units of inverse steradians, as can be seen
from the identity

$$\iint_{-\infty}^{\infty}d^2\varepsilon\,\delta(\vec{\varepsilon}) = 1 ,$$

showing that the delta function when integrated over $d^2\varepsilon$ (that is, integrated over a solid angle
containing $\vec{\varepsilon} = 0$) always produces the dimensionless number one. In Eq. (4.151e), we drop the
dependence of the background optical power $\mathcal{L}_x^{(\mathrm{back})}$ on $\vec{\varepsilon} = (\varepsilon_x, \varepsilon_y)$, using $\delta(\varepsilon_x)$ and $\delta(\varepsilon_y)$ to
show that only the contribution from the on-axis direction is significant. Just like $\mathcal{L}(\sigma)$ in Eq.
(4.136f), function $\mathcal{L}_x^{(\mathrm{back})}(\sigma)$ must be even,

$$\mathcal{L}_x^{(\mathrm{back})}(-\sigma) = \mathcal{L}_x^{(\mathrm{back})}(\sigma) . \qquad (4.151f)$$

Although it is highly unlikely that an actual interferometer would have this sort of idealized x-
polarized background radiance, we can always arrange for an existing system to have this sort of
contaminating background without changing the properties of the interferometer’s beam splitter.
Substitution of Eqs. (4.151a)–(4.151e) into (4.150) gives

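Average energy in unbalanced background signal over time 2T and beam cross section A

$$= 2TA\,\Delta\Omega_{\mathrm{back}}\int_{-\infty}^{\infty}\mathcal{L}_x^{(\mathrm{back})}(\sigma)\Big\{\left[|r_s(\sigma)|^4 + |t_s(\sigma)|^4\right] + e^{-2\pi i\sigma\chi}\,r_s(\sigma)^{*2}\,t_s(\sigma)^2 + e^{2\pi i\sigma\chi}\,r_s(\sigma)^2\,t_s(\sigma)^{*2}\Big\}\,d\sigma . \qquad (4.152)$$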

We next consider what happens to the balanced, instead of the unbalanced, detector-side
background signal. Equation (4.135d), which specifies the energy in the balanced input signal,
can be adapted to describe the balanced background signal, but to do this we must analyze how
the balanced background signal differs from the balanced input signal. We note that rs and rp in
(4.135d) refer to an initial reflection of the beam coming from the input port that is off the front
side of the interferometer, as shown in Fig. 4.16, whereas the balanced background signal must,
as shown in Fig. 4.25, have its initial reflection off the back side of the beam splitter. Tracing the
balanced background rays through the interferometer, we see that, compared to the balanced input
rays, front-side beam-splitter reflections are replaced by back-side beam-splitter reflections and
back-side beam-splitter reflections are replaced by front-side beam-splitter reflections. We also
note that rays pass through the compensator plate and beam-splitter substrate a different number
of times, but this does not matter because we take $|\gamma(\sigma)| \cong 1$ in our idealized interferometer. From
the discussion following Eq. (4.83) above, we know that if the front-side reflection coefficients
are $r_s$ and $r_p$, then the back-side reflection coefficients are $Wr_s$ and $Wr_p$. This means that to
convert the balanced input-signal derivation to the balanced background-signal derivation, we
need to convert all the $r_s$ and $r_p$ variables to $Wr_s$ and $Wr_p$ whenever $r_s$ and $r_p$ refer to front-side
reflection coefficients. What about those times when $r_s$ and $r_p$ are already part of $Wr_s$ and $Wr_p$
products referring to back-side reflection coefficients? To handle this situation, we note that when
$W = -1$, making the original back-side reflection coefficients $(-r_s)$ and $(-r_p)$, then $W^2r_s$ and $W^2r_p$
return to us the front-side coefficients $r_s$ and $r_p$; and when $W = 1$, the back-side and front-side
coefficients are always equal and can be multiplied by as many powers of $W$ as we please.
Therefore, if $Wr_s$ and $Wr_p$ refer to back-side reflection coefficients in the balanced input-signal
derivation, then $W^2r_s$ and $W^2r_p$ automatically convert the terms to the desired front-side reflection
coefficients. This shows that replacing the $r_s$ and $r_p$ variables everywhere by $Wr_s$ and $Wr_p$
converts all front-side reflection terms to back-side reflection terms and all back-side reflection
terms to front-side reflection terms. Hence, Eq. (4.135d) can be used to calculate the energy in the
balanced background signal if $r_s$ and $r_p$ are replaced everywhere by $Wr_s$ and $Wr_p$ (and, of course,
$\mathcal{L}_x$ and $\mathcal{L}_y$ are replaced by $\mathcal{L}_x^{(\mathrm{back})}$ and $\mathcal{L}_y^{(\mathrm{back})}$). The only values $W$ can have are $+1$ or $-1$, so as
always $W^2 = 1$. Looking at Eq. (4.135d), we see that $r_s$ and $r_p$ only enter the formula as
$|r_s|^2$ and $|r_p|^2$, so replacing $r_s$ and $r_p$ by $Wr_s$ and $Wr_p$ does not change the equation. Therefore, all
that needs to be done to adapt (4.135d) to the balanced background signal using the
approximations in (4.151d) is to set $|\gamma_{s,p}^{(abc)}(\sigma)| = |\gamma(\sigma)| = |r(\sigma)| = 1$ and to replace $\mathcal{L}_x$ and $\mathcal{L}_y$ by
$\mathcal{L}_x^{(\mathrm{back})}$ and $\mathcal{L}_y^{(\mathrm{back})}$, which gives us

Average energy in balanced background over time 2T and beam cross section A

$$\begin{aligned}
= {}& 4TA\int_{-\infty}^{\infty}d\sigma\iint_{-\infty}^{\infty}d^2\varepsilon\,\Big\{\Big[|r_s(\sigma)|^2\,|t_s(\sigma)|^2\,\mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) + |r_p(\sigma)|^2\,|t_p(\sigma)|^2\,\mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)\Big]\Big\} \\
& + 2TW\int_{-\infty}^{\infty}d\sigma\iint_{-\infty}^{\infty}d^2\varepsilon\,\Big\{\Big[|r_s(\sigma)|^2\,|t_s(\sigma)|^2\,\mathcal{L}_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) + |r_p(\sigma)|^2\,|t_p(\sigma)|^2\,\mathcal{L}_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)\Big] \\
& \qquad\cdot\Big[\Pi_A(\sigma\vec{\Delta})\,e^{-2\pi i\sigma\chi\sqrt{1 - \varepsilon^2}} + \Pi_A(\sigma\vec{\Delta})^*\,e^{2\pi i\sigma\chi\sqrt{1 - \varepsilon^2}}\Big]\Big\} .
\end{aligned} \qquad (4.153)$$

Substitution of the remaining idealizations and approximations in (4.151c)–(4.151e) into Eq.
(4.153) gives

Average energy in balanced background over time 2T and beam cross section A

$$= 2TA\,\Delta\Omega_{\mathrm{back}}\int_{-\infty}^{\infty}|r_s(\sigma)|^2\,|t_s(\sigma)|^2\,\mathcal{L}_x^{(\mathrm{back})}(\sigma)\left[2 + We^{-2\pi i\sigma\chi} + We^{2\pi i\sigma\chi}\right]d\sigma . \qquad (4.154)$$

We now consider formulas (4.152) and (4.154) for the unbalanced and balanced background
energy. Although the background radiance while passing through the interferometer may have
some of its energy absorbed, by conservation of energy there is no way for its energy to
increase. Consequently, the sum of (4.152) and (4.154) must be less than or equal to the total
x-polarized energy produced by the radiant background in time 2T for a beam of cross-sectional
area A and solid angle $\Delta\Omega_{\mathrm{back}}$,

$$2TA\,\Delta\Omega_{\mathrm{back}}\int_{-\infty}^{\infty}\mathcal{L}_x^{(\mathrm{back})}(\sigma)\,d\sigma . \qquad (4.155a)$$

Since $\mathcal{L}_x^{(\mathrm{back})}$ is even [see Eq. (4.151f)], the total background energy entering the interferometer
can also be written as

$$4TA\,\Delta\Omega_{\mathrm{back}}\int_{0}^{\infty}\mathcal{L}_x^{(\mathrm{back})}(\sigma)\,d\sigma \qquad (4.155b)$$

following the rule given in Eq. (2.19) of Chapter 2.
We now add together the balanced and unbalanced energy (that is, the total energy) leaving
the interferometer. The sum of the right-hand sides of (4.152) and (4.154) gives

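Total background energy over time 2T and beam cross section A

$$\begin{aligned}
= 2TA\,\Delta\Omega_{\mathrm{back}}\int_{-\infty}^{\infty}\mathcal{L}_x^{(\mathrm{back})}(\sigma)\Big\{
& \left[|r_s(\sigma)|^4 + |t_s(\sigma)|^4\right] + r_s(\sigma)^{*2}\,t_s(\sigma)^2\,e^{-2\pi i\sigma\chi} \\
& + r_s(\sigma)^2\,t_s(\sigma)^{*2}\,e^{2\pi i\sigma\chi} + 2\,|r_s(\sigma)|^2\,|t_s(\sigma)|^2 + W\,|r_s(\sigma)|^2\,|t_s(\sigma)|^2\,e^{-2\pi i\sigma\chi} \\
& + W\,|r_s(\sigma)|^2\,|t_s(\sigma)|^2\,e^{2\pi i\sigma\chi}\Big\}\,d\sigma .
\end{aligned} \qquad (4.156a)$$

We represent the complex scalars $r_s$ and $t_s$ by

$$r_s(\sigma) = |r_s(\sigma)|\,e^{i\theta_{rs}(\sigma)} \qquad (4.156b)$$
and
$$t_s(\sigma) = |t_s(\sigma)|\,e^{i\theta_{ts}(\sigma)} \qquad (4.156c)$$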

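for $\theta_{rs}(\sigma)$ and $\theta_{ts}(\sigma)$ defined to be real wavenumber-dependent angles representing the phases
of $r_s$ and $t_s$. Since $r_s(-\sigma) = r_s(\sigma)^*$ and $t_s(-\sigma) = t_s(\sigma)^*$ from Eqs. (4.92b) and (4.92d), we must
have
$$\theta_{rs}(-\sigma) = -\theta_{rs}(\sigma) \qquad (4.156d)$$
and
$$\theta_{ts}(-\sigma) = -\theta_{ts}(\sigma) \qquad (4.156e)$$

in (4.156b) and (4.156c), the defining equations for $\theta_{rs}(\sigma)$ and $\theta_{ts}(\sigma)$. Substitution of (4.156b)
and (4.156c) into (4.156a) gives

Total background energy over time 2T and beam cross section A

$$\begin{aligned}
= 2TA\,\Delta\Omega_{\mathrm{back}}\int_{-\infty}^{\infty}\mathcal{L}_x^{(\mathrm{back})}(\sigma)\Big\{
& \left[|r_s(\sigma)|^2 + |t_s(\sigma)|^2\right]^2 \\
& + |r_s(\sigma)|^2\,|t_s(\sigma)|^2\Big[e^{-2\pi i\sigma\chi}\,e^{2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + e^{2\pi i\sigma\chi}\,e^{-2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} \\
& \qquad + We^{-2\pi i\sigma\chi} + We^{2\pi i\sigma\chi}\Big]\Big\}\,d\sigma .
\end{aligned} \qquad (4.157a)$$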
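
The step from (4.156a) to (4.157a) is pure algebra on the integrand, so it can be spot-checked at a single wavenumber. The sketch below is only an illustration under assumed values: the function names are hypothetical, `phi` stands for $2\pi\sigma\chi$, and the sample magnitudes and phases of $r_s$ and $t_s$ are arbitrary. It also evaluates the equivalent cosine form obtained from $e^{i\phi} = \cos\phi + i\sin\phi$.

```python
import cmath
import math

def bracket_4_156a(r_s, t_s, W, phi):
    """Curly-bracket integrand of (4.156a); phi stands for 2*pi*sigma*chi."""
    rr, tt = abs(r_s) ** 2, abs(t_s) ** 2
    return (rr ** 2 + tt ** 2
            + r_s.conjugate() ** 2 * t_s ** 2 * cmath.exp(-1j * phi)
            + r_s ** 2 * t_s.conjugate() ** 2 * cmath.exp(1j * phi)
            + 2.0 * rr * tt
            + W * rr * tt * cmath.exp(-1j * phi)
            + W * rr * tt * cmath.exp(1j * phi))

def bracket_4_157a(r_s, t_s, W, phi):
    """Curly-bracket integrand of (4.157a), written with magnitudes and phases."""
    rr, tt = abs(r_s) ** 2, abs(t_s) ** 2
    d_theta = cmath.phase(t_s) - cmath.phase(r_s)  # theta_ts - theta_rs
    return ((rr + tt) ** 2
            + rr * tt * (cmath.exp(-1j * phi) * cmath.exp(2j * d_theta)
                         + cmath.exp(1j * phi) * cmath.exp(-2j * d_theta)
                         + W * cmath.exp(-1j * phi)
                         + W * cmath.exp(1j * phi)))

def bracket_cosine_form(r_s, t_s, W, phi):
    """Same bracket with the exponentials combined into cosines (Euler's formula)."""
    rr, tt = abs(r_s) ** 2, abs(t_s) ** 2
    d_theta = cmath.phase(t_s) - cmath.phase(r_s)
    return (rr + tt) ** 2 + 2.0 * rr * tt * (math.cos(phi - 2.0 * d_theta)
                                              + W * math.cos(phi))

# Arbitrary sample values (assumptions, not taken from the text).
r_s = cmath.rect(0.60, 0.3)
t_s = cmath.rect(0.75, 1.1)
for W in (1, -1):
    a = bracket_4_156a(r_s, t_s, W, 0.7)
    b = bracket_4_157a(r_s, t_s, W, 0.7)
    print(W, a, b, bracket_cosine_form(r_s, t_s, W, 0.7))
```

For any choice of these inputs the three expressions should agree and the imaginary parts should vanish to rounding error, consistent with the bracket representing a real energy density.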
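Equations (4.139b) and (4.139c) show that $|r_s(\sigma)|^2$ and $|t_s(\sigma)|^2$ are even functions of $\sigma$, as is
$\mathcal{L}_x^{(\mathrm{back})}$ according to (4.151f). The term

$$\left[e^{-2\pi i\sigma\chi}\,e^{2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + e^{2\pi i\sigma\chi}\,e^{-2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + We^{-2\pi i\sigma\chi} + We^{2\pi i\sigma\chi}\right]$$

inside the integral is also even with respect to $\sigma$, because by (4.156d) and (4.156e)

$$\begin{aligned}
& \left[e^{-2\pi i(-\sigma)\chi}\,e^{2i(\theta_{ts}(-\sigma) - \theta_{rs}(-\sigma))} + e^{2\pi i(-\sigma)\chi}\,e^{-2i(\theta_{ts}(-\sigma) - \theta_{rs}(-\sigma))} + We^{-2\pi i(-\sigma)\chi} + We^{2\pi i(-\sigma)\chi}\right] \\
& \quad = \left[e^{-2\pi i\sigma\chi}\,e^{2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + e^{2\pi i\sigma\chi}\,e^{-2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + We^{-2\pi i\sigma\chi} + We^{2\pi i\sigma\chi}\right] .
\end{aligned}$$

This means (4.157a) is an integral of an even expression between $-\infty$ and $+\infty$, so by rule Eq. (2.19) in
Chapter 2 it can also be written as

Total background energy over time 2T and beam cross section A

$$\begin{aligned}
= 4TA\,\Delta\Omega_{\mathrm{back}}\int_{0}^{\infty}\mathcal{L}_x^{(\mathrm{back})}(\sigma)\Big\{
& \left[|r_s(\sigma)|^2 + |t_s(\sigma)|^2\right]^2 \\
& + |r_s(\sigma)|^2\,|t_s(\sigma)|^2\Big[e^{-2\pi i\sigma\chi}\,e^{2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + e^{2\pi i\sigma\chi}\,e^{-2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} \\
& \qquad + We^{-2\pi i\sigma\chi} + We^{2\pi i\sigma\chi}\Big]\Big\}\,d\sigma .
\end{aligned} \qquad (4.157b)$$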
We know from formula (4.116a) in Sec. 4.14 above that

$$\frac{\varepsilon_o}{\sigma^2}\cdot\mathrm{E}\!\left(|\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)|^2\right)$$

is the average input energy, per unit wavenumber interval and per unit solid angle, that is entering
the interferometer during a time 2T and is carried by the x-polarized radiation field traveling in
the ẑ direction at wavenumber $\sigma$. We note that the $z$ in the argument list of $\tilde{\mathcal{E}}_{xTA}$ can be
disregarded because, as is mentioned at the end of Appendix 4C, the value of $|\tilde{\mathcal{E}}_{xTA}|^2$ does not
depend on $z$. Using the approximations introduced for $\gamma_{s,p}^{(abc)}(\sigma)$ in (4.151a) above, we know from
the analysis in Appendix 4E that the effect of one transmission through the beam splitter is to
replace the monochromatic plane wavefield specified by $\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)$ with the monochromatic
plane wavefield specified by $\gamma\,t_s\,\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)$. Hence, the average energy, per unit wavenumber
interval per unit solid angle, that passes through the beam splitter during a time 2T and is carried
by the x-polarized radiation field traveling in the ẑ direction at wavenumber $\sigma$ is

$$\begin{aligned}
\frac{\varepsilon_o}{\sigma^2}\cdot\mathrm{E}\!\left(|\gamma(\sigma)\,t_s(\sigma)\,\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)|^2\right) &= |\gamma(\sigma)|^2\,|t_s(\sigma)|^2\,\frac{\varepsilon_o}{\sigma^2}\,\mathrm{E}\!\left(|\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)|^2\right) \\
&\cong |t_s(\sigma)|^2\,\frac{\varepsilon_o}{\sigma^2}\,\mathrm{E}\!\left(|\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)|^2\right) ,
\end{aligned}$$

where in the last step (4.151b) is used to drop $|\gamma|^2$ from the formula. This result shows why $|t_s(\sigma)|^2$
is called the power transmission coefficient for x-polarized radiation. Using similar reasoning, the
effect on the plane waves of one reflection from the beam splitter is to replace $\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)$ by
$\gamma^2\,r_s\,\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)$. Hence, the formula for the average energy, per unit wavenumber interval and
per unit solid angle, that is carried by the x-polarized radiation reflected off the beam splitter in
time 2T is

$$\begin{aligned}
\frac{\varepsilon_o}{\sigma^2}\cdot\mathrm{E}\!\left(|\gamma(\sigma)^2\,r_s(\sigma)\,\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)|^2\right) &= |\gamma(\sigma)|^4\,|r_s(\sigma)|^2\,\frac{\varepsilon_o}{\sigma^2}\,\mathrm{E}\!\left(|\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)|^2\right) \\
&\cong |r_s(\sigma)|^2\,\frac{\varepsilon_o}{\sigma^2}\,\mathrm{E}\!\left(|\tilde{\mathcal{E}}_{xTA}(0, z, \sigma)|^2\right) .
\end{aligned}$$

This shows why $|r_s(\sigma)|^2$ is called the power reflection coefficient for x-polarized radiation.
Although the beam-splitter substrate can absorb energy—a process now being neglected by

Energy Flux in the Unbalanced Radiation Fields · 4.17

taking γ about equal to one—a well-designed beam splitter has only negligible absorption in the
thin film where the partial transmission and reflection of the interferometer beam occurs. This
means, by conservation of energy, that

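\[
\begin{aligned}
\frac{\varepsilon_o}{\sigma^2}\cdot \mathrm{E}\!\left( |\tilde{E}_{xTA}(0,z,\sigma)|^2 \right)
&= |t_s(\sigma)|^2\, \frac{\varepsilon_o}{\sigma^2}\cdot \mathrm{E}\!\left( |\tilde{E}_{xTA}(0,z,\sigma)|^2 \right) + |r_s(\sigma)|^2\, \frac{\varepsilon_o}{\sigma^2}\cdot \mathrm{E}\!\left( |\tilde{E}_{xTA}(0,z,\sigma)|^2 \right) \\
&= \left( |t_s(\sigma)|^2 + |r_s(\sigma)|^2 \right) \cdot \frac{\varepsilon_o}{\sigma^2}\cdot \mathrm{E}\!\left( |\tilde{E}_{xTA}(0,z,\sigma)|^2 \right)
\end{aligned}
\]
or
\[
|t_s(\sigma)|^2 + |r_s(\sigma)|^2 = 1 . \tag{4.157c}
\]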

Substitution of this conclusion back into (4.157b) gives

Total background energy over time 2T and beam cross section A

\[
= 4TA\,\Delta\Omega_{\mathrm{back}} \int_0^{\infty} \mathcal{L}_x^{(\mathrm{back})}(\sigma) \Big[ 1 + 2 |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big\{ \cos[2\pi\sigma\chi - 2(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))] + W\cos(2\pi\sigma\chi) \big\} \Big]\, d\sigma ,
\tag{4.157d}
\]

where $e^{i\phi} = \cos\phi + i\sin\phi$ is used to reduce the complex exponentials to a sum of cosines. For an ideal beam splitter

\[
|r_s(\sigma)|^2 = |t_s(\sigma)|^2 = 1/2 ,
\]

so $2|r_s(\sigma)|^2 |t_s(\sigma)|^2$ must also be about equal to 1/2 for a well-designed, nonideal beam splitter; it obviously cannot be a small term. We now compare (4.157d) to formula (4.155b) for the total energy produced by the radiant background. Unless the term inside the braces { } in (4.157d) is identically zero for all values of σ, we can always construct an x-polarized background spectrum $\mathcal{L}_x^{(\mathrm{back})}$ that, for certain values of χ, specifies more energy leaving the interferometer in the balanced and unbalanced background signal than entered the interferometer in (4.155b). Therefore, the term inside the braces { } must be identically zero for all non-negative values of σ, which means that

cos[2πσχ − 2 (θts (σ ) − θ rs (σ ) )] + W cos(2πσχ ) = 0 . (4.158a)


To make this happen, we require $\theta_{ts}(\sigma) - \theta_{rs}(\sigma) = \pm\pi/2$ for $W = 1$ and $\theta_{ts}(\sigma) - \theta_{rs}(\sigma) = 0$ for $W = -1$. Of course, multiples of π can be added to these values because that does not change the value of the cosine in (4.157d). We can specify both these conditions by the constraint

\[
e^{\pm 2i[\theta_{ts}(\sigma) - \theta_{rs}(\sigma)]} = -W . \tag{4.158b}
\]

By Eqs. (4.156d) and (4.156e), this constraint holds true for all negative values of σ if it holds true for all non-negative values of σ, since

\[
e^{\pm 2i[\theta_{ts}(-\sigma) - \theta_{rs}(-\sigma)]} = e^{\mp 2i[\theta_{ts}(\sigma) - \theta_{rs}(\sigma)]} .
\]
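The constraint (4.158b) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names `brace_term` and `constraint_holds` and the sample values of σ and χ are hypothetical choices made only for illustration. It evaluates the brace term of (4.157d) for the two allowed phase differences and for one phase difference that violates the constraint.

```python
import cmath
import math

def brace_term(sigma, chi, dtheta, W):
    """Brace term of (4.157d): cos[2*pi*sigma*chi - 2*dtheta] + W*cos(2*pi*sigma*chi).

    dtheta stands for theta_ts(sigma) - theta_rs(sigma), taken constant here.
    """
    phase = 2.0 * math.pi * sigma * chi
    return math.cos(phase - 2.0 * dtheta) + W * math.cos(phase)

def constraint_holds(dtheta, W, tol=1e-12):
    """True when exp(+2i*dtheta) and exp(-2i*dtheta) both equal -W, as in (4.158b)."""
    return all(abs(cmath.exp(sign * 2j * dtheta) + W) < tol for sign in (+1, -1))

# Hypothetical sample points (sigma in cm^-1, chi in cm), for illustration only.
samples = [(s * 37.5, c * 1e-3) for s in range(1, 6) for c in range(-5, 6)]

worst_w_plus = max(abs(brace_term(s, c, math.pi / 2, +1)) for s, c in samples)
worst_w_minus = max(abs(brace_term(s, c, 0.0, -1)) for s, c in samples)
worst_violating = max(abs(brace_term(s, c, 0.3, +1)) for s, c in samples)

print(worst_w_plus, worst_w_minus, worst_violating)
```

With the allowed phase differences the brace term vanishes to rounding error at every sample point; with the violating value it does not, which is the situation the energy argument above rules out.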

When this constraint is substituted back into Eq. (4.157b), the right-hand side reduces to

Total background energy over time 2T and beam cross section A

\[
= 4TA\,\Delta\Omega_{\mathrm{back}} \int_0^{\infty} d\sigma\, \mathcal{L}_x^{(\mathrm{back})}(\sigma) \left[ |r_s(\sigma)|^2 + |t_s(\sigma)|^2 \right]^2 ,
\]

which, by substituting in (4.157c), is shown to be the same as the expression for the background
radiant energy given in (4.155b).
We have just seen that the background radiant energy is conserved—for x-polarized
background radiation. Clearly, nothing stops us from now making the background energy y-
polarized and repeating the analysis. If we return to Eq. (4.150), now specifying that
\[
L_x^{(\mathrm{back})}(\vec\varepsilon, \sigma) \cong 0 \tag{4.159a}
\]
and
\[
L_y^{(\mathrm{back})}(\vec\varepsilon, \sigma) \cong \Delta\Omega_{\mathrm{back}}\, \delta(\vec\varepsilon)\, \mathcal{L}_y^{(\mathrm{back})}(\sigma) = \Delta\Omega_{\mathrm{back}}\, \delta(\varepsilon_x)\delta(\varepsilon_y)\, \mathcal{L}_y^{(\mathrm{back})}(\sigma) , \tag{4.159b}
\]

everything will proceed as before because all the properties used to get to (4.158b) for $r_s$ and $t_s$ also hold true for $r_p$ and $t_p$. Having switched from x polarization to y polarization, we define
\[
r_p(\sigma) = |r_p(\sigma)|\, e^{i\theta_{rp}(\sigma)} \tag{4.160a}
\]
and
\[
t_p(\sigma) = |t_p(\sigma)|\, e^{i\theta_{tp}(\sigma)} \tag{4.160b}
\]

for $\theta_{rp}(\sigma)$ and $\theta_{tp}(\sigma)$ real parameters representing the phase of $r_p$ and $t_p$ as functions of σ. Again, these functions must be odd:

\[
\theta_{rp}(-\sigma) = -\theta_{rp}(\sigma) \tag{4.160c}
\]
and
\[
\theta_{tp}(-\sigma) = -\theta_{tp}(\sigma) . \tag{4.160d}
\]

We can show that $|t_p(\sigma)|^2$ is the power transmission coefficient through the beam splitter for y-polarized waves and that $|r_p(\sigma)|^2$ is the power reflection coefficient through the beam splitter for y-polarized waves so that
\[
|t_p(\sigma)|^2 + |r_p(\sigma)|^2 = 1 . \tag{4.160e}
\]

As before, this leads to the final conclusion that

\[
e^{\pm 2i[\theta_{tp}(\sigma) - \theta_{rp}(\sigma)]} = -W \tag{4.160f}
\]

for all positive and negative values of σ, allowing us to conserve energy for the y-polarized background radiation passing through the interferometer.
These results certainly hold for the beam-splitter transmission and reflection coefficients ts, tp,
rs, and rp in our thought experiment on an ideal interferometer, but what about the ts, tp, rs, and rp
coefficients of a nonideal interferometer? The idealizations made at the start of this analysis in
Eqs. (4.151a)–(4.151c) are standard ways of improving the performance of Michelson
interferometers—decreasing substrate absorption, improving mirror reflectivity, and correctly
aligning the moving mirror—and in that sense are physically possible modifications that can be
made to the interferometer without changing the ts, tp, rs, and rp of the partially transmitting,
partially reflecting beam-splitter film. Similarly, we can imagine using polarizing filters and
beam collimators to create an x-polarized or y-polarized radiance field that is severely direction-
chopped, and then switching the interferometer’s entrance and exit ports to create “background”
radiation of the type specified in Eqs. (4.151d), (4.151e), (4.159a), and (4.159b). This is also a
procedure that does not affect the ts, tp, rs, and rp beam-splitter coefficients. Hence, our analysis
strongly suggests that the constraints on ts, tp, rs, and rp in Eqs. (4.158b) and (4.160f), which are
derived from these idealizations, can be confidently applied to the nonideal system of Eq. (4.150).
Concluding that this is in fact the case, we substitute (4.158b) and (4.160f) into (4.150) to get


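Average energy in unbalanced background signal over time 2T and beam cross section A

\[
\begin{aligned}
= {} & 2TA \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \left[ |r_s(\sigma)|^4 + |t_s(\sigma)|^4 \right] |\gamma_s^{(uv)}(\sigma)|^2 L_x^{(\mathrm{back})}(\vec\varepsilon, \sigma) \\
& \qquad + \left[ |r_p(\sigma)|^4 + |t_p(\sigma)|^4 \right] |\gamma_p^{(uv)}(\sigma)|^2 L_y^{(\mathrm{back})}(\vec\varepsilon, \sigma) \Big\} \\
& - 2WTA \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 |\gamma_s^{(uv)}(\sigma)|^2 L_x^{(\mathrm{back})}(\vec\varepsilon, \sigma) \\
& \qquad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 |\gamma_p^{(uv)}(\sigma)|^2 L_y^{(\mathrm{back})}(\vec\varepsilon, \sigma) \Big] \cdot \Big[ \tfrac{1}{A} \Pi_A(\sigma\vec\Delta)\, e^{-2\pi i\sigma\chi\sqrt{1-\varepsilon^2}} \Big] \Big\} \\
& - 2WTA \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 |\gamma_s^{(uv)}(\sigma)|^2 L_x^{(\mathrm{back})}(\vec\varepsilon, \sigma) \\
& \qquad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 |\gamma_p^{(uv)}(\sigma)|^2 L_y^{(\mathrm{back})}(\vec\varepsilon, \sigma) \Big] \cdot \Big[ \tfrac{1}{A} \Pi_A(\sigma\vec\Delta)^{*}\, e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}} \Big] \Big\} .
\end{aligned}
\tag{4.161a}
\]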

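Dividing both sides by 2T to get an expression for $P_{unb}^{(\mathrm{back})}(\chi)$, the average power in the unbalanced background signal for a beam of cross-sectional area A at an OPD value of χ, gives

\[
\begin{aligned}
P_{unb}^{(\mathrm{back})}(\chi) = A \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ & \left[ |r_s(\sigma)|^4 + |t_s(\sigma)|^4 \right] |\gamma_s^{(uv)}(\sigma)|^2 L_x^{(\mathrm{back})}(\vec\varepsilon, \sigma) \\
& + \left[ |r_p(\sigma)|^4 + |t_p(\sigma)|^4 \right] |\gamma_p^{(uv)}(\sigma)|^2 L_y^{(\mathrm{back})}(\vec\varepsilon, \sigma) \\
& - 2\frac{W}{A} \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 |\gamma_s^{(uv)}(\sigma)|^2 L_x^{(\mathrm{back})}(\vec\varepsilon, \sigma) \\
& \qquad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 |\gamma_p^{(uv)}(\sigma)|^2 L_y^{(\mathrm{back})}(\vec\varepsilon, \sigma) \Big] \cdot \mathrm{Re}\Big[ \Pi_A(\sigma\vec\Delta)\, e^{-2\pi i\sigma\chi\cos\alpha_\varepsilon} \Big] \Big\} ,
\end{aligned}
\tag{4.161b}
\]

where $\cos\alpha_\varepsilon = \sqrt{1-\varepsilon^2}$ has the same meaning as in Eqs. (4.135f) above (it is the cosine of the angle the propagation vector $\hat\Omega = \vec\varepsilon + \hat z\sqrt{1-\varepsilon^2}$ makes with the $\hat z$ axis of the unfolded interferometer). This integral clearly gives a real value for $P_{unb}^{(\mathrm{back})}$, as it should because $P_{unb}^{(\mathrm{back})}$ is a real quantity.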


4.18 Simplified Formulas Describing Unbalanced Background Radiation


There is usually no reason to treat the background radiation as anything other than unpolarized or
as having anything other than the same background spectrum everywhere inside the field of view.
Once again, we follow the pattern of the balanced derivation [see Eq. (4.136e)] and define

\[
L_x^{(\mathrm{back})}(\vec\varepsilon, \sigma) = L_y^{(\mathrm{back})}(\vec\varepsilon, \sigma) = \frac{1}{2}\, \mathcal{L}^{(\mathrm{back})}(\sigma) . \tag{4.162a}
\]

Here, $\mathcal{L}^{(\mathrm{back})}(\sigma)$ is the total background optical power per unit cross-sectional area of the beam per unit solid angle per unit σ interval. Just like $\mathcal{L}(\sigma)$ in (4.136f), $\mathcal{L}^{(\mathrm{back})}$ is a double-sided power spectrum, making it an even function of σ:

\[
\mathcal{L}^{(\mathrm{back})}(-\sigma) = \mathcal{L}^{(\mathrm{back})}(\sigma) . \tag{4.162b}
\]

When there is negligible absorption in the partially transmitting and partially reflecting beam-splitter film, Eqs. (4.157c) and (4.160e) require that
\[
\left[ |t_s(\sigma)|^2 + |r_s(\sigma)|^2 \right]^2 = 1
\]
and
\[
\left[ |t_p(\sigma)|^2 + |r_p(\sigma)|^2 \right]^2 = 1 .
\]

We can use these equations to write

\[
|t_s(\sigma)|^4 + |r_s(\sigma)|^4 = 1 - 2 |t_s(\sigma)|^2 |r_s(\sigma)|^2
\]
and
\[
|t_p(\sigma)|^4 + |r_p(\sigma)|^4 = 1 - 2 |t_p(\sigma)|^2 |r_p(\sigma)|^2 ,
\]

which can be rearranged to get

\[
|t_s(\sigma)|^4 + |r_s(\sigma)|^4 + |t_p(\sigma)|^4 + |r_p(\sigma)|^4 = 2 - 2\left[ |t_s(\sigma)|^2 |r_s(\sigma)|^2 + |t_p(\sigma)|^2 |r_p(\sigma)|^2 \right] . \tag{4.162c}
\]
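As a quick sanity check on (4.162c), the following sketch evaluates both sides for a few lossless beam splitters. It assumes only (4.157c) and (4.160e); the function name `fourth_power_sum` and the sample power transmission values are hypothetical.

```python
def fourth_power_sum(T_s, T_p):
    """Return (lhs, rhs) of (4.162c) for power transmission coefficients T_s = |t_s|^2
    and T_p = |t_p|^2, with the reflection coefficients fixed by |t|^2 + |r|^2 = 1."""
    R_s, R_p = 1.0 - T_s, 1.0 - T_p
    lhs = T_s**2 + R_s**2 + T_p**2 + R_p**2          # |t_s|^4 + |r_s|^4 + |t_p|^4 + |r_p|^4
    rhs = 2.0 - 2.0 * (T_s * R_s + T_p * R_p)        # right-hand side of (4.162c)
    return lhs, rhs

# Hypothetical sample coefficients; (0.5, 0.5) is the ideal beam splitter.
checks = [fourth_power_sum(T_s, T_p) for T_s, T_p in [(0.5, 0.5), (0.45, 0.55), (0.3, 0.62)]]
for lhs, rhs in checks:
    print(lhs, rhs)
```

For the ideal beam splitter both sides equal 1, which is the smallest value the sum of fourth powers can take.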

The idealizations introduced in Eq. (4.151a) let Eq. (4.136i) be approximated as

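\[
\eta(\sigma) = 2 |r(\sigma)|^2 |\gamma(\sigma)|^6 \left[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \right] , \tag{4.162d}
\]

which can be substituted into (4.162c) to get

\[
|t_s(\sigma)|^4 + |r_s(\sigma)|^4 + |t_p(\sigma)|^4 + |r_p(\sigma)|^4 = 2 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^6 |r(\sigma)|^2} . \tag{4.162e}
\]

Applying the idealizations in (4.151a) to (4.161b), and then substituting from Eqs. (4.162a), (4.162d), and (4.162e) gives

\[
\begin{aligned}
P_{unb}^{(\mathrm{back})}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, \mathcal{L}^{(\mathrm{back})}(\sigma) \Big\{ & 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \\
& - \frac{W}{A} \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \mathrm{Re}\Big[ \Pi_A(\sigma\vec\Delta)\, e^{-2\pi i\sigma\chi\cos\alpha_\varepsilon} \Big] \Big\} .
\end{aligned}
\tag{4.162f}
\]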
The next idealization is to give the background-radiance beam a circular cross section. From the work done in the balanced derivation [see Eq. (4.137e)], the formula for $\frac{1}{A}\Pi_A$ is then

\[
\frac{1}{A} \Pi_A(\sigma\vec\Delta) \Big|_{\text{circle of radius } R} = \frac{J_1(4\pi R |\sigma| \theta_{ma})}{2\pi R |\sigma| \theta_{ma}} ,
\]

where $\theta_{ma}$ is the angle (in radians) between the surface normal vectors of the correctly aligned and misaligned moving-mirror positions. From Eq. (4.137g), we know that

\[
\frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}}
\]

has the same value at −σ as it has at +σ, so we can discard the absolute value signs and write

\[
\frac{1}{A} \Pi_A(\sigma\vec\Delta) \Big|_{\text{circle of radius } R} = \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} .
\]

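The $J_1$ Bessel function is always real when it has a real argument, so $\frac{1}{A}\Pi_A$ must be real for a circular cross section. This means that when this last expression is substituted into (4.162f), we get

\[
\begin{aligned}
P_{unb}^{(\mathrm{back})}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, \mathcal{L}^{(\mathrm{back})}(\sigma) \Big\{ & 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \\
& - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \left[ \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} \right] \cdot \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \Big\} .
\end{aligned}
\tag{4.163a}
\]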

Equation (4.163a) corresponds to (4.137i) in the balanced derivation. Assuming the effective field of view for the background radiance is sufficiently narrow that $\cos\alpha_\varepsilon \cong 1$, we can write (4.163a) as

\[
\begin{aligned}
P_{unb}^{(\mathrm{back})}(\chi) = \frac{A\,\Delta\Omega}{2} \int_{-\infty}^{\infty} \mathcal{L}^{(\mathrm{back})}(\sigma) \Big\{ & 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \\
& - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \left[ \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} \right] \cdot \cos(2\pi\sigma\chi) \Big\}\, d\sigma .
\end{aligned}
\tag{4.163b}
\]

This result corresponds to Eq. (4.138a). Again, ΔΩ represents the value of the integral over $d^2\varepsilon$. This makes ΔΩ the solid angle of the interferometer’s effective field of view for the unbalanced background signal, which should be, as pointed out in the discussion after Eq. (4.150), about the same size as the interferometer’s input field of view. We note that the entire product

\[
\mathcal{L}^{(\mathrm{back})}(\sigma) \Big\{ 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \left[ \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} \right] \cdot \cos(2\pi\sigma\chi) \Big\}
\]

is an even function of σ if $\mathcal{L}^{(\mathrm{back})}$, $|r|^2$, $|\gamma|^2$, η, $\cos(2\pi\sigma\chi)$, and

\[
\frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}}
\]

are all even functions of σ. The cosine is always an even function, and Eq. (4.162b) shows that $\mathcal{L}^{(\mathrm{back})}$ is even. The analysis following Eq. (4.138b) shows that $|r|^2$ and η are also even functions of σ, and Eq. (4.137g) shows that $(2\pi R \sigma \theta_{ma})^{-1} J_1(4\pi R \sigma \theta_{ma})$ is even. As for $|\gamma|^2$, the only uncertainty left, we know that it must be an even function of σ because, according to (4.151a), it comes from idealized approximations for $|\gamma_{s,p}^{(abc)}|^2$ and $|\gamma_{s,p}^{(uv)}|^2$ that are themselves, as shown in Eqs. (4.139f) and (4.145d), even functions of σ. Hence Eq. (4.163b) can be written as [by applying formula (2.19) in Chapter 2]

\[
\begin{aligned}
P_{unb}^{(\mathrm{back})}(\chi) = \frac{A\,\Delta\Omega}{2} \int_{0}^{\infty} L^{(\mathrm{back})}(\sigma) \Big\{ & 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \\
& - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \left[ \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} \right] \cdot \cos(2\pi\sigma\chi) \Big\}\, d\sigma ,
\end{aligned}
\tag{4.163c}
\]

where we define

\[
L^{(\mathrm{back})}(\sigma) = 2\, \mathcal{L}^{(\mathrm{back})}(\sigma) \quad \text{for } \sigma \ge 0 . \tag{4.163d}
\]

We recognize from the discussion preceding Eq. (4.136d) that $L^{(\mathrm{back})}$ can be thought of as the spectral radiance of the background radiation causing the unbalanced background signal. Equation (4.163c), then, corresponds to Eq. (4.140a) in the balanced derivation.
Our next idealization is to assume the interferometer is well aligned so that $\theta_{ma} \cong 0$. Application of Eq. (4.137k), which states that

\[
\lim_{\theta_{ma} \to 0} \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} = 1 ,
\]

to (4.163c) gives

\[
P_{unb}^{(\mathrm{back})}(\chi) = \frac{1}{2} \int_{0}^{\infty} S^{(\mathrm{back})}(\sigma) \Big\{ 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \cos(2\pi\sigma\chi) \Big\}\, d\sigma ,
\tag{4.164a}
\]

where

\[
S^{(\mathrm{back})}(\sigma) = A\,\Delta\Omega\, L^{(\mathrm{back})}(\sigma) \tag{4.164b}
\]

is the total, single-sided optical power per unit wavenumber interval entering the detector-side of the interferometer as background radiation. This corresponds to Eq. (4.140b) in the balanced derivation.
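The limit quoted from Eq. (4.137k), and the evenness in σ used earlier, can be illustrated numerically. The sketch below is an illustration under stated assumptions rather than the book's procedure: `bessel_j1` is a hand-written power series (the Python standard library has no Bessel functions), `misalignment_factor` is a hypothetical name, and the radius, wavenumber, and tilt values are made up.

```python
import math

def bessel_j1(x, terms=40):
    """Power series J1(x) = sum_k (-1)^k (x/2)^(2k+1) / (k! (k+1)!), adequate for modest |x|."""
    half = 0.5 * x
    term = half              # k = 0 term
    total = term
    for k in range(1, terms):
        term *= -(half * half) / (k * (k + 1))   # ratio of successive series terms
        total += term
    return total

def misalignment_factor(R, sigma, theta_ma):
    """J1(4*pi*R*sigma*theta_ma) / (2*pi*R*sigma*theta_ma), with its limiting value 1 at zero."""
    u = 2.0 * math.pi * R * sigma * theta_ma
    if u == 0.0:
        return 1.0
    return bessel_j1(2.0 * u) / u

# Hypothetical values: beam radius 1 cm, wavenumber 1000 cm^-1, tilt angles in radians.
nearly_aligned = misalignment_factor(1.0, 1000.0, 1e-6)
misaligned = misalignment_factor(1.0, 1000.0, 1e-4)
mirrored = misalignment_factor(1.0, -1000.0, 1e-4)
print(nearly_aligned, misaligned, mirrored)
```

For these made-up numbers a tilt of one microradian leaves the factor within about 2 parts in 10^5 of unity, while a tilt of 10^-4 radians already reduces the modulation term by roughly 18 percent.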

The final idealization is to assume that

\[
|\gamma(\sigma)| = |r(\sigma)| = \eta(\sigma) = 1
\]
so that
\[
P_{unb}^{(\mathrm{back})}(\chi) = \frac{1}{2} \int_{0}^{\infty} S^{(\mathrm{back})}(\sigma) \cdot \left[ 1 - W\cos(2\pi\sigma\chi) \right] d\sigma , \tag{4.164c}
\]

which matches Eq. (4.140d) in the balanced derivation. We can then adopt the same convention as most optical textbooks by setting $W = 1$ to get

\[
P_{unb}^{(\mathrm{back})}(\chi) = \frac{1}{2} \int_{0}^{\infty} S^{(\mathrm{back})}(\sigma) \cdot \left[ 1 - \cos(2\pi\sigma\chi) \right] d\sigma . \tag{4.164d}
\]

Separating out the signal component $I_{unb}^{(\mathrm{back})}(\chi)$, which changes with χ, gives

\[
I_{unb}^{(\mathrm{back})}(\chi) = -\frac{1}{2} \int_{0}^{\infty} S^{(\mathrm{back})}(\sigma) \cos(2\pi\sigma\chi)\, d\sigma , \tag{4.165a}
\]

corresponding to Eq. (4.141a) in the balanced derivation. Function $I_{unb}^{(\mathrm{back})}(\chi)$ is often called the unbalanced background interferogram. It is difficult to imagine a procedure for recording the balanced interferogram for the input optical signal that does not at the same time record the unbalanced background interferogram; fortunately, there are several well-known calibration methods discussed in Secs. 5.14 and 5.19 of Chapter 5 that can be used to measure and eliminate the unbalanced background interferogram from interferometer data.
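To make (4.164d) and (4.165a) concrete, here is a minimal numerical sketch. The Gaussian band used for the single-sided background spectrum, the names `S_back`, `P_unb`, `I_unb`, and the integration grid are all assumptions made for illustration; they are not taken from the text.

```python
import math

def trapezoid(f, lo, hi, n=4000):
    """Composite trapezoid rule for a real-valued integrand on [lo, hi]."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h

SIGMA0, WIDTH = 1000.0, 50.0                 # hypothetical band center and width, cm^-1
LO, HI = SIGMA0 - 6 * WIDTH, SIGMA0 + 6 * WIDTH

def S_back(sigma):
    """Hypothetical single-sided background spectrum (arbitrary power units per cm^-1)."""
    return math.exp(-((sigma - SIGMA0) / WIDTH) ** 2)

def P_unb(chi, W=1.0):
    """Unbalanced background power, Eq. (4.164c); W = 1 gives Eq. (4.164d)."""
    return 0.5 * trapezoid(lambda s: S_back(s) * (1.0 - W * math.cos(2 * math.pi * s * chi)), LO, HI)

def I_unb(chi):
    """Unbalanced background interferogram, Eq. (4.165a)."""
    return -0.5 * trapezoid(lambda s: S_back(s) * math.cos(2 * math.pi * s * chi), LO, HI)

half_total = 0.5 * trapezoid(S_back, LO, HI)   # the chi-independent part of P_unb
print(P_unb(0.0), P_unb(2.5e-3), half_total + I_unb(2.5e-3))
```

With W = 1 the unbalanced background power vanishes at zero OPD and is otherwise the constant half-total plus the interferogram, which is the separation described above.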
From (4.163d) and (4.164b), we have

\[
S^{(\mathrm{back})}(\sigma) = A\,\Delta\Omega\, L^{(\mathrm{back})}(\sigma) = 2 A\,\Delta\Omega\, \mathcal{L}^{(\mathrm{back})}(\sigma) \quad \text{for } \sigma \ge 0 . \tag{4.165b}
\]

Because $\mathcal{L}^{(\mathrm{back})}$ is an even function [see Eq. (4.162b)], we can easily extend the definition of $S^{(\mathrm{back})}$ to negative values of σ by saying that

\[
S^{(\mathrm{back})}(-\sigma) = S^{(\mathrm{back})}(\sigma) , \tag{4.165c}
\]

so that it becomes another even function of σ. Now the product

\[
S^{(\mathrm{back})}(\sigma) \cos(2\pi\sigma\chi)
\]

is an even function of σ and Eq. (2.19) of Chapter 2 can be used to write (4.165a) as

\[
I_{unb}^{(\mathrm{back})}(\chi) = -\frac{1}{4} \int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma) \cos(2\pi\sigma\chi)\, d\sigma . \tag{4.165d}
\]

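We also note that the product

\[
S^{(\mathrm{back})}(\sigma) \sin(2\pi\sigma\chi)
\]

is an odd function of σ. This means that

\[
\begin{aligned}
\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\, e^{\pm 2\pi i\sigma\chi}\, d\sigma
&= \int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma) \cos(2\pi\sigma\chi)\, d\sigma \pm i \int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma) \sin(2\pi\sigma\chi)\, d\sigma \\
&= \int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma) \cos(2\pi\sigma\chi)\, d\sigma
\end{aligned}
\]

because the integral between −∞ and +∞ of any odd function such as

\[
S^{(\mathrm{back})}(\sigma) \sin(2\pi\sigma\chi)
\]

must be zero [see Eq. (2.17) in Chapter 2]. This last result can be combined with Eq. (4.165d) to get

\[
I_{unb}^{(\mathrm{back})}(\chi) = -\frac{1}{4} \int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\, e^{\pm 2\pi i\sigma\chi}\, d\sigma , \tag{4.165e}
\]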

corresponding to Eq. (4.141d) in the balanced derivation. Equation (4.165e), just like (4.141d) for the balanced interferogram, shows that we can get the unbalanced background spectrum by taking the appropriate Fourier transform of $I_{unb}^{(\mathrm{back})}$. There are calibration procedures that can be used to isolate the unbalanced background interferogram, giving us access to the unbalanced background spectrum, but these measurements are usually of interest only to scientists and engineers trying to improve the performance of poorly working interferometers.
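The step from the single-sided cosine integral (4.165a) to the double-sided complex form (4.165e) can be checked with a short numerical sketch. The even spectrum `S_even`, the OPD value, and the grid below are hypothetical; the only property relied on is that the spectrum has been extended to negative σ as an even function, as in (4.165c).

```python
import cmath
import math

def S_even(sigma):
    """Hypothetical background spectrum extended to negative sigma as an even function."""
    return math.exp(-((abs(sigma) - 1000.0) / 50.0) ** 2)

def trapezoid(f, lo, hi, n):
    """Composite trapezoid rule; works for real or complex integrands."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h

chi = 2.5e-3          # hypothetical OPD in cm
L, N = 1300.0, 5000   # integration half-range (cm^-1) and intervals per half-range

full_plus = trapezoid(lambda s: S_even(s) * cmath.exp(+2j * math.pi * s * chi), -L, L, 2 * N)
full_minus = trapezoid(lambda s: S_even(s) * cmath.exp(-2j * math.pi * s * chi), -L, L, 2 * N)
half_cos = trapezoid(lambda s: S_even(s) * math.cos(2 * math.pi * s * chi), 0.0, L, N)

I_from_4165a = -0.5 * half_cos      # single-sided form, Eq. (4.165a)
I_from_4165e = -0.25 * full_plus    # double-sided complex form, Eq. (4.165e)
print(I_from_4165a, I_from_4165e, full_minus)
```

Either sign in the exponent gives the same real value because the sine contributions from +σ and −σ cancel.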

__________

This chapter starts with Maxwell’s equations and ends up with detailed formulas for the
balanced and unbalanced optical power leaving the exit port of a standard Michelson
interferometer. The formulas account for imperfect reflection off the interferometer’s end mirrors
as well as the reflection, transmission, and absorption characterizing nonideal beam splitters and
compensator plates. Along the way, we have learned how to characterize the optical beams
passing through interferometers as well as how to handle polarized input radiation, slightly
misaligned instruments, and an input spectrum that is nonuniform over the field of view. We have
also, and in the end perhaps most importantly, introduced the concept of spectral radiance to
describe the behavior of electromagnetic wavefields inside Michelson interferometers.


Appendix 4A
We define a complex vector $\vec a$ to be, for any three-dimensional Cartesian coordinate system having $\hat x, \hat y, \hat z$ unit vectors along the x, y, z Cartesian axes,
\[
\vec a = \hat x a_x + \hat y a_y + \hat z a_z , \tag{4A.1}
\]

where ax , a y , az are three complex scalars. Using the subscript r to denote a complex scalar’s real
part and the subscript i to denote the complex scalar’s imaginary part, we have

ax = arx + i aix , (4A.2a)

a y = ary + i aiy , (4A.2b)

az = arz + i aiz , (4A.2c)

for $i = \sqrt{-1}$. The $\hat x, \hat y, \hat z$ unit vectors themselves are taken to be real and can be written in column-vector notation as
\[
\hat x = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} , \quad
\hat y = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} , \quad
\hat z = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} ,
\]
which means the complex vector $\vec a$ can be written in column-vector notation as

\[
\vec a = \begin{pmatrix} a_{rx} + i a_{ix} \\ a_{ry} + i a_{iy} \\ a_{rz} + i a_{iz} \end{pmatrix} .
\]

Many of the standard three-dimensional formulas for real vectors can be extended to complex vectors without any difficulty. For example, we define the vector dot product of two complex vectors $\vec a$ and $\vec b$ to be
\[
\vec a \bullet \vec b = a_x b_x + a_y b_y + a_z b_z , \tag{4A.3a}
\]

where $a_x b_x$, $a_y b_y$, $a_z b_z$ are the complex products of two complex scalars. Applying (4A.3a) to the formulas for $\hat x, \hat y, \hat z$, and $\vec a$, we get

\[
a_x = \vec a \bullet \hat x = \hat x \bullet \vec a , \quad
a_y = \vec a \bullet \hat y = \hat y \bullet \vec a , \quad
a_z = \vec a \bullet \hat z = \hat z \bullet \vec a \tag{4A.3b}
\]

just like when $\vec a$ is a real vector. To make the length of a complex vector $\vec a$ a non-negative real number, we define
\[
|\vec a| = \sqrt{\vec a \bullet \vec a^{\,*}} \tag{4A.4}
\]

where $\vec a^{\,*}$, the complex conjugate of $\vec a$, is
\[
\vec a^{\,*} = \hat x a_x^{*} + \hat y a_y^{*} + \hat z a_z^{*} \tag{4A.5}
\]
or in column-vector notation
\[
\vec a^{\,*} = \begin{pmatrix} a_{rx} - i a_{ix} \\ a_{ry} - i a_{iy} \\ a_{rz} - i a_{iz} \end{pmatrix} .
\]
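The difference between the unconjugated dot product (4A.3a) and the length definition (4A.4) is easy to see with numbers. The sketch below represents a complex vector as a tuple of Python complex numbers; the helper names `dot`, `conj`, `length` and the sample vector are hypothetical.

```python
import math

def dot(a, b):
    """Dot product of Eq. (4A.3a): no complex conjugation is applied."""
    return sum(x * y for x, y in zip(a, b))

def conj(a):
    """Complex-conjugate vector of Eq. (4A.5)."""
    return tuple(x.conjugate() for x in a)

def length(a):
    """Length of Eq. (4A.4): the square root of a . a*, a non-negative real number."""
    return math.sqrt(dot(a, conj(a)).real)

a = (1 + 2j, 0 - 1j, 3 + 0j)   # hypothetical complex vector

self_dot = dot(a, a)           # complex in general, so not usable as a squared length
norm = length(a)
print(self_dot, norm)
```

For this vector the unconjugated product of the vector with itself is 5 + 4i, while (4A.4) gives the real length √15.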

The formula for the vector cross product of two complex three-dimensional vectors $\vec a$ and $\vec b$ is also identical to the formula for the cross product of two real three-dimensional vectors,
\[
\vec a \times \vec b = -\vec b \times \vec a = \hat x (a_y b_z - a_z b_y) + \hat y (a_z b_x - a_x b_z) + \hat z (a_x b_y - a_y b_x) . \tag{4A.6}
\]

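The well-known operations of vector calculus on real three-dimensional vector fields can also be extended to fields of complex three-dimensional vectors. We define the $\vec\nabla$ operator in the usual way,
\[
\vec\nabla = \hat x \frac{\partial}{\partial x} + \hat y \frac{\partial}{\partial y} + \hat z \frac{\partial}{\partial z} , \tag{4A.7a}
\]

so that for any complex scalar field α we have

\[
\begin{aligned}
\vec\nabla\alpha &= \hat x \frac{\partial\alpha}{\partial x} + \hat y \frac{\partial\alpha}{\partial y} + \hat z \frac{\partial\alpha}{\partial z} \\
&= \hat x \left( \frac{\partial\alpha_r}{\partial x} + i \frac{\partial\alpha_i}{\partial x} \right) + \hat y \left( \frac{\partial\alpha_r}{\partial y} + i \frac{\partial\alpha_i}{\partial y} \right) + \hat z \left( \frac{\partial\alpha_r}{\partial z} + i \frac{\partial\alpha_i}{\partial z} \right) ,
\end{aligned}
\tag{4A.7b}
\]

where $\alpha = \alpha_r + i\alpha_i$ for $\alpha_r$ the real part of α and $\alpha_i$ the imaginary part of α. We know for any real three-dimensional vector field $\vec\rho = \hat x \rho_x + \hat y \rho_y + \hat z \rho_z$ that

\[
\vec\nabla \bullet \vec\rho = \frac{\partial\rho_x}{\partial x} + \frac{\partial\rho_y}{\partial y} + \frac{\partial\rho_z}{\partial z} . \tag{4A.8a}
\]

For any complex vector field $\vec a = \hat x a_x + \hat y a_y + \hat z a_z$ we now define

\[
\vec\nabla \bullet \vec a = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z}
= \frac{\partial a_{rx}}{\partial x} + \frac{\partial a_{ry}}{\partial y} + \frac{\partial a_{rz}}{\partial z} + i \left[ \frac{\partial a_{ix}}{\partial x} + \frac{\partial a_{iy}}{\partial y} + \frac{\partial a_{iz}}{\partial z} \right] . \tag{4A.8b}
\]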
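Indeed, we can regard any complex vector field $\vec a$ as the complex sum of two real vector fields
\[
\vec a = \vec a_r + i\,\vec a_i , \tag{4A.9a}
\]

where the vector field’s real component $\vec a_r$ is the real vector
\[
\vec a_r = \hat x a_{rx} + \hat y a_{ry} + \hat z a_{rz} = \begin{pmatrix} a_{rx} \\ a_{ry} \\ a_{rz} \end{pmatrix} , \tag{4A.9b}
\]
and the vector field’s imaginary component $\vec a_i$ is the real vector
\[
\vec a_i = \hat x a_{ix} + \hat y a_{iy} + \hat z a_{iz} = \begin{pmatrix} a_{ix} \\ a_{iy} \\ a_{iz} \end{pmatrix} . \tag{4A.9c}
\]
Now we can treat i like any other constant scalar to write
\[
\vec\nabla \bullet \vec a = \vec\nabla \bullet (\vec a_r + i\,\vec a_i) = \vec\nabla \bullet \vec a_r + i\,\vec\nabla \bullet \vec a_i . \tag{4A.10a}
\]
Equation (4A.10a) is the same as (4A.8b) and can be used instead of (4A.8b) to define $\vec\nabla \bullet \vec a$ for a complex vector field $\vec a$ in terms of the already-understood $\vec\nabla \bullet$ operation applied to the real vector fields $\vec a_r$ and $\vec a_i$. We know that the curl $\vec\nabla \times \vec\rho$ of any real, three-dimensional vector field $\vec\rho = \hat x \rho_x + \hat y \rho_y + \hat z \rho_z$ is

\[
\vec\nabla \times \vec\rho = \hat x \left( \frac{\partial\rho_z}{\partial y} - \frac{\partial\rho_y}{\partial z} \right) + \hat y \left( \frac{\partial\rho_x}{\partial z} - \frac{\partial\rho_z}{\partial x} \right) + \hat z \left( \frac{\partial\rho_y}{\partial x} - \frac{\partial\rho_x}{\partial y} \right) .
\]

Now for the curl of any complex vector field $\vec a$ we can write

\[
\vec\nabla \times \vec a = \vec\nabla \times (\vec a_r + i\,\vec a_i) = \vec\nabla \times \vec a_r + i\,(\vec\nabla \times \vec a_i) , \tag{4A.10b}
\]
which defines $\vec\nabla \times \vec a$ in terms of the curls $\vec\nabla \times \vec a_r$ and $\vec\nabla \times \vec a_i$ of two real vector fields $\vec a_r$ and $\vec a_i$.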
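We know that $\nabla^2\alpha_r$ for any real scalar field $\alpha_r$ is

\[
\nabla^2\alpha_r = \frac{\partial^2\alpha_r}{\partial x^2} + \frac{\partial^2\alpha_r}{\partial y^2} + \frac{\partial^2\alpha_r}{\partial z^2} , \tag{4A.11a}
\]

so that $\nabla^2\alpha$ for any complex scalar field $\alpha = \alpha_r + i\alpha_i$ becomes

\[
\nabla^2\alpha = \nabla^2\alpha_r + i\,\nabla^2\alpha_i = \left( \frac{\partial^2\alpha_r}{\partial x^2} + \frac{\partial^2\alpha_r}{\partial y^2} + \frac{\partial^2\alpha_r}{\partial z^2} \right) + i \left( \frac{\partial^2\alpha_i}{\partial x^2} + \frac{\partial^2\alpha_i}{\partial y^2} + \frac{\partial^2\alpha_i}{\partial z^2} \right) . \tag{4A.11b}
\]

The standard definition of $\nabla^2\vec\rho$ for any real vector $\vec\rho = \hat x \rho_x + \hat y \rho_y + \hat z \rho_z$ is

\[
\nabla^2\vec\rho = \hat x \nabla^2\rho_x + \hat y \nabla^2\rho_y + \hat z \nabla^2\rho_z . \tag{4A.11c}
\]

For any complex vector field $\vec a$ we say that
\[
\nabla^2\vec a = \hat x \nabla^2 a_x + \hat y \nabla^2 a_y + \hat z \nabla^2 a_z \tag{4A.11d}
\]

for $a_x$, $a_y$, $a_z$, the three complex scalar fields that are the x, y, z components of the complex vector field $\vec a$. Equations (4A.11a), (4A.11b) and (4A.11d) when taken together define $\nabla^2\vec a$ for any complex vector field $\vec a$. Note that we can also use
\[
\nabla^2\vec a = \nabla^2\vec a_r + i\,\nabla^2\vec a_i \tag{4A.11e}
\]

to define $\nabla^2\vec a$ in terms of $\nabla^2$ applied to the real vector fields $\vec a_r$ and $\vec a_i$.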
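If we have a constant complex vector $\vec u$ multiplied by a complex scalar field α, then
\[
\vec u = \hat x u_x + \hat y u_y + \hat z u_z \quad \text{and} \quad \alpha\vec u = \hat x (\alpha u_x) + \hat y (\alpha u_y) + \hat z (\alpha u_z) ,
\]

where $u_x$, $u_y$, $u_z$ are constant complex scalars and α is a complex scalar function of position. From (4A.11d) we have

\[
\begin{aligned}
\nabla^2(\alpha\vec u) &= \hat x \nabla^2(\alpha u_x) + \hat y \nabla^2(\alpha u_y) + \hat z \nabla^2(\alpha u_z) \\
&= \hat x u_x \nabla^2\alpha + \hat y u_y \nabla^2\alpha + \hat z u_z \nabla^2\alpha = \vec u\, \nabla^2\alpha .
\end{aligned}
\tag{4A.12a}
\]

Another useful identity involving a constant complex vector $\vec u$ multiplied by a complex scalar field α comes from using Eq. (4A.8b) to simplify $\vec\nabla \bullet (\alpha\vec u)$,

\[
\begin{aligned}
\vec\nabla \bullet (\alpha\vec u) &= \frac{\partial}{\partial x}(\alpha u_x) + \frac{\partial}{\partial y}(\alpha u_y) + \frac{\partial}{\partial z}(\alpha u_z) \\
&= u_x \frac{\partial\alpha}{\partial x} + u_y \frac{\partial\alpha}{\partial y} + u_z \frac{\partial\alpha}{\partial z} = \vec u \bullet (\vec\nabla\alpha) .
\end{aligned}
\tag{4A.12b}
\]

Here we have used Eqs. (4A.3a) and (4A.7b) in the last step of (4A.12b). We also note that
\[
\begin{aligned}
\vec\nabla \times (\alpha\vec u)
&= \hat x \left( \frac{\partial(\alpha u_z)}{\partial y} - \frac{\partial(\alpha u_y)}{\partial z} \right) + \hat y \left( \frac{\partial(\alpha u_x)}{\partial z} - \frac{\partial(\alpha u_z)}{\partial x} \right) + \hat z \left( \frac{\partial(\alpha u_y)}{\partial x} - \frac{\partial(\alpha u_x)}{\partial y} \right) \\
&= \hat x \left( u_z \frac{\partial\alpha}{\partial y} - u_y \frac{\partial\alpha}{\partial z} \right) + \hat y \left( u_x \frac{\partial\alpha}{\partial z} - u_z \frac{\partial\alpha}{\partial x} \right) + \hat z \left( u_y \frac{\partial\alpha}{\partial x} - u_x \frac{\partial\alpha}{\partial y} \right) \\
&= -\vec u \times (\vec\nabla\alpha) .
\end{aligned}
\tag{4A.12c}
\]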
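We define a complex vector $\vec a = \vec a_r + i\,\vec a_i$ to be orthogonal to a real vector $\vec\rho$ when

\[
\vec\rho \bullet \vec a = \vec a \bullet \vec\rho = \vec\rho \bullet \vec a_r + i\,\vec\rho \bullet \vec a_i = 0 . \tag{4A.13}
\]

In (4A.13), both the real and imaginary components of the dot product, $\vec\rho \bullet \vec a_r$ and $\vec\rho \bullet \vec a_i$ respectively, must be zero. Equation (4A.13) requires that both $\vec a_r$ and $\vec a_i$ be perpendicular to $\vec\rho$ in the standard sense of real three-dimensional vectors. Another vector identity that holds true for two real vectors $\vec\rho_a$, $\vec\rho_b$, and a complex vector $\vec a = \vec a_r + i\,\vec a_i$ is

\[
\vec\rho_a \times (\vec\rho_b \times \vec a) = (\vec\rho_a \bullet \vec a)\,\vec\rho_b - (\vec\rho_a \bullet \vec\rho_b)\,\vec a . \tag{4A.14}
\]

To justify (4A.14), we note that because

\[
\vec\rho_1 \times (\vec\rho_2 \times \vec\rho_3) = (\vec\rho_1 \bullet \vec\rho_3)\,\vec\rho_2 - (\vec\rho_1 \bullet \vec\rho_2)\,\vec\rho_3
\]

holds true for any real vectors $\vec\rho_1$, $\vec\rho_2$, $\vec\rho_3$, it follows that
\[
\begin{aligned}
\vec\rho_a \times (\vec\rho_b \times \vec a) &= \vec\rho_a \times (\vec\rho_b \times \vec a_r) + i\,[\vec\rho_a \times (\vec\rho_b \times \vec a_i)] \\
&= (\vec\rho_a \bullet \vec a_r)\,\vec\rho_b - (\vec\rho_a \bullet \vec\rho_b)\,\vec a_r + i\,[(\vec\rho_a \bullet \vec a_i)\,\vec\rho_b - (\vec\rho_a \bullet \vec\rho_b)\,\vec a_i] \\
&= [\vec\rho_a \bullet (\vec a_r + i\,\vec a_i)]\,\vec\rho_b - (\vec\rho_a \bullet \vec\rho_b)(\vec a_r + i\,\vec a_i) = (\vec\rho_a \bullet \vec a)\,\vec\rho_b - (\vec\rho_a \bullet \vec\rho_b)\,\vec a .
\end{aligned}
\]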
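Identity (4A.14) can be spot-checked numerically. The following sketch uses plain tuples for the vectors; the helper names and the sample vectors are hypothetical and chosen only for illustration.

```python
def cross(a, b):
    """Cross product of Eq. (4A.6); components may be real or complex."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Unconjugated dot product of Eq. (4A.3a)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def scale(s, v):
    """Multiply vector v by the scalar s."""
    return tuple(s * x for x in v)

def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(x - y for x, y in zip(u, v))

rho_a = (1.0, -2.0, 0.5)            # real vector (hypothetical values)
rho_b = (0.3, 4.0, -1.0)            # real vector (hypothetical values)
a = (2 + 1j, 0 - 1j, 0.5 - 3j)      # complex vector (hypothetical values)

lhs = cross(rho_a, cross(rho_b, a))
rhs = sub(scale(dot(rho_a, a), rho_b), scale(dot(rho_a, rho_b), a))
residual_4a14 = max(abs(x - y) for x, y in zip(lhs, rhs))
print(residual_4a14)
```

The residual is at the level of floating-point rounding, as expected for an identity that holds separately for the real and imaginary parts.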

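Another useful formula comes from simplifying

\[
(\vec\rho \times \vec a) \bullet (\vec\rho \times \vec a^{\,*})
\]
when $\vec\rho$ is a real three-dimensional vector, $\vec a = \vec a_r + i\,\vec a_i$ is a complex three-dimensional vector, and $\vec\rho \bullet \vec a = 0$. Because $\vec\rho \bullet \vec a = 0$, we have [see Eq. (4A.13)]
\[
\vec\rho \bullet \vec a_r + i\,\vec\rho \bullet \vec a_i = 0 \quad \text{or} \quad \vec\rho \bullet \vec a_r = \vec\rho \bullet \vec a_i = 0 .
\]

It follows that
\[
\begin{aligned}
(\vec\rho \times \vec a) \bullet (\vec\rho \times \vec a^{\,*}) &= [\vec\rho \times (\vec a_r + i\,\vec a_i)] \bullet [\vec\rho \times (\vec a_r - i\,\vec a_i)] \\
&= [(\vec\rho \times \vec a_r) + i\,(\vec\rho \times \vec a_i)] \bullet [(\vec\rho \times \vec a_r) - i\,(\vec\rho \times \vec a_i)] \\
&= (\vec\rho \times \vec a_r) \bullet (\vec\rho \times \vec a_r) + (\vec\rho \times \vec a_i) \bullet (\vec\rho \times \vec a_i) .
\end{aligned}
\tag{4A.15}
\]
Because $\vec\rho$ and $\vec a_r$ are real, and because $\vec\rho$ and $\vec a_r$ are orthogonal (remember that $\vec\rho \bullet \vec a_r = 0$), we know the length of $\vec\rho \times \vec a_r$ must be

\[
|\vec\rho \times \vec a_r| = |\vec\rho| \cdot |\vec a_r| \cdot \sin\theta = |\vec\rho| \cdot |\vec a_r| ,
\]

where $|\vec\rho|$ is the length of $\vec\rho$, $|\vec a_r|$ is the length of $\vec a_r$, and the angle θ between $\vec\rho$ and $\vec a_r$ must be $\pi/2$. Because the dot product of a real vector with itself gives the square of its length, we conclude that
\[
(\vec\rho \times \vec a_r) \bullet (\vec\rho \times \vec a_r) = |\vec\rho|^2 |\vec a_r|^2 = (\vec\rho \bullet \vec\rho)(\vec a_r \bullet \vec a_r) .
\]

Using similar reasoning, we find that

\[
(\vec\rho \times \vec a_i) \bullet (\vec\rho \times \vec a_i) = |\vec\rho|^2 |\vec a_i|^2 = (\vec\rho \bullet \vec\rho)(\vec a_i \bullet \vec a_i) .
\]

Hence, Eq. (4A.15) can be written as

\[
\begin{aligned}
(\vec\rho \times \vec a) \bullet (\vec\rho \times \vec a^{\,*}) &= (\vec\rho \bullet \vec\rho)[(\vec a_r \bullet \vec a_r) + (\vec a_i \bullet \vec a_i)] \\
&= (\vec\rho \bullet \vec\rho)[(\vec a_r + i\,\vec a_i) \bullet (\vec a_r - i\,\vec a_i)] = (\vec\rho \bullet \vec\rho)(\vec a \bullet \vec a^{\,*}) ,
\end{aligned}
\tag{4A.16}
\]

which is the formula we are looking for.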
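A small worked example shows (4A.16) in action and why the orthogonality condition matters. The vectors below are hypothetical; `a_perp` satisfies the condition that its dot product with `rho` is zero, while `a_not_perp` does not.

```python
def cross(a, b):
    """Cross product of Eq. (4A.6); components may be real or complex."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Unconjugated dot product of Eq. (4A.3a)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def conj(a):
    """Complex-conjugate vector of Eq. (4A.5)."""
    return tuple(complex(x).conjugate() for x in a)

def both_sides(rho, a):
    """Return (lhs, rhs) of Eq. (4A.16) for a real vector rho and a complex vector a."""
    lhs = dot(cross(rho, a), cross(rho, conj(a)))
    rhs = dot(rho, rho) * dot(a, conj(a))
    return lhs, rhs

rho = (0.0, 0.0, 2.0)
a_perp = (1 + 2j, 3 - 1j, 0j)        # rho . a = 0, so (4A.16) applies
a_not_perp = (1 + 0j, 0j, 1j)        # rho . a = 2i, so (4A.16) need not hold

lhs_ok, rhs_ok = both_sides(rho, a_perp)
lhs_bad, rhs_bad = both_sides(rho, a_not_perp)
print(lhs_ok, rhs_ok, lhs_bad, rhs_bad)
```

For the orthogonal case both sides equal 60; for the non-orthogonal case the left side is 4 and the right side is 8.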


For any complex scalar α = α r + iα i , we can use the notation

Re(α ) = α r (4A.17a)
and
Im(α ) = α i (4A.17b)

to specify the real and imaginary parts of α. Similarly, for any complex vector $\vec a = \vec a_r + i\,\vec a_i$, we can use the notation
\[
\mathrm{Re}(\vec a) = \vec a_r \tag{4A.17c}
\]
and
\[
\mathrm{Im}(\vec a) = \vec a_i \tag{4A.17d}
\]

to specify the real and imaginary parts of $\vec a$. We define $L_R$ to be a linear operator that, when operating on a real three-dimensional scalar or vector field, creates another scalar or vector field that is also real. We call $L_R$ a real linear operator. When operating on a complex quantity, a real linear operator $L_R$ can return either a real or complex quantity; but when operating on a real quantity, a real linear operator must return another real quantity. Because $L_R$ is linear, we know that
\[
L_R(\alpha\vec a + \beta\vec b) = \alpha L_R(\vec a) + \beta L_R(\vec b) \tag{4A.18}
\]

for any two real or complex constant scalars α, β and any two real or complex vector fields $\vec a$, $\vec b$. When dealing with scalar fields we need only remove all the vector signs from the linear-operator formula in Eq. (4A.18). We note that the $\vec\nabla \times$, $\vec\nabla \bullet$, and $\partial/\partial t$ operators in Maxwell’s equations are all real linear operators, as are the $\partial^2/\partial t^2$ and $\nabla^2$ operators created by manipulation of Maxwell’s equations.
Many times we have to find real vector fields $\vec a_R$ and $\vec b_R$ that satisfy equations of the form

\[
L_1(\vec a_R) + L_2(\vec b_R) = 0 \tag{4A.19a}
\]
\[
L_3(\vec a_R) + L_4(\vec b_R) = 0 \tag{4A.19b}
\]
\[
\vdots
\]
etc.

for real linear operators $L_1, L_2, L_3, L_4, \ldots$ . It is often easier to find two complex vector-field solutions $\vec a$ and $\vec b$ such that
\[
L_1(\vec a) + L_2(\vec b) = 0 \tag{4A.19c}
\]
\[
L_3(\vec a) + L_4(\vec b) = 0 \tag{4A.19d}
\]
\[
\vdots
\]
etc.

than it is to find real vector fields $\vec a_R$ and $\vec b_R$ satisfying (4A.19a) and (4A.19b). For any real linear operator $L_R$ acting on a complex vector field $\vec c = \vec c_r + i\,\vec c_i$, with $\vec c_r$ and $\vec c_i$ the real and imaginary parts of $\vec c$, we have
\[
L_R(\vec c) = L_R(\vec c_r + i\,\vec c_i) = L_R(\vec c_r) + i\, L_R(\vec c_i) .
\]

Both $L_R(\vec c_r)$ and $L_R(\vec c_i)$ must be real because they represent real linear operators acting on real vector fields $\vec c_r$ and $\vec c_i$. Hence,
\[
\mathrm{Re}\big( L_R(\vec c) \big) = L_R(\vec c_r) = L_R\big( \mathrm{Re}(\vec c) \big) \tag{4A.20a}
\]
and
\[
\mathrm{Im}\big( L_R(\vec c) \big) = L_R(\vec c_i) = L_R\big( \mathrm{Im}(\vec c) \big) . \tag{4A.20b}
\]

Although Re and Im are not themselves true linear operators, we do know that for any two complex vector fields $\vec u$ and $\vec v$
\[
\mathrm{Re}(\vec u + \vec v) = \mathrm{Re}(\vec u) + \mathrm{Re}(\vec v) \tag{4A.21a}
\]
and
\[
\mathrm{Im}(\vec u + \vec v) = \mathrm{Im}(\vec u) + \mathrm{Im}(\vec v) . \tag{4A.21b}
\]

We can thus take the real and imaginary parts of (4A.19c) and (4A.19d), using (4A.21a) and (4A.21b) to get
\[
\mathrm{Re}[L_1(\vec a)] + \mathrm{Re}[L_2(\vec b)] = 0
\]
\[
\mathrm{Re}[L_3(\vec a)] + \mathrm{Re}[L_4(\vec b)] = 0
\]
\[
\vdots
\]
etc.

and
\[
\mathrm{Im}[L_1(\vec a)] + \mathrm{Im}[L_2(\vec b)] = 0
\]
\[
\mathrm{Im}[L_3(\vec a)] + \mathrm{Im}[L_4(\vec b)] = 0
\]
\[
\vdots
\]
etc.

Equations (4A.20a) and (4A.20b) now give

\[
L_1\big( \mathrm{Re}(\vec a) \big) + L_2\big( \mathrm{Re}(\vec b) \big) = 0 \tag{4A.22a}
\]
\[
L_3\big( \mathrm{Re}(\vec a) \big) + L_4\big( \mathrm{Re}(\vec b) \big) = 0 \tag{4A.22b}
\]
\[
\vdots
\]
etc.
and
\[
L_1\big( \mathrm{Im}(\vec a) \big) + L_2\big( \mathrm{Im}(\vec b) \big) = 0 \tag{4A.22c}
\]
\[
L_3\big( \mathrm{Im}(\vec a) \big) + L_4\big( \mathrm{Im}(\vec b) \big) = 0 \tag{4A.22d}
\]
\[
\vdots
\]
etc.

Equations (4A.22a)–(4A.22d) show that both $\mathrm{Re}(\vec a), \mathrm{Re}(\vec b)$ and $\mathrm{Im}(\vec a), \mathrm{Im}(\vec b)$ are pairs of real $\vec a_R$, $\vec b_R$ fields that satisfy Eqs. (4A.19a) and (4A.19b). We can thus solve sets of equations based on real linear operators by allowing the proposed solutions to be complex vector fields, finding formulas for these complex vector fields, and then, at the very end of the process, taking either the real or imaginary part of the complex solutions to get the desired real solutions. When following this procedure, it is customary to take the real rather than the imaginary parts of the complex solutions to get the desired real solutions.
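The commutation of Re with a real linear operator, Eq. (4A.20a), can be illustrated with a scalar example. In the sketch below the real linear operator is a central-difference approximation to d/dt, applied to a complex exponential; the operator `ddt`, the frequency, and the sample time are hypothetical choices.

```python
import cmath
import math

def ddt(f, t, h=1e-5):
    """Central-difference estimate of df/dt: a real linear operator, since it only
    forms real-weighted combinations of samples of f."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

omega = 3.0                      # hypothetical angular frequency
t0 = 0.7                         # hypothetical sample time

def complex_solution(t):
    """Complex trial function exp(i*omega*t)."""
    return cmath.exp(1j * omega * t)

def real_solution(t):
    """Its real part, cos(omega*t)."""
    return math.cos(omega * t)

re_of_operator = ddt(complex_solution, t0).real   # Re(L_R(c))
operator_of_re = ddt(real_solution, t0)           # L_R(Re(c))
exact = -omega * math.sin(omega * t0)             # analytic derivative of cos(omega*t)
print(re_of_operator, operator_of_re, exact)
```

Operating first and taking the real part afterwards gives the same number as taking the real part first, which is why the complex form can be carried through a calculation and the real part taken only at the end.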


Appendix 4B
We must be careful when approximating the phase terms of interferometer equations because
phase changes can be significant while still being very small compared to the largest term in the
phase expression. Consider, for example, the expressions

Scmplx = eiA⋅(1+δ ) (4B.1a)


and
S real = cos ( A ⋅ (1 + δ ) ) . (4B.1b)

What are the constraints on the size of δ such that both

Scmplx ≅ eiA and Sreal ≅ cos( A)

are good approximations of Eqs. (4B.1a) and (4B.1b)? At first glance, we might say that if
δ << 1 , then the contribution of δ to the phase expression A ⋅ (1 + δ ) can be neglected because no
matter what the size of A the fractional error in the phase from neglecting the presence of δ is

A ⋅ (1 + δ ) − A
= δ << 1 .
A

Note, however, that when A is very large we can write

A = 2 Nπ + a

for some positive (or negative) integer N and a non-negative real variable a < 2π . Because
A ⋅ (1 + δ ) is a phase, Eqs. (4B.1a) and (4B.1b) can be written as

Scmplx = ei ( A+ Aδ ) = ei (2 Nπ + a + Aδ ) = ei ( a + Aδ )
and
S real = cos( A + Aδ ) = cos(2 N π + a + Aδ ) = cos(a + Aδ ) .

Now it looks like what matters is that $A\delta$ be small compared to $a$. But all we are interested in is
the approximate value of $e^{i(a + A\delta)}$ or $\cos(a + A\delta)$. If $A$ is about equal to $2N\pi$ so that $a = A - 2N\pi$
is very small, making it about the same size or even smaller than the small value of $A\delta$, then
$A\delta$ can still be neglected as long as we can say

\[ S_{\mathrm{cmplx}} = e^{iA(1+\delta)} \cong e^{i2N\pi} \]

and

\[ S_{\mathrm{real}} = \cos\bigl(A(1+\delta)\bigr) \cong \cos(2N\pi) . \]

For this reason, we adopt as our rule for neglecting $\delta$ that the change in phase $A\delta$ must be small
compared to the change in phase producing an $O(1)$ change in $\exp(iA(1+\delta))$ or $\cos(A(1+\delta))$.
This means $\delta$ must satisfy

\[ |A\delta| \ll \pi/4 \approx 1 \tag{4B.1c} \]

before we can say that

\[ e^{iA(1+\delta)} \cong e^{iA} \]

or

\[ \cos\bigl(A(1+\delta)\bigr) \cong \cos(A) . \]

Our rule of thumb, then, is to give both $A$ and $\delta$ their extreme allowed values, maximizing $A\delta$,
and after that to check to see whether the resulting maximum $A\delta$ value satisfies (4B.1c). If it
does, we can be sure that (4B.1c) is also satisfied for all the nonextreme $A\delta$ products, allowing us
to neglect $\delta$ in Eqs. (4B.1a) and (4B.1b).
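The rule in (4B.1c) can be illustrated numerically. The sketch below uses hypothetical values of $A$ and $\delta$ (not taken from the text); the helper name `dropped_delta_error` is likewise an illustration only. It holds $\delta$ fixed at a value far below 1 and shows that what decides the matter is the product $A\delta$:

```python
import cmath

def dropped_delta_error(A, delta):
    """Return |exp(iA(1+delta)) - exp(iA)|, the error made by neglecting delta."""
    return abs(cmath.exp(1j * A * (1.0 + delta)) - cmath.exp(1j * A))

# delta = 1e-6 in both cases; only the product A*delta changes.
error_small_product = dropped_delta_error(1.0e3, 1.0e-6)   # A*delta = 1e-3
error_large_product = dropped_delta_error(1.0e6, 1.0e-6)   # A*delta = 1

print(error_small_product, error_large_product)
```

With $A\delta = 10^{-3}$ the error is about $10^{-3}$, while with $A\delta = 1$ it is of order one even though the fractional change in the phase is still only $10^{-6}$.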
We start our analysis with terms such as

\[ e^{2\pi i \sigma\chi\,\hat{\Omega}\bullet\hat{z}} , \]

where $\hat{\Omega}$ and $\sigma$ are respectively the propagation vector and wavenumber of a monochromatic
plane wave and $\chi$ is the OPD value of an interferometer. The beam passing through an
interferometer is direction-chopped, which means that all the plane waves have propagation
vectors that are parallel to, or nearly parallel to, $\hat{z}$, so $\hat{\Omega}\bullet\hat{z} \cong 1$. Does this mean that

\[ e^{2\pi i \sigma\chi\,\hat{\Omega}\bullet\hat{z}} \cong e^{2\pi i \sigma\chi} \; ? \]

We now show why this approximation does not work. Following the notation developed in
Sec. 4.12 above, we take $\theta_b$ to be the angle between $\hat{\Omega}$ and the $\hat{z}$ axis. The types of
interferometers we are interested in have angles $\theta_b$ that are relatively small,

\[ 0 \le \theta_b \le \theta_{b\max} \le 10^{-2}\ \text{radians} , \tag{4B.2a} \]

and measure infrared spectra over a range of wavenumbers

\[ 0 < \sigma_{\min} \le \sigma \le \sigma_{\max} < \infty . \tag{4B.2b} \]

In a well-designed interferometer $\chi_{\max}$, the largest possible absolute value of the OPD, or optical
path difference, must satisfy the inequality

\[ \sigma_{\max}\chi_{\max} \le \theta_{b\max}^{-2} \tag{4B.2c} \]

for accurate spectral measurements to occur.72 As a general rule, interferometer designs attempt
to maximize the optical signal, which usually means making $\theta_{b\max}$ as large as possible.
Consequently, it makes sense to assume that

\[ \sigma_{\max}\chi_{\max} \cong \theta_{b\max}^{-2} . \tag{4B.2d} \]
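We know that

\[ \hat{\Omega}\bullet\hat{z} = \cos\theta_b \cong 1 - \frac{\theta_b^2}{2} + \frac{\theta_b^4}{24} - \cdots \tag{4B.3a} \]

because angle $\theta_b$ is small. Substituting this into the phase $2\pi\sigma\chi\,\hat{\Omega}\bullet\hat{z}$ gives

\[ 2\pi\sigma\chi\,\hat{\Omega}\bullet\hat{z} = 2\pi\sigma\chi\cos\theta_b \cong 2\pi\sigma\chi\left(1 - \frac{\theta_b^2}{2} + \frac{\theta_b^4}{24} - \cdots\right) . \]

Here, $2\pi\sigma\chi$ plays the role of $A$ [see discussion following Eq. (4B.1b)], and the terms $\theta_b^2/2$ and
$\theta_b^4/24$ play the role of $\delta$. We first take $\delta = \theta_b^4/24$ and note that the maximum expected value of
$A\delta$ is

\[ 2\pi\sigma_{\max}\chi_{\max}\,(\theta_{b\max}^4/24) \cong \frac{1}{4}\,\theta_{b\max}^2 , \]

where we have taken $\pi \cong 3$ and used (4B.2d). Inequality (4B.2a) then shows that

\[ A\delta \le \frac{1}{4}\cdot 10^{-4} , \]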

72
John Chamberlain, The Principles of Interferometric Spectroscopy (John Wiley and Sons, New York, 1979), pp.
220–222.


which, according to (4B.1c), is small enough to neglect. When, however, $\delta = \theta_b^2/2$, it follows
that $A\delta$ can be as large as

\[ 2\pi\sigma_{\max}\chi_{\max}\,(\theta_{b\max}^2/2) = \pi , \]

which is obviously too large to discard. Hence, we must approximate $\cos\theta_b$ by

\[ \cos\theta_b \cong 1 - \frac{\theta_b^2}{2} \tag{4B.3b} \]

when multiplied by $2\pi\sigma_{\max}\chi_{\max}$. We cannot take $\hat{\Omega}\bullet\hat{z} = \cos\theta_b \cong 1$ even though, according to
(4B.2a), the $O(\theta_b^2)$ term can be no larger than

\[ 5\cdot 10^{-5} \ll 1 . \]

We conclude that $e^{2\pi i\sigma\chi\,\hat{\Omega}\bullet\hat{z}}$ cannot be approximated as $e^{2\pi i\sigma\chi}$ unless we are prepared to put stricter
limits on $\sigma$, $\chi$, and $\theta_b$.
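The arithmetic behind this conclusion is short enough to check directly. The sketch below assumes the extreme values $\theta_{b\max} = 10^{-2}$ from (4B.2a) and $\sigma_{\max}\chi_{\max} = \theta_{b\max}^{-2}$ from (4B.2d); the variable names are illustrative only:

```python
import math

theta_b_max = 1.0e-2                     # extreme value allowed by (4B.2a)
sigma_chi_max = theta_b_max ** -2        # sigma_max * chi_max from (4B.2d)
A = 2.0 * math.pi * sigma_chi_max        # the large phase factor 2*pi*sigma*chi

quartic_phase = A * theta_b_max**4 / 24.0    # A*delta for delta = theta_b^4 / 24
quadratic_phase = A * theta_b_max**2 / 2.0   # A*delta for delta = theta_b^2 / 2

print(f"theta_b^4/24 term: {quartic_phase:.3e}")
print(f"theta_b^2/2  term: {quadratic_phase:.3e}")
```

The quartic term contributes a phase of a few times $10^{-5}$, while the quadratic term contributes a phase of $\pi$, even though $\theta_b^2/2$ itself is only $5\cdot 10^{-5}$.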
We now consider a plane wave with a unit-length propagation vector $\hat{\omega}$ that is incident on the
flat moving mirror of a Michelson interferometer. When the moving mirror is correctly aligned,
its unit-length surface normal is $\hat{z}$, pointing approximately antiparallel to $\hat{\omega}$ as shown in Fig.
4B.1; and when the moving mirror is misaligned by a very small angle, its unit-length surface
normal is $\hat{n}_M$. The unit-length propagation vector of the plane wave reflected from the aligned
moving mirror is $\hat{\Omega}$, and the unit-length propagation vector of the plane wave reflected from the
misaligned moving mirror is $\hat{\Omega}_d$. We know that the angle between $\hat{\Omega}$ and $\hat{\Omega}_d$ is $\theta_d$, with $\theta_d$
much smaller than $\theta_b$ as shown in inequality (4.68) of Sec. 4.12 above. Since we are only
interested in finding the interferometer’s measurement noise for small misalignment angles, we
say that $\theta_{d\max}$, the maximum expected value of $\theta_d$, satisfies

\[ \frac{\theta_{d\max}}{\theta_{b\max}} \le 10^{-2} . \tag{4B.4a} \]

According to inequality (4B.2a) this means the largest we expect $\theta_d$ to become is

\[ \theta_{d\max} \le 10^{-4}\ \text{radians} . \tag{4B.4b} \]

[FIGURE 4B.1. Unit vector $\hat{\omega}$ incident on the reflective surface of the moving mirror, whose surface normal is unit vector $\hat{z}$, and the reflected unit vector $\hat{\Omega}$; $\hat{\omega}$ and $\hat{\Omega}$ each make angle $\theta_b$ with $\hat{z}$.]

There is also a close connection between $\theta_{d\max}$ and the cross-sectional size of the interferometer’s
beam. In a well-designed interferometer,73

\[ D\,\theta_{d\max}\,\sigma_{\max} \le 0.14 , \tag{4B.4c} \]

where $D$ is the typical distance across the beam’s cross-sectional area. If, for example, the beam
has a circular cross section, then $D$ is the circle’s diameter.
Although, as shown in Fig. 4B.1, vectors $\hat{\omega}$, $\hat{z}$, and $\hat{\Omega}$ always lie in the same plane, there is
no reason to expect the surface normal $\hat{n}_M$ of the misaligned moving mirror, or the propagation
vector $\hat{\Omega}_d$ of the plane wave reflected off the misaligned moving mirror, also to lie in that plane.
We do, however, know that $\hat{\omega}$, $\hat{z}$, $\hat{\Omega}$, $\hat{n}_M$, and $\hat{\Omega}_d$ are all unit-length vectors. When we put the
bases or “tails” of vectors $\hat{z}$, $\hat{\Omega}$, $\hat{n}_M$, and $\hat{\Omega}_d$ at the same location, their tips always lie on the
surface of a sphere of unit radius; and if we put the tip of $\hat{\omega}$ together with the other four vectors’
bases, then the base of $\hat{\omega}$ lies on the surface of that same sphere. Because $\theta_b \le 10^{-2}$ radians and
$\theta_d \le 10^{-4}$ radians [see inequalities (4B.2a) and (4B.4b)] are very small angles, we can approximate
the sphere’s curving surface near the tip of $\hat{z}$ as a plane, drawing the construction shown in Fig.
4B.2. Then, according to the law of specular reflection, the base of $\hat{\omega}$ lies on a straight line with
the tips of $\hat{z}$ and $\hat{\Omega}$, with the tip of $\hat{z}$ lying a distance $\theta_b$ from the base of $\hat{\omega}$ and the tip of $\hat{\Omega}$
lying a distance $\theta_b$ from the tip of $\hat{z}$. Similarly, the base of $\hat{\omega}$ lies on a straight line with the tips
of $\hat{n}_M$ and $\hat{\Omega}_d$, with the tip of $\hat{n}_M$ lying halfway between $\hat{\omega}$ and $\hat{\Omega}_d$. Having defined, using
this flat-plane approximation, that the distance between the tips of $\hat{\Omega}$ and $\hat{\Omega}_d$ on the unit sphere
is angle $\theta_d$, we then know that the distance between the tips of $\hat{z}$ and $\hat{n}_M$ must be $\theta_d/2$. This
result comes from the similar triangle theorem: the triangle formed by the base of $\hat{\omega}$ and the tips
of $\hat{\Omega}$ and $\hat{\Omega}_d$ is twice the size of, and similar in shape to, the triangle formed by the base of $\hat{\omega}$
and the tips of $\hat{z}$ and $\hat{n}_M$.


We can define displacement vectors

\[ \vec{\gamma} = \hat{n}_M - \hat{z} \tag{4B.4d} \]

and

\[ \vec{\Gamma} = \hat{\Omega}_d - \hat{\Omega} , \tag{4B.4e} \]
d

73
D. Cohen, “Performance Degradation of a Michelson Interferometer When Its Misalignment Angle Is a Rapidly
Varying Time Series,” Applied Optics 36, no. 18 (20 June 1997), pp. 4034–4042.

[FIGURE 4B.2. Flat-plane construction showing unit vectors $\hat{\omega}$, $\hat{z}$, $\hat{n}_M$, $\hat{\Omega}$, and $\hat{\Omega}_d$, displacement vectors $\vec{\gamma}$ and $\vec{\Gamma}$, and angles $\theta_b$, $\theta_d$, $\theta_d/2$, and $\phi$.]

This diagram and Fig. 4B.3 go with the discussion following Eq. (4B.4c) in Appendix 4B. No matter where $\hat{\omega}$ is
put in this geometric construction, the angle between $\hat{\Omega}$ and $\hat{\Omega}_d$ is always twice the angle between the tips of
vectors $\hat{z}$ and $\hat{n}_M$. Note in Fig. 4B.3 how the angle between the tips of $\hat{\Omega}_1$, $\hat{\Omega}_{d1}$ and $\hat{\Omega}_2$, $\hat{\Omega}_{d2}$ is twice as
large as the angle between the tips of $\hat{z}$, $\hat{n}_M$ even though $\hat{\omega}_1$ and $\hat{\omega}_2$ are not the same vector.

[FIGURE 4B.3. Flat-plane construction for two incident unit vectors $\hat{\omega}_1$ and $\hat{\omega}_2$, showing unit vectors $\hat{z}$, $\hat{n}_M$, $\hat{\Omega}_1$, $\hat{\Omega}_{d1}$, $\hat{\Omega}_2$, and $\hat{\Omega}_{d2}$, displacement vectors $\vec{\gamma}$, $\vec{\Gamma}_{d1}$, and $\vec{\Gamma}_{d2}$, and angles $\theta_{b1}$ and $\theta_{b2}$.]

with $\vec{\gamma}$ the displacement vector from the tip of $\hat{z}$ to the tip of $\hat{n}_M$ and $\vec{\Gamma}$ the displacement vector
from the tip of $\hat{\Omega}$ to the tip of $\hat{\Omega}_d$. According to these definitions, we have

\[ |\vec{\Gamma}| \cong \theta_d \tag{4B.4f} \]

and

\[ |\vec{\gamma}| \cong \theta_d/2 . \tag{4B.4g} \]

Because the two displacement vectors point in the same direction, we can write, according to the
flat-plane approximation, $\vec{\Gamma} = 2\vec{\gamma}$. Since the flat-plane approximation is only approximately true,
we settle for

\[ \vec{\Gamma} \cong 2\vec{\gamma} . \tag{4B.4h} \]

Angle $\phi$ gives the orientation of $\vec{\gamma}$ with respect to the line joining the base of $\hat{\omega}$ to the tip of
$\hat{z}$; by changing the value of $\phi$, we change the shape of the two similar triangles, but we cannot
change the fact that they are similar.
Figure 4B.3 shows another geometric fact worth noting. Holding vectors $\hat{z}$ and $\hat{n}_M$ fixed, we
consider two different propagation vectors $\hat{\omega}_1$ and $\hat{\omega}_2$ making two different angles $\theta_{b1}$ and $\theta_{b2}$
with respect to $\hat{z}$. The $\hat{\omega}_1$ plane wave reflects off the aligned and misaligned moving mirror with
propagation vectors $\hat{\Omega}_1$ and $\hat{\Omega}_{d1}$ respectively, and the $\hat{\omega}_2$ plane wave reflects off the aligned and
misaligned moving mirror with propagation vectors $\hat{\Omega}_2$ and $\hat{\Omega}_{d2}$ respectively. Using the flat-plane
approximation, we see that, because the displacement vector $\vec{\gamma} = \hat{n}_M - \hat{z}$ does not
change, the similar triangle theorem forces the two displacement vectors $\vec{\Gamma}_1 = \hat{\Omega}_{d1} - \hat{\Omega}_1$ and
$\vec{\Gamma}_2 = \hat{\Omega}_{d2} - \hat{\Omega}_2$ to be equal. Realizing again that the flat-plane approximation is only
approximately true, we end up with

\[ \hat{\Omega}_d - \hat{\Omega} \cong 2\,(\hat{n}_M - \hat{z}) \tag{4B.4i} \]

for all possible incident propagation vectors $\hat{\omega}$ (as long as the incident wave is part of the field-chopped
beam, propagating parallel to or nearly parallel to the $\hat{z}$ axis).
We next consider whether the approximation $\vec{\Gamma} \cong 2\vec{\gamma}$, which is strictly true when the surface
of the unit sphere is treated as a plane, is accurate enough to use in the phase terms of Chapter 4.
Figure 4B.4 shows the orientation of vectors $\hat{\omega}$, $\hat{z}$, $\hat{n}_M$, $\hat{\Omega}$, and $\hat{\Omega}_d$ on the curved surface of a
unit sphere. We acknowledge the curvature of the sphere by drawing two straight lines $s'$ and $s''$
perpendicularly from the shaft of $\hat{z}$ to the tips of $\hat{\Omega}_d$ and $\hat{\Omega}$ respectively. We also draw two arc
lengths $a'$ and $a''$ running along the surface of the sphere from the tip of $\hat{z}$ to the tips of
$\hat{\Omega}_d$ and $\hat{\Omega}$ respectively. If we decrease $\phi$ while holding $\theta_b$ and $\theta_d$ constant, which shortens $a'$
and draws the tip of $\hat{\Omega}_d$ closer to the tip of $\hat{z}$, then the straight line $s'$ hits the shaft of $\hat{z}$ at a
point that gets closer to the tip of $\hat{z}$, increasing $\hat{\Omega}_d\bullet\hat{z}$, the distance from where $s'$ hits $\hat{z}$ to the
base of $\hat{z}$. Changing angle $\phi$ does not change $a''$, $s''$, or the value of $\hat{\Omega}\bullet\hat{z}$, the distance from
where $s''$ hits $\hat{z}$ to the base of $\hat{z}$. Clearly, the smaller we can make $a'$ compared to $a''$, the
greater is the difference between the values of $\hat{\Omega}_d\bullet\hat{z}$ and $\hat{\Omega}\bullet\hat{z}$. If instead of decreasing $\phi$ we
increase it past $\pi/2$, the point where $s'$ hits the shaft of $\hat{z}$ starts dropping, eventually going
below the point where $s''$ hits the shaft of $\hat{z}$. Thus it is also true that as we increase $\phi$, the
difference between $\hat{\Omega}_d\bullet\hat{z}$ and $\hat{\Omega}\bullet\hat{z}$ eventually begins to increase. We conclude, then, that the
difference between $\hat{\Omega}_d\bullet\hat{z}$ and $\hat{\Omega}\bullet\hat{z}$ is maximized when $\phi$ is 0 or $\pi$, making $|a' - a''|$ a
maximum.
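The term $e^{2\pi i\sigma\chi(\hat{\Omega}_d\bullet\hat{z})}$ first appears in Eqs. (4.86a) and (4.86b) in Sec. 4.12. We want to
maximize the difference $\Delta$ between the phase term $2\pi\sigma\chi\,\hat{\Omega}_d\bullet\hat{z}$ and $2\pi\sigma\chi\,\hat{\Omega}\bullet\hat{z}$ to see if, even
when this difference $\Delta$ is at a maximum, the latter can be used to approximate the former.
Therefore we take $\sigma = \sigma_{\max}$, $\chi = \chi_{\max}$ in the phase term to get

\[
\begin{aligned}
|\Delta| = \bigl| 2\pi\sigma\chi\,\hat{\Omega}_d\bullet\hat{z} - 2\pi\sigma\chi\,\hat{\Omega}\bullet\hat{z} \bigr|
&\le 2\pi\sigma_{\max}\chi_{\max}\,\bigl| \hat{\Omega}_d\bullet\hat{z} - \hat{\Omega}\bullet\hat{z} \bigr| \\
&= 2\pi\sigma_{\max}\chi_{\max}\,\bigl| \cos(a') - \cos(\theta_b) \bigr| \\
&\le 2\pi\sigma_{\max}\chi_{\max}\,\bigl| \cos(\theta_b \pm \theta_d) - \cos(\theta_b) \bigr| ,
\end{aligned}
\]

where in the last step we say $\phi$ is 0 or $\pi$ in Fig. 4B.4 to make $|\hat{\Omega}_d\bullet\hat{z} - \hat{\Omega}\bullet\hat{z}|$ a maximum. Of
course if $\phi$ is 0 or $\pi$, then $\cos(a') \to \cos(\theta_b \pm \theta_d)$.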


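Working now with the term inside the absolute value signs, we have, remembering that both
$\theta_b$ and $\theta_d$ are small with $\theta_d \ll \theta_b$ [see inequality (4.68)],

\[
\begin{aligned}
\cos(\theta_b \pm \theta_d) - \cos(\theta_b)
&= \Bigl[ 1 - \tfrac{1}{2}(\theta_b \pm \theta_d)^2 + \tfrac{1}{24}(\theta_b \pm \theta_d)^4 - \cdots \Bigr] - \Bigl[ 1 - \frac{\theta_b^2}{2} + \frac{\theta_b^4}{24} - \cdots \Bigr] \\
&= \mp\theta_b\theta_d - \frac{\theta_d^2}{2} + O(\theta_b^3\theta_d) \\
&\le \theta_{b\max}\theta_{d\max} + \frac{\theta_{d\max}^2}{2} + O(\theta_{b\max}^3\theta_{d\max}) .
\end{aligned}
\]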
2

[FIGURE 4B.4. Unit vectors $\hat{\omega}$, $\hat{z}$, $\hat{n}_M$, $\hat{\Omega}$, and $\hat{\Omega}_d$ on the curved surface of the unit sphere, with displacement vectors $\vec{\gamma}$ and $\vec{\Gamma}$, straight lines $s'$ and $s''$, arc lengths $a'$ and $a''$, and angles $\theta_b$, $\theta_d$, and $\phi$.]

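Substituting this latest result into the previous inequality to set an upper bound on the size of
$\Delta$, we get

\[ |\Delta| \le 2\pi\sigma_{\max}\chi_{\max}\Bigl[ \theta_{b\max}\theta_{d\max} + \frac{\theta_{d\max}^2}{2} \Bigr] + O(\sigma_{\max}\chi_{\max}\theta_{b\max}^3\theta_{d\max}) . \tag{4B.5a} \]

Substitution from (4B.2a), (4B.2d), and (4B.4b) gives

\[ O(\sigma_{\max}\chi_{\max}\theta_{b\max}^3\theta_{d\max}) \cong O(\theta_{b\max}\theta_{d\max}) \le O(10^{-6}) , \tag{4B.5b} \]

which according to (4B.1c) is small enough to neglect. Hence we can write (4B.5a) as

\[ |\Delta| \le 2\pi\sigma_{\max}\chi_{\max}\theta_{b\max}\theta_{d\max} + \pi\sigma_{\max}\chi_{\max}\theta_{d\max}^2 . \tag{4B.5c} \]

We substitute from (4B.2c) to get

\[ |\Delta| \le 2\pi\,\frac{\theta_{d\max}}{\theta_{b\max}} + \pi\,\frac{\theta_{d\max}^2}{\theta_{b\max}^2} . \]

From inequality (4B.4a), the first term on the right-hand side is less than or equal to $2\pi\cdot 10^{-2}$, and
the second term on the right-hand side is less than or equal to $\pi\cdot 10^{-4}$. Both of these are,
according to (4B.1c), small enough to neglect. We conclude that $\Delta$ must itself be small enough
to neglect, letting us write

\[ 2\pi\sigma\chi\,\hat{\Omega}_d\bullet\hat{z} \cong 2\pi\sigma\chi\,\hat{\Omega}\bullet\hat{z} \]

or

\[ e^{2\pi i\sigma\chi\,\hat{\Omega}_d\bullet\hat{z}} \cong e^{2\pi i\sigma\chi\,\hat{\Omega}\bullet\hat{z}} \tag{4B.5d} \]

for the phase terms of our interferometer equations.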
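The size of $\Delta$ at the extreme parameter values can be checked with a few lines of arithmetic. This sketch assumes $\theta_{b\max} = 10^{-2}$, $\theta_{d\max} = 10^{-4}$, and $\sigma_{\max}\chi_{\max} = \theta_{b\max}^{-2}$, and it evaluates the cosine difference directly rather than through its power series; the variable names are illustrative only.

```python
import math

theta_b_max = 1.0e-2                                  # (4B.2a)
theta_d_max = 1.0e-4                                  # (4B.4b)
two_pi_sigma_chi = 2.0 * math.pi / theta_b_max**2     # 2*pi*sigma_max*chi_max, (4B.2d)

# Worst case over the two signs in cos(theta_b +/- theta_d) - cos(theta_b).
delta_worst = max(
    abs(two_pi_sigma_chi
        * (math.cos(theta_b_max + s * theta_d_max) - math.cos(theta_b_max)))
    for s in (+1.0, -1.0)
)

# Bound obtained from (4B.5c) after substituting (4B.2c).
ratio = theta_d_max / theta_b_max
delta_bound = 2.0 * math.pi * ratio + math.pi * ratio**2

print(f"direct evaluation: {delta_worst:.6f}")
print(f"series bound:      {delta_bound:.6f}")
```

Both numbers come out near 0.063, and the direct evaluation sits just below the series bound.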


The phase term $e^{2\pi i\sigma\,\vec{r}\bullet(\hat{\Omega}_d - \hat{\Omega})}$ first appears in Eqs. (4.86a) and (4.86b) in Sec. 4.12. From the
definition $\vec{\Gamma} = \hat{\Omega}_d - \hat{\Omega}$ in Eq. (4B.4e), we can write this phase term as $e^{2\pi i\sigma\,\vec{r}\bullet\vec{\Gamma}}$. From the laws of
specular optics, or careful study of Figs. 4B.1 and 4B.4, we know that

\[ \hat{\Omega} = \hat{\omega} - 2\hat{z}\,(\hat{\omega}\bullet\hat{z}) \tag{4B.6a} \]

for plane waves reflecting off the correctly aligned moving mirror, and

\[ \hat{\Omega}_d = \hat{\omega} - 2\hat{n}_M\,(\hat{\omega}\bullet\hat{n}_M) \tag{4B.6b} \]


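for plane waves reflecting off the misaligned moving mirror. The orientation of $\hat{\omega}$ and $\hat{\Omega}$ with
respect to $\hat{z}$ shows that

\[ \hat{\omega}\bullet\hat{z} = -\hat{\Omega}\bullet\hat{z} , \tag{4B.6c} \]

and similarly the orientation of $\hat{\omega}$, $\hat{\Omega}_d$ with respect to $\hat{n}_M$ shows that

\[ \hat{\omega}\bullet\hat{n}_M = -\hat{\Omega}_d\bullet\hat{n}_M . \tag{4B.6d} \]

Hence Eqs. (4B.6a-d) can be used to write the phase in $e^{2\pi i\sigma\,\vec{r}\bullet\vec{\Gamma}}$ as

\[ 2\pi\sigma\,\vec{r}\bullet\vec{\Gamma} = 2\pi\sigma\,\vec{r}\bullet(\hat{\Omega}_d - \hat{\Omega}) = 4\pi\sigma\,\vec{r}\bullet\bigl[ \hat{n}_M(\hat{\Omega}_d\bullet\hat{n}_M) - \hat{z}(\hat{\Omega}\bullet\hat{z}) \bigr] . \tag{4B.7a} \]

Remembering the definition $\vec{\gamma} = \hat{n}_M - \hat{z}$ in Eq. (4B.4d), we will now demonstrate that the
rightmost expression in (4B.7a) can be approximated as $4\pi\sigma\,\vec{r}\bullet\vec{\gamma}$, which turns (4B.7a) into

\[ 2\pi\sigma\,\vec{r}\bullet\vec{\Gamma} \cong 4\pi\sigma\,\vec{r}\bullet\vec{\gamma} = 4\pi\sigma\,\vec{r}\bullet[\hat{n}_M - \hat{z}] . \tag{4B.7b} \]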

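We start the demonstration by noting that

\[
\begin{aligned}
4\pi\sigma\,\vec{r}\bullet\bigl[ \hat{n}_M(\hat{\Omega}_d\bullet\hat{n}_M) - \hat{z}(\hat{\Omega}\bullet\hat{z}) \bigr]
&= 4\pi\sigma\,\vec{r}\bullet\bigl[ \hat{n}_M(\hat{\Omega}\bullet\hat{z}) - \hat{z}(\hat{\Omega}\bullet\hat{z}) + \hat{n}_M(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z}) \bigr] \\
&= 4\pi\sigma\,\vec{r}\bullet\bigl[ \hat{n}_M(\hat{\Omega}\bullet\hat{z}) - \hat{z}(\hat{\Omega}\bullet\hat{z}) \bigr] + 4\pi\sigma\,(\vec{r}\bullet\hat{n}_M)(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z}) .
\end{aligned}
\tag{4B.8a}
\]

The $4\pi\sigma\,(\vec{r}\bullet\hat{n}_M)(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z})$ term can be shown to be negligible by evaluating the upper
limit of its absolute value,

\[
\begin{aligned}
\bigl| 4\pi\sigma\,(\vec{r}\bullet\hat{n}_M)(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z}) \bigr|
&\le 4\pi\sigma_{\max}\,\bigl| \vec{r}\bullet\hat{n}_M \bigr| \cdot \bigl| \hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z} \bigr| \\
&\cong 4\pi\sigma_{\max}\,|z| \cdot \bigl| \hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z} \bigr| ,
\end{aligned}
\tag{4B.8b}
\]

where we use that

\[ \sigma \le \sigma_{\max} \quad\text{and}\quad \vec{r}\bullet\hat{n}_M \cong \vec{r}\bullet\hat{z} = z \]

because the $\hat{n}_M$ unit vector is tilted away from $\hat{z}$ by only a very small angle.74 Although at first
we might suppose that $z$ can be indefinitely large, this is not the case. The maximum value of
$\theta_b$, which we called $\theta_{b\max}$ above, governs how much the interferometer beam’s cross section
spreads as radiation travels through the interferometer. We are only interested in approximating
the phase terms for field points inside the interferometer, and if $z$ gets too large it represents
points outside the interferometer where the validity of our phase approximations is irrelevant. We
assume that in a well-designed interferometer the beam does not spread more than 5%, which
means the product $|z|\,\theta_{b\max}$ satisfies the inequality

\[ 0 \le |z|\,\theta_{b\max} \le \frac{D}{20} \quad\text{or}\quad |z|_{\max} \le \frac{D}{20\,\theta_{b\max}} \tag{4B.8c} \]

for $D$ having the same meaning as in inequality (4B.4c) above.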


Figure 4B.5 is the same as Fig. 4B.4 only now we have left out $a'$ and $s'$ to avoid clutter, and
added $a'''$ and $s'''$ to represent the arc-length and straight-line separation of the tips of vectors $\hat{\Omega}_d$
and $\hat{n}_M$. Following the same sort of reasoning used above to analyze the behavior of $\hat{\Omega}_d\bullet\hat{z}$ and
$\hat{\Omega}\bullet\hat{z}$ [see discussion after Eq. (4B.4i)], we note that as $\phi$ decreases to zero in Fig. 4B.5, the arc
length $a'''$ eventually decreases, because as the tips of $\hat{n}_M$, $\hat{z}$, $\hat{\Omega}$, and $\hat{\Omega}_d$ fall onto the same arc,
the tip of $\hat{n}_M$ only goes about half as far toward the base of $\hat{\omega}$ as the tip of $\hat{\Omega}_d$ [see Eq. (4B.4h)].
This means that the point where $s'''$ perpendicularly joins the shaft of $\hat{n}_M$ approaches the tip of
$\hat{n}_M$, increasing the value of $\hat{\Omega}_d\bullet\hat{n}_M$. While this happens, there is no change in the position where
$s''$ perpendicularly joins the shaft of $\hat{z}$, so $\hat{\Omega}\bullet\hat{z}$ stays the same. Thus, for $\phi = 0$ there is a
maximum in the value of $\hat{\Omega}_d\bullet\hat{n}_M$ that, because $\hat{\Omega}\bullet\hat{z}$ stays the same, maximizes the expression

\[ \bigl| \hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z} \bigr| . \]

When $\phi$ increases to $\pi$ in Fig. 4B.5, arc length $a'''$ eventually increases, because as the tips of
$\hat{n}_M$, $\hat{z}$, $\hat{\Omega}$, and $\hat{\Omega}_d$ fall onto the same arc, the tip of $\hat{\Omega}_d$ moves away from the base of $\hat{\omega}$ by
double the distance that $\hat{n}_M$ does. This makes the point where $s'''$ perpendicularly joins the shaft
of $\hat{n}_M$ drop further from the tip of $\hat{n}_M$, decreasing the value of $\hat{\Omega}_d\bullet\hat{n}_M$. Consequently, $\phi = \pi$
marks the other maximum in

74
This angle is approximately $\theta_d/2$, which is less than or equal to $5\cdot 10^{-5}$ radians; see inequality (4B.4b).

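\[ \bigl| \hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z} \bigr| . \]

[FIGURE 4B.5. Same construction as Fig. 4B.4, showing unit vectors $\hat{\omega}$, $\hat{z}$, $\hat{n}_M$, $\hat{\Omega}$, and $\hat{\Omega}_d$, displacement vectors $\vec{\gamma}$ and $\vec{\Gamma}$, straight lines $s''$ and $s'''$, arc lengths $a''$ and $a'''$, and angles $\theta_b$, $\theta_d$, and $\phi$.]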

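Hence the upper limit of the absolute value of the $4\pi\sigma\,(\vec{r}\bullet\hat{n}_M)(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z})$ term in Eq.
(4B.8a) is given by, using Eq. (4B.8b),

\[
\begin{aligned}
\bigl| 4\pi\sigma\,(\vec{r}\bullet\hat{n}_M)(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z}) \bigr|
&\le 4\pi\sigma_{\max}\,|z| \cdot \bigl| \hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z} \bigr| \\
&\le 4\pi\sigma_{\max}\,|z|_{\max}\,\bigl| \cos(a''') - \cos\theta_b \bigr| .
\end{aligned}
\]

To maximize $|\cos(a''') - \cos\theta_b|$, we either maximize $\cos(a''')$ when $\cos(a''') > \cos(\theta_b)$ at $\phi = 0$ or
minimize $\cos(a''')$ when $\cos(\theta_b) > \cos(a''')$ at $\phi = \pi$. When $\phi = 0$ we have $a''' = \theta_b - \theta_d/2$ so
$\cos(a''') = \cos(\theta_b - \theta_d/2)$. Similarly, when $\phi = \pi$, we have $a''' = \theta_b + \theta_d/2$ so
$\cos(a''') = \cos(\theta_b + \theta_d/2)$. Hence the two possible maximums of $|\cos(a''') - \cos\theta_b|$ at $\phi = 0$ and
$\phi = \pi$ must each be less than or equal to $|\cos(\theta_b \pm \theta_d/2) - \cos\theta_b|$. This latest expression can only
get larger when we stop dividing $\theta_d$ by 2. Therefore we can write

\[ \bigl| 4\pi\sigma\,(\vec{r}\bullet\hat{n}_M)(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z}) \bigr| \le 4\pi\sigma_{\max}\,|z|_{\max}\,\bigl| \cos(\theta_b \pm \theta_d) - \cos\theta_b \bigr| . \tag{4B.9a} \]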

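Inequality (4B.8c) can now be used to show that (approximating the cosine by its power series
because both $\theta_b$ and $\theta_d$ are small)

\[
\begin{aligned}
\bigl| 4\pi\sigma\,(\vec{r}\bullet\hat{n}_M)(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z}) \bigr|
&\le \frac{\pi\sigma_{\max}D}{5\,\theta_{b\max}}\,\Bigl| \mp\theta_b\theta_d - \frac{\theta_d^2}{2} + O(\theta_b^3\theta_d) \Bigr| \\
&\le \frac{\pi D\sigma_{\max}}{5\,\theta_{b\max}}\,\Bigl[ \theta_b\theta_d + \frac{\theta_d^2}{2} + O(\theta_b^3\theta_d) \Bigr] .
\end{aligned}
\]

Replacing $\theta_b$ and $\theta_d$ by $\theta_{b\max}$ and $\theta_{d\max}$ gives

\[
\bigl| 4\pi\sigma\,(\vec{r}\bullet\hat{n}_M)(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z}) \bigr|
\le \frac{\pi D\sigma_{\max}\theta_{d\max}}{5} + \frac{\pi D\sigma_{\max}\theta_{d\max}^2}{10\,\theta_{b\max}} + O(D\sigma_{\max}\theta_{b\max}^2\theta_{d\max}) .
\tag{4B.9b}
\]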

Inequality (4B.4c) shows that

\[ O(D\sigma_{\max}\theta_{b\max}^2\theta_{d\max}) \le \theta_{b\max}^2\cdot(0.14) \le 0.14\times 10^{-4} , \tag{4B.9c} \]

where in the last step we used (4B.2a) to establish an upper bound on

\[ O(D\sigma_{\max}\theta_{b\max}^2\theta_{d\max}) . \]

According to (4B.1c), this upper bound is small enough to neglect, so we can rewrite inequality
(4B.9b) as

\[
\bigl| 4\pi\sigma\,(\vec{r}\bullet\hat{n}_M)(\hat{\Omega}_d\bullet\hat{n}_M - \hat{\Omega}\bullet\hat{z}) \bigr|
\le \frac{\pi D\sigma_{\max}\theta_{d\max}}{5} + \frac{\pi D\sigma_{\max}\theta_{d\max}^2}{10\,\theta_{b\max}} + \text{neglectable terms.}
\tag{4B.9d}
\]

Again using inequality (4B.4c) and also (4B.4a), we write

\[ \frac{\pi D\sigma_{\max}\theta_{d\max}^2}{10\,\theta_{b\max}} \le \frac{0.14\,\pi}{10}\cdot\frac{\theta_{d\max}}{\theta_{b\max}} \le 0.14\,\pi\cdot 10^{-3} , \]

which is also, according to (4B.1c), small enough to neglect.

Applying inequality (4B.4c) yet again, now to the first term on the right-hand side of (4B.9d),
gives

\[ \frac{\pi D\sigma_{\max}\theta_{d\max}}{5} \le \frac{0.14\,\pi}{5} , \]

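which is again small enough to neglect. Clearly, the left-hand side of inequality (4B.9d) must
always be small enough to neglect, allowing us to approximate Eq. (4B.8a) as

\[
\begin{aligned}
4\pi\sigma\,\vec{r}\bullet\bigl[ \hat{n}_M(\hat{\Omega}_d\bullet\hat{n}_M) - \hat{z}(\hat{\Omega}\bullet\hat{z}) \bigr]
&\cong 4\pi\sigma\,\vec{r}\bullet\bigl[ \hat{n}_M(\hat{\Omega}\bullet\hat{z}) - \hat{z}(\hat{\Omega}\bullet\hat{z}) \bigr] \\
&= 4\pi\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z})\cdot(\hat{\Omega}\bullet\hat{z}) .
\end{aligned}
\tag{4B.10a}
\]

We now write

\[ \hat{\Omega}\bullet\hat{z} = \cos\theta_b \cong 1 - \frac{\theta_b^2}{2} \]

and substitute this into the rightmost expression in (4B.10a) to get

\[ 4\pi\sigma\,\vec{r}\bullet\bigl[ \hat{n}_M(\hat{\Omega}_d\bullet\hat{n}_M) - \hat{z}(\hat{\Omega}\bullet\hat{z}) \bigr] \cong 4\pi\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z}) - 2\pi\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z})\,\theta_b^2 . \tag{4B.10b} \]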


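The second term on the right-hand side of (4B.10b) has an absolute value with an upper limit

\[
\begin{aligned}
\bigl| 2\pi\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z})\,\theta_b^2 \bigr|
&\le 2\pi\sigma_{\max}\theta_{b\max}^2\,\bigl| \vec{r}\bullet(\hat{n}_M - \hat{z}) \bigr| \\
&= 2\pi\sigma_{\max}\theta_{b\max}^2\,\bigl| \vec{r}\bullet\vec{\gamma} \bigr| \\
&\le 2\pi\sigma_{\max}\theta_{b\max}^2\,|\vec{r}|_{\max}\cdot|\vec{\gamma}|_{\max} .
\end{aligned}
\]

We note that the last step here is really a gross overestimate of $|\vec{r}\bullet\vec{\gamma}|$ because the two unit-length
vectors $\hat{n}_M$ and $\hat{z}$ are almost parallel, making vectors $\vec{r}$ and $\hat{n}_M - \hat{z}$ almost perpendicular for
large values of $|\vec{r}|$. We estimate $|\vec{r}|_{\max}$ by $|z|_{\max}$, writing that

\[
\begin{aligned}
\bigl| 2\pi\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z})\,\theta_b^2 \bigr|
&\le 2\pi\sigma_{\max}\theta_{b\max}^2\,|z|_{\max}\cdot|\vec{\gamma}|_{\max} \\
&\le 2\pi\sigma_{\max}\theta_{b\max}^2\cdot\frac{D}{20\,\theta_{b\max}}\cdot\frac{\theta_{d\max}}{2} ,
\end{aligned}
\]

where in the second step Eq. (4B.4g) is used to replace $|\vec{\gamma}|_{\max}$ by $\theta_{d\max}/2$, and inequality (4B.8c)
is used to replace $|z|_{\max}$ by $D/(20\,\theta_{b\max})$. Now we can use inequalities (4B.4c) and (4B.2a) to
write

\[ \bigl| 2\pi\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z})\,\theta_b^2 \bigr| \le \frac{0.14\,\pi}{20}\cdot\theta_{b\max} \le \frac{0.14\,\pi}{20}\cdot 10^{-2} , \]

which is, according to (4B.1c), small enough to neglect. We conclude that the second term on the
right-hand side of (4B.10b) is small enough to ignore, so that (4B.10b) becomes

\[ 4\pi\sigma\,\vec{r}\bullet\bigl[ \hat{n}_M(\hat{\Omega}_d\bullet\hat{n}_M) - \hat{z}(\hat{\Omega}\bullet\hat{z}) \bigr] \cong 4\pi\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z}) . \]

For the final step, we substitute this back into Eq. (4B.7a) to get

\[ 2\pi\sigma\,\vec{r}\bullet\vec{\Gamma} \cong 4\pi\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z}) \]

or

\[ 2\pi\sigma\,\vec{r}\bullet\vec{\Gamma} \cong 4\pi\sigma\,\vec{r}\bullet\vec{\gamma} , \tag{4B.10c} \]

where in the last step we use that $\vec{\gamma} = \hat{n}_M - \hat{z}$ from Eq. (4B.4d). This shows that the
approximation in Eq. (4B.7b) holds true, which is what we set out to demonstrate. Since
$\vec{\Gamma} = \hat{\Omega}_d - \hat{\Omega}$ [see Eq. (4B.4e) above], this result can also be written as

\[ e^{2\pi i\sigma\,\vec{r}\bullet(\hat{\Omega}_d - \hat{\Omega})} \cong e^{4\pi i\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z})} . \tag{4B.10d} \]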
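The approximation (4B.10d) can be compared with the exact reflection law at one extreme field point. Everything numerical below is an assumption made for illustration: a wavenumber of 1000 cm⁻¹, the largest beam width $D$ allowed by (4B.4c), the largest $|z|$ allowed by (4B.8c), a mirror tilt of $\theta_d/2$ in the plane of incidence, and a field point at the edge of the beam. The helper names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflect(w, n):
    """Specular reflection of propagation vector w off a mirror with unit normal n."""
    d = dot(w, n)
    return [wi - 2.0 * d * ni for wi, ni in zip(w, n)]

theta_b = 1.0e-2                       # (4B.2a)
theta_d = 1.0e-4                       # (4B.4b)
sigma = 1000.0                         # hypothetical wavenumber, cm^-1
beam_d = 0.14 / (sigma * theta_d)      # largest D allowed by (4B.4c), cm
z_max = beam_d / (20.0 * theta_b)      # largest |z| allowed by (4B.8c), cm

z_hat = [0.0, 0.0, 1.0]
w = [math.sin(theta_b), 0.0, -math.cos(theta_b)]            # incident wave
n_m = [math.sin(theta_d / 2), 0.0, math.cos(theta_d / 2)]   # misaligned normal
r = [beam_d / 2.0, 0.0, z_max]                              # extreme field point

big_gamma = [a - b for a, b in zip(reflect(w, n_m), reflect(w, z_hat))]
small_gamma = [a - b for a, b in zip(n_m, z_hat)]

exact_phase = 2.0 * math.pi * sigma * dot(r, big_gamma)      # left side of (4B.10c)
approx_phase = 4.0 * math.pi * sigma * dot(r, small_gamma)   # right side of (4B.10c)
phase_error = exact_phase - approx_phase

print(f"exact  {exact_phase:.5f}")
print(f"approx {approx_phase:.5f}")
print(f"error  {phase_error:.5f}")
```

Under these assumptions the discrepancy is a few hundredths of a radian, inside the $0.14\pi/5$ bound found for the leading neglected term, while the retained phase itself is about 0.4 radian.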
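Before moving on, it is worth checking whether the phase term $e^{4\pi i\sigma\,\vec{r}\bullet(\hat{n}_M - \hat{z})}$ can be simplified
any further. Figure 4B.6 (see caption) shows that when the angle between the $\hat{z}$ and $\hat{n}_M$ unit-normal
vectors is approximately $\theta_d/2$, as specified in Fig. 4B.2, then the deviation of vector
$\vec{\gamma} = \hat{n}_M - \hat{z}$ from being exactly perpendicular to $\hat{z}$ is approximately the angle $\theta_d/4$. If we
decompose $\vec{\gamma}$ into a vector $\vec{\gamma}_\perp$ that is exactly perpendicular to $\hat{z}$ and a vector $\vec{\gamma}_\parallel$ that is
antiparallel to $\hat{z}$, we have

\[ \vec{\gamma} = \vec{\gamma}_\perp + \vec{\gamma}_\parallel \tag{4B.11a} \]

with

\[ |\vec{\gamma}_\perp| \cong |\vec{\gamma}| \cong \frac{\theta_d}{2} \tag{4B.11b} \]

and

\[ |\vec{\gamma}_\parallel| \cong \frac{\theta_d}{4}\cdot|\vec{\gamma}_\perp| \cong \frac{\theta_d^2}{8} . \tag{4B.11c} \]

Substitution of $\vec{\gamma} = \hat{n}_M - \hat{z} = \vec{\gamma}_\perp + \vec{\gamma}_\parallel$ into (4B.10c) gives, remembering that $\vec{\Gamma} = \hat{\Omega}_d - \hat{\Omega}$,

\[ 2\pi\sigma\,\vec{r}\bullet(\hat{\Omega}_d - \hat{\Omega}) \cong 4\pi\sigma\,\vec{r}\bullet(\vec{\gamma}_\perp + \vec{\gamma}_\parallel) = 4\pi\sigma\,\vec{r}\bullet\vec{\gamma}_\perp + 4\pi\sigma\,\vec{r}\bullet\vec{\gamma}_\parallel . \tag{4B.12a} \]

The absolute value of $4\pi\sigma\,\vec{r}\bullet\vec{\gamma}_\parallel$ has an upper limit

\[
\begin{aligned}
\bigl| 4\pi\sigma\,\vec{r}\bullet\vec{\gamma}_\parallel \bigr|
&\le 4\pi\sigma_{\max}\,|z|_{\max}\,|\vec{\gamma}_\parallel| \cong 4\pi\sigma_{\max}\,|z|_{\max}\,\frac{\theta_{d\max}^2}{8} \\
&\le \frac{\pi}{2}\,\sigma_{\max}\cdot\frac{D}{20\,\theta_{b\max}}\cdot\theta_{d\max}\cdot\left( \frac{0.14}{D\sigma_{\max}} \right) ,
\end{aligned}
\]

where we have used (4B.11c), (4B.8c), and (4B.4c) to simplify the expression for the upper limit.
Clearing away common factors and using inequality (4B.4a) gives

\[ \bigl| 4\pi\sigma\,\vec{r}\bullet\vec{\gamma}_\parallel \bigr| \le \frac{\pi}{40}\cdot(0.14\times 10^{-2}) , \]

which is, according to (4B.1c), small enough to neglect. Hence, (4B.12a) can be written as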

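[FIGURE 4B.6. Unit vectors $\hat{n}_M$ and $\hat{z}$ separated by angle $\theta_d/2$, with vector $\vec{\gamma}$ and its components $\vec{\gamma}_\perp$ and $\vec{\gamma}_\parallel$; marked angles $\theta_d/2$, $\theta_d/4$, and $\pi/2 - \theta_d/4$.]

The angle between $\hat{z}$ and $\vec{\gamma}$ is part of the right triangle whose hypotenuse is unit vector $\hat{z}$, showing that
it equals $\pi/2 - \theta_d/4$. The sum of this angle and the angle between $\vec{\gamma}$ and $\vec{\gamma}_\perp$ is also $\pi/2$ because $\vec{\gamma}_\perp$
is perpendicular to $\hat{z}$. Hence, the angle between $\vec{\gamma}$ and $\vec{\gamma}_\perp$ must be equal to $\theta_d/4$.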

\[ 2\pi\sigma\,\vec{r}\bullet(\hat{\Omega}_d - \hat{\Omega}) \cong 4\pi\sigma\,\vec{r}\bullet\vec{\gamma}_\perp \tag{4B.12b} \]

with $\vec{\gamma}_\perp$ being the component of $\hat{n}_M - \hat{z}$ that is perpendicular to $\hat{z}$. We note that the right-hand
side of (4B.12b) is too large to neglect. This expression can be as large as

\[ 2\pi\sigma_{\max}D\,|\vec{\gamma}_\perp| , \]

where we remember that, when an interferometer beam has a circular cross section, $D$ is its
diameter and the component of $\vec{r}$ parallel to $\vec{\gamma}_\perp$ can then be as large as $D/2$. From (4B.11b) and
(4B.4c), we see that $2\pi\sigma_{\max}D\,|\vec{\gamma}_\perp|$ can be as large as

\[ 2\pi\sigma_{\max}D\,|\vec{\gamma}_\perp| \cong 2\pi\sigma_{\max}D\cdot\frac{\theta_{d\max}}{2} = \pi\cdot 0.14 \cong 0.44 , \]

which, according to (4B.1c), is really too large to neglect. This is why we retain the term
$2\pi\sigma\,\vec{r}\bullet(\hat{\Omega}_d - \hat{\Omega})$ on the left-hand side of (4B.12b) in the interferometer equations.
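It may help to see the phase bounds of this appendix side by side. The listing below simply evaluates the upper limits quoted in the text at the extreme values $\theta_{b\max} = 10^{-2}$ and $\theta_{d\max}/\theta_{b\max} = 10^{-2}$ and expresses each as a multiple of the $\pi/4$ scale in (4B.1c). The dictionary labels are descriptive names chosen here, and the text does not fix a numerical cutoff for "much less than", so the listing only compares magnitudes.

```python
import math

theta_b_max = 1.0e-2          # (4B.2a)
ratio = 1.0e-2                # theta_d_max / theta_b_max, (4B.4a)
threshold = math.pi / 4.0     # phase scale in (4B.1c)

bounds = {
    "theta_b^4/24 term of cos(theta_b)": 0.25 * theta_b_max**2,
    "Delta, (4B.5c)": 2.0 * math.pi * ratio + math.pi * ratio**2,
    "first term of (4B.9d)": 0.14 * math.pi / 5.0,
    "second term of (4B.9d)": 0.14 * math.pi * 1.0e-3,
    "second term of (4B.10b)": 0.14 * math.pi / 20.0 * theta_b_max,
    "gamma_parallel term of (4B.12a)": math.pi / 40.0 * 0.14e-2,
    "gamma_perp term of (4B.12b)": 0.14 * math.pi,
    "theta_b^2/2 term of cos(theta_b)": math.pi,
}

for name, value in bounds.items():
    print(f"{name:36s} {value:10.3e} {value / threshold:9.4f} x pi/4")
```

The two retained terms, $\pi$ and about 0.44, stand well above the rest; the largest discarded bound is $0.14\pi/5$, a little under 0.09.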
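The final phase approximation that we need to justify is

\[
e^{-2\pi i\sigma\chi\sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b - 2\vec{\gamma})\bullet(\hat{\Omega} - \hat{z}\cos\theta_b - 2\vec{\gamma})}}
\cong
e^{-2\pi i\sigma\chi\sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b)\bullet(\hat{\Omega} - \hat{z}\cos\theta_b)}} .
\tag{4B.13a}
\]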

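We write the phase term on the left-hand side as

\[ -2\pi\sigma\chi\sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b - 2\vec{\gamma})^2} = -2\pi\sigma\chi\sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b)^2} + K . \tag{4B.13b} \]

The absolute value of the difference term $K$ has an upper bound

\[ |K| \le 2\pi\sigma_{\max}\chi_{\max}\,\Bigl| \sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b - 2\vec{\gamma})^2} - \sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b)^2} \Bigr| . \tag{4B.13c} \]

Figure 4B.4 shows that $|\hat{\Omega} - \hat{z}\cos\theta_b| \cong \theta_b$ and that

\[ \bigl| (\hat{\Omega} - \hat{z}\cos\theta_b) - 2\vec{\gamma} \bigr| \]

has its maximum and minimum values when $\phi$ is zero and $\pi$ respectively. These minimum and
maximum values will maximize the right-hand side of (4B.13c) because the term
$\sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b)^2}$ does not vary when angle $\phi$ changes. Since $|\hat{\Omega} - \hat{z}\cos\theta_b| \cong \theta_b$, $|2\vec{\gamma}| \cong \theta_d$, and
$\theta_d \ll \theta_b$, the maximum and minimum values of $|\hat{\Omega} - \hat{z}\cos\theta_b - 2\vec{\gamma}|$ are $\theta_b \pm \theta_d$, so that we can
write

\[
\begin{aligned}
\Bigl| \sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b - 2\vec{\gamma})^2} - \sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b)^2} \Bigr|
&\le \Bigl| \sqrt{1 - (\theta_b \pm \theta_d)^2} - \sqrt{1 - \theta_b^2} \Bigr| \\
&= \Bigl| 1 - \frac{(\theta_b \pm \theta_d)^2}{2} - \Bigl( 1 - \frac{\theta_b^2}{2} \Bigr) + O(\theta_b^4) \Bigr| \\
&= \Bigl| \mp\theta_b\theta_d - \frac{\theta_d^2}{2} + O(\theta_b^4) \Bigr| \\
&\le \frac{\theta_{d\max}^2}{2} + \theta_{b\max}\theta_{d\max} + O(\theta_b^4) .
\end{aligned}
\]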
This can be used in (4B.13c) to get

\[ |K| \le 2\pi\sigma_{\max}\chi_{\max}\Bigl[ \theta_{b\max}\theta_{d\max} + \frac{\theta_{d\max}^2}{2} \Bigr] + O(\sigma_{\max}\chi_{\max}\theta_{b\max}^4) . \tag{4B.13d} \]

We have already seen from the discussion following Eq. (4B.5c) that

\[ 2\pi\sigma_{\max}\chi_{\max}\Bigl[ \theta_{b\max}\theta_{d\max} + \frac{\theta_{d\max}^2}{2} \Bigr] = 2\pi\sigma_{\max}\chi_{\max}\theta_{b\max}\theta_{d\max} + \pi\sigma_{\max}\chi_{\max}\theta_{d\max}^2 \]

can be neglected. We note that substitution from (4B.2a) and (4B.2d) gives

\[ O(\sigma_{\max}\chi_{\max}\theta_{b\max}^4) = O(\theta_{b\max}^2) \le 10^{-4} , \]

which, according to (4B.1c), is small enough to neglect. Hence everything on the right-hand side
of (4B.13d) is small enough to neglect, which means $K$ can be dropped from (4B.13b), making
Eq. (4B.13a) a good approximation. From the definition of $\vec{\varepsilon}$ in Eq. (4.54c), we can rewrite
(4.54a) to get, after applying Eq. (4.135f), that

\[ \vec{\varepsilon} = \hat{\Omega} - \hat{z}\sqrt{1 - \varepsilon^2} = \hat{\Omega} - \hat{z}\cos\theta_b . \]

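From Eq. (4.126b) and (4B.4d), we get

\[ \vec{\Delta} = 2(\hat{n}_M - \hat{z}) = 2\vec{\gamma} . \]

These two formulas together with Eqs. (4.102a) and (4.102d) in Sec. 4.13 lead to

\[
\begin{aligned}
-2\pi\sigma\chi\sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b - 2\vec{\gamma})^2}
&= -2\pi\sigma\chi\sqrt{1 - (\vec{\varepsilon} - \vec{\Delta})^2} \\
&= 2\pi\,\frac{w}{c}\,\chi\sqrt{1 - \Bigl( -\frac{c}{w}\,\vec{u} - \vec{\Delta} \Bigr)^2} \\
&= 2\pi\,\frac{w}{c}\,\chi\sqrt{1 - \frac{c^2}{w^2}\Bigl( \vec{u} + \frac{w}{c}\,\vec{\Delta} \Bigr)^2}
\end{aligned}
\]

and

\[ -2\pi\sigma\chi\sqrt{1 - (\hat{\Omega} - \hat{z}\cos\theta_b)^2} = 2\pi\,\frac{w}{c}\,\chi\sqrt{1 - \frac{c^2}{w^2}\,(\vec{u})^2} . \]

The phase approximation used in (4B.13a) now becomes, written in terms of $w$, $c$, $\vec{u}$, and $\vec{\Delta}$,

\[
e^{2\pi i\,\frac{w}{c}\,\chi\sqrt{1 - \frac{c^2}{w^2}\left( \vec{u} + \frac{w}{c}\vec{\Delta} \right)^2}}
\cong
e^{2\pi i\,\frac{w}{c}\,\chi\sqrt{1 - \frac{c^2 u^2}{w^2}}} .
\tag{4B.13e}
\]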
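The bound on $K$ in (4B.13d) can also be checked by evaluating the square roots directly instead of expanding them. The sketch assumes the extreme values $\theta_{b\max} = 10^{-2}$, $\theta_{d\max} = 10^{-4}$, and $\sigma_{\max}\chi_{\max} = \theta_{b\max}^{-2}$; the helper name `root` is illustrative only.

```python
import math

theta_b_max = 1.0e-2                                  # (4B.2a)
theta_d_max = 1.0e-4                                  # (4B.4b)
two_pi_sigma_chi = 2.0 * math.pi / theta_b_max**2     # from (4B.2d)

def root(x):
    """sqrt(1 - x^2), the form that appears in the phase of (4B.13a)."""
    return math.sqrt(1.0 - x * x)

# Direct evaluation of the right-hand side of (4B.13c) at the two extremes.
k_worst = max(
    abs(two_pi_sigma_chi * (root(theta_b_max + s * theta_d_max) - root(theta_b_max)))
    for s in (+1.0, -1.0)
)

# Leading terms of (4B.13d), without the O(sigma*chi*theta_b^4) remainder.
k_leading = two_pi_sigma_chi * (theta_b_max * theta_d_max + theta_d_max**2 / 2.0)

print(f"direct evaluation: {k_worst:.6f}")
print(f"leading terms:     {k_leading:.6f}")
```

The two values agree to within the $10^{-4}$ remainder quoted in the text, and both are about 0.063.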


Appendix 4C
In this appendix, we apply the three-dimensional Wiener-Khinchin theorem explained in Sec.
3.24 of Chapter 3 to the random functions describing the radiation fields entering the
interferometer.
We specify function $\Pi(t, T)$ to be

\[
\Pi(t, T) =
\begin{cases}
1 & \text{for } |t| \le T \\
0 & \text{for } |t| > T
\end{cases} ,
\tag{4C.1a}
\]

and also define a two-dimensional version of this function to be $\Pi(\vec{\rho}; A) = \Pi(x, y; A)$ such that

\[
\Pi(\vec{\rho}; A) = \Pi(x, y; A) =
\begin{cases}
1 & \text{when point } \vec{\rho} = (x, y) \text{ lies inside or on the edge of the beam of cross-sectional area } A \\
0 & \text{when point } \vec{\rho} = (x, y) \text{ lies outside the beam of cross-sectional area } A
\end{cases} .
\tag{4C.1b}
\]

Function $\Pi(\vec{\rho}; A)$ can be thought of as a pupil function for the beam.75 Function $\Pi(t, T)$
specifies the one-dimensional measurement time for the beam, and function $\Pi(\vec{\rho}; A)$ specifies
the two-dimensional cross section of the beam as it passes through the interferometer.
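The two window functions translate directly into code. The sketch below takes the time window as $-T \le t \le T$, which is what the later reference to "the time interval between +T and −T" implies, and it assumes a circular beam centred on the $z$ axis so that the area $A$ fixes a radius. The circular shape and the function names are assumptions made here, not part of the text.

```python
import math

def window_time(t, T):
    """Pi(t, T) of Eq. (4C.1a): 1 inside the measurement interval, 0 outside."""
    return 1.0 if abs(t) <= T else 0.0

def window_beam(x, y, area):
    """Pi(rho; A) of Eq. (4C.1b), assuming a circular beam centred on the z axis."""
    radius = math.sqrt(area / math.pi)
    return 1.0 if math.hypot(x, y) <= radius else 0.0

# A beam of area pi has unit radius under the circular-beam assumption.
print(window_time(0.5, 1.0), window_time(1.5, 1.0))
print(window_beam(0.5, 0.5, math.pi), window_beam(2.0, 0.0, math.pi))
```

A beam with a different cross-sectional shape would only change the membership test inside `window_beam`.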
We set up two random functions

\[ E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t) \quad\text{and}\quad E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t) \]

to represent, respectively, the x and y electric-field components at coordinate z of the radiation
beam entering the interferometer. The T, A subscripts show that the radiation fields are time-chopped
and beam-chopped (see Secs. 4.9, 4.10, and 4.14 for an explanation of what this means).
The three-dimensional autocorrelation functions in $\vec{\rho}$ and $t$ used in the three-dimensional
Wiener-Khinchin theorem are

\[ R_{xTA}(\vec{\rho}, t, \vec{\rho}\,', t', z) = \mathrm{E}\bigl( E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec{\rho}\,', z, t') \bigr) \tag{4C.2a} \]

and

\[ R_{yTA}(\vec{\rho}, t, \vec{\rho}\,', t', z) = \mathrm{E}\bigl( E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec{\rho}\,', z, t') \bigr) . \tag{4C.2b} \]

The T, A subscripts in R xTA , R yTA show that these are the autocorrelations of time-chopped and
beam-chopped radiation fields. The argument z is always unprimed because we want to compare

75
Joseph W. Goodman, Introduction to Fourier Optics (McGraw-Hill Inc., New York, 1988), p. 83; reissue of 1968
book.

the $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ variables at the same z coordinate along the beam. Because $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$
represent time-chopped and beam-chopped radiation fields, we know they cannot be
homogeneous and stationary in $\vec{\rho} = (x, y)$ and $t$ (see Secs. 3.15 and 3.24 in Chapter 3). When
either $t$ or $t'$ lies outside the time interval between $+T$ and $-T$, so that

\[ \Pi(t, T) = 0 \quad\text{or}\quad \Pi(t', T) = 0 , \]

we expect the product

\[ E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec{\rho}\,', z, t') \]

to be zero; and the same of course is true for the product

\[ E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec{\rho}\,', z, t') . \]

Consequently,

\[ \mathrm{E}\bigl( E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec{\rho}\,', z, t') \bigr) \quad\text{and}\quad \mathrm{E}\bigl( E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec{\rho}\,', z, t') \bigr) \]

should be zero when $\Pi(t, T) = 0$ or $\Pi(t', T) = 0$. Similarly, when either $\vec{\rho}$ or $\vec{\rho}\,'$ represents a point
outside the beam cross-section, so that

\[ \Pi(\vec{\rho}; A) = 0 \quad\text{or}\quad \Pi(\vec{\rho}\,'; A) = 0 , \]

we know that

\[ \mathrm{E}\bigl( E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec{\rho}\,', z, t') \bigr) \quad\text{and}\quad \mathrm{E}\bigl( E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec{\rho}\,', z, t') \bigr) \]

should be zero. Therefore, $R_{xTA}$ and $R_{yTA}$ cannot be written as

\[ R_{xTA}(\vec{\rho} - \vec{\rho}\,', t - t', z) \quad\text{or}\quad R_{yTA}(\vec{\rho} - \vec{\rho}\,', t - t', z) \]

as we would for the three-dimensional autocorrelations in $\vec{\rho}$ and $t$ of homogeneous and
stationary random functions. On the other hand, if the radiation field had not been time-chopped
and beam-chopped, we would expect $E^{(\mathrm{in})}_x$ and $E^{(\mathrm{in})}_y$ to follow the pattern of other radiation fields
in nature. When described by random variables, these fields are usually taken to be stationary in
time,76 and, since we have given the non-beam-chopped fields no preferred structure, it makes
sense to have them homogeneous in $\vec{\rho}$ also. We can therefore assume that the random functions

76
Handbook of Optics, edited by Michael Bass, Vol. I (McGraw-Hill Inc., New York, 1995), Chapter 4, page 4.2,
sponsored by the Optical Society of America.


$E^{(\mathrm{rad})}_x$ and $E^{(\mathrm{rad})}_y$ representing the radiation before it enters the interferometer are homogeneous
and stationary in $\vec{\rho}$ and $t$, with autocorrelation functions $R_x$ and $R_y$, which can be written as

\[ R_x(\vec{\rho} - \vec{\rho}\,', t - t', z) = \mathrm{E}\bigl( E^{(\mathrm{rad})}_x(\vec{\rho}, z, t)\, E^{(\mathrm{rad})}_x(\vec{\rho}\,', z, t') \bigr) \tag{4C.3a} \]

and

\[ R_y(\vec{\rho} - \vec{\rho}\,', t - t', z) = \mathrm{E}\bigl( E^{(\mathrm{rad})}_y(\vec{\rho}, z, t)\, E^{(\mathrm{rad})}_y(\vec{\rho}\,', z, t') \bigr) . \tag{4C.3b} \]

We also suppose the interferometer to be well designed, only minimally perturbing $E^{(\mathrm{rad})}_x$ and
$E^{(\mathrm{rad})}_y$ when turning them into $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ to create the time-chopped and beam-chopped
radiation fields entering the interferometer. This means we can assume that $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ are the
same as $E^{(\mathrm{rad})}_x$ and $E^{(\mathrm{rad})}_y$ well away from the boundaries of the beam in $\vec{\rho}$ and $t$. Hence, we can
make the approximations that

\[ E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t) \cong \Pi(t, T)\,\Pi(\vec{\rho}; A)\, E^{(\mathrm{rad})}_x(\vec{\rho}, z, t) \tag{4C.4a} \]

and

\[ E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t) \cong \Pi(t, T)\,\Pi(\vec{\rho}; A)\, E^{(\mathrm{rad})}_y(\vec{\rho}, z, t) . \tag{4C.4b} \]

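These approximations respect both our knowledge that $E_{xTA}^{(\mathrm{in})}$ and $E_{yTA}^{(\mathrm{in})}$ are negligible or zero
outside the time and cross-section boundaries of the beam and also our assumption that inside the
beam and during the time interval of the measurement $E_{xTA}^{(\mathrm{in})}$ and $E_{yTA}^{(\mathrm{in})}$ are little changed from the
$E_x^{(\mathrm{rad})}$ and $E_y^{(\mathrm{rad})}$ values they would have if they did not enter the interferometer. Substituting these
approximations for $E_{xTA}^{(\mathrm{in})}$ and $E_{yTA}^{(\mathrm{in})}$ into the right-hand sides of Eqs. (4C.2a) and (4C.2b) gives

$$\begin{aligned}
\mathrm{E}\!\left( E_{xTA}^{(\mathrm{in})}(\vec{\rho}, z, t)\, E_{xTA}^{(\mathrm{in})}(\vec{\rho}\,', z, t') \right)
&\cong \Pi(t, T)\, \Pi(t', T)\, \Pi(\vec{\rho}; A)\, \Pi(\vec{\rho}\,'; A)\, \mathrm{E}\!\left( E_x^{(\mathrm{rad})}(\vec{\rho}, z, t)\, E_x^{(\mathrm{rad})}(\vec{\rho}\,', z, t') \right) \\
&= \Pi(t, T)\, \Pi(t', T)\, \Pi(\vec{\rho}; A)\, \Pi(\vec{\rho}\,'; A)\, R_x(\vec{\rho} - \vec{\rho}\,', t - t', z)
\end{aligned} \tag{4C.4c}$$
and
$$\begin{aligned}
\mathrm{E}\!\left( E_{yTA}^{(\mathrm{in})}(\vec{\rho}, z, t)\, E_{yTA}^{(\mathrm{in})}(\vec{\rho}\,', z, t') \right)
&\cong \Pi(t, T)\, \Pi(t', T)\, \Pi(\vec{\rho}; A)\, \Pi(\vec{\rho}\,'; A)\, \mathrm{E}\!\left( E_y^{(\mathrm{rad})}(\vec{\rho}, z, t)\, E_y^{(\mathrm{rad})}(\vec{\rho}\,', z, t') \right) \\
&= \Pi(t, T)\, \Pi(t', T)\, \Pi(\vec{\rho}; A)\, \Pi(\vec{\rho}\,'; A)\, R_y(\vec{\rho} - \vec{\rho}\,', t - t', z) ,
\end{aligned} \tag{4C.4d}$$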


where we have used (4C.3a) and (4C.3b) in the final steps of these two equations. From the three-
dimensional Wiener-Khinchin theorem, we know that the Fourier transforms of Rx and Ry are the
two power spectra
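$$S_x(\vec{u}, z, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; R_x(\vec{\rho}, t, z)\, e^{-2\pi i(\vec{\rho}\cdot\vec{u} + wt)} \tag{4C.5a}$$
and
$$S_y(\vec{u}, z, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; R_y(\vec{\rho}, t, z)\, e^{-2\pi i(\vec{\rho}\cdot\vec{u} + wt)} . \tag{4C.5b}$$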

The Wiener-Khinchin theorem also states that the power spectra Sx and Sy are given by the limits

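$$S_x(\vec{u}, z, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T}\cdot\frac{1}{A}\cdot \mathrm{E}\!\left( \left| \mathcal{E}_{xTA}(\vec{u}, z, w) \right|^2 \right) \tag{4C.5c}$$

and
$$S_y(\vec{u}, z, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T}\cdot\frac{1}{A}\cdot \mathrm{E}\!\left( \left| \mathcal{E}_{yTA}(\vec{u}, z, w) \right|^2 \right) , \tag{4C.5d}$$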
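where the random functions $\mathcal{E}_{xTA}(\vec{u}, z, w)$ and $\mathcal{E}_{yTA}(\vec{u}, z, w)$ are defined to be the three-dimensional
forward Fourier transforms of

$$\Pi(t, T)\, \Pi(\vec{\rho}; A)\, E_x^{(\mathrm{rad})}(\vec{\rho}, z, t) \quad\text{and}\quad \Pi(t, T)\, \Pi(\vec{\rho}; A)\, E_y^{(\mathrm{rad})}(\vec{\rho}, z, t) ,$$

given by
$$\mathcal{E}_{xTA}(\vec{u}, z, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \Pi(t, T)\, \Pi(\vec{\rho}; A)\, E_x^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i(\vec{\rho}\cdot\vec{u} + wt)} \tag{4C.6a}$$
and
$$\mathcal{E}_{yTA}(\vec{u}, z, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \Pi(t, T)\, \Pi(\vec{\rho}; A)\, E_y^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i(\vec{\rho}\cdot\vec{u} + wt)} . \tag{4C.6b}$$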

In Eqs. (4C.5c) and (4C.5d), $\lim_{T\to\infty}$ is interpreted to be the limit as the time interval specified by
$\Pi(t, T)$ extends to cover all time, and $\lim_{A\to\infty}$ is interpreted to be the limit as the cross-sectional area
specified by $\Pi(\vec{\rho}; A)$ extends to cover the entire x, y plane. From the approximations in (4C.4a)
and (4C.4b), we have
$$\mathcal{E}_{xTA}(\vec{u}, z, w) \cong \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; E_{xTA}^{(\mathrm{in})}(\vec{\rho}, z, t)\, e^{-2\pi i(\vec{\rho}\cdot\vec{u} + wt)} \tag{4C.7a}$$
and


$$\mathcal{E}_{yTA}(\vec{u}, z, w) \cong \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; E_{yTA}^{(\mathrm{in})}(\vec{\rho}, z, t)\, e^{-2\pi i(\vec{\rho}\cdot\vec{u} + wt)} . \tag{4C.7b}$$

We compare these results to Eqs. (4.129a) and (4.129b) in this chapter to get
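$$\mathcal{E}_{xTA}(\vec{u}, z, w) \cong c\,w^{-2}\, \tilde{E}_{xTA}\!\left( -\frac{c\vec{u}}{w},\, z,\, -\frac{w}{c} \right) \tag{4C.8a}$$
and
$$\mathcal{E}_{yTA}(\vec{u}, z, w) \cong c\,w^{-2}\, \tilde{E}_{yTA}\!\left( -\frac{c\vec{u}}{w},\, z,\, -\frac{w}{c} \right) . \tag{4C.8b}$$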

Substituting these last two approximations in (4C.5c) and (4C.5d) gives

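$$S_x(\vec{u}, z, w) \cong \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T}\cdot\frac{1}{A}\cdot\frac{c^2}{w^4}\cdot \mathrm{E}\!\left( \left| \tilde{E}_{xTA}\!\left( -\frac{c\vec{u}}{w},\, z,\, -\frac{w}{c} \right) \right|^2 \right) \tag{4C.9a}$$
and
$$S_y(\vec{u}, z, w) \cong \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T}\cdot\frac{1}{A}\cdot\frac{c^2}{w^4}\cdot \mathrm{E}\!\left( \left| \tilde{E}_{yTA}\!\left( -\frac{c\vec{u}}{w},\, z,\, -\frac{w}{c} \right) \right|^2 \right) . \tag{4C.9b}$$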

As A, T get ever larger, the time-chopped and beam-chopped $E_{xTA}^{(\mathrm{in})}$ and $E_{yTA}^{(\mathrm{in})}$ inside the
interferometer more and more resemble the $E_x^{(\mathrm{rad})}$ and $E_y^{(\mathrm{rad})}$ random functions that would have
been present if the original radiation fields had not been modified by entering the interferometer.
This means that as A and T get larger, the approximations made in Eqs. (4C.4a) and (4C.4b)
become ever more exact, and so do the approximations made in Eqs. (4C.7a), (4C.7b), (4C.8a),
and (4C.8b). Concentrating on (4C.8a) and (4C.8b) in particular, we expect $\mathcal{E}_{xTA}$ and $\mathcal{E}_{yTA}$ to
resemble $c\,w^{-2}\tilde{E}_{xTA}$ and $c\,w^{-2}\tilde{E}_{yTA}$ ever more closely as A, T get large, turning the approximate
equalities in (4C.9a) and (4C.9b) into the exact equalities

$$S_x(\vec{u}, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T}\cdot\frac{1}{A}\cdot\frac{c^2}{w^4}\cdot \mathrm{E}\!\left( \left| \tilde{E}_{xTA}\!\left( -\frac{c\vec{u}}{w},\, z,\, -\frac{w}{c} \right) \right|^2 \right) \tag{4C.10a}$$
and
$$S_y(\vec{u}, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T}\cdot\frac{1}{A}\cdot\frac{c^2}{w^4}\cdot \mathrm{E}\!\left( \left| \tilde{E}_{yTA}\!\left( -\frac{c\vec{u}}{w},\, z,\, -\frac{w}{c} \right) \right|^2 \right) . \tag{4C.10b}$$



Tracing $\tilde{E}_{xTA}$ and $\tilde{E}_{yTA}$ back to their original definitions of $\tilde{E}_x$ and $\tilde{E}_y$ in Eqs. (4.98a) and
(4.98b), before they acquired their T, A subscripts, we recognize that $|\tilde{E}_{xTA}|^2$ and $|\tilde{E}_{yTA}|^2$ are no
longer functions of z, allowing us to drop z from the argument lists of $S_x$ and $S_y$.


Appendix 4D
We calculate here the two-dimensional Fourier transform of the pupil function
$$\Pi(\vec{\rho}; A) = \Pi(x, y; A) = \Pi(x, X)\cdot\Pi(y, Y) \tag{4D.1a}$$

for an interferometer beam with a (2X) × (2Y) rectangular cross section as well as the two-
dimensional Fourier transform of the pupil function

$$\Pi(\vec{\rho}; A) = \Pi(x, y; A) = \Pi\!\left( \sqrt{x^2 + y^2},\, R \right) \tag{4D.1b}$$

for an interferometer beam with a circular cross section of radius R. Function $\Pi(u, v)$ can be
thought of as a one-dimensional pupil function and is defined to be [see Eq. (4C.1a) in Appendix
4C]
$$\Pi(u, v) = \begin{cases} 1 & \text{for } |u| \le v \\ 0 & \text{for } |u| > v \end{cases} . \tag{4D.1c}$$
It can be distinguished from the two-dimensional $\Pi(\vec{\rho}; A)$ functions by the absence of a
semicolon in its argument list.
To evaluate the two-dimensional Fourier transform of the pupil function in (4D.1a), we write

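$$\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i(\vec{\rho}\cdot\vec{u})} = \int_{-\infty}^{\infty} dx\, \Pi(x, X)\, e^{\pm 2\pi i x u_x} \int_{-\infty}^{\infty} dy\, \Pi(y, Y)\, e^{\pm 2\pi i y u_y} . \tag{4D.2}$$

The one-dimensional integrals are straightforward. In x we have

$$\begin{aligned}
\int_{-\infty}^{\infty} \Pi(x, X)\, e^{\pm 2\pi i x u_x}\, dx
&= \int_{-X}^{X} \cos(2\pi x u_x)\, dx \pm i \int_{-X}^{X} \sin(2\pi x u_x)\, dx \\
&= \frac{\sin(2\pi u_x X)}{\pi u_x} = 2X\,\mathrm{sinc}(2\pi u_x X) ,
\end{aligned} \tag{4D.3a}$$

where $\mathrm{sinc}(x) = \sin(x)/x$. The integral over y is identical, of course, so the final result is

$$\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i(\vec{\rho}\cdot\vec{u})} = 4XY\,\mathrm{sinc}(2\pi u_x X)\,\mathrm{sinc}(2\pi u_y Y) \tag{4D.3b}$$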


or, choosing the minus sign in the exponent of e to match the definition of $\Pi_A$ in Eq. (4.134c) of
this chapter,
$$\Pi_A(\vec{u}) = 4XY\,\mathrm{sinc}(2\pi u_x X)\,\mathrm{sinc}(2\pi u_y Y) . \tag{4D.3c}$$

This is the two-dimensional forward Fourier transform of the pupil function of an interferometer
beam with a (2X) × (2Y) rectangular cross section.
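
As a rough numerical cross-check of Eqs. (4D.3a) and (4D.3c), the sketch below integrates the one-dimensional pupil transform with a simple midpoint rule and compares the product of the x and y results with the closed form. The half-widths, spatial frequencies, and function names are illustrative assumptions, not values taken from the text, and the minus sign in the exponent is used.

```python
import cmath
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x, as defined after Eq. (4D.3a)."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def rect_ft_1d_numeric(u, half_width, steps=20000):
    """Midpoint-rule estimate of the integral of exp(-2*pi*i*x*u) over [-X, X]."""
    dx = 2.0 * half_width / steps
    total = 0j
    for k in range(steps):
        x = -half_width + (k + 0.5) * dx
        total += cmath.exp(-2j * math.pi * x * u)
    return total * dx


def rect_ft_2d_closed(ux, uy, half_x, half_y):
    """Closed form 4XY sinc(2*pi*ux*X) sinc(2*pi*uy*Y) of Eq. (4D.3c)."""
    return (4.0 * half_x * half_y
            * sinc(2.0 * math.pi * ux * half_x)
            * sinc(2.0 * math.pi * uy * half_y))


# Hypothetical beam half-widths and spatial frequencies (arbitrary units).
X, Y = 1.5, 0.75
UX, UY = 0.4, 1.1

numeric = rect_ft_1d_numeric(UX, X) * rect_ft_1d_numeric(UY, Y)
closed = rect_ft_2d_closed(UX, UY, X, Y)
print(abs(numeric - closed))  # small if the closed form is right
```

The imaginary part of the numerical result cancels by symmetry, which is the same cancellation that removes the sine integral in Eq. (4D.3a).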
To evaluate the two-dimensional Fourier transform of the pupil function in (4D.1b), we write

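$$\begin{aligned}
\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i \vec{\rho}\cdot\vec{u}}
&= \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; \Pi\!\left( \sqrt{x^2 + y^2},\, R \right) e^{\pm 2\pi i(x u_x + y u_y)} \\
&= \int_0^R \rho\, d\rho \int_0^{2\pi} d\theta\; e^{\pm 2\pi i \rho u(\cos\theta\cos\phi + \sin\theta\sin\phi)} ,
\end{aligned} \tag{4D.4a}$$

where in the last step the variables of integration have been transformed using

$$\begin{aligned}
\rho &= \sqrt{x^2 + y^2} , & u &= \sqrt{u_x^2 + u_y^2} = |\vec{u}| , \\
x &= \rho\cos\theta , & u_x &= u\cos\phi , \\
y &= \rho\sin\theta , & u_y &= u\sin\phi .
\end{aligned} \tag{4D.4b}$$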
We note that

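$$\begin{aligned}
e^{\pm 2\pi i \rho u(\cos\theta\cos\phi + \sin\theta\sin\phi)} &= e^{\pm 2\pi i \rho u \cos(\theta - \phi)} \\
&= \cos\big( 2\pi\rho u \cos(\theta - \phi) \big) \pm i \sin\big( 2\pi\rho u \cos(\theta - \phi) \big) .
\end{aligned} \tag{4D.5a}$$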

From the Handbook of Mathematical Functions, we know that77


$$\cos(z\cos\theta) = J_0(z) + 2\sum_{k=1}^{\infty} (-1)^k J_{2k}(z)\cos(2k\theta) \tag{4D.5b}$$
and
$$\sin(z\cos\theta) = 2\sum_{k=0}^{\infty} (-1)^k J_{2k+1}(z)\cos\big( (2k+1)\theta \big) , \tag{4D.5c}$$

77
See Eqs. (9.1.44) and (9.1.45) in Handbook of Mathematical Functions, edited by Milton Abramowitz and Irene
Stegun (National Bureau of Standards, Applied Mathematics Series 55, November 1964), p. 361.


where J n ( z ) is a Bessel function of the first kind of order n, with n = 0,1, 2,… , and z a non-
negative real number. Using Eqs. (4D.5a)–(4D.5c), we find that


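$$\begin{aligned}
\int_0^{2\pi} d\theta\; e^{\pm 2\pi i \rho u(\cos\theta\cos\phi + \sin\theta\sin\phi)}
&= \int_0^{2\pi} \cos\big( 2\pi\rho u\cos(\theta - \phi) \big)\, d\theta \pm i \int_0^{2\pi} \sin\big( 2\pi\rho u\cos(\theta - \phi) \big)\, d\theta \\
&= 2\pi J_0(2\pi\rho u) + 2\sum_{k=1}^{\infty} (-1)^k J_{2k}(2\pi\rho u) \int_0^{2\pi} \cos\big( 2k(\theta - \phi) \big)\, d\theta \\
&\quad \pm 2i \sum_{k=0}^{\infty} (-1)^k J_{2k+1}(2\pi\rho u) \int_0^{2\pi} \cos\big( (2k+1)(\theta - \phi) \big)\, d\theta .
\end{aligned}$$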

The integrals over the cosine are clearly zero, because in each one the cosine is being integrated
over an integral number of periods. Hence,


$$\int_0^{2\pi} d\theta\; e^{\pm 2\pi i \rho u(\cos\theta\cos\phi + \sin\theta\sin\phi)} = 2\pi J_0(2\pi\rho u) . \tag{4D.5d}$$

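Substituting this result back into (4D.4a) gives, changing the variable of integration to
$\rho' = 2\pi\rho u$,

$$\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i \vec{\rho}\cdot\vec{u}} = 2\pi \int_0^R \rho\, J_0(2\pi\rho u)\, d\rho = \frac{1}{2\pi u^2} \int_0^{2\pi u R} \rho'\, J_0(\rho')\, d\rho' .$$

The Bessel identity78

$$\int_0^x z\, J_0(z)\, dz = x\, J_1(x)$$

now lets us write

$$\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho}; A)\, e^{\pm 2\pi i \vec{\rho}\cdot\vec{u}} = \frac{2\pi u R\, J_1(2\pi u R)}{2\pi u^2} = \frac{R}{u}\, J_1(2\pi u R) \tag{4D.6a}$$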

78
Joseph W. Goodman, Introduction to Fourier Optics, p. 16.

or, choosing the minus sign in the exponent of e to match the definition of $\Pi_A$ in Eq. (4.134c) of
this chapter,
$$\Pi_A(\vec{u}) = \frac{R}{|\vec{u}|}\, J_1(2\pi |\vec{u}| R) , \tag{4D.6b}$$

where now $\Pi_A$ is the two-dimensional forward Fourier transform of the pupil function of an
interferometer beam with a circular cross section of radius R.
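
The circular-pupil result can be spot-checked numerically in the same spirit. The sketch below is only an illustration under stated assumptions: it evaluates the Bessel functions from Bessel's integral, $J_n(x) = \frac{1}{\pi}\int_0^{\pi}\cos(n\tau - x\sin\tau)\,d\tau$, rather than from a library, and the radius and spatial-frequency values are hypothetical. It compares the radial integral $2\pi\int_0^R \rho J_0(2\pi\rho u)\,d\rho$ with the closed form $(R/u)J_1(2\pi uR)$ of Eq. (4D.6a).

```python
import math


def bessel_j(order, x, steps=200):
    """J_n(x) for integer n via Bessel's integral, evaluated by the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += math.cos(order * tau - x * math.sin(tau))
    return total * h / math.pi


def circular_pupil_ft_numeric(u, radius, steps=2000):
    """Midpoint estimate of 2*pi * integral_0^R rho * J0(2*pi*rho*u) d(rho)."""
    d_rho = radius / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * d_rho
        total += rho * bessel_j(0, 2.0 * math.pi * rho * u)
    return 2.0 * math.pi * total * d_rho


def circular_pupil_ft_closed(u, radius):
    """Closed form (R/u) * J1(2*pi*u*R) of Eqs. (4D.6a) and (4D.6b)."""
    return radius / u * bessel_j(1, 2.0 * math.pi * u * radius)


# Hypothetical beam radius and spatial-frequency magnitude (arbitrary units).
R_BEAM = 1.2
U_MAG = 0.8

numeric_circ = circular_pupil_ft_numeric(U_MAG, R_BEAM)
closed_circ = circular_pupil_ft_closed(U_MAG, R_BEAM)
print(abs(numeric_circ - closed_circ))  # small if Eq. (4D.6a) holds
```

Agreement here also exercises the identity $\int_0^x zJ_0(z)\,dz = xJ_1(x)$, since the closed form depends on it.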


Appendix 4E
Snell’s law requires monochromatic plane waves entering a thick transparent slab or window to
change their angle of propagation. Figure 4E.1 uses a triplet of parallel rays to show this change,
and the angle variables specified there can be used to write Snell’s law as

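$$n_A \sin\theta_A = n_B \sin\theta_B , \tag{4E.1a}$$

where $\theta_A$ and $\theta_B$ are the angles the rays make with the slab normal outside and inside the slab,
$n_A$ is the index of refraction outside the slab, and $n_B$ is the index of refraction inside the
slab.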

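FIGURE 4E.1. [Side view: a triplet of parallel rays passing from the medium of index nA through the
slab of index nB, with the planes of constant phase drawn perpendicular to the rays.]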


The index of refraction of any transparent medium is here taken to be a real dimensionless ratio
of the monochromatic wave’s velocity in empty space to the same monochromatic wave’s
velocity inside the slab. The index of refraction of empty space is thus always one. The index of
refraction of air is extremely close to one, being just a little bit larger than one by an amount that
can usually be neglected when analyzing optical instruments. The indexes of refraction of the
transparent substances used to make interferometer windows, beam-splitter substrates, and
compensator plates are significantly larger than one and usually less than six or seven. If c is the
monochromatic wave’s velocity in empty space, v A is its velocity outside the slab in Fig. 4E.1,
and vB is its velocity inside the slab, then
$$n_A = \frac{c}{v_A} \quad\text{and}\quad n_B = \frac{c}{v_B} . \tag{4E.1b}$$

The wavelength of a monochromatic plane wave also changes inside the transparent slab [see,
for example, Fig. 1.6(b) in Chapter 1]. This effect is shown in Fig. 4E.1 by drawing the planes of
constant phase perpendicular to the triplet of rays as being more closely spaced inside the slab
than outside the slab. If λA is the wavelength outside the slab and λB is the wavelength inside the
slab, then
nAλA = nB λB . (4E.2a)

Because the wavenumber is one over the wavelength, this can also be written as

nAσ B = nBσ A , (4E.2b)

where σ A = 1/ λA and σ B = 1/ λB . Substituting Eq. (4E.1b) into (4E.2a) gives

$$\lambda_A \left( \frac{c}{v_A} \right) = \lambda_B \left( \frac{c}{v_B} \right)$$
or
$$\frac{\lambda_A}{v_A} = \frac{\lambda_B}{v_B} . \tag{4E.2c}$$

Remembering that the frequency of a monochromatic plane wave, according to Eq. (1.5) in
Chapter 1, satisfies the formula

wavelength · frequency = velocity ,

we note that the wavelength divided by the velocity must be one over the frequency. Therefore, Eq.
(4E.2c) requires the frequency of a monochromatic plane wave to be equal inside and outside the


slab in Fig. 4E.1. Another point worth mentioning here is that the index of refraction can be—and
usually is—a function of frequency when the monochromatic plane wave is propagating through
a transparent substance. Rearranging Eq. (4E.2a), we see that the ratio of the monochromatic
wavelengths inside and outside the slab shown in Fig. 4E.1 must be

$$\frac{\lambda_A}{\lambda_B} = \frac{n_B}{n_A} , \tag{4E.3}$$

and so can also depend on the plane wave’s frequency. This point is discussed in a general sort of
way in Sec. 1.1 of Chapter 1 when explaining why Michelson’s interferometer needed a
compensator plate to work correctly.
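
The relations in Eqs. (4E.1a) through (4E.3) can be tried out with a few lines of arithmetic. The sketch below is a minimal illustration, assuming a hypothetical slab index of 2.4 in air-like surroundings, a 10 µm vacuum wavelength, and a 45° angle of incidence; the function and variable names are illustrative and do not come from the text.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in empty space


def refraction_angle(n_a, n_b, theta_a):
    """Angle inside the slab from Snell's law, n_a sin(theta_a) = n_b sin(theta_b)."""
    s = n_a * math.sin(theta_a) / n_b
    if abs(s) > 1.0:
        raise ValueError("no transmitted plane wave (total internal reflection)")
    return math.asin(s)


def wavelength_inside(n_a, n_b, lambda_a):
    """Wavelength inside the slab from n_a * lambda_a = n_b * lambda_b, Eq. (4E.2a)."""
    return n_a * lambda_a / n_b


# Hypothetical values, chosen only for illustration.
N_A, N_B = 1.0, 2.4
THETA_A = math.radians(45.0)
LAMBDA_A = 10.0e-6  # metres

theta_b = refraction_angle(N_A, N_B, THETA_A)
lambda_b = wavelength_inside(N_A, N_B, LAMBDA_A)

# Frequency = velocity / wavelength, with velocity = c / n from Eq. (4E.1b).
freq_outside = (C_M_PER_S / N_A) / LAMBDA_A
freq_inside = (C_M_PER_S / N_B) / lambda_b

print(math.degrees(theta_b), lambda_b, freq_outside, freq_inside)
```

With these inputs the ray bends toward the normal, the wavelength shortens by the factor $n_A/n_B$, and the two frequencies agree, as Eq. (4E.2c) requires.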
Figure 4E.2(a) shows two monochromatic plane waves propagating through a Michelson
interferometer’s compensator plate. The interferometer’s optical axis passes horizontally through
the compensator plate, parallel to the ray showing the direction of propagation of the on-axis
wave incident on the plate. The direction of propagation of the off-axis wave incident on the plate
has a slight downward component. The solid ray paths show the change in direction of the on-
axis and off-axis plane waves inside the plate as well as the way Snell’s law requires both types
of wave to revert to their incident propagation directions when leaving the plate. The short, solid
lines perpendicular to and crossing through the rays show the planes of constant phase of the
monochromatic waves, with the distance between equivalent planes being much less inside the
plate than outside the plate. This distance can be regarded as a proxy for the wavelength if we are
careful to remember that the diagram would not then be drawn to scale—the typical wavelength
of these infrared plane wavefields is 1000 to 10,000 times less than the thickness of a typical
interferometer’s compensator plate. The dashed rays with the dashed lines of constant phase show
how the incident monochromatic waves would have propagated had the compensator plate not
been present.
In Fig. 4E.2(a), the on-axis plane wave travels a distance p1 through the compensator plate
and the off-axis plane wave travels a distance p2 through the compensator plate. The substances
used to make compensator plates and beam-splitter substrates can absorb significant amounts of
power from propagating wavefields, with the amount of absorbed power depending linearly on
the distance traveled inside the substance. In a well-designed interferometer, the plane waves
propagating in an off-axis direction are traveling at nearly the same angle of incidence with
respect to the compensator plate as are the plane waves propagating on axis, making p1 and p2
nearly equal. Hence, both types of plane wave lose about the same amount of power passing
through the compensator plate and so, to a good approximation, the amplitudes of both the on-
axis and off-axis monochromatic plane waves decrease by the same fractional amount, say $|\gamma|$.
The absolute value or magnitude signs here force $|\gamma|$ to be non-negative, but this is no problem
because we can take the plane-wave amplitudes before and after passage through the plate also to
be inherently non-negative.


FIGURE 4E.2(a). [Diagram labels: p1 (on-axis path) and p2 (off-axis path) through the compensator plate.]


The behavior of the wavefield for on-axis and oblique rays passing through the compensator plate is
shown schematically by short lines drawn perpendicularly to the rays’ direction of propagation. The
solid rays and lines show how the rays and wavefields actually behave while passing through the
compensator plate, and the dashed rays and lines show how the rays and wavefields would have
behaved had the compensator plate not been present. Although not drawn to scale—the wavelengths
are typically several orders of magnitude shorter than the width of the compensator plate—radiation
wavelengths do, as shown, get shorter inside the compensator plate, which means the solid rays’
wavefields are very unlikely to match up exactly to the dashed rays’ wavefields. Hence, there is almost
always a phase change of the wavefields compared to what they would have been had they not
passed through the compensator plate.


FIGURE 4E.2(b). [Diagram label: o1, the one-way on-axis path through the beam-splitter substrate.]

The behavior of the wavefield for an on-axis ray interacting with the beam splitter and its substrate can
be analyzed the same way the compensator plate was analyzed in Fig. 4E.2(a). Again, the solid rays
and lines show how the rays and wavefields actually behave, and the dashed rays and lines show how
the rays and wavefields would have behaved had the substrate not been present. Like the
compensator plate, the substrate is not drawn to scale—the wavelengths are made much too large
compared to the substrate’s width. Radiation wavelengths shorten inside the substrate just as they do
inside the compensator plate, so the solid wavefields do not match up exactly to the dashed
wavefields. This again produces a phase change in the wavefields compared to what they would have
been had the substrate not been present.

- 536 -
Appendix 4E

FIGURE 4E.2(c). [Diagram label: o2, the one-way off-axis path through the beam-splitter substrate.]

The behavior of the wavefield for an oblique ray interacting with the beam splitter and its substrate is
similar to that of an on-axis ray and wavefield [see Fig. 4E.2(b)]. Again, the solid rays and lines show
how the rays and wavefields actually behave while passing through the substrate, and the dashed rays
and lines show how the rays and wavefields would have behaved had the substrate not been present.
As before, radiation wavelengths shorten inside the substrate, so the solid wavefields do not match up
to the dashed wavefields. Just like for the on-axis ray, there is a phase change of the wavefields
compared to what they would have been had the substrate not been present.

- 537 -
4 · From Maxwell’s Equations to the Michelson Interferometer

Using the same notation as in the discussion following Eq. (4.35e) in Sec. 4.5 of this chapter, we
can write that
$$|A_\tau| = |\gamma| \cdot |A_i| , \tag{4E.4a}$$

where $A_i$ is a complex parameter standing for the complex amplitudes of any of the components
of the E or B field of the monochromatic plane wave and $A_\tau$ is another complex parameter
standing for the complex amplitudes of the corresponding E or B field components of the
monochromatic plane wave after it has passed through the slab. The value of $|\gamma|$ can change
significantly for different values of frequency; we can allow for this by writing

$$|\gamma| = |\gamma(\sigma)| , \tag{4E.4b}$$

where σ is the wavenumber of the plane wave incident on the compensator plate. Because, as just
noted, $|\gamma|$ does not change significantly in a well-designed interferometer when comparing
on-axis and off-axis plane waves, there is no need to include a dependence on the angle of
incidence in Eq. (4E.4b).
When comparing a monochromatic plane wave entering the slab in Fig. 4E.2(a) to the same
monochromatic plane wave leaving the slab, we are analyzing a situation very similar to the
situation examined in Sec. 4.5 above—the only real difference is that in Sec. 4.5 we discuss what
happens to monochromatic plane waves passing through a thin slab or film and now we are
analyzing what happens when passing through a thick slab or window. Passage through a thin
slab or film produces a change in phase as well as a change in amplitude, and the same thing
happens in the passage through the thick slab in Fig. 4E.2(a). In Fig. 4E.2(a) there are short
dashed lines representing what the planes of constant phase in the monochromatic wave would be
if the slab were absent. Comparing these to the short solid lines showing the actual position of the
planes of constant phase after the wave leaves the slab, we note that in both the on-axis and off-
axis cases they fail to match up. This comes from the shortening of the wavefield’s wavelength
inside the slab. Even though there is only a slight difference in the p1 and p2 distances covered by
the on-axis and off-axis waves, the on-axis phase change is much different from the off-axis
phase change because the wavefields’ wavelengths are so much shorter than the width of the slab.
This means that even though p1 and p2 are almost equal, their difference is still large compared to
a wavelength.
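
A short calculation makes the last point concrete. The sketch below is a rough illustration under assumed values (a 10 mm plate of index 2.4, a 5 µm vacuum wavelength, and incidence angles of 30° and 30.5°), none of which come from the text. It compares only the in-plate path lengths p1 and p2, using p = d / cos(θ_B) with θ_B from Snell's law; it does not compute the full optical path difference.

```python
import math


def path_in_plate(thickness, n_plate, incidence_deg, n_outside=1.0):
    """Geometric path length through a flat plate for a given angle of incidence."""
    sin_inside = n_outside * math.sin(math.radians(incidence_deg)) / n_plate
    theta_inside = math.asin(sin_inside)
    return thickness / math.cos(theta_inside)


# Hypothetical values, all lengths in micrometres.
THICKNESS_UM = 10_000.0
N_PLATE = 2.4
VACUUM_WAVELENGTH_UM = 5.0

p1 = path_in_plate(THICKNESS_UM, N_PLATE, 30.0)   # on-axis wave
p2 = path_in_plate(THICKNESS_UM, N_PLATE, 30.5)   # slightly off-axis wave

wavelength_in_plate = VACUUM_WAVELENGTH_UM / N_PLATE
relative_difference = (p2 - p1) / p1
difference_in_wavelengths = (p2 - p1) / wavelength_in_plate

print(relative_difference, difference_in_wavelengths)
```

For these inputs the two paths differ by well under one percent of p1, yet the difference still spans more than one in-plate wavelength, so the two phase shifts need not resemble each other.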
Figure 4E.2(b) shows an on-axis monochromatic wavefield reflecting off and transmitting
through the beam splitter and beam-splitter substrate. The one-way distance through the substrate
is called o1. Figure 4E.2(c) shows an off-axis monochromatic wave reflecting off and
transmitting through the beam splitter and its substrate; here, the one-way distance through the
substrate is called o2. Just like in Fig. 4E.2(a), the off-axis ray is only slightly off-axis because in
well-designed interferometers only slightly off-axis plane waves are allowed to pass through the


instrument. Hence, o1 and o2, just like p1 and p2, are almost equal. The compensator plate is
made from the same material—and is designed with the same thickness and orientation—as the
beam-splitter substrate, so the same value of $|\gamma|$ that is used for the compensator plate can be
used to describe the one-way passage through the beam-splitter substrate of the on-axis and
slightly off-axis monochromatic plane waves. Just like in Eq. (4E.4b), we expect $|\gamma|$ to be a
function only of σ because the loss of power is about the same for both the on-axis and slightly
off-axis waves. Figures 4E.2(b) and 4E.2(c) also show that, just like in Fig. 4E.2(a), the on-axis
and off-axis monochromatic waves can undergo significantly different phase shifts after passing
through the beam-splitter substrate. Again, this is due to the wavelength being so short compared
to the thickness of the slab, which is in this case the beam-splitter substrate.
Section 4.5 of this chapter explains how to show that a monochromatic plane wavefield

$$A e^{2\pi i \sigma(z - ct)}$$

traveling along the z axis has undergone both a phase shift and a change in amplitude: just
multiply by a complex constant. The magnitude of the constant changes the wavefield’s
amplitude and the complex phase angle of the constant changes the wavefield’s phase, shifting
the position of the planes of constant phase from where they would be if the multiplication did
not occur. This happens no matter what direction in space is taken to be the z axis, that is, no
matter what the direction of propagation of the plane wavefield. We have already chosen $|\gamma(\sigma)|$
to be the magnitude specifying how much the amplitude of the plane wave changes when it goes
through the compensator plate, and now nothing stops us from taking γ to be a complex number

$$\gamma = |\gamma|\, e^{i \arg(\gamma)} , \tag{4E.5}$$

where the complex phase angle (that is, the argument of the γ complex value) is chosen so that
multiplying by the complex γ correctly modifies both the phase and the amplitude of the plane
wave. Now taking the z axis to lie along either the on-axis or the off-axis ray in Fig. 4E.2(a), we
know that if
$$A e^{2\pi i \sigma(z - ct)}$$

represents any E field or B field component of the monochromatic plane wave incident on the
compensator plate, then
$$\gamma A e^{2\pi i \sigma(z - ct)}$$

must represent the corresponding E or B component after the monochromatic plane wave has
passed once through the compensator plate.
When analyzing transmission through a thin film in Sec. 4.5 of this chapter, we distinguish


between s-type wavefields where the E field is perpendicular to the plane of incidence and p-type
wavefields where the E field is parallel to the plane of incidence. Following the same pattern
here, we say that there is a $\gamma_s$ complex parameter specifying how s-type waves transmit through
the compensator plate and a $\gamma_p$ complex parameter specifying how p-type waves transmit
through the compensator plate.
We have already noted that the phase shift undergone by a monochromatic plane wave
passing through the compensator plate depends sensitively on the path taken through the plate;
even small differences in p1 and p2 in Fig. 4E.2(a) can lead to significantly different phase shifts.
Hence, even though $|\gamma_{s,p}|$ does not depend sensitively on the monochromatic plane wave’s angle
of incidence on the compensator plate so that for both on-axis and slightly off-axis plane waves
$|\gamma_{s,p}|$ can be taken to depend only on the wavenumber as shown in Eq. (4E.4b), the same cannot
be said about the complex phase angle or argument of $\gamma_{s,p}$. It follows that for a well-designed
standard interferometer,

$$|\gamma_{s,p}| = \text{function only of the incident wavenumber } \sigma \tag{4E.6a}$$
but
$$\arg(\gamma_{s,p}) = \text{function of both the angle of incidence and the incident wavenumber } \sigma \tag{4E.6b}$$
so that
$$\gamma_{s,p} = \text{function of both the angle of incidence and the incident wavenumber } \sigma \tag{4E.6c}$$

Multiplying a plane wavefield by a complex parameter is also a good way to show what
happens to the wavefield when it passes once through the beam-splitter substrate before reflecting
off or transmitting through the beam-splitter film, and similarly a complex parameter can be used
to show what happens to the wavefield when it passes back through the beam-splitter substrate
after reflecting from the beam-splitter film. The above discussion of Figs. 4E.2(b) and 4E.2(c)
shows that Eqs. (4E.6a)–(4E.6c) still hold true when $\gamma_{s,p}$ is taken to be a complex parameter
describing one passage, before or after reflection, through the beam-splitter substrate. The only
caveat, of course, is that the angle of incidence (see Fig. 4E.1) is now taken to be the angle at
which the monochromatic plane wave meets the combined substrate-and-film beam-splitter optical
element shown in Figs. 4E.2(b) and 4E.2(c). A little thought shows that rules (4E.6a)–(4E.6c) must
in fact have a still wider application: if they hold true for any two complex parameters $\gamma_A$ and
$\gamma_B$, then they must also hold true for their complex product

$$\gamma = \gamma_A \gamma_B .$$
It is easy to see why; we just note that


$$|\gamma| = |\gamma_A|\, |\gamma_B|$$
and
$$\arg(\gamma) = \arg(\gamma_A) + \arg(\gamma_B) .$$

Hence $|\gamma|$ must be a function only of σ, not depending significantly on the angle of incidence,
because it is the product of functions for which this is true; and similarly arg(γ) must depend on
both σ and the angle of incidence because it is the sum of functions for which this is true. This
same reasoning can in fact be extended to conclude that rules (4E.6a)–(4E.6c) must hold true for
all complex products $\gamma_{s,p}$ representing any number of passages through any combination of the
compensator plate and beam-splitter substrate.
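
The product rule used above is easy to see with ordinary complex arithmetic. The following sketch is illustrative only: the two γ values, the wavenumber, and the helper name `plane_wave` are assumptions made for the example. It multiplies a plane wave of the form $A e^{2\pi i\sigma(z-ct)}$ by two complex transmission parameters in turn and confirms that the magnitudes multiply while the arguments add.

```python
import cmath
import math


def plane_wave(amplitude, sigma, z, ct):
    """Complex plane-wave component A * exp(2*pi*i*sigma*(z - ct))."""
    return amplitude * cmath.exp(2j * math.pi * sigma * (z - ct))


# Hypothetical transmission parameters: (magnitude, phase angle in radians).
gamma_a = cmath.rect(0.95, 1.3)
gamma_b = cmath.rect(0.90, -2.1)
gamma_total = gamma_a * gamma_b

# Hypothetical wave: sigma in cm^-1, z and ct in cm.
incident = plane_wave(1.0, 1000.0, 0.25, 0.1)
transmitted = gamma_total * incident

print(abs(gamma_total), cmath.phase(gamma_total))
print(abs(transmitted) / abs(incident), cmath.phase(transmitted / incident))
```

The phase angles here were chosen so that their sum stays inside (−π, π]; in general the sum of arguments is only defined modulo 2π.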
Since the complex phase angles of the γ parameters describing the compensator plate and
beam-splitter substrate depend sensitively on the angle of incidence, we should examine how the
angle of incidence of a plane wave changes as it passes through the interferometer.
Most textbooks on elementary optics describe a simple procedure for analyzing the geometry
of rays reflecting off mirrors and other types of specular surfaces—they recommend the
construction of a mirror-image virtual world on the other side of the mirror or specularly
reflecting surface. Figure 4E.3(a) shows how this works for the elementary case of rays leaving a
chair and then specularly reflecting into an observer’s eye. For each ray entering the observer’s
eye, there is a corresponding direction at which the ray originally left the chair, as shown by the
solid arrows in Fig. 4E.3(a). To find the direction at which a ray must leave the chair, we
construct a virtual world—in this case, a virtual chair—on the other side of the reflecting surface,
as shown by the dashed lines in Fig. 4E.3(a). The virtual chair is drawn point by point exactly the
same distance “behind” the mirror as the real chair is in front of the mirror. To find, for example,
the direction of ray Ar in coordinate system S such that it reflects off the mirror and enters the
observer’s eye as ray A, we just draw a straight dashed Ar′ extension of the ray back to the dashed
virtual chair on the other side of the mirror and look at the direction of Ar′ in the virtual S ′
coordinate system.
Figure 4E.3(b) shows how to analyze optical configurations by constructing virtual objects on
the virtual side of all the specularly reflecting surfaces. The plane wave represented by the Z
triplet of rays drawn with solid arrows in Fig. 4E.3(b) reflects first off mirror M1, then reflects off
mirror M2 and into the transparent slab T. Using the same procedure as in Fig. 4E.3(a), we
construct T1′ and M 2′ , the dashed virtual images of T and M2 on the far side of M1. Just like
before, the direction at which the rays approach M2 can be found by extending the Z triplet of rays
as dashed straight arrows onto the virtual M 2′ surface. To analyze and “reflect” the virtual rays
off M 2′ , we construct T12′′ , a dash-dot virtual image of T1′ on the far side of the virtual mirror M 2′ .


FIGURE 4E.3(a). [Diagram labels: real chair C and virtual chair C′; coordinate systems S and S′;
rays Ar and Ar′.]


FIGURE 4E.3(b). [Diagram labels: mirrors M1 and M2; transparent slab T; ray triplet Z; virtual
images M2′, T1′, T2′, and T12′′.]

The direction of the Z rays at the true transparent slab T can now be found by extending the
dashed arrows as dash-dot arrows onto the virtual T12′′ transparent slab. (The symmetry of the
situation, by the way, shows that T12′′ can also be constructed as the virtual image in M1 of the
virtual image T2′ on the other side of M2.) The collection of surfaces M1, M 2′ , and T12′′ together
with the solid, dashed, and dash-dot Z rays is sometimes called a tunnel diagram. Tunnel
diagrams can be a convenient way to keep track of the angle of incidence of plane waves
propagating through a collection of specularly reflecting flat surfaces.
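
The bookkeeping behind a tunnel diagram can be sketched with two-dimensional vectors. The example below is a simplified illustration, assuming arbitrary orientations for the two mirrors and the slab, and it tracks only directions and surface normals, not positions. It follows the construction described above: the slab and second mirror are imaged in M1, the imaged slab is imaged again in the virtual second mirror, and the undeviated ray is then compared with the twice-reflected real ray.

```python
import math


def unit(angle_deg):
    """Unit vector at the given angle from the x axis."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))


def reflect(v, n):
    """Mirror image of vector v in a flat mirror with unit normal n."""
    dot = v[0] * n[0] + v[1] * n[1]
    return (v[0] - 2.0 * dot * n[0], v[1] - 2.0 * dot * n[1])


def incidence_angle(direction, normal):
    """Angle between a ray direction and a surface normal, in the range 0..pi/2."""
    dot = abs(direction[0] * normal[0] + direction[1] * normal[1])
    norm = math.hypot(*direction) * math.hypot(*normal)
    return math.acos(min(1.0, dot / norm))


# Hypothetical geometry: incoming ray direction and unit normals of M1, M2, and T.
ray = unit(0.0)
n_m1 = unit(20.0)
n_m2 = unit(100.0)
n_t = unit(65.0)

# Real path: reflect the ray off M1, then off M2, then meet the real slab T.
real_ray = reflect(reflect(ray, n_m1), n_m2)
real_angle = incidence_angle(real_ray, n_t)

# Tunnel diagram: leave the ray straight and image the surfaces instead.
n_t1 = reflect(n_t, n_m1)        # T1', image of T in M1
n_m2_virtual = reflect(n_m2, n_m1)  # M2', image of M2 in M1
n_t12 = reflect(n_t1, n_m2_virtual)  # T12'', image of T1' in M2'
tunnel_angle = incidence_angle(ray, n_t12)

print(math.degrees(real_angle), math.degrees(tunnel_angle))
```

The two angles agree because a mirror reflection preserves the angle between any two vectors, which is what lets the straight-line rays of a tunnel diagram stand in for the folded real rays.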
Figures 4E.4(a) and 4E.4(b) are tunnel diagrams for the A triplet of rays propagating through a
standard Michelson interferometer. The A rays represent a monochromatic plane wave A
propagating through the instrument in a slightly off-axis direction. Figure 4E.4(a) is the tunnel
diagram for the interferometer arm without the compensator plate. Here the dashed slab Sb and
mirror M ′ come from constructing a virtual Sa and M on the other side of the beam splitter’s
thin, partially reflecting film; and the dash-dot Sc slab is then the virtual representation of Sb in
mirror M ′ . The dashed and dash-dot virtual extensions of the A rays show what the angle of


incidence of plane wave A must be for its three passes through Sa while following the path of the
solid rays in Fig. 4E.4(a). In this tunnel diagram, we characterize the passage of slightly off-axis
plane waves through Sa, Sb, and Sc by complex parameters $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, and $\gamma_{s,p}^{(c)}$ respectively. For s-
type plane waves, the s subscript is chosen and for p-type plane waves the p subscript is chosen.
This is, of course, the same thing as saying that the $\gamma_{s,p}^{(a)}$ characterize the first passage of s-type
and p-type plane waves through the beam-splitter substrate before reflection off the beam-splitter
film, that the $\gamma_{s,p}^{(b)}$ characterize the second passage of s-type and p-type plane waves through the
beam-splitter substrate after reflection off the beam-splitter film, and that the $\gamma_{s,p}^{(c)}$ characterize the
third s-type or p-type passage through the beam-splitter substrate after reflection off mirror M.
We note that it is important to distinguish between these three passages because, as shown by the
tunnel diagram, the angle of incidence corresponding to $\gamma_{s,p}^{(c)}$ must be slightly different from the
angle of incidence corresponding to $\gamma_{s,p}^{(a)}$ for slightly off-axis plane waves; and, of course, the $\gamma_{s,p}^{(b)}$
is allowed to be different from $\gamma_{s,p}^{(a)}$ and $\gamma_{s,p}^{(c)}$ because it characterizes the reverse passage through
the beam-splitter substrate.

FIGURE 4E.4(a). This is a tunnel diagram for the interferometer arm without the compensator plate.
[Diagram labels: ray triplet A; slabs Sa, Sb, Sc with parameters $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, $\gamma_{s,p}^{(c)}$; mirror M and
virtual mirror M′; virtual optical axis.]


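FIGURE 4E.4(b). This is the tunnel diagram for the interferometer arm with the compensator plate.
[Diagram labels: slabs Sa′, Sb′, Sc′ with parameters $\gamma_{s,p}^{(a)\prime}$, $\gamma_{s,p}^{(b)\prime}$, $\gamma_{s,p}^{(c)\prime}$; mirror M′; optical axis
and virtual optical axis.]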

Figure 4E.4(b) is the tunnel diagram for the interferometer arm with the compensator plate; it
is simpler than the tunnel diagram in Fig. 4E.4(a) because it uses the virtual images
corresponding to one rather than two specularly reflecting surfaces. Slab Sa′ represents the
beam-splitter substrate in Fig. 4E.4(b); it must have the same shape and orientation as Sa in Fig. 4E.4(a)
because it represents the same block of substrate material. Hence, for any monochromatic plane
wave, on-axis or off-axis, we must have

$$\gamma_{s,p}^{(a)} = \gamma_{s,p}^{(a)\prime} , \tag{4E.7a}$$

where $\gamma_{s,p}^{(a)\prime}$ are the complex parameters specifying an s-type or p-type plane wave’s passage
through slab Sa′ in Fig. 4E.4(b) and $\gamma_{s,p}^{(a)}$ are the complex parameters in Fig. 4E.4(a) that have
been defined in the previous paragraph to specify an s-type or p-type plane wave’s first passage
through slab Sa. Clearly, since passage through slab Sa′ is just another name for the same event as
passage through slab Sa, Eq. (4E.7a) is trivially true; in fact, for this reason it makes sense to drop
the primes from the $\gamma_{s,p}^{(a)\prime}$ parameters, assuming them to be always the same as the $\gamma_{s,p}^{(a)}$
parameters. We next note that if the compensator plate in Fig. 4E.4(b) has the same shape as the
substrate slab and it is given the appropriate orientation parallel to the substrate slab, then the
angle of passage of any on-axis or slightly off-axis plane wave through slab Sb in Fig. 4E.4(a) is
the same as the angle of passage of that same plane wave through slab Sb′ in Fig. 4E.4(b).
Consequently, we can regard the angle of incidence at which monochromatic plane waves
approach Sb to be the same as the angle of incidence at which the monochromatic plane waves
approach Sb′, which means that if the plane waves have the same wavenumber then

$$\gamma_{s,p}^{(b)} \cong \gamma_{s,p}^{(b)\prime} , \tag{4E.7b}$$

where $\gamma_{s,p}^{(b)\prime}$ are the complex parameters in Fig. 4E.4(b) specifying an s-type or p-type plane
wave’s passage through slab Sb′, and $\gamma_{s,p}^{(b)}$ are the already-defined complex parameters in Fig.
4E.4(a) specifying an s-type or p-type plane wave’s passage through slab Sb. Even though we are
here saying that $\gamma_{s,p}^{(b)\prime}$ and $\gamma_{s,p}^{(b)}$ are only approximately equal because the compensator plate may
not be exactly matched in thickness and orientation to the moving-mirror arm’s second pass
through the beam-splitter substrate, it still makes sense to idealize the situation and drop the
primes, assuming that in a well-designed interferometer all the $\gamma_{s,p}^{(b)}$ complex parameters are
effectively the same. Finally, we examine the angles of incidence of the plane wave on slab Sc in
Fig. 4E.4(a) and on slab Sc′ in Fig. 4E.4(b). Even though the ray triplet hits the two slabs at
different places, the angles of incidence must always be the same for any on-axis or slightly
off-axis monochromatic plane wave. The plane wave incident on the virtual compensator plate Sc′ in
Fig. 4E.4(b) passes through the slab “in reverse” compared to Sb′; and, of course, the same
observation applies to Sc compared to Sb in Fig. 4E.4(a). So again, in a well-designed
interferometer, we know that the same compensator plate satisfying Eq. (4E.7b) can also satisfy

$$\gamma_{s,p}^{(c)} \cong \gamma_{s,p}^{(c)\prime} , \tag{4E.7c}$$

where $\gamma_{s,p}^{(c)\prime}$ are the complex parameters specifying a plane wave’s passage through slab Sc′ and
$\gamma_{s,p}^{(c)}$ are the previously defined complex parameters specifying a plane wave’s passage through
slab Sc. In this situation, the primed and unprimed complex parameters may be only
approximately equal not only due to a slightly mismatched compensator plate but also because
the moving mirror may be slightly out of alignment, changing the angle of incidence from what it
ought to be.
When the moving mirror is slightly out of alignment, we know that angle $\theta_d$ defined at the
beginning of Sec. 4.12 of this chapter is nonzero and can give rise to a slight change in the angle
of propagation for on-axis and off-axis plane waves propagating back down the moving mirror
arm of the interferometer and into the beam-splitter substrate for the third time. Angle $\theta_d$ is much
smaller than angle $\theta_b$, the typical angle at which a slightly off-axis plane wave propagates with
respect to the optical axis. According to the discussion associated with Eqs. (4E.6a)–(4E.6c), the
only reason to worry about the effect of slightly different angles of incidence on the complex γ
parameters associated with the beam-splitter substrate and compensator plate is that the phase,
but not the amplitude, of plane waves passing through these transparent slabs can depend
sensitively on the angle of passage. This change in the plane wave’s phase shows up as a change
in the complex phase angle or argument of the γ parameters and does not significantly affect the
value of their complex magnitudes $|\gamma|$. We now show that the typical $\theta_d$ angle is in fact small
enough to disregard its effect on the phase of the monochromatic plane waves, allowing us to
disregard its effect on the complex phase angle, and thus on the value, of the γ parameters. This
means in particular that even for typical nonzero $\theta_d$ values we can drop the primes from the $\gamma_{s,p}^{(c)\prime}$
complex parameters and assume that $\gamma_{s,p}^{(c)}$ and $\gamma_{s,p}^{(c)\prime}$ are always effectively the same parameters in
a well-designed instrument.
Figure 4E.5(a) shows the solid ray corresponding to a properly aligned plane wave
propagating down the z axis toward the origin of an x, y, z Cartesian coordinate system. The
hollow arrow going through the origin and lying in the x, z plane is the unit-normal vector of the
surface of the transparent slab corresponding to the beam-splitter substrate. The plane of
incidence of the solid ray is then of course the x, z plane of the coordinate system. The solid ray
makes an angle of incidence $\theta_A$ when it intersects the slab’s surface at the origin. This ray is
labeled as ray 1. It refracts into the slab as ray I, still lying in the x, z plane of incidence and
having an angle of refraction

$$\theta_B = \sin^{-1}\!\left(\frac{n_A \sin\theta_A}{n_B}\right)$$

from Eq. (4E.1a) above. The dashed ray, labeled ray 2 in Fig. 4E.5(a), corresponds to the
direction of propagation of the solid ray’s plane wave when the moving mirror is slightly
misaligned, changing its direction of propagation by an angle $\theta_d$. There is no reason to expect $\theta_d$
to lie in the x, z plane, so the plane of incidence of ray 2 is depicted as being different from that of
ray 1. When ray 2 refracts at the origin, turning into ray II, we see that the angle between rays I
and II must be the same order of magnitude as $\theta_d$, so we write this angle as $O(\theta_d)$.
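The claim that the angle between refracted rays I and II is of order $\theta_d$ can be checked numerically. The following is a minimal sketch, not the author's calculation: it applies the standard vector form of Snell's law, and the refractive indices, the 45° angle of incidence, and the helper names `refract` and `angle_between` are all assumptions chosen only for illustration.

```python
import math

def refract(d, n, n_a, n_b):
    """Refract unit direction d at a surface with unit normal n (n points
    back toward the incident side), passing from index n_a into index n_b."""
    eta = n_a / n_b
    cos_i = -sum(di * ni for di, ni in zip(d, n))
    cos_t = math.sqrt(1.0 - eta**2 * (1.0 - cos_i**2))
    k = eta * cos_i - cos_t
    return tuple(eta * di + k * ni for di, ni in zip(d, n))

def angle_between(u, v):
    """Angle between two vectors, computed stably for very small angles."""
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    dot = sum(ui * vi for ui, vi in zip(u, v))
    return math.atan2(math.sqrt(sum(c * c for c in cross)), dot)

N_A, N_B = 1.0, 1.5            # assumed refractive indices (illustrative only)
THETA_A = math.radians(45.0)   # assumed angle of incidence of ray 1
THETA_D = 1.0e-4               # misalignment angle in radians

normal = (math.sin(THETA_A), 0.0, math.cos(THETA_A))  # lies in the x, z plane
ray_1 = (0.0, 0.0, -1.0)                              # travels down the z axis
ray_I = refract(ray_1, normal, N_A, N_B)
theta_B = angle_between(tuple(-c for c in normal), ray_I)

# Tilt ray 2 away from ray 1 by THETA_D in several azimuthal directions,
# since nothing forces the misalignment to lie in the x, z plane.
ratios = []
for azimuth_deg in (0.0, 45.0, 90.0):
    phi = math.radians(azimuth_deg)
    ray_2 = (math.sin(THETA_D) * math.cos(phi),
             math.sin(THETA_D) * math.sin(phi),
             -math.cos(THETA_D))
    ray_II = refract(ray_2, normal, N_A, N_B)
    ratios.append(angle_between(ray_I, ray_II) / THETA_D)

print(math.degrees(theta_B), ratios)
```

For these assumed values each entry of `ratios` is a number of order one, which is what writing the angle as $O(\theta_d)$ asserts.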
Figure 4E.5(b) shows the plane containing refracted rays I and II. The intersection of this
plane with the flat surfaces of the beam-splitter substrate produces lines 1 and 2 in Fig. 4E.5(b),
showing where rays I and II enter and exit the slab. The $O(\theta_d)$ angle between them is also clearly
shown, now lying in the plane of the diagram. The distance between lines 1 and 2 must be O(w)
where w is the thickness of the slab. To estimate the monochromatic plane wave’s change in
phase due to the $O(\theta_d)$ change in the angle of propagation through the slab, we evaluate

$$\frac{2\pi}{\lambda}\,\Delta s \cong \frac{2\pi}{\lambda}\cdot\left[O(\theta_d)\cdot O(w)\right]\cdot O(\theta_d) = O\!\left(\frac{2\pi w}{\lambda}\,\theta_d^{2}\right) , \tag{4E.8a}$$

where λ is the smallest typical wavelength of the monochromatic plane wave.

FIGURE 4E.5(a).
[Diagram labels: ray 1 and ray 2 separated by angle $\theta_d$; angle of incidence $\theta_A$; unit vectors x̂, ŷ, and ẑ; surface normal vector n̂; refracted rays I and II separated by an angle that is $O(\theta_d)$; angle of refraction $\theta_B = \sin^{-1}(n_A \sin\theta_A / n_B)$.]

We note that

$$\lambda = \lambda_{\min} \approx 10^{-4}\ \text{cm}$$

for infrared systems and that, according to inequality (4B.4b) in Appendix 4B of this chapter, the
maximum expected value of $\theta_d$ is

$$\theta_d \approx 10^{-4}\ \text{radians} .$$

The thickness w is typically on the order of 1 cm. We now see that [see requirement (4B.1c) in
Appendix 4B]

$$\frac{2\pi}{\lambda}\,\Delta s \cong O\!\left(\frac{2\pi}{10^{-4}}\cdot 10^{-8}\right) = 2\pi\cdot 10^{-4} . \tag{4E.8b}$$

This is clearly small enough to ignore, justifying our decision in the discussion following Eq.
(4E.7c) to disregard the difference between the complex $\gamma_{s,p}^{(c)}$ and $\gamma_{s,p}^{(c)\prime}$ parameters. Inequality
(4B.2a) in Appendix 4B reveals that $\theta_b$, the typical off-axis angle of propagation of the off-axis
plane waves, can be as large as $10^{-2}$ radians. Putting this into formula (4E.8a) gives

$$\frac{2\pi}{\lambda}\,\Delta s = O\!\left(\frac{2\pi}{10^{-4}}\cdot 10^{-4}\right) = O(2\pi) . \tag{4E.8c}$$

This is clearly too large to neglect, showing why we have been so careful to do the bookkeeping
on the phase changes undergone by the on-axis and off-axis monochromatic plane waves as they
propagate through the beam-splitter substrate and compensator plate.
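The two order-of-magnitude estimates above are easy to reproduce. The short sketch below simply evaluates the right-hand side of formula (4E.8a) with the numbers quoted in the text; the function name `phase_change_order` is a hypothetical label introduced here, and the O( ) symbols are replaced by plain multiplication.

```python
import math

def phase_change_order(theta, w_cm, wavelength_cm):
    """Order-of-magnitude phase change (radians) from formula (4E.8a):
    (2*pi*w/lambda) * theta**2."""
    return 2.0 * math.pi * w_cm * theta**2 / wavelength_cm

LAMBDA_MIN = 1.0e-4   # smallest typical infrared wavelength, cm
W = 1.0               # typical slab thickness, cm
THETA_D = 1.0e-4      # maximum expected mirror misalignment, radians
THETA_B = 1.0e-2      # largest typical off-axis angle, radians

phase_d = phase_change_order(THETA_D, W, LAMBDA_MIN)  # misalignment case
phase_b = phase_change_order(THETA_B, W, LAMBDA_MIN)  # off-axis case

print(phase_d / (2.0 * math.pi), phase_b / (2.0 * math.pi))
```

Expressed as fractions of a full cycle, the misalignment case comes out near one ten-thousandth of a cycle and the off-axis case near one full cycle, which is the contrast the text draws.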

FIGURE 4E.5(b).
[Diagram labels: lines 1 and 2, where rays I and II enter and exit the slab; the point representing the origin of the coordinate system in Fig. 4E.5(a); two angles of size $O(\theta_d)$; two right angles π/2; a small length Δs; a length O(w).]

Appendix 4F
Figures 4F.1 and 4F.2 are tunnel diagrams like the ones used to explain the meaning of the $\gamma_{s,p}^{(a)}$,
$\gamma_{s,p}^{(b)}$, and $\gamma_{s,p}^{(c)}$ complex parameters introduced in Appendix 4E. The only difference is that these
tunnel diagrams apply to the monochromatic plane waves of the unbalanced background optical
signal coming from the detector side of a standard Michelson interferometer while the tunnel
diagrams in Appendix 4E analyze the monochromatic plane waves for the balanced optical signal
entering the interferometer’s front aperture.
Figures 4F.1 and 4F.2 show the path of a slightly off-axis plane wave represented by three
rays coming from the detector side of the interferometer. The tunnel diagram in Fig. 4F.1
corresponds to the rays transmitting through the beam-splitter film and substrate, reflecting off
the moving mirror, and then transmitting a second time through the beam-splitter substrate and
film on their way back to the detector. The tunnel diagram in Fig. 4F.2 corresponds to the rays
coming from the same off-axis direction, but this time they reflect off the back side of the beam-
splitter film, transmit twice through the compensator plate while going out and back the fixed-
mirror arm, and then reflect a second time off the back side of the beam-splitter film to return to
the detector.
In Fig. 4F.1, the angle of incidence of the off-axis rays on the back side of the beam-splitter
substrate is slightly different from the angle of incidence on the virtual beam-splitter substrate
shown on the other side of the moving mirror in the tunnel diagram. We have found that the
change in a plane wave passing through a transparent slab can be described by a complex
parameter whose argument or complex phase angle depends sensitively on the angle of incidence
and whose magnitude does not [see, for example, Eqs. (4E.6a)–(4E.6c) in Appendix 4E]. The
tunnel diagram in Fig. 4F.1 shows that the angle of incidence for the second pass through the
beam-splitter substrate differs slightly from that of the first pass, so we call the complex
parameter for the second pass $\gamma_s^{(v)}$ for s-type monochromatic plane waves and $\gamma_p^{(v)}$ for the p-type
monochromatic plane waves, while the complex parameter for the first pass is called $\gamma_s^{(u)}$ for
s-type monochromatic plane waves and $\gamma_p^{(u)}$ for the p-type monochromatic plane waves.
In Fig. 4F.2, the angle of incidence of the slightly off-axis rays on their second pass through
the compensator plate must also be slightly different from the angle of incidence of their first
pass; we also note, however, that according to the tunnel diagrams in Figs. 4F.1 and 4F.2, the
angle of incidence of the first pass through the beam-splitter substrate must be the same as the
angle of incidence of the first pass through the compensator plate when the compensator plate is
correctly aligned parallel to the beam-splitter substrate. Similarly, the angle of incidence of the
second pass through the compensator plate must be the same as the angle of incidence of the
second pass through the beam-splitter substrate.

FIGURE 4F.1.
[Diagram labels: background radiation from detector side of interferometer; optical axis; beam-splitter substrate with parameter $\gamma_{s,p}^{(u)}$; moving mirror; virtual beam-splitter substrate with parameter $\gamma_{s,p}^{(v)}$; virtual optical axis.]

FIGURE 4F.2.
[Diagram labels: background radiation from detector side of interferometer; optical axis; virtual compensator plate with parameter $\gamma_{s,p}^{(u)}$; virtual fixed mirror; second virtual compensator plate, from the virtual fixed mirror, with parameter $\gamma_{s,p}^{(v)}$.]

Hence, the same complex parameters $\gamma_{s,p}^{(u)}$ and $\gamma_{s,p}^{(v)}$ used for the first and second passes through
the beam-splitter substrate should also be used to describe the first and second passes of the
s-type and p-type monochromatic plane waves through the compensator plate.
For future use, we define

$$\gamma_s^{(uv)} = \gamma_s^{(u)}\cdot\gamma_s^{(v)} \tag{4F.1a}$$

and

$$\gamma_p^{(uv)} = \gamma_p^{(u)}\cdot\gamma_p^{(v)} . \tag{4F.1b}$$

Just as in Eqs. (4E.6a)–(4E.6c), we know that

$$\left|\gamma_{s,p}^{(uv)}\right| \cong \text{function only of the incident wavenumber } \sigma \tag{4F.2a}$$

but

$$\arg\!\left(\gamma_{s,p}^{(uv)}\right) = \text{function of both the angle of incidence and the incident wavenumber } \sigma \tag{4F.2b}$$

so that

$$\gamma_{s,p}^{(uv)} = \text{function of both the angle of incidence and the incident wavenumber } \sigma . \tag{4F.2c}$$

The reason for this is the same as before: the change in phase of the slightly off-axis plane waves
passing through the beam-splitter substrate or compensator plate depends sensitively on their
angle of incidence while their loss of power does not. We also note that, according to the analysis
at the end of Appendix 4E of this chapter, this dependence of the phase change on the angle of
incidence is not so sensitive as to be affected by the very small misalignments of the moving
mirror that may occur in well-designed interferometers.

5
DESCRIPTION OF PRACTICAL
INTERFEROMETER MEASUREMENTS
The concept of spectral radiance was introduced in Chapter 4 to simplify the interference
equations, and it turns out to have a much wider range of usefulness than might at first be
expected. We start off this chapter with a quick description of how the spectral radiance can be
used to analyze the large-scale power flow and spectral content of electromagnetic radiation
fields, matching this to our intuitive understanding of what is meant by the brightness and
darkness of both near and distant objects. This is followed by a description of what is seen with
the naked eye when looking out through a standard Michelson interferometer, showing how it fits
in with the previous chapter’s interference formulas. The somewhat abstract equations derived in
Chapter 4 are converted into more practical formulas, and we explain the consequences of the
nonrandom errors and signal distortions found in realistic instruments. We describe the balanced,
unbalanced, and off-axis interferogram signals as well as how calibration removes contaminating
background radiances from the measured spectra. The characteristic strengths and weaknesses of
double-sided and single-sided interferogram systems are discussed, and we analyze the
degradation introduced by nonflat optical surfaces. The signals produced in the detector are
traced through the anti-aliasing filter to the analog-to-digital converter, where they are
transformed into digital input for the discrete Fourier transform. The chapter ends with an
explanation of why it sometimes makes sense to oversample or undersample the interferogram
signals.

5.1 Radiometric Description of Electromagnetic Fields


Radiometry analyzes the power flow and spectral content of radiation fields. The analysis is
almost always done using length scales much larger than the typical wavelength of the radiation
and time scales much longer than the typical period of the radiation, allowing us to treat the
radiation as collections of beam-chopped and direction-chopped radiant beams (see Sec. 4.9 of
Chapter 4). It should be emphasized that this division into separate beams is entirely conceptual;
no apertures or lenses are required. In Chapter 4, we introduced the spectral radiance function
L(σ) to describe the propagation of electromagnetic energy inside an interferometer beam.79
There, the spectral radiance of the beam is defined in such a way that the amount of radiant

79
See, for example, the discussion at the start of Sec. 4.16 of Chapter 4.

energy $dE_\sigma$ passing through a cross-sectional area A of a beam in time 2T into a solid angle dΩ
and having a wavenumber between σ and σ + dσ is

$$dE_\sigma = L(\sigma)\cdot 2T\cdot A\cdot d\Omega\cdot d\sigma . \tag{5.1}$$

In analyzing any radiation field as a collection of radiant beams, as we are doing here, the idea of
a spectral radiance can be applied to any large-scale description of electromagnetic radiation. In
Sec. 4.9 of the last chapter, parallel groups of rays are used to represent plane waves inside a
beam. In radiometry, these ray groups are bundled together into what are often called pencils of
rays,80 or pencil rays for short, such that each pencil ray becomes an idealized representation of a
single conceptual beam of the radiation field. The pencil rays can be thought of as channels along
which electromagnetic energy flows. Just as the interferometer beam has a spectral radiance, so
too can a spectral radiance be assigned to every pencil ray of a large-scale radiation field.
In radiometry we define the spectral radiance of each pencil ray to be a function L(σ) such that the
amount of radiant energy $dE_\sigma$ passing through a cross-sectional area dA of a pencil ray in time dt
into a solid angle dΩ and having a wavenumber between σ and σ + dσ is

$$dE_\sigma = L(\sigma)\cdot dt\cdot dA\cdot d\Omega\cdot d\sigma . \tag{5.2a}$$

In this formula area dA has its normal vector parallel to the axis of the ray as shown in Fig. 5.1(a).
Equations (5.1) and (5.2a) can be matched to each other exactly if we make the associations

$$A \leftrightarrow dA \tag{5.2b}$$

and

$$2T \leftrightarrow dt . \tag{5.2c}$$

This shows that to keep the radiometric L(σ) function consistent with Maxwell’s equations, the
physical quantities dA and dt, although mathematical infinitesimals, must always be thought of as
much larger than the wavelengths and periods of the propagating electromagnetic fields. If the
normal vector of area dA makes an angle θ with respect to the axis of the pencil ray, we expect
the effective area transverse to the beam to be (dA ⋅ cos θ) as shown in Fig. 5.1(b). Now the
energy propagating along the pencil ray is

$$dE_\sigma = L(\sigma)\cdot dt\cdot(dA\cdot\cos\theta)\cdot d\Omega\cdot d\sigma . \tag{5.2d}$$

80
Max Born and Emil Wolf, Principles of Optics, 7th exp. ed. (Macmillan Company, New York, 1964).

FIGURE 5.1(a).
[Diagram labels: edge-on view of dA; pencil ray passing through dA.]

FIGURE 5.1(b).
[Diagram labels: edge-on view of dA; unit vector normal to dA; edge-on view of (cos θ) dA; angle θ; pencil ray passing through dA.]

There is no particular reason to use wavenumbers to characterize the spectral distribution of
the energy flowing along the pencil rays. As a matter of fact, in radiometry the spectral radiance
of pencil rays is more likely to be given in terms of $L_\lambda(\lambda)$, the spectral radiance with respect to
wavelength λ, or $L_f(f)$, the spectral radiance with respect to frequency f. It is straightforward to
convert from L(σ), the spectral radiance with respect to wavenumber, to either $L_\lambda(\lambda)$ or $L_f(f)$.
To get $L_\lambda(\lambda)$, we simply note that $dE_\lambda$, the radiant energy flowing in time dt through an area dA
making an angle θ with respect to the pencil ray into a solid angle dΩ and having a wavelength
between λ and λ + dλ should be

$$dE_\lambda = L_\lambda(\lambda)\cdot dt\cdot(dA\cdot\cos\theta)\cdot d\Omega\cdot d\lambda . \tag{5.3a}$$

Similarly $dE_f$, the radiant energy flowing in time dt through an area dA making an angle θ with
respect to the pencil ray into a solid angle dΩ and having a frequency between f and f + df
should be

$$dE_f = L_f(f)\cdot dt\cdot(dA\cdot\cos\theta)\cdot d\Omega\cdot df . \tag{5.3b}$$

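The total amount of radiant energy flowing along the ray should be the same no matter how the
spectrum is represented, so

$$(dA\cdot\cos\theta)\cdot dt\cdot d\Omega\int_0^\infty L(\sigma)\,d\sigma = (dA\cdot\cos\theta)\cdot dt\cdot d\Omega\int_0^\infty L_f(f)\,df = (dA\cdot\cos\theta)\cdot dt\cdot d\Omega\int_0^\infty L_\lambda(\lambda)\,d\lambda$$

or

$$\int_0^\infty L(\sigma)\,d\sigma = \int_0^\infty L_f(f)\,df \tag{5.3c}$$

and

$$\int_0^\infty L(\sigma)\,d\sigma = \int_0^\infty L_\lambda(\lambda)\,d\lambda . \tag{5.3d}$$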

But σ = 1/λ = f/c [see discussion immediately preceding Eq. (4.19c) in Chapter 4], which we
can use to change the variable of integration in these last two equations to get

$$\int_0^\infty\left[\frac{1}{c}\,L\!\left(\frac{f}{c}\right)\right]df = \int_0^\infty L_f(f)\,df \tag{5.3e}$$

and

$$\int_0^\infty\left[\frac{1}{\lambda^{2}}\,L\!\left(\frac{1}{\lambda}\right)\right]d\lambda = \int_0^\infty L_\lambda(\lambda)\,d\lambda . \tag{5.3f}$$

These equations must hold true for any physically conceivable spectral radiance L(σ), which
means that

$$L_f(f) = \frac{1}{c}\,L\!\left(\frac{f}{c}\right) \tag{5.3g}$$

and

$$L_\lambda(\lambda) = \frac{1}{\lambda^{2}}\,L\!\left(\frac{1}{\lambda}\right) . \tag{5.3h}$$

This can be used to define $L_f$ and $L_\lambda$ in terms of L. The phrase “physically conceivable” lets us
assume that L(1/λ) → 0 as λ → 0 and that it does this fast enough to avoid any concern that the
right-hand side of (5.3h) becomes singular as λ → 0.
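A quick numerical check makes formulas (5.3g) and (5.3h) concrete. The sketch below is only an illustration under stated assumptions: the Gaussian-shaped `L_sigma` is a made-up spectrum, not one from this book, and `trapezoid` is a plain trapezoid-rule helper written here for the purpose. It confirms that the three representations integrate to the same total, as required by Eqs. (5.3c) and (5.3d).

```python
import math

C = 2.998e10  # speed of light in cm/sec

def L_sigma(sigma):
    """Hypothetical spectral radiance vs. wavenumber: one Gaussian feature
    centered at 1000 cm^-1 (arbitrary units, chosen for illustration)."""
    return math.exp(-0.5 * ((sigma - 1000.0) / 50.0) ** 2)

def L_f(f):
    """Spectral radiance with respect to frequency, from Eq. (5.3g)."""
    return L_sigma(f / C) / C

def L_lambda(lam):
    """Spectral radiance with respect to wavelength, from Eq. (5.3h)."""
    return L_sigma(1.0 / lam) / lam**2

def trapezoid(func, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n equal intervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

# The feature is negligible outside 500-1500 cm^-1, so finite limits suffice.
SIG_LO, SIG_HI = 500.0, 1500.0
int_sigma = trapezoid(L_sigma, SIG_LO, SIG_HI)
int_f = trapezoid(L_f, SIG_LO * C, SIG_HI * C)
int_lambda = trapezoid(L_lambda, 1.0 / SIG_HI, 1.0 / SIG_LO)

print(int_sigma, int_f, int_lambda)
```

Note that the wavelength limits are the reciprocals of the wavenumber limits taken in the opposite order, which is the reversal of spectral ordering that accompanies the change from σ to λ.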
Radiation escaping from relatively small holes in cavities whose walls are all at the same
temperature T is called black-body or Planck radiation. One of the first triumphs of quantum
mechanics at the beginning of the 20th century was to explain why the spectral radiance of this
sort of radiation is always given by the formula

$$L(\sigma)_{\text{Planck}} = \frac{2hc^{2}\sigma^{3}}{e^{hc\sigma/kT} - 1} , \tag{5.3i}$$

where T is the temperature of the walls in kelvins (abbreviated as K),
$h = 6.626\times10^{-27}$ erg⋅sec is Planck’s constant, $k = 1.381\times10^{-16}$ erg/K is Boltzmann’s constant,
and $c = 2.998\times10^{10}$ cm/sec is the speed of light in empty space. Equivalent forms of this
equation come from applying formulas (5.3g) and (5.3h) to get

$$L_f(f)_{\text{Planck}} = \frac{(2hf^{3}/c^{2})}{e^{hf/kT} - 1} \tag{5.3j}$$

and

$$L_\lambda(\lambda)_{\text{Planck}} = \frac{(2hc^{2}/\lambda^{5})}{e^{hc/kT\lambda} - 1} . \tag{5.3k}$$
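The three forms of the Planck formula can be coded directly and checked against the conversion rules (5.3g) and (5.3h). This is a minimal sketch using CGS constants rounded to four figures; the function names and the sample point (1000 cm⁻¹ at 300 K) are choices made here for illustration only.

```python
import math

H = 6.626e-27   # Planck's constant, erg*sec (rounded)
K = 1.381e-16   # Boltzmann's constant, erg/K (rounded)
C = 2.998e10    # speed of light, cm/sec (rounded)

def planck_sigma(sigma, temp):
    """Eq. (5.3i): Planck spectral radiance per unit wavenumber (sigma in cm^-1)."""
    return 2.0 * H * C**2 * sigma**3 / math.expm1(H * C * sigma / (K * temp))

def planck_f(f, temp):
    """Eq. (5.3j): Planck spectral radiance per unit frequency (f in Hz)."""
    return (2.0 * H * f**3 / C**2) / math.expm1(H * f / (K * temp))

def planck_lambda(lam, temp):
    """Eq. (5.3k): Planck spectral radiance per unit wavelength (lam in cm)."""
    return (2.0 * H * C**2 / lam**5) / math.expm1(H * C / (K * temp * lam))

TEMP = 300.0           # K
SIGMA = 1000.0         # cm^-1
FREQ = C * SIGMA       # the same spectral point expressed in Hz
LAM = 1.0 / SIGMA      # the same spectral point expressed in cm

from_5_3g = planck_sigma(SIGMA, TEMP) / C        # (1/c) L(f/c)
from_5_3h = planck_sigma(SIGMA, TEMP) / LAM**2   # (1/lambda^2) L(1/lambda)

print(planck_f(FREQ, TEMP), from_5_3g)
print(planck_lambda(LAM, TEMP), from_5_3h)
```

Using `math.expm1` rather than `math.exp(x) - 1` keeps the denominator accurate when hcσ/kT is small.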

We often use a “gray-body” approximation to get the spectral radiance for heat or infrared
radiation that a surface of temperature T spontaneously emits. To use the gray-body
approximation, we just multiply $L_{\text{Planck}}$ at the surface’s temperature T by a dimensionless fraction
between zero and one, which is called the surface’s emissivity, with different surfaces having
different emissivity values. Sometimes, to get more accuracy, the emissivity is taken to be a
function of wavenumber and temperature; when this is done the $L_{\text{Planck}}$ function is being used to
give the correct overall size and shape to the surface’s spectral radiance while the spectral
dependence of the emissivity is used to reproduce the rapid fluctuations with respect to σ
characteristic of the surface.
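The gray-body recipe amounts to one multiplication, as the following sketch shows. Both emissivity choices are assumptions made up for illustration: the constant 0.9 and the function `banded_emissivity`, with its dip near 1000 cm⁻¹, do not describe any real surface.

```python
import math

H = 6.626e-27   # Planck's constant, erg*sec (rounded)
K = 1.381e-16   # Boltzmann's constant, erg/K (rounded)
C = 2.998e10    # speed of light, cm/sec (rounded)

def planck_sigma(sigma, temp):
    """Planck spectral radiance per unit wavenumber, Eq. (5.3i)."""
    return 2.0 * H * C**2 * sigma**3 / math.expm1(H * C * sigma / (K * temp))

def gray_body(sigma, temp, emissivity):
    """Gray-body spectral radiance: emissivity times the Planck radiance.
    `emissivity` is either a constant or a function of (sigma, temp)."""
    eps = emissivity(sigma, temp) if callable(emissivity) else emissivity
    if not 0.0 <= eps <= 1.0:
        raise ValueError("emissivity must lie between zero and one")
    return eps * planck_sigma(sigma, temp)

def banded_emissivity(sigma, temp):
    """Hypothetical emissivity with a narrow dip near 1000 cm^-1 (made up)."""
    return 0.9 - 0.3 * math.exp(-0.5 * ((sigma - 1000.0) / 20.0) ** 2)

constant_case = gray_body(800.0, 300.0, 0.9)
banded_case = gray_body(1000.0, 300.0, banded_emissivity)

print(constant_case, banded_case)
```

In the second case the Planck factor sets the overall size and shape while the emissivity function supplies the narrow spectral feature.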
Figures 5.2(a)–5.2(c) contain plots of $L(\sigma)_{\text{Planck}}$, $L_f(f)_{\text{Planck}}$, and $L_\lambda(\lambda)_{\text{Planck}}$ at temperatures
of 300 K, 400 K and 500 K. The spectral radiance increases with temperature at every
wavenumber, matching our intuition about what ought to occur. We note that at 300 K
(approximate room temperature) only negligible radiation is emitted in the visible region of the
electromagnetic spectrum between approximately 15,000 cm⁻¹ and 22,000 cm⁻¹, which is, of
course, what we should expect, and the same is also true of the 400 K and 500 K curves.
(Surfaces in fact start to become visibly hot only at 700 K and higher.) Unfortunately, the Planck
curve is rather featureless, tending to conceal what is going on when we switch from L to $L_f$ to
$L_\lambda$ to represent the same radiance spectrum. Figures 5.2(d)–5.2(f) show a more interesting
electromagnetic spectrum represented using the L(σ), $L_f(f)$, and $L_\lambda(\lambda)$ spectral radiance
functions. These plots reveal that the transformation from L to $L_\lambda$ not only distorts the spectrum’s
overall shape but also reverses the ordering of the spectral features, putting large wavenumber
features at small wavelengths and small wavenumber features at large wavelengths. The
transformation from L to $L_f$, on the other hand, just involves a rescaling of the x and y axes of the
spectrum. This latter transformation, then, acts like a simple change in our choice of units; and
for this reason the word “frequency” is sometimes used to refer to wavenumber. The idea behind
this terminology is that wavenumbers are just frequencies that happen to be measured in “units”
of cm⁻¹.
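Two of the statements above can be illustrated with a few lines of arithmetic: that a 300 K surface emits essentially nothing in the visible, and that changing from L to $L_\lambda$ distorts the shape of the spectrum. The sketch below uses rounded CGS constants and a simple grid search; the grid spacings and variable names are arbitrary choices made here, not values from the text.

```python
import math

H = 6.626e-27   # Planck's constant, erg*sec (rounded)
K = 1.381e-16   # Boltzmann's constant, erg/K (rounded)
C = 2.998e10    # speed of light, cm/sec (rounded)

def planck_sigma(sigma, temp):
    """Planck spectral radiance per unit wavenumber, Eq. (5.3i)."""
    return 2.0 * H * C**2 * sigma**3 / math.expm1(H * C * sigma / (K * temp))

def planck_lambda(lam, temp):
    """Planck spectral radiance per unit wavelength, Eq. (5.3k)."""
    return (2.0 * H * C**2 / lam**5) / math.expm1(H * C / (K * temp * lam))

TEMP = 300.0  # K

# Locate the maximum of each curve on a simple grid.
sigma_grid = [float(s) for s in range(1, 4001)]            # 1 to 4000 cm^-1
lam_grid = [i * 1.0e-6 for i in range(100, 5001)]          # 1e-4 to 5e-3 cm
sigma_peak = max(sigma_grid, key=lambda s: planck_sigma(s, TEMP))
lam_peak = max(lam_grid, key=lambda lam: planck_lambda(lam, TEMP))

# Radiance at the red edge of the visible relative to the peak radiance.
visible_fraction = planck_sigma(15000.0, TEMP) / planck_sigma(sigma_peak, TEMP)

print(sigma_peak, 1.0 / lam_peak, visible_fraction)
```

The wavenumber at which L(σ) peaks is not the reciprocal of the wavelength at which $L_\lambda(\lambda)$ peaks, because the factor 1/λ² in Eq. (5.3h) reshapes the curve as well as relabeling its axis.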

FIGURE 5.2(a). [Plot of $L(\sigma)_{\text{Planck}}$, in (erg/sec)/cm/sr, versus σ (in cm⁻¹), with curves for 300 K, 400 K, and 500 K.]

FIGURE 5.2(b). [Plot of $L_f(f)_{\text{Planck}}$, in (erg/sec)/cm²/sr/Hz, versus f (in Hz), with curves for 300 K, 400 K, and 500 K.]

FIGURE 5.2(c). [Plot of $L_\lambda(\lambda)_{\text{Planck}}$, in (erg/sec)/cm³/sr, versus λ (in cm), with curves for 300 K, 400 K, and 500 K.]

FIGURE 5.2(d). [Plot of L(σ), in Watts/cm²/sr/cm⁻¹, versus σ (in cm⁻¹).]

FIGURE 5.2(e). [Plot of $L_f(f)$, in Watts/cm²/sr/Hz, versus f (in TeraHz).]

FIGURE 5.2(f). [Plot of $L_\lambda(\lambda)$, in Watts/cm³/sr, versus λ (in microns).]

Different authors use different notations for L, $L_f$, and $L_\lambda$. The easiest way to find out what
exactly is meant by the term “spectral radiance” is to check the units. Consulting Eqs. (5.2d),
(5.3a), and (5.3b), we see that L must have units of energy per unit time per unit area per unit
solid angle per unit wavenumber interval, whereas $L_f$ has units of energy per unit time per unit
area per unit solid angle per unit frequency interval and $L_\lambda$ has units of energy per unit time per
unit area per unit solid angle per unit wavelength interval. Although solid angles measured in
steradians, like angles measured in radians, are strictly speaking dimensionless, it is customary in
radiometry to write out the steradian unit explicitly, treating it as if it had a dimension. This
convention makes it easy to distinguish physical quantities such as the spectral radiance, which
are both “per unit surface area” and “per unit steradian” from physical quantities such as the
energy flux that are just “per unit surface area.”
To go from the spectral radiance to the radiance, we need only integrate L(σ) over all positive
wavenumbers, integrate $L_f(f)$ over all positive frequencies, or integrate $L_\lambda(\lambda)$ over all
wavelengths. Using l to represent the radiance, we say that

$$l = \int_0^\infty L(\sigma)\,d\sigma = \int_0^\infty L_f(f)\,df = \int_0^\infty L_\lambda(\lambda)\,d\lambda . \tag{5.4a}$$

The integrals are between 0 and ∞ because L and $L_f$ are defined in such a way as to spread the
radiant energy over positive wavenumbers and frequencies respectively; wavelength, of
course, must be a positive quantity. In this sense, they are all analogous to the single-sided power
spectra discussed at the end of Sec. 3.23 in Chapter 3. We integrate Eq. (5.2d) over positive σ and
use (5.4a) to get that the total energy dE flowing in time dt through an area dA making an angle θ
with respect to the pencil ray into a solid angle dΩ is

$$dE = l\cdot(dA\cdot\cos\theta)\cdot dt\cdot d\Omega . \tag{5.4b}$$

The same formula comes from integrating Eqs. (5.3a) or (5.3b) over positive wavelengths or
frequencies respectively. Different authors may use different notations for the radiance, and
again the surest way to find out what is going on is to check the units. The units of the radiance l
are, of course, energy per unit time per unit area per unit solid angle.
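As a worked instance of Eq. (5.4a), the Planck spectral radiance can be integrated numerically and compared with the closed-form result $2\pi^{4}(kT)^{4}/(15h^{3}c^{2})$, which is the Stefan-Boltzmann law divided by π. That closed form is a standard result brought in here for comparison rather than something derived in this chapter, and the cutoff at 50 kT/hc and the trapezoid helper are choices made for this sketch.

```python
import math

H = 6.626e-27   # Planck's constant, erg*sec (rounded)
K = 1.381e-16   # Boltzmann's constant, erg/K (rounded)
C = 2.998e10    # speed of light, cm/sec (rounded)

def planck_sigma(sigma, temp):
    """Planck spectral radiance per unit wavenumber; zero at sigma = 0."""
    if sigma == 0.0:
        return 0.0
    return 2.0 * H * C**2 * sigma**3 / math.expm1(H * C * sigma / (K * temp))

def planck_f(f, temp):
    """Planck spectral radiance per unit frequency; zero at f = 0."""
    if f == 0.0:
        return 0.0
    return (2.0 * H * f**3 / C**2) / math.expm1(H * f / (K * temp))

def trapezoid(func, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n equal intervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

TEMP = 300.0
SIGMA_MAX = 50.0 * K * TEMP / (H * C)   # integrand is negligible beyond this

radiance_sigma = trapezoid(lambda s: planck_sigma(s, TEMP), 0.0, SIGMA_MAX)
radiance_f = trapezoid(lambda f: planck_f(f, TEMP), 0.0, SIGMA_MAX * C)
radiance_exact = 2.0 * math.pi**4 * (K * TEMP)**4 / (15.0 * H**3 * C**2)

print(radiance_sigma, radiance_f, radiance_exact)  # erg/sec/cm^2/sr
```

The wavenumber and frequency integrals agree with each other and with the closed form, as Eq. (5.4a) requires.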

5.2 Radiance Fields in Space


The solid angle dΩ referred to in the definitions of L and l [see Eqs. (5.2a), (5.4a), and (5.4b)]
can be taken to extend either forward or backward along the pencil ray, as shown in Fig. 5.3(a).
We can place two areas $dA_1$ and $dA_2$ at positions 1 and 2 along the same pencil ray, with the
normals of $dA_1$ and $dA_2$ making angles $\theta_1$ and $\theta_2$ with respect to the ray as shown in Fig. 5.3(b).
The amount of radiant energy passing through $dA_1$ in time dt into a solid angle $d\Omega_1$ is

$$l_1\cdot(dA_1\cdot\cos\theta_1)\cdot dt\cdot d\Omega_1 , \tag{5.5a}$$

where $l_1$ is the radiance at position 1 along the pencil ray. Similarly, the amount of radiant energy
passing through $dA_2$ in time dt into a solid angle $d\Omega_2$ is

$$l_2\cdot(dA_2\cdot\cos\theta_2)\cdot dt\cdot d\Omega_2 , \tag{5.5b}$$

where $l_2$ is the radiance at position 2 along the pencil ray. The values of $l_1$ and $l_2$ cannot depend
on the size of the infinitesimal quantities $dA_1$, $dA_2$, $d\Omega_1$, $d\Omega_2$, or dt, so nothing stops us from
choosing $d\Omega_{1,2}$ to be the solid angles subtended by $dA_{2,1}$ at positions 1,2:

$$d\Omega_1 = \frac{dA_2\cos\theta_2}{r^{2}} \tag{5.5c}$$

and

$$d\Omega_2 = \frac{dA_1\cos\theta_1}{r^{2}} , \tag{5.5d}$$

where, as shown in Fig. 5.3(b), r is the distance between positions 1 and 2.
______________________________________________________________________________

FIGURE 5.3(a).
[Diagram labels: area dA; unit vector normal to area dA; angle θ; solid angle dΩ surrounding the pencil ray, shown extending both forward and backward from dA.]

FIGURE 5.3(b).
[Diagram labels: areas $dA_1$ at position 1 and $dA_2$ at position 2; unit vectors normal to $dA_1$ and $dA_2$; angles $\theta_1$ and $\theta_2$; solid angles $d\Omega_1$ and $d\Omega_2$; separation r.]
______________________________________________________________________________

If we make the reasonable assumption that energy travels in straight lines inside a homogeneous
medium, as shown by the dotted lines in Fig. 5.3(c), and also specify that the values of l1 and l2
do not change with time, then the radiant energy passing through dA1 into dΩ1 in time dt must be
the same as the radiant energy passing through dA2 into dΩ2 in time dt. From Eqs. (5.5a)–(5.5d)
we then have

$$ l_1 \cdot (dA_1 \cos\theta_1) \cdot \frac{dA_2 \cos\theta_2}{r^2} \cdot dt = l_2 \cdot (dA_2 \cos\theta_2) \cdot \frac{dA_1 \cos\theta_1}{r^2} \cdot dt , \tag{5.5e} $$

which reduces to
$$ l_1 = l_2 . \tag{5.5f} $$

Hence when the radiance is not changing with time it must also be constant along any pencil
of rays.
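The cancellation leading from Eq. (5.5e) to Eq. (5.5f) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the patch areas, tilt angles, separation, and radiance value are hypothetical numbers chosen only for illustration, and the helper names `solid_angle` and `energy_through` are our own.

```python
import math

def solid_angle(area, cos_theta, r):
    """Solid angle subtended by a small tilted patch at distance r, as in Eqs. (5.5c)-(5.5d)."""
    return area * cos_theta / r**2

def energy_through(radiance, area, cos_theta, d_omega, dt):
    """Radiant energy through a patch into solid angle d_omega in time dt, as in Eqs. (5.5a)-(5.5b)."""
    return radiance * (area * cos_theta) * dt * d_omega

# Hypothetical illustrative values (not taken from the text).
dA1, dA2 = 1.0e-6, 4.0e-6                      # patch areas
cos1, cos2 = math.cos(math.radians(20.0)), math.cos(math.radians(35.0))
r, dt = 2.5, 1.0e-3                            # separation and time interval
l1 = 150.0                                     # radiance at position 1

d_omega1 = solid_angle(dA2, cos2, r)           # dA2 as seen from position 1
d_omega2 = solid_angle(dA1, cos1, r)           # dA1 as seen from position 2

# Energy leaving dA1 toward dA2, then the radiance l2 that carries the
# same energy through dA2 into d_omega2 (Eq. 5.5e solved for l2).
E1 = energy_through(l1, dA1, cos1, d_omega1, dt)
l2 = E1 / ((dA2 * cos2) * dt * d_omega2)

print(l1, l2)
```

Whatever values are chosen for the areas, angles, and distance, the geometric factors cancel and `l2` agrees with `l1` to rounding error, which is the content of Eq. (5.5f).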
We have now established a self-consistent model for radiation fields in empty space and
transparent media. To find the radiance at any point, such as point A in Fig. 5.4, we need only
take note of all the criss-crossing pencil rays, establishing their radiances by tracing them back to
the surfaces where they originated. It does not matter whether the surface has reflected them like
surface 1 or, being self-luminous, has created them like surfaces 2 and 3; all that is relevant is the
radiance value they have when leaving the surface. There is nothing special about point A in Fig.
5.4—its location is obviously arbitrary.


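FIGURE 5.3(c). [Diagram: dotted straight lines joining area dA1 at position 1 to area dA2 at position 2.]

FIGURE 5.4. [Diagram: pencil rays leaving surfaces 1, 2, and 3 and crossing at point A.]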


By moving point A around and specifying the radiances of the different pencil rays passing
through point A, we construct a radiance field $l(\vec{r}, \hat{\Omega})$ that is a function both of
position $\vec{r}$ and direction $\hat{\Omega}$. Having picked a position $\vec{r}$ at which to evaluate l,
we need $\hat{\Omega}$ as well to specify one particular pencil ray passing through position $\vec{r}$. It is
even possible, once the radiance l is thought of as a function of $\vec{r}$ and $\hat{\Omega}$, to derive a
simple differential equation describing the gradual change in radiance undergone by these pencil
beams when they travel through semiopaque and self-luminous media, such as clouds of radiating
gas. This last idea is the starting point for modeling radiance fields inside stars or planetary
atmospheres, but is not really needed for the material in this book.81
Along with the radiance field $l(\vec{r}, \hat{\Omega})$, which is a function of position and direction of
propagation, we can associate a spectral radiance field $L(\sigma, \vec{r}, \hat{\Omega})$ that is a function of
wavenumber σ, position $\vec{r}$, and direction of propagation $\hat{\Omega}$ such that

$$ l(\vec{r}, \hat{\Omega}) = \int_0^{\infty} L(\sigma, \vec{r}, \hat{\Omega}) \, d\sigma . \tag{5.6} $$

Suppressing the $\vec{r}$ dependence on position to represent a radiance field that is constant over some
region of space, and choosing a direction in space to be the ẑ axis of a coordinate system so that

$$ \hat{\Omega} = \vec{\varepsilon} + \hat{z} \sqrt{1 - \varepsilon^2} , $$

as in Eq. (4.97d) of Chapter 4, we can write L as a function of $\vec{\varepsilon}$, as in
$L = L(\vec{\varepsilon}, \sigma)$, to show its dependence on the radiation’s direction of propagation. This
function L is the same quantity as the spectral radiance $L(\vec{\varepsilon}, \sigma)$ specified by Eq. (4.136d)
in Chapter 4. Many times in the rest of this chapter we will talk about a single pencil ray from a
distant source passing through an interferometer. The pencil ray has, of course, a unique spectral
radiance $L(\vec{\varepsilon}, \sigma)$; and the pencil ray while passing through the interferometer can be
decomposed into a group of parallel rays because it emanates from a distant source. This parallel
group of rays, according to Sec. 4.9 of Chapter 4, specifies a plane wave passing through the
interferometer. To get the optical energy per unit area per unit time per unit wavenumber interval
carried by the plane wave, we just multiply the spectral radiance of the pencil ray by the
extremely small solid angle dΩ subtended

81
The interested reader is referred to S. Chandrasekhar, Radiative Transfer (Dover Publications, New York, 1960)
for a classic textbook, or Curtis D. Mobley, Light and Water: Radiative Transfer in Natural Waters (Academic
Press, New York, 1994), based in part on collaborations with Rudolf W. Preisendorfer, for a more modern work in
this field. What we call radiance and spectral radiance, Chandrasekhar calls, respectively, intensity and specific
intensity.


by the distant source at the position of the interferometer. This procedure amounts to nothing
more than mentally associating dΩ with $L(\vec{\varepsilon}, \sigma)$ in Eq. (5.2a) to get

$$ dE_\sigma = [ L(\vec{\varepsilon}, \sigma) \cdot d\Omega ] \cdot dt \cdot dA \cdot d\sigma . $$

Writing this equality as

$$ \frac{dE_\sigma}{dt \cdot dA \cdot d\sigma} = [ L(\vec{\varepsilon}, \sigma) \cdot d\Omega ] $$

makes it easy to see why multiplying the spectral radiance of the pencil ray by dΩ gives the
optical energy of the plane wave per unit time per unit area per unit wavenumber interval.

5.3 Radiance, Brightness, and the Inverse-Square Law


One interesting consequence of the radiance l being constant along any pencil ray is that we can
immediately identify the radiance with our subjective sensation of the “brightness” of an
illuminated or luminous surface. No matter what the distance between the observer and the
surface patch in Fig. 5.5, we know that, as long as the surface patch is close enough for its shape
to be discerned, the surface brightness remains the same. This is, of course, easily explained by
noting that the radiance along any pencil ray between the surface patch and the eye of the
observer does not change. We hasten to add that the radiance turns out not to be exactly the same
as the subjective notion of brightness because, when measuring radiance, all the energy in the ray
must be recorded no matter what its wavelength, and the human eye is more sensitive to some
wavelengths of visible light than to others.
Figure 5.5 also shows how to recover the well-known inverse-square law for the amount of
radiant energy perceived by an observer. Although the observer sees any point on the surface
patch as equally bright at positions A and B, the surface patch itself—or, to be more precise, its
image inside the observer’s eye—shrinks as the distance increases. If the surface patch has an
area asurf , the eye’s pupil has an area a pupil , and we assign all the pencil rays coming from the
surface patch the same radiance l surf , then Eq. (5.4b) above states that the radiant energy entering
the eye at position A in time dt is

$$ dE_A = l_{surf} \cdot a_{pupil} \cdot d\Omega_A \cdot dt , \tag{5.7a} $$

where

$$ d\Omega_A = \frac{a_{surf}}{r_A^2} \tag{5.7b} $$

is the solid angle subtended by the surface patch at position A.


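FIGURE 5.5. [Diagram: an observer views the same surface patch from position A at distance rA and from position B at distance rB; insets show what the observer at point A sees and what the observer at point B sees.]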

Similarly, the radiant energy entering the eye at position B in time dt is

$$ dE_B = l_{surf} \cdot a_{pupil} \cdot d\Omega_B \cdot dt , \tag{5.8a} $$

where

$$ d\Omega_B = \frac{a_{surf}}{r_B^2} \tag{5.8b} $$

is the solid angle subtended by the surface patch at position B. Substitution of (5.7b) into (5.7a)
and (5.8b) into (5.8a) gives

$$ \frac{dE_A}{dt} = P_A = \frac{l_{surf} \cdot a_{pupil} \cdot a_{surf}}{r_A^2} \tag{5.9a} $$

and

$$ \frac{dE_B}{dt} = P_B = \frac{l_{surf} \cdot a_{pupil} \cdot a_{surf}}{r_B^2} , \tag{5.9b} $$


where PA and PB are, respectively, the radiant power entering the observer’s eye at positions A
and B. This result can be written as
$$ \frac{P_B}{P_A} = \left( \frac{r_A^2}{r_B^2} \right) , \tag{5.9c} $$

showing how the familiar inverse-square law for radiant power hides inside the rule that the
radiance along any pencil ray is constant.
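A short numerical check makes the step from constant radiance to Eq. (5.9c) concrete. This is only a sketch under assumed values: the radiance, pupil area, patch area, and the two viewing distances are hypothetical, and the function name `power_into_pupil` is ours.

```python
def power_into_pupil(l_surf, a_pupil, a_surf, r):
    """Radiant power entering the pupil at distance r, as in Eqs. (5.9a) and (5.9b)."""
    d_omega = a_surf / r**2          # solid angle of the patch, Eqs. (5.7b) and (5.8b)
    return l_surf * a_pupil * d_omega

# Hypothetical illustrative values (not taken from the text).
l_surf = 80.0        # radiance of the surface patch, the same at A and B
a_pupil = 2.0e-5     # area of the eye's pupil
a_surf = 1.0e-2      # area of the surface patch
r_A, r_B = 3.0, 12.0 # observer distances at positions A and B

P_A = power_into_pupil(l_surf, a_pupil, a_surf, r_A)
P_B = power_into_pupil(l_surf, a_pupil, a_surf, r_B)

print(P_B / P_A, (r_A / r_B) ** 2)
```

The radiance `l_surf` is identical in both calls; only the solid angle changes with distance, so the ratio of the two powers reduces to the ratio of squared distances in Eq. (5.9c).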
The idea that the interior points of a surface patch can have a brightness only makes sense
when the observer is near enough to resolve—or “see”—the shape of the surface patch. When the
observer is so distant that the surface patch is just a point of light, we say that the image of the
surface patch is unresolved. Now the brightness of that point of light follows the inverse-square
law directly by growing ever dimmer as the distance between the observer and surface patch
increases. The “brightness” of an unresolved point source, then, depends not on the radiance of
the pencil ray emanating from that source but rather on the total radiant power entering the
observer’s eye.

5.4 The Balanced Signal of a Michelson Interferometer


Suppose a pencil ray from a distant object passes through an idealized Michelson interferometer
with a beam having a circular cross section. The object is so far away that it acts like an
unresolved point source and to the naked eye it looks like a bright star. This means, according to
the work done in Sec. 5.3, that the total power entering the naked eye determines the perceived
brightness of the source. To keep things simple, we assume the radiation in the pencil ray is
unpolarized.
We unbundle the pencil ray inside the interferometer into a collection of parallel rays as
shown in Fig. 5.6, turning it into a single plane wave of the type discussed in Chapter 4. The
plane wave’s propagation vector is parallel to the interferometer’s optical axis. We have, using
the notation of Eq. (4.135f) of Chapter 4, that cos α ε = 1 because α ε , the propagation angle with
respect to the optical axis, is zero. The only source of radiation present in the system is the pencil
ray, so we can take the interferometer’s field of view to be the extremely small solid angle ∆Ω
subtended by the distant source at the position of the interferometer. Now the appropriate formula
to pull from Chapter 4 to describe the radiant power passing through the interferometer is
(4.140a). This equation can be written as


$$ P_{bal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma) \, [ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi) ] \, d\sigma , \tag{5.10a} $$

where
S (σ ) = A ⋅ ∆Ω ⋅η (σ ) ⋅ L(σ ) (5.10b)


FIGURE 5.6. [Diagram: parallel rays coming from a distant point source pass through an interferometer with an ideal beam splitter, a fixed mirror, and a moving mirror displaced by a = χ/2.]


and
$$ M(R\sigma\theta_{ma}) = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} . \tag{5.10c} $$

For an ideal interferometer the beam-splitter efficiency η is always one and so S(σ) specified by
(5.10b) is the same S(σ) specified by Eq. (1.19d) in Chapter 1 and (4.140c) in Chapter 4. Function
Pbal(χ) gives the optical power in the balanced interference signal coming from the point source,
and we often call Pbal the balanced interference signal when context makes it clear what is meant.
Figure 5.6 shows the source observed through the interferometer by the unaided eye, so in Eq.
(5.10b) the effective cross-sectional area A of the interferometer beam is the area of the eye’s
circular entrance pupil and L(σ), of course, is the spectral radiance of the pencil ray from the
distant source. The beam-splitter efficiency η(σ), which is a function of wavenumber, reminds us
that radiation of wavenumber σ only contributes to the interference signal to the extent that it
penetrates the beam splitter; wavenumbers for which the beam splitter is opaque so that η = 0,
for example, cannot contribute to Pbal(χ). As is pointed out in the discussion following Eq.
(4.136i) of Chapter 4, we expect that
0 < η < 1

for realistic interferometers. In Eq. (5.10c) the radius R of the eye’s circular entrance pupil is
related to the pupil area by the standard formula

$$ R = \sqrt{\frac{A}{\pi}} , \tag{5.10d} $$

and θ ma is the misalignment angle of the moving mirror. For an ideal interferometer that is in
perfect alignment, θ ma = 0 ; and, according to Eq. (4.137k) of Chapter 4, when θ ma is zero

$$ \left. \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} \right|_{\theta_{ma} = 0} = M(0) = 1 . \tag{5.10e} $$

This means M = 1 is a shorthand for the assumption that the interferometer is in perfect
alignment. For future use we also note that, according to Eq. (4.137g),

M( − Rσθ ma ) = M(Rσθ ma ) , (5.10f)

making M an even function of wavenumber σ. Figure 4.24 of Chapter 4 shows that


M ≤1 (5.10g)

always. Unless otherwise stated, we assume in this chapter that M is constant, postponing until
the next chapter any discussion of what happens when M changes while the interferometer is
measuring spectra.
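The properties of M quoted in Eqs. (5.10e) through (5.10g) can be checked numerically. The sketch below uses only the Python standard library, so it evaluates the Bessel function J1 from Bessel's integral with a trapezoid rule; that evaluation method, the function names `bessel_j1` and `modulation`, and the sampled range of Rσθma are our own choices for illustration, not something taken from the text.

```python
import math

def bessel_j1(x, n=400):
    """Approximate J1(x) from Bessel's integral
    J1(x) = (1/pi) * integral over [0, pi] of cos(tau - x*sin(tau)) dtau,
    using the composite trapezoid rule with n panels."""
    h = math.pi / n
    f = lambda tau: math.cos(tau - x * math.sin(tau))
    total = 0.5 * (f(0.0) + f(math.pi))
    for k in range(1, n):
        total += f(k * h)
    return total * h / math.pi

def modulation(z):
    """M of Eq. (5.10c), written as a function of z = R * sigma * theta_ma."""
    if abs(z) < 1e-8:
        return 1.0                      # limiting value, Eq. (5.10e)
    return bessel_j1(4.0 * math.pi * z) / (2.0 * math.pi * z)

# Sample M over a symmetric range of z (hypothetical range, for illustration).
samples = [modulation(0.01 * k) for k in range(-100, 101)]

print(modulation(0.0), modulation(0.1), modulation(-0.1))
```

In this sketch M equals one at perfect alignment, takes the same value at z and at −z, and never exceeds one over the sampled range, in agreement with Eqs. (5.10e), (5.10f), and (5.10g).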
To show how formula (5.10a) works, we choose a specific shape for the spectral radiance
L(σ), making the idealization that η = M = 1 at all σ for which L(σ) ≠ 0. Figure 5.7 specifies the
shape of L(σ), and according to Eq. (5.10b) the single-sided power spectrum

S(σ) = A ⋅ ∆Ω ⋅ L(σ)

must have the same shape as L because A and ∆Ω are constant.

FIGURE 5.7. [Plot of the spectral radiance L(σ) versus σ (in cm-1) between σmin and σmax, with the peak value Lmax marked on the vertical axis.]


When a = 0 in Fig. 5.6, the optical path difference, which is

χ = 2a (5.11a)

for this interferometer, is also zero. This means that when a = 0 the moving mirror is at the zero-
path difference (ZPD) position shown by the dashed line in Fig. 5.6. From (5.10a) we see that at
ZPD when η = M = 1

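$$ P_{bal}(0) = \frac{1}{2} \int_0^{\infty} S(\sigma) \, (1 + W) \, d\sigma = \begin{cases} P_{inp} & \text{for } W = 1 \\ 0 & \text{for } W = -1 \end{cases} , \tag{5.11b} $$

where

$$ \text{input radiant power in pencil ray} = P_{inp} = \int_0^{\infty} S(\sigma) \, d\sigma . \tag{5.11c} $$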

Evaluating (5.10a) when η = M = 1 for all nonzero values of χ = 2a , we get the two different
Pbal curves shown in Figs. 5.8(a) and 5.8(b). When χ = 2a = 0 and W = 1 , we see that Pbal (0) in
Eq. (5.11b) specifies the maximum possible value for the interference signal; and when W = −1 ,
we see that Pbal (0) specifies the minimum possible value for the interference signal. The observer
in Fig. 5.6 sees the starlike source disappear when the pencil ray passes through an ideal
interferometer that has its moving mirror at ZPD and a beam splitter with W = −1 . When the
pencil ray passes through an ideal interferometer whose beam splitter has W = 1 , the observer
sees the full brightness of the starlike source when the moving mirror is at ZPD. If χ is changed
by shifting the moving mirror, both Figs. 5.8(a) and 5.8(b) show how, for this ideal
interferometer, the source brightness seen by the observer oscillates around Pinp/2, half the full
brightness of the starlike source. We note that when a and χ are positive (that is, when
a = χ/2 > 0), the moving mirror is more distant from the beam splitter than it is at ZPD; and
when a and χ are negative (that is, when a = χ/2 < 0), the moving mirror is closer to the beam
splitter than it is at ZPD. Because
cos(−2πσχ ) = cos(2πσχ ) ,

Eq. (5.10a) also shows that


Pbal (− χ ) = Pbal ( χ ) , (5.12)

which means that Pbal is an even function of the optical-path difference. Consequently the
observer sees the same source brightness when the moving mirror moves off ZPD and away from
the beam splitter by a distance |a| = |χ|/2 as he does when the moving mirror moves off ZPD and
closer to the beam splitter by the same distance.
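The behavior described above can be reproduced by evaluating Eq. (5.10a) with η = M = 1 for a model spectrum. The following is a rough sketch: the Gaussian band standing in for S(σ), its center and width, the integration limits, and the trapezoid-rule helper `integrate` are all assumptions made for illustration and are not the spectrum of Fig. 5.7.

```python
import math

def spectrum(sigma):
    """Hypothetical single-sided power spectrum S(sigma): a Gaussian band
    centered at 1500 cm^-1 with a 200 cm^-1 standard deviation."""
    return math.exp(-0.5 * ((sigma - 1500.0) / 200.0) ** 2)

def integrate(f, lo, hi, n=4000):
    """Composite trapezoid rule on [lo, hi] with n panels."""
    h = (hi - lo) / n
    return h * (0.5 * (f(lo) + f(hi)) + sum(f(lo + k * h) for k in range(1, n)))

SIGMA_LO, SIGMA_HI = 0.0, 3000.0   # cm^-1; the model spectrum is negligible outside

def p_bal(chi, W):
    """Balanced interference signal of Eq. (5.10a) with eta = M = 1; chi in cm."""
    integrand = lambda s: spectrum(s) * (1.0 + W * math.cos(2.0 * math.pi * s * chi))
    return 0.5 * integrate(integrand, SIGMA_LO, SIGMA_HI)

p_inp = integrate(spectrum, SIGMA_LO, SIGMA_HI)   # Eq. (5.11c)

print(p_bal(0.0, 1) / p_inp, p_bal(0.0, -1) / p_inp)      # ZPD values, Eq. (5.11b)
print(p_bal(0.002, 1) / p_inp, p_bal(-0.002, 1) / p_inp)  # evenness, Eq. (5.12)
print(p_bal(0.02, 1) / p_inp)                             # far from ZPD
```

For this model spectrum the ZPD signal is the full input power when W = 1 and zero when W = −1, the signal is the same at +χ and −χ, and far from ZPD it settles at half the input power.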


FIGURE 5.8(a). [for W = −1] [Plot of the balanced interference signal Pbal(χ) versus χ (in cm), with the levels Pinp and Pinp/2 marked on the vertical axis.]

FIGURE 5.8(b). [for W = 1] [Plot of the balanced interference signal Pbal(χ) versus χ (in cm), with the levels Pinp and Pinp/2 marked on the vertical axis.]


Following the notation introduced in Eq. (4.141a) of Chapter 4, we say the ideal interferogram
of the balanced power spectrum is

$$ I_{bal}^{(ideal)}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma) \cos(2\pi\sigma\chi) \, d\sigma . \tag{5.13a} $$

When η = M = 1, we can write the interference signal in Eq. (5.10a) as

$$ P_{bal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma) \, d\sigma + W \cdot I_{bal}^{(ideal)}(\chi) \tag{5.13b} $$

or, using Eq. (5.11c),

$$ P_{bal}(\chi) = \frac{1}{2} P_{inp} + W \cdot I_{bal}^{(ideal)}(\chi) . \tag{5.13c} $$

Figure 5.8(c) shows the $I_{bal}^{(ideal)}(\chi)$ interferogram that corresponds to both the W = 1 and the
W = −1 interference signals.

FIGURE 5.8(c). [Plot of the ideal balanced interferogram $I_{bal}^{(ideal)}(\chi)$ versus χ (in cm), with the levels Pinp/2 and −Pinp/2 marked on the vertical axis.]


Since there are values of χ for which

$$ P_{bal}(\chi) < (1/2) P_{inp} , $$

the interferogram $I_{bal}^{(ideal)}$ takes on negative as well as positive values. A negative interferogram
value does not mean the total optical power reaching the observer has gone negative—this cannot
ever happen, of course—but just that the interference signal has dropped below Pinp/2. One easy
way to keep track of the distinction between the interferogram signal and the interference signal
is to remember that the interferogram signal has negative values whereas the interference signal
is never negative. Because, according to Eq. (5.12),

Pbal (− χ ) = Pbal ( χ ) ,

we can conclude from Eq. (5.13c) that

$$ W \cdot I_{bal}^{(ideal)}(-\chi) = P_{bal}(-\chi) - \frac{1}{2} P_{inp} = P_{bal}(\chi) - \frac{1}{2} P_{inp} = W \cdot I_{bal}^{(ideal)}(\chi) $$

or

$$ I_{bal}^{(ideal)}(-\chi) = I_{bal}^{(ideal)}(\chi) . \tag{5.14} $$

Hence both Pbal and $I_{bal}^{(ideal)}$ are even functions of χ. Since the interference signal Pbal approaches
Pinp/2 as χ gets large in Figs. 5.8(a) and 5.8(b), the balanced interferogram

$$ I_{bal}^{(ideal)}(\chi) = \frac{P_{bal}(\chi) - (1/2) P_{inp}}{W} \tag{5.15} $$

approaches zero for large values of χ in Fig. 5.8(c). This behavior is typical of all
interferograms; the only way to avoid it is to make the power spectrum a delta function,

S (σ ) = S0 ⋅ δ (σ − σ 0 ) , (5.16a)

of the type discussed in Sec. 2.14 of Chapter 2 [see also Fig. 5.9(a)]. This delta function
represents monochromatic light of wavenumber σ 0 coming from the distant source. Equation
(5.11c) now requires Pinp = S0 , so according to Eqs. (5.13a) and (5.13c) the balanced interference
signal Pbal becomes


$$ P_{bal}(\chi) = \frac{S_0}{2} + \frac{W S_0}{2} \cdot \cos(2\pi\sigma_0\chi) = \frac{S_0}{2} [ 1 + W \cos(2\pi\sigma_0\chi) ] , \tag{5.16b} $$

which is plotted in Figs. 5.9(b) and 5.9(c) for W = 1 and W = −1. Equation (5.15) gives the
associated interferogram

$$ I_{bal}^{(ideal)}(\chi) = \frac{S_0}{2} \cdot \cos(2\pi\sigma_0\chi) , \tag{5.16c} $$

which we plot in Fig. 5.9(d). Formula (5.16b) is clearly identical to Eq. (1.17d) in Chapter 1 after
we set up the correspondences

$$ I_{fi}^{(cb)} \Leftrightarrow P_{bal} , \quad I_{fi}^{(0)} \Leftrightarrow S_0 , \quad \text{and} \quad \sigma_{fi} \Leftrightarrow \sigma_0 . $$

This ideal delta-function spectrum can be approximated by passing a laser through the
interferometer, producing interferograms resembling the one shown in Fig. 5.9(d). Even lasers,
however, have a small but finite spread in their power spectra, causing their interferograms to
approach zero at extremely large values of χ .
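The effect of a small but finite spectral spread can be illustrated by replacing the delta function in Eq. (5.16a) with a narrow line and evaluating Eq. (5.13a). The sketch below assumes a Gaussian line shape, which the text does not specify; the line center, width, and total power are hypothetical values. For a Gaussian line whose width is much smaller than its center wavenumber, the interferogram is the cosine of Eq. (5.16c) multiplied by a Gaussian envelope, and the code compares that closed form against direct numerical integration.

```python
import math

S0 = 1.0          # total power in the line (arbitrary units)
SIGMA0 = 1000.0   # line center in cm^-1 (hypothetical)
DELTA = 5.0       # Gaussian standard deviation in cm^-1 (hypothetical)

def line_spectrum(sigma):
    """Narrow Gaussian line, normalized so that its integral over sigma is S0."""
    norm = S0 / (DELTA * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((sigma - SIGMA0) / DELTA) ** 2)

def interferogram_numeric(chi, n=4000):
    """Eq. (5.13a) by the trapezoid rule over SIGMA0 +/- 10*DELTA; chi in cm."""
    lo, hi = SIGMA0 - 10.0 * DELTA, SIGMA0 + 10.0 * DELTA
    h = (hi - lo) / n
    f = lambda s: line_spectrum(s) * math.cos(2.0 * math.pi * s * chi)
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + k * h) for k in range(1, n))
    return 0.5 * h * total

def interferogram_analytic(chi):
    """Cosine of Eq. (5.16c) times the Gaussian envelope exp(-2*(pi*DELTA*chi)**2)."""
    envelope = math.exp(-2.0 * (math.pi * DELTA * chi) ** 2)
    return 0.5 * S0 * math.cos(2.0 * math.pi * SIGMA0 * chi) * envelope

for chi in (0.0, 0.01, 0.05, 0.2):
    print(chi, interferogram_numeric(chi), interferogram_analytic(chi))
```

Near ZPD the result is indistinguishable from the pure cosine of Eq. (5.16c); the envelope only becomes noticeable once χ is comparable to 1/DELTA, which is why a narrower line keeps its fringes out to larger optical path differences.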

FIGURE 5.9(a). [Plot of the delta-function power spectrum S(σ) versus σ, with a single spike at σ = σ0.]


FIGURE 5.9(b). [for W = 1] [Plot of Pbal(χ) versus χ, with the levels S0, S0/2, and −S0/2 marked on the vertical axis and the fringe spacing 1/σ0 indicated.]

FIGURE 5.9(c). [for W = −1] [Plot of Pbal(χ) versus χ, with the levels S0, S0/2, and −S0/2 marked on the vertical axis and the fringe spacing 1/σ0 indicated.]


FIGURE 5.9(d). [Plot of the interferogram $I_{bal}^{(ideal)}(\chi)$ versus χ, with the levels S0, S0/2, and −S0/2 marked on the vertical axis and the fringe spacing 1/σ0 indicated.]

Having separated the balanced interference signal Pbal(χ) for the ideal interferometer into a
constant term Pinp/2 and an ideal interferogram $I_{bal}^{(ideal)}(\chi)$, we note that a similar procedure can
be followed with respect to the nonideal interference signal where 0 < η < 1 and M < 1.
Equation (5.10a) can be written as

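$$ P_{bal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma) \, d\sigma + \frac{W}{2} \int_0^{\infty} S(\sigma) \, M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi) \, d\sigma = P_0 / 2 + W \cdot I_{bal}(\chi) \tag{5.17a} $$

or

$$ I_{bal}(\chi) = \frac{P_{bal}(\chi) - (1/2) P_0}{W} , \tag{5.17b} $$

where, applying Eq. (5.10b),

$$ P_0 = \int_0^{\infty} S(\sigma) \, d\sigma = \int_0^{\infty} A \, \Delta\Omega \, L(\sigma) \, \eta(\sigma) \, d\sigma \tag{5.17c} $$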
and



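$$ I_{bal}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma) \, M(R\sigma\theta_{ma}) \cos(2\pi\sigma\chi) \, d\sigma = \frac{1}{2} \int_0^{\infty} A \, \Delta\Omega \, L(\sigma) \, \eta(\sigma) \, M(R\sigma\theta_{ma}) \cos(2\pi\sigma\chi) \, d\sigma . \tag{5.17d} $$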

Although P0 in Eq. (5.17c) looks superficially like Pinp in Eq. (5.11c), since it too can be written
as

$$ \int_0^{\infty} S(\sigma) \, d\sigma , $$

the constant power level P0 is not the same as Pinp because now η(σ) < 1 in Eq. (5.10b), making
P0 less than the radiant power Pinp of the pencil ray entering the interferometer. Similarly Ibal(χ)
in formula (5.17d) becomes, for χ = 0,

$$ I_{bal}(0) = \frac{1}{2} \int_0^{\infty} A \, \Delta\Omega \, L(\sigma) \, \eta(\sigma) \, M(R\sigma\theta_{ma}) \, d\sigma = \frac{1}{2} \int_0^{\infty} S(\sigma) \, M(R\sigma\theta_{ma}) \, d\sigma , $$

which means (since M < 1 in this nonideal case) that we cannot expect to have Pbal(0) be
either P0 or zero for W = 1 or W = −1 respectively. Nevertheless, in a well-designed
interferometer, both η and M are reasonably close to one for the wavenumbers of interest, and the
balanced signal of a nonideal interferometer usually behaves much the same as the balanced
signal of the ideal interferometer. In fact the symmetry properties with respect to χ of the ideal
balanced signal (that the balanced interference signal and balanced interferogram are even
functions of the optical path difference) apply as well to the nonideal case where 0 < η < 1 and
M < 1 because neither η nor M depends on χ. Hence the same reasoning already used to derive
Eqs. (5.12) and (5.14) can also be applied to this nonideal case to get

Pbal (− χ ) = Pbal ( χ ) for 0 < η < 1 and M < 1 (5.18a)

and
I bal (− χ ) = I bal ( χ ) for 0 < η < 1 and M < 1 . (5.18b)


5.5 The Unbalanced Signal of a Michelson Interferometer


At large values of the optical-path difference χ, all balanced interferograms, ideal and nonideal
(even those generated by lasers), approach zero. Returning to the ideal case where η = M = 1, we
see that if the ideal interferogram is zero in Eq. (5.13c), then the optical power Pbal in the
balanced interference signal becomes Pinp/2, half the original Pinp power entering the
interferometer. Hence, for large χ values, the observer in Fig. 5.6 sees the distant source at half its
true brightness when looking through the interferometer. This raises the question of where the
optical power unseen by the observer goes.
When analyzing the background signal in Sec. 4.17 of Chapter 4, we saw that the balanced
and unbalanced signals had to contain all the unabsorbed background power entering the
interferometer [see discussion following Eq. (4.154) of Chapter 4]. The same must be true for the
optical power of the distant point source in Fig. 5.6; and with this clue we realize, following the
same reasoning used in Sec. 4.17 of Chapter 4 and in the discussion following Eq. (1.18a) in
Chapter 1, that the missing optical power goes back out the interferometer’s entrance aperture as
an unbalanced and unseen optical signal. Figure 5.10 shows that the unbalanced signal comes
from the interference of those rays that reflect twice off the beam splitter—at the beginning and
end of their trip up and back the moving-mirror arm—with those rays that transmit twice through
the beam splitter—at the beginning and end of their trip up and back the fixed-mirror arm. Using
conservation of energy, we note that the unbalanced signal power Punb ( χ ) and the balanced
signal power Pbal ( χ ) , for the ideal interferometer with η = M = 1 , must add up to Pinp, the input
radiant power entering the system:

Punb ( χ ) + Pbal ( χ ) = Pinp . (5.19a)

Substitution of (5.10a) with η = M = 1 and (5.11c) into (5.19a) then gives

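$$ P_{unb}(\chi) = \int_0^{\infty} S(\sigma) \, d\sigma - \frac{1}{2} \int_0^{\infty} S(\sigma) \, [ 1 + W \cos(2\pi\sigma\chi) ] \, d\sigma $$

or

$$ P_{unb}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma) \, [ 1 - W \cos(2\pi\sigma\chi) ] \, d\sigma . \tag{5.19b} $$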

Comparing this result to Eq. (5.10a) with η = M = 1, we see that, at this level of idealization,
going from the balanced to the unbalanced interference signal is the same as changing the sign of
W. Consulting Figs. 5.8(a) and 5.8(b), we see that when 5.8(a) is the balanced interference signal,


FIGURE 5.10. [Diagram: parallel rays coming from a distant point source pass through an interferometer with an ideal beam splitter, a fixed mirror, and a moving mirror.]

The dashed lines show the rays going back out the front aperture as an unbalanced interference
signal. This unbalanced interference signal cannot be seen by the observer.


then 5.8(b) is the unbalanced interference signal; and when 5.8(b) is the balanced interference
signal, then 5.8(a) is the unbalanced interference signal. Following the pattern of Eq. (5.13c), we
can define an ideal unbalanced interferogram $I_{unb}^{(ideal)}(\chi)$ for the unbalanced optical signal by
saying that

$$ P_{unb}(\chi) = \frac{1}{2} P_{inp} + W \cdot I_{unb}^{(ideal)}(\chi) \tag{5.19c} $$

so that

$$ I_{unb}^{(ideal)}(\chi) = \frac{P_{unb}(\chi) - (1/2) P_{inp}}{W} = -\frac{1}{2} \int_0^{\infty} S(\sigma) \cos(2\pi\sigma\chi) \, d\sigma . \tag{5.19d} $$

The sign convention chosen for the balanced and unbalanced interferograms in Eqs. (5.13a) and
(5.19d) specifies a positive ZPD peak for the balanced interferogram,

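$$ \left. I_{bal}^{(ideal)} \right|_{\chi = 0} = \frac{1}{2} \int_0^{\infty} S(\sigma) \, d\sigma \ge 0 , \tag{5.20a} $$

and a negative ZPD peak for the unbalanced interferogram,

$$ \left. I_{unb}^{(ideal)} \right|_{\chi = 0} = -\frac{1}{2} \int_0^{\infty} S(\sigma) \, d\sigma \le 0 . \tag{5.20b} $$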

The qualitative behavior of the nonideal unbalanced interference signal and nonideal unbalanced
interferogram is, in a well-designed interferometer, very similar to the behavior of the ideal
unbalanced interference signal and ideal unbalanced interferogram. Note that although the shapes
of the balanced and unbalanced interference signals depend on the sign of W, the shapes of the
balanced and unbalanced interferograms do not.82
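The relations among Eqs. (5.19a) through (5.20b) are easy to verify numerically. In the sketch below the continuous spectrum S(σ) is replaced by a handful of discrete spectral lines so that the integrals become sums; that substitution, the line positions and strengths, and the function names are assumptions made purely for illustration.

```python
import math

# Hypothetical line spectrum: (wavenumber in cm^-1, power in that line).
LINES = [(800.0, 1.0), (1200.0, 0.5), (1750.0, 0.25)]
P_INP = sum(S for _, S in LINES)          # Eq. (5.11c) for a line spectrum

def p_bal(chi, W):
    """Ideal balanced signal, Eq. (5.10a) with eta = M = 1."""
    return 0.5 * sum(S * (1.0 + W * math.cos(2.0 * math.pi * s * chi)) for s, S in LINES)

def p_unb(chi, W):
    """Ideal unbalanced signal, Eq. (5.19b)."""
    return 0.5 * sum(S * (1.0 - W * math.cos(2.0 * math.pi * s * chi)) for s, S in LINES)

def i_bal(chi, W):
    """Ideal balanced interferogram from Eq. (5.15)."""
    return (p_bal(chi, W) - 0.5 * P_INP) / W

def i_unb(chi, W):
    """Ideal unbalanced interferogram from Eq. (5.19d)."""
    return (p_unb(chi, W) - 0.5 * P_INP) / W

for chi in (0.0, 0.0003, 0.0011, 0.0047):
    for W in (1.0, -1.0):
        print(chi, W, p_bal(chi, W) + p_unb(chi, W), i_bal(chi, W), i_unb(chi, W))
```

The two signals always sum to the input power, the unbalanced interferogram is the negative of the balanced one, and neither interferogram changes when W changes sign, even though the interference signals themselves do.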

82
Section 4.17 of Chapter 4 derives the formulas for the nonideal unbalanced interference signal of the
interferometer’s background radiance because it contributes to the total radiant power reaching the detector.
The same procedures can be used to derive formulas for the nonideal unbalanced interference signal of the
interferometer’s input radiance. For the interferometer designs analyzed here, this signal is of much less interest
because it goes back out the interferometer’s entrance aperture and has no effect on the total radiant power reaching
the interferometer’s detector. There do exist interferometers, like the one shown in Fig. 1.19c of Chapter 1, for which
both types of formula are relevant.


5.6 The Off-Axis Signal of a Michelson Interferometer


Suppose there is another distant source in addition to the one present in Fig. 5.6, with the pencil
ray coming from the second source making an angle βs with respect to the pencil ray coming from
the first. An observer looking directly at these distant sources sees two “stars in the sky”
separated by an angular distance βs. When an observer looks at these two distant sources through
an interferometer, as shown in Fig. 5.11, they still look like two stars in the sky separated by an
angular distance βs, but now their brightness depends on the position of the interferometer’s
moving mirror. We unbundle the pencil rays from these two sources as they enter the
interferometer to form the two sets of parallel rays shown in Fig. 5.11. The unbundled rays from
the first source are parallel to the interferometer’s optical axis and the unbundled rays from the
second source are at an angle βs to the optical axis.
We have already discussed how the brightness of the on-axis source varies with the optical
path difference χ for an ideal interferometer according to the formula [see Eq. (5.10a) with
η = M = 1]
$$ P_{bal}^{(0)}(\chi) = \frac{1}{2} \int_0^{\infty} S^{(0)}(\sigma) \, [ 1 + W \cos(2\pi\sigma\chi) ] \, d\sigma . \tag{5.21a} $$

Here, the superscript (0) has been added to show that the balanced interference signal Pbal and the
spectrum S refer to the point source whose rays are parallel to—that is, at a zero angle to—the
optical axis. To get the corresponding formula for the off-axis point source, we use Eq. (4.137i)
from Chapter 4 to write

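$$ P_{bal}^{(\beta_s)}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view for } \beta_s \text{ point source}} d^2\varepsilon \; \eta(\sigma) \, \mathcal{L}^{(\beta_s)}(\sigma) \, [ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi \cos\alpha_\varepsilon) ] , \tag{5.21b} $$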

where again, using Eq. (5.10c), we say that

$$ M(R\sigma\theta_{ma}) = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} . $$

The superscript (βs) is added to show that Pbal and the spectral radiance refer only to the off-axis
source, the one whose rays are at an angle βs to the optical axis. The effective cross-sectional area
of the interferometer beam is still A, the area of the eye’s entrance pupil; and R in the formula for
M is still the radius of the eye’s entrance pupil so that R = √(A/π). The relevant field of view, however,


FIGURE 5.11. [Diagram: interferometer with an ideal beam splitter, a fixed mirror, and a moving mirror. The parallel rays coming from a distant, on-axis point source are shown with solid arrows. The parallel rays coming from a distant, off-axis point source, at an angle βs to the optical axis, are shown with dashed arrows.]

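is now $\Delta\Omega^{(\beta_s)}$, the extremely small solid angle subtended by the second distant source at the
position of the interferometer. Recognizing that $\alpha_\varepsilon \cong \beta_s$ for all the rays coming from this distant,
off-axis source, we perform the integral over $d^2\varepsilon$ in (5.21b) to get

$$ P_{bal}^{(\beta_s)}(\chi) = \frac{A \, \Delta\Omega^{(\beta_s)}}{2} \int_{-\infty}^{\infty} \eta(\sigma) \, \mathcal{L}^{(\beta_s)}(\sigma) \, [ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi \cos\beta_s) ] \, d\sigma , \tag{5.21c} $$

where $\mathcal{L}$ is the spectral radiance defined over both positive and negative wavenumbers.
Equations (4.136f) and (4.139g) in Chapter 4 require $\mathcal{L}$ and η to be even functions of σ; Eq.
(5.10f) shows that M is another even function of σ; and the cosine is also even. Therefore the
product

$$ \eta(\sigma) \, \mathcal{L}^{(\beta_s)}(\sigma) \, [ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi \cos\beta_s) ] $$

must be an even function of σ, which means that, according to Eq. (2.19) in Chapter 2,

$$ \int_{-\infty}^{\infty} \eta(\sigma) \, \mathcal{L}^{(\beta_s)}(\sigma) \, [ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi \cos\beta_s) ] \, d\sigma = 2 \int_0^{\infty} \eta(\sigma) \, \mathcal{L}^{(\beta_s)}(\sigma) \, [ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi \cos\beta_s) ] \, d\sigma . $$

From Eq. (4.136g) in Chapter 4, we know that the off-axis spectral radiance is

$$ L^{(\beta_s)}(\sigma) = 2 \, \mathcal{L}^{(\beta_s)}(\sigma) , $$

where the superscript (βs) is added to show that we are only interested in the pencil ray entering
the interferometer at an angle βs to the optical axis. This lets us write

$$ \int_{-\infty}^{\infty} \eta(\sigma) \, \mathcal{L}^{(\beta_s)}(\sigma) \, [ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi \cos\beta_s) ] \, d\sigma = \int_0^{\infty} \eta(\sigma) \, L^{(\beta_s)}(\sigma) \, [ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi \cos\beta_s) ] \, d\sigma . $$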

Substitution of this last result into (5.21c) gives


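$$ P_{bal}^{(\beta_s)}(\chi) = \frac{1}{2} \int_0^{\infty} S^{(\beta_s)}(\sigma) \, [ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi \cos\beta_s) ] \, d\sigma , \tag{5.21d} $$

where

$$ S^{(\beta_s)}(\sigma) = A \, \Delta\Omega^{(\beta_s)} \, L^{(\beta_s)}(\sigma) \, \eta(\sigma) . \tag{5.21e} $$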

For the ideal interferometer with η = M = 1 , this becomes


$$ P_{bal}^{(\beta_s)}(\chi) = \frac{1}{2} \int_0^{\infty} S^{(\beta_s)}(\sigma) \, [ 1 + W \cos(2\pi\sigma\chi \cos\beta_s) ] \, d\sigma \tag{5.21f} $$

where

$$ S^{(\beta_s)}(\sigma) = A \, \Delta\Omega^{(\beta_s)} \, L^{(\beta_s)}(\sigma) . \tag{5.21g} $$

Comparing Eq. (5.21f) for the ideal off-axis case to Eq. (5.21a) for the ideal on-axis case shows
that the only effect of the off-axis passage through the interferometer is to multiply σχ by cos βs
and to replace $S^{(0)}$ by $S^{(\beta_s)}$.
Equations (5.21f) and (5.21g) for the off-axis source can be compared to Eq. (5.21a) for the
on-axis source under the assumption that both sources are the same size, have the same spectral
radiance L(σ), and are at the same distance from the interferometer. Both sources then pass the
same power spectrum S(σ) through the interferometer so that


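$$ P_{bal}^{(0)}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma) \, [ 1 + W \cos(2\pi\sigma\chi) ] \, d\sigma \tag{5.22a} $$

and

$$ P_{bal}^{(\beta_s)}(\chi) = \frac{1}{2} \int_0^{\infty} S(\sigma) \, [ 1 + W \cos(2\pi\sigma\chi \cos\beta_s) ] \, d\sigma . \tag{5.22b} $$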

Comparing these two formulas, we see that

$$ P_{bal}^{(\beta_s)}(\chi) = P_{bal}^{(0)}(\chi \cos\beta_s) . \tag{5.23a} $$

The displacement a of the moving mirror from its ZPD position is given by [see Eq. (5.11a)]

$$ a = \chi / 2 . \tag{5.23b} $$

Consequently Eq. (5.23a) can also be written as

$$ P_{bal}^{(\beta_s)}(2a) = P_{bal}^{(0)}(2a \cos\beta_s) . \tag{5.23c} $$

This shows that the balanced interference signal of a distant, on-axis source has the same power
when the moving mirror is displaced from ZPD by a distance (a cos β_s) that an identical distant,
off-axis source has when the moving mirror is displaced from ZPD by a distance a. Another way
of saying this is to note that the on-axis source looks as bright when the moving mirror is
displaced from ZPD by a distance a as the off-axis source does when the moving mirror is
displaced from ZPD by a distance (a / cos β_s). Since (a / cos β_s) > a for β_s ≠ 0, as the moving mirror is
shifted steadily away from ZPD the brightness of the on-axis source predicts the brightness of the
off-axis source: if the on-axis source brightens or dims, we know that soon the same thing will
happen to the off-axis source.
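
The relation between the on-axis and off-axis signals can be checked numerically. The following Python sketch evaluates Eq. (5.22b) with a trapezoidal rule; the Gaussian spectrum, the grid sizes, the numerical values, and the function names are illustrative assumptions that do not come from the text.

```python
import math

def spectrum(sigma, sigma0=1000.0, width=50.0):
    """Hypothetical Gaussian power spectrum S(sigma), sigma in cm^-1 (assumed shape)."""
    return math.exp(-0.5 * ((sigma - sigma0) / width) ** 2)

def p_bal(chi, beta_s=0.0, w=1.0, n=4000, sigma_max=2000.0):
    """Trapezoidal estimate of Eq. (5.22b):
    (1/2) * integral_0^sigma_max S(sigma) [1 + W cos(2 pi sigma chi cos(beta_s))] d sigma."""
    h = sigma_max / n
    total = 0.0
    for k in range(n + 1):
        sigma = k * h
        weight = 0.5 if k in (0, n) else 1.0   # trapezoid end-point weights
        total += weight * spectrum(sigma) * (
            1.0 + w * math.cos(2.0 * math.pi * sigma * chi * math.cos(beta_s)))
    return 0.5 * h * total

beta_s = math.radians(5.0)   # off-axis angle (assumed value)
a = 0.002                    # mirror displacement from ZPD in cm, so chi = 2a
off_axis = p_bal(2 * a, beta_s)
# On-axis source at the smaller displacement a*cos(beta_s), as in Eq. (5.23c)
on_axis_earlier = p_bal(2 * a * math.cos(beta_s))
print(off_axis, on_axis_earlier)
```

Under these assumptions the two printed values agree to rounding error, which is just Eq. (5.23c) restated: the off-axis source at displacement a repeats what the on-axis source did at the smaller displacement a cos β_s.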


We next consider a ring of distant sources surrounding the on-axis source, with all the sources
passing the same power spectrum S(σ) through the interferometer. As shown in Fig. 5.12(a), an
observer looking at the ring sees these sources as a circle of stars, a circle with angular radius β_s
centered on the distant on-axis source. Each source in the ring sends its own group of parallel
rays through the interferometer as shown in Fig. 5.12(b).
Every group of parallel rays passes through the interferometer at the same angle β_s with respect to
the optical axis, so everything previously said about the single off-axis source also applies to the
ring of off-axis sources. As the moving mirror shifts away from ZPD, we know, using the same
reasoning as before, that if the central source brightens or dims then soon the same thing will
happen simultaneously to every source on the off-axis ring. We can imagine filling the entire
“sky” with identical distant sources, as shown in Fig. 5.13.

FIGURE 5.12(a).

FIGURE 5.12(b). Diagram labels: Moving Mirror, Fixed Mirror.

FIGURE 5.13. Diagram labels: Moving Mirror, Fixed Mirror, Ideal Beam Splitter.

Now when the sky is observed directly, not looking through the interferometer, it exhibits a
uniform, featureless glow; but when it is observed indirectly through the interferometer with the
eye focused at infinity—which may require a little practice—the sky becomes a concentric series
of rings at different levels of brightness. These are sometimes called Haidinger rings. The rings
have different levels of brightness because they are at different angular distances from the on-axis
source. The only way to escape this effect is to put the moving mirror at its ZPD position, with
a = χ = 0. According to Eq. (5.22b), the rays at every angle β_s with respect to the optical axis
then all have the same P_bal value; and the observer looking through the interferometer either sees
the same uniform featureless glow seen when looking directly at the source-filled sky (if W = 1 )
or nothing at all (if W = −1 ). As the moving mirror shifts steadily away from ZPD, the region at
the center of the scene changes its brightness first and, then, obeying Eq. (5.23c), this change in
brightness forms a ring that expands and travels out to the edge of the scene. This is, of course,
just a consequence of the on-axis brightness predicting the off-axis brightness, with regions at
larger β_s copying the central brightness after a longer delay as the interference rings form and
expand.
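
The way the rings form at the center and expand can be made concrete for the simplest possible scene. The sketch below assumes, purely for illustration, a monochromatic sky at wavenumber σ0 with W = 1, so that the brightness at angle β is proportional to 1 + cos(2πσ0 χ cos β) with χ = 2a; the function names and numerical values are hypothetical.

```python
import math

def bright_ring_angles(sigma0, a, beta_max):
    """Angles (radians) of the bright rings of a monochromatic scene for W = +1.

    Maxima of 1 + cos(2*pi*sigma0*chi*cos(beta)), with chi = 2a, occur where
    sigma0*chi*cos(beta) equals an integer order m.
    """
    order_on_axis = sigma0 * 2.0 * a
    if order_on_axis <= 0.0:
        return []                      # at ZPD the scene is uniform: no rings
    m_high = math.floor(order_on_axis)
    m_low = max(1, math.ceil(order_on_axis * math.cos(beta_max)))
    return [math.acos(min(1.0, m / order_on_axis))
            for m in range(m_high, m_low - 1, -1)]

def ring_angle(m, sigma0, a):
    """Angular radius of the ring of integer order m at mirror displacement a."""
    return math.acos(m / (sigma0 * 2.0 * a))

sigma0 = 1000.0                        # wavenumber in cm^-1 (assumed value)
rings = bright_ring_angles(sigma0, a=0.005, beta_max=0.5)
print(rings)                           # bright rings inside a 0.5 rad field of view
print(ring_angle(9, sigma0, 0.0050), ring_angle(9, sigma0, 0.0052))
```

In this toy case the ring of a given order m sits at cos β = m/(2aσ0), so increasing a pushes it to larger β, while new rings appear on axis each time 2aσ0 passes through an integer.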
To record these rings in the laboratory, we need only replace the observer’s eye with a camera
focused at infinity. In Fig. 5.14 this camera is shown schematically as a lens and a light-sensitive
surface in the lens’s focal plane. As has already been discussed in Sec. 4.9 of Chapter 4, each
group of parallel rays can be regarded as a single plane wave, and each plane wave reaching the
lens focuses to its own separate and distinct point of light on the light-sensitive surface. In fact
what the light-sensitive surface records is an image of the scene “at infinity,” with each distant
source showing up as a separate point of brightness on the lens’s focal plane. The position of
each bright point on the focal plane corresponds to the angular separations seen by an observer;
for example, the ring of distant sources depicted in Fig. 5.12(a) shows up as a ring of bright
points equidistant from the central bright point representing the on-axis source. In practice the
creation of bright distant sources all having the same spectrum is an awkward and tedious
business; what is done instead is to create a nearby extended source with a uniformly bright
surface having the same spectral radiance everywhere. From the discussion at the end of Sec. 4.2
as well as the discussion following Eq. (4.47b) in Chapter 4, we know that every radiation field
can be thought of as a collection of plane waves propagating in different directions. When the
extended source is placed close to the interferometer, its plane waves fill the interferometer’s
field of view; that is, every point on the light-sensitive surface of the lens’s focal plane represents
a different plane wave generated by the extended source (see Fig. 5.15). To get a sequence of
brightness rings such as the ones shown in Fig. 5.16, we make sure the camera is focused at
infinity and then just take a series of snapshots while steadily shifting the moving mirror away
from ZPD.
The discussion so far has assumed that all the plane waves entering the interferometer,
whether coming from distant sources or an extended nearby source, pass the same power
spectrum S(σ) through the interferometer. There is, of course, no reason why this has to be the
case. Returning to Eq. (5.21f), we rewrite it using slightly different notation. Instead of talking
about parallel rays passing through the interferometer at an angle β_s to the optical axis, we give
each group of parallel rays an index i and refer to the ith group of parallel rays as the ith plane
wave passing through the interferometer. The balanced signal power associated with this ith
plane wave is then

\[ P_{\mathrm{bal}}^{(i)}(\chi) = \frac{1}{2}\int_0^\infty S^{(i)}(\sigma)\,[1 + W\cos(2\pi\sigma\chi\cos\alpha_i)]\,d\sigma, \tag{5.24a} \]

where α_i refers to the ith plane wave’s β_s angle with respect to the interferometer’s optical axis
and S^(i)(σ) is the power spectrum of the ith plane wave as it passes through the interferometer.
According to Eq. (5.21g), if the plane wave is generated by a distant point source then we should
say that
\[ S^{(i)}(\sigma) = A\,\Delta\Omega^{(i)}\,L^{(i)}(\sigma). \tag{5.24b} \]

FIGURE 5.14. The parallel rays coming from a distant, off-axis point source are shown with dashed arrows. The parallel rays coming from a distant, on-axis point source are shown with solid arrows. Diagram labels: Moving Mirror, Ideal Beam Splitter, Fixed Mirror, Lens, Light-Sensitive Surface in the Focal Plane of the Lens.

FIGURE 5.15. Diagram labels: Plane Waves from Extended Source, Extended Source, Moving Mirror, Ideal Beam Splitter, Fixed Mirror, Lens, Light-Sensitive Surface in the Focal Plane of the Lens.

FIGURE 5.16. This sequence of eight brightness rings (panels 1 through 8) is modeled on the brightness rings occurring at the interferometer’s focal plane. The radii of the rings increase going from one to eight due to the increasing displacement a = χ/2 of the moving mirror from ZPD. Features like this are sometimes called Haidinger rings.

Here L^(i)(σ) is the spectral radiance of the pencil ray entering the interferometer from the distant
point source and A is the cross-sectional area of the beam gathered in by the lens, that is, the
area of the lens itself. We can think of the ith plane wave as just one of a group of i = 1, 2, …, N
plane waves all emanating from distant sources, which makes ΔΩ^(i) the extremely small solid
angle subtended by the ith distant source at the position of the interferometer.
After these plane waves pass through the interferometer, the lens in Figs. 5.14 and 5.15 forms
an image (that is, N points of brightness) from these N distant sources. If, as shown in Fig. 5.17,
we put an array of small detectors in the focal plane then, as the moving mirror shifts away from
ZPD, each detector records the P_bal^(i) signal given by Eq. (5.24a) that is generated by the ith plane
wave coming from the ith distant source. The central region of the focal plane no longer
automatically predicts the brightness of the off-center regions, and there need not exist any
well-formed, outwardly moving rings because now the different plane waves have different S^(i)
spectra.83 This setup is sometimes referred to as an imaging Fourier-transform spectrometer, and
when it is put on board a spacecraft it can be used to investigate distant astronomical scenes, such
as a planet’s surface viewed from orbit, where we expect the power spectra to vary with position
in the scene.

FIGURE 5.17. Diagram labels: Plane Waves Coming from Distant Scene, Moving Mirror, Ideal Beam Splitter, Fixed Mirror, Lens, Detector Array.

5.7 The Standard Michelson Interferometer with Central Detector


In laboratory Michelson interferometers, we usually place a single circular detector in the central
region of the focal plane as shown in Fig. 5.18. The points on the detector near the center of the
focal plane represent plane waves propagating parallel to, or nearly parallel to, the optical axis, so
cos α_i is always close to one and the central detector records the sum of all the P_bal^(i) at
approximately the same optical path difference χ. As justification for saying that cos α_i ≅ 1 for all
the plane waves hitting the detector, we note that when all the plane waves have the same
S ( i ) (σ ) , producing a ring pattern of the sort shown in Fig. 5.16, there is usually (but not always)
a large circular patch in the center having about the same brightness. For the time being, though,
we retain the factor of cos α i in order to derive equations showing how to analyze
interferometers having large detectors that extend into the ring pattern of the focal plane.
Using Eqs. (5.24a) and (5.24b) and assuming that the plane waves are numbered so that
i = 1, 2,… , Ndet are all the plane waves focused onto the detector, we write the balanced signal
power reaching the detector as


\[ P_{\mathrm{bal}}^{(\mathrm{det})}(\chi) = \sum_{i=1}^{N_{\mathrm{det}}} P_{\mathrm{bal}}^{(i)}(\chi) = \frac{A}{2}\int_0^\infty L(\sigma)\left\{\sum_{i=1}^{N_{\mathrm{det}}}\Delta\Omega^{(i)}\,[1 + W\cos(2\pi\sigma\chi\cos\alpha_i)]\right\}d\sigma, \tag{5.25a} \]

where in the last step we have assumed that all the plane waves entering the interferometer have
the same spectrum S(σ) and thus the same spectral radiance L(σ). We convert the sum over i into
an integral over solid angle by writing


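\[ P_{\mathrm{bal}}^{(\mathrm{det})}(\chi) = \frac{A}{2}\int_0^\infty L(\sigma)\,d\sigma \iint_{\text{field of view of detector}} [1 + W\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d^2\varepsilon. \tag{5.25b} \]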

83 Of course, if all these S^(i) spectra have common features producing similar interference signals, there will still be a
tendency for ringlike features to form and expand out from the center as the moving mirror shifts away from ZPD.

FIGURE 5.18. Diagram labels: Plane Waves Coming from outside the Interferometer, Moving Mirror, Ideal Beam Splitter, Fixed Mirror, Lens, Circular Detector in Focal Plane of Lens.

On the right-hand side of this equation, A is the area of the lens focusing the interferometer signal
onto the detector, d²ε is an infinitesimal solid angle replacing ΔΩ^(i), and angle α_ε replaces
angle α_i as the angle of propagation through the interferometer. We note that this angle α_ε is the
same as the α_ε defined in Eq. (4.135f) of Chapter 4 and used in Eq. (4.137i) of Chapter 4. This is
not very surprising, considering that the line of reasoning used to derive Eq. (5.25b) begins with
Eq. (5.21b), which is a special case of Eq. (4.137i).
We can, in fact, easily show that Eq. (5.25b) is the same as Eq. (4.137i) in Chapter 4 with
η = M = 1. Formula (5.10c) lets us write the integral on the right-hand side of (4.137i) as


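\[
\begin{aligned}
&\frac{A}{2}\int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\;\eta(\sigma)\,\mathcal{L}(\sigma)\left[1 + W\cdot\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\right] \\
&\quad= \frac{A}{2}\int_{-\infty}^{\infty} d\sigma\;\eta(\sigma)\,\mathcal{L}(\sigma) \iint_{\text{field of view}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)].
\end{aligned}
\]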

Here A is the cross-sectional area of the interferometer beam; R = √(A/π) is the radius of the
interferometer beam; the “field of view” limiting the integral over d²ε is the interferometer’s
field of view; and of course α_ε is the angle of propagation through the interferometer. For the
lens and detector in Fig. 5.18, the area of the lens focusing the beam onto the detector defines the
cross-sectional area of the interferometer beam, so variable A has the same meaning as in Eq.
(5.25b). The field of view specified by the size of the detector (that is, the detector’s field of
view) is the same as the field of view of the interferometer, so the integral over d²ε is also the
same integral as in Eq. (5.25b). Following the procedure used in the discussion after Eq. (5.21c),
we recognize that “field of view” in the integral over d²ε now refers to the detector’s field of
view and note that ℒ, η, M, and the cosine are even functions of σ. This gives us, after applying
Eq. (2.19) in Chapter 2,


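\[
\begin{aligned}
&\frac{A}{2}\int_{-\infty}^{\infty} d\sigma\;\eta(\sigma)\,\mathcal{L}(\sigma) \iint_{\text{field of view}} d^2\varepsilon\left[1 + W\cdot\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\right] \\
&\quad= \frac{A}{2}\int_{-\infty}^{\infty} d\sigma\;\eta(\sigma)\,\mathcal{L}(\sigma) \iint_{\text{field of view of detector}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \\
&\quad= \frac{A}{2}\int_{0}^{\infty} d\sigma\;\eta(\sigma)\,L(\sigma) \iint_{\text{field of view of detector}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)],
\end{aligned}
\tag{5.25c}
\]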

where L(σ) is, according to Eq. (4.136g) of Chapter 4, the spectral radiance of the beam entering
the interferometer. Equation (5.25c) is a new formula for the right-hand side of Eq. (4.137i) in
Chapter 4. Thus it can be substituted back into (4.137i) to get

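\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\int_0^\infty d\sigma\;\eta(\sigma)\,L(\sigma) \iint_{\text{field of view of detector}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]. \]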

From Chapter 4 we know that P_bal in this formula is the optical power leaving the interferometer
in the balanced signal, and since the ideal lens in Fig. 5.18 focuses all of the beam onto the
detector, P_bal is the same quantity as P_bal^(det) in Eq. (5.25b). Hence this last result can be written as

\[ P_{\mathrm{bal}}^{(\mathrm{det})}(\chi) = \frac{A}{2}\int_0^\infty d\sigma\;\eta(\sigma)\,L(\sigma) \iint_{\text{field of view of detector}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]. \tag{5.25d} \]

When η = M = 1 in Eq. (5.25d), it becomes the same as Eq. (5.25b). Consequently we have now,
as promised, shown that (5.25b) is the same as Eq. (4.137i) of Chapter 4 applied to an ideal
interferometer. Equation (5.25d) with 0 < η < 1 and M < 1 is then clearly the extension of Eq.
(5.25b) to the nonideal case of an interferometer with an imperfect beam splitter and an
imperfectly aligned moving mirror. Interchanging the integrals in (5.25d) gives

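\[ P_{\mathrm{bal}}^{(\mathrm{det})}(\chi) = \iint_{\text{field of view of detector}} d^2\varepsilon\left\{\frac{A}{2}\int_0^\infty d\sigma\;\eta(\sigma)\,L(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\right\}. \tag{5.25e} \]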

Now at last we make the idealization that the detector is small enough to assume that all the
plane waves focused on it provide an approximately uniform illumination across its surface,
allowing us to set cos α ε ≅ 1 in Eq. (5.25e) to get, after dropping the (det) superscript,


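\[
\begin{aligned}
P_{\mathrm{bal}}(\chi) &= \frac{A\,\Delta\Omega}{2}\int_0^\infty \eta(\sigma)\,L(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma \\
&= \frac{1}{2}\int_0^\infty S(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma,
\end{aligned}
\tag{5.26a}
\]

where
\[ \Delta\Omega = \iint_{\text{field of view of detector}} d^2\varepsilon \tag{5.26b} \]

and
\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(\sigma). \tag{5.26c} \]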
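
To see when setting cos α_ε ≅ 1 is reasonable, the solid-angle integral in Eq. (5.25e) can be evaluated for a single wavenumber. The sketch below makes several assumptions that are not in the text: the detector's field of view is taken to be a circular cone of half-angle `alpha_max` centered on the optical axis, the solid-angle element is written as sin α dα dφ, and the numerical values and function names are hypothetical. Under these assumptions the average of cos(2πσχ cos α) over the field of view has the closed form [sin k − sin(k cos α_max)] / [k (1 − cos α_max)] with k = 2πσχ.

```python
import math

def fov_average(sigma, chi, alpha_max, n=2000):
    """(1/dOmega) * integral of cos(2 pi sigma chi cos(alpha)) over a circular
    field of view of half-angle alpha_max (midpoint rule in alpha)."""
    h = alpha_max / n
    num = 0.0
    den = 0.0
    for k in range(n):
        alpha = (k + 0.5) * h
        ring = 2.0 * math.pi * math.sin(alpha) * h   # solid angle of a thin annulus
        num += ring * math.cos(2.0 * math.pi * sigma * chi * math.cos(alpha))
        den += ring                                   # accumulates dOmega
    return num / den

def fov_average_closed_form(sigma, chi, alpha_max):
    """Same average from the substitution u = cos(alpha)."""
    k = 2.0 * math.pi * sigma * chi
    u0 = math.cos(alpha_max)
    return (math.sin(k) - math.sin(k * u0)) / (k * (1.0 - u0))

sigma, chi = 1000.0, 0.1     # cm^-1 and cm (assumed values)
for alpha_max in (0.01, 0.1):
    print(alpha_max,
          fov_average(sigma, chi, alpha_max),
          fov_average_closed_form(sigma, chi, alpha_max),
          math.cos(2.0 * math.pi * sigma * chi))   # small-field approximation
```

For the narrow field of view the average stays close to cos(2πσχ), which is what Eq. (5.26a) assumes; for the wider one the modulation is almost entirely washed out, which is the situation in which the factor cos α_ε has to be retained.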

The (det) superscript has been dropped to emphasize the close resemblance of Eq. (5.26a) to Eq.
(5.10a) for the balanced interference signal of the distant, on-axis source. Indeed the only real
difference is that solid angle ΔΩ in Eqs. (5.10a) and (5.10b) refers to the solid angle subtended by
the distant source and ΔΩ in (5.26a), (5.26b), and (5.26c) refers to the detector’s field of view.
Because the mathematical formalism is the same, it makes sense to call P_bal in (5.26a) the optical
power of the balanced interference signal hitting the detector and, following the pattern of Eqs.
(5.17a) through (5.17d), once again define

\[ I_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\,d\sigma \tag{5.27a} \]

to be the balanced interferogram. The only difference between (5.17d) and (5.27a) is the meaning
we attach to the solid angle ΔΩ in the definition of S. Now Eq. (5.26a) can be written as

\[ P_{\mathrm{bal}}(\chi) = \frac{1}{2}P_0 + W\,I_{\mathrm{bal}}(\chi), \tag{5.27b} \]
where, just like in (5.17c),
\[ P_0 = \int_0^\infty S(\sigma)\,d\sigma. \tag{5.27c} \]

Since the cosine in Eq. (5.26a) is an even function of χ, the interference signal P_bal must be, as it
is in Eq. (5.18a), an even function of χ,

\[ P_{\mathrm{bal}}(-\chi) = P_{\mathrm{bal}}(\chi), \tag{5.28a} \]

which means that, according to Eq. (5.27b), the interferogram

\[ I_{\mathrm{bal}}(\chi) = \frac{P_{\mathrm{bal}}(\chi) - \tfrac{1}{2}P_0}{W} \tag{5.28b} \]
is once again an even function of χ:
\[ I_{\mathrm{bal}}(-\chi) = I_{\mathrm{bal}}(\chi). \tag{5.28c} \]
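
Equations (5.27a) through (5.28c) can be exercised with a few lines of Python. This is only a sketch under stated assumptions: the Gaussian spectrum, the choice M = 1 (perfect alignment), the integration limits, and all function names are hypothetical and chosen for illustration.

```python
import math

SIGMA_MAX = 2000.0   # upper integration limit in cm^-1 (assumed; spectrum is negligible beyond it)

def trapezoid(f, lo, hi, n=4000):
    """Composite trapezoidal rule for integral_lo^hi f(x) dx."""
    h = (hi - lo) / n
    s = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        s += f(lo + k * h)
    return h * s

def spectrum(sigma):
    """Hypothetical S(sigma): Gaussian band centered at 1000 cm^-1."""
    return math.exp(-0.5 * ((sigma - 1000.0) / 50.0) ** 2)

def p0():
    """Eq. (5.27c)."""
    return trapezoid(spectrum, 0.0, SIGMA_MAX)

def i_bal(chi):
    """Eq. (5.27a) with M = 1."""
    return 0.5 * trapezoid(
        lambda s: spectrum(s) * math.cos(2.0 * math.pi * s * chi), 0.0, SIGMA_MAX)

def p_bal(chi, w):
    """Eq. (5.26a) with M = 1."""
    return 0.5 * trapezoid(
        lambda s: spectrum(s) * (1.0 + w * math.cos(2.0 * math.pi * s * chi)),
        0.0, SIGMA_MAX)

chi, w = 0.003, -1.0
recovered = (p_bal(chi, w) - 0.5 * p0()) / w     # Eq. (5.28b)
print(recovered, i_bal(chi), i_bal(-chi))
```

The three printed numbers coincide to rounding error: the interferogram extracted from the signal through Eq. (5.28b) matches the direct definition, and it is unchanged when χ is replaced by −χ.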


As in Eq. (4.141c) of Chapter 4, we can make S(σ) into an even function by requiring

\[ S(-\sigma) = S(\sigma) \tag{5.29a} \]

to end up with, after extending Eq. (5.26c) to negative values of σ,

\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|). \tag{5.29b} \]

Unlike Eqs. (4.140c) and (4.141c) of Chapter 4, the beam-splitter efficiency η is now included in
the definition of S. The argument of η does not have to be put inside absolute value signs
because, according to Eq. (4.139g) of Chapter 4, it is already an even function of σ. Function
M(Rσθ_ma) is also an even function of σ [see Eq. (5.10f)], as is cos(2πσχ), so both
[S(σ) M(Rσθ_ma)] and [S(σ) M(Rσθ_ma) cos(2πσχ)] are even functions of σ. The sine of (2πσχ)
is an odd function of σ because
\[ \sin(-2\pi\sigma\chi) = -\sin(2\pi\sigma\chi), \]

so multiplying the even function [S(σ) M(Rσθ_ma)] by sin(2πσχ) produces an odd function:

\[ S(\sigma)\,M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi). \]
This means we can write, using e^(±iφ) = cos(φ) ± i sin(φ),

\[
\begin{aligned}
\int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,S(\sigma)\,e^{\pm 2\pi i\sigma\chi}\,d\sigma
&= \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma \pm i\int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,S(\sigma)\sin(2\pi\sigma\chi)\,d\sigma \\
&= 2\int_{0}^{\infty} M(R\sigma\theta_{ma})\,S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma.
\end{aligned}
\]

Here we use that the integral of [S(σ) M(Rσθ_ma) cos(2πσχ)] between −∞ and +∞ is, according to
Eq. (2.19) in Chapter 2, twice the value of its integral over just positive σ, because
[S(σ) M(Rσθ_ma) cos(2πσχ)] is an even function of σ; and we also use that the integral of
[S(σ) M(Rσθ_ma) sin(2πσχ)] over σ is the integral of an odd function between −∞ and +∞, which,
according to Eq. (2.17) in Chapter 2, must be zero. Comparison of this result to Eq. (5.27a) shows
that the interferogram can be written as

\[ I_{\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma \tag{5.29c} \]

where the plus sign is chosen for the complex exponent of e. Note that, having now chosen
I_bal(χ) to be the inverse Fourier transform of [(1/4) S(σ) M(Rσθ_ma)], we can reverse the Fourier
transform in (5.29c) to get

\[ S(\sigma)\,M(R\sigma\theta_{ma}) = 4\int_{-\infty}^{\infty} I_{\mathrm{bal}}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi. \tag{5.29d} \]

Our choice of sign for the complex exponent thus makes [S(σ) M(Rσθ_ma)] the forward Fourier
transform of 4 I_bal(χ). This sign choice is, of course, purely a matter of convention, but it is the
one followed by most optical engineers today and it is the one used for the rest of this book.
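
The Fourier-transform pair in Eqs. (5.29c) and (5.29d) can be tested numerically by building an interferogram from a known spectrum and then transforming back. The sketch below assumes M = 1, a hypothetical Gaussian spectrum in arbitrary units, and finite grids in σ and χ; because I_bal is even in χ, the complex integral in Eq. (5.29d) is evaluated as 8 times the cosine integral over χ ≥ 0. All names and numbers are illustrative.

```python
import math

SIGMA0, WIDTH = 5.0, 0.5          # assumed band center and width (arbitrary units)

def spectrum(sigma):
    """Hypothetical even spectrum S(|sigma|)."""
    return math.exp(-0.5 * ((abs(sigma) - SIGMA0) / WIDTH) ** 2)

def trapezoid(values, h):
    """Trapezoidal rule for equally spaced samples."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

D_SIGMA, N_SIGMA = 0.025, 400     # sigma grid covers 0..10
D_CHI, N_CHI = 0.005, 600         # chi grid covers 0..3
sigmas = [k * D_SIGMA for k in range(N_SIGMA + 1)]
chis = [k * D_CHI for k in range(N_CHI + 1)]
s_values = [spectrum(s) for s in sigmas]

def i_bal(chi):
    """Eq. (5.27a) with M = 1."""
    return 0.5 * trapezoid(
        [sv * math.cos(2.0 * math.pi * s * chi) for sv, s in zip(s_values, sigmas)],
        D_SIGMA)

interferogram = [i_bal(chi) for chi in chis]

def recovered_spectrum(sigma):
    """Eq. (5.29d) for an even interferogram:
    4 * integral over all chi  ->  8 * integral_0^inf I_bal(chi) cos(2 pi sigma chi) d chi."""
    return 8.0 * trapezoid(
        [i * math.cos(2.0 * math.pi * sigma * chi) for i, chi in zip(interferogram, chis)],
        D_CHI)

for s in (4.5, 5.0, 5.5):
    print(s, spectrum(s), recovered_spectrum(s))
```

With these grids the recovered values track the starting spectrum closely, which illustrates the factor of 4 and the sign convention adopted above; a real instrument samples only a finite range of χ, so the grid limits here stand in for that truncation.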

5.8 The Fore and Aft Optics


We now derive an expression for the optical power of the balanced interference signal when the
polychromatic plane waves propagating at different angles α_ε are characterized by different
spectral radiances.
We can rewrite Eq. (4.136c) of Chapter 4 as, after using Eqs. (4.136a) and (4.136i) to simplify
the integral over dσ,

\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\iint_{\text{field of view}} d^2\varepsilon \int_{-\infty}^{\infty} \mathcal{L}(\vec{\varepsilon},\sigma)\,\eta(\sigma)\left[1 + \frac{W}{A}\,\mathrm{Re}\!\left(\Pi_A(\sigma\vec{\Delta})\,e^{-2\pi i\sigma\chi\cos\alpha_\varepsilon}\right)\right]d\sigma. \]

Applying the same reasoning as in the discussion after Eq. (5.25b), we note that here the
interferometer beam’s cross-sectional area A must be the same as the area A of the lens, and that
the “field of view” in the integral over d²ε must refer to the field of view of the detector. For the
standard interferometer beam with a circular cross section, we have from Eq. (4.137h) of Chapter
4 and Eq. (5.10c) that

\[ \left.\frac{1}{A}\,\Pi_A(\sigma\vec{\Delta})\right|_{\text{circle of radius } R} = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} = M(R\sigma\theta_{ma}). \]
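
For readers who want numbers, the function M(Rσθ_ma) = J_1(4πRσθ_ma)/(2πRσθ_ma) is easy to evaluate. The sketch below uses the standard power series for J_1 so that it needs nothing beyond the Python standard library; the function names, the series truncation, and the beam radius, wavenumber, and angles are assumptions made for illustration, with θ_ma treated as the mirror-misalignment angle in radians.

```python
import math

def bessel_j1(x, terms=40):
    """Power series J1(x) = sum_k (-1)^k (x/2)^(2k+1) / (k! (k+1)!).
    Adequate for moderate |x|; not intended for large arguments."""
    half = 0.5 * x
    term = half                       # k = 0 term
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (k + 1))
        total += term
    return total

def modulation(radius, sigma, theta_ma):
    """M(R sigma theta_ma) = J1(4 pi R sigma theta_ma) / (2 pi R sigma theta_ma)."""
    x = 2.0 * math.pi * radius * sigma * theta_ma
    if x == 0.0:
        return 1.0                    # limiting value for perfect alignment
    return bessel_j1(2.0 * x) / x

R_BEAM, SIGMA = 2.5, 2000.0           # beam radius in cm and wavenumber in cm^-1 (assumed)
J1_FIRST_ZERO = 3.8317059702075125    # first positive zero of J1
theta_null = J1_FIRST_ZERO / (4.0 * math.pi * R_BEAM * SIGMA)

for theta in (0.0, 1.0e-6, 1.0e-5, theta_null):
    print(theta, modulation(R_BEAM, SIGMA, theta))
```

In this example M stays close to one for a microradian of tilt and falls to zero when 4πRσθ_ma reaches the first zero of J_1, which for the assumed R and σ happens near 6 × 10⁻⁵ rad.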

Substituting this into the expression for P_bal(χ) gives, since M is real and

\[ e^{i\varphi} = \cos(\varphi) + i\sin(\varphi), \]

that

\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\iint_{\text{field of view}} d^2\varepsilon \int_{-\infty}^{\infty} \mathcal{L}(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma. \]

Equations (4.136b) and (4.139g) of Chapter 4 require ℒ(ε⃗, σ) and η(σ) to be even functions of σ,
and we already know that M is an even function of σ [see Eq. (5.10f)]. Consequently, because the
cosine is also an even function, it follows that

\[ \mathcal{L}(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \]

is itself an even function of σ. Equation (2.19) in Chapter 2 can now be used to modify the upper
and lower bounds of the integral over dσ, so that the integration takes place between 0 and ∞.
Having made these changes, Eq. (4.136d) of Chapter 4 can then be used to write the formula for
P_bal(χ) as

\[ P_{\mathrm{bal}}(\chi) = \iint_{\text{field of view}} \left\{ d^2\varepsilon\,\frac{A}{2}\int_0^\infty L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma \right\}. \tag{5.30} \]
°¿
Function L(ε⃗, σ) was defined in Eq. (4.136d) of Chapter 4 to be the spectral radiance as a
function of wavenumber σ and direction ε⃗ for the beam entering the interferometer; and from the
analysis in Sec. 5.2 above, and in particular the discussion following Eq. (5.6), we know that
L(ε⃗, σ) can also be interpreted as the spectral radiance of the pencil ray traveling in a direction
specified by ε⃗. Parallel pencil rays with the same radiance entering the interferometer
are treated as a parallel group of rays traveling in the direction specified by ε⃗, which is of
course the same thing as recognizing the existence of a plane wave traveling in the direction
specified by ε⃗. This means that the integral over d²ε can be interpreted as a sum over all the
plane waves passing through the interferometer. Consequently, in Eq. (5.30), the term inside the
braces { },

\[ d^2\varepsilon\,\frac{A}{2}\int_0^\infty L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma, \]

should be interpreted as the small amount of power, an order-of-magnitude d²ε amount of
power, that a polychromatic plane wave, when traveling at an angle α_ε to the optical axis in the
direction specified by ε⃗, contributes to the optical power of the balanced interference signal.


Having interpreted the integral over d²ε in Eq. (5.30) as a sum over the power contributed by
each polychromatic plane wave passing through the interferometer, we next analyze the integral
over dσ as a sum over all the monochromatic wavenumber components present in any one
polychromatic plane wave.84 In Eq. (5.30) we regard the σth wavenumber component of the plane
wave specified by ε⃗ as contributing an amount of power

\[ d^2\varepsilon\,d\sigma\left(\frac{A}{2}\right)L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \]

to the optical power reaching the detector. Analyzing the system this way shows us how to
include the effects of nonideal optical components in the formulas for P_bal(χ). If, for example,
the lens in Fig. 5.18 transmits some optical wavelengths λ more efficiently than others, a behavior
typical of real optical materials, we can introduce a transmission τ_lens that is always a real number
between zero and one and make it a function of wavenumber σ = λ^(−1). Now in Fig. 5.18 each σth
wavenumber component of the plane wave specified by ε⃗ contributes an order-of-magnitude
d²ε · dσ amount of power

\[ \tau_{\mathrm{lens}}(\sigma)\cdot\left\{ d^2\varepsilon\,d\sigma\left(\frac{A}{2}\right)L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \right\} \]

to the detector. Those wavenumbers for which τ_lens = 0, showing that for them the lens is opaque,
are blocked from contributing any power to P_bal(χ); and those wavenumbers for which τ_lens = 1,
meaning that they pass through the lens without losing any power, contribute to P_bal(χ) as if they
were being focused by an ideal lens.

84 In effect, we are reverting to the analysis at the beginning of Chapter 4, representing the optical field propagating
through the interferometer as a sum of monochromatic plane waves over different directions and wavenumbers.

In general, an interferometer such as the one shown in Fig. 5.18 will have both “fore optics”
to gather in and prepare outside radiation for passage through the interferometer and “aft optics”
to focus the optical beam onto the detector after passage through the interferometer (see Fig.
5.19). In an astronomical Fourier-transform system, for example, the fore optics could be a
telescope designed to gather in large quantities of photons and send them through the
interferometer while the aft optics, like the lens in Fig. 5.18, is designed to focus the beam onto
the detector. We can lump the transmissions of the individual optical elements of both the fore
optics and the aft optics into two combined transmission functions τ_f(σ) and τ_a(σ)
respectively. This means the σth component of the ε⃗th plane wave can only contribute a

\[ \tau_f(\sigma)\cdot\tau_a(\sigma)\cdot\left\{ d^2\varepsilon\,d\sigma\left(\frac{A}{2}\right)L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \right\} \]

amount to the optical power reaching the detector. Consequently, we adjust the formula for
P_bal(χ) in Eq. (5.30) to get

\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\iint_{\text{field of view}} d^2\varepsilon \int_0^\infty L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma \tag{5.31a} \]

for the total power from the balanced optical signal reaching the detector in Fig. 5.19. If all the
plane waves of interest are characterized by the same spectral radiance, the dependence of L on
ε⃗ can be suppressed to get

\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\int_0^\infty d\sigma\;L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma) \iint_{\text{field of view}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]. \tag{5.31b} \]

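If, in addition, the field of view is sufficiently small to make cos α_ε ≅ 1 a good approximation,
then we can write

\[ P_{\mathrm{bal}}(\chi) = \frac{A\,\Delta\Omega}{2}\int_0^\infty L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma, \tag{5.31c} \]

where
\[ \Delta\Omega = \iint_{\text{field of view}} d^2\varepsilon. \tag{5.31d} \]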

Equations (5.31a)–(5.31d) are a useful set of formulas for describing Pbal ( χ ) . If an interferometer
is built with no fore optics, then we can set τ f (σ ) = 1 ; and to represent negligible loss in the aft
optics, we set τ a (σ ) = 1 . As was discussed in the previous sections, we know that for an ideal
beam splitter η (σ ) = 1 , and for a perfectly aligned interferometer M = 1.
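
The bookkeeping in Eq. (5.31c) amounts to multiplying the spectral radiance by one factor per optical element before integrating. The sketch below shows one way to organize that product in Python. Every component model in it (a flat radiance, constant beam-splitter efficiency and fore-optics transmission, a Gaussian aft-optics passband, A ΔΩ = 1, M = 1) is a made-up placeholder, and the function names are hypothetical.

```python
import math

def effective_spectrum(sigma, area, solid_angle, radiance, eta, tau_f, tau_a):
    """Product A * dOmega * L(sigma) * tau_f(sigma) * tau_a(sigma) * eta(sigma)
    that multiplies the bracket in Eq. (5.31c)."""
    return (area * solid_angle * radiance(sigma)
            * tau_f(sigma) * tau_a(sigma) * eta(sigma))

def p_bal(chi, spectrum, w=1.0, sigma_max=2000.0, n=4000):
    """Trapezoidal estimate of Eq. (5.31c) with M = 1 (perfect alignment assumed)."""
    h = sigma_max / n
    total = 0.0
    for k in range(n + 1):
        sigma = k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * spectrum(sigma) * (
            1.0 + w * math.cos(2.0 * math.pi * sigma * chi))
    return 0.5 * h * total

one = lambda sigma: 1.0                                   # ideal element: no loss
flat_radiance = lambda sigma: 1.0                         # hypothetical source
passband = lambda sigma: math.exp(-((sigma - 1000.0) / 200.0) ** 2)   # hypothetical aft optics

def ideal(sigma):
    return effective_spectrum(sigma, 1.0, 1.0, flat_radiance, one, one, one)

def real(sigma):
    return effective_spectrum(sigma, 1.0, 1.0, flat_radiance,
                              lambda s: 0.8, lambda s: 0.9, passband)

print(p_bal(0.0, ideal), p_bal(0.0, real))   # power at ZPD for W = +1
```

Setting every transmission and the efficiency to one recovers the ideal case, as the text describes; with the lossy placeholders the ZPD power drops to the integral of the combined passband.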

FIGURE 5.19. Diagram labels: Fore Optics, Moving Mirror, Ideal Beam Splitter, Fixed Mirror, Aft Optics, Circular Detector.

We can put Eqs. (5.31c) and (5.31d) into the same form as Eqs. (5.26a)–(5.26c) by writing


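\[ P_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma, \tag{5.32a} \]
where
\[ S(\sigma) = A\,\Delta\Omega\,L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma). \tag{5.32b} \]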

All that is different from Eqs. (5.26a)–(5.26c) is the definition of S(σ), which now includes
factors of τ_f(σ) and τ_a(σ), so we can set up the same pattern of mathematical definitions as
before by calling the balanced interferogram

\[ I_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_0^\infty S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\,d\sigma, \tag{5.32c} \]
with
\[ P_{\mathrm{bal}}(\chi) = \frac{1}{2}P_0 + W\,I_{\mathrm{bal}}(\chi) \tag{5.32d} \]
and

\[ P_0 = \int_0^\infty S(\sigma)\,d\sigma. \tag{5.32e} \]

Again we can see that I_bal and P_bal are even functions of χ because the cosine is an even function
of χ:
\[ I_{\mathrm{bal}}(-\chi) = I_{\mathrm{bal}}(\chi) \tag{5.33a} \]
and

\[ P_{\mathrm{bal}}(-\chi) = P_{\mathrm{bal}}(\chi). \tag{5.33b} \]

As before, we can make S an even function of σ,

\[ S(-\sigma) = S(\sigma), \tag{5.33c} \]
by writing
\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) \tag{5.33d} \]

for negative values of σ. Using the same argument as in the discussion following Eq. (5.29b), the
interferogram can now be written as the inverse Fourier transform of [(1/4) S(σ) M(Rσθ_ma)],

\[ I_{\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma, \tag{5.34a} \]

which makes [S(σ) M(Rσθ_ma)] the Fourier transform of 4 I_bal(χ),

\[ S(\sigma)\,M(R\sigma\theta_{ma}) = 4\int_{-\infty}^{\infty} I_{\mathrm{bal}}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi. \tag{5.34b} \]


There is nothing new here; all that has changed from the previous Fourier-transform relations in
Eqs. (5.29c) and (5.29d) is that we have extended the definition of S(σ) from

\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|) \]
in Eq. (5.29b) to
\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) \]

in Eq. (5.33d). In fact, all of Eqs. (5.26a) through (5.29d) can now be regarded as a special case
of Eqs. (5.32a) through (5.34b), what we get when making the idealization that τ_f = τ_a = 1.

5.9 The Detector Signal


Up to this point, we have been talking about how to calculate the power in the optical signal
reaching the detector, but of course what a detector produces is an electrical signal—usually
measured in volts or amps—that is proportional to the optical power it absorbs. Unfortunately
detectors have different sensitivities to different wavelengths of electromagnetic radiation, which
means that the proportionality constant between the optical power absorbed by the detector and
the electrical signal produced by the detector is a function of wavelength λ. This proportionality
constant is called the detector responsivity ℛ. Since the interferometer equations are based on
integrals over wavenumber, we write the responsivity as a function of wavenumber σ = λ^(−1)
rather than wavelength: ℛ = ℛ(σ). Depending on the type of detector being analyzed, the
responsivity ℛ(σ) has units of volts per unit optical power or amps per unit optical power; that is,
its units are always detector-output signal per unit optical power reaching the detector.
In the previous section, the integrals in Eq. (5.31a) were interpreted as a sum over all the
balanced power contributions of all the wavenumber components of all the plane waves reaching
the detector. This means the σth component of the ε⃗th plane wave contributes an amount of
power
\[ d^2\varepsilon\,d\sigma\left(\frac{A}{2}\right)L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \]

to the balanced component of the optical power reaching the detector in Eq. (5.31a). To find the
corresponding contribution to the electrical signal leaving the detector, we multiply this by ℛ(σ)
to get

\[ \mathcal{R}(\sigma)\cdot\left\{ d^2\varepsilon\,d\sigma\left(\frac{A}{2}\right)L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \right\}. \]

Consequently, the balanced component of the electrical signal leaving the detector at an optical
path difference χ is

\[ K_{\mathrm{bal}}(\chi) = \frac{A}{2}\iint_{\text{field of view}} d^2\varepsilon \int_0^\infty L(\vec{\varepsilon},\sigma)\,\mathcal{R}(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma. \tag{5.35a} \]

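When all the plane waves of interest have the same spectral radiance L, this becomes

\[ K_{\mathrm{bal}}(\chi) = \frac{A}{2}\int_0^\infty d\sigma\;L(\sigma)\,\mathcal{R}(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma) \iint_{\text{field of view}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]; \tag{5.35b} \]

and if we can assume cos α_ε ≅ 1 because the interferometer’s field of view is small, then

\[ K_{\mathrm{bal}}(\chi) = \frac{A\,\Delta\Omega}{2}\int_0^\infty L(\sigma)\,\mathcal{R}(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma \tag{5.35c} \]

with
\[ \Delta\Omega = \iint_{\text{field of view}} d^2\varepsilon. \tag{5.35d} \]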

From the way this result is derived, we see that it is always easy to go from the formulas for the
signal leaving the detector to the formulas for the optical signal hitting the detector: just set
R (σ ) = 1 .
We work now with the assumption that all the plane waves of interest have the same spectral
radiance L. Just as in Eq. (5.32b), we define a function

$$
S(\sigma) = A\,\Delta\Omega\,R(\sigma)\,\eta(\sigma)\,L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma) . \tag{5.36a}
$$

This definition of $S(\sigma)$, unlike the one in (5.32b), contains the detector responsivity $R(\sigma)$.
Equation (5.35b) becomes, when $\cos\alpha_\varepsilon \cong 1$ is not a good approximation,

$$
K_{\mathrm{bal}}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, \bigl[1 + W M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\bigr] \right\} d\sigma ; \tag{5.36b}
$$

and Eq. (5.35c) becomes, when $\cos\alpha_\varepsilon \cong 1$ is a good approximation,

$$
K_{\mathrm{bal}}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma)\,\bigl[1 + W M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\bigr]\, d\sigma . \tag{5.36c}
$$

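Following the same pattern as in the discussions after Eqs. (5.26c) and (5.32b), we can write
either of these two expressions as the sum of a constant term and a term depending on χ,

$$
K_{\mathrm{bal}}(\chi) = \frac{1}{2} K_0 + W K_{I\mathrm{bal}}(\chi) . \tag{5.37a}
$$

No matter what we do with $\cos\alpha_\varepsilon$,

$$
K_0 = \int_0^\infty S(\sigma)\, d\sigma . \tag{5.37b}
$$

When $\cos\alpha_\varepsilon$ cannot be approximated as one,

$$
K_{I\mathrm{bal}}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\} d\sigma ; \tag{5.37c}
$$

and when $\cos\alpha_\varepsilon$ can be approximated as one,

$$
K_{I\mathrm{bal}}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\, d\sigma . \tag{5.37d}
$$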

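Whether or not $\cos\alpha_\varepsilon \cong 1$ is a good approximation, Eqs. (5.37c) and (5.37d) show that
$K_{I\mathrm{bal}}$ is an even function of the optical path difference χ,

$$
K_{I\mathrm{bal}}(-\chi) = K_{I\mathrm{bal}}(\chi) , \tag{5.38a}
$$

because

$$
\cos(-2\pi\sigma\chi\cos\alpha_\varepsilon) = \cos(2\pi\sigma\chi\cos\alpha_\varepsilon)
$$

for all values of $\cos\alpha_\varepsilon$. Since $K_{I\mathrm{bal}}$ is even, it follows from (5.37a) that
$K_{\mathrm{bal}}$ must also be an even function of χ,

$$
K_{\mathrm{bal}}(-\chi) = K_{\mathrm{bal}}(\chi) . \tag{5.38b}
$$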
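The decomposition in Eqs. (5.37a), (5.37b), and (5.37d) can be checked numerically. The following is a minimal sketch, assuming a hypothetical Gaussian band for S(σ), M = 1, W = +1, and a simple trapezoid-rule helper; none of these choices come from the text.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoid-rule integral of samples y over the grid x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Hypothetical inputs: a Gaussian band centred at 1000 cm^-1 stands in
# for S(sigma), and the modulation function M is set to one.
sigma = np.linspace(0.0, 2000.0, 4001)             # wavenumber grid, cm^-1
S = np.exp(-0.5 * ((sigma - 1000.0) / 50.0) ** 2)
M = np.ones_like(sigma)
W = 1.0

def K_0():
    """Constant term, Eq. (5.37b)."""
    return trapezoid(S, sigma)

def K_Ibal(chi):
    """chi-dependent term with cos(alpha) ~ 1, Eq. (5.37d)."""
    return 0.5 * trapezoid(S * M * np.cos(2.0 * np.pi * sigma * chi), sigma)

def K_bal(chi):
    """Total balanced signal, Eq. (5.37a)."""
    return 0.5 * K_0() + W * K_Ibal(chi)

chi = 0.003                                        # optical-path difference, cm
print(K_bal(chi), K_bal(-chi))                     # the two values agree
print(K_bal(0.0), K_0())                           # at ZPD with M = 1 and W = +1
```

With M = 1 and W = +1 the signal at zero path difference equals K_0, and the integrand S[1 + cos] is never negative, so K_bal stays nonnegative in this sketch.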


As before, the nonconstant component $K_{I\mathrm{bal}}$ of the total signal can be made proportional to a
Fourier transform. Equation (5.10f) shows $M(R\sigma\theta_{ma})$ to be an even function of σ, and we can
always force S to be even by defining $S(\sigma) = S(|\sigma|)$ so that

$$
S(-\sigma) = S(\sigma) . \tag{5.39a}
$$

Now both

$$
S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)
$$

and

$$
S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\}
$$

are even functions of σ because they are the products of even functions of σ. We can write Eq.
(5.37b) as

$$
K_0 = \frac{1}{2} \int_{-\infty}^{\infty} S(\sigma)\, d\sigma \tag{5.39b}
$$


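because S is even [see Eq. (2.19) in Chapter 2]. Equation (5.37c) becomes

$$
K_{I\mathrm{bal}}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\} d\sigma \tag{5.39c}
$$

when $\cos\alpha_\varepsilon$ cannot be approximated as one and Eq. (5.37d) becomes

$$
K_{I\mathrm{bal}}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\, d\sigma \tag{5.39d}
$$

when $\cos\alpha_\varepsilon$ can be approximated as one. Using the same reasoning as in the discussion
following Eq. (5.29b), we see that

$$
\begin{aligned}
\frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\, d\sigma
 + \frac{i}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi)\, d\sigma \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\, d\sigma
\end{aligned}
$$

because the integral over dσ of the odd function of σ,

$$
S(\sigma)\,M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi) ,
$$

must, according to Eq. (2.17) of Chapter 2, equal zero. Hence, when $\cos\alpha_\varepsilon$ can be approximated
as one, Eq. (5.39d) can be written as

$$
K_{I\mathrm{bal}}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.40a}
$$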
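The step from the one-sided cosine integral of Eq. (5.37d) to the two-sided complex integral of Eq. (5.40a) can be confirmed on a grid. This is a sketch under stated assumptions: the Gaussian `S_model`, the choice M = 1, and the grid limits are all hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoid-rule integral of samples y over the grid x."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

def S_model(sigma):
    """Hypothetical even spectrum S(|sigma|): Gaussian band at 1000 cm^-1."""
    return np.exp(-0.5 * ((np.abs(sigma) - 1000.0) / 50.0) ** 2)

chi = 0.0025                                       # optical-path difference, cm

# One-sided cosine form, Eq. (5.37d), with M = 1.
s_pos = np.linspace(0.0, 2000.0, 8001)
one_sided = 0.5 * trapezoid(S_model(s_pos) * np.cos(2.0 * np.pi * s_pos * chi), s_pos)

# Two-sided complex-exponential form, Eq. (5.40a), with M = 1.
s_all = np.linspace(-2000.0, 2000.0, 16001)
two_sided = 0.25 * trapezoid(S_model(s_all) * np.exp(2j * np.pi * s_all * chi), s_all)

print(one_sided)
print(two_sided)      # real part matches one_sided; imaginary part is rounding noise
```

The imaginary part cancels because the sine term is odd in σ, which is the role Eq. (2.17) plays in the argument above.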

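A similar manipulation is possible when $\cos\alpha_\varepsilon$ cannot be approximated as one. Interchanging
the order of the integrals in Eq. (5.39c) gives

$$
K_{I\mathrm{bal}}(\chi) = \frac{1}{4\,\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon) .
$$

Now we note that

$$
S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)
$$

is an even function of σ and

$$
S(\sigma)\,M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi\cos\alpha_\varepsilon)
$$

is an odd function of σ, so according to Eq. (2.17) in Chapter 2,

$$
\begin{aligned}
\frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon}\, d\sigma
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\, d\sigma \\
&\quad + \frac{i}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi\cos\alpha_\varepsilon)\, d\sigma \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\, d\sigma .
\end{aligned}
$$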

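This means that

$$
K_{I\mathrm{bal}}(\chi) = \frac{1}{4\,\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)
$$

can be written as

$$
\begin{aligned}
K_{I\mathrm{bal}}(\chi)
&= \frac{1}{4\,\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma .
\end{aligned} \tag{5.40b}
$$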

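Therefore we have shown that, according to Eqs. (5.40a) and (5.40b), $K_{I\mathrm{bal}}$ can be written as

$$
K_{I\mathrm{bal}}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \tag{5.40c}
$$

when $\cos\alpha_\varepsilon$ cannot be approximated as one and as

$$
K_{I\mathrm{bal}}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma \tag{5.40d}
$$

when $\cos\alpha_\varepsilon$ can be approximated as one. Glancing back at Eqs. (5.37a) and (5.39b), we note that
the balanced component of the electrical signal leaving the detector due to the input spectral
power is [see also Eqs. (5.40c) and (5.40d)]

$$
K_{\mathrm{bal}}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, d\sigma + \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \tag{5.40e}
$$

when $\cos\alpha_\varepsilon$ cannot be approximated as one, and

$$
K_{\mathrm{bal}}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, d\sigma + \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma \tag{5.40f}
$$

when it can. The formula for $S(\sigma)$ comes from Eqs. (5.39a) and (5.36a), which can be combined
to give

$$
\begin{aligned}
S(\sigma) &= A\,\Delta\Omega\,R(|\sigma|)\,\eta(|\sigma|)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) \\
&= A\,\Delta\Omega\,R(|\sigma|)\,\eta(\sigma)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) .
\end{aligned} \tag{5.40g}
$$

The absolute value signs are dropped from the argument of η because it is already an even
function [see Eq. (4.139g) of Chapter 4].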

5.10 The Detector Circuit


A realistic Fourier-transform spectrometer sends the signal leaving the detector into an electronic
circuit designed to record and stabilize signal $K_{\mathrm{bal}}$. One nice thing about electronic signals is that
they can be negative as well as positive; that is, the detector’s electronic circuit can have both
negative and positive potentials (in volts) and currents (in amps). The $K_0$ term that keeps
$K_{\mathrm{bal}} \ge 0$ in Eq. (5.37a) has no information about the power spectrum S and the detector circuit
need not respond to its presence. Typically what is done is to give the moving mirror in Fig. 5.20 a
constant velocity while at the same time building the detector circuit in such a way as to record
only time-varying signals. For the interferometer in Fig. 5.20, the optical-path difference χ is two
times the displacement a of the moving mirror from ZPD,

$$
\chi = 2a .
$$

Taking the time t to be zero when the moving mirror is at ZPD with $\chi = a = 0$, we have

$$
a = vt
$$

for v the velocity of the moving mirror. Substituting the second formula into the first gives

$$
\chi = 2vt = ut , \tag{5.41a}
$$

where

$$
u = 2v \tag{5.41b}
$$

is a quantity called the optical-path-difference velocity, or OPD velocity for short. Just as the
optical-path difference χ has the same length units as the mirror displacement a, so does u have
the same velocity units as v.

FIGURE 5.20. [Schematic: fore optics, ideal beam splitter, fixed mirror, moving mirror, aft optics,
and circular detector; the electrical signal from the detector goes to the detector circuit that
processes $K_{\mathrm{bal}}$.]

Substitution of (5.41a) into (5.37a) gives

$$
K_{\mathrm{bal}}(ut) = \frac{1}{2} K_0 + W K_{I\mathrm{bal}}(ut) , \tag{5.42a}
$$

where, according to Eq. (5.40c), $K_{I\mathrm{bal}}(ut)$ can be written as

$$
K_{I\mathrm{bal}}(ut) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, e^{2\pi i\sigma ut\cos\alpha_\varepsilon} \right\} d\sigma , \tag{5.42b}
$$

when $\cos\alpha_\varepsilon$ cannot be approximated as one and, according to Eq. (5.40d), $K_{I\mathrm{bal}}(ut)$ can be
written as

$$
K_{I\mathrm{bal}}(ut) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\, e^{2\pi i\sigma ut}\, d\sigma \tag{5.42c}
$$

when $\cos\alpha_\varepsilon$ can be approximated as one. If the detector circuit is built to record only
time-varying signals, a process sometimes called “AC coupling” of the detector,85 then the
$K_{\mathrm{bal}}(ut)$ signal leaving the detector only contributes its time-varying part $K_{I\mathrm{bal}}(ut)$ to the
rest of the system.
Suppose we define

$$
g_{\mathrm{in}}(t) = K_{\mathrm{bal}}(ut) = \frac{1}{2} K_0 + W K_{I\mathrm{bal}}(ut) \tag{5.43}
$$

to be the time-varying signal leaving the detector and entering the detector circuit. Assuming the
circuit to be linear (and the interferometer cannot produce accurate spectral measurements if it is
not), we know from the discussion in Appendix 5A of this chapter that the product of the circuit
transfer function and Fourier transform of the input signal equals the Fourier transform of the
output signal [see Eq. (5A.3a) in Appendix 5A]. Consequently, to get the output of a linear
circuit, we just take the Fourier transform of the input, multiply by the transfer function, and then
take the inverse Fourier transform of the product. Applying this recipe to $g_{\mathrm{in}}(t)$, we see from Eq.
(5.43) that $G_{\mathrm{in}}(f)$, the Fourier transform of $g_{\mathrm{in}}(t)$, is

$$
G_{\mathrm{in}}(f) = \int_{-\infty}^{\infty} \left[ \frac{1}{2} K_0 + W K_{I\mathrm{bal}}(ut) \right] e^{-2\pi i f t}\, dt
= \frac{1}{2} K_0 \int_{-\infty}^{\infty} e^{-2\pi i f t}\, dt + W \int_{-\infty}^{\infty} K_{I\mathrm{bal}}(ut)\, e^{-2\pi i f t}\, dt .
$$

85
AC stands for alternating current.

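According to Eq. (2.71f) of Chapter 2, the constant term turns into a delta function. This means
that when $\cos\alpha_\varepsilon$ cannot be approximated by one, Eq. (5.42b) can be used to write

$$
G_{\mathrm{in}}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4} \int_{-\infty}^{\infty} dt\, e^{-2\pi i f t} \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, e^{2\pi i\sigma ut\cos\alpha_\varepsilon} \right\} , \tag{5.44a}
$$

and when $\cos\alpha_\varepsilon$ can be approximated as one, Eq. (5.42c) can be used to write

$$
G_{\mathrm{in}}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4} \int_{-\infty}^{\infty} dt\, e^{-2\pi i f t} \int_{-\infty}^{\infty} d\sigma\, S(\sigma)\,M(R\sigma\theta_{ma})\, e^{2\pi i\sigma ut} . \tag{5.44b}
$$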

In either case, we can move the integral over dt to the inside to get, using Eq. (2.71f) from
Chapter 2, that

$$
\int_{-\infty}^{\infty} e^{2\pi i t(\sigma u\cos\alpha_\varepsilon - f)}\, dt = \delta(\sigma u\cos\alpha_\varepsilon - f) = \frac{1}{u\cos\alpha_\varepsilon}\,\delta\!\left( \sigma - \frac{f}{u\cos\alpha_\varepsilon} \right)
$$

when $\cos\alpha_\varepsilon$ cannot be approximated as one and

$$
\int_{-\infty}^{\infty} e^{2\pi i t(\sigma u - f)}\, dt = \delta(\sigma u - f) = \frac{1}{u}\,\delta\!\left( \sigma - \frac{f}{u} \right)
$$

when it can. In both these expressions, Eq. (2.68d) of Chapter 2 is used to factor the arguments of
the delta functions. Here u is positive, and so is the cosine because its argument is always a
relatively small angle. Substitution of these two results back into Eqs. (5.44a) and (5.44b) gives

$$
G_{\mathrm{in}}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4u\,\Delta\Omega} \iint_{\text{field of view}} \frac{d^2\varepsilon}{\cos\alpha_\varepsilon}\, S\!\left( \frac{f}{u\cos\alpha_\varepsilon} \right) M\!\left( \frac{R f \theta_{ma}}{u\cos\alpha_\varepsilon} \right) \tag{5.45a}
$$

when $\cos\alpha_\varepsilon$ cannot be approximated as one and

$$
G_{\mathrm{in}}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4u}\, S\!\left( \frac{f}{u} \right) M\!\left( \frac{R f \theta_{ma}}{u} \right) \tag{5.45b}
$$


when it can. Still following the recipe for the detector circuit’s output signal, we define H(f) to
be the detector circuit’s transfer function and take the inverse Fourier transform of the product

$$
H(f) \cdot G_{\mathrm{in}}(f)
$$

to get the formula for the signal leaving the detector circuit:

$$
g_{\mathrm{out}}(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, H(f)\, G_{\mathrm{in}}(f)\, df . \tag{5.46a}
$$
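Equation (5.45b) places the spectral content at wavenumber σ at the electrical frequency f = uσ, which is the frequency range the transfer function H(f) must handle. A minimal sketch of that mapping follows, assuming cos α_ε ≈ 1; the mirror velocity and the wavenumbers are illustrative values, not taken from the text.

```python
def opd_velocity(mirror_velocity_cm_s):
    """OPD velocity u = 2 v, Eq. (5.41b), in cm/s."""
    return 2.0 * mirror_velocity_cm_s

def electrical_frequency_hz(sigma_cm, mirror_velocity_cm_s):
    """Frequency f = u * sigma at which wavenumber sigma (cm^-1) appears in G_in(f)."""
    return opd_velocity(mirror_velocity_cm_s) * sigma_cm

v = 0.5                                   # hypothetical mirror velocity, cm/s
for sigma in (500.0, 1000.0, 4000.0):     # hypothetical wavenumbers, cm^-1
    print(sigma, electrical_frequency_hz(sigma, v))
```

Doubling the mirror velocity doubles every electrical frequency, so the passband of the detector circuit has to be chosen together with the scan speed.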

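When $\cos\alpha_\varepsilon$ cannot be approximated as one, this becomes, according to (5.45a),

$$
\begin{aligned}
g_{\mathrm{out}}(t)
&= \int_{-\infty}^{\infty} e^{2\pi i f t} \left[ \frac{K_0}{2}\, H(f)\, \delta(f) \right] df \\
&\quad + \frac{W}{4u\,\Delta\Omega} \iint_{\text{field of view}} \frac{d^2\varepsilon}{\cos\alpha_\varepsilon} \int_{-\infty}^{\infty} df\, e^{2\pi i f t}\, H(f)\, S\!\left( \frac{f}{u\cos\alpha_\varepsilon} \right) M\!\left( \frac{R f \theta_{ma}}{u\cos\alpha_\varepsilon} \right) \\
&= \frac{K_0}{2}\, H(0) + \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma')\,M(R\sigma'\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, H(\sigma' u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma' ut\cos\alpha_\varepsilon} \right\} d\sigma' ,
\end{aligned} \tag{5.46b}
$$

where in the last step the variable of integration is changed from f to

$$
\sigma' = \frac{f}{u\cos\alpha_\varepsilon} .
$$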

Glancing back at Eq. (5.45b), we see that the formula for the case where $\cos\alpha_\varepsilon$ can be
approximated by one must be [just take $\cos\alpha_\varepsilon = 1$ in Eq. (5.46b) and apply Eq. (5.35d)]

$$
g_{\mathrm{out}}(t) = \frac{K_0}{2}\, H(0) + \frac{W}{4} \int_{-\infty}^{\infty} H(\sigma' u)\, S(\sigma')\, M(R\sigma'\theta_{ma})\, e^{2\pi i\sigma' ut}\, d\sigma' . \tag{5.46c}
$$

In either case, we can AC couple the detector to the detector circuit by designing the circuit so
that its transfer function has

$$
H(0) = 0 . \tag{5.46d}
$$

This eliminates the constant term from formulas (5.46b) and (5.46c). At this level of idealization,
there is no particular reason to think of the signal leaving the detector circuit as a function of time
rather than the optical-path difference, since they are linearly related to each other by formula
(5.41a) above. Dropping the prime from σ, we use (5.41a) to write the output of the detector
circuit as

$$
z(\chi) = g_{\mathrm{out}}(\chi / u) \tag{5.47a}
$$

with

$$
z(\chi) = \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \tag{5.47b}
$$

when $\cos\alpha_\varepsilon$ cannot be approximated as one and

$$
z(\chi) = \frac{W}{4} \int_{-\infty}^{\infty} H(\sigma u)\, S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma \tag{5.47c}
$$

when it can. Because these last two formulas refer to the time-based signal leaving the detector
circuit, it may seem unnatural to write them in terms of χ and σ, but we will find it useful to have
them written in terms of the optical-path difference and wavenumber just like the previous
equations discussed in this chapter. To neglect the effect of the detector circuit, for example, we
need only take H = 1 inside the integrals of (5.47b) and (5.47c) to return at once to the integrals
in (5.40c) and (5.40d) respectively, which, when multiplied by W, become $W K_{I\mathrm{bal}}$, the
χ-dependent part of the signal leaving the detector.

5.11 The Effective Spectrum


Equation (5.47c) is an example of a formalism we will see many times in the rest of this chapter.
We can write (5.47c) as

$$
z(\chi) = \int_{-\infty}^{\infty} Z_{\mathrm{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma ,
$$

where

$$
Z_{\mathrm{eff}}(\sigma) = \frac{W}{4}\, H(u\sigma)\, S(\sigma)\, M(R\sigma\theta_{ma}) . \tag{5.48a}
$$

This shows that in (5.47c) the interferogram signal z(χ) can be written as the inverse Fourier
transform of an effective spectrum $Z_{\mathrm{eff}}(\sigma)$. It is easy to show that the interferogram signal can
always be written as the inverse Fourier transform of an effective spectrum. As long as the
interferogram signal z(χ) is a transformable function, we can take its Fourier transform,

$$
\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi ,
$$

and call it the effective spectrum,

$$
Z_{\mathrm{eff}}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \tag{5.48b}
$$

The reciprocity of the Fourier transform then leads to

$$
z(\chi) = \int_{-\infty}^{\infty} Z_{\mathrm{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.48c}
$$

When, for example, $\cos\alpha_\varepsilon$ cannot be approximated as one, as in Eq. (5.47b), we can write for the
effective spectrum

$$
\begin{aligned}
Z_{\mathrm{eff}}(\sigma)
&= \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \\
&= \frac{W}{4} \int_{-\infty}^{\infty} d\chi\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\sigma'\, S(\sigma')\,M(R\sigma'\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, H(\sigma' u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma'\chi\cos\alpha_\varepsilon} \right\} ,
\end{aligned} \tag{5.48d}
$$

which, reversing the Fourier transform, leads again to the formula

$$
z(\chi) = \int_{-\infty}^{\infty} Z_{\mathrm{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma .
$$

Although there is nothing very profound about this procedure, it can be a useful way of analyzing
the distortions undergone by the interferogram signal as it passes through the Fourier-transform
spectrometer.

5.12 Symmetries of the Interferogram Signal and Effective Spectrum


As long as the effective spectrum $Z_{\mathrm{eff}}$ is a real and even function of σ, we know from the first
entry of Table 2.1 in Chapter 2 that its inverse Fourier transform

$$
z(\chi) = \int_{-\infty}^{\infty} Z_{\mathrm{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma
$$

must also be a real and even function of the optical-path difference χ. After the interferogram
signal passes through the detector circuit, it is still, of course, real, but there is no reason to
suppose that it is still even.
Suppose we look first at the simpler case where $\cos\alpha_\varepsilon$ can be approximated as one. Then,
according to Eq. (5.48a), we have

$$
Z_{\mathrm{eff}}(\sigma) = \frac{W}{4}\, H(u\sigma)\, S(\sigma)\, M(R\sigma\theta_{ma}) .
$$

From Eq. (5A.6b) in Appendix 5A, we know that the transfer function H is Hermitian,

$$
H(-u\sigma) = H(u\sigma)^{*} , \tag{5.49a}
$$

and the discussion following (5A.6b) points out that H must have a nonzero imaginary part. We
know that W = +1 or −1 and that $S(\sigma)$ and $M(R\sigma\theta_{ma})$ are both real. From Eqs. (5.39a) and
(5.10f), we know that both are even:

$$
S(-\sigma) = S(\sigma)
$$

and

$$
M(-R\sigma\theta_{ma}) = M(R\sigma\theta_{ma}) .
$$

Hence the transfer function H in Eq. (5.48a) must give a nonzero imaginary part to
$Z_{\mathrm{eff}}$, and consequently all that can be said about $Z_{\mathrm{eff}}$ is that it is Hermitian:

$$
\begin{aligned}
Z_{\mathrm{eff}}(-\sigma)
&= \frac{W}{4}\, H(-u\sigma)\, S(-\sigma)\, M(-R\sigma\theta_{ma}) = \frac{W}{4}\, H(u\sigma)^{*}\, S(\sigma)\, M(R\sigma\theta_{ma}) \\
&= \left[ \frac{W}{4}\, H(u\sigma)\, S(\sigma)\, M(R\sigma\theta_{ma}) \right]^{*} \\
&= Z_{\mathrm{eff}}(\sigma)^{*} .
\end{aligned} \tag{5.49b}
$$

This makes

$$
z(\chi) = \int_{-\infty}^{\infty} Z_{\mathrm{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma
$$

the inverse Fourier transform of a Hermitian function. Therefore, according to entry 7 in Table
2.1 of Chapter 2, z(χ) must be real but need not be even. In fact, if z(χ) were both even and real,
then entry 1 of Table 2.1 states that $Z_{\mathrm{eff}}$ must be both real and even; that is, entry 1 requires
$Z_{\mathrm{eff}}$ to have a zero imaginary part when z(χ) is even. Since we already know that $Z_{\mathrm{eff}}$ must have
a nonzero imaginary part, we conclude that z(χ) cannot be an even function of χ. So already in the
simpler case where $\cos\alpha_\varepsilon$ is approximated as one, the interferogram signal cannot be even after
passing through the detector circuit.
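The claim that a Hermitian effective spectrum gives a real but not even interferogram can be illustrated numerically. The sketch below assumes a single-pole low-pass transfer function H(f) = 1/(1 + i f/f_c), a Gaussian band for S, M = 1, and illustrative values for u and f_c; these are stand-ins chosen for the example, not the circuit described in Appendix 5A.

```python
import numpy as np

sigma = np.linspace(-2000.0, 2000.0, 16001)   # wavenumber grid symmetric about zero, cm^-1
d_sigma = sigma[1] - sigma[0]
u = 1.0                                       # assumed OPD velocity, cm/s
f_c = 2000.0                                  # assumed corner frequency, Hz
W = 1.0

# Even, real spectrum (hypothetical Gaussian band) and Hermitian transfer function.
S = np.exp(-0.5 * ((np.abs(sigma) - 1000.0) / 50.0) ** 2)
H = 1.0 / (1.0 + 1j * (u * sigma) / f_c)      # H(-f) = conj(H(f))
Z_eff = 0.25 * W * H * S                      # Eq. (5.48a) with M = 1

def z(chi):
    """Inverse Fourier transform of Z_eff, Eq. (5.48c), as a Riemann sum."""
    return np.sum(Z_eff * np.exp(2j * np.pi * sigma * chi)) * d_sigma

chi = 0.0003                                  # optical-path difference, cm
print(z(chi))                                 # imaginary part is rounding noise
print(z(-chi))                                # real part differs from z(chi)
```

Setting H to one everywhere in this sketch restores z(−χ) = z(χ), which is the H = 1 limit mentioned at the end of Sec. 5.10.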
The interferogram signal, in fact, always becomes uneven after passing through the detector
circuit. To see why this is so, we return to Eq. (5.46a), which holds true both when $\cos\alpha_\varepsilon$ can be
approximated as one and when it cannot. According to the Fourier convolution theorem, the
right-hand side, which is now the inverse Fourier transform of the product of two functions, can
be replaced by a convolution to get [see Eq. (2.39c) in Chapter 2]

$$
g_{\mathrm{out}}(t) = h(t) * g_{\mathrm{in}}(t) , \tag{5.50a}
$$

where

$$
h(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, H(f)\, df \tag{5.50b}
$$

is the impulse-response function of the detector circuit, as described at the beginning of Appendix
5A, and

$$
g_{\mathrm{in}}(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, G_{\mathrm{in}}(f)\, df . \tag{5.50c}
$$

In Eq. (5.43) we defined $g_{\mathrm{in}}$ to be the signal as it leaves the detector and enters the detector
circuit, and in the discussion following (5.43) $G_{\mathrm{in}}$ was defined to be the Fourier transform of $g_{\mathrm{in}}$.
Hence $g_{\mathrm{in}}$ must be the inverse Fourier transform of $G_{\mathrm{in}}$ as shown in Eq. (5.50c). We know from
Eqs. (5.43) and (5.38b) that $g_{\mathrm{in}}$ is an even function of time when t = 0 is chosen to coincide with
χ = 0 as in Eq. (5.41a). Relationship (5A.5) in Appendix 5A states that the impulse-response
function h(t) must be zero for t < 0 and, of course, it cannot be a delta function at t = 0 for any
physically realistic detector circuit. Consequently, h(t) must have nonzero values at t > 0 that are
not matched by nonzero values at t < 0. This means the convolution in formula (5.50a) makes
$g_{\mathrm{out}}$ a blurred version of $g_{\mathrm{in}}$ that has also been shifted to the right, in the direction of positive t. All
this can be regarded as just a complicated way of saying that the detector signal cannot pass
through the detector circuit with infinite swiftness; there is always some sort of delay.
Therefore, $g_{\mathrm{out}}$ can never be an even function of t, which means, according to Eq. (5.47a),

$$
z(\chi) = g_{\mathrm{out}}(\chi / u)
$$

can never be an even function of χ. No assumptions have been made about the value of $\cos\alpha_\varepsilon$, so
this result clearly holds true whether or not we approximate $\cos\alpha_\varepsilon$ by one in the double integral
over the interferometer’s field of view.
One last point worth making is that, although we now know that z(χ) cannot be strictly even,
detector circuits are often designed to preserve the major features of the signals passing through
them, making the delays with which signals pass through the circuit small compared to the signal
fluctuation rate. Consequently in (5.50a) we then have

$$
g_{\mathrm{out}}(t) \approx g_{\mathrm{in}}(t)
$$

so that

$$
z(\chi) \approx g_{\mathrm{in}}(\chi / u) .
$$

Now, since $g_{\mathrm{in}}$ is an even function, z(χ) is an approximately even function so that

$$
z(-\chi) \cong z(\chi) .
$$

In some systems, the output signal of the detector circuit may have to be examined quite closely
to confirm that it is not a strictly even function of its argument.
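The convolution picture of Eq. (5.50a) can be sketched in the time domain to show how the departure from evenness shrinks as the circuit delay shrinks. The example assumes a causal exponential impulse response h(t) = exp(−t/τ)/τ for t ≥ 0 (normalized to unit DC gain on the grid), a Gaussian-enveloped cosine for the time-varying part of g_in, and illustrative values of u, τ, and the sampling step; all of these are hypothetical.

```python
import numpy as np

dt = 1.0e-6                                    # sample spacing, s
t = np.arange(-20000, 20001) * dt              # time grid symmetric about ZPD
u = 1.0                                        # assumed OPD velocity, cm/s
sigma0, width = 1000.0, 50.0                   # hypothetical band centre and width, cm^-1

# Even input signal: the chi-dependent part for a Gaussian band, with chi = u t.
chi = u * t
g_in = np.exp(-2.0 * (np.pi * width * chi) ** 2) * np.cos(2.0 * np.pi * sigma0 * chi)

def g_out(tau):
    """Convolve g_in with a causal exponential response of time constant tau."""
    n = np.arange(0, int(round(10.0 * tau / dt)) + 1)
    h = np.exp(-n * dt / tau)
    h = h / (np.sum(h) * dt)                   # unit DC gain; h is zero for t < 0
    full = np.convolve(g_in, h) * dt           # output sample k uses inputs at times <= t[k]
    return full[: g_in.size]

def asymmetry(g):
    """Largest mismatch between g(t) and g(-t), relative to the peak of |g|."""
    return np.max(np.abs(g - g[::-1])) / np.max(np.abs(g))

print(asymmetry(g_in))                         # even input
print(asymmetry(g_out(2.0e-4)))                # slow circuit: clearly not even
print(asymmetry(g_out(5.0e-6)))                # fast circuit: nearly even
```

The fast-circuit case is the regime described above, where the output has to be examined closely to see that it is not strictly even.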

5.13 Background Radiation Inside a Standard Michelson Interferometer


In Fig. 5.20, the optical signal passes through the interferometer’s fore optics and aft optics on its
way to the detector; and, as described in Sec. 5.8, we can represent the effects of this passage by
the two transmission functions $\tau_f(\sigma)$ and $\tau_a(\sigma)$. When measuring infrared spectra with
uncooled interferometers, the fore optics and aft optics not only affect the optical signal passing
through them but can also act as unwanted sources of infrared background radiation. Unless they
have been cooled far below room temperature, optical elements spontaneously “glow” in the
infrared, so if the object being observed by the interferometer is at or near room temperature, the
optical elements may be as strong a source of infrared radiance as the object itself.
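A rough sense of scale comes from comparing blackbody radiances. The sketch below uses the standard Planck formula per unit wavenumber and treats both the optics and the object as ideal blackbodies, which overstates the emission of real optical elements with emissivity below one; the temperatures and the wavenumber are illustrative assumptions.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant, J s
C_LIGHT = 2.99792458e8       # speed of light, m/s
K_BOLTZ = 1.380649e-23       # Boltzmann constant, J/K

def planck_radiance(sigma_cm, temp_k):
    """Blackbody spectral radiance per unit wavenumber, W / (m^2 sr m^-1)."""
    sigma_m = sigma_cm * 100.0                          # cm^-1 to m^-1
    x = H_PLANCK * C_LIGHT * sigma_m / (K_BOLTZ * temp_k)
    return 2.0 * H_PLANCK * C_LIGHT ** 2 * sigma_m ** 3 / math.expm1(x)

sigma = 1000.0                                          # cm^-1, about 10 micrometers
object_300k = planck_radiance(sigma, 300.0)             # hypothetical object temperature
ratio_warm = planck_radiance(sigma, 295.0) / object_300k   # uncooled optics
ratio_cold = planck_radiance(sigma, 77.0) / object_300k    # optics cooled to 77 K
print(ratio_warm, ratio_cold)
```

Under these assumptions the uncooled optics radiate nearly as strongly as the object, while cooling to 77 K suppresses the optics' radiance at this wavenumber by many orders of magnitude.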
Figure 5.21 shows that the fore optics’ background masquerades as an additional type of
radiance entering the interferometer. To include the fore optics’ background in our formulas, we
add a background term to the input spectrum $S(\sigma)$ defined in Eq. (5.36a),

$$
S(\sigma) \rightarrow S(\sigma) + S^{(\mathrm{fore})}(\sigma) .
$$

The $S^{(\mathrm{fore})}(\sigma)$ term is just like $S(\sigma)$ in (5.36a) except that, since the radiance $L^{(\mathrm{fore})}(\sigma)$ coming
from the fore optics does not have to pass through the fore optics before reaching the
interferometer, we set $\tau_f(\sigma) = 1$ to get

$$
S^{(\mathrm{fore})}(\sigma) = A\,\Delta\Omega\,R(\sigma)\,\eta(\sigma)\,L^{(\mathrm{fore})}(\sigma)\,\tau_a(\sigma) .
$$

Remembering that the formula for $S(\sigma)$ in Eq. (5.36a) is made into an even function of σ in
(5.39a), we do the same thing to $S^{(\mathrm{fore})}(\sigma)$ by writing

$$
S^{(\mathrm{fore})}(\sigma) = A\,\Delta\Omega\,R(|\sigma|)\,\eta(\sigma)\,L^{(\mathrm{fore})}(|\sigma|)\,\tau_a(|\sigma|) . \tag{5.51a}
$$

As before, there is no need to add absolute value signs to the wavenumber argument of $\eta(\sigma)$
because, according to Eq. (4.139g) in Chapter 4, it is already an even function of wavenumber.
Here we implicitly assume that the detector’s field of view ΔΩ for the fore optics is the same as
its field of view ΔΩ for the external source, which is usually a good approximation for
well-designed systems.
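Now when we consider Eq. (5.37a) for the signal leaving the detector,

$$
K_{\mathrm{bal}}(\chi) = \frac{1}{2} K_0 + W K_{I\mathrm{bal}}(\chi) , \tag{5.51b}
$$

the constant $K_0$ in Eq. (5.37b) becomes

$$
K_0 = \int_0^\infty \bigl[ S(\sigma) + S^{(\mathrm{fore})}(\sigma) \bigr]\, d\sigma = \int_0^\infty S(\sigma)\, d\sigma + \int_0^\infty S^{(\mathrm{fore})}(\sigma)\, d\sigma \tag{5.51c}
$$

and $K_{I\mathrm{bal}}$ in Eqs. (5.40a) and (5.40b) becomes

$$
\begin{aligned}
K_{I\mathrm{bal}}(\chi)
&= \frac{1}{4} \int_{-\infty}^{\infty} \bigl[ S(\sigma) + S^{(\mathrm{fore})}(\sigma) \bigr] M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma + \frac{1}{4} \int_{-\infty}^{\infty} S^{(\mathrm{fore})}(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma
\end{aligned} \tag{5.51d}
$$

when $\cos\alpha_\varepsilon$ is approximated as one in (5.40a) and

$$
\begin{aligned}
K_{I\mathrm{bal}}(\chi)
&= \frac{1}{4} \int_{-\infty}^{\infty} \bigl[ S(\sigma) + S^{(\mathrm{fore})}(\sigma) \bigr] M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\
&\quad + \frac{1}{4} \int_{-\infty}^{\infty} S^{(\mathrm{fore})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma
\end{aligned} \tag{5.51e}
$$

when $\cos\alpha_\varepsilon$ is not approximated by one in (5.40b). Unfortunately the background radiance
generated by the aft optics cannot be handled this simply.

FIGURE 5.21. The warm surfaces of the fore and aft optics emit infrared background radiation in
both directions along the interferometer’s optical axis. [Schematic: fore optics, ideal beam
splitter, fixed mirror, moving mirror, aft optics, and circular detector.]
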
Figure 5.21 shows that the background radiance generated by the aft optics travels in two
different directions: directly to the detector and backwards into the interferometer. The detector
sees the aft optics’ radiation that shines directly on it as a constant level of infrared illumination,
introducing a new constant term into the detector signal. This term can be written as

$$
S^{(\mathrm{dir})} = A_{\mathrm{det}}\,\Delta\Omega^{(\mathrm{dir})} \int_0^\infty R(\sigma)\, L^{(\mathrm{dir})}(\sigma)\, d\sigma , \tag{5.51f}
$$

where we note that

$$
P^{(\mathrm{dir})} = A_{\mathrm{det}}\,\Delta\Omega^{(\mathrm{dir})} \int_0^\infty L^{(\mathrm{dir})}(\sigma)\, d\sigma \tag{5.51g}
$$

is the background optical power contributed by warm surfaces emitting a spectral radiance
$L^{(\mathrm{dir})}(\sigma)$ uniformly over a solid angle $\Delta\Omega^{(\mathrm{dir})}$ as seen from the detector of area $A_{\mathrm{det}}$. Just like the
constant term in the interference signal coming from the source, this additional constant signal is
removed by the detector’s AC coupling to the detector circuit and for that reason can be
disregarded (it should, however, be taken into account when calculating the noise terms in the
next chapter). The aft optics’ radiance going backward into the interferometer, on the other hand,
interferes with itself as it passes “backwards” through the interferometer, generating an
interference signal that depends on χ, the optical-path difference. Some of this χ-dependent
optical signal ends up returning to the detector. As the moving mirror changes its position, this
interference signal also changes, generating a time-dependent signal capable of passing through
the AC coupling to the rest of the system. In Sec. 4.17 of Chapter 4, we call this the unbalanced
background signal and derive a formula for $P_{\mathrm{unb}}^{(\mathrm{back})}(\chi)$, the power in the unbalanced background
signal at an optical-path difference χ.
Working at the same level of idealization as in the analysis of the balanced interference signal
reaching the detector, we set $\gamma \cong 1$ to neglect substrate absorption in formula (4.163a) for
$P_{\mathrm{unb}}^{(\mathrm{back})}(\chi)$ from Chapter 4 to get

$$
P_{\mathrm{unb}}^{(\mathrm{back})}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, \mathcal{L}^{(\mathrm{back})}(\sigma) \Bigl\{ 2\,|r(\sigma)|^2 - \eta(\sigma) - W\,\eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \Bigr\} . \tag{5.52}
$$

Here Eq. (5.10c) is used to substitute M for the original Bessel-function ratio, and A again refers
to the area of the aperture in the aft optics that specifies the cross-sectional area of the beam
passing through the interferometer. The double integral over $d^2\varepsilon$ can be taken over the
detector’s field of view of the exterior source, since in well-designed systems this is usually a
good approximation for the detector’s background field of view. The $\mathcal{L}^{(\mathrm{back})}(\sigma)$ function refers to
all the radiance entering the back end of the interferometer, not only the background radiance
coming directly from the aft optics but also radiance emitted from the detector itself that passes
backwards through the aft optics before entering the back end of the interferometer. This is why
the unbalanced background signal is sometimes called the “Narcissus” interference signal,
because it can come in part from the detector “looking at itself” in the interferometer.
From Eq. (5.10f) in this chapter and Eqs. (4.139a), (4.139g), and (4.162b) of Chapter 4, we
know that M, $|r|^2$, η, and $\mathcal{L}^{(\mathrm{back})}$ are all even functions of σ, as is, of course, $\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)$.
Hence, the double integral

$$
\iint_{\text{field of view}} d^2\varepsilon\, \mathcal{L}^{(\mathrm{back})}(\sigma) \Bigl\{ 2\,|r(\sigma)|^2 - \eta(\sigma) - W\,\eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \Bigr\}
$$

has the same value at σ and −σ, making it another even function of σ. Equation (5.52) can thus be
written as

$$
P_{\mathrm{unb}}^{(\mathrm{back})}(\chi) = \frac{A}{2} \int_0^\infty d\sigma \iint_{\text{field of view}} d^2\varepsilon\, L^{(\mathrm{back})}(\sigma) \Bigl\{ 2\,|r(\sigma)|^2 - \eta(\sigma) - W\,\eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \Bigr\} , \tag{5.53a}
$$

where we have used Eq. (4.163d) of Chapter 4 to recognize

$$
L^{(\mathrm{back})}(\sigma) = 2\,\mathcal{L}^{(\mathrm{back})}(\sigma) \quad \text{for } \sigma \ge 0
$$

as the spectral radiance of the infrared background entering the back end of the interferometer.
When $\cos\alpha_\varepsilon$ can be approximated as one, this equation reduces to

$$
P_{\mathrm{unb}}^{(\mathrm{back})}(\chi) = \frac{A\,\Delta\Omega}{2} \int_0^\infty L^{(\mathrm{back})}(\sigma) \bigl[ 2\,|r(\sigma)|^2 - \eta(\sigma) - W\,\eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi) \bigr]\, d\sigma , \tag{5.53b}
$$

with

$$
\Delta\Omega = \iint_{\text{field of view}} d^2\varepsilon . \tag{5.53c}
$$

Keeping in mind the definition of M given in Eq. (5.10c) and our approximation that $\gamma \cong 1$, we
see that (5.53b) is the same as Eq. (4.163c) in Chapter 4.
Just as we did for the power in the balanced signal, we can interpret the integrals over dσ in
Eqs. (5.53a) and (5.53b) to be sums over all the power contributions of all the monochromatic
wavenumber components σ of the background radiation. Hence, when $\cos\alpha_\varepsilon$ is approximated by
one in Eq. (5.53b), we say that

$$
d\sigma \cdot \frac{A\,\Delta\Omega}{2}\, L^{(\mathrm{back})}(\sigma) \bigl[ 2\,|r(\sigma)|^2 - \eta(\sigma) - W\,\eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi) \bigr]
$$

is the power carried by the σth wavenumber component leaving the interferometer and traveling
toward the detector; and when $\cos\alpha_\varepsilon$ cannot be approximated by one in Eq. (5.53a), we make the
same claim for

$$
d\sigma \cdot \frac{A}{2} \iint_{\text{field of view}} d^2\varepsilon\, L^{(\mathrm{back})}(\sigma) \Bigl\{ 2\,|r(\sigma)|^2 - \eta(\sigma) - W\,\eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \Bigr\} .
$$

Following the same reasoning used in Secs. 5.8 and 5.9 above to analyze the power in the
balanced optical signal, we multiply these expressions first by the aft optics’ transmission $\tau_a(\sigma)$
to get the fraction of power component passing from the interferometer to the detector and then
by the detector responsivity $R(\sigma)$ to get the signal component produced by the interferometer’s
detector. This makes

$$
K_{\mathrm{unb}}(\chi) = \frac{A}{2} \int_0^\infty d\sigma \iint_{\text{field of view}} d^2\varepsilon\, \tau_a(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma) \Bigl\{ 2\,|r(\sigma)|^2 - \eta(\sigma) - W\,\eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \Bigr\} , \tag{5.54a}
$$

the total unbalanced interference signal leaving the detector when $\cos\alpha_\varepsilon$ cannot be approximated
by one, and

$$
K_{\mathrm{unb}}(\chi) = \frac{A\,\Delta\Omega}{2} \int_0^\infty \tau_a(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma) \bigl[ 2\,|r(\sigma)|^2 - \eta(\sigma) - W\,\eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi) \bigr]\, d\sigma , \tag{5.54b}
$$

the total unbalanced interference signal leaving the detector when $\cos\alpha_\varepsilon$ can be approximated as
one.
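Following the pattern of Eq. (5.37a), we can write Eqs. (5.54a) and (5.54b) as

$$
K_{\mathrm{unb}}(\chi) = \frac{1}{2} K_0^{(\mathrm{unb})} + W K_{I\mathrm{unb}}(\chi) , \tag{5.55a}
$$

where

$$
K_{I\mathrm{unb}}(\chi) = -\frac{A}{2} \int_0^\infty d\sigma \iint_{\text{field of view}} d^2\varepsilon\, \tau_a(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma)\, \eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \tag{5.55b}
$$

when $\cos\alpha_\varepsilon$ cannot be approximated as one, and

$$
K_{I\mathrm{unb}}(\chi) = -\frac{A\,\Delta\Omega}{2} \int_0^\infty \tau_a(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma)\, \eta(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi)\, d\sigma \tag{5.55c}
$$

when $\cos\alpha_\varepsilon$ can be approximated as one. No matter how $\cos\alpha_\varepsilon$ is approximated, we have

$$
K_0^{(\mathrm{unb})} = A\,\Delta\Omega \int_0^\infty \tau_a(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma) \bigl[ 2\,|r(\sigma)|^2 - \eta(\sigma) \bigr]\, d\sigma . \tag{5.55d}
$$

We simplify the formulas in these expressions by defining

$$
S^{(\mathrm{back})}(\sigma) = A\,\Delta\Omega\, \tau_a(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma)\, \eta(\sigma) \tag{5.56a}
$$

to get

$$
K_{I\mathrm{unb}}(\chi) = -\frac{1}{2} \int_0^\infty S^{(\mathrm{back})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\, \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\} d\sigma \tag{5.56b}
$$

when $\cos\alpha_\varepsilon$ cannot be approximated by one and

$$
K_{I\mathrm{unb}}(\chi) = -\frac{1}{2} \int_0^\infty S^{(\mathrm{back})}(\sigma) \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi)\, d\sigma \tag{5.56c}
$$

when it can. We force $S^{(\mathrm{back})}$ to be an even function of its argument by writing

$$
S^{(\mathrm{back})}(\sigma) = A\,\Delta\Omega\, \tau_a(|\sigma|)\, R(|\sigma|)\, L^{(\mathrm{back})}(|\sigma|)\, \eta(\sigma) . \tag{5.57a}
$$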

There is, of course, no need to put absolute value signs on the argument of Ș because we already
know from Eq. (4.139g) of Chapter 4 that it is even. Since the cosine is an even function of its
argument and, according to Eq. (5.10f), so is M, we recognize that now both

S (back) (σ ) ⋅ M( Rσθ ma ) ⋅ cos(2πσχ )


and

- 633 -
5 · Description of Practical Interferometer Measurements

S (back) () ) A M( R)' ma ) A cos(2&) cos   )

are even functions of ı. Repeating the same argument that has already been used before to
convert cosine integrals over even functions into Fourier transforms, we note that

³S
(back)
() ) A M ( R)' ma ) A ªcos(2&) cos  ) º d)
0
¬ ¼
5
1
³ S (back) () ) A M( R)' ma ) A ª cos(2&) cos  )  i sin(2&) cos  ) º d) (5.57b)
2 5 ¬ ¼
5
1 2& i) cos  
³ S (back) () )M( R)' ma ) e d)
2 5
because
S (back) () ) A M( R)' ma ) A sin(2&) cos   )

is an odd function of ı, making its integral over ı between í’ and +’ equal to zero for all values
of cos   .[see Eq. Eq.
Hence, (2.17) in Chapter
(5.57b) can be 2].
usedHence, Eq.
to write (5.57b)
(5.56b) as can be used to write (5.56b) as

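$$\begin{aligned}
K_{I\mathrm{unb}}(\chi) &= -\frac{1}{2} \cdot \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \int_0^\infty d\sigma \, S^{(\mathrm{back})}(\sigma)\, M(R\sigma\theta_{ma}) \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \\
&= -\frac{1}{4} \cdot \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \int_{-\infty}^{\infty} d\sigma \, S^{(\mathrm{back})}(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon}
\end{aligned}$$

or

$$K_{I\mathrm{unb}}(\chi) = -\frac{1}{4}\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\, M(R\sigma\theta_{ma}) \left[ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right] d\sigma \qquad (5.58a)$$

when cos α_ε cannot be approximated as one. When cos α_ε can be approximated as one, (5.57b)
with cos α_ε = 1 can be used to write (5.56c) as

$$K_{I\mathrm{unb}}(\chi) = -\frac{1}{4}\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma) \cdot M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma . \qquad (5.58b)$$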
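The step from a cosine integral over positive wavenumbers to half of a complex-exponential integral over all wavenumbers, as in Eq. (5.57b), can be checked numerically. The sketch below is only an illustration under stated assumptions: the Gaussian `even_spectrum` is a hypothetical stand-in for the even product of the background spectrum and M, cos α_ε is set to one, and a simple midpoint rule replaces the integrals.

```python
import cmath
import math


def even_spectrum(sigma):
    """Hypothetical even test function standing in for S_back(sigma) * M(...)."""
    return math.exp(-sigma * sigma)


def half_line_cosine(chi, sigma_max=10.0, n=20000):
    """Midpoint-rule estimate of the cosine integral over 0 <= sigma <= sigma_max."""
    h = sigma_max / n
    return h * sum(
        even_spectrum((k + 0.5) * h) * math.cos(2.0 * math.pi * (k + 0.5) * h * chi)
        for k in range(n)
    )


def full_line_exponential(chi, sigma_max=10.0, n=20000):
    """One half of the exponential integral over -sigma_max <= sigma <= sigma_max."""
    h = sigma_max / n
    total = 0j
    for k in range(-n, n):
        sigma = (k + 0.5) * h
        total += even_spectrum(sigma) * cmath.exp(2j * math.pi * sigma * chi)
    return 0.5 * h * total


chi = 0.3  # arbitrary optical-path difference for the check
lhs = half_line_cosine(chi)
rhs = full_line_exponential(chi)
print(lhs, rhs)  # real parts agree; the imaginary (sine) part cancels
```

The sine contribution cancels because the sample points are placed symmetrically about σ = 0, which mirrors the odd-function argument in the text.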

To get all of the interference signal reaching the detector from the source, the fore optics’
background, and the aft optics’ background, we add together the expressions for these three
signal components. Equation (5.51b) specifies the combined signal and fore optics’ background,
and Eqs. (5.51f) and (5.55a) give the signal coming from the aft optics’ background. Adding all
these formulas together gives

K tot ( χ ) = S ( dir ) + K bal ( χ ) + K unb ( χ ) . (5.59a)

If cos α ε cannot be approximated as one, Eq. (5.59a) expands to, after applying Eqs. (5.51c)–
(5.51f), (5.55d), (5.58a), and (5.58b),

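$$\begin{aligned}
K_{\mathrm{tot}}(\chi) ={}& A_{\mathrm{det}}\,\Delta\Omega^{(\mathrm{dir})} \int_0^\infty R(\sigma)\, L^{(\mathrm{dir})}(\sigma)\, d\sigma + \frac{1}{2}\int_0^\infty S(\sigma)\, d\sigma \\
&+ \frac{1}{2}\int_0^\infty S^{(\mathrm{fore})}(\sigma)\, d\sigma + \frac{A\,\Delta\Omega}{2}\int_0^\infty \tau_a(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma) \left[ 2r(\sigma) - \eta(\sigma) \right] d\sigma \\
&+ \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\
&+ \frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{fore})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\
&- \frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma
\end{aligned} \qquad (5.59b)$$
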
and if cos α ε can be approximated as one, Eq. (5.59a) can be written as

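$$\begin{aligned}
K_{\mathrm{tot}}(\chi) ={}& A_{\mathrm{det}}\,\Delta\Omega^{(\mathrm{dir})} \int_0^\infty R(\sigma)\, L^{(\mathrm{dir})}(\sigma)\, d\sigma + \frac{1}{2}\int_0^\infty S(\sigma)\, d\sigma \\
&+ \frac{1}{2}\int_0^\infty S^{(\mathrm{fore})}(\sigma)\, d\sigma + \frac{A\,\Delta\Omega}{2}\int_0^\infty \tau_a(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma) \left[ 2r(\sigma) - \eta(\sigma) \right] d\sigma \\
&+ \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma \\
&+ \frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{fore})}(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma \\
&- \frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma .
\end{aligned} \qquad (5.59c)$$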

When the moving mirror moves at a constant OPD velocity u so that

χ = ut ,

then the constant terms (that is, the terms that do not depend on χ) do not make it past the detector
circuit that AC couples the detector to the rest of the system. According to the discussion
following Eq. (5A.2a) in Appendix 5A, if we know what the output of the linear detector circuit
is for each individual component of a sum of input signals, then we know that the output of the
linear detector circuit for the sum of the input signals is the sum of the outputs of the individual
components. Using χ = ut to represent the nonconstant terms, we already know from the
procedure used to transform Eq. (5.42a) to (5.47b) that the term

$$W K_{I\mathrm{bal}}(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma$$

in Eq. (5.42a) entering the detector circuit comes out as

$$\frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma$$

in Eq. (5.47b) when cos α_ε cannot be approximated as one. Consequently, when the same term

$$\frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma$$

occurs in Eq. (5.59b), we know that it comes out of the detector circuit as

$$\frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma .$$

Passage through the detector circuit just introduces a factor of H(σu cos α_ε) into the integral
over the field of view when cos α_ε cannot be approximated as one. Examining the other two
nonconstant terms in Eq. (5.59b), we note that the only difference between them and the term just
analyzed is the way the S(σ) function is labeled: for one of the input terms we have

S (σ ) → S ( fore ) (σ )

and for the other we have

S (σ ) → − S (back ) (σ ) .

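Therefore, we can write down at once that

$$\frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{fore})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma$$

becomes

$$\frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{fore})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma ,$$

and that

$$-\frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma$$

becomes

$$-\frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma .$$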

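We now know what the output of the detector circuit is for each nonconstant component of the
sum in Eq. (5.59b), and we have already noted that the χ-independent, constant terms in (5.59b)
have zero output. Knowing what the output is for each individual component of the sum in
(5.59b), we can write down the total output of (5.59b) as the sum of the outputs of each
individual component to get, when cos α_ε cannot be approximated as one, that the total signal
leaving the detector circuit is

$$\begin{aligned}
z_{\mathrm{tot}}(\chi) ={}& \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\
&+ \frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{fore})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\
&- \frac{W}{4}\int_{-\infty}^{\infty} S^{(\mathrm{back})}(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\
={}& \frac{W}{4}\int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma) \right] \cdot M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma .
\end{aligned} \qquad (5.60a)$$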

To get the total signal leaving the detector circuit when cos α_ε can be approximated as one, we
need only replace cos α_ε by one to get

$$z_{\mathrm{tot}}(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma) \right] H(\sigma u)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma . \qquad (5.60b)$$

Once again, we use the formula

$$\Delta\Omega = \iint_{\text{field of view}} d^2\varepsilon$$

to dispose of the integral over the field of view. Equations (5.60a) and (5.60b) show that to
include the effect of the background radiance in the standard formulas for the signal leaving the
detector circuit, we need only replace the original source spectrum S(σ) in Eqs. (5.47b) and
(5.47c) with

$$S(\sigma) \to S(\sigma) + S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma) . \qquad (5.60c)$$

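Equations (5.40g), (5.51a), and (5.57a) are now substituted into (5.60c) to get

$$\begin{aligned}
A\,\Delta\Omega\, R(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,\eta(\sigma)\, L(|\sigma|) \to{}& A\,\Delta\Omega\, R(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,\eta(\sigma)\, L(|\sigma|) \\
&+ A\,\Delta\Omega\, R(|\sigma|)\,\tau_a(|\sigma|)\,\eta(\sigma)\, L^{(\mathrm{fore})}(|\sigma|) \\
&- A\,\Delta\Omega\, R(|\sigma|)\,\tau_a(|\sigma|)\,\eta(\sigma)\, L^{(\mathrm{back})}(|\sigma|) ,
\end{aligned}$$

which can be reduced to

$$L(|\sigma|) \to L(|\sigma|) + \frac{L^{(\mathrm{fore})}(|\sigma|)}{\tau_f(|\sigma|)} - \frac{L^{(\mathrm{back})}(|\sigma|)}{\tau_f(|\sigma|)} . \qquad (5.60d)$$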

When the background radiance L( back ) is very large, the signal ztot leaving the detector circuit can
quite literally be the transform of a “negative” spectrum. The replacement rules given in (5.60c)
and (5.60d) are one reason we only need to keep track of the input radiance L when analyzing the
noise-free signal leaving the detector circuit—because (5.60c) or (5.60d) can be used at any point
to reintroduce the background radiances into the Fourier transforms. The next section gives
another reason the background radiances can be disregarded: they are easy to eliminate from the
signal leaving the detector circuit before any attempt is made to measure the input radiance
spectrum.
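The replacement rule (5.60d) is simple enough to evaluate at a single wavenumber. The sketch below is only an illustration: the function name `effective_radiance` and all numerical values (arbitrary radiance units, a fore-optics transmission of 0.8) are hypothetical, chosen to show how a large aft-optics background drives the effective radiance negative.

```python
def effective_radiance(l_source, l_fore, l_back, tau_fore):
    """Right-hand side of replacement rule (5.60d) at one wavenumber."""
    return l_source + l_fore / tau_fore - l_back / tau_fore


# Hypothetical radiances in arbitrary units.
cold_instrument = effective_radiance(1.0, 0.05, 0.10, 0.8)  # weak backgrounds
warm_instrument = effective_radiance(1.0, 0.5, 3.0, 0.8)    # large aft-optics background

print(cold_instrument)  # slightly below the source radiance
print(warm_instrument)  # negative: the transform of a "negative" spectrum
```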

5.14 Removing the Background Spectra


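The source spectrum, which is what we build Fourier-transform spectrometers to measure, is derived
from the Fourier transform of z(χ), the signal leaving the detector circuit. When cos α_ε can be
approximated as one and the background radiance can be neglected, as in Eq. (5.47c), we start
with the formula

$$z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} H(\sigma u)\, S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma$$

and reverse the Fourier transform to get

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \frac{W}{4}\, H(\sigma u)\, S(\sigma)\, M(R\sigma\theta_{ma}) . \qquad (5.61a)$$

Substituting for the source spectrum S from Eq. (5.40g) gives

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \frac{A\,\Delta\Omega\, W}{4}\, L(|\sigma|)\, H(\sigma u)\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, M(R\sigma\theta_{ma}) , \qquad (5.61b)$$

which can be solved for the source radiance to get

$$L(|\sigma|) = \left[ \frac{A\,\Delta\Omega\, W}{4}\, H(\sigma u)\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, M(R\sigma\theta_{ma}) \right]^{-1} \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \qquad (5.61c)$$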

Before we started analyzing the interferometer’s background radiance, this sort of equation had
been enough to explain how to find the source radiance, since in a well-aligned interferometer
M ≅ 1 and all the other quantities,

A, ∆Ω, W , R, η, τ a , τ f , H, and u ,

are known or can (in principle, anyway) be measured. This lets us write

$$L(|\sigma|) = \left[ \frac{A\,\Delta\Omega\, W}{4}\, H(\sigma u)\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|) \right]^{-1} \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \qquad (5.61d)$$

to get a formula for what we want to measure in terms of the Fourier transform of z(χ) and other
known quantities. Now, however, we know from the work done in the previous section that when
measuring infrared spectra there may be significant amounts of background radiance
contaminating the source spectrum. Equations (5.60a) and (5.60b) show that if the background
radiance cannot be neglected, then the signal leaving the detector circuit is not z(χ) but rather
z_tot(χ), which is not the correct signal to substitute into equations such as (5.61d).
To recover z(χ) from z_tot(χ) there must be two measurements made: one looking at the source
and one looking at nothing at all. No matter how cos α_ε is approximated, when the
interferometer observes an extremely cold source, it produces a signal in Eqs. (5.60a) and (5.60b)
in which S(σ), the infrared source spectrum, is very small compared to the background spectra
S^(fore)(σ) and S^(back)(σ). To match the notation used in Chapter 6, where the background
radiances play a more important role than they do here, we call this signal z_C^(cold)(χ). According
to Eqs. (5.60a) and (5.60b), z_C^(cold)(χ) can be written as


$$z_C^{(\mathrm{cold})}(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} \left[ S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma) \right] \cdot M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \qquad (5.62a)$$

when cos α_ε cannot be approximated as one, and as

$$z_C^{(\mathrm{cold})}(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} \left[ S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma) \right] H(\sigma u)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma \qquad (5.62b)$$

when cos α_ε can be approximated as one. Assuming the interferometer is stable, meaning that the
background radiances of the instrument do not change, we can then measure z_tot(χ) as given in
formulas (5.60a) and (5.60b) and subtract from it z_C^(cold)(χ) as defined in Eqs. (5.62a) and
(5.62b). This gives

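$$\begin{aligned}
z(\chi) ={}& z_{\mathrm{tot}}(\chi) - z_C^{(\mathrm{cold})}(\chi) \\
={}& \frac{W}{4}\int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma) \right] \cdot M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\
&- \frac{W}{4}\int_{-\infty}^{\infty} \left[ S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma) \right] \cdot M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\
={}& \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma
\end{aligned} \qquad (5.62c)$$

when cos α_ε cannot be approximated as one, and

$$\begin{aligned}
z(\chi) ={}& z_{\mathrm{tot}}(\chi) - z_C^{(\mathrm{cold})}(\chi) \\
={}& \frac{W}{4}\int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma) \right] M(R\sigma\theta_{ma})\, H(\sigma u)\, e^{2\pi i\sigma\chi}\, d\sigma \\
&- \frac{W}{4}\int_{-\infty}^{\infty} \left[ S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma) \right] M(R\sigma\theta_{ma})\, H(\sigma u)\, e^{2\pi i\sigma\chi}\, d\sigma \\
={}& \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, H(\sigma u)\, e^{2\pi i\sigma\chi}\, d\sigma
\end{aligned} \qquad (5.62d)$$
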
when cos α_ε can be approximated as one. This is one of the ways infrared spectroscopists using
interferometers with uncooled optics can eliminate unwanted background spectra and retrieve the
desired z(χ) interferogram signal associated with the source spectrum. Designers of satellite
interferometers almost always schedule some form of “space look” where the instrument
observes nothing but empty space, containing only distant sources of radiation too dim for the
instrument to detect. This sort of space look allows it to acquire the information needed to find
the z_C^(cold)(χ) signal generated by its own internal warmth. A quick way of achieving the same
effect on the ground is to point the interferometer at a surface cooled by liquid nitrogen or, for
greater accuracy, liquid helium.
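The subtraction in Eq. (5.62d) can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not part of the text: H = M = 1, W = 4, a narrow field of view, and made-up spectra (one Gaussian emission line plus flat fore-optics and aft-optics backgrounds on a hypothetical wavenumber grid).

```python
import math


def interferogram(spectrum, sigmas, d_sigma, chi):
    """Narrow-field interferogram of an even spectrum, with H = M = 1 and W = 4 assumed."""
    # For an even spectrum, the integral over all sigma is twice the cosine
    # integral over positive sigma.
    return 2.0 * d_sigma * sum(
        s * math.cos(2.0 * math.pi * sigma * chi) for s, sigma in zip(spectrum, sigmas)
    )


d_sigma = 1.0                                            # grid spacing in cm^-1 (hypothetical)
sigmas = [500.0 + k * d_sigma for k in range(1000)]      # 500 to 1499 cm^-1
source = [math.exp(-((s - 900.0) / 20.0) ** 2) for s in sigmas]  # one emission line
fore = [0.3] * len(sigmas)                               # flat fore-optics background
back = [2.0] * len(sigmas)                               # larger aft-optics background

total = [a + b - c for a, b, c in zip(source, fore, back)]   # bracket in Eq. (5.60b)
cold = [b - c for b, c in zip(fore, back)]                   # bracket in Eq. (5.62b)

chis = [k * 1.0e-4 for k in range(-50, 51)]              # optical-path differences in cm
z_tot = [interferogram(total, sigmas, d_sigma, x) for x in chis]
z_cold = [interferogram(cold, sigmas, d_sigma, x) for x in chis]
z_recovered = [a - b for a, b in zip(z_tot, z_cold)]     # Eq. (5.62d)
z_source = [interferogram(source, sigmas, d_sigma, x) for x in chis]

worst = max(abs(a - b) for a, b in zip(z_recovered, z_source))
print(worst)      # rounding-level difference
print(z_tot[50])  # value at chi = 0; negative because the background dominates
```

With these made-up numbers the uncorrected signal at zero path difference is negative, while the cold-look subtraction returns the source-only interferogram to within rounding error.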
Now that we know how to extract z(χ) from the unwanted background, the presence of the
background signal can be disregarded when analyzing nonrandom spectral distortions introduced
by nonideal interferometer measurements. This is what we do for the rest of this chapter (except
for Sec. 5.19, where we discuss one common method of extracting a radiance measurement from
the raw signal spectrum). These formulas do, however, return in the next chapter because the
background signal can have a significant effect on the amount of random noise present in the
measurement.

5.15 Double-Sided Interferograms


Equation (5.61d) gives the spectral radiance (what Fourier-transform spectrometers measure) in
terms of the Fourier transform

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi$$

of the interferogram signal z(χ) leaving the detector circuit. It is, of course, impossible to measure
z for all optical-path differences χ between −∞ and +∞, so there is no hope of calculating the
direct, unadulterated Fourier transform of z. We must therefore settle for an approximation of the
Fourier transform, and there are two different ways to do this: one using finite-length,
double-sided measurements of the interferogram signal and one using finite-length, single-sided
measurements of the interferogram signal. Because it is conceptually simpler, we start with the
double-sided interferogram measurement, postponing discussion of the single-sided
interferogram until Sec. 5.18 below.
As was remarked at the end of Sec. 5.12, the interferogram signal leaving the detector circuit
is usually approximately (although not exactly) even, so that it tends to look as shown in Fig.
5.22 when plotted as a function of χ. In a double-sided interferogram measurement, there is a
positive length D such that the signal z(χ) is measured for all

$$-D \le \chi \le D ,$$

or

$$|\chi| \le D .$$

When z is only measured for |χ| ≤ D, there is no way to know what z is in the regions marked
with question marks “?” in Fig. 5.22, and in a double-sided interferogram measurement, the value
of z(χ) in these regions is assumed to be, if not negligible, at any rate unimportant. The Fourier
transform of z then becomes

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \to \int_{-D}^{D} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi , \qquad (5.63a)$$

where

$$\Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} \qquad (5.63b)$$

has already been defined by Eq. (4C.1a) in Appendix 4C of Chapter 4.

FIGURE 5.22. [Sketch of the interferogram signal centered on χ = 0 and measured over the interval of length 2D between χ = −D and χ = D; the unmeasured regions beyond χ = ±D are marked with question marks.]


To see the effect of neglecting the signal values at |χ| > D, we must understand the
information carried by the Fourier transform of z(χ). We know from the discussion in Sec. 5.11
that there is always an effective spectral function Z_eff(σ) that is the Fourier transform of z [see
Eq. (5.48b)],

$$Z_{\mathrm{eff}}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \qquad (5.64a)$$

Rewriting Eq. (5.61d) by substituting (5.64a) for the Fourier transform of z gives

$$L(|\sigma|) = \left[ \frac{A\,\Delta\Omega\, W}{4}\, H(\sigma u)\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|) \right]^{-1} Z_{\mathrm{eff}}(\sigma) . \qquad (5.64b)$$

The terms inside the square brackets are usually designed to be slowly varying functions over the
range of wavenumbers for which L(σ) is being measured. This means Z_eff(σ) contains the fine
details of spectrum L(σ). Since L(σ) is real and, according to the discussion following Eq.
(5A.6b) in Appendix 5A, the transfer function H(uσ) is complex, the effective spectrum
Z_eff(σ) in (5.64b) must also be complex. Taking the complex magnitude of both sides of formula
(5.64b), we indicate that Z_eff(σ) carries the fine details of L(σ) by writing

$$L(|\sigma|) \sim \left| Z_{\mathrm{eff}}(\sigma) \right| . \qquad (5.64c)$$

Although Eq. (5.64b) comes from formulas that apply only when the interferometer’s field of
view is sufficiently narrow that cos α_ε can be approximated as one, the idea expressed by
(5.64c), that Z_eff(σ) carries the fine details of the L(σ) spectrum, holds true even when cos α_ε
cannot be approximated as one.
We now consider what happens to these fine details when what we have is not Z_eff(σ), the
true Fourier transform of z(χ), but rather the double-sided approximation specified in Eq. (5.63a).
The integral in (5.63a) is the Fourier transform of the product Π(χ, D)·z(χ), and by the Fourier
convolution theorem [see Eq. (2.39k) of Chapter 2], this can be written as the convolution of the
Fourier transforms of Π and z. We already know that Z_eff(σ) is the Fourier transform of z, and
the Fourier transform of Π can be evaluated directly as

$$\int_{-\infty}^{\infty} \Pi(\chi, D)\, e^{-2\pi i\sigma\chi}\, d\chi = \int_{-D}^{D} e^{-2\pi i\sigma\chi}\, d\chi = -\frac{1}{2\pi i\sigma} \left[ e^{-2\pi i\sigma\chi} \right]_{-D}^{D} = 2D\,\mathrm{sinc}(2\pi\sigma D) , \qquad (5.65a)$$

where in the last step we use

$$e^{i\phi} = \cos\phi + i\sin\phi$$

and the function

$$\mathrm{sinc}(x) = \frac{\sin x}{x} \qquad (5.65b)$$

previously defined in Eq. (2.106d) of Chapter 2. Hence, by the Fourier convolution theorem

$$\int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \left[ 2D\,\mathrm{sinc}(2\pi\sigma D) \right] * Z_{\mathrm{eff}}(\sigma) . \qquad (5.65c)$$

This shows that what we settle for in a double-sided interferogram measurement is the
convolution of Z eff (σ ) with 2 Dsinc(2πσ D) instead of the true Fourier transform Z eff (σ ) .
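As a numerical cross-check of Eq. (5.65a), the sketch below integrates the boxcar Π(χ, D) directly and compares the result with 2D sinc(2πσD). The function names, the midpoint rule, and the test values D = 2 and σ = 0.37 are illustrative choices, not anything prescribed by the text.

```python
import cmath
import math


def sinc(x):
    """Unnormalized sinc of Eq. (5.65b): sin(x)/x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def boxcar_transform(sigma, d_max, n=20000):
    """Midpoint-rule Fourier transform of the boxcar over -d_max <= chi <= d_max."""
    h = 2.0 * d_max / n
    total = 0j
    for k in range(n):
        chi = -d_max + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * sigma * chi)
    return h * total


def sinc_response(sigma, d_max):
    """Closed form on the right-hand side of Eq. (5.65a)."""
    return 2.0 * d_max * sinc(2.0 * math.pi * sigma * d_max)


d_max = 2.0
numeric = boxcar_transform(0.37, d_max)
analytic = sinc_response(0.37, d_max)
first_zero = sinc_response(1.0 / (2.0 * d_max), d_max)

print(numeric, analytic)  # the two agree; the imaginary part is at rounding level
print(first_zero)         # essentially zero at sigma = 1/(2D)
```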
In the discussion following Eq. (2.39 A ) of Chapter 2, we pointed out that when two functions
are convolved and one of them is much narrower than the other, the narrower function can be
thought of as blurring and distorting the shape of the other. Since what we are interested in is the
fine detail encoded in

$$Z_{\mathrm{eff}}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi ,$$

we cannot hope to get even an approximate measurement of this fine detail unless
2D sinc(2πσD) is narrower than Z_eff(σ), the Fourier transform of z. We substitute the
right-hand side of (5.65c), which is our approximation for Z_eff(σ), the Fourier transform of z, into
(5.64c) to get

$$L_{\mathrm{blur}}(|\sigma|) \sim \left| 2D\,\mathrm{sinc}(2\pi\sigma D) * Z_{\mathrm{eff}}(\sigma) \right| . \qquad (5.66a)$$

The original spectral radiance L(σ) encodes its own fine details at least as well as Z_eff(σ), which
lets us write (5.66a) as

$$L_{\mathrm{blur}}(|\sigma|) \sim \left| 2D\,\mathrm{sinc}(2\pi\sigma D) * L(|\sigma|) \right|$$

or

$$L_{\mathrm{blur}}(|\sigma|) \sim 2D\,\mathrm{sinc}(2\pi\sigma D) * L(|\sigma|) . \qquad (5.66b)$$

In the last step, we restrict the magnitude signs to the arguments of L and L_blur because
2D sinc(2πσD) and L(σ) are always real, making their convolution real, and because negative
values of the convolution indicate an unphysically distorted measurement of L(σ), since L
cannot be negative. Since, according to Eq. (2.38b) in Chapter 2, it does not matter in what order
two functions are convolved, this can also be written as

$$L_{\mathrm{blur}}(|\sigma|) \sim L(|\sigma|) * 2D\,\mathrm{sinc}(2\pi\sigma D) .$$

Comparing this result to Eq. (2.40a) of Chapter 2, we realize that 2D sinc(2πσD) is playing the
role of an instrument response function. Figure 5.23 reveals the width of function
2D sinc(2πσD) between the two zeros bracketing the central peak to be 1/D. This shows us how
to control the narrowness of the spectrometer’s instrument-response function. When designing
Fourier-transform spectrometers we try to pick D sufficiently large that the blurring sinc
function in (5.66b) does not significantly distort the spectral features of the radiance L(σ) that we
want to measure.
Figures 5.24(a)–5.24(f) give examples of how this works when the 2D sinc(2πσD)
instrument-response function acts to blur together a pair of ever-closer spectral peaks. We
see that when the peaks are separated by a wavenumber interval

$$\Delta\sigma = \frac{1}{2D} \qquad (5.67)$$

all sure knowledge of their separate existence is lost. In Fourier-transform spectrometry, the
quantity (2D)^(−1) is often called the unapodized spectral resolution of the interferometer
measurement. This terminology can be confusing, because a smaller spectral resolution Δσ now
corresponds to a higher resolving power for the interferometer. The important thing to remember
is that the interferometer’s resolving power (that is, its ability to measure spectral detail) is
directly proportional to D. Figures 5.24(a)–5.24(f) also show that when the true spectra are
convolved with sinc-like instrument-response functions, the oscillations in the
instrument-response functions create secondary oscillations in regions where L(σ) is changing rapidly. This
is sometimes referred to as “ringing” in the measured spectrum L_blur(σ). This ringing can lead to
unphysically negative values in L_blur(σ), as shown in Figs. 5.24(b), 5.24(d), and 5.24(f).
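The loss of resolution and the ringing described above can be reproduced with two idealized lines. The sketch below assumes two equal delta-function lines placed symmetrically about σ = 0 (a hypothetical test spectrum), so that the blurred spectrum is just the sum of two shifted copies of 2D sinc(2πσD); the function names are illustrative.

```python
import math


def sinc(x):
    """Unnormalized sinc of Eq. (5.65b): sin(x)/x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def blurred_pair(sigma, separation, d_max):
    """2D sinc response to two equal delta-function lines centered on sigma = 0."""
    half = 0.5 * separation
    return 2.0 * d_max * (
        sinc(2.0 * math.pi * d_max * (sigma - half))
        + sinc(2.0 * math.pi * d_max * (sigma + half))
    )


d_max = 1.0
resolution = 1.0 / (2.0 * d_max)  # Eq. (5.67)

for multiple in (3, 2, 1):
    separation = multiple * resolution
    midpoint = blurred_pair(0.0, separation, d_max)             # halfway between the lines
    on_line = blurred_pair(0.5 * separation, separation, d_max)  # at one line center
    print(multiple, midpoint, on_line)
```

At three times the resolution the midpoint value is negative (ringing between two clearly separated peaks), while at a separation of 1/(2D) the midpoint is higher than the line centers, so the pair has merged into a single peak.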
In Fourier-transform spectroscopy, the instrument-response function is often called the
instrument line shape, or ILS for short. The instrument line shape can be measured by passing a
laser beam through the interferometer. Although all lasers in practice have some spectral width,
they do produce a spectral radiance L(σ) that is, as shown in Fig. 5.25(a), very close to a delta
function.86 Figure 5.25(b) plots the curve L_blur(σ) produced by a Fourier-transform spectrometer
when it measures the laser spectrum at wavenumber σ = σ_0. We can normalize L_blur(σ) so that
the total area under the curve is one, creating a new curve
86 Equation (5.16c) gives the ideal interferogram created by a strictly monochromatic source represented by a delta
function.

FIGURE 5.23. This is a graph of 2D sinc(2πσD) versus σ. [The central peak has height 2D, and the two zeros bracketing it are marked at σ = −1/(2D) and σ = 1/(2D).]

$$L_{\mathrm{blur}}^{(\mathrm{norm})}(\sigma) = \left[ \int_0^\infty L_{\mathrm{blur}}(\sigma')\, d\sigma' \right]^{-1} L_{\mathrm{blur}}(\sigma) . \qquad (5.68a)$$

FIGURES 5.24(a)–5.24(f). [Panels (a), (c), and (e) plot L(σ) versus σ for two peaks separated by 3·(1/2D), 2·(1/2D), and 1/(2D), respectively; panels (b), (d), and (f) plot the corresponding L_blur(σ).]

The origin of the wavenumber axis is then shifted so that the center of the normalized curve is at
the origin, giving a measurement of the instrument-response function or instrument line shape at
σ = σ_0,

$$I_{LS}(\sigma) = L_{\mathrm{blur}}^{(\mathrm{norm})}(\sigma + \sigma_0) , \qquad (5.68b)$$

as shown in Fig. 5.25(c). To a first approximation (and as a general rule of thumb), we expect to
get about the same shape for I_LS(σ) no matter what the wavenumber σ_0 of the laser used to
make the measurement.
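The two steps in Eqs. (5.68a) and (5.68b), normalizing to unit area and recentering on the laser wavenumber, can be sketched for sampled data as follows. Everything here is an illustrative assumption: the uniform grid, the rectangle-rule area, the laser wavenumber of 1000 cm^-1, and the arbitrary amplitude of 7 given to the simulated measurement.

```python
import math


def sinc(x):
    """Unnormalized sinc of Eq. (5.65b): sin(x)/x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def instrument_line_shape(sigmas, l_blur, sigma_0):
    """Normalize to unit area, Eq. (5.68a), and shift the axis by sigma_0, Eq. (5.68b)."""
    step = sigmas[1] - sigmas[0]             # uniform grid assumed
    area = step * sum(l_blur)                # rectangle-rule estimate of the integral
    shifted_axis = [s - sigma_0 for s in sigmas]
    return shifted_axis, [value / area for value in l_blur]


d_max = 0.5        # maximum optical-path difference in cm (hypothetical)
sigma_0 = 1000.0   # laser wavenumber in cm^-1 (hypothetical)
step = 0.1
sigmas = [900.0 + step * k for k in range(2001)]

# Simulated laser measurement: a sinc response with an arbitrary amplitude of 7.
l_blur = [7.0 * 2.0 * d_max * sinc(2.0 * math.pi * d_max * (s - sigma_0)) for s in sigmas]

axis, ils = instrument_line_shape(sigmas, l_blur, sigma_0)
area_check = (axis[1] - axis[0]) * sum(ils)
peak_position = axis[ils.index(max(ils))]
print(area_check, peak_position)
```

The arbitrary amplitude drops out after normalization, which is why the measured line shape does not depend on the laser power.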
One last point worth making is that after the effective spectrum Z_eff(σ) has been blurred by a
convolution with the sinc function, what we end up with is a new effective spectrum

$$Z_{\mathrm{eff,new}}(\sigma) = \left[ 2D\,\mathrm{sinc}(2\pi\sigma D) \right] * \left[ Z_{\mathrm{eff,old}}(\sigma) \right] . \qquad (5.69a)$$

Now the Fourier-transform relationship in Eq. (5.65c) can be written as

$$Z_{\mathrm{eff,new}}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \qquad (5.69b)$$

and

$$\Pi(\chi, D)\, z(\chi) = \int_{-\infty}^{\infty} Z_{\mathrm{eff,new}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \qquad (5.69c)$$

So even this aspect of the interferogram signal (that we cannot measure it for all optical-path
differences between −∞ and +∞) can be expressed by representing the truncated signal

$$\Pi(\chi, D)\, z(\chi)$$

as the Fourier transform of an effective spectral function Z(σ).

5.16 Apodization of Spectra


In Sec. 5.15, the basic philosophy of the double-sided interferogram is to give equal weight to all
parts of the signal measured between +D and −D, as shown in Eq. (5.63a). When, however, the
approximation in (5.63a) is written as

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \to \int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi ,$$

it is perhaps not so obvious that putting function Π(χ, D) inside the integral on the right-hand
side leads to the best possible approximation of the true Fourier transform of z. Suppose we
replace Π with an arbitrary function of χ called a_D(χ), making the approximation that

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \cong \int_{-\infty}^{\infty} a_D(\chi)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \qquad (5.70a)$$

FIGURE 5.25(a). [Plot of L(σ) versus σ: a narrow laser line at σ = σ_0.]

FIGURE 5.25(b). [Plot of L_blur(σ) versus σ, centered on σ_0 with the axis marked at σ_0 − 1/(2D) and σ_0 + 1/(2D).]

FIGURE 5.25(c). [Plot of I_LS(σ) versus σ, centered on the origin with the axis marked at −1/(2D) and 1/(2D).]


The subscript D reminds us that

$$a_D(\chi) = 0 \quad \text{for } |\chi| > D \qquad (5.70b)$$

since we do not know what values to give z when |χ| > D. Setting up the problem of
approximating the true Fourier transform in this way (that is, the way it is stated in Eq. (5.70a))
suggests that what we need to do is find that function a_D(χ) for which the integral

$$\int_{-\infty}^{\infty} a_D(\chi)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi$$

best approximates the true Fourier transform

$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi .$$

Trying to approximate the Fourier transform of a function z, which is known from only a finite
stretch of data, is not a problem unique to Fourier-transform spectroscopy; in fact, it occurs over
and over again in many different fields of electrical engineering and signal processing. In these
fields, a_D is called the window function and multiplying z(χ) by a_D(χ) is referred to as
windowing z(χ). In Fourier-transform spectroscopy a_D is called the apodization function, and
multiplying z(χ) by a_D(χ) is called apodizing the interferogram signal z.
There are several different types of restrictions put on the apodization function a_D(χ). If

$$Z_{\mathrm{eff}}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \qquad (5.71a)$$

is the true Fourier transform of z, then according to Eq. (2.35b) of Chapter 2,

$$z(0) = \int_{-\infty}^{\infty} Z_{\mathrm{eff}}(\sigma)\, d\sigma . \qquad (5.71b)$$

When we replace z(χ) by a_D(χ)z(χ) in Eq. (5.70a), distorting the shape of the Fourier transform
Z_eff(σ), we want the integral over the distorted spectrum to have the same value as the integral
over the undistorted spectrum in (5.71b). Because the distorted spectrum is by definition the
Fourier transform of a_D(χ)z(χ), it follows (again using (2.35b) of Chapter 2) that the integral
over the distorted spectrum is a_D(0)z(0). Forcing the integrals over the distorted and undistorted
spectra to have the same values now leads to

aD (0) z (0) = z(0)


or
aD (0) = 1 . (5.71c)

It is hard to justify giving the apodization or window function a nonzero imaginary part, so
almost always
Im ( aD ( χ ) ) = 0 . (5.71d)

According to the discussion at the end of Sec. 5.12, z(χ) is often an approximately symmetric
function of the optical-path difference χ, which means there is no obvious reason to weight z(−χ)
differently from z(χ) in the integral on the right-hand side of (5.70a). This suggests that the
apodization should be an even function of the optical-path difference:

aD (-χ ) = aD (χ ) . (5.71e)

Different choices of a_D(χ) preserve different aspects of Z_eff(σ) when it is approximated by
the apodization integral in (5.70a), and it is impossible to pick one particular apodization or
window function as being ideal under all circumstances. Applying the Fourier convolution
theorem [in the form of Eq. (2.39k) of Chapter 2] to (5.70a) and remembering that Z_eff(σ) is the
exact Fourier transform of z(χ), we get

$$Z_{\mathrm{eff}}(\sigma) \cong A_D(\sigma) * Z_{\mathrm{eff}}(\sigma) , \qquad (5.72a)$$

where

$$A_D(\sigma) = \int_{-\infty}^{\infty} a_D(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \qquad (5.72b)$$

From Eqs. (5.71d) and (5.71e), we know that a_D(χ) is real and even, which means, according to
entry 1 in Table 2.1 of Chapter 2, that the Fourier transform A_D(σ) is also real and even. Figures
5.26(a) and 5.26(b) give some of the more popular apodization or window functions and their
corresponding Fourier transforms. Compared to Π(χ, D), they all do a better job of preventing
ringing; in fact, the Bartlett and Parzen window functions, because their Fourier transforms do
not go negative, can never produce unphysical negative values when convolved with the
non-negative true spectrum L(σ) [which is the basic shape-determining factor of Z_eff(σ) on the
right-hand side of Eq. (5.72a)]. Apodization functions in fact get their name from the way they can
diminish or remove unsightly ringing at the base of sharp spectral peaks in Fourier
measurements. The “pod” root comes from the Greek word for “foot,” a metaphorical reference to
the small spurious bumps often present at the base of these peaks; and the “a” prefix before the
“pod” shows that apodization is intended to remove (or diminish) the “feet.” As a rule of thumb,
apodizing the interferogram signal is more a matter of aesthetics (making the measured spectrum
look better) than it is a way to reveal previously hidden spectral detail. If there are doubts about
the true shape of a measured spectrum, it is better to increase the value of D than to introduce a
more sophisticated apodization function.
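The claim that a Bartlett window cannot produce negative lobes, while the unapodized boxcar can, is easy to check numerically. The sketch below assumes the usual signal-processing definition of the Bartlett (triangular) window, 1 − |χ|/D for |χ| ≤ D, which may differ in detail from the curve plotted in Fig. 5.26(a); the function names and the midpoint rule are likewise illustrative.

```python
import math


def boxcar(chi, d_max):
    """Unapodized weighting: one inside |chi| <= d_max, zero outside."""
    return 1.0 if abs(chi) <= d_max else 0.0


def bartlett(chi, d_max):
    """Triangular (Bartlett) window in its usual signal-processing form (assumed)."""
    return max(0.0, 1.0 - abs(chi) / d_max)


def window_transform(window, sigma, d_max, n=4000):
    """Midpoint-rule cosine transform, which suffices for a real, even window."""
    h = 2.0 * d_max / n
    total = 0.0
    for k in range(n):
        chi = -d_max + (k + 0.5) * h
        total += window(chi, d_max) * math.cos(2.0 * math.pi * sigma * chi)
    return h * total


d_max = 1.0
sigmas = [0.05 * k for k in range(61)]  # wavenumbers from 0 to 3/D

boxcar_min = min(window_transform(boxcar, s, d_max) for s in sigmas)
bartlett_min = min(window_transform(bartlett, s, d_max) for s in sigmas)
print(boxcar_min)    # clearly negative: the source of ringing
print(bartlett_min)  # zero to within the integration error
```

Both windows satisfy a_D(0) = 1 as required by Eq. (5.71c); the price paid by the Bartlett window is a wider central peak, which is why increasing D remains the better way to recover detail.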

5.17 The Effect of a Finite Field of View


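Equation (5.62c) gives the formula for the detector-circuit signal z(χ) generated by the source
when cos α_ε cannot be approximated as one:

$$z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, H(\sigma u\cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma . \qquad (5.73a)$$

To investigate what happens to this signal when the field of view is sufficiently large that cos α_ε
is approximately but not exactly equal to one, we write

$$\cos\alpha_\varepsilon \cong 1 - \frac{\alpha_\varepsilon^2}{2} . \qquad (5.73b)$$

Substitution of this back into the formula for z(χ) gives

$$z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi} \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \, e^{-\pi i\sigma\chi\alpha_\varepsilon^2}\, H\!\left(u\sigma - \frac{u\sigma\alpha_\varepsilon^2}{2}\right) \right\} d\sigma . \qquad (5.73c)$$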
- 656 -
The Effect of a Finite Field of View · 5.17

FIGURE 5.26(a).

1.0
1.0

0.9

0.80.8 Bartlett
0.7

0.60.6 Parzen
B
kg
0.5
T
kg
0.40.4
H
kg

P 0.3
kg

0.20.2 Tukey

0.1
Hamming
0.0 0

0.1

0.2 0.2
1.2 1 0.8 0.6 0.4 0.2 0 0.2 0.4 0.6 0.8 1 1.2
1.2 0.0
t
kg
1.2

D D

This graph shows four popular


popular apodization
apodization or
or window functionsaas
window functions D ((Ȥ).
) They get
their names from the analysts who first publicized them.

- 657 -
5 · Description of Practical Interferometer Measurements

FIGURE 5.26(b).

3D
1.5
2 1.4
1.3

1.2
Hamming
1.1

D 1

0.9
B
kg
)
0.8
T
kg 0.7

H 0.6
kg
D Tukey
P 0.5
kg
2 Parzen
0.4
Bartlett
0.3

0.2

0.1

0.0 0

0.1

0.2 0.2
3 2.5 2 1.5 1 0.5 0 0.5 1 1.5 2 2.5 3
3 0.0
t 3.0
kg

 1/ D 1/ D

AdD((ı)) )ofofthe
This graph plots the Fourier transforms A thefour
fourapodization
apodizationoror
window functions shown in Fig. 5.26(a).

- 658 -
The Effect of a Finite Field of View · 5.17

The outer integral over dσ goes between −∞ and +∞, so as long as α_ε² is not zero there is
eventually a value of σ large enough to make cos α_ε in the expression σ cos α_ε too large to be
approximated by (5.73b) in the phase formulas in Eqs. (5.73a) and (5.73c). (The first part of
Appendix 4B of Chapter 4 explains why we must be careful when deriving approximations for
the phase.) The first step, then, in treating α_ε² as a small quantity in Eq. (5.73c) is to require that
S(σ) be zero or negligible for those values of σ large enough to invalidate (5.73b). Because S is
even so that [see Eq. (5.39a) above]

$$S(-\sigma) = S(\sigma) ,$$

it follows that S must also be zero or negligible for large negative values of σ. Glancing back at
the definition of S for positive σ in Eq. (5.36a),

$$S(\sigma) = A\,\Delta\Omega\, R(\sigma)\, \eta(\sigma)\, L(\sigma)\, \tau_f(\sigma)\, \tau_a(\sigma) ,$$

we see that the behavior of S at large σ is under the control of the interferometer designer; for
example, the fore and aft optics can be constructed so that the product τ_f(σ)τ_a(σ) is zero or
negligible for large values of σ. When σα_ε² is also multiplied by the optical-path difference χ, as
it is in the phase of

$$e^{-\pi i\sigma\chi\alpha_\varepsilon^2}$$

in Eq. (5.73c), the interferometer designer must also choose an appropriate upper limit on |χ|.
This upper limit was called D in Secs. 5.15 and 5.16 above, so in (5.73a)–(5.73c) we want D to
be chosen small enough that for all

$$|\chi| \le D \qquad (5.73d)$$

the product χσα_ε² can be treated as a small quantity.

To connect α_ε to the variable of integration ε in the double integral over d²ε, we return to
the original definition of cos α_ε in Eq. (4.135f) of Chapter 4,

$$\cos\alpha_\varepsilon = \sqrt{1 - \varepsilon^2} . \qquad (5.74a)$$

When the detector’s field of view is small, which means ε is always close to zero, we can
approximate the square root as

$$\sqrt{1 - \varepsilon^2} \cong 1 - \frac{\varepsilon^2}{2} .$$

Consequently, Eq. (5.74a) can be written as

$$\cos\alpha_\varepsilon \cong 1 - \frac{\varepsilon^2}{2} \qquad (5.74b)$$

inside the double integral over d²ε in Eq. (5.73c). Comparing Eq. (5.74b) to (5.73b), we see that
if α_ε is small (which is, of course, the same as saying the detector’s field of view is small) it
follows that

$$\alpha_\varepsilon^2 \cong \varepsilon^2 . \qquad (5.74c)$$

In the discussion at the beginning of Sec. 5.7 above, we interpreted the double integral over d²ε
in Eq. (5.25b) as a sum over all the plane waves propagating through the interferometer, with α_ε
being the angle of the ε th plane wave’s propagation vector with respect to the optical axis. The
double integral over d²ε in Eqs. (5.73a) and (5.73c) can be interpreted in the same way. The
angle α_ε is always taken to be greater than or equal to zero, so Eq. (5.74c) can be written as

$$\alpha_\varepsilon \cong \varepsilon . \qquad (5.74d)$$

Hence, when the propagation angle is small, ε can be thought of as the angle in radians with
respect to the optical axis at which the ε th plane wave is propagating through the interferometer.
The discussion in Sec. 5.7 shows that interferometer setups that have a circular detector centered
on the optical axis, such as the standard Michelson interferometer shown in Fig. 5.18, have
propagation angles α ε = ε that are at a maximum ε max when the plane waves are focused onto
the detector’s edge. The interior points of the detector absorb the focused energy of plane waves
passing through the interferometer at propagation angles ε < ε max ; in fact, all plane waves with
the same propagation angle ε end up focused onto a circle surrounding the detector’s center,
with the radius of the circle proportional to ε as shown in Fig. 5.27.
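
As a quick numerical check of Eqs. (5.74b) and (5.74d), the following sketch compares the exact cos α_ε = √(1 − ε²) with its quadratic approximation, and the exact angle α_ε = arcsin ε with ε itself. The function name and the sample values of ε are illustrative assumptions, not taken from the text.

```python
import math

def small_angle_errors(eps):
    """Return (error of the cosine approximation, error of the angle approximation)."""
    cos_exact = math.sqrt(1.0 - eps**2)      # Eq. (5.74a)
    cos_approx = 1.0 - eps**2 / 2.0          # Eq. (5.74b)
    alpha_exact = math.asin(eps)             # the angle whose cosine is sqrt(1 - eps^2)
    return abs(cos_exact - cos_approx), abs(alpha_exact - eps)

for eps in (0.01, 0.05, 0.1):                # hypothetical field-of-view half-angles, in radians
    cos_err, angle_err = small_angle_errors(eps)
    print(f"eps = {eps:4.2f}  cos error = {cos_err:.2e}  angle error = {angle_err:.2e}")
```

The cosine error falls off roughly as ε⁴ and the angle error roughly as ε³, which is why both approximations are comfortable for the narrow fields of view considered here.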
The double integral over the field of view has this same sort of circular symmetry.
Substituting (5.74c) into (5.73c) gives

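
\[
z(\chi) = \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}
\left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon \;
e^{-\pi i\sigma\chi\varepsilon^2}\, H\!\left(u\sigma - \frac{u\sigma\varepsilon^2}{2}\right) \right\} d\sigma .
\tag{5.75a}
\]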

We see that the quantity inside the double integral over d²ε,

\[ e^{-\pi i\sigma\chi\varepsilon^2}\, H\!\left(u\sigma - \frac{u\sigma\varepsilon^2}{2}\right) , \]

depends only on ε². This means the double integral can be thought of as an integral over all the
infinitesimal area patches d²ε = dε_x dε_y of a quantity that only depends on ε² = ε_x² + ε_y²,
the square of the distance of any point in this area integral from the origin where ε = 0.
Consequently the area integral over d²ε has circular symmetry and can be treated as a
one-dimensional integral over a collection of rings with radii between 0 and ε_max,

\[ \iint_{\text{field of view}} d^2\varepsilon \;\to\; 2\pi \int_0^{\varepsilon_{\max}} \varepsilon\, d\varepsilon . \]

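Equation (5.75a) can now be written as

\[
\begin{aligned}
z(\chi) &= \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}
\left\{ \frac{2\pi}{\Delta\Omega} \int_0^{\varepsilon_{\max}} \varepsilon
\left[ e^{-\pi i\sigma\chi\varepsilon^2}\, H\!\left(u\sigma - \frac{u\sigma\varepsilon^2}{2}\right) \right] d\varepsilon \right\} d\sigma \\
&\cong \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, H(u\sigma)\, e^{2\pi i\sigma\chi}
\left\{ \frac{2\pi}{\Delta\Omega} \int_0^{\varepsilon_{\max}} \varepsilon\, e^{-\pi i\sigma\chi\varepsilon^2}\, d\varepsilon \right\} d\sigma ,
\end{aligned}
\tag{5.75b}
\]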

where in the last step we have assumed that the transfer function H(f) is such a slowly varying
function of f that we can disregard the effect of subtracting the small quantity (uσε²)/2 from the
argument uσ. For future use, we note that the circular symmetry of the detector’s field of view
lets us write Eq. (5.35d) as

\[
\Delta\Omega = \iint_{\text{field of view}} d^2\varepsilon
= 2\pi \int_0^{\varepsilon_{\max}} \varepsilon\, d\varepsilon = \pi\varepsilon_{\max}^2 .
\tag{5.75c}
\]

FIGURE 5.27. All plane waves with the same off-axis propagation angle end up focused onto the
same circle surrounding the center of the detector.

Hence ΔΩ is given by the formula for the area of a circle of radius ε_max. In Eq. (5.75b) the term
inside the braces { } can be simplified to

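
\[
\frac{2\pi}{\Delta\Omega} \int_0^{\varepsilon_{\max}} \varepsilon\, e^{-\pi i\sigma\chi\varepsilon^2}\, d\varepsilon
= \frac{-1}{i\sigma\chi\,\Delta\Omega} \left[ e^{-\pi i\sigma\chi\varepsilon^2} \right]_0^{\varepsilon_{\max}}
= -\frac{e^{-(1/2)\pi i\sigma\chi\varepsilon_{\max}^2}}{i\sigma\chi\,\Delta\Omega}
\left( e^{-(1/2)\pi i\sigma\chi\varepsilon_{\max}^2} - e^{(1/2)\pi i\sigma\chi\varepsilon_{\max}^2} \right) .
\]

Equation (5.75c) can be written as ε_max² = ΔΩ/π, and with this substitution the integral becomes

\[
\frac{2\pi}{\Delta\Omega} \int_0^{\varepsilon_{\max}} \varepsilon\, e^{-\pi i\sigma\chi\varepsilon^2}\, d\varepsilon
= e^{-(i/2)\sigma\chi\Delta\Omega}
\left[ \frac{\sin\!\left(\dfrac{\sigma\chi\,\Delta\Omega}{2}\right)}{\left(\dfrac{\sigma\chi\,\Delta\Omega}{2}\right)} \right] .
\]

Following the definition of

\[ \operatorname{sinc}(x) = \frac{\sin(x)}{x} \]

given in Eq. (2.106d) of Chapter 2, we write

\[
\frac{2\pi}{\Delta\Omega} \int_0^{\varepsilon_{\max}} \varepsilon\, e^{-\pi i\sigma\chi\varepsilon^2}\, d\varepsilon
= e^{-(i/2)\sigma\chi\Delta\Omega}\, \operatorname{sinc}\!\left(\frac{\sigma\chi\,\Delta\Omega}{2}\right) .
\tag{5.75d}
\]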
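
The closed form in Eq. (5.75d) can be checked numerically. The sketch below evaluates the ring integral with a simple midpoint rule and compares it with e^{−ix} sinc(x), where x = σχΔΩ/2. The function names, the step count, and the sample values of σ, χ, and ΔΩ are assumptions made for illustration only.

```python
import cmath
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, matching the definition quoted from Eq. (2.106d)."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def ring_integral(sigma, chi, solid_angle, steps=20000):
    """Midpoint-rule estimate of (2*pi/dOmega) * int_0^eps_max eps*exp(-i*pi*sigma*chi*eps^2) d(eps)."""
    eps_max = math.sqrt(solid_angle / math.pi)      # from Eq. (5.75c)
    h = eps_max / steps
    total = 0j
    for k in range(steps):
        eps = (k + 0.5) * h                         # midpoint of the k-th ring
        total += eps * cmath.exp(-1j * math.pi * sigma * chi * eps**2) * h
    return 2.0 * math.pi / solid_angle * total

def closed_form(sigma, chi, solid_angle):
    """Right-hand side of Eq. (5.75d)."""
    x = sigma * chi * solid_angle / 2.0
    return cmath.exp(-1j * x) * sinc(x)

# Illustrative values: sigma in cm^-1, chi in cm, solid angle in steradians.
sigma, chi, solid_angle = 1000.0, 0.5, 0.004
print(ring_integral(sigma, chi, solid_angle))
print(closed_form(sigma, chi, solid_angle))
```

The two printed complex numbers should agree to many digits; both the phase factor and the sinc amplitude depend only on the product σχΔΩ.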

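This can be substituted back into (5.75b) to get

\[
z(\chi) = \frac{W}{4} \int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, H(u\sigma)\,
\operatorname{sinc}\!\left(\frac{\sigma\chi\,\Delta\Omega}{2}\right)
e^{2\pi i\chi\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)}\, d\sigma .
\tag{5.75e}
\]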

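According to the discussion in Sec. 5.11, we can associate an effective spectrum with the
formula in Eq. (5.75e),

\[
\begin{aligned}
\mathbf{Z}_{\mathrm{eff}}(\sigma) &= \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \\
&= \int_{-\infty}^{\infty} d\chi\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\sigma'
\left[ \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\,
\operatorname{sinc}\!\left(\frac{\sigma'\chi\,\Delta\Omega}{2}\right) \right]
e^{2\pi i\chi\sigma'\left(1-\frac{\Delta\Omega}{4\pi}\right)} .
\end{aligned}
\tag{5.76a}
\]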

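From Eqs. (5B.8a) and (5B.8b) in Appendix 5B at the end of this chapter, it follows that

\[
\mathbf{Z}_{\mathrm{eff}}(\sigma) \cong \frac{1}{\Delta\sigma}
\int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}}
\left[ \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] d\sigma' ,
\tag{5.76b}
\]

where

\[ \Delta\sigma = \frac{\Delta\Omega\,|\sigma|}{2\pi} . \tag{5.76c} \]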

Formula (5.76b) is good for fields of view small enough that cos α_ε can be approximated
quadratically as

\[ \cos\alpha_\varepsilon \cong 1 - \frac{\alpha_\varepsilon^2}{2} , \]

but not so small that cos α_ε can be approximated as one. In (5.76b) the term inside the square
brackets [ ] is, as pointed out in Appendix 5B, averaged over a wavenumber interval that is
centered on

\[ \sigma\cdot\left(1 + \frac{\Delta\Omega}{4\pi}\right) \]

and that has width

\[ \Delta\sigma = \frac{\Delta\Omega\,|\sigma|}{2\pi} . \]

Equations (5.47c) and (5.48a) show that when cos α_ε can be approximated by one and there is no
background radiance, the effective signal spectrum can be written as

\[ \mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{W}{4}\, S(\sigma)\, M(R\sigma\theta_{ma})\, H(u\sigma) . \tag{5.76d} \]


This expression is the same as the term inside the square brackets in (5.76b). We conclude that in
Eq. (5.76b) the term inside the integral is just the effective signal spectrum of the narrow
field-of-view case where cos α_ε can be approximated by one. Consequently, the effect of increasing the
field of view beyond the point where cos α_ε can be approximated by one is to blur the effective
signal spectrum by averaging it over a wavenumber region of width

\[ \Delta\sigma = \frac{\Delta\Omega\,|\sigma|}{2\pi} \]

centered on wavenumber σ·(1 + ΔΩ/4π) instead of σ. Therefore, another effect of the increased
field of view is to scale the wavenumber axis of the effective signal spectrum by a factor of
(1 + ΔΩ/4π). In other words, the spectral details at σ = σ₀ are blurred over a region Δσ in width
around σ₀ and then, in the spectral measurements, show up at wavenumber σ₀·(1 + ΔΩ/4π)⁻¹ instead
of at wavenumber σ₀. When the field of view ΔΩ is known, we can always rescale the
wavenumber axis to put the spectral details in their correct locations, but the blurring degrades
the spectral resolution in a way that cannot be fixed.
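
To get a feel for the sizes involved, the following sketch computes where a spectral feature appears and how wide the averaging interval is for a given field of view, using the center σ·(1 + ΔΩ/4π) and width ΔΩ|σ|/2π from Eqs. (5.76b) and (5.76c). The function names and the numerical values are illustrative assumptions; the blur width is evaluated at the true wavenumber, which differs negligibly from evaluating it at the measured one when ΔΩ/4π is small.

```python
import math

def fov_shift_and_blur(sigma0, solid_angle):
    """Return (apparent wavenumber, blur width) for a feature whose true wavenumber is sigma0.

    sigma0      true wavenumber, for example in cm^-1
    solid_angle field of view dOmega in steradians
    """
    scale = 1.0 + solid_angle / (4.0 * math.pi)
    apparent = sigma0 / scale                             # feature shows up slightly low
    blur = solid_angle * abs(sigma0) / (2.0 * math.pi)    # width of the averaging interval
    return apparent, blur

def rescale_axis(measured_sigma, solid_angle):
    """Undo the axis scaling when dOmega is known; the blurring itself is not recoverable."""
    return measured_sigma * (1.0 + solid_angle / (4.0 * math.pi))

apparent, blur = fov_shift_and_blur(1010.0, 0.004)        # hypothetical edge and field of view
print(apparent, blur, rescale_axis(apparent, 0.004))
```

Rescaling returns the feature to its true wavenumber, but nothing in the calculation can narrow the averaging interval again.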
We specify a new variable of integration

\[ \sigma'' = \sigma \cdot g \tag{5.77a} \]

with

\[ g = \left(1 - \frac{\Delta\Omega}{4\pi}\right) \tag{5.77b} \]

and use it to write Eq. (5.75e) as

\[
z(\chi) = \int_{-\infty}^{\infty}
\left[ \frac{W}{4}\, g^{-1}\, S(g^{-1}\sigma'')\, M(g^{-1}\sigma'' R\theta_{ma})\, H(g^{-1}\sigma'' u) \right]
\operatorname{sinc}\!\left(\frac{g^{-1}\sigma''\chi\,\Delta\Omega}{2}\right) e^{2\pi i\sigma''\chi}\, d\sigma'' .
\tag{5.77c}
\]

The term inside the square brackets is just another version of the effective spectrum in (5.76d),
but now it is multiplied by the factor

\[
\operatorname{sinc}\!\left(\frac{g^{-1}\sigma''\chi\,\Delta\Omega}{2}\right)
= \operatorname{sinc}\!\left(\left(1-\frac{\Delta\Omega}{4\pi}\right)^{-1} \frac{\sigma''\chi\,\Delta\Omega}{2}\right) .
\]

This sinc factor artificially decreases the size of the effective spectrum, forcing it to contribute
too little to the integration over dσ″ so that the signal z(χ) is smaller than it would otherwise be
at large values of the optical-path difference χ. This effect is sometimes called the
“self-apodization” of the interferogram signal. To avoid having significant amounts of self-apodization
in the measured spectrum, we should keep the optical-path difference χ from becoming so large
that the sinc factor becomes small or even negative. Following the notation of Sec. 5.15, there
must be a length D with

\[ |\chi| \le D \]

such that

\[ \operatorname{sinc}\!\left(\left(1-\frac{\Delta\Omega}{4\pi}\right)^{-1} \frac{\sigma'' D\,\Delta\Omega}{2}\right) \]

stays reasonably close to one. In any well-designed interferometer, the wavenumbers to which the
detector is sensitive lie within a specified wavenumber range,

\[ \sigma_{\min} \le |\sigma| \le \sigma_{\max} , \tag{5.78} \]

as is discussed following Eq. (4.66b) in Chapter 4. Consequently, the traditional rule of thumb is
to require the sinc factor to be greater than 2/3 for the maximum possible value of its argument,

\[
\operatorname{sinc}\!\left(\left(1-\frac{\Delta\Omega}{4\pi}\right)^{-1} \frac{\sigma_{\max} D\,\Delta\Omega}{2}\right) > \frac{2}{3} ,
\tag{5.79}
\]

to avoid having significant amounts of self-apodization occur. This implies that

\[ \left(1-\frac{\Delta\Omega}{4\pi}\right)^{-1} \frac{\sigma_{\max} D\,\Delta\Omega}{2} < 1.488 \]

or

\[
D < \left(1-\frac{\Delta\Omega}{4\pi}\right) \frac{(2.976)}{\sigma_{\max}\,\Delta\Omega}
\cong \frac{2.976}{\sigma_{\max}\,\Delta\Omega} ,
\tag{5.80}
\]

where in the last step we assume [see Eqs. (5B.1c) and (5B.1d) in Appendix 5B]

\[ \frac{\Delta\Omega}{4\pi} \ll 1 , \]

something that is almost always the case. As was discussed in Sec. 5.15, the size of D
controls the overall resolution of the spectral measurement, with small values of D producing
low-resolution spectral measurements and large values of D producing high-resolution spectral
measurements [see Eq. (5.67)]. What we have here, then, is the interferometric version of the
classic inverse relationship between spectral resolution and field of view that affects all
spectrometers, not just the Fourier-transform type. Inequality (5.80) states that to avoid
self-apodization, large fields of view ΔΩ should have small values of D, producing low-resolution
spectral measurements, and small fields of view ΔΩ can have large values of D, producing
high-resolution spectral measurements. If inequality (5.80) is ignored, then self-apodization occurs and
resolution is lost from the blurring effect of the integral in Eq. (5.76b) above.
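
The rule of thumb in Eq. (5.79) can be turned into a small calculation. The sketch below finds, by bisection, the argument at which sinc(x) = sin(x)/x falls to a chosen threshold and converts it into a maximum optical-path difference D. The function names, the bisection approach, and the sample values are assumptions for illustration. With a threshold of exactly 2/3 the root comes out near 1.5, the same size as the constant 1.488 used in Eq. (5.80).

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def sinc_argument_limit(threshold=2.0 / 3.0):
    """Bisection for the x in (0, pi) at which sinc(x) drops to the threshold."""
    lo, hi = 1e-9, math.pi            # sinc decreases monotonically from 1 to 0 on this interval
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if sinc(mid) > threshold:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def max_path_difference(sigma_max, solid_angle, threshold=2.0 / 3.0):
    """Largest D satisfying the sinc rule of thumb, keeping the (1 - dOmega/4pi) factor."""
    x_limit = sinc_argument_limit(threshold)
    g = 1.0 - solid_angle / (4.0 * math.pi)
    return g * 2.0 * x_limit / (sigma_max * solid_angle)

# Hypothetical instrument: sigma_max = 2000 cm^-1 and a 0.004 sr field of view.
print(sinc_argument_limit(), max_path_difference(2000.0, 0.004))
```

Halving the field of view doubles the allowed D, which is the inverse relationship between field of view and resolution described above.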

5.18 Single-Sided Interferograms


It is easy to show that when the interferogram signal z is even, we can double the spectral
resolving power of a standard Michelson interferometer by shifting the fixed mirror so that the
moving mirror’s ZPD position occurs at the beginning (rather than the center) of the moving
mirror’s range of motion (see Fig. 5.28). Before the fixed mirror is shifted, we have the standard
setup for a double-sided interferogram, with χ varying between +D and −D as the moving mirror
moves from the beginning to the end of its path. After the fixed mirror is shifted, running the
moving mirror over the same physical positions as before gives z at all the optical-path
differences between zero and 2D instead of between −D and D. By assumption, z is even, so we
can then use

\[ z(-\chi) = z(\chi) \]

to get z between −2D and zero. Consequently, we end up with the same knowledge of the
interferogram signal that we would get from measuring a double-sided interferogram between
−2D and 2D. Putting the moving mirror’s ZPD location at the beginning of its range of motion
therefore doubles the effective length of the interferogram signal. According to Eq. (5.67), the
resolving power of a double-sided interferogram is directly proportional to the interferogram
signal’s length, so, when the interferogram signal is even, putting the moving mirror’s ZPD
location at the beginning of its range of motion doubles the spectral resolving power.
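
The mirroring step can be sketched in a few lines. Assuming the single-sided samples start exactly at ZPD and are equally spaced, the even symmetry z(−χ) = z(χ) supplies the negative-χ half. The function name and the sample values are hypothetical.

```python
def extend_even(samples):
    """Mirror single-sided samples z(0), z(h), ..., z(2D) into a double-sided list.

    samples[0] is the value at zero path difference (ZPD). The result runs from
    chi = -2D to chi = +2D and contains the ZPD sample exactly once.
    """
    samples = list(samples)
    mirrored = samples[:0:-1]        # z(2D), ..., z(h), reused at negative chi
    return mirrored + samples

one_sided = [1.0, 0.6, -0.2, 0.1]    # made-up values of z at chi = 0, h, 2h, 3h
print(extend_even(one_sided))        # [0.1, -0.2, 0.6, 1.0, 0.6, -0.2, 0.1]
```

The extended record is almost twice as long as the measured one, which is where the doubling of resolving power comes from in the ideal, exactly even case.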
Shifting the position of the fixed mirror is, as a general rule, much easier than extending the
moving mirror’s range of motion, so it is unfortunate that, because z is not exactly even after
passing through the detector circuit,87 we cannot so simply double the resolving power of
already-built Michelson interferometers. If, however, the fixed mirror is shifted as shown in Fig.
5.29 so that the ZPD position is put close to, rather than exactly at, the beginning of the moving
mirror’s range of motion, we can usually symmetrize the interferogram signal, turning it into an
exactly even function of χ. This returns us to the ideal case discussed above, letting us increase
the interferometer’s spectral resolving power by increasing the effective length of the
interferogram signal. Because we do not put the ZPD exactly at the beginning of the moving
mirror’s range of motion, we cannot double the resolving power; but in almost all cases there is a
large increase, almost a doubling, in the amount of spectral detail which the interferometer can
measure.

87
See discussion at the end of Sec. 5.12.
From the work done in Sec. 5.11, we know that after passing through the detector circuit the
interferogram signal can be written as the inverse Fourier transform of an effective spectrum,

\[ z(\chi) = \int_{-\infty}^{\infty} \mathbf{Z}_{\mathrm{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.81a} \]

From entry 7 in Table 2.1 of Chapter 2, we know that, since z(χ) is real, 𝐙_eff(σ) must be
Hermitian,

\[ \mathbf{Z}_{\mathrm{eff}}(-\sigma) = \mathbf{Z}_{\mathrm{eff}}(\sigma)^{*} . \tag{5.81b} \]

We also know from the discussion following Eq. (5A.6b) in Appendix 5A that the transfer
function H(uσ) of the detector circuit must have a nonzero imaginary component. For small fields
of view where cos α_ε can be approximated by one, Eq. (5.48a) gives

\[ \mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{W}{4}\, H(u\sigma)\, S(\sigma)\, M(R\sigma\theta_{ma}) , \tag{5.82a} \]

showing that, since W = +1 or −1 and functions S(σ) and M(Rσθ_ma) are real, the effective
spectrum 𝐙_eff(σ) has a nonzero imaginary component only because H has a nonzero imaginary
component.
For larger fields of view when cos α_ε cannot be approximated by one, we can again show that
𝐙_eff(σ) has a nonzero imaginary component because H has a nonzero imaginary component.
Equations (5.76b) and (5.76c) give

\[
\mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{1}{\Delta\sigma}
\int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}}
\left[ \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] d\sigma'
\tag{5.82b}
\]

with

\[ \Delta\sigma = \frac{\Delta\Omega\,|\sigma|}{2\pi} . \tag{5.82c} \]

FIGURE 5.28. [Diagram of a Michelson interferometer showing the radiance entering the
interferometer, the ideal beam splitter, the radiance heading to the detector, the moving mirror’s
range of motion with the old and new ZPD positions, and the old and new positions of the fixed
mirror.]

FIGURE 5.29. [The same diagram with the new ZPD position placed a distance d from the
beginning of the moving mirror’s range of motion; the distances d and D are marked along the
range of motion, together with the old and new positions of the fixed mirror.]

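In a well-designed interferometer system we want M (if it is not equal to one) and H to vary
slowly as functions of σ, letting S(σ) carry the high-resolution spectral detail. In fact, we know
from Eq. (5.40g) that

\[ S(\sigma) = A\,\Delta\Omega\, R(|\sigma|)\, \eta(\sigma)\, L(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|) . \tag{5.82d} \]

This shows that the interferometer can be designed and built so that R, η, τ_a, and τ_f also vary
slowly with σ over the range of wavenumbers being measured, allowing the rapid variation with
wavenumber to come entirely from the spectral radiance L(σ). This is, in fact, how we expect R,
η, τ_a, and τ_f to behave in well-designed interferometers. Consequently, all the slowly varying
functions of σ can be brought outside the integral in Eq. (5.82b), which means we can substitute
(5.82d) into (5.82b) to get

\[
\begin{aligned}
\mathbf{Z}_{\mathrm{eff}}(\sigma)
&= \frac{W A\,\Delta\Omega}{4\,\Delta\sigma}\,
H\!\left(u\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right)
M\!\left(R\theta_{ma}\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) \\
&\quad\times
R\!\left(|\sigma|\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right)
\eta\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right)
\tau_a\!\left(|\sigma|\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right)
\tau_f\!\left(|\sigma|\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) \\
&\quad\times
\int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}}
L(|\sigma'|)\, d\sigma' \\
&\cong \frac{W A\,\Delta\Omega}{4\,\Delta\sigma}\,
H(u\sigma)\, M(R\theta_{ma}\sigma)\, R(|\sigma|)\, \eta(\sigma)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)
\int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}}
L(|\sigma'|)\, d\sigma' .
\end{aligned}
\tag{5.83a}
\]

In the last step, we assume both that

\[ \frac{\Delta\Omega}{4\pi} \ll 1 \]

and that H, M, R, η, τ_a, and τ_f vary slowly enough for us to write

\[
\begin{aligned}
H\!\left(u\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong H(u\sigma) , &
M\!\left(R\theta_{ma}\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong M(R\theta_{ma}\sigma) , \\
R\!\left(|\sigma|\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong R(|\sigma|) , &
\eta\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong \eta(\sigma) , \\
\tau_a\!\left(|\sigma|\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong \tau_a(|\sigma|) , &
\tau_f\!\left(|\sigma|\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong \tau_f(|\sigma|) .
\end{aligned}
\tag{5.83b}
\]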

It is also worth noting that the integral

\[
\int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}}
L(|\sigma'|)\, d\sigma'
\]

must be an even function of σ because L(|σ′|) is an even function of σ′. In Eqs. (5.82b) and
(5.83a) everything but H is real. Therefore, both in Eq. (5.82a), which applies to very small
fields of view when cos α_ε can be approximated as one, and in Eq. (5.83a), which applies to
slightly larger fields of view when cos α_ε cannot be approximated as one, we see that 𝐙_eff(σ)
has a nonzero imaginary component only because H(uσ) has a nonzero imaginary component.
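We can underline the fundamental similarity of Eqs. (5.82a) and (5.83a) by combining them
into a single formula. Substitution of (5.82d) into (5.82a) gives

\[
\mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\,
R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L(|\sigma|) ,
\tag{5.83c}
\]

and this last result can be combined with (5.83a) by writing

\[
\mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\,
R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\mathrm{FOV}}(|\sigma|) ,
\tag{5.83d}
\]

where we define that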


\[
L_{\mathrm{FOV}}(|\sigma|) =
\begin{cases}
L(|\sigma|) & \text{for small } \Delta\Omega \text{ where } \cos\alpha_\varepsilon \text{ can be approximated as one,} \\[2ex]
\dfrac{1}{\Delta\sigma} \displaystyle
\int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}}
L(|\sigma'|)\, d\sigma'
& \text{for slightly larger } \Delta\Omega \text{ where } \cos\alpha_\varepsilon \text{ cannot be approximated as one.}
\end{cases}
\tag{5.83e}
\]

Absolute value signs are put around the argument of L_FOV in (5.83d) and (5.83e) in part to remind
us, as pointed out in the discussion following (5.83b), that the integral

\[
\int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}}
L(|\sigma'|)\, d\sigma'
\]

must be an even function of σ. Figures 5.30(a) and 5.30(b) show how the original spectrum L(σ)
is shifted and blurred by an interferometer’s finite field of view. In Fig. 5.30(b) the compression
of the wavenumber axis can be removed by stretching the axis so that spectral edge E is returned
to its proper position, but nothing can recover the detail lost in the spectral blurring.
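The next step in setting up a single-sided interferogram measurement is to write Eq. (5.83d) as

\[ \mathbf{Z}_{\mathrm{eff}}(\sigma) = Z_{\mathrm{eff}}(\sigma)\, e^{i\psi(\sigma)} \tag{5.84a} \]

for real functions Z_eff(σ) and ψ(σ). Here Z_eff is the magnitude of 𝐙_eff,

\[ Z_{\mathrm{eff}}(\sigma) = \left| \mathbf{Z}_{\mathrm{eff}}(\sigma) \right| , \tag{5.84b} \]

and ψ is the argument of 𝐙_eff,

\[ \psi(\sigma) = \arg[\mathbf{Z}_{\mathrm{eff}}(\sigma)] . \tag{5.84c} \]

Applying these definitions to the right-hand side of (5.83d) gives

\[
Z_{\mathrm{eff}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, \left| H(u\sigma) \right| M(R\sigma\theta_{ma})\,
R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\mathrm{FOV}}(|\sigma|)
\tag{5.85a}
\]

and

\[ \psi(\sigma) = \arg[H(u\sigma)] . \tag{5.85b} \]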


Equation (5.85b) shows that

\[ \psi(\sigma) = \text{slowly varying function of } \sigma \tag{5.85c} \]

because H(uσ) is a slowly varying function of the wavenumber. Substituting (5.84a) into (5.81b)
gives, since both Z_eff and ψ are real, that

\[ Z_{\mathrm{eff}}(-\sigma)\, e^{i\psi(-\sigma)} = Z_{\mathrm{eff}}(\sigma)\, e^{-i\psi(\sigma)} . \]

Taking the magnitude of both sides, we get

\[ Z_{\mathrm{eff}}(-\sigma) = Z_{\mathrm{eff}}(\sigma) , \tag{5.86a} \]

making Z_eff an even function of σ. Now we can write

\[
Z_{\mathrm{eff}}(\sigma)\, e^{i\psi(-\sigma)} = Z_{\mathrm{eff}}(\sigma)\, e^{-i\psi(\sigma)}
\quad\text{or}\quad e^{i\psi(-\sigma)} = e^{-i\psi(\sigma)} .
\]

Taking the complex logarithm of both sides shows that ψ must be an odd function of σ,

\[ \psi(-\sigma) = -\psi(\sigma) . \tag{5.86b} \]

Equation (5.85b) suggests that we automatically know ψ(σ) because, having designed and built
the detector circuit, we know its transfer function H. In practice, however, it is often difficult to
know H with sufficient accuracy to get good measurements of L_FOV. It turns out that all we need
to make single-sided interferograms practical is to know that ψ is a slowly varying function of the
wavenumber, because then it is easy to measure ψ as a function of σ. The key point to take away
from Eq. (5.85b), then, is that if the transfer function is designed to be a slowly varying function
of wavenumber, then we have good reason to expect ψ to be a slowly varying function of
wavenumber.88
88
This point is more important than it looks. There are interferometer defects not discussed here that, like the
transfer function, contribute slowly varying complex modulations to the effective spectrum. All we need for a good
single-sided interferogram measurement is to know that the total complex modulation is slowly varying, and then we
can use the procedure discussed in this section to remove all these complex modulations from the effective spectrum
at the same time.

FIGURE 5.30(a). [Plot of L(σ) between 1009 cm⁻¹ and 1010 cm⁻¹ with Spectral Edge E marked.]
This small section of the radiance spectrum entering the interferometer is
plotted here the way it actually is, undistorted by any measurement errors. The
true position of Spectral Edge E is at wavenumber 1010 cm⁻¹.

FIGURE 5.30(b). [Plot of L_FOV(σ) between 1009 cm⁻¹ and 1010 cm⁻¹ with Spectral Edge E marked.]
The same small section of the radiance spectrum plotted in Fig. 5.30(a) is shown here
with the rescaled wavenumber axis and blurring due to the interferometer’s finite field of
view. Spectral Edge E is measured at a wavenumber slightly smaller than its true
position.

The customary procedure used to measure ψ(σ) directly is to run the moving mirror in Fig.
5.29 between χ = −d and χ = 2D − d, at first confining our attention to the z(χ) signal values
between χ = −d and χ = +d. These signal values give a perfectly good double-sided interferogram
of the type described in Sec. 5.15 above, leading to a low-resolution estimate of the effective
spectrum

\[ \mathbf{Z}_{\mathrm{eff}}(\sigma) \cong \mathbf{Z}_{\mathrm{eff}}^{(\text{low res})}(\sigma) . \tag{5.87a} \]

According to Eq. (5.67), the spectral resolution of this 𝐙_eff^(low res)(σ) measurement is

\[ \Delta\sigma_{\text{low res}} = \frac{1}{2d} . \tag{5.87b} \]

This spectral resolution is not sufficient to measure L(σ), the spectral radiance of the source, but
we can easily make it good enough to capture all the spectral detail in the slowly varying function
ψ(σ). We choose d twice as large as we would for a minimally accurate representation, making
Δσ_low res half the size of the spectral interval Δσ_detail used to examine the detail in ψ(σ). This
makes

\[ \Delta\sigma_{\text{detail}} = 2\,\Delta\sigma_{\text{low res}} = \frac{1}{d} . \tag{5.87c} \]

Because ψ is a low-resolution function of wavenumber σ, Eqs. (5.84c) and (5.87a) show that

\[ \psi(\sigma) = \arg[\mathbf{Z}_{\mathrm{eff}}^{(\text{low res})}(\sigma)] . \tag{5.88a} \]

Now that ψ is known, we can define a new function ϖ(χ) such that e^{−iψ(σ)} is the Fourier
transform of ϖ(χ). According to Eq. (5.78), we are only interested in σ values that lie between
σ_min and σ_max or between −σ_max and −σ_min. This means ψ(σ) can be given any values we please outside
these two ranges, and for that matter so can any function of ψ such as e^{−iψ(σ)}. Keeping in mind,
then, that the only σ values that matter satisfy

\[ \sigma_{\min} \le |\sigma| \le \sigma_{\max} , \]

we set up the Fourier transform pair

\[ V(\sigma)\, e^{-i\psi(\sigma)} = \int_{-\infty}^{\infty} \varpi(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.88b} \]

and

\[ \varpi(\chi) = \int_{-\infty}^{\infty} \left[ V(\sigma)\, e^{-i\psi(\sigma)} \right] e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.88c} \]


In these two formulas, V(σ) is a real-valued tapering function chosen so that V(σ) → 0 slowly as
|σ| → ∞ with

\[ V(\sigma) = 1 \quad\text{for}\quad \sigma_{\min} \le |\sigma| \le \sigma_{\max} . \tag{5.88d} \]

For future use (and to keep things neat), we require V(σ) to be non-negative and even,

\[ V(\sigma) \ge 0 \]

and

\[ V(-\sigma) = V(\sigma) . \tag{5.88e} \]

Since ψ(−σ) = −ψ(σ) in Eq. (5.86b) and V(σ) is real and even in (5.88e), we note that
V(σ) e^{−iψ(σ)} is Hermitian,

\[ V(-\sigma)\, e^{-i\psi(-\sigma)} = V(\sigma)\, e^{i\psi(\sigma)} = \left[ V(\sigma)\, e^{-i\psi(\sigma)} \right]^{*} . \tag{5.88f} \]

Consequently, according to entry 7 of Table 2.1 in Chapter 2, its Fourier transform must be real:

\[ \operatorname{Im}\left( \varpi(\chi) \right) = 0 . \tag{5.88g} \]

Because ψ and V are slowly varying functions of σ, their product V(σ) e^{−iψ(σ)} is also a slowly
varying function of σ. According to the discussion following Eq. (2.37e) in Chapter 2, it follows
that ϖ(χ), the inverse Fourier transform of V(σ) e^{−iψ(σ)} in (5.88c), must be a relatively narrow
function of χ. By the end of that discussion, we realize that if Δσ_detail is the change in σ required
to produce a significant change in V(σ) e^{−iψ(σ)}, then the inverse Fourier transform ϖ(χ) must be
negligible at all values of χ with |χ| > Δσ_detail⁻¹. From Eq. (5.87c), we know

\[ \Delta\sigma_{\text{detail}} = \frac{1}{d} , \]

which means that

\[ \varpi(\chi) \text{ is negligible when } |\chi| > d . \tag{5.88h} \]

Having analyzed ϖ(χ), we now turn our attention to the entire interferogram signal recorded
between χ = −d and χ = 2D − d. When the interferogram signal in Eq. (5.81a) is convolved
with ϖ(χ), the result is

\[ z_{\mathrm{conv}}(\chi) = \varpi(\chi) * z(\chi) . \tag{5.89a} \]

From the definition of convolution in Chapter 2 [see Eq. (2.38a)], we understand that both ϖ(χ)
and z(χ) must be known for all χ between −∞ and +∞ to calculate their convolution,

\[ z_{\mathrm{conv}}(\chi) = \int_{-\infty}^{\infty} \varpi(\chi')\, z(\chi - \chi')\, d\chi' . \tag{5.89b} \]

We have just seen, however, that ϖ(χ) is a narrow function of χ, so from (5.88h) we get

\[ z_{\mathrm{conv}}(\chi) \cong \int_{-d}^{d} \varpi(\chi')\, z(\chi - \chi')\, d\chi' . \tag{5.89c} \]

Consequently, there is now no real difficulty in calculating z_conv(χ) between χ = 0 and
χ = 2D − 2d from our limited knowledge of ϖ(χ) and z(χ). [Note that the formula for the
integral in (5.89c) does not let us calculate z_conv all the way out to χ = 2D − d.] Suppose there
were some way to know z_conv(χ) for negative as well as positive values of its argument. The
Fourier transform of z_conv would then be, using the Fourier convolution theorem [see Eq. (2.39b)
of Chapter 2] and formula (5.88b),

\[
\begin{aligned}
\int_{-\infty}^{\infty} z_{\mathrm{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi
&= \int_{-\infty}^{\infty} \left[ \varpi(\chi) * z(\chi) \right] e^{-2\pi i\sigma\chi}\, d\chi \\
&= V(\sigma)\, e^{-i\psi(\sigma)} \cdot \left[ \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \right] .
\end{aligned}
\tag{5.90a}
\]

Reversing the Fourier transform in (5.81a) gives

\[ \mathbf{Z}_{\mathrm{eff}}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi , \tag{5.90b} \]

which, when substituted into (5.90a), would lead to

\[
\int_{-\infty}^{\infty} z_{\mathrm{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi
= V(\sigma)\, e^{-i\psi(\sigma)} \cdot \mathbf{Z}_{\mathrm{eff}}(\sigma) .
\]

In well-designed interferometers, the [τ_a(|σ|) · R(|σ|)] product in Eq. (5.83d) goes to zero when
|σ| ≥ σ_max or |σ| ≤ σ_min. Consequently, 𝐙_eff is zero for those σ values where V(σ) is, according to
(5.88d), not necessarily equal to one. Hence, this latest result can be written as

\[
\int_{-\infty}^{\infty} z_{\mathrm{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi
= e^{-i\psi(\sigma)} \cdot \mathbf{Z}_{\mathrm{eff}}(\sigma) .
\tag{5.90c}
\]

Consulting Eqs. (5.84a) and (5.84b), we see that (5.90c) could also be written as

\[ \int_{-\infty}^{\infty} z_{\mathrm{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = Z_{\mathrm{eff}}(\sigma) . \tag{5.90d} \]

We already know that Z_eff(σ) is real, and from (5.86a) we see that Z_eff(σ) is even. Reversing
the Fourier transform in (5.90d) now gives

\[ z_{\mathrm{conv}}(\chi) = \int_{-\infty}^{\infty} Z_{\mathrm{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.91a} \]

According to entry 1 of Table 2.1 in Chapter 2, the inverse Fourier transform of a real and even
function is another real and even function, which means that z_conv is an even function of χ,

\[ z_{\mathrm{conv}}(-\chi) = z_{\mathrm{conv}}(\chi) . \tag{5.91b} \]

This is the result we need. In the discussion following Eq. (5.89c), we supposed that z_conv was
known for negative as well as positive values of its argument because we wanted to take its
Fourier transform. It now turns out, however, that when z_conv is known between χ = 0 and
χ = 2D − 2d, it is also known between χ = 0 and χ = −(2D − 2d) because it must be an even
function. This means that measuring z(χ) between χ = −d and χ = 2D − d, as shown in Fig.
5.29, gives enough information to calculate z_conv(χ) for

\[ -2(D-d) \le \chi \le 2(D-d) . \tag{5.91c} \]

Applying the double-sided approximation for the Fourier transform discussed in Sec. 5.15 to
formula (5.90d), we can now treat z_conv(χ) as a double-sided interferogram signal to get

\[ Z_{\mathrm{eff}}(\sigma) \cong \int_{-2(D-d)}^{2(D-d)} z_{\mathrm{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \tag{5.91d} \]
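
The symmetrizing effect of the convolution can be illustrated with a small discrete model. The sketch below is a periodic (circular) analogue of Eqs. (5.81a), (5.88c), (5.89b), and (5.91b), assuming a made-up even magnitude spectrum, a made-up slowly varying odd phase, and the full kernel rather than one truncated at |χ| ≤ d. It uses a direct DFT with a small, arbitrarily chosen N so that only the standard library is needed.

```python
import cmath
import math

N = 64                                   # number of samples (assumed; small enough for a direct DFT)
ks = list(range(-N // 2, N // 2))        # integer wavenumber index; k = -N/2 is the Nyquist bin

def magnitude(k):
    """Made-up real, even spectrum magnitude that is zero at the Nyquist bin."""
    return math.exp(-((abs(k) - 10) / 4.0) ** 2) if abs(k) < N // 2 else 0.0

def psi(k):
    """Made-up slowly varying, odd phase standing in for arg[H]."""
    return 0.08 * k + 2.0e-5 * k**3

def taper(k):
    """Stand-in for V: one inside the band, zero at the Nyquist bin."""
    return 1.0 if abs(k) < N // 2 else 0.0

# Interferogram: inverse DFT of the Hermitian spectrum magnitude * exp(i*psi).
z = [sum(magnitude(k) * cmath.exp(1j * (psi(k) + 2 * math.pi * k * n / N)) for k in ks).real
     for n in range(N)]

# Kernel: inverse DFT of taper * exp(-i*psi); it is real because that product is Hermitian.
w = [sum(taper(k) * cmath.exp(1j * (-psi(k) + 2 * math.pi * k * n / N)) for k in ks).real / N
     for n in range(N)]

# Circular convolution of the kernel with the interferogram.
z_conv = [sum(w[m] * z[(n - m) % N] for m in range(N)) for n in range(N)]

asym_before = max(abs(z[n] - z[-n % N]) for n in range(N))
asym_after = max(abs(z_conv[n] - z_conv[-n % N]) for n in range(N))
print(asym_before, asym_after)
```

The raw interferogram is visibly asymmetric about ZPD, while the convolved one is even to within rounding error, which is the property that lets the unmeasured negative-χ side be filled in by reflection.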


Equation (5.91d) justifies the use of single-sided interferograms. In a conventional
double-sided interferogram, we measure the signal z(χ) leaving the detector circuit between
χ = −D and χ = +D and use Eq. (5.63a) to write the approximation

\[ \mathbf{Z}_{\mathrm{eff}}(\sigma) \cong \int_{-D}^{D} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.92a} \]

for the effective spectrum 𝐙_eff(σ). According to Eq. (5.83d), the correct formula for the effective
spectrum is

\[
\mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\,
R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\mathrm{FOV}}(|\sigma|) .
\tag{5.92b}
\]

Now compare this to the single-sided situation. Following the procedure outlined above, we
measure signal z(χ) between χ = −d and χ = 2D − d. This data lets us calculate z_conv(χ) between
χ = 0 and χ = 2(D − d). Because z_conv(χ) is even, we end up knowing its values between
χ = −2(D − d) and χ = +2(D − d), allowing us to make the new approximation

\[ Z_{\mathrm{eff}}(\sigma) \cong \int_{-2(D-d)}^{2(D-d)} z_{\mathrm{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi , \tag{5.92c} \]

where, according to Eq. (5.85a),

\[
Z_{\mathrm{eff}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, \left| H(u\sigma) \right| M(R\sigma\theta_{ma})\,
R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\mathrm{FOV}}(|\sigma|) .
\tag{5.92d}
\]

The only difference between the two spectral formulas in Eqs. (5.92b) and (5.92d) is that in
(5.92b) spectrum 𝐙_eff is proportional to the full complex transfer function H while in (5.92d)
spectrum Z_eff is proportional to the magnitude of H. Although the detector circuit’s transfer
function H must have a nonzero imaginary part [see discussion following Eq. (5A.6b) in
Appendix 5A], we shall see in the following section that the calibration formula used for complex
H also works when the original transfer function H is replaced by |H|. The alternate method of
removing an interferometer’s background radiance discussed in Sec. 5.14 also works as desired
when H is replaced by |H|. Consequently, we can think of the magnitude of H in (5.92d) as just
another type of transfer function and treat Z_eff like any other effective spectrum when it becomes
time to calibrate the interferometer and eliminate unwanted background radiation from our
measurements. Since the integral in Eq. (5.92c) goes between −2(D − d) and +2(D − d) rather
than between −D and +D, the discussion in Sec. 5.15 shows that we must end up with a more
highly resolved spectrum. According to Eq. (5.67), a double-sided interferogram system can
measure spectral details separated by a wavenumber interval as small as

\[ \Delta\sigma_{\text{double sided}} = \frac{1}{2D} \tag{5.93a} \]

using formulas (5.92a) and (5.92b). Therefore, when integrating between
−2(D − d) and +2(D − d) in Eq. (5.92c), we know that a single-sided interferogram system can
measure spectral details separated by a wavenumber interval as small as

\[ \Delta\sigma_{\text{single sided}} = \frac{1}{2(2D - 2d)} < \Delta\sigma_{\text{double sided}} , \tag{5.93b} \]

(the inequality holds whenever d < D/2), which gives us much more resolving power than the
equivalent double-sided system,

\[ \Delta\sigma_{\text{single sided}} \ll \Delta\sigma_{\text{double sided}} . \tag{5.93c} \]
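
The resolution formulas in Eqs. (5.87c), (5.93a), and (5.93b) are simple enough to tabulate directly. The sketch below uses hypothetical function names and illustrative lengths in centimeters; it is meant only to show how the choice of d trades phase-measurement detail against final resolution.

```python
def double_sided_resolution(D):
    """Eq. (5.93a): smallest resolvable wavenumber interval when |chi| <= D."""
    return 1.0 / (2.0 * D)

def single_sided_resolution(D, d):
    """Eq. (5.93b): same mirror travel, with the ZPD a distance d from the start of the scan."""
    if not 0.0 < d < D:
        raise ValueError("expected 0 < d < D")
    return 1.0 / (2.0 * (2.0 * D - 2.0 * d))

def phase_detail_interval(d):
    """Eq. (5.87c): wavenumber scale of the detail in psi that the short double-sided scan captures."""
    return 1.0 / d

D, d = 1.0, 0.05          # illustrative values in cm
print(double_sided_resolution(D))        # in cm^-1
print(single_sided_resolution(D, d))     # in cm^-1
print(phase_detail_interval(d))          # in cm^-1
```

Making d smaller pushes the single-sided resolution toward half the double-sided value, but it also coarsens the wavenumber scale on which the phase ψ can be measured.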

From what has been said so far, it seems that all spectral measurements ought to be made
using single-sided rather than double-sided interferograms. In practice, however, we often want
to compare one side of a double-sided interferogram signal to the other to check that no blunders
have been made in taking the measurement—and we clearly give up this possibility when using
single-sided interferograms. In addition, the expected noise amplitude of single-sided
measurements is, as a general rule, larger by 2 than the expected noise amplitude of equivalent,
equal-resolution double-sided measurements [see the discussion following Eq. (6.76e) in Chapter
6 below]. Finally, to justify our single-sided procedure, we are forced to assume that the phase
term e–iȥ(ı) is a slowly varying function of wavenumber ı and then choose parameter d large
enough to capture all the relevant spectral detail in e–iȥ(ı). The only way to confirm that this is
true is to make a high-resolution, double-sided spectral measurement, verify that e–iȥ(ı) behaves as
expected, and adjust the value of d accordingly. In this sense, then, a good single-sided
measurement depends on our having at some point performed a high-resolution, double-sided
measurement with the same instrument. Nevertheless, having the flexibility to perform single-
sided measurements can be a very attractive way to increase an interferometer’s resolving power
when a standard double-sided measurement turns up unexpected but poorly resolved spectral
detail, and for this reason many interferometer designs include it as one of their options.
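
The two resolution formulas can be compared numerically. The following is a minimal sketch, not part of the original text: the function names and the values chosen for D and d are hypothetical, and d is simply taken to be the parameter d of Eq. (5.92c).

```python
def double_sided_resolution(D):
    """Eq. (5.93a): smallest resolvable wavenumber interval, 1/(2D)."""
    return 1.0 / (2.0 * D)


def single_sided_resolution(D, d):
    """Eq. (5.93b): 1/(2(2D - 2d)), with d the parameter d of Eq. (5.92c)."""
    if not 0.0 <= d < D:
        raise ValueError("this sketch assumes 0 <= d < D")
    return 1.0 / (2.0 * (2.0 * D - 2.0 * d))


# Hypothetical lengths in cm, chosen only for illustration.
D_CM = 1.0
d_CM = 0.1

ds_resolution = double_sided_resolution(D_CM)
ss_resolution = single_sided_resolution(D_CM, d_CM)
print(f"double-sided: {ds_resolution:.4f} cm^-1")
print(f"single-sided: {ss_resolution:.4f} cm^-1")
```

Note that, as a matter of algebra, the interval in (5.93b) is smaller than the one in (5.93a) only when d < D/2, so the advantage shrinks as d grows.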

5.19 Calibration
The uncalibrated spectrum of a standard Michelson interferometer can be treated the same way as
the output spectrum of any other type of uncalibrated spectrometer would be treated. Consider,
for example, Eq. (5.60b) for the total interferogram signal ztot when the interferogram’s field of


view is small enough that cos α ε can be approximated as one,


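$$z_{\text{tot}}(\chi) = \frac{W}{4}\int_{-\infty}^{\infty}\left[ S(\sigma) + S^{(\text{fore})}(\sigma) - S^{(\text{back})}(\sigma) \right] H(\sigma u)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.94a}$$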

In this section, we can regard function M as a constant and steady misalignment, unchanging
during calibration and subsequent spectral measurements—or we can think of the instrument as
being so well-aligned that M ≅ 1 . Assuming that ztot in (5.94a) is analyzed using a double-sided
interferogram with D large enough that there is no significant ringing or loss of spectral detail
from the sinc convolution in Eq. (5.66b), we can treat the Fourier-transform of ztot, which we call
Z eff ,tot (σ ) , as the uncalibrated spectrum of the Michelson interferometer. Reversing the Fourier
transform in (5.94a) then gives


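$$\begin{aligned}
Z_{\text{eff,tot}}(\sigma) &= \int_{-\infty}^{\infty} z_{\text{tot}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \\
&= \frac{W}{4}\left[ S(\sigma) + S^{(\text{fore})}(\sigma) - S^{(\text{back})}(\sigma) \right] H(\sigma u)\, M(R\sigma\theta_{ma}) .
\end{aligned} \tag{5.94b}$$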

Equation (5.40g), which specifies that

$$S(\sigma) = A\,\Delta\Omega\, R(|\sigma|)\,\eta(\sigma)\, L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) , \tag{5.94c}$$

can now be used to write (5.94b) as

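$$Z_{\text{eff,tot}}(\sigma) = \frac{W}{4}\left[ A\,\Delta\Omega\, R(|\sigma|)\,\eta(\sigma)\, L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) + S^{(\text{fore})}(\sigma) - S^{(\text{back})}(\sigma) \right] H(\sigma u)\, M(R\sigma\theta_{ma}) \tag{5.94d}$$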

for the ideal case where cos α ε can be approximated by one and D is large enough that there is
no significant loss of detail from the sinc convolution described in Sec. 5.15 above.
What can be done with the more realistic case where there is significant loss of detail from the
sinc convolution and cos α ε can no longer be approximated as one because the field of view is
relatively large? Glancing back at the analysis used in Sec. 5.18 to go from Eq. (5.82b) to
(5.83d)—and in particular paying close attention to the approximations listed in (5.83b)—we
note that in a well-designed interferometer R, η, τ_a, τ_f, H, and M all vary slowly with σ
compared to L(σ). In fact, compared to L(σ) they can be regarded as quasi-constants, especially
over the range of wavenumbers

$$\sigma_{\min} \le \sigma \le \sigma_{\max}$$

over which L is being measured. In Eq. (5.83d) we account for the effect of a small but finite
field of view blurring and distorting the measurement of L by replacing L(σ) with L_FOV(σ). This
is very similar to the situation examined in Sec. 5.15 above, where we represented the distorting
effect of the sinc convolution on the measured spectrum by replacing L(σ) with L_blur(σ). To
combine the blurring and distorting effects of both the sinc convolution and the finite field of
view, we replace L(σ) by L_eff(σ) in Eq. (5.94c) to get

$$S(\sigma) = A\,\Delta\Omega\, R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\, L_{\text{eff}}(|\sigma|) , \tag{5.94e}$$

where we have added absolute value signs to the argument of L_eff to keep S(σ) well-defined for
both positive and negative σ values and to show that it is still an even function, having the same
value at σ and −σ. Applying this to Eq. (5.94d), we say that

$$Z_{\text{eff,tot}}(\sigma) = \frac{W}{4}\left[ A\,\Delta\Omega\, R(|\sigma|)\,\eta(\sigma)\, L_{\text{eff}}(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) + S_{\text{eff}}^{(\text{fore})}(\sigma) - S_{\text{eff}}^{(\text{back})}(\sigma) \right] H(\sigma u)\, M(R\sigma\theta_{ma}) \tag{5.94f}$$

with L, S^(fore), and S^(back) replaced by L_eff, S_eff^(fore), and S_eff^(back) respectively to show that the finite
field of view and sinc convolution have somewhat blurred and distorted the original functions.
We can, in fact, regard L_eff(|σ|) as the best measurement of L(|σ|) that the interferometer system
can be expected to produce. Hence, for relatively small fields of view in situations where the sinc
convolution introduces only a negligible distortion,

$$L_{\text{eff}}(|\sigma|) \cong L(|\sigma|) ,$$

and for situations where the finite field of view and sinc convolution must be taken into account,
L_eff(|σ|) is what L(|σ|) is measured as when subjected to these two unavoidable effects.
To calibrate any type of spectrometer having a linear response to the input spectrum, we need
to observe at least two known spectral radiances L^(1)(|σ|) and L^(2)(|σ|), where again we use
absolute value signs to make the radiances well-defined for negative as well as positive σ values.
For an interferometer the L^(1) and L^(2) radiances should be distinct and slowly varying functions
of wavenumber so that they undergo only negligible distortion from the sinc convolution and
finite field of view; a black-body target at two widely separated temperatures does nicely. We
suppose that Z^(1)_eff,tot(σ) and Z^(2)_eff,tot(σ) are the uncalibrated spectra measured when the
interferometer is observing the known spectral radiances L^(1)(|σ|) and L^(2)(|σ|) respectively. We
then observe a source of unknown spectral radiance L(|σ|) and calculate Z^(meas)_eff,tot(σ), the
uncalibrated spectrum associated with the z_tot signal generated by L(|σ|). For a standard
Michelson interferometer, we note that the traditional linear calibration algorithm gives,
consulting Eq. (5.94f) to get the appropriate formulas for Z^(1)_eff,tot, Z^(2)_eff,tot, and Z^(meas)_eff,tot,

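$$\begin{aligned}
&\left[ L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|) \right] \cdot \frac{Z^{(\text{meas})}_{\text{eff,tot}}(\sigma) - Z^{(1)}_{\text{eff,tot}}(\sigma)}{Z^{(2)}_{\text{eff,tot}}(\sigma) - Z^{(1)}_{\text{eff,tot}}(\sigma)} + L^{(1)}(|\sigma|) \\
&\quad = \left[ L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|) \right] \cdot \frac{\frac{W}{4}\, M(R\sigma\theta_{ma})\, H(\sigma u)\, A\,\Delta\Omega \left[ L_{\text{eff}}(|\sigma|) - L^{(1)}(|\sigma|) \right] R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)}{\frac{W}{4}\, M(R\sigma\theta_{ma})\, H(\sigma u)\, A\,\Delta\Omega \left[ L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|) \right] R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)} + L^{(1)}(|\sigma|) \\
&\quad = L_{\text{eff}}(|\sigma|) .
\end{aligned} \tag{5.95a}$$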

This is the best estimate of the unknown spectral radiance that the interferometer can be expected
to produce, which shows that the standard linear calibration algorithm can work well when we
treat the effective total spectrum Z_eff,tot(σ) of the signal leaving the detector circuit just like we
would any other uncalibrated spectrometer signal that depended linearly on the spectral radiance
entering the instrument. Once the system has been calibrated, we can measure any number of
other spectra simply by pointing the instrument at the other radiances, recording new “(meas)”
quantities, and plugging these “(meas)” quantities into Equation (5.95a) while leaving all other
formula values the same.
Equation (5.94e) can be generalized as

$$S(\sigma) = L_{\text{eff}}(|\sigma|) \cdot \{\text{Function of } \sigma\} . \tag{5.95b}$$

Now in Eq. (5.94f) the effective signal spectrum can be written as, for both positive and negative
σ values,

$$\begin{aligned}
Z_{\text{eff,tot}}(\sigma) &= \frac{W}{4}\left[ L_{\text{eff}}(|\sigma|) \cdot \{\text{Function of } \sigma\} \right] H(\sigma u)\, M(R\sigma\theta_{ma}) \\
&\quad + \frac{W}{4}\left[ S_{\text{eff}}^{(\text{fore})}(\sigma) - S_{\text{eff}}^{(\text{back})}(\sigma) \right] H(\sigma u)\, M(R\sigma\theta_{ma}) \\
&= L_{\text{eff}}(|\sigma|) \cdot \{\text{Complex Function of } \sigma\} + \{\text{Background Complex Function of } \sigma\} .
\end{aligned} \tag{5.95c}$$

As long as the effective spectrum of the total signal can be written as a product of the spectral
radiance and a complex function of wavenumber that, due to the background radiance, must be
added to another complex function of the wavenumber, the standard linear calibration algorithm
given in (5.95a) successfully extracts the desired spectral measurement L eff ( σ ) . This procedure
is sometimes called the Revercomb calibration algorithm.89
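
The structure of (5.95a) and (5.95c) can be checked with made-up numbers. The sketch below is illustrative only: the function name `linear_calibration`, the helper `observe`, and every numerical value are assumptions introduced here, and the synthetic instrument is just the form Z = L ⋅ {complex function} + {background complex function} from (5.95c).

```python
def linear_calibration(z_meas, z_1, z_2, l_1, l_2):
    """Apply Eq. (5.95a) independently at each wavenumber.

    z_meas, z_1, z_2 are complex uncalibrated spectra; l_1 and l_2 are the
    two known radiances. The returned values are complex, with imaginary
    parts that should vanish for noise-free data.
    """
    return [
        (b2 - b1) * (zm - a1) / (a2 - a1) + b1
        for zm, a1, a2, b1, b2 in zip(z_meas, z_1, z_2, l_1, l_2)
    ]


# Synthetic instrument at three wavenumbers (all values are made up).
gain = [complex(2.0, 0.5), complex(1.5, -0.3), complex(1.0, 0.8)]
background = [complex(0.2, -0.1), complex(-0.4, 0.3), complex(0.1, 0.1)]

l_cold = [1.0, 1.2, 1.4]   # known radiance L(1)
l_hot = [3.0, 3.5, 4.0]    # known radiance L(2)
l_true = [2.2, 1.9, 3.1]   # "unknown" radiance used to build the test signal


def observe(radiance):
    """Uncalibrated spectrum in the form of Eq. (5.95c)."""
    return [l * g + b for l, g, b in zip(radiance, gain, background)]


l_estimate = linear_calibration(
    observe(l_true), observe(l_cold), observe(l_hot), l_cold, l_hot
)
print([round(value.real, 6) for value in l_estimate])
```

Because both the complex gain and the complex background cancel in the ratio, the recovered values match `l_true` even though neither quantity is known to the calibration routine.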

5.20 Nonflat Optical Surfaces


The easiest way to handle nonflat optical surfaces is to treat the interferometer as a collection of
secondary interferometers operating side by side as shown in Fig. 5.31. The main interferometer
beam is split up into a grid of parallel secondary beams with, as shown in Fig. 5.32, each
secondary beam hitting only a small area of the lens, focusing it onto the detector. Each point on
the detector corresponds to a beam direction hitting the lens. All rays of the secondary beams
that, like the solid lines in Fig. 5.32, are parallel to the optical axis, end up focused at the
detector’s center; and all the rays of the secondary beams that, like the dashed lines in Fig. 5.32,
are traveling in the same off-axis direction, end up focused at the same off-center detector point.
This means that all the small secondary interferometers have the same field of view as the
original large-scale interferometer because each point on the detector corresponds to a different
angle in the field of view.
We label each secondary interferometer with the x, y coordinates of its secondary beam inside
the cross section of the main beam, as shown in Fig. 5.33, and define the distance δ(x, y) to be
the offset of the x, y secondary interferometer’s optical-path difference from the average
optical-path difference χ of the entire collection of secondary beams. When δ = 0 for all the secondary
interferometers, we return to the ideal case of a standard interferometer built using perfectly flat
optical surfaces, with the total signal z(χ) leaving the detector circuit becoming

$$z(\chi) = \int_{-\infty}^{\infty} Z_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma , \tag{5.96a}$$

89
H. E. Revercomb et al., “Radiometric Calibration of IR Fourier Transform Spectrometers: Solution to a Problem
with the High-Resolution Interferometer Sounder,” Applied Optics, 27, no. 15 (1 August 1988), pp. 3210–3218.

FIGURE 5.31. [Diagram labels: radiance entering the interferometer; ideal beam splitter; moving mirror surface; fixed mirror surface; radiance heading to the detector.]

FIGURE 5.32. [Diagram labels: radiance entering the interferometer; ideal beam splitter; moving mirror surface; fixed mirror surface; lens; circular detector in the focal plane of the lens.]

FIGURE 5.33. [Diagram labels: entering the interferometer; moving mirror; heading to the detector; ideal beam splitter; x axis; y axis; grid of secondary interferometers on the fixed mirror.]

using the effective spectrum Z_eff(σ) explained in Sec. 5.11 above.90 If the total cross-sectional
area of the interferometer beam is A, then for δ(x, y) > 0 the beam coming from the x, y
secondary interferometer can be thought of as producing a signal

$$z_{\substack{\text{secondary} \\ x,\,y}}(\chi) = \frac{dx\,dy}{A}\int_{-\infty}^{\infty} Z_{\text{eff}}(\sigma)\, e^{2\pi i\sigma[\chi + \delta(x,y)]}\, d\sigma . \tag{5.96b}$$

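The total signal coming from the interferometer can now be written as

$$\begin{aligned}
z(\chi) &= \frac{1}{A}\iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\,dy \int_{-\infty}^{\infty} d\sigma\, Z_{\text{eff}}(\sigma)\, e^{2\pi i\sigma(\chi + \delta(x,y))} \\
&= \int_{-\infty}^{\infty} Z_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\left[ \frac{1}{A}\iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\,dy\; e^{2\pi i\sigma\delta(x,y)} \right] d\sigma .
\end{aligned} \tag{5.96c}$$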

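Because χ is the average optical-path difference of all the secondary interferometers, and
δ(x, y) is the difference from this average at any x, y point, we can write

$$\left\{ \begin{array}{c} \text{Average optical-path difference} \\ \text{over beam cross section} \end{array} \right\} = \frac{1}{A}\iint dx\,dy \left[ \chi + \delta(x,y) \right] = \chi ,$$

which simplifies to

$$\iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\,dy\; \delta(x,y) = 0 . \tag{5.96d}$$

The interferometer has no hope of working unless δ is always small. We use that

$$e^{x} \cong 1 + x + \frac{1}{2}x^{2}$$

for small x and write that

$$e^{2\pi i\sigma\delta} \cong 1 + 2\pi i\sigma\delta - 2\pi^{2}\sigma^{2}\delta^{2} . \tag{5.97a}$$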

90
Any of the previously discussed formulas for the effective spectrum can be substituted into the formulas used in
this section as long as the mirror-tilt term M is taken to be identically equal to one. We explain the reason for this
rule at the beginning of Sec. 5.21 below.

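Substitution of (5.97a) into (5.96c) gives

$$\begin{aligned}
z(\chi) &\cong \int_{-\infty}^{\infty} Z_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\left\{ \frac{1}{A}\iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\,dy \left[ 1 + 2\pi i\sigma\,\delta(x,y) - 2\pi^{2}\sigma^{2}\,\delta(x,y)^{2} \right] \right\} d\sigma \\
&= \int_{-\infty}^{\infty} Z_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\left[ 1 + \frac{2\pi i\sigma}{A}\iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\,dy\;\delta(x,y) - \frac{2\pi^{2}\sigma^{2}}{A}\iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\,dy\;\delta(x,y)^{2} \right] d\sigma .
\end{aligned}$$

Equation (5.96d) shows that the imaginary term inside the square brackets [ ] disappears, leading
to

$$z(\chi) \cong \int_{-\infty}^{\infty} Z_{\text{eff}}(\sigma)\left[ 1 - 2\pi^{2}\sigma^{2}\,\overline{\delta^{2}} \right] e^{2\pi i\sigma\chi}\, d\sigma , \tag{5.97b}$$

where $\overline{\delta^{2}}$, the average value of δ², is defined to be

$$\overline{\delta^{2}} = \frac{1}{A}\iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\,dy\; [\delta(x,y)]^{2} . \tag{5.97c}$$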
of main beam

We want $[1 - 2\pi^{2}\sigma^{2}\,\overline{\delta^{2}}]$ to be approximately one for all the wavenumbers measured by the
interferometer, so if we plan to measure spectra over the wavenumber range defined by

$$\sigma_{\min} \le \sigma \le \sigma_{\max} , \tag{5.98a}$$

we must use surfaces whose average squared deviation from flatness $\overline{\delta^{2}}$ satisfies

$$\overline{\delta^{2}} \ll \frac{1}{2\pi^{2}\sigma^{2}} \tag{5.98b}$$

for all the wavenumbers between σ_min and σ_max. If (5.98b) is satisfied at σ = σ_max, it is satisfied
for all the wavenumbers in (5.98a). Hence, after defining the root-mean-square deviation from
flatness to be $\delta_{\text{RMS}} = \sqrt{\overline{\delta^{2}}}$, the inequality in (5.98b) reduces to

$$\delta_{\text{RMS}} \ll \frac{\lambda_{\min}}{\pi\sqrt{2}} , \tag{5.98c}$$

where the formula σ = λ^{−1} is used to write the inequality in terms of the minimum measured
wavelength instead of the maximum measured wavenumber.
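
To get a feel for the size of this requirement, the bound in (5.98c) and the multiplicative factor from (5.97b) can be evaluated directly. This is a rough sketch under stated assumptions: the function names and the 10 µm minimum wavelength are hypothetical, and δ_RMS is the RMS offset in optical-path difference as defined above.

```python
import math


def max_rms_flatness(lambda_min):
    """Right-hand side of Eq. (5.98c): lambda_min / (pi * sqrt(2))."""
    return lambda_min / (math.pi * math.sqrt(2.0))


def modulation_factor(sigma, delta_rms):
    """Factor [1 - 2 pi^2 sigma^2 delta_rms^2] multiplying Z_eff in Eq. (5.97b)."""
    return 1.0 - 2.0 * math.pi ** 2 * sigma ** 2 * delta_rms ** 2


# Hypothetical minimum wavelength of 10 micrometers, expressed in cm.
LAMBDA_MIN_CM = 1.0e-3
SIGMA_MAX = 1.0 / LAMBDA_MIN_CM          # maximum wavenumber in cm^-1

bound = max_rms_flatness(LAMBDA_MIN_CM)
factor_at_bound = modulation_factor(SIGMA_MAX, bound)
factor_at_tenth = modulation_factor(SIGMA_MAX, bound / 10.0)

print(f"bound on delta_RMS: {bound:.3e} cm")
print(f"factor at the bound: {factor_at_bound:.3f}")
print(f"factor at one tenth of the bound: {factor_at_tenth:.3f}")
```

The factor falls to zero when δ_RMS equals the bound itself, which is why (5.98c) is a "much less than" condition; at one tenth of the bound the factor at σ_max is 0.99.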

5.21 An Example of How to Analyze Nonflat Optical Surfaces


Most of our previous formulas for the effective spectrum Z_eff(σ), such as Eqs. (5.82a) or
(5.83d), contain a factor M(Rσθ_ma) representing the effect of a slightly misaligned moving
mirror. This term is defined in Eq. (5.10c) above to be

$$M(R\sigma\theta_{ma}) = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} ,$$

and we see from Eq. (5.10e) that M = 1 when the misalignment angle θ_ma is zero. A misaligned
moving mirror is, of course, misaligned with respect to the fixed mirror, so we can always model
this imperfection as a misalignment of the fixed mirror rather than the moving mirror (see Fig.
5.34). The size of the fixed mirror’s misalignment angle is also θ_ma, the same as the size of the
moving mirror’s misalignment angle. This means that when θ_ma > 0, as in Fig. 5.34, we have a
special case of the nonflat optical surface discussed in Sec. 5.20. Hence, when using the analysis
for a nonflat optical surface in Sec. 5.20, we must also set M = 1 in all the formulas for Z_eff(σ),
because otherwise we “double count” the effect of a tilted moving mirror. By the same reasoning,
however, the accuracy of the procedure used to analyze nonflat surfaces can be checked by
comparing it to what we get when θ_ma is small but not zero.
Equation (5.97b) states that when the moving or fixed mirror is not flat for any reason
(including, for example, being slightly misaligned and so having a nonzero θ_ma value), the
original formula for the effective spectrum Z_eff(σ) should be multiplied by a factor of

$$\left[ 1 - 2\pi^{2}\sigma^{2}\,\overline{\delta^{2}} \right] .$$

Equations (5.82a) and (5.83d), on the other hand, require the formulas for Z_eff(σ) to be
multiplied by

$$M(R\sigma\theta_{ma}) = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}$$

when the misalignment angle θ_ma is small but nonzero. (As before, R is the radius of the circular
cross section of the beam passing through the interferometer.) Comparing these two expressions,
we see that for them to be consistent

$$\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} \overset{?}{\cong} \left[ 1 - 2\pi^{2}\sigma^{2}\,\overline{\delta^{2}} \right] \tag{5.99}$$

must hold true when the misalignment angle θ_ma is small.

FIGURE 5.34. [Diagram labels: radiance entering the interferometer; moving mirror; fixed mirror with tilt θ_ma; radiance heading to the detector.]


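To see whether (5.99) is in fact true, we expand its left-hand side in a power series. When x is
small, we know that91

$$J_1(x) \cong \frac{x}{2} - \frac{x^{3}}{16} , \tag{5.100a}$$

so for small θ_ma we can write

$$\begin{aligned}
\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} &\cong (2\pi R\sigma\theta_{ma})^{-1} \cdot \left[ 2\pi R\sigma\theta_{ma} - \frac{1}{2}(2\pi R\sigma\theta_{ma})^{3} \right] \\
&= 1 - \frac{1}{2}(2\pi R\sigma\theta_{ma})^{2} = 1 - 2\pi^{2}R^{2}\sigma^{2}\theta_{ma}^{2} .
\end{aligned} \tag{5.100b}$$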

To evaluate the right-hand side of (5.99), we consult Fig. 5.35 to get

$$\delta(x, y) = 2\theta_{ma}\, y . \tag{5.101a}$$

Circular symmetry allows us to choose the orientation of the x, y axes any way we please, and
they have been chosen so that the moving mirror is tilted by a rotation θ_ma about the x axis. We
convert to polar coordinates using

$$x = r\cos\phi , \qquad y = r\sin\phi$$

so that formula (5.101a) becomes

$$\delta(r, \phi) = 2\theta_{ma}\, r\sin\phi . \tag{5.101b}$$

Since the main beam has a circular cross section of radius R, Eq. (5.97c) can be written as

91
See Eq. (9.1.10) on page 360 of Handbook of Mathematical Functions, edited by Milton Abramowitz and Irene
Stegun.

$$\begin{aligned}
\overline{\delta^{2}} &= \frac{1}{\pi R^{2}}\int_{0}^{2\pi} d\phi \int_{0}^{R} dr\; r\,[2\theta_{ma}\, r\sin\phi]^{2} = \frac{4\theta_{ma}^{2}}{\pi R^{2}}\int_{0}^{2\pi} d\phi\,\sin^{2}\phi \int_{0}^{R} dr\; r^{3} \\
&= \frac{4\theta_{ma}^{2}}{\pi R^{2}} \cdot \left[ \frac{\phi}{2} - \frac{\sin(2\phi)}{4} \right]_{0}^{2\pi} \cdot \frac{R^{4}}{4} \\
&= R^{2}\theta_{ma}^{2} .
\end{aligned} \tag{5.101c}$$

FIGURE 5.35. [Diagram labels: radiance impinging on the tilted mirror; δ = 2 ⋅ θ_ma ⋅ y.]

Substitution of Eq. (5.100b) into the left-hand side, and Eq. (5.101c) into the right-hand side,
of the proposed equality in (5.99) gives

$$1 - 2\pi^{2}R^{2}\sigma^{2}\theta_{ma}^{2} \overset{?}{\cong} \left[ 1 - 2\pi^{2}\sigma^{2}R^{2}\theta_{ma}^{2} \right] ,$$

which is clearly true. This result not only shows why we should be careful to regard a misaligned
moving or fixed mirror as a special type of nonflat optical surface but also checks the procedure
used in Sec. 5.20 to analyze more general types of nonflat optical surfaces.
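
The agreement can also be checked numerically without relying on the two-term expansion. The sketch below is only an illustration: it evaluates J1 from its standard power series, and the beam radius, wavenumber, and tilt angle are hypothetical values chosen so that 2πRσθ_ma is small.

```python
import math


def bessel_j1(x, terms=30):
    """J1(x) from its power series: sum of (-1)^k (x/2)^(2k+1) / (k! (k+1)!)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k) * (x / 2.0) ** (2 * k + 1) / (
            math.factorial(k) * math.factorial(k + 1)
        )
    return total


def tilt_factor(radius, sigma, theta_ma):
    """M(R sigma theta_ma) = J1(4 pi R sigma theta_ma) / (2 pi R sigma theta_ma)."""
    a = 2.0 * math.pi * radius * sigma * theta_ma
    if a == 0.0:
        return 1.0
    return bessel_j1(2.0 * a) / a


def nonflat_factor(radius, sigma, theta_ma):
    """[1 - 2 pi^2 sigma^2 delta2] with delta2 = R^2 theta_ma^2 from Eq. (5.101c)."""
    return 1.0 - 2.0 * math.pi ** 2 * sigma ** 2 * radius ** 2 * theta_ma ** 2


# Hypothetical values: R = 2 cm, sigma = 1000 cm^-1, tilt = 10 microradians.
R_CM, SIGMA, THETA = 2.0, 1000.0, 1.0e-5

m_exact = tilt_factor(R_CM, SIGMA, THETA)
m_nonflat = nonflat_factor(R_CM, SIGMA, THETA)
print(f"Bessel form:   {m_exact:.6f}")
print(f"nonflat form:  {m_nonflat:.6f}")
print(f"difference:    {abs(m_exact - m_nonflat):.2e}")
```

The two factors differ only at fourth order in 2πRσθ_ma, so the gap grows quickly once the tilt is no longer small.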

5.22 Sampling the Interferogram Signal


After the interferogram signal leaves the detector circuit, it should be sampled at equally spaced
intervals of the optical-path difference. In principle, all we need to do is keep the moving mirror
traveling at a constant velocity while using an analog-to-digital converter (A/D converter) to
sample the signal at equally spaced instants in time. In practice, much better results are achieved
when a laser beam is used to trigger the A/D converter. In Sec. 1.8 of Chapter 1, we discussed in
a general way how laser control systems can be used to maintain alignment and produce steady
motion of the moving mirror. In a well-aligned system, we only need a single laser beam to
sample the interferogram signal at equally spaced intervals of the optical-path difference. In Fig.
5.36, for example, two small angle mirrors insert and remove a laser beam parallel to the optical
axis of the signal beam. The laser beam passes through the interferometer in exactly the same
way as the main signal beam, and it experiences the same optical-path difference χ as the main
signal beam. The laser detector registers a monochromatic interference signal, and from Eqs.
(5.16b) and (5.16c) and Figs. 5.9(b) and 5.9(c), we know that this signal generates a cosine wave
in χ. The laser trigger circuit analyzes this cosine wave, sending out a trigger signal telling the
A/D converter to sample the main-beam signal every time the cosine wave crosses a
predetermined trigger level (see Fig. 5.37). Now when the location of the moving mirror varies
slightly from its predetermined value, the error in the sampling position no longer comes from
sampling at the wrong position of the moving mirror but instead is caused by inaccuracies in the
laser-trigger and main-beam detector circuits.92 As a general rule, this makes the error a great
deal smaller. Similarly, slight changes in the overall size of the interferometer setup due to
mechanical flexing need no longer concern us; the laser beam establishes an invariant “ruler” that
does not care whether the overall distance between, say, the beam splitter and the fixed mirror
has changed by several microns since the last time the instrument was calibrated.

FIGURE 5.36. [Diagram labels: outside radiance entering the interferometer; ideal beam splitter; moving mirror; fixed mirror; laser; laser detector; lens; interferometer detector; trigger circuit processing the signal from the laser detector; detector circuit; A/D converter; digitized detector signal.]

FIGURE 5.37. [Plot labels: total power in the laser interference signal; laser trigger level; λ_0.]
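
The "ruler" idea can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration: the mirror motion `opd`, its velocity ripple, the He-Ne wavelength, the AC-coupled cosine signal, and the choice to trigger only on upward crossings of a zero trigger level.

```python
import math

LAMBDA_0_CM = 632.8e-7   # assumed He-Ne laser wavelength, in cm


def opd(t):
    """Hypothetical optical-path difference (cm) with a small velocity ripple."""
    return 0.05 * t + 2.0e-5 * math.sin(40.0 * t)


def laser_signal(chi):
    """AC-coupled monochromatic interference signal: a cosine wave in chi."""
    return math.cos(2.0 * math.pi * chi / LAMBDA_0_CM)


def trigger_positions(t_end=0.05, dt=1.0e-5, level=0.0):
    """OPD values at which the laser signal crosses the trigger level upward."""
    positions = []
    n_steps = int(round(t_end / dt))
    t_prev = 0.0
    s_prev = laser_signal(opd(t_prev))
    for i in range(1, n_steps + 1):
        t = i * dt
        s = laser_signal(opd(t))
        if s_prev < level <= s:
            # Linear interpolation in time to locate the crossing.
            frac = (level - s_prev) / (s - s_prev)
            positions.append(opd(t_prev + frac * dt))
        t_prev, s_prev = t, s
    return positions


chi_samples = trigger_positions()
spacings = [b - a for a, b in zip(chi_samples, chi_samples[1:])]
worst = max(abs(s - LAMBDA_0_CM) for s in spacings)
print(f"{len(chi_samples)} triggers, worst spacing error {worst / LAMBDA_0_CM:.2e} wavelengths")
```

Although the mirror velocity is not constant in this model, the triggered samples stay spaced by one laser wavelength in optical-path difference, which is the behavior the text attributes to laser-triggered sampling.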
Section 5.14 above points out that to remove the background radiance from the main-beam
detector signal, we just subtract the interferogram signal produced by a very cold source from the
interferogram signal produced by the source whose spectrum we want to measure. Equation
(5.62c) describes this process as

92
In Chapter 8 we analyze this sort of sampling error as a random source of noise.

$$z(\chi) = z_{\text{tot}}(\chi) - z_{C}^{(\text{cold})}(\chi) , \tag{5.102}$$

where z_tot(χ) is the interferogram signal produced by the combination of the desired source
spectrum with the instrument background, z_C^(cold)(χ) is the interferogram signal produced by just
the instrument background when observing a very cold source, and z(χ) is the interferogram
signal produced by just the source spectrum we want to measure. When we sample the signal leaving the
detector circuit at equal optical-path-difference intervals Δχ, what we get is either

ztot (m∆χ ) for m = 0, ± 1, ± 2, …

when observing the source radiance combined with the instrument background or

í zC(cold) (m∆χ ) for m = 0, ± 1, ± 2, …


when observing just the cold source.93 Working with a double-sided interferogram system of the
type described in Sec. 5.15, we acquire a total of N samples. Formula (5.102) then gives the
sampled interferogram signal produced by just the source spectrum,

$$z(m\Delta\chi) = z_{\text{tot}}(m\Delta\chi) - z_{C}^{(\text{cold})}(m\Delta\chi) . \tag{5.103a}$$

The fast-Fourier transform algorithms that are applied to these samples work best when N is a
multiple of 2, as is mentioned at the beginning of Sec. 2.22 of Chapter 2, so in (5.103a) the index
values of m can be chosen so that

$$m = -\frac{N}{2}+1,\ -\frac{N}{2}+2,\ \ldots,\ -1,\ 0,\ 1,\ \ldots,\ \frac{N}{2}-1,\ \frac{N}{2} . \tag{5.103b}$$

Note that (5.103b) specifies one “extra” sample to occur on the positive χ axis.
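
The index set in (5.103b) is easy to generate and check. In this small sketch the function name `sample_indices` and the choice N = 8 are assumptions made for illustration.

```python
def sample_indices(n):
    """Index values m of Eq. (5.103b) for an even number of samples n."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("this sketch assumes n is a positive even integer")
    half = n // 2
    return list(range(-half + 1, half + 1))


indices = sample_indices(8)
n_negative = sum(1 for m in indices if m < 0)
n_positive = sum(1 for m in indices if m > 0)
print(indices)
print(f"{n_negative} negative, {n_positive} positive, plus m = 0")
```

For N = 8 the list runs from −3 to 4, so there are N samples in total with one more on the positive side than on the negative side.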

5.23 Setting Up the Discrete Fourier Transform of the Sampled Signal


The key step in modern Fourier-transform spectroscopy is to apply a fast-Fourier transform (FFT)
algorithm to the sampled signal z(mΔχ) in order to calculate the discrete Fourier transform
(DFT) that best approximates the integral Fourier transform of z(χ). The unsampled interferogram
signal leaving the detector circuit can be written as

93
To keep things simple, we assume for now that the sample with index m = 0 occurs at χ = 0 . Section 5.26
below shows what happens when we stop assuming that one of the samples occurs at exactly χ = 0 .

$$z(\chi) = \int_{-\infty}^{\infty} Z_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma \tag{5.104a}$$

where, according to Eq. (5.83d),

$$Z_{\text{eff}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\, L_{\text{FOV}}(|\sigma|) . \tag{5.104b}$$

Usually the aft optics’ transmission function τ_a(σ) is nonzero only for those wavenumbers σ that
satisfy

$$\sigma_{\min} \le \sigma \le \sigma_{\max} , \tag{5.105}$$

making the effective spectrum Z_eff equal to zero for |σ| > σ_max or |σ| < σ_min as shown in Fig.
5.38.

FIGURE 5.38. [Plot: Z_eff(σ) versus σ, with −σ_max, −σ_min, σ_min, and σ_max marked on the σ axis.]

FIGURE 5.39. [Plot: z_trunc(χ) versus χ, with −D and D marked on the χ axis.]

The work done in Sec. 5.15 shows that the interferogram signal for a double-sided
interferogram, which we call the truncated interferogram signal, can be written as

$$z_{\text{trunc}}(\chi) = \Pi(\chi, D)\, z(\chi) \tag{5.106a}$$

so that

$$z_{\text{trunc}}(\chi) = \begin{cases} z(\chi) & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} \tag{5.106b}$$

as shown in Fig. 5.39.
Looking ahead to when the signal is sampled, we note that for N equally spaced samples

$$\Delta\chi = \frac{2D}{N} . \tag{5.107a}$$

Since

$$z_{\text{trunc}}(\chi) = z(\chi) \quad \text{for } |\chi| \le D , \tag{5.107b}$$

it then follows that

$$z_{\text{trunc}}(m\Delta\chi) = z(m\Delta\chi) \tag{5.107c}$$

for

$$m = -\frac{N}{2}+1,\ -\frac{N}{2}+2,\ \ldots,\ -1,\ 0,\ 1,\ \ldots,\ \frac{N}{2}-1,\ \frac{N}{2} . \tag{5.107d}$$

Equation (5.65c) shows, after we substitute from (5.106a), that the effective spectrum
associated with the unsampled signal is

$$\int_{-\infty}^{\infty} z_{\text{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = Z_{\text{eff,trunc}}(\sigma) \tag{5.108a}$$

with

$$Z_{\text{eff,trunc}}(\sigma) = [2D\,\text{sinc}(2\pi\sigma D)] \ast Z_{\text{eff}}(\sigma) . \tag{5.108b}$$

Figure 5.40 shows that D will be chosen large enough to make Z_eff,trunc just a slightly blurred
version of Z_eff with a tendency to oscillate at abrupt changes in value. According to the
discussion following Eq. (5.82c) above, the quantities H, M, R, η, τ_a, and τ_f are all slowly
varying functions of their arguments.94 This means that when the formula for Z_eff in (5.104b) is
substituted into Eq. (5.108b), the sinc function is narrow enough for these quantities to be treated
as quasi-constant with respect to the convolution [see Eq. (5C.1) in Appendix 5C]. Hence, we can
approximate (5.108b) as

$$Z_{\text{eff,trunc}}(\sigma) \cong \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\, L_{\text{mnf}}(|\sigma|) , \tag{5.108c}$$

where

$$L_{\text{mnf}}(\sigma) = [2D\,\text{sinc}(2\pi\sigma D)] \ast L_{\text{FOV}}(|\sigma|) . \tag{5.108d}$$

94
So if the aft-optics transmission τ_a and the detector responsivity R drop to zero for |σ| > σ_max and
|σ| < σ_min, this must occur slowly compared to the rate at which L_FOV varies with σ.

FIGURE 5.40. [Plot: Z_eff,trunc(σ) versus σ, with −σ_max, −σ_min, σ_min, and σ_max marked on the σ axis.]

We note that because both 2D sinc(2πσD) and L_FOV(|σ|) are even functions of σ, their
convolution L_mnf(σ) is also even [see Eq. (2.38f) in Chapter 2],

$$L_{\text{mnf}}(-\sigma) = L_{\text{mnf}}(\sigma) \tag{5.108e}$$

and

$$L_{\text{mnf}}(\sigma) = L_{\text{mnf}}(|\sigma|) . \tag{5.108f}$$

Even though the argument of L_mnf does not need absolute value signs because L_mnf is by
definition in (5.108d) already an even function, they are put there anyway to keep the notation
parallel with the previous L-type radiance symbols. The mnf subscript indicates that L_mnf is the
measured, noise-free spectral radiance produced by the interferometer; it is L(σ) blurred both by
the finite field-of-view effect discussed in Sec. 5.17 and the finite-interferogram effect discussed
in Sec. 5.15. Figures 5.41(a)–5.41(c) show the progression from the original L(σ) radiance
spectrum to L_FOV(σ) defined in Eq. (5.83e) above to L_mnf(σ) defined in Eq. (5.108d). The
unsampled, noise-free signal can now be written as the Fourier transform pair,

$$Z_{\text{eff,trunc}}(\sigma) = \int_{-\infty}^{\infty} z_{\text{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.109a}$$

and

$$z_{\text{trunc}}(\chi) = \int_{-\infty}^{\infty} Z_{\text{eff,trunc}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.109b}$$

Function L_mnf(|σ|) is closely related to function L_eff(|σ|) in Eqs. (5.94e) and (5.94f). Hence, it
makes sense to assume that

$$L_{\text{mnf}}(|\sigma|) \cong L_{\text{eff}}(|\sigma|) \tag{5.110}$$

and to replace L_eff(|σ|) by L_mnf(|σ|) in the calibration formulas of Sec. 5.19.

5.24 Oversampling the Interferogram


Section 2.21 of Chapter 2 shows how to go from the integral Fourier transform to the discrete
Fourier transform. Comparing the integral transform pair in Eqs. (5.109a) and (5.109b) to Eq.
(2.91a) of Chapter 2, we note that variables χ and σ in this chapter play the roles of variables t
and f respectively in Chapter 2,

$$\chi \Leftrightarrow t \tag{5.111a}$$

and

$$\sigma \Leftrightarrow f . \tag{5.111b}$$

Perhaps the most important decision involved in going from the integral to the discrete Fourier
transform is the choice of step size Δχ between the equally spaced samples of z_trunc(χ).
Converting Eq. (2.99a) of Chapter 2 from variables t and f to variables χ and σ, we see that the
Nyquist wavenumber σ_Nyq corresponding to the Nyquist frequency f_Nyq is given by

$$\sigma_{\text{Nyq}} = \frac{1}{2\Delta\chi} . \tag{5.112}$$

The discussion at the beginning of Sec. 2.22 of Chapter 2 shows that oversampling the
interferogram signal z_trunc(χ) means choosing the sampling interval Δχ in such a way that the
Nyquist wavenumber σ_Nyq satisfies

$$\sigma_{\text{Nyq}} > \sigma_{\max}$$

FIGURE 5.41(a). [Plot: L(σ) versus wavenumber between 1009 cm⁻¹ and 1010 cm⁻¹, with Spectral Edge E marked.]
This small piece of the radiance spectrum, the same piece plotted in Fig. 5.30(a) above, is
here graphed in all its detail as it enters the interferometer. This is why the y axis is labeled
L(σ). Note that Spectral Edge E lies at wavenumber 1010 cm⁻¹.

FIGURE 5.41(b). [Plot: L_FOV(σ) versus wavenumber between 1009 cm⁻¹ and 1010 cm⁻¹, with Spectral Edge E marked.]
Here the same small piece of the radiance spectrum plotted in Figs. 5.41(a) and 5.30(b)
is shown with the rescaled wavenumber axis and blurring due to the interferometer’s
finite field of view. Hence, the y axis is labeled L_FOV(σ). Note that Spectral Edge E now
occurs at a slightly smaller wavenumber than before.

FIGURE 5.41(c). [Plot: L_mnf(σ) versus wavenumber between 1009 cm⁻¹ and 1010 cm⁻¹, with Spectral Edge E marked.]
This is the same small piece of the radiance spectrum plotted in Figs. 5.41(a) and
5.41(b). The y axis is labeled L_mnf(σ) to show that here the radiance is blurred by both
the interferometer’s finite field of view and the interferometer’s finite interferogram
length. Note that Spectral Edge E has the same wavenumber shift as in Fig. 5.41(b),
and that the spectral detail has been further blurred by the finite interferogram length.
Figures 5.41(a)–5.41(c) qualitatively show the effect of the two spectral distortions
inherent in standard Fourier-transform spectrometers.

with σ_max defined by inequality (5.105) and Fig. 5.38. The larger σ_Nyq is compared to σ_max, the
more accurate the transformation from the integral Fourier transform to the discrete Fourier
transform. The reason for this, of course, is that the larger σ_Nyq is compared to σ_max, the less likely
it is that significant amounts of aliasing will occur when going from the integral to the discrete
Fourier transform. Although both aliasing and the transformation from the integral to the discrete
Fourier transform have already been covered in Secs. 2.21–2.23 of Chapter 2, it does no harm to
review these ideas here in the context of the truncated interferogram signal z_trunc(χ) and its
effective spectrum Z_eff,trunc(σ).
The first step in setting up the discrete Fourier transform is to construct function z_trunc^[∞](χ, 2D)
from z_trunc(χ) following the procedure used in Eq. (2.91b) of Chapter 2,

$$z_{\text{trunc}}^{[\infty]}(\chi, 2D) = \sum_{k=-\infty}^{\infty} z_{\text{trunc}}(\chi - 2kD) . \tag{5.113a}$$

From Eq. (5.106b) and Fig. 5.39, we know that z_trunc is zero for |χ| > D. Consequently
z_trunc^[∞](χ, 2D) has the form shown in Fig. 5.42. This matches the situation shown in Fig. 2.12(a) of
Chapter 2, with the original signal z_trunc turned into a nonoverlapping, periodic signal z_trunc^[∞] of
period 2D. In particular, we note that

$$z_{\text{trunc}}^{[\infty]}(\chi, 2D) = z_{\text{trunc}}(\chi) \quad \text{for } |\chi| \le D . \tag{5.113b}$$

Next we construct Z_eff,trunc^[∞](σ, 2σ_Nyq) using

$$Z_{\text{eff,trunc}}^{[\infty]}(\sigma, 2\sigma_{\text{Nyq}}) = \sum_{k=-\infty}^{\infty} Z_{\text{eff,trunc}}(\sigma - 2k\sigma_{\text{Nyq}}) . \tag{5.113c}$$

Glancing at the plot of Z_eff,trunc in Fig. 5.43, we see that the plot of Z_eff,trunc^[∞] has the form shown in
Fig. 5.44. The original signal Z_eff,trunc is turned into a periodic signal Z_eff,trunc^[∞] of period 2σ_Nyq,
matching the situation shown in Fig. 2.12(b) of Chapter 2. Consequently, we have that

$$Z_{\text{eff,trunc}}^{[\infty]}(\sigma, 2\sigma_{\text{Nyq}}) \cong Z_{\text{eff,trunc}}(\sigma) \quad \text{for } |\sigma| \le \sigma_{\text{Nyq}} . \tag{5.113d}$$

FIGURE 5.42. [Plot: z_trunc^[∞](χ, 2D) versus χ, with −5D, −3D, −D, D, 3D, and 5D marked on the χ axis.]

The edge ripples of Z_eff,trunc are small and become smaller as we get further from the edge, but they
can in principle extend indefinitely far along the wavenumber axis, which means that overlapping
can occur, making Z_eff,trunc^[∞] not exactly equal to Z_eff,trunc for |σ| ≤ σ_Nyq. Reviewing the discussion
following Eqs. (2.93a) and (2.93b) of Chapter 2, we see that approximating z_trunc and Z_eff,trunc by
periodic functions (with periods 2D and 2σ_Nyq respectively) is exactly what we need to do when
approximating the integral Fourier transform by the discrete Fourier transform. Now we
understand why the correct choice of σ_Nyq is so important; if σ_Nyq is set too close to σ_max, the
ringing at the edges of Z_eff,trunc could create significant amounts of overlap in its periodic extension
to function Z_eff,trunc^[∞].

As is pointed out in Sec. 2.22 of Chapter 2, this sort of overlap is called aliasing of the signal
spectrum. When $\sigma_{\mathrm{Nyq}}$ is chosen to be decidedly greater than $\sigma_{\max}$, the interferogram signal is said
to be oversampled. The choice of $D$ made when going from $z_{\mathrm{trunc}}$ to $z^{[\infty]}_{\mathrm{trunc}}$, although in principle
equally important in characterizing the discrete Fourier transform, is in practice specified at an
earlier stage of the interferometer design when deciding on the spectral resolution of the
measured spectrum [see Eq. (5.67) above].
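As a rough numerical illustration of aliasing, the sketch below samples a cosine whose wavenumber lies above the Nyquist wavenumber and shows that its samples are indistinguishable from those of a cosine folded back below $\sigma_{\mathrm{Nyq}}$. The helper name `apparent_wavenumber` and the numerical values (a Nyquist wavenumber of 1000 and a test wavenumber of 1200, in arbitrary units) are assumptions made only for this example.

```python
import numpy as np

def apparent_wavenumber(sigma, sigma_nyq):
    """Fold a wavenumber into [0, sigma_nyq], mimicking sampling at spacing 1/(2*sigma_nyq)."""
    period = 2.0 * sigma_nyq
    folded = np.mod(sigma, period)
    return np.minimum(folded, period - folded)

sigma_nyq = 1000.0                 # hypothetical Nyquist wavenumber (arbitrary units)
dchi = 1.0 / (2.0 * sigma_nyq)     # sampling interval, from 2*sigma_nyq = 1/dchi
m = np.arange(64)                  # sample indices

sigma_true = 1200.0                # lies above sigma_nyq, so it must alias
sigma_alias = apparent_wavenumber(sigma_true, sigma_nyq)

z_true = np.cos(2 * np.pi * sigma_true * m * dchi)
z_alias = np.cos(2 * np.pi * sigma_alias * m * dchi)

# The two sampled signals agree to rounding error: the samples cannot tell them apart.
print(sigma_alias, np.max(np.abs(z_true - z_alias)))
```

Choosing $\sigma_{\mathrm{Nyq}}$ decidedly larger than $\sigma_{\max}$ keeps every wavenumber of interest out of the folded region.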
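Because $z_{\mathrm{trunc}}(\chi)$ is zero for $|\chi| > D$, and both the real and imaginary components of $Z_{\mathrm{eff,trunc}}(\sigma)$
are negligible for $|\sigma| > \sigma_{\mathrm{Nyq}}$, the pair of integral Fourier transforms in Eqs. (5.109a) and (5.109b)
can be approximated by

\[ Z_{\mathrm{eff,trunc}}(\sigma) = \int_{-D}^{D} z_{\mathrm{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.114a} \]

and

\[ z_{\mathrm{trunc}}(\chi) = \int_{-\sigma_{\mathrm{Nyq}}}^{\sigma_{\mathrm{Nyq}}} Z_{\mathrm{eff,trunc}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.114b} \]

FIGURE 5.43. Plot of $Z_{\mathrm{eff,trunc}}(\sigma)$ against $\sigma$, with the axis marked at $\pm\sigma_{\min}$, $\pm\sigma_{\max}$, and $\pm\sigma_{\mathrm{Nyq}}$.

FIGURE 5.44. Plot of $Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})$ against $\sigma$, with the axis marked at $\pm\sigma_{\min}$, $\pm\sigma_{\max}$, $\pm\sigma_{\mathrm{Nyq}}$, $\pm 2\sigma_{\mathrm{Nyq}}$, and $\pm 3\sigma_{\mathrm{Nyq}}$.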

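With the understanding that only the signal values at $|\chi| \le D$ and the spectral values at $|\sigma| \le \sigma_{\mathrm{Nyq}}$
are of interest on the left-hand sides of the formulas, we use Eqs. (5.113b) and (5.113d) to replace
$z_{\mathrm{trunc}}$ by $z^{[\infty]}_{\mathrm{trunc}}$ and $Z_{\mathrm{eff,trunc}}$ by $Z^{[\infty]}_{\mathrm{eff,trunc}}$. Equations (5.114a) and (5.114b) now become

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}}) = \int_{-D}^{D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.115a} \]

and

\[ z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D) = \int_{-\sigma_{\mathrm{Nyq}}}^{\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.115b} \]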

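Working first with the right-hand side of Eq. (5.115b), we note that

\[
\begin{aligned}
\int_{-\sigma_{\mathrm{Nyq}}}^{\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma
&= \int_{-\sigma_{\mathrm{Nyq}}}^{0} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma
 + \int_{0}^{\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma \\
&= \int_{\sigma_{\mathrm{Nyq}}}^{2\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma' - 2\sigma_{\mathrm{Nyq}}, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma'\chi}\, e^{-2\pi i(2\sigma_{\mathrm{Nyq}})\chi}\, d\sigma'
 + \int_{0}^{\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma ,
\end{aligned}
\tag{5.116}
\]

where in the last step the variable of integration in the first integral has been changed to
$\sigma' = \sigma + 2\sigma_{\mathrm{Nyq}}$. From Eq. (5.112) we get

\[ e^{-2\pi i(2\sigma_{\mathrm{Nyq}})\chi} = e^{-2\pi i(\chi/\Delta\chi)} . \]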
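If $\chi/\Delta\chi = m = \text{integer}$, then

\[ e^{-2\pi i(\chi/\Delta\chi)} = e^{-2\pi i m} = 1 . \]

Substituting (5.116) back into (5.115b) and deciding to evaluate $z^{[\infty]}_{\mathrm{trunc}}$ only at those optical-path
differences $\chi$ for which $\chi/\Delta\chi = m$, we get

\[
z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) = \int_{\sigma_{\mathrm{Nyq}}}^{2\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma' - 2\sigma_{\mathrm{Nyq}}, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma' m\Delta\chi}\, d\sigma'
 + \int_{0}^{\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma m\Delta\chi}\, d\sigma .
\]

This becomes, dropping the prime and recognizing that $Z^{[\infty]}_{\mathrm{eff,trunc}}$ is periodic with period $2\sigma_{\mathrm{Nyq}}$,

\[ z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) = \int_{0}^{2\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma m\Delta\chi}\, d\sigma . \tag{5.117} \]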

Now we switch our attention to Eq. (5.115a). Following the same procedure as before, this time
changing the variable of integration to $\chi' = \chi + 2D$, we write its right-hand side as

\[
\int_{-D}^{D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi
= \int_{D}^{2D} z^{[\infty]}_{\mathrm{trunc}}(\chi' - 2D, 2D)\, e^{-2\pi i\sigma\chi'}\, e^{2\pi i\sigma(2D)}\, d\chi'
 + \int_{0}^{D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi .
\]

Substituting this into (5.115a) gives, since $z^{[\infty]}_{\mathrm{trunc}}$ is periodic with period $2D$,

\[
Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}}) = \int_{0}^{D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi
 + \int_{D}^{2D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, e^{2\pi i\sigma(2D)}\, d\chi ,
\tag{5.118}
\]

where the prime has been dropped from the integral between $D$ and $2D$. From Eq. (2.93d) of
Chapter 2, we note (remembering that variables $\chi$ and $\sigma$ here correspond, respectively, to
variables $t$ and $f$ there) that the interval $\Delta\sigma$ between samples of $Z^{[\infty]}_{\mathrm{eff,trunc}}$ in the discrete Fourier
transform is

\[ \Delta\sigma = \frac{1}{2D} \tag{5.119} \]

when $z^{[\infty]}_{\mathrm{trunc}}$ has period $2D$. This means that

\[ e^{2\pi i\sigma(2D)} = e^{2\pi i(\sigma/\Delta\sigma)} \]

in the second integral of (5.118). If $\sigma/\Delta\sigma = n = \text{integer}$, then

\[ e^{2\pi i(\sigma/\Delta\sigma)} = e^{2\pi i n} = 1 . \]

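Now, deciding to evaluate $Z^{[\infty]}_{\mathrm{eff,trunc}}$ only at wavenumbers for which $\sigma/\Delta\sigma = n$, we can write
(5.118) as

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}}) = \int_{0}^{2D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i(n\Delta\sigma)\chi}\, d\chi . \tag{5.120} \]

Equations (5.117) and (5.120) are gathered together to get

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}}) = \int_{0}^{2D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i(n\Delta\sigma)\chi}\, d\chi \tag{5.121a} \]

and

\[ z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) = \int_{0}^{2\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma m\Delta\chi}\, d\sigma . \tag{5.121b} \]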

The discussion following Eq. (5.106b) defined $N$ to be the number of equally spaced samples of
$z^{[\infty]}_{\mathrm{trunc}}$ between $-D$ and $D$, with $\Delta\chi = (2D)/N$, so $N$ equally spaced samples spaced $\Delta\chi$ apart must
also cover the optical-path difference between zero and $2D$. We now show that $N$ equally spaced
samples of $Z^{[\infty]}_{\mathrm{eff,trunc}}$ spaced $\Delta\sigma$ apart in wavenumber cover the wavenumber interval between zero
and $2\sigma_{\mathrm{Nyq}}$. Remembering that variables $\chi$ and $\sigma$ here correspond to variables $t$ and $f$ respectively
in Chapter 2, we rewrite Eq. (2.93e) of Chapter 2 as

\[ \Delta\sigma\, \Delta\chi = \frac{1}{N} . \tag{5.122a} \]

Consequently,

\[ \Delta\sigma = \frac{1}{N \Delta\chi} \]

or, using $2\sigma_{\mathrm{Nyq}} = (\Delta\chi)^{-1}$ from Eq. (5.112),

\[ N \Delta\sigma = 2\sigma_{\mathrm{Nyq}} . \tag{5.122b} \]

Therefore $N$ equally spaced samples $\Delta\sigma$ apart must cover the wavenumber interval between zero
and $2\sigma_{\mathrm{Nyq}}$.
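Having established that $N$ equally spaced samples cover the regions of integration in Eqs.
(5.121a) and (5.121b), we approximate both integrals as sums over $N$ equally spaced samples in
wavenumber and optical-path difference. This gives

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}}) \cong \Delta\chi \sum_{m=0}^{N-1} z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D)\, e^{-2\pi i(n\Delta\sigma)(m\Delta\chi)} \tag{5.123a} \]

and

\[ z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \cong \Delta\sigma \sum_{n=0}^{N-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)} . \tag{5.123b} \]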
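The sum in (5.123a) can be tried out numerically. The sketch below is only an illustration under stated assumptions: it uses a Gaussian test signal $e^{-\pi\chi^2}$ (not an interferogram from this chapter), whose integral Fourier transform under the $e^{-2\pi i\sigma\chi}$ convention is $e^{-\pi\sigma^2}$, together with hypothetical values $D = 8$ and $N = 256$. The samples are taken on $0 \le \chi < 2D$ from the periodic extension, as in Eq. (5.121a).

```python
import numpy as np

D = 8.0                  # maximum optical-path difference (hypothetical units)
N = 256                  # number of samples, even
dchi = 2.0 * D / N       # sampling interval, dchi = 2D/N
dsigma = 1.0 / (2.0 * D) # spectral sample spacing, Eq. (5.119)

m = np.arange(N)
chi = m * dchi
# Periodic extension with period 2D: points beyond D are wrapped back by 2D.
chi_wrapped = np.where(chi > D, chi - 2.0 * D, chi)
z = np.exp(-np.pi * chi_wrapped**2)   # Gaussian test signal (an assumption)

# Eq. (5.123a): since dsigma*dchi = 1/N, the exponent is -2*pi*i*n*m/N,
# which is the sign convention used by np.fft.fft.
Z_dft = dchi * np.fft.fft(z)

n = np.arange(N // 2)
Z_exact = np.exp(-np.pi * (n * dsigma)**2)   # analytic transform of the Gaussian
max_err = np.max(np.abs(Z_dft[:N // 2] - Z_exact))
print(max_err)
```

The agreement is good here because the Gaussian is negligible both beyond $|\chi| = D$ and beyond $|\sigma| = \sigma_{\mathrm{Nyq}}$, which are the two conditions the text relies on.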

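To put this into the traditional form of the discrete Fourier transforms shown in Eqs. (2.96a) and
(2.96b) in Chapter 2, just multiply both sides of (5.123a) by $\Delta\sigma$ and use $\Delta\sigma\Delta\chi = N^{-1}$ from
(5.122a) to get

\[ z_m = \sum_{n=0}^{N-1} Z_n\, e^{2\pi i \frac{nm}{N}} \tag{5.124a} \]

and

\[ Z_n = \frac{1}{N} \sum_{m=0}^{N-1} z_m\, e^{-2\pi i \frac{nm}{N}} , \tag{5.124b} \]

where

\[ z_m = z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \tag{5.124c} \]

and

\[ Z_n = \Delta\sigma \cdot Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}}) . \tag{5.124d} \]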
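For readers who use NumPy, the pair (5.124a) and (5.124b) can be written in terms of `np.fft.fft` and `np.fft.ifft`. The function names `forward_dft` and `inverse_dft` are made up for this sketch; the only thing to watch is the placement of the $1/N$ factor, which NumPy puts on the inverse transform while (5.124b) puts it on the forward one.

```python
import numpy as np

def forward_dft(z):
    """Eq. (5.124b): Z_n = (1/N) * sum_m z_m * exp(-2*pi*i*n*m/N)."""
    z = np.asarray(z)
    return np.fft.fft(z) / z.size

def inverse_dft(Z):
    """Eq. (5.124a): z_m = sum_n Z_n * exp(+2*pi*i*n*m/N)."""
    Z = np.asarray(Z)
    return np.fft.ifft(Z) * Z.size

rng = np.random.default_rng(0)
z = rng.standard_normal(16)          # arbitrary test samples
Z = forward_dft(z)

# Check one coefficient against the explicit sum in Eq. (5.124b).
N = z.size
n = 3
m = np.arange(N)
Z3_direct = np.sum(z * np.exp(-2j * np.pi * n * m / N)) / N

print(np.allclose(inverse_dft(Z), z), np.isclose(Z[n], Z3_direct))
```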

It is important to remember, when using the discrete Fourier transforms defined in (5.124a)–(5.124d)
to approximate the integral Fourier transforms in (5.114a) and (5.114b), that functions
$z^{[\infty]}_{\mathrm{trunc}}$ and $Z^{[\infty]}_{\mathrm{eff,trunc}}$ are qualitatively different from the truncated interferogram signal $z_{\mathrm{trunc}}$ and its
associated spectrum $Z_{\mathrm{eff,trunc}}$ with which we began, because functions $z^{[\infty]}_{\mathrm{trunc}}$ and $Z^{[\infty]}_{\mathrm{eff,trunc}}$, unlike $z_{\mathrm{trunc}}$
and $Z_{\mathrm{eff,trunc}}$, are periodic with periods of $2D$ and $2\sigma_{\mathrm{Nyq}}$ respectively. We also note that the
unapodized spectral resolution $\Delta\sigma$ given in Eq. (5.67) above is, when using the discrete Fourier
transform, the same as the distance between spectral samples given by Eq. (5.119),

\[ \Delta\sigma = \frac{1}{2D} . \]

Consequently, the unapodized spectral resolution can be defined very simply and exactly as the
distance between adjacent spectral samples after the discrete Fourier transform is applied to the
sampled interferogram signal. This is one reason why the unapodized spectral resolution has
become such a widespread figure of merit for resolution in Fourier-transform spectroscopy:
when discrete Fourier transforms are used to approximate integral Fourier transforms, it sets an
easily understood limit on how much spectral detail we can hope to resolve.

5.25 Undersampling the Interferogram


When oversampling the interferogram signal in the previous section, we take advantage of the
way $Z_{\mathrm{eff,trunc}}(\sigma)$ becomes negligibly small for $|\sigma| > \sigma_{\max}$ in order to avoid overlapping (or aliasing)
of the spectrum. We do this by requiring that $\sigma_{\mathrm{Nyq}}$ end up well to the right of $\sigma_{\max}$ in Fig. 5.43
when creating the periodic function

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}}) = \sum_{k=-\infty}^{\infty} Z_{\mathrm{eff,trunc}}(\sigma - 2k\sigma_{\mathrm{Nyq}}) \tag{5.125a} \]

in Eq. (5.113c) above. Consequently $\Delta\chi$, the optical-path difference between adjacent samples of
the interferogram signal $z_{\mathrm{trunc}}(\chi)$, must be chosen small enough that, according to formula (5.112),

\[ \sigma_{\mathrm{Nyq}} = \frac{1}{2\Delta\chi} \tag{5.125b} \]

is decidedly larger than $\sigma_{\max}$. Since $Z_{\mathrm{eff,trunc}}(\sigma)$ also becomes negligibly small for $|\sigma| < \sigma_{\min}$, we may
also be able to follow the strategy outlined in Sec. 2.23 of Chapter 2 and undersample the
interferogram signal instead. We now review how undersampling works, explaining in more
detail how to set up the appropriate discrete Fourier transform for an undersampled interferogram
signal.
The first step in undersampling an interferogram signal is to compare the wavenumber
interval $(\sigma_{\max} - \sigma_{\min})$ to $\sigma_{\min}$ to see how many aliases of the original spectrum can be fit between
$\sigma = 0$ and $\sigma = \sigma_{\min}$. For the spectrum in Fig. 5.45, we could choose the undersampled Nyquist
wavenumber $\sigma^{(u)}_{\mathrm{Nyq}}$ small enough to fit in as many as two aliases, as shown by the dashed curves;
but we decide to be conservative and only fit in one, as shown in Fig. 5.46. This conservative
strategy is called undersampling by a factor of 2.
When undersampling by a factor of 2, the old Nyquist frequency $\sigma_{\mathrm{Nyq}}$ and the new Nyquist
frequency $\sigma^{(u)}_{\mathrm{Nyq}}$ are related by

\[ \sigma_{\mathrm{Nyq}} = 2\sigma^{(u)}_{\mathrm{Nyq}} . \tag{5.126} \]

Just like

\[ 2\sigma_{\mathrm{Nyq}} = \frac{1}{\Delta\chi} \tag{5.127a} \]

for the old Nyquist frequency and the old sampling interval in Eq. (5.112), we associate with
$\sigma^{(u)}_{\mathrm{Nyq}}$ a new sampling interval $\Delta\chi^{(u)}$ such that

\[ 2\sigma^{(u)}_{\mathrm{Nyq}} = \frac{1}{\Delta\chi^{(u)}} . \tag{5.127b} \]

For Eqs. (5.126), (5.127a), and (5.127b) to be true, we must have

\[ \Delta\chi^{(u)} = 2\Delta\chi . \tag{5.127c} \]

This is, of course, why what we are doing is called undersampling by a factor of 2; according to
(5.127c), the interferogram signal is to be sampled half as often as before.
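In the previous section, we found the sampled interferogram signal $z^{[\infty]}_{\mathrm{trunc}}$ could be written as
[see Eq. (5.123b)]

\[ z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \cong \Delta\sigma \sum_{n=0}^{N-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)} . \tag{5.128} \]

FIGURE 5.45. Spectrum with two aliases (dashed curves) fitted between $\sigma = 0$ and $\sigma = \sigma_{\min}$; the wavenumber axis is marked at $\pm\sigma^{(u)}_{\mathrm{Nyq}}$, $\pm 2\sigma^{(u)}_{\mathrm{Nyq}}$, and $\pm 3\sigma^{(u)}_{\mathrm{Nyq}} = \pm\sigma_{\mathrm{Nyq}}$.

Note that here $\Delta\chi$, $\Delta\sigma$, $\sigma_{\mathrm{Nyq}}$, and $N$ all retain the old oversampled values specified in the previous
section. Assuming that the number of samples $N$ is large, we see that as the index $n$ goes from
zero to $N - 1$, the wavenumber argument $n\Delta\sigma$ of $Z^{[\infty]}_{\mathrm{eff,trunc}}$ goes from zero to

\[ (N-1)\Delta\sigma \cong N\Delta\sigma = \frac{1}{\Delta\sigma\Delta\chi} \cdot \Delta\sigma = \frac{1}{\Delta\chi} = 2\sigma_{\mathrm{Nyq}} . \]

Here both $N = (\Delta\sigma\Delta\chi)^{-1}$ from formula (5.122a) and $2\sigma_{\mathrm{Nyq}} = (\Delta\chi)^{-1}$ from formula (5.127a) are
used to get the final result. Since

\[ (N-1)\Delta\sigma \cong 2\sigma_{\mathrm{Nyq}} , \]

we see that the sum over $Z^{[\infty]}_{\mathrm{eff,trunc}}$ in Eq. (5.128) is over the original oversampled spectrum between
$\sigma = 0$ and $\sigma = \sigma_{\mathrm{Nyq}}$ and one of its aliases between $\sigma = \sigma_{\mathrm{Nyq}}$ and $\sigma = 2\sigma_{\mathrm{Nyq}}$. Suppose the old
Nyquist wavenumber in Eq. (5.128) is replaced by $\sigma^{(u)}_{\mathrm{Nyq}}$, half the old Nyquist value, to get

\[ z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \overset{?}{\cong} \Delta\sigma \sum_{n=0}^{N-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)} . \tag{5.129} \]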

FIGURE 5.46. Spectrum and its aliases on the wavenumber axis, which is marked at $-\sigma^{(u)}_{\mathrm{Nyq}}$, $\sigma^{(u)}_{\mathrm{Nyq}}$, $3\sigma^{(u)}_{\mathrm{Nyq}}$, $-2\sigma^{(u)}_{\mathrm{Nyq}} = -\sigma_{\mathrm{Nyq}}$, $2\sigma^{(u)}_{\mathrm{Nyq}} = \sigma_{\mathrm{Nyq}}$, and $4\sigma^{(u)}_{\mathrm{Nyq}} = 2\sigma_{\mathrm{Nyq}}$.
Solid lines show the position of the original spectrum on the wavenumber axis, and the unshaded
dashed lines show the aliases associated with the original Nyquist wavenumber $\sigma_{\mathrm{Nyq}}$. The shaded
dashed lines show the aliases produced by undersampling. They are associated with the
undersampled Nyquist wavenumber $\sigma^{(u)}_{\mathrm{Nyq}}$.

Figure 5.46 shows that the new spectrum $Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})$ has twice as many aliases as the
original spectrum $Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}})$. Comparing the new spectrum in Fig. 5.46 to the original
spectrum in Fig. 5.44, we see that the sum in (5.129) covers two extra aliases in Fig. 5.46 that it
did not cover in Fig. 5.44. The wavenumbers where $Z^{[\infty]}_{\mathrm{eff,trunc}}$ is zero do not, of course, contribute
anything to the sum. Let's see what happens when we eliminate two of the aliases by taking the
sum over the new spectrum only up to the new, rather than the old, Nyquist wavenumber.
According to the discussion following Eq. (5.103a), $N$ is even, which means that formula (5.129)
can now be written as

\[ z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \overset{?}{\cong} \Delta\sigma \sum_{n=0}^{(N/2)-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)} . \tag{5.130} \]

This eliminates altogether the alias between $3\sigma^{(u)}_{\mathrm{Nyq}}$ and $4\sigma^{(u)}_{\mathrm{Nyq}}$ in Fig. 5.46, which was not part of
the original sum, and replaces the alias between $2\sigma^{(u)}_{\mathrm{Nyq}}$ and $3\sigma^{(u)}_{\mathrm{Nyq}}$, which was part of the original
sum, with the alias between zero and $\sigma^{(u)}_{\mathrm{Nyq}}$. The alias between zero and $\sigma^{(u)}_{\mathrm{Nyq}}$ is an exact copy of
the alias between $2\sigma^{(u)}_{\mathrm{Nyq}}$ and $3\sigma^{(u)}_{\mathrm{Nyq}}$, and these two aliases are separated by a wavenumber interval
[see Eqs. (5.126), (5.127a), and (5.122a)]

\[ 2\sigma^{(u)}_{\mathrm{Nyq}} = \sigma_{\mathrm{Nyq}} = \frac{1}{2} \cdot \frac{1}{\Delta\chi} = \frac{N}{2}\,\Delta\sigma . \]

Consequently, we can write that

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}\!\left( \left(n - \frac{N}{2}\right)\Delta\sigma,\; 2\sigma^{(u)}_{\mathrm{Nyq}} \right) = Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \tag{5.131a} \]

when comparing spectral values in the alias between zero and $\sigma^{(u)}_{\mathrm{Nyq}}$ to spectral values in the alias
between $2\sigma^{(u)}_{\mathrm{Nyq}}$ and $3\sigma^{(u)}_{\mathrm{Nyq}}$. As far as the complex exponent multiplying $Z^{[\infty]}_{\mathrm{eff,trunc}}$ is concerned, we
have, according to (5.122a), that

\[
\begin{aligned}
e^{2\pi i [n - (N/2)] \Delta\sigma \cdot m\Delta\chi}
&= e^{2\pi i (n\Delta\sigma)(m\Delta\chi)}\, e^{-2\pi i m (N/2)(\Delta\sigma\Delta\chi)} \\
&= e^{2\pi i (n\Delta\sigma)(m\Delta\chi)}\, e^{-2\pi i m (N/2) \cdot (1/N)} \\
&= e^{2\pi i (n\Delta\sigma)(m\Delta\chi)}\, e^{-2\pi i (m/2)} .
\end{aligned}
\]

Hence, whenever $m$ is even, we have

\[ e^{-2\pi i (m/2)} = 1 \]

so that

\[ e^{2\pi i [n - (N/2)] \Delta\sigma \cdot m\Delta\chi} = e^{2\pi i (n\Delta\sigma)(m\Delta\chi)} . \]

Suppose we add a subscript 2 to $m$ to show that it must be a non-negative and even integer,

\[ m_2 = 0, 2, 4, \ldots . \tag{5.131b} \]

This lets us write the latest result as

\[ e^{2\pi i [n - (N/2)] \Delta\sigma \cdot m_2\Delta\chi} = e^{2\pi i (n\Delta\sigma)(m_2\Delta\chi)} . \tag{5.131c} \]

Consequently, we can combine (5.131a) and (5.131c) to get

\[
Z^{[\infty]}_{\mathrm{eff,trunc}}\!\left( \left(n - \frac{N}{2}\right)\Delta\sigma,\; 2\sigma^{(u)}_{\mathrm{Nyq}} \right) e^{2\pi i [n - (N/2)] \Delta\sigma \cdot m_2\Delta\chi}
= Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i (n\Delta\sigma)(m_2\Delta\chi)} .
\tag{5.131d}
\]

This shows that, whenever $m = m_2$ is a non-negative even integer, each term in the original sum
over the alias in Fig. 5.46 between $2\sigma^{(u)}_{\mathrm{Nyq}}$ and $3\sigma^{(u)}_{\mathrm{Nyq}}$ is the same as the corresponding term in the
new sum over the alias in Fig. 5.46 between zero and $\sigma^{(u)}_{\mathrm{Nyq}}$. Therefore, whenever the $\Delta\chi$ index is a
non-negative even integer, we can remove the question mark from formula (5.130) and write

\[ z^{[\infty]}_{\mathrm{trunc}}(m_2\Delta\chi, 2D) \cong \Delta\sigma \sum_{n=0}^{(N/2)-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i (n\Delta\sigma)(m_2\Delta\chi)} , \]

where we have replaced $m$ by $m_2$ on the left-hand side to honor the restriction placed on the
permitted values of the $\Delta\chi$ index. If we define an undersampled value of $N$,

\[ N^{(u)} = N/2 , \tag{5.132a} \]

then the formula becomes

\[ z^{[\infty]}_{\mathrm{trunc}}(m_2\Delta\chi, 2D) \cong \Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i (n\Delta\sigma)(m_2\Delta\chi)} . \tag{5.132b} \]

We note that the $m_2$ sequence of non-negative even integers can be written as

\[ m_2 = 2m \quad \text{for } m = 0, 1, 2, \ldots . \]

Hence, using that $\Delta\chi^{(u)} = 2\Delta\chi$ from Eq. (5.127c), we see that

\[ m_2\Delta\chi = 2m\Delta\chi = m\Delta\chi^{(u)} . \]

Equation (5.132b) can now be written as

\[ z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D) \cong \Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i (n\Delta\sigma)(m\Delta\chi^{(u)})} . \tag{5.132c} \]

This gives one of the two formulas for the discrete Fourier transform of the undersampled
interferogram signal.

To get the other formula, we multiply both sides of (5.132c) by

\[ e^{-2\pi i (n''\Delta\sigma)(m\Delta\chi^{(u)})} \]

and sum over $m$. This gives

\[
\sum_{m=0}^{N^{(u)}-1} e^{-2\pi i (n''\Delta\sigma)(m\Delta\chi^{(u)})}\, z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)
\cong \Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \sum_{m=0}^{N^{(u)}-1} e^{2\pi i (n - n'')\Delta\sigma (m\Delta\chi^{(u)})} .
\tag{5.133}
\]

We note, using Eqs. (5.127c) and (5.122a), that

\[ \Delta\chi^{(u)}\Delta\sigma = 2\Delta\chi\Delta\sigma = \frac{2}{N} = \frac{1}{N^{(u)}} \tag{5.134a} \]

with the last step using definition (5.132a). Therefore,

\[
\sum_{m=0}^{N^{(u)}-1} e^{2\pi i (m\Delta\chi^{(u)})(n - n'')\Delta\sigma}
= \sum_{m=0}^{N^{(u)}-1} e^{2\pi i m (n - n'')/N^{(u)}}
= \sum_{m=0}^{N^{(u)}-1} \left( w_{N^{(u)}} \right)^{m(n - n'')}
\]

with $w_{N^{(u)}}$ given by Eq. (2.94a) of Chapter 2 as

\[ w_{N^{(u)}} = e^{2\pi i / N^{(u)}} . \]

According to Eqs. (2.94d) and (2.94g) of Chapter 2,

\[ \sum_{m=0}^{N^{(u)}-1} \left( w_{N^{(u)}} \right)^{m(n - n'')} = N^{(u)} \delta_{n'',n} , \]

and so

\[ \sum_{m=0}^{N^{(u)}-1} e^{2\pi i (m\Delta\chi^{(u)})(n - n'')\Delta\sigma} = N^{(u)} \delta_{n'',n} . \tag{5.134b} \]
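The orthogonality relation behind (5.134b) is easy to check numerically. The sketch below evaluates the sum over powers of $w_{N^{(u)}}$ for indices $n$ and $n''$ in the range $0, \ldots, N^{(u)} - 1$; the function name `root_of_unity_sum` and the value $N^{(u)} = 16$ are assumptions made for the example.

```python
import numpy as np

def root_of_unity_sum(N_u, n, n_pp):
    """Sum over m = 0..N_u-1 of exp(2*pi*i*m*(n - n_pp)/N_u), the left side of (5.134b)."""
    m = np.arange(N_u)
    return np.sum(np.exp(2j * np.pi * m * (n - n_pp) / N_u))

N_u = 16  # hypothetical number of undersampled points

# Build the full table of sums for 0 <= n, n'' < N_u and compare with N_u * delta.
table = np.array([[root_of_unity_sum(N_u, n, n_pp) for n in range(N_u)]
                  for n_pp in range(N_u)])
print(np.allclose(table, N_u * np.eye(N_u)))
```

The sum equals $N^{(u)}$ when $n = n''$ and vanishes otherwise, which is what allows the double sum in (5.133) to collapse to a single term.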

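Substitution of (5.134b) into (5.133) gives

\[
\sum_{m=0}^{N^{(u)}-1} e^{-2\pi i (n''\Delta\sigma)(m\Delta\chi^{(u)})}\, z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)
\cong N^{(u)}\Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, \delta_{n'',n}
\]

or

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}(n''\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \cong \frac{1}{N^{(u)}\Delta\sigma} \sum_{m=0}^{N^{(u)}-1} z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i (n''\Delta\sigma)(m\Delta\chi^{(u)})} . \tag{5.135a} \]

Dropping the primes from $n$ and using formula (5.134a) to write

\[ \frac{1}{N^{(u)}\Delta\sigma} = \Delta\chi^{(u)} , \tag{5.135b} \]

Eq. (5.135a) becomes

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \cong \Delta\chi^{(u)} \sum_{m=0}^{N^{(u)}-1} z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i (n\Delta\sigma)(m\Delta\chi^{(u)})} . \tag{5.135c} \]

Having now found the second formula for the discrete Fourier transform of the undersampled
signal, we gather together Eqs. (5.132c) and (5.135c) to write

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \cong \Delta\chi^{(u)} \sum_{m=0}^{N^{(u)}-1} z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i (n\Delta\sigma)(m\Delta\chi^{(u)})} \tag{5.136a} \]

and

\[ z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D) \cong \Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i (n\Delta\sigma)(m\Delta\chi^{(u)})} . \tag{5.136b} \]

This pair of equations has the exact same form as the pair of equations specifying the discrete
Fourier transform for the oversampled signal in Eqs. (5.123a) and (5.123b), with

\[ \Delta\chi \rightarrow \Delta\chi^{(u)} , \]

\[ N \rightarrow N^{(u)} , \]

and

\[ \sigma_{\mathrm{Nyq}} \rightarrow \sigma^{(u)}_{\mathrm{Nyq}} . \]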

This shows we can sample the interferogram signal $z_{\mathrm{trunc}}$ with double the sampling interval used
in the previous section (that is, undersample by a factor of 2) and plug the resulting $z_{\mathrm{trunc}}$ values
into formula (5.136a) to get

\[ Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) . \]

Knowing that the wavenumber interval $\Delta\sigma$ has not changed from what it was before, and that
the aliases of $Z_{\mathrm{eff,trunc}}$ do not overlap when undersampled by a factor of 2, we can now use the
correspondences shown in Fig. 5.46 to extract the true spectral values between $-2\sigma^{(u)}_{\mathrm{Nyq}}$ and
$-\sigma^{(u)}_{\mathrm{Nyq}}$, and between $\sigma^{(u)}_{\mathrm{Nyq}}$ and $2\sigma^{(u)}_{\mathrm{Nyq}}$.
When oversampling the interferogram signal $z_{\mathrm{trunc}}$ in the previous section, $N$ interferogram
samples are used to find the spectrum $Z_{\mathrm{eff,trunc}}$; and in this section, when undersampling the
interferogram signal by a factor of 2, only $N^{(u)} = N/2$ interferogram samples are needed to get
the same information. When the spectrum to be measured is narrow enough for this sort of
undersampling to make sense, it can lead to significant savings in data storage and calculation
time for the discrete Fourier transform. The drawback is, as shown by the discussion at the end of
Sec. 6.22 of Chapter 6, that we may end up with an increased level of low-frequency noise in the
measured spectrum.
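The factor-of-2 undersampling argument can be illustrated with a small numerical experiment. The sketch below is not taken from the text: it assumes a synthetic real signal built from two sinusoids whose DFT bins (20 and 27 out of $N = 64$) lie between $\sigma^{(u)}_{\mathrm{Nyq}}$ (bin $N/4$) and $\sigma_{\mathrm{Nyq}}$ (bin $N/2$), keeps only the even-indexed samples, and compares the $N/2$-point transform of Eq. (5.136a) with the original $N$-point transform, using the normalization of Eq. (5.124b).

```python
import numpy as np

N = 64
m = np.arange(N)
# Hypothetical band-limited signal: spectral content only in bins 20 and 27,
# i.e. strictly between N/4 and N/2.
z = np.cos(2 * np.pi * 20 * m / N) + 0.5 * np.sin(2 * np.pi * 27 * m / N + 0.3)

Z_over = np.fft.fft(z) / N           # oversampled coefficients Z_n, N samples

N_u = N // 2                         # N^(u) = N/2, Eq. (5.132a)
z_under = z[::2]                     # even-indexed samples: spacing 2*dchi, Eq. (5.127c)
Z_under = np.fft.fft(z_under) / N_u  # undersampled coefficients, same bin spacing dsigma

# In general the undersampled result is the sum of two aliases of the original.
alias_sum = Z_over[:N_u] + Z_over[N_u:]

# Inside the occupied band the second alias is empty, so the true values survive.
band = np.arange(N // 4 + 1, N // 2)
print(np.allclose(Z_under, alias_sum), np.allclose(Z_under[band], Z_over[band]))
```

Half as many samples reproduce the spectral values in the band because the extra alias that undersampling folds onto it carries no signal there; a spectrum that spilled below bin $N/4$ would break this.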

5.26 Off-Center Sampling of the Interferogram Signal


When analyzing the sampled interferogram signal in Secs. 5.22–5.25, we said the interferogram
samples occurred at optical-path differences $m\Delta\chi$ for

\[ m = -\frac{N}{2} + 1,\; -\frac{N}{2} + 2,\; \ldots,\; -1,\; 0,\; 1,\; \ldots,\; \frac{N}{2} - 1,\; \frac{N}{2} \]

(see footnote 93 above). The problem with this is that it assumes the interferogram signal is
sampled at exactly $\chi = 0$ when $m = 0$, as shown in Fig. 5.47(a). In practice, however, it is very
hard to sample the interferogram at exactly $\chi = 0$; often the sample nearest $\chi = 0$ is located a
large fraction of a sampling interval away from $\chi = 0$. We call this fraction of a sampling
interval $\alpha$, with

\[ -\frac{1}{2} \le \alpha \le \frac{1}{2} , \]

which means that the peak of $z_{\mathrm{trunc}}$ is located at $\chi = \alpha\Delta\chi$, as shown in Fig. 5.47(b).
Mathematically, this can be regarded as a displacement of $z_{\mathrm{trunc}}(\chi)$, the interferogram signal as
defined above in Eq. (5.106a), along the $\chi$ axis by a distance $\alpha\Delta\chi$. The displaced interferogram
signal can be written as

\[ z^{(\alpha)}_{\mathrm{trunc}}(\chi) = z_{\mathrm{trunc}}(\chi - \alpha\Delta\chi) . \tag{5.137a} \]

FIGURE 5.47(a).

FIGURE 5.47(b). The offset $\alpha\Delta\chi$ is marked on the $\chi$ axis.

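Glancing back at Eq. (5.108a), we see that the new effective signal spectrum is

\[ Z^{(\alpha)}_{\mathrm{eff,trunc}}(\sigma) = \int_{-\infty}^{\infty} z^{(\alpha)}_{\mathrm{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \int_{-\infty}^{\infty} z_{\mathrm{trunc}}(\chi - \alpha\Delta\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \tag{5.137b} \]

Transforming the variable of integration to $\chi' = \chi - \alpha\Delta\chi$ gives

\[ Z^{(\alpha)}_{\mathrm{eff,trunc}}(\sigma) = \int_{-\infty}^{\infty} z_{\mathrm{trunc}}(\chi')\, e^{-2\pi i\sigma\chi'}\, e^{-2\pi i\sigma\alpha\Delta\chi}\, d\chi' = e^{-2\pi i\sigma\alpha\Delta\chi} \int_{-\infty}^{\infty} z_{\mathrm{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi , \]

where in the last step the prime is dropped from the variable of integration. Substituting from Eq.
(5.108a), we see that

\[ Z^{(\alpha)}_{\mathrm{eff,trunc}}(\sigma) = e^{-2\pi i\sigma\alpha\Delta\chi} \cdot Z_{\mathrm{eff,trunc}}(\sigma) . \tag{5.137c} \]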
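A discrete check of (5.137c) is sketched below. It assumes a synthetic band-limited test signal made of three cosine harmonics of $\Delta\sigma$ (amplitudes, $N = 64$, $\Delta\chi = 0.01$, and $\alpha = 0.3$ are all hypothetical choices), for which sampling at $m\Delta\chi - \alpha\Delta\chi$ multiplies each DFT coefficient by exactly the phase factor $e^{-2\pi i\sigma\alpha\Delta\chi}$.

```python
import numpy as np

N = 64
dchi = 0.01                    # sampling interval (hypothetical units)
alpha = 0.3                    # fractional sampling offset, |alpha| <= 1/2
dsigma = 1.0 / (N * dchi)      # spectral sample spacing, from dsigma*dchi = 1/N

amps = {3: 1.0, 7: 0.5, 12: 0.25}   # harmonic index -> amplitude (assumed test data)

def z(chi):
    """Band-limited periodic test signal with all harmonics below the Nyquist bin."""
    return sum(a * np.cos(2 * np.pi * k * dsigma * chi) for k, a in amps.items())

m = np.arange(N)
Z_centered = np.fft.fft(z(m * dchi)) / N                  # samples of z(chi)
Z_shifted = np.fft.fft(z(m * dchi - alpha * dchi)) / N    # samples of z(chi - alpha*dchi)

sigma = np.fft.fftfreq(N, d=dchi)                 # signed wavenumbers n*dsigma
phase = np.exp(-2j * np.pi * sigma * alpha * dchi)  # factor in Eq. (5.137c)

print(np.allclose(Z_shifted, phase * Z_centered))
```

The magnitudes of the coefficients are unchanged by the offset; only their phases rotate, and slowly, since $|\alpha\Delta\chi|$ is at most half a sampling interval.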

Since $\alpha\Delta\chi$ is a small quantity, the effect of shifting $z_{\mathrm{trunc}}$ by a distance $\alpha\Delta\chi$ along the $\chi$ axis is
to multiply the original signal spectrum $Z_{\mathrm{eff,trunc}}$ by a slowly varying, complex function of $\sigma$. There
is nothing profound about this result; it is just an example of the Fourier shift theorem given in
Eq. (2.36a) of Chapter 2. From the discussion following Eq. (5.95c) above, we know that
multiplying the original effective signal spectrum by another complex function of $\sigma$ does not
change the way the calibration procedure extracts the desired spectral radiance $L(\sigma)$, as long as
the complex function does not change after the instrument is calibrated. Hence, as long as $\alpha$ is a
true constant, having the same value each time the moving mirror scans through its range of
motion, the extra factor of $e^{-2\pi i\sigma\alpha\Delta\chi}$ in Eq. (5.137c) can be removed by calibration. Since $e^{-2\pi i\sigma\alpha\Delta\chi}$
is a slowly varying function of $\sigma$, we can even, as described in footnote 88 above, use the type of
single-sided system discussed in Sec. 5.18 to measure the spectral radiance $L(\sigma)$.

__________

Interferometer systems nonrandomly distort their spectral measurements in characteristic
ways. Background radiances and complex modulations can be removed by calibration, and
careful assembly can minimize the effects of nonflat optical surfaces and static misalignments.
We can enjoy without any reservations the high-resolution advantages of single-sided
interferogram systems when double-sided measurements confirm that the $e^{i\psi}$ phase term
[introduced in Eq. (5.84a)] is a slowly varying function of wavenumber $\sigma$. There is no way,
however, to avoid the nonrandom distortions and errors introduced by the finite interferogram
length and finite field of view of practical interferometer systems. The effect of the $\Delta\Omega$ finite
field of view is to blur the true spectral radiance $L$ while at the same time shrinking the
wavenumber axis by a factor of $(1 + \Delta\Omega/(4\pi))$. This blurred and shifted spectral radiance is
called $L_{\mathrm{FOV}}$. The effect of the finite interferogram length is to blur by convolution with a sinc
function, as shown in Eq. (5.108d) above. When the spectral radiance is distorted both by the
interferometer's finite field of view and by its finite interferogram length, we call it $L_{\mathrm{mnf}}$. We plan
to keep track of these distinctions between true radiances, FOV radiances, and mnf radiances
when the random spectral errors produced by detector noise, misalignment noise, and sampling
noise are discussed in the next three chapters.

Appendix 5A
The detector circuit of a Fourier-transform spectrometer is a time-invariant linear system. If $g(t)$
is the input signal as a function of time going into the linear system, then the output signal $k(t)$
can always be written as

\[ k(t) = \int_{-\infty}^{\infty} g(t')\, h(t - t')\, dt' , \tag{5A.1a} \]

where $h(t)$ is a continuous function of time specifying how the input signal is modified by passing
through the circuit. The explicit limits on the integral expression for output $k(t)$ are $+\infty$ and $-\infty$,
but in practice we always assume that the input signal $g(t)$ is time limited, with the true limits on
the integral being set by the finite range of $t$ over which $g(t)$ is not zero. Function $h(t)$ is often
called the impulse-response function of the linear circuit, because when the input is a delta
function impulse (see Sec. 2.14 of Chapter 2),

\[ g(t) = \delta(t) , \]

then the output is $h(t)$:

\[ k(t) = \int_{-\infty}^{\infty} \delta(t')\, h(t - t')\, dt' = h(t) . \tag{5A.1b} \]

As a general rule, we expect $h(t)$ to be a much narrower function of time than $g(t)$, which means
that output $k(t)$ can be regarded as just a slightly blurred and distorted version of the input $g(t)$.
According to Eq. (2.38a) of Chapter 2, Eq. (5A.1a) states that output $k$ is the convolution of $h$
and $g$,

\[ k(t) = g(t) * h(t) . \tag{5A.2a} \]

The convolution is a linear operation, so when the input signal is the linear combination of two
functions $g_1$ and $g_2$, with

\[ g(t) = \alpha g_1(t) + \beta g_2(t) \]

for two real constants $\alpha$ and $\beta$, then the resulting output is $\alpha$ multiplied by the output that would
occur if only $g_1$ were present plus $\beta$ multiplied by the output that would occur if only $g_2$ were
present [see Eq. (2.38e) in Chapter 2],

\[ k(t) = [\alpha g_1(t) + \beta g_2(t)] * h(t) = \alpha [g_1(t) * h(t)] + \beta [g_2(t) * h(t)] . \tag{5A.2b} \]

Therefore, if we know the output of the circuit for input $g_1$ and the output of the circuit for input
$g_2$, we know at once the output of the circuit for an input $[\alpha g_1(t) + \beta g_2(t)]$. In particular, taking
$\alpha = \beta = 1$ in (5A.2b), we see that if the output of the circuit for an input $g_1(t)$ is

\[ g_1^{(\mathrm{out})}(t) = g_1(t) * h(t) , \]

and the output of the circuit for an input $g_2(t)$ is

\[ g_2^{(\mathrm{out})}(t) = g_2(t) * h(t) , \]

then the output of the circuit when the input is

\[ g_1(t) + g_2(t) \]

must be the sum of the individual signals' outputs,

\[ g_1^{(\mathrm{out})}(t) + g_2^{(\mathrm{out})}(t) . \]

Glancing back to Eq. (2.40a) in Chapter 2 and the discussion following it, we note that the input
signal $g(t)$ in Eq. (5A.2a) plays the role of $u(t)$ in (2.40a), that the output signal $k(t)$ plays the role
of $u_{e,\mathrm{blur}}(t)$ in (2.40a), and that the impulse-response function $h(t)$ plays the role of the
instrument-response function $v_e(t)$ in (2.40a). In fact we already know from the discussion
following Eq. (2.40a) that the correct way to handle Eq. (5A.2a) is to take the Fourier transform
of both sides and then apply the Fourier convolution theorem [see Eq. (2.39b) of Chapter 2] to get

\[ K(f) = G(f) \cdot H(f) , \tag{5A.3a} \]

where

\[ K(f) = \int_{-\infty}^{\infty} k(t)\, e^{-2\pi i f t}\, dt , \tag{5A.3b} \]

\[ G(f) = \int_{-\infty}^{\infty} g(t)\, e^{-2\pi i f t}\, dt , \tag{5A.3c} \]

and

\[ H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt . \tag{5A.3d} \]

The Fourier transform $H(f)$ of the impulse-response function is often called the transfer
function of the linear circuit. The formula shown in Eq. (5A.3a) is often the easiest way to find
the output $k(t)$ corresponding to a given input $g(t)$. We first calculate $G(f)$, the Fourier transform
of input $g(t)$, then multiply $G(f)$ by the transfer function $H(f)$ to get $K(f)$, the Fourier transform
of the output. Having found $K(f)$, we then take the inverse Fourier transform to get output $k(t)$,

\[ k(t) = \int_{-\infty}^{\infty} K(f)\, e^{2\pi i f t}\, df . \tag{5A.4} \]
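The recipe just described (transform the input, multiply by the transfer function, transform back) can be compared with direct convolution on sampled data. The sketch below is illustrative only: the Gaussian input pulse and the exponentially decaying impulse response with time constant `tau` are hypothetical stand-ins, not the detector circuit of this chapter, and the integrals are approximated by sums with step `dt`.

```python
import numpy as np

dt = 1e-3                          # time step in seconds (assumed)
t = np.arange(1000) * dt           # time axis starting at t = 0

g = np.exp(-((t - 0.3) / 0.05) ** 2)   # hypothetical input pulse g(t)
tau = 0.01                             # hypothetical circuit time constant
h = np.exp(-t / tau) / tau             # causal impulse response, defined for t >= 0 only

# Direct route, Eq. (5A.1a): k(t) = integral of g(t') h(t - t') dt'.
k_direct = np.convolve(g, h)[: t.size] * dt

# Fourier route, Eqs. (5A.3a) and (5A.4): K = G * H, then invert.
# Zero-padding to twice the length avoids circular wraparound in the DFT.
L = 2 * t.size
K = np.fft.fft(g, L) * np.fft.fft(h, L)
k_fourier = np.real(np.fft.ifft(K))[: t.size] * dt

print(np.max(np.abs(k_direct - k_fourier)))
```

For long records the Fourier route is usually the cheaper of the two, which is one practical reason the transfer-function form is preferred.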

Although the impulse-response function $h(t)$ in Eq. (5A.2a) plays the same role as the instrument-
response function $v_e(t)$ in Eq. (2.40a) of Chapter 2, there is one important difference. A linear
circuit is a causal system, which means that its output signal $k(t)$ cannot start happening before
the input signal $g(t)$ occurs. Consequently, the circuit's impulse-response function $h(t)$ must
satisfy the restriction

\[ h(t) = 0 \quad \text{for } t < 0 . \tag{5A.5} \]

Suppose, for example, we supply a delta function at $t = 0$, the impulse signal $g(t) = \delta(t)$, for the
circuit's input. Then, according to Eq. (5A.1b), the circuit's output is

\[ k(t) = h(t) ; \]

and if, for some $t < 0$, we have $h(t) \ne 0$, then there will be some part of the circuit's output
signal being produced at $t < 0$ before its cause, the input delta function at $t = 0$, has occurred.
This is why the impulse-response function of a causal linear system, unlike $v_e(t)$ in Fig. 2.5(f) of
Chapter 2, must satisfy (5A.5).
Because $h(t)$ is a nonzero function that must nevertheless be zero for negative values of $t$, it
cannot be an even function,95

\[ h(-t) \ne h(t) . \tag{5A.6a} \]

The transfer function $H(f)$ is, according to Eq. (5A.3d), the Fourier transform of the real impulse-
response function $h(t)$, which means, according to entry 7 of Table 2.1 of Chapter 2, that $H$ is a
Hermitian function of $f$,

\[ H(-f) = H(f)^{*} . \tag{5A.6b} \]

If $H$ were a real function of $f$, then it would also need to be even in order to satisfy (5A.6b).
According to entry 1 of Table 2.1, however, function $H(f)$ can be both real and even only when
$h(t)$ is both real and even. Since, according to (5A.6a), we know that $h(t)$ is not even, we conclude
that $H$, although Hermitian, cannot be real. We can directly verify this conclusion by using
$e^{i\phi} = \cos\phi + i\sin\phi$ to break the Fourier transform of the real impulse-response function $h(t)$ into
real and imaginary parts,

\[
\begin{aligned}
H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt
&= \int_{-\infty}^{\infty} h(t) \cos(2\pi f t)\, dt - i \int_{-\infty}^{\infty} h(t) \sin(2\pi f t)\, dt \\
&= \int_{0}^{\infty} h(t) \cos(2\pi f t)\, dt - i \int_{0}^{\infty} h(t) \sin(2\pi f t)\, dt .
\end{aligned}
\tag{5A.7a}
\]

The last step here uses the restriction in (5A.5) to limit the sine and cosine integrals to non-
negative values of $t$. Because the sine integral in particular is limited to non-negative values of $t$,
we note that the imaginary part of the transfer function,

\[ \mathrm{Im}[H(f)] = -\int_{0}^{\infty} h(t) \sin(2\pi f t)\, dt , \tag{5A.7b} \]

can be zero for all values of $f$ only if $h(t)$ is zero for all non-negative values of $t$. Since we
already know that $h$ is zero for all negative values of $t$, it follows that $h$ would have to be zero everywhere.
This is an unacceptable impulse-response function, confirming our previous assertion that the
transfer function $H(f)$ of the detector circuit cannot be a real-valued function.

95 See Eq. (2.11a) of Chapter 2 for a definition of what it means to say a function is even.
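Both properties of the transfer function (Hermitian, but not real) can be seen in a small numerical example. The sketch below assumes a simple exponentially decaying causal impulse response, $h(t) = e^{-t/\tau}/\tau$ for $t \ge 0$, chosen only for illustration; its analytic transform under the convention of Eq. (5A.3d) is $1/(1 + 2\pi i f \tau)$, whose imaginary part reaches magnitude $1/2$ at $f = 1/(2\pi\tau)$.

```python
import numpy as np

dt = 1e-4                      # time step in seconds (assumed)
n = 4096
t = np.arange(n) * dt          # samples start at t = 0, so h is causal by construction
tau = 5e-3                     # hypothetical time constant
h = np.exp(-t / tau) / tau     # real, causal impulse response

H = np.fft.fft(h) * dt         # discrete approximation to Eq. (5A.3d)
f = np.fft.fftfreq(n, d=dt)    # signed frequencies matching the entries of H

# H(-f): index -k modulo n picks out the bin at the opposite frequency.
H_neg = H[(-np.arange(n)) % n]
is_hermitian = np.allclose(H_neg, np.conj(H))   # Eq. (5A.6b)

max_imag = np.max(np.abs(H.imag))               # would be ~0 if H were real
print(is_hermitian, max_imag)
```

The imaginary part is far from negligible, in line with the argument built on Eq. (5A.7b).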

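Appendix 5B
This appendix shows how to simplify Eq. (5.76a) in Sec. 5.17 of Chapter 5. We start off with

\[
Z_{\mathrm{eff}}(\sigma) = \int_{-\infty}^{\infty} d\chi\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\sigma' \left[ \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, \mathrm{sinc}\!\left( \frac{\sigma'\chi\,\Delta\Omega}{2} \right) \right] e^{2\pi i\chi\sigma'\left(1 - \frac{\Delta\Omega}{4\pi}\right)}
\tag{5B.1a}
\]

and note that the integral over $d\chi$ can be moved inside to get

\[
\begin{aligned}
Z_{\mathrm{eff}}(\sigma) ={}& \frac{W}{4} \int_{-\infty}^{0} d\sigma' \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \int_{-\infty}^{\infty} d\chi\, \mathrm{sinc}(2\pi\sigma'\chi\alpha)\, e^{-2\pi i\chi[\sigma - \sigma'(1-\alpha)]} \\
&+ \frac{W}{4} \int_{0}^{\infty} d\sigma' \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \int_{-\infty}^{\infty} d\chi\, \mathrm{sinc}(2\pi\sigma'\chi\alpha)\, e^{-2\pi i\chi[\sigma - \sigma'(1-\alpha)]} ,
\end{aligned}
\tag{5B.1b}
\]

where

\[ \alpha = \frac{\Delta\Omega}{4\pi} \tag{5B.1c} \]

and the integral over $d\sigma'$ is, for future convenience, divided into two integrals: one from $-\infty$ to
zero and one from zero to $\infty$. For any reasonable interferometer design, the $\Delta\Omega$ field of view (in
steradians, of course) is small compared to $4\pi$, so

\[ 0 < \alpha \ll 1 . \tag{5B.1d} \]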

From Eq. (5A.6b) in Appendix 5A, we know that the transfer function H(ƒ) is Hermitian,

H ( − f ) = H ( f )∗ . (5B.2a)

From Eq. (5.46d) in Chapter 5, we know that

H(0) = 0 . (5B.2b)

We can write the complex transfer function H(ƒ) as

H ( f ) = Λ ( f )eiυ ( f ) , (5B.2c)

- 731 -
5 · Description of Practical Interferometer Measurements

where both Λ and υ are real functions of ƒ. The same sort of reasoning used to derive Eqs.
(5.86a) and (5.86b) in Chapter 5 can be used here to analyze Λ( f ) and υ ( f ) . Substituting
(5B.2c) into (5B.2a) gives, since both Λ and υ are real,

$$\Lambda(-f)\, e^{i\upsilon(-f)} = \Lambda(f)\, e^{-i\upsilon(f)} .$$

Taking the complex magnitude of both sides shows Λ to be an even function of ƒ,

Λ(− f ) = Λ( f ) . (5B.2d)
To match Eq. (5B.2b), we require
Λ(0) = 0 . (5B.2e)
Now we can write
$$\Lambda(f)\, e^{i\upsilon(-f)} = \Lambda(f)\, e^{-i\upsilon(f)} \quad\text{or}\quad e^{i\upsilon(-f)} = e^{-i\upsilon(f)}$$

and take the complex logarithm of both sides to get

υ (− f ) = −υ ( f ) , (5B.2f)

showing υ to be an odd function of ƒ. Because both Eqs. (5B.2d) and (5B.2e) must be true, we
conclude that not only is Λ( f ) equal to zero at f = 0 but also that the derivative of Λ( f ) with
respect to ƒ is zero at f = 0 . The point of this analysis is revealed when we substitute formula
(5B.2c) into the two integrals on the right-hand side of (5B.1b) to get

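$$\frac{W}{4} \int_{-\infty}^{0} d\sigma'\, \Lambda(u\sigma') \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, e^{i\upsilon(u\sigma')} \right] \int_{-\infty}^{\infty} d\chi\, \operatorname{sinc}(2\pi\sigma'\chi\alpha)\, e^{-2\pi i \chi[\sigma - \sigma'(1-\alpha)]}$$

and

$$\frac{W}{4} \int_{0}^{\infty} d\sigma'\, \Lambda(u\sigma') \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, e^{i\upsilon(u\sigma')} \right] \int_{-\infty}^{\infty} d\chi\, \operatorname{sinc}(2\pi\sigma'\chi\alpha)\, e^{-2\pi i \chi[\sigma - \sigma'(1-\alpha)]} .$$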
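The symmetry properties in Eqs. (5B.2a) through (5B.2f) can be checked numerically. The sketch below is only an illustration: the second-order high-pass response `highpass2` and its corner frequency `fc` are hypothetical stand-ins for an AC-coupled detector circuit and are not taken from Chapter 5. The example confirms that a Hermitian H(ƒ) with H(0) = 0 has an even amplitude Λ and an odd phase υ, and that Λ(ƒ)/ƒ is small near ƒ = 0.

```python
import cmath


def highpass2(f, fc=1.0):
    """Hypothetical second-order high-pass transfer function (assumed form)."""
    x = 1j * f / fc
    return (x / (1 + x)) ** 2


def amplitude(f):
    """Lambda(f) in Eq. (5B.2c): the magnitude of H(f)."""
    return abs(highpass2(f))


def phase(f):
    """upsilon(f) in Eq. (5B.2c): the phase of H(f), in radians."""
    return cmath.phase(highpass2(f))


for f in (0.3, 1.0, 2.5):
    hermitian_gap = abs(highpass2(-f) - highpass2(f).conjugate())
    even_gap = abs(amplitude(-f) - amplitude(f))
    odd_gap = abs(phase(-f) + phase(f))
    print(f, hermitian_gap, even_gap, odd_gap)
```

All three gaps should be at the level of floating-point rounding for this particular filter; a different filter model would need its own check.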

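Changing the variable of integration of the inner integral to χ′ = σ′χ with dχ′ = σ′ dχ, we can
write these two integrals as

$$\frac{W}{4} \int_{-\infty}^{0} \frac{d\sigma'}{\sigma'}\, \Lambda(u\sigma') \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, e^{i\upsilon(u\sigma')} \right] \int_{+\infty}^{-\infty} d\chi'\, \operatorname{sinc}(2\pi\chi'\alpha)\, e^{-2\pi i \chi'[(\sigma/\sigma') - (1-\alpha)]}$$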
and

$$\frac{W}{4} \int_{0}^{\infty} \frac{d\sigma'}{\sigma'}\, \Lambda(u\sigma') \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, e^{i\upsilon(u\sigma')} \right] \int_{-\infty}^{\infty} d\chi'\, \operatorname{sinc}(2\pi\chi'\alpha)\, e^{-2\pi i \chi'[(\sigma/\sigma') - (1-\alpha)]} ,$$

where, in the first integral over dχ′, we note that χ′ is negative when χ is positive because
σ′ < 0. Ordinarily we might worry about the singularity at σ′ = 0 in the outside integrals over
dσ′, but since both Λ and its derivative are zero at σ′ = 0, it follows that Λ(uσ′)/σ′ must be
zero at σ′ = 0 and very small near σ′ = 0. Consequently, both of these integrals are well-defined,
and, replacing the $\Lambda e^{i\upsilon}$ product by the transfer function H, we can write Eq. (5B.1b) as

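$$\begin{aligned}
Z_{\mathrm{eff}}(\sigma) ={}& \frac{W}{4} \int_{-\infty}^{0} \frac{d\sigma'}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \int_{+\infty}^{-\infty} d\chi'\, \operatorname{sinc}(2\pi\chi'\alpha)\, e^{-2\pi i \chi'[(\sigma/\sigma') - (1-\alpha)]} \\
&+ \frac{W}{4} \int_{0}^{\infty} \frac{d\sigma'}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \int_{-\infty}^{\infty} d\chi'\, \operatorname{sinc}(2\pi\chi'\alpha)\, e^{-2\pi i \chi'[(\sigma/\sigma') - (1-\alpha)]} .
\end{aligned} \tag{5B.3}$$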

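Equation (2.108a) in Chapter 2 can be written as, after replacing F by α,

$$\int_{-\infty}^{\infty} \operatorname{sinc}(2\pi\alpha t)\, e^{-2\pi i t f}\, dt = \frac{1}{2\alpha}\, \Pi(f, \alpha) , \tag{5B.4a}$$

where

$$\Pi(f, \alpha) = \begin{cases} 1 & \text{for } |f| < \alpha \\ 1/2 & \text{for } |f| = \alpha \\ 0 & \text{for } |f| > \alpha \end{cases} . \tag{5B.4b}$$

This definition of function Π is the same as that in Eq. (2.56c) of Chapter 2, and the definition of
the sinc function is given in Eq. (2.106d) of Chapter 2.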
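Applying Eq. (5B.4a) to (5B.3) gives (here χ′ plays the role of t)

$$\begin{aligned}
Z_{\mathrm{eff}}(\sigma) ={}& -\frac{W}{8\alpha} \int_{-\infty}^{0} \frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \Pi\!\left( \frac{\sigma}{\sigma'} - (1-\alpha),\, \alpha \right) d\sigma' \\
&+ \frac{W}{8\alpha} \int_{0}^{\infty} \frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \Pi\!\left( \frac{\sigma}{\sigma'} - (1-\alpha),\, \alpha \right) d\sigma' .
\end{aligned} \tag{5B.5a}$$

The minus sign on the first term comes from the reversed limits of the inner integral over dχ′ in
Eq. (5B.3). From Eq. (5B.4b), it follows that Π is zero unless

$$-\alpha \le \frac{\sigma}{\sigma'} - (1-\alpha) \le \alpha$$

or

$$1 - 2\alpha \le \frac{\sigma}{\sigma'} \le 1 .$$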

Since, according to (5B.1d), 0 < α << 1, we realize that 1 − 2α is positive. Therefore, if σ′ is
positive then σ must also be positive, and if σ′ is negative then σ must also be negative. Hence
this double inequality can be written as

$$1 \le \frac{\sigma'}{\sigma} \le \frac{1}{1-2\alpha}$$

or

$$\left\{ \begin{array}{ll} \sigma \le \sigma' \le \dfrac{\sigma}{1-2\alpha} & \text{for } \sigma > 0 \\[2ex] -\dfrac{|\sigma|}{1-2\alpha} \le \sigma' \le -|\sigma| & \text{for } \sigma < 0 \end{array} \right\} . \tag{5B.5b}$$

According to the discussion immediately preceding Eq. (5B.3), the quantity

$$\frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right]$$

is very small or zero when σ′ is near or at zero, which means that the region around σ′ = 0 cannot
contribute significantly to either integral in Eq. (5B.5a). Therefore, in the first integral between
−∞ and zero, we can think of σ′ as always negative, and in the second integral between zero and
∞ we can think of σ′ as always positive. According to (5B.5b), then, the first integral can be
nonzero only when σ < 0 and the second integral can be nonzero only when σ > 0. This means
that Eq. (5B.5a) can be written as

$$Z_{\mathrm{eff}}(\sigma) = \begin{cases} -\dfrac{W}{8\alpha} \displaystyle\int_{-|\sigma|/(1-2\alpha)}^{-|\sigma|} \frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] d\sigma' & \text{for } \sigma < 0 \\[3ex] \dfrac{W}{8\alpha} \displaystyle\int_{\sigma}^{\sigma/(1-2\alpha)} \frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] d\sigma' & \text{for } \sigma > 0 \end{cases} .$$

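Changing the variable of integration in the top integral to σ″ = −σ′ gives

$$Z_{\mathrm{eff}}(\sigma) = \begin{cases} \dfrac{W}{8\alpha} \displaystyle\int_{|\sigma|}^{|\sigma|/(1-2\alpha)} \frac{1}{\sigma''}\, S(\sigma'')\, M(R\sigma''\theta_{ma})\, [H(u\sigma'')]^{*}\, d\sigma'' & \text{for } \sigma < 0 \\[3ex] \dfrac{W}{8\alpha} \displaystyle\int_{|\sigma|}^{|\sigma|/(1-2\alpha)} \frac{1}{\sigma'}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, d\sigma' & \text{for } \sigma > 0 , \end{cases}$$

where

$$S(-\sigma'') = S(\sigma'') , \tag{5B.6a}$$

$$M(-R\sigma''\theta_{ma}) = M(R\sigma''\theta_{ma}) , \tag{5B.6b}$$

and

$$H(-u\sigma'') = [H(u\sigma'')]^{*} \tag{5B.6c}$$

from Eqs. (5.39a) in Chapter 5, (5.10f) in Chapter 5, and (5A.6b) in Appendix 5A respectively.
Since it makes no difference whether we label the variable of integration σ′ or σ″, we can now
write, remembering that α = ΔΩ/(4π) from Eq. (5B.1c),

$$Z_{\mathrm{eff}}(\sigma) = \begin{cases} \dfrac{W\pi}{2\,\Delta\Omega} \displaystyle\int_{|\sigma|}^{|\sigma|/[1-(2\pi)^{-1}\Delta\Omega]} \frac{1}{\sigma'}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, [H(u\sigma')]^{*}\, d\sigma' & \text{for } \sigma < 0 \\[3ex] \dfrac{W\pi}{2\,\Delta\Omega} \displaystyle\int_{|\sigma|}^{|\sigma|/[1-(2\pi)^{-1}\Delta\Omega]} \frac{1}{\sigma'}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, d\sigma' & \text{for } \sigma > 0 \end{cases} . \tag{5B.7a}$$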

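Since 2α = ΔΩ/(2π) is small compared to one, we have

$$|\sigma| \cdot [1 - (2\pi)^{-1}\Delta\Omega]^{-1} \cong |\sigma| \cdot [1 + (2\pi)^{-1}\Delta\Omega] ,$$

so that Eq. (5B.7a) becomes

$$Z_{\mathrm{eff}}(\sigma) \cong \begin{cases} \dfrac{W}{4} \left( \dfrac{2\pi}{\Delta\Omega} \right) \displaystyle\int_{|\sigma|}^{|\sigma| + \Delta\Omega |\sigma| (2\pi)^{-1}} \frac{1}{\sigma'}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, [H(u\sigma')]^{*}\, d\sigma' & \text{for } \sigma < 0 \\[3ex] \dfrac{W}{4} \left( \dfrac{2\pi}{\Delta\Omega} \right) \displaystyle\int_{|\sigma|}^{|\sigma| + \Delta\Omega |\sigma| (2\pi)^{-1}} \frac{1}{\sigma'}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, d\sigma' & \text{for } \sigma > 0 . \end{cases} \tag{5B.7b}$$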

Because σ′ only changes from |σ| to

$$|\sigma| + \Delta\Omega\, |\sigma|\, (2\pi)^{-1} = |\sigma| \left( 1 + \Delta\Omega\, (2\pi)^{-1} \right)$$

inside the top and bottom integrals, we can, remembering from Eqs. (5B.1c) and (5B.1d) that

$$\frac{\Delta\Omega}{4\pi} \ll 1 ,$$

use the average value of σ′ to approximate the 1/σ′ term as

$$\frac{1}{\frac{1}{2} \left[ |\sigma| + |\sigma| \left( 1 + \frac{\Delta\Omega}{2\pi} \right) \right]} = \frac{1}{\frac{|\sigma|}{2} \left[ 2 + \frac{\Delta\Omega}{2\pi} \right]} = \frac{1}{|\sigma|} \left( 1 + \frac{\Delta\Omega}{4\pi} \right)^{-1} .$$

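Now the 1/σ′ term can be brought outside the integrals to get

$$Z_{\mathrm{eff}}(\sigma) \cong \begin{cases} \left( \dfrac{2\pi}{\Delta\Omega\, |\sigma|} \right) \cdot \dfrac{1}{1 + \frac{\Delta\Omega}{4\pi}} \cdot \displaystyle\int_{|\sigma| \left( 1 + \frac{\Delta\Omega}{4\pi} \right) - \frac{|\sigma| \Delta\Omega}{4\pi}}^{|\sigma| \left( 1 + \frac{\Delta\Omega}{4\pi} \right) + \frac{|\sigma| \Delta\Omega}{4\pi}} \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, [H(u\sigma')]^{*}\, d\sigma' & \text{for } \sigma < 0 \\[4ex] \left( \dfrac{2\pi}{\Delta\Omega\, |\sigma|} \right) \cdot \dfrac{1}{1 + \frac{\Delta\Omega}{4\pi}} \cdot \displaystyle\int_{|\sigma| \left( 1 + \frac{\Delta\Omega}{4\pi} \right) - \frac{|\sigma| \Delta\Omega}{4\pi}}^{|\sigma| \left( 1 + \frac{\Delta\Omega}{4\pi} \right) + \frac{|\sigma| \Delta\Omega}{4\pi}} \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, d\sigma' & \text{for } \sigma > 0 \end{cases} . \tag{5B.7c}$$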

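Making the variable substitution σ″ = −σ′ in the upper integral of (5B.7c), we can write, using
Eqs. (5B.6a)–(5B.6c) and remembering that σ < 0 so that −|σ| = σ,

$$\int_{|\sigma| \left( 1 + \frac{\Delta\Omega}{4\pi} \right) - \frac{|\sigma| \Delta\Omega}{4\pi}}^{|\sigma| \left( 1 + \frac{\Delta\Omega}{4\pi} \right) + \frac{|\sigma| \Delta\Omega}{4\pi}} \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, [H(u\sigma')]^{*}\, d\sigma' = \int_{\sigma \left( 1 + \frac{\Delta\Omega}{4\pi} \right) - \frac{|\sigma| \Delta\Omega}{4\pi}}^{\sigma \left( 1 + \frac{\Delta\Omega}{4\pi} \right) + \frac{|\sigma| \Delta\Omega}{4\pi}} \frac{W}{4}\, H(u\sigma'')\, S(\sigma'')\, M(R\sigma''\theta_{ma})\, d\sigma'' . \tag{5B.7d}$$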

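In the bottom integral of (5B.7c), we can replace |σ|·(1 + ΔΩ/(4π)) by σ·(1 + ΔΩ/(4π)) because σ > 0,
making it look the same as Eq. (5B.7d). Consequently, both the top and bottom parts of Eq.
(5B.7c) can be combined into a single formula,

$$Z_{\mathrm{eff}}(\sigma) \cong \left( \frac{2\pi}{\Delta\Omega\, |\sigma|} \right) \cdot \frac{1}{1 + \frac{\Delta\Omega}{4\pi}} \cdot \int_{\sigma \left( 1 + \frac{\Delta\Omega}{4\pi} \right) - \frac{|\sigma| \Delta\Omega}{4\pi}}^{\sigma \left( 1 + \frac{\Delta\Omega}{4\pi} \right) + \frac{|\sigma| \Delta\Omega}{4\pi}} \left[ \frac{W}{4}\, H(u\sigma')\, S(\sigma')\, M(R\sigma'\theta_{ma}) \right] d\sigma' .$$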

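Making the approximation from Eqs. (5B.1c) and (5B.1d) that

$$1 + \frac{\Delta\Omega}{4\pi} \cong 1 ,$$

we can write this latest formula as

$$Z_{\mathrm{eff}}(\sigma) \cong \frac{1}{\Delta\sigma} \cdot \int_{\sigma \left( 1 + \frac{\Delta\Omega}{4\pi} \right) - \frac{\Delta\sigma}{2}}^{\sigma \left( 1 + \frac{\Delta\Omega}{4\pi} \right) + \frac{\Delta\sigma}{2}} \left[ \frac{W}{4}\, H(u\sigma')\, S(\sigma')\, M(R\sigma'\theta_{ma}) \right] d\sigma' , \tag{5B.8a}$$

where

$$\Delta\sigma = \frac{\Delta\Omega\, |\sigma|}{2\pi} . \tag{5B.8b}$$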

We conclude that $Z_{\mathrm{eff}}(\sigma)$ is, to a very good approximation, given by the average value of

$$\frac{W}{4}\, H(u\sigma)\, S(\sigma)\, M(R\sigma\theta_{ma})$$

over a wavenumber range centered on

$$\sigma \cdot \left( 1 + \frac{\Delta\Omega}{4\pi} \right) ,$$

which has a width of ΔΩ|σ|/(2π).
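To get a feel for the size of this averaging window, the following sketch evaluates its center and width from Eqs. (5B.8a) and (5B.8b) and averages an arbitrary function over it with a midpoint rule. The function names and the sample values (σ = 1000 cm⁻¹, ΔΩ = 0.01 sr) are illustrative assumptions, not numbers taken from the text.

```python
import math


def averaging_window(sigma, d_omega):
    """Return (center, width) of the wavenumber window in Eqs. (5B.8a)-(5B.8b)."""
    width = d_omega * abs(sigma) / (2 * math.pi)
    center = sigma * (1 + d_omega / (4 * math.pi))
    return center, width


def window_average(func, sigma, d_omega, n=200):
    """Midpoint-rule average of func over the window centered as in Eq. (5B.8a)."""
    center, width = averaging_window(sigma, d_omega)
    low = center - width / 2
    step = width / n
    return sum(func(low + (i + 0.5) * step) for i in range(n)) / n


sigma, d_omega = 1000.0, 0.01          # assumed sample values (cm^-1, sr)
center, width = averaging_window(sigma, d_omega)
alpha = d_omega / (4 * math.pi)        # Eq. (5B.1c)
exact_upper = sigma / (1 - 2 * alpha)  # upper limit before the small-alpha step
approx_upper = center + width / 2      # upper limit used in Eq. (5B.8a)
print(center, width, exact_upper - approx_upper)
```

For these sample values the window is about 1.6 cm⁻¹ wide, and the upper limit changes by only a few thousandths of a wavenumber when 1/(1 − 2α) is replaced by 1 + 2α.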


Appendix 5C
When a relatively narrow and rapidly varying function h(z) centered on zero is convolved with
the product of another rapidly varying function g(z) and a broad, slowly varying function G(z),
we can often approximate the result as

h( z ) ∗ [G ( z ) ⋅ g ( z )] ≅ G ( z ) ⋅ [h( z ) ∗ g ( z )] . (5C.1)

It is easy to see why this works. We start out by making h(z) a narrow function centered on z0,
as shown in Fig. 5C.1. Starting with the definition of a convolution in Eq. (2.38a) of Chapter 2,
we have

$$h(z) * [G(z) \cdot g(z)] = \int_{-\infty}^{\infty} h(z')\, G(z - z')\, g(z - z')\, dz' . \tag{5C.2}$$

Since h is a narrow function compared to G, the range of values between z0 − Lh and z0 + Lh in
Fig. 5C.1 for which h is significantly different from zero is for function G in Fig. 5C.2 a range of
z′ values over which very little change occurs. Hence, the right-hand side of Eq. (5C.2) can be
approximated as

$$\int_{-\infty}^{\infty} h(z')\, G(z - z')\, g(z - z')\, dz' \cong G(z - z_0) \int_{-\infty}^{\infty} h(z')\, g(z - z')\, dz' \tag{5C.3a}$$

or

$$\int_{-\infty}^{\infty} h(z')\, G(z - z')\, g(z - z')\, dz' \cong G(z - z_0)\, [h(z) * g(z)] , \tag{5C.3b}$$

using the definition of the convolution in Eq. (2.38a). Substituting this back into (5C.2) now
gives the desired result,

h( z ) ∗ [G ( z ) ⋅ g ( z )] ≅ G ( z − z0 ) ⋅ [h( z ) ∗ g ( z )] (5C.4a)

or
h( z ) ∗ [G ( z ) ⋅ g ( z )] ≅ G ( z ) ⋅ [h( z ) ∗ g ( z )] (5C.4b)

when z0 ≅ 0 because h is centered on zero. This justifies Eq. (5C.1).
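A quick numerical check can make Eq. (5C.1) more concrete. In the sketch below, the three functions are hypothetical choices made only for illustration: a narrow Gaussian for h, a cosine for the rapidly varying g, and a broad Gaussian for G. The convolutions are evaluated with a simple Riemann sum over the region where h is not negligible.

```python
import math


def h(z):
    """Narrow kernel centered on zero (assumed Gaussian, width 0.05)."""
    return math.exp(-z * z / (2 * 0.05 ** 2))


def g(z):
    """Rapidly varying function (assumed cosine with 3 cycles per unit z)."""
    return math.cos(2 * math.pi * 3.0 * z)


def G(z):
    """Broad, slowly varying function (assumed Gaussian, width 5)."""
    return math.exp(-z * z / (2 * 5.0 ** 2))


def convolve_at(kernel, func, z, half_range=0.5, step=0.001):
    """Riemann-sum estimate of (kernel * func)(z) over |z'| <= half_range."""
    n = int(round(2 * half_range / step))
    total = 0.0
    for i in range(n + 1):
        zp = -half_range + i * step
        total += kernel(zp) * func(z - zp)
    return total * step


for z in (0.0, 1.1, 2.3):
    left = convolve_at(h, lambda x: G(x) * g(x), z)   # h * [G g]
    right = G(z) * convolve_at(h, g, z)               # G [h * g]
    print(z, left, right, abs(left - right))
```

With these choices the two sides agree to well under one percent of the peak value; the agreement degrades as h is made wider or G is made narrower, as the argument leading to Eq. (5C.3a) suggests.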

FIGURE 5C.1. [Plot of h(z) against z: a narrow function centered on z0, with a distance Lh marked on either side of z0.]

FIGURE 5C.2. [Plot of G(z), with an interval of length 2Lh marked.]

FIGURE 5C.3. [Plot of hsum(z).]

FIGURE 5C.4. [Separate plots of h1(z), h2(z), and h3(z), each against z.]

When a function h is the sum of several narrow functions, it can be written as

$$h_{\mathrm{sum}}(z) = \sum_{k=1}^{N} h_k(z - z_k) \tag{5C.5}$$

with the N narrow hk ( z ) functions centered at the origin. Figure 5C.3 shows what a plot of
hsum ( z ) might look like for N = 3 when h1 , h2 , h3 are as shown in Fig. 5C.4. The linearity of the
convolution shown in Eq. (2.38e) of Chapter 2 can now be used to write

$$\begin{aligned}
h_{\mathrm{sum}}(z) * [G(z) \cdot g(z)] &= \sum_{k=1}^{N} \left\{ h_k(z - z_k) * [G(z) \cdot g(z)] \right\} \\
&\cong \sum_{k=1}^{N} \left\{ G(z - z_k) \cdot [h_k(z - z_k) * g(z)] \right\} ,
\end{aligned} \tag{5C.6}$$

where the last step uses Eq. (5C.4a) to move G outside the convolutions.

6
NEdN AND DETECTOR NOISE
Laboratory measurements contaminated by random errors are usually characterized by their
signal-to-noise ratio (SNR). In measurements of spectral radiance, however, signal-to-noise ratios
can be confusing because the SNR can change by orders of magnitude as the signal itself—in
spectra having strong emission or absorption lines—changes by orders of magnitude. Hence the
noise performance of Fourier-transform spectrometers is often characterized by the noise-
equivalent change in radiance (NEdN) instead of the signal-to-noise ratio.96 By far the largest
part of the random error or NEdN in the spectral measurements of most Fourier-transform
spectrometers comes from random errors in the way detectors respond to the optical signal. These
random errors in the detector response are called detector noise. Because as few assumptions as
possible are made in this chapter about the shape of the detector-noise power spectrum, our
approach to detector noise is more elaborate than most discussions of the subject. In this chapter,
we derive formulas for the detector-noise NEdN of Michelson spectrometers using double-sided
and single-sided interferogram signals. While deriving our NEdN formulas, we are careful to
trace through what happens to the spectral signal during calibration, making it easy to understand
the different ways detector noise is processed in double-sided and single-sided systems. Although
the formulas in this chapter apply directly only to the detector noise in standard two-port
Michelson systems, the approach used here can be easily adapted to any type of Fourier-
transform spectrometer by changing the details of the analysis to accommodate the interferogram
signals generated by more elaborate instruments.

6.1 Definition of NEdN


The letter “N” is often used to represent radiance in radiometric equations, and in radiometric
measurements the letters “NEdN” usually stand for “noise-equivalent change in radiance.” The
NEdN is the expected amount of uncertainty that random errors give to the radiance value. The
name itself does a good job of explaining the basic concept: random errors—that is, noise—
produce an error in the measurement that corresponds to a modification—that is, an equivalent
change—of the radiance N.
When a nonideal instrument whose measurement errors are predominantly random is used to
measure radiance, there are error bars attached to the data points (see Fig. 6.1). The error bar
(usually) indicates that the true value of the radiance probably lies within one error-bar length BE
of the data point. When the error is predominantly random, the randomness shows up as a change

96
The NEdN is described in Sec. 6.1 below.


in the measured radiance if the same measurement is repeated; so for predominantly random
errors, the error bars also show that a repeated measurement of the same spectrum value with the
same instrument is likely to lie within one error-bar length of the original data point. For
predominantly random errors, then, the length BE of the error bar attached to the data point also
gives the NEdN value (the noise-equivalent change in radiance) associated with the data point.
When the measurement errors are not predominantly random, the error bar specifies the total
measurement error, both random and nonrandom. Hence, when a data point is also contaminated
by significant amounts of nonrandom error, BE is larger than the probable change in value if the
measurement is repeated. Consequently, when both random and nonrandom errors are present,
the NEdN can be thought of as that portion of the error-bar length caused by random
measurement error; that is, the NEdN is the increase in the length of BE due to the presence of
random measurement errors. Because the NEdN must be either the error-bar length itself or an
increase in the error-bar length, the NEdN always has the same type of units as the radiance
measurements it describes. In this chapter, the NEdN describes the expected amount of random
error in the spectral radiance L as a function of wavenumber σ, so here the NEdN always has
units of optical power per unit area per unit solid angle per unit wavenumber interval (for
example, watts/m²/sr/cm⁻¹ or erg/sec/cm²/sr/cm⁻¹ = erg/sec/sr/cm). In a well-designed
interferometer, we expect the NEdN (indeed, we expect the total measurement error) to be
small compared to the average or typical size of the radiance.
FIGURE 6.1. [Plot of radiance N(σ) data points with attached error bars.]

In Chapter 5, we found that an interferometer’s spectral radiance measurements always suffer
to some degree from two types of nonrandom error: a measurement distortion due to the
interferometer’s finite field of view and a measurement distortion due to the finite length of the
interferometer’s interferogram. According to the discussion following Eq. (5.108f) in Chapter 5,
this nonrandomly distorted spectral measurement can be represented by function $L_{mnf}(\sigma)$. To get a
complete representation of the measured spectral radiance produced by the interferometer, we
must add to the distorted yet noise-free measurement specified by $L_{mnf}(\sigma)$ a random term
representing the measurement noise. We write the measured, noise-contaminated spectral
radiance produced by an interferometer measurement as

$$\tilde{L}_{mN}(\sigma) = L_{mnf}(\sigma) + \widetilde{\delta L}(\sigma) \tag{6.1a}$$

with $\widetilde{\delta L}$ being the random error contaminating the $\tilde{L}_{mN}$ spectral measurement. The wavy line
over $\tilde{L}_{mN}$ and $\widetilde{\delta L}$ shows that these are both random functions of σ (see Secs. 3.1 and 3.2 of
Chapter 3 for an explanation of the wavy-line notation and random functions). We need $\widetilde{\delta L}$ to be
a random function of σ because very often the size and nature of the random error in the spectral
measurement depends strongly on the value of the wavenumber σ. The δ part of $\widetilde{\delta L}$ reminds us
that the random error takes on values that are small compared to the typical size of $L_{mnf}$.
Representing this typical size by the spectral average of $L_{mnf}$, we note that

$$\widetilde{\delta L}(\sigma) \ll \frac{1}{\sigma_{\max} - \sigma_{\min}} \int_{\sigma_{\min}}^{\sigma_{\max}} L_{mnf}(\sigma)\, d\sigma . \tag{6.1b}$$

In this inequality, the interferometer is assumed to measure spectral radiances between $\sigma_{\min}$ and
$\sigma_{\max}$. Using the same notation as in inequality (5.78) of Chapter 5, we say that

$$0 < \sigma_{\min} \le \sigma \le \sigma_{\max} \tag{6.1c}$$

for all wavenumbers σ in (6.1a) and (6.1b).


The relationship between the NEdN and $\widetilde{\delta L}$ is not as straightforward as it might at first look.
From Sec. 3.4 of Chapter 3, we know that the average or expected value of $\tilde{L}_{mN}$ in Eq. (6.1a) is

$$E\!\left( \tilde{L}_{mN}(\sigma) \right) = E\!\left( L_{mnf}(\sigma) + \widetilde{\delta L}(\sigma) \right) = L_{mnf}(\sigma) + E\!\left( \widetilde{\delta L}(\sigma) \right) , \tag{6.2a}$$

where Eqs. (3.16a) and (3.9f) in Chapter 3 are used to simplify the right-hand side of the
equation. Looking at (6.2a), we might be tempted to define $E(\widetilde{\delta L}(\sigma))$, which is the average or
expected value of $\widetilde{\delta L}$, to be the NEdN associated with the $\tilde{L}_{mN}$ measurement; but there are
problems with this approach. Suppose, for example, that only random errors are present in our
measurements and that the error bar attached to the data point shows that the random error is just
as likely to make a measured radiance too large as it is to make it too small. This suggests that the
radiance value that would be produced by a noise-free interferometer measurement can be
estimated by averaging together a large number of independent measurements, with the presence
of the randomly occurring “too-large” measurements compensating for the presence of the
randomly occurring “too-small” measurements. According to Sec. 3.4 of Chapter 3, to get the
average value of a randomly varying quantity, we should apply the expectation operator E .
Hence, the assumption that averaging together many randomly occurring too-large and too-small
measurements produces a good estimate of the noise-free interferometer measurement can be
written as
$$E\!\left( \tilde{L}_{mN}(\sigma) \right) = L_{mnf}(\sigma) . \tag{6.2b}$$

Substitution of (6.2b) into (6.2a) then gives

$$E\!\left( \widetilde{\delta L}(\sigma) \right) = 0 . \tag{6.2c}$$

So now if E(δ L (σ )) is defined to be the NEdN, we end up saying that our measurements have
zero NEdN even though every individual measurement is contaminated by a substantial amount
of random error. This is obviously not acceptable.
To define the NEdN correctly, we must remember that the NEdN is not the average random
error itself, $E(\widetilde{\delta L}(\sigma))$, but rather the average size of the random error. Glancing back at Eqs.
(3.5c) and (3.8e) in Chapter 3, we see that the standard deviation of $\widetilde{\delta L}$, which can be written as

$$\left\{ E\!\left( \left[ \widetilde{\delta L}(\sigma) - E\!\left( \widetilde{\delta L}(\sigma) \right) \right]^{2} \right) \right\}^{1/2} ,$$

gives us what we want. Even if $E(\widetilde{\delta L}(\sigma))$ is zero, the standard deviation

$$\sqrt{ E\!\left( \left[ \widetilde{\delta L}(\sigma) - E\!\left( \widetilde{\delta L}(\sigma) \right) \right]^{2} \right) } = \sqrt{ E\!\left( \widetilde{\delta L}(\sigma)^{2} \right) }$$

will be greater than zero as long as $\widetilde{\delta L}$ itself is not identically zero. Hence, the standard
deviation behaves the way we want it to when $E(\widetilde{\delta L}(\sigma))$ is zero while $\widetilde{\delta L}(\sigma)$ is not zero. The
next step is to check how well this definition of the NEdN works when $E(\widetilde{\delta L}(\sigma))$ is not equal to
zero.
Suppose Eq. (6.2c) is no longer satisfied; that is, suppose that

$$E\!\left( \widetilde{\delta L}(\sigma) \right) = \delta L_{nr}(\sigma) = \text{small nonzero error which depends on } \sigma . \tag{6.3a}$$

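We decide to write $\widetilde{\delta L}(\sigma)$ as the sum of a nonrandom function $\delta L_{nr}(\sigma)$ and a random function
$\widetilde{\delta L}_{r}(\sigma)$,

$$\widetilde{\delta L}(\sigma) = \delta L_{nr}(\sigma) + \widetilde{\delta L}_{r}(\sigma) . \tag{6.3b}$$

Taking the expectation value of both sides and using Eqs. (3.16a) and (3.9f) in Chapter 3, we get

$$E\!\left( \widetilde{\delta L}(\sigma) \right) = \delta L_{nr}(\sigma) + E\!\left( \widetilde{\delta L}_{r}(\sigma) \right) .$$

We can reconcile this result with (6.3a) only by requiring that

$$E\!\left( \widetilde{\delta L}_{r}(\sigma) \right) = 0 . \tag{6.3c}$$

Equations (6.3a)–(6.3c) show that if $E(\widetilde{\delta L}(\sigma))$ is not zero, then $\widetilde{\delta L}(\sigma)$ can be written as the
sum of both a random function $\widetilde{\delta L}_{r}(\sigma)$ and a nonrandom function $\delta L_{nr}(\sigma)$, with the nonrandom
function $\delta L_{nr}(\sigma)$ equal to the nonzero expectation value of $\widetilde{\delta L}(\sigma)$ and the random function
$\widetilde{\delta L}_{r}(\sigma)$ having a zero expectation value.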


It is easy to show that $\delta L_{nr}(\sigma)$ acts like an extra, nonrandom error added to $L_{mnf}$. Substituting
(6.3a) into (6.2a) gives

$$E\!\left( \tilde{L}_{mN}(\sigma) \right) = L_{mnf}(\sigma) + \delta L_{nr}(\sigma) , \tag{6.3d}$$

and substituting (6.3b) into (6.1a) gives

$$\tilde{L}_{mN}(\sigma) = [L_{mnf}(\sigma) + \delta L_{nr}(\sigma)] + \widetilde{\delta L}_{r}(\sigma) . \tag{6.3e}$$

Equation (6.3e) shows that the sum inside the square brackets [ ] plays the same role as $L_{mnf}$ does
in (6.1a) because it is a nonrandom function of σ added to a random function of σ; and Eq. (6.3d)
shows that $\delta L_{nr}(\sigma)$ cannot be removed by averaging together many different measurements of
$\tilde{L}_{mN}(\sigma)$.

When repeated measurements are made of the same data point and then averaged together,
Eqs. (6.3d) and (6.3e) show that the random change from one measurement to the next comes
entirely from $\widetilde{\delta L}_{r}(\sigma)$, with $\delta L_{nr}(\sigma)$ just shifting the data point away from the $L_{mnf}$ value by the
same amount each time the measurement is made. This shows that the increase in BE, the error-
bar length, due to random error comes entirely from the random component $\widetilde{\delta L}_{r}(\sigma)$ of $\widetilde{\delta L}(\sigma)$.
Fortunately, defining the NEdN to be the standard deviation of $\widetilde{\delta L}(\sigma)$ still gives us a well-
behaved value for the NEdN when $\widetilde{\delta L}(\sigma)$ has a significant nonrandom component $\delta L_{nr}(\sigma)$. We
have already seen that [see the formula following Eq. (6.2c) above]

$$\text{standard deviation of } \widetilde{\delta L} = \left\{ E\!\left( \left[ \widetilde{\delta L}(\sigma) - E\!\left( \widetilde{\delta L}(\sigma) \right) \right]^{2} \right) \right\}^{1/2} .$$

Substituting first (6.3b) and then (6.3a) into the right-hand side gives

$$\left\{ E\!\left( \left[ \delta L_{nr}(\sigma) + \widetilde{\delta L}_{r}(\sigma) - E\!\left( \widetilde{\delta L}(\sigma) \right) \right]^{2} \right) \right\}^{1/2} = \left\{ E\!\left( \left[ \widetilde{\delta L}_{r}(\sigma) \right]^{2} \right) \right\}^{1/2} .$$

Again the standard deviation gives us what we want: a nonzero and positive value of the NEdN
that does not depend in any way on the nonrandom error component $\delta L_{nr}(\sigma)$ of $\widetilde{\delta L}(\sigma)$. We
conclude that it makes sense to define the NEdN of any radiance measurement described by Eq.
(6.1a) to be the standard deviation of the random function $\widetilde{\delta L}(\sigma)$ even when $E(\widetilde{\delta L}(\sigma))$ does not
equal zero:

$$\mathrm{NEdN}(\sigma) = \left\{ E\!\left( \left[ \widetilde{\delta L}(\sigma) - E\!\left( \widetilde{\delta L}(\sigma) \right) \right]^{2} \right) \right\}^{1/2} . \tag{6.3f}$$

We note that this definition automatically gives the NEdN units of spectral radiance, as it should.
To emphasize that the standard deviation formula only applies to non-negative wavenumbers σ,
we often write

$$\mathrm{NEdN}(|\sigma|) = \left\{ E\!\left( \left[ \widetilde{\delta L}(|\sigma|) - E\!\left( \widetilde{\delta L}(|\sigma|) \right) \right]^{2} \right) \right\}^{1/2} . \tag{6.3g}$$

Equation (6.3g) can also be thought of as giving the NEdN the same behavior with respect to
negative wavenumber values as the spectral radiance; the absolute value signs make the NEdN an
even function of σ in the same way that absolute value signs make $L$, $L^{(fore)}$, and $L^{(back)}$ even
functions of σ in Eqs. (5.40g), (5.51a), and (5.57a) of Chapter 5.
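The distinction between the mean error and the NEdN in Eqs. (6.2c) and (6.3f) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the error at a single wavenumber is modeled as a constant nonrandom offset (`bias`, standing in for the nonrandom component) plus Gaussian noise of standard deviation `noise_sigma` (standing in for the random component). The Gaussian model, the function names, and the numbers are all hypothetical.

```python
import random
import statistics


def simulate_errors(n, noise_sigma, bias, seed=0):
    """Draw n samples of the total error: a fixed offset plus random noise."""
    rng = random.Random(seed)
    return [bias + rng.gauss(0.0, noise_sigma) for _ in range(n)]


def nedn_estimate(errors):
    """Sample version of Eq. (6.3f): standard deviation about the sample mean."""
    return statistics.pstdev(errors)


unbiased = simulate_errors(20000, noise_sigma=0.5, bias=0.0)
biased = simulate_errors(20000, noise_sigma=0.5, bias=0.2)

# The mean error tracks the nonrandom offset; the NEdN estimate does not.
print(statistics.fmean(unbiased), nedn_estimate(unbiased))
print(statistics.fmean(biased), nedn_estimate(biased))
```

In both runs the estimated NEdN stays close to the assumed noise level, while the mean error is near zero in the first run and near the assumed offset in the second, which is the behavior the definition in Eq. (6.3f) is designed to have.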


6.2 Signal from the Spectral Radiance


Shifting the position of the moving mirror in Fig. 6.2 changes the interferogram signal by
changing the value of χ, the interferometer’s optical path difference or OPD. During a spectral
measurement, the moving mirror moves uniformly and steadily through its allowed range of
positions, which means that χ satisfies [see Eq. (5.41a) of Chapter 5]

$$\chi = ut . \tag{6.4}$$

Here u is the constant OPD velocity and t is a time coordinate chosen so that t = 0 when χ = 0.
We usually find it more convenient to represent the interferometer signal and signal errors as
functions of χ while remembering that, according to Eq. (6.4), the OPD value χ and the time
coordinate t are directly proportional to each other.
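The proportionality in Eq. (6.4) can be written out as a few small helper functions. This is a minimal sketch with made-up numbers: the function names are illustrative, the mirror-shift relation is the one noted in Fig. 6.2 (an OPD of χ corresponds to a physical mirror shift of χ/2), and the electrical frequency uσ follows from substituting χ = ut into the interferogram phase 2πσχ.

```python
def opd_at_time(u, t):
    """Eq. (6.4): optical path difference chi = u * t."""
    return u * t


def mirror_shift(chi):
    """Physical displacement of the moving mirror for an OPD of chi (Fig. 6.2)."""
    return chi / 2


def electrical_frequency(u, sigma):
    """Frequency (Hz) at which wavenumber sigma appears in the time signal.

    exp(2*pi*i*sigma*chi) with chi = u*t oscillates at frequency u*|sigma|.
    """
    return u * abs(sigma)


u = 2.0          # assumed OPD velocity, cm/s
t = 0.25         # assumed time since zero path difference, s
sigma = 1000.0   # assumed wavenumber, cm^-1

chi = opd_at_time(u, t)
print(chi, mirror_shift(chi), electrical_frequency(u, sigma))
```

With these assumed values the OPD is 0.5 cm, the mirror has moved 0.25 cm, and a 1000 cm⁻¹ spectral feature appears at 2000 Hz in the detector signal.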
The interferometer signal can be evaluated at any position along the signal chain shown in
Fig. 6.2. If we think of the signal as being the electrical impulses leaving the detector circuit due
to the input radiance L(σ), then we can analyze it at point C in Fig. 6.2 and represent it by $z_C(\chi)$.
Function $z_C(\chi)$ can be either the voltage or current as a function of OPD, depending on how we
want to record the signal; and Eq. (6.4) can always be used to write the signal as $z_C(ut)$ if we
want it as a function of time. To get the corresponding electrical impulses leaving the detector,
we can analyze the signal at point B in Fig. 6.2 and represent it by $z_B(\chi)$; and if we think of the
interferometer signal as being the corresponding optical power reaching the detector, then we
analyze it at point A in Fig. 6.2 and represent it by $z_A(\chi)$. Again we have the choice of using
either volts or amps to represent the electrical signal $z_B(\chi)$, and signal $z_A(\chi)$ is usually thought
of as having units of optical power. Just like the $z_C$ signal, the $z_B$ and $z_A$ signals can be specified
as functions of time by writing $z_B(ut)$ and $z_A(ut)$.
At point C in Fig. 6.2, we know from Sec. 5.18 of Chapter 5 [see Eqs. (5.81a) and (5.83d)]
that the electrical signal due to the spectral radiance L(σ) entering the interferometer’s aperture is

$$\int_{-\infty}^{\infty} Z_{\mathrm{eff}}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma = \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|)\, e^{2\pi i \sigma\chi}\, d\sigma , \tag{6.5a}$$

FIGURE 6.2. [Block diagram of the signal chain. Region of optical signal: input scene radiance, fore optics, interferometer (beam splitter, fixed mirror, and moving mirror; an OPD value of χ corresponds to a physical shift of χ/2 of the moving mirror from its ZPD position), aft optics, POINT A. Region of electrical signal: detector, POINT B, detector circuit with antialiasing filter, POINT C. Region of digital signal: analog-to-digital converter sampling the signal at equally spaced χ values.]
6 · NEdN and Detector Noise

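where, according to Eq. (5.83e) of Chapter 5, $L_{FOV}(|\sigma|)$ is defined to be

$$L_{FOV}(|\sigma|) = \begin{cases} L(|\sigma|) & \text{for small } \Delta\Omega \text{ where cos α ε can be approximated as one} \\[3ex] \dfrac{1}{\Delta\sigma} \displaystyle\int_{|\sigma| \cdot \left( 1 + \frac{\Delta\Omega}{4\pi} \right) - \frac{\Delta\sigma}{2}}^{|\sigma| \cdot \left( 1 + \frac{\Delta\Omega}{4\pi} \right) + \frac{\Delta\sigma}{2}} L(|\sigma'|)\, d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where cos α ε cannot be approximated as one.} \end{cases} \tag{6.5b}$$

According to Eq. (5.76c) of Chapter 5, Δσ in Eq. (6.5b) is given by

$$\Delta\sigma = \frac{\Delta\Omega\, |\sigma|}{2\pi} . \tag{6.5c}$$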

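Since $z_C(\chi)$ is defined to be the electrical signal due to L(σ) at point C of Fig. 6.2, we can now
write Eq. (6.5a) as

$$z_C(\chi) = \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|)\, e^{2\pi i \sigma\chi}\, d\sigma . \tag{6.5d}$$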

Examining carefully Eqs. (5.104a) and (5.104b) in Chapter 5, we see that $z_C(\chi)$ is the same signal
as z(χ) in (5.104a) because the signal spectrum $Z_{\mathrm{eff}}(\sigma)$ in (5.104b) is the same as the
expression put through the inverse Fourier transform on the right-hand side of (6.5d).
The easiest way to get the formulas for $z_B(\chi)$ and $z_A(\chi)$ is to go backwards through the signal
chain in Fig. 6.2.
Going back to $z_B(\chi)$, we note that it is the component of the electrical signal leaving the
detector due to the input radiance L(σ). To find this component, we just set H = 1 in (6.5d) to
remove the influence of the detector circuit. Since the AC coupling of the detector circuit also
removes constant terms from the signal, we should also add back any constant signal terms
leaving the detector.97 Equation (6.4), which requires time and OPD to be proportional, reminds
us that the constant signal terms must be independent of both time t and the OPD value χ.
Examining Eqs. (5.40e)–(5.40g) in Sec. 5.9 of Chapter 5, we note that the formulas for $K_{bal}(\chi)$

97
See the discussion following Eq. (5.46c) in Chapter 5 for an explanation of how the constant terms are eliminated
as the signal passes from point B to C in Fig. 6.2.


are formulas for what we are now calling $z_B(\chi)$, the electrical signal leaving the detector due to the
spectral radiance L(σ) entering the interferometer’s aperture. Both of the formulas for $K_{bal} = z_B$
in Eqs. (5.40e) and (5.40f) have the same constant term (that is, the same χ-independent term)
no matter what approximation is used for cos α ε . This term can be written as, substituting from
Eq. (5.40g),

$$\frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, d\sigma = \frac{A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} \eta(|\sigma|)\, R(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L(|\sigma|)\, d\sigma . \tag{6.6a}$$

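Because this constant term is the same no matter what approximation is used for cos α ε , all that
we need to do to get the formula for signal $z_B(\chi)$ is to add this constant term to the formula for $z_C$
in (6.5d) with H set equal to one. This gives

$$\begin{aligned}
z_B(\chi) ={}& \frac{A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} \eta(|\sigma|)\, R(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L(|\sigma|)\, d\sigma \\
&+ \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|)\, e^{2\pi i \sigma\chi}\, d\sigma .
\end{aligned} \tag{6.6b}$$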

To get $z_A(\chi)$, the optical power reaching the detector due to the spectral radiance L(σ) entering the
interferometer’s aperture, we go back one more step in Fig. 6.2. According to the remark
following Eq. (5.35d) at the beginning of Sec. 5.9 of Chapter 5, replacing the detector
responsivity $R(|\sigma|)$ by one takes us from the electrical signal produced by the detector to the
optical power hitting the detector. Therefore, to get $z_A(\chi)$, the optical power reaching the detector
at point A due to the spectral radiance L(σ), we just set $R(|\sigma|) = 1$ in Eq. (6.6b) to get

$$\begin{aligned}
z_A(\chi) ={}& \frac{A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} \eta(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L(|\sigma|)\, d\sigma \\
&+ \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|)\, e^{2\pi i \sigma\chi}\, d\sigma .
\end{aligned} \tag{6.6c}$$

6.3 Signal from the Background Radiance


As is discussed in Sec. 5.13 of Chapter 5, the total interference signal passing through the signal
chain of Fig. 6.2 often contains significant background components as well as the $z_A(\chi)$, $z_B(\chi)$, and
$z_C(\chi)$ signal components due to the spectral radiance L(σ) entering the interferometer’s aperture.
The background components at point C have already been discussed to some extent in Chapter 5.
When we compare the background components at C to the background components of points A
and B, we note that the background signals at points A and B contain additional constant terms
(that is, terms that are χ independent) that do not pass through the AC coupling of the detector to
point C. Conceptually, we can write for the total signal at points A, B, and C of Fig. 6.2 that

$$z_A^{(tot)}(\chi) = z_A(\chi) + \text{background terms at point A} , \tag{6.7a}$$

$$z_B^{(tot)}(\chi) = z_B(\chi) + \text{background terms at point B} , \tag{6.7b}$$

and

$$z_C^{(tot)}(\chi) = z_C(\chi) + \text{background terms at point C} . \tag{6.7c}$$

The formulas for $z_A$, $z_B$, and $z_C$ in Eqs. (6.6c), (6.6b), and (6.5d) show that if L(σ), the spectral
radiance entering the front aperture, is zero, then $z_A$, $z_B$, and $z_C$ are also zero. The standard way
of making the infrared spectral radiance L(σ) negligible (that is, effectively zero compared to the
background radiances) is to point the interferometer at an extremely cold surface. When this is
done, Eqs. (6.7a)–(6.7c) reduce to

$$z_A^{(tot)}(\chi) = \text{background terms at point A} = z_A^{(cold)}(\chi) ,$$

$$z_B^{(tot)}(\chi) = \text{background terms at point B} = z_B^{(cold)}(\chi) ,$$

and

$$z_C^{(tot)}(\chi) = \text{background terms at point C} = z_C^{(cold)}(\chi) .$$

The superscript (cold) reminds us that the background terms at points A, B, and C are the same
thing as the total signal at points A, B, and C when the interferometer is looking at a cold surface.
Because z (Acold ) , z B( cold ) , and zC( cold ) represent the background terms, and these terms are the same
no matter what negligible or non-negligible spectral radiance L(ı) is entering the front aperture
of the interferometer, Eqs. (6.7a)–(6.7c) can also be written as

z A(tot ) ( χ ) = z A ( χ ) + z (Acold ) ( χ ) , (6.8a)

z B(tot ) ( χ ) = z B ( χ ) + z B( cold ) ( χ ) , (6.8b)

- 752 -
Signal from the Background Radiance · 6.3

and
zC( tot ) (  ) zC (  )  zC( cold ) (  ) . (6.8c)

As a general rule, the calibration of any well-designed interferometer system provides us with the
data needed to find $z_A^{(cold)}(\chi)$, $z_B^{(cold)}(\chi)$, and $z_C^{(cold)}(\chi)$.

Consequently, in principle all that need be done to recover signals $z_A(\chi)$, $z_B(\chi)$, and $z_C(\chi)$ at points
A, B, and C is to subtract $z_A^{(cold)}$, $z_B^{(cold)}$, and $z_C^{(cold)}$ from $z_A^{(tot)}$, $z_B^{(tot)}$, and $z_C^{(tot)}$ to get

\[ z_A(\chi) = z_A^{(tot)}(\chi) - z_A^{(cold)}(\chi) , \tag{6.8d} \]

\[ z_B(\chi) = z_B^{(tot)}(\chi) - z_B^{(cold)}(\chi) , \tag{6.8e} \]

and

\[ z_C(\chi) = z_C^{(tot)}(\chi) - z_C^{(cold)}(\chi) . \tag{6.8f} \]
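The subtraction in Eqs. (6.8d)–(6.8f) can be illustrated with a short numerical sketch. Everything in it is an assumption made for illustration: the OPD grid, the two made-up signal shapes standing in for the scene and cold-scene signals, and the helper name `subtract_cold` are hypothetical and do not come from any instrument.

```python
import numpy as np

def subtract_cold(z_tot, z_cold):
    """Recover the scene signal z(chi) = z_tot(chi) - z_cold(chi), Eqs. (6.8d)-(6.8f)."""
    z_tot = np.asarray(z_tot, dtype=float)
    z_cold = np.asarray(z_cold, dtype=float)
    if z_tot.shape != z_cold.shape:
        raise ValueError("z_tot and z_cold must be sampled on the same OPD grid")
    return z_tot - z_cold

# Hypothetical OPD grid and made-up signals, for illustration only.
chi = np.linspace(-1.0, 1.0, 201)
z_scene = np.exp(-(chi / 0.2) ** 2) * np.cos(2 * np.pi * 10 * chi)  # stands in for z_C
z_cold = 0.3 * np.exp(-(chi / 0.5) ** 2)                            # stands in for z_C^(cold)
z_tot = z_scene + z_cold                                            # Eq. (6.8c)

recovered = subtract_cold(z_tot, z_cold)                            # Eq. (6.8f)
```

The only design point is that both signals must share one OPD grid; the sketch says nothing about how the cold-scene signal is obtained from calibration data.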

6.4 Inverse Fourier Transform of the Background Radiance


In Chapter 5, we were interested in the nonrandom errors and distortions of the measured
spectrum, so we concentrated on those signal components carrying information about the input
spectral radiance $L(\sigma)$. Signal noise, however, can arise from the interferometer’s background
radiance as well as from its input radiance, so we now have to expand somewhat on the analysis
of the background radiance in Chapter 5. In this section, we show what the background signal
terms look like at points A, B, and C in Fig. 6.2, specifying them as integrals and inverse Fourier
transforms of the background spectral radiance.
Function $z_C^{(cold)}(\chi)$ in Eq. (6.8c) represents the component of the total signal at point C
created by the background radiance. Returning to Chapter 5 and examining how Eqs. (5.62a) and
(5.62b) are derived, we confirm, as was mentioned in Chapter 5, that $z_C^{(cold)}(\chi)$ in Eq. (6.8c) is
the same signal as the $z_C^{(cold)}(\chi)$ given in Eqs. (5.62a) and (5.62b). We work first with the ideal
case where the interferometer’s field of view $\Delta\Omega$ is small enough that $\cos\alpha_\varepsilon$ can be
approximated as one. Substituting Eqs. (5.51a) and (5.57a) from Chapter 5 into (5.62b) gives

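\[ z_C^{(cold)}(\chi) = \frac{W A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} \mathrm{H}(u\sigma)\, \mathrm{M}(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_a(|\sigma|)\, [L^{(fore)}(|\sigma|) - L^{(back)}(|\sigma|)]\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{6.9} \]
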

The nonideal case where $\cos\alpha_\varepsilon$ can no longer be approximated by one requires somewhat more
work. Equation (5.62a) in Chapter 5 gives the nonideal formula for $z_C^{(cold)}$. When we compare
(5.62a) to Eq. (5.73a) in Chapter 5, we notice that Eq. (5.73a) becomes identical to (5.62a) when
functions $z(\chi)$ and $S(\sigma)$ are taken to be the same as $z_C^{(cold)}(\chi)$ and $[S^{(fore)}(\sigma) - S^{(back)}(\sigma)]$
respectively:

\[ z(\chi) \Leftrightarrow z_C^{(cold)}(\chi) \tag{6.10a} \]

\[ S(\sigma) \Leftrightarrow [S^{(fore)}(\sigma) - S^{(back)}(\sigma)] . \tag{6.10b} \]

In the mathematical analysis following Eq. (5.73a), functions $z(\chi)$ and $S(\sigma)$ are just “placeholder”
functions; that is, our mathematical analysis down to Eq. (5.75e) holds true for any appropriate
pair of “z” and “S” functions because it makes no assumptions about them other than that they are
related by a formula like Eq. (5.73a). This means we can find out what would happen to Eq.
(5.62a) when the same sort of analysis is applied to it as is applied to the z and S functions in
(5.73a) simply by replacing z and S in (5.75e) by $z_C^{(cold)}$ and $[S^{(fore)} - S^{(back)}]$ respectively. Making
this replacement shows that the mathematical relationship specified in Eq. (5.62a) transforms into

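\[ z_C^{(cold)}(\chi) = \frac{W}{4} \int_{-\infty}^{\infty} [S^{(fore)}(\sigma) - S^{(back)}(\sigma)]\, \mathrm{M}(R\sigma\theta_{ma})\, \mathrm{H}(u\sigma)\, \mathrm{sinc}\!\left( \frac{\sigma\chi\,\Delta\Omega}{2} \right) e^{2\pi i\chi\sigma\left(1 - \frac{\Delta\Omega}{4\pi}\right)}\, d\sigma . \tag{6.10c} \]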

For this new formula to be true, we must assume, just as in the analysis following Eq. (5.73a),
that $\chi\sigma\alpha_\varepsilon^2$ can be treated as a small quantity for all $-D \le \chi \le D$ over which the $z_C^{(cold)}(\chi)$ signal
is recorded and that the field of view $\Delta\Omega$, although relatively large, is not so large that

\[ \cos\alpha_\varepsilon \cong 1 - \frac{\alpha_\varepsilon^2}{2} \]

is a bad approximation [see Eq. (5.73b) in Chapter 5].


Using the notation introduced in Eq. (2.29a) of Chapter 2, we continue the analysis by taking
the forward Fourier transform of both sides of (6.10c) to get

\[ \mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(cold)}(\chi) \right) = \int_{-\infty}^{\infty} d\chi\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\sigma' \cdot \left\{ \frac{W}{4} [S^{(fore)}(\sigma') - S^{(back)}(\sigma')]\, \mathrm{M}(R\sigma'\theta_{ma})\, \mathrm{H}(u\sigma')\, \mathrm{sinc}\!\left( \frac{\sigma'\chi\,\Delta\Omega}{2} \right) e^{2\pi i\chi\sigma'\left(1 - \frac{\Delta\Omega}{4\pi}\right)} \right\} . \tag{6.10d} \]

Comparing this to Eq. (5.76a) in Chapter 5, we note that the right-hand sides of (5.76a) and
(6.10d) become identical if we once again use (6.10b), matching $S(\sigma')$ to
$[S^{(fore)}(\sigma') - S^{(back)}(\sigma')]$. Checking how Appendix 5B is used to transform the right-hand side
of (5.76a) into the right-hand side of (5.76b), we note that again S is just a placeholder function.
This means the mathematical analysis still holds true when $S(\sigma')$ is replaced by
$[S^{(fore)}(\sigma') - S^{(back)}(\sigma')]$. Consequently, we can apply the same transformation used on (5.76a) to
Eq. (6.10d) to get

\[ \mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(cold)}(\chi) \right) \cong \frac{1}{\Delta\sigma} \int_{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} \left\{ \frac{W}{4} [S^{(fore)}(\sigma') - S^{(back)}(\sigma')]\, \mathrm{M}(R\sigma'\theta_{ma})\, \mathrm{H}(u\sigma') \right\} d\sigma' , \tag{6.10e} \]

where

\[ \Delta\sigma = \frac{\Delta\Omega\,\sigma}{2\pi} . \]

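Substituting from Eqs. (5.51a) and (5.57a) of Chapter 5 gives

\[ \mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(cold)}(\chi) \right) \cong \frac{1}{\Delta\sigma} \cdot \left( \frac{W A\,\Delta\Omega}{4} \right) \cdot \int_{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} \left\{ \mathrm{M}(R\sigma'\theta_{ma})\, \mathrm{H}(u\sigma')\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \left[ L^{(fore)}(|\sigma'|) - L^{(back)}(|\sigma'|) \right] \right\} d\sigma' . \tag{6.10g} \]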

According to the discussion following Eq. (5.82c) of Chapter 5, the functions M, H, R, $\eta$, and $\tau_a$
all vary slowly with wavenumber $\sigma$, allowing them to be brought outside the integral in (6.10g).
In well-designed interferometers, it is often true that the background radiances $L^{(fore)}$ and $L^{(back)}$
are also slowly varying functions of $\sigma$, being more or less proportional to a combination of Planck
black-body curves, but for now we can leave open the possibility that this is not the case.
Using the approximations specified in (5.83b) of Chapter 5, Equation (6.10g) can now be written as

\[ \mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(cold)}(\chi) \right) \cong \left( \frac{W A\,\Delta\Omega}{4} \right) \mathrm{M}(R\sigma\theta_{ma})\, \mathrm{H}(u\sigma)\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \cdot \left\{ \left[ \frac{1}{\Delta\sigma} \int_{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L^{(fore)}(|\sigma'|)\, d\sigma' \right] - \left[ \frac{1}{\Delta\sigma} \int_{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L^{(back)}(|\sigma'|)\, d\sigma' \right] \right\} . \tag{6.10h} \]

Equation (6.10h) applies, of course, to the nonideal case where $\Delta\Omega$ is small but not so small that
$\cos\alpha_\varepsilon$ can be approximated by one. Returning to Eq. (6.9), which gives the formula for
$z_C^{(cold)}(\chi)$ when the $\Delta\Omega$ field of view is small enough to approximate $\cos\alpha_\varepsilon$ by one, we take the
forward Fourier transform of both sides of (6.9) to get

\[ \mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(cold)}(\chi) \right) = \left( \frac{W A\,\Delta\Omega}{4} \right) \mathrm{H}(u\sigma)\, \mathrm{M}(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, [L^{(fore)}(|\sigma|) - L^{(back)}(|\sigma|)] . \tag{6.10i} \]

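Comparing Eqs. (6.10h) and (6.10i), we see they can be combined into a single result by writing

\[ \mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(cold)}(\chi) \right) \cong \left( \frac{W A\,\Delta\Omega}{4} \right) \mathrm{H}(u\sigma)\, \mathrm{M}(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, [L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)] , \tag{6.11a} \]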

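where we define, following the pattern of Eq. (5.83e) in Chapter 5, that

\[ L_{FOV}^{(fore)}(|\sigma|) = \begin{cases} L^{(fore)}(|\sigma|) & \text{for small } \Delta\Omega \text{ where } \cos\alpha_\varepsilon \text{ can be approximated as one} \\[2ex] \dfrac{1}{\Delta\sigma} \displaystyle\int_{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L^{(fore)}(|\sigma'|)\, d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where } \cos\alpha_\varepsilon \text{ cannot be approximated as one} \end{cases} \tag{6.11b} \]

and

\[ L_{FOV}^{(back)}(|\sigma|) = \begin{cases} L^{(back)}(|\sigma|) & \text{for small } \Delta\Omega \text{ where } \cos\alpha_\varepsilon \text{ can be approximated as one} \\[2ex] \dfrac{1}{\Delta\sigma} \displaystyle\int_{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L^{(back)}(|\sigma'|)\, d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where } \cos\alpha_\varepsilon \text{ cannot be approximated as one.} \end{cases} \tag{6.11c} \]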
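The second branch of Eqs. (6.11b) and (6.11c) is a boxcar average of the background radiance over a narrow wavenumber window. The sketch below evaluates such an average numerically for positive wavenumbers. It is illustrative only: the function name `l_fov`, the midpoint-rule quadrature, and the sample radiance are hypothetical, and the window width is taken to be ΔΩ·σ/(2π), which is an assumption of this sketch rather than something it verifies.

```python
import numpy as np

def l_fov(radiance, sigma, d_omega, n_samples=2000):
    """Boxcar average of radiance(sigma') around sigma*(1 + d_omega/(4*pi)).

    radiance : callable mapping an array of wavenumbers to radiance values
    sigma    : positive wavenumber at which the average is wanted
    d_omega  : field-of-view solid angle in steradians
    """
    # ASSUMPTION of this sketch: window width delta_sigma = d_omega * sigma / (2*pi).
    delta_sigma = d_omega * sigma / (2.0 * np.pi)
    center = sigma * (1.0 + d_omega / (4.0 * np.pi))
    # Midpoint rule: sample the window at n_samples evenly spaced midpoints
    # and take the mean, which approximates (1/delta_sigma) * integral.
    offsets = (np.arange(n_samples) + 0.5) / n_samples - 0.5
    samples = radiance(center + offsets * delta_sigma)
    return float(np.mean(samples))

# Made-up, slowly varying radiance used only to exercise the function.
def sample_radiance(s):
    return 2.0 + 0.003 * s

averaged = l_fov(sample_radiance, sigma=1000.0, d_omega=0.01)
```

For a radiance that varies linearly across the window, the average equals the radiance at the shifted center wavenumber, which is why slowly varying backgrounds are nearly unaffected by the averaging.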

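The Fourier transform in Eq. (6.11a) can always be reversed to get

\[ z_C^{(cold)}(\chi) \cong \left( \frac{W A\,\Delta\Omega}{4} \right) \int_{-\infty}^{\infty} \mathrm{H}(u\sigma)\, \mathrm{M}(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, [L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{6.12a} \]

This is the formula for $z_C^{(cold)}$ that belongs in Eq. (6.8c).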
Having found the background terms at point C in Fig. 6.2, we now get the background terms
at point B by going back up the signal chain the same way we did for $z_A$, $z_B$, and $z_C$ in Sec. 6.2
above. To evaluate the right-hand side of (6.12a) at point B, we set $\mathrm{H}(u\sigma) = 1$ to get

\[ \left( \frac{W A\,\Delta\Omega}{4} \right) \int_{-\infty}^{\infty} \mathrm{M}(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, [L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\, e^{2\pi i\sigma\chi}\, d\sigma . \]

Unfortunately, to get the complete $z_B^{(cold)}(\chi)$ background signal at point B, we have to add to this
the constant terms removed by the AC coupling of the detector to the rest of the system.98 Since,
according to Eq. (6.4), time and the OPD value $\chi$ are proportional to each other, the
time-independent constant terms are also $\chi$-independent. We note that according to Eq. (6.8b) above,
the total interference signal at point B is

\[ z_B^{(tot)}(\chi) = z_B(\chi) + z_B^{(cold)}(\chi) . \]

Returning to Chapter 5, we compare Eq. (5.59b), which gives the total interference signal at point
B when the interferometer’s field of view is too large for $\cos\alpha_\varepsilon$ to be approximated as one, and
Eq. (5.59c), which gives the total interference signal at point B when the field of view is small
enough for $\cos\alpha_\varepsilon$ to be approximated by one, and see that they both have the same $\chi$-independent
constant terms:

\[ \begin{aligned} \chi\text{-independent terms} ={}& A_{det}\,\Delta\Omega^{(dir)} \int_0^{\infty} R(\sigma)\, L^{(dir)}(\sigma)\, d\sigma \\ &+ \frac{1}{2} \int_0^{\infty} S(\sigma)\, d\sigma + \frac{1}{2} \int_0^{\infty} S^{(fore)}(\sigma)\, d\sigma \\ &+ \frac{A\,\Delta\Omega}{2} \int_0^{\infty} \tau_a(\sigma)\, R(\sigma)\, L^{(back)}(\sigma)\, [\, 2\,|r(\sigma)|^2 - \eta(\sigma)\,]\, d\sigma . \end{aligned} \]

98 See discussion following Eqs. (5.42c) and (5.46c) in Sec. 5.10 of Chapter 5 for more information on AC coupling.
2 0

Because we are only trying to find the constant terms in the $z_B^{(cold)}(\chi)$ background radiance (that
is, constant terms that are still present when the input radiance $L(\sigma) \to 0$ because the instrument
observes a cold scene), we must be careful to drop everything that is zero when $L(\sigma)$ is zero.
Formula (5.40g) in Chapter 5 shows that the integral over $S(\sigma)$ becomes zero when $L(\sigma)$ is zero,
so it should be removed to give

\[ \begin{aligned} &\chi\text{-independent background terms produced by the cold input radiance source} \\ &\quad = A_{det}\,\Delta\Omega^{(dir)} \int_0^{\infty} R(\sigma)\, L^{(dir)}(\sigma)\, d\sigma + \frac{1}{2} \int_0^{\infty} S^{(fore)}(\sigma)\, d\sigma \\ &\qquad + \frac{A\,\Delta\Omega}{2} \int_0^{\infty} \tau_a(\sigma)\, R(\sigma)\, L^{(back)}(\sigma)\, [\, 2\,|r(\sigma)|^2 - \eta(\sigma)\,]\, d\sigma . \end{aligned} \]

Because these $\chi$-independent background terms are the same no matter how $\cos\alpha_\varepsilon$ is
approximated, they correctly represent the constant background terms at point B for all
reasonable sizes of the interferometer’s field of view. Adding them to the $\chi$-dependent terms from
Eq. (6.12a) with H = 1 thus gives the total background interference signal at point B,

\[ \begin{aligned} z_B^{(cold)}(\chi) ={}& \left( \frac{W A\,\Delta\Omega}{4} \right) \int_{-\infty}^{\infty} \mathrm{M}(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, [L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\, e^{2\pi i\sigma\chi}\, d\sigma \\ &+ A_{det}\,\Delta\Omega^{(dir)} \int_0^{\infty} R(\sigma)\, L^{(dir)}(\sigma)\, d\sigma + \frac{1}{2} \int_0^{\infty} S^{(fore)}(\sigma)\, d\sigma \\ &+ \frac{A\,\Delta\Omega}{2} \int_0^{\infty} \tau_a(\sigma)\, R(\sigma)\, L^{(back)}(\sigma)\, [\, 2\,|r(\sigma)|^2 - \eta(\sigma)\,]\, d\sigma . \end{aligned} \tag{6.12b} \]


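We substitute for $S^{(fore)}(\sigma)$ from Eq. (5.51a), with absolute value signs dropped from the $\sigma$
arguments because the integral does not cover negative $\sigma$ values, to get

\[ \begin{aligned} z_B^{(cold)}(\chi) ={}& \left( \frac{W A\,\Delta\Omega}{4} \right) \int_{-\infty}^{\infty} \mathrm{M}(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, [L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\, e^{2\pi i\sigma\chi}\, d\sigma \\ &+ A_{det}\,\Delta\Omega^{(dir)} \int_0^{\infty} R(\sigma)\, L^{(dir)}(\sigma)\, d\sigma + \frac{A\,\Delta\Omega}{2} \int_0^{\infty} R(\sigma)\, \eta(\sigma)\, L^{(fore)}(\sigma)\, \tau_a(\sigma)\, d\sigma \\ &+ \frac{A\,\Delta\Omega}{2} \int_0^{\infty} \tau_a(\sigma)\, R(\sigma)\, L^{(back)}(\sigma)\, [\, 2\,|r(\sigma)|^2 - \eta(\sigma)\,]\, d\sigma . \end{aligned} \tag{6.12c} \]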

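Now that the constant terms have been correctly incorporated into (6.12c), it is easy to get the
formula for $z_A^{(cold)}(\chi)$, the total background signal at point A in Fig. 6.2: just set the detector
responsivity to R = 1. Hence, we see that the total background optical power reaching the
detector is

\[ \begin{aligned} z_A^{(cold)}(\chi) ={}& \left( \frac{W A\,\Delta\Omega}{4} \right) \int_{-\infty}^{\infty} \mathrm{M}(R\sigma\theta_{ma})\, \eta(\sigma)\, \tau_a(|\sigma|)\, [L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\, e^{2\pi i\sigma\chi}\, d\sigma \\ &+ A_{det}\,\Delta\Omega^{(dir)} \int_0^{\infty} L^{(dir)}(\sigma)\, d\sigma + \frac{A\,\Delta\Omega}{2} \int_0^{\infty} \eta(\sigma)\, L^{(fore)}(\sigma)\, \tau_a(\sigma)\, d\sigma \\ &+ \frac{A\,\Delta\Omega}{2} \int_0^{\infty} \tau_a(\sigma)\, L^{(back)}(\sigma)\, [\, 2\,|r(\sigma)|^2 - \eta(\sigma)\,]\, d\sigma . \end{aligned} \tag{6.12d} \]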

Equations (6.5d), (6.6b), (6.6c), and (6.12a)–(6.12d) give all the information needed to make
sense of the formulas (6.8a)–(6.8c) for the $z^{(tot)}(\chi)$ signals at points A, B, and C in Fig. 6.2.

6.5 Background Radiance, Total Error, and Signal Noise


Equations (6.8a)–(6.8c) are not, of course, the complete story because we have not yet considered
random errors in the measurements. In reality, we can never measure $z_A^{(tot)}(\chi)$, $z_B^{(tot)}(\chi)$, and
$z_C^{(tot)}(\chi)$ directly; instead, what we get from any measurement at points A, B, or C are the
noise-contaminated signals $\tilde{z}_{AN}^{(tot)}(\chi)$, $\tilde{z}_{BN}^{(tot)}(\chi)$, and $\tilde{z}_{CN}^{(tot)}(\chi)$ given by the formulas

\[ \tilde{z}_{AN}^{(tot)}(\chi) = z_A(\chi) + z_A^{(cold)}(\chi) + \delta\tilde{z}_A(\chi) , \tag{6.13a} \]

\[ \tilde{z}_{BN}^{(tot)}(\chi) = z_B(\chi) + z_B^{(cold)}(\chi) + \delta\tilde{z}_B(\chi) , \tag{6.13b} \]

and

\[ \tilde{z}_{CN}^{(tot)}(\chi) = z_C(\chi) + z_C^{(cold)}(\chi) + \delta\tilde{z}_C(\chi) . \tag{6.13c} \]

Here $\delta\tilde{z}_A(\chi)$ represents the noise associated with any signal at point A in Fig. 6.2, $\delta\tilde{z}_B(\chi)$
represents the noise associated with any signal at point B in Fig. 6.2, and $\delta\tilde{z}_C(\chi)$ represents the
noise associated with any signal at point C in Fig. 6.2. Just as in Eq. (6.1a) above, the noise
terms have a $\delta$ to show that they are expected to be small, and they have wavy lines or tildes to
show that they are random functions of $\chi$. Tildes are added to $\tilde{z}_{AN}^{(tot)}(\chi)$, $\tilde{z}_{BN}^{(tot)}(\chi)$, and $\tilde{z}_{CN}^{(tot)}(\chi)$ to
show that these signals are also random quantities (because they are contaminated by the random
noise).
As pointed out in Sec. 6.3, the $z^{(cold)}(\chi)$ signals are special cases of the $z^{(tot)}(\chi)$
interferometer signals; they are just the total signals at points A, B, or C when $z_A(\chi)$, $z_B(\chi)$, and
$z_C(\chi)$ are negligible or zero because the interferometer is observing a cold scene having negligible
or zero spectral radiance $L(\sigma)$. Hence, when L is negligible or zero, Eqs. (6.13a)–(6.13c) can be
specialized by writing

\[ \tilde{z}_{AN}^{(cold)}(\chi) = z_A^{(cold)}(\chi) + \delta\tilde{z}_A^{(cold)}(\chi) , \tag{6.13d} \]

\[ \tilde{z}_{BN}^{(cold)}(\chi) = z_B^{(cold)}(\chi) + \delta\tilde{z}_B^{(cold)}(\chi) , \tag{6.13e} \]

and

\[ \tilde{z}_{CN}^{(cold)}(\chi) = z_C^{(cold)}(\chi) + \delta\tilde{z}_C^{(cold)}(\chi) . \tag{6.13f} \]

Here $\tilde{z}_{AN}^{(cold)}$, $\tilde{z}_{BN}^{(cold)}$, and $\tilde{z}_{CN}^{(cold)}$ represent the noise-contaminated signals at points A, B, and C in
Fig. 6.2 for cold-surface observations with negligible or zero $L(\sigma)$, and $\delta\tilde{z}_A^{(cold)}$, $\delta\tilde{z}_B^{(cold)}$, and
$\delta\tilde{z}_C^{(cold)}$ are, of course, the noise components contaminating them.
In a well-designed interferometer, we can assume that many different measurements of $\tilde{z}_{AN}^{(tot)}$,
$\tilde{z}_{BN}^{(tot)}$, and $\tilde{z}_{CN}^{(tot)}$ can be averaged together to produce signals contaminated by only negligible
amounts of random error or noise. This imposes the requirements

\[ \mathrm{E}\!\left( \tilde{z}_{AN}^{(tot)}(\chi) \right) = z_A^{(tot)}(\chi) , \tag{6.14a} \]

\[ \mathrm{E}\!\left( \tilde{z}_{BN}^{(tot)}(\chi) \right) = z_B^{(tot)}(\chi) , \tag{6.14b} \]

and

\[ \mathrm{E}\!\left( \tilde{z}_{CN}^{(tot)}(\chi) \right) = z_C^{(tot)}(\chi) \tag{6.14c} \]

on the average or expected values of random functions $\tilde{z}_{AN}^{(tot)}$, $\tilde{z}_{BN}^{(tot)}$, and $\tilde{z}_{CN}^{(tot)}$. Since the $z^{(cold)}$
signals are just special cases of the $z^{(tot)}$ signals, we can also write

\[ \mathrm{E}\!\left( \tilde{z}_{AN}^{(cold)}(\chi) \right) = z_A^{(cold)}(\chi) , \tag{6.14d} \]

\[ \mathrm{E}\!\left( \tilde{z}_{BN}^{(cold)}(\chi) \right) = z_B^{(cold)}(\chi) , \tag{6.14e} \]

and

\[ \mathrm{E}\!\left( \tilde{z}_{CN}^{(cold)}(\chi) \right) = z_C^{(cold)}(\chi) . \tag{6.14f} \]

We substitute (6.13a)–(6.13c) into (6.14a)–(6.14c) and use the linearity of the expectation
operator E as explained in Sec. 3.10 of Chapter 3 to get

\[ \mathrm{E}\!\left( z_{A,B,C}(\chi) + z_{A,B,C}^{(cold)}(\chi) \right) + \mathrm{E}\!\left( \delta\tilde{z}_{A,B,C}(\chi) \right) = z_{A,B,C}^{(tot)}(\chi) . \]

Substitution of (6.8a)–(6.8c) gives

\[ \mathrm{E}\!\left( z_{A,B,C}^{(tot)}(\chi) \right) + \mathrm{E}\!\left( \delta\tilde{z}_{A,B,C}(\chi) \right) = z_{A,B,C}^{(tot)}(\chi) , \]

which becomes, using Eq. (3.9f) of Chapter 3,

\[ \mathrm{E}\!\left( \delta\tilde{z}_{A,B,C}(\chi) \right) = 0 . \tag{6.14g} \]

Similarly, we can substitute (6.13d)–(6.13f) into (6.14d)–(6.14f) and use the linearity of the
expectation operator to get

\[ \mathrm{E}\!\left( z_{A,B,C}^{(cold)}(\chi) \right) + \mathrm{E}\!\left( \delta\tilde{z}_{A,B,C}^{(cold)}(\chi) \right) = z_{A,B,C}^{(cold)}(\chi) . \]

According to Eq. (3.9f) of Chapter 3, this also reduces to

\[ \mathrm{E}\!\left( \delta\tilde{z}_{A,B,C}^{(cold)}(\chi) \right) = 0 . \tag{6.14h} \]

Equations (6.14g) and (6.14h) require the expectation or average values of the random functions
representing the noise to be equal to zero at every OPD value $\chi$. From this point on, we can think
of the $\delta\tilde{z}$ signal “noise” as a random signal error whose expectation value is always zero.
Function $L_{mnf}(\sigma)$ is, according to the discussion in Sec. 6.1, the best and most accurate, though
nonrandomly distorted, spectral-radiance measurement that an interferometer can produce. It can be
recovered from the noise-free signal at points A, B, or C in Fig. 6.2; however, all that we get from a single
measurement is the noise-contaminated signal $\tilde{z}_{AN}^{(tot)}$, $\tilde{z}_{BN}^{(tot)}$, $\tilde{z}_{CN}^{(tot)}$ or (when looking at a cold
surface) $\tilde{z}_{AN}^{(cold)}$, $\tilde{z}_{BN}^{(cold)}$, $\tilde{z}_{CN}^{(cold)}$. In principle, we could average together large numbers of
measurements to get, according to Eqs. (6.14a)–(6.14f), both of the noise-free signals $z_{A,B,C}^{(tot)}$ and
$z_{A,B,C}^{(cold)}$. Then, following the recipe in Eqs. (6.8d)–(6.8f), the noise-free $z^{(cold)}$ signal could be
subtracted from the noise-free $z^{(tot)}$ signal to get $z_A(\chi)$, $z_B(\chi)$, or $z_C(\chi)$ at points A, B, or C in Fig.
6.2. This is exactly what is needed to gain access to $L_{mnf}(\sigma)$; unfortunately, it is also impractical.
Typically, enough work is invested in calibrating an interferometer to produce very high-quality
estimates of the $z^{(cold)}$ signals if we want them. Even when we calibrate in the spectral domain, as
discussed in Sec. 5.19 of Chapter 5, the calibration algorithm requires substantially noise-free
signal spectra from which we could extract substantially noise-free $z^{(cold)}$ signals. When making
everyday measurements, on the other hand, we end up relying on lower-quality information;
that is, we use the noise-contaminated $\tilde{z}_{AN}^{(tot)}$, $\tilde{z}_{BN}^{(tot)}$, $\tilde{z}_{CN}^{(tot)}$ signals or their equivalents. Everyday
measurements are less accurate than the information used to calibrate the interferometer because
that is what it means to calibrate an instrument: however accurate the everyday measurement, and
however many noise-suppression averages go into its making, we expect the calibration to be
done with even greater care. Hence, when analyzing the $z_{A,B,C}$ signals generated by the input $L(\sigma)$
radiance, we can assume there is always enough noise-free data to subtract off, if only as a
thought experiment, the nonrandom functions $z_{A,B,C}^{(cold)}$ from the random functions $\tilde{z}_{AN,BN,CN}^{(tot)}$ to get

\[ \tilde{z}_{AN}(\chi) = \tilde{z}_{AN}^{(tot)}(\chi) - z_A^{(cold)}(\chi) , \tag{6.15a} \]

\[ \tilde{z}_{BN}(\chi) = \tilde{z}_{BN}^{(tot)}(\chi) - z_B^{(cold)}(\chi) , \tag{6.15b} \]

and

\[ \tilde{z}_{CN}(\chi) = \tilde{z}_{CN}^{(tot)}(\chi) - z_C^{(cold)}(\chi) . \tag{6.15c} \]

Substitution of (6.13a)–(6.13c) now gives

\[ \tilde{z}_{AN}(\chi) = z_A(\chi) + \delta\tilde{z}_A(\chi) , \tag{6.16a} \]

\[ \tilde{z}_{BN}(\chi) = z_B(\chi) + \delta\tilde{z}_B(\chi) , \tag{6.16b} \]

and

\[ \tilde{z}_{CN}(\chi) = z_C(\chi) + \delta\tilde{z}_C(\chi) . \tag{6.16c} \]

Equations (6.16a)–(6.16c) show that any noise in the signals at points A, B, or C in Fig. 6.2
“automatically” ends up attached to $z_{A,B,C}$; that is, it ends up attached to the signal component
used to recover the $L_{mnf}(\sigma)$ spectral radiance measured by the interferometer.

6.6 Detector Noise


Detectors are usually the largest and most noticeable source of noise in interferometer
measurements. Detector noise enters the signal chain at point B in Fig. 6.2; this is where it first
shows up as a random error contaminating the signal. As a general rule, detector noise has many
high-frequency components, changing very rapidly with time as the detector is being used.
During a spectral measurement the moving mirror moves at a steady rate, making the OPD value
$\chi$ directly proportional to time [as required by Eq. (6.4)]. Consequently the detector noise can also
be written as a rapidly changing random function of $\chi$ at point B,

\[ \delta\tilde{z}_B(\chi) = \tilde{n}^{(det)}(\chi) , \tag{6.17a} \]

and we expect it to obey Eq. (6.14g),

\[ \mathrm{E}\!\left( \tilde{n}^{(det)}(\chi) \right) = 0 . \tag{6.17b} \]

Equation (6.13b) above can now be written as

\[ \tilde{z}_{BN}^{(tot)}(\chi) = z_B(\chi) + z_B^{(cold)}(\chi) + \tilde{n}^{(det)}(\chi) . \tag{6.17c} \]

Since only detector noise is being analyzed in this chapter, we specify here that only negligible
amounts of noise occur “upstream” of point B in Fig. 6.2 by setting

\[ \delta\tilde{z}_A(\chi) = 0 \tag{6.17d} \]

in Eqs. (6.13a) and (6.16a). We also assume that only negligible amounts of extra noise enter the
signal chain downstream of point C, which means that $\delta\tilde{z}_C$ in (6.13c) and (6.16c) comes entirely
from the transmission of $\delta\tilde{z}_B = \tilde{n}^{(det)}$ between points B and C. Our job is to find what $\delta\tilde{z}_C$ looks
like in terms of $\tilde{n}^{(det)}$ and then to use that information to find a formula for the NEdN due to
detector noise.
Many Fourier-transform systems go to great lengths to minimize detector noise. Some tactics
are obvious: for example, careful choice and treatment of detectors so that they perform well
and do not generate large amounts of random error. Other tactics are perhaps less obvious: for
example, averaging together a large number of interferogram signals to reduce the detector noise
present. Section 3.12 of Chapter 3 has a discussion of how averaging of identical,
noise-contaminated signals works to reduce random error; and of course Fourier-transform
interferometer signals are put through computers to extract spectra, making it easy to store and
average them. This sort of averaging often involves the combination of many different independent
measurements at the same OPD value $\chi$ and is often referred to as “co-adding” the interferograms.
(It should not be confused with the averaging discussed in Sec. 6.8 below, where we talk about
averaging together the signal values at $\chi$ and $-\chi$.) There are two points that should be kept in
mind when reading the balance of this chapter:

(1) However much effort is put into co-adding interferograms to reduce noise, almost
always—as discussed at the end of the previous section—even more effort is put
into processing the calibration data to reduce noise; and
(2) The $\tilde{n}^{(det)}$ random function in Eq. (6.17a) above can be taken to represent the amount
of noise that still contaminates the signal after co-adding has occurred.

We can, in effect, pretend that co-adding is something that happens to the signal immediately
after it leaves the detector, acting to reduce the noise at point B and all points further downstream
in the signal processing chain of Fig. 6.2.
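The effect of co-adding can be illustrated with a small simulation. The sketch below is not a model of any real detector: it assumes independent, zero-mean Gaussian noise of equal strength in every scan, and the noise-free interferogram, the noise level, and the number of scans are all made-up values. Under those assumptions the residual noise after averaging N scans is expected to be about 1/√N of the single-scan noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical OPD grid and made-up noise-free interferogram.
chi = np.linspace(-1.0, 1.0, 256)
z_clean = np.cos(2 * np.pi * 5 * chi) * np.exp(-(chi / 0.4) ** 2)

sigma_noise = 0.5   # assumed single-scan noise standard deviation
n_scans = 400       # assumed number of co-added scans

# Each row is one noisy measurement of the same interferogram.
scans = z_clean + rng.normal(0.0, sigma_noise, size=(n_scans, chi.size))

# Co-adding: average the independent measurements at each OPD value.
coadded = scans.mean(axis=0)

# RMS of the noise left after co-adding, and the 1/sqrt(N) expectation
# that holds for independent zero-mean noise.
residual_rms = np.sqrt(np.mean((coadded - z_clean) ** 2))
expected_rms = sigma_noise / np.sqrt(n_scans)
```

Comparing `residual_rms` with `expected_rms` shows the reduction; noise that is correlated from scan to scan would not average down this quickly.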

6.7 1/f Noise in Detectors


Calibrations tend to go “stale”; that is, the more time there is between when an instrument is
calibrated and when it is used, the less accurate the measurements are. In particular this is true of
the detectors in Fourier-transform spectrometers, or indeed the detectors of any optical
instrument—the longer the time between calibration and use, the less accurately do we know how
the detectors respond to incoming photons. In optical detectors this phenomenon is often referred
to as “1/f noise” for reasons that will be explained below.
Suppose, as a thought experiment if nothing else, that a collection of K identical detectors are
all calibrated at the same time and we then keep track of their random errors as the calibrations
go stale. (Note that the error due to the calibrations going stale must be random or else we could
study how the detector response changes with time and correct for it.) We do this over a very
long time interval $\tau$. From this set of data we then select, for each detector, a subset of data
covering a time interval 2T with $2T \ll \tau$, and the 2T time interval is then used to construct K
error functions, one for each of the k = 1, 2, …, K identical detectors. We call these functions

\[ n_k^{(det)}(t) = \text{measured detector noise for the } k\text{th detector as a function of time } t \text{ with } -T \le t \le T . \]


Note that although these are error functions, they are not random since they represent the actual
measured error for each detector. Because the detectors are all identical, each $n_k^{(det)}(t)$ can be
thought of as a specific instance of the same random function $\tilde{n}^{(det)}(t)$; that is, each $n_k^{(det)}(t)$ can
be treated as a typical member of the ensemble of functions associated with the $\tilde{n}^{(det)}(t)$ random
function.99 Returning briefly to Sec. 3.23 of Chapter 3, we use Eq. (3.56a) to calculate another set
of functions,

\[ N_T^{(k)}(f) = \int_{-T}^{T} n_k^{(det)}(t)\, e^{-2\pi i f t}\, dt . \]

Each $N_T^{(k)}(f)$ can be regarded as a member of the ensemble of functions associated with random
function

\[ \tilde{N}_T(f) = \int_{-T}^{T} \tilde{n}^{(det)}(t)\, e^{-2\pi i f t}\, dt . \]

Formula (3.57g) in Chapter 3 then states that the noise-power spectrum of $\tilde{n}^{(det)}(t)$ is

\[ S_{\tilde{n}\tilde{n}}(f) = \lim_{T \to \infty} \left[ \frac{ \mathrm{E}\!\left( \left| \tilde{N}_T(f) \right|^2 \right) }{2T} \right] . \]

Because the expected value of a random quantity can be estimated by taking its average, we can
write that

\[ S_{\tilde{n}\tilde{n}}(f) \cong \lim_{T \to \infty} \left[ \frac{1}{2T} \cdot \frac{1}{K} \sum_{k=1}^{K} \left| N_T^{(k)}(f) \right|^2 \right] , \]

where K is the number of detectors, and the formula reduces to

\[ S_{\tilde{n}\tilde{n}}(f) \cong \frac{1}{K} \sum_{k=1}^{K} \frac{ \left| N_T^{(k)}(f) \right|^2 }{2T} \]

when we assume that T is large enough for the value of

\[ \frac{ \left| N_T^{(k)}(f) \right|^2 }{2T} \]

to be close to its limit as $T \to \infty$. This result shows how to calculate the noise-power spectrum of
the K identical detectors. When discussing 1/f noise, it is customary to introduce one final step:
using Eq. (3.58b) to go from the double-sided power spectrum $S_{\tilde{n}\tilde{n}}$ to the single-sided power
spectrum $S_{\tilde{n}\tilde{n}}^{(1)}$,

\[ S_{\tilde{n}\tilde{n}}^{(1)}(f) = 2\, S_{\tilde{n}\tilde{n}}(f) \quad \text{for } f \ge 0 . \]

This is really just a change of scale (doubling the size of the noise-power spectrum) along with
an agreement to ignore the negative f values because they are always the same as the positive
ones [see Eq. (3.49b) in Chapter 3].

99 See Sec. 3.14 of Chapter 3 for an explanation of what is meant by an ensemble of functions.
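The ensemble estimate of the noise-power spectrum can be sketched numerically by replacing the finite Fourier integral with a discrete Fourier transform. This is only an illustration under stated assumptions: the records are uniformly sampled with spacing `dt` over a span 2T = N·dt, the test data are synthetic white Gaussian noise rather than measured detector errors, and the function names are hypothetical.

```python
import numpy as np

def noise_power_spectrum(records, dt):
    """Ensemble-averaged estimate of the double-sided noise-power spectrum.

    records : array of shape (K, N); row k holds n_k(t) sampled every dt
              over a time span 2T = N * dt.
    Returns (freqs, s_double) in the ordering used by numpy.fft.fftfreq.
    """
    records = np.asarray(records, dtype=float)
    n_samples = records.shape[1]
    two_t = n_samples * dt
    # dt * DFT approximates the finite Fourier integral N_T^(k)(f); the phase
    # factor that comes from starting the record at t = -T cancels in |N_T|^2.
    n_t = dt * np.fft.fft(records, axis=1)
    # Average |N_T^(k)(f)|^2 / (2T) over the K records.
    s_double = np.mean(np.abs(n_t) ** 2, axis=0) / two_t
    freqs = np.fft.fftfreq(n_samples, d=dt)
    return freqs, s_double

def single_sided(freqs, s_double):
    """Keep f >= 0 and double the spectrum, as in S^(1)(f) = 2 S(f)."""
    keep = freqs >= 0
    return freqs[keep], 2.0 * s_double[keep]

# Synthetic example: K = 200 white-noise records (an assumption, not data).
rng = np.random.default_rng(2)
dt, std = 1.0e-3, 2.0
records = rng.normal(0.0, std, size=(200, 1024))
freqs, s_double = noise_power_spectrum(records, dt)
f1, s1 = single_sided(freqs, s_double)
```

For sampled white noise of variance `std**2`, the double-sided estimate should scatter around `std**2 * dt`, which gives a quick sanity check on the normalization.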
Figure 6.3(a) shows a typical plot, for detector noise, of $S_{\tilde{n}\tilde{n}}^{(1)}$ versus f on a log-log scale. For
most detectors, there is a “corner” frequency $f_c$ such that when $f > f_c$ the value of $S_{\tilde{n}\tilde{n}}^{(1)}$ is
essentially constant over a wide range of frequencies (before rolling off at very high f). When
$f < f_c$, on the other hand, the value of $S_{\tilde{n}\tilde{n}}^{(1)}$ is typically proportional to $1/f^{\alpha}$, with $\alpha$ approximately
equal to one. Low frequencies correspond to long time intervals, so the growth in the value of
$S_{\tilde{n}\tilde{n}}^{(1)}$ as f gets small reflects the way detector calibrations go stale as time goes by. It has become
convenient to refer to this phenomenon as detector 1/f noise because in many detectors the corner
frequency fc is relatively large, meaning that their calibrations start to go stale in a very small
fraction of a second. We like to set up Fourier-transform systems so that the low-frequency noise
at f < fc cannot significantly contaminate our measurements. The basic strategy for doing this is
to use high-quality detectors—meaning that fc is small—and calibrate often enough that 1/f noise
does not become important. This is only the first line of defense; there are other ways of
minimizing the effect of 1/f noise and they will be pointed out in the remainder of the chapter
when appropriate.
A mathematical point often ignored in elementary discussions of 1/f noise is that if noise-
power spectra are 1/f all the way down to zero frequency, then integrals over frequency that
include the zero must diverge—that is, they become infinite. Standard treatments of random
function theory require the use of these integrals. Equation (3.48d) in Chapter 3, for example,
shows that Rññ(0) is equal to the integral of the power spectrum over all frequency values—
including, of course, f=0. Hence, the integral formula for Rññ(0) diverges when the power
spectrum is 1/f all the way down to zero. According to Eq. (3.48a) in Chapter 3, Rññ(0) is just the
squared standard deviation of the random function ñ at any time t. This squared standard
deviation must have a well-defined value to describe the detector noise accurately. Consequently,
the integral for Rññ(0) cannot be allowed to diverge. Perhaps the quickest way out of this problem
is to note that zero frequency corresponds to the most recent calibration occurring an infinite time
in the past; so, as long as the detectors have been calibrated more recently than that, we do not
expect the 1/f region of the noise-power spectrum to extend all the way down to zero. In general,
when the 1/f form of the noise-power spectrum leads to problems near f=0, it means that an
important aspect of the random error, an aspect which prevents the 1/f noise from producing
infinite integrals, has been left out of the noise model.
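The divergence can be made concrete with a toy spectrum. The sketch below assumes, purely for illustration, a single-sided spectrum of the form S_w·(1 + f_c/f) between a lower cutoff `f_lo` and an upper cutoff `f_hi`; this functional form, the cutoffs, and the function name are hypothetical and are not a model taken from the text. Integrating it in closed form gives a noise variance that grows like log(1/f_lo) as the lower cutoff approaches zero.

```python
import math

def variance_one_over_f(s_white, f_corner, f_lo, f_hi):
    """Integral of s_white * (1 + f_corner / f) over f_lo <= f <= f_hi.

    The white part contributes s_white * (f_hi - f_lo); the 1/f part
    contributes s_white * f_corner * ln(f_hi / f_lo), which has no finite
    limit as f_lo -> 0.
    """
    if not (0.0 < f_lo < f_hi):
        raise ValueError("need 0 < f_lo < f_hi")
    return s_white * ((f_hi - f_lo) + f_corner * math.log(f_hi / f_lo))

# Lower the cutoff by three decades at a time (illustrative numbers only).
cutoffs = (1.0, 1e-3, 1e-6, 1e-9)
values = [variance_one_over_f(1.0, 10.0, f_lo, 1000.0) for f_lo in cutoffs]
```

Each three-decade reduction of the lower cutoff adds nearly the same increment, `f_corner * ln(1000)` times the white level, so the total never converges; any finite lower cutoff keeps the variance finite.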


Since 1/f noise can usually be neglected when analyzing the effects of detector noise on the
spectral measurements of well-designed Fourier-transform spectrometers, many models of
detector noise assume that it can be approximated as band-limited white noise of the type
discussed in Sec. 3.25 of Chapter 3. The white-noise level used for this approximation is typically
given by the level part of the power spectrum for frequencies f > fc in Fig. 6.3(a) (before the roll
off at very high frequencies). In this chapter—except for Secs. 6.13 and 6.15—we do not make
this sort of approximation, which is why the treatment of detector noise given here may seem
overly elaborate to those familiar with other presentations of the topic. Modeling detector noise
as band-limited white noise may capture most of the basic features of detector noise in
Fourier-transform spectrometers, but it can be misleading when analyzing the effects of 1/f noise and
other types of nonstandard detector noise.

6.8 Avoidable and Unavoidable Noise in Double-Sided Signals


Equation (6.15b) shows that when specialized calibration procedures are used to measure the
$z_B^{(cold)}(\chi)$ signal, we can then subtract it from $\tilde{z}_{BN}^{(tot)}(\chi)$, giving us access to the
noise-contaminated interferogram signal $\tilde{z}_{BN}(\chi)$ specified in Eq. (6.16b). According to (6.17a), when
analyzing detector noise the signal in (6.16b) should be written as

\[ \tilde{z}_{BN}(\chi) = z_B(\chi) + \tilde{n}^{(det)}(\chi) . \tag{6.18a} \]

From Eq. (6.6b), we know that the noise-free signal $z_B(\chi)$ in (6.18a) is

\[ \begin{aligned} z_B(\chi) ={}& \frac{A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} \eta(\sigma)\, R(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L(|\sigma|)\, d\sigma \\ &+ \frac{W A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} \mathrm{M}(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|)\, e^{2\pi i\sigma\chi}\, d\sigma . \end{aligned} \]

Consulting Eqs. (4.139g) of Chapter 4 and (5.10f) of Chapter 5, we see that $\eta$ and M are even
functions of $\sigma$. This turns the second integral on the right-hand side into the inverse Fourier
transform of a real and even function of $\sigma$. Therefore, according to entry 1 of Table 2.1 in Chapter
2, the integral itself is a real and even function of $\chi$. Because the first integral on the right-hand
side is a constant, independent of $\chi$, we conclude that the noise-free signal $z_B(\chi)$ must also be a
real and even function of $\chi$,

\[ z_B(-\chi) = z_B(\chi) . \tag{6.18b} \]

Glancing back at the formula for $\tilde{z}_{BN}(\chi)$ in Eq. (6.18a), we see that the detector noise $\tilde{n}^{(det)}(\chi)$
is, however, another story: it would be strange indeed if the random error coming from the
detector were an even function of $\chi$. The detector cannot possibly care what the position of the
moving mirror is; the only reason $\tilde{n}^{(det)}$ depends on $\chi$ is that we acknowledge $\tilde{n}^{(det)}$ to be a
function of time and then use Eq. (6.4) to make it a function of $\chi$. Consequently $\tilde{z}_{BN}$, the sum of $z_B$
and $\tilde{n}^{(det)}$ in (6.18a), is an uneven function of $\chi$ only because it is a noise-contaminated signal.
This distinction between $z_B(\chi)$ and $\tilde{n}^{(det)}(\chi)$, that one is an even function and the other is not, can,
in principle, be used to reduce the NEdN of the interferometer’s spectral measurements. (In
practice we always have to worry about the distorting effect of any circuit used to measure the
detector signal; see for example the discussion of the detector circuit in Sec. 5.12 of Chapter 5.)
For this reason, we say that some of the noise contributed to $z_B(\chi)$ by $\tilde{n}^{(det)}(\chi)$ is avoidable
noise, that is, noise that can be eliminated by an intelligent analysis of the $\tilde{z}_{BN}$ signal.
Perhaps the quickest way to distinguish the avoidable and unavoidable noise in zBN ( χ ) is to
recall the discussion following Eq. (2.11b) in Chapter 2, where it is pointed out that any function
can be written as the sum of even and odd components. Hence, we can always write

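\[
n^{(\mathrm{det})}(\chi) = n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi) , \tag{6.19a}
\]

with

\[
n_e^{(\mathrm{det})}(\chi) = \frac{1}{2}\left[ n^{(\mathrm{det})}(\chi) + n^{(\mathrm{det})}(-\chi) \right] \tag{6.19b}
\]

and

\[
n_o^{(\mathrm{det})}(\chi) = \frac{1}{2}\left[ n^{(\mathrm{det})}(\chi) - n^{(\mathrm{det})}(-\chi) \right] . \tag{6.19c}
\]

Here $n_e^{(\mathrm{det})}$ is the even component of $n^{(\mathrm{det})}$ and $n_o^{(\mathrm{det})}$ is the odd component of $n^{(\mathrm{det})}$,

\[
n_e^{(\mathrm{det})}(-\chi) = n_e^{(\mathrm{det})}(\chi) \tag{6.19d}
\]

and

\[
n_o^{(\mathrm{det})}(-\chi) = -n_o^{(\mathrm{det})}(\chi) . \tag{6.19e}
\]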

Equations (6.19d) and (6.19e) are just the definition of what it means for a function to be even or
odd [see Eqs. (2.11a) and (2.11b) in Chapter 2], and it is easy to see that (6.19d) and (6.19e) are
true by checking what happens when the sign of the argument is changed in formulas (6.19b) and
(6.19c). Substitution of (6.19a) into (6.18a) gives

zBN ( χ ) = [ z B ( χ ) + ne(det) ( χ )] + no(det) ( χ ) . (6.19f)

Avoidable and Unavoidable Noise in Double-Sided Signals · 6.8

A little thought shows that $n_e^{(\mathrm{det})}(\chi)$ must be the unavoidable component of the noise, because there is no way to distinguish the noise-contaminated sum inside the square brackets [ ] from a noise-free measurement of a $z_B(\chi)$ interference signal. The $n_o^{(\mathrm{det})}(\chi)$ noise, on the other hand, is an avoidable source of error. We could, for example, eliminate it by averaging together $z_{BN}(\chi)$ and $z_{BN}(-\chi)$,

\[
\begin{aligned}
\frac{1}{2}\left[ z_{BN}(\chi) + z_{BN}(-\chi) \right]
&= \frac{1}{2}\left[ z_B(\chi) + n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi) \right]
 + \frac{1}{2}\left[ z_B(-\chi) + n_e^{(\mathrm{det})}(-\chi) + n_o^{(\mathrm{det})}(-\chi) \right] \\
&= \frac{1}{2}\left[ z_B(\chi) + z_B(-\chi) \right]
 + \frac{1}{2}\left[ n_e^{(\mathrm{det})}(\chi) + n_e^{(\mathrm{det})}(-\chi) \right]
 + \frac{1}{2}\left[ n_o^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(-\chi) \right] \\
&= z_B(\chi) + n_e^{(\mathrm{det})}(\chi) ,
\end{aligned}
\]

where in the last step Eqs. (6.18b), (6.19d), and (6.19e) are used to show that the average produces signal $z_B(\chi)$ contaminated only by $n_e^{(\mathrm{det})}(\chi)$, the unavoidable even-noise component. Although in practice the avoidable noise $n_o^{(\mathrm{det})}(\chi)$ is usually not averaged away at this point in the signal processing chain, it could in principle be eliminated this way. To show that the $n_o^{(\mathrm{det})}(\chi)$ avoidable noise has not yet been eliminated from the noise-contaminated signal, we substitute (6.19a) into (6.17c) to get

\[
z_{BN}^{(\mathrm{tot})}(\chi) = z_B(\chi) + z_B^{(\mathrm{cold})}(\chi) + n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi) . \tag{6.19g}
\]

For now, this is still the signal we trace through the signal chain, always remembering that only the $n_e^{(\mathrm{det})}$ noise component is an unavoidable source of signal contamination.
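The even/odd split of Eqs. (6.19a)–(6.19e) and the averaging of $z_{BN}(\chi)$ with $z_{BN}(-\chi)$ can be checked numerically. The following is a minimal Python sketch under stated assumptions: the symmetric sample grid, the test signal `z_b`, and the Gaussian noise model are illustrative choices, not quantities taken from the text.

```python
import numpy as np

def even_odd_parts(f):
    """Split samples of f(chi), taken on a grid symmetric about chi = 0,
    into even and odd components as in Eqs. (6.19b) and (6.19c)."""
    f_rev = f[::-1]                      # samples of f(-chi) on a symmetric grid
    return 0.5 * (f + f_rev), 0.5 * (f - f_rev)

rng = np.random.default_rng(0)           # fixed seed, illustration only
chi = np.linspace(-1.0, 1.0, 201)        # hypothetical symmetric grid of path differences
z_b = np.cos(2 * np.pi * 3.0 * chi) * np.exp(-chi**2)   # hypothetical even noise-free signal
noise = rng.normal(0.0, 0.1, chi.size)   # hypothetical detector noise n_det(chi)

n_even, n_odd = even_odd_parts(noise)
z_bn = z_b + noise                       # Eq. (6.18a)
z_avg = 0.5 * (z_bn + z_bn[::-1])        # average of z_BN(chi) and z_BN(-chi)

# After averaging, only the even noise component should remain.
residual = z_avg - (z_b + n_even)
print(np.max(np.abs(residual)))
```

The printed residual should be at the level of floating-point rounding, which is the discrete counterpart of the statement that the average leaves $z_B(\chi)$ contaminated only by $n_e^{(\mathrm{det})}(\chi)$.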

6.9 Passing the Detector Noise Through the Detector Circuit


The discussion following Eq. (5.43) of Chapter 5 points out that the detector circuit must be
linear, which means it obeys the rules outlined in Appendix 5A of Chapter 5. The analysis
following Eq. (5A.2a) in Appendix 5A of Chapter 5 shows that if the input to the detector circuit
is the sum of two signals, for example,


\[
\left[ z_B(\chi) + z_B^{(\mathrm{cold})}(\chi) \right] \quad \text{and} \quad \left[ n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi) \right] ,
\]

as in Eq. (6.19g), then the output signal must be the sum of the outputs generated by each signal going through the circuit separately. We already know, according to Eqs. (6.8b) and (6.8c) above, that the output corresponding to input

\[
\left[ z_B(\chi) + z_B^{(\mathrm{cold})}(\chi) \right] \quad \text{is} \quad \left[ z_C(\chi) + z_C^{(\mathrm{cold})}(\chi) \right] ;
\]

and we also know that the total signal plus noise leaving the detector circuit at point C is, according to Eq. (6.13c),

\[
z_{CN}^{(\mathrm{tot})}(\chi) = z_C(\chi) + z_C^{(\mathrm{cold})}(\chi) + \delta z_C(\chi) . \tag{6.20a}
\]

Hence $\delta z_C(\chi)$, the noise contaminating the signal at point C, is the signal we would get when passing the sum in Eq. (6.19a),

\[
n^{(\mathrm{det})}(\chi) = n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi) ,
\]

of both the avoidable and unavoidable noise through the detector circuit as a separate signal.
The first step in sending the total detector noise $n^{(\mathrm{det})}(\chi)$ through the detector circuit is to use Eq. (6.4) above to convert

\[
n^{(\mathrm{det})} = n^{(\mathrm{det})}(ut) \tag{6.20b}
\]

into a function of time. Then, using formula (5A.1a) in Appendix 5A of Chapter 5, we know the corresponding output is

\[
\int_{-\infty}^{\infty} n^{(\mathrm{det})}(ut')\, h(t - t')\, dt' ,
\]

where h(t) is the impulse-response function of the detector circuit. Following the suggestion in Eq. (6.4), we change the variable of integration to $\chi' = ut'$. The detector circuit’s output corresponding to the $n^{(\mathrm{det})}$ input is then

\[
\frac{1}{u} \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi')\, h\!\left( t - \frac{\chi'}{u} \right) d\chi' .
\]

Now we substitute $t = \chi / u$ from Eq. (6.4) to get the noise output corresponding to input $n^{(\mathrm{det})}$ as a function of χ,



\[
\frac{1}{u} \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' .
\]

The discussion following Eq. (6.20a) above shows that $\delta z_C(\chi)$ must be exactly this integral, that is, the output of the detector circuit corresponding to input $n^{(\mathrm{det})}$. Therefore, we can write

\[
\delta z_C(\chi) = \frac{1}{u} \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . \tag{6.20c}
\]

Glancing back at the definition of the convolution in Eq. (2.38a) of Chapter 2, we note that this can also be written as

\[
\delta z_C(\chi) = \frac{1}{u} \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] . \tag{6.20d}
\]

Equations (6.20c) and (6.20d) are exact formulas for $\delta z_C(\chi)$, but there is also an approximation for it that is often useful. According to the analysis at the beginning of Appendix 5A to Chapter 5, when h(t) is a narrow function of time the output of the detector circuit is just a slightly blurred and distorted version of the input; and, according to the discussion at the end of Sec. 5.12 of Chapter 5, detector circuits are typically designed to produce this sort of output. We can almost always assume that h(t) is relatively narrow, that is, that there exists a time T such that h(t) is negligible when t lies outside the time interval between −T and +T,

\[
h(t) \approx 0 \quad \text{for} \quad |t| > T . \tag{6.21a}
\]

In fact, if h is causal, we can also assume that h(t) = 0 for t < 0 [see Eq. (5A.5) in Appendix 5A of Chapter 5]. Therefore the time-based output of the detector circuit can be approximated as

\[
\int_{-\infty}^{\infty} n^{(\mathrm{det})}(ut')\, h(t - t')\, dt' \cong \int_{t-T}^{t+T} n^{(\mathrm{det})}(ut')\, h(t - t')\, dt' . \tag{6.21b}
\]

Again we change the $t'$ dummy variable of integration to $\chi' = ut'$ and replace the time parameter t by $t = \chi / u$ to get

\[
\frac{1}{u} \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi'
\cong \frac{1}{u} \int_{\chi - uT}^{\chi + uT} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . \tag{6.21c}
\]


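According to the definition of convolution in Eq. (2.38a) of Chapter 2, this can also be written as

\[
n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \cong \int_{\chi - uT}^{\chi + uT} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . \tag{6.21d}
\]

Hence Eq. (6.20d) can be approximated as

\[
\delta z_C(\chi) \cong \frac{1}{u} \int_{\chi - uT}^{\chi + uT} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . \tag{6.21e}
\]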
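The step from the exact integral (6.20c) to the windowed integral (6.21e) can be illustrated with a Riemann-sum sketch in Python. Everything numerical here is an assumption made for illustration: the exponential impulse response `h`, the values of `u`, `T`, and `tau`, and the white-noise samples standing in for $n^{(\mathrm{det})}(\chi)$.

```python
import numpy as np

u = 2.0       # hypothetical rate of change of chi with time, chi = u * t
tau = 0.01    # hypothetical time constant of the detector circuit
T = 0.1       # hypothetical time beyond which h(t) is treated as negligible (10 * tau)

def h(t):
    """Hypothetical causal impulse response: exponential decay, zero for t < 0."""
    t = np.asarray(t, dtype=float)
    return np.where(t >= 0.0, np.exp(-np.clip(t, 0.0, None) / tau) / tau, 0.0)

dchi = 1e-3
chi_p = np.arange(-5.0, 5.0 + dchi, dchi)     # grid for the integration variable chi'
rng = np.random.default_rng(1)                # fixed seed, illustration only
n_det = rng.normal(0.0, 1.0, chi_p.size)      # hypothetical noise samples n_det(chi')

def delta_z_c(chi, window=None):
    """Riemann sum for Eq. (6.20c); with window = u*T it approximates Eq. (6.21e)."""
    kernel = h((chi - chi_p) / u)
    if window is not None:
        kernel = np.where(np.abs(chi - chi_p) <= window, kernel, 0.0)
    return np.sum(n_det * kernel) * dchi / u

chi0 = 0.3
full = delta_z_c(chi0)                        # integration over the whole grid
windowed = delta_z_c(chi0, window=u * T)      # integration over chi0 - uT .. chi0 + uT
print(full, windowed, abs(full - windowed))
```

With T chosen as several time constants of the assumed circuit, the two sums should agree closely, which is the sense in which the narrowness of h(t) justifies the finite limits.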

6.10 Total Detector Noise in Double-Sided Signals


Having found the formula for $\delta z_C(\chi)$, we substitute (6.20d) into (6.20a) to get the total noise-contaminated signal at point C in Fig. 6.2,

\[
z_{CN}^{(\mathrm{tot})}(\chi) = z_C(\chi) + z_C^{(\mathrm{cold})}(\chi) + \frac{1}{u} \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] . \tag{6.22a}
\]

We multiply $z_{CN}^{(\mathrm{tot})}(\chi)$ in (6.22a) by

\[
\Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} \tag{6.22b}
\]

to make it a double-sided signal, following the same tactic used before in Eq. (5.106a) of Chapter 5. Function $\Pi(\chi, D)$ is given the same definition as in Appendix 4C of Chapter 4 [see Eq. (4C.1a)]. The formula for the double-sided and noise-contaminated signal used to measure the spectral radiance thus becomes

\[
\Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) = \Pi(\chi, D)\, z_C(\chi) + \Pi(\chi, D)\, z_C^{(\mathrm{cold})}(\chi) + \frac{1}{u}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] . \tag{6.22c}
\]

Section 5.11 of Chapter 5 explains why there is always an effective spectrum corresponding to an interferometer signal. We now develop a formula for the detector noise-contaminated effective spectrum corresponding to the signal in Eq. (6.22c). Applying the Fourier transform to both sides of the equation gives, because the Fourier transform is linear (see Sec. 2.6 of Chapter 2),

\[
\begin{aligned}
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right)
&= F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C(\chi) \right)
 + F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C^{(\mathrm{cold})}(\chi) \right) \\
&\quad + \frac{1}{u}\, F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right) .
\end{aligned} \tag{6.22d}
\]

Evaluating the first Fourier transform on the right-hand side of (6.22d) is not very difficult. The remark following Eq. (6.5d) above points out that $z_C(\chi)$ is the same signal as $z(\chi)$ in Eq. (5.104a) of Chapter 5. The discussion following (5.104a) shows that $\Pi(\chi, D)\, z_C(\chi)$ must then be the same signal function that we called $z_{\mathrm{trunc}}(\chi)$ in (5.106a). This means the Fourier transform

\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C(\chi) \right)
\]

is the same quantity as Z eff() ) specified in Eqs. (5.108a) and (5.108b). According to Eq.
trunc

(5.108c), function Z eff() ) can be approximated as


trunc

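\[
\frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\mathrm{mnf}}(|\sigma|) ,
\]

where

\[
L_{\mathrm{mnf}}(\sigma) = \left[ 2D\,\mathrm{sinc}(2\pi\sigma D) \right] * L_{\mathrm{FOV}}(|\sigma|) . \tag{6.23a}
\]

We conclude that the same expression can be used to approximate $F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C(\chi) \right)$; that is, we can write that

\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C(\chi) \right) \cong \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\mathrm{mnf}}(|\sigma|) . \tag{6.23b}
\]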
The second Fourier transform on the right-hand side of (6.22d) is not much more difficult. According to the Fourier convolution theorem [see Eq. (2.39j) in Chapter 2],

\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C^{(\mathrm{cold})}(\chi) \right) = F^{(-i\sigma\chi')}\!\left( \Pi(\chi', D) \right) * F^{(-i\sigma\chi'')}\!\left( z_C^{(\mathrm{cold})}(\chi'') \right) . \tag{6.24a}
\]

Equation (5.65a) in Chapter 5 and the definition of $F^{(-i\sigma\chi)}$ from Eq. (2.29a) in Chapter 2 give



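\[
F^{(-i\sigma\chi')}\!\left( \Pi(\chi', D) \right) = \int_{-\infty}^{\infty} \Pi(\chi', D)\, e^{-2\pi i \sigma \chi'}\, d\chi' = 2D\,\mathrm{sinc}(2\pi\sigma D) , \tag{6.24b}
\]

where

\[
\mathrm{sinc}(x) = \frac{\sin(x)}{x}
\]

is defined in Eq. (2.106d). Hence, Eq. (6.24a) can be written as

\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C^{(\mathrm{cold})}(\chi) \right) = \left[ 2D\,\mathrm{sinc}(2\pi\sigma D) \right] * F^{(-i\sigma\chi'')}\!\left( z_C^{(\mathrm{cold})}(\chi'') \right) . \tag{6.24c}
\]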
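The closed form in Eq. (6.24b) is easy to confirm by direct numerical integration. The sketch below is illustrative only; the value of `D`, the sample wavenumbers, and the trapezoid-rule integrator are assumptions. Note that `numpy.sinc(y)` is the normalized function sin(πy)/(πy), so it has to be rescaled to match the sin(x)/x convention used here.

```python
import numpy as np

def sinc_unnormalized(x):
    """sinc(x) = sin(x)/x as in the text; np.sinc(y) computes sin(pi*y)/(pi*y)."""
    return np.sinc(np.asarray(x, dtype=float) / np.pi)

def ft_boxcar(sigma, D, n=200_001):
    """Trapezoid-rule estimate of the integral of exp(-2*pi*i*sigma*chi)
    over -D <= chi <= D, i.e. the forward transform of Pi(chi, D)."""
    chi = np.linspace(-D, D, n)
    integrand = np.exp(-2j * np.pi * sigma * chi)
    dchi = chi[1] - chi[0]
    return dchi * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))

D = 1.5                                           # hypothetical maximum path difference
sigmas = np.array([0.0, 0.1, 0.37, 1.0, 2.5])     # hypothetical wavenumbers
numeric = np.array([ft_boxcar(s, D) for s in sigmas])
analytic = 2 * D * sinc_unnormalized(2 * np.pi * sigmas * D)

print(np.max(np.abs(numeric.real - analytic)))    # discretization error only
print(np.max(np.abs(numeric.imag)))               # vanishes because Pi is even
```

The imaginary part vanishes to rounding error because $\Pi(\chi, D)$ is real and even, which is consistent with the real-valued sinc result.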

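Consulting Eq. (6.12a) above, we note that $z_C^{(\mathrm{cold})}(\chi)$ is the inverse Fourier transform of

\[
\left( \frac{WA\,\Delta\Omega}{4} \right) H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \left[ L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) \right] ,
\]

which means that $F^{(-i\sigma\chi)}\!\left( z_C^{(\mathrm{cold})}(\chi) \right)$, the forward Fourier transform of $z_C^{(\mathrm{cold})}(\chi)$, is

\[
F^{(-i\sigma\chi)}\!\left( z_C^{(\mathrm{cold})}(\chi) \right) = \left( \frac{WA\,\Delta\Omega}{4} \right) H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \left[ L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) \right] . \tag{6.24d}
\]

According to the discussion following Eq. (5.82c), the quantities M, H, R, η, and $\tau_a$ are all slowly varying functions of their arguments. Hence they can, following the reasoning explained in Appendix 5C of Chapter 5, be treated as quasi-constants with respect to the narrow sinc convolution when (6.24d) is substituted into (6.24c). This leads to the approximation

\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C^{(\mathrm{cold})}(\chi) \right) \cong \left( \frac{WA\,\Delta\Omega}{4} \right) H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \cdot \left\{ \left[ 2D\,\mathrm{sinc}(2\pi\sigma D) \right] * \left[ L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) \right] \right\} .
\]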

The linearity of the convolution [see, for example, Eq. (2.38d) of Chapter 2] now lets us write



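\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C^{(\mathrm{cold})}(\chi) \right) \cong \left( \frac{WA\,\Delta\Omega}{4} \right) H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \cdot \left[ L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right] , \tag{6.25a}
\]

where

\[
L_{\mathrm{mnf}}^{(\mathrm{fore})}(\sigma) = \left[ 2D\,\mathrm{sinc}(2\pi\sigma D) \right] * L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) \tag{6.25b}
\]

and

\[
L_{\mathrm{mnf}}^{(\mathrm{back})}(\sigma) = \left[ 2D\,\mathrm{sinc}(2\pi\sigma D) \right] * L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) . \tag{6.25c}
\]

Since $\mathrm{sinc}(2\pi\sigma D)$, $L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|)$, and $L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|)$ are all even functions of σ, it follows that the convolutions of $L_{\mathrm{FOV}}^{(\mathrm{fore})}$ and $L_{\mathrm{FOV}}^{(\mathrm{back})}$ with the sinc function are also even [see Eq. (2.38f) in Chapter 2],

\[
L_{\mathrm{mnf}}^{(\mathrm{fore})}(-\sigma) = L_{\mathrm{mnf}}^{(\mathrm{fore})}(\sigma) \tag{6.25d}
\]

and

\[
L_{\mathrm{mnf}}^{(\mathrm{back})}(-\sigma) = L_{\mathrm{mnf}}^{(\mathrm{back})}(\sigma) . \tag{6.25e}
\]

Hence the absolute-value signs around the arguments of $L_{\mathrm{mnf}}^{(\mathrm{fore})}$ and $L_{\mathrm{mnf}}^{(\mathrm{back})}$ in Eq. (6.25a) are not needed because the functions are already even, but they are put there anyway to keep our notation parallel with that of the previous L-type radiance symbols:

\[
L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) = L_{\mathrm{mnf}}^{(\mathrm{fore})}(\sigma) \tag{6.25f}
\]

and

\[
L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) = L_{\mathrm{mnf}}^{(\mathrm{back})}(\sigma) . \tag{6.25g}
\]

Functions $L_{\mathrm{mnf}}^{(\mathrm{fore})}$ and $L_{\mathrm{mnf}}^{(\mathrm{back})}$ are the background spectral radiances distorted by the effects of the interferometer’s finite field of view and finite length of interferogram signal. They are given the subscript mnf to show their similarity to $L_{\mathrm{mnf}}$, the input spectral radiance distorted by the interferometer’s finite field of view and finite interferogram length.
Unfortunately, the third Fourier transform on the right-hand side of Eq. (6.22d) is not as easy to evaluate as the first two. We start the analysis by multiplying both sides of Eq. (6.21d) by $[\Pi(\chi, D)/u]$ to get

\[
u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \cong u^{-1}\, \Pi(\chi, D) \int_{\chi - uT}^{\chi + uT} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' \tag{6.26a}
\]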


where, according to Eq. (6.21a), $h(t) \approx 0$ for $|t| > T$. The $\Pi(\chi, D)$ function specified in Eq. (6.22b) automatically makes both sides of (6.26a) equal to zero when $|\chi| > D$, so we only need a good approximation for the integral on the right-hand side when $|\chi| \le D$. In particular, we note that in (6.26a) the integral goes between $\chi' = \chi - uT$ and $\chi' = \chi + uT$, which means that

\[
\chi - uT \le \chi' \le \chi + uT .
\]

Since $|\chi| \le D$ we also know, putting less strict bounds on $\chi'$, that

\[
-(D + uT) \le \chi' \le D + uT .
\]

Hence $n^{(\mathrm{det})}(\chi')$ can be multiplied by $\Pi(\chi', D + uT)$ without changing the value of the integral for any values of χ that matter. Consequently Eq. (6.26a) can be written as

\[
u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \cong u^{-1}\, \Pi(\chi, D) \int_{\chi - uT}^{\chi + uT} \Pi(\chi', \bar{D})\, n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' , \tag{6.26b}
\]

where

\[
\bar{D} = D + uT . \tag{6.26c}
\]

The integral’s limits between $\chi - uT$ and $\chi + uT$ in (6.26b) came from the observation in (6.21a) that function h is very small outside these limits, making the product

\[
\Pi(\chi', \bar{D})\, n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right)
\]

negligible when $\chi'$ is less than $\chi - uT$ or exceeds $\chi + uT$. Therefore, we expect the integral to have the same value when its limits are extended to −∞ and +∞, giving

\[
u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \cong u^{-1}\, \Pi(\chi, D) \int_{-\infty}^{\infty} \Pi(\chi', \bar{D})\, n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' .
\]

Equation (2.38a) of Chapter 2 shows this integral to be the convolution of $\Pi n^{(\mathrm{det})}$ and h,

\[
u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \cong u^{-1}\, \Pi(\chi, D) \left\{ \left[ \Pi(\chi, \bar{D})\, n^{(\mathrm{det})}(\chi) \right] * h\!\left( \frac{\chi}{u} \right) \right\} . \tag{6.26d}
\]

Taking the Fourier transform of both sides, and then applying the Fourier convolution theorem, gives [see Eqs. (2.39a) and (2.39j) in Chapter 2]

\[
F^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right) \cong F^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \right) * \left[ F^{(-i\sigma\chi')}\!\left( \Pi(\chi', \bar{D})\, n^{(\mathrm{det})}(\chi') \right) \cdot F^{(-i\sigma\chi'')}\!\left( h(\chi''/u) \right) \right] . \tag{6.27a}
\]

To evaluate the Fourier transform of h, we replace the dummy variable of integration $\chi''$ by $t'' = \chi''/u$ to get

\[
F^{(-i\sigma\chi'')}\!\left( h(\chi''/u) \right) = \int_{-\infty}^{\infty} h(\chi''/u)\, e^{-2\pi i \sigma \chi''}\, d\chi'' = u \int_{-\infty}^{\infty} h(t'')\, e^{-2\pi i \sigma u t''}\, dt'' .
\]

Equation (5A.3d) in Appendix 5A of Chapter 5 shows that this can be written as

\[
F^{(-i\sigma\chi'')}\!\left( h(\chi''/u) \right) = u H(u\sigma) \tag{6.27b}
\]

or

\[
\int_{-\infty}^{\infty} h(\chi''/u)\, e^{-2\pi i \sigma \chi''}\, d\chi'' = u H(u\sigma) , \tag{6.27c}
\]

where H, the Fourier transform of h, is the transfer function of the detector circuit in Fig. 6.2.
Substituting this into Eq. (6.27a) gives

\[
F^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right) \cong F^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \right) * \left[ u H(u\sigma) \cdot F^{(-i\sigma\chi')}\!\left( \Pi(\chi', \bar{D})\, n^{(\mathrm{det})}(\chi') \right) \right] . \tag{6.27d}
\]

Equation (6.24b) [see also Eq. (5.65a) of Chapter 5] shows that

\[
F^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \right) = u^{-1} \int_{-\infty}^{\infty} \Pi(\chi, D)\, e^{-2\pi i \sigma \chi}\, d\chi = 2 u^{-1} D\, \mathrm{sinc}(2\pi\sigma D) .
\]

Hence, using the linearity of the convolution to cancel out $u^{-1}$ and u, Eq. (6.27d) can be written as

\[
F^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right) \cong \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * \left[ H(u\sigma) \cdot F^{(-i\sigma\chi')}\!\left( \Pi(\chi', \bar{D})\, n^{(\mathrm{det})}(\chi') \right) \right] . \tag{6.27e}
\]

According to the discussion following Eq. (5.82c) of Chapter 5, the transfer function $H(u\sigma)$ varies slowly compared to the spectral radiance L(σ) that the interferometer is measuring, and Sec. 5.15 of Chapter 5 explains why $[2D\, \mathrm{sinc}(2\pi\sigma D)]$ should be a narrow function compared to L(σ). Consequently, there is every reason to expect $H(u\sigma)$ to vary slowly with respect to $[2D\, \mathrm{sinc}(2\pi\sigma D)]$. Therefore, according to Eq. (5C.1) in Appendix 5C of Chapter 5, Eq. (6.27e) can be approximated as

\[
\begin{aligned}
F^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right)
&\cong H(u\sigma) \cdot \left\{ \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * F^{(-i\sigma\chi')}\!\left( \Pi(\chi', \bar{D})\, n^{(\mathrm{det})}(\chi') \right) \right\} \\
&= H(u\sigma) \cdot \left\{ F^{(-i\sigma\chi'')}\!\left( \Pi(\chi'', D) \right) * F^{(-i\sigma\chi')}\!\left( \Pi(\chi', \bar{D})\, n^{(\mathrm{det})}(\chi') \right) \right\} ,
\end{aligned}
\]

where in the last step Eq. (6.24b) is again used, this time to replace

\[
2D\, \mathrm{sinc}(2\pi\sigma D)
\]

by the Fourier transform of $\Pi(\chi, D)$.
According to the Fourier convolution theorem [see Eq. (2.39j) of Chapter 2], this can be written as

\[
F^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right) \cong H(u\sigma) \cdot F^{(-i\sigma\chi')}\!\left( \Pi(\chi', D) \cdot \Pi(\chi', \bar{D})\, n^{(\mathrm{det})}(\chi') \right) .
\]

Glancing back at Eq. (6.22b) above, we note that

\[
\Pi(\chi, D) \cdot \Pi(\chi, \bar{D}) = \Pi(\chi, D) \tag{6.28a}
\]

because, according to (6.26c), $\bar{D} = D + uT$ and therefore $D \le \bar{D}$. Hence the latest approximation becomes

\[
F^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right) \cong H(u\sigma) \cdot F^{(-i\sigma\chi')}\!\left( \Pi(\chi', D)\, n^{(\mathrm{det})}(\chi') \right) . \tag{6.28b}
\]

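We define the D-limited Fourier transform to be

\[
\tilde{n}_D^{(\mathrm{det})}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi , \tag{6.29a}
\]

which can also be written as

\[
\tilde{n}_D^{(\mathrm{det})}(\sigma) = F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, n^{(\mathrm{det})}(\chi) \right) \tag{6.29b}
\]

or

\[
\tilde{n}_D^{(\mathrm{det})}(\sigma) = \int_{-D}^{D} n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi . \tag{6.29c}
\]

Equation (6.28b) now becomes, using the linearity of the Fourier transform to take the factor of $u^{-1}$ outside the F operator,

\[
\frac{1}{u}\, F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right) \cong H(u\sigma)\, \tilde{n}_D^{(\mathrm{det})}(\sigma) . \tag{6.29d}
\]

It makes sense to work with $\tilde{n}_D^{(\mathrm{det})}(\sigma)$, the Fourier transform of the product $\Pi \cdot n^{(\mathrm{det})}$, instead of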


working directly with the simple Fourier transform of $n^{(\mathrm{det})}$. To see why this is so, we write down the simple Fourier transform of $n^{(\mathrm{det})}$,

\[
\int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi ,
\]

and note that there is no reason to think that $n^{(\mathrm{det})}$ always satisfies requirement (V) in Sec. 2.4 of Chapter 2 for the existence of Fourier transforms.100 Function $\tilde{n}_D^{(\mathrm{det})}(\sigma)$, on the other hand, because it is the Fourier transform of $n^{(\mathrm{det})}(\chi)$ after it is multiplied by $\Pi(\chi, D)$, is a well-defined, random function of σ because the $\Pi \cdot n^{(\mathrm{det})}$ product must be zero for $|\chi| > D$, satisfying requirement (V).
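Now that formulas (6.23b), (6.25a), and (6.29d) are known for all three Fourier transforms on the right-hand side of Eq. (6.22d), they can be substituted into (6.22d) to get

\[
\begin{aligned}
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right)
&\cong \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\mathrm{mnf}}(|\sigma|) \\
&\quad + \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \cdot \left[ L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right] \\
&\quad + H(u\sigma)\, \tilde{n}_D^{(\mathrm{det})}(\sigma)
\end{aligned}
\]

or, combining terms,

\[
\begin{aligned}
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right)
&\cong \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_a(|\sigma|) \left[ \tau_f(|\sigma|)\, L_{\mathrm{mnf}}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right] \\
&\quad + H(u\sigma)\, \tilde{n}_D^{(\mathrm{det})}(\sigma) .
\end{aligned} \tag{6.30a}
\]

The quantity $F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right)$ on the left-hand side of (6.30a) represents the noise-contaminated measurement of the total signal spectrum at point C in Fig. 6.2. It can be thought of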

100
Remember that the extended sine and cosine transforms to which requirement (V) applies will be used to define
the standard Fourier transform in Eq. (2.28a) of Chapter 2, so requirement (V) also applies to the standard Fourier
transform.


as the uncalibrated, noise-contaminated output spectrum of the interferometer. It is the detector noise-contaminated effective spectrum.
In principle, there is no problem removing all the noise from (6.30a). Glancing back at the formula for $\tilde{n}_D^{(\mathrm{det})}$ in Eq. (6.29c), we apply the expectation operator E to both sides to get

\[
E\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right) = E\!\left( \int_{-D}^{D} n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \right) = \int_{-D}^{D} E\!\left( n^{(\mathrm{det})}(\chi) \right) e^{-2\pi i \sigma \chi}\, d\chi = 0 , \tag{6.30b}
\]

where Eq. (3.17c) of Chapter 3 and Eq. (6.17b) are used to show that $E(\tilde{n}_D^{(\mathrm{det})}(\sigma))$, the average or expected value of $\tilde{n}_D^{(\mathrm{det})}(\sigma)$, is zero. Applying E to both sides of (6.30a) now gives, using Eqs. (3.16a) and (3.9f) from Chapter 3,

\[
\begin{aligned}
E\!\left( F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right) \right)
&\cong \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \cdot \left[ \tau_f(|\sigma|)\, L_{\mathrm{mnf}}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right] \\
&= L_{\mathrm{mnf}}(|\sigma|) \left[ \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|) \right] \\
&\quad + \left( L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right) \left[ \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \right] ,
\end{aligned} \tag{6.30c}
\]

which shows that the noise term $[H(u\sigma)\, \tilde{n}_D^{(\mathrm{det})}(\sigma)]$ disappears when many noise-contaminated measurements of

\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right)
\]

are averaged together. Hence the right-hand side of (6.30c) can be thought of as the uncalibrated, noise-free output spectrum of the interferometer. According to Eq. (5.110) in Chapter 5, the $L_{\mathrm{mnf}}$ radiance spectrum on the right-hand side of (6.30c) is the same as spectrum $L_{\mathrm{eff}}$ in Eq. (5.95c) of Chapter 5. Consequently the entire right-hand side of (6.30c) has the same form as $Z_{\mathrm{eff,tot}}$ in (5.95c), since it looks like

\[
L_{\mathrm{eff}}(|\sigma|) \cdot \{ \text{Complex Function of } \sigma \} + \{ \text{Background Complex Function of } \sigma \} . \tag{6.30d}
\]

This is no surprise, because $Z_{\mathrm{eff,tot}}$ in Sec. 5.19 of Chapter 5 is defined to be the uncalibrated output spectrum of a Michelson interferometer, which is exactly what the total noise-free signal spectrum at point C of Fig. 6.2 ought to be. Therefore it now makes sense to write Eqs. (6.30a)


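and (6.30c) as

\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right) \cong Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\, \tilde{n}_D^{(\mathrm{det})}(\sigma) \tag{6.31a}
\]

and

\[
E\!\left( F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right) \right) \cong Z_{\mathrm{eff,tot}}(\sigma) , \tag{6.31b}
\]

where, reversing the factoring in (6.30c), we have

\[
\begin{aligned}
Z_{\mathrm{eff,tot}}(\sigma)
&\cong L_{\mathrm{mnf}}(|\sigma|) \left[ \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|) \right] \\
&\quad + \left( L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right) \left[ \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \right] \\
&= \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \cdot \left[ \tau_f(|\sigma|)\, L_{\mathrm{mnf}}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right] .
\end{aligned} \tag{6.31c}
\]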
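Equations (6.30b) and (6.31b) say that the noise term averages away when many noise-contaminated spectra are combined. A small Monte Carlo sketch in Python illustrates the idea under simple assumptions: the zero-mean Gaussian white-noise model, the grid, the single wavenumber `sigma`, and the number of trials are all hypothetical choices, and the Riemann sum only approximates the integral in Eq. (6.29c).

```python
import numpy as np

rng = np.random.default_rng(2)                 # fixed seed, illustration only
D, n = 1.0, 512                                # hypothetical path-difference limit and sample count
chi = np.linspace(-D, D, n, endpoint=False)
dchi = chi[1] - chi[0]
sigma = 5.0                                    # hypothetical wavenumber at which to evaluate

def d_limited_ft(samples, sigma):
    """Riemann-sum approximation of the D-limited transform in Eq. (6.29c)."""
    return np.sum(samples * np.exp(-2j * np.pi * sigma * chi)) * dchi

trials = 2000
spectra = np.array([d_limited_ft(rng.normal(0.0, 1.0, n), sigma)
                    for _ in range(trials)])   # one noise spectrum value per trial

rms_noise = np.sqrt(np.mean(np.abs(spectra) ** 2))   # typical size of a single draw
mean_noise = spectra.mean()                          # sample estimate of E(n_D)

print(rms_noise, abs(mean_noise))
```

For independent zero-mean noise the magnitude of the sample mean should shrink roughly as the inverse square root of the number of trials, while the size of a single draw stays fixed.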

6.11 Measuring the Noise-Contaminated Spectrum


We have discussed two basic strategies for eliminating the interferometer’s unwanted background
radiance: measuring and subtracting the background radiance’s interferogram from the total
interferogram signal [see Eqs. (6.15a)–(6.15c)], or measuring and removing the background
radiance’s signal spectrum from the total signal spectrum (see Sec. 5.19 of Chapter 5). Here in
Chapter 6 we have concentrated up to now on the first strategy, assuming that during calibration $z_C^{(\mathrm{cold})}(\chi)$ is measured and subtracted from $z_{CN}^{(\mathrm{tot})}(\chi)$ to get the $z_{CN}(\chi)$ signal specified in Eq. (6.15c). Although this is in principle an acceptable way to calibrate interferometers, in practice
the spectral calibration strategy described in Sec. 5.19 of Chapter 5 is more popular. Equations
(6.31a)–(6.31c) let us investigate this more popular spectral calibration strategy, because these
equations describe the total signal and noise in an uncalibrated spectral measurement. One
obvious way to investigate the spectral calibration strategy is to apply the spectral calibration
algorithm directly to Eqs. (6.31a)–(6.31c). The spectral calibration algorithm in Sec. 5.19 of
Chapter 5 [specified by Eq. (5.95a)] can be used whenever the interferometer’s uncalibrated
output spectrum has the form

L eff ( σ ) ⋅ {Complex Function of σ } + {Background Complex Function of σ } ;

and, according to the discussion following Eq. (6.30c), that is exactly the form taken by the
noise-free uncalibrated spectrum in the previous section. To use the calibration algorithm in


(5.95a), we not only need the uncalibrated spectral output $Z_{\mathrm{eff,tot}}(\sigma)$ associated with the spectral radiance L but also must have the uncalibrated output signals associated with two known, calibrating spectral radiances. Following the notation of Sec. 5.19 of Chapter 5, we call the two calibrating radiances $L^{(1)}$ and $L^{(2)}$ and the two output signals associated with them $Z_{\mathrm{eff,tot}}^{(1)}(\sigma)$ and $Z_{\mathrm{eff,tot}}^{(2)}(\sigma)$ respectively. Equation (6.31b) reminds us that to extract the noise-free signals $Z_{\mathrm{eff,tot}}^{(1)}$ and $Z_{\mathrm{eff,tot}}^{(2)}$ in the presence of noise, we need only point the interferometer at radiances $L^{(1)}$ and $L^{(2)}$ and average together a large number of uncalibrated output spectra to get each spectrum’s noise-free expectation value. Examining Eqs. (6.31b) and (6.31c) closely, we realize that $Z_{\mathrm{eff,tot}}^{(1,2)}(\sigma)$ cannot depend directly on $L^{(1)}$ and $L^{(2)}$ but instead must depend directly on $L_{\mathrm{mnf}}^{(1)}$ and $L_{\mathrm{mnf}}^{(2)}$, where again the mnf subscripts indicate that the $L^{(1)}$ and $L^{(2)}$ radiances entering the front end of the interferometer are blurred and distorted by the interferometer’s finite field of view and finite interferogram length. Fortunately, because $L^{(1)}$ and $L^{(2)}$ are under our control, we can choose them to be slowly varying functions of wavenumber. This means, according to Eq. (6A.6) in Appendix 6A, that

\[
L_{\mathrm{mnf}}^{(1)}(|\sigma|) \cong L^{(1)}(|\sigma|) \tag{6.32a}
\]

and

\[
L_{\mathrm{mnf}}^{(2)}(|\sigma|) \cong L^{(2)}(|\sigma|) \tag{6.32b}
\]

should be acceptable approximations for $L_{\mathrm{mnf}}^{(1,2)}$. We can construct formulas for $Z_{\mathrm{eff,tot}}^{(1)}$ and $Z_{\mathrm{eff,tot}}^{(2)}$, using Eqs. (6.31c), (6.32a), and (6.32b) to write

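\[
Z_{\mathrm{eff,tot}}^{(1)}(\sigma) \cong \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \cdot \left[ \tau_f(|\sigma|)\, L^{(1)}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right] \tag{6.33a}
\]

and

\[
Z_{\mathrm{eff,tot}}^{(2)}(\sigma) \cong \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \cdot \left[ \tau_f(|\sigma|)\, L^{(2)}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right] . \tag{6.33b}
\]

This, together with the uncalibrated output spectrum $Z_{\mathrm{eff,tot}}(\sigma)$ produced by the unknown radiance L that the interferometer is being used to measure, is all we need to apply the spectral calibration algorithm.
Although we could in principle collect a large number of measurements of $Z_{\mathrm{eff,tot}}(\sigma)$, averaging them together to remove the noise just as we did when calculating $Z_{\mathrm{eff,tot}}^{(1,2)}(\sigma)$, in practice more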


effort is put into removing noise from the calibration data than is put into removing noise from everyday measurements. [This same point is made at the end of Sec. 6.5 when discussing noise in the measurements of signals $z_A(\chi)$, $z_B(\chi)$, and $z_C(\chi)$.] Consequently, even though noise-free values of $Z_{\mathrm{eff,tot}}^{(1,2)}$ are available for use in the calibration algorithm in Eq. (5.95a) of Chapter 5, we should replace the noise-free $Z_{\mathrm{eff,tot}}^{(\mathrm{meas})}$ in (5.95a) by the noise-contaminated, uncalibrated output specified in Eq. (6.31a):

\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right) \cong Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\, \tilde{n}_D^{(\mathrm{det})}(\sigma) .
\]

We call this noise-contaminated, uncalibrated spectral signal $\tilde{Z}_{\mathrm{eff,totN}}^{(\mathrm{meas})}$ and specify it using the formula

\[
\tilde{Z}_{\mathrm{eff,totN}}^{(\mathrm{meas})}(\sigma) = Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\, \tilde{n}_D^{(\mathrm{det})}(\sigma) . \tag{6.34a}
\]

Changing tot to totN in the subscript of $\tilde{Z}_{\mathrm{eff,totN}}^{(\mathrm{meas})}$ shows, of course, that it is $Z_{\mathrm{eff,tot}}^{(\mathrm{meas})}$ contaminated by noise; and the tilde shows that the spectral signal now has a random component. Function $\tilde{Z}_{\mathrm{eff,totN}}^{(\mathrm{meas})}$ is just, of course, a different name for

\[
F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_{CN}^{(\mathrm{tot})}(\chi) \right) .
\]

The discussion at the beginning of Sec. 6.1 points out that $L_{\mathrm{mnf}}$ is the best measurement of the unknown spectral radiance L that can be extracted from the interferometer, and substituting Eq. (6.31c) into (6.34a) gives $\tilde{Z}_{\mathrm{eff,totN}}^{(\mathrm{meas})}$ in terms of $L_{\mathrm{mnf}}$,

\[
\begin{aligned}
\tilde{Z}_{\mathrm{eff,totN}}^{(\mathrm{meas})}(\sigma)
&= \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \cdot \left[ \tau_f(|\sigma|)\, L_{\mathrm{mnf}}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right] \\
&\quad + H(u\sigma)\, \tilde{n}_D^{(\mathrm{det})}(\sigma) .
\end{aligned} \tag{6.34b}
\]

Now we apply the calibration algorithm in Sec. 5.19 of Chapter 5. Equations (6.34b) and (6.33a) give

\[
\begin{aligned}
\tilde{Z}_{\mathrm{eff,totN}}^{(\mathrm{meas})}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma)
&= \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|) \left[ L_{\mathrm{mnf}}(|\sigma|) - L^{(1)}(|\sigma|) \right] \\
&\quad + H(u\sigma)\, \tilde{n}_D^{(\mathrm{det})}(\sigma) .
\end{aligned} \tag{6.35a}
\]

Because we have decided to include noise in our measurement of the uncalibrated output spectrum, this corresponds to the difference

\[
\text{“} Z_{\mathrm{eff,tot}}^{(\mathrm{meas})}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma) \text{”}
\]

in Eq. (5.95a) of Chapter 5. Equations (6.33a) and (6.33b) give

\[
Z_{\mathrm{eff,tot}}^{(2)}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma) = \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|) \left[ L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|) \right] . \tag{6.35b}
\]

The ratio of these two differences is

\[
\frac{\tilde{Z}_{\mathrm{eff,totN}}^{(\mathrm{meas})}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma)}{Z_{\mathrm{eff,tot}}^{(2)}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma)}
= \frac{L_{\mathrm{mnf}}(|\sigma|) - L^{(1)}(|\sigma|)}{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}
+ \frac{4\, \tilde{n}_D^{(\mathrm{det})}(\sigma)}{(WA\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|) \left[ L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|) \right]} . \tag{6.35c}
\]

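The left-hand side of this formula is, of course, the noise-contaminated version of the ratio

\[
\left[ Z_{\mathrm{eff,tot}}^{(\mathrm{meas})}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma) \right] \Big/ \left[ Z_{\mathrm{eff,tot}}^{(2)}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma) \right]
\]

in Eq. (5.95a). We now complete the calibration algorithm by substituting (6.35c) into (5.95a) to get

\[
\left[ L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|) \right] \cdot \left\{ \frac{\tilde{Z}_{\mathrm{eff,totN}}^{(\mathrm{meas})}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma)}{Z_{\mathrm{eff,tot}}^{(2)}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma)} \right\} + L^{(1)}(|\sigma|)
= L_{\mathrm{mnf}}(|\sigma|) + \frac{4\, \tilde{n}_D^{(\mathrm{det})}(\sigma)}{(WA\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)} . \tag{6.35d}
\]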
At first we might think, examining Eq. (6.35d) and comparing it to (6.1a) above, that the
right-hand side is just a disguised version of

L mnf (σ ) + δ L (σ ) ,

which would mean that for detector noise

? D (σ )
4n (det)
δ L (σ ) = .
(WA ∆Ω) M( Rσθ ma ) η(σ ) R ( σ )τ a ( σ )τ f ( σ )

A little thought, however, shows that this cannot be correct. The quantities $W$, $A$, $\Delta\Omega$, $M$,
$\eta$, $R$, $\tau_a$, and $\tau_f$ are all real, as is $\delta\tilde{L}$, but there is no reason for
$\tilde{n}_D^{(\mathrm{det})}$ to be real. From Eq. (6.29a), we have

\[
\tilde{n}_D^{(\mathrm{det})}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, \tilde{n}^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma\chi}\, d\chi .
\]

This means $\tilde{n}_D^{(\mathrm{det})}$ is the forward Fourier transform of the real product

\[
\Pi(\chi, D)\, \tilde{n}^{(\mathrm{det})}(\chi) ,
\]

which, according to entry 7 of Table 2.1 in Chapter 2, makes $\tilde{n}_D^{(\mathrm{det})}$ Hermitian:

\[
\tilde{n}_D^{(\mathrm{det})}(-\sigma) = \tilde{n}_D^{(\mathrm{det})}(\sigma)^{*} . \tag{6.36}
\]

Unless $\Pi(\chi, D)\,\tilde{n}^{(\mathrm{det})}(\chi)$ is also an even function (and there is absolutely no
reason for this to be true), we expect $\tilde{n}_D^{(\mathrm{det})}$ to have both real and imaginary components.
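The Hermitian property in Eq. (6.36) can be checked numerically on sampled data. The following is a minimal sketch, assuming a discrete Fourier transform (NumPy's `fft`) as a stand-in for the continuous forward transform and a hypothetical Gaussian noise record; the record length and seed are illustrative choices, not values from the text.

```python
import numpy as np

# Hypothetical real-valued noise record standing in for Pi(chi, D) * n(chi).
rng = np.random.default_rng(0)
noise = rng.normal(size=256)

# Discrete forward transform; index k plays the role of sigma, index -k of -sigma.
spec = np.fft.fft(noise)

k = np.arange(1, 128)
# Hermitian symmetry: spec(-sigma) equals the complex conjugate of spec(sigma).
is_hermitian = np.allclose(spec[-k], np.conj(spec[k]))

# A generic real record is not even, so its spectrum has a nonzero imaginary part.
has_imaginary_part = np.max(np.abs(spec.imag)) > 1e-9

print(is_hermitian, has_imaginary_part)
```

Because the record is real but not even, the sketch finds a spectrum that is Hermitian yet genuinely complex, which is the situation described for the detector-noise spectrum.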
The observation that $\Pi(\chi, D)\,\tilde{n}^{(\mathrm{det})}(\chi)$ must be even for $\tilde{n}_D^{(\mathrm{det})}$ to be
strictly real brings to mind the distinction previously made between avoidable and unavoidable
detector noise. In Sec. 6.8 above, we note that only the even component


\[
\tilde{n}_e^{(\mathrm{det})}(\chi) = \frac{1}{2}\left[ \tilde{n}^{(\mathrm{det})}(\chi) + \tilde{n}^{(\mathrm{det})}(-\chi) \right]
\]

of the total detector noise in double-sided signals is unavoidable, because in principle the odd
component

\[
\tilde{n}_o^{(\mathrm{det})}(\chi) = \frac{1}{2}\left[ \tilde{n}^{(\mathrm{det})}(\chi) - \tilde{n}^{(\mathrm{det})}(-\chi) \right]
\]

can be removed from the signal at point B by averaging together the signal values at $+\chi$ and $-\chi$.
We also point out in Sec. 6.8 that the avoidable noise is usually not eliminated this way, but
instead passed along the signal chain to be eliminated later. We have now reached the point
where it is easy to eliminate the avoidable noise in double-sided signals.
Suppose, just as in Eq. (6.19a) of Sec. 6.8, we write $\tilde{n}^{(\mathrm{det})}(\chi)$ as the sum of an
unavoidable, even component and an avoidable, odd component,

\[
\tilde{n}^{(\mathrm{det})}(\chi) = \tilde{n}_e^{(\mathrm{det})}(\chi) + \tilde{n}_o^{(\mathrm{det})}(\chi) .
\]

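Since $\tilde{n}_D^{(\mathrm{det})}$ is the forward Fourier transform of $\Pi(\chi, D)\,\tilde{n}^{(\mathrm{det})}(\chi)$, we have

\[
\begin{aligned}
\tilde{n}_D^{(\mathrm{det})}(\sigma)
&= \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \tilde{n}^{(\mathrm{det})}(\chi) \right) \\
&= \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \tilde{n}_e^{(\mathrm{det})}(\chi) + \Pi(\chi, D)\, \tilde{n}_o^{(\mathrm{det})}(\chi) \right) \\
&= \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \tilde{n}_e^{(\mathrm{det})}(\chi) \right)
 + \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \tilde{n}_o^{(\mathrm{det})}(\chi) \right) ,
\end{aligned} \tag{6.37a}
\]

where in the last step the linearity of the Fourier transform is used to write the transform of the
sum as the sum of the transforms (see Sec. 2.6 of Chapter 2). To get a spectrum for the
unavoidable detector noise, we now define

\[
\tilde{n}_{De}^{(\mathrm{det})}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \tilde{n}_e^{(\mathrm{det})}(\chi) \right) . \tag{6.37b}
\]

This makes $\tilde{n}_{De}^{(\mathrm{det})}$ the forward Fourier transform of a real and even function of $\chi$,
which means, according to entry 1 of Table 2.1 in Chapter 2, that $\tilde{n}_{De}^{(\mathrm{det})}$ must be a real
and even function of $\sigma$,

\[
\tilde{n}_{De}^{(\mathrm{det})}(\sigma) = \mathrm{Re}\!\left( \tilde{n}_{De}^{(\mathrm{det})}(\sigma) \right) \tag{6.37c}
\]

and

\[
\tilde{n}_{De}^{(\mathrm{det})}(-\sigma) = \tilde{n}_{De}^{(\mathrm{det})}(\sigma) . \tag{6.37d}
\]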


To get a spectrum for the avoidable detector noise, we define

\[
\tilde{n}_{Do}^{(\mathrm{det})}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \tilde{n}_o^{(\mathrm{det})}(\chi) \right) . \tag{6.37e}
\]

This makes $\tilde{n}_{Do}^{(\mathrm{det})}$ the forward Fourier transform of a real and odd function of $\chi$,
which means, according to entry 4 of Table 2.1 in Chapter 2, that $\tilde{n}_{Do}^{(\mathrm{det})}$ must be an
imaginary and odd function of $\sigma$,

\[
\tilde{n}_{Do}^{(\mathrm{det})}(\sigma) = i\, \mathrm{Im}\!\left( \tilde{n}_{Do}^{(\mathrm{det})}(\sigma) \right) \tag{6.37f}
\]

and

\[
\tilde{n}_{Do}^{(\mathrm{det})}(-\sigma) = -\tilde{n}_{Do}^{(\mathrm{det})}(\sigma) . \tag{6.37g}
\]

Substitution of (6.37b) and (6.37e) into (6.37a) gives

\[
\tilde{n}_D^{(\mathrm{det})}(\sigma) = \tilde{n}_{De}^{(\mathrm{det})}(\sigma) + \tilde{n}_{Do}^{(\mathrm{det})}(\sigma) , \tag{6.37h}
\]

which we can interpret as requiring the total noise spectrum $\tilde{n}_D^{(\mathrm{det})}$ to be the sum of the
unavoidable noise spectrum $\tilde{n}_{De}^{(\mathrm{det})}$ and the avoidable noise spectrum
$\tilde{n}_{Do}^{(\mathrm{det})}$. Remembering that Eqs. (6.37c) and (6.37f) show that $\tilde{n}_{De}^{(\mathrm{det})}$ is
strictly real and $\tilde{n}_{Do}^{(\mathrm{det})}$ is strictly imaginary, we also note that
$\tilde{n}_{De}^{(\mathrm{det})}$ must be the real part of the detector noise spectrum and
$i^{-1}\tilde{n}_{Do}^{(\mathrm{det})}$ must be the imaginary part of the detector noise spectrum,

\[
\tilde{n}_{De}^{(\mathrm{det})}(\sigma) = \mathrm{Re}\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right) \tag{6.37i}
\]

and

\[
\tilde{n}_{Do}^{(\mathrm{det})}(\sigma) = i\, \mathrm{Im}\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right) . \tag{6.37j}
\]

Therefore we can remove all of the avoidable detector noise from the $\tilde{n}_D^{(\mathrm{det})}$ detector noise
spectrum by taking its real part, as shown in (6.37i); moreover, since the noise-free spectral
measurement of $L_{mnf}$ must be real, we can remove all of the avoidable detector noise from our
noise-contaminated spectral measurement by taking its real part. The right-hand side of Eq.
(6.35d) gives the formula for the noise-contaminated spectral measurement, and taking its real
part gives

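\[
\mathrm{Re}\!\left( L_{mnf}(|\sigma|)
+ \frac{4\,\tilde{n}_D^{(\mathrm{det})}(\sigma)}
       {(WA\,\Delta\Omega)\, M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right)
= L_{mnf}(|\sigma|)
+ \frac{4\,\mathrm{Re}\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right)}
       {(WA\,\Delta\Omega)\, M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.38a}
\]

The imaginary part of the right-hand side is, of course, pure noise:

\[
\mathrm{Im}\!\left( L_{mnf}(|\sigma|)
+ \frac{4\,\tilde{n}_D^{(\mathrm{det})}(\sigma)}
       {(WA\,\Delta\Omega)\, M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right)
= \frac{4\,\mathrm{Im}\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right)}
       {(WA\,\Delta\Omega)\, M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.38b}
\]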
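The split of the noise spectrum into a real part produced by the even component and an imaginary part produced by the odd component, Eqs. (6.37h) to (6.37j), can be illustrated with sampled data. This is only a sketch under stated assumptions: the grid is symmetric about $\chi = 0$ with an odd number of samples, NumPy's discrete transform stands in for the continuous one, and the noise record and its length are hypothetical.

```python
import numpy as np

# Hypothetical noise record on a symmetric grid chi = -M*d, ..., 0, ..., +M*d.
rng = np.random.default_rng(1)
half = 100
noise = rng.normal(size=2 * half + 1)

# Even and odd components; reversing the array maps chi to -chi on this grid.
noise_even = 0.5 * (noise + noise[::-1])
noise_odd = 0.5 * (noise - noise[::-1])

def forward_transform(samples):
    """DFT with chi = 0 moved to index 0 so no linear phase is introduced."""
    return np.fft.fft(np.fft.ifftshift(samples))

spec = forward_transform(noise)
spec_even = forward_transform(noise_even)
spec_odd = forward_transform(noise_odd)

# The even part yields a purely real spectrum, the odd part a purely imaginary one.
even_is_real = np.allclose(spec_even.imag, 0.0, atol=1e-9)
odd_is_imag = np.allclose(spec_odd.real, 0.0, atol=1e-9)

# Taking the real part of the full spectrum keeps only the even-component noise.
real_part_matches = np.allclose(spec.real, spec_even.real, atol=1e-9)
imag_part_matches = np.allclose(spec.imag, spec_odd.imag, atol=1e-9)
```

In this sketch, discarding `spec.imag` removes exactly the contribution of `noise_odd`, which mirrors the statement that taking the real part of the measured spectrum eliminates the avoidable detector noise.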

Comparing (6.38a) to the right-hand side of Eq. (6.1a) above,

\[
L_{mnf}(\sigma) + \delta\tilde{L}(\sigma) ,
\]

now suggests that the appropriate formula for the unavoidable random error in a double-sided
signal contaminated by detector noise must be

\[
\delta\tilde{L}(\sigma) =
\frac{4\,\mathrm{Re}\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right)}
     {(WA\,\Delta\Omega)\, M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.38c}
\]

The right-hand side of (6.38c) comes from (6.38a), which was derived while permitting
wavenumber $\sigma$ to be negative as well as positive; the left-hand side, however, comes from Eq.
(6.1a) where, according to (6.1c),

\[
0 < \sigma_{min} \le \sigma \le \sigma_{max} .
\]

Wavenumbers $\sigma_{max}$ and $\sigma_{min}$ are the maximum and minimum wavenumber values over which
radiance spectra are measured, and in a well-built interferometer unwanted spectral energy is
usually prevented from entering the optical signal chain by designing the product

\[
R(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma)
\]

to be zero when $\sigma$ does not lie between $\sigma_{min}$ and $\sigma_{max}$. Because the denominator on the
right-hand side of (6.38c) contains the product

\[
R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|) ,
\]

we end up dividing by zero unless we require that

\[
0 < \sigma_{min} \le |\sigma| \le \sigma_{max} . \tag{6.38d}
\]

Therefore the restrictions on the left-hand and right-hand sides of (6.38c) look very similar; the
only real difference is the way $\sigma$ is allowed to be negative on the right-hand side but not on the
left. According to Eq. (5.10f) in Chapter 5 and (4.139g) in Chapter 4, functions $M$ and $\eta$ in the
denominator of (6.38c) are even with respect to $\sigma$, and of course the absolute value signs in $R$,
$\tau_a$, and $\tau_f$ force them to be even functions of their arguments. Equations (6.37d) and (6.37i)
show that the real part of $\tilde{n}_D^{(\mathrm{det})}$ is also an even function:

\[
\mathrm{Re}\!\left( \tilde{n}_D^{(\mathrm{det})}(-\sigma) \right) = \mathrm{Re}\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right) . \tag{6.38e}
\]

Consequently, the entire right-hand side of Eq. (6.38c) is an even function of $\sigma$ and there is no
extra information to be lost if we require $\sigma$ to be positive on both sides of (6.38c). To show that
both sides should be evaluated for positive wavenumbers $\sigma$, we follow the convention used in
Sec. 6.1 when going from Eq. (6.3f) to (6.3g) and write (6.38c) as

\[
\delta\tilde{L}(|\sigma|) =
\frac{4\,\mathrm{Re}\!\left( \tilde{n}_D^{(\mathrm{det})}(|\sigma|) \right)}
     {(WA\,\Delta\Omega)\, M(R|\sigma|\theta_{ma})\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.38f}
\]

This can also, of course, be interpreted as a decision to make $\delta\tilde{L}$ an even function of
wavenumber, giving it a well-defined meaning for all $\sigma$ such that
$0 < \sigma_{min} \le |\sigma| \le \sigma_{max}$. No matter what the interpretation, however, the mathematical
meaning is clear. For future use, we note that, substituting from Eq. (6.37i), (6.38f) can also be
written as

\[
\delta\tilde{L}(|\sigma|) =
\frac{4\,\tilde{n}_{De}^{(\mathrm{det})}(|\sigma|)}
     {(WA\,\Delta\Omega)\, M(R|\sigma|\theta_{ma})\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} , \tag{6.38g}
\]

where applying the definition of the forward Fourier transform to (6.37b) gives [see Eq. (2.29a)
in Chapter 2]

\[
\tilde{n}_{De}^{(\mathrm{det})}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, \tilde{n}_e^{(\mathrm{det})}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \tag{6.38h}
\]


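Substitution of (6.37f) into the right-hand side of (6.38b) gives

\[
\frac{4\,\mathrm{Im}\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right)}
     {(WA\,\Delta\Omega)\, M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}
= \frac{4\,i^{-1}\,\tilde{n}_{Do}^{(\mathrm{det})}(\sigma)}
       {(WA\,\Delta\Omega)\, M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} , \tag{6.38i}
\]

where, consulting Eq. (6.37e), we note that

\[
\tilde{n}_{Do}^{(\mathrm{det})}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, \tilde{n}_o^{(\mathrm{det})}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \tag{6.38j}
\]

Equations (6.38h)–(6.38j) can be used for both negative and positive values of $\sigma$.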

6.12 Characterizing the Detector Noise


When the noise coming from a detector is examined, it almost always looks ergodic. According
to the discussion at the end of Sec. 3.18 of Chapter 3, all ergodic functions are also stationary,
which means that $\tilde{n}^{(\mathrm{det})}$ representing the detector noise is a stationary random function.
There is nothing unusual about characterizing the detector noise this way; most mathematical
treatments of random processes assume at least wide-sense stationarity in order to assign power
spectra to the random behavior under investigation. Like all statistical assumptions, saying the
detector noise is wide-sense stationary is at best an approximation. As a general rule, however,
the assumption that $\tilde{n}^{(\mathrm{det})}$ is wide-sense stationary and so has a well-defined power spectrum
turns out to be a good description of reality.
Appendix 6B explains how to use the direct proportionality between time $t$ and the OPD value
$\chi$ given by

\[
\chi = ut
\]

in Eq. (6.4) to analyze the detector noise $\tilde{n}^{(\mathrm{det})}$ as a random process or function that is
wide-sense stationary in $\chi$ instead of $t$. We can say [see Eq. (6B.4e) in Appendix 6B] that the
autocorrelation function $o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}$ of $\tilde{n}^{(\mathrm{det})}(\chi)$ is given by

\[
o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\chi_2 - \chi_1) = E\!\left( \tilde{n}^{(\mathrm{det})}(\chi_1) \cdot \tilde{n}^{(\mathrm{det})}(\chi_2) \right) . \tag{6.39a}
\]

The corresponding $\chi$-based power spectrum is [see Eq. (6B.6a) in Appendix 6B]

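\[
p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\sigma) = \int_{-\infty}^{\infty} o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi , \tag{6.39b}
\]

from which it follows, reversing the Fourier transform, that

\[
o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\chi) = \int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{6.39c}
\]

Glancing back at (6.39a), we note that $o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}$ is real because $\tilde{n}^{(\mathrm{det})}$ is real.
We can easily show that $o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}$ must be even. Starting with (6.39a), we have

\[
o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\chi_2 - \chi_1)
= E\!\left( \tilde{n}^{(\mathrm{det})}(\chi_1) \cdot \tilde{n}^{(\mathrm{det})}(\chi_2) \right)
= E\!\left( \tilde{n}^{(\mathrm{det})}(\chi_2) \cdot \tilde{n}^{(\mathrm{det})}(\chi_1) \right)
= o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\chi_1 - \chi_2)
= o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}\!\left( -(\chi_2 - \chi_1) \right) .
\]

Hence, replacing $\chi_2 - \chi_1$ by $\chi$, we get [see Eq. (2.11a) in Chapter 2 defining even functions]

\[
o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(-\chi) = o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\chi) . \tag{6.39d}
\]

It follows, since $p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}$ is the forward Fourier transform of a real and even function,
that $p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}$ is also real and even (see entry 1 of Table 2.1 in Chapter 2):

\[
p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(-\sigma) = p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\sigma) \tag{6.39e}
\]

and

\[
\mathrm{Im}\!\left( p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\sigma) \right) = 0 . \tag{6.39f}
\]

The detector noise can also, of course, be analyzed in a more conventional way, treating it as a
random function of time $\tilde{N}^{(\mathrm{det})}(t)$ that is wide-sense stationary. The transformation between
$\tilde{n}^{(\mathrm{det})}$ and $\tilde{N}^{(\mathrm{det})}$ is given in Eqs. (6B.2a) and (6B.2b) in Appendix 6B as

\[
\tilde{n}^{(\mathrm{det})}(\chi) = \tilde{N}^{(\mathrm{det})}(\chi / u) \tag{6.40a}
\]

and

\[
\tilde{n}^{(\mathrm{det})}(ut) = \tilde{N}^{(\mathrm{det})}(t) , \tag{6.40b}
\]

where $u$ is the OPD velocity. The D-limited transform of $\tilde{n}^{(\mathrm{det})}$ defined in Eq. (6.29a) can be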


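written as [see Eq. (6.29c)]

\[
\tilde{n}_D^{(\mathrm{det})}(\sigma) = \int_{-D}^{D} \tilde{n}^{(\mathrm{det})}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi .
\]

Changing the variable of integration to $t = \chi / u$ gives

\[
\tilde{n}_D^{(\mathrm{det})}(\sigma) = u \int_{-D/u}^{D/u} \tilde{n}^{(\mathrm{det})}(ut)\, e^{-2\pi i\sigma u t}\, dt .
\]

If we define

\[
T = D/u \tag{6.40c}
\]

and set

\[
f = u\sigma , \tag{6.40d}
\]

then Eq. (6.40b) can be used to write this latest formula as

\[
\tilde{n}_D^{(\mathrm{det})}(\sigma) = u \int_{-T}^{T} \tilde{N}^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt . \tag{6.40e}
\]

Working in the time domain, it makes sense to define the T-limited Fourier transform of
$\tilde{N}^{(\mathrm{det})}(t)$ to be

\[
\tilde{N}_T^{(\mathrm{det})}(f) = \int_{-T}^{T} \tilde{N}^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt , \tag{6.40f}
\]

which means that, remembering that $f = u\sigma$, (6.40e) can now be written as

\[
\tilde{n}_D^{(\mathrm{det})}(\sigma) = u\, \tilde{N}_T^{(\mathrm{det})}(u\sigma) \tag{6.40g}
\]

or

\[
u^{-1}\, \tilde{n}_D^{(\mathrm{det})}(f/u) = \tilde{N}_T^{(\mathrm{det})}(f) . \tag{6.40h}
\]

[These are the detector-noise versions of Eqs. (6B.7g) and (6B.7h) in Appendix 6B.] Equation
(6B.7i) in Appendix 6B now gives

\[
p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\sigma) = \lim_{D \to \infty} \left\{ \frac{1}{2D}\, E\!\left( \left| \tilde{n}_D^{(\mathrm{det})}(\sigma) \right|^2 \right) \right\} , \tag{6.40i}
\]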


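which we can approximate as, assuming that $D$ is large enough for

\[
\frac{1}{2D}\, E\!\left( \left| \tilde{n}_D^{(\mathrm{det})}(\sigma) \right|^2 \right)
\]

to be close to its limit as $D \to \infty$,

\[
p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\sigma) \cong \frac{1}{2D}\, E\!\left( \left| \tilde{n}_D^{(\mathrm{det})}(\sigma) \right|^2 \right) . \tag{6.40j}
\]

The time-based autocorrelation function of the detector noise is [see Eq. (6B.3a) in Appendix 6B]

\[
R_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(t_2 - t_1) = E\!\left( \tilde{N}^{(\mathrm{det})}(t_1) \cdot \tilde{N}^{(\mathrm{det})}(t_2) \right) , \tag{6.41a}
\]

with an associated time-based power spectrum that is the forward Fourier transform of
$R_{\tilde{N}\tilde{N}}^{(\mathrm{det})}$,

\[
S_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(f) = \int_{-\infty}^{\infty} R_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt . \tag{6.41b}
\]

The transform can be reversed to get

\[
R_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(t) = \int_{-\infty}^{\infty} S_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(f)\, e^{2\pi i f t}\, df . \tag{6.41c}
\]

Equations (6B.4g), (6B.4h), (6B.6d), and (6B.6f) of Appendix 6B give the transformation
formulas connecting $o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}$ to $R_{\tilde{N}\tilde{N}}^{(\mathrm{det})}$ and
$p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}$ to $S_{\tilde{N}\tilde{N}}^{(\mathrm{det})}$:

\[
o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\chi) = R_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(\chi / u) , \tag{6.41d}
\]

\[
R_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(t) = o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(ut) , \tag{6.41e}
\]

\[
p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\sigma) = u\, S_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(u\sigma) , \tag{6.41f}
\]

and

\[
S_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(f) = u^{-1}\, p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(f / u) . \tag{6.41g}
\]

Working with power spectra and autocorrelation functions that are both time-based and $\chi$-based
can sometimes be confusing, but the custom of using variables $\chi$ and $\sigma$ to analyze interferometer
signals makes it hard to avoid.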
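One way to keep the two descriptions straight is to check that the rescaling in Eq. (6.41f) leaves the total noise power unchanged. The sketch below assumes a hypothetical rectangular time-based spectrum and a hypothetical OPD velocity; the function names and all numerical values are illustrative and do not come from the text.

```python
import numpy as np

# Hypothetical parameters (illustrative only).
u = 0.5            # OPD velocity, cm/s
s_const = 2.0      # constant level of the time-based power spectrum
f_band = 5.0e3     # bandwidth in Hz

def s_time(f):
    """Rectangular time-based power spectrum S(f)."""
    return np.where(np.abs(f) <= f_band, s_const, 0.0)

def p_opd(sigma):
    """OPD-based power spectrum from Eq. (6.41f): p(sigma) = u * S(u * sigma)."""
    return u * s_time(u * sigma)

# Matching frequency and wavenumber grids related by f = u * sigma, Eq. (6.40d).
f = np.linspace(-1.0e4, 1.0e4, 20001)
df = f[1] - f[0]
sigma = f / u
dsigma = df / u

# Total power is the integral of the power spectrum in either description.
power_time = np.sum(s_time(f)) * df
power_opd = np.sum(p_opd(sigma)) * dsigma

print(power_time, power_opd)
```

The factor of $u$ in the spectrum level compensates for the factor of $1/u$ in the wavenumber bandwidth, so both integrals give the same total power.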


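FIGURE 6.3(a). [Log-log plot of a detector-noise power spectrum versus frequency: vertical axis
log of the power spectrum, horizontal axis log(f), with log(f_c) marked.]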

6.13 Detector Noise with a Band-Limited, White-Noise Power Spectrum


Many times detector noise can be modeled as band-limited white noise (see the discussion at the
end of Sec. 6.7 above). Following the rules outlined in Sec. 3.25 of Chapter 3, Fig. 6.3(b) plots
the double-sided power spectrum of white noise with bandwidth $f_{band}$. The constant positive
power level of this spectrum is $S_{const}^{(\mathrm{det})}$. The corresponding $\sigma$-based power spectrum is
plotted in Fig. 6.3(c); it has the same shape as the power spectrum in Fig. 6.3(b) but obeys
Eqs. (6.41f) and (6.41g) by having the constant positive power level

\[
p_0^{(\mathrm{det})} = u \cdot S_{const}^{(\mathrm{det})} \tag{6.42a}
\]

and a $\sigma$ bandwidth of

\[
\sigma_{band} = \frac{f_{band}}{u} . \tag{6.42b}
\]

In these two equations, $u$ is still the constant OPD velocity used in Eq. (6.4) above.

FIGURE 6.3(b). [Plot of the time-based power spectrum $S_{\tilde{N}\tilde{N}}^{(\mathrm{det})}(f)$ versus $f$: constant
level $S_{const}^{(\mathrm{det})}$ between $-f_{band}$ and $f_{band}$.]

FIGURE 6.3(c). [Plot of the $\sigma$-based power spectrum $p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\sigma)$ versus $\sigma$: constant
level $p_0^{(\mathrm{det})} = u S_{const}^{(\mathrm{det})}$ between $-\sigma_{band} = -f_{band}/u$ and $\sigma_{band} = f_{band}/u$.]

FIGURE 6.4(a). [Plot of simulated detector noise $\tilde{n}^{(\mathrm{det})}(\chi)$ versus $\chi$ from $-D$ to $D$; dashed
lines mark $\pm I_N^{(\mathrm{det})}$.]

FIGURE 6.4(b). [Plot of the real part of $\tilde{n}_D^{(\mathrm{det})}(\sigma)$ versus $\sigma$ from $-\sigma_{Nyq}$ to $\sigma_{Nyq}$;
dashed lines mark $\pm Z_{DN}^{(\mathrm{det})}$.]
6 · NEdN and Detector Noise

Figure 6.4(a) plots one possible member of the ensemble of functions associated with the
random function $\tilde{n}^{(\mathrm{det})}(\chi)$ obeying a band-limited, white-noise power spectrum like the one in
Fig. 6.3(c); that is, Fig. 6.4(a) contains a specific instance of $\tilde{n}^{(\mathrm{det})}(\chi)$. In Fig. 6.4(a), the
$\chi$ interval between samples is

\[
\Delta\chi = \frac{1}{2\sigma_{Nyq}} , \tag{6.43a}
\]

where $\sigma_{Nyq}$ is the Nyquist wavenumber of the sampled interferogram signal that we plan to
contaminate with this noise. We make the simulated $\tilde{n}^{(\mathrm{det})}(\chi)$ relatively large so that its effects
are easily visible, giving it a scale size $I_N^{(\mathrm{det})}$, shown with dashed lines, equal to 1/50th of the
maximum value of the simulated interferogram signal. A power spectrum such as the one shown
in Fig. 6.3(c) does not uniquely determine all the statistical rules needed to generate the random
noise sequence in Fig. 6.4(a); we also need to pick a probability density distribution for
$\tilde{n}^{(\mathrm{det})}$ at each value of $\chi$. This probability density distribution must be zero-mean because,
according to Eq. (6.17b),

\[
E\!\left( \tilde{n}^{(\mathrm{det})}(\chi) \right) = 0 .
\]

To match the probability density distribution to the power spectrum, we also need to give it the
correct variance. Remembering that the noise is zero-mean, and consulting Eq. (6.39a) with
$\chi_2 = \chi_1$, we see that

\[
\text{variance of detector noise} = v_{\tilde{n}}^{(\mathrm{det})}
= E\!\left( \left[ \tilde{n}^{(\mathrm{det})}(\chi) \right]^2 \right)
= o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(0) . \tag{6.43b}
\]

Equation (6.39c) with $\chi = 0$ then requires that

\[
v_{\tilde{n}}^{(\mathrm{det})} = o_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(0) = \int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}(\sigma)\, d\sigma . \tag{6.43c}
\]

Equations (6.42a) and (6.42b), together with the band-limited nature of the white-noise power
spectrum in (6.43c), now give (remember that $p_{\tilde{n}\tilde{n}}^{(\mathrm{det})}$ is zero for $|\sigma| > \sigma_{band}$)

\[
v_{\tilde{n}}^{(\mathrm{det})} = 2\sigma_{band}\, p_0^{(\mathrm{det})} . \tag{6.43d}
\]

Having made the probability density distribution zero-mean, and matched its variance to the
power level, we are left free to arrange everything else about the probability density distribution

FIGURE 6.4(c). [Plot of the imaginary part of $\tilde{n}_D^{(\mathrm{det})}(\sigma)$ versus $\sigma$ from $-\sigma_{Nyq}$ to
$\sigma_{Nyq}$; dashed lines mark $\pm Z_{DN}^{(\mathrm{det})}$.]

FIGURE 6.4(d). [Plot of the real and imaginary parts of $\tilde{n}_D^{(\mathrm{det})}(\sigma)$ versus $\sigma$ over the short
interval from $-200 \cdot \Delta\sigma$ to $200 \cdot \Delta\sigma$ about $\sigma = 0$; dashed lines mark $\pm Z_{DN}^{(\mathrm{det})}$.]

any way we please. Sometimes knowing the variance is enough to pick a specific probability
distribution from a family of similar zero-mean density distributions; it is certainly all that is
needed to specify the Gaussian probability distribution used to generate the random noise in Fig.
6.4(a).
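The recipe just described (zero mean, variance from Eq. (6.43d), Gaussian density) can be sketched in a few lines and checked against the power-spectrum estimate of Eq. (6.40j). This is a minimal sketch, assuming the noise bandwidth equals the Nyquist wavenumber so that independent samples spaced by $\Delta\chi$ have a flat spectrum; the numerical values, the seed, and the use of a Riemann sum (NumPy's `fft` times $\Delta\chi$) for the D-limited transform are all illustrative assumptions.

```python
import numpy as np

# Hypothetical parameters (illustrative only).
sigma_nyq = 5000.0                      # Nyquist wavenumber, cm^-1
p0 = 1.0e-6                             # constant power level of the sigma-based spectrum
dchi = 1.0 / (2.0 * sigma_nyq)          # sample spacing, Eq. (6.43a)
variance = 2.0 * sigma_nyq * p0         # Eq. (6.43d) with sigma_band = sigma_nyq

n_samples, n_trials = 1024, 400
rng = np.random.default_rng(1)

# Each row is one member of the ensemble: zero-mean Gaussian noise.
noise = rng.normal(0.0, np.sqrt(variance), size=(n_trials, n_samples))

# Riemann-sum approximation of the D-limited transform. The offset of the chi
# origin only changes the phase, which drops out of the squared magnitude.
n_D = dchi * np.fft.fft(noise, axis=1)

# Eq. (6.40j): p(sigma) is approximately E(|n_D(sigma)|^2) / (2D), with 2D the record length.
two_D = n_samples * dchi
p_est = np.mean(np.abs(n_D) ** 2, axis=0) / two_D

print(p_est.mean(), p0)
```

With these assumptions the estimated spectrum is flat across the band and its level comes out close to `p0`, which is the consistency between variance and power level that the text requires.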
We have already noted that the simulated noise plotted in Fig. 6.4(a) can be regarded as a
member function picked at random from the ensemble of functions associated with
$\tilde{n}^{(\mathrm{det})}(\chi)$; that is, it is a single instance of the detector noise. Even though it is not, in the
strictest sense, possible to graph a random function as such (because it stands for a whole
collection or ensemble of functions), as a convenient shorthand we often call graphs such as the
one in Fig. 6.4(a) a simulation of $\tilde{n}^{(\mathrm{det})}(\chi)$. Figure 6.4(b) contains the real part of the
D-limited forward Fourier transform of the detector noise shown in Fig. 6.4(a). This means,
following the notation specified in Eqs. (6.29a), (6.37i), and (6.37j), that Fig. 6.4(b) simulates
the random function

\[
\tilde{n}_{De}^{(\mathrm{det})}(\sigma) = \mathrm{Re}\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right) .
\]

Figure 6.4(c) plots the imaginary part of this same transform, which means Fig. 6.4(c) simulates
the random function

\[
\mathrm{Im}\!\left( \tilde{n}_D^{(\mathrm{det})}(\sigma) \right) = i^{-1}\, \tilde{n}_{Do}^{(\mathrm{det})}(\sigma)
\]

corresponding to the detector noise specified in Fig. 6.4(a). The scale size $Z_{DN}^{(\mathrm{det})}$, shown with
dashed lines in Figs. 6.4(b) and 6.4(c), is 1/50th of the maximum uncalibrated spectral signal
produced by the simulated interferometer. Figure 6.4(d) plots a very short stretch of the two
curves in Figs. 6.4(b) and 6.4(c) around $\sigma = 0$. Here $\Delta\sigma$ is the distance between adjacent samples
on the wavenumber axis for the radiance spectrum measured by the interferometer. We see that
the imaginary part obeys Eqs. (6.37g) and (6.37j) by being an odd function of $\sigma$, and the real part
obeys Eqs. (6.37d) and (6.37i) by being an even function of $\sigma$.

6.14 An Example of Simulated Detector Noise in a Double-Sided Signal


In this section, we consider the effect of detector noise in a simplified interferometer system
where, in Eqs. (6.38d) and (6.1c), we take $\sigma_{min} = 0$ and $\sigma_{max} = \infty$. Figures 6.5(a)–6.5(c) show
what happens to a radiance spectrum $L(|\sigma|)$ measured without any noise, and Figs. 6.6(a) and
6.6(b) show what happens when these same measurements are contaminated by significant
amounts of detector noise. Figure 6.5(a) plots the radiance spectrum entering the interferometer’s
input aperture, with the radiance spectrum graphed for both positive and negative wavenumbers
to honor the absolute value sign in its argument. Figure 6.5(b) gives the interferogram signal
$z_C(\chi)$ generated by $L(|\sigma|)$, as specified in Eq. (6.5d) with $W = 1$. This is the signal seen at point C in
FIGURE 6.5(a). [Two panels plotting the input radiance spectrum $L(|\sigma|)$ versus $\sigma$, with $L_{max}$
marked on the radiance axis: the upper panel runs from $-\sigma_{Nyq}/2$ to $\sigma_{Nyq}/2$, and the lower panel
expands the interval from 1000 to 1500.]

FIGURE 6.5(b). [Two panels plotting the noise-free interferogram signal versus $\chi$: the upper panel
is centered on $\chi = 0$, and the lower panel expands the interval from 0.005 to 0.03.]

FIGURE 6.5(c). [Two panels plotting the noise-free radiance measurement $L_{mnf}$ versus $\sigma$, with
$L_{max}$ marked on the radiance axis: the upper panel runs from $-\sigma_{Nyq}/2$ to $\sigma_{Nyq}/2$, and the lower
panel expands the interval from 1000 to 1500.]

FIGURE 6.6(a). [Two panels plotting the noise-contaminated interferogram signal versus $\chi$: the
upper panel is centered on $\chi = 0$, and the lower panel expands the interval from 0.005 to 0.03.]

FIGURE 6.6(b). [Two panels plotting the noise-contaminated radiance measurement versus $\sigma$, with
$L_{max}$ marked on the radiance axis: the upper panel runs from $-\sigma_{Nyq}$ to $\sigma_{Nyq}$, and the lower panel
expands the interval from 1000 to 1500.]

Fig. 6.2 when only negligible amounts of noise and background radiance are present. Figure
6.5(c) gives the $L_{mnf}(|\sigma|)$ radiance measurement extracted from the interferogram signal in Fig.
6.5(b). The most dramatic change is perhaps the spurious oscillation or “ringing” produced
throughout the measured spectrum by the finite signal length or truncation of the interferogram
signal (only signal values between $\chi = +D$ and $\chi = -D$ are recorded in this double-sided system).
Careful examination also reveals the blurring effects of this truncation: note that three
absorption lines in the center of Fig. 6.5(c) are not quite as deep and are more closely matched in
intensity than are the absorption lines in Fig. 6.5(a). The characteristic scale of the radiance axis
in Figs. 6.5(a) and 6.5(c) is taken to be $L_{max}$, the maximum value of the input radiance spectrum
(in units of optical power per unit area per unit solid angle per unit wavenumber interval). Next,
detector noise is added to the radiance measurement. Figure 6.6(a) plots the interference signal in
Fig. 6.5(b) contaminated by the band-limited detector noise plotted in Fig. 6.4(a), and Fig. 6.6(b)
gives the spectral measurement produced by this noise-contaminated signal.
The discussion following Eq. (6.35d) above reveals that the detector noise $\tilde{n}^{(\mathrm{det})}(\chi)$ in the
$z_C(\chi)$ signal adds a complex spectral noise $\tilde{n}_D^{(\mathrm{det})}(\sigma)$ to the spectral data coming out of the
calibration algorithm; and, as shown in Eq. (6.38a), only the real component of the complex
spectral noise unavoidably contaminates the spectral measurement. Figure 6.6(b) shows that this
real component typically introduces a fuzziness into the measured spectrum, which is most easily
seen where the noise-free $L_{mnf}$ spectrum is negligible or zero. Figures 6.7(a) and 6.7(b) show the
real and imaginary parts of the complex spectral noise in this simulated interferometer
measurement. Because the last step in producing a double-sided interferometer measurement is,
according to Eq. (6.38a) above, to take the real part of the calculated spectrum, only the real part
plotted in Fig. 6.7(a) ends up contaminating the spectral measurement. The plots in Figs.
6.7(a) and 6.7(b) look qualitatively similar and have the same characteristic size, which is typical
of detector noise (see the discussion at the beginning of Sec. 6.17 below).
It is important to remember that the random noise in Figs. 6.7(a) and 6.7(b) comes from one
specific spectral measurement. The very next measurement might have negative errors where
there are now positive, or positive errors where there are now negative, or something in
between; there is quite literally no necessary connection to the random spectral errors in the
previous measurement. If we keep track of the detector-noise error in a very large collection of
measurements, and then at each wavenumber average together the detector-noise error from all
the different measurements, we would discover that the average detector-noise error approaches
zero at every wavenumber as we increase the number of independent measurements. This is, of
course, just what should happen according to Eq. (6.30b) above. If we calculate the standard
deviation at every wavenumber, we get the NEdN levels shown in Figs. 6.7(a) and 6.7(b).
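The ensemble statements in this paragraph can be illustrated with a small Monte Carlo experiment. The sketch below assumes unit-variance Gaussian white noise in arbitrary units and omits the calibration factor of Eq. (6.38c), so the standard deviations it reports are only proportional to an NEdN level; the record length, number of trials, and seed are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_trials = 512, 4000

# Each row is the detector noise from one independent simulated measurement.
noise = rng.normal(size=(n_trials, n_samples))
spec = np.fft.fft(noise, axis=1)

# Statistics across the ensemble, evaluated separately at every wavenumber bin.
mean_real = spec.real.mean(axis=0)      # average spectral error
std_real = spec.real.std(axis=0)        # spread of the real part (NEdN-like level)
std_imag = spec.imag.std(axis=0)        # spread of the imaginary part

# Interior bins only: at zero and Nyquist wavenumber the DFT of real data is purely real.
interior = slice(1, n_samples // 2)
worst_mean_ratio = np.max(np.abs(mean_real[interior]) / std_real[interior])
worst_std_mismatch = np.max(np.abs(std_real[interior] / std_imag[interior] - 1.0))

print(worst_mean_ratio, worst_std_mismatch)
```

In this sketch the ensemble-averaged error is a small fraction of the per-measurement spread and shrinks as `n_trials` grows, while the real and imaginary parts show nearly the same spread at each wavenumber.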

6.15 Photon Noise in Detectors


Most detectors approach an ideal state when chilled to very low temperatures (typically tens of
degrees Kelvin) at reasonable levels of illumination. For an ideal detector, the only source of

FIGURE 6.7(a). [Plot of the real part of the complex spectral noise in the radiance measurement
versus $\sigma$ from $-\sigma_{Nyq}$ to $\sigma_{Nyq}$; the levels $\pm L_{max}/50$ are marked on the vertical axis. Thick,
solid line is the NEdN level for this noise.]

detector noise is the quantum fluctuations in the number of photons it absorbs. When the detector
experiences a constant level of illumination, these quantum fluctuations show up as band-limited
white noise. The photon noise in many types of photovoltaic (PV) detectors often approaches the
ideal of band-limited white noise. Many times this occurs when the detector observes the signal
in the presence of large amounts of background radiation, because then most of the photons
reaching the detector come from the constant background, keeping the total number of absorbed
photons approximately constant as the optical signal varies. A detector operating in this mode is
said to have reached its background-limited infrared photon, or BLIP, limit. Figures 5.8(a) and
5.8(b) in Chapter 5 show that when detectors measure interferograms, the total signal variation
about its average level is usually small except very close to ZPD in a region symmetrically
located about $\chi = 0$. In this sense, even when background radiation is disregarded, PV detectors
measuring interferograms are analogous to PV detectors operating in the BLIP limit: photons are
absorbed at a more or less constant rate during most of the measurement. Experience has shown


that for this reason the photon noise contaminating interferograms can usually be approximated
as band-limited white noise, with the photon noise level specified by the detector’s average
illumination from both the background and signal radiances.
To derive a power level for the photon noise generated in a detector, we treat the detector as
an element of an electric circuit (it does, after all, put out an electric signal when illuminated),
which means it must have a typical bandwidth that we call $f_{band}^{(\mathrm{det})}$. Associated with this
bandwidth is a response time

\[
\tau_{band}^{(\mathrm{det})} = \frac{1}{2 \cdot f_{band}^{(\mathrm{det})}} . \tag{6.44a}
\]

If the illumination hitting the detector varies significantly on a timescale shorter than
$\tau_{band}^{(\mathrm{det})}$, the detector does not record the change in illumination directly but instead generates
a signal based on the average level of illumination reaching the detector over the
$\tau_{band}^{(\mathrm{det})}$ time interval. In this sense, $\tau_{band}^{(\mathrm{det})}$ is the effective length of time during which
the detector collects photons to produce its signal. We also assume that the detector responsivity
$R(\sigma)$ (which is defined at the beginning of Sec. 5.9 in Chapter 5) can be written as the product of
two functions $\eta_d(\sigma)$ and $e_d(\sigma)$ for wavenumbers $\sigma$ greater than zero,

\[
R(\sigma) = \eta_d(\sigma) \cdot e_d(\sigma) . \tag{6.44b}
\]

Function $\eta_d$ is often called the detector’s quantum efficiency; it specifies the fraction of photons
of frequency $f = c/\lambda = c\sigma$ that are absorbed after hitting the detector’s surface. The value of
$\eta_d$ for any $\sigma$ must be a dimensionless number between zero and one:

\[
0 \le \eta_d(\sigma) \le 1 . \tag{6.44c}
\]

Every photon is associated with a monochromatic wavefield of frequency $f$ (in cycles per
second) and carries an amount of energy $hf = hc\sigma$, where $h \cong 6.626 \times 10^{-27}$ erg·sec is Planck’s
constant and $c \cong 2.998 \times 10^{10}$ cm/sec is the speed of light in a vacuum. We define $\tilde{P}_1$ to be the
random number of photons absorbed by the detector in time $\tau_{band}^{(\mathrm{det})}$ that have frequency
$f_1 = c\sigma_1$, $\tilde{P}_2$ to be the random number of photons absorbed in time $\tau_{band}^{(\mathrm{det})}$ that have
frequency $f_2 = c\sigma_2$, $\tilde{P}_3$ to be the random number of photons absorbed in time
$\tau_{band}^{(\mathrm{det})}$ that have frequency $f_3 = c\sigma_3$, and so on. The statistical rules obeyed by photons
require $\tilde{P}_1, \tilde{P}_2, \tilde{P}_3, \ldots$ to be independent random numbers.
The total number of photons absorbed by the detector in time $\tau_{band}^{(\mathrm{det})}$ is
- 808 -
Photon Noise in Detectors · 6.15

FIGURE 6.7(b).
Thick, solid line is the
NEdN level for this
noise.

7
6.10 5 10
7

Imaginary Lmax
Imaginary part L max / 50
partofofthe
thecomplex
complex
spectral noise
spectral
in thenoise
radiance
LiTradNT 0.0 0
in the radiance kPlot
measurement
measurement
−L
−L maxmax
/ 50
7
5 10
7
6.10
4000 2000 0 2000 4000
− Nyq
5000 σ 0.0
Nσtot σ Nyq
5000
kPlot 1 .∆σ
2 σ

______________________________________________________________________________

Ptot = P1 + P2 + P3 + " . (6.45a)

The detector has an area Ad, a field of view specified by the solid angle ¨ȍd, and is illuminated
by a constant radiance Ld(ı) that is defined only for σ ≥ 0 . As has already been pointed out at the
beginning of this section, for interferometers we can take Ld(ı) to be the average radiance level,
both from the optical background and the optical signal, reaching the detector. Using the linearity
of the expectation operator E with respect to random variables (see Sec. 3.10 of Chapter 3), the
average number of photons absorbed by the detector in time τ band
(det)
is

E( Ptot ) = E( P1 ) + E( P2 ) + E( P3 ) + " , (6.45b)

where E( P1 ) is the average number of photons absorbed in time τ band


(det)
that have frequency f1 ,
E( P2 ) is the average number of photons absorbed in time τ band
(det)
that have frequency f 2 , and so


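on. Given $L_d(\sigma)$, we know that

\[
\begin{aligned}
E(\tilde{P}_1) &\cong \eta_d(\sigma_1) \cdot \tau_{band}^{(\mathrm{det})} \cdot \frac{A_d\,\Delta\Omega_d\, L_d(\sigma_1)}{hc\sigma_1}\, d\sigma , \\
E(\tilde{P}_2) &\cong \eta_d(\sigma_2) \cdot \tau_{band}^{(\mathrm{det})} \cdot \frac{A_d\,\Delta\Omega_d\, L_d(\sigma_2)}{hc\sigma_2}\, d\sigma , \\
&\ \ \vdots \\
E(\tilde{P}_j) &\cong \eta_d(\sigma_j) \cdot \tau_{band}^{(\mathrm{det})} \cdot \frac{A_d\,\Delta\Omega_d\, L_d(\sigma_j)}{hc\sigma_j}\, d\sigma , \\
&\ \ \vdots
\end{aligned} \tag{6.45c}
\]

where

\[
A_d\,\Delta\Omega_d\, L_d(\sigma_j)\, d\sigma
\]

is the radiant power carried by electromagnetic radiation having a frequency $f$ between
$f = c\sigma_j$ and $f = c(\sigma_j + d\sigma)$, and

\[
\frac{A_d\,\Delta\Omega_d\, L_d(\sigma_j)\, d\sigma}{hc\sigma_j}
\]

is, of course, the average number of photons per unit time carried by that radiation.
Returning to Eq. (6.45a), we see that the actual random optical power $\tilde{W}_d$ absorbed by the
detector over a time interval $\tau_{band}^{(\mathrm{det})}$ is

\[
\tilde{W}_d = \frac{hc\sigma_1}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_1
+ \frac{hc\sigma_2}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_2
+ \frac{hc\sigma_3}{\tau_{band}^{(\mathrm{det})}}\, \tilde{P}_3 + \cdots . \tag{6.46a}
\]

This should not be confused with the average or expected optical power absorbed over the time
interval $\tau_{band}^{(\mathrm{det})}$. Since the photons have already been absorbed, all that is needed to get the
actual random signal $\tilde{I}_d$ is to multiply the first term by $e_d(\sigma_1)$, the second term by
$e_d(\sigma_2)$, etc., which gives

\[
\tilde{I}_d = e_d(\sigma_1) \cdot \left( \frac{hc\sigma_1}{\tau_{band}^{(\mathrm{det})}} \right) \cdot \tilde{P}_1
+ e_d(\sigma_2) \cdot \left( \frac{hc\sigma_2}{\tau_{band}^{(\mathrm{det})}} \right) \cdot \tilde{P}_2
+ e_d(\sigma_3) \cdot \left( \frac{hc\sigma_3}{\tau_{band}^{(\mathrm{det})}} \right) \cdot \tilde{P}_3 + \cdots . \tag{6.46b}
\]

The right-hand side of this equation is a sum of independent random variables. Equation (3.19e)
in Chapter 3 states that the variance of the sum of independent random variables is the sum of the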

- 810 -
Photon Noise in Detectors · 6.15

variances, so we can use the notation introduced in Eq. (3.8f) of Chapter 3 to write
§ hcσ ·
Var ( Id ) = Var ¨ ed (σ 1 ) (det)1 P1 ¸
© τ band ¹
§ hcσ · § hcσ ·
+ Var ¨ ed (σ 2 ) (det)2 P2 ¸ + " + Var ¨ ed (σ j ) (det)j Pj ¸ + " .
© τ band ¹ © τ band ¹

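Equation (3.16g) in Chapter 3 points out that multiplying a random variable by a nonrandom
parameter means that its variance must be multiplied by the square of that parameter, so the
variance in signal $I_d$ can also be written as

$$
\begin{aligned}
\mathrm{Var}(I_d) ={}& \left(e_d(\sigma_1)\,\frac{hc\sigma_1}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)^{2}\cdot\mathrm{Var}(P_1) \\
&+ \left(e_d(\sigma_2)\,\frac{hc\sigma_2}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)^{2}\cdot\mathrm{Var}(P_2) + \cdots + \left(e_d(\sigma_j)\,\frac{hc\sigma_j}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)^{2}\cdot\mathrm{Var}(P_j) + \cdots.
\end{aligned}
\tag{6.46c}
$$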

The number of photons absorbed at any frequency f = cσ j obeys Poisson statistics, which means
that the variance in the random number of photons equals the mean or average number of
photons:

Var ( Pj ) = E( Pj ) . (6.46d)
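The Poisson property in Eq. (6.46d) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `poisson_mean_and_variance`, the cutoff `kmax`, and the example mean of 12.5 photons are illustrative assumptions. It sums the Poisson probability mass function directly and compares the resulting mean and variance.

```python
import math


def poisson_mean_and_variance(lam, kmax=200):
    """Mean and variance of a Poisson count with expected value lam.

    Both moments are summed directly from the probability mass function,
    truncated at kmax (an assumed cutoff, ample for lam of order 10).
    """
    p = math.exp(-lam)          # p(0)
    mean = 0.0
    second_moment = 0.0
    for k in range(kmax + 1):
        if k > 0:
            p *= lam / k        # p(k) = p(k-1) * lam / k
        mean += k * p
        second_moment += k * k * p
    return mean, second_moment - mean * mean


# Hypothetical example: an average of 12.5 photons absorbed at one frequency.
mean_photons, var_photons = poisson_mean_and_variance(12.5)
print(mean_photons, var_photons)
```

Both returned numbers agree with the assumed mean to within rounding error, which is the statement Var(Pj) = E(Pj) used in the next substitution.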

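Substituting Eqs. (6.46d) and (6.45c) into (6.46c) gives

$$
\begin{aligned}
\mathrm{Var}(I_d) ={}& \left(e_d(\sigma_1)\,\frac{hc\sigma_1}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)^{2}\cdot\left[\eta_d(\sigma_1)\cdot\tau_{\mathrm{band}}^{(\mathrm{det})}\cdot\frac{A_d\,\Delta\Omega_d\,L_d(\sigma_1)}{hc\sigma_1}\right]d\sigma \\
&+ \left(e_d(\sigma_2)\,\frac{hc\sigma_2}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)^{2}\cdot\left[\eta_d(\sigma_2)\cdot\tau_{\mathrm{band}}^{(\mathrm{det})}\cdot\frac{A_d\,\Delta\Omega_d\,L_d(\sigma_2)}{hc\sigma_2}\right]d\sigma + \cdots \\
&+ \left(e_d(\sigma_j)\,\frac{hc\sigma_j}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)^{2}\cdot\left[\eta_d(\sigma_j)\cdot\tau_{\mathrm{band}}^{(\mathrm{det})}\cdot\frac{A_d\,\Delta\Omega_d\,L_d(\sigma_j)}{hc\sigma_j}\right]d\sigma + \cdots,
\end{aligned}
$$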
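which can be written as

$$
\begin{aligned}
\mathrm{Var}(I_d) ={}& A_d\,\Delta\Omega_d\cdot\left(\frac{hc\sigma_1}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)\cdot e_d(\sigma_1)^{2}\,\eta_d(\sigma_1)\cdot L_d(\sigma_1)\,d\sigma \\
&+ A_d\,\Delta\Omega_d\cdot\left(\frac{hc\sigma_2}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)\cdot e_d(\sigma_2)^{2}\,\eta_d(\sigma_2)\cdot L_d(\sigma_2)\,d\sigma \\
&+ A_d\,\Delta\Omega_d\cdot\left(\frac{hc\sigma_3}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)\cdot e_d(\sigma_3)^{2}\,\eta_d(\sigma_3)\cdot L_d(\sigma_3)\,d\sigma \\
&+ \cdots.
\end{aligned}
$$

Converting this sum into an integral, we get

$$\mathrm{Var}(I_d) = A_d\,\Delta\Omega_d\left(\frac{hc}{\tau_{\mathrm{band}}^{(\mathrm{det})}}\right)\cdot\int_0^{\infty} e_d(\sigma)^{2}\,\eta_d(\sigma)\,L_d(\sigma)\,\sigma\,d\sigma.$$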

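Equations (6.44a) and (6.44b) can be used to write this as

$$\mathrm{Var}(I_d) = 2 f_{\mathrm{band}}^{(\mathrm{det})}\,hc\,A_d\,\Delta\Omega_d\cdot\int_0^{\infty}\frac{\mathcal{R}(\sigma)^{2}}{\eta_d(\sigma)}\,L_d(\sigma)\,\sigma\,d\sigma. \tag{6.46e}$$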

The photon noise is band-limited white noise like that shown in Figs. 6.3(b) and 6.3(c) above.
Hence, Eq. (3.62d) in Chapter 3, which connects the variance of band-limited white noise to the
constant level of its noise-power spectrum, here allows us to write that

$$\mathrm{Var}(I_d) = 2 f_{\mathrm{band}}^{(\mathrm{det})}\,S_{p2}^{(\mathrm{det})}, \tag{6.46f}$$

where $S_{p2}^{(\mathrm{det})}$ is the constant power level of the double-sided, time-based power spectrum due to
the random quantum fluctuations in the number of photons absorbed by the detector. Comparing
Eq. (6.46e) to (6.46f), we see that

$$S_{p2}^{(\mathrm{det})} = hc\,A_d\,\Delta\Omega_d\cdot\int_0^{\infty}\frac{\mathcal{R}(\sigma)^{2}}{\eta_d(\sigma)}\,L_d(\sigma)\,\sigma\,d\sigma. \tag{6.46g}$$

A single-sided power spectrum must, according to Eq. (3.58b) of Chapter 3, have a constant
power level $S_{p1}^{(\mathrm{det})}$ that is twice the size of the double-sided power level, hence

$$S_{p1}^{(\mathrm{det})} = 2\,hc\,A_d\,\Delta\Omega_d\cdot\int_0^{\infty}\frac{\mathcal{R}(\sigma)^{2}}{\eta_d(\sigma)}\,L_d(\sigma)\,\sigma\,d\sigma. \tag{6.46h}$$

When the detector experiences an approximately constant level of monochromatic radiation at
wavenumber $\sigma = \sigma_0$, we can write the radiance $L_d$ as

$$L_d(\sigma) = \left(\frac{Q}{\Delta\Omega_d}\right)\cdot hc\sigma_0\cdot\delta(\sigma - \sigma_0), \tag{6.47a}$$

where Q, which is often called the photon incidence, is defined to be the number of photons per
unit time and per unit area hitting the detector. The delta function in (6.47a) has units of inverse
wavenumbers (that is, length) and is explained in Sec. 2.14 of Chapter 2. Substitution of (6.47a)
into (6.46h) gives

$$S_{p1}^{(\mathrm{det})} = 2\,hc\,A_d\,\Delta\Omega_d\int_0^{\infty}\frac{\mathcal{R}(\sigma)^{2}}{\eta_d(\sigma)}\cdot\left[\left(\frac{Q}{\Delta\Omega_d}\right)\cdot hc\sigma_0\cdot\delta(\sigma - \sigma_0)\right]\sigma\,d\sigma$$

or

$$S_{p1}^{(\mathrm{det})} = \frac{2A_dQ}{\eta_d(\sigma_0)}\,\bigl[hc\sigma_0\,\mathcal{R}(\sigma_0)\bigr]^{2}. \tag{6.47b}$$

Detectors are often characterized by a figure of merit called the specific detectivity D*, or “D-star.” The specific detectivity of a detector at a positive wavenumber $\sigma$ is defined to be

$$D^{*}(|\sigma|) = \frac{\mathcal{R}(|\sigma|)\sqrt{A_d}}{\sqrt{S_1^{(\mathrm{det})}(u|\sigma|)}}, \tag{6.48a}$$

where u is again the constant OPD velocity used in Eq. (6.4) above, $\mathcal{R}(\sigma)$ is the detector’s
responsivity, $A_d$ is the detector area, and $S_1^{(\mathrm{det})}(f)$ is the single-sided noise-power density at the
signal frequency f (in Hz). The absolute value signs applied to $\sigma$ both remind us that its value
must be positive and allow us to extend the definition of D* to negative wavenumbers. The units
of D* are $\mathrm{cm}\cdot\sqrt{\mathrm{Hz}}/\mathrm{watt}$ (which is often called a Jones). The D* tends to be constant for all
infrared detectors made from the same detector material and operating at the same temperature,
no matter what the detector area $A_d$; consequently, it can be used to predict the amount of noise
contamination present in any size detector, all other things being equal. High-performance
detectors produce low-noise signals and have large D* values (for example, $10^{14}\ \mathrm{cm}\cdot\sqrt{\mathrm{Hz}}/\mathrm{watt}$),
and low-performance detectors have small D* values (for example, $10^{7}\ \mathrm{cm}\cdot\sqrt{\mathrm{Hz}}/\mathrm{watt}$). The D* of
an ideal detector that is photon-noise limited and experiencing an approximately constant level of
monochromatic illumination at wavenumber $\sigma_0$ is, substituting Eq. (6.47b) into (6.48a),

$$D^{*} = \frac{\mathcal{R}(\sigma_0)\sqrt{A_d}}{\sqrt{S_{p1}^{(\mathrm{det})}}} = \frac{1}{hc\sigma_0}\sqrt{\frac{\eta_d(\sigma_0)}{2Q}} \tag{6.48b}$$

or, remembering that the radiation wavelength $\lambda_0$ equals $\sigma_0^{-1}$,

$$D^{*} = \frac{\lambda_0}{hc}\sqrt{\frac{\eta_d(\sigma_0)}{2Q}}. \tag{6.48c}$$

This equation is the standard D∗ formula for a PV detector in the BLIP limit.101
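As a numerical illustration of Eqs. (6.47b) through (6.48c), the sketch below evaluates the photon-noise-limited D* in two ways: directly from Eq. (6.48b), and from the definition (6.48a) using the noise level of Eq. (6.47b). The function names and every numerical input (wavenumber, quantum efficiency, responsivity, detector area, photon incidence) are assumptions chosen only for illustration; lengths are taken in centimeters so that D* comes out in cm·√Hz/watt.

```python
import math

H_PLANCK = 6.62607015e-34   # Planck constant, J*s
C_LIGHT = 2.99792458e10     # speed of light in cm/s (wavenumbers in cm^-1)


def photon_noise_level(sigma0, eta, responsivity, area_cm2, photon_incidence):
    """Single-sided photon-noise power level of Eq. (6.47b)."""
    return (2.0 * area_cm2 * photon_incidence / eta
            * (H_PLANCK * C_LIGHT * sigma0 * responsivity) ** 2)


def dstar_blip(sigma0, eta, photon_incidence):
    """Photon-noise-limited D* of Eq. (6.48b)."""
    return math.sqrt(eta / (2.0 * photon_incidence)) / (H_PLANCK * C_LIGHT * sigma0)


def dstar_from_definition(responsivity, area_cm2, noise_level):
    """D* from its definition, Eq. (6.48a)."""
    return responsivity * math.sqrt(area_cm2) / math.sqrt(noise_level)


# Hypothetical inputs: 1000 cm^-1 (10 micrometer) light, eta = 0.6,
# responsivity 4.0 signal units per watt, 0.01 cm^2 detector,
# 1e17 photons per second per cm^2.
sigma0, eta, resp, area, Q = 1000.0, 0.6, 4.0, 0.01, 1.0e17
level = photon_noise_level(sigma0, eta, resp, area, Q)
d_direct = dstar_blip(sigma0, eta, Q)
d_defn = dstar_from_definition(resp, area, level)
print(d_direct, d_defn)
```

The two printed values agree, showing that the responsivity and the detector area cancel out of the photon-noise-limited D*, as Eq. (6.48b) states.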

6.16 Detector-Noise NEdN in Double-Sided Signals


It is easy to show that, as expected from Eqs. (6.2c) and (6.3c), the expectation value of
$\delta L(\sigma)$ is zero in a double-sided signal contaminated by detector noise. Returning to Eq. (6.30b),
we get that

$$E\bigl(n_D^{(\mathrm{det})}(\sigma)\bigr) = 0. \tag{6.49a}$$

Substituting from Eq. (6.37h) now gives, using the linearity of the expectation operator with
respect to random variables [see Eq. (3.16a) in Chapter 3],

$$E\bigl(n_{De}^{(\mathrm{det})}(\sigma) + n_{Do}^{(\mathrm{det})}(\sigma)\bigr) = E\bigl(n_{De}^{(\mathrm{det})}(\sigma)\bigr) + E\bigl(n_{Do}^{(\mathrm{det})}(\sigma)\bigr) = 0. \tag{6.49b}$$

According to Eqs. (6.37i) and (6.37j), $n_{De}^{(\mathrm{det})}(\sigma)$ is purely real and $n_{Do}^{(\mathrm{det})}(\sigma)$ is purely imaginary,
which means the expectation value $E\bigl(n_{De}^{(\mathrm{det})}(\sigma)\bigr)$ must be purely real and the expectation value
$E\bigl(n_{Do}^{(\mathrm{det})}(\sigma)\bigr)$ must be purely imaginary. Consequently we can take real and imaginary parts of
(6.49b) to get

$$E\bigl(n_{De}^{(\mathrm{det})}(\sigma)\bigr) = 0 \tag{6.49c}$$

and

$$E\bigl(n_{Do}^{(\mathrm{det})}(\sigma)\bigr) = 0. \tag{6.49d}$$

101
See, for example, Eq. (2.48a) in John David Vincent, Fundamentals of Infrared Detector Operation and Testing
(John Wiley and Sons, New York, 1990), p. 65.


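Taking the expectation value of both sides of the formula for $\delta L(|\sigma|)$ in Eq. (6.38g) now gives
the desired result:

$$E\bigl(\delta L(|\sigma|)\bigr) = 0 \tag{6.49e}$$

for the double-sided detector noise.
To get the detector-noise NEdN in a double-sided signal, we first substitute Eq. (6.49e) into
(6.3g) to get

$$\mathrm{NEdN}(|\sigma|) = \sqrt{E\bigl(\bigl[\delta L(|\sigma|)\bigr]^{2}\bigr)} \tag{6.50a}$$

and then substitute (6.38f),

$$\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{4\sqrt{E\Bigl(\bigl[\mathrm{Re}\bigl(n_D^{(\mathrm{det})}(\sigma)\bigr)\bigr]^{2}\Bigr)}}{(A\,\Delta\Omega)\,M(R|\sigma|\theta_{ma})\,\eta(|\sigma|)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.50b}$$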

The subscript 2 and the superscript (det) are added to the NEdN parameter to show that this is the
NEdN of a double-sided signal contaminated by detector noise. According to the discussion
immediately preceding Eqs. (4.84a) and (4.84b) in Chapter 4, parameter W = +1 or −1, which
means that it drops out of the formula when $\delta L(|\sigma|)$ is squared. We can remove the absolute
value signs from the arguments of M and $\eta$ because they are already even functions [see Eqs.
(4.139g) and (5.10f) in Chapters 4 and 5 respectively] to get

$$\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{4\sqrt{E\Bigl(\bigl[\mathrm{Re}\bigl(n_D^{(\mathrm{det})}(\sigma)\bigr)\bigr]^{2}\Bigr)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.50c}$$

According to the discussion at the beginning of Sec. 6.12, we can assume the detector noise to
be wide-sense stationary; and Appendix 6B shows that it has this property both when treated as a
random function of time and when treated as a random function of the OPD value $\chi$. Using the
transformation specified in Eqs. (6.40a) and (6.40b) to treat the detector noise as the random
function of time $N^{(\mathrm{det})}(t)$, we use Eq. (6.40f) to construct its T-limited Fourier transform [Eq.
(6.22b) above defines $\Pi$]

$$N_T^{(\mathrm{det})}(f) = \int_{-T}^{T} N^{(\mathrm{det})}(t)\,e^{-2\pi i f t}\,dt = \int_{-\infty}^{\infty}\Pi(t,T)\,N^{(\mathrm{det})}(t)\,e^{-2\pi i f t}\,dt. \tag{6.51a}$$

The analysis given in Sec. 3.26 of Chapter 3 shows, according to Eqs. (3.69g) and (3.69h), that

$$E\Bigl(\bigl[\mathrm{Re}\bigl(N_T^{(\mathrm{det})}(f)\bigr)\bigr]^{2}\Bigr) \cong \frac{1}{2}\,E\Bigl(\bigl|N_T^{(\mathrm{det})}(f)\bigr|^{2}\Bigr) \tag{6.51b}$$

and

$$E\Bigl(\bigl[\mathrm{Im}\bigl(N_T^{(\mathrm{det})}(f)\bigr)\bigr]^{2}\Bigr) \cong \frac{1}{2}\,E\Bigl(\bigl|N_T^{(\mathrm{det})}(f)\bigr|^{2}\Bigr) \tag{6.51c}$$

as long as f is substantially greater than $O(T^{-1})$. This is a very easy requirement to satisfy
since at this point all it really does is show how large T must be chosen for us to have Eqs.
(6.51b) and (6.51c) hold true at the frequencies f we are interested in. Remembering that E is a
linear operator with respect to random variables and that $\sigma = f/u$, we use Eq. (6.40h) to write

$$E\Bigl(\bigl[\mathrm{Re}\bigl(n_D^{(\mathrm{det})}(\sigma)\bigr)\bigr]^{2}\Bigr) \cong \frac{1}{2}\,E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr) \tag{6.51d}$$

and

$$E\Bigl(\bigl[\mathrm{Im}\bigl(n_D^{(\mathrm{det})}(\sigma)\bigr)\bigr]^{2}\Bigr) \cong \frac{1}{2}\,E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr). \tag{6.51e}$$

These two formulas only hold true as long as $\sigma$ is substantially greater than $O(D^{-1})$, as can be
seen by applying Eqs. (6.40c) and (6.40d) to the requirement that f is substantially greater than
$O(T^{-1})$. The intersample distance between spectral samples along the wavenumber axis of the
radiance measurement is, according to the discussion following Eq. (5.124d) in Chapter 5,

$$\Delta\sigma = \frac{1}{2D}.$$

Consequently, as long as the wavenumbers between $\sigma_{\min}$ and $\sigma_{\max}$ at which the spectral radiance is
being measured lie a reasonable number of $\Delta\sigma$ lengths away from the $\sigma = 0$ origin of the
wavenumber axis (as would be the case in a well-designed interferometer system), we can rely
on $\sigma$ being substantially greater than $O(D^{-1})$ for the wavenumbers of interest. Hence formulas
(6.51d) and (6.51e) can be assumed to hold true. Now we can substitute Eq. (6.51d) into (6.50c)
to get

$$\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{2\sqrt{2}\,\sqrt{E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.51f}$$


This basic equation for the detector-noise NEdN of a double-sided signal can be put into a
variety of forms.
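The equal split of noise power between real and imaginary parts expressed by Eqs. (6.51b) through (6.51e) can be illustrated with a small Monte Carlo experiment. The sketch below is a discrete stand-in, not the continuous T-limited transform of Eq. (6.51a): it assumes white Gaussian noise samples and evaluates a single discrete-Fourier-transform bin well away from zero frequency. The function name, the record length, the bin index, the number of trials, and the random seed are all illustrative assumptions.

```python
import cmath
import math
import random


def real_part_fraction(n_samples=64, bin_index=5, trials=2000, seed=0):
    """Estimate E([Re X]^2) / E(|X|^2) for one DFT bin X of white noise."""
    rng = random.Random(seed)
    # Complex exponentials for a single frequency bin away from zero.
    kernel = [cmath.exp(-2j * math.pi * bin_index * m / n_samples)
              for m in range(n_samples)]
    sum_re2 = 0.0
    sum_abs2 = 0.0
    for _ in range(trials):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        transform = sum(x * w for x, w in zip(noise, kernel))
        sum_re2 += transform.real ** 2
        sum_abs2 += abs(transform) ** 2
    return sum_re2 / sum_abs2


fraction = real_part_fraction()
print(fraction)
```

The estimated fraction comes out close to one half, which is the factor that turns the 4 in Eq. (6.50c) into the 2√2 of Eq. (6.51f).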
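If the power spectrum of the detector noise is known, we can evaluate

$$E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr)$$

directly no matter what shape it has. In particular, we do not need to assume that the detector
produces band-limited white noise. Starting with Eq. (6.29a), we have

$$
\begin{aligned}
\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2} &= n_D^{(\mathrm{det})}(\sigma)\cdot\bigl[n_D^{(\mathrm{det})}(\sigma)\bigr]^{*} \\
&= \left[\int_{-\infty}^{\infty}\Pi(\chi,D)\,n^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\right]\cdot\left[\int_{-\infty}^{\infty}\Pi(\chi',D)\,n^{(\mathrm{det})}(\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi'\right]^{*},
\end{aligned}
$$

which becomes, since $n^{(\mathrm{det})}(\chi)$ is real,

$$\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2} = \int_{-\infty}^{\infty}\Pi(\chi,D)\,n^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\int_{-\infty}^{\infty}\Pi(\chi',D)\,n^{(\mathrm{det})}(\chi')\,e^{2\pi i\sigma\chi'}\,d\chi'.$$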

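Equation (3.17c) in Chapter 3 allows the expectation operator E to be taken inside the double
integral formula, so applying E to both sides leads to

$$E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr) = \int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,e^{2\pi i\sigma\chi'}\,E\bigl(n^{(\mathrm{det})}(\chi')\,n^{(\mathrm{det})}(\chi)\bigr).$$

Substituting from Eq. (6.39a) and then applying (6.39c) gives

$$
\begin{aligned}
E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr) &= \int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,e^{2\pi i\sigma\chi'}\,o_{nn}^{(\mathrm{det})}(\chi-\chi') \\
&= \int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,e^{2\pi i\sigma\chi'}\int_{-\infty}^{\infty}d\sigma'\,p_{nn}^{(\mathrm{det})}(\sigma')\,e^{2\pi i\sigma'(\chi-\chi')} \\
&= \int_{-\infty}^{\infty}d\sigma'\,p_{nn}^{(\mathrm{det})}(\sigma')\int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,e^{-2\pi i(\sigma-\sigma')\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,e^{-2\pi i(\sigma'-\sigma)\chi'}.
\end{aligned}
\tag{6.52a}
$$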


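Consulting Eq. (2.108b) of Chapter 2, we set up the variable correspondences

$$\chi \leftrightarrow f,\quad D \leftrightarrow F,\quad (\sigma-\sigma') \leftrightarrow t$$

for the integral

$$\int_{-\infty}^{\infty}\Pi(\chi,D)\,e^{-2\pi i(\sigma-\sigma')\chi}\,d\chi$$

and the variable correspondences

$$\chi' \leftrightarrow f,\quad D \leftrightarrow F,\quad (\sigma'-\sigma) \leftrightarrow t$$

for the integral

$$\int_{-\infty}^{\infty}\Pi(\chi',D)\,e^{-2\pi i(\sigma'-\sigma)\chi'}\,d\chi'.$$

This gives

$$\int_{-\infty}^{\infty}\Pi(\chi,D)\,e^{-2\pi i(\sigma-\sigma')\chi}\,d\chi = 2D\,\mathrm{sinc}\bigl(2\pi(\sigma-\sigma')D\bigr) \tag{6.52b}$$

and

$$\int_{-\infty}^{\infty}\Pi(\chi',D)\,e^{-2\pi i(\sigma'-\sigma)\chi'}\,d\chi' = 2D\,\mathrm{sinc}\bigl(2\pi(\sigma'-\sigma)D\bigr), \tag{6.52c}$$

where, following the definition in Eq. (2.106d) of Chapter 2, we say that

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x}.$$

Substitution of (6.52b) and (6.52c) into (6.52a) leads to

$$E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr) = \int_{-\infty}^{\infty}p_{nn}^{(\mathrm{det})}(\sigma')\cdot\bigl[2D\,\mathrm{sinc}\bigl(2\pi(\sigma-\sigma')D\bigr)\bigr]\cdot\bigl[2D\,\mathrm{sinc}\bigl(2\pi(\sigma'-\sigma)D\bigr)\bigr]\,d\sigma'. \tag{6.52d}$$

Clearly, the sinc is an even function of its argument,

$$\mathrm{sinc}(-x) = \frac{\sin(-x)}{-x} = \frac{\sin(x)}{x} = \mathrm{sinc}(x).$$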


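Consequently, Eq. (6.52d) can be written as

$$E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr) = 2D\int_{-\infty}^{\infty}p_{nn}^{(\mathrm{det})}(\sigma')\cdot\Bigl\{2D\bigl[\mathrm{sinc}\bigl(2\pi(\sigma-\sigma')D\bigr)\bigr]^{2}\Bigr\}\,d\sigma'. \tag{6.52e}$$

We assume that the detector noise has a power spectrum $p_{nn}^{(\mathrm{det})}$ that varies slowly with $\sigma$ compared
to

$$\mathrm{sinc}\bigl(2\pi(\sigma-\sigma')D\bigr).$$

This means we can, just as in Eq. (3.67b) of Chapter 3, approximate the action of

$$2D\bigl[\mathrm{sinc}\bigl(2\pi(\sigma-\sigma')D\bigr)\bigr]^{2}$$

inside the integral by replacing it with a delta function $\delta(\sigma-\sigma')$. Equation (6.52e) then
simplifies to

$$E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr) = 2D\,p_{nn}^{(\mathrm{det})}(\sigma), \tag{6.52f}$$

which can be substituted into (6.51f) to get

$$\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{4\sqrt{D\,p_{nn}^{(\mathrm{det})}(\sigma)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.52g}$$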

We note that $p_{nn}^{(\mathrm{det})}$ is a double-sided power spectrum, which means [see Eqs. (6.39e) and (6.39f)
above] it is real and even, making the absolute value signs applied to its argument superfluous.
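The step from Eq. (6.52e) to Eq. (6.52f) relies on the kernel 2D[sinc(2π(σ − σ′)D)]² behaving like a delta function, which requires it to have unit area. The sketch below checks that area with a simple trapezoid rule. It is only an illustration: the function names, the value D = 1 cm, the finite integration half-width, and the step count are assumptions, and the finite limits leave a small truncation error in place of the infinite limits of Eq. (6.52e).

```python
import math


def sinc(x):
    """sin(x)/x with the removable singularity at x = 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def kernel_area(max_opd, half_width, steps=200001):
    """Trapezoid-rule integral of 2D*sinc^2(2*pi*s*D) for |s| <= half_width."""
    ds = 2.0 * half_width / (steps - 1)
    total = 0.0
    for i in range(steps):
        s = -half_width + i * ds
        weight = 0.5 if i in (0, steps - 1) else 1.0
        total += weight * 2.0 * max_opd * sinc(2.0 * math.pi * s * max_opd) ** 2
    return total * ds


# Hypothetical case: D = 1 cm, integrated over +/- 100 cm^-1.
area = kernel_area(max_opd=1.0, half_width=100.0)
print(area)
```

The computed area is within a small fraction of a percent of one, so a power spectrum that is nearly constant over a few multiples of 1/D passes through the integral essentially unchanged.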
Many times the detector noise is characterized by its power spectrum written as a function of
the frequency f (in Hz). This is called $S_{NN}^{(\mathrm{det})}(f)$ in Sec. 6.12 above, and Eq. (6.41f) can be used to
write (6.52g) as

$$\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{4\sqrt{uD\,S_{NN}^{(\mathrm{det})}(u\sigma)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.53a}$$

Again, the absolute value signs do not need to be added to the argument of the power spectrum
because it is a real and even function. This formula is often written in terms of the single-sided
power spectrum described by Eq. (3.58b) of Chapter 3, which is defined only for non-negative
values of frequency $f = u|\sigma|$. Calling this single-sided power spectrum $S_1^{(\mathrm{det})}(|f|)$, we know from
Eq. (3.58b) that

$$S_1^{(\mathrm{det})}(|f|) = 2\,S_{NN}^{(\mathrm{det})}(f). \tag{6.53b}$$

Here, the absolute value signs are needed to show that the frequency argument must be non-negative. Substituting this into (6.53a) gives

$$\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{2\sqrt{2}\,\sqrt{uD\,S_1^{(\mathrm{det})}(u|\sigma|)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.53c}$$

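One last form into which this formula can be put uses the D* figure of merit introduced in Eq.
(6.48a),

$$D^{*}(|\sigma|) = \frac{\mathcal{R}(|\sigma|)\sqrt{A_d}}{\sqrt{S_1^{(\mathrm{det})}(u|\sigma|)}}.$$

Substituting this into (6.53c) gives

$$\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{2\sqrt{2}\,\sqrt{uD\,A_d}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,D^{*}(|\sigma|)}, \tag{6.53d}$$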

where Ad is the optically sensitive area of the detector.
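The equivalence of Eqs. (6.53c) and (6.53d) can be confirmed with a short calculation. In the sketch below the two NEdN forms are coded side by side and linked through the D* definition of Eq. (6.48a). All function and parameter names are illustrative assumptions, as are the numerical inputs; `throughput` stands for the product AΔΩ, and `modulation`, `efficiency`, `tau_a`, and `tau_f` stand for M, η, τa, and τf evaluated at the wavenumber of interest.

```python
import math


def nedn_from_noise_level(u, max_opd, s1, throughput, modulation,
                          efficiency, responsivity, tau_a, tau_f):
    """Eq. (6.53c): NEdN from the single-sided noise-power level s1."""
    denominator = (throughput * modulation * efficiency
                   * responsivity * tau_a * tau_f)
    return 2.0 * math.sqrt(2.0 * u * max_opd * s1) / denominator


def nedn_from_dstar(u, max_opd, detector_area, dstar, throughput,
                    modulation, efficiency, tau_a, tau_f):
    """Eq. (6.53d): NEdN from the detector area and D*."""
    denominator = throughput * modulation * efficiency * tau_a * tau_f * dstar
    return 2.0 * math.sqrt(2.0 * u * max_opd * detector_area) / denominator


# Hypothetical instrument values, chosen only to exercise the formulas.
u, max_opd = 2.0, 1.0                 # OPD velocity (cm/s), maximum OPD (cm)
throughput, modulation = 0.01, 0.95   # A*dOmega (cm^2 sr), M
efficiency, tau_a, tau_f = 0.8, 0.9, 0.7
responsivity, detector_area, s1 = 4.0, 0.01, 1.0e-24

dstar = responsivity * math.sqrt(detector_area) / math.sqrt(s1)  # Eq. (6.48a)
nedn_a = nedn_from_noise_level(u, max_opd, s1, throughput, modulation,
                               efficiency, responsivity, tau_a, tau_f)
nedn_b = nedn_from_dstar(u, max_opd, detector_area, dstar, throughput,
                         modulation, efficiency, tau_a, tau_f)
print(nedn_a, nedn_b)
```

The two results coincide, and the second form shows why D* is convenient: the responsivity drops out, leaving only the detector area and D* to describe the detector.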

6.17 Real and Imaginary Parts of the Detector Noise


One way—perhaps the easiest way—to estimate the detector noise contaminating a spectral
measurement is to graph the imaginary component of the spectral data coming out of the
interferometer’s calibration algorithm. Figures 6.5(a)–6.5(c), 6.6(a), and 6.6(b) above show the
simulated spectral measurement of an interferometer both with and without detector noise. To
show the behavior of the imaginary component of the complex data, we graph it in Fig. 6.7(b) for
the spectral measurement in Fig. 6.6(b), stretching the scale of the y axis to make it easier to see.
According to Eq. (6.38b), this is pure noise. Squaring the right-hand side of (6.38b) and taking its
expectation value gives, after substituting from Eq. (6.51e) (and using that W = +1 or −1 from the
discussion immediately preceding Eq. (4.84a) in Chapter 4),

$$\frac{16\,E\Bigl(\bigl[\mathrm{Im}\bigl(n_D^{(\mathrm{det})}(\sigma)\bigr)\bigr]^{2}\Bigr)}{\bigl[(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\bigr]^{2}} = \frac{8\,E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr)}{\bigl[(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\bigr]^{2}}.$$

This is the variance of the noise in Fig. 6.7(b). Taking the square root gives, for the imaginary
component,

$$\text{standard deviation} = \frac{2\sqrt{2}\,\sqrt{E\Bigl(\bigl|n_D^{(\mathrm{det})}(\sigma)\bigr|^{2}\Bigr)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{6.54}$$

The thick solid line labeled NEdN in Fig. 6.7(b) shows the size of this standard deviation. Figure
6.7(a) plots the actual spectral noise in the measured spectrum. Not only does this spectral noise
qualitatively resemble the imaginary component of the complex data in Fig. 6.7(b), but also, as
shown by the thick solid line in Fig. 6.7(a), the NEdN or standard deviation of the spectral noise
has the same value as the standard deviation of the imaginary component of the complex data.
This is no surprise; glancing back at the right-hand side of Eq. (6.51f), we note that the right-hand
side of (6.51f) has the same formula for the NEdN (or standard deviation of the spectral noise) as
appears on the right-hand side of Eq. (6.54) above.

6.18 Detector Noise in a Single-Sided Signal


Section 5.18 of Chapter 5 describes how to produce a single-sided measurement of the spectral
radiance L. When the interferometer signal at point C in Fig. 6.2 is free of noise, the most
important difference between single-sided and double-sided measurements from the viewpoint of
the interferometer user is the gain in spectral resolution that can be achieved without a major
redesign of the moving mirror (see the discussion at the beginning of Sec. 5.18 in Chapter 5).
When, however, the signal is contaminated by significant amounts of detector noise, the NEdN in
a single-sided measurement at a specified spectral resolution is larger than the NEdN in a double-
sided measurement at that same spectral resolution. To show why this is so, we add detector
noise to the signal and process it as a single-sided measurement while keeping track of what the
detector noise does to the spectral measurement.
Equations (5.84c) and (5.88c) in Chapter 5 introduce two functions, $\psi(\sigma)$ and $\varpi(\chi)$, used to
process single-sided measurements. Function $\psi(\sigma)$ is, according to Eq. (5.85b), the phase angle
of the detector circuit’s transfer function,

$$\psi(\sigma) = \arg[H(u\sigma)]. \tag{6.55a}$$

Footnote 88 of Chapter 5 explains that there are other nonideal aspects to interferometer
signals, such as the off-center sampling mentioned in Sec. 5.26 of Chapter 5, that can modify
the nonzero phase angle $\psi$ (although it always remains a slowly varying function of $\sigma$). From this
point on, we can include all these aspects in our analysis by regarding H as the “effective”
transfer function that includes not only the effects of the detector circuit but also all the other
significant causes of a nonzero phase angle. This makes h, the forward Fourier transform of H
(see Appendix 5A of Chapter 5), an “effective” impulse-response function for the signal leaving
the detector. Because H is still the forward Fourier transform of a real-valued function h when H
and h are taken to be the effective transfer function and effective impulse-response function, H is
still a Hermitian function satisfying Eq. (5A.6b) in Appendix 5A. Equation (5A.5), however, may
not be satisfied by an “effective” impulse-response function because the effective h may not be
causal.
Equation (5.88c) defines function $\varpi(\chi)$ to be the inverse Fourier transform of $e^{-i\psi(\sigma)}$
multiplied by the tapering function $V(\sigma)$ specified in Eq. (5.88d),

$$\varpi(\chi) = \int_{-\infty}^{\infty}\bigl[V(\sigma)\,e^{-i\psi(\sigma)}\bigr]\,e^{2\pi i\sigma\chi}\,d\sigma. \tag{6.55b}$$

As pointed out in the discussion following Eq. (5.88a) of Chapter 5, we only need to know $\psi$
exactly for

$$\sigma_{\min} \le \sigma \le \sigma_{\max};$$

outside this range, function V can be adjusted to make $[V(\sigma)\,e^{-i\psi(\sigma)}]$ taper to zero, ensuring that
the Fourier transform in (6.55b) exists.
Functions $\psi(\sigma)$ and $\varpi(\chi)$ can usually be recovered from the calibration procedure applied to
the interferometer. One method, as described at the beginning of Sec. 6.11 above, is to subtract
off the background signal $z_C^{(\mathrm{cold})}$ described in Sec. 6.3 and then, being sure to repeat the signal
measurements often enough to average away the noise, to calculate $\psi$ and $\varpi$ from the recipe
given in Sec. 5.18 of Chapter 5. Another possibility is to note that every detector signal must pass
through the same signal chain, ending up multiplied by the same effective transfer function H.
Hence both $Z_{\mathrm{eff,tot}}^{(1)}(\sigma)$ and $Z_{\mathrm{eff,tot}}^{(2)}(\sigma)$ in Eqs. (6.33a) and (6.33b) above are complex because all
their real functions of $\sigma$ are multiplied by the same complex transfer function $H(u\sigma)$, giving both
spectra the same nonzero phase angle $\psi(\sigma)$. In this sense $Z_{\mathrm{eff,tot}}^{(1)}(\sigma)$ and $Z_{\mathrm{eff,tot}}^{(2)}(\sigma)$ are
mathematically equivalent to $Z_{\mathrm{eff}}(\sigma)$ in Eq. (5.83d) of Chapter 5, which means that we can get
the required $\psi(\sigma)$ phase data by putting either $Z_{\mathrm{eff,tot}}^{(1)}(\sigma)$ or $Z_{\mathrm{eff,tot}}^{(2)}(\sigma)$ through the same
numerical recipe that $Z_{\mathrm{eff}}(\sigma)$ is put through in Sec. 5.18.


The single-sided signal $z_{\mathrm{conv}}(\chi)$ defined in Eq. (5.89a) in Chapter 5 is calculated between
$\chi = 0$ and $\chi = 2D - 2d$ because in Sec. 5.18 of Chapter 5 we want to examine how the same
range of moving-mirror motion can be manipulated to improve spectral resolution. In this section,
however, we want to keep the spectral resolution unchanged while comparing the detector noise
in single-sided and double-sided spectral measurements. Equation (5.67) of Chapter 5 specifies
the spectral resolution of a double-sided measurement between $\chi = -D$ and $\chi = D$ to be

$$\Delta\sigma_{\text{double sided}} = \frac{1}{2D},$$

and Eq. (5.93b) specifies the corresponding spectral resolution of a single-sided measurement
with $z_{\mathrm{conv}}(\chi)$ known between $\chi = 0$ and $\chi = 2D - 2d$ to be

$$\Delta\sigma_{\text{single sided}} = \frac{1}{2(2D - 2d)}.$$

For the single-sided interferometer discussed in Sec. 5.18 of Chapter 5, we expect to have

$$d \ll D, \tag{6.56}$$

which means that $\Delta\sigma_{\text{single sided}} \cong 1/(4D) = \Delta\sigma_{\text{double sided}}/2$. Hence, to create a single-sided
measurement with the same spectral detail as a double-sided measurement, we should record
$z_{\mathrm{conv}}(\chi)$ only between $\chi = 0$ and $\chi = D$ rather than between $\chi = 0$ and $\chi = 2D - 2d \cong 2D$. This
ensures that both the single-sided and double-sided cases have the same spectral resolution.
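The resolution bookkeeping in this paragraph can be written out in a few lines. The sketch below is only an illustration: the function names and the values D = 1 cm and d = 0.01 cm are assumptions, and the single-sided formula is Eq. (5.93b) generalized in the obvious way to an arbitrary recorded OPD range.

```python
import math


def double_sided_resolution(max_opd):
    """Spectral resolution for data recorded from -D to +D, Eq. (5.67)."""
    return 1.0 / (2.0 * max_opd)


def single_sided_resolution(recorded_range):
    """Spectral resolution for data recorded from 0 to recorded_range.

    With recorded_range = 2D - 2d this is Eq. (5.93b).
    """
    return 1.0 / (2.0 * recorded_range)


D, d = 1.0, 0.01   # hypothetical OPD limits in cm, with d << D
full_scan = single_sided_resolution(2.0 * D - 2.0 * d)
matched_scan = single_sided_resolution(D)
print(double_sided_resolution(D), full_scan, matched_scan)
```

Recording the single-sided signal over the full range gives roughly half the double-sided sample spacing, while recording it only out to D reproduces the double-sided value exactly.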
To construct the $z_{\mathrm{conv}}$ signal between 0 and D, we convolve $\varpi(\chi)$ with the signal component
created by the $L(\sigma)$ input radiance at point C in Fig. 6.2, as shown by Eq. (5.89a) in Chapter 5.
Nothing stops us, however, from convolving the total signal at point C with $\varpi$ while planning to
discard the unwanted background components later on. Because we want to keep track of the
noise, $\varpi$ should be convolved with the total noise-contaminated signal $z_{CN}^{(\mathrm{tot})}(\chi)$ specified in Eq.
(6.22a) above. We get, remembering that the convolution is a linear operation [see Eqs. (2.38b)
and (2.38d) in Chapter 2],

$$z_{CN}^{(\mathrm{tot})}(\chi) * \varpi(\chi) = z_C(\chi) * \varpi(\chi) + z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) + \frac{1}{u}\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right] * \varpi(\chi).$$

The associative property of the convolution [see Eq. (2.38c) in Chapter 2] gives

$$\left[n^{(\mathrm{det})}(\chi) * h\!\left(\frac{\chi}{u}\right)\right] * \varpi(\chi) = n^{(\mathrm{det})}(\chi) * \left[h\!\left(\frac{\chi}{u}\right) * \varpi(\chi)\right] = n^{(\mathrm{det})}(\chi) * \not{h}\!\left(\frac{\chi}{u}\right),$$

where we define

$$\not{h}\!\left(\frac{\chi}{u}\right) = h\!\left(\frac{\chi}{u}\right) * \varpi(\chi). \tag{6.57a}$$

Now the noise-contaminated signal can be written as

$$z_{CN}^{(\mathrm{tot})}(\chi) * \varpi(\chi) = z_C(\chi) * \varpi(\chi) + z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) + \frac{1}{u}\left[n^{(\mathrm{det})}(\chi) * \not{h}\!\left(\frac{\chi}{u}\right)\right], \tag{6.57b}$$

and to get the total noise-free signal, we just set $n^{(\mathrm{det})}(\chi)$ to zero:

$$z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi) = z_C(\chi) * \varpi(\chi) + z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi). \tag{6.57c}$$

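To analyze $z_C(\chi) * \varpi(\chi)$, the first term in Eq. (6.57c), we apply the Fourier convolution
theorem to its forward Fourier transform [see Eq. (2.39a) in Chapter 2],

$$F^{(-i\sigma\chi)}\bigl(z_C(\chi) * \varpi(\chi)\bigr) = F^{(-i\sigma\chi)}\bigl(z_C(\chi)\bigr)\cdot F^{(-i\sigma\chi')}\bigl(\varpi(\chi')\bigr). \tag{6.58a}$$

The Fourier transforms in Eqs. (6.55b) and (6.5d) can be reversed to get

$$V(\sigma)\,e^{-i\psi(\sigma)} = \int_{-\infty}^{\infty}\varpi(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = F^{(-i\sigma\chi)}\bigl(\varpi(\chi)\bigr) \tag{6.58b}$$

and

$$
\begin{aligned}
&\frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,\mathcal{R}(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|) \\
&\qquad = \int_{-\infty}^{\infty}z_C(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = F^{(-i\sigma\chi)}\bigl(z_C(\chi)\bigr).
\end{aligned}
\tag{6.58c}
$$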

Substituting (6.58b) and (6.58c) into (6.58a) gives

$$
\begin{aligned}
&F^{(-i\sigma\chi)}\bigl(z_C(\chi) * \varpi(\chi)\bigr) \\
&\qquad = V(\sigma)\cdot\left(\frac{WA\,\Delta\Omega}{4}\right)\cdot\bigl[e^{-i\psi(\sigma)}H(u\sigma)\bigr]\,M(R\sigma\theta_{ma})\,\mathcal{R}(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|).
\end{aligned}
\tag{6.58d}
$$

According to Eq. (6.55a) and the discussion following it, $\psi(\sigma)$ is the argument or complex phase
angle of the effective transfer function $H(u\sigma)$, so

$$H(u\sigma) = e^{i\psi(\sigma)}\cdot|H(u\sigma)|$$

or

$$e^{-i\psi(\sigma)}\,H(u\sigma) = |H(u\sigma)|. \tag{6.58e}$$
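The phase removal in Eq. (6.58e), together with the Hermitian symmetry of H that is used shortly afterwards, can be illustrated numerically. The sketch below assumes a single-pole low-pass response as a stand-in for the effective transfer function; that choice, the time constant, the test frequency, and the function names are all illustrative assumptions rather than anything specified in the text.

```python
import cmath
import math


def effective_transfer(f, time_constant=1.0e-4):
    """Hypothetical single-pole low-pass response standing in for H(f)."""
    return 1.0 / (1.0 + 2j * math.pi * f * time_constant)


def phase_corrected(transfer_value):
    """Multiply H by exp(-i*psi), where psi = arg(H), as in Eq. (6.58e)."""
    psi = cmath.phase(transfer_value)
    return cmath.exp(-1j * psi) * transfer_value


h_pos = effective_transfer(2500.0)     # H at f = u*sigma
h_neg = effective_transfer(-2500.0)    # H at f = -u*sigma
corrected = phase_corrected(h_pos)
print(corrected, abs(h_pos), abs(h_neg))
```

The corrected value is real and equal to the magnitude of H, and the magnitudes at positive and negative frequency match because this H is the transform of a real impulse response.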

We also note that [see Eq. (5.88d) in Chapter 5] the tapering function $V(\sigma)$ equals one for those $\sigma$
values where $0 < \sigma_{\min} \le |\sigma| \le \sigma_{\max}$. These are also, according to the discussion following Eq.
(6.38c) above, the $\sigma$ values where the product

$$\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)$$

in Eq. (6.58d) is not zero. So either $V(\sigma)$ is multiplied by zero on the right-hand side of (6.58d),
which means that its value does not matter, or else $\sigma$ has a value for which $V(\sigma)$ is one. Hence Eq.
(6.58d) can be written as

$$
\begin{aligned}
&F^{(-i\sigma\chi)}\bigl(z_C(\chi) * \varpi(\chi)\bigr) \\
&\qquad = \left(\frac{WA\,\Delta\Omega}{4}\right)\cdot\bigl[e^{-i\psi(\sigma)}H(u\sigma)\bigr]\,M(R\sigma\theta_{ma})\,\mathcal{R}(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|),
\end{aligned}
$$

which becomes, substituting from (6.58e),

$$
\begin{aligned}
&F^{(-i\sigma\chi)}\bigl(z_C(\chi) * \varpi(\chi)\bigr) \\
&\qquad = \left(\frac{WA\,\Delta\Omega}{4}\right)|H(u\sigma)|\,M(R\sigma\theta_{ma})\,\mathcal{R}(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|).
\end{aligned}
\tag{6.58f}
$$

According to Eq. (5A.6b) in Appendix 5A of Chapter 5, H is a Hermitian function, which makes
its magnitude even:

$$|H(-u\sigma)| = |H(u\sigma)^{*}| = |H(u\sigma)|. \tag{6.58g}$$


Equation (5.10f) in Chapter 5 and (4.139g) in Chapter 4 show that M and $\eta$ are also even
functions, and clearly the product

$$\mathcal{R}(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|)$$

is even because all the functions depend only on $|\sigma|$. Hence the entire right-hand side of Eq. (6.58f) is
a real and even function of $\sigma$. Reversing the Fourier transform in (6.58f) to get

$$
\begin{aligned}
&z_C(\chi) * \varpi(\chi) \\
&\qquad = F^{(i\sigma\chi)}\!\left(\left(\frac{WA\,\Delta\Omega}{4}\right)|H(u\sigma)|\,M(R\sigma\theta_{ma})\,\mathcal{R}(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|)\right),
\end{aligned}
\tag{6.58h}
$$

we conclude that the convolution $z_C(\chi) * \varpi(\chi)$ is another real and even function because it is the
inverse Fourier transform of a real and even function (see entry 1 in Table 2.1 of Chapter 2):

$$z_C(-\chi) * \varpi(-\chi) = z_C(\chi) * \varpi(\chi). \tag{6.58i}$$

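To analyze the second term on the right-hand side of Eq. (6.57c),

$$z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi),$$

we take its forward Fourier transform to get, again using Eq. (2.39a) in Chapter 2,

$$F^{(-i\sigma\chi)}\bigl(z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi)\bigr) = F^{(-i\sigma\chi)}\bigl(z_C^{(\mathrm{cold})}(\chi)\bigr)\cdot F^{(-i\sigma\chi')}\bigl(\varpi(\chi')\bigr). \tag{6.59a}$$

This can be written as, substituting from Eqs. (6.58b) and (6.11a),

$$
\begin{aligned}
&F^{(-i\sigma\chi)}\bigl(z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi)\bigr) = \\
&\qquad V(\sigma)\left(\frac{WA\,\Delta\Omega}{4}\right)\bigl[e^{-i\psi(\sigma)}H(u\sigma)\bigr]\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\bigl[L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|)\bigr].
\end{aligned}
\tag{6.59b}
$$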
© 4 ¹

Comparing this to Eq. (6.58d), we note that if $\tau_f$ is replaced by one, and if $L_{FOV}$ is replaced by
$[L_{FOV}^{(\mathrm{fore})} - L_{FOV}^{(\mathrm{back})}]$, then the right-hand side of (6.58d) becomes the same as the right-hand side of
(6.59b); that is,

$$F^{(-i\sigma\chi)}\bigl(z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi)\bigr)$$

becomes mathematically equivalent to

$$F^{(-i\sigma\chi)}\bigl(z_C(\chi) * \varpi(\chi)\bigr).$$

No special assumption was made about the nature of $L_{FOV}$ when analyzing the formula for

$$F^{(-i\sigma\chi)}\bigl(z_C(\chi) * \varpi(\chi)\bigr),$$

and only one assumption was made about $\tau_f$: that the tapering function $V(\sigma)$ equals one for those
$\sigma$ values where the product

$$\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)$$

is not zero [see discussion following Eq. (6.58e) above]. Nothing stops us from tightening this
assumption slightly by requiring that the tapering function equals one when the product
$\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)$ is not equal to zero; this prevents $\tau_f$ from having any effect on our previous
analysis of $F^{(-i\sigma\chi)}\bigl(z_C(\chi) * \varpi(\chi)\bigr)$. Hence both $\tau_f$ and $L_{FOV}$ turn into placeholder functions when
deriving Eqs. (6.58f) and (6.58i) from (6.58d), which means that (6.58f) and (6.58i) still hold
true when $\tau_f$ is set equal to one and $L_{FOV}$ is replaced by $[L_{FOV}^{(\mathrm{fore})} - L_{FOV}^{(\mathrm{back})}]$. Consequently, we can
now apply Eqs. (6.58f) and (6.58i) to Eq. (6.59b) to get, setting $\tau_f$ equal to one, replacing
$L_{FOV}$ by $[L_{FOV}^{(\mathrm{fore})} - L_{FOV}^{(\mathrm{back})}]$, and adding a “(cold)” superscript to Eqs. (6.58f, i),

$$
\begin{aligned}
&F^{(-i\sigma\chi)}\bigl(z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi)\bigr) \\
&\qquad = \left(\frac{WA\,\Delta\Omega}{4}\right)|H(u\sigma)|\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\mathcal{R}(|\sigma|)\,\tau_a(|\sigma|)\,\bigl[L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|)\bigr]
\end{aligned}
\tag{6.59c}
$$

and

$$z_C^{(\mathrm{cold})}(-\chi) * \varpi(-\chi) = z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi). \tag{6.59d}$$

Equations (6.58i) and (6.59d) show that both terms on the right-hand side of Eq. (6.57c) are
even functions of $\chi$, which means that their sum

$$z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi) = z_C(\chi) * \varpi(\chi) + z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi)$$

must also be an even function of $\chi$. This means we can take the $z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi)$ data collected
from $\chi = 0$ to $\chi = D$ and use it to create an artificial signal between $\chi = 0$ and $\chi = -D$. We call
the artificially doubled, noise-free signal

$$
\begin{aligned}
&\mathrm{Even}\bigl[z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi)\bigr] \\
&\qquad = \Pi(\chi,D)\bigl[z_C(\chi) * \varpi(\chi)\bigr] + \Pi(\chi,D)\bigl[z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi)\bigr].
\end{aligned}
\tag{6.60a}
$$

We define the “Even” operator by stating that

$$\mathrm{Even}[z(\chi)] = \Pi(\chi,D)\,z(|\chi|) \tag{6.60b}$$

for any function $z(\chi)$. This forces $\mathrm{Even}[z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi)]$ to have the same values at $\chi = -\chi_0$, for
$0 \le \chi_0 \le D$, as $z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi)$ has at $\chi = \chi_0$. The $\Pi(\chi,D)$ function has the same meaning as in
Eq. (6.22c) above, reminding us, since it equals one for $|\chi| \le D$ and equals zero otherwise, that
no data exists for $|\chi| > D$. We note that, although absolute value signs are applied to $\chi$ on the
right-hand side of (6.60b), they are not needed in (6.60a) because the right-hand side is already
an even function of $\chi$. These formulas seem straightforward enough, but we should note that the
Even operator has an interesting effect on a noise-contaminated signal such as the one in Eq.
(6.57b): the noise contaminating the signal at positive $\chi$ automatically becomes the same as the
noise contaminating the signal at negative $\chi$. Another way of putting this is that, for any
$0 \le \chi_0 \le D$, the signal at $\chi = -\chi_0$ is always in error from the presence of random detector noise
by exactly the same amount as the signal at $\chi = \chi_0$. To show what the Even operator does to the
noise-contaminated signal in (6.57b), we need the Heaviside step function [which has already
been defined in Eq. (2.70a) of Chapter 2],

$$
\Xi(\chi) =
\begin{cases}
1 & \text{for } \chi > 0 \\
1/2 & \text{for } \chi = 0 \\
0 & \text{for } \chi < 0
\end{cases}
. \tag{6.60c}
$$


Applying the Even operator to both sides of (6.57b) now gives

$$
\begin{aligned}
&\mathrm{Even}\bigl[z_{CN}^{(\mathrm{tot})}(\chi) * \varpi(\chi)\bigr] \\
&\quad = \Pi(\chi,D)\bigl[z_C(\chi) * \varpi(\chi)\bigr] + \Pi(\chi,D)\bigl[z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi)\bigr] \\
&\qquad + u^{-1}\,\Pi(\chi,D)\left\{\Xi(\chi)\cdot\left[n^{(\mathrm{det})}(\chi) * \not{h}\!\left(\frac{\chi}{u}\right)\right] + \Xi(-\chi)\cdot\left[n^{(\mathrm{det})}(-\chi) * \not{h}\!\left(-\frac{\chi}{u}\right)\right]\right\}.
\end{aligned}
\tag{6.60d}
$$

To show that the noise term is handled correctly in (6.60d), we note that when $\chi$ is positive, the
first term inside the braces { } specifies the noise because the second term is zero; and when $\chi$ is
negative, the second term specifies the noise to be the same as it is for $-\chi \ge 0$ because then the
first term is zero. This ensures that the random noise inside the braces automatically has the same
value at $+\chi$ and $-\chi$.
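The effect of the Even operator on a noisy record can be seen in a small sketch. The code below is an illustration under stated assumptions, not the processing chain of the text: the recorded signal is a made-up cosine plus one fixed realization of Gaussian noise on a coarse OPD grid, and the names `recorded`, `even_extension`, `MAX_OPD`, and `STEP` are hypothetical. The extension implements Eq. (6.60b) directly, returning zero wherever the rectangle function is zero.

```python
import math
import random

MAX_OPD = 1.0   # D, in cm (assumed)
STEP = 0.1      # OPD sample spacing, in cm (assumed)

rng = random.Random(1)
# One fixed noise realization on the recorded grid chi = 0, 0.1, ..., 1.0.
noise = [rng.gauss(0.0, 0.05) for _ in range(11)]


def clean(chi):
    """Made-up noise-free even signal."""
    return math.cos(2.0 * math.pi * chi)


def recorded(chi):
    """Noisy single-sided record, defined only for 0 <= chi <= MAX_OPD."""
    return clean(chi) + noise[round(chi / STEP)]


def even_extension(z, max_opd):
    """Even[z](chi) = Pi(chi, D) * z(|chi|), as in Eq. (6.60b)."""
    def extended(chi):
        if abs(chi) > max_opd:
            return 0.0          # the rectangle function is zero here
        return z(abs(chi))
    return extended


even_signal = even_extension(recorded, MAX_OPD)
error_pos = even_signal(0.3) - clean(0.3)
error_neg = even_signal(-0.3) - clean(-0.3)
print(error_pos, error_neg)
```

The two printed errors are identical: the mirrored half of the signal carries an exact copy of the noise from the recorded half rather than an independent realization.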

6.19 Uncalibrated Spectra of Single-Sided Signals with Detector Noise


To get the uncalibrated signal spectrum of the artificially even, noise-contaminated signal in Eq.
(6.60d), we apply the forward Fourier transform to both sides of the equation, using the linearity
of the transform as described in Sec. 2.6 of Chapter 2 to write

$$
\begin{aligned}
&F^{(-i\sigma\chi)}\Bigl(\mathrm{Even}\bigl[z_{CN}^{(\mathrm{tot})}(\chi) * \varpi(\chi)\bigr]\Bigr) \\
&\quad = F^{(-i\sigma\chi)}\Bigl(\Pi(\chi,D)\bigl[z_C(\chi) * \varpi(\chi)\bigr]\Bigr) + F^{(-i\sigma\chi)}\Bigl(\Pi(\chi,D)\bigl[z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi)\bigr]\Bigr) \\
&\qquad + u^{-1}F^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\cdot\Xi(\chi)\cdot\left[n^{(\mathrm{det})}(\chi) * \not{h}\!\left(\frac{\chi}{u}\right)\right]\right) \\
&\qquad + u^{-1}F^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\cdot\Xi(-\chi)\cdot\left[n^{(\mathrm{det})}(-\chi) * \not{h}\!\left(-\frac{\chi}{u}\right)\right]\right).
\end{aligned}
\tag{6.61}
$$

The first two terms on the right-hand side are easier to evaluate than the last two, so we start with
the first two and leave the more difficult work for later.
Using the Fourier convolution theorem once on the first term [see Eq. (2.39j) of Chapter 2]
gives

\[
\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,[z_C(\chi) \ast \varpi(\chi)]\right)
&= \mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\right) \ast \mathcal{F}^{(-i\sigma\chi')}\left(z_C(\chi') \ast \varpi(\chi')\right) \\
&= [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast \left\{ \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|) \right\} ,
\end{aligned}
\tag{6.62a}
\]

where in the last step we substitute from Eqs. (6.24b) and (6.58f) to evaluate the convolved
Fourier transforms. According to the discussion following Eq. (5.82c) in Chapter 5, everything
inside the braces { } is a slowly varying function of σ compared to L_FOV; and Sec. 5.15 of Chapter
5 explains why sinc(2πσD) must, in a well-designed interferometer, be a narrow function
varying no less rapidly than the major features of L_FOV. Hence everything inside the braces
(except L_FOV) must be slowly varying with σ compared to the narrow function sinc(2πσD).
Therefore, according to Eq. (5C.1) in Appendix 5C of Chapter 5, the convolution in (6.62a)
primarily affects L_FOV, giving us

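\[
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,[z_C(\chi) \ast \varpi(\chi)]\right)
\cong \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{mnf}(|\sigma|) ,
\tag{6.62b}
\]

where [see Eqs. (6.23a) above and (5.108f) in Chapter 5]

\[
L_{mnf}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast L_{FOV}(|\sigma|) .
\tag{6.62c}
\]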

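The second term on the right-hand side of (6.61) is handled the same way as the first. Again
using Eq. (2.39j) in Chapter 2, we write

\[
\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,[z_C^{(cold)}(\chi) \ast \varpi(\chi)]\right)
&= \mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\right) \ast \mathcal{F}^{(-i\sigma\chi')}\left(z_C^{(cold)}(\chi') \ast \varpi(\chi')\right) \\
&= [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast \left\{ \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,[L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)] \right\}
\end{aligned}
\tag{6.63a}
\]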

with Eqs. (6.24b) and (6.59c) used to evaluate the convolved Fourier transforms. Only

\[
[L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]
\]

inside the braces { } might not be a slowly varying function of σ compared to sinc(2πσD), so again
Eq. (5C.1) in Appendix 5C can be used to write

\[
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,[z_C^{(cold)}(\chi) \ast \varpi(\chi)]\right)
\cong \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,[L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)] ,
\tag{6.63b}
\]

where [see Eqs. (6.25b), (6.25c) and (6.25f), (6.25g) above]

\[
L_{mnf}^{(fore)}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast L_{FOV}^{(fore)}(|\sigma|)
\tag{6.63c}
\]

and

\[
L_{mnf}^{(back)}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast L_{FOV}^{(back)}(|\sigma|) .
\tag{6.63d}
\]

Now we are ready to analyze the last two terms in Eq. (6.61). Evaluation of the forward
Fourier transforms of h/(χ/u), [Π(χ, D)·Ξ(χ)], and [Π(χ, D)·Ξ(−χ)] comes first.
Taking the forward Fourier transform of h/(χ/u) defined in Eq. (6.57a) gives, applying the
Fourier convolution theorem [Eq. (2.39a) in Chapter 2],

\[
\mathcal{F}^{(-i\sigma\chi)}\left(h\!\!\!/\left(\frac{\chi}{u}\right)\right)
= \mathcal{F}^{(-i\sigma\chi)}\left(h\left(\frac{\chi}{u}\right)\right) \cdot \mathcal{F}^{(-i\sigma\chi')}\left(\varpi(\chi')\right) .
\]

This can be written as, substituting from Eqs. (6.27b) and (5.88b) in Chapter 5,

\[
\mathcal{F}^{(-i\sigma\chi)}\left(h\!\!\!/\left(\frac{\chi}{u}\right)\right)
= u\,H(u\sigma) \cdot V(\sigma)\,e^{-i\phi(\sigma)} = u\,V(\sigma)\,|H(u\sigma)| ,
\tag{6.64a}
\]

where in the last step we use

\[
e^{-i\phi(\sigma)}\,H(u\sigma) = |H(u\sigma)|
\]

from (6.58e) to simplify the formula. According to Eq. (6.58g), the magnitude of the effective
transfer function |H(uσ)| is even with respect to σ, and of course it must also be real. Function
V(σ) is real and, according to Eq. (5.88e) in Chapter 5, it is also even. Hence, (6.64a) reveals that
the forward Fourier transform of h/(χ/u) is real and even. Entry 1 of Table 2.1 in Chapter 2 now
shows that h/ itself must be real and even:

\[
h\!\!\!/\left(-\frac{\chi}{u}\right) = h\!\!\!/\left(\frac{\chi}{u}\right)
\tag{6.64b}
\]

and

\[
\mathrm{Im}\left(h\!\!\!/\left(\frac{\chi}{u}\right)\right) = 0 .
\tag{6.64c}
\]

For future use, we note that h/(χ/u), just like h(t), is a relatively narrow function of its argument.
To see why this is so, we consult Eq. (6.21a) and note that there exists a time T such that

\[
h(t) \approx 0 \quad \text{for } |t| > T ,
\]

which means that

\[
h\left(\frac{\chi}{u}\right) \approx 0 \quad \text{for } |\chi| > uT .
\tag{6.65a}
\]

Function ϖ(χ) is also a relatively narrow function of χ with [see Eq. (5.88h) in Chapter 5]

\[
\varpi(\chi) \approx 0 \quad \text{for } |\chi| > d .
\tag{6.65b}
\]

Function h/(χ/u) is, according to Eq. (6.57a), the convolution of h(χ/u) and ϖ(χ) and so can
be written as [see the definition of the convolution in Eq. (2.38a) of Chapter 2]

\[
h\!\!\!/\left(\frac{\chi}{u}\right) = \int_{-\infty}^{\infty} h\left(\frac{\chi'}{u}\right)\varpi(\chi - \chi')\,d\chi' .
\]

The approximation in (6.65a) gives

\[
h\!\!\!/\left(\frac{\chi}{u}\right) \cong \int_{-uT}^{uT} h\left(\frac{\chi'}{u}\right)\varpi(\chi - \chi')\,d\chi' .
\tag{6.65c}
\]

The approximation in (6.65b) reveals that

\[
h\left(\frac{\chi'}{u}\right)\varpi(\chi - \chi')
\]

can only make a significant contribution to the integral in (6.65c) when

\[
|\chi - \chi'| < d ,
\]


because, when this is not true, (6.65b) forces ϖ to be small. But the limits on the integral confine
χ′ to values between +uT and −uT, so when

\[
|\chi| > d + uT ,
\]

it is impossible for

\[
h\left(\frac{\chi'}{u}\right)\varpi(\chi - \chi')
\]

to make a significant contribution to the integral for any of the allowed values of χ′.
Consequently,

\[
h\!\!\!/\left(\frac{\chi}{u}\right) \cong \int_{-uT}^{uT} h\left(\frac{\chi'}{u}\right)\varpi(\chi - \chi')\,d\chi'
\]

must be negligible when |χ| > d + uT. Therefore h/(χ/u) is a relatively narrow function, because
we can write

\[
h\!\!\!/\left(\frac{\chi}{u}\right) \approx 0 \quad \text{for } |\chi| > d + uT .
\tag{6.65d}
\]

This demonstration that h/ is a narrow function relies only on its being the convolution of two
other narrow functions. In general, the convolution of two narrow functions produces another
narrow function whose width can be no wider than (approximately) the sum of the widths of the
functions being convolved.
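
A quick discrete check of this width rule is sketched below. The two compactly supported sequences
are hypothetical stand-ins (a triangle in the role of h and a raised cosine in the role of ϖ);
neither shape comes from the text, and the half-widths `half_h` and `half_w` merely play the roles
of uT and d.

```python
import math

def convolve(a, b):
    """Full discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def support_width(values, tol=1e-12):
    """Distance, in sample steps, between the first and last non-negligible samples."""
    idx = [i for i, v in enumerate(values) if abs(v) > tol]
    return idx[-1] - idx[0] if idx else 0

half_h, half_w = 30, 20   # hypothetical half-widths (in samples) standing in for uT and d
h = [max(0.0, 1.0 - abs(k) / half_h) for k in range(-50, 51)]
w = [0.5 * (1.0 + math.cos(math.pi * k / half_w)) if abs(k) < half_w else 0.0
     for k in range(-50, 51)]

width_h = support_width(h)
width_w = support_width(w)
width_conv = support_width(convolve(h, w))
print(width_h, width_w, width_conv)
```

For these non-negative sequences the width of the convolution equals the sum of the two widths;
for functions that change sign, cancellation can only make the result narrower, which is why the
text states the sum as an upper bound.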
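The forward Fourier transform of [Π(χ, D)·Ξ(χ)] is, according to Eqs. (6.22b) and (6.60c),

\[
\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,\Xi(\chi)\right)
&= \int_{0}^{D} e^{-2\pi i\sigma\chi}\,d\chi
 = \int_{-\infty}^{\infty} \Pi\left(\chi - \frac{D}{2}, \frac{D}{2}\right) e^{-2\pi i\sigma\chi}\,d\chi \\
&= \mathcal{F}^{(-i\sigma\chi)}\left(\Pi\left(\chi - \frac{D}{2}, \frac{D}{2}\right)\right) .
\end{aligned}
\]

This becomes, using Eqs. (2.36b) and (2.108b) in Chapter 2,

\[
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,\Xi(\chi)\right) = e^{-\pi i\sigma D}\,[D\,\mathrm{sinc}(\pi\sigma D)] .
\tag{6.66a}
\]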

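The same analysis of the forward Fourier transform of [Π(χ, D)·Ξ(−χ)] gives

\[
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,\Xi(-\chi)\right)
= \int_{-D}^{0} e^{-2\pi i\sigma\chi}\,d\chi
= \int_{-\infty}^{\infty} \Pi\left(\chi + \frac{D}{2}, \frac{D}{2}\right) e^{-2\pi i\sigma\chi}\,d\chi
\]

or

\[
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,\Xi(-\chi)\right) = e^{\pi i\sigma D}\,[D\,\mathrm{sinc}(\pi\sigma D)] .
\tag{6.66b}
\]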
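
The two closed forms can be checked numerically. The sketch below assumes the unnormalized
convention sinc(x) = sin(x)/x (the convention under which the integral of e^(−2πiσχ) from 0 to D
reproduces (6.66a)); the function names and the test values of σ and D are hypothetical.

```python
import cmath
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, the convention assumed for (6.66a) and (6.66b)."""
    return 1.0 if x == 0 else math.sin(x) / x

def half_rect_numeric(sigma, D, side=1, n=20000):
    """Midpoint-rule estimate of the transform of Pi(chi, D) * Xi(side * chi)."""
    step = D / n
    total = 0j
    for k in range(n):
        chi = side * (k + 0.5) * step      # midpoints on [0, D] or [-D, 0]
        total += cmath.exp(-2j * math.pi * sigma * chi)
    return total * step

def half_rect_closed(sigma, D, side=1):
    """Closed forms: (6.66a) for side=+1 and (6.66b) for side=-1."""
    return cmath.exp(-1j * side * math.pi * sigma * D) * D * sinc(math.pi * sigma * D)

sigma, D = 3.3, 1.0   # hypothetical wavenumber and maximum OPD
err_pos = abs(half_rect_numeric(sigma, D, +1) - half_rect_closed(sigma, D, +1))
err_neg = abs(half_rect_numeric(sigma, D, -1) - half_rect_closed(sigma, D, -1))
print(err_pos, err_neg)
```

Both differences should be at the level of the quadrature error, and the two closed forms are
complex conjugates of each other, as expected for mirror-image half rectangles.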

Having evaluated the formulas for the Fourier transforms of h/(χ/u), [Π(χ, D)·Ξ(χ)], and
[Π(χ, D)·Ξ(−χ)], we are ready to evaluate the third and fourth terms on the right-hand side of
Eq. (6.61).
We begin evaluation of the third term by using the definition of the convolution [see Eq.
(2.38a) in Chapter 2] to write

\[
\begin{aligned}
n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)
&= \int_{-\infty}^{\infty} n^{(det)}(\chi')\,h\!\!\!/\left(\frac{\chi - \chi'}{u}\right) d\chi' \\
&\cong \int_{\chi - (d + uT)}^{\chi + (d + uT)} n^{(det)}(\chi')\,h\!\!\!/\left(\frac{\chi - \chi'}{u}\right) d\chi' .
\end{aligned}
\tag{6.67a}
\]

The approximation in the last step comes from noting that the product

\[
n^{(det)}(\chi')\,h\!\!\!/\left(\frac{\chi - \chi'}{u}\right)
\]

is negligible when (χ − χ′) lies outside the range of values between (d + uT) and −(d + uT) for
which h/ is significantly different from zero [see (6.65d)]. Multiplying both sides of (6.67a) by
Π(χ, D) gives

\[
\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]
\cong \Pi(\chi, D) \int_{\chi - (d + uT)}^{\chi + (d + uT)} n^{(det)}(\chi')\,h\!\!\!/\left(\frac{\chi - \chi'}{u}\right) d\chi' .
\]

The new Π(χ, D) factor reduces this equation to 0 = 0 when |χ| > D. Remembering that h/ is
negligible whenever (χ − χ′) lies outside the range of values between (d + uT) and −(d + uT), we
extend the limits of the integral on the right-hand side to get the new approximation

\[
\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]
\cong \Pi(\chi, D) \int_{-D - (d + uT)}^{D + (d + uT)} n^{(det)}(\chi')\,h\!\!\!/\left(\frac{\chi - \chi'}{u}\right) d\chi' .
\]


Here we rely on the extra regions of integration going from χ′ = −D − (d + uT) to
χ′ = χ − (d + uT) and from χ′ = χ + (d + uT) to χ′ = D + (d + uT) to contribute only a
negligible amount to the integral. This approximation can also be written as

\[
\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]
\cong \Pi(\chi, D) \int_{-\infty}^{\infty} \Pi(\chi', \bar{D})\,n^{(det)}(\chi')\,h\!\!\!/\left(\frac{\chi - \chi'}{u}\right) d\chi' ,
\]

where

\[
\bar{D} = D + d + uT .
\tag{6.67b}
\]

Equation (2.38a) in Chapter 2 can be used to recognize the integral as a convolution,

\[
\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]
\cong \Pi(\chi, D)\left\{ \left[\Pi(\chi, \bar{D})\,n^{(det)}(\chi)\right] \ast h\!\!\!/\left(\frac{\chi}{u}\right) \right\} .
\tag{6.67c}
\]

This becomes, multiplying through by the Heaviside step function Ξ(χ),

\[
\Xi(\chi)\,\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]
\cong \Xi(\chi)\,\Pi(\chi, D)\left\{ \left[\Pi(\chi, \bar{D})\,n^{(det)}(\chi)\right] \ast h\!\!\!/\left(\frac{\chi}{u}\right) \right\} .
\]

Taking the forward Fourier transform of both sides gives, using Eqs. (2.39j) and (2.39a) in
Chapter 2,

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong \mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\right) \ast
\left\{ \mathcal{F}^{(-i\sigma\chi')}\left(\Pi(\chi', \bar{D})\,n^{(det)}(\chi')\right) \cdot
\mathcal{F}^{(-i\sigma\chi'')}\left(h\!\!\!/\left(\frac{\chi''}{u}\right)\right) \right\} .
\end{aligned}
\]

This can be written as, substituting from Eqs. (6.64a) and (6.66a),

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong \left[D\,e^{-\pi i\sigma D}\,\mathrm{sinc}(\pi\sigma D)\right] \ast
\left[u\,V(\sigma)\,|H(u\sigma)| \cdot \mathcal{F}^{(-i\sigma\chi')}\left(\Pi(\chi', \bar{D})\,n^{(det)}(\chi')\right)\right] .
\end{aligned}
\]

We note, due to the size of the σD product, that e^(−πiσD) sinc(πσD) is about as narrow and rapidly
varying a function of σ as sinc(πσD). Hence, glancing back at the discussion following Eq.
(6.62a) above, we see that V(σ) and |H(uσ)| vary slowly with σ compared to e^(−πiσD) sinc(πσD),
which means, according to Eq. (5C.1) in Appendix 5C of Chapter 5, that V(σ) and |H(uσ)| can
be moved outside the convolution:

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong u\,V(\sigma)\,|H(u\sigma)| \left\{ \left[D\,e^{-\pi i\sigma D}\,\mathrm{sinc}(\pi\sigma D)\right] \ast
\left[\mathcal{F}^{(-i\sigma\chi')}\left(\Pi(\chi', \bar{D})\,n^{(det)}(\chi')\right)\right] \right\} .
\end{aligned}
\]

Remembering that

\[
D\,e^{-\pi i\sigma D}\,\mathrm{sinc}(\pi\sigma D) = \mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,\Xi(\chi)\right)
\]

from Eq. (6.66a), we apply Eq. (2.39j) in Chapter 2 to get

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong u\,V(\sigma)\,|H(u\sigma)|\;\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\,\Pi(\chi, \bar{D})\,n^{(det)}(\chi)\right) .
\end{aligned}
\]

From Eq. (6.67b), we know that the extended limit D + d + uT exceeds D, which means that [see
Eq. (6.22b)]

\[
\Pi(\chi, D)\,\Pi(\chi, \bar{D}) = \Pi(\chi, D) .
\]

Therefore

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong u\,V(\sigma)\,|H(u\sigma)|\;\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\,n^{(det)}(\chi)\right) .
\end{aligned}
\tag{6.67d}
\]
This takes care of the third term on the right-hand side of Eq. (6.61). At no point during this
derivation did we make any assumptions about the behavior of n (det) ( χ ) ; it acts as a placeholder
and could be replaced by other functions—both random and nonrandom—without making any
part of the analysis untrue.
It is now time to simplify the fourth and last term in Eq. (6.61). We have just remarked that, in
the analysis of the third term in (6.61), n (det) ( χ ) acts as a placeholder and can be replaced by any
other reasonable choice. It turns out that we are not so much interested in modifying the final
result in Eq. (6.67d) as we are in modifying the approximation in (6.67c) that appears partway


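through the derivation of (6.67d). Replacing the n^(det)(χ) placeholder in (6.67c) by n^(det)(−χ)
gives

\[
\Pi(\chi, D)\left[n^{(det)}(-\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]
\cong \Pi(\chi, D)\left\{ \left[\Pi(\chi, \bar{D})\,n^{(det)}(-\chi)\right] \ast h\!\!\!/\left(\frac{\chi}{u}\right) \right\} .
\tag{6.68a}
\]

This can be written as, using (6.64b) to modify the left-hand side,

\[
\Pi(\chi, D)\left[n^{(det)}(-\chi) \ast h\!\!\!/\left(-\frac{\chi}{u}\right)\right]
\cong \Pi(\chi, D)\left\{ \left[\Pi(\chi, \bar{D})\,n^{(det)}(-\chi)\right] \ast h\!\!\!/\left(\frac{\chi}{u}\right) \right\} .
\tag{6.68b}
\]

Multiplying through by Ξ(−χ) and taking the forward Fourier transform of both sides leads to

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\left[n^{(det)}(-\chi) \ast h\!\!\!/\left(-\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong \mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\left\{ \left[\Pi(\chi, \bar{D})\,n^{(det)}(-\chi)\right] \ast h\!\!\!/\left(\frac{\chi}{u}\right) \right\}\right) .
\end{aligned}
\tag{6.68c}
\]

The left-hand side of this formula is (after dividing by u) exactly the same as the fourth term
in (6.61) that we need to evaluate. We apply Eqs. (2.39a) and (2.39j) in Chapter 2 to the
right-hand side to get

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\left[n^{(det)}(-\chi) \ast h\!\!\!/\left(-\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong \mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\right) \ast
\left\{ \mathcal{F}^{(-i\sigma\chi')}\left(h\!\!\!/\left(\frac{\chi'}{u}\right)\right) \cdot
\mathcal{F}^{(-i\sigma\chi'')}\left(\Pi(\chi'', \bar{D})\,n^{(det)}(-\chi'')\right) \right\} .
\end{aligned}
\]

Substituting from Eqs. (6.66b) and (6.64a) gives

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\left[n^{(det)}(-\chi) \ast h\!\!\!/\left(-\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong \left[D\,e^{\pi i\sigma D}\,\mathrm{sinc}(\pi\sigma D)\right] \ast
\left\{ u\,V(\sigma)\,|H(u\sigma)| \cdot \mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, \bar{D})\,n^{(det)}(-\chi)\right) \right\} .
\end{aligned}
\tag{6.68d}
\]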
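Again, just like in the analysis of the third term of (6.61), Eq. (5C.1) in Appendix 5C of Chapter
5 is used to move V and |H| outside the convolution because they vary slowly compared to
[e^(πiσD) sinc(πσD)]. Equation (6.68d) then becomes

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\left[n^{(det)}(-\chi) \ast h\!\!\!/\left(-\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong u\,V(\sigma)\,|H(u\sigma)| \cdot \left\{ \left[D\,e^{\pi i\sigma D}\,\mathrm{sinc}(\pi\sigma D)\right] \ast
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, \bar{D})\,n^{(det)}(-\chi)\right) \right\} .
\end{aligned}
\]

Glancing back at (6.66b) to get

\[
e^{\pi i\sigma D}\,[D\,\mathrm{sinc}(\pi\sigma D)] = \mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,\Xi(-\chi)\right) ,
\]

we use (2.39j) in Chapter 2 to write

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\left[n^{(det)}(-\chi) \ast h\!\!\!/\left(-\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong u\,V(\sigma)\,|H(u\sigma)| \cdot \mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\,\Pi(\chi, \bar{D})\,n^{(det)}(-\chi)\right) .
\end{aligned}
\]

Again we note [see Eq. (6.67b)] that the extended limit D + d + uT exceeds D, making the product
of the two rectangle functions equal to Π(χ, D). Hence,

\[
\begin{aligned}
&\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\left[n^{(det)}(-\chi) \ast h\!\!\!/\left(-\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong u\,V(\sigma)\,|H(u\sigma)| \cdot \mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\,n^{(det)}(-\chi)\right) .
\end{aligned}
\tag{6.68e}
\]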

This takes care of the fourth term on the right-hand side of Eq. (6.61).
Before substituting our results back into Eq. (6.61), it makes sense to use the linearity of the
Fourier transform (see Sec. 2.6 in Chapter 2) to combine the equation's third and fourth terms.
Multiplying by u^(−1) and adding together (6.67d) and (6.68e) gives

\[
\begin{aligned}
&u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\left[n^{(det)}(\chi) \ast h\!\!\!/\left(\frac{\chi}{u}\right)\right]\right)
 + u^{-1}\,\mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\left[n^{(det)}(-\chi) \ast h\!\!\!/\left(-\frac{\chi}{u}\right)\right]\right) \\
&\quad \cong V(\sigma)\,|H(u\sigma)| \left\{ \mathcal{F}^{(-i\sigma\chi)}\left(\Xi(\chi)\,\Pi(\chi, D)\,n^{(det)}(\chi)\right)
 + \mathcal{F}^{(-i\sigma\chi)}\left(\Xi(-\chi)\,\Pi(\chi, D)\,n^{(det)}(-\chi)\right) \right\} \\
&\quad = V(\sigma)\,|H(u\sigma)|\;\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\left[\Xi(\chi)\,n^{(det)}(\chi) + \Xi(-\chi)\,n^{(det)}(-\chi)\right]\right) .
\end{aligned}
\tag{6.69a}
\]
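Now we can substitute into Eq. (6.61) the approximations shown in Eqs. (6.62b), (6.63b), and
(6.69a) to get

\[
\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\left(\mathrm{Even}[z_{CN}^{(tot)}(\chi) \ast \varpi(\chi)]\right)
&\cong \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{mnf}(|\sigma|) \\
&\quad + \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,[L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)] \\
&\quad + V(\sigma)\,|H(u\sigma)|\;\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\left[\Xi(\chi)\,n^{(det)}(\chi) + \Xi(-\chi)\,n^{(det)}(-\chi)\right]\right)
\end{aligned}
\]

or

\[
\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\left(\mathrm{Even}[z_{CN}^{(tot)}(\chi) \ast \varpi(\chi)]\right)
&\cong \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)
\left[\tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)\right] \\
&\quad + V(\sigma)\,|H(u\sigma)|\;\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\left[\Xi(\chi)\,n^{(det)}(\chi) + \Xi(-\chi)\,n^{(det)}(-\chi)\right]\right) .
\end{aligned}
\tag{6.69b}
\]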

The next section explains how to analyze the noise term in this formula.


6.20 Calibrated Spectra of Single-Sided Signals with Detector Noise


To analyze the detector noise contaminating a single-sided signal, we define a new random
function,

\[
n_E^{(det)}(\chi) = \Xi(\chi)\cdot n^{(det)}(\chi) + \Xi(-\chi)\cdot n^{(det)}(-\chi) .
\tag{6.70a}
\]

The Heaviside step function Ξ(χ) from Eq. (6.60c) ensures that n_E^(det) always has the same value
at −χ as it does at +χ: when χ = |χ| is positive, the first term specifies the value of n_E^(det)
because the second term is zero; and when χ = −|χ| is negative, the second term specifies the
value of n_E^(det) to be the same as it is for χ = |χ| because the first term is zero. This means random
function n_E^(det) is always even,

\[
n_E^{(det)}(-\chi) = n_E^{(det)}(\chi) ,
\tag{6.70b}
\]

and, because it represents noise contaminating a real signal, it must also be real:

\[
\mathrm{Im}\left(n_E^{(det)}(\chi)\right) = 0 .
\tag{6.70c}
\]

Following the same pattern as in the previous χ-based noise terms [see Eq. (6.29a)], we define the
D-limited forward Fourier transform of n_E^(det) to be

\[
n_{DE}^{(det)}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,n_E^{(det)}(\chi)\right)
= \int_{-\infty}^{\infty} \Pi(\chi, D)\,n_E^{(det)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi
\tag{6.70d}
\]

or

\[
n_{DE}^{(det)}(\sigma) = \int_{-D}^{D} n_E^{(det)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi .
\tag{6.70e}
\]

This can also be written as, substituting from (6.70a),

\[
n_{DE}^{(det)}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,[\Xi(\chi)\cdot n^{(det)}(\chi) + \Xi(-\chi)\cdot n^{(det)}(-\chi)]\right) .
\tag{6.70f}
\]

We have just seen that function n_E^(det) is real and even. Function Π(χ, D) is also real and even
[see Eq. (6.22b) above], so n_DE^(det)(σ) in (6.70d) and (6.70e) is the forward Fourier transform of a
real and even function. This makes n_DE^(det)(σ) another real and even function (see entry 1 of Table
2.1 in Chapter 2):

\[
n_{DE}^{(det)}(-\sigma) = n_{DE}^{(det)}(\sigma)
\tag{6.70g}
\]

and

\[
\mathrm{Im}\left(n_{DE}^{(det)}(\sigma)\right) = 0 .
\tag{6.70h}
\]
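
A discrete sketch of (6.70g) and (6.70h) follows. It assumes a simple Riemann-sum approximation of
the D-limited transform and hypothetical Gaussian sample noise; the function name
`limited_transform`, the sample spacing, and the test wavenumber are illustrative choices rather
than anything specified in the text.

```python
import cmath
import math
import random

def limited_transform(samples, step, sigma):
    """Riemann-sum estimate of the D-limited transform.

    samples[k] is the value at chi = (k - center) * step, with center the middle index.
    """
    center = (len(samples) - 1) // 2
    return step * sum(v * cmath.exp(-2j * math.pi * sigma * (k - center) * step)
                      for k, v in enumerate(samples))

random.seed(1)
step = 0.05
# Hypothetical noise recorded at chi = 0, step, 2*step, ... (single-sided).
half = [random.gauss(0.0, 1.0) for _ in range(64)]
evenized = half[:0:-1] + half           # same noise copied to negative chi

sigma = 0.37
spec_pos = limited_transform(evenized, step, sigma)
spec_neg = limited_transform(evenized, step, -sigma)
print(spec_pos, spec_neg)
```

Up to rounding, the imaginary part vanishes and the values at +σ and −σ agree, which is the
sampled counterpart of the real-and-even property stated above.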

The expectation value of n_E^(det)(χ) is, applying the expectation operator E to both sides of
(6.70a),

\[
E\left(n_E^{(det)}(\chi)\right) = \Xi(\chi)\,E\left(n^{(det)}(\chi)\right) + \Xi(-\chi)\,E\left(n^{(det)}(-\chi)\right) ,
\]

using the linearity of the expectation operator with respect to random quantities discussed in Sec.
3.10 of Chapter 3. Since

\[
E\left(n^{(det)}(\chi)\right) = 0
\]

for any value of χ [see Eq. (6.17b)], we can now see that

\[
E\left(n_E^{(det)}(\chi)\right) = 0 .
\tag{6.70i}
\]

Applying the expectation operator E to both sides of Eq. (6.70e) gives, using Eq. (3.17c) in
Chapter 3,

\[
E\left(n_{DE}^{(det)}(\sigma)\right) = \int_{-D}^{D} E\left(n_E^{(det)}(\chi)\right) e^{-2\pi i\sigma\chi}\,d\chi .
\]

Since we now know that E(n_E^(det)(χ)) is zero, this shows that

\[
E\left(n_{DE}^{(det)}(\sigma)\right) = 0 .
\tag{6.70j}
\]

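The detector-noise term in Eq. (6.69b) can be simplified by substituting from Eq. (6.70f):

\[
\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\left(\mathrm{Even}[z_{CN}^{(tot)}(\chi) \ast \varpi(\chi)]\right)
&\cong \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)
\left[\tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)\right] \\
&\quad + |H(u\sigma)|\left[V(\sigma)\,n_{DE}^{(det)}(\sigma)\right] .
\end{aligned}
\tag{6.71a}
\]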


In a single-sided measurement, we can think of

\[
\mathcal{F}^{(-i\sigma\chi)}\left(\mathrm{Even}[z_{CN}^{(tot)}(\chi) \ast \varpi(\chi)]\right)
\]

as the uncalibrated, noise-contaminated signal spectrum at point C in Fig. 6.2, because it plays
the same role that

\[
\mathcal{F}^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,z_{CN}^{(tot)}(\chi)\right)
\]

does in the double-sided signal spectrum specified in Eq. (6.30a) above. Comparing the formulas
for these two quantities in Eqs. (6.71a) and (6.30a), we see that there is an exact correspondence
if H(uσ) in (6.30a) is matched with |H(uσ)| in (6.71a) and if n_D^(det)(σ) in (6.30a) is matched
with [V(σ) n_DE^(det)(σ)] in (6.71a):

\[
H(u\sigma) \Leftrightarrow |H(u\sigma)|
\tag{6.71b}
\]

and

\[
n_D^{(det)}(\sigma) \Leftrightarrow [V(\sigma)\,n_{DE}^{(det)}(\sigma)] .
\]

We also note that the expectation value of the spectral noise [V(σ) n_DE^(det)(σ)] in (6.71a) is zero,
just like the expectation value of the spectral noise n_D^(det)(σ) in (6.30a) is zero. To see why this is
so, we just apply the expectation operator E to [V(σ) n_DE^(det)(σ)] and consult Eq. (6.70j) to get

\[
E\left(V(\sigma)\,n_{DE}^{(det)}(\sigma)\right) = V(\sigma)\,E\left(n_{DE}^{(det)}(\sigma)\right) = 0 .
\tag{6.71c}
\]

Knowing that the spectral noise in (6.71a) has a zero expectation value, we can repeat the
mathematical analysis used in Sec. 6.11 to extract the L_mnf data from the uncalibrated spectrum
in (6.30a), only this time replacing H(uσ) by |H(uσ)| and n_D^(det)(σ) by [V(σ) n_DE^(det)(σ)] as
specified in (6.71b). The formulas for Z^(1)_eff,tot(σ) and Z^(2)_eff,tot(σ) in Eqs. (6.33a) and (6.33b) now
become

\[
Z_{eff,tot}^{(1)}(\sigma) \cong \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|) \cdot
\left[\tau_f(|\sigma|)\,L^{(1)}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)\right]
\tag{6.71d}
\]

and

\[
Z_{eff,tot}^{(2)}(\sigma) \cong \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|) \cdot
\left[\tau_f(|\sigma|)\,L^{(2)}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)\right] .
\tag{6.71e}
\]

The formula for Z^(meas)_eff,totN(σ) in Eq. (6.34b) changes to

\[
\begin{aligned}
Z_{eff,totN}^{(meas)}(\sigma) &= \frac{W A\,\Delta\Omega}{4}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|) \cdot
\left[\tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)\right] \\
&\quad + |H(u\sigma)|\left[V(\sigma)\,n_{DE}^{(det)}(\sigma)\right] .
\end{aligned}
\tag{6.71f}
\]

Substituting these expressions into the calibration formula in (6.35d) now gives

\[
\begin{aligned}
&\left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right] \cdot
\left\{ \frac{Z_{eff,totN}^{(meas)}(\sigma) - Z_{eff,tot}^{(1)}(\sigma)}{Z_{eff,tot}^{(2)}(\sigma) - Z_{eff,tot}^{(1)}(\sigma)} \right\} + L^{(1)}(|\sigma|) \\
&\quad = L_{mnf}(|\sigma|) + \frac{4\,V(\sigma)\,n_{DE}^{(det)}(\sigma)}{(W A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} .
\end{aligned}
\tag{6.71g}
\]

The most important difference between the single-sided formula in (6.71g) and the double-sided
formula in (6.35d) is that, according to Eq. (6.70h), n_DE^(det)(σ) is strictly real whereas, as was
pointed out following Eq. (6.35d), n_D^(det)(σ) has both real and imaginary components. According
to the discussion of the double-sided case following Eq. (6.36) above, the imaginary component
of n_D^(det)(σ) is called the avoidable spectral noise because it can be eliminated by taking the real
part of the interferometer measurement; and the real component of n_D^(det)(σ) is called the
unavoidable spectral noise because it cannot be eliminated from the interferometer measurement.
The avoidable spectral noise comes from the odd part of the n^(det)(χ) signal noise contaminating
the interferometer data, and the unavoidable spectral noise comes from the even part of the
n^(det)(χ) signal noise contaminating the interferometer data. The n^(det)(χ) noise contaminating
the double-sided signal has both even and odd components because the interferometer data is
recorded for both positive and negative values of the OPD value χ. In the single-sided case, on
the other hand, interferometer data is recorded only for non-negative values of χ and then
artificially extended to negative χ values, automatically turning the noise contaminating the
signal into an even function of χ [see Eq. (6.70b)]. Consequently, the single-sided spectral noise
n_DE^(det)(σ) is always real and even [see Eqs. (6.70g) and (6.70h)], and there is no avoidable noise
that can be eliminated by taking the real part of the measured spectrum. Hence, when comparing
the right-hand side of (6.71g) to a spectral radiance measurement contaminated by random error,
such as

\[
L_{mnf}(|\sigma|) + \delta L(\sigma)
\]

in Eq. (6.1a) above, we see that for single-sided spectral measurements contaminated by detector
noise all of n_DE^(det) contributes to δL, giving

\[
\delta L(\sigma) = \frac{4\,V(\sigma)\,n_{DE}^{(det)}(\sigma)}{(W A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}
\tag{6.72a}
\]

for positive σ values. We know that n_DE^(det) on the right-hand side of (6.72a) is a real and even
function of σ. Functions η, M, and V are real and, according to Eq. (4.139g) in Chapter 4 and
Eqs. (5.10f) and (5.88e) in Chapter 5, even functions of σ. Functions R, τ_a, and τ_f are also real
and have |σ| for their argument, forcing them to be even functions of σ. It follows that Eq.
(6.72a) presents a well-founded single-sided formula for δL(σ) that is, as it should be, a real and
even random function of σ just like equation (6.38c) above. Following the convention adopted
there, we write δL in (6.72a) as a function of |σ| to get

\[
\delta L(|\sigma|) = \frac{4\,V(|\sigma|)\,n_{DE}^{(det)}(|\sigma|)}{(W A\,\Delta\Omega)\,M(R|\sigma|\theta_{ma})\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} .
\tag{6.72b}
\]

6.21 Detector-Noise NEdN in a Single-Sided Signal


Taking the expectation value of both sides of Eq. (6.72b) gives, after consulting Eq. (6.70j),

\[
E\left(\delta L(|\sigma|)\right) = \frac{4\,V(|\sigma|)\,E\left(n_{DE}^{(det)}(|\sigma|)\right)}{(W A\,\Delta\Omega)\,M(R|\sigma|\theta_{ma})\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} = 0 .
\tag{6.73a}
\]

To find the NEdN for the detector noise in the single-sided signal, we apply Eqs. (6.72b) and
(6.73a) to the formula in (6.3g) above to get, remembering that W² = 1 according to the
discussion following Eq. (4.83) in Chapter 4,

\[
\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,V(|\sigma|)\,\sqrt{E\left(\left|n_{DE}^{(det)}(|\sigma|)\right|^2\right)}}{(A\,\Delta\Omega)\,M(R|\sigma|\theta_{ma})\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}
\]

or, removing the absolute value signs from the arguments of n_DE^(det), η, M, and V,

\[
\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,V(\sigma)\,\sqrt{E\left(\left|n_{DE}^{(det)}(\sigma)\right|^2\right)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} .
\tag{6.73b}
\]

The absolute value signs are removed because these functions are even [see Eq. (6.70g), Eq.
(4.139g) in Chapter 4, and Eqs. (5.10f) and (5.88e) in Chapter 5]. The subscript 1 and superscript
(det) show that this is the formula for the NEdN of a single-sided signal contaminated by detector
noise.
The quickest way to connect NEdN_1^(det) to the formula for the double-sided signal is to
analyze the detector noise as a time-based rather than a χ-based random function. Returning to
the definition of n_E^(det)(χ) in Eq. (6.70a) above, we use χ = ut from Eq. (6.4) to convert both
sides of (6.70a) to time-based, rather than χ-based, functions,

n E(det) (ut ) = Ξ(ut ) ⋅ n (det) (ut ) + Ξ(−ut ) ⋅ n (det) (−ut ) . (6.74a)

Equation (6.60c) shows that


Ξ(t ) = Ξ(ut ) (6.74b)

for the Heaviside step function, so (6.74a) can be written as

N E(det) (t ) = Ξ(t ) ⋅ N (det) (t ) + Ξ(−t ) ⋅ N (det) (−t ) , (6.74c)

where Eq. (6.40b) is used to replace n^(det) by N^(det) on the right-hand side of the formula, and on
the left-hand side we define
N E(det) (t ) = n E(det) (ut ) (6.74d)
so that
n E(det) ( χ ) = N E(det) ( χ / u ) . (6.74e)

Equation (6.74c) is exactly the same as Eq. (3.73b) in Sec. 3.27 of Chapter 3 when N^(det)(t) is
matched to ñ(t) and N_E^(det)(t) is matched to ñ_E(t),

\[
N^{(det)}(t) \Leftrightarrow \tilde{n}(t)
\tag{6.75a}
\]

and

\[
N_E^{(det)}(t) \Leftrightarrow \tilde{n}_E(t) .
\]

Remember that in this section all terms with the superscript "(det)" refer to the detector noise
being analyzed in this chapter and the terms without the superscript "(det)" come from Chapter 3.
Section 3.27 of Chapter 3 defines the T-limited forward Fourier transform of ñ_E(t) to be,
according to Eq. (3.72b),

\[
N_{TE}(f) = \int_{-T}^{T} \tilde{n}_E(t)\,e^{-2\pi i f t}\,dt .
\tag{6.75b}
\]

Following this lead, we define the T-limited forward Fourier transform of N_E^(det)(t) to be

\[
N_{TE}^{(det)}(f) = \int_{-T}^{T} N_E^{(det)}(t)\,e^{-2\pi i f t}\,dt ,
\tag{6.75c}
\]

where, just like in Eq. (6.40c) above, T = D/u. Since N_E^(det)(t) matches up to ñ_E(t) in (6.75a), it
follows that Eqs. (6.75b) and (6.75c) are now the same equation with N_TE^(det)(f) matching up to
N_TE(f),

\[
N_{TE}^{(det)}(f) \Leftrightarrow N_{TE}(f) .
\tag{6.75d}
\]

The analysis presented in Sec. 3.27 [see Eq. (3.76c) in Chapter 3] shows that

\[
E\left(\left|N_{TE}(f)\right|^2\right) \cong 2T\,S_{\tilde{n}\tilde{n}}(f) ,
\tag{6.75e}
\]

where S_ññ(f) is the double-sided power spectrum of random function ñ(t) in (6.75a). We know
that N^(det)(t), which corresponds to ñ(t), has, according to Eq. (6.41b), its own power spectrum
S_NN^(det)(f). Since N^(det)(t) corresponds to ñ(t), the power spectrum S_ññ(f) in (6.75e) corresponds
to the power spectrum S_NN^(det)(f),

\[
S_{NN}^{(det)}(f) \Leftrightarrow S_{\tilde{n}\tilde{n}}(f) .
\tag{6.75f}
\]

Hence the formula corresponding to Eq. (6.75e), which has been directly copied from (3.76c) in
Chapter 3, must be, according to (6.75d) and (6.75f),

\[
E\left(\left|N_{TE}^{(det)}(f)\right|^2\right) \cong 2T\,S_{NN}^{(det)}(f) .
\tag{6.75g}
\]

To find the counterpart of this result for χ-based random functions, we follow Eq. (6.4) and
change the dummy variable of integration in (6.75c) to χ = ut to get

\[
N_{TE}^{(det)}(f) = u^{-1} \int_{-uT}^{uT} N_E^{(det)}(\chi/u)\,e^{-2\pi i (f/u)\chi}\,d\chi .
\]

According to Eqs. (6.40c), (6.40d), and (6.74e), this can be transformed into

\[
N_{TE}^{(det)}(u\sigma) = u^{-1} \int_{-D}^{D} n_E^{(det)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi ,
\]

which can also be written as [see Eq. (6.70e)]

\[
N_{TE}^{(det)}(u\sigma) = u^{-1}\,n_{DE}^{(det)}(\sigma) .
\tag{6.76a}
\]
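
The change of variables behind (6.76a) can be checked numerically. The sketch below uses a smooth,
even, deterministic stand-in for one realization of the evenized noise; that function, the
midpoint-rule integrators, and the values of u, D, and σ are all hypothetical choices made only for
illustration.

```python
import cmath
import math

def n_E(chi):
    """Hypothetical smooth, even stand-in for one realization of the evenized noise."""
    return math.exp(-chi * chi) * math.cos(3.0 * chi)

def chi_transform(sigma, D, n=4000):
    """Midpoint-rule estimate of the D-limited transform over chi, as in (6.70e)."""
    step = 2.0 * D / n
    total = 0j
    for k in range(n):
        chi = -D + (k + 0.5) * step
        total += n_E(chi) * cmath.exp(-2j * math.pi * sigma * chi)
    return total * step

def time_transform(f, T, u, n=4000):
    """Midpoint-rule estimate of the T-limited transform over t, using N_E(t) = n_E(u t)."""
    step = 2.0 * T / n
    total = 0j
    for k in range(n):
        t = -T + (k + 0.5) * step
        total += n_E(u * t) * cmath.exp(-2j * math.pi * f * t)
    return total * step

u, D, sigma = 0.5, 2.0, 0.8          # hypothetical OPD rate, maximum OPD, wavenumber
lhs = time_transform(u * sigma, D / u, u)   # time-based transform evaluated at f = u*sigma
rhs = chi_transform(sigma, D) / u           # chi-based transform divided by u
print(abs(lhs - rhs))
```

The two sides agree to rounding error because the midpoints in t map exactly onto the midpoints in
χ under χ = ut, with f = uσ and T = D/u.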

We again consult Eqs. (6.40c) and (6.40d), this time using them to write (6.75g) as

\[
E\left(\left|N_{TE}^{(det)}(u\sigma)\right|^2\right) \cong 2\,u^{-1} D\,S_{NN}^{(det)}(u\sigma) .
\]

According to Eqs. (6.41f) and (6.76a), this can also be written as

\[
E\left(\left|n_{DE}^{(det)}(\sigma)\right|^2\right) \cong 2D\,p_{nn}^{(det)}(\sigma)
\tag{6.76b}
\]

or, since (6.70h) shows that n_DE^(det)(σ) is strictly real,

\[
E\left(\left[n_{DE}^{(det)}(\sigma)\right]^2\right) \cong 2D\,p_{nn}^{(det)}(\sigma) .
\tag{6.76c}
\]

Substituting this into (6.73b) gives

\[
\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,V(\sigma)\,\sqrt{2D\,p_{nn}^{(det)}(\sigma)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} .
\tag{6.76d}
\]


We are usually interested in the NEdN only for σ values corresponding to the wavenumber
range over which L(σ) is to be measured; that is, formula (6.76d) is almost always used for
wavenumbers σ such that

\[
\sigma_{min} \le \sigma \le \sigma_{max}
\]

with σ_min and σ_max the same as in Eq. (5.78) in Chapter 5. According to Eq. (5.88d) in Chapter 5,
V(σ) is always one for these σ values, which means it can be eliminated from (6.76d),

\[
\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,\sqrt{2D\,p_{nn}^{(det)}(\sigma)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} ,
\tag{6.76e}
\]

without (usually) making the formula any less useful.


Comparing the formula in (6.76e) to the corresponding formula in (6.52g) for NEdN_2^(det), we
see that the single-sided NEdN is a factor of √2 larger than the double-sided NEdN. This factor
of √2 can be blamed entirely on the way the single-sided measurement is constructed.
Double-sided signals are measured for both
positive and negative χ values, which means, as discussed following Eqs. (6.19f) and (6.35d)
above, that part of the signal noise is in principle avoidable: we can either average together the
noise-contaminated signal values at χ and −χ to reduce the detector noise at once or remove the
avoidable noise later on by taking the real part of the measured spectrum. Single-sided signals, on
the other hand, are in effect (once they have been turned into even functions) measured only for
positive χ and then artificially extended to negative χ values. There is thus no way to lessen the
single-sided signal noise because there is no way to compare independent signal measurements at
χ and −χ, so it is no surprise to find that single-sided NEdN's are larger than the corresponding
double-sided NEdN's. Now that NEdN_1^(det) is known to be larger than NEdN_2^(det) by √2, Eqs.
(6.53a), (6.53c), and (6.53d) can be used to put the formula for the single-sided NEdN into
several different forms:

\[
\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,\sqrt{2uD\,S_{NN}^{(det)}(u\sigma)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} ,
\tag{6.77a}
\]

\[
\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,\sqrt{uD\,S_1^{(det)}(u\sigma)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} ,
\tag{6.77b}
\]

and

\[
\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,\sqrt{uD\,A_d}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,D^{*}(|\sigma|)} .
\tag{6.77c}
\]
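
The origin of the √2 factor can be illustrated with an idealized calculation. The sketch below
assumes uncorrelated sample noise of equal variance at every OPD sample, which is a simplification
(the detector noise in this chapter is filtered by the detector circuit and is not white); the
function name and all numerical values are hypothetical. It compares the variance of the real part
of the spectrum when the samples at −χ are independent (double-sided) with the variance when they
are copies of the samples at +χ (single-sided).

```python
import math

def real_part_variances(n_half, step, sigma, noise_var=1.0):
    """Variance of the real part of the spectrum for uncorrelated sample noise.

    Samples sit at chi_k = k * step for k = -n_half .. n_half.
    Returns (double_sided, single_sided) variances.
    """
    cos_sq = sum(math.cos(2 * math.pi * sigma * k * step) ** 2
                 for k in range(1, n_half + 1))
    # Double-sided: 2*n_half + 1 independent noise values, each weighted by a cosine.
    double_sided = noise_var * step ** 2 * (1.0 + 2.0 * cos_sq)
    # Single-sided: the value at +chi_k is reused at -chi_k, so its cosine weight doubles.
    single_sided = noise_var * step ** 2 * (1.0 + 4.0 * cos_sq)
    return double_sided, single_sided

var_2, var_1 = real_part_variances(n_half=2000, step=0.001, sigma=100.3)
ratio = math.sqrt(var_1 / var_2)
print(ratio)
```

For a large number of samples the printed ratio of standard deviations approaches √2, matching
the ratio of single-sided to double-sided NEdN found above.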


6.22 Detector Circuit as an Anti-Aliasing Filter


Detector noise is usually the dominant type of noise in Michelson interferometers. Up to now the
detector circuit between points B and C in Fig. 6.2 has been treated as just another part of the
signal chain; now that we know how to describe detector noise, we can discuss the detector
circuit’s role as an anti-aliasing filter.
To get the uncalibrated, noise-contaminated signal spectrum at point C in Fig. 6.2, we consult
Eqs. (6.31a) and (6.31c) to get

$$\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde z_{CN}^{(\mathrm{tot})}(\chi)\right) = Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\,\tilde n_D^{(\mathrm{det})}(\sigma), \tag{6.78a}$$

where

$$Z_{\mathrm{eff,tot}}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\cdot\left[\tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L_{mnf}^{(\mathrm{fore})}(|\sigma|) - L_{mnf}^{(\mathrm{back})}(|\sigma|)\right] \tag{6.78b}$$

and $\tilde n_D^{(\mathrm{det})}(\sigma)$ is defined in Eq. (6.29a) to be the (D-limited) complex spectrum of the detector
noise. At point C, the analog-to-digital converter samples the signal at equally spaced intervals in
χ so that a discrete Fourier transform (DFT) can be applied. To analyze the effect of this
procedure, we start with the obvious point that

$$\left[\Pi(\chi,D)\,\tilde z_{CN}^{(\mathrm{tot})}(\chi)\right] \quad\text{and}\quad \left[Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\,\tilde n_D^{(\mathrm{det})}(\sigma)\right]$$

in (6.78a) are a Fourier transform pair and then analyze what must happen to them when
$[\Pi(\chi,D)\,\tilde z_{CN}^{(\mathrm{tot})}(\chi)]$ is sampled and put through a DFT.
Section 2.21 of Chapter 2 explains the effect of sampling and the DFT on any two functions,
such as U(f) and u(t) in Eq. (2.91a) of Chapter 2, which form a Fourier-transform pair. To match
the interferometer signal at point C to functions u(t) and U(f), we write Eq. (6.78a) as

$$\left[Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\,\tilde n_D^{(\mathrm{det})}(\sigma)\right] = \int_{-\infty}^{\infty}\left[\Pi(\chi,D)\,\tilde z_{CN}^{(\mathrm{tot})}(\chi)\right] e^{-2\pi i\sigma\chi}\,d\chi . \tag{6.79}$$

Comparing this to Eq. (2.91a) in Chapter 2, we note that wavenumber σ corresponds to f, the
OPD value χ corresponds to t, function U(f) corresponds to

$$\left[Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\,\tilde n_D^{(\mathrm{det})}(\sigma)\right],$$

and function u(t) corresponds to

6 · NEdN and Detector Noise

$$\left[\Pi(\chi,D)\,\tilde z_{CN}^{(\mathrm{tot})}(\chi)\right].$$

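These correspondences can be written symbolically as

$$t \Leftrightarrow \chi, \tag{6.80a}$$

$$f \Leftrightarrow \sigma, \tag{6.80b}$$

$$u(t) \Leftrightarrow \left[\Pi(\chi,D)\,\tilde z_{CN}^{(\mathrm{tot})}(\chi)\right], \tag{6.80c}$$

and

$$U(f) \Leftrightarrow \left[Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\,\tilde n_D^{(\mathrm{det})}(\sigma)\right]. \tag{6.80d}$$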

In double-sided interferometer measurements, the analog-to-digital converter samples the
signal at N equally spaced OPD values between χ = −D and χ = D, with Δχ being the OPD
interval between neighboring samples. In single-sided interferometer measurements, even though
the analog-to-digital converter samples only half (approximately) the length in χ, we
subsequently double the signal to get, again, N equally spaced data points between χ = −D and
χ = D, with Δχ again being the OPD interval between neighboring samples [see the discussion
following Eq. (6.59d) in Sec. 6.18 above]. Hence, for both double-sided and single-sided
interferogram systems, we have
N ∆χ = 2 D . (6.81a)

This corresponds to Eq. (2.92b) in Chapter 2, which states that the N equally spaced samples used
to represent u(t) are separated by intervals Δt such that

$$N\,\Delta t = T .$$

Therefore, interval Δt corresponds to Δχ and T corresponds to 2D, which can be written
symbolically as

∆t ⇔ ∆χ (6.81b)
and
T ⇔ 2D . (6.81c)

Furthermore, since Δχ corresponds to Δt, parameter

$$F = \frac{1}{\Delta t}$$

specified in Eq. (2.93c) of Chapter 2 must correspond to 1/Δχ,

$$F \Leftrightarrow \frac{1}{\Delta\chi}. \tag{6.81d}$$

The Nyquist wavenumber, defined in Eq. (5.112) of Chapter 5 to be

$$\sigma_{\mathrm{Nyq}} = \frac{1}{2\,\Delta\chi}, \tag{6.81e}$$

can be used to write the correspondence in (6.81d) as

$$F \Leftrightarrow 2\sigma_{\mathrm{Nyq}}. \tag{6.81f}$$
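
As a rough numerical illustration of Eqs. (6.81a) and (6.81e), the short Python sketch below computes the sample spacing Δχ and the Nyquist wavenumber from the number of samples N and the maximum OPD D. The function name `sampling_parameters` and the example values (N = 4096, D = 1 cm) are illustrative assumptions and do not come from the text.

```python
def sampling_parameters(n_samples: int, max_opd: float) -> tuple[float, float]:
    """Return (delta_chi, sigma_nyq) for N samples spanning chi = -D .. D.

    Hypothetical helper: n_samples plays the role of N and max_opd the role
    of D in Eq. (6.81a); the units of D (cm here) fix the units of the result.
    """
    if n_samples <= 0 or max_opd <= 0:
        raise ValueError("n_samples and max_opd must be positive")
    delta_chi = 2.0 * max_opd / n_samples   # Eq. (6.81a): N * delta_chi = 2D
    sigma_nyq = 1.0 / (2.0 * delta_chi)     # Eq. (6.81e)
    return delta_chi, sigma_nyq


# Assumed example values: N = 4096 samples, D = 1 cm.
delta_chi, sigma_nyq = sampling_parameters(4096, 1.0)
print(f"delta_chi = {delta_chi} cm, sigma_nyq = {sigma_nyq} cm^-1")
```

Combining the two formulas gives σ_Nyq = N/(4D), so for a fixed interferogram length the Nyquist wavenumber grows in proportion to the number of samples.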

Formulas (2.95c) and (2.95d) show what happens to the sampled interferometer signal when
the DFT is applied: the original Fourier-transform pair u(t) and U(f), which describes the signal
and its spectrum, changes into $u^{[\infty]}(t,T)$ and $U^{[\infty]}(f,F)$. The transformation of spectrum U(f)
into $U^{[\infty]}(f,F)$ is discussed at some length in Secs. 2.22 and 2.23 of Chapter 2, which show why
it is referred to as aliasing the signal spectrum. Equation (2.93b) defines $U^{[\infty]}(f,F)$ to be

$$U^{[\infty]}(f,F) = \sum_{r=-\infty}^{\infty} U(f - rF) .$$

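Therefore, applying correspondences (6.80b), (6.80d), and (6.81f) to (2.93b), we see that the
original noise-contaminated spectrum

$$\left[Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\,\tilde n_D^{(\mathrm{det})}(\sigma)\right]$$

in (6.78a) and (6.79) must transform, after sampling and the DFT, into

$$\text{noise-contaminated and aliased spectrum} = \sum_{r=-\infty}^{\infty}\left[Z_{\mathrm{eff,tot}}(\sigma - 2r\sigma_{\mathrm{Nyq}}) + H\!\left(u(\sigma - 2r\sigma_{\mathrm{Nyq}})\right)\tilde n_D^{(\mathrm{det})}(\sigma - 2r\sigma_{\mathrm{Nyq}})\right]. \tag{6.82}$$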

FIGURE 6.8(a). This is a schematic plot of the magnitude of the signal spectrum Zeff,tot
against wavenumber σ. Spectrum Zeff,tot is zero unless $\sigma_{\min} \le |\sigma| \le \sigma_{\max}$.
[Plot not reproduced: vertical axis $|Z_{\mathrm{eff,tot}}|$; horizontal axis σ, marked at $-\sigma_{\max}$, $-\sigma_{\min}$, $\sigma_{\min}$, and $\sigma_{\max}$.]

When $\tilde n_D^{(\mathrm{det})} = 0$ in (6.82), that is, in the absence of noise, Eq. (6.82) becomes the same as Eq.
(5.113c) in Chapter 5 if all the background radiances are negligible compared to the radiance
spectrum entering the interferometer. The practical consequences of Eq. (5.113c) are discussed at
length in Secs. 5.24 and 5.25 of Chapter 5. Following the same sort of reasoning used there, we
note that Zeff,tot is expected to be negligible or zero unless

$$\sigma_{\min} \le |\sigma| \le \sigma_{\max},$$

as shown in Fig. 6.8(a).

We choose Δχ small enough that

$$\frac{1}{2\,\Delta\chi} = \sigma_{\mathrm{Nyq}} > \sigma_{\max}$$

so that the spectrum is oversampled and its original shape preserved, as shown in Fig. 6.8(b). If
there is a large gap between σ = 0 and σ = σ_min, we can instead choose Δχ large enough to
undersample the spectrum while preserving its original shape, as shown in Fig. 6.8(c). When

$\tilde n_D^{(\mathrm{det})}$ is not zero, however, both oversampling and undersampling may introduce extra noise into
the measured spectrum if

$$H(u\sigma)\,\tilde n_D^{(\mathrm{det})}(\sigma)$$

in Eq. (6.82) is not negligible or zero when $|\sigma| < \sigma_{\min}$ and $|\sigma| > \sigma_{\max}$. This is shown pictorially by
the dotted lines of the aliased noise spikes in Fig. 6.8(b) and the overlap of the aliased spectrum
over the solid lines representing the low-frequency noise in Fig. 6.8(c). We cannot easily control
the spectrum $\tilde n_D^{(\mathrm{det})}(\sigma)$ of the detector noise, which tends to be significantly different from zero at
all frequencies, both high and low; but we can design the detector circuit so that H(uσ) is very
small for those wavenumbers σ that can contaminate the spectral measurement due to aliasing.
Detector circuits of this sort are often referred to as anti-aliasing filters or as containing an
anti-aliasing filter. Although it may not be mandatory to design the anti-aliasing transfer function H so
that H(uσ) is negligible or zero unless

$$\sigma_{\min} \le |\sigma| \le \sigma_{\max},$$

we note that if H obeys this rule, then $H(u\sigma)\,\tilde n_D^{(\mathrm{det})}(\sigma)$ is small whenever Zeff,tot is small, and
aliasing can never introduce extra noise into the measured spectrum. Figure 6.8(d) graphs this
ideal band-pass transfer function, suitable for all types of oversampled or undersampled spectral
measurements.
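
The folding described by Eq. (6.82) can be sketched in a few lines of Python for the oversampled case. Everything here is an illustrative assumption rather than part of the text: the function names, the example band (400 to 900 cm⁻¹), and the Nyquist wavenumber (1000 cm⁻¹). The sketch shifts a noise wavenumber by multiples of 2σ_Nyq into the principal interval and checks whether it lands inside the measured band, using the fact that the band appears at both positive and negative wavenumbers as in Fig. 6.8(a).

```python
def aliased_wavenumber(sigma: float, sigma_nyq: float) -> float:
    """Shift sigma by a multiple of 2*sigma_nyq into [-sigma_nyq, sigma_nyq)."""
    period = 2.0 * sigma_nyq
    return (sigma + sigma_nyq) % period - sigma_nyq


def contaminates_band(noise_sigma: float, sigma_nyq: float,
                      sigma_min: float, sigma_max: float) -> bool:
    """True if noise at noise_sigma aliases into sigma_min <= |sigma| <= sigma_max.

    Assumes the oversampled case, sigma_nyq > sigma_max, so the measured band
    lies inside the principal interval.
    """
    folded = abs(aliased_wavenumber(noise_sigma, sigma_nyq))
    return sigma_min <= folded <= sigma_max


# Assumed example: band 400..900 cm^-1, sigma_nyq = 1000 cm^-1.
for noise in (100.0, 1300.0, 1800.0):
    hit = contaminates_band(noise, 1000.0, 400.0, 900.0)
    print(noise, aliased_wavenumber(noise, 1000.0), hit)
```

In this example the low-frequency noise at 100 cm⁻¹ stays outside the band, while high-frequency noise at 1300 cm⁻¹ folds to −700 cm⁻¹ and contaminates it. This is the situation an anti-aliasing transfer function H is meant to suppress.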

__________

Detectors are the major source of random error in almost all Michelson interferometers. The
NEdN of an interferometer measurement is defined at the beginning of this chapter to be the
standard deviation of the random measurement error, which suggests that some effort might be
required to observe detector noise. It turns out, however, that the distinctively “fuzzy” appearance
of detector noise [see Fig. 6.6(b)] usually means that a single spectral measurement is enough to
show its presence and importance. We have traced detector noise through the block diagram of a
standard Michelson interferometer (shown in Fig. 6.2), taking care to include the effect of the
calibration process on the spectral signal. In double-sided interferogram systems, some of the
signal noise can be eliminated rather easily by taking the real part of the noise-contaminated
measurement after the calibration algorithm has been applied. This lets us divide the signal noise
of double-sided systems into avoidable and unavoidable components. Signal noise is somewhat
more prominent in systems using single-sided interferograms, being larger by a factor of the square
root of 2, because there is no way to eliminate an avoidable component of the signal noise. This
is the inevitable price paid for the gain in spectral resolution discussed in Sec. 5.18 of Chapter 5.

FIGURE 6.8(b). This is a schematic plot of the magnitude of the noise-contaminated spectral signal Zeff,tot
against wavenumber σ when the data has been oversampled. The solid lines represent
the noise-free Zeff,tot and the dashed lines represent its aliases. The solid bars represent
the high-frequency and low-frequency noise at their correct positions on the
wavenumber axis, and the dotted bars represent the high-frequency and low-frequency
noise at their aliased positions on the wavenumber axis. Only aliased high-frequency
noise ends up in the measured spectrum.
[Plot not reproduced. Annotation on the plot: "Use this region of oversampled data to measure the spectrum." The wavenumber axis is marked at ±σ_min, ±σ_max, ±σ_Nyq, and at alias positions offset from these by 2σ_Nyq; the bars are labeled low-frequency noise, high-frequency noise, and aliased high-frequency noise.]

FIGURE 6.8(c). This is a schematic plot of the magnitude of the noise-contaminated spectral signal Zeff,tot
against wavenumber σ when the data has been undersampled. The solid lines represent
the noise-free Zeff,tot and the dashed lines represent its aliases. The solid bars represent
the high-frequency and low-frequency noise at their correct positions on the
wavenumber axis, and the dotted bars represent the high-frequency and low-frequency
noise at their aliased positions on the wavenumber axis. Both low-frequency noise and
aliased high-frequency noise end up in the measured spectrum.
[Plot not reproduced. Annotation on the plot: "Use this region of undersampled data to measure the spectrum." The wavenumber axis is marked at −3σ_Nyq and 3σ_Nyq; the bars are labeled low-frequency noise, aliased low-frequency noise, high-frequency noise, and aliased high-frequency noise.]

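FIGURE 6.8(d). [Plot not reproduced: the ideal band-pass transfer function H(uσ), drawn with a value of 1.0 for $\sigma_{\min} \le |\sigma| \le \sigma_{\max}$; the wavenumber axis is marked at $-\sigma_{\max}$, $-\sigma_{\min}$, $\sigma_{\min}$, and $\sigma_{\max}$.]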

Appendix 6A
When a spectral radiance L(σ) is a slowly varying function of wavenumber, then the distortion
given by an interferometer’s field of view can be disregarded. To see why this is so, we use the
formula given in Eq. (6.5b) [and also in Eq. (5.83e) of Chapter 5] for L_FOV(σ), the spectral
radiance distorted by an interferometer’s finite field of view ΔΩ when ΔΩ is small but also large
enough that the cosine of the off-axis ray angle cannot be approximated as one:

$$L_{FOV}(\sigma) = \frac{1}{\Delta\sigma}\int_{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L(\sigma')\,d\sigma' . \tag{6A.1}$$

In this formula, σ > 0 and

$$\Delta\sigma = \frac{\Delta\Omega\cdot\sigma}{2\pi}. \tag{6A.2}$$

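When L(σ) is a slowly varying function of wavenumber, we can assume that it is quasi-constant
when σ changes by an amount Δσ, so the integral in (6A.1) can be approximated as

$$\int_{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L(\sigma')\,d\sigma' \cong L\!\left(\sigma - \frac{\Delta\sigma}{2}\right)\cdot\int_{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} d\sigma' \cong \Delta\sigma\cdot L(\sigma) .$$

Equation (6A.1) now simplifies to

$$L_{FOV}(\sigma) \cong L(\sigma), \tag{6A.3}$$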

showing that an interferometer with a small but finite field of view does not significantly distort
the measured spectral radiance when the radiance is a slowly varying function of wavenumber.
The effect of the interferometer’s finite interferogram length can also be shown to disappear
when L is a slowly varying function of wavenumber. Following the notation introduced in Sec.
5.15 of Chapter 5, we say that 2D is the finite length of the interferogram signal. According to
Eq. (5.108d) in Chapter 5,

$$L_{mnf}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}(|\sigma|), \tag{6A.4a}$$

is then the spectral radiance distorted by both the interferometer’s finite interferogram length and
its finite field of view. Using (6A.3), this reduces to


L mnf (σ ) = [2 Dsinc(2πσ D)] ∗ L ( σ ) . (6A.4b)

The sinc function has a tall central lobe centered on σ = 0 and then oscillates to zero as we move
away from the origin (see Fig. 5.23 in Chapter 5). Since L is a slowly varying function of
wavenumber, the sinc can be thought of as an extremely narrow function compared to L.
Appendix 5C of Chapter 5 discusses what happens when a narrow function such as
[2 Dsinc(2πσ D)] in (6A.4b) is convolved with a broad, slowly varying function such as L. To
make use of the work done in Appendix 5C, we consult Eq. (5C.4b) to get

h( z ) ∗ [G ( z ) ⋅ g ( z )] ≅ G ( z ) ⋅ [h( z ) ∗ g ( z )] .

Here, h(z) represents the narrow function and G(z) represents the broad function. We apply the
definition of the convolution in Eq. (2.38a) of Chapter 2 to just the right-hand side of this formula
to get

$$h(z) * [G(z)\cdot g(z)] \cong G(z)\cdot\int_{-\infty}^{\infty} h(z')\,g(z - z')\,dz' .$$

For the special case g(z) = 1, this reduces to

$$h(z) * G(z) \cong G(z)\cdot\int_{-\infty}^{\infty} h(z')\,dz' . \tag{6A.4c}$$

Remembering that h(z) represents the narrow sinc function and G(z) represents the broad, slowly
varying spectral radiance L, we set up the correspondences

z ⇔σ

G ( z ) ⇔ L( σ )

h( z ) ⇔ [2 Dsinc(2πσ D)]

and then apply (6A.4c) to the right-hand side of (6A.4b) to get

$$L_{mnf}(\sigma) \cong L(|\sigma|)\cdot\int_{-\infty}^{\infty} 2D\,\mathrm{sinc}(2\pi\sigma' D)\,d\sigma' . \tag{6A.4d}$$

Glancing back at Eq. (2.108a) in Chapter 2, we mentally replace F by D and t by σ′, noting that
when f = 0 formula (2.108a) becomes

$$\int_{-\infty}^{\infty} [2D\,\mathrm{sinc}(2\pi D\sigma')]\,d\sigma' = \Pi(0, D) .$$

Equation (2.56c) in Chapter 2 shows that Π(0, D) is one for all D > 0, so

$$\int_{-\infty}^{\infty} [2D\,\mathrm{sinc}(2\pi D\sigma')]\,d\sigma' = 1 \tag{6A.5}$$

and Eq. (6A.4d) can be written as


L mnf (σ ) ≅ L ( σ ) . (6A.6)

Hence, when the spectral radiance L is a slowly varying function of wavenumber with respect to
[2D sinc(2πσD)] and with respect to a change in wavenumber

$$\Delta\sigma = \frac{\Delta\Omega\cdot\sigma}{2\pi},$$

then it undergoes only negligible distortion from the interferometer’s finite field of view ΔΩ and
finite interferogram length 2D.
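
The field-of-view part of this argument can be checked numerically. The Python sketch below is only an illustration under stated assumptions: it evaluates the average in Eq. (6A.1) with a midpoint rule, taking the integration window to run from σ(1 − ΔΩ/4π) − Δσ/2 to σ(1 − ΔΩ/4π) + Δσ/2 with Δσ from Eq. (6A.2). The function name, the linear test radiance, and the values σ = 1000 cm⁻¹ and ΔΩ = 0.01 sr are hypothetical choices.

```python
import math
from typing import Callable


def fov_averaged_radiance(radiance: Callable[[float], float], sigma: float,
                          solid_angle: float, steps: int = 1000) -> float:
    """Midpoint-rule estimate of L_FOV(sigma) as written in Eq. (6A.1)."""
    delta_sigma = solid_angle * sigma / (2.0 * math.pi)            # Eq. (6A.2)
    center = sigma * (1.0 - solid_angle / (4.0 * math.pi))         # window center
    lower = center - 0.5 * delta_sigma
    h = delta_sigma / steps
    total = sum(radiance(lower + (k + 0.5) * h) for k in range(steps))
    return total * h / delta_sigma


def slow_radiance(s: float) -> float:
    """Assumed slowly varying radiance in arbitrary units (illustration only)."""
    return 1.0 + 1.0e-3 * s


sigma = 1000.0        # cm^-1, assumed
solid_angle = 0.01    # sr, assumed
l_true = slow_radiance(sigma)
l_fov = fov_averaged_radiance(slow_radiance, sigma, solid_angle)
print(l_true, l_fov, abs(l_fov - l_true) / l_true)
```

For this slowly varying radiance the relative difference between L_FOV(σ) and L(σ) is a few parts in ten thousand, consistent with Eq. (6A.3). A radiance with structure narrower than Δσ would not pass the same check.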


Appendix 6B
The noise contaminating a time-based signal can be represented by a random function Ñ of time t,
which we write as Ñ(t) using the notation of Chapter 3 [see Sec. 3.2 of Chapter 3]. According to
Eq. (6.4), for time-based interferometer signals the time t is linearly proportional to the OPD
value χ,

$$t = \chi / u, \tag{6B.1}$$

where u is the OPD velocity. Hence, when Ñ(t) represents noise contaminating a time-based
interferometer signal, we can also decide to represent the same noise as a random function ñ(χ),
with

$$\tilde n(\chi) = \tilde N(\chi/u) \tag{6B.2a}$$

or

$$\tilde n(ut) = \tilde N(t). \tag{6B.2b}$$

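From Sec. 3.15 of Chapter 3 [see Eq. (3.30b)], we know that when Ñ is wide-sense stationary it
has an autocorrelation function $R_{\tilde N\tilde N}$ given by

$$R_{\tilde N\tilde N}(t_2 - t_1) = E\!\left(\tilde N(t_1)\cdot\tilde N(t_2)\right). \tag{6B.3a}$$

The power spectrum $S_{\tilde N\tilde N}$ of Ñ(t) is the forward Fourier transform of $R_{\tilde N\tilde N}$ given by [see Eq. (3.48c)]

$$S_{\tilde N\tilde N}(f) = \int_{-\infty}^{\infty} R_{\tilde N\tilde N}(\tau)\,e^{-2\pi i f\tau}\,d\tau . \tag{6B.3b}$$

The Fourier transform can, of course, be reversed to give

$$R_{\tilde N\tilde N}(\tau) = \int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f)\,e^{2\pi i f\tau}\,df . \tag{6B.3c}$$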

Equation (6B.2b) can be used to replace Ñ by ñ in Eq. (6B.3a) to get

$$R_{\tilde N\tilde N}(t_2 - t_1) = E\!\left(\tilde n(ut_1)\cdot\tilde n(ut_2)\right). \tag{6B.4a}$$

Using Eq. (6B.1), we define

$$\chi_1 = ut_1 \quad\text{and}\quad \chi_2 = ut_2 ,$$

which lets us write (6B.4a) as


$$R_{\tilde N\tilde N}\!\left(\frac{\chi_2 - \chi_1}{u}\right) = E\!\left(\tilde n(\chi_1)\cdot\tilde n(\chi_2)\right). \tag{6B.4b}$$

We can now, using the most basic definition of the autocorrelation function in Eq. (3.23b) of
Chapter 3, define the autocorrelation function of ñ(χ) to be

$$o_{\tilde n\tilde n}(\chi_1, \chi_2) = E\!\left(\tilde n(\chi_1)\cdot\tilde n(\chi_2)\right), \tag{6B.4c}$$

which means, according to Eq. (6B.4b), that

$$o_{\tilde n\tilde n}(\chi_1, \chi_2) = R_{\tilde N\tilde N}\!\left(\frac{\chi_2 - \chi_1}{u}\right). \tag{6B.4d}$$

This shows that whenever the autocorrelation $R_{\tilde N\tilde N}$ of Ñ(t) is a function only of $(t_2 - t_1)$, as shown
in Eq. (6B.3a), the autocorrelation of ñ(χ) must also be a function only of $(\chi_2 - \chi_1)$.
Consequently we can write Eq. (6B.4c) as

$$o_{\tilde n\tilde n}(\chi_2 - \chi_1) = E\!\left(\tilde n(\chi_1)\cdot\tilde n(\chi_2)\right). \tag{6B.4e}$$

Equation (6B.4d) can now be written as

$$o_{\tilde n\tilde n}(\chi_2 - \chi_1) = R_{\tilde N\tilde N}\!\left(\frac{\chi_2 - \chi_1}{u}\right) \tag{6B.4f}$$

or, setting $\chi' = \chi_2 - \chi_1$,

$$o_{\tilde n\tilde n}(\chi') = R_{\tilde N\tilde N}\!\left(\frac{\chi'}{u}\right). \tag{6B.4g}$$

This formula can also be written as, setting $\tau = \chi'/u$,

$$o_{\tilde n\tilde n}(u\tau) = R_{\tilde N\tilde N}(\tau). \tag{6B.4h}$$

Equations (6B.4g) and (6B.4h) specify the connection between the autocorrelation functions of Ñ
and ñ.
We examine the definition of a wide-sense stationary random function in Sec. 3.15 of Chapter
3 [in Eq. (3.30b)] and note that (6B.4e) is the major requirement for showing that ñ(χ) is
wide-sense stationary. All that remains is to discover whether or not

$$E\!\left(\tilde n(\chi)\right)$$

is finite and independent of χ. If Ñ(t) is wide-sense stationary, we know from Eq. (3.30a) of
Chapter 3 that

$$E\!\left(\tilde N(t)\right) = \mu_{\tilde N} = \text{finite constant}. \tag{6B.5a}$$

Substituting (6B.2b) into (6B.5a) gives

$$E\!\left(\tilde n(ut)\right) = \mu_{\tilde N} = \text{finite constant},$$

which, since χ = ut from (6B.1), is clearly the same thing as saying that

$$E\!\left(\tilde n(\chi)\right) = \mu_{\tilde N} = \text{finite constant}. \tag{6B.5b}$$

Therefore, putting together (6B.4e) and (6B.5b), we find that ñ(χ) satisfies all the requirements
for being a wide-sense stationary random function of χ whenever Ñ(t) is a wide-sense stationary
random function of t.
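The power spectrum $p_{\tilde n\tilde n}$ of ñ(χ) is the forward Fourier transform of its autocorrelation
function $o_{\tilde n\tilde n}$,

$$p_{\tilde n\tilde n}(\sigma) = \int_{-\infty}^{\infty} o_{\tilde n\tilde n}(\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi' . \tag{6B.6a}$$

Reversing this transform gives

$$o_{\tilde n\tilde n}(\chi') = \int_{-\infty}^{\infty} p_{\tilde n\tilde n}(\sigma)\,e^{2\pi i\sigma\chi'}\,d\sigma . \tag{6B.6b}$$

Substituting (6B.4g) into (6B.6a) gives

$$p_{\tilde n\tilde n}(\sigma) = \int_{-\infty}^{\infty} R_{\tilde N\tilde N}(\chi'/u)\,e^{-2\pi i\sigma\chi'}\,d\chi' .$$

We can, following the suggestion contained in Eq. (6B.1), change the dummy variable of
integration to $\tau = \chi'/u$ (with $d\chi' = u\,d\tau$) to get

$$p_{\tilde n\tilde n}(\sigma) = u\int_{-\infty}^{\infty} R_{\tilde N\tilde N}(\tau)\,e^{-2\pi i\sigma u\tau}\,d\tau . \tag{6B.6c}$$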

Comparing (6B.6c) to (6B.3b), we see that

$$p_{\tilde n\tilde n}(\sigma) = u\,S_{\tilde N\tilde N}(u\sigma), \tag{6B.6d}$$

which, setting

$$f = u\sigma, \tag{6B.6e}$$

can also be written as

$$u^{-1}\,p_{\tilde n\tilde n}(f/u) = S_{\tilde N\tilde N}(f). \tag{6B.6f}$$

Equation (3.57g) in Chapter 3 can be written as, using the notation of this appendix,

$$S_{\tilde N\tilde N}(f) = \lim_{T\to\infty}\left\{\frac{1}{2T}\,E\!\left(\left|\tilde N_T(f)\right|^2\right)\right\}, \tag{6B.7a}$$

where

$$\tilde N_T(f) = \int_{-T}^{T}\tilde N(t)\,e^{-2\pi i f t}\,dt = \int_{-\infty}^{\infty}\Pi(t,T)\,\tilde N(t)\,e^{-2\pi i f t}\,dt \tag{6B.7b}$$

with Π(t, T) defined the same way it was in Eq. (4C.1a) in Appendix 4C of Chapter 4:

$$\Pi(t,T) = \begin{cases} 1 & \text{for } |t| \le T \\ 0 & \text{for } |t| > T \end{cases}. \tag{6B.7c}$$

Transforming Eq. (6B.7b) from f and t variables to σ and χ variables gives [see Eqs. (6B.1) and
(6B.6e)]

$$\tilde N_T(u\sigma) = \frac{1}{u}\int_{-uT}^{uT}\tilde N(\chi/u)\,e^{-2\pi i(u\sigma)\cdot(\chi/u)}\,d\chi$$

or, substituting from (6B.2a),

$$\tilde N_T(u\sigma) = \frac{1}{u}\int_{-D}^{D}\tilde n(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi , \tag{6B.7d}$$

where we define, as in Eq. (6.40c) above,

$$D = uT . \tag{6B.7e}$$

If we also define

$$\tilde n_D(\sigma) = \int_{-D}^{D}\tilde n(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \int_{-\infty}^{\infty}\Pi(\chi,D)\,\tilde n(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi , \tag{6B.7f}$$

then Eq. (6B.7d) can be written as

$$\tilde n_D(\sigma) = u\,\tilde N_T(u\sigma) . \tag{6B.7g}$$

Replacing σ by f/u gives

$$\tilde N_T(f) = u^{-1}\,\tilde n_D(f/u) . \tag{6B.7h}$$

Now Eqs. (6B.6f), (6B.7h), and (6B.7e) can be combined with Eq. (6B.7a) to get

$$u^{-1}\,p_{\tilde n\tilde n}(f/u) = \lim_{D\to\infty}\left\{\frac{1}{2(D/u)}\,E\!\left(\frac{1}{u^2}\left|\tilde n_D(f/u)\right|^2\right)\right\}$$

or, replacing f by uσ,

$$p_{\tilde n\tilde n}(\sigma) = \lim_{D\to\infty}\left\{\frac{1}{2D}\,E\!\left(\left|\tilde n_D(\sigma)\right|^2\right)\right\}. \tag{6B.7i}$$

Equations (6B.6d), (6B.6f), and (6B.7a)–(6B.7i) specify the connections between the χ-based and
the t-based power spectra of ñ and Ñ.
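
The scaling relation in Eq. (6B.6d) can be checked on a simple example. The Python sketch below assumes, purely for illustration, an exponential autocorrelation R(τ) = exp(−|τ|/τ_c), whose power spectrum is the Lorentzian 2τ_c/[1 + (2πfτ_c)²]. It then transforms o(χ′) = R(χ′/u) numerically and compares the result with u·S(uσ). The correlation time, OPD velocity, wavenumber, and function names are all assumed values, not quantities from the text.

```python
import math


def time_spectrum(f: float, tau_c: float) -> float:
    """S(f) for the assumed autocorrelation R(tau) = exp(-|tau| / tau_c)."""
    return 2.0 * tau_c / (1.0 + (2.0 * math.pi * f * tau_c) ** 2)


def opd_spectrum_numeric(sigma: float, u: float, tau_c: float,
                         chi_max: float, steps: int) -> float:
    """Midpoint-rule version of Eq. (6B.6a) with o(chi') = R(chi'/u).

    The integrand is even in chi', so the transform reduces to twice a
    cosine integral over 0..chi_max (chi_max stands in for infinity).
    """
    h = chi_max / steps
    total = 0.0
    for k in range(steps):
        chi = (k + 0.5) * h
        total += math.exp(-chi / (u * tau_c)) * math.cos(2.0 * math.pi * sigma * chi)
    return 2.0 * total * h


u = 0.5          # OPD velocity in cm/s (assumed)
tau_c = 0.01     # correlation time in s (assumed)
sigma = 20.0     # wavenumber in cm^-1 (assumed)

p_numeric = opd_spectrum_numeric(sigma, u, tau_c, chi_max=40.0 * u * tau_c, steps=20000)
p_scaled = u * time_spectrum(u * sigma, tau_c)   # right-hand side of Eq. (6B.6d)
print(p_numeric, p_scaled)
```

The two printed numbers should agree closely, which is the statement that the χ-based power spectrum is the t-based spectrum evaluated at f = uσ and multiplied by u.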

7
MIRROR-MISALIGNMENT NEdN IN
DOUBLE-SIDED INTERFEROGRAMS
Unlike the detector noise described in the previous chapter, the misalignment noise in a well-
designed interferometer should be a small source of random error. To design these instruments
properly, ensuring that misalignment noise is likely to be small, we need some way to analyze it.
The formulas derived in Chapters 4 and 5 can handle static interferometer misalignments—that
is, they can handle situations where the alignment does not significantly change during a spectral
measurement—but a more sophisticated approach is needed when the alignment changes rapidly
and randomly. In this chapter we use wide-sense stationary random functions of the type
described in Sec. 3.15 of Chapter 3 to describe how the interferometer’s randomly changing
misalignment can contaminate the interference signal. By tracing the contaminated interferogram
through the entire signal-processing chain, including the calibration algorithm, we discover what
the spectral NEdN looks like when the interferometer is dominated by misalignment noise. This
not only produces the formulas needed to design interferometers with insignificant amounts of
random misalignment but it also, when interferometers break down, gives us the information
needed to decide whether unexpectedly large and randomly changing alignment errors are
contributing to the problem.

7.1 Setting Up the Signal Equations


Equation (6.8a) in Chapter 6 specifies the total optical signal presented to the detector at point A
in Fig. 6.2 by the formula

$$z_A^{(\mathrm{tot})}(\chi) = z_A(\chi) + z_A^{(\mathrm{cold})}(\chi) .$$

Consulting Eqs. (6.6c) and (6.12d) in Chapter 6, we expand this expression to

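$$
\begin{aligned}
z_A^{(\mathrm{tot})}(\chi) ={}& \frac{A\,\Delta\Omega}{4}\int_{-\infty}^{\infty}\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L(|\sigma|)\,d\sigma \\
&+ \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|)\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\left[L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|)\right]e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty}\eta(\sigma)\,\tau_a(\sigma)\,L^{(\mathrm{fore})}(\sigma)\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty}\left[2|r(\sigma)|^2 - \eta(\sigma)\right]\tau_a(\sigma)\,L^{(\mathrm{back})}(\sigma)\,d\sigma \\
&+ A_{\mathrm{det}}\,\Delta\Omega^{(\mathrm{dir})}\int_{0}^{\infty} L^{(\mathrm{dir})}(\sigma)\,d\sigma .
\end{aligned}
\tag{7.1a}
$$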

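We note that, because η is even [see Eq. (4.139g) in Chapter 4], the product

$$\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L(|\sigma|)$$

is even. According to formula (2.19) in Chapter 2, we then can write that

$$\int_{-\infty}^{\infty}\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L(|\sigma|)\,d\sigma = 2\int_{0}^{\infty}\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,L(\sigma)\,d\sigma . \tag{7.1b}$$

This allows the first and fourth terms on the right-hand side of (7.1a) to be combined into a single
integral,

$$
\begin{aligned}
\frac{A\,\Delta\Omega}{4}\int_{-\infty}^{\infty}\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L(|\sigma|)\,d\sigma &+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty}\eta(\sigma)\,\tau_a(\sigma)\,L^{(\mathrm{fore})}(\sigma)\,d\sigma \\
&= \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty}\eta(\sigma)\,\tau_a(\sigma)\left[\tau_f(\sigma)\,L(\sigma) + L^{(\mathrm{fore})}(\sigma)\right]d\sigma .
\end{aligned}
\tag{7.1c}
$$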

In a similar way, we can combine the two Fourier transforms in (7.1a) to get

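$$
\begin{aligned}
&\int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|)\,e^{2\pi i\sigma\chi}\,d\sigma \\
&\quad+ \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\left[L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|)\right]e^{2\pi i\sigma\chi}\,d\sigma \\
&= \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|)\right]e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}
\tag{7.1d}
$$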

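Equations (7.1c) and (7.1d) can be substituted into (7.1a) to get

$$
\begin{aligned}
z_A^{(\mathrm{tot})}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|)\right]e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty}\eta(\sigma)\,\tau_a(\sigma)\left[\tau_f(\sigma)\,L(\sigma) + L^{(\mathrm{fore})}(\sigma)\right]d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty}\left[2|r(\sigma)|^2 - \eta(\sigma)\right]\tau_a(\sigma)\,L^{(\mathrm{back})}(\sigma)\,d\sigma + A_{\mathrm{det}}\,\Delta\Omega^{(\mathrm{dir})}\int_{0}^{\infty} L^{(\mathrm{dir})}(\sigma)\,d\sigma .
\end{aligned}
\tag{7.1e}
$$

This is the formula for $z_A^{(\mathrm{tot})}(\chi)$ that we will trace through the signal chain of Fig. 6.2 in Chapter
6.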

7.2 Specifying the Random Misalignment Angle of the Moving Mirror


Figure 7.1 specifies the random-angle variables $\tilde\theta_x$ and $\tilde\theta_y$ used to describe the misalignment of
the moving mirror. The total misalignment angle $\theta_{ma}$ in Eq. (7.1e) is now called $\tilde\theta$ to show that it
too is random. The dashed arrow in Fig. 7.1 shows the orientation of the surface normal when it
is misaligned by the random angle

$$\tilde\theta = \sqrt{\tilde\theta_x^{\,2} + \tilde\theta_y^{\,2}},$$

and the bold arrow pointing along the interferometer’s optical axis shows the orientation of the
moving mirror’s surface normal when it is correctly aligned. The formula for the modulation
function M used in Eq. (7.1e) and defined in Eq. (5.10c) of Chapter 5 assumes that the beam
passing through the interferometer has a circular cross section. In a well-designed interferometer
$\tilde\theta$ is always small, so in Eq. (5.10c) we can make the approximation (see footnote 102)

$$M(R\sigma\tilde\theta) = \frac{J_1(4\pi R\sigma\tilde\theta)}{2\pi R\sigma\tilde\theta} \cong 1 - a\,\sigma^2\tilde\theta^{\,2} \tag{7.2a}$$

with

$$a = 2\pi^2 R^2 . \tag{7.2b}$$
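
The quality of the small-angle approximation in Eq. (7.2a) is easy to check numerically. The Python sketch below is an illustration only: it evaluates J₁ from its standard power series, J₁(x) = Σ (−1)ᵏ (x/2)^(2k+1) / [k!(k+1)!], and compares J₁(4πRσθ)/(2πRσθ) with 1 − 2π²R²σ²θ². The function names and the example values (R = 2 cm, read here as a beam radius, σ = 2000 cm⁻¹, θ = 5 µrad) are assumptions and are not taken from the text.

```python
import math


def bessel_j1(x: float, terms: int = 30) -> float:
    """J1(x) from its power series; adequate for the small |x| used here."""
    half = 0.5 * x
    term = half                      # k = 0 term of the series
    total = term
    for k in range(1, terms):
        term *= -half * half / (k * (k + 1))   # ratio of successive terms
        total += term
    return total


def modulation_exact(radius: float, sigma: float, theta: float) -> float:
    """M = J1(4*pi*R*sigma*theta) / (2*pi*R*sigma*theta), left side of Eq. (7.2a)."""
    x = 4.0 * math.pi * radius * sigma * theta
    if x == 0.0:
        return 1.0
    return bessel_j1(x) / (0.5 * x)


def modulation_approx(radius: float, sigma: float, theta: float) -> float:
    """1 - a*sigma^2*theta^2 with a = 2*pi^2*R^2, Eqs. (7.2a) and (7.2b)."""
    a = 2.0 * math.pi ** 2 * radius ** 2
    return 1.0 - a * sigma ** 2 * theta ** 2


# Assumed example values: R = 2 cm, sigma = 2000 cm^-1, theta = 5 microradians.
exact = modulation_exact(2.0, 2000.0, 5.0e-6)
approx = modulation_approx(2.0, 2000.0, 5.0e-6)
print(exact, approx, exact - approx)
```

For these values the two expressions differ only by the next term of the series, of order (4πRσθ)⁴/192. This is why the quadratic form is adequate when the misalignment angle is small.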

FIGURE 7.1. The z axis is the correctly aligned normal vector of the mirror surface and the dashed arrow
is the misaligned normal vector. The x and y axes show the orientation of the $\tilde\theta_x(\chi)$ and $\tilde\theta_y(\chi)$
components of the total $\tilde\theta(\chi)$ misalignment angle.
[Drawing not reproduced: x, y, and z axes with the angles $\tilde\theta_x(\chi)$, $\tilde\theta_y(\chi)$, and $\tilde\theta(\chi)$ marked.]

102
Handbook of Mathematical Functions, edited by Abramowitz and Stegun, see formula (9.1.10), p. 360.


The random angles $\tilde\theta_x$ and $\tilde\theta_y$ can take on both positive and negative values, but random angle $\tilde\theta$
can never be negative. All three angles, $\tilde\theta_x$, $\tilde\theta_y$, and $\tilde\theta$, can be treated as random functions of
the OPD value χ, giving us

$$\tilde\theta(\chi) = \sqrt{\tilde\theta_x(\chi)^2 + \tilde\theta_y(\chi)^2} . \tag{7.2c}$$

By making these angles stationary random functions of χ, we can analyze what happens to the
interferometer signal when $\tilde\theta_x$ and $\tilde\theta_y$ change randomly with OPD while the moving mirror is in
motion.
Using stationary random functions to represent angles $\tilde\theta_x$, $\tilde\theta_y$, and $\tilde\theta$ is an obvious approach
when the misalignment is driven by outside disturbances, for example when interferometers
are operated in high-vibration environments. In this sort of situation, we expect $\tilde\theta_x(\chi)$, $\tilde\theta_y(\chi)$,
and $\tilde\theta(\chi)$ to be at least wide-sense stationary and weakly ergodic (these types of random
quantities are discussed in Secs. 3.15 and 3.18 of Chapter 3). In low-vibration environments,
however, there may well be a tendency for the interferometer’s own motion (it does, after all,
have a moving mirror) to excite internal resonances that disturb the alignment. When this
happens, the misalignment may well be preferentially large at certain χ values. Although at first
glance it may seem that $\tilde\theta_x$, $\tilde\theta_y$, and $\tilde\theta$ must now be nonstationary random functions, we can
instead, remembering the discussion following Eq. (3.47a) in Chapter 3, say that $\tilde\theta_x$, $\tilde\theta_y$, and $\tilde\theta$ are
still stationary but nonergodic. Before the instrument is built, it is very difficult to know at what χ
values the random quantities $\tilde\theta_x$, $\tilde\theta_y$, and $\tilde\theta$ have a greater chance of taking on large values.
Hence, in our ignorance, while designing the instrument, we can treat these angles as equally
likely to be large or small at any χ value; that is, we say that $\tilde\theta_x$, $\tilde\theta_y$, and $\tilde\theta$ are stationary.
Building the interferometer then corresponds to choosing specific angle functions from the
ensemble of allowed functions, as discussed in Sec. 3.14 in Chapter 3. If the angle function turns
out to be preferentially large at some χ values, this just means that a nonergodic member function
of the ensemble has been chosen. So even in a low-vibration environment we can still, while
designing the interferometer, regard $\tilde\theta_x(\chi)$, $\tilde\theta_y(\chi)$, and $\tilde\theta(\chi)$ as wide-sense stationary random
functions.
Now that we have decided to treat $\tilde\theta_x(\chi)$ and $\tilde\theta_y(\chi)$ as wide-sense stationary random
functions, we note that $\tilde\theta_x$ and $\tilde\theta_y$ are usually zero-mean random variables, which means that
their expectation values are zero:


$$E\!\left(\tilde\theta_x(\chi)\right) = 0 \tag{7.2d}$$

and

$$E\!\left(\tilde\theta_y(\chi)\right) = 0 . \tag{7.2e}$$

Some interferometers, however, have a bias tilt angle φ, which is the same thing as saying that
$E(\tilde\theta_x(\chi))$ and $E(\tilde\theta_y(\chi))$ are not both equal to zero. When this happens, the expectation values of
$\tilde\theta_x(\chi)$ and $\tilde\theta_y(\chi)$ are assumed to be independent of χ, and we can orient the x and y axes in Fig.
7.1 so that

$$E\!\left(\tilde\theta_x(\chi)\right) = \varphi \tag{7.2f}$$

and

$$E\!\left(\tilde\theta_y(\chi)\right) = 0 . \tag{7.2g}$$

Note that when φ = 0, these equations reduce to the previous formulas in (7.2d) and (7.2e). To
analyze mirror-misalignment noise both with and without bias tilt, we say that the probability
density distribution characterizing the behavior of $\tilde\theta_x$ at all values of χ has a mean of φ and that
the probability density distribution characterizing the behavior of $\tilde\theta_y$ at all values of χ has a mean
of zero. We assume that the probability density distributions for $\tilde\theta_x$ and $\tilde\theta_y$ are normal and have
standard deviations $\gamma_x$ and $\gamma_y$ respectively. These two normal distributions can then be written
as

$$p_{\tilde\theta_x}(\theta_x) = \frac{1}{\gamma_x\sqrt{2\pi}}\,e^{-(\theta_x - \varphi)^2/(2\gamma_x^2)} \tag{7.2h}$$

and

$$p_{\tilde\theta_y}(\theta_y) = \frac{1}{\gamma_y\sqrt{2\pi}}\,e^{-\theta_y^2/(2\gamma_y^2)} . \tag{7.2i}$$

Here $p_{\tilde\theta_x}(\theta_x)\,d\theta_x$ is the probability that $\tilde\theta_x$ takes on a value between $\theta_x$ and $\theta_x + d\theta_x$, and
$p_{\tilde\theta_y}(\theta_y)\,d\theta_y$ is the probability that $\tilde\theta_y$ takes on a value between $\theta_y$ and $\theta_y + d\theta_y$. Having used
Eqs. (7.2h) and (7.2i) to set up the $\tilde\theta_x$ and $\tilde\theta_y$ distributions, it can be shown that if

$$\gamma_x = \gamma_y = \gamma$$

and $\tilde\theta_x$, $\tilde\theta_y$ are independent, then $\tilde\theta$ in Eq. (7.2c) must obey the probability density
distribution (see footnote 103)

$$p_{\tilde\theta}(\theta) = \frac{\theta}{\gamma^2}\,I_0\!\left(\frac{\theta\varphi}{\gamma^2}\right)e^{-(\theta^2+\varphi^2)/(2\gamma^2)}, \tag{7.2j}$$

where

$$I_0(\xi) = \frac{1}{2\pi}\int_{0}^{2\pi} e^{\xi\cos\omega}\,d\omega \tag{7.2k}$$

is a modified Bessel function of order zero.


Since the statistics of $\tilde\theta_x$, $\tilde\theta_y$, and $\tilde\theta$ do not change with χ, the random functions $\tilde\theta_x(\chi)$,
$\tilde\theta_y(\chi)$, and $\tilde\theta(\chi)$ are (speaking somewhat loosely) equally likely to take on the same values at
any position of the interferometer’s moving mirror. This means that the average or mean squared
values of $\tilde\theta_x$, $\tilde\theta_y$, and $\tilde\theta$ are χ-independent constants. Equations (7A.5a) and (7A.5c) in Appendix
7A then show that

$$E\!\left(\tilde\theta_x(\chi)^2\right) = \varphi^2 + \gamma_x^2 \tag{7.3a}$$

and

$$E\!\left(\tilde\theta_y(\chi)^2\right) = \gamma_y^2 . \tag{7.3b}$$

We define $\theta_{rms}^2$ to be the χ-independent constant equal to $E(\tilde\theta(\chi)^2)$,

$$E\!\left(\tilde\theta(\chi)^2\right) = \theta_{rms}^2 . \tag{7.3c}$$

Squaring (7.2c) and taking the expectation value of both sides gives, after applying Eq. (3.16a) in
Chapter 3,

$$\theta_{rms}^2 = E\!\left(\tilde\theta(\chi)^2\right) = E\!\left(\tilde\theta_x(\chi)^2 + \tilde\theta_y(\chi)^2\right) = E\!\left(\tilde\theta_x(\chi)^2\right) + E\!\left(\tilde\theta_y(\chi)^2\right).$$

Substituting from (7.3a) and (7.3b) gives

$$E\!\left(\tilde\theta(\chi)^2\right) = \theta_{rms}^2 = \varphi^2 + \gamma_x^2 + \gamma_y^2 . \tag{7.3d}$$

103
A. Papoulis, Probability, Random Variables, and Stochastic Processes, p. 140.


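When both $\tilde\theta_x$ and $\tilde\theta_y$ have the same standard deviation, with

$$\gamma_x = \gamma_y = \gamma ,$$

then Eq. (7.3d) becomes

$$E\!\left(\tilde\theta(\chi)^2\right) = \theta_{rms}^2 = \varphi^2 + 2\gamma^2 ,$$

which can be solved for $\gamma^2$ to get

$$\gamma^2 = \frac{\theta_{rms}^2 - \varphi^2}{2} . \tag{7.3e}$$

For future use, we derive the value of $E(\tilde\theta(\chi)^4)$. Taking the fourth power of both sides of Eq.
(7.2c) and taking the expectation value gives [again using Eq. (3.16a) in Chapter 3],

$$
\begin{aligned}
E\!\left(\tilde\theta(\chi)^4\right) &= E\!\left(\left[\tilde\theta_x(\chi)^2 + \tilde\theta_y(\chi)^2\right]^2\right) = E\!\left(\tilde\theta_x(\chi)^4 + \tilde\theta_y(\chi)^4 + 2\,\tilde\theta_x(\chi)^2\,\tilde\theta_y(\chi)^2\right) \\
&= E\!\left(\tilde\theta_x(\chi)^4\right) + E\!\left(\tilde\theta_y(\chi)^4\right) + 2\,E\!\left(\tilde\theta_x(\chi)^2\,\tilde\theta_y(\chi)^2\right).
\end{aligned}
$$

Assuming that $\tilde\theta_x(\chi)$ and $\tilde\theta_y(\chi)$ are independent random variables (which, of course, means
that $\tilde\theta_x(\chi)^2$ and $\tilde\theta_y(\chi)^2$ are also independent), we can write that [see formula (3.12c) in Chapter
3]

$$E\!\left(\tilde\theta(\chi)^4\right) = E\!\left(\tilde\theta_x(\chi)^4\right) + E\!\left(\tilde\theta_y(\chi)^4\right) + 2\,E\!\left(\tilde\theta_x(\chi)^2\right)E\!\left(\tilde\theta_y(\chi)^2\right). \tag{7.4a}$$

From Eqs. (7A.5b) and (7A.5d) in Appendix 7A, we have

$$E\!\left(\tilde\theta_x(\chi)^4\right) = 3\gamma_x^4 + 6\varphi^2\gamma_x^2 + \varphi^4 \tag{7.4b}$$

and

$$E\!\left(\tilde\theta_y(\chi)^4\right) = 3\gamma_y^4 . \tag{7.4c}$$

Substitution of Eqs. (7.3a), (7.3b), (7.4b), and (7.4c) into (7.4a) gives

$$
\begin{aligned}
E\!\left(\tilde\theta(\chi)^4\right) &= 3\gamma_x^4 + 6\varphi^2\gamma_x^2 + \varphi^4 + 3\gamma_y^4 + 2\left(\varphi^2 + \gamma_x^2\right)\gamma_y^2 \\
&= 3\left(\gamma_x^4 + \gamma_y^4\right) + 2\gamma_x^2\left(3\varphi^2 + \gamma_y^2\right) + 2\varphi^2\gamma_y^2 + \varphi^4 .
\end{aligned}
\tag{7.4d}
$$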

When $\tilde\theta_x$ and $\tilde\theta_y$ have the same standard deviation $\gamma_x = \gamma_y = \gamma$, this reduces to

$$E\!\left(\tilde\theta(\chi)^4\right) = 8\gamma^4 + 8\varphi^2\gamma^2 + \varphi^4 . \tag{7.4e}$$

Substituting Eq. (7.3e) into (7.4e) gives

$$
\begin{aligned}
E\!\left(\tilde\theta(\chi)^4\right) &= 8\left(\frac{\theta_{rms}^2 - \varphi^2}{2}\right)^{2} + 8\varphi^2\left(\frac{\theta_{rms}^2 - \varphi^2}{2}\right) + \varphi^4 \\
&= 2\left(\theta_{rms}^4 + \varphi^4 - 2\theta_{rms}^2\varphi^2\right) + 4\varphi^2\theta_{rms}^2 - 4\varphi^4 + \varphi^4 .
\end{aligned}
$$

Thus we have, simplifying the right-hand side, that

$$E\!\left(\tilde\theta(\chi)^4\right) = 2\theta_{rms}^4 - \varphi^4 \tag{7.4f}$$

when $\tilde\theta_x$ and $\tilde\theta_y$ are independent and obey normal distributions having the same standard
deviation.
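
The moment formulas (7.3d) and (7.4f) can be spot-checked with a small Monte Carlo experiment. The Python sketch below is an illustration under assumed values: it draws the x component from a normal distribution with mean φ and standard deviation γ, the y component from a zero-mean normal distribution with the same γ, and compares the sample averages of θ² and θ⁴ with the closed forms. The function names, the seed, the sample size, and the values φ = 1 and γ = 0.5 (arbitrary angle units) are not from the text.

```python
import random


def theta_moments_closed_form(phi: float, gamma: float) -> tuple[float, float]:
    """Return (E[theta^2], E[theta^4]) from Eqs. (7.3d) and (7.4f), equal gammas."""
    theta_rms_sq = phi ** 2 + 2.0 * gamma ** 2         # Eq. (7.3d) with gamma_x = gamma_y
    fourth = 2.0 * theta_rms_sq ** 2 - phi ** 4        # Eq. (7.4f)
    return theta_rms_sq, fourth


def theta_moments_monte_carlo(phi: float, gamma: float, n: int,
                              seed: int = 0) -> tuple[float, float]:
    """Sample averages of theta^2 and theta^4 for independent normal tilt angles."""
    rng = random.Random(seed)
    sum2 = 0.0
    sum4 = 0.0
    for _ in range(n):
        tx = rng.gauss(phi, gamma)    # x tilt: mean phi (bias tilt), std gamma
        ty = rng.gauss(0.0, gamma)    # y tilt: zero mean, std gamma
        t2 = tx * tx + ty * ty        # theta^2 from Eq. (7.2c)
        sum2 += t2
        sum4 += t2 * t2
    return sum2 / n, sum4 / n


# Assumed example values in arbitrary angle units: phi = 1, gamma = 0.5.
exact2, exact4 = theta_moments_closed_form(1.0, 0.5)
mc2, mc4 = theta_moments_monte_carlo(1.0, 0.5, n=100_000)
print(exact2, mc2)
print(exact4, mc4)
```

With these values the closed forms give E(θ²) = 1.5 and E(θ⁴) = 3.5, which also equals 8γ⁴ + 8φ²γ² + φ⁴ from Eq. (7.4e). The sampled averages should agree to within their statistical scatter.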

7.3 χ-Based Signal Contaminated by Mirror-Misalignment Noise


When random misalignment of the moving mirror is the primary source of noise, Eq. (7.1e)
above with șma replaced by θ ( χ ) is—as was pointed out at the beginning of the previous
section—the formula for the noise-contaminated signal,

( tot )
z AN (χ ) =

WA ∆Ω
4 −∞³ ( )
M Rσθ( χ ) η (σ )τ a ( σ )[τ f ( σ )L FOV ( σ ) + L(FOV
fore )
( σ ) − L(back)
FOV ( σ )] e
2π iσχ

A ∆Ω

(7.5a)
+
2 0³ η (σ )τ a (σ )[τ f (σ )L(σ ) + L(fore) (σ )] dσ
∞ ∞
A ∆Ω
³ [2 r (σ ) −η (σ )]τ a (σ )L(back) (σ ) dσ + Adet ∆Ω( dir ) ³ L( dir ) (σ ) dσ .
2
+
2 0 0

( tot )
In this chapter, the random function z AN represents the total signal contaminated by mirror-
misalignment noise at point A in Fig. 6.2 of Chapter 6. The AN subscript and (tot) superscript
( tot )
remind us that z AN is the noise-contaminated total signal at point A, and the tilde shows that

- 873 -
7 · Mirror-Misalignment NEdN in Double-Sided Interferograms

θ ( χ ) turns z AN
( tot )
into a random function of Ȥ. To get the detector signal generated by all the
optical power hitting the detector, we insert the detector responsivity R into the integrals on the
right-hand side of (7.5a):

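\[
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} R(|\sigma|)\,M\big(R\sigma\tilde\theta(\chi)\big)\,\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} R(\sigma)\eta(\sigma)\tau_a(\sigma)[\tau_f(\sigma)L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} [2r^{2}(\sigma) - \eta(\sigma)]R(\sigma)\tau_a(\sigma)L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)L^{(dir)}(\sigma)\,d\sigma .
\end{aligned}
\tag{7.5b}
\]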

Here $\tilde z_{BN}^{(tot)}$ represents the total signal contaminated by mirror-misalignment noise at point B in
Fig. 6.2. Traditionally the responsivity R(σ) is defined only for positive wavenumber arguments,
so inside the first integral on the right-hand side the argument of R has absolute value signs to
make R well-defined for negative σ values.
Equation (7.2a) can be substituted into (7.5b) to get

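\[
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} R(\sigma)\eta(\sigma)\tau_a(\sigma)[\tau_f(\sigma)L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} [2r^{2}(\sigma) - \eta(\sigma)]R(\sigma)\tau_a(\sigma)L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)L^{(dir)}(\sigma)\,d\sigma \\
&- a\,\tilde\theta(\chi)^2\,\frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} \sigma^2 R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}
\tag{7.6a}
\]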

Adding and subtracting

\[
a\,\theta_{rms}^2\,\frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} \sigma^2 R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma
\]

on the right-hand side of (7.6a) gives


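\[
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \left(\frac{WA\,\Delta\Omega}{4}\right)\int_{-\infty}^{\infty} (1 - a\theta_{rms}^2\sigma^2)\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} R(\sigma)\eta(\sigma)\tau_a(\sigma)[\tau_f(\sigma)L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} [2r^{2}(\sigma) - \eta(\sigma)]R(\sigma)\tau_a(\sigma)L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)L^{(dir)}(\sigma)\,d\sigma \\
&- a\,[\tilde\theta(\chi)^2 - \theta_{rms}^2]\left(\frac{WA\,\Delta\Omega}{4}\right)\int_{-\infty}^{\infty} \sigma^2 R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}
\tag{7.6b}
\]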

Again we apply Eq. (7.2a) above to get, since $\theta_{rms}$ is a small angle,

\[
M(R\sigma\theta_{rms}) = 1 - a\sigma^2\theta_{rms}^2 .
\]

This lets us write (7.6b) as
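\[
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \left(\frac{WA\,\Delta\Omega}{4}\right)\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} R(\sigma)\eta(\sigma)\tau_a(\sigma)[\tau_f(\sigma)L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} [2r^{2}(\sigma) - \eta(\sigma)]R(\sigma)\tau_a(\sigma)L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)L^{(dir)}(\sigma)\,d\sigma \\
&- a\,[\tilde\theta(\chi)^2 - \theta_{rms}^2]\left(\frac{WA\,\Delta\Omega}{4}\right)\int_{-\infty}^{\infty} \sigma^2 R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}
\tag{7.7a}
\]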


Now by defining

\[
Z_{FOV}(\sigma) = \left(\frac{WA\,\Delta\Omega}{4}\right) R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)] , \tag{7.7b}
\]

we can write Eq. (7.7a) as


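\[
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} R(\sigma)\eta(\sigma)\tau_a(\sigma)[\tau_f(\sigma)L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} [2r^{2}(\sigma) - \eta(\sigma)]R(\sigma)\tau_a(\sigma)L^{(back)}(\sigma)\,d\sigma \\
&+ A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)L^{(dir)}(\sigma)\,d\sigma \\
&+ a\,[\theta_{rms}^2 - \tilde\theta(\chi)^2]\int_{-\infty}^{\infty} \sigma^2 Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}
\tag{7.7c}
\]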

The formula for $\tilde z_{BN}^{(tot)}$ can be cleaned up some more by defining function W(χ) to be

\[
W(\chi) = \int_{-\infty}^{\infty} \sigma^2 Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \tag{7.8a}
\]

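and also defining a new random function $\tilde n^{(\theta 2)}(\chi)$,

\[
\tilde n^{(\theta 2)}(\chi) = \theta_{rms}^2 - \tilde\theta(\chi)^2 . \tag{7.8b}
\]

Equation (7.3c) shows that $\tilde n^{(\theta 2)}(\chi)$ can also be written as

\[
\tilde n^{(\theta 2)}(\chi) = E\big(\tilde\theta(\chi)^2\big) - \tilde\theta(\chi)^2 . \tag{7.8c}
\]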

We note that, using the linearity of operator E described in Sec. 3.10 of Chapter 3,

\[
E\big(\tilde n^{(\theta 2)}(\chi)\big) = E\Big(E\big(\tilde\theta(\chi)^2\big) - \tilde\theta(\chi)^2\Big) = E\Big(E\big(\tilde\theta(\chi)^2\big)\Big) - E\big(\tilde\theta(\chi)^2\big) .
\]

Since $E\big(\tilde\theta(\chi)^2\big)$ is a nonrandom quantity, Eq. (3.9f) of Chapter 3 requires that

\[
E\Big(E\big(\tilde\theta(\chi)^2\big)\Big) = E\big(\tilde\theta(\chi)^2\big) ,
\]

from which it follows that

\[
E\big(\tilde n^{(\theta 2)}(\chi)\big) = E\Big(E\big(\tilde\theta(\chi)^2\big)\Big) - E\big(\tilde\theta(\chi)^2\big) = E\big(\tilde\theta(\chi)^2\big) - E\big(\tilde\theta(\chi)^2\big) = 0 . \tag{7.8d}
\]

Hence, $\tilde n^{(\theta 2)}(\chi)$ is a zero-mean random function. Since $\tilde n^{(\theta 2)}(\chi)$ is just the square of $\tilde\theta(\chi)$
subtracted from a constant, and the statistics of $\tilde\theta(\chi)$ do not depend on χ, we expect that the
statistics of $\tilde n^{(\theta 2)}(\chi)$ also do not depend on χ. Consequently, we now assume that $\tilde n^{(\theta 2)}(\chi)$ is at
least wide-sense stationary with respect to χ [see Eqs. (3.30a) and (3.30b) and the discussion
following them for a description of what this means]. Substituting (7.8a) and (7.8b) into (7.7c)
gives

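\[
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} R(\sigma)\eta(\sigma)\tau_a(\sigma)[\tau_f(\sigma)L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} [2r^{2}(\sigma) - \eta(\sigma)]R(\sigma)\tau_a(\sigma)L^{(back)}(\sigma)\,d\sigma \\
&+ A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)L^{(dir)}(\sigma)\,d\sigma \\
&+ a\,\tilde n^{(\theta 2)}(\chi)\,W(\chi) .
\end{aligned}
\tag{7.8e}
\]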

The first four terms on the right-hand side are all nonrandom, so it makes sense to write (7.8e) as

\[
\tilde z_{BN}^{(tot)}(\chi) = z_B^{(tot)}(\chi) + a\,\tilde n^{(\theta 2)}(\chi)\,W(\chi) , \tag{7.8f}
\]

where

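\[
\begin{aligned}
z_B^{(tot)}(\chi) ={}& \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} R(\sigma)\eta(\sigma)\tau_a(\sigma)[\tau_f(\sigma)L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} [2r^{2}(\sigma) - \eta(\sigma)]R(\sigma)\tau_a(\sigma)L^{(back)}(\sigma)\,d\sigma \\
&+ A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)L^{(dir)}(\sigma)\,d\sigma .
\end{aligned}
\tag{7.8g}
\]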

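Substituting for $Z_{FOV}(\sigma)$ from (7.7b) lets the formula for $z_B^{(tot)}$ be written as

\[
\begin{aligned}
z_B^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} R(\sigma)\eta(\sigma)\tau_a(\sigma)[\tau_f(\sigma)L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} [2r^{2}(\sigma) - \eta(\sigma)]R(\sigma)\tau_a(\sigma)L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)L^{(dir)}(\sigma)\,d\sigma .
\end{aligned}
\tag{7.8h}
\]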

Comparing this latest expression to the formula for $z_A^{(tot)}(\chi)$ in Eq. (7.1e), we note that $z_A^{(tot)}(\chi)$
turns into $z_B^{(tot)}(\chi)$ if we insert the responsivity R into all the integrals of (7.1e) and also set

\[
\theta_{rms} = \theta_{ma} \tag{7.8i}
\]

in the modulation term M. This correspondence justifies the $z_B^{(tot)}$ label given to the sum of the
four nonrandom terms in (7.8e) above, because this term looks like what the noise-free signal
$z_A^{(tot)}$ at point A in Fig. 6.2 would become as it leaves the detector at point B, provided we say
that $\theta_{rms}$ is the effective value of the moving mirror's constant misalignment angle. After using
both the linearity of E described in Sec. 3.10 of Chapter 3 and Eq. (3.9f) from that same chapter,
we apply the expectation operator E to both sides of Eq. (7.8f) to get

\[
E\big(\tilde z_{BN}^{(tot)}(\chi)\big) = E\big(z_B^{(tot)}(\chi)\big) + E\big(a\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big) = z_B^{(tot)}(\chi) + a\,W(\chi)\,E\big(\tilde n^{(\theta 2)}(\chi)\big) .
\]

This becomes, using Eq. (7.8d),

\[
E\big(\tilde z_{BN}^{(tot)}(\chi)\big) = z_B^{(tot)}(\chi) . \tag{7.8j}
\]

Hence $z_B^{(tot)}(\chi)$ is the expectation value, or average value, of the noise-contaminated signal
leaving the detector. It is the χ-based signal we get after averaging together many independent
measurements of the same spectral radiance to reduce the mirror-misalignment noise to
negligible levels.
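
The averaging argument behind (7.8j) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the signal is evaluated at a single OPD value, so z_B and W are plain numbers; the noise is drawn as in (7.8b) with θx of mean φ and θy of mean zero (an assumed model); and all numerical values, as well as the helper names `noise_sample` and `averaged_signal`, are hypothetical.

```python
import random

def noise_sample(rng, phi, gamma):
    """One draw of n = theta_rms^2 - theta^2, following the form of (7.8b)."""
    theta_x = rng.gauss(phi, gamma)   # assumed mean phi
    theta_y = rng.gauss(0.0, gamma)   # assumed mean zero
    theta_rms_sq = phi**2 + 2.0 * gamma**2
    return theta_rms_sq - (theta_x**2 + theta_y**2)

def averaged_signal(z_b, w, a, phi, gamma, n_scans, seed=7):
    """Average n_scans independent realizations of z_B + a * n * W, as in (7.8f)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_scans):
        total += z_b + a * noise_sample(rng, phi, gamma) * w
    return total / n_scans

# Illustrative values only; they are not taken from the text.
z_b, w, a, phi, gamma = 10.0, 2.0, 0.1, 1.0, 0.5
few = averaged_signal(z_b, w, a, phi, gamma, n_scans=4)
many = averaged_signal(z_b, w, a, phi, gamma, n_scans=40_000)
print(abs(few - z_b), abs(many - z_b))
```

Because the noise has zero mean, the average over many scans approaches the noise-free value z_B, with a residual that shrinks roughly as the inverse square root of the number of scans.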

7.4 Misalignment Noise and the Detector Circuit (or Anti-Aliasing Filter)
To get the noise-contaminated signal through the detector circuit (which contains the anti-aliasing
filter) to point C in Fig. 6.2 in Chapter 6, we convert the noise-contaminated signal into a
function of time. Using the notation of Eq. (5.41a) in Chapter 5 and Eq. (6.4) of Chapter 6, we
write
\[
t = \frac{\chi}{u} , \tag{7.9a}
\]

where u is the constant, positive OPD velocity, that is, the constant time rate of change of the
OPD value χ. For the interferometer in Fig. 6.2, if v is the constant physical velocity of the
moving mirror, then u = 2v. The t = 0 origin of the time coordinate is chosen to coincide with
the OPD value χ = 0. The time-based signal at point B can be written as, using (7.9a) to replace
χ by ut in Eq. (7.8f),

\[
\tilde z_{BN}^{(tot)}(ut) = z_B^{(tot)}(ut) + a\,\tilde n^{(\theta 2)}(ut)\,W(ut) . \tag{7.9b}
\]

To find the time-based output signal so (t ) leaving the detector circuit, we apply the standard
linear-circuit formula
so (t ) = h(t ) ∗ si (t ) , (7.10a)

where ∗ is the convolution operator defined in Eq. (2.38a) of Chapter 2, si(t) is the input signal
entering the detector circuit, and h(t) is the real-valued impulse-response function of the detector
circuit including the anti-aliasing filter.104 Equation (7.10a) can be written as


\[
h(t) * s_i(t) = \int_{-\infty}^{\infty} h(t')\,s_i(t - t')\,dt' . \tag{7.10b}
\]

104
See Appendix 5A of Chapter 5 for more discussion of the impulse-response function and the implications of Eq.
(7.10a) relating the input and output signals of the detector circuit.


We know that the detector circuit (and anti-aliasing filter) has a transfer function H(ƒ) such that h
and H are a Fourier-transform pair,

\[
H(f) = \int_{-\infty}^{\infty} h(t)\,e^{-2\pi i f t}\,dt \tag{7.10c}
\]

and

\[
h(t) = \int_{-\infty}^{\infty} H(f)\,e^{2\pi i f t}\,df . \tag{7.10d}
\]

According to Eq. (5A.6b) in Appendix 5A of Chapter 5, the transfer function H is Hermitian,

H(− f ) = H( f )∗ . (7.10e)

The ∗ superscript indicates that H(ƒ)* is the complex conjugate of H(ƒ). As explained in
Appendix 5A, formula (7.10e) holds true for any Fourier transform of a real function h. The
detector circuit is AC coupled to the detector, which means, according to Eq. (5.46d) in Chapter
5, that
H(0) = 0 .

Substituting from (7.10c) with f = 0 then leads to



\[
H(0) = \int_{-\infty}^{\infty} h(t)\,dt = 0 . \tag{7.10f}
\]

An immediate consequence of Eq. (7.10f) and the definition of the convolution in Eq. (2.38a) in
Chapter 2 is that, for any time-independent constant K,


\[
h(t) * K = K\int_{-\infty}^{\infty} h(t')\,dt' = 0 . \tag{7.10g}
\]
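
Equation (7.10g) can be illustrated numerically. The sketch below uses a hypothetical impulse response, the derivative of a Gaussian, chosen only because it integrates to zero as required by (7.10f); it is not the response of any actual detector circuit. The convolution integral of (7.10b) is approximated by a Riemann sum, and the function names are illustrative.

```python
import math

def impulse_response(t, tau=1.0):
    """Hypothetical zero-area response: derivative of a Gaussian of width tau."""
    return -t / tau**2 * math.exp(-t * t / (2.0 * tau * tau))

def convolve_at(h, s, t, dt=0.01, half_width=10.0):
    """Riemann-sum approximation of the integral of h(t') s(t - t') dt'."""
    n = int(round(half_width / dt))
    return sum(h(k * dt) * s(t - k * dt) for k in range(-n, n + 1)) * dt

# A constant input is rejected, while a time-varying input passes through.
constant_out = convolve_at(impulse_response, lambda t: 5.0, t=0.3)
varying_out = convolve_at(impulse_response, lambda t: math.cos(t), t=0.3)
print(constant_out, varying_out)
```

For this particular kernel the constant input gives an output that vanishes to rounding error, whereas the cosine input gives a clearly nonzero output. This is the behavior that later removes the χ-independent terms from the signal.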

From Eq. (6.21a) in Chapter 6, we know that, for a relatively small time value T,

\[
h(t) \cong 0 \quad \text{for all } |t| > T . \tag{7.10h}
\]

Equations (7.9b) and (7.10a) can now be combined to get the time-based output signal of the
detector circuit (and anti-aliasing filter), which we decide to call $\tilde s_{CN}^{(tot)}(t)$,

\[
\tilde s_{CN}^{(tot)}(t) = h(t) * [\,z_B^{(tot)}(ut) + a\,\tilde n^{(\theta 2)}(ut)\,W(ut)\,] . \tag{7.11a}
\]


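Substitution from (7.8g) shows that $z_B^{(tot)}(ut)$ has many constant (that is, time-independent)
terms. Gathering together all the constant terms inside a pair of braces { }, we use the linearity of
the convolution [see Eq. (2.38d) in Chapter 2] to write

\[
\begin{aligned}
\tilde s_{CN}^{(tot)}(t) ={}& h(t) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma ut}\,d\sigma\right] \\
&+ h(t) * \bigg\{ \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} R(\sigma)\eta(\sigma)\tau_a(\sigma)[\tau_f(\sigma)L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&\qquad + \left(\frac{A\,\Delta\Omega}{2}\right)\int_{0}^{\infty} [2r^{2}(\sigma) - \eta(\sigma)]R(\sigma)\tau_a(\sigma)L^{(back)}(\sigma)\,d\sigma \\
&\qquad + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)L^{(dir)}(\sigma)\,d\sigma \bigg\} \\
&+ h(t) * [\,a\,\tilde n^{(\theta 2)}(ut)\,W(ut)\,] .
\end{aligned}
\]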

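According to Eq. (7.10g), the convolution with the constant terms is zero, leaving us with

\[
\begin{aligned}
\tilde s_{CN}^{(tot)}(t) ={}& h(t) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma ut}\,d\sigma\right] \\
&+ h(t) * [\,a\,\tilde n^{(\theta 2)}(ut)\,W(ut)\,] .
\end{aligned}
\tag{7.11b}
\]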

For any time-based convolution such as the one in (7.10a) where

so (t ) = h(t ) ∗ si (t ) ,

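Eq. (7.9a) and the formula for the convolution in (7.10b) can be used to convert back to functions
of χ,

\[
s_o\!\left(\frac{\chi}{u}\right) = \big[\,h(t) * s_i(t)\,\big]_{t = \chi/u}
\]

or, using that t′ = χ′/u,

\[
s_o\!\left(\frac{\chi}{u}\right) = \int_{-\infty}^{\infty} h(t')\,s_i\!\left(\frac{\chi}{u} - t'\right)dt' = \frac{1}{u}\int_{-\infty}^{\infty} h\!\left(\frac{\chi'}{u}\right) s_i\!\left(\frac{\chi}{u} - \frac{\chi'}{u}\right)d\chi' = \frac{1}{u}\,h\!\left(\frac{\chi}{u}\right) * s_i\!\left(\frac{\chi}{u}\right) . \tag{7.11c}
\]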
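
The change of variables in (7.11c) can be checked with a short numerical sketch. The OPD velocity, impulse response, and input signal below are hypothetical choices made only so that the integrals converge quickly; the two integrals are approximated by Riemann sums on different grids so that the comparison is not trivially exact.

```python
import math

U = 2.0  # hypothetical OPD velocity u

def h(t):
    """Hypothetical narrow impulse response (derivative of a Gaussian)."""
    return -t * math.exp(-t * t / 2.0)

def s_i(t):
    """Hypothetical input signal."""
    return math.cos(3.0 * t) + 0.5 * t

def time_convolution(t, dt=0.005, half_width=10.0):
    """Integral of h(t') s_i(t - t') dt', the time-based form in (7.11c)."""
    n = int(round(half_width / dt))
    return sum(h(k * dt) * s_i(t - k * dt) for k in range(-n, n + 1)) * dt

def opd_convolution(chi, dchi=0.004, half_width=20.0):
    """(1/u) times the integral of h(chi'/u) s_i((chi - chi')/u) d chi'."""
    n = int(round(half_width / dchi))
    total = sum(h(k * dchi / U) * s_i((chi - k * dchi) / U) for k in range(-n, n + 1))
    return total * dchi / U

chi = 1.4
lhs = time_convolution(chi / U)
rhs = opd_convolution(chi)
print(lhs, rhs)
```

The two values agree to within the quadrature error, which reflects the factor 1/u that appears when the integration variable is changed from t′ to χ′.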

Applying this rule to Eq. (7.11b) gives

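\[
\begin{aligned}
\tilde s_{CN}^{(tot)}\!\left(\frac{\chi}{u}\right) ={}& \frac{1}{u}\,h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma\right] \\
&+ \frac{a}{u}\,h\!\left(\frac{\chi}{u}\right) * \big[\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\,\big]
\end{aligned}
\]

so that, deciding to call the χ-based signal $\tilde z_{CN}^{(tot)}(\chi)$ instead of $\tilde s_{CN}^{(tot)}(\chi/u)$, we can write

\[
\begin{aligned}
\tilde z_{CN}^{(tot)}(\chi) ={}& u^{-1}\,h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma\right] \\
&+ u^{-1} a\,h\!\left(\frac{\chi}{u}\right) * \big[\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\,\big] .
\end{aligned}
\tag{7.11d}
\]

In this chapter, random function $\tilde z_{CN}^{(tot)}$ represents the total signal contaminated by mirror-
misalignment noise at point C in Fig. 6.2 of Chapter 6.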

7.5 Misalignment Noise in Uncalibrated Spectra of Double-Sided Signals
To construct the double-sided signal, we repeat the definition of function Π(χ, D) given in Eq.
(4C.1a) in Appendix 4C of Chapter 4 to get

\[
\Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} . \tag{7.12a}
\]

Any χ-based signal multiplied by Π(χ, D) is left unchanged for OPD values between +D and −D
and set to zero for OPD values greater than D or less than −D. We now multiply both sides of Eq.
(7.11d) by Π(χ, D) to get the double-sided signal at point C in Fig. 6.2 of Chapter 6,

\[
\begin{aligned}
\Pi(\chi, D)\,\tilde z_{CN}^{(tot)}(\chi) ={}& u^{-1}\,\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma\right]\right\} \\
&+ u^{-1} a\,\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\,\big]\right\} .
\end{aligned}
\tag{7.12b}
\]

The approximation specified in Eq. (7.10h) can be used to simplify the second term on the
right-hand side of (7.12b). Because h is a narrow function, the definition of a convolution can be
approximated as [see Eqs. (2.38a) and (2.38b) in Chapter 2]

\[
\begin{aligned}
\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big]\right\}
&= \Pi(\chi, D)\left\{ \big[\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big] * h\!\left(\frac{\chi}{u}\right)\right\} \\
&= \Pi(\chi, D)\int_{-\infty}^{\infty} \tilde n^{(\theta 2)}(\chi')\,W(\chi')\,h\!\left(\frac{\chi - \chi'}{u}\right)d\chi' \\
&\cong \Pi(\chi, D)\int_{\chi - uT}^{\chi + uT} \tilde n^{(\theta 2)}(\chi')\,W(\chi')\,h\!\left(\frac{\chi - \chi'}{u}\right)d\chi' .
\end{aligned}
\tag{7.13a}
\]

Using the same reasoning as in the discussion following Eq. (6.26a) in Chapter 6, we note that
this equation reduces to 0 ≅ 0 when χ does not lie between D and −D. Consequently, the limits
on the integral over dχ′ can be replaced by −(D + uT) and (D + uT). When the integral's
limits are extended like this, the extra range of integration going from $-\bar D$ to (χ − uT) and from
(χ + uT) to $\bar D$ makes only a negligible contribution to the integral due to the smallness of h at
these OPD values. Hence we can write

\[
\begin{aligned}
\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big]\right\}
&\cong \Pi(\chi, D)\int_{-(D + uT)}^{D + uT} \tilde n^{(\theta 2)}(\chi')\,W(\chi')\,h\!\left(\frac{\chi - \chi'}{u}\right)d\chi' \\
&= \Pi(\chi, D)\int_{-\infty}^{\infty} \Pi(\chi', \bar D)\,\tilde n^{(\theta 2)}(\chi')\,W(\chi')\,h\!\left(\frac{\chi - \chi'}{u}\right)d\chi' ,
\end{aligned}
\tag{7.13b}
\]

where

\[
\bar D = D + uT . \tag{7.13c}
\]

Referring back to the formula for the convolution of two functions in Eq. (2.38a) of Chapter 2,
we see that (7.13b) can be written as, using (2.38b) to reverse the order of the convolution,

\[
\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big]\right\}
\cong \Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big]\right\} . \tag{7.13d}
\]


To make our Fourier notation more concise, we start using F , the Fourier-transform operator
defined by Eqs. (2.29a) and (2.29c) in Chapter 2. When using this notation


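\[
F^{(-i\sigma\chi)}\big(u(\chi)\big) = \int_{-\infty}^{\infty} u(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \tag{7.14a}
\]

is the forward Fourier transform of any transformable function u and

\[
F^{(i\sigma\chi)}\big(v(\sigma)\big) = \int_{-\infty}^{\infty} v(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \tag{7.14b}
\]

is the reverse Fourier transform of any transformable function v.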


To get the uncalibrated spectrum of a double-sided signal contaminated by mirror-
misalignment noise, we take the forward Fourier transform of both sides of Eq. (7.12b) to get,
using the linearity of the Fourier transform described in Sec. 2.6 of Chapter 2,

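\[
\begin{aligned}
F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde z_{CN}^{(tot)}(\chi)\big)
={}& u^{-1} F^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\,e^{2\pi i\sigma'\chi}\,d\sigma'\right]\right\}\right) \\
&+ u^{-1} a\,F^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big]\right\}\right) .
\end{aligned}
\]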

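The Fourier transform of $[\Pi(\chi, D)\,\tilde z_{CN}^{(tot)}(\chi)]$ is the uncalibrated signal spectrum contaminated by
mirror-misalignment noise, and we recognize this by writing that

\[
\tilde Z_{eff,totN}(\sigma) = F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde z_{CN}^{(tot)}(\chi)\big) , \tag{7.14c}
\]

so that the previous formula becomes

\[
\begin{aligned}
\tilde Z_{eff,totN}(\sigma)
={}& u^{-1} F^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\,e^{2\pi i\sigma'\chi}\,d\sigma'\right]\right\}\right) \\
&+ u^{-1} a\,F^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big]\right\}\right) .
\end{aligned}
\]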


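We apply the Fourier convolution theorem shown in Eq. (2.39j) in Chapter 2 to the first term on
the right-hand side, and to the second term we apply the approximation shown in Eq. (7.13d).
This gives

\[
\begin{aligned}
\tilde Z_{eff,totN}(\sigma)
\cong{}& u^{-1} F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * F^{(-i\sigma\chi')}\!\left( h\!\left(\frac{\chi'}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\,e^{2\pi i\sigma'\chi'}\,d\sigma'\right]\right) \\
&+ u^{-1} a\,F^{(-i\sigma\chi)}\!\left(\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big]\right\}\right) .
\end{aligned}
\]

We again apply the Fourier convolution theorem, this time using the forms shown in Eqs. (2.39j)
and (2.39a), to write

\[
\begin{aligned}
\tilde Z_{eff,totN}(\sigma)
\cong{}& u^{-1} F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \left\{ F^{(-i\sigma\chi')}\!\left( h\!\left(\frac{\chi'}{u}\right)\right) \cdot F^{(-i\sigma\chi'')}\!\left(\int_{-\infty}^{\infty} M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\,e^{2\pi i\sigma'\chi''}\,d\sigma'\right)\right\} \\
&+ u^{-1} a\,F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \left\{ F^{(-i\sigma\chi')}\!\left( h\!\left(\frac{\chi'}{u}\right) * \big[\Pi(\chi', \bar D)\,\tilde n^{(\theta 2)}(\chi')\,W(\chi')\big]\right)\right\}
\end{aligned}
\]

or, again applying (2.39a),

\[
\begin{aligned}
\tilde Z_{eff,totN}(\sigma)
\cong{}& u^{-1} F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \left\{ F^{(-i\sigma\chi')}\!\left( h\!\left(\frac{\chi'}{u}\right)\right) \cdot F^{(-i\sigma\chi'')}\!\left(\int_{-\infty}^{\infty} M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\,e^{2\pi i\sigma'\chi''}\,d\sigma'\right)\right\} \\
&+ u^{-1} a\,F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) * \left\{ F^{(-i\sigma\chi')}\!\left( h\!\left(\frac{\chi'}{u}\right)\right) \cdot F^{(-i\sigma\chi'')}\big(\Pi(\chi'', \bar D)\,\tilde n^{(\theta 2)}(\chi'')\,W(\chi'')\big)\right\} .
\end{aligned}
\tag{7.15a}
\]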
¯ © © u ¹¹ ¿

The Fourier transform of Π(χ, D) is, according to Eq. (2.108b) in Chapter 2,

\[
F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\big) = 2D\,\mathrm{sinc}(2\pi\sigma D) , \tag{7.15b}
\]

where the sinc function is, following the definition in Eq. (2.106d),

\[
\mathrm{sinc}(x) = \frac{\sin(x)}{x} . \tag{7.15c}
\]

Glancing back at Eqs. (7.14a) and (7.10c), we note that (when t = χ/u),

\[
F^{(-i\sigma\chi)}\!\left( h\!\left(\frac{\chi}{u}\right)\right) = \int_{-\infty}^{\infty} h\!\left(\frac{\chi}{u}\right) e^{-2\pi i\sigma\chi}\,d\chi = u\int_{-\infty}^{\infty} h(t)\,e^{-2\pi i\sigma u t}\,dt = u\,H(u\sigma) . \tag{7.15d}
\]
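
Both (7.15b) and (7.15d) can be verified by direct numerical integration. The sketch below is illustrative only: `forward_ft` is a hypothetical midpoint-rule helper that follows the sign convention of (7.14a), and the Gaussian used for h is chosen purely because its transform is known in closed form (it does not satisfy H(0) = 0, so it is not a realistic detector-circuit response). The values of D, u, and σ are arbitrary.

```python
import cmath
import math

def forward_ft(func, sigma, lo, hi, n=20000):
    """Midpoint-rule estimate of the integral of func(chi) exp(-2 pi i sigma chi) over [lo, hi]."""
    step = (hi - lo) / n
    total = 0j
    for k in range(n):
        chi = lo + (k + 0.5) * step
        total += func(chi) * cmath.exp(-2j * math.pi * sigma * chi)
    return total * step

def sinc(x):
    """sinc(x) = sin(x)/x, as in (7.15c), with the limiting value at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def h(t):
    """Hypothetical Gaussian test function whose transform is H(f) = exp(-pi f^2)."""
    return math.exp(-math.pi * t * t)

D, U, sigma = 1.5, 2.0, 0.37   # arbitrary illustrative values

# (7.15b): the transform of the rectangle Pi(chi, D) is 2 D sinc(2 pi sigma D).
rect_ft = forward_ft(lambda chi: 1.0, sigma, -D, D)
rect_expected = 2.0 * D * sinc(2.0 * math.pi * sigma * D)

# (7.15d): the transform of h(chi/u) is u H(u sigma).
scaled_ft = forward_ft(lambda chi: h(chi / U), sigma, -20.0, 20.0)
scaled_expected = U * math.exp(-math.pi * (U * sigma) ** 2)

print(rect_ft, rect_expected)
print(scaled_ft, scaled_expected)
```

In both cases the numerically integrated transform is essentially real and matches the closed-form expression to within the quadrature error.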

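We can now substitute Eqs. (7.15b) and (7.15d) into (7.15a) to get

\[
\begin{aligned}
\tilde Z_{eff,totN}(\sigma)
\cong{}& [2D\,\mathrm{sinc}(2\pi\sigma D)] * \left\{ H(u\sigma)\,F^{(-i\sigma\chi)}\!\left(\int_{-\infty}^{\infty} M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\,e^{2\pi i\sigma'\chi}\,d\sigma'\right)\right\} \\
&+ a\,[2D\,\mathrm{sinc}(2\pi\sigma D)] * \left\{ H(u\sigma)\,F^{(-i\sigma\chi)}\big(\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big)\right\}
\end{aligned}
\]

or

\[
\begin{aligned}
\tilde Z_{eff,totN}(\sigma)
\cong{}& [2D\,\mathrm{sinc}(2\pi\sigma D)] * \big\{ H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\big\} \\
&+ a\,[2D\,\mathrm{sinc}(2\pi\sigma D)] * \left\{ H(u\sigma)\,F^{(-i\sigma\chi)}\big(\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big)\right\} ,
\end{aligned}
\tag{7.15e}
\]

where in the last step the forward Fourier transform of the reverse Fourier transform returns the
original function [see Eqs. (2.28A) and (2.29a,b) in Chapter 2]:

\[
F^{(-i\sigma\chi)}\!\left(\int_{-\infty}^{\infty} M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\,e^{2\pi i\sigma'\chi}\,d\sigma'\right)
= F^{(-i\sigma\chi)}\Big( F^{(i\sigma'\chi)}\big( M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\big)\Big) = M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma) .
\]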

Working with the first term on the right-hand side of (7.15e), we note that [see Eq. (7.7b)
above]

\[
\begin{aligned}
& H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma) \\
&\quad = \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)] .
\end{aligned}
\tag{7.16a}
\]

In a well-designed interferometer all the functions on the right-hand side of (7.16a), except for
the radiances $L_{FOV}$, $L_{FOV}^{(fore)}$, and $L_{FOV}^{(back)}$, must vary slowly with σ compared to sinc(2πσD).
Furthermore, this sinc function is very narrow, dropping rapidly to zero compared to all the
nonradiance functions in (7.16a). Consequently, we can, according to Eq. (5C.1) in Appendix 5C
of Chapter 5, treat the nonradiance functions as quasi-constants in the convolution

\[
[2D\,\mathrm{sinc}(2\pi\sigma D)] * \big\{ H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\big\} .
\]

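This lets us write, after using Eq. (5C.1) and Eq. (2.38d) in Chapter 2,

\[
\begin{aligned}
& [2D\,\mathrm{sinc}(2\pi\sigma D)] * \big\{ H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\big\} \\
&\quad = \frac{WA\,\Delta\Omega}{4}\,[2D\,\mathrm{sinc}(2\pi\sigma D)] * \big\{ H(u\sigma)\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|) \\
&\qquad\qquad \cdot\,[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\big\} \\
&\quad \cong \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)] ,
\end{aligned}
\tag{7.16b}
\]

where, following the notation of Eqs. (6.62c), (6.63c), and (6.63d) in Chapter 6, we say that

\[
L_{mnf}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}(|\sigma|) , \tag{7.16c}
\]

\[
L_{mnf}^{(fore)}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(fore)}(|\sigma|) , \tag{7.16d}
\]

and

\[
L_{mnf}^{(back)}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(back)}(|\sigma|) . \tag{7.16e}
\]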
Functions $L_{mnf}$, $L_{mnf}^{(fore)}$, $L_{mnf}^{(back)}$ have the same units as $L_{FOV}$, $L_{FOV}^{(fore)}$, $L_{FOV}^{(back)}$ and represent spectral
radiances distorted both by the effect of the interferometer's finite field of view and by its finite
interferogram length. Defining function $Z_{mnf}$ to be

\[
Z_{mnf}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)] \tag{7.16f}
\]

lets us write (7.16b) as

\[
[2D\,\mathrm{sinc}(2\pi\sigma D)] * \big\{ H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\big\} \cong H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) . \tag{7.16g}
\]

Remembering that all nonradiance functions, including H(uσ) and M(Rσθrms), can be treated as
quasi-constants in a convolution with sinc(2πσD), we again apply Eq. (5C.1) from Appendix 5C
in Chapter 5 to get

\[
H(u\sigma)\,M(R\sigma\theta_{rms})\,\big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * Z_{FOV}(\sigma)\big\} \cong H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) ,
\]

which reduces to

\[
[2D\,\mathrm{sinc}(2\pi\sigma D)] * Z_{FOV}(\sigma) \cong Z_{mnf}(\sigma) . \tag{7.16h}
\]

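The second term on the right-hand side of (7.15e) can also be simplified. The nonradiance
H(uσ) transfer function can be treated like a quasi-constant in the convolution over σ to get

\[
\begin{aligned}
& [2D\,\mathrm{sinc}(2\pi\sigma D)] * \Big\{ H(u\sigma)\,F^{(-i\sigma\chi)}\big(\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big)\Big\} \\
&\quad \cong H(u\sigma)\cdot\Big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * F^{(-i\sigma\chi)}\big(\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big)\Big\} .
\end{aligned}
\]

Equation (7.15b) and Eq. (2.39j) in Chapter 2 can be used to turn the sinc function into another
factor inside the Fourier transform,

\[
\begin{aligned}
& [2D\,\mathrm{sinc}(2\pi\sigma D)] * \Big\{ H(u\sigma)\,F^{(-i\sigma\chi)}\big(\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big)\Big\} \\
&\quad \cong H(u\sigma)\cdot F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big) .
\end{aligned}
\tag{7.17a}
\]

Equation (7.13c) shows that $\bar D \ge D$, which means that [see the specification of Π in Eq. (7.12a)]

\[
\Pi(\chi, D)\,\Pi(\chi, \bar D) = \Pi(\chi, D) .
\]

Hence, Eq. (7.17a) reduces to

\[
\begin{aligned}
& [2D\,\mathrm{sinc}(2\pi\sigma D)] * \Big\{ H(u\sigma)\,F^{(-i\sigma\chi)}\big(\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big)\Big\} \\
&\quad \cong H(u\sigma)\cdot F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big) \\
&\quad = H(u\sigma)\cdot\Big[ F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde n^{(\theta 2)}(\chi)\big) * F^{(-i\sigma\chi')}\big(W(\chi')\big)\Big] ,
\end{aligned}
\tag{7.17b}
\]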
where again the Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] is applied in the last
step. We define the D-limited Fourier transform of the noise $\tilde n^{(\theta 2)}$ to be

\[
\tilde n_D^{(\theta 2)}(\sigma) = F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde n^{(\theta 2)}(\chi)\big) = \int_{-\infty}^{\infty} \Pi(\chi, D)\,\tilde n^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \int_{-D}^{D} \tilde n^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \tag{7.17c}
\]

so that

\[
[2D\,\mathrm{sinc}(2\pi\sigma D)] * \Big\{ H(u\sigma)\,F^{(-i\sigma\chi)}\big(\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big)\Big\}
\cong H(u\sigma)\cdot\Big[ \tilde n_D^{(\theta 2)}(\sigma) * F^{(-i\sigma\chi)}\big(W(\chi)\big)\Big] . \tag{7.17d}
\]

Reversing the Fourier transform in Eq. (7.8a) gives

\[
\sigma^2 Z_{FOV}(\sigma) = \int_{-\infty}^{\infty} W(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = F^{(-i\sigma\chi)}\big(W(\chi)\big) , \tag{7.17e}
\]

which can now be substituted into (7.17d) to get

\[
[2D\,\mathrm{sinc}(2\pi\sigma D)] * \Big\{ H(u\sigma)\,F^{(-i\sigma\chi)}\big(\Pi(\chi, \bar D)\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big)\Big\}
\cong H(u\sigma)\cdot\Big\{ \tilde n_D^{(\theta 2)}(\sigma) * \big[\sigma^2 Z_{FOV}(\sigma)\big]\Big\} . \tag{7.17f}
\]
Having found approximations for the first and second terms on the right-hand side of the
formula in (7.15e), we can write down a simplified expression for the uncalibrated signal
spectrum of the double-sided signal contaminated by mirror-misalignment noise. Substituting
(7.16g) and (7.17f) into (7.15e) gives

\[
\tilde Z_{eff,totN}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) + a\,H(u\sigma)\Big\{ \tilde n_D^{(\theta 2)}(\sigma) * \big[\sigma^2 Z_{FOV}(\sigma)\big]\Big\} . \tag{7.18a}
\]
For future use, we note that the expectation value of the noise term in (7.18a) is, using the
definition of convolution in Eq. (2.38a) in Chapter 2 and the linearity of the expectation operator
E explained in Sec. 3.10 of Chapter 3,

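\[
\begin{aligned}
E\Big( a\,H(u\sigma)\big\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\big\}\Big)
&= a\,H(u\sigma)\,E\!\left(\int_{-\infty}^{\infty} \tilde n_D^{(\theta 2)}(\sigma')\,\big[(\sigma - \sigma')^2 Z_{FOV}(\sigma - \sigma')\big]\,d\sigma'\right) \\
&= a\,H(u\sigma)\int_{-\infty}^{\infty} E\big(\tilde n_D^{(\theta 2)}(\sigma')\big)\,\big[(\sigma - \sigma')^2 Z_{FOV}(\sigma - \sigma')\big]\,d\sigma' \\
&= a\,H(u\sigma)\Big\{ E\big(\tilde n_D^{(\theta 2)}(\sigma)\big) * \big[\sigma^2 Z_{FOV}(\sigma)\big]\Big\} .
\end{aligned}
\tag{7.18b}
\]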
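Glancing back at the definition of $\tilde n_D^{(\theta 2)}$ in Eq. (7.17c), we note that

\[
E\big(\tilde n_D^{(\theta 2)}(\sigma)\big) = \int_{-D}^{D} E\big(\tilde n^{(\theta 2)}(\chi)\big)\,e^{-2\pi i\sigma\chi}\,d\chi = 0 \tag{7.18c}
\]

because, according to Eq. (7.8d),

\[
E\big(\tilde n^{(\theta 2)}(\chi)\big) = 0 .
\]

Substituting (7.18c) into (7.18b), we see that

\[
E\Big( a\,H(u\sigma)\big\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\big\}\Big) = 0 . \tag{7.18d}
\]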

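Applying the expectation operator to both sides of (7.18a) now gives, using Eqs. (3.9f) and
(3.16a) in Chapter 3,

\[
\begin{aligned}
E\big(\tilde Z_{eff,totN}(\sigma)\big)
&= H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) + E\Big( a\,H(u\sigma)\big\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\big\}\Big) \\
&= H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) .
\end{aligned}
\tag{7.18e}
\]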

This shows that, in principle, we can always reduce the mirror-misalignment noise to negligible
levels in the uncalibrated spectrum of the double-sided signal by averaging together many
independent measurements of the same spectral radiance.


7.6 Calibrated Spectra Contaminated by Misalignment Noise


The easiest way to find the noise-contaminated spectral radiance is to apply the spectral
calibration algorithm, discussed in Sec. 5.19 of Chapter 5, to the uncalibrated spectral signal in
Eq. (7.18a). We choose L(1) (σ ) and L(2) (σ ) to be the known spectral radiances used to calibrate
the instrument, with both L(1) and L(2) being slowly varying functions of wavenumber so that the
distorting effects of the interferometer’s finite field of view and finite interferogram length can be
neglected. Applying Eqs. (6A.3) and (6A.6) in Appendix 6A of Chapter 6 to L(1) and L(2), we
write that
\[
L^{(1)}(|\sigma|) \cong L_{FOV}^{(1)}(|\sigma|) \cong L_{mnf}^{(1)}(|\sigma|) \tag{7.19a}
\]

and

\[
L^{(2)}(|\sigma|) \cong L_{FOV}^{(2)}(|\sigma|) \cong L_{mnf}^{(2)}(|\sigma|) , \tag{7.19b}
\]

with absolute value signs used to make L(1) and L(2) even functions of wavenumber. We say that
$\tilde Z_{eff,totN}^{(1)}(\sigma)$ is the uncalibrated, noise-contaminated signal spectrum at point C in Fig. 6.2 of
Chapter 6 when the interferometer is observing the L(1) spectral radiance. To get the formula for
$\tilde Z_{eff,totN}^{(1)}(\sigma)$, we need to replace radiance L by radiance L(1) in formula (7.18a), which we do by
writing

\[
\tilde Z_{eff,totN}^{(1)}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}^{(1)}(\sigma) + a\,H(u\sigma)\Big\{ \tilde n_D^{(\theta 2)}(\sigma) * \big[\sigma^2 Z_{FOV}^{(1)}(\sigma)\big]\Big\} , \tag{7.20a}
\]

where, following the pattern of Eqs. (7.7b) and (7.16f), we define

\[
Z_{FOV}^{(1)}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L^{(1)}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)] \tag{7.20b}
\]

and

\[
Z_{mnf}^{(1)}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L^{(1)}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)] . \tag{7.20c}
\]

The approximation shown in (7.19a) is our justification for dropping the FOV and mnf subscripts
from L(1) in Eqs. (7.20a)–(7.20c). Similarly, we define $\tilde Z_{eff,totN}^{(2)}(\sigma)$ to be the uncalibrated, noise-
contaminated spectrum at point C when the interferometer is observing the L(2) spectral radiance.
This gives, using (7.19b) to drop the FOV and mnf subscripts from L(2),

\[
\tilde Z_{eff,totN}^{(2)}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}^{(2)}(\sigma) + a\,H(u\sigma)\Big\{ \tilde n_D^{(\theta 2)}(\sigma) * \big[\sigma^2 Z_{FOV}^{(2)}(\sigma)\big]\Big\} , \tag{7.20d}
\]

where

\[
Z_{FOV}^{(2)}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L^{(2)}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)] \tag{7.20e}
\]

and

\[
Z_{mnf}^{(2)}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)[\tau_f(|\sigma|)L^{(2)}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)] . \tag{7.20f}
\]

Because the uncalibrated signal spectra from L(1) and L(2) used in our calibration algorithm
should be noise-free, we average together a large number of measurements to get, following the
pattern of Eq. (7.18e) and the statement after it,

\[
E\big(\tilde Z_{eff,totN}^{(1)}(\sigma)\big) = H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}^{(1)}(\sigma) \tag{7.20g}
\]

and

\[
E\big(\tilde Z_{eff,totN}^{(2)}(\sigma)\big) = H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}^{(2)}(\sigma) . \tag{7.20h}
\]

Since $E\big(\tilde Z_{eff,totN}^{(1,2)}(\sigma)\big)$ are the noise-free spectral signals corresponding to L(1,2), we can write

\[
Z_{eff,tot}^{(1)}(\sigma) = H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}^{(1)}(\sigma) \tag{7.20i}
\]

and

\[
Z_{eff,tot}^{(2)}(\sigma) = H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}^{(2)}(\sigma) , \tag{7.20j}
\]

where, to show that these are no longer random functions of σ, the tilde has been removed and
subscript totN has been changed to tot.
Now we can apply the calibration algorithm in Sec. 5.19 of Chapter 5 to get [see Eq. (5.95a)]

\[
\text{Measured Radiance} = \big[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\big]\,\frac{\tilde Z_{eff,totN}^{(meas)}(\sigma) - Z_{eff,tot}^{(1)}(\sigma)}{Z_{eff,tot}^{(2)}(\sigma) - Z_{eff,tot}^{(1)}(\sigma)} + L^{(1)}(|\sigma|) , \tag{7.21a}
\]

where $\tilde Z_{eff,totN}^{(meas)}(\sigma)$ is the uncalibrated, noise-contaminated spectrum of the signal at point C in
Fig. 6.2 associated with the unknown optical radiance L that we want to measure. Note that,
although the expectation operator E is used to remove the noise from the L(1,2) signals, the noise
is left in the uncalibrated $\tilde Z_{eff,totN}^{(meas)}(\sigma)$ signal. This is our way of showing that, while a great deal of
effort can be invested in obtaining noise-free calibration data, the unknown spectrum L may be
changing slowly with time (and is often only one of a number of measurements to be performed
in a limited amount of time), which prevents us from averaging away its noise.105 The
uncalibrated (meas) signal spectrum, contaminated by mirror misalignment noise, is called
$\tilde Z_{eff,totN}(\sigma)$ in Eq. (7.18a), so we can now write that

\[
\tilde Z_{eff,totN}^{(meas)}(\sigma) = \tilde Z_{eff,totN}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) + a\,H(u\sigma)\Big\{ \tilde n_D^{(\theta 2)}(\sigma) * \big[\sigma^2 Z_{FOV}(\sigma)\big]\Big\} \tag{7.21b}
\]

with $Z_{mnf}(\sigma)$ given by Eq. (7.16f) and $Z_{FOV}(\sigma)$ given by Eq. (7.7b). Working with the first
term on the right-hand side of (7.21a), we note that, substituting from Eqs. (7.20c), (7.20f),
(7.20i), and (7.20j),

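\[
\begin{aligned}
\frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{Z_{eff,tot}^{(2)}(\sigma) - Z_{eff,tot}^{(1)}(\sigma)}
&= \frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{\dfrac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)\tau_f(|\sigma|)\,[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)]} \\
&= \left[\frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)\tau_f(|\sigma|)\right]^{-1} .
\end{aligned}
\tag{7.21c}
\]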

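Consulting Eqs. (7.21b) and (7.16f), as well as (7.20c) and (7.20i), we get

\[
\begin{aligned}
\tilde Z_{eff,totN}^{(meas)}(\sigma) - Z_{eff,tot}^{(1)}(\sigma)
={}& \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)\tau_f(|\sigma|)\,[L_{mnf}(|\sigma|) - L^{(1)}(|\sigma|)] \\
&+ a\,H(u\sigma)\Big\{ \tilde n_D^{(\theta 2)}(\sigma) * \big[\sigma^2 Z_{FOV}(\sigma)\big]\Big\} .
\end{aligned}
\tag{7.21d}
\]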

Substituting (7.21c) and (7.21d) into (7.21a) gives

105
In Chapter 6, see the discussion at the end of Sec. 6.5 as well as the discussion following Eq. (6.33b).

\[
\text{Measured Radiance} = L_{mnf}(|\sigma|) + \frac{4a\,\big\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\big\}}{WA\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)\tau_f(|\sigma|)} . \tag{7.21e}
\]

The right-hand side of (7.21e) is the sum of Lmnf, which is the spectral radiance distorted by
the effect of the interferometer's finite field of view and finite interferogram length, and a random
noise term

\[
\frac{4a\,\big\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\big\}}{WA\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)\tau_f(|\sigma|)} .
\]
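
The structure of the calibration in (7.21a) through (7.21e) can be illustrated at a single wavenumber with a toy model. The sketch below is not the instrument model of this chapter: `gain` is a hypothetical complex number standing in for the lumped factor multiplying the radiance in the uncalibrated spectrum, `offset` stands in for the instrument self-emission contribution, and the additive `noise` stands in for the misalignment-noise term. All numerical values are made up.

```python
def calibrate(z_meas, z_cal1, z_cal2, l_cal1, l_cal2):
    """Two-point calibration with the same algebraic form as (7.21a), at one wavenumber."""
    return (l_cal2 - l_cal1) * (z_meas - z_cal1) / (z_cal2 - z_cal1) + l_cal1

def uncalibrated_signal(radiance, gain, offset, noise=0j):
    """Toy model: complex gain times (radiance + offset), plus additive noise."""
    return gain * (radiance + offset) + noise

gain = 0.8 - 0.3j      # hypothetical lumped instrument response
offset = 2.5           # hypothetical self-emission term, in radiance units
l1, l2, l_true = 10.0, 40.0, 23.0
z1 = uncalibrated_signal(l1, gain, offset)   # noise-free calibration spectra
z2 = uncalibrated_signal(l2, gain, offset)

clean = calibrate(uncalibrated_signal(l_true, gain, offset), z1, z2, l1, l2)
noise = 0.05 + 0.02j
noisy = calibrate(uncalibrated_signal(l_true, gain, offset, noise), z1, z2, l1, l2)
print(clean, noisy - clean, noise / gain)
```

In this toy model the gain and offset cancel, so the noise-free measurement returns the true radiance, and the noise appears in the calibrated result divided by the gain. This mirrors the way the noise term in (7.21e) is divided by the instrument factors, and it also shows that the calibrated noise is generally complex.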

Function Lmnf is strictly real, but there is no reason to expect this noise term to be strictly real. In
fact only the real component of the noise term unavoidably contaminates the Lmnf data. We
conclude, then, that the δL measurement noise in the radiance spectrum is

\[
\begin{aligned}
\delta L &= \mathrm{Re}\!\left(\frac{4a\,\big\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\big\}}{WA\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)\tau_f(|\sigma|)}\right) \\
&= \frac{4a\,\big\{ \big[\mathrm{Re}\,\tilde n_D^{(\theta 2)}(\sigma)\big] * [\sigma^2 Z_{FOV}(\sigma)]\big\}}{WA\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)\tau_f(|\sigma|)} .
\end{aligned}
\tag{7.22a}
\]

The second step in (7.22a) relies on $\tilde n_D^{(\theta 2)}$ being the only complex quantity in the expression for
the δL spectral noise. For future use, we note that the imaginary component of the noise term in
(7.21e) can be written as

\[
\begin{aligned}
& \mathrm{Im}\!\left(\frac{4a\,\big\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\big\}}{WA\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)\tau_f(|\sigma|)}\right) \\
&\quad = \frac{4a\,\big\{ \big[\mathrm{Im}\,\tilde n_D^{(\theta 2)}(\sigma)\big] * [\sigma^2 Z_{FOV}(\sigma)]\big\}}{WA\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\eta(\sigma)\tau_a(|\sigma|)\tau_f(|\sigma|)} .
\end{aligned}
\tag{7.22b}
\]

Taking the real part of the measured spectrum eliminates this noise component from the data, just
as it did in our analysis of the avoidable and unavoidable detector noise [see the discussion
following Eq. (6.35d) in Chapter 6].


7.7 Avoidable and Unavoidable Misalignment Noise in χ-Based Signals


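Examining Eq. (7.7b) for $Z_{FOV}(\sigma)$, we note that since Eq. (4.139g) in Chapter 4 shows η(σ) to
be even, $Z_{FOV}(\sigma)$ must also be even [see Eq. (2.11a) of Chapter 2 for the definition of an even function]:

$$
\begin{aligned}
Z_{FOV}(-\sigma) &= \frac{W A\,\Delta\Omega}{4}\,R(|-\sigma|)\,\eta(-\sigma)\,\tau_a(|-\sigma|)\,[\tau_f(|-\sigma|)\,L_{FOV}(|-\sigma|) + L^{(fore)}_{FOV}(|-\sigma|) - L^{(back)}_{FOV}(|-\sigma|)] \\
&= \frac{W A\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L^{(fore)}_{FOV}(|\sigma|) - L^{(back)}_{FOV}(|\sigma|)] \\
&= Z_{FOV}(\sigma).
\end{aligned}
\tag{7.23a}
$$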

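Equation (5.10f) in Chapter 5 shows that $M(-R\sigma\theta_{ma}) = M(R\sigma\theta_{ma})$, which means that

$$
M(-R\sigma\theta_{rms})\,Z_{FOV}(-\sigma) = M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)
\tag{7.23b}
$$

is also even with respect to σ. Consequently the reverse Fourier transform

$$
F^{(i\sigma\chi)}\big(M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\big) = \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma
\tag{7.23c}
$$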

must be a real and even function of χ because it is the reverse Fourier transform of a real and
even function of σ (see entry 1 in Table 2.1 of Chapter 2). This forces $z_B^{(tot)}(\chi)$ in Eq. (7.8g) to be
a real and even function of χ. To show why this is so, we note that the formula for $z_B^{(tot)}(\chi)$ is the
sum of the reverse Fourier transform specified in (7.23c) and several χ-independent constant
terms. We have just seen that the Fourier transform is a real and even function of χ, and the real
constant terms cannot change with χ; hence, $z_B^{(tot)}(\chi)$ must be real and even:

$$
z_B^{(tot)}(-\chi) = z_B^{(tot)}(\chi)
\tag{7.23d}
$$

and

$$
\mathrm{Im}\big(z_B^{(tot)}(\chi)\big) = 0.
\tag{7.23e}
$$

Consulting the definition of $W(\chi)$ in Eq. (7.8a),



$$
W(\chi) = \int_{-\infty}^{\infty} \sigma^2 Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma = F^{(i\sigma\chi)}\big(\sigma^2 Z_{FOV}(\sigma)\big),
\tag{7.23f}
$$

we note that since [see Eq. (7.23a)]

$$
(-\sigma)^2 Z_{FOV}(-\sigma) = \sigma^2 Z_{FOV}(\sigma),
\tag{7.23g}
$$

function $W(\chi)$ is also the reverse Fourier transform of an even function of σ. All the factors in
the definition of $Z_{FOV}(\sigma)$ in Eq. (7.7b) are real, which means that the $[\sigma^2 Z_{FOV}(\sigma)]$ product in
(7.23g) is also real. Hence, $W(\chi)$ is the reverse Fourier transform of a real and even function,
making it also real and even:

$$
W(-\chi) = W(\chi)
\tag{7.23h}
$$

and

$$
\mathrm{Im}\big(W(\chi)\big) = 0.
\tag{7.23i}
$$

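Following the same pattern as in Eq. (7.23a), we see that $Z_{mnf}(\sigma)$ defined in Eq. (7.16f) is even
because

$$
\begin{aligned}
Z_{mnf}(-\sigma) &= \frac{W A\,\Delta\Omega}{4}\,R(|-\sigma|)\,\eta(-\sigma)\,\tau_a(|-\sigma|)\,[\tau_f(|-\sigma|)\,L_{mnf}(|-\sigma|) + L^{(fore)}_{mnf}(|-\sigma|) - L^{(back)}_{mnf}(|-\sigma|)] \\
&= \frac{W A\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L^{(fore)}_{mnf}(|\sigma|) - L^{(back)}_{mnf}(|\sigma|)] \\
&= Z_{mnf}(\sigma).
\end{aligned}
\tag{7.23j}
$$

Every factor in the definition of $Z_{mnf}$ is real, so

$$
\mathrm{Im}\big(Z_{mnf}(\sigma)\big) = 0.
\tag{7.23k}
$$

Clearly, $Z_{mnf}$ is another real and even function.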


Equation (7.8f) gives the signal contaminated by mirror-misalignment noise as it leaves the
detector:

$$
z_{BN}^{(tot)}(\chi) = z_B^{(tot)}(\chi) + a\,\tilde{n}^{(\theta 2)}(\chi)\,W(\chi).
\tag{7.24a}
$$


From Eq. (7.23d) we know that the noise-free signal $z_B^{(tot)}(\chi)$ is an even function of χ, so in
principle we could reduce the noise in (7.24a) by comparing the noise-contaminated signal at χ
and −χ. (In practice, of course, we would have to worry about distortions introduced by any
circuit used to measure the signal at χ and −χ. See the discussion of the distortions produced by
the detector circuit in Sec. 5.12 of Chapter 5.) To show how this works, we follow the pattern of
Eqs. (2.11d) and (2.11e) in Chapter 2 and divide the mirror-misalignment noise $\tilde{n}^{(\theta 2)}(\chi)$ into even
and odd components, which we call $\tilde{n}_e^{(\theta 2)}(\chi)$ and $\tilde{n}_o^{(\theta 2)}(\chi)$ respectively, by defining

$$
\tilde{n}_e^{(\theta 2)}(\chi) = \frac{1}{2}\left[\tilde{n}^{(\theta 2)}(\chi) + \tilde{n}^{(\theta 2)}(-\chi)\right]
\tag{7.24b}
$$

and

$$
\tilde{n}_o^{(\theta 2)}(\chi) = \frac{1}{2}\left[\tilde{n}^{(\theta 2)}(\chi) - \tilde{n}^{(\theta 2)}(-\chi)\right].
\tag{7.24c}
$$

According to these definitions

$$
\tilde{n}_e^{(\theta 2)}(-\chi) = \tilde{n}_e^{(\theta 2)}(\chi)
\tag{7.24d}
$$

and

$$
\tilde{n}_o^{(\theta 2)}(-\chi) = -\tilde{n}_o^{(\theta 2)}(\chi).
\tag{7.24e}
$$

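The sum of $\tilde{n}_e^{(\theta 2)}$ and $\tilde{n}_o^{(\theta 2)}$ returns the original noise term,

$$
\begin{aligned}
\tilde{n}_e^{(\theta 2)}(\chi) + \tilde{n}_o^{(\theta 2)}(\chi) &= \frac{1}{2}\left[\tilde{n}^{(\theta 2)}(\chi) + \tilde{n}^{(\theta 2)}(-\chi)\right] + \frac{1}{2}\left[\tilde{n}^{(\theta 2)}(\chi) - \tilde{n}^{(\theta 2)}(-\chi)\right] \\
&= \tilde{n}^{(\theta 2)}(\chi).
\end{aligned}
$$

Since

$$
\tilde{n}^{(\theta 2)}(\chi) = \tilde{n}_e^{(\theta 2)}(\chi) + \tilde{n}_o^{(\theta 2)}(\chi),
\tag{7.24f}
$$

we can replace $\tilde{n}^{(\theta 2)}$ in Eq. (7.24a) by the sum of $\tilde{n}_e^{(\theta 2)}$ and $\tilde{n}_o^{(\theta 2)}$ to get

$$
z_{BN}^{(tot)}(\chi) = \left[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\right] + a\,\tilde{n}_o^{(\theta 2)}(\chi)\,W(\chi).
\tag{7.24g}
$$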

The sum inside the square brackets [ ] is even with respect to χ because, according to Eqs.
(7.23d), (7.23h), and (7.24d),

$$
\left[z_B^{(tot)}(-\chi) + a\,\tilde{n}_e^{(\theta 2)}(-\chi)\,W(-\chi)\right] = \left[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\right].
\tag{7.24h}
$$


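This sum, just like the noise-free signal $z_B^{(tot)}$ in Eq. (7.23d), is even, which means that the
even noise component

$$
a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)
$$

cannot be distinguished from the noise-free $z_B^{(tot)}$ signal. The odd noise component,

$$
a\,\tilde{n}_o^{(\theta 2)}(\chi)\,W(\chi),
$$

on the other hand, can in principle be eliminated, for example by averaging together the
noise-contaminated signal at χ and −χ. To see how this works, we consult Eq. (7.24g) and write

$$
\begin{aligned}
\frac{1}{2} z_{BN}^{(tot)}(\chi) + \frac{1}{2} z_{BN}^{(tot)}(-\chi)
&= \frac{1}{2}\left[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\right] + \frac{1}{2}\,a\,\tilde{n}_o^{(\theta 2)}(\chi)\,W(\chi) \\
&\quad + \frac{1}{2}\left[z_B^{(tot)}(-\chi) + a\,\tilde{n}_e^{(\theta 2)}(-\chi)\,W(-\chi)\right] + \frac{1}{2}\,a\,\tilde{n}_o^{(\theta 2)}(-\chi)\,W(-\chi).
\end{aligned}
$$

This becomes, applying Eqs. (7.23h), (7.24e), and (7.24h),

$$
\begin{aligned}
\frac{1}{2} z_{BN}^{(tot)}(\chi) + \frac{1}{2} z_{BN}^{(tot)}(-\chi)
&= \frac{1}{2}\left[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\right] + \frac{1}{2}\left[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\right] \\
&= \left[z_B^{(tot)}(\chi) + a\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\right].
\end{aligned}
$$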

Averaging the noise-contaminated signal $z_{BN}^{(tot)}$ at χ and −χ eliminates the odd noise component,
reducing the amount of mirror-misalignment noise contaminating the signal. For this reason, it
makes sense to call $\tilde{n}_e^{(\theta 2)}$ the unavoidable mirror-tilt noise, because it is even and so cannot be
distinguished from the even, noise-free signal, and to call $\tilde{n}_o^{(\theta 2)}$ the avoidable mirror-tilt noise
because it can be removed by averaging the noise-contaminated signal at χ and −χ.
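
The averaging argument can be checked numerically. The sketch below is only an illustration under stated assumptions: the even noise-free signal, the even weighting function standing in for W(χ), the amplitude `a`, and the Gaussian noise samples are all hypothetical choices, not quantities taken from the text. It splits one noise realization into even and odd parts on a grid symmetric about χ = 0 and confirms that averaging the contaminated signal at χ and −χ leaves the noise-free signal plus the even noise component only.

```python
import math
import random


def split_even_odd(samples):
    """Return the even and odd parts of samples on a grid symmetric about chi = 0.

    samples[k] is the value at chi_k; the reversed list holds the values at -chi_k.
    """
    mirrored = samples[::-1]
    even = [0.5 * (s + m) for s, m in zip(samples, mirrored)]
    odd = [0.5 * (s - m) for s, m in zip(samples, mirrored)]
    return even, odd


random.seed(0)
N = 50
chi = [k / N for k in range(-N, N + 1)]           # symmetric grid, includes chi = 0
z_clean = [math.exp(-4.0 * c * c) for c in chi]   # hypothetical even noise-free signal
w = [math.cos(3.0 * c) for c in chi]              # hypothetical even W(chi)
noise = [random.gauss(0.0, 1.0) for _ in chi]     # one realization of the tilt noise
a = 0.1                                           # hypothetical amplitude factor

n_even, n_odd = split_even_odd(noise)

# Noise-contaminated signal, following the form of Eq. (7.24a).
z_noisy = [z + a * n * wk for z, n, wk in zip(z_clean, noise, w)]

# Average the contaminated signal at chi and -chi.
z_avg = [0.5 * (p + m) for p, m in zip(z_noisy, z_noisy[::-1])]

# Only the even (unavoidable) noise component should survive the averaging.
z_expected = [z + a * ne * wk for z, ne, wk in zip(z_clean, n_even, w)]
max_residual = max(abs(x - y) for x, y in zip(z_avg, z_expected))
print(f"largest difference from signal + even noise: {max_residual:.2e}")
```

The residual is at the level of floating-point rounding, which is what the algebra leading to the bracketed sum of Eq. (7.24g) predicts for any real noise realization.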

7.8 Avoidable and Unavoidable Mirror-Misalignment Noise in the Signal Spectrum

It is easy to connect the unavoidable $\tilde{n}_e^{(\theta 2)}$ and avoidable $\tilde{n}_o^{(\theta 2)}$ noise components to the D-limited
Fourier transform $\tilde{n}_D^{(\theta 2)}$. Substitution of Eq. (7.24f) into (7.17c) gives


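$$
\begin{aligned}
\tilde{n}_D^{(\theta 2)}(\sigma) &= \int_{-D}^{D} \tilde{n}_e^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi + \int_{-D}^{D} \tilde{n}_o^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \\
&= \tilde{n}_{De}^{(\theta 2)}(\sigma) + \tilde{n}_{Do}^{(\theta 2)}(\sigma),
\end{aligned}
\tag{7.25a}
$$

where we define

$$
\begin{aligned}
\tilde{n}_{De}^{(\theta 2)}(\sigma) &= \int_{-D}^{D} \tilde{n}_e^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \int_{-\infty}^{\infty} \Pi(\chi, D)\,\tilde{n}_e^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \\
&= F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde{n}_e^{(\theta 2)}(\chi)\big)
\end{aligned}
\tag{7.25b}
$$

and

$$
\begin{aligned}
\tilde{n}_{Do}^{(\theta 2)}(\sigma) &= \int_{-D}^{D} \tilde{n}_o^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \int_{-\infty}^{\infty} \Pi(\chi, D)\,\tilde{n}_o^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \\
&= F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde{n}_o^{(\theta 2)}(\chi)\big).
\end{aligned}
\tag{7.25c}
$$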

Equation (7.25b) states that $\tilde{n}_{De}^{(\theta 2)}$ is the forward Fourier transform of $[\Pi(\chi, D)\,\tilde{n}_e^{(\theta 2)}(\chi)]$.
Glancing back at Eqs. (7.12a) and (7.24d), we note that

$$
\Pi(-\chi, D)\,\tilde{n}_e^{(\theta 2)}(-\chi) = \Pi(\chi, D)\,\tilde{n}_e^{(\theta 2)}(\chi),
\tag{7.26a}
$$

making $[\Pi(\chi, D)\,\tilde{n}_e^{(\theta 2)}(\chi)]$ even with respect to χ. Hence $\tilde{n}_{De}^{(\theta 2)}$ is the forward Fourier transform
of a real and even function, which means (according to entry 1 of Table 2.1 in Chapter 2) that
$\tilde{n}_{De}^{(\theta 2)}$ must also be real and even:

$$
\tilde{n}_{De}^{(\theta 2)}(-\sigma) = \tilde{n}_{De}^{(\theta 2)}(\sigma)
\tag{7.26b}
$$

and

$$
\mathrm{Re}\big(\tilde{n}_{De}^{(\theta 2)}(\sigma)\big) = \tilde{n}_{De}^{(\theta 2)}(\sigma).
\tag{7.26c}
$$

Equation (7.25c) states that $\tilde{n}_{Do}^{(\theta 2)}$ is the forward Fourier transform of $[\Pi(\chi, D)\,\tilde{n}_o^{(\theta 2)}(\chi)]$.
According to Eqs. (7.12a) and (7.24e),

$$
\Pi(-\chi, D)\,\tilde{n}_o^{(\theta 2)}(-\chi) = -\Pi(\chi, D)\,\tilde{n}_o^{(\theta 2)}(\chi),
\tag{7.27a}
$$

which makes $\tilde{n}_{Do}^{(\theta 2)}$ the forward Fourier transform of a real and odd function. Consequently,
according to entry 4 of Table 2.1 in Chapter 2, $\tilde{n}_{Do}^{(\theta 2)}$ must be imaginary and odd:

$$
\tilde{n}_{Do}^{(\theta 2)}(-\sigma) = -\tilde{n}_{Do}^{(\theta 2)}(\sigma)
\tag{7.27b}
$$


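and

$$
\mathrm{Im}\big(\tilde{n}_{Do}^{(\theta 2)}(\sigma)\big) = i^{-1}\,\tilde{n}_{Do}^{(\theta 2)}(\sigma).
\tag{7.27c}
$$

Equations (7.26c) and (7.27c) show that taking the real part of both sides of (7.25a) now gives

$$
\mathrm{Re}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big) = \tilde{n}_{De}^{(\theta 2)}(\sigma),
\tag{7.28a}
$$

and taking the imaginary part of both sides gives

$$
\mathrm{Im}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big) = i^{-1}\,\tilde{n}_{Do}^{(\theta 2)}(\sigma).
\tag{7.28b}
$$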
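
As a numerical illustration of Eqs. (7.28a) and (7.28b), the following sketch approximates the D-limited forward transform by a Riemann sum over a grid symmetric about χ = 0. The grid size, the value of D, the test wavenumber, and the Gaussian noise samples are hypothetical choices made only for this example, and `limited_transform` is an illustrative helper rather than anything defined in the text.

```python
import cmath
import math
import random


def limited_transform(values, chi, sigma):
    """Riemann-sum estimate of the forward transform over -D <= chi <= D."""
    step = chi[1] - chi[0]
    return step * sum(v * cmath.exp(-2j * math.pi * sigma * c)
                      for v, c in zip(values, chi))


random.seed(1)
N, D = 64, 1.0
chi = [D * k / N for k in range(-N, N + 1)]      # symmetric grid on [-D, D]
noise = [random.gauss(0.0, 1.0) for _ in chi]    # hypothetical real noise samples
mirrored = noise[::-1]
n_even = [0.5 * (n + m) for n, m in zip(noise, mirrored)]
n_odd = [0.5 * (n - m) for n, m in zip(noise, mirrored)]

sigma = 2.3                                      # arbitrary test wavenumber
t_full = limited_transform(noise, chi, sigma)
t_even = limited_transform(n_even, chi, sigma)
t_odd = limited_transform(n_odd, chi, sigma)

print("transform of even part:", t_even)         # imaginary part is ~0
print("transform of odd part :", t_odd)          # real part is ~0
print("Re(full) - even part  :", t_full.real - t_even.real)
print("Im(full) - odd part/i :", t_full.imag - (t_odd / 1j).real)
```

Up to rounding error, the transform of the even part is purely real, the transform of the odd part is purely imaginary, and they reproduce the real and imaginary parts of the transform of the full noise record.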

Equation (7.28a) shows that the real part of $\tilde{n}_{D}^{(\theta 2)}$, the D-limited Fourier transform of $\tilde{n}^{(\theta 2)}$, is
$\tilde{n}_{De}^{(\theta 2)}$, which is, according to (7.25b), the D-limited Fourier transform of the unavoidable signal
noise $\tilde{n}_e^{(\theta 2)}$. Because the real part of $\tilde{n}_{D}^{(\theta 2)}$ comes from $\tilde{n}_e^{(\theta 2)}$, the unavoidable signal noise, it
makes sense to regard the real part of $\tilde{n}_{D}^{(\theta 2)}$ as the unavoidable component of $\tilde{n}^{(\theta 2)}$ in the spectral
domain. This matches what we see in Eq. (7.22a), where the formula for the noise $\delta L$ in the
measured spectrum uses only the real part of $\tilde{n}_{D}^{(\theta 2)}$ (that is, it uses only the unavoidable
component of $\tilde{n}^{(\theta 2)}$ in the spectral domain). Equation (7.28a) can be substituted into (7.22a) to
make the dependence on $\tilde{n}_{De}^{(\theta 2)}$ explicit:

$$
\delta L = \frac{4a\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.
\tag{7.28c}
$$

Equations (4.139g) in Chapter 4 and (5.10f) in Chapter 5 show that η and M are even functions of
σ, and absolute value signs turn everything else in the denominator of the right-hand side of
(7.28c) into an even function of σ. Equations (7.26b) and (7.23g) show that $\tilde{n}_{De}^{(\theta 2)}$ and
$[\sigma^2 Z_{FOV}(\sigma)]$ are even functions of σ, and Eq. (2.38f) in Chapter 2 requires the convolution of
two even functions to be another even function. Hence, the numerator of (7.28c) is also an even
function of σ. This makes the measurement noise $\delta L$ an even function of σ, which can be shown
by writing it as a function of |σ|,

$$
\delta L = \delta L(|\sigma|).
$$

Therefore, Eqs. (7.22a) and (7.28c) can be written as


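$$
\delta L(|\sigma|) = \frac{4a\left\{\left[\mathrm{Re}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big)\right] * [\sigma^2 Z_{FOV}(\sigma)]\right\}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}
\tag{7.28d}
$$

and

$$
\delta L(|\sigma|) = \frac{4a\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.
\tag{7.28e}
$$

Equation (7.28b) shows that the imaginary part of $\tilde{n}_{D}^{(\theta 2)}$ is the same as $i^{-1}\,\tilde{n}_{Do}^{(\theta 2)}(\sigma)$, the D-limited
Fourier transform of the avoidable signal noise divided by i. Equation (7.28b) can be substituted
into (7.22b) to make this explicit:

$$
\begin{aligned}
&\mathrm{Im}\left(\frac{4a\left\{\tilde{n}_{D}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right) \\
&\quad = \frac{4a\,i^{-1}\left\{\tilde{n}_{Do}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.
\end{aligned}
\tag{7.28f}
$$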

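Since $\mathrm{E}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big) = 0$ in Eq. (7.18c), we know that, using the linearity of E explained in Sec.
3.10 of Chapter 3,

$$
\begin{aligned}
\mathrm{E}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big) &= \mathrm{E}\Big(\mathrm{Re}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big) + i\,\mathrm{Im}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big)\Big) \\
&= \mathrm{E}\Big(\mathrm{Re}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big)\Big) + i\,\mathrm{E}\Big(\mathrm{Im}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big)\Big) = 0.
\end{aligned}
$$

Consequently both the real and imaginary components of $\mathrm{E}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big)$ must be separately equal
to zero, which means

$$
\mathrm{E}\Big(\mathrm{Re}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big)\Big) = 0
\tag{7.29a}
$$

and

$$
\mathrm{E}\Big(\mathrm{Im}\big(\tilde{n}_{D}^{(\theta 2)}(\sigma)\big)\Big) = 0.
\tag{7.29b}
$$

According to Eqs. (7.28a) and (7.28b), this can be written as

$$
\mathrm{E}\big(\tilde{n}_{De}^{(\theta 2)}(\sigma)\big) = 0
\tag{7.29c}
$$

and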


$$
\mathrm{E}\big(\tilde{n}_{Do}^{(\theta 2)}(\sigma)\big) = 0.
\tag{7.29d}
$$

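Applying the expectation operator to both sides of Eq. (7.28e) leads to, using Eqs. (2.38b) and
(2.38a) in Chapter 2 and the linearity of the expectation operator in Sec. 3.10 of Chapter 3,

$$
\begin{aligned}
\mathrm{E}\big(\delta L(|\sigma|)\big)
&= \mathrm{E}\left(\frac{4a\left\{[\sigma^2 Z_{FOV}(\sigma)] * \tilde{n}_{De}^{(\theta 2)}(\sigma)\right\}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right) \\
&= \frac{4a}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\,\mathrm{E}\left(\int_{-\infty}^{\infty} \tilde{n}_{De}^{(\theta 2)}(\sigma - \sigma')\,[\sigma'^2 Z_{FOV}(\sigma')]\,d\sigma'\right) \\
&= \frac{4a}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \int_{-\infty}^{\infty} \mathrm{E}\big(\tilde{n}_{De}^{(\theta 2)}(\sigma - \sigma')\big)\,[\sigma'^2 Z_{FOV}(\sigma')]\,d\sigma',
\end{aligned}
$$

which becomes, using (7.29c),

$$
\mathrm{E}\big(\delta L(|\sigma|)\big) = 0.
\tag{7.29e}
$$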

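This shows that the measurement noise $\delta L(|\sigma|)$ is a zero-mean random variable. Similarly Eq.
(7.28f) gives us, after applying the expectation operator to both sides,

$$
\begin{aligned}
&\mathrm{E}\left(\mathrm{Im}\left(\frac{4a\left\{\tilde{n}_{D}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right)\right) \\
&\quad = \frac{4a\,i^{-1}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \int_{-\infty}^{\infty} \mathrm{E}\big(\tilde{n}_{Do}^{(\theta 2)}(\sigma - \sigma')\big)\,[\sigma'^2 Z_{FOV}(\sigma')]\,d\sigma',
\end{aligned}
$$

which becomes, using (7.29d),

$$
\mathrm{E}\left(\mathrm{Im}\left(\frac{4a\left\{\tilde{n}_{D}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right)\right) = 0.
\tag{7.29f}
$$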

Hence both the real and imaginary contamination of the measurement due to the signal’s mirror-
tilt noise can be reduced to negligible levels by averaging together many independent
measurements of the same spectrum.
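
A small Monte Carlo sketch illustrates why a zero-mean contamination shrinks under averaging. It assumes, beyond what the text states, that the noise in each measurement is independent and Gaussian with a hypothetical standard deviation `sigma_noise`; under those assumptions the rms of the averaged noise falls roughly as the inverse square root of the number of measurements.

```python
import random

random.seed(2)
sigma_noise = 1.0     # hypothetical standard deviation of one measurement's noise


def averaged_noise(num_measurements):
    """Noise left after averaging independent zero-mean measurements."""
    total = sum(random.gauss(0.0, sigma_noise) for _ in range(num_measurements))
    return total / num_measurements


def rms(values):
    """Root-mean-square of a list of numbers."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


trials = 400
rms_1 = rms([averaged_noise(1) for _ in range(trials)])
rms_100 = rms([averaged_noise(100) for _ in range(trials)])
print(f"rms noise, 1 measurement   : {rms_1:.3f}")
print(f"rms noise, 100 measurements: {rms_100:.3f}")
```

With 100 averaged measurements the residual rms noise is close to one tenth of the single-measurement value, consistent with the zero mean established in Eqs. (7.29e) and (7.29f).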


7.9 Power Spectrum of $\tilde{n}^{(\theta 2)}$


In the discussion following Eq. (7.8d) above, random function $\tilde{n}^{(\theta 2)}(\chi)$ is assumed to be wide-sense
stationary. Hence, based on the analysis in Secs. 3.20 and 3.23 of Chapter 3, we expect its
power spectrum and autocorrelation function to be a Fourier-transform pair. The χ-based
autocorrelation function is defined to be

$$
o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi, \chi') = \mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\,\tilde{n}^{(\theta 2)}(\chi')\big).
$$

Since $\tilde{n}^{(\theta 2)}$ is wide-sense stationary, we know that $o_{\tilde{n}\tilde{n}}^{(\theta 2)}$ depends only on the difference between
χ and χ′:

$$
o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi' - \chi) = \mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\,\tilde{n}^{(\theta 2)}(\chi')\big).
\tag{7.30a}
$$

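The σ-based, double-sided power spectrum of $\tilde{n}^{(\theta 2)}$ is given by

$$
p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma) = \int_{-\infty}^{\infty} o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = F^{(-i\sigma\chi)}\big(o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi)\big)
\tag{7.30b}
$$

and of course this transform can be reversed to get

$$
o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi) = \int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma = F^{(i\sigma\chi)}\big(p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma)\big).
\tag{7.30c}
$$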

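Equations (7.30b) and (7.30c) show how we set up the χ-based autocorrelation of $\tilde{n}^{(\theta 2)}$ and the
σ-based power spectrum of $\tilde{n}^{(\theta 2)}$ as a Fourier-transform pair. Equation (7.8b) shows that $o_{\tilde{n}\tilde{n}}^{(\theta 2)}$ is
real because $\tilde{n}^{(\theta 2)}$ is real, and we also note that $o_{\tilde{n}\tilde{n}}^{(\theta 2)}$ must be even because for any two values of
χ and χ′,

$$
o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi' - \chi) = \mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\,\tilde{n}^{(\theta 2)}(\chi')\big) = \mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi')\,\tilde{n}^{(\theta 2)}(\chi)\big) = o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi - \chi').
$$

Therefore, after defining χ″ = χ − χ′, we get

$$
o_{\tilde{n}\tilde{n}}^{(\theta 2)}(-\chi'') = o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi'')
\tag{7.30d}
$$

and, having just decided the autocorrelation is real,

$$
\mathrm{Im}\big(o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi'')\big) = 0.
\tag{7.30e}
$$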


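Equation (7.30b) then shows that, according to (7.30d) and (7.30e), $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$ is the forward Fourier
transform of a real and even function, which means that it must also be real and even:106

$$
\mathrm{Im}\big(p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma)\big) = 0
\tag{7.30f}
$$

and

$$
p_{\tilde{n}\tilde{n}}^{(\theta 2)}(-\sigma) = p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma).
\tag{7.30g}
$$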

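The χ = 0 value of the autocorrelation function can be used to connect the $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$ power
spectrum to the statistics of the misalignment angle. Setting χ = 0 in Eq. (7.30c) gives

$$
o_{\tilde{n}\tilde{n}}^{(\theta 2)}(0) = \int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma)\,d\sigma,
$$

which means, according to (7.30a) with χ = χ′,

$$
\mathrm{E}\big([\tilde{n}^{(\theta 2)}(\chi)]^2\big) = \int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma)\,d\sigma.
\tag{7.31a}
$$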

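Substituting from Eq. (7.8b) and using the linearity of operator E with respect to random
quantities (see Sec. 3.10 of Chapter 3) as well as Eq. (3.9f) of Chapter 3, we get

$$
\begin{aligned}
\mathrm{E}\big([\tilde{n}^{(\theta 2)}(\chi)]^2\big) &= \mathrm{E}\big([\theta_{rms}^2 - \theta(\chi)^2]^2\big) = \mathrm{E}\big(\theta_{rms}^4\big) - 2\,\mathrm{E}\big(\theta_{rms}^2\,\theta(\chi)^2\big) + \mathrm{E}\big(\theta(\chi)^4\big) \\
&= \theta_{rms}^4 - 2\,\theta_{rms}^2\,\mathrm{E}\big(\theta(\chi)^2\big) + \mathrm{E}\big(\theta(\chi)^4\big) \\
&= \mathrm{E}\big(\theta(\chi)^4\big) - \theta_{rms}^4,
\end{aligned}
$$

where in the last step $\mathrm{E}(\theta(\chi)^2) = \theta_{rms}^2$ from Eq. (7.3c) is used to simplify the result. Substitution
of this formula into (7.31a) gives

$$
\mathrm{E}\big(\theta(\chi)^4\big) = \theta_{rms}^4 + \int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma)\,d\sigma.
\tag{7.31b}
$$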

106
See entry 1 of Table 2.1 in Chapter 2.


Because the statistics of θ do not depend on χ, we are not surprised to see $\mathrm{E}(\theta(\chi)^4)$ set equal to
a χ-independent sum. This formula connects the integrated value of $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$ to the (presumably
already known) statistical quantities $\theta_{rms}$ and $\mathrm{E}(\theta(\chi)^4)$. Once a shape has been chosen for $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$,
Eq. (7.31b) can be used to find the normalizing constant, which should be applied to the shape
function to get the exact formula for the $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$
 noise-power spectrum (see, for example, Sec. 7.13
7.14
below).
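
The normalization condition can be illustrated with sample statistics. The sketch below is built on assumptions that are not taken from the text: it treats the squared tilt as the sum of the squared x and y components, draws those components from normal distributions with hypothetical widths and bias (in microradians), and uses sample averages in place of the expectation operator. It then checks that the mean-square value of $\theta_{rms}^2 - \theta^2$ equals $\mathrm{E}(\theta^4) - \theta_{rms}^4$, which is the number the integral of the power spectrum must reproduce according to Eq. (7.31b).

```python
import random

random.seed(3)
# Hypothetical tilt statistics in microradians (not values from the text).
gamma_x, gamma_y, phi = 10.0, 20.0, 5.0

theta_sq = []
for _ in range(20000):
    theta_x = random.gauss(phi, gamma_x)    # x tilt with bias phi
    theta_y = random.gauss(0.0, gamma_y)    # zero-mean y tilt
    # Assumption of this sketch: theta^2 = theta_x^2 + theta_y^2.
    theta_sq.append(theta_x ** 2 + theta_y ** 2)


def mean(values):
    """Sample average standing in for the expectation operator E."""
    return sum(values) / len(values)


theta_rms_sq = mean(theta_sq)                                   # theta_rms^2
noise_sq = mean([(theta_rms_sq - t) ** 2 for t in theta_sq])    # E([n]^2)
fourth_moment = mean([t * t for t in theta_sq])                 # E(theta^4)

# Rearranged Eq. (7.31b): integral of the power spectrum = E(theta^4) - theta_rms^4.
spectrum_integral = fourth_moment - theta_rms_sq ** 2
print(noise_sq, spectrum_integral)
```

Any candidate shape function for the power spectrum would be scaled so that its integral over all σ equals `spectrum_integral` for the tilt statistics in question.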

7.10 Calculating the Variance of δL


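Equations (7.28e) and (7.28f) specify the measurement noise $\delta L(|\sigma|)$ in the radiance spectrum.
To find the variance of $\delta L(|\sigma|)$, we must evaluate $\mathrm{E}\big([\delta L(|\sigma|)]^2\big)$, which can be written as,
substituting from Eq. (7.28e),

$$
\begin{aligned}
\mathrm{E}\big([\delta L(|\sigma|)]^2\big)
&= \mathrm{E}\left(\left[\frac{4a\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right]^2\right) \\
&= \left[\frac{4a}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right]^2 \mathrm{E}\left(\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}^2\right).
\end{aligned}
\tag{7.32}
$$

The only difficult term in this formula is

$$
\mathrm{E}\left(\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}^2\right),
$$

which is what we now set out to calculate.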


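Reversing the transform in Eq. (7.8a) gives

$$
\sigma^2 Z_{FOV}(\sigma) = \int_{-\infty}^{\infty} W(\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi' = F^{(-i\sigma\chi')}\big(W(\chi')\big).
\tag{7.33a}
$$

Equations (7.25b) and (7.33a) can be combined to get

$$
\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] = F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde{n}_e^{(\theta 2)}(\chi)\big) * F^{(-i\sigma\chi')}\big(W(\chi')\big),
$$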


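which becomes, using the Fourier convolution theorem [see Eq. (2.39j) in Chapter 2],

$$
\begin{aligned}
\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] &= F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\big) \\
&= \int_{-\infty}^{\infty} \Pi(\chi, D)\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi.
\end{aligned}
\tag{7.33b}
$$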

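Now we can write (using the linearity of operator E discussed in Sec. 3.10 of Chapter 3)

$$
\begin{aligned}
&\mathrm{E}\left(\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}^2\right) \\
&\quad = \mathrm{E}\left(\int_{-\infty}^{\infty} d\chi\,\Pi(\chi, D)\,\tilde{n}_e^{(\theta 2)}(\chi)\,W(\chi)\,e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi', D)\,\tilde{n}_e^{(\theta 2)}(\chi')\,W(\chi')\,e^{-2\pi i\sigma\chi'}\right) \\
&\quad = \int_{-\infty}^{\infty} d\chi\,\Pi(\chi, D)\,W(\chi)\,e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi', D)\,W(\chi')\,\mathrm{E}\big(\tilde{n}_e^{(\theta 2)}(\chi)\,\tilde{n}_e^{(\theta 2)}(\chi')\big)\,e^{-2\pi i\sigma\chi'}.
\end{aligned}
\tag{7.33c}
$$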

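From Eq. (7.24b) we get

$$
\begin{aligned}
\mathrm{E}\big(\tilde{n}_e^{(\theta 2)}(\chi)\,\tilde{n}_e^{(\theta 2)}(\chi')\big)
&= \frac{1}{4}\,\mathrm{E}\big([\tilde{n}^{(\theta 2)}(\chi) + \tilde{n}^{(\theta 2)}(-\chi)][\tilde{n}^{(\theta 2)}(\chi') + \tilde{n}^{(\theta 2)}(-\chi')]\big) \\
&= \frac{1}{4}\Big[\mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\,\tilde{n}^{(\theta 2)}(\chi')\big) + \mathrm{E}\big(\tilde{n}^{(\theta 2)}(\chi)\,\tilde{n}^{(\theta 2)}(-\chi')\big) \\
&\qquad + \mathrm{E}\big(\tilde{n}^{(\theta 2)}(-\chi)\,\tilde{n}^{(\theta 2)}(\chi')\big) + \mathrm{E}\big(\tilde{n}^{(\theta 2)}(-\chi)\,\tilde{n}^{(\theta 2)}(-\chi')\big)\Big].
\end{aligned}
$$

This becomes, applying Eq. (7.30a),

$$
\begin{aligned}
\mathrm{E}\big(\tilde{n}_e^{(\theta 2)}(\chi)\,\tilde{n}_e^{(\theta 2)}(\chi')\big) = \frac{1}{4}\Big[&o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi' - \chi) + o_{\tilde{n}\tilde{n}}^{(\theta 2)}(-\chi' - \chi) \\
&+ o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi' + \chi) + o_{\tilde{n}\tilde{n}}^{(\theta 2)}(-\chi' + \chi)\Big],
\end{aligned}
$$

which, according to Eq. (7.30d), simplifies to

$$
\mathrm{E}\big(\tilde{n}_e^{(\theta 2)}(\chi)\,\tilde{n}_e^{(\theta 2)}(\chi')\big) = \frac{1}{2}\Big[o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi' - \chi) + o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi' + \chi)\Big].
\tag{7.33d}
$$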


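Substitution of (7.33d) into (7.33c) gives, using Eq. (7.30c),

$$
\begin{aligned}
&\mathrm{E}\left(\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}^2\right) \\
&\quad = \frac{1}{2}\int_{-\infty}^{\infty} d\chi\,\Pi(\chi, D)\,W(\chi)\,e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi', D)\,W(\chi')\,o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi' - \chi)\,e^{-2\pi i\sigma\chi'} \\
&\qquad + \frac{1}{2}\int_{-\infty}^{\infty} d\chi\,\Pi(\chi, D)\,W(\chi)\,e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi', D)\,W(\chi')\,o_{\tilde{n}\tilde{n}}^{(\theta 2)}(\chi' + \chi)\,e^{-2\pi i\sigma\chi'} \\
&\quad = \frac{1}{2}\int_{-\infty}^{\infty} d\chi\,\Pi(\chi, D)\,W(\chi)\,e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi', D)\,W(\chi')\,e^{-2\pi i\sigma\chi'} \int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,e^{2\pi i\sigma'(\chi' - \chi)}\,d\sigma' \\
&\qquad + \frac{1}{2}\int_{-\infty}^{\infty} d\chi\,\Pi(\chi, D)\,W(\chi)\,e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi', D)\,W(\chi')\,e^{-2\pi i\sigma\chi'} \int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,e^{2\pi i\sigma'(\chi' + \chi)}\,d\sigma'.
\end{aligned}
$$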

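This can be written as, interchanging the order of the multiple integrals,

$$
\begin{aligned}
&\mathrm{E}\left(\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}^2\right) \\
&\quad = \frac{1}{2}\int_{-\infty}^{\infty} d\sigma'\,p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma') \int_{-\infty}^{\infty} d\chi\,\Pi(\chi, D)\,W(\chi)\,e^{-2\pi i(\sigma + \sigma')\chi} \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi', D)\,W(\chi')\,e^{-2\pi i(\sigma - \sigma')\chi'} \\
&\qquad + \frac{1}{2}\int_{-\infty}^{\infty} d\sigma'\,p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma') \int_{-\infty}^{\infty} d\chi\,\Pi(\chi, D)\,W(\chi)\,e^{-2\pi i(\sigma - \sigma')\chi} \int_{-\infty}^{\infty} d\chi'\,\Pi(\chi', D)\,W(\chi')\,e^{-2\pi i(\sigma - \sigma')\chi'}.
\end{aligned}
\tag{7.33e}
$$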

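From Eq. (7.15b) and the Fourier convolution theorem [Eq. (2.39j) in Chapter 2], we get

$$
\begin{aligned}
\int_{-\infty}^{\infty} \Pi(\chi, D)\,W(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi &= F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,W(\chi)\big) \\
&= [2D\,\mathrm{sinc}(2\pi\sigma D)] * F^{(-i\sigma\chi)}\big(W(\chi)\big).
\end{aligned}
$$

Substitution from Eq. (7.17e) gives

$$
\int_{-\infty}^{\infty} \Pi(\chi, D)\,W(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = [2D\,\mathrm{sinc}(2\pi\sigma D)] * [\sigma^2 Z_{FOV}(\sigma)].
$$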


The $\sigma^2$ term is broad and slowly varying compared to the narrow and rapidly varying sinc
function, so it acts like a quasi-constant and can be brought outside the convolution [see Eq.
(5C.1) in Appendix 5C of Chapter 5]. This means we can write, using the approximation in
(7.16h) above,

$$
\int_{-\infty}^{\infty} \Pi(\chi, D)\,W(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \cong \sigma^2\big([2D\,\mathrm{sinc}(2\pi\sigma D)] * Z_{FOV}(\sigma)\big) \cong \sigma^2 Z_{mnf}(\sigma).
\tag{7.33f}
$$

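Equation (7.33f) is now used to simplify (7.33e):

$$
\begin{aligned}
&\mathrm{E}\left(\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}^2\right) \\
&\quad = \frac{1}{2}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\,[(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\,d\sigma' \\
&\qquad + \frac{1}{2}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]^2\,d\sigma'.
\end{aligned}
\tag{7.33g}
$$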

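This expression is too complicated to substitute comfortably back into Eq. (7.32), the formula for
the variance of $\delta L(|\sigma|)$, so we define a new function

$$
\begin{aligned}
J^{(\theta 2)}(\sigma) &= \frac{1}{2}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\,[(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\,d\sigma' \\
&\quad + \frac{1}{2}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]^2\,d\sigma',
\end{aligned}
\tag{7.33h}
$$

which means that (7.33g) reduces to

$$
\mathrm{E}\left(\left\{\tilde{n}_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)]\right\}^2\right) = J^{(\theta 2)}(\sigma).
\tag{7.33i}
$$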

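Equation (7.32) can now be written as

$$
\mathrm{E}\big([\delta L(|\sigma|)]^2\big) = J^{(\theta 2)}(\sigma) \cdot \left[\frac{4a}{W A\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right]^2.
\tag{7.33j}
$$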


Using Eq. (7.2b) and $A = \pi R^2$ for a circle of radius R, we have

$$
\frac{a}{A} = 2\pi^2\,\frac{R^2}{\pi R^2} = 2\pi.
$$

Here, variables A and R have the same meaning as in the discussion following Eq. (4.137e) in
Chapter 4. The discussion following Eq. (4.83) in Chapter 4 reveals that, because W must be 1 or
−1,

$$
W^2 = 1.
\tag{7.33k}
$$

These results can be substituted into (7.33j) to get

$$
\mathrm{E}\big([\delta L(|\sigma|)]^2\big) = J^{(\theta 2)}(\sigma) \cdot \left[\frac{8\pi}{\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right]^2.
\tag{7.33ℓ}
$$

7.11 Formula for the Misalignment NEdN of Double-Sided Signals


By definition, the mirror-misalignment NEdN of the double-sided signal analyzed here is the
square root of the variance in the noise. According to Eq. (7.29e), the mirror-misalignment
noise $\delta L$ is a zero-mean random variable, so the formula for the variance in Eq. (3.8f) in Chapter
3 shows that

$$
\mathrm{Var}(\delta L) = \mathrm{E}\big([\delta L(|\sigma|)]^2\big)
$$

because the mean $\mu_{\delta L}$ of random variable $\delta L$ is zero. Consequently the formula for the
misalignment, or tilt-error, NEdN (which is defined in Sec. 6.1 of Chapter 6 to be the standard
deviation, or the square root of the variance, of the $\delta L$ noise) can be written as

$$
\mathrm{NEdN}_{tilt} = \sqrt{\mathrm{E}\big([\delta L(|\sigma|)]^2\big)}.
\tag{7.34a}
$$

Taking the square root of both sides of Eq. (7.33ℓ) gives

$$
\mathrm{NEdN}_{tilt} = \left[\frac{8\pi\sqrt{J^{(\theta 2)}(\sigma)}}{\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right].
\tag{7.34b}
$$
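
For readers who want to evaluate Eq. (7.34b) numerically, a minimal sketch follows. The function name `nedn_tilt` and its argument names are hypothetical, each argument is assumed to be the already-computed value of the corresponding factor at one wavenumber, and the absolute value is an assumption of the sketch that simply keeps the returned standard deviation non-negative.

```python
import math


def nedn_tilt(j_value, delta_omega, m, r, eta, tau_a, tau_f):
    """Evaluate the form of Eq. (7.34b) at one wavenumber from precomputed factors."""
    if j_value < 0.0:
        raise ValueError("J must be non-negative")
    denominator = delta_omega * m * r * eta * tau_a * tau_f
    # abs() keeps the result non-negative; this is an assumption of the sketch.
    return abs(8.0 * math.pi * math.sqrt(j_value) / denominator)


# Hypothetical, dimensionless placeholder values chosen only to exercise the function.
example = nedn_tilt(j_value=4.0, delta_omega=2.0, m=1.0, r=1.0, eta=1.0,
                    tau_a=1.0, tau_f=1.0)
print(example)   # 8 * pi for these placeholder inputs
```

Because the result scales as the square root of J, reducing J by a factor of four halves the computed NEdN.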


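There are a number of ways to write the $J^{(\theta 2)}$ function defined in Eq. (7.33h) above. The
second term on the right-hand side of (7.33h) can, for example, be written as a convolution [see
Eq. (2.38a) in Chapter 2 for the definition of a convolution]. This gives

$$
\begin{aligned}
J^{(\theta 2)}(\sigma) &= \frac{1}{2}\,p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{mnf}(\sigma)]^2 \\
&\quad + \frac{1}{2}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\,[(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\,d\sigma'.
\end{aligned}
\tag{7.35a}
$$

Perhaps the most revealing form in which to write the $J^{(\theta 2)}$ function is

$$
J^{(\theta 2)}(\sigma) = \frac{1}{4}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,\big[(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma') + (\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')\big]^2\,d\sigma'.
\tag{7.35b}
$$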

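To justify this latest formula, we consult Eq. (7.30g) and define a new dummy variable of
integration σ″ = −σ′ in order to show that

$$
\begin{aligned}
\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]^2\,d\sigma'
&= -\int_{\infty}^{-\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(-\sigma'')\,[(\sigma - \sigma'')^2 Z_{mnf}(\sigma - \sigma'')]^2\,d\sigma'' \\
&= \int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma'')\,[(\sigma - \sigma'')^2 Z_{mnf}(\sigma - \sigma'')]^2\,d\sigma''.
\end{aligned}
\tag{7.35c}
$$

Now the square brackets [ ] in Eq. (7.35b) can be expanded to get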



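$$
\begin{aligned}
J^{(\theta 2)}(\sigma) &= \frac{1}{4}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]^2\,d\sigma' \\
&\quad + \frac{1}{4}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]^2\,d\sigma' \\
&\quad + \frac{1}{2}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\,[(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\,d\sigma' \\
&= \frac{1}{2}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]^2\,d\sigma' \\
&\quad + \frac{1}{2}\int_{-\infty}^{\infty} p_{\tilde{n}\tilde{n}}^{(\theta 2)}(\sigma')\,[(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\,[(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\,d\sigma'.
\end{aligned}
$$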

This is the same as Eq. (7.33h) above, showing that the right-hand side of (7.35b) is correct. We
note, since the power spectrum $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$ can never be negative and the terms inside the square
brackets [ ] are all real, that the integral on the right-hand side of (7.35b) is never negative.
Consequently, $J^{(\theta 2)}$ must be a non-negative quantity, which means there is never any problem
taking its square root in the formula for the mirror-misalignment NEdN in Eq. (7.34b). The $Z_{mnf}$
function in Eq. (7.35b) is specified by formula (7.16f) above; we see that $Z_{mnf}$ depends on the
background radiances $L^{(fore)}_{mnf}$ and $L^{(back)}_{mnf}$ as well as on $L_{mnf}$, the radiance being measured. Hence
both the internal background radiances and the radiance being measured end up contributing to
the mirror-tilt NEdN.
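
The equivalence of Eqs. (7.33h) and (7.35b) can also be checked numerically. In the sketch below the shapes chosen for $Z_{mnf}$ and for the noise-power spectrum are purely hypothetical (a Gaussian band near |σ| = 1000 and a Gaussian centered on zero, in arbitrary units), and the integrals are replaced by Riemann sums on a grid symmetric about σ′ = 0; the only property the check relies on is that the power spectrum is real, even, and non-negative.

```python
import math


def f(s):
    """sigma^2 * Z_mnf(sigma) for a hypothetical, real and even Z_mnf."""
    z_mnf = math.exp(-((abs(s) - 1000.0) / 200.0) ** 2)
    return s * s * z_mnf


def p_nn(s):
    """Hypothetical real, even, non-negative noise-power spectrum."""
    return math.exp(-(s / 50.0) ** 2)


step = 1.0
grid = [step * k for k in range(-400, 401)]   # symmetric sigma' grid


def j_733h(sigma):
    """Riemann-sum version of Eq. (7.33h)."""
    cross = sum(p_nn(s) * f(sigma + s) * f(sigma - s) for s in grid)
    square = sum(p_nn(s) * f(sigma - s) ** 2 for s in grid)
    return 0.5 * step * (cross + square)


def j_735b(sigma):
    """Riemann-sum version of Eq. (7.35b)."""
    return 0.25 * step * sum(p_nn(s) * (f(sigma + s) + f(sigma - s)) ** 2
                             for s in grid)


for sigma in (600.0, 900.0, 1200.0):
    print(sigma, j_733h(sigma), j_735b(sigma))
```

The two forms agree to rounding error at every test wavenumber, and the second form makes the non-negativity of J evident because each term in its sum is a non-negative weight times a square.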

7.12 Connection Between the $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$ Power Spectrum and the Power Spectra of $\theta_x$, $\theta_y$

To understand the implications of the NEdN$_{tilt}$ formulas derived in the previous sections, we
need some information about the typical shape of the $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$ power spectrum. It turns out that if we
assign power spectra to the $\theta_x(\chi)$ and $\theta_y(\chi)$ random functions introduced in Sec. 7.2 above, we
can use them to get information about the probable shape of $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$ by deriving a formula for $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$
in terms of the power spectra of $\theta_x(\chi)$ and $\theta_y(\chi)$.

Simplifying the notation in preparation for the algebra coming up, we define four new random
variables X, X′, Y, Y′ by specifying that

$$
\theta_x(\chi) = X + \phi,
\tag{7.36a}
$$

$$
\theta_x(\chi') = X' + \phi,
\tag{7.36b}
$$

$$
\theta_y(\chi) = Y,
\tag{7.36c}
$$

and

$$
\theta_y(\chi') = Y'.
\tag{7.36d}
$$

The point of this new notation is to emphasize the important information (namely, whether
we are dealing with the x or the y component of the angle) and to suppress all the irrelevant
aspects of argument χ, keeping only the relevant information as to whether or not it is primed.
According to Eq. (7.2f), the average value of $\theta_x(\chi)$, which is the same thing as the system’s
bias tilt, is the constant angle φ at any value of χ. Writing Eqs. (7.36a) and (7.36b) as

$$
X = \theta_x(\chi) - \phi \quad \text{and} \quad X' = \theta_x(\chi') - \phi
$$

makes it easy to see that X and X′ are zero-mean random functions of χ and χ′ respectively. The
statistics of $\theta_x(\chi)$ and $\theta_y(\chi)$ do not depend on χ, so we expect the same to hold true for the
statistics of X, X′, Y, and Y′. Hence, we can assume that X, X′ and Y, Y′ are at least wide-sense
stationary functions of χ (which is, according to Sec. 3.20 of Chapter 3, all that is
necessary to provide them with power spectra). Because they are wide-sense stationary, we can
set up the two autocorrelation functions

$$
\mathrm{E}(XX') = o^{(xx)}(\chi' - \chi)
\tag{7.36e}
$$

and

$$
\mathrm{E}(YY') = o^{(yy)}(\chi' - \chi)
\tag{7.36f}
$$

to be functions only of the difference between χ and χ′. The associated power spectra are, using
χ″ = χ′ − χ,

$$
p^{(xx)}(\sigma) = \int_{-\infty}^{\infty} o^{(xx)}(\chi'')\,e^{-2\pi i\sigma\chi''}\,d\chi'' = F^{(-i\sigma\chi'')}\big(o^{(xx)}(\chi'')\big)
\tag{7.36g}
$$

and

$$
p^{(yy)}(\sigma) = \int_{-\infty}^{\infty} o^{(yy)}(\chi'')\,e^{-2\pi i\sigma\chi''}\,d\chi'' = F^{(-i\sigma\chi'')}\big(o^{(yy)}(\chi'')\big);
\tag{7.36h}
$$


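and, of course, the Fourier transforms can be reversed to get

$$
o^{(xx)}(\chi'') = \int_{-\infty}^{\infty} p^{(xx)}(\sigma)\,e^{2\pi i\sigma\chi''}\,d\sigma = F^{(i\sigma\chi'')}\big(p^{(xx)}(\sigma)\big)
\tag{7.36i}
$$

and

$$
o^{(yy)}(\chi'') = \int_{-\infty}^{\infty} p^{(yy)}(\sigma)\,e^{2\pi i\sigma\chi''}\,d\sigma = F^{(i\sigma\chi'')}\big(p^{(yy)}(\sigma)\big).
\tag{7.36j}
$$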

If we no longer assume that $\theta_x(\chi)$ and $\theta_y(\chi)$ are uncorrelated (which means that X and Y
might be correlated random variables), we must use the cross-correlation function

$$
\mathrm{E}(XY') = o^{(xy)}(\chi' - \chi),
\tag{7.37a}
$$

like the one defined in Eq. (3.30d) in Chapter 3, to describe the statistical relationship between
X and Y. Again we assume that, just like $o^{(xx)}$ and $o^{(yy)}$, it is a real function of the difference
between χ and χ′, which means that X and Y are jointly wide-sense stationary. Hence we can
define a new variable χ″ = χ − χ′ and construct an associated cross-power spectrum [see Eq.
(3.48e) in Chapter 3],

$$
p^{(xy)}(\sigma) = \int_{-\infty}^{\infty} o^{(xy)}(\chi'')\,e^{-2\pi i\sigma\chi''}\,d\chi'' = F^{(-i\sigma\chi'')}\big(o^{(xy)}(\chi'')\big).
\tag{7.37b}
$$

Reversing the Fourier transform gives

$$
o^{(xy)}(\chi'') = \int_{-\infty}^{\infty} p^{(xy)}(\sigma)\,e^{2\pi i\sigma\chi''}\,d\sigma = F^{(i\sigma\chi'')}\big(p^{(xy)}(\sigma)\big).
\tag{7.37c}
$$

The same sort of reasoning used above in Sec. 7.9 [see Eqs. (7.30d)–(7.30g)] can be used here
to show that $o^{(xx)}$, $o^{(yy)}$, $p^{(xx)}$, and $p^{(yy)}$ in Eqs. (7.36e)–(7.36h) are real and even functions. We
note that

$$
o^{(xx)}(\chi' - \chi) = \mathrm{E}(XX') = \mathrm{E}(X'X) = o^{(xx)}(\chi - \chi'),
$$

which becomes, substituting χ″ = χ′ − χ,

$$
o^{(xx)}(\chi'') = o^{(xx)}(-\chi'').
\tag{7.38a}
$$


The same argument can be applied to $o^{(yy)}$ to get

$$
o^{(yy)}(\chi'') = o^{(yy)}(-\chi'');
\tag{7.38b}
$$

and, of course, both $o^{(xx)}$ and $o^{(yy)}$ must be real because they are, according to (7.36e) and
(7.36f), the expectation values of real products,

$$
\mathrm{Im}[o^{(xx)}(\chi'')] = \mathrm{Im}[o^{(yy)}(\chi'')] = 0.
\tag{7.38c}
$$

Since $o^{(xx)}$ and $o^{(yy)}$ are real and even, their Fourier transforms $p^{(xx)}$ and $p^{(yy)}$ in Eqs. (7.36g)
and (7.36h) must also, according to entry 1 in Table 2.1 of Chapter 2, be real and even:

$$
p^{(xx)}(-\sigma) = p^{(xx)}(\sigma),
\tag{7.38d}
$$

$$
p^{(yy)}(-\sigma) = p^{(yy)}(\sigma),
\tag{7.38e}
$$

and

$$
\mathrm{Im}[p^{(xx)}(\sigma)] = \mathrm{Im}[p^{(yy)}(\sigma)] = 0.
\tag{7.38f}
$$

We note in passing that this line of argument most definitely cannot be applied to $o^{(xy)}$ and
$p^{(xy)}$, because, as shown in Appendix 7B, the cross-power spectrum $p^{(xy)}$ can have both real and
imaginary parts.
The probability density distributions in Eqs. (7.2h) and (7.2i) require $\theta_x$ and $\theta_y$ to be
normally distributed. Consequently, the definitions of X, X′, Y, Y′ in Eqs. (7.36a)–(7.36d)
show that X, X′, Y, Y′ are also normally distributed. Variables Y and Y′ obey zero-mean
normal distributions because $\theta_y$ is a zero-mean random function; and X and X′ also obey
zero-mean normal distributions because, according to the discussion following Eq. (7.36d), the effect
of subtracting φ from $\theta_x$ is to make X and X′ zero-mean random quantities. Hence, X, X′,
Y, Y′ have the same properties as the jointly normal random variables $n_1$, $n_2$, $n_3$, $n_4$ described
in Sec. 3.17 of Chapter 3,

$$
X, X', Y, Y' \Leftrightarrow n_{1,2,3,4}.
\tag{7.39a}
$$

Note that jointly normal random variables may or may not be correlated and thus may or may not
be independent random quantities. Considered in pairs, the random quantities X, X′ and Y, Y′
obey the formulas describing pairs of jointly normal random variables, for example, Eqs. (3.35c)
and (3.41b) in Chapter 3. When they are examined in isolation, they obey formulas describing
single normal variables, for example, Eq. (3.41c) in Chapter 3. Equations (7.36a)–(7.36d) also


require the spread in the probable values of X, X′ about zero to be the same as the spread in the
probable values of $\theta_x$ at χ or χ′ about φ; and of course Y, Y′ have the same spread about zero as
$\theta_y$ at χ or χ′ because they are the same random variables. Consequently the standard deviations of
X, X′ are the same as the $\gamma_x$ standard deviation of $\theta_x$ at χ or χ′ and the standard deviations of
Y, Y′ are the same as the $\gamma_y$ standard deviation of $\theta_y$ at χ or χ′ [$\gamma_x$ and $\gamma_y$ are introduced in
the discussion following Eq. (7.2g) above]. We see that

$$
\mathrm{E}(X^2) = \mathrm{E}(X'^2) = \gamma_x^2
\tag{7.39b}
$$

and

$$
\mathrm{E}(Y^2) = \mathrm{E}(Y'^2) = \gamma_y^2.
\tag{7.39c}
$$

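Having laid the required mathematical foundation, we begin the derivation of the desired
formula for the $p_{\tilde{n}\tilde{n}}^{(\theta 2)}$ power spectrum in terms of the power spectra of $\theta_x(\chi)$ and $\theta_y(\chi)$. The first
step is to evaluate [see Eqs. (7.36a)–(7.36d) above]

$$
\mathrm{E}\big(\theta_x(\chi)^2\,\theta_x(\chi')^2\big) = \mathrm{E}\big((X + \phi)^2 (X' + \phi)^2\big),
\tag{7.40a}
$$

$$
\mathrm{E}\big(\theta_x(\chi)^2\,\theta_y(\chi')^2\big) = \mathrm{E}\big((X + \phi)^2\,Y'^2\big),
\tag{7.40b}
$$

and

$$
\mathrm{E}\big(\theta_y(\chi)^2\,\theta_y(\chi')^2\big) = \mathrm{E}(Y^2 Y'^2)
\tag{7.40c}
$$

in terms of the correlation functions $o^{(xx)}$, $o^{(yy)}$, and $o^{(xy)}$.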


Starting with (7.40a), we use the linearity of the expectation operator with regard to random
variables (see Sec. 3.10 of Chapter 3) to write

$$\begin{gathered}
E\big(\theta_x(\chi)^2\,\theta_x(\chi')^2\big) = E\big((X+\phi)^2 (X'+\phi)^2\big) \\
= E\big(X^2X'^2 + 2\phi X^2X' + \phi^2X^2 + 2\phi XX'^2 + 4\phi^2XX' + 2\phi^3X + \phi^2X'^2 + 2\phi^3X' + \phi^4\big) \\
= E(X^2X'^2) + 2\phi E(X^2X') + \phi^2E(X^2) + 2\phi E(XX'^2) + 4\phi^2E(XX') \\
+ 2\phi^3E(X) + \phi^2E(X'^2) + 2\phi^3E(X') + \phi^4 ,
\end{gathered} \tag{7.41a}$$

where, in the last step, Eq. (3.9f) in Chapter 3 is used to get $E(\phi^4) = \phi^4$. We examine the
discussion following Eq. (3.34d) in Chapter 3 and apply Eq. (3.35c) to get

$$E(XX'^2) = E(X^2X') = 0 . \tag{7.41b}$$

We of course also know that

$$E(X) = E(X') = 0 \tag{7.41c}$$

because $X$ and $X'$ are zero-mean random variables. Equation (7.41a) can now be written as

$$\begin{gathered}
E\big(\theta_x(\chi)^2\,\theta_x(\chi')^2\big) = E(X^2X'^2) + \phi^2E(X^2) + 4\phi^2E(XX') + \phi^2E(X'^2) + \phi^4 \\
= E(X^2X'^2) + 4\phi^2E(XX') + 2\phi^2\gamma_x^2 + \phi^4 ,
\end{gathered} \tag{7.41d}$$

where in the last step Eq. (7.39b) is used to replace $E(X^2)$ and $E(X'^2)$ by $\gamma_x^2$. Examining the
discussion following Eq. (3.40c) in Chapter 3, we note that Eq. (3.41b) shows us that

$$E(X^2X'^2) = E(X^2)\,E(X'^2) + 2\,[E(XX')]^2$$
or, again using (7.39b),
$$E(X^2X'^2) = \gamma_x^4 + 2\,[E(XX')]^2 . \tag{7.41e}$$

Substituting (7.41e) into (7.41d) and then applying (7.36e) gives

$$E\big(\theta_x(\chi)^2\,\theta_x(\chi')^2\big) = \gamma_x^4 + 2\,o^{(xx)}(\chi'-\chi)^2 + 4\phi^2\,o^{(xx)}(\chi'-\chi) + 2\phi^2\gamma_x^2 + \phi^4$$
or
$$E\big(\theta_x(\chi)^2\,\theta_x(\chi')^2\big) = 2\,o^{(xx)}(\chi'-\chi)^2 + 4\phi^2\,o^{(xx)}(\chi'-\chi) + (\gamma_x^2 + \phi^2)^2 . \tag{7.41f}$$
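The result in Eq. (7.41f) can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes $X$ and $X'$ are zero-mean jointly normal with common standard deviation `gamma` and correlation coefficient `rho` (so that $o^{(xx)} = E(XX')$ equals `rho * gamma**2`), and it evaluates the expectation with Gauss-Hermite quadrature. The function names and test values are illustrative choices only.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def fourth_moment_quadrature(gamma, rho, phi, order=8):
    """E[(X + phi)^2 (X' + phi)^2] for zero-mean jointly normal X, X'.

    X and X' both have standard deviation gamma and correlation coefficient
    rho. The quadrature is exact here because the integrand is a low-degree
    polynomial in two independent standard normal variables.
    """
    nodes, weights = hermegauss(order)           # weight function exp(-z^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)     # normalize to a probability
    z1, z2 = np.meshgrid(nodes, nodes, indexing="ij")
    w = np.outer(weights, weights)
    x = gamma * z1
    x_prime = gamma * (rho * z1 + np.sqrt(1.0 - rho**2) * z2)
    return float(np.sum(w * (x + phi) ** 2 * (x_prime + phi) ** 2))


def fourth_moment_formula(gamma, rho, phi):
    """Right-hand side of Eq. (7.41f), with o_xx = E(X X') = rho * gamma^2."""
    o_xx = rho * gamma**2
    return 2.0 * o_xx**2 + 4.0 * phi**2 * o_xx + (gamma**2 + phi**2) ** 2


if __name__ == "__main__":
    for gamma, rho, phi in [(1.0, 0.3, 0.5), (2.0, -0.7, 1.5), (0.5, 0.0, 0.0)]:
        print(fourth_moment_quadrature(gamma, rho, phi),
              fourth_moment_formula(gamma, rho, phi))
```

The two printed columns should agree to rounding error for any admissible `gamma`, `rho`, and `phi`, which is a quick way to catch a sign or factor slip in the algebra.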

Having finished with (7.40a), we turn our attention to (7.40b). Again using Eqs. (7.36a) and
(7.36c) and the linearity of the expectation operator (see Sec. 3.10 in Chapter 3), we have

$$\begin{gathered}
E\big(\theta_x(\chi)^2\,\theta_y(\chi')^2\big) = E\big((X+\phi)^2\,Y'^2\big) = E\big(X^2Y'^2 + 2\phi XY'^2 + \phi^2Y'^2\big) \\
= E(X^2Y'^2) + 2\phi E(XY'^2) + \phi^2E(Y'^2) .
\end{gathered} \tag{7.42a}$$

Again Eqs. (3.35c) and (3.41b) in Chapter 3 can be applied to the jointly normal random
quantities $X$, $Y'$ to get
$$E(XY'^2) = 0 \tag{7.42b}$$
and
$$E(X^2Y'^2) = E(X^2)\,E(Y'^2) + 2\,[E(XY')]^2 . \tag{7.42c}$$

Equations (7.37a), (7.39b), and (7.39c) let us write (7.42c) as

$$E(X^2Y'^2) = \gamma_x^2\gamma_y^2 + 2\,o^{(xy)}(\chi'-\chi)^2 . \tag{7.42d}$$

Substituting (7.39c), (7.42b), and (7.42d) into (7.42a) gives

$$E\big(\theta_x(\chi)^2\,\theta_y(\chi')^2\big) = \gamma_x^2\gamma_y^2 + 2\,o^{(xy)}(\chi'-\chi)^2 + \phi^2\gamma_y^2$$
or
$$E\big(\theta_x(\chi)^2\,\theta_y(\chi')^2\big) = \gamma_y^2(\gamma_x^2 + \phi^2) + 2\,o^{(xy)}(\chi'-\chi)^2 . \tag{7.42e}$$

Equation (7.40c) is the easiest to evaluate. This time applying Eq. (3.41b) in Chapter 3 to the
jointly normal random quantities $Y$ and $Y'$, we can write

$$E(Y^2Y'^2) = E(Y^2)\,E(Y'^2) + 2\,[E(YY')]^2 . \tag{7.43a}$$

Substituting this into (7.40c), we get

$$E\big(\theta_y(\chi)^2\,\theta_y(\chi')^2\big) = E(Y^2)\,E(Y'^2) + 2\,[E(YY')]^2 ,$$
which becomes, using (7.39c) and (7.36f),

$$E\big(\theta_y(\chi)^2\,\theta_y(\chi')^2\big) = \gamma_y^4 + 2\,o^{(yy)}(\chi'-\chi)^2 . \tag{7.43b}$$

Now that (7.40a)–(7.40c) have been evaluated, the next step is to use them to find a formula
for $o_{nn}^{(\theta 2)}$ in terms of $o^{(xx)}$, $o^{(yy)}$, and $o^{(xy)}$. Substituting Eq. (7.8b) into (7.30a) gives

$$o_{nn}^{(\theta 2)}(\chi'-\chi) = E\Big(\big(\theta_{\mathrm{rms}}^2 - \theta(\chi)^2\big)\big(\theta_{\mathrm{rms}}^2 - \theta(\chi')^2\big)\Big) . \tag{7.44a}$$

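The product on the right-hand side can be expanded to get

$$o_{nn}^{(\theta 2)}(\chi'-\chi) = E\big(\theta(\chi)^2\theta(\chi')^2 - \theta_{\mathrm{rms}}^2\,\theta(\chi)^2 - \theta_{\mathrm{rms}}^2\,\theta(\chi')^2 + \theta_{\mathrm{rms}}^4\big) .$$

The linearity of the expectation operator [see Sec. 3.10 in Chapter 3 and also Eq. (3.9f)] lets this
be written as

$$o_{nn}^{(\theta 2)}(\chi'-\chi) = E\big(\theta(\chi)^2\theta(\chi')^2\big) - \theta_{\mathrm{rms}}^2\,E\big(\theta(\chi)^2\big) - \theta_{\mathrm{rms}}^2\,E\big(\theta(\chi')^2\big) + \theta_{\mathrm{rms}}^4 ,$$

which becomes, using Eq. (7.3c),

$$o_{nn}^{(\theta 2)}(\chi'-\chi) = E\big(\theta(\chi)^2\theta(\chi')^2\big) - \theta_{\mathrm{rms}}^4 . \tag{7.44b}$$

Substituting from Eq. (7.2c) now gives

$$o_{nn}^{(\theta 2)}(\chi'-\chi) = E\Big(\big[\theta_x(\chi)^2 + \theta_y(\chi)^2\big]\big[\theta_x(\chi')^2 + \theta_y(\chi')^2\big]\Big) - \theta_{\mathrm{rms}}^4 ,$$

which we can, following the same procedure as before, expand to get

$$\begin{gathered}
o_{nn}^{(\theta 2)}(\chi'-\chi) = E\big(\theta_x(\chi)^2\theta_x(\chi')^2\big) + E\big(\theta_x(\chi)^2\theta_y(\chi')^2\big) + E\big(\theta_x(\chi')^2\theta_y(\chi)^2\big) \\
+ E\big(\theta_y(\chi)^2\theta_y(\chi')^2\big) - \theta_{\mathrm{rms}}^4 .
\end{gathered} \tag{7.44c}$$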

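Applying Eqs. (7.41f), (7.42e), and (7.43b) to the right-hand side of (7.44c), with (7.42e) used
both "as is" and after interchanging $\chi$ and $\chi'$, gives

$$\begin{gathered}
o_{nn}^{(\theta 2)}(\chi'-\chi) = 2\,o^{(xx)}(\chi'-\chi)^2 + 4\phi^2\,o^{(xx)}(\chi'-\chi) + (\gamma_x^2+\phi^2)^2 \\
+ \gamma_y^2(\gamma_x^2+\phi^2) + 2\,o^{(xy)}(\chi'-\chi)^2 \\
+ \gamma_y^2(\gamma_x^2+\phi^2) + 2\,o^{(xy)}(\chi-\chi')^2 \\
+ \gamma_y^4 + 2\,o^{(yy)}(\chi'-\chi)^2 - \theta_{\mathrm{rms}}^4 \\
= 2\,o^{(xx)}(\chi'-\chi)^2 + 4\phi^2\,o^{(xx)}(\chi'-\chi) + (\gamma_x^2 + \phi^2 + \gamma_y^2)^2 \\
+ 2\,o^{(xy)}(\chi'-\chi)^2 + 2\,o^{(xy)}(\chi-\chi')^2 + 2\,o^{(yy)}(\chi'-\chi)^2 - \theta_{\mathrm{rms}}^4 .
\end{gathered}$$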

Glancing back to Eq. (7.3d), which identifies $\theta_{\mathrm{rms}}^2$ with $\phi^2 + \gamma_x^2 + \gamma_y^2$ so that the
constant terms cancel, we see that this simplifies to

$$\begin{gathered}
o_{nn}^{(\theta 2)}(\chi'-\chi) = 2\,o^{(xx)}(\chi'-\chi)^2 + 2\,o^{(yy)}(\chi'-\chi)^2 + 4\phi^2\,o^{(xx)}(\chi'-\chi) \\
+ 2\,o^{(xy)}(\chi'-\chi)^2 + 2\,o^{(xy)}(\chi-\chi')^2 .
\end{gathered} \tag{7.44d}$$

This is what we want, a formula for $o_{nn}^{(\theta 2)}$ in terms of $o^{(xx)}$, $o^{(yy)}$, and $o^{(xy)}$.
The final step is to apply the Fourier transform to Eq. (7.44d). We define $\chi'' = \chi' - \chi$ and
write

$$\begin{gathered}
o_{nn}^{(\theta 2)}(\chi'') = 2\,o^{(xx)}(\chi'')^2 + 2\,o^{(yy)}(\chi'')^2 + 4\phi^2\,o^{(xx)}(\chi'') \\
+ 2\,o^{(xy)}(\chi'')^2 + 2\,o^{(xy)}(-\chi'')^2 .
\end{gathered}$$

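Dropping the primes and taking the Fourier transform of both sides gives, using the linearity of
the Fourier transform described in Sec. 2.6 of Chapter 2,

$$\begin{gathered}
\mathcal{F}^{(-i\sigma\chi)}\big(o_{nn}^{(\theta 2)}(\chi)\big) = 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)^2\big) + 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(yy)}(\chi)^2\big) \\
+ 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) \\
+ 4\phi^2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)\big) .
\end{gathered} \tag{7.45a}$$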
The Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] lets us write

$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)^2\big) = \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)\big) \ast \mathcal{F}^{(-i\sigma\chi')}\big(o^{(xx)}(\chi')\big)$$
and
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(yy)}(\chi)^2\big) = \mathcal{F}^{(-i\sigma\chi)}\big(o^{(yy)}(\chi)\big) \ast \mathcal{F}^{(-i\sigma\chi')}\big(o^{(yy)}(\chi')\big) .$$
Equations (7.36g) and (7.36h) then give

$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)^2\big) = p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) , \tag{7.45b}$$
and
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(yy)}(\chi)^2\big) = p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) . \tag{7.45c}$$
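The step from a squared correlation function to a self-convolved power spectrum has a direct discrete analogue, which can be checked numerically. The sketch below is only an illustration: it uses the discrete Fourier transform and circular convolution in place of the continuous transform in Chapter 2, and the Gaussian stand-in for the correlation function, the grid size, and the function names are all assumptions made for the example.

```python
import numpy as np


def circular_self_convolution(spectrum):
    """Return (1/N) * sum_m S[m] * S[(k - m) mod N] for every index k."""
    n = len(spectrum)
    # idx[k, m] = (k - m) mod N
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return (spectrum[None, :] * spectrum[idx]).sum(axis=1) / n


def demo(n=64, width=3.0):
    """Compare the DFT of o(chi)^2 with the self-convolution of the DFT of o."""
    chi = np.arange(n) - n // 2
    o = np.exp(-0.5 * (chi / width) ** 2)        # stand-in correlation function
    lhs = np.fft.fft(o**2)                       # transform of the squared function
    rhs = circular_self_convolution(np.fft.fft(o))
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = demo()
    print("largest difference:", np.max(np.abs(lhs - rhs)))
```

The factor 1/N in the circular convolution is specific to the unnormalized DFT convention used by `numpy.fft`; the continuous-transform statement in the text carries no such factor.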

Equation (7.36g) needs to be substituted directly into our formula, so we drop the primes and
rewrite it as
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)\big) = p^{(xx)}(\sigma) . \tag{7.45d}$$

Now Eqs. (7.45b)–(7.45d) can be substituted into (7.45a) to get

$$\begin{gathered}
\mathcal{F}^{(-i\sigma\chi)}\big(o_{nn}^{(\theta 2)}(\chi)\big) = 2\,[p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma)] \\
+ 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) \\
+ 4\phi^2\,p^{(xx)}(\sigma) ,
\end{gathered}$$

which becomes, applying (7.30b) to the left-hand side,

$$\begin{gathered}
p_{nn}^{(\theta 2)}(\sigma) = 2\,[p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma)] + 4\phi^2\,p^{(xx)}(\sigma) \\
+ 2\,\Big\{ \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) \Big\} .
\end{gathered} \tag{7.45e}$$
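The term inside the braces { }, which is the last term on the right-hand side of (7.45e), can be
simplified if we write the Fourier transforms as integrals. Defining $\chi''' = -\chi$ lets us write

$$\begin{gathered}
\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) \\
= \int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\, e^{-2\pi i\sigma\chi}\, d\chi - \int_{\infty}^{-\infty} o^{(xy)}(\chi''')^2\, e^{2\pi i\sigma\chi'''}\, d\chi''' \\
= \int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\, e^{-2\pi i\sigma\chi}\, d\chi + \int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\, e^{2\pi i\sigma\chi}\, d\chi .
\end{gathered}$$

Glancing back at the definition of $o^{(xy)}$ in Eq. (7.37a), we note that $o^{(xy)}$ is real, which makes
the second integral,

$$\int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\, e^{2\pi i\sigma\chi}\, d\chi ,$$
the complex conjugate of the first,

$$\int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\, e^{-2\pi i\sigma\chi}\, d\chi .$$
Hence

$$\begin{gathered}
\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) \\
= 2\,\mathrm{Re} \int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\, e^{-2\pi i\sigma\chi}\, d\chi = 2\,\mathrm{Re}\Big[\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big)\Big] .
\end{gathered}$$

Applying Eq. (2.39j) in Chapter 2 now gives

$$\begin{gathered}
\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) \\
= 2\,\mathrm{Re}\Big[\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)\big) \ast \mathcal{F}^{(-i\sigma\chi')}\big(o^{(xy)}(\chi')\big)\Big] ,
\end{gathered}$$

which becomes, applying Eq. (7.37b),

$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) = 2\,\mathrm{Re}\big[p^{(xy)}(\sigma) \ast p^{(xy)}(\sigma)\big] . \tag{7.45f}$$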

Equation (7.45f) can now be substituted into (7.45e) to get

$$\begin{gathered}
p_{nn}^{(\theta 2)}(\sigma) = 2\,[p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma)] \\
+ 4\phi^2\,p^{(xx)}(\sigma) + 4\,\mathrm{Re}\big[p^{(xy)}(\sigma) \ast p^{(xy)}(\sigma)\big] .
\end{gathered} \tag{7.45g}$$

This, surprisingly enough, is the result we need in order to learn something about the likely shape
of the $p_{nn}^{(\theta 2)}$ noise-power spectrum.

7.13 The Shape of the $p_{nn}^{(\theta 2)}$ Power Spectrum

If we return to the ideal case where $\theta_x$ and $\theta_y$ are taken to be independent random variables, then
Eq. (3.11b) in Chapter 3 can be used to write

$$E\big(\theta_x(\chi)\,\theta_y(\chi')\big) = E\big(\theta_x(\chi)\big) \cdot E\big(\theta_y(\chi')\big) = 0$$
because $\theta_y(\chi')$ is, according to Eq. (7.2g), a zero-mean random variable:

$$E\big(\theta_y(\chi')\big) = 0 .$$

Similarly, according to Eqs. (7.36a) and (7.36d), we have, using the linearity of the expectation
operator $E$ described in Sec. 3.10 of Chapter 3,

$$E(XY') = E\Big(\big(\theta_x(\chi) - \phi\big) \cdot \theta_y(\chi')\Big) = E\big(\theta_x(\chi)\big) \cdot E\big(\theta_y(\chi')\big) - \phi\,E\big(\theta_y(\chi')\big) = 0$$

again because $\theta_y(\chi')$ is a zero-mean random variable. Consequently $o^{(xy)}(\chi'-\chi)$ in Eq. (7.37a)
above is zero and so is its Fourier transform $p^{(xy)}(\sigma)$ defined in Eq. (7.37b). The formula for
$p_{nn}^{(\theta 2)}(\sigma)$ in Eq. (7.45g) now becomes

$$p_{nn}^{(\theta 2)}(\sigma) = 2\,[p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma)] + 4\phi^2\,p^{(xx)}(\sigma) . \tag{7.46}$$

We can recognize two extreme cases for the right-hand side of formula (7.46): one where $\phi^2$ is
relatively large compared to $p^{(xx)} \ast p^{(xx)}$ and $p^{(yy)} \ast p^{(yy)}$, and one where $\phi^2$ is relatively small
compared to $p^{(xx)} \ast p^{(xx)}$ and $p^{(yy)} \ast p^{(yy)}$. When the squared bias angle $\phi^2$ is relatively large,

$$p_{nn}^{(\theta 2)}(\sigma) \cong 4\phi^2\,p^{(xx)}(\sigma) , \tag{7.47a}$$
and when $\phi^2$ is relatively small,

$$p_{nn}^{(\theta 2)}(\sigma) \cong 2\,[p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma)] . \tag{7.47b}$$

Equation (7.47a) shows that, when $\phi^2$ is large, $p_{nn}^{(\theta 2)}$ is proportional to $p^{(xx)}$ so it has the same
basic shape as $p^{(xx)}$; and, since $p^{(xx)}$ is taken to be the standard power spectrum of an ordinary
wide-sense stationary random variable, we expect $p_{nn}^{(\theta 2)}$ also to have a shape appropriate to an
ordinary wide-sense stationary random variable. When $\phi^2$ is small, however, Eq. (7.47b) shows
that $p_{nn}^{(\theta 2)}$ is the sum of the convolutions of the $p^{(xx)}$ and $p^{(yy)}$ power spectra with themselves.
Before going any further, we pause to examine more carefully what it means to say that $\phi^2$ is
large or small compared to $p^{(xx)} \ast p^{(xx)}$ or $p^{(yy)} \ast p^{(yy)}$. Suppose that $\sigma_{\mathrm{spread}}^{(xx)}$ is the length of
$\sigma$ axis over which $p^{(xx)}(\sigma)$ is significantly different from zero and that $\sigma_{\mathrm{spread}}^{(yy)}$ is the length of
$\sigma$ axis over which $p^{(yy)}(\sigma)$ is significantly different from zero. Similarly, we say that $p_{\mathrm{typ}}^{(xx)}$ is the
typical scale size of $p^{(xx)}$ and $p_{\mathrm{typ}}^{(yy)}$ is the typical scale size of $p^{(yy)}$. Then the scale size of the
convolutions $p^{(xx)} \ast p^{(xx)}$ and $p^{(yy)} \ast p^{(yy)}$ can be approximated as [see the definition of the
convolution in Eq. (2.38a) of Chapter 2]

$$p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) = \int_{-\infty}^{\infty} p^{(xx)}(\sigma')\,p^{(xx)}(\sigma - \sigma')\,d\sigma' \cong \big(p_{\mathrm{typ}}^{(xx)}\big)^2\,\sigma_{\mathrm{spread}}^{(xx)} \tag{7.47c}$$
and
$$p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) = \int_{-\infty}^{\infty} p^{(yy)}(\sigma')\,p^{(yy)}(\sigma - \sigma')\,d\sigma' \cong \big(p_{\mathrm{typ}}^{(yy)}\big)^2\,\sigma_{\mathrm{spread}}^{(yy)} . \tag{7.47d}$$
5

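From Eqs. (7.36i) and (7.36j), it follows that

$$o^{(xx)}(0) = \int_{-\infty}^{\infty} p^{(xx)}(\sigma)\,d\sigma \quad \text{and} \quad o^{(yy)}(0) = \int_{-\infty}^{\infty} p^{(yy)}(\sigma)\,d\sigma ,$$

which can be approximated as

$$o^{(xx)}(0) \cong p_{\mathrm{typ}}^{(xx)}\,\sigma_{\mathrm{spread}}^{(xx)} \tag{7.47e}$$
and
$$o^{(yy)}(0) \cong p_{\mathrm{typ}}^{(yy)}\,\sigma_{\mathrm{spread}}^{(yy)} . \tag{7.47f}$$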

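From the definitions of $o^{(xx)}(0)$ and $o^{(yy)}(0)$ in Eqs. (7.36e) and (7.36f), we know that

$$o^{(xx)}(0) = E(X^2) \quad \text{and} \quad o^{(yy)}(0) = E(Y^2)$$

because, according to (7.36a)–(7.36d), $X = X'$ and $Y = Y'$ when $\chi = \chi'$. Hence, (7.47e) and
(7.47f) can also be written as

$$E(X^2) \cong p_{\mathrm{typ}}^{(xx)}\,\sigma_{\mathrm{spread}}^{(xx)} \quad \text{and} \quad E(Y^2) \cong p_{\mathrm{typ}}^{(yy)}\,\sigma_{\mathrm{spread}}^{(yy)} ,$$

which, after substituting from (7.39b) and (7.39c), simplifies to

$$\gamma_x^2 \cong p_{\mathrm{typ}}^{(xx)}\,\sigma_{\mathrm{spread}}^{(xx)} \tag{7.47g}$$
and
$$\gamma_y^2 \cong p_{\mathrm{typ}}^{(yy)}\,\sigma_{\mathrm{spread}}^{(yy)} . \tag{7.47h}$$

This means that the approximations in (7.47c) and (7.47d) can be written as

$$p^{(xx)}(\sigma) \ast p^{(xx)}(\sigma) \cong p_{\mathrm{typ}}^{(xx)}\,\gamma_x^2 \tag{7.47i}$$
and
$$p^{(yy)}(\sigma) \ast p^{(yy)}(\sigma) \cong p_{\mathrm{typ}}^{(yy)}\,\gamma_y^2 . \tag{7.47j}$$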

Hence, when we say in Eq. (7.46) that the product $\phi^2 p^{(xx)}$ is large or small compared to
$p^{(xx)} \ast p^{(xx)}$ or $p^{(yy)} \ast p^{(yy)}$, it is the same as saying that

$$\phi^2\,p_{\mathrm{typ}}^{(xx)}$$

is large or small compared to $p_{\mathrm{typ}}^{(xx)}\gamma_x^2$ and $p_{\mathrm{typ}}^{(yy)}\gamma_y^2$. Assuming that $p_{\mathrm{typ}}^{(xx)} \cong p_{\mathrm{typ}}^{(yy)}$, it follows that
the relevant comparison is between $\phi^2$ and $\gamma_{x,y}^2$ or, taking the square root, between $\phi$ and $\gamma_{x,y}$. If
$\phi$ is much larger than $\gamma_x$ and $\gamma_y$, then Eq. (7.46) reduces to approximation (7.47a); and if $\phi$ is
much smaller than $\gamma_x$ and $\gamma_y$, then Eq. (7.46) reduces to approximation (7.47b).
Working with approximation (7.47b) first, we note that, according to inequality (3.54g) in
Chapter 3, a power spectrum is never negative, and Eq. (3.49b) in Chapter 3 reminds us that
power spectra are even functions of their arguments. Equation (7.47b) requires $p_{nn}^{(\theta 2)}$ to be the
sum of two terms, with each term being twice the convolution of a power spectrum with itself. This
means that $p_{nn}^{(\theta 2)}$ ought to share the same general shape as the convolution of any non-negative
and even power spectrum with itself. Experience shows that noise-power spectra tend to have one
of the two types of shape shown by the dashed lines in Figs. 7.2(a) or 7.2(b). Section 3.25 in
Chapter 3 describes what is meant by band-limited white noise, and the dashed line in Fig. 7.2(a)
depicts a power spectrum closely resembling this ideal case. The other type of shape often seen is
what could be called quasi-harmonic noise, which has a power spectrum containing multiple
narrow peaks. This is the type of spectrum shown by the dashed lines in Fig. 7.2(b). In a way,
band-limited white noise and quasi-harmonic noise represent the two possible extreme cases, or
"opposite types" of noise, which could describe the behavior of the $X$ and $Y$ random
quantities. The solid line in Fig. 7.2(a) shows what the convolution of the dashed plot in Fig.
7.2(a) with itself looks like, and the solid line in Fig. 7.2(b) shows what the convolution of the
dashed plot in Fig. 7.2(b) with itself looks like. These solid lines show that the convolution of
either extreme case with itself has a shape with a large central hump. Consequently, we expect
$p_{nn}^{(\theta 2)}$ also to have a large central hump when it obeys (7.47b), so it makes sense, when picking a
generic shape for a $p_{nn}^{(\theta 2)}(\sigma)$ power spectrum described by formula (7.47b), to choose a
Gaussian function:

$$p_{nn}^{(\theta 2)}(\sigma) = \alpha\,e^{-\sigma^2/(2s^2)} . \tag{7.48}$$

In this formula, both $\alpha$ and $s$ are positive real numbers.
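The central-hump behavior described above can be reproduced with a few lines of numerical convolution. The sketch below is an illustration under stated assumptions, not a reconstruction of the curves in Figs. 7.2(a) and 7.2(b): the rectangular test spectra, their widths and positions, and the grid are arbitrary choices made for the example.

```python
import numpy as np


def box(sigma, center, half_width):
    """Rectangular band: 1 where |sigma - center| <= half_width, else 0."""
    return np.where(np.abs(sigma - center) <= half_width, 1.0, 0.0)


def self_convolution(f, d_sigma):
    """Riemann-sum approximation of (f * f) on the doubled grid."""
    return np.convolve(f, f, mode="full") * d_sigma


N_POINTS = 801
sigma = np.linspace(-4.0, 4.0, N_POINTS)             # symmetric grid, step 0.01
d_sigma = sigma[1] - sigma[0]
sigma_conv = np.linspace(-8.0, 8.0, 2 * N_POINTS - 1)

# Half-widths end in 5 so that the band edges fall between grid points.
white = box(sigma, 0.0, 1.005)                                 # band-limited white noise
harmonic = box(sigma, 2.0, 0.105) + box(sigma, -2.0, 0.105)    # two narrow peaks

white_conv = self_convolution(white, d_sigma)
harmonic_conv = self_convolution(harmonic, d_sigma)

if __name__ == "__main__":
    for name, conv in [("band-limited", white_conv), ("quasi-harmonic", harmonic_conv)]:
        print(name, "self-convolution peaks at sigma =", sigma_conv[np.argmax(conv)])
```

For the two-peak spectrum the self-convolution has three humps, and the central one is twice as tall as the outer two because both peaks contribute to it; this is the feature that motivates a centrally peaked generic shape.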


Formula (7.47a), the other approximation for $p_{nn}^{(\theta 2)}$, forces it to have the same basic shape as
the $p^{(xx)}$ power spectrum. In this situation, it makes sense to assume that $p_{nn}^{(\theta 2)}$ and $p^{(xx)}$ have the
simple quasi-harmonic shape depicted in Fig. 7.2(c), just to see what the misalignment NEdN
looks like when, unlike Eq. (7.48), the largest $p_{nn}^{(\theta 2)}$ values are far away from the $\sigma = 0$ origin.

FIGURE 7.2(a). [Plot: dashed and solid curves; horizontal axis from −3 to 3.]

FIGURE 7.2(b). [Plot: dashed and solid curves; horizontal axis from −6 to 6.]

The dashed lines in Figs. 7.2(a) and 7.2(b) represent function $f(\sigma)$ and the solid lines
represent the convolution of function $f(\sigma)$ with itself; that is, they represent $f(\sigma) \ast f(\sigma)$. Figure
7.2(a) shows what happens to a smooth $f(\sigma)$ localized near the origin and Fig. 7.2(b) shows
what happens to $f(\sigma)$ when it consists of multiple, isolated peaks.

FIGURE 7.2(c). [Plot of $p_0^{(\theta 2)}$ or $p_0^{(xx)}$ against $\sigma$ (in cm$^{-1}$); the $\sigma$ axis is marked at $-(\sigma_C + \sigma_M)$, $-\sigma_C$, $\sigma_C$, and $\sigma_C + \sigma_M$.]

This plot shows the shape with respect to $\sigma$ of functions $p_{nn}^{(\theta 2)}(\sigma)$ and $p^{(xx)}(\sigma)$ for the quasi-harmonic
formulas in Eqs. (7.49a) and (7.49b).
The quasi-harmonic power spectral shape in Fig. 7.2(c) can be specified by

$$p^{(xx)}(\sigma) = p_0^{(xx)} \cdot \left[ \Pi\!\left(\sigma - \left(\sigma_C + \frac{\sigma_M}{2}\right),\ \frac{\sigma_M}{2}\right) + \Pi\!\left(\sigma + \left(\sigma_C + \frac{\sigma_M}{2}\right),\ \frac{\sigma_M}{2}\right) \right] \tag{7.49a}$$

when referring to the noise-power spectrum of $X = \theta_x(\chi) - \phi$ in Eq. (7.36a) and by

$$p_{nn}^{(\theta 2)}(\sigma) = p_0^{(\theta 2)} \cdot \left[ \Pi\!\left(\sigma - \left(\sigma_C + \frac{\sigma_M}{2}\right),\ \frac{\sigma_M}{2}\right) + \Pi\!\left(\sigma + \left(\sigma_C + \frac{\sigma_M}{2}\right),\ \frac{\sigma_M}{2}\right) \right] \tag{7.49b}$$

when referring to the noise-power spectrum of $n^{(\theta 2)}(\chi)$ from Eq. (7.8b). In both formulas $p_0^{(xx)}$,
$p_0^{(\theta 2)}$, $\sigma_C$, and $\sigma_M$ are positive real parameters; and both power spectra have the same shape
(only their maximum values $p_0^{(xx)}$ and $p_0^{(\theta 2)}$ are different). The $\Pi$ function has the same formula
as in Eq. (7.12a) above:

$$\Pi(\sigma_a, \sigma_b) = 1 \ \ \text{for}\ |\sigma_a| \le \sigma_b , \qquad \Pi(\sigma_a, \sigma_b) = 0 \ \ \text{for}\ |\sigma_a| > \sigma_b .$$

7.14 The Size of the $p_{nn}^{(\theta 2)}$ Power Spectrum

Having chosen either (7.48) or (7.49b) to specify the shape of the $p_{nn}^{(\theta 2)}$ power spectrum, we turn
to Eq. (7.31b) to connect the amplitude of $p_{nn}^{(\theta 2)}$ to its spread in wavenumbers. Substitution of Eq.
(7.3d) into (7.31b) gives

$$E\big(\theta(\chi)^4\big) - (\phi^2 + \gamma_x^2 + \gamma_y^2)^2 = \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\,d\sigma , \tag{7.50a}$$

where $\phi$, $\gamma_x$, $\gamma_y$ are respectively the bias angle, the standard deviation of the $\theta_x$ component of
the random misalignment angle, and the standard deviation of the $\theta_y$ component of the random
misalignment angle. All three quantities are, of course, measured in radians. At the beginning of
the previous section, we specified $\theta_x$ and $\theta_y$ to be independent random quantities; and the
derivation of Eq. (7.45g) in Sec. 7.12 assumes that both $\theta_x$ and $\theta_y$ are normally distributed
random variables [obeying probability density distributions of the type shown in Eqs. (7.2h) and
(7.2i) above]. Equation (7.4d) thus requires that

$$E\big(\theta(\chi)^4\big) = 3\gamma_x^4 + 6\phi^2\gamma_x^2 + \phi^4 + 3\gamma_y^4 + 2(\phi^2 + \gamma_x^2)\gamma_y^2 ,$$

which can be put into Eq. (7.50a) to get

$$\int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\,d\sigma = -(\phi^2 + \gamma_x^2 + \gamma_y^2)^2 + 3(\gamma_x^4 + \gamma_y^4) + 6\phi^2\gamma_x^2 + \phi^4 + 2\gamma_y^2(\phi^2 + \gamma_x^2)$$
or
$$\int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\,d\sigma = 2\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) . \tag{7.50b}$$
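The algebra leading from Eq. (7.50a) to Eq. (7.50b) is easy to get wrong by a sign or a factor of two, so a quick exact-arithmetic check is worthwhile. The following sketch uses Python's `fractions` module with arbitrary rational test values; the function names are illustrative, and the fourth-moment expression is simply the one quoted above from Eq. (7.4d).

```python
from fractions import Fraction


def fourth_moment(phi, gx, gy):
    """E(theta^4) as quoted in the text from Eq. (7.4d)."""
    return (3 * gx**4 + 6 * phi**2 * gx**2 + phi**4
            + 3 * gy**4 + 2 * (phi**2 + gx**2) * gy**2)


def integrated_power(phi, gx, gy):
    """Left-hand side of Eq. (7.50a): E(theta^4) - (phi^2 + gx^2 + gy^2)^2."""
    theta_rms_sq = phi**2 + gx**2 + gy**2
    return fourth_moment(phi, gx, gy) - theta_rms_sq**2


def closed_form(phi, gx, gy):
    """Right-hand side of Eq. (7.50b)."""
    return 2 * (gx**4 + gy**4 + 2 * phi**2 * gx**2)


if __name__ == "__main__":
    samples = [(Fraction(1, 3), Fraction(2, 5), Fraction(7, 11)),
               (Fraction(0), Fraction(1, 2), Fraction(1, 2)),
               (Fraction(5, 4), Fraction(1, 7), Fraction(0))]
    for phi, gx, gy in samples:
        print(integrated_power(phi, gx, gy) == closed_form(phi, gx, gy))
```

Because the inputs are exact rationals, the comparison is an identity test rather than a floating-point approximation.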

Equation (7.50b) is the formula we need to connect the size of the proposed $p_{nn}^{(\theta 2)}$ spectral shape
to its spread in wavenumbers.
When the Gaussian shape in formula (7.48) is chosen, we note that parameter $\alpha$ specifies the
size of the spectrum and parameter $s$ determines the spectral spread in wavenumbers. Substituting
(7.48) into (7.50b) gives

$$\int_{-\infty}^{\infty} \alpha\,e^{-\sigma^2/(2s^2)}\,d\sigma = 2\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) . \tag{7.51a}$$

Equation (7A.3d) in Appendix 7A can be written, with $\sigma$ as the integration variable and $s$ as the
width parameter, as

$$\int_{-\infty}^{\infty} e^{-\sigma^2/(2s^2)}\,d\sigma = s\sqrt{2\pi} . \tag{7.51b}$$

Applying this to (7.51a), we find that

$$\alpha\,s\sqrt{2\pi} = 2\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) ,$$

which can be solved for $\alpha$ to get

$$\alpha = \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) . \tag{7.51c}$$

This is the expected connection between the size of the noise-power spectrum and its spread in
wavenumbers. Glancing back at the discussion following Eq. (7.47j), we recall that the Gaussian
spectral shape in (7.48) stems from an assumption that the bias angle $\phi$ is small compared to $\gamma_x$
and $\gamma_y$. Hence (7.51c) can be approximated as

$$\alpha \cong \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\gamma_x^4 + \gamma_y^4) . \tag{7.51d}$$
s &

Formula (7.51c) can be substituted back into (7.48) to get

$$p_{nn}^{(\theta 2)}(\sigma) = \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2)\,e^{-\sigma^2/(2s^2)} . \tag{7.51e}$$

Using that $\phi$ is small compared to $\gamma_x$ and $\gamma_y$, we substitute (7.51d) into (7.48), or just neglect $\phi$ in
(7.51e), to get

$$p_{nn}^{(\theta 2)}(\sigma) \cong \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\gamma_x^4 + \gamma_y^4)\,e^{-\sigma^2/(2s^2)} . \tag{7.51f}$$
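The link between the Gaussian amplitude and the misalignment statistics can be exercised numerically. The sketch below computes the amplitude from Eq. (7.51c) and then integrates the Gaussian of Eq. (7.48) with a plain trapezoid rule to confirm that the area matches the right-hand side of Eq. (7.50b). The sample values ($s$ = 200 cm$^{-1}$, $\gamma_x$ = 10$^{-5}$ rad, $\gamma_y$ = 0, $\phi$ = 0) are the ones adopted for the simulation in Sec. 7.15; the function names and the integration settings are assumptions of this example.

```python
import math


def gaussian_amplitude(s, gamma_x, gamma_y, phi=0.0):
    """Amplitude alpha from Eq. (7.51c), in cm * rad^4 when s is in cm^-1."""
    total = gamma_x**4 + gamma_y**4 + 2.0 * phi**2 * gamma_x**2
    return math.sqrt(2.0 / math.pi) * total / s


def integrated_gaussian(alpha, s, n_steps=20001, span=10.0):
    """Trapezoid-rule integral of alpha * exp(-sigma^2 / (2 s^2)) over +/- span * s."""
    step = 2.0 * span * s / (n_steps - 1)
    total = 0.0
    for k in range(n_steps):
        sigma = -span * s + k * step
        weight = 0.5 if k in (0, n_steps - 1) else 1.0
        total += weight * alpha * math.exp(-sigma**2 / (2.0 * s**2))
    return total * step


if __name__ == "__main__":
    s, gamma_x, gamma_y = 200.0, 1.0e-5, 0.0
    alpha = gaussian_amplitude(s, gamma_x, gamma_y)
    print("alpha =", alpha)
    print("integral =", integrated_gaussian(alpha, s))
    print("2 * (gamma_x^4 + gamma_y^4) =", 2.0 * (gamma_x**4 + gamma_y**4))
```

Truncating the integral at ten widths is more than enough here, since the Gaussian tail beyond that point is negligible at double precision.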

The quasi-harmonic shape for $p_{nn}^{(\theta 2)}$ specified in Eq. (7.49b) stems from the assumption that
the bias angle $\phi$ is large compared to $\gamma_x$ and $\gamma_y$. Here the spread in wavenumbers is specified by
parameter $\sigma_M$ and the size of the power spectrum is determined by $p_0^{(\theta 2)}$. Substituting (7.49b)
into (7.50b) gives
$$2\sigma_M\,p_0^{(\theta 2)} = 2\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2)$$
or
$$p_0^{(\theta 2)} = \frac{1}{\sigma_M}\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) . \tag{7.52a}$$

This is the connection between size $p_0^{(\theta 2)}$ and wavenumber spread $\sigma_M$ for the spectral shape
specified in (7.49b). Because now we are assuming that $\phi$ is large compared to $\gamma_x$, the formula
can be approximated as
$$p_0^{(\theta 2)} \cong \frac{2\phi^2\gamma_x^2}{\sigma_M} . \tag{7.52b}$$
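To see how close the large-bias approximation is to the exact amplitude, the two formulas can be evaluated side by side. In the sketch below the function names are illustrative, and the sample values ($\phi$ = 10$^{-5}$ rad, $\gamma_x$ = $\gamma_y$ = 10$^{-6}$ rad, $\sigma_C$ = 100 cm$^{-1}$, $\sigma_M$ = 20 cm$^{-1}$) are of the same size as those used for the Lorentz-line simulation in Sec. 7.15; setting $\gamma_y$ exactly equal to $\gamma_x$ is an assumption of this example. The rectangular shape follows Eq. (7.49b).

```python
def pi_function(sigma_a, sigma_b):
    """Rectangle function: 1 when |sigma_a| <= sigma_b, otherwise 0."""
    return 1.0 if abs(sigma_a) <= sigma_b else 0.0


def p0_exact(phi, gamma_x, gamma_y, sigma_m):
    """Amplitude from Eq. (7.52a)."""
    return (gamma_x**4 + gamma_y**4 + 2.0 * phi**2 * gamma_x**2) / sigma_m


def p0_large_bias(phi, gamma_x, sigma_m):
    """Large-bias approximation from Eq. (7.52b)."""
    return 2.0 * phi**2 * gamma_x**2 / sigma_m


def quasi_harmonic_spectrum(sigma, p0, sigma_c, sigma_m):
    """Two rectangular bands of height p0, following the shape in Eq. (7.49b)."""
    center = sigma_c + sigma_m / 2.0
    half = sigma_m / 2.0
    return p0 * (pi_function(sigma - center, half) + pi_function(sigma + center, half))


if __name__ == "__main__":
    phi, gamma, sigma_c, sigma_m = 1.0e-5, 1.0e-6, 100.0, 20.0
    exact = p0_exact(phi, gamma, gamma, sigma_m)
    approx = p0_large_bias(phi, gamma, sigma_m)
    print("exact p0:", exact, " approximate p0:", approx)
    for sigma in (-110.0, 0.0, 50.0, 110.0, 130.0):
        print(sigma, quasi_harmonic_spectrum(sigma, approx, sigma_c, sigma_m))
```

With a bias angle ten times the random spread, the approximation differs from the exact amplitude by about one percent, which is the size of the neglected $\gamma^4$ terms.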

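Formula (7.52a) can be applied to Eq. (7.49b) to get

$$p_{nn}^{(\theta 2)}(\sigma) = \frac{\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2}{\sigma_M} \cdot \left[ \Pi\!\left(\sigma - \left(\sigma_C + \frac{\sigma_M}{2}\right),\ \frac{\sigma_M}{2}\right) + \Pi\!\left(\sigma + \left(\sigma_C + \frac{\sigma_M}{2}\right),\ \frac{\sigma_M}{2}\right) \right] \tag{7.52c}$$

or, again using that $\phi$ is large compared to $\gamma_x$ and $\gamma_y$,

$$p_{nn}^{(\theta 2)}(\sigma) \cong \frac{2\phi^2\gamma_x^2}{\sigma_M} \cdot \left[ \Pi\!\left(\sigma - \left(\sigma_C + \frac{\sigma_M}{2}\right),\ \frac{\sigma_M}{2}\right) + \Pi\!\left(\sigma + \left(\sigma_C + \frac{\sigma_M}{2}\right),\ \frac{\sigma_M}{2}\right) \right] \tag{7.52d}$$

for the quasi-harmonic $p_{nn}^{(\theta 2)}$ noise-power spectrum.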

7.15 Simulated Misalignment Noise


To show how misalignment noise can disturb the measurements of Michelson interferometers, we
simulate misalignment-contaminated measurements of both a black-body spectrum and an
isolated Lorentz emission line. For the misalignment-contaminated black-body measurements, we
use a Gaussian noise-power spectrum such as the one specified in Eq. (7.48), and for the Lorentz
emission line we use the quasi-harmonic noise-power spectrum specified in Eq. (7.49b) and
graphed in Fig. 7.2(c). In both cases the $\theta_x$ and $\theta_y$ components of the misalignment angle are
taken to be independent random variables.
When generating the black-body measurements, the simulated interferometer samples the
interferogram at $N = 8192$ evenly spaced positions between the optical-path differences of
$-D$ and $+D$, with $D = 1.28$ cm. Glancing back at Eq. (5.67) in Chapter 5, we see that the
unapodized spectral resolution is now

$$\frac{1}{2D} \cong 0.391\ \mathrm{cm}^{-1} \tag{7.53a}$$

and the optical-path difference between the evenly spaced samples of the interferogram signal is

$$\frac{2D}{N} = 3.125 \times 10^{-4}\ \mathrm{cm} . \tag{7.53b}$$
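The two sampling figures above follow directly from $D$ and $N$, as the short sketch below shows. The function names are illustrative. The Nyquist wavenumber computed at the end is not quoted in the text; it is the standard sampling-theorem limit $1/(2\,\Delta\chi)$ for a sample spacing $\Delta\chi$, included here as an added sanity check on the sampling.

```python
import math


def unapodized_resolution(max_opd_cm):
    """Unapodized spectral resolution 1 / (2 D), in cm^-1."""
    return 1.0 / (2.0 * max_opd_cm)


def sample_spacing(max_opd_cm, n_samples):
    """Optical-path difference between adjacent samples, 2 D / N, in cm."""
    return 2.0 * max_opd_cm / n_samples


def nyquist_wavenumber(spacing_cm):
    """Largest unaliased wavenumber for the given sample spacing, in cm^-1."""
    return 1.0 / (2.0 * spacing_cm)


if __name__ == "__main__":
    D, N = 1.28, 8192
    spacing = sample_spacing(D, N)
    print("resolution (cm^-1):", unapodized_resolution(D))
    print("sample spacing (cm):", spacing)
    print("Nyquist wavenumber (cm^-1):", nyquist_wavenumber(spacing))
```

The Nyquist limit comes out above the 650 to 1150 cm$^{-1}$ band that the simulated instrument measures, so the chosen sampling does not alias that band.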

The background radiance is assumed to be negligible, so

$$L^{(dir)}(\sigma) = L_{FOV}^{(fore)}(\sigma) = L_{FOV}^{(back)}(\sigma) = L_{mnf}^{(fore)}(\sigma) = L_{mnf}^{(back)}(\sigma) = 0 . \tag{7.53c}$$

We might as well give the responsivity $R$ and the optical parameters $\tau_a$, $\tau_f$, $\eta$ their ideal values

$$R(\sigma) = 1\ \frac{\mathrm{amp} \cdot \mathrm{sec}}{\mathrm{erg}} \tag{7.53d}$$
and
$$\tau_a(\sigma) = \tau_f(\sigma) = \eta(\sigma) = 1 , \tag{7.53e}$$

because in formula (7.34b) they just end up rescaling the spectral noise to turn it into NEdNtilt.
The beam passing through the interferometer has a circular cross section of radius $R = 3$ cm, so
according to Eq. (7.2b)

$$a = 2\pi^2 R^2 = 177.65\ \mathrm{cm}^2 \tag{7.53f}$$

and, of course, the beam cross-sectional area is

$$A = \pi R^2 = 28.27\ \mathrm{cm}^2 . \tag{7.53g}$$

The interferometer's field of view is

$$\Delta\Omega = 1.086 \times 10^{-4}\ \mathrm{ster} , \tag{7.53h}$$

and parameter $W$, explained in the discussion following Eq. (4.83) in Chapter 4, is

$$W = 1 . \tag{7.53i}$$

The detector electronics in Fig. 6.2 of Chapter 6 are given a three-pole, low-pass Butterworth
filter. Figure 7.3 plots
$$\mathrm{Re}[H(u\sigma)] , \quad \mathrm{Im}[H(u\sigma)] , \quad \text{and} \quad |H(u\sigma)|$$

of this filter against wavenumber $\sigma$. The OPD velocity $u$ is taken to be 5 cm/sec and the filter
cutoff frequency is 8000 Hz. This means the magnitude $|H(u\sigma)|$ of the transfer function does not
fall off by much inside the 650 cm$^{-1}$ to 1150 cm$^{-1}$ band of wavenumbers measured by the
interferometer. The simulated instrument is calibrated using Planck black-body radiances of 77 K
(the temperature of liquid nitrogen) and 350 K.
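The claim that the filter magnitude changes little across the measured band can be checked with the textbook three-pole Butterworth response. The sketch below is an assumption-laden illustration: it uses the normalized third-order Butterworth polynomial $(s+1)(s^2+s+1)$ with $s = i f / f_c$ and the electrical frequency $f = u\sigma$, and the sign convention chosen for the phase may differ from the one behind $H$ in the text. Only the magnitude, which is convention-independent, is relied on here; the function names are illustrative.

```python
import math


def butterworth3(frequency_hz, cutoff_hz):
    """Normalized three-pole low-pass Butterworth transfer function."""
    s = 1j * frequency_hz / cutoff_hz
    return 1.0 / ((s + 1.0) * (s * s + s + 1.0))


def transfer_vs_wavenumber(sigma, opd_velocity=5.0, cutoff_hz=8000.0):
    """Filter response at wavenumber sigma (cm^-1) for OPD velocity u (cm/sec)."""
    return butterworth3(opd_velocity * sigma, cutoff_hz)


if __name__ == "__main__":
    for sigma in (0.0, 650.0, 900.0, 1150.0, 1600.0):
        h = transfer_vs_wavenumber(sigma)
        print(sigma, "Re:", h.real, "Im:", h.imag, "magnitude:", abs(h))
```

The magnitude obeys $|H|^2 = 1/(1 + (f/f_c)^6)$, so it stays within a few percent of unity up to 1150 cm$^{-1}$ (5750 Hz) and only reaches $1/\sqrt{2}$ at 1600 cm$^{-1}$ (8000 Hz).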
To characterize the noise in these simulated black-body measurements, we have already
decided to use the Gaussian noise-power spectrum in Eq. (7.48), which means [see discussion
following Eq. (7.46) and continuing on to Eq. (7.48)] that the bias angle $\phi$ must be negligible. To
keep things simple, we make the bias angle zero,

$$\phi = 0 . \tag{7.54a}$$

The Gaussian power spectrum in (7.48) has

$$s = 200\ \mathrm{cm}^{-1} \tag{7.54b}$$
and
$$\alpha = 3.989 \times 10^{-23}\ \mathrm{cm} \cdot \mathrm{rad}^4 , \tag{7.54c}$$

which gives us, by combining Eqs. (7.48) and (7.34b), all the information needed to calculate the
NEdNtilt contaminating the black-body spectrum. Figure 7.4(a) plots the Gaussian $p_{nn}^{(\theta 2)}$ noise-power
spectrum in (7.48) for the $s$ and $\alpha$ values in (7.54b) and (7.54c).
Now that $s$ and $\alpha$ are specified, and the bias angle is set to zero in (7.54a), Eq. (7.51c) or
(7.51d) shows that
$$\gamma_x^4 + \gamma_y^4 = 10^{-20}\ \mathrm{rad}^4 . \tag{7.55a}$$

This does not specify uniquely the amount of misalignment error contributed by the $\theta_x$ and $\theta_y$
components of the misalignment angle. We could, for example, treat $\theta_x$ and $\theta_y$ on an equal
footing by saying that

$$\gamma_x^4 = \gamma_y^4 = 5 \times 10^{-21}\ \mathrm{rad}^4 , \tag{7.55b}$$

which would then give, according to Eq. (7.3d),

$$\theta_{\mathrm{rms}} \cong 1.19 \times 10^{-5}\ \mathrm{rad} . \tag{7.55c}$$

To keep the arithmetic simple, we choose another approach, assuming that $\theta_y$ is always zero so
that
$$\gamma_y = 0 . \tag{7.55d}$$

The remaining $\theta_x$ component obeys a zero-mean normal probability-density distribution
specified by the value of $\gamma_x$. Now Eqs. (7.55a) and (7.3d) reduce to

$$\gamma_x = 10^{-5}\ \mathrm{rad} \tag{7.55e}$$

so that
$$\theta_{\mathrm{rms}} = 10^{-5}\ \mathrm{rad} . \tag{7.55f}$$
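The two ways of dividing the misalignment between the components can be compared numerically. The sketch below assumes that Eq. (7.3d) has the form $\theta_{\mathrm{rms}}^2 = \phi^2 + \gamma_x^2 + \gamma_y^2$, which is the form consistent with the left-hand side of Eq. (7.50a); the function and variable names are illustrative only.

```python
import math


def theta_rms(phi, gamma_x, gamma_y):
    """RMS misalignment angle, assuming theta_rms^2 = phi^2 + gamma_x^2 + gamma_y^2."""
    return math.sqrt(phi**2 + gamma_x**2 + gamma_y**2)


TOTAL_FOURTH_POWER = 1.0e-20        # gamma_x^4 + gamma_y^4 in rad^4, from Eq. (7.55a)

# Equal split between the two components, as in Eq. (7.55b).
gamma_equal = (TOTAL_FOURTH_POWER / 2.0) ** 0.25
rms_equal = theta_rms(0.0, gamma_equal, gamma_equal)

# Everything assigned to the x component, as in Eqs. (7.55d) and (7.55e).
gamma_single = TOTAL_FOURTH_POWER ** 0.25
rms_single = theta_rms(0.0, gamma_single, 0.0)

if __name__ == "__main__":
    print("equal split:  gamma =", gamma_equal, " theta_rms =", rms_equal)
    print("x only:       gamma =", gamma_single, " theta_rms =", rms_single)
```

The equal split gives a slightly larger RMS angle than the x-only choice for the same value of $\gamma_x^4 + \gamma_y^4$, because the constraint is on fourth powers while the RMS angle depends on squares.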

Figures 7.4(b) and 7.4(c) plot a simulation of $n^{(\theta 2)}$ misalignment noise [defined in Eq. (7.8b)
above] for an interferometer disturbed by a Gaussian noise-power spectrum governed by the
parameter choices shown in (7.54a)–(7.54c), (7.55d), and (7.55e). Figure 7.4(b) covers a small
range of OPD values to show what this sort of misalignment noise looks like in detail, and Fig.
7.4(c) covers the entire range of OPD values between +1.28 cm and −1.28 cm.
Figures 7.5(a) and 7.5(b) show what happens when the Gaussian misalignment noise just
described contaminates measurements of a 320-K Planck black-body spectrum performed
by the interferometer system specified at the beginning of this section [see Eqs. (7.53a)–(7.53i)
and the paragraph immediately following Eq. (7.53i)]. The solid line in Fig. 7.5(a) is the true
spectral radiance entering the instrument. This black-body curve is smooth enough that, when
calculating NEdNtilt, we do not have to worry about the different shapes of the radiance functions
$L$, $L_{FOV}$, and $L_{mnf}$ specified¹⁰⁷ in Secs. 5.18 and 5.23 of Chapter 5. [A similar point was made
earlier in Sec. 7.6 about the $L^{(1)}$ and $L^{(2)}$ calibration radiances; see Eqs. (7.19a) and (7.19b).]
Figure 7.5(a) also contains ten independent, noise-contaminated measurements shown by dotted
curves, several of which are too close to the solid curve to be easily seen. This gives some idea of
how the misalignment noise causes the 320-K radiance curve generated by the interferometer to
jump around from measurement to measurement while retaining the general shape of a true
black-body spectrum. The solid curve in Fig. 7.5(b) is the NEdNtilt calculated from formula
(7.34b) above. It is clearly consistent with the spread of the dotted curves in Fig. 7.5(a). We have
analyzed 3600 independent, noise-contaminated spectral measurements of this 320-K radiance
curve, calculating the standard deviation of the error as a function of wavenumber $\sigma$ between
650 cm$^{-1}$ and 1150 cm$^{-1}$. The crosses in Fig. 7.5(b) plot these standard deviations; there is a close
match between them and the predicted NEdNtilt curve, showing that the simulated interferometer
measurements obey the expected spectral statistics.

¹⁰⁷ The modified radiances $L_{FOV}$ and $L_{mnf}$ are defined in Eqs. (5.83e) and (5.108d) respectively.

FIGURE 7.3. [Plot of $\mathrm{Re}[H(u\sigma)]$, $\mathrm{Im}[H(u\sigma)]$, and $|H(u\sigma)|$ against $\sigma$ (in cm$^{-1}$), for $\sigma$ from 0 to 2000; vertical axis from −1.5 to 1.5.]

The solid curve is the magnitude of the transfer function $H(u\sigma)$ plotted against $\sigma$. The dashed
and dotted curves are its real and imaginary parts respectively.

FIGURE 7.4(a). [Plot of $p_{nn}^{(\theta 2)}$ (in rad$^4$ cm) against $\sigma$ (in cm$^{-1}$), for $\sigma$ from −800 to 800; vertical axis from 0 to $5 \times 10^{-23}$.]

This is a plot of the Gaussian noise-power spectrum in Eq. (7.48) with $\alpha = 3.989 \times 10^{-23}$ rad$^4$ cm
and $s = 200$ cm$^{-1}$.
Figure 7.6 shows the Lorentz emission line measured in the second simulated interferometer
measurement. We use the same interferometer system as in the black-body measurement, with
two connected changes: the fore optics transmission is taken to be

τ f ( σ ) = 0.9 (7.56a)

instead of one, and the fore optics background radiance L(mnf


fore )
is no longer assumed to be zero.
These changes are connected because, when τ f is less than one, it contributes a nonzero

- 934 -
Simulated Misalignment Noise · 7.15

FIGURE 7.4(b).

10 6x10-10
6 .10
-10
10
4x10
4 10

-10
10
2x10
2 10

n (θ 2) ( χ )
Re nθ2Vtemp
2kPlot 0.0 0
(in rad )
-10
10
-2x10
2 10

-10
-4x10
4 10
10

10 -10
6 .10 -6x10
6 10
10
0.05 0 0.05
0.1 -0.1 -0.05 kPlot .∆χ 0.0
1.28 0.05 0.1
0.1

χ (in cm)

background radiance to the optical signal. (The changes are made to show the effect of
background radiance on a Lorentz-line measurement contaminated by misalignment noise.) The
L(mnf
fore )
background radiance is taken to be a gray-body Planck curve [described in the discussion
following Eq. (5.3k) in Chapter 5] with a constant emissivity of 0.1; and, since all the other
interferometer optics are taken to be ideal, we can still set the other background radiances to zero.
Because we are now dealing with a Lorentz emission line instead of a smooth Planck curve, it is
no longer safe to assume automatically that the input spectrum is so smooth that the
interferometer’s finite field of view and finite interferogram length have no significant effect on
the measured spectrum.
Equation (5.83e) in Chapter 5 reminds us that the finite field of view rescales the wavenumber
axis by a factor of
§ ∆Ω ·
¨1 + ¸.
© 4π ¹

This becomes, using the ǻȍ = 1.086 × 10-4 ster value from Eq. (7.53h),

- 935 -
7 · Mirror-Misalignment NEdN in Double-Sided Interferograms

FIGURE 7.4(c).

10 6x10-10
6 10
-10
4x10
4 10
10

-10
10
2x10
2 10

nRe('n'2Vtemp
2)
( )
kPlot 0.0 0
(in rad2)
-10
10
-2x10
2 10

-10
-4x10
4 10
10

10 -10
6 10 -6x10
6 10
10
1 0.5 0 0.5 1
1.28 -1.0 -0.5 0.0
kPlot  1.28 0.5 1.0 1.28

 D  (in cm)  D

_____________________________________________________________________________________________

§  · 6
¨1  ¸ 1  8.642 ;10 . (7.56b)
© 4& ¹

Consequently, the effective wavenumber ı of the input radiance spectrum is in error by


approximately 0.00086%. For the Lorentz emission line in Fig. 7.6, this amounts to a shift of
82 cm-1 , which is far too small to see on the scale of the graph. The  finite field of
about 0.0086
view also, according to Eqs. (5.83e) and (5.82c) in Chapter 5, blurs the input radiance over a
wavenumber interval of
 ) 104 A 103 cm 1
? 0.0167 cm 1 . (7.56c)
2& 6

- 936 -
Simulated Misalignment Noise · 7.15

[FIGURE 7.5(a). Plot of radiance (in mW/m²/sr/cm⁻¹) versus σ (in cm⁻¹) from 600 to 1200 cm⁻¹: the input spectrum and ten noise-contaminated measurements.]

[FIGURE 7.5(b). Plot of radiance error (in mW/m²/sr/cm⁻¹) versus σ (in cm⁻¹) from 600 to 1200 cm⁻¹: predicted and estimated NEdN.]

This is also much too small to matter on the scale of Fig. 7.6. All that is now left to check is the
effect of the finite interferogram length. The value of the unapodized spectral resolution is
0.391 cm -1 in Eq. (7.53a) above. Glancing back at the discussion following Eq. (5.67) in Chapter
5, we note that the unapodized spectral resolution determines the scale of the spectral blurring
caused by the interferometer’s finite interferogram length. The Lorentz line in Fig. 7.6 looks wide
enough not to have its width significantly affected by the blurring effects of an unapodized
spectral resolution of 0.391 cm-1 . So there is still no need to worry about the slightly different
shapes of the radiance functions L, LFOV, and Lmnf when discussing the radiance spectrum
entering—or measured by—the interferometer.
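The arithmetic behind Eqs. (7.56b) and (7.56c) can be checked with a few lines of Python. This is only a sketch: the solid angle is the value quoted from Eq. (7.53h), while the line-center wavenumber of 950 cm⁻¹ is an assumption read off the axis of Fig. 7.6, and the blur interval is evaluated without the rounding used in Eq. (7.56c).

```python
import math

# Solid angle of the field of view, quoted in the text from Eq. (7.53h).
delta_omega = 1.086e-4                      # steradians

# Fractional rescaling of the wavenumber axis, Delta-Omega / (4*pi), Eq. (7.56b).
scale_error = delta_omega / (4.0 * math.pi)

# ASSUMPTION: Lorentz line centered near 950 cm^-1 (estimated from Fig. 7.6).
sigma_line = 950.0                          # cm^-1, hypothetical value
shift = scale_error * sigma_line

# Blur interval sigma * Delta-Omega / (2*pi), using the round 10^3 cm^-1 of Eq. (7.56c)
# but keeping the unrounded solid angle and 2*pi.
sigma_ref = 1.0e3                           # cm^-1
blur = sigma_ref * delta_omega / (2.0 * math.pi)

resolution = 0.391                          # unapodized resolution, cm^-1, Eq. (7.53a)

print(f"scale error   = {scale_error:.3e} ({100.0 * scale_error:.5f} %)")
print(f"line shift    = {shift:.4f} cm^-1")
print(f"blur interval = {blur:.4f} cm^-1 (resolution is {resolution} cm^-1)")
```

Both the shift and the blur interval come out far below the 0.391 cm⁻¹ unapodized resolution, which is the point the text is making.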


The analysis in Sec. 7.13 shows that the bias angle φ must be large compared to γx and γy
for the noise-power spectrum $p^{(\theta 2)}_{nn}$ to have the quasi-harmonic shape specified in Eq. (7.52d)
above. To satisfy this requirement for the noise contaminating the Lorentz emission line, we set
φ = 10⁻⁵ rad and γx = 10⁻⁶ rad. Taking γy to be approximately the same size as γx, we again
use Eq. (7.3d) to get

$$\theta_{rms} \cong 10^{-5}\ \mathrm{rad}$$    (7.57a)

just like in Eq. (7.55f) above for the black-body measurements. Choosing

$$\sigma_C = 100\ \mathrm{cm}^{-1}$$    (7.57b)

and

$$\sigma_M = 20\ \mathrm{cm}^{-1} ,$$    (7.57c)

we consult Eq. (7.52b) to get

$$p_0^{(\theta 2)} \cong 10^{-23}\ \mathrm{cm}\cdot\mathrm{rad}^{4}$$    (7.57d)

in Eq. (7.49b). To get the desired quasi-harmonic $p^{(\theta 2)}_{nn}$ spectrum, we just apply these $\sigma_C$, $\sigma_M$,
and $p_0^{(\theta 2)}$ parameters to the graph in Fig. 7.2(c). Figures 7.7(a) and 7.7(b) contain an example of
$n^{(\theta 2)}$ misalignment noise [as defined in Eq. (7.8b)] obeying this quasi-harmonic spectrum. The x
and y components are independent, zero-mean, and normally distributed random quantities.
Figure 7.7(a) plots $n^{(\theta 2)}$ over a small set of OPD values to show what this quasi-harmonic
misalignment noise looks like in detail, and Fig. 7.7(b) plots $n^{(\theta 2)}$ over the entire range of OPD
values between +1.28 cm and −1.28 cm.
Figures 7.8(a) and 7.8(b) show what happens when the quasi-harmonic noise described above
contaminates the measurement of the Lorentz emission line in Fig. 7.6. The split solid curves in
Fig. 7.8(a) depict the rising and trailing edges of the Lorentz emission line using a stretched y
axis, which puts the top of the emission line off the top of the graph. The continuous solid line is
the NEdNtilt curve predicted by formulas (7.34b) and (7.35a), and the dotted lines are ten
measurements of the Lorentz emission contaminated by the quasi-harmonic misalignment noise.
The NEdNtilt curve correctly predicts the presence and location of the “ghost-line” noise peaks in
the dotted curves, and it also confirms the way the overall level of the noise-contaminated
measurements rises and falls with respect to the true spectral level far away from the ghost lines.
The ghost-line noise is predicted by the first term on the right-hand side of Eq. (7.35a). This term
is basically a convolution of the quasi-harmonic $p^{(\theta 2)}_{nn}$ power spectrum with the Lorentz line
shape contained in the square of the $[\sigma^{2} Z_{mnf}(\sigma)]$ function.

[FIGURE 7.6. Plot of the input radiance (in mW/m²/sr/cm⁻¹) versus σ (in cm⁻¹) from 800 to 1100 cm⁻¹.]

We note that the ghost-line regions lie on either side of the Lorentz emission line, offset from the
line center by

$$\sigma_C + \frac{\sigma_M}{2} = 110\ \mathrm{cm}^{-1} ,$$    (7.58)

as we would expect from the convolution. The overall rise and fall of the noise-contaminated
measurements with respect to the true spectral level comes from both the first and second terms
on the right-hand side of (7.35a) and can be traced to the interferometer’s nonzero background
radiance. This is what happens when misalignment noise interacts with a smooth Planck-like
spectrum, just like in Figs. 7.5(a) and 7.5(b). It is important to realize that large background
radiances can produce large amounts of background noise even at those wavenumbers where the
spectrum being measured is relatively small. We also see that misalignment noise, unlike the
detector noise discussed in Chapter 6, need not look very “fuzzy” and noiselike; it can easily be
mistaken for part of the spectral signal. Figure 7.8(b) has the same basic format as Fig. 7.5(b).
Again, we generate 3600 noise-contaminated measurements and calculate the standard deviations
of the spectral error as a function of wavenumber σ. Just as before, the crosses marking the values
of these standard deviations are a good match to the solid line giving the predicted NEdNtilt
values.
[FIGURE 7.7(a). Plot of the misalignment noise $n^{(\theta 2)}(\chi)$ (in rad²) versus χ (in cm) from −0.1 to 0.1 cm.]

[FIGURE 7.7(b). Plot of the misalignment noise $n^{(\theta 2)}(\chi)$ (in rad²) versus χ (in cm), from χ = −D to χ = D.]

[FIGURE 7.8(a). Plot of radiance error (in mW/m²/sr/cm⁻¹) versus σ (in cm⁻¹) from 800 to 1100 cm⁻¹: noise-free spectrum, NEdNtilt curve, and ten noise-contaminated measurements.]

[FIGURE 7.8(b). Plot of radiance (in mW/m²/ster/cm⁻¹) versus σ (in cm⁻¹) from 800 to 1100 cm⁻¹: predicted and estimated NEdN.]

Appendix 7A
We want to calculate the second and fourth moments of the normal probability density
distribution

$$p_\varsigma(\varsigma) = \frac{1}{\gamma\sqrt{2\pi}}\, e^{-(\varsigma-\phi)^{2}/(2\gamma^{2})} .$$    (7A.1)

Here, pς (ς )dς is the probability that the continuous random variable ς takes on a value between
ς and ς + d ς . The mean value of ς is φ , and its standard deviation is γ .
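We know, for a > 0, that108

$$\int_{0}^{\infty} x^{2} e^{-ax^{2}}\,dx = \frac{1}{4a}\sqrt{\frac{\pi}{a}} .$$

Since $x^{2}e^{-ax^{2}}$ is an even function of x, this can be written as [according to Eq. (2.19) in Chapter
2]

$$\int_{-\infty}^{\infty} x^{2} e^{-ax^{2}}\,dx = \frac{1}{2a}\sqrt{\frac{\pi}{a}} .$$    (7A.2a)

Taking the partial derivative with respect to a of both sides gives

$$\int_{-\infty}^{\infty} x^{4} e^{-ax^{2}}\,dx = \frac{3}{4}\,a^{-5/2}\sqrt{\pi} .$$    (7A.2b)

To get the second moment of ς when it obeys the $p_\varsigma(\varsigma)$ probability density distribution in
(7A.1), we must calculate

$$\int_{-\infty}^{\infty} \varsigma^{2}\, p_\varsigma(\varsigma)\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} \varsigma^{2}\, e^{-(\varsigma-\phi)^{2}/(2\gamma^{2})}\,d\varsigma .$$

This becomes, changing the variable of integration to t = ς − φ,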

108
Lennart Rade and Bertil Westergren, Beta β Mathematics Handbook, 2nd ed. (CRC Press, Inc., Boca Raton, FL,
1990), formula (42), p. 164.


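$$\int_{-\infty}^{\infty} \varsigma^{2}\, p_\varsigma(\varsigma)\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} (t+\phi)^{2}\, e^{-t^{2}/(2\gamma^{2})}\,dt$$

or

$$\int_{-\infty}^{\infty} \varsigma^{2}\, p_\varsigma(\varsigma)\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} t^{2} e^{-t^{2}/(2\gamma^{2})}\,dt + \frac{\sqrt{2}\,\phi}{\gamma\sqrt{\pi}} \int_{-\infty}^{\infty} t\, e^{-t^{2}/(2\gamma^{2})}\,dt + \frac{\phi^{2}}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^{2}/(2\gamma^{2})}\,dt .$$    (7A.3a)

Applying (7A.2a) to the first term on the right-hand side gives

$$\frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} t^{2} e^{-t^{2}/(2\gamma^{2})}\,dt = \gamma^{2} .$$    (7A.3b)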

According to Eq. (2.17) in Chapter 2, the second term on the right-hand side must be zero
(because $t\,e^{-t^{2}/(2\gamma^{2})}$ is an odd function of t), and we see that the third term must be

$$\frac{\phi^{2}}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^{2}/(2\gamma^{2})}\,dt = \phi^{2}$$    (7A.3c)

because

$$\frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^{2}/(2\gamma^{2})}\,dt = 1$$    (7A.3d)

is just the integral of the zero-mean normal probability density over all its allowed values [see
Eq. (7A.1)]. Substituting (7A.3b) and (7A.3c) into (7A.3a) gives

$$\int_{-\infty}^{\infty} \varsigma^{2}\, p_\varsigma(\varsigma)\,d\varsigma = \gamma^{2} + \phi^{2} .$$    (7A.3e)

To get the fourth moment of ς when it obeys the $p_\varsigma(\varsigma)$ probability density distribution in
(7A.1), we evaluate

$$\int_{-\infty}^{\infty} \varsigma^{4}\, p_\varsigma(\varsigma)\,d\varsigma ,$$

by again changing the variable of integration to t = ς − φ to get

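$$\int_{-\infty}^{\infty} \varsigma^{4}\, p_\varsigma(\varsigma)\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} \varsigma^{4}\, e^{-(\varsigma-\phi)^{2}/(2\gamma^{2})}\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} (t+\phi)^{4}\, e^{-t^{2}/(2\gamma^{2})}\,dt$$

or

$$\begin{aligned}
\int_{-\infty}^{\infty} \varsigma^{4}\, p_\varsigma(\varsigma)\,d\varsigma ={}& \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} t^{4} e^{-t^{2}/(2\gamma^{2})}\,dt + \frac{2\sqrt{2}\,\phi}{\gamma\sqrt{\pi}} \int_{-\infty}^{\infty} t^{3} e^{-t^{2}/(2\gamma^{2})}\,dt + \frac{3\sqrt{2}\,\phi^{2}}{\gamma\sqrt{\pi}} \int_{-\infty}^{\infty} t^{2} e^{-t^{2}/(2\gamma^{2})}\,dt \\
&+ \frac{2\sqrt{2}\,\phi^{3}}{\gamma\sqrt{\pi}} \int_{-\infty}^{\infty} t\, e^{-t^{2}/(2\gamma^{2})}\,dt + \frac{\phi^{4}}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^{2}/(2\gamma^{2})}\,dt .
\end{aligned}$$    (7A.4a)

The second and fourth terms on the right-hand side of (7A.4a) are zero because
$[t^{3}\exp(-t^{2}/(2\gamma^{2}))]$ and $[t\exp(-t^{2}/(2\gamma^{2}))]$ are odd functions of t. Applying Eqs. (7A.2b),
(7A.3b), and (7A.3d),

$$\begin{aligned}
\int_{-\infty}^{\infty} \varsigma^{4}\, p_\varsigma(\varsigma)\,d\varsigma &= \frac{1}{\gamma\sqrt{2\pi}}\cdot\frac{3}{4}\cdot(2\gamma^{2})^{5/2}\sqrt{\pi} + 6\phi^{2}\gamma^{2} + \phi^{4} \\
&= 3\gamma^{4} + 6\phi^{2}\gamma^{2} + \phi^{4} .
\end{aligned}$$    (7A.4b)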

In Sec. 7.2 above, random variable θx obeys a normal probability density distribution that has
a mean of φ and a standard deviation of γ x . According to Eqs. (7A.3e) and (7A.4b), we can
therefore write that
E(θx2 ) = γ x 2 + φ 2 (7A.5a)
and
E(θx4 ) = 3γ x4 + 6φ 2γ x2 + φ 4 . (7A.5b)

Random variable θy obeys a probability density distribution with a mean of zero and a standard
deviation of γ y . This means that, setting φ = 0 in Eqs. (7A.5a) and (7A.5b), we know

E(θy2 ) = γ y2 (7A.5c)
and
E(θy4 ) = 3γ y4 . (7A.5d)

Equations (7A.5a)–(7A.5d) are the results we need for the derivation of the mirror-tilt NEdN.
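The closed-form moments in Eqs. (7A.3e) and (7A.4b) can be confirmed numerically. The following is a minimal sketch that integrates the density of Eq. (7A.1) with a trapezoid rule; the function name `normal_moment`, the integration limits, and the test values of φ and γ are illustrative choices, not taken from the text.

```python
import math

def normal_moment(order, phi, gamma, half_width=12.0, steps=20000):
    """Trapezoid estimate of the integral of s**order * p(s), with p(s) from Eq. (7A.1).

    The integration runs over phi +/- half_width*gamma, outside of which the
    Gaussian density is negligible.
    """
    lo = phi - half_width * gamma
    hi = phi + half_width * gamma
    h = (hi - lo) / steps
    norm = 1.0 / (gamma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(steps + 1):
        s = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * s ** order * norm * math.exp(-(s - phi) ** 2 / (2.0 * gamma ** 2))
    return total * h

phi, gamma = 2.0, 0.5          # hypothetical mean and standard deviation

second_numeric = normal_moment(2, phi, gamma)
fourth_numeric = normal_moment(4, phi, gamma)
second_formula = gamma ** 2 + phi ** 2                                   # Eq. (7A.3e)
fourth_formula = 3 * gamma ** 4 + 6 * phi ** 2 * gamma ** 2 + phi ** 4   # Eq. (7A.4b)

print(second_numeric, second_formula)
print(fourth_numeric, fourth_formula)
```

Setting `phi = 0.0` reproduces the zero-mean results of Eqs. (7A.5c) and (7A.5d).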


Appendix 7B
Although the o ( xx ) , o ( yy ) autocorrelation functions and the p( xx ) , p( yy ) noise-power spectra
introduced in Sec. 7.12 above follow the expected pattern, being both real and even like every
autocorrelation function and power spectrum of a wide-sense stationary random function,109 the
cross-correlation function o ( xy ) and cross-power spectrum p( xy ) introduced in Eqs. (7.37a) and
(7.37b) exhibit a more complicated symmetry. In particular, we should be careful to note that the
p( xy ) cross-power spectrum can have a nonzero imaginary component.
Equation (7.37a) defines the cross-correlation function of X and Y to be, using the notation
of Sec. 7.12,

$$E\left(X\,Y'\right) = o^{(xy)}(\chi' - \chi) .$$    (7B.1a)

This can also be written as [see Eqs. (7.36a) and (7.36d)]

$$E\left([\theta_x(\chi) - \phi]\,\theta_y(\chi')\right) = o^{(xy)}(\chi' - \chi) .$$    (7B.1b)

Using the linearity of E with respect to random variables (see Sec. 3.10 of Chapter 3), we note
that

$$E\left([\theta_x(\chi) - \phi]\,\theta_y(\chi')\right) = E\left(\theta_x(\chi)\,\theta_y(\chi')\right) - \phi\,E\left(\theta_y(\chi')\right) .$$

Since $E\left(\theta_y(\chi')\right) = 0$, this reduces to

$$E\left([\theta_x(\chi) - \phi]\,\theta_y(\chi')\right) = E\left(\theta_x(\chi)\,\theta_y(\chi')\right) ,$$

which means that Eq. (7B.1b) can be written as

$$E\left(\theta_x(\chi)\,\theta_y(\chi')\right) = o^{(xy)}(\chi' - \chi) .$$    (7B.1c)

This shows that $o^{(xy)}$ does not depend on the bias tilt angle φ. Interchanging the positions of χ
and χ′ in Eqs. (7B.1a) and (7B.1c) gives

$$E\left(X'\,Y\right) = o^{(xy)}(\chi - \chi')$$    (7B.1d)

109
See Sec. 3.20 of Chapter 3 as well as, in Sec. 3.15, the discussion following Eq. (3.30b).

and

$$E\left(\theta_x(\chi')\,\theta_y(\chi)\right) = o^{(xy)}(\chi - \chi') .$$    (7B.1e)

We note that since

$$E\left(X\,Y'\right) = E\left(Y'\,X\right)$$

automatically holds true, it follows (interchanging the roles of the x, y labels and the χ, χ′
variables in Eq. (7B.1a)) that

$$o^{(xy)}(\chi' - \chi) = o^{(yx)}(\chi - \chi') ,$$

which can also be written as, using χ″ = χ′ − χ,

$$o^{(xy)}(\chi'') = o^{(yx)}(-\chi'')$$

or, changing the sign of the argument,

$$o^{(xy)}(-\chi'') = o^{(yx)}(\chi'') .$$    (7B.1f)

The cross-power spectrum defined in Eq. (7.37b) is

$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty} o^{(xy)}(\chi'')\, e^{-2\pi i\sigma\chi''}\,d\chi'' .$$    (7B.2a)

We note that, substituting χ‴ = −χ″,

$$\begin{aligned}
p^{(xy)}(\sigma) &= \int_{-\infty}^{\infty} o^{(xy)}(\chi'')\, e^{-2\pi i\sigma\chi''}\,d\chi'' = -\int_{\infty}^{-\infty} o^{(xy)}(-\chi''')\, e^{2\pi i\sigma\chi'''}\,d\chi''' \\
&= \int_{-\infty}^{\infty} o^{(xy)}(-\chi''')\, e^{2\pi i\sigma\chi'''}\,d\chi''' .
\end{aligned}$$

Substituting from (7B.1f) gives

$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty} o^{(yx)}(\chi''')\, e^{2\pi i\sigma\chi'''}\,d\chi''' .$$    (7B.2b)

We can now interchange the roles of the x and y labels in (7B.2a) to get

$$p^{(yx)}(\sigma) = \int_{-\infty}^{\infty} o^{(yx)}(\chi'')\, e^{-2\pi i\sigma\chi''}\,d\chi''$$    (7B.2c)

and use this definition to write (7B.2b) as

$$p^{(xy)}(\sigma) = p^{(yx)}(-\sigma) .$$    (7B.2d)

Equation (7B.2d) matches the relationship between the cross-correlation functions in Eq. (7B.1f).
According to Eq. (7B.1a), the cross-correlation $o^{(xy)}$ is the expectation value of the product
of two real numbers, so it must be real. We can then write, substituting

$$e^{-2\pi i\sigma\chi''} = \cos(2\pi\sigma\chi'') - i\sin(2\pi\sigma\chi'')$$

into Eq. (7B.2a), that

$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty} o^{(xy)}(\chi'')\cos(2\pi\sigma\chi'')\,d\chi'' - i\int_{-\infty}^{\infty} o^{(xy)}(\chi'')\sin(2\pi\sigma\chi'')\,d\chi''$$

so that

$$\mathrm{Re}[p^{(xy)}(\sigma)] = \int_{-\infty}^{\infty} o^{(xy)}(\chi'')\cos(2\pi\sigma\chi'')\,d\chi''$$    (7B.3a)

and

$$\mathrm{Im}[p^{(xy)}(\sigma)] = -\int_{-\infty}^{\infty} o^{(xy)}(\chi'')\sin(2\pi\sigma\chi'')\,d\chi'' .$$    (7B.3b)

The remark following Eq. (2.15b) in Chapter 2 points out that the product of any even function
with the sine is an odd function, which means, according to Eq. (2.17) in Chapter 2, that its
integral from −∞ to ∞ must be zero. Thus, if $o^{(xy)}$ is an even function, Eq. (7B.3b) is the integral
of an odd function between −∞ and ∞ and must be zero, showing that the cross-power spectrum
$p^{(xy)}$ must be real because $\mathrm{Im}[p^{(xy)}] = 0$. The obvious next step, to see whether the cross-power
spectrum must be real, is to investigate whether $o^{(xy)}(\chi'')$ must be an even function of χ″.
Again we say, just as in the discussion following Eq. (7B.1e), that χ″ = χ′ − χ so that [see
Eqs. (7B.1e) and (7B.1c)]

$$E\left(\theta_x(\chi')\,\theta_y(\chi)\right) = o^{(xy)}(-\chi'')$$    (7B.4a)

and

$$E\left(\theta_x(\chi)\,\theta_y(\chi')\right) = o^{(xy)}(\chi'') .$$    (7B.4b)

Equation (7.9a) shows that t = χ / u for u > 0, so when χ′ > χ the θy(χ′) random value in Eq.
(7B.4b) occurs at a later time than the θx(χ) random value. Suppose we assume that the θy
random quantity always resembles the θx random quantity after a time delay T has elapsed because any
disturbance in the x component of the misalignment angle is followed by a similar disturbance in
the y component of the misalignment angle. In fact, suppose we set up the idealized, but entirely
possible, situation that the bias tilt angle φ is zero and the random y component is exactly equal
to the random x component after a time delay of T. This means we can write

θx ( χ ) = ψ ( χ ) (7B.4c)
and
θy ( χ ) = ψ ( χ − uT ) (7B.4d)

for some random function ψ . The value of the cross-correlation function o ( xy ) at

χ ′′ = χ ′ − χ = uT so that χ ′ = uT + χ

is then, according to Eq. (7B.4b),

$$o^{(xy)}(uT) = E\left(\theta_x(\chi)\,\theta_y(\chi + uT)\right) .$$

Substituting from (7B.4c) and (7B.4d) then gives

$$o^{(xy)}(uT) = E\left(\psi(\chi)\,\psi(uT + \chi - uT)\right) = E\left(\psi(\chi)^{2}\right) .$$    (7B.4e)

This is the variance of ψ, which could easily be a rather large quantity if there are large
disturbances in the x and y components of the misalignment angle. According to Eq. (7B.4a), on
the other hand,

$$o^{(xy)}(-uT) = E\left(\theta_x(\chi')\,\theta_y(\chi)\right) = E\left(\theta_x(uT + \chi)\,\theta_y(\chi)\right) ,$$

which becomes, substituting from (7B.4c) and (7B.4d),

$$o^{(xy)}(-uT) = E\left(\psi(uT + \chi)\,\psi(\chi - uT)\right) .$$    (7B.4f)

This shows that $o^{(xy)}(-uT)$ could easily be quite small when random function ψ is only poorly
correlated with itself at different values of its argument. The x and y components of the

misalignment angle could, for example, be subject to large random disturbances that first perturb
the θx value and then, after a time delay T, perturb the θy value. This would make the value of
$o^{(xy)}(uT)$ in Eq. (7B.4e) rather large. The disturbances could also, however, be rather short in
duration, so that the perturbation of an angle component at one time has little resemblance to the
perturbation of that same component at another time. This would make the value of $o^{(xy)}(-uT)$
in Eq. (7B.4f) rather small. We can conclude, then, that there is no reason for $o^{(xy)}$ to be an even
function of its argument. Hence there is no reason to expect the sine integral in (7B.3b), or for
that matter the cosine integral in (7B.3a), to be zero, which means the cross-power spectrum
$p^{(xy)}$ in Eqs. (7B.2a), (7B.3a), and (7B.3b) can easily have nonzero real and imaginary
components.
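The delayed-copy argument of Eqs. (7B.4c) through (7B.4f) can be illustrated with a short simulation. This is a sketch under stated assumptions: ψ is modeled as uncorrelated, zero-mean, unit-variance Gaussian samples (the "short duration" disturbances of the text), and the delay uT is expressed as a whole number of sample steps; the names `theta_x`, `theta_y`, and `cross_corr` are introduced here for illustration only.

```python
import random

random.seed(0)                 # fixed seed so the run is repeatable
n_samples = 20000
delay = 5                      # the delay uT, in sample steps (illustrative)

# psi: uncorrelated, zero-mean, unit-variance samples.
psi = [random.gauss(0.0, 1.0) for _ in range(n_samples)]

def theta_x(k):
    return psi[k]              # Eq. (7B.4c)

def theta_y(k):
    return psi[k - delay]      # Eq. (7B.4d): y repeats x after the delay

def cross_corr(lag):
    """Sample estimate of o_xy(lag) = E[theta_x(chi) * theta_y(chi + lag)]."""
    ks = range(2 * delay, n_samples - 2 * delay)   # keeps every index in range
    return sum(theta_x(k) * theta_y(k + lag) for k in ks) / len(ks)

o_plus = cross_corr(+delay)    # Eq. (7B.4e): close to the variance of psi
o_minus = cross_corr(-delay)   # Eq. (7B.4f): close to zero

print(f"o_xy(+uT) = {o_plus:.3f}")
print(f"o_xy(-uT) = {o_minus:.3f}")
```

Because the two estimates differ so strongly, the simulated cross-correlation is clearly not an even function of its argument, which is the property that allows the cross-power spectrum to have a nonzero imaginary part.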

8
SAMPLING-ERROR NEdN IN DOUBLE-SIDED INTERFEROGRAMS
Random errors in the sampling position produce random errors in the sampled signal. As was
done in Chapter 7 when analyzing misalignment noise, we use wide-sense stationary random
functions to describe the sampling noise, tracing the effect through the calibration process to find
out what the NEdN of the measured spectrum looks like when it is dominated by this sort of
error. In a well-designed interferometer, the sampling-noise NEdN, just like the misalignment-
noise NEdN, should be a small source of error compared to the detector noise. The formulas
derived here can nevertheless be very useful when designing interferometers because they show
how accurately the interferometer signal needs to be sampled. Moreover, when interferometers
produce unusual types of random errors, the size and shape of the errors can be compared to the
predictions of these formulas, making it easier to determine whether an unexpectedly large
sampling error could be contributing to the problem.

8.1 Noise-Free Signal at the A/D Converter


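Sampling noise occurs at point C in Fig. 6.2 of Chapter 6 where the signal is being sampled at the
analog-to-digital (A/D) converter. Equation (6.8c) in Chapter 6 specifies the total noise-free
signal at point C as a function of the optical-path difference (OPD) value χ,

$$z_C^{(tot)}(\chi) = z_C(\chi) + z_C^{(cold)}(\chi) .$$

Equations (6.5d) and (6.12a) in Chapter 6 contain formulas for $z_C$ and $z_C^{(cold)}$ respectively.
Substituting these into the formula for $z_C^{(tot)}$ gives

$$\begin{aligned}
z_C^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|)\, e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, [L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\, e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}$$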
This can be written as

$$\begin{aligned}
z_C^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_a(|\sigma|) \cdot \\
&[\tau_f(|\sigma|)\, L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\, e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}$$    (8.1a)

The definition of $Z_{FOV}$ in Eq. (7.7b) of Chapter 7,

$$Z_{FOV}(\sigma) = \left(\frac{WA\,\Delta\Omega}{4}\right) R(|\sigma|)\, \eta(\sigma)\, \tau_a(|\sigma|)\, [\tau_f(|\sigma|)\, L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)] ,$$    (8.1b)

can now be substituted into (8.1a) to get

$$z_C^{(tot)}(\chi) = \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma)\, e^{2\pi i\sigma\chi}\,d\sigma$$    (8.1c)

for the noise-free signal at point C in Fig. 6.2. The Fourier F operator defined in Sec. 2.5 of
Chapter 2 [see Eqs. (2.29a) and (2.29c)] lets this be written as

$$z_C^{(tot)}(\chi) = F^{(i\sigma\chi)}\left(H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma)\right) ,$$    (8.1d)

and, of course, the transform can be reversed to get

$$H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma) = F^{(-i\sigma\chi)}\left(z_C^{(tot)}(\chi)\right) .$$    (8.1e)

Unlike the interferometer model analyzed in Chapter 7, in this chapter we assume that the
misalignment angle θma, when it is significantly different from zero, has the same constant value
during spectral measurements and their associated calibration procedures; that is, we assume
that the misalignment angle θma does not change with time.

8.2 Sampling Noise at the A/D Converter


The sampling-position noise $n^{(s)}$ is defined as a function of the OPD value χ by saying that if
$z_C^{(tot)}$ is supposed to be sampled at $\chi_{correct}$ and is instead sampled at the randomly incorrect OPD
value $\chi_{incorrect}$, then the sampling-position noise $n^{(s)}$ at $\chi_{correct}$ is defined to be

$$n^{(s)}(\chi_{correct}) = \chi_{incorrect} - \chi_{correct} .$$    (8.2a)

Clearly the units of $n^{(s)}$ are the same as the OPD, that is, units of length (cm). Suppose the plan
is to sample $z_C^{(tot)}$ at N equally spaced OPD values in order to generate a double-sided
interferogram signal with χ = 0 occurring at or near the middle of the sample set. In the absence
of error, we expect the samples to occur at $\chi = \chi_j$ with

$$\chi_j = j\,\Delta\chi ,$$    (8.2b)

where Δχ is the OPD separation between adjacent samples and, just like in Eq. (5.103b) in
Chapter 5,

$$j = -\frac{N}{2} + 1,\ -\frac{N}{2} + 2,\ \ldots,\ -1,\ 0,\ 1,\ \ldots,\ \frac{N}{2} - 1,\ \frac{N}{2} .$$    (8.2c)
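The sample grid of Eqs. (8.2b) and (8.2c) is easy to enumerate directly. The sketch below assumes N is even, as the index list implies; the helper name `sample_indices` and the small values of N and Δχ are hypothetical and chosen only to keep the output readable.

```python
def sample_indices(n_samples):
    """Return the indices j = -N/2 + 1, ..., N/2 of Eq. (8.2c); N is assumed even."""
    if n_samples % 2 != 0:
        raise ValueError("this sketch assumes an even number of samples N")
    half = n_samples // 2
    return list(range(-half + 1, half + 1))

N = 8                  # illustrative sample count
delta_chi = 0.25       # illustrative OPD spacing, in cm

js = sample_indices(N)
chis = [j * delta_chi for j in js]          # Eq. (8.2b)

n_negative = sum(1 for j in js if j < 0)    # samples with chi < 0
n_positive = sum(1 for j in js if j > 0)    # samples with chi > 0

print(js)
print(chis)
print(n_negative, n_positive)
```

The counts show the asymmetry of the index set: there is exactly one sample at χ = 0 and one more sample on the positive side than on the negative side.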

In the absence of sampling-position noise, there is one sample taken at χ = 0 when j = 0 , and
there is one more sample taken for χ > 0 than for χ < 0 . When the sampling-position noise
n ( s ) ( χ ) is present, we know that the actual sample positions occur at

χ j + n ( s ) ( χ j )

instead of χ j , and the corresponding sample values are

$$z_C^{(tot)}\left(\chi_j + n^{(s)}(\chi_j)\right)$$

instead of $z_C^{(tot)}(\chi_j)$. We define the sampling noise to be the random errors in the sample values
zC due to the sampling-position noise n ( s ) . We assume that

n ( s ) ( χ j ) << ∆χ for all j . (8.2d)

This lets us write, for the jth sample value contaminated by sampling noise,

$$z_C^{(tot)}\left(\chi_j + n^{(s)}(\chi_j)\right) \cong z_C^{(tot)}(\chi_j) + n^{(s)}(\chi_j) \cdot \left.\frac{dz_C^{(tot)}}{d\chi}\right|_{\chi = \chi_j} .$$    (8.2e)

Hence, reverting to continuous OPD notation, we have

$$z_C^{(tot)}\left(\chi + n^{(s)}(\chi)\right) \cong z_C^{(tot)}(\chi) + n^{(s)}(\chi) \cdot \frac{dz_C^{(tot)}}{d\chi} .$$    (8.2f)

Since $z_C^{(tot)}(\chi)$ is the noise-free signal and $z_C^{(tot)}\left(\chi + n^{(s)}(\chi)\right)$ is the noise-contaminated signal,
we see that the formula for the noise-contaminated signal can be approximated by

$$z_{CN}^{(tot)}(\chi) = z_C^{(tot)}(\chi) + n^{(s)}(\chi) \cdot \frac{dz_C^{(tot)}}{d\chi} .$$    (8.2g)

In this chapter the random function $z_{CN}^{(tot)}$ represents the signal contaminated by sampling noise,
with the sampling noise caused by the sampling-position noise $n^{(s)}$ at point C in Fig. 6.2 of
Chapter 6.
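The quality of the first-order approximation in Eq. (8.2g) can be gauged with a toy signal. The sketch below is not the interferometer signal of Eq. (8.1c): it assumes a single cosine fringe at a hypothetical wavenumber `sigma0` and one fixed position error `n_s`, and compares the exactly displaced sample with the linearized form.

```python
import math

sigma0 = 1000.0        # wavenumber of the test fringe, cm^-1 (illustrative)
n_s = 1.0e-6           # a fixed sampling-position error, cm (illustrative)

def z(chi):
    """Hypothetical noise-free signal: a unit-amplitude cosine fringe."""
    return math.cos(2.0 * math.pi * sigma0 * chi)

def dz_dchi(chi):
    """Analytic derivative of the test signal with respect to the OPD."""
    return -2.0 * math.pi * sigma0 * math.sin(2.0 * math.pi * sigma0 * chi)

worst = 0.0
for j in range(-64, 65):
    chi = j * 1.0e-4                          # nominal sample positions, cm
    exact = z(chi + n_s)                      # signal sampled at the wrong position
    linear = z(chi) + n_s * dz_dchi(chi)      # right-hand side of Eq. (8.2g)
    worst = max(worst, abs(exact - linear))

# Second-order Taylor remainder bound: max|z''| * n_s**2 / 2.
bound = 0.5 * (2.0 * math.pi * sigma0 * n_s) ** 2

print(f"worst linearization error = {worst:.3e}")
print(f"Taylor remainder bound    = {bound:.3e}")
```

For these values the linearization error is a few parts in 10⁵ of the fringe amplitude, while the first-order noise term itself is a few parts in 10³, so dropping the higher-order terms is reasonable as long as the position error stays small compared to the fringe period.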

8.3 Power Spectrum and Autocorrelation Function of the Sampling Noise

The n ( s ) sampling-position noise is zero-mean,

$$E\left(n^{(s)}(\chi)\right) = 0 .$$    (8.3a)

The expectation operator E is linear with respect to random quantities (see Sec. 3.10 in Chapter
3). Substituting (8.2a) into (8.3a) and applying the expectation operator E , we get

E ( χ incorrect − χ correct ) = E ( χ incorrect ) − E ( χ correct ) = 0 .

Parameter $\chi_{correct}$ is nonrandom, which means that [see Eq. (3.9f) in Chapter 3]

E ( χ correct ) = χ correct .

Consequently, we end up with the formula

E ( χ incorrect ) = χ correct .

Hence, (8.3a) is just another way of saying that there is no bias in the attempt to sample the
signal; although any given attempt is randomly incorrect, on the average we get the correct OPD

value. Following the assumptions stated in the previous section, we take $n^{(s)}$ to be at least wide-
sense stationary. This means, according to Eq. (3.30b) in Chapter 3, that its autocorrelation
function $o_{nn}^{(s)}$ with respect to the OPD can be written as

$$o_{nn}^{(s)}(\chi' - \chi) = E\left(n^{(s)}(\chi) \cdot n^{(s)}(\chi')\right) .$$    (8.3b)

Clearly,

$$E\left(n^{(s)}(\chi) \cdot n^{(s)}(\chi')\right) = E\left(n^{(s)}(\chi') \cdot n^{(s)}(\chi)\right) ,$$

which means that

$$o_{nn}^{(s)}(\chi' - \chi) = o_{nn}^{(s)}(\chi - \chi') .$$

This becomes, defining χ″ = χ − χ′,

$$o_{nn}^{(s)}(-\chi'') = o_{nn}^{(s)}(\chi'') ,$$    (8.3c)

showing that the autocorrelation function for the sampling-position noise is an even function of
its argument. It is, of course, also real because $n^{(s)}$ is real:

$$\mathrm{Im}\left(o_{nn}^{(s)}(\chi)\right) = 0 .$$    (8.3d)

The Fourier transform of $o_{nn}^{(s)}$ is called the power spectrum of the $n^{(s)}$ sampling-position
noise (see Sec. 3.20 of Chapter 3),

$$p_{nn}^{(s)}(\sigma) = \int_{-\infty}^{\infty} o_{nn}^{(s)}(\chi)\, e^{-2\pi i\sigma\chi}\,d\chi = F^{(-i\sigma\chi)}\left(o_{nn}^{(s)}(\chi)\right) .$$    (8.4a)

The transform can, of course, be reversed to get

$$o_{nn}^{(s)}(\chi) = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma)\, e^{2\pi i\sigma\chi}\,d\sigma = F^{(i\sigma\chi)}\left(p_{nn}^{(s)}(\sigma)\right) .$$    (8.4b)

Equations (8.3c) and (8.3d) show that the autocorrelation function is real and even, which means
that, according to entry 1 of Table 2.1 in Chapter 2, the power spectrum $p_{nn}^{(s)}$ must also be real
and even:

$$p_{nn}^{(s)}(-\sigma) = p_{nn}^{(s)}(\sigma)$$    (8.4c)

and

$$\mathrm{Im}\left(p_{nn}^{(s)}(\sigma)\right) = 0 .$$    (8.4d)

The value of $o_{nn}^{(s)}(0)$ can be used to scale the power spectrum $p_{nn}^{(s)}$. Consulting Eq. (8.4b), we
have

$$o_{nn}^{(s)}(0) = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma)\,d\sigma$$    (8.5a)

which can be written as, substituting from Eq. (8.3b) with χ = χ′,

$$E\left([n^{(s)}(\chi)]^{2}\right) = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma)\,d\sigma .$$    (8.5b)

Formula (8.5b) shows, since its right-hand side depends only on the shape and size of the power
spectrum, that the wide-sense stationary nature of the sampling-position noise requires
$E([n^{(s)}(\chi)]^{2})$ to be independent of the OPD value χ. If we know that function $S_h(\sigma)$ specifies
the shape of the power spectrum, but we do not know the size of the power spectrum, then there
exists a real constant α such that

$$p_{nn}^{(s)}(\sigma) = \alpha\, S_h(\sigma) .$$    (8.5c)

Substituting (8.5c) into (8.5b) then gives

$$E\left([n^{(s)}(\chi)]^{2}\right) = \alpha \int_{-\infty}^{\infty} S_h(\sigma)\,d\sigma$$

or

$$\alpha = \left[\int_{-\infty}^{\infty} S_h(\sigma)\,d\sigma\right]^{-1} \cdot E\left([n^{(s)}(\chi)]^{2}\right) = \left[\int_{-\infty}^{\infty} S_h(\sigma)\,d\sigma\right]^{-1} \cdot E([n^{(s)}]^{2}) ,$$    (8.5d)

where the last step drops the argument χ because wide-sense stationary random functions have
the same mean-square value $E([n^{(s)}]^{2})$ at all values of χ. Hence we can find the value of α from
the shape function $S_h$ and the mean-squared error $E([n^{(s)}]^{2})$. Knowing both α and $S_h(\sigma)$
determines the size and shape of function $p_{nn}^{(s)}$ in Eq. (8.5c), completely specifying the power
spectrum of the sampling-position noise in terms of the shape function and the mean-squared
error in the sampling position.
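The scaling rule of Eqs. (8.5c) and (8.5d) can be exercised with a made-up shape function. In this sketch the Gaussian shape `shape`, its 20 cm⁻¹ width, and the 10⁻⁶ cm rms position error are all hypothetical; the only point is that choosing α from Eq. (8.5d) makes the integral of the power spectrum reproduce the mean-square position error, as Eq. (8.5b) requires.

```python
import math

def shape(sigma, width=20.0):
    """Hypothetical shape function S_h: an even Gaussian bump in wavenumber (cm^-1)."""
    return math.exp(-(sigma / width) ** 2)

def integrate(func, lo, hi, steps=20000):
    """Plain trapezoid rule on a uniform grid."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    for k in range(1, steps):
        total += func(lo + k * h)
    return total * h

mean_square_error = (1.0e-6) ** 2                 # E([n^(s)]^2) in cm^2 (illustrative)
shape_area = integrate(shape, -400.0, 400.0)      # integral of S_h over sigma
alpha = mean_square_error / shape_area            # Eq. (8.5d)

def power_spectrum(sigma):
    return alpha * shape(sigma)                   # Eq. (8.5c)

recovered = integrate(power_spectrum, -400.0, 400.0)   # should satisfy Eq. (8.5b)

print(f"alpha     = {alpha:.4e}")
print(f"recovered = {recovered:.4e}  (target {mean_square_error:.4e})")
```

Any other even, integrable shape function could be substituted for the Gaussian without changing the procedure.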


8.4 Uncalibrated Spectral Signals


To create the noise-contaminated, double-sided interferogram, we multiply $z_{CN}^{(tot)}$ in Eq. (8.2g) by
the function

$$\Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} ,$$    (8.6a)

which has already been defined in Eq. (4C.1a) in Appendix 4C of Chapter 4. (This is also the
same as the Π function in Eq. (2.56c) of Chapter 2, except for its value at |χ| = D; in particular,
we know from the discussion following Eq. (2.9e) that both versions of Π must have the same
Fourier transform.) We now have, from (8.2g), that

$$\Pi(\chi, D)\, z_{CN}^{(tot)}(\chi) = \Pi(\chi, D)\, z_C^{(tot)}(\chi) + \Pi(\chi, D)\, n^{(s)}(\chi) \cdot \frac{dz_C^{(tot)}}{d\chi}$$    (8.6b)

for the total double-sided interferogram signal contaminated by sampling noise at point C in Fig.
6.2. Multiplying by Π(χ, D) in this way explicitly reminds us that the double-sided
interferogram is truncated; that is, data is only recorded for OPD values lying between D and
−D. The forward Fourier transform of

$$\Pi(\chi, D)\, z_{CN}^{(tot)}(\chi)$$

is the uncalibrated spectral signal contaminated by sampling noise, and we show this by writing,
just like in Eq. (7.14c) of Chapter 7, that

$$Z_{eff,totN}(\sigma) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, z_{CN}^{(tot)}(\chi)\right) .$$    (8.6c)

Section 2.6 in Chapter 2, where the linear nature of the Fourier operator F is explained, shows
that when the forward Fourier transform is applied to (8.6b) we get a sum of two Fourier
transforms on the right-hand side:

$$F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, z_{CN}^{(tot)}(\chi)\right) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, z_C^{(tot)}(\chi)\right) + F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, n^{(s)}(\chi) \cdot \frac{dz_C^{(tot)}}{d\chi}\right) .$$

This can also be written as, substituting from (8.6c),

$$Z_{eff,totN}(\sigma) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, z_C^{(tot)}(\chi)\right) + F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, n^{(s)}(\chi) \cdot \frac{dz_C^{(tot)}}{d\chi}\right) .$$    (8.6d)

Expanding the first term on the right-hand side of (8.6d) is a straightforward process. The
Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] gives

F ( −iσχ ) ( Π ( χ , D) zC( tot ) ( χ ) ) = F ( − iσχ ) ( Π ( χ , D) ) ∗ F ( − iσχ ′) ( zC( tot ) ( χ ′) ) . (8.7a)

According to Eq. (2.108b) in Chapter 2 [with f in (2.108b) replaced by χ, t replaced by σ, and F
replaced by D]

$$F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\right) = 2D\,\mathrm{sinc}(2\pi\sigma D) ,$$    (8.7b)

where, following the definition in Eq. (2.106d),

$$\mathrm{sinc}(x) = \frac{\sin(x)}{x} .$$    (8.7c)
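Equation (8.7b) can be checked by integrating the forward Fourier kernel over the window directly. The sketch below assumes D = 1.28 cm, the maximum OPD used for the plots in Chapter 7; the helper names `sinc` and `boxcar_transform` and the test wavenumbers are introduced only for this illustration.

```python
import math

D = 1.28   # maximum OPD in cm (assumed, matching the plots in Chapter 7)

def sinc(x):
    """sin(x)/x as in Eq. (8.7c), with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def boxcar_transform(sigma, steps=20000):
    """Trapezoid estimate of the integral of exp(-2*pi*i*sigma*chi) for chi in [-D, D].

    The sine (imaginary) part integrates to zero by symmetry, so only the
    cosine part is accumulated.
    """
    h = 2.0 * D / steps
    total = 0.0
    for k in range(steps + 1):
        chi = -D + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.cos(2.0 * math.pi * sigma * chi)
    return total * h

test_sigmas = (0.0, 0.1, 0.391, 1.0)      # cm^-1, arbitrary test points
for sigma in test_sigmas:
    numeric = boxcar_transform(sigma)
    closed_form = 2.0 * D * sinc(2.0 * math.pi * sigma * D)   # Eq. (8.7b)
    print(f"{sigma:6.3f}  {numeric: .6f}  {closed_form: .6f}")
```

The numerical and closed-form columns agree to within the trapezoid-rule error, and the main lobe of the sinc is already close to its first zero near σ = 1/(2D) ≈ 0.39 cm⁻¹, which is why it is narrow compared to slowly varying spectral factors.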

Equations (8.7b) and (8.1e) can now be substituted into (8.7a) to get

F ( − iσχ ) ( Π ( χ , D) zC( tot ) ( χ ) ) = 2 D sinc(2πσ D) ∗ [H(uσ ) M( Rσθ ma ) Z FOV (σ )] . (8.7d)

Functions H and M vary slowly with σ compared to sinc(2πσD), and the sinc function is very
narrow about σ = 0 compared to H and M. This means, according to Eq. (5C.1) in Appendix 5C
of Chapter 5, that (8.7d) can be approximated as

F ( − iσχ ) ( Π ( χ , D) zC( tot ) ( χ ) ) ≅ H(uσ ) M( Rσθ ma )[2 D sinc(2πσ D) ∗ Z FOV (σ )] ,

which becomes, substituting from Eq. (7.16h) in Chapter 7,

F ( − iσχ ) ( Π ( χ , D) zC( tot ) ( χ ) ) ≅ H(uσ ) M( Rσθ ma ) Z mnf (σ ) , (8.7e)

where [see Eq. (7.16f)]

$$Z_{mnf}(\sigma) = \frac{WA\,\Delta\Omega}{4}\, R(|\sigma|)\, \eta(\sigma)\, \tau_a(|\sigma|) \left[\tau_f(|\sigma|)\, L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)\right] .$$    (8.7f)

Expanding the second term on the right-hand side of Eq. (8.6d) starts out the same way as
expanding the first. Again, we use Eq. (2.39j) in Chapter 2 to get

$$F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, n^{(s)}(\chi)\, \frac{dz_C^{(tot)}}{d\chi}\right) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, n^{(s)}(\chi)\right) \ast F^{(-i\sigma\chi')}\left(\frac{dz_C^{(tot)}(\chi')}{d\chi'}\right) .$$    (8.8a)

Formula (2.35e) in Chapter 2 shows that

$$F^{(-i\sigma\chi')}\left(\frac{dz_C^{(tot)}(\chi')}{d\chi'}\right) = 2\pi i\sigma\, F^{(-i\sigma\chi')}\left(z_C^{(tot)}(\chi')\right) ,$$

which becomes, substituting from Eq. (8.1e),

$$F^{(-i\sigma\chi')}\left(\frac{dz_C^{(tot)}(\chi')}{d\chi'}\right) = 2\pi i\sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma) .$$    (8.8b)

For future use, we define, reversing the Fourier transform in (8.8b), that

$$W_s(\chi) = \frac{dz_C^{(tot)}}{d\chi} = F^{(i\sigma\chi)}\left(2\pi i\sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma)\right) ,$$    (8.8c)

which, of course, can also be written as [see Eq. (2.29c) in Chapter 2]

$$W_s(\chi) = 2\pi i \int_{-\infty}^{\infty} \sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma)\, e^{2\pi i\sigma\chi}\,d\sigma .$$    (8.8d)

The D-limited Fourier transform of $n^{(s)}(\chi)$ is defined to be

$$n_D^{(s)}(\sigma) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, n^{(s)}(\chi)\right)$$    (8.8e)

or

$$n_D^{(s)}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, n^{(s)}(\chi)\, e^{-2\pi i\sigma\chi}\,d\chi .$$    (8.8f)

Because $n_D^{(s)}(\sigma)$ is the Fourier transform of the real-valued random function

$$[\Pi(\chi, D)\, n^{(s)}(\chi)] ,$$

it must, according to entry 7 in Table 2.1 of Chapter 2, be Hermitian:

$$n_D^{(s)}(-\sigma) = n_D^{(s)}(\sigma)^{*} .$$    (8.8g)

The formula for $n_D^{(s)}(\sigma)$ can also be written as, consulting the prescription for Π(χ, D) in (8.6a),

$$n_D^{(s)}(\sigma) = \int_{-D}^{D} n^{(s)}(\chi)\, e^{-2\pi i\sigma\chi}\,d\chi .$$    (8.8h)
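The Hermitian property of Eq. (8.8g) holds for any real-valued noise record, which a short numerical experiment makes concrete. In this sketch the integral of Eq. (8.8h) is replaced by a Riemann sum on a uniform grid, and the noise samples, their 10⁻⁶ cm rms size, the value D = 1.28 cm, and the test wavenumber are all assumptions made for illustration.

```python
import cmath
import random

random.seed(1)                   # fixed seed so the run is repeatable
D = 1.28                         # half-length of the interferogram, cm (assumed)
n_points = 257                   # odd count so the grid is symmetric about chi = 0
d_chi = 2.0 * D / (n_points - 1)
chis = [-D + k * d_chi for k in range(n_points)]

# Real-valued sampling-position noise samples, in cm (illustrative rms of 1e-6 cm).
noise = [random.gauss(0.0, 1.0e-6) for _ in chis]

def n_D(sigma):
    """Riemann-sum estimate of the D-limited transform in Eq. (8.8h)."""
    return sum(n * cmath.exp(-2j * cmath.pi * sigma * chi)
               for n, chi in zip(noise, chis)) * d_chi

sigma = 3.7                      # arbitrary test wavenumber, cm^-1
lhs = n_D(-sigma)                # transform at the negative wavenumber
rhs = n_D(sigma).conjugate()     # complex conjugate of the transform at +sigma

print(lhs)
print(rhs)
```

The two printed values agree to rounding error, regardless of the seed or the test wavenumber, because each term of the sum at −σ is the complex conjugate of the corresponding term at +σ when the noise samples are real.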

To finish up the analysis of the second term, we substitute Eqs. (8.8b) and (8.8e) into (8.8a) to get

$$F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\, n^{(s)}(\chi)\, \frac{dz_C^{(tot)}}{d\chi}\right) = n_D^{(s)}(\sigma) \ast \left[2\pi i\sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma)\right] .$$    (8.8i)

Now that the two terms on the right-hand side of Eq. (8.6d) have been expanded and analyzed,
we use their formulas in Eqs. (8.7e) and (8.8i) to write the formula for the uncalibrated spectral
signal contaminated by sampling noise:

$$\begin{aligned}
Z_{eff,totN}(\sigma) \cong{}& H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{mnf}(\sigma) \\
&+ n_D^{(s)}(\sigma) \ast \left[2\pi i\sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma)\right] .
\end{aligned}$$    (8.9a)

Applying the expectation operator E to both sides of Eq. (8.8h) gives, according to Eqs. (3.16a)
and (3.17c) in Sec. 3.10 of Chapter 3,

$$E\left(n_D^{(s)}(\sigma)\right) = E\left(\int_{-D}^{D} n^{(s)}(\chi)\, e^{-2\pi i\sigma\chi}\,d\chi\right) = \int_{-D}^{D} E\left(n^{(s)}(\chi)\right) e^{-2\pi i\sigma\chi}\,d\chi ,$$

which reduces to, substituting from Eq. (8.3a) above,

$$E\left(n_D^{(s)}(\sigma)\right) = 0 .$$    (8.9b)

Again using the linearity of E with respect to random quantities as explained in Sec. 3.10 of
Chapter 3, we apply the expectation operator to both sides of (8.9a) to get

Uncalibrated Spectral Signals · 8.4

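$$\begin{aligned}
E\!\left(\tilde{Z}_{eff,totN}(\sigma)\right) &\cong E\!\left(H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma)\right) + E\!\left(\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right) \\
&= H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) + E\!\left(\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right),
\end{aligned} \tag{8.9c}$$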

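where in the last step we apply Eq. (3.9f) of Chapter 3, noting that $E(c) = c$ for nonrandom
quantities c. The convolution in the second term on the right-hand side can be written as the
integral [see Eqs. (2.38b) and (2.38a) in Chapter 2]

$$\begin{aligned}
\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]
&= \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right] * \tilde{n}_D^{(s)}(\sigma) \\
&= \int_{-\infty}^{\infty}\left[2\pi i\sigma'\,H(u\sigma')\,M(R\sigma'\theta_{ma})\,Z_{FOV}(\sigma')\right]\tilde{n}_D^{(s)}(\sigma-\sigma')\,d\sigma'.
\end{aligned}$$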

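Applying E to both sides gives

$$E\!\left(\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right) = E\!\left(\int_{-\infty}^{\infty}\left[2\pi i\sigma'\,H(u\sigma')\,M(R\sigma'\theta_{ma})\,Z_{FOV}(\sigma')\right]\tilde{n}_D^{(s)}(\sigma-\sigma')\,d\sigma'\right).$$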

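We use the linearity of E as explained in Sec. 3.10 of Chapter 3 to move E inside the integral
and then substitute from (8.9b) to get

$$E\!\left(\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right) = \int_{-\infty}^{\infty}\left[2\pi i\sigma'\,H(u\sigma')\,M(R\sigma'\theta_{ma})\,Z_{FOV}(\sigma')\right]E\!\left(\tilde{n}_D^{(s)}(\sigma-\sigma')\right)d\sigma' = 0. \tag{8.9d}$$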

Hence, Eq. (8.9c) reduces to

$$E\!\left(\tilde{Z}_{eff,totN}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma). \tag{8.9e}$$


This shows that the sampling noise can always be reduced to negligible levels in the uncalibrated
spectral signal by averaging together many independent measurements of the same spectral
radiance. In this respect, the sampling noise behaves the same way as the detector noise and
mirror-misalignment noise examined in the two previous chapters [see Eq. (7.18e) in Chapter 7
and the discussion following Eq. (6.30c) in Chapter 6].
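
The effect of averaging can be illustrated with a toy simulation. This is only a sketch under simplifying assumptions: the zero-mean noise is taken to be scalar Gaussian rather than the convolved complex noise of Eq. (8.9a), and the names `averaged_signal`, `true_signal`, and the noise level are hypothetical.

```python
import random

def averaged_signal(true_signal, noise_std, n_measurements, rng):
    """Average n independent measurements, each equal to true_signal plus zero-mean noise."""
    total = 0.0
    for _ in range(n_measurements):
        total += true_signal + rng.gauss(0.0, noise_std)
    return total / n_measurements

rng = random.Random(12345)      # fixed seed so the run is repeatable
true_signal = 2.5               # stands in for the noise-free term H * M * Z_mnf
single_error = abs(averaged_signal(true_signal, 1.0, 1, rng) - true_signal)
averaged_error = abs(averaged_signal(true_signal, 1.0, 20000, rng) - true_signal)
print(single_error, averaged_error)
```

With 20000 measurements the standard deviation of the average is about 0.007 in these units, so the averaged error is expected to be far smaller than a typical single-measurement error.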

8.5 Calibrating the Spectral Signal Contaminated by Sampling Noise


To find the calibrated spectral radiance contaminated by sampling noise, we again apply the
spectral calibration algorithm described in Sec. 5.19 of Chapter 5. The analysis here closely
follows the pattern of Sec. 7.6 in Chapter 7, where the same algorithm is used to find the spectral
radiance contaminated by mirror-misalignment noise. Just as before, the spectral radiances $L^{(1)}$
and $L^{(2)}$ chosen to calibrate the instrument are set up to be slowly varying functions of
wavenumber so that [see Eqs. (7.19a) and (7.19b) in Chapter 7]

$$L^{(1)}(|\sigma|) \cong L^{(1)}_{FOV}(|\sigma|) \cong L^{(1)}_{mnf}(|\sigma|) \tag{8.10a}$$

and

$$L^{(2)}(|\sigma|) \cong L^{(2)}_{FOV}(|\sigma|) \cong L^{(2)}_{mnf}(|\sigma|). \tag{8.10b}$$

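To describe the uncalibrated spectral signal generated from observation of $L^{(1)}$, we again use
functions $Z^{(1)}_{FOV}$ and $Z^{(1)}_{mnf}$ defined in Eqs. (7.20b) and (7.20c) of Chapter 7:

$$Z^{(1)}_{FOV}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)\,L^{(1)}(|\sigma|) + L^{(fore)}_{FOV}(|\sigma|) - L^{(back)}_{FOV}(|\sigma|)\right] \tag{8.10c}$$

and

$$Z^{(1)}_{mnf}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)\,L^{(1)}(|\sigma|) + L^{(fore)}_{mnf}(|\sigma|) - L^{(back)}_{mnf}(|\sigma|)\right]. \tag{8.10d}$$

When we write down these formulas, functions $L^{(1)}$, $L^{(1)}_{FOV}$, and $L^{(1)}_{mnf}$ can be used interchangeably
as shown in Eq. (8.10a). Similarly, describing the uncalibrated spectral signal generated from
observation of $L^{(2)}$, we reuse functions $Z^{(2)}_{FOV}$ and $Z^{(2)}_{mnf}$ defined in Eqs. (7.20e) and (7.20f) of
Chapter 7,

$$Z^{(2)}_{FOV}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)\,L^{(2)}(|\sigma|) + L^{(fore)}_{FOV}(|\sigma|) - L^{(back)}_{FOV}(|\sigma|)\right] \tag{8.10e}$$

and

$$Z^{(2)}_{mnf}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)\,L^{(2)}(|\sigma|) + L^{(fore)}_{mnf}(|\sigma|) - L^{(back)}_{mnf}(|\sigma|)\right]. \tag{8.10f}$$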


In these formulas, as shown by Eq. (8.10b), functions $L^{(2)}$, $L^{(2)}_{FOV}$, and $L^{(2)}_{mnf}$ can be used
interchangeably.
Still using the same notation as in Sec. 7.6 of Chapter 7, we call $\tilde{Z}^{(1)}_{eff,totN}(\sigma)$ the uncalibrated,
noise-contaminated spectral signal produced when the interferometer observes $L^{(1)}(|\sigma|)$ and
$\tilde{Z}^{(2)}_{eff,totN}(\sigma)$ the uncalibrated, noise-contaminated spectral signal produced when the
interferometer observes $L^{(2)}(|\sigma|)$. Remembering that $\tilde{Z}_{eff,totN}(\sigma)$
on the left-hand side of Eq. (8.9a) is the uncalibrated spectral signal for any interferometer
measurement contaminated by sampling noise, we can get the formulas for $\tilde{Z}^{(1)}_{eff,totN}(\sigma)$ and
$\tilde{Z}^{(2)}_{eff,totN}(\sigma)$ by applying Eqs. (8.10c)–(8.10f) to the right side of Eq. (8.9a),

$$\tilde{Z}^{(1)}_{eff,totN}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z^{(1)}_{mnf}(\sigma) + \tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z^{(1)}_{FOV}(\sigma)\right] \tag{8.11a}$$

and

$$\tilde{Z}^{(2)}_{eff,totN}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z^{(2)}_{mnf}(\sigma) + \tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z^{(2)}_{FOV}(\sigma)\right]. \tag{8.11b}$$

We assume that the experimental procedure associated with this calibration algorithm includes
some form of data analysis equivalent to averaging together many independent measurements to
eliminate the sampling noise from $\tilde{Z}^{(1,2)}_{eff,totN}(\sigma)$. We apply the expectation operator E to both sides
of (8.11a) and (8.11b) to get the result of this averaging. Following the same reasoning used to go
from Eq. (8.9a) to (8.9e) above, we see that

$$E\!\left(\tilde{Z}^{(1)}_{eff,totN}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z^{(1)}_{mnf}(\sigma)$$

and

$$E\!\left(\tilde{Z}^{(2)}_{eff,totN}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z^{(2)}_{mnf}(\sigma).$$

Following the same procedure as in Eqs. (7.20i) and (7.20j) in Chapter 7, we again remove the
tilde and change the totN subscript to tot to define

$$Z^{(1)}_{eff,tot}(\sigma) = E\!\left(\tilde{Z}^{(1)}_{eff,totN}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z^{(1)}_{mnf}(\sigma) \tag{8.11c}$$


and

$$Z^{(2)}_{eff,tot}(\sigma) = E\!\left(\tilde{Z}^{(2)}_{eff,totN}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z^{(2)}_{mnf}(\sigma). \tag{8.11d}$$

The noise cannot, of course, be averaged away from the spectral measurement itself because [as
is discussed following Eq. (7.21a) in Chapter 7] in practice we cannot take the same amount of
care when collecting the spectral measurements as we do when collecting the known calibration
data. Just as is done in Sec. 7.6 of Chapter 7, we use $\tilde{Z}^{(meas)}_{eff,totN}(\sigma)$ to represent the signal for the
uncalibrated spectral measurement contaminated by noise (in this case, sampling noise). When
analyzing sampling noise, $\tilde{Z}^{(meas)}_{eff,totN}(\sigma)$ is the same quantity as $\tilde{Z}_{eff,totN}(\sigma)$
in Eq. (8.9a), which means Eq. (8.9a) can now be written as

$$\tilde{Z}^{(meas)}_{eff,totN}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) + \tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]. \tag{8.11e}$$

Here formula (8.7f) specifies $Z_{mnf}(\sigma)$ and (8.1b) specifies $Z_{FOV}(\sigma)$.


Now we can apply the calibration algorithm. Following the same procedure as in Sec. 7.6 of
Chapter 7, we have [repeating Eq. (7.21a)]

$$\text{Measured Radiance} = \left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right]\frac{\tilde{Z}^{(meas)}_{eff,totN}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)}{Z^{(2)}_{eff,tot}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)} + L^{(1)}(|\sigma|). \tag{8.12a}$$
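
The two-point calibration in Eq. (8.12a) can be sketched in a few lines. The instrument model below (a complex gain applied to radiance plus an offset) and all numerical values are hypothetical, chosen only to show that a linear response is inverted exactly by the formula.

```python
def calibrate(z_meas, z1, z2, l1, l2):
    """Eq. (8.12a): convert an uncalibrated signal z_meas into radiance,
    given the signals z1, z2 recorded for the known radiances l1, l2."""
    return (l2 - l1) * (z_meas - z1) / (z2 - z1) + l1

# Hypothetical noise-free linear instrument: signal = gain * (radiance + offset).
gain = complex(1.2, -0.5)       # complex because the transfer function is complex
offset = 0.3                    # stands in for the fore/back background terms

def instrument(radiance):
    return gain * (radiance + offset)

l1, l2 = 1.0, 4.0               # the two calibration radiances
l_true = 2.2                    # radiance of the scene being measured
recovered = calibrate(instrument(l_true), instrument(l1), instrument(l2), l1, l2)
print(recovered)
```

Because the gain and offset cancel in the ratio, the recovered value matches `l_true` up to rounding; any noise left in `z_meas` passes through the same linear map, which is how the noise term of the measured radiance arises.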

The formulas in (8.11c) and (8.11d) show that

$$\frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{Z^{(2)}_{eff,tot}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)} \cong \frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{H(u\sigma)\,M(R\sigma\theta_{ma})\left[Z^{(2)}_{mnf}(\sigma) - Z^{(1)}_{mnf}(\sigma)\right]}, \tag{8.12b}$$

which becomes, substituting from (8.10d) and (8.10f),


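$$\frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{Z^{(2)}_{eff,tot}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)} \cong \frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{H(u\sigma)\,M(R\sigma\theta_{ma})\,\dfrac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right]}$$

or

$$\frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{Z^{(2)}_{eff,tot}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)} \cong \left[H(u\sigma)\,M(R\sigma\theta_{ma})\,\frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\right]^{-1}. \tag{8.12c}$$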

This result is identical to (7.21c) in Chapter 7 because the sampling noise, like the mirror-
misalignment noise, can be reduced to negligible levels by averaging together many independent
measurements of the same radiance when gathering data for the calibration algorithms. In fact,
formula (8.12c) holds true whenever the noise in the data can be removed this way. To find the
value of

$$\tilde{Z}^{(meas)}_{eff,totN}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)$$

in Eq. (8.12a), we substitute from Eqs. (8.11e) and (8.11c) to get

$$\tilde{Z}^{(meas)}_{eff,totN}(\sigma) - Z^{(1)}_{eff,tot}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\left[Z_{mnf}(\sigma) - Z^{(1)}_{mnf}(\sigma)\right] + \tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right],$$

which becomes, consulting Eqs. (8.7f) and (8.10d) for the formulas of $Z_{mnf}$ and $Z^{(1)}_{mnf}$,

$$\begin{aligned}
\tilde{Z}^{(meas)}_{eff,totN}(\sigma) - Z^{(1)}_{eff,tot}(\sigma) \cong{}& H(u\sigma)\,M(R\sigma\theta_{ma})\,\frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[L_{mnf}(|\sigma|) - L^{(1)}(|\sigma|)\right] \\
&+ \tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right].
\end{aligned} \tag{8.12d}$$

Equations (8.12c) and (8.12d) can now be put into (8.12a) to get


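$$\text{Measured Radiance} \cong L_{mnf}(|\sigma|) + \frac{4\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}}{(WA\,\Delta\Omega)\,H(u\sigma)\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{8.12e}$$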

Equation (6.55a) in Chapter 6 shows that the complex-valued transfer function H can be written
as

$$H(u\sigma) = |H(u\sigma)|\,e^{i\psi(\sigma)} \tag{8.12f}$$

with $\psi(\sigma)$ being the phase of the complex-valued function $H(u\sigma)$. According to the discussion
following Eq. (6.55a), Eq. (5A.6b) in Appendix 5A of Chapter 5 applies to the transfer function
H in (8.12f); that is, H is Hermitian:

$$H(-u\sigma) = H(u\sigma)^{*}. \tag{8.12g}$$

Substitution of (8.12f) into (8.12g) gives

$$|H(-u\sigma)|\,e^{i\psi(-\sigma)} = |H(u\sigma)|\,e^{-i\psi(\sigma)},$$

which can only be true if $|H(u\sigma)|$ is an even function of its argument $u\sigma$,

$$|H(-u\sigma)| = |H(u\sigma)|, \tag{8.12h}$$

and $\psi(\sigma)$ is an odd function of its argument σ,

$$\psi(-\sigma) = -\psi(\sigma). \tag{8.12i}$$

Equation (8.12f) can be used to write the formula in (8.12e) as

$$\text{Measured Radiance} \cong L_{mnf}(|\sigma|) + \frac{4\,e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{8.12j}$$

The denominator of the second term on the right-hand side is real, but the numerator of this term
almost certainly has both a real and imaginary component. In this regard, Eq. (8.12j) resembles Eq.
(7.21e) in Chapter 7, which also shows the measured radiance spectrum to be the sum of
$L_{mnf}(|\sigma|)$ and a complex random term. Just as in the discussion following (7.21e), we note that
only the real part of the second term acts as a source of unavoidable noise, since we can always


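discard any imaginary components of the noise-contaminated measured radiance. Once again we
can conclude that only the real part of the second term of the formula is the random spectral noise
$\delta\tilde{L}$ for the measured radiance,

$$\delta\tilde{L} = \frac{4\,\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{8.12k}$$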

8.6 Random Sampling Error in the Measured Spectrum


The $\delta\tilde{L}$ sampling error is an even function of wavenumber σ. To see why this is so, we need
only analyze the different functions inside formula (8.12k) for $\delta\tilde{L}$. The $R$, $\tau_a$, and $\tau_f$ factors are
written as even functions of wavenumber (that is, as functions of the absolute value of σ), and
Eq. (8.12h) shows that $|H|$ is also an even function of σ. Equation (4.139g) in Chapter 4 states
that $\eta(\sigma)$ is even, and Eq. (5.10f) in Chapter 5 reveals M to be an even function of σ. Hence the
whole denominator of (8.12k) must be an even function of wavenumber σ. To analyze the
numerator, we note that everything in the formula for $Z_{FOV}$ in Eq. (8.1b) is real, so $Z_{FOV}$ is real
and, since $\eta(\sigma)$ and the other functions in the formula are even, function $Z_{FOV}$ is also even:

$$Z_{FOV}(-\sigma) = Z_{FOV}(\sigma). \tag{8.13a}$$

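The convolution in the numerator of Eq. (8.12k) can be written as [see Eqs. (2.38a) and (2.38b) in
Chapter 2]

$$\begin{aligned}
\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]
&= \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right] * \tilde{n}_D^{(s)}(\sigma) \\
&= \int_{-\infty}^{\infty}\left[2\pi i\sigma'\,H(u\sigma')\,M(R\sigma'\theta_{ma})\,Z_{FOV}(\sigma')\right]\tilde{n}_D^{(s)}(\sigma-\sigma')\,d\sigma',
\end{aligned}$$

which means that, since only $i$, $H$, and $\tilde{n}_D^{(s)}$ are complex,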


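$$\begin{aligned}
\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}^{*}
&= \int_{-\infty}^{\infty}\left[-2\pi i\sigma'\,H(u\sigma')^{*}\,M(R\sigma'\theta_{ma})\,Z_{FOV}(\sigma')\right]\tilde{n}_D^{(s)}(\sigma-\sigma')^{*}\,d\sigma' \\
&= \left[-2\pi i\sigma\,H(u\sigma)^{*}\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right] * \tilde{n}_D^{(s)}(\sigma)^{*} \\
&= \tilde{n}_D^{(s)}(\sigma)^{*} * \left[-2\pi i\sigma\,H(u\sigma)^{*}\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right].
\end{aligned}$$

This is the formula for the complex conjugate of the convolution. The numerator in (8.12k) is
proportional to

$$\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right);$$

and since the real part of any complex number c can be written as $0.5(c + c^{*})$, we see that the real
part of the convolution is

$$\begin{aligned}
\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)
={}& \frac{1}{2}\Big(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\} \\
&\quad + e^{i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}^{*}\Big) \\
={}& \frac{1}{2}\Big(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\} \\
&\quad + e^{i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma)^{*} * \left[-2\pi i\sigma\,H(u\sigma)^{*}\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\Big).
\end{aligned} \tag{8.13b}$$
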
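Equations (8.8g), (8.12g), and (8.12i) show that this can also be written as

$$\begin{aligned}
\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)
={}& \frac{1}{2}\Big(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\} \\
&\quad + e^{-i\psi(-\sigma)}\left\{\tilde{n}_D^{(s)}(-\sigma) * \left[2\pi i(-\sigma)\,H(-u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\Big).
\end{aligned}$$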


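Since, as has already been noted, M and $Z_{FOV}$ are even, it follows that

$$\begin{aligned}
\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)
={}& \frac{1}{2}\Big(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\} \\
&\quad + e^{-i\psi(-\sigma)}\left\{\tilde{n}_D^{(s)}(-\sigma) * \left[2\pi i(-\sigma)\,H(-u\sigma)\,M(-R\sigma\theta_{ma})\,Z_{FOV}(-\sigma)\right]\right\}\Big).
\end{aligned} \tag{8.13c}$$

The right-hand side of (8.13c) is clearly an even function of wavenumber; when σ is replaced by
−σ, only the order of the sum changes. Consequently the left-hand side of (8.13c) must also be an
even function of σ; hence, the numerator of (8.12k), just like the denominator of (8.12k), must be
an even function of wavenumber. Consequently, it makes sense to write the formula for the
random sampling error in (8.12k) as

$$\delta\tilde{L}(|\sigma|) = \frac{4\,\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{8.13d}$$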

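The absolute value signs in the argument for $\delta\tilde{L}$ remind us that both sides of this formula are
even functions of σ.
The linearity of the expectation operator E with respect to random quantities (see Sec. 3.10 in
Chapter 3) lets us apply E to both sides of (8.13d) and take the nonrandom quantities outside the
expectation value to get

$$E\!\left(\delta\tilde{L}(|\sigma|)\right) = \frac{4\,E\!\left(\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)\right)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{8.14a}$$

To evaluate the numerator of the right-hand side, we again note that the real part of any complex
number c can be written as $0.5(c + c^{*})$ and then use the linearity of E to get

$$\begin{aligned}
E\!\left(\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)\right)
={}& \frac{1}{2}\Big[e^{-i\psi(\sigma)}\,E\!\left(\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right) \\
&\quad + e^{i\psi(\sigma)}\,E\!\left(\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}^{*}\right)\Big].
\end{aligned} \tag{8.14b}$$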


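According to Eq. (8.9d),

$$E\!\left(\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right) = 0; \tag{8.14c}$$

and if the mean of a random complex number is zero so is the mean of its complex conjugate:

$$E\!\left(\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}^{*}\right) = 0. \tag{8.14d}$$

We conclude, referring back to Eq. (8.14b), that

$$E\!\left(\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)\right) = 0. \tag{8.14e}$$

Substituting this latest result back into (8.14a) now gives

$$E\!\left(\delta\tilde{L}(|\sigma|)\right) = 0. \tag{8.14f}$$

Hence the random sampling error $\delta\tilde{L}$ is a zero-mean random variable.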

8.7 Calculating the NEdN from the Random Sampling Error


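Since $\delta\tilde{L}$ is a zero-mean random variable, its variance is [after applying Eqs. (3.8f), (3.8c) in
Chapter 3 and Eq. (8.14f) above]

$$E\!\left(\left[\delta\tilde{L}(|\sigma|) - E\!\left(\delta\tilde{L}(|\sigma|)\right)\right]^{2}\right) = E\!\left(\left[\delta\tilde{L}(|\sigma|)\right]^{2}\right). \tag{8.15a}$$

Consulting Eq. (8.13d), we see that

$$E\!\left(\left[\delta\tilde{L}(|\sigma|)\right]^{2}\right) = \frac{E\!\left(\left[\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)\right]^{2}\right)}{\left[A\,\Delta\Omega\,4^{-1}\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\right]^{2}}, \tag{8.15b}$$

where we have used that $W^{2} = 1$ because W = 1 or −1 [see discussion immediately preceding Eq.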


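(4.84a) in Chapter 4]. We define function $J^{(s)}(\sigma)$ to be

$$J^{(s)}(\sigma) = E\!\left(\left[\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\left\{\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)\right]^{2}\right), \tag{8.15c}$$

which means the variance in Eq. (8.15b) can be written as

$$E\!\left(\left[\delta\tilde{L}(|\sigma|)\right]^{2}\right) = \frac{16\,J^{(s)}(\sigma)}{\left[A\,\Delta\Omega\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\right]^{2}}. \tag{8.15d}$$

Equation (6.3f) in Chapter 6 states that the NEdN associated with any random error $\delta\tilde{L}$ is its
standard deviation, that is, the square root of the variance of $\delta\tilde{L}$. Hence, NEdN_samp, the NEdN
caused by the sampling error, is

$$\mathrm{NEdN}_{samp}(|\sigma|) = \frac{4\sqrt{J^{(s)}(\sigma)}}{A\,\Delta\Omega\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{8.15e}$$

The $J^{(s)}$ function specifies how the sampling noise interacts with the radiance spectrum, and the
denominator rescales the result so it has the right size with respect to the spectral measurement.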
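
As a small illustration of Eq. (8.15e), the function below evaluates the sampling-error NEdN at a single wavenumber once $J^{(s)}$ and the instrument factors are known. It is a sketch only: the function name, the argument names, and the numbers in the usage lines are hypothetical and carry no physical units from the text.

```python
import math

def nedn_samp(j_s, a_delta_omega, h_mag, m, r, eta, tau_a, tau_f):
    """Eq. (8.15e): NEdN = 4*sqrt(J) / (A*dOmega * |H| * M * R * eta * tau_a * tau_f).

    All arguments are the values of the corresponding functions at one wavenumber.
    j_s must be non-negative, which the text later establishes in general.
    """
    return 4.0 * math.sqrt(j_s) / (a_delta_omega * h_mag * m * r * eta * tau_a * tau_f)

# Hypothetical values: quadrupling J doubles the NEdN, as expected for a standard deviation.
base = nedn_samp(4.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
scaled = nedn_samp(16.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
print(base, scaled)
```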
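To evaluate $J^{(s)}$, we set up three new functions of wavenumber called $T_1(\sigma)$, $T_2(\sigma)$, and
$T_3(\sigma)$. Using function $W_s(\chi)$ from Eqs. (8.8c) or (8.8d) above, we define

$$T_1(\sigma) = E\!\left(\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2}\right), \tag{8.16a}$$

$$T_2(\sigma) = E\!\left(\left[\mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2}\right), \tag{8.16b}$$

and

$$T_3(\sigma) = E\!\left(\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\cdot\mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right). \tag{8.16c}$$

Equation (8.8c) shows that

$$W_s(\chi) = \frac{dz_C^{(tot)}}{d\chi} \tag{8.16d}$$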


is real, as are functions $\tilde{n}^{(s)}(\chi)$ and $\Pi(\chi,D)$ introduced in (8.2a) and (8.6a), so when taking the
complex conjugate of the Fourier transform in Eq. (8.16a) we get, applying Eqs. (2.29a) and
(2.29c) in Chapter 2,

$$\begin{aligned}
\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right]^{*}
&= \left[\int_{-\infty}^{\infty}\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\right]^{*} \\
&= \int_{-\infty}^{\infty}\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\,e^{2\pi i\sigma\chi}\,d\chi \\
&= \mathcal{F}^{(i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right).
\end{aligned} \tag{8.16e}$$

Hence the Fourier transforms in (8.16a) and (8.16b) are complex conjugates, which means their
squares must also be complex conjugates, as are the expectation values of the squares. We thus
end up with the relationship

$$T_1(\sigma)^{*} = T_2(\sigma). \tag{8.16f}$$

Consequently any formula derived for $T_1(\sigma)$ can be turned into a formula for $T_2(\sigma)$ just by
taking the complex conjugate of both sides of the equation.
Working first with the $T_1$ term in Eq. (8.16a), we consult the definition of the Fourier-
transform operator $\mathcal{F}$ [see Eqs. (2.29a) and (2.29c) in Chapter 2] and Eq. (3.17c) in Chapter 3 to
get

$$\begin{aligned}
T_1(\sigma) &= E\!\left(\int_{-\infty}^{\infty}\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\int_{-\infty}^{\infty}\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi'\right) \\
&= \int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{-2\pi i\sigma\chi'}\,E\!\left(\tilde{n}^{(s)}(\chi)\,\tilde{n}^{(s)}(\chi')\right).
\end{aligned}$$

This can also be written as, first substituting from Eq. (8.3b) and then applying (8.3c),

$$T_1(\sigma) = \int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{-2\pi i\sigma\chi'}\,o^{(s)}_{\tilde{n}\tilde{n}}(\chi-\chi'). \tag{8.17}$$

The Fourier transform of $\Pi(\chi,D)$ is [see Eq. (8.7b)]

$$2D\,\mathrm{sinc}(2\pi\sigma D)$$


and reversing the transform in (8.8d) shows that the forward Fourier transform of $W_s(\chi)$ is

$$2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma).$$

Applying the Fourier convolution theorem to the Fourier transform of the product function

$$\Pi(\chi,D)\,W_s(\chi)$$

then gives, using formula (2.39k) in Chapter 2,

$$\int_{-\infty}^{\infty}\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \left[2D\,\mathrm{sinc}(2\pi\sigma D)\right] * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]. \tag{8.18a}$$

In a well-designed interferometer, the D parameter limiting the extent of the double-sided
interferogram is large enough to make the $2\pi i\sigma$ factor, H, and M all slowly varying functions of
wavenumber compared to

$$2D\,\mathrm{sinc}(2\pi\sigma D).$$

This means, according to Eq. (5C.1) in Appendix 5C of Chapter 5, that (8.18a) can be
approximated as

$$\int_{-\infty}^{\infty}\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \cong 2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\left\{\left[2D\,\mathrm{sinc}(2\pi\sigma D)\right] * Z_{FOV}(\sigma)\right\},$$

which becomes, after consulting Eq. (7.16h) in Chapter 7,

$$\int_{-\infty}^{\infty}\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \cong 2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma). \tag{8.18b}$$

Taking the complex conjugate of both sides gives, since $\Pi(\chi,D)$, $W_s(\chi)$, M, and $Z_{mnf}$ are real
quantities,

$$\int_{-\infty}^{\infty}\Pi(\chi,D)\,W_s(\chi)\,e^{2\pi i\sigma\chi}\,d\chi \cong -2\pi i\sigma\,H(u\sigma)^{*}\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma). \tag{8.18c}$$


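Substituting (8.4b) into (8.17) leads to

$$\begin{aligned}
T_1(\sigma) &= \int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{-2\pi i\sigma\chi'}\int_{-\infty}^{\infty}d\sigma'\,p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\,e^{2\pi i\sigma'(\chi-\chi')} \\
&= \int_{-\infty}^{\infty}d\sigma'\,p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\chi(\sigma-\sigma')}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i\chi'(-\sigma-\sigma')},
\end{aligned}$$

which becomes, applying Eqs. (8.18b) and (8.18c),

$$\begin{aligned}
T_1(\sigma) \cong \int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\,&\left[2\pi i(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\right]\cdot \\
&\left[-2\pi i(-\sigma-\sigma')\,H\big(u(-\sigma-\sigma')\big)^{*}\,M\big(R(-\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(-\sigma-\sigma')\right]d\sigma'.
\end{aligned} \tag{8.18d}$$

Glancing back at the formula for $Z_{mnf}$ in Eq. (8.7f) above, we see that every function in the
formula depends on $|\sigma|$ except for η, and according to Eq. (4.139g) in Chapter 4, η is also an
even function of σ. Hence $Z_{mnf}$ is even:

$$Z_{mnf}(-\sigma) = Z_{mnf}(\sigma). \tag{8.18e}$$

Equation (5.10f) in Chapter 5 shows that $M(R\sigma\theta_{ma})$ is an even function of σ, and Eq. (5A.6b) in
Appendix 5A of Chapter 5 shows that H is Hermitian. Consequently the formula for $T_1(\sigma)$ in
(8.18d) can be written as

$$\begin{aligned}
T_1(\sigma) = -4\pi^{2}\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\,&\left[(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\right]\cdot \\
&\left[(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\right]d\sigma'.
\end{aligned} \tag{8.18f}$$

Equation (8.4d) shows that the power spectrum of the sampling-position noise is real, and we
already know that M and $Z_{mnf}$ are real. Hence only the transfer function H in (8.18f) can have
a nonzero imaginary component, so when Eq. (8.16f) is applied to (8.18f) to get the formula for
$T_2(\sigma)$, the result is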


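$$\begin{aligned}
T_2(\sigma) = -4\pi^{2}\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\,&\left[(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)^{*}\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\right]\cdot \\
&\left[(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)^{*}\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\right]d\sigma'.
\end{aligned} \tag{8.18g}$$

Function $T_3(\sigma)$ in Eq. (8.16c) can be written as, applying the Fourier transform operator as
shown in Eqs. (2.29a,c) in Chapter 2,

$$T_3(\sigma) = E\!\left(\int_{-\infty}^{\infty}\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\cdot\int_{-\infty}^{\infty}\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\,e^{2\pi i\sigma\chi'}\,d\chi'\right).$$

Equation (3.17c) in Chapter 3 shows that the expectation operator E can be taken inside the
integrals to get, after applying Eq. (8.3b),

$$\begin{aligned}
T_3(\sigma) &= \int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i\sigma\chi'}\,E\!\left(\tilde{n}^{(s)}(\chi)\,\tilde{n}^{(s)}(\chi')\right) \\
&= \int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i\sigma\chi'}\,o^{(s)}_{\tilde{n}\tilde{n}}(\chi'-\chi).
\end{aligned} \tag{8.19a}$$

According to Eq. (8.3c), the autocorrelation function $o^{(s)}_{\tilde{n}\tilde{n}}$ is even,

$$o^{(s)}_{\tilde{n}\tilde{n}}(-\chi) = o^{(s)}_{\tilde{n}\tilde{n}}(\chi),$$

which means that $o^{(s)}_{\tilde{n}\tilde{n}}(\chi'-\chi)$ in the formula for $T_3$ can be written as

$$o^{(s)}_{\tilde{n}\tilde{n}}(\chi'-\chi) = \frac{1}{2}\,o^{(s)}_{\tilde{n}\tilde{n}}(\chi-\chi') + \frac{1}{2}\,o^{(s)}_{\tilde{n}\tilde{n}}(\chi'-\chi). \tag{8.19b}$$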

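Substituting this into (8.19a) gives

$$\begin{aligned}
T_3(\sigma) ={}& \frac{1}{2}\int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i\sigma\chi'}\,o^{(s)}_{\tilde{n}\tilde{n}}(\chi-\chi') \\
&+ \frac{1}{2}\int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i\sigma\chi'}\,o^{(s)}_{\tilde{n}\tilde{n}}(\chi'-\chi).
\end{aligned}$$

Again, just as in the analysis of $T_1$ above, Eq. (8.4b) is applied to get

$$\begin{aligned}
T_3(\sigma) ={}& \frac{1}{2}\int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i\sigma\chi'}\int_{-\infty}^{\infty}d\sigma'\,p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\,e^{2\pi i\sigma'(\chi-\chi')} \\
&+ \frac{1}{2}\int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\sigma\chi}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i\sigma\chi'}\int_{-\infty}^{\infty}d\sigma'\,p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\,e^{2\pi i\sigma'(\chi'-\chi)}
\end{aligned}$$

or

$$\begin{aligned}
T_3(\sigma) ={}& \frac{1}{2}\int_{-\infty}^{\infty}d\sigma'\,p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\chi(\sigma-\sigma')}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i\chi'(\sigma-\sigma')} \\
&+ \frac{1}{2}\int_{-\infty}^{\infty}d\sigma'\,p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\int_{-\infty}^{\infty}d\chi\,\Pi(\chi,D)\,W_s(\chi)\,e^{-2\pi i\chi(\sigma+\sigma')}\int_{-\infty}^{\infty}d\chi'\,\Pi(\chi',D)\,W_s(\chi')\,e^{2\pi i\chi'(\sigma+\sigma')}.
\end{aligned} \tag{8.19c}$$

Equations (8.18b) and (8.18c) show that

$$\begin{aligned}
T_3(\sigma) = 2\pi^{2}\Big\{&\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left[(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\right]\cdot \\
&\qquad\left[(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)^{*}\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\right]d\sigma' \\
&+ \int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left[(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\right]\cdot \\
&\qquad\left[(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)^{*}\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\right]d\sigma'\Big\}
\end{aligned}$$

or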


$$\begin{aligned}
T_3(\sigma) = 2\pi^{2}\Big\{&\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left|(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\right|^{2}d\sigma' \\
&+ \int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left|(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\right|^{2}d\sigma'\Big\}
\end{aligned} \tag{8.19d}$$

because everything inside the integrals except H is real. According to Eq. (3.54g) in Chapter 3,
noise-power spectra such as $p^{(s)}_{\tilde{n}\tilde{n}}$ can never be negative. The $p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')$ power spectra in both
integrals of (8.19d) are multiplied by the squared magnitudes of complex numbers before being
integrated over $d\sigma'$. Consequently neither of the integrals in (8.19d) can be negative, showing that

$$T_3(\sigma) \geq 0 \tag{8.19e}$$

for all values of wavenumber σ.
Having found formulas for $T_1$, $T_2$, and $T_3$, we are now prepared to expand the $J^{(s)}$ function
defined in Eq. (8.15c) above. Reversing the transform in Eq. (8.8c) gives

$$2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left(W_s(\chi)\right), \tag{8.20a}$$

and Eq. (8.8e) above shows that $\tilde{n}_D^{(s)}$ is the Fourier transform

$$\tilde{n}_D^{(s)}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\right).$$

According to Eq. (2.39j) in Chapter 2, the Fourier convolution theorem shows that

$$\tilde{n}_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right] = \mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right). \tag{8.20b}$$

Hence, the formula for $J^{(s)}$ in Eq. (8.15c) can also be written as

$$J^{(s)}(\sigma) = E\!\left(\left[\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right)\right]^{2}\right). \tag{8.20c}$$

We now begin the analysis of the right-hand side of Eq. (8.20c). Once again noting that the
real part of any complex number c is $0.5(c + c^{*})$, we write


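$$\begin{aligned}
\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right)
= \frac{1}{2}\Big\{&e^{-i\psi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right) \\
&+ e^{i\psi(\sigma)}\left[\mathcal{F}^{(-i\sigma\chi')}\!\left(\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\right)\right]^{*}\Big\}.
\end{aligned}$$

The product

$$\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)$$

inside the Fourier transforms is real, according to Eqs. (8.6a), (8.2a), and (8.8c), so [applying
formulas (2.29a) and (2.29c) in Chapter 2]

$$\begin{aligned}
\left[\mathcal{F}^{(-i\sigma\chi')}\!\left(\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\right)\right]^{*}
&= \left[\int_{-\infty}^{\infty}\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi'\right]^{*} \\
&= \int_{-\infty}^{\infty}\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\,e^{2\pi i\sigma\chi'}\,d\chi' \\
&= \mathcal{F}^{(i\sigma\chi')}\!\left(\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\right).
\end{aligned}$$

This shows that

$$\begin{aligned}
\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right)
= \frac{1}{2}\Big\{&e^{-i\psi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right) \\
&+ e^{i\psi(\sigma)}\,\mathcal{F}^{(i\sigma\chi')}\!\left(\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\right)\Big\}.
\end{aligned} \tag{8.21a}$$

Squaring this formula leads to

$$\begin{aligned}
\left[\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right)\right]^{2}
= \frac{1}{4}\Big\{&e^{-2i\psi(\sigma)}\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2} \\
&+ e^{2i\psi(\sigma)}\left[\mathcal{F}^{(i\sigma\chi')}\!\left(\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\right)\right]^{2} \\
&+ 2\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\cdot\mathcal{F}^{(i\sigma\chi')}\!\left(\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\right)\Big\}.
\end{aligned} \tag{8.21b}$$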


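The expectation operator E is linear with respect to random quantities (see Sec. 3.10 of Chapter
3), so E can be applied to both sides of (8.21b) to get

$$\begin{aligned}
E\!\left(\left[\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right)\right]^{2}\right)
={}& \frac{1}{4}\,e^{-2i\psi(\sigma)}\,E\!\left(\left[\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right]^{2}\right) \\
&+ \frac{1}{4}\,e^{2i\psi(\sigma)}\,E\!\left(\left[\mathcal{F}^{(i\sigma\chi')}\!\left(\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\right)\right]^{2}\right) \\
&+ \frac{1}{2}\,E\!\left(\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\cdot\mathcal{F}^{(i\sigma\chi')}\!\left(\Pi(\chi',D)\,\tilde{n}^{(s)}(\chi')\,W_s(\chi')\right)\right).
\end{aligned}$$

Substituting from Eqs. (8.16a)–(8.16c) gives

$$E\!\left(\left[\mathrm{Re}\!\left(e^{-i\psi(\sigma)}\,\mathcal{F}^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(s)}(\chi)\,W_s(\chi)\right)\right)\right]^{2}\right) = \frac{1}{4}\,e^{-2i\psi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\,e^{2i\psi(\sigma)}\,T_2(\sigma) + \frac{1}{2}\,T_3(\sigma),$$

and Eq. (8.20c) shows that this result can be written as

$$J^{(s)}(\sigma) = \frac{1}{4}\,e^{-2i\psi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\,e^{2i\psi(\sigma)}\,T_2(\sigma) + \frac{1}{2}\,T_3(\sigma). \tag{8.21c}$$

Equations (8.18f), (8.18g), and (8.19d) have formulas for $T_1$, $T_2$, and $T_3$ that can be substituted
into (8.21c) to get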


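$$\begin{aligned}
J^{(s)}(\sigma) ={}& -\pi^{2}e^{-2i\psi(\sigma)}\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left[(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\right]\cdot \\
&\qquad\left[(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\right]d\sigma' \\
&- \pi^{2}e^{2i\psi(\sigma)}\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left[(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)^{*}\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\right]\cdot \\
&\qquad\left[(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)^{*}\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\right]d\sigma' \\
&+ \pi^{2}\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left|e^{-i\psi(\sigma)}(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\right|^{2}d\sigma' \\
&+ \pi^{2}\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left|e^{-i\psi(\sigma)}(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\right|^{2}d\sigma',
\end{aligned} \tag{8.22a}$$

where $e^{-i\psi(\sigma)}$ terms are inserted into the squared magnitudes of the last two integrals. We can do
this because, for any complex number c,

$$|c|^{2} = \left|e^{-i\psi}c\right|^{2}.$$

We next define the complex-valued function $a(\sigma,\sigma')$ to be

$$a(\sigma,\sigma') = e^{-i\psi(\sigma)}(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma'). \tag{8.22b}$$

It follows that

$$a(\sigma,-\sigma') = e^{-i\psi(\sigma)}(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma'). \tag{8.22c}$$

Equation (8.22a) can now be written as, remembering that M and $Z_{mnf}$ are real-valued functions,

$$\begin{aligned}
J^{(s)}(\sigma) = \pi^{2}\Big\{&-\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\,a(\sigma,-\sigma')\,a(\sigma,\sigma')\,d\sigma' - \int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\,a(\sigma,-\sigma')^{*}\,a(\sigma,\sigma')^{*}\,d\sigma' \\
&+ \int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left|a(\sigma,-\sigma')\right|^{2}d\sigma' + \int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left|a(\sigma,\sigma')\right|^{2}d\sigma'\Big\}
\end{aligned}$$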


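or, combining the four integrals into one,

$$J^{(s)}(\sigma) = \pi^{2}\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\Big\{\left|a(\sigma,-\sigma')\right|^{2} + \left|a(\sigma,\sigma')\right|^{2} - a(\sigma,-\sigma')\,a(\sigma,\sigma') - a(\sigma,-\sigma')^{*}\,a(\sigma,\sigma')^{*}\Big\}\,d\sigma'. \tag{8.22d}$$

We note that

$$\begin{aligned}
\left|a(\sigma,-\sigma') - a(\sigma,\sigma')^{*}\right|^{2}
&= \left[a(\sigma,-\sigma') - a(\sigma,\sigma')^{*}\right]\cdot\left[a(\sigma,-\sigma')^{*} - a(\sigma,\sigma')\right] \\
&= a(\sigma,-\sigma')\,a(\sigma,-\sigma')^{*} - a(\sigma,-\sigma')\,a(\sigma,\sigma') - a(\sigma,\sigma')^{*}\,a(\sigma,-\sigma')^{*} + a(\sigma,\sigma')^{*}\,a(\sigma,\sigma') \\
&= \left|a(\sigma,-\sigma')\right|^{2} + \left|a(\sigma,\sigma')\right|^{2} - a(\sigma,-\sigma')\,a(\sigma,\sigma') - a(\sigma,-\sigma')^{*}\,a(\sigma,\sigma')^{*}.
\end{aligned}$$

This shows that the formula for $J^{(s)}$ can be written as

$$J^{(s)}(\sigma) = \pi^{2}\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\left|a(\sigma,-\sigma') - a(\sigma,\sigma')^{*}\right|^{2}d\sigma',$$

which becomes, substituting from Eqs. (8.22b) and (8.22c),

$$\begin{aligned}
J^{(s)}(\sigma) = \pi^{2}\int_{-\infty}^{\infty}p^{(s)}_{\tilde{n}\tilde{n}}(\sigma')\,\Big|&e^{-i\psi(\sigma)}(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma') \\
&- e^{i\psi(\sigma)}(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)^{*}\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\Big|^{2}d\sigma'.
\end{aligned} \tag{8.22e}$$

The $p^{(s)}_{\tilde{n}\tilde{n}}$ noise-power spectrum can never be negative [see inequality (3.54g) in Chapter 3], and
inside the integral in (8.22e) the noise-power spectrum is multiplied by the squared magnitude of a
complex number. Hence the integral in (8.22e) is over the product of two non-negative quantities
and itself can never be negative:

$$J^{(s)}(\sigma) \geq 0. \tag{8.22f}$$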
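
The algebraic step that turns the four-term integrand of (8.22d) into the squared magnitude in (8.22e) can be spot-checked numerically. The sketch below draws random complex numbers as stand-ins for $a(\sigma,-\sigma')$ and $a(\sigma,\sigma')$; the variable names are illustrative only.

```python
import random

rng = random.Random(7)   # fixed seed for repeatability

def random_complex():
    return complex(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))

max_error = 0.0
min_square = float("inf")
for _ in range(1000):
    a_minus = random_complex()   # stands in for a(sigma, -sigma')
    a_plus = random_complex()    # stands in for a(sigma, +sigma')
    # Integrand of (8.22d): |a-|^2 + |a+|^2 - a- a+ - (a- a+)^*
    four_terms = (abs(a_minus) ** 2 + abs(a_plus) ** 2
                  - a_minus * a_plus - (a_minus * a_plus).conjugate())
    # Integrand of (8.22e): |a- - (a+)^*|^2
    squared = abs(a_minus - a_plus.conjugate()) ** 2
    max_error = max(max_error, abs(four_terms - squared))
    min_square = min(min_square, squared)
print(max_error, min_square)
```

The two forms agree to rounding error for every draw, and the squared-magnitude form is manifestly non-negative, which is the property used to reach (8.22f).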


This shows there can never be any problem taking the square root of J ( s ) in formula (8.15e)
when calculating the sampling-error NEdN. Combining Eqs. (8.15e) and (8.22e) in a single place
gives
4 J ( s ) (σ )
NEdN samp ( σ ) = , (8.22g)
A ∆Ω H(uσ ) M( Rσθ ma )R ( σ )η (σ )τ a ( σ )τ f ( σ )
where

J ( s ) (σ ) =

π 2 ³ pnn(s ) (σ ′) e −iψ (σ ) (σ − σ ′) H ( u (σ − σ ′) ) M ( R(σ − σ ′)θ ma ) Z mnf (σ − σ ′) (8.22h)
−∞
2
− eiψ (σ ) (σ + σ ′) H ( u (σ + σ ′) ) M ( R(σ + σ ′)θ ma ) Z mnf (σ + σ ′) dσ ′ .

Part of the formula for J ( s ) can also be written as a convolution. Equation (8.4c), which
(s)
  is even, can be applied to the second integral in Eq. (8.19d) to get
shows that pnn

T3 (σ ) =

2π 2
{³ p (s)

nn
2
(σ ′) (σ − σ ′) H ( u (σ − σ ′) ) M ( R(σ − σ ′)θ ma ) Z mnf (σ − σ ′) dσ ′
−∞

+ ³
(s)
pnn
2
  ( −σ ′) (σ + σ ′) H ( u (σ + σ ′) ) M ( R (σ + σ ′)θ ma ) Z mnf (σ + σ ′) dσ ′ . }
−∞

Changing the variable of integration in the second integral to σ ′′ = −σ ′ then gives

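\[
\begin{aligned}
T_3(\sigma) = 2\pi^2 \Big\{ & \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| (\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma') \Big|^2 d\sigma' \\
& + \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma'')\,
\Big| (\sigma-\sigma'')\,H\big(u(\sigma-\sigma'')\big)\,M\big(R(\sigma-\sigma'')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma'') \Big|^2 d\sigma'' \Big\} ,
\end{aligned}
\]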

which becomes, glancing back at the definition of a convolution in Eq. (2.38a) in Chapter 2,

\[
T_3(\sigma) = 4\pi^2 \Big\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast
\Big[ \big| \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big|^2 \Big] \Big\} .
\tag{8.23a}
\]


This can be substituted back into Eq. (8.21c) to get

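\[
J^{(s)}(\sigma) = \frac{1}{4}\,e^{-2i\psi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\,e^{2i\psi(\sigma)}\,T_2(\sigma)
+ 2\pi^2 \Big\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast
\Big[ \big| \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big|^2 \Big] \Big\} .
\]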

Equation (8.16f) shows that this can be written as

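\[
\begin{aligned}
J^{(s)}(\sigma) &= \frac{1}{4}\,e^{-2i\psi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\,e^{2i\psi(\sigma)}\,T_1(\sigma)^{*}
+ 2\pi^2 \Big\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast
\Big[ \big| \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big|^2 \Big] \Big\} \\
&= \frac{1}{4}\,e^{-2i\psi(\sigma)}\,T_1(\sigma) + \frac{1}{4}\,\Big[ e^{-2i\psi(\sigma)}\,T_1(\sigma) \Big]^{*}
+ 2\pi^2 \Big\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast
\Big[ \big| \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big|^2 \Big] \Big\} .
\end{aligned}
\]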
Again noting that
\[
\mathrm{Re}(c) = \frac{1}{2}\,(c + c^{*})
\]
for any complex number c, we see that

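\[
J^{(s)}(\sigma) = \frac{1}{2}\,\mathrm{Re}\Big( e^{-2i\psi(\sigma)}\,T_1(\sigma) \Big)
+ 2\pi^2 \Big\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast
\Big[ \big| \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big|^2 \Big] \Big\} ,
\tag{8.23b}
\]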

where Eq. (8.18f) shows the formula for T1 (σ ) to be

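\[
\begin{aligned}
T_1(\sigma) = -4\pi^2 \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
& \Big[ (\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma') \Big] \cdot \\
& \Big[ (\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma') \Big] \, d\sigma' .
\end{aligned}
\tag{8.23c}
\]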

This alternative formula for J ( s ) is useful later on when analyzing the behavior of the sampling-
noise NEdN associated with the measurement of an isolated emission line (see Sec. 8.9).


8.8 Black-Body Spectrum Contaminated by Sampling Noise


To show what sampling noise looks like, we simulate an interferometer system contaminated by
large amounts of sampling noise while observing a 400-K Planck black-body spectrum. [There is
a brief discussion of black-body radiance spectra following Eq. (5.3h) in Chapter 5.] The
simulated interferometer is similar to the one set up in Sec. 7.15 of Chapter 7. Again the
interferogram is (supposed) to be evenly sampled 8192 times between the OPD values of D and
−D, with D = 1.28 cm. According to Eq. (5.67) in Chapter 5, this means the unapodized spectral
resolution is

\[
\frac{1}{2D} \cong 0.391\ \text{cm}^{-1} , \tag{8.24a}
\]

and, of course, the change in OPD between interferogram samples is still

\[
\frac{2D}{N} = 3.125 \times 10^{-4}\ \text{cm} . \tag{8.24b}
\]

The background radiance from the interferometer’s interior surfaces is assumed to be small
compared to the 400-K Planck radiance being measured, so we say that

\[
L^{(dir)}(\sigma) = L_{FOV}^{(fore)}(\sigma) = L_{FOV}^{(back)}(\sigma)
= L_{mnf}^{(fore)}(\sigma) = L_{mnf}^{(back)}(\sigma) = 0 . \tag{8.24c}
\]

Again, the beam radius is taken to be R = 3 cm, which makes the beam cross-sectional area

\[
A = \pi R^2 \cong 28.27\ \text{cm}^2 . \tag{8.24d}
\]

The interferometer’s field of view is

\[
\Delta\Omega = 1.086 \times 10^{-4}\ \text{ster} \tag{8.24e}
\]

and the responsivity $\mathcal{R}$ is still given its ideal value

\[
\mathcal{R}(\sigma) = 1\ \frac{\text{amp}\cdot\text{sec}}{\text{erg}} . \tag{8.24f}
\]

The beam-splitter efficiency and the transmissions of the fore and aft optics are also ideal,

\[
\tau_a(\sigma) = \tau_f(\sigma) = \eta(\sigma) = 1 . \tag{8.24g}
\]

The interferometer is perfectly aligned, with θ ma = 0 so that

M( Rσθ ma ) = 1.0 , (8.24h)

and parameter W [see discussion following Eq. (4.83) in Chapter 4] is

W = 1. (8.24i)

The detector electronics again have a three-pole, low-pass Butterworth filter with a cutoff
frequency of 8000 Hz (see Fig. 7.3 in Chapter 7). The OPD velocity u is still 5 cm/sec, so the
wavenumber corresponding to the cutoff frequency is

\[
\frac{8000\ \text{Hz}}{5\ \text{cm/sec}} = 1600\ \text{cm}^{-1} . \tag{8.24j}
\]
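
The derived quantities quoted in Eqs. (8.24a), (8.24b), (8.24d), and (8.24j) follow directly from the
stated simulation parameters. The short Python sketch below simply redoes that arithmetic; the
variable names are illustrative choices made here and are not notation from the text.

```python
import math

# Simulation parameters quoted in Sec. 8.8
D = 1.28            # maximum OPD, in cm
N = 8192            # number of interferogram samples between -D and +D
R_beam = 3.0        # beam radius, in cm
u = 5.0             # OPD velocity, in cm/sec
f_cutoff = 8000.0   # Butterworth cutoff frequency, in Hz

resolution = 1.0 / (2.0 * D)      # Eq. (8.24a): unapodized resolution, cm^-1
delta_chi = 2.0 * D / N           # Eq. (8.24b): OPD step between samples, cm
area = math.pi * R_beam ** 2      # Eq. (8.24d): beam cross-sectional area, cm^2
sigma_cutoff = f_cutoff / u       # Eq. (8.24j): cutoff expressed in cm^-1

print(f"resolution   = {resolution:.3f} cm^-1")
print(f"delta_chi    = {delta_chi:.4e} cm")
print(f"area         = {area:.2f} cm^2")
print(f"sigma_cutoff = {sigma_cutoff:.0f} cm^-1")
```

Dividing the electrical cutoff frequency by the OPD velocity is what converts the filter cutoff
into the wavenumber units used for the spectrum.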

One difference from the interferometer system simulated in the previous chapter is the band of
wavenumbers over which the spectrum is measured: this time it is 650 to 1250 cm⁻¹. Another
difference is the radiances used to calibrate the instrument. Since we are simulating the
measurement of a 400-K black-body spectrum, the high-temperature calibration is now a 500-K
instead of a 350-K black-body radiance. The low-temperature calibration is still that of liquid
nitrogen (77 K).
Figure 8.1(a) shows that the sampling-position noise contaminating the 400-K black-body
measurement has a quasi-harmonic $p_{\tilde n\tilde n}^{(s)}$ noise-power spectrum. This has the same shape as the
spectrum in Fig. 7.2(c) in Chapter 7, with the power spectrum in Fig. 8.1(a) having $\sigma_C$ = 30 cm⁻¹
and $\sigma_M$ = 10 cm⁻¹. The upper level in the sampling-position power spectrum is

p0 = 1.25 ×10−13 cm3 . (8.25a)

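Imitating Eq. (7.49b) in Chapter 7, we write the formula for the quasi-harmonic spectrum as

\[
p_{\tilde n\tilde n}^{(s)}(\sigma) = \big[ 1.25 \times 10^{-13}\ \text{cm}^3 \big] \cdot
\Big\{ \Pi\big( \sigma - 35\ \text{cm}^{-1},\, 5\ \text{cm}^{-1} \big)
+ \Pi\big( \sigma + 35\ \text{cm}^{-1},\, 5\ \text{cm}^{-1} \big) \Big\} .
\tag{8.25b}
\]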

Consulting Eq. (8.5b) above, we see that the variance in the sampling-position error due to this
noise-power spectrum is

\[
\int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma)\, d\sigma
= 20\ \text{cm}^{-1} \cdot 1.25 \times 10^{-13}\ \text{cm}^3 = 2.5 \times 10^{-12}\ \text{cm}^2 ,
\tag{8.25c}
\]

which means that the root-mean-square average of the error in the sampling position is

\[
s_{rms} = \sqrt{2.5 \times 10^{-12}\ \text{cm}^2} \cong 1.581 \times 10^{-6}\ \text{cm} . \tag{8.25d}
\]

Comparing this to (8.24b), we see that this is

\[
\frac{1.581 \times 10^{-6}\ \text{cm}}{3.125 \times 10^{-4}\ \text{cm}} ,
\]

or approximately 0.5% of the OPD separation between adjacent samples. This may be somewhat
larger than the typical size of the sampling error in well-designed interferometers, but the bad
sampling does make it easier to see how sampling error affects the measured spectra. Figures
8.1(b) and 8.1(c) give an example of sampling-position noise obeying the quasi-harmonic
noise-power spectrum in Fig. 8.1(a). Both figures plot the same simulation of the $\tilde n^{(s)}(\chi)$ random
function, with the χ axis expanded in Fig. 8.1(b) to provide a detailed example of the $\tilde n^{(s)}(\chi)$
oscillations. This sampling-position error is a zero-mean and normally distributed random
quantity. Comparing this example of quasi-harmonic noise to the one shown in Figs. 7.7(a) and
7.7(b) in Chapter 7, we see that here the random oscillations occur at a somewhat lower
frequency. This is due to our choice of a much smaller value of $\sigma_C$, which is 30 cm⁻¹ for the
noise in Figs. 8.1(b) and 8.1(c) compared to 100 cm⁻¹ for the noise in Figs. 7.7(a) and 7.7(b).
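
The variance and rms values in Eqs. (8.25c) and (8.25d) can be checked numerically, and a noise
record with the same band-limited power spectrum can be synthesized for comparison. The Python
sketch below is only an illustration under stated assumptions: it uses NumPy, it gives every Fourier
bin inside the 30 to 40 cm⁻¹ band a fixed amplitude and a random phase (a simplification, so the
record is only approximately normally distributed), and it is not the simulation procedure used to
produce Figs. 8.1(b) and 8.1(c).

```python
import numpy as np

# Parameters taken from Sec. 8.8
D = 1.28                         # maximum OPD, cm
N = 8192                         # interferogram samples
p0 = 1.25e-13                    # upper level of the power spectrum, cm^3
band_lo, band_hi = 30.0, 40.0    # |sigma| band where Eq. (8.25b) is nonzero, cm^-1

delta_chi = 2.0 * D / N          # OPD step, Eq. (8.24b)

# Analytic values: two bands, each (band_hi - band_lo) wide
variance = 2.0 * (band_hi - band_lo) * p0      # Eq. (8.25c), cm^2
s_rms = np.sqrt(variance)                      # Eq. (8.25d), cm
fraction = s_rms / delta_chi                   # rms error as a fraction of the step

# Synthetic record (assumption: fixed amplitude, random phase in each band bin)
rng = np.random.default_rng(0)
sigma_bins = np.fft.rfftfreq(N, d=delta_chi)   # 0 ... 1600 cm^-1
d_sigma = sigma_bins[1] - sigma_bins[0]
in_band = (sigma_bins >= band_lo) & (sigma_bins <= band_hi)

spectrum = np.zeros(sigma_bins.size, dtype=complex)
phases = rng.uniform(0.0, 2.0 * np.pi, int(in_band.sum()))
# Amplitude chosen so each two-sided bin carries power p0 * d_sigma
spectrum[in_band] = N * np.sqrt(p0 * d_sigma) * np.exp(1j * phases)

noise = np.fft.irfft(spectrum, n=N)            # real record of N samples, cm
sim_rms = np.sqrt(np.mean(noise ** 2))

print(f"s_rms (analytic)  = {s_rms:.4e} cm")
print(f"s_rms (synthetic) = {sim_rms:.4e} cm")
print(f"fraction of step  = {100.0 * fraction:.2f} %")
```

The synthetic rms differs slightly from the analytic value only because the band edges fall
between discrete Fourier bins.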
Figure 8.2(a) plots ten simulated measurements of the 400-K black-body radiance spectrum
for the interferometer system specified by the discussion accompanying Eqs. (8.24a)–(8.24j)
above. In Fig. 8.2(a), and only in Fig. 8.2(a), the actual sampling noise is multiplied by a factor of
20 before being added back to the true radiance; it can be regarded as increasing the $s_{rms}$
root-mean-square sampling-position error to 10% of the intersample spacing. This increase makes it
easy to see how the sampling noise reshapes the spectral measurements, because now the width
of the black solid line representing the true 400-K spectrum does not cover over the dashed lines
representing the noise-contaminated measurements. We note that there is a region near
σ = 1031 cm⁻¹ where the error is always small. The solid curve in Fig. 8.2(b) is the NEdN versus
wavenumber curve predicted by formulas (8.22g) and (8.22h) for this sampling-position noise,
and it also shows the NEdN dipping down to zero near σ = 1031 cm⁻¹. The NEdN is, of course,
just the standard deviation of the error in the noise-contaminated spectral measurements (see Sec.
6.1 in Chapter 6). We can take a large number of noise-contaminated measurements and calculate
directly the standard deviation of their error at any wavenumber σ. We have done this for 300
measurements contaminated by statistically independent examples of sampling-position noise
obeying the power spectrum in Fig. 8.1(a) and plotted the results with crosses in Fig. 8.2(b). As
expected, there is a good match to the predicted NEdN values (that is, the solid curve), and we
see that the crosses marking the standard deviation also dip down to zero near σ = 1031 cm⁻¹.
[The reason they do not go as far down as the solid curve is explained in the discussion following
Eq. (8.34c) in Sec. 8.10 below.]

FIGURE 8.1(a). [Plot of the noise-power spectrum $p_{\tilde n\tilde n}^{(s)}(\sigma)$, with upper level 1.25×10⁻¹³ cm³,
versus σ (in cm⁻¹) from −50 to 50.]

FIGURE 8.1(b). [Plot of $\tilde n^{(s)}(\chi)$ (in cm), vertical axis from −5×10⁻⁶ to 5×10⁻⁶, versus χ (in cm)
from −0.5 to 0.5.]

FIGURE 8.1(c). [Plot of $\tilde n^{(s)}(\chi)$ (in cm), vertical axis from −5×10⁻⁶ to 5×10⁻⁶, versus χ (in cm)
from χ = −D = −1.28 to χ = D = 1.28.]

The formula for NEdNsamp in Eqs. (8.22g) and (8.22h) predicts this dip. The phase angle ψ of
the three-pole, low-pass filter used in the interferometer simulation [this phase angle is
introduced in Eq. (8.12f) above] is to a very good approximation linear in wavenumber,

\[
\psi(\sigma) \cong -K\sigma + \psi_0 , \tag{8.26a}
\]

for a real $\psi_0$ and a real, positive constant K. Many types of low-pass filter have this sort of
approximately linear dependence of the transfer function’s phase. Equations (8.24f)–(8.24h) can
be applied to the formula for NEdNsamp in Eq. (8.22g) to get

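\[
\mathrm{NEdN}_{samp}(\sigma) = \frac{4\sqrt{J^{(s)}(\sigma)}}
{A\,\Delta\Omega\,\big|H(u\sigma)\big| \cdot \left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right]} .
\tag{8.26b}
\]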
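
The approximately linear phase assumed in Eq. (8.26a) can be checked for a three-pole Butterworth
filter. The Python sketch below is a rough illustration rather than the simulation actually used
here: it assumes the standard normalized third-order Butterworth response 1/[(s + 1)(s² + s + 1)]
with s = iσ/σ_c and σ_c = 1600 cm⁻¹ from Eq. (8.24j), it assumes a sign convention in which the
phase decreases with wavenumber, and it fits a straight line by least squares over the 650 to
1250 cm⁻¹ measurement band. The function and variable names are hypothetical.

```python
import numpy as np

sigma_c = 8000.0 / 5.0      # cutoff wavenumber from Eq. (8.24j), cm^-1


def butterworth3(sigma):
    """Assumed three-pole Butterworth low-pass response versus wavenumber."""
    s = 1j * sigma / sigma_c
    return 1.0 / ((s + 1.0) * (s * s + s + 1.0))


sigma = np.linspace(650.0, 1250.0, 601)          # measurement band, cm^-1
psi = np.unwrap(np.angle(butterworth3(sigma)))   # phase angle, radians

# Least-squares straight line psi ~ slope * sigma + psi0, so that K = -slope
slope, psi0 = np.polyfit(sigma, psi, 1)
K = -slope
max_residual = np.max(np.abs(psi - (slope * sigma + psi0)))

print(f"K    = {K:.5f} cm")
print(f"psi0 = {psi0:.3f} rad")
print(f"largest departure from the straight line = {max_residual:.3f} rad")
```

Under these assumptions the fitted K is positive and the phase stays within a few hundredths of a
radian of the straight line across the band.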

Equations (8.24c) and (8.24i) together with the previously used (8.24f)–(8.24h) can be substituted
into the formula for Z mnf in Eq. (8.7f) to get

\[
Z_{mnf}(\sigma) = \left( \frac{A\,\Delta\Omega}{4} \right) \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \cdot L_{mnf}(\sigma)
\tag{8.26c}
\]

in the formula used for J ( s ) [for example, Eq. (8.22h)]. Again we note, just as in the discussion
following Eq. (7.55f) in Chapter 7, that for this interferometer system the black-body spectrum is
smooth enough to neglect the nonrandom measurement errors due to the interferometer’s finite
field of view and finite interferogram length—that is, we do not need to worry about the
potentially different shapes of the L, LFOV, and Lmnf radiance functions. Hence the formula for
Z mnf can be written as

\[
Z_{mnf}(\sigma) \cong \left( \frac{A\,\Delta\Omega}{4} \right) \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \cdot L(\sigma) ,
\tag{8.26d}
\]

where in Eq. (8.26d) L is the spectral radiance curve for Planck radiation coming from a 400-K
black body.
The formula for J ( s ) can be simplified in the same way that the NEdNsamp and Z mnf formulas
were. Equations (8.24h) and (8.26d) can be substituted into (8.22h) to get

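\[
\begin{aligned}
J^{(s)}(\sigma) = \left( \frac{\pi A\,\Delta\Omega}{4} \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \right)^{2}
\int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| \; & e^{-i\psi(\sigma)}\,(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,L(\sigma-\sigma') \\
& - e^{i\psi(\sigma)}\,(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,L(\sigma+\sigma') \Big|^2 \, d\sigma' .
\end{aligned}
\]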

FIGURE 8.2(a). [Plot of radiance (in mW/m²/sr/cm⁻¹), vertical axis from 250 to 370, versus
σ (in cm⁻¹) from 600 to 1300, with the region near σ ≅ 1031 cm⁻¹ marked.]
This graph contains 10 simulated measurements of a 400 K black-body spectrum
contaminated by the sampling noise. The noise is increased by a factor of 20 over the
size specified by the noise-power spectrum in Fig. 8.1(a) to make it easier to see.

FIGURE 8.2(b). [Plot of radiance error (in mW/m²/sr/cm⁻¹), vertical axis from 0.0 to 0.30, versus
σ (in cm⁻¹) from 600 to 1300, with the dip near σ ≅ 1031 cm⁻¹ marked.]


Applying (8.12f) gives

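\[
\begin{aligned}
J^{(s)}(\sigma) = \left( \frac{\pi A\,\Delta\Omega}{4} \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \right)^{2}
\int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| \; & e^{i[-\psi(\sigma)+\psi(\sigma-\sigma')]}\,(\sigma-\sigma')\,\big|H\big(u(\sigma-\sigma')\big)\big|\,L(\sigma-\sigma') \\
& - e^{i[\psi(\sigma)-\psi(\sigma+\sigma')]}\,(\sigma+\sigma')\,\big|H\big(u(\sigma+\sigma')\big)\big|\,L(\sigma+\sigma') \Big|^2 \, d\sigma' ,
\end{aligned}
\]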

which becomes, after substituting from Eq. (8.26a),

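\[
\begin{aligned}
J^{(s)}(\sigma) &\cong \left( \frac{\pi A\,\Delta\Omega}{4} \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \right)^{2}
\int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| e^{i[K\sigma-\psi_0-K(\sigma-\sigma')+\psi_0]}\,(\sigma-\sigma')\,\big|H\big(u(\sigma-\sigma')\big)\big|\,L(\sigma-\sigma') \\
&\qquad\qquad - e^{i[-K\sigma+\psi_0+K(\sigma+\sigma')-\psi_0]}\,(\sigma+\sigma')\,\big|H\big(u(\sigma+\sigma')\big)\big|\,L(\sigma+\sigma') \Big|^2 \, d\sigma' \\
&= \left( \frac{\pi A\,\Delta\Omega}{4} \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \right)^{2}
\int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| e^{iK\sigma'}\,(\sigma-\sigma')\,\big|H\big(u(\sigma-\sigma')\big)\big|\,L(\sigma-\sigma') \\
&\qquad\qquad - e^{iK\sigma'}\,(\sigma+\sigma')\,\big|H\big(u(\sigma+\sigma')\big)\big|\,L(\sigma+\sigma') \Big|^2 \, d\sigma' .
\end{aligned}
\]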
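Since
\[
\big| C_1 e^{iK\sigma'} + C_2 e^{iK\sigma'} \big| = \big| e^{iK\sigma'} (C_1 + C_2) \big|
= \big| e^{iK\sigma'} \big| \cdot \big| C_1 + C_2 \big| = \big| C_1 + C_2 \big|
\]

for any two complex numbers $C_1$ and $C_2$, this formula for $J^{(s)}$ reduces to

\[
\begin{aligned}
J^{(s)}(\sigma) \cong \left( \frac{\pi A\,\Delta\Omega}{4} \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \right)^{2}
\int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| \; & (\sigma-\sigma')\,\big|H\big(u(\sigma-\sigma')\big)\big|\,L(\sigma-\sigma') \\
& - (\sigma+\sigma')\,\big|H\big(u(\sigma+\sigma')\big)\big|\,L(\sigma+\sigma') \Big|^2 \, d\sigma' .
\end{aligned}
\tag{8.27}
\]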

The black-body radiance L varies slowly with wavenumber σ, as does the magnitude $|H|$ of
the filter transfer function. We can define a new function

\[
g(\sigma) = \big|H(u\sigma)\big|\,L(\sigma) , \tag{8.28a}
\]

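which is also a slowly varying function of wavenumber σ. Now the integral in Eq. (8.27) can be
written as

\[
\begin{aligned}
& \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| (\sigma-\sigma')\,\big|H\big(u(\sigma-\sigma')\big)\big|\,L(\sigma-\sigma')
- (\sigma+\sigma')\,\big|H\big(u(\sigma+\sigma')\big)\big|\,L(\sigma+\sigma') \Big|^2 d\sigma' \\
&\qquad = \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| (\sigma-\sigma')\,g(\sigma-\sigma') - (\sigma+\sigma')\,g(\sigma+\sigma') \Big|^2 d\sigma' .
\end{aligned}
\]

According to Fig. 8.1(a), the power spectrum $p_{\tilde n\tilde n}^{(s)}(\sigma')$ is nonzero over only a relatively small
range of σ′ centered on σ′ = 0, so in effect the integral is only over a small region of the σ′ axis
near σ′ = 0, and we only need to know the value of $g(\sigma \pm \sigma')$ inside the integral for small values
of σ′. In this situation it makes sense to expand $g(\sigma \pm \sigma')$ as a Taylor series in σ′ to get

\[
\begin{aligned}
& \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| (\sigma-\sigma')\,\big|H\big(u(\sigma-\sigma')\big)\big|\,L(\sigma-\sigma')
- (\sigma+\sigma')\,\big|H\big(u(\sigma+\sigma')\big)\big|\,L(\sigma+\sigma') \Big|^2 d\sigma' \\
&\qquad \cong \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\left| (\sigma-\sigma') \left[ g(\sigma) - \sigma' \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma} \right]
- (\sigma+\sigma') \left[ g(\sigma) + \sigma' \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma} \right] \right|^2 d\sigma' \\
&\qquad = \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\left| \sigma g(\sigma) - \sigma\sigma' \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma}
- \sigma' g(\sigma) + \sigma'^2 \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma} \right. \\
&\qquad\qquad \left. - \left[ \sigma g(\sigma) + \sigma\sigma' \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma}
+ \sigma' g(\sigma) + \sigma'^2 \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma} \right] \right|^2 d\sigma' .
\end{aligned}
\]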

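This simplifies to

\[
\begin{aligned}
& \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| (\sigma-\sigma')\,\big|H\big(u(\sigma-\sigma')\big)\big|\,L(\sigma-\sigma')
- (\sigma+\sigma')\,\big|H\big(u(\sigma+\sigma')\big)\big|\,L(\sigma+\sigma') \Big|^2 d\sigma' \\
&\qquad \cong 4 \left| g(\sigma) + \sigma \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma} \right|^2
\int_{-\infty}^{\infty} \sigma'^2\, p_{\tilde n\tilde n}^{(s)}(\sigma')\, d\sigma' .
\end{aligned}
\tag{8.28b}
\]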
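
The step from the expanded integrand to Eq. (8.28b) is a cancellation that can be written out
explicitly. In the worked step below, $g$ and $g'$ are shorthand, introduced here only for brevity,
for $g(\sigma)$ and $dg/d\sigma$ evaluated at σ:

```latex
\begin{aligned}
(\sigma-\sigma')\,[\,g-\sigma' g'\,] - (\sigma+\sigma')\,[\,g+\sigma' g'\,]
  &= \big(\sigma g - \sigma\sigma' g' - \sigma' g + \sigma'^2 g'\big)
   - \big(\sigma g + \sigma\sigma' g' + \sigma' g + \sigma'^2 g'\big) \\
  &= -2\sigma'\,\big(g + \sigma g'\big), \\
\big|-2\sigma'\,(g+\sigma g')\big|^2
  &= 4\,\sigma'^2\,\big|g+\sigma g'\big|^2 .
\end{aligned}
```

Because the factor $|g + \sigma g'|^2$ does not depend on σ′, it comes outside the integral, leaving only
the second moment of the power spectrum inside.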

Equation (8.28b) can now be substituted into (8.27) to get

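\[
J^{(s)}(\sigma) \cong \left( \frac{\pi A\,\Delta\Omega}{2} \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \right)^{2}
\left| g(\sigma) + \sigma \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma} \right|^2
\int_{-\infty}^{\infty} \sigma'^2\, p_{\tilde n\tilde n}^{(s)}(\sigma')\, d\sigma' ,
\]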

which can in turn be put into (8.26b), giving

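\[
\mathrm{NEdN}_{samp}(\sigma) \cong
\frac{2\pi \cdot \left| g(\sigma) + \sigma \left.\dfrac{dg}{d\sigma}\right|_{\text{at }\sigma} \right|}{\big|H(u\sigma)\big|}
\cdot \left\{ \int_{-\infty}^{\infty} \sigma'^2\, p_{\tilde n\tilde n}^{(s)}(\sigma')\, d\sigma' \right\}^{1/2} .
\tag{8.28c}
\]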
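
How good the Taylor-series step is can be judged by comparing the integral in Eq. (8.27) with the
right-hand side of Eq. (8.28b) at a wavenumber well away from the dip. The Python sketch below
does this at σ = 800 cm⁻¹. It rests on assumptions not stated in this section: the filter magnitude
is taken to be the standard three-pole Butterworth form, the Planck radiance is used only up to a
constant factor (which cancels in the ratio), hc/k is the standard value 1.438777 cm·K, and all
function names are hypothetical.

```python
import math

C2 = 1.438777         # hc/k in cm*K (standard constant, not quoted in the text)
TEMP = 400.0          # black-body temperature, K
SIGMA_CUT = 1600.0    # filter cutoff wavenumber from Eq. (8.24j), cm^-1
P0 = 1.25e-13         # power-spectrum level from Eq. (8.25a), cm^3
BAND = (30.0, 40.0)   # |sigma'| range where Eq. (8.25b) is nonzero, cm^-1


def g(sigma):
    """g = |H| * L of Eq. (8.28a), with L known only up to a constant factor."""
    planck_shape = sigma ** 3 / math.expm1(C2 * sigma / TEMP)
    h_mag = 1.0 / math.sqrt(1.0 + (sigma / SIGMA_CUT) ** 6)  # assumed Butterworth
    return h_mag * planck_shape


def band_integral(func, n=2000):
    """Midpoint-rule integral of P0 * func(sigma') over both bands of Eq. (8.25b)."""
    lo, hi = BAND
    step = (hi - lo) / n
    total = 0.0
    for i in range(n):
        sp = lo + (i + 0.5) * step
        total += func(sp) + func(-sp)
    return P0 * step * total


def exact_integral(sigma):
    """Integral appearing in Eq. (8.27), evaluated without the Taylor expansion."""
    def integrand(sp):
        diff = (sigma - sp) * g(sigma - sp) - (sigma + sp) * g(sigma + sp)
        return diff ** 2
    return band_integral(integrand)


def taylor_integral(sigma, h=0.01):
    """Right-hand side of Eq. (8.28b), with dg/dsigma from a central difference."""
    dg = (g(sigma + h) - g(sigma - h)) / (2.0 * h)
    second_moment = band_integral(lambda sp: sp ** 2)
    return 4.0 * (g(sigma) + sigma * dg) ** 2 * second_moment


ratio = exact_integral(800.0) / taylor_integral(800.0)
print(f"exact / Taylor at 800 cm^-1: {ratio:.4f}")
```

A ratio close to one indicates that the first-order expansion is adequate there. Near the dip the
leading term vanishes, so higher-order terms left out of Eq. (8.28b) are no longer negligible by
comparison.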

Clearly, NEdNsamp is going to be very small when the absolute value of

\[
T(\sigma) = g(\sigma) + \sigma \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma} \tag{8.28d}
\]

is small; in fact for the approximation shown in (8.28c), the NEdNsamp value is zero when

\[
g(\sigma) = -\sigma \left.\frac{dg}{d\sigma}\right|_{\text{at }\sigma} , \tag{8.28e}
\]

because then $T(\sigma) = g(\sigma) + \sigma\,(dg/d\sigma)$ is zero in (8.28c). Formula (8.28a) defines function g
used to define $T(\sigma)$ in (8.28d). Figure 8.3 is a graph of $T(\sigma)$ versus σ for the $T(\sigma)$ function
specified by a 400-K black-body spectrum and the magnitude $|H|$ of the filter transfer function.
We see that $T(\sigma)$ is zero for
\[
\sigma \cong 1030.5\ \text{cm}^{-1} , \tag{8.28f}
\]
which explains the dip at
\[
\sigma \cong 1031\ \text{cm}^{-1}
\]

in Fig. 8.2(b) and the negligible sampling error near σ ≅ 1030.5 cm⁻¹ of all ten
noise-contaminated measurements in Fig. 8.2(a). We can expect this sort of behavior whenever we
examine the NEdNsamp curve for a sampling-noise-contaminated black-body spectrum.
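
The zero crossing of $T(\sigma)$ and the size of NEdNsamp predicted by Eq. (8.28c) can be estimated with
a few lines of Python. This is a sketch under assumptions that go beyond what this section states:
the Planck radiance is written in wavenumber form with the standard constants 2hc² = 1.191042×10⁻⁵
mW/(m² sr cm⁻⁴) and hc/k = 1.438777 cm·K, the filter magnitude is taken to be the standard
three-pole Butterworth form with a 1600 cm⁻¹ cutoff, the derivative is a central difference, and
the function names are hypothetical.

```python
import math

C1 = 1.191042e-5    # 2hc^2 in mW / (m^2 sr cm^-4); standard constant, not from the text
C2 = 1.438777       # hc/k in cm K; standard constant, not from the text
TEMP = 400.0        # black-body temperature, K
SIGMA_CUT = 1600.0  # filter cutoff wavenumber from Eq. (8.24j), cm^-1
# Second moment of Eq. (8.25b): level 1.25e-13 cm^3 on 30 <= |sigma'| <= 40 cm^-1
SECOND_MOMENT = 1.25e-13 * 2.0 * (40.0 ** 3 - 30.0 ** 3) / 3.0


def h_mag(sigma):
    """Assumed magnitude of a three-pole Butterworth low-pass filter."""
    return 1.0 / math.sqrt(1.0 + (sigma / SIGMA_CUT) ** 6)


def g(sigma):
    """g = |H| * L of Eq. (8.28a), in mW/m^2/sr/cm^-1."""
    return h_mag(sigma) * C1 * sigma ** 3 / math.expm1(C2 * sigma / TEMP)


def t_func(sigma, h=0.01):
    """T = g + sigma * dg/dsigma of Eq. (8.28d), using a central difference."""
    dg = (g(sigma + h) - g(sigma - h)) / (2.0 * h)
    return g(sigma) + sigma * dg


def nedn_samp(sigma):
    """Approximate sampling-noise NEdN of Eq. (8.28c), in mW/m^2/sr/cm^-1."""
    return 2.0 * math.pi * abs(t_func(sigma)) / h_mag(sigma) * math.sqrt(SECOND_MOMENT)


def find_zero(func, lo, hi, tol=1e-6):
    """Bisection search for a sign change of func between lo and hi."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


sigma_zero = find_zero(t_func, 650.0, 1250.0)
print(f"T(sigma) crosses zero near {sigma_zero:.1f} cm^-1")
print(f"T(600 cm^-1)    = {t_func(600.0):.1f} mW/m^2/sr/cm^-1")
print(f"NEdN(700 cm^-1) = {nedn_samp(700.0):.3f} mW/m^2/sr/cm^-1")
```

Under these assumptions the zero crossing falls close to the value quoted in Eq. (8.28f). Changing
TEMP shows how the dip moves with the temperature of the black body.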

FIGURE 8.3. [Plot of $T(\sigma)$ (in mW/m²/sr/cm⁻¹), vertical axis from −400 to 600, versus σ (in cm⁻¹)
from 600 to 1300, with the zero crossing marked at σ ≅ 1030.5 cm⁻¹.]
This is a plot of the $T(\sigma)$ curve showing where it crosses zero on the wavenumber axis.

8.9 Sampling Noise and an Isolated Lorentz Emission Line


Sampling-position noise, just like misalignment noise, can generate spurious ghost lines when
contaminating measurements of strong emission lines (this misalignment-noise effect is discussed
in Sec. 7.15 of Chapter 7). To see how it works, we take the same system discussed in Sec. 8.8,
contaminated by sampling noise obeying the same power spectrum shown in Fig. 8.1(a), and
change the spectral radiance entering the system into the single Lorentz emission line shown in
Fig. 7.6 in Chapter 7. Again the expression for NEdNsamp in Eq. (8.22g) reduces to the formula
shown in (8.26b),
\[
\mathrm{NEdN}_{samp}(\sigma) = \frac{4\sqrt{J^{(s)}(\sigma)}}
{A\,\Delta\Omega\,\big|H(u\sigma)\big| \cdot \left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right]} .
\tag{8.29a}
\]

The formula for Z mnf associated with Eq. (8.22h) is the same as it was before in (8.26c),

\[
Z_{mnf}(\sigma) = \left( \frac{A\,\Delta\Omega}{4} \right) \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \cdot L_{mnf}(\sigma) ,
\]

where Lmnf is the Lorentz emission line as measured by the interferometer. The effects of the
interferometer’s finite interferogram length and finite field of view are the same as when they are
analyzed in Sec. 7.15 of Chapter 7, so we can again ignore the slight differences in shape of the
L, LFOV, and Lmnf radiance functions [see discussion following Eq. (7.56c) in Chapter 7] to get

\[
Z_{mnf}(\sigma) \cong \left( \frac{A\,\Delta\Omega}{4} \right) \cdot
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] \cdot L(\sigma) ,
\tag{8.29b}
\]

where L is the spectral radiance of the Lorentz emission line entering the system, that is, the
spectral radiance in Fig. 7.6 in Chapter 7.
If we use formula (8.23b) instead of (8.22h) for the J ( s ) function in Eq. (8.29a), it will be
easier to understand how the sampling-position noise can generate ghost lines when the
interferometer measures the emission line. According to Eq. (8.24h), the M function is one, so
Eq. (8.23b) can be written as

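\[
J^{(s)}(\sigma) = \frac{1}{2}\,\mathrm{Re}\Big( e^{-2i\psi(\sigma)}\,T_1(\sigma) \Big)
+ 2\pi^2 \Big\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast
\Big[ \big| \sigma\,H(u\sigma)\,Z_{mnf}(\sigma) \big|^2 \Big] \Big\} .
\]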

Substitution of (8.29b) gives

\[
J^{(s)}(\sigma) = \frac{1}{2}\,\mathrm{Re}\Big( e^{-2i\psi(\sigma)}\,T_1(\sigma) \Big)
+ \frac{\pi^2 A^2\,\Delta\Omega^2}{8}
\left\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast
\left[ \left| \sigma\,H(u\sigma) \left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right] L(\sigma) \right|^2 \right] \right\} .
\tag{8.30a}
\]


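The formula for $T_1(\sigma)$ comes from substituting (8.24h) and (8.29b) into Eq. (8.23c) to get

\[
\begin{aligned}
T_1(\sigma) = -\frac{\pi^2 A^2\,\Delta\Omega^2}{4}
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right]^{2}
\int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
& \Big[ (\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,L(\sigma-\sigma') \Big] \cdot \\
& \Big[ (\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,L(\sigma+\sigma') \Big] \, d\sigma' .
\end{aligned}
\tag{8.30b}
\]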

The L radiance function is narrow enough (see Fig. 7.6 in Chapter 7), and the $[\sigma H(u\sigma)]$ varies
slowly enough, that we can make the approximation that

\[
\big[ \sigma\,H(u\sigma) \big]\,L(\sigma) \cong \big[ \sigma_e\,H(u\sigma_e) \big]\,L(\sigma) , \tag{8.30c}
\]

where $\sigma_e$ is the wavenumber of the emission line’s peak value (for the Lorentz emission line in
this simulation, $\sigma_e$ = 950 cm⁻¹). When σ is far from $\sigma_e$, function L in Eq. (8.30c) is essentially
zero, making the value assigned to $[\sigma H(u\sigma)]$ irrelevant; and, of course, when σ is near to $\sigma_e$,
we can approximate $[\sigma H(u\sigma)]$ by its value $[\sigma_e H(u\sigma_e)]$ at $\sigma_e$. In effect, L is treated as a sort of
delta function to which we have applied formula (2.68e) in Chapter 2. Equations (8.30a) and
(8.30b) can now be written as

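\[
J^{(s)}(\sigma) \cong \frac{1}{2}\,\mathrm{Re}\Big( e^{-2i\psi(\sigma)}\,T_1(\sigma) \Big)
+ \frac{\pi^2 A^2\,\Delta\Omega^2}{8}
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right]^{2}
\big| \sigma_e\,H(u\sigma_e) \big|^2
\Big\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast \big[ L(\sigma)^2 \big] \Big\} ,
\tag{8.30d}
\]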

with

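\[
T_1(\sigma) \cong -\frac{\pi^2 A^2\,\Delta\Omega^2}{4}
\left[ \left( 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right) \sigma_e\,H(u\sigma_e) \right]^{2}
\int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,L(\sigma-\sigma') \cdot L(\sigma+\sigma')\, d\sigma' .
\tag{8.30e}
\]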

The solid curve in Fig. 8.4(a) is the Lorentz emission line L centered over the graph of the
$p_{\tilde n\tilde n}^{(s)}(\sigma')$ function in Fig. 8.4(b), with $p_{\tilde n\tilde n}^{(s)}(\sigma')$ still having the same basic quasi-harmonic
shape shown in Fig. 8.1(a). The effective half-width of the Lorentz line is taken to be $\sigma_w$. The two
dashed curves in Fig. 8.4(a) show the L function displaced to either side of the original emission
line, with new peak values at $\sigma_e \pm \sigma_w$.

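FIGURE 8.4(a) [top] and FIGURE 8.4(b) [bottom]. [Top panel: the emission line L(σ) peaking at
$\sigma_e$ = 950 cm⁻¹, together with the displaced curves $L(\sigma - \sigma_w)$ and $L(\sigma + \sigma_w)$ peaking at
$\sigma_e \pm \sigma_w$. Bottom panel: the quasi-harmonic power spectrum, with the positions
$\sigma_e - \sigma_C$ = 920 cm⁻¹ and $\sigma_e + \sigma_C$ = 980 cm⁻¹ and the widths $\sigma_C$ and $\sigma_M$ marked.]
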

When these two dashed curves are closer together, having peaks at σ e ± σ ′ with σ ′ < σ w , then
those wavenumbers σ where the dashed curves have significant overlap show where the product

L( σ − σ ′ ) ⋅ L( σ + σ ′ )

is significantly different from zero. When these dashed curves are further apart, having peaks at
σ e ± σ ′ with σ ′ > σ w , then there is no significant overlap and the product

L( σ − σ ′ ) ⋅ L( σ + σ ′ )

is not significantly different from zero. Hence, the position of the dashed curves in Fig. 8.4(a)
shows where this product drops to zero; any further apart and the

L( σ − σ ′ ) ⋅ L( σ + σ ′ )

product cannot make any significant contribution to the integral in (8.30e). Notice, however, that
when
\[
-\sigma_w \le \sigma' \le \sigma_w
\]
so that the “double L” product can contribute, then the plot of $p_{\tilde n\tilde n}^{(s)}(\sigma')$ in Fig. 8.4(b) shows that
the value of $p_{\tilde n\tilde n}^{(s)}(\sigma')$ is itself zero. Hence the

\[
p_{\tilde n\tilde n}^{(s)}(\sigma')\,L(\sigma-\sigma') \cdot L(\sigma+\sigma')
\]

product is zero for all σ′ values for the configuration shown in Figs. 8.4(a) and 8.4(b), because
when the “double L” product is non-negligible then $p_{\tilde n\tilde n}^{(s)}$ is zero, and when $p_{\tilde n\tilde n}^{(s)}$ is nonzero then
the “double L” product is negligible. We conclude that the integral in (8.30e) is very small or
zero, which means that $T_1$ can be neglected in Eq. (8.30d). Hence, (8.30d) simplifies to

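\[
J^{(s)}(\sigma) \cong \frac{\pi^2 A^2\,\Delta\Omega^2}{8}
\left[ 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right]^{2}
\big| \sigma_e\,H(u\sigma_e) \big|^2
\Big\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast \big[ L(\sigma)^2 \big] \Big\} .
\tag{8.30f}
\]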

Consequently, the NEdNsamp formula for this sort of measurement can be written as, substituting
(8.30f) into (8.29a),

\[
\mathrm{NEdN}_{samp}(\sigma) \cong \sqrt{2}\,\pi\,
\frac{\big| \sigma_e\,H(u\sigma_e) \big|}{\big| H(u\sigma) \big|}
\Big\{ p_{\tilde n\tilde n}^{(s)}(\sigma) \ast \big[ L(\sigma)^2 \big] \Big\}^{1/2} .
\tag{8.30g}
\]

This approximate formula for the sampling-noise NEdN can be used whenever the $p_{\tilde n\tilde n}^{(s)}$ power
spectrum “straddles” a strong emission line the way it does in Figs. 8.4(a) and 8.4(b).
The ten dotted lines in Fig. 8.5(a) plot ten spectral measurements of the Lorentz emission line
using the simulated interferometer contaminated by this quasi-harmonic sampling-position noise,
and the two split solid lines show the true spectral values. The continuous solid line in Fig. 8.5(a)
is the NEdNsamp curve specified by the formulas in (8.29a), (8.30d), and (8.30e). The formula in
(8.30g) shows that NEdNsamp is approximately proportional to the square root of the convolution
of the squared emission-line radiance L with the quasi-harmonic power spectrum $p_{\tilde n\tilde n}^{(s)}$ in Eq.
(8.25b). According to the discussion at the end of Sec. 7.15 of Chapter 7, a similar convolution in
the NEdN formula for the misalignment noise is also associated with ghost lines on either side of
the Lorentz emission line, as can be seen by comparing Fig. 8.5(a) to Fig. 7.8(a) in Chapter 7.
The resemblance is also present in Fig. 8.5(b), which gives an expanded view of the ghost-line
region on the right-hand side of the emission line. Just like in Fig. 7.8(a) for the misalignment
noise, the convolution predicts the presence of ghost lines on either side of the emission line, with
the center of the ghost-line region offset by wavenumber intervals of

\[
\pm \left( \sigma_C + \frac{\sigma_M}{2} \right)
\]

from the wavenumber marking the peak of the emission line. Unlike the quasi-harmonic
noise-power spectrum in Chapter 7, the noise-power spectrum used here has $\sigma_C$ = 30 cm⁻¹ and
$\sigma_M$ = 10 cm⁻¹ so that
\[
\sigma_C + \frac{\sigma_M}{2} = 35\ \text{cm}^{-1} . \tag{8.31}
\]

This agrees with the ghost-line offsets seen in Figs. 8.5(a) and 8.5(b).
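
The ghost-line offsets can be illustrated by evaluating the convolution in Eq. (8.30g) numerically.
The Python sketch below is a rough illustration only: the Lorentz line has unit peak height and a
hypothetical half-width of 1 cm⁻¹ (the actual line of Fig. 7.6 is not specified in this section), the
slowly varying filter factor is dropped, the convolution is done by direct summation with NumPy,
and the variable names are illustrative.

```python
import numpy as np

SIGMA_E = 950.0        # emission-line peak, cm^-1 (from the text)
HALF_WIDTH = 1.0       # hypothetical Lorentz half-width, cm^-1 (not from the text)
P0 = 1.25e-13          # power-spectrum level, cm^3
SIGMA_C, SIGMA_M = 30.0, 10.0   # band start and band width, cm^-1


def lorentz(sigma):
    """Unit-peak Lorentz line centered on SIGMA_E (peak height is arbitrary here)."""
    return HALF_WIDTH ** 2 / ((sigma - SIGMA_E) ** 2 + HALF_WIDTH ** 2)


def p_nn(sp):
    """Quasi-harmonic power spectrum of Eq. (8.25b)."""
    inside = (np.abs(sp) >= SIGMA_C) & (np.abs(sp) <= SIGMA_C + SIGMA_M)
    return np.where(inside, P0, 0.0)


step = 0.1
sigma = np.arange(850.0, 1050.0 + step / 2, step)   # output wavenumbers, cm^-1
sp = np.arange(-60.0, 60.0 + step / 2, step)        # integration variable sigma'

# Convolution p * [L^2], evaluated by direct summation over sigma'
squared_line = lorentz(sigma[:, None] - sp[None, :]) ** 2
conv = (squared_line * p_nn(sp)[None, :]).sum(axis=1) * step
nedn_shape = np.sqrt(conv)    # NEdN of Eq. (8.30g), up to a slowly varying factor

# Centers of the ghost-line regions on each side of the emission line
right = sigma > SIGMA_E
left = sigma < SIGMA_E
center_right = np.sum(sigma[right] * conv[right]) / np.sum(conv[right])
center_left = np.sum(sigma[left] * conv[left]) / np.sum(conv[left])

print(f"ghost-line region centers: {center_left:.1f} and {center_right:.1f} cm^-1")
print(f"offsets from the line peak: {center_left - SIGMA_E:.1f}, {center_right - SIGMA_E:.1f} cm^-1")
```

The two regions sit symmetrically about the emission line, and the convolution is very small at the
line peak itself, which is the straddling configuration of Figs. 8.4(a) and 8.4(b).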
Figure 8.6 compares the standard deviations of the errors in the measured radiances to the
NEdNsamp values predicted by the formulas in (8.29a), (8.30d) and (8.30e). It follows the same
format as Fig. 8.2(b), and once again we see a good match between the calculated standard
deviations represented by the crosses and the NEdNsamp predictions represented by the solid line.
The only difference between the procedure used to generate Fig. 8.2(b) and the procedure used to
generate Fig. 8.6 is that the standard deviations in Fig. 8.6 are calculated from 900, instead of from
300, noise-contaminated interferometer measurements.

FIGURE 8.5(a). [Plot of radiance (in mW/m²/sr/cm⁻¹), vertical axis from −1.0 to 2.0, versus
σ (in cm⁻¹) from 850 to 1050, with the noise-free spectrum and the NEdNsamp curve labeled.]

FIGURE 8.5(b). [Plot of radiance (in mW/m²/sr/cm⁻¹), vertical axis from −1.0 to 1.0, versus
σ (in cm⁻¹) from 950 to 1050, with the noise-free spectrum and the NEdNsamp curve labeled.]

FIGURE 8.6. [Plot of radiance error (in mW/m²/sr/cm⁻¹), vertical axis from −0.10 to 0.30, versus
σ (in cm⁻¹) from 800 to 1100.]


Section 7.15 of Chapter 7 included background radiance in the simulated measurements of a
Lorentz emission line contaminated by misalignment noise, and nothing stops us from doing the
same thing here with the sampling noise. Following the same strategy as before [see discussion
following Eq. (7.56a) in Chapter 7], we now change the fore-optics transmission to

\[
\tau_f(\sigma) = 0.5 \tag{8.32a}
\]

rather than one [the value it has had up to now is one, see Eq. (8.24g)] so that the fore-optics
background radiance $L_{FOV}^{(fore)}$ is no longer insignificant as it was in Eq. (8.24c). For this sort of
setup, a first-order estimate for the effective fore-optics emissivity is

\[
1 - \tau_f(\sigma) = 0.5 ,
\]

and the effective temperature of the background radiance is taken to be 350 K. The measured
emission line is the same one used before, having the spectral radiance shown in Fig. 7.6 of
Chapter 7, and the sampling-position noise is the same as in Figs. 8.5(a) and 8.5(b); that is, it is
the noise specified by the power spectrum in Fig. 8.1(a). Because significant amounts of
background radiance are present, Eq. (8.30g) is no longer a good approximation for NEdNsamp; we
must instead return to Eqs. (8.22g) and (8.22h), remembering to allow for $L_{FOV}^{(fore)}$ no longer being
zero and $\tau_f$ being 0.5. Since we still have

\[
\mathcal{R}(\sigma) = 1\ \frac{\text{amp}\cdot\text{sec}}{\text{erg}} , \quad
\tau_a(\sigma) = \eta(\sigma) = 1 , \quad M(R\sigma\theta_{ma}) = 1.0 , \quad W = 1 ,
\]
and
\[
L^{(dir)}(\sigma) = L_{FOV}^{(back)}(\sigma) = L_{mnf}^{(back)}(\sigma) = 0
\]

from Eqs. (8.24f-i) and (8.24c), Eq. (8.22g) now simplifies to

\[
\mathrm{NEdN}_{samp}(\sigma) = \frac{8\sqrt{J^{(s)}(\sigma)}}
{A\,\Delta\Omega\,\big|H(u\sigma)\big| \cdot \left( 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right)} ;
\tag{8.32b}
\]
and we have, from Eq. (8.7f), that

\[
Z_{mnf}(\sigma) = \frac{A\,\Delta\Omega}{4}
\left( 1\,\frac{\text{amp}\cdot\text{sec}}{\text{erg}} \right)
\left[ \frac{1}{2}\,L_{mnf}(\sigma) + L_{mnf}^{(fore)}(\sigma) \right]
\tag{8.32c}
\]


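in the simplified formula from (8.22h) used for the $J^{(s)}$ calculation:

\[
\begin{aligned}
J^{(s)}(\sigma) = \pi^2 \int_{-\infty}^{\infty} p_{\tilde n\tilde n}^{(s)}(\sigma')\,
\Big| \; & e^{-i\psi(\sigma)}\,(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,Z_{mnf}(\sigma-\sigma') \\
& - e^{i\psi(\sigma)}\,(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,Z_{mnf}(\sigma+\sigma') \Big|^2 \, d\sigma' .
\end{aligned}
\tag{8.32d}
\]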

Again, according to the discussion after Eq. (7.56b) in Chapter 7, we can neglect the difference
between the L, LFOV, and Lmnf spectral radiance functions; for the same reasons, we can also
neglect the difference between the $L^{(fore)}$, $L_{FOV}^{(fore)}$, and $L_{mnf}^{(fore)}$ background radiance spectra.
The dotted lines in Figs. 8.7(a) and 8.7(b) show ten spectral measurements of the Lorentz
emission line contaminated by sampling noise when the 350-K background radiance is present,
and Fig. 8.7(c) is a close-up of the right-hand side of the same set of curves. The continuous solid
lines in Figs. 8.7(a)–8.7(c) show the NEdNsamp values predicted by Eqs. (8.32b)–(8.32d), and the
split solid lines give the true L(σ ) spectral radiance. Comparing Figs. 8.7(a) and 8.7(c) to the
plots in Figs. 8.5(a) and 8.5(b) without the background radiance, we see that the background
radiance prevents the measurement error from dropping to zero outside the regions where the
ghost lines occur. Figure 8.7(b), which is a somewhat expanded version of Fig. 8.7(a), makes it
easy to see that when the presence of the ghost lines is disregarded, the NEdN from the Planck
black-body radiance drops to zero near σ ≅ 940 cm⁻¹. This is the same sort of behavior seen
before in Figs. 8.2(a) and 8.2(b), with the dip now occurring at a smaller wavenumber (940 cm⁻¹
instead of 1030 cm⁻¹) because the background radiance curve is at a lower temperature, 350 K,
instead of the 400 K of Figs. 8.2(a) and 8.2(b). This dip can be seen even more plainly in Fig. 8.8.
Just like in Fig. 8.6, the crosses plot standard deviations of the radiance errors, calculated from
900 measurements contaminated by the power spectrum in Fig. 8.1(a). There is again a good
match between the standard deviations and the solid curve showing the NEdNsamp values
predicted by Eqs. (8.32b)–(8.32d), and again the crosses do not go down as far as the NEdN
curve in the region of the dip.

FIGURE 8.7(a). [Plot of radiance (in mW/m²/sr/cm⁻¹), vertical axis from −1.0 to 2.0, versus
σ (in cm⁻¹) from 850 to 1050, with the noise-free spectrum and the NEdNsamp curve labeled.]

FIGURE 8.7(b). [Plot of radiance (in mW/m²/sr/cm⁻¹), vertical axis from −0.4 to 1.0, versus
σ (in cm⁻¹) from 850 to 1050, with the noise-free spectrum and the NEdNsamp curve labeled.]

FIGURE 8.7(c). [Plot of radiance (in mW/m²/sr/cm⁻¹), vertical axis from −1.0 to 1.0, versus
σ (in cm⁻¹) from 950 to 1050, with the noise-free spectrum and the NEdNsamp curve labeled.]

FIGURE 8.8. [Plot of radiance error (in mW/m²/sr/cm⁻¹), vertical axis from 0 to 0.30, versus
σ (in cm⁻¹) from 800 to 1100.]

8.10 Error from Quasi-Static Sampling Noise

When the power spectrum $p_{\tilde n\tilde n}^{(s)}(\sigma)$ is proportional to a delta function, so that

\[
p_{\tilde n\tilde n}^{(s)}(\sigma) = o_0\,\delta(\sigma) \tag{8.33a}
\]

for some positive and constant $o_0$ value, the formula for $J^{(s)}$ in Eq. (8.22h) reduces to

\[
J^{(s)}(\sigma) = \pi^2 o_0\,
\Big| e^{-i\psi(\sigma)}\,\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma)
- e^{i\psi(\sigma)}\,\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \Big|^2 .
\tag{8.33b}
\]

Substituting from (8.12f), we find that

\[
J^{(s)}(\sigma) = \pi^2 o_0\,
\Big| \sigma\,\big|H(u\sigma)\big|\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma)
- \sigma\,\big|H(u\sigma)\big|\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \Big|^2 = 0 ,
\tag{8.33c}
\]

which shows that, according to Eq. (8.22g),

\[
\mathrm{NEdN}_{samp} = 0 \tag{8.33d}
\]

when the nonzero noise-power spectrum for the sampling-position noise is a delta function.
This odd result is an artifact of the approximations made in deriving the NEdNsamp formulas,
and we can show this by getting the same result using another line of reasoning. Equation (8.33a)
specifies a noise-power spectrum concentrated at σ = 0. Substituting Eq. (8.33a) into (8.4b)
gives

\[
o_{\tilde n\tilde n}^{(s)}(\chi) = o_0 .
\]

From the definition of the autocorrelation function in Eq. (8.3b), we see that an autocorrelation
function can
function canhave
havethe
thesame
samenonzero
nonzeroo0ovalue
0 value at all OPD
at all values
OPD Ȥ only
values Ȥ when
whenthe
therandom
random sampling
sampling
(s)
error n is the same at all OPD values Ȥ. We interpret this to mean that all the samples of the
interferogram signal are shifted by the same random value from their expected positions during a
single sweep of the interferometer’s moving mirror. Later, after many new spectral measurements
and many more sweeps of the moving mirror, we find that the shift in the sample positions has
changed to another random value. We can think of this as quasi-static sampling noise; although
effectively constant during each sweep, the sampling shift can gradually change over many
sweeps to a new random value. Suppose r is the random shift in the OPD value Ȥ for every
sample of a spectral measurement’s interferogram, which means the random function n ( s ) (  )
defined in Sec. 8.2 above is now
n ( s ) (  ) r . (8.34a)

If we take a very large number of spectral measurements, there is no way to tell ahead of time what $r$ will be for any particular sweep of the moving mirror, but whatever $r$ happens to be at the beginning of the sweep, it has the same value at the end of the sweep. This is why it makes sense to use (8.34a) to specify the sampling noise $n^{(s)}$ as a stationary but nonergodic function [function $n^{(s)}$ is identical to the stationary but nonergodic random function discussed following Eq. (3.47a) in Chapter 3]. Substituting (8.34a) into (8.8e) gives, using the linearity of the Fourier transform from Sec. 2.6 of Chapter 2,

$$ n_D^{(s)}(\sigma) = F^{(-i\sigma\chi)}\big( r\,\Pi(\chi, D) \big) = r\,F^{(-i\sigma\chi)}\big( \Pi(\chi, D) \big) , $$

which becomes, using formula (8.7b),

$$ n_D^{(s)}(\sigma) = 2rD\,\mathrm{sinc}(2\pi\sigma D) . \tag{8.34b} $$
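The closed form in (8.34b) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the truncation window to be a rectangle covering $-D \le \chi \le D$, uses the kernel $e^{-2\pi i\sigma\chi}$ for the forward transform, and takes sinc(x) = sin(x)/x. The function names and the sample values of `r`, `D`, and `sigma` are hypothetical.

```python
import numpy as np

def windowed_constant_spectrum(r, D, sigma, n=200001):
    """Trapezoid-rule estimate of the integral of r*exp(-2*pi*i*sigma*chi)
    over the assumed rectangular window -D <= chi <= D."""
    chi = np.linspace(-D, D, n)
    f = r * np.exp(-2j * np.pi * sigma * chi)
    dchi = chi[1] - chi[0]
    return (f.sum() - 0.5 * (f[0] + f[-1])) * dchi

def closed_form(r, D, sigma):
    """2*r*D*sinc(2*pi*sigma*D) with sinc(x) = sin(x)/x.
    NumPy's sinc is sin(pi*x)/(pi*x), hence the argument 2*sigma*D."""
    return 2.0 * r * D * np.sinc(2.0 * sigma * D)

numeric = windowed_constant_spectrum(r=1e-6, D=1.0, sigma=3.3)
print(numeric.real, closed_form(1e-6, 1.0, 3.3))
```

The imaginary part of the numerical integral vanishes to rounding error because the integrand's imaginary part is odd in $\chi$.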


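The error $\delta L$ in Eq. (8.12k) is now

$$ \delta L \cong \frac{4r\,\mathrm{Re}\Big( e^{-i\psi(\sigma)} \big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * [2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)] \big\} \Big)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma)} . $$

Just like before [see discussion following Eq. (8.18a)], we note that the product

$$ 2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma}) $$

varies slowly with wavenumber $\sigma$ compared to the sinc function, setting up the approximation

$$ \delta L \cong \frac{4r\,\mathrm{Re}\Big( 2\pi i\sigma\,e^{-i\psi(\sigma)} H(u\sigma)\,M(R\sigma\theta_{ma}) \big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * Z_{FOV}(\sigma) \big\} \Big)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma)} $$

based on Eq. (5C.1) in Appendix 5C of Chapter 5. The formula for $H$ in Eq. (8.12f) shows that

$$ \delta L \cong \frac{4r\,\mathrm{Re}\Big( 2\pi i\sigma\,|H(u\sigma)|\,M(R\sigma\theta_{ma}) \big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * Z_{FOV}(\sigma) \big\} \Big)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma)} . \tag{8.34c} $$

Functions $M$, $|H|$, and $Z_{FOV}$ inside Re( ) on the right-hand side are strictly real. Formula (8.34c) is based on the standard approximations used in this chapter; nothing extra has been added. Consequently, when we rely on these approximations, the error $\delta L$ ends up proportional to the real part of a strictly imaginary quantity; that is, it ends up being zero. Hence, we have confirmed that the standard approximations used so far in this chapter end up predicting zero sampling noise when the sampling-position noise is quasi-static with a delta function for its power spectrum.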
The best way to interpret the results in (8.33d) and (8.34c) is to regard them as predicting that for this case the sampling error in the radiance measurement is going to be small instead of completely nonexistent. There is already a strong hint in Sec. 8.8 that there are times when these approximations break down: we remember that in Fig. 8.2(b) the exact sampling error marked by the crosses does not follow the solid curve all the way down to zero at $\sigma = 1030.5$ cm$^{-1}$. The approximation used when taking the slowly varying $H$ and $M$ functions outside the convolution with $[2D\,\mathrm{sinc}(2\pi\sigma D)]$ is actually rather good. These functions are also under our control when designing the instrument; they can be made effectively constant over the band of wavenumbers being measured, turning the approximation used to remove them from the convolution into an exact equality. Consequently, if a more accurate formula for NEdNsamp is desired, it is better to rethink the approximation specified in Eq. (8.2e) above. When the error from the linear approximation

$$ z_C^{(tot)}\big( \chi_j + n^{(s)}(\chi_j) \big) \cong z_C^{(tot)}(\chi_j) + n^{(s)}(\chi_j) \left. \frac{dz_C^{(tot)}}{d\chi} \right|_{\chi = \chi_j} $$

disappears, it is clearly time to consider what happens when the quadratic approximation is used:

$$ z_C^{(tot)}\big( \chi_j + n^{(s)}(\chi_j) \big) \cong z_C^{(tot)}(\chi_j) + n^{(s)}(\chi_j) \left. \frac{dz_C^{(tot)}}{d\chi} \right|_{\chi = \chi_j} + \frac{1}{2} \big[ n^{(s)}(\chi_j) \big]^2 \left. \frac{d^2 z_C^{(tot)}}{d\chi^2} \right|_{\chi = \chi_j} . \tag{8.35} $$

Including the effect of the third term on the right-hand side of (8.35), the quadratic term in the Taylor series for $z_C^{(tot)}\big( \chi_j + n^{(s)}(\chi_j) \big)$, would stop the solid curve in Fig. 8.2(b) from dipping down so close to zero, and would also prevent the noise formulas from producing a strictly zero value for NEdNsamp when the sampling-position noise is quasi-static and obeys the delta-function power spectrum in (8.33a).
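A toy calculation shows why the quadratic term matters exactly where the linear term vanishes. The sketch below assumes a hypothetical single-line interferogram $z(\chi) = \cos(2\pi\sigma_0\chi)$, which is not the instrument signal of this chapter, and evaluates the sampling error at a point where the first derivative is zero. The function name and numerical values are illustrative only.

```python
import math

def shifted_sample_errors(sigma0, chi, r):
    """Exact, linear, and quadratic estimates of z(chi + r) - z(chi)
    for the assumed toy signal z(chi) = cos(2*pi*sigma0*chi)."""
    k = 2.0 * math.pi * sigma0
    dz = -k * math.sin(k * chi)          # first derivative at chi
    d2z = -k * k * math.cos(k * chi)     # second derivative at chi
    exact = math.cos(k * (chi + r)) - math.cos(k * chi)
    linear = r * dz
    quadratic = r * dz + 0.5 * r * r * d2z
    return exact, linear, quadratic

# At chi = 0 the slope is zero, so the linear estimate predicts no error at all.
exact, linear, quadratic = shifted_sample_errors(sigma0=1000.0, chi=0.0, r=1e-6)
print(exact, linear, quadratic)
```

In this example the linear estimate is zero while the quadratic estimate tracks the exact error closely, which mirrors the behavior described for the solid curve in Fig. 8.2(b).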
Retaining both the linear and quadratic terms in (8.35) is, according to Eq. (8.34a), the same as retaining $O(r)$ and $O(r^2)$ terms everywhere they occur in the noise equations. Postponing for a while the expansion of the signal error in powers of $r$, we use the exact formula for the noise-contaminated signal $z_{CN}^{(tot)}(\chi)$, writing that

$$ z_{CN}^{(tot)}(\chi) = z_C^{(tot)}(\chi + r) \tag{8.36a} $$

rather than using the approximation in Eq. (8.2g) above. Our strategy is to repeat the same procedure used before to derive NEdNsamp, taking advantage of the way the sampling error is now a random constant $r$ instead of a random function $n^{(s)}$. Having already set up Eq. (8.36a) to replace (8.2g) at the end of Sec. 8.2, we skip past the next section (because there is no reason to repeat the explanation of the sampling-noise autocorrelation function and power spectrum) and move on to Sec. 8.4. The formula corresponding to Eq. (8.6b) is

$$ \Pi(\chi, D)\,z_{CN}^{(tot)}(\chi) = \Pi(\chi, D)\,z_C^{(tot)}(\chi + r) , \tag{8.36b} $$

which means that instead of Eq. (8.6c) we have

$$ \tilde{Z}_{eff,totN}(\sigma) = F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\,z_C^{(tot)}(\chi + r) \big) $$


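representing the uncalibrated spectral signal contaminated by sampling noise. Applying Eq. (2.39j) of Chapter 2 (the Fourier convolution theorem) gives

$$ \tilde{Z}_{eff,totN}(\sigma) = F^{(-i\sigma\chi)}\big( \Pi(\chi, D) \big) * F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi + r) \big) , $$

which becomes, substituting from Eq. (8.7b),

$$ \tilde{Z}_{eff,totN}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi + r) \big) . $$

The Fourier shift theorem [see Eq. (2.36h) in Chapter 2] gives

$$ \tilde{Z}_{eff,totN}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * \Big[ e^{2\pi i\sigma r} F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi) \big) \Big] , $$

which can be written as, since the small value of $r$ makes $e^{2\pi i\sigma r}$ a slowly varying function of $\sigma$ compared to the sinc function [see Eq. (5C.1) in Appendix 5C of Chapter 5],

$$ \tilde{Z}_{eff,totN}(\sigma) \cong e^{2\pi i\sigma r} \Big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi) \big) \Big\} . $$

Using Eq. (8.7b) to replace $[2D\,\mathrm{sinc}(2\pi\sigma D)]$ by $F^{(-i\sigma\chi)}\big( \Pi(\chi, D) \big)$, we get

$$ \tilde{Z}_{eff,totN}(\sigma) \cong e^{2\pi i\sigma r} \Big\{ F^{(-i\sigma\chi)}\big( \Pi(\chi, D) \big) * F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi) \big) \Big\} , $$

which can be written as, according to Eq. (2.39j) in Chapter 2,

$$ \tilde{Z}_{eff,totN}(\sigma) \cong e^{2\pi i\sigma r} F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\,z_C^{(tot)}(\chi) \big) . $$

This becomes, applying the formula in (8.7e) above,

$$ \tilde{Z}_{eff,totN}(\sigma) \cong e^{2\pi i\sigma r} H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) . \tag{8.36c} $$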
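The shift-theorem step behind (8.36c) can be illustrated numerically. The sketch below is a toy under stated assumptions: the interferogram is a hypothetical Gaussian-damped cosine (so that a simple sum approximates the Fourier integral well), the forward kernel is taken to be $e^{-2\pi i\sigma\chi}$, and all names and parameter values are made up for the illustration.

```python
import numpy as np

def toy_signal(chi, sigma0=50.0, width=0.1):
    """Assumed test interferogram: Gaussian envelope times a cosine."""
    return np.exp(-chi**2 / (2.0 * width**2)) * np.cos(2.0 * np.pi * sigma0 * chi)

def fourier_at(values, chi, sigma):
    """Riemann-sum estimate of the integral of values*exp(-2*pi*i*sigma*chi)."""
    dchi = chi[1] - chi[0]
    return np.sum(values * np.exp(-2j * np.pi * sigma * chi)) * dchi

def shift_theorem_check(r=1e-3, sigma=48.0):
    chi = np.linspace(-1.0, 1.0, 40001)
    shifted = fourier_at(toy_signal(chi + r), chi, sigma)   # transform of z(chi + r)
    unshifted = fourier_at(toy_signal(chi), chi, sigma)     # transform of z(chi)
    return shifted, np.exp(2j * np.pi * sigma * r) * unshifted

lhs, rhs = shift_theorem_check()
print(abs(lhs - rhs), abs(lhs), abs(rhs))
```

Both sides agree to rounding error, and their magnitudes are identical because the phase factor has unit modulus.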

The alert reader will notice that the error in Eq. (8.36c) can now be entirely eliminated by taking
the magnitude of the complex spectral signal contaminated by this particular type of sampling
noise:

$$ \big| \tilde{Z}_{eff,totN}(\sigma) \big| \cong \big| e^{2\pi i\sigma r} H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big| = \big| H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big| = |H(u\sigma)|\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) . \tag{8.36d} $$

Here, the last step acknowledges that only $H(u\sigma)$ is a complex-valued function of $\sigma$.
Unfortunately, leaving aside this special case, in general taking the magnitude of the complex spectral signal increases the amount of noise present. When, for example, the signal is contaminated by detector noise, taking the magnitude of the complex spectral signal puts both the avoidable and unavoidable detector-noise components into the spectral measurement.110 Consequently the signal-processing algorithms of Fourier-transform spectrometers usually avoid taking the magnitude of the complex, noise-contaminated spectral signal and instead use calibration algorithms like the one described in Sec. 5.19 of Chapter 5 (we have, in fact, already applied this algorithm to standard sampling noise in Sec. 8.5 above). Although we know that our analysis here is for the special case of sampling-position noise characterized by a delta-function power spectrum, a real spectroscopist cannot know this ahead of time and so would process his Fourier-transform data as though other types of noise (for example, detector noise) dominate his noise budget. Hence we should now investigate what happens to sampling-position noise characterized by a delta-function power spectrum when it is processed this way, that is, processed as though it is detector noise. The first step, then, is to approximate $e^{2\pi i\sigma r}$ in such a way as to convert it into an additive noise.
We decide to take advantage of the smallness of $r$, expanding $e^{2\pi i\sigma r}$ into a power series while remembering to retain, as promised in the discussion immediately preceding Eq. (8.36a) above, both the $O(r)$ and $O(r^2)$ terms,

$$ e^{2\pi i\sigma r} = \cos(2\pi\sigma r) + i \sin(2\pi\sigma r) \cong 1 - \frac{1}{2}(2\pi\sigma r)^2 + i(2\pi\sigma r) = 1 + 2\pi i\sigma r - 2\pi^2\sigma^2 r^2 . \tag{8.37a} $$
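As a quick sanity check on (8.37a), the truncated series can be compared with the exact exponential. The sketch below uses hypothetical values for the wavenumber and the sampling shift; the leftover error should be of order $(2\pi\sigma r)^3/6$.

```python
import cmath
import math

def phase_factor_quadratic(sigma, r):
    """Second-order expansion 1 + 2*pi*i*sigma*r - 2*pi^2*sigma^2*r^2."""
    x = 2.0 * math.pi * sigma * r
    return 1.0 + 1j * x - 0.5 * x * x

sigma, r = 1000.0, 1e-6            # illustrative values: cm^-1 and cm
x = 2.0 * math.pi * sigma * r
exact = cmath.exp(1j * x)
approx = phase_factor_quadratic(sigma, r)
print(abs(exact - approx), x**3 / 6.0)
```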
When put back into (8.36c), this gives


$$ \tilde{Z}_{eff,totN}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) + (2\pi i\sigma r)\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) - (2\pi^2\sigma^2 r^2)\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \tag{8.37b} $$

for the uncalibrated spectral signal contaminated by delta-function sampling-position noise.

110 The discussion following Eq. (6.35d) in Chapter 6 explains the difference between the avoidable and unavoidable detector noise in a spectral measurement.

We expect that the noise in any spectral signal can be removed by averaging together many independent measurements of the same spectrum (that is, by taking its expectation value), so we apply the expectation operator E to both sides of (8.37b) to get

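$$ E\big( \tilde{Z}_{eff,totN}(\sigma) \big) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) + 2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \cdot E(r) - 2\pi^2\sigma^2 H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \cdot E(r^2) . $$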

Here, we have once again applied Eqs. (3.9f) and (3.16a) from Chapter 3 to simplify the formula
by distributing operator E over the expression for the uncalibrated spectral signal. Substitution of
(8.34a) into (8.3a) shows that the random parameter r is zero-mean,

E ( r ) = 0 . (8.37c)

This also makes good intuitive sense because we expect the sampling offset r to be equally
likely to take on a positive or a negative value for any given sweep of the interferometer’s
moving mirror. Hence, the expectation value of the uncalibrated spectral signal can be written as


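$$ E\big( \tilde{Z}_{eff,totN}(\sigma) \big) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) - 2\pi^2\sigma^2 r_{rms}^2\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) $$

or

$$ E\big( \tilde{Z}_{eff,totN}(\sigma) \big) \cong (1 - 2\pi^2\sigma^2 r_{rms}^2)\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) , \tag{8.37d} $$

where we define

$$ r_{rms} = \sqrt{E(r^2)} . \tag{8.37e} $$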
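The averaging step that leads to (8.37d) can be mimicked with a small Monte Carlo experiment. The sketch below assumes, purely for illustration, that $r$ is drawn from a zero-mean normal distribution; the seed, sample count, and parameter values are arbitrary choices, and the function name is hypothetical.

```python
import numpy as np

def mean_phase_factor(sigma, r_rms, n=200_000, seed=0):
    """Sample mean of exp(2*pi*i*sigma*r) over n sweeps, with r assumed
    to be zero-mean normal with standard deviation r_rms."""
    rng = np.random.default_rng(seed)
    r = rng.normal(0.0, r_rms, n)
    return np.mean(np.exp(2j * np.pi * sigma * r))

sigma, r_rms = 1000.0, 1e-5        # illustrative values: cm^-1 and cm
estimate = mean_phase_factor(sigma, r_rms)
predicted = 1.0 - 2.0 * np.pi**2 * sigma**2 * r_rms**2
print(estimate.real, predicted, estimate.imag)
```

The real part of the sample mean settles near $1 - 2\pi^2\sigma^2 r_{rms}^2$, while the imaginary part, which carries the $O(r)$ term, averages toward zero.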

Since $r$ is taken to be a small random error in the sampling position, the factor

$$ (1 - 2\pi^2\sigma^2 r_{rms}^2) $$

in Eq. (8.37d) is always positive with

$$ 2\pi^2\sigma^2 r_{rms}^2 \ll 1 . $$

Since $E(r) = 0$, we see that

$$ r_{rms}^2 = E(r^2) = E\big( [r - E(r)]^2 \big) . \tag{8.37f} $$


Hence rrms, being the square root of the variance E([r − E(r )]2 ) , is the standard deviation of r .
[See Eqs. (3.5c) and (3.8e) of Chapter 3 for definitions of the standard deviation and variance.]
The uncalibrated, noise-contaminated spectral signal in (8.37b) can be written as, after both adding and subtracting

$$ 2\pi^2\sigma^2 r_{rms}^2\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) $$

from the formula,

$$ \tilde{Z}_{eff,totN}(\sigma) \cong \big[ H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] \cdot (1 - 2\pi^2\sigma^2 r_{rms}^2) + 2\pi^2\sigma^2 (r_{rms}^2 - r^2)\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) + (2\pi i\sigma r)\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) . \tag{8.38a} $$

We now define a new random variable

$$ \rho^{(2)} = r_{rms}^2 - r^2 . \tag{8.38b} $$

Taking the expectation value of both sides of (8.38b) gives, again applying Eqs. (3.9f) and (3.16a) from Chapter 3,

$$ E(\rho^{(2)}) = r_{rms}^2 - E(r^2) , $$

which becomes, substituting from Eq. (8.37f),

$$ E(\rho^{(2)}) = 0 . \tag{8.38c} $$

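We can also define a new function

$$ \mathcal{M}(x) = 1 - 2\pi^2 x^2 . \tag{8.38d} $$

Substituting (8.38b) and (8.38d) into (8.38a) gives

$$ \tilde{Z}_{eff,totN}(\sigma) \cong \big[ H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] + 2\pi^2\sigma^2 \rho^{(2)} H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) + (2\pi i\sigma r)\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) $$

or

$$ \tilde{Z}_{eff,totN}(\sigma) \cong \big[ H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] + (r - i\pi\sigma\rho^{(2)}) \cdot \big[ 2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] . \tag{8.38e} $$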


According to the discussion following Eq. (8.37e), we can count on $\mathcal{M}(\sigma r_{rms})$ defined in (8.38d) always being a positive quantity slightly less than one. Substituting Eq. (8.38d) into (8.37d) gives

$$ E\big( \tilde{Z}_{eff,totN}(\sigma) \big) \cong H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) . \tag{8.38f} $$

It is now time to apply the calibration algorithm using the same procedure as in Sec. 8.5 above. We note that Eq. (8.38e) corresponds to Eq. (8.9a) in Sec. 8.4, only now the leading nonrandom term is

$$ \big[ H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] $$

instead of

$$ \big[ H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] , $$

and the small random error is

$$ (r - i\pi\sigma\rho^{(2)}) \cdot \big[ 2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] $$

instead of

$$ n_D^{(s)}(\sigma) * \big[ 2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma) \big] . $$

These observations can be written symbolically as

$$ \big[ H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] \rightarrow \big[ H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] \tag{8.39a} $$

for the large nonrandom term and

$$ n_D^{(s)}(\sigma) * \big[ 2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma) \big] \rightarrow (r - i\pi\sigma\rho^{(2)}) \cdot \big[ 2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] \tag{8.39b} $$

for the small random term. Hence, the formula for $\tilde{Z}_{eff,totN}^{(meas)}$, the uncalibrated and noise-contaminated signal spectrum, is now, applying (8.39a) and (8.39b) to Eq. (8.11e),

$$ \tilde{Z}_{eff,totN}^{(meas)}(\sigma) \cong H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) + (r - i\pi\sigma\rho^{(2)}) \cdot \big[ 2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] . \tag{8.39c} $$

When an interferometer contaminated by delta-function sampling-position noise (that is, quasi-static sampling-position noise obeying a power spectrum like the one in (8.33a)) observes the calibration radiance $L^{(1)}$, we note that the rule in (8.39a) becomes

$$ \big[ H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(1)}(\sigma) \big] \rightarrow \big[ H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(1)}(\sigma) \big] \tag{8.39d} $$

where the superscript (1) is added to show that Eq. (8.10d) specifying $Z_{mnf}^{(1)}$ is now the proper formula for $Z_{mnf}$ (because $L^{(1)}$ is now the input radiance). Similarly, when the interferometer observes the $L^{(2)}$ calibration radiance, we have

$$ \big[ H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(2)}(\sigma) \big] \rightarrow \big[ H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(2)}(\sigma) \big] , \tag{8.39e} $$

where now Eq. (8.10f) specifying $Z_{mnf}^{(2)}$ is the proper formula for the $Z_{mnf}$ function. Applying (8.39d) and (8.39e) to Eqs. (8.11c) and (8.11d) respectively gives

$$ \tilde{Z}_{eff,tot}^{(1)}(\sigma) \cong H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(1)}(\sigma) \tag{8.39f} $$

and

$$ \tilde{Z}_{eff,tot}^{(2)}(\sigma) \cong H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(2)}(\sigma) . \tag{8.39g} $$

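The formula corresponding to Eq. (8.12b) above is, again applying (8.39d) and (8.39e),

$$ \frac{L^{(2)}(\sigma) - L^{(1)}(\sigma)}{\tilde{Z}_{eff,tot}^{(2)}(\sigma) - \tilde{Z}_{eff,tot}^{(1)}(\sigma)} \cong \frac{L^{(2)}(\sigma) - L^{(1)}(\sigma)}{H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,[Z_{mnf}^{(2)}(\sigma) - Z_{mnf}^{(1)}(\sigma)]} , $$

which becomes, after substituting from (8.10d) and (8.10f),

$$ \frac{L^{(2)}(\sigma) - L^{(1)}(\sigma)}{\tilde{Z}_{eff,tot}^{(2)}(\sigma) - \tilde{Z}_{eff,tot}^{(1)}(\sigma)} = \left[ \frac{WA\,\Delta\Omega}{4} H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma) \right]^{-1} . \tag{8.39h} $$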


This is the formula corresponding to Eq. (8.12c) in Sec. 8.5 above. To construct the formula corresponding to Eq. (8.12d), we subtract (8.39f) from (8.39c) to get

$$ \tilde{Z}_{eff,totN}^{(meas)}(\sigma) - \tilde{Z}_{eff,tot}^{(1)}(\sigma) \cong H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,[Z_{mnf}(\sigma) - Z_{mnf}^{(1)}(\sigma)] + (r - i\pi\sigma\rho^{(2)}) \cdot [2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma)] , $$

which becomes, after substituting from (8.7f) and (8.10d),

$$ \tilde{Z}_{eff,totN}^{(meas)}(\sigma) - \tilde{Z}_{eff,tot}^{(1)}(\sigma) \cong \frac{WA\,\Delta\Omega}{4} H(u\sigma)\,\mathcal{M}(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma)\,[L_{mnf}(\sigma) - L^{(1)}(\sigma)] + (r - i\pi\sigma\rho^{(2)}) \cdot [2\pi i\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma)] . \tag{8.39i} $$

Equations (8.39h) and (8.39i) can now be substituted into the fundamental calibration formula (8.12a) to get

$$ \text{Measured Radiance} = L_{mnf}(\sigma) + \frac{8\pi\sigma\,[\pi\sigma\rho^{(2)} + ir]\,Z_{mnf}(\sigma)}{(WA\,\Delta\Omega)\,\mathcal{M}(\sigma r_{rms})\,R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma)} . $$

Substitution from Eq. (8.7f) gives

$$ \text{Measured Radiance} = L_{mnf}(\sigma) + \frac{2\pi\sigma\,[\pi\sigma\rho^{(2)} + ir]}{\mathcal{M}(\sigma r_{rms})} \left\{ L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \right\} . \tag{8.39j} $$

As always, the true error in the measured radiance is the real part of the complex error terms that are present [see, for example, the discussion following Eq. (7.21e) in Chapter 7 or Eq. (6.35d) in Chapter 6]. Hence the error $\delta L$ is the real part of the second term on the right-hand side:

$$ \delta L = \frac{2\pi^2\sigma^2 \rho^{(2)}}{\mathcal{M}(\sigma r_{rms})} \left\{ L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \right\} . \tag{8.39k} $$

When the interferometer’s background radiances $L_{mnf}^{(fore)}$ and $L_{mnf}^{(back)}$ are negligible, the $\delta L$ error has (except for the slowly varying $\sigma^2 / \mathcal{M}(\sigma r_{rms})$ factor) the same shape as the $L_{mnf}$ radiance being measured. This is good news if all that is needed is the shape of the input $L_{mnf}$ radiance (maybe we just want the position of absorption or emission lines in an unknown spectrum) because the change in $\rho^{(2)}$ from measurement to measurement acts like a small random change in the zero level of the radiance curve. It is, however, disturbing news if the $L_{mnf}$ spectrum must be radiometrically accurate, because there is little “off shape” evidence of the sampling error in the measurement.
When the interferometer is contaminated by quasi-static or delta-function sampling-position noise, the expected value for $\delta L$ is, applying the expectation operator to both sides of Eq. (8.39k),

$$ E(\delta L) = \frac{2\pi^2\sigma^2}{\mathcal{M}(\sigma r_{rms})} \left\{ L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \right\} \cdot E(\rho^{(2)}) . $$

According to (8.38c) we can now conclude that

$$ E(\delta L) = 0 , $$

confirming that $\delta L$ is still, just like in Eq. (8.14f) above, a zero-mean random quantity. Hence, its variance is, applying formula (8.15a) to (8.39k),

$$ E(\delta L^2) = E\left( \left[ \frac{2\pi^2\sigma^2}{\mathcal{M}(\sigma r_{rms})} \left\{ L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \right\} \right]^2 [\rho^{(2)}]^2 \right) . $$

The linearity of operator E with respect to random quantities (see Sec. 3.10 of Chapter 3) lets us write

$$ E(\delta L^2) = \left[ \frac{2\pi^2\sigma^2}{\mathcal{M}(\sigma r_{rms})} \left\{ L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \right\} \right]^2 \cdot E\big( [\rho^{(2)}]^2 \big) . $$

This becomes, after substituting from (8.38b) and (8.38d),

$$ E(\delta L^2) = \left[ \frac{2\pi^2\sigma^2}{1 - 2\pi^2\sigma^2 r_{rms}^2} \left\{ L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \right\} \right]^2 \cdot E\big( [r_{rms}^2 - r^2]^2 \big) . $$


Again it is important to remember that, according to the discussion following Eq. (8.37e), the factor $(1 - 2\pi^2\sigma^2 r_{rms}^2)$ is always a positive number slightly less than one. The standard deviation of $\delta L$ is the square root of its variance [see Eq. (3.5c) in Chapter 3]; and, as explained in Sec. 6.1 of Chapter 6, the NEdN of a spectral measurement is the standard deviation of its random error. Hence, the formula for the NEdN of delta-function sampling-position noise is

$$ \mathrm{NEdN}_{samp}^{delta} = \frac{2\pi^2\sigma^2 \sqrt{E\big( [r_{rms}^2 - r^2]^2 \big)}}{1 - 2\pi^2\sigma^2 r_{rms}^2} \left| L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \right| . \tag{8.40a} $$

Using the linearity of the expectation operator E with respect to random quantities [see Eq. (3.9f) and Sec. 3.10 of Chapter 3] and then substituting from Eq. (8.37f), we note that

$$ E\big( [r_{rms}^2 - r^2]^2 \big) = E\big( r_{rms}^4 - 2 r_{rms}^2 r^2 + r^4 \big) = r_{rms}^4 - 2 r_{rms}^2 E(r^2) + E(r^4) = E(r^4) - r_{rms}^4 . $$

The $\mathrm{NEdN}_{samp}^{delta}$ formula can now be written as

$$ \mathrm{NEdN}_{samp}^{delta} = \frac{2\pi^2\sigma^2 \sqrt{E(r^4) - r_{rms}^4}}{1 - 2\pi^2\sigma^2 r_{rms}^2} \left| L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \right| . \tag{8.40b} $$

We already know, according to Eq. (8.37c), that $r$ has a zero-mean probability density distribution. If we also assume that this is a zero-mean normal distribution, then Eq. (7A.5d) in Appendix 7A of Chapter 7 shows that

$$ E(r^4) = 3 r_{rms}^4 , $$

where, of course, we know from the discussion following Eq. (8.37f) that $r_{rms}$ is the standard deviation of $r$. Now we have

$$ \mathrm{NEdN}_{samp}^{delta} = \frac{2\sqrt{2}\,\pi^2\sigma^2 r_{rms}^2}{1 - 2\pi^2\sigma^2 r_{rms}^2} \left| L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \right| \tag{8.40c} $$

as the formula for the NEdN of our measurement. By keeping both the $O(r)$ and $O(r^2)$ terms everywhere they occur in the noise equations, we have ended up with a reasonable formula for the quasi-static sampling noise. We see that neglecting the quadratic term in Eq. (8.2e) is the reason our previous NEdN formula gave zero for the quasi-static sampling noise.
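To get a feel for the size of the quasi-static sampling NEdN, formula (8.40c) can be evaluated directly. The sketch below assumes the normal-distribution form of the result and takes the magnitude of the radiance bracket; the function name, argument names, and numerical values are hypothetical and chosen only for illustration.

```python
import math

def nedn_samp_delta(sigma, r_rms, L_mnf, L_fore=0.0, L_back=0.0, tau_f=1.0):
    """Evaluate Eq. (8.40c), assuming r is zero-mean normal.
    sigma in cm^-1, r_rms in cm; the radiances share one unit."""
    x = 2.0 * math.pi**2 * sigma**2 * r_rms**2       # the small quantity 2*pi^2*sigma^2*r_rms^2
    bracket = L_mnf + (L_fore - L_back) / tau_f      # radiance combination in braces
    return math.sqrt(2.0) * x / (1.0 - x) * abs(bracket)

# Illustrative numbers: 10 nm rms sampling shift observed at 1000 cm^-1.
print(nedn_samp_delta(sigma=1000.0, r_rms=1e-6, L_mnf=100.0))

# A background radiance chosen to balance the input drives the result toward zero.
print(nedn_samp_delta(1000.0, 1e-6, L_mnf=100.0, L_fore=5.0, L_back=95.0, tau_f=0.9))
```

Because the result scales as $r_{rms}^2$, halving the rms sampling shift cuts this noise contribution by roughly a factor of four.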


8.11 Comparing the Sampling-Error, Misalignment, and Detector NEdNs


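Equations (7.34b) and (7.35b) in Chapter 7 specify the NEdN due to random misalignment error to be

$$ \mathrm{NEdN}_{tilt} = \left[ \frac{8\pi \sqrt{J^{(\theta 2)}(\sigma)}}{WA\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma)} \right] , \tag{8.41a} $$

where

$$ J^{(\theta 2)}(\sigma) = \frac{1}{4} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma') \big[ (\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma') + (\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma') \big]^2 d\sigma' . \tag{8.41b} $$

The corresponding pair of formulas for the sampling-error NEdN is, in Eqs. (8.22g) and (8.22h) above,

$$ \mathrm{NEdN}_{samp}(\sigma) = \frac{4 \sqrt{J^{(s)}(\sigma)}}{WA\,\Delta\Omega\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma)} \tag{8.41c} $$

and

$$ J^{(s)}(\sigma) = \pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma') \Big| e^{-i\psi(\sigma)} (\sigma - \sigma')\,H\big( u(\sigma - \sigma') \big)\,M\big( R(\sigma - \sigma')\theta_{ma} \big)\,Z_{mnf}(\sigma - \sigma') - e^{i\psi(\sigma)} (\sigma + \sigma')\,H\big( u(\sigma + \sigma') \big)\,M\big( R(\sigma + \sigma')\theta_{ma} \big)\,Z_{mnf}(\sigma + \sigma') \Big|^2 d\sigma' . \tag{8.41d} $$

The formula for $Z_{mnf}$ is, of course, the same in both sets of equations; Eq. (8.7f) in this chapter just repeats the definition in (7.16f) in the previous chapter. For the two types of NEdN,

$$ Z_{mnf}(\sigma) = \frac{WA\,\Delta\Omega}{4} R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\big[ \tau_f(\sigma) L_{mnf}(\sigma) + L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma) \big] . \tag{8.41e} $$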

Speaking very approximately, and noting what happens when the formulas for $J^{(\theta 2)}$ and $J^{(s)}$ are substituted into Eqs. (8.41a) and (8.41c), we see that both NEdNtilt and NEdNsamp diminish as $\sqrt{J}$ and thus (disregarding for now the effect of the integrals over $d\sigma'$) decrease in an approximately linear way with $Z_{mnf}$. This point is not just academic because, at least in principle, both $L_{mnf}^{(fore)}$ and $L_{mnf}^{(back)}$ are under the control of the interferometer’s designer. Hence, by arranging for $L_{mnf}^{(back)}$, the background radiance from the aft optics, as nearly as possible to equal (under typical operating conditions when measuring a typical input spectrum) the sum

$$ \tau_f(\sigma) L_{mnf}(\sigma) + L_{mnf}^{(fore)}(\sigma) , $$

we can minimize both NEdNtilt and NEdNsamp. This same relationship shows up in the formula for random quasi-static sampling error. Equation (8.40b) shows that if

$$ L_{mnf}(\sigma) + \frac{L_{mnf}^{(fore)}(\sigma) - L_{mnf}^{(back)}(\sigma)}{\tau_f(\sigma)} \approx 0 $$

or

$$ L_{mnf}^{(back)}(\sigma) \approx \tau_f(\sigma) L_{mnf}(\sigma) + L_{mnf}^{(fore)}(\sigma) , \tag{8.41f} $$

then $\mathrm{NEdN}_{samp}^{delta} \approx 0$ also.
It is not difficult to understand why (8.41f) minimizes random misalignment and sampling errors: for both types of noise this minimizing relationship is present from the start of our analysis. It is also not very difficult to show how this works. Working first with the sampling-position noise, we get from Eq. (8.1c) that

$$ z_C^{(tot)}(\chi) = \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma , \tag{8.42a} $$

which means that its derivative can be written as

$$ \frac{dz_C^{(tot)}}{d\chi} = 2\pi i \int_{-\infty}^{\infty} \sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{8.42b} $$

Equation (8.42b) can be substituted into (8.2g) to get

$$ z_{CN}^{(tot)}(\chi) = z_C^{(tot)}(\chi) + 2\pi i\,n^{(s)}(\chi) \int_{-\infty}^{\infty} \sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{8.42c} $$

Examining the derivation of Eq. (8.2g), we see that it comes from removing the $j$ subscripts in Eq. (8.2e). If we want to include the $O\big( [n^{(s)}]^2 \big)$ error term from the analysis of the quasi-static sampling error in Sec. 8.10, we can similarly remove the $j$ subscripts from Eq. (8.35) and substitute from (8.42a) to get

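$$ z_{CN}^{(tot)}(\chi) \Big|_{\text{sampling noise}} = z_C^{(tot)}(\chi) + 2\pi i\,n^{(s)}(\chi) \int_{-\infty}^{\infty} \sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma - 2\pi^2 [n^{(s)}(\chi)]^2 \int_{-\infty}^{\infty} \sigma^2 H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{8.42d} $$

Clearly the size of the sampling noise is governed by the size of $Z_{FOV}$. As for the mirror-tilt error in Chapter 7, Eq. (7.8a) can be substituted into (7.11d) to get

$$ z_{CN}^{(tot)}(\chi) \Big|_{\text{mirror-tilt noise}} = u^{-1} h\!\left( \frac{\chi}{u} \right) * \left[ \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] + u^{-1} a\,h\!\left( \frac{\chi}{u} \right) * \left[ n^{(\theta 2)}(\chi) \int_{-\infty}^{\infty} \sigma^2 Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] . \tag{8.42e} $$

Again the size of the signal noise is governed by the size of $Z_{FOV}$. The formula for $Z_{FOV}$ is [see Eq. (8.1b) above or (7.7b) in Chapter 7]

$$ Z_{FOV}(\sigma) = \left( \frac{WA\,\Delta\Omega}{4} \right) R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\big[ \tau_f(\sigma) L_{FOV}(\sigma) + L_{FOV}^{(fore)}(\sigma) - L_{FOV}^{(back)}(\sigma) \big] . \tag{8.42f} $$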

All the random-error terms in the formulas for the interferogram signal contaminated by sampling noise and random misalignment errors (that is, all the random-error terms on the right-hand sides of Eqs. (8.42d) and (8.42e)) can be minimized by minimizing $Z_{FOV}$. Equation (8.42f) shows that this occurs when

$$ \tau_f L_{FOV} + L_{FOV}^{(fore)} - L_{FOV}^{(back)} \approx 0 $$

or

$$ L_{FOV}^{(back)}(\sigma) \approx \tau_f(\sigma) L_{FOV}(\sigma) + L_{FOV}^{(fore)}(\sigma) . \tag{8.42g} $$

Assuming that the interferometer does a reasonable job of resolving the $L$, $L^{(back)}$, and $L^{(fore)}$ radiance spectra (that is, assuming that the distorting effects of the finite interferogram and finite field of view are negligible) we know, just like in Eqs. (7.19a) and (7.19b) in Chapter 7, that

$$ L^{(back)}(\sigma) \cong L_{FOV}^{(back)}(\sigma) \cong L_{mnf}^{(back)}(\sigma) , \tag{8.42h} $$

$$ L^{(fore)}(\sigma) \cong L_{FOV}^{(fore)}(\sigma) \cong L_{mnf}^{(fore)}(\sigma) , \tag{8.42i} $$

and

$$ L(\sigma) \cong L_{FOV}(\sigma) \cong L_{mnf}(\sigma) . \tag{8.42j} $$

Under these conditions, Eqs. (8.42g) and (8.41f) are effectively identical. Since (8.41f) comes
from minimizing the final formulas for NEdNtilt and NEdNsamp, and (8.42g) comes from
minimizing the raw noise contaminating the initial interferogram signals, we have now confirmed
that this noise-minimizing relationship is present from the beginning of the analysis and
continues through to the end.
The noise associated with the randomly changing misalignment and sampling errors is sometimes called multiplicative noise.111 The name comes from the way these random errors enter the equations only after being multiplied by integrals proportional to $Z_{FOV}$, which is itself proportional to

$$ \tau_f L_{FOV} + L_{FOV}^{(fore)} - L_{FOV}^{(back)} . $$

In Eq. (8.42e), for example, $n^{(\theta 2)}$ is multiplied by

$$ \int_{-\infty}^{\infty} \sigma^2 Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma $$

before contributing to the uncontaminated interference signal. In Eq. (8.42d), $n^{(s)}$ is multiplied by

$$ \int_{-\infty}^{\infty} \sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma $$

and $[n^{(s)}]^2$ is multiplied by

$$ \int_{-\infty}^{\infty} \sigma^2 H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma $$

before contributing to the uncontaminated signal.

The equation for detector noise corresponding to Eqs. (8.42d,e) is [see Eq. (6.22a) in Chapter 6]

$$ z_{CN}^{(tot)}(\chi) \Big|_{\text{detector noise}} = z_C(\chi) + z_C^{(cold)}(\chi) + \frac{1}{u} \left[ n^{(det)}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] . \tag{8.43} $$

111
John Chamberlain, The Principles of Interferometric Spectroscopy, pp. 303–309.


Here the random signal error

$$ \frac{1}{u} \left[ n^{(det)}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] $$

is directly added to the uncontaminated interference signal

$$ z_C(\chi) + z_C^{(cold)}(\chi) . $$

Even though there is a convolution with h( χ / u ) before the addition (to show what happens to
the noise when it passes through the signal processing chain after leaving the detector), no terms
proportional to the input or background radiances are included in the random error before it is
added to the uncontaminated signal. This is why the random error coming from the detector is
sometimes called additive noise.
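The practical difference between the two kinds of noise can be seen in a toy simulation. The sketch below is not the instrument model of this chapter: it simply assumes a scalar signal level and a zero-mean normal random fluctuation, then forms the error either as fluctuation times signal (multiplicative) or as the fluctuation alone (additive). The function name, the fluctuation size, and the seed are hypothetical.

```python
import numpy as np

def error_std(signal_level, kind, n=100_000, seed=0):
    """Standard deviation of a toy measurement error.
    kind='multiplicative': the error scales with the signal level;
    kind='additive': the error is independent of the signal level."""
    rng = np.random.default_rng(seed)
    fluctuation = rng.normal(0.0, 0.01, n)   # assumed random fluctuation
    if kind == "multiplicative":
        return float(np.std(fluctuation * signal_level))
    return float(np.std(fluctuation))

for level in (0.0, 1.0, 2.0):
    print(level, error_std(level, "multiplicative"), error_std(level, "additive"))
```

In this toy the multiplicative error vanishes when the signal level is zero and doubles when the level doubles, while the additive error stays the same; this is the sense in which nulling $Z_{FOV}$ suppresses sampling and misalignment noise but leaves detector noise untouched.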
The noise-free components of the interference signals in Eqs. (8.42d) and (8.43) are the same.
It is easy to show this is true. Setting n (det) to zero in (8.43) reduces the right-hand side to

zC ( χ ) + zC( cold ) ( χ ) ,

which becomes, after substituting from Eqs. (6.5d) and (6.12a) in Chapter 6,

$$ z_C(\chi) + z_C^{(cold)}(\chi) = \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,R(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,L_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma + \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(\sigma)\,\tau_a(\sigma)\,\big[ L_{FOV}^{(fore)}(\sigma) - L_{FOV}^{(back)}(\sigma) \big]\,e^{2\pi i\sigma\chi}\,d\sigma . $$

Combining the two integrals into one, we get

$$ z_C(\chi) + z_C^{(cold)}(\chi) = \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma) \cdot \big[ \tau_f(\sigma) L_{FOV}(\sigma) + L_{FOV}^{(fore)}(\sigma) - L_{FOV}^{(back)}(\sigma) \big]\,e^{2\pi i\sigma\chi}\,d\sigma . $$

Substituting from (8.42f) gives

$$ z_C(\chi) + z_C^{(cold)}(\chi) = \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma , $$

which can also be written as, according to Eq. (8.42a),

$$ z_C(\chi) + z_C^{(cold)}(\chi) = z_C^{(tot)}(\chi) . \tag{8.44} $$

This is the same zC(tot ) that the right-hand side of (8.42d) reduces to when the sampling-position
noise n ( s ) is zero; in both cases, not surprisingly, the same function can be used to represent the
noise-free signal.
The right-hand side of Eq. (8.42e) for the random mirror-misalignment error also reduces to $z_C^{(tot)}$ as the misalignment noise goes to zero, but unfortunately it takes some analysis to show this. When $n^{(\theta 2)}$ is zero, the Fourier F operator defined in Eqs. (2.29a) and (2.29c) in Chapter 2 can be used to write the right-hand side of (8.42e) as

$$ u^{-1} h\!\left( \frac{\chi}{u} \right) * \left[ \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] = u^{-1} h\!\left( \frac{\chi}{u} \right) * F^{(i\sigma\chi)}\big( M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma) \big) . \tag{8.45a} $$

We note that the transform in Eq. (6.27b) in Chapter 6 can be reversed to get (replacing the dummy variables $\chi''$, $\sigma$ by $\chi$, $\sigma'$ respectively)

$$ h\!\left( \frac{\chi}{u} \right) = F^{(i\sigma'\chi)}\big( uH(u\sigma') \big) = u\,F^{(i\sigma'\chi)}\big( H(u\sigma') \big) , $$

where in the last step we have used the linearity of F to move $u$ outside the Fourier transform (see Sec. 2.6 in Chapter 2). This can also be written as

$$ \frac{1}{u} h\!\left( \frac{\chi}{u} \right) = F^{(i\sigma'\chi)}\big( H(u\sigma') \big) . \tag{8.45b} $$
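The scaling relation in (8.45b) can be checked numerically with any transform pair whose closed form is known. The sketch below assumes, only for illustration, the Gaussian pair $H(s) = e^{-\pi s^2}$ and $h(t) = e^{-\pi t^2}$ under the kernel $e^{2\pi i\sigma'\chi}$; this pair is not the electronic response function of the chapter, and the values of `u` and `chi` are arbitrary.

```python
import numpy as np

def H(s):
    """Assumed illustrative transfer function (Gaussian)."""
    return np.exp(-np.pi * s**2)

def h(t):
    """Inverse transform of the assumed Gaussian H under exp(+2*pi*i*s*t)."""
    return np.exp(-np.pi * t**2)

def inverse_transform_of_scaled_H(u, chi, half_width=5.0, n=20001):
    """Riemann-sum estimate of the integral of H(u*s)*exp(2*pi*i*s*chi) over s."""
    s = np.linspace(-half_width, half_width, n)
    ds = s[1] - s[0]
    return np.sum(H(u * s) * np.exp(2j * np.pi * s * chi)) * ds

u, chi = 2.5, 0.7
numeric = inverse_transform_of_scaled_H(u, chi)
print(numeric.real, h(chi / u) / u)
```

The two printed numbers agree, showing that compressing the argument of $H$ by $u$ stretches $h$ by $u$ and scales it by $1/u$.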

Substituting (8.45b) into (8.45a) gives

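$$ u^{-1} h\!\left( \frac{\chi}{u} \right) * \left[ \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] = F^{(i\sigma'\chi)}\big( H(u\sigma') \big) * F^{(i\sigma\chi)}\big( M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma) \big) . $$

Writing the right-hand side as a Fourier integral [after applying Eq. (2.39j) in Chapter 2], we get

$$ u^{-1} h\!\left( \frac{\chi}{u} \right) * \left[ \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] = F^{(i\sigma\chi)}\big( H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma) \big) . $$

We again apply (2.29c) in Chapter 2 to the right-hand side to write

$$ u^{-1} h\!\left( \frac{\chi}{u} \right) * \left[ \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] = \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{8.45c} $$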
"
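Formula (7.3d) in Chapter 7 states that

$$
\theta_{\mathrm{rms}}^{2} = \bar{\theta}^{\,2} + \sigma_x^{2} + \sigma_y^{2} ;
$$

and when the misalignment noise drops to zero, we expect the $\sigma_x$, $\sigma_y$ standard deviations of the random misalignment angles $\theta_x$ and $\theta_y$ also to go to zero, giving us

$$
\theta_{\mathrm{rms}} = \bar{\theta} .
$$

The discussion following Eq. (7.2e) defines $\bar{\theta}$ to be the bias-tilt angle of the randomly varying misalignment, so when the randomly changing misalignment error goes to zero, it makes sense to regard $\bar{\theta}$ as the static misalignment angle $\theta_{\mathrm{ma}}$,

$$
\theta_{\mathrm{rms}} \to \theta_{\mathrm{ma}} .
$$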

Replacing $\theta_{\mathrm{rms}}$ by $\theta_{\mathrm{ma}}$ and then comparing the right-hand side of (8.45c) to (8.42a), we see that as the misalignment noise drops to zero, the noise-free signal once again simplifies to the Fourier transform

$$
z_C^{(\mathrm{tot})}(\chi) = \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, Z_{\mathrm{FOV}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{8.45d}
$$

This is, of course, the same noise-free signal we get in Eq. (8.44) above. Hence we have now
demonstrated that the noise-free signals from our analysis of the detector noise, the sampling-
position noise, and the mirror-misalignment noise indeed all reduce to the same expression, as
they should.
According to the discussion following Eq. (8.42g), the approximate radiance equalities specified by (8.41f) and (8.42g) are essentially equivalent in well-designed interferometers, and from this it follows that $Z_{\mathrm{FOV}}$ [whose formula is given by (8.42f)] is minimized by (8.42g) at the same time that $\mathrm{NEdN}_{\mathrm{tilt}}$ and $\mathrm{NEdN}_{\mathrm{samp}}$ are minimized by (8.41f). At this point, however, we notice that the $z_C^{(\mathrm{tot})}$ noise-free signal component in (8.45d) is also minimized when $Z_{\mathrm{FOV}}$ is minimized. This seems to cause a problem, because the spectral measurement depends on this noise-free component: it clearly does not make sense to design the interferometer for minimal tilt and sample noise if the signal itself then goes away.
To solve this puzzle, we need to be more explicit about what exactly is being measured. According to the mathematics of information theory, the more unexpected an occurrence is, the more information it provides.112 Turning this statement around, the more expected an occurrence is, the less information it provides. With this idea as a guide, we can divide the $L(\sigma)$ radiance spectrum being measured into an expected component and an unexpected, or unknown, component,

$$
L(|\sigma|) = L^{(\mathrm{exp})}(|\sigma|) + L^{(\mathrm{unk})}(|\sigma|) . \tag{8.46a}
$$

The $L^{(\mathrm{exp})}$ spectral radiance is what we expect to measure; it could, for example, be the average spectrum measured in the past under circumstances similar to the present. Assuming that there are $N$ past measurements, we can label each measurement with an index $j = 1, 2, \ldots, N$ and call the radiance seen in the $j$th past measurement $L^{(j)}(|\sigma|)$ so that

$$
L^{(\mathrm{exp})}(|\sigma|) = \frac{1}{N} \sum_{j=1}^{N} L^{(j)}(|\sigma|) . \tag{8.46b}
$$

According to (8.46a), the unknown component $L^{(\mathrm{unk})}$ for the spectrum $L$ now being measured must be the difference between that spectrum and $L^{(\mathrm{exp})}$, so

$$
L^{(\mathrm{unk})}(|\sigma|) = L(|\sigma|) - L^{(\mathrm{exp})}(|\sigma|) . \tag{8.46c}
$$

112 A. Papoulis, Probability, Random Variables, and Stochastic Processes, p. 534.

Function $L^{(\mathrm{unk})}$ is the real information in the signal because we cannot know anything about it ahead of time; in fact, because it is defined to be the difference between $L$ and $L^{(\mathrm{exp})}$, it is equally likely to be positive or negative. Not knowing anything about it ahead of time, we cannot design the instrument around it; we can, however, as with any other truly unpredictable quantity, estimate its expected size by calculating the associated standard deviation:

$$
L^{(\mathrm{unk})}(|\sigma|) \approx \sqrt{\frac{1}{N} \sum_{j=1}^{N} \left[ L^{(j)}(|\sigma|) - L^{(\mathrm{exp})}(|\sigma|) \right]^{2}} . \tag{8.46d}
$$
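As a concrete illustration of Eqs. (8.46b) and (8.46d), the following sketch averages a handful of past spectra channel by channel and then estimates the size of the unknown component as the root-mean-square deviation about that average. It is a minimal example, not the author's procedure: the three "past spectra" are made-up numbers in arbitrary units, the function names are hypothetical, and the standard deviation is assumed to use the $1/N$ normalization.

```python
import math


def expected_spectrum(past):
    """Average N past spectra channel by channel, as in Eq. (8.46b)."""
    n = len(past)
    return [sum(column) / n for column in zip(*past)]


def unknown_spread(past):
    """Per-channel standard deviation about the average, as in Eq. (8.46d)."""
    n = len(past)
    l_exp = expected_spectrum(past)
    return [
        math.sqrt(sum((spectrum[k] - l_exp[k]) ** 2 for spectrum in past) / n)
        for k in range(len(l_exp))
    ]


# Three hypothetical past spectra sampled at two wavenumbers (arbitrary units).
past_spectra = [[1.0, 4.0], [2.0, 4.0], [3.0, 4.0]]
l_exp = expected_spectrum(past_spectra)
l_unk_size = unknown_spread(past_spectra)
print(l_exp, l_unk_size)
```

The second channel never varies in these made-up data, so its estimated unknown component is zero; all the unpredictability sits in the first channel.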

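To show the effect of the interferometer's finite field of view, we follow the pattern of Eqs. (5.83e), (6.11b), and (6.11c) in Chapters 5 and 6, setting

$$
\Delta\sigma = \frac{\Delta\Omega\,|\sigma|}{2\pi}
$$

and then defining that

$$
L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) =
\begin{cases}
L^{(\mathrm{exp})}(|\sigma|) & \text{for small } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ can be approximated as one} \\[2ex]
\dfrac{1}{\Delta\sigma} \cdot \displaystyle\int_{|\sigma|\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{|\sigma|\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L^{(\mathrm{exp})}(|\sigma'|)\, d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ cannot be approximated as one}
\end{cases}
\tag{8.47a}
$$

and

$$
L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) =
\begin{cases}
L^{(\mathrm{unk})}(|\sigma|) & \text{for small } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ can be approximated as one} \\[2ex]
\dfrac{1}{\Delta\sigma} \cdot \displaystyle\int_{|\sigma|\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{|\sigma|\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L^{(\mathrm{unk})}(|\sigma'|)\, d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ cannot be approximated as one}
\end{cases}
\tag{8.47b}
$$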
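The field-of-view averaging in (8.47a) and (8.47b) amounts to a running mean of the radiance over a narrow wavenumber window. The sketch below evaluates that mean with a midpoint rule. It rests on assumptions that should be checked against Chapters 5 and 6: the window width is taken to be $\Delta\sigma = \Delta\Omega\,\sigma/(2\pi)$ for positive $\sigma$, the window is centered at $\sigma(1 + \Delta\Omega/4\pi)$, and the function name, the linear test radiance, and the sample numbers are hypothetical.

```python
import math


def fov_average(radiance, sigma, d_omega, steps=1000):
    """Mean of radiance(sigma') over the field-of-view window (midpoint rule).

    Assumes sigma > 0, window width d_omega*sigma/(2*pi), and window center
    sigma*(1 + d_omega/(4*pi)).
    """
    d_sigma = d_omega * sigma / (2.0 * math.pi)
    center = sigma * (1.0 + d_omega / (4.0 * math.pi))
    lower = center - d_sigma / 2.0
    h = d_sigma / steps
    total = sum(radiance(lower + (i + 0.5) * h) for i in range(steps))
    return total * h / d_sigma


# A radiance that is linear in wavenumber averages to its value at the
# window center, which gives a simple sanity check.
sigma, d_omega = 1000.0, 0.01          # hypothetical wavenumber and solid angle
smeared = fov_average(lambda s: s, sigma, d_omega)
window_center = sigma * (1.0 + d_omega / (4.0 * math.pi))
print(smeared, window_center)
```

For a spectrally flat or slowly varying radiance the averaged value is nearly the same as the unaveraged one, which is the small-$\Delta\Omega$ branch of the definitions.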

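Similarly, following the pattern of Eqs. (5.108d), (6.25b), and (6.25c) in Chapters 5 and 6, the distorting effect of the finite interferogram length can be introduced by defining

$$
L_{\mathrm{mnf}}^{(\mathrm{exp})}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) \tag{8.47c}
$$

and

$$
L_{\mathrm{mnf}}^{(\mathrm{unk})}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) . \tag{8.47d}
$$

The analysis following Eq. (5.108d) applies equally well to $L_{\mathrm{mnf}}^{(\mathrm{exp})}$ and $L_{\mathrm{mnf}}^{(\mathrm{unk})}$, letting us write

$$
L_{\mathrm{mnf}}^{(\mathrm{exp})}(|\sigma|) = L_{\mathrm{mnf}}^{(\mathrm{exp})}(\sigma) \tag{8.47e}
$$

and

$$
L_{\mathrm{mnf}}^{(\mathrm{unk})}(|\sigma|) = L_{\mathrm{mnf}}^{(\mathrm{unk})}(\sigma) \tag{8.47f}
$$

to show that $L_{\mathrm{mnf}}^{(\mathrm{exp})}$ and $L_{\mathrm{mnf}}^{(\mathrm{unk})}$ are even functions of $\sigma$. Formulas (8.47e) and (8.47f) can be substituted into (8.47c) and (8.47d) to get

$$
L_{\mathrm{mnf}}^{(\mathrm{exp})}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) \tag{8.47g}
$$

and

$$
L_{\mathrm{mnf}}^{(\mathrm{unk})}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) . \tag{8.47h}
$$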

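We now combine results. Substituting (8.46a) into the right-hand side of Eq. (5.83e) in Chapter 5 gives

$$
L_{\mathrm{FOV}}(|\sigma|) =
\begin{cases}
L^{(\mathrm{exp})}(|\sigma|) + L^{(\mathrm{unk})}(|\sigma|) & \text{for small } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ can be approximated as one} \\[2ex]
\dfrac{1}{\Delta\sigma} \cdot \displaystyle\int_{|\sigma|\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{|\sigma|\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L^{(\mathrm{exp})}(|\sigma'|)\, d\sigma'
+ \dfrac{1}{\Delta\sigma} \cdot \displaystyle\int_{|\sigma|\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{|\sigma|\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L^{(\mathrm{unk})}(|\sigma'|)\, d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ cannot be approximated as one}
\end{cases}
$$

Equations (8.47a) and (8.47b) show that this formula is the same thing as saying that

$$
L_{\mathrm{FOV}}(|\sigma|) = L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) . \tag{8.47i}
$$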

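Equation (8.47i) can now be substituted into Eq. (5.108d) in Chapter 5 to get

$$
L_{\mathrm{mnf}}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast \left[ L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) \right] ,
$$

which becomes, using the linearity of the convolution [see Eq. (2.38d) in Chapter 2],

$$
L_{\mathrm{mnf}}(\sigma) = \left\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) \right\} + \left\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] \ast L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) \right\} .
$$

Substitution from Eqs. (8.47g,h) gives

$$
L_{\mathrm{mnf}}(\sigma) = L_{\mathrm{mnf}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{unk})}(|\sigma|) .
$$

If the right-hand side of this formula is an even function of $\sigma$ (and it is), then the left-hand side must also be an even function of $\sigma$, allowing us to write

$$
L_{\mathrm{mnf}}(|\sigma|) = L_{\mathrm{mnf}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{unk})}(|\sigma|) . \tag{8.47j}
$$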

Equations (8.47i) and (8.47j) match the form of (8.46a), showing that the distinction between the
expected and unknown radiances extends naturally to the distorted radiance functions produced
by the finite field of view and finite interferogram length.
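The step from the combined radiance to the separate expected and unknown terms relies only on the linearity of convolution, which is easy to confirm on sampled data. The sketch below uses a discrete sum in place of the convolution integral and assumes the convention $\mathrm{sinc}(x) = \sin(x)/x$; the kernel sampling, the value of $D$, and both radiance arrays are hypothetical numbers chosen only for illustration.

```python
import math


def sinc_kernel(d, sigmas):
    """Samples of 2D*sinc(2*pi*sigma*D), assuming sinc(x) = sin(x)/x, sinc(0) = 1."""
    values = []
    for s in sigmas:
        x = 2.0 * math.pi * s * d
        values.append(2.0 * d * (1.0 if x == 0.0 else math.sin(x) / x))
    return values


def convolve(kernel, signal):
    """Full discrete convolution (a sum standing in for the convolution integral)."""
    out = [0.0] * (len(kernel) + len(signal) - 1)
    for i, k in enumerate(kernel):
        for j, s in enumerate(signal):
            out[i + j] += k * s
    return out


kernel = sinc_kernel(0.5, [0.25 * k for k in range(-8, 9)])
l_exp = [5.0, 5.5, 6.0, 5.5, 5.0]      # hypothetical expected radiance samples
l_unk = [0.1, -0.2, 0.3, 0.0, -0.1]    # hypothetical unknown radiance samples

# Convolving the sum gives the same result as summing the two convolutions.
combined = convolve(kernel, [a + b for a, b in zip(l_exp, l_unk)])
separate = [a + b for a, b in zip(convolve(kernel, l_exp), convolve(kernel, l_unk))]
max_gap = max(abs(a - b) for a, b in zip(combined, separate))
print(max_gap)
```

The two results agree to within floating-point rounding, which is the discrete counterpart of the linearity step used above.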
The expected component L(exp) of the measured radiance often acts like a type of background
radiance generated outside the instrument. Suppose, for example, a spectroscopist is trying to
measure the infrared spectrum of a small burning candle with an interferometer having a
relatively large field of view. The optical signal coming from the candle could easily turn out to
be rather small compared to the infrared background signal coming from the laboratory walls. In
this sort of situation, we can say that

$$
L^{(\mathrm{exp})}(|\sigma|) \cong L^{(\mathrm{wall})}(|\sigma|) . \tag{8.48}
$$

Of course L(exp) , if defined by (8.46b), cannot be exactly the same as L(wall) because the candle
would contribute some small average radiance to L(exp) , but this could easily turn out to be
negligible, justifying the approximation in (8.48).
Having divided $L$ into $L^{(\mathrm{exp})}$ and $L^{(\mathrm{unk})}$, we can revisit the minimization conditions for the sampling and mirror-misalignment noise in Eqs. (8.41f) and (8.42g). Substituting Eqs. (8.47i) and (8.47j) into (8.41f) and (8.42g) gives

$$
L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) \approx \tau_f(|\sigma|)\,L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) + \tau_f(|\sigma|)\,L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) + L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) \tag{8.49a}
$$

and

$$
L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \approx \tau_f(|\sigma|)\,L_{\mathrm{mnf}}^{(\mathrm{exp})}(|\sigma|) + \tau_f(|\sigma|)\,L_{\mathrm{mnf}}^{(\mathrm{unk})}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) . \tag{8.49b}
$$

In well-designed interferometers where (8.42h)–(8.42j) are reasonable approximations, we might think to build the instrument so that $L^{(\mathrm{back})}$ satisfies the approximate equalities in (8.49a) and (8.49b), minimizing the sampling and mirror-misalignment noise. The problem with this strategy is not so much a lack of fine control over the spectral shape of $L^{(\mathrm{back})}$, although that is definitely an important consideration, as it is our complete lack of knowledge about what $L^{(\mathrm{unk})}$ will be in any particular measurement. Having defined $L^{(\mathrm{unk})}$ to be that part of the measured radiance about which nothing can be known ahead of time, we do not even know whether $L^{(\mathrm{unk})}$ will be positive or negative [see discussion following Eq. (8.46c)]. Hence, the best that can be done to satisfy (8.49a) and (8.49b) is to set up the instrument so that

$$
L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) \approx \tau_f(|\sigma|)\,L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) \tag{8.49c}
$$

and

$$
L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \approx \tau_f(|\sigma|)\,L_{\mathrm{mnf}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) . \tag{8.49d}
$$

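Now we can make sense of the situation that occurs when instruments are designed to minimize multiplicative noise like $\mathrm{NEdN}_{\mathrm{tilt}}$ and $\mathrm{NEdN}_{\mathrm{samp}}$. Substituting (8.47i) into formula (8.42f) gives

$$
Z_{\mathrm{FOV}}(\sigma) = \left(\frac{W A\,\Delta\Omega}{4}\right) R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|) \cdot \left[ \tau_f(|\sigma|)\left[ L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) \right] + L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) \right]
$$

or

$$
Z_{\mathrm{FOV}}(\sigma) = \left(\frac{W A\,\Delta\Omega}{4}\right) R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|) \cdot \left[ \tau_f(|\sigma|)\,L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) + \left[ \tau_f(|\sigma|)\,L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) \right] \right] . \tag{8.50a}
$$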

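This can be put into formula (8.45d) for the noise-free signal to get

$$
\begin{aligned}
z_C^{(\mathrm{tot})}(\chi) = \frac{W A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} & H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\, e^{2\pi i\sigma\chi} \\
& \cdot \left[ \tau_f(|\sigma|)\,L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|) + \left[ \tau_f(|\sigma|)\,L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) \right] \right] d\sigma .
\end{aligned}
$$

Separating this into an integral containing $L_{\mathrm{FOV}}^{(\mathrm{unk})}$ and an integral containing everything else, we write

$$
z_C^{(\mathrm{tot})}(\chi) = z_C^{(\mathrm{unk})}(\chi) + z_C^{(\mathrm{exp})}(\chi) , \tag{8.50b}
$$

where

$$
z_C^{(\mathrm{unk})}(\chi) = \frac{W A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,L_{\mathrm{FOV}}^{(\mathrm{unk})}(|\sigma|)\, e^{2\pi i\sigma\chi}\, d\sigma \tag{8.50c}
$$

and

$$
\begin{aligned}
z_C^{(\mathrm{exp})}(\chi) = \frac{W A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} & H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\, e^{2\pi i\sigma\chi} \\
& \cdot \left[ \tau_f(|\sigma|)\,L_{\mathrm{FOV}}^{(\mathrm{exp})}(|\sigma|) + L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) \right] d\sigma .
\end{aligned}
\tag{8.50d}
$$

Now, if the interferometer is built so that (8.49c) holds true, then all that can happen is that $z_C^{(\mathrm{exp})}$ disappears, reducing $z_C^{(\mathrm{tot})}$ in (8.50b) to

$$
z_C^{(\mathrm{tot})}(\chi) = z_C^{(\mathrm{unk})}(\chi) . \tag{8.50e}
$$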
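The cancellation behind (8.50e) can be seen at a single wavenumber by evaluating the two pieces of the bracket in (8.50a). The sketch below is a toy calculation with made-up radiance and transmission values in arbitrary units; the function name and all numbers are hypothetical, and the common prefactor and the Fourier integral are left out because they do not affect the cancellation.

```python
def signal_bracket(tau_f, l_exp, l_unk, l_fore, l_back):
    """Split the bracket of Eq. (8.50a) into its unknown and expected parts."""
    unknown_part = tau_f * l_unk
    expected_part = tau_f * l_exp + l_fore - l_back
    return unknown_part, expected_part


# Hypothetical values at one wavenumber (arbitrary units).
tau_f, l_exp, l_unk, l_fore = 0.5, 4.0, 0.25, 1.0

# Choose the background radiance so that condition (8.49c) holds exactly.
l_back = tau_f * l_exp + l_fore

unknown_part, expected_part = signal_bracket(tau_f, l_exp, l_unk, l_fore, l_back)
print(unknown_part, expected_part)
```

With the background matched to the expected radiance, the expected part of the bracket vanishes while the unknown part is untouched, which is why only $z_C^{(\mathrm{unk})}$ survives.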

The $z_C^{(\mathrm{unk})}$ component of the noise-free signal is, however, all we really cared about in the first place. The $z_C^{(\mathrm{exp})}$ expected signal component is already known; it provides no new information because that part of the signal is expected to be there every time the experiment is done. Hence an interferometer can be designed so that approximations (8.49c) and (8.49d) hold true without affecting the relevant part of the signal passing through the instrument. Now that there is no concern about decreasing the quality of the measurement, Eq. (8.47j) can be substituted into formula (8.41e) to get

$$
Z_{\mathrm{mnf}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|) \cdot \left[ \tau_f(|\sigma|)\,L_{\mathrm{mnf}}^{(\mathrm{exp})}(|\sigma|) + \tau_f(|\sigma|)\,L_{\mathrm{mnf}}^{(\mathrm{unk})}(|\sigma|) + L_{\mathrm{mnf}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{mnf}}^{(\mathrm{back})}(|\sigma|) \right] .
$$

Condition (8.49d) can then be applied to minimize multiplicative noise such as $\mathrm{NEdN}_{\mathrm{tilt}}$ and $\mathrm{NEdN}_{\mathrm{samp}}$, leading to

$$
Z_{\mathrm{mnf}}^{(\mathrm{min})}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,L_{\mathrm{mnf}}^{(\mathrm{unk})}(|\sigma|) . \tag{8.50f}
$$

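This is now substituted into formulas (8.41a) and (8.41b) for $\mathrm{NEdN}_{\mathrm{tilt}}$ to get

$$
\mathrm{NEdN}_{\mathrm{tilt}}^{(\mathrm{min})} = \left[ \frac{8\pi \sqrt{J^{(\mathrm{min},\theta 2)}(\sigma)}}{\Delta\Omega\, M(R\sigma\theta_{\mathrm{rms}})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right] , \tag{8.50g}
$$

where

$$
J^{(\mathrm{min},\theta 2)}(\sigma) = \frac{1}{4} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma') \left[ (\sigma+\sigma')^{2}\, Z_{\mathrm{mnf}}^{(\mathrm{min})}(\sigma+\sigma') + (\sigma-\sigma')^{2}\, Z_{\mathrm{mnf}}^{(\mathrm{min})}(\sigma-\sigma') \right]^{2} d\sigma' . \tag{8.50h}
$$

Equation (8.50f) can also be substituted into formulas (8.41c) and (8.41d) to get

$$
\mathrm{NEdN}_{\mathrm{samp}}^{(\mathrm{min})}(\sigma) = \frac{4 \sqrt{J^{(\mathrm{min},s)}(\sigma)}}{A\,\Delta\Omega\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} , \tag{8.50i}
$$

with

$$
\begin{aligned}
J^{(\mathrm{min},s)}(\sigma) = \pi^{2} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma') \Big| & e^{-i\psi(\sigma)}\,(\sigma-\sigma')\, H\big(u(\sigma-\sigma')\big)\, M\big(R(\sigma-\sigma')\theta_{\mathrm{ma}}\big)\, Z_{\mathrm{mnf}}^{(\mathrm{min})}(\sigma-\sigma') \\
& - e^{i\psi(\sigma)}\,(\sigma+\sigma')\, H\big(u(\sigma+\sigma')\big)\, M\big(R(\sigma+\sigma')\theta_{\mathrm{ma}}\big)\, Z_{\mathrm{mnf}}^{(\mathrm{min})\ast}(\sigma+\sigma') \Big|^{2} d\sigma' .
\end{aligned}
\tag{8.50j}
$$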

There is no guarantee, of course, that the interferometer will always be used under the conditions for which it is designed, or even that it is possible to design the interferometer so that the minimizing conditions in (8.49c) and (8.49d) are satisfied. We know, for example, that detector noise dominates the random-error budgets of most well-designed interferometers. According to the discussion at the beginning of Sec. 6.15 of Chapter 6, many detectors operate under close to ideal conditions, so any increase in background radiance $L^{(\mathrm{back})}$ needed to satisfy (8.49c) and (8.49d) can easily end up increasing the $\mathrm{NEdN}^{(\mathrm{det})}$ detector noise more than it decreases the $\mathrm{NEdN}_{\mathrm{tilt}}$ and $\mathrm{NEdN}_{\mathrm{samp}}$ multiplicative noise. Perhaps, then, it is best just to note that for any Fourier-transform spectrometer

$$
\mathrm{NEdN}_{\mathrm{tilt}} \ge \mathrm{NEdN}_{\mathrm{tilt}}^{(\mathrm{min})} \tag{8.51a}
$$

and

$$
\mathrm{NEdN}_{\mathrm{samp}} \ge \mathrm{NEdN}_{\mathrm{samp}}^{(\mathrm{min})} , \tag{8.51b}
$$

with $\mathrm{NEdN}_{\mathrm{tilt}}^{(\mathrm{min})}$ and $\mathrm{NEdN}_{\mathrm{samp}}^{(\mathrm{min})}$ specified by Eqs. (8.50g) and (8.50i) above.
Inequalities such as the ones in (8.51a) and (8.51b) can be very useful. If a proposed interferometer design, with a good guess of $L_{\mathrm{FOV}}^{(\mathrm{unk})}$ based on Eq. (8.46d), produces unacceptably large values for $\mathrm{NEdN}_{\mathrm{tilt}}^{(\mathrm{min})}$ and $\mathrm{NEdN}_{\mathrm{samp}}^{(\mathrm{min})}$, then the design fails, because there is no way the true NEdNs of the actual instrument can be smaller. The multiplicative noise in the system must be reduced before further progress can be made.
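The screening logic implied by (8.51a) and (8.51b) can be written as a one-line test. The sketch below assumes a single allowed NEdN value for the instrument, which is a hypothetical simplification (a real error budget would normally allocate separate limits per noise source and per wavenumber); the function name and the numbers are illustrative only.

```python
def design_can_meet(nedn_min_tilt, nedn_min_samp, nedn_allowed):
    """Return False when either lower bound already exceeds the allowed NEdN.

    Passing this test does not show that the design is adequate; it only
    means the lower bounds have not ruled the design out.
    """
    return nedn_min_tilt <= nedn_allowed and nedn_min_samp <= nedn_allowed


# Hypothetical lower bounds and requirement, all in the same radiance units.
print(design_can_meet(0.2, 0.1, 0.5))
print(design_can_meet(0.7, 0.1, 0.5))
```

Because the true NEdNs can only be larger than the computed minima, a failed check is conclusive while a passed check is not.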

BIBLIOGRAPHY

Articles
Bell, E. E., and Sanderson, R. B. “Spectral Errors Resulting From Random Sampling-Position
Errors in Fourier Transform Spectroscopy.” Applied Optics, 11, no. 3 (March 1972), pp.
688–689.
Cohen, D. “Characterization of a Space-Class Fourier Transform Spectrometer Against a
Detailed Performance Model.” IEEE Aerospace Conference at Snowmass, CO (March
1999).
Cohen, D. “Noise-Equivalent Change in Radiance for Misalignment Noise in a Double-Sided
Interferogram.” Applied Optics, 42, no. 31 (1 November 2003), pp. 6292–6304.
Cohen, D. “Noise-Equivalent Change in Radiance for Sampling Noise in a Double-Sided
Interferogram.” Applied Optics, 42, no. 13 (1 May 2003), pp. 2289–2300.
Cohen, D. “Performance Degradation of a Michelson Interferometer When Its Misalignment
Angle Is a Rapidly Varying Random Time Series.” Applied Optics, 36, no. 18 (20 June
1997), pp. 4034–4041.
Cohen, D. “Performance Degradation of a Michelson Interferometer Due to Random Sampling
Errors.” Applied Optics, 38, no. 1 (1 January 1999), pp. 139–151.
Forman, Michael L., W. Howard Steel, and George A. Vanasse. “Correction of Asymmetric
Interferograms Obtained in Fourier Spectroscopy.” Journal of the Optical Society of
America, 56, no. 1 (January 1966), pp. 59–63.
Haschberger, Peter. “Impact of the Sinusoidal Drive on the Instrumental Line Shape Function of
a Michelson Interferometer with Rotating Retroreflector.” Applied Spectroscopy, 48, no. 3
(1994), pp. 307–315.
Hirschfeld, Tomas. “Multiple Order Spectra in Fourier Transform Infrared Spectroscopy.”
Applied Optics, 16, no. 7 (July 1977), pp. 1905–1907.
Kauppinen, Jyrki, and Pekka Saarinen. “Line-Shape Distortions in Misaligned Cube Corner
Interferometers.” Applied Optics, 31, no. 1 (January 1992), pp. 69–73.
Lambert, D. K., and P. L. Richards. “New Results in the Theory of a Plane-Mirror
Interferometer.” Journal of the Optical Society of America, 68, no. 8 (August 1978), pp.
1124–1130.
Learner, R. C. M., A. P. Thorne, and J. W. Brault. “Ghosts and Artifacts in Fourier-Transform
Spectrometry.” Applied Optics, 35, no. 16 (June 1996), pp. 2947–2953.
Loewenstein, Ernest V. “Fourier Spectroscopy: An Introduction.” Proceedings of the Aspen
International Conference on Fourier Spectroscopy, Aspen, CO (March 16–20, 1970), pp.
3–17.
Mattson, David R. “Sensitivity of a Fourier Transform Infrared Spectrometer.” Applied
Spectroscopy, 32, no. 4 (1978), pp. 335–338.
Michelson, Albert A. “The Relative Motion of the Earth and the Luminiferous Ether.” The
American Journal of Science, 22 (Second Series, 1881), pp. 120–129.
Michelson, Albert A., and Edward W. Morley. “On the Relative Motion of the Earth and the
Luminiferous Ether.” The American Journal of Science, 34, no. 203 (Third Series, 1887),
pp. 333–345.
Miller, Dayton C. “The Ether Drift Experiment and the Determination of the Absolute Motion of
the Earth.” Nature, 133 (3 February 1934), pp. 162–164.
Miller, Dayton C. “The Ether-Drift Experiment and the Determination of the Absolute Motion of
the Earth.” Reviews of Modern Physics, 5 (July 1933), pp. 203–242.
Murty, M. V. R. K. “Modification of Michelson Interferometer Using Only One Cube-Corner
Prism.” Journal of the Optical Society of America (Letters to the Editor), 50, no. 1, pp.
83–84.
Murty, M. V. R. K. “Some More Aspects of the Michelson Interferometer with Cube Corners.”
Journal of the Optical Society of America, 50, no. 1 (January 1960), pp. 7–10.
Nishiyama, Taichiro, Takashi Yamauchi, Masanao Ohno, Masao Morii, Nobuo Ura, and Koji
Masutani. “New Sampling Method in Fourier Spectroscopy.” Japanese Journal of Applied
Physics, 14, Suppl. 14-1, (1975), pp. 67–69.
Park, Jae H. “Analysis and Application of Fourier Transform Spectroscopy in Atmospheric
Remote Sensing.” Applied Optics, 23, no. 15 (1 August 1984), pp. 2604–2607.
Park, Jae H. “Analysis Method for Fourier Transform Spectroscopy.” Applied Optics, 22, no. 6
(15 March 1983), pp. 835–849.
Park, Jae H. “Effect of Interferogram Smearing on Atmospheric Limb Sounding by Fourier
Transform Spectroscopy.” Applied Optics, 21, no. 8, (15 April 1982), pp. 1356–1366.
Raspollini, Piera, Peter Ade, Bruno Carli, and Marco Ridolfi. “Correction of Instrument Line-
Shape Distortions in Fourier Transform Spectroscopy.” Applied Optics, 37, no. 17 (10 June
1998), pp. 3697–3704.
Revercomb, Henry E., H. Buijs, Hugh B. Howell, D. D. Laporte, William L. Smith, and L. A.
Sromovsky. “Radiometric Calibration of IR Fourier Transform Spectrometers: Solution to
a Problem with the High-Resolution Interferometer Sounder.” Applied Optics, 27, no. 15 (1
August 1988), pp. 3210–3218.
Saarinen, Pekka, and Jyrki Kauppinen. “Spectral Line-Shape Distortions in Michelson
Interferometers due to Off-Focus Radiation Source.” Applied Optics, 31, no. 13, (1 May
1992), pp. 2353–2359.
Sakai, H. “Consideration of the Signal-to-Noise Ratio in Fourier Spectroscopy.” Proceedings of
the Aspen International Conference on Fourier Spectroscopy, Aspen, CO (March 16–20,
1970), pp. 19–40.
Sakai, H., and G. A. Vanasse. “Spectral Recovery in Fourier Spectroscopy.” Journal of the
Optical Society of America, 58, no. 1 (January 1968), pp. 84–90.
Schumann, L. W., T. S. Lomheim, and J. F. Johnson. “Design Constraints on Advanced Two-
Dimensional LWIR Focal Planes for Imaging Fourier Transform Spectrometer Sensors.”
Paper 3063-13 presented at Aerosense: the SPIE International Symposium on Aerospace
and Defense Sensing, Simulation, and Controls at Orlando, FL (April 1997).
Shankland, R. S., S. W. McCuskey, F. C. Leone, and G. Kuerti. “New Analysis of the
Interferometer Observations of Dayton C. Miller.” Reviews of Modern Physics, 27, no. 2
(April 1955), pp. 167–178.
Shaw, J. E. “Spectroradiometry over Broad Spectral Regions by Fourier Spectroscopy.” Journal
of the Optical Society of America, 57, no. 9 (September 1967), pp. 1136–1140.
Stroke, George W. “Photoelectric Fringe Signal Information and Range in Interferometers with
Moving Mirrors.” Journal of the Optical Society of America, 47, no. 12 (December 1957),
pp. 1097–1103.
Tanner, D. B., and R. P. McCall. “Source of a Problem with Fourier Transform Spectroscopy.”
Applied Optics, 23, no. 14 (15 July 1984), pp. 2363–2368.
Williams, Charles S. “Mirror Misalignment in Fourier Spectroscopy Using a Michelson
Interferometer with Circular Aperture.” Applied Optics, 5, no. 6 (June 1966), pp. 1084–
1085.
Yap, B. K., W. A. M. Blumberg, and R. E. Murphy. “Off-Axis Effects in a Mosaic Michelson
Interferometer.” Applied Optics, 21, no. 22 (15 November 1982), pp. 4176–4182.
Zachor, Alexander S. “Drive Nonlinearities: Their Effects in Fourier Spectroscopy.” Applied
Optics, 16, no. 5 (May 1977), pp. 1412–1424.
Zachor, Alexander S., and Steve M. Aaronson. “Delay Compensation: Its Effect in Reducing
Sampling Errors in Fourier Spectroscopy.” Applied Optics, 18, no. 1 (1 January 1979), pp.
68–75.

Books

Abramowitz, Milton, and Irene A. Stegun (eds.). Handbook of Mathematical Functions (National
Bureau of Standards, Applied Mathematics Series 55, Washington, DC, 1964).
Bass, Michael (ed.). Handbook of Optics, Vols. I and II, 2nd ed. (Optical Society of America,
McGraw-Hill, Inc., New York, 1995).
Batygin, V. V., and I. N. Toptygin. Problems in Electrodynamics (Academic Press, New York,
1964).
Beer, Reinhard. Remote Sensing by Fourier Transform Spectrometry (John Wiley & Sons, Inc.,
New York, 1992).
Beers, Yardley. Introduction to the Theory of Error, 2nd ed. (Addison-Wesley Publishing
Company, Inc., Reading, MA, 1957).
Bennett, Jean M., and Lars Mattson. Introduction to Surface Roughness and Scattering (Optical
Society of America, Washington, DC, 1989).
Blake, Ian F. An Introduction to Applied Probability (John Wiley & Sons, Inc., New York, 1979).
Bois, G. Petit. Tables of Indefinite Integrals (Dover Publications, Inc., New York, 1961),
unabridged translation of a book first published by B. G. Teubner in 1906.
Born, Max, and Emil Wolf. Principles of Optics: Electromagnetic Theory of Propagation,
Interference, and Diffraction of Light, 7th exp. ed. (Cambridge University Press, New
York, 1999).
Bracewell, Ron. The Fourier Transform and Its Applications (McGraw-Hill Book Company,
New York, 1965).
Chamberlain, John. The Principles of Interferometric Spectroscopy (John Wiley & Sons, New
York, 1979).
Champeney, D. C. A Handbook of Fourier Theorems (Cambridge University Press, New York,
1987).
Chandrasekhar, S. Radiative Transfer (Dover Publications, Inc., New York, 1960), slightly
revised from 1950 book.
Cohen, D. Demystifying Electromagnetic Equations: A Complete Explanation of EM Unit
Systems and Equation Transformations (SPIE Press, Bellingham, WA, 2001).
Davenport, Wilbur B., Jr., and William L. Root. An Introduction to the Theory of Random Signals
and Noise (McGraw-Hill Book Company, Inc., New York, 1958).
Davis, Sumner P., Mark C. Abrams, and James W. Brault. Fourier Transform Spectrometry
(Academic Press, New York, 2001).
Defense Supply Agency, Standardization Division. Military Standardization Handbook Optical
Design, MIL-HDBK-141, 5 October 1962.
Dereniak, Eustace L., and Devon G. Crowe. Optical Radiation Detectors (John Wiley & Sons,
Inc., New York, 1984).
Ditchburn, R. W. Light, Vols. 1 and 2, 2nd ed. (Interscience Publishers, a division of John Wiley
& Sons, Inc., New York, 1963).
Evans, Merran, Nicholas Hastings, and Brian Peacock. Statistical Distributions, 2nd ed. (John
Wiley & Sons, Inc., New York, 1993).
Eyges, Leonard. The Classical Electromagnetic Field (Dover Publications, Inc., New York,
1980), an unabridged and corrected edition of 1972 book published by Addison Wesley.
Francon, M. Optical Interferometry (Academic Press, New York, 1966).
Freeman, J. J. Principles of Noise (John Wiley & Sons, Inc., New York, 1958).
Gabel, Robert A., and Richard A. Roberts. Signals and Linear Systems, 2nd ed. (John Wiley and
Sons, Inc., New York, 1980).
Gaskill, Jack D. Linear Systems, Fourier Transforms, and Optics (John Wiley & Sons, Inc., New
York, 1978).
Goldstein, D. Polarized Light, 2nd ed. (Marcel Dekker, Inc., New York, 2003).
Goodman, Joseph W. Introduction to Fourier Optics (McGraw-Hill, Inc., New York, 1988),
reissue of 1968 book.
Goodman, Joseph W. Statistical Optics (John Wiley & Sons, New York, 1985).
Goody, R. M., and Y. L. Yung. Atmospheric Radiation: Theoretical Basis, 2nd ed. (Oxford
University Press, New York, 1989).
Gradshteyn, I. S., and I. M. Ryzhik. Table of Integrals, Series, and Products, 5th ed., edited by
Alan Jeffrey (Academic Press, New York, 1994).
Griffiths, David J. Introduction to Electrodynamics, 2nd ed. (Prentice-Hall, Englewood Cliffs,
NJ, 1989).
Griffiths, Peter R., and James A. de Haseth. Fourier Transform Infrared Spectrometry (John
Wiley and Sons, Inc., New York, 1986).
Heavens, O. S. Optical Properties of Thin Solid Films (Butterworths Scientific Publications,
London, 1955).
Hecht, Eugene. Optics, 2nd ed., with contributions by Alfred Zajac (Addison-Wesley Publishing
Company, Reading, MA, 1987).
Helstrom, Carl W. Statistical Theory of Signal Detection, 2nd ed. (Pergamon Press, New York,
1968).
Jackson, John David. Classical Electrodynamics, 3rd ed. (John Wiley & Sons, Inc., New York,
1999).
Jaffe, Bernard. Michelson and the Speed of Light (Anchor Books, Doubleday and Company, Inc.,
New York, 1960).
Jeffrey, Alan. Handbook of Mathematical Formulas and Integrals (Academic Press, Inc., New
York, 1995).
Jenkins, F., and H. White. Fundamentals of Optics, 3rd ed. (McGraw-Hill Book Company, New
York, 1957).
Kay, Steven M. Fundamentals of Statistical Signal Processing: Estimation Theory (PTR Prentice
Hall, Inc., Englewood Cliffs, NJ, 1993).
Keigo, Iizuka. Engineering Optics, rev. translation of the 2nd original Japanese ed. (Springer-
Verlag, New York, 1983).
Klambauer, Gabriel. Aspects of Calculus (Springer-Verlag, New York, 1986).
Klein, Miles V. Optics (John Wiley & Sons, Inc., New York, 1970).
Kusse, Bruce, and Eric Westwig. Mathematical Physics: Applied Mathematics for Scientists and
Engineers (John Wiley and Sons, New York, 1998).
Lamb, H. Hydrodynamics (Dover Publications, Inc., New York, 1945), copy of the 6th ed. first
published in 1879.
Landau, L. D., and E. M. Lifshitz. Electrodynamics of Continuous Media, translated from the
Russian by J. B. Sykes and J. S. Bell (Pergamon Press, New York, 1960).
Landau, L. D., and E. M. Lifshitz. The Classical Theory of Fields, 3rd rev. English ed., translated
from the Russian by Morton Hamermesh (Pergamon Press, New York, 1971).
Lathi, B. P. An Introduction to Random Signals and Communication Theory (International
Textbook Company, Scranton, PA 1968).
Lighthill, M. J. Introduction to Fourier Analysis and Generalized Functions (Cambridge
University Press, New York, 1958).
Livingston, Dorothy Michelson. The Master of Light: A Biography of Albert A. Michelson
(Charles Scribner’s Sons, New York, 1973).
Mandel, Leonard, and Emil Wolf. Optical Coherence and Quantum Optics (Cambridge
University Press, New York, 1995).
Michelson, A. A. Light Waves and Their Uses (The University of Chicago Press, Chicago, 1903).
Michelson, A. A. Studies in Optics (The University of Chicago Press, Chicago, 1927).
Mobley, Curtis D. Light and Water: Radiative Transfer in Natural Waters, based in part on
collaborations with Rudolf W. Preisendorfer (Academic Press, New York, 1994).
Morse, P., and K. Ingard. Theoretical Acoustics (McGraw-Hill, Inc., New York, 1968).
Morse, Philip M., and Herman Feshbach. Methods of Theoretical Physics, Parts I and II
(McGraw-Hill Book Company, Inc., New York, 1953).
O’Neill, Edward L. Introduction to Statistical Optics (Dover Publications, Inc., New York,
copyright 1963, 1991 by Edward O’Neill).
Papoulis, Athanasios. Systems and Transforms with Applications in Optics (McGraw-Hill
Publishing Company, Inc., New York, 1968).
Papoulis, Athanasios. The Fourier Integral and Its Applications (McGraw-Hill, Inc., New York,
copyright 1962 and 1987).
Papoulis, Athanasios. Probability, Random Variables, and Stochastic Processes, 3rd ed.
(McGraw-Hill, Inc., New York, 1991).
Papoulis, Athanasios. Signal Analysis (McGraw-Hill Book Company, New York, 1977).
Porat, Boaz. A Course in Digital Signal Processing (John Wiley & Sons, Inc., New York, 1997).
Press, William H., Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical
Recipes in C: The Art of Scientific Computing, 2nd ed. (Cambridge University Press, New
York, 1992).
Rade, Lennart, and Bertil Westergren. Beta β Mathematics Handbook, 2nd ed. (CRC Press, Boca
Raton, FL, 1990).
Rybicki, George B., and Alan P. Lightman. Radiative Processes in Astrophysics (John Wiley &
Sons, Inc., New York, 1979).
Sears, Francis Weston. Optics, 3rd ed. (Addison-Wesley Publishing Company, Reading, MA,
1949).
Slater, John C., and Nathaniel H. Frank. Electromagnetism (Dover Publications, Inc., 1947).
Sneddon, Ian N. Fourier Transforms (Dover Publications, Inc., New York, 1995) an unabridged
and unaltered version of the book published by McGraw-Hill in 1951.
Sommerfeld, Arnold, Optics, Lectures on Theoretical Physics, Vol. IV, (Academic Press, New
York, 1964).
Soong, T. T. Random Differential Equations in Science and Engineering (Academic Press, New
York, 1973).
Sparrow, E. M., and R. D. Cess. Radiation Heat Transfer, augmented ed. (Hemisphere Publishing
Corporation, New York, 1978).
Staff of the Bateman Manuscript Project. Tables of Integral Transforms, Vol. I and II, (McGraw-
Hill Book Company, Inc., New York, 1954).
Steel, W. H. Interferometry (Cambridge University Press, New York, 1967).
Stokes, G. Mathematical and Physical Papers, Vol. III, (Cambridge University Press, New York,
1901).
Stone, John M. Radiation and Optics: An Introduction to the Classical Theory (McGraw-Hill
Book Company, Inc., New York, 1963).
Thomas, John B. An Introduction to Applied Probability and Random Processes (John Wiley &
Sons, Inc., New York, 1971).
Thomson, J. H., and F. G. Smith. Optics (John Wiley & Sons, Ltd., New York, 1971).
Thorne, Anne P. Spectrophysics, 2nd ed. (Chapman and Hall, New York, 1988).
Valasek, Joseph. Elements of Optics (McGraw-Hill Book Company, Inc., New York, 1932).
Vincent, John David. Fundamentals of Infrared Detector Operation and Testing (John Wiley and
Sons, Inc., New York, 1990).
Wax, Nelson (ed.). Selected Papers on Noise and Stochastic Processes (Dover Publications, Inc.,
New York, 1954).
Weast, Robert C. (ed.). Handbook of Chemistry and Physics, 51st ed. (The Chemical Rubber
Company, Cleveland, OH, 1970–1971).
Whittaker, Edmund. A History of the Theories of Aether and Electricity, Vols. I and II (Tomash
Publishers and American Institute of Physics, 1951), published by the Philosophical
Library, copyright 1987 by American Institute of Physics.
Williams, W. Ewart. Applications of Interferometry, 4th ed. (John Wiley & Sons, Inc., New
York, 1950).
Wirsching, Paul H., Thomas L. Paez, and Keith Ortiz. Random Vibrations: Theory and Practice
(John Wiley and Sons, Inc., New York, 1995).
Wolfe, William L., and Zissus, George J. (eds.). The Infrared Handbook, rev. ed., (Infrared
Information Analysis [IRIA] Center for the Office of Naval Research, first ed. 1978,
revised ed. 1985).
Wyatt, Clair L. Radiometric Calibration: Theory and Methods (Academic Press, Inc., New York,
1978).
Yariv, Amnon, and Pochi Yeh. Optical Waves in Crystals: Propagation and Control of Laser
Radiation (Wiley Interscience, Hoboken, NJ, 2003).
Zemanian, A. H. Distribution Theory and Transform Analysis: An Introduction to Generalized
Functions with Applications (Dover Publications, Inc., New York, 1965), copyright by
Zemanian, an unabridged, slightly corrected version of 1965 book published in 1987 by
McGraw-Hill.

Index
1/f noise, 6.7, 764-767 430, 444, 448, 470, 522-526, 555
beam splitter, 2, 3, 7, 12, 14, 18, 19, 22, 24, 26, 28, 31, 42,
A 44, 46, 47, 54, 58, 59, 355, 394-400, 406, 407, 411-415,
A/D converter. See analog-to-digital converter 456, 464, 466, 467, 470, 472, 474, 478, 479, 481, 489, 536-
absolutely integrable, 69, 74, 75, 108 538, 543, 574, 575, 577, 585, 586, 602, 608, 698, 749
absorption line, 742, 806, 1022. See also emission line Bessel function, 456, 462, 485, 530, 630, 871
AC coupling, 619, 630, 750, 752, 757, 880 bias angle, 922, 927-929, 931, 939
additive noise, 1016, 1028 bias-tilt angle, 870, 912, 948, 951, 1030
aft optics, 5.8, 605, 607-609, 618, 626, 628-630, 632, 635, black-body, 559, 605, 756, 930-932, 934, 939, 987, 991
659, 700, 986, 1025 black-body spectrum, 8.8, 929, 931-933, 986-988, 991, 992,
alias, 195, 200, 716-720, 723 994, 996, 1007
aliasing, 188, 195-198, 200, 218, 708, 709, 715, 716, 851, BLIP, 807, 814
853, 854 Boltzmann's constant, 559
amplitude-reflection coefficient, 368, 407, 412, 413, 467, bounded function, 69, 71
473, 475
amplitude-transmission coefficient, 357-359, 362, 367, 407, C
473 cadmium red line, 24, 28
analog-to-digital converter, 8.1, 8.2, 555, 696, 697, 749, 849, calibration, 5.19, 465, 487, 488, 555, 681, 683, 685, 704,
850, 953, 954 725, 726, 742, 753, 762, 764, 766, 767, 782, 784, 822, 843,
angle-wavenumber transform, 4.8, 380, 382, 386, 391, 393, 853, 893, 932, 953, 954, 966, 987, 1020, 1021
394 calibration algorithm, 685, 686, 762, 782-785, 806, 820,
anti-aliasing filter, 6.22, 7.4, 555, 849, 853, 879, 880 853, 865, 891, 892, 964-967, 1016, 1019
anti-Hermitian function, 101, 220, 222 Cassegrain telescope, 385, 386, 389
apodization, 5.16, 650, 654-656 cat's-eye, 54
apodizing, 654, 656 Cauchy, Augustin, 1
approximation, gray-body. See gray-body approximation Cauchy principal value, 82, 118-122, 140, 142-144, 158
artificially created even signals, 3.27, 319 causal system, 729
autocorrelation, 3.13, 249, 319, 322, 523, 903 central dark fringe, 12, 14, 22, 26
autocorrelation function, 8.3, 223, 250, 251, 258, 274, 275, central fringe, 12, 14, 31
277, 278, 280, 281, 284, 285, 288, 290, 299, 301, 304, 319, central limit theorem, 3.11, 227, 243, 246, 248
329, 448, 522, 524, 791, 794, 860-862, 903, 904, 912, 948, characteristic function, 231, 267
956, 957, 977, 1012, 1014 co-adding, 764
autocovariance, 3.13, 249-251, 258 coefficient
avoidable misalignment noise, 7.7, 895 amplitude-reflection, 368, 407, 412, 413, 467, 473, 475
avoidable noise, 6.8, 767-770, 787, 788, 843, 848, 901, 1016 amplitude-transmission, 357-359, 362, 367, 407, 473
power reflection, 478, 481
B power transmission, 478, 481
background-limited infrared photon, 807. See also BLIP compensator plate, 7, 14, 394-397, 399, 412, 456, 466, 467,
background radiance, 4.18, 5.13, 6.3, 6.4, 6.5, 465, 466, 468, 473, 474, 489, 533-536, 538-541, 543-546, 549, 551, 553,
473, 474, 476, 479-481, 483-486, 555, 587, 626, 628-631, 554
639-641, 664, 681, 686, 698, 752, 753, 755, 758, 782, 806, complex scalar field, 403-409, 476
852, 911, 930, 934, 935, 941, 986, 1006, 1007, 1021, 1025, complex vector field, 335, 490-498
1028, 1034, 1038 constructive interference, 46
balanced background signal, 330, 464, 474, 475, 585 continuous function, 49, 64, 65, 161, 188, 195, 200, 202,
balanced output, 46, 47, 49, 54, 438 204, 217, 727
balanced radiation field, 4.15, 394, 415, 417, 438, 552 convolution, 110-115, 130, 132, 161, 162, 175, 202, 204,
balanced signal, 4.16, 5.4, 56, 453, 454, 464, 465, 551, 573, 218, 282, 284, 312, 313, 625, 626, 646, 650, 679, 683, 684,
575, 582-585, 587, 588, 591, 594, 599, 602, 603, 605, 606, 702, 703, 727, 738, 741, 771, 772, 774, 775, 777, 778, 823,
608, 610, 611, 616, 630-632 826, 830, 832-837, 858, 879-883, 887-889, 900, 908, 910,
band-limited function, 200, 393 922, 924, 925, 939, 941, 963, 969, 970, 984, 1002, 1013,
band-limited radiation, 4.10, 390, 428 1028, 1034
band-limited white noise, 3.25, 6.13, 299, 301, 767, 795, three-dimensional, 215
798, 807, 808, 812, 817, 924 two-dimensional, 211, 212, 214-216
bandwidth, 301, 795, 808 corner cube, 54, 55, 56
beam-chopped radiation, 4.9, 4.14, 383-385, 391, 394, 427, corner frequency, 766

correlated random variable, 239, 265, 913, 914 766, 795, 812, 819, 842, 846, 903
cosine curve, 66, 67, 218 double-sided signal, 6.8, 6.10, 6.14, 6.16, 7.5, 7.11, 682,
cosine transform, 2.2, 2.4, 67, 68, 70, 73-75, 80, 81, 83-86, 767, 772, 787, 789, 800, 814, 815, 817, 842, 843, 845, 848,
89-91, 93, 95, 96, 98, 103, 218, 463, 464, 780 882, 884, 889, 890, 909, 955
coupling, AC. See AC coupling
covariance, 237, 262 E
covariance stationary, 258 Earth's orbital velocity, 14, 19, 23
cross-correlation function, 259, 281, 283, 913, 948, 950, 951 effective spectrum, 5.11, 5.12, 622-624, 644, 645, 650, 663,
cross-power spectrum, 281, 282, 286, 913, 914, 948-952 665, 666, 668, 674, 677, 681, 686, 690, 692, 700, 702, 708
curl, 331, 335, 492, 493 Einstein, Albert, 1, 23
elastic vibrations, 2, 43
D electric field, 522
D*, 813, 820 electromagnetic radiation, 5.1, 55, 372, 394, 555, 556, 611,
D-limited Fourier transform, 779, 792, 800, 840, 889, 898, 810
900, 901, 961 electromagnetic wave, 4.1, 4.2, 329, 330, 335, 339, 360,
D-star, 813 428, 432, 489
delta function, 144-146, 148-154, 157, 158, 161, 162, 169, emission line, 8.9, 742, 930, 934-936, 939, 941, 985, 996,
172, 191-193, 216-218, 229, 279, 301, 312, 431, 445, 474, 998, 999, 1002, 1006, 1007, 1022
580, 581, 620, 626, 647, 727, 729, 813, 819, 999, 1007, ensemble, 3.14, 251-253, 279, 280, 303, 765, 798, 800, 869
1012-1014, 1016 ensemble average, 271, 272, 274, 278, 280
nth derivative of, 154 ergodic, 272, 274, 277, 279, 791
dependent random variable, 3.5, 3.9, 223, 263 in the autocorrelation function, 274, 275, 277, 278
derivative in the mean, 271, 272, 274, 277-279
of a generalized function, 130 in the variance, 277, 278
of the delta function, 153, 154 random function, 3.18, 271, 274, 275, 279, 280
destructive interference, 46 ergodicity, 223, 279, 280, 301, 329
detector circuit, 5.10, 6.9, 6.22, 7.4, 617-619, 621, 622, 624- ether, 1, 2, 14, 23, 31, 330
626, 630, 636-641, 643, 656, 667, 668, 674, 681, 685, 686, drift, 23
696, 698, 699, 727, 730, 748, 750, 768-771, 777, 821, 822, luminiferous, 1, 2, 14
849, 853, 879, 880, 897 stationary, 14, 23
detector NEdN, 8.11, 1024 wind, 1, 1.2, 14, 23, 24, 26, 54
detector noise, 6.6, 6.9, 6.10, 6.12, 6.13, 6.14, 6.17, 6.18, even function, 2.3, 51, 76, 77, 79-84, 86, 88, 111, 114, 119,
6.19, 6.20, 726, 742, 763, 764, 766-770, 772, 786-789, 121, 128, 135, 139, 140, 206, 268, 281, 282, 288, 296, 303,
791, 792, 794, 795, 800, 806, 807, 814, 815, 817, 819-821, 307, 319, 327, 438, 454, 460-463, 471, 477, 483, 485-488,
823, 828, 829, 840, 844-846, 848, 849, 853, 894, 941, 953, 575, 577, 580, 584, 589, 590, 601, 603, 604, 606, 610, 613-
964, 1016, 1027, 1031, 1038 615, 617, 624-627, 630, 631, 633-635, 655, 668, 672-674,
detector responsivity, 611, 612, 632, 751, 759, 808, 874 680, 684, 703, 729, 732, 746, 767, 768, 775, 786, 787, 790,
detector signal, 5.9, 611, 618, 626, 629, 698, 768, 822, 874 792, 800, 815, 818, 819, 826-828, 840, 843, 844, 848, 891,
DFT, 182, 183, 185, 187, 188, 190, 192, 195, 197, 699, 849, 895-897, 899, 900, 904, 913, 924, 945, 950, 952, 957, 968,
851. See also discrete Fourier transform 969, 971, 976, 1033, 1034
Dirac delta function, 144 expectation operator, 3.4, 3.10, 230, 232, 239-243, 245, 247,
direction-chopped radiation, 500, 555 248, 250, 258-260, 266, 268, 270, 271, 283, 284, 290, 314,
direction cosines, 339 318, 321, 329, 432, 440, 445, 745, 761, 781, 809, 814, 817,
discrete Fourier transform, 5.23, 62, 173, 181, 182, 218, 555, 841, 842, 878, 889, 890, 892, 902, 915, 916, 918, 921, 956,
699, 704, 708, 709, 713-716, 720, 722, 723, 849. See also 962, 963, 965, 971, 977, 981, 1017, 1022, 1023
DFT
distribution theory, 121 F
distributions, 121, 246, 870, 873, 914 fast-Fourier transform, 55, 96, 699
divergence, 335 Fellgett advantage, 55
divergent integral, 117 FFT, 55, 188, 699. See also fast-Fourier transform
dot product, 349, 373, 405, 490, 494, 495 field of view, 22, 26, 28, 31, 330, 453-455, 460, 461, 472,
double-sided interferogram, 5.15, 555, 643, 646, 650, 667, 483, 485, 573, 588, 594, 601, 603, 605, 608, 612, 626, 627,
677, 680-683, 701, 742, 850, 853, 865, 953, 959, 975 630, 637, 639, 645, 656, 659-661, 665, 667, 683, 686, 731,
double-sided NEdN, 848 753, 754, 756-758, 809, 857, 931, 986, 1034
double-sided power spectrum, 289, 296, 297, 455, 474, 483, filter theory, 117

finite field of view, 5.17, 656, 673, 676, 684, 685, 726, 744, delta, 144-146, 148-154, 157, 158, 161, 162, 169, 172,
775, 783, 857, 859, 887, 891, 894, 935, 936, 991, 998, 191-193, 216-218, 229, 279, 301, 312, 431, 445, 474,
1026, 1032, 1034 580, 581, 620, 626, 647, 727, 729, 813, 819, 999, 1007,
finite variation, 69, 73, 141 1012-1014, 1016
fixed mirror, 24, 26, 27, 33, 35, 44, 58, 394-397, 399, 412, Dirac delta, 144
466, 553, 574, 667, 692, 696, 698, 749 even, 2.3, 51, 76, 77, 79-84, 86, 88, 111, 114, 119, 121,
focal plane, 385, 387, 594-597, 599, 600, 688 128, 135, 139, 140, 206, 268, 281, 282, 288, 296, 303,
fore optics, 607-609, 618, 626-628, 634, 635, 749, 934, 986, 307, 319, 327, 438, 454, 460-463, 471, 477, 483, 485-
1006 488, 575, 577, 580, 584, 589, 590, 601, 603, 604, 606,
Fourier convolution theorem, 2.9, 2.17, 110, 112, 114, 115, 610, 613-615, 617, 624-627, 630, 631, 633-635, 668,
159, 160, 162, 176, 202, 204, 212-216, 218, 286, 292, 311, 672-674, 680, 703, 729, 732, 746, 767, 768, 775, 778-
625, 645, 646, 655, 679, 728, 773, 777, 778, 824, 829, 831, 780, 786, 790, 792, 800, 815, 819, 826-828, 840, 843,
885, 889, 906, 907, 919, 960, 975, 979, 1015 844, 848, 891, 895-897, 899, 900, 904, 913, 924, 945,
Fourier identities, 2.8, 103, 209 950, 952, 957, 968, 969, 971, 976, 1033, 1034
Fourier scaling theorem, 107, 210, 211 generalized, 2.11, 2.13, 2.17, 62, 121-130, 132, 136-139,
Fourier series, 2.20, 62, 173, 177-179, 181 141-145, 148, 152, 155, 156, 159-162, 167-170, 172,
Fourier shift theorem, 106, 209, 725, 1015 175, 218
Fourier transform, 2.1, 2.5, 2.6, 2.7, 2.10, 2.13, 2.25, 3.23, Hermitian, 101, 102, 219, 221, 282, 286-288, 372, 419,
31, 50-52, 54, 57-59, 62, 70, 76, 89, 93-107, 109, 112, 114, 420, 425, 429, 624, 625, 668, 678, 729, 731, 786, 822,
115, 117-122, 124, 136-142, 144, 157, 167, 168, 171, 173, 825, 880, 962, 968, 976
176, 178, 181, 182, 188, 194, 197, 200, 202, 204, 207-210, impulse-response, 282, 285, 287, 625, 626, 727-730, 770,
213-215, 218, 231, 281, 282, 285-290, 292, 297-299, 302, 822, 879
303, 310, 311, 371, 372, 381-384, 391, 393, 426, 447, 449, instrument line-shape, 114, 115, 648
451, 456, 464, 488, 525, 605, 610, 614, 620, 623, 625, 626, instrument-response, 114, 115, 647, 728, 729
639-641, 643-646, 650, 654-656, 677-680, 683, 699, 704, mixed, 2.3, 76, 77
708, 709, 715, 728-730, 754, 756, 757, 772-775, 777-780, odd, 2.3, 76, 77, 79-85, 88, 90, 111, 119-121, 128, 129,
786-788, 790, 792, 794, 822, 824, 826, 829-831, 833-835, 139, 140, 142, 148, 158, 218, 228, 266, 267, 303, 314,
837, 838, 840, 849, 860, 862, 866, 880, 884-886, 888, 889, 459, 463, 488, 604, 615, 634, 674, 732, 788, 800, 899,
895, 896, 899, 904, 913, 914, 919, 920, 922, 957, 959, 961, 946, 947, 950, 968
962, 974, 975, 979, 980, 1029, 1030 random, 3.2, 3.13, 3.15, 3.23, 3.26, 223-225, 242, 249,
Fourier transform of generalized functions, 144, 159, 167, 250, 252, 253, 257-261, 271-275, 277-282, 284, 287-
168 290, 296, 297, 299, 301-303, 319, 328, 432, 438, 522,
Fourier transform of the delta function, 2.16, 157 523, 526, 535, 744, 746, 747, 760-766, 780, 792, 798,
Fourier transform of the shah function, 2.19, 165, 171 800, 815, 840, 844-847, 860, 869, 871, 873, 874, 876,
Fourier transform pairs, 143, 159, 160, 206, 319, 677, 703 877, 882, 892, 903, 911, 912, 914, 951, 953, 956, 962,
frequency, 24, 31, 34, 37, 39-41, 43, 45, 47, 49, 51, 67, 94- 988, 1012, 1013
96, 107, 108, 115, 121, 188, 190, 192, 195, 196, 198, 200, stationary random, 3.15, 252, 260, 261, 271, 279, 282,
201, 203, 224, 249, 289, 297, 298, 314, 319, 533, 534, 538, 287, 303, 319, 791, 861, 862
557, 558, 560, 723, 766, 808-811, 813, 819, 820, 853-855, tapering, 678, 822, 825, 827
931. See also Nyquist frequency test, 121-133, 135, 136, 138, 141, 142, 144-148, 151-154,
Fresnel, Augustin, 1 161-164, 168-171, 173-175
fringe, 1.6, 12, 14, 22-24, 26, 28-30, 47, 50, 52, 58 transfer, 285-287, 620-622, 624, 645, 661, 668, 674, 681,
central. See central fringe 728-731, 733, 777, 778, 821, 822, 825, 831, 853, 880,
central dark. See central dark fringe 888, 923, 931, 968, 976, 991, 994, 996
fringe shift, 23, 34, 54 functional, 121-123, 126, 130, 144, 145
function
anti-Hermitian, 101, 220, 222 G
autocorrelation, 8.3, 223, 250, 251, 258, 274, 275, 277, Gaussian
278, 280, 281, 284, 285, 288, 290, 299, 301, 304, 319, multivariate, 261-263
329, 448, 522, 524, 791, 794, 860-862, 903, 904, 912, probability distribution, 227, 243, 246, 800
948, 956, 957, 977, 1012, 1014 random processes, 3.16, 261, 262, 279
band-limited, 200 generalized function, 2.11, 2.13, 2.17, 62, 121-132, 136,
bounded, 69, 71 137-139, 141-145, 148, 152, 155, 156, 159-162, 167-170,
characteristic, 231, 267 172, 175, 218
continuous, 49, 64, 65, 161, 188, 195, 200, 202, 204, 217 generalized function, derivative of a, 130
cross-correlation, 259, 281, 283, 913, 948, 950, 951

generalized function theory, 62, 143, 144 Jones, 813
generalized limit, 2.12, 132, 133, 135-138, 141, 142, 145- jump discontinuity, 69, 70, 80, 81, 119, 124
147, 157, 160-162, 164, 167-171, 175, 177, 218
geometric optics, 383, 385 K
geometric series, 167, 168, 184 Kronecker delta, 185
ghost line, 939, 941, 996, 998, 1002, 1007
gray-body approximation, 559, 935
Green, George, 1 L
laser-based servo controls, 1.8, 57, 59
light
H monochromatic, 1.3, 3, 24, 28, 31, 580
Hartley transform, 87-89, 93, 99 speed of, 1, 19, 23, 31, 346, 559, 808
Heaviside step function, 155, 156, 310, 320, 322, 828, 835, white, 6, 7, 12, 14, 22, 26, 28, 40, 41, 44, 47, 50
840, 845 linear combination, 125-127, 161, 727
Haidinger rings, 593, 597 linear operation, 2.6, 97, 99, 110, 727, 823
Hermitian function, 101, 102, 219, 221, 282, 286-288, 372, linear operator, 97, 240, 335, 496, 816
419, 420, 425, 429, 624, 625, 668, 678, 729, 731, 786, 822, linear polarization, 4.4, 349-351, 355, 356, 362, 366
825, 880, 962, 968, 976 linear system, 3.21, 282, 285, 287, 288, 727, 729
Hertz, Heinrich, 1 Lorentz, Hendrik, 1
homogeneous, 299, 524 luminiferous ether. See ether, luminiferous
homogeneous random field, 297, 523, 524

I M
magnetic-induction field, 331, 332, 350, 368, 372
ILS, 647. See also instrument line shape magnetic permeability, 331
impulse-response function, 282, 285, 287, 625, 626, 727- Maxwell, James Clerk, 1
730, 770, 822, 879 Maxwell's equations, 2, 50, 330, 344, 363, 496, 556
independent random variable, 3.6, 233, 234, 236, 239, 243, mean, 3.3, 3.8, 3.13, 62, 64, 226-230, 235, 240, 241, 243,
245, 260, 261, 280, 810, 811, 872, 914, 921, 927, 930 246-250, 260, 262-269, 271, 272, 278, 301, 798, 800, 811,
index of refraction, 355, 385, 532-534 870, 871, 877, 902, 909, 912, 914, 921, 922, 932, 939, 945-
information theory, 1031 947, 956, 958, 972, 988, 1017, 1022, 1023
infrared spectra, 55, 58, 464, 501, 626, 641, 752, 1034 mercury green line, 24
instrument line shape, 647, 648 Michelson, Albert, 1, 2, 12, 14, 22-24, 28, 31, 42, 49, 50, 52,
instrument line-shape function, 114, 115, 648 54
instrument-response function, 114, 115, 647, 728, 729 Michelson-based spectroscopy, 29
interference Michelson interferometer, 1.1, 1.4, 1.5, 2, 3, 4.11, 4.12, 4.13,
constructive, 46 5.4, 5.5, 5.6, 5.7, 5.13, 14, 23, 24, 31, 41, 44, 46, 47, 50,
destructive, 46 52, 54, 55, 62, 115, 117, 197, 200, 330, 355, 385, 390, 391,
interferogram, 5.24, 5.25, 197, 200, 463-465, 487, 488, 579- 394, 395, 400, 415, 427, 464, 481, 502, 534, 543, 551, 555,
581, 583-585, 603, 604, 610, 683, 704, 715, 721, 723, 726, 573, 585, 588, 599, 626, 660, 667, 682, 683, 685, 781, 849,
744, 764, 775, 782, 783, 807, 857, 859, 865, 887, 891, 894, 853, 929
930, 935, 938, 986, 991, 998, 1012, 1026, 1031, 1034 Michelson-Morley experiment, 1
interferogram signal, 5.22, 5.26, 197, 555, 580, 623-625, Michelson's mistake, 22
642, 643, 650, 654, 656, 666-668, 678, 683, 696, 698, 699, mirror, fixed. See fixed mirror
701, 704, 709, 715, 716, 723, 725, 742, 748, 764, 767, 775, mirror-misalignment NEdN, 865, 911
782, 798, 802, 804, 806, 857, 930, 1012, 1026, 1027 mirror-misalignment noise, 7.3, 7.8, 870, 873, 874, 879, 882,
inverse Fourier transform, 2.5, 6.4, 137, 139, 168, 171, 194, 884, 889, 890, 896-898, 909, 964, 967, 1031, 1034, 1035
204, 208-211, 213-215, 280, 281, 287, 371, 372, 381, 382, mirror, moving. See moving mirror
426, 464, 605, 610, 620, 621, 623-625, 668, 678, 680, 729, misalignment angle, 7.2, 456, 459, 502, 504, 575, 692, 694,
750, 753, 767, 774, 822, 826 867, 868, 878, 904, 927, 930, 932, 951, 952, 954, 1030
inverse-square law, 5.3, 571, 573 misalignment NEdN, 7.11, 8.11, 909, 926, 953, 1024
misalignment noise, 7.4, 7.5, 7.6, 7.7, 7.15, 726, 865, 879,
J 882, 891, 895, 929, 932, 933, 935, 939, 941, 953, 996,
Jacquinot advantage, 55 1002, 1006, 1029, 1030
jointly normal random variable, 3.17, 263, 266, 271, 914, mixed function, 2.3, 76, 77
917 monochromatic beam, 7, 22, 26, 28, 37, 46
jointly wide-sense stationary, 259, 281, 913 monochromatic light, 1.3, 3, 24, 28, 31, 580

monochromatic plane wave, 4.4, 4.12, 348-353, 357, 360, nth derivative of the delta function, 154
362, 363, 368, 373, 395, 400, 403, 406, 411, 412, 415, 416, Nyquist frequency, 190, 192, 196, 197, 201, 203, 704, 716
465, 478, 500, 532-534, 538-540, 543-547, 549, 551, 554 Nyquist wavenumber, 704, 716-718, 798, 851
monochromatic wavetrain, 4.3, 7, 14, 22, 37, 39-42, 44-47,
58, 344, 434, 522, 525 O
Morley, Edward, 1 odd function, 2.3, 76, 77, 79-85, 88, 90, 111, 119-121, 128,
moving mirror, 7.2, 24, 26, 28-31, 33, 44-47, 51, 52, 55, 57, 129, 139, 140, 142, 148, 158, 218, 228, 266, 267, 303, 314,
58, 394, 395, 401, 412, 414, 415, 454, 456, 459, 460, 473, 459, 463, 488, 604, 615, 634, 674, 732, 788, 800, 899, 946,
481, 502-504, 507, 510, 546, 547, 551, 552, 554, 574, 575, 947, 950, 968
577, 588, 591-594, 597, 602, 617, 630, 636, 667, 668, 675, off-axis signal, 5.6, 555, 588, 589, 591-593
692, 694, 696, 726, 748, 749, 763, 768, 821, 867, 869, 871, off-center sampling, 5.26, 723, 822
873, 878, 879, 1012, 1017 OPD, 395, 398, 414, 453, 482, 500, 501, 573, 582, 597, 748-
multidimensional Wiener-Khinchin theorem, 3.24, 297, 298, 750, 757, 762-764, 791, 815, 843, 849, 850, 860, 869, 879,
434, 522, 525 882, 883, 932, 939, 953-959, 986, 988, 1012. See also
multiplicative noise, 1027, 1035, 1037, 1038 optical-path difference
multivariate Gaussian, 261-263 OPD velocity, 619, 636, 748, 792, 795, 813, 879, 931, 987
optical axis, 385-389, 395, 400, 404-407, 416, 425, 453,
N 456, 465, 534, 544-546, 552, 553, 573, 588, 590, 592-594,
NEdN, 6.1, 6.16, 8.7, 742-745, 747, 763, 768, 807, 814, 815, 599, 606, 628, 660, 686, 696
821, 844, 845, 848, 853, 865, 911, 930, 932, 933, 947, 953, optical-path difference, 41, 395, 501, 577, 585, 617, 619,
972, 973, 988, 990, 1002, 1007, 1023, 1038 622, 624, 630, 643, 650, 655, 659, 666, 667, 686, 690, 696,
detector, 6.21, 8.11, 844, 1024 699, 712-715, 723, 930, 953. See also OPD
double-sided, 848 oversampling, 5.24, 704, 715, 723, 852-854
mirror-misalignment, 865, 911
misalignment, 7.11, 8.11, 909, 926, 953, 1024 P
sampling error, 8.11, 885, 953, 984, 1002, 1024 p-wave, 359, 362, 368, 407, 412, 413
sampling-noise, 953, 985, 1002 pencil rays, 556-558, 566-568, 570, 571, 573, 575, 577, 584,
single-sided, 848 588, 590, 597, 606
noise pencils of rays. See pencil rays
additive, 1016, 1028 permittivity, 331
avoidable, 6.8, 767-770, 787, 788, 843, 848, 901, 1016 photon noise, 6.15, 806-808, 812, 814
avoidable misalignment, 7.7, 895 photovoltaic, 807, 814
band-limited, 3.25, 299, 301 Planck radiation, 559, 931, 932, 986, 991, 1007
band-limited white, 6.13, 87, 795, 798, 808, 812, 817, 924 Planck's constant, 559, 808
detector, 953, 964, 1016, 1027, 1031, 1038 plane of incidence, 353, 355, 357-361, 367-369, 400, 406,
mirror-misalignment, 964, 967, 1031, 1034, 1035 407, 410, 411, 467, 540, 547
misalignment, 726, 953, 996, 1002, 1006, 1029, 1030 plane wave, 4.5, 4.6, 4.13, 344, 346, 350, 352, 353, 355-360,
multiplicative, 1027, 1035, 1037, 1038 362, 366, 367, 375, 383, 385, 386, 394, 395, 400, 401, 405-
photon, 6.15, 806-808, 812, 814 407, 409, 412-417, 425, 451, 465, 467, 478, 556, 570, 571,
quasi-harmonic, 924, 926, 929, 930, 939, 987, 988, 999 573, 594, 596, 597, 599, 602, 606-608, 611, 612, 660
quasi-static sampling, 8.10, 988, 1002, 1007, 1012, 1020, polarization, 54, 350, 454, 480
1022, 1023, 1025 polarization, linear, 4.4, 349, 356, 358
sampling-position, 8.2, 8.3, 954-958, 976, 987, 988, 990, polychromatic plane wave, 362, 372, 373, 383, 605-607
996, 998, 1002, 1006, 1012-1014, 1020, 1022 polychromatic wavefield, 4.7, 368, 369, 428
signal, 6.5, 225, 280, 753, 759, 843, 848, 853, 900, 1026 power reflection coefficient, 478, 481
unavoidable, 6.8, 767-770, 786-788, 843, 900, 969, 1016 power spectrum, 3.20, 3.22, 3.23, 8.3, 280-282, 287-290,
unavoidable misalignment, 7.7, 895 296, 297, 299, 301, 305, 311, 319, 320, 525, 579, 580, 591,
white, 223, 301 592, 594, 617, 766, 767, 791, 794, 795, 798, 819, 846, 860,
noise-equivalent change in radiance, 742, 743. See also 862, 864, 956-958, 976, 979, 987, 990, 995, 1002, 1006,
NEdN 1007, 1013, 1014, 1016, 1020
noise-power spectrum, 223, 312, 328, 765, 766, 812, 813, power transmission coefficient, 478, 481
905, 921, 924-932, 934, 939, 948, 979, 983, 987, 988, 992, Poynting vector, 430, 438, 470
1002, 1012 principle of independent superposition, 41, 47
normal probability distribution, 265, 870, 873, 914, 932, prism-based spectrometer, 55
945-947 probability density distribution, 798, 800, 870, 871, 914,

927, 932, 945-947, 1023 Revercomb calibration algorithm, 686
propagation vector, 338, 349, 353, 354, 362-364, 376, 382, ringing, 647, 656, 683, 709, 806
383, 385, 386, 390, 394, 395, 399-401, 405, 407, 416, 421,
434, 453, 455, 482, 500, 502, 504, 507, 573, 660 S
pupil function, 452, 456, 460, 522, 528, 529, 531 s-wave, 357, 358, 362, 367, 368, 407, 412, 413
PV. See photovoltaic sampling error, 8.6, 8.7, 696, 969, 971-973, 988, 996, 1012-
1014, 1022, 1025, 1027
Q sampling-error NEdN, 8.11, 953, 984, 985, 1002, 1024
quantum efficiency, 808 sampling-position error, 696, 987, 988, 1017
quasi-harmonic noise, 924, 926, 929, 930, 939, 987, 988, sampling-position noise, 8.2, 8.3, 954, 955-958, 976, 987,
999 988, 990, 996, 998, 1002, 1006, 1012-1014, 1020, 1022
quasi-static sampling noise, 8.10, 988, 1002, 1007, 1012, sampling theorem, 2.24, 200
1020, 1022, 1023, 1025 self-apodization, 666, 667
shah function, 2.18, 2.19, 162, 165, 171, 175
R signal noise, 6.5, 225, 280, 753, 759, 843, 848, 853, 900,
radiance, 5.2, 5.3, 55, 248, 416, 425, 474, 476, 481, 484, 1026
485, 566-568, 570, 571, 573, 627, 629, 630, 639-641, 643, signal-to-noise ratio, 55, 742
647, 664, 681, 685, 686, 698, 699, 703, 726, 742, 743, 745, sine curve, 66, 67, 218
747, 748, 750, 753, 758, 762, 775, 781, 782, 806, 808, 809, sine transform, 2.2, 2.4, 67, 68, 70, 75, 80-85, 87-89, 91, 93,
813, 816, 823, 844, 857, 887, 891, 892, 911, 932, 933, 936- 95, 96, 98, 99, 119, 121, 218
938, 967, 969, 987, 988, 991, 992, 994, 998, 999, 1002, single-sided interferogram, 5.18, 555, 643, 667, 673, 674,
1007, 1013, 1020-1022, 1031, 1034, 1035 681, 682, 726, 742, 850, 853
radiant energy, 355, 430, 432, 433, 438, 470, 480, 555-558, single-sided NEdN, 848
566-568, 571, 572 single-sided power spectrum, 296, 297, 455, 576, 766, 812,
radiometric spectral radiance, 6.2, 329, 455, 486, 742-744, 819, 820
748, 751-753, 760, 762, 763, 772, 775, 778, 783, 784, 789, single-sided signal, 6.18, 6.19, 6.20, 6.21, 821, 823, 829,
800, 816, 821, 852, 857-859, 879, 880, 890, 891, 905, 932, 840, 844, 845, 848
936, 938, 964, 968, 973, 986, 991, 998, 1006, 1007, 1026, Snell's law, 532, 534
1031 SNR. See signal-to-noise ratio
radiometry, 455, 555-557, 566 solid angle, 390, 437, 453, 455, 461, 472-474, 476, 478, 483,
random error, 50, 223, 247, 742-745, 747, 759, 763, 764, 485, 555-558, 566, 567, 570-573, 589, 597, 599, 601, 603,
766, 768, 789, 844, 853, 865, 953, 955, 973, 1017, 1019, 629, 743, 806, 809
1023, 1027, 1028 source fluctuations, 55
random function, 3.2, 3.13, 3.15, 3.23, 3.26, 223-225, 242, space look, 642
249, 250, 252, 253, 257-261, 271-275, 277-282, 284, 287- specific detectivity, 813
290, 296, 297, 299, 301-303, 319, 328, 432, 438, 522, 523, spectral doublet, 28
525, 526, 744, 746, 747, 760-766, 780, 792, 798, 800, 815, spectral intensity function, 49, 51
840, 844-847, 860, 869, 871, 873, 874, 876, 877, 882, 892, spectral line, 1.4, 1.5, 24, 26-30, 32, 47, 50-52, 55
903, 911, 912, 914, 951, 953, 956, 962, 988, 1012, 1013 spectral multiplet, 32
random process, 249, 301, 791. See also Gaussian random spectral radiance, 6.2, 329, 455, 486, 555-560, 566, 570,
processes 571, 575, 576, 590, 591, 594, 597, 599, 601, 605, 606, 608,
random signal, 223, 762, 810, 1028 612, 629, 631, 643, 646, 647, 671, 677, 685, 686, 703, 725,
random variable, 3.1, 3.5, 3.6, 3.7, 3.9, 3.17, 223-227, 230- 726, 742-744, 748, 751-753, 760, 762, 763, 772, 775, 778,
243, 246, 249-251, 254, 255, 257, 259-269, 271, 273, 275, 783, 784, 789, 800, 816, 821, 852, 857-859, 879, 880, 887,
432, 438, 446, 523, 525, 526, 809-811, 814, 816, 869, 902, 890, 891, 894, 905, 932, 936, 938, 964, 968, 973, 986, 991,
909, 911, 915, 916, 921, 922, 947, 948, 972, 1018 998, 1006, 1007, 1026, 1031
rays, 383, 385, 394, 395, 459, 464, 467, 474, 532-534, 541- spectral resolution, 647, 665, 667, 677, 709, 715, 821, 823,
545, 551, 556, 570, 573, 585, 588, 589, 592, 594, 606. See 853, 930, 938
also pencil rays spectrometer
real linear operator, 335, 496-498 Fourier-transform, 1.7, 31, 50, 52, 54, 55, 57-59, 599,
real scalar field, 493 617, 623, 640, 643, 647, 667, 699, 707, 719, 727, 742,
relativity theory, 23 764, 767, 1016, 1038
resolving power, 647, 667, 668, 682 grating-based, 55
response time, 36, 808 prism-based, 55
retroreflector, 54, 55, 59 spectroscope, 24

spectroscopy, Michelson-based, 383, 437 866, 880, 882, 884-886, 888, 889, 895, 896, 899, 904,
speed of light, 1, 19, 23, 31, 346, 559, 808 913, 914, 919, 920, 922, 957, 959, 961, 962, 974, 975,
standard deviation, 3.3, 226, 228, 229, 243, 246-248, 267, 979, 980, 1029, 1030
269, 745, 747, 766, 821, 853, 870-873, 909, 915, 927, 933, Hartley, 87-89, 93, 99
941, 945, 947, 973, 990, 1002, 1007, 1018, 1023, 1030, inverse Fourier, 2.5, 6.4, 137, 139, 168, 171, 194, 204,
1032 208-211, 213-215, 280, 281, 287, 371, 372, 381, 382,
stationarity, 223, 280, 297, 301, 791 426, 464, 605, 610, 620, 621, 623-625, 668, 678, 680,
stationary, 18, 20, 252-254, 258, 259, 262, 263, 271, 272, 729, 750, 753, 767, 774, 822, 826
274, 278-280, 319, 523, 524, 791, 1012 sine, 2.2, 2.4, 67, 68, 70, 75, 80, 81, 83-85, 87-89, 91, 93,
stationary ether. See ether, stationary 95, 96, 98, 99, 119, 121, 218
stationary random function, 3.15, 252, 260, 261, 271, 279, three-dimensional Fourier, 382, 391, 426, 447-449, 525
282, 287, 304, 319, 523, 791, 861, 862, 869 time-limited Fourier, 302
step function, 320. See also Heaviside step function two-dimensional Fourier, 210, 213, 215, 451, 456, 528,
stochastic process, 225, 249 529, 531
strongly ergodic, 278, 279 vector Fourier, 209, 382
vector inverse Fourier, 209, 382
T transverse vibrations, 1, 2
T-limited Fourier transform, 793, 814, 846 truncated interferogram signal, 701, 708, 715, 806, 959
tapering function, 678, 822, 825, 827 tunnel diagram, 400, 467, 543-545, 551
Taylor series, 995, 1014 two-dimensional convolution, 211, 212, 214-216
test function, 121-133, 135, 136, 138, 141, 142, 144-148, two-dimensional delta function, 216
151-154, 161-164, 168-171, 173-175 two-dimensional Fourier transform, 210, 213, 215, 451, 456,
theory of relativity, 23 528, 529, 531
thin film, 353, 360, 361, 479, 539
three-dimensional convolution, 215 U
three-dimensional delta function, 216 unapodized spectral resolution, 647, 715, 930, 938, 986
three-dimensional Fourier transform, 382, 391, 426, 447- unavoidable misalignment noise, 7.7. See also mirror-
449, 525 misalignment noise
time average, 34, 36, 38, 43, 253, 271, 272, 274-276, 278, unavoidable noise, 6.8, 767-770, 786-788, 843, 900, 969,
453 1016
time-chopped radiation, 4.10, 4.14, 390-393, 427, 430, 444, unbalanced background signal, 330, 464, 465, 470, 472, 474,
448, 470, 522-524, 526 479, 482, 485, 486, 551, 585, 630
time-invariant linear system, 727 unbalanced output, 46, 47, 50, 54-56
time-limited Fourier transform, 302 unbalanced radiation field, 4.17, 394, 464, 467, 470, 472
transfer function, 285-287, 620-622, 624, 645, 661, 668, unbalanced signal, 5.5, 55, 464, 465, 585-587, 632
674, 681, 728-731, 733, 777, 778, 821, 822, 825, 831, 853, uncalibrated spectrum, 6.19, 7.5, 8.4, 682, 683, 685, 781-
880, 888, 923, 931, 968, 976, 991, 994, 996 785, 800, 829, 842, 849, 882, 884, 889-893, 959, 962, 964-
transform 966, 1015-1019
angle-wavenumber, 4.8, 380, 382, 386, 391, 393, 394 uncorrelated random variable, 239
cosine, 2.2, 2.4, 67, 68, 70, 73-75, 80, 81, 83-86, 89-91, undersampling, 5.25, 200, 715, 716, 718, 723, 852, 853, 855
93, 95, 96, 98, 103, 218, 463, 464, 780 unfolded interferometer, 400, 401, 406, 407, 415, 465, 482
D-limited Fourier, 779, 792, 800, 840, 889, 898, 900, 901,
961 V
fast-Fourier, 55, 96, 699 variance, 3.3, 7.10, 226, 228, 229, 231, 240, 241, 244, 246,
Fourier, 2.1, 2.5, 2.6, 2.7, 2.10, 2.13, 2.25, 3.23, 14, 30, 248, 266, 277, 278, 301, 798, 800, 811, 812, 821, 905, 908,
31, 50-52, 54, 57-59, 62, 70, 76, 89, 93-107, 109, 112, 909, 951, 972, 973, 987, 1018, 1022, 1023
114, 115, 117-122, 124, 136-142, 157, 167, 168, 171, vector calculus, 491
176, 178, 181, 182, 188, 194, 197, 200, 202, 204, 207- vector Fourier transform, 209, 382
210, 213-215, 218, 231, 281, 282, 285-290, 292, 297- vector inverse Fourier transform, 209, 382
299, 300, 303, 310, 311, 371, 372, 381-384, 391, 393, vector notation, 208, 211, 215, 217, 218, 298, 372, 490, 491
426, 447, 449, 451, 456, 464, 488, 525, 605, 610, 614, velocity at Earth's equator, 14, 23
620, 623, 625, 626, 639-641, 643-646, 650, 654-656, vibrations
677-680, 683, 699, 704, 708, 709, 715, 728-730, 754, elastic, 2, 43
756, 757, 772-775, 777-780, 786-788, 790, 792, 794, transverse, 1, 2
822, 824, 826, 829-831, 833-835, 837, 838, 840, 860,


W
wavefield, 4.7, 7, 14, 23, 31, 34, 36, 37, 39-41, 54, 55, 346,
349, 353, 355, 357, 359-363, 368, 369, 406, 407, 413, 428,
478, 534-540, 808
wavelength, 2, 3, 7, 10, 12, 14, 22, 24, 26, 28-31, 34, 47, 55,
57, 58, 249, 346, 351, 352, 385, 391, 392, 420, 428, 435,
533-539, 547, 555-557, 560, 566, 571, 607, 611, 692, 814
wavenumber, 34, 49, 51, 53, 346, 348, 353, 357, 359, 362,
363, 368, 370, 383, 392-394, 401, 407, 411, 412, 416, 428,
429, 434, 436-438, 451, 453, 455, 462, 464, 468, 477, 478,
486, 556, 557, 559, 560, 566, 570, 571, 575, 580, 584, 606,
607, 611, 622, 627, 631, 645, 647, 648, 650, 664-666, 671,
673-677, 682, 684-686, 691, 692, 700, 705-707, 709, 713,
714, 716-719, 723, 726, 727, 743, 744, 747, 755, 783, 789,
790, 798, 800, 806, 808, 813, 814, 816, 848, 849, 853, 857-
859, 874, 891, 927-929, 931, 933, 935, 936, 941, 964, 969,
971, 973, 975, 979, 987, 988, 990, 994, 995, 997, 999,
1001, 1002, 1007, 1013
wavetrain, 7, 14, 22, 37, 39-42, 44-47, 58
weakly ergodic, 278, 279, 869
weakly stationary, 258
white light. See light, white
white noise. See noise, white
wide-sense stationary, 258-261, 263, 279-283, 285, 287,
288, 290, 291, 299, 302, 304, 791, 792, 815, 860-862, 865,
869, 877, 903, 912, 922, 948, 953, 957, 958
Wiener-Khinchin theorem, 3.24, 223, 297-299, 434, 522,
525
window function, 654-658
windowing, 654

Y
Yerkes Observatory, 28
Young, Thomas, 28

Z
Zeeman, Pieter, 1
zero-path difference, 26, 395, 577. See also ZPD
ZPD, 26, 28, 30, 31, 55, 395, 414, 577, 587, 591-594, 597,
599, 617, 667, 668, 807
ZPD position, 28-31, 46, 52, 395, 396, 413, 414, 577, 591,
593, 667-670, 749
