1
A. Michelson, “The Relative Motion of the Earth and the Luminiferous Ether,” American Journal of Science 22,
Series 3 (1881), pp. 120–129.
2
E. Whittaker, A History of the Theories of Aether and Electricity, Vol. I, The Classical Theories (Thomas Nelson &
Sons, Ltd., New York, 1951), pp. 129–142.
1 · Ether Wind, Spectral Lines, and Michelson Interferometers
such a propagating electromagnetic wavefield.3 So the ether concept was not only alive and well
at the time of Michelson’s experiments, but it could also be said, with the growing acceptance of
Maxwell’s equations to describe the behavior of the luminiferous ether, that it had never been
healthier.
3
D. Goldstein, Polarized Light, 2nd ed. (Marcel Dekker, Inc., New York, 2003), p. 298.
The First Michelson Interferometer · 1.1
FIGURE 1.1(a). The first Michelson interferometer.
Figs. 1.3(a)–1.3(c) describe the changing length and orientations of the tip of the wavefield’s
oscillating electric or magnetic field vectors.4
Suppose length a in Fig. 1.1(b) is adjusted until the distance from mirror C to the beam splitter
is exactly the same as the distance from mirror D to the beam splitter. When monochromatic
light—that is, light having a unique wavelength—enters the interferometer as shown in Figs.
1.4(a) and 1.4(b), then the beams reflected from C and D recombine when leaving the
interferometer in such a way that their planes of vibration, as well as their state of oscillation,
exactly match. Since the planes of vibration match, we can disregard the planes’ orientation and
just add together the two beams’ sinusoidal curves. Figure 1.5(a) shows that if the RT and TR
beams line up exactly—as they must when the distances from mirrors C and D to the beam
splitter are equal—then the summed oscillation is a maximum because the two wavefields are in
phase. If the distances from mirrors C and D to the beam splitter are unequal, then beams RT and
TR shift with respect to each other, as shown in Figs. 1.5(b)–1.5(e). The two beams can be out of
phase by any fraction of a wavelength, depending on the amount of inequality in the two distances.
4
See, for example, the discussion in Secs. 4.2 through 4.4 of Chapter 4. Figures 1.2(a) and 1.2(b) can be profitably
compared to Figs. 4.5 and 4.6 in Chapter 4.
FIGURE 1.1(b). [Diagram labels: incident light; beam splitter with partially reflective surface; compensator plate; mirror C; mirror D; length a; beam RT (first reflected, then transmitted at the beam splitter); beam TR (first transmitted, then reflected at the beam splitter); observing telescope.]
FIGURE 1.2(a). [Diagram labels: cut in wavefield; plane perpendicular to direction of propagation.]
FIGURE 1.2(b). [Diagram labels: vibrations of transverse wavefield; cut in wavefield; direction of propagation; plane perpendicular to direction of propagation.]
FIGURES 1.3(a) and 1.3(b). [Diagram labels: three different planes of vibration; vibration wavelength.]
The closer this fraction is to one-half, the smaller the summed oscillation; and if they are out of
phase by exactly a half-wavelength, then their sum is zero and the combined beam disappears.
When one beam is shifted against the other by exactly one wavelength, and the planes of
vibration still match, then once again the monochromatic RT and TR beams are in phase and
produce a bright combined oscillation.5 There seems to be a real possibility that a
monochromatic beam cannot be used to confirm that mirrors C and D are the same distance from
the beam splitter because the recombined exit beam may look the same as it does when no shift at
all exists if one wavefield is shifted against the other by one, two, etc., wavelengths.
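The dependence of the summed oscillation on the shift can be checked numerically. The following is a minimal sketch, assuming two beams of equal unit amplitude and matching planes of vibration; the function name `summed_amplitude` is a hypothetical helper, not something taken from the text.

```python
import math


def summed_amplitude(shift_in_wavelengths: float) -> float:
    """Amplitude of sin(x) + sin(x + 2*pi*shift) for two unit-amplitude beams.

    By the identity sin(A) + sin(B) = 2 sin((A + B)/2) cos((A - B)/2),
    the sum is a sinusoid of amplitude 2*|cos(pi * shift)|.
    """
    return abs(2.0 * math.cos(math.pi * shift_in_wavelengths))


# The five shifts correspond to the cases drawn in Figs. 1.5(a)-1.5(e).
for shift in (0.0, 0.25, 0.5, 0.75, 1.0):
    amplitude = summed_amplitude(shift)
    print(f"shift = {shift:4.2f} wavelengths -> summed amplitude = {amplitude:.3f}")
```

The amplitude at a shift of one full wavelength equals the amplitude at zero shift, which is the ambiguity that makes a single monochromatic beam unsuitable for confirming equal arm lengths.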
Suppose two monochromatic beams with two different wavelengths are sent through the
interferometer at the same time. If the distances from mirrors C and D to the beam splitter are
equal, then both the monochromatic beams, even though they have different wavelengths, must
be in phase when leaving the interferometer, producing a maximally bright oscillation in the
recombined exit beam. When the distances to the beam splitter are not exactly equal, however,
one of the monochromatic beams may end up shifted against itself by one, two, etc., wavelengths,
but there is no reason for the other beam to be shifted against itself the same way. When three
monochromatic beams are sent through the interferometer while the distances to the beam splitter
are not equal, matching all three wavetrains becomes even more unlikely. Hence, if we pass
white light containing innumerable distinct monochromatic wavetrains through the instrument,
then the RT and TR beams will recombine to produce a maximally bright output beam if and only
if the distances from mirrors C and D to the beam splitter are equal.
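The white-light argument can be illustrated with a rough numerical sketch. It assumes equal-strength monochromatic components spread evenly from 400 nm to 700 nm, ignores any phase reversal at the beam splitter, and treats each component's recombined intensity as proportional to cos²(πd/λ), where d is the difference in round-trip path between the two arms. The names and the sampling of wavelengths are illustrative assumptions only.

```python
import math


def recombined_brightness(round_trip_difference_nm, wavelengths_nm):
    """Average recombined intensity over the listed wavelengths.

    Each component contributes cos^2(pi * d / lambda), so a value of 1.0
    means every component is exactly in phase with itself.
    """
    total = 0.0
    for lam in wavelengths_nm:
        total += math.cos(math.pi * round_trip_difference_nm / lam) ** 2
    return total / len(wavelengths_nm)


one_line = [600.0]                              # a single monochromatic beam
white = [400.0 + 3.0 * k for k in range(101)]   # assumed samples, 400-700 nm

for d in (0.0, 600.0, 1200.0, 6000.0):
    mono = recombined_brightness(d, one_line)
    broad = recombined_brightness(d, white)
    print(f"d = {d:6.0f} nm   one line: {mono:.3f}   white light: {broad:.3f}")
```

In this sketch the single line returns to full brightness whenever d is a whole number of its wavelengths, while the white-light average reaches 1.0 only at d = 0.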
To make the white-light beam work as intended, the interferometer needs a glass compensator
plate between mirror C and the beam splitter [see Fig. 1.1(b)]. The compensator plate must be the
same thickness and orientation—and made from the same type of glass—as the glass in front of
the beam splitter’s partially reflecting surface. Figure 1.6(a) shows how light waves reflect from
mirrors C and D; the wavelength does not change while reflecting. In Fig. 1.6(b), however, light
waves inside the glass are somewhat shorter than they are outside the glass; the wavelength of the
light with respect to the glass thickness is greatly exaggerated to show this effect.
Therefore, a given distance traveled inside the glass corresponds to more wavelengths of a
monochromatic beam than the same distance in empty space. Moreover, different colors or
wavelengths of light shrink by different amounts, and this effect was a familiar one to 19th-
century optical scientists. If the compensator plate is not present, then the RT beam in Fig. 1.1(b)
passes through the glass in the beam splitter three times, whereas the TR beam passes through the
beam-splitter glass only once. The RT beam thus contains more wavelengths than the TR beam
even though the distances between the mirrors and the beam splitter are equal. With the
compensator plate present, however, both the TR and the RT beams pass through three glass
thicknesses.
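A small sketch can show why the imbalance in glass passes matters for white light. It assumes normal incidence (the extra path from the tilt of the plate is ignored), a hypothetical 5 mm plate, and illustrative refractive indices of 1.53 for violet and 1.51 for red light; none of these numbers come from the text.

```python
def extra_waves_per_pass(thickness_nm, wavelength_nm, index):
    """Extra wavelengths that fit in the glass compared with the same
    distance of empty space; the wavelength inside the glass is lambda/index."""
    return (index - 1.0) * thickness_nm / wavelength_nm


def glass_imbalance(rt_passes, tr_passes, thickness_nm, wavelength_nm, index):
    """Wavelengths by which beam RT leads beam TR because of glass alone."""
    return (rt_passes - tr_passes) * extra_waves_per_pass(
        thickness_nm, wavelength_nm, index
    )


THICKNESS_NM = 5.0e6  # hypothetical 5 mm plate
SAMPLES = {"violet": (400.0, 1.53), "red": (700.0, 1.51)}  # assumed values

for name, (lam, n) in SAMPLES.items():
    no_plate = glass_imbalance(3, 1, THICKNESS_NM, lam, n)    # RT: 3 passes, TR: 1
    with_plate = glass_imbalance(3, 3, THICKNESS_NM, lam, n)  # both: 3 passes
    print(f"{name}: without plate {no_plate:.1f} waves, with plate {with_plate:.1f}")
```

Because the imbalance without the plate differs from color to color, no single mirror position can bring every color back into phase; with equal numbers of passes the imbalance vanishes for all colors at once.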
5
In fact, we now know that a strictly monochromatic beam of light must have matching planes of vibration when
shifted against itself by exactly one, two, etc., wavelengths.
FIGURE 1.4(a). Figure 1.4(a) shows a segment of radiation entering the interferometer and Fig. 1.4(b)
shows what that segment becomes when it leaves the interferometer if the distance it travels up and back
each interferometer arm is the same.
FIGURE 1.4(b). [Diagram labels: beam RT; beam TR.]
FIGURE 1.5(a). [Beams TR and RT in phase, with their total.]
FIGURE 1.5(b). [Beams TR and RT out of phase by a quarter wavelength, with their total.]
FIGURE 1.5(c). [Beams TR and RT out of phase by a half wavelength, with their total.]
FIGURE 1.5(d). [Beams TR and RT out of phase by three-quarters of a wavelength, with their total.]
FIGURE 1.5(e). [Beams TR and RT in phase, with their total.]
FIGURE 1.6(a). [Diagram labels: incident wavefield; reflected wavefield.]
FIGURE 1.6(b). [Diagram labels: incident wavefield; reflected wavefield; transmitted wavefield; glass substrate; beamsplitting film.]
6
See Secs. 5.20 and 5.21 in Chapter 5 for a more detailed discussion of how to analyze a tilted mirror.
7
F. Jenkins and H. White, Fundamentals of Optics, 3rd ed. (McGraw-Hill Book Company, New York, 1957), p.
251.
FIGURE 1.7. [Diagram labels: centerline of tilted mirror; angle of tilt.] Note: The angle of tilt is
greatly exaggerated in this diagram.
Fig. 1.8. Now the central fringe coming from the center line of the tilted mirror is dark because
all the monochromatic components of the two beams cancel out rather than add together. When
Michelson sent white light through his interferometer, he thus saw a central dark fringe with
parallel multicolored fringes on either side. The colored fringes come from the off-center strips of
the tilted mirror where one or another monochromatic wavetrain is shifted against itself by
exactly one, two, etc., wavelengths, increasing the amplitude of its oscillation with respect to the
wavetrains of other colors inside the recombined beam. In this setup, the central dark fringe is
unique, making it easy for Michelson to see how its position changes as the interferometer is
rotated.
Historical Reasoning Behind the Ether-Wind Experiment · 1.2
FIGURE 1.8. [Diagram labels: beam RT; beam TR.]
FIGURE 1.9(a). [Diagram labels: direction of Earth’s motion; distances vt1 and vt2; length a; incident light; to telescope.]
FIGURE 1.9(b). [Diagram labels: direction of Earth’s motion; mirror D; positions of the beam splitter; length a; incident light; distances vt3; to telescope.]
Time $t_2$ elapses while the wavecrest returns from mirror C to the beam splitter, and similar
reasoning shows that
$$ a - v t_2 = c t_2 . \qquad (1.1b) $$
Solving for $t_1$ and $t_2$ in Eqs. (1.1a) and (1.1b) gives
$$ t_1 = \frac{a}{c - v} $$
and
$$ t_2 = \frac{a}{c + v} . $$
The wavecrest spends time
$$ t_1 + t_2 = \frac{a}{c - v} + \frac{a}{c + v} = \frac{2ac}{c^2 - v^2} $$
going out to mirror C and back to the beam splitter, and it does so while traveling at velocity c, so
it covers a total distance
$$ c\,(t_1 + t_2) = \frac{2ac^2}{c^2 - v^2} . \qquad (1.1c) $$
Figure 1.9(a) also shows the wavecrest traveling at an angle, instead of straight down, after it
reflects off the beam splitter when leaving the interferometer’s arm. This allows it to head toward
where the observing telescope will be by the time the wavecrest reaches it; there is thus no
danger of the telescope missing the wavecrest because it has moved out of position. Figures
1.10(a) and 1.10(b) show why this happens. Figure 1.10(a) shows a single wavecrest reflecting
off a 45° stationary mirror. The large dots indicate where the “corner” of the reflecting wavecrest
is now and has been in the past as it reflects from the stationary mirror. The reflected wavecrest
travels upward at 90° from its original direction, as expected. Figure 1.10(b) shows what happens
when the same type of wavecrest reflects off a moving 45° mirror. The four thin solid lines show
the positions of the mirror at four equally spaced instants in time, and the large dots again show
where the corner of the reflecting wavecrest is at these times. Connecting these dots with a thick
dashed line, we see that the wavecrest feels an effective stationary mirror that is slanted at an
angle somewhat greater than 45°. This means the reflected wavecrest does not travel straight up
as in Fig. 1.10(a) but instead moves a little off to the right.
Figure 1.9(b) shows how the wavecrest travels up and back the interferometer arm
perpendicular to velocity v. In time $t_3$, the wavecrest travels a distance $\sqrt{a^2 + v^2 t_3^2}$ from the beam
splitter to mirror D; and, because it does this at velocity c, we must have
$$ c t_3 = \sqrt{a^2 + v^2 t_3^2} $$
or
$$ t_3 = \frac{a}{\sqrt{c^2 - v^2}} . $$
Figure 1.9(b) shows that the total distance traveled from the beam splitter to mirror D and
back again must be
$$ 2 c t_3 = \frac{2ac}{\sqrt{c^2 - v^2}} . \qquad (1.2) $$
Even though the two interferometer arms are both of length a, if the interferometer is moving
then a single wavecrest splitting at the beam splitter does not travel the same distance in each arm
before recombining at the beam splitter. The difference $\Delta s$ between the distances traveled out and
back in each arm is, according to Eqs. (1.2) and (1.1c),
$$ \Delta s = c\,(t_1 + t_2) - 2 c t_3 = \frac{2ac}{\sqrt{c^2 - v^2}} \left[ \frac{c}{\sqrt{c^2 - v^2}} - 1 \right] = \frac{2a}{\sqrt{1 - v^2/c^2}} \left[ \frac{1}{\sqrt{1 - v^2/c^2}} - 1 \right] . $$
The Earth’s orbital velocity is about $10^{-4}$ of the speed of light c, so we can make the
approximation
$$ \frac{1}{\sqrt{1 - v^2/c^2}} \cong 1 + \frac{v^2}{2c^2} . $$
This gives
$$ \Delta s \cong 2a \left( 1 + \frac{v^2}{2c^2} \right) \left( 1 + \frac{v^2}{2c^2} - 1 \right) = \frac{a v^2}{c^2} + O(v^4/c^4) . $$
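As a numerical sanity check on this approximation, the sketch below evaluates the exact difference between the two round-trip distances of Eqs. (1.1c) and (1.2) and compares it with the leading term a v²/c². The arm length of 1 m and the ratio v/c = 10⁻⁴ are illustrative assumptions, and the function names are hypothetical.

```python
import math


def path_difference_exact(a, v, c):
    """Round-trip distance along v minus round-trip distance across v."""
    parallel = 2.0 * a * c**2 / (c**2 - v**2)                 # Eq. (1.1c)
    perpendicular = 2.0 * a * c / math.sqrt(c**2 - v**2)      # Eq. (1.2)
    return parallel - perpendicular


def path_difference_approx(a, v, c):
    """Leading-order term of the difference, a * v^2 / c^2."""
    return a * v**2 / c**2


c = 299_792_458.0   # speed of light in m/s
v = 1.0e-4 * c      # assumed speed through the ether, about 10^-4 of c
a = 1.0             # assumed arm length in meters

exact = path_difference_exact(a, v, c)
approx = path_difference_approx(a, v, c)
print(f"exact  difference: {exact:.6e} m")
print(f"approx difference: {approx:.6e} m")
print(f"relative disagreement: {abs(exact - approx) / approx:.2e}")
```

The two values agree to far better than one part in a thousand, consistent with neglecting terms of order v⁴/c⁴; part of the residual printed here is floating-point rounding from subtracting two nearly equal distances.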
FIGURE 1.10(a). An incident wavecrest enters from the right and is reflected up from a stationary
surface. The dots show where the corner of the wavecrest is at equally spaced time intervals while it is
reflecting off the surface.
[Diagram labels: incident wavecrest moving to the left; reflected wavecrest moving up; reflecting surface.]
FIGURE 1.10(b). The same wavecrest is shown here at four instants of time, each instant
separated from the next by a time interval of Δt, as it enters from the right and reflects off a flat
surface traveling from left to right across the page. The dots show where the corner of the wavecrest
is at these four instants of time, and the thick dashed line shows the effective slant of the surface
experienced by the wavecrest as it reflects.
[Diagram labels: the four instants of time, spaced Δt apart; direction of travel of incident wavecrest; direction of travel of reflected wavecrest.]
Since $v^2/c^2 \cong 10^{-8}$ and $v^4/c^4 \cong 10^{-16}$, it makes sense to neglect the $v^4/c^4$ terms and write
$$ \Delta s \cong \frac{a v^2}{c^2} \approx 10^{-8} a . \qquad (1.3a) $$
It is perhaps of interest to point out that Michelson, by mistakenly assuming that the light
traveling up and back the arm perpendicular to the orbital velocity covered a distance 2a instead
of $2ac/\sqrt{c^2 - v^2}$, ended up with
$$ \Delta s \cong \frac{2 a v^2}{c^2} \approx 2 \times 10^{-8} a \qquad (1.3b) $$
in his 1881 paper. This incorrect formula did not affect Michelson’s overall analysis because, as
he explained in the paper, the data was good enough to rule out an effect ten times smaller than
what he expected to see.
As pointed out in Sec. 1.1, when white light passed through the interferometer with one of the
end mirrors slightly tilted, Michelson saw a central dark band or fringe from the centerline of the
tilted mirror because the centerline is the same distance from the beam splitter as the untilted
mirror. Remembering that Michelson used a beam splitter that reversed the direction of vibration
in one of the recombining beams, we know that at the center of the dark fringe each
monochromatic wavetrain in the white-light beam cancels itself out. At the first colored band or
fringe on either side of the centerline, the wavetrains go from cancelling themselves out to
reinforcing themselves, becoming bright at those positions on the tilted mirror where the length
traveled out and back the tilted mirror arm is a half-wavelength longer than at the center of the
dark band [see, for example, the transition from Fig. 1.5(c) to Fig. 1.5(e)]. Hence, for each
monochromatic wavetrain, the transition from dark to bright is halfway complete where the
length traveled out and back the tilted-mirror arm is a quarter wavelength different from what it is
at the center of the dark band. Considering the joint actions of all the monochromatic wavetrains
in the white-light beam, Michelson then knew that going from the center to the edge of the dark
fringe corresponded to shifting from a position on the tilted mirror where the length out and back
in both interferometer arms was equal to a position where the length out and back the tilted
mirror arm was different by one quarter of the average wavelength $\lambda_{av}$ of the white-light beam.
Thus the fringe widths inside the telescope’s field of view gave him an extremely fine-grained
scale for measuring the difference in distance between the two arms. For greater accuracy, a
monochromatic beam could be sent through the interferometer and the tilted mirror adjusted until
the fringes matched up with the scale marks of the telescope’s eyepiece.
If the interferometer is rotated so that the arm originally parallel to v is now perpendicular to
v, then the distance out and back one arm is shorter by $\Delta s$ and the distance out and back in the
other arm is longer by $\Delta s$, so there is, according to Eq. (1.3a), a shift of
$$ 2\Delta s \cong \frac{2 a v^2}{c^2} \approx 2 \times 10^{-8} a \qquad (1.4) $$
of the wavefield from one arm when compared to the wavefield from the other arm. If $2\Delta s$ equals
$\lambda_{av}/4$, the dark fringe shifts until its center is located at the previous position of one of its edges;
if $2\Delta s$ is larger, then the dark fringe shifts more; and if $2\Delta s$ is smaller, then the dark fringe shifts
less. For the value of a he chose, Michelson expected the fringe to shift by approximately one-
tenth its width. To within experimental error, he did not see the dark fringe shift at all. Michelson
concluded that
the hypothesis of the stationary ether is thus shown to be incorrect, and the necessary conclusion follows that
the hypothesis is erroneous.8
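The size of the expected shift can be estimated with a short sketch based on Eq. (1.4). The arm length of 1.2 m and the average wavelength of 570 nm are assumptions chosen only for illustration; the text does not state the values Michelson used. The sketch also assumes, following the discussion above, that the path changes by λav/4 between the center and the edge of the dark fringe, so the full fringe width corresponds to λav/2.

```python
def round_trip_shift(arm_length_m, beta_squared=1.0e-8):
    """2 * Delta_s from Eq. (1.4), with beta_squared standing for v^2/c^2."""
    return 2.0 * arm_length_m * beta_squared


def shift_in_fringe_widths(arm_length_m, mean_wavelength_m, beta_squared=1.0e-8):
    """Expected shift of the dark fringe as a fraction of its full width.

    Center-to-edge corresponds to a path change of lambda_av/4, so the
    full width corresponds to lambda_av/2.
    """
    full_width_path = mean_wavelength_m / 2.0
    return round_trip_shift(arm_length_m, beta_squared) / full_width_path


ARM_LENGTH_M = 1.2           # assumed value, not given in the text
MEAN_WAVELENGTH_M = 570e-9   # assumed average white-light wavelength

fraction = shift_in_fringe_widths(ARM_LENGTH_M, MEAN_WAVELENGTH_M)
print(f"2*Delta_s = {round_trip_shift(ARM_LENGTH_M):.2e} m")
print(f"expected shift = {fraction:.3f} of the dark fringe's width")
```

With these assumed inputs the estimate comes out a little under one-tenth of the fringe width, the same order as the shift described above.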
The existence of the ether was widely accepted among scientists, so this experiment was by no
means the last word in the matter; indeed, it inaugurated 50 years of ever more painstaking
attempts to detect an ether wind using larger and more sensitive Michelson interferometers.
Michelson himself took the first step down this road when, in 1887, he collaborated with Edward
Morley to repeat his experiment; Fig. 1.11 shows the optical diagram of the interferometer they
constructed. They concluded that the velocity v of the interferometer with respect to the ether was
probably less than a sixth of the Earth’s orbital velocity, an upper limit suggested by
experimental error.9 Michelson and Morley regarded this as another negative result. Many
scientists, including Michelson, at first interpreted these experiments as showing that the Earth
dragged along a layer of ether near its surface, making it hard to say just how fast the
interferometer might be moving with respect to the ether in the laboratory. Interferometers were
set up on tops of mountains and sent up in high-altitude balloons, hoping to get outside the ether
layer dragged along by the Earth, but no one came up with any results convincingly larger than
experimental error. According to Einstein’s special theory of relativity, published in 1905, there
is no reason to expect “ether drift” at all, because the speed of light is the same in all inertial
frames of reference. After 1905, attempts to detect ether drift were basically attempts to disprove
relativity theory, and scientists who pursued them were regarded by their peers as ever more
eccentric. Perhaps the last serious attempt to detect an ether wind using a Michelson
interferometer took place on top of Mount Wilson, where Dayton Miller ran an extremely large
and sensitive Michelson experiment in the 1920s. When publishing the results in the early 1930s,
he claimed to detect ether-wind velocities on the order of 10 km/sec,10,11 but the data remained
8
Michelson, “The Relative Motion of the Earth.”
9
A. Michelson and E. Morley, “On the Relative Motion of the Earth and the Luminiferous Ether,” American Journal
of Science 34, Series 3 (1887), 333–345.
10
D. Miller, “The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth,” Reviews of
Modern Physics 5, no. 2 (July 1933), 203–242.
controversial. After his death, the results were attributed to slight but systematic temperature
changes in the instrument during the measurements.12
1.3 Monochromatic Light and Spectral Lines
The wavelength λ of a monochromatic light wave and the frequency f in cycles per unit time of
that same monochromatic light wave are connected by
$$ \lambda f = c , \qquad (1.5) $$
where c is the velocity of light. By the second half of the 19th century, it was known that the light
emitted by free atoms, such as from the atoms inside a hot dilute gas, is often emitted at specific
frequencies called spectral lines. Equation (1.5) then requires the light from a spectral line to
have a precise wavelength λ = c/f. Michelson used these spectral lines to generate the
monochromatic light sent through his interferometer. When, for example, a spectroscope was
used to separate out the cadmium red line and send it through the interferometer, he would see a
regular pattern of red fringes; when the mercury green line was sent through, he would see
regular green fringes; and so on. Many of these lines are in reality clumped groups of spectral
lines, all having nearly the same wavelength; they masquerade as a single bright line when
observed by low-resolution spectroscopes and spectrometers.
1.4 Applying the Michelson Interferometer to Spectral Lines
After the first ether-wind experiments, Michelson demonstrated that his interferometer could also
be used both as an extremely accurate, practical ruler for measuring fundamental lengths and as
an extremely high-resolution spectrometer. To understand Michelson’s approach, we must keep
in mind that the only “optical detectors” available back then were cameras (whose images had to
be chemically developed in darkrooms) and the human eye.
When the interferometer is used as a ruler or spectrometer, one of the arms is modified so that
its mirror is easily moved, as shown in Fig. 1.12. This moving mirror and the fixed mirror on the
other arm are still slightly tilted with respect to each other; that is, when extended indefinitely,
the planes of the mirror surfaces do not meet at exactly 90°. In this discussion, we refer to the
moving mirror as being tilted and the fixed mirror as being untilted. To keep things consistent
with the discussion in Sec. 1.1, the beam splitter is assumed to be the same type used in the 1881
11
D. Miller, “The Ether-Drift Experiment and the Determination of the Absolute Motion of the Earth,” Nature
(February 3, 1934), 162–164.
12
R. Shankland, S. McCuskey, F. Leone, and G. Kuerti, “New Analysis of the Interferometer Observations of
Dayton C. Miller,” Reviews of Modern Physics, no. 2 (April 1955), 167–178.
Applying the Michelson Interferometer to Spectral Lines · 1.4
FIGURE 1.11.
ether-wind experiment. Hence, when a white-light beam is sent through the instrument, an
observer notes a central dark fringe if the center of the tilted moving mirror is the same distance
from the beam splitter as the center of the fixed mirror. This equidistant position of the moving
mirror is today often called the position of zero-path difference (ZPD) because the light’s path up
and back each arm of the interferometer is the same when there is no tilt present.
The position and tilt of the moving mirror can be adjusted until the central dark fringe is
centered on rulings marked in the telescope’s eyepiece. When the white-light beam is replaced by
a monochromatic beam from a spectral line, the observer sees a sequence of light and dark bands
forming a regular pattern of fringes having the same color as the spectral line. The marked
position of the central dark fringe in the center of the eyepiece is now occupied by a dark null of
the monochromatic fringe pattern. This null corresponds to the centerline strip of the tilted
mirror’s surface being the same distance from the beam splitter as the untilted mirror’s surface.
The two bright fringes on either side of the marked null separate that null from the two
neighboring nulls, with the neighboring nulls corresponding to two strips of the tilted mirror’s
surface that are a half-wavelength closer to, and a half-wavelength further away from, the beam
splitter. A half-wavelength difference in distance from the beam splitter creates, of course, a full
wavelength’s difference in the distance traveled up and back the interferometer’s arm, which is
why we see another null. Depending on the configuration of the telescope, the amount of tilt in
the tilted mirror, and the wavelength of the monochromatic beam, there will be some number of
additional fringes alternating bright and dark across the field of view, with the nulls
corresponding to strips of the tilted mirror’s surface that are one half-wavelength closer to and
further away from the beam splitter, two halves or one full wavelength closer to and further away
from the beam splitter, three halves closer to and further away from the beam splitter, and so on.
The observer can slowly move the tilted mirror out along its arm, watching as the fringe
pattern moves across the telescope’s field of view. The movement occurs, of course, because the
strips of the moving mirror’s tilted surface that are 1/2, 1, 3/2, etc., wavelengths closer to or
further away from the beam splitter are now no longer where they used to be. The marked null
shifts and, after the mirror moves half a wavelength from its original position, the null that used
to be immediately to one side shifts into the marked location. The fringe pattern looks the same
as just before the mirror began moving, but the observer knows there has been a half-wavelength
shift in the position of the moving mirror because the fringes have been carefully watched as their
positions changed. As the mirror moves, old fringes move out of sight on one side of the field of
view while new fringes replace them on the other side of the field of view. The observer checks
that the tilt of the moving mirror does not change by making sure that there is always the same
number of bright-null repetitions in the fringe pattern. Since the position of the moving mirror is
always known to within a small fraction of a wavelength, the interferometer has now become an
extremely accurate way to measure distance.
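The fringe-counting procedure amounts to simple bookkeeping: each null that passes the marked position corresponds to the mirror moving half a wavelength. A minimal sketch, with hypothetical function names and an assumed wavelength of 600 nm used purely as an example:

```python
def mirror_displacement(nulls_passed, wavelength_m):
    """Distance the moving mirror has traveled, given the number of nulls
    that have crossed the marked position in the eyepiece."""
    return nulls_passed * wavelength_m / 2.0


def nulls_for_displacement(displacement_m, wavelength_m):
    """Number of nulls expected to cross the mark for a given mirror travel."""
    return 2.0 * displacement_m / wavelength_m


WAVELENGTH_M = 600e-9  # assumed example wavelength, not from the text

travel = mirror_displacement(1000, WAVELENGTH_M)
print(f"1000 nulls counted -> mirror moved {travel * 1e3:.3f} mm")
print(f"nulls in 1 mm of travel: {nulls_for_displacement(1.0e-3, WAVELENGTH_M):.1f}")
```

Interpolating between nulls by eye is what gives the position to a small fraction of a wavelength; the sketch only covers whole-null counting.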
FIGURE 1.12. [Diagram labels: moving mirror; distance p; beam splitter; compensator plate; fixed mirror; to telescope.]
Michelson did not hesitate to measure distances with his interferometer. In 1892 he
established that the standard meter bar in Paris corresponded, to an accuracy of one part in two
million, to 1,553,163.5 wavelengths of monochromatic light from the red cadmium spectral line.
At Yerkes Observatory in Wisconsin, he measured the extremely small tidal distortions of the
planet Earth due to the moon’s gravity, helping to establish that the Earth has an iron core, and
published the results in 1919. There is, however, a fundamental difficulty limiting his ability to
use the interferometer as a ruler: As the moving mirror gets further and further away from its
equidistant or ZPD position, the pattern of fringes starts to fade and eventually disappears. This
phenomenon is caused by the beam from the spectral line not being exactly monochromatic—
either because what looks like a single spectral line is in reality a group of two or more lines
having almost the same wavelength, or because the line itself has a finite spectral “width,”
simultaneously emitting light at a very large number of wavelengths all very close to each other
in value.
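The figures quoted for the meter measurement can be turned into a wavelength with a few lines of arithmetic. This sketch uses only the two numbers stated above (1,553,163.5 wavelengths per meter and an accuracy of one part in two million); the variable names are illustrative.

```python
WAVELENGTHS_PER_METER = 1_553_163.5   # red cadmium line, as quoted above
RELATIVE_ACCURACY = 1.0 / 2_000_000   # one part in two million

# Wavelength implied by the count.
wavelength_m = 1.0 / WAVELENGTHS_PER_METER

# Each half-wavelength of mirror travel moves one null past the mark.
half_wavelength_steps = 2.0 * WAVELENGTHS_PER_METER

# Absolute uncertainty this accuracy implies over one meter.
uncertainty_m = 1.0 * RELATIVE_ACCURACY

print(f"implied wavelength: {wavelength_m * 1e9:.3f} nm")
print(f"half-wavelength steps in one meter: {half_wavelength_steps:.0f}")
print(f"uncertainty over one meter: {uncertainty_m * 1e6:.2f} micrometers")
```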
To see why the fade-out occurs for a closely spaced group of spectral lines, we first analyze
what happens when the light from a pair of equal-intensity, closely spaced spectral lines,
sometimes called a spectral doublet, is sent through the interferometer. Inside the interferometer,
the doublet behaves like two monochromatic beams—each having a slightly different
wavelength—simultaneously passing through the instrument. After using white light to put the
moving, tilted mirror at its ZPD position, we begin sending the doublet beam through the
interferometer. Each monochromatic beam produces a fringe pattern. To the human eye, the
fringe patterns have the same color and their nulls seem to be at exactly the same places in the
telescope’s field of view. Because the wavelengths of the beams are nearly identical, the two
fringe patterns lie almost exactly on top of each other, reinforcing each other the same way the
dashed and solid oscillations lie on top of each other to create a thicker line at the left-hand edge
of Fig. 1.13. When, for example, there is a null in one beam’s fringe pattern because that strip of
the tilted mirror’s surface is an integer number of half-wavelengths closer to or further away from
the beam splitter, the null from the other beam’s fringe pattern falls in almost exactly the same
place because it has almost exactly the same wavelength. As we shift the moving mirror further
away from ZPD and watch the fringes move, we know that when each new fringe forms at the
leading edge of the field of view, it shows that the edge of the tilted moving mirror is an ever
larger number of half-wavelengths further from the beam splitter. Sooner or later, however, the
same thing happens to the two beams’ fringe patterns that happens in Fig. 1.13 as we look away
from its left-hand edge—the oscillations get out of phase. Just as the dashed and solid lines in
Fig. 1.13 no longer match up exactly because they have slightly different repetition lengths, so do
the two fringe patterns of the two beams match up less well because they have slightly different
wavelengths. There always comes a point—perhaps when the next null is forming at 10,000 or
50,000 or more half-wavelengths from the ZPD position of the moving mirror—where the
monochromatic beam with the slightly shorter wavelength λ1 is ready to form a null somewhat
before the beam with the slightly longer wavelength λ2. The nulls and brights from one
monochromatic fringe pattern shift enough with respect to the other that we begin to notice a
change: the pattern begins to fade. Eventually, the two fringe patterns are completely out of
phase, with the brights and nulls of one pattern lying on, respectively, the nulls and brights of the
other. If the two beams are of equal intensity, then the fringe pattern fades away completely.
Suppose the λ1 set of fringes first becomes exactly out of phase with the λ2 set of fringes when
the moving mirror has traveled a distance of approximately N/2 wavelengths of the λ2 beam from
its equidistant or ZPD location. At this point, N satisfies the approximate equation
$$ \frac{1}{2} N \lambda_2 \cong \frac{1}{2} \left( N + \frac{1}{2} \right) \lambda_1 , \qquad (1.6a) $$
which can also be written as
$$ \frac{\lambda_2 - \lambda_1}{\lambda_1} \cong \frac{1}{2N} . \qquad (1.6b) $$
Equation (1.6b) gives the fractional difference $(\lambda_2 - \lambda_1)/\lambda_1$
between the doublet’s wavelengths in terms of N. If N is too large for convenient counting and
only several digits of accuracy are needed, we can directly measure the distance p in Fig. 1.12 at
which the fringe pattern disappears. Recognizing that both sides of Eq. (1.6a) are formulas for p
at the fade-out point, we can approximate either side of Eq. (1.6a) by $N \lambda_{av}/2$, where $\lambda_{av}$ is the
approximate wavelength of the doublet, and write
$$ \frac{N \lambda_{av}}{2} \cong p \qquad (1.6c) $$
to estimate N in terms of the known values of p and $\lambda_{av}$. This approximate value of N can then
be put into Eq. (1.6b) to find the fractional spread in the doublet. Hence, we see that the fade-out
is both a “bug” and a “feature” of the interferometer—although it sets a limit on the distances that
can be measured, it also specifies the exact separation of spectral lines too close to be resolved by
other types of spectrometers. This exercise also establishes the basic idea behind Michelson-
based spectroscopy: examining the behavior of the interference signal to measure the beam’s
spectral shape.
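The procedure of Eqs. (1.6a) through (1.6c) can be tried out numerically. The sketch below is only an illustration: it uses the sodium D doublet (about 588.995 nm and 589.592 nm), which is not discussed in the text, to generate a fade-out distance p from Eq. (1.6a), and then recovers the fractional spread from p and the average wavelength alone. All function names are hypothetical.

```python
def fadeout_distance(lam1, lam2):
    """Mirror travel p from ZPD at the first fade-out, from Eq. (1.6a).

    Solving N*lam2 = (N + 1/2)*lam1 for N gives N = lam1 / (2*(lam2 - lam1)),
    and p is then N*lam2/2.
    """
    n = lam1 / (2.0 * (lam2 - lam1))
    return 0.5 * n * lam2


def count_from_fadeout(p, lam_av):
    """Estimate N from the measured fade-out distance, Eq. (1.6c)."""
    return 2.0 * p / lam_av


def fractional_spread(n):
    """Fractional wavelength difference of the doublet, Eq. (1.6b)."""
    return 1.0 / (2.0 * n)


# Assumed example: sodium D lines, wavelengths in meters.
LAM1, LAM2 = 588.995e-9, 589.592e-9
LAM_AV = 0.5 * (LAM1 + LAM2)

p = fadeout_distance(LAM1, LAM2)
n_est = count_from_fadeout(p, LAM_AV)
spread_est = fractional_spread(n_est)
spread_true = (LAM2 - LAM1) / LAM1

print(f"fade-out distance p = {p * 1e3:.4f} mm")
print(f"estimated N = {n_est:.1f}")
print(f"estimated spread = {spread_est:.6e}, actual spread = {spread_true:.6e}")
```

The small disagreement between the estimated and actual spread comes from replacing the individual wavelengths by their average in Eq. (1.6c), which is the approximation the text describes.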
FIGURE 1.13. The solid oscillation represents the fringe pattern of one spectral line in the doublet and
the dashed oscillation represents the fringe pattern of the other spectral line in the doublet. The
wavelengths of both spectral lines are almost the same, so their fringe patterns slowly change from being
in-phase, to being out-of-phase, and then back to being in-phase.
Now that we understand why the fringe pattern of a doublet fades, it is easy to see why the
same sort of thing happens with any size group—or multiplet—of closely spaced spectral lines.
Each line of intrinsically greater or lesser intensity generates a fringe pattern of intrinsically
greater or lesser intensity connected to its wavelength. Near ZPD, all the fringe patterns are in
phase, but as the moving mirror shifts away from ZPD, the fringe patterns, since each is produced
by a slightly different wavelength, go out of phase, causing the fringes to fade. Figure 1.14 even
suggests a quick way of understanding something about why a single, finite-width spectral line
also produces fading fringe patterns; approximating it as a closely spaced multiplet, we might
expect its fringes to behave the same way any other multiplet’s would. We should, however, be
careful about carrying this sort of reasoning too far. Figure 1.13 suggests that if, after reaching
the fade-out point, we keep moving the tilted mirror away from its ZPD position, then the
doublet’s fringe pattern starts to reappear, eventually becoming as strong as it was near ZPD. The
same sort of phenomenon should also occur for any multiplet consisting of a finite number of
exact wavelengths; if we go far enough from ZPD, then there should be a region where the fringe
patterns are all back in phase. In reality, when moving away from ZPD, there are indeed regions
where a multiplet’s fringe pattern first fades then grows stronger, but the finite width of each
spectral line inside the multiplet stops the fringes from ever regaining their full ZPD strength.
The fringes always, eventually, fade away completely. To explain this behavior, it is enough to
examine how and why the fringe pattern of a single, finite-width spectral line fades away. This is
done in the next three sections, where we show how a fringe pattern is connected to the Fourier
transform of the spectral intensity.
Interference Equation for the Ideal Michelson Interferometer· 1.5
$$\vec{A}_f = \hat{x}\,U_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right) + \hat{y}\,V_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right) . \tag{1.7a}$$
Here, t is the time coordinate, f is the frequency of the monochromatic disturbance, and λf is the
wavelength corresponding to frequency f. The period of the disturbance is, of course, 1/f, and Eq.
(1.5) reminds us that the wavelength λf is connected to the frequency f by
$$\lambda_f f = c ,$$
where again c is the speed of light. Vector $\vec{A}_f$ has no $\hat{z}$ component, allowing it to represent a
transverse disturbance in the “ether” of the type shown in Figs. 1.2(a)–1.2(c) and 1.3(a)–1.3(c).
The $\hat{x}$ and $\hat{y}$ components of $\vec{A}_f$ are the real-valued expressions
$$U_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_U\right)$$
and
$$V_f \cos\!\left(\frac{2\pi z}{\lambda_f} - 2\pi f t + \delta_V\right)$$
respectively. These components must both oscillate at the same frequency f because the light
beam is monochromatic, but they can have different constant phase shifts $\delta_U$ and $\delta_V$. This allows
$\vec{A}_f$ to point in different directions in the x, y plane when we move along the beam, as suggested
by the changing orientations of the arrows in beams RT and TR of Fig. 1.16. The $U_f$ and $V_f$
amplitudes of the x and y oscillations do not have to be equal. To simplify the notation, and
because the concept will be routinely used in the rest of the book, we define
$$\sigma_f = \frac{1}{\lambda_f} \tag{1.7b}$$
to be the wavenumber of the monochromatic disturbance. Now Eqs. (1.7a) and (1.5) can be
written as
$$\vec{A}_f = \hat{x}\,U_f \cos\!\left(2\pi\sigma_f z - 2\pi f t + \delta_U\right) + \hat{y}\,V_f \cos\!\left(2\pi\sigma_f z - 2\pi f t + \delta_V\right) \tag{1.7c}$$
with
$$\sigma_f = f/c . \tag{1.7d}$$
This is the same monochromatic disturbance as before; all that changes is the notation used to
specify how its phase changes with z.
FIGURE 1.14. [Two plots of spectral intensity versus frequency f; one is labeled “Spectral Multiplet.”]
FIGURE 1.15. [Diagram of the interferometer: source at focus, beam splitter at 45 deg., compensator plate, fixed mirror, moving mirror displaced by p, and detector, with 90 deg. angles marked.]
The power transported by a physical wavefield of any type is usually proportional to its
squared amplitude;13,14 and in optics it is now, as it was in Michelson’s time, customary to set the
time average of the squared amplitude equal to the intensity of the transverse wavefield.15 Visible
light has a wavelength on the order of $5\times10^{-7}$ meters, so by Eq. (1.5) its frequency is about
$$f \cong \frac{c}{5\times10^{-7}\ \text{meters}} \cong 6\times10^{14}\ \text{Hz} \tag{1.8a}$$
given that $c \cong 3\times10^{8}$ m/sec. Hence one cycle of the transverse wavefield has a period of about
13
H. Lamb, Hydrodynamics (6th edition), Dover Publications, New York, 1945 copy of the 6th edition first
published in 1879, p. 370.
14
P. Morse and K. Ingard, Theoretical Acoustics, McGraw-Hill, Inc., New York, 1968, p. 250.
15
G. Stokes, Mathematical and Physical Papers, Vol. III, Cambridge at the University Press, 1901, pp. 233-258.
FIGURE 1.16. [Diagram of the interferometer showing the moving mirror, fixed mirror, beam splitter, and compensator plate; the path difference χ = 2p; beams RT and TR; and the x, y, and z axes with unit vectors x̂, ŷ, and ẑ.]
$$\frac{1}{6\times10^{14}\ \text{Hz}} \cong 2\times10^{-15}\ \text{sec} . \tag{1.8b}$$
The response time of the unaided human eye is perhaps as short as $10^{-2}$ s, and $2\times10^{-15}$ s is
shorter than that by a factor of about $10^{13}$. The response of the fastest optical detectors available
today is on the order of $10^{-9}$ s, which is still an incredibly long time compared to $2\times10^{-15}$ s.
Therefore, we might as well take the time over which the squared amplitude is averaged to be
infinitely long, because compared to the wavefield’s period, that’s what it effectively is.
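The orders of magnitude quoted above are easy to check. The sketch below is only arithmetic on Eqs. (1.5), (1.7b), and (1.7d); the helper name `wave_numbers` is an assumption introduced for the example, and c is rounded to 3 × 10^8 m/sec as in the text.

```python
C = 3.0e8  # speed of light in m/sec, rounded as in the text


def wave_numbers(wavelength_m):
    """Return (frequency in Hz, period in sec, wavenumber in 1/m) for one wavelength."""
    frequency = C / wavelength_m       # Eq. (1.5): lambda_f * f = c
    period = 1.0 / frequency           # one cycle of the wavefield
    wavenumber = 1.0 / wavelength_m    # Eq. (1.7b); equals f / c by Eq. (1.7d)
    return frequency, period, wavenumber


f, period, sigma = wave_numbers(5e-7)
print(f"f ~ {f:.1e} Hz, period ~ {period:.1e} sec, sigma ~ {sigma:.1e} per meter")

# Number of optical cycles inside each response time quoted in the text.
print(f"eye (1e-2 sec):      {1e-2 / period:.1e} cycles")
print(f"detector (1e-9 sec): {1e-9 / period:.1e} cycles")
```

Even the fastest detector response quoted in the text spans several hundred thousand optical cycles, which is why the averaging time can be treated as infinite.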
Following the notation of the time, the time average of a function g(t) is taken to be
$$\mathrm{j}\big(g(t)\big) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} g(t)\,dt . \tag{1.9a}$$
For any two functions g(t) and h(t), we then have
$$\mathrm{j}\big(g(t) + h(t)\big) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} \left[g(t) + h(t)\right]dt = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} g(t)\,dt + \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} h(t)\,dt$$
or
$$\mathrm{j}\big(g(t) + h(t)\big) = \mathrm{j}\big(g(t)\big) + \mathrm{j}\big(h(t)\big) . \tag{1.9b}$$
Multiplying g(t) by a constant K and then averaging, we get
$$\mathrm{j}\big(K \cdot g(t)\big) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} \left[K g(t)\right]dt = K \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} g(t)\,dt$$
or
$$\mathrm{j}\big(K \cdot g(t)\big) = K \cdot \mathrm{j}\big(g(t)\big) . \tag{1.9c}$$
The average of the squared cosine is 1/2 over one of its cycles.16 As the averaging time gets
longer, it contains ever more cycles of the squared cosine, as well as—almost certainly—some
fraction of a cycle. The contribution of the squared cosine over a fractional cycle has practically
no influence compared to the squared cosine’s average value of 1/2 over a large number of
complete cycles. In the limit as T → ∞, it follows that
$$\mathrm{j}\big(\cos^2(at + b)\big) = 1/2 \tag{1.10c}$$
for all real values of a and b with a ≠ 0. Hence, the formula for the intensity of the monochromatic beam in
Eq. (1.10b) now reduces to
$$\mathrm{j}\big(\vec{A}_f \cdot \vec{A}_f\big) = \frac{1}{2}\left(U_f^2 + V_f^2\right) . \tag{1.10d}$$
Although the squared cosine is always positive, the cosine itself is negative as often as it is
positive and averages to zero over one cycle. As the averaging time increases, it includes an ever
larger number of cycles as well as (probably) some leftover fraction of a cycle. Again, the
influence of the zero from the large number of complete cycles outweighs the contribution of
whatever fractional cycle may be present, and as T → ∞ in the limit
$$\mathrm{j}\big(\cos(at + b)\big) = 0 \tag{1.11}$$
for all real values of a and b with a ≠ 0.
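The limiting behavior of the averages of the squared cosine and the cosine can be watched numerically. The sketch below approximates the time average of Eq. (1.9a) with a midpoint sum over a finite interval; the function name `time_average` and the values a = 7, b = 0.3 are arbitrary assumptions made for the illustration.

```python
import math


def time_average(g, T, steps=200_000):
    """Approximate (1/2T) * integral of g(t) over [-T, T] by the midpoint rule."""
    dt = 2.0 * T / steps
    total = 0.0
    for k in range(steps):
        t = -T + (k + 0.5) * dt   # midpoint of the k-th subinterval
        total += g(t)
    return total * dt / (2.0 * T)


a, b = 7.0, 0.3   # arbitrary illustrative values, with a nonzero
for T in (1.0, 10.0, 100.0):
    avg_cos2 = time_average(lambda t: math.cos(a * t + b) ** 2, T)
    avg_cos = time_average(lambda t: math.cos(a * t + b), T)
    print(f"T = {T:6.1f}   avg cos^2 = {avg_cos2:.5f}   avg cos = {avg_cos:+.5f}")
```

As T grows, the leftover fraction of a cycle matters less and less, so the first column settles toward 1/2 and the second toward 0.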
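The wavefield of a beam of light containing two monochromatic wavetrains of frequencies $f_1$
and $f_2$ can be written as
$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2} , \tag{1.12a}$$
where
$$\vec{A}_{f_1} = \hat{x}\,U_{f_1} \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) + \hat{y}\,V_{f_1} \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \tag{1.12b}$$
and
$$\vec{A}_{f_2} = \hat{x}\,U_{f_2} \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) + \hat{y}\,V_{f_2} \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right) . \tag{1.12c}$$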
16
D. Griffiths, Introduction to Electrodynamics, 2nd ed. (Prentice Hall, Englewood Cliffs, NJ, 1989), p. 359.
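The beam's intensity is the time average of its squared amplitude, which is
$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\big((\vec{A}_{f_1} + \vec{A}_{f_2}) \cdot (\vec{A}_{f_1} + \vec{A}_{f_2})\big) = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1} + \vec{A}_{f_2} \cdot \vec{A}_{f_2} + 2\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) . \tag{1.12d}$$
Substituting Eqs. (1.12b) and (1.12c) into the cross term in Eq. (1.12d) gives
$$\begin{aligned}
\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) = \mathrm{j}\Big( & U_{f_1}U_{f_2} \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) \\
 & + V_{f_1}V_{f_2} \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right) \Big) .
\end{aligned} \tag{1.12e}$$
There is a trigonometric identity
$$(\cos\alpha)(\cos\beta) = \frac{1}{2}\cos(\alpha + \beta) + \frac{1}{2}\cos(\alpha - \beta) , \tag{1.12f}$$
which shows that
$$\begin{aligned}
& \cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right) \\
& \quad = \frac{1}{2}\cos\!\left(2\pi z(\sigma_{f_1} + \sigma_{f_2}) - 2\pi t(f_1 + f_2) + \delta_U^{(1)} + \delta_U^{(2)}\right) \\
& \qquad + \frac{1}{2}\cos\!\left(2\pi z(\sigma_{f_1} - \sigma_{f_2}) - 2\pi t(f_1 - f_2) + \delta_U^{(1)} - \delta_U^{(2)}\right) .
\end{aligned} \tag{1.12g}$$
Taking the time average of both sides and applying Eqs. (1.9b) and (1.9c), we see that
$$\begin{aligned}
& \mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right)\Big) \\
& \quad = \frac{1}{2}\,\mathrm{j}\Big(\cos\!\left(2\pi z(\sigma_{f_1} + \sigma_{f_2}) - 2\pi t(f_1 + f_2) + \delta_U^{(1)} + \delta_U^{(2)}\right)\Big) \\
& \qquad + \frac{1}{2}\,\mathrm{j}\Big(\cos\!\left(2\pi z(\sigma_{f_1} - \sigma_{f_2}) - 2\pi t(f_1 - f_2) + \delta_U^{(1)} - \delta_U^{(2)}\right)\Big) .
\end{aligned}$$
Equation (1.11) requires both terms on the right-hand side to be zero, which gives
$$\mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_U^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_U^{(2)}\right)\Big) = 0 . \tag{1.12h}$$
Replacing $\delta_U^{(1,2)}$ by $\delta_V^{(1,2)}$ in the algebra used to reach this result does not change the
conclusion, which means that
$$\mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_1} z - 2\pi f_1 t + \delta_V^{(1)}\right) \cos\!\left(2\pi\sigma_{f_2} z - 2\pi f_2 t + \delta_V^{(2)}\right)\Big) = 0 . \tag{1.12i}$$
Equations (1.12h) and (1.12i) make both products in Eq. (1.12e) average to zero, so
$$\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) = 0 \tag{1.12j}$$
for any two frequencies $f_1$ and $f_2$ such that $f_1 \neq f_2$. Hence, Eq. (1.12d) can be written as
$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1}\big) + \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_2}\big) . \tag{1.12k}$$
Comparing the formula in (1.12k) for the intensity of a beam containing two monochromatic
wavefields to the left-hand side of the formula in (1.10d) for the intensity of a single
monochromatic wavefield, we note that the intensity of the beam with two monochromatic
wavefields is the sum of the intensities of each monochromatic wavefield.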
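The vanishing of the cross term can also be checked numerically. The sketch below builds two wavetrains of the form (1.7c) at z = 0, averages the squared amplitude over a long but finite interval, and compares the result with the sum of the separate intensities. The helper names (`time_average`, `wavetrain`, `intensity`) and all amplitudes, phases, and frequencies are assumptions made for illustration; the frequencies are scaled to order 1 rather than optical values so that the sum runs quickly.

```python
import math


def time_average(g, T=200.0, steps=100_000):
    """Midpoint-rule estimate of (1/2T) * integral of g(t) over [-T, T]."""
    dt = 2.0 * T / steps
    return sum(g(-T + (k + 0.5) * dt) for k in range(steps)) * dt / (2.0 * T)


def wavetrain(U, V, sigma, f, dU, dV, z=0.0):
    """Return t -> (A_x, A_y) for one monochromatic wavetrain, as in Eq. (1.7c)."""
    def A(t):
        phase = 2.0 * math.pi * (sigma * z - f * t)
        return U * math.cos(phase + dU), V * math.cos(phase + dV)
    return A


def intensity(*trains):
    """Time average of the squared amplitude of the summed wavetrains."""
    def squared(t):
        ax = ay = 0.0
        for A in trains:
            x, y = A(t)
            ax += x
            ay += y
        return ax * ax + ay * ay
    return time_average(squared)


# Illustrative (assumed) amplitudes, phases, and scaled frequencies.
A1 = wavetrain(U=1.0, V=0.5, sigma=1.0, f=1.0, dU=0.2, dV=1.1)
A2 = wavetrain(U=0.7, V=0.9, sigma=1.3, f=1.3, dU=2.0, dV=0.4)

print("I1 + I2      =", intensity(A1) + intensity(A2))
print("I(A1 + A2)   =", intensity(A1, A2))
print("Eq. (1.10d)  =", 0.5 * (1.0**2 + 0.5**2) + 0.5 * (0.7**2 + 0.9**2))
```

Setting the two frequencies equal in this sketch makes the first two numbers differ, which is the situation treated later for the RT and TR beams.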
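The wavefield of a beam of light containing three monochromatic wavetrains of frequencies
$f_1$, $f_2$, and $f_3$ can be written as
$$\vec{A} = \vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3} \tag{1.13a}$$
with $\vec{A}_{f_1}$, $\vec{A}_{f_2}$ specified by formulas (1.12b) and (1.12c) respectively and $\vec{A}_{f_3}$ specified by
$$\vec{A}_{f_3} = \hat{x}\,U_{f_3} \cos\!\left(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_U^{(3)}\right) + \hat{y}\,V_{f_3} \cos\!\left(2\pi\sigma_{f_3} z - 2\pi f_3 t + \delta_V^{(3)}\right) . \tag{1.13b}$$
Following the same analysis as before, we note that the intensity of this three-frequency light
beam is
$$\begin{aligned}
\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) & = \mathrm{j}\big((\vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3}) \cdot (\vec{A}_{f_1} + \vec{A}_{f_2} + \vec{A}_{f_3})\big) \\
 & = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1} + \vec{A}_{f_2} \cdot \vec{A}_{f_2} + \vec{A}_{f_3} \cdot \vec{A}_{f_3} + 2\vec{A}_{f_1} \cdot \vec{A}_{f_2} + 2\vec{A}_{f_1} \cdot \vec{A}_{f_3} + 2\vec{A}_{f_2} \cdot \vec{A}_{f_3}\big) \\
 & = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1}\big) + \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_2}\big) + \mathrm{j}\big(\vec{A}_{f_3} \cdot \vec{A}_{f_3}\big) \\
 & \quad + 2\,\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) + 2\,\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_3}\big) + 2\,\mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_3}\big) .
\end{aligned}$$
Equation (1.12j) shows that
$$\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_2}\big) = 0$$
for any two distinct frequencies $f_1$ and $f_2$. The only thing different about $\mathrm{j}(\vec{A}_{f_1} \cdot \vec{A}_{f_3})$ and
$\mathrm{j}(\vec{A}_{f_2} \cdot \vec{A}_{f_3})$ is the subscripts assigned to the distinct frequencies, so the same algebra showing
that $\mathrm{j}(\vec{A}_{f_1} \cdot \vec{A}_{f_2})$ is zero also shows that
$$\mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_3}\big) = \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_3}\big) = 0 .$$
Hence, the three-frequency formula for $\mathrm{j}(\vec{A} \cdot \vec{A})$ reduces to
$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1}\big) + \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_2}\big) + \mathrm{j}\big(\vec{A}_{f_3} \cdot \vec{A}_{f_3}\big) . \tag{1.13c}$$
Here again, the intensity of the beam equals the sum of the intensities of its monochromatic
wavetrains.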
This same argument can obviously be generalized to a beam consisting of N monochromatic
wavetrains. Since N may be left unspecified and can be made as large as we please, this is the
same as extending it to a beam of white light. The white-light wavefield can be written as
$$\vec{A} = \sum_{i=1}^{N} \vec{A}_{f_i} , \tag{1.14a}$$
where
$$\vec{A}_{f_i} = \hat{x}\,U_{f_i} \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i} \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) . \tag{1.14b}$$
The intensity of this beam is
$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\left(\left(\sum_{i=1}^{N} \vec{A}_{f_i}\right) \cdot \left(\sum_{j=1}^{N} \vec{A}_{f_j}\right)\right) = \mathrm{j}\left(\sum_{i=1}^{N}\sum_{j=1}^{N} \vec{A}_{f_i} \cdot \vec{A}_{f_j}\right) ,$$
which reduces to
$$\mathrm{j}\big(\vec{A} \cdot \vec{A}\big) = \mathrm{j}\big(\vec{A}_{f_1} \cdot \vec{A}_{f_1}\big) + \mathrm{j}\big(\vec{A}_{f_2} \cdot \vec{A}_{f_2}\big) + \cdots + \mathrm{j}\big(\vec{A}_{f_N} \cdot \vec{A}_{f_N}\big) = \sum_{i=1}^{N} \mathrm{j}\big(\vec{A}_{f_i} \cdot \vec{A}_{f_i}\big) \tag{1.14e}$$
because all the i ≠ j terms disappear. Equation (1.14e) shows that the intensity of any beam, even
a white-light beam, is the sum of the intensities of its monochromatic wavetrains. This is
sometimes called the principle of independent superposition,17 and can be written as
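$$I = I_{f_1} + I_{f_2} + \cdots + I_{f_N} = \sum_{i=1}^{N} I_{f_i} , \tag{1.14f}$$
where
$$I = \mathrm{j}\big(\vec{A} \cdot \vec{A}\big) \tag{1.14g}$$
is the total intensity of the beam and
$$I_{f_i} = \mathrm{j}\big(\vec{A}_{f_i} \cdot \vec{A}_{f_i}\big) \tag{1.14h}$$
is the intensity of its $f_i$th monochromatic wavetrain.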
17
J. Chamberlain, The Principles of Interferometric Spectroscopy (John Wiley & Sons, New York, 1979), p. 98.
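For each monochromatic wavetrain
$$\vec{A}^{(TR)}_{f_i} = \hat{x}\,U_{f_i} \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i} \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \tag{1.15a}$$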
in beam TR, there must be, according to Fig. 1.16, a corresponding monochromatic wavetrain
$$\vec{A}^{(RT)}_{f_i} = \hat{x}\,U_{f_i} \cos\!\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_U^{(i)}\right) + \hat{y}\,V_{f_i} \cos\!\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_V^{(i)}\right) \tag{1.15b}$$
in beam RT. The total disturbance for the combined beams' $f_i$th wavetrain is then
$$\vec{A}^{(RT)}_{f_i} + \vec{A}^{(TR)}_{f_i}$$
in Fig. 1.16. We also note, however, that the beam splitter in Fig. 1.16 is evidently not the same
sort of beam splitter as the one used by Michelson because it does not reverse the direction of the
oscillation of the TR beam the way that the beam splitter in Fig. 1.8 did. For Michelson's sort of beam
splitter, the total disturbance of the combined beam's $f_i$th wavetrain should be
$$\vec{A}^{(RT)}_{f_i} - \vec{A}^{(TR)}_{f_i}$$
according to the discussion at the end of Sec. 1.1. To accommodate both possibilities, we write
the $f_i$th wavetrain of the combined beam as
$$\vec{A}^{(cb)}_{f_i} = \vec{A}^{(RT)}_{f_i} + W\vec{A}^{(TR)}_{f_i} , \tag{1.15c}$$
where parameter W is −1 for Michelson-type beam splitters and 1 for non-Michelson beam
splitters. The superscript (cb) indicates that the disturbance $\vec{A}^{(cb)}_{f_i}$ is the $f_i$th wavetrain of two
beams combined in a balanced way; that is, each beam has undergone one transmission and one
reflection at the beam splitter. The intensity of the combined $f_i$th wavetrain is
$$\begin{aligned}
I^{(cb)}_{f_i} = \mathrm{j}\big(\vec{A}^{(cb)}_{f_i} \cdot \vec{A}^{(cb)}_{f_i}\big) & = \mathrm{j}\big((\vec{A}^{(RT)}_{f_i} + W\vec{A}^{(TR)}_{f_i}) \cdot (\vec{A}^{(RT)}_{f_i} + W\vec{A}^{(TR)}_{f_i})\big) \\
 & = \mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i} + W^2 \vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i} + 2W\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) .
\end{aligned}$$
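Applying Eqs. (1.9b) and (1.9c), this becomes
$$I^{(cb)}_{f_i} = \mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i}\big) + \mathrm{j}\big(\vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) + 2W\,\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) , \tag{1.15d}$$
where we have recognized that $W^2 = 1$ because W = ±1. Since both disturbances have the same $f_i$
frequency, Eq. (1.12j) cannot be used to say that $\mathrm{j}(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i})$ is zero. Substituting from
(1.15a) and (1.15b) gives
$$\begin{aligned}
\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) = \mathrm{j}\Big( & U_{f_i}^2 \cos\!\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right) \\
 & + V_{f_i}^2 \cos\!\left(2\pi\sigma_{f_i}(z + \chi) - 2\pi f_i t + \delta_V^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right) \Big) ,
\end{aligned}$$
or
$$\begin{aligned}
\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) = {} & U_{f_i}^2\,\mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right)\Big) \\
 & + V_{f_i}^2\,\mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right)\Big) .
\end{aligned} \tag{1.15e}$$
The trigonometric identity (1.12f) shows that
$$\begin{aligned}
& \mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right)\Big) \\
& \quad = \mathrm{j}\left(\frac{1}{2}\cos\!\left(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\right) + \frac{1}{2}\cos\!\left(2\pi\sigma_{f_i}\chi\right)\right) .
\end{aligned}$$
Applying (1.9b) and (1.9c), we get that
$$\begin{aligned}
& \mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right)\Big) \\
& \quad = \frac{1}{2}\,\mathrm{j}\Big(\cos\!\left(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\right)\Big) + \frac{1}{2}\,\mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_i}\chi\right)\Big) .
\end{aligned} \tag{1.15f}$$
The time average of any time-independent quantity equals that quantity; that is,
$$\mathrm{j}(K) = K , \tag{1.15g}$$
while Eq. (1.11) requires
$$\mathrm{j}\Big(\cos\!\left(4\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 4\pi f_i t + 2\delta_U^{(i)}\right)\Big) = 0 .$$
Because $\cos(2\pi\sigma_{f_i}\chi)$ does not depend on t, Eq. (1.15f) becomes
$$\mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_U^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_U^{(i)}\right)\Big) = \frac{1}{2}\cos\!\left(2\pi\sigma_{f_i}\chi\right) . \tag{1.15h}$$
Replacing $\delta_U^{(i)}$ by $\delta_V^{(i)}$ does not change the algebra used to derive (1.15h). It follows that
$$\mathrm{j}\Big(\cos\!\left(2\pi\sigma_{f_i} z + 2\pi\sigma_{f_i}\chi - 2\pi f_i t + \delta_V^{(i)}\right) \cos\!\left(2\pi\sigma_{f_i} z - 2\pi f_i t + \delta_V^{(i)}\right)\Big) = \frac{1}{2}\cos\!\left(2\pi\sigma_{f_i}\chi\right) . \tag{1.15i}$$
Substituting (1.15h) and (1.15i) into (1.15e) now gives
$$\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) = \frac{1}{2}\left(U_{f_i}^2 + V_{f_i}^2\right)\cos\!\left(2\pi\sigma_{f_i}\chi\right) . \tag{1.15j}$$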
For an ideal Michelson interferometer, the intensity of the $f_i$th monochromatic wavetrain in
the RT beam and the intensity of the $f_i$th monochromatic wavetrain in the TR beam must be
identical because they arise in a symmetric way from the $f_i$th wavetrain of the white-light beam
entering the instrument. We can imagine taking out the moving mirror from its interferometer
arm so that only the TR beam is reflected back to the beam splitter. This means that only the
$\vec{A}^{(TR)}_{f_i}$ monochromatic disturbance leaves the interferometer in the proper direction, and its
intensity is, of course, $\mathrm{j}(\vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i})$. Taking out the fixed mirror in the other arm and
replacing the moving mirror in the first arm ensures that only the RT beam reflects back to the
beam splitter. Now $\mathrm{j}(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i})$ is the intensity of the monochromatic disturbance leaving
the interferometer in the proper direction. Since we have just said that these two intensities must
be equal, it follows that
$$\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i}\big) = \mathrm{j}\big(\vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) . \tag{1.16a}$$
Equation (1.10d) holds true for any monochromatic wavetrain $\vec{A}_f$ of frequency f, so it must
apply to wavetrain $\vec{A}^{(TR)}_{f_i}$ of frequency $f_i$. Hence, Eq. (1.15a) must mean that
$$\mathrm{j}\big(\vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) = \frac{1}{2}\left(U_{f_i}^2 + V_{f_i}^2\right) . \tag{1.16b}$$
Equation (1.10d) also applies to wavetrain $\vec{A}^{(RT)}_{f_i}$ of frequency $f_i$ in Eq. (1.15b), which
similarly leads to
$$\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i}\big) = \frac{1}{2}\left(U_{f_i}^2 + V_{f_i}^2\right) . \tag{1.16c}$$
The right-hand sides of (1.16b) and (1.16c) are the same, which makes sense since the left-hand
sides of (1.16b) and (1.16c) must satisfy Eq. (1.16a).
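Again taking out the moving mirror, we note that then, in an ideal interferometer, one quarter
of the entering beam's power ends up leaving the interferometer as beam TR traveling along the z
axis in Fig. 1.16. Hence, if $I^{(0)}_{f_i}$ is the intensity of the $f_i$th monochromatic wavetrain entering this
interferometer, we must have
$$\mathrm{j}\big(\vec{A}^{(TR)}_{f_i} \cdot \vec{A}^{(TR)}_{f_i}\big) = \frac{1}{4} I^{(0)}_{f_i} . \tag{1.17a}$$
Equation (1.16a) then requires
$$\mathrm{j}\big(\vec{A}^{(RT)}_{f_i} \cdot \vec{A}^{(RT)}_{f_i}\big) = \frac{1}{4} I^{(0)}_{f_i} , \tag{1.17b}$$
and comparing (1.17a) with (1.16b) gives
$$I^{(0)}_{f_i} = 2\left(U_{f_i}^2 + V_{f_i}^2\right) . \tag{1.17c}$$
Substituting these results and Eq. (1.15j) into the formula for $I^{(cb)}_{f_i}$ gives
$$I^{(cb)}_{f_i} = \frac{1}{2} I^{(0)}_{f_i} + \frac{W}{2} I^{(0)}_{f_i} \cos\!\left(2\pi\sigma_{f_i}\chi\right)$$
or
$$I^{(cb)}_{f_i} = \frac{1}{2} I^{(0)}_{f_i} \left[1 + W\cos\!\left(2\pi\sigma_{f_i}\chi\right)\right] . \tag{1.17d}$$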
Equation (1.17d) is the basic equation for the intensity of a monochromatic wavetrain leaving
an ideal Michelson interferometer when the intensity of the corresponding wavetrain entering the
interferometer is $I^{(0)}_{f_i}$ and the moving mirror is displaced from its ZPD position by a distance
p = χ/2, as shown in Fig. 1.16. We note that for those values of χ = 2p, where
$W\cos(2\pi\sigma_{f_i}\chi) = 1$, the intensity of the $f_i$th monochromatic wavetrain leaving the interferometer is
the same as the intensity of the $f_i$th monochromatic wavetrain entering the interferometer. This
corresponds to constructive interference of the $f_i$th monochromatic component of the RT and TR
beams. Suppose the beam entering the interferometer consists of just this one monochromatic
component. Glancing back at Fig. 1.1(b), we see that the power of the beam entering an ideal
Michelson interferometer can leave by either the combined RT and TR dotted beams or by the
two combined dash-dot beams traveling in the opposite direction to the incident beam. The dotted
beams are often called the balanced output of the interferometer, because each one has undergone
one transmission and one reflection at the beam splitter; similarly, the dash-dot beams are called
the unbalanced output, because one beam has undergone two reflections and the other beam has
undergone two transmissions. Conservation of energy requires that the power in all the
monochromatic beams leaving the ideal interferometer must equal the power in the one
monochromatic beam entering the interferometer. Hence, when constructive interference of the
balanced RT and TR beams makes their combined intensity equal to that of the beam entering the
interferometer, we know that destructive interference of the two unbalanced beams must make
their combined intensity equal to zero. Consequently, at each χ = 2p value where
$W\cos(2\pi\sigma_{f_i}\chi) = 1$, not only is the intensity of the balanced monochromatic beams the same as
that of the monochromatic beam entering the interferometer, but also the intensity of the
unbalanced monochromatic beams is zero. On the other hand, for moving-mirror positions where
χ = 2p has a value such that $W\cos(2\pi\sigma_{f_i}\chi) = -1$, the intensity of the combined monochromatic
RT and TR beams in Fig. 1.1(b) is zero according to Eq. (1.17d). At these moving-mirror
locations, the balanced output undergoes destructive interference. Conservation of energy then
requires the unbalanced output to undergo constructive interference and have the same intensity
as the monochromatic beam entering the interferometer.
This analysis can be generalized to any mirror position and value of χ = 2p. If $I^{(cu)}_{f_i}$ is the
intensity of the unbalanced monochromatic wavetrain and, as before, $I^{(0)}_{f_i}$ and $I^{(cb)}_{f_i}$ are the
intensities of the incident monochromatic wavetrain and balanced monochromatic wavetrain
respectively, then conservation of energy forces us to write
$$I^{(0)}_{f_i} = I^{(cb)}_{f_i} + I^{(cu)}_{f_i} . \tag{1.18a}$$
Substituting from Eq. (1.17d) gives
$$I^{(0)}_{f_i} = \frac{1}{2} I^{(0)}_{f_i} \left[1 + W\cos\!\left(2\pi\sigma_{f_i}\chi\right)\right] + I^{(cu)}_{f_i} ,$$
which can be solved for $I^{(cu)}_{f_i}$ to get
$$I^{(cu)}_{f_i} = \frac{1}{2} I^{(0)}_{f_i} \left[1 - W\cos\!\left(2\pi\sigma_{f_i}\chi\right)\right] . \tag{1.18b}$$
This specifies the intensity of the $f_i$th monochromatic wavetrain in the unbalanced output of an
ideal Michelson interferometer.
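The bookkeeping between the two outputs can be checked with a short Python sketch of Eqs. (1.17d) and (1.18b). The function names `balanced` and `unbalanced` and the sample numbers are assumptions made for illustration; χ and 1/σ only need to share the same length unit.

```python
import math


def balanced(I0, sigma, chi, W=-1):
    """Eq. (1.17d): balanced-output intensity of one monochromatic wavetrain."""
    return 0.5 * I0 * (1.0 + W * math.cos(2.0 * math.pi * sigma * chi))


def unbalanced(I0, sigma, chi, W=-1):
    """Eq. (1.18b): unbalanced-output intensity of the same wavetrain."""
    return 0.5 * I0 * (1.0 - W * math.cos(2.0 * math.pi * sigma * chi))


# Illustrative values: unit input intensity, sigma = 2000 per cm, chi in cm.
I0, sigma = 1.0, 2000.0
for chi in (0.0, 0.0001, 0.00025, 0.0004):
    cb = balanced(I0, sigma, chi)
    cu = unbalanced(I0, sigma, chi)
    print(f"chi = {chi:.5f}  balanced = {cb:.4f}  unbalanced = {cu:.4f}  sum = {cb + cu:.4f}")
```

Whatever the mirror position, the two outputs add up to the input intensity, as Eq. (1.18a) requires; with W = −1 the balanced output vanishes wherever σχ is an integer.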
The dashed lines in Fig. 1.17 show the positions of the moving mirror at which
$$\chi = \ldots, \frac{n}{\sigma_{f_i}}, \frac{n+1}{\sigma_{f_i}}, \frac{n+2}{\sigma_{f_i}}, \ldots .$$
These are the positions where $I^{(cb)}_{f_i} = 0$ in Eq. (1.17d) when W = −1 for an interferometer using a
Michelson-type beam splitter. This can also be written as, substituting from Eq. (1.7b),
$$\chi = \ldots, n\lambda_{f_i}, (n+1)\lambda_{f_i}, (n+2)\lambda_{f_i}, \ldots ,$$
where $\lambda_{f_i}$ is the wavelength of the $f_i$th monochromatic wavetrain. For beam splitters where
W = 1, of course, these dashed lines represent the moving-mirror positions at which $I^{(cb)}_{f_i} = I^{(0)}_{f_i}$. If
the moving mirror is slightly tilted, so that its surface crosses more than one dashed line, and the
beam entering the interferometer contains only the $f_i$th monochromatic wavetrain, then the
combined RT and TR beams leaving the interferometer have light and dark strips as the surface
of the tilted mirror crosses through those planes in space where an untilted mirror would produce
an all-bright or an all-dark balanced output. This connects Eq. (1.17d) to the bright and null
fringe patterns from a spectral line discussed in Sec. 1.4.
When a beam of white light passes through the interferometer—that is, a beam having many
different frequencies—the principle of independent superposition in Eq. (1.14f) requires the
intensity of the interferometer’s balanced output to be the sum of the intensities of each
monochromatic wavetrain,
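$$I^{(cb)} = \sum_{i=1}^{N} I^{(cb)}_{f_i} ,$$
or, substituting from Eq. (1.17d),
$$I^{(cb)} = \frac{1}{2} \sum_{i=1}^{N} I^{(0)}_{f_i} \left[1 + W\cos\!\left(2\pi\sigma_{f_i}\chi\right)\right] . \tag{1.19a}$$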
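Equation (1.19a) is already enough to reproduce the doublet fade-out described in Sec. 1.4. The sketch below applies it to two equal-intensity lines and measures the fringe contrast over one fringe at three path differences. The wavelengths (589.0 nm and 589.6 nm, roughly the sodium D lines), the helper names, and the choice of (max − min)/(max + min) as a contrast measure are assumptions made for this illustration. The fade-out path difference follows from Eq. (1.6a) with χ = 2p.

```python
import math


def balanced_output(chi, lines, W=-1):
    """Eq. (1.19a) for a list of (I0, sigma) pairs; chi and 1/sigma share units."""
    return 0.5 * sum(i0 * (1.0 + W * math.cos(2.0 * math.pi * s * chi))
                     for i0, s in lines)


def fringe_visibility(chi0, lines, samples=200):
    """(max - min) / (max + min) of the output over one fringe starting at chi0."""
    fringe = 1.0 / lines[0][1]   # one wavelength of path difference
    values = [balanced_output(chi0 + fringe * k / samples, lines)
              for k in range(samples + 1)]
    return (max(values) - min(values)) / (max(values) + min(values))


lam1, lam2 = 589.0e-9, 589.6e-9                    # meters, illustrative doublet
doublet = [(1.0, 1.0 / lam1), (1.0, 1.0 / lam2)]   # equal intensities

# From Eq. (1.6a): chi = N * lam2 with N = lam1 / (2 * (lam2 - lam1)).
chi_fade = lam1 * lam2 / (2.0 * (lam2 - lam1))

for label, chi in (("near ZPD", 0.0), ("fade-out", chi_fade), ("revival", 2.0 * chi_fade)):
    print(f"{label:9s} chi = {chi * 1e3:.4f} mm  visibility = {fringe_visibility(chi, doublet):.4f}")
```

The contrast is essentially full near ZPD, nearly zero at the fade-out point, and full again at twice that distance, which is the recurrence suggested by Fig. 1.13 for lines of zero width.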
FIGURE 1.17. [Diagram of dashed lines labeled nth crossing, (n + 1)st crossing, (n + 2)nd crossing, and (n + 3)rd crossing; the distance between dashed lines is λfi/2.]
When describing natural sources of light, we often replace sums of discrete quantities with
integrals over continuous functions, and this transformation was perhaps even more characteristic
of late 19th-century science than it is of today's physics. So it would be an automatic process for
Michelson and his contemporaries to define a spectral intensity function $I^{(0)}(f)$ to describe the
radiation entering the instrument. When using this sort of mathematical formalism, we say that
$I^{(0)}(f)\,df$ is the optical intensity of all the radiation having frequency values between f and f + df
entering the interferometer. The intensity of the balanced output is then
$$I^{(cb)} = \frac{1}{2}\int_0^\infty I^{(0)}(f) \left[1 + W\cos\!\left(2\pi\sigma_f\chi\right)\right] df . \tag{1.19b}$$
The physical meaning of Eq. (1.19b) is exactly the same as Eq. (1.19a); we have just replaced
$I^{(0)}_{f_i}$ by $I^{(0)}(f)\,df$ and changed the sum to an integral. We have also relied on variable f itself
instead of index i to label the different frequencies. To make this last tactic work, we just assume
that $I^{(0)}(f)$ is zero for those frequencies f that are not part of the original sum over i; this also
lets us specify the integral to be over all possible frequencies f between 0 and ∞. The
wavenumber $\sigma_f$ can be eliminated by substituting from the formula for $\sigma_f$ in (1.7d) to get
$$I^{(cb)} = \frac{1}{2}\int_0^\infty I^{(0)}(f) \left[1 + W\cos\!\left(\frac{2\pi f}{c}\chi\right)\right] df . \tag{1.19c}$$
The only problem with this equation is the unreasonably high numbers required to represent f
at optical frequencies; when going from one extreme to the other across the visible spectrum, for
example, frequency f changes from $4\times10^{14}$ Hz to $7.5\times10^{14}$ Hz (approximately). Consequently,
today's Fourier spectroscopists often use Eq. (1.7d) to eliminate f rather than σ from Eq. (1.19b).
To do this, we differentiate both sides of (1.7d) to get
$$df = c\,d\sigma \quad \text{or} \quad d\sigma = \frac{1}{c}\,df$$
and define
$$S(\sigma) = c\,I^{(0)}(c\sigma) \tag{1.19d}$$
so that
$$S(\sigma)\,d\sigma = c\,I^{(0)}(c\sigma) \cdot \frac{1}{c}\,df$$
simplifies to
$$S(\sigma)\,d\sigma = I^{(0)}(c\sigma)\,df . \tag{1.19e}$$
Substituting Eq. (1.19e) into Eq. (1.19b), with f = cσ, gives
$$I^{(cb)} = \frac{1}{2}\int_0^\infty S(\sigma) \left[1 + W\cos\!\left(2\pi\sigma\chi\right)\right] d\sigma . \tag{1.19f}$$
To get the white-light intensity formulas for the unbalanced output, we can apply to the
unbalanced monochromatic formula the same analysis used on the balanced monochromatic
formula. Comparing the unbalanced formula (1.18b) to the balanced formula (1.17d), we see that
changing the sign of W is all that needs to be done to go from the balanced formula to the
unbalanced formula. Hence, when we apply to the unbalanced formula the same algebra used on
the balanced formula, we know that all the way through the derivation (and, of course, in the
final results) the only difference would be that W is replaced by −W. Consequently, we can write
down at once the unbalanced white-light formulas corresponding to (1.19b), (1.19c), and (1.19f)
as
$$I^{(cu)} = \frac{1}{2}\int_0^\infty I^{(0)}(f) \left[1 - W\cos\!\left(2\pi\sigma_f\chi\right)\right] df , \tag{1.20a}$$
$$I^{(cu)} = \frac{1}{2}\int_0^\infty I^{(0)}(f) \left[1 - W\cos\!\left(\frac{2\pi f}{c}\chi\right)\right] df , \tag{1.20b}$$
and
$$I^{(cu)} = \frac{1}{2}\int_0^\infty S(\sigma) \left[1 - W\cos\!\left(2\pi\sigma\chi\right)\right] d\sigma \tag{1.20c}$$
respectively. Formulas (1.19b), (1.19c), and (1.19f) contain all the basic information needed to
understand how Fourier-transform spectroscopy works, and they were derived here using only those
facts that Michelson knew over 100 years ago about the nature of light. Unfortunately, they apply
only to an ideal interferometer; not surprisingly, the 19th-century approach used to derive them is
difficult to adapt to the study of both the random and nonrandom errors present in even the most
accurate of today's Michelson interferometers. For this reason, in Chapter 4 we return to basic
principles and rederive the formula for $I^{(cb)}$ starting from the modern form of Maxwell's
equations, this time being careful to include all the nonideal terms needed for the error analysis.
Formula (1.19f) is, however, already good enough (if we borrow several mathematical results
from Chapter 2) to explain why the fringes from even the thinnest of spectral lines discussed in
Sec. 1.4 must eventually fade away as χ = 2p increases.
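Before turning to the mathematics, the fade-out can be previewed by evaluating Eq. (1.19f) numerically for a single line of finite width. The sketch below assumes a Gaussian line shape centered at 2000 cm⁻¹ with a width parameter of 1 cm⁻¹; the line shape, the numbers, and the helper names are assumptions chosen only for illustration, and the integral is approximated by a midpoint sum over six widths on either side of line center.

```python
import math


def gaussian_line(sigma, sigma0, width):
    """Assumed line shape S(sigma): a Gaussian centered at sigma0."""
    return math.exp(-0.5 * ((sigma - sigma0) / width) ** 2)


def balanced_output(chi, sigma0, width, W=-1, n=4000):
    """Midpoint-rule estimate of Eq. (1.19f) over sigma0 +/- 6 * width."""
    lo = sigma0 - 6.0 * width
    d_sigma = 12.0 * width / n
    total = 0.0
    for k in range(n):
        s = lo + (k + 0.5) * d_sigma
        total += gaussian_line(s, sigma0, width) * (1.0 + W * math.cos(2.0 * math.pi * s * chi))
    return 0.5 * total * d_sigma


def fringe_contrast(chi0, sigma0, width, samples=40):
    """(max - min) / (max + min) of the output over one fringe starting at chi0."""
    values = [balanced_output(chi0 + k / (samples * sigma0), sigma0, width)
              for k in range(samples + 1)]
    return (max(values) - min(values)) / (max(values) + min(values))


# Wavenumbers in cm^-1 and path differences in cm (illustrative values).
for chi in (0.0, 0.1, 0.5, 1.0):
    print(f"chi = {chi:.1f} cm   fringe contrast = {fringe_contrast(chi, 2000.0, 1.0):.4f}")
```

For this assumed line shape the contrast falls steadily with χ and never recovers, in contrast to a doublet of infinitely narrow lines.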
Fringe Patterns of Finite-Width Spectral Lines· 1.6
$$I^{(cb)} = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma + \frac{W}{2}\int_0^\infty S(\sigma)\cos\!\left(2\pi\sigma\chi\right) d\sigma . \tag{1.21a}$$
Since σ = |σ| ≥ 0 in the integrals over dσ, nothing stops us from replacing S(σ) by S(|σ|) in the
second term to get
$$I^{(cb)} = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma + \frac{W}{2}\int_0^\infty S(|\sigma|)\cos\!\left(2\pi\sigma\chi\right) d\sigma . \tag{1.21b}$$
Anticipating some of the Fourier material in Chapter 2, we note that, according to Eq. (2.11a)
in Chapter 2, function S(|σ|) is even because
$$S(|-\sigma|) = S(|\sigma|) ,$$
and, of course, it is real because it represents a real physical quantity, namely the intensity of the
spectral line. Turning next to Eq. (2.34g) in Chapter 2, we see that because S(|σ|) is a real and
even function, the cosine integral on the right-hand side of Eq. (1.21b) is one half of the Fourier
transform of S [if we specify that parameter σ in (1.21b) corresponds to variable t in (2.34g) and
that parameter χ in (1.21b) corresponds to variable f in (2.34g)]. Anticipating the material in
Chapter 2 one last time, we consult Eq. (2.35k) and note that if the nth derivative of S has a well-
defined Fourier transform, then for large values of its argument the Fourier transform of S
approaches zero as the inverse nth power of the absolute value of its argument. Since S describes a
spectral line (that is, a natural phenomenon), we expect it to have derivatives of all orders and
also expect those derivatives to have Fourier transforms. The argument of the Fourier transform
of S is χ, and we already know that the right-hand side of (1.21b) is half the Fourier transform of
S, so we can now conclude that
$$\int_0^\infty S(|\sigma|)\cos\!\left(2\pi\sigma\chi\right) d\sigma = O\!\left(|\chi|^{-n}\right) , \tag{1.21c}$$
so that
$$I^{(cb)} = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma + O\!\left(|\chi|^{-n}\right) \tag{1.21d}$$
for large values of χ. Hence, as the moving mirror gets further and further from its ZPD location,
increasing the value of χ = 2p, the value of $I^{(cb)}$ eventually stops changing and approaches the
constant value
$$\lim_{\chi\to\infty} I^{(cb)} = \frac{1}{2}\int_0^\infty S(\sigma)\,d\sigma . \tag{1.21e}$$
This happens for all types of intensity curves, not just those associated with spectral lines. If S
does represent a spectral line such as the one in Fig. 1.18, the brights and nulls associated with
the dashed lines in Fig. 1.17 eventually fade away. Consequently, no matter how the moving
mirror is tilted, no fringes can be seen. If the Michelson interferometer is being used as a ruler,
the fringe counting must stop. When the spectral line is a closely spaced multiplet, each line in
the group has a finite spectral width, ensuring that—no matter how the lines interact with each
other to form bright and dim regions in the overall fringe pattern—eventually any and all fringe
traces must disappear. Every spectral line found in nature produces light having some finite
spectral width, no matter how small, so this sort of fade-out is a universal phenomenon.
³ S () ) cos 2&) d) ,
0
coming from the second term on the right-hand side of Eq. (1.21a). In the previous section we
found that this curve is half the Fourier transform of S. This means that if the curve could be
- 52 -
Fourier-Transform Spectrometers · 1.7
FIGURE 1.18.
( 0)
Spectral Intensity I (f)
f1 f2 frequency f
S (σ ) = cI (0) (cσ )
f1 f2 wavenumber σ
σ1 = σ2 =
c c
- 53 -
1 · Ether Wind, Spectral Lines, and Michelson Interferometers
measured, then the Fourier transform could be reversed to get the shape of the S spectrum
entering the interferometer. In the 1950s, both optical detectors to measure I(cb) and digital
computers to reverse the Fourier transform became widely available. Spectroscopists began to
design and build spectrometers based on measuring I^(cb) as a function of χ and then reversing the
Fourier transform to find S. Today, these sorts of instruments are usually called Fourier-transform
spectrometers.
Equation (1.21a) is an idealized form of the fundamental equation of Fourier-transform
spectroscopy. It describes the intensity of the beam leaving an interferometer whenever we
Although this is exactly what happens inside a standard Michelson interferometer, Figs. 1.19(a)–
1.19(d) show that there are many other combinations of beam splitters and mirrors that divide and
recombine beams in this way.18
Figure 1.19(a) shows the first and perhaps most obvious modification. Michelson put the arms
of his interferometer at right angles to maximize the fringe shift due to the ether wind thought to
exist by 19th-century scientists. If all that is desired, however, is to divide and recombine beams,
then the two arms can be at any (reasonable) angle with respect to each other, as shown in Fig.
1.19(a). The setup in Fig. 1.19(a) may in fact have some advantages over the standard Michelson
interferometer; arranging for near-normal reflections off the beam splitter usually modifies the
polarization of the wavefields less than large-angle reflections (see Sec. 4.4 of Chapter 4 for an
explanation of polarization).
Figure 1.19(b) shows that the end mirrors can be replaced by retroreflectors like corner cubes
or cat’s-eyes. For best results, both arms should have the same type of retroreflector.
The discussion following Eq. (1.17d) above explains the difference between the balanced and
unbalanced optical outputs leaving the standard Michelson interferometer. In Figs. 1.19(a) and
1.19(b), the unbalanced output cannot be detected because it goes back out along the entrance
beam, making it impossible to separate the two. The interferometer in Fig. 1.19(c), however,
shows that there are ways to keep the entrance beam separate from the unbalanced output, giving
us access to both the balanced and unbalanced optical signals. According to Eqs. (1.19f) and
(1.20c), if I^(cb) is the intensity of the balanced output and I^(cu) is the intensity of the unbalanced
output, then
$$ I^{(cb)} - I^{(cu)} = \int_0^\infty S(\sigma)\cos(2\pi\chi\sigma)\,d\sigma \tag{1.22a} $$
and
18
To keep things simple, compensation plates and other secondary optical components have been omitted.
$$ I^{(cb)} + I^{(cu)} = \int_0^\infty S(\sigma)\,d\sigma . \tag{1.22b} $$
Equation (1.22a) shows that subtracting the output of the detectors measuring the balanced and
unbalanced signals eliminates the constant term and doubles the size of the signal component
containing the Fourier transform. Adding the detectors’ outputs in Eq. (1.22b) eliminates the
Fourier transform, producing the integrated spectral intensity of the entrance beam. This
integrated source intensity should, of course, remain constant during a spectral measurement
because Fourier-transform spectrometers are vulnerable to source fluctuations. Astronomers often
design their Fourier-transform spectrometers so that both the balanced and unbalanced outputs
are available. When they investigate the spectra of weak and fluctuating sources (such as
twinkling stars), these instruments allow them both to double the signal from—and to check the
constancy of—the radiances being measured. If the source fluctuates, formula (1.22b) can be
used to measure the fluctuation. Sometimes this allows the astronomer to rescale the Fourier
signal in (1.22a) to correct the spectral measurement.
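The rescaling idea can be sketched in a few lines. The example assumes the idealized outputs I^(cb) = (A + F)/2 and I^(cu) = (A − F)/2, where A is the integrated spectral intensity and F is the cosine integral, which reproduces Eqs. (1.22a) and (1.22b). The spectrum is modeled as two ideal narrow lines and the source fluctuation as a single multiplicative `gain`; both are hypothetical simplifications, as are all the names.

```python
import math

def outputs(chi, gain=1.0, lines=((1000.0, 1.0), (1010.0, 0.5))):
    """Balanced and unbalanced outputs for a spectrum of ideal narrow lines.

    `lines` is a hypothetical list of (wavenumber, strength) pairs, and
    `gain` models a source fluctuation that scales the whole spectrum.
    """
    total = gain * sum(strength for _, strength in lines)
    fourier = gain * sum(strength * math.cos(2.0 * math.pi * chi * sigma)
                         for sigma, strength in lines)
    i_cb = 0.5 * (total + fourier)   # balanced output
    i_cu = 0.5 * (total - fourier)   # unbalanced output
    return i_cb, i_cu

def normalized_signal(i_cb, i_cu):
    """Difference over sum: the Fourier term divided by the source intensity."""
    return (i_cb - i_cu) / (i_cb + i_cu)

steady = normalized_signal(*outputs(0.013, gain=1.0))
dimmed = normalized_signal(*outputs(0.013, gain=0.7))
print(steady, dimmed)
```

Dividing the difference (1.22a) by the sum (1.22b) removes the common gain factor, so the two printed values agree even though the source dimmed by 30 percent between them. A real correction is only this clean when the fluctuation scales the whole spectrum uniformly.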
In a standard Michelson interferometer such as the one shown in Fig. 1.1(b), and in the setups
shown in Figs. 1.19(a)–1.19(c), the wavefield of one recombining beam is displaced a distance χ
with respect to the wavefield of the other whenever the moving mirror or corner cube is displaced
from ZPD by a distance χ/2. In Fig. 1.19(d), however, the corner cube only has to move a
distance χ/4 to displace one wavefield by χ with respect to the other. Equation (5.67) in Chapter 5
shows that larger values of χ lead to more detailed spectral measurements in standard Michelson
interferometers, and the same holds true for the nonstandard interferometers discussed here. In
particular, a setup such as the one shown in Fig. 1.19(d) lets us achieve larger χ values with
smaller displacements of the corner cube. The moving corner cube is also, strictly speaking, no
longer the retroreflector; plane mirrors in both arms are used to reverse the beam directions.
During the 1950s, it was established that Fourier-transform spectrometers had two basic
advantages, often called the Jacquinot advantage and the Fellgett advantage, over contemporary
types of prism-based and grating-based spectrometers.19 These advantages meant that under
many circumstances spectra measured by Fourier-transform spectrometers had a better
signal-to-noise ratio than spectra measured by equivalent prism-based or grating-based instruments.
With the popularization of fast Fourier transform (FFT) algorithms in the 1960s, Fourier-transform
spectrometers soon established themselves as usually the first and best choice for measuring infrared
spectra (electromagnetic radiation having wavelengths between 1 and 100 μm). The growing availability
of personal and desktop computers in the late 1970s and 1980s made Fourier-transform systems
more compact, powerful, and user-friendly. Over the past two decades, there has been a tendency
to use standard Michelson configurations, such as those in Figs. 1.1(b) or 1.19(a), when
19
J. Chamberlain, The Principles of Interferometric Spectroscopy, p. 16.
FIGURE 1.19(a). Entrance beam, beam splitter, fixed mirror, and moving mirror displaced by p = χ/2, with the output going to the balanced signal detector.
FIGURE 1.19(b). Entrance beam, beam splitter, fixed corner cube, and moving corner cube displaced by p = χ/2, with the output going to the balanced signal detector.
FIGURE 1.19(c). Entrance beam, beam splitter, and fixed corner cube, with displacement p = χ/2 and with outputs going to both the unbalanced signal detector and the balanced signal detector.
FIGURE 1.19(d). Entrance beam, beam splitter, and fixed mirror.
designing the optics of Fourier-transform spectrometers. Standard Michelsons are well suited to
the laser-based servo controls often used to maintain the alignment of the fixed and moving
mirrors.
measurement. Systems designed to measure infrared spectra typically have lasers that work in the
visible. Not only do modest standards of alignment and control in the visible correspond to
extremely accurate standards of alignment and control in the infrared—because visible
wavelengths are much shorter than infrared wavelengths—but the infrared detectors responsible
for the spectral measurements are also easily shielded from stray laser light. The laser servo
systems follow many different designs. Figures 1.20(a) and 1.20(b) show a typical setup that may
not be exactly like any system now in use but that does present the basic ideas behind them.
In Fig. 1.20(a), a single laser beam is separated into beams A, B, and C by laser-beam
splitters. Separating one beam into three ensures that all three beams have the same wavelength.
The three beams enter the interferometer parallel to, and at the edges of, the entrance beam.
Figure 1.20(b) shows the path of beams A and B through the instrument; beam C is not shown
because it is out of the plane of the page, but it is assumed to follow a path similar to beams A
and B. The solid lines representing the laser beams are always parallel to the dotted lines showing
the path of the entrance beam through the interferometer; and the laser beams interact with the
interferometer’s beam splitter, fixed mirror, and moving mirror exactly the same way the
entrance beam does. Because all three laser beams are monochromatic wavetrains of wavelength
λ, the same reasoning used to produce Fig. 1.17 shows that we can draw a sequence of dashed
lines perpendicular to the laser beams to represent the moving-mirror positions where the laser
beams would form fringes. Just like in Fig. 1.17, each dashed line is separated from its two
nearest neighbors by λ/2. Taking the dashed lines to represent nulls, we note that if the moving
mirror has a slight tilt, as shown in Fig. 1.20(b), then the laser detector for beam B will see a near
null in the beam B fringe while the laser detector for beam A will see a near bright in the beam A
fringe. If the moving mirror is aligned in the plane of Fig. 1.20(b) but has a small out-of-plane
tilt, then the laser detector for beam C is sure to see a different fringe brightness than the laser
detectors for beams A and B. The three laser detectors send their signals to a servomechanism
that readjusts the mirror tilt until all three detectors see the same fringe intensity, keeping the
interferometer aligned while the moving mirror changes position. Often these servomechanisms
readjust the tilt of the fixed mirror instead of directly correcting the moving mirror’s tilt. It is not
difficult to design systems of this sort that can detect changes of λ/100 in the position of the
moving mirror’s surface. The A, B, and C laser detectors can also be used to count fringes as the
moving mirror changes position, keeping a record of where the moving mirror is and how fast it
is moving. This information is almost always used to sample the interferometer’s output signal at
equally spaced positions of the moving mirror, and it is often sent to a servomechanism
responsible for producing steady motion in the moving mirror.
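The tilt-sensing idea can be sketched as follows. The example assumes an ideal two-beam laser fringe of the form [1 + cos(2πχ/λ)]/2 with χ = 2p, a flat mirror whose local displacement varies linearly with position under small tilts, and a hypothetical visible laser wavelength of 632.8 nm. The beam positions, tilt values, and function names are illustrative assumptions and do not describe any particular instrument.

```python
import math

def laser_fringe(p, wavelength):
    """Ideal two-beam fringe intensity (0 to 1) for mirror displacement p from ZPD."""
    return 0.5 * (1.0 + math.cos(2.0 * math.pi * (2.0 * p) / wavelength))

def detector_readings(p_center, tilt_x, tilt_y, wavelength, beams):
    """Fringe intensity seen by each laser detector for a tilted moving mirror.

    `beams` maps a detector name to the (x, y) position, in metres, where that
    laser beam meets the mirror; tilts are small angles in radians.
    """
    return {name: laser_fringe(p_center + tilt_x * x + tilt_y * y, wavelength)
            for name, (x, y) in beams.items()}

WAVELENGTH = 632.8e-9                      # hypothetical visible laser, metres
BEAMS = {"A": (-0.02, 0.0), "B": (0.02, 0.0), "C": (0.0, 0.02)}

# A tilt that changes the mirror displacement by a quarter wavelength between
# beams A and B shifts one fringe from bright to null.
tilt = WAVELENGTH / (4.0 * 0.04)
tilted = detector_readings(WAVELENGTH / 8.0, tilt, 0.0, WAVELENGTH, BEAMS)
aligned = detector_readings(WAVELENGTH / 8.0, 0.0, 0.0, WAVELENGTH, BEAMS)
print(tilted)
print(aligned)
```

In the tilted case detector A sees a bright fringe while detector B sees a null, as in Fig. 1.20(b); in the aligned case all three readings are equal. A servo would adjust the tilt to drive the differences between the readings toward zero.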
___________
Chapters 2 and 3 spell out the mathematical ideas needed to analyze the performance of
Fourier-transform spectrometers, and they also establish the notation used to describe these ideas
in subsequent chapters. Readers who are already familiar with Fourier theory and random
Laser-Based Control Systems · 1.8
functions can skip ahead to Chapter 4, returning to Chapters 2 and 3 as needed to refresh their
understanding. Chapter 4 starts with Maxwell’s equations, working with them to derive the
nonideal versions of Eqs. (1.19f) and (1.20c) needed to understand both the nonrandom and
random sources of error in Fourier-transform spectrometers. We always assume a standard
Michelson configuration, such as the ones shown in Fig. 1.1(b) or 1.19(a), controlled by laser-
based metrology and alignment systems similar to the ones shown in Figs. 1.20(a) and 1.20(b).
These are arguably the most common type of Fourier-transform spectrometer in use today. Most
of the basic ideas applied here to these standard Michelson systems are also relevant to other
types of Fourier-transform spectrometers; anyone who reads and understands the analysis
presented in Chapters 4 through 8 will be able to modify the equations presented there so that
they apply to nonstandard Michelson configurations. Possible exceptions to this rule are
Michelsons such as the one shown in Fig. 1.19(b) that use nonstandard retroreflectors to return
the split entrance beam to the beam splitter. These sorts of systems, which are outside the scope
of this book, are spared many forms of the “tilt” misalignment possible in a standard Michelson,
which is an advantage, but on the other hand exhibit shear types of misalignments, which
standard Michelsons do not have. The equations governing shear misalignment turn out to be
similar to those for tilt misalignment, but it does not necessarily make sense to analyze them as a
source of random error, the way tilt is analyzed in Chapter 7.
FIGURE 1.20(a). A single laser beam is divided by laser beam splitters into beams A, B, and C, which approach the interferometer beam splitter alongside the entrance beam.
FIGURE 1.20(b). Paths of laser beams A and B and of the entrance beam through the interferometer beam splitter, fixed mirror, and moving mirror, leading to laser detectors A and B and to the infrared detector.
2
FOURIER THEORY
Many single-chapter introductions to Fourier theory follow a top-down approach, defining what a
Fourier transform is and then listing the mathematical consequences. Here, on the other hand, we
begin with more of a bottom-up approach, seeking not only to present the mathematical
formalism of Fourier transforms but also to give an intuitive feel for how they work and what
they mean. Once the basic idea is established, we need to know which data sequences and
functions have well-defined Fourier transforms. This topic is often scanted because Fourier
theory is notorious for providing no simple mathematical answers to this simple mathematical
question. Indeed, engineers, scientists, and applied mathematicians have a long tradition of using
Fourier transforms in mathematically improper—yet extremely useful—ways that usually give
the correct answer. To show why these techniques work, and also when they cannot be trusted,
there is a brief sketch of generalized function theory. This is followed by a discussion of the
Fourier series and the discrete Fourier transform, including an exact description of how they are
connected to the integral Fourier transform. The discrete Fourier transform is particularly
important because, almost without exception, the only type of Fourier transform calculated on
today’s computers is the discrete Fourier transform; without it, the Michelson interferometer
would be a much more limited instrument. The chapter then concludes with a brief discussion of
how Fourier transforms are applied to two-dimensional and three-dimensional functions.
Basic Concept of a Fourier Transform · 2.1
FIGURE 2.1(a). List u_k plotted against increasing index k.
FIGURE 2.1(b). List v_k plotted against increasing index k.
2 · Fourier Theory
we form the sum S of the products of the differences from the mean,
$$ S = \sum_{k=1}^{N} (u_k - \bar{u})(v_k - \bar{v}) . \tag{2.2} $$
If the graphs of u_k and v_k have similar shapes, so that (u_k − ū) ≈ (v_k − v̄) for most values of k,
then (u_k − ū) and (v_k − v̄) are very likely to have the same sign for most values of k. This means
few terms in the sum are negative and S ends up being a large positive number. If u_k and v_k have
little similarity in shape, then (u_k − ū) and (v_k − v̄) are as likely to have opposite signs as the
same sign and the terms in the sum are just as likely to be positive as they are to be negative.
When this happens, S is a sum of terms that tend to cancel out, and the magnitude of S is likely to
be small.
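A short sketch makes the behavior of S concrete. It assumes that ū and v̄ are the ordinary arithmetic means of the two lists; the function name and the sample lists are hypothetical.

```python
def similarity_sum(u, v):
    """Sum of products of deviations from the mean, in the form of Eq. (2.2)."""
    if len(u) != len(v):
        raise ValueError("lists must have the same length")
    n = len(u)
    u_mean = sum(u) / n   # assumed definition of the mean of list u
    v_mean = sum(v) / n   # assumed definition of the mean of list v
    return sum((uk - u_mean) * (vk - v_mean) for uk, vk in zip(u, v))

u = [1, 2, 3, 4, 5, 4, 3, 2]
v_similar = [2, 4, 6, 8, 10, 8, 6, 4]    # same shape as u
v_unrelated = [5, 1, 4, 2, 3, 5, 1, 4]   # no obvious resemblance to u

print(similarity_sum(u, v_similar))
print(similarity_sum(u, v_unrelated))
```

The list that rises and falls with u gives a large positive sum, while the unrelated list gives a sum of much smaller magnitude because its positive and negative terms nearly cancel.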
The same basic idea can be applied to continuous functions u(t) and v(t). To create a formal
correspondence between functions and lists, we define an interval Δt in t and match u_k and v_k to
u(t) and v(t) with the equations
$$ u(k\,\Delta t) = \frac{u_k}{\Delta t} $$
and
$$ v(k\,\Delta t) = v_k . $$
Because u and v are continuous functions of time, we can assume that they vary in an
unsurprising manner between the isolated points at Δt, 2Δt, …, NΔt at which they have been
specified. Traditionally, the argument of functions u and v is called t and assumed to be time, but
it is worth remembering that t can stand for any relevant physical parameter, such as length,
voltage, current, etc. Now we can approximate Eq. (2.2) as
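$$ S \cong \int_{\Delta t}^{N\Delta t} \left[u(t) - \bar{u}\right]\left[v(t) - \bar{v}\right] dt , \tag{2.3a} $$
where now
$$ \bar{u} = \frac{1}{N\Delta t}\int_{\Delta t}^{N\Delta t} u(t)\,dt \tag{2.3b} $$
and
$$ \bar{v} = \frac{1}{N\Delta t}\int_{\Delta t}^{N\Delta t} v(t)\,dt . \tag{2.3c} $$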
Equations (2.3b) and (2.3c) just ensure that ū and v̄ are now the average values of u(t) and
v(t) respectively. We note that the value of ū has been redefined from what it was in Eq. (2.1a)
above,
$$ \bar{u}_{new} = \bar{u}_{old} / \Delta t , $$
whereas v̄ has basically the same value as in Eq. (2.1b); the only change is to replace the sum
by the equivalent integral. At this point, the finite value of Δt is just a distraction, because it is the
shapes of the continuous functions u(t) and v(t) that are being compared. Taking the limit as
Δt → 0 and N → ∞ in such a way that
$$ N\,\Delta t = T_{max} , \tag{2.4a} $$
we get
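$$ S = \int_0^{T_{max}} \left[u(t) - \bar{u}\right]\left[v(t) - \bar{v}\right] dt , \tag{2.4b} $$
where
$$ \bar{u} = \frac{1}{T_{max}}\int_0^{T_{max}} u(t)\,dt \tag{2.4c} $$
and
$$ \bar{v} = \frac{1}{T_{max}}\int_0^{T_{max}} v(t)\,dt . \tag{2.4d} $$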
We still expect S to be large when functions u and v have similar shapes and S to be small when
they have dissimilar shapes.
Equation (2.4b) can be written as
$$ S = \int_0^{T_{max}} u(t)v(t)\,dt - \bar{u}\int_0^{T_{max}} v(t)\,dt - \bar{v}\left[\int_0^{T_{max}} u(t)\,dt - \bar{u}\,T_{max}\right] = \int_0^{T_{max}} u(t)v(t)\,dt - \bar{u}\,\bar{v}\,T_{max} , \tag{2.5} $$
where in the last step (2.4c) ensures that the term in the square brackets [ ] is zero and (2.4d) is
used to replace the integral over v by v̄ T_max. To get to Fourier theory from Eq. (2.5), we suppose
v(t) to be an oscillatory function like sin(2πft) or cos(2πft) with f > 0. This makes function u
the data; that is, the value of our measurement at time t is u(t). Equation (2.4d) then reveals,
depending on whether we choose v to be a sine curve or a cosine curve, that
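$$ \bar{v}\,T_{max} = \int_0^{T_{max}} \sin(2\pi f t)\,dt = \frac{1}{2\pi f}\left[1 - \cos(2\pi f T_{max})\right] \tag{2.6a} $$
or
$$ \bar{v}\,T_{max} = \int_0^{T_{max}} \cos(2\pi f t)\,dt = \frac{1}{2\pi f}\sin(2\pi f T_{max}) . \tag{2.6b} $$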
When v is a sine curve, v̄ T_max oscillates between 1/(πf) and 0 as T_max increases; and when v
is a cosine curve, v̄ T_max oscillates between −1/(2πf) and 1/(2πf) as T_max increases. Keeping in
mind that u(t) represents a function measured in a laboratory, if we want to compare the shape of
u to either sin(2πft) or cos(2πft), common sense requires T_max, the range of t over which data is
gathered, to be much greater than 1/ƒ, the period of the sine or cosine curve to which we want to
compare the data. Unless u entirely lacks a resemblance to the sine or cosine so that
$$ \int_0^{T_{max}} u(t)v(t)\,dt \cong 0 , $$
we expect
$$ \int_0^{T_{max}} u(t)v(t)\,dt $$
to be large when the u measurements are large, and small when the u measurements are small,
and the integral’s magnitude should also increase as T_max increases. So when u represents a
typical set of data that is not completely unlike v in shape, then
$$ \int_0^{T_{max}} u(t)v(t)\,dt = O(\bar{u}\,T_{max}) $$
or
$$ \frac{1}{\bar{u}}\int_0^{T_{max}} u(t)v(t)\,dt = O(T_{max}) . $$
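Equations (2.6a) and (2.6b) show that v̄ T_max must remain somewhere between the two values
1/(πf) and −1/(2πf) no matter how large T_max gets, which means
$$ \bar{v}\,T_{max} = O(f^{-1}) . $$
Having already concluded that T_max has been chosen much larger than 1/ƒ, we expect
$$ \frac{1}{\bar{u}}\int_0^{T_{max}} u(t)v(t)\,dt = O(T_{max}) \gg O(f^{-1}) = \bar{v}\,T_{max} , $$
so that Eq. (2.5) becomes
$$ S = \bar{u}\left[\frac{1}{\bar{u}}\int_0^{T_{max}} u(t)v(t)\,dt - \bar{v}\,T_{max}\right] \cong \int_0^{T_{max}} u(t)v(t)\,dt . \tag{2.7} $$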
The integral in (2.7) can be regarded as assigning the number S to the similarity in shape of u and
v, when v is a sine or cosine curve of frequency ƒ. Remembering where S came from, we realize
that this number is large when u and v have similar shapes and small when u and v have
dissimilar shapes.
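The way this integral picks out a matching frequency can be sketched numerically. The example assumes a hypothetical measurement u(t), a decaying 3 Hz oscillation recorded for 20 seconds, and estimates the integral of u(t)v(t) with a simple midpoint rule; the function names and all numerical values are illustrative.

```python
import math

def similarity_integral(u, f, t_max, kind="cos", n=20000):
    """Midpoint-rule estimate of the integral of u(t) * v(t) from 0 to t_max,
    where v(t) is cos(2 pi f t) or sin(2 pi f t)."""
    wave = math.cos if kind == "cos" else math.sin
    dt = t_max / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += u(t) * wave(2.0 * math.pi * f * t)
    return total * dt

def measurement(t):
    """Hypothetical data: a decaying oscillation at 3 Hz."""
    return math.exp(-t / 5.0) * math.cos(2.0 * math.pi * 3.0 * t)

T_MAX = 20.0   # much longer than the period 1/f of any test frequency below
for f in (1.0, 3.0, 7.0):
    print(f, similarity_integral(measurement, f, T_MAX))
```

The integral is large only when the test frequency matches the 3 Hz oscillation in the data. At the other frequencies the product oscillates in sign and the contributions nearly cancel.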
$$ \mathcal{C}^{(ft)} u(t) = 2\int_0^\infty u(t)\cos(2\pi f t)\,dt . \tag{2.8b} $$
The notation 𝒮^(ft) u(t) and 𝒞^(ft) u(t) shows that the function u(t) is being multiplied by,
respectively, the sine or cosine function having, as indicated by the superscript, an argument ft
multiplied by 2π. The order of the ft product in the superscript does not matter because it does
not matter in the arguments of the sine and cosine, so
$$ \mathcal{S}^{(ft)} u(t) = \mathcal{S}^{(tf)} u(t) \quad \text{and} \quad \mathcal{C}^{(ft)} u(t) = \mathcal{C}^{(tf)} u(t) . $$
In particular we know, because t is repeated in both u(t) and the superscript of 𝒮 and 𝒞, that t is
the dummy variable of integration whereas ƒ, which is only contained in the superscript, is an
independent parameter. This means the transforms 𝒮^(ft) u(t) and 𝒞^(ft) u(t) are themselves
functions of the parameter ƒ,
$$ U_{\mathcal{S}}(f) = 2\int_0^\infty u(t)\sin(2\pi f t)\,dt \tag{2.8c} $$
and
$$ U_{\mathcal{C}}(f) = 2\int_0^\infty u(t)\cos(2\pi f t)\,dt . \tag{2.8d} $$
The “capital U” names of functions U_𝒮 and U_𝒞 show that they are mathematically associated
with the original function u(t), created from u(t) by the integrals in (2.8c) and (2.8d).
Although the upper limit of integration is now ∞ in Eqs. (2.8a) and (2.8b), this should not be
interpreted as taking the limit as T_max → ∞ in Eq. (2.7). The upper limit is put at ∞ just to
eliminate T_max as an explicit parameter, and the idea behind the presence of T_max, that u(t)
represents the result of a measurement, is kept alive by placing restrictions on the type of
function u can be. In particular, we expect u(t), in some sense, to diminish or get small as t gets
large, because it is impossible to measure data for all the times t out to ∞. It turns out that when
the right sorts of restrictions are placed on u, the Fourier sine and cosine transforms can be
inverted to recover the original functions,
$$ u(t) = 2\int_0^\infty U_{\mathcal{S}}(f)\sin(2\pi f t)\,df \tag{2.8e} $$
Fourier Sine and Cosine Transforms · 2.2
and
$$ u(t) = 2\int_0^\infty U_{\mathcal{C}}(f)\cos(2\pi f t)\,df \tag{2.8f} $$
for t > 0.
If we adopt the strictest definition of what is meant by the integral of a function between 0 and
∞, then Eqs. (2.8a)–(2.8f) are true when function u(t) satisfies the following four requirements:
We now show why function u(t) naturally satisfies all these restrictions when it represents a
(possibly idealized) measurement controlled or described by a continuous parameter t.
No matter what the argument t of function u represents—time, voltage, energy, etc.—function
u(t) can only be measured over a finite range of t. Although there may be no reason to think u is
zero or negligible when measured outside this range, we obviously cannot “make up” values for
what it might be. If we extrapolate to get the unmeasured t values, the extrapolation should not
dominate the information contained in u. In general, the measurement should be carried out in
such a way that the unmeasured or extrapolated values are of negligible importance compared to
the measured values. Mathematically we might say that there exists a positive, finite value of t,
which we call T_max, such that the important measured values of u are all at t ≤ T_max. One way of
expressing this constraint is to require
$$ \int_0^{T_{max}} |u(t)|\,dt \cong \int_0^\infty |u(t)|\,dt . \tag{2.9a} $$
Since the left-hand integral ought to be finite, when (2.9a) is true, it follows that
$$ \int_0^\infty |u(t)|\,dt < \infty . \tag{2.9b} $$
Functions u that satisfy (2.9b) are said to be absolutely integrable; clearly, all functions
representing possible measurements share this quality, satisfying requirement (I) above.
Understanding requirement (II) requires some discussion of what it means to call an
experimental measurement continuous. To assign, with negligible experimental error, a definite
value of t to a measurement u, some minimum and finite change in t must occur between adjacent
measurements. In practice, continuous measurements are constructed by connecting sequences of
adjacent but separate points. We then assume that if u were measured between these already
known points, it would equal (to within experimental error) the values selected by connecting the
points. Thus, the continuity of u is a requirement that the measurement captures all the relevant
detail. In this sense, asserting that u is continuous is a type of idealization—just another way of
saying that the measurement is accurate and representative. This takes care of the first part of
requirement (II), but there is a second part permitting u to have a finite number of jump
discontinuities. Figure 2.2 shows a jump discontinuity in u(t). Jump discontinuities represent
another type of idealization—what can occur when, for example, instruments are turned on or off
during a measurement. Because it is unrealistic to have this happen an infinite number of times
over a finite range of t, it makes sense to say that all functions u representing measurements are
continuous over any finite range of t except for a finite number of jump discontinuities.
Consequently, we can expect all functions representing measurements to satisfy requirement (II).
Standard proofs that the Fourier transform of the Fourier transform returns the original
function u usually end up showing as their final step that
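$$ 2\int_0^\infty U_{\mathcal{S}}(f)\sin(2\pi f t)\,df = \lim_{\epsilon\to 0}\frac{1}{2}\left[u(t+\epsilon) + u(t-\epsilon)\right] \tag{2.9c} $$
and
$$ 2\int_0^\infty U_{\mathcal{C}}(f)\cos(2\pi f t)\,df = \lim_{\epsilon\to 0}\frac{1}{2}\left[u(t+\epsilon) + u(t-\epsilon)\right] . \tag{2.9d} $$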
When u is continuous, this immediately reduces to the desired result, but when the integrals are
evaluated at a jump discontinuity, such as at t = t_0 in Fig. 2.2, the limits on the right-hand side of
(2.9c) and (2.9d) give u a value at the jump discontinuity that is probably different from the
original value of u at the jump discontinuity. To keep this from happening, we define the value of
u to be, for all values t = t_jump marking the location of a jump discontinuity,
$$ u(t_{jump}) = \lim_{\epsilon\to 0}\frac{1}{2}\left[u(t_{jump}+\epsilon) + u(t_{jump}-\epsilon)\right] . \tag{2.9e} $$
Modifying u this way cannot change the value of any integral whose integrand is the product of u
with another smooth function. The sine and cosine are smooth functions, so using (2.9e) to
modify the value of u at jump discontinuities does not change the values of the sine or cosine
transforms.
Measurements must be done with physically realizable equipment, which necessarily
produces finite values of u. This means there always exists a finite real number B < ∞ such that
$$ |u(t)| < B . \tag{2.9f} $$
FIGURE 2.2. A function u(t) with a jump discontinuity at t = t_0.
$$ u(t) = u_1(t) - u_2(t) \tag{2.9g} $$
FIGURE 2.3(a). A function u(t) on the interval a ≤ t ≤ b, with points t1, t2, and t3 marked on the t axis.
FIGURE 2.3(b). The functions u_1(t) and u_2(t) constructed from u(t) on the same interval.
In Fig. 2.3(a), function u is drawn with a continuous line where it is increasing and with a dashed
line where it is decreasing. In Fig. 2.3(b), we see that functions u_1 and u_2 are constructed so that
every time u increases, u_1 also increases while u_2 remains the same, and every time u decreases,
u_2 increases while u_1 remains the same. Consequently, for any function u and time values b > a,
the differences u_1(b) − u_1(a) and u_2(b) − u_2(a) are non-negative and can only increase, which
means that their sum
$$ V_{ab}(u) = \left[u_1(b) - u_1(a)\right] + \left[u_2(b) - u_2(a)\right] \tag{2.9h} $$
is also non-negative. Functions u_1 and u_2 have been constructed so that every time u goes up and
down, the differences u_1(b) − u_1(a) and u_2(b) − u_2(a) increase, making the size of V_ab(u) a
record of how many times u oscillates in the interval a ≤ t ≤ b. We define V_ab(u) to be the
variation of u over the interval a ≤ t ≤ b, and if
$$ V_{ab}(u) < \infty , \tag{2.9i} $$
This test function clearly satisfies (I) through (IV) and so must have a Fourier cosine transform,
$$ G_{\mathcal{C}}(f) = 2\pi\int_0^{(2\pi)^{-1}} \cos(2\pi f t)\,dt = \frac{\sin(f)}{f} \tag{2.10b} $$
such that we return to the original function g by taking the cosine transform of the G_𝒞 transform,
$$ g(t) = 2\int_0^\infty G_{\mathcal{C}}(f)\cos(2\pi f t)\,df = 2\int_0^\infty \frac{\sin(f)}{f}\cos(2\pi f t)\,df . \tag{2.10c} $$
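The closed form in Eq. (2.10b) can be checked numerically. The sketch below assumes that the test function g(t) equals π for 0 ≤ t < 1/(2π), π/2 at t = 1/(2π), and 0 for larger t, which is the form implied by Eq. (2.10b). The midpoint-rule integrator and all names are hypothetical.

```python
import math

T_EDGE = 1.0 / (2.0 * math.pi)   # location of the jump in g(t)

def g(t):
    """Assumed test function: pi below the jump, pi/2 at the jump, 0 above it."""
    if t < T_EDGE:
        return math.pi
    return math.pi / 2.0 if t == T_EDGE else 0.0

def cosine_transform_g(f, n=20000):
    """Midpoint-rule estimate of 2 * integral of g(t) cos(2 pi f t) dt over t >= 0.

    g vanishes above T_EDGE, so the integration only runs from 0 to T_EDGE.
    """
    dt = T_EDGE / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += g(t) * math.cos(2.0 * math.pi * f * t)
    return 2.0 * total * dt

for f in (0.5, 1.0, 3.0):
    print(f, cosine_transform_g(f), math.sin(f) / f)
```

For each frequency the numerical transform agrees with sin(f)/f to within the accuracy of the midpoint rule. The single value of g at the jump does not affect the integral, in line with the remark following Eq. (2.9e).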
$$ h(t) = \frac{\sin(t)}{t} $$
$$ H_{\mathcal{C}}(f) = 2\int_0^\infty \frac{\sin(t)}{t}\cos(2\pi f t)\,dt . \tag{2.10d} $$
The integral in (2.10d) is clearly the same as the first integral in (2.10c) with the variables ƒ and t
interchanged. Therefore,
$$ H_{\mathcal{C}}(f) = g(f) = \begin{cases} \pi & \text{for } 0 \le f < 1/2\pi \\ \pi/2 & \text{for } f = 1/2\pi \\ 0 & \text{for } f > 1/2\pi \end{cases} $$
Hence we know that h(t) satisfies Eqs. (2.8b), (2.8d), and (2.8f)—it is both cosine transformable
and its cosine transform returns the original function when cosine transformed—exactly because
g(t) in (2.10a) satisfies Eqs. (2.8b), (2.8d), and (2.8f). Yet h(t), unlike g(t), does not satisfy
requirements (I) through (IV)—in particular, it violates requirement (I) because it is not
absolutely integrable. To see that this is true, note that
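$$ \int_0^\infty \left|\frac{\sin(t)}{t}\right| dt = \sum_{j=1}^{\infty}\int_{(j-1)\pi}^{j\pi} \frac{|\sin(t)|}{t}\,dt \ge \sum_{j=1}^{\infty}\frac{1}{j\pi}\int_{(j-1)\pi}^{j\pi} |\sin(t)|\,dt = \frac{2}{\pi}\sum_{j=1}^{\infty}\frac{1}{j} \to \infty , $$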
where the last step uses a well-known property of the harmonic series,
$$ \sum_{j=1}^{\infty}\frac{1}{j} , $$
that it grows large without limit. This simple example also shows that just because a function g(t)
satisfies requirements (I) through (IV), so that the transform of the transform returns the original
function g(t), it does not necessarily follow that the transform itself satisfies requirements (I) through
(IV).
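The slow divergence behind this example can be seen numerically. The sketch below integrates |sin(t)/t| over the first n lobes of the sine with a midpoint rule and compares the result with the harmonic-series lower bound used above; the function names and step counts are hypothetical.

```python
import math

def abs_sinc_integral(n_lobes, steps_per_lobe=500):
    """Midpoint-rule estimate of the integral of |sin(t)/t| from 0 to n_lobes * pi."""
    dt = math.pi / steps_per_lobe
    total = 0.0
    for k in range(n_lobes * steps_per_lobe):
        t = (k + 0.5) * dt
        total += abs(math.sin(t)) / t * dt
    return total

def harmonic_lower_bound(n_lobes):
    """(2 / pi) * (1 + 1/2 + ... + 1/n_lobes), the lower bound used in the text."""
    return (2.0 / math.pi) * sum(1.0 / j for j in range(1, n_lobes + 1))

for n in (1, 10, 100):
    print(n, abs_sinc_integral(n), harmonic_lower_bound(n))
```

The partial integrals keep growing, roughly like the logarithm of the upper limit, and always stay above the harmonic bound. No finite value of T_max therefore captures "almost all" of the integral of |h(t)|, which is what requirement (I) demands.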
Here is another example to show that, even though the transform of a function may exist, if
requirements (I) through (IV) are violated, then the transform of the transform does not
necessarily return the original function. We consider another test function,
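$$ z(t) = t^{-1} , \tag{2.10e} $$
$$ \int_0^\infty \frac{dt}{t} = \lim_{\substack{A\to\infty \\ \epsilon\to 0}} \int_\epsilon^A \frac{dt}{t} = \lim_{\substack{A\to\infty \\ \epsilon\to 0}} \left[\ln A - \ln \epsilon\right] = \infty , $$
$$ Z_{\mathcal{S}}(f) = 2\int_0^\infty \frac{\sin(2\pi f t)}{t}\,dt . $$
$$ Z_{\mathcal{S}}(f) = \begin{cases} 0 & \text{for } f = 0 \\ \pi & \text{for } f > 0 \end{cases} . \tag{2.10f} $$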
Therefore, the sine transform Z_𝒮 of z(t) = t^{-1} exists, yet the sine transform of the sine transform
does not return z:
$$ 2\pi\int_0^\infty \sin(2\pi f t)\,df = \lim_{F\to\infty} 2\pi\int_0^F \sin(2\pi f t)\,df = \lim_{F\to\infty} \frac{1}{t}\left[1 - \cos(2\pi F t)\right] \ne \frac{1}{t} . \tag{2.10g} $$
Clearly, if a function violates requirements (I) through (IV) yet has a well-defined sine or
cosine transform, the sine transform of the sine transform and the cosine transform of the cosine
transform must be checked explicitly to confirm that the original function is returned. The only
exception is when the transform itself satisfies (I) through (IV) even though the original test
function does not. Because we could just as easily have started with the transform itself instead of
the original test function, we can conclude that the transform of the transform of the original
function must return the original function. In general, repeatedly applying the sine or cosine
transform just takes us back and forth between the same two functions, and the transformations
are mathematically justified whenever at least one of those functions satisfies requirements (I)
through (IV).
$$ u(-t) = u(t) \tag{2.11a} $$
for all values of t, negative as well as positive; an odd function satisfies the constraint
$$ u(-t) = -u(t) \tag{2.11b} $$
for all values of t, negative as well as positive; and a mixed function is partly even and partly odd
in the sense that it is the sum of an even function and an odd function, neither of which is
identically zero. Any function u(t), whether even, odd, or mixed, can be written as the sum of
two functions, u_e and u_o, with u_e being an even function obeying (2.11a) and u_o being an odd
function obeying (2.11b),
$$ u(t) = u_e(t) + u_o(t) , \tag{2.11c} $$
where
$$ u_e(t) = \frac{1}{2}\left[u(t) + u(-t)\right] \tag{2.11d} $$
and
$$ u_o(t) = \frac{1}{2}\left[u(t) - u(-t)\right] . \tag{2.11e} $$
Clearly,
$$ u_e(-t) = \frac{1}{2}\left[u(-t) + u(t)\right] = \frac{1}{2}\left[u(t) + u(-t)\right] = u_e(t) $$
and
$$ u_o(-t) = \frac{1}{2}\left[u(-t) - u(t)\right] = -\frac{1}{2}\left[u(t) - u(-t)\right] = -u_o(t) . $$
If u starts off as an even function, then u = u_e, and u_o is identically zero; if u starts off as an odd
function, then u = u_o, and u_e is identically zero; and if u starts off as a mixed function, then
Even, Odd, and Mixed Functions · 2.3
neither u_e nor u_o is identically zero. If u is identically zero, it can be regarded as either even or
odd, according to the classifier’s convenience.
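The decomposition in Eqs. (2.11c) through (2.11e) is easy to try out. The sketch below builds the even and odd parts of an arbitrary function and applies them to the exponential, a mixed function whose even and odd parts are the hyperbolic cosine and sine. The helper names `even_part` and `odd_part` are hypothetical.

```python
import math

def even_part(u):
    """Return the function u_e(t) = [u(t) + u(-t)] / 2 of Eq. (2.11d)."""
    return lambda t: 0.5 * (u(t) + u(-t))

def odd_part(u):
    """Return the function u_o(t) = [u(t) - u(-t)] / 2 of Eq. (2.11e)."""
    return lambda t: 0.5 * (u(t) - u(-t))

u_e = even_part(math.exp)   # should match cosh(t)
u_o = odd_part(math.exp)    # should match sinh(t)

for t in (-1.3, 0.0, 0.4, 2.0):
    print(t, u_e(t) - math.cosh(t), u_o(t) - math.sinh(t),
          u_e(t) + u_o(t) - math.exp(t))
```

Every printed difference is zero to within rounding error, confirming that the two parts add back up to the original function. The same helpers return an identically zero odd part for an even input such as `math.cos`, and an identically zero even part for an odd input such as `math.sin`.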
Figures 2.4(a) and 2.4(b) graph examples of even and odd functions respectively, and Fig.
2.4(c) shows a mixed function that is split up into its even and odd parts. We note that cos(2πft)
is an even function of both ƒ and t and sin(2πft) is an odd function of both ƒ and t. One point
worth remembering is that the behavior of even and odd functions is severely constrained near
t = 0. For any odd function at t = 0, we have
$$ u(0) = -u(0) $$
from Eq. (2.11b). Since the only number equal to its own negative value is zero, all odd functions
u(t) that have a well-defined value at t = 0 must be zero at t = 0,
$$ u(0) = 0 . \tag{2.12a} $$
Because u(−t) = u(t) for even functions, when t is near zero the value of u (if u is continuous) is
almost constant. Therefore, when t is exactly zero the derivative of any even function u(t), if it is
well defined, must be zero,
$$ \left.\frac{du}{dt}\right|_{t=0} = 0 \quad \text{if the derivative at zero exists and } u \text{ is even.} \tag{2.12b} $$
$$ \frac{du}{dt} = \lim_{\epsilon\to 0}\left[\frac{u(t+\epsilon) - u(t)}{\epsilon}\right] = \lim_{\epsilon\to 0}\left[\frac{u(-t-\epsilon) - u(-t)}{\epsilon}\right] , $$
This shows that when u is even, the derivative of u is odd, and so from (2.12a), which states that
odd functions are zero when their argument is zero, we know that (2.12b) must be true. Similarly,
for any odd function u,
FIGURE 2.4(a). An even function u(t).
FIGURE 2.4(b). An odd function u(t).
FIGURE 2.4(c). A mixed function u(t) together with its even part u_e(t) and odd part u_o(t), plotted for −2 ≤ t ≤ 2.
showing that when u is odd, its derivative is even. The second derivative $d^2u/dt^2$ of an even function u is the first derivative of $du/dt$, which is odd, and so $d^2u/dt^2$ must be even; similarly, the third derivative $d^3u/dt^3$ is the first derivative of $d^2u/dt^2$, which is even, and so must be odd. Examining in this fashion ever higher derivatives of the even function u, we conclude that
$$\left.\frac{d^nu}{dt^n}\right|_{-t} = (-1)^n\left.\frac{d^nu}{dt^n}\right|_{t} \quad \text{for even } u. \tag{2.12c}$$
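The same reasoning applied to the derivatives of an odd function u shows that
$$\left.\frac{d^nu}{dt^n}\right|_{-t} = (-1)^{n+1}\left.\frac{d^nu}{dt^n}\right|_{t} \quad \text{for odd } u. \tag{2.12d}$$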
Equation (2.12c) states that the odd-numbered derivatives of an even function are odd while the even-numbered derivatives of an even function are even, and Eq. (2.12d) states that the odd-numbered derivatives of an odd function are even while the even-numbered derivatives of an odd function are odd. Therefore, an immediate consequence of (2.12a), (2.12c), and (2.12d) is that the odd-numbered derivatives of an even function, if they exist and are well defined, are zero at $t = 0$, and the even-numbered derivatives of an odd function, if they exist and are well defined, are zero at $t = 0$.
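The even and odd decomposition and the behavior at $t = 0$ can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the test function `u_mixed`, the grid, and the helper name `even_odd_parts` are illustrative choices, not part of the text.

```python
import numpy as np

def even_odd_parts(u, t):
    """Split samples of u into even and odd parts, as in Eqs. (2.11d) and (2.11e)."""
    u_even = 0.5 * (u(t) + u(-t))
    u_odd = 0.5 * (u(t) - u(-t))
    return u_even, u_odd

def u_mixed(t):
    # Hypothetical mixed test function, chosen only for illustration.
    return np.exp(-t**2) * (1.0 + t)

dt = 0.01
t = dt * np.arange(-200, 201)        # symmetric grid with t = 0 at the middle index
mid = len(t) // 2
u_even, u_odd = even_odd_parts(u_mixed, t)

# Central-difference estimates of derivatives at t = 0.
d1_even_at_0 = (u_even[mid + 1] - u_even[mid - 1]) / (2.0 * dt)
d2_odd_at_0 = (u_odd[mid + 1] - 2.0 * u_odd[mid] + u_odd[mid - 1]) / dt**2

# The odd part, the first derivative of the even part, and the second
# derivative of the odd part should all vanish at t = 0 (to rounding error).
print(u_odd[mid], d1_even_at_0, d2_odd_at_0)
```

The central differences are exact statements of symmetry on a symmetric grid, so the printed values test the parity argument rather than the accuracy of the finite-difference formulas.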
(V) Function u(t) must be absolutely integrable,
$$\int_{-\infty}^{\infty} |u(t)|\,dt < \infty. \tag{2.13a}$$
(VI) Function u(t) must be continuous except for a finite number of jump discontinuities over any finite interval $-\infty < a < t < b < \infty$.
(VII) There must exist a finite positive number B such that
$$|u(t)| < B. \tag{2.13b}$$
(VIII) The non-negative variation $V_a^b(u)$ of function u(t) as defined in Eqs. (2.9g) and (2.9h) is finite over any finite interval $-\infty < a < t < b < \infty$,
$$V_a^b(u) < \infty. \tag{2.13c}$$
Extended Sine and Cosine Transforms · 2.4
We also define the value of u at all its jump discontinuities to be given by Eq. (2.9e). These new
requirements are clearly just the old set of requirements extended to cover negative as well as
positive values of t.
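The extended Fourier sine transform of u is
$$\mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt, \tag{2.14a}$$
and the extended Fourier cosine transform of u is
$$\mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt. \tag{2.14b}$$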
Just as in Eqs. (2.8a) and (2.8b), which define the standard sine and cosine transforms, the order of the ft product in the superscript does not matter:
$$\mathcal{S}_E^{(ft)}\big(u(t)\big) = \mathcal{S}_E^{(tf)}\big(u(t)\big)$$
and
$$\mathcal{C}_E^{(ft)}\big(u(t)\big) = \mathcal{C}_E^{(tf)}\big(u(t)\big).$$
We can write u as the sum of even and odd functions, $u(t) = u_e(t) + u_o(t)$, as described in Eq. (2.11c), and substitute this sum into the definitions of the extended sine and cosine transforms in (2.14a) and (2.14b) to get
$$\mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_e(t)\sin(2\pi ft)\,dt + \int_{-\infty}^{\infty} u_o(t)\sin(2\pi ft)\,dt \tag{2.15a}$$
and
$$\mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_e(t)\cos(2\pi ft)\,dt + \int_{-\infty}^{\infty} u_o(t)\cos(2\pi ft)\,dt. \tag{2.15b}$$
We note that the product of an even function $u_e$ and the sine, as well as the product of an odd function $u_o$ and the cosine, must be an odd function,
$$u_e(-t)\sin[2\pi f\cdot(-t)] = -u_e(-t)\sin(2\pi ft) = -u_e(t)\sin(2\pi ft) \tag{2.16a}$$
and
$$u_o(-t)\cos[2\pi f\cdot(-t)] = u_o(-t)\cos(2\pi ft) = -u_o(t)\cos(2\pi ft). \tag{2.16b}$$
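The integral between $-\infty$ and $+\infty$ of any odd function $\phi_o(t)$ can be thought of as the limit of the sum of a large number of small terms,
$$\int_{-\infty}^{\infty}\phi_o(t)\,dt = \lim_{dt\to 0}\big[\cdots + \phi_o(-2dt)\cdot dt + \phi_o(-dt)\cdot dt + \phi_o(0)\cdot dt + \phi_o(dt)\cdot dt + \phi_o(2dt)\cdot dt + \cdots\big].$$
Because $\phi_o$ is odd, $\phi_o(0)$ is zero; $\phi_o(dt)\cdot dt = -\phi_o(-dt)\cdot dt$ and cancels $\phi_o(-dt)\cdot dt$; $\phi_o(2dt)\cdot dt = -\phi_o(-2dt)\cdot dt$ and cancels $\phi_o(-2dt)\cdot dt$; and so on. Therefore,$^{20}$
$$\int_{-\infty}^{\infty}\phi_o(t)\,dt = 0, \tag{2.17}$$
so Eqs. (2.15a) and (2.15b) reduce to
$$\mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_o(t)\sin(2\pi ft)\,dt \tag{2.18a}$$
and
$$\mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_e(t)\cos(2\pi ft)\,dt. \tag{2.18b}$$
Because $\phi_e$ is even, $\phi_e(-dt) = \phi_e(dt)$, $\phi_e(-2dt) = \phi_e(2dt)$, and so on. Therefore, the integral over negative t has the same value as the integral over positive t and we can write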
20
Strictly speaking, we are here treating the integral between $-\infty$ and $+\infty$ as a Cauchy principal value, a concept introduced in Sec. 2.10 below.
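$$\int_{-\infty}^{\infty}\phi_e(t)\,dt = 2\int_{0}^{\infty}\phi_e(t)\,dt. \tag{2.19}$$
The product of $u_o$ and the sine, both of them odd functions, is an even function, and the product of $u_e$ and the cosine, both of them even functions, is another even function. Consequently, the extended sine and cosine transforms in Eqs. (2.18a) and (2.18b) are, according to (2.19), (2.8a), and (2.8b),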
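$$\mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_o(t)\sin(2\pi ft)\,dt = 2\int_{0}^{\infty} u_o(t)\sin(2\pi ft)\,dt = \mathcal{S}^{(ft)}\big(u_o(t)\big) \tag{2.21a}$$
and
$$\mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u_e(t)\cos(2\pi ft)\,dt = 2\int_{0}^{\infty} u_e(t)\cos(2\pi ft)\,dt = \mathcal{C}^{(ft)}\big(u_e(t)\big). \tag{2.21b}$$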
Equation (2.21a) shows that the extended sine transform of a function u(t) is the unextended sine transform of $u_o$, the odd component of u; and Eq. (2.21b) shows that the extended cosine transform of u(t) is the unextended cosine transform of $u_e$, the even component of u. Because the result will be needed later, we also show that the extended sine transform defined in Eq. (2.14a) is an odd function of ƒ,
$$\mathcal{S}_E^{(-ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\sin(-2\pi ft)\,dt = -\int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt = -\mathcal{S}_E^{(ft)}\big(u(t)\big); \tag{2.22a}$$
and a similar manipulation shows that the extended cosine transform defined in (2.14b) is an even function of ƒ,
$$\mathcal{C}_E^{(-ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\cos(-2\pi ft)\,dt = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt = \mathcal{C}_E^{(ft)}\big(u(t)\big). \tag{2.22b}$$
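Equations (2.21a), (2.21b), (2.22a), and (2.22b) can be spot-checked with simple quadrature. The sketch below assumes NumPy; the Gaussian-based test function, the grid, the frequency `f = 0.7`, and all helper names are hypothetical choices made for illustration, and the factor of 2 in the unextended transforms follows the definitions quoted from Eqs. (2.8a) and (2.8b).

```python
import numpy as np

dt = 0.005
t = dt * np.arange(-1600, 1601)           # symmetric grid on [-8, 8]
u = np.exp(-np.pi * t**2) * (1.0 + t)     # hypothetical mixed test function
u_even = 0.5 * (u + u[::-1])              # reversing a symmetric grid samples u(-t)
u_odd = 0.5 * (u - u[::-1])

def trapezoid(y, dx):
    """Trapezoidal rule on a uniform grid."""
    return dx * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def sine_extended(vals, f):               # Eq. (2.14a): integral over all t
    return trapezoid(vals * np.sin(2.0 * np.pi * f * t), dt)

def cosine_extended(vals, f):             # Eq. (2.14b)
    return trapezoid(vals * np.cos(2.0 * np.pi * f * t), dt)

half = t >= 0.0                           # samples on the half-line t >= 0

def sine_unextended(vals, f):             # 2 * integral from 0 to infinity
    return 2.0 * trapezoid(vals[half] * np.sin(2.0 * np.pi * f * t[half]), dt)

def cosine_unextended(vals, f):
    return 2.0 * trapezoid(vals[half] * np.cos(2.0 * np.pi * f * t[half]), dt)

f = 0.7
print(sine_extended(u, f), sine_unextended(u_odd, f))        # Eq. (2.21a): the two agree
print(cosine_extended(u, f), cosine_unextended(u_even, f))   # Eq. (2.21b): the two agree
print(sine_extended(u, -f), cosine_extended(u, -f))          # sign flips for sine only
```

For this particular test function the exact values are known in closed form ($f e^{-\pi f^2}$ for the sine transform and $e^{-\pi f^2}$ for the cosine transform), which gives an independent check on the quadrature.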
We now examine what happens when the extended sine and cosine transforms are applied twice to the same function. We define
$$U_{SE}(f) = \mathcal{S}_E^{(ft)}\big(u(t)\big) = \mathcal{S}^{(ft)}\big(u_o(t)\big) \tag{2.23a}$$
and
$$U_{CE}(f) = \mathcal{C}_E^{(ft)}\big(u(t)\big) = \mathcal{C}^{(ft)}\big(u_e(t)\big), \tag{2.23b}$$
where the second step in Eqs. (2.23a) and (2.23b) comes from (2.21a) and (2.21b). Taking the extended Fourier sine and cosine transforms of $U_{SE}$ and $U_{CE}$ respectively, we get
$$\mathcal{S}_E^{(tf)}\big(U_{SE}(f)\big) = \mathcal{S}_E^{(ft)}\big(U_{SE}(f)\big) = \int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \tag{2.24a}$$
and
$$\mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big) = \mathcal{C}_E^{(ft)}\big(U_{CE}(f)\big) = \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df. \tag{2.24b}$$
The second step in (2.24a) and (2.24b) is there just to emphasize that we are allowed to change the order of the ft product in the superscripts.
Equation (2.22a) shows that the extended sine transform $U_{SE}$ is an odd function of ƒ, so its product with the sine is an even function of ƒ; and Eq. (2.22b) shows that the extended cosine transform $U_{CE}$ is an even function of ƒ, so its product with the cosine is also an even function of ƒ. Hence, according to (2.19), Eqs. (2.24a) and (2.24b) become
$$\mathcal{S}_E^{(tf)}\big(U_{SE}(f)\big) = 2\int_{0}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \tag{2.25a}$$
and
$$\mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big) = 2\int_{0}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df. \tag{2.25b}$$
But Eq. (2.23a) shows that $U_{SE}$ is also the unextended sine transform of $u_o$, so from (2.25a) we see that
$$\mathcal{S}_E^{(tf)}\big(U_{SE}(f)\big)$$
equals the unextended sine transform of the unextended sine transform of $u_o$, the odd component of function u. According to Eqs. (2.8a), (2.8c), and (2.8e), the unextended sine transform of the unextended sine transform returns the original function for positive values of t. This means that the extended sine transform of the extended sine transform,
$$\mathcal{S}_E^{(tf)}\big(U_{SE}(f)\big),$$
which we have just seen to be equal to the unextended sine transform of the unextended sine transform, must return $u_o$ for positive values of t. Consequently, for positive values of t, Eq. (2.25a) becomes
$$\mathcal{S}_E^{(tf)}\big(U_{SE}(f)\big) = 2\int_{0}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df = u_o(t). \tag{2.26a}$$
Function $u_o$ is, however, defined for all values of t according to the rule for odd functions $u_o(-t) = -u_o(t)$, and the integral
$$2\int_{0}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df$$
obeys the same rule,
$$2\int_{0}^{\infty} U_{SE}(f)\sin\big(2\pi f(-t)\big)\,df = -2\int_{0}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df.$$
Consequently, the integral exists and is well defined for negative t whenever the integral exists and is well defined for positive t. We conclude that Eq. (2.26a) holds true for negative as well as positive t. Hence, using Eq. (2.23a) to substitute for $U_{SE}$ in Eq. (2.26a), we can write
$$\mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) = u_o(t). \tag{2.26b}$$
This shows that taking the extended sine transform of the extended sine transform returns the odd component $u_o$ of function u for all values of t, both positive and negative. Switching now to the extended cosine transform $U_{CE}$, we see that Eq. (2.23b) shows the extended cosine transform $U_{CE}$ is also the unextended cosine transform of $u_e$, the even component of function u. From the right-hand side of Eq. (2.25b), we then know that
$$\mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big)$$
is equal to the unextended cosine transform of the unextended cosine transform of $u_e$. Equations (2.8b), (2.8d), and (2.8f) show that the unextended cosine transform of the unextended cosine transform returns the original function for positive values of t. Consequently, the extended cosine transform of the extended cosine transform,
$$\mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big),$$
which we have just seen to be equal to the unextended cosine transform of the unextended cosine transform of $u_e$, must also equal $u_e$ for positive values of t. This means that Eq. (2.25b) becomes (for positive values of t),
$$\mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big) = 2\int_{0}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df = u_e(t). \tag{2.26c}$$
But $u_e(t)$ is defined for negative as well as positive values of t according to the rule $u_e(-t) = u_e(t)$ for even functions of t, and the integral
$$2\int_{0}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df$$
obeys the same rule,
$$2\int_{0}^{\infty} U_{CE}(f)\cos\big(2\pi f(-t)\big)\,df = 2\int_{0}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df.$$
Consequently, the integral exists and is well defined for negative t if it exists and is well defined for positive t. We conclude that Eq. (2.26c) is valid for both negative and positive t and that, substituting Eq. (2.23b) into Eq. (2.26c),
$$\mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) = u_e(t). \tag{2.26d}$$
This shows that taking the extended cosine transform of the extended cosine transform returns $u_e$, the even component of function u, for all values of t both positive and negative. Equations (2.11d) and (2.11e), the original definitions of the even and odd components of a function u, show that Eqs. (2.26b) and (2.26d) can be written as
$$\mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) = \frac{1}{2}\big[u(t) - u(-t)\big] \tag{2.26e}$$
and
$$\mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) = \frac{1}{2}\big[u(t) + u(-t)\big]. \tag{2.26f}$$
Adding together the extended sine transform of the extended sine transform and the extended cosine transform of the extended cosine transform then gives
$$\mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) = \frac{1}{2}\big[u(t) - u(-t)\big] + \frac{1}{2}\big[u(t) + u(-t)\big] = u(t). \tag{2.26g}$$
We conclude that for any function u(t), the sum of the extended sine transform of the extended
sine transform and the extended cosine transform of the extended cosine transform returns the
original function.
One obvious way to proceed from this point is to define the Hartley transform
$$\begin{aligned}
\mathcal{H}^{(ft)}\big(u(t)\big) &= \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,dt \\
&= \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt + \int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt \\
&= \mathcal{C}_E^{(tf)}\big(u(t)\big) + \mathcal{S}_E^{(tf)}\big(u(t)\big) \\
&= U_{CE}(f) + U_{SE}(f),
\end{aligned} \tag{2.26h}$$
where in the next-to-last step we use definitions (2.14a) and (2.14b) of the extended sine and cosine transforms and in the last step Eqs. (2.23a) and (2.23b) are used to write the extended sine and cosine transforms as functions of ƒ. The order of the ft product in the superscript is not important because, just as in the sine and cosine transforms, we have
$$\mathcal{H}^{(ft)}\big(u(t)\big) = \mathcal{H}^{(tf)}\big(u(t)\big).$$
Working with this definition, we see that the Hartley transform of the Hartley transform gives
$$\mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft')}\big(u(t')\big)\Big) = \mathcal{H}^{(tf)}\big(U_{CE}(f) + U_{SE}(f)\big) = \int_{-\infty}^{\infty}\big[U_{CE}(f) + U_{SE}(f)\big]\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,df. \tag{2.26i}$$
According to Eqs. (2.22a) and (2.22b), the extended sine transform $U_{SE}$ is an odd function of ƒ and the extended cosine transform $U_{CE}$ is an even function of ƒ. Using the same reasoning as in Eqs. (2.16a) and (2.16b) above,
$$U_{CE}(-f)\sin[2\pi t\cdot(-f)] = -U_{CE}(-f)\sin(2\pi ft) = -U_{CE}(f)\sin(2\pi ft)$$
and
$$U_{SE}(-f)\cos[2\pi t\cdot(-f)] = U_{SE}(-f)\cos(2\pi ft) = -U_{SE}(f)\cos(2\pi ft).$$
We see that $U_{CE}(f)\sin(2\pi ft)$ and $U_{SE}(f)\cos(2\pi ft)$ are both odd functions of ƒ, and Eq. (2.17) states that the integral between $-\infty$ and $+\infty$ of any odd function is zero. Therefore,
$$\int_{-\infty}^{\infty} U_{CE}(f)\sin(2\pi ft)\,df = \int_{-\infty}^{\infty} U_{SE}(f)\cos(2\pi ft)\,df = 0.$$
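Now the Hartley transform of the Hartley transform in Eq. (2.26i) can be simplified to
$$\begin{aligned}
\mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft')}\big(u(t')\big)\Big) &= \int_{-\infty}^{\infty}\big[U_{CE}(f) + U_{SE}(f)\big]\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,df \\
&= \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df + \int_{-\infty}^{\infty} U_{CE}(f)\sin(2\pi ft)\,df \\
&\quad + \int_{-\infty}^{\infty} U_{SE}(f)\cos(2\pi ft)\,df + \int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \\
&= \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df + \int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \\
&= \mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big) + \mathcal{S}_E^{(tf)}\big(U_{SE}(f)\big).
\end{aligned}$$
Using Eqs. (2.23a) and (2.23b) to substitute for $U_{CE}$ and $U_{SE}$ gives
$$\mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft')}\big(u(t')\big)\Big) = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big),$$
so that, from Eq. (2.26g),
$$\mathcal{H}^{(tf)}\Big(\mathcal{H}^{(ft')}\big(u(t')\big)\Big) = u(t). \tag{2.26j}$$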
We see that the Hartley transform of the Hartley transform returns the original function for both
positive and negative values of t. The Hartley transform was never very popular and is only rarely
encountered today. What is done instead, as we shall see in the next section, is to combine the
extended sine and cosine transforms into a single Fourier transform based on a complex
exponential.
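The self-inverse property in Eq. (2.26j) can be illustrated numerically. This is a rough sketch, assuming NumPy and a rapidly decaying test function so that a plain Riemann sum on a truncated grid is accurate; the function name `hartley`, the grids, and the test function are hypothetical.

```python
import numpy as np

def hartley(vals, x, y):
    """Riemann-sum estimate of Eq. (2.26h): integral of vals(x) * [cos(2*pi*x*y) + sin(2*pi*x*y)] over x."""
    dx = x[1] - x[0]
    phase = 2.0 * np.pi * np.outer(y, x)
    return (np.cos(phase) + np.sin(phase)) @ vals * dx

grid = 0.01 * np.arange(-600, 601)        # symmetric grid on [-6, 6], used for both t and f
t, f = grid, grid
u = np.exp(-np.pi * t**2) * (1.0 + t)     # hypothetical mixed (neither even nor odd) test function

U = hartley(u, t, f)                      # Hartley transform, a function of f
u_back = hartley(U, f, t)                 # Hartley transform of the Hartley transform

print(np.max(np.abs(u_back - u)))         # small: the original function is recovered for all t
```

The same kernel is used in both directions, which is the practical appeal of the Hartley transform: no separate inverse needs to be defined.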
where $i = \sqrt{-1}$.
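For any real function u(t) satisfying requirements (V) through (VIII) in Sec. 2.4, we can add the extended cosine transform to i times the extended sine transform to get
$$\mathcal{C}_E^{(ft)}\big(u(t)\big) + i\cdot\mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + i\sin(2\pi ft)\big]\,dt = \int_{-\infty}^{\infty} e^{2\pi ift}u(t)\,dt. \tag{2.28a}$$
Since
$$\mathcal{C}_E^{(ft)}\big(u(t)\big) = U_{CE}(f) \quad\text{and}\quad \mathcal{S}_E^{(ft)}\big(u(t)\big) = U_{SE}(f),$$
this becomes
$$\int_{-\infty}^{\infty} e^{2\pi ift}u(t)\,dt = U_{CE}(f) + iU_{SE}(f). \tag{2.28b}$$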
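Taking the extended sine transform of both sides of Eq. (2.28b) gives
$$\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \int_{-\infty}^{\infty} U_{CE}(f)\sin(2\pi ft)\,df + i\int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df = i\int_{-\infty}^{\infty} U_{SE}(f)\sin(2\pi ft)\,df \tag{2.28c}$$
because $U_{CE}(f)\sin(2\pi ft)$ is an odd function of ƒ and integrates to zero [see discussion after Eq. (2.26i) above]. Taking the extended cosine transform of both sides of Eq. (2.28b) gives
$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df + i\int_{-\infty}^{\infty} U_{SE}(f)\cos(2\pi ft)\,df = \int_{-\infty}^{\infty} U_{CE}(f)\cos(2\pi ft)\,df \tag{2.28d}$$
because $U_{SE}(f)\cos(2\pi ft)$ is likewise an odd function of ƒ.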
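In the notation of the extended sine and cosine transforms,
$$\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = i\cdot\mathcal{S}_E^{(tf)}\big(U_{SE}(f)\big) \tag{2.28e}$$
and
$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\big(U_{CE}(f)\big). \tag{2.28f}$$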
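Substituting for $U_{SE}$ and $U_{CE}$ from Eqs. (2.23a) and (2.23b) gives
$$\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = i\cdot\mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) \tag{2.28g}$$
and
$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big). \tag{2.28h}$$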
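We now multiply both sides of (2.28g) by $(-i)$ and sum the resulting equation with Eq. (2.28h) to get
$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') - i\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big)$$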
Forward and Inverse Fourier Transforms · 2.5
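or
$$\int_{-\infty}^{\infty} df\,e^{-2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big). \tag{2.28i}$$
According to Eq. (2.26g), the right-hand side is just u(t), so
$$\int_{-\infty}^{\infty} df\,e^{-2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{2\pi ift'}u(t') = u(t). \tag{2.28j}$$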
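If, in Eq. (2.28a), we start out by adding the extended cosine transform to $(-i)$ times the extended sine transform, then instead of Eqs. (2.28g) and (2.28h), we get [just replace i by $(-i)$ everywhere]
$$\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') = -i\cdot\mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big)$$
and
$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big).$$
Now we must multiply the top equation by i before summing it with the bottom equation to get
$$\int_{-\infty}^{\infty} df\,\cos(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') + i\int_{-\infty}^{\infty} df\,\sin(2\pi ft)\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') = \mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big)$$
or
$$\int_{-\infty}^{\infty} df\,e^{2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{-2\pi ift'}u(t') = u(t). \tag{2.28k}$$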
Clearly, Eqs. (2.28j) and (2.28k) are basically the same identity, which can be written as
$$\int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}u(t') = u(t). \tag{2.28$\ell$}$$
As long as the exponent of e changes sign in the two integrals over ƒ and t, we get back the original function. Looking at how Eqs. (2.28j) and (2.28k) are derived, we see that if the sign of the exponent does not change, we get
$$\mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) - \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big)$$
instead of
$$\mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) + \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big).$$
Equations (2.26e) and (2.26f) then show that
$$\mathcal{C}_E^{(tf)}\Big(\mathcal{C}_E^{(ft')}\big(u(t')\big)\Big) - \mathcal{S}_E^{(tf)}\Big(\mathcal{S}_E^{(ft')}\big(u(t')\big)\Big) = u(-t),$$
which gives
$$\int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\pm 2\pi ift'}u(t') = u(-t). \tag{2.28m}$$
This interesting result shows that when u is even so that $u(-t) = u(t)$, we still get back the original function, and when u is odd so that $u(-t) = -u(t)$, we just have to multiply by $(-1)$ to retrieve u. Even when u is mixed, no information is lost; reversing the sign of the argument still gets us back to the original function. Replacing t by $-t$ in (2.28m) takes us back to the original formula (2.28ℓ).
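The difference between changing and not changing the sign of the exponent can be seen numerically. The sketch below assumes NumPy and approximates both integrals by Riemann sums on truncated grids; `fourier_integral`, the grids, and the test function are hypothetical names and choices for illustration.

```python
import numpy as np

def fourier_integral(vals, x, y, sign):
    """Riemann-sum estimate of the integral of vals(x) * exp(sign * 2*pi*i*x*y) over x, for each y."""
    dx = x[1] - x[0]
    kernel = np.exp(sign * 2j * np.pi * np.outer(y, x))
    return kernel @ vals * dx

grid = 0.01 * np.arange(-600, 601)         # shared symmetric grid for t and f (an arbitrary choice)
t, f = grid, grid
u = np.exp(-np.pi * t**2) * (1.0 + t)      # hypothetical real, mixed test function

U = fourier_integral(u, t, f, -1)          # inner integral with exponent -2*pi*i*f*t'
opposite = fourier_integral(U, f, t, +1)   # exponent changes sign: Eq. (2.28l)
same = fourier_integral(U, f, t, -1)       # exponent keeps its sign: Eq. (2.28m)

print(np.max(np.abs(opposite - u)))        # small: recovers u(t)
print(np.max(np.abs(same - u[::-1])))      # small: recovers u(-t), the reversed samples
```

Because the grid is symmetric about zero, reversing the sample array is the discrete counterpart of replacing t by $-t$.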
Up to this point, we have taken u to be real, but if Eq. (2.28ℓ) holds true when u is a real function of a real argument, it must also hold true when u is a complex function of a real argument. To show why this is so, we break complex functions u(t) of a real argument t into real and imaginary parts,
$$u(t) = u_r(t) + iu_i(t),$$
where $u_r$ and $u_i$ are both real functions of t. Substituting this complex-valued u(t) into the left-hand side of (2.28ℓ) gives
$$\int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}\big[u_r(t') + iu_i(t')\big] = \int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}u_r(t') + i\int_{-\infty}^{\infty} df\,e^{\pm 2\pi ift}\int_{-\infty}^{\infty} dt'\,e^{\mp 2\pi ift'}u_i(t').$$
Since (2.28ℓ) holds for real functions $u_r$ and $u_i$, this last expression must be equal to the original complex function u,
$$u_r(t) + iu_i(t) = u(t),$$
showing that Eq. (2.28ℓ) is true for complex functions of t as well as strictly real functions of t. Similar reasoning shows that (2.28m) also holds true for complex functions of real variables. Indeed, we can even apply this analysis to the unextended sine and cosine transforms to show that the unextended sine transform of the unextended sine transform and the unextended cosine transform of the unextended cosine transform return the original function (for positive values of the argument) when the original function is complex.
We now define the Fourier transform of a complex function u with real argument t to be
$$\mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)e^{-2\pi ift}\,dt. \tag{2.29a}$$
The notation for F introduced in (2.29a) explicitly shows that t, being repeated inside both upper and lower parentheses, is the dummy variable of integration; and that F produces a function of ƒ because ƒ is only listed in the upper parentheses. We call (2.29a) the forward Fourier transform and, when convenient, follow the custom of writing it with the upper-case letter of the transformed function,
$$U(f) = \int_{-\infty}^{\infty} u(t)e^{-2\pi ift}\,dt. \tag{2.29b}$$
The inverse Fourier transform of U(f) is
$$\mathcal{F}^{(itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)e^{2\pi ift}\,df. \tag{2.29c}$$
In both the forward and inverse transform the order of the tf product in the superscript is irrelevant, just as it is for the sine, cosine, and Hartley transforms,
$$\mathcal{F}^{(-ift)} = \mathcal{F}^{(-itf)} \quad\text{and}\quad \mathcal{F}^{(itf)} = \mathcal{F}^{(ift)}.$$
What is important is the sign inside the superscript, since it determines whether the forward or inverse transform is being performed. Equation (2.28ℓ) shows, of course, that
$$u(t) = \mathcal{F}^{(itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)e^{2\pi ift}\,df = \mathcal{F}^{(itf)}\Big(\mathcal{F}^{(-ift')}\big(u(t')\big)\Big). \tag{2.29d}$$
It is entirely a matter of convention which Fourier transform is called the forward transform and which is called the inverse transform; all that matters is for (2.28ℓ) to be satisfied. Some authors change the sign of the exponent $-2\pi ift$, defining the forward Fourier transform to be $\mathcal{F}^{(ift)}$,
$$\mathcal{F}^{(ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)e^{2\pi ift}\,dt,$$
and the inverse Fourier transform to be
$$\mathcal{F}^{(-itf)}\big(U(f)\big) = \int_{-\infty}^{\infty} U(f)e^{-2\pi ift}\,df.$$
Clearly, this convention also satisfies (2.28ℓ), with the inverse Fourier transform of the forward Fourier transform still returning the original function.
In physics and related disciplines, the frequency variable is often changed to $\omega = 2\pi f$, so that (2.28ℓ) becomes
$$\frac{1}{2\pi}\int_{-\infty}^{\infty} d\omega\,e^{\pm i\omega t}\int_{-\infty}^{\infty} dt'\,e^{\mp i\omega t'}u(t') = u(t). \tag{2.30a}$$
Authors using the frequency variable ω allocate the factor of $1/(2\pi)$ different ways when defining the forward and inverse Fourier transforms in terms of ω, with all reasonable possibilities chosen at one time or another:
$$\text{Forward Fourier transform of } u(t) \text{ is } \int_{-\infty}^{\infty} u(t)e^{\mp i\omega t}\,dt = U(\omega), \tag{2.30b}$$
$$\text{Inverse Fourier transform of } U(\omega) \text{ is } \frac{1}{2\pi}\int_{-\infty}^{\infty} U(\omega)e^{\pm i\omega t}\,d\omega,$$
$$\text{Forward Fourier transform of } u(t) \text{ is } \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} u(t)e^{\mp i\omega t}\,dt = U(\omega), \tag{2.30c}$$
$$\text{Inverse Fourier transform of } U(\omega) \text{ is } \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} U(\omega)e^{\pm i\omega t}\,d\omega,$$
$$\text{Forward Fourier transform of } u(t) \text{ is } \frac{1}{2\pi}\int_{-\infty}^{\infty} u(t)e^{\mp i\omega t}\,dt = U(\omega), \tag{2.30d}$$
$$\text{Inverse Fourier transform of } U(\omega) \text{ is } \int_{-\infty}^{\infty} U(\omega)e^{\pm i\omega t}\,d\omega.$$
In each of the three pairs of definitions listed above, the plus and minus signs are synchronized; so if the top (bottom) sign is chosen for the first member of the pair then the top (bottom) sign must also be chosen for the second member of the pair. This gives a total of six different ways of defining the forward and inverse Fourier transforms, and all six satisfy Eq. (2.30a).
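All six conventions can be exercised with one small routine that takes the two scale factors and the sign as parameters. This is a sketch under stated assumptions: NumPy is available, the integrals are approximated by Riemann sums on truncated grids, and the names `roundtrip`, `allocations`, `w` (for ω), and the test function are hypothetical.

```python
import numpy as np

t = 0.02 * np.arange(-500, 501)            # t in [-10, 10]
w = 0.025 * np.arange(-480, 481)           # omega in [-12, 12]
dt, dw = t[1] - t[0], w[1] - w[0]
u = np.exp(-0.5 * t**2) * (1.0 + t)        # hypothetical test function

def roundtrip(forward_scale, inverse_scale, sign):
    """Forward transform with exp(sign*i*w*t), then inverse transform with exp(-sign*i*w*t)."""
    U = forward_scale * (np.exp(sign * 1j * np.outer(w, t)) @ u) * dt
    return inverse_scale * (np.exp(-sign * 1j * np.outer(t, w)) @ U) * dw

two_pi = 2.0 * np.pi
allocations = [
    (1.0, 1.0 / two_pi),                              # Eq. (2.30b)
    (1.0 / np.sqrt(two_pi), 1.0 / np.sqrt(two_pi)),   # Eq. (2.30c)
    (1.0 / two_pi, 1.0),                              # Eq. (2.30d)
]
errors = [np.max(np.abs(roundtrip(a, b, s) - u))
          for (a, b) in allocations for s in (-1, +1)]
print(errors)   # six small numbers, one per convention
```

Only the product of the two scale factors, $1/(2\pi)$, and the opposite signs of the two exponents matter for the round trip, which is why every allocation recovers u.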
The unextended sine and cosine transforms, usually called just the sine and cosine transforms, can also be defined in many different ways. Equations (2.8a), (2.8c), (2.8e), and (2.8b), (2.8d), (2.8f) can be combined to write
$$4\int_{0}^{\infty} df\,\sin(2\pi ft)\int_{0}^{\infty} dt'\,u(t')\sin(2\pi ft') = u(t) \quad\text{for } t > 0 \tag{2.31a}$$
and
$$4\int_{0}^{\infty} df\,\cos(2\pi ft)\int_{0}^{\infty} dt'\,u(t')\cos(2\pi ft') = u(t) \quad\text{for } t > 0. \tag{2.31b}$$
Changing the frequency variable to $\omega = 2\pi f$, so that $df = d\omega/(2\pi)$, these become
$$\frac{2}{\pi}\int_{0}^{\infty} d\omega\,\sin(\omega t)\int_{0}^{\infty} dt'\,u(t')\sin(\omega t') = u(t) \quad\text{for } t > 0 \tag{2.31c}$$
and
$$\frac{2}{\pi}\int_{0}^{\infty} d\omega\,\cos(\omega t)\int_{0}^{\infty} dt'\,u(t')\cos(\omega t') = u(t) \quad\text{for } t > 0. \tag{2.31d}$$
Just like the factor of $1/(2\pi)$ in Eq. (2.30a), the factor of $2/\pi$ in (2.31c) and (2.31d) can be allocated three different ways when defining the forward and inverse sine and cosine transforms:
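$$\text{Forward sine transform of } u(t) \text{ for } t > 0 \text{ is } \int_{0}^{\infty} u(t)\sin(\omega t)\,dt = U_S(\omega), \tag{2.31e}$$
$$\text{Forward cosine transform of } u(t) \text{ for } t > 0 \text{ is } \int_{0}^{\infty} u(t)\cos(\omega t)\,dt = U_C(\omega),$$
$$\text{Inverse sine transform of } U_S(\omega) \text{ is } \frac{2}{\pi}\int_{0}^{\infty} U_S(\omega)\sin(\omega t)\,d\omega = u(t) \text{ for } t > 0,$$
$$\text{Inverse cosine transform of } U_C(\omega) \text{ is } \frac{2}{\pi}\int_{0}^{\infty} U_C(\omega)\cos(\omega t)\,d\omega = u(t) \text{ for } t > 0,$$
$$\text{Forward sine transform of } u(t) \text{ for } t > 0 \text{ is } \sqrt{\frac{2}{\pi}}\int_{0}^{\infty} u(t)\sin(\omega t)\,dt = U_S(\omega), \tag{2.31f}$$
$$\text{Forward cosine transform of } u(t) \text{ for } t > 0 \text{ is } \sqrt{\frac{2}{\pi}}\int_{0}^{\infty} u(t)\cos(\omega t)\,dt = U_C(\omega),$$
$$\text{Inverse sine transform of } U_S(\omega) \text{ is } \sqrt{\frac{2}{\pi}}\int_{0}^{\infty} U_S(\omega)\sin(\omega t)\,d\omega = u(t) \text{ for } t > 0,$$
$$\text{Inverse cosine transform of } U_C(\omega) \text{ is } \sqrt{\frac{2}{\pi}}\int_{0}^{\infty} U_C(\omega)\cos(\omega t)\,d\omega = u(t) \text{ for } t > 0,$$
$$\text{Forward sine transform of } u(t) \text{ for } t > 0 \text{ is } \frac{2}{\pi}\int_{0}^{\infty} u(t)\sin(\omega t)\,dt = U_S(\omega), \tag{2.31g}$$
$$\text{Forward cosine transform of } u(t) \text{ for } t > 0 \text{ is } \frac{2}{\pi}\int_{0}^{\infty} u(t)\cos(\omega t)\,dt = U_C(\omega),$$
$$\text{Inverse sine transform of } U_S(\omega) \text{ is } \int_{0}^{\infty} U_S(\omega)\sin(\omega t)\,d\omega = u(t) \text{ for } t > 0,$$
$$\text{Inverse cosine transform of } U_C(\omega) \text{ is } \int_{0}^{\infty} U_C(\omega)\cos(\omega t)\,d\omega = u(t) \text{ for } t > 0.$$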
The reader should expect to encounter all three classes of definitions given in (2.31e)–(2.31g). The symmetric definitions in (2.31f) are the most popular, probably because they remove the distinction between the forward and inverse transform, letting us say that the sine transform of the sine transform and the cosine transform of the cosine transform return the original function for $t > 0$.
In today’s optical-engineering textbooks, and in user manuals for the fast Fourier transform, there is a tendency to choose Eqs. (2.29a)–(2.29d) as the definitions of the forward and inverse Fourier transform, and that is the convention followed here. It is perhaps somewhat unconventional not to use the frequency variable $\omega = 2\pi f$ when defining the sine and cosine transforms, but using ƒ rather than ω brings their definitions into conformity with the definitions chosen for the forward and inverse Fourier transforms.
Fourier Transform as a Linear Operation · 2.6
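$$L_1\big(u(t)\big) = g(t)\cdot u(t),$$
$$L_3\big(u(t)\big) = \int_{t_1}^{t_2} u(t)\,dt.$$
$$L_2\big(u(t) + v(t)\big) = \frac{du(t)}{dt} + \frac{dv(t)}{dt} = L_2\big(u(t)\big) + L_2\big(v(t)\big),$$
and
$$L_3\big(u(t) + v(t)\big) = \int_{t_1}^{t_2} u(t)\,dt + \int_{t_1}^{t_2} v(t)\,dt = L_3\big(u(t)\big) + L_3\big(v(t)\big).$$
Combinations of linear operators are always linear; for example, the operator Z defined by
$$Z\big(u(t)\big) = L_3\Big(L_1\big(u(t)\big)\Big)$$
must be linear because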
$$\mathcal{F}^{(-ift)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)e^{-2\pi ift}\,dt$$
as defined in Eq. (2.29a) is, in fact, just $L_3\big(L_1(u(t))\big)$ with $g(t) = e^{-2\pi ift}$ in the $L_1$ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the $L_3$ integration. Similarly, the inverse Fourier transform is, interchanging the roles of the ƒ and t variables in Eq. (2.29b),
$$\mathcal{F}^{(ift)}\big(U(t)\big) = \int_{-\infty}^{\infty} U(t)e^{2\pi ift}\,dt,$$
which is again an $L_3(L_1(\cdot))$ combination, now with $g(t) = e^{2\pi ift}$. The unextended and extended sine transforms in Eqs. (2.8a) and (2.14a),
$$\mathcal{S}^{(ft)}\big(u(t)\big) = 2\int_{0}^{\infty} u(t)\sin(2\pi ft)\,dt \quad\text{and}\quad \mathcal{S}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\sin(2\pi ft)\,dt,$$
are also both $L_3\big(L_1(u(t))\big)$: the unextended sine transform has $g(t) = 2\sin(2\pi ft)$ in the $L_1$ multiplication and $t_1 = 0$, $t_2 = \infty$ in the $L_3$ integration; and the extended sine transform has $g(t) = \sin(2\pi ft)$ in the $L_1$ multiplication and $t_1 = -\infty$, $t_2 = \infty$ in the $L_3$ integration. The unextended and extended cosine transforms in Eqs. (2.8b) and (2.14b),
$$\mathcal{C}^{(ft)}\big(u(t)\big) = 2\int_{0}^{\infty} u(t)\cos(2\pi ft)\,dt \quad\text{and}\quad \mathcal{C}_E^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt,$$
are, of course, identical to the unextended and extended sine transforms in being $L_3\big(L_1(u(t))\big)$; the only change is that the sines change to cosines in the $L_1$ multiplications. From Eq. (2.32b), all four transforms (the extended sine transform, the unextended sine transform, the extended cosine transform, and the unextended cosine transform) are linear operations. We see that the only other transform discussed so far, the Hartley transform
$$\mathcal{H}^{(ft)}\big(u(t)\big) = \int_{-\infty}^{\infty} u(t)\big[\cos(2\pi ft) + \sin(2\pi ft)\big]\,dt$$
We have already seen that the inverse Fourier transform of U(f) returns the original function,
$$\int_{-\infty}^{\infty} U(f)e^{2\pi ift}\,df = \mathcal{F}^{(itf)}\big(U(f)\big) = u(t). \tag{2.33b}$$
Interchanging the roles of ƒ and t in (2.33b) and then changing the sign of ƒ gives
$$u(-f) = \mathcal{F}^{(-ift)}\big(U(t)\big), \tag{2.33c}$$
which shows that $u(-f)$ is the forward Fourier transform of U(t). We expect, then, that U(t) is the inverse Fourier transform of $u(-f)$. To show this is true, we interchange the roles of variables ƒ and t in (2.33a) and then make $f' = -f$ the new variable of integration to get
$$U(t) = \mathcal{F}^{(-itf)}\big(u(f)\big) = \int_{-\infty}^{\infty} u(f)e^{-2\pi ift}\,df = \int_{-\infty}^{\infty} u(-f')e^{2\pi if't}\,df' = \int_{-\infty}^{\infty} u(-f)e^{2\pi ift}\,df = \mathcal{F}^{(itf)}\big(u(-f)\big). \tag{2.33d}$$
Not only does this show that U(t) is the inverse Fourier transform of $u(-f)$ but also, by comparing the two expressions involving the F operator, we see that changing the sign of the integration variable ƒ does not change the value of the Fourier operation F. It does, however, change its name: the first F operation in (2.33d) is the forward Fourier transform of u(f) and the second F operation in (2.33d) is the inverse Fourier transform of $u(-f)$. Taking the complex conjugate of all three expressions in Eq. (2.33b) gives
$$\int_{-\infty}^{\infty} U(f)^* e^{-2\pi ift}\,df = \mathcal{F}^{(-itf)}\big(U(f)^*\big) = u(t)^*,$$
which shows that we get the complex conjugate of operator F by taking the complex conjugates of the quantities inside both parentheses. Starting with the original Fourier transform relationship between U and u,
$$U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big) \tag{2.33e}$$
and
$$u(t) = \mathcal{F}^{(itf)}\big(U(f)\big), \tag{2.33f}$$
we take the complex conjugate of (2.33e) to get
$$U(f)^* = \mathcal{F}^{(ift)}\big(u(t)^*\big),$$
and then change the sign of ƒ to get
$$U(-f)^* = \mathcal{F}^{(-ift)}\big(u(t)^*\big). \tag{2.33g}$$
This shows that $U(-f)^*$ is the forward Fourier transform of $u(t)^*$. Since $U(-f)^*$ is the forward Fourier transform of $u(t)^*$, we expect the inverse Fourier transform of $U(-f)^*$ to be $u(t)^*$. To show this is true, we just change the sign of the integration variable in the complex conjugate of Eq. (2.33f),
$$u(t)^* = \mathcal{F}^{(-itf)}\big(U(f)^*\big),$$
Mathematical Symmetries of the Fourier Transform · 2.7
$$u(t)^* = \mathcal{F}^{(itf)}\big(U(-f)^*\big). \tag{2.33h}$$
If u(t) is real, so that $u(t)^* = u(t)$, then
$$\mathcal{F}^{(-ift)}\big(u(t)^*\big) = \mathcal{F}^{(-ift)}\big(u(t)\big),$$
and Eq. (2.33g) becomes
$$U(-f)^* = U(f)$$
or
$$U(-f) = U(f)^*. \tag{2.34a}$$
Functions U(f) that obey Eq. (2.34a) are called Hermitian. If u(t) is purely imaginary, so that $u(t)^* = -u(t)$, then Eq. (2.33g) becomes
$$U(-f)^* = \mathcal{F}^{(-ift)}\big(-u(t)\big)$$
or
$$\mathcal{F}^{(-ift)}\big(u(t)\big) = -U(-f)^*, \tag{2.34b}$$
where the linearity of F is used to take $(-1)$ outside the transform and shift it over to the other side of the equation. Since $\mathcal{F}^{(-ift)}\big(u(t)\big)$ is just U(f), Eq. (2.34b) shows that
$$U(f) = -U(-f)^*$$
or
$$U(-f) = -U(f)^* \tag{2.34c}$$
when u is purely imaginary. Functions U(f) that obey Eq. (2.34c) are called anti-Hermitian. A special and very important case occurs when u is both real and even. Then, since U is the forward
Fourier transform of u with $U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big)$, we take the complex conjugate of both sides to get
$$U(f)^* = \mathcal{F}^{(ift)}\big(u(t)^*\big).$$
Because u is real this becomes, changing the sign of the variable of integration and using the evenness of u,
$$U(f)^* = \mathcal{F}^{(-ift)}\big(u(-t)\big) = U(f)$$
so that
$$U(f)^* = U(f). \tag{2.34d}$$
Hence, U equals its own complex conjugate, which shows it must be real. Because u is real, we already know that U is Hermitian and (2.34a) must hold true; now that U is known to be real, Eq. (2.34a) can be written as
$$U(-f) = U(f). \tag{2.34e}$$
This shows that U must be real and even when u is real and even. Taking the real part of Eq. (2.33a) now gives, since both U and u are known to be real,
$$U(f) = \mathrm{Re}\left(\int_{-\infty}^{\infty} u(t)e^{-2\pi ift}\,dt\right) = \int_{-\infty}^{\infty} u(t)\,\mathrm{Re}\big(e^{-2\pi ift}\big)\,dt,$$
or
$$U(f) = \int_{-\infty}^{\infty} u(t)\cos(2\pi ft)\,dt. \tag{2.34f}$$
Because u(t) is also even, we know that the product $u(t)\cos(2\pi ft)$ is even with respect to t, which means that (2.34f) can be written as [see formula (2.19) above]
$$U(f) = 2\int_{0}^{\infty} u(t)\cos(2\pi ft)\,dt. \tag{2.34g}$$
The right-hand side is the unextended cosine transform of u, showing that when u(t) is real and even, its Fourier transform equals its cosine transform. According to Eq. (2.8f), it follows that u must then be the cosine transform of U,
$$u(t) = 2\int_{0}^{\infty} U(f)\cos(2\pi ft)\,df. \tag{2.34h}$$
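The symmetry relations (2.34a), (2.34c), (2.34d), (2.34e), and (2.34g) lend themselves to a quick numerical check. The sketch below assumes NumPy and a Riemann-sum approximation of Eq. (2.29a); the helper `forward`, the grids, the sample index `k`, and the test functions are hypothetical.

```python
import numpy as np

grid = 0.01 * np.arange(-600, 601)          # symmetric grid shared by t and f
t, f = grid, grid
dt = t[1] - t[0]

def forward(vals):
    """Riemann-sum estimate of the forward transform, Eq. (2.29a), on the f grid."""
    return np.exp(-2j * np.pi * np.outer(f, t)) @ vals * dt

u_real = np.exp(-np.pi * t**2) * (1.0 + t)  # real, mixed test function
u_even = np.exp(-np.pi * t**2)              # real and even test function

U = forward(u_real)
hermitian_err = np.max(np.abs(U[::-1] - np.conj(U)))        # Eq. (2.34a): U(-f) = U(f)*
V = forward(1j * u_real)                                    # purely imaginary input
anti_hermitian_err = np.max(np.abs(V[::-1] + np.conj(V)))   # Eq. (2.34c): U(-f) = -U(f)*
W = forward(u_even)
imag_part = np.max(np.abs(W.imag))                          # Eq. (2.34d): W is real
even_err = np.max(np.abs(W - W[::-1]))                      # Eq. (2.34e): W is even

# Eq. (2.34g): for real, even u the Fourier transform equals the cosine transform.
k = 650                                                     # index of f = 0.5 on this grid
half = t >= 0.0
g = u_even[half] * np.cos(2.0 * np.pi * f[k] * t[half])
cosine_at_k = 2.0 * dt * (np.sum(g) - 0.5 * (g[0] + g[-1])) # trapezoidal rule on t >= 0
cosine_err = abs(W[k] - cosine_at_k)

print(hermitian_err, anti_hermitian_err, imag_part, even_err, cosine_err)
```

Reversing the array of samples on the symmetric f grid stands in for evaluating the transform at $-f$.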
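$$U(f)\Big|_{f=0} = \left[\int_{-\infty}^{\infty} u(t)e^{\mp 2\pi ift}\,dt\right]_{f=0}$$
or
$$U(0) = \int_{-\infty}^{\infty} u(t)\,dt. \tag{2.35a}$$
Similarly,
$$u(t)\Big|_{t=0} = \left[\int_{-\infty}^{\infty} U(f)e^{\pm 2\pi ift}\,df\right]_{t=0}$$
or
$$u(0) = \int_{-\infty}^{\infty} U(f)\,df. \tag{2.35b}$$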
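When U(f) is the forward Fourier transform of u(t), the nth derivative of U is
$$\frac{d^nU}{df^n} = \frac{\partial^n}{\partial f^n}\int_{-\infty}^{\infty} u(t)e^{-2\pi ift}\,dt = \int_{-\infty}^{\infty} (-2\pi it)^n u(t)e^{-2\pi ift}\,dt \tag{2.35c}$$
and, because Eqs. (2.29a) and (2.29d) require u to be the inverse transform of U when U is the forward transform of u, the nth derivative of u is
$$\frac{d^nu}{dt^n} = \frac{\partial^n}{\partial t^n}\int_{-\infty}^{\infty} U(f)e^{2\pi ift}\,df = \int_{-\infty}^{\infty} (2\pi if)^n U(f)e^{2\pi ift}\,df. \tag{2.35d}$$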
Therefore, when both u and $d^nu/dt^n$ satisfy requirements (V) through (VIII) in Sec. 2.4 and U(f) is the forward Fourier transform of u(t), Eq. (2.35d) shows that $[(2\pi i)^n f^n U(f)]$ must be the forward Fourier transform of $d^nu/dt^n$ because $d^nu/dt^n$ is the inverse Fourier transform of $[(2\pi i)^n f^n U(f)]$. Equation (2.35c) similarly shows that when u(t) and $[t^n u(t)]$ satisfy requirements (V) through (VIII) in Sec. 2.4 and U(f) is the forward Fourier transform of u(t), the forward Fourier transform of $[t^n u(t)]$ is
$$\frac{1}{(-2\pi i)^n}\frac{d^nU}{df^n}.$$
Using a double-headed arrow to connect a function with its forward Fourier transform, these two results are
$$\frac{d^nu}{dt^n} \leftrightarrow (2\pi i)^n f^n U(f) \tag{2.35e}$$
and
$$t^n u(t) \leftrightarrow \frac{1}{(-2\pi i)^n}\frac{d^nU}{df^n}. \tag{2.35f}$$
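The derivative identities (2.35e) and (2.35f) can be checked for $n = 1$ with a few lines of code. This is a sketch assuming NumPy; the helper `forward`, the grids, and the Gaussian test function are hypothetical, the derivative of u is entered analytically, and the derivative of U is estimated with `np.gradient`, which limits the accuracy of the second check.

```python
import numpy as np

grid = 0.01 * np.arange(-600, 601)         # symmetric grid shared by t and f
t, f = grid, grid
dt = t[1] - t[0]

def forward(vals):
    """Riemann-sum estimate of the forward transform, Eq. (2.29a), on the f grid."""
    return np.exp(-2j * np.pi * np.outer(f, t)) @ vals * dt

u = np.exp(-np.pi * t**2)                  # hypothetical test function
du_dt = -2.0 * np.pi * t * u               # its exact first derivative
U = forward(u)

# Eq. (2.35e) with n = 1: the transform of du/dt is (2*pi*i) * f * U(f).
lhs_e = forward(du_dt)
rhs_e = 2j * np.pi * f * U

# Eq. (2.35f) with n = 1: the transform of t*u(t) is (1 / (-2*pi*i)) * dU/df.
dU_df = np.gradient(U, f)                  # finite-difference estimate of dU/df
lhs_f = forward(t * u)
rhs_f = dU_df / (-2j * np.pi)

print(np.max(np.abs(lhs_e - rhs_e)))       # near rounding error
print(np.max(np.abs(lhs_f - rhs_f)))       # limited by the finite-difference derivative
```

The sign in the denominator of Eq. (2.35f) follows from differentiating the forward kernel $e^{-2\pi ift}$ with respect to ƒ, which brings down a factor of $-2\pi it$.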
The inequality
$$\left|\int_{a}^{b} c(t)\,dt\right| \le \int_{a}^{b} |c(t)|\,dt \tag{2.35g}$$
must hold true for any two real values of a and b where $a \le b$. When u(t) is real, so is its nth derivative, and we can write
Basic Fourier Identities · 2.8
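$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}e^{-2\pi ift}\,dt\right| \le \int_{-\infty}^{\infty} \left|\frac{d^nu}{dt^n}\right|dt. \tag{2.35h}$$
Because we are supposing the Fourier transform of $d^nu/dt^n$ to exist, the existence requirement in Eq. (2.13a) shows that
$$\int_{-\infty}^{\infty} \left|\frac{d^nu}{dt^n}\right|dt$$
is finite. Inequality (2.35h) then requires
$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}e^{-2\pi ift}\,dt\right|$$
also to be finite, which means that we can assume that it is less than or equal to some finite real and non-negative number B for all values of ƒ:
$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}e^{-2\pi ift}\,dt\right| \le B. \tag{2.35i}$$
From Eq. (2.35e),
$$\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}e^{-2\pi ift}\,dt = (2\pi)^n i^n f^n U(f), \tag{2.35j}$$
where
$$U(f) = \int_{-\infty}^{\infty} u(t)e^{-2\pi ift}\,dt$$
is, of course, the Fourier transform of u(t). Taking the magnitude of the complex values of both sides of (2.35j) and remembering that $|i^n| = 1$ shows that
$$\left|\int_{-\infty}^{\infty} \frac{d^nu}{dt^n}e^{-2\pi ift}\,dt\right| = (2\pi)^n |f|^n |U(f)|,$$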
so that, from (2.35i),
$$B \ge (2\pi)^n |f|^n |U(f)|$$
or
$$|U(f)| \le \frac{B}{(2\pi)^n}|f|^{-n}. \tag{2.35k}$$
Hence, when the Fourier transform of the nth derivative of u(t) exists, we know that the magnitude $|U(f)|$ of the Fourier transform of u decreases at least as fast as $|f|^{-n}$ for large values of ƒ.
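We next examine a set of identities often called the Fourier shift theorem. When U(f) is the forward Fourier transform of u(t),
$$U(f) = \int_{-\infty}^{\infty} u(t)e^{-2\pi ift}\,dt,$$
and the argument of u is shifted by a real constant a,
$$u(t) \to u(t-a),$$
then the forward Fourier transform of $u(t-a)$ is, changing the variable of integration to $t' = t - a$,
$$\int_{-\infty}^{\infty} u(t-a)e^{-2\pi ift}\,dt = e^{-2\pi ifa}\int_{-\infty}^{\infty} u(t')e^{-2\pi ift'}\,dt' = e^{-2\pi ifa}U(f).$$
Hence the forward Fourier transform of $u(t-a)$ is $e^{-2\pi ifa}U(f)$ when the forward Fourier transform of u(t) is U(f), which we can write as
$$u(t-a) \leftrightarrow e^{-2\pi ifa}U(f).$$
In terms of the Fourier F operator, we have
$$\mathcal{F}^{(-ift)}\big(u(t-a)\big) = e^{-2\pi ifa}\,\mathcal{F}^{(-ift)}\big(u(t)\big).$$
Working with the inverse Fourier transform of $U(f - f_0)$ and changing the variable of integration to $f' = f - f_0$, we see that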
$$\int_{-\infty}^{\infty} U(f - f_0)e^{2\pi ift}\,df = e^{2\pi if_0t}\int_{-\infty}^{\infty} U(f')e^{2\pi if't}\,df' = e^{2\pi if_0t}u(t) \tag{2.36c}$$
or
$$e^{2\pi if_0t}u(t) \leftrightarrow U(f - f_0). \tag{2.36d}$$
Equations (2.36d)–(2.36f) show that multiplying u(t) by $e^{2\pi if_0t}$ shifts U(ƒ), the forward Fourier transform of u(t), to the right by a frequency $f_0$. By interchanging the roles of t and ƒ (and replacing u by U and $f_0$ by a) in (2.36e) and comparing the result to (2.36b), we see the two equations can be combined into one formula:
$$\mathcal{F}^{(\pm ift)}\big(u(t-a)\big) = e^{\pm 2\pi ifa}\,\mathcal{F}^{(\pm ift)}\big(u(t)\big). \tag{2.36g}$$
This last result can also be written as, defining a new constant $b = -a$,
$$\int_{-\infty}^{\infty} u(t+b)\,e^{\pm 2\pi ift}\,dt = e^{\mp 2\pi ifb}\int_{-\infty}^{\infty} u(t)\,e^{\pm 2\pi ift}\,dt \tag{2.36h}$$
or
$$\mathcal{F}^{(\pm ift)}\big(u(t+b)\big) = e^{\mp 2\pi ifb}\,\mathcal{F}^{(\pm ift)}\big(u(t)\big). \tag{2.36i}$$
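Both halves of the shift theorem can be verified on a grid: a shift in t multiplies the forward transform by a phase factor, and multiplication by a complex exponential in t shifts the transform in ƒ. The sketch assumes NumPy; `forward`, the grids, the test function, and the values `a = 0.75` and `f0 = 1.25` are hypothetical, with `f0` chosen as a whole number of frequency-grid steps so that the shifted spectrum lands on grid points.

```python
import numpy as np

grid = 0.01 * np.arange(-600, 601)          # symmetric grid shared by t and f
t, f = grid, grid
dt = t[1] - t[0]

def forward(vals):
    """Riemann-sum estimate of the forward transform, Eq. (2.29a), on the f grid."""
    return np.exp(-2j * np.pi * np.outer(f, t)) @ vals * dt

def u(x):
    # Hypothetical mixed test function.
    return np.exp(-np.pi * x**2) * (1.0 + x)

a, f0 = 0.75, 1.25
U = forward(u(t))

# Shift in t: the transform of u(t - a) is exp(-2*pi*i*f*a) * U(f).
shift_err = np.max(np.abs(forward(u(t - a)) - np.exp(-2j * np.pi * f * a) * U))

# Modulation in t, Eq. (2.36d): the transform of exp(2*pi*i*f0*t) * u(t) is U(f - f0).
modulated = forward(np.exp(2j * np.pi * f0 * t) * u(t))
k = 125                                     # f0 divided by the frequency step 0.01
modulation_err = np.max(np.abs(modulated[k:] - U[:-k]))

print(shift_err, modulation_err)            # both small
```

Comparing `modulated[k:]` with `U[:-k]` is the array form of "shifted to the right by $f_0$": sample i of the modulated spectrum matches sample i − k of the original one.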
The next set of identities is sometimes called the Fourier scaling theorem. If U(ƒ) is the
forward Fourier transform of u(t) and the argument of u is scaled by the real constant a,
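$$u(t) \to u(at) ,$$
then changing the variable of integration to $t' = at$ gives
$$\int_{-\infty}^{\infty} u(at)\, e^{-2\pi i f t}\, dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u(t')\, e^{-2\pi i \left( \frac{f t'}{a} \right)}\, dt' = \frac{1}{|a|}\, U\!\left( \frac{f}{a} \right) .$$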
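We also have, scaling the frequency by a positive constant a and letting $f' = af$, that
$$\int_{-\infty}^{\infty} U(af)\, e^{2\pi i f t}\, df = \frac{1}{a} \int_{-\infty}^{\infty} U(f')\, e^{2\pi i \left( \frac{f' t}{a} \right)}\, df' = \frac{1}{a}\, u\!\left( \frac{t}{a} \right) .$$
Equation (2.37b) and (after interchanging the roles of ƒ and t) Eq. (2.37d) can be combined into the single formula,
$$\mathcal{F}^{(\pm ift)}\big(u(at)\big) = \frac{1}{|a|}\, \mathcal{F}^{(\pm i (f/a) t)}\big(u(t)\big) \quad \text{for } a \neq 0 . \qquad \text{(2.37e)}$$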
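A quick numerical check of the scaling theorem can make the roles of a and 1/|a| concrete. The following is a minimal sketch, assuming a hypothetical off-centre Gaussian as the test signal and a truncated midpoint rule in place of the infinite integral; the names `forward_transform` and `scaling_error` are illustrative only.

```python
import cmath
import math


def forward_transform(u, f, t_max=40.0, n=20000):
    """Midpoint-rule estimate of U(f) = integral of u(t)*exp(-2*pi*i*f*t) dt."""
    h = 2.0 * t_max / n
    total = 0.0 + 0.0j
    for k in range(n):
        t = -t_max + (k + 0.5) * h
        total += u(t) * cmath.exp(-2j * math.pi * f * t)
    return total * h


def u(t):
    # Hypothetical off-centre Gaussian, so that U(f) is genuinely complex.
    return math.exp(-math.pi * (t - 0.3) ** 2)


f = 0.3
scaling_error = {}
for a in (2.0, 0.5, -0.5):
    lhs = forward_transform(lambda t: u(a * t), f)   # transform of u(at)
    rhs = forward_transform(u, f / a) / abs(a)       # (1/|a|) U(f/a)
    scaling_error[a] = abs(lhs - rhs)
    print(f"a = {a:+.1f}: |lhs - rhs| = {scaling_error[a]:.2e}")
```

The negative value of a is included to show why the prefactor is written with an absolute value: reversing the direction of integration cancels the sign of 1/a.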
Because u(t) must satisfy requirements (V) through (VIII) in Sec. 2.4 for these results to be true (and in particular it must satisfy requirement (V) that it be absolutely integrable), there may well be only a finite region of t over which u(t) is significantly different from zero. When $0 < a < 1$, so that the range of t over which u is significantly different from zero expands, formula (2.37a) shows that the region of ƒ over which U(ƒ) is significantly different from zero shrinks; and, of course, when $a > 1$, just the opposite occurs. For $0 < a < 1$, function $u(at)$ more closely resembles $\sin(2\pi ft)$ and $\cos(2\pi ft)$ for smaller values of ƒ, explaining why the region of ƒ for which U is significantly different from zero shrinks; and when $a > 1$, function $u(at)$ more closely resembles $\sin(2\pi ft)$ and $\cos(2\pi ft)$ for larger values of ƒ, explaining why the region of ƒ for which U is significantly different from zero expands. We also note that if $f = 1/(2\pi)$, so that $\sin(2\pi ft) = \sin(t)$ and $\cos(2\pi ft) = \cos(t)$, then the sine and cosine can change significantly in value only when t changes by at least
$$\Delta t_{\min} \approx O(1) .$$
Suppose t must also change by at least $\Delta t_{\min} \approx O(1)$ for a significant change in u(t) to occur, which means that $\sin(2\pi ft) = \sin(t)$ and $\cos(2\pi ft) = \cos(t)$ vary about as fast with respect to t as u does; that is, sin(t) and cos(t) “resemble” u somewhat. Recalling the heuristic reasoning used in Sec. 2.1 to introduce and justify the sine and cosine integrals, we now expect U(ƒ) to be significantly different from zero when $f \approx 1/(2\pi)$. Suppose next that t changes by less than $\Delta t_{\min} \approx O(1)$ so that u does not change significantly in value, remaining almost constant. Now when ƒ becomes significantly larger than $1/(2\pi)$, functions $\sin(2\pi ft)$ and $\cos(2\pi ft)$ oscillate ever more rapidly so that they change significantly in value for changes in t that are ever smaller than $\Delta t_{\min}$. For these larger values of ƒ, the sine and cosine do not much resemble u(t), forcing the Fourier transform U(ƒ) to be negligible or zero for $f > O(1/(2\pi))$.

We can modify the original function u by creating a new function $u_\varepsilon(t) = u(t/\varepsilon)$ for $\varepsilon > 0$. Now t must change by at least an $O(\varepsilon)$ amount for $u_\varepsilon$ to change significantly; and when t changes by less than $O(\varepsilon)$, function $u_\varepsilon$ does not change significantly in value. We know from (2.37a) with $a = 1/\varepsilon$ that the forward Fourier transform of $u_\varepsilon$ is $U_\varepsilon(f) = \varepsilon\, U(\varepsilon f)$. Hence, when ƒ is larger than $O(1/(2\pi\varepsilon))$, it must be true that $U_\varepsilon(f)$ is negligible or zero, since this is the same as having $f > O(1/(2\pi))$ in U(ƒ). Because $2\pi$ is often regarded as an O(1) quantity, this result can also be interpreted as showing that $U_\varepsilon(f)$ must be negligible or zero for $f > O(1/\varepsilon)$. Since the original Fourier transform pair
$$u(t) \leftrightarrow U(f)$$
is left unspecified, $u_\varepsilon$ in fact represents any function v(t) where t must change by at least an $O(\varepsilon)$ amount for a significant change in v to occur. Consequently, we can conclude that if t must change by at least an $O(\varepsilon)$ amount for v(t) to change significantly, then the forward Fourier transform of v(t) must be negligible or zero for $f > O(1/\varepsilon)$. The arguments leading to this conclusion work just as well when we consider the inverse Fourier transform in Eqs. (2.37c) and (2.37e). Therefore, this more general result is also true: if v(t) is a function such that t must change by at least an $O(\varepsilon)$ amount for a significant change in v to occur, then the forward or inverse Fourier transform,
$$V(f) = \int_{-\infty}^{\infty} v(t)\, e^{\pm 2\pi i f t}\, dt ,$$
$$u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\, v(t - t')\, dt' . \qquad \text{(2.38a)}$$
Here, u and v may be complex functions but their argument t is assumed to be real. The convolution is commutative and associative. It is commutative because making the substitution $t'' = t - t'$ gives
$$u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\, v(t - t')\, dt' = \int_{-\infty}^{\infty} u(t - t'')\, v(t'')\, dt'' = \int_{-\infty}^{\infty} v(t'')\, u(t - t'')\, dt'' ,$$
showing that
$$u(t) * v(t) = v(t) * u(t) . \qquad \text{(2.38b)}$$
The convolution is associative because for three complex functions u(t), v(t), and h(t) with real argument t we can write, changing the variable of integration to $t''' = t'' - t'$,
$$[u(t) * v(t)] * h(t) = \int_{-\infty}^{\infty} dt''\, h(t - t'') \int_{-\infty}^{\infty} dt'\, u(t')\, v(t'' - t') = \int_{-\infty}^{\infty} dt'\, u(t') \int_{-\infty}^{\infty} dt'''\, v(t''')\, h(t - t' - t''') .$$
Hence,
$$[u(t) * v(t)] * h(t) = u(t) * [v(t) * h(t)] . \qquad \text{(2.38c)}$$
The convolution is a linear operation, because for any two complex constants α and β,
Fourier Convolution Theorem · 2.9
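$$h(t) * [\alpha\, u(t) + \beta\, v(t)] = \int_{-\infty}^{\infty} h(t')\, [\alpha\, u(t - t') + \beta\, v(t - t')]\, dt' = \alpha \int_{-\infty}^{\infty} h(t')\, u(t - t')\, dt' + \beta \int_{-\infty}^{\infty} h(t')\, v(t - t')\, dt' ,$$
showing that
$$h(t) * [\alpha\, u(t) + \beta\, v(t)] = \alpha\, [h(t) * u(t)] + \beta\, [h(t) * v(t)]$$
and, because the convolution is commutative,
$$[\alpha\, u(t) + \beta\, v(t)] * h(t) = \alpha\, [u(t) * h(t)] + \beta\, [v(t) * h(t)] .$$
This shows that the convolution is linear on both the left-hand and right-hand sides of the ∗.

The convolution of two even functions or two odd functions is an even function. If u(t) and v(t) are both even or both odd, then we have, using $t'' = -t'$,
$$u(-t) * v(-t) = \int_{-\infty}^{\infty} u(t')\, v(-t - t')\, dt' = \int_{-\infty}^{\infty} u(-t'')\, v(-t + t'')\, dt'' = \int_{-\infty}^{\infty} u(t'')\, v(t - t'')\, dt'' = u(t) * v(t) . \qquad \text{(2.38f)}$$
Similarly, if one of u(t) and v(t) is even and the other is odd, the same substitution gives
$$u(-t) * v(-t) = \int_{-\infty}^{\infty} u(t')\, v(-t - t')\, dt' = \int_{-\infty}^{\infty} u(-t'')\, v(-t + t'')\, dt'' = -\int_{-\infty}^{\infty} u(t'')\, v(t - t'')\, dt'' = -u(t) * v(t) . \qquad \text{(2.38g)}$$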
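$$u(y, x_1, x_2, \ldots) * v(y, x_1', x_2', \ldots) = \int_{-\infty}^{\infty} u(y', x_1, x_2, \ldots)\, v(y - y', x_1', x_2', \ldots)\, dy' ,$$
$$\int_{-\infty}^{\infty} dt'\, u(t') \int_{-\infty}^{\infty} dt\, e^{\pm 2\pi i f t}\, v(t - t') .$$
$$\mathcal{F}^{(\pm ift)}\big(u(t) * v(t)\big) = \int_{-\infty}^{\infty} dt'\, u(t')\, e^{\pm 2\pi i f t'} \int_{-\infty}^{\infty} dt''\, e^{\pm 2\pi i f t''}\, v(t'') = \left[ \int_{-\infty}^{\infty} dt'\, u(t')\, e^{\pm 2\pi i f t'} \right] \cdot \left[ \int_{-\infty}^{\infty} dt''\, e^{\pm 2\pi i f t''}\, v(t'') \right]$$
or
$$\mathcal{F}^{(\pm ift)}\big(u(t) * v(t)\big) = \mathcal{F}^{(\pm ift)}\big(u(t)\big) \cdot \mathcal{F}^{(\pm ift)}\big(v(t)\big) . \qquad \text{(2.39a)}$$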
If U(ƒ) and V(ƒ) are the forward Fourier transforms of u(t) and v(t) respectively, we can choose
the minus sign of (2.39a) to get
$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\, [u(t) * v(t)]\, dt = U(f) \cdot V(f) , \qquad \text{(2.39b)}$$
which shows that
$$u(t) * v(t) \leftrightarrow U(f) \cdot V(f) . \qquad \text{(2.39c)}$$
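The convolution theorem (2.39b) can be checked with a direct numerical experiment. The sketch below is illustrative only: the two Gaussian signals, the grid, and the helper names are assumptions made for the example, and every integral is replaced by a Riemann sum over a finite interval on which both signals have effectively decayed to zero.

```python
import cmath
import math

T_MAX = 5.0
N = 400
H = 2.0 * T_MAX / N
GRID = [-T_MAX + (k + 0.5) * H for k in range(N)]


def u(t):
    # Hypothetical signal.
    return math.exp(-math.pi * t * t)


def v(t):
    # Hypothetical second signal, narrower and off-centre.
    return math.exp(-2.0 * math.pi * (t - 0.5) ** 2)


def convolve_at(t):
    """Riemann-sum estimate of (u*v)(t) = integral of u(t')*v(t - t') dt'."""
    return H * sum(u(tp) * v(t - tp) for tp in GRID)


def forward_transform(samples, f):
    """Riemann-sum estimate of the forward transform from samples on GRID."""
    return H * sum(s * cmath.exp(-2j * math.pi * f * t) for s, t in zip(samples, GRID))


u_samples = [u(t) for t in GRID]
v_samples = [v(t) for t in GRID]
conv_samples = [convolve_at(t) for t in GRID]

f = 0.35
lhs = forward_transform(conv_samples, f)                                   # transform of u*v
rhs = forward_transform(u_samples, f) * forward_transform(v_samples, f)   # U(f) . V(f)
print(abs(lhs - rhs))
```

The printed difference should be at the level of rounding error, while both sides are of order one in magnitude.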
Equation (2.28ℓ) can be written as, for any function g(t) after interchanging the roles of t and $t'$,
$$\mathcal{F}^{(\pm it'f)}\Big( \mathcal{F}^{(\mp ift)}\big(g(t)\big) \Big) = g(t') . \qquad \text{(2.39d)}$$
We replace $\mathcal{F}^{(\pm)}$ by $\mathcal{F}^{(\mp)}$ on the right-hand side of Eq. (2.39a), which is just a change in the order in which the two possible signs of the exponent are listed, and then take $\mathcal{F}^{(\pm it'f)}$ of both sides to get that, applying (2.39d) with $g(t) = u(t) * v(t)$,
$$u(t') * v(t') = \mathcal{F}^{(\pm it'f)}\Big( \mathcal{F}^{(\mp ift)}\big(u(t)\big) \cdot \mathcal{F}^{(\mp ift)}\big(v(t)\big) \Big) . \qquad \text{(2.39e)}$$
Because u(t) and v(t) represent arbitrary, Fourier-transformable functions of t, $\mathcal{F}^{(\mp ift)}(u(t))$ and $\mathcal{F}^{(\mp ift)}(v(t))$ must be arbitrary, Fourier-transformable functions of ƒ, which we can call $U^{(\mp)}$ and $V^{(\mp)}$ respectively,
$$U^{(\mp)}(f) = \mathcal{F}^{(\mp ift)}\big(u(t)\big) \qquad \text{(2.39f)}$$
and
$$V^{(\mp)}(f) = \mathcal{F}^{(\mp ift)}\big(v(t)\big) . \qquad \text{(2.39g)}$$
Applying this notation to (2.39d), first with $g(t) = u(t)$ and then with $g(t) = v(t)$, we see that
$$\mathcal{F}^{(\pm it'f)}\big(U^{(\mp)}(f)\big) = u(t') \qquad \text{(2.39h)}$$
and
$$\mathcal{F}^{(\pm it'f)}\big(V^{(\mp)}(f)\big) = v(t') . \qquad \text{(2.39i)}$$
Substituting these results into (2.39e) gives
$$\Big[ \mathcal{F}^{(\pm it'f')}\big(U^{(\mp)}(f')\big) \Big] * \Big[ \mathcal{F}^{(\pm it'f'')}\big(V^{(\mp)}(f'')\big) \Big] = \mathcal{F}^{(\pm it'f)}\big( U^{(\mp)}(f) \cdot V^{(\mp)}(f) \big) ,$$
where the convolution is over $t'$ because it is the only argument repeated on both sides of the ∗. Since $U^{(\mp)}$ and $V^{(\mp)}$ are arbitrary, transformable functions, we can replace them by the arbitrary transformable functions u and v to get, after interchanging the roles of ƒ and $t'$,
$$\mathcal{F}^{(\pm ift)}\big(u(t) \cdot v(t)\big) = \Big[ \mathcal{F}^{(\pm ift)}\big(u(t)\big) \Big] * \Big[ \mathcal{F}^{(\pm ift)}\big(v(t)\big) \Big] . \qquad \text{(2.39j)}$$
If U(ƒ) and V(ƒ) are the forward Fourier transforms of u(t) and v(t) respectively, we can choose
the minus sign of (2.39j) to get
$$\int_{-\infty}^{\infty} e^{-2\pi i f t}\, [u(t) \cdot v(t)]\, dt = U(f) * V(f) \qquad \text{(2.39k)}$$
or
$$u(t) \cdot v(t) \leftrightarrow U(f) * V(f) . \qquad \text{(2.39ℓ)}$$
Equation (2.39b) shows that the forward Fourier transform of the convolution of two functions
is the product of the forward Fourier transform of each function, and (2.39k) shows that the
forward Fourier transform of the product of two functions is the convolution of the forward
Fourier transform of each function. Equations (2.39a) and (2.39j) show that everything we just
said about the forward Fourier transform still holds true when we take the reverse Fourier
transform of the product of two functions or of the convolution of two functions.
When using the Fourier convolution theorem, we usually regard one of the two convolved
functions as representing the undisturbed signal—that is, the true set of values for what is to be
measured—and the other—usually much more narrow—function as specifying the blurring or
smearing effect of an imperfect measurement. The blurring or smearing function has different
names in different engineering disciplines; optical engineers often call it the instrument-response
or instrument line-shape function. In Fig. 2.5(a), function u is taken to be the true signal, and in
Fig. 2.5(b) function v is the instrument-response or instrument line-shape function. The
convolution
$$u(t) * v(t) = \int_{-\infty}^{\infty} u(t')\, v(t - t')\, dt' = u_{\text{blur}}(t)$$
defines the new function $u_{\text{blur}}(t)$ as shown in Figs. 2.5(c)–2.5(e). The function v is flipped left to right and slid along the $t'$ axis in Fig. 2.5(c) by changing the value of t. Figure 2.5(d) is a close-up of v at a specific value of t, with the shaded region being the area under the product $u(t')\, v(t - t')$. Since $u(t')\, v(t - t')$ is zero where $v(t - t')$ is zero, the area of the shaded region can be found by integrating $u(t')\, v(t - t')$ over $t'$ between −∞ and +∞. This is, of course, just the convolution of u and v for this particular value of t, which means the area of the shaded region must be $u_{\text{blur}}(t)$ for this value of t. Figure 2.5(e) represents the complete $u_{\text{blur}}(t)$ function for all values of t; clearly $u_{\text{blur}}$ has less detail than the original signal u.
The v(t) function in Fig. 2.5(b) is an unusual type of instrument response because it is not an even function of t. Figure 2.5(f) shows a typical even instrument response $v_e(t)$. When the instrument-response function is $v_e$, the blurred signal is
$$u_{e,\text{blur}}(t) = u(t) * v_e(t) . \qquad \text{(2.40a)}$$
Because $v_e$ is even, this can be written as
$$u_{e,\text{blur}}(t) = \int_{-\infty}^{\infty} u(t')\, v_e(t - t')\, dt' = \int_{-\infty}^{\infty} u(t')\, v_e(t' - t)\, dt' , \qquad \text{(2.40b)}$$
with the last integral in (2.40b) making it perhaps more obvious that $u_{e,\text{blur}}$ is a localized and weighted average of u centered on t. Instrument-response or line-shape functions are usually designed to be even because an even instrument-response function does not shift the center point of isolated peaks in the true data u.
As described in the first chapter, when using Michelson interferometers, we do not much care about the exact shape of the optical intensity signal u but are instead interested in the shape of its transform,
$$U(f) = \mathcal{F}^{(-ift)}\big(u(t)\big) . \qquad \text{(2.40c)}$$
The transform of the blurred signal is
$$U_{e,\text{blur}}(f) = \mathcal{F}^{(-ift)}\big(u_{e,\text{blur}}(t)\big) . \qquad \text{(2.40d)}$$
The relationship between $U_{e,\text{blur}}$ and U must be understood to design the electrical circuits properly. Here is an important example of how to use the Fourier convolution theorem. Substitution of (2.40a) into (2.40d) gives
$$U_{e,\text{blur}}(f) = \mathcal{F}^{(-ift)}\big(u(t) * v_e(t)\big) .$$
Using the Fourier convolution theorem as presented in Eq. (2.39a), this is rewritten as
$$U_{e,\text{blur}}(f) = U(f) \cdot V_e(f) , \qquad \text{(2.40e)}$$
where
$$V_e(f) = \mathcal{F}^{(-ift)}\big(v_e(t)\big) .$$

[FIGURE 2.5: (a) the true signal u(t); (b) the instrument-response function v(t); (c) u(t′) with the flipped response v(t − t′) slid along the t′ axis; (d) close-up of the product u(t′)v(t − t′) at one value of t; (e) the blurred signal u_blur(t); (f) an even instrument response v_e(t).]
Equation (2.40e) is a very reassuring result, stating that as long as $V_e(f)$ is known and not zero, we can recover the Fourier transform of the true signal U(ƒ) from $U_{e,\text{blur}}(f)$ by calculating
$$U(f) = \frac{U_{e,\text{blur}}(f)}{V_e(f)} . \qquad \text{(2.40f)}$$
To design the circuits of a Michelson interferometer, we find the frequencies ƒ for which U(ƒ) must be known and arrange for $V_e$ to be as constant as possible, and definitely not zero, over these frequencies. It turns out that preserving certain signal frequencies while neglecting others is a standard problem in electrical circuit design, and it is usually easy to arrange for this to occur. There is, in fact, a whole branch of electrical engineering called filter theory that describes exactly how to design circuits where $V_e$ is zero or very small at some frequencies while being large and quasi-constant at others.
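To see the recovery formula (2.40f) at work, it can help to try a discrete, noise-free analogue. The sketch below is not the continuous theory of this section: it assumes sampled signals of length N, circular convolution, and a naive discrete Fourier transform, and the two-peak signal and Gaussian response are hypothetical. With real, noisy data, frequencies where $V_e$ is small would need much more care.

```python
import cmath
import math

N = 64


def dft(x, sign=-1):
    """Naive discrete Fourier transform; sign=+1 gives the unnormalised inverse."""
    return [
        sum(x[k] * cmath.exp(sign * 2j * math.pi * m * k / N) for k in range(N))
        for m in range(N)
    ]


def circular_convolve(x, y):
    """Circular convolution, the discrete stand-in for the integral in (2.40a)."""
    return [sum(x[k] * y[(j - k) % N] for k in range(N)) for j in range(N)]


# Hypothetical "true" signal: two narrow peaks of different height.
u = [0.0] * N
u[20] = 1.0
u[26] = 0.6

# Hypothetical even instrument response: a Gaussian in circular distance.
v_e = [math.exp(-(min(k, N - k) / 1.5) ** 2) for k in range(N)]
area = sum(v_e)
v_e = [x / area for x in v_e]

u_blur = circular_convolve(u, v_e)           # discrete analogue of (2.40a)

U_blur = dft(u_blur)
V_e = dft(v_e)
assert min(abs(z) for z in V_e) > 0.0        # division requires V_e != 0 everywhere

U_rec = [ub / ve for ub, ve in zip(U_blur, V_e)]    # discrete analogue of (2.40f)
u_rec = [z.real / N for z in dft(U_rec, sign=+1)]   # back to the signal domain

max_error = max(abs(a - b) for a, b in zip(u_rec, u))
print(max(u_blur), max_error)
```

The blurred peaks are noticeably lower than the originals, yet dividing by $V_e$ restores the original samples to within rounding error because $V_e$ is nonzero at every discrete frequency.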
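$$\int_{-\infty}^{\infty} u(t)\, dt$$
for the function u(t) is that
$$\int_{-\infty}^{\infty} u(t)\, dt = \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}} \left[ \int_{-T_1}^{t_s - \varepsilon_1} u(t)\, dt + \int_{t_s + \varepsilon_2}^{T_2} u(t)\, dt \right] . \qquad \text{(2.41a)}$$
$$\int_{-\infty}^{\infty} u(t)\, dt = \lim_{\substack{T \to \infty \\ \varepsilon \to 0}} \left[ \int_{-T}^{t_s - \varepsilon} u(t)\, dt + \int_{t_s + \varepsilon}^{T} u(t)\, dt \right] . \qquad \text{(2.41b)}$$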
The limiting process in definition (2.41b) is said to give the Cauchy principal value of the integral, sometimes written as
$$\mathrm{PV}\!\int_{-\infty}^{\infty} u(t)\, dt \quad \text{or} \quad -\!\!\!\!\!\!\int_{-\infty}^{\infty} u(t)\, dt .$$
If u(t) has multiple singular points, the definition is expanded in the obvious way. For example, with two singular points at $t_{s1}$ and $t_{s2}$ with $t_{s1} < t_{s2}$, we have
$$\mathrm{PV}\!\int_{-\infty}^{\infty} u(t)\, dt = \lim_{\substack{T \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}} \left[ \int_{-T}^{t_{s1} - \varepsilon_1} u(t)\, dt + \int_{t_{s1} + \varepsilon_1}^{t_{s2} - \varepsilon_2} u(t)\, dt + \int_{t_{s2} + \varepsilon_2}^{T} u(t)\, dt \right] \qquad \text{(2.41c)}$$
and so on for three, four, etc., interior points of singularity in u(t). If an improper integral converges to a finite value in the standard sense of (2.41a), then its Cauchy principal value also converges to the same answer, but many improper integrals that do not converge in the sense of (2.41a) nevertheless have well-defined Cauchy principal values. For this reason, it is customary in Fourier-transform theory to interpret all improper integrals, such as the forward and inverse Fourier transforms, as Cauchy principal values, and that is what we shall do from now on. There will be no special notation used to distinguish Cauchy principal values from ordinary improper integrals.
Fourier Transforms and Divergent Integrals · 2.10
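To show the relevance of the Cauchy principal value, we calculate the Fourier transform of 1/t, an example already considered above in connection with the sine transform [see discussion following Eq. (2.10e)]. Using the identity $e^{-i\theta} = \cos(\theta) - i \sin(\theta)$, we have
$$\mathcal{F}^{(-ift)}(t^{-1}) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\, t^{-1}\, dt = \int_{-\infty}^{\infty} \cos(2\pi f t)\, t^{-1}\, dt - i \int_{-\infty}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt . \qquad \text{(2.42a)}$$
There is no problem evaluating the imaginary part of this transform. Because $[t^{-1} \sin(2\pi f t)]$ is an even function of t, we can apply formulas (2.19) and (2.10f) to get
$$-i \int_{-\infty}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt = -2i \int_{0}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt = -i\pi \quad \text{for } f > 0 .$$
When $f < 0$, we have
$$-i \int_{-\infty}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt = i \int_{-\infty}^{\infty} \sin(2\pi |f| t)\, t^{-1}\, dt = i\pi ,$$
allowing us to write
$$-i \int_{-\infty}^{\infty} \sin(2\pi f t)\, t^{-1}\, dt = -i\pi\, \mathrm{sgn}(f) , \qquad \text{(2.42b)}$$
where we define
$$\mathrm{sgn}(f) = \begin{cases} 1 & \text{for } f > 0 \\ 0 & \text{for } f = 0 \\ -1 & \text{for } f < 0 \end{cases} . \qquad \text{(2.42c)}$$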
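The value π sgn(f) for the integral of sin(2πft)/t can be checked numerically. The sketch below is only an approximation under stated assumptions: the infinite range is truncated to a hypothetical half-width `t_max`, so the estimate differs from π sgn(f) by a small oscillating remainder that shrinks roughly like 1/(π f t_max). The function names are illustrative.

```python
import math


def sgn(f):
    """Signum function of (2.42c), with sgn(0) = 0."""
    return (f > 0) - (f < 0)


def sine_integral(f, t_max=50.0, n=200000):
    """Midpoint-rule estimate of the integral of sin(2*pi*f*t)/t over [-t_max, t_max].

    With n even, no midpoint falls on t = 0, so the removable singularity
    there is never evaluated.
    """
    h = 2.0 * t_max / n
    total = 0.0
    for k in range(n):
        t = -t_max + (k + 0.5) * h
        total += math.sin(2.0 * math.pi * f * t) / t
    return total * h


estimates = {f: sine_integral(f) for f in (-1.0, 0.0, 1.0)}
for f, value in estimates.items():
    print(f"f = {f:+.1f}: integral ~ {value:+.4f}, pi*sgn(f) = {math.pi * sgn(f):+.4f}")
```

Multiplying these estimates by −i gives the imaginary part of the transform in (2.42a), including the jump from +iπ to −iπ as ƒ passes through zero.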
The specification that $\mathrm{sgn}(0) = 0$ makes $\mathrm{sgn}(f)$ a proper odd function, equal to zero at $f = 0$, even though it has a jump discontinuity there. It also, of course, makes sense considering that (2.42b) is the integral of the zero function when $f = 0$. Evaluation of the real part of the transform in (2.42a) shows the usefulness of interpreting improper integrals as Cauchy principal values. When $f = 0$, the real part of the left-hand side of (2.42a) becomes, using the standard interpretation of an improper integral in (2.41a),
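$$\begin{aligned} \int_{-\infty}^{\infty} \frac{dt}{t} &= \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}} \left[ \int_{-T_1}^{-\varepsilon_1} \frac{dt}{t} + \int_{\varepsilon_2}^{T_2} \frac{dt}{t} \right] = \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}} \left[ -\int_{\varepsilon_1}^{T_1} \frac{dt}{t} + \ln\!\left( \frac{T_2}{\varepsilon_2} \right) \right] \\ &= \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}} \left[ -\ln\!\left( \frac{T_1}{\varepsilon_1} \right) + \ln\!\left( \frac{T_2}{\varepsilon_2} \right) \right] \\ &= \lim_{\substack{T_1 \to \infty,\ T_2 \to \infty \\ \varepsilon_1 \to 0,\ \varepsilon_2 \to 0}} \left[ \ln\!\left( \frac{\varepsilon_1}{\varepsilon_2} \right) + \ln\!\left( \frac{T_2}{T_1} \right) \right] . \end{aligned} \qquad \text{(2.43a)}$$
The expression $\ln(\varepsilon_1 / \varepsilon_2)$ can be made anything we want depending on the limiting ratio chosen for $\varepsilon_1 / \varepsilon_2$ as $\varepsilon_1 \to 0$ and $\varepsilon_2 \to 0$; the same is true of $\ln(T_2 / T_1)$ as $T_1 \to \infty$ and $T_2 \to \infty$. Therefore, under the standard interpretation of an improper integral, the limit in (2.43a) does not exist. Comparison of (2.41a) to (2.41b) shows that (2.43a) can be converted to a Cauchy principal value by setting $\varepsilon_1 = \varepsilon_2 = \varepsilon$, $T_1 = T_2 = T$, and taking the limit as $T \to \infty$, $\varepsilon \to 0$. This leads to
$$\lim_{\substack{T \to \infty \\ \varepsilon \to 0}} \left[ \ln\!\left( \frac{\varepsilon}{\varepsilon} \right) + \ln\!\left( \frac{T}{T} \right) \right] = 0 ,$$
allowing us to give a well-defined value to the expression $\int_{-\infty}^{\infty} \frac{dt}{t}$.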
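The dependence of (2.43a) on how the limits are taken can be tabulated directly from its closed form. In the sketch below the function name `truncated_integral` and the particular ratios (a factor of 2 between the two ε cut-offs, a factor of 3 between the two T limits) are arbitrary illustrative choices.

```python
import math


def truncated_integral(eps1, eps2, T1, T2):
    """Exact value of int_{-T1}^{-eps1} dt/t + int_{eps2}^{T2} dt/t, as in (2.43a)."""
    return math.log(eps1 / T1) + math.log(T2 / eps2)


rows = []
for scale in (1e-2, 1e-4, 1e-8):
    eps, T = scale, 1.0 / scale
    symmetric = truncated_integral(eps, eps, T, T)         # Cauchy principal value
    skew_eps = truncated_integral(2.0 * eps, eps, T, T)    # eps1 = 2 * eps2
    skew_T = truncated_integral(eps, eps, T, 3.0 * T)      # T2 = 3 * T1
    rows.append((symmetric, skew_eps, skew_T))
    print(f"{scale:.0e}: {symmetric:+.6f} {skew_eps:+.6f} {skew_T:+.6f}")
```

However small the cut-offs become, the symmetric column stays at zero while the other two stay at ln 2 and ln 3, so the answer is fixed by the chosen ratio rather than by the limit itself.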
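In general, the Cauchy principal value of any odd function is always zero,
$$\int_{-\infty}^{\infty} u(t)\, dt = 0 \quad \text{for any function } u \text{ such that } u(-t) = -u(t) , \qquad \text{(2.43b)}$$
because when taking the limit we are always simultaneously adding $u(t)\,dt$ increments to the integral at values of t and −t with the balanced addition of increments always cancelling out. Hence, interpreted as a Cauchy principal value,
$$\int_{-\infty}^{\infty} \cos(2\pi f t)\, t^{-1}\, dt = 0 \qquad \text{(2.43c)}$$
and, from (2.42a) and (2.42b),
$$\mathcal{F}^{(-ift)}(t^{-1}) = -i\pi\, \mathrm{sgn}(f) . \qquad \text{(2.43d)}$$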
For this answer to be a true extension to Fourier-transform theory, however, 1/t must satisfy Eq. (2.28ℓ); that is, the inverse transform of $-i\pi\, \mathrm{sgn}(f)$ must return 1/t. The inverse transform is
$$\mathcal{F}^{(itf)}\big( -i\pi\, \mathrm{sgn}(f) \big) = -i\pi \int_{-\infty}^{\infty} e^{2\pi i f t}\, \mathrm{sgn}(f)\, df = -i\pi \int_{-\infty}^{\infty} \cos(2\pi f t)\, \mathrm{sgn}(f)\, df + \pi \int_{-\infty}^{\infty} \sin(2\pi f t)\, \mathrm{sgn}(f)\, df . \qquad \text{(2.43e)}$$
The cosine integral is again the integral of an odd function so its Cauchy principal value is zero, but it is still not clear what value to assign the integral of $[\sin(2\pi f t)\, \mathrm{sgn}(f)]$. As the integral of an even function, we might try applying formula (2.19) to get
$$\pi \int_{-\infty}^{\infty} \sin(2\pi f t)\, \mathrm{sgn}(f)\, df \overset{?}{=} 2\pi \int_{0}^{\infty} \sin(2\pi f t)\, \mathrm{sgn}(f)\, df = 2\pi \int_{0}^{\infty} \sin(2\pi f t)\, df , \qquad \text{(2.43f)}$$
but then we have the same difficulty already encountered when trying to evaluate the sine transform
$$2\pi \int_{0}^{\infty} \sin(2\pi f t)\, df$$
in Eq. (2.10g). To evaluate the inverse transform of $-i\pi\, \mathrm{sgn}(f)$, we need to create a new class of mathematical entities, called generalized functions, together with a set of rules for how they behave inside integrals. This extension to Fourier-transform theory is often called distribution theory, with the generalized functions called distributions.
function φ. In general, we can use any complex function u(t) having a real argument t as a weighting function inside an integral to create a functional. This functional, called $\int u$, is defined to be
$$\int u\, \phi = \int_{-\infty}^{\infty} u(t)\, \phi(t)\, dt .$$
According to this definition the functional $\int u$ is linear, like the Fourier transform, because
$$\int u\, (\alpha \phi_1 + \beta \phi_2) = \alpha \int u\, \phi_1 + \beta \int u\, \phi_2 .$$
$$\int_{-\infty}^{\infty} u_G(t)\, \phi(t)\, dt = \int u_G\, \phi \qquad \text{(2.46)}$$
for any test function φ. Since we already know what complex number the functional $\int u_G$ gives for any test function φ, Eq. (2.46) is not a definition of $\int u_G$ but rather a definition of what it means to put $[u_G(t) \cdot \phi(t)]$ inside an integral. Clearly, the generalized function itself is well defined only when its product with a test function is integrated over t. Because the functional $\int u_G$ behaves in every way like the functionals $\int u$ based on the Cauchy-principal-value integration of true functions, we have established a new type of integration using the product of generalized
Generalized Functions · 2.11
functions $u_G(t)$ with test functions $\phi(t)$. Hence, we have not only generalized what is meant by a function but have also extended again what is meant by integration.

To handle algebraic expressions involving both generalized functions and true functions, we must define what it means to say two generalized functions $u_G(t)$ and $v_G(t)$ are equal. We say that when
$$\int_{-\infty}^{\infty} u_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} v_G(t)\, \phi(t)\, dt \qquad \text{(2.47a)}$$
for all test functions φ, then
$$u_G(t) = v_G(t) . \qquad \text{(2.47b)}$$
We also define a generalized function $u_G(t)$, which we know only from its associated functional $\int u_G$ using definition (2.46), to be equal to a true function v(t) when
$$\int u_G\, \phi = \int v\, \phi \qquad \text{(2.48a)}$$
for all appropriate test functions φ. Another way of stating this is that whenever
$$\int_{-\infty}^{\infty} u_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} v(t)\, \phi(t)\, dt \qquad \text{(2.48b)}$$
for all test functions φ, then $u_G(t) = v(t)$. Similarly, two generalized functions $u_G(t)$ and $v_G(t)$ are said to be equal in the interval $a < t < b$ whenever
$$\int_{-\infty}^{\infty} u_G(t)\, \phi_{ab}(t)\, dt = \int_{-\infty}^{\infty} v_G(t)\, \phi_{ab}(t)\, dt \qquad \text{(2.48e)}$$
for all test functions $\phi_{ab}(t)$ that are identically zero for all $t < a$ and for all $t > b$. The key point here is that we are explicitly allowing $\phi_{ab}(t)$ to be nonzero only inside the interval $a < t < b$. We also say that a true function v(t) equals a generalized function $u_G(t)$ in the interval $a < t < b$,
$$u_G(t) = v(t) \quad \text{for } a < t < b , \qquad \text{(2.48f)}$$
whenever
$$\int_{-\infty}^{\infty} u_G(t)\, \phi_{ab}(t)\, dt = \int_{-\infty}^{\infty} v(t)\, \phi_{ab}(t)\, dt \qquad \text{(2.48g)}$$
for all the $\phi_{ab}(t)$ test functions. In Eqs. (2.48d)–(2.48g), we allow for half-infinite intervals by permitting constant b to be ∞ with constant a finite and constant a to be −∞ with constant b finite.
The definitions of equality between two generalized functions or between a generalized function and a true function can be, depending on the set of test functions chosen, either very much looser than the standard idea of equality or very much the same. Suppose, by way of analogy, we define two true functions $u_1(t)$ and $u_2(t)$ to be “equal” when
$$\int_{-\infty}^{\infty} u_1(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_2(t)\, \phi(t)\, dt \qquad \text{(2.49)}$$
for all test functions φ. If the only allowed test function is $\phi(t) = 0$, then any two functions $u_1(t)$ and $u_2(t)$ are “equal.” If, on the other hand, the allowed test functions are $\phi(t) = e^{\pm 2\pi i f t}$ for all real values of ƒ, we are saying that $u_1(t)$ and $u_2(t)$ are “equal” when their Fourier transforms $\mathcal{F}^{(\pm 2\pi i f t)}(u_1(t))$ and $\mathcal{F}^{(\pm 2\pi i f t)}(u_2(t))$ are the same. From the Fourier inversion formulas, it then follows that $u_1(t)$ must be identical to $u_2(t)$, except possibly at jump discontinuities and isolated points, for all reasonably well-behaved functions $u_1(t)$ and $u_2(t)$. In general, we expect the set of test functions to be diverse enough that serious thought and some mathematical ingenuity are required to find two functions $u_1(t)$ and $u_2(t)$ that satisfy Eq. (2.49) yet are not basically the same function. Of course, the integrals used in Eq. (2.49), and all the other integrals involving only true functions in Eqs. (2.44) through (2.48g), for that matter, must be known to exist. Often the finiteness of these integrals and the general smoothness of the test functions are enforced by the requirement that
$$\lim_{|t| \to \infty} \left[ t^N \phi(t) \right] = 0 \quad \text{for } N = 0, 1, 2, \ldots , \qquad \text{(2.50a)}$$
$$\lim_{|t| \to \infty} \left[ t^N \phi^{(M)}(t) \right] = 0 \quad \text{for } N = 0, 1, 2, \ldots \text{ and } M = 1, 2, \ldots . \qquad \text{(2.50b)}$$
A function such as $e^{-at^2}$ for $a > 0$ satisfies (2.50a) and (2.50b), and in general all functions representing physically realistic measurements can be taken to satisfy these two requirements. It turns out, however, that the most useful and popular generalized function used in Fourier theory can handle a wider variety of test functions, requiring only that the test functions be continuous at $t = 0$ (see Sec. 2.14 below).
Continuing to develop what is meant by the = sign applied to generalized functions, we say that the product of a true function w(t) and a generalized function $u_G(t)$ is another generalized function $v_G(t)$,
$$v_G(t) = w(t) \cdot u_G(t) , \qquad \text{(2.51a)}$$
when
$$\int_{-\infty}^{\infty} v_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} w(t)\, u_G(t)\, \phi(t)\, dt$$
for all test functions $\phi(t)$. A linear combination of true functions and generalized functions specified by
$$w_G(t) = u_1(t)\, v_{G1}(t) + u_2(t)\, v_{G2}(t) + \cdots \qquad \text{(2.51b)}$$
means that
$$\int_{-\infty}^{\infty} w_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_1(t)\, v_{G1}(t)\, \phi(t)\, dt + \int_{-\infty}^{\infty} u_2(t)\, v_{G2}(t)\, \phi(t)\, dt + \cdots$$
for all test functions $\phi(t)$. In general, there is no difficulty assigning a meaning to equations such as
$$u_1(t)\, v_{G1}(t) + u_2(t)\, v_{G2}(t) + \cdots + u_N(t)\, v_{GN}(t) = U_1(t)\, V_{G1}(t) + U_2(t)\, V_{G2}(t) + \cdots + U_M(t)\, V_{GM}(t) , \qquad \text{(2.51c)}$$
which means that
$$\begin{aligned} &\int_{-\infty}^{\infty} u_1(t)\, v_{G1}(t)\, \phi(t)\, dt + \int_{-\infty}^{\infty} u_2(t)\, v_{G2}(t)\, \phi(t)\, dt + \cdots + \int_{-\infty}^{\infty} u_N(t)\, v_{GN}(t)\, \phi(t)\, dt \\ &\quad = \int_{-\infty}^{\infty} U_1(t)\, V_{G1}(t)\, \phi(t)\, dt + \int_{-\infty}^{\infty} U_2(t)\, V_{G2}(t)\, \phi(t)\, dt + \cdots + \int_{-\infty}^{\infty} U_M(t)\, V_{GM}(t)\, \phi(t)\, dt \end{aligned}$$
for all test functions $\phi(t)$. Even the simplest nonlinear expressions, however, such as
$$v_G(t) \overset{?}{=} u_G(t)^2 ,$$
cannot be resolved by putting both sides inside an integral, because the right-hand side of
$$\int_{-\infty}^{\infty} v_G(t)\, \phi(t)\, dt \overset{?}{=} \int_{-\infty}^{\infty} u_G(t)^2\, \phi(t)\, dt$$
is still undefined. We know that the left-hand side is the same as applying the already-understood functional $\int u_G$ to φ,
$$\int_{-\infty}^{\infty} u_G(t)\, \phi(t)\, dt = \int u_G\, \phi ,$$
but there is no way to express
$$\int_{-\infty}^{\infty} u_G(t)^2\, \phi(t)\, dt$$
in terms of the functional $\int u_G$. It turns out that, in general, nonlinear expressions involving generalized functions cannot be given useful interpretations. Hence, generalized functions must be treated with caution unless they are used inside linear combinations of the type shown in (2.51b) and (2.51c).
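Although generalized functions do have limitations, there are many things that can be done with them. We can give meaning to $u_G(t - a)$ for any real constant a by defining that
$$\int_{-\infty}^{\infty} u_G(t - a)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_G(t)\, \phi(t + a)\, dt \qquad \text{(2.52a)}$$
for all test functions φ. This definition is, of course, consistent with what happens when the formal substitution $t' = t - a$ is made inside the original integral,
$$\int_{-\infty}^{\infty} u_G(t - a)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_G(t')\, \phi(t' + a)\, dt' = \int_{-\infty}^{\infty} u_G(t)\, \phi(t + a)\, dt ,$$
treating $u_G(t - a)$ like a true function $u(t - a)$. We can give meaning to $u_G(at)$ for any nonzero real constant a by defining that
$$\int_{-\infty}^{\infty} u_G(at)\, \phi(t)\, dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t)\, \phi(t/a)\, dt \qquad \text{(2.52b)}$$
for all test functions φ. This definition is consistent with what happens when we make the formal substitution $t' = at$ in the integral
$$\int_{-\infty}^{\infty} u_G(at)\, \phi(t)\, dt = \left\{ \begin{array}{ll} \dfrac{1}{a} \displaystyle\int_{-\infty}^{\infty} u_G(t')\, \phi(t'/a)\, dt' & \text{for } a > 0 \\[2ex] \dfrac{1}{a} \displaystyle\int_{\infty}^{-\infty} u_G(t')\, \phi(t'/a)\, dt' & \text{for } a < 0 \end{array} \right\} = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t')\, \phi(t'/a)\, dt' .$$
When the argument of $u_G$ is a linear combination $at + c$ for real constants a and c, we define
$$\int_{-\infty}^{\infty} u_G(at + c)\, \phi(t)\, dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t)\, \phi\big( (t - c)/a \big)\, dt \qquad \text{(2.52c)}$$
and, combining the arguments used to explain definitions (2.52a) and (2.52b), we see that transforming the variable of integration to $t' = at + c$ gives
$$\int_{-\infty}^{\infty} u_G(at + c)\, \phi(t)\, dt = \frac{1}{|a|} \int_{-\infty}^{\infty} u_G(t')\, \phi\big( (t' - c)/a \big)\, dt' ,$$
justifying definition (2.52c). In general, any variable transformation that is permitted for the argument of a true function we also permit for the argument of a generalized function unless it results in an inappropriate test function.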
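We define a generalized function $u_G(t)$ to be even if
$$\int_{-\infty}^{\infty} u_G(t)\, \phi_o(t)\, dt = 0 \qquad \text{(2.52d)}$$
for all odd test functions $\phi_o$, and to be odd if
$$\int_{-\infty}^{\infty} u_G(t)\, \phi_e(t)\, dt = 0 \qquad \text{(2.52e)}$$
for all even test functions $\phi_e$. This gives $u_G(t)$ the same behavior it would have if it were an even or odd true function multiplied by $\phi_o$ or $\phi_e$, respectively, and integrated over all t. Putting a subscript e on the generalized function $u_{Ge}(t)$ to show that it obeys the above definition for an even generalized function, we note that, as described in Eq. (2.11c) above, any test function $\phi(t)$ can be written as the sum of an even function $\phi_e(t)$ and an odd function $\phi_o(t)$. Hence, for any test function φ and an even generalized function $u_{Ge}(t)$, we can write, using definition (2.52d),
$$\int_{-\infty}^{\infty} u_{Ge}(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, [\phi_e(t) + \phi_o(t)]\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_e(t)\, dt + \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_o(t)\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_e(t)\, dt .$$
Using definition (2.52b) with $a = -1$, we also have
$$\int_{-\infty}^{\infty} u_{Ge}(-t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi(-t)\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, [\phi_e(t) - \phi_o(t)]\, dt = \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_e(t)\, dt ,$$
where in the last two steps we use $\phi_o(-t) = -\phi_o(t)$, $\phi_e(-t) = \phi_e(t)$, and definition (2.52d). We see that both
$$\int_{-\infty}^{\infty} u_{Ge}(-t)\, \phi(t)\, dt \quad \text{and} \quad \int_{-\infty}^{\infty} u_{Ge}(t)\, \phi(t)\, dt$$
are equal to
$$\int_{-\infty}^{\infty} u_{Ge}(t)\, \phi_e(t)\, dt$$
for any test function φ, so by definition (2.47a) for the equality of two generalized functions, it follows that
$$u_{Ge}(-t) = u_{Ge}(t) \qquad \text{(2.52f)}$$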
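for any even generalized function $u_{Ge}(t)$. If $u_{Go}(t)$ is any odd generalized function, we can use $\phi(t) = \phi_e(t) + \phi_o(t)$ and definition (2.52e) to get
$$\int_{-\infty}^{\infty} u_{Go}(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, [\phi_e(t) + \phi_o(t)]\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, \phi_o(t)\, dt$$
and
$$\int_{-\infty}^{\infty} u_{Go}(-t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, \phi(-t)\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, [\phi_e(t) - \phi_o(t)]\, dt = -\int_{-\infty}^{\infty} u_{Go}(t)\, \phi_o(t)\, dt$$
or
$$\int_{-\infty}^{\infty} [-u_{Go}(-t)]\, \phi(t)\, dt = \int_{-\infty}^{\infty} u_{Go}(t)\, \phi_o(t)\, dt .$$
Clearly, $\int_{-\infty}^{\infty} u_{Go}(t)\, \phi(t)\, dt$ and $\int_{-\infty}^{\infty} [-u_{Go}(-t)]\, \phi(t)\, dt$ are equal to each other because they are both equal to $\int_{-\infty}^{\infty} u_{Go}(t)\, \phi_o(t)\, dt$ for any test function φ, so by definition (2.47a) we conclude that
$$u_{Go}(t) = -u_{Go}(-t)$$
or
$$u_{Go}(-t) = -u_{Go}(t) . \qquad \text{(2.52g)}$$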
$$u_G'(t) = u_G^{(1)}(t) ,$$
with
$$\int u_G'\, \phi = -\int u_G\, \frac{d\phi}{dt}$$
for any test function φ. Therefore, the new generalized function $u_G'(t)$ satisfies the equation
$$\int_{-\infty}^{\infty} u_G'(t)\, \phi(t)\, dt = -\int_{-\infty}^{\infty} u_G(t) \left( \frac{d\phi}{dt} \right) dt \qquad \text{(2.53b)}$$
for any test function φ. We note that this definition is consistent with a formal integration by parts, treating $u_G'(t)$ like a true function $u'(t)$ to get
$$\int_{-\infty}^{\infty} u_G'(t)\, \phi(t)\, dt = \Big[ u_G(t)\, \phi(t) \Big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} u_G(t) \left( \frac{d\phi}{dt} \right) dt = -\int_{-\infty}^{\infty} u_G(t) \left( \frac{d\phi}{dt} \right) dt ,$$
with the term in square brackets [ ] zero for all test functions φ. We can make this first term zero either by requiring φ to approach zero as $t \to \pm\infty$ or by having $u_G(t)$ equal a true function in the sense of (2.48g) with the true function becoming zero as $t \to \pm\infty$. The integral involving $\phi'(t) = d\phi/dt$ must also, of course, have a well-defined meaning for all the test functions φ.
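Definition (2.53b) can be exercised numerically by letting a true function with a kink stand in for $u_G$. In the sketch below, |t| plays that role, so its derivative in the sense of (2.53b) should act on test functions the same way sgn(t) does. The Gaussian test function, the integration limits, and all helper names are hypothetical choices made for this illustration.

```python
import math


def integrate(g, t_min=-12.0, t_max=12.0, n=24000):
    """Midpoint rule; with these defaults t = 0 is a cell edge, not a midpoint."""
    h = (t_max - t_min) / n
    return h * sum(g(t_min + (k + 0.5) * h) for k in range(n))


def functional(u):
    """Return the functional phi -> integral of u(t)*phi(t) dt for a true function u."""
    return lambda phi: integrate(lambda t: u(t) * phi(t))


def derivative(functional_u):
    """Return the derivative functional of (2.53b).

    The returned callable must be given d(phi)/dt, and it evaluates
    minus the original functional applied to that derivative.
    """
    return lambda dphi: -functional_u(dphi)


def phi(t):
    # Hypothetical test function, smooth and rapidly decaying.
    return math.exp(-(t - 1.0) ** 2)


def dphi(t):
    # Its ordinary derivative d(phi)/dt.
    return -2.0 * (t - 1.0) * math.exp(-(t - 1.0) ** 2)


def sgn(t):
    return (t > 0) - (t < 0)


via_definition = derivative(functional(abs))(dphi)   # -integral of |t| * phi'(t) dt
direct = functional(sgn)(phi)                        # integral of sgn(t) * phi(t) dt
print(via_definition, direct)
```

The two numbers agree, and for this particular test function both equal $\sqrt{\pi}\, \mathrm{erf}(1)$, even though |t| has no ordinary derivative at t = 0.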
The convolution of two generalized functions $u_G(t)$ and $v_G(t)$ is defined to be another generalized function
$$w_G(t) = u_G(t) * v_G(t) . \qquad \text{(2.54a)}$$
From Eqs. (2.47a) and (2.47b), we know that (2.54a) must mean that
$$\int_{-\infty}^{\infty} w_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} [u_G(t) * v_G(t)]\, \phi(t)\, dt \qquad \text{(2.54b)}$$
for all test functions φ. We now give meaning to both sides of (2.54b) by defining that, for all test functions φ,
$$\int_{-\infty}^{\infty} w_G(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} [u_G(t) * v_G(t)]\, \phi(t)\, dt = \int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, v_G(t'')\, \phi(t' + t'') . \qquad \text{(2.54c)}$$
Note that the right-hand side of (2.54c) is as well defined as our previous definitions, since
$$\phi_v(t') = \int_{-\infty}^{\infty} v_G(t'')\, \phi(t' + t'')\, dt''$$
is just another complex number depending on the real parameter $t'$, which can be treated as another true test function $\phi_v(t')$ inside the double integral of (2.54c),
$$\int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, v_G(t'')\, \phi(t' + t'') = \int_{-\infty}^{\infty} u_G(t')\, \phi_v(t')\, dt' .$$
As long as $\phi(t' + t'')$ and $\phi_v(t')$ are both test functions whenever φ is a test function, definition (2.54c) should present no difficulties. To justify this definition, we note that formally treating $u_G(t)$ and $v_G(t)$ as true functions gives
$$\int_{-\infty}^{\infty} [u_G(t) * v_G(t)]\, \phi(t)\, dt = \int_{-\infty}^{\infty} dt''\, \phi(t'') \int_{-\infty}^{\infty} dt'\, u_G(t')\, v_G(t'' - t') = \int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, \phi(t'')\, v_G(t'' - t') ,$$
where the last step interchanges the order of integration. We now use (2.52a) to write
$$\int_{-\infty}^{\infty} \phi(t'')\, v_G(t'' - t')\, dt'' = \int_{-\infty}^{\infty} v_G(t'')\, \phi(t'' + t')\, dt'' ,$$
which leads to
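$$\int_{-\infty}^{\infty} [u_G(t) * v_G(t)]\, \phi(t)\, dt = \int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, v_G(t'')\, \phi(t' + t'') ,$$
justifying the definition given in (2.54c). Note that the order of integration inside the double integral of (2.54c) can be freely interchanged,
$$\int_{-\infty}^{\infty} dt'\, u_G(t') \int_{-\infty}^{\infty} dt''\, v_G(t'')\, \phi(t' + t'') = \int_{-\infty}^{\infty} dt''\, v_G(t'') \int_{-\infty}^{\infty} dt'\, u_G(t')\, \phi(t' + t'') ,$$
$$u_G(t) * \phi(t) = \int_{-\infty}^{\infty} u_G(t')\, \phi(t - t')\, dt' = \int_{-\infty}^{\infty} u_G(t')\, \phi\big( -(t' - t) \big)\, dt' = \int_{-\infty}^{\infty} u_G(t - t')\, \phi(t')\, dt' , \qquad \text{(2.55a)}$$
where definition (2.52c) with $a = -1$ and $c = t$ is used in the last step of (2.55a). It clearly makes sense to say that
$$\int_{-\infty}^{\infty} u_G(t - t')\, \phi(t')\, dt' = \phi(t) * u_G(t) ,$$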
5 5 5
We define Glim, the generalized limit of the sequence of true functions $u_n(t)$, by taking the standard limit of the sequence of integrals,
Generalized Limits · 2.12
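$$\lim_{n \to \infty} \int_{-\infty}^{\infty} u_n(t)\, \phi(t)\, dt ,$$
and requiring that the generalized limit of the sequence of true functions $u_n(t)$, written as
$$\mathop{\mathrm{Glim}}_{n \to \infty} u_n(t) ,$$
satisfy
$$\int_{-\infty}^{\infty} \Big[ \mathop{\mathrm{Glim}}_{n \to \infty} u_n(t) \Big]\, \phi(t)\, dt = \lim_{n \to \infty} \int_{-\infty}^{\infty} u_n(t)\, \phi(t)\, dt \qquad \text{(2.56a)}$$
for any test function φ. In effect, the generalized limit Glim is what we get when we insist on moving the standard limit inside the integral. Almost always, of course, it turns out that the generalized limit is the same as the standard limit,
$$\mathop{\mathrm{Glim}}_{n \to \infty} u_n(t) = \lim_{n \to \infty} u_n(t) ,$$
so that
$$\lim_{n \to \infty} \int_{-\infty}^{\infty} u_n(t)\, \phi(t)\, dt = \int_{-\infty}^{\infty} \Big[ \lim_{n \to \infty} u_n(t) \Big]\, \phi(t)\, dt , \qquad \text{(2.56b)}$$
but this is not always the case. If we define the function Π (see Fig. 2.6) by
$$\Pi(t, T) = \begin{cases} 1 & \text{for } |t| < T \\ 1/2 & \text{for } |t| = T \\ 0 & \text{for } |t| > T \end{cases} , \qquad \text{(2.56c)}$$
and take
$$u_n(t) = \frac{1}{n}\, \Pi\!\left( \frac{t}{n}, 1 \right) \qquad \text{(2.56d)}$$
with the test function
$$\phi(t) = 1 ,$$
then
$$\int_{-\infty}^{\infty} u_n(t)\, dt = \frac{1}{n} \int_{-\infty}^{\infty} \Pi\!\left( \frac{t}{n}, 1 \right) dt = 2 ,$$
which makes
$$\lim_{n \to \infty} \int_{-\infty}^{\infty} u_n(t)\, dt = 2 . \qquad \text{(2.56e)}$$
Because $u_n(t)$ is never larger than 1/n, however, $\lim_{n \to \infty} u_n(t) = 0$ for every t, so that
$$\int_{-\infty}^{\infty} \Big[ \lim_{n \to \infty} u_n(t) \Big]\, dt = 0 . \qquad \text{(2.56f)}$$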
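The mismatch between (2.56e) and (2.56f) is easy to reproduce numerically. The sketch below evaluates the area under $u_n(t)$ and its peak height for a few values of n; the midpoint rule, the cell count, and the function names are illustrative assumptions only.

```python
def Pi(t, T):
    """The function of (2.56c): 1 inside |t| < T, 1/2 at |t| = T, 0 outside."""
    if abs(t) < T:
        return 1.0
    if abs(t) == T:
        return 0.5
    return 0.0


def u_n(t, n):
    """The sequence of (2.56d)."""
    return Pi(t / n, 1.0) / n


def integral_of_u_n(n, cells=4000):
    """Midpoint rule over [-2n, 2n]; the jumps at t = -n and t = n sit on cell edges."""
    h = 4.0 * n / cells
    return h * sum(u_n(-2.0 * n + (k + 0.5) * h, n) for k in range(cells))


areas = {n: integral_of_u_n(n) for n in (1, 10, 100, 1000)}
heights = {n: u_n(0.0, n) for n in areas}
for n in areas:
    print(n, areas[n], heights[n])
```

The area stays at 2 for every n while the height 1/n shrinks toward zero, so the limit of the integrals and the integral of the pointwise limit cannot agree.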
[FIGURE 2.6. The function Π(t, T), with its edges at t = −T and t = T.]
The disagreement of (2.56e) and (2.56f) shows that there can be a very important difference
between the generalized limit and the standard limit, because Eq. (2.56b) does not always hold
true. We cannot avoid this problem by ruling out constant test functions such as (t ) 1 .
Consider, for example,
1
(t )
1 t2
un (t ) t sin(t n) .
We find that21
5
t sin(t n)
5
³ 1 t 2
dt & e 1 n , (2.57a)
which gives
5
t sin(t n)
lim
n 75 ³
5
1 t2
dt & . (2.57b)
Once again, we have found a sequence of true functions $u_n(t)$ that does not satisfy (2.56b). This
second example can, in fact, be seen to fail (2.56b) for much the same reason as the first. Since an
even function is being integrated, we can write that [see Eq. (2.19)]
$$\lim_{n\to\infty}\int_{-\infty}^{\infty}\frac{t\,\sin(t/n)}{1+t^{2}}\,dt=2\lim_{n\to\infty}\int_{0}^{\infty}\frac{t\,\sin(t/n)}{1+t^{2}}\,dt. \tag{2.57d}$$
Consider what happens to the first, positive hump of the sine as $n$ increases in the integral on the
right-hand side of Eq. (2.57d). The values of $t$ for which $\sin(t/n)$ is significantly different from
zero, say from $n\cdot(\pi/4)$ to $n\cdot(3\pi/4)$, comprise an interval $\Delta t=n\cdot(\pi/2)$ with a width that
increases linearly with $n$, just like the interval $2n$ in (2.56d) over which $\Pi(t/n,1)$ equals one. The
21
I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, edited by Alan Jeffrey, 5th ed.
(Academic Press, New York, 1994), p. 445, formula 4 in Sec. 3.723 with a = 1/n and β = 1.
center of this hump is at $t=n\cdot(\pi/2)$, so as $n$ increases, the hump's center appears at ever larger
values of $t$. Hence, we can make the approximation that for large $n$
$$\frac{t}{1+t^{2}}\approx t^{-1}\approx\frac{2}{n\pi}.$$
The size of the integrand at the hump thus decreases as $1/n$, while the hump's width, $\Delta t=n\cdot(\pi/2)$, increases as $n$. The product
of the size and width therefore tends to a constant as $n$ gets large, preventing the integral from
shrinking as $n\to\infty$. This is the same phenomenon that caused our first example $n^{-1}\,\Pi(t/n,1)$ to
fail Eq. (2.56b). Up to this point, we have, of course, only discussed the contribution of the first
This formula should be interpreted in the sense of (2.47b) and (2.56a); that is, it means
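$$\int_{-\infty}^{\infty}\Big[\operatorname*{G\,lim}_{n\to\infty}u_n(t)\Big]\phi(t)\,dt=\lim_{n\to\infty}\int_{-\infty}^{\infty}u_n(t)\,\phi(t)\,dt=\int_{-\infty}^{\infty}u_G(t)\,\phi(t)\,dt \tag{2.58b}$$
for all test functions $\phi$. We use the sequence of true functions whose generalized limit is the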
generalized function to define the Fourier transform of the generalized function. If a sequence of
true functions w1 (t ), w2 (t ),… , wn (t ),… can be forward Fourier transformed to give another
Fourier Transforms of Generalized Functions · 2.13
$$W_n(f)=\int_{-\infty}^{\infty}w_n(t)\,e^{-2\pi ift}\,dt \tag{2.59a}$$
and
$$w_n(t)=\int_{-\infty}^{\infty}W_n(f)\,e^{2\pi ift}\,df \tag{2.59b}$$
for all values of $n$, we then define the forward Fourier transform of the generalized function
$$w_G(t)=\operatorname*{G\,lim}_{n\to\infty}w_n(t) \tag{2.59c}$$
to be
$$F^{(-ift)}\big(w_G(t)\big)=\operatorname*{G\,lim}_{n\to\infty}W_n(f). \tag{2.59d}$$
We expect the sequence of true functions $W_1(f), W_2(f),\ldots, W_n(f),\ldots$ also to give a generalized
function when we take the generalized limit of the sequence,
$$W_G(f)=\operatorname*{G\,lim}_{n\to\infty}W_n(f), \tag{2.59e}$$
The double-arrow notation $\leftrightarrow$ introduced in the discussion after Eq. (2.35d) can be used to
restate this definition more concisely. We define that whenever the sequence $w_1(t), w_2(t),\ldots$ has the generalized limit $w_G(t)$ and
$$w_1(t)\leftrightarrow W_1(f),\; w_2(t)\leftrightarrow W_2(f),\;\ldots,\; w_n(t)\leftrightarrow W_n(f),\;\ldots$$
then
$$w_G(t)\leftrightarrow W_G(f). \tag{2.59g}$$
Now at last we can attach a meaning to the Fourier transform pair that could not be completed
in Eqs. (2.43d)–(2.43f). The explicit development that follows is perhaps somewhat long, but
worth doing to show how to construct the Fourier transforms of some of the functions violating
one or more of requirements (V) through (VIII) in Sec. 2.4. We create the sequence
where quotes “ ” are used to indicate that “sgn(f)” is a generalized function instead of the
true function sgn(f) defined in Eq. (2.42c) above. The reason for this choice of sequence is
straightforward: the function $[\operatorname{sgn}(f)\,\Pi(f,n)]$ satisfies requirements (V) through (VIII) in Sec. 2.4
for every finite value of $n$ and so has a well-defined Fourier transform; as $n$ increases, the function
$[\operatorname{sgn}(f)\,\Pi(f,n)]$ resembles ever more closely the $\operatorname{sgn}(f)$ function to which we want to give a
Fourier transform. We note that for any test function $\phi$
$$\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(f)\,\operatorname{sgn}(f)\,\Pi(f,n)\,df=\lim_{n\to\infty}\int_{-n}^{n}\phi(f)\,\operatorname{sgn}(f)\,df=\int_{-\infty}^{\infty}\phi(f)\,\operatorname{sgn}(f)\,df,$$
so
$$\text{“}\operatorname{sgn}(f)\text{”}=\operatorname{sgn}(f) \tag{2.60b}$$
in the sense of Eq. (2.48c). This equivalence can be used to justify dropping the distinction
between “ sgn( f ) ” and sgn( f ) . Applied mathematicians who work with generalized functions
often drop the distinction between a generalized function and the true function to which it is
equivalent, and the double-quote notation introduced here is not standard usage. There is,
however, no harm in keeping track of the distinction between the two types of functions, and the
double quotes acknowledge the close relationship of the two functions while reminding us that
they are not the same.
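The inverse Fourier transform of $[-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)]$ is, using the identity
$e^{i\theta}=\cos\theta+i\sin\theta$,
$$F^{(itf)}\big(-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)\big)=-i\pi\int_{-\infty}^{\infty}e^{2\pi ift}\,\operatorname{sgn}(f)\,\Pi(f,n)\,df=2\pi\int_{0}^{n}\sin(2\pi ft)\,df.$$
Here $[\cos(2\pi ft)\,\operatorname{sgn}(f)]$,
which is an odd function in ƒ, has an integral between $(-n)$ and $n$ that is zero according to Eq. (2.17); and the integral
between $(-n)$ and $n$ of $[\sin(2\pi ft)\,\operatorname{sgn}(f)]$, which is an even function in ƒ, is twice the value of its
integral from zero to $n$ according to Eq. (2.19). Making the substitution $f'=2\pi tf$ gives
$$F^{(itf)}\big(-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)\big)=-\frac{1}{t}\cos f'\,\Big|_{0}^{2\pi nt}.$$
This shows that the inverse Fourier transform of $[-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)]$ is $(1/t)[1-\cos(2\pi nt)]$.
Now we calculate the forward Fourier transform of $(1/t)[1-\cos(2\pi nt)]$. We get
$$\begin{aligned}F^{(-ift)}\big(t^{-1}[1-\cos(2\pi nt)]\big)&=\int_{-\infty}^{\infty}e^{-2\pi ift}\,t^{-1}[1-\cos(2\pi nt)]\,dt\\&=\int_{-\infty}^{\infty}e^{-2\pi ift}\,\frac{dt}{t}-\int_{-\infty}^{\infty}e^{-2\pi ift}\,\frac{1}{t}\cos(2\pi nt)\,dt\\&=-i\pi\,\operatorname{sgn}(f)+i\int_{-\infty}^{\infty}\frac{1}{t}\cos(2\pi nt)\sin(2\pi ft)\,dt.\end{aligned}$$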
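In the last step, Eq. (2.43d) is used to evaluate the integral of $[e^{-2\pi ift}\,t^{-1}]$; we also substitute
$e^{-i\theta}=\cos\theta-i\sin\theta$ into the integral of $[e^{-2\pi ift}\,t^{-1}\cos(2\pi nt)]$, discovering that the Cauchy principal
value of the integral of $[t^{-1}\cos(2\pi ft)\cos(2\pi nt)]$, which is an odd function in $t$, is zero [see Eq.
(2.17)]. The remaining integral over the even function
$$[t^{-1}\sin(2\pi ft)\cos(2\pi nt)]$$
can be simplified by applying Eq. (2.19) and then consulting a table of definite integrals,22
$$\int_{-\infty}^{\infty}\frac{1}{t}\cos(2\pi nt)\sin(2\pi ft)\,dt=2\,\operatorname{sgn}(f)\int_{0}^{\infty}\frac{1}{t}\cos(2\pi nt)\sin(2\pi|f|t)\,dt=\pi\,\operatorname{sgn}(f)\,\Pi(2\pi n,2\pi|f|)=\pi\,\operatorname{sgn}(f)\,\Pi(n,|f|).$$
Therefore,
$$F^{(-ift)}\big(t^{-1}[1-\cos(2\pi nt)]\big)=\operatorname{sgn}(f)\big[-i\pi+i\pi\,\Pi(n,|f|)\big]=-i\pi\,\operatorname{sgn}(f)\big[1-\Pi(n,|f|)\big]=-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n).$$
Hence, $(1/t)[1-\cos(2\pi nt)]$ and $[-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)]$ are a Fourier-transform pair,
$$\frac{1}{t}\big[1-\cos(2\pi nt)\big]\leftrightarrow-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n).$$
We therefore have the two sequences of true functions
$$\frac{1}{t}\big[1-\cos(2\pi t)\big],\;\frac{1}{t}\big[1-\cos(4\pi t)\big],\;\ldots,\;\frac{1}{t}\big[1-\cos(2\pi nt)\big],\;\ldots \tag{2.60c}$$
and
$$-i\pi\,\operatorname{sgn}(f)\,\Pi(f,1),\;-i\pi\,\operatorname{sgn}(f)\,\Pi(f,2),\;\ldots,\;-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n),\;\ldots$$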
22
I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, p. 453, formula 2 in Sec. 3.741 with
a = 2π|f| and b = 2πn.
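such that each member of the lower sequence is the forward Fourier transform of the
corresponding member of the upper sequence and each member of the upper sequence is the
inverse Fourier transform of the corresponding member of the lower sequence. We know from
(2.60a) and (2.60b) that the generalized function given by the generalized limit of the lower
sequence is
$$\operatorname*{G\,lim}_{n\to\infty}\big[-i\pi\,\operatorname{sgn}(f)\,\Pi(f,n)\big]=-i\pi\,\text{“}\operatorname{sgn}(f)\text{”}, \tag{2.60d}$$
but what is the generalized function given by the generalized limit of the upper sequence? We
have for any test function $\phi$
$$\begin{aligned}\int_{-\infty}^{\infty}\phi(t)\operatorname*{G\,lim}_{n\to\infty}\big\{t^{-1}[1-\cos(2\pi nt)]\big\}\,dt&=\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\,\frac{1}{t}\big[1-\cos(2\pi nt)\big]\,dt\\&=\lim_{n\to\infty}\left\{\int_{-\infty}^{\infty}\phi(t)\,\frac{dt}{t}-\int_{-\infty}^{\infty}\phi(t)\cos(2\pi nt)\,\frac{dt}{t}\right\}\\&=\int_{-\infty}^{\infty}\phi(t)\,\frac{dt}{t}-\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\,\frac{1}{t}\cos(2\pi nt)\,dt.\end{aligned} \tag{2.60e}$$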
$$\begin{aligned}\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\,\frac{1}{t}\cos(2\pi nt)\,dt&=\lim_{n\to\infty}\int_{-\infty}^{-\varepsilon}\phi(t)\,\frac{1}{t}\cos(2\pi nt)\,dt\\&\quad+\lim_{n\to\infty}\int_{-\varepsilon}^{\varepsilon}\phi(t)\,\frac{1}{t}\cos(2\pi nt)\,dt\\&\quad+\lim_{n\to\infty}\int_{\varepsilon}^{\infty}\phi(t)\,\frac{1}{t}\cos(2\pi nt)\,dt,\end{aligned} \tag{2.60f}$$
where $\varepsilon$ is a small positive number. By making all the test functions $\phi(t)$ have finite variation as
in requirement (VIII) in Sec. 2.4, we recognize that the first and third integrals on the right-hand side
of (2.60f) become zero as $n\to\infty$, because eventually the cosine oscillates both positive and
negative over each infinitesimal interval while $\phi(t)/t$ barely changes at all; the integrals can be
made as small as desired by picking a large enough value of $n$. For future use, we note that for
any continuous, finite-variation test function $\phi$,
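$$\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\sin(nt)\,dt=\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\cos(nt)\,dt=\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\,e^{\pm int}\,dt=0,$$
so that
$$\operatorname*{G\,lim}_{n\to\infty}\sin(nt)=\operatorname*{G\,lim}_{n\to\infty}\cos(nt)=\operatorname*{G\,lim}_{n\to\infty}e^{\pm int}=0. \tag{2.60g}$$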
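The vanishing limits in (2.60g) can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the original argument: it picks the Gaussian test function $\phi(t)=e^{-t^{2}}$ (our choice), for which the cosine integral is known in closed form, $\sqrt{\pi}\,e^{-n^{2}/4}$, and uses a simple midpoint rule; the helper names are hypothetical.

```python
import math


def midpoint(f, a, b, cells):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / cells
    return h * sum(f(a + (j + 0.5) * h) for j in range(cells))


def phi(t):
    """Illustrative test function (our choice): a Gaussian."""
    return math.exp(-t * t)


def cos_integral(n):
    """Integral of phi(t) * cos(n t); the tails beyond |t| = 8 are negligible."""
    return midpoint(lambda t: phi(t) * math.cos(n * t), -8.0, 8.0, 40000)


# Numerical value next to the closed form sqrt(pi) * exp(-n^2 / 4).
for n in (0, 1, 2, 5, 10):
    print(n, cos_integral(n), math.sqrt(math.pi) * math.exp(-n * n / 4.0))
```

The integral drops toward zero as n grows because the positive and negative lobes of the cosine cancel against a test function that changes slowly by comparison.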
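The middle integral on the right-hand side of (2.60f) can be written as
$$\int_{-\varepsilon}^{\varepsilon}\phi(t)\cos(2\pi nt)\,\frac{dt}{t}\approx\phi(0)\int_{-\infty}^{\infty}\Pi(t,\varepsilon)\,\frac{1}{t}\cos(2\pi nt)\,dt,$$
where we have chosen $\varepsilon$ small enough that $\phi(t)$ barely changes over the integral, letting us
replace it by $\phi(0)$. Now the middle integral on the right-hand side of (2.60f) can be recognized as
the Cauchy principal value of the integral of $(1/t)\,\Pi(t,\varepsilon)\cos(2\pi nt)$, which is an odd function of $t$
and must be zero according to Eq. (2.17). Hence, (2.60f) becomes
$$\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\,\frac{1}{t}\cos(2\pi nt)\,dt=0,$$
and (2.60e) reduces to
$$\int_{-\infty}^{\infty}\phi(t)\operatorname*{G\,lim}_{n\to\infty}\big\{t^{-1}[1-\cos(2\pi nt)]\big\}\,dt=\int_{-\infty}^{\infty}\phi(t)\,\frac{dt}{t} \tag{2.60h}$$
for any test function $\phi$. Since (2.60h) denotes equality in the sense of Eq. (2.48c), we can define
the generalized function “$t^{-1}$” to be
$$\text{“}t^{-1}\text{”}=\operatorname*{G\,lim}_{n\to\infty}\big\{t^{-1}[1-\cos(2\pi nt)]\big\}. \tag{2.60i}$$
Equations (2.60d) and (2.60j) show that $[-i\pi\,\text{“}\operatorname{sgn}(f)\text{”}]$ and “$t^{-1}$” are the generalized limits of the
two sequences in (2.60c). Because all the sequence members are Fourier transform pairs, we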
know, according to (2.59g), that $[-i\pi\,\text{“}\operatorname{sgn}(f)\text{”}]$ and “$t^{-1}$” are a Fourier transform pair even
though $[-i\pi\,\operatorname{sgn}(f)]$ and $t^{-1}$ do not satisfy requirements (V) through (VIII) in Sec. 2.4 and, as
shown in Eqs. (2.43a) and (2.43f), their transforms cannot be evaluated as standard integrals. In
this sense, we can write that

This can also be written as, reversing the sign of ƒ in (2.60k), the sign of $t$ in (2.60ℓ), and using
Eq. (2.42c) to get that $\operatorname{sgn}(-f)=-\operatorname{sgn}(f)$,

It is important to remember that Eqs. (2.60k) and (2.60m) are true only when integrals between
$-\infty$ and $+\infty$ are interpreted as Cauchy principal values and (2.60ℓ) and (2.60n) are true only
when equality is defined as in Eq. (2.48c) using generalized function theory. Strictly speaking, it
might be better to say that the Cauchy principal value of
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,\frac{dt}{t}\quad\text{is}\quad\pm i\pi\,\operatorname{sgn}(f)$$
and that
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,i\pi\,\text{“}\operatorname{sgn}(f)\text{”}\,df=\mp\,\text{“}t^{-1}\text{”}.$$
The equality
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,\frac{dt}{t}=\pm i\pi\,\operatorname{sgn}(f) \tag{2.61a}$$
is usually not listed in standard tables of improper integrals without notation showing that it is a
Cauchy principal value, and the equality
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,\operatorname{sgn}(f)\,df=\pm\frac{i}{\pi t} \tag{2.61b}$$
is usually not listed in these tables under any circumstances. It is also true, however, that (2.61a)
and (2.61b) are constantly used either explicitly or implicitly in Fourier-transform theory; and
lists of Fourier-transform pairs often contain (2.61a) and (2.61b). Unfortunately, it is standard
practice in the Fourier-transform tables that do list these integrals to omit any explanation that
they are only true when interpreted as the Fourier transforms of generalized functions. In general,
when using tables of Fourier transforms, all those transforms that do not exist as standard
integrals or Cauchy principal values should be interpreted as the transforms of generalized
functions and used only in the context of generalized function theory.
with
$$\int_{a}^{b}\delta(t)\,f(t)\,dt=\begin{cases}f(0) & \text{for } a<0<b\\ 0 & \text{for } a<b<0 \text{ or } 0<a<b\end{cases}. \tag{2.62b}$$
(t ) lim[n (t , n 1 )] (2.63a)
n 75
or
$$\delta(t)=\lim_{n\to\infty}\left(\sqrt{\frac{n}{\pi}}\;e^{-nt^{2}}\right). \tag{2.63b}$$
There are, in fact, two different—but equivalent—mathematically exact ways to define the delta
function. The first way is to create a well-defined functional ³ that, when operating on a
complex-valued test function (t ) with a real argument t, produces as its complex number (0) ,
the value of at t equal to zero,
³ (0) . (2.64a)
The Delta Function · 2.14
This makes (t ) the generalized function associated with functional ³ , with (t ) having the
property that
$$\int_{-\infty}^{\infty}\delta(t)\,\phi(t)\,dt=\phi(0) \tag{2.64b}$$
for all test functions $\phi$. The second way to define $\delta(t)$ is to say it is the generalized limit of a
sequence such as the ones specified in (2.63a) and (2.63b),
(t ) G lim[n (t , n 1 )] (2.65a)
n 75
or
$$\delta(t)=\operatorname*{G\,lim}_{n\to\infty}\left(\sqrt{\frac{n}{\pi}}\;e^{-nt^{2}}\right). \tag{2.65b}$$
Although the delta function is a generalized function in every sense of the term, we follow
standard notation and do not add the G subscript—or add the quotes “ ”—used to label other
generalized functions in this chapter.
Defining $\delta(t)$ with a functional, as in (2.64a), shows that this generalized function can be
used on an extremely large set of test functions: any true function that is continuous at the origin
is an acceptable and appropriate test function. The subset of test functions $\phi_{ab}$ used in Eqs.
(2.48d)–(2.48g) has $a<b$ with $\phi_{ab}(t)$ automatically set to zero when $t$ does not lie inside the
interval $a<t<b$. These functions can be used in (2.64b) to show that
$$\int_{-\infty}^{\infty}\delta(t)\,\phi_{ab}(t)\,dt=\phi_{ab}(0)=0$$
when $a<b<0$ or $0<a<b$. Therefore, we have
$$\int_{-\infty}^{\infty}\delta(t)\,\phi_{ab}(t)\,dt=\int_{-\infty}^{\infty}0\cdot\phi_{ab}(t)\,dt=0$$
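$$\begin{aligned}\int_{-\infty}^{\infty}\phi(t)\left[\operatorname*{G\,lim}_{n\to\infty}\sqrt{\frac{n}{\pi}}\,e^{-nt^{2}}\right]dt&=\lim_{n\to\infty}\int_{-\infty}^{\infty}\sqrt{\frac{n}{\pi}}\,e^{-nt^{2}}\,\phi(t)\,dt=\lim_{n\to\infty}\phi(0)\int_{-\infty}^{\infty}\sqrt{\frac{n}{\pi}}\,e^{-nt^{2}}\,dt\\&=\phi(0)\lim_{n\to\infty}\int_{-\infty}^{\infty}\sqrt{\frac{n}{\pi}}\,e^{-nt^{2}}\,dt\\&=\phi(0)\end{aligned} \tag{2.66}$$
for any test function $\phi$. As $n$ gets large in (2.66), only the value of $\phi$ at $t=0$ can contribute
significantly to the integral. Replacing $\phi(t)$ by $\phi(0)$ quickly reduces the whole expression to
$\phi(0)$, showing that the generalized limit of the sequence in (2.65b) is indeed the delta function.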
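A short numerical sketch of (2.66) follows. It assumes the Gaussian sequence of (2.65b) and, purely for illustration, the test function $\phi(t)=\cos t$ (our choice, with $\phi(0)=1$); for this pair the integral is known exactly, $e^{-1/(4n)}$, so the approach to $\phi(0)$ can be followed. The function names are hypothetical.

```python
import math


def gaussian_member(n, t):
    """n-th member of the sequence in Eq. (2.65b): sqrt(n/pi) * exp(-n t^2)."""
    return math.sqrt(n / math.pi) * math.exp(-n * t * t)


def smeared_value(phi, n, cells=20000):
    """Midpoint-rule integral of gaussian_member(n, t) * phi(t).

    The range is +/- 10/sqrt(n); outside it the Gaussian is below exp(-100).
    """
    half_width = 10.0 / math.sqrt(n)
    h = 2.0 * half_width / cells
    total = 0.0
    for j in range(cells):
        t = -half_width + (j + 0.5) * h
        total += gaussian_member(n, t) * phi(t)
    return h * total


# With phi = cos, the exact value exp(-1/(4n)) tends to phi(0) = 1.
for n in (1, 10, 100, 1000):
    print(n, smeared_value(math.cos, n), math.exp(-1.0 / (4.0 * n)))
```

As n increases, the Gaussian narrows until only the neighborhood of t = 0 matters, which is the content of the replacement of $\phi(t)$ by $\phi(0)$ in (2.66).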
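Some commonly used sequences that have the delta function as their generalized limits are
$$\delta(t)=\operatorname*{G\,lim}_{n\to\infty}\frac{n/\pi}{1+n^{2}t^{2}}, \tag{2.67a}$$
$$\delta(t)=\operatorname*{G\,lim}_{n\to\infty}\frac{\sin^{2}(nt)}{n\pi t^{2}}, \tag{2.67b}$$
$$\delta(t)=\operatorname*{G\,lim}_{n\to\infty}\frac{\sin(2\pi nt)}{\pi t}, \tag{2.67c}$$
and so on. Perhaps the most interesting of these sequences is (2.67c). We know from (2.65c) that
one important property of the delta function is
$$\int_{-\infty}^{\infty}\delta(t)\,\phi_{ab}(t)\,dt=0$$
whenever the interval $a<t<b$ does not include $t=0$. For the sequence in (2.67c) we likewise have
$$\int_{-\infty}^{\infty}\left[\operatorname*{G\,lim}_{n\to\infty}\frac{\sin(2\pi nt)}{\pi t}\right]\phi_{ab}(t)\,dt=\lim_{n\to\infty}\int_{-\infty}^{\infty}\left[\frac{\sin(2\pi nt)}{\pi t}\right]\phi_{ab}(t)\,dt=0$$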
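$$\operatorname*{G\,lim}_{n\to\infty}\frac{\sin(2\pi nt)}{\pi t}=\delta(t)=0\quad\text{for } t\neq0$$
by the result in Eq. (2.60g). To understand the behavior near $t=0$, we construct function $\phi_{a0b}(t)$ in which the
interval $a<t<b$ does include $t=0$. Now we can write, transforming the variable of integration
to $t'=2\pi nt$,
$$\begin{aligned}\int_{-\infty}^{\infty}\left[\operatorname*{G\,lim}_{n\to\infty}\frac{\sin(2\pi nt)}{\pi t}\right]\phi_{a0b}(t)\,dt&=\lim_{n\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty}\left[\frac{\sin(t')}{t'}\right]\phi_{a0b}\!\left(\frac{t'}{2\pi n}\right)dt'\\&=\phi_{a0b}(0)\lim_{n\to\infty}\frac{1}{\pi}\int_{-\infty}^{\infty}\left[\frac{\sin(t')}{t'}\right]dt'=\phi_{a0b}(0)=\int_{-\infty}^{\infty}\delta(t)\,\phi_{a0b}(t)\,dt,\end{aligned}$$
where in the second-to-last step we use (see any handbook of definite integrals)
$$\int_{-\infty}^{\infty}\frac{\sin(t')}{t'}\,dt'=\pi.$$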
Any arbitrary test function can be written as a function $\phi_{a0b}(t)$ whose interval of nonzero values
includes $t=0$ plus other test functions whose intervals of nonzero values do not include $t=0$;
that is, we can always write $\phi(t)=\phi_{a0b}(t)+[\text{other functions zero at the origin}]$. When this $\phi(t)$ is
multiplied by $\operatorname*{G\,lim}_{n\to\infty}\sin(2\pi nt)/(\pi t)$ and integrated over $t$ between $-\infty$ and $+\infty$, we realize that the
value of the integral is $\phi_{a0b}(0)=\phi(0)$ because the other functions that are zero at the origin give
zero contribution to the integral as $n\to\infty$. Consequently,
$$\int_{-\infty}^{\infty}\left[\operatorname*{G\,lim}_{n\to\infty}\frac{\sin(2\pi nt)}{\pi t}\right]\phi(t)\,dt=\phi(0)=\int_{-\infty}^{\infty}\delta(t)\,\phi(t)\,dt,$$
so
$$\operatorname*{G\,lim}_{n\to\infty}\frac{\sin(2\pi nt)}{\pi t}$$
equals the delta function in the only sense that two generalized functions can ever be equal: the
integral of the left-hand side with any test function is always the same as the integral of the
right-hand side with any test function [see discussion after Eq. (2.47b)]. Figures 2.7(a)–2.7(c)
and 2.8(a)–2.8(c) plot the behavior of the $\sqrt{n/\pi}\cdot e^{-nt^{2}}$ and $(\pi t)^{-1}\sin(2\pi nt)$ sequences, showing the
two different ways these sequences change into delta functions.
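The oscillatory route to the delta function can also be checked numerically. The sketch below assumes the sequence $\sin(2\pi nt)/(\pi t)$ and an illustrative test function $\phi(t)=e^{-25t^{2}}$ (our choice, with $\phi(0)=1$). For this pair the integral equals $\operatorname{erf}(\pi n/5)$, which follows from integrating the Gaussian's Fourier transform over $-n<f<n$; the function names are hypothetical.

```python
import math


def sinc_member(n, t):
    """n-th member of the sequence sin(2 pi n t) / (pi t)."""
    return math.sin(2.0 * math.pi * n * t) / (math.pi * t)


def smeared_value(n, cells=64000, half_width=8.0):
    """Midpoint-rule integral of sinc_member(n, t) * exp(-25 t^2).

    With an even number of cells the midpoints never land on t = 0.
    """
    h = 2.0 * half_width / cells
    total = 0.0
    for j in range(cells):
        t = -half_width + (j + 0.5) * h
        total += sinc_member(n, t) * math.exp(-25.0 * t * t)
    return h * total


# Numerical value next to erf(pi n / 5); both tend to phi(0) = 1.
for n in (1, 2, 4, 8):
    print(n, smeared_value(n), math.erf(math.pi * n / 5.0))
```

Unlike the Gaussian sequence, the members here never become small away from the origin; the integral converges only because the rapid oscillation cancels everything except the contribution near t = 0.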
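We note that for any odd test function $\phi_o(t)$
$$\int_{-\infty}^{\infty}\delta(t)\,\phi_o(t)\,dt=\phi_o(0)=0$$
because, according to Eq. (2.12a), odd functions are zero at the origin. Therefore, from the
definitions of even and odd generalized functions in Eqs. (2.52d) and (2.52e), we conclude that
the delta function is an even generalized function because its integral with all odd test functions is
always zero. This means we can write [see Eq. (2.52f)]
$$\delta(-t)=\delta(t). \tag{2.68a}$$
Shifting the delta function by an amount $t_0$ gives
$$\int_{-\infty}^{\infty}\delta(t-t_0)\,\phi(t)\,dt=\int_{-\infty}^{\infty}\delta(t)\,\phi(t+t_0)\,dt=\phi(t_0)$$
and, because the delta function equals the zero function for $t\neq0$, this result can be written as
$$\int_{a}^{b}\delta(t-t_0)\,\phi(t)\,dt=\begin{cases}0 & \text{for } a<b<t_0 \text{ or } t_0<a<b\\ \phi(t_0) & \text{for } a<t_0<b\end{cases}. \tag{2.68b}$$
Rescaling the argument by a nonzero real constant $c$ gives
$$\int_{-\infty}^{\infty}\delta(c\cdot t)\,\phi(t)\,dt=\frac{1}{|c|}\int_{-\infty}^{\infty}\delta(t)\,\phi(t/c)\,dt=\frac{1}{|c|}\,\phi(0),$$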
FIGURES 2.7(a)–2.7(c). Plots versus $t$, centered on $t=0$, showing how $\sqrt{n/\pi}\;e^{-nt^{2}}$ changes into a delta function of $t$ as $n$ increases.
FIGURES 2.8(a)–2.8(c). Plots versus $t$, centered on $t=0$, showing how $(\pi t)^{-1}\sin(2\pi nt)$ changes into a delta function of $t$ as $n$ increases.
so that
$$\delta(ct)=\frac{1}{|c|}\,\delta(t) \tag{2.68c}$$
because
$$\int_{-\infty}^{\infty}\delta(c\cdot t)\,\phi(t)\,dt=\frac{1}{|c|}\,\phi(0)=\int_{-\infty}^{\infty}\left[\frac{1}{|c|}\,\delta(t)\right]\phi(t)\,dt$$
for all test functions $\phi$. We note that this last rule, Eq. (2.68c), can also be used to show that the
delta function is even, since (2.68a) is just a special case of (2.68c) with $c=-1$.
Equation (2.52c) shows that there is no difficulty handling a general linear transformation of
the delta function's argument, because for any two real constants $a$ and $c$, we have
$$\int_{-\infty}^{\infty}\delta(a\cdot t-c)\,\phi(t)\,dt=\frac{1}{|a|}\int_{-\infty}^{\infty}\delta(t)\,\phi\big((t+c)/a\big)\,dt=\frac{1}{|a|}\,\phi\!\left(\frac{c}{a}\right)=\int_{-\infty}^{\infty}\left[\frac{1}{|a|}\,\delta\!\left(t-\frac{c}{a}\right)\right]\phi(t)\,dt.$$
This is the same answer we would get from factoring $a$ out of the delta function argument and
then using (2.68c) to rescale the delta function.
When the delta function is multiplied by a true function v(t), we have
5 5 5
$$\delta\big(u(t)\big)=\sum_{\text{all }k}\frac{1}{|u'(t_k)|}\,\delta(t-t_k), \tag{2.68f}$$
where $u'(t)=du/dt$ and $t_1,t_2,\ldots$ are the values of $t$ for which $u(t)=0$. This formula only makes
sense, of course, when $u'(t_k)\neq0$ for $t_1,t_2,\ldots$. Perhaps the easiest way to see that (2.68f) must be
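true is to note that the delta function equals the zero function whenever its argument is not zero.
Therefore,
$$\int_{-\infty}^{\infty}\delta\big(u(t)\big)\,\phi(t)\,dt=\sum_{\text{all }k}\left[\int_{t_k-\varepsilon}^{t_k+\varepsilon}\delta\big(u(t)\big)\,\phi(t)\,dt\right], \tag{2.68g}$$
where $\varepsilon$ is a small positive number. Near each $t_k$ we expand $u$ as $u(t)\approx(t-t_k)\,u'(t_k)$ and use (2.68c) to get
$$\delta\big((t-t_k)\,u'(t_k)\big)=\frac{1}{|u'(t_k)|}\,\delta(t-t_k),$$
so that
$$\int_{t_k-\varepsilon}^{t_k+\varepsilon}\delta\big(u(t)\big)\,\phi(t)\,dt=\int_{t_k-\varepsilon}^{t_k+\varepsilon}\left[\frac{1}{|u'(t_k)|}\,\delta(t-t_k)\right]\phi(t)\,dt=\int_{-\infty}^{\infty}\left[\frac{1}{|u'(t_k)|}\,\delta(t-t_k)\right]\phi(t)\,dt.$$
Hence,
$$\int_{-\infty}^{\infty}\delta\big(u(t)\big)\,\phi(t)\,dt=\sum_{\text{all }k}\int_{-\infty}^{\infty}\left[\frac{1}{|u'(t_k)|}\,\delta(t-t_k)\right]\phi(t)\,dt=\int_{-\infty}^{\infty}\left[\sum_{\text{all }k}\frac{1}{|u'(t_k)|}\,\delta(t-t_k)\right]\phi(t)\,dt$$
for all test functions $\phi$. This justifies Eq. (2.68f) according to the definition for the equality of
generalized functions [see Eqs. (2.47a) and (2.47b)].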
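The composition rule (2.68f) can be tested numerically by standing in for the delta function with a narrow member of the Gaussian sequence in (2.65b). Everything specific in the sketch below is an assumption chosen for illustration: $u(t)=t^{2}-1$ (zeros at $t=\pm1$, with $|u'(\pm1)|=2$), the test function $\phi(t)=e^{-t^{2}}$, and the helper names.

```python
import math


def delta_n(n, x):
    """Narrow Gaussian standing in for delta(x): sqrt(n/pi) * exp(-n x^2)."""
    return math.sqrt(n / math.pi) * math.exp(-n * x * x)


def u(t):
    """Illustrative argument function with simple zeros at t = -1 and t = +1."""
    return t * t - 1.0


def phi(t):
    """Illustrative test function."""
    return math.exp(-t * t)


def left_side(n, cells=60000, half_width=3.0):
    """Midpoint-rule integral of delta_n(u(t)) * phi(t) over [-3, 3]."""
    h = 2.0 * half_width / cells
    total = 0.0
    for j in range(cells):
        t = -half_width + (j + 0.5) * h
        total += delta_n(n, u(t)) * phi(t)
    return h * total


def right_side():
    """Sum over the zeros t_k of phi(t_k) / |u'(t_k)|, with u'(t) = 2 t."""
    return sum(phi(tk) / abs(2.0 * tk) for tk in (-1.0, 1.0))


for n in (10, 100, 1000, 10000):
    print(n, left_side(n), right_side())
```

The left-hand column approaches the right-hand value $e^{-1}$ as n grows, which is what (2.68f) predicts when each zero of $u$ contributes $\phi(t_k)/|u'(t_k)|$.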
Derivatives of the Delta Function · 2.15
5 5
which shows that now the first derivative of all the test functions must be continuous at the
origin. If we start out with a test function $\phi_{ab}(t)$ that must be identically zero for all $t<a$ and for
all $t>b$, then Eq. (2.69a) becomes
5 5
5 5
for $a<b<0$ or $0<a<b$, showing that $\delta'(t)$ equals the zero function in the sense of Eq. (2.48f)
for $t\neq0$. Equation (2.52a) can be used in conjunction with (2.53b) to evaluate $\delta'(t)$ when it is
shifted from the origin by an amount $t_0$,
5 5 5
where now we require the first derivative of the test functions to be continuous at $t=t_0$. This
result can be applied to test functions $\phi_{ab}(t)$ to get
$$\int_{-\infty}^{\infty}\delta'(t-t_0)\,\phi_{ab}(t)\,dt=\int_{-\infty}^{\infty}\delta'(t)\,\phi_{ab}(t+t_0)\,dt=-\int_{-\infty}^{\infty}\delta(t)\,\phi'_{ab}(t+t_0)\,dt=-\phi'_{ab}(t_0)=0$$
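whenever the interval $a<t<b$ does not contain $t=t_0$. Therefore,
$$\int_{-\infty}^{\infty}\delta'(t-t_0)\,\phi_{ab}(t)\,dt=0=\int_{-\infty}^{\infty}0\cdot\phi_{ab}(t)\,dt$$
whenever $a<b<t_0$ or $t_0<a<b$, showing that $\delta'(t-t_0)$ equals the zero function [in the sense of
Eq. (2.48f)] for $t\neq t_0$. Equations (2.52a) and (2.53b) can be applied any number of times to get
$\delta^{(n)}$, the nth derivative of the delta function, shifted away from the origin by an amount $t_0$. We
have
$$\int_{-\infty}^{\infty}\delta^{(n)}(t-t_0)\,\phi(t)\,dt=-\int_{-\infty}^{\infty}\delta^{(n-1)}(t)\,\phi^{(1)}(t+t_0)\,dt=\int_{-\infty}^{\infty}\delta^{(n-2)}(t)\,\phi^{(2)}(t+t_0)\,dt=\cdots,$$
which eventually becomes
$$\int_{-\infty}^{\infty}\delta^{(n)}(t-t_0)\,\phi(t)\,dt=(-1)^{n}\,\phi^{(n)}(t_0)=(-1)^{n}\left.\frac{d^{n}\phi}{dt^{n}}\right|_{t=t_0}. \tag{2.69c}$$
Again, this latest result can be applied to test functions $\phi_{ab}(t)$ to get
$$\int_{-\infty}^{\infty}\delta^{(n)}(t-t_0)\,\phi_{ab}(t)\,dt=(-1)^{n}\,\phi_{ab}^{(n)}(t_0)=0$$
whenever the interval $a<t<b$ does not contain $t=t_0$. Because
$$\int_{-\infty}^{\infty}\delta^{(n)}(t-t_0)\,\phi_{ab}(t)\,dt=0=\int_{-\infty}^{\infty}0\cdot\phi_{ab}(t)\,dt$$
whenever $t=t_0$ lies outside this interval, we end up with [using the definition of equality in
Eq. (2.48f)]
$$\delta^{(n)}(t-t_0)=0\quad\text{for } t\neq t_0. \tag{2.69d}$$
The test functions integrated with $\delta^{(n)}(t-t_0)$ must, of course, have their nth derivatives
continuous at $t=t_0$.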
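$$\Theta(t)=\begin{cases}1 & \text{for } t>0\\ 1/2 & \text{for } t=0\\ 0 & \text{for } t<0\end{cases}. \tag{2.70a}$$
If we define
$$\Theta^{(1)}(t)=\frac{d}{dt}\,\Theta(t) \tag{2.70b}$$
to be the first derivative of the $\Theta$ function, then $\Theta^{(1)}(t)=0$ for all $t\neq0$. To evaluate $\Theta^{(1)}(t)$ at
the origin, we decide to turn $\Theta(t)$ and $\Theta^{(1)}(t)$ into generalized functions that we call “$\Theta(t)$” and
“$\Theta^{(1)}(t)$” respectively. We define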
5 5
for all test functions $\phi$, which means that, according to Eqs. (2.48b) and (2.48c),
Having established the generalized function “$\Theta(t)$”, we know from Eq. (2.53b) that the
generalized function “$\Theta^{(1)}(t)$” must satisfy
5 5
5 5
5 5
Hence, for all test functions $\phi$ continuous at the origin (note that they do not have to approach
zero at $\pm\infty$), we have
5 5
5 5
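so
$$\text{“}\Theta^{(1)}(t)\text{”}=\frac{d}{dt}\,\text{“}\Theta(t)\text{”}=\delta(t) \tag{2.70e}$$
in the sense of Eq. (2.47b). There is nothing unique about the Heaviside step function. We can
also show, using the generalized function “sgn(t)” introduced in Eqs. (2.60a) and (2.60b) above,
that for any test function $\phi$
$$\int_{-\infty}^{\infty}\frac{1}{2}\,\text{“}\operatorname{sgn}^{(1)}(t)\text{”}\,\phi(t)\,dt=\int_{-\infty}^{\infty}\delta(t)\,\phi(t)\,dt, \tag{2.70f}$$
where “$\operatorname{sgn}^{(1)}(t)$” is the first derivative of “sgn(t)”. To show this is true, we do a formal
integration by parts,
$$\int_{-\infty}^{\infty}\frac{1}{2}\,\text{“}\operatorname{sgn}^{(1)}(t)\text{”}\,\phi(t)\,dt=\frac{1}{2}\,\text{“}\operatorname{sgn}(t)\text{”}\cdot\phi(t)\,\Big|_{-\infty}^{\infty}-\frac{1}{2}\int_{-\infty}^{\infty}\text{“}\operatorname{sgn}(t)\text{”}\,\phi'(t)\,dt.$$
Evaluating each term on the right-hand side gives
$$\begin{aligned}\int_{-\infty}^{\infty}\frac{1}{2}\,\text{“}\operatorname{sgn}^{(1)}(t)\text{”}\,\phi(t)\,dt&=\frac{1}{2}\Big[\lim_{t\to\infty}\phi(t)\Big]+\frac{1}{2}\Big[\lim_{t\to-\infty}\phi(t)\Big]-\frac{1}{2}\int_{0}^{\infty}\phi'(t)\,dt+\frac{1}{2}\int_{-\infty}^{0}\phi'(t)\,dt\\&=\frac{1}{2}\Big[\lim_{t\to\infty}\phi(t)\Big]+\frac{1}{2}\Big[\lim_{t\to-\infty}\phi(t)\Big]-\frac{1}{2}\Big[\lim_{t\to\infty}\phi(t)\Big]+\frac{1}{2}\,\phi(0)+\frac{1}{2}\,\phi(0)-\frac{1}{2}\Big[\lim_{t\to-\infty}\phi(t)\Big]\\&=\phi(0)=\int_{-\infty}^{\infty}\delta(t)\,\phi(t)\,dt.\end{aligned}$$
This shows Eq. (2.70f) is true. Again, we get a formula
$$\frac{1}{2}\,\text{“}\operatorname{sgn}^{(1)}(t)\text{”}=\delta(t) \tag{2.70g}$$
in the sense of Eq. (2.47b), where the only major restriction on the test functions is that they be
continuous at the origin.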
$$\int_{-\infty}^{\infty}e^{-2\pi ift}\,\frac{\sin(2\pi nt)}{\pi t}\,dt=F^{(-ift)}\!\left(\frac{\sin(2\pi nt)}{\pi t}\right)=\Pi(f,n)$$
and
$$\int_{-\infty}^{\infty}e^{2\pi ift}\,\Pi(f,n)\,df=F^{(ift)}\big(\Pi(f,n)\big)=\frac{\sin(2\pi nt)}{\pi t}$$
so that
$$\frac{\sin(2\pi nt)}{\pi t}\leftrightarrow\Pi(f,n). \tag{2.71a}$$
Although Eq. (2.71a) holds true for all real $n$, it is here used only for integer values of $n$. We
know from (2.67c) that the generalized limit as $n\to\infty$ of the left-hand side of (2.71a) is $\delta(t)$,
but what is the corresponding generalized limit of the right-hand side? We have
$$\int_{-\infty}^{\infty}\phi(f)\Big[\operatorname*{G\,lim}_{n\to\infty}\Pi(f,n)\Big]df=\lim_{n\to\infty}\int_{-\infty}^{\infty}\Pi(f,n)\,\phi(f)\,df=\lim_{n\to\infty}\int_{-n}^{n}\phi(f)\,df=\int_{-\infty}^{\infty}1\cdot\phi(f)\,df,$$
so
$$\operatorname*{G\,lim}_{n\to\infty}\Pi(f,n)=1,$$
which is no surprise. Therefore, taking the generalized limit as $n\to\infty$ of both sides of (2.71a)
23
Jack D. Gaskill, Linear Systems, Fourier Transforms, and Optics (John Wiley & Sons, New York, 1978), p. 201,
with the sinc, rect function pair corresponding to formula (2.71a) above.
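gives
$$\delta(t)\leftrightarrow1, \tag{2.71b}$$
or, restating this result,
$$\int_{-\infty}^{\infty}\delta(t)\,e^{-2\pi ift}\,dt=1 \tag{2.71c}$$
and
$$\int_{-\infty}^{\infty}e^{2\pi ift}\,df=\delta(t). \tag{2.71d}$$
Equation (2.71c) follows directly from (2.64b) because
$$e^{-2\pi if\cdot0}=1;$$
but Eq. (2.71d) is true only in the sense of Eq. (2.47b), and it is only safe to substitute freely from
(2.71d) when the substitution takes place inside an integral.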
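Because the sine is an odd function of its argument, we have according to Eq. (2.17), and
assuming the integral is a Cauchy principal value, that
$$\int_{-\infty}^{\infty}\sin(2\pi ft)\,df=0.$$
Therefore, Eq. (2.71d) becomes, using Eq. (2.19) and that the cosine is even,
$$\delta(t)=\int_{-\infty}^{\infty}\cos(2\pi ft)\,df=2\int_{0}^{\infty}\cos(2\pi ft)\,df.$$
Since the integral over the sine always disappears, we can also write
$$\delta(t)=\int_{-\infty}^{\infty}\big[\cos(2\pi ft)\pm i\sin(2\pi ft)\big]\,df=\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,df.$$
Hence,
$$2\int_{0}^{\infty}\cos(2\pi ft)\,df=\delta(t) \tag{2.71e}$$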
Fourier Transform of the Delta Function · 2.16
and
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,df=\delta(t). \tag{2.71f}$$
As was the case for Eq. (2.71d), these formulas are meant to be used inside integrals.
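$$\begin{aligned}v_1(t)&\leftrightarrow V_1^{(-)}(f)\\ v_2(t)&\leftrightarrow V_2^{(-)}(f)\\ &\;\;\vdots\\ v_n(t)&\leftrightarrow V_n^{(-)}(f)\\ &\;\;\vdots\end{aligned}\;,$$
we know from Eq. (2.59g) that the generalized functions $v_G(t)$ and $V_G^{(-)}(f)$ specified by
$$v_G(t)=\operatorname*{G\,lim}_{n\to\infty}v_n(t) \tag{2.72a}$$
and
$$V_G^{(-)}(f)=\operatorname*{G\,lim}_{n\to\infty}V_n^{(-)}(f) \tag{2.72b}$$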
We also suppose that there exists a third sequence of true functions labeled with a superscript
plus sign,
$$V_1^{(+)}(t),\;V_2^{(+)}(t),\;\ldots,\;V_n^{(+)}(t),\;\ldots,$$
such that
$$\begin{aligned}V_1^{(+)}(t)&\leftrightarrow v_1(f)\\ V_2^{(+)}(t)&\leftrightarrow v_2(f)\\ &\;\;\vdots\\ V_n^{(+)}(t)&\leftrightarrow v_n(f)\\ &\;\;\vdots\end{aligned}\;.$$
then the generalized functions $V_G^{(+)}(t)$ and $v_G(f)$ are also a Fourier transform pair,
$$V_G^{(+)}(t)\leftrightarrow v_G(f). \tag{2.72e}$$
where we have replaced $t$ by ƒ in (2.72d); and Eqs. (2.72c) and (2.72e) taken together give
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,u(t)\,v_n(t)\,dt=\int_{-\infty}^{\infty}U^{(\pm)}(f')\,V_n^{(\pm)}(f-f')\,df',$$
where
$$U^{(\pm)}(f)=\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,u(t)\,dt\quad\text{and}\quad V_n^{(\pm)}(f)=\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,v_n(t)\,dt.$$
The integral formula for $V_n^{(\pm)}(f)$ just restates the definitions given to $V_n^{(-)}$ and $V_n^{(+)}$ on the two
previous pages. Taking the limit of both sides as $n\to\infty$ gives
Fourier Convolution Theorem with Generalized Functions · 2.17
$$\lim_{n\to\infty}\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,u(t)\,v_n(t)\,dt=\lim_{n\to\infty}\int_{-\infty}^{\infty}U^{(\pm)}(f')\,V_n^{(\pm)}(f-f')\,df'$$
or, moving the limiting process inside the integral so that it becomes a generalized limit [see
discussion after Eq. (2.56a)],
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,u(t)\operatorname*{G\,lim}_{n\to\infty}v_n(t)\,dt=\int_{-\infty}^{\infty}U^{(\pm)}(f')\operatorname*{G\,lim}_{n\to\infty}V_n^{(\pm)}(f-f')\,df'.$$
From the definitions of $v_G(t)$ and $V_G^{(\pm)}(f)$ [see Eqs. (2.72a) and (2.72f)], we get
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,u(t)\,v_G(t)\,dt=\int_{-\infty}^{\infty}U^{(\pm)}(f')\,V_G^{(\pm)}(f-f')\,df',$$
which becomes
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,u(t)\,v_G(t)\,dt=U^{(\pm)}(f)*V_G^{(\pm)}(f) \tag{2.72h}$$
Consulting Eq. (2.55b) above, we note that convolution with a generalized function is
commutative, just like the convolution of two standard functions, so Eqs. (2.72h) and (2.72i) can
also be written as
$$\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,u(t)\,v_G(t)\,dt=V_G^{(\pm)}(f)*U^{(\pm)}(f) \tag{2.72j}$$
and
$$F^{(\pm ift)}\big(u(t)\cdot v_G(t)\big)=F^{(\pm ift'')}\big(v_G(t'')\big)*F^{(\pm ift')}\big(u(t')\big). \tag{2.72k}$$
This establishes the generalized-function counterpart to Eq. (2.39j) whenever $e^{\pm2\pi ift}\,u(t)$ and
$U^{(\pm)}(f)$ qualify as acceptable test functions. Since almost all well-behaved, continuous functions
are acceptable test functions when used with linear combinations of delta functions or the
derivatives of delta functions, Eqs. (2.72h) and (2.72i) are valid whenever $v_G(t)$ is a linear
combination of delta functions or the derivatives of delta functions.
Establishing the Fourier convolution theorem in the other direction is even easier. We just
write, making the variable substitution $t''=t-t'$ and remembering that the convolutions are
commutative,
$$\begin{aligned}\int_{-\infty}^{\infty}e^{\pm2\pi ift}\,[u(t)*v_G(t)]\,dt&=\int_{-\infty}^{\infty}dt\;e^{\pm2\pi ift}\int_{-\infty}^{\infty}dt'\;u(t-t')\operatorname*{G\,lim}_{n\to\infty}v_n(t')\\&=\int_{-\infty}^{\infty}dt'\operatorname*{G\,lim}_{n\to\infty}v_n(t')\int_{-\infty}^{\infty}dt\;u(t-t')\,e^{\pm2\pi ift}\\&=\lim_{n\to\infty}\int_{-\infty}^{\infty}dt'\;v_n(t')\int_{-\infty}^{\infty}dt\;u(t-t')\,e^{\pm2\pi ift}\\&=\lim_{n\to\infty}\int_{-\infty}^{\infty}dt'\;v_n(t')\,e^{\pm2\pi ift'}\int_{-\infty}^{\infty}dt''\;u(t'')\,e^{\pm2\pi ift''}\\&=\left[\int_{-\infty}^{\infty}e^{\pm2\pi ift'}\operatorname*{G\,lim}_{n\to\infty}v_n(t')\,dt'\right]\cdot\left[\int_{-\infty}^{\infty}u(t'')\,e^{\pm2\pi ift''}\,dt''\right].\end{aligned}$$
We conclude that
$$F^{(\pm ift)}\big(u(t)*v_G(t)\big)=F^{(\pm ift')}\big(u(t')\big)\cdot F^{(\pm ift'')}\big(v_G(t'')\big), \tag{2.72ℓ}$$
showing that Eq. (2.39a) holds true for the convolution of a true function and a generalized
function as well as for the convolution of two true functions.
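As a quick consistency check of (2.72ℓ), consider the special case $v_G(t)=\delta(t-t_0)$. This choice is ours, made only for illustration; it uses the sifting property (2.68b) and the definition of $U^{(\pm)}(f)$ given above, and is a worked example rather than a new result.

```latex
% Assumed special case: v_G(t) = \delta(t - t_0).
\begin{aligned}
F^{(\pm ift)}\big(\delta(t-t_0)\big)
  &= \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,\delta(t-t_0)\,dt
   = e^{\pm 2\pi i f t_0}, \\
u(t) * \delta(t-t_0)
  &= \int_{-\infty}^{\infty} u(t-t')\,\delta(t'-t_0)\,dt'
   = u(t-t_0), \\
F^{(\pm ift)}\big(u(t-t_0)\big)
  &= \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\,u(t-t_0)\,dt
   = e^{\pm 2\pi i f t_0}\int_{-\infty}^{\infty} e^{\pm 2\pi i f t''}\,u(t'')\,dt''
   = e^{\pm 2\pi i f t_0}\,U^{(\pm)}(f).
\end{aligned}
```

The last line is the product of $U^{(\pm)}(f)$ with the transform found in the first line, in agreement with (2.72ℓ): convolving with a shifted delta function shifts $u$, and the shift appears in the transform as a phase factor.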
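$$\mathrm{III}(t,T)=\frac{1}{T}\cdot\operatorname*{G\,lim}_{n\to\infty}\frac{\sin\!\left(2\pi\dfrac{t}{T}\left(n+\dfrac{1}{2}\right)\right)}{\sin\!\left(\pi\dfrac{t}{T}\right)}. \tag{2.73}$$
For any test function $\phi(t)$, we have
$$\int_{-\infty}^{\infty}\phi(t)\operatorname*{G\,lim}_{n\to\infty}\left[\frac{\sin\big(2\pi tT^{-1}(n+\tfrac12)\big)}{\sin\big(\pi tT^{-1}\big)}\right]dt=\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\left\{\frac{\sin\big(2\pi tT^{-1}(n+\tfrac12)\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt \tag{2.74a}$$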
The Shah Function · 2.18
As $n$ gets large in (2.74a), the term in braces { } oscillates ever more rapidly between +1 and −1,
causing the more slowly varying function $\phi$ to make only a negligible contribution to the
integral. The only place this might not hold true is at the isolated $t$ values
$$t=0,\;\pm T,\;\pm2T,\;\ldots. \tag{2.74b}$$
It is easy to see why these isolated values are different. Suppose $t$ differs from one of these
isolated values by only a small amount $\Delta t$ so that
$$\frac{\sin\big(2\pi(\Delta t\pm mT)T^{-1}(n+\tfrac12)\big)}{\sin\big(\pi(\Delta t\pm mT)T^{-1}\big)}=\frac{\sin\big(2\pi\Delta t\,T^{-1}(n+\tfrac12)\pm2\pi nm\pm\pi m\big)}{\sin\big(\pi\Delta t\,T^{-1}\pm\pi m\big)}=\frac{\sin\big(2\pi\Delta t\,T^{-1}(n+\tfrac12)\big)}{\sin\big(\pi\Delta t\,T^{-1}\big)}.$$
To explain the last step, we note that the sine does not change when a ±nm number of 2π's is
added to its argument, and adding a ±m number of π's to the sine's argument either leaves the
sine unchanged (if m is even) or multiplies it by −1 (if m is odd). Since the sine values in both the
numerator and denominator have the same number of π's added to their arguments, we do not
care if m is odd because the factor of −1 cancels, leaving the sine ratio unchanged. As $\Delta t$ is taken
to be ever smaller in magnitude for a fixed value of $n$, there comes a time when the arguments of
both sines are small in magnitude, allowing each sine to be approximated by its argument. We
then have
$$\frac{\sin\big(2\pi(\Delta t\pm mT)T^{-1}(n+\tfrac12)\big)}{\sin\big(\pi(\Delta t\pm mT)T^{-1}\big)}=\frac{\sin\big(2\pi\Delta t\,T^{-1}(n+\tfrac12)\big)}{\sin\big(\pi\Delta t\,T^{-1}\big)}\approx\frac{2\pi\Delta t\,T^{-1}(n+\tfrac12)}{\pi\Delta t\,T^{-1}}=2\left(n+\frac12\right).$$
Consequently, the peak values of the term in braces get ever larger at the isolated points in
(2.74b) as $n$ increases, as shown in Figs. 2.9(a)–2.9(c). We see that the triangular peaks at the
isolated points in (2.74b) have widths equal to $T/(n+\tfrac12)$. As $n$ gets ever larger, the term in
braces oscillates so rapidly between +1 and −1 compared to the test function $\phi$ that there is no
contribution made to the integral on the right-hand side of (2.74a) except at the isolated $t$ values
shown in Figs. 2.9(a)–2.9(c). At these $t$ values, we have
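$$\begin{aligned}\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\left\{\frac{\sin\big(2\pi tT^{-1}(n+\tfrac12)\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt&=\cdots+\phi(-T)\{\text{area of triangular peak}\}+\phi(0)\{\text{area of triangular peak}\}\\&\quad+\phi(T)\{\text{area of triangular peak}\}+\cdots\\&=\frac12\cdot\frac{T}{n+\tfrac12}\cdot2\left(n+\tfrac12\right)\big\{\cdots+\phi(-T)+\phi(0)+\phi(T)+\cdots\big\},\end{aligned}$$
which simplifies to
$$\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\left\{\frac{\sin\big(2\pi tT^{-1}(n+\tfrac12)\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt=T\sum_{k=-\infty}^{\infty}\phi(kT). \tag{2.75a}$$
But $T\sum_{k=-\infty}^{\infty}\phi(kT)$ can be regarded as what we get when evaluating the integral
$$\int_{-\infty}^{\infty}\phi(t)\left[T\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt=T\sum_{k=-\infty}^{\infty}\int_{-\infty}^{\infty}\delta(t-kT)\,\phi(t)\,dt=T\sum_{k=-\infty}^{\infty}\phi(kT).$$
Therefore,
$$\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\left\{\frac{\sin\big(2\pi tT^{-1}(n+\tfrac12)\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt=\int_{-\infty}^{\infty}\phi(t)\left[T\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt \tag{2.75b}$$
or, using (2.56a) to take the limit inside the integral as a generalized limit,
$$\int_{-\infty}^{\infty}\phi(t)\operatorname*{G\,lim}_{n\to\infty}\left\{\frac{\sin\big(2\pi tT^{-1}(n+\tfrac12)\big)}{\sin\big(\pi tT^{-1}\big)}\right\}dt=\int_{-\infty}^{\infty}\phi(t)\left[T\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt.$$
Since this last result is true for any test function $\phi$, we conclude that
$$\operatorname*{G\,lim}_{n\to\infty}\left\{\frac{\sin\big(2\pi tT^{-1}(n+\tfrac12)\big)}{\sin\big(\pi tT^{-1}\big)}\right\}=T\sum_{k=-\infty}^{\infty}\delta(t-kT) \tag{2.75c}$$
in the sense of Eq. (2.47b). Comparison of this result to the definition of the shah function in Eq.
(2.73) above shows that
$$\mathrm{III}(t,T)=\sum_{k=-\infty}^{\infty}\delta(t-kT). \tag{2.75d}$$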
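The limit in (2.75a) can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $T=2$ and the Gaussian test function $\phi(t)=e^{-t^{2}}$ (both our choices), integrates the sine ratio of (2.74a) with a midpoint rule, and compares with $T\sum_k\phi(kT)$; all function names are hypothetical.

```python
import math


def sine_ratio(t, T, n):
    """sin(2 pi (t/T)(n + 1/2)) / sin(pi t/T), the term in braces in Eq. (2.74a)."""
    den = math.sin(math.pi * t / T)
    if abs(den) < 1e-9:
        # Limiting value 2(n + 1/2) at the isolated points t = 0, +/-T, ...
        return 2.0 * n + 1.0
    return math.sin(2.0 * math.pi * (t / T) * (n + 0.5)) / den


def phi(t):
    """Illustrative test function (our choice): a Gaussian."""
    return math.exp(-t * t)


def left_side(n, T=2.0, cells=32000, half_width=8.0):
    """Midpoint-rule integral of phi(t) * sine_ratio(t, T, n) over [-8, 8]."""
    h = 2.0 * half_width / cells
    total = 0.0
    for j in range(cells):
        t = -half_width + (j + 0.5) * h
        total += phi(t) * sine_ratio(t, T, n)
    return h * total


def right_side(T=2.0, kmax=20):
    """T * sum of phi(k T) over k = -kmax..kmax, as on the right of Eq. (2.75a)."""
    return T * sum(phi(k * T) for k in range(-kmax, kmax + 1))


for n in (0, 1, 2, 3, 6):
    print(n, left_side(n), right_side())
```

For this smooth test function the left-hand side settles onto the sampled sum after only a few values of n, consistent with the claim that each peak contributes its area $T$ times the value of $\phi$ at the peak.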
$$\operatorname*{G\,lim}_{n\to\infty}\left\{\frac{\sin\big(2\pi fT^{-1}(n+\tfrac12)\big)}{\sin\big(\pi fT^{-1}\big)}\right\}=T\sum_{k=-\infty}^{\infty}\delta(f-kT).$$
where
$$g_n(t)=\frac{\sin\big(2\pi(n+1)t\big)}{\pi t}. \tag{2.76b}$$
FIGURES 2.9(a)–2.9(c). The formula for the $t$ interval between the arrows is $T/(n+1/2)$ in all three plots. Figures 2.9(a), 2.9(b),
and 2.9(c) show how the base width of the central lobe becomes ever narrower as $n$ increases.
Fourier Transform of the Shah Function · 2.19
Since adding one to $n$ does not make any difference in the limit, we end up with
$$\operatorname*{G\,lim}_{n\to\infty}g_n(t)=\delta(t); \tag{2.76c}$$
To find the generalized function that is the forward Fourier transform of the generalized limit of
$G_n$ as $n\to\infty$, we must evaluate the forward Fourier transform of $G_n$ for finite $n$,
$$\begin{aligned}F^{(-ift)}\big(G_n(t)\big)&=\int_{-\infty}^{\infty}e^{-2\pi ift}\,G_n(t)\,dt=\sum_{k=-n}^{n}\int_{-\infty}^{\infty}e^{-2\pi ift}\,g_n(t-kT)\,dt\\&=\sum_{k=-n}^{n}e^{-2\pi ifkT}\int_{-\infty}^{\infty}e^{-2\pi ift'}\,g_n(t')\,dt',\end{aligned}$$
where in the last step the variable of integration has been changed to $t'=t-kT$. The Fourier
transform inside the sum can be done using (2.76b) and (2.76d) to get
$$F^{(-ift)}\big(G_n(t)\big)=\Pi(f,n+1)\cdot\sum_{k=-n}^{n}e^{-2\pi ifkT}. \tag{2.77a}$$
The sum $\sum_{k=-n}^{n}e^{-2\pi ifkT}$ is just a disguised form of geometric series. We can write
$$\sum_{k=-n}^{n}e^{-2\pi ifkT}=\sum_{k=-n}^{n}w^{k}, \tag{2.77b}$$
where
$$w=e^{-2\pi ifT}$$
and define
$$S_n=\sum_{k=-n}^{n}w^{k}=\sum_{k=-n}^{n}e^{-2\pi ifkT}.$$
Using the standard approach for calculating the sum of a geometric series, we note that
multiplying every term in the sum by $w$ increases each power of $w$ in the sum by one. This is the
same as adding $w^{n+1}$ and subtracting $w^{-n}$ from the original sum, giving
$$wS_n=\sum_{k=-n+1}^{n+1}w^{k}=S_n+w^{n+1}-w^{-n}$$
or
$$S_n=\frac{w^{n+1}-w^{-n}}{w-1}.$$
Substituting $w=e^{-2\pi ifT}$ gives
$$\sum_{k=-n}^{n}e^{-2\pi ifkT}=\frac{e^{-2\pi ifT(n+1)}-e^{-2\pi ifT(-n)}}{e^{-2\pi ifT}-1}=\frac{e^{-2\pi ifT\left(n+\frac12\right)}-e^{2\pi ifT\left(n+\frac12\right)}}{e^{-\pi ifT}-e^{\pi ifT}}=\frac{\sin\big(2\pi fT(n+\tfrac12)\big)}{\sin(\pi fT)}, \tag{2.77c}$$
so that (2.77a) becomes
$$F^{(-ift)}\big(G_n(t)\big)=\frac{\sin\big(2\pi fT(n+\tfrac12)\big)}{\sin(\pi fT)}\,\Pi(f,n+1). \tag{2.77d}$$
The inverse Fourier transform of the forward Fourier transform returns the original function [see
Eqs. (2.29b) and (2.29d)], so this last result lets us write
$$G_n(t)\leftrightarrow\frac{\sin\big(2\pi fT(n+\tfrac12)\big)}{\sin(\pi fT)}\,\Pi(f,n+1). \tag{2.77e}$$
From the definition of the Fourier transform of a generalized function [see (2.59g)], we know that
taking the generalized limit of both sides of (2.77e) gives a Fourier transform relationship
between two generalized functions—all that needs to be done now is to find out what these
generalized functions are.
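To find the generalized function that is the generalized limit of $G_n$ as $n\to\infty$, we write for any test function $\phi$, using Eq. (2.76a), that
\[
\int_{-\infty}^{\infty}\phi(t)\left[\mathop{\mathrm{G\,lim}}\limits_{n\to\infty} G_n(t)\right]dt
= \lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\,G_n(t)\,dt
= \lim_{n\to\infty}\sum_{k=-n}^{n}\int_{-\infty}^{\infty}\phi(t)\,g_n(t-kT)\,dt . \tag{2.77f}
\]
Equation (2.76c) states that the generalized limit of $g_n$ is the delta function, so
\[
\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(t)\,g_n(t-kT)\,dt
= \int_{-\infty}^{\infty}\phi(t)\,\mathop{\mathrm{G\,lim}}\limits_{n\to\infty} g_n(t-kT)\,dt
= \int_{-\infty}^{\infty}\phi(t)\,\delta(t-kT)\,dt = \phi(kT) ,
\]
and (2.77f) becomes
\[
\int_{-\infty}^{\infty}\phi(t)\left[\mathop{\mathrm{G\,lim}}\limits_{n\to\infty} G_n(t)\right]dt = \sum_{k=-\infty}^{\infty}\phi(kT) . \tag{2.77g}
\]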
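But, just as in the discussion following Eq. (2.75a) above, we can regard $\sum_{k=-\infty}^{\infty}\phi(kT)$ as the integral of the product of the shah function
\[
\mathrm{II}(t, T) = \sum_{k=-\infty}^{\infty}\delta(t-kT)
\]
with any test function $\phi$, since
\[
\int_{-\infty}^{\infty}\mathrm{II}(t, T)\,\phi(t)\,dt
= \int_{-\infty}^{\infty}\left[\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]\phi(t)\,dt
= \sum_{k=-\infty}^{\infty}\phi(kT) .
\]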
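Equation (2.77g) can therefore be written as
\[
\int_{-\infty}^{\infty}\phi(t)\left[\mathop{\mathrm{G\,lim}}\limits_{n\to\infty} G_n(t)\right]dt
= \int_{-\infty}^{\infty}\left[\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]\phi(t)\,dt \tag{2.77h}
\]
for any test function $\phi$, which means that
\[
\mathop{\mathrm{G\,lim}}\limits_{n\to\infty} G_n(t) = \sum_{k=-\infty}^{\infty}\delta(t-kT) = \mathrm{II}(t, T) . \tag{2.77i}
\]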
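Turning to the right-hand side of (2.77e), we write for any test function $\phi(f)$
\[
\begin{aligned}
\int_{-\infty}^{\infty}\phi(f)\left\{\mathop{\mathrm{G\,lim}}\limits_{n\to\infty}\left[\frac{\sin\!\left(2\pi fT\left[n+\tfrac{1}{2}\right]\right)}{\sin(\pi fT)}\,\Pi(f, n+1)\right]\right\}df
&= \lim_{n\to\infty}\int_{-(n+1)}^{n+1}\phi(f)\left[\frac{\sin\!\left(2\pi fT\left[n+\tfrac{1}{2}\right]\right)}{\sin(\pi fT)}\right]df \\
&= \lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(f)\left[\frac{\sin\!\left(2\pi fT\left[n+\tfrac{1}{2}\right]\right)}{\sin(\pi fT)}\right]df ,
\end{aligned} \tag{2.78a}
\]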
where in the last step we recognize that the behavior of the sine ratio inside the square brackets [ ] is not affected by the endpoints for the region of integration as $n\to\infty$. Equations (2.56a) and (2.75e) show that
\[
\lim_{n\to\infty}\int_{-\infty}^{\infty}\phi(f)\left\{\frac{\sin\!\left(2\pi fT\left[n+(1/2)\right]\right)}{\sin(\pi fT)}\right\}df
= \int_{-\infty}^{\infty}\phi(f)\left[T^{-1}\sum_{k=-\infty}^{\infty}\delta(f-kT^{-1})\right]df ,
\]
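so that (2.78a) becomes
\[
\int_{-\infty}^{\infty}\phi(f)\left\{\mathop{\mathrm{G\,lim}}\limits_{n\to\infty}\left[\frac{\sin\!\left(2\pi fT\left[n+\tfrac{1}{2}\right]\right)}{\sin(\pi fT)}\,\Pi(f, n+1)\right]\right\}df
= \int_{-\infty}^{\infty}\phi(f)\left[T^{-1}\sum_{k=-\infty}^{\infty}\delta(f-kT^{-1})\right]df
\]
for any test function $\phi(f)$. Therefore,
\[
\mathop{\mathrm{G\,lim}}\limits_{n\to\infty}\left[\frac{\sin\!\left(2\pi fT\left[n+\tfrac{1}{2}\right]\right)}{\sin(\pi fT)}\,\Pi(f, n+1)\right]
= \frac{1}{T}\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{T}\right) \tag{2.78b}
\]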
in the sense of Eq. (2.47b). Since the right-hand side of (2.78b) is, according to (2.75d), proportional to the shah function, we end up with
\[
\frac{1}{T}\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{T}\right) = \frac{1}{T}\,\mathrm{II}(f, T^{-1}) . \tag{2.78c}
\]
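Equations (2.78b) and (2.77i) let us take the generalized limits as $n\to\infty$ of both sides of (2.77e) to get
\[
\sum_{k=-\infty}^{\infty}\delta(t-kT) \leftrightarrow \frac{1}{T}\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{T}\right) \tag{2.78d}
\]
or
\[
\mathrm{II}(t, T) \leftrightarrow \frac{1}{T}\,\mathrm{II}(f, T^{-1}) . \tag{2.78e}
\]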
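Written out as forward and inverse Fourier integrals, (2.78d) states that
\[
\int_{-\infty}^{\infty} e^{-2\pi i f t}\left[\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt = \frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right) \tag{2.79a}
\]
and
\[
\int_{-\infty}^{\infty} e^{2\pi i f t}\left[\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df = \sum_{k=-\infty}^{\infty}\delta(t-kT) . \tag{2.79b}
\]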
The discussion following Eq. (2.52c) above shows that linear transformations of the variables of integration are allowed when using generalized functions, so we can change to $t' = -t$ in Eqs. (2.79a) and (2.79b) to get
\[
\int_{-\infty}^{\infty} e^{2\pi i f t'}\left[\sum_{k=-\infty}^{\infty}\delta(-t'-kT)\right]dt' = \frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)
\]
and
\[
\int_{-\infty}^{\infty} e^{-2\pi i f t'}\left[\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df = \sum_{k=-\infty}^{\infty}\delta(-t'-kT) .
\]
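The sum over index k goes over all positive and negative integers, so we can change the sum's index to $k' = -k$ and use that the delta function is even [see Eq. (2.68a)] to get
\[
\int_{-\infty}^{\infty} e^{2\pi i f t'}\left[\sum_{k'=-\infty}^{\infty}\delta(t'-k'T)\right]dt' = \frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)
\]
and
\[
\int_{-\infty}^{\infty} e^{-2\pi i f t'}\left[\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df = \sum_{k'=-\infty}^{\infty}\delta(t'-k'T) .
\]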
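Dropping the primes and combining these results with Eqs. (2.79a) and (2.79b) produces the more general formulas
\[
\int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\left[\sum_{k=-\infty}^{\infty}\delta(t-kT)\right]dt = \frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right) \tag{2.79c}
\]
and
\[
\int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\left[\frac{1}{T}\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df = \sum_{k=-\infty}^{\infty}\delta(t-kT) . \tag{2.79d}
\]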
In fact, we can easily show that Eqs. (2.79c) and (2.79d) are really the same formula. First, we interchange the j, k indices and the ƒ, t variables in Eq. (2.79c) so that it becomes
\[
\int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\left[\sum_{j=-\infty}^{\infty}\delta(f-jT)\right]df = \frac{1}{T}\sum_{k=-\infty}^{\infty}\delta\!\left(t-\frac{k}{T}\right) .
\]
Parameter T is arbitrary, so (just like in the analysis following Eq. (2.75d) above) it can be replaced everywhere by $T^{-1}$ to get
\[
\int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\left[\sum_{j=-\infty}^{\infty}\delta\!\left(f-\frac{j}{T}\right)\right]df = T\sum_{k=-\infty}^{\infty}\delta(t-kT) .
\]
After dividing through by T, we see that this last result is the same as Eq. (2.79d), showing that Eqs. (2.79c) and (2.79d) are really the same formula.
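Let u(t) be a function whose forward Fourier transform is
\[
U(f) = F^{(-ift)}\!\left(u(t)\right) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt , \tag{2.80a}
\]
so that
\[
u(t) \leftrightarrow U(f) . \tag{2.80b}
\]
From u(t), we create a new function $u^{[\infty]}(t, T)$ that repeats forever along the t axis at intervals of T,
\[
u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} u(t-kT) . \tag{2.81a}
\]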
Although perhaps redundant, it turns out that listing T as one of the arguments of $u^{[\infty]}$ is a convenient way to keep track of the connection between u and $u^{[\infty]}$. Function $u^{[\infty]}$ is called a periodic function of period T because, for any finite positive or negative integer m,
\[
u^{[\infty]}(t+mT, T) = u^{[\infty]}(t, T) . \tag{2.81b}
\]
Figures 2.10(a) and 2.10(b) show the plots for both u and $u^{[\infty]}$ as functions of t. Since function u is left unspecified, $u^{[\infty]}$ can be thought of as representing an arbitrary periodic function. We can also define the finite sum
\[
u^{[N]}(t, T) = \sum_{k=-N}^{N} u(t-kT) . \tag{2.81c}
\]
Clearly,
\[
\lim_{N\to\infty} u^{[N]}(t, T) = u^{[\infty]}(t, T) . \tag{2.81d}
\]
We assume that $u^{[N]}$ is well behaved with respect to the test functions $\phi$, so that
\[
\lim_{N\to\infty}\int_{-\infty}^{\infty}\phi(t)\,u^{[N]}(t, T)\,dt = \int_{-\infty}^{\infty}\phi(t)\,u^{[\infty]}(t, T)\,dt . \tag{2.81e}
\]
24
The analysis in Secs. 2.20 and 2.21 is adapted from A. Papoulis, Signal Analysis (McGraw-Hill Book Company, New York, 1977), pp. 76–81.
FIGURE 2.10(a). Plot of $u(t)$.
FIGURE 2.10(b). Plot of $u^{[\infty]}(t, T)$, with the period T marked.
Figure 2.10(a) is a plot of $u(t)$. The solid curve in Fig. 2.10(b), shifted upward from its true position, is $u^{[\infty]}(t, T)$ and the dashed curves represent $u(t)$ displaced by multiples of T.
Fourier Series · 2.20
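From (2.81e) and the definition of the generalized limit [see Eq. (2.56a)], we then know that
\[
\lim_{N\to\infty}\int_{-\infty}^{\infty}\phi(t)\,u^{[N]}(t, T)\,dt
= \int_{-\infty}^{\infty}\phi(t)\left[\mathop{\mathrm{G\,lim}}\limits_{N\to\infty} u^{[N]}(t, T)\right]dt
= \int_{-\infty}^{\infty}\phi(t)\,u^{[\infty]}(t, T)\,dt ,
\]
so that
\[
\mathop{\mathrm{G\,lim}}\limits_{N\to\infty} u^{[N]}(t, T) = u^{[\infty]}(t, T) . \tag{2.81f}
\]
We also define
\[
\delta^{[N]}(t, T) = \sum_{k=-N}^{N}\delta(t-kT) \tag{2.82a}
\]
and
\[
\delta^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty}\delta(t-kT) . \tag{2.82b}
\]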
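Function $\delta^{[\infty]}(t, T)$ is clearly just another way of writing the shah function $\mathrm{II}(t, T)$. [The shah function is defined in Eq. (2.73) and shown equal to $\sum_{k=-\infty}^{\infty}\delta(t-kT)$ in Eq. (2.75d).] The convolution of u(t) with
\[
\delta^{[N]}(t, T) = \sum_{k=-N}^{N}\delta(t-kT)
\]
is
\[
\begin{aligned}
u(t) * \delta^{[N]}(t, T) &= \int_{-\infty}^{\infty} u(t')\,\delta^{[N]}(t-t', T)\,dt'
= \sum_{k=-N}^{N}\int_{-\infty}^{\infty} u(t')\,\delta\!\left(t'-(t-kT)\right)dt' \\
&= \sum_{k=-N}^{N} u(t-kT) ,
\end{aligned}
\]
where the next-to-last step uses $\delta(-x) = \delta(x)$ as shown in Eq. (2.68a). The definition of $u^{[N]}$ in (2.81c) then gives
\[
u^{[N]}(t, T) = u(t) * \delta^{[N]}(t, T) . \tag{2.82c}
\]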
Taking the integral Fourier transform of both sides, using the Fourier convolution theorem [see
Eq. (2.72 A )], and remembering that U(ƒ) is the forward Fourier transform of u(t), we get
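\[
\begin{aligned}
F^{(-ift)}\!\left(u^{[N]}(t, T)\right) &= F^{(-ift)}\!\left(u(t)\right)\cdot F^{(-ift)}\!\left(\delta^{[N]}(t, T)\right) \\
&= U(f)\cdot\sum_{k=-N}^{N}\int_{-\infty}^{\infty} e^{-2\pi i f t}\,\delta(t-kT)\,dt \\
&= U(f)\sum_{k=-N}^{N} e^{-2\pi i k f T} \\
&= U(f)\,\frac{\sin\!\left(2\pi fT\left(N+\tfrac{1}{2}\right)\right)}{\sin(\pi fT)} ,
\end{aligned} \tag{2.83a}
\]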
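where in the last step we substitute from Eq. (2.77c) above. Having now found that
\[
F^{(-ift)}\!\left(u^{[N]}(t, T)\right) = U(f)\,\frac{\sin\!\left(2\pi fT\left(N+\tfrac{1}{2}\right)\right)}{\sin(\pi fT)} ,
\]
we can take the inverse Fourier transform of both sides to get
\[
u^{[N]}(t, T) = \int_{-\infty}^{\infty} e^{2\pi i f t}\,U(f)\,\frac{\sin\!\left(2\pi fT\left(N+\tfrac{1}{2}\right)\right)}{\sin(\pi fT)}\,df . \tag{2.83b}
\]
Taking the limit of both sides as $N\to\infty$ and using (2.81d) gives
\[
u^{[\infty]}(t, T) = \lim_{N\to\infty}\int_{-\infty}^{\infty} e^{2\pi i f t}\,U(f)\,\frac{\sin\!\left(2\pi fT\left(N+\tfrac{1}{2}\right)\right)}{\sin(\pi fT)}\,df . \tag{2.83c}
\]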
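Following the reasoning that led to (2.78b), we move the limit inside the integral as a generalized limit to get
\[
\begin{aligned}
u^{[\infty]}(t, T) &= \int_{-\infty}^{\infty} e^{2\pi i f t}\,U(f)\,\mathop{\mathrm{G\,lim}}\limits_{N\to\infty}\left[\frac{\sin\!\left(2\pi fT\left(N+\tfrac{1}{2}\right)\right)}{\sin(\pi fT)}\right]df \\
&= \int_{-\infty}^{\infty} e^{2\pi i f t}\,U(f)\,\frac{1}{T}\left[\sum_{k=-\infty}^{\infty}\delta\!\left(f-\frac{k}{T}\right)\right]df
\end{aligned}
\]
or
\[
u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty}\left[T^{-1}\,U(k/T)\right]e^{2\pi i\frac{kt}{T}} . \tag{2.83d}
\]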
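Equation (2.83d) can be tested on a function whose transform is known in closed form. The sketch below is an illustration only: it assumes NumPy, picks the Gaussian $u(t) = e^{-\pi t^2}$ (whose forward transform is $U(f) = e^{-\pi f^2}$) as the example, and truncates both infinite sums at a hypothetical cutoff `kmax`, which is harmless here because the Gaussian terms decay so quickly.

```python
import numpy as np

def u(t):
    """Example function (an assumption): the Gaussian exp(-pi t^2)."""
    return np.exp(-np.pi * t**2)

def U(f):
    """Its forward Fourier transform, exp(-pi f^2)."""
    return np.exp(-np.pi * f**2)

def periodized(t, T, kmax=50):
    """Left side of (2.83d): sum_k u(t - k T), truncated at |k| <= kmax."""
    k = np.arange(-kmax, kmax + 1)
    return np.sum(u(t - k * T))

def fourier_series(t, T, kmax=50):
    """Right side of (2.83d): sum_k [U(k/T)/T] exp(2 pi i k t / T), truncated."""
    k = np.arange(-kmax, kmax + 1)
    return np.sum(U(k / T) / T * np.exp(2j * np.pi * k * t / T))

T = 1.5                                   # period (illustrative value)
sample_times = np.linspace(-T, T, 9)
max_error = max(abs(periodized(t, T) - fourier_series(t, T))
                for t in sample_times)
print(f"largest difference = {max_error:.2e}")
```

Both sides agree to rounding error at every test point, including points outside one period, as expected for a periodic function.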
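Equation (2.83d) specifies the Fourier series for an arbitrary periodic function $u^{[\infty]}$, showing that $u^{[\infty]}$ can be written as the infinite sum of complex exponentials multiplied by the complex constants $[T^{-1}U(k/T)]$. To get these complex constants directly from $u^{[\infty]}$, we note that for any real number $\tau$ and integer m,
\[
\begin{aligned}
\frac{1}{T}\,U\!\left(\frac{m}{T}\right) &= \frac{1}{T}\int_{-\infty}^{\infty} u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt
= \lim_{N\to\infty}\left\{\frac{1}{T}\int_{\tau-NT}^{\tau+(N+1)T} u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt\right\} \\
&= \lim_{N\to\infty}\frac{1}{T}\Bigg\{\int_{\tau-NT}^{\tau-(N-1)T} u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt + \int_{\tau-(N-1)T}^{\tau-(N-2)T} u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt + \cdots \\
&\qquad + \int_{\tau-T}^{\tau} u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt + \int_{\tau}^{\tau+T} u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt \\
&\qquad + \int_{\tau+T}^{\tau+2T} u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt + \cdots + \int_{\tau+NT}^{\tau+(N+1)T} u(t)\,e^{-2\pi i\frac{m}{T}t}\,dt\Bigg\} .
\end{aligned}
\]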
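This can be simplified to
\[
\frac{1}{T}\,U\!\left(\frac{m}{T}\right) = \lim_{N\to\infty}\frac{1}{T}\sum_{k=-N}^{N}\int_{\tau+kT}^{\tau+(k+1)T} e^{-2\pi i\frac{m}{T}t}\,u(t)\,dt . \tag{2.83e}
\]
Changing the variable of integration to $t' = t - kT$ in each integral gives
\[
\int_{\tau+kT}^{\tau+(k+1)T} e^{-2\pi i\frac{m}{T}t}\,u(t)\,dt
= e^{-2\pi i mk}\int_{\tau}^{\tau+T} e^{-2\pi i\frac{m}{T}t'}\,u(t'+kT)\,dt'
= \int_{\tau}^{\tau+T} e^{-2\pi i\frac{m}{T}t'}\,u(t'+kT)\,dt' ,
\]
where we use that $e^{-2\pi i mk} = 1$. Substituting this into (2.83e) gives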
\[
\frac{1}{T}\,U\!\left(\frac{m}{T}\right)
= \lim_{N\to\infty}\frac{1}{T}\sum_{k=-N}^{N}\int_{\tau}^{\tau+T} e^{-2\pi i\frac{m}{T}t'}\,u(t'+kT)\,dt'
= \lim_{N\to\infty}\frac{1}{T}\int_{\tau}^{\tau+T} e^{-2\pi i\frac{m}{T}t'}\left[\sum_{k'=-N}^{N} u(t'-k'T)\right]dt' ,
\]
where in the last step we have replaced index k by index $k' = -k$. Now, taking the limit inside the integral to get the generalized limit [see Eq. (2.56a) above], we rely on (2.81f) to get
\[
\frac{1}{T}\,U\!\left(\frac{m}{T}\right)
= \frac{1}{T}\int_{\tau}^{\tau+T} e^{-2\pi i\frac{m}{T}t'}\,\mathop{\mathrm{G\,lim}}\limits_{N\to\infty}\left[\sum_{k'=-N}^{N} u(t'-k'T)\right]dt'
= \frac{1}{T}\int_{\tau}^{\tau+T} e^{-2\pi i\frac{m}{T}t'}\,u^{[\infty]}(t', T)\,dt' . \tag{2.83f}
\]
Equations (2.83d) and (2.83f) let us put the Fourier series into its standard form. For any periodic function
\[
v(t) = u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} u(t-kT)
\]
of period T, we have found that
\[
v(t) = \sum_{k=-\infty}^{\infty} A_k\,e^{2\pi i k\frac{t}{T}} , \tag{2.84a}
\]
where
\[
A_k = \frac{1}{T}\int_{\tau}^{\tau+T} e^{-2\pi i\frac{k}{T}t}\,v(t)\,dt \tag{2.84b}
\]
for any finite value of $\tau$. Because we did not require u(t) to be real in (2.80a), Eqs. (2.83d), (2.83f), (2.84a), and (2.84b) still hold true for complex periodic functions with real arguments t. It is customary, but of course not mandatory, to choose $\tau = 0$ or $\tau = -T/2$ in (2.84b).
Using $v(t) = u^{[\infty]}(t, T)$, we know from Eqs. (2.83d), (2.83f), (2.84a), and (2.84b) that the $A_k$ coefficients can be specified in terms of the forward Fourier transform U(ƒ) of u(t),
\[
A_k = \frac{1}{T}\,U\!\left(\frac{k}{T}\right) . \tag{2.85a}
\]
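The claim that the integral in (2.84b) gives the same coefficient as (2.85a) for any starting point of the one-period integration interval can be checked numerically. This is a sketch under stated assumptions: NumPy, the Gaussian example $u(t) = e^{-\pi t^2}$ with $U(f) = e^{-\pi f^2}$, a truncated periodic sum, and a plain rectangle rule (which is very accurate for smooth periodic integrands). The names `v`, `A_coefficient`, and `tau` are illustrative; `tau` plays the role of the lower integration limit.

```python
import numpy as np

def u(t):
    return np.exp(-np.pi * t**2)      # example function (assumption)

def U(f):
    return np.exp(-np.pi * f**2)      # its forward Fourier transform

def v(t, T, kmax=50):
    """Periodic function v(t) = sum_k u(t - k T), truncated at |k| <= kmax."""
    k = np.arange(-kmax, kmax + 1)
    return np.sum(u(t[:, None] - k[None, :] * T), axis=1)

def A_coefficient(k, T, tau, samples=256):
    """Eq. (2.84b) by the rectangle rule over one period starting at tau."""
    t = tau + T * np.arange(samples) / samples
    integrand = np.exp(-2j * np.pi * k * t / T) * v(t, T)
    return np.sum(integrand) / samples   # (1/T) * sum * (T / samples)

T = 1.5
for tau in (0.0, -T / 2, 0.37):
    for k in (0, 1, -2, 3):
        difference = abs(A_coefficient(k, T, tau) - U(k / T) / T)
        print(f"tau={tau:+.2f}  k={k:+d}  |A_k - U(k/T)/T| = {difference:.1e}")
```

Every printed difference is at rounding level, independent of the chosen starting point, which is the content of the remark that the choice of lower limit is a matter of convenience.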
When u is real (which means that $v(t) = u^{[\infty]}(t, T)$ is also real) we know from entry 7 of Table 2.1 (located at the end of this chapter) that U(ƒ) must be Hermitian so that
\[
U(-f) = U(f)^{*} .
\]
Hence, when v(t) is real in (2.84a), it then follows from (2.85a) that
\[
A_{-k} = A_k^{*} \tag{2.85b}
\]
in (2.84b). This procedure can be extended to all the entries in Table 2.1, giving us the entries in Table 2.2 (also located at the end of this chapter). To go through another example, if u is imaginary and odd, we know from entry 3 of Table 2.1 that U is real and odd, so
\[
U(-f) = -U(f) \quad\text{and}\quad \mathrm{Im}\,U(f) = 0 .
\]
From (2.85a), it follows that
\[
A_{-k} = -A_k \quad\text{and}\quad \mathrm{Im}\,A_k = 0 . \tag{2.85c}
\]
We can show that $v(t) = u^{[\infty]}(t, T)$ is imaginary and odd when u is imaginary and odd (let $k' = -k$),
\[
\begin{aligned}
v(-t) = u^{[\infty]}(-t, T) &= \sum_{k=-\infty}^{\infty} u(-t-kT) = \sum_{k'=-\infty}^{\infty} u(-t+k'T) = -\sum_{k'=-\infty}^{\infty} u(t-k'T) \\
&= -u^{[\infty]}(t, T) = -v(t) ,
\end{aligned}
\]
and
\[
\mathrm{Re}\,v(t) = \sum_{k=-\infty}^{\infty}\mathrm{Re}\,u(t-kT) = 0 .
\]
This shows that we end up with (2.85c) associated with v(t) being imaginary and odd, as stated in entry 3 of Table 2.2.
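A final point worth mentioning about Fourier series is that the $A_k$ coefficients are often reshuffled so that the series can be written as a sum of sines and cosines. Equation (2.84a) can be rewritten as, using $e^{i\theta} = \cos\theta + i\sin\theta$,
\[
\begin{aligned}
v(t) &= A_0 + \sum_{k=1}^{\infty}\left[A_{-|k|}\,e^{-2\pi i|k|\frac{t}{T}} + A_{|k|}\,e^{2\pi i|k|\frac{t}{T}}\right] \\
&= A_0 + \sum_{k=1}^{\infty}\left[A_{-|k|} + A_{|k|}\right]\cos\!\left(\frac{2\pi|k|t}{T}\right) + \sum_{k=1}^{\infty} i\left[A_{|k|} - A_{-|k|}\right]\sin\!\left(\frac{2\pi|k|t}{T}\right) .
\end{aligned} \tag{2.86a}
\]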
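From (2.84b),
\[
A_{-|k|} + A_{|k|} = \frac{1}{T}\int_{\tau}^{\tau+T} v(t)\left[e^{2\pi i|k|\frac{t}{T}} + e^{-2\pi i|k|\frac{t}{T}}\right]dt = \frac{2}{T}\int_{\tau}^{\tau+T} v(t)\cos\!\left(\frac{2\pi|k|t}{T}\right)dt , \tag{2.86c}
\]
and
\[
i\left[A_{|k|} - A_{-|k|}\right] = \frac{i}{T}\int_{\tau}^{\tau+T} v(t)\left[e^{-2\pi i|k|\frac{t}{T}} - e^{2\pi i|k|\frac{t}{T}}\right]dt = \frac{2}{T}\int_{\tau}^{\tau+T} v(t)\sin\!\left(\frac{2\pi|k|t}{T}\right)dt . \tag{2.86d}
\]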
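These results let us write (2.86a) as
\[
v(t) = \frac{c_0}{2} + \sum_{k=1}^{\infty} c_k\cos\!\left(\frac{2\pi kt}{T}\right) + \sum_{k=1}^{\infty} s_k\sin\!\left(\frac{2\pi kt}{T}\right) , \tag{2.87a}
\]
where
\[
c_k = \frac{2}{T}\int_{\tau}^{\tau+T} v(t)\cos\!\left(\frac{2\pi kt}{T}\right)dt \quad\text{for } k = 0, 1, 2, \ldots \tag{2.87b}
\]
and
\[
s_k = \frac{2}{T}\int_{\tau}^{\tau+T} v(t)\sin\!\left(\frac{2\pi kt}{T}\right)dt \quad\text{for } k = 1, 2, 3, \ldots . \tag{2.87c}
\]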
The absolute value signs are dropped from index k because it is defined positive in (2.87a), and $A_0$ is replaced by $c_0/2$ so that the formula for $c_0$ can be folded into the general formula for $c_k$ in (2.87b). Although it is still not mandatory, parameter $\tau$ is usually given the value 0 or $-T/2$.
Nowhere has v been required to be real, so Eqs. (2.87a)–(2.87c), just like Eqs. (2.84a) and (2.84b), still hold true when v is a complex-valued periodic function of (real) period T. Indeed, if v is a complex-valued function of a real argument t, both its real part
\[
v_R(t) = \mathrm{Re}\,v(t)
\]
and its imaginary part
\[
v_I(t) = \mathrm{Im}\,v(t)
\]
are real-valued periodic functions of period T. This means that when, for any integer m, we have
\[
v(t \pm mT) = v(t) , \tag{2.88a}
\]
it follows that
\[
v_R(t \pm mT) = v_R(t) \tag{2.88b}
\]
and
\[
v_I(t \pm mT) = v_I(t) . \tag{2.88c}
\]
Since sines and cosines of real arguments are strictly real, we can now take the real and imaginary parts of (2.87a)–(2.87c) to get
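\[
v_R(t) = \frac{\mathrm{Re}(c_0)}{2} + \sum_{k=1}^{\infty}\mathrm{Re}(c_k)\cos\!\left(\frac{2\pi kt}{T}\right) + \sum_{k=1}^{\infty}\mathrm{Re}(s_k)\sin\!\left(\frac{2\pi kt}{T}\right) , \tag{2.89a}
\]
with
\[
\mathrm{Re}(c_k) = \frac{2}{T}\int_{\tau}^{\tau+T} v_R(t)\cos\!\left(\frac{2\pi kt}{T}\right)dt \quad\text{for } k = 0, 1, 2, \ldots \tag{2.89b}
\]
and
\[
\mathrm{Re}(s_k) = \frac{2}{T}\int_{\tau}^{\tau+T} v_R(t)\sin\!\left(\frac{2\pi kt}{T}\right)dt \quad\text{for } k = 1, 2, 3, \ldots , \tag{2.89c}
\]
as well as
\[
v_I(t) = \frac{\mathrm{Im}(c_0)}{2} + \sum_{k=1}^{\infty}\mathrm{Im}(c_k)\cos\!\left(\frac{2\pi kt}{T}\right) + \sum_{k=1}^{\infty}\mathrm{Im}(s_k)\sin\!\left(\frac{2\pi kt}{T}\right) , \tag{2.90a}
\]
with
\[
\mathrm{Im}(c_k) = \frac{2}{T}\int_{\tau}^{\tau+T} v_I(t)\cos\!\left(\frac{2\pi kt}{T}\right)dt \quad\text{for } k = 0, 1, 2, \ldots \tag{2.90b}
\]
and
\[
\mathrm{Im}(s_k) = \frac{2}{T}\int_{\tau}^{\tau+T} v_I(t)\sin\!\left(\frac{2\pi kt}{T}\right)dt \quad\text{for } k = 1, 2, 3, \ldots . \tag{2.90c}
\]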
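We again start with a function u(t), write its forward Fourier transform as
\[
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt \tag{2.91a}
\]
and, following the same procedure used in Eq. (2.81a) above, create a periodic function of period T:
\[
u^{[\infty]}(t, T) = \sum_{k=-\infty}^{\infty} u(t-kT) . \tag{2.91b}
\]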
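As was shown in Sec. 2.20, we can now write the associated Fourier series as [see Eq. (2.83d)]
\[
u^{[\infty]}(t, T) = \frac{1}{T}\sum_{k=-\infty}^{\infty} U\!\left(\frac{k}{T}\right)e^{2\pi i\frac{kt}{T}} . \tag{2.91c}
\]
Sampling $u^{[\infty]}$ at the uniformly spaced points $t = m\,\Delta t$, with m an integer and N sampling intervals $\Delta t$ per period T, gives
\[
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{k=-\infty}^{\infty} U\!\left(\frac{k}{T}\right)e^{2\pi i\frac{km}{N}} , \tag{2.92a}
\]
where we have used
\[
T = N\,\Delta t \tag{2.92b}
\]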
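to simplify the exponent of (2.92a). The infinite sum in (2.92a) can be split in two by making the substitution $k = n + rN$ with $n = 0, 1, 2, \ldots, N-1$ and $r = 0, \pm 1, \pm 2, \ldots$. This gives
\[
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{r=-\infty}^{\infty}\sum_{n=0}^{N-1} U\!\left(\frac{n+rN}{T}\right)e^{2\pi i\frac{nm}{N}}\,e^{2\pi i rm} .
\]
Since $e^{2\pi i rm} = 1$ and $T = N\,\Delta t$, this becomes, making the index substitution $r' = -r$,
\[
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{n=0}^{N-1} e^{2\pi i\frac{nm}{N}}\sum_{r'=-\infty}^{\infty} U\!\left(\frac{n}{T} - \frac{r'}{\Delta t}\right)
\]
or
\[
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{n=0}^{N-1} e^{2\pi i\frac{nm}{N}}\,U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) , \tag{2.93a}
\]
where we follow the pattern of Eqs. (2.81a) and (2.91b) and define
\[
U^{[\infty]}(f, F) = \sum_{r=-\infty}^{\infty} U(f - rF) \tag{2.93b}
\]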
Discrete Fourier Transform · 2.21
with a two-dimensional plot; but its transform U(ƒ) is often complex, so it makes more sense to plot $|U(f)|$ if we just want to show where U(ƒ) is different from zero.] When function $u^{[\infty]}$ has period T and is uniformly sampled at intervals of $\Delta t$, then function $U^{[\infty]}$ has period
\[
F = \frac{1}{\Delta t} \tag{2.93c}
\]
and is uniformly sampled at intervals of
\[
\Delta f = \frac{1}{T} . \tag{2.93d}
\]
Note, of course, we could also say that $u^{[\infty]}$ has period $1/\Delta f$ and is uniformly sampled at intervals of $1/F$ when $U^{[\infty]}$ has period F and is sampled at intervals of $\Delta f$. When both $\Delta f$ and $\Delta t$ are known, we have from (2.92b) and (2.93d) that
\[
\Delta f\cdot\Delta t = \frac{1}{N} . \tag{2.93e}
\]
Figures 2.12(a) and 2.12(b) show that if T and F are large and functions u(t) and U(ƒ) die away relatively quickly when $|t|$ and $|f|$ are large (which means that u and U are localized near the t and ƒ origins) then the corresponding periodic functions $u^{[\infty]}(t, T)$ and $U^{[\infty]}(f, F)$ can be used to approximate the non-negligible regions of u and U. Almost always when the DFT is used, its users have in mind a situation such as that shown in Figs. 2.12(a) and 2.12(b), with $u^{[\infty]}$ and $U^{[\infty]}$ being good approximations of u and U for small to moderately large values of t and ƒ.
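To complete the DFT transform pair, we define
\[
w_N = e^{\frac{2\pi i}{N}} \tag{2.94a}
\]
so that (2.93a) can be written as
\[
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{n=0}^{N-1} w_N^{mn}\,U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) . \tag{2.94b}
\]
Multiplying both sides of (2.94b) by $w_N^{-mk}$ and summing over m gives
\[
\sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,w_N^{-mk} = \frac{1}{T}\sum_{n=0}^{N-1}\left\{U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right)\cdot\left[\sum_{m=0}^{N-1} w_N^{m\cdot(n-k)}\right]\right\} . \tag{2.94c}
\]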
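FIGURE 2.11(a). $u^{[\infty]}(t, T)$ plotted against t, with period $T = 1/\Delta f$ and sample spacing $\Delta t = 1/F$.
FIGURE 2.11(b). $U^{[\infty]}(f, F)$ plotted against f, with period $F = 1/\Delta t$ and sample spacing $\Delta f = 1/T$.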
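The sum over m on the right-hand side is the sum of a geometric series,
\[
V_{n,k}^{[N]} = \sum_{m=0}^{N-1} w_N^{m(n-k)} . \tag{2.94d}
\]
This can be solved using the standard procedure for geometric sums [see the analysis following Eq. (2.77b) above], multiplying every term in the sum by $w_N^{n-k}$ to get
\[
w_N^{n-k}\,V_{n,k}^{[N]} = \sum_{m=1}^{N} w_N^{m(n-k)} = V_{n,k}^{[N]} + w_N^{N(n-k)} - 1 , \tag{2.94e}
\]
so that
\[
V_{n,k}^{[N]} = \frac{w_N^{N(n-k)} - 1}{w_N^{n-k} - 1} = \frac{e^{2\pi i(n-k)} - 1}{e^{2\pi i(n-k)/N} - 1} , \tag{2.94f}
\]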
where in the last step definition (2.94a) is used to eliminate $w_N$. Index n goes from zero to $N-1$ for each value of k [see Eqs. (2.94b) and (2.94c)]. Deciding also to restrict k to one of the integers $k = 0, 1, 2, \ldots, N-1$, we see that the denominator in (2.94f) can be zero only when $n = k$. This looks like it could be a problem, but when n = k, we can return to the original formula in (2.94d), noting that for n = k the sum $V_{n,k}^{[N]}$ is equal to N. When $n \ne k$, the right-hand side of (2.94f) shows that $V_{n,k}^{[N]}$ is zero because $e^{2\pi i(n-k)} = 1$. We conclude that
\[
V_{n,k}^{[N]} = \left\{\begin{array}{ll} N & \text{for } n = k \\ 0 & \text{for } n \ne k \end{array}\right\} = N\,\delta_{k,n} , \tag{2.94g}
\]
where $\delta_{k,n}$ is the Kronecker delta, equal to one when k = n and zero otherwise.
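The orthogonality result (2.94g) can be confirmed by brute force for a small N. The snippet below is a minimal sketch assuming NumPy; the helper name `V` and the choice N = 8 are illustrative.

```python
import numpy as np

def V(n, k, N):
    """Geometric sum of Eq. (2.94d): sum_{m=0}^{N-1} w_N^{m (n - k)}."""
    w = np.exp(2j * np.pi / N)            # w_N from Eq. (2.94a)
    m = np.arange(N)
    return np.sum(w ** (m * (n - k)))

N = 8                                     # illustrative size
table = np.array([[V(n, k, N) for n in range(N)] for k in range(N)])

# Expect N on the diagonal (n = k) and zero elsewhere, up to rounding.
print(np.allclose(table, N * np.eye(N)))
```

This is the property that lets the sum over m pick out a single term of the sum over n when the transform pair is completed.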
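Substituting the definition (2.94d) into (2.94c) gives
\[
\sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,w_N^{-mk} = \frac{1}{T}\sum_{n=0}^{N-1}\left\{U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right)\cdot V_{n,k}^{[N]}\right\} ,
\]
which, because $V_{n,k}^{[N]}$ is zero unless n = k, reduces to
\[
\frac{N}{T}\,U^{[\infty]}\!\left(\frac{k}{T}, \frac{1}{\Delta t}\right) = \sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,w_N^{-mk} . \tag{2.94i}
\]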
This equation is the other half of the DFT [the first half is specified by Eqs. (2.94a) and (2.94b)]. Using Eqs. (2.94a) and (2.92b) to replace $w_N$ by $e^{(2\pi i)/N}$ and $N/T$ by $1/\Delta t$, we write (2.94b) and (2.94i) as
\[
u^{[\infty]}(m\,\Delta t, T) = \frac{1}{T}\sum_{n=0}^{N-1} e^{2\pi i\left(\frac{mn}{N}\right)}\,U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) \tag{2.95a}
\]
and
\[
U^{[\infty]}\!\left(\frac{n}{T}, \frac{1}{\Delta t}\right) = \Delta t\sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,e^{-2\pi i\left(\frac{mn}{N}\right)} , \tag{2.95b}
\]
FIGURE 2.12(a). $u^{[\infty]}(t, T)$ plotted against t, with period $T = 1/\Delta f$ and sample spacing $\Delta t = 1/F$.
FIGURE 2.12(b). $U^{[\infty]}(f, F)$ plotted against f, with period $F = 1/\Delta t$ and sample spacing $\Delta f = 1/T$.
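where index k has been replaced by n in (2.94i). This can also be written as, using Eqs. (2.93c) and (2.93d),
\[
u^{[\infty]}(m\,\Delta t, T) = \Delta f\sum_{n=0}^{N-1} e^{2\pi i\left(\frac{mn}{N}\right)}\,U^{[\infty]}(n\,\Delta f, F) \tag{2.95c}
\]
and
\[
U^{[\infty]}(n\,\Delta f, F) = \Delta t\sum_{m=0}^{N-1} u^{[\infty]}(m\,\Delta t, T)\,e^{-2\pi i\left(\frac{mn}{N}\right)} . \tag{2.95d}
\]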
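The inverse and forward DFTs shown in (2.95c) and (2.95d) are often written as
\[
u_m = \sum_{n=0}^{N-1} U_n\,e^{2\pi i\left(\frac{mn}{N}\right)} \tag{2.96a}
\]
and
\[
U_n = \frac{1}{N}\sum_{m=0}^{N-1} u_m\,e^{-2\pi i\left(\frac{mn}{N}\right)} . \tag{2.96b}
\]
Here we define
\[
u_m = u^{[\infty]}(m\,\Delta t, T) \tag{2.96c}
\]
and
\[
U_n = \Delta f\cdot U^{[\infty]}(n\,\Delta f, F) , \tag{2.96d}
\]
and to get Eq. (2.96b), both sides of (2.95d) are multiplied by $\Delta f$, using (2.93e) to replace $\Delta f\cdot\Delta t$ by $1/N$. We can also define
\[
U_n = U^{[\infty]}(n\,\Delta f, F) \tag{2.97a}
\]
and
\[
u_m = \Delta t\cdot u^{[\infty]}(m\,\Delta t, T) \tag{2.97b}
\]
to get
\[
u_m = \frac{1}{N}\sum_{n=0}^{N-1} U_n\,e^{2\pi i\left(\frac{mn}{N}\right)} \tag{2.97c}
\]
and
\[
U_n = \sum_{m=0}^{N-1} u_m\,e^{-2\pi i\left(\frac{mn}{N}\right)} . \tag{2.97d}
\]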
To evaluate the integral Fourier transform
\[
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt
\]
over a range of ƒ values for an arbitrary function u(t), it is standard practice to convert the integral to a DFT and do the job on a computer with an FFT. As we saw in the previous section, the DFT deals directly with $u^{[\infty]}$ and $U^{[\infty]}$ rather than u and U. Thus, successfully using the DFT to calculate the integral transform requires that $u^{[\infty]}$ and $U^{[\infty]}$ consist of well-separated, repetitive regions of u and U, as shown in Figs. 2.12(a) and 2.12(b), instead of overlapping regions of u and U, as shown in Figs. 2.11(a) and 2.11(b). Ensuring that $u^{[\infty]}$ consists of nonoverlapping regions of u tends to occur naturally; the shape of u is already known so there is no real difficulty in picking T large enough to prevent significant amounts of overlap in $u^{[\infty]}$. The shape of U, however, is not known in advance, so care must be taken to avoid significant amounts of overlap in $U^{[\infty]}$.
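As an illustration of this practice, the following sketch uses (2.95d) with an FFT to approximate a transform that is known exactly. It assumes NumPy and the Gaussian example $u(t) = e^{-\pi t^2}$, for which $U(f) = e^{-\pi f^2}$; the values of T and N are arbitrary choices made large enough that neither $u^{[\infty]}$ nor $U^{[\infty]}$ has noticeable overlap.

```python
import numpy as np

def u(t):
    return np.exp(-np.pi * t**2)         # example signal (assumption)

def U_exact(f):
    return np.exp(-np.pi * f**2)         # known transform of the example

T, N = 16.0, 256                         # period and sample count (assumed values)
dt = T / N                               # sampling interval, from T = N * dt
F = 1.0 / dt                             # period of U_inf in frequency, Eq. (2.93c)

m = np.arange(N)
t = m * dt
# One period of u_inf(t, T) on 0 <= t < T. Only the nearest replica of u is
# kept: u(t) for t < T/2 and u(t - T) otherwise; the rest are negligible here.
u_inf = u(np.where(t < T / 2, t, t - T))

U_inf = dt * np.fft.fft(u_inf)           # Eq. (2.95d)
f = np.fft.fftfreq(N, d=dt)              # n * df, upper half mapped to negative f

print("F =", F, " df =", f[1] - f[0])
print("largest error:", np.max(np.abs(U_inf - U_exact(f))))
```

Shrinking T or N until the Gaussian tails reach the edges of the t or ƒ window makes the error grow, which is the overlap problem described above.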
Consider what happens when the DFT is used to analyze a real signal u(t) having the spectrum U(ƒ) and we know that U(ƒ) is zero for all $f \ge f_{\max}$ and nonzero for $0 \le f < f_{\max}$. Because u is real, we know from entry 7 in Table 2.1 that $U(-f) = U(f)^{*}$, ensuring that U(ƒ) is also nonzero for negative frequency values $0 \ge f > -f_{\max}$; that is, for every positive ƒ at which U is nonzero there must be a −ƒ at which U is nonzero, and because U is zero for $f \ge f_{\max}$ it follows that U is zero for all $f \le -f_{\max}$. Hence U can be represented schematically by the solid triangle centered on the origin of Fig. 2.14. To construct $U^{[\infty]}$, we write
Aliasing as an Error · 2.22
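FIGURE 2.13(a). $u^{[\infty]}(t, T)$ plotted against t, with period $T = 1/\Delta f$ and sample spacing $\Delta t = 1/F$.
FIGURE 2.13(b). $U^{[\infty]}(f, F)$ plotted against f, with period $F = 1/\Delta t$ and sample spacing $\Delta f = 1/T$.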
\[
U^{[\infty]}(f, F) = \sum_{k=-\infty}^{\infty} U(f - kF) , \tag{2.98a}
\]
where the smallest we can make F and still avoid overlap is, as shown by the dotted triangles in Fig. 2.14,
\[
F = 2 f_{\max} . \tag{2.98b}
\]
From (2.93c), this value of F corresponds to
\[
\Delta t = \frac{1}{2 f_{\max}} ,
\]
where $\Delta t$ is the interval in t between adjacent samples of u(t). If $\Delta t$ is made smaller, then F increases, moving the regions of nonzero U further apart in Fig. 2.14; and if $\Delta t$ is made larger, then F decreases, forcing the regions of nonzero U to overlap in Fig. 2.14. Making $\Delta t$ smaller is wasteful, in that more effort than is needed goes into sampling u(t), and making $\Delta t$ larger damages the integrity of the U calculations for large values of ƒ near $f_{\max}$. Clearly, the frequency value F/2 plays an important role in DFT analysis, because optimum performance requires $f_{\max} = F/2$. For this reason frequency F/2 is given a special name: the Nyquist frequency $f_{\mathrm{Nyq}} = F/2$. From (2.93c), we see that
\[
f_{\mathrm{Nyq}} = \frac{1}{2\,\Delta t} . \tag{2.99a}
\]
A realistic system, of course, is designed with some built-in margin for error. The requirement then becomes that $\Delta t$ be small enough to separate unexpectedly high frequencies when the highest expected frequency is $f_{\max}$. To provide this margin, we take
\[
f_{\mathrm{Nyq}} = \frac{1}{2\,\Delta t} > f_{\max} \tag{2.99b}
\]
or
\[
\Delta t < \frac{1}{2 f_{\max}} . \tag{2.99c}
\]
Now the region between $f_{\max}$ and $f_{\mathrm{Nyq}}$ is available for analysis of unexpectedly high frequencies.
Suppose U(ƒ) is negligible everywhere except at two frequencies, the positive frequency $f_0$ and the corresponding negative frequency $-f_0$. Since U(ƒ) is the transform of a real signal, entry 7 of Table 2.1 requires $U(-f) = U(f)^{*}$, forcing the existence of a non-negligible transform value at $-f_0$ when there is a non-negligible transform value at $f_0$. The two frequencies are represented by wide, solid-sided arrows in Fig. 2.15. The arrows represent isolated, narrow regions where U is very large, so we can think of them as proportional to delta functions and write U(ƒ) as
\[
U(f) = A\cdot\delta(f - f_0) + B\cdot\delta(f + f_0) .
\]
Variables A and B are arbitrary complex constants. We have just seen that Table 2.1 requires $U(-f) = U(f)^{*}$. Because the delta functions are real, the equation $U(-f) = U(f)^{*}$ can be written as
\[
A\cdot\delta(-f - f_0) + B\cdot\delta(-f + f_0) = A^{*}\cdot\delta(f - f_0) + B^{*}\cdot\delta(f + f_0)
\]
or, since the delta functions are also even [see Eq. (2.68a)],
\[
A\cdot\delta(f + f_0) + B\cdot\delta(f - f_0) = A^{*}\cdot\delta(f - f_0) + B^{*}\cdot\delta(f + f_0) .
\]
This can only be true if $A = B^{*}$ (which is, of course, the same thing as having $B = A^{*}$). Therefore, we have the freedom to choose only one arbitrary complex constant, say A, and after making that choice function U(ƒ) becomes
\[
U(f) = A\cdot\delta(f - f_0) + A^{*}\cdot\delta(f + f_0) . \tag{2.100a}
\]
FIGURE 2.14. $U^{[\infty]}(f, F)$ and $U(f)$ plotted against f, with the positions $-F$, $-f_{\max}$, $f_{\max}$, and $F$ marked.
It is not difficult to figure out what happens when the DFT is used to calculate this double-delta frequency spectrum. If the double-delta U(ƒ) is used to construct $U^{[\infty]}(f, F)$ according to formula (2.98a), we get multiple isolated regions where $U^{[\infty]}$ is very large, as shown by the wide dashed arrows in Fig. 2.15. The curved single arrows show which wide dashed arrows come from the wide, solid-sided arrow at $f_0$ and which wide dashed arrows come from the wide solid-sided arrow at $-f_0$. For example, the wide dashed arrow closest to $f_0$ comes from the wide solid-sided arrow at $-f_0$, and the wide dashed arrow closest to $-f_0$ comes from the wide solid-sided arrow at $f_0$. The two wide solid-sided arrows at $f_0$ and $-f_0$ lie a distance a inside the positions of the positive and negative Nyquist frequencies $f_{\mathrm{Nyq}}$ and $-f_{\mathrm{Nyq}}$, and the two wide dashed arrows that are closest to $f_0$ and $-f_0$ lie a distance a outside the positive and negative Nyquist frequencies $f_{\mathrm{Nyq}}$ and $-f_{\mathrm{Nyq}}$. We see that the original double-delta U(ƒ) transform can be written as [from Eq. (2.100a)]
\[
U(f) = A\cdot\delta(f - f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f + f_{\mathrm{Nyq}} - a) , \tag{2.100b}
\]
and we can pair up the two wide dashed arrows closest to $f_0$ and $-f_0$ to create the transform
\[
U^{[1]}(f) = A^{*}\cdot\delta(f - f_{\mathrm{Nyq}} - a) + A\cdot\delta(f + f_{\mathrm{Nyq}} + a) . \tag{2.100c}
\]
Because the delta function $\delta(f - f_{\mathrm{Nyq}} + a) = \delta(f - f_0)$ has the coefficient A in (2.100b), the curved single arrow going from $f_0$ to $-f_{\mathrm{Nyq}} - a$ shows that the delta function $\delta(f + f_{\mathrm{Nyq}} + a)$ at $-f_{\mathrm{Nyq}} - a$ must have the coefficient A in Eq. (2.100c); similarly, the curved single arrow going from $-f_0$ to $f_{\mathrm{Nyq}} + a$ shows that the delta function $\delta(f - f_{\mathrm{Nyq}} - a)$ at $f_{\mathrm{Nyq}} + a$ must have the coefficient $A^{*}$ in Eq. (2.100c). Nothing stops us from continuing out from the origin, pairing the wide dashed arrows at $f = 3f_{\mathrm{Nyq}} - a$ and $f = -3f_{\mathrm{Nyq}} + a$ to get
\[
U^{[2]}(f) = A\cdot\delta(f - 3f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f + 3f_{\mathrm{Nyq}} - a) , \tag{2.100d}
\]
and pairing the wide dashed arrows at $f = 3f_{\mathrm{Nyq}} + a$ and $f = -3f_{\mathrm{Nyq}} - a$ to get
\[
U^{[3]}(f) = A^{*}\cdot\delta(f - 3f_{\mathrm{Nyq}} - a) + A\cdot\delta(f + 3f_{\mathrm{Nyq}} + a) . \tag{2.100e}
\]
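FIGURE 2.15. Delta-function arrows at frequency $-f_0$ and frequency $f_0$ together with their aliases; each arrow lies a distance a from the nearest multiple of the Nyquist frequency, and $F = 2f_{\mathrm{Nyq}}$.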
Each time, the curved single arrows in Fig. 2.15 are consulted to find the coefficients of the delta functions. This can obviously be continued out to indefinitely large values of ƒ, creating the paired transforms $U^{[4]}, U^{[5]}, \ldots$, etc. The general formula for $U^{[k]}$ turns out to be
\[
U^{[k]}(f) = \begin{cases}
A\cdot\delta(f - f_{\mathrm{Nyq}} - k f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f + f_{\mathrm{Nyq}} + k f_{\mathrm{Nyq}} - a) & \text{for } k \text{ even} \\[4pt]
A\cdot\delta(f - f_{\mathrm{Nyq}} + (k+1) f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f + f_{\mathrm{Nyq}} - (k+1) f_{\mathrm{Nyq}} - a) & \text{for } k \text{ odd}
\end{cases} . \tag{2.100f}
\]
We started out with the double-delta U(ƒ) being the forward Fourier transform of u(t), which means that u(t) is the inverse Fourier transform of the double-delta U(ƒ),
\[
u(t) = \int_{-\infty}^{\infty} U(f)\,e^{2\pi i f t}\,df .
\]
We now show that u(t), the inverse transform of the double-delta U(ƒ), and $u^{[1]}(t), u^{[2]}(t), \ldots$, the inverse transforms of $U^{[1]}, U^{[2]}, \ldots$, all have the same values at $t = m\,\Delta t$ for $m = 0, \pm 1, \pm 2, \ldots$,
\[
u(m\,\Delta t) = u^{[1]}(m\,\Delta t) = u^{[2]}(m\,\Delta t) = \cdots = u^{[k]}(m\,\Delta t) = \cdots . \tag{2.100g}
\]
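We begin by taking the inverse Fourier transform of the double-delta U(ƒ) function specified in (2.100b),
\[
\begin{aligned}
u(t) &= \int_{-\infty}^{\infty}\left[A\cdot\delta(f - f_{\mathrm{Nyq}} + a) + A^{*}\cdot\delta(f + f_{\mathrm{Nyq}} - a)\right]e^{2\pi i f t}\,df \\
&= A\,e^{2\pi i t(f_{\mathrm{Nyq}} - a)} + A^{*}e^{2\pi i t(a - f_{\mathrm{Nyq}})} = 2\,\mathrm{Re}\!\left[A\,e^{2\pi i t(f_{\mathrm{Nyq}} - a)}\right] .
\end{aligned} \tag{2.101a}
\]
Substituting $t = m\,\Delta t$ from (2.100g) and $f_{\mathrm{Nyq}} = 1/(2\,\Delta t)$ from (2.99a) into Eq. (2.101a) gives
\[
\begin{aligned}
u(m\,\Delta t) &= 2\,\mathrm{Re}\!\left[A\,e^{2\pi i m\,\Delta t\left((2\Delta t)^{-1} - a\right)}\right] = 2\,\mathrm{Re}\!\left[A\,e^{i\pi m}\,e^{-2\pi i m a\,\Delta t}\right] \\
&= 2\,\mathrm{Re}\!\left[(-1)^{m} A\,e^{-2\pi i m a\,\Delta t}\right] .
\end{aligned} \tag{2.101c}
\]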
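In the same way, taking the inverse Fourier transform of $U^{[k]}$ in (2.100f) and evaluating it at $t = m\,\Delta t$ gives
\[
u^{[k]}(m\,\Delta t) = \begin{cases}
2\,\mathrm{Re}\!\left[A\,e^{2\pi i m\,\Delta t\left((2\Delta t)^{-1} + k(2\Delta t)^{-1} - a\right)}\right] = 2\,\mathrm{Re}\!\left[A\,e^{i\pi m}\,e^{i\pi mk}\,e^{-2\pi i m a\,\Delta t}\right] & \text{for } k \text{ even} \\[4pt]
2\,\mathrm{Re}\!\left[A\,e^{2\pi i m\,\Delta t\left((2\Delta t)^{-1} - (k+1)(2\Delta t)^{-1} - a\right)}\right] = 2\,\mathrm{Re}\!\left[A\,e^{i\pi m}\,e^{-i\pi m(k+1)}\,e^{-2\pi i m a\,\Delta t}\right] & \text{for } k \text{ odd}
\end{cases} . \tag{2.101d}
\]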
But $e^{\pm i\pi mk} = (-1)^{mk} = 1$ when k is even and $e^{\pm i\pi m(k+1)} = (-1)^{m(k+1)} = 1$ when k is odd, so this last result can be written as
\[
u^{[k]}(m\,\Delta t) = 2\,\mathrm{Re}\!\left[(-1)^{m} A\,e^{-2\pi i m a\,\Delta t}\right] . \tag{2.101e}
\]
Comparing this with (2.101c), we conclude that $u(m\,\Delta t) = u^{[k]}(m\,\Delta t)$ for all values of m and k, showing that (2.100g) must be true. Because the $u^{[k]}$ functions have exactly the same values as the function u at $t = m\,\Delta t$ for $m = 0, \pm 1, \pm 2, \ldots$, the $u^{[k]}$ functions are called aliases of function u. Figure 2.16 graphs an example of u(t) and $u^{[1]}(t)$ to show how u and its alias $u^{[1]}$ can have identical values at all the sample positions on the t axis.
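The equality of u and its aliases at the sample points can be seen numerically. The sketch below assumes NumPy and picks arbitrary illustrative values for the sampling interval, the offset a, and the complex amplitude A; the alias frequencies follow the even/odd pattern of (2.100f), with a = 0.2 times the Nyquist frequency so that the signal sits at 0.8 times the Nyquist frequency and its first alias at 1.2 times it.

```python
import numpy as np

dt = 0.01                       # sampling interval (assumed value)
f_nyq = 1.0 / (2.0 * dt)        # Eq. (2.99a)
a = 0.2 * f_nyq                 # so that f0 = f_nyq - a = 0.8 * f_nyq
A = 0.7 - 0.4j                  # arbitrary complex amplitude (assumption)

def u(t):
    """Inverse transform of the double-delta spectrum, Eq. (2.101a)."""
    return 2.0 * np.real(A * np.exp(2j * np.pi * t * (f_nyq - a)))

def u_alias(t, k):
    """Inverse transform of the k-th alias pair, following the pattern of (2.100f)."""
    if k % 2 == 0:
        freq = f_nyq + k * f_nyq - a
    else:
        freq = f_nyq - (k + 1) * f_nyq - a
    return 2.0 * np.real(A * np.exp(2j * np.pi * t * freq))

m = np.arange(-20, 21)
for k in (1, 2, 3, 4):
    gap = np.max(np.abs(u(m * dt) - u_alias(m * dt, k)))
    print(f"k={k}: largest sample mismatch = {gap:.1e}")

# Halfway between two samples the functions are clearly different:
print(abs(u(0.5 * dt) - u_alias(0.5 * dt, 1)))
```

The mismatches at the sample points are at rounding level, while the value halfway between samples differs by an amount comparable to the signal amplitude, which is exactly the information that sampling discards.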
The term “alias” is an interesting one; it suggests that there is no real way to distinguish these
functions if all we know are the values of the sample points at $t = m\,\Delta t$. Yet in Figs. 2.14 and
2.15, there is really no question as to which is the correct region of $U^{[\infty]}$; spectral values whose
frequencies do not lie between +fNyq and –fNyq can clearly be disregarded. Consider, however, that
before u(t) is analyzed there is no guarantee as to what the correct value of fmax is. Figure 2.17, for
example, shows a pattern for $U^{[\infty]}$ that seems to have well-separated regions for U and all its
aliases when in fact there is a high-frequency triangle that is hidden by aliasing. The unwary
analyst might conclude that U has the shape shown in Fig. 2.18(a) when its true shape is the one
shown in Fig. 2.18(b). There is really no way to be sure of the true shape of U when all that is
known is the DFT of the sampled signal u(t). The basic problem, which is that the DFT is the
sampled version of $U^{[\infty]}$ instead of U, does not disappear when $F = 1/\Delta t$ is made larger by
decreasing the sampling interval $\Delta t$; there is always the possibility that the true U curve is broad
enough to overlap. Returning to Fig. 2.16, we see that no matter how small $\Delta t$ is made, the
information thrown away from between the samples inevitably allows high frequencies to
masquerade as low frequencies. There is no foolproof method for both sampling the data and
avoiding this possibility.
Fortunately, there are usually ways of avoiding this logical dead end. As is pointed out in Sec.
2.2 above [see discussion after Eq. (2.9b)], in practice all measurements are sampled and, before
representing them by continuous functions, we must know that the samples capture all the
relevant detail. In other words, there must be some way of knowing, based on past experience or
knowledge of how the data is gathered, that the sampling is rapid enough to represent faithfully
all the important high-frequency details. In terms of the notation used to discuss Fig. 2.14, we
must eventually be prepared to say that, for some specific ƒmax, no higher frequencies are present
to create aliasing—that is, we must know that if more closely spaced sampling is done all that
would be found is a smooth, quasi-linear variation between the current samples. Many times the
electronic instruments used to make the measurements cannot sense high-frequency data, so even
if high-frequency components exist, they cannot be recorded. Other times, all that can be done is
to look at the data samples and decide whether it is reasonable to suspect the presence of unseen
high-frequency components. The data in Fig. 2.19(a), for example, almost certainly do not
contain significant amounts of unseen high frequencies, whereas unseen high frequencies could
well be present in Fig. 2.19(b). There may be cases where all that can be done is to shorten $\Delta t$ and
see whether previously aliased frequency components suddenly appear. The question of whether
aliasing is present is analogous to the question of whether experimental error is present. Just as it
is always logically possible that data contain significant amounts of undetected error, so it is
FIGURE 2.16. Two sinusoids and their common sample points plotted against t.
The solid line represents a sinusoidal oscillation at a frequency that is 0.8 times the Nyquist frequency, and the dashed line represents a sinusoidal oscillation that is 1.2 times the Nyquist frequency. When the curves are sampled at the rate represented by the black dots (which in this case is twice the Nyquist frequency) there is no way to tell them apart in the sampled data.
always logically possible that significant amounts of aliasing are being overlooked. Just as we
often expect insignificant amounts of error to occur no matter what precautions are taken, so we
often expect insignificant amounts of aliasing to occur in the calculated DFT. What is needed is
the presence of good engineering and scientific judgment; there must always be someone willing
to pick a value for ƒmax, allowing us to specify the sampling interval $\Delta t \le 1/(2 f_{\max})$ that prevents
significant aliasing in the DFT.
Consider a real signal u(t) whose forward Fourier transform is
\[
U(f) = \int_{-\infty}^{\infty} u(t)\,e^{-2\pi i f t}\,dt ,
\]
which is zero for all positive frequencies ƒ that do not lie between the two positive numbers ƒmin and ƒmax; that is, U(ƒ) is zero when $0 \le f \le f_{\min}$ or $f \ge f_{\max}$. Because u(t) is real, U(ƒ) must be Hermitian (see entry 7 of Table 2.1), which means
\[
U(-f) = U(f)^{*} .
\]
This shows that U(ƒ) must also be strictly zero for negative frequencies ƒ where $-f_{\min} \le f \le 0$ or $f \le -f_{\max}$. The U(ƒ) transform is schematically represented in Fig. 2.20 with the two blocks showing that U is zero unless ƒ lies in the interval $(-f_{\max}, -f_{\min})$ or $(f_{\min}, f_{\max})$.
The situation shown in Fig. 2.20 describes the signal produced by Michelson interferometers.
At the beginning of this chapter, we mentioned that interferometers produce interferograms that
must then be Fourier transformed to produce the desired spectral measurement. As explained
later in Chapter 4 (see Sec. 4.10), interferometers use optical filters to block out undesired
electromagnetic frequencies, which means there always exist values of ƒmin and ƒmax such that the
transform U(ƒ) of the interferogram signal u(t) is zero unless ƒ lies between $(-f_{max}, -f_{min})$ or
$(f_{min}, f_{max})$. Suppose we sample the interferogram signal with a sampling interval $\Delta t$ such that
the Nyquist frequency $f_{Nyq} = (2\Delta t)^{-1}$ is slightly larger than ƒmax. Repeating the reasoning used to
get Fig. 2.15 above, we see that
\[ U^{[\infty]}(f, F) = \sum_{k=-\infty}^{\infty} U(f - kF) \]
2 · Fourier Theory
FIGURE 2.17. [Plot of $U^{[\infty]}(f, F)$ versus $f$, with $f_{Nyq}$ and $F = 2 f_{Nyq}$ marked on the frequency axis.]
FIGURE 2.18(a). [Plot of $U(f)$.]
FIGURE 2.18(b). [Plot of $U(f)$.]
The $U^{[\infty]}(f, F)$ data in Fig. 2.17 contain hidden aliasing that can lead spectral analysts to assume
that Fig. 2.18(a) rather than Fig. 2.18(b) depicts the true frequency spectrum.
Aliasing as a Tool · 2.23
FIGURE 2.19(a).
This data is relatively smooth, suggesting that it does not contain high-frequency components.
FIGURE 2.19(b).
This curve varies rapidly in three locations, suggesting the presence of high-frequency
components in the data.
now has the form shown in Fig. 2.21. Again, the solid blocks show the original U(ƒ), the dashed
blocks show the aliases created by turning U(ƒ) into $U^{[\infty]}(f, F)$, and the curved arrows drawn
show exactly how the aliased blocks are created from the original blocks. No solid blocks overlap
with the dashed blocks, so aliasing is not a problem.
Now consider what happens when we force aliasing to occur by choosing $\Delta t$ to be twice its
original size, creating the $U^{[\infty]}$ plot shown in Fig. 2.22. As in Fig. 2.21, none of the solid blocks
overlap with the dashed blocks. Because the dashed blocks come from turning U into $U^{[\infty]}$, the
spectral shapes represented by the solid and dashed blocks are all identical. This means that the
aliasing does not cause spectral information to be lost; either the solid blocks or the dashed
blocks can be used to recover the true shape of U(ƒ). The electronic equipment used to sample
u(t) only needs to sample half as often as before, which usually makes it less expensive to build,
and as a bonus the rate at which data flows from the interferometer ends up being cut in half. This
last point is often a significant consideration when the interferometer is on a satellite and all the
data has to be communicated to the ground. The scheme shown in Fig. 2.22 is called
undersampling. There is nothing special about undersampling by a factor of 2; if the distance
between ƒmin and ƒmax is small enough, and ƒmin is far enough from $f = 0$, we can undersample
by much higher factors. Figure 2.23 shows a scheme that undersamples by a factor of 5, with 4
aliases rather than one.
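Whether a given band can be undersampled at a given rate without the solid and dashed blocks overlapping can be tested with a few lines of code. The sketch below is in Python and relies on an assumption that is not taken from the text: the standard band-pass sampling condition that the band must fit inside a single interval between consecutive multiples of the Nyquist frequency. The function name and the example band are hypothetical.

```python
import math

def band_fits_without_overlap(f_min, f_max, sampling_rate):
    """Return True if the band (f_min, f_max) and its aliases do not overlap.

    Assumption (standard band-pass sampling condition, not taken from the text):
    the band must lie inside one interval [k*f_Nyq, (k+1)*f_Nyq], with
    f_Nyq = sampling_rate / 2 and k a nonnegative integer.
    """
    f_nyq = sampling_rate / 2.0
    k = math.floor(f_min / f_nyq)      # largest k with k*f_Nyq <= f_min
    return f_max <= (k + 1) * f_nyq

# Hypothetical band from 80 to 100 (arbitrary frequency units).
no_undersampling = band_fits_without_overlap(80.0, 100.0, 200.0)  # f_Nyq equals f_max
factor_of_five = band_fits_without_overlap(80.0, 100.0, 40.0)     # rate reduced by 5
bad_choice = band_fits_without_overlap(80.0, 100.0, 60.0)         # band straddles 3*f_Nyq
print(no_undersampling, factor_of_five, bad_choice)
```

In this example the narrow band tolerates a fivefold reduction in sampling rate, while an intermediate rate fails because a multiple of the Nyquist frequency falls inside the band.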
\[ U(f) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt , \]
is strictly zero when $f \le -f_{max}$ or $f \ge f_{max}$. The previous section indicated that the interferogram
of a Michelson interferometer is a special case of a band-limited function; not only is its
transform zero for $|f| \ge f_{max}$, but there is also a positive frequency ƒmin such that its transform is
zero for $|f| \le f_{min}$ (see Fig. 2.20). It can be shown that whenever a continuous function u(t) is
also band limited, then its samples $u(m\Delta t)$ (with $m = 0, \pm 1, \pm 2, \ldots$) can be used to reconstruct the
complete function, including the values of u between the samples, as long as we choose
\[ \Delta t \le \frac{1}{2 f_{max}} \tag{2.102} \]
to prevent aliasing.
We start by forming the mathematical construct
Sampling Theorem · 2.24
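FIGURE 2.20. [Plot of $U(f)$ versus $f$, with $f_{min}$ and $f_{max}$ marked on both the negative and positive halves of the frequency axis.]
FIGURE 2.21. [Plot of $U^{[\infty]}(f, F)$ versus $f$, with $f_{min}$, $f_{max}$, $f_{Nyq}$, and $F$ marked on the frequency axis.]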
\[ v(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\, \delta(t - m\Delta t) . \tag{2.103} \]
Clearly, the $u(m\Delta t)$ sample values of function u are the only data used to set up function v(t).
Because $u(t)\,\delta(t - t_0) = u(t_0)\,\delta(t - t_0)$ for any continuous function u [see Eq. (2.68e) above], this
can be written as
\[ v(t) = \sum_{m=-\infty}^{\infty} u(t)\, \delta(t - m\Delta t) \]
or
\[ v(t) = u(t) \cdot \left[ \sum_{m=-\infty}^{\infty} \delta(t - m\Delta t) \right] . \]
Note that here t in the function u has returned to being a continuous, not a sampled, variable.
Taking the Fourier transform of both sides gives, using the Fourier convolution theorem [see Eq. (2.72i)],
\[ V(f) = U(f) * \left[ \frac{1}{\Delta t} \sum_{k=-\infty}^{\infty} \delta\!\left( f - \frac{k}{\Delta t} \right) \right] , \tag{2.104a} \]
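where
\[ V(f) = \int_{-\infty}^{\infty} v(t)\, e^{-2\pi i f t}\, dt , \tag{2.104b} \]
\[ U(f) = \int_{-\infty}^{\infty} u(t)\, e^{-2\pi i f t}\, dt , \tag{2.104c} \]
and
\[ \int_{-\infty}^{\infty} \left[ \sum_{k=-\infty}^{\infty} \delta(t - k\Delta t) \right] e^{-2\pi i f t}\, dt = \frac{1}{\Delta t} \sum_{k=-\infty}^{\infty} \delta\!\left( f - \frac{k}{\Delta t} \right) \tag{2.104d} \]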
from formula (2.78d). Note that here both ƒ and t are continuous, not sampled, variables. We can
now use the linearity of the convolution [see discussion after Eq. (2.38c)] and the definition of
the convolution in Eq. (2.38a) to write (2.104a) as
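\[
\begin{aligned}
\Delta t \cdot V(f) &= \sum_{k=-\infty}^{\infty} U(f) * \delta\!\left( f - \frac{k}{\Delta t} \right) = \sum_{k=-\infty}^{\infty} \int_{-\infty}^{\infty} U(f')\, \delta\!\left( f - f' - \frac{k}{\Delta t} \right) df' \\
&= \sum_{k=-\infty}^{\infty} U\!\left( f - \frac{k}{\Delta t} \right) = U^{[\infty]}\!\left( f, \frac{1}{\Delta t} \right) ,
\end{aligned}
\tag{2.105a}
\]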
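FIGURE 2.22. [Plot of $U^{[\infty]}(f, F)$ versus $f$, with $f_{min}$, $f_{max}$, $f_{Nyq}$, and $F$ marked on the frequency axis.]
FIGURE 2.23. [Plot of $U^{[\infty]}(f, F)$ versus $f$, with $f_{min}$, $f_{max}$, $f_{Nyq}$, and $F$ marked on the frequency axis.]
In both Figs. 2.22 and 2.23, frequency F is twice the Nyquist frequency $f_{Nyq}$.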
where $U^{[\infty]}$ is as defined in Eq. (2.93b) above. Inequality (2.102) ensures that the separate
regions of U that combine to create $U^{[\infty]}$ do not overlap, giving us the graph of $U^{[\infty]}$ shown in
Fig. 2.24. Hence, we can use the $\Pi$ function defined in Eq. (2.56c) to select just the region of
nonzero $U^{[\infty]}$ between $-(2\Delta t)^{-1}$ and $(2\Delta t)^{-1}$, recreating the original U(ƒ) transform.
Multiplication of (2.105a) by $\Pi\!\left(f, (2\Delta t)^{-1}\right)$ then gives
\[ U(f) = \Pi\!\left( f, \frac{1}{2\Delta t} \right) \cdot U^{[\infty]}\!\left( f, \frac{1}{\Delta t} \right) = \Delta t \cdot V(f) \cdot \Pi\!\left( f, \frac{1}{2\Delta t} \right) . \tag{2.105b} \]
Having recovered the original U(ƒ), an inverse Fourier transform of U(ƒ) gives back the original
unsampled u(t). Using the Fourier convolution theorem again to take the inverse Fourier
transform of both sides of (2.105b), we get [applying Eq. (2.39j) after interchanging the roles of ƒ
and t]
\[
\begin{aligned}
u(t) &= \Delta t \int_{-\infty}^{\infty} V(f) \cdot \Pi\!\left( f, \frac{1}{2\Delta t} \right) e^{2\pi i f t}\, df \\
&= \Delta t \left[ \int_{-\infty}^{\infty} V(f)\, e^{2\pi i f t}\, df \right] * \left[ \int_{-\infty}^{\infty} \Pi\!\left( f', \frac{1}{2\Delta t} \right) e^{2\pi i f' t}\, df' \right] ,
\end{aligned}
\tag{2.106a}
\]
where the convolution between the two expressions inside square brackets [ ] is over the variable
t. From (2.104b), function V(ƒ) is the forward Fourier transform of v(t), making v(t) equal to the
inverse Fourier transform of V(ƒ) in (2.106a), with v(t) defined as
\[ v(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\, \delta(t - m\Delta t) \]
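in Eq. (2.103). From Eq. (2.71a) above, the inverse Fourier transform of $\Pi$ is
\[ F^{(ift)}\!\left( \Pi\!\left( f, \frac{1}{2\Delta t} \right) \right) = \int_{-\infty}^{\infty} e^{2\pi i f t}\, \Pi\!\left( f, \frac{1}{2\Delta t} \right) df = \frac{1}{\pi t} \sin\!\left( \frac{\pi t}{\Delta t} \right) . \]
Equation (2.106a) therefore becomes
\[ u(t) = \Delta t \left[ \sum_{m=-\infty}^{\infty} u(m\Delta t)\, \delta(t - m\Delta t) \right] * \left[ \frac{1}{\pi t} \sin\!\left( \frac{\pi t}{\Delta t} \right) \right] . \tag{2.106b} \]
Because the convolution is linear, this can be written as
\[ u(t) = \Delta t \sum_{m=-\infty}^{\infty} \left\{ u(m\Delta t) \left[ \delta(t - m\Delta t) * \frac{1}{\pi t} \sin\!\left( \frac{\pi t}{\Delta t} \right) \right] \right\} \]
or
\[ u(t) = \sum_{m=-\infty}^{\infty} \left\{ u(m\Delta t) \left[ \frac{1}{\pi \left( (t - m\Delta t)/\Delta t \right)} \sin\!\left( \frac{\pi (t - m\Delta t)}{\Delta t} \right) \right] \right\} . \tag{2.106c} \]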
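FIGURE 2.24. [Plot of $U^{[\infty]}(f, 1/\Delta t)$ versus $f$, showing $U(f)$ between $-f_{max}$ and $f_{max}$, with $1/(2\Delta t)$, $1/\Delta t - f_{max}$, and $1/\Delta t$ marked on the frequency axis.]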
This formula gives us u(t) everywhere in terms of the samples $u(m\Delta t)$ and the function
\[ \frac{1}{\pi (t/\Delta t)} \sin\!\left( \frac{\pi t}{\Delta t} \right) . \]
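The reconstruction formula in Eq. (2.106c) can be tried out numerically. The following Python sketch is only an approximation under stated assumptions: the infinite sum over m is truncated at a finite limit, and the test signal (a 1 Hz cosine), the sampling interval, and all function names are hypothetical choices made for illustration.

```python
import math

def sinc_alt(x):
    """sin(pi*x)/(pi*x), with the limiting value 1 at x = 0."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(u, dt, t, m_limit):
    """Approximate u(t) from the samples u(m*dt), m = -m_limit, ..., m_limit.

    This truncates the infinite sum over m, so the result is only approximate.
    """
    return sum(u(m * dt) * sinc_alt((t - m * dt) / dt)
               for m in range(-m_limit, m_limit + 1))

def signal(t):
    """Hypothetical band-limited test signal: a 1 Hz cosine, so f_max = 1."""
    return math.cos(2.0 * math.pi * t)

dt = 0.1                      # satisfies dt <= 1/(2*f_max) = 0.5
t_between = 0.03              # a time that falls between two samples
estimate = reconstruct(signal, dt, t_between, 1000)
error_between = abs(estimate - signal(t_between))   # truncation error only
print(error_between < 1e-2)
```

The residual error comes from truncating the sum; it shrinks slowly as more samples are included, which is one reason practical interpolators often use other kernels.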
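Many authors use a different definition of the sinc function, which we call here $\mathrm{sinc}_{alt}$, with
\[ \mathrm{sinc}_{alt}(x) = \frac{\sin(\pi x)}{\pi x} . \]
In terms of $\mathrm{sinc}_{alt}$, Eq. (2.106c) can be written as
\[ u(t) = \sum_{m=-\infty}^{\infty} u(m\Delta t)\, \mathrm{sinc}_{alt}\!\left( \frac{t - m\Delta t}{\Delta t} \right) . \]
For the rest of this book, the symbol sinc will refer to $\sin(x)/x$ instead of $\sin(\pi x)/(\pi x)$. We also
note that the Fourier transform pair in (2.71a) can be written in terms of $\mathrm{sinc}(x)$ as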
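\[ \int_{-\infty}^{\infty} e^{-2\pi i f t}\, [2F\, \mathrm{sinc}(2\pi F t)]\, dt = \Pi(f, F) \]
and
\[ \int_{-\infty}^{\infty} e^{2\pi i f t}\, \Pi(f, F)\, df = 2F\, \mathrm{sinc}(2\pi F t) . \]
Reversing the sign of the exponent gives
\[ \int_{-\infty}^{\infty} e^{2\pi i f t}\, [2F\, \mathrm{sinc}(2\pi F t)]\, dt = \Pi(-f, F) = \Pi(f, F) \]
and
\[ \int_{-\infty}^{\infty} e^{-2\pi i f t}\, \Pi(f, F)\, df = 2F\, \mathrm{sinc}(-2\pi F t) = 2F\, \mathrm{sinc}(2\pi F t) , \]
where we have used that $\Pi(f, F)$ and $\mathrm{sinc}(2\pi F t)$ are even functions of their arguments.
This means we can write this Fourier relationship using the more general formulas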
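\[ F^{(\pm ift)}\!\left( 2F\, \mathrm{sinc}(2\pi F t) \right) = \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\, [2F\, \mathrm{sinc}(2\pi F t)]\, dt = \Pi(f, F) \tag{2.108a} \]
and
\[ F^{(\pm itf)}\!\left( \Pi(f, F) \right) = \int_{-\infty}^{\infty} e^{\pm 2\pi i f t}\, \Pi(f, F)\, df = 2F\, \mathrm{sinc}(2\pi F t) . \tag{2.108b} \]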
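\[ U(\xi, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{-2\pi i (x\xi + y\eta)}\, u(x, y) . \tag{2.109a} \]
\[ u(x, y) = \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta\; e^{2\pi i (x\xi + y\eta)}\, U(\xi, \eta) . \tag{2.109b} \]
\[ U(\xi, \eta, \zeta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{-2\pi i (x\xi + y\eta + z\zeta)}\, u(x, y, z) \tag{2.109c} \]
and
\[ u(x, y, z) = \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta \int_{-\infty}^{\infty} d\zeta\; e^{2\pi i (x\xi + y\eta + z\zeta)}\, U(\xi, \eta, \zeta) . \tag{2.109d} \]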
This pattern of forward and inverse transforms can be extended indefinitely to functions u and U
with ever larger numbers of arguments, but for the purposes of this book there is no need to go
beyond the two- and three-dimensional transforms given in Eqs. (2.109a)–(2.109d). As a matter
of notation, we often use the standard Cartesian x̂ and ŷ unit vectors pointing along the x and y
axes of a Cartesian coordinate system to define vectors
\[ \vec{\rho} = x\hat{x} + y\hat{y} \quad \text{and} \quad \vec{q} = \xi\hat{x} + \eta\hat{y} . \]
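We introduce the symbol $u(\vec{\rho})$ as a shorthand for u(x,y) and the symbol $U(\vec{q})$ as a shorthand for
$U(\xi, \eta)$. Now Eqs. (2.109a) and (2.109b) can be written as
\[ U(\vec{q}) = \iint_{-\infty}^{\infty} d^2\rho\; e^{-2\pi i \vec{\rho} \cdot \vec{q}}\, u(\vec{\rho}) \tag{2.110a} \]
and
\[ u(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2q\; e^{2\pi i \vec{\rho} \cdot \vec{q}}\, U(\vec{q}) . \tag{2.110b} \]
Similarly, Eqs. (2.109c) and (2.109d) can be written as
\[ U(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r} \cdot \vec{s}}\, u(\vec{r}) \tag{2.110c} \]
and
\[ u(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r} \cdot \vec{s}}\, U(\vec{s}) . \tag{2.110d} \]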
Vector notation is sometimes used to group families of associated forward and inverse Fourier
transforms into a single equation. We might, for example, write the six scalar equations
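\[ U_x(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r} \cdot \vec{s}}\, u_x(\vec{r}) , \qquad u_x(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r} \cdot \vec{s}}\, U_x(\vec{s}) , \]
\[ U_y(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r} \cdot \vec{s}}\, u_y(\vec{r}) , \qquad u_y(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r} \cdot \vec{s}}\, U_y(\vec{s}) , \]
and
\[ U_z(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r} \cdot \vec{s}}\, u_z(\vec{r}) , \qquad u_z(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r} \cdot \vec{s}}\, U_z(\vec{s}) \]
as
\[ \vec{U}(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{-2\pi i \vec{r} \cdot \vec{s}}\, \vec{u}(\vec{r}) \tag{2.110e} \]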
Fourier Transforms in Two and Three Dimensions · 2.25
and
\[ \vec{u}(\vec{r}) = \iiint_{-\infty}^{\infty} d^3s\; e^{2\pi i \vec{r} \cdot \vec{s}}\, \vec{U}(\vec{s}) , \tag{2.110f} \]
where
\[ \vec{u}(\vec{r}) = \hat{x} u_x(\vec{r}) + \hat{y} u_y(\vec{r}) + \hat{z} u_z(\vec{r}) \quad \text{and} \quad \vec{U}(\vec{s}) = \hat{x} U_x(\vec{s}) + \hat{y} U_y(\vec{s}) + \hat{z} U_z(\vec{s}) . \]
We call $\vec{U}(\vec{s})$ the vector Fourier transform of $\vec{u}(\vec{r})$ and $\vec{u}(\vec{r})$ the vector inverse Fourier
transform of $\vec{U}(\vec{s})$. Just as in the one-dimensional case, it makes no difference which Fourier
transform is labeled the forward transform and which is labeled the inverse transform as long as
there is a change in sign of the exponent of e. Following the pattern of Eq. (2.28$\ell$), we can also
write
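\[ \iint_{-\infty}^{\infty} d^2q\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}} \iint_{-\infty}^{\infty} d^2\rho'\; e^{\mp 2\pi i \vec{\rho}\,' \cdot \vec{q}}\, u(\vec{\rho}\,') = u(\vec{\rho}) \tag{2.110g} \]
and
\[ \iiint_{-\infty}^{\infty} d^3s\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}} \iiint_{-\infty}^{\infty} d^3r'\; e^{\mp 2\pi i \vec{r}\,' \cdot \vec{s}}\, v(\vec{r}\,') = v(\vec{r}) \tag{2.110h} \]
for two-dimensional and three-dimensional scalar functions $u(\vec{\rho})$ and $v(\vec{r})$. For three-
dimensional vector functions, this becomes
\[ \iiint_{-\infty}^{\infty} d^3s\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}} \iiint_{-\infty}^{\infty} d^3r'\; e^{\mp 2\pi i \vec{r}\,' \cdot \vec{s}}\, \vec{v}(\vec{r}\,') = \vec{v}(\vec{r}) . \tag{2.110i} \]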
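For a two-dimensional constant vector $\vec{a} = \hat{x} a_x + \hat{y} a_y$, we have
\[
\begin{aligned}
\iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}}\, u(\vec{\rho} + \vec{a}) &= \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, u(x + a_x, y + a_y) \\
&= e^{\mp 2\pi i (\xi a_x + \eta a_y)} \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\; e^{\pm 2\pi i (x'\xi + y'\eta)}\, u(x', y') ,
\end{aligned}
\]
where in the last step we define $x' = x + a_x$ and $y' = y + a_y$. We now see that (dropping the
primes inside the double integral)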
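\[ \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}}\, u(\vec{\rho} + \vec{a}) = e^{\mp 2\pi i \vec{a} \cdot \vec{q}} \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}}\, u(\vec{\rho}) . \tag{2.110j} \]
This shows the forward or inverse two-dimensional Fourier transform of $u(\vec{\rho} + \vec{a})$ to be $e^{\mp 2\pi i \vec{a} \cdot \vec{q}}$
multiplied by the forward or inverse two-dimensional Fourier transform of $u(\vec{\rho})$. Similarly in
three dimensions, we have, for a three-dimensional constant vector $\vec{b} = \hat{x} b_x + \hat{y} b_y + \hat{z} b_z$, that
\[
\begin{aligned}
\iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}}\, v(\vec{r} + \vec{b}) &= \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, v(x + b_x, y + b_y, z + b_z) \\
&= e^{\mp 2\pi i (\xi b_x + \eta b_y + \zeta b_z)} \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy' \int_{-\infty}^{\infty} dz'\; e^{\pm 2\pi i (x'\xi + y'\eta + z'\zeta)}\, v(x', y', z') ,
\end{aligned}
\]
where $x' = x + b_x$, $y' = y + b_y$, and $z' = z + b_z$. This time we find that the forward or inverse three-
dimensional Fourier transform of $v(\vec{r} + \vec{b})$ is $e^{\mp 2\pi i \vec{s} \cdot \vec{b}}$ multiplied by the forward or inverse three-
dimensional Fourier transform of $v(\vec{r})$,
\[ \iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}}\, v(\vec{r} + \vec{b}) = e^{\mp 2\pi i \vec{s} \cdot \vec{b}} \iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}}\, v(\vec{r}) . \tag{2.110k} \]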
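If
\[ V^{(\pm)}(\vec{q}) = \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}}\, v(\vec{\rho}) \tag{2.110$\ell$} \]
and $v(\vec{\rho})$ is replaced by $v(\alpha\vec{\rho})$, where $\alpha$ is a real scalar, then we can substitute $\vec{\rho}\,' = \alpha\vec{\rho}$ to get
\[ \iint_{-\infty}^{\infty} d^2\rho\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}}\, v(\alpha\vec{\rho}) = \frac{1}{\alpha^2} \iint_{-\infty}^{\infty} d^2\rho'\; e^{\pm 2\pi i \left( \vec{\rho}\,'/\alpha \right) \cdot \vec{q}}\, v(\vec{\rho}\,') = \frac{1}{\alpha^2} V^{(\pm)}\!\left( \frac{\vec{q}}{\alpha} \right) . \tag{2.110m} \]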
Suppose there is a function of $\vec{\rho}$ called $u(\vec{\rho})$ such that $\vec{\rho}$ has to change by a vector distance $\Delta\vec{\rho}$
whose magnitude $|\Delta\vec{\rho}|$ must be at least $\beta$ for there to be a significant change in the value of
$u(\vec{\rho})$. Using the same reasoning as was applied to the one-dimensional Fourier scaling theorem
[see the analysis following Eq. (2.37e)], we can show that $U^{(\pm)}(\vec{q})$, the two-dimensional forward
or inverse Fourier transform of u, must be negligible or zero for all vectors $\vec{q}$ whose magnitude
$|\vec{q}|$ exceeds $1/\beta$. The Fourier scaling theorem in three dimensions starts with
\[ V^{(\pm)}(\vec{s}) = \iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}}\, v(\vec{r}) , \tag{2.110n} \]
from which we discover, replacing $\vec{r}$ by $\vec{r}\,' = \alpha\vec{r}$, that
\[ \iiint_{-\infty}^{\infty} d^3r\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}}\, v(\alpha\vec{r}) = \frac{1}{|\alpha|^3} \iiint_{-\infty}^{\infty} d^3r'\; e^{\pm 2\pi i \left( \vec{r}\,'/\alpha \right) \cdot \vec{s}}\, v(\vec{r}\,') = \frac{1}{|\alpha|^3} V^{(\pm)}\!\left( \frac{\vec{s}}{\alpha} \right) . \tag{2.110o} \]
Again we can conclude that if there is a function $u(\vec{r})$ such that $|\Delta\vec{r}|$ must be at least $\beta$ for there
to be a significant change in u, then $U^{(\pm)}(\vec{s})$, the three-dimensional forward or inverse Fourier
transform of u, must be negligible or zero for all vector arguments $\vec{s}$ whose magnitude $|\vec{s}|$
exceeds $1/\beta$.
The two-dimensional convolution of scalar functions u(x,y) and v(x,y) is written using the
$\ast\ast$ symbol and defined to be
\[ u(x, y) \ast\ast\, v(x, y) = \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\; u(x', y')\, v(x - x', y - y') , \tag{2.111a} \]
or
\[ u(\vec{\rho}) \ast\ast\, v(\vec{\rho}) = \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\, v(\vec{\rho} - \vec{\rho}\,') \tag{2.111b} \]
using the more concise vector notation. The vector notation may make the connection between
the one- and two-dimensional convolutions in Eqs. (2.38a) and (2.111b) easier to see. The two-
dimensional convolution, like the one-dimensional convolution, is both commutative and
associative. Using the same type of reasoning as in the analysis in Sec. 2.9, we have for the two-
dimensional functions $u(\vec{\rho})$, $v(\vec{\rho})$, and $h(\vec{\rho})$ that
\[
\begin{aligned}
u(\vec{\rho}) \ast\ast\, v(\vec{\rho}) &= \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\, v(\vec{\rho} - \vec{\rho}\,') = (-1)^2 \iint_{-\infty}^{\infty} d^2\rho''\; u(\vec{\rho} - \vec{\rho}\,'')\, v(\vec{\rho}\,'') \\
&= \iint_{-\infty}^{\infty} d^2\rho''\; v(\vec{\rho}\,'')\, u(\vec{\rho} - \vec{\rho}\,'') = v(\vec{\rho}) \ast\ast\, u(\vec{\rho})
\end{aligned}
\tag{2.111c}
\]
and
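\[
\begin{aligned}
[u(\vec{\rho}) \ast\ast\, v(\vec{\rho})] \ast\ast\, h(\vec{\rho}) &= \iint_{-\infty}^{\infty} d^2\rho''\; h(\vec{\rho} - \vec{\rho}\,'') \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\, v(\vec{\rho}\,'' - \vec{\rho}\,') \\
&= \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,') \iint_{-\infty}^{\infty} d^2\rho''\; h(\vec{\rho} - \vec{\rho}\,'')\, v(\vec{\rho}\,'' - \vec{\rho}\,') \\
&= \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,') \iint_{-\infty}^{\infty} d^2\rho'''\; v(\vec{\rho}\,''')\, h\big( (\vec{\rho} - \vec{\rho}\,') - \vec{\rho}\,''' \big) \\
&= u(\vec{\rho}) \ast\ast\, [v(\vec{\rho}) \ast\ast\, h(\vec{\rho})] ,
\end{aligned}
\tag{2.111d}
\]
where to show that the two-dimensional convolution is commutative we make the variable
substitution $\vec{\rho}\,'' = \vec{\rho} - \vec{\rho}\,'$ in (2.111c); and to show it is associative, we make the variable
substitution $\vec{\rho}\,''' = \vec{\rho}\,'' - \vec{\rho}\,'$ in (2.111d). The two-dimensional convolution is also linear. For any
two complex constants $\alpha$ and $\beta$, we have
\[
\begin{aligned}
u(\vec{\rho}) \ast\ast\, [\alpha v(\vec{\rho}) + \beta h(\vec{\rho})] &= \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,') [\alpha v(\vec{\rho} - \vec{\rho}\,') + \beta h(\vec{\rho} - \vec{\rho}\,')] \\
&= \alpha \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\, v(\vec{\rho} - \vec{\rho}\,') + \beta \iint_{-\infty}^{\infty} d^2\rho'\; u(\vec{\rho}\,')\, h(\vec{\rho} - \vec{\rho}\,') \\
&= \alpha\, u(\vec{\rho}) \ast\ast\, v(\vec{\rho}) + \beta\, u(\vec{\rho}) \ast\ast\, h(\vec{\rho}) .
\end{aligned}
\tag{2.111e}
\]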
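It is easy to show that the Fourier convolution theorem holds true in two dimensions. We start
with
\[
\begin{aligned}
\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; & e^{\pm 2\pi i (x\xi + y\eta)}\, [u(x, y) \ast\ast\, v(x, y)] \\
&= \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)} \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\; u(x', y')\, v(x - x', y - y') \\
&= \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\; u(x', y') \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, v(x - x', y - y') .
\end{aligned}
\]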
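Now we replace the x, y integration variables by $x'' = x - x'$ and $y'' = y - y'$, with $dx'' = dx$ and
$dy'' = dy$, so that
\[
\begin{aligned}
\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; & e^{\pm 2\pi i (x\xi + y\eta)}\, [u(x, y) \ast\ast\, v(x, y)] \\
&= \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy'\; e^{\pm 2\pi i (x'\xi + y'\eta)}\, u(x', y') \int_{-\infty}^{\infty} dx'' \int_{-\infty}^{\infty} dy''\; e^{\pm 2\pi i (x''\xi + y''\eta)}\, v(x'', y'')
\end{aligned}
\]
or
\[ \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, [u(x, y) \ast\ast\, v(x, y)] = U^{(\pm)}(\xi, \eta) \cdot V^{(\pm)}(\xi, \eta) , \tag{2.112a} \]
where $U^{(\pm)}$ is the two-dimensional forward or inverse Fourier transform of u,
\[ U^{(\pm)}(\xi, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, u(x, y) , \tag{2.112b} \]
and $V^{(\pm)}$ is the two-dimensional forward or inverse Fourier transform of v,
\[ V^{(\pm)}(\xi, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, v(x, y) . \tag{2.112c} \]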
This gives the first half of the two-dimensional Fourier convolution theorem. To get the
second half, we reverse the transform in (2.112a). If the plus sign is used in (2.112a), take the
forward two-dimensional Fourier transform of both sides, and if the minus sign is used take the
inverse two-dimensional Fourier transform of both sides. This leads to
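\[ \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta\; e^{\mp 2\pi i (x\xi + y\eta)}\, U^{(\pm)}(\xi, \eta) \cdot V^{(\pm)}(\xi, \eta) = u(x, y) \ast\ast\, v(x, y) , \tag{2.113a} \]
where
\[ u(x, y) = \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta\; e^{\mp 2\pi i (x\xi + y\eta)}\, U^{(\pm)}(\xi, \eta) \tag{2.113b} \]
and
\[ v(x, y) = \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta\; e^{\mp 2\pi i (x\xi + y\eta)}\, V^{(\pm)}(\xi, \eta) . \tag{2.113c} \]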
The first half of the two-dimensional Fourier convolution theorem, Eqs. (2.112a)–(2.112c),
shows that the forward or inverse two-dimensional Fourier transform of the two-dimensional
convolution of two functions u and v is the product of the forward or inverse two-dimensional
Fourier transforms of u and v. Because no restrictions are placed on the nature of u and v, other
than that they are transformable, there are also no restrictions on the nature of their $U^{(\pm)}$ and $V^{(\pm)}$
transforms. This means we can think of $U^{(\pm)}$ and $V^{(\pm)}$ as arbitrary transformable functions. The
$(\pm)$ superscripts on U and V in Eqs. (2.113a)–(2.113c) then just tell us that, according to Eqs.
(2.112b) and (2.112c),
\[ U^{(\pm)}(\xi, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, u(x, y) \]
and
\[ V^{(\pm)}(\xi, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, v(x, y) . \]
We already know this, however, from looking at Eqs. (2.113b) and (2.113c); just take the
opposite-sign Fourier transform of both sides. Hence, we can drop the $(\pm)$ superscripts on U and
V in Eqs. (2.113a)–(2.113c) as long as $(\mp)$ superscripts are added to u and v to distinguish
between the two choices of sign in (2.113b) and (2.113c). Now Eqs. (2.113a)–(2.113c) become
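\[ \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta\; e^{\mp 2\pi i (x\xi + y\eta)}\, U(\xi, \eta) \cdot V(\xi, \eta) = u^{(\mp)}(x, y) \ast\ast\, v^{(\mp)}(x, y) , \tag{2.114a} \]
where
\[ u^{(\mp)}(x, y) = \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta\; e^{\mp 2\pi i (x\xi + y\eta)}\, U(\xi, \eta) \tag{2.114b} \]
and
\[ v^{(\mp)}(x, y) = \int_{-\infty}^{\infty} d\xi \int_{-\infty}^{\infty} d\eta\; e^{\mp 2\pi i (x\xi + y\eta)}\, V(\xi, \eta) . \tag{2.114c} \]
The letters used to label the functions and variables are, of course, arbitrary, so nothing stops us
from interchanging the letters u and U, v and V, x and $\xi$, y and $\eta$, and the vertical order of the ±
signs to get
\[ \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, u(x, y) \cdot v(x, y) = U^{(\pm)}(\xi, \eta) \ast\ast\, V^{(\pm)}(\xi, \eta) , \tag{2.115a} \]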
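where
\[ U^{(\pm)}(\xi, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, u(x, y) \tag{2.115b} \]
and
\[ V^{(\pm)}(\xi, \eta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; e^{\pm 2\pi i (x\xi + y\eta)}\, v(x, y) . \tag{2.115c} \]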
Equations (2.115a)–(2.115c) are the other half of the two-dimensional Fourier convolution
theorem—they show that the forward or inverse two-dimensional Fourier transform of the
product of two functions u and v is the two-dimensional convolution of the forward or inverse
two-dimensional Fourier transforms of u and v.
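A discrete analog of the first half of the two-dimensional Fourier convolution theorem can be verified directly on a small grid. The Python sketch below is an illustration under assumptions that go beyond the text: it replaces the integral transforms by an unnormalized two-dimensional discrete Fourier transform on a hypothetical 4 × 4 grid and replaces the convolution integral by a circular (periodic) convolution. All names and test arrays are made up for the example.

```python
import cmath

N = 4  # grid size along each axis

def dft2(a):
    """Two-dimensional DFT with exponent -2*pi*i*(x*xi + y*eta)/N."""
    return [[sum(a[x][y] * cmath.exp(-2j * cmath.pi * (x * xi + y * eta) / N)
                 for x in range(N) for y in range(N))
             for eta in range(N)] for xi in range(N)]

def circular_convolve2(u, v):
    """Circular convolution: sum over (x', y') of u(x', y') * v(x - x', y - y')."""
    return [[sum(u[xp][yp] * v[(x - xp) % N][(y - yp) % N]
                 for xp in range(N) for yp in range(N))
             for y in range(N)] for x in range(N)]

# Hypothetical test arrays.
u = [[(x + 2 * y) % 5 for y in range(N)] for x in range(N)]
v = [[(3 * x - y) % 4 for y in range(N)] for x in range(N)]

lhs = dft2(circular_convolve2(u, v))      # transform of the convolution
U, V = dft2(u), dft2(v)
rhs = [[U[xi][eta] * V[xi][eta] for eta in range(N)] for xi in range(N)]  # product of transforms

max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(N) for j in range(N))
print(max_error < 1e-9)
```

The direct double sums keep the example self-contained; for grids of realistic size a fast Fourier transform routine would be the usual choice.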
The three-dimensional convolution is written using the $\ast\ast\ast$ symbol and defined to be
\[ u(x, y, z) \ast\ast\ast\, v(x, y, z) = \int_{-\infty}^{\infty} dx' \int_{-\infty}^{\infty} dy' \int_{-\infty}^{\infty} dz'\; u(x', y', z')\, v(x - x', y - y', z - z') \tag{2.116a} \]
or
\[ u(\vec{r}) \ast\ast\ast\, v(\vec{r}) = \iiint_{-\infty}^{\infty} d^3r'\; u(\vec{r}\,')\, v(\vec{r} - \vec{r}\,') . \tag{2.116b} \]
Using three-dimensional vector notation, the three-dimensional convolution has the same
commutative, associative, and linearity properties as the two-dimensional convolution, as can be
seen by returning to Eqs. (2.111c)–(2.111f), mentally adding an extra $\ast$, an extra integral sign,
and replacing all the superscript 2’s by superscript 3’s.
G G G G
u ( ( ) v( ( ) v( ( ) u ( ( ) , (2.117a)
G G G G G G
u ( ( ) v( ( ) h( ( ) u ( ( ) v( ( ) h( ( ) , (2.117b)
G G G G G G G
u ( ( ) v( ( ) h( ( ) u ( ( ) v( ( ) u ( ( ) h( ( ) , (2.117c)
and
G G G G G G G
v( ( ) h( ( ) u ( ( ) v( ( ) u ( ( ) h( ( ) u ( ( ) . (2.117d)
Looking carefully at the variable manipulations used to derive Eqs. (2.112a)–(2.112c), the first
half of the two-dimensional Fourier convolution theorem, we see that working with an extra
product $z\zeta$ in the exponent of e and an extra integration over dz does not affect the end result.
We can therefore say that
\[ \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, [u(x, y, z) \ast\ast\ast\, v(x, y, z)] = U^{(\pm)}(\xi, \eta, \zeta) \cdot V^{(\pm)}(\xi, \eta, \zeta) , \tag{2.118a} \]
where
\[ U^{(\pm)}(\xi, \eta, \zeta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, u(x, y, z) \tag{2.118b} \]
and
\[ V^{(\pm)}(\xi, \eta, \zeta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, v(x, y, z) . \tag{2.118c} \]
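The argument about relabeling the functions and variables used to go from (2.112a)–(2.112c) to
(2.115a)–(2.115c) works equally well here, giving us at once the other half of the three-
dimensional Fourier convolution theorem,
\[ \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, u(x, y, z) \cdot v(x, y, z) = U^{(\pm)}(\xi, \eta, \zeta) \ast\ast\ast\, V^{(\pm)}(\xi, \eta, \zeta) , \tag{2.119a} \]
where
\[ U^{(\pm)}(\xi, \eta, \zeta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, u(x, y, z) \tag{2.119b} \]
and
\[ V^{(\pm)}(\xi, \eta, \zeta) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; e^{\pm 2\pi i (x\xi + y\eta + z\zeta)}\, v(x, y, z) . \tag{2.119c} \]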
One last matter of notation worth mentioning is that we can create two-dimensional and three-
dimensional delta functions from the products of the already-discussed one-dimensional delta
function:
\[ \delta(\vec{\rho}) = \delta(x) \cdot \delta(y) \tag{2.120a} \]
and
\[ \delta(\vec{r}) = \delta(x) \cdot \delta(y) \cdot \delta(z) . \tag{2.120b} \]
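\[
\begin{aligned}
\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy \int_{-\infty}^{\infty} dz\; & v(x, y, z)\, \delta(x - x_o)\, \delta(y - y_o)\, \delta(z - z_o) \\
&= \int_{-\infty}^{\infty} dx\; \delta(x - x_o) \int_{-\infty}^{\infty} dy\; v(x, y, z_o)\, \delta(y - y_o) \\
&= \int_{-\infty}^{\infty} dx\; \delta(x - x_o)\, v(x, y_o, z_o) = v(x_o, y_o, z_o) .
\end{aligned}
\tag{2.121b}
\]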
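\[ \iint_{-\infty}^{\infty} d^2\rho\; u(\vec{\rho})\, \delta(\vec{\rho} - \vec{\rho}_o) = u(\vec{\rho}_o) \tag{2.121c} \]
and
\[ \iiint_{-\infty}^{\infty} d^3r\; v(\vec{r})\, \delta(\vec{r} - \vec{r}_o) = v(\vec{r}_o) . \tag{2.121d} \]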
Combining Eq. (2.71f) for the one-dimensional delta function with Eqs. (2.120a) and (2.120b),
we see that in two dimensions
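\[ \delta(\vec{\rho}) = \delta(x) \cdot \delta(y) = \int_{-\infty}^{\infty} d\xi\; e^{\pm 2\pi i x\xi} \int_{-\infty}^{\infty} d\eta\; e^{\pm 2\pi i y\eta} = \iint_{-\infty}^{\infty} d^2q\; e^{\pm 2\pi i \vec{\rho} \cdot \vec{q}} \tag{2.122a} \]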
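using the vector notation $\vec{q} = \xi\hat{x} + \eta\hat{y}$; and in three dimensions
\[
\begin{aligned}
\delta(\vec{r}) = \delta(x) \cdot \delta(y) \cdot \delta(z) &= \int_{-\infty}^{\infty} d\xi\; e^{\pm 2\pi i x\xi} \int_{-\infty}^{\infty} d\eta\; e^{\pm 2\pi i y\eta} \int_{-\infty}^{\infty} d\zeta\; e^{\pm 2\pi i z\zeta} \\
&= \iiint_{-\infty}^{\infty} d^3s\; e^{\pm 2\pi i \vec{r} \cdot \vec{s}}
\end{aligned}
\tag{2.122b}
\]
using the vector notation $\vec{s} = \xi\hat{x} + \eta\hat{y} + \zeta\hat{z}$.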
__________
This chapter provides both an intuitive understanding and a rigorous explanation of how
Fourier transforms work. Sine and cosine transforms are introduced as a way to measure how
much functions resemble sine and cosine curves, and these transforms are then combined to
create the standard complex Fourier transform. We describe convolutions and how they produce
new functions by blurring old ones. The Fourier convolution theorem—whose importance is
difficult to overstate—directly connects the convolution to Fourier-transform theory. Generalized
limits are explained to show in what sense some of the more puzzling functions found in lists of
Fourier transforms belong there, and a brief outline of generalized functions is presented to show
how delta functions can be described without making them sound like obvious nonsense.
Computers use discrete Fourier transforms to handle Fourier calculations, and we explain how
the discrete Fourier transform can be used to approximate the integral Fourier transform. The
discrete Fourier transform produces aliasing; we show when aliasing is desirable, when it is not
desirable, and when it can be neglected. All the major concepts explained in this chapter—the
linearity of the Fourier transform, the linearity of the convolution, the Fourier convolution
theorem, the idea of even and odd functions, and the delta function—have important roles to play
in the pages that follow.
Table 2.1
$U(f) = F^{(-ift)}(u(t))$        $u(t) = F^{(ift)}(U(f))$
Table 2.2
      $A_k = \frac{1}{T} \int_0^T e^{-2\pi i (k/T) t}\, v(t)\, dt$        $v(t) = \sum_{k=-\infty}^{\infty} A_k\, e^{2\pi i k (t/T)}$
(2)   [imag., even]                                [imag., even]
      $\mathrm{Re}(A_k) = 0$, $A_{-k} = A_k$                $\mathrm{Re}(v(t)) = 0$, $v(-t) = v(t)$
(3)   [real, odd]                                  [imag., odd]
      $\mathrm{Im}(A_k) = 0$, $A_{-k} = -A_k$               $\mathrm{Re}(v(t)) = 0$, $v(-t) = -v(t)$
(4)   [imag., odd]                                 [real, odd]
      $\mathrm{Re}(A_k) = 0$, $A_{-k} = -A_k$               $\mathrm{Im}(v(t)) = 0$, $v(-t) = -v(t)$
(5)   [complex, even]                              [complex, even]
      $A_{-k} = A_k$                                 $v(-t) = v(t)$
      $\mathrm{Re}(A_k) \ne 0$ for some k                   $\mathrm{Re}(v(t)) \ne 0$ for some t
      $\mathrm{Im}(A_k) \ne 0$ for some k                   $\mathrm{Im}(v(t)) \ne 0$ for some t
(6)   [complex, odd]                               [complex, odd]
      $A_{-k} = -A_k$                                $v(-t) = -v(t)$
      $\mathrm{Re}(A_k) \ne 0$ for some k                   $\mathrm{Re}(v(t)) \ne 0$ for some t
      $\mathrm{Im}(A_k) \ne 0$ for some k                   $\mathrm{Im}(v(t)) \ne 0$ for some t
(7)   [Hermitian]                                  [real]
      $A_{-k} = A_k^{*}$                             $\mathrm{Im}(v(t)) = 0$
(8)   [real]                                       [Hermitian]
      $\mathrm{Im}(A_k) = 0$                         $v(-t) = v(t)^{*}$
3
RANDOM VARIABLES, RANDOM
FUNCTIONS, AND POWER SPECTRA
Engineers and scientists are taught many statistical concepts in school, but all too often this is
done in an informal manner that does a good job of explaining how to eliminate random errors
and noise from real experimental data and a poor job of explaining how to analyze random errors
and noise in physical models. Understanding the correct way to represent random errors and
noise requires formal knowledge of the statistical concepts used to describe random signals;
otherwise, basic equations can be misunderstood and misused. For this reason, we here take a
more formal approach to the subject. Starting off with an explanation of the basics—random
functions, independent and dependent random variables, the expectation operator E , stationarity
and ergodicity—that do not require the Fourier theory discussed in the previous chapter, we then
move on to topics that do, such as autocorrelation functions, white noise, the noise-power
spectrum, and the Wiener-Khinchin theorem. The techniques explained in this chapter are used a
few times in the next chapter during the derivation of the Michelson interference equations and
then over and over again in Chapters 6, 7, and 8 to analyze the random errors and noise found in
Michelson systems.
3 · Random Variables, Random Functions, and Power Spectra
that is, he knows what the chances are that the coins, dice, or needles return one set of numbers
rather than another. Most scientists and engineers do not pay much attention to the difference
between controlled and uncontrolled variables—perhaps because most of their “controlled”
variables are usually a little “uncontrolled” in the sense that they come from imperfectly accurate
measurements—but it is very convenient when analyzing a statistical model to keep careful track
of this distinction. To help us remember which variables are random and which are not, we put a
wavy line or tilde over the random variables while writing the nonrandom variables in the usual
way. As an example of how this looks, we note that u, a0, and z′ are all nonrandom variables
whereas ũ, ã0, and z̃′ are all random.
\[ \tilde{y} = f(\tilde{x}) \tag{3.1a} \]
is another random variable. To give an example of how this works, we create a nonrandom time
variable t and a random angular frequency $\tilde{\omega}$, multiply them together and take the sine of their
product to get
\[ \tilde{y} = \sin(\tilde{\omega} t) . \tag{3.1b} \]
The value of $\tilde{y}$ is clearly uncontrolled; for each unpredictable value of $\tilde{\omega}$ at time t, there is a
corresponding unpredictable number $\tilde{y}$ that is given by $\sin(\tilde{\omega} t)$. This example also shows that
when a function has several arguments, its value becomes random when only one of the
arguments is random. In Eq. (3.1b) the sine of $\tilde{\omega} t$, regarded as a function of both $\tilde{\omega}$ and t, is
random even though only one of its arguments, $\tilde{\omega}$, is random.
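The behavior described by Eq. (3.1b) is easy to imitate with a pseudorandom number generator. In the Python sketch below, the uniform distribution chosen for the random angular frequency is purely an assumption made for illustration (the text does not specify one), and the names `draw_omega` and `y_values` are hypothetical.

```python
import math
import random

rng = random.Random(0)        # fixed seed so the run is repeatable

def draw_omega():
    """Hypothetical distribution for the random angular frequency (an assumption)."""
    return rng.uniform(0.5, 1.5)

t = 2.0                       # nonrandom (controlled) time
# Each evaluation uses a fresh, unpredictable omega at the same controlled t.
y_values = [math.sin(draw_omega() * t) for _ in range(5)]
print(y_values)
```

Although t is held fixed, the five results differ, which is the sense in which the sine becomes a random quantity when only one of its arguments is random.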
Many times when a function has multiple arguments, the controlled argument or arguments
are more interesting than the uncontrolled argument or arguments that make the function random.
One way to handle this situation is to list only the nonrandom arguments and say that what we
have is a random function with nonrandom arguments. To show what is going on, we put a wavy
line over the function name, indicating that even though all the listed arguments are nonrandom,
the function itself is random. If, for example, we are only interested in the nonrandom time t, we
could define
\[ \tilde{R}(t) = \sin(\tilde{\omega} t) \tag{3.2a} \]
to be a random function of the nonrandom variable t. Now whenever there is a list of time values
t1, t2, …, there is a corresponding list of random variables
Random and Nonrandom Functions · 3.2
Although Eq. (3.2b) implicitly assumes a list of distinct and separate t values, this reasoning still
holds up when t is explicitly made a continuous variable. Nothing, for example, stops us from
saying that for each value of t between −∞ and +∞, there corresponds a different random variable
The idea of a random function of nonrandom arguments becomes more attractive when there is
no realistic possibility of analyzing the effect of multiple random arguments on a single
nonrandom function. We might, for example, know exactly how N random parameters $\tilde{r}_1$, $\tilde{r}_2$, …,
$\tilde{r}_N$ interact to cause an error e in an electrical signal s at time t. This lets us write the error as a
nonrandom function
\[ e(t, \tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N) . \]
Rather than investigating how $\tilde{r}_1$, $\tilde{r}_2$, …, $\tilde{r}_N$ are behaving, it usually makes more sense to say that
there is a random noise
\[ \tilde{n}(t) = e(t, \tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N) \tag{3.3a} \]
contaminating electrical signal s. Now we can put the error into our model as a random function ñ
that depends on a nonrandom parameter t instead of as a nonrandom function e that depends on t
and N random parameters $\tilde{r}_1$, $\tilde{r}_2$, …, $\tilde{r}_N$. Sometimes the signal s in our model depends on more
than one nonrandom parameter, such as the x, y coordinates of an image point at time t. If the
corresponding error e in the signal s depends on x, y, and t as well as the random parameters $\tilde{r}_1$,
$\tilde{r}_2$, …, $\tilde{r}_N$, then we can say there is a random noise
\[ \tilde{n}(x, y, t) = e(x, y, t, \tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_N) \]
contaminating signal s(x, y, t). Note that we can think in terms of a signal noise ñ(t) or ñ(x,y,t)
even when we are not sure what random arguments $\tilde{r}_1$, $\tilde{r}_2$, …, $\tilde{r}_N$ make the nonrandom function e
behave randomly. This is, of course, why the idea of a random function is so useful. In this book,
we use the term “random function” to refer to what statisticians often prefer to call a random or
stochastic process.
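The idea of hiding the random parameters inside a random function of nonrandom arguments maps naturally onto a small class. The following Python sketch is illustrative only: the error function `error`, its two random parameters, their Gaussian distributions, and the class name `RandomNoise` are all assumptions introduced for the example and do not come from the text.

```python
import random

def error(t, r1, r2):
    """Hypothetical nonrandom error function e(t, r1, r2) (an assumption)."""
    return r1 * t + r2

class RandomNoise:
    """Random function n(t): the random parameters are hidden inside the object."""
    def __init__(self, seed):
        rng = random.Random(seed)
        self._r1 = rng.gauss(0.0, 0.1)   # one draw of the random parameters
        self._r2 = rng.gauss(0.0, 0.1)

    def __call__(self, t):
        # Only the nonrandom argument t is visible to the caller.
        return error(t, self._r1, self._r2)

noise_a = RandomNoise(seed=1)
noise_b = RandomNoise(seed=2)
print(noise_a(0.5), noise_b(0.5))   # same t, two different realizations
```

A caller who sees only `noise_a(t)` works with a random function of t and never needs to know how many random parameters lie behind it.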
\[ \int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, dr = 1 . \tag{3.4} \]
For Eq. (3.4) to make sense, the probability density distribution $p_{\tilde{r}}(r)$ must be defined for all r
between −∞ and +∞ with the understanding that
\[ p_{\tilde{r}}(r) = 0 \]
for those values of r to which the random variable $\tilde{r}$ can never be equal.
The predicted average or mean value of $\tilde{r}$ can be written as
\[ \mu_{\tilde{r}} = \int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, r\, dr . \tag{3.5a} \]
Note that $\mu_{\tilde{r}}$, just like $p_{\tilde{r}}$, is nonrandom even though it has a random subscript. The predicted
variance of $\tilde{r}$, which is defined to be the predicted average or mean squared difference between
$\tilde{r}$ and $\mu_{\tilde{r}}$, is another nonrandom quantity
\[ v_{\tilde{r}} = \int_{-\infty}^{\infty} p_{\tilde{r}}(r)\, (r - \mu_{\tilde{r}})^2\, dr . \tag{3.5b} \]
Many people prefer to characterize a random number $\tilde{r}$ by its standard deviation $\sigma_{\tilde{r}}$ instead of its
variance $v_{\tilde{r}}$. The standard deviation of a random number $\tilde{r}$ is defined to be the square root of the
variance,
Probability Density Distributions: Mean, Variance, Standard Deviation · 3.3
σ r = vr . (3.5c)
Of course σ r , like vr , is a nonrandom quantity. In general, the probability density distribution pr
lets us find the predicted average or mean value of any nonrandom function f of the random
variable r by calculating the nonrandom quantity
$$\text{predicted mean value of } f = \int_{-\infty}^{\infty} p_r(r)\, f(r)\, dr . \tag{3.5d}$$
When f (r ) = r , this equation reduces to formula (3.5a) for µr ; and when f (r ) = (r − µr ) 2 , this
equation reduces to formula (3.5b) for vr .
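As a numerical illustration of Eqs. (3.4) and (3.5a) to (3.5d), the following sketch approximates the integrals with the trapezoid rule. The uniform density on [0, 2], the grid size, and the helper name `expect` are assumptions chosen for this example and do not come from the text.

```python
import numpy as np

def expect(f, pdf, r):
    """Trapezoid-rule approximation of the integral of pdf(r) * f(r) over the grid r."""
    y = pdf(r) * f(r)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(r)))

def pdf_uniform(r):
    """Assumed example density: 1/2 on [0, 2] and zero elsewhere."""
    return np.where((r >= 0.0) & (r <= 2.0), 0.5, 0.0)

# The grid only needs to cover the region where the density is nonzero.
r = np.linspace(0.0, 2.0, 20001)

total = expect(lambda r: np.ones_like(r), pdf_uniform, r)  # Eq. (3.4)
mu = expect(lambda r: r, pdf_uniform, r)                    # Eq. (3.5a)
var = expect(lambda r: (r - mu) ** 2, pdf_uniform, r)       # Eq. (3.5b)
sigma = np.sqrt(var)                                        # Eq. (3.5c)
mean_cos = expect(np.cos, pdf_uniform, r)                   # Eq. (3.5d) with f(r) = cos(r)
print(total, mu, var, sigma, mean_cos)
```

For this assumed density the exact values are a total probability of 1, a mean of 1, and a variance of 1/3, so the printed numbers give a quick check on the grid resolution.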
Many random variables found in nature appear to obey a Gaussian, or “normal,” probability
distribution:
$$p_r(r) = \frac{1}{\sigma_r \sqrt{2\pi}}\, e^{-\frac{(r - \mu_r)^2}{2\sigma_r^2}} . \tag{3.6a}$$
This can in part be explained as a consequence of the central limit theorem,25 which is described
in Sec. 3.11 below. It is easy to show that parameter µr in Eq. (3.6a) is the mean of the Gaussian
distribution. Consulting formula (3.5a) above, we see that the mean of the distribution in (3.6a)
must be
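$$\int_{-\infty}^{\infty} \frac{r}{\sigma_r \sqrt{2\pi}}\, e^{-\frac{(r - \mu_r)^2}{2\sigma_r^2}}\, dr = \frac{1}{\sigma_r \sqrt{2\pi}} \int_{-\infty}^{\infty} (r' + \mu_r)\, e^{-\frac{(r')^2}{2\sigma_r^2}}\, dr' , \tag{3.6b}$$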
where on the right-hand side the variable of integration is changed to r ′ = r − µr . This becomes,
consulting Eq. (7A.3d) in Appendix 7A of Chapter 7,
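$$\begin{aligned} \frac{1}{\sigma_r \sqrt{2\pi}} \int_{-\infty}^{\infty} (r' + \mu_r)\, e^{-\frac{(r')^2}{2\sigma_r^2}}\, dr' &= \frac{1}{\sigma_r \sqrt{2\pi}} \int_{-\infty}^{\infty} r'\, e^{-\frac{(r')^2}{2\sigma_r^2}}\, dr' + \frac{\mu_r}{\sigma_r \sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(r')^2}{2\sigma_r^2}}\, dr' \\ &= \frac{1}{\sigma_r \sqrt{2\pi}} \int_{-\infty}^{\infty} r'\, e^{-\frac{(r')^2}{2\sigma_r^2}}\, dr' + \mu_r \cdot 1 . \end{aligned} \tag{3.6c}$$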
25
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, 3rd ed. (McGraw-Hill, Inc., New
York, 1991), p. 214.
If we replace r′ by −r′ in
$$g(r') = r'\, e^{-\frac{(r')^2}{2\sigma_r^2}} ,$$
it is the same as multiplying g by −1, which makes g an odd function [see Eq. (2.11b) in Chapter
2]. Hence, according to Eq. (2.17) in Chapter 2,
$$\int_{-\infty}^{\infty} r'\, e^{-\frac{(r')^2}{2\sigma_r^2}}\, dr' = 0$$
because it is the integral of an odd function between −∞ and +∞. Therefore, Eq. (3.6c) simplifies
to
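$$\frac{1}{\sigma_r \sqrt{2\pi}} \int_{-\infty}^{\infty} (r' + \mu_r)\, e^{-\frac{(r')^2}{2\sigma_r^2}}\, dr' = \mu_r , \tag{3.6d}$$
so that, from Eq. (3.6b),
$$\int_{-\infty}^{\infty} \frac{r}{\sigma_r \sqrt{2\pi}}\, e^{-\frac{(r - \mu_r)^2}{2\sigma_r^2}}\, dr = \mu_r . \tag{3.6e}$$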
This shows that, as claimed above, parameter µr is the mean of the probability distribution
specified in Eq. (3.6a). It is just as easy to show that σ r is the standard deviation of the
distribution in (3.6a). From (3.5b) we know that the variance of this distribution is
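$$\int_{-\infty}^{\infty} \frac{(r - \mu_r)^2}{\sigma_r \sqrt{2\pi}}\, e^{-\frac{(r - \mu_r)^2}{2\sigma_r^2}}\, dr = \int_{-\infty}^{\infty} \frac{(r')^2}{\sigma_r \sqrt{2\pi}}\, e^{-\frac{(r')^2}{2\sigma_r^2}}\, dr'$$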
when the variable of integration is changed to r ′ = r − µr . According to Eq. (7A.3b) in Appendix
7A of Chapter 7, we can write
$$\int_{-\infty}^{\infty} \frac{(r')^2}{\sigma_r \sqrt{2\pi}}\, e^{-\frac{(r')^2}{2\sigma_r^2}}\, dr' = \sigma_r^2 . \tag{3.6f}$$
Consequently, σ r2 is the variance of this probability density distribution. The square root of the
variance is the standard deviation according to (3.5c). Hence, it is, as claimed, easy to see that σ r
$$p_r(r) = \sum_{k=1}^{N} p_k \cdot \delta(r - r_k) . \tag{3.7a}$$
The integral for the predicted mean value of r in Eq. (3.5a) now reduces to
$$\mu_r = \int_{-\infty}^{\infty} \Big[ \sum_{k=1}^{N} p_k \cdot \delta(r - r_k) \Big]\, r\, dr = \sum_{k=1}^{N} p_k \int_{-\infty}^{\infty} \delta(r - r_k)\, r\, dr = \sum_{k=1}^{N} p_k r_k \tag{3.7b}$$
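$$\begin{aligned} v_r &= \int_{-\infty}^{\infty} \Big[ \sum_{k=1}^{N} p_k \cdot \delta(r - r_k) \Big] (r - \mu_r)^2\, dr = \sum_{k=1}^{N} p_k \int_{-\infty}^{\infty} \delta(r - r_k)\, (r - \mu_r)^2\, dr \\ &= \sum_{k=1}^{N} p_k (r_k - \mu_r)^2 ; \end{aligned} \tag{3.7c}$$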
and, according to Eq. (3.5d), the predicted mean value of f (r ) becomes
$$\int_{-\infty}^{\infty} \Big[ \sum_{k=1}^{N} p_k \cdot \delta(r - r_k) \Big] f(r)\, dr = \sum_{k=1}^{N} p_k \int_{-\infty}^{\infty} f(r)\, \delta(r - r_k)\, dr = \sum_{k=1}^{N} p_k f(r_k) . \tag{3.7d}$$
Again, the integral formulas reduce to the correct probability-weighted sums. Looking at the
limiting case where N = 1 and p1 = 1 , we get
pr (r ) = δ (r − r1 )
so that
$$\mu_r = \int_{-\infty}^{\infty} \delta(r - r_1)\, r\, dr = r_1 \tag{3.7e}$$
and
$$v_r = \int_{-\infty}^{\infty} (r - r_1)^2\, \delta(r - r_1)\, dr = (r_1 - r_1)^2 = 0 . \tag{3.7f}$$
Results (3.7e) and (3.7f) show that the value of r is now completely controlled; it must be equal
to r1 and no longer needs to be treated like a random variable. Hence, the limiting case where
N = 1 and p1 = 1 can be regarded as changing a random variable into a nonrandom variable.
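The probability-weighted sums in Eqs. (3.7b) to (3.7d) translate directly into array arithmetic. The sketch below uses a fair six-sided die as an assumed example distribution; the variable names are illustrative only.

```python
import numpy as np

# Assumed example: the allowed values r_k and probabilities p_k of a fair die.
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
probs = np.full(6, 1.0 / 6.0)

mu = np.sum(probs * values)                 # Eq. (3.7b)
var = np.sum(probs * (values - mu) ** 2)    # Eq. (3.7c)
mean_sq = np.sum(probs * values ** 2)       # Eq. (3.7d) with f(r) = r^2

# Limiting case N = 1, p_1 = 1: the "random" variable is fully controlled.
one_value = np.array([2.5])
one_prob = np.array([1.0])
mu_1 = np.sum(one_prob * one_value)                 # Eq. (3.7e): equals r_1
var_1 = np.sum(one_prob * (one_value - mu_1) ** 2)  # Eq. (3.7f): equals zero
print(mu, var, mean_sq, mu_1, var_1)
```

The last two results reproduce the limiting case: the mean is the single allowed value and the variance vanishes.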
is the predicted mean, or average, value of f ( x ) . We also call E ( f ( x ) ) the expectation value of
f ( x ) . Mathematically we define
$$E\big( f(x) \big) = \int_{-\infty}^{\infty} p_x(x)\, f(x)\, dx . \tag{3.8a}$$
Just like before, px ( x) dx is the probability that the random variable x takes on a value between
x and x + dx . We can find E( x ) , the expectation value of x , by choosing f ( x ) = x in Eq. (3.8a)
to get
$$E(x) = \int_{-\infty}^{\infty} p_x(x)\, x\, dx . \tag{3.8b}$$
Comparing this to Eq. (3.5a) above, we see that the expectation value of x is the same as the
predicted mean or average value of x ,
E( x ) = µ x , (3.8c)
$$E\big( (x - \mu_x)^2 \big) = \int_{-\infty}^{\infty} p_x(x)\, (x - \mu_x)^2\, dx . \tag{3.8d}$$
The Expectation Operator · 3.4
$$v_x = E\big( (x - \mu_x)^2 \big) . \tag{3.8e}$$
$$\mathrm{Var}(x) = E\big( (x - \mu_x)^2 \big) . \tag{3.8f}$$
When the E operator is applied to any sort of random variable or function—for example,
f ( x ) —the result is always a nonrandom variable or function, namely
$$\int_{-\infty}^{\infty} p_x(x)\, f(x)\, dx .$$
For example, the characteristic function Φ x of a random variable x , which is the nonrandom
Fourier transform of the probability density distribution of x ,
$$\Phi_x(\nu) = \int_{-\infty}^{\infty} p_x(x)\, e^{-2\pi i \nu x}\, dx , \tag{3.9a}$$
can be written as
$$\Phi_x(\nu) = E\big( e^{-2\pi i \nu x} \big) . \tag{3.9b}$$
pρ ( ρ ) = δ ( ρ − c) . (3.9c)
According to the discussion following Eqs. (3.7e,f) above, this makes ρ equivalent to the
nonrandom variable c. Consequently, we can say that
E(c) = E( ρ ) (3.9d)
and use Eq. (3.8b) above to get
$$E(c) = \int_{-\infty}^{\infty} p_\rho(\rho)\, \rho\, d\rho = \int_{-\infty}^{\infty} \delta(\rho - c)\, \rho\, d\rho = c . \tag{3.9e}$$
This justifies the general rule—which also makes good intuitive sense—that
E( c ) = c (3.9f)
for any nonrandom quantity c.
The expectation operator E can be applied to multiple random variables at the same time—all
that we need is the appropriate probability density distribution. Suppose, for example, that the
behavior of two random variables x and X is described by a two-argument probability density
distribution pxX (x, X ), with pxX (x, X ) dx dX being the probability that the random variable x
takes on a value between x and x + dx while the random variable X takes on a value between X
and X + dX . No matter what the behavior of random variables x and X , we can always
construct an appropriate probability density distribution pxX . Since x and X must always take
on some value, the same reasoning used to produce Eq. (3.4) now shows that
$$\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\, p_{xX}(x, X) = 1 \tag{3.10a}$$
$$E\big( f(x, X) \big) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\, p_{xX}(x, X)\, f(x, X) . \tag{3.10b}$$
In particular, we can always set f ( x , X ) = x X to get the expected value of the random variables’
product,
$$E(xX) = \int_{-\infty}^{\infty} x\, dx \int_{-\infty}^{\infty} dX\, X\, p_{xX}(x, X) . \tag{3.10c}$$
Independent and Dependent Random Variables · 3.5
$$p_{xX}(x, X) = p_x(x) \cdot p_X(X) . \tag{3.11a}$$
where px and p X are the standard probability density distributions for x and X when x and X
26
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 132.
are treated as solitary random variables. This means that px ( x) dx is the probability that x lies
between x and x + dx regardless of the value of X , and p X ( X ) dX is the probability that X lies
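between X and X + dX regardless of the value of x . We see that, according to Eqs. (3.10c) and
(3.11a), the expectation value of the product xX of two independent random variables is
$$\begin{aligned} E(xX) &= \int_{-\infty}^{\infty} x\, dx \int_{-\infty}^{\infty} dX\, X\, p_{xX}(x, X) = \int_{-\infty}^{\infty} x\, dx \int_{-\infty}^{\infty} dX\, X\, p_x(x)\, p_X(X) \\ &= \Big[ \int_{-\infty}^{\infty} p_x(x)\, x\, dx \Big] \cdot \Big[ \int_{-\infty}^{\infty} p_X(X)\, X\, dX \Big] . \end{aligned}$$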
$$E(xX) = E(x) \cdot E(X) \tag{3.11b}$$
or
$$E(xX) = \mu_x \mu_X . \tag{3.11c}$$
$$p_{x_1 x_2 \cdots x_N}(x_1, x_2, \ldots, x_N)$$
such that
$$p_{x_1 x_2 \cdots x_N}(x_1, x_2, \ldots, x_N)\, dx_1\, dx_2 \cdots dx_N$$
is the probability that x1 lies between x1 and x1 + dx1 , that x2 lies between x2 and x2 + dx2 , ... ,
that x N lies between xN and xN + dxN . The expectation value of any function f ( x1 , x2 ,… , x N ) of
these N random variables is
$$E\big( f(x_1, x_2, \ldots, x_N) \big) = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, f(x_1, x_2, \ldots, x_N)\, p_{x_1 x_2 \cdots x_N}(x_1, x_2, \ldots, x_N) . \tag{3.12a}$$
Large Numbers of Random Variables · 3.7
Note that nothing has been said so far about the connections between these N random variables;
they could be either dependent or independent. If we now assume that these N random variables
are all independent with respect to one another, then
where px1 ( x1 ) dx1 is the probability that x1 lies between x1 and x1 + dx1 regardless of the values
of the other N − 1 random variables, px2 ( x2 ) dx2 is the probability that x2 lies between x2 and
x2 + dx2 regardless of the values of the other N − 1 random variables, …, pxN ( xN ) dxN is the
probability that x N lies between xN and xN + dxN regardless of the values of the other N − 1
random variables. The expectation value of the product of these N random variables can now be
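written as, setting f (x1 , x2 , …, xN ) = x1 x2 ⋯ xN in Eq. (3.12a),
$$\begin{aligned} E(x_1 x_2 \cdots x_N) &= \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, [x_1 x_2 \cdots x_N]\, p_{x_1 x_2 \cdots x_N}(x_1, x_2, \ldots, x_N) \\ &= \int_{-\infty}^{\infty} p_{x_1}(x_1)\, x_1\, dx_1 \int_{-\infty}^{\infty} p_{x_2}(x_2)\, x_2\, dx_2 \cdots \int_{-\infty}^{\infty} p_{x_N}(x_N)\, x_N\, dx_N . \end{aligned}$$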
and
$$\mu_X = E(X) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\, X\, p_{xX}(x, X) . \tag{3.13b}$$
and
$$E(X) = \int_{-\infty}^{\infty} X \Big[ \int_{-\infty}^{\infty} p_{xX}(x, X)\, dx \Big] dX , \tag{3.13d}$$
we compare them to the formula for the expected value of a random variable given in Eq. (3.8b).
This comparison suggests that, if we want to specify the behavior of one random variable while
disregarding the presence of the other, we can construct the single-argument probability density
distributions of x and X by writing
$$p_x(x) = \int_{-\infty}^{\infty} p_{xX}(x, X)\, dX \tag{3.13e}$$
and
$$p_X(X) = \int_{-\infty}^{\infty} p_{xX}(x, X)\, dx . \tag{3.13f}$$
Up to this point, none of the integrations have required assumptions about the dependence or
independence of the random variables, so Eqs. (3.13e) and (3.13f) hold true both for dependent
and independent random variables x and X . If we specify that x and X are independent, then
Eq. (3.11a) can be substituted into (3.13e) and (3.13f) to get
$$p_x(x) = \int_{-\infty}^{\infty} p_x(x)\, p_X(X)\, dX = p_x(x) \int_{-\infty}^{\infty} p_X(X)\, dX$$
and
$$p_X(X) = \int_{-\infty}^{\infty} p_x(x)\, p_X(X)\, dx = p_X(X) \int_{-\infty}^{\infty} p_x(x)\, dx .$$
Glancing back at Eq. (3.4), we note that these last two equalities are trivially true, because in both
cases the right-most integrals must be one.
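For a discrete joint distribution, the integrals in Eqs. (3.13e) and (3.13f) become sums along the rows and columns of a probability table. The table below is an invented example of two dependent variables, so the marginals exist but their product does not reproduce the joint table.

```python
import numpy as np

# Assumed joint probabilities: rows index the values of x, columns the values of X.
joint = np.array([[0.10, 0.20],
                  [0.30, 0.05],
                  [0.05, 0.30]])

p_x = joint.sum(axis=1)   # analogue of Eq. (3.13e): sum over X
p_X = joint.sum(axis=0)   # analogue of Eq. (3.13f): sum over x

# Eq. (3.11a) holds only for independent variables; here it fails.
is_independent = bool(np.allclose(joint, np.outer(p_x, p_X)))
print(p_x, p_X, is_independent)
```

The marginals are well defined whether or not the variables are independent, which is the point made in the text about Eqs. (3.13e) and (3.13f).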
Analyzing Dependent Random Variables · 3.9
y = ( x − µ x )( X − µ X ) . (3.14a)
$$E(y) = E\big( (x - \mu_x)(X - \mu_X) \big) \tag{3.14b}$$
is just the predicted average value of y . We can imagine, each time we acquire a random pair of
x and X values, comparing the sizes of x and X to their respective averages µ x and µ X by
subtracting µx and µX from them. If x and X are both simultaneously greater than, or both
simultaneously less than, their averages, then y is positive; and if one is greater than its average
when the other is less than its average, then y is negative. If there is a tendency for one of the
random variables to exceed its average whenever the other exceeds its average, or a tendency for
one of the random variables to fall below its average whenever the other falls below its average,
then y has a greater probability of being positive than negative, so
E( y ) > 0 .
If, on the other hand, there is a tendency for one of the random variables to exceed its average
when the other falls below its average, then y has a greater probability of being negative than
positive, so
E( y ) < 0 .
If E( y ) is zero, it indicates that y is just as likely to be negative as positive, which means that
knowing one variable lies above or below its average tells us nothing about the likelihood that the
other variable lies above or below its average. Writing out the integral formula for E( y ) in terms
of the probability density distribution pxX (x, X ) gives
$$E(y) = E\big( (x - \mu_x)(X - \mu_X) \big) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dX\, [(x - \mu_x)(X - \mu_X)]\, p_{xX}(x, X) . \tag{3.14c}$$
We say that the value of the integral in Eq. (3.14c) measures the covariance of random variables
x and X . When
$$E(y) = E\big( (x - \mu_x)(X - \mu_X) \big)$$
is greater than zero, x and X are said to be positively correlated; when
$$E(y) = E\big( (x - \mu_x)(X - \mu_X) \big)$$
is less than zero, x and X are said to be negatively correlated; and when
$$E(y) = E\big( (x - \mu_x)(X - \mu_X) \big)$$
equals zero, x and X are said to be uncorrelated.
Evaluating E( y ) and finding it not equal to zero is a standard way of showing that two
random variables x and X are correlated and so cannot be independent. We cannot, however,
say that x and X are independent just because E( y ) is zero; that is, saying that x and X are
uncorrelated is a weaker statement than saying that x and X are independent. To show why this
is so, we set up a random variable φ which has a probability density distribution
$$p_\phi(\phi) = \begin{cases} \dfrac{1}{2\pi} & \text{for } 0 \le \phi \le 2\pi \\ 0 & \text{otherwise} . \end{cases} \tag{3.15a}$$
The probability density distribution pφ shows that φ is equally likely to take on any value
between zero and 2π, and that φ never takes on values less than zero or greater than 2π. We next
define two random variables u and v such that
u = sin(φ ) (3.15b)
and
v = cos(φ ) . (3.15c)
It follows that
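$$\mu_u = E(u) = E(\sin\phi) = \int_{-\infty}^{\infty} p_\phi(\phi)\, \sin(\phi)\, d\phi = \frac{1}{2\pi} \int_{0}^{2\pi} \sin(\phi)\, d\phi = 0 , \tag{3.15d}$$
and
$$\mu_v = E(v) = \frac{1}{2\pi} \int_{0}^{2\pi} \cos(\phi)\, d\phi = 0 . \tag{3.15e}$$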
Note that
$$\begin{aligned} E\big( (u - \mu_u)(v - \mu_v) \big) = E(uv) &= E\big( (\sin\phi)(\cos\phi) \big) \\ &= \frac{1}{2\pi} \int_{0}^{2\pi} \sin(\phi) \cos(\phi)\, d\phi \\ &= \frac{1}{4\pi} \int_{0}^{2\pi} \sin(2\phi)\, d\phi = 0 , \end{aligned} \tag{3.15f}$$
which means that u and v are uncorrelated random variables. On the other hand, we also know
that
u 2 + v 2 = sin 2 φ + cos 2 φ = 1 ,
which means that whenever u takes on a particular random value, say 1/2, then v must take on
one of the two random values
$$\pm\sqrt{1 - (1/2)^2} = \pm\sqrt{3}/2 .$$
Consequently, u and v are by no means independent random variables even though by definition
they are uncorrelated random variables.
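The example of u = sin(φ) and v = cos(φ) can be reproduced numerically by averaging over a uniform grid of φ values, which stands in for the uniform density of φ. The grid size is an arbitrary choice. As an additional check not taken from the text, the sketch compares E(u²v²) with E(u²)·E(v²); these would be equal if u and v were independent, because functions of independent variables are themselves independent.

```python
import numpy as np

# Uniform grid over one period; averaging over it mimics the uniform density of phi.
phi = np.linspace(0.0, 2.0 * np.pi, 100000, endpoint=False)
u = np.sin(phi)
v = np.cos(phi)

mu_u = u.mean()                               # Eq. (3.15d)
mu_v = v.mean()                               # Eq. (3.15e)
cov_uv = np.mean((u - mu_u) * (v - mu_v))     # Eq. (3.15f): zero, so uncorrelated

# Dependence check: these two numbers differ, so u and v are not independent.
E_u2v2 = np.mean(u ** 2 * v ** 2)
E_u2_E_v2 = np.mean(u ** 2) * np.mean(v ** 2)
print(cov_uv, E_u2v2, E_u2_E_v2)
```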
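$$\begin{aligned} &= \alpha \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, f(x_1, x_2, \ldots, x_N)\, p_{x_1 x_2 \cdots x_N}(x_1, x_2, \ldots, x_N) \\ &\quad + \beta \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, g(x_1, x_2, \ldots, x_N)\, p_{x_1 x_2 \cdots x_N}(x_1, x_2, \ldots, x_N) \end{aligned} \tag{3.16a}$$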
Note that in the last step Eq. (3.12a) is applied again to return to the expectation operator.
According to Eq. (2.32a) in Chapter 2, the definition of a linear operator L is that
L (α f + β g ) = α L ( f ) + β L ( g ) (3.16b)
for any two functions f, g and any two constants α, β. When we think of the nonrandom variables
α and β as “constants,” we see that Eqs. (3.16a) and (3.16b) provide plenty of justification for
calling the expectation operator E a linear operator with respect to all random quantities.
The linearity of E can be used to show that multiplying any random variable x by a
nonrandom parameter α results in the mean of x being multiplied by α and the variance of x
being multiplied by α². Starting with Eq. (3.8c), we multiply both sides by α to get
α E( x ) = αµ x . (3.16c)
Because E is linear, E(α x ) = α E( x ) , which means that Eq. (3.16c) can be written as
E(α x ) = αµ x . (3.16d)
This shows that multiplying x by α changes its average value from µ x to αµ x . As for the
variance vx of random variable x , according to Eq. (3.8e) we have
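$$E\big( (x - \mu_x)^2 \big) = v_x . \tag{3.16e}$$
Multiplying both sides by α² gives
$$\alpha^2 E\big( (x - \mu_x)^2 \big) = \alpha^2 v_x . \tag{3.16f}$$
Again the linearity of E lets us write
$$\alpha^2 E\big( (x - \mu_x)^2 \big) = E\big( \alpha^2 (x - \mu_x)^2 \big) ,$$
so that
$$\alpha^2 E\big( (x - \mu_x)^2 \big) = E\big( (\alpha x - \alpha\mu_x)^2 \big) .$$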
Linearity of the Expectation Operator · 3.10
$$E\big( (\alpha x - \alpha\mu_x)^2 \big) = \alpha^2 v_x . \tag{3.16g}$$
Since α x is the new random variable which comes from multiplying x by α and [according to
Eq. (3.16d)] the quantity αµ x is the mean of this new random variable, we now realize,
consulting the definition of the variance in Eq. (3.8e), that E((αx − αµx)²) must be the variance
of the new random variable α x . Equation (3.16e) reminds us that vx is the variance of the old
random variable x . Hence, Eq. (3.16g) states that if x is multiplied by α then its variance must
be multiplied by α².
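A short check of Eqs. (3.16d) and (3.16g) on a discrete distribution follows. The values, probabilities, and the choice α = −3 are assumptions for illustration; a negative α is used to show that the variance still scales by α².

```python
import numpy as np

# Assumed discrete random variable x and nonrandom scale factor alpha.
values = np.array([0.0, 1.0, 4.0])
probs = np.array([0.5, 0.3, 0.2])
alpha = -3.0

mu_x = np.sum(probs * values)
v_x = np.sum(probs * (values - mu_x) ** 2)

scaled = alpha * values                         # the new random variable alpha * x
mu_scaled = np.sum(probs * scaled)              # Eq. (3.16d): alpha * mu_x
v_scaled = np.sum(probs * (scaled - mu_scaled) ** 2)   # Eq. (3.16g): alpha^2 * v_x
print(mu_scaled, alpha * mu_x, v_scaled, alpha ** 2 * v_x)
```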
The expectation operator usually can be moved inside an integral over a nonrandom variable.
Suppose function f depends on one nonrandom variable z in addition to N random variables
x1 , x2 ,…, x N . Then, again using Eq. (3.12a), the expectation value of the integral
$$\int_{z_A}^{z_B} f(z, x_1, x_2, \ldots, x_N)\, dz$$
is
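$$E\Big( \int_{z_A}^{z_B} f(z, x_1, x_2, \ldots, x_N)\, dz \Big) = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, p_{x_1 x_2 \cdots x_N}(x_1, x_2, \ldots, x_N) \int_{z_A}^{z_B} f(z, x_1, x_2, \ldots, x_N)\, dz .$$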
As long as we can interchange the order of these integrations—which is almost always allowed
when dealing with physically realistic integrals—the expectation value can also be written as
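$$E\Big( \int_{z_A}^{z_B} f(z, x_1, x_2, \ldots, x_N)\, dz \Big) = \int_{z_A}^{z_B} dz \Big[ \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2 \cdots \int_{-\infty}^{\infty} dx_N\, p_{x_1 x_2 \cdots x_N}(x_1, x_2, \ldots, x_N)\, f(z, x_1, x_2, \ldots, x_N) \Big] .$$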
$$E\Big( \int_{z_A}^{z_B} f(z, x_1, x_2, \ldots, x_N)\, dz \Big) = \int_{z_A}^{z_B} E\big( f(z, x_1, x_2, \ldots, x_N) \big)\, dz . \tag{3.17a}$$
The same reasoning can be extended to M integrals over M nonrandom variables z1 , z2 ,…, zM .
We have
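$$\begin{aligned} &E\Big( \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, f(z_1, z_2, \ldots, z_M, x_1, x_2, \ldots, x_N) \Big) \\ &\quad = \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_N\, p_{x_1 x_2 \cdots x_N}(x_1, \ldots, x_N) \int_{z_{1A}}^{z_{1B}} dz_1 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, f(z_1, \ldots, z_M, x_1, \ldots, x_N) \\ &\quad = \int_{z_{1A}}^{z_{1B}} dz_1 \cdots \int_{z_{MA}}^{z_{MB}} dz_M \Big[ \int_{-\infty}^{\infty} dx_1 \cdots \int_{-\infty}^{\infty} dx_N\, p_{x_1 x_2 \cdots x_N}(x_1, \ldots, x_N)\, f(z_1, \ldots, z_M, x_1, \ldots, x_N) \Big] , \end{aligned}$$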
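$$\begin{aligned} &E\Big( \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, f(z_1, z_2, \ldots, z_M, x_1, x_2, \ldots, x_N) \Big) \\ &\quad = \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, E\big( f(z_1, z_2, \ldots, z_M, x_1, x_2, \ldots, x_N) \big) . \end{aligned} \tag{3.17b}$$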
The expectation operator can even be moved inside the integral of a random function
f ( z1 , z2 ,… , zM ) .
f ( z1 , z2 ,… , zM ) = f ( z1 , z2 ,… , zM , x1 , x2 ,… , x N )
for some set of random variables x1 , x2 ,…, x N . Hence, we can just suppress the random variables
x1 , x2 ,…, x N in Eq. (3.17b) to get
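$$\begin{aligned} &E\Big( \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, f(z_1, z_2, \ldots, z_M) \Big) \\ &\quad = \int_{z_{1A}}^{z_{1B}} dz_1 \int_{z_{2A}}^{z_{2B}} dz_2 \cdots \int_{z_{MA}}^{z_{MB}} dz_M\, E\big( f(z_1, z_2, \ldots, z_M) \big) . \end{aligned} \tag{3.17c}$$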
has a probability density distribution psN ( sN ) that resembles a Gaussian or normal probability
density distribution more and more as N gets large,
$$p_{s_N}(s_N) \cong \frac{1}{\sigma_{s_N} \sqrt{2\pi}}\, e^{-\frac{(s_N - \mu_{s_N})^2}{2\sigma_{s_N}^2}} . \tag{3.18b}$$
In Eq. (3.18b), µ sN is the mean or average value of sN and σ sN is the standard deviation of sN
about its mean. Figure 3.1 is a plot of the Gaussian distribution specified on the right-hand side of
(3.18b). For large but finite values of N, this Gaussian distribution tends to be a relatively good
approximation of psN ( sN ) for sN values near the peak in Fig. 3.1 and a not-so-good
approximation of psN ( sN ) for sN values in the tails of Fig. 3.1—that is, for sN values far from
the peak.
The mean of sN comes from applying the expectation operator E to both sides of Eq. (3.18a).
Remembering that E is linear with respect to random quantities [see Eq. (3.16a) above], we get
FIGURE 3.1. [Plot of the probability density distribution psN (sN ) versus sN .]
$$v_{s_N} = E\big( (s_N - \mu_{s_N})^2 \big) ,$$
The Central Limit Theorem · 3.11
$$v_{s_N} = E\Big( \sum_{j=1}^{N} (r_j - \mu_{r_j})^2 + \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \ne j}}^{N} [(r_j - \mu_{r_j})(r_k - \mu_{r_k})] \Big) ,$$
and the linearity of the expectation operator with respect to random quantities then lets us write
this as
$$v_{s_N} = \sum_{j=1}^{N} E\big( (r_j - \mu_{r_j})^2 \big) + \sum_{j=1}^{N} \sum_{\substack{k=1 \\ k \ne j}}^{N} E\big( (r_j - \mu_{r_j})(r_k - \mu_{r_k}) \big) . \tag{3.19b}$$
Since r1 , r2 ,…, rN are independent random quantities, so must the random quantities r1 − µr1 ,
r2 − µr2 ,…, rN − µrN also be independent. Hence, according to Eq. (3.11b), we see that when
j ≠ k
$$E\big( (r_j - \mu_{r_j})(r_k - \mu_{r_k}) \big) = E(r_j - \mu_{r_j}) \cdot E(r_k - \mu_{r_k}) . \tag{3.19c}$$
But, applying the linearity of the expectation operator and Eqs. (3.8c) and (3.9f), we have
$$E\big( (r_j - \mu_{r_j})(r_k - \mu_{r_k}) \big) = 0 \tag{3.19d}$$
$$v_{s_N} = \sum_{j=1}^{N} E\big( (r_j - \mu_{r_j})^2 \big) ,$$
is the variance of rj for j = 1, 2,… , N . The standard deviation of a random quantity is the square
root of its variance [see Eq. (3.5c)], so formulas (3.19e) and (3.19f) can also be written as
is the standard deviation of rj for j = 1, 2,… , N and σ sN is the standard deviation of sN .
Returning to the approximation in Eq. (3.18b) used to explain the central limit theorem, we
notice that some care must be exercised in interpreting the limit as N → ∞ ; in particular, it is
clear from Eqs. (3.19a) and (3.19g) that there is a tendency for both µ sN and σ sN to become large
without limit as N increases, making the expression on the right-hand side of (3.18b) difficult to
interpret in the limit of large N. The central limit theorem can be written in terms of a
mathematically well-defined limit as N → ∞ if we are careful how the arguments of the
Gaussian or normal distribution are defined. To state the central limit theorem precisely, we
define a new random variable
$$z_N = \frac{s_N - \mu_{s_N}}{\sigma_{s_N}} \tag{3.20a}$$
that has a probability density distribution pzN ( z N ) . Now we can present the central limit theorem
exactly by stating that
$$\lim_{N \to \infty} \big[ p_{z_N}(z) \big] = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2} . \tag{3.20b}$$
The right-hand side of (3.20b) is the Gaussian or normal distribution introduced above in Eq.
(3.6a) where the random variable has a mean of zero and a standard deviation of one. For any
large but finite value of N, we can recover the approximation in (3.18b) by assuming that pzN is
near its limit and then replacing z in (3.20b) by zN as defined in (3.20a). [The extra factor of σsN
multiplying the √(2π) on the right-hand side of (3.18b) can be regarded as coming from Eq. (3.4)
above; if it isn’t there, then the integral of the probability density distribution between −∞ and
+∞ does not equal one.]
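The approach to the Gaussian form in (3.18b) can be illustrated with an exactly computable case. The sketch below is an assumption-laden example rather than anything from the text: it takes each rj to be a fair die (a discrete variable, so probabilities on the integer lattice are compared with the Gaussian density evaluated at the same points), builds the exact distribution of the sum by repeated convolution, and measures the largest deviation from the Gaussian with mean Nµ and variance Nσ².

```python
import numpy as np

die = np.full(6, 1.0 / 6.0)          # assumed distribution of each r_j: values 1..6
mu_r, var_r = 3.5, 35.0 / 12.0       # mean and variance of one die

def sum_pmf(N):
    """Exact probabilities of the sum of N independent dice, for sums N..6N."""
    pmf = np.array([1.0])
    for _ in range(N):
        pmf = np.convolve(pmf, die)
    return pmf

def max_gauss_error(N):
    """Largest absolute gap between the exact probabilities and the Gaussian of (3.18b)."""
    pmf = sum_pmf(N)
    s = np.arange(N, 6 * N + 1)
    mu_s = N * mu_r
    sigma_s = np.sqrt(N * var_r)
    gauss = np.exp(-(s - mu_s) ** 2 / (2.0 * sigma_s ** 2)) / (sigma_s * np.sqrt(2.0 * np.pi))
    return float(np.max(np.abs(pmf - gauss)))

for N in (2, 10, 50):
    print(N, max_gauss_error(N))
```

The printed deviation shrinks as N grows, which is the qualitative behavior the central limit theorem describes.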
$$a_N = \frac{s_N}{N} . \tag{3.21a}$$
Applying the expectation operator E to both sides gives, using the linearity of the expectation
operator (see Sec. 3.10 above),
$$E(a_N) = \frac{1}{N} E(s_N) . \tag{3.21b}$$
Since E( sN ) = µ sN , Eq. (3.19a) shows that, since all the rj have the same mean value µ,
Equation (3.21d) states that the expected value of the experimental average a N is µ, the true
value of the experimental quantity being measured. This is no great surprise, because the
averaging process would not make sense unless it were true. The typical size of the error left after
the rj are averaged together—that is, the amount by which a N is likely to be different from its
average value—is just its standard deviation [see Eqs. (3.5c) and (3.8e) above],
$$\sigma_{a_N} = \sqrt{E\big( (a_N - \mu)^2 \big)} ,$$
which can also be written as, after substituting from Eq. (3.21a) and using the linearity of the
expectation operator,
$$\sigma_{a_N} = \sqrt{E\Big( \Big( \frac{1}{N} s_N - \mu \Big)^2 \Big)} = \frac{1}{N} \sqrt{E\big( (s_N - N\mu)^2 \big)} . \tag{3.21e}$$
$$E\big( (s_N - N\mu)^2 \big) .$$
the variance vsN of sN [see Eq. (3.8e) above]. Hence, (3.21e) can be written as
$$\sigma_{a_N} = \frac{1}{N} \sqrt{v_{s_N}} = \frac{1}{N} \sqrt{\sigma_{s_N}^2}$$
because the variance is the square of the standard deviation σ sN . Substituting from (3.19g) now
gives
1 1
σ a = vsN = σ r21 + σ r22 + " + σ r2N .
N
N N
As already mentioned above, we can assume that all the rj have the same standard deviation σ.
Hence,
$$\sigma_{a_N} = \frac{1}{N} \sqrt{N\sigma^2} = \frac{\sigma}{\sqrt{N}} . \tag{3.21f}$$
This shows that when the standard deviation or expected error in one measurement is σ, then the
standard deviation or expected error in the average a N of N identical but independent
measurements is σ/√N , a significantly smaller number. Although we use several formulas from
the previous section on the central limit theorem to get this result, there is no assumption here
that the rj obey any particular probability density distribution. In order to derive Eqs. (3.21d) and
(3.21f), all that is needed is that the rj are independent and that the probability density
distributions of the rj have the same mean and standard deviation.
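A seeded simulation gives a rough check of the σ/√N result in Eq. (3.21f). Everything specific here is an assumption for the example: the measurements are drawn from a uniform distribution (deliberately non-Gaussian, since the result does not depend on the distribution's shape), with N = 100 measurements per average and 20000 repeated averages.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 100            # measurements per average
trials = 20000     # number of independent averages used to estimate the spread
mu = 5.0           # assumed true value being measured
half_width = 1.0   # measurements are uniform on [mu - 1, mu + 1]
sigma = half_width / np.sqrt(3.0)   # standard deviation of one uniform measurement

r = rng.uniform(mu - half_width, mu + half_width, size=(trials, N))
a_N = r.mean(axis=1)                # one experimental average per row, Eq. (3.21a)

observed = a_N.std()                # spread of the averages
predicted = sigma / np.sqrt(N)      # Eq. (3.21f)
print(a_N.mean(), observed, predicted)
```

The observed spread should agree with the prediction to within the sampling error of the simulation, and the mean of the averages should sit close to the assumed true value.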
When spectrometers are used to make independent measurements of the same radiance
Averaging to Improve Experimental Accuracy · 3.12
spectra, we can extend the above analysis to the spectral measurements by regarding the
independent but identical random variables rj as random functions of the spectral wavelength or
frequency, with different values of index j now representing different spectral curves from
independent spectral measurements. We can now repeat all the algebraic manipulations used in
(3.21a)–(3.21f) above while regarding every quantity except N as a function of the spectral
wavelength or frequency and end up with the same results. If, for example, the quantities are
regarded as functions of the spectral wavelength λ, then we just need to visualize a (λ)
immediately following the relevant variables. In a sense, all that is happening is that we have
decided to repeat the algebra of Eqs. (3.21a)–(3.21f) at each spectral wavelength. Equation
(3.21d), for example, becomes
E ( a N (λ ) ) = µ (λ ) , (3.22a)
showing that the point-by-point average of the rj (λ ) spectral curves creates another curve a N (λ )
whose expected value is the true spectrum µ(λ). The average spectrum a N (λ ) is allowed to have
a different expected value µ(λ) at each wavelength λ because it is now, of course, taken to be a
function of λ. Similarly Eq. (3.21f) becomes
$$\sigma_{a_N}(\lambda) = \frac{\sigma(\lambda)}{\sqrt{N}} . \tag{3.22b}$$
This shows that the expected error σ aN (λ ) at wavelength λ of the average spectrum a N (λ ) is
smaller by a factor of √N than the expected error σ(λ) at wavelength λ of a single spectral
measurement. The expected error σ (λ ) , just like the average µ(λ), is allowed to be different at
different wavelengths. As long as the expected value µ(λ) of a N (λ ) is the true spectral curve, Eq.
(3.22b) shows that we can approach this true spectrum as closely as we desire—that is, make the
error in our point-by-point average spectrum arbitrarily small—by making N as large as
necessary.
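The point-by-point averaging of spectra can be sketched in a few lines. The wavelength grid, the "true" spectrum, the wavelength-dependent noise level, and the use of Gaussian noise are all invented for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

wavelengths = np.linspace(400.0, 700.0, 50)            # hypothetical grid
true_spectrum = 1.0 + 0.5 * np.sin(wavelengths / 50.0) # assumed mu(lambda)
sigma = 0.05 + 0.001 * (wavelengths - 400.0)           # assumed sigma(lambda)

N = 400   # number of independent spectral measurements
spectra = true_spectrum + sigma * rng.standard_normal((N, wavelengths.size))

avg = spectra.mean(axis=0)              # a_N(lambda): average at each wavelength
expected_error = sigma / np.sqrt(N)     # Eq. (3.22b), one value per wavelength
worst_ratio = float(np.max(np.abs(avg - true_spectrum) / expected_error))
print(worst_ratio)   # typical deviations are a few expected errors at most
```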
We also know the behavior of random variables can be described by probability density
distributions. Associated with any N sequential random variables n (t1 ) , n (t2 ) ,..., n (t N ) specified
by the time values t1 < t2 < ⋯ < tN there is a probability density distribution
is the probability first that ñ(t1) takes on a value between n1 and n1 + dn1 , and then that n (t2 )
takes on a value between n2 and n2 + dn2 , and then that n (t3 ) takes on a value between n3 and
n3 + dn3 , …, and then that n (t N ) takes on a value between nN and nN + dnN . The expectation
operator E has the same meaning as before: the expected or mean value of any function f of the
N random variables n (t1 ) , n (t2 ) , ... , n (t N ) is
One of the most important expectation values associated with ñ occurs when we set N = 2 and
specify that
f ( n (t1 ), n (t2 ),… , n (t N ) ) = n (t1 ) ⋅ n (t2 )
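$$R_{\tilde n \tilde n}(t_1, t_2) = E\big( \tilde n(t_1) \cdot \tilde n(t_2) \big) = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2\, [n_1 n_2]\, p_{\tilde n(t_1) \tilde n(t_2)}(n_1, n_2) . \tag{3.23b}$$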
−∞ −∞
$$\mu_{\tilde n}(t) = E\big( \tilde n(t) \big) = \int_{-\infty}^{\infty} n\, p_{\tilde n(t)}(n)\, dn , \tag{3.23c}$$
and the autocovariance of ñ,
Mean, Autocorrelation, Autocovariance of Random Functions of Time · 3.13
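$$\begin{aligned} C_{\tilde n \tilde n}(t_1, t_2) &= E\Big( \big( \tilde n(t_1) - \mu_{\tilde n(t_1)} \big) \big( \tilde n(t_2) - \mu_{\tilde n(t_2)} \big) \Big) \\ &= \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2\, (n_1 - \mu_{\tilde n(t_1)})(n_2 - \mu_{\tilde n(t_2)})\, p_{\tilde n(t_1) \tilde n(t_2)}(n_1, n_2) . \end{aligned} \tag{3.23d}$$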
$$R_{\tilde n \tilde n}(t_1, t_2) = C_{\tilde n \tilde n}(t_1, t_2) . \tag{3.23e}$$
Almost always, the random functions used to represent noise in a physical system are specified in
such a way that µn ( t ) = 0 , which means the distinction between the autocorrelation function and
the autocovariance function becomes irrelevant.
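The mean, autocorrelation, and autocovariance can be estimated from a finite ensemble of sample functions by replacing the expectation operator with an average over ensemble members. The noise model below (a constant offset plus a running sum of Gaussian steps) is purely an assumption to give the estimates something to work on. The sketch also checks the general identity R(t1, t2) = C(t1, t2) + µ(t1)µ(t2), which follows from expanding the product in (3.23d) and reduces to (3.23e) when the mean is zero.

```python
import numpy as np

rng = np.random.default_rng(2)

M, T = 5000, 8   # M ensemble members, each sampled at T time values
# Assumed noise model: nonzero mean plus a random walk, so R and C differ.
ensemble = 0.3 + np.cumsum(rng.standard_normal((M, T)), axis=1)

mu_hat = ensemble.mean(axis=0)              # estimate of the mean, Eq. (3.23c)
R_hat = ensemble.T @ ensemble / M           # estimate of R(t_i, t_j), Eq. (3.23b)
centered = ensemble - mu_hat
C_hat = centered.T @ centered / M           # estimate of C(t_i, t_j), Eq. (3.23d)

# R = C + mu(t1) * mu(t2); the two coincide only when the mean vanishes.
print(np.max(np.abs(R_hat - (C_hat + np.outer(mu_hat, mu_hat)))))
```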
3.14 Ensembles
Just as random variables are often regarded as taking on one or another specific value chosen
randomly from some collection of allowed nonrandom values, so too do we often think of
random functions as becoming one or another specific, nonrandom function chosen randomly
from a collection—or ensemble—of allowed nonrandom functions. We can visualize this
situation by imagining an infinitely long row of biased and crooked slot machines, one for every
value of t on the time axis.27 The slot machines do not necessarily behave identically and they are
wired together so that they can influence each other. When a slot machine’s lever is pulled, there
is never any jackpot; all that happens is that another number appears inside its window. Each time
we simultaneously pull all the levers of the slot machines, we randomly choose another member
of the ensemble of allowed functions. The probability pn ( t ) (n) dn that random variable ñ(t) takes
on a value between n and n + dn is just the probability that the slot machine at t takes on a value
between n and n + dn , and it is also the probability that some member function randomly chosen
from the ensemble of allowed functions has a value between n and n + dn at time t. In fact, we
can say that
is the probability, after the slot machine levers are pulled, that the slot machine at t1 has a value
between n1 and n1 + dn1 , that the slot machine at t2 has a value between n2 and n2 + dn2 , …, and
27
An objection that could be raised here is that an infinite number of slot machines is only what is called countably
infinite whereas the number of points on the time axis is uncountably infinite, a much “larger” type of infinity. For
our purposes, the distinction between these two types of infinity is not important.
that the slot machine at tN has a value between nN and nN + dnN . It can also, of course, be thought
of as the probability that a member function randomly chosen from the ensemble of allowed
functions has values at times t1 < t2 < ⋯ < tN that lie between n1 and n1 + dn1 , n2 and n2 + dn2 ,
…, nN and nN + dnN respectively.
for any value of τ and all N = 1, 2,… , ∞ . Thus, for any integrable function f with N arguments,
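$$\begin{aligned} &\int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \cdots \int_{-\infty}^{\infty} dn_N\, f(n_1, n_2, \ldots, n_N)\, p_{\tilde n(t_1) \tilde n(t_2) \cdots \tilde n(t_N)}(n_1, n_2, \ldots, n_N) \\ &\quad = \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \cdots \int_{-\infty}^{\infty} dn_N\, f(n_1, n_2, \ldots, n_N)\, p_{\tilde n(t_1 + \tau) \tilde n(t_2 + \tau) \cdots \tilde n(t_N + \tau)}(n_1, n_2, \ldots, n_N) , \end{aligned} \tag{3.24b}$$
where t1 < t2 < ⋯ < tN and N = 1, 2,… , ∞ . This means that, according to Eq. (3.23a),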
for any integrable function f, any value of τ , and N = 1, 2,… , ∞ . We note that when Eq. (3.24c)
holds true,
E ( f ( n (t1 ), n (t2 ),… , n (t N ) ) )
cannot depend on all the N independent time values t1 , t2 ,…, t N as we might at first suppose. To
see why this is so, we just set τ = −t1 in (3.24c) to get
28
Paul H. Wirsching, Thomas L. Paez, and Keith Ortiz, Random Vibrations: Theory and Practice (John Wiley and
Sons, Inc., New York, 1995), p. 80.
29
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 297.
Stationary Random Functions · 3.15
must be a function of just the nonrandom time parameters (t2 − t1 ) , (t3 − t1 ) ,…, (t N − t1 ) and there
are, of course, only N − 1 of these.
Equations (3.24b)–(3.24d) can be understood in terms of the following thought experiment.
We randomly pick some function from the ensemble of allowed functions and choose N time
values t1 < t2 < ⋯ < tN . The randomly picked function has values n1 , n2 ,…, nN at times
t1 , t2 ,…, t N respectively. Next, we create some nonrandom function f that has N arguments and is
not one of those physically unreasonable abstractions that mathematicians specialize in. We
calculate and store the value of f (n1 , n2 ,… , nN ) . Randomly choosing another function from the
ensemble of allowed functions for n (t ) , we again use n1 , n2 ,…, nN at t1 , t2 ,…, t N to calculate and
store a new value of f (n1 , n2 ,… , nN ) . Repeating this procedure enough times to get a large
collection of f values, we average them all together to get a good estimate of
Shifting to a new set of time values t1 + τ , t2 + τ ,…, t N + τ , we again generate another large
collection of f values, this time averaging them together to get a good estimate of
Since n is strict-sense stationary, we know that no matter what the positive integer N is, and no
matter what the function f is, and no matter what the value of τ is, both collections of f values
always have approximately the same average, with the difference between the averages becoming
less and less as the collections of f values get larger and larger.
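The thought experiment can be tried numerically. The following is a minimal sketch, not a definitive implementation. It assumes, purely for illustration, an ensemble of member functions n(t) = a cos(ωt) + b sin(ωt) in which a and b are independent zero-mean Gaussian values with equal variance, so their joint density depends only on a² + b². The test function `f`, the time values, the shift `tau`, and the helper name `ensemble_average` are all hypothetical choices.

```python
import math
import random

def ensemble_average(f, times, omega=2.0, trials=100_000, seed=0):
    """Average f(n(t1), ..., n(tN)) over randomly drawn member functions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Pick one member function: independent zero-mean Gaussian a and b
        # (an assumed choice; their joint density depends only on a^2 + b^2).
        a = rng.gauss(0.0, 1.0)
        b = rng.gauss(0.0, 1.0)
        values = [a * math.cos(omega * t) + b * math.sin(omega * t) for t in times]
        total += f(*values)
    return total / trials

def f(n1, n2, n3):
    # Arbitrary nonrandom test function with N = 3 arguments (hypothetical).
    return n1 * n2 + n3 ** 2 * math.cos(n1)

times = [0.3, 1.1, 2.0]
tau = 0.7
avg_original = ensemble_average(f, times, seed=1)
avg_shifted = ensemble_average(f, [t + tau for t in times], seed=2)
print(avg_original, avg_shifted)
```

The two printed averages should agree to within the sampling error, and the gap should shrink as `trials` grows.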
To give an example of a random function ñ(t) that is strict-sense stationary, we define
3 · Random Variables, Random Functions, and Power Spectra
\[ p_{\tilde a \tilde b}(a, b) = p_{\tilde a \tilde b}\!\left( \sqrt{a^2 + b^2} \right) . \tag{3.25b} \]
a1 cos(ω t1 ) + b1 sin(ω t1 ) ,
we know that the number in the window of the slot machine located at time value t2 is
a1 cos(ω t2 ) + b1 sin(ω t2 ) ,
and so on. If we pull all the levers again and get values a2 for a and b2 for b , then we know that
the slot machine at t1 has a number
a2 cos(ω t1 ) + b2 sin(ω t1 ) ,
a2 cos(ω t2 ) + b2 sin(ω t2 ) ,
30
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 301.
which we can find by solving Eqs. (3.26a) and (3.26b) for a and b in terms of A and B .
Equations (3.26a) and (3.26b) state that if random variables ã and b̃ take on the values a and b,
then random variables Ã and B̃ must take on the values
a cos(ωτ) + b sin(ωτ)
and
b cos(ωτ) − a sin(ωτ)
respectively. Similarly, Eqs. (3.26c) and (3.26d) state that if random variables Ã and B̃ take on
values A and B, then random variables ã and b̃ must take on values
A cos(ωτ ) − B sin(ωτ )
and
B cos(ωτ ) + A sin(ωτ )
respectively. Whenever there are two random variables x and y that have a probability density
distribution pxy ( x, y ) and we use constants α1 , α 2 , α 3 , and α 4 to construct from x and y two
new random variables
z = α1 x + α 2 y (3.27a)
and
w = α 3 x + α 4 y , (3.27b)
then we can find the probability density distribution p_zw for z̃ and w̃ by calculating the reverse
transformation
x = β1 z + β 2 w (3.27c)
and
y = β 3 z + β 4 w , (3.27d)
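and requiring that31
\[ p_{\tilde z \tilde w}(z, w) = \frac{1}{\left| \alpha_1 \alpha_4 - \alpha_2 \alpha_3 \right|} \, p_{\tilde x \tilde y}(\beta_1 z + \beta_2 w, \; \beta_3 z + \beta_4 w) . \tag{3.27e} \]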
and
β1 = cos(ωτ ) , β 2 = − sin(ωτ ) , β3 = sin(ωτ ) , β 4 = cos(ωτ ) .
Consequently,
α1α 4 − α 2α 3 = cos 2 (ωτ ) + sin 2 (ωτ ) = 1 ,
\[ p_{\tilde A \tilde B}(A, B) = p_{\tilde a \tilde b}\big( A\cos(\omega\tau) - B\sin(\omega\tau), \; A\sin(\omega\tau) + B\cos(\omega\tau) \big) . \tag{3.28a} \]
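Since p_ab is circularly symmetric, obeying Eq. (3.25b), this becomes
\[
\begin{aligned}
p_{\tilde A \tilde B}(A, B) &= p_{\tilde a \tilde b}\Big( \big[ A^2\cos^2(\omega\tau) + B^2\sin^2(\omega\tau) - 2AB\sin(\omega\tau)\cos(\omega\tau) \\
&\qquad\quad + A^2\sin^2(\omega\tau) + B^2\cos^2(\omega\tau) + 2AB\sin(\omega\tau)\cos(\omega\tau) \big]^{1/2} \Big) \\
&= p_{\tilde a \tilde b}\Big( \sqrt{ A^2\big(\cos^2(\omega\tau) + \sin^2(\omega\tau)\big) + B^2\big(\cos^2(\omega\tau) + \sin^2(\omega\tau)\big) } \Big) \\
&= p_{\tilde a \tilde b}\Big( \sqrt{A^2 + B^2} \Big) .
\end{aligned}
\]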
From Eqs. (3.26c) and (3.26d), we know that whenever Ã and B̃ take on the values A and B,
ã and b̃ must then take on the values
31
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 144.
\[ A\cos(\omega\tau) - B\sin(\omega\tau) \]
and
\[ B\cos(\omega\tau) + A\sin(\omega\tau) . \]
Hence,
\[ a^2 + b^2 = \big[ A\cos(\omega\tau) - B\sin(\omega\tau) \big]^2 + \big[ A\sin(\omega\tau) + B\cos(\omega\tau) \big]^2 = A^2 + B^2 \]
so that
\[ p_{\tilde A \tilde B}(A, B) = p_{\tilde a \tilde b}\Big( \sqrt{a^2 + b^2} \Big) = p_{\tilde a \tilde b}(a, b) , \]
where Eq. (3.25b) is reversed to make the last step in this equality. We have now shown that Eq.
(3.28a) can be written as
\[ p_{\tilde A \tilde B}(A, B) = p_{\tilde a \tilde b}(a, b) \tag{3.28b} \]
where the equal probability densities do not depend on ωτ because p_ab is circularly symmetric.
Equation (3.28b) is a very restrictive statement applied to random variables Ã and B̃ because
it requires Ã and B̃ to obey exactly the same statistics as ã and b̃. Consequently, we can set up
a random function
\[ \tilde N(t) = \tilde A\cos(\omega t) + \tilde B\sin(\omega t) \tag{3.29a} \]
and know that it has exactly the same random behavior as ñ(t) in Eq. (3.25a). Substituting Eqs.
(3.26a) and (3.26b) into (3.29a) gives
\[ \tilde N(t) = \tilde n(t + \tau) . \tag{3.29c} \]
This means that not only does Ñ(t) have the same random behavior as ñ(t), it also has the same
random behavior as ñ(t + τ). Consequently, ñ(t) and ñ(t + τ) must both have the same random
behavior. We have made no assumptions about the value of τ; hence, Eq. (3.29c) holds true for
any τ value. We have therefore demonstrated that
n (t ) = a cos(ω t ) + b sin(ω t )
Other terms applied to random functions ñ(t) that satisfy these two restrictions are weakly
stationary or covariance stationary.33 Equation (3.30a) requires the average value of ñ(t) to be
finite and independent of time. We call this average µ n instead of µ n ( t ) as in Eq. (3.23c) to
emphasize that it does not depend on time. Equation (3.30b) requires the autocorrelation function
Rnn(t1, t2) defined in Eq. (3.23b) to depend only on (t2 − t1), the difference between times t2 and
t1. Glancing back at the definition of Cnn(t1, t2) in Eq. (3.23d), we see that when Eqs. (3.30a) and
(3.30b) hold,
\[
\begin{aligned}
C_{\tilde n \tilde n}(t_1, t_2) &= E\big( (\tilde n(t_1) - \mu_n)(\tilde n(t_2) - \mu_n) \big) \\
&= E\big( \tilde n(t_1)\tilde n(t_2) - \mu_n \tilde n(t_1) - \mu_n \tilde n(t_2) + \mu_n^2 \big) \\
&= E\big( \tilde n(t_1)\tilde n(t_2) \big) - \mu_n E\big( \tilde n(t_1) \big) - \mu_n E\big( \tilde n(t_2) \big) + \mu_n^2 .
\end{aligned}
\]
The last step uses the linearity of the expectation operator (see Sec. 3.10 above) and Eq. (3.9f).
Consequently, the formula for Cnn becomes, using Eqs. (3.30a) and (3.30b),
\[ C_{\tilde n \tilde n}(t_1, t_2) = R_{\tilde n \tilde n}(t_2 - t_1) - \mu_n^2 . \tag{3.30c} \]
32
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 298.
33
T. T. Soong, Random Differential Equations in Science and Engineering (Academic Press, New York, 1973), p.
43.
stationary also depends only on (t2 − t1 ) , the difference between times t2 and t1 . We note that
random functions that are wide-sense stationary need not be strict-sense stationary, but random
functions that are strict-sense stationary must also be wide-sense stationary. For future use, we
note that two random functions nα (t ) and nβ (t ) are defined to be jointly wide-sense stationary34
when each one is itself wide-sense stationary and when
\[ E\big( \tilde n_\alpha(t_1) \, \tilde n_\beta(t_2) \big) , \]
which is called their cross-correlation function, depends only on the difference between times t1
and t2 .
Returning to the ñ(t) defined in Eq. (3.25a) above,
n (t ) = a cos(ω t ) + b sin(ω t ) ,
Hence, for E ( n (t ) ) to obey Eq. (3.30a) and so be time independent, we must have
E(a ) = 0 (3.31a)
and
E(b ) = 0 . (3.31b)
These are the first two restrictions that must be placed on a and b for ñ(t) to be wide-sense
stationary. We also know from Eq. (3.30b) that Rnn must have the same value whenever
t2 − t1 = 0 or t2 = t1 , so (remember that nothing has been said about what the value of time t2 = t1
is)
E ( n (t3 ) n (t3 ) ) = E ( n (t4 ) n (t4 ) )
34
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 299.
35
This treatment is taken from Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p.
300.
must hold true for all values of t3 and t4 . In particular, this must hold true when t3 = 0 and
t4 = π/(2ω). But from Eq. (3.25a)
This is the third restriction that must be placed on a and b for ñ(t) to be wide-sense stationary.
To find the fourth and last restriction, we evaluate the left-hand side of Eq. (3.30b) for t1 ≠ t2 ,
using (3.25a) and the linearity of the expectation operator (see Sec. 3.10) to get
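\[
\begin{aligned}
E\big( \tilde n(t_1)\tilde n(t_2) \big) &= E\big( [\tilde a\cos(\omega t_1) + \tilde b\sin(\omega t_1)][\tilde a\cos(\omega t_2) + \tilde b\sin(\omega t_2)] \big) \\
&= E\big( \tilde a^2\cos(\omega t_1)\cos(\omega t_2) + \tilde a\tilde b\cos(\omega t_1)\sin(\omega t_2) \\
&\qquad + \tilde a\tilde b\sin(\omega t_1)\cos(\omega t_2) + \tilde b^2\sin(\omega t_1)\sin(\omega t_2) \big) \\
&= E(\tilde a^2)\cos(\omega t_1)\cos(\omega t_2) + E(\tilde b^2)\sin(\omega t_1)\sin(\omega t_2) \\
&\qquad + E(\tilde a\tilde b)\big[ \cos(\omega t_1)\sin(\omega t_2) + \sin(\omega t_1)\cos(\omega t_2) \big] .
\end{aligned}
\]
Using Eq. (3.31c) to replace E(b̃²) by E(ã²) and applying the angle-addition formulas, this becomes
\[ E\big( \tilde n(t_1)\tilde n(t_2) \big) = E(\tilde a^2) \cdot \cos\big( \omega(t_2 - t_1) \big) + E(\tilde a\tilde b) \cdot \sin\big( \omega(t_1 + t_2) \big) . \tag{3.31d} \]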
The first term on the right-hand side of (3.31d) depends only on (t2 − t1 ) , which is what Eq.
(3.30b) requires, but the second term on the right-hand side does not. Therefore, the last
restriction on random variables a and b is
\[ E(\tilde a\tilde b) = 0 . \tag{3.31e} \]
Equations (3.31a), (3.31b), (3.31c), and (3.31e) list all the restrictions on random variables a and
b needed to ensure that ñ(t) in Eq. (3.25a) is a wide-sense stationary random function.
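The role of these restrictions can be checked with a few lines of arithmetic. The sketch below evaluates E(ñ(t1)ñ(t2)) for ñ(t) = ã cos(ωt) + b̃ sin(ωt) directly from the second moments of ã and b̃, at two pairs of times that share the same lag. The function name `autocorrelation` and all numerical values are illustrative assumptions.

```python
import math

def autocorrelation(t1, t2, omega, mean_a2, mean_b2, mean_ab):
    """E(n(t1) n(t2)) for n(t) = a cos(wt) + b sin(wt), expanded term by term."""
    c1, s1 = math.cos(omega * t1), math.sin(omega * t1)
    c2, s2 = math.cos(omega * t2), math.sin(omega * t2)
    return mean_a2 * c1 * c2 + mean_ab * (c1 * s2 + s1 * c2) + mean_b2 * s1 * s2

omega = 2.0
lag = 0.7
# Same lag t2 - t1, two different starting times t1.
pairs = [(0.2, 0.2 + lag), (5.2, 5.2 + lag)]

# E(a^2) = E(b^2) and E(ab) = 0: the result depends only on the lag.
uncorrelated = [autocorrelation(t1, t2, omega, 1.5, 1.5, 0.0) for t1, t2 in pairs]
# E(ab) != 0: a term in sin(w (t1 + t2)) survives and breaks stationarity.
correlated = [autocorrelation(t1, t2, omega, 1.5, 1.5, 0.4) for t1, t2 in pairs]
print(uncorrelated, correlated)
```

With E(ãb̃) = 0 the two entries of `uncorrelated` coincide; with E(ãb̃) ≠ 0 the two entries of `correlated` differ even though the lag is unchanged.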
If a and b are independent random variables that obey the same probability density
distribution, and this probability density distribution assigns a mean value of zero to random
variables obeying it, then Eqs. (3.31a)–(3.31c) are automatically satisfied and, since a and b are
independent, Eqs. (3.31a) and (3.31b) show that (3.31e) is also satisfied:
\[ E(\tilde a\tilde b) = E(\tilde a) \cdot E(\tilde b) = 0 \cdot 0 = 0 . \]
This is sufficient to make ñ(t) wide-sense stationary, but there are other ways to do the job. We
can, for example, set a = u and b = v where u and v are the random variables defined in Eqs.
(3.15b) and (3.15c) above. Equations (3.15d) and (3.15e) then show that Eqs. (3.31a) and (3.31b)
are satisfied, and Eq. (3.15f) shows that (3.31e) is satisfied. The only requirement left is (3.31c),
which can be checked now by writing
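\[ E(\tilde a^2) = E(\tilde u^2) = \frac{1}{2\pi} \int_0^{2\pi} \sin^2\phi \, d\phi = \frac{1}{2} \tag{3.32a} \]
and
\[ E(\tilde b^2) = E(\tilde v^2) = \frac{1}{2\pi} \int_0^{2\pi} \cos^2\phi \, d\phi = \frac{1}{2} . \tag{3.32b} \]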
Clearly, Eq. (3.31c) is also satisfied. We conclude that even though a = u and b = v are not, as is
pointed out in the discussion following Eq. (3.15f), independent random variables, the random
function ñ(t) in Eq. (3.25a) is still wide-sense stationary. Note that Eqs. (3.15b) and (3.15c) can
now be used to write ñ(t) as
In (3.32c), random variable φ can, according to Eq. (3.15a), be regarded as a random phase
equally likely to take on any value between zero and 2π. Adding this sort of random phase to the
argument of a sinusoidal oscillation always produces a wide-sense stationary random function.
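A quick Monte Carlo sketch illustrates the random-phase claim. It assumes the member functions have the form sin(ωt + φ), which is what ã = sin φ and b̃ = cos φ give when inserted into ã cos(ωt) + b̃ sin(ωt), with φ drawn uniformly between zero and 2π. The function name, frequency, time values, and trial count are hypothetical.

```python
import math
import random

def random_phase_statistics(t1, lag, omega=3.0, trials=100_000, seed=0):
    """Estimate E(n(t1)) and E(n(t1) n(t1 + lag)) for n(t) = sin(wt + phi)."""
    rng = random.Random(seed)
    mean_sum = 0.0
    corr_sum = 0.0
    for _ in range(trials):
        phi = rng.uniform(0.0, 2.0 * math.pi)  # random phase for this member function
        n1 = math.sin(omega * t1 + phi)
        n2 = math.sin(omega * (t1 + lag) + phi)
        mean_sum += n1
        corr_sum += n1 * n2
    return mean_sum / trials, corr_sum / trials

lag = 0.5
mean_a, corr_a = random_phase_statistics(t1=0.4, lag=lag, seed=1)
mean_b, corr_b = random_phase_statistics(t1=3.9, lag=lag, seed=2)
expected_corr = 0.5 * math.cos(3.0 * lag)  # (1/2) cos(w * lag)
print(mean_a, mean_b, corr_a, corr_b, expected_corr)
```

Both mean estimates should sit near zero, and both autocorrelation estimates should sit near (1/2) cos(ω · lag) regardless of the starting time `t1`.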
\[ \vec{\tilde n}(\vec t\,) = \big( \tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N) \big) , \tag{3.33b} \]
and
\[ \vec\mu_{\tilde n}(\vec t\,) = E\big( \vec{\tilde n}(\vec t\,) \big) = \Big( E\big(\tilde n(t_1)\big), E\big(\tilde n(t_2)\big), \ldots, E\big(\tilde n(t_N)\big) \Big) . \tag{3.33c} \]
Glancing back at Eq. (3.23c), we remember that µn (t ) is the expected or mean value of the
random variable ñ(t), so Eq. (3.33c) can also be written as
\[ \vec\mu_{\tilde n}(\vec t\,) = \big( \mu_n(t_1), \mu_n(t_2), \ldots, \mu_n(t_N) \big) . \tag{3.33d} \]
We define the covariance matrix C to be the N × N square matrix whose i,jth element is given by
\[ (\mathbf{C})_{ij} = E\big( [\tilde n(t_i) - \mu_n(t_i)][\tilde n(t_j) - \mu_n(t_j)] \big) . \tag{3.33e} \]
Equation (3.14c) reminds us that (C)ij is measuring the covariance of the two random variables
n (ti ) and n (t j ) . A T superscript applied to a matrix or vector specifies the transpose of that
matrix or vector; so, for example,
\[ \vec n^{\,T} = \begin{pmatrix} n_1 \\ n_2 \\ \vdots \\ n_N \end{pmatrix} . \]
Now the multivariate Gaussian distribution
In this formula, det(C) stands for the determinant of C , and C−1 is the inverse matrix of C .
Nothing said so far about Gaussian random processes requires them to be stationary in any
sense of the term, and in fact not all Gaussian random processes are stationary. They are often
good models for the noise found in mechanical processes and electrical signals. Perhaps the most
interesting thing about them, however, is that it can be shown that if they are wide-sense
Gaussian Random Processes · 3.16
\[ \mu_{n_1} = 0 , \qquad \mu_{n_2} = 0 , \qquad \ldots \]
etc.
We start by specifying three jointly normal, zero-mean random variables n1 , n2 , and n3 .
Consulting Eq. (3.33f) above, we note that the jointly normal probability density function for n1 ,
n2, and n3 can be written as, by expanding the matrix product in the exponent after setting the
means vector µ to zero,
\[ p_{\tilde n_1 \tilde n_2 \tilde n_3}(n_1, n_2, n_3) = K \exp\!\Big( - \sum_{j=1}^{3} \sum_{k=1}^{3} \alpha_{jk} n_j n_k \Big) \tag{3.34a} \]
for real constants K and α jk (with j , k = 1, 2,3 ). Note that these three random variables can be
either independent or dependent random variables and still obey the probability density
distribution in (3.34a). The expected value of the triple product n1n2 n3 is [applying Eq. (3.12a)
36
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 300.
37
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 83.
38
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 197.
above]
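\[ E(\tilde n_1 \tilde n_2 \tilde n_3) = K \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \int_{-\infty}^{\infty} dn_3 \, (n_1 n_2 n_3) \exp\!\Big( - \sum_{j=1}^{3} \sum_{k=1}^{3} \alpha_{jk} n_j n_k \Big) . \tag{3.34b} \]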
The only number that is equal to (−1) times itself is zero, so we conclude that
\[ E(\tilde n_1 \tilde n_2 \tilde n_3) = 0 \tag{3.34d} \]
for any three distinct, jointly normal, and zero-mean random variables.
When n1 , n2 , and n3 are not three distinct random variables—or, what amounts to the same
thing, two or more are perfectly correlated—we can redo the analysis to see what happens.
If two of the three random variables n1 , n2 , and n3 are perfectly correlated, there are really
only two distinct, jointly normal, zero-mean random variables that we call n1 and n2 . Their
multivariate probability density distribution can be written as
\[ p_{\tilde n_1 \tilde n_2}(n_1, n_2) = K \exp\!\Big( - \sum_{j=1}^{2} \sum_{k=1}^{2} \alpha_{jk} n_j n_k \Big) \]
for real constants K and α jk (with j , k = 1, 2 ). If necessary, we renumber the random variables so
that n2 represents the two perfectly correlated random variables that used to be distinct. Equation
(3.34b) now simplifies to
Products of Two, Three, and Four Jointly Normal Random Variables· 3.17
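\[ E(\tilde n_1 \tilde n_2^2) = K \int_{-\infty}^{\infty} dn_1 \int_{-\infty}^{\infty} dn_2 \, (n_1 n_2^2) \exp\!\Big( - \sum_{j=1}^{2} \sum_{k=1}^{2} \alpha_{jk} n_j n_k \Big) . \tag{3.35a} \]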
∞ ∞
or
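\[ E(\tilde n_1 \tilde n_2^2) = -K \int_{-\infty}^{\infty} du_1 \int_{-\infty}^{\infty} du_2 \, (u_1 u_2^2) \exp\!\Big( - \sum_{j=1}^{2} \sum_{k=1}^{2} \alpha_{jk} u_j u_k \Big) . \tag{3.35b} \]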
E (n1n22 ) = −E (n1n22 ) ,
so using the same reasoning as before, that only zero can be equal to (−1) times itself, we get
E (n1n22 ) = 0 . (3.35c)
Hence, Eq. (3.34d) still holds true when any two of the jointly normal, zero-mean random
variables n1 , n2 , n3 are perfectly correlated.
When all three of these random variables are perfectly correlated, there is really just one zero-
mean random variable n1 obeying the normal probability distribution [see Eq. (3.6a) above],
\[ p_{\tilde n_1}(n_1) = \frac{1}{\sigma_{n_1}\sqrt{2\pi}} \, e^{-n_1^2 / (2\sigma_{n_1}^2)} . \]
The left-hand side of (3.34d) now becomes E (n13 ) , which satisfies the formula
\[ E(\tilde n_1^3) = \frac{1}{\sigma_{n_1}\sqrt{2\pi}} \int_{-\infty}^{\infty} n_1^3 \, e^{-n_1^2 / (2\sigma_{n_1}^2)} \, dn_1 . \tag{3.36a} \]
Since this is the integral between −∞ and +∞ of an odd function, it must be zero [see Eq.
(2.17) in Chapter 2]. Consequently,
E (n13 ) = 0 (3.36b)
for any zero-mean, normally distributed random variable n1 . We conclude that Eq. (3.34d) holds
for any three strictly normal and zero-mean random variables even if they are not distinct.
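The vanishing of these third-order averages is easy to probe numerically. The sketch below is only an illustration under stated assumptions: it builds three correlated, zero-mean, jointly normal values as fixed linear combinations of independent standard normal draws (the mixing coefficients are arbitrary), then estimates the averages of n1 n2 n3, n1 n2², and n1³.

```python
import random

def triple_moments(trials=100_000, seed=0):
    """Monte Carlo estimates of E(n1 n2 n3), E(n1 n2^2), and E(n1^3)."""
    rng = random.Random(seed)
    sum_123 = sum_122 = sum_111 = 0.0
    for _ in range(trials):
        g1, g2, g3 = (rng.gauss(0.0, 1.0) for _ in range(3))
        # Linear combinations of independent normals are jointly normal
        # and zero mean; the coefficients are hypothetical.
        n1 = g1
        n2 = 0.6 * g1 + 0.8 * g2
        n3 = 0.5 * g1 - 0.5 * g2 + 0.7 * g3
        sum_123 += n1 * n2 * n3
        sum_122 += n1 * n2 * n2
        sum_111 += n1 ** 3
    return sum_123 / trials, sum_122 / trials, sum_111 / trials

m123, m122, m111 = triple_moments()
print(m123, m122, m111)
```

All three estimates should be close to zero, with the residual shrinking roughly as one over the square root of `trials`.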
To construct a formula for E (n1n2 n3n4 ) for four zero-mean, jointly normal random variables
n1 , n2 , n3 , n4 , we construct a new random variable,
\[ \tilde w = \omega_1 \tilde n_1 + \omega_2 \tilde n_2 + \omega_3 \tilde n_3 + \omega_4 \tilde n_4 = \sum_{j=1}^{4} \omega_j \tilde n_j . \tag{3.37a} \]
There is no requirement that n1 , n2 , n3 , and n4 be distinct random variables, but we do assume
that the real parameters ω1, ω2, ω3, and ω4 can independently take on any value between −∞
and +∞. Since n1, n2, n3, and n4 are jointly normal, w is also a normal variable.39 Using the
linearity of the expectation operator with respect to random variables (see Sec. 3.10 above) and
remembering that n1 , n2 , n3 , and n4 are zero mean, we have
\[ E(\tilde w) = E\Big( \sum_{j=1}^{4} \omega_j \tilde n_j \Big) = \sum_{j=1}^{4} \omega_j E(\tilde n_j) = 0 , \tag{3.37b} \]
showing that w is also zero-mean. For future use we note, applying (3.37b) to Eq. (3.8e), that the
variance of w is
\[ v_w = E(\tilde w^2) = E\bigg( \Big( \sum_{j=1}^{4} \omega_j \tilde n_j \Big) \Big( \sum_{k=1}^{4} \omega_k \tilde n_k \Big) \bigg) = \sum_{j=1}^{4} \sum_{k=1}^{4} \omega_j \omega_k E(\tilde n_j \tilde n_k) , \]
which can also be written as, recognizing that [according to Eq. (3.5c)] the variance vw is the
square of the standard deviation σ w of w ,
\[ \sigma_w^2 = \sum_{j=1}^{4} \sum_{k=1}^{4} \omega_j \omega_k E(\tilde n_j \tilde n_k) . \tag{3.37c} \]
39
This analysis is an expanded version of a treatment given in Athanasios Papoulis, Probability, Random Variables,
and Stochastic Processes, pp. 197–198.
\[ E(e^{-2\pi i \nu \tilde w}) = \int_{-\infty}^{\infty} p_{\tilde w}(w) \, e^{-2\pi i \nu w} \, dw , \]
where pw ( w) is the probability density distribution of random variable w . Since w obeys a
zero-mean normal distribution [defined in Eq. (3.6a)], this becomes
\[ E(e^{-2\pi i \nu \tilde w}) = \frac{1}{\sigma_w \sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-w^2 / (2\sigma_w^2)} \, e^{-2\pi i \nu w} \, dw . \tag{3.38a} \]
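\[ E(e^{-2\pi i \nu \tilde w}) = \frac{1}{\sigma_w \sqrt{2\pi}} \int_{-\infty}^{\infty} \cos(2\pi\nu w) \, e^{-w^2 / (2\sigma_w^2)} \, dw - \frac{i}{\sigma_w \sqrt{2\pi}} \int_{-\infty}^{\infty} \sin(2\pi\nu w) \, e^{-w^2 / (2\sigma_w^2)} \, dw . \]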
When we replace w by − w in
\[ Y(w) = \sin(2\pi\nu w) \, e^{-w^2 / (2\sigma_w^2)} , \]
we see that
\[ Y(-w) = \sin(-2\pi\nu w) \, e^{-(-w)^2 / (2\sigma_w^2)} = -\sin(2\pi\nu w) \, e^{-w^2 / (2\sigma_w^2)} = -Y(w) , \]
showing that Y is an odd function. Hence, according to Eq. (2.17) in Chapter 2, its integral
between −∞ and +∞ is zero. The formula for E(e^{−2πiνw}) must then reduce to
\[ E(e^{-2\pi i \nu \tilde w}) = \frac{1}{\sigma_w \sqrt{2\pi}} \int_{-\infty}^{\infty} \cos(2\pi\nu w) \, e^{-w^2 / (2\sigma_w^2)} \, dw . \tag{3.38b} \]
A table of integrals40 shows that, for any two real parameters a and b,
40
Formula 679 of the Handbook of Chemistry and Physics, edited by Robert C. Weast, 51st ed. (The Chemical
Rubber Company, Cleveland, OH, 1970–1971), p. A-215.
\[ \int_0^{\infty} e^{-a^2 x^2} \cos(bx) \, dx = \frac{\sqrt{\pi}}{2a} \, e^{-b^2 / (4a^2)} . \]
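Setting
\[ Z(x) = \cos(bx) \, e^{-a^2 x^2} , \]
we see that Z(−x) = Z(x), so Z is an even function and its integral between −∞ and +∞ is twice
its integral between 0 and +∞. Hence,
\[ \int_{-\infty}^{\infty} e^{-a^2 x^2} \cos(bx) \, dx = \frac{\sqrt{\pi}}{a} \, e^{-b^2 / (4a^2)} . \tag{3.38c} \]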
Applying formula (3.38c) to Eq. (3.38b) by specifying that a = 1/(σw√2) and b = 2πν, we get
\[ E(e^{-2\pi i \nu \tilde w}) = e^{-2\pi^2 \nu^2 \sigma_w^2} . \tag{3.38d} \]
Equation (3.38d) holds true for any value of ν ; in particular, when ν = (2π ) −1 , it must still be
true:
\[ E(e^{-i \tilde w}) = e^{-\sigma_w^2 / 2} . \tag{3.38e} \]
Formula (3.38e) applies to any zero-mean, normal random variable, which means it applies to w
for any set of ω 1 , ω 2 , ω 3 , ω 4 values in Eq. (3.37a) above.
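As a sanity check on Eqs. (3.38b), (3.38d), and (3.38e), the cosine integral can be evaluated numerically and compared with the closed form. This is a minimal sketch: the midpoint rule, the integration range of twelve standard deviations, and the function name `characteristic_function` are assumptions made here for illustration.

```python
import math

def characteristic_function(nu, sigma, half_width=12.0, steps=20_000):
    """Midpoint-rule estimate of the integral in Eq. (3.38b)."""
    lower = -half_width * sigma
    dw = 2.0 * half_width * sigma / steps
    total = 0.0
    for k in range(steps):
        w = lower + (k + 0.5) * dw
        total += math.cos(2.0 * math.pi * nu * w) * math.exp(-w * w / (2.0 * sigma * sigma))
    return total * dw / (sigma * math.sqrt(2.0 * math.pi))

sigma = 1.3
# General case, compared with exp(-2 pi^2 nu^2 sigma^2) from Eq. (3.38d).
numeric_general = characteristic_function(0.2, sigma)
closed_general = math.exp(-2.0 * math.pi ** 2 * 0.2 ** 2 * sigma ** 2)
# Special case nu = 1/(2 pi), compared with exp(-sigma^2 / 2) from Eq. (3.38e).
numeric_special = characteristic_function(1.0 / (2.0 * math.pi), sigma)
closed_special = math.exp(-sigma ** 2 / 2.0)
print(numeric_general, closed_general, numeric_special, closed_special)
```

The numerical and closed-form values should agree to many decimal places for any reasonable choice of `nu` and `sigma`.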
We can expand the left-hand side of (3.38e) in powers of w to get, using the linearity of the
expectation operator with respect to random variables (see Sec. 3.10 above),
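\[ E(e^{-i\tilde w}) = E\Big( 1 - i\tilde w - \frac{\tilde w^2}{2} + i\frac{\tilde w^3}{6} + \frac{\tilde w^4}{24} + \cdots \Big) = 1 - iE(\tilde w) - \frac{E(\tilde w^2)}{2} + i\frac{E(\tilde w^3)}{6} + \frac{E(\tilde w^4)}{24} + \cdots . \]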
According to Eqs. (3.37b) and (3.36b), both E ( w ) and E ( w 3 ) are zero [the discussion following
Eq. (3.37a) shows that w like n1 is a zero-mean, normally distributed random variable, which
means that it must satisfy both Eqs. (3.37b) and (3.36b)]. Hence, we can write, remembering that
E ( w 2 ) = σ w2 because σ w is the standard deviation of w and w is zero mean, that
\[ E(e^{-i\tilde w}) = 1 - \frac{\sigma_w^2}{2} + \frac{E(\tilde w^4)}{24} + \cdots . \tag{3.39a} \]
\[ e^{-\sigma_w^2 / 2} = 1 - \frac{\sigma_w^2}{2} + \frac{\sigma_w^4}{8} + \cdots . \tag{3.39b} \]
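\[ 1 - \frac{\sigma_w^2}{2} + \frac{E(\tilde w^4)}{24} + \cdots = 1 - \frac{\sigma_w^2}{2} + \frac{\sigma_w^4}{8} + \cdots \]
or
\[ \frac{E(\tilde w^4)}{24} + \cdots = \frac{\sigma_w^4}{8} + \cdots . \tag{3.39c} \]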
Equation (3.37c) reminds us that σ w2 is the weighted sum of ω jωk products, so for small ω it
follows that σ w2 is of order ω 2 . This means that σ w4 on the right-hand side of (3.39c) is of order
ω 4 . Similarly, Eq. (3.37a) reminds us that E ( w 4 ) on the left-hand side of (3.39c) is order ω 4
when the ω values are small. Formula (3.39c) must hold true for all values of ω 1 , ω 2 , ω 3 , and
ω 4 . If we choose ω 1 through ω 4 to be small, we must have
E ( w 4 ) = 3σ w4 . (3.39d)
If (3.39d) is false, then the higher powers of w and σw in (3.39c), which are represented by
“+⋯” on both sides of the formula, cannot make (3.39c) hold true because these +⋯ terms
contain only order ω 6 and higher powers of ω 1 through ω 4 , making them too small to rescue
the equality.
The next step is to expand E ( w 4 ) . Raising w to the fourth power in (3.37a) gives
Paying attention only to those terms whose coefficients are proportional to ω1ω2ω3ω4 , we have
Formula (3.37c) gives, again concentrating only on terms whose coefficients are proportional to
ω1ω2ω3ω4 ,
σ w4 = [ω12E (n12 ) + ω1ω2E (n1n2 ) + ω1ω3E (n1n3 ) + ω1ω4E (n1n4 )
+ ω2ω1E (n2 n1 ) + ω22E (n22 ) + ω2ω3E (n2 n3 ) + ω2ω4E (n2 n4 )
+ ω3ω1E (n3 n1 ) + ω3ω2E (n3 n2 ) + ω32E (n32 ) + ω3ω4E (n3n4 )
+ ω4ω1E (n4 n1 ) + ω4ω2E (n4 n2 ) + ω4ω3E (n4 n3 ) + ω42E (n42 )]2 ,
which becomes
which simplifies to, using the linearity of the expectation operator (see Sec. 3.10),
This must hold true for any combination of ω 1 , ω 2 , ω 3 , and ω 4 values, large or small, so the
coefficients of all the ω1ω2ω3ω4 terms must be the same on both sides of this equation. Therefore,
E (n1n2 n3n4 ) = E (n1n2 )E (n3n4 ) + E (n1n3 )E (n2 n4 ) + E (n2 n3 )E (n1n4 ) (3.40c)
for any collection of zero-mean, jointly normal random variables n1 , n2 , n3 , and n4 .
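Equation (3.40c) can be checked by simulation. The sketch below is a hedged illustration: the mixing matrix `M`, which turns four independent standard normal draws into four correlated zero-mean jointly normal values, is an arbitrary assumption, as are the helper names. The exact pairwise averages follow from `M`, and the fourth-order average is estimated by Monte Carlo.

```python
import random

# Hypothetical mixing matrix: n = M g, with g four independent standard normals.
M = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.8, 0.0, 0.0],
    [0.3, -0.4, 0.6, 0.0],
    [0.2, 0.1, -0.5, 0.7],
]

def covariance(i, j):
    """Exact E(n_i n_j) implied by the mixing matrix."""
    return sum(M[i][k] * M[j][k] for k in range(4))

def fourth_moment(trials=100_000, seed=0):
    """Monte Carlo estimate of E(n1 n2 n3 n4)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        g = [rng.gauss(0.0, 1.0) for _ in range(4)]
        n = [sum(M[i][k] * g[k] for k in range(4)) for i in range(4)]
        total += n[0] * n[1] * n[2] * n[3]
    return total / trials

estimate = fourth_moment()
# Right-hand side of Eq. (3.40c): the three ways of pairing four variables.
pair_sum = (covariance(0, 1) * covariance(2, 3)
            + covariance(0, 2) * covariance(1, 3)
            + covariance(1, 2) * covariance(0, 3))
print(estimate, pair_sum)
```

The Monte Carlo estimate should match the sum of pair products to within sampling error.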
Equation (3.40c) requires ω 1 through ω 4 to be distinct real parameters, but it does not
require the n1 , n2 , n3 , and n4 random variables to be distinct. Consequently, if n1 and n2 are the
same, we can relabel the jointly normal random variables using
Similarly, if n3 and n4 are also identical, we can relabel n1 through n4 as
When all four random variables are the same, Eq. (3.40c) collapses to
which holds true for any zero-mean random variable ñ obeying a normal distribution.
µn (t ) = E ( n (t ) ) .
41
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.
\[ \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, dt \]
and take the limit as T → ∞ . Since “ergodic” refers to using time averages to calculate ensemble
averages, we might expect that a random function that is ergodic in the mean would satisfy the
equation
\[ \mu_n(t) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, dt . \tag{3.42a} \]
There are two problems with Eq. (3.42a). The first is that µn (t ) is allowed to be a function of time
t, whereas
\[ \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, dt \]
is not. This means Eq. (3.42a) can only be true when µn (t ) does not depend on time.
Consequently, for ñ to be ergodic in the mean, we must also require ñ to be stationary in the mean
with [see Eq. (3.30a) above]
The second problem is more difficult to deal with. We note that the value of
\[ \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, dt \]
must be a random value because it is proportional to the integral of a random function. Hence, we
expect
\[ \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, dt \]
also to be a random value. This means Eq. (3.42b) sets a random value equal to µn , a nonrandom
Ergodic Random Functions · 3.18
value, which is in general not allowed. The way out of this impasse is to put a restriction on the
limiting process used to get the right-hand side of (3.42b). Clearly,
\[ \tilde\xi(T) = \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, dt \tag{3.42c} \]
is a random function of T. This means there must be a probability density distribution pξ (T ) (ξ )
such that pξ (T ) (ξ ) d ξ is the probability that ξ (T ) takes on a value between ξ and ξ + dξ . We
now require the limiting random variable
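\[ \tilde\xi_\infty = \lim_{T \to \infty} \tilde\xi(T) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, dt \tag{3.42d} \]
to obey the probability density distribution
\[ p_{\tilde\xi_\infty}(\xi_\infty) = \delta(\xi_\infty - \mu_n) . \tag{3.42e} \]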
According to the discussion following Eqs. (3.7e) and (3.7f) above, this turns ξ∞ into a random
variable that behaves like a constant, since
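\[ E(\tilde\xi_\infty) = \int_{-\infty}^{\infty} \delta(\xi_\infty - \mu_n) \cdot \xi_\infty \cdot d\xi_\infty = \mu_n \]
and
\[ E\big( (\tilde\xi_\infty - \mu_n)^2 \big) = \int_{-\infty}^{\infty} \delta(\xi_\infty - \mu_n) \cdot (\xi_\infty - \mu_n)^2 \cdot d\xi_\infty = 0 . \]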
−∞
Now we can note that, yes, strictly speaking, Eq. (3.42b) does equate a random variable to a
nonrandom variable, but this does not matter because Eq. (3.42e) makes the random variable
\[ \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, dt \]
Once again we face the same two problems: the left-hand side of this equation is allowed to be a
function of t1 whereas the right-hand side is not, and the left-hand side of this equation is
nonrandom whereas the right-hand side is random.
Dealing with the t1 problem first, we again say that
E ( n (t1 ) n (t1 + τ ) )
does not depend on t1 , making ñ(t) stationary with respect to its autocorrelation function. Now
Eq. (3.43a) can be written as
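\[ E\big( \tilde n(t_1) \, \tilde n(t_1 + \tau) \big) = R_{\tilde n \tilde n}(\tau) = \lim_{T \to \infty} \Big( \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, \tilde n(t + \tau) \, dt \Big) . \tag{3.43b} \]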
Both in Eqs. (3.42a) and (3.42b) describing what it means to be ergodic in the mean, and in Eqs.
(3.43a) and (3.43b) describing what it means to be ergodic in the autocorrelation function, the
time dependence that ensemble averaging preserves is lost in the time average. This is clearly
going to happen whenever some sort of ensemble average is set equal to the corresponding time
average. We conclude that when a random function is ergodic in some way, it must also be
stationary in that same way. In this sense, ergodic random functions are always stationary.43
Moving on to the second problem with Eq. (3.43a)—that of equating random and nonrandom
quantities—we follow the same procedure as before. This time the random function ξ is defined
to be
\[ \tilde\xi(T, \tau) = \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, \tilde n(t + \tau) \, dt \tag{3.44a} \]
42
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.
43
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.
Associated with ξ∞(τ) is the probability density distribution pξ∞(τ) such that pξ∞(τ)(ξ∞) dξ∞ is the
probability that ξ∞(τ) has a value between ξ∞ and ξ∞ + dξ∞. We again require
This shows, according to the discussion following Eqs. (3.7e) and (3.7f), that the random variable
ξ∞(τ) behaves like a nonrandom quantity. We have now solved the second problem with Eq.
(3.43a) and therefore can make sense of the idea that a random function can be ergodic in the
autocorrelation function.
The pattern used in analyzing the ergodic qualities of a random function ñ(t) has by now been
set. There is some mathematically useful and reasonable function f that has N arguments. We pick
N time values t1 , t2 ,…, t N and calculate an ensemble expectation value or average
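\[ \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde n(t), \tilde n(t + \tau_2), \tilde n(t + \tau_3), \ldots, \tilde n(t + \tau_N) \big) \, dt . \]
We define
\[ \tau_2 = t_2 - t_1 , \quad \tau_3 = t_3 - t_1 , \quad \ldots , \quad \tau_N = t_N - t_1 \]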
and set the expectation value equal to the time average by writing
cannot be a function of t1. This means the right-hand side of this relationship still has the same
value when t1 is increased by any time value τ ; hence we can write, increasing t1 by τ only on
the right-hand side,
Remembering that
τ 2 = t2 − t1 , τ 3 = t3 − t1 , …, τ N = t N − t1 ,
This is the same as Eq. (3.24c) above. We conclude that Eq. (3.24c) must be true whenever Eq.
(3.45a) is true. According to the discussion following Eq. (3.24c), whenever Eq. (3.45a) is true,
the expectation value
E ( f ( n (t1 ), n (t2 ),… , n (t N ) ) )
τ 2 = t2 − t1 , τ 3 = t3 − t1 , …, τ N = t N − t1 .
Consequently, the expectation values and the time integral in Eq. (3.45a) have the same number
\[ S(\tau_2, \tau_3, \ldots, \tau_N) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde n(t), \tilde n(t + \tau_2), \tilde n(t + \tau_3), \ldots, \tilde n(t + \tau_N) \big) \, dt , \tag{3.45c} \]
where
S (τ 2 ,τ 3 ,… ,τ N ) = E ( f ( n (t1 ), n (t2 ),… , n (t N ) ) ) . (3.45d)
Equation (3.45a) needs to have one more requirement imposed on it—the random quantity on the
right-hand side must be equivalent to the nonrandom quantity on the left. This means the random
quantity
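\[ \tilde\xi_\infty(\tau_2, \tau_3, \ldots, \tau_N) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f\big( \tilde n(t), \tilde n(t + \tau_2), \tilde n(t + \tau_3), \ldots, \tilde n(t + \tau_N) \big) \, dt \tag{3.45e} \]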
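\[ E\big( \tilde\xi_\infty(\tau_2, \tau_3, \ldots, \tau_N) \big) = S(\tau_2, \tau_3, \ldots, \tau_N) \tag{3.45f} \]
and
\[ E\Big( \big[ \tilde\xi_\infty(\tau_2, \tau_3, \ldots, \tau_N) - S(\tau_2, \tau_3, \ldots, \tau_N) \big]^2 \Big) = 0 . \tag{3.45g} \]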
Now, by requiring Eqs. (3.45b)–(3.45g) to hold true, we can be sure that Eq. (3.45a) is
mathematically self-consistent.
It is not difficult to relate this mathematical machinery to the analysis of what it means to say
that ñ(t) is ergodic in the mean or ergodic in the autocorrelation function. When specifying what
it means to say that ñ(t) is ergodic in the mean, we take N = 1 and define function f to be
f ( x) = x ; and when specifying what it means to say that ñ(t) is ergodic in the autocorrelation
function, we take N = 2 and define function f to be f ( x, y ) = xy . To give another example of
how to use Eqs. (3.45a)–(3.45g), we examine an often encountered type of ergodicity called
“ergodic in the variance.”44 We define ergodic in the variance for a random function ñ(t) by
setting N = 1 and f ( x) = ( x − µn ) 2 , with µn in function f being the stationary mean of ñ,
E ( n (t ) ) = µn ,
specified by Eq. (3.30a) above. When a random function ñ(t) is ergodic in the variance, Eq.
44
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.
(3.45a) becomes
\[ E\big( [\tilde n(t) - \mu_n]^2 \big) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} [\tilde n(t) - \mu_n]^2 \, dt . \tag{3.46a} \]
\[ E\big( [\tilde n(t) - \mu_n]^2 \big) = E\big( [\tilde n(t + \tau) - \mu_n]^2 \big) \tag{3.46b} \]
\[ E\big( [\tilde n(t) - \mu_n]^2 \big) = v_n = \text{nonrandom variable independent of time.} \tag{3.46c} \]
Here, we write vn instead of vn ( t ) for the variance of ñ(t) to emphasize that vn does not depend
on time. Equation (3.46c) can be interpreted as saying that ñ is stationary with respect to its
variance vn . We note that variance vn is equivalent to S in Eq. (3.45d), so Eqs. (3.45e), (3.45f),
and (3.45g) now reduce to
\[ \tilde\xi_\infty = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} [\tilde n(t) - \mu_n]^2 \, dt , \tag{3.46d} \]
A random function ñ(t) is called weakly ergodic if it is ergodic in the mean, ergodic in the
variance, and ergodic in the autocorrelation function.45 It is called strongly ergodic if Eqs.
(3.45a)–(3.45g) are satisfied for all N = 1, 2,… , ∞ and for any reasonable choice of function f.
This is equivalent to requiring that all reasonable ensemble averages of the random function ñ(t)
be equal to their corresponding time averages.
The distinction made between weakly ergodic and strongly ergodic is reminiscent of the
distinction made between wide-sense stationary and strict-sense stationary. Just as all strict-sense
stationary random functions are also wide-sense stationary, but not all wide-sense stationary
random functions are strict-sense stationary, so too are all strongly ergodic random functions also
weakly ergodic, but not all weakly ergodic random functions are strongly ergodic. The Gaussian
random processes discussed in Sec. 3.16 above are an important special case. We have already
45
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 82.
said that when Gaussian random processes are wide-sense stationary they must also be strict-
sense stationary; it can also be shown that whenever Gaussian random processes are weakly
ergodic they must also be strongly ergodic.46
Although we have seen that all ergodic random functions are also stationary, it is easy to show
that not all stationary random functions are ergodic. The random function
n (t ) = c , (3.47a)
where c is a random constant chosen from a probability density distribution pc (c ) , is clearly
strict-sense stationary. To see why this is so, we just observe that Eq. (3.24c) is automatically
satisfied, since
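\[
\begin{aligned}
E\big( f(\tilde n(t_1), \tilde n(t_2), \ldots, \tilde n(t_N)) \big) &= \int_{-\infty}^{\infty} p_{\tilde c}(c) \, f(c, c, \ldots, c) \, dc \\
&= E\big( f(\tilde c, \tilde c, \ldots, \tilde c) \big) = E\big( f(\tilde n(t_1 + \tau), \tilde n(t_2 + \tau), \ldots, \tilde n(t_N + \tau)) \big)
\end{aligned}
\tag{3.47b}
\]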
for any value of τ and any integrable function f with N = 1, 2,… , ∞ arguments. On the other
hand, n (t ) = c cannot be ergodic because once a value for c is chosen from the ensemble, it must
stay the same for all time values. Looking at even the simplest type of ergodicity, ergodicity in
the mean, we get from Eq. (3.42d)
\[ \tilde\xi_\infty = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \tilde n(t) \, dt = \lim_{T \to \infty} \Big( \frac{1}{2T} \cdot (2T\tilde c) \Big) = \tilde c . \tag{3.47c} \]
Hence, the probability density distribution of ξ∞ is the same as the probability density
distribution pc , which, unless pc is a delta function, violates requirement (3.42e) for ergodic in
the mean.
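The contrast between a stationary but nonergodic function and one that is ergodic in the mean can be seen in a small numerical sketch. Two assumptions are made purely for illustration: the random constant c is drawn from a unit-variance Gaussian, and the comparison case is a random-phase sinusoid sin(ωt + φ), whose ensemble mean is zero. The helper name `time_average` and the values of `T` and `omega` are hypothetical.

```python
import math
import random

def time_average(member, T, steps=20_000):
    """Midpoint-rule estimate of (1/2T) times the integral of member(t) from -T to T."""
    dt = 2.0 * T / steps
    total = sum(member(-T + (k + 0.5) * dt) for k in range(steps))
    return total * dt / (2.0 * T)

rng = random.Random(0)
T = 500.0
omega = 2.0

constant_averages = []   # pairs (c, time average) for n(t) = c
sinusoid_averages = []   # time averages for n(t) = sin(wt + phi)
for _ in range(5):
    c = rng.gauss(0.0, 1.0)                 # one member of the n(t) = c ensemble
    phi = rng.uniform(0.0, 2.0 * math.pi)   # one member of the random-phase ensemble
    constant_averages.append((c, time_average(lambda t, c=c: c, T)))
    sinusoid_averages.append(
        time_average(lambda t, phi=phi: math.sin(omega * t + phi), T))

print(constant_averages)
print(sinusoid_averages)
```

Each time average of the constant function simply returns that member's own c, so the averages scatter as widely as c does. Every time average of the sinusoid lands near zero, its ensemble mean.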
46
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 83.
functions by running the experiment many different times. This is, of course, unlikely to happen;
there is usually not much incentive to do the same experiment over and over in exactly the same
way, because the point of most experiments is to measure a signal, not the noise associated with
it. Sometimes repeating an experiment is literally impossible. If, for example, stock-market prices
are treated as random functions of time, there is no way to repeat last year to see what happens
this time around. Consequently, when examining random functions of time, there is usually only
one, or at best a few, member functions of the ensemble to examine. In practice, then, most
experimental statisticians are forced to assume that their random functions are ergodic as well as
stationary; otherwise, they cannot calculate the ensemble averages needed for their analysis.
Another point worth making about stationarity and ergodicity is that, strictly speaking, no
experimental data can be truly stationary or truly ergodic in even the weakest sense, because
before an experiment begins or after an experiment ends the random function representing the
noise must be strictly zero. One way of handling this is to regard the noise data as a finite-length
sample of some random function stretching between t = í and t = +, but we should also
acknowledge that stationarity and ergodicity are ideals that experimental noise can only realize to
some degree of approximation. Just as, in Sec. 3.5 above, many pairs of independent random
variables turn out after all to depend slightly on each other, so too do many recordings of
experimental noise turn out, after close analysis, to be stationary and ergodic only to some degree
of approximation.
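$$ R_{\tilde n\tilde n}(t_2 - t_1) = E\big( \tilde n(t_1)\, \tilde n(t_2) \big) \tag{3.48a} $$
$$ R_{\tilde n\tilde n}(t_2 - t_1) = R_{\tilde n\tilde n}(t_1 - t_2) $$
or, setting $\tau = t_2 - t_1$,
$$ R_{\tilde n\tilde n}(\tau) = R_{\tilde n\tilde n}(-\tau) , \tag{3.48b} $$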
the single real parameter τ , we can set up the one-dimensional Fourier transform of Rnn , getting
The Power Spectrum · 3.20
$$ S_{\tilde n\tilde n}(f) = \int_{-\infty}^{\infty} R_{\tilde n\tilde n}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \tag{3.48c} $$
spectrum47,48 of the random function ñ(t). Over the next few sections of this chapter, we examine
the properties of $S_{\tilde n\tilde n}$, showing as we go along why it makes sense to call it the power spectrum.
Functions ñ that have power spectra must be wide-sense stationary because we are assuming that the autocorrelation $R_{\tilde n\tilde n}$ is a function with only a single real argument. Given that $S_{\tilde n\tilde n}$ exists, we can always reverse the transform in Eq. (3.48c) and write the autocorrelation function of ñ as the inverse Fourier transform of the power spectrum,
$$ R_{\tilde n\tilde n}(\tau) = \int_{-\infty}^{\infty} S_{\tilde n\tilde n}(f)\, e^{2\pi i f \tau}\, df . \tag{3.48d} $$
When two random functions nα (t ) and nβ (t ) are jointly wide-sense stationary, as defined in the
discussion following Eq. (3.30c), we can define their cross-power spectrum to be
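$$ S_{\tilde n_\alpha \tilde n_\beta}(f) = \int_{-\infty}^{\infty} R_{\tilde n_\alpha \tilde n_\beta}(\tau)\, e^{-2\pi i f \tau}\, d\tau , \tag{3.48e} $$
where
$$ R_{\tilde n_\alpha \tilde n_\beta}(t_2 - t_1) = E\big( \tilde n_\alpha(t_1)\, \tilde n_\beta(t_2) \big) $$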
transform, the power spectrum S nn , is the Fourier transform of a real and even function. Because
the Fourier transform of a real and even function is always another real and even function,49 it
follows that $S_{\tilde n\tilde n}$ is also real and even:
$$ \mathrm{Im}\big( S_{\tilde n\tilde n}(f) \big) = 0 \tag{3.49a} $$
and
$$ S_{\tilde n\tilde n}(-f) = S_{\tilde n\tilde n}(f) . \tag{3.49b} $$
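As a numerical illustration of Eqs. (3.49a) and (3.49b), the following sketch applies the transform in Eq. (3.48c) to a sample autocorrelation. The choice $R(\tau) = e^{-|\tau|}$ is an assumption made only for this example; its transform is known in closed form, $2/(1 + (2\pi f)^2)$, which gives something to compare against. The function name is hypothetical.

```python
import numpy as np

def power_spectrum_from_autocorrelation(freqs, tau, r_tau):
    """Approximate S(f) = integral of R(tau) exp(-2 pi i f tau) dtau by a Riemann sum."""
    dtau = tau[1] - tau[0]
    kernel = np.exp(-2j * np.pi * np.outer(freqs, tau))
    return (kernel @ r_tau) * dtau

# Symmetric lag grid, so the sampled R(tau) is exactly even.
tau = np.arange(-10000, 10001) * 0.002
r_tau = np.exp(-np.abs(tau))            # assumed example autocorrelation
freqs = np.linspace(-2.0, 2.0, 41)

s_f = power_spectrum_from_autocorrelation(freqs, tau, r_tau)
exact = 2.0 / (1.0 + (2.0 * np.pi * freqs) ** 2)

print("largest imaginary part:", np.max(np.abs(s_f.imag)))
print("largest deviation from closed form:", np.max(np.abs(s_f.real - exact)))
```

The imaginary part vanishes to rounding error and the real part is symmetric in f, as the real-and-even argument predicts.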
47
Paul H. Wirsching et al., Random Vibrations: Theory and Practice, p. 124.
48
Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, p. 319.
49
See entry 1 of Table 2.1 in Chapter 2.
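We note in passing that the cross-power spectrum $S_{\tilde n_\alpha \tilde n_\beta}$ in (3.48e) is not necessarily a real-valued function. It is, however, the Fourier transform of a real-valued function $R_{\tilde n_\alpha \tilde n_\beta}$, so it must be Hermitian,50
$$ S_{\tilde n_\alpha \tilde n_\beta}(-f) = S_{\tilde n_\alpha \tilde n_\beta}(f)^* . $$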
Equation (3.49a) shows that S nn behaves like a power spectrum by being strictly real; Eq.
(3.49b) shows that S nn is double-sided, having the same value at +f and –f. The next step is to
show that S nn behaves like a power spectrum by being non-negative for all values of f, but that
has to wait until we examine what happens to S nn when a wide-sense stationary random function
$$ v(t) = h(t) * u(t) = u(t) * h(t) . $$
According to the definition of convolution in Chapter 2 [see Eq. (2.38a)], this can be written as
$$ v(t) = \int_{-\infty}^{\infty} h(\tau')\, u(t - \tau')\, d\tau' . $$
When a random function ñ(t) is the input to a linear system characterized by an impulse-response function h(t), the output is another random function $\tilde m(t)$ given by
$$ \tilde m(t) = \int_{-\infty}^{\infty} h(\tau')\, \tilde n(t - \tau')\, d\tau' . \tag{3.50a} $$
50
See entry 7 of Table 2.1 in Chapter 2.
Random Inputs and Outputs of Linear Systems · 3.21
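$$ R_{\tilde m\tilde n}(t_1, t_2) = E\big( \tilde m(t_1)\, \tilde n(t_2) \big) . \tag{3.50b} $$
Function $R_{\tilde m\tilde n}(t_1, t_2)$ is called the cross-correlation function of $\tilde m$ and ñ. Substitution of (3.50a) gives
$$ R_{\tilde m\tilde n}(t_1, t_2) = E\left( \tilde n(t_2) \left[ \int_{-\infty}^{\infty} h(\tau')\, \tilde n(t_1 - \tau')\, d\tau' \right] \right) = E\left( \int_{-\infty}^{\infty} h(\tau')\, \tilde n(t_2)\, \tilde n(t_1 - \tau')\, d\tau' \right) . $$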
Using Eq. (3.17c) to move the expectation operator inside the integral, and using (3.16a) to put h
outside the expectation operator because it is a nonrandom quantity, we get
$$ R_{\tilde m\tilde n}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, E\big( \tilde n(t_2)\, \tilde n(t_1 - \tau') \big)\, d\tau' . $$
so that
$$ R_{\tilde m\tilde n}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde n\tilde n}(t_1 - t_2 - \tau')\, d\tau' . \tag{3.50c} $$
This shows that Rmn depends only on the difference between t1 and t2 . Nothing then stops us
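$$ R_{\tilde m\tilde n}(\tau) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde n\tilde n}(-\tau - \tau')\, d\tau' $$
$$ R_{\tilde m\tilde n}(\tau) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde n\tilde n}(\tau + \tau')\, d\tau' . $$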
51
This derivation comes from Athanasios Papoulis, Probability, Random Variables, and Stochastic Processes, pp.
323–324.
$$ R_{\tilde m\tilde n}(\tau) = \int_{-\infty}^{\infty} h(-\tau'')\, R_{\tilde n\tilde n}(\tau - \tau'')\, d\tau'' = h(-\tau) * R_{\tilde n\tilde n}(\tau) . \tag{3.50d} $$
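Equation (3.50d) has a simple discrete counterpart that can be tested by simulation. The sketch below is illustrative only: it assumes unit-variance discrete white noise as the input, for which the autocorrelation is a unit spike at zero lag, so the cross-correlation $E(\tilde m(t_1)\tilde n(t_2))$ at lag $t_2 - t_1$ should reduce to $h$ evaluated at minus the lag. The three-tap impulse response and the function name are hypothetical.

```python
import numpy as np

def estimate_cross_correlation(h, lags, num_samples=200_000, seed=1):
    """Estimate R_mn(lag) = E(m(t1) n(t2)), lag = t2 - t1, for m = h convolved with n."""
    rng = np.random.default_rng(seed)
    n = rng.normal(size=num_samples)                    # unit-variance white noise
    m = np.convolve(n, h, mode="full")[:num_samples]    # m[t] = sum_k h[k] n[t - k]
    estimates = []
    for lag in lags:
        if lag >= 0:
            estimates.append(np.mean(m[:num_samples - lag] * n[lag:]))
        else:
            estimates.append(np.mean(m[-lag:] * n[:num_samples + lag]))
    return np.array(estimates)

h = np.array([1.0, 0.5, 0.25])          # hypothetical causal impulse response
lags = np.arange(-3, 4)
estimate = estimate_cross_correlation(h, lags)
# For white noise, R_mn(lag) = h(-lag): nonzero only for lags 0, -1, -2.
expected = np.array([0.0, 0.25, 0.5, 1.0, 0.0, 0.0, 0.0])
print(np.round(estimate, 3))
```

The estimate is nonzero only at non-positive lags because the output of a causal filter can correlate only with present and past input values.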
Equation (3.50a) can also be used to evaluate the autocorrelation function of the random
output m (t ) , giving
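$$ \begin{aligned} R_{\tilde m\tilde m}(t_1, t_2) = E\big( \tilde m(t_1)\, \tilde m(t_2) \big) &= E\left( \tilde m(t_1) \left[ \int_{-\infty}^{\infty} h(\tau')\, \tilde n(t_2 - \tau')\, d\tau' \right] \right) \\ &= E\left( \int_{-\infty}^{\infty} h(\tau')\, \tilde m(t_1)\, \tilde n(t_2 - \tau')\, d\tau' \right) . \end{aligned} $$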
Again moving the expectation operator inside the integral, we use Eq. (3.50b) to write
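$$ R_{\tilde m\tilde m}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, E\big( \tilde m(t_1)\, \tilde n(t_2 - \tau') \big)\, d\tau' = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde m\tilde n}(t_1, t_2 - \tau')\, d\tau' . $$
$$ R_{\tilde m\tilde m}(t_1, t_2) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde m\tilde n}(t_2 - t_1 - \tau')\, d\tau' . \tag{3.51a} $$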
This is an important result because it shows that the autocorrelation of the output random
function m depends only on τ = t2 − t1 . Substituting τ for (t2 − t1 ) gives
$$ R_{\tilde m\tilde m}(\tau) = \int_{-\infty}^{\infty} h(\tau')\, R_{\tilde m\tilde n}(\tau - \tau')\, d\tau' = h(\tau) * R_{\tilde m\tilde n}(\tau) . \tag{3.51b} $$
Glancing back at Eqs. (3.30a) and (3.30b) above, and having shown that the autocorrelation function $R_{\tilde m\tilde m}(t_1, t_2)$ depends only on $(t_2 - t_1)$, we realize that $\tilde m$ must be wide-sense stationary if $E(\tilde m(t))$ is time-independent and finite. Taking the expectation value of both sides of (3.50a) gives
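$$ \begin{aligned} E\big( \tilde m(t) \big) &= E\left( \int_{-\infty}^{\infty} h(\tau')\, \tilde n(t - \tau')\, d\tau' \right) = \int_{-\infty}^{\infty} h(\tau')\, E\big( \tilde n(t - \tau') \big)\, d\tau' \\ &= \mu_{\tilde n} \int_{-\infty}^{\infty} h(\tau')\, d\tau' , \end{aligned} \tag{3.51c} $$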
where we have again assumed that ñ(t) is wide-sense stationary so that, according to Eq. (3.30a),
$$ H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt , \tag{3.51d} $$
of the linear system. (The idea of a transfer function is discussed in greater detail below in
Appendix 5A of Chapter 5.) Therefore Eq. (3.51c) can also be written as
$$ E\big( \tilde m(t) \big) = \mu_{\tilde n}\, H(0) . \tag{3.51e} $$
This shows that when H(0), the zero-frequency value of the transfer function, is finite, so is
E ( m (t ) ) . We conclude that the output m (t ) of the linear system is wide-sense stationary when
the input ñ(t) is wide-sense stationary and the H(0) value of the transfer function is finite.
Because the H(f) transfer function is the Fourier transform of h(t), which is a strictly real
function, we can take the complex conjugate of both sides of Eq. (3.51d) to get
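$$ H(f)^* = \int_{-\infty}^{\infty} h(t)\, e^{2\pi i f t}\, dt = \int_{-\infty}^{\infty} h(-t')\, e^{-2\pi i f t'}\, dt' . \tag{3.52a} $$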
In the last step of (3.52a), we change the variable of integration to t ′ = −t . Equation (3.52a) can
also be written as, dropping the prime,
$$ H(f)^* = \int_{-\infty}^{\infty} h(-t)\, e^{-2\pi i f t}\, dt . \tag{3.52b} $$
Clearly, H ( f )∗ , the complex conjugate of the transfer function H(f), is the Fourier transform of
h(−t ) . Since H is the Fourier transform of a real function h, it must, according to entry 7 of Table
2.1 in Chapter 2, be Hermitian,
H (− f ) = H ( f )* . (3.52c)
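$$ S_{\tilde m\tilde n}(f) = \int_{-\infty}^{\infty} R_{\tilde m\tilde n}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \tag{3.53a} $$
$$ R_{\tilde m\tilde n}(\tau) = \int_{-\infty}^{\infty} S_{\tilde m\tilde n}(f)\, e^{2\pi i f \tau}\, df . \tag{3.53b} $$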
Applying the Fourier convolution theorem to Eq. (3.50d) above gives, according to Eq. (2.39a) in
Chapter 2,
This can be written as, using Eqs. (3.53a), (3.52b), and (3.48c),
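$$ S_{\tilde m\tilde n}(f) = H(f)^* \cdot S_{\tilde n\tilde n}(f) . \tag{3.53c} $$
$$ S_{\tilde m\tilde m}(f) = H(f)\, S_{\tilde m\tilde n}(f) , \tag{3.53d} $$
where
$$ S_{\tilde m\tilde m}(f) = \int_{-\infty}^{\infty} R_{\tilde m\tilde m}(\tau)\, e^{-2\pi i f \tau}\, d\tau \tag{3.53e} $$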
(3.53a) respectively. The Fourier transform in (3.53e) can, of course, be reversed to get
$$ R_{\tilde m\tilde m}(\tau) = \int_{-\infty}^{\infty} S_{\tilde m\tilde m}(f)\, e^{2\pi i f \tau}\, df . \tag{3.53f} $$
Substitution of (3.53c) into (3.53d) gives the result we have been working toward:
$$ S_{\tilde m\tilde m}(f) = |H(f)|^2\, S_{\tilde n\tilde n}(f) . \tag{3.53g} $$
This result shows that the power spectrum of the random input function ñ(t) gives, when
multiplied by the squared modulus of the transfer function, the power spectrum of the random
output function m (t ) of the linear system.
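The relation in Eq. (3.53g) can be illustrated with a small Monte Carlo experiment. This is a sketch under stated assumptions, not a general-purpose estimator: the input is discrete white noise with variance `sigma**2` sampled at spacing `dt` (so its double-sided spectrum is taken to be the flat level `sigma**2 * dt`), the impulse response `h` is a hypothetical three-tap filter, and the spectra are estimated by averaging squared magnitudes of finite-length transforms over an ensemble.

```python
import numpy as np

def averaged_periodogram(x, dt):
    """Ensemble average of |X_T(f)|^2 / (2T) over the rows of x, with 2T = N * dt."""
    num_samples = x.shape[1]
    x_f = np.fft.fft(x, axis=1) * dt        # Riemann-sum version of the truncated transform
    return np.mean(np.abs(x_f) ** 2, axis=0) / (num_samples * dt)

rng = np.random.default_rng(2)
dt, sigma = 1.0, 1.0
num_members, num_samples = 400, 1024
h = np.array([0.5, 0.3, 0.2])               # hypothetical impulse response

# Generate a few extra input samples so every output sample sees a full filter window.
n_full = rng.normal(scale=sigma, size=(num_members, num_samples + len(h) - 1))
m = np.array([np.convolve(row, h, mode="valid") for row in n_full])

s_mm = averaged_periodogram(m, dt)
H = np.fft.fft(h, n=num_samples)            # discrete transfer function on the FFT grid
predicted = np.abs(H) ** 2 * sigma ** 2 * dt
relative_error = np.abs(s_mm / predicted - 1.0)
print("mean relative error:", relative_error.mean())
```

With 400 ensemble members the estimated output spectrum scatters by a few percent around $|H(f)|^2$ times the input level; more members tighten the agreement.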
random function cannot be negative. To show how this is done, we set up a linear system that has
the transfer function
$$ H_B(f) = \begin{cases} -i & \text{for } f_1 \le f \le f_2 \\ i & \text{for } -f_2 \le f \le -f_1 \\ 0 & \text{for } |f| < f_1 \\ 0 & \text{for } |f| > f_2 \end{cases} , \tag{3.54a} $$
where f1 and f 2 are both non-negative frequencies. Function H B ( f ) is (−i ) when f lies
between f1 and f 2 and i when f lies between (− f1 ) and (− f 2 ) ; otherwise it is zero. The transfer
function H B satisfies
H B (− f ) = H B ( f )∗ , (3.54b)
which [see Eq. (3.52c)] makes it an acceptable transfer function because it is Hermitian. By
reversing the Fourier transform in (3.51d), we find that the impulse-response function for this
linear system must be the inverse Fourier transform of the transfer function,
$$ h_B(t) = \int_{-\infty}^{\infty} H_B(f)\, e^{2\pi i f t}\, df . $$
According to entry 7 in Table 2.1 of Chapter 2, since H B ( f ) is Hermitian, its inverse Fourier
transform hB (t ) must be real. We can take any random function ñ(t) that is wide-sense stationary
and run it through the H B linear system. Looking at the resulting output m (t ) , we know from the
discussion following Eq. (3.51e) that m (t ) must also be wide-sense stationary because H B (0) is
finite. This means that m has a well-defined autocorrelation function
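$$ R_{\tilde m\tilde m}(t_2 - t_1) = E\big( \tilde m(t_1)\, \tilde m(t_2) \big) $$
$$ R_{\tilde m\tilde m}(0) = E\big( \tilde m(t_1)^2 \big) \ge 0 . \tag{3.54c} $$
$$ R_{\tilde m\tilde m}(0) = \int_{-\infty}^{\infty} S_{\tilde m\tilde m}(f)\, df . \tag{3.54d} $$
$$ S_{\tilde m\tilde m}(f) = |H_B(f)|^2\, S_{\tilde n\tilde n}(f) . $$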
This can be substituted into (3.54d) to get, noting the definition of H B in (3.54a), that
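$$ R_{\tilde m\tilde m}(0) = \int_{-f_2}^{-f_1} S_{\tilde n\tilde n}(f)\, df + \int_{f_1}^{f_2} S_{\tilde n\tilde n}(f)\, df . $$
$$ R_{\tilde m\tilde m}(0) = 2 \int_{f_1}^{f_2} S_{\tilde n\tilde n}(f)\, df . \tag{3.54e} $$
$$ \int_{f_1}^{f_2} S_{\tilde n\tilde n}(f)\, df \ge 0 . \tag{3.54f} $$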
The Sign of the Power Spectrum · 3.22
No assumptions have been made about the values of f1 and f 2 other than
0 ≤ f1 ≤ f 2 .
Therefore, because inequality (3.54f) must hold true for all allowed values of $f_1$ and $f_2$, no matter where they are on the positive f axis or how close together they are, we conclude that $S_{\tilde n\tilde n}(f) \ge 0$ for all $f \ge 0$. Because
$$ S_{\tilde n\tilde n}(-f) = S_{\tilde n\tilde n}(f) $$
it follows that
$$ S_{\tilde n\tilde n}(f) \ge 0 \tag{3.54g} $$
be a non-negative function of frequency f. These are all attributes that a double-sided power
spectrum ought to have. The final step in justifying the label “power spectrum” for $S_{\tilde n\tilde n}$ is to show that it satisfies a power-spectrum type of formula with regard to the random function ñ(t).
$$ P_{zz}(f) = \lim_{T\to\infty} \frac{|Z_T(f)|^2}{2T} . \tag{3.55a} $$
Here, $Z_T(f)$ is the Fourier transform between times $t = -T$ and $t = T$ of a real signal z(t):
$$ Z_T(f) = \int_{-T}^{T} z(t)\, e^{-2\pi i f t}\, dt . \tag{3.55b} $$
52
B. P. Lathi, An Introduction to Random Signals and Communication Theory (International Textbook Company,
Scranton, PA, 1968), p. 59.
We now justify the label “power spectrum” for the function S nn ( f ) defined in Eq. (3.48c) by
(3.55a) for the power spectrum Pzz ( f ) of the nonrandom function z(t).
We define $\tilde N_T(f)$ to be the Fourier transform of the random function ñ(t) between times $t = -T$ and $t = T$:
$$ \tilde N_T(f) = \int_{-T}^{T} \tilde n(t)\, e^{-2\pi i f t}\, dt . \tag{3.56a} $$
In effect, N is a random function of the two nonrandom variables f and T, and it could be written
as N ( f , T ) to emphasize this fact. When ñ(t) is a random function that is wide-sense stationary,
we have, since ñ is real,
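$$ \begin{aligned} E\big( |\tilde N_T(f)|^2 \big) = E\big( \tilde N_T(f)^* \cdot \tilde N_T(f) \big) &= E\left( \left[ \int_{-T}^{T} \tilde n(t_1)\, e^{2\pi i f t_1}\, dt_1 \right] \left[ \int_{-T}^{T} \tilde n(t_2)\, e^{-2\pi i f t_2}\, dt_2 \right] \right) \\ &= E\left( \int_{-T}^{T} dt_1 \int_{-T}^{T} dt_2\, \tilde n(t_1)\, \tilde n(t_2)\, e^{-2\pi i (t_2 - t_1) f} \right) . \end{aligned} \tag{3.56b} $$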
Applying Eqs. (3.17c) and (3.16a), the expectation operator E is taken inside the double integral
to get
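$$ \begin{aligned} E\big( |\tilde N_T(f)|^2 \big) &= \int_{-T}^{T} dt_1 \int_{-T}^{T} dt_2\, E\big( \tilde n(t_1)\, \tilde n(t_2) \big)\, e^{-2\pi i (t_2 - t_1) f} \\ &= \int_{-T}^{T} dt_1 \int_{-T}^{T} dt_2\, R_{\tilde n\tilde n}(t_2 - t_1)\, e^{-2\pi i (t_2 - t_1) f} . \end{aligned} \tag{3.56c} $$
for the wide-sense stationary ñ by the autocorrelation function $R_{\tilde n\tilde n}(t_2 - t_1)$.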
The rightmost expression in Eq. (3.56c) is a double integral of a function
The Power Spectrum and Fourier Transforms of Random Functions · 3.23
−T ≤ t1 ≤ T
and
−T ≤ t2 ≤ T .
Figure 3.2 shows that the value of ψ must be constant along any line given by
$$ t_2 - t_1 = \tau = \text{constant} $$
in the $t_1, t_2$ plane. To lowest order in $d\tau$ in Fig. 3.2, the shaded area is, when $t_2 \ge t_1$ so that $\tau \ge 0$,
$$ \frac{d\tau}{\sqrt{2}} \cdot (2T - \tau)\sqrt{2} = (2T - \tau)\, d\tau . $$
When $t_2 < t_1$, as shown in Fig. 3.3, the value of τ is negative, so the formula for the shaded area in Fig. 3.3 is
$$ \frac{d\tau}{\sqrt{2}} \cdot (2T - |\tau|)\sqrt{2} = (2T - |\tau|)\, d\tau . $$
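$$ \begin{aligned} \int_{-T}^{T} dt_1 \int_{-T}^{T} dt_2\, R_{\tilde n\tilde n}(t_2 - t_1)\, e^{-2\pi i (t_2 - t_1) f} &= \int_{0}^{2T} R_{\tilde n\tilde n}(\tau)\, e^{-2\pi i f \tau} (2T - \tau)\, d\tau + \int_{-2T}^{0} R_{\tilde n\tilde n}(\tau)\, e^{-2\pi i f \tau} (2T - |\tau|)\, d\tau \\ &= \int_{-2T}^{2T} R_{\tilde n\tilde n}(\tau)\, e^{-2\pi i f \tau} (2T - |\tau|)\, d\tau . \end{aligned} $$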
Taking the factor of 2T outside the integral and substituting the result back into Eq. (3.56c) gives
$$ E\big( |\tilde N_T(f)|^2 \big) = 2T \int_{-2T}^{2T} \left( 1 - \frac{|\tau|}{2T} \right) R_{\tilde n\tilde n}(\tau)\, e^{-2\pi i f \tau}\, d\tau . \tag{3.57a} $$
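The step from the double integral in Eq. (3.56c) to the single integral over τ has an exact discrete analog, sketched below for a single sampled record. For N samples there are N − |lag| pairs of indices whose difference equals a given lag, which plays the role of the factor (2T − |τ|). The function name and the random test record are hypothetical; the point is only that summing lag products reproduces the squared magnitude of the discrete transform.

```python
import numpy as np

def lag_sum_transform(x, k):
    """Evaluate |X_k|^2 by summing lag products instead of a double sum over (t1, t2)."""
    n = len(x)
    total = 0.0 + 0.0j
    for lag in range(-(n - 1), n):
        # Sum of x[t1] * x[t2] over the n - |lag| index pairs with t2 - t1 = lag.
        if lag >= 0:
            r = np.dot(x[:n - lag], x[lag:])
        else:
            r = np.dot(x[-lag:], x[:n + lag])
        total += r * np.exp(-2j * np.pi * k * lag / n)
    return total

rng = np.random.default_rng(3)
x = rng.normal(size=64)
k = 5
direct = np.abs(np.fft.fft(x)[k]) ** 2
via_lags = lag_sum_transform(x, k)
print(direct, via_lags.real)
```

Taking the expectation of the lag products then replaces each sum by (N − |lag|) times the autocorrelation, mirroring the weight inside Eq. (3.57a).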
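$$ \frac{1}{2T}\, E\big( |\tilde N_T(f)|^2 \big) = \int_{-\infty}^{\infty} \Lambda(\tau, 2T)\, R_{\tilde n\tilde n}(\tau)\, e^{-2\pi i f \tau}\, d\tau , \tag{3.57b} $$
where
$$ \Lambda(t_a, t_b) = \begin{cases} 1 - \dfrac{|t_a|}{t_b} & \text{for } |t_a| \le t_b \\ 0 & \text{for } |t_a| > t_b \end{cases} . \tag{3.57c} $$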
$$ \int_{-\infty}^{\infty} \Lambda(t, 2T)\, e^{-2\pi i f t}\, dt = 2T \cdot \left[ \frac{\sin(2\pi f T)}{2\pi f T} \right]^2 . \tag{3.57d} $$
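Formula (3.57d) is easy to confirm by direct numerical integration. The sketch below is an illustration with hypothetical function names; it samples the triangle function of Eq. (3.57c) on a fine grid and compares a Riemann sum of the transform with the closed form. Note that NumPy's `np.sinc(u)` is the normalized function sin(πu)/(πu), so the argument is rescaled accordingly.

```python
import numpy as np

def triangle(t, half_width):
    """Lambda(t, half_width): 1 - |t|/half_width inside the support, 0 outside."""
    return np.where(np.abs(t) <= half_width, 1.0 - np.abs(t) / half_width, 0.0)

def triangle_transform(f, T, num_points=40001):
    """Riemann-sum approximation of the Fourier transform of Lambda(t, 2T)."""
    t = np.linspace(-2.0 * T, 2.0 * T, num_points)
    dt = t[1] - t[0]
    return np.sum(triangle(t, 2.0 * T) * np.exp(-2j * np.pi * f * t)) * dt

def closed_form(f, T):
    # np.sinc(2 f T) equals sin(2 pi f T) / (2 pi f T).
    return 2.0 * T * np.sinc(2.0 * f * T) ** 2

for f in (0.0, 0.2, 0.9):
    print(f, triangle_transform(f, 1.5).real, closed_form(f, 1.5))
```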
The right-hand side of Eq. (3.57b) is the Fourier transform of the product of functions Λ and $R_{\tilde n\tilde n}$. According to the Fourier convolution theorem [see Eq. (2.39k) in Chapter 2], this must equal the convolution of the Fourier transforms of Λ and $R_{\tilde n\tilde n}$. Therefore, Eq. (3.57b) can be written as,
$$ \frac{E\big( |\tilde N_T(f)|^2 \big)}{2T} = \left\{ 2T \cdot \left[ \frac{\sin(2\pi f T)}{2\pi f T} \right]^2 \right\} * S_{\tilde n\tilde n}(f) . \tag{3.57e} $$
$$ 2T \cdot \left[ \frac{\sin(2\pi f T)}{2\pi f T} \right]^2 \to \delta(f) . \tag{3.57f} $$
53
John B. Thomas, An Introduction to Applied Probability and Random Processes (John Wiley & Sons, Inc., New
York, 1971), p. 231. Formula (3.57f) is also a slightly disguised version of Eq. (2.67b) in Chapter 2.
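FIGURE 3.2. [Square region $-T \le t_1 \le T$, $-T \le t_2 \le T$ in the $t_1, t_2$ plane, with a shaded strip of width $d\tau$ along the line $t_2 - t_1 = \tau$ for $\tau \ge 0$; the strip extent is labeled $T - (\tau - T) = 2T - \tau$.]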
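FIGURE 3.3. [Same square region in the $t_1, t_2$ plane, with the shaded strip of width $d\tau$ drawn for negative τ; the strip extent is labeled $2T - |\tau|$.]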
FIGURE 3.4. [Plot against $t_a$ with peak value 1.0 and axis marks at $-t_b$ and $t_b$.]
Consequently, we can take the limit of both sides of (3.57e) as $T \to \infty$ to get [using Eq. (2.55a) in Chapter 2]
$$ \lim_{T\to\infty} \left[ \frac{E\big( |\tilde N_T(f)|^2 \big)}{2T} \right] = \delta(f) * S_{\tilde n\tilde n}(f) = \int_{-\infty}^{\infty} \delta(f - f')\, S_{\tilde n\tilde n}(f')\, df' $$
or
$$ S_{\tilde n\tilde n}(f) = \lim_{T\to\infty} \left[ \frac{E\big( |\tilde N_T(f)|^2 \big)}{2T} \right] . \tag{3.57g} $$
Comparing this result to the similar formula in Eq. (3.55a) for the power spectrum of a
nonrandom function z, we see that the formulas are similar enough to justify the definition of S nn
(3.54g), and so on, is often called the double-sided power spectrum because it is defined for both
positive and negative values of its argument f. It is typically found as a weighting function in
integrals of the form
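$$ \int_{-\infty}^{\infty} S_{\tilde n\tilde n}(f)\, \phi_e(f)\, df , $$
be even, this integral can also be written as [see Eq. (2.19) in Chapter 2]
$$ \int_{-\infty}^{\infty} S_{\tilde n\tilde n}(f)\, \phi_e(f)\, df = 2 \int_{0}^{\infty} S_{\tilde n\tilde n}(f)\, \phi_e(f)\, df . \tag{3.58a} $$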
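Many analysts define a single-sided power spectrum $S^{(1)}_{\tilde n\tilde n}$ to be
$$ S^{(1)}_{\tilde n\tilde n}(f) = 2\, S_{\tilde n\tilde n}(f) \quad \text{for } f \ge 0 \tag{3.58b} $$
$$ \int_{-\infty}^{\infty} S_{\tilde n\tilde n}(f)\, \phi_e(f)\, df = \int_{0}^{\infty} S^{(1)}_{\tilde n\tilde n}(f)\, \phi_e(f)\, df . \tag{3.58c} $$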
The motivation for this procedure is often the feeling that only positive frequencies f are
meaningful, so we ought to restrict ourselves to using power spectra with positive arguments.54
Many times articles and textbooks refer to “the” power spectrum without making it clear whether
they are referring to the double-sided or single-sided power spectrum. Casual references to power
spectra should be treated with caution until it becomes clear which type of power spectrum the
author has in mind.
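The bookkeeping between the two conventions in Eqs. (3.58a) through (3.58c) can be made concrete with a short numerical check. Everything specific here is assumed for illustration: the even spectrum $2/(1 + (2\pi f)^2)$, the even weighting function $e^{-f^2}$, and the helper names are hypothetical.

```python
import numpy as np

def double_sided(f):
    """Assumed even double-sided spectrum, S(f) = 2 / (1 + (2 pi f)^2)."""
    return 2.0 / (1.0 + (2.0 * np.pi * f) ** 2)

def single_sided(f):
    """Single-sided spectrum per Eq. (3.58b); only meaningful for f >= 0."""
    return 2.0 * double_sided(f)

def phi_even(f):
    """Assumed even weighting function."""
    return np.exp(-f ** 2)

def trapezoid(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

f_all = np.linspace(-50.0, 50.0, 200001)    # both signs of f
f_pos = np.linspace(0.0, 50.0, 100001)      # non-negative f only

lhs = trapezoid(double_sided(f_all) * phi_even(f_all), f_all)
rhs = trapezoid(single_sided(f_pos) * phi_even(f_pos), f_pos)
print(lhs, rhs)
```

Mixing the conventions, for example integrating a single-sided spectrum over both signs of f, would double the result, which is why it matters to know which spectrum an author means.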
random field
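$$ S_{\tilde n\tilde n}(f_1, f_2, \ldots, f_K) = \int_{-\infty}^{\infty} d\tau_1 \int_{-\infty}^{\infty} d\tau_2 \cdots \int_{-\infty}^{\infty} d\tau_K\, R_{\tilde n\tilde n}(\tau_1, \tau_2, \ldots, \tau_K)\, e^{-2\pi i (f_1\tau_1 + f_2\tau_2 + \cdots + f_K\tau_K)} . \tag{3.59b} $$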
54
There is, of course, no more problem in using negative ƒ values when ƒ represents a frequency than there is in
using negative x values when x represents a length along the axis of a coordinate system. Lengths can never be
negative, so when we allow x to be negative we are implicitly talking about a length coordinate rather than a length.
Similarly, when we allow ƒ to be negative we are implicitly talking about a frequency coordinate rather than a
frequency.
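$$ R_{\tilde n\tilde n}(\tau_1, \tau_2, \ldots, \tau_K) = \int_{-\infty}^{\infty} df_1 \int_{-\infty}^{\infty} df_2 \cdots \int_{-\infty}^{\infty} df_K\, S_{\tilde n\tilde n}(f_1, f_2, \ldots, f_K)\, e^{2\pi i (f_1\tau_1 + f_2\tau_2 + \cdots + f_K\tau_K)} . \tag{3.59c} $$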
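$$ S_{\tilde n\tilde n}(f_1, f_2, \ldots, f_K) = \lim_{\substack{T_1\to\infty \\ T_2\to\infty \\ \vdots \\ T_K\to\infty}} \left[ \frac{1}{(2T_1)(2T_2)\cdots(2T_K)}\, E\big( |\tilde N_{T_1 T_2 \cdots T_K}(f_1, f_2, \ldots, f_K)|^2 \big) \right] , \tag{3.59d} $$
where
$$ \tilde N_{T_1 T_2 \cdots T_K}(f_1, f_2, \ldots, f_K) = \int_{-T_1}^{T_1} dt_1 \int_{-T_2}^{T_2} dt_2 \cdots \int_{-T_K}^{T_K} dt_K\, \tilde n(t_1, t_2, \ldots, t_K)\, e^{-2\pi i (f_1 t_1 + f_2 t_2 + \cdots + f_K t_K)} . \tag{3.59e} $$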
The next chapter uses the three-dimensional Wiener-Khinchin theorem with one time
coordinate t and two space coordinates x and y. Using the vector notation introduced in Chapter 2
(see Sec. 2.25), we write the random field ñ as
$$ \tilde n(x, y, t) = \tilde n(\vec\rho, t) , \tag{3.60a} $$
with
$$ \vec\rho = x\hat x + y\hat y \tag{3.60b} $$
being the position vector defined in terms of the $\hat x$ and $\hat y$ unit vectors corresponding to the x and y coordinates. We also define a vector $\vec u$ with $u_x$ and $u_y$ components such that
$$ \vec u = \hat x u_x + \hat y u_y . \tag{3.60c} $$
Here, u x and u y are the spatial frequencies corresponding to the x and y coordinates respectively.
The frequency corresponding to time t is called w. The truncated time and space Fourier
transform of $\tilde n(\vec\rho, t)$ can now be written as
$$ \tilde N_{T,A}(u_x, u_y, w) = \int_{-T}^{T} dt \iint_{\text{area } A} dx\, dy\, \tilde n(x, y, t)\, e^{-2\pi i (x u_x + y u_y + w t)} $$
or
The Multidimensional Wiener-Khinchin Theorem · 3.24
$$ \tilde N_{T,A}(\vec u, w) = \int_{-T}^{T} dt \iint_{\text{area } A} d^2\rho\, \tilde n(\vec\rho, t)\, e^{-2\pi i (\vec u \cdot \vec\rho + w t)} . \tag{3.60d} $$
Random field $\tilde n(\vec\rho, t)$ has an autocorrelation function
$$ R_{\tilde n\tilde n}(x' - x, y' - y, t' - t) = E\big( \tilde n(x, y, t)\, \tilde n(x', y', t') \big) , \tag{3.61a} $$
Because Rnn depends only on the difference between the unprimed and primed coordinates, we
say that field ñ is (wide-sense) stationary and homogeneous. The corresponding power spectrum
is
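$$ S_{\tilde n\tilde n}(\vec u, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, R_{\tilde n\tilde n}(\vec\rho, t)\, e^{-2\pi i (\vec u \cdot \vec\rho + w t)} . \tag{3.61c} $$
$$ R_{\tilde n\tilde n}(\vec\rho, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, S_{\tilde n\tilde n}(\vec u, w)\, e^{2\pi i (\vec u \cdot \vec\rho + w t)} . \tag{3.61d} $$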
Glancing back at the notation for the truncated Fourier transform of ñ in Eq. (3.60d), we see that
the three-dimensional Wiener-Khinchin theorem for this case can be stated as
$$ S_{\tilde n\tilde n}(\vec u, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \left[ \frac{1}{2TA}\, E\big( |\tilde N_{T,A}(\vec u, w)|^2 \big) \right] . \tag{3.61e} $$
FIGURE 3.5. [Plot of $W_{\tilde n\tilde n}(f)$: constant level $W_0$ between $f = -F$ and $f = F$.]
Band-Limited White Noise · 3.25
The bandwidth of this white noise is said to be F (see Fig. 3.5). Equation (3.48d) shows that the
autocorrelation function of this band-limited white noise must be
$$ R_{\tilde n\tilde n}(\tau) = W_0 \int_{-F}^{F} e^{2\pi i f \tau}\, df = W_0\, \frac{\sin(2\pi F \tau)}{\pi\tau} . \tag{3.62c} $$
$$ E\big( \tilde n(t) \cdot \tilde n(t) \big) = E\big( \tilde n(t)^2 \big) = R_{\tilde n\tilde n}(0) , $$
$$ E\big( \tilde n(t)^2 \big) = W_0 \int_{-F}^{F} df = 2FW_0 . \tag{3.62d} $$
According to (3.62b) ñ is a zero-mean random function, so Eq. (3.62d) shows that product 2FW0
must be the variance of ñ(t) when ñ is band-limited white noise.
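The variance result $2FW_0$ can be checked by simulation. The sketch below is only one way to approximate band-limited white noise and rests on stated assumptions: discrete white noise with variance `sigma**2` sampled at spacing `dt` is treated as having the flat double-sided level `W0 = sigma**2 * dt` up to the Nyquist frequency, and an ideal low-pass filter is applied by zeroing transform bins with |f| > F. The function name and parameter values are hypothetical.

```python
import numpy as np

def band_limited_white_noise(num_samples, dt, sigma, F, seed=5):
    """Low-pass discrete white noise to |f| <= F by zeroing FFT bins."""
    rng = np.random.default_rng(seed)
    white = rng.normal(scale=sigma, size=num_samples)
    spectrum = np.fft.fft(white)
    freqs = np.fft.fftfreq(num_samples, d=dt)
    spectrum[np.abs(freqs) > F] = 0.0       # ideal low-pass: keep only |f| <= F
    return np.fft.ifft(spectrum).real

dt, sigma, F = 1.0e-3, 2.0, 100.0
W0 = sigma ** 2 * dt                         # assumed flat double-sided level
noise = band_limited_white_noise(65536, dt, sigma, F)
measured_variance = noise.var()
predicted_variance = 2.0 * F * W0            # Eq. (3.62d)
print(measured_variance, predicted_variance)
```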
Sometimes we take the limit as F → ∞ in Eqs. (3.62a)–(3.62d) to get white noise that has no
band limits. Now the power spectrum of ñ(t) is
$$ W_{\tilde n\tilde n}(f) = W_0 \tag{3.63a} $$
for all values of f. According to formula (3.62c) and Eq. (2.71f) in Chapter 2, this makes the
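autocorrelation function $R_{\tilde n\tilde n}$ proportional to a delta function,
$$ R_{\tilde n\tilde n}(\tau) = W_0 \int_{-\infty}^{\infty} e^{2\pi i f \tau}\, df = W_0\, \delta(\tau) , \tag{3.63b} $$
with of course
$$ \lim_{F\to\infty} \big[ E\big( \tilde n(t)^2 \big) \big] = \infty \tag{3.63c} $$
and
$$ E\big( \tilde n(t) \big) = 0 . \tag{3.63d} $$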
Just like the concepts of stationarity and ergodicity, the concept of white noise (even of band-
limited white noise) is an idealization that is often useful for approximating random processes
seen in nature. When a poor-quality recording is played on an audio system, the noise
contaminating it is often white in nature, showing up as unwanted hissing, crackling, and an
overall “shussing” sound. This white noise is band limited, with the band specified by the finite
range of frequencies produced by the audio system and heard by the audience. Setting a TV set to
a channel or station that does not exist, or that cannot be picked up, often produces hissing in the
speakers and a rapidly changing speckle (sometimes called snow) on the screen; both the snow
and the hissing come from quasi white-noise processes that the TV is treating like a nonrandom
signal.
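$$ \tilde N(t) = \tilde N^{(+)}(t) + \tilde N^{(-)}(t) , \tag{3.64a} $$
where
$$ \tilde N^{(+)}(t) = \frac{1}{2} \big[ \tilde N(t) + \tilde N(-t) \big] \tag{3.64b} $$
and
$$ \tilde N^{(-)}(t) = \frac{1}{2} \big[ \tilde N(t) - \tilde N(-t) \big] . \tag{3.64c} $$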
We now apply to Ñ(t) the time-limited Fourier transform shown in Eq. (3.56a),
$$ \tilde N_T(f) = \int_{-T}^{T} \tilde N(t)\, e^{-2\pi i f t}\, dt = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde N(t)\, e^{-2\pi i f t}\, dt . \tag{3.65a} $$
Here, the Π (t , T ) function [defined in Eq. (2.56c) of Chapter 2] is used to convert the integral
between +T and –T into a true Fourier transform. Substituting (3.64a) into (3.65a) gives
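$$ \tilde N_T(f) = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde N^{(+)}(t)\, e^{-2\pi i f t}\, dt + \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde N^{(-)}(t)\, e^{-2\pi i f t}\, dt , $$
$$ \tilde N_T(f) = \tilde N_T^{(+)}(f) + \tilde N_T^{(-)}(f) , \tag{3.65b} $$
where
$$ \tilde N_T^{(+)}(f) = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde N^{(+)}(t)\, e^{-2\pi i f t}\, dt \tag{3.65c} $$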
Even and Odd Components of Random Functions · 3.26
and
$$ \tilde N_T^{(-)}(f) = \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde N^{(-)}(t)\, e^{-2\pi i f t}\, dt . \tag{3.65d} $$
According to entries 1 and 4 of Table 2.1 in Chapter 2, random function $\tilde N_T^{(+)}$ must be a real and even function of f because it is the forward Fourier transform of a real and even function of t; and random function $\tilde N_T^{(-)}$ must be an imaginary and odd function of f because it is the forward Fourier transform of a real and odd function of t. This means that every function in the ensemble of functions associated with random function $\tilde N_T^{(+)}$ is real and even, and every function in the ensemble of functions associated with random function $\tilde N_T^{(-)}$ is imaginary and odd. It also reveals that
$$ \tilde N_T^{(+)}(f) = \mathrm{Re}\big( \tilde N_T(f) \big) \tag{3.65e} $$
and
$$ \tilde N_T^{(-)}(f) = i\, \mathrm{Im}\big( \tilde N_T(f) \big) . \tag{3.65f} $$
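Equations (3.64a) through (3.65f) can be seen at work on a single sampled record. In the sketch below, which is illustrative only, the record is a hypothetical draw of Gaussian samples on a time grid symmetric about t = 0, so reversing the array implements t → −t. The truncated transform is approximated by a Riemann sum, and the function name is an assumption.

```python
import numpy as np

def truncated_transform(x, t, freqs):
    """Riemann-sum approximation of the transform of x(t) over the sampled window."""
    dt = t[1] - t[0]
    return (np.exp(-2j * np.pi * np.outer(freqs, t)) @ x) * dt

rng = np.random.default_rng(4)
T, half_count = 1.0, 200
t = np.arange(-half_count, half_count + 1) * (T / half_count)   # symmetric about t = 0
N = rng.normal(size=t.size)              # one hypothetical member function
N_even = 0.5 * (N + N[::-1])             # Eq. (3.64b)
N_odd = 0.5 * (N - N[::-1])              # Eq. (3.64c)

freqs = np.linspace(-3.0, 3.0, 13)
NT = truncated_transform(N, t, freqs)
NT_even = truncated_transform(N_even, t, freqs)
NT_odd = truncated_transform(N_odd, t, freqs)

print(np.max(np.abs(NT_even - NT.real)))
print(np.max(np.abs(NT_odd - 1j * NT.imag)))
```

Both printed differences are at rounding level, so the even part carries the real part of the transform and the odd part carries the imaginary part.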
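There is a simple connection between the expectation values of the squared magnitudes of $\tilde N_T^{(\pm)}$ and $\tilde N_T$, that is between
$$ E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) \quad \text{and} \quad E\big( |\tilde N_T(f)|^2 \big) , $$
which is worth taking the time to analyze in detail.
We start by applying formulas (3.65c) and (3.65d) to $E\big( |\tilde N_T^{(\pm)}(f)|^2 \big)$ to get
$$ \begin{aligned} E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) &= E\Big( \big( \tilde N_T^{(\pm)}(f) \big) \big( \tilde N_T^{(\pm)}(f) \big)^* \Big) \\ &= E\left( \left( \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde N^{(\pm)}(t)\, e^{-2\pi i f t}\, dt \right) \left( \int_{-\infty}^{\infty} \Pi(t', T)\, \tilde N^{(\pm)}(t')\, e^{-2\pi i f t'}\, dt' \right)^* \right) . \end{aligned} $$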
Everything inside the integral over dt ′ is real except for e −2π ift ′ , so we can write this as
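$$ E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) = E\left( \int_{-\infty}^{\infty} \Pi(t, T)\, \tilde N^{(\pm)}(t)\, e^{-2\pi i f t}\, dt \int_{-\infty}^{\infty} \Pi(t', T)\, \tilde N^{(\pm)}(t')\, e^{2\pi i f t'}\, dt' \right) . $$
$$ E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) = E\left( \frac{1}{4} \int_{-\infty}^{\infty} \Pi(t, T) \big[ \tilde N(t) \pm \tilde N(-t) \big] e^{-2\pi i f t}\, dt \int_{-\infty}^{\infty} \Pi(t', T) \big[ \tilde N(t') \pm \tilde N(-t') \big] e^{2\pi i f t'}\, dt' \right) , $$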
which becomes, applying the linearity of operator E discussed in Sec. 3.10 above,
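$$ E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) = \frac{1}{4} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{2\pi i f t'}\, E\Big( \big[ \tilde N(t) \pm \tilde N(-t) \big] \big[ \tilde N(t') \pm \tilde N(-t') \big] \Big) . \tag{3.66a} $$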
The linearity of E can also be used to write
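$$ \begin{aligned} E\Big( \big[ \tilde N(t) \pm \tilde N(-t) \big] \big[ \tilde N(t') \pm \tilde N(-t') \big] \Big) &= E\Big( \tilde N(t)\tilde N(t') \pm \tilde N(t)\tilde N(-t') \pm \tilde N(-t)\tilde N(t') + \tilde N(-t)\tilde N(-t') \Big) \\ &= E\big( \tilde N(t)\tilde N(t') \big) \pm E\big( \tilde N(t)\tilde N(-t') \big) \pm E\big( \tilde N(-t)\tilde N(t') \big) + E\big( \tilde N(-t)\tilde N(-t') \big) . \end{aligned} $$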
Equation (3.30b), which specifies the autocorrelation function of wide-sense stationary random
functions like Ñ(t), can now be applied to get
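$$ E\Big( \big[ \tilde N(t) \pm \tilde N(-t) \big] \big[ \tilde N(t') \pm \tilde N(-t') \big] \Big) = R_{\tilde N\tilde N}(t' - t) \pm R_{\tilde N\tilde N}(-t' - t) \pm R_{\tilde N\tilde N}(t' + t) + R_{\tilde N\tilde N}(-t' + t) . $$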
simplified to
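$$ E\Big( \big[ \tilde N(t) \pm \tilde N(-t) \big] \big[ \tilde N(t') \pm \tilde N(-t') \big] \Big) = 2 R_{\tilde N\tilde N}(t - t') \pm 2 R_{\tilde N\tilde N}(t + t') . $$
$$ \begin{aligned} E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) = {} & \frac{1}{2} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde N\tilde N}(t - t')\, e^{2\pi i f t'} \\ & \pm \frac{1}{2} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde N\tilde N}(t + t')\, e^{2\pi i f t'} . \end{aligned} \tag{3.66b} $$
$$ R_{\tilde N\tilde N}(t \pm t') = \int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f)\, e^{2\pi i f (t \pm t')}\, df . $$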
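Substituting this expression into the first term on the right-hand side of the formula for $E\big( |\tilde N_T^{(\pm)}(f)|^2 \big)$ and moving the integral over $S_{\tilde N\tilde N}$ to the front, we get
$$ \begin{aligned} E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) = {} & \frac{1}{2} \int_{-\infty}^{\infty} df'\, S_{\tilde N\tilde N}(f') \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i t (f - f')} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, e^{2\pi i t' (f - f')} \\ & \pm \frac{1}{2} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde N\tilde N}(t + t')\, e^{2\pi i f t'} . \end{aligned} \tag{3.66c} $$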
Interchanging the roles of f, t and then replacing F by T in Eq. (2.108b) of Chapter 2 gives
$$ \int_{-\infty}^{\infty} \Pi(t, T)\, e^{-2\pi i t (f - f')}\, dt = \int_{-\infty}^{\infty} \Pi(t, T)\, e^{2\pi i t (f - f')}\, dt = 2T\, \mathrm{sinc}\big( 2\pi (f - f') T \big) , \tag{3.66d} $$
with Eq. (2.106d) showing that the definition of the sinc function is
$$ \mathrm{sinc}(x) = \frac{\sin(x)}{x} . \tag{3.66e} $$
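When Eq. (3.66d) is evaluated numerically, the sinc convention deserves care: Eq. (3.66e) uses the unnormalized form sin(x)/x, whereas NumPy's `np.sinc(u)` computes sin(πu)/(πu). The sketch below wraps the library call to match the text's definition and checks Eq. (3.66d) with a trapezoid-rule integral over the window from −T to T. The helper names are hypothetical, and `f` here stands for the frequency difference f − f′.

```python
import numpy as np

def sinc_unnormalized(x):
    """sinc(x) = sin(x)/x as in Eq. (3.66e), built on NumPy's normalized sinc."""
    return np.sinc(np.asarray(x, dtype=float) / np.pi)

def rect_transform(f, T, num_points=20001):
    """Trapezoid-rule integral of exp(-2 pi i f t) over -T <= t <= T."""
    t = np.linspace(-T, T, num_points)
    y = np.exp(-2j * np.pi * f * t)
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))

T = 2.0
for f in (0.0, 0.1, 0.7):
    numeric = rect_transform(f, T)
    closed = 2.0 * T * sinc_unnormalized(2.0 * np.pi * f * T)
    print(f, numeric.real, closed)
```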
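$$ \begin{aligned} E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) = {} & \frac{1}{2} \int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f') \big[ 2T\, \mathrm{sinc}(2\pi (f - f') T) \big]^2\, df' \\ & \pm \frac{1}{2} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde N\tilde N}(t + t')\, e^{2\pi i f t'} . \end{aligned} \tag{3.66f} $$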
To evaluate the integral over df ′ in (3.66f), we assume that T is chosen large enough that
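$$ [\mathrm{sinc}(2\pi f' T)]^2 = \left[ \frac{\sin(2\pi f' T)}{2\pi f' T} \right]^2 $$
required to cause a significant change in $S_{\tilde N\tilde N}(f')$, we must have
$$ \Delta f_S \cdot T \gg 1 \quad \text{or} \quad T \gg \frac{1}{\Delta f_S} . \tag{3.67a} $$
$$ 2T \left[ \frac{\sin(2\pi f' T)}{2\pi f' T} \right]^2 = 2T\, \mathrm{sinc}^2(2\pi f' T) \cong \delta(f') . \tag{3.67b} $$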
Applying this approximation to the integral over df ′ on the right-hand side of (3.66f), we replace
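$$ 2T \left[ \frac{\sin\big( 2\pi (f - f') T \big)}{2\pi (f - f') T} \right]^2 = 2T \big[ \mathrm{sinc}\big( 2\pi (f - f') T \big) \big]^2 $$
by $\delta(f - f')$ to get
$$ \frac{1}{2} \int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f') \big[ 2T\, \mathrm{sinc}(2\pi (f - f') T) \big]^2\, df' \cong T \int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f')\, \delta(f - f')\, df' = T\, S_{\tilde N\tilde N}(f) . $$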
This result can now be substituted back into Eq. (3.66f) to get
$$ E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) \cong T\, S_{\tilde N\tilde N}(f) \pm \frac{1}{2} \Lambda_T , \tag{3.67c} $$
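where
$$ \Lambda_T = \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \int_{-\infty}^{\infty} dt'\, \Pi(t', T)\, R_{\tilde N\tilde N}(t + t')\, e^{2\pi i f t'} . $$
$$ \begin{aligned} \Lambda_T &= \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \left[ (-1) \int_{+\infty}^{-\infty} dt''\, \Pi(-t'' - t, T)\, R_{\tilde N\tilde N}(-t'')\, e^{-2\pi i f (t'' + t)} \right] \\ &= \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, e^{-2\pi i f t} \left[ \int_{-\infty}^{\infty} dt''\, \Pi(-(t'' + t), T)\, R_{\tilde N\tilde N}(-t'')\, e^{-2\pi i f (t'' + t)} \right] . \end{aligned} $$
$$ \Pi(-(t'' + t), T) = \Pi(t'' + t, T) . $$
$$ R_{\tilde N\tilde N}(-t'') = R_{\tilde N\tilde N}(t'') . $$
Applying these two formulas to the $\Lambda_T$ double integral gives, after interchanging the order of the integrals over $dt$ and $dt''$,
$$ \Lambda_T = \int_{-\infty}^{\infty} dt''\, R_{\tilde N\tilde N}(t'')\, e^{-2\pi i f t''} \int_{-\infty}^{\infty} dt\, \Pi(t, T)\, \Pi(t + t'', T)\, e^{-4\pi i f t} . \tag{3.67e} $$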
To simplify the inner integral on the right-hand side of (3.67e), we note that only when both
Π (t , T ) and Π (t + t ′′, T ) are one is their product one—in other words, when either Π (t , T ) or
Π (t + t ′′, T ) is zero, then their product is zero and no contribution is made to the integral. Figure
3.6(a) shows what happens for positive values of t ′′ , and Fig. 3.6(b) shows what happens for
negative values of t ′′ .
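FIGURE 3.6(a). [Block $\Pi(t, T)$ on $-T \le t \le T$ with a shift of size $t''$ marked, for positive $t''$.]
FIGURE 3.6(b). [Same axes from $-T$ to $T$ with a shift of size $-t''$ marked, for negative $t''$.]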
In both Figs. 3.6(a) and 3.6(b), the dark solid line is a plot of Π (t , T ) and the dashed line is a plot
of Π (t + t ′′, T ) . When t ′′ > 0 , the dashed block shifts to the left; when t ′′ < 0 , the dashed block
shifts to the right. Only in the region of overlap of the solid and dashed lines in Figs. 3.6(a) and
3.6(b) does the product function
Π (t , T )Π (t + t ′′)
disregarding the edge points of the Π functions because these single-point values do not
contribute to the integral. Equation (3.67e) thus reduces to
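$$ \begin{aligned} \Lambda_T = {} & \int_{-2T}^{0} dt''\, R_{\tilde N\tilde N}(t'')\, e^{-2\pi i f t''} \int_{-T - t''}^{T} dt\, e^{-4\pi i f t} \\ & + \int_{0}^{2T} dt''\, R_{\tilde N\tilde N}(t'')\, e^{-2\pi i f t''} \int_{-T}^{T - t''} dt\, e^{-4\pi i f t} . \end{aligned} \tag{3.67g} $$
We note that
$$ \int_{a}^{b} e^{-4\pi i f t}\, dt = \frac{1}{4\pi i f} \big[ e^{-4\pi i f a} - e^{-4\pi i f b} \big] . \tag{3.67h} $$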
Changing the variable of integration in the first integral to t ′′′ = −t ′′ leads to [remember to apply
Eq. (3.48b)]
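$$ \begin{aligned} \Lambda_T = {} & \frac{1}{4\pi i f} \int_{0}^{2T} R_{\tilde N\tilde N}(t''') \big( e^{4\pi i f T} e^{-2\pi i f t'''} - e^{-4\pi i f T} e^{2\pi i f t'''} \big)\, dt''' \\ & + \frac{1}{4\pi i f} \int_{0}^{2T} R_{\tilde N\tilde N}(t'') \big( e^{4\pi i f T} e^{-2\pi i f t''} - e^{-4\pi i f T} e^{2\pi i f t''} \big)\, dt'' \\ = {} & \frac{e^{4\pi i f T}}{2\pi i f} \int_{0}^{2T} R_{\tilde N\tilde N}(t)\, e^{-2\pi i f t}\, dt - \frac{e^{-4\pi i f T}}{2\pi i f} \int_{0}^{2T} R_{\tilde N\tilde N}(t)\, e^{2\pi i f t}\, dt , \end{aligned} $$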
where in the last step we have dropped the primes from the variables of integration. The second
integral is the complex conjugate of the first, so this formula can be written as
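$$ \Lambda_T = \mathrm{Re}\left[ \frac{e^{4\pi i f T}}{\pi i f} \int_{0}^{2T} R_{\tilde N\tilde N}(t)\, e^{-2\pi i f t}\, dt \right] \tag{3.67i} $$
$$ \Xi(t) = \begin{cases} 1 & \text{for } t > 0 \\ 1/2 & \text{for } t = 0 \\ 0 & \text{for } t < 0 \end{cases} \tag{3.67j} $$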
in Eq. (2.70a) of Chapter 2. The integral on the right-hand side of (3.67i) can now be written as
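$$ \int_{0}^{2T} R_{\tilde N\tilde N}(t)\, e^{-2\pi i f t}\, dt = \int_{-\infty}^{\infty} \Xi(t)\, \Pi(t, 2T)\, R_{\tilde N\tilde N}(t)\, e^{-2\pi i f t}\, dt . \tag{3.67k} $$
$$ \Xi(t)\, \Pi(t, 2T)\, R_{\tilde N\tilde N}(t) $$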
and the Fourier-transform operator F defined in Eq. (2.29a) of Chapter 2 can be used to write it as
The Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] can be applied to get
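According to Eq. (3.48c) there exists a power spectrum $S_{\tilde N\tilde N}(f)$ such that
$$ S_{\tilde N\tilde N}(f) = \int_{-\infty}^{\infty} R_{\tilde N\tilde N}(t'')\, e^{-2\pi i f t''}\, dt'' = \mathcal{F}^{(-ift'')}\big( R_{\tilde N\tilde N}(t'') \big) . \tag{3.68b} $$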
Evaluating F ( −ift ′) ( Ξ(t ′)Π (t ′, 2T ) ) is not much more difficult. Writing the Fourier transform as an
integral gives [remember that eiφ = cos(φ ) + i sin(φ ) ]
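$$ \begin{aligned} \mathcal{F}^{(-ift')}\big( \Xi(t')\, \Pi(t', 2T) \big) &= \int_{0}^{2T} e^{-2\pi i f t'}\, dt' = \frac{1}{2\pi i f} \big[ 1 - e^{-4\pi i f T} \big] \\ &= \frac{e^{-2\pi i f T}}{2\pi i f} \big( e^{2\pi i f T} - e^{-2\pi i f T} \big) \\ &= \frac{1}{\pi f} \big[ \cos(2\pi f T) - i \sin(2\pi f T) \big] \sin(2\pi f T) \\ &= \frac{1}{2\pi f} \sin(4\pi f T) - \frac{i}{\pi f} \sin^2(2\pi f T) , \end{aligned} $$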
Applying the formula for the sinc function from Eq. (3.66e), we end up with
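$$ \int_{0}^{2T} R_{\tilde N\tilde N}(t)\, e^{-2\pi i f t}\, dt = \Big\{ 2T\, \mathrm{sinc}(4\pi f T) - i (2\pi f T) \big[ 2T\, \mathrm{sinc}^2(2\pi f T) \big] \Big\} * S_{\tilde N\tilde N}(f) . \tag{3.68d} $$
$$ \frac{\sin(2\pi n f)}{\pi f} \to \delta(f) \quad \text{as } n \to \infty , \tag{3.68e} $$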
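where t in (2.67c) is here replaced by f. We note that, working with Eq. (3.66e),
$$ 2T\, \mathrm{sinc}(4\pi f T) = \frac{\sin(4\pi f T)}{2\pi f} = \frac{1}{2} \cdot \frac{\sin\big( 2\pi (2T) f \big)}{\pi f} . \tag{3.68f} $$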
As n gets large in (3.68e), the sine oscillates ever more rapidly with f. Similarly, as 2T gets large
in (3.68f)—which is, of course, the same as T getting large—the sinc oscillates ever more rapidly
with f. In order to approximate the sinc in (3.68f) by a delta function, then, we need to have the
other functions of f that are also present varying slowly compared to the original oscillation.
Again assuming, as in the discussion following Eq. (3.66f), that T is large enough for the first
sinc function on the right-hand side of Eq. (3.68d) to oscillate rapidly compared to the noise-
power spectrum S NN , we expand the convolution in (3.68d), writing it as [apply Eq. (2.38e) in
Chapter 2]
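$$ \int_{0}^{2T} R_{\tilde N\tilde N}(t)\, e^{-2\pi i f t}\, dt \cong \frac{1}{2} S_{\tilde N\tilde N}(f) - i \Big\{ \big( 2\pi f T \big[ 2T\, \mathrm{sinc}^2(2\pi f T) \big] \big) * S_{\tilde N\tilde N}(f) \Big\} . \tag{3.68g} $$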
The remaining convolution on the right-hand side can be written as [see Eqs. (2.38a) and (2.38b)
in Chapter 2]
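$$ \begin{aligned} \big( 2\pi f T \big[ 2T\, \mathrm{sinc}^2(2\pi f T) \big] \big) * S_{\tilde N\tilde N}(f) &= \int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f') \Big\{ 2\pi T (f - f') \big[ 2T\, \mathrm{sinc}^2\big( 2\pi (f - f') T \big) \big] \Big\}\, df' \\ &\cong \int_{-\infty}^{\infty} S_{\tilde N\tilde N}(f') \big\{ 2\pi T (f - f')\, \delta(f - f') \big\}\, df' = 0 . \end{aligned} \tag{3.68h} $$
$$ \int_{0}^{2T} R_{\tilde N\tilde N}(t)\, e^{-2\pi i f t}\, dt \cong \frac{1}{2} S_{\tilde N\tilde N}(f) , \tag{3.68i} $$
which can then be put back into (3.67i) to get that [using $e^{i\phi} = \cos(\phi) + i \sin(\phi)$]
$$ \Lambda_T \cong \mathrm{Re}\left[ \frac{\cos(4\pi f T) + i \sin(4\pi f T)}{\pi i f} \cdot \frac{1}{2} S_{\tilde N\tilde N}(f) \right] . $$
$$ \Lambda_T \cong \big[ 2T\, \mathrm{sinc}(4\pi f T) \big]\, S_{\tilde N\tilde N}(f) . \tag{3.68j} $$
$$ \begin{aligned} E\big( |\tilde N_T^{(\pm)}(f)|^2 \big) &\cong T\, S_{\tilde N\tilde N}(f) \pm T\, \mathrm{sinc}(4\pi f T)\, S_{\tilde N\tilde N}(f) \\ &= T\, S_{\tilde N\tilde N}(f) \big[ 1 \pm \mathrm{sinc}(4\pi f T) \big] . \end{aligned} \tag{3.68k} $$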
The approximation in (3.68k) makes sense whenever T is large enough for sinc(2πfT) and sinc(4πfT) to oscillate rapidly with frequency f compared to $S_{NN}(f)$, which is usually true for white-noise-like power spectra. When fT >> 1, the sinc function’s value in formula (3.68k) is small compared to one [see, for example, Figs. 3.7(a) and 3.7(b)] and we can write
$$E\left( \left| N_T^{(\pm)}(f) \right|^2 \right) \cong T\,S_{NN}(f). \qquad (3.69a)$$
When f = 0 , it is of course no longer true that fT >> 1 . For this special case, the sinc function
is one; and, according to (3.68k), no matter how large T is we have
$$E\left( \left| N_T^{(-)}(0) \right|^2 \right) \cong 0 \qquad (3.69b)$$
and
$$E\left( \left| N_T^{(+)}(0) \right|^2 \right) \cong 2T\,S_{NN}(0). \qquad (3.69c)$$
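As a rough numerical sanity check of (3.68k) and its limits (3.69a)–(3.69c), the sketch below assumes an idealized white-noise process with a flat power spectrum `S0`. That assumption is made only for this illustration; for such noise the expected squared real and imaginary parts of the finite transform over −T ≤ t ≤ T reduce to `S0` times the integrals of cos² and sin². It also assumes, following the text's use of Eqs. (3.65e) and (3.65f), that the (+) component corresponds to the real part and the (−) component to the imaginary part. All function names are hypothetical.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def white_noise_parts(f, T, S0=1.0, steps=20000):
    """Midpoint-rule estimates of E[(Re N_T)^2] and E[(Im N_T)^2].

    Assumes flat-spectrum (white) noise observed over -T <= t <= T.
    """
    dt = 2.0 * T / steps
    e_re = 0.0
    e_im = 0.0
    for k in range(steps):
        t = -T + (k + 0.5) * dt
        phase = 2.0 * math.pi * f * t
        e_re += math.cos(phase) ** 2 * dt
        e_im += math.sin(phase) ** 2 * dt
    return S0 * e_re, S0 * e_im

def formula_368k(f, T, S0=1.0):
    """Right-hand side of (3.68k): T*S0*[1 +/- sinc(4*pi*f*T)]."""
    s = sinc(4.0 * math.pi * f * T)
    return T * S0 * (1.0 + s), T * S0 * (1.0 - s)

if __name__ == "__main__":
    T = 5.0
    for f in (0.0, 0.03, 2.0):
        print(f, white_noise_parts(f, T), formula_368k(f, T))
```

At f = 0 the (−) value vanishes and the (+) value is 2T·S0, while for fT >> 1 both values approach T·S0, which is the behavior described by (3.69a)–(3.69c).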
Equation (3.69b) is easy to understand after reviewing the discussion following Eq. (3.65d) above. Since $N_T^{(-)}$ is always an odd function of f, it must be zero at f = 0 according to Eq. (2.12a) of Chapter 2. To understand Eq. (3.69c), we consult Eqs. (3.65e) and (3.65f) and note that
$$\left| N_T^{(+)}(f) \right|^2 + \left| N_T^{(-)}(f) \right|^2 = [\mathrm{Re}(N_T(f))]^2 + [\mathrm{Im}(N_T(f))]^2 = \left| N_T(f) \right|^2.$$
Applying the expectation operator E to both sides and using its linearity with respect to random
quantities (see Sec. 3.10 above), we get
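$$E\left( \left| N_T^{(+)}(f) \right|^2 \right) + E\left( \left| N_T^{(-)}(f) \right|^2 \right) = E\left( [\mathrm{Re}\,N_T(f)]^2 \right) + E\left( [\mathrm{Im}\,N_T(f)]^2 \right) = E\left( \left| N_T(f) \right|^2 \right)$$
or
$$E\left( \left| N_T(f) \right|^2 \right) = E\left( \left| N_T^{(+)}(f) \right|^2 \right) + E\left( \left| N_T^{(-)}(f) \right|^2 \right). \qquad (3.69d)$$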
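FIGURE 3.7(a). [Plot of sinc(2πfT) versus f: peak value 1.0, with the f axis marked at −1/(2T) and 1/(2T).]
FIGURE 3.7(b). [Plot of sinc(4πfT) versus f: peak value 1.0, with the f axis marked at −1/(4T) and 1/(4T).]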
Glancing back at formula (3.57g), we realize, because T is assumed to be large in our analysis
here, that
$$E\left( \left| N_T(f) \right|^2 \right) \cong 2T\,S_{NN}(f) \qquad (3.69e)$$
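for large values of T. This approximation works well no matter what the value of f is. Therefore, at f = 0 we can substitute (3.69e) into (3.69d) to get
$$2T\,S_{NN}(0) \cong E\left( \left| N_T^{(+)}(0) \right|^2 \right) + E\left( \left| N_T^{(-)}(0) \right|^2 \right). \qquad (3.69f)$$
Since the second term on the right-hand side is zero according to (3.69b), this reduces to
$$E\left( \left| N_T^{(+)}(0) \right|^2 \right) \cong 2T\,S_{NN}(0).$$
This result then justifies formula (3.69c) above.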
Equation (3.69d) can also be used to justify, but only when f > 0, the assumption behind formula (3.69e) that
$$E\left( \left| N_T(f) \right|^2 \right) \cong 2T\,S_{NN}(f).$$
$$E\left( \left| N_T^{(+)}(f) \right|^2 \right) = E\left( [\mathrm{Re}\,N_T(f)]^2 \right) \cong T\,S_{NN}(f)$$
and
$$E\left( \left| N_T^{(-)}(f) \right|^2 \right) = E\left( [\mathrm{Im}\,N_T(f)]^2 \right) \cong T\,S_{NN}(f)$$
contribute equally to $E(|N_T(f)|^2)$. Having arrived at the formula
$$E\left( \left| N_T(f) \right|^2 \right) \cong 2T\,S_{NN}(f)$$
without using Eq. (3.57g)—that is, without thinking about what the limiting value of the ratio
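$$\frac{E\left( \left| N_T(f) \right|^2 \right)}{2T}$$
$$\frac{E\left( \left| N_T(f) \right|^2 \right)}{2T} \cong S_{NN}(f).$$
$$\frac{E\left( \left| N_T(f) \right|^2 \right)}{2T}$$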
expected value of the squared imaginary component of $N_T$ contribute equally to the expected value of the squared magnitude of $N_T$. In other words, both
$$E\left( \left| N_T^{(+)}(f) \right|^2 \right) = E\left( [\mathrm{Re}\,N_T(f)]^2 \right)$$
and
$$E\left( \left| N_T^{(-)}(f) \right|^2 \right) = E\left( [\mathrm{Im}\,N_T(f)]^2 \right)$$
have turned out to be about half the expected value of the squared magnitude of $N_T$, which lets us write
$$E\left( \left| N_T(f) \right|^2 \right) \cong 2E\left( \left| N_T^{(+)}(f) \right|^2 \right) = 2E\left( [\mathrm{Re}\,N_T(f)]^2 \right) \qquad (3.69g)$$
and
$$E\left( \left| N_T(f) \right|^2 \right) \cong 2E\left( \left| N_T^{(-)}(f) \right|^2 \right) = 2E\left( [\mathrm{Im}\,N_T(f)]^2 \right). \qquad (3.69h)$$
A not-very-rigorous argument often used to derive Eqs. (3.69a), (3.69g), and (3.69h) starts out by breaking $N_T(f)$ into real and imaginary parts. (This step is sound; we did the same thing in
$$\left| N_T(f) \right|^2 = [\mathrm{Re}\,N_T(f)]^2 + [\mathrm{Im}\,N_T(f)]^2, \qquad (3.70a)$$
$$E\left( [\mathrm{Re}\,N_T(f)]^2 \right) = E\left( [\mathrm{Im}\,N_T(f)]^2 \right). \qquad (3.70b)$$
This is the result, of course, that we have gone to some trouble to justify analytically rather than just assuming it applies; it is sometimes true and sometimes very wrong, for example, when f = 0 or when $S_{NN}$ varies rapidly with f. Applying the expectation operator E to both sides of (3.70a) gives
$$E\left( \left| N_T(f) \right|^2 \right) = E\left( [\mathrm{Re}\,N_T(f)]^2 \right) + E\left( [\mathrm{Im}\,N_T(f)]^2 \right). \qquad (3.70c)$$
Substituting (3.70b) into (3.70c) then gives
$$E\left( \left| N_T(f) \right|^2 \right) = 2E\left( [\mathrm{Re}\,N_T(f)]^2 \right) \qquad (3.70d)$$
and
$$E\left( \left| N_T(f) \right|^2 \right) = 2E\left( [\mathrm{Im}\,N_T(f)]^2 \right). \qquad (3.70e)$$
Consulting Eqs. (3.65e) and (3.65f), we see that formulas (3.70d) and (3.70e) are identical to
(3.69g) and (3.69h). Fortunately, since a more rigorous line of reasoning has already been used to
derive Eqs. (3.69g) and (3.69h), there is no need to rely on the assumption that (3.70b) is true to
establish the truth of (3.70d) and (3.70e). Having derived these results more rigorously, we also
now know that formulas (3.69g) and (3.69h) and formulas (3.70d) and (3.70e) are approximations
that should be used only when T is large, when fT >> 1 , and when S NN varies slowly with
frequency f.
Although random function n E (t ) is neither ergodic nor stationary, we can assume that a real-
valued and stationary random function ñ(t) exists such that
Just like any other stationary random function, ñ(t) has an autocorrelation function [see Eq.
(3.30b)]
$$R_{nn}(t - t') = E\big( \tilde{n}(t')\,\tilde{n}(t) \big). \qquad (3.71c)$$
Following the conventions of Sec. 3.20 above [see Eqs. (3.48a)–(3.48c)], we note that $R_{nn}$ is an even function,
$$R_{nn}(-\tau) = R_{nn}(\tau), \qquad (3.71d)$$
$$S_{nn}(f) = \int_{-\infty}^{\infty} R_{nn}(\tau)\,e^{-2\pi i f \tau}\,d\tau \qquad (3.71e)$$
and
$$R_{nn}(\tau) = \int_{-\infty}^{\infty} S_{nn}(f)\,e^{2\pi i f \tau}\,df. \qquad (3.71f)$$
T 5
N T ( f ) ³ n (t ) e
2& ift
dt ³ (t , T ) n (t ) e
2& ift
dt (3.72a)
T 5
and
T 5
N TE ( f ) ³ nE (t ) e 2& ift dt ³ (t , T ) n E (t ) e2& ift dt . (3.72b)
T 5
$E(|N_{TE}(f)|^2)$, the expectation value of the squared magnitude of $N_{TE}$, in terms of $E(|N_T(f)|^2)$ and the power spectrum $S_{nn}(f)$.
We start by specifying the Heaviside step function to be the same as in Eq. (3.67j):
$$\Xi(t) = \begin{cases} 1 & \text{for } t > 0 \\ 1/2 & \text{for } t = 0 \\ 0 & \text{for } t < 0 \end{cases} \qquad (3.73a)$$
This is the same step function defined in Eq. (2.70a) in Chapter 2. It follows that $n_E(t)$ can be written as [see Eqs. (3.71a) and (3.71b)]
$$n_E(t) = \Xi(t)\,\tilde{n}(t) + \Xi(-t)\,\tilde{n}(-t). \qquad (3.73b)$$
Analyzing the Noise in Artificially Created Even Signals · 3.27
We note that for t > 0, the first term has Ξ(t) = 1 and the second term has Ξ(−t) = 0, so
$$n_E(t) = \tilde{n}(t).$$
For t < 0, the first term has Ξ(t) = 0 and the second term has Ξ(−t) = 1, so
$$n_E(t) = \tilde{n}(-t),$$
and at t = 0 both Ξ(t) and Ξ(−t) equal 1/2, so
$$n_E(0) = \tilde{n}(0).$$
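The construction of the even function from the step function can be sketched in a few lines. This is only an illustration: the sample table standing in for one realization of the stationary noise, and the names `step`, `make_even`, and `noise`, are hypothetical and not taken from the text.

```python
import random

def step(t):
    """Heaviside step of Eq. (3.73a): 1 for t > 0, 1/2 at t = 0, 0 for t < 0."""
    if t > 0:
        return 1.0
    if t < 0:
        return 0.0
    return 0.5

def make_even(n):
    """Return n_E(t) = step(t)*n(t) + step(-t)*n(-t) for a given function n."""
    return lambda t: step(t) * n(t) + step(-t) * n(-t)

# Hypothetical stand-in for one realization of the noise: a fixed table of
# random samples on an integer time grid, looked up by index.
rng = random.Random(0)
samples = {k: rng.gauss(0.0, 1.0) for k in range(-5, 6)}

def noise(t):
    return samples[t]

n_even = make_even(noise)

if __name__ == "__main__":
    for k in range(-5, 6):
        print(k, samples[k], n_even(k))
```

The printed table shows that the values of the original samples at negative times are discarded: the even function simply mirrors the t ≥ 0 half of the realization.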
We can now write, using Eq. (3.72b) and remembering that n E is real, that
$$E\left( \left| N_{TE}(f) \right|^2 \right) = E\big( N_{TE}(f)^* \cdot N_{TE}(f) \big) = E\left( \int_{-\infty}^{\infty} \Pi(t',T)\,n_E(t')\,e^{-2\pi i f t'}\,dt' \int_{-\infty}^{\infty} \Pi(t,T)\,n_E(t)\,e^{2\pi i f t}\,dt \right).$$
Using the linearity of E described in Sec. 3.10 above, we bring the expectation operator inside
the double integral over dt and dt ′ to get
$$E\left( \left| N_{TE}(f) \right|^2 \right) = \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi i f t}\,E\big( n_E(t')\,n_E(t) \big). \qquad (3.73c)$$
Equation (3.73b) shows that, again using the linearity of the expectation operator,
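$$\begin{aligned} E\big( n_E(t')\,n_E(t) \big) &= E\big( [\Xi(t')\tilde{n}(t') + \Xi(-t')\tilde{n}(-t')] \cdot [\Xi(t)\tilde{n}(t) + \Xi(-t)\tilde{n}(-t)] \big) \\ &= \Xi(t')\Xi(t)\,E\big( \tilde{n}(t')\tilde{n}(t) \big) + \Xi(-t')\Xi(t)\,E\big( \tilde{n}(-t')\tilde{n}(t) \big) \\ &\quad + \Xi(t')\Xi(-t)\,E\big( \tilde{n}(t')\tilde{n}(-t) \big) + \Xi(-t')\Xi(-t)\,E\big( \tilde{n}(-t')\tilde{n}(-t) \big). \end{aligned} \qquad (3.73d)$$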
Substituting the right-hand side of (3.73d) into the double integral in (3.73c) gives
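$$\begin{aligned} E\left( \left| N_{TE}(f) \right|^2 \right) &= \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi i f t}\,[\Xi(t')\Xi(t) + \Xi(-t')\Xi(-t)]\,R_{nn}(t'-t) \\ &\quad + \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi i f t}\,[\Xi(-t')\Xi(t) + \Xi(t')\Xi(-t)]\,R_{nn}(t+t') \\ &= \Lambda_1 + \Lambda_2, \end{aligned} \qquad (3.73e)$$
where
$$\Lambda_1 = \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi i f t}\,[\Xi(t')\Xi(t) + \Xi(-t')\Xi(-t)]\,R_{nn}(t'-t) \qquad (3.73f)$$
and
$$\Lambda_2 = \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\,e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\,e^{2\pi i f t}\,[\Xi(-t')\Xi(t) + \Xi(t')\Xi(-t)]\,R_{nn}(t+t'). \qquad (3.73g)$$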
The dark solid line in Fig. 3.8(a) is a plot of the Heaviside step function Ξ(t ) and the dashed
line is a plot of Π (t , T ) . Disregarding the edge points whose values do not contribute to the
integrals in (3.73f) and (3.73g), the product [Ξ (t ) ⋅ Π (t , T )] is zero unless both Ξ and Π are
one—that is, the product is zero unless t lies inside the region where both the solid and dashed
plots are one in Fig. 3.8(a). Comparing this region to the plot of
$$\Pi\left( t - \frac{T}{2}, \frac{T}{2} \right)$$
in Fig. 3.8(b), we see that
$$\Xi(t) \cdot \Pi(t,T) = \Pi\left( t - \frac{T}{2}, \frac{T}{2} \right). \qquad (3.74a)$$
In Fig. 3.8(c), the dashed line is again a plot of Π (t , T ) , but now the dark solid line is a plot of
Ξ(−t ) . Comparing the region where both Ξ (−t ) and Π (t , T ) are one in Fig. 3.8(c) to the plot of
$$\Pi\left( t + \frac{T}{2}, \frac{T}{2} \right)$$
we see that
$$\Xi(-t) \cdot \Pi(t,T) = \Pi\left( t + \frac{T}{2}, \frac{T}{2} \right). \qquad (3.74b)$$
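The two product rules (3.74a) and (3.74b) are easy to confirm numerically. The sketch below assumes that Π(t, T) is the rectangle function equal to one for |t| < T and zero for |t| > T, with the value 1/2 assigned at the edges; that edge convention is an assumption made here, and, as in the text, the edge points are skipped in the comparison. The function names are hypothetical.

```python
def step(t):
    """Heaviside step of Eq. (3.73a)."""
    if t > 0:
        return 1.0
    if t < 0:
        return 0.0
    return 0.5

def rect(t, half_width):
    """Rectangle function: 1 for |t| < half_width, 0 for |t| > half_width.

    The value 1/2 on the edges is an assumed convention.
    """
    a = abs(t)
    if a < half_width:
        return 1.0
    if a > half_width:
        return 0.0
    return 0.5

def check_products(T, n_points=401):
    """Compare both sides of (3.74a) and (3.74b) on a grid over [-2T, 2T]."""
    for k in range(n_points):
        t = -2 * T + 4 * T * k / (n_points - 1)
        if t in (-T, 0, T):
            continue  # edge points are disregarded, as in the text
        if step(t) * rect(t, T) != rect(t - T / 2, T / 2):
            return False
        if step(-t) * rect(t, T) != rect(t + T / 2, T / 2):
            return False
    return True

if __name__ == "__main__":
    print(check_products(100))
```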
[FIGURE 3.8. Plots versus t of the step and rectangle functions discussed above, with the t axes marked at −T and T.]
Splitting the formula in Eq. (3.73f) into two double integrals, we get that
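$$\begin{aligned} \Lambda_1 &= \int_{-\infty}^{\infty} dt'\,\Xi(t')\Pi(t',T)\,e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Xi(t)\Pi(t,T)\,e^{2\pi i f t}\,R_{nn}(t'-t) \\ &\quad + \int_{-\infty}^{\infty} dt'\,\Xi(-t')\Pi(t',T)\,e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Xi(-t)\Pi(t,T)\,e^{2\pi i f t}\,R_{nn}(t'-t), \end{aligned}$$
which becomes, after applying Eqs. (3.74a) and (3.74b),
$$\begin{aligned} \Lambda_1 &= \int_{-\infty}^{\infty} dt'\,\Pi\left( t' - \frac{T}{2}, \frac{T}{2} \right) e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi\left( t - \frac{T}{2}, \frac{T}{2} \right) e^{2\pi i f t}\,R_{nn}(t'-t) \\ &\quad + \int_{-\infty}^{\infty} dt'\,\Pi\left( t' + \frac{T}{2}, \frac{T}{2} \right) e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi\left( t + \frac{T}{2}, \frac{T}{2} \right) e^{2\pi i f t}\,R_{nn}(t'-t). \end{aligned}$$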
After changing the variables of integration in the first double integral from t , t ′ to τ = t − (T / 2)
and τ ′ = t ′ − (T / 2) , and changing the variables of integration in the second double integral from
t , t ′ to τ ′′ = t + (T / 2) and τ ′′′ = t ′ + (T / 2) , we see that
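$$\begin{aligned} \Lambda_1 &= \int_{-\infty}^{\infty} d\tau'\,\Pi\left( \tau', \frac{T}{2} \right) e^{-2\pi i f (\tau' + T/2)} \int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{2\pi i f (\tau + T/2)}\,R_{nn}(\tau'-\tau) \\ &\quad + \int_{-\infty}^{\infty} d\tau'''\,\Pi\left( \tau''', \frac{T}{2} \right) e^{-2\pi i f (\tau''' - T/2)} \int_{-\infty}^{\infty} d\tau''\,\Pi\left( \tau'', \frac{T}{2} \right) e^{2\pi i f (\tau'' - T/2)}\,R_{nn}(\tau'''-\tau''). \end{aligned}$$
Since
$$e^{-2\pi i f (\pm T/2)} \cdot e^{2\pi i f (\pm T/2)} = 1,$$
the double integral over dτ‴ and dτ″ has the same value as the double integral over dτ′ and dτ, which means that
$$\Lambda_1 = 2 \int_{-\infty}^{\infty} d\tau'\,\Pi\left( \tau', \frac{T}{2} \right) e^{-2\pi i f \tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{2\pi i f \tau}\,R_{nn}(\tau'-\tau).$$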
This type of double integral has already been evaluated in Sec. 3.26 while simplifying Eq.
(3.66b), but there is no harm in quickly repeating the procedure. Applying Eq. (3.71f), we get
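$$\begin{aligned} \Lambda_1 &= 2 \int_{-\infty}^{\infty} d\tau'\,\Pi\left( \tau', \frac{T}{2} \right) e^{-2\pi i f \tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{2\pi i f \tau} \int_{-\infty}^{\infty} df'\,S_{nn}(f')\,e^{2\pi i f' (\tau'-\tau)} \\ &= 2 \int_{-\infty}^{\infty} df'\,S_{nn}(f') \int_{-\infty}^{\infty} d\tau'\,\Pi\left( \tau', \frac{T}{2} \right) e^{-2\pi i (f-f') \tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{2\pi i (f-f') \tau}. \end{aligned}$$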
This expression can be simplified further using Eq. (3.66d). Equation (3.66d) still holds true if T is replaced by T/2 because the original T is a dummy parameter. So, replacing T by T/2 and substituting the result in the formula for $\Lambda_1$, we get
$$\Lambda_1 = 2T \int_{-\infty}^{\infty} S_{nn}(f') \left[ T\,\mathrm{sinc}^2\big( \pi T (f-f') \big) \right] df'. \qquad (3.75a)$$
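$$T\,\mathrm{sinc}^2(\pi T f) \to \delta(f) \quad \text{as } T \to \infty. \qquad (3.75b)$$
$$T\,\mathrm{sinc}^2\big( \pi T (f-f') \big) \cong \delta(f-f')$$
$$\Lambda_1 \cong 2T \int_{-\infty}^{\infty} S_{nn}(f')\,\delta(f-f')\,df'$$
or
$$\Lambda_1 \cong 2T\,S_{nn}(f). \qquad (3.75c)$$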
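The delta-function approximation behind (3.75c) can be tried out numerically. The sketch below is only an illustration: it uses a hypothetical smooth test spectrum (a Gaussian, not a spectrum from the text) and a simple midpoint sum for the integral in (3.75a) divided by 2T. The smoothed value should approach S(f) as T grows.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def gaussian_spectrum(f):
    """Hypothetical slowly varying test spectrum, chosen only for illustration."""
    return math.exp(-f * f)

def smoothed_spectrum(spectrum, f, T, f_max=8.0, df=1.0e-3):
    """Midpoint-sum estimate of the integral of S(f') * T*sinc^2(pi*T*(f - f')) df'."""
    n = int(round(2.0 * f_max / df))
    total = 0.0
    for k in range(n):
        fp = -f_max + (k + 0.5) * df
        total += spectrum(fp) * T * sinc(math.pi * T * (f - fp)) ** 2 * df
    return total

if __name__ == "__main__":
    f = 0.3
    for T in (5.0, 50.0):
        print(T, smoothed_spectrum(gaussian_spectrum, f, T), gaussian_spectrum(f))
```

The residual difference shrinks as T increases, which is the sense in which the sinc-squared kernel acts like δ(f − f′) when the spectrum varies slowly compared to the kernel's oscillation.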
To evaluate Λ 2 , we apply Eqs. (3.74a) and (3.74b) to the right-hand side of Eq. (3.73g) to get
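$$\begin{aligned} \Lambda_2 &= \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\Xi(-t')\,e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\Xi(t)\,e^{2\pi i f t}\,R_{nn}(t+t') \\ &\quad + \int_{-\infty}^{\infty} dt'\,\Pi(t',T)\Xi(t')\,e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi(t,T)\Xi(-t)\,e^{2\pi i f t}\,R_{nn}(t+t') \\ &= \int_{-\infty}^{\infty} dt'\,\Pi\left( t' + \frac{T}{2}, \frac{T}{2} \right) e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi\left( t - \frac{T}{2}, \frac{T}{2} \right) e^{2\pi i f t}\,R_{nn}(t+t') \\ &\quad + \int_{-\infty}^{\infty} dt'\,\Pi\left( t' - \frac{T}{2}, \frac{T}{2} \right) e^{-2\pi i f t'} \int_{-\infty}^{\infty} dt\,\Pi\left( t + \frac{T}{2}, \frac{T}{2} \right) e^{2\pi i f t}\,R_{nn}(t+t'). \end{aligned}$$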
In the first double integral, the t ′ , t variables of integration are replaced by τ ′ = t ′ + (T / 2) and
τ = t − (T / 2) respectively; and in the second double integral, the t ′ , t variables of integration are
replaced by τ ′′′ = t ′ − (T / 2) and τ ′′ = t + (T / 2) respectively. This leads to
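$$\begin{aligned} \Lambda_2 &= e^{2\pi i f T} \int_{-\infty}^{\infty} d\tau'\,\Pi\left( \tau', \frac{T}{2} \right) e^{-2\pi i f \tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{2\pi i f \tau}\,R_{nn}(\tau+\tau') \\ &\quad + e^{-2\pi i f T} \int_{-\infty}^{\infty} d\tau'''\,\Pi\left( \tau''', \frac{T}{2} \right) e^{-2\pi i f \tau'''} \int_{-\infty}^{\infty} d\tau''\,\Pi\left( \tau'', \frac{T}{2} \right) e^{2\pi i f \tau''}\,R_{nn}(\tau''+\tau'''). \end{aligned} \qquad (3.75d)$$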
Everything on the right-hand side of (3.75d) is real except the complex exponentials, so the
second term is the complex conjugate of the first term. It is easy to show that this is true. Starting
with the first term we have
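$$\begin{aligned} &\left[ e^{2\pi i f T} \int_{-\infty}^{\infty} d\tau'\,\Pi\left( \tau', \frac{T}{2} \right) e^{-2\pi i f \tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{2\pi i f \tau}\,R_{nn}(\tau+\tau') \right]^* \\ &\quad = e^{-2\pi i f T} \int_{-\infty}^{\infty} d\tau'\,\Pi\left( \tau', \frac{T}{2} \right) e^{2\pi i f \tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{-2\pi i f \tau}\,R_{nn}(\tau+\tau') \\ &\quad = e^{-2\pi i f T} \int_{-\infty}^{\infty} d\tau'''\,\Pi\left( \tau''', \frac{T}{2} \right) e^{-2\pi i f \tau'''} \int_{-\infty}^{\infty} d\tau''\,\Pi\left( \tau'', \frac{T}{2} \right) e^{2\pi i f \tau''}\,R_{nn}(\tau''+\tau'''), \end{aligned}$$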
where in the last step we interchange the order of the double integral and replace the dummy
variables of integration τ , τ ′ by τ ′′ , τ ′′′ respectively. Clearly, the second term in (3.75d) is the
complex conjugate of the first. Since 2 Re(c ) = c + c∗ for any complex number c, it follows that
Eq. (3.75d) can be written as
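$$\Lambda_2 = 2\,\mathrm{Re}\left( e^{2\pi i f T} \int_{-\infty}^{\infty} d\tau'\,\Pi\left( \tau', \frac{T}{2} \right) e^{-2\pi i f \tau'} \int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{2\pi i f \tau}\,R_{nn}(\tau+\tau') \right). \qquad (3.75e)$$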
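After the variable of integration of the inner integral is changed to $t'' = -(\tau + \tau')$, it can be written as
$$\int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{2\pi i f \tau}\,R_{nn}(\tau+\tau') = \int_{-\infty}^{\infty} dt''\,\Pi\left( -t''-\tau', \frac{T}{2} \right) e^{-2\pi i f (t''+\tau')}\,R_{nn}(-t''). \qquad (3.75f)$$
According to Eq. (3.48b) above and Eq. (2.56c) in Chapter 2, both Π and $R_{nn}$ are even functions,
$$\Pi\left( -t''-\tau', \frac{T}{2} \right) = \Pi\left( t''+\tau', \frac{T}{2} \right)$$
and
$$R_{nn}(-t'') = R_{nn}(t'').$$
Substituting these two formulas into the right-hand side of (3.75f) gives
$$\int_{-\infty}^{\infty} d\tau\,\Pi\left( \tau, \frac{T}{2} \right) e^{2\pi i f \tau}\,R_{nn}(\tau+\tau') = \int_{-\infty}^{\infty} dt''\,\Pi\left( t''+\tau', \frac{T}{2} \right) e^{-2\pi i f (t''+\tau')}\,R_{nn}(t''),$$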
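so that Eq. (3.75e) becomes
$$\Lambda_2 = 2\,\mathrm{Re}\left( e^{2\pi i f T} \int_{-\infty}^{\infty} d\tau'\,\Pi\left( \tau', \frac{T}{2} \right) e^{-2\pi i f \tau'} \int_{-\infty}^{\infty} dt''\,\Pi\left( t''+\tau', \frac{T}{2} \right) e^{-2\pi i f (t''+\tau')}\,R_{nn}(t'') \right).$$
Interchanging the order of integration and replacing the variable τ′ by t, we end up with
$$\Lambda_2 = 2\,\mathrm{Re}\left( e^{2\pi i f T} \int_{-\infty}^{\infty} dt''\,R_{nn}(t'')\,e^{-2\pi i f t''} \int_{-\infty}^{\infty} dt\,\Pi\left( t, \frac{T}{2} \right) \Pi\left( t''+t, \frac{T}{2} \right) e^{-4\pi i f t} \right). \qquad (3.75g)$$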
Comparing (3.75g) with (3.67e), we note that the double integral in the formula for 2 can be
written as
5 5
§ T· § T · 4& ift
³5 dt Rnn (t )e 5³ dt ¨© t, 2 ¸¹ ¨© t 33 t, 2 ¸¹ e
2& ift 33
33 33 T / 2
with the understanding that the random function is now ñ(t) instead of Ñ(t) as in Eq. (3.67e). This
leads to a simpler—well, shorter—formula for 2 ,
We have already found the appropriate approximation for T and T / 2 when T and T/2 are large
enough to make the sinc functions oscillate rapidly with f compared to the noise-power
spectrum. Hence, we now apply formula (3.68j) to (3.75h), which gives, after remembering to
replace Ñ by ñ and T by T/2,
Since
$$e^{2\pi i f T} = \cos(2\pi fT) + i \sin(2\pi fT),$$
Having found good approximations for $\Lambda_1$ and $\Lambda_2$, we can substitute (3.75c) and (3.75i) into (3.73e) to get
$$E\left( \left| N_{TE}(f) \right|^2 \right) \cong 2T\,S_{nn}(f) + 2T \cos(2\pi fT)\,\mathrm{sinc}(2\pi fT)\,S_{nn}(f)$$
or
$$E\left( \left| N_{TE}(f) \right|^2 \right) \cong 2T\,S_{nn}(f) \cdot \left[ 1 + \cos(2\pi fT)\,\mathrm{sinc}(2\pi fT) \right]. \qquad (3.76a)$$
When fT >> 1, we have
$$\left| \cos(2\pi fT)\,\mathrm{sinc}(2\pi fT) \right| = \left| \frac{\cos(2\pi fT) \sin(2\pi fT)}{2\pi fT} \right| \le \frac{1}{2\pi fT} \ll 1$$
because (i) the absolute value of the product of the sine and cosine must always be less than or equal to one and (ii) the value of 1/(2πfT) must be small when fT is large. The formula in (3.76a) now simplifies to
$$E\left( \left| N_{TE}(f) \right|^2 \right) \cong 2T\,S_{nn}(f). \qquad (3.76c)$$
This will be a useful approximation to know when analyzing detector noise in Chapter 6.
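To get a feel for how quickly the correction term dies away, the following sketch evaluates the bracketed factor of (3.76a), taken here to be 1 + cos(2πfT) sinc(2πfT). The function names are hypothetical. The factor equals 2 at f = 0 and stays within 1/(2πfT) of one otherwise, which is the bound used above.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def even_signal_factor(f, T):
    """Bracketed factor 1 + cos(2*pi*f*T) * sinc(2*pi*f*T)."""
    x = 2.0 * math.pi * f * T
    return 1.0 + math.cos(x) * sinc(x)

if __name__ == "__main__":
    T = 1.0
    for f in (0.0, 0.1, 1.3, 10.3, 100.3):
        print(f, even_signal_factor(f, T))
```

Because cos(x) sin(x)/x = sin(2x)/(2x), the same factor can also be written as 1 + sinc(4πfT).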
__________
The basic concepts introduced in this chapter—such as random variables and functions, the
autocorrelation function, the noise-power spectrum, stationarity and ergodicity—may not be as
important as the Fourier theory covered in Chapter 2, but they turn up over and over again in the
following pages. The Wiener-Khinchin theorem is used to transform electromagnetic wavefields
into the spectral radiances that Michelson interferometers are built to measure. Stationary random
functions are added to interference signals to represent what happens when the interference
signals become contaminated by noise. The expectation operator E is applied to the products of
random quantities to turn them into autocorrelation functions, and the autocorrelation functions
are then transformed into noise-power spectra in formulas for the random-measurement error.
This chapter has explained the statistical ideas behind these procedures—and the context in
which the ideas arise—to show what the formulas mean and why they make sense.
4
FROM MAXWELL’S EQUATIONS TO
THE MICHELSON INTERFEROMETER
The interference formulas for a highly idealized version of the standard Michelson interferometer
can be derived in a page or two, and that is what is done in most textbooks. Section 1.5 of
Chapter 1 lays out the basic approach of this derivation, pointing out that all we really need is the
19th-century ether-wave theory of light because a full knowledge of Maxwell’s equations is not
required. Afterwards, these ideal interference formulas can, with some difficulty and an appeal to
ad hoc arguments, be modified to handle the measurement errors and distortions present in
nonideal instruments, but this is difficult to do in a straightforward and convincing way.
Consequently, in this chapter we prefer to start with first principles, carefully tracing the plane-
wave solutions to Maxwell’s equations through the standard Michelson interferometer and then
applying the Fourier methodology and random-signal theory explained in the previous two
chapters to describe the electromagnetic wavefields leaving the instrument. Although longer than
the standard textbook procedure, this approach leads naturally to detailed formulas describing
what happens when the optical setup is slightly misaligned, what happens when the input
radiation is polarized, and what happens when the interferometer measures an input spectrum that
is nonuniform over its field of view. We do this both for the interferometer’s balanced
interference signal and its unbalanced background signal, explaining first the reasoning behind
the formulas for the balanced input signal and then showing how the same sort of analysis
produces similar formulas for the unbalanced background signal. At the end of this process, the
reader has a detailed understanding of how the formulas describing ideal Michelson interferometers should be modified and expanded to describe nonideal instruments in an imperfect world.
Deriving the Electromagnetic Wave Equations · 4.1
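$$\vec{\nabla} \times \vec{B} = \mu_o \varepsilon_o \frac{\partial \vec{E}}{\partial t}, \qquad (4.1a)$$
$$\vec{\nabla} \times \vec{E} = -\frac{\partial \vec{B}}{\partial t}, \qquad (4.1b)$$
$$\vec{\nabla} \cdot \vec{E} = 0, \qquad (4.1c)$$
and
$$\vec{\nabla} \cdot \vec{B} = 0 \qquad (4.1d)$$
where
$$\mu_o = 4\pi \cdot 10^{-7}\ \text{henry/meter}$$
and
$$\varepsilon_o = \frac{1}{\mu_o c^2}. \qquad (4.1e)$$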
In these equations, $\vec{E}$ is the electric field, which is a function of position and time; $\vec{B}$ is the magnetic-induction field, which is also a function of position and time; t is the time coordinate; $\mu_o$ is the magnetic permeability of free space; $\varepsilon_o$ is the permittivity of free space; c is the velocity of light; and $\vec{\nabla}$ is the standard vector-derivative “del” operator [see Eq. (4A.7a) in Appendix 4A for a definition]. We take the curl of both sides in Eqs. (4.1a) and (4.1b) to get
$$\vec{\nabla} \times [\vec{\nabla} \times \vec{B}] = \mu_o \varepsilon_o \frac{\partial}{\partial t} \left( \vec{\nabla} \times \vec{E} \right) \qquad (4.2a)$$
and
$$\vec{\nabla} \times [\vec{\nabla} \times \vec{E}] = -\frac{\partial}{\partial t} \left( \vec{\nabla} \times \vec{B} \right). \qquad (4.2b)$$
But for any vector field $\vec{v}$, we have the identity
$$\vec{\nabla} \times [\vec{\nabla} \times \vec{v}] = \vec{\nabla} \left( \vec{\nabla} \cdot \vec{v} \right) - \nabla^2 \vec{v}. \qquad (4.2c)$$
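Applying (4.2c) to the left-hand sides of Eqs. (4.2a) and (4.2b) gives
$$\vec{\nabla} \left( \vec{\nabla} \cdot \vec{B} \right) - \nabla^2 \vec{B} = \mu_o \varepsilon_o \frac{\partial}{\partial t} \left( \vec{\nabla} \times \vec{E} \right),$$
$$\vec{\nabla} \left( \vec{\nabla} \cdot \vec{E} \right) - \nabla^2 \vec{E} = -\frac{\partial}{\partial t} \left( \vec{\nabla} \times \vec{B} \right),$$
or
$$\nabla^2 \vec{B} - \mu_o \varepsilon_o \frac{\partial^2 \vec{B}}{\partial t^2} = 0,$$
$$\nabla^2 \vec{E} - \mu_o \varepsilon_o \frac{\partial^2 \vec{E}}{\partial t^2} = 0,$$
where we have used $\vec{\nabla} \cdot \vec{B} = \vec{\nabla} \cdot \vec{E} = 0$ from (4.1c) and (4.1d) and
$$\vec{\nabla} \times \vec{E} = -\partial \vec{B} / \partial t, \qquad \vec{\nabla} \times \vec{B} = \mu_o \varepsilon_o\,\partial \vec{E} / \partial t$$
from (4.1a) and (4.1b) to simplify our results. The substitution $\mu_o \varepsilon_o = c^{-2}$ from (4.1e) now gives
$$\nabla^2 \vec{B} - \frac{1}{c^2} \frac{\partial^2 \vec{B}}{\partial t^2} = 0 \qquad (4.3a)$$
and
$$\nabla^2 \vec{E} - \frac{1}{c^2} \frac{\partial^2 \vec{E}}{\partial t^2} = 0. \qquad (4.3b)$$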
Equation (4.3a) is the wave equation for $\vec{B}$, the magnetic-induction field as a function of position and time; and (4.3b) is the wave equation for $\vec{E}$, the electric field as a function of position and time. Because $\vec{E}$ and $\vec{B}$ are vectors and the wave equation is usually applied to scalar fields, we now rewrite Eqs. (4.3a) and (4.3b) as a collection of six scalar wave equations to show the meaning of the two vector wave equations. The first step is to identify the $\vec{E}$ and $\vec{B}$ Cartesian field components. Figure 4.1 specifies a three-dimensional Cartesian coordinate system for the $\vec{E}$ and $\vec{B}$ field vectors located at a single point P. We use the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors of the coordinate system to write
$$\vec{E} = \hat{x} E_x + \hat{y} E_y + \hat{z} E_z \qquad (4.4a)$$
and
$$\vec{B} = \hat{x} B_x + \hat{y} B_y + \hat{z} B_z, \qquad (4.4b)$$
where, as shown in Fig. 4.1, $E_x$, $E_y$, $E_z$ are the real x, y, z components of the electric field and $B_x$, $B_y$, $B_z$ are the real x, y, z components of the magnetic-induction field. Both $E_{x,y,z}$ and $B_{x,y,z}$ are, of course, functions of position and time. We define a position vector
$$\vec{r} = \hat{x} x + \hat{y} y + \hat{z} z \qquad (4.4c)$$
and show the dependence of the $\vec{E}$ and $\vec{B}$ fields on position and time by rewriting (4.4a) and (4.4b) as
FIGURE 4.1. [Cartesian x, y, z coordinate system with the $\vec{E}$ and $\vec{B}$ field vectors located at the same point P. One panel draws only the $\vec{E}$ field and its x, y, z components ($E_x < 0$, $E_y > 0$, $E_z > 0$); the other draws only the $\vec{B}$ field and its x, y, z components ($B_x > 0$, $B_y > 0$, $B_z < 0$).]
$$\vec{E}(\vec{r},t) = \hat{x} E_x(\vec{r},t) + \hat{y} E_y(\vec{r},t) + \hat{z} E_z(\vec{r},t)$$
and
$$\vec{B}(\vec{r},t) = \hat{x} B_x(\vec{r},t) + \hat{y} B_y(\vec{r},t) + \hat{z} B_z(\vec{r},t).$$
This notation is best regarded as a shorthand for [see the discussion after Eq. (2.109d) in Sec. 2.25 of Chapter 2]
$$\vec{E}(x,y,z,t) = \hat{x} E_x(x,y,z,t) + \hat{y} E_y(x,y,z,t) + \hat{z} E_z(x,y,z,t)$$
and
$$\vec{B}(x,y,z,t) = \hat{x} B_x(x,y,z,t) + \hat{y} B_y(x,y,z,t) + \hat{z} B_z(x,y,z,t).$$
For any vector $\vec{v}$ we have, according to Eq. (4A.11c) in Appendix 4A,
$$\nabla^2 \vec{v} = \hat{x} \nabla^2 v_x + \hat{y} \nabla^2 v_y + \hat{z} \nabla^2 v_z$$
where $v_x$, $v_y$, $v_z$ are the real x, y, z components of real vector $\vec{v}$. It follows that substitution of Eqs. (4.4a) and (4.4b) into (4.3a) and (4.3b) gives six scalar wave equations, one for each Cartesian component of the two vector equations (4.3a) and (4.3b):
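$$\nabla^2 E_x - \frac{1}{c^2} \frac{\partial^2 E_x}{\partial t^2} = \frac{\partial^2 E_x}{\partial x^2} + \frac{\partial^2 E_x}{\partial y^2} + \frac{\partial^2 E_x}{\partial z^2} - \frac{1}{c^2} \frac{\partial^2 E_x}{\partial t^2} = 0, \qquad (4.5a)$$
$$\nabla^2 E_y - \frac{1}{c^2} \frac{\partial^2 E_y}{\partial t^2} = \frac{\partial^2 E_y}{\partial x^2} + \frac{\partial^2 E_y}{\partial y^2} + \frac{\partial^2 E_y}{\partial z^2} - \frac{1}{c^2} \frac{\partial^2 E_y}{\partial t^2} = 0, \qquad (4.5b)$$
$$\nabla^2 E_z - \frac{1}{c^2} \frac{\partial^2 E_z}{\partial t^2} = \frac{\partial^2 E_z}{\partial x^2} + \frac{\partial^2 E_z}{\partial y^2} + \frac{\partial^2 E_z}{\partial z^2} - \frac{1}{c^2} \frac{\partial^2 E_z}{\partial t^2} = 0, \qquad (4.5c)$$
$$\nabla^2 B_x - \frac{1}{c^2} \frac{\partial^2 B_x}{\partial t^2} = \frac{\partial^2 B_x}{\partial x^2} + \frac{\partial^2 B_x}{\partial y^2} + \frac{\partial^2 B_x}{\partial z^2} - \frac{1}{c^2} \frac{\partial^2 B_x}{\partial t^2} = 0, \qquad (4.5d)$$
$$\nabla^2 B_y - \frac{1}{c^2} \frac{\partial^2 B_y}{\partial t^2} = \frac{\partial^2 B_y}{\partial x^2} + \frac{\partial^2 B_y}{\partial y^2} + \frac{\partial^2 B_y}{\partial z^2} - \frac{1}{c^2} \frac{\partial^2 B_y}{\partial t^2} = 0, \qquad (4.5e)$$
$$\nabla^2 B_z - \frac{1}{c^2} \frac{\partial^2 B_z}{\partial t^2} = \frac{\partial^2 B_z}{\partial x^2} + \frac{\partial^2 B_z}{\partial y^2} + \frac{\partial^2 B_z}{\partial z^2} - \frac{1}{c^2} \frac{\partial^2 B_z}{\partial t^2} = 0. \qquad (4.5f)$$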
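Each of these scalar equations can be spot-checked with finite differences. The sketch below is only an illustration under stated assumptions: it takes a field that depends on z and t alone, so the x and y derivatives vanish, and it uses a cosine of z − ct as a trial solution. The function names, the sample wavenumber, and the sample point are hypothetical choices.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def plane_wave(z, t, sigma, phase=0.0):
    """Trial scalar field cos(2*pi*sigma*(z - c*t) + phase); sigma is in 1/m."""
    return math.cos(2.0 * math.pi * sigma * (z - C * t) + phase)

def wave_equation_residual(field, z, t, dz, dt):
    """Central-difference estimate of d2u/dz2 - (1/c^2) d2u/dt2 at (z, t)."""
    d2z = (field(z + dz, t) - 2.0 * field(z, t) + field(z - dz, t)) / dz ** 2
    d2t = (field(z, t + dt) - 2.0 * field(z, t) + field(z, t - dt)) / dt ** 2
    return d2z - d2t / C ** 2

def relative_residual(field, sigma, z, t):
    """Residual divided by (2*pi*sigma)^2, the natural size of each term."""
    dz = 1.0e-3 / sigma  # a small fraction of the spatial period 1/sigma
    dt = dz / C
    return wave_equation_residual(field, z, t, dz, dt) / (2.0 * math.pi * sigma) ** 2

if __name__ == "__main__":
    sigma = 2.0e6  # hypothetical wavenumber, 1/m
    good = lambda z, t: plane_wave(z, t, sigma)
    # A wave moving at half the speed of light should NOT satisfy the equation.
    bad = lambda z, t: math.cos(2.0 * math.pi * sigma * (z - 0.5 * C * t))
    print(relative_residual(good, sigma, 0.2 / sigma, 0.0))
    print(relative_residual(bad, sigma, 0.2 / sigma, 0.0))
```

The first residual is small compared to one, limited only by the finite-difference step, while the second is not, which shows that the propagation speed in the trial solution has to be c.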
where all the $f_A$ values are real and $\vec{E}_A$, $\vec{B}_A$ may be complex vector functions of position. Substituting (4.6a) and (4.6b) into (4.3a) and (4.3b) shows that we end up with
$$\sum_A \left[ \left( \nabla^2 \vec{E}_A + 4\pi^2 \sigma_A^2 \vec{E}_A \right) e^{-2\pi i f_A t} \right] = 0$$
and
$$\sum_A \left[ \left( \nabla^2 \vec{B}_A + 4\pi^2 \sigma_A^2 \vec{B}_A \right) e^{-2\pi i f_A t} \right] = 0$$
if we define
$$\sigma_A = \frac{f_A}{c}. \qquad (4.7a)$$
The only way these sums can be identically zero for all times t is to set
$$\nabla^2 \vec{E}_A + 4\pi^2 \sigma_A^2 \vec{E}_A = 0 \qquad (4.7b)$$
and
$$\nabla^2 \vec{B}_A + 4\pi^2 \sigma_A^2 \vec{B}_A = 0 \qquad (4.7c)$$
and
$$\vec{B}_A(\vec{r}) = \sum_j \vec{B}_{Aj}\,e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})}, \qquad (4.8b)$$
where all the $\vec{k}_{Aj}$ are constant, real, three-dimensional vectors and $\vec{E}_{Aj}$, $\vec{B}_{Aj}$ are complex, constant, three-dimensional vectors. In terms of the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors of Fig. 4.1,
$$\vec{k}_{Aj} = \hat{x} k_{Ajx} + \hat{y} k_{Ajy} + \hat{z} k_{Ajz},$$
Electromagnetic Plane Waves · 4.2
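$$\begin{aligned} \nabla^2 \vec{E}_A(\vec{r}) &= \sum_j \vec{E}_{Aj}\,\nabla^2 \left[ e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})} \right] \\ &= \sum_j \vec{E}_{Aj} \left[ \frac{\partial^2}{\partial x^2} e^{2\pi i (x k_{Ajx} + y k_{Ajy} + z k_{Ajz})} + \frac{\partial^2}{\partial y^2} e^{2\pi i (x k_{Ajx} + y k_{Ajy} + z k_{Ajz})} + \frac{\partial^2}{\partial z^2} e^{2\pi i (x k_{Ajx} + y k_{Ajy} + z k_{Ajz})} \right] \\ &= -4\pi^2 \sum_j \left( k_{Ajx}^2 + k_{Ajy}^2 + k_{Ajz}^2 \right) \vec{E}_{Aj}\,e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})} = -4\pi^2 \sum_j \left| \vec{k}_{Aj} \right|^2 \vec{E}_{Aj}\,e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})} \end{aligned}$$
and similarly,
$$\nabla^2 \vec{B}_A(\vec{r}) = -4\pi^2 \sum_j \left| \vec{k}_{Aj} \right|^2 \vec{B}_{Aj}\,e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})}.$$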
Substitution of these two results and Eqs. (4.8a) and (4.8b) into (4.7b) and (4.7c) gives
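$$\sum_j \left[ \vec{E}_{Aj}\,e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})} \left( \sigma_A^2 - \left| \vec{k}_{Aj} \right|^2 \right) \right] = 0 \qquad (4.9a)$$
and
$$\sum_j \left[ \vec{B}_{Aj}\,e^{2\pi i (\vec{k}_{Aj} \cdot \vec{r})} \left( \sigma_A^2 - \left| \vec{k}_{Aj} \right|^2 \right) \right] = 0. \qquad (4.9b)$$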
This can be true over all values of $\vec{r}$ with nonzero values of $\vec{E}_{Aj}$ and $\vec{B}_{Aj}$ only when
$$\sigma_A^2 = \left| \vec{k}_{Aj} \right|^2 \qquad (4.9c)$$
for all values of A and j. Equation (4.9c) requires the real vector $\vec{k}_{Aj}$ to have a magnitude $|\vec{k}_{Aj}| = |\sigma_A|$ that depends only on index A. This suggests that the j index specifies the different directions taken on by the $\vec{k}_{Aj}$ vectors, giving
$$\vec{k}_{Aj} = |\sigma_A| \cdot \hat{\Omega}_{Aj}.$$
Here $\hat{\Omega}_{Aj}$ is a dimensionless unit vector, called the propagation vector, which for a specified value of A points in different directions for different values of j. In fact, nothing stops us from assuming that the $\hat{\Omega}_{Aj}$ propagation vectors range over the same (indefinitely large) set of j directions for each A value; if we want to leave out some j direction for a given A, we can always remove those directions by making both $\vec{E}_{Aj}$ and $\vec{B}_{Aj}$ zero for the unwanted values of A and j. We can thus write
$$\vec{k}_{Aj} = |\sigma_A| \cdot \hat{\Omega}_j. \qquad (4.9d)$$
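Substitution of (4.8a), (4.8b), (4.7a), and (4.9d) into (4.6a) and (4.6b) gives
$$\vec{E}(\vec{r},t) = \sum_A \sum_j \vec{E}_{Aj}\,e^{2\pi i |\sigma_A| \left( \hat{\Omega}_j \cdot \vec{r} - (\sigma_A / |\sigma_A|)\,ct \right)} \qquad (4.10a)$$
and
$$\vec{B}(\vec{r},t) = \sum_A \sum_j \vec{B}_{Aj}\,e^{2\pi i |\sigma_A| \left( \hat{\Omega}_j \cdot \vec{r} - (\sigma_A / |\sigma_A|)\,ct \right)}. \qquad (4.10b)$$
$$2\pi |\sigma_A| (\hat{\Omega}_j \cdot \vec{r} - ct) \quad \text{if} \quad \sigma_A / |\sigma_A| = 1$$
and
$$2\pi |\sigma_A| (\hat{\Omega}_j \cdot \vec{r} + ct) \quad \text{if} \quad \sigma_A / |\sigma_A| = -1.$$
When
$$\sigma_A / |\sigma_A| = 1,$$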
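Figure 4.2 shows that the choice made here is to have the phase increasing in the direction of $\hat{\Omega}_j$
$$\vec{E}(\vec{r},t) = \sum_A \sum_j \vec{E}_{Aj}\,e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \qquad (4.11a)$$
and
$$\vec{B}(\vec{r},t) = \sum_A \sum_j \vec{B}_{Aj}\,e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)}. \qquad (4.11b)$$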
The next section explains why these double sums are called electromagnetic plane waves.
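We define
$$\hat{\Omega}_j = \hat{x} \varepsilon_{jx} + \hat{y} \varepsilon_{jy} + \hat{z} \varepsilon_{jz} \qquad (4.12a)$$
and note that, in terms of the angles shown in Fig. 4.3,
$$\varepsilon_{jx} = \hat{\Omega}_j \cdot \hat{x} = \cos(\theta_{jx}), \quad \varepsilon_{jy} = \hat{\Omega}_j \cdot \hat{y} = \cos(\theta_{jy}), \quad \varepsilon_{jz} = \hat{\Omega}_j \cdot \hat{z} = \cos(\theta_{jz}). \qquad (4.12b)$$
The standard relationship between direction cosines (that the sum of their squares is one) is the same as the requirement that $\hat{\Omega}_j$ have unit length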
Although we have chosen $\vec{E}$ and $\vec{B}$ to satisfy the vector wave equations (4.3a) and (4.3b), they must also satisfy the full set of Maxwell conditions, Eqs. (4.1a)–(4.1d). Substituting (4.11a) into (4.1c) gives, using Eq. (4A.12b) from Appendix 4A,
$$\sum_A \sum_j \vec{\nabla} \cdot \left[ \vec{E}_{Aj}\,e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right] = \sum_A \sum_j \vec{E}_{Aj} \cdot \vec{\nabla} \left[ e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right] = 0. \qquad (4.13a)$$
FIGURE 4.2. [Planes of constant phase drawn perpendicular to the unit vector $\hat{\Omega}_j$.] The planes of constant phase are specified by $\hat{\Omega}_j \cdot \vec{r} = ct = \text{constant}$, with each value of ct specifying a different plane perpendicular to $\hat{\Omega}_j$.
FIGURE 4.3. [The unit vector $\hat{\Omega}_j$ and the angles $\theta_{jx}$, $\theta_{jy}$, $\theta_{jz}$.]
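The only way (4.14a) and (4.14b) can hold true for all values of $\vec{r}$ and t with nonzero $\sigma_A$ is to require
$$\vec{E}_{Aj} \cdot \hat{\Omega}_j = 0 \qquad (4.14c)$$
and
$$\vec{B}_{Aj} \cdot \hat{\Omega}_j = 0 \qquad (4.14d)$$
for all values of A and j. Working next with Eq. (4.1a), we substitute (4.11a) and (4.11b) to get
$$\sum_A \sum_j \vec{\nabla} \times \left[ \vec{B}_{Aj}\,e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right] = \mu_o \varepsilon_o \sum_A \sum_j \frac{\partial}{\partial t} \left[ \vec{E}_{Aj}\,e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \right]$$
Substituting from Eq. (4.13b) and using $\mu_o \varepsilon_o = c^{-2}$ [see Eq. (4.1e)] gives
$$\sum_A \sum_j 2\pi i \sigma_A\,e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \left[ \vec{B}_{Aj} \times \hat{\Omega}_j - \frac{1}{c} \vec{E}_{Aj} \right] = 0. \qquad (4.15a)$$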
The only way this can be true for all $\vec{r}$ and t with nonzero $\sigma_A$ is if
$$c (\vec{B}_{Aj} \times \hat{\Omega}_j) = \vec{E}_{Aj} \qquad (4.15b)$$
for all values of A and j. Similarly, substitution of (4.11a) and (4.11b) into (4.1b) gives
$$\sum_A \sum_j 2\pi i \sigma_A\,e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \left[ \vec{E}_{Aj} \times \hat{\Omega}_j + c \vec{B}_{Aj} \right] = 0. \qquad (4.15c)$$
The only way (4.15c) can hold true for all $\vec{r}$ and t with nonzero $\sigma_A$ is if
$$\vec{E}_{Aj} \times \hat{\Omega}_j = -c \vec{B}_{Aj} \qquad (4.15d)$$
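for all values of A and j. It is not difficult to show that (4.15b) and (4.15d) are just different forms of the same equation. Taking the cross product of the left-hand side of (4.15d) with $\hat{\Omega}_j$ gives,
$$\hat{\Omega}_j \times (\vec{E}_{Aj} \times \hat{\Omega}_j) = \vec{E}_{Aj} (\hat{\Omega}_j \cdot \hat{\Omega}_j) - \hat{\Omega}_j (\hat{\Omega}_j \cdot \vec{E}_{Aj}) = \vec{E}_{Aj},$$
where we use $\hat{\Omega}_j \cdot \vec{E}_{Aj} = 0$ from Eq. (4.14c) and that $\hat{\Omega}_j \cdot \hat{\Omega}_j = 1$ because $\hat{\Omega}_j$ has unit length. Therefore taking the cross product of both sides of (4.15d) with $\hat{\Omega}_j$ gives
$$\vec{E}_{Aj} = -c\,\hat{\Omega}_j \times \vec{B}_{Aj} = c\,\vec{B}_{Aj} \times \hat{\Omega}_j,$$
which is the same as Eq. (4.15b). We can also take the cross product of the left-hand side of (4.15b) with $\hat{\Omega}_j$ and use $\hat{\Omega}_j \cdot \vec{B}_{Aj} = 0$ from (4.14d) and Eq. (4A.14) in Appendix 4A to get
$$c [\hat{\Omega}_j \times (\vec{B}_{Aj} \times \hat{\Omega}_j)] = c \vec{B}_{Aj}.$$
Taking the cross product of both the right-hand and left-hand sides of (4.15b) with $\hat{\Omega}_j$ now must give
$$c \vec{B}_{Aj} = \hat{\Omega}_j \times \vec{E}_{Aj} = -\vec{E}_{Aj} \times \hat{\Omega}_j.$$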
This is the same formula as Eq. (4.15d). Hence, as stated above, the restrictions placed on $\hat{\Omega}_j$ and the complex vectors $\vec{E}_{Aj}$, $\vec{B}_{Aj}$ in Eqs. (4.14c) and (4.14d) make (4.15b) and (4.15d) the same equality. We see that the double sums shown in (4.11a) and (4.11b) lead to acceptable complex solutions to the vector wave equations for $\vec{E}$ and $\vec{B}$ in (4.3a) and (4.3b); and when the restrictions (4.14c), (4.14d), and either (4.15b) or (4.15d) are placed on $\hat{\Omega}_j$, $\vec{E}_{Aj}$, and $\vec{B}_{Aj}$, the double sums also satisfy (4.1a)–(4.1d), Maxwell’s equations for empty space. No limits are placed on the size of these double sums. This means we can create two different double sums, both matching the criteria of this section and so solving Maxwell’s equations, and add them together to get one big double sum matching the criteria of this section and solving Maxwell’s equations. In general we can add together any number of plane-wave solutions to Maxwell’s equations to create a new and larger collection of plane waves solving Maxwell’s equations.
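The restrictions collected above are simple to check for a concrete case. In the sketch below, the propagation vector and the complex electric amplitude are hypothetical example values, chosen so that the amplitude is perpendicular to the propagation vector; the magnetic amplitude is then built from (4.15d), and (4.14d) and (4.15b) are verified numerically. The helper names are not from the text.

```python
C = 299_792_458.0  # speed of light in m/s

def cross(a, b):
    """Cross product of two 3-component tuples (entries may be complex)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Plain (unconjugated) dot product, as used in (4.14c) and (4.14d)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def magnetic_amplitude(e_amp, omega):
    """B = (1/c) * (Omega x E), which is Eq. (4.15d) solved for B."""
    return tuple(comp / C for comp in cross(omega, e_amp))

# Hypothetical example: a unit propagation vector in the x-z plane and a
# complex E amplitude constructed to be perpendicular to it.
omega = (0.6, 0.0, 0.8)
e_amp = (0.8 * (1 + 2j), 3 - 1j, -0.6 * (1 + 2j))
b_amp = magnetic_amplitude(e_amp, omega)

if __name__ == "__main__":
    print("E . Omega     =", dot(e_amp, omega))
    print("c B . Omega   =", C * dot(b_amp, omega))
    print("c (B x Omega) =", tuple(C * v for v in cross(b_amp, omega)))
    print("E             =", e_amp)
```

Both dot products come out as zero to rounding error, and c(B × Ω) reproduces E, so one of (4.15b) and (4.15d) is enough once the transversality conditions hold.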
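$$\vec{E}(\vec{r},t) = \vec{E}_{Aj}\,e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \qquad (4.16a)$$
and
$$\vec{B}(\vec{r},t) = \vec{B}_{Aj}\,e^{2\pi i \sigma_A (\hat{\Omega}_j \cdot \vec{r} - ct)} \qquad (4.16b)$$
with
$$\vec{E}_{Aj} \cdot \hat{\Omega}_j = \vec{B}_{Aj} \cdot \hat{\Omega}_j = 0 \quad \text{and} \quad \vec{B}_{Aj} = c^{-1} (\hat{\Omega}_j \times \vec{E}_{Aj}) \qquad (4.16c)$$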
from (4.14c), (4.14d), and (4.15d). Although it is customary to leave wave formulas in complex form, strictly speaking only the real parts (or imaginary parts, see discussion at end of Appendix 4A) of the right-hand sides of (4.16a) and (4.16b) provide acceptable physical solutions to wave Eqs. (4.3a) and (4.3b). Since an x, y, z coordinate system has not yet been specified, nothing stops us from choosing the z axis to be parallel to $\hat{\Omega}_j$; and because both $\hat{z}$ and $\hat{\Omega}_j$ are dimensionless, real, unit-length vectors, we then have $\hat{\Omega}_j = \hat{z}$. Equations (4.14c) and (4.14d) now show that the complex vectors $\vec{E}_{Aj}$ and $\vec{B}_{Aj}$ have zero z components, allowing us to write
$$\vec{E}_{Aj} = \hat{x} E_{Ajx} + \hat{y} E_{Ajy} \qquad (4.17a)$$
and
$$\vec{B}_{Aj} = \hat{x} B_{Ajx} + \hat{y} B_{Ajy} \qquad (4.17b)$$
Monochromatic Wave Trains · 4.3
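where $E_{Ajx}$, $E_{Ajy}$, $B_{Ajx}$, $B_{Ajy}$ are all complex numbers. Substituting into (4.15b) gives, using
$$\hat{x} \times \hat{\Omega}_j = \hat{x} \times \hat{z} = -\hat{y} \quad \text{and} \quad \hat{y} \times \hat{\Omega}_j = \hat{y} \times \hat{z} = \hat{x},$$
$$\hat{x} E_{Ajx} + \hat{y} E_{Ajy} = c \left( \hat{x} \times \hat{\Omega}_j\,B_{Ajx} + \hat{y} \times \hat{\Omega}_j\,B_{Ajy} \right) = -\hat{y} (c B_{Ajx}) + \hat{x} (c B_{Ajy}), \qquad (4.17c)$$
which means that
$$E_{Ajx} = c B_{Ajy} \qquad (4.17d)$$
and
$$E_{Ajy} = -c B_{Ajx}. \qquad (4.17e)$$
If we write
$$E_{Ajx} = |E_{Ajx}|\,e^{i \phi_{Ajx}} \qquad (4.18a)$$
and
$$E_{Ajy} = |E_{Ajy}|\,e^{i \phi_{Ajy}} \qquad (4.18b)$$
using real phase terms $\phi_{Ajx}$ and $\phi_{Ajy}$ to describe the $E_{Ajx}$, $E_{Ajy}$ complex constants, it then follows from (4.17d) and (4.17e), because c is real, that
$$B_{Ajy} = \frac{1}{c} |E_{Ajx}|\,e^{i \phi_{Ajx}} \qquad (4.18c)$$
and
$$B_{Ajx} = -\frac{1}{c} |E_{Ajy}|\,e^{i \phi_{Ajy}}. \qquad (4.18d)$$
so that taking the real part of the right-hand sides of Eqs. (4.16a) and (4.16b) gives, using $\hat{\Omega}_j = \hat{z}$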
- 345 -
4 · From Maxwell’s Equations to the Michelson Interferometer
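$$\begin{aligned}
\mathrm{Re}\!\left[\vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right]
&= \mathrm{Re}\!\left[\left(\hat{x}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}} + \hat{y}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}}\right)e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right] \\
&= \hat{x}\,|E_{\ell jx}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jx}\right) + \hat{y}\,|E_{\ell jy}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jy}\right)
\end{aligned} \tag{4.19a}$$
and
$$\begin{aligned}
\mathrm{Re}\!\left[\vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right]
&= \mathrm{Re}\!\left[\left(-\hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}} + \hat{y}\,\frac{1}{c}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}}\right)e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right] \\
&= -\hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jy}\right) + \hat{y}\,\frac{1}{c}\,|E_{\ell jx}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jx}\right).
\end{aligned} \tag{4.19b}$$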
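As a numerical check on (4.19a) and (4.19b), the sketch below evaluates the real fields directly from complex amplitudes and compares them with the cosine forms. The function names and the sample amplitudes, wavenumber, position, and time are illustrative assumptions, not values from the text.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in m/s

def real_fields(E_x, E_y, sigma, z, t):
    """Real (Ex, Ey, Bx, By) of a plane wave travelling along z.

    E_x, E_y are complex amplitudes; sigma is the wavenumber 1/lambda.
    """
    phase = cmath.exp(2j * math.pi * sigma * (z - C * t))
    B_x = -E_y / C  # Eq. (4.17e) solved for B_x
    B_y = E_x / C   # Eq. (4.17d) solved for B_y
    return ((E_x * phase).real, (E_y * phase).real,
            (B_x * phase).real, (B_y * phase).real)

def cosine_form(amplitude, phi, sigma, z, t):
    """One cosine term of (4.19a): amplitude * cos(2 pi sigma (z - ct) + phi)."""
    return amplitude * math.cos(2 * math.pi * sigma * (z - C * t) + phi)

# Hypothetical sample values (assumptions for illustration only).
E_x = cmath.rect(3.0, 0.4)    # magnitude 3.0, phase 0.4 rad
E_y = cmath.rect(1.5, -1.1)   # magnitude 1.5, phase -1.1 rad
sigma, z, t = 1.0e6, 2.5e-7, 1.0e-16

Ex, Ey, Bx, By = real_fields(E_x, E_y, sigma, z, t)
print(Ex, cosine_form(3.0, 0.4, sigma, z, t))
print(Ey, cosine_form(1.5, -1.1, sigma, z, t))
```

The real B components come out as Ex/c along y and -Ey/c along x, which is the content of (4.19b).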
When z is held constant, all the x and y components of the $\vec{E}$ and $\vec{B}$ fields in (4.19a) and (4.19b)
oscillate at the same frequency $f = \sigma_\ell c$. We can recognize what is going on by keeping z
constant and noting that if t increases (or decreases) by $1/(\sigma_\ell c)$, then the phases of all the cosines
in Eqs. (4.19a) and (4.19b) decrease (or increase) by 2π. This makes the wavefield specified in
(4.19a) and (4.19b) a plane wavefield, since every point on a plane specified by z = constant has
the same real $\vec{E}$ field and $\vec{B}$ field at all times t. Figure 4.4 shows that when t is held constant in
Eqs. (4.19a) and (4.19b) and z increases (or decreases) in value by $1/\sigma_\ell$, the phases of all the
cosines increase (or decrease) by 2π. Consequently, planes in Fig. 4.4 that are separated by
$1/\sigma_\ell$ have the same phase and thus the same real $\vec{E}$ and $\vec{B}$ fields. This distance is called the
wavelength λ of the plane wavefield. Parameter $\sigma_\ell$ is called the wavenumber, already defined in
Eq. (1.7b) of Chapter 1 to be 1/λ. The plane wave is called monochromatic because it is specified
by a single frequency $f = \sigma_\ell c$ and wavelength λ. Its wavenumber $\sigma_\ell$ is 1/λ, so the equality
$$f = \sigma_\ell c \tag{4.19c}$$
is equivalent to $\lambda f = c$, the classic relationship between wavelength, frequency, and velocity for any wavefield. We
conclude that Eqs. (4.19a) and (4.19b) describe a wavefield traveling in the $\hat{\Omega}_j = \hat{z}$ direction at
FIGURE 4.4. [Figure not reproduced. Labels: planes separated by $z = 1/\sigma_\ell$ along the z axis; $\vec{E}$ and $\vec{B}$ field vectors on each plane; unit vector $\hat{\Omega}_j$.]
pair of terms from formulas (4.11a) and (4.11b). Since the pair of sums in (4.11a) and (4.11b) is a
general solution to the vector wave equations, this sort of general solution can now be interpreted
as a sum over an arbitrary collection of monochromatic plane waves characterized by different
wavenumbers and directions of propagation, where for each wavenumber $\sigma_\ell$, there is a unique
frequency $c\sigma_\ell$.
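The one-to-one link between wavenumber and frequency can be made concrete with a short sketch. The 632.8 nm wavelength used here is an assumed example value, and the function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s

def frequency_from_wavenumber(sigma):
    """f = c * sigma, with sigma = 1/lambda in inverse metres."""
    return C * sigma

def wavelength_from_wavenumber(sigma):
    """lambda = 1 / sigma."""
    return 1.0 / sigma

# Assumed example: red light with a wavelength of 632.8 nm.
sigma = 1.0 / 632.8e-9
f = frequency_from_wavenumber(sigma)       # about 4.74e14 Hz
lam = wavelength_from_wavenumber(sigma)
print(f, lam, f * lam)                     # the product recovers c
```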
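From Eqs. (4.19a) and (4.19b), we get
$$\begin{aligned}
&\left\{\mathrm{Re}\!\left[\vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right]\right\} \cdot \left\{\mathrm{Re}\!\left[\vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right]\right\} \\
&\quad = -\frac{1}{c}\,|E_{\ell jx}|\,|E_{\ell jy}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jx}\right)\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jy}\right) \\
&\qquad + \frac{1}{c}\,|E_{\ell jx}|\,|E_{\ell jy}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jx}\right)\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jy}\right) = 0,
\end{aligned} \tag{4.20}$$
showing that the real $\vec{E}$ and $\vec{B}$ fields of a monochromatic plane wave are always perpendicular
to each other while they oscillate. From (4.17a), (4.17b), (4.17d), and (4.17e), we get
$$\vec{E}_{\ell j} \cdot \vec{B}_{\ell j} = E_{\ell jx}B_{\ell jx} + E_{\ell jy}B_{\ell jy} = E_{\ell jx}\left(-\frac{1}{c}E_{\ell jy}\right) + E_{\ell jy}\left(\frac{1}{c}E_{\ell jx}\right) = 0. \tag{4.21a}$$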
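In this sense, we can say that the complex monochromatic plane wave $\vec{E}$ and $\vec{B}$ fields are also
perpendicular to each other. Another result worth deriving, again using Eqs. (4.17a), (4.17b),
(4.17d), and (4.17e), is that
$$\begin{aligned}
\vec{E}_{\ell j} \times \vec{B}_{\ell j}^{*} &= \left[\hat{x}E_{\ell jx} + \hat{y}E_{\ell jy}\right] \times \left[\hat{x}B_{\ell jx}^{*} + \hat{y}B_{\ell jy}^{*}\right] = \hat{z}\,E_{\ell jx}B_{\ell jy}^{*} - \hat{z}\,E_{\ell jy}B_{\ell jx}^{*} \\
&= \left[E_{\ell jx}\left(c^{-1}E_{\ell jx}^{*}\right) + E_{\ell jy}\left(c^{-1}E_{\ell jy}^{*}\right)\right]\hat{z} \\
&= \frac{1}{c}\left(\vec{E}_{\ell j} \cdot \vec{E}_{\ell j}^{*}\right)\hat{z} = \frac{1}{c}\left(\vec{E}_{\ell j} \cdot \vec{E}_{\ell j}^{*}\right)\hat{\Omega}_j,
\end{aligned} \tag{4.21c}$$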
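Because (4.21a) and (4.21c) are built only from dot and cross products, they can be spot-checked numerically for a propagation direction that is not along z. The sketch below is one such check under assumed sample values; the helper names and the spherical-angle construction of the unit vectors are illustrative choices, not part of the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def dot(u, v):
    """Bilinear dot product (no complex conjugation)."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def conj(u):
    return tuple(complex(a).conjugate() for a in u)

# Assumed propagation direction, given by spherical angles.
theta, phi = 0.7, 2.1
omega = (math.sin(theta) * math.cos(phi), math.sin(theta) * math.sin(phi), math.cos(theta))
e1 = (math.cos(theta) * math.cos(phi), math.cos(theta) * math.sin(phi), -math.sin(theta))
e2 = (-math.sin(phi), math.cos(phi), 0.0)   # e1 x e2 = omega

# Complex E transverse to omega, and B from Eq. (4.16c).
a, b = 2.0 + 1.0j, -0.5 + 3.0j
E = tuple(a * p + b * q for p, q in zip(e1, e2))
B = tuple(comp / C for comp in cross(omega, E))

lhs = cross(E, conj(B))                              # E x B*
rhs = tuple(dot(E, conj(E)) / C * w for w in omega)  # (1/c)(E . E*) omega
print(abs(dot(E, B)))                                # close to zero, Eq. (4.21a)
print(max(abs(l - r) for l, r in zip(lhs, rhs)))     # close to zero, Eq. (4.21c)
```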
(4.21c), can be written using only dot products and cross products, hold true in all (proper)
coordinate systems if they hold true in any one (proper) coordinate system.55 Choosing a new
coordinate system where the $\hat{z}$ unit vector is not the same as the $\hat{\Omega}_j$ propagation vector is
geometrically equivalent to specifying a new direction for the propagation vector that is not
parallel to the original $\hat{z}$ unit vector. Since (4.21a) and (4.21c) use only dot and cross products,
they must also hold true in those coordinate systems where $\hat{\Omega}_j$ is not parallel to $\hat{z}$. Hence we can
conclude that Eqs. (4.21a) and (4.21c) must be obeyed when the $\ell$, j monochromatic plane wave
propagates in any direction, not just when it propagates parallel to the z axis. Therefore the
double sums over $\ell$ and j in Eqs. (4.11a) and (4.11b) must all have coefficients $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$
satisfying Eqs. (4.21a) and (4.21c), with
$$\vec{E}_{\ell j} \cdot \vec{B}_{\ell j} = 0 \tag{4.22a}$$
and
$$\vec{E}_{\ell j} \times \vec{B}_{\ell j}^{*} = \frac{1}{c}\left(\vec{E}_{\ell j} \cdot \vec{E}_{\ell j}^{*}\right)\hat{\Omega}_j. \tag{4.22b}$$
Similarly, the perpendicularity of the real, physical $\vec{E}$ and $\vec{B}$ fields as they oscillate in Eq. (4.20)
cannot be affected by the choice of coordinate system, which means the oscillating $\vec{E}$ and $\vec{B}$
fields stay perpendicular when $\hat{z}$ is not chosen parallel to $\hat{\Omega}_j$. Since, once again, this is
geometrically equivalent to specifying a new direction of propagation, we conclude that the real
oscillating $\vec{E}$ and $\vec{B}$ fields are perpendicular for all $\hat{\Omega}_j$ vectors, that is, they are perpendicular
$$\vec{E}_{\ell j} = \hat{x}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}} = \hat{x}E_{\ell jx}. \tag{4.23a}$$
55
The cross product is invariant only if the coordinate systems are consistently chosen to be all left-handed or all
right-handed. This book uses right-handed coordinate systems, sometimes referred to as proper coordinate systems,
where the $\hat{x}$, $\hat{y}$, $\hat{z}$ vectors are always chosen so that $\hat{x} \times \hat{y} = \hat{z}$.
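$$\vec{B}_{\ell j} = \hat{y}\,\frac{1}{c}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}} = \hat{y}\,\frac{1}{c}\,E_{\ell jx}. \tag{4.23b}$$
Setting $E_{\ell jy} = 0$ in Eqs. (4.19a) and (4.19b) now leads to
$$\mathrm{Re}\!\left[\vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right] = \hat{x}\,|E_{\ell jx}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jx}\right) \tag{4.23c}$$
and
$$\mathrm{Re}\!\left[\vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right] = \hat{y}\,\frac{1}{c}\,|E_{\ell jx}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jx}\right). \tag{4.23d}$$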
Equations (4.23a)–(4.23d) describe a plane wave whose real electric-field vector always points
strictly along the x axis and whose real magnetic-induction vector always points strictly along the
y axis. Characterizing this wave by the direction of the electric-field vector, we call it linearly
polarized along the x axis, or x-polarized for short (see Fig. 4.5). Equation (4.23a) shows that in
an x-polarized plane wave the complex vector $\vec{E}_{\ell j}$ is the $\hat{x}$ unit vector multiplied by a complex
constant $E_{\ell jx}$, which, of course, means that in (4.23b) the complex vector $\vec{B}_{\ell j}$ must be the $\hat{y}$
unit vector multiplied by the complex constant $E_{\ell jx}/c$.
To get a monochromatic plane wave that is linearly polarized in the y direction, we choose
$E_{\ell jx} = 0$. Then, repeating the analysis used to find Eqs. (4.23a)–(4.23d), we have
$$\vec{E}_{\ell j} = \hat{y}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}} = \hat{y}E_{\ell jy}, \tag{4.24a}$$
$$\vec{B}_{\ell j} = -\hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}} = -\hat{x}\,\frac{1}{c}\,E_{\ell jy}, \tag{4.24b}$$
$$\mathrm{Re}\!\left[\vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right] = \hat{y}\,|E_{\ell jy}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jy}\right), \tag{4.24c}$$
and
$$\mathrm{Re}\!\left[\vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right] = -\hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jy}\right). \tag{4.24d}$$
The monochromatic plane wave described by Eqs. (4.24a)–(4.24d) has an electric-field vector
that always points along the y axis and a magnetic induction vector that always points along the
−x axis (see Fig. 4.6). Equation (4.24a) shows that y polarization can be recognized by noting that
the complex vector $\vec{E}_{\ell j}$ is the $\hat{y}$ unit vector multiplied by a complex constant $E_{\ell jy}$ [with,
according to (4.24b), complex vector $\vec{B}_{\ell j}$ being the $\hat{x}$ unit vector multiplied by the complex
constant $(-E_{\ell jy}/c)$].
FIGURE 4.5. [Figure not reproduced. Labels: E field vectors; B field vectors.]
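Writing down Eqs. (4.19a) and (4.19b) again while switching the order of addition in the
second equation gives
$$\mathrm{Re}\!\left[\vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right] = \hat{x}\,|E_{\ell jx}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jx}\right) + \hat{y}\,|E_{\ell jy}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jy}\right)$$
and
$$\mathrm{Re}\!\left[\vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)}\right] = \hat{y}\,\frac{1}{c}\,|E_{\ell jx}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jx}\right) - \hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\cos\!\left(2\pi\sigma_\ell(z-ct)+\phi_{\ell jy}\right).$$
FIGURE 4.6. [Figure not reproduced. Labels: x and y axes; E field vectors; B field vectors.]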
Clearly, the first term in the general formula for the E field and the first term in the general
formula for the B field can be grouped together and called an x-polarized wave, and similarly the
second terms in the general formulas can be grouped together and called a y-polarized wave. This
shows that the E field of an arbitrary monochromatic plane wave (that is, a plane wave where
neither $E_{\ell jx}$ nor $E_{\ell jy}$ is automatically zero) can be represented as the sum of the E field of a
monochromatic plane wave linearly polarized in the x direction and the E field of a
monochromatic plane wave linearly polarized in the y direction. Similarly, the B field of that
same monochromatic plane wave can be represented as the sum of the B field of the
corresponding x-polarized plane wave and the B field of the corresponding y-polarized plane
wave. This point is often made by stating that any monochromatic plane wave can be written as
the sum of an x-polarized plane wave and a y-polarized plane wave.
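The decomposition can be illustrated with a minimal sketch that splits a wave with two nonzero complex amplitudes into its x-polarized and y-polarized parts and confirms that the real fields add back to the original. The amplitudes and evaluation point are assumed sample values, and `real_E` is a hypothetical helper.

```python
import cmath
import math

def real_E(E_x, E_y, sigma, z, ct):
    """Real (Ex, Ey) of a plane wave along z with complex amplitudes E_x, E_y."""
    phase = cmath.exp(2j * math.pi * sigma * (z - ct))
    return ((E_x * phase).real, (E_y * phase).real)

# Assumed sample amplitudes and evaluation point (arbitrary units).
E_x = cmath.rect(2.0, 0.3)
E_y = cmath.rect(1.2, 1.9)
sigma, z, ct = 1.5, 0.4, 0.1

total = real_E(E_x, E_y, sigma, z, ct)
x_part = real_E(E_x, 0, sigma, z, ct)   # x-polarized wave: E_y set to zero
y_part = real_E(0, E_y, sigma, z, ct)   # y-polarized wave: E_x set to zero
recombined = tuple(a + b for a, b in zip(x_part, y_part))
print(total, recombined)
```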
(4.19b), that the incident plane wave can be represented by the real part of
$$\vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)} = \left(\hat{x}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}} + \hat{y}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}}\right)e^{2\pi i\sigma_\ell(z-ct)} \tag{4.25a}$$
and
$$\vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{z}\cdot\vec{r}-ct)} = \left(-\hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}} + \hat{y}\,\frac{1}{c}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}}\right)e^{2\pi i\sigma_\ell(z-ct)}. \tag{4.25b}$$
The thin film divides the space in Fig. 4.7 into two regions labeled A and B. Equations (4.25a)
and (4.25b) only apply to points in region A, the region occupied by the incident wavefield. The
unit normal vector $\hat{n}$ of the surface on which the plane wave is incident lies in the y, z plane of
the coordinate system, making an angle $\psi_j$ with respect to the z axis. Angle $\psi_j$ is called the
angle of incidence, and we give it an index j because it specifies the direction of the $\hat{\Omega}_j$
propagation vector with respect to $\hat{n}$. The interaction of the plane wave with the film creates a
transmitted radiation field in region B that also propagates in the $\hat{\Omega}_j = \hat{z}$ direction, and a
reflected radiation field in region A that propagates in the direction
$$\hat{\Omega}_j^{(r)} = \hat{\Omega}_j - 2\hat{n}\left(\hat{\Omega}_j \cdot \hat{n}\right) \tag{4.26a}$$
or
$$\hat{\Omega}_j^{(r)} = \hat{z} + 2\hat{n}\cos\psi_j. \tag{4.26b}$$
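Equation (4.26a) is the usual mirror-reflection rule for a direction vector, and a short sketch makes the geometry explicit. The orientation chosen for the surface normal (in the y, z plane, pointing back toward region A so that (4.26b) holds with a plus sign) and the 30-degree angle are assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(omega, n):
    """Eq. (4.26a): omega_r = omega - 2 n (omega . n) for unit vectors."""
    d = dot(omega, n)
    return tuple(o - 2.0 * d * c for o, c in zip(omega, n))

psi = math.radians(30.0)                      # assumed angle of incidence
omega = (0.0, 0.0, 1.0)                       # incident direction, along z
n = (0.0, math.sin(psi), -math.cos(psi))      # assumed normal orientation

omega_r = reflect(omega, n)
form_b = tuple(o + 2.0 * c * math.cos(psi) for o, c in zip(omega, n))  # Eq. (4.26b)
print(omega_r, form_b)
```

With this choice of normal the reflected direction is (0, sin 2ψ, −cos 2ψ), which stays in the y, z plane and has unit length.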
Both the transmitted and reflected wavefields have the same $\sigma_\ell$ wavenumber as the incident
wave. For any wavefield incident on a flat surface, the plane of incidence is defined to be that
plane containing both the surface normal $\hat{n}$ and the incident propagation vector $\hat{\Omega}_j$. Equation
(4.26a) shows that the $\hat{\Omega}_j^{(r)}$ propagation vector of the reflected wave automatically lies in the
same plane as $\hat{n}$ and $\hat{\Omega}_j$. In Fig. 4.7, the plane of incidence is the y, z plane of the coordinate
system.
FIGURE 4.7. [Figure not reproduced. Labels: regions A and B; propagation vector $\hat{\Omega}_j = \hat{z}$ along the z axis; surface normal $\hat{n}$; angle $\psi_j$ (marked twice); propagation vector $\hat{\Omega}_j^{(r)}$.]
Since the transmitted radiation field is also a monochromatic plane wave traveling down the z
axis, the E and B fields of the wave can still be found from the real parts of complex plane wave
solutions such as the ones given in Eqs. (4.16a) and (4.16b),
$$\vec{E}_{\ell j}^{(t)}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} = \vec{E}_{\ell j}^{(t)}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.27a}$$
and
$$\vec{B}_{\ell j}^{(t)}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} = \vec{B}_{\ell j}^{(t)}\,e^{2\pi i\sigma_\ell(z-ct)}, \tag{4.27b}$$
where the (t) superscript specifies the transmitted wavefield and Eqs. (4.27a) and (4.27b) are
assumed to apply only to region B in Fig. 4.7. The complex vector $\vec{E}_{\ell j}^{(t)}$ can be written as
$$\vec{E}_{\ell j}^{(t)} = \hat{x}E_{\ell jx}^{(t)} + \hat{y}E_{\ell jy}^{(t)}$$
with the two complex numbers $E_{\ell jx}^{(t)}$ and $E_{\ell jy}^{(t)}$ representing its x and y components. Equations
(4.18e) and (4.18f) show that the complex vectors $\vec{E}_{\ell j}^{(t)}$, $\vec{B}_{\ell j}^{(t)}$ can now be written as
$$\vec{E}_{\ell j}^{(t)} = \hat{x}\,|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}} + \hat{y}\,|E_{\ell jy}^{(t)}|\,e^{i\phi_{\ell jy}^{(t)}} \tag{4.27c}$$
and
$$\vec{B}_{\ell j}^{(t)} = -\hat{x}\,\frac{1}{c}\,|E_{\ell jy}^{(t)}|\,e^{i\phi_{\ell jy}^{(t)}} + \hat{y}\,\frac{1}{c}\,|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}}, \tag{4.27d}$$
where we have used the two real constants $\phi_{\ell jx}^{(t)}$ and $\phi_{\ell jy}^{(t)}$ to represent the phases of $E_{\ell jx}^{(t)}$ and $E_{\ell jy}^{(t)}$
respectively. We require the film to be nonbirefringent, nonoptically active, and to have an index
of refraction that is constant in layers parallel to its surface; that is, the index of refraction can
only depend on the distance from the film’s surface. If the film absorbs radiant energy, we
account for it in the usual way by making its index of refraction complex.56 This sort of film turns
out to be an adequate model for the partially transmitting, partially reflecting layer of a Michelson
interferometer’s beam splitter.
When the plane wave incident on the film has $E_{\ell jy} = 0$ or $E_{\ell jx} = 0$, making the wave in Eqs.
(4.25a) and (4.25b) linearly x-polarized or linearly y-polarized respectively, the transmitted wave
must have the same type of linear polarization.57 Hence, when $E_{\ell jy} = 0$ in (4.25a) and (4.25b),
the transmitted plane wave must also be linearly polarized along the x axis, making $E_{\ell jy}^{(t)} = 0$ in
Eqs. (4.27c) and (4.27d); and when $E_{\ell jx} = 0$, the transmitted plane wave, which must be linearly
polarized along the y axis, has $E_{\ell jx}^{(t)} = 0$ in (4.27c) and (4.27d).
56
Leonard Eyges, The Classical Electromagnetic Field (Dover Publications, Inc., New York, 1972), p. 340.
Consulting Eqs. (4.25a) and (4.25b), we see that for linear polarization along the x axis with
$E_{\ell jy} = 0$, the incident plane wave is given by the real part of
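$$\hat{x}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.28a}$$
for the electric field and
$$\hat{y}\,\frac{1}{c}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.28b}$$
for the magnetic induction. The corresponding transmitted plane wave is given by the real part of
$$\hat{x}\,|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.29a}$$
for the electric field and
$$\hat{y}\,\frac{1}{c}\,|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.29b}$$
for the magnetic induction [see Eqs. (4.27c) and (4.27d) with $E_{\ell jy}^{(t)} = 0$]. The ratio of the complex
transmitted electric field’s x component in (4.29a) to the complex incident electric field’s x
component in (4.28a) is the complex coefficient
$$t_s = \frac{|E_{\ell jx}^{(t)}|}{|E_{\ell jx}|}\,e^{i\left(\phi_{\ell jx}^{(t)} - \phi_{\ell jx}\right)}. \tag{4.30a}$$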
We see by inspection that this is the same as the ratio of the two complex magnetic inductions in
(4.29b) and (4.28b). Consequently, no matter what happens inside the film to produce the
transmitted x-polarized wave, the process can be described by a complex parameter $t_s$, which in
general is a function of the wavenumber $\sigma_\ell$ and $\psi_j$, the angle of incidence in Fig. 4.7,
$$t_s = t_s(\sigma_\ell, \psi_j). \tag{4.30b}$$
The subscript s in Eqs. (4.30a) and (4.30b) is traditionally applied to incident plane waves whose
electric field is linearly polarized perpendicular to the plane of incidence, and parameter $t_s$ is
called the s-wave amplitude-transmission coefficient.58
It is important to note that $t_s$ does not depend on either $|E_{\ell jx}|$ or $\phi_{\ell jx}$, giving it the same value
for all monochromatic plane waves having equal wavenumbers and angles of incidence.59
57
Max Born and Emil Wolf, Principles of Optics: Electromagnetic Theory of Propagation, Interference, and
Diffraction of Light, 7th (expanded) ed. (Cambridge University Press, New York, 1999), p. 55.
Equations (4.28a), (4.28b), (4.29a), and (4.29b) and the definition of parameter $t_s(\sigma_\ell, \psi_j)$ in
(4.30a) let us write
$$\vec{E}_{\ell js}^{(t)}\,e^{2\pi i\sigma_\ell(z-ct)} = t_s(\sigma_\ell, \psi_j) \cdot \vec{E}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.31a}$$
and
$$\vec{B}_{\ell js}^{(t)}\,e^{2\pi i\sigma_\ell(z-ct)} = t_s(\sigma_\ell, \psi_j) \cdot \vec{B}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)}, \tag{4.31b}$$
where
$$\vec{E}_{\ell js} = \hat{x}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}}, \qquad \vec{B}_{\ell js} = \hat{y}\,\frac{1}{c}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}}, \tag{4.31c}$$
and
$$\vec{E}_{\ell js}^{(t)} = \hat{x}\,|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}}, \qquad \vec{B}_{\ell js}^{(t)} = \hat{y}\,\frac{1}{c}\,|E_{\ell jx}^{(t)}|\,e^{i\phi_{\ell jx}^{(t)}}. \tag{4.31d}$$
This shows that to get the complex formula for the transmitted plane wave linearly polarized
perpendicular to the plane of incidence, we need only multiply the complex formula for the
incident plane wave by $t_s(\sigma_\ell, \psi_j)$. If the plane wavefield incident on the optical film at an angle
$\psi_j$ contains more than one wavenumber (but is still polarized perpendicular to the plane of
incidence), then its electric field is given by the real part of
$$\sum_\ell \vec{E}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)}$$
and its magnetic induction by the real part of
$$\sum_\ell \vec{B}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)},$$
where an s subscript has been added to show that all the waves are linearly polarized
perpendicular to the plane of incidence. The s-wave amplitude-transmission coefficient can now
be used to write the complex formulas for the transmitted radiation fields as
$$\sum_\ell \vec{E}_{\ell js}^{(t)}\,e^{2\pi i\sigma_\ell(z-ct)} = \sum_\ell t_s(\sigma_\ell, \psi_j) \cdot \vec{E}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.31e}$$
and
$$\sum_\ell \vec{B}_{\ell js}^{(t)}\,e^{2\pi i\sigma_\ell(z-ct)} = \sum_\ell t_s(\sigma_\ell, \psi_j) \cdot \vec{B}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.31f}$$
because
$$\vec{E}_{\ell js}^{(t)} = t_s(\sigma_\ell, \psi_j) \cdot \vec{E}_{\ell js} \quad \text{and} \quad \vec{B}_{\ell js}^{(t)} = t_s(\sigma_\ell, \psi_j) \cdot \vec{B}_{\ell js} \tag{4.31g}$$
58
This notation can be traced back to the German word for perpendicular, senkrecht.
59
O. S. Heavens, Optical Properties of Thin Solid Films (London, Butterworths Scientific Publications, 1955), pp.
46–95.
and the magnetic induction of the incident plane wave is given by the real part of
$$-\hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}}\,e^{2\pi i\sigma_\ell(z-ct)}. \tag{4.32b}$$
Recalling that the corresponding transmitted plane wave must have the same type of linear
polarization as the incident wave, we set $E_{\ell jx}^{(t)} = 0$ in Eqs. (4.27c) and (4.27d) to get that the
electric field of the transmitted plane wave is the real part of
$$\hat{y}\,|E_{\ell jy}^{(t)}|\,e^{i\phi_{\ell jy}^{(t)}}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.33a}$$
and the magnetic induction of the transmitted plane wave is the real part of
$$-\hat{x}\,\frac{1}{c}\,|E_{\ell jy}^{(t)}|\,e^{i\phi_{\ell jy}^{(t)}}\,e^{2\pi i\sigma_\ell(z-ct)}. \tag{4.33b}$$
Transmitted Plane Waves · 4.5
The ratio of the complex transmitted electric field in (4.33a) to the complex incident electric field
in (4.32a) is
$$t_p = \frac{|E_{\ell jy}^{(t)}|}{|E_{\ell jy}|}\,e^{i\left(\phi_{\ell jy}^{(t)} - \phi_{\ell jy}\right)}. \tag{4.34a}$$
Again, this is the same as the ratio of the two complex magnetic inductions in (4.33b) and
(4.32b), so again the process of transmission is described by a single complex parameter that is a
function of $\sigma_\ell$ and $\psi_j$ but not of $|E_{\ell jy}|$ or $\phi_{\ell jy}$,
$$t_p = t_p(\sigma_\ell, \psi_j). \tag{4.34b}$$
The p subscript is traditionally applied to incident plane waves whose electric field is linearly
polarized parallel to the plane of incidence, and parameter $t_p$ is called the p-wave
amplitude-transmission coefficient.60 When the incident wavefield contains more than one wavenumber and
every monochromatic component is a p-type plane wave, its electric field is given by the real part
of
$$\sum_\ell \vec{E}_{\ell jp}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.35a}$$
where
$$\vec{E}_{\ell jp} = \hat{y}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}} \quad \text{and} \quad \vec{B}_{\ell jp} = -\hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}} \tag{4.35c}$$
c
with the p subscript showing that the waves are linearly polarized parallel to the plane of
incidence. To get the complex formula for the transmitted plane wave linearly polarized parallel
to the plane of incidence, we need only multiply the complex term for each incident plane wave
by t p (σ A ,ψ j ) to get
G (t ) 2π iσ A ( z −ct ) G 2π iσ A ( z −ct )
¦ Ajp
E
A
e =
A
¦ p A j Ajp e
t (σ ,ψ ) ⋅ E (4.35d)
and
60
This notation can also be traced back to German scientists, with the German word for parallel spelled the same as
in English, parallel.
- 359 -
4 · From Maxwell’s Equations to the Michelson Interferometer
G (t ) G
¦B A
Ajp e 2& i) A z ct ¦ t p () A ,/ j ) A BAjp e2& i) A z ct .
A
(4.35e)
The details of the mathematics used here to represent the incident and transmitted wavefields
have an unfortunate tendency to conceal the basic ideas behind what is being done. No matter
what the orientation of the E field in the incident monochromatic plane wave (parallel or
perpendicular to the plane of incidence), terms having the form
$$A\,e^{i(a_\ell z - b_\ell t)} \quad (\text{with } a_\ell = 2\pi\sigma_\ell \text{ and } b_\ell = 2\pi\sigma_\ell c)$$
are used to describe the electromagnetic wavefields on the incident side of the thin film, and
terms such as
$$\Gamma A\,e^{i(a_\ell z - b_\ell t)}$$
are used to describe the electromagnetic wavefields on the transmitted side of the thin film. Here,
Γ is a complex number standing for either $t_s$ or $t_p$ in the above formulas; and A is a complex
number standing for either the x or y components of the E and B fields’ complex amplitudes, for
example $E_{\ell jx}$, $B_{\ell jy}$, etc. If we write the complex A value as
$$A = |A|\,e^{i\phi_A},$$
then
$$A\,e^{i(a_\ell z - b_\ell t)} = |A|\,e^{i\phi_A}\,e^{i(a_\ell z - b_\ell t)}$$
and
$$\Gamma A\,e^{i(a_\ell z - b_\ell t)} = \Gamma\,|A|\,e^{i\phi_A}\,e^{i(a_\ell z - b_\ell t)}.$$
If the incident monochromatic wavefield is shifted forward or back along the z axis (that is,
along its direction of propagation) by a distance $z_0$, then $z \to z - z_0$ so that
$$A\,e^{i(a_\ell z - b_\ell t)} \to A\,e^{-i a_\ell z_0}\,e^{i(a_\ell z - b_\ell t)}.$$
To change the amplitude of the incident wavefield to some fraction of its original value, we
multiply A by a real number α between zero and one to get
$$A\,e^{-i a_\ell z_0}\,e^{i(a_\ell z - b_\ell t)} \to \alpha\,|A|\,e^{i\phi_A}\,e^{-i a_\ell z_0}\,e^{i(a_\ell z - b_\ell t)}.$$
Writing the complex parameter Γ as
$$\Gamma = |\Gamma|\,e^{i\phi_\Gamma},$$
the transmitted term becomes
$$\Gamma A\,e^{i(a_\ell z - b_\ell t)} = |\Gamma|\,|A|\,e^{i\phi_\Gamma}\,e^{i\phi_A}\,e^{i(a_\ell z - b_\ell t)}.$$
Comparing this to
$$\alpha\,|A|\,e^{i\phi_A}\,e^{-i a_\ell z_0}\,e^{i(a_\ell z - b_\ell t)}$$
for the incident wavefield shifted by $z_0$ and diminished by a real factor α, we note that
$$\alpha = |\Gamma|$$
and
$$-a_\ell z_0 = \phi_\Gamma.$$
The effect of multiplying by Γ
is that the amplitude $|A|$ of the original wavefield changes to $|\Gamma|\,|A|$ and the oscillations of the
wavefield are moved forward or back by a distance
$$\frac{\phi_\Gamma}{a_\ell} = \frac{\arg(\Gamma)}{a_\ell}$$
along the direction of propagation. This mathematical fact (knowing what happens when the
complex expression for a monochromatic wavefield is multiplied by a complex parameter) gives
meaning to the formulas derived in the first part of this section. Monochromatic wavefields
transmitted through the thin film in Fig. 4.7 have their amplitudes diminished by $|t_s|$ if the E field
is perpendicular to the plane of incidence and by $|t_p|$ if the E field is parallel to the plane of
incidence. The oscillations of the transmitted wavefields are also moved forward or back with
respect to the incident wavefield as specified by the complex phases or arguments of $t_s$ and $t_p$.
How much the wavefields shift and change in amplitude depends on the angle of incidence and
wavenumber, which is why $t_s$ and $t_p$ are written as functions of $\psi_j$ and $\sigma_\ell$.
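The statement that a complex multiplier rescales the amplitude and shifts the oscillations along the propagation direction can be checked numerically. In this sketch the multiplier `gamma`, the amplitude `A`, and the evaluation point are assumed sample values; the shift is computed as $z_0 = -\arg(\Gamma)/(2\pi\sigma_\ell)$, which follows from matching phases.

```python
import cmath
import math

def wave(A, sigma, z, ct):
    """Complex monochromatic term A * exp(2 pi i sigma (z - ct))."""
    return A * cmath.exp(2j * math.pi * sigma * (z - ct))

# Assumed sample values (arbitrary units).
A = cmath.rect(2.0, 0.5)
gamma = cmath.rect(0.6, 0.8)      # stands in for t_s or t_p
sigma, z, ct = 1.25, 0.3, 0.7

alpha = abs(gamma)                                    # real amplitude factor
z0 = -cmath.phase(gamma) / (2.0 * math.pi * sigma)    # equivalent shift along z

multiplied = gamma * wave(A, sigma, z, ct)
shifted_and_scaled = alpha * wave(A, sigma, z - z0, ct)
print(multiplied, shifted_and_scaled)
```

The two printed values agree to rounding error, so multiplying by the complex parameter is equivalent to scaling by its modulus and translating the wave by `z0`.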
From the work done in Sec. 4.4, we know that any monochromatic plane wave having a
propagation vector parallel to the z axis can be analyzed as the sum of a monochromatic plane
wave linearly polarized along the x axis and a monochromatic plane wave linearly polarized
along the y axis. This means that any monochromatic plane wave incident on the optical film in
Fig. 4.7 can be treated as the sum of an s-type monochromatic plane wave and a p-type
monochromatic plane wave. Consequently, we expect an arbitrary plane wavefield incident along
the z axis in region A of Fig. 4.7 to have both s-type and p-type components, with its electric field
given by the real part of
$$\sum_\ell \vec{E}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} + \sum_\ell \vec{E}_{\ell jp}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.36a}$$
The recipe for taking this combined wavefield through the optical film into region B of Fig. 4.7 is
to multiply each s-wave component and p-wave component by the appropriate s-wave and
p-wave amplitude-transmission coefficients. Hence, the electric field for the transmitted wave in
region B is the real part of
$$\sum_\ell t_s(\sigma_\ell, \psi_j)\,\vec{E}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} + \sum_\ell t_p(\sigma_\ell, \psi_j)\,\vec{E}_{\ell jp}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.36c}$$
Thus the transmission of any plane wavefield containing many different wavenumbers (that is,
the transmission of any polychromatic plane wave) can be handled by writing each incident
monochromatic wave as the sum of an s-wave and a p-wave, as shown in (4.36a) and (4.36b), and
then multiplying each s-wave and p-wave in that sum by the correct s-wave and p-wave
amplitude-transmission coefficient, as shown in (4.36c) and (4.36d).
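The recipe in (4.36c) amounts to a per-wavenumber multiplication followed by a sum. The sketch below applies it to a small polychromatic field. The coefficient functions `toy_t_s` and `toy_t_p` are hypothetical placeholders invented for illustration; real coefficients depend on the film's construction and are not given in this section.

```python
import cmath
import math

def toy_t_s(sigma, psi):
    """Hypothetical s-wave coefficient (placeholder, not a physical model)."""
    return 0.8 * cmath.exp(1j * 0.10 * sigma)

def toy_t_p(sigma, psi):
    """Hypothetical p-wave coefficient (placeholder, not a physical model)."""
    return 0.9 * cmath.exp(1j * 0.05 * sigma)

def transmitted_E(components, psi, z, ct):
    """Real (Ex, Ey) in region B from (sigma, E_s, E_p) triples, per Eq. (4.36c).

    E_s is the complex x (s-wave) amplitude, E_p the complex y (p-wave) amplitude.
    """
    Ex = Ey = 0.0
    for sigma, E_s, E_p in components:
        phase = cmath.exp(2j * math.pi * sigma * (z - ct))
        Ex += (toy_t_s(sigma, psi) * E_s * phase).real
        Ey += (toy_t_p(sigma, psi) * E_p * phase).real
    return Ex, Ey

# Assumed two-wavenumber incident field (arbitrary units).
components = [(1.0, 1.0 + 0.5j, 0.3 - 0.2j), (2.5, 0.2j, 1.0 + 0.0j)]
psi = math.radians(45.0)
print(transmitted_E(components, psi, z=0.2, ct=0.05))
```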
Reflected Plane Waves · 4.6
monochromatic plane wave with wavenumber $\sigma_\ell$ and propagation vector $\hat{\Omega}_j^{(r)}$. In Fig. 4.8, we
construct a special $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system to analyze the reflected plane wave. The $z^{(r)}$
discussion at the end of Sec. 4.2, the sum of the incident and reflected plane waves is still a
solution to Maxwell’s equations in region A. We see that the $x^{(r)}$, $y^{(r)}$, $z^{(r)}$ coordinate system is
just the x, y, z coordinate system rotated about the x axis to make $\hat{z}^{(r)}$ parallel to $\hat{\Omega}_j^{(r)}$, so the two
coordinate systems have the same origin. Both coordinate systems have the same x axis, so
$\hat{x}^{(r)} = \hat{x}$, and to get the y axis of the new coordinate system, we specify $\hat{y}^{(r)} = \hat{z}^{(r)} \times \hat{x}^{(r)} = \hat{\Omega}_j^{(r)} \times \hat{x}$.
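The construction of the reflected-wave axes can be sketched in a few lines: keep the x axis, take the reflected propagation vector as the new z axis, and form the new y axis from their cross product. The reflected direction used here, (0, sin 2ψ, −cos 2ψ) with ψ = 30 degrees, is an assumed example that depends on how the surface normal is oriented.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

psi = math.radians(30.0)                                   # assumed angle of incidence
x_r = (1.0, 0.0, 0.0)                                      # x axis is shared
z_r = (0.0, math.sin(2.0 * psi), -math.cos(2.0 * psi))     # assumed reflected direction
y_r = cross(z_r, x_r)                                      # y^(r) = z^(r) x x^(r)

print(x_r, y_r, z_r)
print(cross(x_r, y_r))   # equals z_r, so the new triad is right-handed
```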
When an x, y, z coordinate system is rotated by an angle β about its x axis to create a new $x^{(r)}$,
$y^{(r)}$, $z^{(r)}$ coordinate system (see Fig. 4.9), the relationship between the $\hat{x}$, $\hat{y}$, $\hat{z}$ unit vectors and
the $\hat{x}^{(r)}$, $\hat{y}^{(r)}$, $\hat{z}^{(r)}$ unit vectors is
$$\hat{x}^{(r)} = \hat{x}, \tag{4.37a}$$
respectively,
$$\vec{E}_{\ell j}^{(r)}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j^{(r)}\cdot\vec{r}-ct)} = \vec{E}_{\ell j}^{(r)}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} \tag{4.38a}$$
and
$$\vec{B}_{\ell j}^{(r)}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j^{(r)}\cdot\vec{r}-ct)} = \vec{B}_{\ell j}^{(r)}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)}. \tag{4.38b}$$
FIGURE 4.8. [Figure not reproduced. Labels: regions A and B; shared x, $x^{(r)}$ axis; $y^{(r)}$ and $z^{(r)}$ axes; propagation vector $\hat{\Omega}_j$; surface normal $\hat{n}$; angle $\psi_j$ (marked twice); propagation vector $\hat{\Omega}_j^{(r)}$.]
The r superscript on the complex $\vec{E}_{\ell j}^{(r)}$ and $\vec{B}_{\ell j}^{(r)}$ vectors shows that they belong to the reflected
wave. Vector $\vec{E}_{\ell j}^{(r)}$ in (4.38a) can be written as
$$\vec{E}_{\ell j}^{(r)} = \hat{x}E_{\ell jx}^{(r)} + \hat{y}^{(r)}E_{\ell jy^{(r)}}^{(r)} \tag{4.38c}$$
using two complex numbers $E_{\ell jx}^{(r)}$ and $E_{\ell jy^{(r)}}^{(r)}$ to represent its $\hat{x}$ and $\hat{y}^{(r)}$ components. Although
the y subscripts and unit vectors have an r superscript to show that they belong to the $x^{(r)}$, $y^{(r)}$,
$z^{(r)}$ coordinate system, the x subscripts and unit vectors do not need one because $\hat{x}$ and $\hat{x}^{(r)}$ are
identical in the two coordinate systems. Following the pattern of Eqs. (4.27c) and (4.27d), we
write the complex vectors $\vec{E}_{\ell j}^{(r)}$ and $\vec{B}_{\ell j}^{(r)}$ as
$$\vec{E}_{\ell j}^{(r)} = \hat{x}\,|E_{\ell jx}^{(r)}|\,e^{i\phi_{\ell jx}^{(r)}} + \hat{y}^{(r)}\,|E_{\ell jy^{(r)}}^{(r)}|\,e^{i\phi_{\ell jy^{(r)}}^{(r)}} \tag{4.38d}$$
and
$$\vec{B}_{\ell j}^{(r)} = -\hat{x}\,\frac{1}{c}\,|E_{\ell jy^{(r)}}^{(r)}|\,e^{i\phi_{\ell jy^{(r)}}^{(r)}} + \hat{y}^{(r)}\,\frac{1}{c}\,|E_{\ell jx}^{(r)}|\,e^{i\phi_{\ell jx}^{(r)}} \tag{4.38e}$$
using the real constants $\phi_{\ell jx}^{(r)}$ and $\phi_{\ell jy^{(r)}}^{(r)}$ to represent the phases of the complex values of $E_{\ell jx}^{(r)}$ and
$E_{\ell jy^{(r)}}^{(r)}$ respectively.
FIGURE 4.9. [Figure not reproduced. Labels: shared x, $x^{(r)}$ axis; y and z axes; rotated $y^{(r)}$ and $z^{(r)}$ axes; rotation angle β (marked twice).]
When the plane wave incident on the optical film is linearly polarized along the x axis or y
axis, the reflected wave is linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis or the $\hat{y}^{(r)}$ axis respectively.61
Equations (4.28a) and (4.28b), which give the complex formulas for an incident plane wave
that is linearly x-polarized, force the reflected plane wave to be linearly polarized along the
$\hat{x}^{(r)} = \hat{x}$ axis. According to Eq. (4.38d), this reflected wave must have
$$E_{\ell jy^{(r)}}^{(r)} = 0$$
for it to be linearly polarized along the $\hat{x}^{(r)} = \hat{x}$ axis. Equations (4.38a)–(4.38e) then show that the
E field of the reflected wave is given by the real part of
$$\hat{x}\,|E_{\ell jx}^{(r)}|\,e^{i\phi_{\ell jx}^{(r)}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} \tag{4.39a}$$
and the B field of the reflected wave is given by the real part of
$$\hat{y}^{(r)}\,\frac{1}{c}\,|E_{\ell jx}^{(r)}|\,e^{i\phi_{\ell jx}^{(r)}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)}. \tag{4.39b}$$
Comparing these two complex formulas to the complex formulas (4.28a) and (4.28b) for the
incident wave, we note that if we consider only the scalar factors that do not depend on position
or time, then the $\hat{x}^{(r)} = \hat{x}$ components of the complex E fields together with the $\hat{y}$, $\hat{y}^{(r)}$
components of the complex B fields have the same complex ratio
$$r_s = \frac{|E_{\ell jx}^{(r)}|}{|E_{\ell jx}|}\,e^{i\left(\phi_{\ell jx}^{(r)} - \phi_{\ell jx}\right)}. \tag{4.40a}$$
Parameter $r_s$ is called the s-wave amplitude-reflection coefficient, with s again referring to the
incident plane wave’s being polarized perpendicular to the plane of incidence. In general,
$$r_s = r_s(\sigma_\ell, \psi_j), \tag{4.40b}$$
where $r_s$, like the amplitude-transmission coefficients $t_s$ and $t_p$, does not depend on either $|E_{\ell jx}|$
or $\phi_{\ell jx}$; it is the same for all incident plane waves having the same $\sigma_\ell$ and $\psi_j$. Comparing the
x-polarized reflected wave in (4.39a) and (4.39b) to the x-polarized incident wave in (4.28a) and
(4.28b), we see that multiplying the complex formulas in (4.28a) and (4.28b) by $r_s$ converts them
to the complex formulas in (4.39a) and (4.39b) if $\hat{y}$ is replaced by $\hat{y}^{(r)}$ and z is replaced by $z^{(r)}$.
61
Max Born and Emil Wolf, Principles of Optics, p. 55.
Turning to the case of the y-polarized incident wave specified by the complex formulas
(4.32a) and (4.32b), we remember that now the reflected wave must be polarized along the $\hat{y}^{(r)}$
axis. This forces $E_{\ell jx}^{(r)} = 0$ in Eqs. (4.38a)–(4.38e), showing the reflected E field is given by the
real part of
$$\hat{y}^{(r)}\,|E_{\ell jy^{(r)}}^{(r)}|\,e^{i\phi_{\ell jy^{(r)}}^{(r)}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} \tag{4.41a}$$
and the reflected B field is given by the real part of
$$-\hat{x}\,\frac{1}{c}\,|E_{\ell jy^{(r)}}^{(r)}|\,e^{i\phi_{\ell jy^{(r)}}^{(r)}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)}. \tag{4.41b}$$
Comparing these two formulas to (4.32a) and (4.32b) for the incident wave, we again see that if
we consider only the scalar factors that do not depend on position or time then the $\hat{y}$, $\hat{y}^{(r)}$
components of the complex E fields together with the $\hat{x}^{(r)} = \hat{x}$ components of the complex B
fields have the same complex ratio
$$r_p = \frac{|E_{\ell jy^{(r)}}^{(r)}|}{|E_{\ell jy}|}\,e^{i\left(\phi_{\ell jy^{(r)}}^{(r)} - \phi_{\ell jy}\right)}. \tag{4.42a}$$
Parameter $r_p$ is called the p-wave amplitude-reflection coefficient, where again p refers to the
incident wave being polarized parallel to the plane of incidence. This coefficient, like $r_s$, $t_s$, and
$t_p$, in general depends only on the wavenumber and incidence angle,
$$r_p = r_p(\sigma_\ell, \psi_j). \tag{4.42b}$$
Multiplying the complex formulas in (4.32a) and (4.32b) by $r_p$ converts them to (4.41a) and
(4.41b) if $\hat{y}$ is replaced by $\hat{y}^{(r)}$ and z is replaced by $z^{(r)}$.
Having analyzed how to create the reflected wavefield when the incident wavefield is a
monochromatic s-wave or monochromatic p-wave, we are now prepared to handle the reflection
of an arbitrary polychromatic plane wavefield incident along the z axis. Splitting each
monochromatic term into an s-wave component and a p-wave component as in formulas (4.36a)
and (4.36b), we can write the incident wave’s E field as the real part of
$$\sum_\ell \vec{E}_{\ell js}\,e^{2\pi i\sigma_\ell(z-ct)} + \sum_\ell \vec{E}_{\ell jp}\,e^{2\pi i\sigma_\ell(z-ct)} \tag{4.43a}$$
Similarly, the incident wave’s B field is, using Eqs. (4.31c) and (4.35c), the real part of
$$\sum_\ell \hat{y}\,\frac{1}{c}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_\ell(z-ct)} - \sum_\ell \hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}}\,e^{2\pi i\sigma_\ell(z-ct)}. \tag{4.43b}$$
In these latest formulas, (4.43a) and (4.43b), the first term is the sum over the s-wave components
of the incident wavefield and the second term is the sum over the p-wave components of the
incident wavefield. To get the corresponding polychromatic reflected wavefield, we follow the
just-described recipes for finding the reflected monochromatic plane waves generated by each
incident monochromatic plane wave. The electric field of the reflected wavefield is then found to
be the real part of
$$\sum_\ell r_s(\sigma_\ell, \psi_j)\,\hat{x}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} + \sum_\ell r_p(\sigma_\ell, \psi_j)\,\hat{y}^{(r)}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} \tag{4.43c}$$
and the magnetic-induction field of the reflected wavefield is found to be the real part of
$$\sum_\ell r_s(\sigma_\ell, \psi_j)\,\hat{y}^{(r)}\,\frac{1}{c}\,|E_{\ell jx}|\,e^{i\phi_{\ell jx}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)} - \sum_\ell r_p(\sigma_\ell, \psi_j)\,\hat{x}\,\frac{1}{c}\,|E_{\ell jy}|\,e^{i\phi_{\ell jy}}\,e^{2\pi i\sigma_\ell(z^{(r)}-ct)}. \tag{4.43d}$$
These reflected-wave formulas are, of course, the counterpart equations to (4.36c) and (4.36d) for
the transmitted wavefields.
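$$\begin{aligned}
\vec{E}^{(\mathrm{rad})}(\vec{r}, t) &= \mathrm{Re}\left\{\sum_j \sum_\ell \vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)}\right\} \\
&= \sum_j \left\{\frac{1}{2}\sum_\ell \vec{E}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} + \frac{1}{2}\sum_\ell \vec{E}_{\ell j}^{*}\,e^{-2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)}\right\}
\end{aligned} \tag{4.44a}$$
and
$$\begin{aligned}
\vec{B}^{(\mathrm{rad})}(\vec{r}, t) &= \mathrm{Re}\left\{\sum_j \sum_\ell \vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)}\right\} \\
&= \sum_j \left\{\frac{1}{2}\sum_\ell \vec{B}_{\ell j}\,e^{2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)} + \frac{1}{2}\sum_\ell \vec{B}_{\ell j}^{*}\,e^{-2\pi i\sigma_\ell(\hat{\Omega}_j\cdot\vec{r}-ct)}\right\}.
\end{aligned} \tag{4.44b}$$
In Eq. (4.44a), to convert the first inside sum over $\vec{E}_{\ell j}$ into an integral, we replace $\sigma_\ell \geq 0$ with
the continuous variable $\sigma \geq 0$. To convert the sum over $\vec{E}_{\ell j}^{*}$ into an integral, we use negative
values of the same continuous variable σ; that is, we replace $-\sigma_\ell$ with $\sigma < 0$. To set up these
conversions, we define
$$\Delta\sigma_\ell\,\vec{E}_j(\sigma) = \frac{1}{2}\vec{E}_{\ell j} \quad \text{for } \sigma = \sigma_\ell > 0, \tag{4.45a}$$
and
$$\Delta\sigma_\ell\,\vec{E}_j(\sigma) = \frac{1}{2}\vec{E}_{\ell j}^{*} \quad \text{for } \sigma = -\sigma_\ell < 0 \tag{4.45b}$$
with
$$\Delta\sigma_\ell = \sigma_{\ell+1} - \sigma_\ell.$$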
A similar conversion of sums into integrals can be applied to Eq. (4.44b) if we define
G 1 G
∆σ A B j (σ ) = BAj for σ = σ A > 0 , (4.45c)
2
and
G 1 G
∆σ A B j (σ ) = BA∗j for σ = −σ A < 0 . (4.45d)
2
Equations (4.45a) and (4.45c) associate positive $\sigma$ arguments in $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ with the original $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$ vectors, and Eqs. (4.45b) and (4.45d) associate negative $\sigma$ arguments in $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ with the complex conjugate $\vec{E}^{*}_{\ell j}$ and $\vec{B}^{*}_{\ell j}$ vectors. In the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ values per unit wavenumber interval, Eqs. (4.44a) and (4.44b) become
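$$
\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} \vec{E}_j(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma \tag{4.46a}
$$
and
$$
\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \sum_j \int_{-\infty}^{\infty} \vec{B}_j(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma . \tag{4.46b}
$$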
For this limit to make sense, we have to set $\vec{E}_j(\sigma) = 0$ and $\vec{B}_j(\sigma) = 0$ in (4.45a)–(4.45d) at those wavenumbers for which there are no specified $\ell$ index values in (4.44a) and (4.44b); in effect, the indices left out of the sums are now included but assigned zero for their complex vector coefficients $\vec{E}_{\ell j}$ and $\vec{B}_{\ell j}$. Although Eqs. (4.44a) and (4.44b) force vectors $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ to be real, vectors $\vec{E}_j(\sigma)$ and $\vec{B}_j(\sigma)$ are allowed to be complex.
Equations (4.46a) and (4.46b) are a vector shorthand for the six scalar equations
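$$
\begin{aligned}
E_x^{(\mathrm{rad})}(\vec{r},t) &= \sum_j \int_{-\infty}^{\infty} E_{jx}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \\
E_y^{(\mathrm{rad})}(\vec{r},t) &= \sum_j \int_{-\infty}^{\infty} E_{jy}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \\
E_z^{(\mathrm{rad})}(\vec{r},t) &= \sum_j \int_{-\infty}^{\infty} E_{jz}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma ,
\end{aligned}
$$
and
$$
\begin{aligned}
B_x^{(\mathrm{rad})}(\vec{r},t) &= \sum_j \int_{-\infty}^{\infty} B_{jx}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \\
B_y^{(\mathrm{rad})}(\vec{r},t) &= \sum_j \int_{-\infty}^{\infty} B_{jy}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma , \\
B_z^{(\mathrm{rad})}(\vec{r},t) &= \sum_j \int_{-\infty}^{\infty} B_{jz}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_j \cdot \vec{r} - ct)}\, d\sigma ,
\end{aligned}
$$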
where
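$$
\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \hat{x} E_x^{(\mathrm{rad})}(\vec{r},t) + \hat{y} E_y^{(\mathrm{rad})}(\vec{r},t) + \hat{z} E_z^{(\mathrm{rad})}(\vec{r},t)
$$
with
$$
\vec{E}_j(\sigma) = \hat{x} E_{jx}(\sigma) + \hat{y} E_{jy}(\sigma) + \hat{z} E_{jz}(\sigma)
$$
and
$$
\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \hat{x} B_x^{(\mathrm{rad})}(\vec{r},t) + \hat{y} B_y^{(\mathrm{rad})}(\vec{r},t) + \hat{z} B_z^{(\mathrm{rad})}(\vec{r},t)
$$
with
$$
\vec{B}_j(\sigma) = \hat{x} B_{jx}(\sigma) + \hat{y} B_{jy}(\sigma) + \hat{z} B_{jz}(\sigma)
$$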
for any $\hat{x}$, $\hat{y}$, $\hat{z}$ triplet of mutually perpendicular Cartesian unit vectors. The integrals in (4.46a) and (4.46b) are inverse Fourier transforms, so we can define, using $\xi = \hat{\Omega}_j \cdot \vec{r} - ct$,
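$$
E_{jx}(\xi) = \int_{-\infty}^{\infty} E_{jx}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad
E_{jy}(\xi) = \int_{-\infty}^{\infty} E_{jy}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad
E_{jz}(\xi) = \int_{-\infty}^{\infty} E_{jz}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma
$$
and
$$
B_{jx}(\xi) = \int_{-\infty}^{\infty} B_{jx}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad
B_{jy}(\xi) = \int_{-\infty}^{\infty} B_{jy}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma , \quad
B_{jz}(\xi) = \int_{-\infty}^{\infty} B_{jz}(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma .
$$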
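$$
\vec{E}_j(\xi) = \int_{-\infty}^{\infty} \vec{E}_j(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \tag{4.46c}
$$
and
$$
\vec{B}_j(\xi) = \int_{-\infty}^{\infty} \vec{B}_j(\sigma)\, e^{2\pi i \sigma \xi}\, d\sigma \tag{4.46d}
$$
where
$$
\vec{E}_j(\xi) = \hat{x} E_{jx}(\xi) + \hat{y} E_{jy}(\xi) + \hat{z} E_{jz}(\xi) \tag{4.46e}
$$
and
$$
\vec{B}_j(\xi) = \hat{x} B_{jx}(\xi) + \hat{y} B_{jy}(\xi) + \hat{z} B_{jz}(\xi) . \tag{4.46f}
$$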
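Now Eqs. (4.46a) and (4.46b) can be written as (remember that $\xi = \hat{\Omega}_j \cdot \vec{r} - ct$)
$$
\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \sum_j \vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct) \tag{4.46g}
$$
and
$$
\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \sum_j \vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct) . \tag{4.46h}
$$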
Returning to the definitions of $\vec{E}_j$ and $\vec{B}_j$ in Eqs. (4.45a)–(4.45d), we see that
$$
\vec{E}_j(-\sigma) = \vec{E}_j(\sigma)^{*} \tag{4.47a}
$$
and
$$
\vec{B}_j(-\sigma) = \vec{B}_j(\sigma)^{*} . \tag{4.47b}
$$
This shows that $\vec{E}_j$ and $\vec{B}_j$ are Hermitian, and entry 7 in Table 2.1 of Chapter 2 requires the inverse Fourier transforms of Hermitian functions to be real. Consequently, because they are inverse Fourier transforms of Hermitian functions, each $\vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ and $\vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ vector function in (4.46g) and (4.46h) is real. Every $\vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ and $\vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ pair of vector functions can be thought of as the real electric and magnetic-induction fields of a single polychromatic plane wave traveling in direction $\hat{\Omega}_j$ at velocity c. Hence these two equations demonstrate that electromagnetic radiation fields in empty space can be represented as the sum of polychromatic plane waves traveling in a specified collection of different directions.
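The reality argument can be checked numerically with a discrete stand-in for Eq. (4.46c). The following is a minimal sketch, not the book's method: the amplitudes and wavenumbers are hypothetical, and a single scalar component stands in for each vector coefficient. Each positive-wavenumber term is paired with its complex-conjugate mirror term, as Eq. (4.47a) requires, and the imaginary part of the sum is then inspected.

```python
import cmath
import math


def synthesize(amplitudes, wavenumbers, xi):
    """Discrete analogue of Eq. (4.46c) for one field component.

    Every amplitude at +sigma is paired with its complex conjugate at
    -sigma, which is the Hermitian condition of Eq. (4.47a).
    """
    total = 0j
    for amp, sigma in zip(amplitudes, wavenumbers):
        # Term at the positive wavenumber +sigma.
        total += 0.5 * amp * cmath.exp(2j * math.pi * sigma * xi)
        # Mirror term at -sigma, carrying the complex conjugate.
        total += 0.5 * amp.conjugate() * cmath.exp(-2j * math.pi * sigma * xi)
    return total


# Hypothetical complex amplitudes and wavenumbers (arbitrary units).
amps = [1.0 + 2.0j, -0.5 + 0.3j, 0.7 - 1.1j]
sigmas = [0.5, 1.25, 3.0]

values = [synthesize(amps, sigmas, xi) for xi in (-1.0, 0.0, 0.37, 2.5)]
max_imag = max(abs(v.imag) for v in values)
print(max_imag)  # should be at the level of floating-point round-off
```

Dropping the mirror terms from the loop leaves a sum whose imaginary part is generally not small, which is why the negative-wavenumber half of the integral is needed.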
Polychromatic Wave Fields · 4.7
From Eqs. (4.14c) and (4.14d), we know that $\vec{B}_{\ell j} \cdot \hat{\Omega}_j = 0$ and $\vec{E}_{\ell j} \cdot \hat{\Omega}_j = 0$. Taking the complex conjugate of these two relationships gives $\vec{B}^{*}_{\ell j} \cdot \hat{\Omega}_j = 0$ and $\vec{E}^{*}_{\ell j} \cdot \hat{\Omega}_j = 0$. We can now take the dot product of both sides of Eqs. (4.45a) and (4.45b) with $\hat{\Omega}_j$ to get
$$
\vec{E}_j(\sigma) \cdot \hat{\Omega}_j = 0 \tag{4.48a}
$$
and the dot product of both sides of Eqs. (4.45c) and (4.45d) with $\hat{\Omega}_j$ to get
$$
\vec{B}_j(\sigma) \cdot \hat{\Omega}_j = 0 \tag{4.48b}
$$
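and
$$
\vec{B}_j(\xi) \cdot \hat{\Omega}_j = \int_{-\infty}^{\infty} \big[ \vec{B}_j(\sigma) \cdot \hat{\Omega}_j \big]\, e^{2\pi i \sigma \xi}\, d\sigma
$$
because $\hat{\Omega}_j$ is a constant unit vector. Substituting from Eqs. (4.48a) and (4.48b) and remembering that $\xi = \hat{\Omega}_j \cdot \vec{r} - ct$ now leads to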
$$
\vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct) \cdot \hat{\Omega}_j = 0 \tag{4.49a}
$$
and
$$
\vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct) \cdot \hat{\Omega}_j = 0 \tag{4.49b}
$$
for any polychromatic plane wave $\vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$ and $\vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct)$. Consequently, the E and B fields of a polychromatic plane wave, just like the E and B fields of a monochromatic plane wave, are transverse to the wave’s direction of propagation. From Eq. (4.22a) we note that, taking the complex conjugates of the original equality,
$$
\vec{E}_{\ell j} \cdot \vec{B}_{\ell j} = \vec{E}^{*}_{\ell j} \cdot \vec{B}^{*}_{\ell j} = 0 .
$$
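Hence from Eqs. (4.45a) through (4.45d) it follows that
$$
\vec{E}_j(\sigma) \cdot \vec{B}_j(\sigma) = \frac{1}{4 (\Delta\sigma_\ell)^2}\, \vec{E}_{\ell j} \cdot \vec{B}_{\ell j} = 0
$$
for $\sigma > 0$ and
$$
\vec{E}_j(\sigma) \cdot \vec{B}_j(\sigma) = \frac{1}{4 (\Delta\sigma_\ell)^2}\, \vec{E}^{*}_{\ell j} \cdot \vec{B}^{*}_{\ell j} = 0
$$
for $\sigma < 0$. We conclude, in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ values, that
$$
\vec{E}_j(\sigma) \cdot \vec{B}_j(\sigma) = 0 \tag{4.49c}
$$
for all positive and negative values of $\sigma$. We divide both sides of Eq. (4.22b) by $4(\Delta\sigma_\ell)^2$ to get
$$
\frac{1}{4 (\Delta\sigma_\ell)^2}\, \vec{E}_{\ell j} \times \vec{B}^{*}_{\ell j} = \frac{1}{c} \left[ \frac{1}{4 (\Delta\sigma_\ell)^2} \right] \big( \vec{E}_{\ell j} \cdot \vec{E}^{*}_{\ell j} \big)\, \hat{\Omega}_j . \tag{4.49d}
$$
Consulting Eq. (4.45a), the complex conjugate of Eq. (4.45a), and the complex conjugate of Eq. (4.45c), we note that in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$ it follows that
$$
\vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{c} \big( \vec{E}_j(\sigma) \cdot \vec{E}_j(\sigma)^{*} \big)\, \hat{\Omega}_j \tag{4.49e}
$$
for $\sigma > 0$. For $\sigma < 0$ we have, using (4.45b) and the complex conjugate of (4.45d), that
$$
\frac{1}{4 (\Delta\sigma_\ell)^2}\, \vec{E}^{*}_{\ell j} \times \vec{B}_{\ell j} = \vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} .
$$
The complex conjugate of Eq. (4.22b) then gives
$$
\vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{4 c (\Delta\sigma_\ell)^2} \big( \vec{E}^{*}_{\ell j} \cdot \vec{E}_{\ell j} \big)\, \hat{\Omega}_j .
$$
Remembering that $\sigma < 0$, we now use (4.45b) and the complex conjugate of (4.45b) to write, in the limit of decreasing $\Delta\sigma_\ell$ and increasing numbers of $\sigma_\ell$, that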
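$$
\vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{c} \big( \vec{E}_j(\sigma)^{*} \cdot \vec{E}_j(\sigma) \big)\, \hat{\Omega}_j = \frac{1}{c} \big( \vec{E}_j(\sigma) \cdot \vec{E}_j(\sigma)^{*} \big)\, \hat{\Omega}_j .
$$
Comparing this result with Eq. (4.49e) shows that
$$
\vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{c} \big( \vec{E}_j(\sigma) \cdot \vec{E}_j(\sigma)^{*} \big)\, \hat{\Omega}_j \tag{4.49f}
$$
holds true for all positive and negative values of $\sigma$. Glancing back at Eq. (4.47a), we see that this can also be written as
$$
\vec{E}_j(\sigma) \times \vec{B}_j(\sigma)^{*} = \frac{1}{c} \big( \vec{E}_j(\sigma) \cdot \vec{E}_j(-\sigma) \big)\, \hat{\Omega}_j \tag{4.49g}
$$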
$\varepsilon_{jz} > 0$. Now all the plane waves in Eqs. (4.46a) and (4.46b) are traveling more or less along the positive z axis of the Cartesian coordinate system; that is, the angle between $\hat{\Omega}_j$ and $\hat{z}$ is
This makes it clear that the two real parameters $\varepsilon_{jx}$ and $\varepsilon_{jy}$ specify the propagation direction $\hat{\Omega}_j$ of the jth plane wave. Consequently, each plane wave in the sums over j in Eqs. (4.46a) and (4.46b) can be specified by a single point in the $\varepsilon_x$, $\varepsilon_y$ plane. Figure 4.10 shows how this works for the sum of the five plane waves specified by the points $(\varepsilon_{1x}, \varepsilon_{1y})$, $(\varepsilon_{2x}, \varepsilon_{2y})$, $(\varepsilon_{3x}, \varepsilon_{3y})$, $(\varepsilon_{4x}, \varepsilon_{4y})$, and $(\varepsilon_{5x}, \varepsilon_{5y})$. We can construct a grid of $\varepsilon_x$, $\varepsilon_y$ values such that each plane wave is located at a node in the grid, where if necessary the grid lines are unevenly spaced as in Fig. 4.10. After numbering the grid lines, we can replace the single index j by a pair of indices m and n. The five plane waves in Fig. 4.10, for example, become
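$$
(\varepsilon_{1x}, \varepsilon_{1y}) \to (\varepsilon_{2x}, \varepsilon_{4y}) , \quad
(\varepsilon_{2x}, \varepsilon_{2y}) \to (\varepsilon_{5x}, \varepsilon_{1y}) , \quad
(\varepsilon_{3x}, \varepsilon_{3y}) \to (\varepsilon_{3x}, \varepsilon_{2y}) , \quad
(\varepsilon_{4x}, \varepsilon_{4y}) \to (\varepsilon_{4x}, \varepsilon_{5y}) ,
$$
and
$$
(\varepsilon_{5x}, \varepsilon_{5y}) \to (\varepsilon_{1x}, \varepsilon_{3y}) .
$$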
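The re-indexing from a single index j to a pair of grid-line indices can be sketched in a few lines. This is only an illustration: the coordinates below are hypothetical values chosen so that their ordering reproduces the five-wave mapping listed above, and the function name `grid_indices` is not from the text. Grid lines are numbered from 1 in order of increasing coordinate, with n counting the $\varepsilon_x$ lines and m the $\varepsilon_y$ lines.

```python
def grid_indices(points):
    """Map each plane wave j at (eps_x, eps_y) to grid-line indices (n, m).

    The grid lines are the sorted distinct eps_x values (index n) and the
    sorted distinct eps_y values (index m), so they may be unevenly spaced.
    """
    x_lines = sorted({eps_x for eps_x, _ in points.values()})
    y_lines = sorted({eps_y for _, eps_y in points.values()})
    return {
        j: (x_lines.index(eps_x) + 1, y_lines.index(eps_y) + 1)
        for j, (eps_x, eps_y) in points.items()
    }


# Hypothetical direction parameters for the five plane waves.
points = {
    1: (0.010, 0.055),
    2: (0.050, 0.005),
    3: (0.020, 0.030),
    4: (0.040, 0.060),
    5: (-0.010, 0.035),
}

mapping = grid_indices(points)
for j, (n, m) in sorted(mapping.items()):
    print(f"plane wave {j} -> grid node (n={n}, m={m})")
```

Grid nodes that receive no plane wave are the ones whose coefficients are later set to zero.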
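Replacing index j by a pair of indices m and n lets us write the sums in Eqs. (4.46a) and (4.46b) as
$$
\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \vec{E}_{nm}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_{nm} \cdot \vec{r} - ct)}\, d\sigma \tag{4.51a}
$$
and
$$
\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \int_{-\infty}^{\infty} \sum_{n=-\infty}^{\infty} \sum_{m=-\infty}^{\infty} \vec{B}_{nm}(\sigma)\, e^{2\pi i \sigma (\hat{\Omega}_{nm} \cdot \vec{r} - ct)}\, d\sigma , \tag{4.51b}
$$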
where we define $\vec{E}_{nm}(\sigma) = \vec{B}_{nm}(\sigma) = 0$ for those grid points that do not correspond to propagation directions specified in the original sums over j. The new set of $\hat{\Omega}_{nm}$ propagation vectors can be written as
$$
\hat{\Omega}_{nm} = \hat{x} \varepsilon_{nx} + \hat{y} \varepsilon_{my} + \hat{z} \sqrt{1 - \varepsilon_{nx}^2 - \varepsilon_{my}^2} . \tag{4.51c}
$$
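For each m and n propagation direction in Eqs. (4.51a) and (4.51b), we now define
$$
\Delta\varepsilon_{nx}\, \Delta\varepsilon_{my}\, \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = \vec{E}_{nm}(\sigma) \tag{4.52a}
$$
and
$$
\Delta\varepsilon_{nx}\, \Delta\varepsilon_{my}\, \vec{b}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = \vec{B}_{nm}(\sigma) \tag{4.52b}
$$
with
$$
\Delta\varepsilon_{nx} = \varepsilon_{n+1,x} - \varepsilon_{n,x} \tag{4.52c}
$$
and
$$
\Delta\varepsilon_{my} = \varepsilon_{m+1,y} - \varepsilon_{m,y} . \tag{4.52d}
$$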
Angle-Wavenumber Transforms· 4.8
FIGURE 4.10. [Plot of the five plane-wave directions as points $(\varepsilon_{jx}, \varepsilon_{jy})$, j = 1 to 5, at the nodes of an unevenly spaced grid in the $\varepsilon_x$, $\varepsilon_y$ plane.]
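$$
\vec{E}^{(\mathrm{rad})}(\vec{r},t) = \int_{-\infty}^{\infty} d\sigma \iint_{[\varepsilon_x^2 + \varepsilon_y^2 < 1]} d\varepsilon_x\, d\varepsilon_y\; \vec{e}(\varepsilon_x, \varepsilon_y, \sigma)\, e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)} \tag{4.53a}
$$
and
$$
\vec{B}^{(\mathrm{rad})}(\vec{r},t) = \int_{-\infty}^{\infty} d\sigma \iint_{[\varepsilon_x^2 + \varepsilon_y^2 < 1]} d\varepsilon_x\, d\varepsilon_y\; \vec{b}(\varepsilon_x, \varepsilon_y, \sigma)\, e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)} \tag{4.53b}
$$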
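writing
$$
\hat{\Omega} = \vec{\varepsilon} + \hat{z} \sqrt{1 - \varepsilon^2} \tag{4.54a}
$$
and
$$
\vec{r} = \vec{\rho} + z \hat{z} , \tag{4.54b}
$$
where
$$
\vec{\varepsilon} = \hat{x} \varepsilon_x + \hat{y} \varepsilon_y , \tag{4.54c}
$$
$$
\varepsilon^2 = |\vec{\varepsilon}|^2 = \varepsilon_x^2 + \varepsilon_y^2 , \tag{4.54d}
$$
and
$$
\vec{\rho} = x \hat{x} + y \hat{y} . \tag{4.54e}
$$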
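As a shorthand, we write the complex vector functions $\vec{e}$ and $\vec{b}$ as
$$
\vec{e}(\varepsilon_x, \varepsilon_y, \sigma) = \vec{e}(\vec{\varepsilon}, \sigma) \quad \text{and} \quad \vec{b}(\varepsilon_x, \varepsilon_y, \sigma) = \vec{b}(\vec{\varepsilon}, \sigma) .
$$
Equations (4.52a) and (4.52b) show that both $\vec{e}(\vec{\varepsilon}, \sigma)$ and $\vec{b}(\vec{\varepsilon}, \sigma)$ must be negligible or zero for values of $\vec{\varepsilon} = \hat{x} \varepsilon_x + \hat{y} \varepsilon_y$ that do not correspond to grid points contained in the original sums over j. We also require $\vec{e}(\vec{\varepsilon}, \sigma)$ and $\vec{b}(\vec{\varepsilon}, \sigma)$ to be zero for values of $\vec{\varepsilon}$ for which $|\vec{\varepsilon}| \ge 1$. Now Eqs. (4.53a) and (4.53b) become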
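$$
\vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big[ \vec{e}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \Big]\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} \tag{4.55a}
$$
and
$$
\vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big[ \vec{b}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \Big]\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} , \tag{4.55b}
$$
where we have singled out the z dependence of $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$, writing that
$$
\vec{E}^{(\mathrm{rad})}(\vec{r}, t) = \vec{E}^{(\mathrm{rad})}(x, y, z, t) = \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t)
$$
and
$$
\vec{B}^{(\mathrm{rad})}(\vec{r}, t) = \vec{B}^{(\mathrm{rad})}(x, y, z, t) = \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) .
$$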
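The step from Eqs. (4.53a) and (4.53b) to Eqs. (4.55a) and (4.55b) rests on a single dot product. As a short worked check, using only the definitions in Eqs. (4.54a) through (4.54e) and the fact that $\vec{\varepsilon}$ and $\vec{\rho}$ both lie in the x, y plane:

```latex
\begin{aligned}
\hat{\Omega} \cdot \vec{r}
  &= \left( \vec{\varepsilon} + \hat{z} \sqrt{1 - \varepsilon^2} \right) \cdot \left( \vec{\rho} + z \hat{z} \right) \\
  &= \vec{\varepsilon} \cdot \vec{\rho}
     + z \left( \vec{\varepsilon} \cdot \hat{z} \right)
     + \sqrt{1 - \varepsilon^2} \left( \hat{z} \cdot \vec{\rho} \right)
     + z \sqrt{1 - \varepsilon^2} \\
  &= \vec{\varepsilon} \cdot \vec{\rho} + z \sqrt{1 - \varepsilon^2} ,
\end{aligned}
\qquad\text{so}\qquad
e^{2\pi i \sigma (\hat{\Omega} \cdot \vec{r} - ct)}
  = e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \,
    e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} .
```

The first exponential factor carries all of the z dependence, which is why it can be grouped with $\vec{e}$ and $\vec{b}$ inside the square brackets.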
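From Eqs. (4.49f), (4.52a), and (4.52b), we have, replacing each j index by the appropriate m and n pair of indices,
$$
\begin{aligned}
\vec{E}_{nm}(\sigma) \times \vec{B}_{nm}(\sigma)^{*}
&= (\Delta\varepsilon_{nx} \Delta\varepsilon_{my})^2\, \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) \times \vec{b}(\varepsilon_{nx}, \varepsilon_{my}, \sigma)^{*} \\
&= \frac{1}{c} (\Delta\varepsilon_{nx} \Delta\varepsilon_{my})^2 \big( \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) \cdot \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma)^{*} \big)\, \hat{\Omega}_{nm} .
\end{aligned}
$$
Dropping the m and n indices, making the notation change $\varepsilon_x, \varepsilon_y \to \vec{\varepsilon}$, and dividing through by $(\Delta\varepsilon_{nx} \Delta\varepsilon_{my})^2$, we get, in the limit of decreasing $\Delta\varepsilon_{nx}$, $\Delta\varepsilon_{my}$ and increasing numbers of specified propagation directions, that
$$
\vec{e}(\vec{\varepsilon}, \sigma) \times \vec{b}(\vec{\varepsilon}, \sigma)^{*} = \frac{1}{c} \big( \vec{e}(\vec{\varepsilon}, \sigma) \cdot \vec{e}(\vec{\varepsilon}, \sigma)^{*} \big)\, \hat{\Omega} . \tag{4.56a}
$$
Following the same procedure, we substitute Eqs. (4.52a) and (4.52b) into (4.49g) to get
$$
\vec{e}(\vec{\varepsilon}, \sigma) \times \vec{b}(\vec{\varepsilon}, \sigma)^{*} = \frac{1}{c} \big( \vec{e}(\vec{\varepsilon}, \sigma) \cdot \vec{e}(\vec{\varepsilon}, -\sigma) \big)\, \hat{\Omega} . \tag{4.56b}
$$
We can also substitute (4.52a) into (4.48a) to get, replacing each j by appropriate m and n indices,
$$
\Delta\varepsilon_{nx}\, \Delta\varepsilon_{my}\, \vec{e}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) \cdot \hat{\Omega}_{nm} = 0 , \tag{4.56c}
$$
which becomes, making the same notation changes as before and taking the same limit as before,
$$
\vec{e}(\vec{\varepsilon}, \sigma) \cdot \hat{\Omega} = 0 . \tag{4.56d}
$$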
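$$
\vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\; \vec{E}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} \tag{4.58a}
$$
and
$$
\vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\; \vec{B}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} . \tag{4.58b}
$$
The complex vectors $\vec{E}(\vec{\varepsilon}, z, \sigma)$ and $\vec{B}(\vec{\varepsilon}, z, \sigma)$ are called the angle-wavenumber transforms of $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ respectively. By definition [see Eqs. (4.57a) and (4.57b)], the angle-wavenumber transforms at $z_0 + \Delta z$ are given by
$$
\vec{E}(\vec{\varepsilon}, z_0 + \Delta z, \sigma) = \vec{E}(\vec{\varepsilon}, z_0, \sigma)\, e^{2\pi i \sigma \Delta z \sqrt{1 - \varepsilon^2}} \tag{4.59a}
$$
and
$$
\vec{B}(\vec{\varepsilon}, z_0 + \Delta z, \sigma) = \vec{B}(\vec{\varepsilon}, z_0, \sigma)\, e^{2\pi i \sigma \Delta z \sqrt{1 - \varepsilon^2}} . \tag{4.59b}
$$
These equalities show that to get $\vec{E}$ and $\vec{B}$ at $z_0 + \Delta z$ we need only multiply $\vec{E}$ and $\vec{B}$ at $z_0$ by $e^{2\pi i \sigma \Delta z \sqrt{1 - \varepsilon^2}}$. Multiplication of Eqs. (4.56d) and (4.56e) by $e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}}$ gives
$$
\vec{E}(\vec{\varepsilon}, z, \sigma) \cdot \hat{\Omega} = 0 \tag{4.59c}
$$
and
$$
\vec{B}(\vec{\varepsilon}, z, \sigma) \cdot \hat{\Omega} = 0 . \tag{4.59d}
$$
Multiplying both sides of Eq. (4.56a) by
$$
e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \cdot e^{-2\pi i \sigma z \sqrt{1 - \varepsilon^2}} = 1
$$
gives
$$
\Big[ \vec{e}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \Big] \times \Big[ \vec{b}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \Big]^{*}
= \frac{1}{c} \Big( \Big[ \vec{e}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \Big] \cdot \Big[ \vec{e}(\vec{\varepsilon}, \sigma)\, e^{2\pi i \sigma z \sqrt{1 - \varepsilon^2}} \Big]^{*} \Big)\, \hat{\Omega}
$$
or
$$
\vec{E}(\vec{\varepsilon}, z, \sigma) \times \vec{B}(\vec{\varepsilon}, z, \sigma)^{*} = \frac{1}{c} \big( \vec{E}(\vec{\varepsilon}, z, \sigma) \cdot \vec{E}(\vec{\varepsilon}, z, \sigma)^{*} \big)\, \hat{\Omega} . \tag{4.59e}
$$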
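Equations (4.59a) and (4.59b) say that moving an angle-wavenumber transform a distance $\Delta z$ along the z axis is a pure phase multiplication. The sketch below illustrates this for a single complex component; the function name `propagate` and the sample numbers are assumptions made for the example, and the cutoff for $|\vec{\varepsilon}| \ge 1$ follows the requirement that the transforms vanish there.

```python
import cmath
import math


def propagate(component_at_z0, eps_x, eps_y, sigma, delta_z):
    """Advance one component of an angle-wavenumber transform by delta_z.

    Multiplies by exp(2*pi*i*sigma*delta_z*sqrt(1 - eps^2)), the phase
    factor in Eqs. (4.59a) and (4.59b).
    """
    eps_squared = eps_x ** 2 + eps_y ** 2
    if eps_squared >= 1.0:
        # The transforms are taken to be zero for |eps| >= 1.
        return 0j
    phase = 2.0 * math.pi * sigma * delta_z * math.sqrt(1.0 - eps_squared)
    return component_at_z0 * cmath.exp(1j * phase)


# Hypothetical values: sigma in 1/cm, distances in cm.
start = 1.0 + 1.0j
one_step = propagate(start, 0.01, 0.02, 1000.0, 0.5)
two_steps = propagate(propagate(start, 0.01, 0.02, 1000.0, 0.2),
                      0.01, 0.02, 1000.0, 0.3)

print(abs(one_step), abs(start))   # the magnitude is unchanged
print(abs(one_step - two_steps))   # successive steps compose additively
```

Because the factor has unit magnitude, propagation through empty space changes only the relative phases of the plane-wave components, never their strengths.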
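$$
\vec{E}(\vec{\varepsilon}, z, \sigma) \times \vec{B}(\vec{\varepsilon}, z, \sigma)^{*} = \frac{1}{c} \big( \vec{E}(\vec{\varepsilon}, z, \sigma) \cdot \vec{E}(\vec{\varepsilon}, z, -\sigma) \big)\, \hat{\Omega} . \tag{4.59f}
$$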
Equations (4.58a) and (4.58b) are a disguised form of the inverse Fourier transform. Writing (4.58a) using x and y for $\vec{\rho}$, $\varepsilon_x$ and $\varepsilon_y$ for $\vec{\varepsilon}$, and then making the substitutions
$$
w = -\sigma c , \tag{4.60a}
$$
$$
u_x = \sigma \varepsilon_x , \tag{4.60b}
$$
$$
u_y = \sigma \varepsilon_y \tag{4.60c}
$$
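gives
$$
\begin{aligned}
\vec{E}^{(\mathrm{rad})}(x, y, z, t)
&= \int_{-\infty}^{\infty} d\sigma \int_{-\infty}^{\infty} \sigma^{-1} du_x \int_{-\infty}^{\infty} \sigma^{-1} du_y\; \vec{E}(\sigma^{-1} u_x, \sigma^{-1} u_y, z, \sigma)\, e^{2\pi i (x u_x + y u_y - \sigma c t)} \\
&= -\int_{\infty}^{-\infty} \frac{dw}{c} \int_{-\infty}^{\infty} \left( -\frac{c}{w} \right) du_x \int_{-\infty}^{\infty} \left( -\frac{c}{w} \right) du_y\; \vec{E}\left( -\frac{c u_x}{w}, -\frac{c u_y}{w}, z, -\frac{w}{c} \right) e^{2\pi i (x u_x + y u_y + w t)}
\end{aligned}
$$
or
$$
\vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left[ c w^{-2}\, \vec{E}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) \right] e^{2\pi i (\vec{\rho} \cdot \vec{u} + w t)} , \tag{4.61a}
$$
and
$$
\vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left[ c w^{-2}\, \vec{B}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) \right] e^{2\pi i (\vec{\rho} \cdot \vec{u} + w t)} . \tag{4.61d}
$$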
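According to Eq. (2.110f) in Chapter 2, we have now demonstrated that functions $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ at a specified value of z are the vector inverse Fourier transforms of
$$
c w^{-2}\, \vec{E}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right)
$$
and
$$
c w^{-2}\, \vec{B}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) .
$$
Hence, the vector forward Fourier transforms of $\vec{E}^{(\mathrm{rad})}$ and $\vec{B}^{(\mathrm{rad})}$ must be [see Eq. (2.110e) in Chapter 2]
$$
c w^{-2}\, \vec{E}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \tag{4.62a}
$$
and
$$
c w^{-2}\, \vec{B}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \tag{4.62b}
$$
or, returning to the $\vec{\varepsilon}$ and $\sigma$ arguments,
$$
\vec{E}(\vec{\varepsilon}, z, \sigma) = c \sigma^2 \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{E}^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} \tag{4.62c}
$$
and
$$
\vec{B}(\vec{\varepsilon}, z, \sigma) = c \sigma^2 \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{B}^{(\mathrm{rad})}(\vec{\rho}, z, t)\, e^{-2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} . \tag{4.62d}
$$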
Equations (4.58a), (4.58b), (4.62c), and (4.62d) are a formal transformation from the angle-wavenumber transforms to the real E and B radiation fields and back again, subject only to the constraint that $\varepsilon_z > 0$ in the propagation vector $\hat{\Omega} = \vec{\varepsilon} + \hat{z} \varepsilon_z$ and that
$$
\vec{E}(\vec{\varepsilon}, z, \sigma) = \vec{B}(\vec{\varepsilon}, z, \sigma) = 0 \quad \text{when } |\vec{\varepsilon}|^2 = \varepsilon_x^2 + \varepsilon_y^2 \ge 1 .
$$
To go from Eqs. (4.58a) and (4.58b) to (4.62c) and (4.62d), we show the original angle-wavenumber transforms to be a form of three-dimensional vector Fourier transform. This lets us use Fourier transform theory to write down the integrals for the inverse transforms. Unfortunately, the change in Eqs. (4.60a)–(4.60c) from the $\vec{\varepsilon}$, $\sigma$ variables to the $\vec{u}$, w variables that reveals the transform’s Fourier nature is a somewhat awkward one. There are two reasons for
this: In the physical sciences, waves conventionally travel from left to right, forcing $\sigma$ and w to have opposite signs in (4.60a), and in spectroscopy, wavenumbers rather than frequencies are conventionally used to characterize monochromatic radiation. Nevertheless, the rewards of converting to the Fourier transform (immediate access to the well-known results of Fourier theory) significantly outweigh the inconvenience, and the reader can expect to see transformations between the $\vec{\varepsilon}$, $\sigma$ variables and the $\vec{u}$, w variables more than once in the balance of this chapter.
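Since this change of variables recurs, a small round-trip sketch may help keep the signs straight. It is an illustration under stated assumptions: the function names are invented for the example, SI units are assumed for c, and the sample values are arbitrary. The forward map is Eqs. (4.60a) through (4.60c); the inverse is the set of arguments $-c\vec{u}/w$ and $-w/c$ that appears in Eq. (4.61a).

```python
import math

C = 299_792_458.0  # speed of light in m/s (SI units assumed for this sketch)


def to_fourier_variables(eps_x, eps_y, sigma, c=C):
    """Eqs. (4.60a)-(4.60c): (eps_x, eps_y, sigma) -> (u_x, u_y, w)."""
    return sigma * eps_x, sigma * eps_y, -sigma * c


def to_angle_wavenumber(u_x, u_y, w, c=C):
    """Inverse map: eps = -c*u/w and sigma = -w/c, as in Eq. (4.61a)."""
    if w == 0:
        raise ValueError("w = 0 corresponds to sigma = 0; eps is undefined")
    return -c * u_x / w, -c * u_y / w, -w / c


# Hypothetical sample: sigma = 1e5 per metre (a 10-micrometre wavelength).
eps_x, eps_y, sigma = 0.01, -0.02, 1.0e5
u_x, u_y, w = to_fourier_variables(eps_x, eps_y, sigma)
round_trip = to_angle_wavenumber(u_x, u_y, w)

# The two exponents agree: sigma*(eps.rho - c*t) equals rho.u + w*t.
x, y, t = 0.01, 0.02, 1.0e-9
phase_angle_form = sigma * (eps_x * x + eps_y * y - C * t)
phase_fourier_form = x * u_x + y * u_y + w * t
print(round_trip, phase_angle_form, phase_fourier_form)
```

The minus sign in `-sigma * c` is the one the text attributes to the left-to-right propagation convention.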
points in the direction of $\hat{\Omega}_j$, and the plane surfaces specified by $\hat{\Omega}_j \cdot \vec{r} = \text{constant}$ are surfaces perpendicular to all the parallel rays. If the plane wave is monochromatic, then these surfaces where $\hat{\Omega}_j \cdot \vec{r} = \text{constant}$ are also surfaces of constant phase at fixed time t, since the
This means, of course, that the monochromatic E field as well as the monochromatic B field is constant over any of these plane surfaces at fixed time t. If the plane wave is polychromatic, we review the discussion following Eq. (4.47b) and note that a single polychromatic plane wave has E and B fields specified by the vector functions
$$
\vec{E}_j(\hat{\Omega}_j \cdot \vec{r} - ct) \quad \text{and} \quad \vec{B}_j(\hat{\Omega}_j \cdot \vec{r} - ct)
$$
respectively. Consequently, at any fixed time t, the polychromatic E field as well as the polychromatic B field is constant over any plane surface where $\hat{\Omega}_j \cdot \vec{r} = \text{constant}$; that is, they are constant over any plane surface perpendicular to the rays. For both monochromatic and polychromatic plane waves, the E and B fields themselves lie in these plane surfaces because they must be perpendicular to the propagation vector $\hat{\Omega}_j$ [as shown by Eqs. (4.14c), (4.14d), (4.49a), and (4.49b)].
Figure 4.11 shows a plane wave encountering an aperture. The rays entering the aperture pass
on through, creating a beam; we say that the aperture creates a beam-chopped radiation field.
Beam-Chopped and Direction-Chopped Radiation · 4.9
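FIGURE 4.11. [Diagram of a plane wave encountering an aperture, with $\hat{x}$, $\hat{y}$, $\hat{z}$ coordinate axes drawn before and after the aperture.]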
From our current point of view, the most important characteristic of beam-chopped fields is that
they obviously can be Fourier transformed in planes perpendicular to the beam’s direction of
travel. Using the x, y, z coordinate system shown in Fig. 4.11, with its origin in the center of the
beam and its z axis pointing down the beam, we drop the (rad) superscript from Eqs. (4.62a) and
(4.62b) and write
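$$
\begin{aligned}
c w^{-2}\, \vec{E}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right)
&= \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{E}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \\
&= \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; \vec{E}(x, y, z, t)\, e^{-2\pi i (x u_x + y u_y + w t)}
\end{aligned}
\tag{4.63a}
$$
and
$$
\begin{aligned}
c w^{-2}\, \vec{B}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right)
&= \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{B}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \\
&= \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; \vec{B}(x, y, z, t)\, e^{-2\pi i (x u_x + y u_y + w t)} ,
\end{aligned}
\tag{4.63b}
$$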
where $\vec{E}(x, y, z, t)$ and $\vec{B}(x, y, z, t)$ represent the E and B fields after the aperture in Fig. 4.11. In
these formulas the integrals over x and y can be assumed to converge because the beam-chopped
E and B fields are negligibly small for large values of x and y.
Figure 4.11 suggests that a beam-chopped radiation field can travel indefinitely far to the right
with a cross-section that is always the same shape as the aperture. We know, however, that
diffraction eventually causes all beam-chopped radiation fields to spread; the smaller the
characteristic (or average) wavelength of the radiation compared to the characteristic (or average)
distance across the aperture, the farther the beam travels before significant spreading occurs.62
Michelson interferometers use apertures that are very large compared to the wavelengths of
interest, ensuring that only an insignificant amount of spreading occurs in the beam-chopped field
as it travels through the instrument.
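To put rough numbers on this, the sketch below uses the standard order-of-magnitude estimate that a beam of width a stays collimated over a distance of about $a^2/\lambda$. That estimate is an assumption brought in for illustration and is not a formula from this chapter; the aperture sizes and wavelength are likewise hypothetical.

```python
def spreading_distance(aperture_width_m, wavelength_m):
    """Rough distance (in metres) over which a beam of the given width
    stays collimated, using the a**2 / wavelength estimate assumed above."""
    if aperture_width_m <= 0 or wavelength_m <= 0:
        raise ValueError("aperture width and wavelength must be positive")
    return aperture_width_m ** 2 / wavelength_m


# Hypothetical infrared example: 10-micrometre radiation.
wavelength = 10e-6
large_aperture = spreading_distance(0.05, wavelength)   # 5 cm aperture
small_aperture = spreading_distance(0.001, wavelength)  # 1 mm aperture
print(large_aperture, small_aperture)
```

On this estimate a centimetre-scale aperture keeps infrared radiation collimated over distances far longer than a laboratory instrument, while a millimetre-scale aperture would not.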
In geometric optics, when a lens is placed perpendicular to the z axis—that is, the optical
axis—of a beam, the plane waves with propagation vectors parallel to the optical axis are focused
onto the point where the optical axis intersects a perpendicular surface called the focal plane (see
Fig. 4.12). Plane waves with propagation vectors at an angle with respect to the optical axis are
focused onto points in the focal plane that are off to the side. Figure 4.12 shows four rays
representing a plane wave propagating at a small angle to the optical axis being focused by lens A
slightly to the side of where the axis intersects the focal plane. Every propagation direction that is
at a small angle to the optical axis is focused onto a unique point in the focal plane close to the
optical axis and each point in the focal plane close to the optical axis corresponds to a unique
propagation direction at a small angle to the optical axis. Directions that differ only slightly with
respect to each other are focused at closely adjacent points. The plane wave in Fig. 4.12 propagates at a small enough angle with respect to the optical axis that it is focused by lens A only slightly to the side of where the axis intersects the focal plane.
Consequently, it passes through the small aperture placed in the focal plane and out to lens B,
which defocuses it back into a plane wave. Figure 4.13 gives a side view of this phenomenon.
Here there are three plane waves a, b, and c propagating in different directions with respect to the
beam’s optical axis. All the rays belonging to the plane wave a are focused at point a in the focal
plane; all the rays belonging to plane wave b are focused at point b in the focal plane; and all the
rays belonging to plane wave c are focused at point c in the focal plane. Only those plane waves
with propagation vectors at just a slight angle to the optical axis, such as plane wave b, pass
through the central aperture, allowing lens B to create a beam of plane waves propagating nearly
parallel to the optical axis. We say that the radiation leaving lens B has been direction-chopped,
meaning that it contains only a small range of propagation directions. The distance between the focal plane and lenses A and B depends on the lenses’ index of refraction, which may in turn depend on the radiation frequency $f = \sigma c$. If the frequency dependence is strong, then the two lenses in Fig. 4.13 may not do a good job of creating a polychromatic direction-chopped beam. When this is a concern, the all-reflective setup shown in Fig. 4.14 (composed of two Cassegrain
62
R. W. Ditchburn, Light, Vol. I, 2nd ed. (Interscience Publishers, a division of John Wiley & Sons, Inc., New York,
1963), pp. 162–166, 195.
that are nearly parallel to the optical axis. This means that both
for all values of n, m in the sum over plane waves in Eqs. (4.51a) and (4.51b).
When these sums are transformed into double integrals in (4.53a) and (4.53b), the propagation
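vectors
$$
\hat{\Omega} = \hat{x} \varepsilon_x + \hat{y} \varepsilon_y + \hat{z} \sqrt{1 - \varepsilon^2} .
$$
Functions
$$
\vec{e}(\vec{\varepsilon}, \sigma) = \vec{e}(\varepsilon_x, \varepsilon_y, \sigma) \quad \text{and} \quad \vec{b}(\vec{\varepsilon}, \sigma) = \vec{b}(\varepsilon_x, \varepsilon_y, \sigma)
$$
Since the angle-wavenumber transforms $\vec{E}(\vec{\varepsilon}, z, \sigma)$ and $\vec{B}(\vec{\varepsilon}, z, \sigma)$ in (4.57a) and (4.57b) are proportional to $\vec{e}(\vec{\varepsilon}, \sigma)$ and $\vec{b}(\vec{\varepsilon}, \sigma)$, they should also be negligible or zero in direction-chopped beams when $\varepsilon_x$ and $\varepsilon_y$ are not both very small. Consequently, in Eqs. (4.58a) and (4.58b) we see, dropping the (rad) superscript, that the formulas for the E and B fields of the direction-chopped beam,
$$
\vec{E}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\; \vec{E}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} \tag{4.64a}
$$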
FIGURE 4.12. [Diagram labels: Lens A, Focal Plane with Aperture, Lens B, Optical Axis.]
Two matched lenses can be used to create direction-chopped radiation. Only plane waves
propagating at small angles to the optical axis can make it through the aperture in the focal
plane of the lenses (see also Figs. 4.13 and 4.14).
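The selection performed by the focal-plane aperture can be sketched for an idealized lens. The model below is an assumption made for illustration and is not taken from the text: a thin, aberration-free lens of focal length f sends a plane wave with direction parameters $(\varepsilon_x, \varepsilon_y)$ to the focal-plane point $f(\varepsilon_x, \varepsilon_y)/\varepsilon_z$, where $\varepsilon_z = \sqrt{1 - \varepsilon^2}$. The function names and numbers are hypothetical.

```python
import math


def focal_plane_point(eps_x, eps_y, focal_length):
    """Where an ideal thin lens focuses a plane wave with direction
    parameters (eps_x, eps_y); on-axis waves land at the origin."""
    eps_z = math.sqrt(1.0 - eps_x ** 2 - eps_y ** 2)
    return focal_length * eps_x / eps_z, focal_length * eps_y / eps_z


def passes_aperture(eps_x, eps_y, focal_length, aperture_radius):
    """True if the focused spot lies inside a circular focal-plane aperture
    centred on the optical axis."""
    x, y = focal_plane_point(eps_x, eps_y, focal_length)
    return math.hypot(x, y) <= aperture_radius


# Hypothetical setup: 10 cm focal length, 0.5 mm aperture radius.
f, radius = 0.10, 0.0005
nearly_axial = passes_aperture(0.001, 0.0, f, radius)  # like plane wave b
steep = passes_aperture(0.05, 0.0, f, radius)          # like plane waves a, c
print(nearly_axial, steep)
```

In this model the aperture radius divided by the focal length sets the largest tilt that survives, which is what makes the emerging beam direction-chopped.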
FIGURE 4.13. [Diagram labels: plane waves a, b, and c; Optical Axis.]
Fig. 4.12 gives a three-dimensional view of how matched lenses can be used to create direction-chopped radiation, and this diagram is the side view. Plane waves propagating at large angles to the optical axis, like the a and c plane waves in the diagram, are removed from the beam because they focus outside the aperture in the focal plane (see also Fig. 4.14).
FIGURE 4.14. [Diagram labels: Telescope A, Focal Plane with Aperture, Telescope B, Optical Axis; plane waves a, b, and c; $\hat{x}$, $\hat{y}$, $\hat{z}$ axes.]
Just like the lenses in Fig. 4.13, two matched Cassegrain telescopes can be used to
create direction-chopped radiation. Again plane waves propagating at large angles to the
optical axis are removed from the beam because they focus outside the focal-plane
aperture.
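and
$$
\vec{B}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\; \vec{B}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} , \tag{4.64b}
$$
have double integrals over $d^2\varepsilon = d\varepsilon_x\, d\varepsilon_y$ that must converge. For each $\varepsilon_x$, $\varepsilon_y$ pair of values inside the double integral, the tip of the $\hat{\Omega}$ vector can be thought of as lying somewhere inside the infinitesimal area $d^2\varepsilon = d\varepsilon_x\, d\varepsilon_y$ (see Fig. 4.15). As long as only direction-chopped beams where both $\varepsilon_x$ and $\varepsilon_y$ are small are being analyzed, this $d^2\varepsilon$ infinitesimal area must be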
FIGURE 4.15. [Diagram labels: unit vectors $\hat{x}$, $\hat{y}$, $\hat{z}$; propagation vector $\hat{\Omega}$ with components $\varepsilon_x$ and $\varepsilon_y$; infinitesimal area element $d^2\varepsilon$.]
approximately perpendicular to the direction in which $\hat{\Omega}$ points. Because $\hat{\Omega}$ is of unit length and $d^2\varepsilon$ is an infinitesimal area, the formula for the solid angle subtended by $d^2\varepsilon$ becomes $d^2\varepsilon / |\hat{\Omega}|^2 = d^2\varepsilon$. Hence, $d^2\varepsilon$ can also be regarded as an infinitesimal solid angle, and the double integrals over $d^2\varepsilon$ can be interpreted as integrals over all the solid angles that specify allowed propagation directions inside the direction-chopped beam.
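How good is the small-angle identification of $d^2\varepsilon$ with a solid angle? A brief check, offered as a sketch rather than as part of the derivation: on the unit sphere the exact solid-angle element is $d\varepsilon_x\, d\varepsilon_y / \sqrt{1 - \varepsilon^2}$, a standard geometric result for projecting a patch of the sphere onto the x, y plane. The function below (a hypothetical name) returns the ratio of the exact element to $d^2\varepsilon$.

```python
import math


def solid_angle_ratio(eps_x, eps_y):
    """Exact solid-angle element divided by d(eps_x) d(eps_y).

    Equals 1 / sqrt(1 - eps^2), i.e. one over the z component of the
    unit propagation vector.
    """
    return 1.0 / math.sqrt(1.0 - eps_x ** 2 - eps_y ** 2)


on_axis = solid_angle_ratio(0.0, 0.0)
small_tilt = solid_angle_ratio(0.01, 0.0)   # about 0.57 degrees off axis
large_tilt = solid_angle_ratio(0.6, 0.0)    # about 37 degrees off axis
print(on_axis, small_tilt, large_tilt)
```

For tilts of a degree or less the correction is a few parts in $10^5$, so treating $d^2\varepsilon$ as a solid angle is well justified for direction-chopped beams and becomes poor only at steep angles.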
Time-Chopped and Band-Limited Radiation · 4.10
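radiation fields are time-chopped. The formulas for the angle-wavenumber transforms of time-chopped radiation fields are [dropping the (rad) superscript from Eqs. (4.62c) and (4.62d)]
$$
\vec{E}(\vec{\varepsilon}, z, \sigma) = c \sigma^2 \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{E}(\vec{\rho}, z, t)\, e^{-2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} \tag{4.65a}
$$
and
$$
\vec{B}(\vec{\varepsilon}, z, \sigma) = c \sigma^2 \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{B}(\vec{\rho}, z, t)\, e^{-2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} . \tag{4.65b}
$$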
Since the $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ radiation fields are assumed to be time-chopped, the integrals between $-\infty$ and $+\infty$ over time must be well defined and so converge. When the E and B fields are beam-chopped, the infinite double integrals over $d^2\rho$ are also well defined and converge [see the discussion after (4.63b)], so the multiple integrals defining $\vec{E}(\vec{\varepsilon}, z, \sigma)$ and $\vec{B}(\vec{\varepsilon}, z, \sigma)$ are well-defined quantities when $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ represent beam-chopped and time-chopped radiation fields. Similar reasoning shows that when the angle-wavenumber transforms are calculated using the three-dimensional Fourier transforms in Eqs. (4.63a) and (4.63b),
$$
c w^{-2}\, \vec{E}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{E}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} \tag{4.65c}
$$
and
$$
c w^{-2}\, \vec{B}\left( -\frac{c \vec{u}}{w}, z, -\frac{w}{c} \right) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \vec{B}(\vec{\rho}, z, t)\, e^{-2\pi i (\vec{\rho} \cdot \vec{u} + w t)} , \tag{4.65d}
$$
the infinite integrals over dt and $d^2\rho$ are for the same reasons well-defined and convergent when $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ represent beam-chopped and time-chopped radiation fields.
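The inverse transforms to Eqs. (4.65a) and (4.65b) are given in (4.64a) and (4.64b),
$$
\vec{E}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\; \vec{E}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} \tag{4.66a}
$$
and
$$
\vec{B}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\; \vec{B}(\vec{\varepsilon}, z, \sigma)\, e^{2\pi i \sigma (\vec{\varepsilon} \cdot \vec{\rho} - ct)} . \tag{4.66b}
$$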
measure. Hence, in (4.66a) and (4.66b), we expect $\vec{E}(\vec{\varepsilon}, z, \sigma)$ and $\vec{B}(\vec{\varepsilon}, z, \sigma)$ to be negligible for wavenumbers $\sigma$ corresponding to radiation wavelengths blocked by the filters. The filters are said
to define the spectral band (or bands) to which the instrument is sensitive. Even when these filters
are built into the detectors themselves, which means the actual radiation fields traversing the
instrument may contain out-of-band radiation, it is mathematically convenient to assume that only
negligible amounts of out-of-band radiation are present inside the instrument (while, of course,
retaining the correct amounts of in-band radiation measured by the detectors). The situation is
very similar to that encountered in the discussion of time-chopped radiation fields; just as we
assume the absence of radiation outside the time interval during which the measurement occurs,
so now we assume the absence of out-of-band radiation to which the detectors are insensitive.
We must be careful to note which wavenumbers $\sigma$ correspond to the radiation band passed by the filters. Remembering that the wavenumber is one over the wavelength, and reviewing how the original sums over $\sigma_\ell$ in Eqs. (4.44a) and (4.44b) become integrals over $\sigma$ in Eqs. (4.46a) and (4.46b), we see that if only wavelengths $\lambda$ between $\lambda_a$ and $\lambda_b$ are measured by the detectors,
$$
0 < \lambda_b \le \lambda \le \lambda_a , \tag{4.67a}
$$
then $\vec{E}(\vec{\varepsilon}, z, \sigma)$ and $\vec{B}(\vec{\varepsilon}, z, \sigma)$ can be non-negligible only for $\sigma$ values inside the two intervals
$$
-\infty < -\frac{1}{\lambda_b} \le \sigma \le -\frac{1}{\lambda_a} < 0
$$
and
$$
0 < \frac{1}{\lambda_a} \le \sigma \le \frac{1}{\lambda_b} < \infty .
$$
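As a small illustration of the two wavenumber intervals, the sketch below converts a wavelength band into its positive and negative wavenumber bands and tests whether a given $\sigma$ is in band. The function names and the 8 to 12 micrometre example band are assumptions chosen for the example; wavelengths are given in centimetres so that wavenumbers come out in cm⁻¹.

```python
def wavenumber_bands(lambda_b, lambda_a):
    """Return the negative and positive sigma intervals for the wavelength
    band 0 < lambda_b <= lambda <= lambda_a of Eq. (4.67a)."""
    if not 0 < lambda_b <= lambda_a:
        raise ValueError("need 0 < lambda_b <= lambda_a")
    low, high = 1.0 / lambda_a, 1.0 / lambda_b
    return (-high, -low), (low, high)


def in_band(sigma, lambda_b, lambda_a):
    """True if sigma lies inside either of the two intervals."""
    negative, positive = wavenumber_bands(lambda_b, lambda_a)
    return (negative[0] <= sigma <= negative[1]
            or positive[0] <= sigma <= positive[1])


# Hypothetical band: 8 to 12 micrometres, expressed in centimetres.
lam_b, lam_a = 8e-4, 12e-4
negative_band, positive_band = wavenumber_bands(lam_b, lam_a)
print(negative_band, positive_band)
print(in_band(1000.0, lam_b, lam_a), in_band(-1000.0, lam_b, lam_a),
      in_band(500.0, lam_b, lam_a))
```

The negative interval is the mirror image of the positive one, reflecting the Hermitian pairing of $\sigma$ and $-\sigma$ built into Eqs. (4.45a) through (4.45d).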
These intervals can also be written as
$$
G(f) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\, g(t)\, dt
$$
becomes strictly zero when $|f| > F$ for some positive value of F. There is a well-known theorem that states that when a function g(t) is time-chopped, meaning that there exists some positive value of T such that g(t) is strictly zero whenever $|t| > T$, then there is no value of F such that the Fourier transform
$$
G(f) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\, g(t)\, dt
$$
becomes strictly zero whenever $|f| > F$. In short, a function cannot be both time-chopped and band-limited.63 If the angle-wavenumber transforms $\vec{E}(\vec{\varepsilon}, z, \sigma)$ and $\vec{B}(\vec{\varepsilon}, z, \sigma)$ are taken to be strictly zero for wavenumbers $\sigma$ outside the intervals specified in (4.67b) and (4.67c), then, because Eqs. (4.65c) and (4.65d) show $\vec{E}(\vec{\varepsilon}, z, \sigma)$ and $\vec{B}(\vec{\varepsilon}, z, \sigma)$ to be proportional to the Fourier transforms of $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$, functions $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ must be band-limited functions. Therefore, according to the just-mentioned theorem, we cannot say that functions $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ are both band-limited and time-chopped. Unfortunately, we have just said in the previous two paragraphs that we expect $\vec{E}(\vec{\rho}, z, t)$ and $\vec{B}(\vec{\rho}, z, t)$ to be just that: both band-limited and time-chopped. The loophole in this situation is that Fourier transforms
$$
G(f) = \int_{-\infty}^{\infty} e^{-2\pi i f t}\, g(t)\, dt
$$
can be negligibly small without becoming strictly zero, allowing us to create time-chopped functions g(t) whose Fourier transforms G(f) are only approximately zero when $|f| > F$ for some positive value of F. Hence it is possible for g(t) to be exactly time-chopped and approximately band-limited.64 Similarly, we are free to regard the angle-wavenumber transforms $\vec{E}$ and $\vec{B}$ as being negligibly small rather than strictly zero for values of $\sigma$ representing out-of-band radiation when $\vec{E}$ and $\vec{B}$ represent strictly time-chopped radiation fields. Hence it does make sense to treat the radiation fields as both time-chopped and approximately band-limited, taking the $\vec{E}(\vec{\rho}, z, t)$
63
Athanasios Papoulis, Signal Analysis, p. 188.
64
We can also create functions g(t) that are exactly band-limited and approximately time-chopped.
and $\vec{B}(\vec{\rho}, z, t)$ fields to be strictly zero for all times t outside the measurement interval and assuming the angle-wavenumber transforms $\vec{E}(\vec{\varepsilon}, z, \sigma)$ and $\vec{B}(\vec{\varepsilon}, z, \sigma)$ to be negligible or zero for all wavenumbers $\sigma$ lying outside the intervals specified in (4.67b) and (4.67c).
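The distinction between "negligible" and "strictly zero" can be made concrete with the simplest time-chopped function, a rectangular pulse equal to 1 for $|t| \le T$ and 0 otherwise. This example is an illustration chosen here, not one from the text. Its Fourier transform under the convention above is $G(f) = \sin(2\pi f T)/(\pi f)$, which is bounded by $1/(\pi |f|)$ but keeps oscillating at all frequencies.

```python
import math


def rect_transform(f, T):
    """Fourier transform G(f) of a pulse equal to 1 for |t| <= T, else 0,
    using G(f) = integral of exp(-2*pi*i*f*t) * g(t) dt."""
    if f == 0:
        return 2.0 * T
    return math.sin(2.0 * math.pi * f * T) / (math.pi * f)


T = 1.0
# Frequencies of the form (k + 1/4)/T sit at maxima of |sin(2*pi*f*T)|.
far_frequency = 1000.25
value = rect_transform(far_frequency, T)
bound = 1.0 / (math.pi * far_frequency)
print(value, bound)  # small compared with G(0) = 2T, yet not zero
```

However large F is chosen, frequencies of this form beyond it still give a nonzero transform, so the pulse is exactly time-chopped but only approximately band-limited.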
The same mathematical point, by the way, comes up when analyzing the relationship of beam-chopped and direction-chopped $\vec{E}$ and $\vec{B}$ radiation fields. For this reason we have been careful
in the previous section to say that beam-chopped radiation fields are negligible or zero, instead of
strictly zero, for positions outside the beam and that direction-chopped radiation fields have
angle-wavenumber transforms that are negligible or zero, instead of strictly zero, for those
propagation vectors removed from the beam. This allows the beam passing through the
interferometer to be both direction-chopped and beam-chopped without getting into mathematical
difficulties.
Top-Level Description of a Standard Michelson Interferometer ·4.11
the radiation’s angle of incidence and whether it consists of s-type or p-type plane waves. The
phase shift is strongly dependent on the angle of incidence and σ. It can also depend on whether
s-type or p-type plane waves are passing through the substrate material. Appendix 4E introduces
six complex parameters $\gamma_s^{(a)}, \gamma_p^{(a)}, \gamma_s^{(b)}, \gamma_p^{(b)}, \gamma_s^{(c)}, \gamma_p^{(c)}$ to describe the passage of radiation
through the two optical elements (the beam splitter substrate and the compensator plate) that
are made from the beam-splitter substrate material.
When the moving mirror in Fig. 4.16 is further from the beam splitter than the fixed mirror,
the rays in the moving-mirror arm travel a longer distance down and back than the rays in the
fixed-mirror arm. Just as in Eq. (1.15b) of Chapter 1, we call this extra distance the optical-path
difference (OPD) and represent it by the variable χ. There is, of course, a position of the moving
mirror for which the OPD is zero, shown by the dash-dot line in Fig. 4.16. When the moving
mirror is closer to the beam splitter than this dash-dot line, the OPD value χ is taken to be
negative. Just as in Sec. 1.4 of Chapter 1, the position of the dash-dot line is called the zero-path
difference (ZPD) position. When the OPD is χ for the interferometer setup shown in Fig. 4.16, the
moving mirror is a distance χ/2 from its ZPD position.
Section 1.7 of Chapter 1 shows that there are many different ways to build a Michelson
interferometer, and for some setups the moving mirror is not χ/2 from its ZPD position when the
OPD is χ [see, for example, Fig. 1.19(d)]. The interferometer signal in the ideal case does not,
however, depend directly on the interferometer setup but rather on the OPD value χ generated by
the setup. For this reason, it makes sense to unfold the interferometer as shown in Fig. 4.18. Now
we see only what is common to Michelson interferometers of all configurations: the distance
traveled along one path through the interferometer differs by χ from the distance traveled along
the other path through the interferometer.
mirror to be tilted slightly out of alignment, producing a very small angle $\theta_d$ between the two
FIGURE 4.16. [Diagram labels: input radiance; beam splitter; compensator plate; fixed mirror; moving mirror, shown a distance $\chi/2$ from the ZPD position; input axes $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ and output axes $(\hat{x}, \hat{y}, \hat{z})$.]
Monochromatic Plane Waves and Michelson Interferometers · 4.12
FIGURE 4.17. [Diagram labels: input radiance; beam splitter; compensator plate; fixed mirror; moving mirror; output beam to detector; input axes $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ and output axes $(\hat{x}, \hat{y}, \hat{z})$.]
FIGURE 4.18. [Unfolded interferometer. Diagram labels: fixed-mirror arm; moving-mirror arm; axes $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ and $(\hat{x}, \hat{y}, \hat{z})$.]
FIGURE 4.19. [Diagram labels: angle $\theta_b$; propagation vector $\hat{\Omega}$; angle $\theta_d$; propagation vector $\hat{\Omega}_d$; compensator plate; fixed mirror; axes $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ and $(\hat{x}, \hat{y}, \hat{z})$.]
Angles $\theta_b$ and $\theta_d$ are drawn much larger than they actually are. Note that propagation vector $\hat{\Omega}$,
propagation vector $\hat{\Omega}_d$, and the optical axis do not necessarily all lie in the same plane.
propagation vectors. Angle $\theta_d$ between $\hat{\Omega}$ and $\hat{\Omega}_d$ is greatly exaggerated in Fig. 4.19; the $\hat{\Omega}_d$
unit vector is drawn much shorter than the $\hat{\Omega}$ unit vector, using perspective to show there is no
reason to expect $\theta_d$ and $\theta_b$ to be co-planar angles.
Michelson interferometers are, of course, designed to keep $\theta_d$ small, and as a general rule
they do not work well unless $\theta_d$ is much less than the typical angle $\theta_b$ between the plane wave’s
propagation vector and the optical axis,

$$\theta_d \ll \theta_b . \tag{4.68}$$

As is pointed out at the end of Appendix 4E, angle $\theta_d$ is so small that we expect neither the
amplitude nor the phase shifts of monochromatic plane waves propagating through the beam
splitter substrate to be affected by it.
We note that when $\theta_b = \theta_d = 0$, the plane of incidence65 is the same for all reflections and
transmissions through the beam splitter in Figs. 4.16, 4.17, and 4.19. Both the $\hat{x}^{[i]}$ and $\hat{x}$ unit
vectors are normal to this plane of incidence; indeed, they are the same unit vector. If we unfold
the interferometer as shown in Fig. 4.18, the
coordinate systems are brought into alignment, with $(\hat{y}^{[i]}, \hat{y})$ and $(\hat{z}^{[i]}, \hat{z})$ also becoming the same
unit vectors. Now the only difference between the two coordinate systems is the location of their
origins, with the $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ system having its origin on the optical axis of the input beam
approaching the beam splitter and the $(\hat{x}, \hat{y}, \hat{z})$ system having its origin on the optical axis of the
output beam traveling from the beam splitter to the detector. This means the two coordinate
systems are essentially equivalent, allowing us to discard one and keep the other. For the rest of
this chapter, we work with the unfolded interferometer and use only the $(\hat{x}, \hat{y}, \hat{z})$ coordinate
system to represent the plane waves in the input beam, the fixed-mirror arm, the moving-mirror
arm, and the output beam traveling from the beam splitter to the detector.
When $\theta_b$ is not zero, the tunnel-diagram analysis performed in Figs. 4E.4(a) and 4E.4(b) of
Appendix 4E shows that vector $\hat{\Omega}^{[i]}$ must have the same angles with respect to $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ that
vector $\hat{\Omega}$ has with respect to $(\hat{x}, \hat{y}, \hat{z})$; in particular, angle $\theta_b$ is the same in both the input and
output coordinate systems. Vector $\hat{\Omega}_d$ and its associated angle $\theta_d$, on the other hand, are defined
in the output $(\hat{x}, \hat{y}, \hat{z})$ coordinate system after reflection off the slightly misaligned moving mirror
but not, of course, in the input $(\hat{x}^{[i]}, \hat{y}^{[i]}, \hat{z}^{[i]})$ coordinate system.

65
The plane of incidence of a reflected or transmitted monochromatic plane wave is defined in Sec. 4.5 above.
From the work done in Secs. 4.3 and 4.4, we know that the input plane wave can be written
using the real part of

$$\vec{E}_0\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} \tag{4.69a}$$

to represent the wave’s E field and the real part of

$$\frac{1}{c}\left(\hat{\Omega}\times\vec{E}_0\right) e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} \tag{4.69b}$$

to represent the wave’s B field when angle $\theta_b$ is small. These formulas come from dropping the
ℓ, j subscripts from Eqs. (4.16a) and (4.16b) and using (4.16c) to substitute for the B vector. In
(4.69a) and (4.69b), parameter $\vec{E}_0$ is a constant complex vector; and the convention of the
unfolded interferometer is used to replace the propagation vector $\hat{\Omega}^{[i]}$ by $\hat{\Omega}$ when describing the
input plane wave. The wavenumber σ is taken to be positive. According to Eq. (4.16c), the
complex $\vec{E}_0$ vector satisfies

$$\vec{E}_0\cdot\hat{\Omega} = 0 . \tag{4.69c}$$

The work done in Sec. 4.3 shows that this means the plane wave’s real E field is always
perpendicular to the direction of propagation $\hat{\Omega}$.
Since $\theta_b$ is small, we know that plane waves entering the interferometer must be propagating
parallel to, or nearly parallel to, the z axis. When, as in Fig. 4.19, $\hat{\Omega}$ is tilted at a nonzero small
angle $\theta_b$ to the z axis, it follows that the real E field must have a small component along the z
axis. According to Fig. 4.20, the real E-field component along the z axis must be on the order of
$\sin\theta_b$. Since $\theta_b$ is a small angle, we have
FIGURE 4.20. [Diagram labels: vector $\vec{E}$; unit vector $\hat{\Omega}$; unit vector $\hat{z}$; angle $\theta_b$ (marked twice).]
we note that the real E field of the monochromatic plane wave must be

$$\mathrm{Re}[\vec{E}_0\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)}] = \hat{x}\,\mathrm{Re}[E_{0x}\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)}] + \hat{y}\,\mathrm{Re}[E_{0y}\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)}] + \hat{z}\,\mathrm{Re}[E_{0z}\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)}] . \tag{4.71b}$$

Looking at the special point in space $\vec{r} = 0$ at time t = 0, we see that, according to Fig. 4.20,
The imaginary part of $E_{0z}\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)}$ has no physical relevance, so it can also be specified as
$O(\theta_b)$ at point $\vec{r} = 0$ when t = 0. This means the formula for $\vec{E}_0$ can be written as

$$\vec{E}_0 = \hat{x}E_{0x} + \hat{y}E_{0y} + \hat{z}\,[O(\theta_b) + iO(\theta_b)] . \tag{4.71d}$$

as a notational convenience to describe a complex scalar whose real and imaginary parts are both
$O(\theta_b)$. Then Eq. (4.71d) can be written as

$$\vec{E}_0 = \hat{x}E_{0x} + \hat{y}E_{0y} + \hat{z}\,\underline{O}(\theta_b) . \tag{4.71f}$$

The $\underline{O}(\theta_b)$ symbol, like the $O(\theta_b)$ symbol, is an algebraic “black hole” absorbing other finite
algebraic quantities. Some of the formal rules for manipulating $\underline{O}(\theta_b)$ are that
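$$\hat{\Omega} = \vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2} = \hat{x}\varepsilon_x + \hat{y}\varepsilon_y + \hat{z}\sqrt{1 - \varepsilon_x^2 - \varepsilon_y^2} .$$

Clearly, both $\varepsilon_x$ and $\varepsilon_y$ are $O(\sin\theta_b) = O(\theta_b)$ when $\hat{\Omega}$ is nearly parallel to the optical axis
(see Fig. 4.20), so

$$\hat{\Omega} = \hat{z} + \hat{x}\,O(\theta_b) + \hat{y}\,O(\theta_b) , \tag{4.73a}$$

where

$$\hat{z}\sqrt{1 - \varepsilon_x^2 - \varepsilon_y^2} \cong \hat{z}\left(1 - \frac{\varepsilon_x^2}{2} - \frac{\varepsilon_y^2}{2}\right) \cong \hat{z} ,$$

neglecting terms of $O(\theta_b^2)$. From Eqs. (4.71f) and (4.73a) we have, again neglecting terms of
$O(\theta_b^2)$, that

$$\hat{\Omega}\times\vec{E}_0 = [\hat{z} + \hat{x}\,O(\theta_b) + \hat{y}\,O(\theta_b)]\times[\hat{x}E_{0x} + \hat{y}E_{0y} + \hat{z}\,\underline{O}(\theta_b)] = \hat{y}E_{0x} - \hat{x}E_{0y} + \hat{z}\,\underline{O}(\theta_b) . \tag{4.73b}$$

We next introduce the symbol $\underline{\vec{O}}(\theta_b)$ to represent a small complex vector, each of whose
$(\hat{x}, \hat{y}, \hat{z})$ components are $\underline{O}(\theta_b)$. The symbol $\underline{\vec{O}}(\theta_b)$ is another algebraic black hole. We note that

$$a\,\underline{\vec{O}}(\theta_b) + b\,\underline{\vec{O}}(\theta_b) = \underline{\vec{O}}(\theta_b) \tag{4.74a}$$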
vector, each of whose real $(\hat{x}, \hat{y}, \hat{z})$ components are $O(\theta_b)$. Then for any two real scalars a and b,
we have

$$a\,\vec{O}(\theta_b) + b\,\vec{O}(\theta_b) = \vec{O}(\theta_b) , \tag{4.75a}$$

$$\vec{c}\cdot\vec{O}(\theta_b) = \underline{O}(\theta_b) , \tag{4.75c}$$

and

$$\vec{c}\times\vec{O}(\theta_b) = \underline{\vec{O}}(\theta_b) \tag{4.75d}$$

for the vector dot product and vector cross product with any finite complex vector $\vec{c}$. If $\vec{c}$ is a
finite real vector, we can, of course, drop the underscore on the right-hand sides of (4.75c) and
(4.75d) to show that the resulting small quantities must also be strictly real. The $\vec{O}(\theta_b)$ symbol
can be used to write $\hat{\Omega}$ in (4.73a) as

$$\hat{\Omega} = \hat{z} + \vec{O}(\theta_b) \tag{4.76a}$$

and, of course, the $\underline{\vec{O}}(\theta_b)$ symbol can be used to write the complex vectors in (4.71f) and (4.73b)
as

$$\vec{E}_0 = \hat{x}E_{0x} + \hat{y}E_{0y} + \underline{\vec{O}}(\theta_b) \tag{4.76b}$$

and

$$\hat{\Omega}\times\vec{E}_0 = \hat{y}E_{0x} - \hat{x}E_{0y} + \underline{\vec{O}}(\theta_b) . \tag{4.76c}$$
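As a quick numerical sanity check on the small-angle bookkeeping in (4.73b) and (4.76c), the sketch below builds a unit propagation vector tilted by a small angle from the z axis, constructs a complex amplitude vector perpendicular to it as (4.69c) requires, and compares the exact cross product with the leading-order expression. The function name, the azimuth, and the sample amplitudes are illustrative assumptions, not values from the text.

```python
import numpy as np

def cross_product_residual(theta_b, azimuth=0.7, E0x=1.0 + 0.5j, E0y=-0.3 + 2.0j):
    """Return (|Omega x E0 - (y E0x - x E0y)|, |E0|) for a tilt angle theta_b.

    azimuth, E0x and E0y are arbitrary illustrative choices.
    """
    # Unit propagation vector tilted by theta_b from the z axis.
    omega = np.array([np.sin(theta_b) * np.cos(azimuth),
                      np.sin(theta_b) * np.sin(azimuth),
                      np.cos(theta_b)])
    # Choose E0z so that E0 . Omega = 0, as required by Eq. (4.69c).
    E0z = -(E0x * omega[0] + E0y * omega[1]) / omega[2]
    E0 = np.array([E0x, E0y, E0z])
    exact = np.cross(omega, E0)
    leading_order = np.array([-E0y, E0x, 0.0])  # y*E0x - x*E0y
    return np.linalg.norm(exact - leading_order), np.linalg.norm(E0)

res_a, mag_a = cross_product_residual(1e-3)
res_b, mag_b = cross_product_residual(1e-4)
print(res_a, res_b)
```

Under these assumptions the residual shrinks roughly in proportion to the tilt angle, which is what the $\underline{\vec{O}}(\theta_b)$ term in (4.76c) expresses.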
Substituting Eqs. (4.76b) and (4.76c) into the expressions for the complex E and B fields in
(4.69a) and (4.69b) gives, when angle $\theta_b$ is small,

$$\text{Complex E field} = (\hat{x}E_{0x} + \hat{y}E_{0y})\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) \tag{4.77a}$$

and

$$\text{Complex B field} = \frac{1}{c}(\hat{y}E_{0x} - \hat{x}E_{0y})\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) , \tag{4.77b}$$

where (4.74b) is used to simplify the final results and $\hat{\Omega}$ is given by Eq. (4.76a). If the $\underline{\vec{O}}(\theta_b)$
terms in (4.77a) and (4.77b) and the $\vec{O}(\theta_b)$ terms in (4.76a) are all exactly zero, then the plane
wave’s propagation vector is strictly parallel to the $\hat{z}$ optical axis; when they are not, the plane
wave is propagating in a slightly off-axis direction. Looking at how the interferometer is unfolded
going from Fig. 4.17 to Fig. 4.18, we see that if all the $\vec{O}(\theta_b)$ and $\underline{\vec{O}}(\theta_b)$ terms are exactly zero,
then the $\hat{x}$ component of E is strictly perpendicular to the plane of incidence on the beam splitter
and the $\hat{y}$ component of E is strictly parallel to the plane of incidence on the beam splitter. For
now, we assume that all the $\vec{O}(\theta_b)$ and $\underline{\vec{O}}(\theta_b)$ terms are exactly zero and analyze just plane
waves that propagate parallel to the optical axis, that is, just the on-axis plane waves. From the
work done in Secs. 4.5 and 4.6, we can then predict that the on-axis monochromatic plane
wavefield transmitted through the beam splitter is

$$\text{Complex E field} = [\hat{x}E_{0x}\, t_s\gamma_s^{(a)} + \hat{y}E_{0y}\, t_p\gamma_p^{(a)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} \tag{4.77c}$$

and

$$\text{Complex B field} = \frac{1}{c}[\hat{y}E_{0x}\, t_s\gamma_s^{(a)} - \hat{x}E_{0y}\, t_p\gamma_p^{(a)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} , \tag{4.77d}$$

where $\gamma_s^{(a)}$ is the complex parameter introduced in Appendix 4E that describes the passage of
s-type monochromatic plane waves on their first pass through the beam-splitter substrate, and $\gamma_p^{(a)}$
is the complex parameter from Appendix 4E describing the passage of p-type monochromatic
plane waves on their first pass through the beam-splitter substrate. Both $\gamma_s^{(a)}$ and $\gamma_p^{(a)}$ are
functions of σ and the plane wave’s angle of incidence on the substrate. The plane wave reflected
off the beam splitter after passing into and out of the substrate is

$$\text{Complex E field} = [\hat{x}E_{0x}\, r_s\gamma_s^{(ab)} + \hat{y}E_{0y}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} \tag{4.77e}$$

and

$$\text{Complex B field} = \frac{1}{c}[\hat{y}E_{0x}\, r_s\gamma_s^{(ab)} - \hat{x}E_{0y}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} . \tag{4.77f}$$

Here, we define

$$\gamma_s^{(ab)} = \gamma_s^{(a)}\cdot\gamma_s^{(b)} \tag{4.77g}$$

and

$$\gamma_p^{(ab)} = \gamma_p^{(a)}\cdot\gamma_p^{(b)} , \tag{4.77h}$$

where $\gamma_s^{(b)}$ is the complex parameter introduced in Appendix 4E that describes the second pass of
s-type monochromatic plane waves through the beam-splitter substrate and $\gamma_p^{(b)}$ is the complex
parameter from Appendix 4E that describes the second pass of p-type monochromatic plane
waves through the substrate. Like $\gamma_s^{(a)}$ and $\gamma_p^{(a)}$, the $\gamma_{s,p}^{(b)}$ complex parameters are functions of σ
and the plane wave’s angle of incidence. The complex parameters $r_s$, $r_p$, $t_s$, $t_p$ describe what
happens in the thin beam-splitter layer in Fig. 4.16, where the partial transmission and partial
reflection of the radiation fields occur. Parameters $r_s$ and $t_s$ are the s-wave amplitude-reflection
and amplitude-transmission coefficients, and parameters $r_p$ and $t_p$ are the p-wave
amplitude-reflection and amplitude-transmission coefficients. Recognizing that the amount of reflection and
transmission can depend on both wavenumber and angle of incidence, we realize that these
coefficients must also be functions of σ and the angle of incidence on the substrate. For the
on-axis plane waves characterized by Eqs. (4.77c)–(4.77f), the angle of incidence on the
beam-splitter substrate must be the same as the angle of incidence φ made by the optical axis on the
beam splitter. The unfolded model of the interferometer in Fig. 4.18 lets us use the same symbol
$\hat{y}$ for both the original and reflected $\hat{y}$ unit vectors and also allows us to represent both the
transmitted and reflected propagation vectors by the same symbol $\hat{\Omega}$.
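Now we consider what happens to the slightly off-axis plane waves where $\theta_b$ is no longer
exactly zero, which means that $\hat{\Omega}$ is at a slight angle to the optical axis. In this situation Fig. 4.21
shows that the angle of incidence on the beam splitter changes by an $O(\theta_b)$ amount from the
optical axis’s angle of incidence φ. We want to show that the transmitted and reflected
wavefields can now be written as

$$\text{Complex E field} = [\hat{x}E_{0x}\, t_s\gamma_s^{(a)} + \hat{y}E_{0y}\, t_p\gamma_p^{(a)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) \tag{4.78a}$$

and

$$\text{Complex B field} = \frac{1}{c}[\hat{y}E_{0x}\, t_s\gamma_s^{(a)} - \hat{x}E_{0y}\, t_p\gamma_p^{(a)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) \tag{4.78b}$$

for the transmitted wave, and as

$$\text{Complex E field} = [\hat{x}E_{0x}\, r_s\gamma_s^{(ab)} + \hat{y}E_{0y}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) \tag{4.79a}$$

and

$$\text{Complex B field} = \frac{1}{c}[\hat{y}E_{0x}\, r_s\gamma_s^{(ab)} - \hat{x}E_{0y}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) \tag{4.79b}$$

for the reflected wave.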
FIGURE 4.21. [Diagram labels: angle $\theta_b$; unit vector $\hat{z}$; angle φ.]
The unit vector perpendicular to both $\hat{\Omega}$ and $\hat{s}$ is given by [see Fig. 4.22 and Eq. (4.76a)]

$$\hat{p} = \hat{\Omega}\times\hat{s} = [\hat{z} + \vec{O}(\theta_b)]\times[\hat{x} + \vec{O}(\theta_b)] .$$

This becomes, gathering together the $\vec{O}(\theta_b)$ terms and neglecting the $[\vec{O}(\theta_b)]^2$ terms,

$$\hat{p} = \hat{y} + \vec{O}(\theta_b) . \tag{4.80b}$$

We take components along $\hat{s}$ and $\hat{p}$ of the complex vector $\vec{E}_0$ used to describe the
incident plane wave in Eqs. (4.69a) and (4.69b) to get [since $\hat{s}$, $\hat{p}$, and $\hat{\Omega}$ are mutually
perpendicular unit vectors and the complex $\vec{E}_0$ vector is, according to (4.69c), strictly
perpendicular to $\hat{\Omega}$]

$$\vec{E}_0 = \hat{s}E_{0s} + \hat{p}E_{0p} . \tag{4.80c}$$

Here, $E_{0s}$ and $E_{0p}$ are two complex scalars representing the components of $\vec{E}_0$ along $\hat{s}$ and $\hat{p}$.
Substitution of (4.80a) and (4.80b) into (4.80c) gives

$$\vec{E}_0 = \hat{x}E_{0s} + \hat{y}E_{0p} + \underline{\vec{O}}(\theta_b) .$$

Comparing this with the earlier formula for $\vec{E}_0$, we must have

$$E_{0s} = E_{0x} + \underline{O}(\theta_b) \tag{4.80d}$$

and

$$E_{0p} = E_{0y} + \underline{O}(\theta_b) \tag{4.80e}$$

if the two formulas for $\vec{E}_0$ are to be consistent.
Using the relationships $\hat{\Omega}\times\hat{s} = \hat{p}$ and $\hat{\Omega}\times\hat{p} = -\hat{s}$ from Fig. 4.22, we substitute (4.80c) into
(4.69a) and (4.69b) to write the incident wave as

$$\text{Complex E field} = (\hat{s}E_{0s} + \hat{p}E_{0p})\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)}$$

and

$$\text{Complex B field} = \frac{1}{c}(\hat{p}E_{0s} - \hat{s}E_{0p})\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} .$$
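The construction of the tilted $(\hat{s}, \hat{p}, \hat{\Omega})$ triad can be sketched numerically. The example below is a minimal illustration under stated assumptions: the beam-splitter normal `n_hat` is taken to lie in the y-z plane at angle `phi` from the z axis (so that the on-axis plane of incidence is normal to the x axis), and the function name `sp_basis` and the numerical values are hypothetical. The vector $\hat{s}$ is taken perpendicular to the plane containing $\hat{\Omega}$ and the normal, and $\hat{p} = \hat{\Omega}\times\hat{s}$.

```python
import numpy as np

def sp_basis(omega, n_hat):
    """Unit vectors perpendicular (s) and parallel (p) to the plane of incidence.

    The plane of incidence is the plane containing omega and n_hat.
    """
    s = np.cross(n_hat, omega)
    s = s / np.linalg.norm(s)
    p = np.cross(omega, s)  # p = Omega x s
    return s, p

# Illustrative values (assumptions, not taken from the text).
theta_b, azimuth, phi = 1e-3, 0.4, np.pi / 4
omega = np.array([np.sin(theta_b) * np.cos(azimuth),
                  np.sin(theta_b) * np.sin(azimuth),
                  np.cos(theta_b)])
n_hat = np.array([0.0, np.sin(phi), np.cos(phi)])  # assumed beam-splitter normal

s_hat, p_hat = sp_basis(omega, n_hat)
dev_s = np.linalg.norm(s_hat - np.array([1.0, 0.0, 0.0]))
dev_p = np.linalg.norm(p_hat - np.array([0.0, 1.0, 0.0]))
print(dev_s, dev_p)
```

With these assumptions both deviations are of the same order as `theta_b`, in line with $\hat{p} = \hat{y} + \vec{O}(\theta_b)$ in (4.80b), and the three vectors stay mutually perpendicular.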
FIGURE 4.22. [Diagram labels: unit vectors $\hat{s}$, $\hat{p}$, $\hat{x}$, $\hat{y}$, $\hat{z}$, and $\hat{\Omega}$; angle $\theta_b$; two angles marked as $O(\theta_b)$; the new, slightly tilted plane of incidence containing the $\hat{\Omega}$ and $\hat{n}$ vectors.]
In effect, the original $(\hat{x}, \hat{y}, \hat{z})$ coordinate system is replaced by the slightly tilted $(\hat{s}, \hat{p}, \hat{\Omega})$
coordinate system with $E_{0s}$ and $E_{0p}$ playing the role of $E_{0x}$ and $E_{0y}$. Thus it has now been
shown that we can make the $\underline{\vec{O}}(\theta_b)$ terms in (4.77a) and (4.77b) equal to zero by replacing
$[(\hat{x}, \hat{y}), E_{0x}, E_{0y}]$ with $[(\hat{s}, \hat{p}), E_{0s}, E_{0p}]$ respectively. Previously $\hat{x}$ and $\hat{y}$ represented unit
vectors perpendicular and parallel to the plane of incidence, and now $\hat{s}$ and $\hat{p}$ represent unit
vectors perpendicular and parallel to the plane of incidence. Following the pattern established in
going from Eqs. (4.77a) and (4.77b) to Eqs. (4.77c)–(4.77f), we see that the wave transmitted
through the beam splitter must be

$$\text{Complex E field} = [\hat{s}E_{0s}\, t_s\gamma_s^{(a)} + \hat{p}E_{0p}\, t_p\gamma_p^{(a)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} \tag{4.80f}$$

and

$$\text{Complex B field} = \frac{1}{c}[\hat{p}E_{0s}\, t_s\gamma_s^{(a)} - \hat{s}E_{0p}\, t_p\gamma_p^{(a)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} , \tag{4.80g}$$

and the wave reflected off the beam splitter after passing into and out of the substrate must be

$$\text{Complex E field} = [\hat{s}E_{0s}\, r_s\gamma_s^{(ab)} + \hat{p}E_{0p}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} \tag{4.80h}$$

and

$$\text{Complex B field} = \frac{1}{c}[\hat{p}E_{0s}\, r_s\gamma_s^{(ab)} - \hat{s}E_{0p}\, r_p\gamma_p^{(ab)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} . \tag{4.80i}$$

The $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, $\gamma_{s,p}^{(ab)}$ parameters are the same functions of σ and the angle of incidence as in Eqs.
(4.77c)–(4.77f); and the $r_{s,p}$ and $t_{s,p}$ parameters are also the same functions as they were in
(4.77c)–(4.77f). We note that even if the wavenumber σ has the same value as in Eqs. (4.77c)–
(4.77f), the work done in Appendix 4E shows that the values of $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, and $\gamma_{s,p}^{(ab)}$ are different
because these complex-valued functions are very sensitive to the slight changes in the angle of
incidence produced by nonzero values of $\theta_b$. The values of $r_{s,p}$ and $t_{s,p}$ do not, however, usually
depend as sensitively on the angle of incidence. As long as $\theta_b$ is small, we can treat $r_{s,p}$ and $t_{s,p}$ as
complex functions that depend only on the wavenumber σ.
Substituting Eqs. (4.80a), (4.80b), (4.80d), and (4.80e) into Eqs. (4.80f)–(4.80i) and gathering
together the $\underline{\vec{O}}(\theta_b)$ terms while neglecting the $O(\theta_b^2)$ terms gives us, as expected, Eqs. (4.78a),
(4.78b), (4.79a), and (4.79b) for the beam splitter’s transmitted and reflected waves. This
establishes that (4.78a), (4.78b), (4.79a), and (4.79b) can be used to represent monochromatic
plane waves propagating through the interferometer in a slightly off-axis direction. From now on,
we use (4.78a), (4.78b), (4.79a), and (4.79b) to represent both the on-axis and off-axis
monochromatic plane waves with the understanding, of course, that both $\theta_b$ and all the order $\theta_b$
terms are strictly zero for on-axis propagation.
The plane wave transmitted through the beam splitter into the fixed-mirror arm of the
interferometer reflects off the fixed mirror and returns to the beam splitter. There is no way to
distinguish between s-wave and p-wave reflections when $\hat{\Omega}$ is exactly parallel to the z axis, so
we use the single amplitude-reflection coefficient $r_{FM}$ to describe normal reflection off the fixed
mirror. When $\hat{\Omega}$ is not exactly parallel to the $\hat{z}$ axis, which means the reflection off the fixed
mirror is only nearly normal and not strictly normal, we can distinguish between s-wave and
p-wave reflections; but there is no real point to it because both the s-wave and p-wave
amplitude-reflection coefficients are approximately equal to $r_{FM}$. When $\hat{\Omega}$ is allowed to be approximately
parallel to $\hat{z}$, the radiation fields of the plane wave after reflection off the fixed mirror are

$$\text{Complex E field} = r_{FM}\,[\hat{x}E_{0x}\, t_s\gamma_s^{(abc)} + \hat{y}E_{0y}\, t_p\gamma_p^{(abc)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) \tag{4.81a}$$

and

$$\text{Complex B field} = \frac{r_{FM}}{c}\,[\hat{y}E_{0x}\, t_s\gamma_s^{(abc)} - \hat{x}E_{0y}\, t_p\gamma_p^{(abc)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) . \tag{4.81b}$$

Here,

$$\gamma_s^{(abc)} = \gamma_s^{(ab)}\cdot\gamma_s^{(c)} = \gamma_s^{(a)}\cdot\gamma_s^{(b)}\cdot\gamma_s^{(c)} \tag{4.81c}$$

and

$$\gamma_p^{(abc)} = \gamma_p^{(ab)}\cdot\gamma_p^{(c)} = \gamma_p^{(a)}\cdot\gamma_p^{(b)}\cdot\gamma_p^{(c)} , \tag{4.81d}$$

where $\gamma_{s,p}^{(c)}$ are the complex parameters introduced in Appendix 4E to describe the third pass
through the beam-splitter substrate and the second pass through the compensator plate of the
s-type and p-type waves respectively. We note that $\gamma_{s,p}^{(b)}$ can, according to Eq. (4E.7b) in Appendix
4E, describe the first passage of a plane wave through the compensator plate as well as the second
passage through the beam-splitter substrate. Like $\gamma_{s,p}^{(a)}$ and $\gamma_{s,p}^{(b)}$, the $\gamma_{s,p}^{(c)}$ are functions of
wavenumber σ and the angle of incidence. In Eqs. (4.81a) and (4.81b), the factors of $\gamma_{s,p}^{(abc)}$ show
that the plane wave passes once through the beam-splitter substrate and twice through the
compensator plate, and the $\underline{\vec{O}}(\theta_b)$ symbol again represents complex vector components that are
too small to be worth keeping track of explicitly. Just like before, these equations reduce to the
case where $\hat{\Omega}$ is exactly parallel to $\hat{z}$ when all the $\underline{\vec{O}}(\theta_b)$ terms are taken to be exactly equal to
zero.
The plane wave reflected off the beam splitter and into the interferometer’s moving-mirror
arm reflects off the moving mirror and returns to the beam splitter. Because it reflects normally or
near normally, we can write, following the pattern of Eqs. (4.81a) and (4.81b) and assuming the
moving mirror is at its ZPD position,

$$\text{Complex E field} = r_{MM}\,[\hat{x}E_{0x}\, r_s\gamma_s^{(abc)} + \hat{y}E_{0y}\, r_p\gamma_p^{(abc)}]\, e^{2\pi i \sigma(\hat{\Omega}_d\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) \tag{4.82a}$$

and

$$\text{Complex B field} = \frac{r_{MM}}{c}\,[\hat{y}E_{0x}\, r_s\gamma_s^{(abc)} - \hat{x}E_{0y}\, r_p\gamma_p^{(abc)}]\, e^{2\pi i \sigma(\hat{\Omega}_d\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) , \tag{4.82b}$$

where $r_{MM}$ is the complex amplitude-reflection coefficient for plane waves normally incident on
the moving mirror, and $\hat{\Omega}$ is replaced by $\hat{\Omega}_d$ because of the slightly tilted moving mirror. The
factors of $\gamma_{s,p}^{(abc)}$ now represent three passages through the beam-splitter substrate. At the end of
Appendix 4E, there is a discussion about why it makes sense to neglect the very slight change in
the angle of incidence due to the tilted moving mirror. Most interferometers use identical
reflective surfaces for the fixed and moving mirrors, so from now on we assume that
$$\text{Complex E field} = W r_M\,[\hat{x}E_{0x}\, t_s r_s\gamma_s^{(abc)} + \hat{y}E_{0y}\, t_p r_p\gamma_p^{(abc)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) \tag{4.84a}$$

and

$$\text{Complex B field} = \frac{W r_M}{c}\,[\hat{y}E_{0x}\, t_s r_s\gamma_s^{(abc)} - \hat{x}E_{0y}\, t_p r_p\gamma_p^{(abc)}]\, e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) , \tag{4.84b}$$

where Eq. (4.83) is used to replace $r_{FM}$ by $r_M$ in the expressions for the complex E and B fields.
There is no difficulty passing the plane wave coming from the moving-mirror arm through the
beam-splitter film because it transmits the same way the original plane wave transmitted through
to the fixed-mirror arm. This means we can use the same $t_{s,p}$ complex parameters to describe the
change in the plane wave in Eqs. (4.82a) and (4.82b). Now, however, we also want to allow for
the possibility that the moving mirror is no longer at ZPD. In Eqs. (4.84a) and (4.84b) the
complex exponential

$$e^{2\pi i \sigma(\hat{\Omega}\cdot\vec{r} - ct)}$$

is always the correct phase term for the plane wave traveling toward the detector after passing out
and back along the fixed-mirror arm. When the moving mirror is no longer at ZPD, the correct phase
term for the plane wave passing out and back along the moving-mirror arm is, in Eqs. (4.82a) and (4.82b),

$$e^{2\pi i \sigma[\hat{\Omega}_d\cdot(\vec{r} + \chi\hat{z}) - ct]}$$

with $\vec{r} \to \vec{r} + \chi\hat{z}$ to account for the moving-mirror arm’s OPD (that is, to account for the extra
distance χ traveled when the moving mirror is not at its ZPD position).66 Therefore, we now write
the E and B fields of the plane wave traveling toward the detector after transmitting through the
beam splitter from the moving-mirror arm as [just put $t_{s,p}$ into (4.82a,b) and use (4.83)]

$$\text{Complex E field} = r_M\,[\hat{x}E_{0x}\, t_s r_s\gamma_s^{(abc)} + \hat{y}E_{0y}\, t_p r_p\gamma_p^{(abc)}]\, e^{2\pi i \sigma[\hat{\Omega}_d\cdot(\vec{r} + \chi\hat{z}) - ct]} + \underline{\vec{O}}(\theta_b) \tag{4.85a}$$

and

$$\text{Complex B field} = \frac{r_M}{c}\,[\hat{y}E_{0x}\, t_s r_s\gamma_s^{(abc)} - \hat{x}E_{0y}\, t_p r_p\gamma_p^{(abc)}]\, e^{2\pi i \sigma[\hat{\Omega}_d\cdot(\vec{r} + \chi\hat{z}) - ct]} + \underline{\vec{O}}(\theta_b) . \tag{4.85b}$$

66
The OPD is defined in Sec. 4.11 and first used in Eq. (1.15b) of Chapter 1. The ZPD is defined at the beginning of
Sec. 1.4 of Chapter 1.
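In Sec. 4.11 we decided that the recombined radiation on the far side of the beam splitter
would be called the balanced radiation field. Having traced a monochromatic plane wave through
the interferometer, we can now represent its balanced E and B fields by adding together the
formulas in (4.84a), (4.84b), (4.85a), and (4.85b),

$$\begin{aligned}
\text{Complex balanced E field} ={}& \hat{x}\, r_M E_{0x}\, t_s r_s\gamma_s^{(abc)}\, e^{2\pi i \sigma[\hat{\Omega}\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma\chi(\hat{\Omega}_d\cdot\hat{z})}\, e^{2\pi i \sigma(\hat{\Omega}_d - \hat{\Omega})\cdot\vec{r}}\right) \\
&+ \hat{y}\, r_M E_{0y}\, t_p r_p\gamma_p^{(abc)}\, e^{2\pi i \sigma[\hat{\Omega}\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma\chi(\hat{\Omega}_d\cdot\hat{z})}\, e^{2\pi i \sigma(\hat{\Omega}_d - \hat{\Omega})\cdot\vec{r}}\right) \\
&+ \underline{\vec{O}}(\theta_b)
\end{aligned} \tag{4.86a}$$

and

$$\begin{aligned}
\text{Complex balanced B field} ={}& \hat{y}\,\frac{r_M}{c} E_{0x}\, t_s r_s\gamma_s^{(abc)}\, e^{2\pi i \sigma[\hat{\Omega}\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma\chi(\hat{\Omega}_d\cdot\hat{z})}\, e^{2\pi i \sigma(\hat{\Omega}_d - \hat{\Omega})\cdot\vec{r}}\right) \\
&- \hat{x}\,\frac{r_M}{c} E_{0y}\, t_p r_p\gamma_p^{(abc)}\, e^{2\pi i \sigma[\hat{\Omega}\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma\chi(\hat{\Omega}_d\cdot\hat{z})}\, e^{2\pi i \sigma(\hat{\Omega}_d - \hat{\Omega})\cdot\vec{r}}\right) \\
&+ \underline{\vec{O}}(\theta_b) .
\end{aligned} \tag{4.86b}$$

According to inequality (4.68), angle $\theta_b$ is much greater than $\theta_d$; and because the input beam is
direction-chopped, we know that $\theta_b$ is itself a small quantity. When the typical values of $\theta_b$ and
$\theta_d$ for standard Michelson interferometers are plugged into the phase terms of (4.86a) and (4.86b),
it can be shown that [see Eqs. (4B.5d) and (4B.10d) from Appendix 4B]

$$e^{2\pi i \sigma\chi(\hat{\Omega}_d\cdot\hat{z})} \cong e^{2\pi i \sigma\chi(\hat{\Omega}\cdot\hat{z})} \tag{4.87a}$$

and

$$e^{2\pi i \sigma(\hat{\Omega}_d - \hat{\Omega})\cdot\vec{r}} \cong e^{4\pi i \sigma(\hat{n}_M - \hat{z})\cdot\vec{r}} . \tag{4.87b}$$

Here $\hat{n}_M$ is the dimensionless unit normal vector to the moving mirror’s surface and, following
the convention of the unfolded interferometer, $\hat{z}$ points from the moving mirror to the beam
splitter. When the moving mirror is perfectly aligned, $\hat{n}_M = \hat{z}$. Substitution of these two
approximations into the formulas for the complex balanced E and B fields gives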
and

$$\begin{aligned}
\text{Complex input B field} &= \sum_{\ell}\sum_{j}\left(\frac{1}{c}\,\hat{\Omega}_j\times\vec{E}_{\ell j}\right) e^{2\pi i \sigma_\ell(\hat{\Omega}_j\cdot\vec{r} - ct)} \\
&= \sum_{\ell}\sum_{j}\frac{1}{c}\,(\hat{y}E_{\ell jx} - \hat{x}E_{\ell jy})\, e^{2\pi i \sigma_\ell(\hat{\Omega}_j\cdot\vec{r} - ct)} + \underline{\vec{O}}(\theta_b) ,
\end{aligned} \tag{4.89b}$$

where Eqs. (4.77a) and (4.77b) are used to write the sums over ℓ and j in terms of the x and y
components of $\vec{E}_{\ell j}$. Equations (4.89a) and (4.89b) apply to a collection of plane waves with $\hat{\Omega}_j$
propagation vectors parallel to, or nearly parallel to, the optical axis. Having passed through the
Multiple Plane Waves and Michelson Interferometers · 4.13
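interferometer, each plane wave takes on the form given in Eqs. (4.88a) and (4.88b), so that the
total balanced radiation field traveling to the detector becomes

$$\begin{aligned}
\text{Complex balanced E field} = \sum_{\ell}\sum_{j}\Big\{ & \hat{x}\, r_{M\ell}\,\gamma_{sj\ell}^{(abc)} E_{\ell jx}\, t_{s\ell} r_{s\ell}\, e^{2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right) \\
&+ \hat{y}\, r_{M\ell}\,\gamma_{pj\ell}^{(abc)} E_{\ell jy}\, t_{p\ell} r_{p\ell}\, e^{2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right)\Big\} \\
&+ \underline{\vec{O}}(\theta_b)
\end{aligned} \tag{4.89c}$$

and

$$\begin{aligned}
\text{Complex balanced B field} = \sum_{\ell}\sum_{j}\Big\{ & \hat{y}\,\frac{r_{M\ell}\,\gamma_{sj\ell}^{(abc)}}{c} E_{\ell jx}\, t_{s\ell} r_{s\ell}\, e^{2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right) \\
&- \hat{x}\,\frac{r_{M\ell}\,\gamma_{pj\ell}^{(abc)}}{c} E_{\ell jy}\, t_{p\ell} r_{p\ell}\, e^{2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right)\Big\} \\
&+ \underline{\vec{O}}(\theta_b) .
\end{aligned} \tag{4.89d}$$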
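The factor in parentheses that multiplies every term of the balanced field carries all of the dependence on the OPD and on the mirror tilt. The sketch below evaluates that factor for a single plane wave. It is only an illustration: the function name `balance_factor` is hypothetical, the numerical values are arbitrary, and W is set to 1 purely as an assumption because its definition is not part of this excerpt.

```python
import numpy as np

def balance_factor(sigma, chi, omega, n_mirror, r, W=1.0):
    """Evaluate W + exp(2*pi*i*sigma*chi*(Omega.z)) * exp(4*pi*i*sigma*(n_M - z).r).

    W = 1.0 is an illustrative assumption, not a value taken from the text.
    """
    z_hat = np.array([0.0, 0.0, 1.0])
    opd_phase = np.exp(2j * np.pi * sigma * chi * np.dot(omega, z_hat))
    tilt_phase = np.exp(4j * np.pi * sigma * np.dot(n_mirror - z_hat, r))
    return W + opd_phase * tilt_phase

sigma = 1000.0                       # wavenumber in cm^-1 (illustrative)
z_hat = np.array([0.0, 0.0, 1.0])
origin = np.zeros(3)

# On-axis wave, perfectly aligned mirror (n_M = z), evaluated at r = 0.
at_zpd = balance_factor(sigma, 0.0, z_hat, z_hat, origin)
at_half_wave = balance_factor(sigma, 1.0 / (2.0 * sigma), z_hat, z_hat, origin)
print(abs(at_zpd) ** 2, abs(at_half_wave) ** 2)
```

With W = 1 and perfect alignment, the squared magnitude reduces to 2 + 2cos(2πσχ), so the two sample points correspond to constructive interference at ZPD and destructive interference at an OPD of half a wavelength.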
Note that all the parameters depending on σ acquire ℓ subscripts; all the parameters depending on
the angle of incidence acquire j subscripts; and all the parameters with ℓ and j subscripts depend
on both. Specifically, we define

$$\gamma_{sj\ell}^{(abc)} = \gamma_s^{(abc)} \ \text{at the } \sigma = \sigma_\ell \text{ wavenumber and at the angles of incidence corresponding to a monochromatic plane wave with an } \hat{\Omega} = \hat{\Omega}_j \text{ propagation vector} \tag{4.89e}$$

and

$$\gamma_{pj\ell}^{(abc)} = \gamma_p^{(abc)} \ \text{at the } \sigma = \sigma_\ell \text{ wavenumber and at the angles of incidence corresponding to a monochromatic plane wave with an } \hat{\Omega} = \hat{\Omega}_j \text{ propagation vector.} \tag{4.89f}$$

Similarly, we define

$$r_{s\ell} = r_s \ \text{at } \sigma = \sigma_\ell , \tag{4.89g}$$

$$r_{p\ell} = r_p \ \text{at } \sigma = \sigma_\ell , \tag{4.89h}$$

$$t_{s\ell} = t_s \ \text{at } \sigma = \sigma_\ell , \tag{4.89i}$$

$$t_{p\ell} = t_p \ \text{at } \sigma = \sigma_\ell , \tag{4.89j}$$

and

$$r_{M\ell} = r_M \ \text{at } \sigma = \sigma_\ell . \tag{4.89k}$$
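Following the procedure shown in Eqs. (4.44a) and (4.44b) above, the true radiation fields can
be written as the real part of the above formulas, giving

$$\begin{aligned}
\text{Real balanced E field} = \sum_{j}\Big\{ & \frac{1}{2}\sum_{\ell}\Big[\hat{x}\, r_{M\ell}\,\gamma_{sj\ell}^{(abc)} E_{\ell jx}\, t_{s\ell} r_{s\ell}\, e^{2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right) \\
&\qquad + \hat{y}\, r_{M\ell}\,\gamma_{pj\ell}^{(abc)} E_{\ell jy}\, t_{p\ell} r_{p\ell}\, e^{2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right)\Big] \\
&+ \frac{1}{2}\sum_{\ell}\Big[\hat{x}\, r_{M\ell}^{*}\,\gamma_{sj\ell}^{(abc)*} E_{\ell jx}^{*}\, t_{s\ell}^{*} r_{s\ell}^{*}\, e^{-2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{-2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{-4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right) \\
&\qquad + \hat{y}\, r_{M\ell}^{*}\,\gamma_{pj\ell}^{(abc)*} E_{\ell jy}^{*}\, t_{p\ell}^{*} r_{p\ell}^{*}\, e^{-2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{-2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{-4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right)\Big]\Big\} \\
&+ \vec{O}(\theta_b)
\end{aligned} \tag{4.90a}$$

and

$$\begin{aligned}
\text{Real balanced B field} = \sum_{j}\Big\{ & \frac{1}{2}\sum_{\ell}\Big[\hat{y}\,\frac{r_{M\ell}\,\gamma_{sj\ell}^{(abc)}}{c} E_{\ell jx}\, t_{s\ell} r_{s\ell}\, e^{2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right) \\
&\qquad - \hat{x}\,\frac{r_{M\ell}\,\gamma_{pj\ell}^{(abc)}}{c} E_{\ell jy}\, t_{p\ell} r_{p\ell}\, e^{2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right)\Big] \\
&+ \frac{1}{2}\sum_{\ell}\Big[\hat{y}\,\frac{r_{M\ell}^{*}\,\gamma_{sj\ell}^{(abc)*}}{c} E_{\ell jx}^{*}\, t_{s\ell}^{*} r_{s\ell}^{*}\, e^{-2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{-2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{-4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right) \\
&\qquad - \hat{x}\,\frac{r_{M\ell}^{*}\,\gamma_{pj\ell}^{(abc)*}}{c} E_{\ell jy}^{*}\, t_{p\ell}^{*} r_{p\ell}^{*}\, e^{-2\pi i \sigma_\ell[\hat{\Omega}_j\cdot\vec{r} - ct]}\left(W + e^{-2\pi i \sigma_\ell\chi(\hat{\Omega}_j\cdot\hat{z})}\, e^{-4\pi i \sigma_\ell(\hat{n}_M - \hat{z})\cdot\vec{r}}\right)\Big]\Big\} \\
&+ \vec{O}(\theta_b) ,
\end{aligned} \tag{4.90b}$$

where the underscore has been removed from the $\vec{O}(\theta_b)$ symbol to show that only the real part of
this small uncertainty is retained. We define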
$$\Delta\sigma_\ell\, E_{jx}(\sigma) = \frac{1}{2}E_{\ell jx} \quad \text{for } \sigma = \sigma_\ell > 0 , \tag{4.91a}$$

$$\Delta\sigma_\ell\, E_{jy}(\sigma) = \frac{1}{2}E_{\ell jy} \quad \text{for } \sigma = \sigma_\ell > 0 , \tag{4.91b}$$

$$\Delta\sigma_\ell\, E_{jx}(\sigma) = \frac{1}{2}E_{\ell jx}^{*} \quad \text{for } \sigma = -\sigma_\ell < 0 , \tag{4.91c}$$

and

$$\Delta\sigma_\ell\, E_{jy}(\sigma) = \frac{1}{2}E_{\ell jy}^{*} \quad \text{for } \sigma = -\sigma_\ell < 0 , \tag{4.91d}$$

with

$$\Delta\sigma_\ell = \sigma_{\ell+1} - \sigma_\ell .$$

We set up new versions of the r, t, and γ parameters by defining the complex functions
$r, r_s, r_p, t_s, t_p, \gamma_{sj}^{(abc)}, \gamma_{pj}^{(abc)}$ to be
Now $r, r_s, r_p, t_s, t_p, \gamma_{sj}^{(abc)}, \gamma_{pj}^{(abc)}$ are Hermitian functions of σ [see remark following Eq. (2.34a) in
Chapter 2]. The definitions of $E_{jx}(\sigma)$ and $E_{jy}(\sigma)$ in Eqs. (4.91a)–(4.91d) require that

$$E_{jx}(-\sigma) = E_{jx}(\sigma)^{*} \tag{4.92g}$$

and

$$E_{jy}(-\sigma) = E_{jy}(\sigma)^{*} , \tag{4.92h}$$

showing that they are also Hermitian functions. Just as in Eqs. (4.46a) and (4.46b), the sums over
ℓ in (4.90a) and (4.90b) can be converted to integrals over σ to get
The limits of integration are put at −∞ and +∞ by defining $E_{jx}(\sigma)$ and $E_{jy}(\sigma)$ to be zero for σ
values that do not correspond to allowed index values in the sums over ℓ. In particular, we
expect the integrals to converge because $E_{jx,y}(\sigma)$ are negligible or zero for values of σ
corresponding to radiation wavelengths not measured by the interferometer’s detector [see
discussion after Eq. (4.66b) above].
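The role of the Hermitian extension in (4.91a)–(4.91d) can be illustrated with a small numerical sketch. The example below, whose sample wavenumbers, amplitudes, and variable names are all illustrative assumptions, compares the real part of a one-sided sum over positive wavenumbers with a two-sided sum in which each amplitude is halved at $+\sigma_\ell$ and its complex conjugate is halved at $-\sigma_\ell$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative positive wavenumbers sigma_l and complex amplitudes E_l.
sigma_l = np.array([1000.0, 1200.0, 1500.0])
E_l = rng.normal(size=3) + 1j * rng.normal(size=3)

# Scalar stand-in for the phase argument (Omega . r - c t).
x = np.linspace(0.0, 0.01, 50)

# One-sided form: real part of the sum over positive wavenumbers.
one_sided = np.real(
    np.sum(E_l[:, None] * np.exp(2j * np.pi * sigma_l[:, None] * x), axis=0)
)

# Two-sided Hermitian form: weight E_l / 2 at +sigma_l and conj(E_l) / 2 at -sigma_l.
sigma_two = np.concatenate([sigma_l, -sigma_l])
weights = np.concatenate([E_l / 2, np.conj(E_l) / 2])
two_sided = np.sum(
    weights[:, None] * np.exp(2j * np.pi * sigma_two[:, None] * x), axis=0
)

print(np.max(np.abs(two_sided.imag)), np.max(np.abs(two_sided.real - one_sided)))
```

The two-sided sum is real to within rounding error and matches the one-sided real part, which is why the Hermitian functions let the real fields be written as a single integral over all σ.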
Following the procedure already explained in Sec. 4.8, we replace the sum over j with a
double integral over d 2ε . The first step is to convert the sum over j into a double sum over
indices m, n as in Eqs. (4.51a) and (4.51b) above,
- 420 -
Multiple Plane Waves and Michelson Interferometers · 4.13
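\[
\begin{aligned}
\vec{E}^{(\mathrm{bal})}(\vec r, t) = \int_{-\infty}^{\infty} \sum_{n,m} \Big[\, & \hat x\, r(\sigma)\,\gamma_{snm}^{(abc)}(\sigma)\,E_{nmx}(\sigma)\,t_s(\sigma)\,r_s(\sigma)\, e^{2\pi i\sigma[\hat\Omega_{nm}\bullet\vec r - ct]} \big(W + e^{2\pi i\sigma\chi(\hat\Omega_{nm}\bullet\hat z)}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\big) \\
 + \, & \hat y\, r(\sigma)\,\gamma_{pnm}^{(abc)}(\sigma)\,E_{nmy}(\sigma)\,t_p(\sigma)\,r_p(\sigma)\, e^{2\pi i\sigma[\hat\Omega_{nm}\bullet\vec r - ct]} \big(W + e^{2\pi i\sigma\chi(\hat\Omega_{nm}\bullet\hat z)}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\big) \Big]\, d\sigma \\
 & + O(\theta_b)
\end{aligned}
\tag{4.94a}
\]
and
\[
\begin{aligned}
\vec{B}^{(\mathrm{bal})}(\vec r, t) = \frac{1}{c}\int_{-\infty}^{\infty} \sum_{n,m} \Big[\, & \hat y\, r(\sigma)\,\gamma_{snm}^{(abc)}(\sigma)\,E_{nmx}(\sigma)\,t_s(\sigma)\,r_s(\sigma)\, e^{2\pi i\sigma[\hat\Omega_{nm}\bullet\vec r - ct]} \big(W + e^{2\pi i\sigma\chi(\hat\Omega_{nm}\bullet\hat z)}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\big) \\
 - \, & \hat x\, r(\sigma)\,\gamma_{pnm}^{(abc)}(\sigma)\,E_{nmy}(\sigma)\,t_p(\sigma)\,r_p(\sigma)\, e^{2\pi i\sigma[\hat\Omega_{nm}\bullet\vec r - ct]} \big(W + e^{2\pi i\sigma\chi(\hat\Omega_{nm}\bullet\hat z)}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\big) \Big]\, d\sigma \\
 & + O(\theta_b),
\end{aligned}
\tag{4.94b}
\]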
where we define $E_{nmx}(\sigma) = E_{nmy}(\sigma) = 0$ for those $m$ and $n$ values that do not correspond to $j$
values in the original sums, the sums over propagation directions in Eqs. (4.93a) and (4.93b). As
in Sec. 4.8, the $\hat\Omega_{nm}$ propagation vectors can be written as [see Eq. (4.51c)]
\[ \hat\Omega_{nm} = \hat x\,\varepsilon_{nx} + \hat y\,\varepsilon_{my} + \hat z\sqrt{1 - \varepsilon_{nx}^2 - \varepsilon_{my}^2} . \]
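Unlike the situation at the beginning of Sec. 4.8, parameters $\varepsilon_{nx}$ and $\varepsilon_{my}$ are always very small
compared to one because all the $j$ values in the original sum correspond to propagation vectors
that are parallel to, or nearly parallel to, $\hat z$; that is,
\[ |\varepsilon_{nx}| \ll 1 \tag{4.95a} \]
and
\[ |\varepsilon_{my}| \ll 1 . \tag{4.95b} \]
\[ \Delta\varepsilon_{nx} \cdot \Delta\varepsilon_{my} \cdot e_x(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = E_{nmx}(\sigma) \tag{4.96a} \]
and
\[ \Delta\varepsilon_{nx} \cdot \Delta\varepsilon_{my} \cdot e_y(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = E_{nmy}(\sigma) \tag{4.96b} \]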
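with
\[ \Delta\varepsilon_{nx} = \varepsilon_{n+1,x} - \varepsilon_{n,x} \tag{4.96c} \]
and
\[ \Delta\varepsilon_{my} = \varepsilon_{m+1,y} - \varepsilon_{m,y} . \tag{4.96d} \]
\[ \gamma_s^{(abc)}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = \gamma_{snm}^{(abc)}(\sigma) \quad\text{and}\quad \gamma_p^{(abc)}(\varepsilon_{nx}, \varepsilon_{my}, \sigma) = \gamma_{pnm}^{(abc)}(\sigma) . \tag{4.96e} \]
Since
\[ \gamma_{sj}^{(abc)}(-\sigma) = \gamma_{sj}^{(abc)}(\sigma)^{*} \quad\text{and}\quad \gamma_{pj}^{(abc)}(-\sigma) = \gamma_{pj}^{(abc)}(\sigma)^{*} , \]
replacing index $j$ by the indices $m, n$ shows that we must have
\[ \gamma_{snm}^{(abc)}(-\sigma) = \gamma_{snm}^{(abc)}(\sigma)^{*} \quad\text{and}\quad \gamma_{pnm}^{(abc)}(-\sigma) = \gamma_{pnm}^{(abc)}(\sigma)^{*} \]
so that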
Just like in Eqs. (4.53a) and (4.53b), we pass to the limit of decreasing ∆ε nx , ∆ε my in (4.94a) and
(4.94b) to get
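\[
\begin{aligned}
\vec{B}^{(\mathrm{bal})}(\vec r, t) = \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big[\, & \hat y\, r(\sigma)\,\gamma_s^{(abc)}(\vec\varepsilon, \sigma)\,e_x(\vec\varepsilon, \sigma)\,t_s(\sigma)\,r_s(\sigma)\, e^{2\pi i\sigma[\hat\Omega\bullet\vec r - ct]} \\
 & \cdot \big(W + e^{2\pi i\sigma\chi(\hat\Omega\bullet\hat z)}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\big) \\
 - \, & \hat x\, r(\sigma)\,\gamma_p^{(abc)}(\vec\varepsilon, \sigma)\,e_y(\vec\varepsilon, \sigma)\,t_p(\sigma)\,r_p(\sigma)\, e^{2\pi i\sigma[\hat\Omega\bullet\vec r - ct]} \big(W + e^{2\pi i\sigma\chi(\hat\Omega\bullet\hat z)}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\big) \Big] \\
 & + O(\theta_b)
\end{aligned}
\tag{4.97b}
\]
As in Sec. 4.8, the vector argument $\vec\varepsilon = \hat x\,\varepsilon_x + \hat y\,\varepsilon_y$ is used as a shorthand for the two arguments
$\varepsilon_x$ and $\varepsilon_y$, so that
\[ e_x(\varepsilon_x, \varepsilon_y, \sigma) = e_x(\vec\varepsilon, \sigma) , \quad e_y(\varepsilon_x, \varepsilon_y, \sigma) = e_y(\vec\varepsilon, \sigma) \]
and
\[ \gamma_{s,p}^{(abc)}(\varepsilon_x, \varepsilon_y, \sigma) = \gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma) . \]
\[ \hat\Omega = \hat x\,\varepsilon_x + \hat y\,\varepsilon_y + \hat z\sqrt{1 - \varepsilon_x^2 - \varepsilon_y^2} = \vec\varepsilon + \hat z\sqrt{1 - \varepsilon^2} \tag{4.97d} \]
with
\[ \varepsilon^2 = |\vec\varepsilon|^2 = \varepsilon_x^2 + \varepsilon_y^2 . \]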
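Vector $\vec\rho = x\hat x + y\hat y$ lets us write $\vec r = \vec\rho + z\hat z$ [see Eqs. (4.54b) and (4.54e)] so that the expressions
$e_x(\vec\varepsilon, \sigma)\, e^{2\pi i\sigma[\hat\Omega\bullet\vec r - ct]}$ and $e_y(\vec\varepsilon, \sigma)\, e^{2\pi i\sigma[\hat\Omega\bullet\vec r - ct]}$ become
\[ e_x(\vec\varepsilon, \sigma)\, e^{2\pi i\sigma[\hat\Omega\bullet\vec r - ct]} = e_x(\vec\varepsilon, \sigma)\, e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}}\, e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]} = E_x(\vec\varepsilon, z, \sigma)\, e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]} \]
and
\[ e_y(\vec\varepsilon, \sigma)\, e^{2\pi i\sigma[\hat\Omega\bullet\vec r - ct]} = e_y(\vec\varepsilon, \sigma)\, e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}}\, e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]} = E_y(\vec\varepsilon, z, \sigma)\, e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]} , \]
where we define
\[ E_x(\vec\varepsilon, z, \sigma) = e_x(\vec\varepsilon, \sigma)\, e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}} \tag{4.98a} \]
and
\[ E_y(\vec\varepsilon, z, \sigma) = e_y(\vec\varepsilon, \sigma)\, e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}} . \tag{4.98b} \]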
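The factorization behind Eqs. (4.98a) and (4.98b) is easy to confirm numerically. The sketch below is illustrative only: the numerical values of $\sigma$, $c$, $t$, $\vec\varepsilon$, $\vec\rho$, and $z$ are arbitrary assumptions in dimensionless units, and the helper name `omega_hat` is hypothetical. It checks that $\hat\Omega = \vec\varepsilon + \hat z\sqrt{1-\varepsilon^2}$ has unit length and that the plane-wave phase splits into a $z$-dependent factor times a transverse factor.

```python
import numpy as np

def omega_hat(eps):
    """Unit propagation vector built from the transverse components (eps_x, eps_y)."""
    eps = np.asarray(eps, dtype=float)
    return np.array([eps[0], eps[1], np.sqrt(1.0 - eps @ eps)])

# Arbitrary illustrative values (dimensionless units with c = 1).
sigma, c, t = 5.0, 1.0, 0.7
eps = np.array([0.01, -0.02])
rho = np.array([0.3, -0.1])
z = 2.0
r = np.array([rho[0], rho[1], z])

# Full plane-wave phase factor exp(2*pi*i*sigma*[Omega . r - c*t]).
full_phase = np.exp(2j * np.pi * sigma * (omega_hat(eps) @ r - c * t))

# The same factor split as in Eqs. (4.98a) and (4.98b).
z_factor = np.exp(2j * np.pi * sigma * z * np.sqrt(1.0 - eps @ eps))
transverse_factor = np.exp(2j * np.pi * sigma * (eps @ rho - c * t))

norm_error = abs(np.linalg.norm(omega_hat(eps)) - 1.0)
split_error = abs(full_phase - z_factor * transverse_factor)
```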
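\[
\begin{aligned}
\vec{E}^{(\mathrm{bal})}(\vec\rho, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{\, & e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]}\, r(\sigma) \Big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\Big) \cdot \\
 & \big[\hat x\, E_x(\vec\varepsilon, z, \sigma)\,\gamma_s^{(abc)}(\vec\varepsilon, \sigma)\,t_s(\sigma)\,r_s(\sigma) + \hat y\, E_y(\vec\varepsilon, z, \sigma)\,\gamma_p^{(abc)}(\vec\varepsilon, \sigma)\,t_p(\sigma)\,r_p(\sigma)\big] \Big\} \\
 & + O(\theta_b)
\end{aligned}
\tag{4.99a}
\]
and
\[
\begin{aligned}
\vec{B}^{(\mathrm{bal})}(\vec\rho, z, t) = \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{\, & e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]}\, r(\sigma) \Big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\Big) \cdot \\
 & \big[\hat y\, E_x(\vec\varepsilon, z, \sigma)\,\gamma_s^{(abc)}(\vec\varepsilon, \sigma)\,t_s(\sigma)\,r_s(\sigma) - \hat x\, E_y(\vec\varepsilon, z, \sigma)\,\gamma_p^{(abc)}(\vec\varepsilon, \sigma)\,t_p(\sigma)\,r_p(\sigma)\big] \Big\} \\
 & + O(\theta_b).
\end{aligned}
\tag{4.99b}
\]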
In Eqs. (4.92g) and (4.92h) we see that $E_{jx}(\sigma)$ and $E_{jy}(\sigma)$ are Hermitian functions, so when
the index $j$ is replaced by the pair of indices $m, n$, it follows that $E_{nmx}(\sigma)$ and $E_{nmy}(\sigma)$ must also
be Hermitian. This forces $e_x(\vec\varepsilon, \sigma)$ and $e_y(\vec\varepsilon, \sigma)$ in Eqs. (4.96a) and (4.96b) to be Hermitian
functions of $\sigma$. Changing the sign of $\sigma$ in $e^{2\pi i\sigma z\sqrt{1-\varepsilon^2}}$ is equivalent to taking its complex conjugate,
so Eqs. (4.98a) and (4.98b) show that $E_x(\vec\varepsilon, z, \sigma)$ and $E_y(\vec\varepsilon, z, \sigma)$ are also Hermitian functions of
$\sigma$, giving
\[ E_x(\vec\varepsilon, z, -\sigma) = E_x(\vec\varepsilon, z, \sigma)^{*} \tag{4.100a} \]
and
\[ E_y(\vec\varepsilon, z, -\sigma) = E_y(\vec\varepsilon, z, \sigma)^{*} . \tag{4.100b} \]
Returning briefly to the discussion leading up to inequalities (4.95a) and (4.95b), we see that
because only plane waves traveling parallel to, or nearly parallel to, the optical axis can pass
through the interferometer (that is, because the radiation passing through the interferometer is
direction-chopped), both $e_x$ and $e_y$ must be zero or negligible unless $\varepsilon_x$ and $\varepsilon_y$ are small.
Consequently, consulting the definitions of $E_x$ and $E_y$ in (4.98a) and (4.98b), it follows that for
the direction-chopped radiation passing through the interferometer both $E_x$ and $E_y$ must be zero or
negligible unless $|\vec\varepsilon| \ll 1$.
The connection between the output radiation fields $\vec E^{(\mathrm{bal})}(\vec\rho, z, t)$, $\vec B^{(\mathrm{bal})}(\vec\rho, z, t)$ and the input
radiance is easy to understand because we have just created a carefully elaborated connection
between $E_x(\vec\varepsilon, z, \sigma)$, $E_y(\vec\varepsilon, z, \sigma)$ and the complex $E_{\ell jx}$, $E_{\ell jy}$ values in Eqs. (4.89a) and (4.89b)
characterizing the input radiation fields. To develop a consistent notation and make the
connection explicit, we apply the same process used to go from (4.89c) and (4.89d) to (4.99a) and
(4.99b) to Eqs. (4.89a) and (4.89b) representing the input radiation fields. The interferometer's
input fields then become
For future use, we note that $\vec E^{(\mathrm{in})}$, $\vec B^{(\mathrm{in})}$, $\vec E^{(\mathrm{bal})}$, and $\vec B^{(\mathrm{bal})}$ can be written as three-dimensional
inverse Fourier transforms. We make the same variable substitutions used above in equations
(4.60a)–(4.60c), (4.61b), and (4.61c), specifying that
\[ w = -\sigma c , \tag{4.102a} \]
\[ u_x = \sigma\varepsilon_x , \tag{4.102b} \]
and
\[ u_y = \sigma\varepsilon_y , \tag{4.102c} \]
with
\[ \vec u = \hat x\,u_x + \hat y\,u_y = \sigma\vec\varepsilon = -\frac{w}{c}\,\vec\varepsilon \tag{4.102d} \]
and
\[ u^2 = |\vec u|^2 = u_x^2 + u_y^2 . \tag{4.102e} \]
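The substitution (4.102a)–(4.102d) changes the volume element of the integrals: $d\sigma\,d^2\varepsilon$ becomes $(c/w^2)\,dw\,d^2u$ in magnitude. The following sketch checks that factor with a finite-difference Jacobian. It is an illustration under stated assumptions: the values of `c`, `w`, `ux`, and `uy` are arbitrary dimensionless numbers, and the helper names are hypothetical.

```python
import numpy as np

def to_sigma_eps(w, ux, uy, c):
    """Map (w, u_x, u_y) to (sigma, eps_x, eps_y) using sigma = -w/c and eps = -c*u/w."""
    return np.array([-w / c, -c * ux / w, -c * uy / w])

def numerical_jacobian(func, point, c, rel_step=1e-6):
    """Central-difference Jacobian of func at point (rows: outputs, columns: inputs)."""
    point = np.asarray(point, dtype=float)
    jac = np.empty((point.size, point.size))
    for k in range(point.size):
        step = np.zeros_like(point)
        step[k] = rel_step * max(abs(point[k]), 1.0)
        jac[:, k] = (func(*(point + step), c) - func(*(point - step), c)) / (2.0 * step[k])
    return jac

c = 3.0                        # speed of light in arbitrary units (illustrative only)
w, ux, uy = -2.0, 0.05, -0.03  # arbitrary sample point with w != 0
jac = numerical_jacobian(to_sigma_eps, [w, ux, uy], c)

volume_factor = abs(np.linalg.det(jac))  # |d(sigma, eps_x, eps_y) / d(w, u_x, u_y)|
expected = c / w**2
```

The same factor $c/w^2$ appears in front of the bracketed terms once the fields are rewritten as integrals over $dw$ and $d^2u$.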
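\[
\begin{aligned}
\vec{E}^{(\mathrm{bal})}(\vec\rho, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{\, & \frac{c}{w^2} \cdot e^{2\pi i[\vec u\bullet\vec\rho + wt]}\, r\!\left(-\tfrac{w}{c}\right) \left(W + e^{-\frac{2\pi i w\chi}{c}\sqrt{1-\left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c}(\hat n_M - \hat z)\bullet\vec r}\right) \\
 & \Big[\hat x\, E_x\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\gamma_s^{(abc)}\!\left(-\tfrac{c\vec u}{w}, -\tfrac{w}{c}\right) t_s\!\left(-\tfrac{w}{c}\right) r_s\!\left(-\tfrac{w}{c}\right) \\
 & + \hat y\, E_y\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\gamma_p^{(abc)}\!\left(-\tfrac{c\vec u}{w}, -\tfrac{w}{c}\right) t_p\!\left(-\tfrac{w}{c}\right) r_p\!\left(-\tfrac{w}{c}\right)\Big] \Big\} \\
 & + O(\theta_b)
\end{aligned}
\tag{4.104a}
\]
and
\[
\begin{aligned}
\vec{B}^{(\mathrm{bal})}(\vec\rho, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{\, & \frac{1}{w^2} \cdot e^{2\pi i[\vec u\bullet\vec\rho + wt]}\, r\!\left(-\tfrac{w}{c}\right) \left(W + e^{-\frac{2\pi i w\chi}{c}\sqrt{1-\left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c}(\hat n_M - \hat z)\bullet\vec r}\right) \\
 & \Big[\hat y\, E_x\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\gamma_s^{(abc)}\!\left(-\tfrac{c\vec u}{w}, -\tfrac{w}{c}\right) t_s\!\left(-\tfrac{w}{c}\right) r_s\!\left(-\tfrac{w}{c}\right) \\
 & - \hat x\, E_y\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\gamma_p^{(abc)}\!\left(-\tfrac{c\vec u}{w}, -\tfrac{w}{c}\right) t_p\!\left(-\tfrac{w}{c}\right) r_p\!\left(-\tfrac{w}{c}\right)\Big] \Big\} \\
 & + O(\theta_b) .
\end{aligned}
\tag{4.104b}
\]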
67
The radiation is also, of course, direction-chopped. The direction-chopped property is used in the discussion after
Eq. (4.119d) below.
Energy Flux of the Time-Chopped and Beam-Chopped Radiation Fields · 4.14
Consequently, there exists a positive wavenumber $\sigma_{av}$, which can be thought of as the typical or
“average” wavenumber of the approximately band-limited radiation, that characterizes the
polychromatic wavefield passing through the interferometer. We require $T$ to be extremely long
compared to the period $1/f_{av} = 1/(c\sigma_{av})$ of a typical electromagnetic wave inside the interferometer.
We also require any characteristic distance across area $A$ to be extremely large compared to the
wavelength $\lambda_{av} = 1/\sigma_{av}$ of a typical electromagnetic wave inside the interferometer,
\[ T \gg \frac{1}{c\sigma_{av}} \tag{4.105a} \]
and
\[ \sqrt{A} \gg \frac{1}{\sigma_{av}} . \tag{4.105b} \]
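To show how the $T, A$ subscripts are used, we rewrite Eqs. (4.99a) through (4.104b) using $T, A$
subscripts and neglecting all terms of $O(\theta_b)$,
\[ \vec E_{TA}^{(\mathrm{in})}(\vec\rho, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \big[\hat x\, E_{xTA}(\vec\varepsilon, z, \sigma) + \hat y\, E_{yTA}(\vec\varepsilon, z, \sigma)\big]\, e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]} , \tag{4.106a} \]
\[ \vec B_{TA}^{(\mathrm{in})}(\vec\rho, z, t) = \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \big[\hat y\, E_{xTA}(\vec\varepsilon, z, \sigma) - \hat x\, E_{yTA}(\vec\varepsilon, z, \sigma)\big]\, e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]} , \tag{4.106b} \]
\[
\begin{aligned}
\vec E_{TA}^{(\mathrm{bal})}(\vec\rho, z, t) = \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{\, & e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]}\, r(\sigma) \Big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\Big) \cdot \\
 & \big[\hat x\, E_{xTA}(\vec\varepsilon, z, \sigma)\,\gamma_s^{(abc)}(\vec\varepsilon, \sigma)\,t_s(\sigma)\,r_s(\sigma) + \hat y\, E_{yTA}(\vec\varepsilon, z, \sigma)\,\gamma_p^{(abc)}(\vec\varepsilon, \sigma)\,t_p(\sigma)\,r_p(\sigma)\big] \Big\} ,
\end{aligned}
\tag{4.107a}
\]
\[
\begin{aligned}
\vec B_{TA}^{(\mathrm{bal})}(\vec\rho, z, t) = \frac{1}{c}\int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{\, & e^{2\pi i\sigma[\vec\varepsilon\bullet\vec\rho - ct]}\, r(\sigma) \Big(W + e^{2\pi i\sigma\chi\sqrt{1-\varepsilon^2}}\, e^{4\pi i\sigma(\hat n_M - \hat z)\bullet\vec r}\Big) \cdot \\
 & \big[\hat y\, E_{xTA}(\vec\varepsilon, z, \sigma)\,\gamma_s^{(abc)}(\vec\varepsilon, \sigma)\,t_s(\sigma)\,r_s(\sigma) - \hat x\, E_{yTA}(\vec\varepsilon, z, \sigma)\,\gamma_p^{(abc)}(\vec\varepsilon, \sigma)\,t_p(\sigma)\,r_p(\sigma)\big] \Big\} ,
\end{aligned}
\tag{4.107b}
\]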
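\[ \vec E_{TA}^{(\mathrm{in})}(\vec\rho, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left(\frac{c}{w^2}\right) \Big[\hat x\, E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) + \hat y\, E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\Big]\, e^{2\pi i[\vec u\bullet\vec\rho + wt]} , \tag{4.108a} \]
\[ \vec B_{TA}^{(\mathrm{in})}(\vec\rho, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left(\frac{1}{w^2}\right) \Big[\hat y\, E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) - \hat x\, E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\Big]\, e^{2\pi i[\vec u\bullet\vec\rho + wt]} , \tag{4.108b} \]
\[
\begin{aligned}
\vec E_{TA}^{(\mathrm{bal})}(\vec\rho, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{\, & \frac{c}{w^2} \cdot e^{2\pi i[\vec u\bullet\vec\rho + wt]}\, r\!\left(-\tfrac{w}{c}\right) \left(W + e^{-\frac{2\pi i w\chi}{c}\sqrt{1-\left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c}(\hat n_M - \hat z)\bullet\vec r}\right) \\
 & \Big[\hat x\, E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\gamma_s^{(abc)}\!\left(-\tfrac{c\vec u}{w}, -\tfrac{w}{c}\right) t_s\!\left(-\tfrac{w}{c}\right) r_s\!\left(-\tfrac{w}{c}\right) \\
 & + \hat y\, E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\gamma_p^{(abc)}\!\left(-\tfrac{c\vec u}{w}, -\tfrac{w}{c}\right) t_p\!\left(-\tfrac{w}{c}\right) r_p\!\left(-\tfrac{w}{c}\right)\Big] \Big\} ,
\end{aligned}
\tag{4.109a}
\]
\[
\begin{aligned}
\vec B_{TA}^{(\mathrm{bal})}(\vec\rho, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{\, & \frac{1}{w^2} \cdot e^{2\pi i[\vec u\bullet\vec\rho + wt]}\, r\!\left(-\tfrac{w}{c}\right) \left(W + e^{-\frac{2\pi i w\chi}{c}\sqrt{1-\left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c}(\hat n_M - \hat z)\bullet\vec r}\right) \\
 & \Big[\hat y\, E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\gamma_s^{(abc)}\!\left(-\tfrac{c\vec u}{w}, -\tfrac{w}{c}\right) t_s\!\left(-\tfrac{w}{c}\right) r_s\!\left(-\tfrac{w}{c}\right) \\
 & - \hat x\, E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\gamma_p^{(abc)}\!\left(-\tfrac{c\vec u}{w}, -\tfrac{w}{c}\right) t_p\!\left(-\tfrac{w}{c}\right) r_p\!\left(-\tfrac{w}{c}\right)\Big] \Big\} .
\end{aligned}
\tag{4.109b}
\]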
Equations (4.100a) and (4.100b) require the $E_{xTA}$ and $E_{yTA}$ functions inside these integrals to
satisfy the Hermitian condition for their wavenumber arguments,
\[ E_{xTA}(\vec\varepsilon, z, -\sigma) = E_{xTA}(\vec\varepsilon, z, \sigma)^{*} \tag{4.110a} \]
and
\[ E_{yTA}(\vec\varepsilon, z, -\sigma) = E_{yTA}(\vec\varepsilon, z, \sigma)^{*} . \tag{4.110b} \]
For future use, we note that this is the same thing as saying
\[ E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) = E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, \tfrac{w}{c}\right)^{*} \tag{4.110c} \]
and
\[ E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) = E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, \tfrac{w}{c}\right)^{*} . \tag{4.110d} \]
The power flux—energy per unit area per unit time—carried by the input radiation field at any
point in space is given by the Poynting vector,68
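The Poynting vector $\vec S$ is zero where $\vec E_{TA}^{(\mathrm{in})}$ and $\vec B_{TA}^{(\mathrm{in})}$ are zero, so it is given $TA$ subscripts to show
that it is time-chopped and beam-chopped in the same way that $\vec E_{TA}^{(\mathrm{in})}$ and $\vec B_{TA}^{(\mathrm{in})}$ are time-chopped
and beam-chopped. Equations (4.108a) and (4.108b) show that the total radiant energy entering
the interferometer during a time interval $-T \le t \le T$ is
\[
\begin{aligned}
\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\vec S_{TA}^{(\mathrm{in})} \bullet \hat z\big)
= \frac{c}{\mu_o} & \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \int_{-\infty}^{\infty} dw \int_{-\infty}^{\infty} dw' \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u' \cdot w'^{-2} \cdot w^{-2} \cdot e^{2\pi i[\vec\rho\bullet(\vec u + \vec u') + (w + w')t]} \\
 & \cdot \Big[E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{xTA}\!\left(-\tfrac{c\vec u'}{w'}, z, -\tfrac{w'}{c}\right) + E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{yTA}\!\left(-\tfrac{c\vec u'}{w'}, z, -\tfrac{w'}{c}\right)\Big] .
\end{aligned}
\tag{4.111b}
\]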
68
John David Jackson, Classical Electrodynamics, 3rd ed. (John Wiley & Sons, Inc., New York, 1999), p. 259.
The integrals over $d^2u$ and $dw$ in (4.108a) and (4.108b) are changed to integrals over $d^2u$, $d^2u'$
and $dw$, $dw'$ before they are substituted into (4.111a). This maneuver is often used to show that
formulas such as the one in (4.111b) deal with integrals over independent variables of integration.
We have also used the unit-vector identities $\hat x \times \hat y = -\hat y \times \hat x = \hat z$ and $\hat x \times \hat x = \hat y \times \hat y = 0$ to simplify
the expression inside the square brackets [ ]. We note that the integrals over $dt$ and $d^2\rho$ can be
extended to $-\infty$ and $+\infty$ exactly because the time-chopped and beam-chopped nature of $\vec E_{TA}^{(\mathrm{in})}$,
$\vec B_{TA}^{(\mathrm{in})}$, and $\vec S_{TA}^{(\mathrm{in})}$ ensures that their integrals drop to zero, thus correctly excluding the
electromagnetic energy at large values of $x$, $y$, and $t$ that are not part of the interferometer
measurement. Moving the integrals over $dt$ and $d^2\rho$ to the inside to get
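\[
\begin{aligned}
\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\vec S_{TA}^{(\mathrm{in})} \bullet \hat z\big)
= \frac{c}{\mu_o} & \int_{-\infty}^{\infty} dw \int_{-\infty}^{\infty} dw' \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u' \cdot w'^{-2} \cdot w^{-2} \\
 & \cdot \Big[E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{xTA}\!\left(-\tfrac{c\vec u'}{w'}, z, -\tfrac{w'}{c}\right) + E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{yTA}\!\left(-\tfrac{c\vec u'}{w'}, z, -\tfrac{w'}{c}\right)\Big] \\
 & \cdot \int_{-\infty}^{\infty} dt\, e^{2\pi i(w + w')t} \iint_{-\infty}^{\infty} d^2\rho\, e^{2\pi i\vec\rho\bullet(\vec u + \vec u')} ,
\end{aligned}
\]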
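we recognize these integrals to be forms of the delta function [see Eqs. (2.71f) and (2.122a) in
Chapter 2],
\[ \int_{-\infty}^{\infty} dt\, e^{2\pi i(w + w')t} = \delta(w + w') \]
and
\[ \iint_{-\infty}^{\infty} d^2\rho\, e^{2\pi i\vec\rho\bullet(\vec u + \vec u')} = \int_{-\infty}^{\infty} dx\, e^{2\pi i x(u_x + u_x')} \cdot \int_{-\infty}^{\infty} dy\, e^{2\pi i y(u_y + u_y')} = \delta(u_x + u_x') \cdot \delta(u_y + u_y') = \delta(\vec u + \vec u') . \]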
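A finite integration window gives some feel for these delta-function identities. The sketch below is an illustration only, not part of the text's argument: it evaluates $\int_{-T}^{T} e^{2\pi i f t}\,dt = \sin(2\pi f T)/(\pi f)$ for an assumed window half-width `T` on an assumed frequency grid, and checks that the kernel has height $2T$ at $f = 0$ while its area stays close to one, which is the behavior expected of an approximation to $\delta(f)$.

```python
import numpy as np

def chopped_kernel(f, T):
    """Integral of exp(2*pi*i*f*t) over -T <= t <= T, i.e. sin(2*pi*f*T) / (pi*f)."""
    return 2.0 * T * np.sinc(2.0 * f * T)   # np.sinc(x) = sin(pi*x) / (pi*x)

T = 10.0                                    # assumed window half-width (arbitrary units)
f = np.linspace(-50.0, 50.0, 100001)        # assumed frequency grid
df = f[1] - f[0]

kernel = chopped_kernel(f, T)
area = float(np.sum(kernel) * df)           # Riemann-sum estimate of the area under the kernel
peak = float(kernel.max())                  # height at f = 0
```

Increasing `T` makes the central lobe taller and narrower while the area remains near one.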
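Substituting these delta functions back into the multiple integral gives
\[
\begin{aligned}
\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\vec S_{TA}^{(\mathrm{in})} \bullet \hat z\big)
 = \frac{c}{\mu_o} & \int_{-\infty}^{\infty} dw \cdot w^{-4} \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u'\, \delta(\vec u + \vec u') \\
 & \cdot \Big[E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{xTA}\!\left(\tfrac{c\vec u'}{w}, z, \tfrac{w}{c}\right) + E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{yTA}\!\left(\tfrac{c\vec u'}{w}, z, \tfrac{w}{c}\right)\Big] \\
 = \frac{c}{\mu_o} & \int_{-\infty}^{\infty} dw \cdot w^{-4} \iint_{-\infty}^{\infty} d^2u \\
 & \cdot \Big[E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, \tfrac{w}{c}\right) + E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, \tfrac{w}{c}\right)\Big] .
\end{aligned}
\]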
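\[ E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, \tfrac{w}{c}\right) = \left|E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2 \tag{4.112a} \]
and
\[ E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, \tfrac{w}{c}\right) = \left|E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2 , \tag{4.112b} \]
which shows the total radiant energy entering the interferometer during a time interval $-T \le t \le T$
to be
\[
\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\vec S_{TA}^{(\mathrm{in})} \bullet \hat z\big)
 = \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left[\frac{c^2}{w^4}\left|E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2 + \frac{c^2}{w^4}\left|E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right] .
\tag{4.113a}
\]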
The radiation fields entering the interferometer (unlike, say, the electromagnetic signal put
out by television or radio stations) can be modeled as random variables because they are not
under our direct control. Following the notation used in Chapter 3, we now write
\[ \tilde{\vec S}_{TA}^{(\mathrm{in})}, \ \tilde E_{xTA}, \ \text{and}\ \tilde E_{yTA} \]
to show that these are random functions (see Sec. 3.2). No tilde is added to their arguments
because the arguments are nonrandom variables. To find the average or expected radiant energy
entering the interferometer during a time interval $2T$, which is long compared to the period
\[ 1/f_{av} = 1/(c\sigma_{av}) \]
of a typical electromagnetic wave inside the interferometer, we apply the expectation operator $\mathrm{E}$
defined in Sec. 3.4 of Chapter 3 to both sides of Eq. (4.113a) to get
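\[
\begin{aligned}
\text{Average input energy} &= \mathrm{E}\left[\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \big(\tilde{\vec S}_{TA}^{(\mathrm{in})} \bullet \hat z\big)\right] \\
 &= \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \cdot \left[\mathrm{E}\!\left(\left|\frac{c}{w^2}\,\tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right) + \mathrm{E}\!\left(\left|\frac{c}{w^2}\,\tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right)\right] .
\end{aligned}
\tag{4.113b}
\]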
Equations (3.17c) and (3.16a) of Chapter 3 are used when taking E inside the integrals over dw
and d 2u .
Although the random radiation fields are not under our direct control, the amount of radiant
energy that is linearly polarized in the x or y direction is. We can, for example, imagine passing
the radiation in (4.113b) through a polarizing filter, setting $\tilde E_{yTA}$ to zero without affecting $\tilde E_{xTA}$ or
setting $\tilde E_{xTA}$ to zero without affecting $\tilde E_{yTA}$. Therefore (4.113b) can be interpreted as saying that
\[ \frac{c}{\mu_o w^4}\, \mathrm{E}\!\left(\left|\tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right) d^2u\, dw \tag{4.114c} \]
and
\[ \frac{c}{\mu_o w^4}\, \mathrm{E}\!\left(\left|\tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right) d^2u\, dw \tag{4.114d} \]
as the average or expected energy characterized by $\vec u = \vec\varepsilon\sigma$ and $w = -\sigma c$ that is carried by,
respectively, the x-polarized and y-polarized radiation fields entering the interferometer during a
time interval $2T$ in length. By converting the integrals over $dw$ and $d^2u$ to integrals over $d\sigma$ and
$d^2\varepsilon$ using the variable transformations [see Eqs. (4.102a)–(4.102d)]
\[ \sigma = -w/c , \quad \vec\varepsilon = -c\,\vec u/w , \quad dw = -c\,d\sigma , \quad\text{and}\quad d^2u = (w^2/c^2)\,d^2\varepsilon , \]
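\[
\begin{aligned}
\text{Average input energy polarized in } x
 &= -\frac{1}{\mu_o c} \int_{\infty}^{-\infty} (c\,d\sigma) \iint_{-\infty}^{\infty} \left(\frac{w^2}{c^2}\right) d^2\varepsilon \cdot \frac{c^2}{w^4} \cdot \mathrm{E}\!\left(\left|\tilde E_{xTA}(\vec\varepsilon, z, \sigma)\right|^2\right) \\
 &= \varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \cdot \frac{1}{\sigma^2} \cdot \mathrm{E}\!\left(\left|\tilde E_{xTA}(\vec\varepsilon, z, \sigma)\right|^2\right)
\end{aligned}
\tag{4.115a}
\]
and
\[
\text{Average input energy polarized in } y
 = \varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \cdot \frac{1}{\sigma^2} \cdot \mathrm{E}\!\left(\left|\tilde E_{yTA}(\vec\varepsilon, z, \sigma)\right|^2\right) .
\tag{4.115b}
\]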
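In (4.115a) and (4.115b), we use that $\varepsilon_o = \mu_o^{-1}c^{-2}$ from Eq. (4.1e) above. Remembering that the
direction of the propagation vector $\hat\Omega = \vec\varepsilon + \hat z\sqrt{1 - \varepsilon^2}$ is specified by vector $\vec\varepsilon$, we note that
\[ \frac{\varepsilon_o}{\sigma^2} \cdot \mathrm{E}\!\left(\left|\tilde E_{xTA}(\vec\varepsilon, z, \sigma)\right|^2\right) d\sigma\, d^2\varepsilon \tag{4.116a} \]
and
\[ \frac{\varepsilon_o}{\sigma^2} \cdot \mathrm{E}\!\left(\left|\tilde E_{yTA}(\vec\varepsilon, z, \sigma)\right|^2\right) d\sigma\, d^2\varepsilon \tag{4.116b} \]
can be interpreted as the average or expected energy entering the interferometer during a time
interval $2T$ in length carried by, respectively, the x-polarized or y-polarized radiation fields
traveling in the $\hat\Omega$ direction at wavenumber $\sigma$.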
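The prefactor bookkeeping that takes (4.114c) into (4.116a) can be verified with a few lines of arithmetic. This is a sketch under stated assumptions: the constants are rounded CODATA 2018 values entered by hand, the sample wavenumber is arbitrary, and the function names are hypothetical. It confirms that $c/(\mu_o w^4)$ multiplied by the volume-element factor $c \cdot w^2/c^2$ equals $\varepsilon_o/\sigma^2$ when $w = -\sigma c$ and $\varepsilon_o = \mu_o^{-1}c^{-2}$.

```python
import math

MU_0 = 1.25663706212e-6     # vacuum permeability in N/A^2 (CODATA 2018, rounded)
C = 299_792_458.0           # speed of light in m/s (exact by definition)
EPS_0 = 1.0 / (MU_0 * C**2) # vacuum permittivity from eps_o = 1 / (mu_o * c^2)

def prefactor_in_w(w):
    """Factor multiplying E(|E_xTA|^2) d^2u dw in (4.114c)."""
    return C / (MU_0 * w**4)

def volume_ratio(w):
    """Magnitude of dw d^2u per unit d(sigma) d^2(eps): c * (w^2 / c^2)."""
    return C * w**2 / C**2

def prefactor_in_sigma(sigma):
    """Factor multiplying E(|E_xTA|^2) d(sigma) d^2(eps) in (4.116a)."""
    return EPS_0 / sigma**2

sigma = 1.0e5               # arbitrary sample wavenumber in 1/m
w = -sigma * C
lhs = prefactor_in_w(w) * volume_ratio(w)
rhs = prefactor_in_sigma(sigma)
relative_difference = abs(lhs - rhs) / rhs
```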
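From Appendix 4C, we see that, according to the three-dimensional Wiener-Khinchin theorem
discussed in Sec. 3.24 of Chapter 3, there exist power spectra $S_x(\vec u, w)$ and $S_y(\vec u, w)$ such that
[see Eqs. (4C.10a) and (4C.10b)]
\[ S_x(\vec u, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \left\{ \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c^2}{w^4}\, \mathrm{E}\!\left(\left|\tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right) \right\} \tag{4.117a} \]
and
\[ S_y(\vec u, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \left\{ \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c^2}{w^4}\, \mathrm{E}\!\left(\left|\tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right) \right\} . \tag{4.117b} \]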
Here, the limit as $A \to \infty$ is interpreted to be the limit as the beam cross-sectional area $A$ extends
to cover the entire x, y plane; and, of course, the limit as $T \to \infty$ means that the measurement
time becomes infinitely long. We have dropped $z$ from the argument list of $S_{x,y}$ on the left-hand
side of these two equations because, as is pointed out at the end of Appendix 4C, the values of
$|\tilde E_{xTA}|^2$ and $|\tilde E_{yTA}|^2$ are no longer functions of $z$. According to inequalities (4.105a) and (4.105b),
area $A$ has already been assumed to be much wider than the typical wavelength of the radiation
fields and time interval $2T$ has already been assumed to be much longer than the typical period of
the radiation fields. It is therefore plausible that in (4.117a) and (4.117b) the values of $A$ and $T$ are
already large enough for the expressions inside the braces { } to be approximately equal to their
limits. Assuming this to be true and multiplying both sides by $(\mu_o c)^{-1}\, d^2u\, dw$, we then get
\[ \frac{1}{\mu_o c}\, S_x(\vec u, w)\, d^2u\, dw \cong \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c}{\mu_o w^4}\, \mathrm{E}\!\left(\left|\tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right) d^2u\, dw \tag{4.118a} \]
and
\[ \frac{1}{\mu_o c}\, S_y(\vec u, w)\, d^2u\, dw \cong \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c}{\mu_o w^4}\, \mathrm{E}\!\left(\left|\tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right) d^2u\, dw . \tag{4.118b} \]
Comparing (4.114c) to (4.118a), we see that $(\mu_o c)^{-1} S_x(\vec u, w)\, d^2u\, dw$ is the average x-polarized
input energy at $(\vec u, w)$ divided by both the time interval $2T$ during which it entered the
interferometer and the area $A$ through which it entered the interferometer. This means
$(\mu_o c)^{-1} S_x(\vec u, w)\, d^2u\, dw$ can be interpreted as the average x-polarized input power per unit area at
values $(\vec u, w)$; and a similar comparison of (4.114d) to (4.118b) shows that $(\mu_o c)^{-1} S_y(\vec u, w)\, d^2u\, dw$
is the average y-polarized input power per unit area at values $(\vec u, w)$.
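The idea of dividing an expected squared spectrum by the measurement duration has a simple one-dimensional analogue. The sketch below is a loose illustration, not the three-dimensional construction of Appendix 4C: it assumes discrete white noise with a made-up standard deviation and sample spacing, treats `dt * FFT` as a stand-in for the time-chopped Fourier transform, and averages $|X|^2/(N\,dt)$ over many realizations. For white noise the result should be flat at the level `sigma_noise**2 * dt`.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_trials = 256, 400     # assumed record length and number of realizations
dt, sigma_noise = 0.01, 2.0        # assumed sample spacing and noise standard deviation

periodograms = np.empty((n_trials, n_samples))
for k in range(n_trials):
    x = rng.normal(0.0, sigma_noise, n_samples)
    spectrum = dt * np.fft.fft(x)                               # discrete stand-in for the chopped transform
    periodograms[k] = np.abs(spectrum) ** 2 / (n_samples * dt)  # divide by the record duration N*dt

psd_estimate = periodograms.mean(axis=0)   # sample average standing in for the expectation operator
expected_level = sigma_noise**2 * dt       # two-sided power spectral density of white noise
relative_error = abs(psd_estimate.mean() / expected_level - 1.0)
```

The averaging over realizations plays the role of the expectation operator; a single realization fluctuates strongly from one frequency bin to the next.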
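Integrating both sides of (4.118a) and (4.118b) over $dw$ and $d^2u$ gives expressions for the
average x-polarized and y-polarized input power per unit area from all the $\vec u$ and $w$ values,
\[ \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, S_x(\vec u, w) \cong \frac{c}{\mu_o\, 2TA} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \frac{1}{w^4}\, \mathrm{E}\!\left(\left|\tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right) \]
and
\[ \frac{1}{\mu_o c} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, S_y(\vec u, w) \cong \frac{c}{\mu_o\, 2TA} \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \frac{1}{w^4}\, \mathrm{E}\!\left(\left|\tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\right) . \]
In terms of the variables $\sigma$ and $\vec\varepsilon$, these become
\[
\begin{aligned}
\text{Average x-polarized input power per unit area}
 &= \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[\frac{\sigma^2}{\mu_o}\, S_x(\sigma\vec\varepsilon, -\sigma c)\right] \\
 &\cong \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[\frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left(\left|\tilde E_{xTA}(\vec\varepsilon, z, \sigma)\right|^2\right)\right]
\end{aligned}
\tag{4.119a}
\]
and
\[
\begin{aligned}
\text{Average y-polarized input power per unit area}
 &= \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[\frac{\sigma^2}{\mu_o}\, S_y(\sigma\vec\varepsilon, -\sigma c)\right] \\
 &\cong \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \left[\frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left(\left|\tilde E_{yTA}(\vec\varepsilon, z, \sigma)\right|^2\right)\right] ,
\end{aligned}
\tag{4.119b}
\]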
where we again use $\varepsilon_o = \mu_o^{-1}c^{-2}$ from Eq. (4.1e). These last two equations suggest that
\[ \frac{\sigma^2}{\mu_o}\, S_x(\sigma\vec\varepsilon, -\sigma c)\, d^2\varepsilon\, d\sigma \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left(\left|\tilde E_{xTA}(\vec\varepsilon, z, \sigma)\right|^2\right) d^2\varepsilon\, d\sigma \tag{4.119c} \]
can be interpreted as the average x-polarized input power per unit area traveling in direction
$\hat\Omega = \vec\varepsilon + \hat z\sqrt{1 - \varepsilon^2}$ at wavenumber $\sigma$, and
\[ \frac{\sigma^2}{\mu_o}\, S_y(\sigma\vec\varepsilon, -\sigma c)\, d^2\varepsilon\, d\sigma \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left(\left|\tilde E_{yTA}(\vec\varepsilon, z, \sigma)\right|^2\right) d^2\varepsilon\, d\sigma \tag{4.119d} \]
can be interpreted as the average y-polarized input power per unit area traveling in direction
$\hat\Omega = \vec\varepsilon + \hat z\sqrt{1 - \varepsilon^2}$ at wavenumber $\sigma$. From the discussion following Eq. (4.100b) in Sec. 4.13, we
know that $\tilde E_{xTA}$ and $\tilde E_{yTA}$ represent direction-chopped radiation even though nothing like the $T$ or
$A$ subscripts has been used to make this explicit. This means, of course, that $E_{xTA}$ and $E_{yTA}$ must
be negligible or zero for $\vec\varepsilon$ values that do not represent propagation directions that are parallel to,
or nearly parallel to, the z axis. Consequently, Eqs. (4.119c) and (4.119d) show that $S_x$ and $S_y$
must also be negligible or zero for $\vec\varepsilon$ values not representing propagation directions parallel to or
nearly parallel to the z axis. From the observations made at the end of Sec. 4.9, we know that $d^2\varepsilon$
can be interpreted as an infinitesimal solid angle. Hence, we can always regard
\[ \frac{\sigma^2}{\mu_o}\, S_{x,y}\, d\sigma \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left(\left|\tilde E_{xTA,\,yTA}\right|^2\right) d\sigma \]
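as the input power per unit area and per unit solid angle of x-polarized or y-polarized radiation,
respectively. The next obvious step is to drop $d\sigma$ and recognize
\[ \frac{\sigma^2}{\mu_o}\, S_{x,y} \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left(\left|\tilde E_{xTA,\,yTA}\right|^2\right) \]
as the input power per unit area, per unit solid angle, and per unit wavenumber interval of the x-polarized
or y-polarized radiation, respectively. It is customary in interferometric spectroscopy to
define two functions $L_x(\vec\varepsilon, \sigma)$ and $L_y(\vec\varepsilon, \sigma)$ to represent the x-polarized and y-polarized radiant
power per unit area, per unit solid angle, and per unit wavenumber interval traveling in the
direction $\hat\Omega = \vec\varepsilon + \hat z\sqrt{1 - \varepsilon^2}$ at wavenumber $\sigma$. Hence it now makes sense to define
\[ L_x(\vec\varepsilon, \sigma) = \frac{\sigma^2}{\mu_o}\, S_x(\sigma\vec\varepsilon, -\sigma c) \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left(\left|\tilde E_{xTA}(\vec\varepsilon, z, \sigma)\right|^2\right) \tag{4.120a} \]
and
\[ L_y(\vec\varepsilon, \sigma) = \frac{\sigma^2}{\mu_o}\, S_y(\sigma\vec\varepsilon, -\sigma c) \cong \frac{\varepsilon_o}{2TA\sigma^2}\, \mathrm{E}\!\left(\left|\tilde E_{yTA}(\vec\varepsilon, z, \sigma)\right|^2\right) . \tag{4.120b} \]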
Again, because the beam is direction-chopped, the newly defined functions $L_{x,y}$ must be
negligible or zero for $\vec\varepsilon$ values not representing directions parallel to, or nearly parallel to, the z
axis. As noted in Sec. 4.10 in the discussion after Eq. (4.66b), we are never interested in the
values of $E_{xTA}$ or $E_{yTA}$ at $\sigma = 0$. Consequently, we can always take the expected values of
$|\tilde E_{xTA,\,yTA}|^2$ to be zero at $\sigma = 0$, preventing the factors of $\sigma^{-2}$ in the last steps of (4.120a) and
(4.120b) from specifying a singularity when the wavenumber $\sigma$ is zero. The x-polarized and y-polarized
power spectra specified by $L_x$ and $L_y$ are double-sided in $\sigma$ because functions $L_{x,y}$ equal
$\mu_o^{-1}\sigma^2 S_{x,y}$, and the $S_{x,y}$ functions are double-sided. [We know that the $S_{x,y}$ are double-sided
because, according to Eqs. (4.119a) and (4.119b), the $S_{x,y}$ must be integrated over all
wavenumbers $\sigma$ between $-\infty$ and $+\infty$ to get the average power per unit area.] Equations (4.110a)
and (4.110b) show that $|\tilde E_{xTA}(\vec\varepsilon, z, \sigma)|^2$ and $|\tilde E_{yTA}(\vec\varepsilon, z, \sigma)|^2$ must have the same values at $-\sigma$ that
where again the tilde is used to indicate that these are random functions of nonrandom variables
(see Sec. 3.2 in Chapter 3). The balanced energy flux at any point in the balanced output beam is
now simply the expected or average value of
\[ \tilde{\vec S}_{TA}^{(\mathrm{bal})} \bullet \hat z = \frac{1}{\mu_o} \left(\tilde{\vec E}_{TA}^{(\mathrm{bal})} \times \tilde{\vec B}_{TA}^{(\mathrm{bal})}\right) \bullet \hat z , \tag{4.122a} \]
the z component of the Poynting vector. To get the radiant energy reaching the detector, we just
integrate the expected value of the Poynting vector's z component over the beam's cross-sectional
area and the time interval $-T \le t \le T$ used to collect the signal. Therefore,
Energy Flux of the Balanced Radiation Fields · 4.15
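Average energy in balanced signal over time interval $2T$ and beam cross-section $A$
\[
 = \mathrm{E}\left(\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \tilde{\vec S}_{TA}^{(\mathrm{bal})} \bullet \hat z\right)
 = \mathrm{E}\left(\frac{1}{\mu_o} \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \left(\tilde{\vec E}_{TA}^{(\mathrm{bal})} \times \tilde{\vec B}_{TA}^{(\mathrm{bal})}\right) \bullet \hat z\right) .
\tag{4.122b}
\]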
In this section, we use Eqs. (4.109a) and (4.109b) to evaluate the right-hand side of (4.122b) from
the inside out. Most of the massive algebraic manipulations we encounter turn out to be
conceptually simple exercises in listing—and then eliminating through integration—a large
number of superfluous variables.
We introduce simplifying notation before substituting (4.109a) and (4.109b) into (4.122b).
According to Eq. (4B.12b) in Appendix 4B, the typical angle between $\hat n_M$ and $\hat z$ is small enough
for us to neglect the z component of the $(\hat n_M - \hat z)$ vector in Eqs. (4.109a) and (4.109b).
This means there must exist two real constants $a$ and $b$ such that
\[ r = r(-w/c), \quad r' = r(-w'/c) , \tag{4.123b} \]
\[ t_p = t_p(-w/c), \quad t_p' = t_p(-w'/c) , \tag{4.123f} \]
and
\[ \gamma_{s,p} = \gamma_{s,p}^{(abc)}(-c\vec u/w, -w/c), \quad \gamma_{s,p}'' = \gamma_{s,p}^{(abc)}(-c\vec u'/w', -w'/c) . \tag{4.123g} \]
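Now at last Eqs. (4.109a) and (4.109b) can be substituted into (4.122b). Postponing for a
while the application of the expectation operator $\mathrm{E}$, we write
\[
\begin{aligned}
\frac{1}{\mu_o} & \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \left(\tilde{\vec E}_{TA}^{(\mathrm{bal})} \times \tilde{\vec B}_{TA}^{(\mathrm{bal})}\right) \bullet \hat z \\
 = \frac{c}{\mu_o} & \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} dx\,dy \int_{-\infty}^{\infty} dw \int_{-\infty}^{\infty} dw' \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u'\, \Big\{ \frac{r r'}{w^2 w'^2} \cdot e^{2\pi i[x(u_x + u_x') + y(u_y + u_y') + t(w + w')]} \\
 & \cdot \Big[W + e^{-2\pi i(w/c)\chi\sqrt{1 - (c^2u^2)/w^2}}\, e^{-2\pi i(w/c)\cdot(xa + yb)}\Big] \\
 & \cdot \Big[W + e^{-2\pi i(w'/c)\chi\sqrt{1 - (c^2u'^2)/w'^2}}\, e^{-2\pi i(w'/c)\cdot(xa + yb)}\Big] \\
 & \cdot \Big[\gamma_s\gamma_s''\, r_s r_s'\, t_s t_s'\, \tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{xTA}\!\left(-\tfrac{c\vec u'}{w'}, z, -\tfrac{w'}{c}\right) \\
 & \quad + \gamma_p\gamma_p''\, r_p r_p'\, t_p t_p'\, \tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{yTA}\!\left(-\tfrac{c\vec u'}{w'}, z, -\tfrac{w'}{c}\right)\Big] \Big\} .
\end{aligned}
\tag{4.123h}
\]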
The three double integrals over $d^2\rho$, $d^2u$, and $d^2u'$ are, of course, a shorthand for $dx\,dy$,
$du_x\,du_y$, and $du_x'\,du_y'$ respectively. Moving the integral over $dt$ to the inside gives [see Eq. (2.71f)
in Chapter 2]
\[ \int_{-\infty}^{\infty} dt\, e^{2\pi i t\cdot(w + w')} = \delta(w + w') . \tag{4.124a} \]
We define
\[ \gamma_{s,p}' = \gamma_{s,p}^{(abc)}(c\vec u'/w, w/c) \tag{4.124b} \]
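\[
\begin{aligned}
\frac{1}{\mu_o} & \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \left(\tilde{\vec E}_{TA}^{(\mathrm{bal})} \times \tilde{\vec B}_{TA}^{(\mathrm{bal})}\right) \bullet \hat z \\
 = \frac{c}{\mu_o} & \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u' \cdot \frac{|r|^2}{w^4} \iint_{-\infty}^{\infty} dx\,dy\, \Big\{ e^{2\pi i[x(u_x + u_x') + y(u_y + u_y')]} \\
 & \cdot \Big[W^2 + W e^{-2\pi i(w/c)\chi\sqrt{1 - (c^2u^2)/w^2}}\, e^{-2\pi i(w/c)\cdot(xa + yb)} \\
 & \quad + W e^{2\pi i(w/c)\chi\sqrt{1 - (c^2u'^2)/w^2}}\, e^{2\pi i(w/c)\cdot(xa + yb)} \\
 & \quad + e^{2\pi i(w/c)\chi\left[\sqrt{1 - (c^2u'^2)/w^2} - \sqrt{1 - (c^2u^2)/w^2}\right]}\Big] \\
 & \cdot \Big[|r_s|^2 |t_s|^2\, \gamma_s\gamma_s'\, \tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{xTA}\!\left(\tfrac{c\vec u'}{w}, z, \tfrac{w}{c}\right) \\
 & \quad + |r_p|^2 |t_p|^2\, \gamma_p\gamma_p'\, \tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{yTA}\!\left(\tfrac{c\vec u'}{w}, z, \tfrac{w}{c}\right)\Big] \Big\} .
\end{aligned}
\tag{4.125a}
\]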
In Eq. (4.125a), the integral over $\delta(w + w')\,dw'$ has been used to replace $w'$ by $-w$ everywhere, so
that [see Eqs. (4.92a-e), (4.123g), and (4.124b)]
\[ r r' \to r(-w/c)\, r(w/c) = |r|^2 , \]
\[ r_s r_s' \to r_s(-w/c)\, r_s(w/c) = |r_s|^2 , \]
\[ t_s t_s' \to t_s(-w/c)\, t_s(w/c) = |t_s|^2 , \]
\[ r_p r_p' \to r_p(-w/c)\, r_p(w/c) = |r_p|^2 , \]
\[ t_p t_p' \to t_p(-w/c)\, t_p(w/c) = |t_p|^2 , \]
and
\[ \gamma_{s,p}\gamma_{s,p}'' \to \gamma_{s,p}\gamma_{s,p}' . \]
Equations (4.123g), (4.124b), and (4.97c) show that when $\vec u' \to -\vec u$ in the argument lists of
$\gamma_s\gamma_s'$, we get
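\[
\begin{aligned}
\gamma_s\gamma_s' &= \gamma_s^{(abc)}(-c\vec u/w, -w/c)\, \gamma_s^{(abc)}(c\vec u'/w, w/c) \to \\
 & \gamma_s^{(abc)}(-c\vec u/w, -w/c)\, \gamma_s^{(abc)}(-c\vec u/w, w/c) = \gamma_s^{(abc)}\gamma_s^{(abc)*} = \left|\gamma_s^{(abc)}\right|^2
\end{aligned}
\tag{4.125b}
\]
and similarly
\[ \gamma_p\gamma_p' \to \left|\gamma_p^{(abc)}\right|^2 \tag{4.125c} \]
when $\vec u' \to -\vec u$ in the argument lists of $\gamma_p\gamma_p'$. Equation (4.97c) also shows that, when
\[ \vec\varepsilon = -c\vec u/w \quad\text{and}\quad \sigma = -w/c , \]
we can write
\[ \gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma)\, \gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma) = \gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma)\, \gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma)^{*} = \left|\gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma)\right|^2 \]
and
\[ \gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma)\, \gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma) = \gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma)\, \gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma)^{*} = \left|\gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma)\right|^2 . \]
At the extreme left-hand side of these last two formulas, we find $\gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma)\, \gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma)$ and
$\gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma)\, \gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma)$, and since it must always be true that
\[ \gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma)\, \gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma) = \gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma)\, \gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma) , \]
it follows that, examining the extreme right-hand sides of these two formulas,
\[ \left|\gamma_{s,p}^{(abc)}(\vec\varepsilon, \sigma)\right|^2 = \left|\gamma_{s,p}^{(abc)}(\vec\varepsilon, -\sigma)\right|^2 . \tag{4.125d} \]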
From the discussion following Eq. (4.83) in Sec. 4.12 above, we see that $W^2 = 1$ because $W$ must
be either 1 or −1. We also note, according to Eq. (2.122a) in Chapter 2, that
\[ \iint_{-\infty}^{\infty} dx\,dy\, e^{2\pi i[x\cdot(u_x + u_x') + y\cdot(u_y + u_y')]} = \int_{-\infty}^{\infty} dx\, e^{2\pi i x\cdot(u_x + u_x')} \int_{-\infty}^{\infty} dy\, e^{2\pi i y\cdot(u_y + u_y')} = \delta(u_x + u_x') \cdot \delta(u_y + u_y') . \]
Now Eq. (4.125a) can be written as
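\[
\begin{aligned}
\frac{1}{\mu_o} & \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \left(\tilde{\vec E}_{TA}^{(\mathrm{bal})} \times \tilde{\vec B}_{TA}^{(\mathrm{bal})}\right) \bullet \hat z \\
 = \frac{2c}{\mu_o} & \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{ \frac{|r|^2}{w^4} \cdot \Big[|r_s|^2 |t_s|^2 \left|\gamma_s^{(abc)}\right|^2 \left|\tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2 \\
 & \quad + |r_p|^2 |t_p|^2 \left|\gamma_p^{(abc)}\right|^2 \left|\tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\Big] \Big\} \\
 + \frac{Wc}{\mu_o} & \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u'\, \Big\{ \frac{|r|^2}{w^4} \cdot \iint_{-\infty}^{\infty} dx\,dy\, e^{2\pi i\left[x\left(u_x + u_x' - \frac{wa}{c}\right) + y\left(u_y + u_y' - \frac{wb}{c}\right)\right]} \\
 & \cdot e^{-2\pi i(w/c)\chi\sqrt{1 - (c^2u^2)/w^2}} \cdot \Big[|r_s|^2 |t_s|^2\, \gamma_s\gamma_s'\, \tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{xTA}\!\left(\tfrac{c\vec u'}{w}, z, \tfrac{w}{c}\right) \\
 & \quad + |r_p|^2 |t_p|^2\, \gamma_p\gamma_p'\, \tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{yTA}\!\left(\tfrac{c\vec u'}{w}, z, \tfrac{w}{c}\right)\Big] \Big\} \\
 + \frac{Wc}{\mu_o} & \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \iint_{-\infty}^{\infty} d^2u'\, \Big\{ \frac{|r|^2}{w^4} \cdot \iint_{-\infty}^{\infty} dx\,dy\, e^{2\pi i\left[x\left(u_x + u_x' + \frac{wa}{c}\right) + y\left(u_y + u_y' + \frac{wb}{c}\right)\right]} \\
 & \cdot e^{2\pi i(w/c)\chi\sqrt{1 - (c^2u'^2)/w^2}} \cdot \Big[|r_s|^2 |t_s|^2\, \gamma_s\gamma_s'\, \tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{xTA}\!\left(\tfrac{c\vec u'}{w}, z, \tfrac{w}{c}\right) \\
 & \quad + |r_p|^2 |t_p|^2\, \gamma_p\gamma_p'\, \tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{yTA}\!\left(\tfrac{c\vec u'}{w}, z, \tfrac{w}{c}\right)\Big] \Big\} ,
\end{aligned}
\tag{4.125e}
\]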
where Eqs. (4.112a), (4.112b), (4.125b), and (4.125c) are used to simplify the first set of integrals
on the right-hand side. Even though Eqs. (4.112a) and (4.112b) state an equality between
nonrandom quantities $E_{xTA}$ and $E_{yTA}$, we know this equality is also true for the random quantities
$\tilde E_{xTA}$ and $\tilde E_{yTA}$ because (4.112a) and (4.112b) must hold true for any radiation fields. The
\[ \iint_{-\infty}^{\infty} dx\,dy\, e^{2\pi i\left[x\cdot\left(u_x + u_x' \pm \frac{wa}{c}\right) + y\cdot\left(u_y + u_y' \pm \frac{wb}{c}\right)\right]} = \delta\!\left(u_x + u_x' \pm \frac{wa}{c}\right) \cdot \delta\!\left(u_y + u_y' \pm \frac{wb}{c}\right) . \tag{4.125f} \]
\[ e^{2\pi i\frac{w\chi}{c}\sqrt{1 - \frac{c^2}{w^2}\left(\vec u + \frac{w}{c}\vec\Delta\right)\bullet\left(\vec u + \frac{w}{c}\vec\Delta\right)}} \cong e^{2\pi i\frac{w\chi}{c}\sqrt{1 - \frac{u^2c^2}{w^2}}} , \tag{4.126a} \]
\[ \vec\Delta = a\hat x + b\hat y = 2(\hat n_M - \hat z) . \tag{4.126b} \]
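The size of the error made in an approximation like (4.126a) can be estimated numerically. The sketch below is illustrative and rests on assumed values: the off-axis parameter `eps` and tilt `delta` are hypothetical small vectors chosen only to mimic $|\vec\Delta| \ll |\vec\varepsilon| \ll 1$, and the helper name `sqrt_factor` is not from the text. It compares the exact change in the square-root factor with its first-order estimate $-\vec\varepsilon\bullet\vec\Delta/\sqrt{1-\varepsilon^2}$.

```python
import numpy as np

def sqrt_factor(eps_vec):
    """The factor sqrt(1 - |eps|^2) that multiplies the path difference in the phase."""
    return np.sqrt(1.0 - eps_vec @ eps_vec)

eps = np.array([0.01, 0.0])      # hypothetical off-axis direction parameter
delta = np.array([1.0e-5, 0.0])  # hypothetical, much smaller tilt vector

# Exact change in the square-root factor when eps is shifted by delta.
exact_change = sqrt_factor(eps + delta) - sqrt_factor(eps)

# First-order (linearized) estimate of the same change.
first_order_change = -(eps @ delta) / sqrt_factor(eps)

relative_mismatch = abs(exact_change - first_order_change) / abs(first_order_change)
```

Whether the resulting phase change is negligible depends on the product of this small number with $2\pi\sigma\chi$, so the check only quantifies the geometric part of the approximation.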
This new vector will make it easier to write down what happens to Eq. (4.125e) when we
substitute from (4.126a). Equations (4.100a) and (4.100b) hold true for all physically possible
radiation fields, so they must still be true when $E_{xTA}$ and $E_{yTA}$ are taken to be the random
quantities $\tilde E_{xTA}$ and $\tilde E_{yTA}$. Therefore we can take the complex conjugate of both sides of Eqs.
(4.100a) and (4.100b) to get
\[ \tilde E_{xTA}(\vec\varepsilon, z, \sigma) = \tilde E_{xTA}(\vec\varepsilon, z, -\sigma)^{*} \tag{4.126c} \]
and
\[ \tilde E_{yTA}(\vec\varepsilon, z, \sigma) = \tilde E_{yTA}(\vec\varepsilon, z, -\sigma)^{*} , \tag{4.126d} \]
where the $T, A$ subscripts are added because now we are explicitly acknowledging their time-chopped
and beam-chopped nature. Equation (4.124b) shows that when the argument $\vec u'$ of $\gamma_{s,p}'$
is replaced by $-(\vec u \pm w\vec\Delta/c)$, we get
\[ \gamma_{s,p}' \to \gamma_{s,p}^{(abc)}\!\left(\frac{-c\vec u}{w} \mp \vec\Delta, \frac{w}{c}\right) . \]
Examining the definition of $\vec\Delta$ in Eq. (4.126b), we note that the angle between $\hat n_M$ and $\hat z$ is
$O(\theta_d)$, which means, according to inequality (4.68) above, that the angle between $\hat n_M$ and $\hat z$
must be much smaller than the typical size of the off-axis propagation angle $\theta_b$. Although we
know from the discussion at the beginning of Appendix 4E that changing the propagation
direction by $\theta_b$ can significantly affect the value of the complex $\gamma_{s,p}^{(abc)}$ parameters, the discussion
at the end of Appendix 4E demonstrates that changing the direction of propagation by only an
$O(\theta_d)$ amount does not significantly affect $\gamma_{s,p}^{(abc)}$. Hence, Eqs. (4.125b) and (4.125c) still specify
what happens to $\gamma_s\gamma_s'$ and $\gamma_p\gamma_p'$ when $\vec u' \to -(\vec u \pm w\vec\Delta/c)$. Taking all this into account while
changing the double integrals over $dx\,dy$ in Eq. (4.125e) into the delta functions specified by
(4.125f) then leads to [applying (4.126a), (4.126c), and (4.126d)]
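\[
\begin{aligned}
\frac{1}{\mu_o} & \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \left(\tilde{\vec E}_{TA}^{(\mathrm{bal})} \times \tilde{\vec B}_{TA}^{(\mathrm{bal})}\right) \bullet \hat z \\
 = \frac{2c}{\mu_o} & \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{ \frac{|r|^2}{w^4} \Big[|r_s|^2 |t_s|^2 \left|\gamma_s^{(abc)}\right|^2 \left|\tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2 \\
 & \quad + |r_p|^2 |t_p|^2 \left|\gamma_p^{(abc)}\right|^2 \left|\tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right)\right|^2\Big] \Big\} \\
 + \frac{Wc}{\mu_o} & \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{ \frac{|r|^2}{w^4} \Big[|r_s|^2 |t_s|^2 \left|\gamma_s^{(abc)}\right|^2 \tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w} + \vec\Delta, z, -\tfrac{w}{c}\right)^{*} \\
 & \quad + |r_p|^2 |t_p|^2 \left|\gamma_p^{(abc)}\right|^2 \tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w} + \vec\Delta, z, -\tfrac{w}{c}\right)^{*}\Big]\, e^{-2\pi i\frac{w\chi}{c}\sqrt{1 - \frac{(cu)^2}{w^2}}} \Big\} \\
 + \frac{Wc}{\mu_o} & \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \Big\{ \frac{|r|^2}{w^4} \Big[|r_s|^2 |t_s|^2 \left|\gamma_s^{(abc)}\right|^2 \tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{xTA}\!\left(-\tfrac{c\vec u}{w} - \vec\Delta, z, -\tfrac{w}{c}\right)^{*} \\
 & \quad + |r_p|^2 |t_p|^2 \left|\gamma_p^{(abc)}\right|^2 \tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w}, z, -\tfrac{w}{c}\right) \tilde E_{yTA}\!\left(-\tfrac{c\vec u}{w} - \vec\Delta, z, -\tfrac{w}{c}\right)^{*}\Big]\, e^{2\pi i\frac{w\chi}{c}\sqrt{1 - \frac{(cu)^2}{w^2}}} \Big\} .
\end{aligned}
\tag{4.127a}
\]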
There is no point postponing any longer the application of the expectation operator E to both
sides of this formula. Because the expectation operator is linear with respect to nonrandom
quantities [see Eqs. (3.16a) and (3.17c) in Chapter 3], it can be taken inside all the integrals on
the right-hand side, which means Eq. (4.122b) can now be written as
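Average energy in balanced signal over time interval 2T and beam cross-section A
$$
\begin{aligned}
&= \mathrm{E}\bigg( \frac{1}{\mu_o}\int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\, \Big(\vec{E}^{(\mathrm{bal})}_{TA} \times \vec{B}^{(\mathrm{bal})}_{TA}\Big) \bullet \hat{z} \bigg) \\
&= \frac{2c}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \bigg\{ \frac{|r|^2}{w^4} \bigg[ |r_s|^2 |t_s|^2 \big|\gamma_s^{(abc)}\big|^2\, \mathrm{E}\Big( \big| E_{xTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big) \big|^2 \Big) + |r_p|^2 |t_p|^2 \big|\gamma_p^{(abc)}\big|^2\, \mathrm{E}\Big( \big| E_{yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big) \big|^2 \Big) \bigg] \bigg\} \\
&\quad + \frac{Wc}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \bigg\{ \frac{|r|^2}{w^4} \bigg[ |r_s|^2 |t_s|^2 \big|\gamma_s^{(abc)}\big|^2\, \mathrm{E}\Big( E_{xTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{xTA}\big(-\tfrac{c\vec{u}}{w} - \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big) \\
&\qquad\quad + |r_p|^2 |t_p|^2 \big|\gamma_p^{(abc)}\big|^2\, \mathrm{E}\Big( E_{yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{yTA}\big(-\tfrac{c\vec{u}}{w} - \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big) \bigg]\, e^{2\pi i \frac{w\chi}{c}\sqrt{1 - (cu)^2/w^2}} \bigg\} \\
&\quad + \frac{Wc}{\mu_o}\int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \bigg\{ \frac{|r|^2}{w^4} \bigg[ |r_s|^2 |t_s|^2 \big|\gamma_s^{(abc)}\big|^2\, \mathrm{E}\Big( E_{xTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{xTA}\big(-\tfrac{c\vec{u}}{w} + \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big) \\
&\qquad\quad + |r_p|^2 |t_p|^2 \big|\gamma_p^{(abc)}\big|^2\, \mathrm{E}\Big( E_{yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{yTA}\big(-\tfrac{c\vec{u}}{w} + \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big) \bigg]\, e^{-2\pi i \frac{w\chi}{c}\sqrt{1 - (cu)^2/w^2}} \bigg\}.
\end{aligned}
\tag{4.127b}
$$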
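The key terms in Eq. (4.127b) are the expectation values of the random variables
$$\mathrm{E}\big(|E_{xTA}|^2\big), \quad \mathrm{E}\big(|E_{yTA}|^2\big),$$
$$\mathrm{E}\Big( E_{xTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{xTA}\big(-\tfrac{c\vec{u}}{w} \pm \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big),$$
and
$$\mathrm{E}\Big( E_{yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{yTA}\big(-\tfrac{c\vec{u}}{w} \pm \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big).$$
We learned how to handle terms such as $\mathrm{E}(|E_{xTA}|^2)$ and $\mathrm{E}(|E_{yTA}|^2)$ in Sec. 4.14 [see Eqs. (4.120a) and (4.120b)], but what about
$$\mathrm{E}\Big( E_{xTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{xTA}\big(-\tfrac{c\vec{u}}{w} \pm \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big)$$
and
$$\mathrm{E}\Big( E_{yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{yTA}\big(-\tfrac{c\vec{u}}{w} \pm \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big)\,?$$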
To evaluate this new type of term, we return to Eq. (4.108a) above, making the radiation field
random and taking x and y components to get
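$$E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left[ c\,w^{-2}\, E_{xTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big) \right] e^{2\pi i[\vec{u} \bullet \vec{\rho} + wt]} \tag{4.128a}$$
and
$$E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t) = \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left[ c\,w^{-2}\, E_{yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big) \right] e^{2\pi i[\vec{u} \bullet \vec{\rho} + wt]}. \tag{4.128b}$$
Inverting these Fourier transforms to solve for $c\,w^{-2} E_{xTA}$ and $c\,w^{-2} E_{yTA}$, we get
$$c\,w^{-2}\, E_{xTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; E^{(\mathrm{in})}_{xTA}(\vec{\rho}, z, t)\, e^{-2\pi i[\vec{u} \bullet \vec{\rho} + wt]} \tag{4.129a}$$
and
$$c\,w^{-2}\, E_{yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; E^{(\mathrm{in})}_{yTA}(\vec{\rho}, z, t)\, e^{-2\pi i[\vec{u} \bullet \vec{\rho} + wt]}. \tag{4.129b}$$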
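Substituting these expressions into the expectation values gives
$$
\begin{aligned}
&\mathrm{E}\Big( E_{x,yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{x,yTA}\big(-\tfrac{c\vec{u}}{w} \pm \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big) \\
&\quad = \mathrm{E}\bigg( \bigg[ \frac{w^2}{c} \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; E^{(\mathrm{in})}_{x,yTA}(\vec{\rho}, z, t)\, e^{-2\pi i[\vec{u} \bullet \vec{\rho} + wt]} \bigg] \\
&\qquad\quad \cdot \bigg[ \frac{w^2}{c} \int_{-\infty}^{\infty} dt' \iint_{-\infty}^{\infty} d^2\rho'\; E^{(\mathrm{in})}_{x,yTA}(\vec{\rho}\,', z, t')\, e^{2\pi i[(\vec{u} \mp (w/c)\vec{\Delta}) \bullet \vec{\rho}\,' + wt']} \bigg] \bigg) \\
&\quad = \frac{w^4}{c^2} \int_{-\infty}^{\infty} dt \int_{-\infty}^{\infty} dt' \iint_{-\infty}^{\infty} d^2\rho \iint_{-\infty}^{\infty} d^2\rho'\; e^{-2\pi i[\vec{u} \bullet (\vec{\rho} - \vec{\rho}\,') + w(t - t') \pm (w/c)\vec{\Delta} \bullet \vec{\rho}\,']}\; \mathrm{E}\Big( E^{(\mathrm{in})}_{x,yTA}(\vec{\rho}, z, t)\, E^{(\mathrm{in})}_{x,yTA}(\vec{\rho}\,', z, t') \Big).
\end{aligned}
\tag{4.130}
$$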
It is important to remember, when using this approximation, that $R_{x,y}$ are the three-dimensional autocorrelation functions of the x and y radiation field components before they enter the interferometer [see Eqs. (4C.3a) and (4C.3b) in Appendix 4C]. The $\Pi(t, T)$ and $\Pi(\vec{\rho}; A)$ functions are defined in Appendix 4C to be69
$$\Pi(t, T) = \begin{cases} 1 & \text{for } |t| \le T \\ 0 & \text{for } |t| > T \end{cases} \tag{4.131b}$$
$$\Pi(\vec{\rho}; A) = \Pi(x, y; A) = \begin{cases} 1 & \text{when point } \vec{\rho} = (x, y) \text{ lies inside or on the edge of the beam of cross-sectional area } A \\ 0 & \text{when point } \vec{\rho} = (x, y) \text{ lies outside the beam of cross-sectional area } A. \end{cases} \tag{4.131c}$$
These functions approximate what happens to the original autocorrelation function Rx,y when
radiation enters the interferometer; they make explicit the time-chopped and beam-chopped
nature of the interferometer signal (see discussion in Secs. 4.9 and 4.10 above). Substitution of
(4.131a) into (4.130) gives
69
This formula for $\Pi(t, T)$ in Eq. (4.131b) is similar to the formula for $\Pi(t, T)$ given in Eq. (2.56c) of Chapter 2, differing only in the value specified for $\Pi$ at $|t| = T$.
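$$
\begin{aligned}
&\mathrm{E}\Big( E_{x,yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{x,yTA}\big(-\tfrac{c\vec{u}}{w} \pm \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big) \\
&\quad \cong \frac{w^4}{c^2} \int_{-\infty}^{\infty} \Pi(t', T)\, dt' \iint_{-\infty}^{\infty} \Pi(\vec{\rho}\,'; A)\, e^{\mp 2\pi i (w/c)(\vec{\rho}\,' \bullet \vec{\Delta})}\, d^2\rho' \int_{-\infty}^{\infty} \Pi(t, T)\, dt \iint_{-\infty}^{\infty} \Big[ \Pi(\vec{\rho}; A) \\
&\qquad\quad \cdot\, e^{-2\pi i[\vec{u} \bullet (\vec{\rho} - \vec{\rho}\,') + w(t - t')]} \cdot R_{x,y}(\vec{\rho} - \vec{\rho}\,', t - t', z)\, d^2\rho \Big].
\end{aligned}
\tag{4.132a}
$$
Transforming the variables of integration to $\vec{\rho}\,'' = \vec{\rho} - \vec{\rho}\,'$ and $t'' = t - t'$ so that $dt'' = dt$ and $d^2\rho'' = d^2\rho$ changes the formula to
$$
\begin{aligned}
&\mathrm{E}\Big( E_{x,yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{x,yTA}\big(-\tfrac{c\vec{u}}{w} \pm \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big) \\
&\quad \cong \frac{w^4}{c^2} \int_{-\infty}^{\infty} \Pi(t', T)\, dt' \iint_{-\infty}^{\infty} \Pi(\vec{\rho}\,'; A)\, e^{\mp 2\pi i (w/c)(\vec{\rho}\,' \bullet \vec{\Delta})}\, d^2\rho' \int_{-\infty}^{\infty} \Pi(t'' + t', T)\, dt'' \iint_{-\infty}^{\infty} \Big[ \Pi(\vec{\rho}\,'' + \vec{\rho}\,'; A) \\
&\qquad\quad \cdot\, e^{-2\pi i[\vec{u} \bullet \vec{\rho}\,'' + wt'']} \cdot R_{x,y}(\vec{\rho}\,'', t'', z)\, d^2\rho'' \Big].
\end{aligned}
\tag{4.132b}
$$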
We note that, in the limit as $T \to \infty$ and $A \to \infty$, the inner integrals over $dt''$ and $d^2\rho''$ become the three-dimensional Fourier transform of $R_{x,y}(\vec{\rho}\,'', t'', z)$:
$$
\begin{aligned}
&\lim_{T \to \infty,\; A \to \infty} \int_{-\infty}^{\infty} \Pi(t'' + t', T)\, dt'' \iint_{-\infty}^{\infty} \Big[ \Pi(\vec{\rho}\,'' + \vec{\rho}\,'; A)\, e^{-2\pi i[\vec{u} \bullet \vec{\rho}\,'' + wt'']} \cdot R_{x,y}(\vec{\rho}\,'', t'', z)\, d^2\rho'' \Big] \\
&\quad = \int_{-\infty}^{\infty} dt'' \iint_{-\infty}^{\infty} R_{x,y}(\vec{\rho}\,'', t'', z)\, e^{-2\pi i[\vec{u} \bullet \vec{\rho}\,'' + wt'']}\, d^2\rho''.
\end{aligned}
\tag{4.133a}
$$
According to Eqs. (4C.5a) and (4C.5b) in Appendix 4C, the three-dimensional Fourier transform of $R_{x,y}(\vec{\rho}\,'', t'', z)$ is
$$\int_{-\infty}^{\infty} dt'' \iint_{-\infty}^{\infty} d^2\rho''\; R_{x,y}(\vec{\rho}\,'', t'', z)\, e^{-2\pi i(\vec{u} \bullet \vec{\rho}\,'' + wt'')} = S_{x,y}(\vec{u}, w), \tag{4.133b}$$
where, as was discussed at the end of Appendix 4C, functions Sx,y do not need to have z as part of
their argument list because they do not depend on that variable. Equations (4.133a) and (4.133b)
can be combined to give
$$\lim_{T \to \infty,\; A \to \infty} \int_{-\infty}^{\infty} \Pi(t'' + t', T)\, dt'' \iint_{-\infty}^{\infty} \Big[ \Pi(\vec{\rho}\,'' + \vec{\rho}\,'; A)\, e^{-2\pi i[\vec{u} \bullet \vec{\rho}\,'' + wt'']} \cdot R_{x,y}(\vec{\rho}\,'', t'', z)\, d^2\rho'' \Big] = S_{x,y}(\vec{u}, w). \tag{4.133c}$$
Following the same reasoning used in the discussion following Eq. (4.117b) above, we assume that in a well-designed interferometer A and T are large enough (in fact, that a relatively small patch of A, say A/100 or A/1000, is large enough and a similarly small fraction of T is large enough) for the left-hand side of (4.133c) to be approximately equal to its limit:
$$\int_{-\infty}^{\infty} \Pi(t'' + t', T)\, dt'' \iint_{-\infty}^{\infty} \Big[ \Pi(\vec{\rho}\,'' + \vec{\rho}\,'; A)\, e^{-2\pi i[\vec{u} \bullet \vec{\rho}\,'' + wt'']} \cdot R_{x,y}(\vec{\rho}\,'', t'', z)\, d^2\rho'' \Big] \cong S_{x,y}(\vec{u}, w). \tag{4.133d}$$
Another way of looking at this is to say that only the values of $R_{x,y}$ reasonably near $\vec{\rho}\,'' = 0$ and $t'' = 0$ contribute significantly to the $S_{x,y}$ Fourier transform. When using this approximation, the inner integrals in (4.132b) no longer depend on variables $t'$ and $\vec{\rho}\,'$, except for a relatively small border region around the edge of A and relatively small time durations at the beginning and end of T. Neglecting the contribution of these small border regions and time durations, we make the approximation that
$$
\begin{aligned}
&\mathrm{E}\Big( E_{x,yTA}\big(-\tfrac{c\vec{u}}{w}, z, -\tfrac{w}{c}\big)\, E_{x,yTA}\big(-\tfrac{c\vec{u}}{w} \pm \vec{\Delta}, z, -\tfrac{w}{c}\big)^{*} \Big) \\
&\quad \cong \frac{w^4}{c^2}\, S_{x,y}(\vec{u}, w) \int_{-\infty}^{\infty} \Pi(t', T)\, dt' \iint_{-\infty}^{\infty} \Pi(\vec{\rho}\,'; A)\, e^{\mp 2\pi i (w/c)(\vec{\rho}\,' \bullet \vec{\Delta})}\, d^2\rho'.
\end{aligned}
\tag{4.133e}
$$
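At this point, we have everything needed to put together the interferometer's balanced-signal equations. Using Eqs. (4.102a)–(4.102d) to transform the variables of integration in Eq. (4.127b) to $\sigma = -(w/c)$ and $\vec{\varepsilon} = -c(\vec{u}/w)$ gives
Average energy in balanced signal over time interval 2T and beam cross section A
$$
\begin{aligned}
&= 2\varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\, \Big\{ |r(\sigma)|^2 \cdot \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2\, \mathrm{E}\big( |E_{xTA}(\vec{\varepsilon}, z, \sigma)|^2 \big) \\
&\qquad\quad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2\, \mathrm{E}\big( |E_{yTA}(\vec{\varepsilon}, z, \sigma)|^2 \big) \Big] \Big\} \\
&\quad + W\varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\, \Big\{ |r(\sigma)|^2 \cdot \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2\, \mathrm{E}\big( E_{xTA}(\vec{\varepsilon}, z, \sigma)\, E_{xTA}(\vec{\varepsilon} - \vec{\Delta}, z, \sigma)^{*} \big) \\
&\qquad\quad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2\, \mathrm{E}\big( E_{yTA}(\vec{\varepsilon}, z, \sigma)\, E_{yTA}(\vec{\varepsilon} - \vec{\Delta}, z, \sigma)^{*} \big) \Big]\, e^{-2\pi i \sigma\chi\sqrt{1 - \varepsilon^2}} \Big\} \\
&\quad + W\varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\, \Big\{ |r(\sigma)|^2 \cdot \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2\, \mathrm{E}\big( E_{xTA}(\vec{\varepsilon}, z, \sigma)\, E_{xTA}(\vec{\varepsilon} + \vec{\Delta}, z, \sigma)^{*} \big) \\
&\qquad\quad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2\, \mathrm{E}\big( E_{yTA}(\vec{\varepsilon}, z, \sigma)\, E_{yTA}(\vec{\varepsilon} + \vec{\Delta}, z, \sigma)^{*} \big) \Big]\, e^{2\pi i \sigma\chi\sqrt{1 - \varepsilon^2}} \Big\}.
\end{aligned}
\tag{4.134a}
$$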
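Here rule (4E.6a) in Appendix 4E is used to acknowledge that $\gamma_s^{(abc)}$ and $\gamma_p^{(abc)}$ are functions only of the wavenumber $\sigma$ for on-axis and slightly off-axis plane waves, and once again Eq. (4.1e) is used to simplify the constant outside the integrals. Converting the arguments in (4.133e) to $\sigma = -(w/c)$ and $\vec{\varepsilon} = -c(\vec{u}/w)$ gives
$$
\begin{aligned}
\mathrm{E}\big( E_{x,yTA}(\vec{\varepsilon}, z, \sigma)\, E_{x,yTA}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^{*} \big)
&\cong c^2\sigma^4\, S_{x,y}(\sigma\vec{\varepsilon}, -\sigma c) \int_{-\infty}^{\infty} \Pi(t', T)\, dt' \iint_{-\infty}^{\infty} \Pi(\vec{\rho}\,'; A)\, e^{\pm 2\pi i (\vec{\rho}\,' \bullet (\sigma\vec{\Delta}))}\, d^2\rho' \\
&= c^2\sigma^4\, S_{x,y}(\sigma\vec{\varepsilon}, -\sigma c) \cdot 2T \cdot \Pi_A(\mp\sigma\vec{\Delta}),
\end{aligned}
\tag{4.134b}
$$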
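where
$$\Pi_A(\vec{u}) = \iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho}; A)\, e^{-2\pi i \vec{\rho} \bullet \vec{u}} \tag{4.134c}$$
is the two-dimensional forward Fourier transform of the beam's pupil function $\Pi(\vec{\rho}; A)$ defined in Eq. (4.131c). Because $\Pi(\vec{\rho}; A)$ is strictly real, we see that
$$\Pi_A(-\vec{u}) = \iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho}; A)\, e^{2\pi i \vec{\rho} \bullet \vec{u}} = \left( \iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho}; A)\, e^{-2\pi i \vec{\rho} \bullet \vec{u}} \right)^{*}$$
or
$$\Pi_A(-\vec{u}) = \Pi_A(\vec{u})^{*}. \tag{4.134d}$$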
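The symmetry in Eq. (4.134d) can be checked numerically. The sketch below is only an illustration under stated assumptions: it approximates the transform in Eq. (4.134c) by a Riemann sum over a hypothetical off-center rectangular beam (chosen so that the transform is genuinely complex), and the names `pupil_transform` and `offcenter_rectangle` are invented for this example.

```python
import cmath

def pupil_transform(points, cell_area, u):
    """Riemann-sum estimate of Pi_A(u) = integral of Pi(rho; A) * exp(-2*pi*i*rho.u) d^2 rho.

    `points` are grid-cell centers where the pupil function equals 1.
    """
    ux, uy = u
    return cell_area * sum(
        cmath.exp(-2j * cmath.pi * (x * ux + y * uy)) for x, y in points
    )

def offcenter_rectangle(step=0.05):
    """Cell centers of a hypothetical beam covering 0 <= x <= 1, 0 <= y <= 0.5."""
    nx = int(round(1.0 / step))
    ny = int(round(0.5 / step))
    points = [((i + 0.5) * step, (j + 0.5) * step)
              for i in range(nx) for j in range(ny)]
    return points, step * step

points, cell_area = offcenter_rectangle()
u = (0.7, -0.3)                       # arbitrary spatial-frequency vector
forward = pupil_transform(points, cell_area, u)
mirrored = pupil_transform(points, cell_area, (-u[0], -u[1]))

# Eq. (4.134d): Pi_A(-u) should equal the complex conjugate of Pi_A(u).
print(abs(mirrored - forward.conjugate()))
# At u = 0 the transform reduces to the beam area A (here 0.5).
print(pupil_transform(points, cell_area, (0.0, 0.0)).real)
```

Because the pupil function is real, the check holds for any beam shape; only the value of the transform itself depends on the shape chosen.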
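Equations (4.120a) and (4.120b) let us substitute for $S_{x,y}$ in (4.134b) to get
$$\mathrm{E}\big( E_{xTA}(\vec{\varepsilon}, z, \sigma)\, E_{xTA}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^{*} \big) \cong 2T\mu_o c^2\sigma^2\, \mathcal{L}_x(\vec{\varepsilon}, \sigma)\, \Pi_A(\mp\sigma\vec{\Delta}) \tag{4.135a}$$
and
$$\mathrm{E}\big( E_{yTA}(\vec{\varepsilon}, z, \sigma)\, E_{yTA}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^{*} \big) \cong 2T\mu_o c^2\sigma^2\, \mathcal{L}_y(\vec{\varepsilon}, \sigma)\, \Pi_A(\mp\sigma\vec{\Delta}). \tag{4.135b}$$
These last two results, together with $\varepsilon_o\mu_o = c^{-2}$ from Eq. (4.1e), can be substituted into (4.134a) to give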
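Average energy in balanced signal over time interval 2T and beam cross section A
$$
\begin{aligned}
&= 2\varepsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\, \Big\{ |r(\sigma)|^2 \cdot \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2\, \mathrm{E}\big( |E_{xTA}(\vec{\varepsilon}, z, \sigma)|^2 \big) \\
&\qquad\quad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2\, \mathrm{E}\big( |E_{yTA}(\vec{\varepsilon}, z, \sigma)|^2 \big) \Big] \Big\} \\
&\quad + 2TW \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{ |r(\sigma)|^2\, \Pi_A(\sigma\vec{\Delta}) \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2 \mathcal{L}_x(\vec{\varepsilon}, \sigma) \\
&\qquad\quad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2 \mathcal{L}_y(\vec{\varepsilon}, \sigma) \Big]\, e^{-2\pi i \sigma\chi\sqrt{1 - \varepsilon^2}} \Big\} \\
&\quad + 2TW \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{ |r(\sigma)|^2\, \Pi_A(-\sigma\vec{\Delta}) \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2 \mathcal{L}_x(\vec{\varepsilon}, \sigma) \\
&\qquad\quad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2 \mathcal{L}_y(\vec{\varepsilon}, \sigma) \Big]\, e^{2\pi i \sigma\chi\sqrt{1 - \varepsilon^2}} \Big\}.
\end{aligned}
\tag{4.135c}
$$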
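Returning again to Eqs. (4.120a) and (4.120b), this time to substitute for $\mathrm{E}(|E_{x,yTA}|^2)$, gives
Average energy in balanced signal over time interval 2T and beam cross section A
$$
\begin{aligned}
&= 4TA \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{ |r(\sigma)|^2 \cdot \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2 \mathcal{L}_x(\vec{\varepsilon}, \sigma) + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2 \mathcal{L}_y(\vec{\varepsilon}, \sigma) \Big] \Big\} \\
&\quad + 2TW \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{ |r(\sigma)|^2 \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2 \mathcal{L}_x(\vec{\varepsilon}, \sigma) + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2 \mathcal{L}_y(\vec{\varepsilon}, \sigma) \Big] \\
&\qquad\quad \cdot \Big[ \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma\chi\sqrt{1 - \varepsilon^2}} + \Pi_A(\sigma\vec{\Delta})^{*}\, e^{2\pi i \sigma\chi\sqrt{1 - \varepsilon^2}} \Big] \Big\},
\end{aligned}
\tag{4.135d}
$$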
where Eq. (4.134d) is used to replace $\Pi_A(-\sigma\vec{\Delta})$ by $\Pi_A(\sigma\vec{\Delta})^{*}$. From the definitions of $\mathcal{L}_{x,y}$ in the discussion preceding Eqs. (4.120a), we know that $\mathcal{L}_x(\vec{\varepsilon}, \sigma)\, d^2\varepsilon$ is the x-polarized optical power per unit area of the beam and per unit wavenumber interval at wavenumber $\sigma$ that is inside the $d^2\varepsilon$ solid angle and traveling in the direction of the propagation vector $\vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2}$. A similar statement can be made about $\mathcal{L}_y(\vec{\varepsilon}, \sigma)\, d^2\varepsilon$: it is the y-polarized optical power per unit area of the beam and per unit wavenumber interval at wavenumber $\sigma$ that is inside the $d^2\varepsilon$ solid angle and traveling in the direction of the same propagation vector. The discussion following Eq. (4.120b) points out that $\mathcal{L}_x$ and $\mathcal{L}_y$ must represent direction-chopped radiation, with both $\mathcal{L}_x$ and $\mathcal{L}_y$ negligible for $\vec{\varepsilon}$ values specifying propagation directions that are not parallel to, or nearly parallel to, the optical axis $\hat{z}$. Hence $\mathcal{L}_x$ and $\mathcal{L}_y$ are negligible for those propagation directions that cannot enter the interferometer because they lie outside the interferometer's field of view, and we can regard the integrals over $d^2\varepsilon$ as occurring only over the interferometer's field of view. We define $P_{bal}(\chi)$ to be the time-averaged power in the balanced signal from the beam of cross-sectional area A at an OPD value of $\chi$. Dividing both sides of (4.135d) by 2T then gives, after using that $\mathrm{Re}(c) = (c + c^{*})/2$ for any complex number c,
$$
\begin{aligned}
P_{bal}(\chi) = 2A \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, \bigg\{ &|r(\sigma)|^2 \cdot \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2 \mathcal{L}_x(\vec{\varepsilon}, \sigma) + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2 \mathcal{L}_y(\vec{\varepsilon}, \sigma) \Big] \\
&\cdot \Big[ 1 + \frac{W}{A}\, \mathrm{Re}\Big( \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma\chi\cos\theta} \Big) \Big] \bigg\},
\end{aligned}
\tag{4.135e}
$$
where
$$\cos\theta = \sqrt{1 - \varepsilon^2} = \text{cosine of the angle between the propagation vector } \hat{\Omega} \text{ and the } z \text{ axis.} \tag{4.135f}$$
We note that by definition $\theta$ is the same as angle $\theta_b$ used in Sec. 4.12 and Appendix 4B. Writing the integral over $d^2\varepsilon$ like this lets us think of $\mathcal{L}_x$ and $\mathcal{L}_y$ as representing the radiation field before it becomes direction-chopped, always assuming, of course, that direction-chopping the incident radiation does not significantly change $\mathcal{L}_x$ and $\mathcal{L}_y$. Equation (4.135e) makes it clear that the triple integral over $d\sigma$ and $d^2\varepsilon$ must be real because the quantity being integrated is always real.
4.16 Simplified Formulas for the Optical Power in the Balanced Signal
Equation (4.135e) specifies the optical power in the balanced signal when $\mathcal{L}_x \ne \mathcal{L}_y$ so that the incident radiation is polarized, when $\mathcal{L}_x$ and $\mathcal{L}_y$ depend on $\vec{\varepsilon}$ as well as $\sigma$ so that there is both spectral and intensity variation across the interferometer's field of view, and when $\vec{\Delta} \ne 0$ so that there are small misalignments in the moving mirror. This section strips away these effects step by step, eventually arriving at the same formula for the optical power in the balanced signal for an ideal interferometer that was presented in Eq. (1.19f) of Chapter 1.
The first step is to specify unpolarized incident radiation, which we do by setting
$$\mathcal{L}_x(\vec{\varepsilon}, \sigma) = \mathcal{L}_y(\vec{\varepsilon}, \sigma) = \frac{1}{2}\, \mathcal{L}(\vec{\varepsilon}, \sigma). \tag{4.136a}$$
Here, the incident radiation is made unpolarized by splitting the total power equally between the two possibilities: x polarization and y polarization. From Eqs. (4.121a) and (4.121b), we see that both $\mathcal{L}_x$ and $\mathcal{L}_y$ are even functions of $\sigma$, requiring $\mathcal{L}$ to be another even function of $\sigma$:
$$\mathcal{L}(\vec{\varepsilon}, -\sigma) = \mathcal{L}(\vec{\varepsilon}, \sigma). \tag{4.136b}$$
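Substituting (4.136a) into (4.135e) then gives
$$
\begin{aligned}
P_{bal}(\chi) = A \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, \bigg\{ &|r(\sigma)|^2 \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2 + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2 \Big]\, \mathcal{L}(\vec{\varepsilon}, \sigma) \\
&\cdot \Big[ 1 + \frac{W}{A}\, \mathrm{Re}\Big( \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma\chi\cos\theta} \Big) \Big] \bigg\}.
\end{aligned}
\tag{4.136c}
$$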
Glancing back to the definitions of $\mathcal{L}_x$ and $\mathcal{L}_y$ [see discussion preceding Eq. (4.120a) above], we recognize $\mathcal{L}(\vec{\varepsilon}, \sigma) = \mathcal{L}_x(\vec{\varepsilon}, \sigma) + \mathcal{L}_y(\vec{\varepsilon}, \sigma)$ to be the total optical power per unit cross-sectional area of this unpolarized beam per unit solid angle per unit wavenumber interval at wavenumber $\sigma$. As an argument of function $\mathcal{L}$, the wavenumber $\sigma$ takes on negative as well as positive values: $-\infty < \sigma < \infty$. This makes $\mathcal{L}$ analogous to a double-sided power spectrum [in Chapter 3, see Sec. 3.20 and the discussion following Eq. (3.57g)]. In radiometry the spectral radiance of an optical field is the transmitted optical power per unit area transverse to the direction of propagation per unit solid angle in the direction of propagation per unit wavenumber interval. This is the same meaning we have attached to $\mathcal{L}$; however, in radiometry, the wavenumber $\sigma$ is always positive.70 This makes the radiometric spectral radiance of the optical field analogous to a single-sided power spectrum. Because the radiation passing through the interferometer is direction-chopped, ensuring that all the propagation vectors are parallel to, or nearly parallel to, the $\hat{z}$ axis, a unit cross-sectional area of the beam is approximately the same as a unit area transverse to the radiation's direction of propagation; and a solid angle $d^2\varepsilon$ in Eq. (4.135e) is approximately the same as a solid angle in the direction of propagation. Hence we could interpret $\mathcal{L}(\vec{\varepsilon}, \sigma)$ as the radiometric spectral radiance of the optical field if $\mathcal{L}$ were not in fact defined for both positive and negative wavenumbers, making it analogous to a double-sided rather than a single-sided power spectrum. Therefore we use the standard conversion for going from double-sided to single-sided power spectra [see Eq. (3.58b) in Chapter 3] to define the spectral radiance $L$ of the optical field as
$$L(\vec{\varepsilon}, \sigma) = 2\,\mathcal{L}(\vec{\varepsilon}, \sigma) \quad \text{for } \sigma \ge 0. \tag{4.136d}$$
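The next step is to assume no spectral or intensity variation across the interferometer's field of view, which means we suppress the dependence of $\mathcal{L}$ and $L$ on $\vec{\varepsilon}$ and write Eqs. (4.136a), (4.136b), and (4.136d) as
$$\mathcal{L}_x(\vec{\varepsilon}, \sigma) = \mathcal{L}_y(\vec{\varepsilon}, \sigma) = \frac{1}{2}\, \mathcal{L}(\vec{\varepsilon}, \sigma) = \frac{1}{2}\, \mathcal{L}(\sigma), \tag{4.136e}$$
with
$$\mathcal{L}(-\sigma) = \mathcal{L}(\sigma) \tag{4.136f}$$
and
$$L(\sigma) = 2\,\mathcal{L}(\sigma) \quad \text{for } \sigma \ge 0. \tag{4.136g}$$
The balanced-signal power can then be written as
$$P_{bal}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\; \eta(\sigma)\, \mathcal{L}(\sigma) \cdot \Big[ 1 + \frac{W}{A}\, \mathrm{Re}\Big( \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma\chi\cos\theta} \Big) \Big] \tag{4.136h}$$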
70
See Table 1-2 on page 1-4 in The Infrared Handbook, edited by William L. Wolfe and George J. Zissis, rev. ed.
(Infrared Information Analysis Center of the Environmental Research Institute of Michigan, 1985).
$$\eta(\sigma) = 2\,|r(\sigma)|^2 \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2 + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2 \Big]. \tag{4.136i}$$
so that $\eta = 1$. For realistic interferometers, we expect $0 < \eta < 1$; and the closer $\eta$ is to one, the more nearly ideal is the performance of the interferometer's optical components (i.e., the beam splitter, compensator, and return mirrors in Figs. 4.16, 4.17, and 4.19).
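As a quick numerical illustration of Eq. (4.136i), the sketch below evaluates $\eta$ at a single wavenumber. The function name `efficiency` and the sample coefficient values are hypothetical. The "ideal" case assumed here is a lossless 50/50 beam splitter ($|r_s|^2 = |t_s|^2 = |r_p|^2 = |t_p|^2 = 1/2$), perfectly reflecting return mirrors ($|r| = 1$), and $|\gamma^{(abc)}_{s,p}| = 1$; that choice is an assumption made for the example, not a statement taken from the text.

```python
import math

def efficiency(r, r_s, t_s, gamma_s, r_p, t_p, gamma_p):
    """Evaluate eta from Eq. (4.136i) for complex amplitude coefficients at one wavenumber."""
    def mag2(z):
        return abs(z) ** 2
    s_term = mag2(r_s) * mag2(t_s) * mag2(gamma_s)
    p_term = mag2(r_p) * mag2(t_p) * mag2(gamma_p)
    return 2.0 * mag2(r) * (s_term + p_term)

h = 1.0 / math.sqrt(2.0)   # amplitude coefficient of an assumed ideal 50/50 splitter

# Assumed ideal components: eta evaluates to 1 (up to rounding).
eta_ideal = efficiency(1.0, h, h, 1.0, h, h, 1.0)

# Hypothetical lossy components (made-up numbers): eta falls between 0 and 1.
eta_lossy = efficiency(0.98, 0.68, 0.70, 0.99, 0.62, 0.75, 0.99)

print(eta_ideal, eta_lossy)
```

Complex arguments are accepted because only the squared magnitudes enter Eq. (4.136i).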
Traditional interferometers have beams with a circular cross section. Equation (4D.6b) in
Appendix 4D gives the formula for the two-dimensional forward Fourier transform of a circular
pupil function when R is the pupil radius,
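$$\Pi_A(\vec{u})\Big|_{\substack{\text{circle of}\\ \text{radius } R}} = R \cdot \frac{J_1(2\pi R\,|\vec{u}|)}{|\vec{u}|}. \tag{4.137a}$$
Here $J_1$ is the first-order Bessel function of the first kind. We note that this two-dimensional Fourier transform depends only on the magnitude of vector $\vec{u}$; and, since $J_1$ is always real-valued for real arguments (see Fig. 4.23), the transform is always real. Substitution of $\sigma\vec{\Delta}$ for $\vec{u}$ in (4.137a) gives
$$\Pi_A(\sigma\vec{\Delta})\Big|_{\substack{\text{circle of}\\ \text{radius } R}} = R \cdot \frac{J_1(2\pi R \cdot |\sigma| \cdot |\vec{\Delta}|)}{|\sigma| \cdot |\vec{\Delta}|} = R \cdot \frac{J_1(4\pi R \cdot |\sigma| \cdot |\hat{n}_M - \hat{z}|)}{2 \cdot |\sigma| \cdot |\hat{n}_M - \hat{z}|}, \tag{4.137b}$$
where in the last step we replace $\vec{\Delta}$ with its definition from Eq. (4.126b): $\vec{\Delta} = 2(\hat{n}_M - \hat{z})$.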
FIGURE 4.23. [Plot of $J_1(x)$ versus $x$ for $-30 \le x \le 30$.]
FIGURE 4.24. [Plot of $2J_1(x)/x$ versus $x$ for $-30 \le x \le 30$.]
Both $\hat{n}_M$, the normal vector to the reflecting surface of the moving mirror, and $\hat{z}$, the vector pointing down the interferometer beam along the optical axis, have unit length. Because the angle between them is always small (the moving mirror is assumed to be only slightly misaligned), it follows that $|\hat{n}_M - \hat{z}|$ is the misalignment angle of the moving mirror with respect to the optical axis and [see discussion following Eq. (4B.4h) in Appendix 4B] the angle between rays reflecting from a perfectly aligned moving mirror and rays reflecting from a slightly misaligned moving mirror is always $\theta_d = 2\,|\hat{n}_M - \hat{z}|$. We define
$$\theta_{ma} = |\hat{n}_M - \hat{z}| \tag{4.137c}$$
to be the misalignment angle of the interferometer's moving mirror for a beam with a circular cross section and write (4.137b) as
$$\Pi_A(\sigma\vec{\Delta})\Big|_{\substack{\text{circle of}\\ \text{radius } R}} = R \cdot \frac{J_1(4\pi R \cdot |\sigma| \cdot \theta_{ma})}{2 \cdot |\sigma| \cdot \theta_{ma}}. \tag{4.137d}$$
It follows that
$$\frac{1}{A}\, \Pi_A(\sigma\vec{\Delta})\Big|_{\substack{\text{circle of}\\ \text{radius } R}} = \frac{J_1(4\pi R \cdot |\sigma| \cdot \theta_{ma})}{2\pi R \cdot |\sigma| \cdot \theta_{ma}} \tag{4.137e}$$
because the area of a circular beam is, of course, $A = \pi R^2$. To see how this function behaves, we note that the right-hand side of (4.137e) can be written as
$$2\, \frac{J_1(x)}{x} \quad \text{for } x = 4\pi R\, |\sigma|\, \theta_{ma}.$$
Because
$$J_1(-x) = -J_1(x), \tag{4.137f}$$
function $J_1$ is an odd function (see Sec. 2.3 of Chapter 2 for a description of what an odd function is).71 This means that
$$\frac{J_1(4\pi R(-\sigma)\theta_{ma})}{2\pi R(-\sigma)\theta_{ma}} = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}, \tag{4.137g}$$
which shows that
$$\frac{1}{A}\, \Pi_A(\sigma\vec{\Delta})\Big|_{\substack{\text{circle of}\\ \text{radius } R}}$$
is an even function of $\sigma$. Consequently the absolute value signs can be dropped from $\sigma$ so that Eq. (4.137e) becomes
$$\frac{1}{A}\, \Pi_A(\sigma\vec{\Delta})\Big|_{\substack{\text{circle of}\\ \text{radius } R}} = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}. \tag{4.137h}$$
71
The standard series formula for J1 shows at once that it is odd. See Eq. (9.1.10) in Handbook of Mathematical Functions, edited by Milton Abramowitz and Irene A. Stegun (National Bureau of Standards, Applied Mathematics Series 55, November 1964), p. 360.
Hence for an interferometer beam with a circular cross section, Eq. (4.136h) can be written as
$$P_{bal}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\; \eta(\sigma)\, \mathcal{L}(\sigma) \left[ 1 + W \cdot \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} \cdot \cos(2\pi\sigma\chi\cos\theta) \right]. \tag{4.137i}$$
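Returning to the original definition of $A^{-1}\Pi_A(\sigma\vec{\Delta})$ in Eq. (4.134c), we have, as $\vec{\Delta}$ goes to zero,
$$\lim_{\vec{\Delta} \to 0} \frac{1}{A}\, \Pi_A(\sigma\vec{\Delta}) = \frac{1}{A} \iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec{\rho}; A)\, e^{-2\pi i \vec{\rho} \bullet (\sigma\vec{\Delta})} \bigg|_{\vec{\Delta} = 0} = A \cdot A^{-1} = 1 \tag{4.137j}$$
for the pupil functions $\Pi(\vec{\rho}; A)$ of beams with any shape cross section. According to Eq. (4.137c) and the discussion preceding it,
$$|\vec{\Delta}| = 2\,|\hat{n}_M - \hat{z}| = 2\theta_{ma}.$$
This means the limit when $\vec{\Delta} \to 0$ is the same as the limit when $\theta_{ma} \to 0$. For a beam with a circular cross section, it must then be true that [see Eqs. (4.137h) and (4.137j)]
$$\lim_{\theta_{ma} \to 0} \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} = 1. \tag{4.137k}$$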
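The behavior of the Bessel-function ratio in Eqs. (4.137h) and (4.137k) is easy to explore numerically. The following is a minimal sketch, not a production routine: `bessel_j1` sums the standard power series for $J_1$ (adequate only for moderate arguments), and `misalignment_factor` is a name invented here for the ratio $J_1(4\pi R\sigma\theta_{ma})/(2\pi R\sigma\theta_{ma})$. The beam radius, wavenumber, and misalignment angle in the usage lines are hypothetical values.

```python
import math

def bessel_j1(x, terms=60):
    """J1(x) from its power series: sum over k of (-1)^k (x/2)^(2k+1) / (k! (k+1)!).

    Suitable for moderate |x| (roughly |x| < 20); accuracy degrades for large arguments.
    """
    half = x / 2.0
    term = half                      # k = 0 term
    total = term
    for k in range(1, terms):
        term *= -(half * half) / (k * (k + 1))   # ratio of successive series terms
        total += term
    return total

def misalignment_factor(radius, sigma, theta_ma):
    """Ratio J1(4 pi R sigma theta_ma) / (2 pi R sigma theta_ma), set to its limit 1 at zero."""
    x = 4.0 * math.pi * radius * sigma * theta_ma
    if x == 0.0:
        return 1.0
    return 2.0 * bessel_j1(x) / x

# Hypothetical numbers: R = 2.5 cm, sigma = 1000 cm^-1, theta_ma = 1e-5 rad.
print(misalignment_factor(2.5, 1000.0, 1e-5))    # slightly below 1
print(misalignment_factor(2.5, -1000.0, 1e-5))   # same value: the ratio is even in sigma
print(misalignment_factor(2.5, 1000.0, 1e-12))   # approaches 1 as theta_ma -> 0
```

For small arguments the ratio behaves like $1 - x^2/8$ with $x = 4\pi R\sigma\theta_{ma}$, which gives a quick feel for how much a given misalignment reduces the modulation at a given wavenumber.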
For a perfectly aligned moving mirror, Eq. (4.137i) therefore becomes
$$P_{bal}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\; \eta(\sigma)\, \mathcal{L}(\sigma) \left[ 1 + W \cos(2\pi\sigma\chi\cos\theta) \right]. \tag{4.137$\ell$}$$
If we assume that the interferometer's field of view is sufficiently narrow that $\cos\theta \cong 1$, then Eq. (4.137i), for an imperfectly aligned moving mirror, becomes
$$P_{bal}(\chi) = \frac{A\,\Delta\Omega}{2} \int_{-\infty}^{\infty} \eta(\sigma)\, \mathcal{L}(\sigma) \left[ 1 + W \cdot \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} \cdot \cos(2\pi\sigma\chi) \right] d\sigma, \tag{4.138a}$$
where
$$\Delta\Omega = \text{solid angle of the interferometer's field of view.} \tag{4.138b}$$
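$$|r(-\sigma)|^2 = r(-\sigma)\, r(-\sigma)^{*} = r(\sigma)^{*}\, r(\sigma) = |r(\sigma)|^2, \tag{4.139a}$$
$$|r_s(-\sigma)|^2 = r_s(-\sigma)\, r_s(-\sigma)^{*} = r_s(\sigma)^{*}\, r_s(\sigma) = |r_s(\sigma)|^2, \tag{4.139b}$$
$$|t_s(-\sigma)|^2 = t_s(-\sigma)\, t_s(-\sigma)^{*} = t_s(\sigma)^{*}\, t_s(\sigma) = |t_s(\sigma)|^2, \tag{4.139c}$$
$$|r_p(-\sigma)|^2 = r_p(-\sigma)\, r_p(-\sigma)^{*} = r_p(\sigma)^{*}\, r_p(\sigma) = |r_p(\sigma)|^2, \tag{4.139d}$$
and
$$|t_p(-\sigma)|^2 = t_p(-\sigma)\, t_p(-\sigma)^{*} = t_p(\sigma)^{*}\, t_p(\sigma) = |t_p(\sigma)|^2. \tag{4.139e}$$
Equation (4.125d) shows that, using rule (4E.6a) in Appendix 4E to drop the superfluous argument $\vec{\varepsilon}$,
$$\big|\gamma_s^{(abc)}(-\sigma)\big|^2 = \big|\gamma_s^{(abc)}(\sigma)\big|^2 \quad \text{and} \quad \big|\gamma_p^{(abc)}(-\sigma)\big|^2 = \big|\gamma_p^{(abc)}(\sigma)\big|^2. \tag{4.139f}$$
$$
\begin{aligned}
\eta(-\sigma) &= 2\,|r(-\sigma)|^2 \Big[ |r_s(-\sigma)|^2 |t_s(-\sigma)|^2 \big|\gamma_s^{(abc)}(-\sigma)\big|^2 + |r_p(-\sigma)|^2 |t_p(-\sigma)|^2 \big|\gamma_p^{(abc)}(-\sigma)\big|^2 \Big] \\
&= 2\,|r(\sigma)|^2 \Big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 \big|\gamma_s^{(abc)}(\sigma)\big|^2 + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big|\gamma_p^{(abc)}(\sigma)\big|^2 \Big] = \eta(\sigma).
\end{aligned}
\tag{4.139g}
$$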
We also know that $\mathcal{L}$ is an even function of $\sigma$ [see Eq. (4.136f)], that $\cos(2\pi\sigma\chi)$ is an even function of $\sigma$, and that
$$\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}$$
is an even function of $\sigma$ [see Eq. (4.137g)]. Therefore, the entire product being integrated in (4.138a),
$$\eta(\sigma)\, \mathcal{L}(\sigma) \left[ 1 + W \cdot \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} \cdot \cos(2\pi\sigma\chi) \right],$$
is an even function of $\sigma$. Hence we can write, using the rule from Eq. (2.19) in Chapter 2 and also that $L(\sigma) = 2\,\mathcal{L}(\sigma)$ from Eq. (4.136g), that the balanced-signal power specified in (4.138a) is
$$P_{bal}(\chi) = \frac{A\,\Delta\Omega}{2} \int_{0}^{\infty} \eta(\sigma)\, L(\sigma) \left[ 1 + W \cdot \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} \cdot \cos(2\pi\sigma\chi) \right] d\sigma. \tag{4.140a}$$
Making the interferometer perfectly aligned by taking $\theta_{ma} = 0$, we see from the limit
$$\lim_{\theta_{ma} \to 0} \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} = 1$$
in (4.137k) that the Bessel function ratio disappears in (4.140a). Now we can write
$$P_{bal}(\chi) = \frac{1}{2} \int_{0}^{\infty} \eta(\sigma)\, S(\sigma)\, [1 + W \cos(2\pi\sigma\chi)]\, d\sigma, \tag{4.140b}$$
where we define
$$S(\sigma) = A\,\Delta\Omega\, L(\sigma) \tag{4.140c}$$
to be the total optical power per unit wavenumber interval entering the interferometer. All that needs to be done now is to make the final idealization, $\eta = 1$, and we get the same formula for the ideal interferometer signal given in Eq. (1.19f) in Chapter 1,
$$P_{bal}(\chi) = \frac{1}{2} \int_{0}^{\infty} S(\sigma)\, [1 + W \cos(2\pi\sigma\chi)]\, d\sigma. \tag{4.140d}$$
The only difference is that in Chapter 1 the balanced optical power is called I ( cb ) instead of
$P_{bal}(\chi)$, and that now we can get progressively less idealized formulas for the balanced optical power by reversing the simplifications leading to (4.140d).
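To make Eqs. (4.140b) and (4.140d) concrete, the sketch below evaluates the balanced-signal power by the trapezoid rule for a made-up Gaussian spectrum. Everything specific here is an assumption for illustration: the function name `balanced_power`, the spectrum shape (centered at 1500 cm^-1 with a 100 cm^-1 width parameter), the upper integration limit, and the choice $\eta = 1$ by default.

```python
import math

def balanced_power(chi, spectrum, sigma_max, W=1, eta=lambda s: 1.0, steps=2000):
    """Trapezoid-rule estimate of (1/2) * integral_0^sigma_max of
    eta(sigma) * S(sigma) * [1 + W cos(2 pi sigma chi)] d sigma, as in Eq. (4.140b)."""
    h = sigma_max / steps
    total = 0.0
    for n in range(steps + 1):
        s = n * h
        weight = 0.5 if n in (0, steps) else 1.0   # trapezoid end-point weights
        total += weight * eta(s) * spectrum(s) * (1.0 + W * math.cos(2.0 * math.pi * s * chi))
    return 0.5 * h * total

def gaussian_spectrum(sigma):
    """Hypothetical S(sigma): Gaussian band in arbitrary power units, sigma in cm^-1."""
    return math.exp(-((sigma - 1500.0) / 100.0) ** 2)

total_power = 100.0 * math.sqrt(math.pi)   # analytic integral of the Gaussian above

# At zero OPD with W = +1 all of the entering power appears in the balanced signal.
print(balanced_power(0.0, gaussian_spectrum, 3000.0, W=1), total_power)
# With W = -1 the balanced signal vanishes at zero OPD.
print(balanced_power(0.0, gaussian_spectrum, 3000.0, W=-1))
# Far from zero OPD (chi in cm) the cosine term averages out, leaving half the power.
print(balanced_power(0.05, gaussian_spectrum, 3000.0, W=1), 0.5 * total_power)
```

Passing a different `eta` callable gives the less idealized form (4.140b) without changing the rest of the calculation.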
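$$P_{bal}(\chi) = \frac{1}{2} \int_{0}^{\infty} S(\sigma)\, d\sigma + \frac{1}{2} \int_{0}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma = \text{constant} + \frac{1}{2} \int_{0}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma.$$
Separating out the nonconstant signal component that changes with $\chi$, we give it the name
$$I_{bal}(\chi) = \frac{1}{2} \int_{0}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma. \tag{4.141a}$$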
In this book, we call $I_{bal}(\chi)$ the interferogram. Comparing this last result to Eq. (2.8b) in Chapter 2, we see that the interferogram is 1/4 of the Fourier cosine transform of $S(\sigma)$. Since
and $L$ is an even function [see Eqs. (4.136f) and (4.136g)], the definition of $S(\sigma)$ can be extended to negative values of $\sigma$ by making S another even function:
The cosine is also even, so we can then write the interferogram as [see Eq. (2.19) in Chapter 2]
$$I_{bal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma.$$
The sine is an odd function and $S(\sigma)$ is even, so the product $S(\sigma) \sin(2\pi\sigma\chi)$ is an odd function of $\sigma$. According to Eq. (2.17) of Chapter 2, the integral between $-\infty$ and $+\infty$ of any odd function is zero, so
$$\int_{-\infty}^{\infty} S(\sigma) \sin(2\pi\sigma\chi)\, d\sigma = 0.$$
Hence we can write
$$
\begin{aligned}
I_{bal}(\chi) &= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma) \cos(2\pi\sigma\chi)\, d\sigma \pm \frac{i}{4} \int_{-\infty}^{\infty} S(\sigma) \sin(2\pi\sigma\chi)\, d\sigma \\
&= \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, [\cos(2\pi\sigma\chi) \pm i \sin(2\pi\sigma\chi)]\, d\sigma
\end{aligned}
$$
or
$$I_{bal}(\chi) = \frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, e^{\pm 2\pi i \sigma\chi}\, d\sigma \tag{4.141d}$$
using $\cos(\phi) \pm i \sin(\phi) = e^{\pm i\phi}$. This shows that the interferogram of an ideal interferometer is 1/4 of either the forward or inverse Fourier transform of $S(\sigma)$.
Equations (4.141a) and (4.141d) are important because, as pointed out at the beginning of Sec. 1.7 of Chapter 1, they show why people build Michelson interferometers. Reversing the complex Fourier transform gives
$$S(\sigma) = 4 \int_{-\infty}^{\infty} I_{bal}(\chi)\, e^{\mp 2\pi i \sigma\chi}\, d\chi \tag{4.141e}$$
or, using Eq. (2.8f) in Chapter 2 to reverse the cosine transform in (4.141a),
$$S(\sigma) = 8 \int_{0}^{\infty} I_{bal}(\chi) \cos(2\pi\sigma\chi)\, d\chi. \tag{4.141f}$$
To find $S(\sigma)$, the radiation spectrum as a function of wavenumber, we need only measure $I_{bal}(\chi)$ and take its transform.
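The round trip described by Eqs. (4.141a) and (4.141f) can be sketched numerically: compute an interferogram from an assumed spectrum, then recover the spectrum from the interferogram. The code below is an illustrative sketch only. The Gaussian test spectrum, the sampling grids, and the helper names (`trapezoid`, `interferogram`, `recovered_spectrum`) are all assumptions chosen for the example; a real instrument samples the interferogram over a finite OPD range and would normally use an FFT.

```python
import math

def trapezoid(values, step):
    """Trapezoid rule for equally spaced samples."""
    return step * (sum(values) - 0.5 * (values[0] + values[-1]))

def spectrum(sigma):
    """Hypothetical S(sigma): Gaussian band centered at 1500 cm^-1."""
    return math.exp(-((sigma - 1500.0) / 100.0) ** 2)

D_SIGMA = 5.0                                   # wavenumber step, cm^-1
D_CHI = 5.0e-5                                  # OPD step, cm
SIGMAS = [D_SIGMA * n for n in range(601)]      # 0 to 3000 cm^-1
CHIS = [D_CHI * n for n in range(601)]          # 0 to 0.03 cm

def interferogram(chi):
    """Eq. (4.141a): I_bal(chi) = (1/2) * integral_0^inf S(sigma) cos(2 pi sigma chi) d sigma."""
    samples = [spectrum(s) * math.cos(2.0 * math.pi * s * chi) for s in SIGMAS]
    return 0.5 * trapezoid(samples, D_SIGMA)

# "Measured" interferogram on the OPD grid (here simulated from the assumed spectrum).
INTERFEROGRAM = [interferogram(chi) for chi in CHIS]

def recovered_spectrum(sigma):
    """Eq. (4.141f): S(sigma) = 8 * integral_0^inf I_bal(chi) cos(2 pi sigma chi) d chi."""
    samples = [i_val * math.cos(2.0 * math.pi * sigma * chi)
               for i_val, chi in zip(INTERFEROGRAM, CHIS)]
    return 8.0 * trapezoid(samples, D_CHI)

for sigma in (1400.0, 1500.0, 1600.0, 2500.0):
    print(sigma, spectrum(sigma), recovered_spectrum(sigma))
```

The maximum OPD of 0.03 cm is long enough here only because the assumed spectrum is broad and smooth; narrower spectral features would need a longer interferogram to be resolved.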
Energy Flux in the Unbalanced Radiation Fields · 4.17
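$$\text{Complex } E \text{ field} = \big( \hat{x}\, E^{(\mathrm{back})}_{0x} + \hat{y}\, E^{(\mathrm{back})}_{0y} \big)\, e^{2\pi i \sigma (\hat{\Omega} \bullet \vec{r} - ct)} \tag{4.142a}$$
and
$$\text{Complex } B \text{ field} = c^{-1} \big( \hat{y}\, E^{(\mathrm{back})}_{0x} - \hat{x}\, E^{(\mathrm{back})}_{0y} \big)\, e^{2\pi i \sigma (\hat{\Omega} \bullet \vec{r} - ct)}. \tag{4.142b}$$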
We assume that intelligent efforts are made to control the background radiation from the warm
optical surfaces, so that, unless Ω̂ is parallel to or nearly parallel to the optical axis ẑ , the
background radiation cannot reach the detector. This means that, as was the case for the balanced
signal, only direction-chopped radiation at relatively small angles θb can reach the detector.
Equations (4.142a) and (4.142b) correspond to (4.77a) and (4.77b) in the balanced derivation, and
the complex plane wave they specify represents radiation entering the interferometer from the
detector side of the system (neglecting terms of order θb ). Again we imagine an unfolded system
of coordinates such as the one shown in Fig. 4.18 above, only now the coordinates are unfolded
in such a way as to trace the unbalanced background signal—rather than the balanced input
signal—into and out of the interferometer. Both ways of unfolding the interferometer end up
specifying the same exit beam traveling to the detector. Therefore the xˆ , yˆ , zˆ coordinate system
used for vectors Ω̂ and r̂ in Eqs. (4.142a) and (4.142b) is the same coordinate system as the one
located in the exit beam of the unfolded interferometer in Fig. 4.18. In this sense, the xˆ , yˆ , zˆ
coordinate system used to specify Ω̂ and r̂ in (4.142a) and (4.142b) is the same as the xˆ , yˆ , zˆ
coordinate system used to specify Ω̂ and r̂ in Eqs. (4.77a) and (4.77b).
FIGURE 4.25. [Diagram of the interferometer with labels: Input Side of the Interferometer, Beam Splitter, Compensator Plate, Fixed Mirror, Moving Mirror, and the displacement $\chi/2$.]
The plane wave specified in Eqs. (4.142a) and (4.142b) can be decomposed into two linearly polarized plane waves: one plane wave that has $E^{(\mathrm{back})}_{0x}$ for its complex amplitude and is linearly polarized perpendicular to the plane of incidence on the beam splitter, and one plane wave that has $E^{(\mathrm{back})}_{0y}$ for its complex amplitude and is linearly polarized parallel to the plane of incidence on the beam splitter. Tracing the background rays through Fig. 4.25, we find that the unbalanced radiation fields for the rays traveling out and back along the moving-mirror arm are (again neglecting terms of order $\theta_b$)
$$\text{Complex } E \text{ field} = r_M \big[ \hat{x}\, E^{(\mathrm{back})}_{0x}\, t_s^2\, \gamma_s^{(uv)} + \hat{y}\, E^{(\mathrm{back})}_{0y}\, t_p^2\, \gamma_p^{(uv)} \big]\, e^{2\pi i \sigma [\hat{\Omega}_d \bullet (\vec{r} + \hat{z}\chi) - ct]} \tag{4.143a}$$
and
$$\text{Complex } B \text{ field} = \frac{r_M}{c} \big[ \hat{y}\, E^{(\mathrm{back})}_{0x}\, t_s^2\, \gamma_s^{(uv)} - \hat{x}\, E^{(\mathrm{back})}_{0y}\, t_p^2\, \gamma_p^{(uv)} \big]\, e^{2\pi i \sigma [\hat{\Omega}_d \bullet (\vec{r} + \hat{z}\chi) - ct]}. \tag{4.143b}$$
Appendix 4F presents the tunnel diagrams used to construct the $\gamma^{(uv)}$ parameters for the s-type and p-type plane waves passing through the beam-splitter substrate and compensator plate; the $t_s$, $t_p$, $r_M$, $\hat{\Omega}_d$, and $\chi$ variables all have the same meaning as before. Equations (4.143a) and (4.143b) correspond to Eqs. (4.85a) and (4.85b) in the balanced derivation. Corresponding to Eqs. (4.84a) and (4.84b), we have (neglecting terms of order $\theta_b$)
$$\text{Complex } E \text{ field} = r_M \big[ \hat{x}\, E^{(\mathrm{back})}_{0x}\, r_s^2\, \gamma_s^{(uv)} + \hat{y}\, E^{(\mathrm{back})}_{0y}\, r_p^2\, \gamma_p^{(uv)} \big]\, e^{2\pi i \sigma (\hat{\Omega} \bullet \vec{r} - ct)} \tag{4.144a}$$
and
$$\text{Complex } B \text{ field} = \frac{r_M}{c} \big[ \hat{y}\, E^{(\mathrm{back})}_{0x}\, r_s^2\, \gamma_s^{(uv)} - \hat{x}\, E^{(\mathrm{back})}_{0y}\, r_p^2\, \gamma_p^{(uv)} \big]\, e^{2\pi i \sigma (\hat{\Omega} \bullet \vec{r} - ct)}. \tag{4.144b}$$
From the discussion following Eq. (4.83) above, we know that the amplitude reflection
coefficients for plane waves reflecting off the back side of the beam splitter are $W r_s(\sigma)$ and
$W r_p(\sigma)$, with $W = 1$ or $W = -1$ depending on the type of beam splitter being used. The W
parameter occurred in Eqs. (4.84a) and (4.84b) of the balanced derivation because the $r_s$ and $r_p$
parameters appeared to the first power in the formulas. In Eqs. (4.144a) and (4.144b), on the other
hand, only the squares of $r_s$ and $r_p$ appear, which means, since $W^2 = 1$, that the W parameter
disappears. The formulas for the recombined, unbalanced fields corresponding to Eqs. (4.88a)
and (4.88b) in the balanced derivation are (neglecting terms of order $\theta_b$)
4 · From Maxwell’s Equations to the Michelson Interferometer
For future use, we note that Eqs. (4.89g)–(4.89k), (4.92a)–(4.92e), and (4.139a)–(4.139e) already
specify how $r_s$, $r_p$, $t_s$, $t_p$, and $r_M = r$ behave as functions of wavenumber $\sigma$; and the $\gamma_{s,p}^{(uv)}$ can be
set up to behave the same way the $\gamma_{s,p}^{(abc)}$ do in Eqs. (4.97c) and (4.139f),
\[
\gamma_s^{(uv)}(\vec{\varepsilon}, -\sigma) = \gamma_s^{(uv)}(\vec{\varepsilon}, \sigma)^* \quad\text{and}\quad \gamma_p^{(uv)}(\vec{\varepsilon}, -\sigma) = \gamma_p^{(uv)}(\vec{\varepsilon}, \sigma)^*
\tag{4.145c}
\]
with
\[
|\gamma_s^{(uv)}(-\sigma)|^2 = |\gamma_s^{(uv)}(\sigma)|^2 \quad\text{and}\quad |\gamma_p^{(uv)}(-\sigma)|^2 = |\gamma_p^{(uv)}(\sigma)|^2 .
\tag{4.145d}
\]
Equation (4F.2a) in Appendix 4F points out that, like the magnitudes of the $\gamma_{s,p}^{(abc)}$ parameters, the
magnitudes of the $\gamma_{s,p}^{(uv)}$ parameters are functions only of wavenumber $\sigma$.
The next major step in the balanced derivation was to represent the radiation entering the
system by integrals over $dw$ and $d^2u$, as in Eqs. (4.103a) and (4.103b). We now do the same for
the background radiation entering the interferometer from the detector side of the system
(neglecting terms of order $\theta_b$),
\[
\vec{E}^{(\mathrm{back})}(\vec{\rho}, z, t)
= \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left( \frac{c}{w^2} \right) \left[ \hat{x}\, \mathcal{E}_x^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) + \hat{y}\, \mathcal{E}_y^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right] e^{2\pi i [ \vec{u} \cdot \vec{\rho} + wt ]}
\tag{4.146a}
\]
and
Energy Flux in the Unbalanced Radiation Fields · 4.17
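\[
\vec{B}^{(\mathrm{back})}(\vec{\rho}, z, t)
= \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u \left( \frac{1}{w^2} \right) \left[ \hat{y}\, \mathcal{E}_x^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) - \hat{x}\, \mathcal{E}_y^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \right] e^{2\pi i [ \vec{u} \cdot \vec{\rho} + wt ]} .
\tag{4.146b}
\]
We note that $\vec{E}^{(\mathrm{back})}$ and $\vec{B}^{(\mathrm{back})}$ must be real whereas $\mathcal{E}_x^{(\mathrm{back})}$ and $\mathcal{E}_y^{(\mathrm{back})}$ are allowed to be complex.
In addition $\mathcal{E}_x^{(\mathrm{back})}$ and $\mathcal{E}_y^{(\mathrm{back})}$ must satisfy all the symmetry relations that $\mathcal{E}_x$ and $\mathcal{E}_y$ satisfied for
the incident signal radiation entering the interferometer [see, for example, Eqs. (4.100a) and
(4.100b)],
\[
\mathcal{E}_x^{(\mathrm{back})}(\vec{\varepsilon}, z, -\sigma) = \mathcal{E}_x^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)^*
\tag{4.146c}
\]
and
\[
\mathcal{E}_y^{(\mathrm{back})}(\vec{\varepsilon}, z, -\sigma) = \mathcal{E}_y^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)^* .
\tag{4.146d}
\]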
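The total E and B fields for the unbalanced radiation traveling back to the detector are also
written as integrals over $dw$ and $d^2u$,
\[
\begin{aligned}
\vec{E}^{(\mathrm{unb})}(\vec{\rho}, z, t) ={}& \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \frac{c}{w^2}\, e^{2\pi i [ \vec{u} \cdot \vec{\rho} + wt ]}\, r\!\left(-\frac{w}{c}\right) \\
&\times \Bigg\{ \hat{x}\, \mathcal{E}_x^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \cdot \gamma_s^{(uv)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) \\
&\qquad \cdot \left[ r_s\!\left(-\frac{w}{c}\right)^{2} + t_s\!\left(-\frac{w}{c}\right)^{2} e^{-\frac{2\pi i w \chi}{c} \sqrt{1 - \left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c} ( \hat{n}_M - \hat{z} ) \cdot \vec{r}} \right] \\
&\quad + \hat{y}\, \mathcal{E}_y^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \cdot \gamma_p^{(uv)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) \\
&\qquad \cdot \left[ r_p\!\left(-\frac{w}{c}\right)^{2} + t_p\!\left(-\frac{w}{c}\right)^{2} e^{-\frac{2\pi i w \chi}{c} \sqrt{1 - \left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c} ( \hat{n}_M - \hat{z} ) \cdot \vec{r}} \right] \Bigg\}
\end{aligned}
\tag{4.147a}
\]
and
\[
\begin{aligned}
\vec{B}^{(\mathrm{unb})}(\vec{\rho}, z, t) ={}& \int_{-\infty}^{\infty} dw \iint_{-\infty}^{\infty} d^2u\, \frac{1}{w^2}\, e^{2\pi i [ \vec{u} \cdot \vec{\rho} + wt ]}\, r\!\left(-\frac{w}{c}\right) \\
&\times \Bigg\{ \hat{y}\, \mathcal{E}_x^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \cdot \gamma_s^{(uv)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) \\
&\qquad \cdot \left[ r_s\!\left(-\frac{w}{c}\right)^{2} + t_s\!\left(-\frac{w}{c}\right)^{2} e^{-\frac{2\pi i w \chi}{c} \sqrt{1 - \left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c} ( \hat{n}_M - \hat{z} ) \cdot \vec{r}} \right] \\
&\quad - \hat{x}\, \mathcal{E}_y^{(\mathrm{back})}\!\left(-\frac{c\vec{u}}{w}, z, -\frac{w}{c}\right) \cdot \gamma_p^{(uv)}\!\left(-\frac{c\vec{u}}{w}, -\frac{w}{c}\right) \\
&\qquad \cdot \left[ r_p\!\left(-\frac{w}{c}\right)^{2} + t_p\!\left(-\frac{w}{c}\right)^{2} e^{-\frac{2\pi i w \chi}{c} \sqrt{1 - \left(\frac{cu}{w}\right)^2}}\, e^{-\frac{4\pi i w}{c} ( \hat{n}_M - \hat{z} ) \cdot \vec{r}} \right] \Bigg\} .
\end{aligned}
\tag{4.147b}
\]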
These two equations correspond to Eqs. (4.104a) and (4.104b) in the balanced derivation
(neglecting terms of order $\theta_b$).
The unbalanced background signal from the warm optical surfaces can be thought of as
traveling to the detector from the beam splitter along the same ray paths as the balanced optical
signal; consequently, it ends up being processed by the system much the same way as the
balanced optical signal. For this reason, we now give T, A subscripts to
$\vec{E}^{(\mathrm{unb})}$, $\vec{B}^{(\mathrm{unb})}$, $\mathcal{E}_x^{(\mathrm{back})}$, and $\mathcal{E}_y^{(\mathrm{back})}$
to show that they also represent time-chopped and beam-chopped radiation fields. The
unbalanced E and B fields are time-chopped to the same 2T time interval as the balanced fields,
because the detector records both signal and background for the same length of time. Although
the effective cross-sectional area of the background beam is probably somewhat larger than that
of the input beam, in a well-designed system they are roughly the same size and can be
represented by the same symbol A. Again we treat
$\vec{E}^{(\mathrm{unb})}$, $\vec{B}^{(\mathrm{unb})}$, $\mathcal{E}_x^{(\mathrm{back})}$, and $\mathcal{E}_y^{(\mathrm{back})}$
as random quantities. Hence, the z component of the Poynting vector for the unbalanced
radiation fields is another random quantity given by
\[
\vec{S}_{TA}^{(\mathrm{unb})} \cdot \hat{z} = \frac{1}{\mu_o} \left( \vec{E}_{TA}^{(\mathrm{unb})} \times \vec{B}_{TA}^{(\mathrm{unb})} \right) \cdot \hat{z} .
\tag{4.148a}
\]
This corresponds to Eq. (4.122a) in the balanced derivation. The average radiant energy from the
unbalanced background reaching the interferometer's detector during a time interval 2T over a
beam cross-sectional area A is now
\[
\mathrm{E}\!\left( \frac{1}{\mu_o} \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho \left( \vec{E}_{TA}^{(\mathrm{unb})} \times \vec{B}_{TA}^{(\mathrm{unb})} \right) \cdot \hat{z} \right) ,
\tag{4.148b}
\]
which corresponds to the right-hand side of (4.122b) in the balanced derivation. Adding T, A
subscripts to $\mathcal{E}_x^{(\mathrm{back})}$ and $\mathcal{E}_y^{(\mathrm{back})}$ in Eqs. (4.147a) and (4.147b), and representing them as random
quantities, we substitute the right-hand sides of these two equations into the expression in
(4.148b) to get, after a great deal of algebra, that
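Average energy in unbalanced background signal over time 2T and beam cross section A
\[
\begin{aligned}
={}& \epsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\, |r(\sigma)|^2 \Big\{ \big[ |r_s(\sigma)|^4 + |t_s(\sigma)|^4 \big] |\gamma_s^{(uv)}(\sigma)|^2\, \mathrm{E}\big( |\mathcal{E}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)|^2 \big) \\
&\qquad + \big[ |r_p(\sigma)|^4 + |t_p(\sigma)|^4 \big] |\gamma_p^{(uv)}(\sigma)|^2\, \mathrm{E}\big( |\mathcal{E}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)|^2 \big) \Big\} \\
&+ \epsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\, |r(\sigma)|^2 \Big\{ \big[ r_s(\sigma)^{*2}\, t_s(\sigma)^2\, |\gamma_s^{(uv)}(\sigma)|^2\, \mathrm{E}\big( \mathcal{E}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\, \mathcal{E}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon} - \vec{\Delta}, z, \sigma)^* \big) \\
&\qquad + r_p(\sigma)^{*2}\, t_p(\sigma)^2\, |\gamma_p^{(uv)}(\sigma)|^2\, \mathrm{E}\big( \mathcal{E}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\, \mathcal{E}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon} - \vec{\Delta}, z, \sigma)^* \big) \big]\, e^{-2\pi i \sigma \chi \sqrt{1 - \varepsilon^2}} \Big\} \\
&+ \epsilon_o \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} \frac{d^2\varepsilon}{\sigma^2}\, |r(\sigma)|^2 \Big\{ \big[ r_s(\sigma)^{2}\, t_s(\sigma)^{*2}\, |\gamma_s^{(uv)}(\sigma)|^2\, \mathrm{E}\big( \mathcal{E}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\, \mathcal{E}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon} + \vec{\Delta}, z, \sigma)^* \big) \\
&\qquad + r_p(\sigma)^{2}\, t_p(\sigma)^{*2}\, |\gamma_p^{(uv)}(\sigma)|^2\, \mathrm{E}\big( \mathcal{E}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\, \mathcal{E}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon} + \vec{\Delta}, z, \sigma)^* \big) \big]\, e^{2\pi i \sigma \chi \sqrt{1 - \varepsilon^2}} \Big\} .
\end{aligned}
\tag{4.148c}
\]
Here we have used Eqs. (4.102a)–(4.102d) to transform the integrals over $dw$ and $d^2u$ into
integrals over $d\sigma$ and $d^2\varepsilon$, and once again we have used $\vec{\Delta} = 2(\hat{n}_M - \hat{z})$. Equation (4.148c)
corresponds to Eq. (4.134a) in the balanced derivation.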
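Following the pattern of Eqs. (4.120a), (4.120b), (4.135a), and (4.135b), we now write
\[
\frac{\epsilon_o}{2TA\sigma^2}\, \mathrm{E}\big( |\mathcal{E}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)|^2 \big) \cong L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) ,
\tag{4.149a}
\]
\[
\frac{\epsilon_o}{2TA\sigma^2}\, \mathrm{E}\big( |\mathcal{E}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)|^2 \big) \cong L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) ,
\tag{4.149b}
\]
\[
\mathrm{E}\big( \mathcal{E}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\, \mathcal{E}_{xTA}^{(\mathrm{back})}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^* \big) \cong 2T \mu_o c^2 \sigma^2\, L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)\, \Pi_A(\mp\sigma\vec{\Delta}) ,
\tag{4.149c}
\]
and
\[
\mathrm{E}\big( \mathcal{E}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon}, z, \sigma)\, \mathcal{E}_{yTA}^{(\mathrm{back})}(\vec{\varepsilon} \pm \vec{\Delta}, z, \sigma)^* \big) \cong 2T \mu_o c^2 \sigma^2\, L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)\, \Pi_A(\mp\sigma\vec{\Delta}) .
\tag{4.149d}
\]
As was the case for the balanced derivation [see Eqs. (4.121a) and (4.121b)], $L_x^{(\mathrm{back})}$ and $L_y^{(\mathrm{back})}$ are
even functions of $\sigma$,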
\[
L_x^{(\mathrm{back})}(\vec{\varepsilon}, -\sigma) = L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma)
\tag{4.149e}
\]
and
\[
L_y^{(\mathrm{back})}(\vec{\varepsilon}, -\sigma) = L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) .
\tag{4.149f}
\]
Glancing back at where these two functions came from, we see that they represent,
respectively, the x-polarized and y-polarized background optical power per unit area per unit solid
angle per unit $\sigma$ interval entering the interferometer from the detector side of the system.
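Substitution of Eqs. (4.149a)–(4.149d) into (4.148c) gives
Average energy in unbalanced background signal over time 2T and beam cross section A
\[
\begin{aligned}
={}& 2TA \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \big[ |r_s(\sigma)|^4 + |t_s(\sigma)|^4 \big] |\gamma_s^{(uv)}(\sigma)|^2\, L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
&\qquad + \big[ |r_p(\sigma)|^4 + |t_p(\sigma)|^4 \big] |\gamma_p^{(uv)}(\sigma)|^2\, L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \Big\} \\
&+ 2TA \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \big[ r_s(\sigma)^{*2}\, t_s(\sigma)^2\, |\gamma_s^{(uv)}(\sigma)|^2\, L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
&\qquad + r_p(\sigma)^{*2}\, t_p(\sigma)^2\, |\gamma_p^{(uv)}(\sigma)|^2\, L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \big] \cdot \Big[ \tfrac{1}{A} \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma \chi \sqrt{1 - \varepsilon^2}} \Big] \Big\} \\
&+ 2TA \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \big[ r_s(\sigma)^{2}\, t_s(\sigma)^{*2}\, |\gamma_s^{(uv)}(\sigma)|^2\, L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
&\qquad + r_p(\sigma)^{2}\, t_p(\sigma)^{*2}\, |\gamma_p^{(uv)}(\sigma)|^2\, L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \big] \cdot \Big[ \tfrac{1}{A} \Pi_A(\sigma\vec{\Delta})^*\, e^{2\pi i \sigma \chi \sqrt{1 - \varepsilon^2}} \Big] \Big\} ,
\end{aligned}
\tag{4.150}
\]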
where we use Eq. (4.1e) to replace $\epsilon_o \mu_o c^2$ by one and Eq. (4.134d) to replace $\Pi_A(-\vec{u})$ by
$\Pi_A(\vec{u})^*$. This result corresponds to Eq. (4.135d) in the balanced derivation, except we have
anticipated the reasoning used to go from (4.135d) to (4.135e) by using the interferometer's field
of view to set the limits on the double integral over $d^2\varepsilon$. Strictly speaking, this should be the
field of view for the unbalanced background radiation coming from the warm optical surfaces
between the beam splitter and the detector, but in a well-designed system the two fields of view
are roughly the same. In this formula, the second triple integral over $d\sigma$ and $d^2\varepsilon$ is the complex
conjugate of the third triple integral over $d\sigma$ and $d^2\varepsilon$, ensuring that their sum is real. Since the
first triple integral is the integral of a real expression, evaluation of the right-hand side of (4.150)
produces a real number, which makes sense considering that this is the formula for the energy in
the unbalanced background signal.
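The statement that the second and third integrands of (4.150) are complex conjugates can be checked at a single point of the integration domain. The sketch below is an illustration under stated assumptions, not the author's procedure: it keeps only the x-polarized factors, drops the real positive factors common to both terms, sets the obliquity factor to one, and uses made-up complex values for $r_s$, $t_s$, and $\Pi_A$.

```python
import cmath


def cross_terms(r_s, t_s, pi_a, sigma, chi):
    """Return the two interference terms of the integrand for x polarization.

    second = conj(r_s)^2 * t_s^2       * Pi_A       * exp(-2*pi*i*sigma*chi)
    third  = r_s^2       * conj(t_s)^2 * conj(Pi_A) * exp(+2*pi*i*sigma*chi)
    Real, positive factors common to both terms are omitted.
    """
    phase = cmath.exp(-2j * cmath.pi * sigma * chi)
    second = r_s.conjugate() ** 2 * t_s ** 2 * pi_a * phase
    third = r_s ** 2 * t_s.conjugate() ** 2 * pi_a.conjugate() * phase.conjugate()
    return second, third


# Hypothetical values chosen only to exercise the algebra.
second, third = cross_terms(r_s=0.6 + 0.3j, t_s=0.2 - 0.7j,
                            pi_a=0.9 + 0.1j, sigma=1500.0, chi=0.0123)
total = second + third
```

Because `third` equals the conjugate of `second`, `total` is twice the real part of either term, which is the step that later lets the two integrals be merged into a single real-part expression.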
To make further simplifications in this energy formula, we break from the pattern of the
balanced derivation and use Eq. (4.150) to represent a somewhat idealized interferometer with a
nonideal beam-splitter film. From this point on to the end of this section, we are not so much
analyzing a likely type of Michelson setup as we are constructing a thought experiment to
discover hidden properties of the beam-splitter amplitude-transmission and amplitude-reflection
coefficients ts, tp, rs, and rp. The first step is to set up the interferometer so that no electromagnetic
energy enters the system through the input port—for example, by having the interferometer
entrance aperture look at a chilled nonreflective surface. This means only detector-side
background radiation enters the system. To keep things simple, we first assume that all single-
pass s-type and p-type transmissions through the beam-splitter substrate and compensator plate
are equivalent, with every single-pass transmission characterized by complex constants having
the same magnitude $|\gamma|$. Now the $\gamma_{s,p}^{(abc)}$ terms correspond to $\gamma^3$ and the $\gamma_{s,p}^{(uv)}$ terms correspond
to $\gamma^2$ so that
\[
|\gamma_{s,p}^{(abc)}|^2 \to |\gamma|^6 \quad\text{and}\quad |\gamma_{s,p}^{(uv)}|^2 \to |\gamma|^4 .
\tag{4.151a}
\]
This lets us assume that only negligible amounts of optical power are lost passing through the
substrate and compensator plate by saying that $|\gamma|$ is approximately equal to one; similarly, we
say that only negligible amounts of optical power are lost by reflection off the fixed and moving
mirrors by saying that $|r|$ is approximately equal to one. These assumptions can be written as
\[
|\gamma(\sigma)| \cong 1 \quad\text{and}\quad |r(\sigma)| \cong 1 .
\tag{4.151b}
\]
In addition, the moving mirror is taken to be in perfect alignment with $\vec{\Delta} = 0$ so that [see Eq.
(4.137j)]
\[
\frac{1}{A} \Pi_A(0) = \frac{1}{A} \Pi_A(0)^* = 1 .
\tag{4.151c}
\]
To keep our thought experiment simple, we force the background radiation to be x-polarized and
confined to a very narrow solid angle $\Delta\Omega_{\mathrm{back}}$, so that
\[
L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \cong 0
\tag{4.151d}
\]
and
\[
L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \cong \Delta\Omega_{\mathrm{back}}\, \delta(\vec{\varepsilon})\, \mathcal{L}_x^{(\mathrm{back})}(\sigma) = \Delta\Omega_{\mathrm{back}}\, \delta(\varepsilon_x)\, \delta(\varepsilon_y)\, \mathcal{L}_x^{(\mathrm{back})}(\sigma) .
\tag{4.151e}
\]
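showing that the delta function when integrated over $d^2\varepsilon$ (that is, integrated over a solid angle
containing $\vec{0}$) always produces the dimensionless number one. In Eq. (4.151e), we drop the
dependence of the background optical power $\mathcal{L}_x^{(\mathrm{back})}$ on $\vec{\varepsilon} = (\varepsilon_x, \varepsilon_y)$, using $\delta(\varepsilon_x)$ and $\delta(\varepsilon_y)$ to
show that only the contribution from the on-axis direction is significant. Just like $\mathcal{L}(\sigma)$ in Eq.
(4.136f), function $\mathcal{L}_x^{(\mathrm{back})}$ must be even,
\[
\mathcal{L}_x^{(\mathrm{back})}(-\sigma) = \mathcal{L}_x^{(\mathrm{back})}(\sigma) .
\tag{4.151f}
\]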
Although it is highly unlikely that an actual interferometer would have this sort of idealized x-
polarized background radiance, we can always arrange for an existing system to have this sort of
contaminating background without changing the properties of the interferometer’s beam splitter.
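Substitution of Eqs. (4.151a)–(4.151e) into (4.150) gives
Average energy in unbalanced background signal over time 2T and beam cross section A
\[
= 2TA\, \Delta\Omega_{\mathrm{back}} \int_{-\infty}^{\infty} \mathcal{L}_x^{(\mathrm{back})}(\sigma) \Big\{ \big[ |r_s(\sigma)|^4 + |t_s(\sigma)|^4 \big] + e^{-2\pi i \sigma \chi}\, r_s(\sigma)^{*2}\, t_s(\sigma)^2 + e^{2\pi i \sigma \chi}\, r_s(\sigma)^{2}\, t_s(\sigma)^{*2} \Big\}\, d\sigma .
\tag{4.152}
\]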
We next consider what happens to the balanced, instead of the unbalanced, detector-side
background signal. Equation (4.135d), which specifies the energy in the balanced input signal,
can be adapted to describe the balanced background signal, but to do this we must analyze how
the balanced background signal differs from the balanced input signal. We note that rs and rp in
(4.135d) refer to an initial reflection of the beam coming from the input port that is off the front
side of the interferometer, as shown in Fig. 4.16, whereas the balanced background signal must,
as shown in Fig. 4.25, have its initial reflection off the back side of the beam splitter. Tracing the
balanced background rays through the interferometer, we see that, compared to the balanced input
rays, front-side beam-splitter reflections are replaced by back-side beam-splitter reflections and
back-side beam-splitter reflections are replaced by front-side beam-splitter reflections. We also
note that rays pass through the compensator plate and beam-splitter substrate a different number
of times, but this does not matter because we take $|\gamma(\sigma)| \cong 1$ in our idealized interferometer. From
the discussion following Eq. (4.83) above, we know that if the front-side reflection coefficients
are $r_s$ and $r_p$, then the back-side reflection coefficients are $W r_s$ and $W r_p$. This means that to
convert the balanced input-signal derivation to the balanced background-signal derivation, we
need to convert all the $r_s$ and $r_p$ variables to $W r_s$ and $W r_p$ whenever $r_s$ and $r_p$ refer to front-side
reflection coefficients. What about those times when $r_s$ and $r_p$ are already part of $W r_s$ and $W r_p$
products referring to back-side reflection coefficients? To handle this situation, we note that when
$W = -1$, making the original back-side reflection coefficients $(-r_s)$ and $(-r_p)$, then $W^2 r_s$ and $W^2 r_p$
return to us the front-side coefficients $r_s$ and $r_p$; and when $W = 1$, the back-side and front-side
coefficients are always equal and can be multiplied by as many powers of W as we please.
Therefore, if $W r_s$ and $W r_p$ refer to back-side reflection coefficients in the balanced input-signal
derivation, then $W^2 r_s$ and $W^2 r_p$ automatically convert the terms to the desired front-side reflection
coefficients. This shows that replacing the $r_s$ and $r_p$ variables everywhere by $W r_s$ and $W r_p$
converts all front-side reflection terms to back-side reflection terms and all back-side reflection
terms to front-side reflection terms. Hence, Eq. (4.135d) can be used to calculate the energy in the
balanced background signal if $r_s$ and $r_p$ are replaced everywhere by $W r_s$ and $W r_p$ (and, of course,
$L_x$ and $L_y$ are replaced by $L_x^{(\mathrm{back})}$ and $L_y^{(\mathrm{back})}$). The only values W can have are $+1$ or $-1$, so
$W^2 = 1$ always. Looking at Eq. (4.135d), we see that $r_s$ and $r_p$ only enter the formula as
$|r_s|^2$ and $|r_p|^2$, so replacing $r_s$ and $r_p$ by $W r_s$ and $W r_p$ does not change the equation. Therefore, all
that needs to be done to adapt (4.135d) to the balanced background signal using the
approximations in (4.151d) is to set $|\gamma_{s,p}^{(abc)}(\sigma)| = |\gamma(\sigma)| = |r(\sigma)| = 1$ and to replace $L_x$ and $L_y$ by
$L_x^{(\mathrm{back})}$ and $L_y^{(\mathrm{back})}$, which gives us
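Average energy in balanced background over time 2T and beam cross section A
\[
\begin{aligned}
={}& 4TA \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon \cdot \Big\{ \big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) + |r_p(\sigma)|^2 |t_p(\sigma)|^2 L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \big] \Big\} \\
&+ 2TW \int_{-\infty}^{\infty} d\sigma \iint_{-\infty}^{\infty} d^2\varepsilon\, \Big\{ \big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) + |r_p(\sigma)|^2 |t_p(\sigma)|^2 L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \big] \\
&\qquad \cdot \big[ \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma \chi \sqrt{1 - \varepsilon^2}} + \Pi_A(\sigma\vec{\Delta})^*\, e^{2\pi i \sigma \chi \sqrt{1 - \varepsilon^2}} \big] \Big\}
\end{aligned}
\tag{4.153}
\]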
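Average energy in balanced background over time 2T and beam cross section A
\[
= 2TA\, \Delta\Omega_{\mathrm{back}} \int_{-\infty}^{\infty} |r_s(\sigma)|^2 |t_s(\sigma)|^2\, \mathcal{L}_x^{(\mathrm{back})}(\sigma) \left[ 2 + W e^{-2\pi i \sigma \chi} + W e^{2\pi i \sigma \chi} \right] d\sigma .
\tag{4.154}
\]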
We now consider formulas (4.152) and (4.154) for the unbalanced and balanced background
energy, respectively. Although the background radiance while passing through the interferometer may have
some of its energy absorbed, by conservation of energy there is no way for its energy to
increase. Consequently, the sum of (4.152) and (4.154) must be less than or equal to the total
x-polarized energy produced by the radiant background in time 2T for a beam of cross-sectional
area A and solid angle $\Delta\Omega_{\mathrm{back}}$,
\[
2TA\, \Delta\Omega_{\mathrm{back}} \int_{-\infty}^{\infty} \mathcal{L}_x^{(\mathrm{back})}(\sigma)\, d\sigma .
\tag{4.155a}
\]
Since $\mathcal{L}_x^{(\mathrm{back})}$ is even [see Eq. (4.151f)], the total background energy entering the interferometer
can also be written as
\[
4TA\, \Delta\Omega_{\mathrm{back}} \int_{0}^{\infty} \mathcal{L}_x^{(\mathrm{back})}(\sigma)\, d\sigma .
\tag{4.155b}
\]
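\[
\begin{aligned}
={}& 2TA\, \Delta\Omega_{\mathrm{back}} \int_{-\infty}^{\infty} \mathcal{L}_x^{(\mathrm{back})}(\sigma) \Big\{ \big[ |r_s(\sigma)|^4 + |t_s(\sigma)|^4 \big] + r_s(\sigma)^{*2}\, t_s(\sigma)^2\, e^{-2\pi i \sigma \chi} \\
&\quad + r_s(\sigma)^{2}\, t_s(\sigma)^{*2}\, e^{2\pi i \sigma \chi} + 2 |r_s(\sigma)|^2 |t_s(\sigma)|^2 + W |r_s(\sigma)|^2 |t_s(\sigma)|^2 e^{-2\pi i \sigma \chi} \\
&\quad + W |r_s(\sigma)|^2 |t_s(\sigma)|^2 e^{2\pi i \sigma \chi} \Big\}\, d\sigma .
\end{aligned}
\tag{4.156a}
\]
We represent the complex scalars $r_s$ and $t_s$ by
\[
r_s(\sigma) = |r_s(\sigma)|\, e^{i\theta_{rs}(\sigma)}
\tag{4.156b}
\]
and
\[
t_s(\sigma) = |t_s(\sigma)|\, e^{i\theta_{ts}(\sigma)}
\tag{4.156c}
\]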
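for $\theta_{rs}(\sigma)$ and $\theta_{ts}(\sigma)$ defined to be real wavenumber-dependent angles representing the phases
of $r_s$ and $t_s$. Since $r_s(-\sigma) = r_s(\sigma)^*$ and $t_s(-\sigma) = t_s(\sigma)^*$ from Eqs. (4.92b) and (4.92d), we must
have
\[
\theta_{rs}(-\sigma) = -\theta_{rs}(\sigma)
\tag{4.156d}
\]
and
\[
\theta_{ts}(-\sigma) = -\theta_{ts}(\sigma)
\tag{4.156e}
\]
in (4.156b) and (4.156c), the defining equations for $\theta_{rs}(\sigma)$ and $\theta_{ts}(\sigma)$. Substitution of (4.156b)
and (4.156c) into (4.156a) gives
\[
\begin{aligned}
={}& 2TA\, \Delta\Omega_{\mathrm{back}} \int_{-\infty}^{\infty} \mathcal{L}_x^{(\mathrm{back})}(\sigma) \Big\{ \big[ |r_s(\sigma)|^2 + |t_s(\sigma)|^2 \big]^2 \\
&\quad + |r_s(\sigma)|^2 |t_s(\sigma)|^2 \Big[ e^{-2\pi i \sigma \chi} e^{2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + e^{2\pi i \sigma \chi} e^{-2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + W e^{-2\pi i \sigma \chi} + W e^{2\pi i \sigma \chi} \Big] \Big\}\, d\sigma .
\end{aligned}
\tag{4.157a}
\]
Equations (4.139b) and (4.139c) show that $|r_s(\sigma)|^2$ and $|t_s(\sigma)|^2$ are even functions of $\sigma$, as is
$\mathcal{L}_x^{(\mathrm{back})}$ according to (4.151f). The term
\[
\Big[ e^{-2\pi i \sigma \chi} e^{2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + e^{2\pi i \sigma \chi} e^{-2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + W e^{-2\pi i \sigma \chi} + W e^{2\pi i \sigma \chi} \Big]
\]
inside the integral is also even with respect to $\sigma$, because by (4.156d) and (4.156e) the phase
difference $\theta_{ts}(\sigma) - \theta_{rs}(\sigma)$ is odd in $\sigma$, so replacing $\sigma$ by $-\sigma$ merely interchanges the first two
terms and interchanges the last two terms.
This means (4.157a) is an integral of an even expression between $-\infty$ and $+\infty$, so by rule
Eq. (2.19) in Chapter 2 it can also be written as
\[
\begin{aligned}
={}& 4TA\, \Delta\Omega_{\mathrm{back}} \int_{0}^{\infty} \mathcal{L}_x^{(\mathrm{back})}(\sigma) \Big\{ \big[ |r_s(\sigma)|^2 + |t_s(\sigma)|^2 \big]^2 \\
&\quad + |r_s(\sigma)|^2 |t_s(\sigma)|^2 \Big[ e^{-2\pi i \sigma \chi} e^{2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + e^{2\pi i \sigma \chi} e^{-2i(\theta_{ts}(\sigma) - \theta_{rs}(\sigma))} + W e^{-2\pi i \sigma \chi} + W e^{2\pi i \sigma \chi} \Big] \Big\}\, d\sigma .
\end{aligned}
\tag{4.157b}
\]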
\[
\frac{\epsilon_o}{\sigma^2} \cdot \mathrm{E}\big( |\mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big)
\]
is the average input energy, per unit wavenumber interval and per unit solid angle, that is entering
the interferometer during a time 2T and is carried by the x-polarized radiation field traveling in
the $\hat{z}$ direction at wavenumber $\sigma$. We note that the z in the argument list of $\mathcal{E}_{xTA}$ can be
disregarded because, as is mentioned at the end of Appendix 4C, the value of $|\mathcal{E}_{xTA}|^2$ does not
depend on z. It follows from
the analysis in Appendix 4E that the effect of one transmission through the beam splitter is to
replace the monochromatic plane wavefield specified by $\mathcal{E}_{xTA}(\vec{0}, z, \sigma)$ with the monochromatic
plane wavefield specified by $\gamma\, t_s\, \mathcal{E}_{xTA}(\vec{0}, z, \sigma)$. Hence, the average energy, per unit wavenumber
interval per unit solid angle, that passes through the beam splitter during a time 2T and is carried
by the x-polarized radiation field traveling in the $\hat{z}$ direction at wavenumber $\sigma$ is
\[
\begin{aligned}
\frac{\epsilon_o}{\sigma^2} \cdot \mathrm{E}\big( |\gamma(\sigma)\, t_s(\sigma)\, \mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big)
&= |\gamma(\sigma)|^2 |t_s(\sigma)|^2\, \frac{\epsilon_o}{\sigma^2}\, \mathrm{E}\big( |\mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big) \\
&\cong |t_s(\sigma)|^2\, \frac{\epsilon_o}{\sigma^2}\, \mathrm{E}\big( |\mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big) ,
\end{aligned}
\]
where in the last step (4.151b) is used to drop $|\gamma|^2$ from the formula. This result shows why $|t_s(\sigma)|^2$
is called the power transmission coefficient for x-polarized radiation. Using similar reasoning, the
effect on the plane waves of one reflection from the beam splitter is to replace $\mathcal{E}_{xTA}(\vec{0}, z, \sigma)$ by
$\gamma^2 r_s\, \mathcal{E}_{xTA}(\vec{0}, z, \sigma)$. Hence, the formula for the average energy, per unit wavenumber interval and
per unit solid angle, that is carried by the x-polarized radiation reflected off the beam splitter in
time 2T is
\[
\begin{aligned}
\frac{\epsilon_o}{\sigma^2} \cdot \mathrm{E}\big( |\gamma(\sigma)^2\, r_s(\sigma)\, \mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big)
&= |\gamma(\sigma)|^4 |r_s(\sigma)|^2\, \frac{\epsilon_o}{\sigma^2}\, \mathrm{E}\big( |\mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big) \\
&\cong |r_s(\sigma)|^2\, \frac{\epsilon_o}{\sigma^2}\, \mathrm{E}\big( |\mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big) .
\end{aligned}
\]
This shows why $|r_s(\sigma)|^2$ is called the power reflection coefficient for x-polarized radiation.
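Although the beam-splitter substrate can absorb energy (a process now being neglected by
taking $|\gamma|$ about equal to one), a well-designed beam splitter has only negligible absorption in the
thin film where the partial transmission and reflection of the interferometer beam occurs. This
means, by conservation of energy, that
\[
\begin{aligned}
\frac{\epsilon_o}{\sigma^2} \cdot \mathrm{E}\big( |\mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big)
&= |t_s(\sigma)|^2\, \frac{\epsilon_o}{\sigma^2} \cdot \mathrm{E}\big( |\mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big) + |r_s(\sigma)|^2\, \frac{\epsilon_o}{\sigma^2} \cdot \mathrm{E}\big( |\mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big) \\
&= \big( |t_s(\sigma)|^2 + |r_s(\sigma)|^2 \big) \cdot \frac{\epsilon_o}{\sigma^2} \cdot \mathrm{E}\big( |\mathcal{E}_{xTA}(\vec{0}, z, \sigma)|^2 \big)
\end{aligned}
\]
or
\[
|t_s(\sigma)|^2 + |r_s(\sigma)|^2 = 1 .
\tag{4.157c}
\]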
Substituting (4.157c) into (4.157b) and combining the exponentials gives
\[
= 4TA\, \Delta\Omega_{\mathrm{back}} \int_{0}^{\infty} \mathcal{L}_x^{(\mathrm{back})}(\sigma) \Big[ 1 + 2 |r_s(\sigma)|^2 |t_s(\sigma)|^2 \cdot \Big\{ \cos\big[ 2\pi\sigma\chi - 2 ( \theta_{ts}(\sigma) - \theta_{rs}(\sigma) ) \big] + W \cos(2\pi\sigma\chi) \Big\} \Big]\, d\sigma ,
\tag{4.157d}
\]
where $e^{i\phi} = \cos\phi + i \sin\phi$ is used to reduce the complex exponentials to a sum of cosines. For an
ideal beam splitter
\[
|r_s(\sigma)|^2 = |t_s(\sigma)|^2 = 1/2 ,
\]
so $2 |r_s(\sigma)|^2 |t_s(\sigma)|^2$ must also be about equal to 1/2 for a well-designed, nonideal beam splitter; it
obviously cannot be a small term. We now compare (4.157d) to formula (4.155b) for the total
energy produced by the radiant background. Unless the term inside the braces { } in (4.157d) is
identically zero for all values of $\sigma$, we can always construct an x-polarized background spectrum
$\mathcal{L}_x^{(\mathrm{back})}$ that, for certain values of $\chi$, specifies more energy leaving the interferometer in the
balanced and unbalanced background signal than entered the interferometer in (4.155b).
Therefore, the term inside the braces { } must be identically zero for all non-negative values of $\sigma$,
which means that
\[
e^{\pm 2i [ \theta_{ts}(\sigma) - \theta_{rs}(\sigma) ]} = -W .
\tag{4.158b}
\]
By Eqs. (4.156d) and (4.156e), this constraint holds true for all negative values of $\sigma$ if it holds
true for all non-negative values of $\sigma$, since
\[
e^{\pm 2i [ \theta_{ts}(-\sigma) - \theta_{rs}(-\sigma) ]} = e^{\mp 2i [ \theta_{ts}(\sigma) - \theta_{rs}(\sigma) ]} .
\]
When this constraint is substituted back into Eq. (4.157b), the right-hand side reduces to
\[
= 4TA\, \Delta\Omega_{\mathrm{back}} \int_{0}^{\infty} d\sigma\, \mathcal{L}_x^{(\mathrm{back})}(\sigma) \big[ |r_s(\sigma)|^2 + |t_s(\sigma)|^2 \big]^2 ,
\]
which, by substituting in (4.157c), is shown to be the same as the expression for the background
radiant energy given in (4.155b).
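The energy-conservation argument can be made concrete with a few lines of arithmetic. The sketch below is a hedged illustration, not the author's calculation: it assumes a lossless film ($|r_s|^2 + |t_s|^2 = 1$), evaluates the bracketed integrand of (4.157d) at one wavenumber, and compares a phase difference that satisfies (4.158b) with one that does not. The function name, the wavenumber, and the path differences are hypothetical.

```python
import math


def output_fraction(r2, phase_diff, W, sigma, chi):
    """Balanced plus unbalanced output energy per unit input energy at one
    wavenumber, for a lossless film with |r_s|^2 = r2 and |t_s|^2 = 1 - r2.

    phase_diff is theta_ts - theta_rs in radians.
    """
    t2 = 1.0 - r2
    u = 2.0 * math.pi * sigma * chi
    braces = math.cos(u - 2.0 * phase_diff) + W * math.cos(u)
    return 1.0 + 2.0 * r2 * t2 * braces


chis = [k * 1.0e-5 for k in range(200)]   # path differences in cm (illustrative)
sigma = 1000.0                            # wavenumber in 1/cm (illustrative)

# A phase difference of pi/2 gives exp(+-2i*phase_diff) = -1 = -W for W = +1.
constrained = [output_fraction(0.45, math.pi / 2, 1, sigma, chi) for chi in chis]

# A phase difference that violates the constraint.
violating = [output_fraction(0.45, 0.3, 1, sigma, chi) for chi in chis]
```

With the constraint satisfied, the output fraction stays at one for every path difference. With the violating phase, some path differences give a fraction above one, which is the unphysical energy gain the text rules out.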
We have just seen that the background radiant energy is conserved for x-polarized
background radiation. Clearly, nothing stops us from now making the background energy
y-polarized and repeating the analysis. If we return to Eq. (4.150), now specifying that
\[
L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \cong 0
\tag{4.159a}
\]
and
\[
L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \cong \Delta\Omega_{\mathrm{back}}\, \delta(\vec{\varepsilon})\, \mathcal{L}_y^{(\mathrm{back})}(\sigma) = \Delta\Omega_{\mathrm{back}}\, \delta(\varepsilon_x)\, \delta(\varepsilon_y)\, \mathcal{L}_y^{(\mathrm{back})}(\sigma) ,
\tag{4.159b}
\]
everything will proceed as before because all the properties used to get to (4.158b) for $r_s$ and $t_s$
also hold true for $r_p$ and $t_p$. Having switched from x polarization to y polarization, we define
\[
r_p(\sigma) = |r_p(\sigma)|\, e^{i\theta_{rp}(\sigma)}
\tag{4.160a}
\]
and
\[
t_p(\sigma) = |t_p(\sigma)|\, e^{i\theta_{tp}(\sigma)}
\tag{4.160b}
\]
for $\theta_{rp}(\sigma)$ and $\theta_{tp}(\sigma)$ real parameters representing the phase of $r_p$ and $t_p$ as functions of $\sigma$. Again,
these functions must be odd:
\[
\theta_{rp}(-\sigma) = -\theta_{rp}(\sigma)
\tag{4.160c}
\]
and
\[
\theta_{tp}(-\sigma) = -\theta_{tp}(\sigma) .
\tag{4.160d}
\]
We can show that $|t_p(\sigma)|^2$ is the power transmission coefficient of the beam splitter for
y-polarized waves and that $|r_p(\sigma)|^2$ is the power reflection coefficient of the beam splitter for
y-polarized waves so that
\[
|t_p(\sigma)|^2 + |r_p(\sigma)|^2 = 1 .
\tag{4.160e}
\]
Repeating the argument that led to (4.158b) then gives
\[
e^{\pm 2i [ \theta_{tp}(\sigma) - \theta_{rp}(\sigma) ]} = -W
\tag{4.160f}
\]
for all positive and negative values of $\sigma$, allowing us to conserve energy for the y-polarized
background radiation passing through the interferometer.
These results certainly hold for the beam-splitter transmission and reflection coefficients ts, tp,
rs, and rp in our thought experiment on an ideal interferometer, but what about the ts, tp, rs, and rp
coefficients of a nonideal interferometer? The idealizations made at the start of this analysis in
Eqs. (4.151a)–(4.151c) are standard ways of improving the performance of Michelson
interferometers—decreasing substrate absorption, improving mirror reflectivity, and correctly
aligning the moving mirror—and in that sense are physically possible modifications that can be
made to the interferometer without changing the ts, tp, rs, and rp of the partially transmitting,
partially reflecting beam-splitter film. Similarly, we can imagine using polarizing filters and
beam collimators to create an x-polarized or y-polarized radiance field that is severely direction-
chopped, and then switching the interferometer’s entrance and exit ports to create “background”
radiation of the type specified in Eqs. (4.151d), (4.151e), (4.159a), and (4.159b). This is also a
procedure that does not affect the ts, tp, rs, and rp beam-splitter coefficients. Hence, our analysis
strongly suggests that the constraints on ts, tp, rs, and rp in Eqs. (4.158b) and (4.160f), which are
derived from these idealizations, can be confidently applied to the nonideal system of Eq. (4.150).
Concluding that this is in fact the case, we substitute (4.158b) and (4.160f) into (4.150) to get
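Average energy in unbalanced background signal over time 2T and beam cross section A
\[
\begin{aligned}
={}& 2TA \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \big[ |r_s(\sigma)|^4 + |t_s(\sigma)|^4 \big] |\gamma_s^{(uv)}(\sigma)|^2\, L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
&\qquad + \big[ |r_p(\sigma)|^4 + |t_p(\sigma)|^4 \big] |\gamma_p^{(uv)}(\sigma)|^2\, L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \Big\} \\
&- 2WTA \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 |\gamma_s^{(uv)}(\sigma)|^2\, L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
&\qquad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 |\gamma_p^{(uv)}(\sigma)|^2\, L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \big] \cdot \Big[ \tfrac{1}{A} \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma \chi \sqrt{1 - \varepsilon^2}} \Big] \Big\} \\
&- 2WTA \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 |\gamma_s^{(uv)}(\sigma)|^2\, L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
&\qquad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 |\gamma_p^{(uv)}(\sigma)|^2\, L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \big] \cdot \Big[ \tfrac{1}{A} \Pi_A(\sigma\vec{\Delta})^*\, e^{2\pi i \sigma \chi \sqrt{1 - \varepsilon^2}} \Big] \Big\} .
\end{aligned}
\tag{4.161a}
\]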
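Dividing by the measurement time 2T gives the average power in the unbalanced background signal,
\[
\begin{aligned}
P_{\mathrm{unb}}^{(\mathrm{back})}(\chi) ={}& A \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, |r(\sigma)|^2 \Big\{ \big[ |r_s(\sigma)|^4 + |t_s(\sigma)|^4 \big] |\gamma_s^{(uv)}(\sigma)|^2\, L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
&\quad + \big[ |r_p(\sigma)|^4 + |t_p(\sigma)|^4 \big] |\gamma_p^{(uv)}(\sigma)|^2\, L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
&\quad - 2 \frac{W}{A} \big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 |\gamma_s^{(uv)}(\sigma)|^2\, L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \\
&\qquad + |r_p(\sigma)|^2 |t_p(\sigma)|^2 |\gamma_p^{(uv)}(\sigma)|^2\, L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \big] \cdot \mathrm{Re}\big[ \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma \chi \cos\alpha_\varepsilon} \big] \Big\} ,
\end{aligned}
\tag{4.161b}
\]
where $\cos\alpha_\varepsilon = \sqrt{1 - \varepsilon^2}$ has the same meaning as in Eqs. (4.135f) above (it is the cosine of the
angle the propagation vector $\hat{\Omega} = \vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2}$ makes with the $\hat{z}$ axis of the unfolded
interferometer). This integral clearly gives a real value for $P_{\mathrm{unb}}^{(\mathrm{back})}$, as it should because $P_{\mathrm{unb}}^{(\mathrm{back})}$
is a real quantity.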
Simplified Formulas Describing Unbalanced Background Radiation · 4.18
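\[
L_x^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \cong L_y^{(\mathrm{back})}(\vec{\varepsilon}, \sigma) \cong \frac{1}{2} \mathcal{L}^{(\mathrm{back})}(\sigma) .
\tag{4.162a}
\]
Here, $\mathcal{L}^{(\mathrm{back})}(\sigma)$ is the total background optical power per unit cross-sectional area of the beam per
unit solid angle per unit $\sigma$ interval. Just like $\mathcal{L}(\sigma)$ in (4.136f), $\mathcal{L}^{(\mathrm{back})}$ is a double-sided power
spectrum, making it an even function of $\sigma$:
\[
\mathcal{L}^{(\mathrm{back})}(-\sigma) = \mathcal{L}^{(\mathrm{back})}(\sigma) .
\tag{4.162b}
\]
When there is negligible absorption in the partially transmitting and partially reflecting
beam-splitter film, Eqs. (4.157c) and (4.160e) require that
\[
\big[ |t_s(\sigma)|^2 + |r_s(\sigma)|^2 \big]^2 = 1
\]
and
\[
\big[ |t_p(\sigma)|^2 + |r_p(\sigma)|^2 \big]^2 = 1 .
\]
Expanding the squares gives
\[
|t_s(\sigma)|^4 + |r_s(\sigma)|^4 = 1 - 2 |t_s(\sigma)|^2 |r_s(\sigma)|^2
\]
and
\[
|t_p(\sigma)|^4 + |r_p(\sigma)|^4 = 1 - 2 |t_p(\sigma)|^2 |r_p(\sigma)|^2 ,
\]
so that
\[
|t_s(\sigma)|^4 + |r_s(\sigma)|^4 + |t_p(\sigma)|^4 + |r_p(\sigma)|^4
= 2 - 2 \big[ |t_s(\sigma)|^2 |r_s(\sigma)|^2 + |t_p(\sigma)|^2 |r_p(\sigma)|^2 \big] .
\tag{4.162c}
\]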
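\[
\eta(\sigma) = 2 |r(\sigma)|^2 |\gamma(\sigma)|^6 \big[ |r_s(\sigma)|^2 |t_s(\sigma)|^2 + |r_p(\sigma)|^2 |t_p(\sigma)|^2 \big] ,
\tag{4.162d}
\]
\[
|t_s(\sigma)|^4 + |r_s(\sigma)|^4 + |t_p(\sigma)|^4 + |r_p(\sigma)|^4 = 2 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^6 |r(\sigma)|^2} .
\tag{4.162e}
\]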
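The algebra linking (4.162c) to (4.162e) is easy to spot-check numerically. The sketch below is an illustration only: it reads (4.162d) as $\eta = 2|r|^2|\gamma|^6(|r_s|^2|t_s|^2 + |r_p|^2|t_p|^2)$, which is an assumption about a formula that is partly illegible in the source, and it uses made-up coefficient values for a lossless beam-splitter film.

```python
def eta(r2, gamma2, rs2, ts2, rp2, tp2):
    """Assumed reading of (4.162d):
    eta = 2 |r|^2 |gamma|^6 (|r_s|^2 |t_s|^2 + |r_p|^2 |t_p|^2).
    All arguments are squared magnitudes."""
    return 2.0 * r2 * gamma2 ** 3 * (rs2 * ts2 + rp2 * tp2)


def sum_of_fourth_powers(rs2, ts2, rp2, tp2):
    """|t_s|^4 + |r_s|^4 + |t_p|^4 + |r_p|^4 from the squared magnitudes."""
    return ts2 ** 2 + rs2 ** 2 + tp2 ** 2 + rp2 ** 2


# Illustrative values; a lossless film means rs2 + ts2 = 1 and rp2 + tp2 = 1.
rs2, rp2 = 0.42, 0.56
ts2, tp2 = 1.0 - rs2, 1.0 - rp2
r2, gamma2 = 0.97, 0.99   # |r|^2 and |gamma|^2, both close to one

lhs = sum_of_fourth_powers(rs2, ts2, rp2, tp2)
rhs_c = 2.0 - 2.0 * (rs2 * ts2 + rp2 * tp2)                              # (4.162c)
rhs_e = 2.0 - eta(r2, gamma2, rs2, ts2, rp2, tp2) / (gamma2 ** 3 * r2)   # (4.162e)
```

Both right-hand sides agree with the directly computed sum of fourth powers, independent of the particular values chosen for $|r|^2$ and $|\gamma|^2$.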
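Applying the idealizations in (4.151a) to (4.161b), and then substituting from Eqs. (4.162a),
(4.162d), and (4.162e) gives
\[
\begin{aligned}
P_{\mathrm{unb}}^{(\mathrm{back})}(\chi) ={}& \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, \mathcal{L}^{(\mathrm{back})}(\sigma) \Big\{ 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \\
&\quad - \frac{W}{A} \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \mathrm{Re}\big[ \Pi_A(\sigma\vec{\Delta})\, e^{-2\pi i \sigma \chi \cos\alpha_\varepsilon} \big] \Big\} .
\end{aligned}
\tag{4.162f}
\]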
The next idealization is to give the background-radiance beam a circular cross section. From
the work done in the balanced derivation [see Eq. (4.137e)], the formula for $A^{-1}\Pi_A$ is then
\[
\frac{1}{A} \Pi_A(\sigma\vec{\Delta}) \Big|_{\text{circle of radius } R} = \frac{J_1(4\pi R |\sigma| \theta_{ma})}{2\pi R |\sigma| \theta_{ma}} ,
\]
where $\theta_{ma}$ is the angle (in radians) between the surface normal vectors of the correctly aligned
and misaligned moving-mirror positions. From Eq. (4.137g), we know that
\[
\frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}}
\]
has the same value at $-\sigma$ as it has at $+\sigma$, so we can discard the absolute value signs and write
\[
\frac{1}{A} \Pi_A(\sigma\vec{\Delta}) \Big|_{\text{circle of radius } R} = \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} .
\]
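The evenness of the misalignment factor, and its approach to one for a well-aligned mirror, can be checked with a short calculation. The following is a minimal sketch under stated assumptions: $J_1$ is evaluated from its power series (adequate only for moderate arguments), and the beam radius, wavenumber, and tilt angle are hypothetical values chosen for illustration.

```python
import math


def bessel_j1(x, terms=40):
    """J1(x) from its power series; adequate for the moderate |x| used here."""
    half = x / 2.0
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k * half ** (2 * k + 1)
                  / (math.factorial(k) * math.factorial(k + 1)))
    return total


def misalignment_factor(R, sigma, theta_ma):
    """J1(4*pi*R*sigma*theta_ma) / (2*pi*R*sigma*theta_ma), with limit 1 at zero."""
    half_arg = 2.0 * math.pi * R * sigma * theta_ma
    if half_arg == 0.0:
        return 1.0
    return bessel_j1(2.0 * half_arg) / half_arg


R = 1.25            # beam radius in cm (illustrative)
theta_ma = 2.0e-5   # mirror tilt in radians (illustrative)
at_plus = misalignment_factor(R, 2000.0, theta_ma)
at_minus = misalignment_factor(R, -2000.0, theta_ma)
nearly_aligned = misalignment_factor(R, 2000.0, 1.0e-9)
```

Since $J_1$ is an odd function, dividing it by an odd function of $\sigma$ gives an even ratio, so `at_plus` and `at_minus` agree. As the tilt shrinks the factor tends to one, which is the limit used later when the interferometer is taken to be well aligned.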
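The $J_1$ Bessel function is always real when it has a real argument, so $A^{-1}\Pi_A$ must be real for a
circular cross section. This means that when this last expression is substituted into (4.162f), we
get
\[
\begin{aligned}
P_{\mathrm{unb}}^{(\mathrm{back})}(\chi) ={}& \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\, \mathcal{L}^{(\mathrm{back})}(\sigma) \Big\{ 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \\
&\quad - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \left[ \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} \right] \cdot \cos(2\pi\sigma\chi \cos\alpha_\varepsilon) \Big\} .
\end{aligned}
\tag{4.163a}
\]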
Equation (4.163a) corresponds to (4.137i) in the balanced derivation. Assuming the effective field
of view for the background radiance is sufficiently narrow that $\cos\alpha_\varepsilon \cong 1$, we can write (4.163a)
as
\[
\begin{aligned}
P_{\mathrm{unb}}^{(\mathrm{back})}(\chi) ={}& \frac{A\Omega}{2} \int_{-\infty}^{\infty} \mathcal{L}^{(\mathrm{back})}(\sigma) \Big\{ 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \\
&\quad - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \left[ \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} \right] \cdot \cos(2\pi\sigma\chi) \Big\}\, d\sigma .
\end{aligned}
\tag{4.163b}
\]
This result corresponds to Eq. (4.138a). Again, $\Omega$ represents the value of the integral over $d^2\varepsilon$.
This makes $\Omega$ the solid angle of the interferometer's effective field of view for the unbalanced
background signal, which should be, as pointed out in the discussion after Eq. (4.150), about the
same size as the interferometer's input field of view. We note that the entire product
\[
\mathcal{L}^{(\mathrm{back})}(\sigma) \Big\{ 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \left[ \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} \right] \cdot \cos(2\pi\sigma\chi) \Big\}
\]
is an even function of $\sigma$ if $\mathcal{L}^{(\mathrm{back})}$, $|r|^2$, $|\gamma|^2$, $\eta$, $\cos(2\pi\sigma\chi)$, and
\[
\frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}}
\]
are all even functions of $\sigma$. The cosine is always an even function, and Eq. (4.162b) shows that
$\mathcal{L}^{(\mathrm{back})}$ is even. The analysis following Eq. (4.138b) shows that $|r|^2$ and $\eta$ are also even functions
of $\sigma$, and Eq. (4.137g) shows that $(2\pi R \sigma \theta_{ma})^{-1} J_1(4\pi R \sigma \theta_{ma})$ is even. As for $|\gamma|^2$, the only
uncertainty left, we know that it must be an even function of $\sigma$ because, according to (4.151a), it
comes from idealized approximations for $|\gamma_{s,p}^{(abc)}|^2$ and $|\gamma_{s,p}^{(uv)}|^2$ that are themselves, as shown in
Eqs. (4.139f) and (4.145d), even functions of $\sigma$. Hence Eq. (4.163b) can be written as [by
applying formula (2.19) in Chapter 2]
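\[
\begin{aligned}
P_{\mathrm{unb}}^{(\mathrm{back})}(\chi) ={}& \frac{A\Omega}{2} \int_{0}^{\infty} L^{(\mathrm{back})}(\sigma) \Big\{ 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \\
&\quad - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \left[ \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} \right] \cdot \cos(2\pi\sigma\chi) \Big\}\, d\sigma ,
\end{aligned}
\tag{4.163c}
\]
where we define
\[
L^{(\mathrm{back})}(\sigma) = 2 \mathcal{L}^{(\mathrm{back})}(\sigma) \quad\text{for } \sigma \ge 0 .
\tag{4.163d}
\]
We recognize from the discussion preceding Eq. (4.136d) that $L^{(\mathrm{back})}$ can be thought of as the
spectral radiance of the background radiation causing the unbalanced background signal.
Equation (4.163c), then, corresponds to Eq. (4.140a) in the balanced derivation.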
Our next idealization is to assume the interferometer is well aligned so that $\theta_{ma} \to 0$.
Substitution of Eq. (4.137k), which states that
\[
\lim_{\theta_{ma} \to 0} \frac{J_1(4\pi R \sigma \theta_{ma})}{2\pi R \sigma \theta_{ma}} = 1 ,
\]
into (4.163c) gives
\[
P_{\mathrm{unb}}^{(\mathrm{back})}(\chi) = \frac{1}{2} \int_{0}^{\infty} S^{(\mathrm{back})}(\sigma) \Big\{ 2 |r(\sigma)|^2 |\gamma(\sigma)|^4 - \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} - W \left[ \frac{\eta(\sigma)}{|\gamma(\sigma)|^2} \right] \cdot \cos(2\pi\sigma\chi) \Big\}\, d\sigma ,
\tag{4.164a}
\]
where
\[
S^{(\mathrm{back})}(\sigma) = A \Omega\, L^{(\mathrm{back})}(\sigma)
\tag{4.164b}
\]
is the total, single-sided optical power per unit wavenumber interval entering the detector side of
the interferometer as background radiation. This corresponds to Eq. (4.140b) in the balanced
derivation.
() )
r () )
! () )
1
so that
5
1
P (back)
unb ( ) ³ S(back) () ) A 1 W cos(2&) ) d) , (4.164c)
20
which matches Eq. (4.140d) in the balanced derivation. We can then adopt the same convention as most optical textbooks by setting $W = 1$ to get
$$P_{unb}^{(back)}(\chi) = \frac{1}{2}\int_0^{\infty} S^{(back)}(\sigma)\cdot\left[1 - \cos(2\pi\sigma\chi)\right]d\sigma . \tag{4.164d}$$
Separating out the signal component $I_{unb}^{(back)}(\chi)$, which changes with $\chi$, gives
$$I_{unb}^{(back)}(\chi) = -\frac{1}{2}\int_0^{\infty} S^{(back)}(\sigma)\cos(2\pi\sigma\chi)\,d\sigma , \tag{4.165a}$$
corresponding to Eq. (4.141a) in the balanced derivation. Function $I_{unb}^{(back)}(\chi)$ is often called the
unbalanced background interferogram. It is difficult to imagine a procedure for recording the
balanced interferogram for the input optical signal that does not at the same time record the
unbalanced background interferogram; fortunately, there are several well-known calibration
methods discussed in Secs. 5.14 and 5.19 of Chapter 5 that can be used to measure and eliminate
the unbalanced background interferogram from interferometer data.
From (4.163d) and (4.164b), we have
Because $L^{(back)}$ is an even function [see Eq. (4.162b)], we can easily extend the definition of
$S^{(back)}$ to negative values of $\sigma$ by saying that
$S^{(back)}(\sigma)\cos(2\pi\sigma\chi)$ is an even function of $\sigma$ and Eq. (2.19) of Chapter 2 can be used to write (4.165a) as
$$I_{unb}^{(back)}(\chi) = -\frac{1}{4}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\cos(2\pi\sigma\chi)\,d\sigma . \tag{4.165d}$$
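Because $S^{(back)}(\sigma)\sin(2\pi\sigma\chi)$ is an odd function of $\sigma$, we have
$$\int_{-\infty}^{\infty} S^{(back)}(\sigma)\,e^{\pm 2\pi i\sigma\chi}\,d\sigma = \int_{-\infty}^{\infty} S^{(back)}(\sigma)\cos(2\pi\sigma\chi)\,d\sigma ,$$
since the integral over all $\sigma$ of $S^{(back)}(\sigma)\sin(2\pi\sigma\chi)$
must be zero [see Eq. (2.17) in Chapter 2]. This last result can be combined with Eq. (4.165d) to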
get
$$I_{unb}^{(back)}(\chi) = -\frac{1}{4}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\,e^{\pm 2\pi i\sigma\chi}\,d\sigma , \tag{4.165e}$$
corresponding to Eq. (4.141d) in the balanced derivation. Equation (4.165e), just like (4.141d) for
the balanced interferogram, shows that we can get the unbalanced background spectrum by taking
the appropriate Fourier transform of $I_{unb}^{(back)}$. There are calibration procedures that can be used to
isolate the unbalanced background interferogram, giving us access to the unbalanced background
spectrum, but these measurements are usually of interest only to scientists and engineers trying to
improve the performance of poorly working interferometers.
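The equivalence of the one-sided cosine form (4.165a) and the two-sided exponential form (4.165e) can be illustrated numerically. This is only a sketch under stated assumptions: the Gaussian `spectrum` below is a hypothetical even function standing in for $S^{(back)}$, and the integration limits and step counts are arbitrary choices.

```python
import cmath
import math

def spectrum(sigma):
    """Hypothetical even background spectrum: a Gaussian bump at |sigma| = 5."""
    return math.exp(-(abs(sigma) - 5.0) ** 2)

def trapezoid(f, a, b, n):
    """Composite trapezoid rule; works for real- or complex-valued f."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

def interferogram_one_sided(chi, s_max=15.0, n=6000):
    """-(1/2) * integral from 0 to s_max of S cos(2 pi sigma chi), as in (4.165a)."""
    return -0.5 * trapezoid(
        lambda s: spectrum(s) * math.cos(2.0 * math.pi * s * chi), 0.0, s_max, n)

def interferogram_two_sided(chi, s_max=15.0, n=12000):
    """-(1/4) * integral from -s_max to s_max of S exp(2 pi i sigma chi), as in (4.165e)."""
    return -0.25 * trapezoid(
        lambda s: spectrum(s) * cmath.exp(2j * math.pi * s * chi), -s_max, s_max, n)

chi = 0.1
print(interferogram_one_sided(chi), interferogram_two_sided(chi))
```

The imaginary part of the two-sided result vanishes to rounding error because the sine contribution is odd, which is the step that takes (4.165d) to (4.165e).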
__________
This chapter starts with Maxwell’s equations and ends up with detailed formulas for the
balanced and unbalanced optical power leaving the exit port of a standard Michelson
interferometer. The formulas account for imperfect reflection off the interferometer’s end mirrors
as well as the reflection, transmission, and absorption characterizing nonideal beam splitters and
compensator plates. Along the way, we have learned how to characterize the optical beams
passing through interferometers as well as how to handle polarized input radiation, slightly
misaligned instruments, and an input spectrum that is nonuniform over the field of view. We have
also, and in the end perhaps most importantly, introduced the concept of spectral radiance to
describe the behavior of electromagnetic wavefields inside Michelson interferometers.
Appendix 4A
We define a complex vector $\vec{a}$ to be, for any three-dimensional Cartesian coordinate system having $\hat{x}, \hat{y}, \hat{z}$ unit vectors along the x, y, z Cartesian axes,
$$\vec{a} = \hat{x}a_x + \hat{y}a_y + \hat{z}a_z , \tag{4A.1}$$
where $a_x, a_y, a_z$ are three complex scalars. Using the subscript r to denote a complex scalar’s real part and the subscript i to denote the complex scalar’s imaginary part, we have
$$a_x = a_{rx} + i a_{ix} , \quad a_y = a_{ry} + i a_{iy} , \quad a_z = a_{rz} + i a_{iz}$$
for $i = \sqrt{-1}$. The $\hat{x}, \hat{y}, \hat{z}$ unit vectors themselves are taken to be real and can be written in column-vector notation as
$$\hat{x} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} , \quad \hat{y} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} , \quad \hat{z} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} ,$$
which means the complex vector $\vec{a}$ can be written in column-vector notation as
$$\vec{a} = \begin{pmatrix} a_{rx} + i a_{ix} \\ a_{ry} + i a_{iy} \\ a_{rz} + i a_{iz} \end{pmatrix} .$$
Many of the standard three-dimensional formulas for real vectors can be extended to complex vectors without any difficulty. For example, we define the vector dot product of two complex vectors $\vec{a}$ and $\vec{b}$ to be
$$\vec{a} \bullet \vec{b} = a_x b_x + a_y b_y + a_z b_z , \tag{4A.3a}$$
where $a_x b_x, a_y b_y, a_z b_z$ are the complex products of two complex scalars. Applying (4A.3a) to the formulas for $\hat{x}, \hat{y}, \hat{z}$, and $\vec{a}$, we get
$$a_x = \vec{a} \bullet \hat{x} = \hat{x} \bullet \vec{a} , \quad a_y = \vec{a} \bullet \hat{y} = \hat{y} \bullet \vec{a} , \quad a_z = \vec{a} \bullet \hat{z} = \hat{z} \bullet \vec{a} \tag{4A.3b}$$
just like when $\vec{a}$ is a real vector. To make the length of a complex vector $\vec{a}$ a non-negative real number, we define
$$|\vec{a}| = \sqrt{\vec{a} \bullet \vec{a}^{\,*}} \tag{4A.4}$$
where $\vec{a}^{\,*}$, the complex conjugate of $\vec{a}$, is
$$\vec{a}^{\,*} = \hat{x}a_x^* + \hat{y}a_y^* + \hat{z}a_z^* \tag{4A.5}$$
or in column-vector notation
$$\vec{a}^{\,*} = \begin{pmatrix} a_{rx} - i a_{ix} \\ a_{ry} - i a_{iy} \\ a_{rz} - i a_{iz} \end{pmatrix} .$$
The formula for the vector cross product of two complex three-dimensional vectors $\vec{a}$ and $\vec{b}$ is also identical to the formula for the cross product of two real three-dimensional vectors,
$$\vec{a} \times \vec{b} = -\vec{b} \times \vec{a} = \hat{x}(a_y b_z - a_z b_y) + \hat{y}(a_z b_x - a_x b_z) + \hat{z}(a_x b_y - a_y b_x) . \tag{4A.6}$$
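The definitions in Eqs. (4A.3a) through (4A.6) translate directly into a few lines of code. The sketch below is illustrative only; the function names and the sample vectors are assumptions, and vectors are stored as plain 3-tuples of Python complex numbers.

```python
import math

def dot(a, b):
    """Unconjugated dot product of Eq. (4A.3a)."""
    return sum(x * y for x, y in zip(a, b))

def conj(a):
    """Component-wise complex conjugate, Eq. (4A.5)."""
    return tuple(x.conjugate() for x in a)

def length(a):
    """|a| = sqrt(a . a*), Eq. (4A.4); a . a* has zero imaginary part."""
    return math.sqrt(dot(a, conj(a)).real)

def cross(a, b):
    """Cross product of Eq. (4A.6), same formula as for real vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

# Hypothetical sample vectors.
a = (1 + 2j, 0.5j, -1 + 0j)
b = (2 + 0j, 1 - 1j, 3j)

print(dot(a, a))      # generally complex, so it cannot serve as a squared length
print(length(a))      # non-negative real number
print(cross(a, b))
```

The first printed value shows why the conjugate is needed in (4A.4): the plain product $\vec{a} \bullet \vec{a}$ of a complex vector with itself is in general not real.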
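The well-known operations of vector calculus on real three-dimensional vector fields can also be extended to fields of complex three-dimensional vectors. We define the $\vec{\nabla}$ operator in the usual way,
$$\vec{\nabla} = \hat{x}\frac{\partial}{\partial x} + \hat{y}\frac{\partial}{\partial y} + \hat{z}\frac{\partial}{\partial z} , \tag{4A.7a}$$
so that the gradient of a complex scalar field $\alpha$ is
$$\vec{\nabla}\alpha = \hat{x}\frac{\partial\alpha}{\partial x} + \hat{y}\frac{\partial\alpha}{\partial y} + \hat{z}\frac{\partial\alpha}{\partial z} = \hat{x}\left(\frac{\partial\alpha_r}{\partial x} + i\frac{\partial\alpha_i}{\partial x}\right) + \hat{y}\left(\frac{\partial\alpha_r}{\partial y} + i\frac{\partial\alpha_i}{\partial y}\right) + \hat{z}\left(\frac{\partial\alpha_r}{\partial z} + i\frac{\partial\alpha_i}{\partial z}\right) , \tag{4A.7b}$$
where $\alpha = \alpha_r + i\alpha_i$ for $\alpha_r$ the real part of $\alpha$ and $\alpha_i$ the imaginary part of $\alpha$. We know for any real three-dimensional vector field $\vec{\rho} = \hat{x}\rho_x + \hat{y}\rho_y + \hat{z}\rho_z$ that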
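$$\vec{\nabla} \bullet \vec{\rho} = \frac{\partial\rho_x}{\partial x} + \frac{\partial\rho_y}{\partial y} + \frac{\partial\rho_z}{\partial z} . \tag{4A.8a}$$
For any complex vector field $\vec{a} = \hat{x}a_x + \hat{y}a_y + \hat{z}a_z$ we now define
$$\vec{\nabla} \bullet \vec{a} = \frac{\partial a_x}{\partial x} + \frac{\partial a_y}{\partial y} + \frac{\partial a_z}{\partial z} . \tag{4A.8b}$$
The complex vector field can be written as
$$\vec{a} = \vec{a}_r + i\,\vec{a}_i ,$$
where the vector field’s real component $\vec{a}_r$ is the real vector
$$\vec{a}_r = \hat{x}a_{rx} + \hat{y}a_{ry} + \hat{z}a_{rz} = \begin{pmatrix} a_{rx} \\ a_{ry} \\ a_{rz} \end{pmatrix} , \tag{4A.9b}$$
and the vector field’s imaginary component $\vec{a}_i$ is the real vector
$$\vec{a}_i = \hat{x}a_{ix} + \hat{y}a_{iy} + \hat{z}a_{iz} = \begin{pmatrix} a_{ix} \\ a_{iy} \\ a_{iz} \end{pmatrix} . \tag{4A.9c}$$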
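Now we can treat i like any other constant scalar to write
$$\vec{\nabla} \bullet \vec{a} = \vec{\nabla} \bullet (\vec{a}_r + i\vec{a}_i) = \vec{\nabla} \bullet \vec{a}_r + i\,\vec{\nabla} \bullet \vec{a}_i . \tag{4A.10a}$$
Equation (4A.10a) is the same as (4A.8b) and can be used instead of (4A.8b) to define $\vec{\nabla} \bullet \vec{a}$ for a complex vector field $\vec{a}$ in terms of the already-understood $\vec{\nabla}\,\bullet$ operation applied to the real vector fields $\vec{a}_r$ and $\vec{a}_i$. We know that the curl $\vec{\nabla} \times \vec{\rho}$ of any real, three-dimensional vector field $\vec{\rho} = \hat{x}\rho_x + \hat{y}\rho_y + \hat{z}\rho_z$ is
$$\vec{\nabla} \times \vec{\rho} = \hat{x}\left(\frac{\partial\rho_z}{\partial y} - \frac{\partial\rho_y}{\partial z}\right) + \hat{y}\left(\frac{\partial\rho_x}{\partial z} - \frac{\partial\rho_z}{\partial x}\right) + \hat{z}\left(\frac{\partial\rho_y}{\partial x} - \frac{\partial\rho_x}{\partial y}\right) .$$
Now for the curl of any complex vector field $\vec{a}$ we can write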
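$$\vec{\nabla} \times \vec{a} = \vec{\nabla} \times (\vec{a}_r + i\,\vec{a}_i) = \vec{\nabla} \times \vec{a}_r + i\,(\vec{\nabla} \times \vec{a}_i) , \tag{4A.10b}$$
which defines $\vec{\nabla} \times \vec{a}$ in terms of the curls $\vec{\nabla} \times \vec{a}_r$ and $\vec{\nabla} \times \vec{a}_i$ of two real vector fields $\vec{a}_r$ and $\vec{a}_i$.
We know that $\nabla^2\alpha_r$ for any real scalar field $\alpha_r$ is
$$\nabla^2\alpha_r = \frac{\partial^2\alpha_r}{\partial x^2} + \frac{\partial^2\alpha_r}{\partial y^2} + \frac{\partial^2\alpha_r}{\partial z^2} , \tag{4A.11a}$$
and for a complex scalar field $\alpha = \alpha_r + i\alpha_i$ we define
$$\nabla^2\alpha = \nabla^2\alpha_r + i\,\nabla^2\alpha_i = \left(\frac{\partial^2\alpha_r}{\partial x^2} + \frac{\partial^2\alpha_r}{\partial y^2} + \frac{\partial^2\alpha_r}{\partial z^2}\right) + i\left(\frac{\partial^2\alpha_i}{\partial x^2} + \frac{\partial^2\alpha_i}{\partial y^2} + \frac{\partial^2\alpha_i}{\partial z^2}\right) . \tag{4A.11b}$$
The standard definition of $\nabla^2\vec{\rho}$ for any real vector $\vec{\rho} = \hat{x}\rho_x + \hat{y}\rho_y + \hat{z}\rho_z$ is
$$\nabla^2\vec{\rho} = \hat{x}\nabla^2\rho_x + \hat{y}\nabla^2\rho_y + \hat{z}\nabla^2\rho_z . \tag{4A.11c}$$
For any complex vector field $\vec{a}$ we say that
$$\nabla^2\vec{a} = \hat{x}\nabla^2 a_x + \hat{y}\nabla^2 a_y + \hat{z}\nabla^2 a_z \tag{4A.11d}$$
for $a_x, a_y, a_z$, the three complex scalar fields that are the x, y, z components of the complex vector field $\vec{a}$. Equations (4A.11a), (4A.11b) and (4A.11d) when taken together define $\nabla^2\vec{a}$ for any complex vector field $\vec{a}$. Note that we can also use
$$\nabla^2\vec{a} = \nabla^2\vec{a}_r + i\,\nabla^2\vec{a}_i \tag{4A.11e}$$
to define $\nabla^2\vec{a}$ in terms of $\nabla^2$ applied to the real vector fields $\vec{a}_r$ and $\vec{a}_i$.
If we have a constant complex vector $\vec{u}$ multiplied by a complex scalar field $\alpha$, then
$$\vec{u} = \hat{x}u_x + \hat{y}u_y + \hat{z}u_z \quad\text{and}\quad \alpha\vec{u} = \hat{x}(\alpha u_x) + \hat{y}(\alpha u_y) + \hat{z}(\alpha u_z) ,$$
where $u_x, u_y, u_z$ are constant complex scalars and $\alpha$ is a complex scalar function of position. From (4A.11d) we have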
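$$\nabla^2(\alpha\vec{u}) = \hat{x}\nabla^2(\alpha u_x) + \hat{y}\nabla^2(\alpha u_y) + \hat{z}\nabla^2(\alpha u_z) = \hat{x}u_x\nabla^2\alpha + \hat{y}u_y\nabla^2\alpha + \hat{z}u_z\nabla^2\alpha = \vec{u}\,\nabla^2\alpha . \tag{4A.12a}$$
Another useful identity involving a constant complex vector $\vec{u}$ multiplied by a complex scalar field $\alpha$ comes from using Eq. (4A.8b) to simplify $\vec{\nabla} \bullet (\alpha\vec{u})$,
$$\vec{\nabla} \bullet (\alpha\vec{u}) = \frac{\partial}{\partial x}(\alpha u_x) + \frac{\partial}{\partial y}(\alpha u_y) + \frac{\partial}{\partial z}(\alpha u_z) = u_x\frac{\partial\alpha}{\partial x} + u_y\frac{\partial\alpha}{\partial y} + u_z\frac{\partial\alpha}{\partial z} = \vec{u} \bullet (\vec{\nabla}\alpha) . \tag{4A.12b}$$
Here we have used Eqs. (4A.3a) and (4A.7b) in the last step of (4A.12b). We also note that
$$\begin{aligned}
\vec{\nabla} \times (\alpha\vec{u}) &= \hat{x}\left(\frac{\partial(\alpha u_z)}{\partial y} - \frac{\partial(\alpha u_y)}{\partial z}\right) + \hat{y}\left(\frac{\partial(\alpha u_x)}{\partial z} - \frac{\partial(\alpha u_z)}{\partial x}\right) + \hat{z}\left(\frac{\partial(\alpha u_y)}{\partial x} - \frac{\partial(\alpha u_x)}{\partial y}\right) \\
&= \hat{x}\left(u_z\frac{\partial\alpha}{\partial y} - u_y\frac{\partial\alpha}{\partial z}\right) + \hat{y}\left(u_x\frac{\partial\alpha}{\partial z} - u_z\frac{\partial\alpha}{\partial x}\right) + \hat{z}\left(u_y\frac{\partial\alpha}{\partial x} - u_x\frac{\partial\alpha}{\partial y}\right) \\
&= -\vec{u} \times (\vec{\nabla}\alpha) .
\end{aligned} \tag{4A.12c}$$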
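Identities (4A.12b) and (4A.12c) can be spot-checked with central finite differences. The sketch below is not from the text: it assumes a plane-wave scalar field $\alpha = e^{i\vec{k} \bullet \vec{r}}$ with a hypothetical real wavevector `K` and a hypothetical constant complex vector `U`, and all helper names are illustrative.

```python
import cmath

K = (0.3, -0.7, 1.1)              # hypothetical real wavevector
U = (1 + 2j, -0.5j, 2 - 1j)       # hypothetical constant complex vector

def alpha(x, y, z):
    """Assumed complex scalar field: a plane wave exp(i K . r)."""
    return cmath.exp(1j * (K[0] * x + K[1] * y + K[2] * z))

def partial(f, point, axis, h=1e-5):
    """Central-difference partial derivative of f along one axis."""
    plus, minus = list(point), list(point)
    plus[axis] += h
    minus[axis] -= h
    return (f(*plus) - f(*minus)) / (2.0 * h)

def component(i):
    """i-th Cartesian component of the vector field alpha * U."""
    return lambda x, y, z: alpha(x, y, z) * U[i]

def grad_alpha(point):
    return tuple(partial(alpha, point, ax) for ax in range(3))

def div_alpha_u(point):
    """Left-hand side of (4A.12b)."""
    return sum(partial(component(i), point, i) for i in range(3))

def curl_alpha_u(point):
    """Left-hand side of (4A.12c)."""
    d = lambda i, ax: partial(component(i), point, ax)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

P = (0.2, -0.4, 0.9)              # arbitrary test point
print(div_alpha_u(P), dot(U, grad_alpha(P)))
print(curl_alpha_u(P), tuple(-c for c in cross(U, grad_alpha(P))))
```

Each printed pair should agree to within the finite-difference error, since for constant $\vec{u}$ every derivative of $\alpha\vec{u}$ reduces to a component of $\vec{u}$ times a derivative of $\alpha$.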
We define a complex vector $\vec{a} = \vec{a}_r + i\,\vec{a}_i$ to be orthogonal to a real vector $\vec{\rho}$ when
$$\vec{\rho} \bullet \vec{a} = \vec{a} \bullet \vec{\rho} = \vec{\rho} \bullet \vec{a}_r + i\,\vec{\rho} \bullet \vec{a}_i = 0 . \tag{4A.13}$$
In (4A.13), both the real and imaginary components of the dot product, $\vec{\rho} \bullet \vec{a}_r$ and $\vec{\rho} \bullet \vec{a}_i$ respectively, must be zero. Equation (4A.13) requires that both $\vec{a}_r$ and $\vec{a}_i$ be perpendicular to $\vec{\rho}$ in the standard sense of real three-dimensional vectors. Another vector identity that holds true for two real vectors $\vec{\rho}_a$, $\vec{\rho}_b$, and a complex vector $\vec{a} = \vec{a}_r + i\,\vec{a}_i$ is
$$\vec{\rho}_a \times (\vec{\rho}_b \times \vec{a}) = (\vec{\rho}_a \bullet \vec{a})\vec{\rho}_b - (\vec{\rho}_a \bullet \vec{\rho}_b)\vec{a} . \tag{4A.14}$$
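Because the standard identity $\vec{\rho}_1 \times (\vec{\rho}_2 \times \vec{\rho}_3) = (\vec{\rho}_1 \bullet \vec{\rho}_3)\vec{\rho}_2 - (\vec{\rho}_1 \bullet \vec{\rho}_2)\vec{\rho}_3$ holds true for any real vectors $\vec{\rho}_1, \vec{\rho}_2, \vec{\rho}_3$, it follows that
$$\begin{aligned}
\vec{\rho}_a \times (\vec{\rho}_b \times \vec{a}) &= \vec{\rho}_a \times (\vec{\rho}_b \times \vec{a}_r) + i\,[\vec{\rho}_a \times (\vec{\rho}_b \times \vec{a}_i)] \\
&= (\vec{\rho}_a \bullet \vec{a}_r)\vec{\rho}_b - (\vec{\rho}_a \bullet \vec{\rho}_b)\vec{a}_r + i\,[(\vec{\rho}_a \bullet \vec{a}_i)\vec{\rho}_b - (\vec{\rho}_a \bullet \vec{\rho}_b)\vec{a}_i] \\
&= [\vec{\rho}_a \bullet (\vec{a}_r + i\vec{a}_i)]\vec{\rho}_b - (\vec{\rho}_a \bullet \vec{\rho}_b)(\vec{a}_r + i\vec{a}_i) = (\vec{\rho}_a \bullet \vec{a})\vec{\rho}_b - (\vec{\rho}_a \bullet \vec{\rho}_b)\vec{a} .
\end{aligned}$$
It follows that
$$\begin{aligned}
(\vec{\rho} \times \vec{a}) \bullet (\vec{\rho} \times \vec{a}^{\,*}) &= [\vec{\rho} \times (\vec{a}_r + i\vec{a}_i)] \bullet [\vec{\rho} \times (\vec{a}_r - i\vec{a}_i)] \\
&= [(\vec{\rho} \times \vec{a}_r) + i(\vec{\rho} \times \vec{a}_i)] \bullet [(\vec{\rho} \times \vec{a}_r) - i(\vec{\rho} \times \vec{a}_i)] \\
&= (\vec{\rho} \times \vec{a}_r) \bullet (\vec{\rho} \times \vec{a}_r) + (\vec{\rho} \times \vec{a}_i) \bullet (\vec{\rho} \times \vec{a}_i) .
\end{aligned} \tag{4A.15}$$
Because $\vec{\rho}$ and $\vec{a}_r$ are real, and because $\vec{\rho}$ and $\vec{a}_r$ are orthogonal (remember that $\vec{\rho} \bullet \vec{a}_r = 0$), we know the length of $\vec{\rho} \times \vec{a}_r$ must be
$$|\vec{\rho} \times \vec{a}_r| = |\vec{\rho}| \cdot |\vec{a}_r| \cdot \sin\theta = |\vec{\rho}| \cdot |\vec{a}_r| ,$$
where $|\vec{\rho}|$ is the length of $\vec{\rho}$, $|\vec{a}_r|$ is the length of $\vec{a}_r$, and the angle $\theta$ between $\vec{\rho}$ and $\vec{a}_r$ must be $\pi/2$. Because the dot product of a real vector with itself gives the square of its length, we conclude that
$$(\vec{\rho} \times \vec{a}_r) \bullet (\vec{\rho} \times \vec{a}_r) = |\vec{\rho}|^2\,|\vec{a}_r|^2 = (\vec{\rho} \bullet \vec{\rho})(\vec{a}_r \bullet \vec{a}_r) .$$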
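Both results can be spot-checked numerically. The sketch below is an illustration with hypothetical vectors: it tests identity (4A.14) for arbitrary real $\vec{\rho}_a$, $\vec{\rho}_b$ and complex $\vec{a}$, then evaluates the last line of (4A.15) for a complex vector whose real and imaginary parts are both perpendicular to a real $\vec{\rho}$.

```python
def dot(a, b):
    """Unconjugated dot product, Eq. (4A.3a)."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product, Eq. (4A.6)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def conj(a):
    return tuple(complex(x).conjugate() for x in a)

# Identity (4A.14) with hypothetical vectors.
rho_a = (1.0, 0.0, 2.0)
rho_b = (0.0, 3.0, -1.0)
a = (1 - 3j, 2 + 0.5j, 0.5j)
lhs_14 = cross(rho_a, cross(rho_b, a))
rhs_14 = tuple(dot(rho_a, a) * rb - dot(rho_a, rho_b) * ac
               for rb, ac in zip(rho_b, a))

# Eq. (4A.15) for a complex vector orthogonal to rho in the sense of (4A.13).
rho = (0.0, 0.0, 2.0)
a_r = (1.0, 2.0, 0.0)      # real part, perpendicular to rho
a_i = (-3.0, 0.5, 0.0)     # imaginary part, perpendicular to rho
a_perp = tuple(complex(r, i) for r, i in zip(a_r, a_i))
lhs_15 = dot(cross(rho, a_perp), cross(rho, conj(a_perp)))
rhs_15 = dot(rho, rho) * dot(a_r, a_r) + dot(rho, rho) * dot(a_i, a_i)

print(lhs_14, rhs_14)
print(lhs_15, rhs_15)
```

The second comparison applies the conclusion $(\vec{\rho} \times \vec{a}_r) \bullet (\vec{\rho} \times \vec{a}_r) = (\vec{\rho} \bullet \vec{\rho})(\vec{a}_r \bullet \vec{a}_r)$ to both the real and imaginary parts; treating the imaginary part the same way is an assumption of this sketch, justified by the same orthogonality argument.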
$$\mathrm{Re}(\alpha) = \alpha_r \tag{4A.17a}$$
and
$$\mathrm{Im}(\alpha) = \alpha_i \tag{4A.17b}$$
to specify the real and imaginary parts of $\alpha$. Similarly, for any complex vector $\vec{a} = \vec{a}_r + i\,\vec{a}_i$, we can use the notation
$$\mathrm{Re}(\vec{a}) = \vec{a}_r \tag{4A.17c}$$
and
$$\mathrm{Im}(\vec{a}) = \vec{a}_i \tag{4A.17d}$$
to specify the real and imaginary parts of $\vec{a}$. We define $L_R$ to be a linear operator that, when operating on a real three-dimensional scalar or vector field, creates another scalar or vector field that is also real. We call $L_R$ a real linear operator. When operating on a complex quantity, a real linear operator $L_R$ can return either a real or complex quantity; but when operating on a real quantity, a real linear operator must return another real quantity. Because $L_R$ is linear, we know that
$$L_R(\alpha\vec{a} + \beta\vec{b}) = \alpha L_R(\vec{a}) + \beta L_R(\vec{b}) \tag{4A.18}$$
for any two real or complex constant scalars $\alpha$, $\beta$ and any two real or complex vector fields $\vec{a}$, $\vec{b}$. When dealing with scalar fields we need only remove all the vector signs from the linear-operator formula in Eq. (4A.18). We note that the $\vec{\nabla}\,\times$, $\vec{\nabla}\,\bullet$, and $\partial/\partial t$ operators in Maxwell’s equations are all real linear operators, as are the $\partial^2/\partial t^2$ and $\nabla^2$ operators created by manipulation of Maxwell’s equations.
Many times we have to find real vector fields $\vec{a}_R$ and $\vec{b}_R$ that satisfy equations of the form
$$L_1(\vec{a}_R) + L_2(\vec{b}_R) = 0 \tag{4A.19a}$$
$$L_3(\vec{a}_R) + L_4(\vec{b}_R) = 0 \tag{4A.19b}$$
$$\vdots$$
etc.
for real linear operators $L_1, L_2, L_3, L_4, \ldots$ . It is often easier to find two complex vector-field solutions $\vec{a}$ and $\vec{b}$ such that
$$L_1(\vec{a}) + L_2(\vec{b}) = 0 \tag{4A.19c}$$
$$L_3(\vec{a}) + L_4(\vec{b}) = 0 \tag{4A.19d}$$
$$\vdots$$
etc.
than it is to find real vector fields $\vec{a}_R$ and $\vec{b}_R$ satisfying (4A.19a) and (4A.19b). For any real linear operator $L_R$ acting on a complex vector field $\vec{c} = \vec{c}_r + i\,\vec{c}_i$, with $\vec{c}_r$ and $\vec{c}_i$ the real and imaginary parts of $\vec{c}$, we have
$$L_R(\vec{c}) = L_R(\vec{c}_r + i\,\vec{c}_i) = L_R(\vec{c}_r) + i\,L_R(\vec{c}_i) .$$
Both $L_R(\vec{c}_r)$ and $L_R(\vec{c}_i)$ must be real because they represent real linear operators acting on real vector fields $\vec{c}_r$ and $\vec{c}_i$. Hence,
$$\mathrm{Re}\big(L_R(\vec{c})\big) = L_R(\vec{c}_r) = L_R\big(\mathrm{Re}(\vec{c})\big) \tag{4A.20a}$$
and
$$\mathrm{Im}\big(L_R(\vec{c})\big) = L_R(\vec{c}_i) = L_R\big(\mathrm{Im}(\vec{c})\big) . \tag{4A.20b}$$
Although Re and Im are not themselves true linear operators, we do know that for any two complex vector fields $\vec{u}$ and $\vec{v}$
$$\mathrm{Re}(\vec{u} + \vec{v}) = \mathrm{Re}(\vec{u}) + \mathrm{Re}(\vec{v}) \tag{4A.21a}$$
and
$$\mathrm{Im}(\vec{u} + \vec{v}) = \mathrm{Im}(\vec{u}) + \mathrm{Im}(\vec{v}) . \tag{4A.21b}$$
We can thus take the real and imaginary parts of (4A.19c) and (4A.19d), using (4A.21a) and (4A.21b) to get
$$\mathrm{Re}[L_1(\vec{a})] + \mathrm{Re}[L_2(\vec{b})] = 0$$
$$\mathrm{Re}[L_3(\vec{a})] + \mathrm{Re}[L_4(\vec{b})] = 0$$
$$\vdots$$
etc.
and
$$\mathrm{Im}[L_1(\vec{a})] + \mathrm{Im}[L_2(\vec{b})] = 0$$
$$\mathrm{Im}[L_3(\vec{a})] + \mathrm{Im}[L_4(\vec{b})] = 0$$
$$\vdots$$
etc.
Equations (4A.20a) and (4A.20b) then let us write these as
$$L_1\big(\mathrm{Re}(\vec{a})\big) + L_2\big(\mathrm{Re}(\vec{b})\big) = 0 \tag{4A.22a}$$
$$L_3\big(\mathrm{Re}(\vec{a})\big) + L_4\big(\mathrm{Re}(\vec{b})\big) = 0 \tag{4A.22b}$$
$$\vdots$$
etc.
and
$$L_1\big(\mathrm{Im}(\vec{a})\big) + L_2\big(\mathrm{Im}(\vec{b})\big) = 0 \tag{4A.22c}$$
$$L_3\big(\mathrm{Im}(\vec{a})\big) + L_4\big(\mathrm{Im}(\vec{b})\big) = 0 \tag{4A.22d}$$
$$\vdots$$
etc.
Equations (4A.22a)–(4A.22d) show that both $\mathrm{Re}(\vec{a}), \mathrm{Re}(\vec{b})$ and $\mathrm{Im}(\vec{a}), \mathrm{Im}(\vec{b})$ are pairs of real $\vec{a}_R, \vec{b}_R$ fields that satisfy Eqs. (4A.19a) and (4A.19b). We can thus solve sets of equations based on real linear operators by allowing the proposed solutions to be complex vector fields, finding formulas for these complex vector fields, and then, at the very end of the process, taking either the real or imaginary part of the complex solutions to get the desired real solutions. When following this procedure, it is customary to take the real rather than the imaginary parts of the complex solutions to get the desired real solutions.
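The procedure can be illustrated with a one-dimensional example that is not taken from the text. The sketch assumes the real linear operator $L_R = d^2/dt^2 + \omega^2$, approximated by central differences, and the complex solution $c(t) = e^{i\omega t}$; `OMEGA`, `L_R` and the helper names are all hypothetical.

```python
import cmath

OMEGA = 2.0  # hypothetical angular frequency

def L_R(f, t, h=1e-4):
    """Real linear operator d^2/dt^2 + OMEGA^2, via central differences."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / h ** 2 + OMEGA ** 2 * f(t)

def c(t):
    """Complex solution of L_R(c) = 0."""
    return cmath.exp(1j * OMEGA * t)

def c_real(t):
    """Real part of the complex solution, cos(OMEGA * t)."""
    return c(t).real

def c_imag(t):
    """Imaginary part of the complex solution, sin(OMEGA * t)."""
    return c(t).imag

t0 = 0.7
print(L_R(c, t0))                      # approximately zero (complex)
print(L_R(c, t0).real, L_R(c_real, t0))  # Re(L_R(c)) versus L_R(Re(c)), Eq. (4A.20a)
print(L_R(c, t0).imag, L_R(c_imag, t0))  # Im(L_R(c)) versus L_R(Im(c)), Eq. (4A.20b)
```

Because the operator never mixes real and imaginary parts, taking the real part at the very end gives the same answer as working with the real field throughout.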
Appendix 4B
We must be careful when approximating the phase terms of interferometer equations because phase changes can be significant while still being very small compared to the largest term in the phase expression. Consider, for example, the expressions
$$S_{cmplx} = e^{iA \cdot (1+\delta)} \tag{4B.1a}$$
and
$$S_{real} = \cos\big(A \cdot (1+\delta)\big) . \tag{4B.1b}$$
When can we say that $S_{cmplx} \cong e^{iA}$ and $S_{real} \cong \cos(A)$ are good approximations of Eqs. (4B.1a) and (4B.1b)? At first glance, we might say that if $\delta \ll 1$, then the contribution of $\delta$ to the phase expression $A \cdot (1+\delta)$ can be neglected because no matter what the size of A the fractional error in the phase from neglecting the presence of $\delta$ is
$$\frac{A \cdot (1+\delta) - A}{A} = \delta \ll 1 .$$
We can, however, always write
$$A = 2N\pi + a$$
for some positive (or negative) integer N and a non-negative real variable $a < 2\pi$. Because $A \cdot (1+\delta)$ is a phase, Eqs. (4B.1a) and (4B.1b) can be written as
$$S_{cmplx} = e^{i(A + A\delta)} = e^{i(2N\pi + a + A\delta)} = e^{i(a + A\delta)}$$
and
$$S_{real} = \cos(A + A\delta) = \cos(2N\pi + a + A\delta) = \cos(a + A\delta) .$$
Now it looks like what matters is that $A\delta$ be small compared to a. But all we are interested in is the approximate value of $e^{i(a + A\delta)}$ or $\cos(a + A\delta)$. If A is about equal to $2N\pi$ so that $a = A - 2N\pi$ is very small, making it about the same size or even smaller than the small value of $A\delta$, then $A\delta$ can still be neglected as long as we can say
$$S_{cmplx} = e^{iA \cdot (1+\delta)} \cong e^{i2N\pi}$$
and
$$S_{real} = \cos\big(A \cdot (1+\delta)\big) \cong \cos(2N\pi) .$$
For this reason, we adopt as our rule for neglecting $\delta$ that the change in phase $A\delta$ must be small compared to the change in phase producing an $O(1)$ change in $\exp(iA \cdot (1+\delta))$ or $\cos(A \cdot (1+\delta))$. This means $\delta$ must satisfy
$$A\delta \ll \pi/4 \approx 1 \tag{4B.1c}$$
before we can write
$$e^{iA \cdot (1+\delta)} \cong e^{iA}$$
or
$$\cos\big(A \cdot (1+\delta)\big) \cong \cos(A) .$$
Our rule of thumb, then, is to give both A and $\delta$ their extreme allowed values, maximizing $A\delta$, and after that to check to see whether the resulting maximum $A\delta$ value satisfies (4B.1c). If it does, we can be sure that (4B.1c) is also satisfied for all the nonextreme $A\delta$ products, allowing us to neglect $\delta$ in Eqs. (4B.1a) and (4B.1b).
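A short numerical illustration of this rule follows. It is a sketch with hypothetical values of A and $\delta$; the function name `phase_error` is an assumption. The error made by dropping $\delta$ equals $2|\sin(A\delta/2)|$, so it is controlled by the product $A\delta$ and not by $\delta$ alone.

```python
import cmath
import math

def phase_error(A, delta):
    """|exp(iA(1+delta)) - exp(iA)|, the error made by neglecting delta."""
    return abs(cmath.exp(1j * A * (1.0 + delta)) - cmath.exp(1j * A))

A = 2.0 * math.pi * 1.0e4 + 0.3     # hypothetical large phase
for delta in (1e-9, 1e-6, 5e-5):    # all satisfy delta << 1
    predicted = 2.0 * abs(math.sin(A * delta / 2.0))
    print(delta, A * delta, phase_error(A, delta), predicted)
```

With these numbers the largest $\delta$ is still only $5 \times 10^{-5}$, yet $A\delta$ is close to $\pi$ and the approximation fails completely, while the smallest $\delta$ gives a negligible error.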
We start our analysis with terms such as
$$e^{2\pi i\sigma\chi\,\hat{\Omega} \bullet \hat{z}} ,$$
where $\hat{\Omega}$ and $\sigma$ are respectively the propagation vector and wavenumber of a monochromatic plane wave and $\chi$ is the OPD value of an interferometer. The beam passing through an interferometer is direction-chopped, which means that all the plane waves have propagation vectors that are parallel to, or nearly parallel to, $\hat{z}$, so $\hat{\Omega} \bullet \hat{z} \cong 1$. Does this mean that
$$e^{2\pi i\sigma\chi\,\hat{\Omega} \bullet \hat{z}} \cong e^{2\pi i\sigma\chi} \ ?$$
We now show why this approximation does not work. Following the notation developed in Sec. 4.12 above, we take $\theta_b$ to be the angle between $\hat{\Omega}$ and the $\hat{z}$ axis. The types of interferometers we are interested in have angles $\theta_b$ that are relatively small,
In a well-designed interferometer $\chi_{max}$, the largest possible absolute value of the OPD, or optical path difference, must satisfy the inequality
for accurate spectral measurements to occur.72 As a general rule, interferometer designs attempt to maximize the optical signal, which usually means making $\theta_{bmax}$ as large as possible. Consequently, it makes sense to assume that
because angle $\theta_b$ is small. Substituting this into the phase $2\pi\sigma\chi\,\hat{\Omega} \bullet \hat{z}$ gives
Here, $2\pi\sigma\chi$ plays the role of A [see discussion following Eq. (4B.1b)], and the terms $\theta_b^2/2$ and $\theta_b^4/24$ play the role of $\delta$. We first take $\delta = \theta_b^4/24$ and note that the maximum expected value of $A\delta$ is
$$2\pi\sigma_{max}\chi_{max}\,(\theta_{bmax}^4/24) \cong \frac{1}{4}\theta_{bmax}^2 ,$$
where we have taken $\pi \cong 3$ and used (4B.2d). Inequality (4B.2a) then shows that
$$A\delta \le \frac{1}{4} \cdot 10^{-4} ,$$
72
John Chamberlain, The Principles of Interferometric Spectroscopy (John Wiley and Sons, New York, 1979), pp.
220–222.
which, according to (4B.1c), is small enough to neglect. When, however, $\delta = \theta_b^2/2$, it follows that $A\delta$ can be as large as
$$\cos\theta_b \cong 1 - \frac{\theta_b^2}{2} \tag{4B.3b}$$
5 ⋅10−5 << 1 .
We conclude that $e^{2\pi i\sigma\chi\,\hat{\Omega} \bullet \hat{z}}$ cannot be approximated as $e^{2\pi i\sigma\chi}$ unless we are prepared to put stricter limits on $\sigma$, $\chi$, and $\theta_b$.
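The conclusion can be made concrete with numbers. The sketch below assumes $\theta_b = 10^{-2}$ radians and takes $\sigma_{max}\chi_{max}\theta_{bmax}^2 = 1$, a value inferred from the estimate $2\pi\sigma_{max}\chi_{max}(\theta_{bmax}^4/24) \cong \theta_{bmax}^2/4$ quoted above with $\pi \cong 3$; both choices, and all variable names, are assumptions made for illustration.

```python
import cmath
import math

theta_b = 1.0e-2                    # assumed off-axis angle, radians
sigma_chi = 1.0 / theta_b ** 2      # assumed product sigma_max * chi_max

def phase_factor(cos_value):
    """exp(2 pi i sigma chi * cos_value) for the assumed sigma * chi."""
    return cmath.exp(2j * math.pi * sigma_chi * cos_value)

exact = phase_factor(math.cos(theta_b))              # uses the true cosine
naive = phase_factor(1.0)                            # drops theta_b entirely
quadratic = phase_factor(1.0 - theta_b ** 2 / 2.0)   # keeps theta_b^2 / 2 only

naive_error = abs(exact - naive)
quadratic_error = abs(exact - quadratic)
print(naive_error, quadratic_error)
```

Under these assumptions the naive factor is wrong by an $O(1)$ amount because $2\pi\sigma\chi\,\theta_b^2/2$ is close to $\pi$, whereas dropping only the $\theta_b^4/24$ term changes the factor by a few parts in $10^5$.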
We now consider a plane wave with a unit-length propagation vector $\hat{\omega}$ that is incident on the flat moving mirror of a Michelson interferometer. When the moving mirror is correctly aligned, its unit-length surface normal is $\hat{z}$, pointing approximately antiparallel to $\hat{\omega}$ as shown in Fig. 4B.1; and when the moving mirror is misaligned by a very small angle, its unit-length surface normal is $\hat{n}_M$. The unit-length propagation vector of the plane wave reflected from the aligned moving mirror is $\hat{\Omega}$, and the unit-length propagation vector of the plane wave reflected from the misaligned moving mirror is $\hat{\Omega}_d$. We know that the angle between $\hat{\Omega}$ and $\hat{\Omega}_d$ is $\theta_d$, with $\theta_d$ much smaller than $\theta_b$ as shown in inequality (4.68) of Sec. 4.12 above. Since we are only interested in finding the interferometer’s measurement noise for small misalignment angles, we say that $\theta_{dmax}$, the maximum expected value of $\theta_d$, satisfies
$$\frac{\theta_{dmax}}{\theta_{bmax}} \le 10^{-2} . \tag{4B.4a}$$
FIGURE 4B.1. [Diagram labels: unit vectors $\hat{\Omega}$, $\hat{z}$, and $\hat{\omega}$ at the reflective surface of the moving mirror, with the angle $\theta_b$ marked twice.]
There is also a close connection between $\theta_{dmax}$ and the cross-sectional size of the interferometer’s beam. In a well-designed interferometer,73
where D is the typical distance across the beam’s cross-sectional area. If, for example, the beam
has a circular cross section, then D is the circle’s diameter.
Although, as shown in Fig. 4B.1, vectors $\hat{\omega}$, $\hat{z}$, and $\hat{\Omega}$ always lie in the same plane, there is no reason to expect the surface normal $\hat{n}_M$ of the misaligned moving mirror, or the propagation vector $\hat{\Omega}_d$ of the plane wave reflected off the misaligned moving mirror, also to lie in that plane. We do, however, know that $\hat{\omega}$, $\hat{z}$, $\hat{\Omega}$, $\hat{n}_M$, and $\hat{\Omega}_d$ are all unit-length vectors. When we put the bases or “tails” of vectors $\hat{z}$, $\hat{\Omega}$, $\hat{n}_M$, and $\hat{\Omega}_d$ at the same location, their tips always lie on the surface of a sphere of unit radius; and if we put the tip of $\hat{\omega}$ together with the other four vectors’ bases, then the base of $\hat{\omega}$ lies on the surface of that same sphere. Because $\theta_b \le 10^{-2}$ radians and $\theta_d \le 10^{-4}$ radians [see inequalities (4B.2a) and (4B.4b)] are very small angles, we can approximate the sphere’s curving surface near the tip of $\hat{z}$ as a plane, drawing the construction shown in Fig. 4B.2. Then, according to the law of specular reflection, the base of $\hat{\omega}$ lies on a straight line with the tips of $\hat{z}$ and $\hat{\Omega}$, with the tip of $\hat{z}$ lying a distance $\theta_b$ from the base of $\hat{\omega}$ and the tip of $\hat{\Omega}$ lying a distance $\theta_b$ from the tip of $\hat{z}$. Similarly, the base of $\hat{\omega}$ lies on a straight line with the tips of $\hat{n}_M$ and $\hat{\Omega}_d$, with the tip of $\hat{n}_M$ lying halfway between $\hat{\omega}$ and $\hat{\Omega}_d$. Having defined, using this flat-plane approximation, that the distance between the tips of $\hat{\Omega}$ and $\hat{\Omega}_d$ on the unit sphere is angle $\theta_d$, we then know that the distance between the tips of $\hat{z}$ and $\hat{n}_M$ must be $\theta_d/2$. This result comes from the similar triangle theorem: the triangle formed by the base of $\hat{\omega}$ and the tips of $\hat{\Omega}$ and $\hat{\Omega}_d$ is twice the size of, and similar in shape to, the triangle formed by the base of $\hat{\omega}$
73
D. Cohen, “Performance Degradation of a Michelson Interferometer When Its Misalignment Angle Is a Rapidly
Varying Time Series,” Applied Optics 36, no. 18 (20 June 1997), pp. 4034–4042.
FIGURE 4B.2. [Diagram labels: unit vectors $\hat{\Omega}$, $\hat{\Omega}_d$, $\hat{z}$, and $\hat{\omega}$; vectors $\vec{\Gamma}$ and $\vec{\gamma}$; angles $\theta_d$, $\theta_d/2$, $\phi$, and $\theta_b$.] This diagram and Fig. 4B.3 go with the discussion following Eq. (4B.4c) in Appendix 4B. No matter where $\hat{\omega}$ is put in this geometric construction, the angle between $\hat{\Omega}$ and $\hat{\Omega}_d$ is always twice the angle between the tips of vectors $\hat{z}$ and $\hat{n}_M$. Note in Fig. 4B.3 how the angle between the tips of $\hat{\Omega}_1$, $\hat{\Omega}_{d1}$ and $\hat{\Omega}_2$, $\hat{\Omega}_{d2}$ is twice as large as the angle between the tips of $\hat{z}$, $\hat{n}_M$ even though $\hat{\omega}_1$ and $\hat{\omega}_2$ are not the same vector.
FIGURE 4B.3. [Diagram labels: unit vectors $\hat{\Omega}_1$, $\hat{\Omega}_{d1}$, $\hat{\Omega}_2$, $\hat{\Omega}_{d2}$, and $\hat{z}$; vectors $\vec{\Gamma}_{d1}$, $\vec{\Gamma}_{d2}$, and $\vec{\gamma}$; angles $\theta_{b1}$ and $\theta_{b2}$.]
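with $\vec{\gamma}$ the displacement vector from the tip of $\hat{z}$ to the tip of $\hat{n}_M$ and $\vec{\Gamma}$ the displacement vector from the tip of $\hat{\Omega}$ to the tip of $\hat{\Omega}_d$. According to these definitions, we have
$$|\vec{\Gamma}| \cong \theta_d \tag{4B.4f}$$
and
$$|\vec{\gamma}| \cong \theta_d/2 . \tag{4B.4g}$$
$$\hat{\Omega}_d - \hat{\Omega} \cong 2 \cdot (\hat{n}_M - \hat{z}) \tag{4B.4i}$$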
for all possible incident propagation vectors $\hat{\omega}$ (as long as the incident wave is part of the field-chopped beam, propagating parallel to or nearly parallel to the $\hat{z}$ axis).
We next consider whether the approximation $\vec{\Gamma} \cong 2\vec{\gamma}$, which is strictly true when the surface of the unit sphere is treated as a plane, is accurate enough to use in the phase terms of Chapter 4. Figure 4B.4 shows the orientation of vectors $\hat{\omega}$, $\hat{z}$, $\hat{n}_M$, $\hat{\Omega}$, and $\hat{\Omega}_d$ on the curved surface of a unit sphere. We acknowledge the curvature of the sphere by drawing two straight lines $s'$ and $s''$ perpendicularly from the shaft of $\hat{z}$ to the tips of $\hat{\Omega}_d$ and $\hat{\Omega}$ respectively. We also draw two arc
lengths $a'$ and $a''$ running along the surface of the sphere from the tip of $\hat{z}$ to the tips of $\hat{\Omega}_d$ and $\hat{\Omega}$ respectively. If we decrease $\phi$ while holding $\theta_b$ and $\theta_d$ constant, which shortens $a'$ and draws the tip of $\hat{\Omega}_d$ closer to the tip of $\hat{z}$, then the straight line $s'$ hits the shaft of $\hat{z}$ at a point that gets closer to the tip of $\hat{z}$, increasing $\hat{\Omega}_d \bullet \hat{z}$, the distance from where $s'$ hits $\hat{z}$ to the base of $\hat{z}$. Changing angle $\phi$ does not change $a''$, $s''$, or the value of $\hat{\Omega} \bullet \hat{z}$, the distance from where $s''$ hits $\hat{z}$ to the base of $\hat{z}$. Clearly, the smaller we can make $a'$ compared to $a''$, the greater is the difference between the values of $\hat{\Omega}_d \bullet \hat{z}$ and $\hat{\Omega} \bullet \hat{z}$. If instead of decreasing $\phi$ we increase it past $\pi/2$, the point where $s'$ hits the shaft of $\hat{z}$ starts dropping, eventually going below the point where $s''$ hits the shaft of $\hat{z}$. Thus it is also true that as we increase $\phi$, the difference between $\hat{\Omega}_d \bullet \hat{z}$ and $\hat{\Omega} \bullet \hat{z}$ eventually begins to increase. We conclude, then, that the difference between $\hat{\Omega}_d \bullet \hat{z}$ and $\hat{\Omega} \bullet \hat{z}$ is maximized when $\phi$ is 0 or $\pi$, making $|a' - a''|$ a maximum.
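The term $e^{2\pi i\sigma\chi(\hat{\Omega}_d \bullet \hat{z})}$ first appears in Eqs. (4.86a) and (4.86b) in Sec. 4.12. We want to maximize the difference $\Delta$ between the phase term $2\pi\sigma\chi\,\hat{\Omega}_d \bullet \hat{z}$ and $2\pi\sigma\chi\,\hat{\Omega} \bullet \hat{z}$ to see if, even when this difference $\Delta$ is at a maximum, the latter can be used to approximate the former. Therefore we take $\sigma = \sigma_{max}$, $\chi = \chi_{max}$ in the phase term to get
$$\Delta = \left|2\pi\sigma\chi\,\hat{\Omega}_d \bullet \hat{z} - 2\pi\sigma\chi\,\hat{\Omega} \bullet \hat{z}\right| \le 2\pi\sigma_{max}\chi_{max}\left|\hat{\Omega}_d \bullet \hat{z} - \hat{\Omega} \bullet \hat{z}\right|$$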
FIGURE 4B.4. [Diagram labels: vectors $\vec{\Gamma}$ and $\vec{\gamma}$; straight line $s'$; arc lengths $a'$ and $a''$; angles $\theta_d$ and $\theta_b$; unit vectors $\hat{\Omega}$, $\hat{z}$, and $\hat{\omega}$.]
Substituting this latest result into the previous inequality to set an upper bound on the size of $\Delta$, we get
$$\Delta \le 2\pi\sigma_{max}\chi_{max}\left[\theta_{bmax}\theta_{dmax} + \frac{\theta_{dmax}^2}{2}\right] + O(\sigma_{max}\chi_{max}\theta_{bmax}^3\theta_{dmax}) . \tag{4B.5a}$$
which according to (4B.1c) is small enough to neglect. Hence we can write (4B.5a) as
From inequality (4B.4a), the first term on the right-hand side is less than or equal to $2\pi \cdot 10^{-2}$, and the second term on the right-hand side is less than or equal to $\pi \cdot 10^{-4}$. Both of these are, according to (4B.1c), small enough to neglect. We conclude that $\Delta$ must itself be small enough to neglect, letting us write
$$2\pi\sigma\chi\,\hat{\Omega}_d \bullet \hat{z} \cong 2\pi\sigma\chi\,\hat{\Omega} \bullet \hat{z}$$
or
$$e^{2\pi i\sigma\chi\,\hat{\Omega}_d \bullet \hat{z}} \cong e^{2\pi i\sigma\chi\,\hat{\Omega} \bullet \hat{z}} \tag{4B.5d}$$
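The size of $\Delta$ can be evaluated directly for representative numbers. This sketch assumes $\theta_b = 10^{-2}$, $\theta_d = 10^{-4}$ radians and $\sigma_{max}\chi_{max} = 1/\theta_{bmax}^2$, which reproduces the quoted bounds $2\pi \cdot 10^{-2}$ and $\pi \cdot 10^{-4}$; these values and the function names are assumptions. It uses the worst cases $\phi = 0$ and $\phi = \pi$, where $\hat{\Omega}_d$ makes an angle $\theta_b \mp \theta_d$ with $\hat{z}$.

```python
import math

theta_b = 1.0e-2                   # assumed, radians
theta_d = 1.0e-4                   # assumed, radians
sigma_chi = 1.0 / theta_b ** 2     # assumed product sigma_max * chi_max

def worst_case_delta(theta_b, theta_d, sigma_chi):
    """Largest phase difference, taken over phi = 0 and phi = pi."""
    shifts = (abs(math.cos(theta_b - theta_d) - math.cos(theta_b)),
              abs(math.cos(theta_b + theta_d) - math.cos(theta_b)))
    return 2.0 * math.pi * sigma_chi * max(shifts)

def leading_bound(theta_b, theta_d, sigma_chi):
    """Bracketed leading terms of (4B.5a)."""
    return 2.0 * math.pi * sigma_chi * (theta_b * theta_d + theta_d ** 2 / 2.0)

delta = worst_case_delta(theta_b, theta_d, sigma_chi)
bound = leading_bound(theta_b, theta_d, sigma_chi)
print(delta, bound)
```

Under these assumptions $\Delta$ is about 0.063 radians, which is small compared with the $\pi/4$ threshold of (4B.1c).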
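$$\hat{\Omega} = \hat{\omega} - 2\hat{z}\,(\hat{\omega} \bullet \hat{z}) \tag{4B.6a}$$
for plane waves reflecting off the correctly aligned moving mirror, and
$$\hat{\Omega}_d = \hat{\omega} - 2\hat{n}_M\,(\hat{\omega} \bullet \hat{n}_M) \tag{4B.6b}$$
for plane waves reflecting off the misaligned moving mirror. The orientation of $\hat{\omega}$ and $\hat{\Omega}$ with respect to $\hat{z}$ shows that
$$\hat{\omega} \bullet \hat{z} = -\hat{\Omega} \bullet \hat{z} , \tag{4B.6c}$$
and, similarly,
$$\hat{\omega} \bullet \hat{n}_M = -\hat{\Omega}_d \bullet \hat{n}_M . \tag{4B.6d}$$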
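Hence Eqs. (4B.6a-d) can be used to write the phase in $e^{2\pi i\sigma\,\vec{r} \bullet \vec{\Gamma}}$ as
$$2\pi\sigma\,\vec{r} \bullet \vec{\Gamma} = 2\pi\sigma\,\vec{r} \bullet (\hat{\Omega}_d - \hat{\Omega}) = 4\pi\sigma\,\vec{r} \bullet [\hat{n}_M(\hat{\Omega}_d \bullet \hat{n}_M) - \hat{z}(\hat{\Omega} \bullet \hat{z})] . \tag{4B.7a}$$
Remembering the definition $\vec{\gamma} = \hat{n}_M - \hat{z}$ in Eq. (4B.4d), we will now demonstrate that the rightmost expression in (4B.7a) can be approximated as $4\pi\sigma\,\vec{r} \bullet \vec{\gamma}$, which turns (4B.7a) into
$$2\pi\sigma\,\vec{r} \bullet \vec{\Gamma} \cong 4\pi\sigma\,\vec{r} \bullet \vec{\gamma} = 4\pi\sigma\,\vec{r} \bullet [\hat{n}_M - \hat{z}] . \tag{4B.7b}$$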
The $4\pi\sigma(\vec{r} \bullet \hat{n}_M)(\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z})$ term can be shown to be negligible by evaluating the upper limit of its absolute value,
$$\left|4\pi\sigma(\vec{r} \bullet \hat{n}_M)(\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z})\right| \le 4\pi\sigma_{max}\,|\vec{r} \bullet \hat{n}_M| \cdot \left|\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z}\right| \cong 4\pi\sigma_{max}\,|z| \cdot \left|\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z}\right| , \tag{4B.8b}$$
because the $\hat{n}_M$ unit vector is tilted away from $\hat{z}$ by only a very small angle.74 Although at first we might suppose that $|z|$ can be indefinitely large, this is not the case. The maximum value of $\theta_b$, which we called $\theta_{bmax}$ above, governs how much the interferometer beam’s cross section spreads as radiation travels through the interferometer. We are only interested in approximating the phase terms for field points inside the interferometer, and if $|z|$ gets too large it represents points outside the interferometer where the validity of our phase approximations is irrelevant. We assume that in a well-designed interferometer the beam does not spread more than 5%, which means the product $|z|\,\theta_{bmax}$ satisfies the inequality
$$0 \le |z|\,\theta_{bmax} \le \frac{D}{20} \quad\text{or}\quad |z|_{max} \le \frac{D}{20\,\theta_{bmax}} \tag{4B.8c}$$
and $\hat{n}_M$. Following the same sort of reasoning used above to analyze the behavior of $\hat{\Omega}_d \bullet \hat{z}$ and $\hat{\Omega} \bullet \hat{z}$ [see discussion after Eq. (4B.4i)], we note that as $\phi$ decreases to zero in Fig. 4B.5, the arc length $a'''$ eventually decreases, because as the tips of $\hat{n}_M$, $\hat{z}$, $\hat{\Omega}_d$, and $\hat{\Omega}$ fall onto the same arc, the tip of $\hat{n}_M$ only goes about half as far toward the base of $\hat{\omega}$ as the tip of $\hat{\Omega}_d$ [see Eq. (4B.4h)]. This means that the point where $s'''$ perpendicularly joins the shaft of $\hat{n}_M$ approaches the tip of $\hat{n}_M$, increasing the value of $\hat{\Omega}_d \bullet \hat{n}_M$. While this happens, there is no change in the position where $s''$ perpendicularly joins the shaft of $\hat{z}$, so $\hat{\Omega} \bullet \hat{z}$ stays the same. Thus, for $\phi = 0$ there is a maximum in the value of $\hat{\Omega}_d \bullet \hat{n}_M$ that, because $\hat{\Omega} \bullet \hat{z}$ stays the same, maximizes the expression
$$\left|\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z}\right| .$$
When $\phi$ increases to $\pi$ in Fig. 4B.5, arc length $a'''$ eventually increases, because as the tips of $\hat{n}_M$, $\hat{z}$, $\hat{\Omega}_d$, and $\hat{\Omega}$ fall onto the same arc, the tip of $\hat{\Omega}_d$ moves away from the base of $\hat{\omega}$ by double the distance that $\hat{n}_M$ does. This makes the point where $s'''$ perpendicularly joins the shaft of $\hat{n}_M$ drop further from the tip of $\hat{n}_M$, decreasing the value of $\hat{\Omega}_d \bullet \hat{n}_M$. Consequently, $\phi = \pi$ marks the other maximum in
74
This angle is approximately $\theta_d/2$, which is less than or equal to $5 \times 10^{-5}$ radians; see inequality (4B.4b).
FIGURE 4B.5. [Diagram labels: vectors $\vec{\gamma}$ and $\vec{\Gamma}$; straight lines $s''$ and $s'''$; arc lengths $a''$ and $a'''$; angles $\phi$, $\theta_d$, and $\theta_b$; unit vectors $\hat{\Omega}_d$, $\hat{\Omega}$, $\hat{z}$, and $\hat{\omega}$.]
$$\left|\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z}\right| .$$
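Hence the upper limit of the absolute value of the $4\pi\sigma(\vec{r} \bullet \hat{n}_M)(\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z})$ term in Eq. (4B.8a) is given by, using Eq. (4B.8b),
$$\left|4\pi\sigma(\vec{r} \bullet \hat{n}_M)(\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z})\right| \le 4\pi\sigma_{max}\,|z| \cdot \left|\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z}\right| \le 4\pi\sigma_{max}\,|z|_{max}\left|\cos(a''') - \cos\theta_b\right| .$$
To maximize $|\cos(a''') - \cos\theta_b|$, we either maximize $\cos(a''')$ when $\cos(a''') > \cos(\theta_b)$ at $\phi = 0$ or minimize $\cos(a''')$ when $\cos(\theta_b) > \cos(a''')$ at $\phi = \pi$. When $\phi = 0$ we have $a''' = \theta_b - \theta_d/2$ so $\cos(a''') = \cos(\theta_b - \theta_d/2)$. Similarly, when $\phi = \pi$, we have $a''' = \theta_b + \theta_d/2$ so $\cos(a''') = \cos(\theta_b + \theta_d/2)$. Hence the two possible maximums of $|\cos(a''') - \cos\theta_b|$ at $\phi = 0$ and $\phi = \pi$ must each be less than or equal to $|\cos(\theta_b \pm \theta_d/2) - \cos\theta_b|$. This latest expression can only get larger when we stop dividing $\theta_d$ by 2. Therefore we can write
$$\left|4\pi\sigma(\vec{r} \bullet \hat{n}_M)(\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z})\right| \le 4\pi\sigma_{max}\,|z|_{max}\left|\cos(\theta_b \pm \theta_d) - \cos\theta_b\right| . \tag{4B.9a}$$
Inequality (4B.8c) can now be used to show that (approximating the cosine by its power series because both $\theta_b$ and $\theta_d$ are small)
$$\left|4\pi\sigma(\vec{r} \bullet \hat{n}_M)(\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z})\right| \le \frac{\pi\sigma_{max}D}{5\,\theta_{bmax}}\left|\mp\theta_b\theta_d - \frac{\theta_d^2}{2} + O(\theta_b^3\theta_d)\right| \le \frac{\pi D\sigma_{max}}{5\,\theta_{bmax}}\left[\theta_b\theta_d + \frac{\theta_d^2}{2} + O(\theta_b^3\theta_d)\right] .$$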
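Replacing $\theta_b$ and $\theta_d$ by $\theta_{bmax}$ and $\theta_{dmax}$ gives
$$\left|4\pi\sigma(\vec{r} \bullet \hat{n}_M)(\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z})\right| \le \frac{\pi D\sigma_{max}}{5}\,\theta_{dmax} + \frac{\pi D\sigma_{max}}{10}\,\frac{\theta_{dmax}^2}{\theta_{bmax}} + O(D\sigma_{max}\theta_{bmax}^2\theta_{dmax}) .$$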
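According to (4B.1c), this upper bound is small enough to neglect, so we can rewrite inequality (4B.9b) as
$$\left|4\pi\sigma(\vec{r} \bullet \hat{n}_M)(\hat{\Omega}_d \bullet \hat{n}_M - \hat{\Omega} \bullet \hat{z})\right|$$
which is again small enough to neglect. Clearly, the left-hand side of inequality (4B.9d) must always be small enough to neglect, allowing us to approximate Eq. (4B.8a) as
$$4\pi\sigma\,\vec{r} \bullet [\hat{n}_M(\hat{\Omega}_d \bullet \hat{n}_M) - \hat{z}(\hat{\Omega} \bullet \hat{z})] \cong 4\pi\sigma\,\vec{r} \bullet [\hat{n}_M(\hat{\Omega} \bullet \hat{z}) - \hat{z}(\hat{\Omega} \bullet \hat{z})] = 4\pi\sigma\,\vec{r} \bullet (\hat{n}_M - \hat{z}) \cdot (\hat{\Omega} \bullet \hat{z}) . \tag{4B.10a}$$
We now write
$$\hat{\Omega} \bullet \hat{z} = \cos\theta_b \cong 1 - \frac{\theta_b^2}{2}$$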
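The second term on the right-hand side of (4B.10b) has an absolute value with an upper limit
$$\left|2\pi\sigma\,\vec{r} \bullet (\hat{n}_M - \hat{z})\,\theta_b^2\right| \le 2\pi\sigma_{max}\theta_{bmax}^2\left|\vec{r} \bullet (\hat{n}_M - \hat{z})\right| = 2\pi\sigma_{max}\theta_{bmax}^2\left|\vec{r} \bullet \vec{\gamma}\right| \le 2\pi\sigma_{max}\theta_{bmax}^2\,|\vec{r}|_{max} \cdot |\vec{\gamma}|_{max} .$$
We note that the last step here is really a gross overestimate of $|\vec{r} \bullet \vec{\gamma}|$ because the two unit-length vectors $\hat{n}_M$ and $\hat{z}$ are almost parallel, making vectors $\vec{r}$ and $\hat{n}_M - \hat{z}$ almost perpendicular for large values of $|\vec{r}|$. We estimate $|\vec{r}|_{max}$ by $|z|_{max}$, writing that
$$\left|2\pi\sigma\,\vec{r} \bullet (\hat{n}_M - \hat{z})\,\theta_b^2\right| \le 2\pi\sigma_{max}\theta_{bmax}^2\,|z|_{max} \cdot |\vec{\gamma}|_{max} \le 2\pi\sigma_{max}\theta_{bmax}^2 \cdot \frac{D}{20\,\theta_{bmax}} \cdot \frac{\theta_{dmax}}{2} ,$$
where in the second step Eq. (4B.4g) is used to replace $|\vec{\gamma}|_{max}$ by $\theta_{dmax}/2$, and inequality (4B.8c) is used to replace $|z|_{max}$ by $D/(20\,\theta_{bmax})$. Now we can use inequalities (4B.4c) and (4B.2a) to write
$$\left|2\pi\sigma\,\vec{r} \bullet (\hat{n}_M - \hat{z})\,\theta_b^2\right| \le \frac{0.14\,\pi}{20} \cdot \theta_{bmax} \le \frac{0.14\,\pi}{20} \cdot 10^{-2} ,$$
which is, according to (4B.1c), small enough to neglect. We conclude that the second term on the right-hand side of (4B.10b) is small enough to ignore, so that (4B.10b) becomes
$$4\pi\sigma\,\vec{r} \bullet [\hat{n}_M(\hat{\Omega}_d \bullet \hat{n}_M) - \hat{z}(\hat{\Omega} \bullet \hat{z})] \cong 4\pi\sigma\,\vec{r} \bullet (\hat{n}_M - \hat{z}) .$$
For the final step, we substitute this back into Eq. (4B.7a) to get
$$2\pi\sigma\,\vec{r} \bullet \vec{\Gamma} \cong 4\pi\sigma\,\vec{r} \bullet (\hat{n}_M - \hat{z})$$
or
$$2\pi\sigma\,\vec{r} \bullet \vec{\Gamma} \cong 4\pi\sigma\,\vec{r} \bullet \vec{\gamma} , \tag{4B.10c}$$
where in the last step we use that $\vec{\gamma} = \hat{n}_M - \hat{z}$ from Eq. (4B.4d). This shows that the approximation in Eq. (4B.7b) holds true, which is what we set out to demonstrate. Since $\vec{\Gamma} = \hat{\Omega}_d - \hat{\Omega}$ [see Eq. (4B.4e) above], this result can also be written as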
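\[
e^{2\pi i\sigma\, \vec r \cdot (\hat\Omega_d - \hat\Omega)} \cong e^{4\pi i\sigma\, \vec r \cdot (\hat n_M - \hat z)} . \tag{4B.10d}
\]
Before moving on, it is worth checking whether the phase term $e^{4\pi i\sigma\, \vec r \cdot (\hat n_M - \hat z)}$ can be simplified any further. Figure 4B.6 (see caption) shows that when the angle between the $\hat z$ and $\hat n_M$ unit-normal vectors is approximately $\theta_d/2$, as specified in Fig. 4B.2, then the deviation of vector $\vec\gamma = \hat n_M - \hat z$ from being exactly perpendicular to $\hat z$ is approximately the angle $\theta_d/4$. If we decompose $\vec\gamma$ into a vector $\vec\gamma_\perp$ that is exactly perpendicular to $\hat z$ and a vector $\vec\gamma_\parallel$ that is antiparallel to $\hat z$, we have
\[
\vec\gamma = \vec\gamma_\perp + \vec\gamma_\parallel \tag{4B.11a}
\]
with
\[
|\vec\gamma_\perp| \cong |\vec\gamma| \cong \frac{\theta_d}{2} \tag{4B.11b}
\]
and
\[
|\vec\gamma_\parallel| \cong \frac{\theta_d}{4} \cdot |\vec\gamma_\perp| \cong \frac{\theta_d^2}{8} . \tag{4B.11c}
\]
Substitution of $\vec\gamma = \hat n_M - \hat z = \vec\gamma_\perp + \vec\gamma_\parallel$ into (4B.10c) gives, remembering that $\vec\Gamma = \hat\Omega_d - \hat\Omega$,
\[
2\pi\sigma\, \vec r \cdot (\hat\Omega_d - \hat\Omega) \cong 4\pi\sigma\, \vec r \cdot (\vec\gamma_\perp + \vec\gamma_\parallel) = 4\pi\sigma\, \vec r \cdot \vec\gamma_\perp + 4\pi\sigma\, \vec r \cdot \vec\gamma_\parallel . \tag{4B.12a}
\]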
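The absolute value of $4\pi\sigma\, \vec r \cdot \vec\gamma_\parallel$ has an upper limit
\[
\left| 4\pi\sigma\, \vec r \cdot \vec\gamma_\parallel \right| \le 4\pi\sigma_{\max} z_{\max} |\vec\gamma_\parallel| \cong 4\pi\sigma_{\max} z_{\max} \frac{\theta_{d\max}^2}{8}
\le \frac{\pi}{2}\, \sigma_{\max} \cdot \frac{D}{20\,\theta_{b\max}} \cdot \theta_{d\max} \cdot \left( \frac{0.14}{D\sigma_{\max}} \right) ,
\]
where we have used (4B.11c), (4B.8c), and (4B.4c) to simplify the expression for the upper limit. Clearing away common factors and using inequality (4B.4a) gives
\[
\left| 4\pi\sigma\, \vec r \cdot \vec\gamma_\parallel \right| \le \frac{\pi}{40} \cdot (0.14 \times 10^{-2}) ,
\]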
which is, according to (4B.1c), small enough to neglect. Hence, (4B.12a) can be written as
FIGURE 4B.6.
[Figure: the unit vectors ẑ and n̂M and the difference vector n̂M − ẑ, with labeled angles θd/2, θd/4, and π/2 − θd/4.]
Angle is part of the right triangle whose hypotenuse is unit vector ẑ , showing that
'd & &
. The sum of angles and is also because
is perpendicular ẑ . Hence,
4 2 2
'd
angle must be equal to .
4
\[
2\pi\sigma\, \vec r \cdot (\hat\Omega_d - \hat\Omega) \cong 4\pi\sigma\, \vec r \cdot \vec\gamma_\perp \tag{4B.12b}
\]
with $\vec\gamma_\perp$ being the component of $\hat n_M - \hat z$ that is perpendicular to $\hat z$. We note that the right-hand side of (4B.12b) is too large to neglect. This expression can be as large as
\[
2\pi\sigma_{\max} D\, |\vec\gamma_\perp| ,
\]
where we remember that, when an interferometer beam has a circular cross section, D is its diameter and the component of $\vec r$ parallel to $\vec\gamma_\perp$ can then be as large as D/2. From (4B.11b) and (4B.4c), we see that $2\pi\sigma_{\max} D\, |\vec\gamma_\perp|$ can be as large as
\[
2\pi\sigma_{\max} D\, |\vec\gamma_\perp| \cong 2\pi\sigma_{\max} D \cdot \frac{\theta_{d\max}}{2} = \pi \cdot 0.14 \cong 0.44 ,
\]
which, according to (4B.1c), is really too large to neglect. This is why we retain the term $2\pi\sigma\, \vec r \cdot (\hat\Omega_d - \hat\Omega)$ on the left-hand side of (4B.12b) in the interferometer equations.
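The two bounds compared above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming only the numbers quoted in the text: the product D·σmax·θdmax taken at its limit of 0.14 from (4B.4c), and the factor 10⁻² that enters through (4B.4a). The variable names are illustrative and do not come from the text.

```python
import math

# Assumed inputs, taken from the inequalities quoted in the text.
D_SIGMA_THETA_D = 0.14   # upper limit of D * sigma_max * theta_dmax, from (4B.4c)
SMALL_RATIO = 1.0e-2     # factor 10^-2 that enters through (4B.4a)

# Retained term: 2*pi*sigma_max*D*|gamma_perp| with |gamma_perp| ~ theta_dmax / 2.
retained = 2.0 * math.pi * D_SIGMA_THETA_D / 2.0

# Dropped term: upper limit on |4*pi*sigma r . gamma_parallel|.
dropped = (math.pi / 40.0) * (D_SIGMA_THETA_D * SMALL_RATIO)

print(f"retained term can reach {retained:.2f} rad")
print(f"dropped term is at most {dropped:.2e} rad")
print(f"ratio dropped/retained = {dropped / retained:.1e}")
```

The retained phase term is a sizeable fraction of a radian, while the dropped term is smaller by more than three orders of magnitude, which is the comparison the argument relies on.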
The final phase approximation that we need to justify is
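\[
K \le 2\pi\sigma_{\max}\chi_{\max} \left| \sqrt{1 - (\hat\Omega - \hat z\cos\theta_b - 2\vec\gamma)^2} - \sqrt{1 - (\hat\Omega - \hat z\cos\theta_b)^2} \right| . \tag{4B.13c}
\]
$|(\hat\Omega - \hat z\cos\theta_b) - 2\vec\gamma|$
has its maximum and minimum values when φ is zero and π respectively. These minimum and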
maximum values will maximize the right-hand side of (4B.13c) because the term
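$\theta_d \ll \theta_b$, the maximum and minimum values of $|\hat\Omega - \hat z\cos\theta_b - 2\vec\gamma|$ are $\theta_b \pm \theta_d$, so that we can write
\[
\left| \sqrt{1 - (\hat\Omega - \hat z\cos\theta_b - 2\vec\gamma)^2} - \sqrt{1 - (\hat\Omega - \hat z\cos\theta_b)^2} \right|
\le \left| \sqrt{1 - (\theta_b \pm \theta_d)^2} - \sqrt{1 - \theta_b^2} \right|
\]
\[
= \left| 1 - \frac{(\theta_b \pm \theta_d)^2}{2} - \left(1 - \frac{\theta_b^2}{2}\right) + O(\theta_b^4) \right|
= \left| \mp\theta_b\theta_d - \frac{\theta_d^2}{2} + O(\theta_b^4) \right|
\le \frac{\theta_{d\max}^2}{2} + \theta_{b\max}\theta_{d\max} + O(\theta_b^4) .
\]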
This can be used in (4B.13c) to get
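\[
K \le 2\pi\sigma_{\max}\chi_{\max} \left[ \theta_{b\max}\theta_{d\max} + \frac{\theta_{d\max}^2}{2} \right] + O(\sigma_{\max}\chi_{\max}\theta_{b\max}^4) . \tag{4B.13d}
\]
We have already seen from the discussion following Eq. (4B.5c) that
\[
2\pi\sigma_{\max}\chi_{\max} \left[ \theta_{b\max}\theta_{d\max} + \frac{\theta_{d\max}^2}{2} \right]
= 2\pi\sigma_{\max}\chi_{\max}\theta_{b\max}\theta_{d\max} + \pi\sigma_{\max}\chi_{\max}\theta_{d\max}^2
\]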
can be neglected. We note that substitution from (4B.2a) and (4B.2d) gives
which, according to (4B.1c), is small enough to neglect. Hence everything on the right-hand side
of (4B.13d) is small enough to neglect, which means K can be dropped from (4B.13b), making
Eq. (4B.13a) a good approximation. From the definition of $\vec\varepsilon$ in Eq. (4.54c), we can rewrite (4.54a) to get, after applying Eq. (4.135f), that
\[
\vec\varepsilon = \hat\Omega - \hat z\sqrt{1 - \varepsilon^2} = \hat\Omega - \hat z\cos\theta_b .
\]
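\[
\vec\Delta = 2(\hat n_M - \hat z) = 2\vec\gamma .
\]
These two formulas together with Eqs. (4.102a) and (4.102d) in Sec. 4.13 lead to
\[
-2\pi\sigma\chi \sqrt{1 - (\hat\Omega - \hat z\cos\theta_b - 2\vec\gamma)^2}
= -2\pi\sigma\chi \sqrt{1 - (\vec\varepsilon - \vec\Delta)^2}
= 2\pi \frac{w}{c}\chi \sqrt{1 - \left(-\frac{c}{w}\vec u - \vec\Delta\right)^2}
= 2\pi \frac{w}{c}\chi \sqrt{1 - \frac{c^2}{w^2}\left(\vec u + \frac{w}{c}\vec\Delta\right)^2}
\]
and
\[
-2\pi\sigma\chi \sqrt{1 - (\hat\Omega - \hat z\cos\theta_b)^2} = 2\pi \frac{w}{c}\chi \sqrt{1 - \frac{c^2}{w^2}(\vec u)^2} .
\]
The phase approximation used in (4B.13a) now becomes, written in terms of w, c, $\vec u$, and $\vec\Delta$,
\[
e^{2\pi i \frac{w}{c}\chi \sqrt{1 - \frac{c^2}{w^2}\left(\vec u + \frac{w}{c}\vec\Delta\right)^2}} \cong e^{2\pi i \frac{w}{c}\chi \sqrt{1 - \frac{c^2 u^2}{w^2}}} . \tag{4B.13e}
\]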
Appendix 4C
In this appendix, we apply the three-dimensional Wiener-Khinchin theorem explained in Sec.
3.24 of Chapter 3 to the random functions describing the radiation fields entering the
interferometer.
We specify function $\Pi(t,T)$ to be
\[
\Pi(t,T) = \begin{cases} 1 & \text{for } |t| \le T \\ 0 & \text{for } |t| > T \end{cases} , \tag{4C.1a}
\]
and also define a two-dimensional version of this function to be $\Pi(\vec\rho; A) = \Pi(x, y; A)$ such that
\[
\Pi(\vec\rho; A) = \Pi(x, y; A) = \begin{cases} 1 & \text{when point } \vec\rho = (x, y) \text{ lies inside or on the edge of the beam of cross-sectional area } A \\ 0 & \text{when point } \vec\rho = (x, y) \text{ lies outside the beam of cross-sectional area } A \end{cases} . \tag{4C.1b}
\]
Function $\Pi(\vec\rho; A)$ can be thought of as a pupil function for the beam.75 Function $\Pi(t,T)$ specifies the one-dimensional measurement time for the beam, and function $\Pi(\vec\rho; A)$ specifies the two-dimensional cross section of the beam as it passes through the interferometer.
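The two window functions are simple enough to write down directly. The sketch below is only an illustration: it assumes the closed time window |t| ≤ T of (4C.1a) and, for the beam cross section, the rectangular and circular shapes that Appendix 4D works with. The function names `pi_1d`, `pi_rect`, `pi_circ`, and `chop` are hypothetical.

```python
import math
from typing import Callable


def pi_1d(t: float, T: float) -> int:
    """One-dimensional window: 1 for |t| <= T and 0 otherwise, as in (4C.1a)."""
    return 1 if abs(t) <= T else 0


def pi_rect(x: float, y: float, X: float, Y: float) -> int:
    """Pupil function of a beam with a (2X) x (2Y) rectangular cross section."""
    return pi_1d(x, X) * pi_1d(y, Y)


def pi_circ(x: float, y: float, R: float) -> int:
    """Pupil function of a beam with a circular cross section of radius R."""
    return pi_1d(math.hypot(x, y), R)


def chop(field: Callable[[float, float, float], float],
         x: float, y: float, t: float, T: float, R: float) -> float:
    """Time-chopped and beam-chopped field: window(t) * pupil(x, y) * field(x, y, t)."""
    return pi_1d(t, T) * pi_circ(x, y, R) * field(x, y, t)


if __name__ == "__main__":
    def plane_wave(x: float, y: float, t: float) -> float:
        return math.cos(2.0 * math.pi * t)

    print(chop(plane_wave, 0.1, 0.2, 0.0, T=1.0, R=0.5))   # inside beam, inside window
    print(chop(plane_wave, 0.6, 0.0, 0.0, T=1.0, R=0.5))   # outside the beam
    print(chop(plane_wave, 0.1, 0.2, 2.0, T=1.0, R=0.5))   # outside the time window
```

Multiplying a field by both windows is what the text means by a time-chopped and beam-chopped field: the product vanishes as soon as either the time or the transverse position falls outside its window.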
We set up two random functions
\[
E^{(\mathrm{in})}_{xTA}(\vec\rho, z, t) \quad \text{and} \quad E^{(\mathrm{in})}_{yTA}(\vec\rho, z, t)
\]
The T, A subscripts in R xTA , R yTA show that these are the autocorrelations of time-chopped and
beam-chopped radiation fields. The argument z is always unprimed because we want to compare
75
Joseph W. Goodman, Introduction to Fourier Optics (McGraw-Hill Inc., New York, 1988), p. 83; reissue of 1968
book.
the $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ variables at the same z coordinate along the beam. Because $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$
Π (t , T ) = 0 or Π (t ′, T ) = 0 ,
we expect the product
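\[
E^{(\mathrm{in})}_{xTA}(\vec\rho, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec\rho\,', z, t')
\]
\[
E^{(\mathrm{in})}_{yTA}(\vec\rho, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec\rho\,', z, t') .
\]
Consequently,
\[
\mathrm{E}\!\left( E^{(\mathrm{in})}_{xTA}(\vec\rho, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec\rho\,', z, t') \right) \quad \text{and} \quad \mathrm{E}\!\left( E^{(\mathrm{in})}_{yTA}(\vec\rho, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec\rho\,', z, t') \right)
\]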
should be zero when $\Pi(t,T) = 0$ or $\Pi(t',T) = 0$. Similarly, when either $\vec\rho$ or $\vec\rho\,'$ represents a point outside the beam cross-section, so that
\[
\Pi(\vec\rho; A) = 0 \quad \text{or} \quad \Pi(\vec\rho\,'; A) = 0 ,
\]
we know that
\[
R_{xTA}(\vec\rho - \vec\rho\,', t - t', z) \quad \text{or} \quad R_{yTA}(\vec\rho - \vec\rho\,', t - t', z)
\]
as we would for the three-dimensional autocorrelations in $\vec\rho$ and t of homogeneous and stationary random functions. On the other hand, if the radiation field had not been time-chopped and beam-chopped, we would expect $E_x^{(\mathrm{in})}$ and $E_y^{(\mathrm{in})}$ to follow the pattern of other radiation fields in nature. When described by random variables, these fields are usually taken to be stationary in time,76 and, since we have given the non-beam-chopped fields no preferred structure, it makes sense to have them homogeneous in $\vec\rho$ also. We can therefore assume that the random functions
76
Handbook of Optics, edited by Michael Bass, Vol. I (McGraw-Hill Inc., New York, 1995), Chapter 4, page 4.2,
sponsored by the Optical Society of America.
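$E_x^{(\mathrm{rad})}$ and $E_y^{(\mathrm{rad})}$ representing the radiation before it enters the interferometer are homogeneous and stationary in $\vec\rho$ and t, with autocorrelation functions Rx and Ry, which can be written as
\[
R_x(\vec\rho - \vec\rho\,', t - t', z) = \mathrm{E}\!\left( E_x^{(\mathrm{rad})}(\vec\rho, z, t)\, E_x^{(\mathrm{rad})}(\vec\rho\,', z, t') \right) \tag{4C.3a}
\]
and
\[
R_y(\vec\rho - \vec\rho\,', t - t', z) = \mathrm{E}\!\left( E_y^{(\mathrm{rad})}(\vec\rho, z, t)\, E_y^{(\mathrm{rad})}(\vec\rho\,', z, t') \right) . \tag{4C.3b}
\]
We also suppose the interferometer to be well designed, only minimally perturbing $E_x^{(\mathrm{rad})}$ and $E_y^{(\mathrm{rad})}$ when turning them into $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ to create the time-chopped and beam-chopped radiation fields entering the interferometer. This means we can assume that $E^{(\mathrm{in})}_{xTA}$ and $E^{(\mathrm{in})}_{yTA}$ are the same as $E_x^{(\mathrm{rad})}$ and $E_y^{(\mathrm{rad})}$ well away from the boundaries of the beam in $\vec\rho$ and t. Hence, we can make the approximations that
\[
E^{(\mathrm{in})}_{xTA}(\vec\rho, z, t) \cong \Pi(t,T)\, \Pi(\vec\rho; A)\, E_x^{(\mathrm{rad})}(\vec\rho, z, t) \tag{4C.4a}
\]
and
\[
E^{(\mathrm{in})}_{yTA}(\vec\rho, z, t) \cong \Pi(t,T)\, \Pi(\vec\rho; A)\, E_y^{(\mathrm{rad})}(\vec\rho, z, t) . \tag{4C.4b}
\]
Multiplying these approximations at two points and taking expectation values gives
\[
\mathrm{E}\!\left( E^{(\mathrm{in})}_{xTA}(\vec\rho, z, t)\, E^{(\mathrm{in})}_{xTA}(\vec\rho\,', z, t') \right)
\cong \Pi(t,T)\, \Pi(t',T)\, \Pi(\vec\rho; A)\, \Pi(\vec\rho\,'; A)\, \mathrm{E}\!\left( E_x^{(\mathrm{rad})}(\vec\rho, z, t)\, E_x^{(\mathrm{rad})}(\vec\rho\,', z, t') \right)
= \Pi(t,T)\, \Pi(t',T)\, \Pi(\vec\rho; A)\, \Pi(\vec\rho\,'; A)\, R_x(\vec\rho - \vec\rho\,', t - t', z) \tag{4C.4c}
\]
and
\[
\mathrm{E}\!\left( E^{(\mathrm{in})}_{yTA}(\vec\rho, z, t)\, E^{(\mathrm{in})}_{yTA}(\vec\rho\,', z, t') \right)
\cong \Pi(t,T)\, \Pi(t',T)\, \Pi(\vec\rho; A)\, \Pi(\vec\rho\,'; A)\, \mathrm{E}\!\left( E_y^{(\mathrm{rad})}(\vec\rho, z, t)\, E_y^{(\mathrm{rad})}(\vec\rho\,', z, t') \right)
= \Pi(t,T)\, \Pi(t',T)\, \Pi(\vec\rho; A)\, \Pi(\vec\rho\,'; A)\, R_y(\vec\rho - \vec\rho\,', t - t', z) , \tag{4C.4d}
\]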
where we have used (4C.3a) and (4C.3b) in the final steps of these two equations. From the three-
dimensional Wiener-Khinchin theorem, we know that the Fourier transforms of Rx and Ry are the
two power spectra
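\[
S_x(\vec u, z, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; R_x(\vec\rho, t, z)\, e^{-2\pi i(\vec\rho \cdot \vec u + wt)} \tag{4C.5a}
\]
and
\[
S_y(\vec u, z, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; R_y(\vec\rho, t, z)\, e^{-2\pi i(\vec\rho \cdot \vec u + wt)} . \tag{4C.5b}
\]
The Wiener-Khinchin theorem also states that the power spectra Sx and Sy are given by the limits
\[
S_x(\vec u, z, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T} \cdot \frac{1}{A} \cdot \mathrm{E}\!\left( \left| \mathcal{E}_{xTA}(\vec u, z, w) \right|^2 \right) \tag{4C.5c}
\]
and
\[
S_y(\vec u, z, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T} \cdot \frac{1}{A} \cdot \mathrm{E}\!\left( \left| \mathcal{E}_{yTA}(\vec u, z, w) \right|^2 \right) , \tag{4C.5d}
\]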
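where the random functions $\mathcal{E}_{xTA}(\vec u, z, w)$ and $\mathcal{E}_{yTA}(\vec u, z, w)$ are defined to be the three-dimensional forward Fourier transforms of
\[
\Pi(t,T)\, \Pi(\vec\rho; A)\, E_x^{(\mathrm{rad})}(\vec\rho, z, t) \quad \text{and} \quad \Pi(t,T)\, \Pi(\vec\rho; A)\, E_y^{(\mathrm{rad})}(\vec\rho, z, t) ,
\]
given by
\[
\mathcal{E}_{xTA}(\vec u, z, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \Pi(t,T)\, \Pi(\vec\rho; A)\, E_x^{(\mathrm{rad})}(\vec\rho, z, t)\, e^{-2\pi i(\vec\rho \cdot \vec u + wt)} \tag{4C.6a}
\]
and
\[
\mathcal{E}_{yTA}(\vec u, z, w) = \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; \Pi(t,T)\, \Pi(\vec\rho; A)\, E_y^{(\mathrm{rad})}(\vec\rho, z, t)\, e^{-2\pi i(\vec\rho \cdot \vec u + wt)} . \tag{4C.6b}
\]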
In Eqs. (4C.5c) and (4C.5d), $\lim_{T\to\infty}$ is interpreted to be the limit as the time interval specified by $\Pi(t,T)$ extends to cover all time, and $\lim_{A\to\infty}$ is interpreted to be the limit as the cross-sectional area specified by $\Pi(\vec\rho; A)$ extends to cover the entire x, y plane. From the approximations in (4C.4a) and (4C.4b), we have
\[
\mathcal{E}_{xTA}(\vec u, z, w) \cong \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; E^{(\mathrm{in})}_{xTA}(\vec\rho, z, t)\, e^{-2\pi i(\vec\rho \cdot \vec u + wt)} \tag{4C.7a}
\]
and
\[
\mathcal{E}_{yTA}(\vec u, z, w) \cong \int_{-\infty}^{\infty} dt \iint_{-\infty}^{\infty} d^2\rho\; E^{(\mathrm{in})}_{yTA}(\vec\rho, z, t)\, e^{-2\pi i(\vec\rho \cdot \vec u + wt)} . \tag{4C.7b}
\]
We compare these results to Eqs. (4.129a) and (4.129b) in this chapter to get
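\[
\mathcal{E}_{xTA}(\vec u, z, w) \cong c\, w^{-2}\, E_{xTA}\!\left( -\frac{c\vec u}{w}, z, -\frac{w}{c} \right) \tag{4C.8a}
\]
and
\[
\mathcal{E}_{yTA}(\vec u, z, w) \cong c\, w^{-2}\, E_{yTA}\!\left( -\frac{c\vec u}{w}, z, -\frac{w}{c} \right) . \tag{4C.8b}
\]
Substituting these into Eqs. (4C.5c) and (4C.5d) gives
\[
S_x(\vec u, z, w) \cong \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c^2}{w^4} \cdot \mathrm{E}\!\left( \left| E_{xTA}\!\left( -\frac{c\vec u}{w}, z, -\frac{w}{c} \right) \right|^2 \right) \tag{4C.9a}
\]
and
\[
S_y(\vec u, z, w) \cong \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c^2}{w^4} \cdot \mathrm{E}\!\left( \left| E_{yTA}\!\left( -\frac{c\vec u}{w}, z, -\frac{w}{c} \right) \right|^2 \right) . \tag{4C.9b}
\]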
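\[
S_x(\vec u, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c^2}{w^4} \cdot \mathrm{E}\!\left( \left| E_{xTA}\!\left( -\frac{c\vec u}{w}, z, -\frac{w}{c} \right) \right|^2 \right) \tag{4C.10a}
\]
and
\[
S_y(\vec u, w) = \lim_{\substack{T\to\infty \\ A\to\infty}} \frac{1}{2T} \cdot \frac{1}{A} \cdot \frac{c^2}{w^4} \cdot \mathrm{E}\!\left( \left| E_{yTA}\!\left( -\frac{c\vec u}{w}, z, -\frac{w}{c} \right) \right|^2 \right) . \tag{4C.10b}
\]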
Tracing $E_{xTA}$ and $E_{yTA}$ back to their original definitions of $E_x$ and $E_y$ in Eqs. (4.98a) and (4.98b), before they acquired their T, A subscripts, we recognize that $|E_{xTA}|^2$ and $|E_{yTA}|^2$ are no longer functions of z, allowing us to drop z from the argument lists of Sx and Sy.
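The relation between an autocorrelation and a squared Fourier transform can be illustrated with a small discrete analogue. The sketch below is not the three-dimensional theorem used above. It assumes a single real, one-dimensional, periodic sample sequence, for which the discrete Fourier transform of the circular autocorrelation equals the squared magnitude of the sequence's transform divided by the number of samples, with no limit or expectation needed. All names are illustrative.

```python
import cmath
import random


def dft(x):
    """Forward discrete Fourier transform with the e^{-2 pi i m k / n} sign convention."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def circular_autocorrelation(x):
    """r[k] = (1/n) * sum_j x[j + k] x[j], with indices taken modulo n."""
    n = len(x)
    return [sum(x[(j + k) % n] * x[j] for j in range(n)) / n for k in range(n)]


random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(32)]

# Route 1: Fourier transform of the autocorrelation, the analogue of (4C.5a).
spectrum_from_autocorr = dft(circular_autocorrelation(samples))

# Route 2: squared magnitude of the transform over the record length,
# the analogue of (4C.5c) without the limit or the expectation.
periodogram = [abs(v) ** 2 / len(samples) for v in dft(samples)]

max_diff = max(abs(a - b) for a, b in zip(spectrum_from_autocorr, periodogram))
print(f"largest difference between the two routes: {max_diff:.2e}")
```

In the continuous case treated in the text the two routes agree only after averaging and taking the limits T → ∞ and A → ∞. The periodic discrete case is used here because the agreement is exact and easy to verify.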
Appendix 4D
We calculate here the two-dimensional Fourier transform of the pupil function
\[
\Pi(\vec\rho; A) = \Pi(x, y; A) = \Pi(x, X) \cdot \Pi(y, Y) \tag{4D.1a}
\]
for an interferometer beam with a (2X) × (2Y) rectangular cross section as well as the two-dimensional Fourier transform of the pupil function
\[
\Pi(\vec\rho; A) = \Pi(x, y; A) = \Pi\!\left( \sqrt{x^2 + y^2}, R \right) \tag{4D.1b}
\]
for an interferometer beam with a circular cross section of radius R. Function $\Pi(u, v)$ can be thought of as a one-dimensional pupil function and is defined to be [see Eq. (4C.1a) in Appendix 4C]
\[
\Pi(u, v) = \begin{cases} 1 & \text{for } |u| \le v \\ 0 & \text{for } |u| > v \end{cases} . \tag{4D.1c}
\]
It can be distinguished from the two-dimensional $\Pi(\vec\rho; A)$ functions by the absence of a semicolon in its argument list.
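To evaluate the two-dimensional Fourier transform of the pupil function in (4D.1a), we write
\[
\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec\rho; A)\, e^{\pm 2\pi i(\vec\rho \cdot \vec u)} = \int_{-\infty}^{\infty} dx\, \Pi(x, X)\, e^{\pm 2\pi i x u_x} \int_{-\infty}^{\infty} dy\, \Pi(y, Y)\, e^{\pm 2\pi i y u_y} . \tag{4D.2}
\]
The integral over x is
\[
\int_{-\infty}^{\infty} dx\, \Pi(x, X)\, e^{\pm 2\pi i x u_x} = \int_{-X}^{X} e^{\pm 2\pi i x u_x}\, dx = \int_{-X}^{X} \cos(2\pi x u_x)\, dx = 2X\, \mathrm{sinc}(2\pi u_x X) , \tag{4D.3a}
\]
where $\mathrm{sinc}(x) = \sin(x)/x$. The integral over y is identical, of course, so the final result is
\[
\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec\rho; A)\, e^{\pm 2\pi i(\vec\rho \cdot \vec u)} = 4XY\, \mathrm{sinc}(2\pi u_x X)\, \mathrm{sinc}(2\pi u_y Y) \tag{4D.3b}
\]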
or, choosing the minus sign in the exponent of e to match the definition of $\Pi_A$ in Eq. (4.134c) of this chapter,
\[
\Pi_A(\vec u) = 4XY\, \mathrm{sinc}(2\pi u_x X)\, \mathrm{sinc}(2\pi u_y Y) . \tag{4D.3c}
\]
This is the two-dimensional forward Fourier transform of the pupil function of an interferometer beam with a (2X) × (2Y) rectangular cross section.
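The closed form for the rectangular beam can be checked numerically. The following sketch assumes sinc(x) = sin(x)/x as defined above and uses a simple midpoint rule for the one-dimensional integrals. The function names, test frequencies, and beam half-widths are illustrative choices, not values from the text.

```python
import math


def sinc(x: float) -> float:
    """Unnormalized sinc, sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def slit_transform_numeric(u: float, half_width: float, steps: int = 20000) -> float:
    """Midpoint-rule value of the integral of cos(2 pi x u) over [-half_width, half_width].

    The sine (imaginary) part of e^{+-2 pi i x u} integrates to zero by symmetry,
    so only the cosine part is needed.
    """
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        total += math.cos(2.0 * math.pi * x * u)
    return total * dx


def rect_pupil_transform(ux: float, uy: float, X: float, Y: float) -> float:
    """Closed form 4XY sinc(2 pi ux X) sinc(2 pi uy Y) for a (2X) x (2Y) beam."""
    return 4.0 * X * Y * sinc(2.0 * math.pi * ux * X) * sinc(2.0 * math.pi * uy * Y)


if __name__ == "__main__":
    X, Y, ux, uy = 1.5, 0.5, 0.3, 0.7
    numeric = slit_transform_numeric(ux, X) * slit_transform_numeric(uy, Y)
    print(numeric, rect_pupil_transform(ux, uy, X, Y))
```

Because the two-dimensional integral factors into an x integral and a y integral, the numerical check only needs the one-dimensional routine applied twice.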
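To evaluate the two-dimensional Fourier transform of the pupil function in (4D.1b), we write
\[
\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec\rho; A)\, e^{\pm 2\pi i \vec\rho \cdot \vec u} = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\; \Pi\!\left( \sqrt{x^2 + y^2}, R \right) e^{\pm 2\pi i (x u_x + y u_y)}
= \int_0^R \rho\, d\rho \int_0^{2\pi} d\theta\; e^{\pm 2\pi i \rho u (\cos\theta\cos\phi + \sin\theta\sin\phi)} , \tag{4D.4a}
\]
where in the last step the variables of integration have been transformed using
\[
\rho = \sqrt{x^2 + y^2} , \quad u = \sqrt{u_x^2 + u_y^2} = |\vec u| ,
\]
\[
x = \rho\cos\theta , \quad u_x = u\cos\phi ,
\]
\[
y = \rho\sin\theta , \quad u_y = u\sin\phi .
\]
We note that
\[
\cos\theta\cos\phi + \sin\theta\sin\phi = \cos(\theta - \phi) , \tag{4D.5a}
\]
\[
\cos(z\cos\theta) = J_0(z) + 2\sum_{k=1}^{\infty} (-1)^k J_{2k}(z) \cos(2k\theta) \tag{4D.5b}
\]
and
\[
\sin(z\cos\theta) = 2\sum_{k=0}^{\infty} (-1)^k J_{2k+1}(z) \cos\big( (2k+1)\theta \big) , \tag{4D.5c}
\]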
77
See Eqs. (9.1.44) and (9.1.45) in Handbook of Mathematical Functions, edited by Milton Abramowitz and Irene
Stegun (National Bureau of Standards, Applied Mathematics Series 55, November 1964), p. 361.
where $J_n(z)$ is a Bessel function of the first kind of order n, with n = 0, 1, 2, …, and z a non-negative real number. Using Eqs. (4D.5a)–(4D.5c), we find that
\[
\int_0^{2\pi} e^{\pm 2\pi i \rho u (\cos\theta\cos\phi + \sin\theta\sin\phi)}\, d\theta
= \int_0^{2\pi} \cos\big( 2\pi\rho u \cos(\theta - \phi) \big)\, d\theta \pm i \int_0^{2\pi} \sin\big( 2\pi\rho u \cos(\theta - \phi) \big)\, d\theta
\]
\[
= 2\pi J_0(2\pi\rho u) + 2\sum_{k=1}^{\infty} (-1)^k J_{2k}(2\pi\rho u) \int_0^{2\pi} \cos\big( 2k(\theta - \phi) \big)\, d\theta
\pm 2i \sum_{k=0}^{\infty} (-1)^k J_{2k+1}(2\pi\rho u) \int_0^{2\pi} \cos\big( (2k+1)(\theta - \phi) \big)\, d\theta .
\]
The integrals over the cosine are clearly zero, because in each one the cosine is being integrated over an integral number of periods. Hence,
\[
\int_0^{2\pi} d\theta\; e^{\pm 2\pi i \rho u (\cos\theta\cos\phi + \sin\theta\sin\phi)} = 2\pi J_0(2\pi\rho u) . \tag{4D.5d}
\]
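Substituting this result back into (4D.4a) gives, changing the variable of integration to $\rho' = 2\pi\rho u$,
\[
\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec\rho; A)\, e^{\pm 2\pi i \vec\rho \cdot \vec u} = 2\pi \int_0^R \rho\, J_0(2\pi\rho u)\, d\rho = \frac{1}{2\pi u^2} \int_0^{2\pi u R} \rho' J_0(\rho')\, d\rho' .
\]
Using the identity
\[
\int_0^x z J_0(z)\, dz = x J_1(x)
\]
we get
\[
\iint_{-\infty}^{\infty} d^2\rho\; \Pi(\vec\rho; A)\, e^{\pm 2\pi i \vec\rho \cdot \vec u} = \frac{2\pi u R\, J_1(2\pi u R)}{2\pi u^2} = \frac{R}{u} J_1(2\pi u R) \tag{4D.6a}
\]
or, choosing the minus sign in the exponent of e to match the definition of $\Pi_A$ in Eq. (4.134c) of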
78
Joseph W. Goodman, Introduction to Fourier Optics, p. 16.
this chapter,
\[
\Pi_A(\vec u) = \frac{R}{|\vec u|} J_1(2\pi |\vec u| R) , \tag{4D.6b}
\]
where now $\Pi_A$ is the two-dimensional forward Fourier transform of the pupil function of an interferometer beam with a circular cross section of radius R.
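Both the angular integral (4D.5d) and the final result (4D.6a) can be checked numerically without any special-function library. The sketch below assumes the standard power series for Bessel functions of the first kind and uses midpoint-rule quadrature. The function names, the angle φ, and the test values of u and R are illustrative.

```python
import math


def bessel_j(n: int, x: float, terms: int = 40) -> float:
    """Power series for J_n(x); adequate for the moderate arguments used here."""
    return sum((-1) ** m / (math.factorial(m) * math.factorial(m + n))
               * (x / 2.0) ** (2 * m + n) for m in range(terms))


def angular_integral(z: float, phi: float = 0.3, steps: int = 2000) -> float:
    """Midpoint rule for the integral over theta in [0, 2 pi] of cos(z cos(theta - phi)).

    The sine part of the complex exponential integrates to zero, as argued in the text.
    """
    d = 2.0 * math.pi / steps
    return sum(math.cos(z * math.cos((i + 0.5) * d - phi)) for i in range(steps)) * d


def disk_transform_numeric(u: float, R: float, steps: int = 4000) -> float:
    """Midpoint rule for 2 pi * integral over rho in [0, R] of rho J_0(2 pi rho u)."""
    d = R / steps
    total = 0.0
    for i in range(steps):
        rho = (i + 0.5) * d
        total += rho * bessel_j(0, 2.0 * math.pi * rho * u)
    return 2.0 * math.pi * total * d


def disk_transform_closed(u: float, R: float) -> float:
    """Closed form (R/u) J_1(2 pi u R) from (4D.6a)."""
    return (R / u) * bessel_j(1, 2.0 * math.pi * u * R)


if __name__ == "__main__":
    print(angular_integral(3.0), 2.0 * math.pi * bessel_j(0, 3.0))
    print(disk_transform_numeric(0.4, 1.2), disk_transform_closed(0.4, 1.2))
```

The angular integral does not depend on φ, which is why the result (4D.6b) depends only on the magnitude of the spatial-frequency vector.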
Appendix 4E
Snell’s law requires monochromatic plane waves entering a thick transparent slab or window to
change their angle of propagation. Figure 4E.1 uses a triplet of parallel rays to show this change,
and the angle variables specified there can be used to write Snell’s law as
where nA is the index of refraction outside the slab and nB is the index of refraction inside the
slab.
FIGURE 4E.1.
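[Figure: a triplet of parallel rays refracting through a transparent slab, with index of refraction nA outside the slab and nB inside it.]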
The index of refraction of any transparent medium is here taken to be a real dimensionless ratio
of the monochromatic wave’s velocity in empty space to the same monochromatic wave’s
velocity inside the slab. The index of refraction of empty space is thus always one. The index of
refraction of air is extremely close to one, being just a little bit larger than one by an amount that
can usually be neglected when analyzing optical instruments. The indexes of refraction of the
transparent substances used to make interferometer windows, beam-splitter substrates, and
compensator plates are significantly larger than one and usually less than six or seven. If c is the
monochromatic wave’s velocity in empty space, v A is its velocity outside the slab in Fig. 4E.1,
and vB is its velocity inside the slab, then
\[
n_A = \frac{c}{v_A} \quad \text{and} \quad n_B = \frac{c}{v_B} . \tag{4E.1b}
\]
The wavelength of a monochromatic plane wave also changes inside the transparent slab [see,
for example, Fig. 1.6(b) in Chapter 1]. This effect is shown in Fig. 4E.1 by drawing the planes of
constant phase perpendicular to the triplet of rays as being more closely spaced inside the slab
than outside the slab. If λA is the wavelength outside the slab and λB is the wavelength inside the
slab, then
nAλA = nB λB . (4E.2a)
Because the wavenumber is one over the wavelength, this can also be written as
\[
\lambda_A \left( \frac{c}{v_A} \right) = \lambda_B \left( \frac{c}{v_B} \right)
\]
or
\[
\frac{\lambda_A}{v_A} = \frac{\lambda_B}{v_B} . \tag{4E.2c}
\]
Remembering that the frequency of a monochromatic plane wave, according to Eq. (1.5) in
Chapter 1, satisfies the formula
we note that the wavelength divided by the velocity must be the reciprocal of the frequency. Therefore, Eq.
(4E.2c) requires the frequency of a monochromatic plane wave to be equal inside and outside the
slab in Fig. 4E.1. Another point worth mentioning here is that the index of refraction can be—and
usually is—a function of frequency when the monochromatic plane wave is propagating through
a transparent substance. Rearranging Eq. (4E.2a), we see that the ratio of the monochromatic
wavelengths inside and outside the slab shown in Fig. 4E.1 must be
\[
\frac{\lambda_A}{\lambda_B} = \frac{n_B}{n_A} , \tag{4E.3}
\]
and so can also depend on the plane wave’s frequency. This point is discussed in a general sort of
way in Sec. 1.1 of Chapter 1 when explaining why Michelson’s interferometer needed a
compensator plate to work correctly.
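The relations between wavelength, velocity, and frequency inside and outside the slab can be illustrated numerically. The sketch below is a minimal example under stated assumptions: Snell's law is written in its standard form n_a·sin(θ_a) = n_b·sin(θ_b), with the angle names chosen here rather than taken from Fig. 4E.1, and the index values and vacuum wavelength are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in empty space, m/s


def refraction_angle(theta_a: float, n_a: float, n_b: float) -> float:
    """Angle inside the slab from the standard form of Snell's law."""
    return math.asin(n_a * math.sin(theta_a) / n_b)


def wavelength_inside(lambda_a: float, n_a: float, n_b: float) -> float:
    """Wavelength inside the slab from n_A * lambda_A = n_B * lambda_B, Eq. (4E.2a)."""
    return lambda_a * n_a / n_b


# Assumed illustrative values: air-like exterior, glass-like slab, infrared light.
n_a, n_b = 1.0, 1.5
lambda_a = 10.0e-6  # metres

lambda_b = wavelength_inside(lambda_a, n_a, n_b)
v_a, v_b = C / n_a, C / n_b          # velocities from Eq. (4E.1b)
freq_a = v_a / lambda_a              # frequency outside the slab
freq_b = v_b / lambda_b              # frequency inside the slab

print(f"wavelength outside: {lambda_a:.3e} m, inside: {lambda_b:.3e} m")
print(f"frequency outside: {freq_a:.6e} Hz, inside: {freq_b:.6e} Hz")
print(f"refraction angle for 0.5 rad incidence: {refraction_angle(0.5, n_a, n_b):.4f} rad")
```

The wavelength shortens by the ratio nA/nB while the frequency is unchanged, in agreement with Eqs. (4E.2c) and (4E.3).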
Figure 4E.2(a) shows two monochromatic plane waves propagating through a Michelson
interferometer’s compensator plate. The interferometer’s optical axis passes horizontally through
the compensator plate, parallel to the ray showing the direction of propagation of the on-axis
wave incident on the plate. The direction of propagation of the off-axis wave incident on the plate
has a slight downward component. The solid ray paths show the change in direction of the on-
axis and off-axis plane waves inside the plate as well as the way Snell’s law requires both types
of wave to revert to their incident propagation directions when leaving the plate. The short, solid
lines perpendicular to and crossing through the rays show the planes of constant phase of the
monochromatic waves, with the distance between equivalent planes being much less inside the
plate than outside the plate. This distance can be regarded as a proxy for the wavelength if we are
careful to remember that the diagram would not then be drawn to scale—the typical wavelength
of these infrared plane wavefields is 1000 to 10,000 times less than the thickness of a typical
interferometer’s compensator plate. The dashed rays with the dashed lines of constant phase show
how the incident monochromatic waves would have propagated had the compensator plate not
been present.
In Fig. 4E.2(a), the on-axis plane wave travels a distance p1 through the compensator plate
and the off-axis plane wave travels a distance p2 through the compensator plate. The substances
used to make compensator plates and beam-splitter substrates can absorb significant amounts of
power from propagating wavefields, with the amount of absorbed power depending linearly on
the distance traveled inside the substance. In a well-designed interferometer, the plane waves
propagating in an off-axis direction are traveling at nearly the same angle of incidence with
respect to the compensator plate as are the plane waves propagating on axis, making p1 and p2
nearly equal. Hence, both types of plane wave lose about the same amount of power passing
through the compensator plate and so, to a good approximation, the amplitudes of both the on-axis and off-axis monochromatic plane waves decrease by the same fractional amount, say |γ|. The absolute value or magnitude signs here force |γ| to be non-negative, but this is no problem because we can take the plane-wave amplitudes before and after passage through the plate also to be inherently non-negative.
FIGURE 4E.2(a).
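[Figure: on-axis and off-axis rays crossing the compensator plate, with path lengths p1 and p2 inside the plate.]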
The behavior of the wavefield for on-axis and oblique rays passing through the compensator plate is
shown schematically by short lines drawn perpendicularly to the rays’ direction of propagation. The
solid rays and lines show how the rays and wavefields actually behave while passing through the
compensator plate, and the dashed rays and lines show how the rays and wavefields would have
behaved had the compensator plate not been present. Although not drawn to scale—the wavelengths
are typically several orders of magnitude shorter than the width of the compensator plate—radiation
wavelengths do, as shown, get shorter inside the compensator plate, which means the solid rays’
wavefields are very unlikely to match up exactly to the dashed rays’ wavefields. Hence, there is almost
always a phase change of the wavefields compared to what they would have been had they not
passed through the compensator plate.
FIGURE 4E.2(b).
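[Figure: an on-axis ray at the beam splitter and its substrate, with one-way path length o1 inside the substrate.]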
The behavior of the wavefield for an on-axis ray interacting with the beam splitter and its substrate can
be analyzed the same way the compensator plate was analyzed in Fig. 4E.2(a). Again, the solid rays
and lines show how the rays and wavefields actually behave, and the dashed rays and lines show how
the rays and wavefields would have behaved had the substrate not been present. Like the
compensator plate, the substrate is not drawn to scale—the wavelengths are made much too large
compared to the substrate’s width. Radiation wavelengths shorten inside the substrate just as they do
inside the compensator plate, so the solid wavefields do not match up exactly to the dashed
wavefields. This again produces a phase change in the wavefields compared to what they would have
been had the substrate not been present.
FIGURE 4E.2(c).
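[Figure: an oblique ray at the beam splitter and its substrate, with one-way path length o2 inside the substrate.]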
The behavior of the wavefield for an oblique ray interacting with the beam splitter and its substrate is
similar to that of an on-axis ray and wavefield [see Fig. 4E.2(b)]. Again, the solid rays and lines show
how the rays and wavefields actually behave while passing through the substrate, and the dashed rays
and lines show how the rays and wavefields would have behaved had the substrate not been present.
As before, radiation wavelengths shorten inside the substrate, so the solid wavefields do not match up
to the dashed wavefields. Just like for the on-axis ray, there is a phase change of the wavefields
compared to what they would have been had the substrate not been present.
Using the same notation as in the discussion following Eq. (4.35e) in Sec. 4.5 of this chapter, we
can write that
Aτ = γ ⋅ Ai (4E.4a)
where Ai is a complex parameter standing for the complex amplitudes of any of the components
of the E or B field of the monochromatic plane wave and Aτ is another complex parameter
standing for the complex amplitudes of the corresponding E or B field components of the
monochromatic plane wave after it has passed through the slab. The value of γ can change
significantly for different values of frequency; we can allow for this by writing
γ = γ (σ ) , (4E.4b)
where σ is the wavenumber of the plane wave incident on the compensator plate. Having just
noted that in a well-designed interferometer γ does not change significantly when comparing
on-axis and off-axis plane waves, there is no need to include a dependence on the angle of
incidence in Eq. (4E.4b).
When comparing a monochromatic plane wave entering the slab in Fig. 4E.2(a) to the same
monochromatic plane wave leaving the slab, we are analyzing a situation very similar to the
situation examined in Sec. 4.5 above—the only real difference is that in Sec. 4.5 we discuss what
happens to monochromatic plane waves passing through a thin slab or film and now we are
analyzing what happens when passing through a thick slab or window. Passage through a thin
slab or film produces a change in phase as well as a change in amplitude, and the same thing
happens in the passage through the thick slab in Fig. 4E.2(a). In Fig. 4E.2(a) there are short
dashed lines representing what the planes of constant phase in the monochromatic wave would be
if the slab were absent. Comparing these to the short solid lines showing the actual position of the
planes of constant phase after the wave leaves the slab, we note that in both the on-axis and off-
axis cases they fail to match up. This comes from the shortening of the wavefield’s wavelength
inside the slab. Even though there is only a slight difference in the p1 and p2 distances covered by
the on-axis and off-axis waves, the on-axis phase change is much different from the off-axis
phase change because the wavefields’ wavelengths are so much shorter than the width of the slab.
This means that even though p1 and p2 are almost equal, their difference is still large compared to
a wavelength.
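A rough numerical illustration shows how a tiny fractional difference in path length can still amount to several wavelengths. The sketch below uses a deliberately simplified geometry that is an assumption, not the configuration of Fig. 4E.2(a): the on-axis ray crosses a plate of the given thickness at normal incidence and the off-axis ray crosses at a small internal angle. The thickness, index, wavelength, and angle are illustrative values, and only the in-plate path contribution to the phase is counted.

```python
import math

# Assumed illustrative values.
thickness = 10.0e-3          # plate thickness, metres
n_plate = 1.5                # index of refraction of the plate
wavelength_vacuum = 1.0e-6   # metres, 10,000 times smaller than the thickness
internal_angle = 0.02        # radians, angle of the off-axis ray inside the plate

p1 = thickness                                 # on-axis path through the plate
p2 = thickness / math.cos(internal_angle)      # slightly longer off-axis path

wavelength_in_plate = wavelength_vacuum / n_plate
extra_cycles = (p2 - p1) / wavelength_in_plate  # path difference in wavelengths
extra_phase = 2.0 * math.pi * extra_cycles      # same difference in radians
relative_path_difference = (p2 - p1) / p1

print(f"relative path difference: {relative_path_difference:.1e}")
print(f"path difference in wavelengths: {extra_cycles:.2f}")
print(f"corresponding phase difference: {extra_phase:.1f} rad")
```

With these numbers the two paths differ by a few parts in ten thousand, yet the difference spans several in-plate wavelengths, so the two waves leave the plate with quite different phase shifts.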
Figure 4E.2(b) shows an on-axis monochromatic wavefield reflecting off and transmitting
through the beam splitter and beam-splitter substrate. The one-way distance through the substrate
is called o1. Figure 4E.2(c) shows an off-axis monochromatic wave reflecting off and
transmitting through the beam splitter and its substrate; here, the one-way distance through the
substrate is called o2. Just like in Fig. 4E.2(a), the off-axis ray is only slightly off-axis because in
well-designed interferometers only slightly off-axis plane waves are allowed to pass through the
instrument. Hence, o1 and o2, just like p1 and p2, are almost equal. The compensator plate is
made from the same material—and is designed with the same thickness and orientation—as the
beam-splitter substrate, so the same value of γ that is used for the compensator plate can be
used to describe the one-way passage through the beam-splitter substrate of the on-axis and
slightly off-axis monochromatic plane waves. Just like in Eq. (4E.4b), we expect γ to be a
function only of σ because the loss of power is about the same for both the on-axis and slightly
off-axis waves. Figures 4E.2(b) and 4E.2(c) also show that, just like in Fig. 4E.2(a), the on-axis
and off-axis monochromatic waves can undergo significantly different phase shifts after passing
through the beam-splitter substrate. Again, this is due to the wavelength being so short compared
to the thickness of the slab, which is in this case the beam-splitter substrate.
Section 4.5 of this chapter explains how to show that a monochromatic plane wavefield
\[
A e^{2\pi i\sigma(z - ct)}
\]
traveling along the z axis has undergone both a phase shift and a change in amplitude: just
multiply by a complex constant. The magnitude of the constant changes the wavefield’s
amplitude and the complex phase angle of the constant changes the wavefield’s phase, shifting
the position of the planes of constant phase from where they would be if the multiplication did
not occur. This happens no matter what direction in space is taken to be the z axis—that is, no
matter what the direction of propagation of the plane wavefield. We have already chosen γ (σ )
to be the magnitude specifying how much the amplitude of the plane wave changes when it goes
through the compensator plate, and now nothing stops us from taking γ to be a complex number
\[
\gamma = |\gamma|\, e^{i \arg(\gamma)} , \tag{4E.5}
\]
where the complex phase angle—that is, the argument of the γ complex value—is chosen so that
multiplying by the complex γ correctly modifies both the phase and the amplitude of the plane
wave. Now taking the z axis to lie along either the on-axis or the off-axis ray in Fig. 4E.2(a), we
know that if
\[
A e^{2\pi i\sigma(z - ct)}
\]
represents any E field or B field component of the monochromatic plane wave incident on the
compensator plate, then
\[
\gamma A e^{2\pi i\sigma(z - ct)}
\]
must represent the corresponding E or B component after the monochromatic plane wave has
passed once through the compensator plate.
When analyzing transmission through a thin film in Sec. 4.5 of this chapter, we distinguish
between s-type wavefields where the E field is perpendicular to the plane of incidence and p-type
wavefields where the E field is parallel to the plane of incidence. Following the same pattern
here, we say that there is a $\gamma_s$ complex parameter specifying how s-type waves transmit through the compensator plate and a $\gamma_p$ complex parameter specifying how p-type waves transmit through the compensator plate.
We have already noted that the phase shift undergone by a monochromatic plane wave passing through the compensator plate depends sensitively on the path taken through the plate; even small differences in p1 and p2 in Fig. 4E.2(a) can lead to significantly different phase shifts. Hence, even though $|\gamma_{s,p}|$ does not depend sensitively on the monochromatic plane wave’s angle of incidence on the compensator plate so that for both on-axis and slightly off-axis plane waves $|\gamma_{s,p}|$ can be taken to depend only on the wavenumber as shown in Eq. (4E.4b), the same cannot be said about the complex phase angle or argument of $\gamma_{s,p}$. It follows that for a well-designed
standard interferometer,
Multiplying a plane wavefield by a complex parameter is also a good way to show what
happens to the wavefield when it passes once through the beam-splitter substrate before reflecting
off or transmitting through the beam-splitter film, and similarly a complex parameter can be used
to show what happens to the wavefield when it passes back through the beam-splitter substrate
after reflecting from the beam-splitter film. The above discussion of Figs. 4E.2(b) and 4E.2(c)
shows that Eqs. (4E.6a)–(4E.6c) still hold true when $\gamma_{s,p}$ is taken to be a complex parameter describing one passage, before or after reflection, through the beam-splitter substrate. The only caveat, of course, is that the angle variable is now taken (see Fig. 4E.1) to be the angle of incidence of the monochromatic plane wave on the combined substrate-and-film beam-splitter optical element shown in Figs. 4E.2(b) and 4E.2(c). A little thought shows that rules (4E.6a)–(4E.6c) must in fact have a still wider application: if they hold true for any two complex parameters $\gamma_A$ and $\gamma_B$, then they must also hold true for their complex product
\[
\gamma = \gamma_A \gamma_B .
\]
It is easy to see why; we just note that
γ = γA γA
and
arg(γ ) = arg(γ A ) + arg( γ A ) .
Hence γ must be a function only of ı, not depending significantly on the angle of incidence,
because it is the product of functions for which this is true; and similarly arg(γ ) must depend on
both ı and the angle of incidence because it is the sum of functions for which this is true. This
same reasoning can in fact be extended to conclude that rules (4E.6a)–(4E.6c) must hold true for
all complex products γ s , p representing any number of passages through any combination of the
compensator plate and beam-splitter substrate.
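The product rule can be checked numerically. The following is a minimal sketch, not taken from the text: the two parameter values and the helper name `combine` are hypothetical, chosen only to show that magnitudes multiply while complex phase angles add (modulo 2π).

```python
import cmath

def combine(*gammas):
    """Multiply the complex parameters of successive slab passages."""
    total = complex(1.0, 0.0)
    for g in gammas:
        total *= g
    return total

# Hypothetical values: magnitude = amplitude change, argument = phase shift in radians.
gamma_A = cmath.rect(0.95, 0.40)
gamma_B = cmath.rect(0.90, 1.10)
gamma = combine(gamma_A, gamma_B)

# Magnitudes multiply; arguments add (here the sum stays below pi, so no wrap-around).
print(abs(gamma), abs(gamma_A) * abs(gamma_B))
print(cmath.phase(gamma), cmath.phase(gamma_A) + cmath.phase(gamma_B))
```

Because `cmath.phase` returns values in (−π, π], the two printed arguments agree directly only when the summed phase does not wrap; for larger phases the comparison has to be made modulo 2π.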
Since the complex phase angles of the γ parameters describing the compensator plate and
beam-splitter substrate depend sensitively on the angle of incidence, we should examine how the
angle of incidence of a plane wave changes as it passes through the interferometer.
Most textbooks on elementary optics describe a simple procedure for analyzing the geometry
of rays reflecting off mirrors and other types of specular surfaces—they recommend the
construction of a mirror-image virtual world on the other side of the mirror or specularly
reflecting surface. Figure 4E.3(a) shows how this works for the elementary case of rays leaving a
chair and then specularly reflecting into an observer’s eye. For each ray entering the observer’s
eye, there is a corresponding direction at which the ray originally left the chair, as shown by the
solid arrows in Fig. 4E.3(a). To find the direction at which a ray must leave the chair, we
construct a virtual world—in this case, a virtual chair—on the other side of the reflecting surface,
as shown by the dashed lines in Fig. 4E.3(a). The virtual chair is drawn point by point exactly the
same distance “behind” the mirror as the real chair is in front of the mirror. To find, for example,
the direction of ray Ar in coordinate system S such that it reflects off the mirror and enters the
observer’s eye as ray A, we just draw a straight dashed Ar′ extension of the ray back to the dashed
virtual chair on the other side of the mirror and look at the direction of Ar′ in the virtual S ′
coordinate system.
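The mirror-image construction can be sketched in a few lines of code. This is an illustration under assumed geometry, not the book's procedure: the mirror is taken to be the plane x = 0, the chair and eye coordinates are invented, and `mirror_image` is a hypothetical helper name.

```python
def mirror_image(point, mirror_point, unit_normal):
    """Reflect `point` through the plane containing `mirror_point` with normal `unit_normal`."""
    # Signed distance from the point to the mirror plane.
    d = sum((p - m) * n for p, m, n in zip(point, mirror_point, unit_normal))
    # The virtual point lies the same distance on the other side of the plane.
    return tuple(p - 2.0 * d * n for p, n in zip(point, unit_normal))

# Hypothetical layout: the mirror is the plane x = 0; chair and eye both sit at x < 0.
mirror_point = (0.0, 0.0, 0.0)
unit_normal = (1.0, 0.0, 0.0)
chair_point = (-2.0, 1.0, 0.0)
eye = (-3.0, 4.0, 0.0)

virtual_point = mirror_image(chair_point, mirror_point, unit_normal)

# The straight line from the virtual point to the eye crosses the mirror at `hit`.
t = (mirror_point[0] - virtual_point[0]) / (eye[0] - virtual_point[0])
hit = tuple(v + t * (e - v) for v, e in zip(virtual_point, eye))

print(virtual_point)
print(hit)
```

The real ray travels from the chair point to `hit` and then to the eye; its direction before reflection is the mirror image of the straight virtual line, which is why the virtual world can be used to read off the directions.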
Figure 4E.3(b) shows how to analyze optical configurations by constructing virtual objects on
the virtual side of all the specularly reflecting surfaces. The plane wave represented by the Z
triplet of rays drawn with solid arrows in Fig. 4E.3(b) reflects first off mirror M1, then reflects off
mirror M2 and into the transparent slab T. Using the same procedure as in Fig. 4E.3(a), we
construct T1′ and M 2′ , the dashed virtual images of T and M2 on the far side of M1. Just like
before, the direction at which the rays approach M2 can be found by extending the Z triplet of rays
as dashed straight arrows onto the virtual M 2′ surface. To analyze and “reflect” the virtual rays
off M 2′ , we construct T12′′ , a dash-dot virtual image of T1′ on the far side of the virtual mirror M 2′ .
4 · From Maxwell’s Equations to the Michelson Interferometer
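FIGURE 4E.3(a). [Mirror-image construction: chair C and coordinate system S in front of the mirror, virtual chair C′ and coordinate system S′ behind it, with ray Ar and its dashed extension Ar′.]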
Appendix 4E
FIGURE 4E.3(b). [Construction with mirrors M1 and M2, transparent slab T, ray triplet Z, and virtual images T1′, T2′, M2′, and T12′′.]
The direction of the Z rays at the true transparent slab T can now be found by extending the
dashed arrows as dash-dot arrows onto the virtual T12′′ transparent slab. (The symmetry of the
situation, by the way, shows that T12′′ can also be constructed as the virtual image in M1 of the
virtual image T2′ on the other side of M2.) The collection of surfaces M1, M 2′ , and T12′′ together
with the solid, dashed, and dash-dot Z rays is sometimes called a tunnel diagram. Tunnel
diagrams can be a convenient way to keep track of the angle of incidence of plane waves
propagating through a collection of specularly reflecting flat surfaces.
Figures 4E.4(a) and 4E.4(b) are tunnel diagrams for the A triplet of rays propagating through a
standard Michelson interferometer. The A rays represent a monochromatic plane wave A
propagating through the instrument in a slightly off-axis direction. Figure 4E.4(a) is the tunnel
diagram for the interferometer arm without the compensator plate. Here the dashed slab Sb and
mirror M ′ come from constructing a virtual Sa and M on the other side of the beam splitter’s
thin, partially reflecting film; and the dash-dot Sc slab is then the virtual representation of Sb in
mirror M ′ . The dashed and dash-dot virtual extensions of the A rays show what the angle of
incidence of plane wave A must be for its three passes through Sa while following the path of the
solid rays in Fig. 4E.4(a). In this tunnel diagram, we characterize the passage of slightly off-axis
plane waves through Sa, Sb, and Sc by complex parameters $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, and $\gamma_{s,p}^{(c)}$ respectively. For
s-type plane waves, the s subscript is chosen and for p-type plane waves the p subscript is chosen.
This is, of course, the same thing as saying that the $\gamma_{s,p}^{(a)}$ characterize the first passage of s-type
and p-type plane waves through the beam-splitter substrate before reflection off the beam-splitter
film, that the $\gamma_{s,p}^{(b)}$ characterize the second passage of s-type and p-type plane waves through the
beam-splitter substrate after reflection off the beam-splitter film, and that the $\gamma_{s,p}^{(c)}$ characterize the
third s-type or p-type passage through the beam-splitter substrate after reflection off mirror M.
We note that it is important to distinguish between these three passages because, as shown by the
tunnel diagram, the angle of incidence corresponding to $\gamma_{s,p}^{(c)}$ must be slightly different from the
angle of incidence corresponding to $\gamma_{s,p}^{(a)}$ for slightly off-axis plane waves; and, of course, the $\gamma_{s,p}^{(b)}$
is allowed to be different from $\gamma_{s,p}^{(a)}$ and $\gamma_{s,p}^{(c)}$ because it characterizes the reverse passage through
the beam-splitter substrate.
FIGURE 4E.4(a). This is a tunnel diagram for the interferometer arm without the compensator plate. [Labels: plane wave A; slabs Sa, Sb, and Sc with parameters $\gamma_{s,p}^{(a)}$, $\gamma_{s,p}^{(b)}$, and $\gamma_{s,p}^{(c)}$; mirror M and virtual mirror M′; virtual optical axis.]
FIGURE 4E.4(b). This is the tunnel diagram for the interferometer arm with the compensator plate. [Labels: slabs Sa′, Sb′, and Sc′ with parameters $\gamma_{s,p}^{(a)\prime}$, $\gamma_{s,p}^{(b)\prime}$, and $\gamma_{s,p}^{(c)\prime}$; virtual mirror M′; optical axis and virtual optical axis.]
Figure 4E.4(b) is the tunnel diagram for the interferometer arm with the compensator plate; it
is simpler than the tunnel diagram in Fig. 4E.4(a) because it uses the virtual images
corresponding to one rather than two specularly reflecting surfaces. Slab Sa′ represents the
beam-splitter substrate in Fig. 4E.4(b); it must have the same shape and orientation as Sa in Fig. 4E.4(a)
because it represents the same block of substrate material. Hence, for any monochromatic plane
wave, on-axis or off-axis, we must have
\[ \gamma_{s,p}^{(a)} = \gamma_{s,p}^{(a)\prime} \tag{4E.7a} \]
where $\gamma_{s,p}^{(a)\prime}$ are the complex parameters specifying an s-type or p-type plane wave’s passage
through slab Sa′ in Fig. 4E.4(b) and $\gamma_{s,p}^{(a)}$ are the complex parameters in Fig. 4E.4(a) that have
been defined in the previous paragraph to specify an s-type or p-type plane wave’s first passage
through slab Sa. Clearly, since passage through slab Sa′ is just another name for the same event as
passage through slab Sa, Eq. (4E.7a) is trivially true; in fact, for this reason it makes sense to drop
the primes from the $\gamma_{s,p}^{(a)\prime}$ parameters, assuming them to be always the same as the $\gamma_{s,p}^{(a)}$
parameters. We next note that if the compensator plate in Fig. 4E.4(b) has the same shape as the
substrate slab and it is given the appropriate orientation parallel to the substrate slab, then the
angle of passage of any on-axis or slightly off-axis plane wave through slab Sb in Fig. 4E.4(a) is
the same as the angle of passage of that same plane wave through slab Sb′ in Fig. 4E.4(b).
Consequently, we can regard the angle of incidence at which monochromatic plane waves
approach Sb to be the same as the angle of incidence at which the monochromatic plane waves
approach Sb′, which means that if the plane waves have the same wavenumber then
\[ \gamma_{s,p}^{(b)} \cong \gamma_{s,p}^{(b)\prime} \tag{4E.7b} \]
where $\gamma_{s,p}^{(b)\prime}$ are the complex parameters in Fig. 4E.4(b) specifying an s-type or p-type plane
wave’s passage through slab Sb′, and $\gamma_{s,p}^{(b)}$ are the already-defined complex parameters in Fig.
4E.4(a) specifying an s-type or p-type plane wave’s passage through slab Sb. Even though we are
here saying that $\gamma_{s,p}^{(b)\prime}$ and $\gamma_{s,p}^{(b)}$ are only approximately equal because the compensator plate may
not be exactly matched in thickness and orientation to the moving-mirror arm’s second pass
through the beam-splitter substrate, it still makes sense to idealize the situation and drop the
primes, assuming that in a well-designed interferometer all the $\gamma_{s,p}^{(b)}$ complex parameters are
effectively the same. Finally, we examine the angles of incidence of the plane wave on slab Sc in
Fig. 4E.4(a) and on slab Sc′ in Fig. 4E.4(b). Even though the ray triplet hits the two slabs at
different places, the angles of incidence must always be the same for any on-axis or slightly
off-axis monochromatic plane wave. The plane wave incident on the virtual compensator plate Sc′ in
Fig. 4E.4(b) passes through the slab “in reverse” compared to Sb′; and, of course, the same
observation applies to Sc compared to Sb in Fig. 4E.4(a). So again, in a well-designed
interferometer, we know that the same compensator plate satisfying Eq. (4E.7b) can also satisfy
\[ \gamma_{s,p}^{(c)} \cong \gamma_{s,p}^{(c)\prime} \tag{4E.7c} \]
where $\gamma_{s,p}^{(c)\prime}$ are the complex parameters specifying a plane wave’s passage through slab Sc′ and
$\gamma_{s,p}^{(c)}$ are the previously defined complex parameters specifying a plane wave’s passage through
slab Sc. In this situation, the primed and unprimed complex parameters may be only
approximately equal not only due to a slightly mismatched compensator plate but also because
the moving mirror may be slightly out of alignment, changing the angle of incidence from what it
ought to be.
When the moving mirror is slightly out of alignment, we know that angle $\theta_d$ defined at the
beginning of Sec. 4.12 of this chapter is nonzero and can give rise to a slight change in the angle
of propagation for on-axis and off-axis plane waves propagating back down the moving mirror
arm of the interferometer and into the beam-splitter substrate for the third time. Angle $\theta_d$ is much
smaller than angle $\theta_b$, the typical angle at which a slightly off-axis plane wave propagates with
respect to the optical axis. According to the discussion associated with Eqs. (4E.6a)–(4E.6c), the
only reason to worry about the effect of slightly different angles of incidence on the complex γ
parameters associated with the beam-splitter substrate and compensator plate is that the phase
(but not the amplitude) of plane waves passing through these transparent slabs can depend
sensitively on the angle of passage. This change in the plane wave’s phase shows up as a change
in the complex phase angle or argument of the γ parameters and does not significantly affect the
value of their complex magnitudes $|\gamma|$. We now show that the typical $\theta_d$ angle is in fact small
enough to disregard its effect on the phase of the monochromatic plane waves, allowing us to
disregard its effect on the complex phase angle, and thus on the value, of the γ parameters. This
means in particular that even for typical nonzero $\theta_d$ values we can drop the primes from the $\gamma_{s,p}^{(c)\prime}$
complex parameters and assume that $\gamma_{s,p}^{(c)}$ and $\gamma_{s,p}^{(c)\prime}$ are always effectively the same parameters in
a well-designed instrument.
Figure 4E.5(a) shows the solid ray corresponding to a properly aligned plane wave
propagating down the z axis toward the origin of an x, y, z Cartesian coordinate system. The
hollow arrow going through the origin and lying in the x, z plane is the unit-normal vector of the
surface of the transparent slab corresponding to the beam-splitter substrate. The plane of
incidence of the solid ray is then of course the x, z plane of the coordinate system. The solid ray
makes an angle of incidence $\theta_A$ when it intersects the slab’s surface at the origin. This ray is
labeled as ray 1. It refracts into the slab as ray I, still lying in the x, z plane of incidence and
having an angle of refraction
\[ \theta_B = \sin^{-1}\!\left( \frac{n_A \sin\theta_A}{n_B} \right) \]
from Eq. (4E.1a) above. The dashed ray, labeled ray 2 in Fig. 4E.5(a), corresponds to the
direction of propagation of the solid ray’s plane wave when the moving mirror is slightly
misaligned, changing its direction of propagation by an angle $\theta_d$. There is no reason to expect $\theta_d$
to lie in the x, z plane, so the plane of incidence of ray 2 is depicted as being different from that of
ray 1. When ray 2 refracts at the origin, turning into ray II, we see that the angle between rays I
and II must be the same order of magnitude as $\theta_d$, so we write this angle as $O(\theta_d)$.
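As a small numerical illustration of the refraction formula, the sketch below evaluates the angle of refraction for assumed values; the 45° angle of incidence and the refractive indices 1.0 and 1.5 are hypothetical and are not taken from the text.

```python
import math

def refraction_angle(theta_A, n_A, n_B):
    """Angle of refraction theta_B = asin(n_A * sin(theta_A) / n_B), in radians."""
    s = n_A * math.sin(theta_A) / n_B
    if abs(s) > 1.0:
        raise ValueError("no refracted ray: total internal reflection")
    return math.asin(s)

# Hypothetical values: 45 degree incidence from air (n = 1.0) into a slab with n = 1.5.
theta_B = refraction_angle(math.radians(45.0), 1.0, 1.5)
print(math.degrees(theta_B))
```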
Figure 4E.5(b) shows the plane containing refracted rays I and II. The intersection of this
plane with the flat surfaces of the beam-splitter substrate produces lines 1 and 2 in Fig. 4E.5(b),
showing where rays I and II enter and exit the slab. The $O(\theta_d)$ angle between them is also clearly
shown, now lying in the plane of the diagram. The distance between lines 1 and 2 must be O(w)
where w is the thickness of the slab. To estimate the monochromatic plane wave’s change in
phase due to the $O(\theta_d)$ change in the angle of propagation through the slab, we evaluate
\[ \frac{2\pi}{\lambda}\,\Delta s \cong \frac{2\pi}{\lambda} \cdot [O(\theta_d) \cdot O(w)] \cdot O(\theta_d) = O\!\left( \frac{2\pi w}{\lambda}\,\theta_d^2 \right) , \tag{4E.8a} \]
FIGURE 4E.5(a). [Labels: ray 1 and ray 2 separated by angle $\theta_d$; angle of incidence $\theta_A$; surface normal vector n̂; unit vectors x̂, ŷ, and ẑ; refracted rays I and II separated by an angle that is $O(\theta_d)$; angle of refraction $\theta_B = \sin^{-1}(n_A \sin\theta_A / n_B)$.]
We note that
\[ \lambda \geq \lambda_{\min} \approx 10^{-4} \text{ cm} \]
for infrared systems and that, according to inequality (4B.4b) in Appendix 4B of this chapter, the
maximum expected value of $\theta_d$ is
\[ \theta_d \approx 10^{-4} \text{ radians} . \]
The thickness w is typically on the order of 1 cm. We now see that [see requirement (4B.1c) in
Appendix 4B]
\[ \frac{2\pi}{\lambda}\,\Delta s \cong O\!\left( \frac{2\pi}{10^{-4}} \cdot 10^{-8} \right) \approx 2\pi \cdot 10^{-4} . \tag{4E.8b} \]
This is clearly small enough to ignore, justifying our decision in the discussion following Eq.
(4E.7c) to disregard the difference between the complex $\gamma_{s,p}^{(c)}$ and $\gamma_{s,p}^{(c)\prime}$ parameters. Inequality
(4B.2a) in Appendix 4B reveals that $\theta_b$, the typical off-axis angle of propagation of the off-axis
plane waves, can be as large as $10^{-2}$ radians. Putting this into formula (4E.8a) gives
\[ \frac{2\pi}{\lambda}\,\Delta s \cong O\!\left( \frac{2\pi}{10^{-4}} \cdot 10^{-4} \right) = O(2\pi) . \tag{4E.8c} \]
This is clearly too large to neglect, showing why we have been so careful to do the bookkeeping
on the phase changes undergone by the on-axis and off-axis monochromatic plane waves as they
propagate through the beam-splitter substrate and compensator plate.
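The two order-of-magnitude estimates can be reproduced with a few lines of arithmetic. This is only a sketch of the scaling in formula (4E.8a), with the O(1) factors set to one; the function name is hypothetical.

```python
import math

def phase_change_estimate(theta, w_cm, wavelength_cm):
    """Order-of-magnitude phase change (2*pi*w/lambda) * theta**2, in radians."""
    return 2.0 * math.pi * w_cm * theta**2 / wavelength_cm

w = 1.0             # slab thickness in cm, as quoted in the text
wavelength = 1e-4   # shortest infrared wavelength in cm, as quoted in the text

misaligned = phase_change_estimate(1e-4, w, wavelength)  # mirror misalignment angle
off_axis = phase_change_estimate(1e-2, w, wavelength)    # typical off-axis angle

print(misaligned / (2.0 * math.pi))  # fraction of a full cycle
print(off_axis / (2.0 * math.pi))
```

The first result is about one ten-thousandth of a cycle and the second is about one full cycle, which is the contrast the text draws between the misalignment angle and the off-axis angle.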
FIGURE 4E.5(b). [Labels: line 1 and line 2; the angle size is $O(\theta_d)$; the length is O(w); the small length shown is Δs.]
Appendix 4F
Figures 4F.1 and 4F.2 are tunnel diagrams like the ones used to explain the meaning of the $\gamma_{s,p}^{(a)}$,
$\gamma_{s,p}^{(b)}$, and $\gamma_{s,p}^{(c)}$ complex parameters introduced in Appendix 4E. The only difference is that these
tunnel diagrams apply to the monochromatic plane waves of the unbalanced background optical
signal coming from the detector side of a standard Michelson interferometer while the tunnel
diagrams in Appendix 4E analyze the monochromatic plane waves for the balanced optical signal
entering the interferometer’s front aperture.
Figures 4F.1 and 4F.2 show the path of a slightly off-axis plane wave represented by three
rays coming from the detector side of the interferometer. The tunnel diagram in Fig. 4F.1
corresponds to the rays transmitting through the beam-splitter film and substrate, reflecting off
the moving mirror, and then transmitting a second time through the beam-splitter substrate and
film on their way back to the detector. The tunnel diagram in Fig. 4F.2 corresponds to the rays
coming from the same off-axis direction, but this time they reflect off the back side of the
beam-splitter film, transmit twice through the compensator plate while going out and back along the
fixed-mirror arm, and then reflect a second time off the back side of the beam-splitter film to return to
the detector.
In Fig. 4F.1, the angle of incidence of the off-axis rays on the back side of the beam-splitter
substrate is slightly different from the angle of incidence on the virtual beam-splitter substrate
shown on the other side of the moving mirror in the tunnel diagram. We have found that the
change in a plane wave passing through a transparent slab can be described by a complex
parameter whose argument or complex phase angle depends sensitively on the angle of incidence
and whose magnitude does not [see, for example, Eqs. (4E.6a)–(4E.6c) in Appendix 4E]. The
tunnel diagram in Fig. 4F.1 shows that the angle of incidence for the second pass through the
beam-splitter substrate differs slightly from that of the first pass, so we call the complex
parameter for the second pass $\gamma_s^{(v)}$ for s-type monochromatic plane waves and $\gamma_p^{(v)}$ for the p-type
monochromatic plane waves, while the complex parameter for the first pass is called $\gamma_s^{(u)}$ for
s-type monochromatic plane waves and $\gamma_p^{(u)}$ for the p-type monochromatic plane waves.
In Fig. 4F.2, the angle of incidence of the slightly off-axis rays on their second pass through
the compensator plate must also be slightly different from the angle of incidence of their first
pass; we also note, however, that according to the tunnel diagrams in Figs. 4F.1 and 4F.2, the
angle of incidence of the first pass through the beam-splitter substrate must be the same as the
angle of incidence of the first pass through the compensator plate when the compensator plate is
correctly aligned parallel to the beam-splitter substrate. Similarly, the angle of incidence of the
second pass through the compensator plate must be the same as the angle of incidence of the
second pass through the beam-splitter substrate.
FIGURE 4F.1. [Labels: optical axis; beam-splitter substrate with parameter $\gamma_{s,p}^{(u)}$; moving mirror; virtual beam-splitter substrate with parameter $\gamma_{s,p}^{(v)}$.]
Hence, the same complex parameters $\gamma_{s,p}^{(u)}$ and $\gamma_{s,p}^{(v)}$ used for the first and second passes through
the beam-splitter substrate should also be used to describe the first and second passes of the
s-type and p-type monochromatic plane waves through the compensator plate.
For future use, we define
\[ \gamma_s^{(uv)} = \gamma_s^{(u)} \cdot \gamma_s^{(v)} \tag{4F.1a} \]
and
\[ \gamma_p^{(uv)} = \gamma_p^{(u)} \cdot \gamma_p^{(v)} . \tag{4F.1b} \]
The reason for this is the same as before: the change in phase of the slightly off-axis plane waves
passing through the beam-splitter substrate or compensator plate depends sensitively on their
angle of incidence while their loss of power does not. We also note that, according to the analysis
at the end of Appendix 4E of this chapter, this dependence of the phase change on the angle of
incidence is not so sensitive as to be affected by the very small misalignments of the moving
mirror that may occur in well-designed interferometers.
5
DESCRIPTION OF PRACTICAL
INTERFEROMETER MEASUREMENTS
The concept of spectral radiance was introduced in Chapter 4 to simplify the interference
equations, and it turns out to have a much wider range of usefulness than might at first be
expected. We start off this chapter with a quick description of how the spectral radiance can be
used to analyze the large-scale power flow and spectral content of electromagnetic radiation
fields, matching this to our intuitive understanding of what is meant by the brightness and
darkness of both near and distant objects. This is followed by a description of what is seen with
the naked eye when looking out through a standard Michelson interferometer, showing how it fits
in with the previous chapter’s interference formulas. The somewhat abstract equations derived in
Chapter 4 are converted into more practical formulas, and we explain the consequences of the
nonrandom errors and signal distortions found in realistic instruments. We describe the balanced,
unbalanced, and off-axis interferogram signals as well as how calibration removes contaminating
background radiances from the measured spectra. The characteristic strengths and weaknesses of
double-sided and single-sided interferogram systems are discussed, and we analyze the
degradation introduced by nonflat optical surfaces. The signals produced in the detector are
traced through the anti-aliasing filter to the analog-to-digital converter, where they are
transformed into digital input for the discrete Fourier transform. The chapter ends with an
explanation of why it sometimes makes sense to oversample or undersample the interferogram
signals.
79
See, for example, the discussion at the start of Sec. 4.16 of Chapter 4.
5 · Description of Practical Interferometer Measurements
energy $dE_\sigma$ passing through a cross-sectional area A of a beam in time 2T into a solid angle dΩ
and having a wavenumber between σ and σ + dσ is
\[ dE_\sigma = L(\sigma) \cdot 2T \cdot A \cdot d\Omega \cdot d\sigma . \tag{5.1} \]
In analyzing any radiation field as a collection of radiant beams, as we are doing here, the idea of
a spectral radiance can be applied to any large-scale description of electromagnetic radiation. In
Sec. 4.9 of the last chapter, parallel groups of rays are used to represent plane waves inside a
beam. In radiometry, these ray groups are bundled together into what are often called pencils of
rays,80 or pencil rays for short, such that each pencil ray becomes an idealized representation of a
single conceptual beam of the radiation field. The pencil rays can be thought of as channels along
which electromagnetic energy flows. Just as the interferometer beam has a spectral radiance, so
too can a spectral radiance be assigned to every pencil ray of a large-scale radiation field.
In radiometry, the spectral radiance of each pencil ray is a function L(σ) such that the
amount of radiant energy $dE_\sigma$ passing through a cross-sectional area dA of a pencil ray in time dt
into a solid angle dΩ and having a wavenumber between σ and σ + dσ is
\[ dE_\sigma = L(\sigma) \cdot dt \cdot dA \cdot d\Omega \cdot d\sigma . \tag{5.2a} \]
In this formula area dA has its normal vector parallel to the axis of the ray as shown in Fig. 5.1(a).
Equations (5.1) and (5.2a) can be matched to each other exactly if we make the associations
\[ A \leftrightarrow dA \tag{5.2b} \]
and
\[ 2T \leftrightarrow dt . \tag{5.2c} \]
This shows that to keep the radiometric L(σ) function consistent with Maxwell’s equations, the
physical quantities dA and dt, although mathematical infinitesimals, must always be thought of as
much larger than the wavelengths and periods of the propagating electromagnetic fields. If the
normal vector of area dA makes an angle θ with respect to the axis of the pencil ray, we expect
the effective area transverse to the beam to be (dA · cos θ) as shown in Fig. 5.1(b). Now the
energy propagating along the pencil ray is
\[ dE_\sigma = L(\sigma) \cdot dt \cdot (dA \cdot \cos\theta) \cdot d\Omega \cdot d\sigma . \tag{5.2d} \]
80
Max Born and Emil Wolf, Principles of Optics, 7th exp. ed. (Macmillan Company, New York, 1964).
Radiometric Description of Electromagnetic Fields · 5.1
FIGURE 5.1(a). [Pencil ray crossing area dA, shown edge-on.]
FIGURE 5.1(b). [Pencil ray crossing area dA, shown edge-on, with the unit vector normal to dA at angle θ to the ray; edge-on view of the effective area (cos θ) dA.]
Similarly $dE_f$, the radiant energy flowing in time dt through an area dA making an angle θ with
respect to the pencil ray into a solid angle dΩ and having a frequency between f and f + df
should be
\[ dE_f = L_f(f) \cdot dt \cdot (dA \cdot \cos\theta) \cdot d\Omega \cdot df . \tag{5.3b} \]
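The total amount of radiant energy flowing along the ray should be the same no matter how the
spectrum is represented, so
\[ (dA \cdot \cos\theta) \cdot dt \cdot d\Omega \int_0^\infty L(\sigma)\,d\sigma = (dA \cdot \cos\theta) \cdot dt \cdot d\Omega \int_0^\infty L_f(f)\,df = (dA \cdot \cos\theta) \cdot dt \cdot d\Omega \int_0^\infty L_\lambda(\lambda)\,d\lambda \]
or
\[ \int_0^\infty L(\sigma)\,d\sigma = \int_0^\infty L_f(f)\,df \tag{5.3c} \]
and
\[ \int_0^\infty L(\sigma)\,d\sigma = \int_0^\infty L_\lambda(\lambda)\,d\lambda . \tag{5.3d} \]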
But σ = 1/λ = f/c [see discussion immediately preceding Eq. (4.19c) in Chapter 4], which we
can use to change the variable of integration in these last two equations to get
\[ \int_0^\infty \left[ \frac{1}{c}\, L\!\left( \frac{f}{c} \right) \right] df = \int_0^\infty L_f(f)\,df \tag{5.3e} \]
and
\[ \int_0^\infty \left[ \frac{1}{\lambda^2}\, L\!\left( \frac{1}{\lambda} \right) \right] d\lambda = \int_0^\infty L_\lambda(\lambda)\,d\lambda . \tag{5.3f} \]
These equations must hold true for any physically conceivable spectral radiance L(σ), which
means that
\[ L_f(f) = \frac{1}{c}\, L\!\left( \frac{f}{c} \right) \tag{5.3g} \]
and
\[ L_\lambda(\lambda) = \frac{1}{\lambda^2}\, L\!\left( \frac{1}{\lambda} \right) . \tag{5.3h} \]
This can be used to define $L_f$ and $L_\lambda$ in terms of L. The phrase “physically conceivable” lets us
assume that L(1/λ) → 0 as λ → 0 and that it does this fast enough to avoid any concern that the
right-hand side of (5.3h) becomes singular as λ → 0.
Radiation escaping from relatively small holes in cavities whose walls are all at the same
temperature T is called black-body or Planck radiation. One of the first triumphs of quantum
mechanics at the beginning of the 20th century was to explain why the spectral radiance of this
sort of radiation is always given by the formula
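\[ \left. L(\sigma) \right|_{\text{Planck}} = \frac{2hc^2\sigma^3}{e^{hc\sigma/(kT)} - 1} , \tag{5.3i} \]
\[ \left. L_f(f) \right|_{\text{Planck}} = \frac{2hf^3/c^2}{e^{hf/(kT)} - 1} , \tag{5.3j} \]
and
\[ \left. L_\lambda(\lambda) \right|_{\text{Planck}} = \frac{2hc^2/\lambda^5}{e^{hc/(kT\lambda)} - 1} . \tag{5.3k} \]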
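The three Planck formulas can be checked against the conversion rules (5.3g) and (5.3h). The sketch below is an illustration only: it assumes CGS units (erg, cm, s, K) with CODATA values for the constants, and the function names are hypothetical.

```python
import math

H = 6.62607015e-27   # Planck constant in erg s
C = 2.99792458e10    # speed of light in cm/s
K = 1.380649e-16     # Boltzmann constant in erg/K

def planck_sigma(sigma, T):
    """Eq. (5.3i): spectral radiance per unit wavenumber (sigma in cm^-1)."""
    return 2.0 * H * C**2 * sigma**3 / math.expm1(H * C * sigma / (K * T))

def planck_f(f, T):
    """Eq. (5.3j): spectral radiance per unit frequency (f in Hz)."""
    return (2.0 * H * f**3 / C**2) / math.expm1(H * f / (K * T))

def planck_lambda(lam, T):
    """Eq. (5.3k): spectral radiance per unit wavelength (lam in cm)."""
    return (2.0 * H * C**2 / lam**5) / math.expm1(H * C / (K * T * lam))

def L_f_from_L(L, f):
    """Eq. (5.3g): L_f(f) = (1/c) L(f/c) for any wavenumber radiance L."""
    return L(f / C) / C

def L_lambda_from_L(L, lam):
    """Eq. (5.3h): L_lambda(lam) = (1/lam^2) L(1/lam)."""
    return L(1.0 / lam) / lam**2

T = 300.0
L = lambda sigma: planck_sigma(sigma, T)
f0, lam0 = 1000.0 * C, 1.0e-3   # both correspond to sigma = 1000 cm^-1

print(L_f_from_L(L, f0), planck_f(f0, T))
print(L_lambda_from_L(L, lam0), planck_lambda(lam0, T))
```

Each printed pair should agree to within floating-point rounding, which confirms that (5.3j) and (5.3k) are the images of (5.3i) under the two changes of variable.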
We often use a “gray-body” approximation to get the spectral radiance for heat or infrared
radiation that a surface of temperature T spontaneously emits. To use the gray-body
approximation, we just multiply $L_{\text{Planck}}$ at the surface’s temperature T by a dimensionless fraction
between zero and one, which is called the surface’s emissivity, with different surfaces having
different emissivity values. Sometimes, to get more accuracy, the emissivity is taken to be a
function of wavenumber and temperature; when this is done the $L_{\text{Planck}}$ function is being used to
give the correct overall size and shape to the surface’s spectral radiance while the spectral
dependence of the emissivity is used to reproduce the rapid fluctuations with respect to σ
characteristic of the surface.
Figures 5.2(a)–5.2(c) contain plots of $L(\sigma)|_{\text{Planck}}$, $L_f(f)|_{\text{Planck}}$, and $L_\lambda(\lambda)|_{\text{Planck}}$ at temperatures
of 300 K, 400 K, and 500 K. The spectral radiance increases with temperature at every
wavenumber, matching our intuition about what ought to occur. We note that at 300 K
(approximate room temperature) only negligible radiation is emitted in the visible region of the
electromagnetic spectrum between approximately 15,000 cm⁻¹ and 22,000 cm⁻¹, which is, of
course, what we should expect, and the same is also true of the 400 K and 500 K curves.
(Surfaces in fact start to become visibly hot only at 700 K and higher.) Unfortunately, the Planck
curve is rather featureless, tending to conceal what is going on when we switch from L to $L_f$ to
$L_\lambda$ to represent the same radiance spectrum. Figures 5.2(d)–5.2(f) show a more interesting
electromagnetic spectrum represented using the L(σ), $L_f(f)$, and $L_\lambda(\lambda)$ spectral radiance
functions. These plots reveal that the transformation from L to $L_\lambda$ not only distorts the spectrum’s
overall shape but also reverses the ordering of the spectral features, putting large wavenumber
features at small wavelengths and small wavenumber features at large wavelengths. The
transformation from L to $L_f$, on the other hand, just involves a rescaling of the x and y axes of the
spectrum. This latter transformation, then, acts like a simple change in our choice of units; and
for this reason the word “frequency” is sometimes used to refer to wavenumber. The idea behind
this terminology is that wavenumbers are just frequencies that happen to be measured in “units”
of cm⁻¹.
FIGURE 5.2(a). [Plot of $L(\sigma)|_{\text{Planck}}$, in (erg/sec)/cm/sr, versus σ (in cm⁻¹), with curves for 300 K, 400 K, and 500 K.]
FIGURE 5.2(b). [Plot of $L_f(f)|_{\text{Planck}}$, in (erg/sec)/cm²/sr/Hz, versus f (in Hz), with curves for 300 K, 400 K, and 500 K.]
FIGURE 5.2(c). [Plot of $L_\lambda(\lambda)|_{\text{Planck}}$, in (erg/sec)/cm³/sr, versus λ (in cm), with curves for 300 K, 400 K, and 500 K.]
FIGURE 5.2(d). [Plot of L(σ), in Watts/cm²/sr/cm⁻¹, versus σ (in cm⁻¹).]
FIGURE 5.2(e). [Plot of $L_f(f)$, in Watts/cm²/sr/Hz, versus f (in TeraHz).]
FIGURE 5.2(f). [Plot of $L_\lambda(\lambda)$, in Watts/cm³/sr, versus λ (in microns).]
Different authors use different notations for L, $L_f$, and $L_\lambda$. The easiest way to find out what
exactly is meant by the term “spectral radiance” is to check the units. Consulting Eqs. (5.2d),
(5.3a), and (5.3b), we see that L must have units of energy per unit time per unit area per unit
solid angle per unit wavenumber interval, whereas $L_f$ has units of energy per unit time per unit
area per unit solid angle per unit frequency interval and $L_\lambda$ has units of energy per unit time per
unit area per unit solid angle per unit wavelength interval. Although solid angles measured in
steradians, like angles measured in radians, are strictly speaking dimensionless, it is customary in
radiometry to write out the steradian unit explicitly, treating it as if it had a dimension. This
convention makes it easy to distinguish physical quantities such as the spectral radiance, which
are both “per unit surface area” and “per unit steradian,” from physical quantities such as the
energy flux that are just “per unit surface area.”
To go from the spectral radiance to the radiance, we need only integrate L(ı) over all positive
wavenumbers, integrate L f ( f ) over all positive frequencies, or integrate LȜ over all
wavelengths. Using l to represent the radiance, we say that
∞ ∞ ∞
l = ³ L(σ )dσ = ³ L f ( f )df = ³ L λ (λ )d λ . (5.4a)
0 0 0
The integrals are between 0 and because L and Lƒ are defined in such a way as to spread the
radiant energy over positive wavenumbers and frequencies respectively—and wavelength, of
course, must be a positive quantity. In this sense, they are all analogous to the single-sided power
spectra discussed at the end of Sec. 3.23 in Chapter 3. We integrate Eq. (5.2d) over positive ı and
use (5.4a) to get that the total energy dE flowing in time dt through an area dA making an angle ș
with respect to the pencil ray into a solid angle d is
The same formula comes from integrating Eqs. (5.3a) or (5.3b) over positive frequencies or
wavelengths respectively. Different authors may use different notations for the radiance, and
again the surest way to find out what is going on is to check the units. The units of the radiance l
are, of course, energy per unit time per unit area per unit solid angle.
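As a quick numerical check of Eq. (5.4a), the sketch below integrates a made-up spectral radiance once over wavenumber and once over wavelength. It assumes the standard change-of-variables relation L_λ(λ) = L(σ)/λ² with σ = 1/λ, which is not stated in this section; the Gaussian band shape, the helper names, and all numerical values are illustrative only.

```python
import math

def L_sigma(sigma):
    """Hypothetical spectral radiance per unit wavenumber (arbitrary units)."""
    return math.exp(-((sigma - 1000.0) / 100.0) ** 2)

def L_lambda(lam):
    """Same radiance per unit wavelength, assuming L_lambda = L_sigma / lambda**2."""
    return L_sigma(1.0 / lam) / lam ** 2

def trapezoid(f, a, b, n=20000):
    """Composite trapezoid rule on [a, b] with n panels."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h

# The band is negligible outside 400..1600 cm^-1, so finite limits suffice.
radiance_from_sigma = trapezoid(L_sigma, 400.0, 1600.0)
radiance_from_lambda = trapezoid(L_lambda, 1.0 / 1600.0, 1.0 / 400.0)
print(radiance_from_sigma, radiance_from_lambda)
```

Both integrals give the same radiance to within the accuracy of the quadrature, which is the content of Eq. (5.4a): the spectral radiances differ in units and shape, but the radiance they integrate to does not.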
Radiance Fields in Space · 5.2
where l1 is the radiance at position 1 along the pencil ray. Similarly, the amount of radiant energy
passing through dA2 in time dt into a solid angle dΩ2 is
where l2 is the radiance at position 2 along the pencil ray. The values of l1 and l2 cannot depend
on the size of the infinitesimal quantities dA1, dA2, dΩ1, dΩ2, or dt, so nothing stops us from
choosing dΩ1 and dΩ2 to be the solid angles subtended by dA2 and dA1 at positions 1 and 2 respectively:
\[ d\Omega_1 = \frac{dA_2 \cos\theta_2}{r^2} \tag{5.5c} \]
and
\[ d\Omega_2 = \frac{dA_1 \cos\theta_1}{r^2} , \tag{5.5d} \]
where, as shown in Fig. 5.3(b), r is the distance between positions 1 and 2.
FIGURE 5.3(a). [Labels: unit vector normal to area dA; angle θ.]
5 · Description of Practical Interferometer Measurements
FIGURE 5.3(b). [Labels: unit vector normal to dA1; unit vector normal to dA2; angle θ1; distance r; Position 1; Position 2.]
If we make the reasonable assumption that energy travels in straight lines inside a homogeneous
medium, as shown by the dotted lines in Fig. 5.3(c), and also specify that the values of l1 and l2
do not change with time, then the radiant energy passing through dA1 into dΩ1 in time dt must be
the same as the radiant energy passing through dA2 into dΩ2 in time dt. From Eqs. (5.5a)–(5.5d)
we then have
which reduces to
\[ l_1 = l_2 . \tag{5.5f} \]
Hence when the radiance is not changing with time it must also be constant along any pencil
of rays.
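The geometric step behind Eq. (5.5f) can be sketched numerically. The code below assumes that Eqs. (5.5a) and (5.5b), which are not reproduced here, have the usual form dE = l · cos θ · dA · dΩ · dt; under that assumption the factor multiplying the radiance is the same at both positions once Eqs. (5.5c) and (5.5d) are substituted. The function names and numerical values are hypothetical.

```python
import math

def solid_angle(dA_other, theta_other, r):
    """Eqs. (5.5c)/(5.5d): solid angle subtended by a small tilted area at distance r."""
    return dA_other * math.cos(theta_other) / r ** 2

def throughput(dA, theta, dOmega):
    """Geometric factor cos(theta)*dA*dOmega multiplying the radiance (assumed form)."""
    return math.cos(theta) * dA * dOmega

# Illustrative values only: areas in cm^2, angles in radians, r in cm.
dA1, dA2 = 1.0e-4, 3.0e-4
theta1, theta2 = 0.2, 0.5
r = 50.0

dOmega1 = solid_angle(dA2, theta2, r)   # Eq. (5.5c)
dOmega2 = solid_angle(dA1, theta1, r)   # Eq. (5.5d)
g1 = throughput(dA1, theta1, dOmega1)
g2 = throughput(dA2, theta2, dOmega2)
print(g1, g2)  # equal geometric factors, so equal energies force l1 = l2
```

Because the two geometric factors are symmetric in the indices 1 and 2, equal energies through dA1 and dA2 can only be reconciled with l1 = l2.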
We have now established a self-consistent model for radiation fields in empty space and
transparent media. To find the radiance at any point, such as point A in Fig. 5.4, we need only
take note of all the criss-crossing pencil rays, establishing their radiances by tracing them back to
the surfaces where they originated. It does not matter whether the surface has reflected them like
surface 1 or, being self-luminous, has created them like surfaces 2 and 3; all that is relevant is the
radiance value they have when leaving the surface. There is nothing special about point A in Fig.
5.4—its location is obviously arbitrary.
FIGURE 5.3(c). [Labels: dA1 at Position 1; dA2 at Position 2.]
FIGURE 5.4. [Labels: Point A; Surface 1; Surface 2; Surface 3.]
By moving point A around and specifying the radiances of the different pencil rays passing
through point A, we construct a radiance field $l(\vec{r}, \hat{\Omega})$ that is a function both of position $\vec{r}$ and
direction $\hat{\Omega}$. Having picked a position $\vec{r}$ at which to evaluate l, we need $\hat{\Omega}$ as well to specify
one particular pencil ray passing through position $\vec{r}$. It is even possible, once the radiance l is
thought of as a function of $\vec{r}$ and $\hat{\Omega}$, to derive a simple differential equation describing the
gradual change in radiance undergone by these pencil beams when they travel through
semiopaque and self-luminous media, such as clouds of radiating gas. This last idea is the starting
point for modeling radiance fields inside stars or planetary atmospheres, but is not really needed
for the material in this book.81
Along with the radiance field $l(\vec{r}, \hat{\Omega})$, which is a function of position and direction of
propagation, we can associate a spectral radiance field $L(\sigma, \vec{r}, \hat{\Omega})$ that is a function of
wavenumber σ, position $\vec{r}$, and direction of propagation $\hat{\Omega}$ such that
\[ l(\vec{r}, \hat{\Omega}) = \int_0^\infty L(\sigma, \vec{r}, \hat{\Omega})\,d\sigma . \tag{5.6} \]
Suppressing the $\vec{r}$ dependence on position to represent a radiance field that is constant over some
region of space, and choosing a direction in space to be the ẑ axis of a coordinate system so that
\[ \hat{\Omega} = \vec{\varepsilon} + \hat{z}\sqrt{1 - \varepsilon^2} , \]
as in Eq. (4.97d) of Chapter 4, we can write L as a function of $\vec{\varepsilon}$, as in $L = L(\vec{\varepsilon}, \sigma)$, to show its
dependence on the radiation’s direction of propagation. This function L is the same quantity as
the spectral radiance $L(\vec{\varepsilon}, \sigma)$ specified by Eq. (4.136d) in Chapter 4. Many times in the rest of
this chapter we will talk about a single pencil ray from a distant source passing through an
interferometer. The pencil ray has, of course, a unique spectral radiance $L(\vec{\varepsilon}, \sigma)$; and the pencil
ray while passing through the interferometer can be decomposed into a group of parallel rays
because it emanates from a distant source. This parallel group of rays, according to Sec. 4.9 of
Chapter 4, specifies a plane wave passing through the interferometer. To get the optical energy
per unit area per unit time per unit wavenumber interval carried by the plane wave, we just
multiply the spectral radiance of the pencil ray by the extremely small solid angle dΩ subtended
81
The interested reader is referred to S. Chandrasekhar, Radiative Transfer (Dover Publications, New York, 1960)
for a classic textbook, or Curtis D. Mobley, Light and Water: Radiative Transfer in Natural Waters (Academic
Press, New York, 1994), based in part on collaborations with Rudolf W. Preisendorfer, for a more modern work in
this field. What we call radiance and spectral radiance, Chandrasekhar calls, respectively, intensity and specific
intensity.
by the distant source at the position of the interferometer. This procedure amounts to nothing
more than mentally associating dΩ with $L(\vec{\varepsilon}, \sigma)$ in Eq. (5.2a) to get
\[ dE_\sigma = [L(\vec{\varepsilon}, \sigma) \cdot d\Omega] \cdot dt \cdot dA \cdot d\sigma . \]
Writing this equality as
\[ \frac{dE_\sigma}{dt \cdot dA \cdot d\sigma} = [L(\vec{\varepsilon}, \sigma) \cdot d\Omega] \]
makes it easy to see why multiplying the spectral radiance of the pencil ray by dΩ gives the
optical energy of the plane wave per unit time per unit area per unit wavenumber interval.
where
\[ d\Omega_A = \frac{a_{surf}}{r_A^2} \tag{5.7b} \]
FIGURE 5.5. [Labels: positions A and B; distances r_A and r_B.]
is the solid angle subtended by the surface patch at position B. Substitution of (5.7b) into (5.7a)
and (5.8b) into (5.8a) gives
\[ \frac{dE_A}{dt} = P_A = \frac{l_{surf} \cdot a_{pupil} \cdot a_{surf}}{r_A^2} \tag{5.9a} \]
and
\[ \frac{dE_B}{dt} = P_B = \frac{l_{surf} \cdot a_{pupil} \cdot a_{surf}}{r_B^2} , \tag{5.9b} \]
Radiance, Brightness, and the Inverse-Square Law · 5.3
where P_A and P_B are, respectively, the radiant power entering the observer’s eye at positions A
and B. This result can be written as
\[ \frac{P_B}{P_A} = \left( \frac{r_A^2}{r_B^2} \right) , \tag{5.9c} \]
showing how the familiar inverse-square law for radiant power hides inside the rule that the
radiance along any pencil ray is constant.
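A minimal sketch of Eqs. (5.9a) to (5.9c) follows. The function name and the numerical values are hypothetical; the only physics used is that the same surface radiance enters both formulas.

```python
import math

def pupil_power(l_surf, a_surf, a_pupil, r):
    """Eqs. (5.9a)/(5.9b): radiant power entering a pupil at distance r from the patch."""
    return l_surf * a_pupil * a_surf / r ** 2

# Illustrative values only (arbitrary radiance units, areas in cm^2, distances in cm).
l_surf, a_surf, a_pupil = 2.0, 4.0, 0.3
r_A, r_B = 100.0, 250.0

P_A = pupil_power(l_surf, a_surf, a_pupil, r_A)
P_B = pupil_power(l_surf, a_surf, a_pupil, r_B)
print(P_B / P_A, (r_A / r_B) ** 2)   # both sides of Eq. (5.9c)
```

The radiance l_surf is never rescaled between the two calls; the power drops only because the solid angle subtended by the patch shrinks as 1/r².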
The idea that the interior points of a surface patch can have a brightness only makes sense
when the observer is near enough to resolve—or “see”—the shape of the surface patch. When the
observer is so distant that the surface patch is just a point of light, we say that the image of the
surface patch is unresolved. Now the brightness of that point of light follows the inverse-square
law directly by growing ever dimmer as the distance between the observer and surface patch
increases. The “brightness” of an unresolved point source, then, depends not on the radiance of
the pencil ray emanating from that source but rather on the total radiant power entering the
observer’s eye.
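\[ P_{bal}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma) \left[ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi) \right] d\sigma , \tag{5.10a} \]
where
\[ S(\sigma) = A \cdot \Delta\Omega \cdot \eta(\sigma) \cdot L(\sigma) \tag{5.10b} \]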
FIGURE 5.6. [Labels: Moving Mirror; mirror displacement a = χ/2; Ideal Beam Splitter; Fixed Mirror.]
The Balanced Signal of a Michelson Interferometer · 5.4
and
\[ M(R\sigma\theta_{ma}) = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} . \tag{5.10c} \]
For an ideal interferometer the beam-splitter efficiency η is always one and so S(σ) specified by
(5.10b) is the same S(σ) specified by Eq. (1.19d) in Chapter 1 and (4.140c) in Chapter 4. Function
P_bal(χ) gives the optical power in the balanced interference signal coming from the point source,
and we often call P_bal the balanced interference signal when context makes it clear what is meant.
Figure 5.6 shows the source observed through the interferometer by the unaided eye, so in Eq.
(5.10b) the effective cross-sectional area A of the interferometer beam is the area of the eye’s
circular entrance pupil and L(σ), of course, is the spectral radiance of the pencil ray from the
distant source. The beam-splitter efficiency η(σ), which is a function of wavenumber, reminds us
that radiation of wavenumber σ only contributes to the interference signal to the extent that it
penetrates the beam splitter; wavenumbers for which the beam splitter is opaque so that η = 0,
for example, cannot contribute to P_bal(χ). As is pointed out in the discussion following Eq.
(4.136i) of Chapter 4, we expect that
\[ 0 < \eta < 1 \]
for realistic interferometers. In Eq. (5.10c) the radius R of the eye’s circular entrance pupil is
related to the pupil area by the standard formula
\[ R = \sqrt{\frac{A}{\pi}} , \tag{5.10d} \]
and θ_ma is the misalignment angle of the moving mirror. For an ideal interferometer that is in
perfect alignment, θ_ma = 0; and, according to Eq. (4.137k) of Chapter 4, when θ_ma is zero
\[ \left. \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} \right|_{\theta_{ma} = 0} = M(0) = 1 . \tag{5.10e} \]
This means M = 1 is a shorthand for the assumption that the interferometer is in perfect
alignment. For future use we also note that, according to Eq. (4.137g),
M ≤1 (5.10g)
always. Unless otherwise stated, we assume in this chapter that M is constant, postponing until
the next chapter any discussion of what happens when M changes while the interferometer is
measuring spectra.
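The behavior of the misalignment factor in Eqs. (5.10c), (5.10e), and (5.10g) can be explored with a short sketch. To stay within the Python standard library, J1 is evaluated here from Bessel's integral representation, J1(z) = (1/π) ∫ cos(τ − z sin τ) dτ over 0 ≤ τ ≤ π; that choice, the function names, and the sample values of R, σ, and θ_ma are assumptions made for illustration.

```python
import math

def bessel_j1(z, n=2000):
    """J1(z) from (1/pi) * integral_0^pi cos(tau - z*sin(tau)) dtau, midpoint rule."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        tau = (k + 0.5) * h
        total += math.cos(tau - z * math.sin(tau))
    return total * h / math.pi

def misalignment_factor(R, sigma, theta_ma):
    """Eq. (5.10c): M = J1(4*pi*R*sigma*theta_ma) / (2*pi*R*sigma*theta_ma)."""
    x = R * sigma * theta_ma
    if x == 0.0:
        return 1.0                      # limiting value, Eq. (5.10e)
    return bessel_j1(4.0 * math.pi * x) / (2.0 * math.pi * x)

# Illustrative values: R in cm, sigma in cm^-1, theta_ma in radians.
R, sigma = 0.3, 1000.0
for theta_ma in (0.0, 1.0e-5, 1.0e-4, 5.0e-4, 1.0e-3):
    print(theta_ma, misalignment_factor(R, sigma, theta_ma))
```

For these sample values M stays close to one for misalignments of a few microradians and then falls off, passing through zero near Rσθ_ma ≈ 0.305, where the argument of J1 reaches its first zero.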
To show how formula (5.10a) works, we choose a specific shape for the spectral radiance
L(σ), making the idealization that η = M = 1 at all σ for which L(σ) ≠ 0. Figure 5.7 specifies the
shape of L(σ), and according to Eq. (5.10b) the single-sided power spectrum
\[ S(\sigma) = A \cdot \Delta\Omega \cdot L(\sigma) \]
FIGURE 5.7. [Plot of L(σ) against σ (in cm⁻¹), with L_max, σ_min, and σ_max marked.]
\[ \chi = 2a \tag{5.11a} \]
for this interferometer, is also zero. This means that when a = 0 the moving mirror is at the
zero-path difference (ZPD) position shown by the dashed line in Fig. 5.6. From (5.10a) we see that at
ZPD when η = M = 1
\[ P_{bal}(0) = \frac{1}{2} \int_0^\infty S(\sigma)\,(1 + W)\,d\sigma = \begin{cases} P_{inp} & \text{for } W = 1 \\ 0 & \text{for } W = -1 \end{cases} , \tag{5.11b} \]
where
\[ \text{input radiant power in pencil ray} = P_{inp} = \int_0^\infty S(\sigma)\,d\sigma . \tag{5.11c} \]
Evaluating (5.10a) when η = M = 1 for all nonzero values of χ = 2a, we get the two different
P_bal curves shown in Figs. 5.8(a) and 5.8(b). When χ = 2a = 0 and W = 1, we see that P_bal(0) in
Eq. (5.11b) specifies the maximum possible value for the interference signal; and when W = −1,
we see that P_bal(0) specifies the minimum possible value for the interference signal. The observer
in Fig. 5.6 sees the starlike source disappear when the pencil ray passes through an ideal
interferometer that has its moving mirror at ZPD and a beam splitter with W = −1. When the
pencil ray passes through an ideal interferometer whose beam splitter has W = 1, the observer
sees the full brightness of the starlike source when the moving mirror is at ZPD. If χ is changed
by shifting the moving mirror, both Figs. 5.8(a) and 5.8(b) show how, for this ideal
interferometer, the source brightness seen by the observer oscillates around P_inp/2, half the full
brightness of the starlike source. We note that when a and χ are positive (that is, when
a = χ/2 > 0), the moving mirror is more distant from the beam splitter than it is at ZPD; and
when a and χ are negative (that is, when a = χ/2 < 0), the moving mirror is closer to the beam
splitter than it is at ZPD. Because
\[ \cos(-2\pi\sigma\chi) = \cos(2\pi\sigma\chi) , \]
Eq. (5.10a) gives
\[ P_{bal}(-\chi) = P_{bal}(\chi) , \tag{5.12} \]
which means that P_bal is an even function of the optical-path difference. Consequently the
observer sees the same source brightness when the moving mirror moves off ZPD and away from
the beam splitter by a distance a = χ/2 as he does when the moving mirror moves off ZPD and
closer to the beam splitter by a distance a = χ/2.
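The properties just described (the ZPD values of Eq. (5.11b), the evenness in χ, and the oscillation about P_inp/2) can be checked with a small numerical sketch of Eq. (5.10a) for η = M = 1. The flat-topped spectrum, its band limits, and the function names are hypothetical stand-ins; they are not the spectrum of Fig. 5.7.

```python
import math

def p_bal(chi, spectrum, sigma_min, sigma_max, W=1.0, n=4000):
    """Eq. (5.10a) with eta = M = 1, midpoint rule over [sigma_min, sigma_max]."""
    h = (sigma_max - sigma_min) / n
    total = 0.0
    for k in range(n):
        s = sigma_min + (k + 0.5) * h
        total += spectrum(s) * (1.0 + W * math.cos(2.0 * math.pi * s * chi))
    return 0.5 * total * h

def flat_band(sigma):
    """Hypothetical power spectrum S(sigma): constant inside the band, arbitrary units."""
    return 1.0

sigma_min, sigma_max = 500.0, 2000.0        # cm^-1, illustrative only
P_inp = (sigma_max - sigma_min) * 1.0       # Eq. (5.11c) for the flat band

for chi in (0.0, 0.002, -0.002, 0.05):      # optical-path difference in cm
    print(chi,
          p_bal(chi, flat_band, sigma_min, sigma_max, W=1.0),
          p_bal(chi, flat_band, sigma_min, sigma_max, W=-1.0))
```

At χ = 0 the two beam-splitter signs give P_inp and zero; at ±0.002 cm the values coincide in pairs; and by χ = 0.05 cm both signals have settled close to P_inp/2.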
FIGURES 5.8(a) and 5.8(b). [Plots of the interference signal P_bal(χ) against χ (in cm), with the levels P_inp and P_inp/2 marked.]
Following the notation introduced in Eq. (4.141a) of Chapter 4, we say the ideal interferogram
of the balanced power spectrum is
\[ I_{bal}^{(ideal)}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma) \cos(2\pi\sigma\chi)\,d\sigma . \tag{5.13a} \]
With η = M = 1, Eq. (5.10a) then becomes
\[ P_{bal}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma)\,d\sigma + W \cdot I_{bal}^{(ideal)}(\chi) \tag{5.13b} \]
or, using Eq. (5.11c),
\[ P_{bal}(\chi) = \frac{1}{2} P_{inp} + W \cdot I_{bal}^{(ideal)}(\chi) . \tag{5.13c} \]
Figure 5.8(c) shows the $I_{bal}^{(ideal)}(\chi)$ interferogram that corresponds to both the W = 1 and the
W = −1 interference signals.
FIGURE 5.8(c). [Plot of the interferogram $I_{bal}^{(ideal)}(\chi)$ against χ (in cm), with the levels +P_inp/2 and −P_inp/2 marked.]
the interferogram $I_{bal}^{(ideal)}$ takes on negative as well as positive values. A negative interferogram
value does not mean the total optical power reaching the observer has gone negative (this cannot
ever happen, of course) but just that the interference signal has dropped below P_inp/2. One easy
way to keep track of the distinction between the interferogram signal and the interference signal
is to remember that the interferogram signal has negative values whereas the interference signal
is never negative. Because, according to Eq. (5.12),
\[ P_{bal}(-\chi) = P_{bal}(\chi) , \]
Eq. (5.13c) gives
\[ W \cdot I_{bal}^{(ideal)}(-\chi) = P_{bal}(-\chi) - \frac{1}{2} P_{inp} = P_{bal}(\chi) - \frac{1}{2} P_{inp} = W \cdot I_{bal}^{(ideal)}(\chi) \]
or
\[ I_{bal}^{(ideal)}(-\chi) = I_{bal}^{(ideal)}(\chi) . \tag{5.14} \]
Hence both P_bal and $I_{bal}^{(ideal)}$ are even functions of χ. Since the interference signal P_bal approaches
P_inp/2 as χ gets large in Figs. 5.8(a) and 5.8(b), the balanced interferogram
\[ I_{bal}^{(ideal)}(\chi) = \frac{P_{bal}(\chi) - (1/2) P_{inp}}{W} \tag{5.15} \]
approaches zero for large values of χ in Fig. 5.8(c). This behavior is typical of all
interferograms; the only way to avoid it is to make the power spectrum a delta function,
\[ S(\sigma) = S_0 \cdot \delta(\sigma - \sigma_0) , \tag{5.16a} \]
of the type discussed in Sec. 2.14 of Chapter 2 [see also Fig. 5.9(a)]. This delta function
represents monochromatic light of wavenumber σ0 coming from the distant source. Equation
(5.11c) now requires P_inp = S0, so according to Eqs. (5.13a) and (5.13c) the balanced interference
signal P_bal becomes
\[ P_{bal}(\chi) = \frac{S_0}{2} + \frac{W S_0}{2} \cdot \cos(2\pi\sigma_0\chi) = \frac{S_0}{2} \left[ 1 + W \cos(2\pi\sigma_0\chi) \right] , \tag{5.16b} \]
which is plotted in Figs. 5.9(b) and 5.9(c) for W = 1 and W = −1. Equation (5.15) gives the
associated interferogram
\[ I_{bal}^{(ideal)}(\chi) = \frac{S_0}{2} \cdot \cos(2\pi\sigma_0\chi) , \tag{5.16c} \]
which we plot in Fig. 5.9(d). Formula (5.16b) is clearly identical to Eq. (1.17d) in Chapter 1 after
we set up the correspondences
\[ I_{fi}^{(cb)} \Leftrightarrow P_{bal} , \quad I_{fi}^{(0)} \Leftrightarrow S_0 , \quad \text{and} \quad \sigma_{fi} \Leftrightarrow \sigma_0 . \]
This ideal delta-function spectrum can be approximated by passing a laser through the
interferometer, producing interferograms resembling the one shown in Fig. 5.9(d). Even lasers,
however, have a small but finite spread in their power spectra, causing their interferograms to
approach zero at extremely large values of χ .
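The effect of a small but finite linewidth can be sketched in closed form. The code below assumes, purely for illustration, that the total power S0 is spread uniformly over a narrow band of width Δσ centered on σ0; for that assumed shape Eq. (5.13a) integrates to (S0/2) · cos(2πσ0χ) · sin(πΔσχ)/(πΔσχ), which reduces to Eq. (5.16c) when the width goes to zero. The function name and numbers are hypothetical, and real laser line shapes are generally not flat-topped.

```python
import math

def interferogram_narrow_band(chi, S0, sigma0, width):
    """Eq. (5.13a) for power S0 spread uniformly over a band of the given width
    centered on sigma0 (closed form of the integral; width = 0 gives Eq. (5.16c))."""
    x = math.pi * width * chi
    envelope = 1.0 if x == 0.0 else math.sin(x) / x
    return 0.5 * S0 * math.cos(2.0 * math.pi * sigma0 * chi) * envelope

S0, sigma0 = 1.0, 15800.0    # illustrative: arbitrary power, wavenumber in cm^-1
width = 0.01                 # hypothetical linewidth in cm^-1

for chi in (0.0, 1.0, 10.0, 50.0, 100.0):   # optical-path difference in cm
    print(chi, interferogram_narrow_band(chi, S0, sigma0, width))
```

For this assumed band shape the fringe envelope first reaches zero at χ = 1/Δσ, here 100 cm, so the narrower the line the larger the path difference over which fringes like those of Fig. 5.9(d) persist.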
FIGURE 5.9(a). [Plot of the delta-function power spectrum S(σ) against σ, with the spike at σ = σ0.]
FIGURES 5.9(b) and 5.9(c). [Plots of P_bal(χ) against χ, with the levels S0 and S0/2, the fringe period 1/σ0, and the position χ = 0 marked.]
FIGURE 5.9(d). [Plot of the interferogram against χ, with the levels S0/2 and −S0/2, the fringe period 1/σ0, and the position χ = 0 marked.]
Having separated the balanced interference signal P_bal(χ) for the ideal interferometer into a
constant term P_inp/2 and an ideal interferogram $I_{bal}^{(ideal)}(\chi)$, we note that a similar procedure can
be followed with respect to the nonideal interference signal where 0 < η < 1 and M < 1.
Equation (5.10a) can be written as
\[ P_{bal}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma)\,d\sigma + \frac{W}{2} \int_0^\infty S(\sigma)\,M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi)\,d\sigma = P_0/2 + W \cdot I_{bal}(\chi) \tag{5.17a} \]
or
\[ I_{bal}(\chi) = \frac{P_{bal}(\chi) - (1/2) P_0}{W} , \tag{5.17b} \]
where, applying Eq. (5.10b),
\[ P_0 = \int_0^\infty S(\sigma)\,d\sigma = \int_0^\infty A\,\Delta\Omega\,L(\sigma)\,\eta(\sigma)\,d\sigma \tag{5.17c} \]
and
\[ I_{bal}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma)\,M(R\sigma\theta_{ma}) \cos(2\pi\sigma\chi)\,d\sigma = \frac{1}{2} \int_0^\infty A\,\Delta\Omega\,L(\sigma)\,\eta(\sigma)\,M(R\sigma\theta_{ma}) \cos(2\pi\sigma\chi)\,d\sigma . \tag{5.17d} \]
Although P_0 in Eq. (5.17c) looks superficially like P_inp in Eq. (5.11c), since it too can be written
as
\[ \int_0^\infty S(\sigma)\,d\sigma , \]
the constant power level P_0 is not the same as P_inp because now η(σ) < 1 in Eq. (5.10b), making
P_0 less than the radiant power P_inp of the pencil ray entering the interferometer. Similarly I_bal(χ)
in formula (5.17d) becomes, for χ = 0,
\[ I_{bal}(0) = \frac{1}{2} \int_0^\infty A\,\Delta\Omega\,L(\sigma)\,\eta(\sigma)\,M(R\sigma\theta_{ma})\,d\sigma = \frac{1}{2} \int_0^\infty S(\sigma)\,M(R\sigma\theta_{ma})\,d\sigma , \]
which means (since M < 1 in this nonideal case) that we cannot expect to have P_bal(0) be
either P_0 or zero for W = 1 or W = −1 respectively. Nevertheless, in a well-designed
interferometer, both η and M are reasonably close to one for the wavenumbers of interest, and the
balanced signal of a nonideal interferometer usually behaves much the same as the balanced
signal of the ideal interferometer. In fact the symmetry properties with respect to χ of the ideal
balanced signal (that the balanced interference signal and balanced interferogram are even
functions of the optical path difference) apply as well to the nonideal case where 0 < η < 1 and
M < 1 because neither η nor M depends on χ. Hence the same reasoning already used to derive
Eqs. (5.12) and (5.14) can also be applied to this nonideal case to get
\[ P_{bal}(-\chi) = P_{bal}(\chi) \quad \text{for } 0 < \eta < 1 \text{ and } M < 1 \tag{5.18a} \]
and
\[ I_{bal}(-\chi) = I_{bal}(\chi) \quad \text{for } 0 < \eta < 1 \text{ and } M < 1 . \tag{5.18b} \]
The Unbalanced Signal of a Michelson Interferometer · 5.5
\[ P_{unb}(\chi) = \int_0^\infty S(\sigma)\,d\sigma - \frac{1}{2} \int_0^\infty S(\sigma) \left[ 1 + W \cos(2\pi\sigma\chi) \right] d\sigma \]
or
\[ P_{unb}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma) \left[ 1 - W \cos(2\pi\sigma\chi) \right] d\sigma . \tag{5.19b} \]
Comparing this result to Eq. (5.10a) with η = M = 1, we see that, at this level of idealization,
going from the balanced to the unbalanced interference signal is the same as changing the sign of
W. Consulting Figs. 5.8(a) and 5.8(b), we see that when 5.8(a) is the balanced interference signal,
then 5.8(b) is the unbalanced interference signal; and when 5.8(b) is the balanced interference
signal, then 5.8(a) is the unbalanced interference signal.
FIGURE 5.10. [Labels: Moving Mirror; Ideal Beam Splitter; Fixed Mirror.] The dashed lines show the rays going back out the front aperture as an unbalanced interference
signal. This unbalanced interference signal cannot be seen by the observer.
Following the pattern of Eq. (5.13c), we
can define an ideal unbalanced interferogram $I_{unb}^{(ideal)}(\chi)$ for the unbalanced optical signal by
saying that
\[ P_{unb}(\chi) = \frac{1}{2} P_{inp} + W \cdot I_{unb}^{(ideal)}(\chi) \tag{5.19c} \]
so that
\[ I_{unb}^{(ideal)}(\chi) = \frac{P_{unb}(\chi) - (1/2) P_{inp}}{W} = -\frac{1}{2} \int_0^\infty S(\sigma) \cos(2\pi\sigma\chi)\,d\sigma . \tag{5.19d} \]
The sign convention chosen for the balanced and unbalanced interferograms in Eqs. (5.13a) and
(5.19d) specifies a positive ZPD peak for the balanced interferogram,
\[ I_{bal}^{(ideal)}(0) = \frac{1}{2} \int_0^\infty S(\sigma)\,d\sigma > 0 , \tag{5.20a} \]
and a negative ZPD peak for the unbalanced interferogram,
\[ I_{unb}^{(ideal)}(0) = -\frac{1}{2} \int_0^\infty S(\sigma)\,d\sigma < 0 . \tag{5.20b} \]
The qualitative behavior of the nonideal unbalanced interference signal and nonideal unbalanced
interferogram is, in a well-designed interferometer, very similar to the behavior of the ideal
unbalanced interference signal and ideal unbalanced interferogram. Note that although the shapes
of the balanced and unbalanced interference signals depend on the sign of W, the shapes of the
balanced and unbalanced interferograms do not.82
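A short sketch makes the bookkeeping between the balanced and unbalanced signals concrete for the ideal case η = M = 1. To keep it simple, the continuous spectrum is replaced by a few discrete spectral lines, so the integrals in Eqs. (5.10a) and (5.19b) become sums; that substitution, the function names, and the line list are assumptions made for illustration.

```python
import math

def p_bal(chi, lines, W):
    """Ideal balanced signal, Eq. (5.10a) with eta = M = 1, for a discrete line spectrum."""
    return 0.5 * sum(S * (1.0 + W * math.cos(2.0 * math.pi * s * chi)) for s, S in lines)

def p_unb(chi, lines, W):
    """Ideal unbalanced signal, Eq. (5.19b), for the same line spectrum."""
    return 0.5 * sum(S * (1.0 - W * math.cos(2.0 * math.pi * s * chi)) for s, S in lines)

# Hypothetical spectrum: (wavenumber in cm^-1, power in arbitrary units).
lines = [(800.0, 1.0), (1200.0, 2.5), (1750.0, 0.7)]
P_inp = sum(S for _, S in lines)

for chi in (0.0, 0.0003, 0.01):
    for W in (1.0, -1.0):
        total = p_bal(chi, lines, W) + p_unb(chi, lines, W)
        print(chi, W, total)   # stays equal to P_inp up to rounding
```

Whatever the mirror position, the power seen by the observer plus the power sent back out the front aperture adds up to the input power, and swapping the sign of W simply exchanges the two signals.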
82
Section 4.17 of Chapter 4 derives the formulas for the nonideal unbalanced interference signal of the
interferometer’s background radiance because it contributes to the total radiant power reaching the detector.
The same procedures can be used to derive formulas for the nonideal unbalanced interference signal of the
interferometer’s input radiance. For the interferometer designs analyzed here, this signal is of much less interest
because it goes back out the interferometer’s entrance aperture and has no effect on the total radiant power reaching
the interferometer’s detector. There do exist interferometers, like the one shown in Fig. 1.19c of Chapter 1, for which
both types of formula are relevant.
Here, the superscript (0) has been added to show that the balanced interference signal P_bal and the
spectrum S refer to the point source whose rays are parallel to (that is, at a zero angle to) the
optical axis. To get the corresponding formula for the off-axis point source, we use Eq. (4.137i)
from Chapter 4 to write
\[ P_{bal}^{(\beta_s)}(\chi) = \frac{A}{2} \int_{-\infty}^{\infty} d\sigma \iint_{\substack{\text{field of view for} \\ \beta_s \text{ point source}}} d^2\varepsilon \; \eta(\sigma)\,\mathcal{L}^{(\beta_s)}(\sigma) \left[ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right] , \tag{5.21b} \]
where
\[ M(R\sigma\theta_{ma}) = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} . \]
The superscript (βs) is added to show that P_bal and $\mathcal{L}$ refer only to the off-axis source, the one
whose rays are at an angle βs to the optical axis. The effective cross-sectional area of the
interferometer beam is still A, the area of the eye’s entrance pupil; and R in the formula for M is
still the radius of the eye’s entrance pupil so that R = √(A/π). The relevant field of view, however,
The Off-Axis Signal of a Michelson Interferometer · 5.6
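FIGURE 5.11. [Labels: Moving Mirror; off-axis angle βs.]
is now ΔΩ^(βs), the extremely small solid angle subtended by the second distant source at the
position of the interferometer. Recognizing that α_ε ≅ βs for all the rays coming from this distant,
off-axis source, we perform the integral over d²ε in (5.21b) to get
\[ P_{bal}^{(\beta_s)}(\chi) = \frac{A\,\Delta\Omega^{(\beta_s)}}{2} \int_{-\infty}^{\infty} \eta(\sigma)\,\mathcal{L}^{(\beta_s)}(\sigma) \left[ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\beta_s) \right] d\sigma . \tag{5.21c} \]
Equations (4.136f) and (4.139g) in Chapter 4 require $\mathcal{L}$ and η to be even functions of σ; Eq.
(5.10f) shows that M is another even function of σ; and the cosine is also even. Therefore the
product
\[ \eta(\sigma)\,\mathcal{L}^{(\beta_s)}(\sigma) \left[ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\beta_s) \right] \]
must be an even function of σ, which means that, according to Eq. (2.19) in Chapter 2,
\[ \int_{-\infty}^{\infty} \eta(\sigma)\,\mathcal{L}^{(\beta_s)}(\sigma) \left[ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\beta_s) \right] d\sigma = 2 \int_0^\infty \eta(\sigma)\,\mathcal{L}^{(\beta_s)}(\sigma) \left[ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\beta_s) \right] d\sigma . \]
From Eq. (4.136g) in Chapter 4, we know that the off-axis spectral radiance is
\[ L^{(\beta_s)}(\sigma) = 2\,\mathcal{L}^{(\beta_s)}(\sigma) , \]
where the superscript (βs) is added to show that we are only interested in the pencil ray entering
the interferometer at an angle βs to the optical axis. This lets us write
\[ \int_{-\infty}^{\infty} \eta(\sigma)\,\mathcal{L}^{(\beta_s)}(\sigma) \left[ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\beta_s) \right] d\sigma = \int_0^\infty \eta(\sigma)\,L^{(\beta_s)}(\sigma) \left[ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\beta_s) \right] d\sigma . \]
Substituting this result into Eq. (5.21c) gives
\[ P_{bal}^{(\beta_s)}(\chi) = \frac{1}{2} \int_0^\infty S^{(\beta_s)}(\sigma) \left[ 1 + W \cdot M(R\sigma\theta_{ma}) \cdot \cos(2\pi\sigma\chi\cos\beta_s) \right] d\sigma , \tag{5.21d} \]
where
\[ S^{(\beta_s)}(\sigma) = A\,\Delta\Omega^{(\beta_s)}\,L^{(\beta_s)}(\sigma)\,\eta(\sigma) . \tag{5.21e} \]
For the ideal interferometer, where η = M = 1, this becomes
\[ P_{bal}^{(\beta_s)}(\chi) = \frac{1}{2} \int_0^\infty S^{(\beta_s)}(\sigma) \left[ 1 + W \cos(2\pi\sigma\chi\cos\beta_s) \right] d\sigma \tag{5.21f} \]
where
\[ S^{(\beta_s)}(\sigma) = A\,\Delta\Omega^{(\beta_s)}\,L^{(\beta_s)}(\sigma) . \tag{5.21g} \]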
Comparing Eq. (5.21f) for the ideal off-axis case to Eq. (5.21a) for the ideal on-axis case shows
that the only effect of the off-axis passage through the interferometer is to multiply σχ by cos βs
and to replace S^(0) by S^(βs).
Equations (5.21f) and (5.21g) for the off-axis source can be compared to Eq. (5.21a) for the
on-axis source under the assumption that both sources are the same size, have the same spectral
radiance L(σ), and are at the same distance from the interferometer. Both sources then pass the
same power spectrum S(σ) through the interferometer so that
\[ P_{bal}^{(0)}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma) \left[ 1 + W \cos(2\pi\sigma\chi) \right] d\sigma \tag{5.22a} \]
and
\[ P_{bal}^{(\beta_s)}(\chi) = \frac{1}{2} \int_0^\infty S(\sigma) \left[ 1 + W \cos(2\pi\sigma\chi\cos\beta_s) \right] d\sigma . \tag{5.22b} \]
Hence
\[ P_{bal}^{(\beta_s)}(\chi) = P_{bal}^{(0)}(\chi\cos\beta_s) . \tag{5.23a} \]
The displacement a of the moving mirror from its ZPD position is given by [see Eq. (5.11a)]
\[ a = \chi/2 , \tag{5.23b} \]
so Eq. (5.23a) can also be written as
\[ P_{bal}^{(\beta_s)}(2a) = P_{bal}^{(0)}(2a\cos\beta_s) . \tag{5.23c} \]
This shows that the balanced interference signal of a distant, on-axis source has the same power
when the moving mirror is displaced from ZPD by a distance (a cos βs) that an identical distant,
off-axis source has when the moving mirror is displaced from ZPD by a distance a. Another way
of saying this is to note that the on-axis source looks as bright when the moving mirror is
displaced from ZPD by a distance a as the off-axis source does when the moving mirror is
displaced from ZPD by a distance (a/cos βs). Since (a/cos βs) > a, as the moving mirror is
shifted steadily away from ZPD the brightness of the on-axis source predicts the brightness of the
off-axis source: if the on-axis source brightens or dims, we know that soon the same thing will
happen to the off-axis source.
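The scaling in Eq. (5.23c) can be tried out on the simplest possible spectrum. The sketch below assumes a monochromatic source, so that Eq. (5.22b) with the delta-function spectrum of Eq. (5.16a) collapses to a single cosine; the function name, the off-axis angle, and the other numbers are hypothetical.

```python
import math

def p_bal_mono(chi, S0, sigma0, W=1.0, beta_s=0.0):
    """Eq. (5.22b) for a monochromatic source of wavenumber sigma0 at off-axis angle beta_s."""
    return 0.5 * S0 * (1.0 + W * math.cos(2.0 * math.pi * sigma0 * chi * math.cos(beta_s)))

S0, sigma0 = 1.0, 1000.0      # illustrative values; sigma0 in cm^-1
beta_s = math.radians(2.0)    # hypothetical off-axis angle
a = 0.0123                    # mirror displacement from ZPD in cm

# Eq. (5.23c): off-axis signal at displacement a equals on-axis signal at a*cos(beta_s).
off_axis = p_bal_mono(2 * a, S0, sigma0, beta_s=beta_s)
on_axis_scaled = p_bal_mono(2 * a * math.cos(beta_s), S0, sigma0)
print(off_axis, on_axis_scaled)

# Equivalently, the off-axis source repeats the on-axis value at the larger displacement a/cos(beta_s).
a_late = a / math.cos(beta_s)
on_axis_now = p_bal_mono(2 * a, S0, sigma0)
off_axis_later = p_bal_mono(2 * a_late, S0, sigma0, beta_s=beta_s)
print(on_axis_now, off_axis_later)
```

Because cos βs is less than one, the off-axis source always reaches a given brightness at a slightly larger mirror displacement than the on-axis source does.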
We next consider a ring of distant sources surrounding the on-axis source, with all the sources
passing the same power spectrum S(σ) through the interferometer. As shown in Fig. 5.12(a), an
observer looking at the ring sees these sources as a circle of stars, a circle with angular radius βs
centered on the distant on-axis source. Each source in the ring sends its own group of parallel
rays through the interferometer as shown in Fig. 5.12(b).
Every parallel group of rays passes through the interferometer at the same angle βs with respect to
the optical axis, so everything previously said about the single off-axis source also applies to the
ring of off-axis sources. As the moving mirror shifts away from ZPD, we know, using the same
reasoning as before, that if the central source brightens or dims then soon the same thing will
happen simultaneously to every source on the off-axis ring. We can imagine filling the entire
“sky” with identical distant sources, as shown in Fig. 5.13.
FIGURE 5.12(a). [Labels: ring of sources at angular radius βs around the on-axis source.]
FIGURE 5.12(b). [Labels: Moving Mirror; Fixed Mirror; ray angles βs.]
FIGURE 5.13. [Labels: Moving Mirror; Fixed Mirror.]
Now when the sky is observed directly, not looking through the interferometer, it exhibits a
uniform, featureless glow; but when it is observed indirectly through the interferometer with the
eye focused at infinity (which may require a little practice) the sky becomes a concentric series
of rings at different levels of brightness. These are sometimes called Haidinger rings. The rings
have different levels of brightness because they are at different angular distances from the on-axis
source. The only way to escape this effect is to put the moving mirror at its ZPD position, with
a = χ = 0. According to Eq. (5.22b), the rays at every angle βs with respect to the optical axis
then all have the same P_bal value; and the observer looking through the interferometer either sees
the same uniform featureless glow seen when looking directly at the source-filled sky (if W = 1)
or nothing at all (if W = −1). As the moving mirror shifts steadily away from ZPD, the region at
the center of the scene changes its brightness first and then, obeying Eq. (5.23c), this change in
brightness forms a ring that expands and travels out to the edge of the scene. This is, of course,
just a consequence of the on-axis brightness predicting the off-axis brightness, with regions at
larger βs copying the central brightness after a longer delay as the interference rings form and
expand.
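To get a feel for where the rings sit, the sketch below specializes Eq. (5.22b) to a monochromatic sky of wavenumber σ0 and a beam splitter with W = 1, an assumption made only to keep the algebra short. The signal is then brightest wherever 2aσ0 cos βs is an integer m, which fixes the ring angles for a given mirror displacement a. The function name and the numerical values are hypothetical.

```python
import math

def bright_ring_angles(a, sigma0, beta_max):
    """Angles beta (radians, innermost first) at which 2*a*sigma0*cos(beta) is an integer,
    i.e. where cos(2*pi*sigma0*2a*cos(beta)) = 1, out to beta_max."""
    order_center = 2.0 * a * sigma0          # interference order on the optical axis
    angles = []
    m = math.floor(order_center)
    while m > 0:
        beta = math.acos(m / order_center)
        if beta > beta_max:
            break
        angles.append(beta)
        m -= 1
    return angles

sigma0 = 15800.0                 # cm^-1, illustrative visible-light wavenumber
beta_max = math.radians(3.0)     # hypothetical half-angle of the field of view

for a in (0.05, 0.10):           # mirror displacement from ZPD in cm
    rings = bright_ring_angles(a, sigma0, beta_max)
    print(a, len(rings), [round(math.degrees(b), 3) for b in rings[:4]])
```

Doubling the mirror displacement roughly doubles the number of bright rings inside the same field of view, which matches the picture of rings forming at the center and spreading outward as the mirror moves away from ZPD.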
To record these rings in the laboratory, we need only replace the observer’s eye with a camera
focused at infinity. In Fig. 5.14 this camera is shown schematically as a lens and a light-sensitive
surface in the lens’s focal plane. As has already been discussed in Sec. 4.9 of Chapter 4, each
group of parallel rays can be regarded as a single plane wave, and each plane wave reaching the
lens focuses to its own separate and distinct point of light on the light-sensitive surface. In fact
what the light-sensitive surface records is an image of the scene “at infinity,” with each distant
source showing up as a separate point of brightness on the lens’s focal plane. The position of
each bright point on the focal plane corresponds to the angular separations seen by an observer;
for example, the ring of distant sources depicted in Fig. 5.12(a) shows up as a ring of bright
points equidistant from the central bright point representing the on-axis source. In practice the
creation of bright distant sources all having the same spectrum is an awkward and tedious
business; what is done instead is to create a nearby extended source with a uniformly bright
surface having the same spectral radiance everywhere. From the discussion at the end of Sec. 4.2
as well as the discussion following Eq. (4.47b) in Chapter 4, we know that every radiation field
can be thought of as a collection of plane waves propagating in different directions. When the
extended source is placed close to the interferometer, its plane waves fill the interferometer’s
field of view; that is, every point on the light-sensitive surface of the lens’s focal plane represents
a different plane wave generated by the extended source (see Fig. 5.15). To get a sequence of
brightness rings such as the ones shown in Fig. 5.16, we make sure the camera is focused at
infinity and then just take a series of snapshots while steadily shifting the moving mirror away
from ZPD.
The discussion so far has assumed that all the plane waves entering the interferometer,
whether coming from distant sources or an extended nearby source, pass the same power
spectrum S(σ) through the interferometer. There is, of course, no reason why this has to be the
case. Returning to Eq. (5.21f), we rewrite it using slightly different notation. Instead of talking
about parallel rays passing through the interferometer at an angle βs to the optical axis, we give
each group of parallel rays an index i and refer to the ith group of parallel rays as the ith plane
wave passing through the interferometer. The balanced signal power associated with this ith
plane wave is then
\[ P_{bal}^{(i)}(\chi) = \frac{1}{2} \int_0^\infty S^{(i)}(\sigma) \left[ 1 + W \cos(2\pi\sigma\chi\cos\alpha_i) \right] d\sigma , \tag{5.24a} \]
where α_i refers to the ith plane wave’s βs angle with respect to the interferometer’s optical axis
and S^(i)(σ) is the power spectrum of the ith plane wave as it passes through the interferometer.
According to Eq. (5.21g), if the plane wave is generated by a distant point source then we should
say that
\[ S^{(i)}(\sigma) = A\,\Delta\Omega^{(i)}\,L^{(i)}(\sigma) . \tag{5.24b} \]
FIGURE 5.14. [Labels: Moving Mirror; Ideal Beam Splitter; Fixed Mirror; Lens; Light-Sensitive Surface in the Focal Plane of the Lens. The parallel rays coming from a distant, on-axis point source are shown with solid arrows.]
FIGURE 5.15. [Labels: Moving Mirror; Ideal Beam Splitter; Fixed Mirror; Extended Source; Lens; Light-Sensitive Surface in the Focal Plane of the Lens.]
FIGURE 5.16. [Eight panels, numbered 1 through 8.]
Here $L^{(i)}(\sigma)$ is the spectral radiance of the pencil ray entering the interferometer from the distant
point source and A is the cross-sectional area of the beam gathered in by the lens, that is, the
area of the lens itself. We can think of the ith plane wave as just one of a group of i = 1, 2, …, N
plane waves all emanating from distant sources, which makes $\Delta\Omega^{(i)}$ the extremely small solid
angle subtended by the ith distant source at the position of the interferometer.
After these plane waves pass through the interferometer, the lens in Figs. 5.14 and 5.15 forms
an image (that is, N points of brightness) from these N distant sources. If, as shown in Fig. 5.17,
we put an array of small detectors in the focal plane then, as the moving mirror shifts away from
ZPD, each detector records the $P_{\mathrm{bal}}^{(i)}$ signal given by Eq. (5.24a) that is generated by the ith plane
wave coming from the ith distant source. The central region of the focal plane no longer
automatically predicts the brightness of the off-center regions, and there need not exist any
well-formed, outwardly moving rings because now the different plane waves have different $S^{(i)}$
spectra.83 This setup is sometimes referred to as an imaging Fourier-transform spectrometer, and
when it is put on board a spacecraft it can be used to investigate distant astronomical scenes, such
as a planet’s surface viewed from orbit, where we expect the power spectra to vary with position
in the scene.
[FIGURE 5.17. Plane waves coming from a distant scene pass through the interferometer (moving
mirror, fixed mirror, ideal beam splitter) and are focused by the lens onto a detector array.]
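The per-detector behavior described above can be sketched numerically. The following is only an
illustration of Eq. (5.24a) for spectra made of discrete lines, for which the integral reduces to a
sum; the function name `pixel_signal`, the two-pixel scene, and all numerical values are
hypothetical and are not taken from the text.

```python
import math

def pixel_signal(chi, lines, alpha, W=1.0):
    """Eq. (5.24a) for a spectrum made of discrete lines.

    chi   -- optical path difference (cm)
    lines -- list of (sigma, power) pairs: wavenumber in cm^-1 and line power
    alpha -- angle of this pixel's plane wave to the optical axis (rad)
    W     -- +1 or -1
    """
    return 0.5 * sum(
        p * (1.0 + W * math.cos(2.0 * math.pi * s * chi * math.cos(alpha)))
        for s, p in lines
    )

# Hypothetical two-pixel scene: each pixel sees a different line spectrum.
scene = {
    "center":   {"alpha": 0.00, "lines": [(1000.0, 1.0)]},
    "off_axis": {"alpha": 0.02, "lines": [(1500.0, 0.4), (2000.0, 0.6)]},
}

for k in range(5):
    chi = k * 1.0e-4
    row = [pixel_signal(chi, px["lines"], px["alpha"]) for px in scene.values()]
    print(f"chi = {chi:.1e} cm  " + "  ".join(f"{v:.4f}" for v in row))
```

Because the two pixels carry different spectra, their signals evolve independently as chi grows,
which is why the central pixel no longer predicts the off-center ones.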
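\[ P_{\mathrm{bal}}^{(\mathrm{det})}(\chi) = \sum_{i=1}^{N_{\mathrm{det}}} P_{\mathrm{bal}}^{(i)}(\chi) = \frac{A}{2}\int_0^{\infty} L(\sigma)\left\{\sum_{i=1}^{N_{\mathrm{det}}}\Delta\Omega^{(i)}\,[1 + W\cos(2\pi\sigma\chi\cos\alpha_i)]\right\} d\sigma , \tag{5.25a} \]
where in the last step we have assumed that all the plane waves entering the interferometer have
the same spectrum S(σ) and thus the same spectral radiance L(σ). We convert the sum over i into
an integral over solid angle by writing
\[ P_{\mathrm{bal}}^{(\mathrm{det})}(\chi) = \frac{A}{2}\int_0^{\infty} L(\sigma)\,d\sigma \iint_{\substack{\text{field of view}\\ \text{of detector}}} [1 + W\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d^2\varepsilon . \tag{5.25b} \]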
83
Of course if all these S(i) spectra have common features producing similar interference signals, there will still be a
tendency for ringlike features to form and expand out from the center as the moving mirror shifts away from ZPD.
[FIGURE 5.18. Plane waves coming from outside the interferometer pass through the interferometer
(moving mirror, fixed mirror, ideal beam splitter) and are focused by the lens onto a circular
detector in the focal plane of the lens.]
The Standard Michelson Interferometer with Central Detector · 5.7
On the right-hand side of this equation, A is the area of the lens focusing the interferometer signal
onto the detector, $d^2\varepsilon$ is an infinitesimal solid angle replacing $\Delta\Omega^{(i)}$, and angle $\alpha_\varepsilon$ replaces
angle $\alpha_i$ as the angle of propagation through the interferometer. We note that this angle $\alpha_\varepsilon$ is the
same as the $\alpha_\varepsilon$ defined in Eq. (4.135f) of Chapter 4 and used in Eq. (4.137i) of Chapter 4. This is
not very surprising, considering that the line of reasoning used to derive Eq. (5.25b) begins with
Eq. (5.21b), which is a special case of Eq. (4.137i).
We can, in fact, easily show that Eq. (5.25b) is the same as Eq. (4.137i) in Chapter 4 with
η = M = 1. Formula (5.10c) lets us write the integral on the right-hand side of (4.137i) as
\[
\begin{aligned}
&\frac{A}{2}\int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\;\eta(\sigma)\,\mathcal{L}(\sigma)\left[1 + W\cdot\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\right] \\
&\qquad = \frac{A}{2}\int_{-\infty}^{\infty} d\sigma\;\eta(\sigma)\,\mathcal{L}(\sigma)\iint_{\text{field of view}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] .
\end{aligned}
\]
Here A is the cross-sectional area of the interferometer beam; $R = \sqrt{A/\pi}$ is the radius of the
interferometer beam; the “field of view” limiting the integral over $d^2\varepsilon$ is the interferometer’s
field of view; and of course $\alpha_\varepsilon$ is the angle of propagation through the interferometer. For the
lens and detector in Fig. 5.18, the area of the lens focusing the beam onto the detector defines the
cross-sectional area of the interferometer beam, so variable A has the same meaning as in Eq.
(5.25b). The field of view specified by the size of the detector (that is, the detector’s field of
view) is the same as the field of view of the interferometer, so the integral over $d^2\varepsilon$ is also the
same integral as in Eq. (5.25b). Following the procedure used in the discussion after Eq. (5.21c),
we recognize that “field of view” in the integral over $d^2\varepsilon$ now refers to the detector’s field of
view and note that $\mathcal{L}$, η, M, and the cosine are even functions of σ. This gives us, after applying
Eq. (2.19) in Chapter 2,
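\[
\begin{aligned}
&\frac{A}{2}\int_{-\infty}^{\infty} d\sigma\;\eta(\sigma)\,\mathcal{L}(\sigma)\iint_{\text{field of view}} d^2\varepsilon\left[1 + W\cdot\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\right] \\
&\qquad = \frac{A}{2}\int_{-\infty}^{\infty} d\sigma\;\eta(\sigma)\,\mathcal{L}(\sigma)\iint_{\substack{\text{field of view}\\ \text{of detector}}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \\
&\qquad = \frac{A}{2}\int_{0}^{\infty} d\sigma\;\eta(\sigma)\,L(\sigma)\iint_{\substack{\text{field of view}\\ \text{of detector}}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] ,
\end{aligned}
\tag{5.25c}
\]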
where L(σ) is, according to Eq. (4.136g) of Chapter 4, the spectral radiance of the beam entering
the interferometer. Equation (5.25c) is a new formula for the right-hand side of Eq. (4.137i) in
Chapter 4. Thus it can be substituted back into (4.137i) to get
\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\int_{0}^{\infty} d\sigma\;\eta(\sigma)\,L(\sigma)\iint_{\substack{\text{field of view}\\ \text{of detector}}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] . \]
From Chapter 4 we know that $P_{\mathrm{bal}}$ in this formula is the optical power leaving the interferometer
in the balanced signal, and since the ideal lens in Fig. 5.18 focuses all of the beam onto the
detector, $P_{\mathrm{bal}}$ is the same quantity as $P_{\mathrm{bal}}^{(\mathrm{det})}$ in Eq. (5.25b). Hence this last result can be written as
\[ P_{\mathrm{bal}}^{(\mathrm{det})}(\chi) = \frac{A}{2}\int_{0}^{\infty} d\sigma\;\eta(\sigma)\,L(\sigma)\iint_{\substack{\text{field of view}\\ \text{of detector}}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] . \tag{5.25d} \]
When η = M = 1 in Eq. (5.25d), it becomes the same as Eq. (5.25b). Consequently we have now,
as promised, shown that (5.25b) is the same as Eq. (4.137i) of Chapter 4 applied to an ideal
interferometer. Equation (5.25d) with 0 < η < 1 and M < 1 is then clearly the extension of Eq.
(5.25b) to the nonideal case of an interferometer with an imperfect beam splitter and an
imperfectly aligned moving mirror. Interchanging the integrals in (5.25d) gives
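\[ P_{\mathrm{bal}}^{(\mathrm{det})}(\chi) = \iint_{\substack{\text{field of view}\\ \text{of detector}}} d^2\varepsilon\left\{\frac{A}{2}\int_{0}^{\infty} d\sigma\;\eta(\sigma)\,L(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\right\} . \tag{5.25e} \]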
Now at last we make the idealization that the detector is small enough to assume that all the
plane waves focused on it provide an approximately uniform illumination across its surface,
allowing us to set cos α ε ≅ 1 in Eq. (5.25e) to get, after dropping the (det) superscript,
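\[
\begin{aligned}
P_{\mathrm{bal}}(\chi) &= \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty}\eta(\sigma)\,L(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma \\
&= \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma ,
\end{aligned}
\tag{5.26a}
\]
where
\[ \Delta\Omega = \iint_{\substack{\text{field of view}\\ \text{of detector}}} d^2\varepsilon \tag{5.26b} \]
and
\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(\sigma) . \tag{5.26c} \]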
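How good the approximation $\cos\alpha_\varepsilon \cong 1$ is for a given detector can be checked with a short
numerical sketch. It assumes, beyond what the text states, that the detector’s field of view is a
circular cone of half-angle `alpha_max` about the optical axis, that $\alpha_\varepsilon$ is the polar angle
measured from that axis, and that $d^2\varepsilon = \sin\alpha\,d\alpha\,d\phi$. Under those assumptions the average of
$\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)$ over the field of view has a closed form, which the code compares against a
midpoint-rule estimate. The function names and numbers are illustrative only.

```python
import math

def fov_average_numeric(sigma, chi, alpha_max, n=2000):
    """Midpoint-rule average of cos(2*pi*sigma*chi*cos(alpha)) over a circular field of view.

    With u = cos(alpha), the solid-angle element is proportional to du, so the
    average over the cone is a plain average over u in [cos(alpha_max), 1].
    """
    k = 2.0 * math.pi * sigma * chi
    c = math.cos(alpha_max)
    du = (1.0 - c) / n
    return sum(math.cos(k * (c + (j + 0.5) * du)) for j in range(n)) / n

def fov_average_closed(sigma, chi, alpha_max):
    """Closed form of the same average (requires alpha_max > 0)."""
    k = 2.0 * math.pi * sigma * chi
    c = math.cos(alpha_max)
    if k == 0.0:
        return 1.0
    return (math.sin(k) - math.sin(k * c)) / (k * (1.0 - c))

sigma, chi = 2000.0, 0.1          # cm^-1 and cm (hypothetical values)
on_axis = math.cos(2.0 * math.pi * sigma * chi)
for alpha_max in (1.0e-3, 1.0e-2, 5.0e-2):
    avg = fov_average_closed(sigma, chi, alpha_max)
    print(f"alpha_max = {alpha_max:.0e} rad  average = {avg:+.6f}  on-axis = {on_axis:+.6f}")
```

The smaller the field of view, the closer the average comes to the on-axis cosine used in
Eq. (5.26a); for larger fields of view or larger optical path differences the two can differ
substantially.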
The (det) superscript has been dropped to emphasize the close resemblance of Eq. (5.26a) to Eq.
(5.10a) for the balanced interference signal of the distant, on-axis source. Indeed the only real
difference is that ΔΩ in Eqs. (5.10a), (5.10b), and (5.10c) refers to the solid angle subtended by
the distant source and ΔΩ in (5.26a), (5.26b), and (5.26c) refers to the detector’s field of view.
Because the mathematical formalism is the same, it makes sense to call $P_{\mathrm{bal}}$ in (5.26a) the optical
power of the balanced interference signal hitting the detector and, following the pattern of Eqs.
(5.17a) through (5.17d), once again define
\[ I_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\,d\sigma \tag{5.27a} \]
to be the balanced interferogram. The only difference between (5.17d) and (5.27a) is the meaning
we attach to the solid angle ΔΩ in the definition of S. Now Eq. (5.26a) can be written as
\[ P_{\mathrm{bal}}(\chi) = \frac{1}{2}P_0 + W\,I_{\mathrm{bal}}(\chi) , \tag{5.27b} \]
where, just like in (5.17c),
\[ P_0 = \int_{0}^{\infty} S(\sigma)\,d\sigma . \tag{5.27c} \]
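The split of the balanced signal into a constant part and an interferogram can be checked
numerically. The sketch below is an illustration only: the Gaussian spectrum `S`, the choice
M = 1, the integration limit `SIGMA_MAX`, and the midpoint-rule helper are all assumptions made
for the example, not quantities defined in the text.

```python
import math

def midpoint(f, a, b, n=4000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

SIGMA_MAX = 2000.0  # cm^-1; assumed upper limit beyond which S is negligible

def S(sigma):
    """Hypothetical spectrum: Gaussian band at 1000 cm^-1 with 50 cm^-1 width."""
    return math.exp(-0.5 * ((sigma - 1000.0) / 50.0) ** 2)

def M(sigma):
    """Stand-in for M(R*sigma*theta_ma); equal to 1 for a perfectly aligned mirror."""
    return 1.0

def P_bal(chi, W=1.0):
    """Balanced signal power, Eq. (5.26a)."""
    return 0.5 * midpoint(
        lambda s: S(s) * (1.0 + W * M(s) * math.cos(2.0 * math.pi * s * chi)),
        0.0, SIGMA_MAX)

def I_bal(chi):
    """Balanced interferogram, Eq. (5.27a)."""
    return 0.5 * midpoint(
        lambda s: S(s) * M(s) * math.cos(2.0 * math.pi * s * chi),
        0.0, SIGMA_MAX)

P0 = midpoint(S, 0.0, SIGMA_MAX)  # Eq. (5.27c)

for chi in (0.0, 2.5e-4, 1.0e-3):
    for W in (+1.0, -1.0):
        lhs = P_bal(chi, W)
        rhs = 0.5 * P0 + W * I_bal(chi)  # Eq. (5.27b)
        print(f"chi = {chi:.2e}  W = {W:+.0f}  P_bal = {lhs:.6f}  P0/2 + W*I_bal = {rhs:.6f}")
```

The two columns agree for either sign of W, and replacing chi by -chi leaves both unchanged
because the cosine is even.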
Since the cosine in Eq. (5.26a) is an even function of χ, the interference signal $P_{\mathrm{bal}}$ must be, as it
is in Eq. (5.18a), an even function of χ,
\[ P_{\mathrm{bal}}(-\chi) = P_{\mathrm{bal}}(\chi) , \tag{5.28a} \]
so that
\[ I_{\mathrm{bal}}(\chi) = \frac{P_{\mathrm{bal}}(\chi) - \tfrac{1}{2}P_0}{W} \tag{5.28b} \]
is once again an even function of χ:
\[ I_{\mathrm{bal}}(-\chi) = I_{\mathrm{bal}}(\chi) . \tag{5.28c} \]
As in Eq. (4.141c) of Chapter 4, we can make S(σ) into an even function by requiring
\[ S(-\sigma) = S(\sigma) \tag{5.29a} \]
and writing
\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|) . \tag{5.29b} \]
Unlike Eqs. (4.140c) and (4.141c) of Chapter 4, the beam-splitter efficiency η is now included in
the definition of S. The argument of η does not have to be put inside absolute value signs
because, according to Eq. (4.139g) of Chapter 4, it is already an even function of σ. Function
$M(R\sigma\theta_{ma})$ is also an even function of σ [see Eq. (5.10f)], as is $\cos(2\pi\sigma\chi)$, so both
$S(\sigma)M(R\sigma\theta_{ma})$ and $[S(\sigma)M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]$ are even functions of σ. The sine of $(2\pi\sigma\chi)$
is an odd function of σ because
\[ \sin(-2\pi\sigma\chi) = -\sin(2\pi\sigma\chi) , \]
and therefore so is the product
\[ S(\sigma)\,M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi) . \]
This means we can write, using $e^{\pm i\phi} = \cos(\phi) \pm i\sin(\phi)$,
\[
\begin{aligned}
\int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,S(\sigma)\,e^{\pm 2\pi i\sigma\chi}\,d\sigma
&= \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma \pm i\int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,S(\sigma)\sin(2\pi\sigma\chi)\,d\sigma \\
&= 2\int_{0}^{\infty} M(R\sigma\theta_{ma})\,S(\sigma)\cos(2\pi\sigma\chi)\,d\sigma .
\end{aligned}
\]
Here we use that the integral of $S(\sigma)M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)$ between −∞ and +∞ is, according to
Eq. (2.19) in Chapter 2, twice the value of its integral over just positive σ, because
$S(\sigma)M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)$ is an even function of σ; and we also use that the integral of
$S(\sigma)M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi)$ over σ is the integral of an odd function between −∞ and +∞, which,
according to Eq. (2.17) in Chapter 2, must be zero. Comparison of this result to Eq. (5.27a) shows
that the interferogram can be written as
\[ I_{\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma \tag{5.29c} \]
where the plus sign is chosen for the complex exponent of e. Note that, having now chosen
$I_{\mathrm{bal}}(\chi)$ to be the inverse Fourier transform of $[(1/4)\,S(\sigma)M(R\sigma\theta_{ma})]$, we can reverse the Fourier
transform in (5.29c) to get
\[ S(\sigma)\,M(R\sigma\theta_{ma}) = 4\int_{-\infty}^{\infty} I_{\mathrm{bal}}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \tag{5.29d} \]
Our choice of sign for the complex exponent thus makes $[S(\sigma)M(R\sigma\theta_{ma})]$ the forward Fourier
transform of $4\,I_{\mathrm{bal}}(\chi)$. This sign choice is, of course, purely a matter of convention, but it is the
one followed by most optical engineers today and it is the one used for the rest of this book.
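The equivalence of the one-sided cosine form (5.27a) and the two-sided exponential form (5.29c)
can be confirmed with a small numerical sketch. Everything specific in it is an assumption made
for illustration: the Gaussian spectrum, the choice M = 1, the integration limit, and the step
size. The spectrum is extended to negative wavenumbers through |σ|, as in Eq. (5.29b).

```python
import cmath
import math

SIGMA_MAX = 2000.0  # cm^-1; assumed limit beyond which S is negligible
STEP = 0.5          # cm^-1; integration step (assumed)

def S(sigma):
    """Hypothetical spectrum, made even in sigma by evaluating it at |sigma|."""
    return math.exp(-0.5 * ((abs(sigma) - 1000.0) / 50.0) ** 2)

def midpoint(f, a, b, h=STEP):
    """Midpoint-rule estimate of the integral of f over [a, b]; f may be complex."""
    n = round((b - a) / h)
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def I_bal_cosine(chi):
    """One-sided cosine form, Eq. (5.27a), with M = 1."""
    return 0.5 * midpoint(
        lambda s: S(s) * math.cos(2.0 * math.pi * s * chi), 0.0, SIGMA_MAX)

def I_bal_exponential(chi):
    """Two-sided complex-exponential form, Eq. (5.29c), with M = 1."""
    return 0.25 * midpoint(
        lambda s: S(s) * cmath.exp(2j * math.pi * s * chi), -SIGMA_MAX, SIGMA_MAX)

for chi in (0.0, 2.5e-4, 4.0e-4):
    z = I_bal_exponential(chi)
    print(f"chi = {chi:.2e}  cosine form = {I_bal_cosine(chi):+.6f}  "
          f"exponential form = {z.real:+.6f} {z.imag:+.1e}i")
```

The imaginary part of the exponential form vanishes to rounding error because the sine term
integrates to zero over the even spectrum, leaving the same real value as the cosine form.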
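\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\iint_{\text{field of view}} d^2\varepsilon\int_{-\infty}^{\infty}\mathcal{L}(\vec{\varepsilon},\sigma)\,\eta(\sigma)\left[1 + \frac{W}{A}\,\mathrm{Re}\!\left(Ȇ_A(\sigma\vec{\Delta})\,e^{-2\pi i\sigma\chi\cos\alpha_\varepsilon}\right)\right] d\sigma . \]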
Applying the same reasoning as in the discussion after Eq. (5.25b), we note that here the
interferometer beam’s cross-sectional area A must be the same as the area A of the lens, and that
the “field of view” in the integral over $d^2\varepsilon$ must refer to the field of view of the detector. For the
standard interferometer beam with a circular cross section, we have from Eq. (4.137h) of Chapter
4 and Eq. (5.10c) that
\[ \frac{1}{A}\,Ȇ_A(\sigma\vec{\Delta})\Big|_{\text{circle of radius } R} = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} = M(R\sigma\theta_{ma}) . \]
Substituting this into the expression for $P_{\mathrm{bal}}(\chi)$ gives, since M is real and
$\mathrm{Re}(e^{-2\pi i\sigma\chi\cos\alpha_\varepsilon}) = \cos(2\pi\sigma\chi\cos\alpha_\varepsilon)$, that
\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\iint_{\text{field of view}} d^2\varepsilon\int_{-\infty}^{\infty}\mathcal{L}(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma . \]
Equations (4.136b) and (4.139g) of Chapter 4 require $\mathcal{L}(\vec{\varepsilon},\sigma)$ and η(σ) to be even functions of σ,
and we already know that M is an even function of σ [see Eq. (5.10f)]. Consequently, because the
cosine is also an even function, it follows that
\[ \mathcal{L}(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \]
is itself an even function of σ. Equation (2.19) in Chapter 2 can now be used to modify the upper
and lower bounds of the integral over dσ, so that the integration takes place between 0 and ∞.
Having made these changes, Eq. (4.136d) of Chapter 4 can then be used to write the formula for
$P_{\mathrm{bal}}(\chi)$ as
\[ P_{\mathrm{bal}}(\chi) = \iint_{\text{field of view}}\left\{ d^2\varepsilon\,\frac{A}{2}\int_{0}^{\infty} L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma\right\} . \tag{5.30} \]
Function $L(\vec{\varepsilon},\sigma)$ was defined in Eq. (4.136d) of Chapter 4 to be the spectral radiance as a
function of wavenumber σ and direction $\vec{\varepsilon}$ for the beam entering the interferometer; and from the
analysis in Sec. 5.2 above, and in particular the discussion following Eq. (5.6), we know that
$L(\vec{\varepsilon},\sigma)$ can also be interpreted as the spectral radiance of the pencil ray traveling in a direction
specified by $\vec{\varepsilon}$. When this pencil ray becomes part of the interferometer’s beam, it can be
decomposed into a parallel group of rays traveling in the direction specified by $\vec{\varepsilon}$, which is of
course the same thing as recognizing the existence of a plane wave traveling in the direction
specified by $\vec{\varepsilon}$. This means that the integral over $d^2\varepsilon$ can be interpreted as a sum over all the
plane waves passing through the interferometer. Consequently, in Eq. (5.30), the term inside the
braces { },
\[ d^2\varepsilon\,\frac{A}{2}\int_{0}^{\infty} L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma , \]
The Fore and Aft Optics · 5.8
Having interpreted the integral over $d^2\varepsilon$ in Eq. (5.30) as a sum over the power contributed by
each polychromatic plane wave passing through the interferometer, we next analyze the integral
over dσ as a sum over all the monochromatic wavenumber components present in any one
polychromatic plane wave.84 In Eq. (5.30) we regard the σth wavenumber component of the plane
wave specified by $\vec{\varepsilon}$ as contributing an amount of power
\[ d^2\varepsilon\,d\sigma\left(\frac{A}{2}\right) L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] \]
to the optical power reaching the detector. Analyzing the system this way shows us how to
include the effects of nonideal optical components in the formulas for $P_{\mathrm{bal}}(\chi)$. If, for example,
the lens in Fig. 5.18 transmits some optical wavelengths λ more efficiently than others, a behavior
typical of real optical materials, we can introduce a transmission $\tau_{\mathrm{lens}}$ that is always a real number
between zero and one and make it a function of wavenumber $\sigma = \lambda^{-1}$. Now in Fig. 5.18 each σth
wavenumber component of the plane wave specified by $\vec{\varepsilon}$ contributes an order-of-magnitude
$d^2\varepsilon\cdot d\sigma$ amount of power
\[ \tau_{\mathrm{lens}}(\sigma)\cdot\left\{ d^2\varepsilon\,d\sigma\left(\frac{A}{2}\right) L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\right\} \]
to the detector. Those wavenumbers for which $\tau_{\mathrm{lens}} = 0$, showing that for them the lens is opaque,
are blocked from contributing any power to $P_{\mathrm{bal}}(\chi)$; and those wavenumbers for which $\tau_{\mathrm{lens}} = 1$,
meaning that they pass through the lens without losing any power, contribute to $P_{\mathrm{bal}}(\chi)$ as if they
were being focused by an ideal lens.
In general, an interferometer such as the one shown in Fig. 5.18 will have both “fore optics”
to gather in and prepare outside radiation for passage through the interferometer and “aft optics”
to focus the optical beam onto the detector after passage through the interferometer (see Fig.
5.19). In an astronomical Fourier-transform system, for example, the fore optics could be a
telescope designed to gather in large quantities of photons and send them through the
interferometer while the aft optics, like the lens in Fig. 5.18, is designed to focus the beam onto
the detector. We can lump the transmissions of the individual optical elements of both the fore
optics and the aft optics into two combined transmission functions $\tau_f(\sigma)$ and $\tau_a(\sigma)$
respectively. This means the σth component of the $\vec{\varepsilon}$th plane wave can only contribute a
84
In effect, we are reverting to the analysis at the beginning of Chapter 4, representing the optical field propagating
through the interferometer as a sum of monochromatic plane waves over different directions and wavenumbers.
\[ \tau_f(\sigma)\cdot\tau_a(\sigma)\cdot\left\{ d^2\varepsilon\,d\sigma\left(\frac{A}{2}\right) L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\right\} \]
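amount to the optical power reaching the detector. Consequently, we adjust the formula for
$P_{\mathrm{bal}}(\chi)$ in Eq. (5.30) to get
\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\iint_{\text{field of view}} d^2\varepsilon\int_{0}^{\infty} L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma \tag{5.31a} \]
for the total power from the balanced optical signal reaching the detector in Fig. 5.19. If all the
plane waves of interest are characterized by the same spectral radiance, the dependence of L on
$\vec{\varepsilon}$ can be suppressed to get
\[ P_{\mathrm{bal}}(\chi) = \frac{A}{2}\int_{0}^{\infty} d\sigma\,L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma)\iint_{\text{field of view}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] . \tag{5.31b} \]
If, in addition, the field of view is sufficiently small to make $\cos\alpha_\varepsilon \cong 1$ a good approximation,
then we can write
\[ P_{\mathrm{bal}}(\chi) = \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma , \tag{5.31c} \]
where
\[ \Delta\Omega = \iint_{\text{field of view}} d^2\varepsilon . \tag{5.31d} \]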
Equations (5.31a)–(5.31d) are a useful set of formulas for describing Pbal ( χ ) . If an interferometer
is built with no fore optics, then we can set τ f (σ ) = 1 ; and to represent negligible loss in the aft
optics, we set τ a (σ ) = 1 . As was discussed in the previous sections, we know that for an ideal
beam splitter η (σ ) = 1 , and for a perfectly aligned interferometer M = 1.
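The role of each factor in Eq. (5.31c) can be made concrete with a short numerical sketch. It is
an illustration under assumed inputs only: the radiance `L`, the transmissions `tau_f` and
`tau_a`, the beam area, the solid angle, and the integration limit are hypothetical values chosen
for the example, and `p_bal` is simply a midpoint-rule evaluation of Eq. (5.31c).

```python
import math

def midpoint(f, a, b, n=4000):
    """Midpoint-rule estimate of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def p_bal(chi, L, tau_f, tau_a, eta, M, A, d_omega, W=1.0, sigma_max=4000.0):
    """Eq. (5.31c): balanced optical power at the detector for a small field of view."""
    def integrand(s):
        return (L(s) * tau_f(s) * tau_a(s) * eta(s)
                * (1.0 + W * M(s) * math.cos(2.0 * math.pi * s * chi)))
    return 0.5 * A * d_omega * midpoint(integrand, 0.0, sigma_max)

def one(s):
    """Ideal component: unit transmission, efficiency, or alignment factor."""
    return 1.0

def L(s):
    """Hypothetical spectral radiance: Gaussian band at 2000 cm^-1, 200 cm^-1 wide."""
    return math.exp(-0.5 * ((s - 2000.0) / 200.0) ** 2)

def tau_f(s):
    """Hypothetical fore optics with flat 90 percent transmission."""
    return 0.9

def tau_a(s):
    """Hypothetical aft optics that are opaque above 2200 cm^-1."""
    return 0.8 if s < 2200.0 else 0.0

A, D_OMEGA = 1.0, 1.0e-4  # cm^2 and sr (assumed)

ideal = p_bal(0.0, L, one, one, one, one, A, D_OMEGA)
real = p_bal(0.0, L, tau_f, tau_a, one, one, A, D_OMEGA)
print(f"ideal optics: {ideal:.6e}   with fore and aft losses: {real:.6e}")
```

Setting every transmission function to `one` reproduces the ideal case, while any transmission
below one can only reduce the power that reaches the detector.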
[FIGURE 5.19. Interferometer with fore optics, moving mirror, and circular detector.]
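We can put Eqs. (5.31c) and (5.31d) into the same form as Eqs. (5.26a)–(5.26c) by writing
\[ P_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma , \tag{5.32a} \]
where
\[ S(\sigma) = A\,\Delta\Omega\,L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma) . \tag{5.32b} \]
All that is different from Eqs. (5.26a)–(5.26c) is the definition of S(σ), which now includes
the transmissions $\tau_f(\sigma)$ and $\tau_a(\sigma)$ of the fore and aft optics. The balanced interferogram is again
\[ I_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\,d\sigma , \tag{5.32c} \]
with
\[ P_{\mathrm{bal}}(\chi) = \frac{1}{2}P_0 + W\,I_{\mathrm{bal}}(\chi) \tag{5.32d} \]
and
\[ P_0 = \int_{0}^{\infty} S(\sigma)\,d\sigma . \tag{5.32e} \]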
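Again we can see that $I_{\mathrm{bal}}$ and $P_{\mathrm{bal}}$ are even functions of χ because the cosine is an even function
of χ:
\[ I_{\mathrm{bal}}(-\chi) = I_{\mathrm{bal}}(\chi) \tag{5.33a} \]
and
\[ P_{\mathrm{bal}}(-\chi) = P_{\mathrm{bal}}(\chi) . \tag{5.33b} \]
As before, S(σ) can be made an even function,
\[ S(-\sigma) = S(\sigma) , \tag{5.33c} \]
by writing
\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) \tag{5.33d} \]
for negative values of σ. Using the same argument as in the discussion following Eq. (5.29b), the
interferogram can now be written as the inverse Fourier transform of $[(1/4)\,S(\sigma)M(R\sigma\theta_{ma})]$,
\[ I_{\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma , \tag{5.34a} \]
which can be reversed to give
\[ S(\sigma)\,M(R\sigma\theta_{ma}) = 4\int_{-\infty}^{\infty} I_{\mathrm{bal}}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \tag{5.34b} \]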
There is nothing new here; all that has changed from the previous Fourier-transform relations in
Eqs. (5.29c) and (5.29d) is that we have extended the definition of S(σ) from
\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|) \]
in Eq. (5.29b) to
\[ S(\sigma) = A\,\Delta\Omega\,\eta(\sigma)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) \]
in Eq. (5.33d). In fact, all of Eqs. (5.26a) through (5.29d) can now be regarded as a special case
of Eqs. (5.32a) through (5.34b), what we get when making the idealization that $\tau_f = \tau_a = 1$.
to the balanced component of the optical power reaching the detector in Eq. (5.31a). To find the
corresponding contribution to the electrical signal leaving the detector, we multiply this by R(σ)
to get
\[ R(\sigma)\cdot\left\{ d^2\varepsilon\,d\sigma\left(\frac{A}{2}\right) L(\vec{\varepsilon},\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\right\} . \]
Consequently, the balanced component of the electrical signal leaving the detector at an optical
path difference χ is
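\[ K_{\mathrm{bal}}(\chi) = \frac{A}{2}\iint_{\text{field of view}} d^2\varepsilon\int_{0}^{\infty} L(\vec{\varepsilon},\sigma)\,R(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\,d\sigma . \tag{5.35a} \]
When all the plane waves of interest have the same spectral radiance L, this becomes
\[ K_{\mathrm{bal}}(\chi) = \frac{A}{2}\int_{0}^{\infty} d\sigma\,L(\sigma)\,R(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma)\iint_{\text{field of view}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)] ; \tag{5.35b} \]
and if we can assume $\cos\alpha_\varepsilon \cong 1$ because the interferometer’s field of view is small, then
\[ K_{\mathrm{bal}}(\chi) = \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} L(\sigma)\,R(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,\eta(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma \tag{5.35c} \]
with
\[ \Delta\Omega = \iint_{\text{field of view}} d^2\varepsilon . \tag{5.35d} \]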
From the way this result is derived, we see that it is always easy to go from the formulas for the
signal leaving the detector to the formulas for the optical signal hitting the detector: just set
R (σ ) = 1 .
We work now with the assumption that all the plane waves of interest have the same spectral
radiance L. Just like in Eq. (5.32b), we define a function
\[ S(\sigma) = A\,\Delta\Omega\,R(\sigma)\,\eta(\sigma)\,L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma) . \tag{5.36a} \]
This definition of S(σ), unlike the one in (5.32b), contains the detector responsivity R(σ).
Equation (5.35b) becomes, when $\cos\alpha_\varepsilon \cong 1$ is not a good approximation,
\[ K_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)]\right\} d\sigma ; \tag{5.36b} \]
The Detector Signal · 5.9
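and, when it is a good approximation, Eq. (5.35c) becomes
\[ K_{\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,[1 + W\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)]\,d\sigma . \tag{5.36c} \]
Following the same pattern as in the discussions after Eqs. (5.26c) and (5.32b), we can write
either of these two expressions as the sum of a constant term and a term depending on χ,
\[ K_{\mathrm{bal}}(\chi) = \frac{1}{2}K_0 + W\,K_{I\mathrm{bal}}(\chi) , \tag{5.37a} \]
where
\[ K_0 = \int_{0}^{\infty} S(\sigma)\,d\sigma \tag{5.37b} \]
and
\[ K_{I\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\right\} d\sigma \tag{5.37c} \]
when $\cos\alpha_\varepsilon$ cannot be approximated as one, and
\[ K_{I\mathrm{bal}}(\chi) = \frac{1}{2}\int_{0}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\,d\sigma \tag{5.37d} \]
when it can.
Whether or not $\cos\alpha_\varepsilon \cong 1$ is a good approximation, Eqs. (5.37c) and (5.37d) show that $K_{I\mathrm{bal}}$ is an
even function of the optical path difference χ,
\[ K_{I\mathrm{bal}}(-\chi) = K_{I\mathrm{bal}}(\chi) , \tag{5.38a} \]
for all values of $\cos\alpha_\varepsilon$. Since $K_{I\mathrm{bal}}$ is even, it follows from (5.37a) that $K_{\mathrm{bal}}$ must also be an even
function of χ,
\[ K_{\mathrm{bal}}(-\chi) = K_{\mathrm{bal}}(\chi) . \tag{5.38b} \]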
As before, the nonconstant component $K_{I\mathrm{bal}}$ of the total signal can be made proportional to a Fourier
transform. Equation (5.10f) shows $M(R\sigma\theta_{ma})$ to be an even function of σ, and we can always
force S to be even by defining $S(\sigma) = S(|\sigma|)$ so that
\[ S(-\sigma) = S(\sigma) . \tag{5.39a} \]
Now both
\[ S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi) \]
and
\[ S(\sigma)\,M(R\sigma\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\right\} \]
are even functions of σ because they are the products of even functions of σ. We can write Eq.
(5.37b) as
\[ K_0 = \frac{1}{2}\int_{-\infty}^{\infty} S(\sigma)\,d\sigma \tag{5.39b} \]
because S is even [see Eq. (2.19) in Chapter 2]. Equation (5.37c) becomes
\[ K_{I\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\right\} d\sigma \tag{5.39c} \]
when $\cos\alpha_\varepsilon$ cannot be approximated as one, and Eq. (5.37d) becomes
\[ K_{I\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\,d\sigma \tag{5.39d} \]
when $\cos\alpha_\varepsilon$ can be approximated as one. Using the same reasoning as in the discussion
following Eq. (5.29b), we see that
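\[
\begin{aligned}
\frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma
&= \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\,d\sigma + \frac{i}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi)\,d\sigma \\
&= \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi)\,d\sigma ,
\end{aligned}
\]
because the integral of the odd function $S(\sigma)M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi)$ between −∞ and +∞
must, according to Eq. (2.17) of Chapter 2, equal zero. Hence, when $\cos\alpha_\varepsilon$ can be approximated
as one, Eq. (5.39d) can be written as
\[ K_{I\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{5.40a} \]
Interchanging the order of integration in Eq. (5.39c) gives
\[ K_{I\mathrm{bal}}(\chi) = \frac{1}{4\,\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\int_{-\infty}^{\infty} d\sigma\,S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon) . \]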
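The same reasoning shows that
\[
\begin{aligned}
\frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon}\,d\sigma
&= \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\,d\sigma \\
&\qquad + \frac{i}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\sin(2\pi\sigma\chi\cos\alpha_\varepsilon)\,d\sigma \\
&= \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon)\,d\sigma ,
\end{aligned}
\]
so that the formula for $K_{I\mathrm{bal}}$ when $\cos\alpha_\varepsilon$ cannot be approximated as one
can be written as
\[
\begin{aligned}
K_{I\mathrm{bal}}(\chi) &= \frac{1}{4\,\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\int_{-\infty}^{\infty} d\sigma\,S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \\
&= \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon}\right\} d\sigma .
\end{aligned}
\tag{5.40b}
\]
Therefore we have shown that, according to Eqs. (5.40a) and (5.40b), $K_{I\mathrm{bal}}$ can be written as
\[ K_{I\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon}\right\} d\sigma \tag{5.40c} \]
when $\cos\alpha_\varepsilon$ cannot be approximated as one and
\[ K_{I\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma \tag{5.40d} \]
when $\cos\alpha_\varepsilon$ can be approximated as one. Glancing back at Eqs. (5.37a) and (5.39b), we note that
the balanced component of the electrical signal leaving the detector due to the input spectral
power is [see also Eqs. (5.40c) and (5.40d)]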
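\[ K_{\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,d\sigma + \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon}\right\} d\sigma \tag{5.40e} \]
when $\cos\alpha_\varepsilon$ cannot be approximated as one, and
\[ K_{\mathrm{bal}}(\chi) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,d\sigma + \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma \tag{5.40f} \]
when it can. The formula for S(σ) comes from Eqs. (5.39a) and (5.36a), which can be combined
to give
\[
\begin{aligned}
S(\sigma) &= A\,\Delta\Omega\,R(|\sigma|)\,\eta(|\sigma|)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) \\
&= A\,\Delta\Omega\,R(|\sigma|)\,\eta(\sigma)\,L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) .
\end{aligned}
\tag{5.40g}
\]
The absolute value signs are dropped from the argument of η because it is already an even
function [see Eq. (4.139g) of Chapter 4].
\[ \chi = 2a . \]
Taking the time t to be zero when the moving mirror is at ZPD with a = 0, we have
\[ a = vt \]
for v the velocity of the moving mirror. Substituting the second formula into the first gives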
[FIGURE 5.20. Interferometer with fore optics, moving mirror, fixed mirror, ideal beam splitter, aft
optics, and circular detector. The electrical signal from the detector enters the detector circuit,
which processes $K_{\mathrm{bal}}$.]
The Detector Circuit · 5.10
\[ \chi = 2vt = ut , \tag{5.41a} \]
where
\[ u = 2v \tag{5.41b} \]
is a quantity called the optical-path-difference velocity, or OPD velocity for short. Just as the
optical-path difference χ has the same length units as the mirror displacement a, so does u have
the same velocity units as v.
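A minimal sketch of Eqs. (5.41a) and (5.41b) follows. The function names and the numerical
values are hypothetical. The last function relies on a step not spelled out at this point in the
text: substituting χ = ut into an on-axis fringe term cos(2πσχ) gives cos(2πσut), so a
wavenumber σ appears in the detector signal as a temporal frequency f = uσ.

```python
def opd_velocity(v):
    """OPD velocity u = 2v, Eq. (5.41b)."""
    return 2.0 * v

def opd(v, t):
    """Optical path difference chi = 2vt = ut, Eq. (5.41a), with t = 0 at ZPD."""
    return opd_velocity(v) * t

def fringe_frequency(sigma, v):
    """Temporal frequency of the on-axis fringe for wavenumber sigma: f = u * sigma."""
    return opd_velocity(v) * sigma

v = 0.5         # mirror velocity in cm/s (assumed)
sigma = 2000.0  # wavenumber in cm^-1 (assumed)

print(opd_velocity(v))             # 1.0 (cm/s)
print(opd(v, 0.25))                # 0.25 (cm)
print(fringe_frequency(sigma, v))  # 2000.0 (Hz)
```

With the mirror velocity in cm/s and the wavenumber in cm^-1, the fringe frequency comes out in
Hz, which is the frequency scale on which the detector circuit operates.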
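Substitution of (5.41a) into (5.37a) gives
\[ K_{\mathrm{bal}}(ut) = \frac{1}{2}K_0 + W\,K_{I\mathrm{bal}}(ut) , \tag{5.42a} \]
where, according to Eq. (5.40c), $K_{I\mathrm{bal}}(ut)$ can be written as
\[ K_{I\mathrm{bal}}(ut) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\,e^{2\pi i\sigma ut\cos\alpha_\varepsilon}\right\} d\sigma , \tag{5.42b} \]
when $\cos\alpha_\varepsilon$ cannot be approximated as one and, according to Eq. (5.40d), $K_{I\mathrm{bal}}(ut)$ can be
written as
\[ K_{I\mathrm{bal}}(ut) = \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma ut}\,d\sigma \tag{5.42c} \]
when $\cos\alpha_\varepsilon$ can be approximated as one. If the detector circuit is built to record only
time-varying signals, a process sometimes called “AC coupling” of the detector,85 then the $K_{\mathrm{bal}}(ut)$
signal leaving the detector only contributes its time-varying part $K_{I\mathrm{bal}}(ut)$ to the rest of the
system.
Suppose we define
\[ g_{\mathrm{in}}(t) = K_{\mathrm{bal}}(ut) = \frac{1}{2}K_0 + W\,K_{I\mathrm{bal}}(ut) \tag{5.43} \]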
to be the time-varying signal leaving the detector and entering the detector circuit. Assuming the
circuit to be linear—and the interferometer cannot produce accurate spectral measurements if it is
not—we know from the discussion in Appendix 5A of this chapter that the product of the circuit
transfer function and Fourier transform of the input signal equals the Fourier transform of the
85
AC stands for alternating current.
output signal [see Eq. (5A.3a) in Appendix 5A]. Consequently, to get the output of a linear
circuit, we just take the Fourier transform of the input, multiply by the transfer function, and then
take the inverse Fourier transform of the product. Applying this recipe to gin (t ) , we see from Eq.
(5.43) that Gin ( f ) , the Fourier transform of gin (t ) , is
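\[
G_{\mathrm{in}}(f) = \int_{-\infty}^{\infty}\left[\frac{1}{2}K_0 + W\,K_{I\mathrm{bal}}(ut)\right] e^{-2\pi i f t}\,dt
= \frac{1}{2}K_0\int_{-\infty}^{\infty} e^{-2\pi i f t}\,dt + W\int_{-\infty}^{\infty} K_{I\mathrm{bal}}(ut)\,e^{-2\pi i f t}\,dt .
\]
According to Eq. (2.71f) of Chapter 2, the constant term turns into a delta function. This means
that when $\cos\alpha_\varepsilon$ cannot be approximated by one, Eq. (5.42b) can be used to write
\[ G_{\mathrm{in}}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4}\int_{-\infty}^{\infty} dt\,e^{-2\pi i f t}\int_{-\infty}^{\infty} d\sigma\,S(\sigma)\,M(R\sigma\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\,e^{2\pi i\sigma ut\cos\alpha_\varepsilon}\right\} , \tag{5.44a} \]
and when $\cos\alpha_\varepsilon$ can be approximated as one, Eq. (5.42c) can be used to write
\[ G_{\mathrm{in}}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4}\int_{-\infty}^{\infty} dt\,e^{-2\pi i f t}\int_{-\infty}^{\infty} d\sigma\,S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma ut} . \tag{5.44b} \]
In either case, we can move the integral over dt to the inside to get, using Eq. (2.71f) from
Chapter 2, that
\[ \int_{-\infty}^{\infty} e^{2\pi i t(\sigma u\cos\alpha_\varepsilon - f)}\,dt = \delta(\sigma u\cos\alpha_\varepsilon - f) = \frac{1}{u\cos\alpha_\varepsilon}\,\delta\!\left(\sigma - \frac{f}{u\cos\alpha_\varepsilon}\right) \]
when $\cos\alpha_\varepsilon$ cannot be approximated as one and
\[ \int_{-\infty}^{\infty} e^{2\pi i t(\sigma u - f)}\,dt = \delta(\sigma u - f) = \frac{1}{u}\,\delta\!\left(\sigma - \frac{f}{u}\right) \]
when it can. In both these expressions, Eq. (2.68d) of Chapter 2 is used to factor the arguments of
the delta functions. Here u is positive, and so is the cosine because its argument is always a
relatively small angle. Substitution of these two results back into Eqs. (5.44a) and (5.44b) gives
\[ G_{\mathrm{in}}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4u\,\Delta\Omega}\iint_{\text{field of view}}\frac{d^2\varepsilon}{\cos\alpha_\varepsilon}\,S\!\left(\frac{f}{u\cos\alpha_\varepsilon}\right) M\!\left(\frac{Rf\theta_{ma}}{u\cos\alpha_\varepsilon}\right) \tag{5.45a} \]
when $\cos\alpha_\varepsilon$ cannot be approximated as one and
\[ G_{\mathrm{in}}(f) = \frac{K_0}{2}\,\delta(f) + \frac{W}{4u}\,S\!\left(\frac{f}{u}\right) M\!\left(\frac{Rf\theta_{ma}}{u}\right) \tag{5.45b} \]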
when it can. Still following the recipe for the detector circuit’s output signal, we define H(ƒ) to
be the detector circuit’s transfer function and take the inverse Fourier transform of the product
H( f ) ⋅ Gin ( f )
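to get the formula for the signal leaving the detector circuit:
\[ g_{\mathrm{out}}(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\,H(f)\,G_{\mathrm{in}}(f)\,df . \tag{5.46a} \]
When $\cos\alpha_\varepsilon$ cannot be approximated as one, substituting Eq. (5.45a) gives
\[
\begin{aligned}
g_{\mathrm{out}}(t) &= \int_{-\infty}^{\infty} e^{2\pi i f t}\left[\frac{K_0}{2}\,H(f)\,\delta(f)\right] df \\
&\qquad + \frac{W}{4u\,\Delta\Omega}\iint_{\text{field of view}}\frac{d^2\varepsilon}{\cos\alpha_\varepsilon}\int_{-\infty}^{\infty} df\,e^{2\pi i f t}\,H(f)\,S\!\left(\frac{f}{u\cos\alpha_\varepsilon}\right) M\!\left(\frac{Rf\theta_{ma}}{u\cos\alpha_\varepsilon}\right) \\
&= \frac{K_0}{2}\,H(0) + \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma')\,M(R\sigma'\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\,H(\sigma' u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma' ut\cos\alpha_\varepsilon}\right\} d\sigma' ,
\end{aligned}
\tag{5.46b}
\]
where the variable of integration has been changed to
\[ \sigma' = \frac{f}{u\cos\alpha_\varepsilon} . \]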
Glancing back at Eq. (5.45b), we see that the formula for the case where $\cos\alpha_\varepsilon$ can be
approximated by one must be [just take $\cos\alpha_\varepsilon = 1$ in Eq. (5.46b) and apply Eq. (5.35d)]
\[ g_{\mathrm{out}}(t) = \frac{K_0}{2}\,H(0) + \frac{W}{4}\int_{-\infty}^{\infty} H(\sigma' u)\,S(\sigma')\,M(R\sigma'\theta_{ma})\,e^{2\pi i\sigma' ut}\,d\sigma' . \tag{5.46c} \]
In either case, we can AC couple the detector to the detector circuit by designing the circuit so
that its transfer function has
\[ H(0) = 0 . \tag{5.46d} \]
This eliminates the constant term from formulas (5.46b) and (5.46c). At this level of idealization,
there is no particular reason to think of the signal leaving the detector circuit as a function of time
rather than the optical-path difference, since they are linearly related to each other by formula
(5.41a) above. Dropping the prime from σ, we use (5.41a) to write the output of the detector
circuit as
\[ z(\chi) = g_{\mathrm{out}}(\chi/u) \tag{5.47a} \]
with
\[ z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\left\{\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\,H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon}\right\} d\sigma \tag{5.47b} \]
when $\cos\alpha_\varepsilon$ cannot be approximated as one, and
\[ z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} H(\sigma u)\,S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma \tag{5.47c} \]
when it can. Because these last two formulas refer to the time-based signal leaving the detector
circuit, it may seem unnatural to write them in terms of χ and σ, but we will find it useful to have
them written in terms of the optical-path difference and wavenumber just like the previous
equations discussed in this chapter. To neglect the effect of the detector circuit, for example, we
need only take H = 1 inside the integrals of (5.47b) and (5.47c) to return at once to the integrals
in (5.40c) and (5.40d) respectively, which, when multiplied by W, become $W K_{I\mathrm{bal}}$, the
χ-dependent part of the signal leaving the detector.
The Effective Spectrum · 5.11
∞
z(χ ) = ³Z
−∞
eff (σ ) e 2π iσχ dσ ,
where
W
Z eff (σ ) = H(uσ ) S (σ )M( Rσθ ma ) . (5.48a)
4
This shows that in (5.47c) the interferogram signal z(χ) can be written as the inverse Fourier
transform of an effective spectrum Z_eff(σ). It is easy to show that the interferogram signal can
always be written as the inverse Fourier transform of an effective spectrum. As long as the
interferogram signal z(χ) is a transformable function, we can take its Fourier transform,
$$ \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi , $$
and call it the effective spectrum,
$$ Z_{eff}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \tag{5.48b} $$
$$ z(\chi) = \int_{-\infty}^{\infty} Z_{eff}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{5.48c} $$
When, for example, cos α ε cannot be approximated as one, as in Eq. (5.47b), we can write for the
effective spectrum
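$$ \begin{aligned} Z_{eff}(\sigma) &= \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \\ &= \frac{W}{4}\int_{-\infty}^{\infty} d\chi\; e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\sigma'\, S(\sigma')\,M(R\sigma'\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma' u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma'\chi\cos\alpha_\varepsilon} \right\} , \end{aligned} \tag{5.48d} $$
$$ z(\chi) = \int_{-\infty}^{\infty} Z_{eff}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . $$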
Although there is nothing very profound about this procedure, it can be a useful way of analyzing
the distortions undergone by the interferogram signal as it passes through the Fourier-transform
spectrometer.
$$ z(\chi) = \int_{-\infty}^{\infty} Z_{eff}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma $$
must also be a real and even function of the optical-path difference χ. After the interferogram
signal passes through the detector circuit, it is still, of course, real, but there is no reason to
suppose that it is still even.
Suppose we look first at the simpler case where cos α ε can be approximated as one. Then,
according to Eq. (5.48a), we have
$$ Z_{eff}(\sigma) = \frac{W}{4}\,H(u\sigma)\,S(\sigma)\,M(R\sigma\theta_{ma}) . $$
From Eq. (5A.6b) in Appendix 5A, we know that the transfer function H is Hermitian,
$$ H(-u\sigma) = H(u\sigma)^{*} , \tag{5.49a} $$
and the discussion following (5A.6b) points out that H must have a nonzero imaginary part. We
know that W = +1 or −1 and that S(σ) and M(Rσθ_ma) are both real. From Eqs. (5.39a) and
(5.10f), we know that
$$ S(-\sigma) = S(\sigma) $$
and
$$ M(-R\sigma\theta_{ma}) = M(R\sigma\theta_{ma}) $$
are even. Hence the transfer function H in Eq. (5.48a) must give a nonzero imaginary part to
Z_eff, and consequently all that can be said about Z_eff is that it is Hermitian:
Symmetries of the Interferogram Signal and Effective Spectrum · 5.12
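$$ \begin{aligned} Z_{eff}(-\sigma) &= \frac{W}{4}\,H(-u\sigma)\,S(-\sigma)\,M(-R\sigma\theta_{ma}) = \frac{W}{4}\,H(u\sigma)^{*}\,S(\sigma)\,M(R\sigma\theta_{ma}) \\ &= \left[ \frac{W}{4}\,H(u\sigma)\,S(\sigma)\,M(R\sigma\theta_{ma}) \right]^{*} \\ &= Z_{eff}(\sigma)^{*} . \end{aligned} \tag{5.49b} $$
This makes
$$ z(\chi) = \int_{-\infty}^{\infty} Z_{eff}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma $$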
the inverse Fourier transform of a Hermitian function. Therefore, according to entry 7 in Table
2.1 of Chapter 2, z(χ) must be real but need not be even. In fact, if z(χ) were both even and real,
then entry 1 of Table 2.1 states that Z_eff must be both real and even; that is, entry 1 requires
Z_eff to have a zero imaginary part when z(χ) is even. Since we already know that Z_eff must have
a nonzero imaginary part, we conclude that z(χ) cannot be an even function of χ. So already in the
simpler case where cos α ε is approximated as one, the interferogram signal cannot be even after
passing through the detector circuit.
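As a numerical illustration of this argument, the following sketch builds a discrete effective spectrum of the form given in Eq. (5.48a) and checks the symmetry of its inverse transform. It is only a toy model under stated assumptions: the Gaussian band used for S(σ), the single-pole low-pass form chosen for H, the time constant `tau`, the scan speed u = 1, and the choices W = +1 and M = 1 are all hypothetical, and the FFT stands in for the continuous Fourier integral.

```python
import numpy as np

n = 1024
sigma = np.fft.fftfreq(n, d=1.0 / n)        # wavenumber grid in FFT order (includes negatives)

# Assumed even, real source spectrum S(sigma): a Gaussian band in |sigma|.
S = np.exp(-((np.abs(sigma) - 100.0) / 20.0) ** 2)

# Assumed Hermitian transfer function with a nonzero imaginary part (u = 1).
tau = 0.002
H = 1.0 / (1.0 + 2j * np.pi * sigma * tau)


def interferogram(transfer):
    """Discrete stand-in for z(chi): inverse transform of (W/4) H S with W = 1, M = 1."""
    return np.fft.ifft(0.25 * transfer * S)


def asymmetry(z):
    """Largest difference between z(chi) and z(-chi), relative to the peak of |z|."""
    z_reversed = np.roll(z[::-1], 1)         # samples of z(-chi) on the periodic grid
    return np.max(np.abs(z - z_reversed)) / np.max(np.abs(z))


z_circuit = interferogram(H)                 # after the detector circuit
z_ideal = interferogram(np.ones(n))          # H = 1: detector circuit neglected

imag_fraction = np.max(np.abs(z_circuit.imag)) / np.max(np.abs(z_circuit.real))
print("relative imaginary part:", imag_fraction)
print("asymmetry with H = 1:   ", asymmetry(z_ideal))
print("asymmetry with circuit: ", asymmetry(z_circuit))
```

In this sketch the signal stays real to rounding error in both cases, but only the H = 1 case is even; the Hermitian transfer function makes the interferogram visibly asymmetric about χ = 0.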
The interferogram signal, in fact, always becomes uneven after passing through the detector
circuit. To see why this is so, we return to Eq. (5.46a), which holds true both when cos α ε can be
approximated as one and when it cannot. According to the Fourier convolution theorem, the
right-hand side, which is now the inverse Fourier transform of the product of two functions, can
be replaced by a convolution to get [see Eq. (2.39c) in Chapter 2]
$$ h(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\,H(f)\,df \tag{5.50b} $$
is the impulse-response function of the detector circuit, as described at the beginning of Appendix
5A, and
$$ g_{in}(t) = \int_{-\infty}^{\infty} e^{2\pi i f t}\,G_{in}(f)\,df . \tag{5.50c} $$
In Eq. (5.43) we defined gin to be the signal as it leaves the detector and enters the detector
circuit, and in the discussion following (5.43) Gin was defined to be the Fourier transform of gin.
Hence gin must be the inverse Fourier transform of Gin as shown in Eq. (5.50c). We know from
Eqs. (5.43) and (5.38b) that gin is an even function of time when t = 0 is chosen to coincide with
$$ z(\chi) = g_{out}(\chi/u) $$
can never be an even function of χ. No assumptions have been made about the value of cos α_ε, so
this result clearly holds true whether or not we approximate cos α ε by one in the double integral
over the interferometer’s field of view.
One last point worth making is that, although we now know that z(χ) cannot be strictly even,
detector circuits are often designed to preserve the major features of the signals passing through
them, making the delays with which signals pass through the circuit small compared to the time
scale on which the signal fluctuates. Consequently in (5.50a) we then have
$$ g_{out}(t) \approx g_{in}(t) $$
so that
$$ z(\chi) \approx g_{in}(\chi/u) . $$
Now, since g_in is an even function, z(χ) is an approximately even function so that
$$ z(-\chi) \cong z(\chi) . $$
In some systems, the output signal of the detector circuit may have to be examined quite closely
to confirm that it is not a strictly even function of its argument.
Background Radiation Inside a Standard Michelson Interferometer · 5.13
optical elements may be as strong a source of infrared radiance as the object itself.
Figure 5.21 shows that the fore optics’ background masquerades as an additional type of
radiance entering the interferometer. To include the fore optics’ background in our formulas, we
add a background term to the input spectrum S(σ) defined in Eq. (5.36a),
$$ S(\sigma) \rightarrow S(\sigma) + S^{(fore)}(\sigma) . $$
The S^(fore)(σ) term is just like S(σ) in (5.36a) except that, since the radiance L^(fore)(σ) coming
from the fore optics does not have to pass through the fore optics before reaching the
interferometer, we set τ_f(σ) = 1 to get
Remembering that the formula for S(σ) in Eq. (5.36a) is made into an even function of σ in
(5.39a), we do the same thing to S^(fore)(σ) by writing
As before, there is no need to add absolute value signs to the wavenumber argument of η(σ)
because, according to Eq. (4.139g) in Chapter 4, it is already an even function of wavenumber.
Here we implicitly assume that the detector’s field of view ΔΩ for the fore optics is the same as
its field of view ΔΩ for the external source, which is usually a good approximation for
well-designed systems.
Now when we consider Eq. (5.37a) for the signal leaving the detector,
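$$ K_{bal}(\chi) = \frac{1}{2}K_0 + W K_{Ibal}(\chi) , \tag{5.51b} $$
$$ K_0 = \int_0^{\infty} \left[ S(\sigma) + S^{(fore)}(\sigma) \right] d\sigma = \int_0^{\infty} S(\sigma)\,d\sigma + \int_0^{\infty} S^{(fore)}(\sigma)\,d\sigma \tag{5.51c} $$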
FIGURE 5.21. [Diagram of a standard Michelson interferometer showing the fore optics, ideal beam splitter, moving mirror, fixed mirror, aft optics, and circular detector.] The warm surfaces of the fore and aft optics emit infrared background radiation in both directions along the interferometer’s optical axis.
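$$ \begin{aligned} K_{Ibal}(\chi) &= \frac{1}{4}\int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(fore)}(\sigma) \right] M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\ &= \frac{1}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\ &\quad + \frac{1}{4}\int_{-\infty}^{\infty} S^{(fore)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \end{aligned} \tag{5.51e} $$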
when cos α ε is not approximated by one in (5.40b). Unfortunately the background radiance
generated by the aft optics cannot be handled this simply.
Figure 5.21 shows that the background radiance generated by the aft optics travels in two
different directions—directly to the detector and backwards into the interferometer. The detector
sees the aft optics’ radiation that shines directly on it as a constant level of infrared illumination,
introducing a new constant term into the detector signal. This term can be written as
$$ S^{(dir)} = A_{det}\,\Delta\Omega^{(dir)} \int_0^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma , \tag{5.51f} $$
is the background optical power contributed by warm surfaces emitting a spectral radiance
L^(dir)(σ) uniformly over a solid angle ΔΩ^(dir) as seen from the detector of area A_det. Just like the
constant term in the interference signal coming from the source, this additional constant signal is
removed by the detector’s AC coupling to the detector circuit and for that reason can be
disregarded (it should, however, be taken into account when calculating the noise terms in the
next chapter). The aft optics’ radiance going backward into the interferometer, on the other hand,
interferes with itself as it passes “backwards” through the interferometer, generating an
interference signal that depends on χ, the optical-path difference. Some of this χ-dependent
optical signal ends up returning to the detector. As the moving mirror changes its position, this
interference signal also changes, generating a time-dependent signal capable of passing through
the AC coupling to the rest of the system. In Sec. 4.17 of Chapter 4, we call this the unbalanced
background signal and derive a formula for P_unb^(back)(χ), the power in the unbalanced background
signal at an optical-path difference χ.
Working at the same level of idealization as in the analysis of the balanced interference signal
reaching the detector, we set γ ≅ 1 to neglect substrate absorption in formula (4.163a) for
P_unb^(back)(χ) from Chapter 4 to get
$$ P_{unb}^{(back)}(\chi) = \frac{A}{2}\int_{-\infty}^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\; L^{(back)}(\sigma) \left\{ 2|r(\sigma)|^2 - \eta(\sigma) - W\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\} . \tag{5.52} $$
Here Eq. (5.10c) is used to substitute M for the original Bessel-function ratio, and A again refers
to the area of the aperture in the aft optics that specifies the cross-sectional area of the beam
passing through the interferometer. The double integral over d 2ε can be taken over the
detector’s field of view of the exterior source, since in well-designed systems this is usually a
good approximation for the detector’s background field of view. The L(back) (σ ) function refers to
all the radiance entering the back end of the interferometer, not only the background radiance
coming directly from the aft optics but also radiance emitted from the detector itself that passes
backwards through the aft optics before entering the back end of the interferometer. This is why
the unbalanced background signal is sometimes called the “Narcissus” interference signal,
because it can come in part from the detector “looking at itself” in the interferometer.
From Eq. (5.10f) in this chapter and Eqs. (4.139a), (4.139g), and (4.162b) of Chapter 4, we
know that M, |r|², η, and L^(back) are all even functions of σ, as is, of course, cos(2πσχ cos α_ε).
Hence, the double integral
$$ \iint_{\text{field of view}} d^2\varepsilon\; L^{(back)}(\sigma) \left\{ 2|r(\sigma)|^2 - \eta(\sigma) - W\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\} $$
has the same value at σ and −σ, making it another even function of σ. Equation (5.52) can thus be
written as
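$$ P_{unb}^{(back)}(\chi) = \frac{A}{2}\int_0^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\; L^{(back)}(\sigma) \left\{ 2|r(\sigma)|^2 - \eta(\sigma) - W\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\} \tag{5.53a} $$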
as the spectral radiance of the infrared background entering the back end of the interferometer.
When cos α_ε can be approximated as one, this equation reduces to
$$ P_{unb}^{(back)}(\chi) = \frac{A\,\Delta\Omega}{2}\int_0^{\infty} L^{(back)}(\sigma) \left[ 2|r(\sigma)|^2 - \eta(\sigma) - W\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi) \right] d\sigma , \tag{5.53b} $$
with
$$ \Delta\Omega = \iint_{\text{field of view}} d^2\varepsilon . \tag{5.53c} $$
Keeping in mind the definition of M given in Eq. (5.10c) and our approximation that γ ≅ 1, we
see that (5.53b) is the same as Eq. (4.163c) in Chapter 4.
Just as we did for the power in the balanced signal, we can interpret the integrals over dσ in
Eqs. (5.53a) and (5.53b) to be sums over all the power contributions of all the monochromatic
wavenumber components σ of the background radiation. Hence, when cos α_ε is approximated by
one in Eq. (5.53b), we say that
$$ d\sigma\cdot\frac{A\,\Delta\Omega}{2}\,L^{(back)}(\sigma) \left[ 2|r(\sigma)|^2 - \eta(\sigma) - W\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi) \right] $$
is the power carried by the σth wavenumber component leaving the interferometer and traveling
toward the detector; and when cos α_ε cannot be approximated by one in Eq. (5.53a), we make the
same claim for
$$ d\sigma\cdot\frac{A}{2}\iint_{\text{field of view}} d^2\varepsilon\; L^{(back)}(\sigma) \left\{ 2|r(\sigma)|^2 - \eta(\sigma) - W\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\} . $$
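Following the same reasoning used in Secs. 5.8 and 5.9 above to analyze the power in the
balanced optical signal, we multiply these expressions first by the aft optics’ transmission τ_a(σ)
to get the fraction of each power component passing from the interferometer to the detector and then
by the detector responsivity R(σ) to get the signal component produced by the interferometer’s
detector. This makes
$$ K_{unb}(\chi) = \frac{A}{2}\int_0^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\; \tau_a(\sigma)\,R(\sigma)\,L^{(back)}(\sigma) \left\{ 2|r(\sigma)|^2 - \eta(\sigma) - W\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\} \tag{5.54a} $$
the total unbalanced interference signal leaving the detector when cos α_ε cannot be approximated
as one, and
$$ K_{unb}(\chi) = \frac{A\,\Delta\Omega}{2}\int_0^{\infty} \tau_a(\sigma)\,R(\sigma)\,L^{(back)}(\sigma) \left[ 2|r(\sigma)|^2 - \eta(\sigma) - W\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi) \right] d\sigma , \tag{5.54b} $$
the total unbalanced interference signal leaving the detector when cos α_ε can be approximated as
one.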
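Following the pattern of Eq. (5.37a), we can write Eqs. (5.54a) and (5.54b) as
$$ K_{unb}(\chi) = \frac{1}{2}K_0^{(unb)} + W K_{Iunb}(\chi) , \tag{5.55a} $$
where
$$ K_{Iunb}(\chi) = -\frac{A}{2}\int_0^{\infty} d\sigma \iint_{\text{field of view}} d^2\varepsilon\; \tau_a(\sigma)\,R(\sigma)\,L^{(back)}(\sigma)\,\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \tag{5.55b} $$
when cos α_ε cannot be approximated as one, and
$$ K_{Iunb}(\chi) = -\frac{A\,\Delta\Omega}{2}\int_0^{\infty} \tau_a(\sigma)\,R(\sigma)\,L^{(back)}(\sigma)\,\eta(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi)\,d\sigma \tag{5.55c} $$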
when cos α ε can be approximated as one. No matter how cos α ε is approximated, we have
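$$ K_0^{(unb)} = A\,\Delta\Omega \int_0^{\infty} \tau_a(\sigma)\,R(\sigma)\,L^{(back)}(\sigma) \left[ 2|r(\sigma)|^2 - \eta(\sigma) \right] d\sigma . \tag{5.55d} $$
$$ K_{Iunb}(\chi) = -\frac{1}{2}\int_0^{\infty} S^{(back)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right\} d\sigma \tag{5.56b} $$
$$ K_{Iunb}(\chi) = -\frac{1}{2}\int_0^{\infty} S^{(back)}(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\cos(2\pi\sigma\chi)\,d\sigma , \tag{5.56c} $$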
There is, of course, no need to put absolute value signs on the argument of η because we already
know from Eq. (4.139g) of Chapter 4 that it is even. Since the cosine is an even function of its
argument and, according to Eq. (5.10f), so is M, we recognize that now both
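are even functions of σ. Repeating the same argument that has already been used before to
convert cosine integrals over even functions into Fourier transforms, we note that
$$ \begin{aligned} \int_0^{\infty} S^{(back)}(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\left[ \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \right] d\sigma &= \frac{1}{2}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\left[ \cos(2\pi\sigma\chi\cos\alpha_\varepsilon) + i\sin(2\pi\sigma\chi\cos\alpha_\varepsilon) \right] d\sigma \\ &= \frac{1}{2}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon}\,d\sigma \end{aligned} \tag{5.57b} $$
because
$$ S^{(back)}(\sigma)\cdot M(R\sigma\theta_{ma})\cdot\sin(2\pi\sigma\chi\cos\alpha_\varepsilon) $$
is an odd function of σ, making its integral over σ between −∞ and +∞ equal to zero for all values
of cos α_ε [see Eq. (2.17) in Chapter 2]. Hence, Eq. (5.57b) can be used to write (5.56b) as
$$ \begin{aligned} K_{Iunb}(\chi) &= -\frac{1}{2}\cdot\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon \int_0^{\infty} d\sigma\; S^{(back)}(\sigma)\,M(R\sigma\theta_{ma})\cos(2\pi\sigma\chi\cos\alpha_\varepsilon) \\ &= -\frac{1}{4}\cdot\frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon \int_{-\infty}^{\infty} d\sigma\; S^{(back)}(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \end{aligned} $$
or
$$ K_{Iunb}(\chi) = -\frac{1}{4}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\,M(R\sigma\theta_{ma}) \left[ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right] d\sigma \tag{5.58a} $$
when cos α_ε cannot be approximated as one. When cos α_ε can be approximated as one, (5.57b)
with cos α_ε = 1 can be used to write (5.56c) as
$$ K_{Iunb}(\chi) = -\frac{1}{4}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\cdot M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{5.58b} $$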
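The identity in Eq. (5.57b) is easy to check numerically. The sketch below compares the cosine integral over σ ≥ 0 with one half of the complex-exponential integral over all σ. The Gaussian band standing in for the product S^(back)(σ)·M(Rσθ_ma), the values of `chi` and `cos_alpha`, and the simple Riemann sums used in place of the integrals are assumptions made only for this illustration.

```python
import numpy as np

# Symmetric wavenumber grid wide enough that the assumed spectrum vanishes at its ends.
sigma = np.linspace(-400.0, 400.0, 160001)
d_sigma = sigma[1] - sigma[0]

# Assumed even stand-in for the product S_back(sigma) * M(R sigma theta_ma).
SM = np.exp(-((np.abs(sigma) - 120.0) / 15.0) ** 2)

chi = 0.013          # assumed optical-path difference
cos_alpha = 0.998    # assumed off-axis cosine factor
phase = 2.0 * np.pi * sigma * chi * cos_alpha

# Left-hand side of Eq. (5.57b): cosine integral over sigma >= 0.
positive = sigma >= 0.0
half_line = np.sum(SM[positive] * np.cos(phase[positive])) * d_sigma

# Right-hand side: one half of the complex-exponential integral over all sigma.
full_line = 0.5 * np.sum(SM * np.exp(1j * phase)) * d_sigma

print("cosine integral over sigma >= 0:", half_line)
print("half of exponential integral:   ", full_line.real)
print("imaginary part (should vanish): ", full_line.imag)
```

The imaginary part cancels because the sine term is odd in σ, which is exactly the step used to pass from the cosine form to the exponential form.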
To get all of the interference signal reaching the detector from the source, the fore optics’
background, and the aft optics’ background, we add together the expressions for the signal
components from the source, the fore optics’ background, and the aft optics’ background.
Equation (5.51b) specifies the combined signal and fore optics’ background, and Eqs. (5.51f) and
(5.55a) give the signal coming from the aft optics’ background. Adding all these formulas
together gives
If cos α ε cannot be approximated as one, Eq. (5.59a) expands to, after applying Eqs. (5.51c)–
(5.51f), (5.55d), (5.58a), and (5.58b),
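$$ \begin{aligned} K_{tot}(\chi) ={}& A_{det}\,\Delta\Omega^{(dir)}\int_0^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma + \frac{1}{2}\int_0^{\infty} S(\sigma)\,d\sigma \\ &+ \frac{1}{2}\int_0^{\infty} S^{(fore)}(\sigma)\,d\sigma + \frac{A\,\Delta\Omega}{2}\int_0^{\infty} \tau_a(\sigma)\,R(\sigma)\,L^{(back)}(\sigma)\left[ 2|r(\sigma)|^2 - \eta(\sigma) \right] d\sigma \\ &+ \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\ &+ \frac{W}{4}\int_{-\infty}^{\infty} S^{(fore)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\ &- \frac{W}{4}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \end{aligned} \tag{5.59b} $$
and when cos α_ε can be approximated as one, it expands to
$$ \begin{aligned} K_{tot}(\chi) ={}& A_{det}\,\Delta\Omega^{(dir)}\int_0^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma + \frac{1}{2}\int_0^{\infty} S(\sigma)\,d\sigma \\ &+ \frac{1}{2}\int_0^{\infty} S^{(fore)}(\sigma)\,d\sigma + \frac{A\,\Delta\Omega}{2}\int_0^{\infty} \tau_a(\sigma)\,R(\sigma)\,L^{(back)}(\sigma)\left[ 2|r(\sigma)|^2 - \eta(\sigma) \right] d\sigma \\ &+ \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma \\ &+ \frac{W}{4}\int_{-\infty}^{\infty} S^{(fore)}(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma \\ &- \frac{W}{4}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma . \end{aligned} \tag{5.59c} $$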
χ = ut ,
then the constant terms (that is, the terms that do not depend on χ) do not make it past the detector
circuit that AC couples the detector to the rest of the system. According to the discussion
following Eq. (5A.2a) in Appendix 5A, if we know what the output of the linear detector circuit
is for each individual component of a sum of input signals, then we know that the output of the
linear detector circuit for the sum of the input signals is the sum of the outputs of the individual
components. Using χ = ut to represent the nonconstant terms, we already know from the
procedure used to transform Eq. (5.42a) to (5.47b) that the term
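$$ W K_{Ibal}(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma $$
becomes
$$ \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma $$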
in Eq. (5.47b) when cos α ε cannot be approximated as one. Consequently, when the same term
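$$ \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma $$
occurs in Eq. (5.59b), we know that it comes out of the detector circuit as
$$ \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma . $$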
Passage through the detector circuit just introduces a factor of H(σ u cos α ε ) into the integral
over the field of view when cos α ε cannot be approximated as one. Examining the other two
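nonconstant terms in Eq. (5.59b), we note that the only difference between them and the term just
analyzed is the way the S(σ) function is labeled: for one of the input terms we have
$$ S(\sigma) \rightarrow S^{(fore)}(\sigma) $$
and for the other we have
$$ S(\sigma) \rightarrow -S^{(back)}(\sigma) . $$
It follows that
$$ \frac{W}{4}\int_{-\infty}^{\infty} S^{(fore)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma $$
becomes
$$ \frac{W}{4}\int_{-\infty}^{\infty} S^{(fore)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma , $$
and that
$$ -\frac{W}{4}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma $$
becomes
$$ -\frac{W}{4}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma . $$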
We now know what the output of the detector circuit is for each nonconstant component of the
sum in Eq. (5.59b), and we have already noted that the χ-independent, constant terms in (5.59b)
have zero output. Knowing what the output is for each individual component of the sum in
(5.59b), we can write down the total output of (5.59b) as the sum of the outputs of each
individual component to get, when cos α ε cannot be approximated as one, that the total signal
leaving the detector circuit is
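$$ \begin{aligned} z_{tot}(\chi) ={}& \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\ &+ \frac{W}{4}\int_{-\infty}^{\infty} S^{(fore)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\ &- \frac{W}{4}\int_{-\infty}^{\infty} S^{(back)}(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\ ={}& \frac{W}{4}\int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(fore)}(\sigma) - S^{(back)}(\sigma) \right] \cdot M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma . \end{aligned} \tag{5.60a} $$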
To get the total signal leaving the detector circuit when cos α ε can be approximated as one, we
$$ z_{tot}(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(fore)}(\sigma) - S^{(back)}(\sigma) \right] H(\sigma u)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{5.60b} $$
to dispose of the integral over the field of view. Equations (5.60a) and (5.60b) show that to
include the effect of the background radiance in the standard formulas for the signal leaving the
detector circuit, we need only replace the original source spectrum S(σ) in Eqs. (5.47b) and
(5.47c) with
$$ S(\sigma) \rightarrow S(\sigma) + S^{(fore)}(\sigma) - S^{(back)}(\sigma) . \tag{5.60c} $$
Equations (5.40g), (5.51a), and (5.57a) are now substituted into (5.60c) to get
$$ L(\sigma) \rightarrow L(\sigma) + \frac{L^{(fore)}(\sigma)}{\tau_f(\sigma)} - \frac{L^{(back)}(\sigma)}{\tau_f(\sigma)} . \tag{5.60d} $$
When the background radiance L( back ) is very large, the signal ztot leaving the detector circuit can
quite literally be the transform of a “negative” spectrum. The replacement rules given in (5.60c)
and (5.60d) are one reason we only need to keep track of the input radiance L when analyzing the
noise-free signal leaving the detector circuit—because (5.60c) or (5.60d) can be used at any point
to reintroduce the background radiances into the Fourier transforms. The next section gives
another reason the background radiances can be disregarded: they are easy to eliminate from the
signal leaving the detector circuit before any attempt is made to measure the input radiance
spectrum.
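$$ \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \frac{W}{4}\,H(\sigma u)\,S(\sigma)\,M(R\sigma\theta_{ma}) . \tag{5.61a} $$
$$ \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \frac{A\,\Delta\Omega\,W}{4}\,L(\sigma)\,H(\sigma u)\,R(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,M(R\sigma\theta_{ma}) , \tag{5.61b} $$
$$ L(\sigma) = \left[ \frac{A\,\Delta\Omega\,W}{4}\,H(\sigma u)\,R(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,M(R\sigma\theta_{ma}) \right]^{-1} \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \tag{5.61c} $$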
Before we started analyzing the interferometer’s background radiance, this sort of equation had
been enough to explain how to find the source radiance, since in a well-aligned interferometer
M ≅ 1 and all the other quantities,
A, ΔΩ, W, R, η, τ_a, τ_f, H, and u,
$$ L(\sigma) = \left[ \frac{A\,\Delta\Omega\,W}{4}\,H(\sigma u)\,R(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma) \right]^{-1} \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \tag{5.61d} $$
Removing the Background Spectra · 5.14
to get a formula for what we want to measure in terms of the Fourier transform of z(χ) and other
known quantities. Now, however, we know from the work done in the previous section that when
measuring infrared spectra there may be significant amounts of background radiance
contaminating the source spectrum. Equations (5.60a) and (5.60b) show that if the background
radiance cannot be neglected, then the signal leaving the detector circuit is not z(χ) but rather
z_tot(χ), which is not the correct signal to substitute into equations such as (5.61d).
To recover z(χ) from z_tot(χ) there must be two measurements made: one looking at the source
and one looking at nothing at all. No matter how cos α_ε is approximated, when the
interferometer observes an extremely cold source, it produces a signal in Eqs. (5.60a) and (5.60b)
in which S(σ), the infrared source spectrum, is very small compared to the background spectra
S^(fore)(σ) and S^(back)(σ). To match the notation used in Chapter 6, where the background
radiances play a more important role than they do here, we call this signal z_C^(cold)(χ). According
to Eqs. (5.60a) and (5.60b), z_C^(cold)(χ) can be written as
$$ z_C^{(cold)}(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} \left[ S^{(fore)}(\sigma) - S^{(back)}(\sigma) \right] \cdot M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \tag{5.62a} $$
when cos α_ε cannot be approximated as one, and
$$ z_C^{(cold)}(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} \left[ S^{(fore)}(\sigma) - S^{(back)}(\sigma) \right] H(\sigma u)\,M(R\sigma\theta_{ma})\,e^{2\pi i\sigma\chi}\,d\sigma \tag{5.62b} $$
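when cos α_ε can be approximated as one. Assuming the interferometer is stable, meaning that the
background radiances of the instrument do not change, we can then measure z_tot(χ) as given in
formulas (5.60a) and (5.60b) and subtract from it z_C^(cold)(χ) as defined in Eqs. (5.62a) and
(5.62b). This gives
$$ \begin{aligned} z(\chi) &= z_{tot}(\chi) - z_C^{(cold)}(\chi) \\ &= \frac{W}{4}\int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(fore)}(\sigma) - S^{(back)}(\sigma) \right] \cdot M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\ &\quad - \frac{W}{4}\int_{-\infty}^{\infty} \left[ S^{(fore)}(\sigma) - S^{(back)}(\sigma) \right] \cdot M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \\ &= \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega}\iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u\cos\alpha_\varepsilon)\,e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma \end{aligned} \tag{5.62c} $$
when cos α_ε cannot be approximated as one, and
$$ \begin{aligned} z(\chi) &= z_{tot}(\chi) - z_C^{(cold)}(\chi) \\ &= \frac{W}{4}\int_{-\infty}^{\infty} \left[ S(\sigma) + S^{(fore)}(\sigma) - S^{(back)}(\sigma) \right] M(R\sigma\theta_{ma})\,H(\sigma u)\,e^{2\pi i\sigma\chi}\,d\sigma \\ &\quad - \frac{W}{4}\int_{-\infty}^{\infty} \left[ S^{(fore)}(\sigma) - S^{(back)}(\sigma) \right] M(R\sigma\theta_{ma})\,H(\sigma u)\,e^{2\pi i\sigma\chi}\,d\sigma \\ &= \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\,M(R\sigma\theta_{ma})\,H(\sigma u)\,e^{2\pi i\sigma\chi}\,d\sigma \end{aligned} \tag{5.62d} $$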
when cos α_ε can be approximated as one. This is one of the ways infrared spectroscopists using
interferometers with uncooled optics can eliminate unwanted background spectra and retrieve the
desired z(χ) interferogram signal associated with the source spectrum. Designers of satellite
interferometers almost always schedule some form of “space look” where the instrument
observes nothing but empty space, containing only distant sources of radiation too dim for the
instrument to detect. This sort of space look allows it to acquire the information needed to find
the z_C^(cold)(χ) signal generated by its own internal warmth. A quick way of achieving the same
effect on the ground is to point the interferometer at a surface cooled by liquid nitrogen or, for
greater accuracy, liquid helium.
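The cold-reference subtraction can be sketched numerically as follows. This is a minimal illustration of Eqs. (5.60b), (5.62b), and (5.62d) under assumed conditions, not a description of any particular instrument: the three Gaussian bands, the single-pole transfer function H, the choices W = +1, M = 1, and u = 1, and the helper names `band` and `interferogram` are all hypothetical, and the FFT replaces the continuous Fourier integral.

```python
import numpy as np

n = 2048
sigma = np.fft.fftfreq(n, d=1.0 / n)                 # wavenumber grid in FFT order
H = 1.0 / (1.0 + 2j * np.pi * sigma * 1e-3)          # assumed detector-circuit transfer function


def band(center, width, height):
    """Hypothetical even spectrum: a Gaussian band in |sigma|."""
    return height * np.exp(-((np.abs(sigma) - center) / width) ** 2)


def interferogram(spectrum):
    """Discrete analogue of Eq. (5.60b) with W = 1 and M = 1."""
    return np.fft.ifft(0.25 * H * spectrum)


S_source = band(300.0, 10.0, 1.0)                    # narrow source line
S_fore = band(250.0, 120.0, 0.4)                     # broad fore-optics background
S_back = band(250.0, 120.0, 0.9)                     # stronger aft-optics background

z_tot = interferogram(S_source + S_fore - S_back)    # looking at the source
z_cold = interferogram(S_fore - S_back)              # looking at a cold scene
z = z_tot - z_cold                                   # background-free interferogram

# Undo the (W/4) H factor to recover the source spectrum from z alone.
recovered = (4.0 * np.fft.fft(z) / H).real

print("most negative value of the net input spectrum:", np.min(S_source + S_fore - S_back))
print("largest error in recovered source spectrum:   ", np.max(np.abs(recovered - S_source)))
```

Because the assumed back-end background exceeds the fore-optics background, the net input spectrum in this toy case is negative over part of the band, yet the subtraction still returns the source spectrum as long as the two backgrounds are the same in both measurements.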
Now that we know how to extract z(χ) from the unwanted background, the presence of the
background signal can be disregarded when analyzing nonrandom spectral distortions introduced
by nonideal interferometer measurements. This is what we do for the rest of this chapter (except
for Sec. 5.19, where we discuss one common method of extracting a radiance measurement from
the raw signal spectrum). These formulas do, however, return in the next chapter because the
background signal can have a significant effect on the amount of random noise present in the
measurement.
$$ \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi $$
of the interferogram signal z(χ) leaving the detector circuit. It is, of course, impossible to measure
z for all optical-path differences χ between −∞ and +∞, so there is no hope of calculating the
direct, unadulterated Fourier transform of z. We must therefore settle for an approximation of the
Fourier transform, and there are two different ways to do this—one using finite-length, double-
sided measurements of the interferogram signal and one using finite-length, single-sided
measurements of the interferogram signal. Because it is conceptually simpler, we start with the
double-sided interferogram measurement, postponing discussion of the single-sided
interferogram until Sec. 5.18 below.
As was remarked at the end of Sec. 5.12, the interferogram signal leaving the detector circuit
is usually approximately (although not exactly) even, so that it tends to look as shown in Fig.
5.22 when plotted as a function of χ. In a double-sided interferogram measurement, there is a
positive length D such that the signal z(χ) is measured for all
$$ -D \le \chi \le D , $$
or
$$ |\chi| \le D . $$
When z is only measured for |χ| ≤ D, there is no way to know what z is in the regions marked
with question marks “?” in Fig. 5.22, and in a double-sided interferogram measurement, the value
of z(χ) in these regions is assumed to be, if not negligible, at any rate unimportant. The Fourier
transform of z then becomes,
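$$ \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \approx \int_{-D}^{D} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \int_{-\infty}^{\infty} \Pi(\chi, D)\,z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \tag{5.63a} $$
FIGURE 5.22. [Plot of the interferogram signal against χ: the measured interval of length 2D runs from χ = −D to χ = D and is centered on χ = 0; the unmeasured regions beyond χ = ±D are marked “?”.]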
Double-Sided Interferograms · 5.15
Eq. (5.48b)],
$$ Z_{eff}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \tag{5.64a} $$
Rewriting Eq. (5.61d) by substituting (5.64a) for the Fourier transform of z gives
$$ L(\sigma) = \left[ \frac{A\,\Delta\Omega\,W}{4}\,H(\sigma u)\,R(\sigma)\,\eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma) \right]^{-1} Z_{eff}(\sigma) . \tag{5.64b} $$
The terms inside the square brackets are usually designed to be slowly varying functions over the
range of wavenumbers for which L(σ) is being measured. This means Z_eff(σ) contains the fine
details of spectrum L(σ). Since L(σ) is real and, according to the discussion following Eq.
(5A.6b) in Appendix 5A, the transfer function H(uσ) is complex, the effective spectrum
Z_eff(σ) in (5.64b) must also be complex. Taking the complex magnitude of both sides of formula
(5.64b), we indicate that Z_eff(σ) carries the fine details of L(σ) by writing
$$ L(|\sigma|) \sim |Z_{eff}(\sigma)| . \tag{5.64c} $$
Although Eq. (5.64b) comes from formulas that apply only when the interferometer’s field of
view is sufficiently narrow that cos α_ε can be approximated as one, the idea expressed by
(5.64c), that Z_eff(σ) carries the fine details of the L(σ) spectrum, holds true even when cos α_ε
cannot be approximated as one.
We now consider what happens to these fine details when what we have is not Z_eff(σ), the
true Fourier transform of z(χ), but rather the double-sided approximation specified in Eq. (5.63a).
The integral in (5.63a) is the Fourier transform of the product Π(χ, D)·z(χ), and by the Fourier
convolution theorem [see Eq. (2.39k) of Chapter 2], this can be written as the convolution of the
Fourier transforms of Π and z. We already know that Z_eff(σ) is the Fourier transform of z, and
the Fourier transform of Π can be evaluated directly as
$$ \int_{-\infty}^{\infty} \Pi(\chi, D)\,e^{-2\pi i\sigma\chi}\,d\chi = \int_{-D}^{D} e^{-2\pi i\sigma\chi}\,d\chi = -\frac{1}{2\pi i\sigma}\left[ e^{-2\pi i\sigma\chi} \right]_{-D}^{D} = 2D\,\mathrm{sinc}(2\pi\sigma D) , \tag{5.65a} $$
previously defined in Eq. (2.106d) of Chapter 2. Hence, by the Fourier convolution theorem
$$ \int_{-\infty}^{\infty} \Pi(\chi, D)\,z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \left[ 2D\,\mathrm{sinc}(2\pi\sigma D) \right] * Z_{eff}(\sigma) . \tag{5.65c} $$
This shows that what we settle for in a double-sided interferogram measurement is the
convolution of Z eff (σ ) with 2 Dsinc(2πσ D) instead of the true Fourier transform Z eff (σ ) .
In the discussion following Eq. (2.39 A ) of Chapter 2, we pointed out that when two functions
are convolved and one of them is much narrower than the other, the narrower function can be
thought of as blurring and distorting the shape of the other. Since what we are interested in is the
fine detail encoded in
$$ Z_{eff}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi , $$
we cannot hope to get even an approximate measurement of this fine detail unless
2D sinc(2πσD) is narrower than Z_eff(σ), the Fourier transform of z. We substitute the right-hand
side of (5.65c), which is our approximation for Z_eff(σ), the Fourier transform of z, into
(5.64c) to get
$$ L_{blur}(|\sigma|) \sim \left| 2D\,\mathrm{sinc}(2\pi\sigma D) * Z_{eff}(\sigma) \right| . \tag{5.66a} $$
The original spectral radiance L(σ) encodes its own fine details at least as well as Z_eff(σ), which
lets us write (5.66a) as
$$ L_{blur}(|\sigma|) \sim \left| 2D\,\mathrm{sinc}(2\pi\sigma D) * L(|\sigma|) \right| $$
or
$$ L_{blur}(|\sigma|) \sim 2D\,\mathrm{sinc}(2\pi\sigma D) * L(|\sigma|) . \tag{5.66b} $$
In the last step, we restrict the magnitude signs to the arguments of L and L_blur because
2D sinc(2πσD) and L(σ) are always real, which makes their convolution real, and because negative
values of the convolution indicate an unphysically distorted measurement of L(σ), since L
cannot be negative. Since, according to Eq. (2.38b) in Chapter 2, it does not matter in what order
two functions are convolved, this can also be written as
$$ L_{blur}(|\sigma|) \sim L(|\sigma|) * 2D\,\mathrm{sinc}(2\pi\sigma D) . $$
Comparing this result to Eq. (2.40a) of Chapter 2, we realize that 2D sinc(2πσD) is playing the
role of an instrument response function. Figure 5.23 reveals the width of function
2D sinc(2πσD) between the two zeros bracketing the central peak to be 1/D. This shows us how
to control the narrowness of the spectrometer’s instrument-response function. When designing
Fourier-transform spectrometers we try to pick D sufficiently large that the blurring sinc
function in (5.66b) does not significantly distort the spectral features of the radiance L(σ) that we
want to measure.
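The closed form of the instrument response function and the location of its first zeros can be checked with a short numerical sketch. The value chosen for D, the wavenumber grid, the midpoint-rule integration, and the helper names `ils` and `boxcar_transform` are assumptions introduced only for this example. Note that NumPy's `np.sinc(x)` is defined as sin(πx)/(πx), whereas the text uses sinc(x) = sin(x)/x, so the argument has to be rescaled.

```python
import numpy as np


def ils(sigma, D):
    """2D sinc(2 pi sigma D) with sinc(x) = sin(x)/x.

    np.sinc(x) is sin(pi x)/(pi x), so the argument passed to it is 2 sigma D.
    """
    return 2.0 * D * np.sinc(2.0 * sigma * D)


def boxcar_transform(sigma, D, n=4000):
    """Midpoint-rule estimate of the integral of exp(-2 pi i sigma chi) over -D <= chi <= D."""
    h = 2.0 * D / n
    chi = -D + (np.arange(n) + 0.5) * h
    return np.sum(np.exp(-2j * np.pi * np.outer(sigma, chi)), axis=1) * h


D = 0.5                                   # assumed maximum optical-path difference
sigma = np.linspace(-3.0, 3.0, 121)

max_error = np.max(np.abs(boxcar_transform(sigma, D) - ils(sigma, D)))
first_zero = 1.0 / (2.0 * D)              # zeros bracketing the central peak: +/- 1/(2D)

print("largest |numerical - analytic|:", max_error)
print("response at +/- 1/(2D):", ils(first_zero, D), ils(-first_zero, D))
print("width between first zeros:", 2.0 * first_zero, "which equals 1/D =", 1.0 / D)
```

Doubling D in this sketch halves the distance between the first zeros, which is the sense in which a longer scan gives a narrower instrument response.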
Figures 5.24(a)–5.24(f) give examples of how this works when the $2D\,\mathrm{sinc}(2\pi\sigma D)$
instrument-response function acts to blur together a pair of ever-closer spectral peaks. We see
that when the peaks are separated by a wavenumber interval
$$\Delta\sigma = \frac{1}{2D} \tag{5.67}$$
all sure knowledge of their separate existence is lost. In Fourier-transform spectrometry, the
quantity $(2D)^{-1}$ is often called the unapodized spectral resolution of the interferometer
measurement. This terminology can be confusing, because a smaller spectral resolution $\Delta\sigma$ now
corresponds to a higher resolving power for the interferometer. The important thing to remember
is that the interferometer’s resolving power (that is, its ability to measure spectral detail) is
directly proportional to $D$. Figures 5.24(a)–5.24(f) also show that when the true spectra are
convolved with sinc-like instrument-response functions, the oscillations in the
instrument-response functions create secondary oscillations in regions where $L(\sigma)$ is changing rapidly. This
is sometimes referred to as “ringing” in the measured spectrum $L_{\mathrm{blur}}(\sigma)$. This ringing can lead to
unphysically negative values in $L_{\mathrm{blur}}(\sigma)$, as shown in Figs. 5.24(b), 5.24(d), and 5.24(f).
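The loss of separation at $\Delta\sigma = 1/(2D)$ can be illustrated numerically. The Python sketch below is a hedged illustration rather than a reproduction of Figs. 5.24(a)–5.24(f): it assumes two equal delta-function lines, so that the blurred spectrum is just the sum of two shifted copies of $2D\,\mathrm{sinc}(2\pi\sigma D)$, and it asks whether a dip survives halfway between the lines. The function names and the value of `D` are assumptions.

```python
import numpy as np

def ils(sigma, D):
    # 2*D*sinc(2*pi*sigma*D) with sinc(x) = sin(x)/x; np.sinc is the normalized form.
    return 2.0 * D * np.sinc(2.0 * sigma * D)

def midpoint_is_dip(separation, D, h=1e-4):
    """True if the blurred sum of two equal lines has a local minimum halfway between them."""
    def blurred(sigma):
        return ils(sigma - separation / 2.0, D) + ils(sigma + separation / 2.0, D)
    # Central-difference estimate of the second derivative at the midpoint sigma = 0.
    curvature = (blurred(h) - 2.0 * blurred(0.0) + blurred(-h)) / h**2
    return curvature > 0.0

D = 1.0  # assumed maximum optical-path difference (arbitrary length unit)
resolved_wide = midpoint_is_dip(3.0 / (2.0 * D), D)   # lines separated by 3*(1/2D)
resolved_limit = midpoint_is_dip(1.0 / (2.0 * D), D)  # lines separated by 1/(2D)
print(resolved_wide, resolved_limit)
```

Under these assumptions a dip remains between the lines at a separation of $3\cdot(1/2D)$, while at $1/(2D)$ the midpoint becomes a single maximum, so the two lines can no longer be told apart by shape alone.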
In Fourier-transform spectroscopy, the instrument-response function is often called the
instrument line shape, or ILS for short. The instrument line shape can be measured by passing a
laser beam through the interferometer. Although all lasers in practice have some spectral width,
they do produce a spectral radiance $L(\sigma)$ that is, as shown in Fig. 5.25(a), very close to a delta
function.86 Figure 5.25(b) plots the curve $L_{\mathrm{blur}}(\sigma)$ produced by a Fourier-transform spectrometer
when it measures the laser spectrum at wavenumber $\sigma = \sigma_0$. We can normalize $L_{\mathrm{blur}}(\sigma)$ so that
the total area under the curve is one, creating a new curve
86
Equation (5.16c) gives the ideal interferogram created by a strictly monochromatic source represented by a delta
function.
5 · Description of Practical Interferometer Measurements
FIGURE 5.23. This is a graph of $2D\,\mathrm{sinc}(2\pi\sigma D)$ versus $\sigma$. [Plot: peak value $2D$ at $\sigma = 0$; the
zeros bracketing the central peak are marked at $1/(2D)$ on either side of the origin.]
$$L^{(\mathrm{norm})}_{\mathrm{blur}}(\sigma) = \left[\int_{0}^{\infty} L_{\mathrm{blur}}(\sigma')\, d\sigma'\right]^{-1} L_{\mathrm{blur}}(\sigma) . \tag{5.68a}$$
The origin of the wavenumber axis is then shifted so that the center of the normalized curve is at
the origin, giving a measurement of the instrument-response function or instrument line shape at
$\sigma = \sigma_0$,
$$I_{LS}(\sigma) = L^{(\mathrm{norm})}_{\mathrm{blur}}(\sigma + \sigma_0) , \tag{5.68b}$$
FIGURES 5.24(a)–5.24(f). [Paired plots of $L(\sigma)$ and $L_{\mathrm{blur}}(\sigma)$ versus $\sigma$ for two spectral peaks
separated by $3\cdot(1/2D)$, $2\cdot(1/2D)$, and $1/2D$.]
as shown in Fig. 5.25(c). To a first approximation (and as a general rule of thumb), we expect to
get about the same shape for $I_{LS}(\sigma)$ no matter what the wavenumber $\sigma_0$ of the laser used to
make the measurement.
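The normalization and shift in Eqs. (5.68a) and (5.68b) can be sketched in a few lines of Python. Everything numerical here is an assumption made for illustration: the "measured" laser line is synthesized from $2D\,\mathrm{sinc}(2\pi(\sigma - \sigma_0)D)$ with an arbitrary scale factor, the wavenumber grid is uniform, and the integral is replaced by a simple Riemann sum. The function name `instrument_line_shape` is hypothetical.

```python
import numpy as np

def instrument_line_shape(sigma, l_blur, sigma0):
    """Normalize a measured laser line to unit area and shift its center to the origin."""
    step = sigma[1] - sigma[0]            # assumes a uniform wavenumber grid
    area = np.sum(l_blur) * step          # Riemann-sum stand-in for the integral in Eq. (5.68a)
    return sigma - sigma0, l_blur / area  # shifted axis and I_LS values, as in Eq. (5.68b)

# Synthetic stand-in for a measured laser line (all values are assumptions).
D, sigma0 = 1.0, 100.0
sigma = np.linspace(0.0, 200.0, 200001)
l_blur = 3.7 * 2.0 * D * np.sinc(2.0 * (sigma - sigma0) * D)
offset, i_ls = instrument_line_shape(sigma, l_blur, sigma0)
print(offset[np.argmax(i_ls)], i_ls.max())
```

Because the arbitrary scale factor cancels in the normalization, the recovered peak height comes out close to $2D$, the peak of the unapodized sinc line shape.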
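One last point worth making is that after the effective spectrum $\mathbf{Z}_{\mathrm{eff}}(\sigma)$ has been blurred by a
convolution with the sinc function, what we end up with is a new effective spectrum
$$\mathbf{Z}_{\mathrm{eff,new}}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.69b}$$
and
$$\Pi(\chi, D)\, z(\chi) = \int_{-\infty}^{\infty} \mathbf{Z}_{\mathrm{eff,new}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.69c}$$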
So even this aspect of the interferogram signal (that we cannot measure it for all optical-path
differences between $-\infty$ and $+\infty$) can be expressed by representing the truncated signal
$$\Pi(\chi, D)\, z(\chi)$$
$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \cong \int_{-\infty}^{\infty} \Pi(\chi, D)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi ,$$
it is perhaps not so obvious that putting function $\Pi(\chi, D)$ inside the integral on the right-hand
side leads to the best possible approximation of the true Fourier transform of $z$. Suppose we
replace $\Pi$ with an arbitrary function of $\chi$ called $a_D(\chi)$, making the approximation that
Apodization of Spectra · 5.16
FIGURE 5.25(a). [Plot of $L(\sigma)$: a narrow laser line at $\sigma = \sigma_0$.]
FIGURE 5.25(b). [Plot of $L_{\mathrm{blur}}(\sigma)$ centered on $\sigma_0$, with $\sigma_0 - 1/(2D)$ and $\sigma_0 + 1/(2D)$ marked on the
wavenumber axis.]
FIGURE 5.25(c). [Plot of $I_{LS}(\sigma)$ centered on the origin, with $-1/(2D)$ and $1/(2D)$ marked on the
wavenumber axis.]
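$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \cong \int_{-\infty}^{\infty} a_D(\chi)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \tag{5.70a}$$
since we do not know what values to give $z$ when $|\chi| > D$. Setting up the problem of
approximating the true Fourier transform in this way (that is, the way it is stated in Eq. (5.70a))
suggests that what we need to do is find that function $a_D(\chi)$ for which the integral
$$\int_{-\infty}^{\infty} a_D(\chi)\, z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi$$
best approximates
$$\int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi .$$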
Trying to approximate the Fourier transform of a function $z$, which is known from only a finite
stretch of data, is not a problem unique to Fourier-transform spectroscopy; in fact, it occurs over
and over again in many different fields of electrical engineering and signal processing. In these
fields, $a_D$ is called the window function and multiplying $z(\chi)$ by $a_D(\chi)$ is referred to as
windowing $z(\chi)$. In Fourier-transform spectroscopy $a_D$ is called the apodization function, and
multiplying $z(\chi)$ by $a_D(\chi)$ is called apodizing the interferogram signal $z$.
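There are several different types of restrictions put on the apodization function $a_D(\chi)$. If
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.71a}$$
then
$$z(0) = \int_{-\infty}^{\infty} \mathbf{Z}_{\mathrm{eff}}(\sigma)\, d\sigma . \tag{5.71b}$$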
When we replace $z(\chi)$ by $a_D(\chi)\, z(\chi)$ in Eq. (5.70a), distorting the shape of the Fourier transform
$\mathbf{Z}_{\mathrm{eff}}(\sigma)$, we want the integral over the distorted spectrum to have the same value as the integral
over the undistorted spectrum in (5.71b). Because the distorted spectrum is by definition the
Fourier transform of $a_D(\chi)\, z(\chi)$, it follows, again using (2.35b) of Chapter 2, that the integral
over the distorted spectrum is $a_D(0)\, z(0)$. Forcing the integrals over the distorted and undistorted
spectra to have the same values now leads to
$$a_D(0) = 1 . \tag{5.71c}$$
It is hard to justify giving the apodization or window function a nonzero imaginary part, so
almost always
$$\mathrm{Im}\left(a_D(\chi)\right) = 0 . \tag{5.71d}$$
According to the discussion at the end of Sec. 5.12, $z(\chi)$ is often an approximately symmetric
function of the optical-path difference $\chi$, which means there is no obvious reason to weight $z(-\chi)$
differently from $z(\chi)$ in the integral on the right-hand side of (5.70a). This suggests that the
apodization should be an even function of the optical-path difference:
$$a_D(-\chi) = a_D(\chi) . \tag{5.71e}$$
From Eqs. (5.71d) and (5.71e), we know that $a_D(\chi)$ is real and even, which means, according to
entry 1 in Table 2.1 of Chapter 2, that the Fourier transform $A_D(\sigma)$ is also real and even. Figures
5.26(a) and 5.26(b) give some of the more popular apodization or window functions and their
corresponding Fourier transforms. Compared to $\Pi(\chi, D)$, they all do a better job of preventing
ringing; in fact, the Bartlett and Parzen window functions, because their Fourier transforms do
not go negative, can never produce unphysical negative values when convolved with the
non-negative true spectrum $L(\sigma)$ [which is the basic shape-determining factor of $\mathbf{Z}_{\mathrm{eff}}(\sigma)$ on the
right-hand side of Eq. (5.72a)]. Apodization functions in fact get their name from the way they can
diminish or remove unsightly ringing at the base of sharp spectral peaks in Fourier
measurements. The “pod” root comes from the Latin word for “foot,” a metaphorical reference to
the small spurious bumps often present at the base of these peaks; and the “a” prefix before the
“pod” shows that apodization is intended to remove (or diminish) the “feet.” As a rule of thumb,
apodizing the interferogram signal is more a matter of aesthetics (making the measured spectrum
look better) than it is a way to reveal previously hidden spectral detail. If there are doubts about
the true shape of a measured spectrum, it is better to increase the value of $D$ than to introduce a
more sophisticated apodization function.
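The contrast between the truncation function $\Pi(\chi, D)$ and a Bartlett (triangular) window can be checked numerically. The Python sketch below is an illustration under stated assumptions: the window definitions `boxcar` and `bartlett`, the helper `window_transform`, and the value of `D` are choices made here, and the Fourier transform is approximated by a Riemann sum of the cosine integral, which is valid because both windows are real and even. Both windows satisfy $a_D(0) = 1$.

```python
import numpy as np

def window_transform(window, D, sigmas, n=2000):
    """Numerically Fourier-transform an even, real window a_D(chi) supported on [-D, D]."""
    chi = np.linspace(-D, D, 2 * n + 1)
    step = chi[1] - chi[0]
    a = window(chi, D)
    # For a real, even window the transform reduces to a cosine integral.
    return np.array([np.sum(a * np.cos(2.0 * np.pi * s * chi)) * step for s in sigmas])

def boxcar(chi, D):
    # Plain truncation: one inside [-D, D], zero outside.
    return np.where(np.abs(chi) <= D, 1.0, 0.0)

def bartlett(chi, D):
    # Triangular window falling linearly from 1 at chi = 0 to 0 at |chi| = D.
    return np.clip(1.0 - np.abs(chi) / D, 0.0, None)

D = 1.0  # assumed maximum optical-path difference (arbitrary length unit)
sigmas = np.linspace(-3.0 / D, 3.0 / D, 601)
box_ft = window_transform(boxcar, D, sigmas)
bart_ft = window_transform(bartlett, D, sigmas)
print(box_ft.min(), bart_ft.min())
```

In this sketch the boxcar transform dips well below zero in its side lobes, while the Bartlett transform stays non-negative to within rounding error, at the cost of a wider central peak.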
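$$z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma}) \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\; H(\sigma u \cos\alpha_\varepsilon)\, e^{2\pi i\sigma\chi\cos\alpha_\varepsilon} \right\} d\sigma . \tag{5.73a}$$
To investigate what happens to this signal when the field of view is sufficiently large that $\cos\alpha_\varepsilon$
is approximately but not exactly equal to one, we write
$$\cos\alpha_\varepsilon \cong 1 - \frac{\alpha_\varepsilon^2}{2} . \tag{5.73b}$$
$$z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi} \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\; e^{-\pi i\sigma\chi\alpha_\varepsilon^2}\, H\!\left(u\sigma - \frac{u\sigma\alpha_\varepsilon^2}{2}\right) \right\} d\sigma . \tag{5.73c}$$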
The Effect of a Finite Field of View · 5.17
FIGURE 5.26(a). [Plot of the Bartlett, Parzen, Tukey, and Hamming apodization or window functions versus $\chi$,
with $-D$ and $D$ marked on the horizontal axis.]
FIGURE 5.26(b). This graph plots the Fourier transforms $A_D(\sigma)$ of the four apodization or
window functions shown in Fig. 5.26(a). [Horizontal axis marked at $-1/D$ and $1/D$.]
The outer integral over $d\sigma$ goes between $-\infty$ and $+\infty$, so as long as $\alpha_\varepsilon^2$ is not zero there is
eventually a value of $\sigma$ large enough to make $\cos\alpha_\varepsilon$ in the expression $\sigma\cos\alpha_\varepsilon$ too large to be
approximated by (5.73b) in the phase formulas in Eqs. (5.73a) and (5.73c). (The first part of
Appendix 4B of Chapter 4 explains why we must be careful when deriving approximations for
the phase.) The first step, then, in treating $\alpha_\varepsilon^2$ as a small quantity in Eq. (5.73c) is to require that
$S(\sigma)$ be zero or negligible for those values of $\sigma$ large enough to invalidate (5.73b). Because $S$ is
even so that [see Eq. (5.39a) above]
$$S(-\sigma) = S(\sigma) ,$$
it follows that $S$ must also be zero or negligible for large negative values of $\sigma$. Glancing back at
the definition of $S$ for positive $\sigma$ in Eq. (5.36a),
$$S(\sigma) = A\,\Delta\Omega\, R(\sigma)\,\eta(\sigma)\, L(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma) ,$$
we see that the behavior of $S$ at large $\sigma$ is under the control of the interferometer designer; for
example, the fore and aft optics can be constructed so that the product $\tau_f(\sigma)\,\tau_a(\sigma)$ is zero or
negligible for large values of $\sigma$. When $\sigma\alpha_\varepsilon^2$ is also multiplied by the optical-path difference $\chi$, as
it is in the phase of
$$e^{-\pi i\sigma\chi\alpha_\varepsilon^2}$$
in Eq. (5.73c), the interferometer designer must also choose an appropriate upper limit on $|\chi|$.
This upper limit was called $D$ in Secs. 5.15 and 5.16 above, so in (5.73a)–(5.73c) we want $D$ to
be chosen small enough that for all
$$|\chi| \le D \tag{5.73d}$$
$$\cos\alpha_\varepsilon = \sqrt{1 - \varepsilon^2} . \tag{5.74a}$$
When the detector’s field of view is small, which means $\varepsilon$ is always close to zero, we can
approximate the square root as
$$\sqrt{1 - \varepsilon^2} \cong 1 - \frac{\varepsilon^2}{2} .$$
$$\cos\alpha_\varepsilon \cong 1 - \frac{\varepsilon^2}{2} \tag{5.74b}$$
inside the double integral over $d^2\varepsilon$ in Eq. (5.73c). Comparing Eq. (5.74b) to (5.73b), we see that
if $\alpha_\varepsilon$ is small (which is, of course, the same as saying the detector’s field of view is small) it
follows that
$$\alpha_\varepsilon^2 \cong \varepsilon^2 . \tag{5.74c}$$
In the discussion at the beginning of Sec. 5.7 above, we interpreted the double integral over $d^2\varepsilon$
in Eq. (5.25b) as a sum over all the plane waves propagating through the interferometer, with $\alpha_\varepsilon$
being the angle of the $\varepsilon$th plane wave’s propagation vector with respect to the optical axis. The
double integral over $d^2\varepsilon$ in Eqs. (5.73a) and (5.73c) can be interpreted in the same way. The
angle $\alpha_\varepsilon$ is always taken to be greater than or equal to zero, so Eq. (5.74c) can be written as
$$\alpha_\varepsilon \cong \varepsilon . \tag{5.74d}$$
Hence, when the propagation angle is small, $\varepsilon$ can be thought of as the angle in radians with
respect to the optical axis at which the $\varepsilon$th plane wave is propagating through the interferometer.
The discussion in Sec. 5.7 shows that interferometer setups that have a circular detector centered
on the optical axis, such as the standard Michelson interferometer shown in Fig. 5.18, have
propagation angles $\alpha_\varepsilon = \varepsilon$ that are at a maximum $\varepsilon_{\max}$ when the plane waves are focused onto
the detector’s edge. The interior points of the detector absorb the focused energy of plane waves
passing through the interferometer at propagation angles $\varepsilon < \varepsilon_{\max}$; in fact, all plane waves with
the same propagation angle $\varepsilon$ end up focused onto a circle surrounding the detector’s center,
with the radius of the circle proportional to $\varepsilon$ as shown in Fig. 5.27.
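The double integral over the field of view has this same sort of circular symmetry.
Substituting (5.74c) into (5.73c) gives
$$z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi} \left\{ \frac{1}{\Delta\Omega} \iint_{\text{field of view}} d^2\varepsilon\; e^{-\pi i\sigma\chi\varepsilon^2}\, H\!\left(u\sigma - \frac{u\sigma\varepsilon^2}{2}\right) \right\} d\sigma . \tag{5.75a}$$
$$e^{-\pi i\sigma\chi\varepsilon^2}\, H\!\left(u\sigma - \frac{u\sigma\varepsilon^2}{2}\right) ,$$
depends only on $\varepsilon^2$. This means the double integral can be thought of as an integral over all the
infinitesimal area patches $d^2\varepsilon = d\varepsilon_x\, d\varepsilon_y$ of a quantity that only depends on $\varepsilon = \sqrt{\varepsilon_x^2 + \varepsilon_y^2}$,
the distance of any point in this area integral from the origin where $\varepsilon = 0$. Consequently the area
integral $d^2\varepsilon$ has circular symmetry and can be treated as a one-dimensional integral over a
collection of rings with radii between 0 and $\varepsilon_{\max}$,
$$\iint_{\text{field of view}} d^2\varepsilon \;\to\; 2\pi \int_{0}^{\varepsilon_{\max}} \varepsilon\, d\varepsilon .$$
$$\begin{aligned}
z(\chi) &= \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi} \left\{ \frac{2\pi}{\Delta\Omega} \int_{0}^{\varepsilon_{\max}} \varepsilon \left[ e^{-\pi i\sigma\chi\varepsilon^2}\, H\!\left(u\sigma - \frac{u\sigma\varepsilon^2}{2}\right) \right] d\varepsilon \right\} d\sigma \\
&\cong \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, H(u\sigma)\, e^{2\pi i\sigma\chi} \left\{ \frac{2\pi}{\Delta\Omega} \int_{0}^{\varepsilon_{\max}} \varepsilon\, e^{-\pi i\sigma\chi\varepsilon^2}\, d\varepsilon \right\} d\sigma ,
\end{aligned} \tag{5.75b}$$
where in the last step we have assumed that the transfer function $H(f)$ is such a slowly varying
function of $f$ that we can disregard the effect of adding the small quantity $(u\sigma\varepsilon^2)/2$ to the
argument $u\sigma$. For future use, we note that the circular symmetry of the detector’s field of view
lets us write Eq. (5.35d) as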
FIGURE 5.27.
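$$\Delta\Omega = \iint_{\text{field of view}} d^2\varepsilon = 2\pi \int_{0}^{\varepsilon_{\max}} \varepsilon\, d\varepsilon = \pi\varepsilon_{\max}^2 . \tag{5.75c}$$
Hence $\Delta\Omega$ is given by the formula for the area of a circle of radius $\varepsilon_{\max}$. In Eq. (5.75b) the term
inside the braces { } can be simplified to
$$\begin{aligned}
\frac{2\pi}{\Delta\Omega} \int_{0}^{\varepsilon_{\max}} \varepsilon\, e^{-\pi i\sigma\chi\varepsilon^2}\, d\varepsilon
&= \frac{-1}{i\sigma\chi\,\Delta\Omega} \left[ e^{-\pi i\sigma\chi\varepsilon^2} \right]_{0}^{\varepsilon_{\max}} \\
&= -\frac{e^{-(1/2)\pi i\sigma\chi\varepsilon_{\max}^2}}{i\sigma\chi\,\Delta\Omega} \left( e^{-(1/2)\pi i\sigma\chi\varepsilon_{\max}^2} - e^{(1/2)\pi i\sigma\chi\varepsilon_{\max}^2} \right) .
\end{aligned}$$
Equation (5.75c) can be written as $\varepsilon_{\max}^2 = \Delta\Omega/\pi$, and with this substitution the integral becomes
$$\frac{2\pi}{\Delta\Omega} \int_{0}^{\varepsilon_{\max}} \varepsilon\, e^{-\pi i\sigma\chi\varepsilon^2}\, d\varepsilon = e^{-(i/2)\sigma\chi\Delta\Omega} \left[ \frac{\sin\!\left(\dfrac{\sigma\chi\Delta\Omega}{2}\right)}{\left(\dfrac{\sigma\chi\Delta\Omega}{2}\right)} \right] ,$$
or, since $\mathrm{sinc}(x) = \sin(x)/x$,
$$\frac{2\pi}{\Delta\Omega} \int_{0}^{\varepsilon_{\max}} \varepsilon\, e^{-\pi i\sigma\chi\varepsilon^2}\, d\varepsilon = e^{-(i/2)\sigma\chi\Delta\Omega}\, \mathrm{sinc}\!\left(\frac{\sigma\chi\Delta\Omega}{2}\right) . \tag{5.75d}$$
Substituting (5.75d) into (5.75b) gives
$$z(\chi) = \frac{W}{4}\int_{-\infty}^{\infty} S(\sigma)\, M(R\sigma\theta_{ma})\, H(u\sigma)\, \mathrm{sinc}\!\left(\frac{\sigma\chi\Delta\Omega}{2}\right) e^{2\pi i\chi\sigma\left(1 - \frac{\Delta\Omega}{4\pi}\right)}\, d\sigma . \tag{5.75e}$$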
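The closed form in Eq. (5.75d) can be checked against direct numerical integration. The Python sketch below is only a sanity check under assumed values: the wavenumber, optical-path difference, and solid angle are illustrative numbers, the function names are hypothetical, and the ring integral is evaluated with a simple midpoint rule. As before, `np.sinc` is rescaled to give the unnormalized $\sin(x)/x$.

```python
import numpy as np

def ring_integral_numeric(sigma, chi, omega, n=200000):
    """Midpoint-rule value of (2*pi/omega) * integral_0^eps_max eps*exp(-i*pi*sigma*chi*eps^2) d(eps)."""
    eps_max = np.sqrt(omega / np.pi)          # from omega = pi * eps_max**2, Eq. (5.75c)
    step = eps_max / n
    eps = (np.arange(n) + 0.5) * step         # midpoints of n equal sub-intervals
    integrand = eps * np.exp(-1j * np.pi * sigma * chi * eps**2)
    return (2.0 * np.pi / omega) * np.sum(integrand) * step

def ring_integral_closed_form(sigma, chi, omega):
    """Right-hand side of Eq. (5.75d), with sinc(x) = sin(x)/x."""
    x = sigma * chi * omega / 2.0
    return np.exp(-1j * x) * np.sinc(x / np.pi)   # np.sinc(t) = sin(pi*t)/(pi*t)

# Illustrative values (assumptions): wavenumber in cm^-1, path difference in cm, solid angle in sr.
sigma, chi, omega = 2000.0, 0.8, 2.0e-3
numeric = ring_integral_numeric(sigma, chi, omega)
closed = ring_integral_closed_form(sigma, chi, omega)
print(numeric, closed)
```

For these assumed values the magnitude of the result is about 0.62, which shows how strongly a finite field of view can already attenuate the interferogram at large optical-path differences.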
According to the discussion in Sec. 5.11, we can associate an effective spectrum with the
formula in Eq. (5.75e),
$$\begin{aligned}
\mathbf{Z}_{\mathrm{eff}}(\sigma) &= \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \\
&= \int_{-\infty}^{\infty} d\chi\; e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\sigma' \left[ \frac{W}{4} S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, \mathrm{sinc}\!\left(\frac{\sigma'\chi\Delta\Omega}{2}\right) \right] e^{2\pi i\chi\sigma'\left(1 - \frac{\Delta\Omega}{4\pi}\right)} .
\end{aligned} \tag{5.76a}$$
From Eqs. (5B.8a) and (5B.8b) in Appendix 5B at the end of this chapter, it follows that
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) \cong \frac{1}{\Delta\sigma} \int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} \left[ \frac{W}{4} S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] d\sigma' , \tag{5.76b}$$
where
$$\Delta\sigma = \frac{\Delta\Omega\,\sigma}{2\pi} . \tag{5.76c}$$
Formula (5.76b) is good for fields of view small enough that $\cos\alpha_\varepsilon$ can be approximated
quadratically as
$$\cos\alpha_\varepsilon \cong 1 - \frac{\alpha_\varepsilon^2}{2} ,$$
but not so small that $\cos\alpha_\varepsilon$ can be approximated as one. In (5.76b) the term inside the square
brackets [ ] is, as pointed out in Appendix 5B, averaged over a wavenumber interval that is
centered on
$$\sigma\cdot\left(1 + \frac{\Delta\Omega}{4\pi}\right)$$
Equations (5.47c) and (5.48a) show that when $\cos\alpha_\varepsilon$ can be approximated by one and there is no
background radiance, the effective signal spectrum can be written as
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{W}{4} S(\sigma)\, M(R\sigma\theta_{ma})\, H(u\sigma) . \tag{5.76d}$$
This expression is the same as the term inside the square brackets in (5.76b). We conclude that in
Eq. (5.76b) the term inside the integral is just the effective signal spectrum of the narrow
field-of-view case where $\cos\alpha_\varepsilon$ can be approximated by one. Consequently, the effect of increasing the
field of view beyond the point where $\cos\alpha_\varepsilon$ can be approximated by one is to blur the effective
signal spectrum by averaging it over a wavenumber region of width
$$\Delta\sigma = \frac{\Delta\Omega\,\sigma}{2\pi}$$
centered on wavenumber $\sigma\cdot\left(1 + \frac{\Delta\Omega}{4\pi}\right)$ instead of $\sigma$. Therefore, another effect of the increased
field of view is to scale the wavenumber axis of the effective signal spectrum by a factor of
$\left(1 + \frac{\Delta\Omega}{4\pi}\right)$. In other words, the spectral details at $\sigma = \sigma_0$ are blurred over a region $\Delta\sigma$ in width
around $\sigma_0$ and then, in the spectral measurements, show up at wavenumber $\sigma_0\cdot\left(1 + \frac{\Delta\Omega}{4\pi}\right)^{-1}$ instead
of at wavenumber $\sigma_0$. When the $\Delta\Omega$ field of view is known, we can always rescale the
wavenumber axis to put the spectral details in their correct locations, but the blurring degrades
the spectral resolution in a way that cannot be fixed.
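To give a feel for the sizes involved, the Python sketch below evaluates the blur width from Eq. (5.76c) and the displaced position $\sigma_0\cdot(1 + \Delta\Omega/4\pi)^{-1}$ for one spectral feature, and then undoes the displacement by rescaling the axis. The numerical values of `sigma0` and `omega` and the function names are assumptions chosen for illustration only.

```python
import math

def field_of_view_effects(sigma0, omega):
    """Return (blur_width, measured_position) for a spectral feature at wavenumber sigma0."""
    blur_width = omega * sigma0 / (2.0 * math.pi)                 # Eq. (5.76c)
    measured_position = sigma0 / (1.0 + omega / (4.0 * math.pi))  # where the feature shows up
    return blur_width, measured_position

def rescale_axis(measured_sigma, omega):
    """Undo the wavenumber-axis compression when the field of view omega is known."""
    return measured_sigma * (1.0 + omega / (4.0 * math.pi))

sigma0, omega = 2000.0, 2.0e-3   # assumed values: cm^-1 and steradians
blur_width, measured = field_of_view_effects(sigma0, omega)
recovered = rescale_axis(measured, omega)
print(blur_width, measured, recovered)
```

With these assumed numbers the feature is blurred over roughly 0.64 cm^-1 and appears about 0.3 cm^-1 below its true position; the rescaling restores the position exactly, but nothing in the sketch (or in the measurement) can undo the blur.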
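We specify a new variable of integration
$$\sigma'' = \sigma\cdot g \tag{5.77a}$$
with
$$g = \left(1 - \frac{\Delta\Omega}{4\pi}\right) \tag{5.77b}$$
so that Eq. (5.75e) becomes
$$z(\chi) = \int_{-\infty}^{\infty} \left[ \frac{W}{4}\, g^{-1} S(g^{-1}\sigma'')\, M(g^{-1}\sigma'' R\theta_{ma})\, H(g^{-1}\sigma'' u) \right] \mathrm{sinc}\!\left(\frac{g^{-1}\sigma''\chi\Delta\Omega}{2}\right) e^{2\pi i\sigma''\chi}\, d\sigma'' . \tag{5.77c}$$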
The term inside the square brackets is just another version of the effective spectrum in (5.76d),
but now it is multiplied by the factor
$$\mathrm{sinc}\!\left(\frac{g^{-1}\sigma''\chi\Delta\Omega}{2}\right) = \mathrm{sinc}\!\left(\left(1 - \frac{\Delta\Omega}{4\pi}\right)^{-1} \frac{\sigma''\chi\Delta\Omega}{2}\right) .$$
This sinc factor artificially decreases the size of the effective spectrum, forcing it to contribute
too little to the integration over $d\sigma''$ so that the signal $z(\chi)$ is smaller than it would otherwise be
at large values of the optical-path difference $\chi$. This effect is sometimes called the
“self-apodization” of the interferogram signal. To avoid having significant amounts of self-apodization
in the measured spectrum, we should keep the optical-path difference $\chi$ from becoming so large
that the sinc factor becomes small or even negative. Following the notation of Sec. 5.15, there
must be a length $D$ with
$$|\chi| \le D$$
such that
$$\mathrm{sinc}\!\left(\left(1 - \frac{\Delta\Omega}{4\pi}\right)^{-1} \frac{\sigma'' D\,\Delta\Omega}{2}\right)$$
stays reasonably close to one. In any well-designed interferometer, the wavenumbers to which the
detector is sensitive lie within a specified wavenumber range,
as is discussed following Eq. (4.66b) in Chapter 4. Consequently, the traditional rule of thumb is
to require the sinc factor to be greater than 2/3 for the maximum possible value of its argument,
$$\mathrm{sinc}\!\left(\left(1 - \frac{\Delta\Omega}{4\pi}\right)^{-1} \frac{\sigma_{\max} D\,\Delta\Omega}{2}\right) > \frac{2}{3} , \tag{5.79}$$
which requires
$$\left(1 - \frac{\Delta\Omega}{4\pi}\right)^{-1} \frac{\sigma_{\max} D\,\Delta\Omega}{2} < 1.488$$
or
$$D < \left(1 - \frac{\Delta\Omega}{4\pi}\right) \frac{(2.976)}{\sigma_{\max}\,\Delta\Omega} \cong \frac{2.976}{\sigma_{\max}\,\Delta\Omega} , \tag{5.80}$$
where in the last step we assume [see Eqs. (5B.1c) and (5B.1d) in Appendix 5B]
$$\frac{\Delta\Omega}{4\pi} \ll 1 ,$$
something that is almost always the case. As was discussed in Sec. 5.15, the size of $D$
controls the overall resolution of the spectral measurement, with small values of $D$ producing
low-resolution spectral measurements and large values of $D$ producing high-resolution spectral
measurements [see Eq. (5.67)]. What we have here, then, is the interferometric version of the
classic inverse relationship between spectral resolution and field of view that affects all
spectrometers, not just the Fourier-transform type. Inequality (5.80) states that to avoid
self-apodization, large fields of view $\Delta\Omega$ should have small values of $D$, producing low-resolution
spectral measurements, and small fields of view $\Delta\Omega$ can have large values of $D$, producing
high-resolution spectral measurements. If inequality (5.80) is ignored, then self-apodization occurs and
resolution is lost from the blurring effect of the integral in Eq. (5.76b) above.
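The trade-off expressed by inequality (5.80) is easy to evaluate for a candidate design. The Python sketch below takes the constant 2.976 and the sinc argument 1.488 directly from the text; the field of view, the maximum wavenumber, and the function names are assumptions made for illustration.

```python
import math

def max_path_difference(sigma_max, omega):
    """Largest D allowed by inequality (5.80) for field of view omega (sr) and wavenumber sigma_max."""
    return (1.0 - omega / (4.0 * math.pi)) * 2.976 / (sigma_max * omega)

def unnormalized_sinc(x):
    # sinc(x) = sin(x)/x, the convention used in this chapter.
    return 1.0 if x == 0.0 else math.sin(x) / x

sigma_max, omega = 4000.0, 1.0e-3         # assumed values: cm^-1 and steradians
d_max = max_path_difference(sigma_max, omega)
best_resolution = 1.0 / (2.0 * d_max)     # unapodized resolution from Eq. (5.67)
sinc_at_limit = unnormalized_sinc(1.488)  # sinc argument quoted in the text for the 2/3 rule
print(d_max, best_resolution, sinc_at_limit)
```

For these assumed inputs the limit is about 0.74 cm of optical-path difference, or an unapodized resolution near 0.67 cm^-1; halving the field of view `omega` doubles the allowed `d_max`.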
to get $z$ between $-2D$ and zero. Consequently, we end up with the same knowledge of the
interferogram signal that we would get from measuring a double-sided interferogram between
$-2D$ and $2D$. Putting the moving mirror’s ZPD location at the beginning of its range of motion
therefore doubles the effective length of the interferogram signal. According to Eq. (5.67), the
resolving power of a double-sided interferogram is directly proportional to the interferogram
signal’s length, so, when the interferogram signal is even, putting the moving mirror’s ZPD
location at the beginning of its range of motion doubles the spectral resolving power.
Shifting the position of the fixed mirror is, as a general rule, much easier than extending the
moving mirror’s range of motion, so it is unfortunate that—because z is not exactly even after
passing through the detector circuit87—we cannot so simply double the resolving power of
already-built Michelson interferometers. If, however, the fixed mirror is shifted as shown in Fig.
87
See discussion at the end of Sec. 5.12.
5.29 so that the ZPD position is put close to, rather than exactly at, the beginning of the moving
mirror’s range of motion, we can usually symmetrize the interferogram signal, turning it into an
exactly even function of $\chi$. This returns us to the ideal case discussed above, letting us increase
the interferometer’s spectral resolving power by increasing the effective length of the
interferogram signal. Because we do not put the ZPD exactly at the beginning of the moving
mirror’s range of motion, we cannot double the resolving power; but in almost all cases there is a
large increase—almost a doubling—in the amount of spectral detail which the interferometer can
measure.
From the work done in Sec. 5.11, we know that after passing through the detector circuit the
interferogram signal can be written as the inverse Fourier transform of an effective spectrum,
$$z(\chi) = \int_{-\infty}^{\infty} \mathbf{Z}_{\mathrm{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.81a}$$
From entry 7 in Table 2.1 of Chapter 2, we know that, since $z(\chi)$ is real, $\mathbf{Z}_{\mathrm{eff}}(\sigma)$ must be
Hermitian,
$$\mathbf{Z}_{\mathrm{eff}}(-\sigma) = \mathbf{Z}_{\mathrm{eff}}(\sigma)^{*} . \tag{5.81b}$$
We also know from the discussion following Eq. (5A.6b) in Appendix 5A that the transfer
function $H(u\sigma)$ of the detector circuit must have a nonzero imaginary component. For small fields
of view where $\cos\alpha_\varepsilon$ can be approximated by one, Eq. (5.48a) gives
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{W}{4} H(u\sigma)\, S(\sigma)\, M(R\sigma\theta_{ma}) , \tag{5.82a}$$
showing that, since $W = +1$ or $-1$ and functions $S(\sigma)$ and $M(R\sigma\theta_{ma})$ are real, the effective
spectrum $\mathbf{Z}_{\mathrm{eff}}(\sigma)$ has a nonzero imaginary component only because $H$ has a nonzero imaginary
component.
For larger fields of view when $\cos\alpha_\varepsilon$ cannot be approximated by one, we can again show that
$\mathbf{Z}_{\mathrm{eff}}(\sigma)$ has a nonzero imaginary component because $H$ has a nonzero imaginary component.
Equations (5.76b) and (5.76c) give
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{1}{\Delta\sigma} \int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} \left[ \frac{W}{4} S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] d\sigma' \tag{5.82b}$$
Single-Sided Interferogram · 5.18
with
$$\Delta\sigma = \frac{\Delta\Omega\,\sigma}{2\pi} . \tag{5.82c}$$
FIGURE 5.28. [Diagram: ideal beam splitter, radiance heading to the detector, moving mirror with its range of
motion, old and new ZPD positions, and old and new positions of the fixed mirror.]
FIGURE 5.29. [Diagram: ideal beam splitter, radiance heading to the detector, and old and new positions of
the fixed mirror.]
In a well-designed interferometer system we want $M$ (if it is not equal to one) and $H$ to vary
slowly as functions of $\sigma$, letting $S(\sigma)$ carry the high-resolution spectral detail. In fact, we know
from Eq. (5.40g) that
$$S(\sigma) = A\,\Delta\Omega\, R(|\sigma|)\,\eta(\sigma)\, L(|\sigma|)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|) . \tag{5.82d}$$
This shows that the interferometer can be designed and built so that $R$, $\eta$, $\tau_a$, and $\tau_f$ also vary
slowly with $\sigma$ over the range of wavenumbers being measured, allowing the rapid variation with
wavenumber to come entirely from the spectral radiance $L(\sigma)$. This is, in fact, how we expect $R$,
$\eta$, $\tau_a$, and $\tau_f$ to behave in well-designed interferometers. Consequently, all the slowly varying
functions of $\sigma$ can be brought outside the integral in Eq. (5.82b), which means we can substitute
(5.82d) into (5.82b) to get
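$$\begin{aligned}
\mathbf{Z}_{\mathrm{eff}}(\sigma) &\cong \frac{WA\,\Delta\Omega}{4\,\Delta\sigma} \cdot H\!\left(u\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) \cdot M\!\left(R\theta_{ma}\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) \cdot \\
&\qquad R\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) \cdot \eta\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) \cdot \tau_a\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) \cdot \tau_f\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) \cdot \\
&\qquad \int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L(|\sigma'|)\, d\sigma' \\
&\cong \frac{WA\,\Delta\Omega}{4\,\Delta\sigma}\, H(u\sigma)\, M(R\theta_{ma}\sigma)\, R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma) \int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L(|\sigma'|)\, d\sigma' .
\end{aligned} \tag{5.83a}$$
$$\frac{\Delta\Omega}{4\pi} \ll 1$$
$$\begin{aligned}
H\!\left(u\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong H(u\sigma) , & M\!\left(R\theta_{ma}\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong M(R\theta_{ma}\sigma) , \\
R\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong R(\sigma) , & \eta\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong \eta(\sigma) , \\
\tau_a\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong \tau_a(\sigma) , & \tau_f\!\left(\sigma\left(1+\tfrac{\Delta\Omega}{4\pi}\right)\right) &\cong \tau_f(\sigma) .
\end{aligned} \tag{5.83b}$$
$$\int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L(|\sigma'|)\, d\sigma'$$
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\, L(|\sigma|) , \tag{5.83c}$$
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) = \frac{WA\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\, L_{FOV}(|\sigma|) , \tag{5.83d}$$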
Absolute value signs are put around the argument of $L_{FOV}$ in (5.83d) and (5.83e) in part to remind
us, as pointed out in the discussion following (5.83b), that the integral
$$\int_{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\cdot\left(1+\frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L(|\sigma'|)\, d\sigma'$$
must be an even function of $\sigma$. Figures 5.30(a) and 5.30(b) show how the original spectrum $L(\sigma)$
is shifted and blurred by an interferometer’s finite field of view. In Fig. 5.30(b) the compression
of the wavenumber axis can be removed by stretching the axis so that spectral edge E is returned
to its proper position, but nothing can recover the detail lost in the spectral blurring.
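The next step in setting up a single-sided interferogram measurement is to write Eq. (5.83d) as
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) = Z_{\mathrm{eff}}(\sigma)\, e^{i\psi(\sigma)} \tag{5.84a}$$
for real functions $Z_{\mathrm{eff}}(\sigma)$ and $\psi(\sigma)$. Here $Z_{\mathrm{eff}}$ is the magnitude of $\mathbf{Z}_{\mathrm{eff}}$,
$$Z_{\mathrm{eff}}(\sigma) = \frac{WA\,\Delta\Omega}{4}\, |H(u\sigma)|\, M(R\sigma\theta_{ma})\, R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\, L_{FOV}(|\sigma|) \tag{5.85a}$$
and
$$\psi(\sigma) = \arg[H(u\sigma)] . \tag{5.85b}$$
because $H(u\sigma)$ is a slowly varying function of the wavenumber. Substituting (5.84a) into (5.81b)
gives, since both $Z_{\mathrm{eff}}$ and $\psi$ are real, that
$$Z_{\mathrm{eff}}(-\sigma)\, e^{i\psi(-\sigma)} = Z_{\mathrm{eff}}(\sigma)\, e^{-i\psi(\sigma)} . \tag{5.86a}$$
Taking the complex logarithm of both sides shows that $\psi$ must be an odd function of $\sigma$,
$$\psi(-\sigma) = -\psi(\sigma) . \tag{5.86b}$$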
Equation (5.85b) suggests that we automatically know $\psi(\sigma)$ because, having designed and built
the detector circuit, we know its transfer function $H$. In practice, however, it is often difficult to
know $H$ with sufficient accuracy to get good measurements of $L_{FOV}$. It turns out that all we need
to make single-sided interferograms practical is to know that $\psi$ is a slowly varying function of the
wavenumber, because then it is easy to measure $\psi$ as a function of $\sigma$. The key point to take away
from Eq. (5.85b), then, is that if the transfer function is designed to be a slowly varying function
of wavenumber, then we have good reason to expect $\psi$ to be a slowly varying function of
wavenumber.88
The customary procedure used to measure $\psi(\sigma)$ directly is to run the moving mirror in Fig.
5.29 between $\chi = -d$ and $\chi = 2D - d$, at first confining our attention to the $z(\chi)$ signal values
88
This point is more important than it looks. There are interferometer defects not discussed here that, like the
transfer function, contribute slowly varying complex modulations to the effective spectrum. All we need for a good
single-sided interferogram measurement is to know that the total complex modulation is slowly varying, and then we
can use the procedure discussed in this section to remove all these complex modulations from the effective spectrum
at the same time.
FIGURE 5.30(a). [Plot of a small section of the radiance spectrum $L(\sigma)$, with Spectral Edge E marked.]
FIGURE 5.30(b). [Plot of $L_{FOV}(\sigma)$, with Spectral Edge E marked.]
The same small section of the radiance spectrum plotted in Fig. 5.30(a) is shown here
with the rescaled wavenumber axis and blurring due to the interferometer’s finite field of
view. Spectral Edge E is measured at a wavenumber slightly smaller than its true
position.
between $\chi = -d$ and $\chi = +d$. These signal values give a perfectly good double-sided interferogram
of the type described in Sec. 5.15 above, leading to a low-resolution estimate of the effective
spectrum
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) \cong \mathbf{Z}^{(\mathrm{low\ res})}_{\mathrm{eff}}(\sigma) . \tag{5.87a}$$
$$\Delta\sigma_{\mathrm{low\ res}} = \frac{1}{2d} . \tag{5.87b}$$
This spectral resolution is not sufficient to measure $L(\sigma)$, the spectral radiance of the source, but
we can easily make it good enough to capture all the spectral detail in the slowly varying function
$\psi(\sigma)$. We choose $d$ twice as large as we would for a minimally accurate representation, making
$\Delta\sigma_{\mathrm{low\ res}}$ half the size of the spectral interval $\Delta\sigma_{\mathrm{detail}}$ used to examine the detail in $\psi(\sigma)$. This
makes
$$\Delta\sigma_{\mathrm{detail}} = 2\,\Delta\sigma_{\mathrm{low\ res}} = \frac{1}{d} . \tag{5.87c}$$
Because $\psi$ is a low-resolution function of wavenumber $\sigma$, Eqs. (5.84c) and (5.87a) show that
$$\psi(\sigma) = \arg\!\left[\mathbf{Z}^{(\mathrm{low\ res})}_{\mathrm{eff}}(\sigma)\right] . \tag{5.88a}$$
Now that $\psi$ is known, we can define a new function $\varpi(\chi)$ such that $e^{-i\psi(\sigma)}$ is the Fourier
transform of $\varpi(\chi)$. According to Eq. (5.78), we are only interested in $\sigma$ values that are between
$\sigma_{\min}$, $\sigma_{\max}$ and $(-\sigma_{\min})$, $(-\sigma_{\max})$. This means $\psi(\sigma)$ can be given any values we please outside
these two ranges, and for that matter so can any function of $\psi$ such as $e^{-i\psi(\sigma)}$. Keeping in mind,
then, that the only $\sigma$ values that matter satisfy
$$\sigma_{\min} \le |\sigma| \le \sigma_{\max} ,$$
we set up the Fourier transform pair
$$V(\sigma)\, e^{-i\psi(\sigma)} = \int_{-\infty}^{\infty} \varpi(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.88b}$$
and
$$\varpi(\chi) = \int_{-\infty}^{\infty} \left[ V(\sigma)\, e^{-i\psi(\sigma)} \right] e^{2\pi i\sigma\chi}\, d\sigma . \tag{5.88c}$$
In these two formulas, $V(\sigma)$ is a real-valued tapering function chosen so that $V(\sigma) \to 0$ slowly as
$|\sigma| \to \infty$ with
$$V(\sigma) = 1 \quad \text{for} \quad \sigma_{\min} \le |\sigma| \le \sigma_{\max} . \tag{5.88d}$$
For future use (and to keep things neat), we require $V(\sigma)$ to be non-negative and even,
$$V(\sigma) \ge 0$$
and
$$V(-\sigma) = V(\sigma) . \tag{5.88e}$$
Since $\psi(-\sigma) = -\psi(\sigma)$ in Eq. (5.86b) and $V(\sigma)$ is real and even in (5.88e), we note that
$V(\sigma)\, e^{-i\psi(\sigma)}$ is Hermitian,
$$V(-\sigma)\, e^{-i\psi(-\sigma)} = \left[ V(\sigma)\, e^{-i\psi(\sigma)} \right]^{*} . \tag{5.88f}$$
Consequently, according to entry 7 of Table 2.1 in Chapter 2, its Fourier transform must be real:
$$\mathrm{Im}\left(\varpi(\chi)\right) = 0 . \tag{5.88g}$$
Because $\psi$ and $V$ are slowly varying functions of $\sigma$, their product $V(\sigma)\, e^{-i\psi(\sigma)}$ is also a slowly
varying function of $\sigma$. According to the discussion following Eq. (2.37e) in Chapter 2, it follows
that $\varpi(\chi)$, the inverse Fourier transform of $V(\sigma)\, e^{-i\psi(\sigma)}$ in (5.88c), must be a relatively narrow
function of $\chi$. By the end of that discussion, we realize that if $\Delta\sigma_{\mathrm{detail}}$ is the change in $\sigma$ required
to produce a significant change in $V(\sigma)\, e^{-i\psi(\sigma)}$, then the inverse Fourier transform $\varpi(\chi)$ must be
negligible at all values of $\chi$ with $|\chi| > \Delta\sigma_{\mathrm{detail}}^{-1}$. From Eq. (5.87c), we know
$$\Delta\sigma_{\mathrm{detail}} = \frac{1}{d} ,$$
which means that
$$\varpi(\chi) \text{ is negligible when } |\chi| > d . \tag{5.88h}$$
Having analyzed $\varpi(\chi)$, we now turn our attention to the entire interferogram signal recorded
between $\chi = -d$ and $\chi = 2D - d$. When the interferogram signal in Eq. (5.81a) is convolved
with $\varpi(\chi)$, the result is
$$z_{\mathrm{conv}}(\chi) = \varpi(\chi) * z(\chi) . \tag{5.89a}$$
From the definition of convolution in Chapter 2 [see Eq. (2.38a)], we understand that both $\varpi(\chi)$
and $z(\chi)$ must be known for all $\chi$ between $-\infty$ and $+\infty$ to calculate their convolution,
$$z_{\mathrm{conv}}(\chi) = \int_{-\infty}^{\infty} \varpi(\chi')\, z(\chi - \chi')\, d\chi' . \tag{5.89b}$$
We have just seen, however, that $\varpi(\chi)$ is a narrow function of $\chi$, so from (5.88h) we get
$$z_{\mathrm{conv}}(\chi) \cong \int_{-d}^{d} \varpi(\chi')\, z(\chi - \chi')\, d\chi' . \tag{5.89c}$$
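$$\begin{aligned}
\int_{-\infty}^{\infty} z_{\mathrm{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi &= \int_{-\infty}^{\infty} \left[ \varpi(\chi) * z(\chi) \right] e^{-2\pi i\sigma\chi}\, d\chi \\
&= V(\sigma)\, e^{-i\psi(\sigma)} \cdot \left[ \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \right] .
\end{aligned} \tag{5.90a}$$
$$\mathbf{Z}_{\mathrm{eff}}(\sigma) = \int_{-\infty}^{\infty} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi , \tag{5.90b}$$
$$\int_{-\infty}^{\infty} z_{\mathrm{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = V(\sigma)\, e^{-i\psi(\sigma)} \cdot \mathbf{Z}_{\mathrm{eff}}(\sigma) .$$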
$|\sigma| \ge \sigma_{\max}$ or $|\sigma| \le \sigma_{\min}$. Consequently, $\mathbf{Z}_{\text{eff}}$ is zero for those $\sigma$ values where $V(\sigma)$ is, according to (5.88d), not necessarily equal to one. Hence, this latest result can be written as
$$\int_{-\infty}^{\infty} z_{\text{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = e^{-i\psi(\sigma)} \cdot \mathbf{Z}_{\text{eff}}(\sigma). \tag{5.90c}$$
Consulting Eqs. (5.84a) and (5.84b), we see that (5.90c) could also be written as
$$\int_{-\infty}^{\infty} z_{\text{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = Z_{\text{eff}}(\sigma). \tag{5.90d}$$
We already know that $Z_{\text{eff}}(\sigma)$ is real, and from (5.86a) we see that $Z_{\text{eff}}(\sigma)$ is even. Reversing the Fourier transform in (5.90d) now gives
$$z_{\text{conv}}(\chi) = \int_{-\infty}^{\infty} Z_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma. \tag{5.91a}$$
According to entry 1 of Table 2.1 in Chapter 2, the inverse Fourier transform of a real and even function is another real and even function, which means that $z_{\text{conv}}$ is an even function of $\chi$,
$$z_{\text{conv}}(-\chi) = z_{\text{conv}}(\chi). \tag{5.91b}$$
This is the result we need. In the discussion following Eq. (5.89c), we supposed that $z_{\text{conv}}$ was known for negative as well as positive values of its argument because we wanted to take its Fourier transform. It now turns out, however, that when $z_{\text{conv}}$ is known between $\chi = 0$ and $\chi = 2D - 2d$, it is also known between $\chi = 0$ and $\chi = -(2D - 2d)$ because it must be an even function. This means that measuring $z(\chi)$ between $\chi = -d$ and $\chi = 2D - d$, as shown in Fig. 5.29, gives enough information to calculate $z_{\text{conv}}(\chi)$ for
$$-2(D - d) \le \chi \le 2(D - d). \tag{5.91c}$$
Applying the double-sided approximation for the Fourier transform discussed in Sec. 5.15 to formula (5.90d), we can now treat $z_{\text{conv}}(\chi)$ as a double-sided interferogram signal to get
$$Z_{\text{eff}}(\sigma) \cong \int_{-2(D-d)}^{2(D-d)} z_{\text{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi. \tag{5.91d}$$
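$$\mathbf{Z}_{\text{eff}}(\sigma) \cong \int_{-D}^{D} z(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.92a}$$
for the effective spectrum $\mathbf{Z}_{\text{eff}}(\sigma)$. According to Eq. (5.83d), the correct formula for the effective spectrum is
$$\mathbf{Z}_{\text{eff}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \mathcal{R}(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\text{FOV}}(|\sigma|). \tag{5.92b}$$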
Now compare this to the single-sided situation. Following the procedure outlined above, we measure signal $z(\chi)$ between $\chi = -d$ and $\chi = 2D - d$. This data lets us calculate $z_{\text{conv}}(\chi)$ between $\chi = 0$ and $\chi = 2(D - d)$. Because $z_{\text{conv}}(\chi)$ is even, we end up knowing its values between $\chi = -2(D - d)$ and $\chi = +2(D - d)$, allowing us to make the new approximation
$$Z_{\text{eff}}(\sigma) \cong \int_{-2(D-d)}^{2(D-d)} z_{\text{conv}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi, \tag{5.92c}$$
$$Z_{\text{eff}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, |H(u\sigma)|\, M(R\sigma\theta_{ma})\, \mathcal{R}(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\text{FOV}}(|\sigma|). \tag{5.92d}$$
The only difference between the two spectral formulas in Eqs. (5.92b) and (5.92d) is that in (5.92b) spectrum $\mathbf{Z}_{\text{eff}}$ is proportional to the full complex transfer function $H$ while in (5.92d) spectrum $Z_{\text{eff}}$ is proportional to the magnitude of $H$. Although the detector circuit’s transfer function $H$ must have a nonzero imaginary part [see discussion following Eq. (5A.6b) in Appendix 5A], we shall see in the following section that the calibration formula used for complex $H$ also works when the original transfer function $H$ is replaced by $|H|$. The alternate method of removing an interferometer’s background radiance discussed in Sec. 5.14 also works as desired when $H$ is replaced by $|H|$. Consequently, we can think of the magnitude of $H$ in (5.92d) as just another type of transfer function and treat $Z_{\text{eff}}$ like any other effective spectrum when it becomes time to calibrate the interferometer and eliminate unwanted background radiation from our measurements. Since the integral in Eq. (5.92c) goes between $-2(D - d)$ and $+2(D - d)$ rather than between $-D$ and $+D$, the discussion in Sec. 5.15 shows that we must end up with a more highly resolved spectrum. According to Eq. (5.67), a double-sided interferogram system can measure spectral details separated by a wavenumber interval as small as
$$\Delta\sigma_{\text{double sided}} = \frac{1}{2D}. \tag{5.93a}$$
With integration limits of $\pm 2(D - d)$ in place of $\pm D$, the single-sided system reaches
$$\Delta\sigma_{\text{single sided}} = \frac{1}{2(2D - 2d)} < \Delta\sigma_{\text{double sided}}, \tag{5.93b}$$
which gives us much more resolving power than the equivalent double-sided system.
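The two resolution formulas in (5.93a) and (5.93b) are easy to compare numerically. The following is a minimal sketch; the function names and the sample values of $D$ and $d$ are hypothetical and chosen only for illustration.

```python
def double_sided_resolution(D):
    """Eq. (5.93a): smallest resolvable wavenumber interval for a scan from -D to +D."""
    return 1.0 / (2.0 * D)

def single_sided_resolution(D, d):
    """Eq. (5.93b): same quantity for a scan from -d to 2D - d."""
    if not 0.0 <= d < D:
        raise ValueError("expected 0 <= d < D")
    return 1.0 / (2.0 * (2.0 * D - 2.0 * d))

# Hypothetical scan: D = 1.0 cm of optical path difference, d = 0.05 cm.
D_cm, d_cm = 1.0, 0.05
improvement = double_sided_resolution(D_cm) / single_sided_resolution(D_cm, d_cm)
```

For these assumed values `improvement` equals $2(D - d)/D = 1.9$, so the gain approaches a factor of two as $d$ becomes small compared to $D$, and it disappears entirely if $d$ grows to $D/2$.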
From what has been said so far, it seems that all spectral measurements ought to be made
using single-sided rather than double-sided interferograms. In practice, however, we often want
to compare one side of a double-sided interferogram signal to the other to check that no blunders
have been made in taking the measurement—and we clearly give up this possibility when using
single-sided interferograms. In addition, the expected noise amplitude of single-sided
measurements is, as a general rule, larger by $\sqrt{2}$ than the expected noise amplitude of equivalent,
equal-resolution double-sided measurements [see the discussion following Eq. (6.76e) in Chapter
6 below]. Finally, to justify our single-sided procedure, we are forced to assume that the phase
term $e^{-i\psi(\sigma)}$ is a slowly varying function of wavenumber $\sigma$ and then choose parameter $d$ large
enough to capture all the relevant spectral detail in $e^{-i\psi(\sigma)}$. The only way to confirm that this is
true is to make a high-resolution, double-sided spectral measurement, verify that $e^{-i\psi(\sigma)}$ behaves as
expected, and adjust the value of $d$ accordingly. In this sense, then, a good single-sided
measurement depends on our having at some point performed a high-resolution, double-sided
measurement with the same instrument. Nevertheless, having the flexibility to perform single-
sided measurements can be a very attractive way to increase an interferometer’s resolving power
when a standard double-sided measurement turns up unexpected but poorly resolved spectral
detail, and for this reason many interferometer designs include it as one of their options.
5.19 Calibration
The uncalibrated spectrum of a standard Michelson interferometer can be treated the same way as
the output spectrum of any other type of uncalibrated spectrometer would be treated. Consider,
for example, Eq. (5.60b) for the total interferogram signal ztot when the interferogram’s field of
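$$z_{\text{tot}}(\chi) = \frac{W}{4} \int_{-\infty}^{\infty} \left[S(\sigma) + S^{(\text{fore})}(\sigma) - S^{(\text{back})}(\sigma)\right] H(\sigma u)\, M(R\sigma\theta_{ma})\, e^{2\pi i\sigma\chi}\, d\sigma. \tag{5.94a}$$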
In this section, we can regard function $M$ as a constant and steady misalignment, unchanging during calibration and subsequent spectral measurements, or we can think of the instrument as being so well-aligned that $M \cong 1$. Assuming that $z_{\text{tot}}$ in (5.94a) is analyzed using a double-sided interferogram with $D$ large enough that there is no significant ringing or loss of spectral detail from the sinc convolution in Eq. (5.66b), we can treat the Fourier transform of $z_{\text{tot}}$, which we call $\mathbf{Z}_{\text{eff,tot}}(\sigma)$, as the uncalibrated spectrum of the Michelson interferometer. Reversing the Fourier transform in (5.94a) then gives
$$\mathbf{Z}_{\text{eff,tot}}(\sigma) = \int_{-\infty}^{\infty} z_{\text{tot}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \frac{W}{4} \left[S(\sigma) + S^{(\text{fore})}(\sigma) - S^{(\text{back})}(\sigma)\right] H(\sigma u)\, M(R\sigma\theta_{ma}). \tag{5.94b}$$
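$$S(\sigma) = A\,\Delta\Omega\, \mathcal{R}(|\sigma|)\, \eta(\sigma)\, L(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|), \tag{5.94c}$$
$$\mathbf{Z}_{\text{eff,tot}}(\sigma) = \frac{W}{4} \left[A\,\Delta\Omega\, \mathcal{R}(|\sigma|)\, \eta(\sigma)\, L(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|) + S^{(\text{fore})}(\sigma) - S^{(\text{back})}(\sigma)\right] H(\sigma u)\, M(R\sigma\theta_{ma}) \tag{5.94d}$$
for the ideal case where $\cos\alpha_{\varepsilon}$ can be approximated by one and $D$ is large enough that there is no significant loss of detail from the sinc convolution described in Sec. 5.15 above.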
What can be done with the more realistic case where there is significant loss of detail from the sinc convolution and $\cos\alpha_{\varepsilon}$ can no longer be approximated as one because the field of view is relatively large? Glancing back at the analysis used in Sec. 5.18 to go from Eq. (5.82b) to (5.83d), and in particular paying close attention to the approximations listed in (5.83b), we note that in a well-designed interferometer $\mathcal{R}$, $\eta$, $\tau_a$, $\tau_f$, $H$, and $M$ all vary slowly with $\sigma$ compared to $L(\sigma)$. In fact, compared to $L(\sigma)$ they can be regarded as quasi-constants, especially over the range of wavenumbers
$$\sigma_{\min} \le |\sigma| \le \sigma_{\max}$$
over which $L$ is being measured. In Eq. (5.83d) we account for the effect of a small but finite field of view blurring and distorting the measurement of $L$ by replacing $L(\sigma)$ with $L_{\text{FOV}}(\sigma)$. This is very similar to the situation examined in Sec. 5.15 above, where we represented the distorting effect of the sinc convolution on the measured spectrum by replacing $L(\sigma)$ with $L_{\text{blur}}(\sigma)$. To combine the blurring and distorting effects of both the sinc convolution and the finite field of view, we replace $L(\sigma)$ by $L_{\text{eff}}(\sigma)$ in Eq. (5.94c) to get
$$S(\sigma) = A\,\Delta\Omega\, \mathcal{R}(|\sigma|)\, \eta(\sigma)\, L_{\text{eff}}(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|), \tag{5.94e}$$
where we have added absolute value signs to the argument of $L_{\text{eff}}$ to keep $S(\sigma)$ well-defined for both positive and negative $\sigma$ values and to show that it is still an even function, having the same value at $\sigma$ and $-\sigma$. Applying this to Eq. (5.94d), we say that
$$\mathbf{Z}_{\text{eff,tot}}(\sigma) = \frac{W}{4} \left[A\,\Delta\Omega\, \mathcal{R}(|\sigma|)\, \eta(\sigma)\, L_{\text{eff}}(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|) + S_{\text{eff}}^{(\text{fore})}(\sigma) - S_{\text{eff}}^{(\text{back})}(\sigma)\right] H(\sigma u)\, M(R\sigma\theta_{ma}) \tag{5.94f}$$
with $L$, $S^{(\text{fore})}$, and $S^{(\text{back})}$ replaced by $L_{\text{eff}}$, $S_{\text{eff}}^{(\text{fore})}$, and $S_{\text{eff}}^{(\text{back})}$ respectively to show that the finite field of view and sinc convolution have somewhat blurred and distorted the original functions. We can, in fact, regard $L_{\text{eff}}(|\sigma|)$ as the best measurement of $L(|\sigma|)$ that the interferometer system can be expected to produce. Hence, for relatively small fields of view in situations where the sinc convolution introduces only a negligible distortion,
$$L_{\text{eff}}(|\sigma|) \cong L(|\sigma|),$$
and for situations where the finite field of view and sinc convolution must be taken into account, $L_{\text{eff}}(|\sigma|)$ is what $L(|\sigma|)$ is measured as when subjected to these two unavoidable effects.
To calibrate any type of spectrometer having a linear response to the input spectrum, we need to observe at least two known spectral radiances $L^{(1)}(|\sigma|)$ and $L^{(2)}(|\sigma|)$ where again we use absolute value signs to make the radiances well-defined for negative as well as positive $\sigma$ values. For an interferometer the $L^{(1)}$ and $L^{(2)}$ radiances should be distinct and slowly varying functions of wavenumber so that they undergo only negligible distortion from the sinc convolution and finite field of view; a black-body target at two widely separated temperatures does nicely. We suppose that $\mathbf{Z}^{(1)}_{\text{eff,tot}}(\sigma)$ and $\mathbf{Z}^{(2)}_{\text{eff,tot}}(\sigma)$ are the uncalibrated spectra measured when the interferometer is observing the known spectral radiances $L^{(1)}(|\sigma|)$ and $L^{(2)}(|\sigma|)$ respectively. We then observe a source of unknown spectral radiance $L(|\sigma|)$ and calculate $\mathbf{Z}^{(\text{meas})}_{\text{eff,tot}}(\sigma)$, the uncalibrated spectrum associated with the $z_{\text{tot}}$ signal generated by $L(|\sigma|)$. For a standard Michelson interferometer, we note that the traditional linear calibration algorithm gives, consulting Eq. (5.94f) to get the appropriate formulas for $\mathbf{Z}^{(1)}_{\text{eff,tot}}$, $\mathbf{Z}^{(2)}_{\text{eff,tot}}$, and $\mathbf{Z}^{(\text{meas})}_{\text{eff,tot}}$,
$$
\begin{aligned}
&\left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right] \cdot \frac{\mathbf{Z}^{(\text{meas})}_{\text{eff,tot}}(\sigma) - \mathbf{Z}^{(1)}_{\text{eff,tot}}(\sigma)}{\mathbf{Z}^{(2)}_{\text{eff,tot}}(\sigma) - \mathbf{Z}^{(1)}_{\text{eff,tot}}(\sigma)} + L^{(1)}(|\sigma|) \\
&\quad = \left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right] \cdot \frac{\frac{W}{4}\, M(R\sigma\theta_{ma})\, H(\sigma u)\, A\,\Delta\Omega \left[L_{\text{eff}}(|\sigma|) - L^{(1)}(|\sigma|)\right] \mathcal{R}(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)}{\frac{W}{4}\, M(R\sigma\theta_{ma})\, H(\sigma u)\, A\,\Delta\Omega \left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right] \mathcal{R}(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)} + L^{(1)}(|\sigma|) \\
&\quad = L_{\text{eff}}(|\sigma|).
\end{aligned}
\tag{5.95a}
$$
This is the best estimate of the unknown spectral radiance that the interferometer can be expected to produce, which shows that the standard linear calibration algorithm can work well when we treat the effective total spectrum $\mathbf{Z}_{\text{eff,tot}}(\sigma)$ of the signal leaving the detector circuit just like we would any other uncalibrated spectrometer signal that depended linearly on the spectral radiance entering the instrument. Once the system has been calibrated, we can measure any number of other spectra simply by pointing the instrument at the other radiances, recording new “(meas)” quantities, and plugging these “(meas)” quantities into Equation (5.95a) while leaving all other formula values the same.
Equation (5.94e) can be generalized as
$$S(\sigma) = L_{\text{eff}}(|\sigma|) \cdot \{\text{Function of } \sigma\}. \tag{5.95b}$$
Now in Eq. (5.94f) the effective signal spectrum can be written as, for both positive and negative $\sigma$ values,
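$$
\begin{aligned}
\mathbf{Z}_{\text{eff,tot}}(\sigma) &= \frac{W}{4} \left[L_{\text{eff}}(|\sigma|) \cdot \{\text{Function of } \sigma\}\right] H(\sigma u)\, M(R\sigma\theta_{ma}) + \frac{W}{4} \left[S_{\text{eff}}^{(\text{fore})}(\sigma) - S_{\text{eff}}^{(\text{back})}(\sigma)\right] H(\sigma u)\, M(R\sigma\theta_{ma}) \\
&= L_{\text{eff}}(|\sigma|) \cdot \{\text{Complex Function of } \sigma\} + \{\text{Background Complex Function of } \sigma\}.
\end{aligned}
\tag{5.95c}
$$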
As long as the effective spectrum of the total signal can be written as a product of the spectral
radiance and a complex function of wavenumber that, due to the background radiance, must be
added to another complex function of the wavenumber, the standard linear calibration algorithm
given in (5.95a) successfully extracts the desired spectral measurement L eff ( σ ) . This procedure
is sometimes called the Revercomb calibration algorithm.89
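The algebra of (5.95a) and (5.95c) can be exercised with a small numerical sketch. Everything below is hypothetical test data: the complex gain and background values stand in for the “Complex Function of $\sigma$” and “Background Complex Function of $\sigma$” in (5.95c), and the function names are illustrative only. The point is that the complex gain and the additive background both cancel in the ratio.

```python
def simulate_spectrum(radiance, gain, background):
    """Uncalibrated spectrum in the form of Eq. (5.95c): radiance * gain + background."""
    return [l * g + b for l, g, b in zip(radiance, gain, background)]

def calibrate(z_meas, z1, z2, l1, l2):
    """Two-point linear calibration in the form of Eq. (5.95a), one value per wavenumber."""
    result = []
    for zm, za, zb, la, lb in zip(z_meas, z1, z2, l1, l2):
        ratio = (zm - za) / (zb - za)   # complex gain and background cancel here
        result.append((lb - la) * ratio + la)
    return result

# Hypothetical instrument: complex gain and complex background at three wavenumbers.
gain = [complex(2.0, 0.5), complex(1.5, -0.3), complex(1.0, 0.8)]
background = [complex(0.2, -0.1), complex(-0.4, 0.3), complex(0.1, 0.1)]

l_cold = [1.0, 1.2, 1.4]      # known radiance L(1), arbitrary units
l_hot = [5.0, 5.5, 6.0]       # known radiance L(2)
l_scene = [2.5, 3.1, 4.7]     # radiance treated as unknown

z_cold = simulate_spectrum(l_cold, gain, background)
z_hot = simulate_spectrum(l_hot, gain, background)
z_scene = simulate_spectrum(l_scene, gain, background)

recovered = calibrate(z_scene, z_cold, z_hot, l_cold, l_hot)
```

In this noise-free sketch `recovered` reproduces `l_scene` with a vanishing imaginary part. With real, noisy data the ratio is generally complex, and how the residual imaginary part is handled is a design choice that this sketch does not address.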
$$z(\chi) = \int_{-\infty}^{\infty} \mathbf{Z}_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma, \tag{5.96a}$$
89
H. E. Revercomb et al., “Radiometric Calibration of IR Fourier Transform Spectrometers: Solution to a Problem
with the High-Resolution Interferometer Sounder,” Applied Optics 27, no. 15 (1 August 1988), pp. 3210–3218.
Nonflat Optical Surface · 5.20
FIGURE 5.31. Labels: Ideal Beam Splitter; Fixed Mirror Surface.
FIGURE 5.32. Labels: Moving Mirror Surface; Ideal Beam Splitter; Fixed Mirror Surface; Lens; Circular Detector in the Focal Plane of the Lens.
FIGURE 5.33. Labels: entering the interferometer; Moving Mirror; heading to the detector; y axis; x axis; Grid of Secondary Interferometers on the Fixed Mirror.
using the effective spectrum $\mathbf{Z}_{\text{eff}}(\sigma)$ explained in Sec. 5.11 above.90 If the total cross-sectional area of the interferometer beam is $A$, then for $\delta(x, y) > 0$ the beam coming from the $x, y$ secondary interferometer can be thought of as producing a signal
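$$z^{(x,y)}_{\text{secondary}}(\chi) = \frac{dx \cdot dy}{A} \int_{-\infty}^{\infty} \mathbf{Z}_{\text{eff}}(\sigma)\, e^{2\pi i\sigma[\chi - \delta(x,y)]}\, d\sigma. \tag{5.96b}$$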
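The total signal coming from the interferometer can now be written as
$$
\begin{aligned}
z(\chi) &= \frac{1}{A} \iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\, dy \int_{-\infty}^{\infty} d\sigma\, \mathbf{Z}_{\text{eff}}(\sigma)\, e^{2\pi i\sigma[\chi - \delta(x,y)]} \\
&= \int_{-\infty}^{\infty} \mathbf{Z}_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi} \left[\frac{1}{A} \iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\, dy\, e^{-2\pi i\sigma\delta(x,y)}\right] d\sigma.
\end{aligned}
\tag{5.96c}
$$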
which simplifies to
$$\iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\, dy\, \delta(x, y) = 0. \tag{5.96d}$$
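The interferometer has no hope of working unless $\delta$ is always small. We use that
$$e^{x} \cong 1 + x + \frac{1}{2}x^{2}$$
for small $x$ and write that
$$e^{-2\pi i\sigma\delta} \cong 1 - 2\pi i\sigma\delta - 2\pi^{2}\sigma^{2}\delta^{2}. \tag{5.97a}$$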
90
Any of the previously discussed formulas for the effective spectrum can be substituted into the formulas used in
this section as long as the mirror-tilt term M is taken to be identically equal to one. We explain the reason for this
rule at the beginning of Sec. 5.21 below.
$$z(\chi) \cong \int_{-\infty}^{\infty} \mathbf{Z}_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi} \left[1 - \frac{2\pi i\sigma}{A} \iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\, dy\, \delta(x, y) - \frac{2\pi^{2}\sigma^{2}}{A} \iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\, dy\, [\delta(x, y)]^{2}\right] d\sigma.$$
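Equation (5.96d) shows that the imaginary term inside the square brackets [ ] disappears, leading to
$$z(\chi) \cong \int_{-\infty}^{\infty} \mathbf{Z}_{\text{eff}}(\sigma) \left[1 - 2\pi^{2}\sigma^{2}\langle\delta^{2}\rangle\right] e^{2\pi i\sigma\chi}\, d\sigma, \tag{5.97b}$$
$$\langle\delta^{2}\rangle = \frac{1}{A} \iint_{\substack{\text{cross section} \\ \text{of main beam}}} dx\, dy\, [\delta(x, y)]^{2}. \tag{5.97c}$$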
We want $[1 - 2\pi^{2}\sigma^{2}\langle\delta^{2}\rangle]$ to be approximately one for all the wavenumbers measured by the interferometer, so if we plan to measure spectra over the wavenumber range defined by
$$0 < \sigma_{\min} \le \sigma \le \sigma_{\max}, \tag{5.98a}$$
we must use surfaces whose average squared deviation from flatness $\langle\delta^{2}\rangle$ satisfies
$$\langle\delta^{2}\rangle \ll \frac{1}{2\pi^{2}\sigma^{2}} \tag{5.98b}$$
for all the wavenumbers between $\sigma_{\min}$ and $\sigma_{\max}$. If (5.98b) is satisfied at $\sigma = \sigma_{\max}$, it is satisfied for all the wavenumbers in (5.98a). Hence, after defining the root-mean-square deviation from flatness to be $\delta_{\text{RMS}} = \sqrt{\langle\delta^{2}\rangle}$, the inequality in (5.98b) reduces to
$$\delta_{\text{RMS}} \ll \frac{\lambda_{\min}}{\pi\sqrt{2}}, \tag{5.98c}$$
where the formula $\sigma = \lambda^{-1}$ is used to write the inequality in terms of the minimum measured wavelength instead of the maximum measured wavenumber.
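As a rough numerical illustration of (5.97b) and (5.98c), the sketch below evaluates the flatness scale $\lambda_{\min}/(\pi\sqrt{2})$ and the attenuation factor $[1 - 2\pi^{2}\sigma^{2}\langle\delta^{2}\rangle]$. The function names and the example wavelength are hypothetical; lengths are taken in centimeters so that wavenumbers come out in cm$^{-1}$.

```python
import math

def rms_flatness_scale(lambda_min):
    """Right-hand side of Eq. (5.98c): lambda_min / (pi * sqrt(2))."""
    return lambda_min / (math.pi * math.sqrt(2.0))

def flatness_attenuation(sigma, delta_rms):
    """Factor [1 - 2 pi^2 sigma^2 <delta^2>] from Eq. (5.97b), with <delta^2> = delta_rms^2."""
    return 1.0 - 2.0 * math.pi ** 2 * sigma ** 2 * delta_rms ** 2

# Hypothetical instrument whose shortest measured wavelength is 2.5 micrometers.
lambda_min_cm = 2.5e-4
sigma_max = 1.0 / lambda_min_cm            # maximum wavenumber in cm^-1
scale = rms_flatness_scale(lambda_min_cm)  # RMS deviation must be much smaller than this

# A surface ten times flatter than the scale loses about one percent at sigma_max.
attenuation_at_sigma_max = flatness_attenuation(sigma_max, scale / 10.0)
```

Because $2\pi^{2}\sigma_{\max}^{2}\delta_{\text{RMS}}^{2} = (\delta_{\text{RMS}}/\text{scale})^{2}$, the loss at $\sigma_{\max}$ is simply the square of the ratio between the actual RMS deviation and the scale in (5.98c).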
$$M(R\sigma\theta_{ma}) = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}},$$
and we see from Eq. (5.10e) that $M = 1$ when the misalignment angle $\theta_{ma}$ is zero. A misaligned moving mirror is, of course, misaligned with respect to the fixed mirror, so we can always model this imperfection as a misalignment of the fixed mirror rather than the moving mirror (see Fig. 5.34). The size of the fixed mirror’s misalignment angle is also $\theta_{ma}$, the same as the size of the moving mirror’s misalignment angle. This means that when $\theta_{ma} > 0$, as in Fig. 5.34, we have a special case of the nonflat optical surface discussed in Sec. 5.20. Hence, when using the analysis for a nonflat optical surface in Sec. 5.20, we must also set $M = 1$ in all the formulas for $\mathbf{Z}_{\text{eff}}(\sigma)$, because otherwise we “double count” the effect of a tilted moving mirror. By the same reasoning, however, the accuracy of the procedure used to analyze nonflat surfaces can be checked by comparing it to what we get when $\theta_{ma}$ is small but not zero.
Equation (5.97b) states that when the moving or fixed mirror is not flat for any reason (including, for example, being slightly misaligned and so having a nonzero $\theta_{ma}$ value) the original formula for the effective spectrum $\mathbf{Z}_{\text{eff}}(\sigma)$ should be multiplied by a factor of
$$\left[1 - 2\pi^{2}\sigma^{2}\langle\delta^{2}\rangle\right].$$
Equations (5.82a) and (5.83d), on the other hand, require the formulas for $\mathbf{Z}_{\text{eff}}(\sigma)$ to be multiplied by
$$M(R\sigma\theta_{ma}) = \frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}}$$
when the misalignment angle $\theta_{ma}$ is small but nonzero. (As before, $R$ is the radius of the circular cross section of the beam passing through the interferometer.) Comparing these two expressions, we see that for them to be consistent
An Example of How to Analyze Nonflat Optical Surfaces · 5.21
FIGURE 5.34. Labels: Moving Mirror; $\theta_{ma}$; Fixed Mirror with Tilt; Radiance entering the Interferometer; Radiance heading to the detector.
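$$\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} \overset{?}{\cong} \left[1 - 2\pi^{2}\sigma^{2}\langle\delta^{2}\rangle\right]. \tag{5.99}$$
$$\frac{J_1(4\pi R\sigma\theta_{ma})}{2\pi R\sigma\theta_{ma}} \cong (2\pi R\sigma\theta_{ma})^{-1} \cdot \left[2\pi R\sigma\theta_{ma} - \frac{1}{2}(2\pi R\sigma\theta_{ma})^{3}\right] = 1 - \frac{1}{2}(2\pi R\sigma\theta_{ma})^{2} = 1 - 2\pi^{2}R^{2}\sigma^{2}\theta_{ma}^{2}. \tag{5.100b}$$
$$\delta(x, y) = 2\theta_{ma}\, y. \tag{5.101a}$$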
Circular symmetry allows us to choose the orientation of the $x, y$ axes any way we please, and they have been chosen so that the moving mirror is tilted by a rotation $\theta_{ma}$ about the $x$ axis. We convert to polar coordinates using
$$x = r\cos\phi, \qquad y = r\sin\phi,$$
$$\delta(r, \phi) = 2\theta_{ma}\, r\sin\phi. \tag{5.101b}$$
Since the main beam has a circular cross section of radius $R$, Eq. (5.97c) can be written as
91
See Eq. (9.1.10) on page 360 of Handbook of Mathematical Functions, edited by Milton Abramowitz and Irene
Stegun.
FIGURE 5.35. Labels: $\delta = 2 \cdot \theta_{ma} \cdot y$; Radiance Impinging on the Tilted Mirror.
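$$
\begin{aligned}
\langle\delta^{2}\rangle &= \frac{1}{\pi R^{2}} \int_{0}^{2\pi} d\phi \int_{0}^{R} dr\, r\, [2\theta_{ma}\, r\sin\phi]^{2} = \frac{4\theta_{ma}^{2}}{\pi R^{2}} \int_{0}^{2\pi} d\phi\, \sin^{2}\phi \int_{0}^{R} dr\, r^{3} \\
&= \frac{4\theta_{ma}^{2}}{\pi R^{2}} \left[\frac{\phi}{2} - \frac{\sin(2\phi)}{4}\right]_{0}^{2\pi} \cdot \frac{R^{4}}{4} = R^{2}\theta_{ma}^{2}.
\end{aligned}
\tag{5.101c}
$$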
Substitution of Eq. (5.100b) into the left-hand side, and Eq. (5.101c) into the right-hand side, of the proposed equality in (5.99) gives
$$1 - 2\pi^{2}R^{2}\sigma^{2}\theta_{ma}^{2} \overset{?}{=} \left[1 - 2\pi^{2}\sigma^{2}R^{2}\theta_{ma}^{2}\right],$$
which is clearly true. This result not only shows why we should be careful to regard a misaligned moving or fixed mirror as a special type of nonflat optical surface but also checks the procedure used in Sec. 5.20 to analyze more general types of nonflat optical surfaces.
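The consistency check in (5.99) can also be carried out numerically. The sketch below is an illustration under stated assumptions: it evaluates $J_1$ from its standard ascending power series, integrates (5.97c) over a disc with a simple midpoint rule, and uses hypothetical values for $R$, $\sigma$, and $\theta_{ma}$. All function names are illustrative.

```python
import math

def bessel_j1(x, terms=30):
    """Ascending series J1(x) = sum_k (-1)^k (x/2)^(2k+1) / (k! (k+1)!)."""
    return sum(
        (-1) ** k * (x / 2.0) ** (2 * k + 1) / (math.factorial(k) * math.factorial(k + 1))
        for k in range(terms)
    )

def tilt_factor(R, sigma, theta):
    """Misalignment term M = J1(4 pi R sigma theta) / (2 pi R sigma theta)."""
    u = 2.0 * math.pi * R * sigma * theta
    return bessel_j1(2.0 * u) / u

def mean_square_deviation(R, theta, n_r=200, n_phi=200):
    """Midpoint-rule estimate of Eq. (5.97c) for delta = 2 theta r sin(phi) on a disc of radius R."""
    dr = R / n_r
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            total += (2.0 * theta * r * math.sin(phi)) ** 2 * r * dr * dphi
    return total / (math.pi * R * R)

def nonflat_factor(sigma, mean_sq):
    """Factor [1 - 2 pi^2 sigma^2 <delta^2>] from Eq. (5.97b)."""
    return 1.0 - 2.0 * math.pi ** 2 * sigma ** 2 * mean_sq

# Hypothetical case: 2 cm beam radius, 1000 cm^-1, 10 microradians of tilt.
R_cm, sigma_cm, theta_rad = 2.0, 1000.0, 1.0e-5
mean_sq = mean_square_deviation(R_cm, theta_rad)
left = tilt_factor(R_cm, sigma_cm, theta_rad)
right = nonflat_factor(sigma_cm, mean_sq)
```

For these assumed values `mean_sq` lands close to $R^{2}\theta_{ma}^{2}$, and `left` and `right` differ only by terms of fourth order in $2\pi R\sigma\theta_{ma}$, which is the level of agreement the small-angle expansion in (5.100b) leads one to expect.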
Sampling the Interferogram Signal · 5.22
FIGURE. Labels: Ideal Beam Splitter; Outside Radiance entering the Interferometer; Laser; Lens; Trigger Circuit processing the Signal from the Detector; Interferometer; Laser Detector; Detector; Detector Circuit; A/D Converter.
FIGURE 5.37. Labels: Total Power in the Laser Interference Signal; Laser Trigger Lever; $\lambda_0$.
laser-trigger and main-beam detector circuits.92 As a general rule, this makes the error a great
deal smaller. Similarly, slight changes in the overall size of the interferometer setup due to
mechanical flexing need no longer concern us; the laser beam establishes an invariant “ruler” that
does not care whether the overall distance between, say, the beam splitter and the fixed mirror
has changed by several microns since the last time the instrument was calibrated.
Section 5.14 above points out that to remove the background radiance from the main-beam
detector signal, we just subtract the interferogram signal produced by a very cold source from the
interferogram signal produced by the source whose spectrum we want to measure. Equation
(5.62c) describes this process as
92
In Chapter 8 we analyze this sort of sampling error as a random source of noise.
$$z(\chi) = z_{\text{tot}}(\chi) - z^{(\text{cold})}_{C}(\chi),$$
where $z_{\text{tot}}(\chi)$ is the interferogram signal produced by the combination of the desired source spectrum with the instrument background, $z^{(\text{cold})}_{C}(\chi)$ is the interferogram signal produced by just the instrument background when observing a very cold source, and $z(\chi)$ is the interferogram signal produced by just the source spectrum we want to measure. When we sample the signal leaving the detector circuit at equal optical-path-difference intervals $\Delta\chi$, what we get is either
when observing the source radiance combined with the instrument background or
The fast-Fourier transform algorithms that are applied to these samples work best when $N$ is a multiple of 2, as is mentioned at the beginning of Sec. 2.22 of Chapter 2, so in (5.103a) the index values of $m$ can be chosen so that
$$m = -\frac{N}{2} + 1,\ -\frac{N}{2} + 2,\ \ldots,\ -1,\ 0,\ 1,\ \ldots,\ \frac{N}{2} - 1,\ \frac{N}{2}. \tag{5.103b}$$
Note that (5.103b) specifies one “extra” sample to occur on the positive $\chi$ axis.
93
To keep things simple, we assume for now that the sample with index m = 0 occurs at χ = 0 . Section 5.26
below shows what happens when we stop assuming that one of the samples occurs at exactly χ = 0 .
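$$z(\chi) = \int_{-\infty}^{\infty} \mathbf{Z}_{\text{eff}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma \tag{5.104a}$$
$$\mathbf{Z}_{\text{eff}}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \mathcal{R}(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\text{FOV}}(|\sigma|). \tag{5.104b}$$
Usually the aft optics’ transmission function $\tau_a(\sigma)$ is nonzero only for those wavenumbers $\sigma$ that satisfy
$$\sigma_{\min} \le |\sigma| \le \sigma_{\max}, \tag{5.105}$$
making the effective spectrum $\mathbf{Z}_{\text{eff}}$ equal to zero for $|\sigma| > \sigma_{\max}$ or $|\sigma| < \sigma_{\min}$ as shown in Fig. 5.38.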
FIGURE 5.38. Axis label: $\mathbf{Z}_{\text{eff}}(\sigma)$.
Setting Up the Discrete Fourier Transform · 5.23
FIGURE 5.39. Axis label: $z_{\text{trunc}}(\chi)$; axis ticks: $-D$, $D$.
The work done in Sec. 5.15 shows that the interferogram signal for a double-sided interferogram, which we call the truncated interferogram signal, can be written as
$$z_{\text{trunc}}(\chi) = \Pi(\chi, D)\, z(\chi) \tag{5.106a}$$
so that
$$z_{\text{trunc}}(\chi) = \begin{cases} z(\chi) & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} \tag{5.106b}$$
$$\Delta\chi = \frac{2D}{N}. \tag{5.107a}$$
Since
$$z_{\text{trunc}}(\chi) = z(\chi) \quad \text{for } |\chi| \le D, \tag{5.107b}$$
it then follows that
$$z_{\text{trunc}}(m\Delta\chi) = z(m\Delta\chi) \tag{5.107c}$$
for
$$m = -\frac{N}{2} + 1,\ -\frac{N}{2} + 2,\ \ldots,\ -1,\ 0,\ 1,\ \ldots,\ \frac{N}{2} - 1,\ \frac{N}{2}. \tag{5.107d}$$
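The sampling grid described by (5.107a) and (5.107d) can be written out explicitly. This is a minimal sketch with hypothetical function names; it assumes, as the text does for now, that the sample with index $m = 0$ falls at $\chi = 0$.

```python
def sample_indices(n):
    """Index values m = -N/2 + 1, ..., N/2 from Eq. (5.107d); N must be even."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("N must be a positive even integer")
    half = n // 2
    return list(range(-half + 1, half + 1))

def sample_positions(n, D):
    """Optical path differences m * delta_chi, with delta_chi = 2D / N from Eq. (5.107a)."""
    delta_chi = 2.0 * D / n
    return [m * delta_chi for m in sample_indices(n)]

indices = sample_indices(8)
positions = sample_positions(8, 1.0)
```

With $N = 8$ and $D = 1$ the indices run from $-3$ to $4$ and the positions from $-0.75$ to $1.0$, which shows the one “extra” sample on the positive $\chi$ axis: the grid reaches $+D$ but stops one step short of $-D$.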
Equation (5.65c) shows, after we substitute from (5.106a), that the effective spectrum associated with the unsampled signal is
$$\int_{-\infty}^{\infty} z_{\text{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \mathbf{Z}_{\text{eff,trunc}}(\sigma) \tag{5.108a}$$
with
$$\mathbf{Z}_{\text{eff,trunc}}(\sigma) = [2D\,\operatorname{sinc}(2\pi\sigma D)] * \mathbf{Z}_{\text{eff}}(\sigma). \tag{5.108b}$$
Figure 5.40 shows that $D$ will be chosen large enough to make $\mathbf{Z}_{\text{eff,trunc}}$ just a slightly blurred version of $\mathbf{Z}_{\text{eff}}$ with a tendency to oscillate at abrupt changes in value. According to the discussion following Eq. (5.82c) above, the quantities $H$, $M$, $\mathcal{R}$, $\eta$, $\tau_a$, and $\tau_f$ are all slowly varying functions of their arguments.94 This means that when the formula for $\mathbf{Z}_{\text{eff}}$ in (5.104b) is substituted into Eq. (5.108b), the sinc function is narrow enough for these quantities to be treated as quasi-constant with respect to the convolution [see Eq. (5C.1) in Appendix 5C]. Hence, we can approximate (5.108b) as
$$\mathbf{Z}_{\text{eff,trunc}}(\sigma) \cong \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \mathcal{R}(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{\text{mnf}}(|\sigma|), \tag{5.108c}$$
where
$$L_{\text{mnf}}(\sigma) = [2D\,\operatorname{sinc}(2\pi\sigma D)] * L_{\text{FOV}}(|\sigma|). \tag{5.108d}$$
94
So if the aft-optics transmission $\tau_a$ and the detector responsivity $\mathcal{R}$ drop to zero for $|\sigma| > \sigma_{\max}$ and
$|\sigma| < \sigma_{\min}$, this must occur slowly compared to the rate at which $L_{\text{FOV}}$ varies with $\sigma$.
FIGURE 5.40. Axis label: $\mathbf{Z}_{\text{eff,trunc}}(\sigma)$.
We note that because both $2D\,\operatorname{sinc}(2\pi\sigma D)$ and $L_{\text{FOV}}(|\sigma|)$ are even functions of $\sigma$, their convolution $L_{\text{mnf}}(\sigma)$ is also even [see Eq. (2.38f) in Chapter 2],
Even though the argument of $L_{\text{mnf}}$ does not need absolute value signs because $L_{\text{mnf}}$ is by definition in (5.108d) already an even function, they are put there anyway to keep the notation parallel with the previous $L$-type radiance symbols. The mnf subscript indicates that $L_{\text{mnf}}$ is the measured, noise-free spectral radiance produced by the interferometer; it is $L(\sigma)$ blurred both by the finite field-of-view effect discussed in Sec. 5.17 and the finite-interferogram effect discussed in Sec. 5.15. Figures 5.41(a)–5.41(c) show the progression from the original $L(\sigma)$ radiance spectrum to $L_{\text{FOV}}(\sigma)$ defined in Eq. (5.83e) above to $L_{\text{mnf}}(\sigma)$ defined in Eq. (5.108d). The unsampled, noise-free signal can now be written as the Fourier transform pair,
$$\mathbf{Z}_{\text{eff,trunc}}(\sigma) = \int_{-\infty}^{\infty} z_{\text{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.109a}$$
and
$$z_{\text{trunc}}(\chi) = \int_{-\infty}^{\infty} \mathbf{Z}_{\text{eff,trunc}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma. \tag{5.109b}$$
Function $L_{\text{mnf}}(|\sigma|)$ is closely related to function $L_{\text{eff}}(|\sigma|)$ in Eqs. (5.94e) and (5.94f). Hence, it makes sense to assume that
$$L_{\text{mnf}}(|\sigma|) \cong L_{\text{eff}}(|\sigma|) \tag{5.110}$$
Perhaps the most important decision involved in going from the integral to the discrete Fourier transform is the choice of step size $\Delta\chi$ between the equally spaced samples of $z_{\text{trunc}}(\chi)$. Converting Eq. (2.99a) of Chapter 2 from variables $t$ and $f$ to variables $\chi$ and $\sigma$, we see that the Nyquist wavenumber $\sigma_{\text{Nyq}}$ corresponding to the Nyquist frequency $f_{\text{Nyq}}$ is given by
$$\sigma_{\text{Nyq}} = \frac{1}{2\Delta\chi}. \tag{5.112}$$
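Equation (5.112) ties the sampling step directly to the largest wavenumber that can be represented. A minimal sketch follows; the function names, the `margin` parameter, and the numerical values are hypothetical.

```python
def nyquist_wavenumber(delta_chi):
    """Eq. (5.112): sigma_Nyq = 1 / (2 * delta_chi)."""
    return 1.0 / (2.0 * delta_chi)

def max_step_for(sigma_max, margin=1.0):
    """Largest sampling step for which sigma_Nyq is at least margin * sigma_max."""
    return 1.0 / (2.0 * margin * sigma_max)

# Hypothetical band edge at 4000 cm^-1, with sigma_Nyq chosen 25% above it.
step_cm = max_step_for(4000.0, margin=1.25)
sigma_nyq = nyquist_wavenumber(step_cm)
```

With these assumed numbers the step comes out to $10^{-4}$ cm and the Nyquist wavenumber to 5000 cm$^{-1}$; a larger `margin` leaves more room between $\sigma_{\max}$ and $\sigma_{\text{Nyq}}$ at the cost of more samples per scan.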
The discussion at the beginning of Sec. 2.22 of Chapter 2 shows that oversampling the interferogram signal $z_{\text{trunc}}(\chi)$ means choosing the sampling interval $\Delta\chi$ in such a way that the Nyquist wavenumber $\sigma_{\text{Nyq}}$ satisfies
Oversampling the Interferogram · 5.24
FIGURE 5.41(a). Labels: Spectral Edge E; $L(\sigma)$.
This small piece of the radiance spectrum, the same piece plotted in Fig. 5.30(a) above, is here graphed in all its detail as it enters the interferometer. This is why the y axis is labeled $L(\sigma)$. Note that Spectral Edge E lies at wavenumber 1010 cm$^{-1}$.
FIGURE 5.41(b). Labels: Spectral Edge E; $L_{\text{FOV}}(\sigma)$.
Here the same small piece of the radiance spectrum plotted in Figs. 5.41(a) and 5.30(b) is shown with the rescaled wavenumber axis and blurring due to the interferometer’s finite field of view. Hence, the y axis is labeled $L_{\text{FOV}}(\sigma)$. Note that Spectral Edge E now occurs at a slightly smaller wavenumber than before.
FIGURE 5.41(c). Labels: Spectral Edge E; $L_{\text{mnf}}(\sigma)$.
with $\sigma_{\max}$ defined by inequality (5.105) and Fig. 5.38. The larger $\sigma_{\text{Nyq}}$ is compared to $\sigma_{\max}$, the more accurate the transformation from the integral Fourier transform to the discrete Fourier transform. The reason for this, of course, is that the larger $\sigma_{\text{Nyq}}$ is compared to $\sigma_{\max}$, the less likely it is that significant amounts of aliasing will occur when going from the integral to the discrete Fourier transform. Although both aliasing and the transformation from the integral to the discrete Fourier transform have already been covered in Secs. 2.21–2.23 of Chapter 2, it does no harm to review these ideas here in the context of the truncated interferogram signal $z_{\text{trunc}}(\chi)$ and its effective spectrum $\mathbf{Z}_{\text{eff,trunc}}(\sigma)$.
The first step in setting up the discrete Fourier transform is to construct function $z^{[\infty]}_{\text{trunc}}(\chi, 2D)$ from $z_{\text{trunc}}(\chi)$ following the procedure used in Eq. (2.91b) of Chapter 2,
$$z^{[\infty]}_{\text{trunc}}(\chi, 2D) = \sum_{k=-\infty}^{\infty} z_{\text{trunc}}(\chi - 2kD). \tag{5.113a}$$
From Eq. (5.106b) and Fig. 5.39, we know that $z_{\text{trunc}}$ is zero for $|\chi| > D$. Consequently $z^{[\infty]}_{\text{trunc}}(\chi, 2D)$ has the form shown in Fig. 5.42. This matches the situation shown in Fig. 2.12(a) of Chapter 2, with the original signal $z_{\text{trunc}}$ turned into a nonoverlapping, periodic signal $z^{[\infty]}_{\text{trunc}}$ of period $2D$. In particular, we note that
$$z^{[\infty]}_{\text{trunc}}(\chi, 2D) = z_{\text{trunc}}(\chi) \quad \text{for } |\chi| \le D. \tag{5.113b}$$
$$\mathbf{Z}^{[\infty]}_{\text{eff,trunc}}(\sigma, 2\sigma_{\text{Nyq}}) = \sum_{k=-\infty}^{\infty} \mathbf{Z}_{\text{eff,trunc}}(\sigma - 2k\sigma_{\text{Nyq}}). \tag{5.113c}$$
Glancing at the plot of $\mathbf{Z}_{\text{eff,trunc}}$ in Fig. 5.43, we see that the plot of $\mathbf{Z}^{[\infty]}_{\text{eff,trunc}}$ has the form shown in Fig. 5.44. The original signal $\mathbf{Z}_{\text{eff,trunc}}$ is turned into a periodic signal $\mathbf{Z}^{[\infty]}_{\text{eff,trunc}}$ of period $2\sigma_{\text{Nyq}}$, matching the situation shown in Fig. 2.12(b) of Chapter 2. Consequently, we have that
FIGURE 5.42. Axis labels: $z^{[\infty]}_{\text{trunc}}(\chi, 2D)$; $\chi$; axis ticks: $-5D$, $-3D$, $-D$, $D$, $3D$, $5D$.
The edge ripples of $\mathbf{Z}_{\text{eff,trunc}}$ are small and become smaller as we get further from the edge, but they can in principle extend indefinitely far along the wavenumber axis, which means that overlapping can occur, making $\mathbf{Z}^{[\infty]}_{\text{eff,trunc}}$ not exactly equal to $\mathbf{Z}_{\text{eff,trunc}}$ for $|\sigma| \le \sigma_{\text{Nyq}}$. Reviewing the discussion following Eqs. (2.93a) and (2.93b) of Chapter 2, we see that approximating $z_{\text{trunc}}$ and $\mathbf{Z}_{\text{eff,trunc}}$ by periodic functions (with periods $2D$ and $2\sigma_{\text{Nyq}}$ respectively) is exactly what we need to do when approximating the integral Fourier transform by the discrete Fourier transform. Now we understand why the correct choice of $\sigma_{\text{Nyq}}$ is so important; if $\sigma_{\text{Nyq}}$ is set too close to $\sigma_{\max}$, the ringing at the edges of $\mathbf{Z}_{\text{eff,trunc}}$ could create significant amounts of overlap in its periodic extension to function $\mathbf{Z}^{[\infty]}_{\text{eff,trunc}}$.
As is pointed out in Sec. 2.22 of Chapter 2, this sort of overlap is called aliasing of the signal spectrum. When $\sigma_{\text{Nyq}}$ is chosen to be decidedly greater than $\sigma_{\max}$, the interferogram signal is said to be oversampled. The choice of $D$ made when going from $z_{\text{trunc}}$ to $z^{[\infty]}_{\text{trunc}}$, although in principle equally important in characterizing the discrete Fourier transform, is in practice specified at an earlier stage of the interferometer design when deciding on the spectral resolution of the measured spectrum [see Eq. (5.67) above].
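Aliasing can be seen directly in the samples themselves: a spectral component above $\sigma_{\text{Nyq}}$ produces exactly the same sample values as a component inside the band. The sketch below is an illustration only, with hypothetical function names and numbers, and a single cosine standing in for one spectral component of the interferogram.

```python
import math

def alias_wavenumber(sigma, sigma_nyq):
    """Fold sigma into [-sigma_nyq, sigma_nyq) by shifting it by multiples of 2 * sigma_nyq."""
    period = 2.0 * sigma_nyq
    return (sigma + sigma_nyq) % period - sigma_nyq

def sampled_cosine(sigma, delta_chi, n):
    """Samples of cos(2 pi sigma chi) taken at chi = m * delta_chi for m = 0, ..., n - 1."""
    return [math.cos(2.0 * math.pi * sigma * m * delta_chi) for m in range(n)]

# Hypothetical sampling with sigma_Nyq = 5000 cm^-1 and a component at 6000 cm^-1.
sigma_nyq = 5000.0
delta_chi = 1.0 / (2.0 * sigma_nyq)
folded = alias_wavenumber(6000.0, sigma_nyq)

above_band = sampled_cosine(6000.0, delta_chi, 16)
inside_band = sampled_cosine(abs(folded), delta_chi, 16)
largest_difference = max(abs(a - b) for a, b in zip(above_band, inside_band))
```

Here the 6000 cm$^{-1}$ component folds to $-4000$ cm$^{-1}$, and its samples are indistinguishable (to rounding error) from those of a 4000 cm$^{-1}$ component. This is why $\sigma_{\text{Nyq}}$ has to sit decidedly above $\sigma_{\max}$, ripples included.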
Because $z_{\text{trunc}}(\chi)$ is zero for $|\chi| > D$, and both the real and imaginary components of $\mathbf{Z}_{\text{eff,trunc}}(\sigma)$ are negligible for $|\sigma| > \sigma_{\text{Nyq}}$, the pair of integral Fourier transforms in Eqs. (5.109a) and (5.109b) can be approximated by
FIGURE 5.44.
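$$\mathbf{Z}_{\text{eff,trunc}}(\sigma) = \int_{-D}^{D} z_{\text{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.114a}$$
and
$$z_{\text{trunc}}(\chi) = \int_{-\sigma_{\text{Nyq}}}^{\sigma_{\text{Nyq}}} \mathbf{Z}_{\text{eff,trunc}}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma. \tag{5.114b}$$
With the understanding that only the signal values at $|\chi| \le D$ and the spectral values at $|\sigma| \le \sigma_{\text{Nyq}}$ are of interest on the left-hand sides of the formulas, we use Eqs. (5.113b) and (5.113d) to replace $z_{\text{trunc}}$ by $z^{[\infty]}_{\text{trunc}}$ and $\mathbf{Z}_{\text{eff,trunc}}$ by $\mathbf{Z}^{[\infty]}_{\text{eff,trunc}}$. Equations (5.114a) and (5.114b) now become
$$\mathbf{Z}^{[\infty]}_{\text{eff,trunc}}(\sigma, 2\sigma_{\text{Nyq}}) = \int_{-D}^{D} z^{[\infty]}_{\text{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi \tag{5.115a}$$
and
$$z^{[\infty]}_{\text{trunc}}(\chi, 2D) = \int_{-\sigma_{\text{Nyq}}}^{\sigma_{\text{Nyq}}} \mathbf{Z}^{[\infty]}_{\text{eff,trunc}}(\sigma, 2\sigma_{\text{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma. \tag{5.115b}$$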
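Working first with the right-hand side of Eq. (5.115b), we note that
$$
\begin{aligned}
\int_{-\sigma_{\text{Nyq}}}^{\sigma_{\text{Nyq}}} \mathbf{Z}^{[\infty]}_{\text{eff,trunc}}(\sigma, 2\sigma_{\text{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma
&= \int_{-\sigma_{\text{Nyq}}}^{0} \mathbf{Z}^{[\infty]}_{\text{eff,trunc}}(\sigma, 2\sigma_{\text{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma + \int_{0}^{\sigma_{\text{Nyq}}} \mathbf{Z}^{[\infty]}_{\text{eff,trunc}}(\sigma, 2\sigma_{\text{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma \\
&= \int_{\sigma_{\text{Nyq}}}^{2\sigma_{\text{Nyq}}} \mathbf{Z}^{[\infty]}_{\text{eff,trunc}}(\sigma' - 2\sigma_{\text{Nyq}}, 2\sigma_{\text{Nyq}})\, e^{2\pi i\sigma'\chi}\, e^{-2\pi i(2\sigma_{\text{Nyq}})\chi}\, d\sigma' + \int_{0}^{\sigma_{\text{Nyq}}} \mathbf{Z}^{[\infty]}_{\text{eff,trunc}}(\sigma, 2\sigma_{\text{Nyq}})\, e^{2\pi i\sigma\chi}\, d\sigma,
\end{aligned}
\tag{5.116}
$$
where in the last step the variable of integration in the first integral has been changed to $\sigma' = \sigma + 2\sigma_{\text{Nyq}}$. From Eq. (5.112) we get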
- 711 -
5 · Description of Practical Interferometer Measurements
e 2 & i ( )
e2& im 1 .
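Substituting (5.116) back into (5.115b) and deciding to evaluate $z^{[\infty]}_{\mathrm{trunc}}$ only at those optical-path differences $\chi$ for which $\chi/\Delta\chi = m$, we get
$$z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) = \int_{\sigma_{\mathrm{Nyq}}}^{2\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma' - 2\sigma_{\mathrm{Nyq}}, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma' m\Delta\chi}\, d\sigma'
 + \int_{0}^{\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma m\Delta\chi}\, d\sigma .$$
This becomes, dropping the prime and recognizing that $Z^{[\infty]}_{\mathrm{eff,trunc}}$ is periodic with period $2\sigma_{\mathrm{Nyq}}$,
$$z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) = \int_{0}^{2\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma m\Delta\chi}\, d\sigma . \qquad (5.117)$$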
Now we switch our attention to Eq. (5.115a). Following the same procedure as before, this time changing the variable of integration to $\chi' = \chi + 2D$, we write its right-hand side as
$$\int_{-D}^{D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi
 = \int_{D}^{2D} z^{[\infty]}_{\mathrm{trunc}}(\chi' - 2D, 2D)\, e^{-2\pi i\sigma\chi'}\, e^{2\pi i\sigma(2D)}\, d\chi'
 + \int_{0}^{D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi .$$
Substituting this into (5.115a) gives, since $z^{[\infty]}_{\mathrm{trunc}}$ is periodic with period 2D,
$$Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}}) = \int_{0}^{D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, d\chi
 + \int_{D}^{2D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i\sigma\chi}\, e^{2\pi i\sigma(2D)}\, d\chi , \qquad (5.118)$$
where the prime has been dropped from the integral between D and 2D. From Eq. (2.93d) of Chapter 2, we note (remembering that variables $\chi$ and $\sigma$ here correspond, respectively, to variables t and ƒ there) that the interval $\Delta\sigma$ between samples of $Z^{[\infty]}_{\mathrm{eff,trunc}}$ in the discrete Fourier transform is
$$\Delta\sigma = \frac{1}{2D} \qquad (5.119)$$
when $z^{[\infty]}_{\mathrm{trunc}}$ has period 2D. This means that
$$e^{2\pi i\sigma(2D)} = e^{2\pi i(\sigma/\Delta\sigma)} ,$$
and when $\sigma/\Delta\sigma = n$ is an integer,
$$e^{2\pi i(\sigma/\Delta\sigma)} = e^{2\pi i n} = 1 .$$
Now, deciding to evaluate $Z^{[\infty]}_{\mathrm{eff,trunc}}$ only at wavenumbers for which $\sigma/\Delta\sigma = n$, we can write (5.118) as
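$$Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}}) = \int_{0}^{2D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i(n\Delta\sigma)\chi}\, d\chi . \qquad (5.120)$$
Collecting Eqs. (5.120) and (5.117) gives the pair
$$Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}}) = \int_{0}^{2D} z^{[\infty]}_{\mathrm{trunc}}(\chi, 2D)\, e^{-2\pi i(n\Delta\sigma)\chi}\, d\chi \qquad (5.121a)$$
and
$$z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) = \int_{0}^{2\sigma_{\mathrm{Nyq}}} Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i\sigma(m\Delta\chi)}\, d\sigma . \qquad (5.121b)$$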
The discussion following Eq. (5.106b) defined N to be the number of equally spaced samples of $z^{[\infty]}_{\mathrm{trunc}}$ between $-D$ and $D$, with $\Delta\chi = (2D)/N$, so N equally spaced samples spaced $\Delta\chi$ apart must also cover the optical-path difference between zero and 2D. We now show that N equally spaced samples of $Z^{[\infty]}_{\mathrm{eff,trunc}}$ spaced $\Delta\sigma$ apart in wavenumber cover the wavenumber interval between zero and $2\sigma_{\mathrm{Nyq}}$. Remembering that variables $\chi$ and $\sigma$ here correspond to variables t and ƒ respectively in Chapter 2, we rewrite Eq. (2.93e) of Chapter 2 as
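$$\Delta\sigma\, \Delta\chi = \frac{1}{N} . \qquad (5.122a)$$
Consequently,
$$\Delta\sigma = \frac{1}{N \Delta\chi}$$
and, since $1/\Delta\chi = 2\sigma_{\mathrm{Nyq}}$ from Eq. (5.112),
$$N \Delta\sigma = 2\sigma_{\mathrm{Nyq}} . \qquad (5.122b)$$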
Therefore N equally spaced samples $\Delta\sigma$ apart must cover the wavenumber interval between zero and $2\sigma_{\mathrm{Nyq}}$.
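The bookkeeping in Eqs. (5.112), (5.119), (5.122a), and (5.122b) can be checked with a few lines of arithmetic. The sketch below is only an illustration: the helper name `sampling_grid` and the numerical values (D = 1 cm, Nyquist wavenumber 2000 cm⁻¹) are assumptions, not values taken from the text.

```python
def sampling_grid(D, sigma_nyq):
    """Return (N, d_chi, d_sigma) for a maximum optical-path difference D
    and a Nyquist wavenumber sigma_nyq (hypothetical helper)."""
    d_chi = 1.0 / (2.0 * sigma_nyq)    # Eq. (5.112): 2*sigma_Nyq = 1/d_chi
    N = int(round(2.0 * D / d_chi))    # d_chi = 2D/N
    d_sigma = 1.0 / (2.0 * D)          # Eq. (5.119)
    return N, d_chi, d_sigma


# Assumed example values: D in cm, sigma_nyq in cm^-1.
N, d_chi, d_sigma = sampling_grid(D=1.0, sigma_nyq=2000.0)
print("N =", N)
print("N * d_sigma * d_chi =", N * d_sigma * d_chi)   # Eq. (5.122a): should be 1
print("N * d_sigma =", N * d_sigma)                   # Eq. (5.122b): 2*sigma_nyq
```

With these assumed inputs, N samples spaced `d_sigma` apart span the interval from zero to twice the Nyquist wavenumber, as Eq. (5.122b) requires.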
Having established that N equally spaced samples cover the regions of integration in Eqs.
(5.121a) and (5.121b), we approximate both integrals as sums over N equally spaced samples in
wavenumber and optical-path difference. This gives
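$$Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}}) \cong \Delta\chi \sum_{m=0}^{N-1} z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D)\, e^{-2\pi i(n\Delta\sigma)(m\Delta\chi)} \qquad (5.123a)$$
and
$$z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \cong \Delta\sigma \sum_{n=0}^{N-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)} . \qquad (5.123b)$$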
To put this into the traditional form of the discrete Fourier transforms shown in Eqs. (2.96a) and (2.96b) in Chapter 2, just multiply both sides of (5.123a) by $\Delta\sigma$ and use $\Delta\sigma\Delta\chi = N^{-1}$ from (5.122a) to get
$$z_m = \sum_{n=0}^{N-1} Z_n\, e^{2\pi i\, nm/N} \qquad (5.124a)$$
and
$$Z_n = \frac{1}{N} \sum_{m=0}^{N-1} z_m\, e^{-2\pi i\, nm/N} , \qquad (5.124b)$$
where
$$z_m = z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \qquad (5.124c)$$
and
$$Z_n = \Delta\sigma \cdot Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}}) . \qquad (5.124d)$$
It is important to remember, when using the discrete Fourier transforms defined in (5.124a)–(5.124d) to approximate the integral Fourier transforms in (5.114a) and (5.114b), that functions $z^{[\infty]}_{\mathrm{trunc}}$ and $Z^{[\infty]}_{\mathrm{eff,trunc}}$ are qualitatively different from the truncated interferogram signal $z_{\mathrm{trunc}}$ and its associated spectrum $Z_{\mathrm{eff,trunc}}$ with which we began, because functions $z^{[\infty]}_{\mathrm{trunc}}$ and $Z^{[\infty]}_{\mathrm{eff,trunc}}$, unlike $z_{\mathrm{trunc}}$ and $Z_{\mathrm{eff,trunc}}$, are periodic with periods of $2D$ and $2\sigma_{\mathrm{Nyq}}$ respectively. We also note that the unapodized spectral resolution $\Delta\sigma$ given in Eq. (5.67) above is, when using the discrete Fourier transform, the same as the distance between spectral samples given by Eq. (5.119),
$$\Delta\sigma = \frac{1}{2D} .$$
Consequently, the unapodized spectral resolution can be defined very simply and exactly as the distance between adjacent spectral samples after the discrete Fourier transform is applied to the sampled interferogram signal. This is one reason why the unapodized spectral resolution has become such a widespread figure of merit for resolution in Fourier-transform spectroscopy: when discrete Fourier transforms are used to approximate integral Fourier transforms, it sets an easily understood limit on how much spectral detail we can hope to resolve.
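The normalization in Eqs. (5.124a) and (5.124b) puts the factor 1/N on the forward transform, which is the opposite of NumPy's default. The following sketch shows one way to map the two conventions onto each other; the function names are illustrative, and the test signal (a cosine with three cycles over 16 samples) is an assumed example.

```python
import numpy as np


def forward_dft(z):
    """Z_n = (1/N) * sum_m z_m * exp(-2*pi*i*n*m/N), as in Eq. (5.124b)."""
    z = np.asarray(z, dtype=complex)
    return np.fft.fft(z) / z.size


def inverse_dft(Z):
    """z_m = sum_n Z_n * exp(+2*pi*i*n*m/N), as in Eq. (5.124a)."""
    Z = np.asarray(Z, dtype=complex)
    return np.fft.ifft(Z) * Z.size


def forward_dft_direct(z):
    """Same sum as forward_dft, written out as an explicit matrix product."""
    z = np.asarray(z, dtype=complex)
    N = z.size
    n = np.arange(N).reshape(-1, 1)
    m = np.arange(N).reshape(1, -1)
    return (np.exp(-2j * np.pi * n * m / N) @ z) / N


# Assumed test signal: three cosine cycles across N = 16 samples.
z = np.cos(2 * np.pi * 3 * np.arange(16) / 16)
Z = forward_dft(z)
print(np.round(Z.real, 3))
```

Because the transforms are periodic, the negative-wavenumber half of the cosine shows up in bin 13 = 16 − 3 rather than at a negative index, which is the same aliasing picture the text describes for the interval between $\sigma_{\mathrm{Nyq}}$ and $2\sigma_{\mathrm{Nyq}}$.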
of the spectrum. We do this by requiring that $\sigma_{\mathrm{Nyq}}$ end up well to the right of $\sigma_{\max}$ in Fig. 5.43 when creating the periodic function
$$Z^{[\infty]}_{\mathrm{eff,trunc}}(\sigma, 2\sigma_{\mathrm{Nyq}}) = \sum_{k=-\infty}^{\infty} Z_{\mathrm{eff,trunc}}(\sigma - 2k\sigma_{\mathrm{Nyq}}) \qquad (5.125a)$$
in Eq. (5.113c) above. Consequently $\Delta\chi$, the optical-path difference between adjacent samples of the interferogram signal $z_{\mathrm{trunc}}(\chi)$, must be chosen small enough that, according to formula (5.112),
$$\sigma_{\mathrm{Nyq}} = \frac{1}{2\Delta\chi} \qquad (5.125b)$$
is decidedly larger than $\sigma_{\max}$. Since $Z_{\mathrm{eff,trunc}}(\sigma)$ also becomes negligibly small for $|\sigma| < \sigma_{\min}$, we may also be able to follow the strategy outlined in Sec. 2.23 of Chapter 2 and undersample the interferogram signal instead. We now review how undersampling works, explaining in more detail how to set up the appropriate discrete Fourier transform for an undersampled interferogram signal.
The first step in undersampling an interferogram signal is to compare the wavenumber interval $(\sigma_{\max} - \sigma_{\min})$ to $\sigma_{\min}$ to see how many aliases of the original spectrum can be fit between $\sigma = 0$ and $\sigma = \sigma_{\min}$. For the spectrum in Fig. 5.45, we could choose the undersampled Nyquist wavenumber $\sigma^{(u)}_{\mathrm{Nyq}}$ small enough to fit in as many as two aliases, as shown by the dashed curves; but we decide to be conservative and only fit in one, as shown in Fig. 5.46. This conservative strategy is called undersampling by a factor of 2.
When undersampling by a factor of 2, the old Nyquist frequency $\sigma_{\mathrm{Nyq}}$ and the new Nyquist frequency $\sigma^{(u)}_{\mathrm{Nyq}}$ are related by
$$\sigma_{\mathrm{Nyq}} = 2\sigma^{(u)}_{\mathrm{Nyq}} . \qquad (5.126)$$
Just like
$$2\sigma_{\mathrm{Nyq}} = \frac{1}{\Delta\chi} \qquad (5.127a)$$
for the old Nyquist frequency and the old sampling interval in Eq. (5.112), we associate with $\sigma^{(u)}_{\mathrm{Nyq}}$ a new sampling interval $\Delta\chi^{(u)}$ such that
$$2\sigma^{(u)}_{\mathrm{Nyq}} = \frac{1}{\Delta\chi^{(u)}} . \qquad (5.127b)$$
It follows from (5.126), (5.127a), and (5.127b) that
$$\Delta\chi^{(u)} = 2\Delta\chi . \qquad (5.127c)$$
This is, of course, why what we are doing is called undersampling by a factor of 2; according to
(5.127c), the interferogram signal is to be sampled half as often as before.
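In the previous section, we found the sampled interferogram signal $z^{[\infty]}_{\mathrm{trunc}}$ could be written as [see Eq. (5.123b)]
$$z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \cong \Delta\sigma \sum_{n=0}^{N-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)} . \qquad (5.128)$$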
Undersampling the Interferogram · 5.25
FIGURE 5.45. [Wavenumber axis marked at $\pm\sigma^{(u)}_{\mathrm{Nyq}}$, $\pm 2\sigma^{(u)}_{\mathrm{Nyq}}$, and $\pm 3\sigma^{(u)}_{\mathrm{Nyq}} = \pm\sigma_{\mathrm{Nyq}}$.]
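Note that here $\Delta\chi$, $\Delta\sigma$, $\sigma_{\mathrm{Nyq}}$, and N all retain the old oversampled values specified in the previous section. Assuming that the number of samples N is large, we see that as the index n goes from zero to $N - 1$, the wavenumber argument $n\Delta\sigma$ of $Z^{[\infty]}_{\mathrm{eff,trunc}}$ goes from zero to
$$(N - 1)\Delta\sigma \cong N\Delta\sigma = \frac{1}{\Delta\sigma\Delta\chi} \cdot \Delta\sigma = \frac{1}{\Delta\chi} = 2\sigma_{\mathrm{Nyq}} .$$
Here both $N = (\Delta\sigma\Delta\chi)^{-1}$ from formula (5.122a) and $2\sigma_{\mathrm{Nyq}} = (\Delta\chi)^{-1}$ from formula (5.127a) are used to get the final result. Since
$$(N - 1)\Delta\sigma \cong 2\sigma_{\mathrm{Nyq}} ,$$
we see that the sum over $Z^{[\infty]}_{\mathrm{eff,trunc}}$ in Eq. (5.128) is over the original oversampled spectrum between $\sigma = 0$ and $\sigma = \sigma_{\mathrm{Nyq}}$ and one of its aliases between $\sigma = \sigma_{\mathrm{Nyq}}$ and $\sigma = 2\sigma_{\mathrm{Nyq}}$. Suppose the old Nyquist wavenumber in Eq. (5.128) is replaced by $\sigma^{(u)}_{\mathrm{Nyq}}$, half the old Nyquist value, to get
$$z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \overset{?}{\cong} \Delta\sigma \sum_{n=0}^{N-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)} . \qquad (5.129)$$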
FIGURE 5.46. [Wavenumber axis marked at multiples of $\sigma^{(u)}_{\mathrm{Nyq}}$, from $-2\sigma^{(u)}_{\mathrm{Nyq}} = -\sigma_{\mathrm{Nyq}}$ to $4\sigma^{(u)}_{\mathrm{Nyq}} = 2\sigma_{\mathrm{Nyq}}$.]
Solid lines show the position of the original spectrum on the wavenumber axis, and the unshaded dashed lines show the aliases associated with the original Nyquist wavenumber $\sigma_{\mathrm{Nyq}}$. The shaded dashed lines show the aliases produced by undersampling. They are associated with the undersampled Nyquist wavenumber $\sigma^{(u)}_{\mathrm{Nyq}}$.
Figure 5.46 shows that the new spectrum $Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})$ has twice as many aliases as the original spectrum $Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma_{\mathrm{Nyq}})$. Comparing the new spectrum in Fig. 5.46 to the original spectrum in Fig. 5.44, we see that the sum in (5.129) covers two extra aliases in Fig. 5.46 that it did not cover in Fig. 5.44. The wavenumbers where $Z^{[\infty]}_{\mathrm{eff,trunc}}$ is zero do not, of course, contribute anything to the sum. Let's see what happens when we eliminate two of the aliases by taking the sum over the new spectrum only up to the new, rather than the old, Nyquist wavenumber. According to the discussion following Eq. (5.103a), N is even, which means that formula (5.129) can now be written as
$$z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi, 2D) \overset{?}{\cong} \Delta\sigma \sum_{n=0}^{(N/2)-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi)} . \qquad (5.130)$$
the original sum, as well as the alias between $2\sigma^{(u)}_{\mathrm{Nyq}}$ and $3\sigma^{(u)}_{\mathrm{Nyq}}$, which was part of the original sum and which we can regard as being replaced by the alias between zero and $\sigma^{(u)}_{\mathrm{Nyq}}$. The alias between zero and $\sigma^{(u)}_{\mathrm{Nyq}}$ is an exact copy of the alias between $2\sigma^{(u)}_{\mathrm{Nyq}}$ and $3\sigma^{(u)}_{\mathrm{Nyq}}$, and these two aliases are separated by a wavenumber interval [see Eqs. (5.126), (5.127a), and (5.122a)]
$$2\sigma^{(u)}_{\mathrm{Nyq}} = \sigma_{\mathrm{Nyq}} = \frac{1}{2\Delta\chi} = \frac{\Delta\sigma}{2} \cdot \frac{1}{\Delta\sigma\Delta\chi} = \frac{N\Delta\sigma}{2} .$$
Consequently, we can write that
$$Z^{[\infty]}_{\mathrm{eff,trunc}}\!\left( \left(n - \frac{N}{2}\right)\Delta\sigma,\ 2\sigma^{(u)}_{\mathrm{Nyq}} \right) = Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \qquad (5.131a)$$
when comparing spectral values in the alias between zero and $\sigma^{(u)}_{\mathrm{Nyq}}$ to spectral values in the alias between $2\sigma^{(u)}_{\mathrm{Nyq}}$ and $3\sigma^{(u)}_{\mathrm{Nyq}}$. As far as the complex exponent multiplying $Z^{[\infty]}_{\mathrm{eff,trunc}}$ is concerned, we
Suppose we add a subscript 2 to m to show that it must be a non-negative and even integer,
$$m_2 = 0, 2, 4, \ldots . \qquad (5.131b)$$
This lets us write the latest result as
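$$Z^{[\infty]}_{\mathrm{eff,trunc}}\!\left( \left(n - \frac{N}{2}\right)\Delta\sigma,\ 2\sigma^{(u)}_{\mathrm{Nyq}} \right) e^{2\pi i\,(n - (N/2))\Delta\sigma \cdot m_2\Delta\chi}
 = Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m_2\Delta\chi)} . \qquad (5.131d)$$
This shows that, whenever $m = m_2$ is a non-negative even integer, each term in the original sum over the alias in Fig. 5.46 between $2\sigma^{(u)}_{\mathrm{Nyq}}$ and $3\sigma^{(u)}_{\mathrm{Nyq}}$ is the same as the corresponding term in the new sum over the alias in Fig. 5.46 between zero and $\sigma^{(u)}_{\mathrm{Nyq}}$. Therefore, whenever the $\Delta\chi$ index is a non-negative even integer, we can remove the question mark from formula (5.130) and write
$$z^{[\infty]}_{\mathrm{trunc}}(m_2\Delta\chi, 2D) \cong \Delta\sigma \sum_{n=0}^{(N/2)-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m_2\Delta\chi)} ,$$
where we have replaced m by $m_2$ on the left-hand side to honor the restriction placed on the permitted values of the $\Delta\chi$ index. If we define an undersampled value of N,
$$N^{(u)} = N/2 , \qquad (5.132a)$$
then the formula becomes
$$z^{[\infty]}_{\mathrm{trunc}}(m_2\Delta\chi, 2D) \cong \Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m_2\Delta\chi)} . \qquad (5.132b)$$
Since $m_2$ is even, we can write
$$m_2 = 2m \quad \text{for } m = 0, 1, 2, \ldots ,$$
so that, from (5.127c),
$$m_2\Delta\chi = 2m\Delta\chi = m\Delta\chi^{(u)} ,$$
and formula (5.132b) becomes
$$z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D) \cong \Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi^{(u)})} . \qquad (5.132c)$$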
This gives one of the two formulas for the discrete Fourier transform of the undersampled
interferogram signal.
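To find the second formula, multiply both sides of (5.132c) by $e^{-2\pi i(n''\Delta\sigma)(m\Delta\chi^{(u)})}$ and sum over m:
$$\sum_{m=0}^{N^{(u)}-1} e^{-2\pi i(n''\Delta\sigma)(m\Delta\chi^{(u)})}\, z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)
 \cong \Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \sum_{m=0}^{N^{(u)}-1} e^{2\pi i(m\Delta\chi^{(u)})(n - n'')\Delta\sigma} . \qquad (5.133)$$
From (5.127c), (5.122a), and (5.132a),
$$\Delta\chi^{(u)}\Delta\sigma = 2\Delta\chi\Delta\sigma = \frac{2}{N} = \frac{1}{N^{(u)}} , \qquad (5.134a)$$
so that
$$\sum_{m=0}^{N^{(u)}-1} e^{2\pi i(m\Delta\chi^{(u)})(n - n'')\Delta\sigma}
 = \sum_{m=0}^{N^{(u)}-1} e^{2\pi i m(n - n'')/N^{(u)}}
 = \sum_{m=0}^{N^{(u)}-1} \left(w_{N^{(u)}}\right)^{m(n - n'')} ,$$
where
$$w_{N^{(u)}} = e^{2\pi i/N^{(u)}} .$$
The powers of $w_{N^{(u)}}$ sum to
$$\sum_{m=0}^{N^{(u)}-1} \left(w_{N^{(u)}}\right)^{m(n - n'')} = N^{(u)}\delta_{n'',n} ,$$
and so
$$\sum_{m=0}^{N^{(u)}-1} e^{2\pi i(m\Delta\chi^{(u)})(n - n'')\Delta\sigma} = N^{(u)}\delta_{n'',n} . \qquad (5.134b)$$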
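The orthogonality sum in Eq. (5.134b) is easy to confirm numerically. The sketch below evaluates the sum of powers of $w_{N^{(u)}}$ directly; the function name `root_of_unity_sum` and the choice $N^{(u)} = 8$ are assumptions made only for illustration.

```python
import numpy as np


def root_of_unity_sum(n, n2, N_u):
    """Sum over m = 0..N_u-1 of w**(m*(n - n2)), with w = exp(2*pi*i/N_u).

    n and n2 are integer spectral indices in the range 0..N_u-1.
    """
    w = np.exp(2j * np.pi / N_u)
    m = np.arange(N_u)
    return np.sum(w ** (m * (n - n2)))


N_u = 8  # assumed illustrative value
same = root_of_unity_sum(3, 3, N_u)       # indices equal: expect N_u
different = root_of_unity_sum(3, 5, N_u)  # indices differ: expect ~0
print(abs(same), abs(different))
```

The sum collapses to $N^{(u)}$ when the two indices agree and to zero (within rounding) otherwise, which is what lets the sum over n in Eq. (5.133) pick out a single spectral sample.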
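Using (5.134b) to carry out the sum over n on the right-hand side of (5.133) gives
$$\sum_{m=0}^{N^{(u)}-1} z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i(n''\Delta\sigma)(m\Delta\chi^{(u)})} \cong N^{(u)}\Delta\sigma\, Z^{[\infty]}_{\mathrm{eff,trunc}}(n''\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})$$
or
$$Z^{[\infty]}_{\mathrm{eff,trunc}}(n''\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \cong \frac{1}{N^{(u)}\Delta\sigma} \sum_{m=0}^{N^{(u)}-1} z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i(n''\Delta\sigma)(m\Delta\chi^{(u)})} . \qquad (5.135a)$$
Since, from (5.134a),
$$\frac{1}{N^{(u)}\Delta\sigma} = \Delta\chi^{(u)} , \qquad (5.135b)$$
this can be written, replacing $n''$ by n, as
$$Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \cong \Delta\chi^{(u)} \sum_{m=0}^{N^{(u)}-1} z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i(n\Delta\sigma)(m\Delta\chi^{(u)})} . \qquad (5.135c)$$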
Having now found the second formula for the discrete Fourier transform of the undersampled
signal, we gather together Eqs. (5.132c) and (5.135c) to write
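$$Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}}) \cong \Delta\chi^{(u)} \sum_{m=0}^{N^{(u)}-1} z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D)\, e^{-2\pi i(n\Delta\sigma)(m\Delta\chi^{(u)})} \qquad (5.136a)$$
and
$$z^{[\infty]}_{\mathrm{trunc}}(m\Delta\chi^{(u)}, 2D) \cong \Delta\sigma \sum_{n=0}^{N^{(u)}-1} Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})\, e^{2\pi i(n\Delta\sigma)(m\Delta\chi^{(u)})} . \qquad (5.136b)$$
This pair of equations has the exact same form as the pair of equations specifying the discrete Fourier transform for the oversampled signal in Eqs. (5.123a) and (5.123b), with
$$\Delta\chi \rightarrow \Delta\chi^{(u)} , \qquad N \rightarrow N^{(u)} , \qquad \text{and} \qquad \sigma_{\mathrm{Nyq}} \rightarrow \sigma^{(u)}_{\mathrm{Nyq}} .$$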
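Undersampling by a factor of 2 can be tried out on a synthetic spectrum. The sketch below is an illustration under stated assumptions: the spectrum is random, N = 64, and it is confined to the bins strictly between $\sigma^{(u)}_{\mathrm{Nyq}}$ (bin N/4) and $\sigma_{\mathrm{Nyq}}$ (bin N/2), together with its Hermitian mirror so that the interferogram is real. The discrete transforms use the normalization of Eqs. (5.124a) and (5.124b), and the function name is hypothetical.

```python
import numpy as np


def undersample_by_two(N=64, seed=0):
    """Build a band-limited spectrum, keep every second interferogram sample,
    and transform the undersampled signal with N_u = N/2 points."""
    rng = np.random.default_rng(seed)
    Z = np.zeros(N, dtype=complex)
    # Bins strictly between sigma_Nyq^(u) (index N/4) and sigma_Nyq (index N/2).
    band = np.arange(N // 4 + 1, N // 2)
    Z[band] = rng.normal(size=band.size) + 1j * rng.normal(size=band.size)
    Z[N - band] = np.conj(Z[band])        # Hermitian mirror -> real interferogram
    z = np.fft.ifft(Z) * N                # z_m = sum_n Z_n exp(+2 pi i n m / N)
    z_u = z[::2]                          # sampling interval doubled
    Z_u = np.fft.fft(z_u) / z_u.size      # forward DFT with N_u = N/2 points
    return Z, Z_u, z, band


Z, Z_u, z, band = undersample_by_two()
folded = Z[: Z.size // 2] + Z[Z.size // 2:]   # spectrum folded by N/2 bins
print(np.allclose(Z_u, folded), np.allclose(Z_u[band], Z[band]))
```

In this example the undersampled transform equals the original spectrum folded over by N/2 bins, and because the aliases do not overlap, the values in the original band are recovered unchanged from half as many samples.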
This shows we can sample the interferogram signal $z_{\mathrm{trunc}}$ with double the sampling interval used in the previous section (that is, undersample by a factor of 2) and plug the resulting $z_{\mathrm{trunc}}$ values into formula (5.136a) to get $Z^{[\infty]}_{\mathrm{eff,trunc}}(n\Delta\sigma, 2\sigma^{(u)}_{\mathrm{Nyq}})$.
Knowing that the wavenumber interval $\Delta\sigma$ has not changed from what it was before, and that the aliases of $Z_{\mathrm{eff,trunc}}$ do not overlap when undersampled by a factor of 2, we can now use the correspondences shown in Fig. 5.46 to extract the true spectral values between $-2\sigma^{(u)}_{\mathrm{Nyq}}$ and $-\sigma^{(u)}_{\mathrm{Nyq}}$, and between $\sigma^{(u)}_{\mathrm{Nyq}}$ and $2\sigma^{(u)}_{\mathrm{Nyq}}$.
When oversampling the interferogram signal $z_{\mathrm{trunc}}$ in the previous section, N interferogram samples are used to find the spectrum $Z_{\mathrm{eff,trunc}}$; and in this section, when undersampling the
$$m = -\frac{N}{2} + 1,\ -\frac{N}{2} + 2,\ \ldots,\ -1,\ 0,\ 1,\ \ldots,\ \frac{N}{2} - 1 ,$$
(see footnote 93 above). The problem with this is that it assumes the interferogram signal is sampled at exactly $\chi = 0$ when $m = 0$, as shown in Fig. 5.47(a). In practice, however, it is very hard to sample the interferogram at exactly $\chi = 0$; often the sample nearest $\chi = 0$ is located a large fraction of a sampling interval away from $\chi = 0$. We call this fraction of a sampling interval $\alpha$, with
$$-\frac{1}{2} \le \alpha \le \frac{1}{2} ,$$
FIGURE 5.47(a).
FIGURE 5.47(b). [The interferogram peak is displaced by $\alpha\Delta\chi$ from the nearest sample.]
Off-Center Sampling of the Interferogram Signal · 5.26
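which means that the peak of $z_{\mathrm{trunc}}$ is located at $\chi = \alpha\Delta\chi$, as shown in Fig. 5.47(b). Mathematically, this can be regarded as a displacement of $z_{\mathrm{trunc}}(\chi)$, the interferogram signal as defined above in Eq. (5.106a), along the $\chi$ axis by a distance $\alpha\Delta\chi$. The displaced interferogram signal can be written as
$$z^{(\alpha)}_{\mathrm{trunc}}(\chi) = z_{\mathrm{trunc}}(\chi - \alpha\Delta\chi) . \qquad (5.137a)$$
Glancing back at Eq. (5.108a), we see that the new effective signal spectrum is
$$Z^{(\alpha)}_{\mathrm{eff,trunc}}(\sigma) = \int_{-\infty}^{\infty} z^{(\alpha)}_{\mathrm{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \int_{-\infty}^{\infty} z_{\mathrm{trunc}}(\chi - \alpha\Delta\chi)\, e^{-2\pi i\sigma\chi}\, d\chi . \qquad (5.137b)$$
Changing the variable of integration to $\chi' = \chi - \alpha\Delta\chi$ gives
$$Z^{(\alpha)}_{\mathrm{eff,trunc}}(\sigma) = \int_{-\infty}^{\infty} z_{\mathrm{trunc}}(\chi')\, e^{-2\pi i\sigma\chi'}\, e^{-2\pi i\sigma\alpha\Delta\chi}\, d\chi' = e^{-2\pi i\sigma\alpha\Delta\chi} \int_{-\infty}^{\infty} z_{\mathrm{trunc}}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi ,$$
where in the last step the prime is dropped from the variable of integration. Substituting from Eq. (5.108a), we see that
$$Z^{(\alpha)}_{\mathrm{eff,trunc}}(\sigma) = e^{-2\pi i\sigma\alpha\Delta\chi}\, Z_{\mathrm{eff,trunc}}(\sigma) . \qquad (5.137c)$$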
Since $\alpha\Delta\chi$ is a small quantity, the effect of shifting $z_{\mathrm{trunc}}$ by a distance $\alpha\Delta\chi$ along the $\chi$ axis is to multiply the original signal spectrum $Z_{\mathrm{eff,trunc}}$ by a slowly varying, complex function of $\sigma$. There is nothing profound about this result; it is just an example of the Fourier shift theorem given in Eq. (2.36a) of Chapter 2. From the discussion following Eq. (5.95c) above, we know that multiplying the original effective signal spectrum by another complex function of $\sigma$ does not change the way the calibration procedure extracts the desired spectral radiance $L(\sigma)$, as long as the complex function does not change after the instrument is calibrated. Hence, as long as $\alpha$ is a true constant, having the same value each time the moving mirror scans through its range of motion, the extra factor of $e^{-2\pi i\sigma\alpha\Delta\chi}$ in Eq. (5.137c) can be removed by calibration. Since $e^{-2\pi i\sigma\alpha\Delta\chi}$ is a slowly varying function of $\sigma$, we can even, as described in footnote 88 above, use the type of single-sided system discussed in Sec. 5.18 to measure the spectral radiance $L(\sigma)$.
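The phase factor in Eq. (5.137c) can be reproduced with a discrete model. The sketch below is an illustration under assumptions: the spectrum is random and band-limited, the sampling interval is taken as 1, the offset is α = 0.3, and the shifted samples are generated from the Fourier sum itself so that the comparison is exact. All names are hypothetical.

```python
import numpy as np


def shifted_spectrum_demo(N=32, alpha=0.3, seed=1):
    """Sample z(chi - alpha*d_chi) on the usual grid and compare its DFT
    with the unshifted spectrum times exp(-2*pi*i*sigma*alpha*d_chi)."""
    rng = np.random.default_rng(seed)
    d_chi = 1.0                                  # assumed sampling interval
    sigma = np.fft.fftfreq(N, d=d_chi)           # signed wavenumbers n*d_sigma
    Z = rng.normal(size=N) + 1j * rng.normal(size=N)
    Z[N // 2] = 0.0                              # leave the Nyquist bin empty
    chi = np.arange(N) * d_chi
    # Evaluate the band-limited signal at the displaced positions.
    z_shift = np.array(
        [np.sum(Z * np.exp(2j * np.pi * sigma * (x - alpha * d_chi))) for x in chi]
    )
    Z_shift = np.fft.fft(z_shift) / N            # forward DFT, Eq. (5.124b)
    expected = Z * np.exp(-2j * np.pi * sigma * alpha * d_chi)
    return Z_shift, expected


Z_shift, expected = shifted_spectrum_demo()
print(np.allclose(Z_shift, expected))
```

The magnitude of every spectral sample is unchanged; only a phase that grows linearly with wavenumber is added, which is why a constant α can be absorbed by calibration.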
Appendix 5A
The detector circuit of a Fourier-transform spectrometer is a time-invariant linear system. If g(t) is the input signal as a function of time going into the linear system, then the output signal k(t) can always be written as
$$k(t) = \int_{-\infty}^{\infty} g(t')\, h(t - t')\, dt' , \qquad (5A.1a)$$
where h(t) is a continuous function of time specifying how the input signal is modified by passing through the circuit. The explicit limits on the integral expression for output k(t) are $+\infty$ and $-\infty$, but in practice we always assume that the input signal g(t) is time limited, with the true limits on the integral being set by the finite range of t over which g(t) is not zero. Function h(t) is often called the impulse-response function of the linear circuit, because when the input is a delta function impulse (see Sec. 2.14 of Chapter 2),
$$g(t) = \delta(t) ,$$
then the output is h(t):
$$k(t) = \int_{-\infty}^{\infty} \delta(t')\, h(t - t')\, dt' = h(t) . \qquad (5A.1b)$$
As a general rule, we expect h(t) to be a much narrower function of time than g(t), which means
that output k(t) can be regarded as just a slightly blurred and distorted version of the input g(t).
According to Eq. (2.38a) of Chapter 2, Eq. (5A.1a) states that output k is the convolution of h and g,
$$k(t) = g(t) * h(t) . \qquad (5A.2a)$$
The convolution is a linear operation, so when the input signal is the linear combination of two functions $g_1$ and $g_2$, with
$$g(t) = \alpha g_1(t) + \beta g_2(t)$$
for two real constants $\alpha$ and $\beta$, then the resulting output is $\alpha$ multiplied by the output that would occur if only $g_1$ were present plus $\beta$ multiplied by the output that would occur if only $g_2$ were present [see Eq. (2.38e) in Chapter 2],
$$[\alpha g_1(t) + \beta g_2(t)] * h(t) = \alpha\, [g_1(t) * h(t)] + \beta\, [g_2(t) * h(t)] .$$
Therefore, if we know the output of the circuit for input $g_1$ and the output of the circuit for input $g_2$, we know at once the output of the circuit for an input $[\alpha g_1(t) + \beta g_2(t)]$. In particular, taking
g 2( out ) (t ) = g 2 (t ) ∗ h(t ) ,
g1 (t ) + g 2 (t )
Glancing back to Eq. (2.40a) in Chapter 2 and the discussion following it, we note that the input
signal g(t) in Eq. (5A.2a) plays the role of u(t) in (2.40a), that the output signal k(t) plays the role
of ue,blur (t ) in (2.40a), and that the impulse-response function h(t) plays the role of the
instrument-response function ve (t ) in (2.40a). In fact we already know from the discussion
following Eq. (2.40a) that the correct way to handle Eq. (5A.2a) is to take the Fourier transform
of both sides and then apply the Fourier convolution theorem [see Eq. (2.39b) of Chapter 2] to get
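$$K(f) = G(f) \cdot H(f) , \qquad (5A.3a)$$
where
$$K(f) = \int_{-\infty}^{\infty} k(t)\, e^{-2\pi i f t}\, dt , \qquad (5A.3b)$$
$$G(f) = \int_{-\infty}^{\infty} g(t)\, e^{-2\pi i f t}\, dt , \qquad (5A.3c)$$
and
$$H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt . \qquad (5A.3d)$$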
The Fourier transform H(ƒ) of the impulse-response function is often called the transfer function of the linear circuit. The formula shown in Eq. (5A.3a) is often the easiest way to find the output k(t) corresponding to a given input g(t). We first calculate G(ƒ), the Fourier transform of input g(t), then multiply G(ƒ) by the transfer function H(ƒ) to get K(ƒ), the Fourier transform of the output. Having found K(ƒ), we then take the inverse Fourier transform to get output k(t),
$$k(t) = \int_{-\infty}^{\infty} K(f)\, e^{2\pi i f t}\, df . \qquad (5A.4)$$
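The three-step recipe built on Eqs. (5A.3a) and (5A.4) has a direct discrete analogue. The sketch below assumes a hypothetical causal impulse response (a decaying exponential) and a Gaussian input pulse, neither of which comes from the text, and compares the transfer-function route with a direct convolution. Zero-padding to the full output length keeps the FFT's circular convolution from wrapping around.

```python
import numpy as np


def output_via_transfer_function(g, h):
    """Discrete version of K(f) = G(f) * H(f) followed by an inverse transform."""
    n = len(g) + len(h) - 1            # full length of the linear convolution
    G = np.fft.fft(g, n)               # zero-padded transform of the input
    H = np.fft.fft(h, n)               # zero-padded transfer function
    return np.fft.ifft(G * H).real     # real inputs give a real output


# Assumed example: 1 ms sampling, 5 ms time constant, Gaussian input pulse.
dt = 1e-3
t = np.arange(200) * dt
h = np.exp(-t / 0.005)
h /= h.sum()                           # unit-area impulse response, zero for t < 0
g = np.exp(-((t - 0.1) / 0.02) ** 2)

k_fft = output_via_transfer_function(g, h)
k_direct = np.convolve(g, h)           # direct evaluation of the convolution sum
print(np.allclose(k_fft, k_direct))
```

The output peak arrives slightly later than the input peak, the discrete counterpart of the slight blurring and distortion that a narrow h(t) imposes on g(t).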
Although the impulse-response function h(t) in Eq. (5A.2a) plays the same role as the instrument-
response function ve (t ) in Eq. (2.40a) of Chapter 2, there is one important difference. A linear
circuit is a causal system, which means that its output signal k(t) cannot start happening before
the input signal g(t) occurs. Consequently, the circuit’s impulse-response function h(t) must
satisfy the restriction
h(t ) = 0 for t < 0 . (5A.5)
Suppose, for example, we supply a delta function at t = 0 , the impulse signal g (t ) = δ (t ) , for the
circuit’s input. Then, according to Eq. (5A.1b), the circuit’s output is
k (t ) = h(t ) ;
and if, for some t < 0 , we have h(t ) ≠ 0 , then there will be some part of the circuit’s output
signal being produced at t < 0 before its cause, the input delta function at t = 0 , has occurred.
This is why the impulse-response function of a causal linear system, unlike ve (t ) in Fig. 2.5(f) of
Chapter 2, must satisfy (5A.5).
Because h(t) is a nonzero function that must nevertheless be zero for negative values of t, it
cannot be an even function,95
h(−t ) ≠ h(t ) . (5A.6a)
The transfer function H(ƒ) is, according to Eq. (5A.3d), the Fourier transform of the real impulse-
response function h(t), which means, according to entry 7 of Table 2.1 of Chapter 2, that H is a
Hermitian function of ƒ,
H ( − f ) = H ( f )∗ . (5A.6b)
If H were a real function of ƒ, then it would also need to be even in order to satisfy (5A.6b).
According to entry 1 of Table 2.1, however, function H(ƒ) can be both real and even only when
h(t) is both real and even. Since, according to (5A.6a), we know that h(t) is not even, we conclude
that H, although Hermitian, cannot be real. We can directly verify this conclusion by using $e^{i\phi} = \cos\phi + i\sin\phi$ to break the Fourier transform of the real impulse-response function h(t) into
$$\begin{aligned}
H(f) = \int_{-\infty}^{\infty} h(t)\, e^{-2\pi i f t}\, dt
&= \int_{-\infty}^{\infty} h(t)\cos(2\pi f t)\, dt - i\int_{-\infty}^{\infty} h(t)\sin(2\pi f t)\, dt \\
&= \int_{0}^{\infty} h(t)\cos(2\pi f t)\, dt - i\int_{0}^{\infty} h(t)\sin(2\pi f t)\, dt .
\end{aligned} \qquad (5A.7a)$$
The last step here uses the restriction in (5A.5) to limit the sine and cosine integrals to non-negative values of t. Because the sine integral in particular is limited to non-negative values of t, we note that the imaginary part of the transfer function,
$$\mathrm{Im}[H(f)] = -\int_{0}^{\infty} h(t)\sin(2\pi f t)\, dt , \qquad (5A.7b)$$
95
See Eq. (2.11a) of Chapter 2 for a definition of what it means to say a function is even.
can be zero for all values of ƒ only if h(t) is zero for all non-negative values of t. Since we already know that h is zero for all negative values of t, it follows that h would be zero everywhere. This is an unacceptable impulse-response function, confirming our previous assertion that the transfer function H(ƒ) of the detector circuit cannot be a real-valued function.
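A concrete causal impulse response makes the argument tangible. The sketch below assumes a single-pole response, h(t) = (1/τ) exp(−t/τ) for t ≥ 0 and zero otherwise, which is a hypothetical stand-in for a detector circuit rather than anything specified in the text. Its transfer function has the closed form 1/(1 + 2πifτ), which the code compares with a direct Riemann-sum evaluation of Eq. (5A.3d).

```python
import numpy as np


def transfer_function(f, tau):
    """Closed-form H(f) for the assumed h(t) = exp(-t/tau)/tau, t >= 0."""
    return 1.0 / (1.0 + 2j * np.pi * f * tau)


def transfer_function_numeric(f, tau, n_per_tau=2000, t_max_in_tau=40.0):
    """Left Riemann sum of h(t) exp(-2 pi i f t) over t >= 0 only."""
    dt = tau / n_per_tau
    t = np.arange(0.0, t_max_in_tau * tau, dt)   # h(t) = 0 for t < 0
    h = np.exp(-t / tau) / tau
    return np.sum(h * np.exp(-2j * np.pi * f * t)) * dt


tau = 1e-3      # assumed time constant in seconds
f = 150.0       # assumed frequency in Hz
H = transfer_function(f, tau)
H_num = transfer_function_numeric(f, tau)
print(H, H_num)
```

For this example the imaginary part is nonzero at every nonzero frequency, while H(−ƒ) equals the complex conjugate of H(ƒ), in line with Eqs. (5A.6b) and (5A.7b).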
Appendix 5B
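This appendix shows how to simplify Eq. (5.76a) in Sec. 5.17 of Chapter 5. We start off with
$$Z_{\mathrm{eff}}(\sigma) = \int_{-\infty}^{\infty} d\chi\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\sigma' \left[ \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, \mathrm{sinc}\!\left( \frac{\sigma'\chi\,\Delta\Omega}{2} \right) \right] e^{2\pi i\chi\sigma'\left(1 - \frac{\Delta\Omega}{4\pi}\right)} \qquad (5B.1a)$$
and note that the integral over $d\chi$ can be moved inside to get
$$\begin{aligned}
Z_{\mathrm{eff}}(\sigma) = {} & \frac{W}{4} \int_{-\infty}^{0} d\sigma' \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \int_{-\infty}^{\infty} d\chi\, \mathrm{sinc}(2\pi\sigma'\chi\alpha)\, e^{-2\pi i\chi[\sigma - \sigma'(1 - \alpha)]} \\
& + \frac{W}{4} \int_{0}^{\infty} d\sigma' \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \int_{-\infty}^{\infty} d\chi\, \mathrm{sinc}(2\pi\sigma'\chi\alpha)\, e^{-2\pi i\chi[\sigma - \sigma'(1 - \alpha)]} ,
\end{aligned} \qquad (5B.1b)$$
where
$$\alpha = \frac{\Delta\Omega}{4\pi} \qquad (5B.1c)$$
and the integral over $d\sigma'$ is, for future convenience, divided into two integrals: one from $-\infty$ to zero and one from zero to $\infty$. For any reasonable interferometer design, the $\Delta\Omega$ field of view (in steradians, of course) is small compared to $4\pi$, so
$$\alpha \ll 1 . \qquad (5B.1d)$$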
From Eq. (5A.6b) in Appendix 5A, we know that the transfer function H(ƒ) is Hermitian,
$$H(-f) = H(f)^{*} , \qquad (5B.2a)$$
with
$$H(0) = 0 . \qquad (5B.2b)$$
Writing H in polar form,
$$H(f) = \Lambda(f)\, e^{i\upsilon(f)} , \qquad (5B.2c)$$
where both $\Lambda$ and $\upsilon$ are real functions of ƒ. The same sort of reasoning used to derive Eqs. (5.86a) and (5.86b) in Chapter 5 can be used here to analyze $\Lambda(f)$ and $\upsilon(f)$. Substituting (5B.2c) into (5B.2a) gives, since both $\Lambda$ and $\upsilon$ are real,
$$\Lambda(-f)\, e^{i\upsilon(-f)} = \Lambda(f)\, e^{-i\upsilon(f)} ,$$
which requires
$$\Lambda(-f) = \Lambda(f) . \qquad (5B.2d)$$
To match Eq. (5B.2b), we require
$$\Lambda(0) = 0 . \qquad (5B.2e)$$
Now we can write
$$\Lambda(f)\, e^{i\upsilon(-f)} = \Lambda(f)\, e^{-i\upsilon(f)} \quad \text{or} \quad e^{i\upsilon(-f)} = e^{-i\upsilon(f)} ,$$
so that
$$\upsilon(-f) = -\upsilon(f) , \qquad (5B.2f)$$
showing $\upsilon$ to be an odd function of ƒ. Because both Eqs. (5B.2d) and (5B.2e) must be true, we conclude that not only is $\Lambda(f)$ equal to zero at $f = 0$ but also that the derivative of $\Lambda(f)$ with respect to ƒ is zero at $f = 0$. The point of this analysis is revealed when we substitute formula (5B.2c) into the two integrals on the right-hand side of (5B.1b) to get
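$$\frac{W}{4} \int_{-\infty}^{0} d\sigma'\, \Lambda(u\sigma') \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, e^{i\upsilon(u\sigma')} \right] \int_{-\infty}^{\infty} d\chi\, \mathrm{sinc}(2\pi\sigma'\chi\alpha)\, e^{-2\pi i\chi[\sigma - \sigma'(1 - \alpha)]}$$
and
$$\frac{W}{4} \int_{0}^{\infty} d\sigma'\, \Lambda(u\sigma') \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, e^{i\upsilon(u\sigma')} \right] \int_{-\infty}^{\infty} d\chi\, \mathrm{sinc}(2\pi\sigma'\chi\alpha)\, e^{-2\pi i\chi[\sigma - \sigma'(1 - \alpha)]} .$$
Changing the variable of integration in the inner integrals to $\chi' = \sigma'\chi$, these become
$$\frac{W}{4} \int_{-\infty}^{0} \frac{d\sigma'}{\sigma'}\, \Lambda(u\sigma') \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, e^{i\upsilon(u\sigma')} \right] \int_{+\infty}^{-\infty} d\chi'\, \mathrm{sinc}(2\pi\chi'\alpha)\, e^{-2\pi i\chi'[(\sigma/\sigma') - (1 - \alpha)]}$$
and
$$\frac{W}{4} \int_{0}^{\infty} \frac{d\sigma'}{\sigma'}\, \Lambda(u\sigma') \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, e^{i\upsilon(u\sigma')} \right] \int_{-\infty}^{\infty} d\chi'\, \mathrm{sinc}(2\pi\chi'\alpha)\, e^{-2\pi i\chi'[(\sigma/\sigma') - (1 - \alpha)]} ,$$
where, in the first integral over $d\chi'$, we note that $\chi'$ is negative when $\chi$ is positive because $\sigma' < 0$. Ordinarily we might worry about the singularity at $\sigma' = 0$ in the outside integrals over $d\sigma'$, but since both $\Lambda$ and its derivative are zero at $\sigma' = 0$, it follows that $\Lambda(u\sigma')/\sigma'$ must be zero at $\sigma' = 0$ and very small near $\sigma' = 0$. Consequently, both of these integrals are well-defined, and, replacing the $\Lambda e^{i\upsilon}$ product by the transfer function H, we can write Eq. (5B.1b) as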
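$$\begin{aligned}
Z_{\mathrm{eff}}(\sigma) = {} & -\frac{W}{4} \int_{-\infty}^{0} \frac{d\sigma'}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \int_{-\infty}^{\infty} d\chi'\, \mathrm{sinc}(2\pi\chi'\alpha)\, e^{-2\pi i\chi'[(\sigma/\sigma') - (1 - \alpha)]} \\
& + \frac{W}{4} \int_{0}^{\infty} \frac{d\sigma'}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \int_{-\infty}^{\infty} d\chi'\, \mathrm{sinc}(2\pi\chi'\alpha)\, e^{-2\pi i\chi'[(\sigma/\sigma') - (1 - \alpha)]} .
\end{aligned} \qquad (5B.3)$$
The inner integrals can be evaluated using
$$\int_{-\infty}^{\infty} \mathrm{sinc}(2\pi\alpha t)\, e^{-2\pi i t f}\, dt = \frac{1}{2\alpha}\, \Pi(f, \alpha) , \qquad (5B.4a)$$
where
$$\Pi(f, \alpha) = \begin{cases} 1 & \text{for } |f| < \alpha \\ 1/2 & \text{for } |f| = \alpha \\ 0 & \text{for } |f| > \alpha \end{cases} . \qquad (5B.4b)$$
This definition of function $\Pi$ is the same as that in Eq. (2.56c) of Chapter 2, and the definition of the sinc function is given in Eq. (2.106d) of Chapter 2.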
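The transform pair in Eqs. (5B.4a) and (5B.4b) can be checked from the inverse direction, which avoids integrating the slowly decaying sinc function numerically. The sketch below assumes that the sinc function of Eq. (2.106d) is sin(x)/x, which is the form consistent with the factor 1/(2α) in Eq. (5B.4a); it integrates the box function over −α ≤ f ≤ α with a trapezoidal rule and compares the result with sinc(2παt). The function names and the value α = 0.01 are illustrative only.

```python
import numpy as np


def sinc_unnormalized(x):
    """sin(x)/x with the x = 0 limit handled (assumed form of Eq. (2.106d))."""
    return np.sinc(np.asarray(x) / np.pi)   # np.sinc(y) = sin(pi*y)/(pi*y)


def inverse_transform_of_box(t, alpha, n=20001):
    """Trapezoidal estimate of the integral over -alpha <= f <= alpha of
    (1/(2*alpha)) * exp(2*pi*i*t*f) df."""
    f = np.linspace(-alpha, alpha, n)
    integrand = np.exp(2j * np.pi * t * f) / (2.0 * alpha)
    df = f[1] - f[0]
    return (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1])) * df


alpha = 0.01   # assumed illustrative value, small compared to 1
for t in (0.0, 3.7, 25.0):
    numeric = inverse_transform_of_box(t, alpha)
    exact = sinc_unnormalized(2.0 * np.pi * alpha * t)
    print(t, numeric.real, float(exact))
```

Agreement at several values of t supports reading the box function in Eq. (5B.4b) as having half-width α and the prefactor in Eq. (5B.4a) as 1/(2α).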
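Applying Eq. (5B.4a) to (5B.3) gives (here $\chi'$ plays the role of t)
$$\begin{aligned}
Z_{\mathrm{eff}}(\sigma) = {} & -\frac{W}{8\alpha} \int_{-\infty}^{0} \frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \Pi\!\left( \frac{\sigma}{\sigma'} - (1 - \alpha),\ \alpha \right) d\sigma' \\
& + \frac{W}{8\alpha} \int_{0}^{\infty} \frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] \Pi\!\left( \frac{\sigma}{\sigma'} - (1 - \alpha),\ \alpha \right) d\sigma' .
\end{aligned} \qquad (5B.5a)$$
According to (5B.4b), the function $\Pi$ in these integrals is nonzero only when
$$-\alpha \le \frac{\sigma}{\sigma'} - (1 - \alpha) \le \alpha$$
or
$$1 - 2\alpha \le \frac{\sigma}{\sigma'} \le 1 . \qquad (5B.5b)$$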
$$\frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right]$$
is very small or zero when $\sigma'$ is near or at zero, which means that the region around $\sigma' = 0$ cannot contribute significantly to either integral in Eq. (5B.5a). Therefore, in the first integral between $-\infty$ and zero, we can think of $\sigma'$ as always negative, and in the second integral between zero and $\infty$ we can think of $\sigma'$ as always positive. According to (5B.5b), then, the first integral can be nonzero only when $\sigma < 0$ and the second integral can be nonzero only when $\sigma > 0$. This means that Eq. (5B.5a) can be written as
$$Z_{\mathrm{eff}}(\sigma) = \begin{cases}
\displaystyle -\frac{W}{8\alpha} \int_{-|\sigma|/(1 - 2\alpha)}^{-|\sigma|} \frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] d\sigma' & \text{for } \sigma < 0 \\[2ex]
\displaystyle \frac{W}{8\alpha} \int_{|\sigma|}^{|\sigma|/(1 - 2\alpha)} \frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] d\sigma' & \text{for } \sigma > 0
\end{cases} .$$
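Changing the variable of integration in the top integral to $\sigma'' = -\sigma'$ gives
$$Z_{\mathrm{eff}}(\sigma) = \begin{cases}
\displaystyle \frac{W}{8\alpha} \int_{|\sigma|}^{|\sigma|/(1 - 2\alpha)} \frac{1}{\sigma''} \left[ S(-\sigma'')\, M(-R\sigma''\theta_{ma})\, H(-u\sigma'') \right] d\sigma'' & \text{for } \sigma < 0 \\[2ex]
\displaystyle \frac{W}{8\alpha} \int_{|\sigma|}^{|\sigma|/(1 - 2\alpha)} \frac{1}{\sigma'} \left[ S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma') \right] d\sigma' & \text{for } \sigma > 0 ,
\end{cases}$$
where
$$S(-\sigma'') = S(\sigma'') , \qquad (5B.6a)$$
$$M(-R\sigma''\theta_{ma}) = M(R\sigma''\theta_{ma}) , \qquad (5B.6b)$$
$$H(-u\sigma'') = H(u\sigma'')^{*} \qquad (5B.6c)$$
from Eqs. (5.39a) in Chapter 5, (5.10f) in Chapter 5, and (5A.6b) in Appendix 5A respectively. Since it makes no difference whether we label the variable of integration $\sigma'$ or $\sigma''$, we can now write, remembering that $\alpha = \Delta\Omega/(4\pi)$ from Eq. (5B.1c),
$$Z_{\mathrm{eff}}(\sigma) = \begin{cases}
\displaystyle \frac{W\pi}{2\Delta\Omega} \int_{|\sigma|}^{|\sigma|/[1 - \Delta\Omega(2\pi)^{-1}]} \frac{1}{\sigma'}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, [H(u\sigma')]^{*}\, d\sigma' & \text{for } \sigma < 0 \\[2ex]
\displaystyle \frac{W\pi}{2\Delta\Omega} \int_{|\sigma|}^{|\sigma|/[1 - \Delta\Omega(2\pi)^{-1}]} \frac{1}{\sigma'}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, d\sigma' & \text{for } \sigma > 0
\end{cases} . \qquad (5B.7a)$$
Because $\Delta\Omega/(2\pi) \ll 1$, the upper limits can be approximated by $|\sigma| + \Delta\Omega\,|\sigma|(2\pi)^{-1}$, which gives
$$Z_{\mathrm{eff}}(\sigma) \cong \begin{cases}
\displaystyle \frac{W}{4} \left( \frac{2\pi}{\Delta\Omega} \right) \int_{|\sigma|}^{|\sigma| + \Delta\Omega|\sigma|(2\pi)^{-1}} \frac{1}{\sigma'}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, [H(u\sigma')]^{*}\, d\sigma' & \text{for } \sigma < 0 \\[2ex]
\displaystyle \frac{W}{4} \left( \frac{2\pi}{\Delta\Omega} \right) \int_{|\sigma|}^{|\sigma| + \Delta\Omega|\sigma|(2\pi)^{-1}} \frac{1}{\sigma'}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, d\sigma' & \text{for } \sigma > 0 .
\end{cases} \qquad (5B.7b)$$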
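Since the upper limit of integration is
$$|\sigma| + \Delta\Omega\,|\sigma|(2\pi)^{-1} = |\sigma|\left(1 + \Delta\Omega(2\pi)^{-1}\right)$$
inside the top and bottom integrals, we can, remembering from Eqs. (5B.1c) and (5B.1d) that
$$\frac{\Delta\Omega}{4\pi} \ll 1 ,$$
approximate $1/\sigma'$ by its value at the midpoint of the range of integration,
$$\frac{1}{\sigma'} \cong \frac{1}{\frac{1}{2}\left[ |\sigma| + |\sigma|\left(1 + \frac{\Delta\Omega}{2\pi}\right) \right]} = \frac{1}{\frac{|\sigma|}{2}\left[ 2 + \frac{\Delta\Omega}{2\pi} \right]} = \frac{1}{|\sigma|} \left( 1 + \frac{\Delta\Omega}{4\pi} \right)^{-1} .$$
Now the $1/\sigma'$ term can be brought outside the integrals to get
$$Z_{\mathrm{eff}}(\sigma) \cong \begin{cases}
\displaystyle \left( \frac{2\pi}{\Delta\Omega\,|\sigma|} \right) \cdot \frac{1}{1 + \frac{\Delta\Omega}{4\pi}} \cdot \int_{|\sigma|(1 + \frac{\Delta\Omega}{4\pi}) - \frac{|\sigma|\Delta\Omega}{4\pi}}^{|\sigma|(1 + \frac{\Delta\Omega}{4\pi}) + \frac{|\sigma|\Delta\Omega}{4\pi}} \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, [H(u\sigma')]^{*}\, d\sigma' & \text{for } \sigma < 0 \\[2ex]
\displaystyle \left( \frac{2\pi}{\Delta\Omega\,|\sigma|} \right) \cdot \frac{1}{1 + \frac{\Delta\Omega}{4\pi}} \cdot \int_{|\sigma|(1 + \frac{\Delta\Omega}{4\pi}) - \frac{|\sigma|\Delta\Omega}{4\pi}}^{|\sigma|(1 + \frac{\Delta\Omega}{4\pi}) + \frac{|\sigma|\Delta\Omega}{4\pi}} \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, d\sigma' & \text{for } \sigma > 0
\end{cases} . \qquad (5B.7c)$$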
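Making the variable substitution $\sigma'' = -\sigma'$ in the upper integral of (5B.7c), we can write, using Eqs. (5B.6a)–(5B.6c) and remembering that $\sigma < 0$ so that $-\sigma = |\sigma|$,
$$\int_{|\sigma|(1 + \frac{\Delta\Omega}{4\pi}) - \frac{|\sigma|\Delta\Omega}{4\pi}}^{|\sigma|(1 + \frac{\Delta\Omega}{4\pi}) + \frac{|\sigma|\Delta\Omega}{4\pi}} \frac{W}{4}\, S(\sigma')\, M(R\sigma'\theta_{ma})\, [H(u\sigma')]^{*}\, d\sigma'
 = \int_{\sigma(1 + \frac{\Delta\Omega}{4\pi}) - \frac{|\sigma|\Delta\Omega}{4\pi}}^{\sigma(1 + \frac{\Delta\Omega}{4\pi}) + \frac{|\sigma|\Delta\Omega}{4\pi}} \frac{W}{4}\, H(u\sigma'')\, S(\sigma'')\, M(R\sigma''\theta_{ma})\, d\sigma'' . \qquad (5B.7d)$$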
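In the bottom integral of (5B.7c), we can replace $|\sigma| \cdot \left(1 + \frac{\Delta\Omega}{4\pi}\right)$ by $\sigma \cdot \left(1 + \frac{\Delta\Omega}{4\pi}\right)$ because $\sigma > 0$, making it look the same as Eq. (5B.7d). Consequently, both the top and bottom parts of Eq. (5B.7c) can be combined into a single formula,
$$Z_{\mathrm{eff}}(\sigma) \cong \left( \frac{2\pi}{\Delta\Omega\,|\sigma|} \right) \cdot \frac{1}{1 + \frac{\Delta\Omega}{4\pi}} \cdot \int_{\sigma(1 + \frac{\Delta\Omega}{4\pi}) - \frac{|\sigma|\Delta\Omega}{4\pi}}^{\sigma(1 + \frac{\Delta\Omega}{4\pi}) + \frac{|\sigma|\Delta\Omega}{4\pi}} \left[ \frac{W}{4}\, H(u\sigma')\, S(\sigma')\, M(R\sigma'\theta_{ma}) \right] d\sigma' .$$
Since
$$1 + \frac{\Delta\Omega}{4\pi} \cong 1 ,$$
we can write this latest formula as
$$Z_{\mathrm{eff}}(\sigma) \cong \frac{1}{\Delta\sigma} \cdot \int_{\sigma(1 + \frac{\Delta\Omega}{4\pi}) - \frac{\Delta\sigma}{2}}^{\sigma(1 + \frac{\Delta\Omega}{4\pi}) + \frac{\Delta\sigma}{2}} \left[ \frac{W}{4}\, H(u\sigma')\, S(\sigma')\, M(R\sigma'\theta_{ma}) \right] d\sigma' , \qquad (5B.8a)$$
where
$$\Delta\sigma = \frac{\Delta\Omega\,|\sigma|}{2\pi} . \qquad (5B.8b)$$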
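Equation (5B.8a) is a running average of width $\Delta\sigma = \Delta\Omega|\sigma|/(2\pi)$ centered on $\sigma(1 + \Delta\Omega/4\pi)$. The sketch below evaluates such an average for an arbitrary integrand; the function name, the sample count, and the linear test integrand are assumptions for illustration, with the callable `F` standing in for the bracketed product in Eq. (5B.8a).

```python
import numpy as np


def field_of_view_average(F, sigma, d_omega, n=2001):
    """Average F(sigma') over the interval of Eq. (5B.8a).

    F       : callable standing in for (W/4) H(u s) S(s) M(R s theta_ma)
    sigma   : wavenumber at which Z_eff is wanted
    d_omega : field-of-view solid angle in steradians
    """
    d_sigma = d_omega * abs(sigma) / (2.0 * np.pi)          # Eq. (5B.8b)
    center = sigma * (1.0 + d_omega / (4.0 * np.pi))
    s = np.linspace(center - d_sigma / 2.0, center + d_sigma / 2.0, n)
    return np.mean(F(s))    # approximates (1/d_sigma) * integral of F


# Assumed linear test integrand: its average equals its value at the center.
F = lambda s: 2.0 * s + 1.0
sigma, d_omega = 1000.0, 0.01
avg = field_of_view_average(F, sigma, d_omega)
center = sigma * (1.0 + d_omega / (4.0 * np.pi))
print(avg, F(center))
```

For a slowly varying integrand the average is close to the integrand evaluated at the slightly shifted wavenumber, so the main effects of a finite field of view in this approximation are a small wavenumber shift and a smoothing over $\Delta\sigma$.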
We conclude that $Z_{\mathrm{eff}}(\sigma)$ is, to a very good approximation, given by the average value of
$$\frac{W}{4}\, H(u\sigma)\, S(\sigma)\, M(R\sigma\theta_{ma})$$
Appendix 5C
When a relatively narrow and rapidly varying function h(z) centered on zero is convolved with
the product of another rapidly varying function g(z) and a broad, slowly varying function G(z),
we can often approximate the result as
h( z ) ∗ [G ( z ) ⋅ g ( z )] ≅ G ( z ) ⋅ [h( z ) ∗ g ( z )] . (5C.1)
It is easy to see why this works. We start out by making h(z) a narrow function centered on z0,
as shown in Fig. 5C.1. Starting with the definition of a convolution in Eq. (2.38a) of Chapter 2,
we have
$$h(z) * [G(z) \cdot g(z)] = \int_{-\infty}^{\infty} h(z')\, G(z - z')\, g(z - z')\, dz' . \qquad (5C.2)$$
∞ ∞
or
∞
using the definition of the convolution in Eq. (2.38a). Substituting this back into (5C.2) now
gives the desired result,
h( z ) ∗ [G ( z ) ⋅ g ( z )] ≅ G ( z − z0 ) ⋅ [h( z ) ∗ g ( z )] (5C.4a)
or
h( z ) ∗ [G ( z ) ⋅ g ( z )] ≅ G ( z ) ⋅ [h( z ) ∗ g ( z )] (5C.4b)
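The approximation in Eq. (5C.4b) can be tested numerically. In the sketch below all three functions are assumed examples, not taken from the text: h is a narrow Gaussian of width 0.02 centered on zero, g is a rapidly oscillating sum of two sinusoids, and G is a broad Gaussian of width 5. The convolutions are evaluated as discrete sums on a uniform grid.

```python
import numpy as np

# Uniform grid and a short, odd-length grid for the narrow kernel.
z = np.linspace(-20.0, 20.0, 20001)
dz = z[1] - z[0]
zk = np.linspace(-0.2, 0.2, 201)               # same spacing as z

h = np.exp(-0.5 * (zk / 0.02) ** 2)            # narrow function centered on zero
h /= h.sum() * dz                              # normalize to unit area
g = np.cos(40.0 * z) + 0.5 * np.sin(97.0 * z)  # rapidly varying function
G = np.exp(-0.5 * (z / 5.0) ** 2)              # broad, slowly varying function

lhs = np.convolve(G * g, h, mode="same") * dz      # h * [G g]
rhs = G * (np.convolve(g, h, mode="same") * dz)    # G [h * g]
rel_err = np.max(np.abs(lhs - rhs)) / np.max(np.abs(rhs))
print("maximum relative difference:", rel_err)
```

The difference between the two sides scales roughly with the width of h multiplied by the slope of G, so widening h or narrowing G in this sketch makes the approximation visibly worse.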
FIGURE 5C.1. [Plot of h(z) versus z, with axis labels $z_0$ and $L_h$.]
FIGURE 5C.2. [Plot of G(z), with axis label $L_h$.]
FIGURE 5C.3. [Plot of $h_{\mathrm{sum}}(z)$.]
FIGURE 5C.4. [Plots of $h_1(z)$, $h_2(z)$, and $h_3(z)$ versus z.]
$$h_{\mathrm{sum}}(z) = \sum_{k=1}^{N} h_k(z - z_k) \qquad (5C.5)$$
with the N narrow $h_k(z)$ functions centered at the origin. Figure 5C.3 shows what a plot of $h_{\mathrm{sum}}(z)$ might look like for $N = 3$ when $h_1$, $h_2$, $h_3$ are as shown in Fig. 5C.4. The linearity of the convolution shown in Eq. (2.38e) of Chapter 2 can now be used to write
$$\begin{aligned}
h_{\mathrm{sum}}(z) * [G(z) \cdot g(z)] &= \sum_{k=1}^{N} \left\{ h_k(z - z_k) * [G(z) \cdot g(z)] \right\} \\
&\cong \sum_{k=1}^{N} \left\{ G(z - z_k) \cdot [h_k(z - z_k) * g(z)] \right\} ,
\end{aligned} \qquad (5C.6)$$
where the last step uses Eq. (5C.4a) to move G outside the convolutions.
6
NEdN AND DETECTOR NOISE
Laboratory measurements contaminated by random errors are usually characterized by their
signal-to-noise ratio (SNR). In measurements of spectral radiance, however, signal-to-noise ratios
can be confusing because the SNR can change by orders of magnitude as the signal itself—in
spectra having strong emission or absorption lines—changes by orders of magnitude. Hence the
noise performance of Fourier-transform spectrometers is often characterized by the noise-
equivalent change in radiance (NEdN) instead of the signal-to-noise ratio.96 By far the largest
part of the random error or NEdN in the spectral measurements of most Fourier-transform
spectrometers comes from random errors in the way detectors respond to the optical signal. These
random errors in the detector response are called detector noise. Because as few assumptions as
possible are made in this chapter about the shape of the detector-noise power spectrum, our
approach to detector noise is more elaborate than most discussions of the subject. In this chapter,
we derive formulas for the detector-noise NEdN of Michelson spectrometers using double-sided
and single-sided interferogram signals. While deriving our NEdN formulas, we are careful to
trace through what happens to the spectral signal during calibration, making it easy to understand
the different ways detector noise is processed in double-sided and single-sided systems. Although
the formulas in this chapter apply directly only to the detector noise in standard two-port
Michelson systems, the approach used here can be easily adapted to any type of Fourier-
transform spectrometer by changing the details of the analysis to accommodate the interferogram
signals generated by more elaborate instruments.
96
The NEdN is described in Sec. 6.1 below.
Definition of NEdN · 6.1
in the measured radiance if the same measurement is repeated; so for predominantly random
errors, the error bars also show that a repeated measurement of the same spectrum value with the
same instrument is likely to lie within one error-bar length of the original data point. For
predominantly random errors, then, the length BE of the error bar attached to the data point also gives the NEdN value (the noise-equivalent change in radiance) associated with the data point.
When the measurement errors are not predominantly random, the error bar specifies the total measurement error, both random and nonrandom. Hence, when a data point is also contaminated by significant amounts of nonrandom error, BE is larger than the probable change in value if the measurement is repeated. Consequently, when both random and nonrandom errors are present, the NEdN can be thought of as that portion of the error-bar length caused by random measurement error; that is, the NEdN is the increase in the length of BE due to the presence of random measurement errors. Because the NEdN must be either the error-bar length itself or an increase in the error-bar length, the NEdN always has the same type of units as the radiance measurements it describes. In this chapter, the NEdN describes the expected amount of random error in the spectral radiance L as a function of wavenumber σ, so here the NEdN always has units of optical power per unit area per unit solid angle per unit wavenumber interval (for example, watts/m²/sr/cm⁻¹ or erg/sec/cm²/sr/cm⁻¹ = erg/sec/sr/cm). In a well-designed interferometer, we expect the NEdN, and indeed the total measurement error, to be small compared to the average or typical size of the radiance.
FIGURE 6.1.
$$\tilde{L}_{mN}(\sigma) = L_{mnf}(\sigma) + \delta\tilde{L}(\sigma) \tag{6.1a}$$
The wavy line over $\tilde{L}_{mN}$ and $\delta\tilde{L}$ shows that these are both random functions of σ (see Secs. 3.1 and 3.2 of Chapter 3 for an explanation of the wavy-line notation and random functions). We need $\delta\tilde{L}$ to be a random function of σ because very often the size and nature of the random error in the spectral measurement depends strongly on the value of the wavenumber σ. The δ part of $\delta\tilde{L}$ reminds us that the random error takes on values that are small compared to the typical size of $L_{mnf}$. Representing this typical size by the spectral average of $L_{mnf}$, we note that
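$$\delta\tilde{L}(\sigma) \ll \frac{1}{\sigma_{\max} - \sigma_{\min}} \int_{\sigma_{\min}}^{\sigma_{\max}} L_{mnf}(\sigma)\, d\sigma . \tag{6.1b}$$
In this inequality, the interferometer is assumed to measure spectral radiances between $\sigma_{\min}$ and $\sigma_{\max}$. Using the same notation as in inequality (5.78) of Chapter 5, we say that
$$E\big(\tilde{L}_{mN}(\sigma)\big) = E\big(L_{mnf}(\sigma) + \delta\tilde{L}(\sigma)\big) = L_{mnf}(\sigma) + E\big(\delta\tilde{L}(\sigma)\big) , \tag{6.2a}$$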
where Eqs. (3.16a) and (3.9f) in Chapter 3 are used to simplify the right-hand side of the
equation. Looking at (6.2a), we might be tempted to define E(δ L (σ )) , which is the average or
expected value of δ L , to be the NEdN associated with the L mN measurement; but there are
problems with this approach. Suppose, for example, that only random errors are present in our
measurements and that the error bar attached to the data point shows that the random error is just
as likely to make a measured radiance too large as it is to make it too small. This suggests that the
radiance value that would be produced by a noise-free interferometer measurement can be
estimated by averaging together a large number of independent measurements, with the presence
of the randomly occurring “too-large” measurements compensating for the presence of the
randomly occurring “too-small” measurements. According to Sec. 3.4 of Chapter 3, to get the
average value of a randomly varying quantity, we should apply the expectation operator E .
Hence, the assumption that averaging together many randomly occurring too-large and too-small
measurements produces a good estimate of the noise-free interferometer measurement can be
written as
$$E\big(\tilde{L}_{mN}(\sigma)\big) = L_{mnf}(\sigma) . \tag{6.2b}$$
Comparing this with (6.2a) shows that
$$E\big(\delta\tilde{L}(\sigma)\big) = 0 . \tag{6.2c}$$
So now if E(δ L (σ )) is defined to be the NEdN, we end up saying that our measurements have
zero NEdN even though every individual measurement is contaminated by a substantial amount
of random error. This is obviously not acceptable.
To define the NEdN correctly, we must remember that the NEdN is not the average random
error itself, E(δ L (σ )) , but rather the average size of the random error. Glancing back at Eqs.
(3.5c) and (3.8e) in Chapter 3, we see that the standard deviation of δ L , which can be written as
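$$\Big\{ E\Big( \big[\delta\tilde{L}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^{2} \Big) \Big\}^{1/2} ,$$
gives us what we want. Even if $E(\delta\tilde{L}(\sigma))$ is zero, the standard deviation
$$\Big\{ E\Big( \big[\delta\tilde{L}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^{2} \Big) \Big\}^{1/2} = \Big\{ E\Big( \delta\tilde{L}(\sigma)^{2} \Big) \Big\}^{1/2}$$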
will be greater than zero as long as δ L itself is not identically zero. Hence, the standard
deviation behaves the way we want it to when E(δ L (σ )) is zero while δ L (σ ) is not zero. The
next step is to check how well this definition of the NEdN works when E(δ L (σ )) is not equal to
zero.
Suppose Eq. (6.2c) is no longer satisfied; that is, suppose that
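$$E\big(\delta\tilde{L}(\sigma)\big) = \delta L_{nr}(\sigma) = \text{small nonzero error which depends on } \sigma . \tag{6.3a}$$
We then define the random function $\delta\tilde{L}_{r}(\sigma)$ by writing
$$\delta\tilde{L}(\sigma) = \delta L_{nr}(\sigma) + \delta\tilde{L}_{r}(\sigma) . \tag{6.3b}$$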
Taking the expectation value of both sides and using Eqs. (3.16a) and (3.9f) in Chapter 3, we get
$$E\big(\delta\tilde{L}(\sigma)\big) = \delta L_{nr}(\sigma) + E\big(\delta\tilde{L}_{r}(\sigma)\big) .$$
We can reconcile this result with (6.3a) only by requiring that
$$E\big(\delta\tilde{L}_{r}(\sigma)\big) = 0 . \tag{6.3c}$$
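Equations (6.3a)–(6.3c) show that if $E(\delta\tilde{L}(\sigma))$ is not zero, then $\delta\tilde{L}(\sigma)$ can be written as the sum of both a random function $\delta\tilde{L}_{r}(\sigma)$ and a nonrandom function $\delta L_{nr}(\sigma)$, with the nonrandom function $\delta L_{nr}(\sigma)$ equal to the nonzero expectation value of $\delta\tilde{L}(\sigma)$ and the random function $\delta\tilde{L}_{r}(\sigma)$ having an expectation value of zero. Taking the expectation value of (6.1a) and using (6.3a) then gives
$$E\big(\tilde{L}_{mN}(\sigma)\big) = L_{mnf}(\sigma) + \delta L_{nr}(\sigma) , \tag{6.3d}$$
and substituting (6.3b) into (6.1a) gives
$$\tilde{L}_{mN}(\sigma) = [L_{mnf}(\sigma) + \delta L_{nr}(\sigma)] + \delta\tilde{L}_{r}(\sigma) . \tag{6.3e}$$
Equation (6.3e) shows that the sum inside the square brackets [ ] plays the same role as $L_{mnf}$ does in (6.1a) because it is a nonrandom function of σ added to a random function of σ; and Eq. (6.3d) shows that $\delta L_{nr}(\sigma)$ cannot be removed by averaging together many different measurements of $\tilde{L}_{mN}(\sigma)$.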
When repeated measurements are made of the same data point and then averaged together,
Eqs. (6.3d) and (6.3e) show that the random change from one measurement to the next comes
entirely from δ L r (σ ) , with δ L nr (σ ) just shifting the data point away from the Lmnf value by the
same amount each time the measurement is made. This shows that the increase in BE, the error-
bar length, due to random error comes entirely from the random component δ L r (σ ) of δ L (σ ) .
Fortunately, defining the NEdN to be the standard deviation of $\delta\tilde{L}(\sigma)$ still gives us a well-behaved value for the NEdN when $\delta\tilde{L}(\sigma)$ has a significant nonrandom component $\delta L_{nr}(\sigma)$. We have already seen that [see the formula following Eq. (6.2c) above]
$$\text{standard deviation of } \delta\tilde{L} = \Big\{ E\Big( \big[\delta\tilde{L}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^{2} \Big) \Big\}^{1/2} .$$
Substituting first (6.3b) and then (6.3a) into the right-hand side gives
$$\Big\{ E\Big( \big[\delta L_{nr}(\sigma) + \delta\tilde{L}_{r}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^{2} \Big) \Big\}^{1/2} = \Big\{ E\Big( \big[\delta\tilde{L}_{r}(\sigma)\big]^{2} \Big) \Big\}^{1/2} .$$
Again the standard deviation gives us what we want: a nonzero and positive value of the NEdN
that does not depend in any way on the nonrandom error component δ L nr (σ ) of δ L (σ ) . We
conclude that it makes sense to define the NEdN of any radiance measurement described by Eq.
(6.1a) to be the standard deviation of the random function δ L (σ ) even when E(δ L (σ )) does not
equal zero:
$$\mathrm{NEdN}(\sigma) = \Big\{ E\Big( \big[\delta\tilde{L}(\sigma) - E\big(\delta\tilde{L}(\sigma)\big)\big]^{2} \Big) \Big\}^{1/2} . \tag{6.3f}$$
We note that this definition automatically gives the NEdN units of spectral radiance, as it should. To emphasize that the standard deviation formula only applies to non-negative wavenumbers σ, we often write
$$\mathrm{NEdN}(|\sigma|) = \Big\{ E\Big( \big[\delta\tilde{L}(|\sigma|) - E\big(\delta\tilde{L}(|\sigma|)\big)\big]^{2} \Big) \Big\}^{1/2} . \tag{6.3g}$$
Equation (6.3g) can also be thought of as giving the NEdN the same behavior with respect to negative wavenumber values as the spectral radiance; the absolute value signs make the NEdN an even function of σ in the same way that absolute value signs make $L$, $L^{(\mathrm{fore})}$, and $L^{(\mathrm{back})}$ even functions of σ in Eqs. (5.40g), (5.51a), and (5.57a) of Chapter 5.
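The definition in Eq. (6.3f) can be illustrated with a small simulation. The sketch below is only an illustration: the radiance value, the nonrandom offset, and the Gaussian shape of the random error are assumed for the example, and `nedn_estimate` is a hypothetical helper name. It shows that the average error recovers only the nonrandom part, while the standard deviation recovers the size of the random part no matter how large the nonrandom part is.

```python
import numpy as np

def nedn_estimate(measurements):
    """Sample standard deviation of repeated measurements at one wavenumber."""
    m = np.asarray(measurements, dtype=float)
    return m.std(ddof=1)

rng = np.random.default_rng(0)
L_mnf = 100.0     # assumed noise-free radiance (arbitrary units)
bias = 0.7        # assumed nonrandom error, playing the role of delta L_nr
sigma_r = 0.25    # assumed standard deviation of the random error delta L_r

# Repeated measurements of the same data point, as in Eq. (6.3e).
samples = L_mnf + bias + sigma_r * rng.standard_normal(20000)

mean_error = samples.mean() - L_mnf   # estimates E(delta L) = delta L_nr
nedn = nedn_estimate(samples)         # estimates the NEdN of Eq. (6.3f)
print(f"mean error = {mean_error:.3f}, NEdN estimate = {nedn:.3f}")
```

Changing `bias` shifts `mean_error` but leaves the NEdN estimate unchanged, which matches the conclusion drawn from Eqs. (6.3a)–(6.3e).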
χ = ut . (6.4)
Here u is the constant OPD velocity and t is a time coordinate chosen so that t = 0 when χ = 0 .
We usually find it more convenient to represent the interferometer signal and signal errors as functions of χ while remembering that, according to Eq. (6.4), the OPD value χ and the time coordinate t are directly proportional to each other.
The interferometer signal can be evaluated at any position along the signal chain shown in
Fig. 6.2. If we think of the signal as being the electrical impulses leaving the detector circuit due
to the input radiance L(ı), then we can analyze it at point C in Fig. 6.2 and represent it by zC ( χ ) .
Function zC ( χ ) can be either the voltage or current as a function of OPD, depending on how we
want to record the signal; and Eq. (6.4) can always be used to write the signal as zC (ut ) if we
want it as a function of time. To get the corresponding electrical impulses leaving the detector,
we can analyze the signal at point B in Fig. 6.2 and represent it by z B ( χ ) ; and if we think of the
interferometer signal as being the corresponding optical power reaching the detector, then we
analyze it at point A in Fig. 6.2 and represent it by z A ( χ ) . Again we have the choice of using
either volts or amps to represent the electrical signal z B ( χ ) , and signal z A ( χ ) is usually thought
of as having units of optical power. Just like the zC signal, the zB and zA signals can be specified
as functions of time by writing z B (ut ) and z A (ut ) .
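At point C in Fig. 6.2, we know from Sec. 5.18 of Chapter 5 [see Eqs. (5.81a) and (5.83d)] that the electrical signal due to the spectral radiance L(σ) entering the interferometer’s aperture is
$$\begin{aligned}
&\int_{-\infty}^{\infty} Z_{\mathrm{eff}}(\sigma)\, e^{2\pi i \sigma\chi}\, d\sigma \\
&\quad = \frac{W A\, \Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_{f}(|\sigma|)\, \tau_{a}(|\sigma|)\, L_{\mathrm{FOV}}(|\sigma|)\, e^{2\pi i \sigma\chi}\, d\sigma ,
\end{aligned} \tag{6.5a}$$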
Signal from the Spectral Radiance · 6.2
FIGURE 6.2. [Block diagram of the signal chain: input scene radiance, fore optics, interferometer (beam splitter and moving mirror; an OPD value of χ corresponds to a physical shift of χ/2 of the moving mirror from its ZPD position), aft optics, POINT A, detector, POINT B, detector circuit with antialiasing filter (region of electrical signal), POINT C, and analog-to-digital converter sampling the signal at equally spaced χ values (region of digital signal).]
$$\Delta\sigma = \frac{\Delta\Omega\, \sigma}{2\pi} . \tag{6.5c}$$
Since $z_C(\chi)$ is defined to be the electrical signal due to L(σ) at point C of Fig. 6.2, we can now write Eq. (6.5a) as
$$z_C(\chi) = \frac{W A\, \Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_{f}(|\sigma|)\, \tau_{a}(|\sigma|)\, L_{\mathrm{FOV}}(|\sigma|)\, e^{2\pi i \sigma\chi}\, d\sigma . \tag{6.5d}$$
Examining carefully Eqs. (5.104a) and (5.104b) in Chapter 5, we see that $z_C(\chi)$ is the same signal as $z(\chi)$ in (5.104a) because the signal spectrum $Z_{\mathrm{eff}}(\sigma)$ in (5.104b) is the same as the expression put through the inverse Fourier transform on the right-hand side of (6.5d).
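The inverse Fourier transform on the left-hand side of Eq. (6.5a) can be evaluated numerically once a signal spectrum is specified. The sketch below is only an illustration: the two-lobed Gaussian spectrum, the grids, and the function name `interferogram` are assumptions made for the example, not quantities defined in this book. Because the assumed spectrum is an even function of σ, the computed interferogram is real and has its largest value at zero OPD.

```python
import numpy as np

def interferogram(z_eff, sigma, chi):
    """Riemann-sum estimate of the integral of Z_eff(sigma) exp(2 pi i sigma chi) d sigma."""
    d_sigma = sigma[1] - sigma[0]
    kernel = np.exp(2j * np.pi * np.outer(chi, sigma))   # shape (len(chi), len(sigma))
    return kernel @ z_eff * d_sigma

# Assumed even spectrum: Gaussian lobes of width 100 cm^-1 centered on +/-1000 cm^-1.
sigma = np.linspace(-2000.0, 2000.0, 4001)               # wavenumber grid, cm^-1
z_eff = np.exp(-((np.abs(sigma) - 1000.0) / 100.0) ** 2)

chi = np.linspace(-0.01, 0.01, 201)                      # OPD grid, cm
z_c = interferogram(z_eff, sigma, chi)

print("largest imaginary part:", np.max(np.abs(z_c.imag)))
print("OPD index of peak:", int(np.argmax(z_c.real)))
```

The value at zero OPD is just the area under the assumed spectrum, which provides a simple check on the numerical integration.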
The easiest way to get the formulas for $z_B(\chi)$ and $z_A(\chi)$ is to go backwards through the signal chain in Fig. 6.2.
Going back to $z_B(\chi)$, we note that it is the component of the electrical signal leaving the detector due to the input radiance L(σ). To find this component, we just set H = 1 in (6.5d) to remove the influence of the detector circuit. Since the AC coupling of the detector circuit also removes constant terms from the signal, we should also add back any constant signal terms leaving the detector.97 Equation (6.4), which requires time and OPD to be proportional, reminds us that the constant signal terms must be independent of both time t and the OPD value χ. Examining Eqs. (5.40e)–(5.40g) in Sec. 5.9 of Chapter 5, we note that the formulas for $K_{\mathrm{bal}}(\chi)$
97
See the discussion following Eq. (5.46c) in Chapter 5 for an explanation of how the constant terms are eliminated
as the signal passes from point B to C in Fig. 6.2.
are formulas for what we are now calling $z_B(\chi)$, the electrical signal leaving the detector due to the spectral radiance L(σ) entering the interferometer’s aperture. Both of the formulas for $K_{\mathrm{bal}} = z_B$ in Eqs. (5.40e) and (5.40f) have the same constant term (that is, the same χ-independent term) no matter what approximation is used for $\cos\alpha_{\varepsilon}$. Substituting from Eq. (5.40g), this term can be written as
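$$\frac{1}{4} \int_{-\infty}^{\infty} S(\sigma)\, d\sigma = \frac{A\, \Delta\Omega}{4} \int_{-\infty}^{\infty} \eta(|\sigma|)\, R(|\sigma|)\, \tau_{f}(|\sigma|)\, \tau_{a}(|\sigma|)\, L(|\sigma|)\, d\sigma . \tag{6.6a}$$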
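Because this constant term is the same no matter what approximation is used for $\cos\alpha_{\varepsilon}$, all that we need to do to get the formula for signal $z_B(\chi)$ is to add this constant term to the formula for $z_C$ in (6.5d) with H set equal to one. This gives
$$\begin{aligned}
z_B(\chi) &= \frac{A\, \Delta\Omega}{4} \int_{-\infty}^{\infty} \eta(|\sigma|)\, R(|\sigma|)\, \tau_{f}(|\sigma|)\, \tau_{a}(|\sigma|)\, L(|\sigma|)\, d\sigma \\
&\quad + \frac{W A\, \Delta\Omega}{4} \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_{f}(|\sigma|)\, \tau_{a}(|\sigma|)\, L_{\mathrm{FOV}}(|\sigma|)\, e^{2\pi i \sigma\chi}\, d\sigma .
\end{aligned} \tag{6.6b}$$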
To get $z_A(\chi)$, the optical power reaching the detector due to the spectral radiance L(σ) entering the interferometer’s aperture, we go back one more step in Fig. 6.2. According to the remark following Eq. (5.35d) at the beginning of Sec. 5.9 of Chapter 5, replacing the detector responsivity $R(|\sigma|)$ by one takes us from the electrical signal produced by the detector to the optical power hitting the detector. Therefore, to get $z_A(\chi)$, the optical power reaching the detector at point A due to the spectral radiance L(σ), we just set $R(|\sigma|) = 1$ in Eq. (6.6b) to get
$$\begin{aligned}
z_A(\chi) &= \frac{A\, \Delta\Omega}{4} \int_{-\infty}^{\infty} \eta(|\sigma|)\, \tau_{f}(|\sigma|)\, \tau_{a}(|\sigma|)\, L(|\sigma|)\, d\sigma \\
&\quad + \frac{W A\, \Delta\Omega}{4} \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\, \eta(\sigma)\, \tau_{f}(|\sigma|)\, \tau_{a}(|\sigma|)\, L_{\mathrm{FOV}}(|\sigma|)\, e^{2\pi i \sigma\chi}\, d\sigma .
\end{aligned} \tag{6.6c}$$
The formulas for $z_A$, $z_B$, and $z_C$ in Eqs. (6.6a)–(6.6c) show that if L(σ), the spectral radiance entering the front aperture, is zero, then $z_A$, $z_B$, and $z_C$ are also zero. The standard way of making the infrared spectral radiance L(σ) negligible (that is, effectively zero compared to the background radiances) is to point the interferometer at an extremely cold surface. When this is done, Eqs. (6.7a)–(6.7c) reduce to
The superscript (cold) reminds us that the background terms at points A, B, and C are the same thing as the total signal at points A, B, and C when the interferometer is looking at a cold surface. Because $z_A^{(\mathrm{cold})}$, $z_B^{(\mathrm{cold})}$, and $z_C^{(\mathrm{cold})}$ represent the background terms, and these terms are the same no matter what negligible or non-negligible spectral radiance L(σ) is entering the front aperture of the interferometer, Eqs. (6.7a)–(6.7c) can also be written as
Signal from the Background Radiance · 6.3
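$$z_A^{(\mathrm{tot})}(\chi) = z_A(\chi) + z_A^{(\mathrm{cold})}(\chi) , \tag{6.8a}$$
$$z_B^{(\mathrm{tot})}(\chi) = z_B(\chi) + z_B^{(\mathrm{cold})}(\chi) , \tag{6.8b}$$
and
$$z_C^{(\mathrm{tot})}(\chi) = z_C(\chi) + z_C^{(\mathrm{cold})}(\chi) . \tag{6.8c}$$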
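As a general rule, the calibration of any well-designed interferometer system provides us with the data needed to find $z_A^{(\mathrm{cold})}(\chi)$, $z_B^{(\mathrm{cold})}(\chi)$, and $z_C^{(\mathrm{cold})}(\chi)$. Consequently, in principle all that need be done to recover signals $z_A(\chi)$, $z_B(\chi)$, and $z_C(\chi)$ at points A, B, and C is to subtract $z_A^{(\mathrm{cold})}$, $z_B^{(\mathrm{cold})}$, and $z_C^{(\mathrm{cold})}$ from $z_A^{(\mathrm{tot})}$, $z_B^{(\mathrm{tot})}$, and $z_C^{(\mathrm{tot})}$ to get
$$z_A(\chi) = z_A^{(\mathrm{tot})}(\chi) - z_A^{(\mathrm{cold})}(\chi) , \tag{6.8d}$$
$$z_B(\chi) = z_B^{(\mathrm{tot})}(\chi) - z_B^{(\mathrm{cold})}(\chi) , \tag{6.8e}$$
and
$$z_C(\chi) = z_C^{(\mathrm{tot})}(\chi) - z_C^{(\mathrm{cold})}(\chi) . \tag{6.8f}$$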
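$$z_C^{(\mathrm{cold})}(\chi) = \frac{W A\, \Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_{a}(|\sigma|)\, \big[L^{(\mathrm{fore})}(|\sigma|) - L^{(\mathrm{back})}(|\sigma|)\big]\, e^{2\pi i \sigma\chi}\, d\sigma . \tag{6.9}$$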
The nonideal case where $\cos\alpha_{\varepsilon}$ can no longer be approximated by one requires somewhat more work. Equation (5.62a) in Chapter 5 gives the nonideal formula for $z_C^{(\mathrm{cold})}$. When we compare (5.62a) to Eq. (5.73a) in Chapter 5, we notice that Eq. (5.73a) becomes identical to (5.62a) when functions z(χ) and S(σ) are taken to be the same as $z_C^{(\mathrm{cold})}(\chi)$ and $[S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma)]$ respectively:
In the mathematical analysis following Eq. (5.73a), functions z(χ) and S(σ) are just “placeholder” functions; that is, our mathematical analysis down to Eq. (5.75e) holds true for any appropriate pair of “z” and “S” functions because it makes no assumptions about them other than that they are related by a formula like Eq. (5.73a). This means we can find out what would happen to Eq. (5.62a) when the same sort of analysis is applied to it as is applied to the z and S functions in (5.73a) simply by replacing z and S in (5.75e) by $z_C^{(\mathrm{cold})}$ and $[S^{(\mathrm{fore})} - S^{(\mathrm{back})}]$ respectively. Making this replacement shows that the mathematical relationship specified in Eq. (5.62a) transforms into
$$z_C^{(\mathrm{cold})}(\chi) = \frac{W}{4} \int_{-\infty}^{\infty} \big[S^{(\mathrm{fore})}(\sigma) - S^{(\mathrm{back})}(\sigma)\big]\, M(R\sigma\theta_{ma})\, H(u\sigma)\, \operatorname{sinc}\!\Big(\frac{\sigma\chi\, \Delta\Omega}{2}\Big)\, e^{2\pi i \chi\sigma \left(1 - \frac{\Delta\Omega}{4\pi}\right)}\, d\sigma . \tag{6.10c}$$
For this new formula to be true, we must assume, just as in the analysis following Eq. (5.73a), that $\chi\sigma\alpha_{\varepsilon}^{2}$ can be treated as a small quantity for all $-D \le \chi \le D$ over which the $z_C^{(\mathrm{cold})}(\chi)$ signal is recorded and that the field of view ΔΩ, although relatively large, is not so large that
$$\cos\alpha_{\varepsilon} \cong 1 - \frac{\alpha_{\varepsilon}^{2}}{2}$$
ceases to be a good approximation.
Inverse Fourier Transform of the Background Radiance · 6.4
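$$\begin{aligned}
F^{(-i\sigma\chi)}\big(z_C^{(\mathrm{cold})}(\chi)\big) &= \int_{-\infty}^{\infty} d\chi\, e^{-2\pi i \sigma\chi} \int_{-\infty}^{\infty} d\sigma' \cdot \\
&\quad \Big\{ \frac{W}{4} \big[S^{(\mathrm{fore})}(\sigma') - S^{(\mathrm{back})}(\sigma')\big]\, M(R\sigma'\theta_{ma})\, H(u\sigma')\, \operatorname{sinc}\!\Big(\frac{\sigma'\chi\, \Delta\Omega}{2}\Big)\, e^{2\pi i \chi\sigma' \left(1 - \frac{\Delta\Omega}{4\pi}\right)} \Big\} .
\end{aligned} \tag{6.10d}$$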
Comparing this to Eq. (5.76a) in Chapter 5, we note that the right-hand sides of (5.76a) and (6.10d) become identical if we once again use (6.10b), matching $S(\sigma')$ to $[S^{(\mathrm{fore})}(\sigma') - S^{(\mathrm{back})}(\sigma')]$. Checking how Appendix 5B is used to transform the right-hand side of (5.76a) into the right-hand side of (5.76b), we note that again S is just a placeholder function. This means the mathematical analysis still holds true when $S(\sigma')$ is replaced by $[S^{(\mathrm{fore})}(\sigma') - S^{(\mathrm{back})}(\sigma')]$. Consequently, we can apply the same transformation used on (5.76a) to Eq. (6.10d) to get
$$F^{(-i\sigma\chi)}\big(z_C^{(\mathrm{cold})}(\chi)\big) \cong \frac{1}{\Delta\sigma} \int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} \Big\{ \frac{W}{4} \big[S^{(\mathrm{fore})}(\sigma') - S^{(\mathrm{back})}(\sigma')\big]\, M(R\sigma'\theta_{ma})\, H(u\sigma') \Big\}\, d\sigma' , \tag{6.10e}$$
where
$$\Delta\sigma = \frac{\Delta\Omega\, \sigma}{2\pi} .$$
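$$\begin{aligned}
F^{(-i\sigma\chi)}\big(z_C^{(\mathrm{cold})}(\chi)\big) &\cong \frac{1}{\Delta\sigma} \cdot \Big(\frac{W A\, \Delta\Omega}{4}\Big) \cdot \\
&\quad \int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} \Big\{ M(R\sigma'\theta_{ma})\, H(u\sigma')\, \eta(\sigma')\, R(|\sigma'|)\, \tau_{a}(|\sigma'|)\, \big[L^{(\mathrm{fore})}(|\sigma'|) - L^{(\mathrm{back})}(|\sigma'|)\big] \Big\}\, d\sigma' .
\end{aligned} \tag{6.10g}$$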
According to the discussion following Eq. (5.82c) of Chapter 5, the functions M, H, R, η, and $\tau_{a}$ all vary slowly with wavenumber σ, allowing them to be brought outside the integral in (6.10g). In well-designed interferometers, it is often true that the background radiances $L^{(\mathrm{fore})}$ and $L^{(\mathrm{back})}$ are also slowly varying functions of σ, being more or less proportional to a combination of Planck black-body curves, but for now we can leave open the possibility that this is not the case. Using the approximations specified in (5.83b) of Chapter 5, Eq. (6.10g) can now be written as
$$\begin{aligned}
F^{(-i\sigma\chi)}\big(z_C^{(\mathrm{cold})}(\chi)\big) &\cong \Big(\frac{W A\, \Delta\Omega}{4}\Big)\, M(R\sigma\theta_{ma})\, H(u\sigma)\, \eta(\sigma)\, R(|\sigma|)\, \tau_{a}(|\sigma|) \cdot \\
&\quad \Bigg\{ \Bigg[ \frac{1}{\Delta\sigma} \int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L^{(\mathrm{fore})}(|\sigma'|)\, d\sigma' \Bigg] - \Bigg[ \frac{1}{\Delta\sigma} \int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L^{(\mathrm{back})}(|\sigma'|)\, d\sigma' \Bigg] \Bigg\} .
\end{aligned} \tag{6.10h}$$
Equation (6.10h) applies, of course, to the nonideal case where ΔΩ is small but not so small that $\cos\alpha_{\varepsilon}$ can be approximated by one. Returning to Eq. (6.9), which gives the formula for $z_C^{(\mathrm{cold})}(\chi)$ when the ΔΩ field of view is small enough to approximate $\cos\alpha_{\varepsilon}$ by one, we take the forward Fourier transform of both sides of (6.9) to get
$$F^{(-i\sigma\chi)}\big(z_C^{(\mathrm{cold})}(\chi)\big) = \Big(\frac{W A\, \Delta\Omega}{4}\Big)\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_{a}(|\sigma|)\, \big[L^{(\mathrm{fore})}(|\sigma|) - L^{(\mathrm{back})}(|\sigma|)\big] . \tag{6.10i}$$
Comparing Eqs. (6.10h) and (6.10i), we see they can be combined into a single result by writing
$$F^{(-i\sigma\chi)}\big(z_C^{(\mathrm{cold})}(\chi)\big) \cong \Big(\frac{W A\, \Delta\Omega}{4}\Big)\, H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_{a}(|\sigma|)\, \big[L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|)\big] , \tag{6.11a}$$
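and
$$L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|) =
\begin{cases}
L^{(\mathrm{back})}(|\sigma|) & \text{for small } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ can be approximated as one} \\[2ex]
\dfrac{1}{\Delta\sigma} \displaystyle\int_{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) - \frac{\Delta\sigma}{2}}^{\sigma\left(1 + \frac{\Delta\Omega}{4\pi}\right) + \frac{\Delta\sigma}{2}} L^{(\mathrm{back})}(|\sigma'|)\, d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ cannot be approximated as one.}
\end{cases} \tag{6.11c}$$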
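The second case in Eq. (6.11c) is a running average of the background radiance over a window of width Δσ = ΔΩσ/2π centered on σ(1 + ΔΩ/4π). The sketch below is only an illustration for positive σ: the function name `l_fov`, the callable `radiance`, and the number of quadrature points are hypothetical choices made for the example.

```python
import numpy as np

def l_fov(radiance, sigma, delta_omega, n_quad=201):
    """Average radiance(sigma') over the window used in Eq. (6.11c).

    radiance    : callable returning the radiance on an array of wavenumbers (assumed)
    sigma       : positive wavenumber at which the average is wanted
    delta_omega : field-of-view solid angle in steradians
    """
    d_sigma = delta_omega * sigma / (2.0 * np.pi)            # window width
    center = sigma * (1.0 + delta_omega / (4.0 * np.pi))     # window center
    s = np.linspace(center - d_sigma / 2.0, center + d_sigma / 2.0, n_quad)
    y = radiance(s)
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(s))   # trapezoid rule
    return integral / d_sigma

# Example with an assumed radiance that varies linearly with wavenumber.
value = l_fov(lambda s: 2.0 * s + 1.0, sigma=1000.0, delta_omega=0.01)
print(value)
```

For a radiance that is constant or linear across the window, the average is just the radiance at the window center, so the two cases in (6.11c) differ only by the small shift of the center from σ to σ(1 + ΔΩ/4π).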
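$$z_C^{(\mathrm{cold})}(\chi) \cong \Big(\frac{W A\, \Delta\Omega}{4}\Big) \int_{-\infty}^{\infty} H(u\sigma)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_{a}(|\sigma|)\, \big[L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|)\big]\, e^{2\pi i \sigma\chi}\, d\sigma . \tag{6.12a}$$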
This is the formula for $z_C^{(\mathrm{cold})}$ that belongs in Eq. (6.8c).
Having found the background terms at point C in Fig. 6.2, we now get the background terms at point B by going back up the signal chain the same way we did for $z_A$, $z_B$, and $z_C$ in Sec. 6.2 above. To evaluate the right-hand side of (6.12a) at point B, we set $H(u\sigma) = 1$ to get
$$\Big(\frac{W A\, \Delta\Omega}{4}\Big) \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_{a}(|\sigma|)\, \big[L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|)\big]\, e^{2\pi i \sigma\chi}\, d\sigma .$$
Unfortunately, to get the complete $z_B^{(\mathrm{cold})}(\chi)$ background signal at point B, we have to add to this the constant terms removed by the AC coupling of the detector to the rest of the system.98 Since, according to Eq. (6.4), time and the OPD value χ are proportional to each other, the time-independent constant terms are also χ independent. We note that according to Eq. (6.8b) above, the total interference signal at point B is
$$z_B^{(\mathrm{tot})}(\chi) = z_B(\chi) + z_B^{(\mathrm{cold})}(\chi) .$$
Returning to Chapter 5, we compare Eq. (5.59b), which gives the total interference signal at point B when the interferometer’s field of view is too large for $\cos\alpha_{\varepsilon}$ to be approximated as one, and Eq. (5.59c), which gives the total interference signal at point B when the field of view is small
98
See discussion following Eqs. (5.42c) and (5.46c) in Sec. 5.10 of Chapter 5 for more information on AC coupling.
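enough for $\cos\alpha_{\varepsilon}$ to be approximated by one, and see that they both have the same χ-independent constant terms:
$$\begin{aligned}
\chi\text{-independent terms} &= A_{\mathrm{det}}\, \Delta\Omega^{(\mathrm{dir})} \int_{0}^{\infty} R(\sigma)\, L^{(\mathrm{dir})}(\sigma)\, d\sigma \\
&\quad + \frac{1}{2} \int_{0}^{\infty} S(\sigma)\, d\sigma + \frac{1}{2} \int_{0}^{\infty} S^{(\mathrm{fore})}(\sigma)\, d\sigma \\
&\quad + \frac{A\, \Delta\Omega}{2} \int_{0}^{\infty} \tau_{a}(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma)\, \big[2\, |r(\sigma)|^{2} - \eta(\sigma)\big]\, d\sigma .
\end{aligned}$$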
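Because we are only trying to find the constant terms in the $z_B^{(\mathrm{cold})}(\chi)$ background radiance (that is, constant terms that are still present when the input radiance $L(\sigma) \to 0$ because the instrument observes a cold scene), we must be careful to drop everything that is zero when L(σ) is zero. Formula (5.40g) in Chapter 5 shows that the integral over S(σ) becomes zero when L(σ) is zero, so it should be removed to give
$$\begin{aligned}
\chi\text{-independent background terms} &= A_{\mathrm{det}}\, \Delta\Omega^{(\mathrm{dir})} \int_{0}^{\infty} R(\sigma)\, L^{(\mathrm{dir})}(\sigma)\, d\sigma + \frac{1}{2} \int_{0}^{\infty} S^{(\mathrm{fore})}(\sigma)\, d\sigma \\
&\quad + \frac{A\, \Delta\Omega}{2} \int_{0}^{\infty} \tau_{a}(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma)\, \big[2\, |r(\sigma)|^{2} - \eta(\sigma)\big]\, d\sigma .
\end{aligned}$$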
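Because these χ-independent background terms are the same no matter how $\cos\alpha_{\varepsilon}$ is approximated, they correctly represent the constant background terms at point B for all reasonable sizes of the interferometer’s field of view. Adding them to the χ-dependent terms from Eq. (6.12a) with H = 1 thus gives the total background interference signal at point B,
$$\begin{aligned}
z_B^{(\mathrm{cold})}(\chi) &= \Big(\frac{W A\, \Delta\Omega}{4}\Big) \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_{a}(|\sigma|)\, \big[L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|)\big]\, e^{2\pi i \sigma\chi}\, d\sigma \\
&\quad + A_{\mathrm{det}}\, \Delta\Omega^{(\mathrm{dir})} \int_{0}^{\infty} R(\sigma)\, L^{(\mathrm{dir})}(\sigma)\, d\sigma + \frac{1}{2} \int_{0}^{\infty} S^{(\mathrm{fore})}(\sigma)\, d\sigma \\
&\quad + \frac{A\, \Delta\Omega}{2} \int_{0}^{\infty} \tau_{a}(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma)\, \big[2\, |r(\sigma)|^{2} - \eta(\sigma)\big]\, d\sigma .
\end{aligned} \tag{6.12b}$$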
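We substitute for $S^{(\mathrm{fore})}(\sigma)$ from Eq. (5.51a), with absolute value signs dropped from the σ arguments because the integral does not cover negative σ values, to get
$$\begin{aligned}
z_B^{(\mathrm{cold})}(\chi) &= \Big(\frac{W A\, \Delta\Omega}{4}\Big) \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_{a}(|\sigma|)\, \big[L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|)\big]\, e^{2\pi i \sigma\chi}\, d\sigma \\
&\quad + A_{\mathrm{det}}\, \Delta\Omega^{(\mathrm{dir})} \int_{0}^{\infty} R(\sigma)\, L^{(\mathrm{dir})}(\sigma)\, d\sigma + \frac{A\, \Delta\Omega}{2} \int_{0}^{\infty} R(\sigma)\, \eta(\sigma)\, L^{(\mathrm{fore})}(\sigma)\, \tau_{a}(\sigma)\, d\sigma \\
&\quad + \frac{A\, \Delta\Omega}{2} \int_{0}^{\infty} \tau_{a}(\sigma)\, R(\sigma)\, L^{(\mathrm{back})}(\sigma)\, \big[2\, |r(\sigma)|^{2} - \eta(\sigma)\big]\, d\sigma .
\end{aligned} \tag{6.12c}$$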
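Now that the constant terms have been correctly incorporated into (6.12c), it is easy to get the formula for $z_A^{(\mathrm{cold})}(\chi)$, the total background signal at point A in Fig. 6.2: just set the detector responsivity to R = 1. Hence, we see that the total background optical power reaching the detector is
$$\begin{aligned}
z_A^{(\mathrm{cold})}(\chi) &= \Big(\frac{W A\, \Delta\Omega}{4}\Big) \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\, \eta(\sigma)\, \tau_{a}(|\sigma|)\, \big[L_{\mathrm{FOV}}^{(\mathrm{fore})}(|\sigma|) - L_{\mathrm{FOV}}^{(\mathrm{back})}(|\sigma|)\big]\, e^{2\pi i \sigma\chi}\, d\sigma \\
&\quad + A_{\mathrm{det}}\, \Delta\Omega^{(\mathrm{dir})} \int_{0}^{\infty} L^{(\mathrm{dir})}(\sigma)\, d\sigma + \frac{A\, \Delta\Omega}{2} \int_{0}^{\infty} \eta(\sigma)\, L^{(\mathrm{fore})}(\sigma)\, \tau_{a}(\sigma)\, d\sigma \\
&\quad + \frac{A\, \Delta\Omega}{2} \int_{0}^{\infty} \tau_{a}(\sigma)\, L^{(\mathrm{back})}(\sigma)\, \big[2\, |r(\sigma)|^{2} - \eta(\sigma)\big]\, d\sigma .
\end{aligned} \tag{6.12d}$$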
Equations (6.5d), (6.6b), (6.6c), and (6.12a)–(6.12d) give all the information needed to make sense of the formulas (6.8a)–(6.8c) for the $z^{(\mathrm{tot})}(\chi)$ signals at points A, B, and C in Fig. 6.2.
$$\tilde{z}_{AN}^{(\mathrm{tot})}(\chi) = z_A(\chi) + z_A^{(\mathrm{cold})}(\chi) + \delta\tilde{z}_A(\chi) , \tag{6.13a}$$
$$\tilde{z}_{BN}^{(\mathrm{tot})}(\chi) = z_B(\chi) + z_B^{(\mathrm{cold})}(\chi) + \delta\tilde{z}_B(\chi) , \tag{6.13b}$$
and
$$\tilde{z}_{CN}^{(\mathrm{tot})}(\chi) = z_C(\chi) + z_C^{(\mathrm{cold})}(\chi) + \delta\tilde{z}_C(\chi) . \tag{6.13c}$$
Here $\delta\tilde{z}_A(\chi)$ represents the noise associated with any signal at point A in Fig. 6.2, $\delta\tilde{z}_B(\chi)$ represents the noise associated with any signal at point B in Fig. 6.2, and $\delta\tilde{z}_C(\chi)$ represents the noise associated with any signal at point C in Fig. 6.2. Just as in Eq. (6.1a) above, the noise terms have a δ to show that they are expected to be small, and they have wavy lines or tildes to show that they are random functions of χ. Tildes are added to $\tilde{z}_{AN}^{(\mathrm{tot})}(\chi)$, $\tilde{z}_{BN}^{(\mathrm{tot})}(\chi)$, and $\tilde{z}_{CN}^{(\mathrm{tot})}(\chi)$ to show that these signals are also random quantities (because they are contaminated by the random noise).
As pointed out in Sec. 6.3, the $z^{(\mathrm{cold})}(\chi)$ signals are special cases of the $z^{(\mathrm{tot})}(\chi)$ interferometer signals; they are just the total signals at points A, B, or C when $z_A(\chi)$, $z_B(\chi)$, and $z_C(\chi)$ are negligible or zero because the interferometer is observing a cold scene having negligible or zero spectral radiance L(σ). Hence, when L is negligible or zero, Eqs. (6.13a)–(6.13c) can be specialized by writing
$$\tilde{z}_{AN}^{(\mathrm{cold})}(\chi) = z_A^{(\mathrm{cold})}(\chi) + \delta\tilde{z}_A^{(\mathrm{cold})}(\chi) , \tag{6.13d}$$
$$\tilde{z}_{BN}^{(\mathrm{cold})}(\chi) = z_B^{(\mathrm{cold})}(\chi) + \delta\tilde{z}_B^{(\mathrm{cold})}(\chi) , \tag{6.13e}$$
and
$$\tilde{z}_{CN}^{(\mathrm{cold})}(\chi) = z_C^{(\mathrm{cold})}(\chi) + \delta\tilde{z}_C^{(\mathrm{cold})}(\chi) . \tag{6.13f}$$
$$E\big(\tilde{z}_{AN}^{(\mathrm{tot})}(\chi)\big) = z_A^{(\mathrm{tot})}(\chi) , \tag{6.14a}$$
$$E\big(\tilde{z}_{BN}^{(\mathrm{tot})}(\chi)\big) = z_B^{(\mathrm{tot})}(\chi) , \tag{6.14b}$$
Background Radiance, Total Error, and Signal Noise · 6.5
and
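$$E\big(\tilde{z}_{CN}^{(\mathrm{tot})}(\chi)\big) = z_C^{(\mathrm{tot})}(\chi) . \tag{6.14c}$$
Similarly, for the cold-scene signals,
$$E\big(\tilde{z}_{AN}^{(\mathrm{cold})}(\chi)\big) = z_A^{(\mathrm{cold})}(\chi) , \tag{6.14d}$$
$$E\big(\tilde{z}_{BN}^{(\mathrm{cold})}(\chi)\big) = z_B^{(\mathrm{cold})}(\chi) , \tag{6.14e}$$
and
$$E\big(\tilde{z}_{CN}^{(\mathrm{cold})}(\chi)\big) = z_C^{(\mathrm{cold})}(\chi) . \tag{6.14f}$$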
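We substitute (6.13a)–(6.13c) into (6.14a)–(6.14c) and use the linearity of the expectation operator E as explained in Sec. 3.10 of Chapter 3 to get
$$E\big(z_{A,B,C}(\chi) + z_{A,B,C}^{(\mathrm{cold})}(\chi)\big) + E\big(\delta\tilde{z}_{A,B,C}(\chi)\big) = z_{A,B,C}^{(\mathrm{tot})}(\chi) .$$
According to Eqs. (6.8a)–(6.8c), this is the same as
$$E\big(z_{A,B,C}^{(\mathrm{tot})}(\chi)\big) + E\big(\delta\tilde{z}_{A,B,C}(\chi)\big) = z_{A,B,C}^{(\mathrm{tot})}(\chi) ,$$
and because the expectation value of the nonrandom function $z_{A,B,C}^{(\mathrm{tot})}$ is just the function itself, we end up with
$$E\big(\delta\tilde{z}_{A,B,C}(\chi)\big) = 0 . \tag{6.14g}$$
Similarly, we can substitute (6.13d)–(6.13f) into (6.14d)–(6.14f) and use the linearity of the expectation operator to get
$$E\big(z_{A,B,C}^{(\mathrm{cold})}(\chi)\big) + E\big(\delta\tilde{z}_{A,B,C}^{(\mathrm{cold})}(\chi)\big) = z_{A,B,C}^{(\mathrm{cold})}(\chi) ,$$
so that
$$E\big(\delta\tilde{z}_{A,B,C}^{(\mathrm{cold})}(\chi)\big) = 0 . \tag{6.14h}$$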
Equations (6.14g) and (6.14h) require the expectation or average values of the random functions representing the noise to be equal to zero at every OPD value χ. From this point on, we can think of the $\delta\tilde{z}$ signal “noise” as a random signal error whose expectation value is always zero.
Function $L_{mnf}(\sigma)$ is, according to the discussion in Sec. 6.1, the best and most accurate spectral-radiance measurement that an interferometer can produce. It can be recovered from the noise-free signal at points A, B, or C in Fig. 6.2; however, all that we get from a single measurement is the noise-contaminated signal $\tilde{z}_{AN}^{(\mathrm{tot})}$, $\tilde{z}_{BN}^{(\mathrm{tot})}$, $\tilde{z}_{CN}^{(\mathrm{tot})}$ or (when looking at a cold surface) $\tilde{z}_{AN}^{(\mathrm{cold})}$, $\tilde{z}_{BN}^{(\mathrm{cold})}$, $\tilde{z}_{CN}^{(\mathrm{cold})}$. In principle, we could average together large numbers of measurements to get, according to Eqs. (6.14a)–(6.14f), both of the noise-free signals $z_{A,B,C}^{(\mathrm{tot})}$ and $z_{A,B,C}^{(\mathrm{cold})}$. Then, following the recipe in Eqs. (6.8d)–(6.8f), the noise-free $z^{(\mathrm{cold})}$ signal could be subtracted from the noise-free $z^{(\mathrm{tot})}$ signal to get $z_A(\chi)$, $z_B(\chi)$, or $z_C(\chi)$ at points A, B, or C in Fig. 6.2. This is exactly what is needed to gain access to $L_{mnf}(\sigma)$; unfortunately, it is also impractical.
Typically, enough work is invested in calibrating an interferometer to produce very high-quality estimates of the $z^{(\mathrm{cold})}$ signals if we want them. Even when we calibrate in the spectral domain, as discussed in Sec. 5.19 of Chapter 5, the calibration algorithm requires substantially noise-free signal spectra from which we could extract substantially noise-free $z^{(\mathrm{cold})}$ signals. When making everyday measurements, on the other hand, we end up relying on less high-quality information; that is, we use the noise-contaminated $\tilde{z}_{AN}^{(\mathrm{tot})}$, $\tilde{z}_{BN}^{(\mathrm{tot})}$, $\tilde{z}_{CN}^{(\mathrm{tot})}$ signals or their equivalents. Everyday measurements are less accurate than the information used to calibrate the interferometer because that is what it means to calibrate an instrument: however accurate the everyday measurement, and however many noise-suppression averages go into its making, we expect the calibration to be done with even greater care. Hence, when analyzing the $z_{A,B,C}$ signals generated by the input L(σ) radiance, we can assume there is always enough noise-free data to subtract off, if only as a thought experiment, the nonrandom functions $z_{A,B,C}^{(\mathrm{cold})}$ from the random functions $\tilde{z}_{AN,BN,CN}^{(\mathrm{tot})}$ to get
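$$\tilde{z}_{AN}(\chi) = \tilde{z}_{AN}^{(\mathrm{tot})}(\chi) - z_A^{(\mathrm{cold})}(\chi) , \tag{6.15a}$$
$$\tilde{z}_{BN}(\chi) = \tilde{z}_{BN}^{(\mathrm{tot})}(\chi) - z_B^{(\mathrm{cold})}(\chi) , \tag{6.15b}$$
and
$$\tilde{z}_{CN}(\chi) = \tilde{z}_{CN}^{(\mathrm{tot})}(\chi) - z_C^{(\mathrm{cold})}(\chi) . \tag{6.15c}$$
Substituting from Eqs. (6.13a)–(6.13c) then gives
$$\tilde{z}_{AN}(\chi) = z_A(\chi) + \delta\tilde{z}_A(\chi) , \tag{6.16a}$$
$$\tilde{z}_{BN}(\chi) = z_B(\chi) + \delta\tilde{z}_B(\chi) , \tag{6.16b}$$
and
$$\tilde{z}_{CN}(\chi) = z_C(\chi) + \delta\tilde{z}_C(\chi) . \tag{6.16c}$$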
Equations (6.16a)–(6.16c) show that any noise in the signals at points A, B, or C in Fig. 6.2 “automatically” ends up attached to $z_{A,B,C}$; that is, it ends up attached to the signal component used to recover the $L_{mnf}(\sigma)$ spectral radiance measured by the interferometer.
( tot )
zBN ( χ ) = z B ( χ ) + z B( cold ) ( χ ) + n (det) ( χ ) . (6.17c)
Since only detector noise is being analyzed in this chapter, we specify here that only negligible
amounts of noise occur “upstream” of point B in Fig. 6.2 by setting
δ z A ( χ ) = 0 (6.17d)
in Eqs. (6.13a) and (6.16a). We also assume that only negligible amounts of extra noise enter the
signal chain downstream of point C, which means that δ zC in (6.13c) and (6.16c) comes entirely
from the transmission of δ zB = n (det) between points B and C. Our job is to find what δ zC looks
like in terms of n (det) and then to use that information to find a formula for the NEdN due to
detector noise.
Many Fourier-transform systems go to great lengths to minimize detector noise. Some tactics
are obvious—for example, careful choice and treatment of detectors so that they perform well
6 · NEdN and Detector Noise
and do not generate large amounts of random error. Other tactics are perhaps less obvious—for
example, averaging together a large number of interferogram signals to reduce the detector noise
present. Section 3.12 of Chapter 3 has a discussion of how averaging of identical,
noise-contaminated signals works to reduce random error; and of course Fourier-transform
interferometer signals are
put through computers to extract spectra, making it easy to store and average them. This sort of
averaging often involves the combination of many different independent measurements at the
same OPD value $\chi$ and is often referred to as “co-adding” the interferograms. (It should not be
confused with the averaging discussed in Sec. 6.8 below, where we talk about averaging together
the signal values at $\chi$ and $-\chi$.) There are two points that should be kept in mind when reading the
balance of this chapter:
(1) However much effort is put into co-adding interferograms to reduce noise, almost
always—as discussed at the end of the previous section—even more effort is put
into processing the calibration data to reduce noise; and
(2) The n (det) random function in Eq. (6.17a) above can be taken to represent the amount
of noise that still contaminates the signal after co-adding has occurred.
We can, in effect, pretend that co-adding is something that happens to the signal immediately
after it leaves the detector, acting to reduce the noise at point B and all points further downstream
in the signal processing chain of Fig. 6.2.
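The effect of co-adding can be illustrated with a short numerical sketch. Everything in it is an assumption made for illustration: the cosine "interferogram", the Gaussian noise model, and the helper names `coadd`, `rms`, and `residual_noise` are hypothetical and do not come from the signal chain of Fig. 6.2. It only shows the familiar behavior that averaging N independent, identically contaminated records lowers the residual random error by roughly a factor of the square root of N.

```python
import math
import random


def coadd(records):
    """Average equal-length records point by point (co-adding)."""
    n_records = len(records)
    return [sum(column) / n_records for column in zip(*records)]


def rms(values):
    """Root-mean-square value of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def residual_noise(n_coadds, n_points=400, sigma=1.0, seed=1):
    """RMS noise left on a hypothetical signal after co-adding n_coadds records."""
    rng = random.Random(seed)
    # Hypothetical noise-free signal sampled at n_points OPD values.
    signal = [math.cos(2.0 * math.pi * 5.0 * j / n_points) for j in range(n_points)]
    # Each record is the same signal plus independent Gaussian noise (an assumed model).
    records = [[s + rng.gauss(0.0, sigma) for s in signal] for _ in range(n_coadds)]
    averaged = coadd(records)
    return rms([a - s for a, s in zip(averaged, signal)])


noise_single = residual_noise(1)      # close to sigma
noise_coadded = residual_noise(100)   # close to sigma / sqrt(100)
print(noise_single, noise_coadded)
```

In this picture the random function remaining after co-adding plays the role of the residual detector noise described in point (2) above.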
$n_k^{(\mathrm{det})}(t)$ = measured detector noise for the kth detector as a function of time $t$,
with $-T \le t \le T$
1/f Noise in Detectors · 6.7
Note that although these are error functions, they are not random since they represent the actual
measured error for each detector. Because the detectors are all identical, each $n_k^{(\mathrm{det})}(t)$ can be
thought of as a specific instance of the same random function $n^{(\mathrm{det})}(t)$; that is, each $n_k^{(\mathrm{det})}(t)$ can
be treated as a typical member of the ensemble of functions associated with the $n^{(\mathrm{det})}(t)$ random
function.99 Returning briefly to Sec. 3.23 of Chapter 3, we use Eq. (3.56a) to calculate another set
of functions,
$$ N_T^{(k)}(f) = \int_{-T}^{T} n_k^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt . $$
Each $N_T^{(k)}(f)$ can be regarded as a member of the ensemble of functions associated with random
function
$$ N_T(f) = \int_{-T}^{T} n^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt . $$
Formula (3.57g) in Chapter 3 then states that the noise-power spectrum of $n^{(\mathrm{det})}(t)$ is
$$ S_{\tilde{n}\tilde{n}}(f) = \lim_{T \to \infty} \left[ \frac{E\!\left( |N_T(f)|^2 \right)}{2T} \right] . $$
Because the expected value of a random quantity can be estimated by taking its average, we can
write that
$$ S_{\tilde{n}\tilde{n}}(f) \cong \lim_{T \to \infty} \left[ \frac{1}{2T} \cdot \frac{1}{K} \sum_{k=1}^{K} \left| N_T^{(k)}(f) \right|^2 \right] $$
and the formula reduces to
$$ S_{\tilde{n}\tilde{n}}(f) \cong \frac{1}{K} \sum_{k=1}^{K} \frac{\left| N_T^{(k)}(f) \right|^2}{2T} $$
when $T$ is large enough for
$$ \frac{\left| N_T^{(k)}(f) \right|^2}{2T} $$
99
See Sec. 3.14 of Chapter 3 for an explanation of what is meant by an ensemble of functions.
to be close to its limit as $T \to \infty$. This result shows how to calculate the noise-power spectrum of
the $K$ identical detectors. When discussing 1/f noise, it is customary to introduce one final step:
using Eq. (3.58b) to go from the double-sided power spectrum $S_{\tilde{n}\tilde{n}}$ to the single-sided power
spectrum $S^{(1)}_{\tilde{n}\tilde{n}}$,
$$ S^{(1)}_{\tilde{n}\tilde{n}}(f) = 2\, S_{\tilde{n}\tilde{n}}(f) \quad \text{for } f \ge 0 . $$
This is really just a change of scale—doubling the size of the noise-power spectrum—along with
an agreement to ignore the negative f values because they are always the same as the positive
ones [see Eq. (3.49b) in Chapter 3].
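As a rough illustration of this recipe, the sketch below estimates the double-sided noise-power spectrum by averaging $|N_T^{(k)}(f)|^2/(2T)$ over a set of simulated detector records and then doubles it to get the single-sided value. The white Gaussian noise model, the sampling interval `dt`, and the function names `finite_transform` and `power_spectrum_estimate` are assumptions made for this example only; the finite-time transform is approximated by a Riemann sum.

```python
import cmath
import math
import random


def finite_transform(samples, dt, f):
    """Riemann-sum approximation of the integral of n(t) exp(-2 pi i f t) over -T <= t < T."""
    t_start = -0.5 * len(samples) * dt  # the record spans a total time 2T = len(samples) * dt
    return dt * sum(
        x * cmath.exp(-2j * math.pi * f * (t_start + j * dt))
        for j, x in enumerate(samples)
    )


def power_spectrum_estimate(records, dt, f):
    """Average |N_T^(k)(f)|^2 / (2T) over the K records (double-sided estimate)."""
    two_t = len(records[0]) * dt
    total = sum(abs(finite_transform(r, dt, f)) ** 2 for r in records)
    return total / (len(records) * two_t)


rng = random.Random(7)
dt, n_samples, k_detectors, sigma = 0.01, 64, 300, 2.0
# Assumed noise model: independent Gaussian samples (white noise) for each detector.
records = [[rng.gauss(0.0, sigma) for _ in range(n_samples)] for _ in range(k_detectors)]

s_double = power_spectrum_estimate(records, dt, f=10.0)
s_single = 2.0 * s_double      # single-sided spectrum for f >= 0
expected = sigma ** 2 * dt     # flat double-sided level for sampled white noise
print(s_double, s_single, expected)
```

For sampled white noise the estimate should scatter around `sigma**2 * dt`, with the scatter shrinking as more detector records are averaged.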
Figure 6.3(a) on page 795 shows a typical plot, for detector noise, of $S^{(1)}_{\tilde{n}\tilde{n}}$ versus $f$ on a
log-log scale. For most detectors, there is a “corner” frequency $f_c$ such that when $f > f_c$ the value
of $S^{(1)}_{\tilde{n}\tilde{n}}$ is essentially constant over a wide range of frequencies (before rolling off at very high
$f$). When $f < f_c$, on the other hand, the value of $S^{(1)}_{\tilde{n}\tilde{n}}$ is typically proportional to $1/f^{\alpha}$, with
$\alpha$ approximately equal to one. Low frequencies correspond to long time intervals, so the growth in
the value of $S^{(1)}_{\tilde{n}\tilde{n}}$ as $f$ gets small reflects the way detector calibrations go stale as time goes by.
It has become
convenient to refer to this phenomenon as detector 1/f noise because in many detectors the corner
frequency fc is relatively large, meaning that their calibrations start to go stale in a very small
fraction of a second. We like to set up Fourier-transform systems so that the low-frequency noise
at f < fc cannot significantly contaminate our measurements. The basic strategy for doing this is
to use high-quality detectors—meaning that fc is small—and calibrate often enough that 1/f noise
does not become important. This is only the first line of defense; there are other ways of
minimizing the effect of 1/f noise and they will be pointed out in the remainder of the chapter
when appropriate.
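The shape just described can be mimicked with a simple two-term model: a flat floor plus a term that rises as $(f_c/f)^{\alpha}$ below the corner frequency. This particular functional form, the parameter values, and the names `single_sided_noise_power` and `loglog_slope` are assumptions chosen for illustration; they are not taken from Fig. 6.3(a). The sketch checks that the log-log slope is close to $-\alpha$ well below $f_c$ and close to zero well above it.

```python
import math


def single_sided_noise_power(f, s_white, f_corner, alpha=1.0):
    """Assumed model: flat level s_white plus a (f_corner / f)**alpha rise at low f."""
    if f <= 0.0:
        raise ValueError("this model is only defined for f > 0")
    return s_white * (1.0 + (f_corner / f) ** alpha)


def loglog_slope(f, **model):
    """Finite-difference slope of log(S) versus log(f) at frequency f."""
    f_next = 1.01 * f
    s_here = single_sided_noise_power(f, **model)
    s_next = single_sided_noise_power(f_next, **model)
    return (math.log(s_next) - math.log(s_here)) / math.log(1.01)


model = {"s_white": 1.0e-6, "f_corner": 10.0, "alpha": 1.0}  # hypothetical detector
slope_low = loglog_slope(0.01, **model)     # far below the corner: close to -alpha
slope_high = loglog_slope(1.0e4, **model)   # far above the corner: close to zero
print(slope_low, slope_high)
```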
A mathematical point often ignored in elementary discussions of 1/f noise is that if noise-
power spectra are 1/f all the way down to zero frequency, then integrals over frequency that
include the zero must diverge—that is, they become infinite. Standard treatments of random
function theory require the use of these integrals. Equation (3.48d) in Chapter 3, for example,
shows that Rññ(0) is equal to the integral of the power spectrum over all frequency values—
including, of course, f=0. Hence, the integral formula for Rññ(0) diverges when the power
spectrum is 1/f all the way down to zero. According to Eq. (3.48a) in Chapter 3, Rññ(0) is just the
squared standard deviation of the random function ñ at any time t. This squared standard
deviation must have a well-defined value to describe the detector noise accurately. Consequently,
the integral for Rññ(0) cannot be allowed to diverge. Perhaps the quickest way out of this problem
is to note that zero frequency corresponds to the most recent calibration occurring an infinite time
in the past; so, as long as the detectors have been calibrated more recently than that, we do not
expect the 1/f region of the noise-power spectrum to extend all the way down to zero. In general,
when the 1/f form of the noise-power spectrum leads to problems near f=0, it means that an
important aspect of the random error—an aspect which prevents the 1/f noise from producing
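$$ z_B(\chi) = \frac{W A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} \eta(\sigma) R(\sigma) \tau_f(\sigma) \tau_a(\sigma) L(\sigma)\, d\sigma
 + \frac{W A\,\Delta\Omega}{4} \int_{-\infty}^{\infty} M(R\sigma\theta_{\mathrm{ma}}) R(\sigma) \eta(\sigma) \tau_f(\sigma) \tau_a(\sigma) L_{\mathrm{FOV}}(\sigma)\, e^{2\pi i \sigma \chi}\, d\sigma . $$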
Consulting Eqs. (4.139g) of Chapter 4 and (5.10f) of Chapter 5, we see that $\eta$ and $M$ are even
functions of $\sigma$. This turns the second integral on the right-hand side into the inverse Fourier
transform of a real and even function of $\sigma$. Therefore, according to entry 1 of Table 2.1 in Chapter
2, the integral itself is a real and even function of $\chi$. Because the first integral on the right-hand
side is a constant, independent of $\chi$, we conclude that the noise-free signal $z_B(\chi)$ must also be a
real and even function of $\chi$,
$$ z_B(-\chi) = z_B(\chi) . \tag{6.18b} $$
Glancing back at the formula for $z_{BN}(\chi)$ in Eq. (6.18a), we see that the detector noise $n^{(\mathrm{det})}(\chi)$
is, however, another story: it would be strange indeed if the random error coming from the
detector were an even function of $\chi$. The detector cannot possibly care what the position of the
moving mirror is; the only reason $n^{(\mathrm{det})}$ depends on $\chi$ is that we acknowledge $n^{(\mathrm{det})}$ to be a
function of time and then use Eq. (6.4) to make it a function of $\chi$. Consequently $z_{BN}$, the sum of $z_B$
and $n^{(\mathrm{det})}$ in (6.18a), fails to be an even function of $\chi$ only because it is a noise-contaminated signal.
This distinction between $z_B(\chi)$ and $n^{(\mathrm{det})}(\chi)$, that one is an even function and the other is not, can,
in principle, be used to reduce the NEdN of the interferometer’s spectral measurements. (In
practice we always have to worry about the distorting effect of any circuit used to measure the
detector signal; see, for example, the discussion of the detector circuit in Sec. 5.12 of Chapter 5.)
For this reason, we say that some of the noise contributed to $z_B(\chi)$ by $n^{(\mathrm{det})}(\chi)$ is avoidable
noise, that is, noise that can be eliminated by an intelligent analysis of the $z_{BN}$ signal.
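Perhaps the quickest way to distinguish the avoidable and unavoidable noise in $z_{BN}(\chi)$ is to
recall the discussion following Eq. (2.11b) in Chapter 2, where it is pointed out that any function
can be written as the sum of even and odd components. Hence, we can always write
$$ n^{(\mathrm{det})}(\chi) = n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi) . \tag{6.19a} $$
Here $n_e^{(\mathrm{det})}$ is the even component of $n^{(\mathrm{det})}$ and $n_o^{(\mathrm{det})}$ is the odd component of $n^{(\mathrm{det})}$,
$$ n_e^{(\mathrm{det})}(\chi) = \tfrac{1}{2} \left[ n^{(\mathrm{det})}(\chi) + n^{(\mathrm{det})}(-\chi) \right] , \tag{6.19b} $$
$$ n_o^{(\mathrm{det})}(\chi) = \tfrac{1}{2} \left[ n^{(\mathrm{det})}(\chi) - n^{(\mathrm{det})}(-\chi) \right] , \tag{6.19c} $$
$$ n_e^{(\mathrm{det})}(-\chi) = n_e^{(\mathrm{det})}(\chi) , \tag{6.19d} $$
and
$$ n_o^{(\mathrm{det})}(-\chi) = -\,n_o^{(\mathrm{det})}(\chi) . \tag{6.19e} $$
Equations (6.19d) and (6.19e) are just the definition of what it means for a function to be even or
odd [see Eqs. (2.11a) and (2.11b) in Chapter 2], and it is easy to see that (6.19d) and (6.19e) are
true by checking what happens when the sign of the argument is changed in formulas (6.19b) and
(6.19c). Substitution of (6.19a) into (6.18a) gives
$$ z_{BN}(\chi) = \left[ z_B(\chi) + n_e^{(\mathrm{det})}(\chi) \right] + n_o^{(\mathrm{det})}(\chi) . \tag{6.19f} $$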
Avoidable and Unavoidable Noise in Double-Sided Signals · 6.8
A little thought shows that $n_e^{(\mathrm{det})}(\chi)$ must be the unavoidable component of the noise, because
there is no way to distinguish the noise-contaminated sum inside the square brackets [ ] from a
noise-free measurement of a $z_B(\chi)$ interference signal. The $n_o^{(\mathrm{det})}(\chi)$ noise, on the other hand, is an
avoidable source of error. We could, for example, eliminate it by averaging together $z_{BN}(\chi)$ and
$z_{BN}(-\chi)$,
$$ \begin{aligned}
\tfrac{1}{2} \left[ z_{BN}(\chi) + z_{BN}(-\chi) \right]
 &= \tfrac{1}{2} \left[ z_B(\chi) + n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi) \right]
  + \tfrac{1}{2} \left[ z_B(-\chi) + n_e^{(\mathrm{det})}(-\chi) + n_o^{(\mathrm{det})}(-\chi) \right] \\
 &= \tfrac{1}{2} \left[ z_B(\chi) + z_B(-\chi) \right]
  + \tfrac{1}{2} \left[ n_e^{(\mathrm{det})}(\chi) + n_e^{(\mathrm{det})}(-\chi) \right]
  + \tfrac{1}{2} \left[ n_o^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(-\chi) \right] \\
 &= z_B(\chi) + n_e^{(\mathrm{det})}(\chi) ,
\end{aligned} $$
where in the last step Eqs. (6.18b), (6.19d), and (6.19e) are used to show that the average
produces signal $z_B(\chi)$ contaminated only by $n_e^{(\mathrm{det})}(\chi)$, the unavoidable even-noise component.
Although in practice the avoidable noise $n_o^{(\mathrm{det})}(\chi)$ is usually not averaged away at this point in the
signal processing chain, it could in principle be eliminated this way. To show that the $n_o^{(\mathrm{det})}(\chi)$
avoidable noise has not yet been eliminated from the noise-contaminated signal, we substitute
(6.19a) into (6.17c) to get
$$ z^{(\mathrm{tot})}_{BN}(\chi) = z_B(\chi) + z^{(\mathrm{cold})}_{B}(\chi) + n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi) . \tag{6.19g} $$
For now, this is still the signal we trace through the signal chain, always remembering that only
the $n_e^{(\mathrm{det})}$ noise component is an unavoidable source of signal contamination.
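The even and odd split, and the effect of averaging the signal values at $\chi$ and $-\chi$, can be checked numerically. The sketch below is only an illustration under stated assumptions: the Lorentzian-shaped even signal standing in for $z_B(\chi)$, the Gaussian noise, the symmetric sampling grid, and the helper name `even_odd_parts` are all hypothetical.

```python
import random


def even_odd_parts(samples):
    """Split samples on a grid symmetric about chi = 0 into even and odd components.

    samples[j] holds the value at chi_j, so the reversed list holds the value at -chi_j.
    """
    mirrored = samples[::-1]
    even = [0.5 * (a + b) for a, b in zip(samples, mirrored)]
    odd = [0.5 * (a - b) for a, b in zip(samples, mirrored)]
    return even, odd


rng = random.Random(3)
half = 50
chi = [j / half for j in range(-half, half + 1)]      # symmetric OPD grid
z_b = [1.0 / (1.0 + 25.0 * x * x) for x in chi]       # assumed even, noise-free signal
noise = [rng.gauss(0.0, 0.1) for _ in chi]            # one realization of detector noise
z_bn = [z + n for z, n in zip(z_b, noise)]            # noise-contaminated signal

n_even, n_odd = even_odd_parts(noise)
folded, _ = even_odd_parts(z_bn)   # 0.5 * [z_BN(chi) + z_BN(-chi)]

# The folded signal keeps the even noise and drops the odd noise.
worst = max(abs(f - (z + ne)) for f, z, ne in zip(folded, z_b, n_even))
print(worst)
```

For noise that is uncorrelated between $\chi$ and $-\chi$, the discarded odd part carries roughly half of the noise power, which is the sense in which that noise is avoidable.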
as in Eq. (6.19g), then the output signal must be the sum of the outputs generated by each signal
going through the circuit separately. We already know, according to Eqs. (6.8b) and (6.8c) above,
that the output corresponding to input
and we also know that the total signal plus noise leaving the detector circuit at point C is,
according to Eq. (6.13c),
$$ z^{(\mathrm{tot})}_{CN}(\chi) = z_C(\chi) + z^{(\mathrm{cold})}_{C}(\chi) + \delta z_C(\chi) . \tag{6.20a} $$
Hence $\delta z_C(\chi)$, the noise contaminating the signal at point C, is the signal we would get when
passing the sum in Eq. (6.19a)
$$ n^{(\mathrm{det})}(\chi) = n_e^{(\mathrm{det})}(\chi) + n_o^{(\mathrm{det})}(\chi) $$
of both the avoidable and unavoidable noise through the detector circuit as a separate signal.
The first step in sending the total detector noise $n^{(\mathrm{det})}(\chi)$ through the detector circuit is to use
Eq. (6.4) above to convert
$$ n^{(\mathrm{det})} = n^{(\mathrm{det})}(ut) \tag{6.20b} $$
into a function of time. Then, using formula (5A.1a) in Appendix 5A of Chapter 5, we know the
corresponding output is
$$ \int_{-\infty}^{\infty} n^{(\mathrm{det})}(ut')\, h(t - t')\, dt' , $$
where $h(t)$ is the impulse-response function of the detector circuit. Following the suggestion in
Eq. (6.4), we change the variable of integration to $\chi' = ut'$. The detector circuit’s output
corresponding to the $n^{(\mathrm{det})}$ input is then
$$ \frac{1}{u} \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi')\, h\!\left( t - \frac{\chi'}{u} \right) d\chi' . $$
Now we substitute $t = \chi / u$ from Eq. (6.4) to get the noise output corresponding to input $n^{(\mathrm{det})}$ as
a function of $\chi$,
Passing the Detector Noise Through the Detector Circuit · 6.9
$$ \frac{1}{u} \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . $$
The discussion following Eq. (6.20a) above shows that $\delta z_C(\chi)$ must be exactly this integral,
that is, the output of the detector circuit corresponding to input $n^{(\mathrm{det})}$. Therefore, we can write
$$ \delta z_C(\chi) = \frac{1}{u} \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . \tag{6.20c} $$
Glancing back at the definition of the convolution in Eq. (2.38a) of Chapter 2, we note that this
can also be written as
$$ \delta z_C(\chi) = \frac{1}{u} \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] . \tag{6.20d} $$
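A small numerical check can make the change of variables concrete. The sketch below compares the time-domain output of a circuit with the $\chi$-domain integral in Eq. (6.20c), using Riemann sums. The scan velocity `U`, the single-pole impulse response `h`, and the deterministic stand-in `noise_of_chi` for one realization of the detector noise are all assumptions made for illustration; they are not the detector circuit of Fig. 6.2.

```python
import math

U = 2.0      # assumed constant OPD velocity, so that chi = U * t
TAU = 0.05   # assumed time constant of a hypothetical single-pole circuit


def h(t):
    """Causal impulse response of the hypothetical circuit."""
    return math.exp(-t / TAU) / TAU if t >= 0.0 else 0.0


def noise_of_chi(chi):
    """Deterministic stand-in for one realization of the detector noise versus chi."""
    return math.sin(7.0 * chi) + 0.5 * math.cos(19.0 * chi)


def output_time_domain(t, dt=1.0e-3, span=1.0):
    """Riemann sum for the integral of n(U t') h(t - t') dt'."""
    steps = int(span / dt)
    return dt * sum(noise_of_chi(U * (t - j * dt)) * h(j * dt) for j in range(steps))


def output_chi_domain(chi, dt=1.0e-3, span=1.0):
    """Riemann sum for (1/U) times the integral of n(chi') h((chi - chi')/U) d chi'."""
    d_chi = U * dt
    steps = int(span / dt)
    total = 0.0
    for j in range(steps):
        chi_prime = chi - j * d_chi
        total += noise_of_chi(chi_prime) * h((chi - chi_prime) / U)
    return total * d_chi / U


t_test = 0.37
y_time = output_time_domain(t_test)
y_chi = output_chi_domain(U * t_test)
print(y_time, y_chi)
```

The two sums agree to rounding error because they are the same sum written in two variables, which is all that the substitution $\chi' = ut'$ amounts to.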
Equations (6.20c) and (6.20d) are exact formulas for δ zC ( χ ) , but there is also an
approximation for it that is often useful. According to the analysis at the beginning of Appendix
5A to Chapter 5, when h(t) is a narrow function of time the output of the detector circuit is just a
slightly blurred and distorted version of the input; and, according to the discussion at the end of
Sec. 5.12 of Chapter 5, detector circuits are typically designed to produce this sort of output. We
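can almost always assume that $h(t)$ is relatively narrow, that is, that there exists a time $T$ such
that $h(t)$ is negligible when $t$ lies outside the time interval between $+T$ and $-T$,
$$ h(t) \approx 0 \quad \text{for } |t| > T . \tag{6.21a} $$
In fact, if $h$ is causal, we can also assume that $h(t) = 0$ for $t < 0$ [see Eq. (5A.5) in Appendix 5A
of Chapter 5]. Therefore the time-based output of the detector circuit can be approximated as
$$ \int_{-\infty}^{\infty} n^{(\mathrm{det})}(ut')\, h(t - t')\, dt' \cong \int_{t-T}^{t+T} n^{(\mathrm{det})}(ut')\, h(t - t')\, dt' . \tag{6.21b} $$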
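Again we change the $t'$ dummy variable of integration to $\chi' = ut'$ and replace the time parameter
$t$ by $t = \chi / u$ to get
$$ \frac{1}{u} \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi'
 \cong \frac{1}{u} \int_{\chi - uT}^{\chi + uT} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . \tag{6.21c} $$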
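According to the definition of convolution in Eq. (2.38a) of Chapter 2, this can also be written as
$$ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \cong \int_{\chi - uT}^{\chi + uT} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . \tag{6.21d} $$
$$ \delta z_C(\chi) \cong \frac{1}{u} \int_{\chi - uT}^{\chi + uT} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . \tag{6.21e} $$
$$ z^{(\mathrm{tot})}_{CN}(\chi) = z_C(\chi) + z^{(\mathrm{cold})}_{C}(\chi) + \frac{1}{u} \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] . \tag{6.22a} $$
We multiply $z^{(\mathrm{tot})}_{CN}(\chi)$ in (6.22a) by
$$ \Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} \tag{6.22b} $$
to make it a double-sided signal, following the same tactic used before in Eq. (5.106a) of Chapter
5. Function $\Pi(\chi, D)$ is given the same definition as in Appendix 4C of Chapter 4 [see Eq.
(4C.1a)]. The formula for the double-sided and noise-contaminated signal used to measure the
spectral radiance thus becomes
$$ \Pi(\chi, D)\, z^{(\mathrm{tot})}_{CN}(\chi) = \Pi(\chi, D)\, z_C(\chi) + \Pi(\chi, D)\, z^{(\mathrm{cold})}_{C}(\chi)
 + \frac{1}{u}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] . \tag{6.22c} $$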
Section 5.11 of Chapter 5 explains why there is always an effective spectrum corresponding
to an interferometer signal. We now develop a formula for the detector noise-contaminated
effective spectrum corresponding to the signal in Eq. (6.22c). Applying the Fourier transform to
both sides of the equation gives, because the Fourier transform is linear (see Sec. 2.6 of Chapter 2),
Total Detector Noise in Double-Sided Signals · 6.10
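$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{tot})}_{CN}(\chi) \right)
 = \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C(\chi) \right)
 + \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{cold})}_{C}(\chi) \right)
 + \mathcal{F}^{(-i\sigma\chi)}\!\left( \frac{1}{u}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right) . \tag{6.22d} $$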
Evaluating the first Fourier transform on the right-hand side of (6.22d) is not very difficult.
The remark following Eq. (6.5d) above points out that $z_C(\chi)$ is the same signal as $z(\chi)$ in Eq.
(5.104a) of Chapter 5. The discussion following (5.104a) shows that $\Pi(\chi, D)\, z_C(\chi)$ must then be
the same signal function that we called $z_{\mathrm{trunc}}(\chi)$ in (5.106a). This means the Fourier transform
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C(\chi) \right) $$
is the same quantity as $Z^{(\mathrm{trunc})}_{\mathrm{eff}}(\sigma)$ specified in Eqs. (5.108a) and (5.108b). According to Eq.
$$ \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, R(\sigma)\, \eta(\sigma)\, \tau_f(\sigma)\, \tau_a(\sigma)\, L_{\mathrm{mnf}}(\sigma) , $$
where
$$ L_{\mathrm{mnf}}(\sigma) = \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * L_{\mathrm{FOV}}(\sigma) . \tag{6.23a} $$
We conclude that the same expression can be used to approximate
$\mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C(\chi) \right)$; that is, we can write that
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z_C(\chi) \right)
 \cong \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, R(\sigma)\, \eta(\sigma)\, \tau_f(\sigma)\, \tau_a(\sigma)\, L_{\mathrm{mnf}}(\sigma) . \tag{6.23b} $$
The second Fourier transform on the right-hand side of (6.22d) is not much more difficult.
According to the Fourier convolution theorem [see Eq. (2.39j) in Chapter 2],
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{cold})}_{C}(\chi) \right)
 = \mathcal{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', D) \right) * \mathcal{F}^{(-i\sigma\chi'')}\!\left( z^{(\mathrm{cold})}_{C}(\chi'') \right) . \tag{6.24a} $$
Equation (5.65a) in Chapter 5 and the definition of $\mathcal{F}^{(-i\sigma\chi)}$ from Eq. (2.29a) in Chapter 2 give
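$$ \mathcal{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', D) \right) = \int_{-\infty}^{\infty} \Pi(\chi', D)\, e^{-2\pi i \sigma \chi'}\, d\chi' = 2D\, \mathrm{sinc}(2\pi\sigma D) , \tag{6.24b} $$
where
$$ \mathrm{sinc}(x) = \frac{\sin(x)}{x} $$
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{cold})}_{C}(\chi) \right)
 = \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * \mathcal{F}^{(-i\sigma\chi'')}\!\left( z^{(\mathrm{cold})}_{C}(\chi'') \right) . \tag{6.24c} $$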
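The closed form in Eq. (6.24b) is easy to confirm numerically. The sketch below integrates $e^{-2\pi i \sigma \chi}$ over $-D \le \chi \le D$ with the midpoint rule and compares the result with $2D\,\mathrm{sinc}(2\pi\sigma D)$. The function names, the step count, and the test values of $\sigma$ and $D$ are arbitrary choices made for illustration.

```python
import cmath
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def boxcar_transform_numeric(sigma, d, steps=20000):
    """Midpoint-rule value of the integral of exp(-2 pi i sigma chi) over -D <= chi <= D."""
    d_chi = 2.0 * d / steps
    total = 0j
    for j in range(steps):
        chi = -d + (j + 0.5) * d_chi
        total += cmath.exp(-2j * math.pi * sigma * chi)
    return total * d_chi


def boxcar_transform_closed_form(sigma, d):
    """Right-hand side of the boxcar transform: 2 D sinc(2 pi sigma D)."""
    return 2.0 * d * sinc(2.0 * math.pi * sigma * d)


d_max = 2.0
errors = [
    abs(boxcar_transform_numeric(s, d_max) - boxcar_transform_closed_form(s, d_max))
    for s in (0.0, 0.3, 1.7)
]
print(errors)
```

Because the boxcar is even, the imaginary part of the numerical integral cancels to rounding error, leaving the real sinc-shaped result.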
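Consulting Eq. (6.12a) above, we note that $z^{(\mathrm{cold})}_{C}(\chi)$ is the inverse Fourier transform of
$$ \left( \frac{W A\,\Delta\Omega}{4} \right) H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma) \left[ L^{(\mathrm{fore})}_{\mathrm{FOV}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{FOV}}(\sigma) \right] , $$
which means that $\mathcal{F}^{(-i\sigma\chi)}\!\left( z^{(\mathrm{cold})}_{C}(\chi) \right)$, the forward Fourier transform of $z^{(\mathrm{cold})}_{C}(\chi)$, is
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( z^{(\mathrm{cold})}_{C}(\chi) \right)
 = \left( \frac{W A\,\Delta\Omega}{4} \right) H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma) \left[ L^{(\mathrm{fore})}_{\mathrm{FOV}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{FOV}}(\sigma) \right] . \tag{6.24d} $$
According to the discussion following Eq. (5.82c), the quantities $M$, $H$, $R$, $\eta$, and $\tau_a$ are all
slowly varying functions of their arguments. Hence they can, following the reasoning explained
in Appendix 5C of Chapter 5, be treated as quasi-constants with respect to the narrow sinc
convolution when (6.24d) is substituted into (6.24c). This leads to the approximation
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{cold})}_{C}(\chi) \right)
 \cong \left( \frac{W A\,\Delta\Omega}{4} \right) H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma)
 \cdot \left\{ \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * \left[ L^{(\mathrm{fore})}_{\mathrm{FOV}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{FOV}}(\sigma) \right] \right\} . $$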
The linearity of the convolution [see, for example, Eq. (2.38d) of Chapter 2] now lets us write
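$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{cold})}_{C}(\chi) \right)
 \cong \left( \frac{W A\,\Delta\Omega}{4} \right) H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma)
 \cdot \left[ L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) \right] , \tag{6.25a} $$
where
$$ L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) = \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * L^{(\mathrm{fore})}_{\mathrm{FOV}}(\sigma) \tag{6.25b} $$
and
$$ L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) = \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * L^{(\mathrm{back})}_{\mathrm{FOV}}(\sigma) . \tag{6.25c} $$
convolutions of $L^{(\mathrm{fore})}_{\mathrm{FOV}}$ and $L^{(\mathrm{back})}_{\mathrm{FOV}}$ with the sinc function are also even [see Eq. (2.38f) in Chapter
2],
$$ L^{(\mathrm{fore})}_{\mathrm{mnf}}(-\sigma) = L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) , \tag{6.25d} $$
and
$$ L^{(\mathrm{back})}_{\mathrm{mnf}}(-\sigma) = L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) . \tag{6.25e} $$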
L(mnf
fore )
( ) ) L(mnf
fore )
() ) , (6.25f)
and
L(back) (back)
mnf ( ) ) L mnf () ) . (6.25g)
Functions $L^{(\mathrm{fore})}_{\mathrm{mnf}}$ and $L^{(\mathrm{back})}_{\mathrm{mnf}}$ are the background spectral radiances distorted by the effects of the
interferometer’s finite field of view and finite length of interferogram signal. They are given the
subscript mnf to show their similarity to $L_{\mathrm{mnf}}$, the input spectral radiance distorted by the
interferometer’s finite field of view and finite interferogram length.
Unfortunately, the third Fourier transform on the right-hand side of Eq. (6.22d) is not as easy
to evaluate as the first two. We start the analysis by multiplying both sides of Eq. (6.21d) by
$[\Pi(\chi, D) / u]$ to get
$$ u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right]
 \cong u^{-1}\, \Pi(\chi, D) \int_{\chi - uT}^{\chi + uT} n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' \tag{6.26a} $$
where, according to Eq. (6.21a), $h(t) \approx 0$ for $|t| > T$. The $\Pi(\chi, D)$ function specified in Eq.
(6.22b) automatically makes both sides of (6.26a) equal to zero when $|\chi| > D$, so we only need a
good approximation for the integral on the right-hand side when $|\chi| \le D$. In particular, we note
that in (6.26a) the integral goes between $\chi' = \chi - uT$ and $\chi' = \chi + uT$, which means that
$$ \chi - uT \le \chi' \le \chi + uT . $$
Because $|\chi| \le D$, it follows that
$$ -(D + uT) \le \chi' \le D + uT . $$
Hence $n^{(\mathrm{det})}(\chi')$ can be multiplied by $\Pi(\chi', D + uT)$ without changing the value of the integral
for any values of $\chi$ that matter. Consequently Eq. (6.26a) can be written as
$$ u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right]
 \cong u^{-1}\, \Pi(\chi, D) \int_{\chi - uT}^{\chi + uT} \Pi(\chi', \mathcal{D})\, n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' , \tag{6.26b} $$
where
$$ \mathcal{D} = D + uT . \tag{6.26c} $$
The integral’s limits between $\chi + uT$ and $\chi - uT$ in (6.26b) came from the observation in (6.21a)
that function $h$ is very small outside these limits, making the product
$$ \Pi(\chi', \mathcal{D})\, n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) $$
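$$ u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right]
 \cong u^{-1}\, \Pi(\chi, D) \int_{-\infty}^{\infty} \Pi(\chi', \mathcal{D})\, n^{(\mathrm{det})}(\chi')\, h\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . $$
Equation (2.38a) of Chapter 2 shows this integral to be the convolution of $\Pi\, n^{(\mathrm{det})}$ and $h$,
$$ u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right]
 \cong u^{-1}\, \Pi(\chi, D) \left\{ \left[ \Pi(\chi, \mathcal{D})\, n^{(\mathrm{det})}(\chi) \right] * h\!\left( \frac{\chi}{u} \right) \right\} . \tag{6.26d} $$
Taking the Fourier transform of both sides, and then applying the Fourier convolution theorem,
gives [see Eqs. (2.39a) and (2.39j) in Chapter 2]
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right)
 \cong \mathcal{F}^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \right) * \left[ \mathcal{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', \mathcal{D})\, n^{(\mathrm{det})}(\chi') \right) \cdot \mathcal{F}^{(-i\sigma\chi'')}\!\left( h(\chi''/u) \right) \right] . \tag{6.27a} $$
$$ \int_{-\infty}^{\infty} h(\chi''/u)\, e^{-2\pi i \sigma \chi''}\, d\chi'' = u\, H(u\sigma) , \tag{6.27c} $$
where $H$, the Fourier transform of $h$, is the transfer function of the detector circuit in Fig. 6.2.
Substituting this into Eq. (6.27a) gives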
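$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right)
 \cong \mathcal{F}^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \right) * \left[ u\, H(u\sigma) \cdot \mathcal{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', \mathcal{D})\, n^{(\mathrm{det})}(\chi') \right) \right] . \tag{6.27d} $$
Equation (6.24b) [see also Eq. (5.65a) of Chapter 5] shows that
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \right) = u^{-1} \int_{-\infty}^{\infty} \Pi(\chi, D)\, e^{-2\pi i \sigma \chi}\, d\chi = 2 u^{-1} D\, \mathrm{sinc}(2\pi\sigma D) . $$
Hence Eq. (6.27d) can be written as, using the linearity of the convolution to cancel out $u^{-1}$ and
$u$,
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right)
 \cong \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * \left[ H(u\sigma) \cdot \mathcal{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', \mathcal{D})\, n^{(\mathrm{det})}(\chi') \right) \right] . \tag{6.27e} $$
According to the discussion following Eq. (5.82c) of Chapter 5, the transfer function $H(u\sigma)$
varies slowly compared to the spectral radiance $L(\sigma)$ that the interferometer is measuring, and
Sec. 5.15 of Chapter 5 explains why $[2D\, \mathrm{sinc}(2\pi\sigma D)]$ should be a narrow function compared to
$L(\sigma)$. Consequently, there is every reason to expect $H(u\sigma)$ to vary slowly with respect to
$[2D\, \mathrm{sinc}(2\pi\sigma D)]$. Therefore, according to Eq. (5C.1) in Appendix 5C of Chapter 5, Eq. (6.27e)
can be approximated as
$$ \begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right)
 &\cong H(u\sigma) \cdot \left\{ \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * \mathcal{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', \mathcal{D})\, n^{(\mathrm{det})}(\chi') \right) \right\} \\
 &= H(u\sigma) \cdot \left\{ \mathcal{F}^{(-i\sigma\chi'')}\!\left( \Pi(\chi'', D) \right) * \mathcal{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', \mathcal{D})\, n^{(\mathrm{det})}(\chi') \right) \right\}
\end{aligned} $$
where in the last step Eq. (6.24b) is again used, this time to replace
$$ 2D\, \mathrm{sinc}(2\pi\sigma D) $$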
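written as
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right)
 \cong H(u\sigma) \cdot \mathcal{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', D) \cdot \Pi(\chi', \mathcal{D})\, n^{(\mathrm{det})}(\chi') \right) . $$
Glancing back at Eq. (6.22b) above, we note that
$$ \Pi(\chi, D) \cdot \Pi(\chi, \mathcal{D}) = \Pi(\chi, D) \tag{6.28a} $$
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( u^{-1}\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right)
 \cong H(u\sigma) \cdot \mathcal{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', D)\, n^{(\mathrm{det})}(\chi') \right) . \tag{6.28b} $$
$$ n^{(\mathrm{det})}_{D}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi , \tag{6.29a} $$
which can also be written as
$$ n^{(\mathrm{det})}_{D}(\sigma) = \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, n^{(\mathrm{det})}(\chi) \right) \tag{6.29b} $$
or
$$ n^{(\mathrm{det})}_{D}(\sigma) = \int_{-D}^{D} n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi . \tag{6.29c} $$
Equation (6.28b) now becomes, using the linearity of the Fourier transform to take the factor of
$u^{-1}$ outside the $\mathcal{F}$ operator,
$$ \frac{1}{u}\, \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] \right) \cong H(u\sigma)\, n^{(\mathrm{det})}_{D}(\sigma) . \tag{6.29d} $$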
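working directly with the simple Fourier transform of $n^{(\mathrm{det})}$. To see why this is so, we write down
the simple Fourier transform of $n^{(\mathrm{det})}$,
$$ \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi , $$
and note that there is no reason to think that $n^{(\mathrm{det})}$ always satisfies requirement (V) in Sec. 2.4 of
Chapter 2 for the existence of Fourier transforms.100 Function $n^{(\mathrm{det})}_{D}(\sigma)$, on the other hand,
$$ \begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{tot})}_{CN}(\chi) \right)
 \cong{} & \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, R(\sigma)\, \eta(\sigma)\, \tau_f(\sigma)\, \tau_a(\sigma)\, L_{\mathrm{mnf}}(\sigma) \\
 & + \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma) \cdot \left[ L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) \right] \\
 & + H(u\sigma)\, n^{(\mathrm{det})}_{D}(\sigma) .
\end{aligned} $$
$$ \begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{tot})}_{CN}(\chi) \right)
 \cong{} & \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, R(\sigma)\, \eta(\sigma)\, \tau_a(\sigma) \left[ \tau_f(\sigma)\, L_{\mathrm{mnf}}(\sigma) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) \right] \\
 & + H(u\sigma)\, n^{(\mathrm{det})}_{D}(\sigma) .
\end{aligned} \tag{6.30a} $$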
100
Remember that the extended sine and cosine transforms to which requirement (V) applies will be used to define
the standard Fourier transform in Eq. (2.28a) of Chapter 2, so requirement (V) also applies to the standard Fourier
transform.
as the uncalibrated, noise-contaminated output spectrum of the interferometer. It is the detector
noise-contaminated effective spectrum.
In principle, there is no problem removing all the noise from (6.30a). Glancing back at the
formula for $n^{(\mathrm{det})}_{D}$ in Eq. (6.29c), we apply the expectation operator $E$ to both sides to get
$$ E\!\left( n^{(\mathrm{det})}_{D}(\sigma) \right) = E\!\left( \int_{-D}^{D} n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \right)
 = \int_{-D}^{D} E\!\left( n^{(\mathrm{det})}(\chi) \right) e^{-2\pi i \sigma \chi}\, d\chi = 0 , \tag{6.30b} $$
where Eq. (3.17c) of Chapter 3 and Eq. (6.17b) are used to show that $E(n^{(\mathrm{det})}_{D}(\sigma))$, the average or
$$ \begin{aligned}
E\!\left( \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{tot})}_{CN}(\chi) \right) \right)
 \cong{} & \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma) \cdot
 \left[ \tau_f(\sigma)\, L_{\mathrm{mnf}}(\sigma) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) \right] \\
 ={} & L_{\mathrm{mnf}}(\sigma) \left[ \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma)\, \tau_f(\sigma) \right] \\
 & + \left( L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) \right) \left[ \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma) \right] ,
\end{aligned} \tag{6.30c} $$
measurements of
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{tot})}_{CN}(\chi) \right) $$
are averaged together. Hence the right-hand side of (6.30c) can be thought of as the uncalibrated,
noise-free output spectrum of the interferometer. According to Eq. (5.110) in Chapter 5, the Lmnf
radiance spectrum on the right-hand side of (6.30c) is the same as spectrum Leff in Eq. (5.95c) of
Chapter 5. Consequently the entire right-hand side of (6.30c) has the same form as Zeff,tot in
(5.95c), since it looks like
This is no surprise, because Zeff,tot in Sec. 5.19 of Chapter 5 is defined to be the uncalibrated
output spectrum of a Michelson interferometer, which is exactly what the total noise-free signal
spectrum at point C of Fig. 6.2 ought to be. Therefore it now makes sense to write Eqs. (6.30a)
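and (6.30c) as
$$ \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{tot})}_{CN}(\chi) \right) \cong Z_{\mathrm{eff,tot}}(\sigma) + H(u\sigma)\, n^{(\mathrm{det})}_{D}(\sigma) \tag{6.31a} $$
and
$$ E\!\left( \mathcal{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, z^{(\mathrm{tot})}_{CN}(\chi) \right) \right) \cong Z_{\mathrm{eff,tot}}(\sigma) , \tag{6.31b} $$
$$ \begin{aligned}
Z_{\mathrm{eff,tot}}(\sigma) \cong{} & L_{\mathrm{mnf}}(\sigma) \left[ \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma)\, \tau_f(\sigma) \right] \\
 & + \left( L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) \right) \left[ \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma) \right] \\
 ={} & \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma) \cdot
 \left[ \tau_f(\sigma)\, L_{\mathrm{mnf}}(\sigma) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) \right] .
\end{aligned} \tag{6.31c} $$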
and, according to the discussion following Eq. (6.30c), that is exactly the form taken by the
noise-free uncalibrated spectrum in the previous section. To use the calibration algorithm in
Measuring the Noise-Contaminated Spectrum · 6.11
(5.95a), we not only need the uncalibrated spectral output $Z_{\mathrm{eff,tot}}(\sigma)$ associated with the spectral
radiance $L$ but also must have the uncalibrated output signals associated with two known,
calibrating spectral radiances. Following the notation of Sec. 5.19 of Chapter 5, we call the two
calibrating radiances $L^{(1)}$ and $L^{(2)}$ and the two output signals associated with them $Z^{(1)}_{\mathrm{eff,tot}}(\sigma)$ and
$Z^{(2)}_{\mathrm{eff,tot}}(\sigma)$ respectively. Equation (6.31b) reminds us that to extract the noise-free signals $Z^{(1)}_{\mathrm{eff,tot}}$
and $Z^{(2)}_{\mathrm{eff,tot}}$ in the presence of noise, we need only point the interferometer at radiances $L^{(1)}$ and
$L^{(2)}$ and average together a large number of uncalibrated output spectra to get each spectrum’s
noise-free expectation value. Examining Eqs. (6.31b) and (6.31c) closely, we realize that
$Z^{(1,2)}_{\mathrm{eff,tot}}(\sigma)$ cannot depend directly on $L^{(1)}$ and $L^{(2)}$ but instead must depend directly on $L^{(1)}_{\mathrm{mnf}}$ and
$L^{(2)}_{\mathrm{mnf}}$, where again the mnf subscripts indicate that the $L^{(1)}$ and $L^{(2)}$ radiances entering the front
end of the interferometer are blurred and distorted by the interferometer’s finite field of view and
finite interferogram length. Fortunately, because $L^{(1)}$ and $L^{(2)}$ are under our control, we can
choose them to be slowly varying functions of wavenumber. This means, according to Eq. (6A.6)
in Appendix 6A, that
$$ L^{(1)}_{\mathrm{mnf}}(\sigma) \cong L^{(1)}(\sigma) \tag{6.32a} $$
and
$$ L^{(2)}_{\mathrm{mnf}}(\sigma) \cong L^{(2)}(\sigma) \tag{6.32b} $$
$$ Z^{(1)}_{\mathrm{eff,tot}}(\sigma) \cong \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma) \cdot
 \left[ \tau_f(\sigma)\, L^{(1)}(\sigma) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) \right] \tag{6.33a} $$
and
$$ Z^{(2)}_{\mathrm{eff,tot}}(\sigma) \cong \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{\mathrm{ma}})\, \eta(\sigma)\, R(\sigma)\, \tau_a(\sigma) \cdot
 \left[ \tau_f(\sigma)\, L^{(2)}(\sigma) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(\sigma) - L^{(\mathrm{back})}_{\mathrm{mnf}}(\sigma) \right] . \tag{6.33b} $$
This, together with the uncalibrated output spectrum $Z_{\mathrm{eff,tot}}(\sigma)$ produced by the unknown radiance
$L$ that the interferometer is being used to measure, is all we need to apply the spectral calibration
algorithm.
Although we could in principle collect a large number of measurements of $Z_{\mathrm{eff,tot}}(\sigma)$, averaging
them together to remove the noise just like we did when calculating $Z^{(1,2)}_{\mathrm{eff,tot}}(\sigma)$, in practice more
6 · NEdN and Detector Noise
effort is put into removing noise from the calibration data than is put into removing noise from
everyday measurements. [This same point is made at the end of Sec. 6.5 when discussing noise in
the measurements of signals zA(Ȥ), zB(Ȥ), and zC(Ȥ).] Consequently, even though noise-free values
of Z (1,2)
eff ,tot are available for use in the calibration algorithm in Eq. (5.95a) of Chapter 5, we should
in Eq. (6.31a):
F ( i) ) ( , D) zCN
( tot )
( )
Z eff ,tot () ) H(u) ) n (det)
D () ) .
formula
( meas ) () ) Z (det)
Z eff ,totN eff ,tot () ) H(u) ) n D () ) . (6.34a)
by noise; and the tilde shows that the spectral signal now has a random component. Function
Z ( meas ) is just, of course, a different name for
eff ,totN
F ( i) ) ( , D) zCN
( tot )
( ) .
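The discussion at the beginning of Sec. 6.1 points out that $L_{\mathrm{mnf}}$ is the best noise-free measurement of the
unknown spectral radiance L that can be extracted from the interferometer, and substituting Eq.
(6.31c) into (6.34a) gives $\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,tot}N}$ in terms of $L_{\mathrm{mnf}}$,
\[ \tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,tot}N}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\cdot\left[\tau_f(|\sigma|)\,L_{\mathrm{mnf}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|)\right] + H(u\sigma)\,\tilde{n}^{(\mathrm{det})}_{D}(\sigma) . \tag{6.34b} \]
Now we apply the calibration algorithm in Sec. 5.19 of Chapter 5. Equations (6.34b) and (6.33a)
give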
Measuring the Noise-Contaminated Spectrum · 6.11
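\[ \tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,tot}N}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[L_{\mathrm{mnf}}(|\sigma|) - L^{(1)}(|\sigma|)\right] + H(u\sigma)\,\tilde{n}^{(\mathrm{det})}_{D}(\sigma) . \tag{6.35a} \]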
Because we have decided to include noise in our measurement of the uncalibrated output
spectrum, this corresponds to the difference
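Equations (6.33a) and (6.33b) give
\[ Z^{(2)}_{\mathrm{eff,tot}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right] . \tag{6.35b} \]
Dividing (6.35a) by (6.35b) gives
\[ \frac{\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,tot}N}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)}{Z^{(2)}_{\mathrm{eff,tot}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)} = \frac{L_{\mathrm{mnf}}(|\sigma|) - L^{(1)}(|\sigma|)}{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)} + \frac{4\,\tilde{n}^{(\mathrm{det})}_{D}(\sigma)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right]} . \tag{6.35c} \]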
The left-hand side of this formula is, of course, the noise-contaminated version of the ratio
in Eq. (5.95a). We now complete the calibration algorithm by substituting (6.35c) into (5.95a) to
get
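\[ \left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right]\cdot\left\{\frac{\tilde{Z}^{(\mathrm{meas})}_{\mathrm{eff,tot}N}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)}{Z^{(2)}_{\mathrm{eff,tot}}(\sigma) - Z^{(1)}_{\mathrm{eff,tot}}(\sigma)}\right\} + L^{(1)}(|\sigma|) = L_{\mathrm{mnf}}(|\sigma|) + \frac{4\,\tilde{n}^{(\mathrm{det})}_{D}(\sigma)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.35d} \]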
At first we might think, examining Eq. (6.35d) and comparing it to (6.1a) above, that the
right-hand side is just a disguised version of
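\[ L_{\mathrm{mnf}}(\sigma) + \delta L(\sigma) , \]
with
\[ \delta L(\sigma) \overset{?}{=} \frac{4\,\tilde{n}^{(\mathrm{det})}_{D}(\sigma)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \]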
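A little thought, however, shows that this cannot be correct. The quantities W, A, ΔΩ, M, η, R, $\tau_a$,
and $\tau_f$ are all real, as is δL, but there is no reason for $\tilde{n}^{(\mathrm{det})}_{D}$ to be real. From Eq. (6.29a), we
have
\[ \tilde{n}^{(\mathrm{det})}_{D}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \]
Since $\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}(\chi)$ is real, its transform is Hermitian,
\[ \tilde{n}^{(\mathrm{det})}_{D}(-\sigma) = \tilde{n}^{(\mathrm{det})}_{D}(\sigma)^{*} . \tag{6.36} \]
Unless $\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}(\chi)$ is also an even function (and there is absolutely no reason for this to be
true) we expect $\tilde{n}^{(\mathrm{det})}_{D}$ to have both real and imaginary components.
The observation that $\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}(\chi)$ must be even for $\tilde{n}^{(\mathrm{det})}_{D}$ to be strictly real brings to mind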
the distinction previously made between avoidable and unavoidable detector noise. In Sec. 6.8
above, we note that only the even component
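\[ \tilde{n}^{(\mathrm{det})}_{e}(\chi) = \frac{1}{2}\left[\tilde{n}^{(\mathrm{det})}(\chi) + \tilde{n}^{(\mathrm{det})}(-\chi)\right] \]
of the total detector noise in double-sided signals is unavoidable, because in principle the odd
component
\[ \tilde{n}^{(\mathrm{det})}_{o}(\chi) = \frac{1}{2}\left[\tilde{n}^{(\mathrm{det})}(\chi) - \tilde{n}^{(\mathrm{det})}(-\chi)\right] \]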
can be removed from the signal at point B by averaging together the signal values at +χ and −χ.
We also point out in Sec. 6.8 that the avoidable noise is usually not eliminated this way, but
instead passed along the signal chain to be eliminated later. We have now reached the point
where it is easy to eliminate the avoidable noise in double-sided signals.
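The even/odd split described above can be illustrated numerically. The following is only a sketch, assuming the noise is sampled on a grid that is symmetric about χ = 0; the function name `split_even_odd` and the sample count are hypothetical and are not taken from the text.

```python
import numpy as np

def split_even_odd(n):
    """Split samples taken at chi = -K..K (in units of the sample spacing)
    into their even and odd components."""
    n = np.asarray(n, dtype=float)
    n_reversed = n[::-1]                  # n(-chi) on the same symmetric grid
    n_even = 0.5 * (n + n_reversed)       # unavoidable component
    n_odd = 0.5 * (n - n_reversed)        # avoidable component
    return n_even, n_odd

rng = np.random.default_rng(0)
K = 512
noise = rng.normal(size=2 * K + 1)        # one simulated noise record
n_even, n_odd = split_even_odd(noise)

# Averaging the values at +chi and -chi leaves only the even component.
averaged = 0.5 * (noise + noise[::-1])
```

Averaging the samples at +χ and −χ reproduces `n_even` exactly, which is the sense in which the odd component is avoidable.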
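Suppose, just like in Eq. (6.19a) of Sec. 6.8, we write $\tilde{n}^{(\mathrm{det})}(\chi)$ as the sum of an unavoidable,
even component and an avoidable, odd component,
\[ \tilde{n}^{(\mathrm{det})}(\chi) = \tilde{n}^{(\mathrm{det})}_{e}(\chi) + \tilde{n}^{(\mathrm{det})}_{o}(\chi) . \]
Since $\tilde{n}^{(\mathrm{det})}_{D}$ is the forward Fourier transform of $\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}(\chi)$, we have
\[ \begin{aligned}
\tilde{n}^{(\mathrm{det})}_{D}(\sigma) &= F^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}(\chi)\right) \\
&= F^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{e}(\chi) + \Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{o}(\chi)\right) \\
&= F^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{e}(\chi)\right) + F^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{o}(\chi)\right) ,
\end{aligned} \tag{6.37a} \]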
where in the last step the linearity of the Fourier transform is used to write the transform of the
sum as the sum of the transforms (see Sec. 2.6 of Chapter 2). To get a spectrum for the
unavoidable detector noise, we now define
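\[ \tilde{n}^{(\mathrm{det})}_{De}(\sigma) = F^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{e}(\chi)\right) . \tag{6.37b} \]
Because $\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{e}(\chi)$ is a real and even function of χ, its transform is a real and even function
of σ,
\[ \tilde{n}^{(\mathrm{det})}_{De}(\sigma) = \mathrm{Re}\!\left(\tilde{n}^{(\mathrm{det})}_{De}(\sigma)\right) \tag{6.37c} \]
and
\[ \tilde{n}^{(\mathrm{det})}_{De}(-\sigma) = \tilde{n}^{(\mathrm{det})}_{De}(\sigma) . \tag{6.37d} \]
The spectrum of the avoidable detector noise is defined the same way,
\[ \tilde{n}^{(\mathrm{det})}_{Do}(\sigma) = F^{(-i\sigma\chi)}\!\left(\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{o}(\chi)\right) . \tag{6.37e} \]
Because $\Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{o}(\chi)$ is a real and odd function of χ, its transform is an imaginary and odd function
of σ,
\[ \tilde{n}^{(\mathrm{det})}_{Do}(\sigma) = i\,\mathrm{Im}\!\left(\tilde{n}^{(\mathrm{det})}_{Do}(\sigma)\right) \tag{6.37f} \]
and
\[ \tilde{n}^{(\mathrm{det})}_{Do}(-\sigma) = -\tilde{n}^{(\mathrm{det})}_{Do}(\sigma) . \tag{6.37g} \]
Equation (6.37a) can now be written as
\[ \tilde{n}^{(\mathrm{det})}_{D}(\sigma) = \tilde{n}^{(\mathrm{det})}_{De}(\sigma) + \tilde{n}^{(\mathrm{det})}_{Do}(\sigma) , \tag{6.37h} \]
so that
\[ \tilde{n}^{(\mathrm{det})}_{De}(\sigma) = \mathrm{Re}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right) \tag{6.37i} \]
and
\[ \tilde{n}^{(\mathrm{det})}_{Do}(\sigma) = i\,\mathrm{Im}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right) . \tag{6.37j} \]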
Therefore we can remove all of the avoidable detector noise from the $\tilde{n}^{(\mathrm{det})}_{D}$ detector noise
spectrum by taking its real part, as shown in (6.37i); moreover, since the noise-free spectral
measurement of Lmnf must be real, we can remove all of the avoidable detector noise from our
noise-contaminated spectral measurement by taking its real part. The right-hand side of Eq.
(6.35d) gives the formula for the noise-contaminated spectral measurement, and taking its real
part gives
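\[ \mathrm{Re}\!\left(L_{\mathrm{mnf}}(|\sigma|) + \frac{4\,\tilde{n}^{(\mathrm{det})}_{D}(\sigma)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right) = L_{\mathrm{mnf}}(|\sigma|) + \frac{4\,\mathrm{Re}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.38a} \]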
The imaginary part of the right-hand side is, of course, pure noise:
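\[ \mathrm{Im}\!\left(L_{\mathrm{mnf}}(|\sigma|) + \frac{4\,\tilde{n}^{(\mathrm{det})}_{D}(\sigma)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\right) = \frac{4\,\mathrm{Im}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.38b} \]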
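The separation of the noise spectrum into a real part carrying the even component and an imaginary part carrying the odd component can be checked with a discrete transform. This is a minimal sketch, assuming a symmetric sample grid and ignoring the overall sample-spacing factor; the helper `centered_dft` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
K = 256
n = rng.normal(size=2 * K + 1)            # real noise samples at chi = -K..K
n_even = 0.5 * (n + n[::-1])              # even component
n_odd = 0.5 * (n - n[::-1])               # odd component

def centered_dft(x):
    """Discrete forward transform with the chi = 0 sample used as the phase origin."""
    return np.fft.fft(np.fft.ifftshift(x))

N_D = centered_dft(n)                     # stands in for the D-limited noise spectrum
N_De = centered_dft(n_even)               # transform of the even part: real
N_Do = centered_dft(n_odd)                # transform of the odd part: imaginary

real_part = N_D.real                      # equals N_De, as in Eq. (6.37i)
imag_part = 1j * N_D.imag                 # equals N_Do, as in Eq. (6.37j)
```

Taking `N_D.real` therefore discards exactly the contribution of the odd component, which is the numerical counterpart of Eq. (6.38a).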
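Comparing the right-hand side of Eq. (6.38a) to
\[ L_{\mathrm{mnf}}(\sigma) + \delta L(\sigma) \]
now suggests that the appropriate formula for the unavoidable random error in a double-sided
signal contaminated by detector noise must be
\[ \delta L(\sigma) = \frac{4\,\mathrm{Re}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.38c} \]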
The right-hand side of (6.38c) comes from (6.38a), which was derived while permitting
wavenumber σ to be negative as well as positive; the left-hand side, however, comes from Eq.
(6.1a) where, according to (6.1c),
\[ 0 < \sigma_{\min} \le \sigma \le \sigma_{\max} . \]
Wavenumbers $\sigma_{\max}$ and $\sigma_{\min}$ are the maximum and minimum wavenumber values over which
radiance spectra are measured, and in a well-built interferometer unwanted spectral energy is
usually prevented from entering the optical signal chain by designing the product
\[ R(\sigma)\,\tau_a(\sigma)\,\tau_f(\sigma) \]
to be zero when σ does not lie between $\sigma_{\min}$ and $\sigma_{\max}$. Because the denominator on the right-hand
side of (6.38c) contains the product
\[ R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|) , \]
the right-hand side is defined only when $\sigma_{\min} \le |\sigma| \le \sigma_{\max}$.
Therefore the restrictions on the left-hand and right-hand sides of (6.38c) look very similar; the
only real difference is the way σ is allowed to be negative on the right-hand side but not on the
left. According to Eq. (5.10f) in Chapter 5 and (4.139g) in Chapter 4, functions M and η in the
denominator of (6.38c) are even with respect to σ, and of course the absolute value signs in R, $\tau_a$,
and $\tau_f$ force them to be even functions of their arguments. Equations (6.37d) and (6.37i) show
that the real part of $\tilde{n}^{(\mathrm{det})}_{D}$ is also an even function:
\[ \mathrm{Re}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(-\sigma)\right) = \mathrm{Re}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right) . \tag{6.38e} \]
Consequently, the entire right-hand side of Eq. (6.38c) is an even function of σ and there is no
extra information to be lost if we require σ to be positive on both sides of (6.38c). To show that
both sides should be evaluated for positive wavenumbers σ, we follow the convention used in
Sec. 6.1 when going from Eq. (6.3f) to (6.3g) and write (6.38c) as
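\[ \delta L(|\sigma|) = \frac{4\,\mathrm{Re}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(|\sigma|)\right)}{(WA\,\Delta\Omega)\,M(R|\sigma|\theta_{ma})\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \tag{6.38f} \]
or, according to Eq. (6.37i),
\[ \delta L(|\sigma|) = \frac{4\,\tilde{n}^{(\mathrm{det})}_{De}(|\sigma|)}{(WA\,\Delta\Omega)\,M(R|\sigma|\theta_{ma})\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} , \tag{6.38g} \]
where applying the definition of the forward Fourier transform to (6.37b) gives [see Eq. (2.29a)
in Chapter 2]
\[ \tilde{n}^{(\mathrm{det})}_{De}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{e}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \tag{6.38h} \]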
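\[ \frac{4\,\mathrm{Im}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} = \frac{4\,i^{-1}\,\tilde{n}^{(\mathrm{det})}_{Do}(\sigma)}{(WA\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} , \tag{6.38i} \]
where
\[ \tilde{n}^{(\mathrm{det})}_{Do}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi,D)\,\tilde{n}^{(\mathrm{det})}_{o}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \tag{6.38j} \]
Equations (6.38h)–(6.38j) can be used for both negative and positive values of σ.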
in Eq. (6.4) to analyze the detector noise $\tilde{n}^{(\mathrm{det})}$ as a random process or function that is wide-sense
stationary in χ instead of t. We can say [see Eq. (6B.4e) in Appendix 6B] that the
autocorrelation function $o^{(\mathrm{det})}_{nn}$ of $\tilde{n}^{(\mathrm{det})}(\chi)$ is given by
\[ o^{(\mathrm{det})}_{nn}(\chi_2 - \chi_1) = E\!\left(\tilde{n}^{(\mathrm{det})}(\chi_1)\cdot\tilde{n}^{(\mathrm{det})}(\chi_2)\right) . \tag{6.39a} \]
The corresponding χ-based power spectrum is [see Eq. (6B.6a) in Appendix 6B]
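\[ p^{(\mathrm{det})}_{nn}(\sigma) = \int_{-\infty}^{\infty} o^{(\mathrm{det})}_{nn}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi , \tag{6.39b} \]
and the transform can be reversed to get
\[ o^{(\mathrm{det})}_{nn}(\chi) = \int_{-\infty}^{\infty} p^{(\mathrm{det})}_{nn}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{6.39c} \]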
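Glancing back at (6.39a), we note that $o^{(\mathrm{det})}_{nn}$ is real because $\tilde{n}^{(\mathrm{det})}$ is real. We can easily show that
$o^{(\mathrm{det})}_{nn}$ must be even. Starting with (6.39a), we have
\[ o^{(\mathrm{det})}_{nn}(\chi_2 - \chi_1) = E\!\left(\tilde{n}^{(\mathrm{det})}(\chi_1)\cdot\tilde{n}^{(\mathrm{det})}(\chi_2)\right) = E\!\left(\tilde{n}^{(\mathrm{det})}(\chi_2)\cdot\tilde{n}^{(\mathrm{det})}(\chi_1)\right) \]
or
\[ o^{(\mathrm{det})}_{nn}(\chi_2 - \chi_1) = o^{(\mathrm{det})}_{nn}(\chi_1 - \chi_2) . \]
Hence, replacing $\chi_2 - \chi_1$ by χ, we get [see Eq. (2.11a) in Chapter 2 defining even functions]
\[ o^{(\mathrm{det})}_{nn}(\chi) = o^{(\mathrm{det})}_{nn}(-\chi) . \tag{6.39d} \]
It follows, since $p^{(\mathrm{det})}_{nn}$ is the forward Fourier transform of a real and even function, that $p^{(\mathrm{det})}_{nn}$ is
also real and even (see entry 1 of Table 2.1 in Chapter 2):
\[ p^{(\mathrm{det})}_{nn}(-\sigma) = p^{(\mathrm{det})}_{nn}(\sigma) \tag{6.39e} \]
and
\[ \mathrm{Im}\!\left(p^{(\mathrm{det})}_{nn}(\sigma)\right) = 0 . \tag{6.39f} \]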
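The detector noise can also, of course, be analyzed in a more conventional way, treating it as a
random function of time $\tilde{N}^{(\mathrm{det})}(t)$ that is wide-sense stationary. The transformation between $\tilde{n}^{(\mathrm{det})}$
and $\tilde{N}^{(\mathrm{det})}$ is given in Eqs. (6B.2a) and (6B.2b) in Appendix 6B as
\[ \tilde{N}^{(\mathrm{det})}(t) = \tilde{n}^{(\mathrm{det})}(ut) \quad\text{and}\quad \tilde{n}^{(\mathrm{det})}(\chi) = \tilde{N}^{(\mathrm{det})}(\chi/u) , \]
where u is the OPD velocity. The D-limited transform of $\tilde{n}^{(\mathrm{det})}$ defined in Eq. (6.29a) can be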
Characterizing the Detector Noise · 6.12
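written as
\[ \tilde{n}^{(\mathrm{det})}_{D}(\sigma) = \int_{-D}^{D} \tilde{n}^{(\mathrm{det})}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \]
Changing the variable of integration to t = χ/u gives
\[ \tilde{n}^{(\mathrm{det})}_{D}(\sigma) = u\int_{-D/u}^{D/u} \tilde{n}^{(\mathrm{det})}(ut)\,e^{-2\pi i\sigma ut}\,dt . \]
If we define
\[ T = D/u \tag{6.40c} \]
and set
\[ f = u\sigma , \tag{6.40d} \]
this becomes
\[ \tilde{n}^{(\mathrm{det})}_{D}(\sigma) = u\int_{-T}^{T} \tilde{N}^{(\mathrm{det})}(t)\,e^{-2\pi ift}\,dt . \tag{6.40e} \]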
Working in the time domain, it makes sense to define the T-limited Fourier transform of $\tilde{N}^{(\mathrm{det})}(t)$
to be
\[ \tilde{N}^{(\mathrm{det})}_{T}(f) = \int_{-T}^{T} \tilde{N}^{(\mathrm{det})}(t)\,e^{-2\pi ift}\,dt , \tag{6.40f} \]
which means that (6.40e) can now be written as, remembering that f = uσ,
\[ \tilde{n}^{(\mathrm{det})}_{D}(\sigma) = u\,\tilde{N}^{(\mathrm{det})}_{T}(u\sigma) \tag{6.40g} \]
or
\[ u^{-1}\,\tilde{n}^{(\mathrm{det})}_{D}(f/u) = \tilde{N}^{(\mathrm{det})}_{T}(f) . \tag{6.40h} \]
[These are the detector-noise versions of Eqs. (6B.7g) and (6B.7h) in Appendix 6B.] Equation
(6B.7i) in Appendix 6B now gives
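\[ p^{(\mathrm{det})}_{nn}(\sigma) = \lim_{D\to\infty}\left\{\frac{1}{2D}\,E\!\left(\left|\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right|^{2}\right)\right\} , \tag{6.40i} \]
and for sufficiently large D we expect
\[ \frac{1}{2D}\,E\!\left(\left|\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right|^{2}\right) \]
to be close to its limit as D → ∞,
\[ p^{(\mathrm{det})}_{nn}(\sigma) \cong \frac{1}{2D}\,E\!\left(\left|\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right|^{2}\right) . \tag{6.40j} \]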
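The finite-D approximation in Eq. (6.40j) can be tried out on simulated noise. The sketch below assumes uncorrelated Gaussian samples taken at spacing `dchi` (so the spectrum is white up to the Nyquist wavenumber) and approximates the D-limited transform by a Riemann sum; the function name, the ensemble size, and the numerical values are hypothetical.

```python
import numpy as np

def periodogram_estimate(records, dchi):
    """Average |n_D(sigma)|^2 / (2D) over an ensemble of noise records (one per row)."""
    records = np.asarray(records, dtype=float)
    num_samples = records.shape[1]
    two_D = num_samples * dchi                      # total record length 2D
    n_D = dchi * np.fft.fft(records, axis=1)        # Riemann-sum version of the transform
    return np.mean(np.abs(n_D) ** 2, axis=0) / two_D

rng = np.random.default_rng(2)
dchi = 1.0e-4                                       # hypothetical sample spacing
variance = 4.0                                      # hypothetical noise variance
records = rng.normal(scale=np.sqrt(variance), size=(400, 1024))

p_est = periodogram_estimate(records, dchi)         # estimated power spectrum
p_expected = variance * dchi                        # white level for Nyquist-sampled noise
```

With these assumptions the estimate fluctuates about the constant level `variance * dchi`, and the fluctuations shrink as more records are averaged.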
The time-based autocorrelation function of the detector noise is [see Eq. (6B.3a) in Appendix 6B]
\[ R^{(\mathrm{det})}_{NN}(t_2 - t_1) = E\!\left(\tilde{N}^{(\mathrm{det})}(t_1)\cdot\tilde{N}^{(\mathrm{det})}(t_2)\right) , \tag{6.41a} \]
with an associated time-based power spectrum that is the forward Fourier transform of $R^{(\mathrm{det})}_{NN}$,
\[ S^{(\mathrm{det})}_{NN}(f) = \int_{-\infty}^{\infty} R^{(\mathrm{det})}_{NN}(t)\,e^{-2\pi ift}\,dt . \tag{6.41b} \]
The transform can be reversed to get
\[ R^{(\mathrm{det})}_{NN}(t) = \int_{-\infty}^{\infty} S^{(\mathrm{det})}_{NN}(f)\,e^{2\pi ift}\,df . \tag{6.41c} \]
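Equations (6B.4g), (6B.4h), (6B.6d), and (6B.6f) of Appendix 6B give the transformation
formulas connecting $o^{(\mathrm{det})}_{nn}$ to $R^{(\mathrm{det})}_{NN}$ and $p^{(\mathrm{det})}_{nn}$ to $S^{(\mathrm{det})}_{NN}$:
\[ o^{(\mathrm{det})}_{nn}(\chi) = R^{(\mathrm{det})}_{NN}(\chi/u) , \tag{6.41d} \]
\[ R^{(\mathrm{det})}_{NN}(t) = o^{(\mathrm{det})}_{nn}(ut) , \tag{6.41e} \]
\[ p^{(\mathrm{det})}_{nn}(\sigma) = u\,S^{(\mathrm{det})}_{NN}(u\sigma) , \tag{6.41f} \]
and
\[ S^{(\mathrm{det})}_{NN}(f) = u^{-1}\,p^{(\mathrm{det})}_{nn}(f/u) . \tag{6.41g} \]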
Working with power spectra and autocorrelation functions that are both time-based and χ-based
can sometimes be confusing, but the custom of using variables χ and σ to analyze interferometer
signals makes it hard to avoid.
Detector Noise with a Band-Limited, White-Noise Power Spectrum · 6.13
FIGURE 6.3(a). Log-log plot of the noise-power spectrum S(f) against frequency f, with log(f_c) marked on the frequency axis.
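\[ p^{(\mathrm{det})}_{0} = u\cdot S^{(\mathrm{det})}_{\mathrm{const}} \tag{6.42a} \]
and a σ bandwidth of
\[ \sigma_{\mathrm{band}} = \frac{f_{\mathrm{band}}}{u} . \tag{6.42b} \]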
In these two equations, u is still the constant OPD velocity used in Eq. (6.4) above.
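As a small illustration of Eqs. (6.42a) and (6.42b), the conversion from the time-based description to the χ-based description can be written as a helper function. This is only a sketch; the function name and the numerical values are hypothetical. The final check uses the fact that the area under either band-limited spectrum (level times full bandwidth) is the same noise variance.

```python
import math

def time_to_opd_spectrum(S_const, f_band, u):
    """Convert a band-limited white-noise level and bandwidth from the
    time/frequency description to the chi/wavenumber description."""
    p0 = u * S_const              # Eq. (6.42a)
    sigma_band = f_band / u       # Eq. (6.42b)
    return p0, sigma_band

u = 0.5                           # hypothetical OPD velocity
S_const = 2.0e-12                 # hypothetical time-based power level
f_band = 1000.0                   # hypothetical frequency bandwidth

p0, sigma_band = time_to_opd_spectrum(S_const, f_band, u)
area_time = 2.0 * f_band * S_const        # area under the time-based spectrum
area_opd = 2.0 * sigma_band * p0          # area under the chi-based spectrum
```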
FIGURE 6.3(b). Double-sided, time-based power spectrum $S^{(\mathrm{det})}_{NN}(f)$ of band-limited white noise, with constant level $S^{(\mathrm{det})}_{\mathrm{const}}$ between $-f_{\mathrm{band}}$ and $f_{\mathrm{band}}$.
FIGURE 6.3(c). χ-based power spectrum $p^{(\mathrm{det})}_{nn}(\sigma)$ of band-limited white noise, with constant level $p^{(\mathrm{det})}_{0} = u\,S^{(\mathrm{det})}_{\mathrm{const}}$.
FIGURE 6.4(a). Simulated detector noise plotted against χ from −D to D, with dashed lines at $\pm I^{(\mathrm{det})}_{N}$.
FIGURE 6.4(b). Real part of $\tilde{n}^{(\mathrm{det})}_{D}(\sigma)$, with dashed lines at $\pm Z^{(\mathrm{det})}_{DN}$.
Figure 6.4(a) plots one possible member of the ensemble of functions associated with the
random function $\tilde{n}^{(\mathrm{det})}(\chi)$ obeying a band-limited, white noise power spectrum like the one in
Fig. 6.3(c); that is, Fig. 6.4(a) contains a specific instance of $\tilde{n}^{(\mathrm{det})}(\chi)$. In Fig. 6.4(a), the χ
interval between samples is
\[ \Delta\chi = \frac{1}{2\sigma_{\mathrm{Nyq}}} , \tag{6.43a} \]
where $\sigma_{\mathrm{Nyq}}$ is the Nyquist wavenumber of the sampled interferogram signal that we plan to
contaminate with this noise. We make the simulated $\tilde{n}^{(\mathrm{det})}(\chi)$ relatively large so that its effects
are easily visible, giving it a scale size $I^{(\mathrm{det})}_{N}$, shown with dashed lines, equal to 1/50th of the
maximum value of the simulated interferogram signal. A power spectrum such as the one shown
in Fig. 6.3(c) does not uniquely determine all the statistical rules needed to generate the random
noise sequence in Fig. 6.4(a); we also need to pick a probability density distribution for $\tilde{n}^{(\mathrm{det})}$ at
each value of χ. This probability density distribution must be zero-mean because, according to
Eq. (6.17b),
\[ E\!\left(\tilde{n}^{(\mathrm{det})}(\chi)\right) = 0 . \]
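To match the probability density distribution to the power spectrum, we also need to give it the
correct variance. Remembering that the noise is zero-mean, and consulting Eq. (6.39a) with
$\chi_2 = \chi_1$, we see that the variance of the detector noise is
\[ v^{(\mathrm{det})}_{n} = E\!\left(\tilde{n}^{(\mathrm{det})}(\chi)^{2}\right) = o^{(\mathrm{det})}_{nn}(0) . \tag{6.43b} \]
Setting χ = 0 in Eq. (6.39c) then gives
\[ v^{(\mathrm{det})}_{n} = o^{(\mathrm{det})}_{nn}(0) = \int_{-\infty}^{\infty} p^{(\mathrm{det})}_{nn}(\sigma)\,d\sigma . \tag{6.43c} \]
Equations (6.42a) and (6.42b), and the band-limited nature of the white-noise power spectrum in
(6.43c), now give (remember that $p^{(\mathrm{det})}_{nn}$ is zero for $|\sigma| > \sigma_{\mathrm{band}}$)
\[ v^{(\mathrm{det})}_{n} = 2\,\sigma_{\mathrm{band}}\,p^{(\mathrm{det})}_{0} . \tag{6.43d} \]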
FIGURE 6.4(c). Imaginary part of $\tilde{n}^{(\mathrm{det})}_{D}(\sigma)$, plotted from $-\sigma_{\mathrm{Nyq}}$ to $\sigma_{\mathrm{Nyq}}$ with dashed lines at $\pm Z^{(\mathrm{det})}_{DN}$.
FIGURE 6.4(d). Real and imaginary parts of $\tilde{n}^{(\mathrm{det})}_{D}(\sigma)$ near σ = 0, plotted from −200·Δσ to 200·Δσ.
Having made the probability density distribution zero-mean, and matched its variance to the
power level, we are left free to arrange everything else about the probability density distribution
any way we please. Sometimes knowing the variance is enough to pick a specific probability
distribution from a family of similar zero-mean density distributions; it is certainly all that is
needed to specify the Gaussian probability distribution used to generate the random noise in Fig.
6.4(a).
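A noise record of the kind plotted in Fig. 6.4(a) could be generated along the following lines. This is a sketch under stated assumptions, not the procedure actually used for the figure: it assumes the noise bandwidth equals the Nyquist wavenumber, so that samples spaced 1/(2·σ_band) apart are uncorrelated, and it uses the variance from Eq. (6.43d). The function name and the numerical values are hypothetical.

```python
import numpy as np

def simulate_detector_noise(p0, sigma_band, num_samples, rng):
    """Draw zero-mean Gaussian samples whose variance matches a band-limited
    white-noise spectrum of level p0 and bandwidth sigma_band."""
    variance = 2.0 * sigma_band * p0          # Eq. (6.43d)
    return rng.normal(loc=0.0, scale=np.sqrt(variance), size=num_samples)

rng = np.random.default_rng(3)
p0 = 1.0e-3                                   # hypothetical power level
sigma_band = 5000.0                           # hypothetical bandwidth (taken equal to sigma_Nyq)
samples = simulate_detector_noise(p0, sigma_band, 200_000, rng)

sample_mean = samples.mean()                  # close to zero
sample_variance = samples.var(ddof=1)         # close to 2 * sigma_band * p0
```

Any other zero-mean distribution with the same variance would match the power spectrum equally well; the Gaussian choice simply follows the text.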
We have already noted that the simulated noise plotted in Fig. 6.4(a) can be regarded as a
member function picked at random from the ensemble of functions associated with $\tilde{n}^{(\mathrm{det})}(\chi)$; that
is, it is a single instance of the detector noise. Even though it is not, in the strictest sense, possible
to graph a random function as such (because it stands for a whole collection or ensemble of
functions), as a convenient shorthand we often call graphs such as the one in Fig. 6.4(a) a
simulation of $\tilde{n}^{(\mathrm{det})}(\chi)$. Figure 6.4(b) contains the real part of the D-limited forward Fourier
transform of the detector noise shown in Fig. 6.4(a). This means, following the notation specified
in Eqs. (6.29a), (6.37i), and (6.37j), that Fig. 6.4(b) simulates the random function
\[ \tilde{n}^{(\mathrm{det})}_{De}(\sigma) = \mathrm{Re}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right) . \]
Figure 6.4(c) plots the imaginary part of this same transform, which means Fig. 6.4(c) simulates
the random function
\[ \mathrm{Im}\!\left(\tilde{n}^{(\mathrm{det})}_{D}(\sigma)\right) = i^{-1}\,\tilde{n}^{(\mathrm{det})}_{Do}(\sigma) \]
corresponding to the detector noise specified in Fig. 6.4(a). The scale size $Z^{(\mathrm{det})}_{DN}$, shown with
dashed lines in Figs. 6.4(b) and 6.4(c), is 1/50th of the maximum uncalibrated spectral signal
produced by the simulated interferometer. Figure 6.4(d) plots a very short stretch of the two
curves in Figs. 6.4(b) and 6.4(c) around σ = 0. Here Δσ is the distance between adjacent samples
on the wavenumber axis for the radiance spectrum measured by the interferometer. We see that
the imaginary part obeys Eqs. (6.37g) and (6.37j) by being an odd function of σ, and the real part
obeys Eqs. (6.37d) and (6.37i) by being an even function of σ.
An Example of Simulated Detector Noise in a Double-Sided Signal · 6.14
FIGURE 6.5(a). Input radiance spectrum L(σ): full view from $-\sigma_{\mathrm{Nyq}}/2$ to $\sigma_{\mathrm{Nyq}}/2$ and an expanded view; the radiance axis is scaled by $L_{\max}$.
FIGURE 6.5(b). Interferogram signal plotted against χ: full view and an expanded view.
FIGURE 6.5(c). Radiance measurement $L_{\mathrm{mnf}}(\sigma)$: full view from $-\sigma_{\mathrm{Nyq}}/2$ to $\sigma_{\mathrm{Nyq}}/2$ and an expanded view; the radiance axis is scaled by $L_{\max}$.
FIGURE 6.6(a). Noise-contaminated interferogram signal plotted against χ: full view and an expanded view.
FIGURE 6.6(b). Noise-contaminated radiance measurement: full view from $-\sigma_{\mathrm{Nyq}}$ to $\sigma_{\mathrm{Nyq}}$ and an expanded view; the radiance axis is scaled by $L_{\max}$.
Fig. 6.2 when only negligible amounts of noise and background radiance are present. Figure
6.5(c) gives the $L_{\mathrm{mnf}}(\sigma)$ radiance measurement extracted from the interferogram signal in Fig.
6.5(b). The most dramatic change is perhaps the spurious oscillation or “ringing” produced
throughout the measured spectrum by the finite signal length or truncation of the interferogram
signal (only signal values between χ = +D and χ = −D are recorded in this double-sided system).
Careful examination also reveals the blurring effects of this truncation: note that three
absorption lines in the center of Fig. 6.5(c) are not quite as deep and are more closely matched in
intensity than are the absorption lines in Fig. 6.5(a). The characteristic scale of the radiance axis
in Figs. 6.5(a) and 6.5(c) is taken to be $L_{\max}$, the maximum value of the input radiance spectrum
(in units of optical power per unit area per unit solid angle per unit wavenumber interval). Next,
detector noise is added to the radiance measurement. Figure 6.6(a) plots the interference signal in
Fig. 6.5(b) contaminated by the band-limited detector noise plotted in Fig. 6.4(a), and Fig. 6.6(b)
gives the spectral measurement produced by this noise-contaminated signal.
The discussion following Eq. (6.35d) above reveals that the detector noise $\tilde{n}^{(\mathrm{det})}(\chi)$ in the
$z_C(\chi)$ signal adds a complex spectral noise $\tilde{n}^{(\mathrm{det})}_{D}(\sigma)$ to the spectral data coming out of the
calibration algorithm; and, as shown in Eq. (6.38a), only the real component of the complex
spectral noise unavoidably contaminates the spectral measurement. Figure 6.6(b) shows that this
real component typically introduces a fuzziness into the measured spectrum, which is most easily
seen where the noise-free $L_{\mathrm{mnf}}$ spectrum is negligible or zero. Figures 6.7(a) and 6.7(b) show the
real and imaginary parts of the complex spectral noise in this simulated interferometer
measurement. Because the last step in producing a double-sided interferometer measurement is,
according to Eq. (6.38a) above, to take the real part of the calculated spectrum, only the real part
plotted in Fig. 6.7(a) ends up contaminating the spectral measurement. The plots in Figs.
6.7(a) and 6.7(b) look qualitatively similar and have the same characteristic size, which is typical
of detector noise (see the discussion at the beginning of Sec. 6.17 below).
It is important to remember that the random noise in Figs. 6.7(a) and 6.7(b) comes from one
specific spectral measurement. The very next measurement might have negative errors where
there are now positive, or positive errors where there are now negative, or something in
between—there is quite literally no necessary connection to the random spectral errors in the
previous measurement. If we keep track of the detector-noise error in a very large collection of
measurements, and then at each wavenumber average together the detector-noise error from all
the different measurements, we would discover that the average detector-noise error approaches
zero at every wavenumber as we increase the number of independent measurements. This is, of
course, just what should happen according to Eq. (6.30b) above. If we calculate the standard
deviation at every wavenumber, we get the NEdN levels shown in Figs. 6.7(a,b).
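The ensemble behavior described above can be sketched with simulated errors. The example below is an illustration only: it assumes the detector-noise error at each wavenumber is an independent zero-mean Gaussian variable with a common standard deviation, and the array sizes and the value of `true_sigma` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
num_measurements = 2000                       # size of the simulated ensemble
num_wavenumbers = 64                          # number of spectral samples
true_sigma = 0.5                              # hypothetical error standard deviation

# Each row holds the detector-noise error of one simulated spectral measurement.
errors = rng.normal(scale=true_sigma, size=(num_measurements, num_wavenumbers))

mean_error = errors.mean(axis=0)              # approaches zero at every wavenumber
nedn_estimate = errors.std(axis=0, ddof=1)    # per-wavenumber standard deviation
```

In this sketch `nedn_estimate` plays the role of the NEdN level at each wavenumber, while `mean_error` shrinks toward zero as `num_measurements` grows.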
FIGURE 6.7(a). Real part of the complex spectral noise in the radiance measurement, plotted from $-\sigma_{\mathrm{Nyq}}$ to $\sigma_{\mathrm{Nyq}}$ with reference levels at $\pm L_{\max}/50$.
6.15 Photon Noise in Detectors
Most detectors approach an ideal state when chilled to very low temperatures (typically tens of
degrees Kelvin) at reasonable levels of illumination. For an ideal detector, the only source of
detector noise is the quantum fluctuations in the number of photons it absorbs. When the detector
experiences a constant level of illumination, these quantum fluctuations show up as band-limited
white noise. The photon noise in many types of photovoltaic (PV) detectors often approaches the
ideal of band-limited white noise. Many times this occurs when the detector observes the signal
in the presence of large amounts of background radiation, because then most of the photons
reaching the detector come from the constant background, keeping the total number of absorbed
photons approximately constant as the optical signal varies. A detector operating in this mode is
said to have reached its background-limited infrared photon, or BLIP, limit. Figures 5.8(a) and
5.8(b) in Chapter 5 show that when detectors measure interferograms, the total signal variation
about its average level is usually small except very close to ZPD in a region symmetrically
located about χ = 0. In this sense, even when background radiation is disregarded, PV detectors
measuring interferograms are analogous to PV detectors operating in the BLIP limit: photons are
absorbed at a more or less constant rate during most of the measurement. Experience has shown
that for this reason the photon noise contaminating interferograms can usually be approximated
as band-limited white noise, with the photon noise level specified by the detector’s average
illumination from both the background and signal radiances.
To derive a power level for the photon noise generated in a detector, we treat the detector as
an element of an electric circuit (it does, after all, put out an electric signal when illuminated),
which means it must have a typical bandwidth that we call $f^{(\mathrm{det})}_{\mathrm{band}}$. Associated with this bandwidth
is a response time
\[ \tau^{(\mathrm{det})}_{\mathrm{band}} = \frac{1}{2\cdot f^{(\mathrm{det})}_{\mathrm{band}}} . \tag{6.44a} \]
If the illumination hitting the detector varies significantly on a timescale shorter than $\tau^{(\mathrm{det})}_{\mathrm{band}}$, the
detector does not record the change in illumination directly but instead generates a signal based
on the average level of illumination reaching the detector over the $\tau^{(\mathrm{det})}_{\mathrm{band}}$ time interval. In this
sense, $\tau^{(\mathrm{det})}_{\mathrm{band}}$ is the effective length of time during which the detector collects photons to produce
its signal. We also assume that the detector responsivity R(σ) (which is defined at the beginning
of Sec. 5.9 in Chapter 5) can be written as the product of two functions $\eta_d(\sigma)$ and $e_d(\sigma)$ for
wavenumbers σ greater than zero,
\[ R(\sigma) = \eta_d(\sigma)\cdot e_d(\sigma) . \tag{6.44b} \]
Function $\eta_d$ is often called the detector's quantum efficiency; it specifies the fraction of photons
of frequency f = c/λ = cσ that are absorbed after hitting the detector's surface. The value of $\eta_d$
for any σ must be a dimensionless number between zero and one:
\[ 0 \le \eta_d(\sigma) \le 1 . \tag{6.44c} \]
Every photon is associated with a monochromatic wavefield of frequency f (in cycles per
second) and carries an amount of energy hf = hcσ, where $h \cong 6.626\times10^{-27}$ erg·sec is Planck's
constant and $c \cong 2.998\times10^{10}$ cm/sec is the speed of light in a vacuum. We define $P_1$ to be the
random number of photons absorbed by the detector in time $\tau^{(\mathrm{det})}_{\mathrm{band}}$ that have frequency $f_1 = c\sigma_1$,
$P_2$ to be the random number of photons absorbed in time $\tau^{(\mathrm{det})}_{\mathrm{band}}$ that have frequency $f_2 = c\sigma_2$, $P_3$
to be the random number of photons absorbed in time $\tau^{(\mathrm{det})}_{\mathrm{band}}$ that have frequency $f_3 = c\sigma_3$, and so
on. The statistical rules obeyed by photons require $P_1, P_2, P_3, \ldots$ to be independent random
numbers.
The total number of photons absorbed by the detector in time $\tau^{(\mathrm{det})}_{\mathrm{band}}$ is
Photon Noise in Detectors · 6.15
FIGURE 6.7(b).
Thick, solid line is the
NEdN level for this
noise.
7
6.10 5 10
7
Imaginary Lmax
Imaginary part L max / 50
partofofthe
thecomplex
complex
spectral noise
spectral
in thenoise
radiance
LiTradNT 0.0 0
in the radiance kPlot
measurement
measurement
−L
−L maxmax
/ 50
7
5 10
7
6.10
4000 2000 0 2000 4000
− Nyq
5000 σ 0.0
Nσtot σ Nyq
5000
kPlot 1 .∆σ
2 σ
______________________________________________________________________________
The detector has an area $A_d$, a field of view specified by the solid angle $\Delta\Omega_d$, and is illuminated
by a constant radiance $L_d(\sigma)$ that is defined only for σ ≥ 0. As has already been pointed out at the
beginning of this section, for interferometers we can take $L_d(\sigma)$ to be the average radiance level,
both from the optical background and the optical signal, reaching the detector. Using the linearity
of the expectation operator E with respect to random variables (see Sec. 3.10 of Chapter 3), the
average number of photons absorbed by the detector in time $\tau^{(\mathrm{det})}_{\mathrm{band}}$ is
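\[ \begin{aligned}
E(P_1) &\cong \eta_d(\sigma_1)\cdot\tau^{(\mathrm{det})}_{\mathrm{band}}\cdot\frac{A_d\,\Delta\Omega_d\,L_d(\sigma_1)}{hc\sigma_1}\,d\sigma , \\
E(P_2) &\cong \eta_d(\sigma_2)\cdot\tau^{(\mathrm{det})}_{\mathrm{band}}\cdot\frac{A_d\,\Delta\Omega_d\,L_d(\sigma_2)}{hc\sigma_2}\,d\sigma , \\
&\;\;\vdots \\
E(P_j) &\cong \eta_d(\sigma_j)\cdot\tau^{(\mathrm{det})}_{\mathrm{band}}\cdot\frac{A_d\,\Delta\Omega_d\,L_d(\sigma_j)}{hc\sigma_j}\,d\sigma , \\
&\;\;\vdots
\end{aligned} \tag{6.45c} \]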
where
Ad ∆Ω d L d (σ j ) dσ
is, of course, the average number of photons per unit time carried by that radiation.
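Returning to Eq. (6.45a), we see that the actual random optical power $W_d$ absorbed by the
detector over a time interval $\tau^{(\mathrm{det})}_{\mathrm{band}}$ is
\[ W_d = \left(\frac{hc\sigma_1}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot P_1 + \left(\frac{hc\sigma_2}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot P_2 + \left(\frac{hc\sigma_3}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot P_3 + \cdots . \tag{6.46a} \]
This should not be confused with the average or expected optical power absorbed over the time
interval $\tau^{(\mathrm{det})}_{\mathrm{band}}$. Since the photons have already been absorbed, all that is needed to get the actual
random signal $I_d$ is to multiply the first term by $e_d(\sigma_1)$, the second term by $e_d(\sigma_2)$, etc., which
gives
\[ I_d = e_d(\sigma_1)\cdot\left(\frac{hc\sigma_1}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot P_1 + e_d(\sigma_2)\cdot\left(\frac{hc\sigma_2}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot P_2 + e_d(\sigma_3)\cdot\left(\frac{hc\sigma_3}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot P_3 + \cdots . \tag{6.46b} \]
The right-hand side of this equation is a sum of independent random variables. Equation (3.19e)
in Chapter 3 states that the variance of the sum of independent random variables is the sum of the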
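variances, so we can use the notation introduced in Eq. (3.8f) of Chapter 3 to write
\[ \mathrm{Var}(I_d) = \mathrm{Var}\!\left(e_d(\sigma_1)\frac{hc\sigma_1}{\tau^{(\mathrm{det})}_{\mathrm{band}}}P_1\right) + \mathrm{Var}\!\left(e_d(\sigma_2)\frac{hc\sigma_2}{\tau^{(\mathrm{det})}_{\mathrm{band}}}P_2\right) + \cdots + \mathrm{Var}\!\left(e_d(\sigma_j)\frac{hc\sigma_j}{\tau^{(\mathrm{det})}_{\mathrm{band}}}P_j\right) + \cdots . \]
Equation (3.16g) in Chapter 3 points out that multiplying a random variable by a nonrandom
parameter means that its variance must be multiplied by the square of that parameter, so the
variance in signal $I_d$ can also be written as
\[ \mathrm{Var}(I_d) = \left(e_d(\sigma_1)\frac{hc\sigma_1}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)^{2}\cdot\mathrm{Var}(P_1) + \left(e_d(\sigma_2)\frac{hc\sigma_2}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)^{2}\cdot\mathrm{Var}(P_2) + \cdots + \left(e_d(\sigma_j)\frac{hc\sigma_j}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)^{2}\cdot\mathrm{Var}(P_j) + \cdots . \tag{6.46c} \]
The number of photons absorbed at any frequency $f = c\sigma_j$ obeys Poisson statistics, which means
that the variance in the random number of photons equals the mean or average number of
photons. Substituting the averages from Eq. (6.45c) for the variances in (6.46c) gives
\[ \begin{aligned}
\mathrm{Var}(I_d) ={}& \left(e_d(\sigma_1)\frac{hc\sigma_1}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)^{2}\cdot\left[\eta_d(\sigma_1)\cdot\tau^{(\mathrm{det})}_{\mathrm{band}}\cdot\frac{A_d\,\Delta\Omega_d\,L_d(\sigma_1)}{hc\sigma_1}\right]d\sigma \\
&+ \left(e_d(\sigma_2)\frac{hc\sigma_2}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)^{2}\cdot\left[\eta_d(\sigma_2)\cdot\tau^{(\mathrm{det})}_{\mathrm{band}}\cdot\frac{A_d\,\Delta\Omega_d\,L_d(\sigma_2)}{hc\sigma_2}\right]d\sigma + \cdots \\
&+ \left(e_d(\sigma_j)\frac{hc\sigma_j}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)^{2}\cdot\left[\eta_d(\sigma_j)\cdot\tau^{(\mathrm{det})}_{\mathrm{band}}\cdot\frac{A_d\,\Delta\Omega_d\,L_d(\sigma_j)}{hc\sigma_j}\right]d\sigma + \cdots ,
\end{aligned} \]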
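which simplifies to
\[ \begin{aligned}
\mathrm{Var}(I_d) ={}& A_d\,\Delta\Omega_d\cdot\left(\frac{hc\sigma_1}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot e_d(\sigma_1)^{2}\,\eta_d(\sigma_1)\cdot L_d(\sigma_1)\,d\sigma \\
&+ A_d\,\Delta\Omega_d\cdot\left(\frac{hc\sigma_2}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot e_d(\sigma_2)^{2}\,\eta_d(\sigma_2)\cdot L_d(\sigma_2)\,d\sigma \\
&+ A_d\,\Delta\Omega_d\cdot\left(\frac{hc\sigma_3}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot e_d(\sigma_3)^{2}\,\eta_d(\sigma_3)\cdot L_d(\sigma_3)\,d\sigma \\
&+ \cdots .
\end{aligned} \]
Replacing the sum by an integral over σ gives
\[ \mathrm{Var}(I_d) = A_d\,\Delta\Omega_d\left(\frac{hc}{\tau^{(\mathrm{det})}_{\mathrm{band}}}\right)\cdot\int_{0}^{\infty} e_d(\sigma)^{2}\,\eta_d(\sigma)\,L_d(\sigma)\,\sigma\,d\sigma , \]
and Eqs. (6.44a) and (6.44b) then give
\[ \mathrm{Var}(I_d) = 2 f^{(\mathrm{det})}_{\mathrm{band}}\,hc\,A_d\,\Delta\Omega_d\cdot\int_{0}^{\infty}\frac{R(\sigma)^{2}}{\eta_d(\sigma)}\,L_d(\sigma)\,\sigma\,d\sigma . \tag{6.46e} \]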
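The chain of steps leading to Eq. (6.46e) can be checked against a direct Poisson simulation. The sketch below works in arbitrary units with hc set to 1, and every numerical value (the wavenumber grid, radiance, responsivity, quantum efficiency, bandwidth) is a hypothetical choice made only so that each bin collects a few tens of photons; the function name is likewise hypothetical.

```python
import numpy as np

def photon_signal_variance(sigma, L_d, R, eta_d, A_d, dOmega_d, f_band, hc):
    """Riemann-sum version of Eq. (6.46e) on a uniform wavenumber grid."""
    dsigma = sigma[1] - sigma[0]
    integrand = R ** 2 / eta_d * L_d * sigma
    return 2.0 * f_band * hc * A_d * dOmega_d * np.sum(integrand) * dsigma

# Hypothetical inputs in arbitrary units.
hc, A_d, dOmega_d, f_band = 1.0, 1.0, 1.0, 0.5
sigma = np.linspace(1000.0, 1500.0, 26)
dsigma = sigma[1] - sigma[0]
eta_d = np.full_like(sigma, 0.8)              # quantum efficiency
R = np.full_like(sigma, 0.4)                  # responsivity, R = eta_d * e_d
L_d = 2.5 * sigma                             # constant-in-time radiance spectrum

tau = 1.0 / (2.0 * f_band)                    # Eq. (6.44a)
e_d = R / eta_d                               # Eq. (6.44b)
mean_counts = eta_d * tau * A_d * dOmega_d * L_d / (hc * sigma) * dsigma   # Eq. (6.45c)

# Monte Carlo version of Eq. (6.46b): independent Poisson counts in each bin.
rng = np.random.default_rng(5)
counts = rng.poisson(mean_counts, size=(20_000, sigma.size))
signal = (e_d * hc * sigma / tau * counts).sum(axis=1)

var_formula = photon_signal_variance(sigma, L_d, R, eta_d, A_d, dOmega_d, f_band, hc)
var_simulated = signal.var(ddof=1)
```

The two variances agree to within the statistical scatter of the simulation, which is what the Poisson property (variance equal to mean) predicts.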
The photon noise is band-limited white noise like that shown in Figs. 6.3(b) and 6.3(c) above.
Hence, Eq. (3.62d) in Chapter 3, which connects the variance of band-limited white noise to the
constant level of its noise-power spectrum, here allows us to write that
\[ \mathrm{Var}(I_d) = 2 f^{(\mathrm{det})}_{\mathrm{band}}\,S^{(\mathrm{det})}_{p2} , \tag{6.46f} \]
where $S^{(\mathrm{det})}_{p2}$ is the constant power level of the double-sided, time-based power spectrum due to
the random quantum fluctuations in the number of photons absorbed by the detector. Comparing
Eq. (6.46e) to (6.46f), we see that
\[ S^{(\mathrm{det})}_{p2} = hc\,A_d\,\Delta\Omega_d\cdot\int_{0}^{\infty}\frac{R(\sigma)^{2}}{\eta_d(\sigma)}\,L_d(\sigma)\,\sigma\,d\sigma . \tag{6.46g} \]
A single-sided power spectrum must, according to Eq. (3.58b) of Chapter 3, have a constant
power level $S^{(\mathrm{det})}_{p1}$ that is twice the size of the double-sided power level, hence
\[ S^{(\mathrm{det})}_{p1} = 2\,hc\,A_d\,\Delta\Omega_d\cdot\int_{0}^{\infty}\frac{R(\sigma)^{2}}{\eta_d(\sigma)}\,L_d(\sigma)\,\sigma\,d\sigma . \tag{6.46h} \]
\[ L_d(\sigma) = \left(\frac{Q}{\Delta\Omega_d}\right)\cdot hc\sigma_0\cdot\delta(\sigma - \sigma_0) , \tag{6.47a} \]
where Q, which is often called the photon incidence, is defined to be the number of photons per
unit time and per unit area hitting the detector. The delta function in (6.47a) has units of inverse
wavenumbers (that is, length) and is explained in Sec. 2.14 of Chapter 2. Substitution of (6.47a)
into (6.46h) gives
\[ S^{(\mathrm{det})}_{p1} = 2\,hc\,A_d\,\Delta\Omega_d\int_{0}^{\infty}\frac{R(\sigma)^{2}}{\eta_d(\sigma)}\cdot\left[\left(\frac{Q}{\Delta\Omega_d}\right)\cdot hc\sigma_0\cdot\delta(\sigma - \sigma_0)\right]\sigma\,d\sigma \]
or
\[ S^{(\mathrm{det})}_{p1} = \frac{2 A_d Q}{\eta_d(\sigma_0)}\left[hc\sigma_0\,R(\sigma_0)\right]^{2} . \tag{6.47b} \]
Detectors are often characterized by a figure of merit called the specific detectivity D*, or “D-star.”
The specific detectivity of a detector at a positive wavenumber $|\sigma|$ is defined to be
\[
D^{*}(|\sigma|) = \frac{R(|\sigma|)\,\sqrt{A_d}}{\sqrt{S_1^{(\mathrm{det})}(u|\sigma|)}} , \tag{6.48a}
\]
where u is again the constant OPD velocity used in Eq. (6.4) above, $R(|\sigma|)$ is the detector’s
responsivity, $A_d$ is the detector area, and $S_1^{(\mathrm{det})}(f)$ is the single-sided noise-power density at the
signal frequency f (in Hz). The absolute value signs applied to σ both remind us that its value
must be positive and allow us to extend the definition of D* to negative wavenumbers. The units
of D* are cm·√Hz/watt (which is often called a Jones). The D* tends to be constant for all
infrared detectors made from the same detector material and operating at the same temperature,
no matter what the detector area $A_d$; consequently, it can be used to predict the amount of noise
contamination present in any size detector, all other things being equal. High-performance
detectors produce low-noise signals and have large D* values (for example, 10^14 cm·√Hz/watt),
and low-performance detectors have small D* values (for example, 10^7 cm·√Hz/watt). The D* of
an ideal detector that is photon-noise limited and experiencing an approximately constant level of
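\[
D^{*} = \frac{R(\sigma_0)\,\sqrt{A_d}}{\sqrt{S_{p1}^{(\mathrm{det})}}} = \frac{1}{hc\,\sigma_0} \sqrt{\frac{\eta_d(\sigma_0)}{2Q}} \tag{6.48b}
\]
\[
D^{*} = \frac{\lambda_0}{hc} \sqrt{\frac{\eta_d(\sigma_0)}{2Q}} . \tag{6.48c}
\]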
This equation is the standard D∗ formula for a PV detector in the BLIP limit.101
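As a hedged numerical sketch of Eqs. (6.48b) and (6.48c), the short Python fragment below evaluates the photon-noise-limited D* from both the wavenumber form and the wavelength form and confirms that they agree. The function names, the unit choices (cm for length, photons per second per cm² for Q), and the sample numbers are illustrative assumptions and are not taken from the text.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant, J*s
C_LIGHT_CM = 2.99792458e10   # speed of light, cm/s (cm chosen so D* comes out in cm*sqrt(Hz)/W)


def dstar_from_wavenumber(sigma0, eta_d, q):
    """Eq. (6.48b): D* = (1 / (h c sigma0)) * sqrt(eta_d / (2 Q)).

    sigma0 : wavenumber in cm^-1 (assumed unit)
    eta_d  : detector quantum efficiency, dimensionless
    q      : photon incidence Q in photons / (s * cm^2) (assumed unit)
    """
    hc = H_PLANCK * C_LIGHT_CM  # J*cm
    return math.sqrt(eta_d / (2.0 * q)) / (hc * sigma0)


def dstar_from_wavelength(lambda0, eta_d, q):
    """Eq. (6.48c): D* = (lambda0 / (h c)) * sqrt(eta_d / (2 Q)), lambda0 in cm."""
    hc = H_PLANCK * C_LIGHT_CM
    return (lambda0 / hc) * math.sqrt(eta_d / (2.0 * q))


if __name__ == "__main__":
    # Hypothetical operating point: 10-micrometer radiation, eta_d = 0.6, Q = 1e17.
    lambda0 = 1.0e-3            # cm
    sigma0 = 1.0 / lambda0      # cm^-1
    print(dstar_from_wavenumber(sigma0, 0.6, 1.0e17))
    print(dstar_from_wavelength(lambda0, 0.6, 1.0e17))
```

Because λ0 = 1/σ0, the two functions are the same formula written two ways; the sketch is only meant to make the units and the square-root dependence on η_d/Q concrete.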
\[
E\!\left( n_D^{(\mathrm{det})}(\sigma) \right) = 0 . \tag{6.49a}
\]
Substituting from Eq. (6.37h) now gives, using the linearity of the expectation operator with
respect to random variables [see Eq. (3.16a) in Chapter 3],
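\[
E\!\left( n_{De}^{(\mathrm{det})}(\sigma) + n_{Do}^{(\mathrm{det})}(\sigma) \right) = E\!\left( n_{De}^{(\mathrm{det})}(\sigma) \right) + E\!\left( n_{Do}^{(\mathrm{det})}(\sigma) \right) = 0 . \tag{6.49b}
\]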
101
See, for example, Eq. (2.48a) in John David Vincent, Fundamentals of Infrared Detector Operation and Testing
(John Wiley and Sons, New York, 1990), p. 65.
Detector-Noise NEdN in Double-Sided Signals · 6.16
Taking the expectation value of both sides of the formula for L ( ) ) in Eq. (6.38g) now gives
the desired result:
E L ( ) ) 0
(6.49e)
for the double-sided detector noise.
To get the detector-noise NEdN in a double-sided signal, we first substitute Eq. (6.49e) into
(6.3g) to get
NEdN ( ) ) E ª¬ L ( ) ) º¼ 2 (6.50a)
©
D () ) ¼ ¸
¹
NEdN 2(det)( ) ) . (6.50b)
( A ) M( R ) ' ma ) !( ) ) R ( ) )* a ( ) )* f ( ) )
The subscript 2 and the superscript (det) are added to the NEdN parameter to show that this is the
NEdN of a double-sided signal contaminated by detector noise. According to the discussion
immediately preceding Eqs. (4.84a) and (4.84b) in Chapter 4, parameter W = +1 or −1, which
means that it drops out of the formula when L ( ) ) is squared. We can remove the absolute
value signs from the arguments of M and η because they are already even functions [see Eqs.
(4.139g) and (5.10f) in Chapters 4 and 5 respectively] to get
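\[
\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{4 \sqrt{E\!\left( \left[ \mathrm{Re}\, n_D^{(\mathrm{det})}(\sigma) \right]^2 \right)}}{(A\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)} . \tag{6.50c}
\]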
According to the discussion at the beginning of Sec. 6.12, we can assume the detector noise to
be wide-sense stationary; and Appendix 6B shows that it has this property both when treated as a
random function of time and when treated as a random function of the OPD value χ. Using the
transformation specified in Eqs. (6.40a) and (6.40b) to treat the detector noise as the random
function of time $N^{(\mathrm{det})}(t)$, we use Eq. (6.40f) to construct its T-limited Fourier transform [Eq.
(6.22b) above defines Π]
\[
N_T^{(\mathrm{det})}(f) = \int_{-T}^{T} N^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt = \int_{-\infty}^{\infty} \Pi(t, T)\, N^{(\mathrm{det})}(t)\, e^{-2\pi i f t}\, dt . \tag{6.51a}
\]
The analysis given in Sec. 3.26 of Chapter 3 shows, according to Eqs. (3.69g) and (3.69h), that
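\[
E\!\left( \left[ \mathrm{Re}\, N_T^{(\mathrm{det})}(f) \right]^2 \right) \cong \frac{1}{2}\, E\!\left( \left| N_T^{(\mathrm{det})}(f) \right|^2 \right) \tag{6.51b}
\]
and
\[
E\!\left( \left[ \mathrm{Im}\, N_T^{(\mathrm{det})}(f) \right]^2 \right) \cong \frac{1}{2}\, E\!\left( \left| N_T^{(\mathrm{det})}(f) \right|^2 \right) \tag{6.51c}
\]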
as long as f is substantially greater than $O(T^{-1})$. This is a very easy requirement to satisfy
since at this point all it really does is show how large T must be chosen for us to have Eqs.
(6.51b) and (6.51c) hold true at the frequencies f we are interested in. Remembering that E is a
linear operator with respect to random variables and that $\sigma = f/u$, we use Eq. (6.40h) to write
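\[
E\!\left( \left[ \mathrm{Re}\, n_D^{(\mathrm{det})}(\sigma) \right]^2 \right) \cong \frac{1}{2}\, E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right) \tag{6.51d}
\]
and
\[
E\!\left( \left[ \mathrm{Im}\, n_D^{(\mathrm{det})}(\sigma) \right]^2 \right) \cong \frac{1}{2}\, E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right) . \tag{6.51e}
\]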
These two formulas only hold true as long as σ is substantially greater than $O(D^{-1})$, as can be
seen by applying Eqs. (6.40c) and (6.40d) to the requirement that f is substantially greater than
$O(T^{-1})$. The intersample distance between spectral samples along the wavenumber axis of the
radiance measurement is, according to the discussion following Eq. (5.124d) in Chapter 5,
\[
\Delta\sigma = \frac{1}{2D} .
\]
Consequently, as long as the wavenumbers between $\sigma_{\min}$ and $\sigma_{\max}$ at which the spectral radiance is
being measured lie a reasonable number of Δσ lengths away from the σ = 0 origin of the
wavenumber axis (as would be the case in a well-designed interferometer system), we can rely
on σ being substantially greater than $O(D^{-1})$ for the wavenumbers of interest. Hence formulas
(6.51d) and (6.51e) can be assumed to hold true. Now we can substitute Eq. (6.51d) into (6.50c)
to get
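\[
\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{2\sqrt{2}\, \sqrt{E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right)}}{(A\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)} . \tag{6.51f}
\]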
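The equal split of noise power between real and imaginary parts used in Eqs. (6.51b) through (6.51e) can be illustrated with a small Monte Carlo experiment. The sketch below is an assumption-laden stand-in, not the book's derivation: it replaces the T-limited Fourier integral with a discrete sum over uncorrelated Gaussian samples, and all names (`finite_transform`, `real_part_power_fraction`) and parameter values are hypothetical.

```python
import cmath
import math
import random


def finite_transform(samples, k):
    """Discrete analogue of the finite-length transform in Eq. (6.51a), at frequency bin k."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(samples))


def real_part_power_fraction(n_samples=64, k=5, trials=2000, seed=0):
    """Estimate E([Re N]^2) / E(|N|^2) for white Gaussian noise at frequency bin k."""
    rng = random.Random(seed)
    re_sq = 0.0
    mag_sq = 0.0
    for _ in range(trials):
        noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        value = finite_transform(noise, k)
        re_sq += value.real ** 2
        mag_sq += abs(value) ** 2
    return re_sq / mag_sq


if __name__ == "__main__":
    # Well away from zero frequency the fraction should be close to one half.
    print(real_part_power_fraction(k=5))
    # At zero frequency the transform of real noise is purely real, so the split fails.
    print(real_part_power_fraction(k=0))
```

The second call shows why the text requires the frequency to be well above O(1/T): at zero frequency the imaginary part vanishes and all of the noise power sits in the real part.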
This basic equation for the detector-noise NEdN of a double-sided signal can be put into a
variety of forms.
If the power spectrum of the detector noise is known, we can evaluate
\[
E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right)
\]
directly no matter what shape it has. In particular, we do not need to assume that the detector
produces band-limited white noise. Starting with Eq. (6.29a), we have
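\[
\left| n_D^{(\mathrm{det})}(\sigma) \right|^2 = n_D^{(\mathrm{det})}(\sigma) \cdot \left[ n_D^{(\mathrm{det})}(\sigma) \right]^{*}
= \left[ \int_{-\infty}^{\infty} \Pi(\chi, D)\, n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \right] \cdot \left[ \int_{-\infty}^{\infty} \Pi(\chi', D)\, n^{(\mathrm{det})}(\chi')\, e^{-2\pi i \sigma \chi'}\, d\chi' \right]^{*} ,
\]
\[
\left| n_D^{(\mathrm{det})}(\sigma) \right|^2 = \int_{-\infty}^{\infty} \Pi(\chi, D)\, n^{(\mathrm{det})}(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi \int_{-\infty}^{\infty} \Pi(\chi', D)\, n^{(\mathrm{det})}(\chi')\, e^{2\pi i \sigma \chi'}\, d\chi' .
\]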
Equation (3.17c) in Chapter 3 allows the expectation operator E to be taken inside the double
integral formula, so applying E to both sides leads to
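\[
E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right) = \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, e^{-2\pi i \sigma \chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, e^{2\pi i \sigma \chi'}\, E\!\left( n^{(\mathrm{det})}(\chi')\, n^{(\mathrm{det})}(\chi) \right) .
\]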
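\[
E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right) = \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, e^{-2\pi i \sigma \chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, e^{2\pi i \sigma \chi'}\, o_{nn}^{(\mathrm{det})}(\chi - \chi')
\]
\[
= \int_{-\infty}^{\infty} d\sigma'\, p_{nn}^{(\mathrm{det})}(\sigma') \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, e^{-2\pi i (\sigma - \sigma') \chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, e^{-2\pi i (\sigma' - \sigma) \chi'} .
\]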
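$\chi \to f,\ D \to F,\ (\sigma - \sigma') \to t$
for the integral
\[
\int_{-\infty}^{\infty} \Pi(\chi, D)\, e^{-2\pi i (\sigma - \sigma') \chi}\, d\chi
\]
$\chi' \to f,\ D \to F,\ (\sigma' - \sigma) \to t$
for the integral
\[
\int_{-\infty}^{\infty} \Pi(\chi', D)\, e^{-2\pi i (\sigma' - \sigma) \chi'}\, d\chi' .
\]
This gives
\[
\int_{-\infty}^{\infty} \Pi(\chi, D)\, e^{-2\pi i (\sigma - \sigma') \chi}\, d\chi = 2D\, \mathrm{sinc}\!\left( 2\pi (\sigma - \sigma') D \right) \tag{6.52b}
\]
and
\[
\int_{-\infty}^{\infty} \Pi(\chi', D)\, e^{-2\pi i (\sigma' - \sigma) \chi'}\, d\chi' = 2D\, \mathrm{sinc}\!\left( 2\pi (\sigma' - \sigma) D \right) , \tag{6.52c}
\]
\[
\mathrm{sinc}(x) = \frac{\sin(x)}{x} .
\]
\[
E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right) = \int_{-\infty}^{\infty} p_{nn}^{(\mathrm{det})}(\sigma') \cdot \left[ 2D\, \mathrm{sinc}\!\left( 2\pi (\sigma - \sigma') D \right) \right] \cdot \left[ 2D\, \mathrm{sinc}\!\left( 2\pi (\sigma' - \sigma) D \right) \right] d\sigma' . \tag{6.52d}
\]
\[
\mathrm{sinc}(-x) = \frac{\sin(-x)}{-x} = \frac{\sin(x)}{x} = \mathrm{sinc}(x) .
\]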
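\[
E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right) = 2D \int_{-\infty}^{\infty} p_{nn}^{(\mathrm{det})}(\sigma') \cdot \left\{ 2D \left[ \mathrm{sinc}\!\left( 2\pi (\sigma - \sigma') D \right) \right]^2 \right\} d\sigma' . \tag{6.52e}
\]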
We assume that the detector noise has a power spectrum $p_{nn}^{(\mathrm{det})}$ that varies slowly with σ compared
to
\[
\mathrm{sinc}\!\left( 2\pi (\sigma - \sigma') D \right) .
\]
This means we can, just as in Eq. (3.67b) of Chapter 3, approximate the action of
\[
2D \left[ \mathrm{sinc}\!\left( 2\pi (\sigma - \sigma') D \right) \right]^2
\]
inside the integral by replacing it with a delta function $\delta(\sigma - \sigma')$. Equation (6.52e) then
simplifies to
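\[
E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right) = 2D\, p_{nn}^{(\mathrm{det})}(\sigma) , \tag{6.52f}
\]
\[
\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{4 \sqrt{D\, p_{nn}^{(\mathrm{det})}(\sigma)}}{(A\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)} . \tag{6.52g}
\]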
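The delta-function replacement that turns Eq. (6.52e) into (6.52f) rests on the kernel 2D[sinc(2π(σ − σ′)D)]² having unit area and being narrow compared to the power spectrum. A minimal numerical sketch of that claim follows; the midpoint-rule integrator, the finite integration window, and the sample power spectra are assumptions made only for illustration.

```python
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the removable singularity at x = 0 filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def kernel(delta_sigma, d_max):
    """The narrow kernel 2D [sinc(2 pi (sigma - sigma') D)]^2 appearing in Eq. (6.52e)."""
    return 2.0 * d_max * sinc(2.0 * math.pi * delta_sigma * d_max) ** 2


def smoothed_spectrum(p, sigma, d_max, half_width, steps):
    """Midpoint-rule estimate of the integral of p(sigma') * kernel(sigma - sigma')
    over sigma' in [sigma - half_width, sigma + half_width]."""
    step = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        sigma_prime = sigma - half_width + (i + 0.5) * step
        total += p(sigma_prime) * kernel(sigma - sigma_prime, d_max)
    return total * step


if __name__ == "__main__":
    # Flat spectrum: the result is the kernel's area, which should be close to 1.
    print(smoothed_spectrum(lambda s: 1.0, 0.0, 1.0, 50.0, 100000))
    # Slowly varying spectrum: the result should be close to p(sigma) itself.
    print(smoothed_spectrum(lambda s: 1.0 + 0.01 * s, 3.0, 1.0, 50.0, 100000))
```

The small shortfall from the exact values comes from truncating the slowly decaying tails of the squared sinc at the edge of the integration window.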
We note that $p_{nn}^{(\mathrm{det})}$ is a double-sided power spectrum, which means [see Eqs. (6.39e) and (6.39f)
above] it is real and even, making the absolute value signs applied to its argument superfluous.
Many times the detector noise is characterized by its power spectrum written as a function of
the frequency f (in Hz). This is called $S_{NN}^{(\mathrm{det})}(f)$ in Sec. 6.12 above, and Eq. (6.41f) can be used to
write (6.52g) as
\[
\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{4 \sqrt{u D\, S_{NN}^{(\mathrm{det})}(u\sigma)}}{(A\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)} . \tag{6.53a}
\]
Again, the absolute value signs do not need to be added to the argument of the power spectrum
because it is a real and even function. This formula is often written in terms of the single-sided
power spectrum described by Eq. (3.58b) of Chapter 3, which is defined only for non-negative
values of frequency $f = u|\sigma|$. Calling this single-sided power spectrum $S_1^{(\mathrm{det})}(|f|)$, we know from
Eq. (3.58b) that
\[
S_1^{(\mathrm{det})}(|f|) = 2\, S_{NN}^{(\mathrm{det})}(f) . \tag{6.53b}
\]
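Here, the absolute value signs are needed to show that the frequency argument must be non-negative.
Substituting this into (6.53a) gives
\[
\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{2\sqrt{2}\, \sqrt{u D\, S_1^{(\mathrm{det})}(u|\sigma|)}}{(A\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)} . \tag{6.53c}
\]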
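One last form into which this formula can be put uses the D* figure of merit introduced in Eq.
(6.48a),
\[
D^{*}(|\sigma|) = \frac{R(|\sigma|)\, \sqrt{A_d}}{\sqrt{S_1^{(\mathrm{det})}(u|\sigma|)}} .
\]
\[
\mathrm{NEdN}_2^{(\mathrm{det})}(|\sigma|) = \frac{2\sqrt{2}\, \sqrt{u D A_d}}{(A\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)\, D^{*}(|\sigma|)} , \tag{6.53d}
\]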
Real and Imaginary Parts of the Detector Noise · 6.17
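\[
\frac{16\, E\!\left( \left[ \mathrm{Im}\, n_D^{(\mathrm{det})}(\sigma) \right]^2 \right)}{\left[ (W A\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|) \right]^2}
= \frac{8\, E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right)}{\left[ (A\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|) \right]^2} .
\]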
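This is the variance of the noise in Fig. 6.7(b). Taking the square root gives, for the imaginary
component,
\[
\text{standard deviation} = \frac{2\sqrt{2}\, \sqrt{E\!\left( \left| n_D^{(\mathrm{det})}(\sigma) \right|^2 \right)}}{(A\,\Delta\Omega)\, M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)} . \tag{6.54}
\]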
The thick solid line labeled NEdN in Fig. 6.7(b) shows the size of this standard deviation. Figure
6.7(a) plots the actual spectral noise in the measured spectrum. Not only does this spectral noise
qualitatively resemble the imaginary component of the complex data in Fig. 6.7(b), but also, as
shown by the thick solid line in Fig. 6.7(a), the NEdN or standard deviation of the spectral noise
has the same value as the standard deviation of the imaginary component of the complex data.
This is no surprise; the right-hand side of Eq. (6.51f), which gives the NEdN (or standard
deviation of the spectral noise), is the same formula that appears on the right-hand side of
Eq. (6.54) above.
Footnote 88 of Chapter 5 explains that there are other nonideal aspects to interferometer
signals, such as the off-center sampling mentioned in Sec. 5.26 of Chapter 5, that can modify
the nonzero phase angle ψ (although it always remains a slowly varying function of σ). From this
point on, we can include all these aspects in our analysis by regarding H as the “effective”
transfer function that includes not only the effects of the detector circuit but also all the other
significant causes of a nonzero phase angle. This makes h, the inverse Fourier transform of H
(see Appendix 5A of Chapter 5), an “effective” impulse-response function for the signal leaving
the detector. Because H is still the forward Fourier transform of a real-valued function h when H
and h are taken to be the effective transfer function and effective impulse-response function, H is
still a Hermitian function satisfying Eq. (5A.6b) in Appendix 5A. Equation (5A.5), however, may
not be satisfied by an “effective” impulse-response function because the effective h may not be
causal.
Equation (5.88c) defines function $\varpi(\chi)$ to be the inverse Fourier transform of $e^{-i\psi(\sigma)}$
multiplied by the tapering function $V(\sigma)$ specified in Eq. (5.88d),
\[
\varpi(\chi) = \int_{-\infty}^{\infty} \left[ V(\sigma)\, e^{-i\psi(\sigma)} \right] e^{2\pi i \sigma \chi}\, d\sigma . \tag{6.55b}
\]
As pointed out in the discussion following Eq. (5.88a) of Chapter 5, we only need to know ψ
exactly for
\[
\sigma_{\min} \le \sigma \le \sigma_{\max} ;
\]
outside this range, function V can be adjusted to make $[V(\sigma)\, e^{-i\psi(\sigma)}]$ taper to zero, ensuring that
the Fourier transform in (6.55b) exists.
Functions ψ(σ) and $\varpi(\chi)$ can usually be recovered from the calibration procedure applied to
the interferometer. One method, as described at the beginning of Sec. 6.11 above, is to subtract
off the background signal $z_C^{(\mathrm{cold})}$ described in Sec. 6.3 and then (being sure to repeat the signal
measurements often enough to average away the noise) to calculate ψ and $\varpi$ from the recipe
given in Sec. 5.18 of Chapter 5. Another possibility is to note that every detector signal must pass
through the same signal chain, ending up multiplied by the same effective transfer function H.
Hence both $Z_{\mathrm{eff,tot}}^{(1)}(\sigma)$ and $Z_{\mathrm{eff,tot}}^{(2)}(\sigma)$ in Eqs. (6.33a) and (6.33b) above are complex because all
their real functions of σ are multiplied by the same complex transfer function H(uσ), giving both
spectra the same nonzero phase angle ψ(σ). In this sense $Z_{\mathrm{eff,tot}}^{(1)}(\sigma)$ and $Z_{\mathrm{eff,tot}}^{(2)}(\sigma)$ are
mathematically equivalent to $Z_{\mathrm{eff}}(\sigma)$ in Eq. (5.83d) of Chapter 5, which means that we can get
Detector Noise in a Single-Sided Signal · 6.18
\[
\Delta\sigma_{\text{double sided}} = \frac{1}{2D} ,
\]
and Eq. (5.93b) specifies the corresponding spectral resolution of a single-sided measurement
with $z_{\mathrm{conv}}(\chi)$ known between χ = 0 and χ = 2D − 2d to be
\[
\Delta\sigma_{\text{single sided}} = \frac{1}{2(2D - 2d)} .
\]
For the single-sided interferometer discussed in Sec. 5.18 of Chapter 5, we expect to have
\[
d \ll D , \tag{6.56}
\]
which means that $\Delta\sigma_{\text{single sided}} \cong 1/(4D) = \Delta\sigma_{\text{double sided}}/2$. Hence, to create a single-sided
measurement with the same spectral detail as a double-sided measurement, we should record
$z_{\mathrm{conv}}(\chi)$ only between χ = 0 and χ = D rather than between χ = 0 and χ = 2D − 2d ≅ 2D. This
ensures that both the single-sided and double-sided cases have the same spectral resolution.
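The resolution bookkeeping in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes only the two resolution formulas quoted above; the function names and the sample values of D and d are hypothetical.

```python
def double_sided_resolution(d_max):
    """Spectral sample spacing 1/(2D) for a double-sided scan reaching OPD = +/- D."""
    return 1.0 / (2.0 * d_max)


def single_sided_resolution(opd_extent):
    """Spectral sample spacing 1/(2 * extent) for a single-sided scan from 0 to opd_extent."""
    return 1.0 / (2.0 * opd_extent)


if __name__ == "__main__":
    d_max, d_short = 1.0, 0.01   # hypothetical OPD lengths in cm, with d << D

    full = single_sided_resolution(2.0 * d_max - 2.0 * d_short)  # record out to 2D - 2d
    matched = single_sided_resolution(d_max)                     # record only out to D

    print(double_sided_resolution(d_max))  # double-sided spacing
    print(full)                            # roughly half the double-sided spacing
    print(matched)                         # equal to the double-sided spacing
```

Stopping the single-sided scan at χ = D therefore reproduces the double-sided spacing exactly, which is the choice the text makes.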
To construct the $z_{\mathrm{conv}}$ signal between 0 and D, we convolve $\varpi(\chi)$ with the signal component
created by the L(σ) input radiance at point C in Fig. 6.2, as shown by Eq. (5.89a) in Chapter 5.
Nothing stops us, however, from convolving the total signal at point C with $\varpi$ while planning to
discard the unwanted background components later on. Because we want to keep track of the
noise, $\varpi$ should be convolved with the total noise-contaminated signal $z_{CN}^{(\mathrm{tot})}(\chi)$ specified in Eq.
(6.22a) above. We get, remembering that the convolution is a linear operation [see Eqs. (2.38b)
and (2.38d) in Chapter 2],
\[
z_{CN}^{(\mathrm{tot})}(\chi) * \varpi(\chi) = z_C(\chi) * \varpi(\chi) + z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) + \frac{1}{u} \left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] * \varpi(\chi) .
\]
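The associative property of the convolution [see Eq. (2.38c) in Chapter 2] gives
\[
\left[ n^{(\mathrm{det})}(\chi) * h\!\left( \frac{\chi}{u} \right) \right] * \varpi(\chi) = n^{(\mathrm{det})}(\chi) * \left[ h\!\left( \frac{\chi}{u} \right) * \varpi(\chi) \right] = n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) ,
\]
where we define
\[
\not{h}\!\left( \frac{\chi}{u} \right) = h\!\left( \frac{\chi}{u} \right) * \varpi(\chi) . \tag{6.57a}
\]
\[
z_{CN}^{(\mathrm{tot})}(\chi) * \varpi(\chi) = z_C(\chi) * \varpi(\chi) + z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) + \frac{1}{u} \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] , \tag{6.57b}
\]
and to get the total noise-free signal, we just set $n^{(\mathrm{det})}(\chi)$ to zero:
\[
z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi) = z_C(\chi) * \varpi(\chi) + z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) . \tag{6.57c}
\]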
To analyze zC ( χ ) ∗ϖ ( χ ) , the first term in Eq. (6.57c), we apply the Fourier convolution
theorem to its forward Fourier transform [see Eq. (2.39a) in Chapter 2],
The Fourier transforms in Eqs. (6.55b) and (6.5d) can be reversed to get
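\[
V(\sigma)\, e^{-i\psi(\sigma)} = \int_{-\infty}^{\infty} \varpi(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi = \mathrm{F}^{(-i\sigma\chi)}\!\left( \varpi(\chi) \right) \tag{6.58b}
\]
and
\[
\frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|)
= \int_{-\infty}^{\infty} z_C(\chi)\, e^{-2\pi i \sigma \chi}\, d\chi = \mathrm{F}^{(-i\sigma\chi)}\!\left( z_C(\chi) \right) . \tag{6.58c}
\]
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( z_C(\chi) * \varpi(\chi) \right)
= V(\sigma) \cdot \left( \frac{W A\,\Delta\Omega}{4} \right) \cdot \left[ e^{-i\psi(\sigma)}\, H(u\sigma) \right] M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|) . \tag{6.58d}
\]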
According to Eq. (6.55a) and the discussion following it, ψ(σ) is the argument or complex phase
angle of the effective transfer function $H(u\sigma)$, so
\[
e^{-i\psi(\sigma)}\, H(u\sigma) = \left| H(u\sigma) \right| . \tag{6.58e}
\]
We also note that [see Eq. (5.88d) in Chapter 5] the tapering function V(σ) equals one for those σ
values where $0 < \sigma_{\min} \le |\sigma| \le \sigma_{\max}$. These are also, according to the discussion following Eq.
(6.38c) above, the σ values where the product
\[
R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)
\]
in Eq. (6.58d) is not zero. So either V(σ) is multiplied by zero on the right-hand side of (6.58d),
which means that its value does not matter, or else σ has a value for which V(σ) is one. Hence Eq.
(6.58d) can be written as
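\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( z_C(\chi) * \varpi(\chi) \right)
= \left( \frac{W A\,\Delta\Omega}{4} \right) \cdot \left[ e^{-i\psi(\sigma)}\, H(u\sigma) \right] M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|) ,
\]
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( z_C(\chi) * \varpi(\chi) \right)
= \left( \frac{W A\,\Delta\Omega}{4} \right) \left| H(u\sigma) \right| M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|) . \tag{6.58f}
\]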
Equation (5.10f) in Chapter 5 and (4.139g) in Chapter 4 show that M and η are also even
functions, and clearly the product
\[
R(|\sigma|)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|)
\]
is even because all the functions depend on $|\sigma|$. Hence the entire right-hand side of Eq. (6.58f) is
a real and even function of σ. Reversing the Fourier transform in (6.58f) to get
\[
z_C(\chi) * \varpi(\chi)
= \mathrm{F}^{(i\sigma\chi)}\!\left( \left( \frac{W A\,\Delta\Omega}{4} \right) \left| H(u\sigma) \right| M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|) \right) , \tag{6.58h}
\]
we conclude that the convolution $z_C(\chi) * \varpi(\chi)$ is another real and even function because it is the
inverse Fourier transform of a real and even function (see entry 1 in Table 2.1 of Chapter 2):
\[
z_C(-\chi) * \varpi(-\chi) = z_C(\chi) * \varpi(\chi) . \tag{6.58i}
\]
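\[
z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) ,
\]
we take its forward Fourier transform to get, again using Eq. (2.39a) in Chapter 2,
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) \right) = \mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(\mathrm{cold})}(\chi) \right) \cdot \mathrm{F}^{(-i\sigma\chi')}\!\left( \varpi(\chi') \right) . \tag{6.59a}
\]
This can be written as, substituting from Eqs. (6.58b) and (6.11a),
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) \right)
= V(\sigma) \left( \frac{W A\,\Delta\Omega}{4} \right) \left[ e^{-i\psi(\sigma)}\, H(u\sigma) \right] M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \left[ L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|) \right] . \tag{6.59b}
\]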
Comparing this to Eq. (6.58d), we note that if $\tau_f$ is replaced by one, and if $L_{FOV}$ is replaced by
$[L_{FOV}^{(\mathrm{fore})} - L_{FOV}^{(\mathrm{back})}]$, then the right-hand side of (6.58d) becomes the same as the right-hand side of
(6.59b); that is,
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) \right)
\]
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( z_C(\chi) * \varpi(\chi) \right) .
\]
No special assumption was made about the nature of $L_{FOV}$ when analyzing the formula for
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( z_C(\chi) * \varpi(\chi) \right) ,
\]
and only one assumption was made about $\tau_f$: that the tapering function V(σ) equals one for those
σ values where the product
\[
R(|\sigma|)\, \tau_a(|\sigma|)\, \tau_f(|\sigma|)
\]
is not zero [see discussion following Eq. (6.58e) above]. Nothing stops us from tightening this
assumption slightly by requiring that the tapering function equals one when the product
$R(|\sigma|)\, \tau_a(|\sigma|)$ is not equal to zero; this prevents $\tau_f$ from having any effect on our previous
analysis of $\mathrm{F}^{(-i\sigma\chi)}( z_C(\chi) * \varpi(\chi) )$. Hence both $\tau_f$ and $L_{FOV}$ turn into placeholder functions when
deriving Eqs. (6.58f) and (6.58i) from (6.58d), which means that (6.58f) and (6.58i) still hold
true when $\tau_f$ is set equal to one and $L_{FOV}$ is replaced by $[L_{FOV}^{(\mathrm{fore})} - L_{FOV}^{(\mathrm{back})}]$. Consequently, we can
now apply Eqs. (6.58f) and (6.58i) to Eq. (6.59b) to get, setting $\tau_f$ equal to one, replacing
$L_{FOV}$ by $[L_{FOV}^{(\mathrm{fore})} - L_{FOV}^{(\mathrm{back})}]$, and adding a “(cold)” superscript to Eqs. (6.58 f, i),
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) \right)
= \left( \frac{W A\,\Delta\Omega}{4} \right) \left| H(u\sigma) \right| M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \left[ L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|) \right] \tag{6.59c}
\]
and
\[
z_C^{(\mathrm{cold})}(-\chi) * \varpi(-\chi) = z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) . \tag{6.59d}
\]
Equations (6.58i) and (6.59d) show that both terms on the right-hand side of Eq. (6.57c) are
even functions of χ, which means that their sum
must also be an even function of χ. This means we can take the $z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi)$ data collected
from χ = 0 to χ = D and use it to create an artificial signal between χ = 0 and χ = −D. We call
the artificially doubled, noise-free signal
\[
\mathrm{Even}\!\left[ z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi) \right]
= \Pi(\chi, D) \left[ z_C(\chi) * \varpi(\chi) \right] + \Pi(\chi, D) \left[ z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) \right] . \tag{6.60a}
\]
\[
\mathrm{Even}[z(\chi)] = \Pi(\chi, D)\, z(|\chi|) \tag{6.60b}
\]
for any function $z(\chi)$. This forces $\mathrm{Even}[z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi)]$ to have the same values at $-\chi_0$, for
$0 \le \chi_0 \le D$, as $z_C^{(\mathrm{tot})}(\chi) * \varpi(\chi)$ has at $\chi_0$. The $\Pi(\chi, D)$ function has the same meaning as in
Eq. (6.22c) above, reminding us, since it equals one for $|\chi| \le D$ and equals zero otherwise, that
no data exists for $|\chi| > D$. We note that, although absolute value signs are applied to χ on the
right-hand side of (6.60b), they are not needed in (6.60a) because the right-hand side is already
an even function of χ. These formulas seem straightforward enough, but we should note that the
Even operator has an interesting effect on a noise-contaminated signal such as the one in Eq.
(6.57b): the noise contaminating the signal at positive χ automatically becomes the same as the
noise contaminating the signal at negative χ. Another way of putting this is that, for any
$0 \le \chi_0 \le D$, the signal at $-\chi_0$ is always in error from the presence of random detector noise
by exactly the same amount as the signal at $\chi_0$. To show what the Even operator does to the
noise-contaminated signal in (6.57b), we need the Heaviside step function [which has already
been defined in Eq. (2.70a) of Chapter 2],
\[
\Xi(\chi) = \begin{cases} 1 & \text{for } \chi > 0 \\ 1/2 & \text{for } \chi = 0 \\ 0 & \text{for } \chi < 0 \end{cases} . \tag{6.60c}
\]
\[
\mathrm{Even}\!\left[ z_{CN}^{(\mathrm{tot})}(\chi) * \varpi(\chi) \right]
= \Pi(\chi, D) \left[ z_C(\chi) * \varpi(\chi) \right] + \Pi(\chi, D) \left[ z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) \right]
+ u^{-1}\, \Pi(\chi, D) \left\{ \Xi(\chi) \cdot \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] + \Xi(-\chi) \cdot \left[ n^{(\mathrm{det})}(-\chi) * \not{h}\!\left( -\frac{\chi}{u} \right) \right] \right\} . \tag{6.60d}
\]
To show that the noise term is handled correctly in (6.60d), we note that when χ is positive, the
first term inside the braces { } specifies the noise because the second term is zero; and when χ is
negative, the second term specifies the noise to be the same as it is for −χ ≥ 0 because then the
first term is zero. This ensures that the random noise inside the braces automatically has the same
value at +χ and −χ.
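The mirroring behavior of the Even operator is easy to see on sampled data. The sketch below works on a list of samples taken at χ = 0, Δ, 2Δ, ..., D rather than on a continuous function; the helper name `even_extend`, the sample spacing, the test signal, and the noise level are all assumptions introduced for illustration.

```python
import math
import random


def even_extend(samples):
    """Mirror one-sided samples z(0), z(step), ..., z(D) into the range -D..D.

    Discrete counterpart of Even[z(chi)] = Pi(chi, D) z(|chi|) in Eq. (6.60b):
    the returned list holds z(|chi|) for chi = -D, ..., -step, 0, step, ..., D.
    """
    return samples[:0:-1] + samples


if __name__ == "__main__":
    rng = random.Random(0)
    step, count = 0.1, 6

    clean = [math.cos(2.0 * math.pi * k * step) for k in range(count)]
    noisy = [value + rng.gauss(0.0, 0.05) for value in clean]

    doubled_clean = even_extend(clean)
    doubled_noisy = even_extend(noisy)

    # The error at -chi is identical to the error at +chi, sample by sample.
    errors = [n - c for n, c in zip(doubled_noisy, doubled_clean)]
    print(errors == errors[::-1])
```

Because the negative-χ half is copied from the positive-χ half, the noise in the doubled signal is perfectly correlated between ±χ, which is why the text treats the two halves of the noise term together.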
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \mathrm{Even}\!\left[ z_{CN}^{(\mathrm{tot})}(\chi) * \varpi(\chi) \right] \right)
= \mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \left[ z_C(\chi) * \varpi(\chi) \right] \right) + \mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \left[ z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) \right] \right)
\]
\[
+\, u^{-1}\, \mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \cdot \Xi(\chi) \cdot \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \right)
+ u^{-1}\, \mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \cdot \Xi(-\chi) \cdot \left[ n^{(\mathrm{det})}(-\chi) * \not{h}\!\left( -\frac{\chi}{u} \right) \right] \right) . \tag{6.61}
\]
The first two terms on the right-hand side are easier to evaluate than the last two, so we start with
the first two and leave the more difficult work for later.
Using the Fourier convolution theorem once on the first term [see Eq. (2.39j) of Chapter 2]
gives
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \left[ z_C(\chi) * \varpi(\chi) \right] \right)
= \mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \right) * \mathrm{F}^{(-i\sigma\chi')}\!\left( z_C(\chi') * \varpi(\chi') \right)
\]
\[
= \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * \left\{ \frac{W A\,\Delta\Omega}{4} \left| H(u\sigma) \right| M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{FOV}(|\sigma|) \right\} , \tag{6.62a}
\]
where in the last step we substitute from Eqs. (6.24b) and (6.58f) to evaluate the convolved
Fourier transforms. According to the discussion following Eq. (5.82c) in Chapter 5, everything
inside the braces { } is a slowly varying function of σ compared to $L_{FOV}$; and Sec. 5.15 of Chapter
5 explains why $\mathrm{sinc}(2\pi\sigma D)$ must, in a well-designed interferometer, be a narrow function
varying no less rapidly than the major features of $L_{FOV}$. Hence everything inside the braces
(except $L_{FOV}$) must be slowly varying with σ compared to the narrow function $\mathrm{sinc}(2\pi\sigma D)$.
Therefore, according to Eq. (5C.1) in Appendix 5C of Chapter 5, the convolution in (6.62a)
primarily affects $L_{FOV}$, giving us
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \left[ z_C(\chi) * \varpi(\chi) \right] \right)
\cong \frac{W A\,\Delta\Omega}{4} \left| H(u\sigma) \right| M(R\sigma\theta_{ma})\, R(|\sigma|)\, \eta(\sigma)\, \tau_f(|\sigma|)\, \tau_a(|\sigma|)\, L_{mnf}(|\sigma|) , \tag{6.62b}
\]
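The second term on the right-hand side of (6.61) is handled the same way as the first. Again
using Eq. (2.39j) in Chapter 2, we write
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D) \left[ z_C^{(\mathrm{cold})}(\chi) * \varpi(\chi) \right] \right)
= \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * \left\{ \frac{W A\,\Delta\Omega}{4} \left| H(u\sigma) \right| M(R\sigma\theta_{ma})\, \eta(\sigma)\, R(|\sigma|)\, \tau_a(|\sigma|) \left[ L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|) \right] \right\}
\]
with Eqs. (6.24b) and (6.59c) used to evaluate the convolved Fourier transforms. Only
\[
\left[ L_{FOV}^{(\mathrm{fore})}(|\sigma|) - L_{FOV}^{(\mathrm{back})}(|\sigma|) \right]
\]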
Uncalibrated Spectra of Single Sided Signals with Detector Noise · 6.19
inside the braces { } is not a slowly varying function of σ compared to $\mathrm{sinc}(2\pi\sigma D)$, so again Eq.
(5C.1) in Appendix 5C can be used to write
\[
L_{mnf}^{(\mathrm{fore})}(|\sigma|) = \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * L_{FOV}^{(\mathrm{fore})}(|\sigma|) \tag{6.63c}
\]
and
\[
L_{mnf}^{(\mathrm{back})}(|\sigma|) = \left[ 2D\, \mathrm{sinc}(2\pi\sigma D) \right] * L_{FOV}^{(\mathrm{back})}(|\sigma|) . \tag{6.63d}
\]
Now we are ready to analyze the last two terms in Eq. (6.61). Evaluation of the forward
Fourier transforms of $\not{h}(\chi/u)$, $[\Pi(\chi, D) \cdot \Xi(\chi)]$, and $[\Pi(\chi, D) \cdot \Xi(-\chi)]$ comes first.
Taking the forward Fourier transform of $\not{h}(\chi/u)$ defined in Eq. (6.57a) gives, applying the
Fourier convolution theorem [Eq. (2.39a) in Chapter 2],
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \not{h}\!\left( \frac{\chi}{u} \right) \right) = \mathrm{F}^{(-i\sigma\chi)}\!\left( h\!\left( \frac{\chi}{u} \right) \right) \cdot \mathrm{F}^{(-i\sigma\chi')}\!\left( \varpi(\chi') \right) .
\]
This can be written as, substituting from Eqs. (6.27b) and (5.88b) in Chapter 5,
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \not{h}\!\left( \frac{\chi}{u} \right) \right) = u\, H(u\sigma) \cdot V(\sigma)\, e^{-i\psi(\sigma)} = u\, V(\sigma) \left| H(u\sigma) \right| , \tag{6.64a}
\]
where the last step substitutes from (6.58e) to simplify the formula. According to Eq. (6.58g), the magnitude of the effective
transfer function $|H(u\sigma)|$ is even with respect to σ, and of course it must also be real. Function
V(σ) is real and, according to Eq. (5.88e) in Chapter 5, it is also even. Hence, (6.64a) reveals that
the forward Fourier transform of $\not{h}(\chi/u)$ is real and even. Entry 1 of Table 2.1 in Chapter 2 now
shows that $\not{h}$ itself must be real and even:
\[
\not{h}\!\left( -\frac{\chi}{u} \right) = \not{h}\!\left( \frac{\chi}{u} \right) \tag{6.64b}
\]
and
\[
\mathrm{Im}\!\left( \not{h}\!\left( \frac{\chi}{u} \right) \right) = 0 . \tag{6.64c}
\]
For future use, we note that $\not{h}(\chi/u)$, just like h(t), is a relatively narrow function of its argument.
To see why this is so, we consult Eq. (6.21a) and note that there exists a time T such that
Function $\varpi(\chi)$ is also a relatively narrow function of χ with [see Eq. (5.88h) in Chapter 5]
Function $\not{h}(\chi/u)$ is, according to Eq. (6.57a), the convolution of $h(\chi/u)$ and $\varpi(\chi)$ and so can
be written as [see the definition of the convolution in Eq. (2.38a) of Chapter 2]
\[
\not{h}\!\left( \frac{\chi}{u} \right) = \int_{-\infty}^{\infty} h\!\left( \frac{\chi'}{u} \right) \varpi(\chi - \chi')\, d\chi' .
\]
\[
\not{h}\!\left( \frac{\chi}{u} \right) \cong \int_{-uT}^{uT} h\!\left( \frac{\chi'}{u} \right) \varpi(\chi - \chi')\, d\chi' . \tag{6.65c}
\]
\[
h\!\left( \frac{\chi'}{u} \right) \varpi(\chi - \chi')
\]
\[
|\chi - \chi'| < d ,
\]
because, when this is not true, (6.65b) forces $\varpi$ to be small. But the limits on the integral confine
$\chi'$ to values between +uT and −uT, so when
\[
|\chi| > d + uT ,
\]
it is impossible for
\[
h\!\left( \frac{\chi'}{u} \right) \varpi(\chi - \chi')
\]
to make a significant contribution to the integral for any of the allowed values of $\chi'$.
Consequently,
\[
\not{h}\!\left( \frac{\chi}{u} \right) \cong \int_{-uT}^{uT} h\!\left( \frac{\chi'}{u} \right) \varpi(\chi - \chi')\, d\chi'
\]
This demonstration that h/ is a narrow function relies only on its being the convolution of two
other narrow functions. In general, the convolution of two narrow functions produces another
narrow function whose width can be no wider than (approximately) the sum of the widths of the
functions being convolved.
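The width rule stated above can be made concrete with a discrete toy example. In the sketch below two box-shaped sequences stand in for the narrow functions h and ϖ; the box half-widths, the helper names, and the use of a direct-sum convolution are assumptions chosen only to keep the example self-contained.

```python
def convolve(a, b):
    """Direct discrete convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def support_width(seq, tol=1e-12):
    """Number of samples between the first and last entries that are not negligible."""
    idx = [i for i, v in enumerate(seq) if abs(v) > tol]
    return idx[-1] - idx[0] + 1 if idx else 0


if __name__ == "__main__":
    half_h, half_w = 2, 4                   # hypothetical half-widths, playing the roles of uT and d
    box_h = [1.0] * (2 * half_h + 1)        # stand-in for h, nonzero for |chi| <= uT
    box_w = [1.0] * (2 * half_w + 1)        # stand-in for varpi, nonzero for |chi| <= d

    result = convolve(box_h, box_w)
    # Support of the convolution spans 2 * (half_h + half_w) + 1 samples,
    # i.e. its half-width is the sum of the two half-widths.
    print(support_width(result))
```

For these boxes the half-width of the result is exactly the sum of the two half-widths, mirroring the bound |χ| ≤ d + uT obtained in the text.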
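The forward Fourier transform of $[\Pi(\chi, D) \cdot \Xi(\chi)]$ is, according to Eq. (6.22b) and (6.60c),
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \Xi(\chi) \right) = \int_0^{D} e^{-2\pi i \sigma \chi}\, d\chi = \int_{-\infty}^{\infty} \Pi\!\left( \chi - \frac{D}{2}, \frac{D}{2} \right) e^{-2\pi i \sigma \chi}\, d\chi
= \mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi\!\left( \chi - \frac{D}{2}, \frac{D}{2} \right) \right) .
\]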
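\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \Xi(-\chi) \right) = \int_{-D}^{0} e^{-2\pi i \sigma \chi}\, d\chi = \int_{-\infty}^{\infty} \Pi\!\left( \chi + \frac{D}{2}, \frac{D}{2} \right) e^{-2\pi i \sigma \chi}\, d\chi
\]
or
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \Xi(-\chi) \right) = e^{\pi i \sigma D} \left[ D\, \mathrm{sinc}(\pi\sigma D) \right] . \tag{6.66b}
\]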
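The closed forms for the half-interval transforms, D e^{−πiσD} sinc(πσD) for the interval from 0 to D and Eq. (6.66b) for the interval from −D to 0, can be checked against direct numerical integration. The sketch below uses a plain midpoint rule; the function names, step count, and test values of σ and D are arbitrary assumptions.

```python
import cmath
import math


def sinc(x):
    """Unnormalized sinc, sin(x)/x."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def closed_form(sigma, d_max, sign):
    """D * exp(sign * i * pi * sigma * D) * sinc(pi * sigma * D).

    sign = -1 gives the transform of Pi(chi, D) * Xi(chi)  (interval 0..D);
    sign = +1 gives Eq. (6.66b), the transform of Pi(chi, D) * Xi(-chi)  (interval -D..0).
    """
    return d_max * cmath.exp(sign * 1j * math.pi * sigma * d_max) * sinc(math.pi * sigma * d_max)


def numeric_transform(sigma, lower, upper, steps=20000):
    """Midpoint-rule estimate of the integral of exp(-2 pi i sigma chi) from lower to upper."""
    step = (upper - lower) / steps
    total = 0j
    for i in range(steps):
        chi = lower + (i + 0.5) * step
        total += cmath.exp(-2j * math.pi * sigma * chi)
    return total * step


if __name__ == "__main__":
    sigma, d_max = 0.7, 2.0
    print(abs(numeric_transform(sigma, 0.0, d_max) - closed_form(sigma, d_max, -1)))
    print(abs(numeric_transform(sigma, -d_max, 0.0) - closed_form(sigma, d_max, +1)))
```

Both differences should be at the level of the quadrature error, and the two closed forms are complex conjugates of each other, as expected for mirror-image intervals.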
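\[
n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) = \int_{-\infty}^{\infty} n^{(\mathrm{det})}(\chi')\, \not{h}\!\left( \frac{\chi - \chi'}{u} \right) d\chi'
\cong \int_{\chi - (d + uT)}^{\chi + (d + uT)} n^{(\mathrm{det})}(\chi')\, \not{h}\!\left( \frac{\chi - \chi'}{u} \right) d\chi' . \tag{6.67a}
\]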
The approximation in the last step comes from noting that the product
\[
n^{(\mathrm{det})}(\chi')\, \not{h}\!\left( \frac{\chi - \chi'}{u} \right)
\]
is negligible when $(\chi - \chi')$ lies outside the range of values between (d+uT) and −(d+uT) for
which $\not{h}$ is significantly different from zero [see (6.65d)]. Multiplying both sides of (6.67a) by
$\Pi(\chi, D)$ gives
\[
\Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \cong \Pi(\chi, D) \int_{\chi - (d + uT)}^{\chi + (d + uT)} n^{(\mathrm{det})}(\chi')\, \not{h}\!\left( \frac{\chi - \chi'}{u} \right) d\chi' .
\]
The new $\Pi(\chi, D)$ factor reduces this equation to 0 = 0 when $|\chi| > D$. Remembering that $\not{h}$ is
negligible whenever $(\chi - \chi')$ lies outside the range of values between (d+uT) and −(d+uT), we
extend the limits of the integral on the right-hand side to get the new approximation
\[
\Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \cong \Pi(\chi, D) \int_{-D - (d + uT)}^{D + (d + uT)} n^{(\mathrm{det})}(\chi')\, \not{h}\!\left( \frac{\chi - \chi'}{u} \right) d\chi' .
\]
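\[
\Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \cong \Pi(\chi, D) \int_{-\infty}^{\infty} \Pi(\chi', \tilde{D})\, n^{(\mathrm{det})}(\chi')\, \not{h}\!\left( \frac{\chi - \chi'}{u} \right) d\chi' ,
\]
where
\[
\tilde{D} = D + d + uT . \tag{6.67b}
\]
\[
\Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \cong \Pi(\chi, D) \left\{ \left[ \Pi(\chi, \tilde{D})\, n^{(\mathrm{det})}(\chi) \right] * \not{h}\!\left( \frac{\chi}{u} \right) \right\} . \tag{6.67c}
\]
\[
\Xi(\chi)\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \cong \Xi(\chi)\, \Pi(\chi, D) \left\{ \left[ \Pi(\chi, \tilde{D})\, n^{(\mathrm{det})}(\chi) \right] * \not{h}\!\left( \frac{\chi}{u} \right) \right\} .
\]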
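Taking the forward Fourier transform of both sides gives, using Eqs. (2.39j) and (2.39a) in
Chapter 2,
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Xi(\chi)\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \right)
\cong \mathrm{F}^{(-i\sigma\chi)}\!\left( \Xi(\chi)\, \Pi(\chi, D) \right) * \left\{ \mathrm{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', \tilde{D})\, n^{(\mathrm{det})}(\chi') \right) \cdot \mathrm{F}^{(-i\sigma\chi'')}\!\left( \not{h}\!\left( \frac{\chi''}{u} \right) \right) \right\} .
\]
This can be written as, substituting from Eqs. (6.64a) and (6.66a),
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Xi(\chi)\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \right)
\cong \left[ D\, e^{-\pi i \sigma D}\, \mathrm{sinc}(\pi\sigma D) \right] * \left[ u\, V(\sigma) \left| H(u\sigma) \right| \cdot \mathrm{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', \tilde{D})\, n^{(\mathrm{det})}(\chi') \right) \right] .
\]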
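We note, due to the size of the σD product, that $e^{-\pi i \sigma D}\, \mathrm{sinc}(\pi\sigma D)$ is about as narrow and rapidly
varying a function of σ as $\mathrm{sinc}(\pi\sigma D)$. Hence, glancing back at the discussion following Eq.
(6.62a) above, we see that V(σ) and $|H(u\sigma)|$ vary slowly with σ compared to $e^{-\pi i \sigma D}\, \mathrm{sinc}(\pi\sigma D)$,
which means, according to Eq. (5C.1) in Appendix 5C of Chapter 5, that V(σ) and $|H(u\sigma)|$ can
be moved outside the convolution:
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Xi(\chi)\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \right)
\cong u\, V(\sigma) \left| H(u\sigma) \right| \left\{ \left[ D\, e^{-\pi i \sigma D}\, \mathrm{sinc}(\pi\sigma D) \right] * \left[ \mathrm{F}^{(-i\sigma\chi')}\!\left( \Pi(\chi', \tilde{D})\, n^{(\mathrm{det})}(\chi') \right) \right] \right\} .
\]
Remembering that
\[
D\, e^{-\pi i \sigma D}\, \mathrm{sinc}(\pi\sigma D) = \mathrm{F}^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\, \Xi(\chi) \right)
\]
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Xi(\chi)\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \right)
\cong u\, V(\sigma) \left| H(u\sigma) \right| \mathrm{F}^{(-i\sigma\chi)}\!\left( \Xi(\chi)\, \Pi(\chi, D)\, \Pi(\chi, \tilde{D})\, n^{(\mathrm{det})}(\chi) \right) .
\]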
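From Eq. (6.67b), we know $\tilde{D} > D$, which means that [see Eq. (6.22b)]
\[
\Pi(\chi, D)\, \Pi(\chi, \tilde{D}) = \Pi(\chi, D) .
\]
Therefore
\[
\mathrm{F}^{(-i\sigma\chi)}\!\left( \Xi(\chi)\, \Pi(\chi, D) \left[ n^{(\mathrm{det})}(\chi) * \not{h}\!\left( \frac{\chi}{u} \right) \right] \right)
\cong u\, V(\sigma) \left| H(u\sigma) \right| \mathrm{F}^{(-i\sigma\chi)}\!\left( \Xi(\chi)\, \Pi(\chi, D)\, n^{(\mathrm{det})}(\chi) \right) . \tag{6.67d}
\]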
This takes care of the third term on the right-hand side of Eq. (6.61). At no point during this
derivation did we make any assumptions about the behavior of n (det) ( χ ) ; it acts as a placeholder
and could be replaced by other functions—both random and nonrandom—without making any
part of the analysis untrue.
It is now time to simplify the fourth and last term in Eq. (6.61). We have just remarked that, in
the analysis of the third term in (6.61), n (det) ( χ ) acts as a placeholder and can be replaced by any
other reasonable choice. It turns out that we are not so much interested in modifying the final
result in Eq. (6.67d) as we are in modifying the approximation in (6.67c) that appears partway
- 836 -
Uncalibrated Spectra of Single Sided Signals with Detector Noise · 6.19
through the derivation of (6.67d). Replacing the n (det) ( ) placeholder in (6.67c) by n(det) ( )
gives
ª § ·º § ·½
( , D ) « n (det) ( ) h ¨ ¸ »
( , D ) ® ¬ª ( , D) n (det) ( ) ¼º h ¨ ¸ ¾ . (6.68a)
¬ © u ¹¼ ¯ © u ¹¿
This can be written as, using (6.64b) to modify the left-hand side,
ª § ·º § ·½
( , D) « n (det) ( ) h ¨ ¸ »
( , D) ® ª¬ ( , D) n (det) ( ) º¼ h ¨ ¸ ¾ . (6.68b)
¬ © u ¹¼ ¯ © u ¹¿
Multiplying through by ( ) and taking the forward Fourier transform of both sides leads to
§ ª § ·º ·
F ( i) ) ¨ ( ) ( , D) « n (det) ( ) h ¨ ¸ » ¸
© ¬ © u ¹¼ ¹
(6.68c)
§ §· ½ ·
F ( i) ) ¨ ( ) ( , D) ® ¬ª ( , D) n (det) ( ) ¼º h ¨ ¸ ¾ ¸ .
© ¯ © u ¹¿ ¹
The left-hand side of this formula is (after dividing by u) exactly the same as the Fourier
transform of the fourth term in (6.61) that we need to evaluate. We apply Eqs. (2.39a) and
(2.39j) in Chapter 2 to the right-hand side to get
§ ª § ·º ·
F ( i) ) ¨ ( ) ( , D) « n (det) ( ) h ¨ ¸ » ¸
© ¬ © u ¹¼ ¹
§ § 3 ·· ½
F ( i) ) ( ) ( , D) ®F ( i) 3) ¨ h ¨ ¸ ¸ A F ( i) 33) ( 33, D) n (det) ( 33) ¾ .
¯ © © u ¹¹ ¿
§ ª § ·º ·
F ( i) ) ¨ ( ) ( , D) « n (det) ( ) h ¨ ¸ » ¸
© ¬ © u ¹¼ ¹ (6.68d)
1
¬ª De& i) D sinc(&) D) ¼º uV () ) H(u) ) A F ( i) ) ( , D) n (det) ( ) . 2
Again, just like in the analysis of the third term of (6.61), Eq. (5C.1) in Appendix 5C of Chapter
5 is used to move V and H outside the convolution because they vary slowly compared to
§ ª § χ ·º ·
F ( −iσχ ) ¨ Ξ(− χ )Π ( χ , D) « n (det) (− χ ) ∗ h/ ¨ − ¸ » ¸
© ¬ © u ¹¼ ¹
{ ( )}
≅ uV (σ ) H(uσ ) ⋅ ¬ª Deπ iσ D sinc(πσ D) ¼º ∗ F ( −iσχ ) Π ( χ , D) n (det) (− χ ) .
§ ª § χ ·º ·
F ( −iσχ ) ¨ Ξ(− χ )Π ( χ , D) « n (det) (− χ ) ∗ h/ ¨ − ¸ » ¸
© ¬ © u ¹¼ ¹
{ ( )}
≅ uV (σ ) H(uσ ) ⋅ F ( −iσχ ) Ξ(− χ )Π ( χ , D) Π ( χ , D) n (det) (− χ ) .
Again we note [see Eq. (6.67b)] that D > D , making Π ( χ , D)Π ( χ , D) = Π ( χ , D) . Hence,
§ ª § χ ·º ·
F ( −iσχ ) ¨ Ξ(− χ )Π ( χ , D) « n (det) (− χ ) ∗ h/ ¨ − ¸ » ¸
© ¬ © u ¹¼ ¹ (6.68e)
{ ( )}
≅ uV (σ ) H(uσ ) ⋅ F ( − iσχ ) Ξ(− χ )Π ( χ , D) n (det) (− χ ) .
This takes care of the fourth term on the right-hand side of Eq. (6.61).
Before substituting our results back into Eq. (6.61), it makes sense to use the linearity of the
Fourier transform (see Sec. 2.6 in Chapter 2) to combine the equation’s third and fourth terms.
Multiplying by u −1 and adding together (6.67d) and (6.68e) gives
§ ª § χ ·º ·
u -1F ( − iσχ ) ¨ Ξ( χ ) Π ( χ , D) « n (det) ( χ ) ∗ h/ ¨ ¸ » ¸
© ¬ © u ¹¼ ¹
§ ª § χ ·º ·
+ u -1F ( −iσχ ) ¨ Ξ(− χ )Π ( χ , D) « n (det) (− χ ) ∗ h/ ¨ − ¸ » ¸
© ¬ © u ¹¼ ¹
{ (
≅ V (σ ) H (uσ ) F ( −iσχ ) Ξ( χ )Π ( χ , D) n (det) ( χ ) ) (6.69a)
(
+ F ( −iσχ ) Ξ(− χ )Π ( χ , D) n (det) (− χ ) )}
(
= V (σ ) H (uσ ) F ( −iσχ ) Π ( χ , D) ¬ªΞ( χ ) n (det) ( χ ) + Ξ(− χ )n (det) (− χ ) ¼º . )
Now we can substitute into Eq. (6.61) the approximations shown in Eqs. (6.62b), (6.63b), and
(6.69a) to get
(
+ V (σ ) H (uσ ) F ( −iσχ ) Π ( χ , D) ª¬ Ξ( χ ) n (det) ( χ ) + Ξ(− χ )n (det) (− χ ) º¼ . )
The next section explains how to analyze the noise term in this formula.
The Heaviside step function $\Xi(\chi)$ from Eq. (6.60c) ensures that $\tilde n_E^{(det)}$ always has the same value
at $-\chi$ as it does at $+\chi$: when $\chi = |\chi|$ is positive, the first term specifies the value of $\tilde n_E^{(det)}$
because the second term is zero; and when $\chi = -|\chi|$ is negative, the second term specifies the
value of $\tilde n_E^{(det)}$ to be the same as it is for $\chi = |\chi|$ because the first term is zero. This means random
function $\tilde n_E^{(det)}$ is always even,
$$\tilde n_E^{(det)}(-\chi) = \tilde n_E^{(det)}(\chi) , \tag{6.70b}$$
and, because it represents noise contaminating a real signal, it must also be real:
$$\mathrm{Im}\big(\tilde n_E^{(det)}(\chi)\big) = 0 . \tag{6.70c}$$
Following the same pattern as in the previous $\chi$-based noise terms [see Eq. (6.29a)], we define the
D-limited forward Fourier transform of $\tilde n_E^{(det)}$ to be
$$\tilde n_{DE}^{(det)}(\sigma) = F^{(-i\sigma\chi)}\big(\Pi(\chi,D)\,\tilde n_E^{(det)}(\chi)\big) = \int_{-\infty}^{\infty} \Pi(\chi,D)\,\tilde n_E^{(det)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \tag{6.70d}$$
or
$$\tilde n_{DE}^{(det)}(\sigma) = \int_{-D}^{D} \tilde n_E^{(det)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi . \tag{6.70e}$$
$$\tilde n_{DE}^{(det)}(\sigma) = F^{(-i\sigma\chi)}\big(\Pi(\chi,D)\,[\Xi(\chi)\cdot\tilde n^{(det)}(\chi) + \Xi(-\chi)\cdot\tilde n^{(det)}(-\chi)]\big) . \tag{6.70f}$$
We have just seen that function $\tilde n_E^{(det)}$ is real and even. Function $\Pi(\chi,D)$ is also real and even
[see Eq. (6.22b) above], so $\tilde n_{DE}^{(det)}(\sigma)$ in (6.70d) and (6.70e) is the forward Fourier transform of a
real and even function. This makes $\tilde n_{DE}^{(det)}(\sigma)$ another real and even function (see entry 1 of Table
2.1 in Chapter 2):
$$\tilde n_{DE}^{(det)}(-\sigma) = \tilde n_{DE}^{(det)}(\sigma) \tag{6.70g}$$
and
$$\mathrm{Im}\big(\tilde n_{DE}^{(det)}(\sigma)\big) = 0 . \tag{6.70h}$$
The expectation value of $\tilde n_E^{(det)}(\chi)$ is, applying the expectation operator E to both sides of
(6.70a),
$$E\big(\tilde n_E^{(det)}(\chi)\big) = \Xi(\chi)\,E\big(\tilde n^{(det)}(\chi)\big) + \Xi(-\chi)\,E\big(\tilde n^{(det)}(-\chi)\big) ,$$
using the linearity of the expectation operator with respect to random quantities discussed in Sec.
3.10 of Chapter 3. Since
$$E\big(\tilde n^{(det)}(\chi)\big) = 0$$
for any value of $\chi$ [see Eq. (6.17b)], we can now see that
$$E\big(\tilde n_E^{(det)}(\chi)\big) = 0 . \tag{6.70i}$$
Applying the expectation operator E to both sides of Eq. (6.70e) gives, using Eq. (3.17c) in
Chapter 3,
$$E\big(\tilde n_{DE}^{(det)}(\sigma)\big) = \int_{-D}^{D} E\big(\tilde n_E^{(det)}(\chi)\big)\,e^{-2\pi i\sigma\chi}\,d\chi .$$
Since we now know that $E\big(\tilde n_E^{(det)}(\chi)\big)$ is zero, this shows that
$$E\big(\tilde n_{DE}^{(det)}(\sigma)\big) = 0 . \tag{6.70j}$$
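The evenness argument behind Eqs. (6.70b), (6.70g), and (6.70h) can be checked numerically. The sketch below is only an illustration under stated assumptions: the noise is modeled as independent Gaussian samples on a uniform OPD grid, the D-limited transform is replaced by a Riemann sum, and the helper names (`even_extend`, `limited_transform`) are hypothetical.

```python
import cmath
import random

def even_extend(samples):
    """Mirror samples taken at chi = 0, d, 2d, ... onto negative chi."""
    return samples[:0:-1] + samples   # chi = -(N-1)d, ..., 0, ..., (N-1)d

def limited_transform(values, d_chi, sigma):
    """Riemann-sum stand-in for the D-limited forward Fourier transform."""
    half = len(values) // 2           # index of the chi = 0 sample
    total = 0j
    for k, v in enumerate(values):
        chi = (k - half) * d_chi
        total += v * cmath.exp(-2j * cmath.pi * sigma * chi)
    return total * d_chi

random.seed(0)                        # assumed: white Gaussian noise samples
d_chi = 0.01
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
noise_even = even_extend(noise)       # plays the role of the even-extended noise

spec_pos = limited_transform(noise_even, d_chi, 3.7)
spec_neg = limited_transform(noise_even, d_chi, -3.7)
print(spec_pos, spec_neg)
```

Up to rounding error, the imaginary part vanishes and the values at +σ and −σ agree, which is the discrete counterpart of a real and even spectrum.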
The detector-noise term in Eq. (6.69b) can be simplified by substituting from Eq. (6.70f):
$$F^{(-i\sigma\chi)}\big(\mathrm{Even}[z_{CN}^{(tot)}(\chi) * \varpi(\chi)]\big)$$
as the uncalibrated, noise-contaminated signal spectrum at point C in Fig. 6.2, because it plays
the same role that
$$F^{(-i\sigma\chi)}\big(\Pi(\chi,D)\,z_{CN}^{(tot)}(\chi)\big)$$
does in the double-sided signal spectrum specified in Eq. (6.30a) above. Comparing the formulas
for
$$F^{(-i\sigma\chi)}\big(\mathrm{Even}[z_{CN}^{(tot)}(\chi) * \varpi(\chi)]\big) \quad\text{and}\quad F^{(-i\sigma\chi)}\big(\Pi(\chi,D)\,z_{CN}^{(tot)}(\chi)\big)$$
in Eqs. (6.71a) and (6.30a), we see that there is an exact correspondence if H(uσ) in (6.30a) is
matched with H(uσ) in (6.71a) and if $\tilde n_D^{(det)}(\sigma)$ in (6.30a) is matched with $[V(\sigma)\,\tilde n_{DE}^{(det)}(\sigma)]$ in
(6.71a):
$$H(u\sigma) \Leftrightarrow H(u\sigma) \tag{6.71b}$$
and
$$\tilde n_D^{(det)}(\sigma) \Leftrightarrow [V(\sigma)\,\tilde n_{DE}^{(det)}(\sigma)] .$$
We also note that the expectation value of the spectral noise $[V(\sigma)\,\tilde n_{DE}^{(det)}(\sigma)]$ in (6.71a) is zero,
$$E\big(V(\sigma)\,\tilde n_{DE}^{(det)}(\sigma)\big) = V(\sigma)\,E\big(\tilde n_{DE}^{(det)}(\sigma)\big) = 0 . \tag{6.71c}$$
Knowing that the spectral noise in (6.70a) has a zero expectation value, we can repeat the
mathematical analysis used in Sec. 6.11 to extract the $L_{mnf}$ data from the uncalibrated spectrum
in (6.30a), only this time making the replacements for H(uσ) and $\tilde n_D^{(det)}(\sigma)$
specified in (6.71b). The formulas for $Z^{(1)}_{eff,tot}(\sigma)$ and $Z^{(2)}_{eff,tot}(\sigma)$ in Eqs. (6.33a) and (6.33b) now
become
$$Z^{(1)}_{eff,tot}(\sigma) \cong \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\cdot\big[\tau_f(|\sigma|)\,L^{(1)}(|\sigma|) + L^{(fore)}_{mnf}(|\sigma|) - L^{(back)}_{mnf}(|\sigma|)\big] \tag{6.71d}$$
and
$$Z^{(2)}_{eff,tot}(\sigma) \cong \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\cdot\big[\tau_f(|\sigma|)\,L^{(2)}(|\sigma|) + L^{(fore)}_{mnf}(|\sigma|) - L^{(back)}_{mnf}(|\sigma|)\big] . \tag{6.71e}$$
Substituting these expressions into the calibration formula in (6.35d) now gives
° Z ( meas ) (σ ) − Z (1) (σ ) ½°
ª¬ L(2) ( σ ) − L(1) ( σ )º¼ ⋅ ® eff(2),totN eff ,tot
¾ + L (σ )
(1)
(6.71g)
4V (σ ) n (det)
DE (σ )
= L mnf (σ ) + .
(WA ∆Ω) M( Rσθ ma ) η(σ ) R ( σ )τ a ( σ )τ f ( σ )
The most important difference between the single-sided formula in (6.71g) and the double-sided
formula in (6.35d) is that, according to Eq. (6.70h), $\tilde n_{DE}^{(det)}(\sigma)$ is strictly real whereas, as was
pointed out following Eq. (6.35d), $\tilde n_D^{(det)}(\sigma)$ has both real and imaginary components. According
to the discussion of the double-sided case following Eq. (6.36) above, the imaginary component
of $\tilde n_D^{(det)}(\sigma)$ is called the avoidable spectral noise because it can be eliminated by taking the real
part of the interferometer measurement; and the real component of $\tilde n_D^{(det)}(\sigma)$ is called the
unavoidable spectral noise because it cannot be eliminated from the interferometer measurement.
The avoidable spectral noise comes from the odd part of the $\tilde n^{(det)}(\chi)$ signal noise contaminating
the interferometer data, and the unavoidable spectral noise comes from the even part of the
$\tilde n^{(det)}(\chi)$ signal noise contaminating the interferometer data. The $\tilde n^{(det)}(\chi)$ noise contaminating
the double-sided signal has both even and odd components because the interferometer data is
recorded for both positive and negative values of the OPD value $\chi$. In the single-sided case, on
the other hand, interferometer data is recorded only for non-negative values of $\chi$ and then
artificially extended to negative $\chi$ values, automatically turning the noise contaminating the
signal into an even function of $\chi$ [see Eq. (6.70b)]. Consequently, the single-sided spectral noise
$\tilde n_{DE}^{(det)}(\sigma)$ is always real and even [see Eqs. (6.70g) and (6.70h)], and there is no avoidable noise
that can be eliminated by taking the real part of the measured spectrum. Hence, when comparing
the right-hand side of (6.71g) to a spectral radiance measurement contaminated by random error,
such as
L mnf () ) L () )
in Eq. (6.1a) above, we see that for single-sided spectral measurements contaminated by detector
noise all of n (det)
DE contributes to L , giving
4V () ) n (det)
DE () )
L () )
. (6.72a)
(WA ) M( R)' ma ) !() ) R ( ) )* a ( ) )* f ( ) )
function of $\sigma$. Functions $\eta$, M, and V are real and, according to Eq. (4.139g) in Chapter 4 and
Eqs. (5.10f) and (5.88e) in Chapter 5, even functions of $\sigma$. Functions R, $\tau_a$, and $\tau_f$ are also real
and have $|\sigma|$ for their argument, forcing them to be even functions of $\sigma$. It follows that Eq.
(6.72a) presents a well-founded single-sided formula for the random radiance error that is, as it
should be, a real and even random function of $\sigma$, just like Eq. (6.38c) above. Following the
convention adopted there, we write the random radiance error in (6.72c) as a function of $|\sigma|$ to get
4 V ( ) ) n (det)
DE ( ) )
L ( ) )
. (6.72b)
(WA ) M( R ) ' ma ) !( ) ) R ( ) )* a ( ) )* f ( ) )
4 V ( ) ) E n (det)
DE ( ) )
E L ( ) )
0. (6.73a)
(WA ) M( R ) ' ma ) !( ) ) R ( ) )* a ( ) ) * f ( ) )
To find the NEdN for the detector noise in the single-sided signal, we apply Eqs. (6.72b) and
(6.73a) to the formula in (6.3g) above to get, remembering that $W^2 = 1$ according to the
discussion following Eq. (4.83) in Chapter 4,
$$\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,V(|\sigma|)\,\sqrt{E\big([\tilde n_{DE}^{(det)}(|\sigma|)]^2\big)}}{(A\,\Delta\Omega)\,M(R|\sigma|\theta_{ma})\,\eta(|\sigma|)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}$$
or, removing the absolute value signs from the arguments of $\tilde n_{DE}^{(det)}$, $\eta$, M, and V,
$$\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,V(\sigma)\,\sqrt{E\big([\tilde n_{DE}^{(det)}(\sigma)]^2\big)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.73b}$$
The absolute value signs are removed because these functions are even [see Eq. (6.70g), Eq.
(4.139g) in Chapter 4, and Eqs. (5.10f) and (5.88e) in Chapter 5]. The subscript 1 and superscript
(det) show that this is the formula for the NEdN of a single-sided signal contaminated by detector
noise.
The quickest way to connect $\mathrm{NEdN}_1^{(det)}$ to the formula for the double-sided signal is to
analyze the detector noise as a time-based rather than a $\chi$-based random function. Returning to
the definition of $\tilde n_E^{(det)}(\chi)$ in Eq. (6.70a) above, we use $\chi = ut$ from Eq. (6.4) to convert both
sides of (6.70a) to time-based, rather than $\chi$-based, functions,
where Eq. (6.40b) is used to replace $\tilde n^{(det)}$ by $\tilde N^{(det)}$ on the left-hand side of the formula, and on
the right-hand side we define
$$\tilde N_E^{(det)}(t) = \tilde n_E^{(det)}(ut) \tag{6.74d}$$
so that
$$\tilde n_E^{(det)}(\chi) = \tilde N_E^{(det)}(\chi/u) . \tag{6.74e}$$
Equation (6.74c) is exactly the same as Eq. (3.73b) in Sec. 3.27 of Chapter 3 when $\tilde N^{(det)}(t)$ is
matched to $\tilde n(t)$ and $\tilde N_E^{(det)}(t)$ is matched to $\tilde n_E(t)$,
$$\tilde N^{(det)}(t) \Leftrightarrow \tilde n(t) \tag{6.75a}$$
and
$$\tilde N_E^{(det)}(t) \Leftrightarrow \tilde n_E(t) .$$
Remember that in this section all terms with the superscript “(det)” refer to the detector noise
being analyzed in this chapter and the terms without the superscript “(det)” come from Chapter 3.
Section 3.27 of Chapter 3 defines the T-limited forward Fourier transform of $\tilde n_E(t)$ to be,
according to Eq. (3.72b),
$$\tilde N_{TE}(f) = \int_{-T}^{T} \tilde n_E(t)\,e^{-2\pi i f t}\,dt . \tag{6.75b}$$
Following this lead, we define the T-limited forward Fourier transform of
$\tilde N_E^{(det)}(t)$ to be
$$\tilde N_{TE}^{(det)}(f) = \int_{-T}^{T} \tilde N_E^{(det)}(t)\,e^{-2\pi i f t}\,dt , \tag{6.75c}$$
where, just like in Eq. (6.40c) above, $T = D/u$. Since $\tilde N_E^{(det)}(t)$ matches up to $\tilde n_E(t)$ in (6.75a), it
follows that Eqs. (6.75b) and (6.75c) are now the same equation with $\tilde N_{TE}^{(det)}(f)$ matching up to
$\tilde N_{TE}(f)$,
$$\tilde N_{TE}^{(det)}(f) \Leftrightarrow \tilde N_{TE}(f) . \tag{6.75d}$$
The analysis presented in Sec. 3.27 [see Eq. (3.76c) in Chapter 3] shows that
$$E\big(|\tilde N_{TE}(f)|^2\big) \cong 2T\,S_{nn}(f) , \tag{6.75e}$$
where $S_{nn}(f)$ is the double-sided power spectrum of random function $\tilde n(t)$ in (6.75a). We know
that $\tilde N^{(det)}(t)$, which corresponds to $\tilde n(t)$, has, according to Eq. (6.41b), its own power spectrum
$S_{NN}^{(det)}(f)$. Since $\tilde N^{(det)}(t)$ corresponds to $\tilde n(t)$, the power spectrum $S_{nn}(f)$ in (6.75e) corresponds
to the power spectrum $S_{NN}^{(det)}(f)$,
$$S_{NN}^{(det)}(f) \Leftrightarrow S_{nn}(f) . \tag{6.75f}$$
Hence the formula corresponding to Eq. (6.75e), which has been directly copied from (3.76c) in
Chapter 3, must be, according to (6.75d) and (6.75f),
$$E\big(|\tilde N_{TE}^{(det)}(f)|^2\big) \cong 2T\,S_{NN}^{(det)}(f) . \tag{6.75g}$$
To find the counterpart of this result for $\chi$-based random functions, we follow Eq. (6.4) and
change the dummy variable of integration in (6.75c) to $\chi = ut$ to get
$$\tilde N_{TE}^{(det)}(f) = u^{-1}\int_{-uT}^{uT} \tilde N_E^{(det)}(\chi/u)\,e^{-2\pi i (f/u)\chi}\,d\chi .$$
According to Eqs. (6.40c), (6.40d), and (6.74e), this can be transformed into
$$\tilde N_{TE}^{(det)}(u\sigma) = u^{-1}\int_{-D}^{D} \tilde n_E^{(det)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi ,$$
We again consult Eqs. (6.40c) and (6.40d), this time using them to write (6.75g) as
$$E\big(|\tilde N_{TE}^{(det)}(u\sigma)|^2\big) \cong 2u^{-1}D\,S_{NN}^{(det)}(u\sigma) .$$
$$E\big(|\tilde n_{DE}^{(det)}(\sigma)|^2\big) \cong 2D\,p_{nn}^{(det)}(\sigma) \tag{6.76b}$$
or, since (6.70h) shows that $\tilde n_{DE}^{(det)}(\sigma)$ is strictly real,
$$E\big([\tilde n_{DE}^{(det)}(\sigma)]^2\big) \cong 2D\,p_{nn}^{(det)}(\sigma) . \tag{6.76c}$$
$$\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,V(\sigma)\,\sqrt{2D\,p_{nn}^{(det)}(\sigma)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{6.76d}$$
We are usually interested in the NEdN only for $\sigma$ values corresponding to the wavenumber
range over which $L(\sigma)$ is to be measured; that is, formula (6.76d) is almost always used for
wavenumbers $\sigma$ such that
$$\sigma_{min} \le \sigma \le \sigma_{max}$$
with $\sigma_{min}$ and $\sigma_{max}$ the same as in Eq. (5.78) in Chapter 5. According to Eq. (5.88d) in Chapter 5,
$V(\sigma)$ is always one for these $\sigma$ values, which means it can be eliminated from (6.76d),
$$\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,\sqrt{2D\,p_{nn}^{(det)}(\sigma)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} , \tag{6.76e}$$
$$\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,\sqrt{uD\,S_1^{(det)}(u|\sigma|)}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} , \tag{6.77b}$$
and
$$\mathrm{NEdN}_1^{(det)}(\sigma) = \frac{4\,\sqrt{uD\,A_d}}{(A\,\Delta\Omega)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,D^{*}(|\sigma|)} . \tag{6.77c}$$
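As a rough numerical companion to Eq. (6.76e), the following sketch evaluates the single-sided detector-noise NEdN from scalar inputs at one wavenumber. The function name, argument names, and sample numbers are all hypothetical, and the square root over $2D\,p_{nn}^{(det)}$ reflects the reading of the formula as a standard deviation; treat it as an assumption-laden illustration rather than a definitive implementation.

```python
import math

def nedn_single_sided(D, p_nn, A_dOmega, M, eta, R, tau_a, tau_f):
    """Evaluate 4*sqrt(2*D*p_nn) / (A_dOmega*M*eta*R*tau_a*tau_f) at one wavenumber.

    D        : maximum OPD of the single-sided interferogram
    p_nn     : chi-based double-sided noise power spectrum at this wavenumber
    A_dOmega : throughput product A * delta-Omega
    M, eta, R, tau_a, tau_f : modulation, efficiency, responsivity, transmissions
    """
    return 4.0 * math.sqrt(2.0 * D * p_nn) / (A_dOmega * M * eta * R * tau_a * tau_f)

# Illustrative, unit-free numbers chosen only to make the arithmetic easy to follow.
example = nedn_single_sided(D=1.0, p_nn=2.0, A_dOmega=0.5, M=1.0,
                            eta=1.0, R=4.0, tau_a=1.0, tau_f=1.0)
doubled = nedn_single_sided(D=2.0, p_nn=2.0, A_dOmega=0.5, M=1.0,
                            eta=1.0, R=4.0, tau_a=1.0, tau_f=1.0)
print(example, doubled / example)
```

With everything else held fixed, doubling D raises this NEdN by a factor of the square root of 2, as the $\sqrt{D}$ dependence in the formula implies.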
Detector Circuit as an Anti-Aliasing Filter · 6.22
$$F^{(-i\sigma\chi)}\big(\Pi(\chi,D)\,z_{CN}^{(tot)}(\chi)\big) = Z_{eff,tot}(\sigma) + H(u\sigma)\,\tilde n_D^{(det)}(\sigma) , \tag{6.78a}$$
where
$$Z_{eff,tot}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\cdot\big[\tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L^{(fore)}_{mnf}(|\sigma|) - L^{(back)}_{mnf}(|\sigma|)\big] \tag{6.78b}$$
$[\Pi(\chi,D)\,z_{CN}^{(tot)}(\chi)]$ and $[Z_{eff,tot}(\sigma) + H(u\sigma)\,\tilde n_D^{(det)}(\sigma)]$
in (6.78a) are a Fourier transform pair and then analyze what must happen to them when
$[\Pi(\chi,D)\,z_{CN}^{(tot)}(\chi)]$ is sampled and put through a DFT.
Section 2.21 of Chapter 2 explains the effect of sampling and the DFT on any two functions,
such as U(f) and u(t) in Eq. (2.91a) of Chapter 2, which form a Fourier-transform pair. To match
the interferometer signal at point C to functions u(t) and U(f), we write Eq. (6.78a) as
$$[Z_{eff,tot}(\sigma) + H(u\sigma)\,\tilde n_D^{(det)}(\sigma)] = \int_{-\infty}^{\infty} [\Pi(\chi,D)\,z_{CN}^{(tot)}(\chi)]\,e^{-2\pi i\sigma\chi}\,d\chi . \tag{6.79}$$
Comparing this to Eq. (2.91a) in Chapter 2, we note that wavenumber $\sigma$ corresponds to f, the
OPD value $\chi$ corresponds to t, function U(f) corresponds to
$[Z_{eff,tot}(\sigma) + H(u\sigma)\,\tilde n_D^{(det)}(\sigma)]$, and function u(t) corresponds to
$[\Pi(\chi,D)\,z_{CN}^{(tot)}(\chi)]$.
$$t \Leftrightarrow \chi , \tag{6.80a}$$
$$f \Leftrightarrow \sigma , \tag{6.80b}$$
$$u(t) \Leftrightarrow [\Pi(\chi,D)\,z_{CN}^{(tot)}(\chi)] , \tag{6.80c}$$
and
$$U(f) \Leftrightarrow [Z_{eff,tot}(\sigma) + H(u\sigma)\,\tilde n_D^{(det)}(\sigma)] . \tag{6.80d}$$
This corresponds to Eq. (2.92b) in Chapter 2, which states that the N equally spaced samples used
to represent u(t) are separated by intervals $\Delta t$ such that
$$N\,\Delta t = T .$$
$$\Delta t \Leftrightarrow \Delta\chi \tag{6.81b}$$
and
$$T \Leftrightarrow 2D . \tag{6.81c}$$
$$F = \frac{1}{\Delta t}$$
$$F \Leftrightarrow \frac{1}{\Delta\chi} . \tag{6.81d}$$
$$\sigma_{Nyq} = \frac{1}{2\Delta\chi} , \tag{6.81e}$$
$$F \Leftrightarrow 2\sigma_{Nyq} . \tag{6.81f}$$
Formulas (2.95c) and (2.95d) show what happens to the sampled interferometer signal when
the DFT is applied: the original Fourier-transform pair u(t) and U(f), which describes the signal
and its spectrum, changes into $u^{[\infty]}(t,T)$ and $U^{[\infty]}(f,F)$. The transformation of spectrum U(f)
into $U^{[\infty]}(f,F)$ is discussed at some length in Secs. 2.22 and 2.23 of Chapter 2, which show why
it is referred to as aliasing the signal spectrum. Equation (2.93b) defines $U^{[\infty]}(f,F)$ to be
$$U^{[\infty]}(f,F) = \sum_{r=-\infty}^{\infty} U(f - rF) .$$
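The replica sum above, combined with correspondences (6.81e) and (6.81f), means that spectral content at wavenumber σ reappears at every σ − r·2σ_Nyq. The short sketch below locates the replica that lands inside the band [−σ_Nyq, σ_Nyq). It is only an illustration; the function names and the sample spacing are hypothetical choices.

```python
def nyquist_wavenumber(d_chi):
    """sigma_Nyq = 1 / (2 * delta_chi), as in Eq. (6.81e)."""
    return 1.0 / (2.0 * d_chi)

def aliased_wavenumber(sigma, d_chi):
    """Fold sigma into [-sigma_Nyq, sigma_Nyq) by subtracting r * (2 * sigma_Nyq)."""
    period = 1.0 / d_chi              # replica spacing F <-> 2 * sigma_Nyq
    nyq = period / 2.0
    return (sigma + nyq) % period - nyq

d_chi = 0.25                          # assumed OPD sample spacing (arbitrary units)
print(nyquist_wavenumber(d_chi))      # 2.0
print(aliased_wavenumber(5.0, d_chi)) # 1.0: content at 5.0 shows up at 1.0
print(aliased_wavenumber(-2.5, d_chi))# 1.5
```

Noise located outside the Nyquist band is therefore not discarded by sampling; it is relocated into the measured band unless the detector circuit has already suppressed it.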
Therefore, applying correspondences (6.80b), (6.80d), and (6.81f) to (2.93b), we see that the
original noise-contaminated spectrum
in (6.78a) and (6.79) must transform, after sampling and the DFT, into
FIGURE 6.8(a). [Vertical axis: $Z_{eff,tot}$.]
When $\tilde n_D^{(det)} = 0$ in (6.82) (that is, in the absence of noise), Eq. (6.82) becomes the same as Eq.
(5.113c) in Chapter 5 if all the background radiances are negligible compared to the radiance
spectrum entering the interferometer. The practical consequences of Eq. (5.113c) are discussed at
length in Secs. 5.24 and 5.25 of Chapter 5. Following the same sort of reasoning used there, we
note that $Z_{eff,tot}$ is expected to be negligible or zero unless
$$\sigma_{min} \le \sigma \le \sigma_{max} ,$$
so that the spectrum is oversampled and its original shape preserved, as shown in Fig. 6.8(b). If
there is a large gap between $\sigma = 0$ and $\sigma = \sigma_{min}$, we can instead choose $\Delta\chi$ large enough to
undersample the spectrum while preserving its original shape, as shown in Fig. 6.8(c). When
$\tilde n_D^{(det)}$ is not zero, however, both oversampling and undersampling may introduce extra noise into
the measured spectrum if
$$H(u\sigma)\,\tilde n_D^{(det)}(\sigma)$$
all frequencies, both high and low; but we can design the detector circuit so that $H(u\sigma)$ is very
small for those wavenumbers $\sigma$ that can contaminate the spectral measurement due to aliasing.
Detector circuits of this sort are often referred to as anti-aliasing filters or as containing an
anti-aliasing filter. Although it may not be mandatory to design the anti-aliasing transfer function H so
that $H(u\sigma)$ is negligible or zero unless
$$\sigma_{min} \le \sigma \le \sigma_{max} ,$$
__________
Detectors are the major source of random error in almost all Michelson interferometers. The
NEdN of an interferometer measurement is defined at the beginning of this chapter to be the
standard deviation of the random measurement error, which suggests that some effort might be
required to observe detector noise. It turns out, however, that the distinctively “fuzzy” appearance
of detector noise [see Fig. 6.6(b)] usually means that a single spectral measurement is enough to
show its presence and importance. We have traced detector noise through the block diagram of a
standard Michelson interferometer (shown in Fig. 6.2), taking care to include the effect of the
calibration process on the spectral signal. In double-sided interferogram systems, some of the
signal noise can be eliminated rather easily by taking the real part of the noise-contaminated
measurement after the calibration algorithm has been applied. This lets us divide the signal noise
of double-sided systems into avoidable and unavoidable components. Signal noise is somewhat
more prominent in systems using single-sided interferograms, being larger by a factor of the
square root of 2, because there is no way to eliminate an avoidable component of the signal noise. This
is the inevitable price paid for the gain in spectral resolution discussed in Sec. 5.18 of Chapter 5.
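The square-root-of-2 penalty can be illustrated with a small variance calculation. The sketch below assumes white, unit-variance noise samples on a uniform OPD grid (an idealization not taken from the text) and compares two cases at the same maximum OPD: the even-extended single-sided record, where each sample at χ > 0 is counted twice, and the real part of a double-sided record with independent samples at ±χ. The function names are hypothetical.

```python
import math

def cos_sq_sum(sigma, d_chi, n):
    """Sum of cos^2(2*pi*sigma*k*d_chi) for k = 1 .. n-1."""
    return sum(math.cos(2.0 * math.pi * sigma * k * d_chi) ** 2 for k in range(1, n))

def noise_std_ratio(sigma, d_chi, n):
    """Ratio of spectral-noise standard deviations, single-sided over double-sided."""
    c = cos_sq_sum(sigma, d_chi, n)
    # Single-sided: d_chi * [n_0 + 2 * sum_k n_k cos(...)], samples for chi >= 0 only.
    var_single = d_chi ** 2 * (1.0 + 4.0 * c)
    # Double-sided real part: d_chi * sum over k = -(n-1) .. (n-1) of n_k cos(...).
    var_double_real = d_chi ** 2 * (1.0 + 2.0 * c)
    return math.sqrt(var_single / var_double_real)

std_ratio = noise_std_ratio(sigma=12.3, d_chi=0.01, n=2000)
print(std_ratio, math.sqrt(2.0))
```

For long records the ratio approaches the square root of 2, consistent with the statement that single-sided systems have no avoidable noise component to discard.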
FIGURE 6.8(b).
This is a schematic plot of the magnitude of the noise-contaminated spectral signal $Z_{eff,tot}$
against wavenumber $\sigma$ when the data has been oversampled. The solid lines represent
the noise-free $Z_{eff,tot}$ and the dashed lines represent its aliases. The solid bars represent
the high-frequency and low-frequency noise at their correct positions on the
wavenumber axis, and the dotted bars represent the high-frequency and low-frequency
noise at their aliased positions on the wavenumber axis. Only aliased high-frequency
noise ends up in the measured spectrum.
FIGURE 6.8(c).
This is a schematic plot of the magnitude of the noise-contaminated spectral signal $Z_{eff,tot}$
against wavenumber $\sigma$ when the data has been undersampled. The solid lines represent
the noise-free $Z_{eff,tot}$ and the dashed lines represent its aliases. The solid bars represent
the high-frequency and low-frequency noise at their correct positions on the
wavenumber axis, and the dotted bars represent the high-frequency and low-frequency
noise at their aliased positions on the wavenumber axis. Both low-frequency noise and
aliased high-frequency noise end up in the measured spectrum.
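FIGURE 6.8(d). [Plot of $H(u\sigma)$ against wavenumber, with the value 1.0 marked on the vertical axis and $\sigma_{min}$ and $\sigma_{max}$ marked on the wavenumber axis.]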
Appendix 6A
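When a spectral radiance $L(\sigma)$ is a slowly varying function of wavenumber, then the distortion
given by an interferometer’s field of view can be disregarded. To see why this is so, we use the
formula given in Eq. (6.5b) [and also in Eq. (5.83e) of Chapter 5] for $L_{FOV}(\sigma)$, the spectral
radiance distorted by an interferometer’s finite field of view $\Delta\Omega$ when $\Delta\Omega$ is small but also large
enough that the cosine of the off-axis angle cannot be approximated as one:
$$L_{FOV}(\sigma) = \frac{1}{\Delta\sigma}\int_{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L(\sigma')\,d\sigma' \tag{6A.1}$$
with
$$\Delta\sigma = \frac{\Delta\Omega\cdot\sigma}{2\pi} . \tag{6A.2}$$
When $L(\sigma)$ is a slowly varying function of wavenumber, we can assume that it is quasi-constant
when $\sigma$ changes by an amount $\Delta\sigma$, so the integral in (6A.1) can be approximated as
$$\int_{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L(\sigma')\,d\sigma' \cong L\!\left(\sigma-\frac{\Delta\sigma}{2}\right)\cdot\int_{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\left(1-\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} d\sigma' \cong \Delta\sigma\cdot L(\sigma) .$$
showing that an interferometer with a small but finite field of view does not significantly distort
the measured spectral radiance when the radiance is a slowly varying function of wavenumber.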
The effect of the interferometer’s finite interferogram length can also be shown to disappear
when L is a slowly varying function of wavenumber. Following the notation introduced in Sec.
5.15 of Chapter 5, we say that 2D is the finite length of the interferogram signal. According to
Eq. (5.108d) in Chapter 5,
is then the spectral radiance distorted by both the interferometer’s finite interferogram length and
its finite field of view. Using (6A.3), this reduces to
The sinc function has a tall central lobe centered on σ = 0 and then oscillates to zero as we move
away from the origin (see Fig. 5.23 in Chapter 5). Since L is a slowly varying function of
wavenumber, the sinc can be thought of as an extremely narrow function compared to L.
Appendix 5C of Chapter 5 discusses what happens when a narrow function such as
[2 Dsinc(2πσ D)] in (6A.4b) is convolved with a broad, slowly varying function such as L. To
make use of the work done in Appendix 5C, we consult Eq. (5C.4b) to get
h( z ) ∗ [G ( z ) ⋅ g ( z )] ≅ G ( z ) ⋅ [h( z ) ∗ g ( z )] .
Here, h(z) represents the narrow function and G(z) represents the broad function. We apply the
definition of the convolution in Eq. (2.38a) of Chapter 2 to just the right-hand side of this formula
to get
$$h(z) * [G(z)\cdot g(z)] \cong G(z)\cdot\int_{-\infty}^{\infty} h(z')\,g(z-z')\,dz' .$$
Remembering that h(z) represents the narrow sinc function and G(z) represents the broad, slowly
varying spectral radiance L, we set up the correspondences
$$z \Leftrightarrow \sigma$$
$$G(z) \Leftrightarrow L(|\sigma|)$$
$$h(z) \Leftrightarrow [2D\,\mathrm{sinc}(2\pi\sigma D)]$$
$$L_{mnf}(\sigma) \cong L(|\sigma|)\cdot\int_{-\infty}^{\infty} 2D\,\mathrm{sinc}(2\pi\sigma' D)\,d\sigma' . \tag{6A.4d}$$
Glancing back at Eq. (2.108a) in Chapter 2, we mentally replace F by D and t by $\sigma'$, noting that
when f = 0 formula (2.108a) becomes
Equation (2.56c) in Chapter 2 shows that $\Pi(0,D)$ is one for all $D > 0$, so
$$\int_{-\infty}^{\infty} [2D\,\mathrm{sinc}(2\pi D\sigma')]\,d\sigma' = 1 \tag{6A.5}$$
Hence, when the spectral radiance L is a slowly varying function of wavenumber with respect to
$[2D\,\mathrm{sinc}(2\pi\sigma D)]$ and with respect to a change in wavenumber
$$\Delta\sigma = \frac{\Delta\Omega\cdot\sigma}{2\pi} ,$$
then it undergoes only negligible distortion from the interferometer’s $\Delta\Omega$ finite field of view and
2D finite interferogram length.
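Both claims in this appendix, the unit area in Eq. (6A.5) and the negligible effect of convolving a broad radiance with the narrow sinc, can be checked by brute-force quadrature. The sketch below is illustrative only: it assumes sinc(x) = sin(x)/x, uses a hypothetical Gaussian stand-in for the slowly varying radiance, and truncates the infinite integrals to a finite range, which leaves a small residual error.

```python
import math

def sinc_kernel(sigma, D):
    """2*D*sinc(2*pi*sigma*D), assuming sinc(x) = sin(x)/x."""
    x = 2.0 * math.pi * sigma * D
    return 2.0 * D if x == 0.0 else 2.0 * D * math.sin(x) / x

def integrate(func, lo, hi, steps):
    """Midpoint-rule quadrature."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

def slow_radiance(sigma):
    """Hypothetical slowly varying radiance: a Gaussian much wider than 1/D."""
    return math.exp(-(sigma / 50.0) ** 2)

D = 1.0
# Area under the kernel over a wide but finite range; Eq. (6A.5) says the limit is 1.
area = integrate(lambda s: sinc_kernel(s, D), -200.0, 200.0, 40000)

# Convolution of the kernel with the broad radiance, evaluated at sigma0.
sigma0 = 10.0
smoothed = integrate(lambda s: sinc_kernel(s, D) * slow_radiance(sigma0 - s),
                     -200.0, 200.0, 40000)
print(area, smoothed, slow_radiance(sigma0))
```

The truncated area differs from one by a few parts in ten thousand, and the convolved value is nearly indistinguishable from the undistorted radiance at the same wavenumber.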
Appendix 6B
The noise contaminating a time-based signal can be represented by a random function $\tilde N$ of time t,
which we write as $\tilde N(t)$ using the notation of Chapter 3 [see Sec. 3.2 of Chapter 3]. According to
Eq. (6.4), for time-based interferometer signals the time t is linearly proportional to the OPD
value $\chi$,
$$t = \chi/u , \tag{6B.1}$$
where u is the OPD velocity. Hence, when $\tilde N(t)$ represents noise contaminating a time-based
interferometer signal, we can also decide to represent the same noise as a random function $\tilde n(\chi)$,
with
$$\tilde n(\chi) = \tilde N(\chi/u) \tag{6B.2a}$$
or
$$\tilde n(ut) = \tilde N(t) . \tag{6B.2b}$$
From Sec. 3.15 of Chapter 3 [see Eq. (3.30b)], we know that when $\tilde N$ is wide-sense stationary it
has an autocorrelation function $R_{NN}$ given by
$$R_{NN}(t_2 - t_1) = E\big(\tilde N(t_1)\cdot\tilde N(t_2)\big) . \tag{6B.3a}$$
$$R_{NN}(\tau) = \int_{-\infty}^{\infty} S_{NN}(f)\,e^{2\pi i f\tau}\,df . \tag{6B.3c}$$
$$R_{NN}(t_2 - t_1) = E\big(\tilde n(ut_1)\cdot\tilde n(ut_2)\big) . \tag{6B.4a}$$
$$R_{NN}\!\left(\frac{\chi_2-\chi_1}{u}\right) = E\big(\tilde n(\chi_1)\cdot\tilde n(\chi_2)\big) . \tag{6B.4b}$$
We can now, using the most basic definition of the autocorrelation function in Eq. (3.23b) of
Chapter 3, define the autocorrelation function of $\tilde n(\chi)$ to be
$$o_{nn}(\chi_1,\chi_2) = E\big(\tilde n(\chi_1)\cdot\tilde n(\chi_2)\big) , \tag{6B.4c}$$
$$o_{nn}(\chi_1,\chi_2) = R_{NN}\!\left(\frac{\chi_2-\chi_1}{u}\right) . \tag{6B.4d}$$
$$o_{nn}(\chi_2-\chi_1) = E\big(\tilde n(\chi_1)\cdot\tilde n(\chi_2)\big) . \tag{6B.4e}$$
$$o_{nn}(\chi_2-\chi_1) = R_{NN}\!\left(\frac{\chi_2-\chi_1}{u}\right) \tag{6B.4f}$$
or, setting $\chi' = \chi_2 - \chi_1$,
$$o_{nn}(\chi') = R_{NN}\!\left(\frac{\chi'}{u}\right) . \tag{6B.4g}$$
$$o_{nn}(u\tau) = R_{NN}(\tau) . \tag{6B.4h}$$
Equations (6B.4g) and (6B.4h) specify the connection between the autocorrelation functions of Ñ
and ñ.
We examine the definition of a wide-sense stationary random function in Sec. 3.15 of Chapter
3 [in Eq. (3.30b)] and note that (6B.4e) is the major requirement for showing that $\tilde n(\chi)$ is
wide-sense stationary. All that remains is to discover whether or not
$$E\big(\tilde n(\chi)\big)$$
is finite and independent of $\chi$. If $\tilde N(t)$ is wide-sense stationary, we know from Eq. (3.30a) of
Chapter 3 that
$$E\big(\tilde N(t)\big) = \mu_N = \text{finite constant} . \tag{6B.5a}$$
which, since $\chi = ut$ from (6B.1), is clearly the same thing as saying that
$$E\big(\tilde n(\chi)\big) = \mu_N = \text{finite constant} . \tag{6B.5b}$$
Therefore, putting together (6B.4e) and (6B.5b), we find that $\tilde n(\chi)$ satisfies all the requirements
for being a wide-sense stationary random function of $\chi$ whenever $\tilde N(t)$ is a wide-sense stationary
random function of t.
The power spectrum $p_{nn}$ of $\tilde n(\chi)$ is the forward Fourier transform of its autocorrelation
function $o_{nn}$,
$$p_{nn}(\sigma) = \int_{-\infty}^{\infty} o_{nn}(\chi')\,e^{-2\pi i\sigma\chi'}\,d\chi' . \tag{6B.6a}$$
$$o_{nn}(\chi') = \int_{-\infty}^{\infty} p_{nn}(\sigma)\,e^{2\pi i\sigma\chi'}\,d\sigma . \tag{6B.6b}$$
$$p_{nn}(\sigma) = \int_{-\infty}^{\infty} R_{NN}(\chi'/u)\,e^{-2\pi i\sigma\chi'}\,d\chi' .$$
We can, following the suggestion contained in Eq. (6B.1), change the dummy variable of
integration to $\tau = \chi'/u$ (with $d\chi' = u\,d\tau$) to get
$$p_{nn}(\sigma) = u\int_{-\infty}^{\infty} R_{NN}(\tau)\,e^{-2\pi i\sigma u\tau}\,d\tau . \tag{6B.6c}$$
$$p_{nn}(\sigma) = u\,S_{NN}(u\sigma) , \tag{6B.6d}$$
which, setting
$$f = u\sigma , \tag{6B.6e}$$
Equation (3.57g) in Chapter 3 can be written as, using the notation of this appendix,
$$S_{NN}(f) = \lim_{T\to\infty}\left\{\frac{1}{2T}\,E\big(|\tilde N_T(f)|^2\big)\right\} , \tag{6B.7a}$$
where
$$\tilde N_T(f) = \int_{-T}^{T} \tilde N(t)\,e^{-2\pi i f t}\,dt = \int_{-\infty}^{\infty} \Pi(t,T)\,\tilde N(t)\,e^{-2\pi i f t}\,dt \tag{6B.7b}$$
with $\Pi(t,T)$ defined the same way it was in Eq. (4C.1a) in Appendix 4C of Chapter 4:
$$\Pi(t,T) = \begin{cases} 1 & \text{for } |t| \le T \\ 0 & \text{for } |t| > T \end{cases} . \tag{6B.7c}$$
Transforming Eq. (6B.7b) from f and t variables to $\sigma$ and $\chi$ variables gives [see Eqs. (6B.1) and
(6B.6e)]
$$\tilde N_T(u\sigma) = \frac{1}{u}\int_{-uT}^{uT} \tilde N(\chi/u)\,e^{-2\pi i (u\sigma)\cdot(\chi/u)}\,d\chi$$
$$D = uT . \tag{6B.7e}$$
If we also define
$$\tilde n_D(\sigma) = \int_{-D}^{D} \tilde n(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \int_{-\infty}^{\infty} \Pi(\chi,D)\,\tilde n(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi , \tag{6B.7f}$$
Replacing $\sigma$ by $f/u$ gives
$$\tilde N_T(f) = u^{-1}\,\tilde n_D(f/u) . \tag{6B.7h}$$
Now Eqs. (6B.6f), (6B.7h), and (6B.7e) can be combined with Eq. (6B.7a) to get
$$u^{-1}\,p_{nn}(f/u) = \lim_{D\to\infty}\left\{\frac{1}{2(D/u)}\,E\!\left(\frac{1}{u^2}\,|\tilde n_D(f/u)|^2\right)\right\}$$
or, replacing f by $u\sigma$,
$$p_{nn}(\sigma) = \lim_{D\to\infty}\left\{\frac{1}{2D}\,E\big(|\tilde n_D(\sigma)|^2\big)\right\} . \tag{6B.7i}$$
Equations (6B.6d), (6B.6f), and (6B.7a)–(6B.7i) specify the connections between the $\chi$-based and
the t-based power spectra of $\tilde n$ and $\tilde N$.
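The scaling rule in Eq. (6B.6d) can be spot-checked with a concrete noise model. The sketch below assumes, purely for illustration, an exponential time-domain autocorrelation $R_{NN}(\tau) = e^{-|\tau|/\tau_c}$, whose double-sided power spectrum is $2\tau_c/[1+(2\pi f\tau_c)^2]$; the OPD velocity, correlation time, and function names are hypothetical. It builds the χ-based autocorrelation through (6B.4g), transforms it numerically as in (6B.6a), and compares the result with $u\,S_{NN}(u\sigma)$.

```python
import math

U = 2.0        # assumed OPD velocity
TAU_C = 0.5    # assumed noise correlation time

def r_nn(tau):
    """Assumed time-domain autocorrelation of the noise."""
    return math.exp(-abs(tau) / TAU_C)

def s_nn(f):
    """Analytic double-sided power spectrum of r_nn."""
    return 2.0 * TAU_C / (1.0 + (2.0 * math.pi * f * TAU_C) ** 2)

def o_nn(chi):
    """chi-based autocorrelation via Eq. (6B.4g): o_nn(chi') = R_NN(chi'/u)."""
    return r_nn(chi / U)

def p_nn_numeric(sigma, chi_max=40.0, steps=40000):
    """Eq. (6B.6a) by midpoint quadrature; o_nn is even, so use 2 * cosine integral."""
    h = chi_max / steps
    total = 0.0
    for i in range(steps):
        chi = (i + 0.5) * h
        total += o_nn(chi) * math.cos(2.0 * math.pi * sigma * chi)
    return 2.0 * h * total

sigma = 0.3
p_direct = p_nn_numeric(sigma)
p_scaled = U * s_nn(U * sigma)     # right-hand side of Eq. (6B.6d)
print(p_direct, p_scaled)
```

The two numbers agree to within the quadrature error, as expected from the change of variable $\tau = \chi'/u$.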
7
MIRROR-MISALIGNMENT NEdN IN
DOUBLE-SIDED INTERFEROGRAMS
Unlike the detector noise described in the previous chapter, the misalignment noise in a well-
designed interferometer should be a small source of random error. To design these instruments
properly, ensuring that misalignment noise is likely to be small, we need some way to analyze it.
The formulas derived in Chapters 4 and 5 can handle static interferometer misalignments—that
is, they can handle situations where the alignment does not significantly change during a spectral
measurement—but a more sophisticated approach is needed when the alignment changes rapidly
and randomly. In this chapter we use wide-sense stationary random functions of the type
described in Sec. 3.15 of Chapter 3 to describe how the interferometer’s randomly changing
misalignment can contaminate the interference signal. By tracing the contaminated interferogram
through the entire signal-processing chain, including the calibration algorithm, we discover what
the spectral NEdN looks like when the interferometer is dominated by misalignment noise. This
not only produces the formulas needed to design interferometers with insignificant amounts of
random misalignment but it also, when interferometers break down, gives us the information
needed to decide whether unexpectedly large and randomly changing alignment errors are
contributing to the problem.
$$z_A^{(tot)}(\chi) = z_A(\chi) + z_A^{(cold)}(\chi) .$$
∞
A ∆Ω
(χ ) = ³ η (σ )τ f ( σ )τ a ( σ )L( σ ) dσ
( tot )
z A
4 −∞
∞
WA ∆Ω
+
4 −∞³ M( Rσθ ma )η (σ )τ f ( σ )τ a ( σ )L FOV ( σ ) e 2π iσχ dσ
∞
WA ∆Ω
³
2π iσχ
+ M( Rσθ ma )η (σ )τ a ( σ )[L(FOV
fore )
( σ ) − L(back)
FOV ( σ )] e dσ
4 −∞
∞
(7.1a)
A ∆Ω
+
2 0³ η (σ )τ a (σ )L(fore) (σ ) dσ
∞
A ∆Ω
³
2
+ [2 r (σ ) − η (σ )]τ a (σ )L(back) (σ ) dσ
2 0
∞
³L (σ ) dσ .
( dir ) ( dir )
+ A det ∆Ω
0
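We note that because $\eta$ is even [see Eq. (4.139g) in Chapter 4], the product
$$\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L(|\sigma|)$$
is also even, so that
$$\int_{-\infty}^{\infty} \eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L(|\sigma|)\,d\sigma = 2\int_{0}^{\infty} \eta(\sigma)\,\tau_f(\sigma)\,\tau_a(\sigma)\,L(\sigma)\,d\sigma . \tag{7.1b}$$
This allows the first and fourth terms on the right-hand side of (7.1a) to be combined into a single
integral,
$$\frac{A\,\Delta\Omega}{4}\int_{-\infty}^{\infty} \eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L(|\sigma|)\,d\sigma + \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} \eta(\sigma)\,\tau_a(\sigma)\,L^{(fore)}(\sigma)\,d\sigma = \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} \eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma . \tag{7.1c}$$
In a similar way, we can combine the two Fourier transforms in (7.1a) to get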
Setting Up the Signal Equations · 7.1
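$$\begin{aligned}
&\int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|)\,e^{2\pi i\sigma\chi}\,d\sigma \\
&\quad + \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,[L^{(fore)}_{FOV}(|\sigma|) - L^{(back)}_{FOV}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&= \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L^{(fore)}_{FOV}(|\sigma|) - L^{(back)}_{FOV}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned} \tag{7.1d}$$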
z A( tot ) ( χ ) =
∞
WA ∆Ω
³
2π iσχ
M( Rσθ ma )η (σ )τ a ( σ )[τ f ( σ )L FOV ( σ ) + L(FOV
fore )
( σ ) − L(back)
FOV ( σ )] e dσ
4 −∞
A ∆Ω
∞
(7.1e)
+
2 0³ η (σ )τ a (σ )[τ f (σ )L(σ ) + L(fore) (σ )] dσ
∞ ∞
A ∆Ω
³ − η (σ )]τ a (σ )L(back) (σ ) dσ + A det ∆Ω( dir ) ³ L( dir ) (σ ) dσ .
2
+ [2 r (σ )
2 0 0
This is the formula for z A(tot ) ( χ ) that we will trace through the signal chain of Fig. 6.2 in Chapter
6.
too is random. The dashed arrow in Fig. 7.1 shows the orientation of the surface normal when it
is misaligned by the random angle
θ = θ 2 + θ 2 , x y
and the bold arrow pointing along the interferometer’s optical axis shows the orientation of the
moving mirror’s surface normal when it is correctly aligned. The formula for the modulation
function M used in Eq. (7.1e) and defined in Eq. (5.10c) of Chapter 5 assumes that the beam
passing through the interferometer has a circular cross section. In a well-designed interferometer
θ is always small, so in Eq. (5.10c) we can make the approximation102
$$M(R\sigma\theta) = \frac{J_1(4\pi R\sigma\theta)}{2\pi R\sigma\theta} \cong 1 - a\,\sigma^{2}\theta^{2} \tag{7.2a}$$
with
$$a = 2\pi^{2}R^{2} . \tag{7.2b}$$
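The size of the error made by the small-angle approximation in Eq. (7.2a) can be checked numerically. The sketch below is only an illustration: the beam radius, wavenumber, and tilt angle are assumed values chosen for the example, and `bessel_j1` is a hypothetical helper that sums the power series for J1 rather than a routine taken from the text.

```python
import math

def bessel_j1(x, terms=30):
    """Power series for the Bessel function J1(x)."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k * (x / 2.0) ** (2 * k + 1)
                  / (math.factorial(k) * math.factorial(k + 1)))
    return total

def modulation_exact(R, sigma, theta):
    """M = J1(4 pi R sigma theta) / (2 pi R sigma theta), as in Eq. (7.2a)."""
    x = 4.0 * math.pi * R * sigma * theta
    if x == 0.0:
        return 1.0
    return bessel_j1(x) / (x / 2.0)

def modulation_small_angle(R, sigma, theta):
    """M ~ 1 - a sigma^2 theta^2 with a = 2 pi^2 R^2, Eqs. (7.2a) and (7.2b)."""
    a = 2.0 * math.pi ** 2 * R ** 2
    return 1.0 - a * sigma ** 2 * theta ** 2

# Assumed example values: R = 1 cm, sigma = 1000 cm^-1, theta = 10 microradians.
exact = modulation_exact(1.0, 1000.0, 1.0e-5)
approx = modulation_small_angle(1.0, 1000.0, 1.0e-5)
print(exact, approx, exact - approx)
```

For these assumed values the two expressions differ only in the next term of the series, which is of fourth order in the product Rσθ.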
FIGURE 7.1. The z axis is the correctly aligned normal vector of the mirror surface and the dashed arrow is the misaligned normal vector. The x and y axes show the orientation of the θx(χ) and θy(χ) components of the total θ(χ) misalignment angle.
102 Handbook of Mathematical Functions, edited by Abramowitz and Stegun, see formula (9.1.10), p. 360.
Specifying the Random Misalignment Angle of the Moving Mirror · 7.2
The random angles θx and θy can take on both positive and negative values, but random angle θ
can never be negative. All three angles, θx, θy, and θ, can be treated as random functions of χ, with
$$\tilde\theta(\chi) = \sqrt{\tilde\theta_x(\chi)^{2} + \tilde\theta_y(\chi)^{2}} . \tag{7.2c}$$
By making these angles stationary random functions of χ, we can analyze what happens to the
interferometer signal when θx and θy change randomly with OPD while the moving mirror is in
motion.
Using stationary random functions to represent angles θx , θy , and θ is an obvious approach
when the misalignment is driven by outside disturbances—when, for example, interferometers
are operated in high-vibration environments. In this sort of situation, we expect θx ( χ ) , θy ( χ ) ,
and θ ( χ ) to be at least wide-sense stationary and weakly ergodic (these types of random
quantities are discussed in Secs. 3.15 and 3.18 of Chapter 3). In low-vibration environments,
however, there may well be a tendency for the interferometer’s own motion—it does, after all,
have a moving mirror—to excite internal resonances that disturb the alignment. When this
happens, the misalignment may well be preferentially large at certain χ values. Although at first
glance it may seem that θx, θy, and θ must now be nonstationary random functions, we can
instead, remembering the discussion following Eq. (3.47a) in Chapter 3, say that θx, θy, and θ are
still stationary but nonergodic. Before the instrument is built, it is very difficult to know at what χ
values the random quantities θx, θy, and θ have a greater chance of taking on large values.
Hence, in our ignorance, while designing the instrument, we can treat these angles as equally
likely to be large or small at any χ value; that is, we say that θx, θy, and θ are stationary.
Building the interferometer then corresponds to choosing specific angle functions from the
ensemble of allowed functions, as discussed in Sec. 3.14 in Chapter 3. If the angle function turns
out to be preferentially large at some χ values, this just means that a nonergodic member function
of the ensemble has been chosen. So even in a low-vibration environment we can still, while
designing the interferometer, regard θx ( χ ) , θy ( χ ) , and θ ( χ ) as wide-sense stationary random
functions.
Now that we have decided to treat θx ( χ ) and θy ( χ ) as wide-sense stationary random
functions, we note that θx and θy are usually zero-mean random variables, which means that
their expectation values are zero:
$$E\big(\tilde\theta_x(\chi)\big) = 0 \tag{7.2d}$$
and
$$E\big(\tilde\theta_y(\chi)\big) = 0 . \tag{7.2e}$$
Some interferometers, however, have a bias tilt angle φ, which is the same thing as saying that
$E(\tilde\theta_x(\chi))$ and $E(\tilde\theta_y(\chi))$ are not both equal to zero. When this happens, the expectation values of
$\tilde\theta_x(\chi)$ and $\tilde\theta_y(\chi)$ are assumed to be independent of χ, and we can orient the x and y axes in Fig.
7.1 so that
$$E\big(\tilde\theta_x(\chi)\big) = \phi \tag{7.2f}$$
and
$$E\big(\tilde\theta_y(\chi)\big) = 0 . \tag{7.2g}$$
Note that when φ = 0, these equations reduce to the previous formulas in (7.2d) and (7.2e). To
analyze mirror-misalignment noise both with and without bias tilt, we say that the probability
density distribution characterizing the behavior of $\tilde\theta_x$ at all values of χ has a mean of φ and that
the probability density distribution characterizing the behavior of $\tilde\theta_y$ at all values of χ has a mean
of zero. We assume that the probability density distributions for $\tilde\theta_x$ and $\tilde\theta_y$ are normal and have
standard deviations $\gamma_x$ and $\gamma_y$ respectively. These two normal distributions can then be written
as
$$p_{\tilde\theta_x}(\theta_x) = \frac{1}{\gamma_x\sqrt{2\pi}}\; e^{-(\theta_x-\phi)^{2}/(2\gamma_x^{2})} \tag{7.2h}$$
and
$$p_{\tilde\theta_y}(\theta_y) = \frac{1}{\gamma_y\sqrt{2\pi}}\; e^{-\theta_y^{2}/(2\gamma_y^{2})} . \tag{7.2i}$$
Here $p_{\tilde\theta_x}(\theta_x)\,d\theta_x$ is the probability that $\tilde\theta_x$ takes on a value between $\theta_x$ and $\theta_x + d\theta_x$, and
$p_{\tilde\theta_y}(\theta_y)\,d\theta_y$ is the probability that $\tilde\theta_y$ takes on a value between $\theta_y$ and $\theta_y + d\theta_y$. Having used
Eqs. (7.2h) and (7.2i) to set up the $\tilde\theta_x$ and $\tilde\theta_y$ distributions, it can be shown that if
$$\gamma_x = \gamma_y = \gamma$$
and θx, θy are independent, then θ in Eq. (7.2c) must obey the probability density
distribution103
$$p_{\tilde\theta}(\theta) = \frac{\theta}{\gamma^{2}}\; I_0\!\left(\frac{\theta\phi}{\gamma^{2}}\right) e^{-(\theta^{2}+\phi^{2})/(2\gamma^{2})} , \tag{7.2j}$$
where
$$I_0(\xi) = \frac{1}{2\pi}\int_{0}^{2\pi} e^{\xi\cos\omega}\,d\omega \tag{7.2k}$$
any position of the interferometer’s moving mirror. This means that the average or mean squared
values of θx, θy, and θ are χ-independent constants. Equations (7A.5a) and (7A.5c) in Appendix
7A then show that
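$$E\big(\tilde\theta_x(\chi)^{2}\big) = \phi^{2} + \gamma_x^{2} \tag{7.3a}$$
and
$$E\big(\tilde\theta_y(\chi)^{2}\big) = \gamma_y^{2} . \tag{7.3b}$$
We define $\theta_{rms}^{2}$ to be the χ-independent constant equal to $E(\tilde\theta(\chi)^{2})$,
$$E\big(\tilde\theta(\chi)^{2}\big) = \theta_{rms}^{2} . \tag{7.3c}$$
Squaring (7.2c) and taking the expectation value of both sides gives, after applying Eq. (3.16a) in
Chapter 3,
$$\theta_{rms}^{2} = E\big(\tilde\theta(\chi)^{2}\big) = E\big(\tilde\theta_x(\chi)^{2} + \tilde\theta_y(\chi)^{2}\big) = E\big(\tilde\theta_x(\chi)^{2}\big) + E\big(\tilde\theta_y(\chi)^{2}\big) .$$
Substituting from (7.3a) and (7.3b) gives
$$E\big(\tilde\theta(\chi)^{2}\big) = \theta_{rms}^{2} = \phi^{2} + \gamma_x^{2} + \gamma_y^{2} . \tag{7.3d}$$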
103 A. Papoulis, Probability, Random Variables, and Stochastic Processes, p. 140.
When both θx and θy have the same standard deviation, with
$$\gamma_x = \gamma_y = \gamma ,$$
then Eq. (7.3d) becomes
$$E\big(\tilde\theta(\chi)^{2}\big) = \theta_{rms}^{2} = \phi^{2} + 2\gamma^{2} .$$
For future use, we derive the value of $E(\tilde\theta(\chi)^{4})$. Taking the fourth power of both sides of Eq.
(7.2c) and taking the expectation value gives [again using Eq. (3.16a) in Chapter 3],
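$$
\begin{aligned}
E\big(\tilde\theta(\chi)^{4}\big) &= E\big([\tilde\theta_x(\chi)^{2} + \tilde\theta_y(\chi)^{2}]^{2}\big) = E\big(\tilde\theta_x(\chi)^{4} + \tilde\theta_y(\chi)^{4} + 2\,\tilde\theta_x(\chi)^{2}\,\tilde\theta_y(\chi)^{2}\big) \\
&= E\big(\tilde\theta_x(\chi)^{4}\big) + E\big(\tilde\theta_y(\chi)^{4}\big) + 2\,E\big(\tilde\theta_x(\chi)^{2}\,\tilde\theta_y(\chi)^{2}\big) .
\end{aligned}
$$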
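Assuming that θx(χ) and θy(χ) are independent random variables (which, of course, means
that θx(χ)² and θy(χ)² are also independent), we can write that [see formula (3.12c) in Chapter
3]
$$E\big(\tilde\theta(\chi)^{4}\big) = E\big(\tilde\theta_x(\chi)^{4}\big) + E\big(\tilde\theta_y(\chi)^{4}\big) + 2\,E\big(\tilde\theta_x(\chi)^{2}\big)\,E\big(\tilde\theta_y(\chi)^{2}\big) . \tag{7.4a}$$
For the normal distributions in Eqs. (7.2h) and (7.2i), the fourth moments are
$$E\big(\tilde\theta_x(\chi)^{4}\big) = 3\gamma_x^{4} + 6\phi^{2}\gamma_x^{2} + \phi^{4} \tag{7.4b}$$
and
$$E\big(\tilde\theta_y(\chi)^{4}\big) = 3\gamma_y^{4} . \tag{7.4c}$$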
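Substitution of Eqs. (7.3a), (7.3b), (7.4b), and (7.4c) into (7.4a) gives
$$
\begin{aligned}
E\big(\tilde\theta(\chi)^{4}\big) &= 3\gamma_x^{4} + 6\phi^{2}\gamma_x^{2} + \phi^{4} + 3\gamma_y^{4} + 2(\phi^{2} + \gamma_x^{2})\,\gamma_y^{2} \\
&= 3(\gamma_x^{4} + \gamma_y^{4}) + 2\gamma_x^{2}(3\phi^{2} + \gamma_y^{2}) + 2\phi^{2}\gamma_y^{2} + \phi^{4} .
\end{aligned}
\tag{7.4d}
$$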
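When θx and θy have the same standard deviation $\gamma_x = \gamma_y = \gamma$, this reduces to
$$E\big(\tilde\theta(\chi)^{4}\big) = 8\gamma^{4} + 8\phi^{2}\gamma^{2} + \phi^{4} . \tag{7.4e}$$
Since $\theta_{rms}^{2} = \phi^{2} + 2\gamma^{2}$ in this case, we can replace $\gamma^{2}$ by $(\theta_{rms}^{2} - \phi^{2})/2$ to get
$$
\begin{aligned}
E\big(\tilde\theta(\chi)^{4}\big) &= 8\left(\frac{\theta_{rms}^{2} - \phi^{2}}{2}\right)^{2} + 8\phi^{2}\left(\frac{\theta_{rms}^{2} - \phi^{2}}{2}\right) + \phi^{4} \\
&= 2\,(\theta_{rms}^{4} + \phi^{4} - 2\theta_{rms}^{2}\phi^{2}) + 4\phi^{2}\theta_{rms}^{2} - 4\phi^{4} + \phi^{4} ,
\end{aligned}
$$
so that
$$E\big(\tilde\theta(\chi)^{4}\big) = 2\theta_{rms}^{4} - \phi^{4} \tag{7.4f}$$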
when θx and θy are independent and obey normal distributions having the same standard
deviation.
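The moment formulas in Eqs. (7.3d), (7.4e), and (7.4f) can be checked with a small Monte Carlo experiment. This is a minimal sketch under stated assumptions: the bias tilt and standard deviation are arbitrary illustrative numbers (in arbitrary angle units), and the function names are hypothetical helpers introduced only for this example.

```python
import random

def theta_moments_closed_form(phi, gamma):
    """E(theta^2) and E(theta^4) when gamma_x = gamma_y = gamma."""
    theta_rms_sq = phi ** 2 + 2.0 * gamma ** 2                      # Eq. (7.3d)
    fourth = 8.0 * gamma ** 4 + 8.0 * phi ** 2 * gamma ** 2 + phi ** 4  # Eq. (7.4e)
    return theta_rms_sq, fourth

def theta_moments_monte_carlo(phi, gamma, n_samples=200_000, seed=1):
    """Sample independent normal theta_x (mean phi) and theta_y (mean 0)."""
    rng = random.Random(seed)
    sum2 = 0.0
    sum4 = 0.0
    for _ in range(n_samples):
        theta_x = rng.gauss(phi, gamma)
        theta_y = rng.gauss(0.0, gamma)
        theta_sq = theta_x * theta_x + theta_y * theta_y
        sum2 += theta_sq
        sum4 += theta_sq * theta_sq
    return sum2 / n_samples, sum4 / n_samples

phi, gamma = 1.0, 0.5          # assumed example values
rms_sq, fourth = theta_moments_closed_form(phi, gamma)
mc2, mc4 = theta_moments_monte_carlo(phi, gamma)
print(rms_sq, mc2)
print(fourth, 2.0 * rms_sq ** 2 - phi ** 4, mc4)   # Eq. (7.4e), Eq. (7.4f), sample estimate
```

The closed forms of (7.4e) and (7.4f) agree exactly, while the sampled moments agree only to within the statistical scatter expected for the chosen number of samples.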
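$$
\begin{aligned}
\tilde z_{AN}^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} M\big(R\sigma\tilde\theta(\chi)\big)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} \eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} [2\,|r(\sigma)|^{2} - \eta(\sigma)]\,\tau_a(\sigma)\,L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} L^{(dir)}(\sigma)\,d\sigma .
\end{aligned}
\tag{7.5a}
$$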
In this chapter, the random function $\tilde z_{AN}^{(tot)}$ represents the total signal contaminated by mirror-
misalignment noise at point A in Fig. 6.2 of Chapter 6. The AN subscript and (tot) superscript
remind us that $\tilde z_{AN}^{(tot)}$ is the noise-contaminated total signal at point A, and the tilde shows that
$\tilde\theta(\chi)$ turns $\tilde z_{AN}^{(tot)}$ into a random function of χ. To get the detector signal generated by all the
optical power hitting the detector, we insert the detector responsivity R into the integrals on the
right-hand side of (7.5a):
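$$
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} R(|\sigma|)\,M\big(R\sigma\tilde\theta(\chi)\big)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} [2\,|r(\sigma)|^{2} - \eta(\sigma)]\,R(\sigma)\,\tau_a(\sigma)\,L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma .
\end{aligned}
\tag{7.5b}
$$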
Here $\tilde z_{BN}^{(tot)}$ represents the total signal contaminated by mirror-misalignment noise at point B in
Fig. 6.2. Traditionally the responsivity R(σ) is defined only for positive wavenumber arguments,
so inside the first integral on the right-hand side the argument of R has absolute value signs to
make R well-defined for negative σ values.
Equation (7.2a) can be substituted into (7.5b) to get
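$$
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} [2\,|r(\sigma)|^{2} - \eta(\sigma)]\,R(\sigma)\,\tau_a(\sigma)\,L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma \\
&- a\,\tilde\theta(\chi)^{2}\,\frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} \sigma^{2}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}
\tag{7.6a}
$$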
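$$a\,\theta_{rms}^{2}\,\frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} \sigma^{2}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma$$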
Ȥ-Based Signal Contaminated by Mirror-Misalignment Noise · 7.3
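$$
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} (1 - a\,\theta_{rms}^{2}\sigma^{2})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} [2\,|r(\sigma)|^{2} - \eta(\sigma)]\,R(\sigma)\,\tau_a(\sigma)\,L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma \\
&- a\cdot[\tilde\theta(\chi)^{2} - \theta_{rms}^{2}]\cdot\left(\frac{WA\,\Delta\Omega}{4}\right)\cdot\int_{-\infty}^{\infty} \sigma^{2}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}
\tag{7.6b}
$$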
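$$
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} [2\,|r(\sigma)|^{2} - \eta(\sigma)]\,R(\sigma)\,\tau_a(\sigma)\,L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma \\
&- a\cdot[\tilde\theta(\chi)^{2} - \theta_{rms}^{2}]\cdot\left(\frac{WA\,\Delta\Omega}{4}\right)\cdot\int_{-\infty}^{\infty} \sigma^{2}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}
\tag{7.7a}
$$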
Now by defining
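$$Z_{FOV}(\sigma) = \left(\frac{WA\,\Delta\Omega}{4}\right) R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)] , \tag{7.7b}$$
Eq. (7.7a) becomes
$$
\begin{aligned}
\tilde z_{BN}^{(tot)}(\chi) ={}& \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} [2\,|r(\sigma)|^{2} - \eta(\sigma)]\,R(\sigma)\,\tau_a(\sigma)\,L^{(back)}(\sigma)\,d\sigma \\
&+ A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma \\
&+ a\,[\theta_{rms}^{2} - \tilde\theta(\chi)^{2}]\int_{-\infty}^{\infty} \sigma^{2}\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma .
\end{aligned}
\tag{7.7c}
$$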
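The formula for $\tilde z_{BN}^{(tot)}$ can be cleaned up some more by defining function W(χ) to be
$$W(\chi) = \int_{-\infty}^{\infty} \sigma^{2}\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \tag{7.8a}$$
and the noise function
$$\tilde n^{(\theta 2)}(\chi) = \theta_{rms}^{2} - \tilde\theta(\chi)^{2} . \tag{7.8b}$$
From Eq. (7.3c), this is the same as
$$\tilde n^{(\theta 2)}(\chi) = E\big(\tilde\theta(\chi)^{2}\big) - \tilde\theta(\chi)^{2} . \tag{7.8c}$$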
We note that, using the linearity of operator E described in Sec. 3.10 of Chapter 3,
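$$E\big(\tilde n^{(\theta 2)}(\chi)\big) = E\Big(E\big(\tilde\theta(\chi)^{2}\big) - \tilde\theta(\chi)^{2}\Big) = E\Big(E\big(\tilde\theta(\chi)^{2}\big)\Big) - E\big(\tilde\theta(\chi)^{2}\big) .$$
Since $E(\tilde\theta(\chi)^{2})$ is a nonrandom quantity, Eq. (3.9f) of Chapter 3 requires that
$$E\Big(E\big(\tilde\theta(\chi)^{2}\big)\Big) = E\big(\tilde\theta(\chi)^{2}\big) ,$$
so that
$$E\big(\tilde n^{(\theta 2)}(\chi)\big) = E\Big(E\big(\tilde\theta(\chi)^{2}\big)\Big) - E\big(\tilde\theta(\chi)^{2}\big) = E\big(\tilde\theta(\chi)^{2}\big) - E\big(\tilde\theta(\chi)^{2}\big) = 0 . \tag{7.8d}$$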
$$+\; A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma \;+\; a\,\tilde n^{(\theta 2)}(\chi)\,W(\chi) .$$
The first four terms on the right-hand side are all nonrandom, so it makes sense to write (7.8e) as
$$\tilde z_{BN}^{(tot)}(\chi) = z_B^{(tot)}(\chi) + a\,\tilde n^{(\theta 2)}(\chi)\,W(\chi) , \tag{7.8f}$$
where
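$$
\begin{aligned}
z_B^{(tot)}(\chi) ={}& \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} [2\,|r(\sigma)|^{2} - \eta(\sigma)]\,R(\sigma)\,\tau_a(\sigma)\,L^{(back)}(\sigma)\,d\sigma \\
&+ A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma .
\end{aligned}
\tag{7.8g}
$$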
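Substituting for $Z_{FOV}(\sigma)$ from (7.7b) lets the formula for $z_B^{(tot)}$ be written as
$$
\begin{aligned}
z_B^{(tot)}(\chi) ={}& \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&+ \frac{A\,\Delta\Omega}{2}\int_{0}^{\infty} [2\,|r(\sigma)|^{2} - \eta(\sigma)]\,R(\sigma)\,\tau_a(\sigma)\,L^{(back)}(\sigma)\,d\sigma + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma .
\end{aligned}
\tag{7.8h}
$$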
Comparing this latest expression to the formula for $z_A^{(tot)}(\chi)$ in Eq. (7.1e), we note that $z_A^{(tot)}(\chi)$
turns into $z_B^{(tot)}(\chi)$ if we insert the responsivity R into all the integrals of (7.1e) and also set
$$\theta_{rms} = \theta_{ma} \tag{7.8i}$$
in the modulation term M. This correspondence justifies the $z_B^{(tot)}$ label given to the sum of the
four nonrandom terms in (7.8e) above, because this term looks like what the noise-free signal
$z_A^{(tot)}$ at point A in Fig. 6.2 would become as it leaves the detector at point B, provided we say
that θrms is the effective value of the moving mirror’s constant misalignment angle. After using
both the linearity of E described in Sec. 3.10 of Chapter 3 and Eq. (3.9f) from that same chapter,
we apply the expectation operator E to both sides of Eq. (7.8f) to get
$$E\big(\tilde z_{BN}^{(tot)}(\chi)\big) = E\big(z_B^{(tot)}(\chi)\big) + E\big(a\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\big) = z_B^{(tot)}(\chi) + a\,W(\chi)\,E\big(\tilde n^{(\theta 2)}(\chi)\big) .$$
Hence $z_B^{(tot)}(\chi)$ is the expectation value, or average value, of the noise-contaminated signal
leaving the detector. It is the χ-based signal we get after averaging together many independent
measurements of the same spectral radiance to reduce the mirror-misalignment noise to
negligible levels.
7.4 Misalignment Noise and the Detector Circuit (or Anti-Aliasing Filter)
To get the noise-contaminated signal through the detector circuit (which contains the anti-aliasing
filter) to point C in Fig. 6.2 in Chapter 6, we convert the noise-contaminated signal into a
function of time. Using the notation of Eq. (5.41a) in Chapter 5 and Eq. (6.4) of Chapter 6, we
write
$$t = \frac{\chi}{u} , \tag{7.9a}$$
where u is the constant, positive OPD velocity, that is, the constant time rate of change of the
OPD value χ. For the interferometer in Fig. 6.2, if v is the constant physical velocity of the
moving mirror, then u = 2v. The t = 0 origin of the time coordinate is chosen to coincide with
the OPD value χ = 0. Using (7.9a) to replace χ by ut in Eq. (7.8f), the time-based signal at point B
can be written as
$$\tilde z_{BN}^{(tot)}(ut) = z_B^{(tot)}(ut) + a\,\tilde n^{(\theta 2)}(ut)\,W(ut) . \tag{7.9b}$$
To find the time-based output signal $s_o(t)$ leaving the detector circuit, we apply the standard
linear-circuit formula
$$s_o(t) = h(t) * s_i(t) , \tag{7.10a}$$
where ∗ is the convolution operator defined in Eq. (2.38a) of Chapter 2, si(t) is the input signal
entering the detector circuit, and h(t) is the real-valued impulse-response function of the detector
circuit including the anti-aliasing filter.104 Equation (7.10a) can be written as
$$h(t) * s_i(t) = \int_{-\infty}^{\infty} h(t')\,s_i(t - t')\,dt' . \tag{7.10b}$$
104
See Appendix 5A of Chapter 5 for more discussion of the impulse-response function and the implications of Eq.
(7.10a) relating the input and output signals of the detector circuit.
We know that the detector circuit (and anti-aliasing filter) has a transfer function H(f) such that h
and H are a Fourier-transform pair,
$$H(f) = \int_{-\infty}^{\infty} h(t)\,e^{-2\pi i f t}\,dt \tag{7.10c}$$
and
$$h(t) = \int_{-\infty}^{\infty} H(f)\,e^{2\pi i f t}\,df . \tag{7.10d}$$
Because h is real,
$$H(-f) = H(f)^{*} . \tag{7.10e}$$
The ∗ superscript indicates that H(f)* is the complex conjugate of H(f). As explained in
Appendix 5A, formula (7.10e) holds true for any Fourier transform of a real function h. The
detector circuit is AC coupled to the detector, which means, according to Eq. (5.46d) in Chapter
5, that
$$H(0) = 0 . \tag{7.10f}$$
An immediate consequence of Eq. (7.10f) and the definition of the convolution in Eq. (2.38a) in
Chapter 2 is that, for any time-independent constant K,
$$h(t) * K = K\int_{-\infty}^{\infty} h(t')\,dt' = 0 . \tag{7.10g}$$
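The statement in Eq. (7.10g) has a simple discrete analogue: if the samples of an impulse response sum to zero, which is the sampled counterpart of H(0) = 0, then convolving it with any constant gives zero. The sketch below is an illustration only; the kernel shape, tap count, and time constants are assumptions invented for the example and do not describe any particular detector circuit.

```python
import math

def ac_coupled_kernel(n_taps=200, dt=1.0e-4, tau=2.0e-3):
    """Hypothetical sampled impulse response whose samples sum to zero."""
    raw = [math.exp(-k * dt / tau) for k in range(n_taps)]
    area = sum(raw)
    # A unit impulse at t = 0 minus a decaying exponential of equal total weight.
    kernel = [-value / area for value in raw]
    kernel[0] += 1.0
    return kernel

def convolve_valid(h, s):
    """Discrete convolution sum_k h[k] * s[n - k], kept only where all taps overlap s."""
    n_h = len(h)
    return [sum(h[k] * s[n - k] for k in range(n_h)) for n in range(n_h - 1, len(s))]

h = ac_coupled_kernel()
constant_input = [3.7] * 500          # any time-independent constant K
output = convolve_valid(h, constant_input)
max_abs_output = max(abs(value) for value in output)
print(sum(h), max_abs_output)          # both are zero to rounding error
```

The sample spacing is absorbed into the kernel weights here, so the sum over taps plays the role of the integral of h(t′) in Eq. (7.10g).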
From Eq. (6.21a) in Chapter 6, we know that, for a relatively small time value T,
Equations (7.9b) and (7.10a) can now be combined to get the time-based output signal of the
detector circuit (and anti-aliasing filter), which we decide to call $\tilde s_{CN}^{(tot)}(t)$,
$$\tilde s_{CN}^{(tot)}(t) = h(t) * [\,z_B^{(tot)}(ut) + a\,\tilde n^{(\theta 2)}(ut)\,W(ut)\,] . \tag{7.11a}$$
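Substitution from (7.8g) shows that $z_B^{(tot)}(ut)$ has many constant (that is, time-independent)
terms. Gathering together all the constant terms inside a pair of braces { }, we use the linearity of
the convolution [see Eq. (2.38d) in Chapter 2] to write
$$
\begin{aligned}
\tilde s_{CN}^{(tot)}(t) ={}& h(t) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma ut}\,d\sigma\right] \\
&+ h(t) * \Bigg\{ \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} R(\sigma)\,\eta(\sigma)\,\tau_a(\sigma)\,[\tau_f(\sigma)\,L(\sigma) + L^{(fore)}(\sigma)]\,d\sigma \\
&\qquad\qquad + \left(\frac{A\,\Delta\Omega}{2}\right)\cdot\int_{0}^{\infty} [2\,|r(\sigma)|^{2} - \eta(\sigma)]\,R(\sigma)\,\tau_a(\sigma)\,L^{(back)}(\sigma)\,d\sigma \\
&\qquad\qquad + A_{det}\,\Delta\Omega^{(dir)}\int_{0}^{\infty} R(\sigma)\,L^{(dir)}(\sigma)\,d\sigma \Bigg\} \\
&+ h(t) * [\,a\,\tilde n^{(\theta 2)}(ut)\,W(ut)\,] .
\end{aligned}
$$
According to Eq. (7.10g), the convolution with the constant terms is zero, leaving us with
$$
\tilde s_{CN}^{(tot)}(t) = h(t) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma ut}\,d\sigma\right] + h(t) * [\,a\,\tilde n^{(\theta 2)}(ut)\,W(ut)\,] .
\tag{7.11b}
$$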
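$$s_o(t) = h(t) * s_i(t) ,$$
Eq. (7.9a) and the formula for the convolution in (7.10b) can be used to convert back to functions
of χ,
$$s_o\!\left(\frac{\chi}{u}\right) = \big[\,h(t) * s_i(t)\,\big]_{t = \chi/u}$$
or, using that t′ = χ′/u,
$$
s_o\!\left(\frac{\chi}{u}\right) = \int_{-\infty}^{\infty} h(t')\,s_i\!\left(\frac{\chi}{u} - t'\right)dt' = \frac{1}{u}\int_{-\infty}^{\infty} h\!\left(\frac{\chi'}{u}\right) s_i\!\left(\frac{\chi}{u} - \frac{\chi'}{u}\right)d\chi' = \frac{1}{u}\,h\!\left(\frac{\chi}{u}\right) * s_i\!\left(\frac{\chi}{u}\right) .
\tag{7.11c}
$$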
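The change of variable in Eq. (7.11c) can be confirmed numerically by evaluating the convolution once as an integral over time and once as an integral over OPD. This is a sketch under assumptions: the Gaussian impulse response, the input signal, the OPD velocity, and the grid spacings are all hypothetical choices made for illustration.

```python
import math

def h(t):
    """Hypothetical narrow impulse response (Gaussian of width 0.05 s)."""
    return math.exp(-(t / 0.05) ** 2)

def s_in(t):
    """Hypothetical input signal."""
    return math.cos(3.0 * t) * math.exp(-t * t)

def output_time_domain(t, dt=1.0e-3, half_width=1.0):
    """Integral of h(t') s_in(t - t') dt' by a midpoint Riemann sum."""
    n = int(2.0 * half_width / dt)
    total = 0.0
    for k in range(n):
        t_p = -half_width + (k + 0.5) * dt
        total += h(t_p) * s_in(t - t_p)
    return total * dt

def output_opd_domain(chi, u, d_chi=1.0e-3, half_width=1.0):
    """(1/u) times the integral of h(chi'/u) s_in((chi - chi')/u) d chi'."""
    span = u * half_width
    n = int(2.0 * span / d_chi)
    total = 0.0
    for k in range(n):
        chi_p = -span + (k + 0.5) * d_chi
        total += h(chi_p / u) * s_in((chi - chi_p) / u)
    return total * d_chi / u

u = 0.5                      # assumed OPD velocity
chi = 0.2                    # assumed OPD value, so t = chi / u = 0.4
from_time = output_time_domain(chi / u)
from_opd = output_opd_domain(chi, u)
print(from_time, from_opd)
```

The two sums use different grids, so their agreement reflects the identity in (7.11c) rather than identical arithmetic.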
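$$
\tilde s_{CN}^{(tot)}\!\left(\frac{\chi}{u}\right) = \frac{1}{u}\,h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma\right] + \frac{a}{u}\,h\!\left(\frac{\chi}{u}\right) * \big[\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\,\big]
$$
so that, deciding to call the χ-based signal $\tilde z_{CN}^{(tot)}(\chi)$ instead of $\tilde s_{CN}^{(tot)}(\chi/u)$, we can write
$$
\tilde z_{CN}^{(tot)}(\chi) = u^{-1}\,h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma\right] + u^{-1}a\,h\!\left(\frac{\chi}{u}\right) * \big[\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\,\big] .
\tag{7.11d}
$$
In this chapter, random function $\tilde z_{CN}^{(tot)}$ represents the total signal contaminated by mirror-
misalignment noise at point C in Fig. 6.2 of Chapter 6.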
$$
\Pi(\chi, D) = \begin{cases} 1 & \text{for } |\chi| \le D \\ 0 & \text{for } |\chi| > D \end{cases} . \tag{7.12a}
$$
Any χ-based signal multiplied by Π(χ, D) is left unchanged for OPD values between +D and −D
and set to zero for OPD values greater than D or less than −D. We now multiply both sides of Eq.
(7.11d) by Π(χ, D) to get the double-sided signal at point C in Fig. 6.2 of Chapter 6,
$$
\begin{aligned}
\Pi(\chi, D)\,\tilde z_{CN}^{(tot)}(\chi) ={}& u^{-1}\,\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma\right] \right\} \\
&+ u^{-1}a\,\Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\,\big] \right\} .
\end{aligned}
\tag{7.12b}
$$
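The window Π(χ, D) of Eq. (7.12a) and its Fourier transform, the 2D sinc(2πσD) factor that multiplies later formulas in this section, can be sketched numerically. The values of D and σ and the integration grid below are assumptions chosen for illustration, and sinc is taken in the unnormalized form sin(x)/x used in the text.

```python
import cmath
import math

def window(chi, D):
    """Pi(chi, D): 1 for |chi| <= D, 0 otherwise, as in Eq. (7.12a)."""
    return 1.0 if abs(chi) <= D else 0.0

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def window_transform_numeric(sigma, D, n=20000):
    """Midpoint-rule estimate of the integral of Pi(chi, D) exp(-2 pi i sigma chi) d chi."""
    lo, hi = -1.5 * D, 1.5 * D
    d_chi = (hi - lo) / n
    total = 0.0 + 0.0j
    for k in range(n):
        chi = lo + (k + 0.5) * d_chi
        total += window(chi, D) * cmath.exp(-2j * math.pi * sigma * chi)
    return total * d_chi

D, sigma = 1.0, 0.3                       # assumed example values
numeric = window_transform_numeric(sigma, D)
analytic = 2.0 * D * sinc(2.0 * math.pi * sigma * D)
print(numeric, analytic)
```

Libraries that define a normalized sinc, sin(πx)/(πx), need the argument rescaled accordingly before comparing against this expression.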
The approximation specified in Eq. (7.10h) can be used to simplify the second term on the
right-hand side of (7.12b). Because h is a narrow function, the definition of a convolution can be
approximated as [see Eqs. (2.38a) and (2.38b) in Chapter 2]
Misalignment Noise in Uncalibrated Spectra of Double-Sided Signals · 7.5
§· ½
( , D) ®h ¨ ¸ [n (' 2) ( ) W ( )]¾
¯ ©u¹ ¿
§ ·½
( , D) ®[n (' 2) ( ) W ( )] h ¨ ¸ ¾
¯ © u ¹¿
5
(7.13a)
§ 3 · 3
( , D) ³ n (' 2) ( 3) W ( 3) h ¨ ¸d
5 © u ¹
uTT
§ 3 · 3
³
(' 2)
( , D)
n ( 3) W ( 3) h ¨ ¸d .
uTT © u ¹
Using the same reasoning as in the discussion following Eq. (6.26a) in Chapter 6, we note that
this equation reduces to 0 0 when Ȥ does not lie between D and íD. Consequently, the limits
on the integral over d 3 can be replaced by ( D u qT ) and ( D u T q ) . When the integral’s
D to ( u T
limits are extended like this, the extra range of integration going from a q ) and from
( uqT ) to D
a makes only a negligible contribution to the integral due to the smallness of h at
these OPD values. Hence we can write
§· ½
( , D) ®h ¨ ¸ [n (' 2) ( ) W ( )]¾
¯ ©u¹ ¿
D uq
§ 3 · 3
( , D) ³
( D uq )
n (' 2) ( 3) W ( 3) h ¨
© u
¸d
¹
(7.13b)
5
§ 3 · 3
( , D) ³ ( 3, D) n (' 2) ( 3) W ( 3) h ¨ ¸d ,
5 © u ¹
where
D Duq . (7.13c)
Referring back to the formula for the convolution of two functions in Eq. (2.38a) of Chapter 2,
we see that (7.13b) can be written as, using (2.38b) to reverse the order of the convolution,
§· ½
( , D) ®h ¨ ¸ [n (' 2) ( ) W ( )]¾
¯ ©u¹ ¿
(7.13d)
§· ½
( , D) ®h ¨ ¸ ¬ª ( , D) n (' 2) ( ) W ( ) ¼º ¾ .
¯ ©u¹ ¿
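To make our Fourier notation more concise, we start using F, the Fourier-transform operator
defined by Eqs. (2.29a) and (2.29c) in Chapter 2. When using this notation
$$F^{(-i\sigma\chi)}\big(u(\chi)\big) = \int_{-\infty}^{\infty} u(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi \tag{7.14a}$$
and
$$F^{(i\sigma\chi)}\big(v(\sigma)\big) = \int_{-\infty}^{\infty} v(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{7.14b}$$
Applying $F^{(-i\sigma\chi)}$ to both sides of Eq. (7.12b) gives
$$
\begin{aligned}
F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde z_{CN}^{(tot)}(\chi)\big) ={}& u^{-1}\,F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\,e^{2\pi i\sigma'\chi}\,d\sigma'\right] \right\} \right) \\
&+ u^{-1}a\,F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\,\big] \right\} \right) .
\end{aligned}
$$
Defining
$$\tilde Z_{eff,totN}(\sigma) = F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde z_{CN}^{(tot)}(\chi)\big) , \tag{7.14c}$$
we have
$$
\begin{aligned}
\tilde Z_{eff,totN}(\sigma) ={}& u^{-1}\,F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \left[\int_{-\infty}^{\infty} M(R\sigma'\theta_{rms})\,Z_{FOV}(\sigma')\,e^{2\pi i\sigma'\chi}\,d\sigma'\right] \right\} \right) \\
&+ u^{-1}a\,F^{(-i\sigma\chi)}\!\left( \Pi(\chi, D)\left\{ h\!\left(\frac{\chi}{u}\right) * \big[\,\tilde n^{(\theta 2)}(\chi)\,W(\chi)\,\big] \right\} \right) .
\end{aligned}
$$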
We apply the Fourier convolution theorem shown in Eq. (2.39j) in Chapter 2 to the first term on
the right-hand side, and to the second term we apply the approximation shown in Eq. (7.13d).
This gives
Z eff ,totN () )
§ § 3 · ª 5 º·
u 1 F ( i) ) ( , D) F ( i) 3) ¨ h ¨ ¸ « ³ M( R) 3' rms ) Z FOV () 3) e 2& i) 3 3 d) 3 » ¸
© © u ¹ ¬ 5 ¼¹
§ §· ½·
u 1 a F ( i) ) ¨ ( , D) ® h ¨ ¸ [ ( , D
a) n (' 2) ( ) W ( )]¾ ¸ .
© ¯ ©u¹ ¿¹
We again apply the Fourier convolution theorem, this time using the forms shown in Eqs. (2.39j)
and (2.39a), to write
Z eff ,totN () )
§ § 3 ·· § 5 ·½
u 1 F ( i) ) ( , D) ®F ( i) 3) ¨ h ¨ ¸ ¸ A F ( i) 33) ¨ ³ M( R) 3' rms ) Z FOV () 3) e 2& i) 3 33 d) 3 ¸ ¾
¯ © © u ¹¹ © 5 ¹¿
§ § 3 · ·½
u 1 a F ( i) ) ( , D) ®F ( i) 3) ¨ h ¨ ¸ [ ( 3, D a) n (' 2) ( 3) W ( 3)] ¸ ¾
¯ © ©u ¹ ¹¿
Z eff ,totN () )
§ § 3 · · § 5 ·½ (7.15a)
u 1 F ( i) ) ( , D) ®F ( i) 3) ¨ h ¨ ¸ ¸ A F ( i) 33) ¨ ³ M( R) 3'rms ) ZFOV () 3) e2& i) 3 33 d) 3 ¸¾
¯ © © u ¹¹ © 5 ¹¿
§ § 3 · · ½
u 1 a F ( i) ) ( , D) ®F ( i) 3) ¨ h ¨ ¸ ¸ A F ( i) 33) ( 33, D
a) n (' 2) ( 33) W ( 33) ¾ .
¯ © © u ¹¹ ¿
where the sinc function is, following the definition in Eq. (2.106d),
$$\mathrm{sinc}(x) = \frac{\sin(x)}{x} . \tag{7.15c}$$
Using Eqs. (7.9a) and (7.10c),
$$
F^{(-i\sigma\chi)}\!\left( h\!\left(\frac{\chi}{u}\right) \right) = \int_{-\infty}^{\infty} h\!\left(\frac{\chi}{u}\right) e^{-2\pi i\sigma\chi}\,d\chi = u\int_{-\infty}^{\infty} h(t)\,e^{-2\pi i\sigma ut}\,dt = u\,H(u\sigma) .
\tag{7.15d}
$$
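The scaling relation in Eq. (7.15d) can be checked with any impulse response whose transform is known in closed form. The sketch below assumes a Gaussian h(t) = exp(−πt²), for which H(f) = exp(−πf²) under the sign convention of Eq. (7.10c); this choice, the OPD velocity, and the grid are illustrative assumptions rather than properties of a real detector circuit.

```python
import cmath
import math

def h(t):
    """Hypothetical impulse response exp(-pi t^2)."""
    return math.exp(-math.pi * t * t)

def H(f):
    """Its Fourier transform, exp(-pi f^2)."""
    return math.exp(-math.pi * f * f)

def transform_of_scaled_h(sigma, u, n=4000, half_width=8.0):
    """Midpoint-rule estimate of the integral of h(chi/u) exp(-2 pi i sigma chi) d chi."""
    span = half_width * u
    d_chi = 2.0 * span / n
    total = 0.0 + 0.0j
    for k in range(n):
        chi = -span + (k + 0.5) * d_chi
        total += h(chi / u) * cmath.exp(-2j * math.pi * sigma * chi)
    return total * d_chi

u, sigma = 0.5, 1.2                       # assumed example values
numeric = transform_of_scaled_h(sigma, u)
expected = u * H(u * sigma)               # right-hand side of Eq. (7.15d)
print(numeric, expected)
```

Stretching the argument of h by the factor u compresses its transform by the same factor and multiplies it by u, which is what the comparison shows.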
We can now substitute Eqs. (7.15b) and (7.15d) into (7.15a) to get
Z eff ,totN () )
§ 5 ·½
[2 Dsinc(2&) D)] ®H(u) ) F ( i) ) ¨ ³ M( R) 3' rms ) Z FOV () 3) e 2& i) 3 d) 3 ¸¾
¯ © 5 ¹¿
a [2 Dsinc(2&) D)] H(u) )F ( i) ) ( , D) n (' 2) ( ) W ( )
1 2
or
Z eff ,totN () )
© 5 ¹
F ( i) ) F ( i) 3 ) M( R) 3' rms ) Z FOV () 3) M( R)' rms ) Z FOV () ) .
Working with the first term on the right-hand side of (7.15e), we note that [see Eq. (7.7b)
above]
In a well-designed interferometer all the functions on the right-hand side of (7.16a), except for
the radiances $L_{FOV}$, $L_{FOV}^{(fore)}$, and $L_{FOV}^{(back)}$, must vary slowly with σ compared to sinc(2πσD).
Furthermore, this sinc function is very narrow, dropping rapidly to zero compared to all the
nonradiance functions in (7.16a). Consequently, we can, according to Eq. (5C.1) in Appendix 5C
of Chapter 5, treat the nonradiance functions as quasi-constants in the convolution
This lets us write, after using Eqs. (5C.1) and (2.38d) in Chapter 2,
where, following the notation of Eqs. (6.62c), (6.63c), and (6.63d) in Chapter 6, we say that
$$L_{mnf}^{(fore)}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(fore)}(\sigma) , \tag{7.16d}$$
and
$$L_{mnf}^{(back)}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(back)}(\sigma) . \tag{7.16e}$$
radiances distorted both by the effect of the interferometer’s finite field of view and by its finite
interferogram length. Defining function $Z_{mnf}$ to be
$$Z_{mnf}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)] \tag{7.16f}$$
The second term on the right-hand side of (7.15e) can also be simplified. The nonradiance
H(uı) transfer function can be treated like a quasi-constant in the convolution over ı to get
{
[2 Dsinc(2πσ D)] ∗ H(uσ ) F ( −iσχ ) ( Π ( χ , D) n (θ 2) ( χ ) W ( χ ) ) }
{
≅ H(uσ ) ⋅ [2 Dsinc(2πσ D)] ∗ F ( −iσχ ) (Π ( χ , D) n (θ 2) ( χ ) W ( χ )) . }
Equation (7.15b) and Eq. (2.39j) in Chapter 2 can be used to turn the sinc function into another
factor inside the Fourier transform,
{
[2 Dsinc(2πσ D)] ∗ H(uσ ) F ( −iσχ ) ( Π ( χ , D) n (θ 2) ( χ ) W ( χ ) ) } (7.17a)
≅ H(uσ ) ⋅ F ( − iσχ )
( Π( χ , D) Π( χ , D) n (θ 2)
(χ ) W (χ )) .
Equation (7.13c) shows that D ≥ D , which means that [see the specification of Π in Eq. (7.12a)]
Π ( χ , D ) Π ( χ , D) = Π ( χ , D ) .
{ (
[2 Dsinc(2πσ D)] ∗ H(uσ ) F ( −iσχ ) Π ( χ , D) n (θ 2) ( χ ) W ( χ ) )}
(
≅ H(uσ ) ⋅ F ( −iσχ ) Π ( χ , D) n (θ 2) ( χ ) W ( χ ) ) (7.17b)
( )
= H(uσ ) ⋅ ª¬ F ( −iσχ ) Π ( χ , D) n (θ 2) ( χ ) ∗ F ( − iσχ ′) ( W ( χ ′) ) º¼ ,
where again the Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] is applied in the last
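step. We define the D-limited Fourier transform of the noise $\tilde n^{(\theta 2)}$ to be
$$
\tilde n_D^{(\theta 2)}(\sigma) = F^{(-i\sigma\chi)}\big(\Pi(\chi, D)\,\tilde n^{(\theta 2)}(\chi)\big) = \int_{-\infty}^{\infty} \Pi(\chi, D)\,\tilde n^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = \int_{-D}^{D} \tilde n^{(\theta 2)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi
\tag{7.17c}
$$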
so that
{
[2 Dsinc(2πσ D )] ∗ H(uσ ) F ( −iσχ ) (Π ( χ , D) n (θ 2) ( χ ) W ( χ )) }
(7.17d)
≅ H(uσ ) ⋅ ª¬n (Dθ 2) (σ ) ∗ F ( −iσχ ) ( W ( χ )) º¼ .
$$\sigma^{2}\,Z_{FOV}(\sigma) = \int_{-\infty}^{\infty} W(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = F^{(-i\sigma\chi)}\big(W(\chi)\big) , \tag{7.17e}$$
{ (
[2 Dsinc(2πσ D)] ∗ H(uσ ) F ( − iσχ ) Π ( χ , D) n (θ 2) ( χ ) W ( χ ) )}
(7.17f)
{
≅ H(uσ ) ⋅ n (Dθ 2) (σ ) ∗ ª¬σ 2 Z FOV (σ ) º¼ . }
Having found approximations for the first and second terms on the right-hand side of the
formula in (7.15e), we can write down a simplified expression for the uncalibrated signal
spectrum of the double-sided signal contaminated by mirror-misalignment noise. Substituting
(7.16g) and (7.17f) into (7.15e) gives
$$
\tilde Z_{eff,totN}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) + a\,H(u\sigma)\left\{ \tilde n_D^{(\theta 2)}(\sigma) * \big[\sigma^{2}\,Z_{FOV}(\sigma)\big] \right\} .
\tag{7.18a}
$$
For future use, we note that the expectation value of the noise term in (7.18a) is, using the
definition of convolution in Eq. (2.38a) in Chapter 2 and the linearity of the expectation operator
E explained in Sec. 3.10 of Chapter 3,
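$$
\begin{aligned}
E\Big(a\,H(u\sigma)\left\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^{2}Z_{FOV}(\sigma)] \right\}\Big)
&= a\,H(u\sigma)\,E\!\left( \int_{-\infty}^{\infty} \tilde n_D^{(\theta 2)}(\sigma')\,\big[(\sigma - \sigma')^{2}\,Z_{FOV}(\sigma - \sigma')\big]\,d\sigma' \right) \\
&= a\,H(u\sigma)\int_{-\infty}^{\infty} E\big(\tilde n_D^{(\theta 2)}(\sigma')\big)\,\big[(\sigma - \sigma')^{2}\,Z_{FOV}(\sigma - \sigma')\big]\,d\sigma' \\
&= a\,H(u\sigma)\left\{ E\big(\tilde n_D^{(\theta 2)}(\sigma)\big) * \big[\sigma^{2}\,Z_{FOV}(\sigma)\big] \right\} .
\end{aligned}
\tag{7.18b}
$$
Glancing back at the definition of $\tilde n_D^{(\theta 2)}$ in Eq. (7.17c), we note that
$$E\big(\tilde n_D^{(\theta 2)}(\sigma)\big) = \int_{-D}^{D} E\big(\tilde n^{(\theta 2)}(\chi)\big)\,e^{-2\pi i\sigma\chi}\,d\chi = 0 \tag{7.18c}$$
by Eq. (7.8d), so that Eq. (7.18b) becomes
$$E\Big(a\,H(u\sigma)\left\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^{2}Z_{FOV}(\sigma)] \right\}\Big) = 0 . \tag{7.18d}$$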
Applying the expectation operator to both sides of (7.18a) now gives, using Eqs. (3.9f) and
(3.16a) in Chapter 3,
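$$
\begin{aligned}
E\big(\tilde Z_{eff,totN}(\sigma)\big) &= H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) + E\Big(a\,H(u\sigma)\left\{ \tilde n_D^{(\theta 2)}(\sigma) * [\sigma^{2}Z_{FOV}(\sigma)] \right\}\Big) \\
&= H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) .
\end{aligned}
\tag{7.18e}
$$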
This shows that, in principle, we can always reduce the mirror-misalignment noise to negligible
levels in the uncalibrated spectrum of the double-sided signal by averaging together many
independent measurements of the same spectral radiance.
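The effect of averaging can be illustrated with the zero-mean noise function of Eq. (7.8b). The sketch below is a deliberately simplified assumption: each "measurement" is reduced to a single independent draw of the tilt angles, with an illustrative bias tilt and standard deviation, so it shows only how the average of the noise shrinks as the number of measurements grows and not the full spectral calculation.

```python
import random

def averaged_noise(n_measurements, phi=0.0, gamma=1.0, seed=7):
    """Average of n = theta_rms^2 - theta^2 over independent draws of the tilt angles."""
    rng = random.Random(seed)
    theta_rms_sq = phi ** 2 + 2.0 * gamma ** 2      # Eq. (7.3d) with gamma_x = gamma_y
    total = 0.0
    for _ in range(n_measurements):
        theta_x = rng.gauss(phi, gamma)
        theta_y = rng.gauss(0.0, gamma)
        total += theta_rms_sq - (theta_x ** 2 + theta_y ** 2)
    return total / n_measurements

few = averaged_noise(10)
many = averaged_noise(50_000)
print(few, many)
```

Because the noise has zero mean, the scatter of its average falls roughly as the inverse square root of the number of independent measurements.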
Calibrated Spectra Contaminated by Misalignment Noise · 7.6
with absolute value signs used to make L(1) and L(2) even functions of wavenumber. We say that
$\tilde Z^{(1)}_{eff,totN}(\sigma)$ is the uncalibrated, noise-contaminated signal spectrum at point C in Fig. 6.2 of
Chapter 6 when the interferometer is observing the L(1) spectral radiance. To get the formula for
$\tilde Z^{(1)}_{eff,totN}(\sigma)$, we need to replace radiance L by radiance L(1) in formula (7.18a), which we do by
writing
$$Z_{FOV}^{(1)}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L^{(1)}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)] \tag{7.20b}$$
and
$$Z_{mnf}^{(1)}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\,L^{(1)}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)] . \tag{7.20c}$$
The approximation shown in (7.19a) is our justification for dropping the FOV and mnf subscripts
from L(1) in Eqs. (7.20a)–(7.20c). Similarly, we define $\tilde Z^{(2)}_{eff,totN}(\sigma)$ to be the uncalibrated, noise-
contaminated spectrum at point C when the interferometer is observing the L(2) spectral radiance.
This gives, using (7.19b) to drop the FOV and mnf subscripts from L(2),
Because the uncalibrated signal spectra from L(1) and L(2) used in our calibration algorithm
should be noise-free, we average together a large number of measurements to get, following the
pattern of Eq. (7.18e) and the statement after it,
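$$E\big(\tilde Z^{(1)}_{eff,totN}(\sigma)\big) = H(u\sigma)\,M(R\sigma\theta_{rms})\,Z^{(1)}_{mnf}(\sigma) \tag{7.20g}$$
and
$$E\big(\tilde Z^{(2)}_{eff,totN}(\sigma)\big) = H(u\sigma)\,M(R\sigma\theta_{rms})\,Z^{(2)}_{mnf}(\sigma) . \tag{7.20h}$$
Since $E\big(\tilde Z^{(1,2)}_{eff,totN}(\sigma)\big)$ are the noise-free spectral signals corresponding to $L^{(1,2)}$, we can write
$$Z^{(1)}_{eff,tot}(\sigma) = H(u\sigma)\,M(R\sigma\theta_{rms})\,Z^{(1)}_{mnf}(\sigma) \tag{7.20i}$$
and
$$Z^{(2)}_{eff,tot}(\sigma) = H(u\sigma)\,M(R\sigma\theta_{rms})\,Z^{(2)}_{mnf}(\sigma) , \tag{7.20j}$$
where, to show that these are no longer random functions of σ, the tilde has been removed and
subscript totN has been changed to tot.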
Now we can apply the calibration algorithm in Sec. 5.19 of Chapter 5 to get [see Eq. (5.95a)]
$$
\text{Measured Radiance} = \big[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\big]\,\frac{\tilde Z^{(meas)}_{eff,totN}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)}{Z^{(2)}_{eff,tot}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)} + L^{(1)}(|\sigma|) ,
\tag{7.21a}
$$
Fig. 6.2 associated with the unknown optical radiance L that we want to measure. Note that,
although the expectation operator E is used to remove the noise from the L(1,2) signals, the noise
is left in the uncalibrated $\tilde Z^{(meas)}_{eff,totN}(\sigma)$ signal. This is our way of showing that, while a great deal of
effort can be invested in obtaining noise-free calibration data, the unknown spectrum L may be
changing slowly with time (and is often only one of a number of measurements to be performed
in a limited amount of time), which prevents us from averaging away its noise.105 The
uncalibrated (meas) signal spectrum, contaminated by mirror misalignment noise, is called
$\tilde Z_{eff,totN}(\sigma)$ in Eq. (7.18a), so we can now write that
$$
\tilde Z^{(meas)}_{eff,totN}(\sigma) = \tilde Z_{eff,totN}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{mnf}(\sigma) + a\,H(u\sigma)\left\{ \tilde n_D^{(\theta 2)}(\sigma) * \big[\sigma^{2}\,Z_{FOV}(\sigma)\big] \right\}
\tag{7.21b}
$$
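with $Z_{mnf}(\sigma)$ given by Eq. (7.16f) and $Z_{FOV}(\sigma)$ given by Eq. (7.7b). Working with the first
term on the right-hand side of (7.21a), we note that, substituting from Eqs. (7.20c), (7.20f),
(7.20i), and (7.20j),
$$\frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{Z^{(2)}_{eff,tot}(\sigma) - Z^{(1)}_{eff,tot}(\sigma)} = \frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{\frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)]} = \left[ \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|) \right]^{-1}. \tag{7.21c}$$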
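Consulting Eqs. (7.21b) and (7.16f), as well as (7.20c) and (7.20i), we get
$$\tilde{Z}^{(meas)}_{eff,totN}(\sigma) - Z^{(1)}_{eff,tot}(\sigma) = \frac{W A\,\Delta\Omega}{4}\, H(u\sigma)\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,[L_{mnf}(|\sigma|) - L^{(1)}(|\sigma|)] + a\, H(u\sigma) \big\{ n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}. \tag{7.21d}$$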
105
In Chapter 6, see the discussion at the end of Sec. 6.5 as well as the discussion following Eq. (6.33b).
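$$\text{Measured Radiance} = L_{mnf}(|\sigma|) + \frac{4a \big\{ n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{7.21e}$$
The right-hand side of (7.21e) is the sum of $L_{mnf}$, which is the spectral radiance distorted by
the effect of the interferometer’s finite field of view and finite interferogram length, and a random
noise term
$$\frac{4a \big\{ n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.$$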
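Function $L_{mnf}$ is strictly real, but there is no reason to expect this noise term to be strictly real. In
fact only the real component of the noise term unavoidably contaminates the $L_{mnf}$ data. We
conclude, then, that the $\delta L$ measurement noise in the radiance spectrum is
$$\delta L = \mathrm{Re}\left( \frac{4a \big\{ n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right) = \frac{4a \big\{ [\mathrm{Re}(n_D^{(\theta 2)}(\sigma))] * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{7.22a}$$
The second step in (7.22a) relies on $n_D^{(\theta 2)}$ being the only complex quantity in the expression for
the $\delta L$ spectral noise. For future use, we note that the imaginary component of the noise term in
(7.21e) can be written as
$$\mathrm{Im}\left( \frac{4a \big\{ n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right) = \frac{4a \big\{ [\mathrm{Im}(n_D^{(\theta 2)}(\sigma))] * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{7.22b}$$
Taking the real part of the measured spectrum eliminates this noise component from the data, just
as it did in our analysis of the avoidable and unavoidable detector noise [see the discussion
following Eq. (6.35d) in Chapter 6].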
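The two-point calibration in Eq. (7.21a), and the effect of keeping only the real part of the calibrated spectrum, can be sketched numerically. This is a minimal illustration rather than the book's procedure: the linear instrument model, the `gain` and `offset` values, the reference radiances, and the constant complex noise are all assumptions made up for the example.

```python
import numpy as np

def calibrate_two_point(z_meas, z1, z2, L1, L2):
    """Two-point calibration in the form of Eq. (7.21a)."""
    return (L2 - L1) * (z_meas - z1) / (z2 - z1) + L1

# Hypothetical linear instrument: Z = gain * L + offset (all values are illustrative).
sigma = np.linspace(500.0, 1500.0, 5)        # wavenumber grid
gain = 0.25 + 1e-4 * sigma                   # stands in for the instrument response
offset = 3.0                                 # stands in for internal background terms
L1 = np.full_like(sigma, 10.0)               # first reference radiance
L2 = np.full_like(sigma, 50.0)               # second reference radiance
L_true = 20.0 + 0.01 * sigma                 # "unknown" scene radiance

z1 = gain * L1 + offset                      # noise-free (averaged) reference spectra
z2 = gain * L2 + offset
noise = (0.02 - 0.05j) * np.ones_like(sigma) # complex noise left in the scene spectrum
z_meas = gain * L_true + offset + noise

L_meas = calibrate_two_point(z_meas, z1, z2, L1, L2)
L_real = L_meas.real                         # discards the imaginary noise component
```

In this toy model the calibrated result equals the true radiance plus the noise divided by the gain, so taking the real part leaves only the real noise component, mirroring the structure of Eqs. (7.21e) and (7.22a).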
Avoidable and Unavoidable Misalignment Noise in χ-Based Signals · 7.7
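$$Z_{FOV}(-\sigma) = \frac{W A\,\Delta\Omega}{4}\, R(|-\sigma|)\,\eta(-\sigma)\,\tau_a(|-\sigma|)\,[\tau_f(|-\sigma|)\, L_{FOV}(|-\sigma|) + L^{(fore)}_{FOV}(|-\sigma|) - L^{(back)}_{FOV}(|-\sigma|)] = \frac{W A\,\Delta\Omega}{4}\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\, L_{FOV}(|\sigma|) + L^{(fore)}_{FOV}(|\sigma|) - L^{(back)}_{FOV}(|\sigma|)] = Z_{FOV}(\sigma). \tag{7.23a}$$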
Equation (5.10f) in Chapter 5 shows that $M(-R\sigma\theta_{ma}) = M(R\sigma\theta_{ma})$, which means that
$$F^{(i\sigma\chi)}\big( M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma) \big) = \int_{-\infty}^{\infty} M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma \tag{7.23c}$$
must be a real and even function of χ because it is the reverse Fourier transform of a real and
even function of σ (see entry 1 in Table 2.1 of Chapter 2). This forces $z_B^{(tot)}(\chi)$ in Eq. (7.8g) to be
a real and even function of χ. To show why this is so, we note that the formula for $z_B^{(tot)}(\chi)$ is the
sum of the reverse Fourier transform specified in (7.23c) and several χ-independent constant
terms. We have just seen that the Fourier transform is a real and even function of χ, and the real
constant terms cannot change with χ; hence, $z_B^{(tot)}(\chi)$ must be real and even:
$$W(\chi) = \int_{-\infty}^{\infty} \sigma^2 Z_{FOV}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma = F^{(i\sigma\chi)}\big( \sigma^2 Z_{FOV}(\sigma) \big), \tag{7.23f}$$
function $W(\chi)$ is also the reverse Fourier transform of an even function of σ. All the factors in
the definition of Z FOV (σ ) in Eq. (7.7b) are real, which means that the [σ 2 Z FOV (σ )] product in
(7.23g) is also real. Hence, W ( χ ) is the reverse Fourier transform of a real and even function,
making it also real and even:
W (− χ ) = W ( χ ) (7.23h)
and
Im ( W ( χ ) ) = 0 . (7.23i)
Following the same pattern as in Eq. (7.23a), we see that Z mnf (σ ) defined in Eq. (7.16f) is even
because
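$$Z_{mnf}(-\sigma) = \frac{W A\,\Delta\Omega}{4}\, R(|-\sigma|)\,\eta(-\sigma)\,\tau_a(|-\sigma|)\,[\tau_f(|-\sigma|)\, L_{mnf}(|-\sigma|) + L^{(fore)}_{mnf}(|-\sigma|) - L^{(back)}_{mnf}(|-\sigma|)] = \frac{W A\,\Delta\Omega}{4}\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)\, L_{mnf}(|\sigma|) + L^{(fore)}_{mnf}(|\sigma|) - L^{(back)}_{mnf}(|\sigma|)] = Z_{mnf}(\sigma). \tag{7.23j}$$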
Im ( Z mnf (σ ) ) = 0 . (7.23k)
$$z_{BN}^{(tot)}(\chi) = z_B^{(tot)}(\chi) + a\, n^{(\theta 2)}(\chi)\, W(\chi). \tag{7.24a}$$
From Eq. (7.23d) we know that the noise-free signal $z_B^{(tot)}(\chi)$ is an even function of χ, so in
principle we could reduce the noise in (7.24a) by comparing the noise-contaminated signal at χ
and −χ. (In practice, of course, we would have to worry about distortions introduced by any
circuit used to measure the signal at χ and −χ. See the discussion of the distortions produced by
the detector circuit in Sec. 5.12 of Chapter 5.) To show how this works, we follow the pattern of
Eqs. (2.11d), (2.11e) in Chapter 2 and divide the mirror-misalignment noise $n^{(\theta 2)}(\chi)$ into even
and odd components, which we call $n_e^{(\theta 2)}(\chi)$ and $n_o^{(\theta 2)}(\chi)$ respectively, by defining
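$$n_e^{(\theta 2)}(\chi) = \frac{1}{2}\big[ n^{(\theta 2)}(\chi) + n^{(\theta 2)}(-\chi) \big] \tag{7.24b}$$
and
$$n_o^{(\theta 2)}(\chi) = \frac{1}{2}\big[ n^{(\theta 2)}(\chi) - n^{(\theta 2)}(-\chi) \big]. \tag{7.24c}$$
According to these definitions
$$n_e^{(\theta 2)}(-\chi) = n_e^{(\theta 2)}(\chi) \tag{7.24d}$$
and
$$n_o^{(\theta 2)}(-\chi) = -n_o^{(\theta 2)}(\chi). \tag{7.24e}$$
The sum of $n_e^{(\theta 2)}$ and $n_o^{(\theta 2)}$ returns the original noise term,
$$n_e^{(\theta 2)}(\chi) + n_o^{(\theta 2)}(\chi) = \frac{1}{2}\big[ n^{(\theta 2)}(\chi) + n^{(\theta 2)}(-\chi) \big] + \frac{1}{2}\big[ n^{(\theta 2)}(\chi) - n^{(\theta 2)}(-\chi) \big] = n^{(\theta 2)}(\chi).$$
Since
$$n^{(\theta 2)}(\chi) = n_e^{(\theta 2)}(\chi) + n_o^{(\theta 2)}(\chi), \tag{7.24f}$$
we can replace $n^{(\theta 2)}$ in Eq. (7.24a) by the sum of $n_e^{(\theta 2)}$ and $n_o^{(\theta 2)}$ to get
$$z_{BN}^{(tot)}(\chi) = \big[ z_B^{(tot)}(\chi) + a\, n_e^{(\theta 2)}(\chi)\, W(\chi) \big] + a\, n_o^{(\theta 2)}(\chi)\, W(\chi). \tag{7.24g}$$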
The sum inside the square brackets [ ] is even with respect to χ because, according to Eqs.
(7.23d), (7.23h), and (7.24d),
$$z_B^{(tot)}(-\chi) + a\, n_e^{(\theta 2)}(-\chi)\, W(-\chi) = z_B^{(tot)}(\chi) + a\, n_e^{(\theta 2)}(\chi)\, W(\chi). \tag{7.24h}$$
This sum, just like the noise-free signal $z_B^{(tot)}$ in Eq. (7.23d), is even, which means that the even
noise component
$$a\, n_e^{(\theta 2)}(\chi)\, W(\chi)$$
cannot be distinguished from the noise-free $z_B^{(tot)}$ signal. The odd noise component,
$$a\, n_o^{(\theta 2)}(\chi)\, W(\chi),$$
on the other hand, can in principle be eliminated, for example by averaging together the
noise-contaminated signal at χ and −χ. To see how this works, we consult Eq. (7.24g) and write
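$$\frac{1}{2} z_{BN}^{(tot)}(\chi) + \frac{1}{2} z_{BN}^{(tot)}(-\chi) = \frac{1}{2}\big[ z_B^{(tot)}(\chi) + a\, n_e^{(\theta 2)}(\chi) W(\chi) \big] + \frac{1}{2} a\, n_o^{(\theta 2)}(\chi) W(\chi) + \frac{1}{2}\big[ z_B^{(tot)}(-\chi) + a\, n_e^{(\theta 2)}(-\chi) W(-\chi) \big] + \frac{1}{2} a\, n_o^{(\theta 2)}(-\chi) W(-\chi).$$
This becomes, applying Eqs. (7.23h), (7.24e), and (7.24h),
$$\frac{1}{2} z_{BN}^{(tot)}(\chi) + \frac{1}{2} z_{BN}^{(tot)}(-\chi) = \frac{1}{2}\big[ z_B^{(tot)}(\chi) + a\, n_e^{(\theta 2)}(\chi) W(\chi) \big] + \frac{1}{2}\big[ z_B^{(tot)}(\chi) + a\, n_e^{(\theta 2)}(\chi) W(\chi) \big] = \big[ z_B^{(tot)}(\chi) + a\, n_e^{(\theta 2)}(\chi) W(\chi) \big].$$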
Averaging the noise-contaminated signal $z_{BN}^{(tot)}$ at χ and −χ eliminates the odd noise component,
reducing the amount of mirror-misalignment noise contaminating the signal. For this reason, it
makes sense to call $n_e^{(\theta 2)}$ the unavoidable mirror-tilt noise, because it is even and so cannot be
distinguished from the even, noise-free signal, and to call $n_o^{(\theta 2)}$ the avoidable mirror-tilt noise
because it can be removed by averaging the noise-contaminated signal at χ and −χ.
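The even/odd split of Eqs. (7.24b) and (7.24c), and the cancellation of the odd component when the signal at χ and −χ is averaged, can be checked on a sampled grid. The sketch below is illustrative only: the cosine signal, the Gaussian stand-in for W(χ), the value of `a`, and the random noise sequence are assumptions chosen for the example, and index reversal on a symmetric grid plays the role of χ → −χ.

```python
import numpy as np

rng = np.random.default_rng(0)

chi = np.linspace(-1.0, 1.0, 201)        # symmetric grid: reversing the array maps chi to -chi
signal = np.cos(2 * np.pi * 3 * chi)     # even stand-in for the noise-free z_B(chi)
W = np.exp(-chi**2)                      # even stand-in for W(chi)
a = 0.1                                  # illustrative scale factor
n = rng.normal(size=chi.size)            # stand-in for the misalignment noise n(chi)

# Even and odd parts of the noise, following Eqs. (7.24b) and (7.24c).
n_even = 0.5 * (n + n[::-1])
n_odd = 0.5 * (n - n[::-1])

# Noise-contaminated signal in the form of Eq. (7.24a), and its average over chi and -chi.
z_noisy = signal + a * n * W
z_avg = 0.5 * (z_noisy + z_noisy[::-1])
```

After averaging, only the even noise component survives: `z_avg` matches `signal + a * n_even * W` to rounding error, while the odd component has dropped out.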
Avoidable and Unavoidable Mirror-Misalignment Noise in the Signal Spectrum · 7.8
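$$n_D^{(\theta 2)}(\sigma) = \int_{-D}^{D} n_e^{(\theta 2)}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi + \int_{-D}^{D} n_o^{(\theta 2)}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = n_{De}^{(\theta 2)}(\sigma) + n_{Do}^{(\theta 2)}(\sigma), \tag{7.25a}$$
where we define
$$n_{De}^{(\theta 2)}(\sigma) = \int_{-D}^{D} n_e^{(\theta 2)}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \int_{-\infty}^{\infty} \Pi(\chi, D)\, n_e^{(\theta 2)}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\, n_e^{(\theta 2)}(\chi) \big) \tag{7.25b}$$
and
$$n_{Do}^{(\theta 2)}(\sigma) = \int_{-D}^{D} n_o^{(\theta 2)}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = \int_{-\infty}^{\infty} \Pi(\chi, D)\, n_o^{(\theta 2)}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\, n_o^{(\theta 2)}(\chi) \big). \tag{7.25c}$$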
Equation (7.25b) states that $n_{De}^{(\theta 2)}$ is the forward Fourier transform of $[\Pi(\chi, D)\, n_e^{(\theta 2)}(\chi)]$.
Glancing back at Eqs. (7.12a) and (7.24d), we note that
Equation (7.25c) states that $n_{Do}^{(\theta 2)}$ is the forward Fourier transform of $[\Pi(\chi, D)\, n_o^{(\theta 2)}(\chi)]$.
According to Eqs. (7.12a) and (7.24e),
which makes $n_{Do}^{(\theta 2)}$ the forward Fourier transform of a real and odd function. Consequently,
according to entry 4 of Table 2.1 in Chapter 2, $n_{Do}^{(\theta 2)}$ must be imaginary and odd:
$$n_{Do}^{(\theta 2)}(-\sigma) = -n_{Do}^{(\theta 2)}(\sigma) \tag{7.27b}$$
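and
$$\mathrm{Im}\big( n_{Do}^{(\theta 2)}(\sigma) \big) = i^{-1}\, n_{Do}^{(\theta 2)}(\sigma). \tag{7.27c}$$
Equations (7.26c) and (7.27c) show that taking the real part of both sides of (7.25a) now gives
$$\mathrm{Re}\big( n_D^{(\theta 2)}(\sigma) \big) = n_{De}^{(\theta 2)}(\sigma), \tag{7.28a}$$
and taking the imaginary part gives
$$\mathrm{Im}\big( n_D^{(\theta 2)}(\sigma) \big) = i^{-1}\, n_{Do}^{(\theta 2)}(\sigma). \tag{7.28b}$$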
Equation (7.28a) shows that the real part of $n_D^{(\theta 2)}$, the D-limited Fourier transform of $n^{(\theta 2)}$, is
$n_{De}^{(\theta 2)}$, which is, according to (7.25b), the D-limited Fourier transform of the unavoidable signal
noise $n_e^{(\theta 2)}$. Because the real part of $n_D^{(\theta 2)}$ comes from $n_e^{(\theta 2)}$, the unavoidable signal noise, it
makes sense to regard the real part of $n_D^{(\theta 2)}$ as the unavoidable component of $n^{(\theta 2)}$ in the spectral
domain. This matches what we see in Eq. (7.22a), where the formula for the noise $\delta L$ in the
measured spectrum uses only the real part of $n_D^{(\theta 2)}$ (that is, it uses only the unavoidable
component of $n^{(\theta 2)}$ in the spectral domain). Equation (7.28a) can be substituted into (7.22a) to
make the dependence on $n_{De}^{(\theta 2)}$ explicit:
$$\delta L = \frac{4a \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{7.28c}$$
Equations (4.139g) in Chapter 4 and (5.10f) in Chapter 5 show that η and M are even functions of
σ, and absolute value signs turn everything else in the denominator of the right-hand side of
(7.28c) into an even function of σ. Equations (7.26b) and (7.23g) show that $n_{De}^{(\theta 2)}$ and
$[\sigma^2 Z_{FOV}(\sigma)]$ are even functions of σ, and Eq. (2.38f) in Chapter 2 requires the convolution of
two even functions to be another even function. Hence, the numerator of (7.28c) is also an even
function of σ. This makes the measurement noise $\delta L$ an even function of σ, which can be shown
by writing it as a function of $|\sigma|$,
$$\delta L = \delta L(|\sigma|)$$
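$$\delta L(|\sigma|) = \frac{4a \big\{ [\mathrm{Re}(n_D^{(\theta 2)}(\sigma))] * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \tag{7.28d}$$
and
$$\delta L(|\sigma|) = \frac{4a \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{7.28e}$$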
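Equation (7.28b) shows that the imaginary part of $n_D^{(\theta 2)}$ is the same as $i^{-1} n_{Do}^{(\theta 2)}(\sigma)$, the D-limited
Fourier transform of the avoidable signal noise divided by i. Equation (7.28b) can be substituted
into (7.22b) to make this explicit:
$$\mathrm{Im}\left( \frac{4a \big\{ n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right) = \frac{4a\, i^{-1} \big\{ n_{Do}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}. \tag{7.28f}$$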
Since $E\big( n_D^{(\theta 2)}(\sigma) \big) = 0$ in Eq. (7.18c), we know that, using the linearity of E explained in Sec.
3.10 of Chapter 3,
$$E\big( n_D^{(\theta 2)}(\sigma) \big) = E\big( \mathrm{Re}(n_D^{(\theta 2)}(\sigma)) + i\, \mathrm{Im}(n_D^{(\theta 2)}(\sigma)) \big) = E\big( \mathrm{Re}(n_D^{(\theta 2)}(\sigma)) \big) + i\, E\big( \mathrm{Im}(n_D^{(\theta 2)}(\sigma)) \big) = 0.$$
Consequently both the real and imaginary components of $E\big( n_D^{(\theta 2)}(\sigma) \big)$ must be separately equal
to zero, which means
$$E\big( \mathrm{Re}(n_D^{(\theta 2)}(\sigma)) \big) = 0 \tag{7.29a}$$
and
$$E\big( \mathrm{Im}(n_D^{(\theta 2)}(\sigma)) \big) = 0. \tag{7.29b}$$
Equations (7.28a) and (7.28b) then give
$$E\big( n_{De}^{(\theta 2)}(\sigma) \big) = 0 \tag{7.29c}$$
and
$$E\big( n_{Do}^{(\theta 2)}(\sigma) \big) = 0. \tag{7.29d}$$
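Applying the expectation operator to both sides of Eq. (7.28e) leads to, using Eqs. (2.38b) and
(2.38a) in Chapter 2 and the linearity of the expectation operator in Sec. 3.10 of Chapter 3,
$$E\big( \delta L(|\sigma|) \big) = E\left( \frac{4a \big\{ [\sigma^2 Z_{FOV}(\sigma)] * n_{De}^{(\theta 2)}(\sigma) \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right) = \frac{4a}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}\, E\left( \int_{-\infty}^{\infty} n_{De}^{(\theta 2)}(\sigma - \sigma')\, [\sigma'^2 Z_{FOV}(\sigma')]\, d\sigma' \right) = \frac{4a}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \int_{-\infty}^{\infty} E\big( n_{De}^{(\theta 2)}(\sigma - \sigma') \big)\, [\sigma'^2 Z_{FOV}(\sigma')]\, d\sigma',$$
which is zero according to Eq. (7.29c).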
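This shows that the measurement noise $\delta L(|\sigma|)$ is a zero-mean random variable. Similarly Eq.
(7.28f) gives us, after applying the expectation operator to both sides,
$$E\left( \mathrm{Im}\left( \frac{4a \big\{ n_D^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right) \right) = \frac{4a\, i^{-1}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \int_{-\infty}^{\infty} E\big( n_{Do}^{(\theta 2)}(\sigma - \sigma') \big)\, [\sigma'^2 Z_{FOV}(\sigma')]\, d\sigma',$$
which is zero according to Eq. (7.29d).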
Hence both the real and imaginary contamination of the measurement due to the signal’s mirror-
tilt noise can be reduced to negligible levels by averaging together many independent
measurements of the same spectrum.
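The link between the even/odd noise components and the real/imaginary parts of the transformed noise, Eqs. (7.28a) and (7.28b), has a direct discrete analogue that can be verified with an FFT. This is a sketch under stated assumptions: a periodic sequence of length `N` stands in for the D-limited noise record, and circular index reversal stands in for χ → −χ; neither choice comes from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 64
n = rng.normal(size=N)                 # real noise samples on a periodic grid

# Circular reversal: n_rev[k] = n[(-k) % N], the discrete analogue of n(-chi).
n_rev = np.roll(n[::-1], 1)

n_e = 0.5 * (n + n_rev)                # even ("unavoidable") component
n_o = 0.5 * (n - n_rev)                # odd ("avoidable") component

F = np.fft.fft(n)                      # transform of the full noise record
F_e = np.fft.fft(n_e)                  # real to rounding error
F_o = np.fft.fft(n_o)                  # purely imaginary to rounding error
```

The real part of `F` coincides with `F_e` and the imaginary part of `F` with `F_o / 1j`, so discarding the imaginary part of a spectrum removes exactly the contribution of the odd noise component.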
Power Spectrum of ñ(θ2) · 7.9
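$$o_{nn}^{(\theta 2)}(\chi, \chi') = E\big( n^{(\theta 2)}(\chi)\, n^{(\theta 2)}(\chi') \big).$$
$$p_{nn}^{(\theta 2)}(\sigma) = \int_{-\infty}^{\infty} o_{nn}^{(\theta 2)}(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = F^{(-i\sigma\chi)}\big( o_{nn}^{(\theta 2)}(\chi) \big) \tag{7.30b}$$
$$o_{nn}^{(\theta 2)}(\chi) = \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\, e^{2\pi i\sigma\chi}\, d\sigma = F^{(i\sigma\chi)}\big( p_{nn}^{(\theta 2)}(\sigma) \big). \tag{7.30c}$$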
Equations (7.30b) and (7.30c) show how we set up the χ-based autocorrelation of $n^{(\theta 2)}$ and the σ-
based power spectrum of $n^{(\theta 2)}$ as a Fourier-transform pair. Equation (7.8b) shows that $o_{nn}^{(\theta 2)}$ is
real because $n^{(\theta 2)}$ is real, and we also note that $o_{nn}^{(\theta 2)}$ must be even because for any two values of
χ and χ′,
$$o_{nn}^{(\theta 2)}(\chi' - \chi) = E\big( n^{(\theta 2)}(\chi)\, n^{(\theta 2)}(\chi') \big) = E\big( n^{(\theta 2)}(\chi')\, n^{(\theta 2)}(\chi) \big) = o_{nn}^{(\theta 2)}(\chi - \chi').$$
Substituting χ″ = χ′ − χ, this becomes
$$o_{nn}^{(\theta 2)}(-\chi'') = o_{nn}^{(\theta 2)}(\chi'') \tag{7.30d}$$
and, having just decided the autocorrelation is real,
$$\mathrm{Im}\big( o_{nn}^{(\theta 2)}(\chi'') \big) = 0. \tag{7.30e}$$
Equation (7.30b) then shows that, according to (7.30d) and (7.30e), $p_{nn}^{(\theta 2)}$ is the forward Fourier
transform of a real and even function, which means that it must also be real and even:106
$$\mathrm{Im}\big( p_{nn}^{(\theta 2)}(\sigma) \big) = 0 \tag{7.30f}$$
and
$$p_{nn}^{(\theta 2)}(-\sigma) = p_{nn}^{(\theta 2)}(\sigma). \tag{7.30g}$$
The χ = 0 value of the autocorrelation function can be used to connect the $p_{nn}^{(\theta 2)}$ power
spectrum to the statistics of the misalignment angle. Setting χ = 0 in Eq. (7.30c) gives
$$o_{nn}^{(\theta 2)}(0) = \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\, d\sigma$$
or
$$E\big( [n^{(\theta 2)}(\chi)]^2 \big) = \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\, d\sigma. \tag{7.31a}$$
Substituting from Eq. (7.8b) and using the linearity of operator E with respect to random
quantities (see Sec. 3.10 of Chapter 3) as well as Eq. (3.9f) of Chapter 3, we get
$$E\big( [n^{(\theta 2)}(\chi)]^2 \big) = E\big( [\theta_{rms}^2 - \theta(\chi)^2]^2 \big) = E\big( \theta_{rms}^4 \big) - 2E\big( \theta_{rms}^2\, \theta(\chi)^2 \big) + E\big( \theta(\chi)^4 \big) = \theta_{rms}^4 - 2\theta_{rms}^2\, E\big( \theta(\chi)^2 \big) + E\big( \theta(\chi)^4 \big) = E\big( \theta(\chi)^4 \big) - \theta_{rms}^4,$$
where in the last step $E(\theta(\chi)^2) = \theta_{rms}^2$ from Eq. (7.3c) is used to simplify the result. Substitution
of this formula into (7.31a) gives
$$E\big( \theta(\chi)^4 \big) = \theta_{rms}^4 + \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\, d\sigma. \tag{7.31b}$$
106
See entry 1 of Table 2.1 in Chapter 2.
Because the statistics of θ do not depend on χ, we are not surprised to see $E(\theta(\chi)^4)$ set equal to
a χ-independent sum. This formula connects the integrated value of $p_{nn}^{(\theta 2)}$ to the (presumably
already known) statistical quantities $\theta_{rms}$ and $E(\theta(\chi)^4)$. Once a shape has been chosen for $p_{nn}^{(\theta 2)}$,
Eq. (7.31b) can be used to find the normalizing constant, which should be applied to the shape
function to get the exact formula for the $p_{nn}^{(\theta 2)}$ noise-power spectrum (see, for example, Sec. 7.13
7.14
below).
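The algebra leading to Eq. (7.31b), namely that the mean square of $\theta_{rms}^2 - \theta^2$ equals the fourth moment of θ minus $\theta_{rms}^4$, can be checked with sample moments. The sketch below is only an illustration: the bias `phi`, the spreads `gx` and `gy`, and the sample size are made-up values, and the sample mean of θ² is used in place of the true $\theta_{rms}^2$ so that the identity holds exactly for the sample.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical tilt statistics (illustrative values only).
phi, gx, gy = 0.3, 1.0, 0.7
theta_x = phi + gx * rng.normal(size=10_000)   # x tilt with bias phi
theta_y = gy * rng.normal(size=10_000)         # zero-mean y tilt

theta_sq = theta_x**2 + theta_y**2             # theta^2 = theta_x^2 + theta_y^2
theta_rms_sq = theta_sq.mean()                 # sample version of theta_rms^2

n = theta_rms_sq - theta_sq                    # noise term in the form of Eq. (7.8b)

lhs = np.mean(n**2)                            # sample E([n]^2)
rhs = np.mean(theta_sq**2) - theta_rms_sq**2   # sample E(theta^4) - theta_rms^4
```

Here `lhs` and `rhs` agree to rounding error, and the sample mean of `n` is zero by construction, which is the zero-mean property used throughout the chapter.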
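$$E\big( [\delta L(|\sigma|)]^2 \big) = E\left( \left[ \frac{4a \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right]^2 \right) = \left[ \frac{4a}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right]^2 E\Big( \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}^2 \Big). \tag{7.32}$$
$$E\Big( \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}^2 \Big),$$
$$\sigma^2 Z_{FOV}(\sigma) = \int_{-\infty}^{\infty} W(\chi')\, e^{-2\pi i\sigma\chi'}\, d\chi' = F^{(-i\sigma\chi')}\big( W(\chi') \big). \tag{7.33a}$$
$$n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] = F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\, n_e^{(\theta 2)}(\chi) \big) * F^{(-i\sigma\chi')}\big( W(\chi') \big),$$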
which becomes, using the Fourier convolution theorem [see Eq. (2.39j) in Chapter 2],
$$n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] = F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\, n_e^{(\theta 2)}(\chi)\, W(\chi) \big) = \int_{-\infty}^{\infty} \Pi(\chi, D)\, n_e^{(\theta 2)}(\chi)\, W(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi. \tag{7.33b}$$
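Now we can write (using the linearity of operator E discussed in Sec. 3.10 of Chapter 3)
$$E\Big( \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}^2 \Big) = E\left( \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, n_e^{(\theta 2)}(\chi)\, W(\chi)\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, n_e^{(\theta 2)}(\chi')\, W(\chi')\, e^{-2\pi i\sigma\chi'} \right) = \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, W(\chi)\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, W(\chi')\, E\big( n_e^{(\theta 2)}(\chi)\, n_e^{(\theta 2)}(\chi') \big)\, e^{-2\pi i\sigma\chi'}. \tag{7.33c}$$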
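$$E\big( n_e^{(\theta 2)}(\chi)\, n_e^{(\theta 2)}(\chi') \big) = \frac{1}{4} E\big( [n^{(\theta 2)}(\chi) + n^{(\theta 2)}(-\chi)][n^{(\theta 2)}(\chi') + n^{(\theta 2)}(-\chi')] \big) = \frac{1}{4}\Big[ E\big( n^{(\theta 2)}(\chi)\, n^{(\theta 2)}(\chi') \big) + E\big( n^{(\theta 2)}(\chi)\, n^{(\theta 2)}(-\chi') \big) + E\big( n^{(\theta 2)}(-\chi)\, n^{(\theta 2)}(\chi') \big) + E\big( n^{(\theta 2)}(-\chi)\, n^{(\theta 2)}(-\chi') \big) \Big].$$
Writing each expectation value as an autocorrelation gives
$$E\big( n_e^{(\theta 2)}(\chi)\, n_e^{(\theta 2)}(\chi') \big) = \frac{1}{4}\Big[ o_{nn}^{(\theta 2)}(\chi' - \chi) + o_{nn}^{(\theta 2)}(-\chi' - \chi) + o_{nn}^{(\theta 2)}(\chi' + \chi) + o_{nn}^{(\theta 2)}(-\chi' + \chi) \Big],$$
which simplifies, because $o_{nn}^{(\theta 2)}$ is even [see Eq. (7.30d)], to
$$E\big( n_e^{(\theta 2)}(\chi)\, n_e^{(\theta 2)}(\chi') \big) = \frac{1}{2}\Big[ o_{nn}^{(\theta 2)}(\chi' - \chi) + o_{nn}^{(\theta 2)}(\chi' + \chi) \Big]. \tag{7.33d}$$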
Calculating the Variance of δL · 7.10
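$$E\Big( \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}^2 \Big) = \frac{1}{2} \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, W(\chi)\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, W(\chi')\, o_{nn}^{(\theta 2)}(\chi' - \chi)\, e^{-2\pi i\sigma\chi'} + \frac{1}{2} \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, W(\chi)\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, W(\chi')\, o_{nn}^{(\theta 2)}(\chi' + \chi)\, e^{-2\pi i\sigma\chi'}$$
$$= \frac{1}{2} \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, W(\chi)\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, W(\chi')\, e^{-2\pi i\sigma\chi'} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, e^{2\pi i\sigma'(\chi' - \chi)}\, d\sigma' + \frac{1}{2} \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, W(\chi)\, e^{-2\pi i\sigma\chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, W(\chi')\, e^{-2\pi i\sigma\chi'} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, e^{2\pi i\sigma'(\chi' + \chi)}\, d\sigma'.$$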
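This can be written as, interchanging the order of the multiple integrals,
$$E\Big( \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}^2 \Big) = \frac{1}{2} \int_{-\infty}^{\infty} d\sigma'\, p_{nn}^{(\theta 2)}(\sigma') \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, W(\chi)\, e^{-2\pi i(\sigma + \sigma')\chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, W(\chi')\, e^{-2\pi i(\sigma - \sigma')\chi'} + \frac{1}{2} \int_{-\infty}^{\infty} d\sigma'\, p_{nn}^{(\theta 2)}(\sigma') \int_{-\infty}^{\infty} d\chi\, \Pi(\chi, D)\, W(\chi)\, e^{-2\pi i(\sigma - \sigma')\chi} \int_{-\infty}^{\infty} d\chi'\, \Pi(\chi', D)\, W(\chi')\, e^{-2\pi i(\sigma - \sigma')\chi'}. \tag{7.33e}$$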
From Eq. (7.15b) and the Fourier convolution theorem [Eq. (2.39j) in Chapter 2], we get
$$\int_{-\infty}^{\infty} \Pi(\chi, D)\, W(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi = F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\, W(\chi) \big) = [2D\, \mathrm{sinc}(2\pi\sigma D)] * [\sigma^2 Z_{FOV}(\sigma)].$$
The $\sigma^2$ term is broad and slowly varying compared to the narrow and rapidly varying sinc
function, so it acts like a quasi-constant and can be brought outside the convolution [see Eq.
(5C.1) in Appendix 5C of Chapter 5]. This means we can write, using the approximation in
(7.16h) above,
$$\int_{-\infty}^{\infty} \Pi(\chi, D)\, W(\chi)\, e^{-2\pi i\sigma\chi}\, d\chi \cong \sigma^2 \big( [2D\, \mathrm{sinc}(2\pi\sigma D)] * Z_{FOV}(\sigma) \big) \cong \sigma^2 Z_{mnf}(\sigma). \tag{7.33f}$$
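$$E\Big( \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}^2 \Big) = \frac{1}{2} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\, [(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\, d\sigma' + \frac{1}{2} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]^2\, d\sigma'. \tag{7.33g}$$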
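This expression is too complicated to substitute comfortably back into Eq. (7.32), the formula for
the variance of $\delta L(|\sigma|)$, so we define a new function
$$J^{(\theta 2)}(\sigma) = \frac{1}{2} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\, [(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\, d\sigma' + \frac{1}{2} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]^2\, d\sigma', \tag{7.33h}$$
so that
$$E\Big( \big\{ n_{De}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{FOV}(\sigma)] \big\}^2 \Big) = J^{(\theta 2)}(\sigma). \tag{7.33i}$$
Equation (7.32) then becomes
$$E\big( [\delta L(|\sigma|)]^2 \big) = J^{(\theta 2)}(\sigma) \cdot \left[ \frac{4a}{W A\,\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right]^2. \tag{7.33j}$$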
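$$\frac{a}{A} = \frac{2\pi^2 R^2}{\pi R^2} = 2\pi.$$
Here, variables A and R have the same meaning as in the discussion following Eq. (4.137e) in
Chapter 4. The discussion following Eq. (4.83) in Chapter 4 reveals that, because W must be 1 or
−1,
$$W^2 = 1. \tag{7.33k}$$
$$E\big( [\delta L(|\sigma|)]^2 \big) = J^{(\theta 2)}(\sigma) \cdot \left[ \frac{8\pi}{\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right]^2. \tag{7.33$\ell$}$$
$$\mathrm{NEdN}_{tilt} = \left[ \frac{8\pi \sqrt{J^{(\theta 2)}(\sigma)}}{\Delta\Omega\, M(R\sigma\theta_{rms})\, R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right]. \tag{7.34b}$$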
There are a number of ways to write the $J^{(\theta 2)}$ function defined in Eq. (7.33h) above. The
second term on the right-hand side of (7.33h) can, for example, be written as a convolution [see
Eq. (2.38a) in Chapter 2 for the definition of a convolution]. This gives
$$J^{(\theta 2)}(\sigma) = \frac{1}{2}\, p_{nn}^{(\theta 2)}(\sigma) * [\sigma^2 Z_{mnf}(\sigma)]^2 + \frac{1}{2} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\, [(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\, d\sigma'. \tag{7.35a}$$
$$J^{(\theta 2)}(\sigma) = \frac{1}{4} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, \big[ (\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma') + (\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma') \big]^2\, d\sigma'. \tag{7.35b}$$
To justify this latest formula, we consult Eq. (7.30g) and define a new dummy variable of
integration σ″ = −σ′ in order to show that
$$\int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]^2\, d\sigma' = -\int_{\infty}^{-\infty} p_{nn}^{(\theta 2)}(-\sigma'')\, [(\sigma - \sigma'')^2 Z_{mnf}(\sigma - \sigma'')]^2\, d\sigma'' = \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma'')\, [(\sigma - \sigma'')^2 Z_{mnf}(\sigma - \sigma'')]^2\, d\sigma''. \tag{7.35c}$$
Formula for the Misalignment of NEdN of Double-Sided Signals · 7.11
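Expanding the square in Eq. (7.35b) and then applying Eq. (7.35c) gives
$$J^{(\theta 2)}(\sigma) = \frac{1}{4} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]^2\, d\sigma' + \frac{1}{4} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]^2\, d\sigma' + \frac{1}{2} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\, [(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\, d\sigma'$$
$$= \frac{1}{2} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]^2\, d\sigma' + \frac{1}{2} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma')\, [(\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma')]\, [(\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma')]\, d\sigma'.$$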
This is the same as Eq. (7.33h) above, showing that the right-hand side of (7.35b) is correct. We
note, since the power spectrum $p_{nn}^{(\theta 2)}$ can never be negative and the terms inside the square
brackets [ ] are all real, that the integral on the right-hand side of (7.35b) is never negative.
Consequently, $J^{(\theta 2)}$ must be a non-negative quantity, which means there is never any problem
taking its square root in the formula for the mirror-misalignment NEdN in Eq. (7.34b). The $Z_{mnf}$
function in Eq. (7.35b) is specified by formula (7.16f) above; we see that $Z_{mnf}$ depends on the
background radiances $L^{(fore)}_{mnf}$ and $L^{(back)}_{mnf}$ as well as on $L_{mnf}$, the radiance being measured. Hence
both the internal background radiances and the radiance being measured end up contributing to
the mirror-tilt NEdN.
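The equivalence of the two forms of $J^{(\theta 2)}$, Eqs. (7.33h) and (7.35b), and the non-negativity of the result, can be confirmed by simple quadrature on a symmetric σ′ grid. Everything numerical below is an assumption made for the sake of the sketch: the Gaussian shapes chosen for `z_mnf` and for the power spectrum `p`, the grid limits, and the test wavenumbers are placeholders, not values from the text.

```python
import numpy as np

def z_mnf(s):
    """Hypothetical real, even spectrum standing in for Z_mnf(sigma)."""
    return np.exp(-((np.abs(s) - 1000.0) / 200.0) ** 2)

def g(s):
    """The product sigma^2 * Z_mnf(sigma) that appears inside both integrals."""
    return s**2 * z_mnf(s)

# Symmetric sigma' grid and an even, non-negative power-spectrum shape (made up).
sp = np.linspace(-300.0, 300.0, 601)
ds = sp[1] - sp[0]
p = np.exp(-((sp / 50.0) ** 2))

def J_733h(s):
    """Riemann-sum version of Eq. (7.33h)."""
    cross = np.sum(p * g(s + sp) * g(s - sp)) * ds
    square = np.sum(p * g(s - sp) ** 2) * ds
    return 0.5 * cross + 0.5 * square

def J_735b(s):
    """Riemann-sum version of Eq. (7.35b)."""
    return 0.25 * np.sum(p * (g(s + sp) + g(s - sp)) ** 2) * ds

sigmas = np.array([800.0, 1000.0, 1200.0])
j_h = np.array([J_733h(s) for s in sigmas])
j_b = np.array([J_735b(s) for s in sigmas])
```

Because `p` is even and the grid is symmetric, the change of variable used in Eq. (7.35c) holds exactly for the sums, so `j_h` and `j_b` agree to rounding error, and the squared integrand in `J_735b` guarantees a non-negative value.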
7.12 Connection Between the $p_{nn}^{(\theta 2)}$ Power Spectrum and the Power Spectra of $\theta_x$, $\theta_y$
To understand the implications of the NEdN$_{tilt}$ formulas derived in the previous sections, we
need some information about the typical shape of the $p_{nn}^{(\theta 2)}$ power spectrum. It turns out that if we
assign power spectra to the $\theta_x(\chi)$ and $\theta_y(\chi)$ random functions introduced in Sec. 7.2 above, we
can use them to get information about the probable shape of $p_{nn}^{(\theta 2)}$ by deriving a formula for $p_{nn}^{(\theta 2)}$
in terms of those power spectra.
Simplifying the notation in preparation for the algebra coming up, we define four new random
variables X, X′, Y, Y′ by specifying that
θx ( χ ) = X + φ , (7.36a)
θx ( χ ′) = X ′ + φ , (7.36b)
θy ( χ ) = Y , (7.36c)
and
θy ( χ ′) = Y ′ (7.36d)
The point of this new notation is to emphasize the important information (namely, whether
we are dealing with the x or the y component of the angle) and to suppress all the irrelevant
aspects of argument χ, keeping only the relevant information as to whether or not it is primed.
According to Eq. (7.2f), the average value of $\theta_x(\chi)$, which is the same thing as the system’s
bias tilt, is the constant angle φ at any value of χ. Writing Eqs. (7.36a) and (7.36b) as
$$X = \theta_x(\chi) - \phi, \qquad X' = \theta_x(\chi') - \phi$$
makes it easy to see that X and X′ are zero-mean random functions of χ and χ′ respectively. The
statistics of $\theta_x(\chi)$ and $\theta_y(\chi)$ do not depend on χ, so we expect the same to hold true for the
statistics of X, X′, Y, and Y′. Hence, we can assume that X, X′ and Y, Y′ are at least wide-
sense stationary functions of χ (which is, according to Sec. 3.20 of Chapter 3, all that is
necessary to provide them with power spectra). Because they are wide-sense stationary, we can
set up the two autocorrelation functions
$$E(XX') = o^{(xx)}(\chi' - \chi) \tag{7.36e}$$
and
$$E(YY') = o^{(yy)}(\chi' - \chi) \tag{7.36f}$$
to be functions only of the difference between χ and χ′. The associated power spectra are, using
χ″ = χ′ − χ,
$$p^{(xx)}(\sigma) = \int_{-\infty}^{\infty} o^{(xx)}(\chi'')\, e^{-2\pi i\sigma\chi''}\, d\chi'' = F^{(-i\sigma\chi'')}\big( o^{(xx)}(\chi'') \big) \tag{7.36g}$$
and
$$p^{(yy)}(\sigma) = \int_{-\infty}^{\infty} o^{(yy)}(\chi'')\, e^{-2\pi i\sigma\chi''}\, d\chi'' = F^{(-i\sigma\chi'')}\big( o^{(yy)}(\chi'') \big); \tag{7.36h}$$
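$$o^{(xx)}(\chi'') = \int_{-\infty}^{\infty} p^{(xx)}(\sigma)\, e^{2\pi i\sigma\chi''}\, d\sigma = F^{(i\sigma\chi'')}\big( p^{(xx)}(\sigma) \big) \tag{7.36i}$$
and
$$o^{(yy)}(\chi'') = \int_{-\infty}^{\infty} p^{(yy)}(\sigma)\, e^{2\pi i\sigma\chi''}\, d\sigma = F^{(i\sigma\chi'')}\big( p^{(yy)}(\sigma) \big). \tag{7.36j}$$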
If we no longer assume that $\theta_x(\chi)$ and $\theta_y(\chi)$ are uncorrelated, which means that X and Y
might be correlated random variables, we must use the cross-correlation function
$$E(XY') = o^{(xy)}(\chi' - \chi), \tag{7.37a}$$
like the one defined in Eq. (3.30d) in Chapter 3, to describe the statistical relationship between
X and Y. Again we assume that, just like $o^{(xx)}$ and $o^{(yy)}$, it is a real function of the difference
between χ and χ′, which means that X and Y are jointly wide-sense stationary. Hence we can
define a new variable χ″ = χ − χ′ and construct an associated cross-power spectrum [see Eq.
(3.48e) in Chapter 3],
$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty} o^{(xy)}(\chi'')\, e^{-2\pi i\sigma\chi''}\, d\chi'' = F^{(-i\sigma\chi'')}\big( o^{(xy)}(\chi'') \big). \tag{7.37b}$$
$$o^{(xy)}(\chi'') = \int_{-\infty}^{\infty} p^{(xy)}(\sigma)\, e^{2\pi i\sigma\chi''}\, d\sigma = F^{(i\sigma\chi'')}\big( p^{(xy)}(\sigma) \big). \tag{7.37c}$$
The same sort of reasoning used above in Sec. 7.9 [see Eqs. (7.30d)–(7.30g)] can be used here
to show that o ( xx ) , o ( yy ) , p( xx ) , and p( yy ) in Eqs. (7.36e)–(7.36h) are real and even functions. We
note that
$$o^{(xx)}(\chi' - \chi) = E(XX') = E(X'X) = o^{(xx)}(\chi - \chi')$$
which becomes, substituting χ ′′ = χ ′ − χ ,
and, of course, both o ( xx ) and o ( yy ) must be real because they are, according to (7.36e) and
(7.36f), the expectation values of real products,
Since o ( xx ) and o ( yy ) are real and even, their Fourier transforms p( xx ) and p( yy ) in Eqs. (7.36g)
and (7.36h) must also, according to entry 1 in Table 2.1 of Chapter 2, be real and even:
p( xx ) (−σ ) = p( xx ) (σ ) , (7.38d)
p( yy ) (−σ ) = p( yy ) (σ ) , (7.38e)
and
Im[p( xx ) ( χ ′′)] = Im[p( yy ) ( χ ′′)] = 0 . (7.38f)
We note in passing that this line of argument most definitely cannot be applied to o ( xy ) and
p( xy ) , because, as shown in Appendix 7B, the cross-power spectrum p( xy ) can have both real and
imaginary parts.
The probability density distributions in Eqs. (7.2h) and (7.2i) require $\theta_x$ and $\theta_y$ to be
normally distributed. Consequently, the definitions of X, X′, Y, Y′ in Eqs. (7.36a)–(7.36d)
show that X, X′, Y, Y′ are also normally distributed. Variables Y and Y′ obey zero-mean
normal distributions because $\theta_y$ is a zero-mean random function; and X and X′ also obey zero-
mean normal distributions because, according to the discussion following Eq. (7.36d), the effect
of subtracting φ from $\theta_x$ is to make X and X′ zero-mean random quantities. Hence, X, X′,
Y, Y′ have the same properties as the jointly normal random variables $n_1$, $n_2$, $n_3$, $n_4$ described
Note that jointly normal random variables may or may not be correlated and thus may or may not
be independent random quantities. Considered in pairs, the random quantities X, X′ and Y, Y′
obey the formulas describing pairs of jointly normal random variables, for example, Eqs. (3.35c)
and (3.41b) in Chapter 3. When they are examined in isolation, they obey formulas describing
single normal variables, for example, Eq. (3.41c) in Chapter 3. Equations (7.36a)–(7.36d) also
require the spread in the probable values of X, X′ about zero to be the same as the spread in the
probable values of $\theta_x$ at χ or χ′ about φ; and of course Y, Y′ have the same spread about zero as
$\theta_y$ at χ or χ′ because they are the same random variables. Consequently the standard deviations of
X, X′ are the same as the $\gamma_x$ standard deviation of $\theta_x$ at χ or χ′ and the standard deviations of
Y, Y′ are the same as the $\gamma_y$ standard deviation of $\theta_y$ at χ or χ′ [$\gamma_x$ and $\gamma_y$ are introduced in
the discussion following Eq. (7.2g) above]. We see that
$$E(X^2) = E(X'^2) = \gamma_x^2 \tag{7.39b}$$
and
$$E(Y^2) = E(Y'^2) = \gamma_y^2. \tag{7.39c}$$
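Having laid the required mathematical foundation, we begin the derivation of the desired
formula for the $p_{nn}^{(\theta 2)}$ power spectrum in terms of the power spectra of $\theta_x(\chi)$ and $\theta_y(\chi)$. The first
step is to evaluate [see Eqs. (7.36a)–(7.36d) above]
$$E\big( \theta_x(\chi)^2\, \theta_x(\chi')^2 \big) = E\big( (X + \phi)^2 (X' + \phi)^2 \big), \tag{7.40a}$$
$$E\big( \theta_x(\chi)^2\, \theta_y(\chi')^2 \big) = E\big( (X + \phi)^2\, Y'^2 \big), \tag{7.40b}$$
and
$$E\big( \theta_y(\chi)^2\, \theta_y(\chi')^2 \big) = E(Y^2 Y'^2). \tag{7.40c}$$
Expanding (7.40a) and using the linearity of the expectation operator (see Sec. 3.10 in Chapter 3), we have
$$E\big( \theta_x(\chi)^2\, \theta_x(\chi')^2 \big) = E\big( (X + \phi)^2 (X' + \phi)^2 \big) = E\big( X^2 X'^2 + 2\phi X^2 X' + \phi^2 X^2 + 2\phi X X'^2 + 4\phi^2 X X' + 2\phi^3 X + \phi^2 X'^2 + 2\phi^3 X' + \phi^4 \big)$$
$$= E(X^2 X'^2) + 2\phi E(X^2 X') + \phi^2 E(X^2) + 2\phi E(X X'^2) + 4\phi^2 E(X X') + 2\phi^3 E(X) + \phi^2 E(X'^2) + 2\phi^3 E(X') + \phi^4, \tag{7.41a}$$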
where, in the last step, Eq. (3.9f) in Chapter 3 is used to get $E(\phi^4) = \phi^4$. We examine the
discussion following Eq. (3.34d) in Chapter 3 and apply Eq. (3.35c) to get
$$E(X X'^2) = E(X^2 X') = 0. \tag{7.41b}$$
We also have
$$E(X) = E(X') = 0 \tag{7.41c}$$
because X and X′ are zero-mean random variables. Equation (7.41a) can now be written as
$$E\big( \theta_x(\chi)^2\, \theta_x(\chi')^2 \big) = E(X^2 X'^2) + \phi^2 E(X^2) + 4\phi^2 E(X X') + \phi^2 E(X'^2) + \phi^4 = E(X^2 X'^2) + 4\phi^2 E(X X') + 2\phi^2 \gamma_x^2 + \phi^4, \tag{7.41d}$$
where in the last step Eq. (7.39b) is used to replace $E(X^2)$ and $E(X'^2)$ by $\gamma_x^2$. Examining the
discussion following Eq. (3.40c) in Chapter 3, we note that Eq. (3.41b) shows us that
$$E(X^2 X'^2) = E(X^2)\, E(X'^2) + 2[E(X X')]^2$$
or, again using (7.39b),
$$E(X^2 X'^2) = \gamma_x^4 + 2[E(X X')]^2. \tag{7.41e}$$
Substituting (7.41e) and (7.36e) into (7.41d) gives
$$E\big( \theta_x(\chi)^2\, \theta_x(\chi')^2 \big) = \gamma_x^4 + 2\, o^{(xx)}(\chi' - \chi)^2 + 4\phi^2\, o^{(xx)}(\chi' - \chi) + 2\phi^2 \gamma_x^2 + \phi^4$$
or
$$E\big( \theta_x(\chi)^2\, \theta_x(\chi')^2 \big) = 2\, o^{(xx)}(\chi' - \chi)^2 + 4\phi^2\, o^{(xx)}(\chi' - \chi) + (\gamma_x^2 + \phi^2)^2. \tag{7.41f}$$
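The fourth-moment rule for jointly normal variables used in Eq. (7.41e), together with the vanishing third moments of Eq. (7.41b), can be verified exactly with Gauss-Hermite quadrature instead of random sampling. The sketch assumes a standard deviation `gamma_x` and a correlation coefficient `rho` chosen arbitrarily for illustration, and builds the correlated pair from two independent standard normal variables; `numpy.polynomial.hermite_e.hermegauss` supplies nodes and weights for the weight function exp(−x²/2).

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Quadrature rule for expectations over two independent standard normal variables.
nodes, weights = hermegauss(12)
weights = weights / np.sqrt(2.0 * np.pi)       # normalize so the weights sum to 1
g1, g2 = np.meshgrid(nodes, nodes, indexing="ij")
w2 = np.outer(weights, weights)

# Hypothetical statistics: common standard deviation and correlation coefficient.
gamma_x, rho = 0.8, 0.6
X = gamma_x * g1                               # zero-mean normal variable
Xp = gamma_x * (rho * g1 + np.sqrt(1.0 - rho**2) * g2)   # correlated partner X'

def E(f):
    """Expectation value by tensor-product Gauss-Hermite quadrature."""
    return np.sum(w2 * f)

lhs = E(X**2 * Xp**2)                                # E(X^2 X'^2)
rhs = E(X**2) * E(Xp**2) + 2.0 * E(X * Xp) ** 2      # right-hand side of the rule
third = E(X * Xp**2)                                 # odd moment, expected to vanish
```

Since the integrands are low-order polynomials, the quadrature is exact up to rounding, and both sides equal $\gamma_x^4 (1 + 2\rho^2)$ for this choice of variables.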
Having finished with (7.40a), we turn our attention to (7.40b). Again using Eqs. (7.36a) and
(7.36c) and the linearity of the expectation operator (see Sec. 3.10 in Chapter 3), we have
$$E\big( \theta_x(\chi)^2\, \theta_y(\chi')^2 \big) = E\big( (X + \phi)^2\, Y'^2 \big) = E\big( X^2 Y'^2 + 2\phi X Y'^2 + \phi^2 Y'^2 \big) = E(X^2 Y'^2) + 2\phi E(X Y'^2) + \phi^2 E(Y'^2). \tag{7.42a}$$
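Again Eqs. (3.35c) and (3.41b) in Chapter 3 can be applied to the jointly normal random
quantities X, Y′ to get
$$E(X Y'^2) = 0 \tag{7.42b}$$
and
$$E(X^2 Y'^2) = E(X^2)\, E(Y'^2) + 2[E(X Y')]^2. \tag{7.42c}$$
Using Eqs. (7.39b), (7.39c), and (7.37a), this becomes
$$E(X^2 Y'^2) = \gamma_x^2 \gamma_y^2 + 2\, o^{(xy)}(\chi' - \chi)^2. \tag{7.42d}$$
Equation (7.42a) can now be written as
$$E\big( \theta_x(\chi)^2\, \theta_y(\chi')^2 \big) = \gamma_x^2 \gamma_y^2 + 2\, o^{(xy)}(\chi' - \chi)^2 + \phi^2 \gamma_y^2$$
or
$$E\big( \theta_x(\chi)^2\, \theta_y(\chi')^2 \big) = \gamma_y^2 (\gamma_x^2 + \phi^2) + 2\, o^{(xy)}(\chi' - \chi)^2. \tag{7.42e}$$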
Equation (7.40c) is the easiest to evaluate. This time applying Eq. (3.41b) in Chapter 3 to the
jointly normal random quantities Y and Y′, we can write
$$E(Y^2 Y'^2) = E(Y^2)\, E(Y'^2) + 2[E(Y Y')]^2. \tag{7.43a}$$
Equation (7.40c) then gives
$$E\big( \theta_y(\chi)^2\, \theta_y(\chi')^2 \big) = E(Y^2)\, E(Y'^2) + 2[E(Y Y')]^2,$$
which becomes, using (7.39c) and (7.36f),
$$E\big( \theta_y(\chi)^2\, \theta_y(\chi')^2 \big) = \gamma_y^4 + 2\, o^{(yy)}(\chi' - \chi)^2. \tag{7.43b}$$
Now that (7.40a)–(7.40c) have been evaluated, the next step is to use them to find a formula
for $o_{nn}^{(\theta 2)}$ in terms of $o^{(xx)}$, $o^{(yy)}$, and $o^{(xy)}$. Substituting Eq. (7.8b) into (7.30a) gives
$$o_{nn}^{(\theta 2)}(\chi' - \chi) = E\Big( \big( \theta_{rms}^2 - \theta(\chi)^2 \big) \big( \theta_{rms}^2 - \theta(\chi')^2 \big) \Big). \tag{7.44a}$$
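$$o_{nn}^{(\theta 2)}(\chi' - \chi) = E\big( \theta(\chi)^2\, \theta(\chi')^2 - \theta_{rms}^2\, \theta(\chi)^2 - \theta_{rms}^2\, \theta(\chi')^2 + \theta_{rms}^4 \big).$$
The linearity of the expectation operator [see Sec. 3.10 in Chapter 3 and also Eq. (3.9f)] lets this
be written as
$$o_{nn}^{(\theta 2)}(\chi' - \chi) = E\big( \theta(\chi)^2\, \theta(\chi')^2 \big) - \theta_{rms}^2\, E\big( \theta(\chi)^2 \big) - \theta_{rms}^2\, E\big( \theta(\chi')^2 \big) + \theta_{rms}^4,$$
which becomes, using Eq. (7.3c),
$$o_{nn}^{(\theta 2)}(\chi' - \chi) = E\big( \theta(\chi)^2\, \theta(\chi')^2 \big) - \theta_{rms}^4. \tag{7.44b}$$
Writing $\theta^2$ as $\theta_x^2 + \theta_y^2$ gives
$$o_{nn}^{(\theta 2)}(\chi' - \chi) = E\Big( \big[ \theta_x(\chi)^2 + \theta_y(\chi)^2 \big] \big[ \theta_x(\chi')^2 + \theta_y(\chi')^2 \big] \Big) - \theta_{rms}^4,$$
$$o_{nn}^{(\theta 2)}(\chi' - \chi) = E\big( \theta_x(\chi)^2\, \theta_x(\chi')^2 \big) + E\big( \theta_x(\chi)^2\, \theta_y(\chi')^2 \big) + E\big( \theta_x(\chi')^2\, \theta_y(\chi)^2 \big) + E\big( \theta_y(\chi)^2\, \theta_y(\chi')^2 \big) - \theta_{rms}^4. \tag{7.44c}$$
Substituting from Eqs. (7.41f), (7.42e), and (7.43b) gives
$$o_{nn}^{(\theta 2)}(\chi' - \chi) = 2\, o^{(xx)}(\chi' - \chi)^2 + 4\phi^2\, o^{(xx)}(\chi' - \chi) + (\gamma_x^2 + \phi^2)^2 + \gamma_y^2 (\gamma_x^2 + \phi^2) + 2\, o^{(xy)}(\chi' - \chi)^2 + \gamma_y^2 (\gamma_x^2 + \phi^2) + 2\, o^{(xy)}(\chi - \chi')^2 + \gamma_y^4 + 2\, o^{(yy)}(\chi' - \chi)^2 - \theta_{rms}^4$$
$$= 2\, o^{(xx)}(\chi' - \chi)^2 + 4\phi^2\, o^{(xx)}(\chi' - \chi) + (\gamma_x^2 + \phi^2 + \gamma_y^2)^2 + 2\, o^{(xy)}(\chi' - \chi)^2 + 2\, o^{(xy)}(\chi - \chi')^2 + 2\, o^{(yy)}(\chi' - \chi)^2 - \theta_{rms}^4.$$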
Connection Between $p_{nn}^{(\theta 2)}$ and the Power Spectra of $\theta_x$, $\theta_y$ · 7.12
$$\begin{aligned}
o_{nn}^{(\theta 2)}(\chi'-\chi) ={}& 2\,o^{(xx)}(\chi'-\chi)^2 + 2\,o^{(yy)}(\chi'-\chi)^2 + 4\phi^2\,o^{(xx)}(\chi'-\chi) \\
& + 2\,o^{(xy)}(\chi'-\chi)^2 + 2\,o^{(xy)}(\chi-\chi')^2 .
\end{aligned} \tag{7.44d}$$
This is what we want, a formula for $o_{nn}^{(\theta 2)}$ in terms of $o^{(xx)}$, $o^{(yy)}$, and $o^{(xy)}$.
The final step is to apply the Fourier transform to Eq. (7.44d). We define $\chi'' = \chi' - \chi$ and write
$$\begin{aligned}
o_{nn}^{(\theta 2)}(\chi'') ={}& 2\,o^{(xx)}(\chi'')^2 + 2\,o^{(yy)}(\chi'')^2 + 4\phi^2\,o^{(xx)}(\chi'') \\
& + 2\,o^{(xy)}(\chi'')^2 + 2\,o^{(xy)}(-\chi'')^2 .
\end{aligned}$$
Dropping the primes and taking the Fourier transform of both sides gives, using the linearity of
the Fourier transform described in Sec. 2.6 of Chapter 2,
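$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\big(o_{nn}^{(\theta 2)}(\chi)\big) ={}& 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)^2\big) + 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(yy)}(\chi)^2\big) \\
& + 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) \\
& + 4\phi^2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)\big) .
\end{aligned} \tag{7.45a}$$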
The Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] lets us write
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)^2\big) = \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)\big) * \mathcal{F}^{(-i\sigma\chi')}\big(o^{(xx)}(\chi')\big)$$
and
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(yy)}(\chi)^2\big) = \mathcal{F}^{(-i\sigma\chi)}\big(o^{(yy)}(\chi)\big) * \mathcal{F}^{(-i\sigma\chi')}\big(o^{(yy)}(\chi')\big) .$$
Equations (7.36g) and (7.36h) then give
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)^2\big) = p^{(xx)}(\sigma) * p^{(xx)}(\sigma) , \tag{7.45b}$$
and
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(yy)}(\chi)^2\big) = p^{(yy)}(\sigma) * p^{(yy)}(\sigma) . \tag{7.45c}$$
Equation (7.36g) needs to be substituted directly into our formula, so we drop the primes and rewrite it as
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xx)}(\chi)\big) = p^{(xx)}(\sigma) . \tag{7.45d}$$
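$$\begin{aligned}
\mathcal{F}^{(-i\sigma\chi)}\big(o_{nn}^{(\theta 2)}(\chi)\big) ={}& 2\,[p^{(xx)}(\sigma) * p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) * p^{(yy)}(\sigma)] \\
& + 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + 2\,\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) \\
& + 4\phi^2\,p^{(xx)}(\sigma) ,
\end{aligned}$$
$$\begin{aligned}
p_{nn}^{(\theta 2)}(\sigma) ={}& 2\,[p^{(xx)}(\sigma) * p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) * p^{(yy)}(\sigma)] + 4\phi^2\,p^{(xx)}(\sigma) \\
& + 2\,\Big\{ \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) \Big\} .
\end{aligned} \tag{7.45e}$$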
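The term inside the braces { }, which is the last term on the right-hand side of (7.45e), can be simplified if we write the Fourier transforms as integrals. Defining $\chi''' = -\chi$ lets us write
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) = \int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\,e^{-2\pi i\sigma\chi}\,d\chi + \int_{-\infty}^{\infty} o^{(xy)}(\chi''')^2\,e^{2\pi i\sigma\chi'''}\,d\chi''' .$$
Glancing back at the definition of $o^{(xy)}$ in Eq. (7.37a), we note that $o^{(xy)}$ is real, which makes the second integral,
$$\int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\,e^{2\pi i\sigma\chi}\,d\chi ,$$
the complex conjugate of the first,
$$\int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\,e^{-2\pi i\sigma\chi}\,d\chi .$$
Hence
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) = 2\,{\rm Re}\int_{-\infty}^{\infty} o^{(xy)}(\chi)^2\,e^{-2\pi i\sigma\chi}\,d\chi = 2\,{\rm Re}\Big[\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big)\Big] .$$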
$$\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)^2\big) + \mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(-\chi)^2\big) = 2\,{\rm Re}\Big[\mathcal{F}^{(-i\sigma\chi)}\big(o^{(xy)}(\chi)\big) * \mathcal{F}^{(-i\sigma\chi')}\big(o^{(xy)}(\chi')\big)\Big] ,$$
$$\begin{aligned}
p_{nn}^{(\theta 2)}(\sigma) ={}& 2\,[p^{(xx)}(\sigma) * p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) * p^{(yy)}(\sigma)] \\
& + 4\phi^2\,p^{(xx)}(\sigma) + 4\,{\rm Re}[p^{(xy)}(\sigma) * p^{(xy)}(\sigma)] .
\end{aligned} \tag{7.45g}$$
This, surprisingly enough, is the result we need in order to learn something about the likely shape of the $p_{nn}^{(\theta 2)}$ noise-power spectrum.
7.13 The Shape of the $p_{nn}^{(\theta 2)}$ Power Spectrum
If we return to the ideal case where θx and θy are taken to be independent random variables, then
Eq. (3.11b) in Chapter 3 can be used to write
$$E(\theta_x(\chi)\,\theta_y(\chi')) = E(\theta_x(\chi)) \cdot E(\theta_y(\chi')) = 0$$
because $\theta_y(\chi')$ is, according to Eq. (7.2g), a zero-mean random variable:
$$E(\theta_y(\chi')) = 0 .$$
Similarly, according to Eqs. (7.36a) and (7.36d), we have, using the linearity of the expectation operator E described in Sec. 3.10 of Chapter 3,
$$E(XY') = E\big((\theta_x(\chi) - \phi) \cdot \theta_y(\chi')\big) = E(\theta_x(\chi)) \cdot E(\theta_y(\chi')) - \phi\,E(\theta_y(\chi')) = 0$$
$$p_{nn}^{(\theta 2)}(\sigma) = 2\,[p^{(xx)}(\sigma) * p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) * p^{(yy)}(\sigma)] + 4\phi^2\,p^{(xx)}(\sigma) . \tag{7.46}$$
We can recognize two extreme cases for the right-hand side of formula (7.46): one where $\phi^2$ is relatively large compared to $p^{(xx)} * p^{(xx)}$ and $p^{(yy)} * p^{(yy)}$, and one where $\phi^2$ is relatively small compared to $p^{(xx)} * p^{(xx)}$ and $p^{(yy)} * p^{(yy)}$. When the squared bias angle $\phi^2$ is relatively large,
$$p_{nn}^{(\theta 2)}(\sigma) \cong 4\phi^2\,p^{(xx)}(\sigma) , \tag{7.47a}$$
and when $\phi^2$ is relatively small,
$$p_{nn}^{(\theta 2)}(\sigma) \cong 2\,[p^{(xx)}(\sigma) * p^{(xx)}(\sigma)] + 2\,[p^{(yy)}(\sigma) * p^{(yy)}(\sigma)] . \tag{7.47b}$$
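$$p^{(xx)}(\sigma) * p^{(xx)}(\sigma) = \int_{-\infty}^{\infty} p^{(xx)}(\sigma')\,p^{(xx)}(\sigma-\sigma')\,d\sigma' \approx \big[p_{\rm typ}^{(xx)}\big]^2\,\sigma_{\rm spread}^{(xx)} \tag{7.47c}$$
and
$$p^{(yy)}(\sigma) * p^{(yy)}(\sigma) = \int_{-\infty}^{\infty} p^{(yy)}(\sigma')\,p^{(yy)}(\sigma-\sigma')\,d\sigma' \approx \big[p_{\rm typ}^{(yy)}\big]^2\,\sigma_{\rm spread}^{(yy)} . \tag{7.47d}$$
From Eqs. (7.36i) and (7.36j), we know that
$$o^{(xx)}(0) = \int_{-\infty}^{\infty} p^{(xx)}(\sigma)\,d\sigma \quad\text{and}\quad o^{(yy)}(0) = \int_{-\infty}^{\infty} p^{(yy)}(\sigma)\,d\sigma .$$
From the definitions of $o^{(xx)}(0)$ and $o^{(yy)}(0)$ in Eqs. (7.36e) and (7.36f), we know that
$$E(X^2) \approx p_{\rm typ}^{(xx)}\,\sigma_{\rm spread}^{(xx)} \quad\text{and}\quad E(Y^2) \approx p_{\rm typ}^{(yy)}\,\sigma_{\rm spread}^{(yy)} ,$$
so that
$$\gamma_x^2 \approx p_{\rm typ}^{(xx)}\,\sigma_{\rm spread}^{(xx)} \tag{7.47g}$$
and
$$\gamma_y^2 \approx p_{\rm typ}^{(yy)}\,\sigma_{\rm spread}^{(yy)} . \tag{7.47h}$$
This means that the approximations in (7.47c) and (7.47d) can be written as
$$p^{(xx)}(\sigma) * p^{(xx)}(\sigma) \approx p_{\rm typ}^{(xx)}\,\gamma_x^2 \tag{7.47i}$$
and
$$p^{(yy)}(\sigma) * p^{(yy)}(\sigma) \approx p_{\rm typ}^{(yy)}\,\gamma_y^2 . \tag{7.47j}$$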
Hence, when in Eq. (7.46) we say that the product $\phi^2 p^{(xx)}$ is large or small compared to $p^{(xx)} * p^{(xx)}$ or $p^{(yy)} * p^{(yy)}$, it is the same as saying that
$$\phi^2\,p_{\rm typ}^{(xx)}$$
$$p_{nn}^{(\theta 2)}(\sigma) = \alpha\,e^{-\sigma^2/(2s^2)} . \tag{7.48}$$
FIGURE 7.2(a). [Plot of f(σ) (dashed) and f(σ) * f(σ) (solid) against σ, for σ between −3 and 3.]
FIGURE 7.2(b). [Plot of f(σ) (dashed) and f(σ) * f(σ) (solid) against σ, for σ between −6 and 6.]
The dashed lines in Figs. 7.2(a) and 7.2(b) represent function f(σ) and the solid lines represent the convolution of function f(σ) with itself; that is, they represent f(σ) * f(σ). Figure 7.2(a) shows what happens to a smooth f(σ) localized near the origin and Fig. 7.2(b) shows what happens to f(σ) when it consists of multiple, isolated peaks.
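The self-convolution of a smooth function localized near the origin can be sketched numerically. The example below assumes a Gaussian test function of spread `s` (a hypothetical choice, not the function plotted in the figure) and compares the discrete convolution with the analytic result, a broader Gaussian of spread $s\sqrt{2}$.

```python
import numpy as np

s = 1.0                                    # hypothetical spread of the test function
sigma = np.linspace(-10.0, 10.0, 2001)     # symmetric grid with an odd number of points
dsigma = sigma[1] - sigma[0]
f = np.exp(-sigma**2 / (2.0 * s**2))

# Discrete approximation of (f*f)(sigma) = integral of f(sigma') f(sigma - sigma') d sigma'
f_conv_f = np.convolve(f, f, mode="same") * dsigma

# Analytic self-convolution of this Gaussian: s*sqrt(pi)*exp(-sigma^2/(4 s^2))
analytic = s * np.sqrt(np.pi) * np.exp(-sigma**2 / (4.0 * s**2))
max_err = np.max(np.abs(f_conv_f - analytic))
```

The result stays smooth and centered on the origin; only its width and height change.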
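FIGURE 7.2(c). [Plot of a spectrum of height $p_0^{(\theta 2)}$ or $p_0^{(xx)}$ against σ (in cm⁻¹), with axis marks at $-(\sigma_C+\sigma_M)$, $-\sigma_C$, $\sigma_C$, and $\sigma_C+\sigma_M$.]
This plot shows the shape with respect to σ of functions $p_{nn}^{(\theta 2)}(\sigma)$ and $p^{(xx)}(\sigma)$ for the quasi-harmonic formulas in Eqs. (7.49a) and (7.49b).
simple quasi-harmonic shape depicted in Fig. 7.2(c), just to see what the misalignment NEdN looks like when, unlike Eq. (7.48), the largest $p_{nn}^{(\theta 2)}$ values are far away from the σ = 0 origin.
The quasi-harmonic power spectral shape in Fig. 7.2(c) can be specified by
$$p^{(xx)}(\sigma) = p_0^{(xx)} \cdot \left[ \Pi\!\left(\sigma - \left(\sigma_C + \frac{\sigma_M}{2}\right), \frac{\sigma_M}{2}\right) + \Pi\!\left(\sigma + \left(\sigma_C + \frac{\sigma_M}{2}\right), \frac{\sigma_M}{2}\right) \right] \tag{7.49a}$$
$$p_{nn}^{(\theta 2)}(\sigma) = p_0^{(\theta 2)} \cdot \left[ \Pi\!\left(\sigma - \left(\sigma_C + \frac{\sigma_M}{2}\right), \frac{\sigma_M}{2}\right) + \Pi\!\left(\sigma + \left(\sigma_C + \frac{\sigma_M}{2}\right), \frac{\sigma_M}{2}\right) \right] \tag{7.49b}$$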
when referring to the noise-power spectrum of $n^{(\theta 2)}(\chi)$ from Eq. (7.8b). In both formulas $p_0^{(xx)}$, $p_0^{(\theta 2)}$, $\sigma_C$, and $\sigma_M$ are positive real parameters; and both power spectra have the same shape (only their maximum values $p_0^{(xx)}$ and $p_0^{(\theta 2)}$ are different). The function Π has the same formula as in Eq. (7.12a) above:
$$\Pi(\sigma_a, \sigma_b) = \begin{cases} 1 & \text{for } |\sigma_a| \le \sigma_b \\ 0 & \text{for } |\sigma_a| > \sigma_b \end{cases} .$$
7.14 The Size of the $p_{nn}^{(\theta 2)}$ Power Spectrum
Having chosen either (7.48) or (7.49b) to specify the shape of the $p_{nn}^{(\theta 2)}$ power spectrum, we turn to Eq. (7.31b) to connect the amplitude of $p_{nn}^{(\theta 2)}$ to its spread in wavenumbers. Substitution of Eq. (7.3d) into (7.31b) gives
$$E(\theta(\chi)^4) - (\phi^2 + \gamma_x^2 + \gamma_y^2)^2 = \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\,d\sigma , \tag{7.50a}$$
where $\phi$, $\gamma_x$, $\gamma_y$ are respectively the bias angle, the standard deviation of the $\theta_x$ component of the random misalignment angle, and the standard deviation of the $\theta_y$ component of the random misalignment angle. All three quantities are, of course, measured in radians. At the beginning of the previous section, we specified $\theta_x$ and $\theta_y$ to be independent random quantities; and the derivation of Eq. (7.45g) in Sec. 7.12 assumes that both $\theta_x$ and $\theta_y$ are normally distributed random variables [obeying probability density distributions of the type shown in Eqs. (7.2h) and (7.2i) above]. Equation (7.4d) thus requires that
$$E(\theta(\chi)^4) = 3\gamma_x^4 + 6\phi^2\gamma_x^2 + \phi^4 + 3\gamma_y^4 + 2(\phi^2 + \gamma_x^2)\gamma_y^2 ,$$
$$\int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\,d\sigma = -(\phi^2 + \gamma_x^2 + \gamma_y^2)^2 + 3(\gamma_x^4 + \gamma_y^4) + 6\phi^2\gamma_x^2 + \phi^4 + 2\gamma_y^2(\phi^2 + \gamma_x^2)$$
or
$$\int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma)\,d\sigma = 2(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) . \tag{7.50b}$$
Equation (7.50b) is the formula we need to connect the size of the proposed $p_{nn}^{(\theta 2)}$ spectral shape to its spread in wavenumbers.
When the Gaussian shape in formula (7.48) is chosen, we note that parameter α specifies the size of the spectrum and parameter s determines the spectral spread in wavenumbers. Substituting (7.48) into (7.50b) gives
$$\alpha \int_{-\infty}^{\infty} e^{-\sigma^2/(2s^2)}\,d\sigma = 2(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) . \tag{7.51a}$$
$$\int_{-\infty}^{\infty} e^{-\sigma^2/(2s^2)}\,d\sigma = s\sqrt{2\pi} . \tag{7.51b}$$
$$\alpha\,s\sqrt{2\pi} = 2(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) ,$$
or
$$\alpha = \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) . \tag{7.51c}$$
This is the expected connection between the size of the noise-power spectrum and its spread in wavenumbers. Glancing back at the discussion following Eq. (7.47j), we recall that the Gaussian spectral shape in (7.48) stems from an assumption that the bias angle $\phi$ is small compared to $\gamma_x$ and $\gamma_y$. Hence (7.51c) can be approximated as
$$\alpha \cong \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\gamma_x^4 + \gamma_y^4) . \tag{7.51d}$$
Formula (7.51c) can be substituted back into (7.48) to get
$$p_{nn}^{(\theta 2)}(\sigma) = \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2)\,e^{-\sigma^2/(2s^2)} . \tag{7.51e}$$
Using that $\phi$ is small compared to $\gamma_x$ and $\gamma_y$, we substitute (7.51d) into (7.48), or just neglect $\phi$ in (7.51e), to get
$$p_{nn}^{(\theta 2)}(\sigma) \cong \frac{1}{s}\sqrt{\frac{2}{\pi}}\,(\gamma_x^4 + \gamma_y^4)\,e^{-\sigma^2/(2s^2)} . \tag{7.51f}$$
The quasi-harmonic shape for $p_{nn}^{(\theta 2)}$ specified in Eq. (7.49b) stems from the assumption that the bias angle $\phi$ is large compared to $\gamma_x$ and $\gamma_y$. Here the spread in wavenumbers is specified by parameter $\sigma_M$ and the size of the power spectrum is determined by $p_0^{(\theta 2)}$. Substituting (7.49b) into (7.50b) gives
$$2\sigma_M\,p_0^{(\theta 2)} = 2(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2)$$
or
$$p_0^{(\theta 2)} = \frac{1}{\sigma_M}\,(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2) . \tag{7.52a}$$
This is the connection between size $p_0^{(\theta 2)}$ and wavenumber spread $\sigma_M$ for the spectral shape specified in (7.49b). Because now we are assuming that $\phi$ is large compared to $\gamma_x$, the formula can be approximated as
$$p_0^{(\theta 2)} \cong \frac{2\phi^2\gamma_x^2}{\sigma_M} . \tag{7.52b}$$
$$p_{nn}^{(\theta 2)}(\sigma) = \frac{\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2}{\sigma_M} \cdot \left[ \Pi\!\left(\sigma - \left(\sigma_C + \frac{\sigma_M}{2}\right), \frac{\sigma_M}{2}\right) + \Pi\!\left(\sigma + \left(\sigma_C + \frac{\sigma_M}{2}\right), \frac{\sigma_M}{2}\right) \right] \tag{7.52c}$$
$$p_{nn}^{(\theta 2)}(\sigma) \cong \frac{2\phi^2\gamma_x^2}{\sigma_M} \cdot \left[ \Pi\!\left(\sigma - \left(\sigma_C + \frac{\sigma_M}{2}\right), \frac{\sigma_M}{2}\right) + \Pi\!\left(\sigma + \left(\sigma_C + \frac{\sigma_M}{2}\right), \frac{\sigma_M}{2}\right) \right] \tag{7.52d}$$
for the quasi-harmonic $p_{nn}^{(\theta 2)}$ noise-power spectrum.
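As a numerical cross-check of the normalization in Eq. (7.52a), the sketch below builds the quasi-harmonic spectrum from two Π functions and integrates it over wavenumber. The helper names `box` and `p_nn_quasi_harmonic` and the parameter values are illustrative assumptions; the integral should reproduce the right-hand side of Eq. (7.50b).

```python
import numpy as np

def box(sigma_a, sigma_b):
    """Pi(sigma_a, sigma_b): 1 where |sigma_a| <= sigma_b, otherwise 0."""
    return np.where(np.abs(sigma_a) <= sigma_b, 1.0, 0.0)

def p_nn_quasi_harmonic(sigma, phi, gamma_x, gamma_y, sigma_c, sigma_m):
    """Quasi-harmonic spectrum with its height fixed by Eq. (7.52a)."""
    p0 = (gamma_x**4 + gamma_y**4 + 2.0 * phi**2 * gamma_x**2) / sigma_m
    centre = sigma_c + sigma_m / 2.0
    return p0 * (box(sigma - centre, sigma_m / 2.0) + box(sigma + centre, sigma_m / 2.0))

# Illustrative values: angles in radians, wavenumbers in cm^-1
phi, gamma_x, gamma_y = 1e-5, 1e-6, 1e-6
sigma_c, sigma_m = 100.0, 20.0

sigma = np.linspace(-200.0, 200.0, 400_001)
dsigma = sigma[1] - sigma[0]
spectrum = p_nn_quasi_harmonic(sigma, phi, gamma_x, gamma_y, sigma_c, sigma_m)

area = np.sum(spectrum) * dsigma                                   # integral over sigma
target = 2.0 * (gamma_x**4 + gamma_y**4 + 2.0 * phi**2 * gamma_x**2)  # Eq. (7.50b)
```

Each Π function contributes an area of $\sigma_M\,p_0^{(\theta 2)}$, which is why the two together give $2\sigma_M\,p_0^{(\theta 2)}$.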
and the optical-path difference between the evenly spaced samples of the interferogram signal is
$$\Delta\chi = \frac{2D}{N} = 3.125 \times 10^{-4}\ {\rm cm} . \tag{7.53b}$$
$$L^{({\rm dir})}(\sigma) = L_{\rm FOV}^{({\rm fore})}(\sigma) = L_{\rm FOV}^{({\rm back})}(\sigma) = L_{\rm mnf}^{({\rm fore})}(\sigma) = L_{\rm mnf}^{({\rm back})}(\sigma) = 0 . \tag{7.53c}$$
We might as well give the responsivity R and the optical parameters $\tau_a$, $\tau_f$, $\eta$ their ideal values
$$R(\sigma) = 1\ \frac{{\rm amp} \cdot {\rm sec}}{{\rm erg}} \tag{7.53d}$$
and
$$\tau_a(\sigma) = \tau_f(\sigma) = \eta(\sigma) = 1 , \tag{7.53e}$$
because in formula (7.34b) they just end up rescaling the spectral noise to turn it into NEdNtilt.
The beam passing through the interferometer has a circular cross section of radius R = 3 cm, so according to Eq. (7.2b)
$$a = 2\pi^2 R^2 = 177.65\ {\rm cm}^2 \tag{7.53f}$$
Simulated Misalignment Noise · 7.15
$$W = 1 . \tag{7.53i}$$
The detector electronics in Fig. 6.2 of Chapter 6 are given a three-pole, low-pass Butterworth filter. Figure 7.3 plots
Re H(uσ), Im H(uσ), and |H(uσ)|
of this filter against wavenumber σ. The OPD velocity u is taken to be 5 cm/sec and the filter cutoff frequency is 8000 Hz. This means the magnitude |H(uσ)| of the transfer function does not fall off by much inside the 650 cm⁻¹ to 1150 cm⁻¹ band of wavenumbers measured by the interferometer. The simulated instrument is calibrated using Planck black-body radiances of 77 K (the temperature of liquid nitrogen) and 350 K.
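To see how little the filter magnitude drops across the measured band, the sketch below evaluates the standard normalized three-pole low-pass Butterworth response at the electrical frequencies f = uσ. The function name `butterworth3` is hypothetical, and the sign and phase conventions of the transfer function H used in the text may differ from this textbook form.

```python
import numpy as np

def butterworth3(f_hz, f_cutoff_hz):
    """Normalized three-pole low-pass Butterworth response 1/((s+1)(s^2+s+1))."""
    s = 1j * np.asarray(f_hz, dtype=float) / f_cutoff_hz
    return 1.0 / ((s + 1.0) * (s**2 + s + 1.0))

u = 5.0                                  # OPD velocity in cm/sec
f_cutoff = 8000.0                        # filter cutoff frequency in Hz
sigma_band = np.array([650.0, 1150.0])   # band edges in cm^-1

H = butterworth3(u * sigma_band, f_cutoff)
magnitude = np.abs(H)

# Closed form for the Butterworth magnitude, used as a consistency check
closed_form = 1.0 / np.sqrt(1.0 + (u * sigma_band / f_cutoff)**6)
```

Under these assumptions the magnitude stays above roughly 0.93 at both band edges.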
To characterize the noise in these simulated black-body measurements, we have already decided to use the Gaussian noise-power spectrum in Eq. (7.48), which means [see discussion following Eq. (7.46) and continuing on to Eq. (7.48)] that the bias angle must be negligible. To keep things simple, we make the bias angle zero,
$$\phi = 0 . \tag{7.54a}$$
$$s = 200\ {\rm cm}^{-1} \tag{7.54b}$$
and
$$\alpha = 3.989 \times 10^{-23}\ {\rm cm} \cdot {\rm rad}^4 , \tag{7.54c}$$
which gives us, by combining Eqs. (7.48) and (7.34b), all the information needed to calculate the NEdNtilt contaminating the black-body spectrum. Figure 7.4(a) plots the Gaussian $p_{nn}^{(\theta 2)}$ noise-power spectrum in (7.48) for the s and α values in (7.54b) and (7.54c).
Now that s and α are specified, and the bias angle is set to zero in (7.54a), Eqs. (7.51c) or (7.51d) show that
$$\gamma_x^4 + \gamma_y^4 = 10^{-20}\ {\rm rad}^4 . \tag{7.55a}$$
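The value in Eq. (7.55a) follows from the Gaussian normalization: integrating $\alpha\,e^{-\sigma^2/(2s^2)}$ over all wavenumbers gives $\alpha s\sqrt{2\pi}$, which Eq. (7.50b) sets equal to $2(\gamma_x^4 + \gamma_y^4 + 2\phi^2\gamma_x^2)$. A minimal sketch of that arithmetic, with variable names chosen here for illustration:

```python
import math

alpha = 3.989e-23   # cm rad^4, Eq. (7.54c)
s = 200.0           # cm^-1, Eq. (7.54b)
phi = 0.0           # rad, Eq. (7.54a)

# gamma_x^4 + gamma_y^4 + 2 phi^2 gamma_x^2 = alpha * s * sqrt(2 pi) / 2
gamma4_sum = alpha * s * math.sqrt(2.0 * math.pi) / 2.0
```

Because `phi` is zero, `gamma4_sum` is just $\gamma_x^4 + \gamma_y^4$.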
This does not specify uniquely the amount of misalignment error contributed by the $\theta_x$ and $\theta_y$ components of the misalignment angle. We could, for example, treat $\theta_x$ and $\theta_y$ on an equal
To keep the arithmetic simple, we choose another approach, assuming that $\theta_y$ is always zero so that
$$\gamma_y = 0 . \tag{7.55d}$$
Figures 7.4(b) and 7.4(c) plot a simulation of $n^{(\theta 2)}$ misalignment noise [defined in Eq. (7.8b) above] for an interferometer disturbed by a Gaussian noise-power spectrum governed by the parameter choices shown in (7.54a)–(7.54c), (7.55d), and (7.55e). Figure 7.4(b) covers a small range of OPD values to show what this sort of misalignment noise looks like in detail, and Fig. 7.4(c) covers the entire range of OPD values between +1.28 cm and −1.28 cm.
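One common way to synthesize a random sequence with a prescribed power spectrum is to shape white noise in the Fourier domain. The sketch below does this for the Gaussian spectrum of Eq. (7.48). It is not the simulation procedure used for Figs. 7.4(b) and 7.4(c), which is not described here: it assumes the noise itself is a Gaussian process, and it assumes 8192 samples at the spacing of Eq. (7.53b).

```python
import numpy as np

rng = np.random.default_rng(1)

alpha, s = 3.989e-23, 200.0   # rad^4 cm and cm^-1, Eqs. (7.54c) and (7.54b)
dchi = 3.125e-4               # cm, sample spacing from Eq. (7.53b)
n_samples = 8192              # assumed sample count, 2*(1.28 cm)/dchi

sigma = np.fft.fftfreq(n_samples, d=dchi)        # wavenumber grid in cm^-1
p_nn = alpha * np.exp(-sigma**2 / (2.0 * s**2))  # Gaussian spectral shape

# Shape unit-variance white noise so its power spectrum follows p_nn;
# the 1/dchi factor makes the variance equal the integral of p_nn.
white = rng.standard_normal(n_samples)
noise = np.fft.ifft(np.fft.fft(white) * np.sqrt(p_nn / dchi)).real

expected_std = np.sqrt(alpha * s * np.sqrt(2.0 * np.pi))  # sqrt of integral of p_nn
```

The sample standard deviation of `noise` should be close to `expected_std`, about 1.4 × 10⁻¹⁰ rad² for these parameter values.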
Figures 7.5(a) and 7.5(b) show what happens when the Gaussian misalignment noise just
described above contaminates measurements of a 320-K Planck black-body spectrum performed
by the interferometer system specified at the beginning of this section [see Eqs. (7.53a)–(7.53i)
and the paragraph immediately following Eq. (7.53i)]. The solid line in Fig. 7.5(a) is the true
spectral radiance entering the instrument. This black-body curve is smooth enough that, when
calculating NEdNtilt, we do not have to worry about the different shapes of the radiance functions
$L$, $L_{\rm FOV}$, and $L_{\rm mnf}$ specified¹⁰⁷ in Secs. 5.18 and 5.23 of Chapter 5. [A similar point was made earlier in Sec. 7.6 about the L(1) and L(2) calibration radiances; see Eqs. (7.19a) and (7.19b).] Figure 7.5(a) also contains ten independent, noise-contaminated measurements shown by dotted curves, several of which are too close to the solid curve to be easily seen. This gives some idea of how the misalignment noise causes the 320-K radiance curve generated by the interferometer to jump around from measurement to measurement while retaining the general shape of a true black-body spectrum. The solid curve in Fig. 7.5(b) is the NEdNtilt calculated from formula (7.34b) above. It is clearly consistent with the spread of the dotted curves in Fig. 7.5(a). We have analyzed 3600 independent, noise-contaminated spectral measurements of this 320-K radiance curve, calculating the standard deviation of the error as a function of wavenumber σ between 650 cm⁻¹ and 1150 cm⁻¹. The crosses in Fig. 7.5(b) plot these standard deviations; there is a close match between them and the predicted NEdNtilt curve, showing that the simulated interferometer measurements obey the expected spectral statistics.
107
The modified radiances $L_{\rm FOV}$ and $L_{\rm mnf}$ are defined in Eqs. (5.83e) and (5.108d) respectively.
FIGURE 7.3. [Plot of Re H(uσ), Im H(uσ), and |H(uσ)| against σ (in cm⁻¹) from 0 to 2000 cm⁻¹.]
The solid curve is the magnitude of the transfer function H(uσ) plotted against σ. The dashed and dotted curves are its real and imaginary parts respectively.
FIGURE 7.4(a). [Plot of $p_{nn}^{(\theta 2)}(\sigma)$ (in rad⁴ cm) against σ (in cm⁻¹) from −800 to 800 cm⁻¹.]
This is a plot of the Gaussian noise-power spectrum in Eq. (7.48) with α = 3.989 × 10⁻²³ rad⁴ cm and s = 200 cm⁻¹.
Figure 7.6 shows the Lorentz emission line measured in the second simulated interferometer
measurement. We use the same interferometer system as in the black-body measurement, with
two connected changes: the fore optics transmission is taken to be
τ f ( σ ) = 0.9 (7.56a)
FIGURE 7.4(b). [Plot of $n^{(\theta 2)}(\chi)$ (in rad²) against χ (in cm) for χ between −0.1 cm and 0.1 cm.]
background radiance to the optical signal. (The changes are made to show the effect of
background radiance on a Lorentz-line measurement contaminated by misalignment noise.) The
$L_{\rm mnf}^{({\rm fore})}$ background radiance is taken to be a gray-body Planck curve [described in the discussion
following Eq. (5.3k) in Chapter 5] with a constant emissivity of 0.1; and, since all the other
interferometer optics are taken to be ideal, we can still set the other background radiances to zero.
Because we are now dealing with a Lorentz emission line instead of a smooth Planck curve, it is
no longer safe to assume automatically that the input spectrum is so smooth that the
interferometer’s finite field of view and finite interferogram length have no significant effect on
the measured spectrum.
Equation (5.83e) in Chapter 5 reminds us that the finite field of view rescales the wavenumber
axis by a factor of
$$\left(1 + \frac{\Delta\Omega}{4\pi}\right) .$$
This becomes, using the ΔΩ = 1.086 × 10⁻⁴ ster value from Eq. (7.53h),
$$\left(1 + \frac{\Delta\Omega}{4\pi}\right) = 1 + 8.642 \times 10^{-6} . \tag{7.56b}$$
FIGURE 7.4(c). [Plot of $n^{(\theta 2)}(\chi)$ (in rad²) against χ (in cm) for χ between −1.28 cm and 1.28 cm.]
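The size of this rescaling is easy to reproduce. The sketch below evaluates the fractional change in the wavenumber axis and, as an illustration only, the shift it would produce at a hypothetical wavenumber of 1000 cm⁻¹.

```python
import math

delta_omega = 1.086e-4                      # ster, value quoted from Eq. (7.53h)
fractional_change = delta_omega / (4.0 * math.pi)

sigma_example = 1000.0                      # cm^-1, hypothetical wavenumber for illustration
shift = sigma_example * fractional_change   # resulting shift in cm^-1
```

For this example the shift is under 0.01 cm⁻¹, far below the 0.391 cm⁻¹ unapodized spectral resolution of the simulated instrument.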
FIGURE 7.5(a). [Plot of radiance (in mW/m²/sr/cm⁻¹) against σ (in cm⁻¹) from 600 to 1200 cm⁻¹, showing the input spectrum and ten measured spectra.]
FIGURE 7.5(b). [Plot of radiance error (in mW/m²/sr/cm⁻¹) against σ (in cm⁻¹) from 600 to 1200 cm⁻¹, showing the predicted NEdN curve and the estimated NEdN values.]
This is also much too small to matter on the scale of Fig. 7.6. All that is now left to check is the
effect of the finite interferogram length. The value of the unapodized spectral resolution is
0.391 cm -1 in Eq. (7.53a) above. Glancing back at the discussion following Eq. (5.67) in Chapter
5, we note that the unapodized spectral resolution determines the scale of the spectral blurring
caused by the interferometer’s finite interferogram length. The Lorentz line in Fig. 7.6 looks wide
enough not to have its width significantly affected by the blurring effects of an unapodized
spectral resolution of 0.391 cm-1 . So there is still no need to worry about the slightly different
shapes of the radiance functions L, LFOV, and Lmnf when discussing the radiance spectrum
entering—or measured by—the interferometer.
The analysis in Sec. 7.13 shows that the bias angle $\phi$ must be large compared to $\gamma_x$ and $\gamma_y$ for the noise-power spectrum $p_{nn}^{(\theta 2)}$ to have the quasi-harmonic shape specified in Eq. (7.52d) above. To satisfy this requirement for the noise contaminating the Lorentz emission line, we set $\phi = 10^{-5}$ rad and $\gamma_x = 10^{-6}$ rad. Taking $\gamma_y$ to be approximately the same size as $\gamma_x$, we again use Eq. (7.3d) to get
$$\theta_{\rm rms} \cong 10^{-5}\ {\rm rad} \tag{7.57a}$$
just like in Eq. (7.55f) above for the black-body measurements. Choosing
$$\sigma_C = 100\ {\rm cm}^{-1} \tag{7.57b}$$
and
$$\sigma_M = 20\ {\rm cm}^{-1} , \tag{7.57c}$$
in Eq. (7.49b). To get the desired quasi-harmonic $p_{nn}^{(\theta 2)}$ spectrum, we just apply these $\sigma_C$, $\sigma_M$, and $p_0^{(\theta 2)}$ parameters to the graph in Fig. 7.2(c). Figures 7.7(a) and 7.7(b) contain an example of $n^{(\theta 2)}$ misalignment noise [as defined in Eq. (7.8b)] obeying this quasi-harmonic spectrum. The x and y components are independent, zero-mean, and normally distributed random quantities. Figure 7.7(a) plots $n^{(\theta 2)}$ over a small set of OPD values to show what this quasi-harmonic misalignment noise looks like in detail, and Fig. 7.7(b) plots $n^{(\theta 2)}$ over the entire range of OPD values between +1.28 cm and −1.28 cm.
Figures 7.8(a) and 7.8(b) show what happens when the quasi-harmonic noise described above contaminates the measurement of the Lorentz emission line in Fig. 7.6. The split solid curves in Fig. 7.8(a) depict the rising and trailing edges of the Lorentz emission line using a stretched y axis, which puts the top of the emission line off the top of the graph. The continuous solid line is the NEdNtilt curve predicted by formulas (7.34b) and (7.35a), and the dotted lines are ten measurements of the Lorentz emission contaminated by the quasi-harmonic misalignment noise. The NEdNtilt curve correctly predicts the presence and location of the “ghost-line” noise peaks in the dotted curves, and it also confirms the way the overall level of the noise-contaminated measurements rises and falls with respect to the true spectral level far away from the ghost lines. The ghost-line noise is predicted by the first term on the right-hand side of Eq. (7.35a). This term is basically a convolution of the quasi-harmonic $p_{nn}^{(\theta 2)}$ power spectrum with the Lorentz line shape contained in the square of the $[\sigma^2 Z_{\rm mnf}(\sigma)]$ function.
FIGURE 7.6. [Plot of the input radiance (in mW/m²/sr/cm⁻¹) against σ (in cm⁻¹) from 800 to 1100 cm⁻¹.]
We note that the ghost-line regions lie on either side of the Lorentz emission line, offset from the
line center by
$$\sigma_C + \frac{\sigma_M}{2} = 110\ {\rm cm}^{-1} , \tag{7.58}$$
as we would expect from the convolution. The overall rise and fall of the noise-contaminated
measurements with respect to the true spectral level comes from both the first and second terms
on the right-hand side of (7.35a) and can be traced to the interferometer’s nonzero background
radiance. This is what happens when misalignment noise interacts with a smooth Planck-like
spectrum, just like in Figs. 7.5(a) and 7.5(b). It is important to realize that large background
radiances can produce large amounts of background noise even at those wavenumbers where the
spectrum being measured is relatively small. We also see that misalignment noise, unlike the
detector noise discussed in Chapter 6, need not look very “fuzzy” and noiselike; it can easily be
mistaken for part of the spectral signal. Figure 7.8(b) has the same basic format as Fig. 7.5(b).
Again, we generate 3600 noise-contaminated measurements and calculate the standard deviations
of the spectral error as a function of wavenumber σ. Just as before, the crosses marking the values
of these standard deviations are a good match to the solid line giving the predicted NEdNtilt
values.
FIGURE 7.7(a). [Plot of $n^{(\theta 2)}(\chi)$ (in rad²) against χ (in cm) for χ between −0.1 cm and 0.1 cm.]
FIGURE 7.7(b). [Plot of $n^{(\theta 2)}(\chi)$ (in rad²) against χ (in cm) for χ between −1.28 cm and 1.28 cm.]
FIGURE 7.8(a). [Plot of radiance error (in mW/m²/sr/cm⁻¹) against σ (in cm⁻¹) from 800 to 1100 cm⁻¹, showing the noise-free spectrum, the NEdNtilt curve, and ten noise-contaminated measurements.]
FIGURE 7.8(b). [Plot of radiance (in mW/m²/ster/cm⁻¹) against σ (in cm⁻¹) from 800 to 1100 cm⁻¹, showing the predicted NEdN curve and the estimated NEdN values.]
Appendix 7A
We want to calculate the second and fourth moments of the normal probability density
distribution
$$p_\varsigma(\varsigma) = \frac{1}{\gamma\sqrt{2\pi}}\,e^{-(\varsigma-\phi)^2/(2\gamma^2)} . \tag{7A.1}$$
Here, pς (ς )dς is the probability that the continuous random variable ς takes on a value between
ς and ς + d ς . The mean value of ς is φ , and its standard deviation is γ .
We know, for a > 0 , that108
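$$\int_0^{\infty} x^2 e^{-ax^2}\,dx = \frac{1}{4a}\sqrt{\frac{\pi}{a}} .$$
Since $x^2 e^{-ax^2}$ is an even function of x, this can be written as [according to Eq. (2.19) in Chapter 2]
$$\int_{-\infty}^{\infty} x^2 e^{-ax^2}\,dx = \frac{1}{2a}\sqrt{\frac{\pi}{a}} . \tag{7A.2a}$$
$$\int_{-\infty}^{\infty} x^4 e^{-ax^2}\,dx = \frac{3}{4}\,a^{-5/2}\sqrt{\pi} . \tag{7A.2b}$$
To get the second moment of $\varsigma$ when it obeys the $p_\varsigma(\varsigma)$ probability density distribution in (7A.1), we must calculate
$$\int_{-\infty}^{\infty} \varsigma^2\,p_\varsigma(\varsigma)\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} \varsigma^2\,e^{-(\varsigma-\phi)^2/(2\gamma^2)}\,d\varsigma .$$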
108
Lennart Rade and Bertil Westergren, Beta β Mathematics Handbook, 2nd ed. (CRC Press, Inc., Boca Raton, FL,
1990), formula (42), p. 164.
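$$\int_{-\infty}^{\infty} \varsigma^2\,p_\varsigma(\varsigma)\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} (t+\phi)^2\,e^{-t^2/(2\gamma^2)}\,dt$$
or
$$\int_{-\infty}^{\infty} \varsigma^2\,p_\varsigma(\varsigma)\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} t^2 e^{-t^2/(2\gamma^2)}\,dt + \frac{\phi\sqrt{2}}{\gamma\sqrt{\pi}} \int_{-\infty}^{\infty} t\,e^{-t^2/(2\gamma^2)}\,dt + \frac{\phi^2}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^2/(2\gamma^2)}\,dt . \tag{7A.3a}$$
$$\frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} t^2 e^{-t^2/(2\gamma^2)}\,dt = \gamma^2 . \tag{7A.3b}$$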
According to Eq. (2.17) in Chapter 2, the second term on the right-hand side must be zero (because $[t \exp(-t^2/(2\gamma^2))]$ is an odd function of t), and we see that the third term must be
$$\frac{\phi^2}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^2/(2\gamma^2)}\,dt = \phi^2 \tag{7A.3c}$$
because
$$\frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^2/(2\gamma^2)}\,dt = 1 \tag{7A.3d}$$
is just the integral of the zero-mean normal probability density over all its allowed values [see Eq. (7A.1)]. Substituting (7A.3b) and (7A.3c) into (7A.3a) gives
$$\int_{-\infty}^{\infty} \varsigma^2\,p_\varsigma(\varsigma)\,d\varsigma = \gamma^2 + \phi^2 . \tag{7A.3e}$$
To get the fourth moment of $\varsigma$ when it obeys the $p_\varsigma(\varsigma)$ probability density distribution in (7A.1), we evaluate
$$\int_{-\infty}^{\infty} \varsigma^4\,p_\varsigma(\varsigma)\,d\varsigma ,$$
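$$\int_{-\infty}^{\infty} \varsigma^4\,p_\varsigma(\varsigma)\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} \varsigma^4\,e^{-(\varsigma-\phi)^2/(2\gamma^2)}\,d\varsigma = \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} (t+\phi)^4\,e^{-t^2/(2\gamma^2)}\,dt$$
or
$$\begin{aligned}
\int_{-\infty}^{\infty} \varsigma^4\,p_\varsigma(\varsigma)\,d\varsigma ={}& \frac{1}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} t^4 e^{-t^2/(2\gamma^2)}\,dt + \frac{2\phi\sqrt{2}}{\gamma\sqrt{\pi}} \int_{-\infty}^{\infty} t^3 e^{-t^2/(2\gamma^2)}\,dt + \frac{3\phi^2\sqrt{2}}{\gamma\sqrt{\pi}} \int_{-\infty}^{\infty} t^2 e^{-t^2/(2\gamma^2)}\,dt \\
& + \frac{2\phi^3\sqrt{2}}{\gamma\sqrt{\pi}} \int_{-\infty}^{\infty} t\,e^{-t^2/(2\gamma^2)}\,dt + \frac{\phi^4}{\gamma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-t^2/(2\gamma^2)}\,dt .
\end{aligned} \tag{7A.4a}$$
The second and fourth terms on the right-hand side of (7A.4a) are zero because $[t^3 \exp(-t^2/(2\gamma^2))]$ and $[t \exp(-t^2/(2\gamma^2))]$ are odd functions of t. Applying Eqs. (7A.2b), (7A.3b), and (7A.3d),
$$\begin{aligned}
\int_{-\infty}^{\infty} \varsigma^4\,p_\varsigma(\varsigma)\,d\varsigma &= \frac{1}{\gamma\sqrt{2\pi}} \cdot \frac{3}{4} \cdot (2\gamma^2)^{5/2}\sqrt{\pi} + 6\phi^2\gamma^2 + \phi^4 \\
&= 3\gamma^4 + 6\phi^2\gamma^2 + \phi^4 .
\end{aligned} \tag{7A.4b}$$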
In Sec. 7.2 above, random variable θx obeys a normal probability density distribution that has
a mean of φ and a standard deviation of γ x . According to Eqs. (7A.3e) and (7A.4b), we can
therefore write that
$$E(\theta_x^2) = \gamma_x^2 + \phi^2 \tag{7A.5a}$$
and
$$E(\theta_x^4) = 3\gamma_x^4 + 6\phi^2\gamma_x^2 + \phi^4 . \tag{7A.5b}$$
Random variable $\theta_y$ obeys a probability density distribution with a mean of zero and a standard deviation of $\gamma_y$. This means that, setting $\phi = 0$ in Eqs. (7A.5a) and (7A.5b), we know
$$E(\theta_y^2) = \gamma_y^2 \tag{7A.5c}$$
and
$$E(\theta_y^4) = 3\gamma_y^4 . \tag{7A.5d}$$
Equations (7A.5a)–(7A.5d) are the results we need for the derivation of the mirror-tilt NEdN.
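The second and fourth moments derived in this appendix can be confirmed by direct numerical integration of the normal density. The sketch below uses hypothetical values for the mean `phi` and standard deviation `gamma`, and a simple Riemann sum on a fine grid.

```python
import numpy as np

phi, gamma = 0.7, 0.3   # hypothetical mean and standard deviation

# Grid spanning 12 standard deviations on either side of the mean
z = np.linspace(phi - 12.0 * gamma, phi + 12.0 * gamma, 200_001)
dz = z[1] - z[0]
pdf = np.exp(-(z - phi)**2 / (2.0 * gamma**2)) / (gamma * np.sqrt(2.0 * np.pi))

second_moment = np.sum(z**2 * pdf) * dz
fourth_moment = np.sum(z**4 * pdf) * dz

# Closed-form results from Eqs. (7A.3e) and (7A.4b)
second_expected = gamma**2 + phi**2
fourth_expected = 3.0 * gamma**4 + 6.0 * phi**2 * gamma**2 + phi**4
```

Setting `phi = 0.0` in the same script gives the zero-mean results of Eqs. (7A.5c) and (7A.5d).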
Appendix 7B
Although the o ( xx ) , o ( yy ) autocorrelation functions and the p( xx ) , p( yy ) noise-power spectra
introduced in Sec. 7.12 above follow the expected pattern, being both real and even like every
autocorrelation function and power spectrum of a wide-sense stationary random function,109 the
cross-correlation function o ( xy ) and cross-power spectrum p( xy ) introduced in Eqs. (7.37a) and
(7.37b) exhibit a more complicated symmetry. In particular, we should be careful to note that the
p( xy ) cross-power spectrum can have a nonzero imaginary component.
Equation (7.37a) defines the cross-correlation function of X and Y to be, using the notation
of Sec. 7.12,
$$E(XY') = o^{(xy)}(\chi'-\chi) . \tag{7B.1a}$$
$$E\big([\theta_x(\chi) - \phi]\,\theta_y(\chi')\big) = o^{(xy)}(\chi'-\chi) . \tag{7B.1b}$$
Using the linearity of E with respect to random variables (see Sec. 3.10 of Chapter 3), we note that
$$E\big([\theta_x(\chi) - \phi]\,\theta_y(\chi')\big) = E(\theta_x(\chi)\,\theta_y(\chi')) - \phi\,E(\theta_y(\chi')) .$$
Since $E(\theta_y(\chi')) = 0$, this reduces to
$$E\big([\theta_x(\chi) - \phi]\,\theta_y(\chi')\big) = E(\theta_x(\chi)\,\theta_y(\chi')) ,$$
which means that Eq. (7B.1b) can be written as
$$E(\theta_x(\chi)\,\theta_y(\chi')) = o^{(xy)}(\chi'-\chi) . \tag{7B.1c}$$
This shows that $o^{(xy)}$ does not depend on the bias tilt angle $\phi$. Interchanging the positions of $\chi$ and $\chi'$ in Eqs. (7B.1a) and (7B.1c) gives
$$E(X'Y) = o^{(xy)}(\chi-\chi') \tag{7B.1d}$$
109
See Sec. 3.20 of Chapter 3 as well as, in Sec. 3.15, the discussion following Eq. (3.30b).
and
$$E(\theta_x(\chi')\,\theta_y(\chi)) = o^{(xy)}(\chi-\chi') . \tag{7B.1e}$$
We note that since
$$E(XY') = E(Y'X)$$
automatically holds true, it follows, interchanging the roles of the x, y labels and the $\chi$, $\chi'$ variables in Eq. (7B.1a), that
$$o^{(xy)}(\chi'-\chi) = o^{(yx)}(\chi-\chi') ,$$
$$o^{(xy)}(\chi'') = o^{(yx)}(-\chi'')$$
$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty} o^{(xy)}(\chi'')\,e^{-2\pi i\sigma\chi''}\,d\chi'' . \tag{7B.2a}$$
$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty} o^{(xy)}(-\chi''')\,e^{2\pi i\sigma\chi'''}\,d\chi''' .$$
$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty} o^{(yx)}(\chi''')\,e^{2\pi i\sigma\chi'''}\,d\chi''' . \tag{7B.2b}$$
We can now interchange the roles of the x and y labels in (7B.2a) to get
$$p^{(yx)}(\sigma) = \int_{-\infty}^{\infty} o^{(yx)}(\chi'')\,e^{-2\pi i\sigma\chi''}\,d\chi'' \tag{7B.2c}$$
$$p^{(xy)}(\sigma) = p^{(yx)}(-\sigma) . \tag{7B.2d}$$
Equation (7B.2d) matches the relationship between the cross-correlation functions in Eq. (7B.1f).
According to Eq. (7B.1a), the cross-correlation o ( xy ) is the expectation value of the product
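of two real numbers, so it must be real. We can then write, substituting $e^{-2\pi i\sigma\chi''} = \cos(2\pi\sigma\chi'') - i\sin(2\pi\sigma\chi'')$ into (7B.2a),
$$p^{(xy)}(\sigma) = \int_{-\infty}^{\infty} o^{(xy)}(\chi'')\cos(2\pi\sigma\chi'')\,d\chi'' - i\int_{-\infty}^{\infty} o^{(xy)}(\chi'')\sin(2\pi\sigma\chi'')\,d\chi''$$
so that
$${\rm Re}[p^{(xy)}(\sigma)] = \int_{-\infty}^{\infty} o^{(xy)}(\chi'')\cos(2\pi\sigma\chi'')\,d\chi'' \tag{7B.3a}$$
and
$${\rm Im}[p^{(xy)}(\sigma)] = -\int_{-\infty}^{\infty} o^{(xy)}(\chi'')\sin(2\pi\sigma\chi'')\,d\chi'' . \tag{7B.3b}$$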
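The cosine and sine integrals in Eqs. (7B.3a) and (7B.3b) can be evaluated numerically for simple test functions. The sketch below compares a hypothetical even cross-correlation with the same shape shifted to a nonzero lag; the Gaussian shape, the width `w`, the lag, and the wavenumber are all illustrative assumptions, and the minus sign on the sine integral follows the $e^{-2\pi i\sigma\chi}$ transform convention.

```python
import numpy as np

def cross_spectrum_parts(o_xy, chi, sigma):
    """Return (Re, Im) of the cross-power spectrum at one wavenumber."""
    dchi = chi[1] - chi[0]
    re = np.sum(o_xy * np.cos(2.0 * np.pi * sigma * chi)) * dchi
    im = -np.sum(o_xy * np.sin(2.0 * np.pi * sigma * chi)) * dchi
    return re, im

chi = np.linspace(-1.0, 1.0, 20_001)   # OPD lag grid in cm, symmetric about zero
w, lag, sigma = 0.02, 0.05, 3.0        # hypothetical width (cm), peak lag (cm), wavenumber (cm^-1)

even_o = np.exp(-chi**2 / (2.0 * w**2))              # even in the lag
shifted_o = np.exp(-(chi - lag)**2 / (2.0 * w**2))   # peaked away from zero lag

re_even, im_even = cross_spectrum_parts(even_o, chi, sigma)
re_shift, im_shift = cross_spectrum_parts(shifted_o, chi, sigma)
```

The imaginary part vanishes for the even test function but not for the shifted one, even though both functions are real.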
The remark following Eq. (2.15b) in Chapter 2 points out that the product of any even function
with the sine is an odd function, which means, according to Eq. (2.17) in Chapter 2, that its
integral from í to must be zero. Thus, if o ( xy ) is an even function, Eq. (7B.3b) is the integral
of an odd function between í and and must be zero, showing that the cross-power spectrum
p( xy ) must be real because Im [p( xy ) ] 0 . The obvious
next obvious next tostep
next step, seeiswhether
to investigate whether
the cross-power
( xy )
spectrum
o ( 33) must
( xy )
mustbebe real, is tofunction
an even investigate
of whether
33 . o ( 33) must be an even function of 33 .
Again we say, just as in the discussion following Eq. (7B.1e), that $\chi'' = \chi' - \chi$ so that [see
Eqs. (7B.1e) and (7B.1c)]
$$E\left(\theta_x(\chi')\,\theta_y(\chi)\right) = o^{(xy)}(-\chi'')$$ (7B.4a)
and
$$E\left(\theta_x(\chi)\,\theta_y(\chi')\right) = o^{(xy)}(\chi'').$$ (7B.4b)
Equation (7.9a) shows that t = χ / u for u > 0 , so when χ ′ > χ the θy ( χ ′) random value in Eq.
(7B.4b) occurs at a later time than the θx ( χ ) random value. Suppose we assume that the θy
random quantity always resembles the θ after a time delay T has elapsed because any
x
θx ( χ ) = ψ ( χ ) (7B.4c)
and
θy ( χ ) = ψ ( χ − uT ) (7B.4d)
χ ′′ = χ ′ − χ = uT so that χ ′ = uT + χ
$$o^{(xy)}(uT) = E\left(\theta_x(\chi)\,\theta_y(\chi + uT)\right).$$
$$o^{(xy)}(uT) = E\left(\psi(\chi)\,\psi(uT + \chi - uT)\right) = E\left(\psi(\chi)^2\right).$$ (7B.4e)
This is the variance of ψ, which could easily be a rather large quantity if there are large
disturbances in the x and y components of the misalignment angle. According to Eq. (7B.4a), on
the other hand,
$$o^{(xy)}(-uT) = E\left(\theta_x(\chi')\,\theta_y(\chi)\right) = E\left(\theta_x(uT + \chi)\,\theta_y(\chi)\right) = E\left(\psi(uT + \chi)\,\psi(\chi - uT)\right).$$ (7B.4f)
This shows that $o^{(xy)}(-uT)$ could easily be quite small when random function ψ is only poorly
correlated with itself at different values of its argument. The x and y components of the
misalignment angle could, for example, be subject to large random disturbances that first perturb
the $\theta_x$ value and then, after a time delay T, perturb the $\theta_y$ value. This would make the value of
$o^{(xy)}(uT)$ in Eq. (7B.4e) rather large. The disturbances could also, however, be rather short in
duration, so that the perturbation of an angle component at one time has little resemblance to the
perturbation of that same component at another time. This would make the value of $o^{(xy)}(-uT)$
in Eq. (7B.4f) rather small. We can conclude, then, that there is no reason for $o^{(xy)}$ to be an even
function of its argument. Hence there is no reason to expect the sine integral in (7B.3b), or for
that matter the cosine integral in (7B.3a), to be zero, which means the cross-power spectrum
$p^{(xy)}$ in Eqs. (7B.2a), (7B.3a), and (7B.3b) can easily have nonzero real and imaginary
components.
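The argument above can be checked numerically. The following is only a sketch under assumed conditions that are not taken from the text: ψ is modeled as discrete white noise, the delay uT is a hypothetical 25 samples, and a circular (periodic) correlation is used for convenience.

```python
import numpy as np

# Hypothetical setup (not from the text): psi is white noise and the
# y component lags the x component by `delay` samples.
rng = np.random.default_rng(0)
n_samples, delay = 20000, 25
psi = rng.standard_normal(n_samples)
theta_x = psi                   # theta_x(chi) = psi(chi), as in Eq. (7B.4c)
theta_y = np.roll(psi, delay)   # theta_y(chi) = psi(chi - uT), as in Eq. (7B.4d)

# Circular cross-correlation: cross_corr[lag] = mean_k theta_x[k] * theta_y[k + lag]
spectrum = np.conj(np.fft.fft(theta_x)) * np.fft.fft(theta_y)
cross_corr = np.fft.ifft(spectrum).real / n_samples

corr_plus = cross_corr[delay]    # lag +uT: the mean square of psi, large
corr_minus = cross_corr[-delay]  # lag -uT: psi against itself 2*uT away, small
max_imag = np.max(np.abs(spectrum.imag))  # imaginary part of the cross-spectrum
```

In this sketch `corr_plus` and `corr_minus` differ greatly, so the cross-correlation is not even, and the discrete cross-spectrum has a clearly nonzero imaginary part.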
8
SAMPLING-ERROR NEdN IN DOUBLE-SIDED INTERFEROGRAMS
Random errors in the sampling position produce random errors in the sampled signal. As was
done in Chapter 7 when analyzing misalignment noise, we use wide-sense stationary random
functions to describe the sampling noise, tracing the effect through the calibration process to find
out what the NEdN of the measured spectrum looks like when it is dominated by this sort of
error. In a well-designed interferometer, the sampling-noise NEdN, just like the misalignment-
noise NEdN, should be a small source of error compared to the detector noise. The formulas
derived here can nevertheless be very useful when designing interferometers because they show
how accurately the interferometer signal needs to be sampled. Moreover, when interferometers
produce unusual types of random errors, the size and shape of the errors can be compared to the
predictions of these formulas, making it easier to determine whether an unexpectedly large
sampling error could be contributing to the problem.
Equations (6.5d) and (6.12a) in Chapter 6 contain formulas for zc and zC( cold ) respectively.
Substituting these into the formula for zC( tot ) gives
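$$z_C^{(tot)}(\chi) = \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|)\,e^{2\pi i\sigma\chi}\,d\sigma$$
$$\qquad + \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,[L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma .$$
$$z_C^{(tot)}(\chi) = \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)]\,e^{2\pi i\sigma\chi}\,d\sigma .$$ (8.1a)
$$Z_{FOV}(\sigma) = \left(\frac{WA\,\Delta\Omega}{4}\right) R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,[\tau_f(|\sigma|)L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|)],$$ (8.1b)
$$z_C^{(tot)}(\chi) = \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma$$ (8.1c)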
for the noise-free signal at point C in Fig. 6.2. The Fourier F operator defined in Sec. 2.5 of
Chapter 2 [see Eqs. (2.29a) and (2.29c)] lets this be written as
$$H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma) = F^{(-i\sigma\chi)}\left(z_C^{(tot)}(\chi)\right).$$ (8.1e)
Unlike the interferometer model analyzed in Chapter 7, in this chapter we assume that the
misalignment angle $\theta_{ma}$, when it is significantly different from zero, has the same constant value
during spectral measurements and their associated calibration procedures; that is, we assume
that the misalignment angle $\theta_{ma}$ does not change with time.
Sampling Noise at the A/D Converter · 8.2
Clearly the units of n ( s ) are the same as the OPD—that is, units of length (cm). Suppose the plan
is to sample zC(tot ) at N equally spaced OPD values in order to generate a double-sided
interferogram signal with χ = 0 occurring at or near the middle of the sample set. In the absence
of error, we expect the samples to occur at χ = χ j with
χ j = j ∆χ , (8.2b)
where Δχ is the OPD separation between adjacent samples and, just like in Eq. (5.103b) in
Chapter 5,
$$j = -\frac{N}{2}+1,\ -\frac{N}{2}+2,\ \ldots,\ -1,\ 0,\ 1,\ \ldots,\ \frac{N}{2}-1,\ \frac{N}{2}.$$ (8.2c)
In the absence of sampling-position noise, there is one sample taken at χ = 0 when j = 0 , and
there is one more sample taken for χ > 0 than for χ < 0 . When the sampling-position noise
n ( s ) ( χ ) is present, we know that the actual sample positions occur at
$$\chi_j + n^{(s)}(\chi_j)$$
$$z_C^{(tot)}\left(\chi_j + n^{(s)}(\chi_j)\right)$$
instead of $z_C^{(tot)}(\chi_j)$. We define the sampling noise to be the random errors in the sample values
$z_C$ due to the sampling-position noise $n^{(s)}$. We assume that
This lets us write, for the jth sample value contaminated by sampling noise,
$$z_C^{(tot)}\left(\chi_j + n^{(s)}(\chi_j)\right) \cong z_C^{(tot)}(\chi_j) + n^{(s)}(\chi_j)\cdot\left.\frac{dz_C^{(tot)}}{d\chi}\right|_{\chi=\chi_j}.$$ (8.2e)
$$z_C^{(tot)}\left(\chi + n^{(s)}(\chi)\right) \cong z_C^{(tot)}(\chi) + n^{(s)}(\chi)\cdot\frac{dz_C^{(tot)}}{d\chi}.$$ (8.2f)
Since $z_C^{(tot)}(\chi)$ is the noise-free signal and $z_C^{(tot)}\left(\chi + n^{(s)}(\chi)\right)$ is the noise-contaminated signal,
we see that the formula for the noise-contaminated signal can be approximated by
$$z_{CN}^{(tot)}(\chi) = z_C^{(tot)}(\chi) + n^{(s)}(\chi)\cdot\frac{dz_C^{(tot)}}{d\chi}.$$ (8.2g)
In this chapter the random function $z_{CN}^{(tot)}$ represents the signal contaminated by sampling noise,
with the sampling noise caused by the sampling-position noise $n^{(s)}$ at point C in Fig. 6.2 of
Chapter 6.
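The quality of the first-order approximation in Eq. (8.2g) can be illustrated with a short numerical sketch. Everything specific here is an assumption made for illustration only: the noise-free signal is a single cosine at a hypothetical wavenumber `sigma0`, the sample spacing is 1e-4 cm, and the sampling-position noise is Gaussian with a 1e-6 cm standard deviation.

```python
import numpy as np

# Hypothetical noise-free interferogram: one cosine at wavenumber sigma0 (cm^-1).
sigma0 = 1000.0

def signal(chi):
    return np.cos(2 * np.pi * sigma0 * chi)

def signal_derivative(chi):
    return -2 * np.pi * sigma0 * np.sin(2 * np.pi * sigma0 * chi)

rng = np.random.default_rng(0)
chi = np.arange(-512, 513) * 1.0e-4         # intended OPD sample positions (cm)
jitter = rng.normal(0.0, 1.0e-6, chi.size)  # assumed sampling-position noise (cm)

exact = signal(chi + jitter)                             # signal at the perturbed positions
linearized = signal(chi) + jitter * signal_derivative(chi)  # right-hand side of Eq. (8.2g)

max_err = np.max(np.abs(exact - linearized))
# Taylor remainder bound: 0.5 * max|second derivative| * max(jitter^2)
bound = 0.5 * (2 * np.pi * sigma0) ** 2 * np.max(jitter ** 2)
```

For this assumed signal the difference between the exact and linearized samples stays within the second-order Taylor bound, which is small compared with the first-order noise term as long as the position error is small compared with the shortest wavelength in the signal.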
$$E\left(n^{(s)}(\chi)\right) = 0.$$ (8.3a)
The expectation operator E is linear with respect to random quantities (see Sec. 3.10 in Chapter
3). Substituting (8.2a) into (8.3a) and applying the expectation operator E , we get
Parameter $\chi_{correct}$ is nonrandom, which means that [see Eq. (3.9f) in Chapter 3]
$$E(\chi_{correct}) = \chi_{correct}.$$
$$E(\chi_{incorrect}) = \chi_{correct}.$$
Hence, (8.3a) is just another way of saying that there is no bias in the attempt to sample the
signal; although any given attempt is randomly incorrect, on the average we get the correct OPD
Power Spectrum and Autocorrelation Function of the Sampling Noise · 8.3
value. Following the assumptions stated in the previous section, we take n ( s ) to be at least wide-
sense stationary. This means, according to Eq. (3.30b) in Chapter 3, that its autocorrelation
function $o_{nn}^{(s)}$ with respect to the OPD can be written as
$$o_{nn}^{(s)}(\chi' - \chi) = E\left(n^{(s)}(\chi)\cdot n^{(s)}(\chi')\right).$$ (8.3b)
Clearly,
$$E\left(n^{(s)}(\chi)\cdot n^{(s)}(\chi')\right) = E\left(n^{(s)}(\chi')\cdot n^{(s)}(\chi)\right),$$
which means that
$$o_{nn}^{(s)}(\chi' - \chi) = o_{nn}^{(s)}(\chi - \chi'),$$ (8.3c)
showing that the autocorrelation function for the sampling-position noise is an even function of
its argument. It is, of course, also real because $n^{(s)}$ is real:
$$\mathrm{Im}\left(o_{nn}^{(s)}(\chi)\right) = 0.$$ (8.3d)
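The Fourier transform of $o_{nn}^{(s)}$ is called the power spectrum of the $n^{(s)}$ sampling-position
noise (see Sec. 3.20 of Chapter 3),
$$p_{nn}^{(s)}(\sigma) = \int_{-\infty}^{\infty} o_{nn}^{(s)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi = F^{(-i\sigma\chi)}\left(o_{nn}^{(s)}(\chi)\right).$$ (8.4a)
$$o_{nn}^{(s)}(\chi) = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma = F^{(i\sigma\chi)}\left(p_{nn}^{(s)}(\sigma)\right).$$ (8.4b)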
Equations (8.3c) and (8.3d) show that the autocorrelation function is real and even, which means
that, according to entry 1 of Table 2.1 in Chapter 2, the power spectrum $p_{nn}^{(s)}$ must also be real
and even:
$$p_{nn}^{(s)}(-\sigma) = p_{nn}^{(s)}(\sigma)$$ (8.4c)
and
$$\mathrm{Im}\left(p_{nn}^{(s)}(\sigma)\right) = 0.$$ (8.4d)
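As a quick numerical check of Eqs. (8.4c) and (8.4d), the sketch below transforms a real, even autocorrelation function and confirms that the result is real and even. The exponential form of the autocorrelation, its correlation length, and the grids are all hypothetical choices for illustration; the integral in Eq. (8.4a) is approximated by a simple Riemann sum.

```python
import numpy as np

# Hypothetical real, even autocorrelation function of the sampling-position noise.
corr_length = 1.0
chi = np.linspace(-20.0, 20.0, 4001)   # symmetric OPD grid
d_chi = chi[1] - chi[0]
autocorr = np.exp(-np.abs(chi) / corr_length)

# Riemann-sum approximation to Eq. (8.4a) on a symmetric wavenumber grid.
sigma = np.linspace(-5.0, 5.0, 101)
kernel = np.exp(-2j * np.pi * np.outer(sigma, chi))
power = kernel @ autocorr * d_chi

largest_imaginary_part = np.max(np.abs(power.imag))
is_even = np.allclose(power.real, power.real[::-1])  # compares p(sigma) with p(-sigma)
```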
The value of $o_{nn}^{(s)}(0)$ can be used to scale the power spectrum $p_{nn}^{(s)}$. Consulting Eq. (8.4b), we
have
$$o_{nn}^{(s)}(0) = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma)\,d\sigma$$ (8.5a)
$$E\left([n^{(s)}(\chi)]^2\right) = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma)\,d\sigma.$$ (8.5b)
Formula (8.5b) shows, since its right-hand side depends only on the shape and size of the power
spectrum, that the wide-sense stationary nature of the sampling-position noise requires
$E([n^{(s)}(\chi)]^2)$ to be independent of the OPD value χ. If we know that function $S_h(\sigma)$ specifies
the shape of the power spectrum, but we do not know the size of the power spectrum, then there
exists a real constant α such that
$$p_{nn}^{(s)}(\sigma) = \alpha\,S_h(\sigma).$$ (8.5c)
$$E\left([n^{(s)}(\chi)]^2\right) = \alpha\int_{-\infty}^{\infty} S_h(\sigma)\,d\sigma$$
or
$$\alpha = \left[\int_{-\infty}^{\infty} S_h(\sigma)\,d\sigma\right]^{-1}\cdot E\left([n^{(s)}(\chi)]^2\right) = \left[\int_{-\infty}^{\infty} S_h(\sigma)\,d\sigma\right]^{-1}\cdot E\left([n^{(s)}]^2\right),$$ (8.5d)
where the last step drops the argument χ because wide-sense stationary random functions have
the same mean-square value $E([n^{(s)}]^2)$ at all values of χ. Hence we can find the value of α from
the shape function $S_h$ and the mean-squared error $E([n^{(s)}]^2)$. Knowing both α and $S_h(\sigma)$
determines the size and shape of function $p_{nn}^{(s)}$ in Eq. (8.5c), completely specifying the power
spectrum of the sampling-position noise in terms of the shape function and the mean-squared
error in the sampling position.
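The scaling step in Eq. (8.5d) is easy to carry out numerically. The following is a minimal sketch, assuming a hypothetical Gaussian shape function and a hypothetical rms sampling-position error of 1e-6 cm; the helper names `integrate` and `power_spectrum_scale` are illustrative and do not come from the text.

```python
import numpy as np

def integrate(values, grid):
    """Trapezoidal approximation to the integral of `values` over `grid`."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid)))

def power_spectrum_scale(shape, sigma, mean_square_error):
    """Return alpha from Eq. (8.5d): mean-square error divided by the shape integral."""
    return mean_square_error / integrate(shape, sigma)

# Hypothetical Gaussian shape function S_h(sigma); `width` is in cm^-1.
width = 50.0
sigma = np.linspace(-10 * width, 10 * width, 4001)
shape = np.exp(-0.5 * (sigma / width) ** 2)

mean_square_error = (1.0e-6) ** 2          # assumed E([n^(s)]^2) in cm^2
alpha = power_spectrum_scale(shape, sigma, mean_square_error)
power_spectrum = alpha * shape             # Eq. (8.5c)
recovered = integrate(power_spectrum, sigma)  # should reproduce Eq. (8.5b)
```

Integrating the scaled spectrum returns the mean-squared position error that was put in, which is exactly the consistency condition expressed by Eq. (8.5b).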
Uncalibrated Spectral Signals · 8.4
which has already been defined in Eq. (4C.1a) in Appendix 4C of Chapter 4. (This is also the
same as the Π function in Eq. (2.56c) of Chapter 2, except for its value at χ = D ; in particular,
we know from the discussion following Eq. (2.9e) that both versions of Π must have the same
Fourier transform.) We now have, from (8.2g), that
$$\Pi(\chi, D)\,z_{CN}^{(tot)}(\chi) = \Pi(\chi, D)\,z_C^{(tot)}(\chi) + \Pi(\chi, D)\,n^{(s)}(\chi)\cdot\frac{dz_C^{(tot)}}{d\chi}$$ (8.6b)
for the total double-sided interferogram signal contaminated by sampling noise at point C in Fig.
6.2. Multiplying by Π(χ, D) in this way explicitly reminds us that the double-sided
interferogram is truncated; that is, data is only recorded for OPD values lying between D and
−D. The forward Fourier transform of
$$\Pi(\chi, D)\,z_{CN}^{(tot)}(\chi)$$
is the uncalibrated spectral signal contaminated by sampling noise, and we show this by writing,
just like in Eq. (7.14c) of Chapter 7, that
$$\tilde{Z}_{eff,totN}(\sigma) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,z_{CN}^{(tot)}(\chi)\right).$$ (8.6c)
Section 2.6 in Chapter 2, where the linear nature of the Fourier operator F is explained, shows
that when the forward Fourier transform is applied to (8.6b) we get a sum of two Fourier
transforms on the right-hand side:
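$$F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,z_{CN}^{(tot)}(\chi)\right) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,z_C^{(tot)}(\chi)\right) + F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,n^{(s)}(\chi)\cdot\frac{dz_C^{(tot)}}{d\chi}\right).$$
This can also be written as, substituting from (8.6c),
$$\tilde{Z}_{eff,totN}(\sigma) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,z_C^{(tot)}(\chi)\right) + F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,n^{(s)}(\chi)\cdot\frac{dz_C^{(tot)}}{d\chi}\right).$$ (8.6d)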
Expanding the first term on the right-hand side of (8.6d) is a straightforward process. The
Fourier convolution theorem [see Eq. (2.39j) in Chapter 2] gives
$$\mathrm{sinc}(x) = \frac{\sin(x)}{x}.$$ (8.7c)
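The sinc convention in Eq. (8.7c) is the unnormalized one, which differs from the normalized `numpy.sinc` (defined as sin(πx)/(πx)). The sketch below, with hypothetical helper names and an assumed half-width D = 1, converts between the two and checks the standard result that the forward transform of a boxcar of half-width D is 2D sinc(2πσD).

```python
import numpy as np

def sinc_unnormalized(x):
    """sin(x)/x as in Eq. (8.7c); numpy.sinc(t) is sin(pi t)/(pi t), so rescale."""
    return np.sinc(np.asarray(x) / np.pi)

def boxcar_transform(sigma, half_width, n_points=20001):
    """Trapezoidal estimate of the integral of exp(-2 pi i sigma chi) over [-D, D]."""
    chi = np.linspace(-half_width, half_width, n_points)
    weights = np.full(n_points, chi[1] - chi[0])
    weights[[0, -1]] *= 0.5                      # trapezoid end-point weights
    return np.exp(-2j * np.pi * np.outer(sigma, chi)) @ weights

half_width = 1.0                                 # assumed D (arbitrary units)
sigma = np.linspace(-3.0, 3.0, 61)
numeric = boxcar_transform(sigma, half_width)
analytic = 2 * half_width * sinc_unnormalized(2 * np.pi * sigma * half_width)
```

The numerical transform and the closed form agree closely, and the width of the central lobe scales as 1/D, which is why the sinc factor is narrow compared with slowly varying functions when D is large.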
Equations (8.7b) and (8.1e) can now be substituted into (8.7a) to get
Functions H and M vary slowly with σ compared to sinc(2πσD), and the sinc function is very
narrow about σ = 0 compared to H and M. This means, according to Eq. (5C.1) in Appendix 5C
of Chapter 5, that (8.7d) can be approximated as
$$Z_{mnf}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\left[\tau_f(|\sigma|)L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)\right].$$ (8.7f)
Expanding the second term on the right-hand side of Eq. (8.6d) starts out the same way as
$$F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,n^{(s)}(\chi)\,\frac{dz_C^{(tot)}}{d\chi}\right) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,n^{(s)}(\chi)\right) * F^{(-i\sigma\chi')}\left(\frac{dz_C^{(tot)}(\chi')}{d\chi'}\right).$$ (8.8a)
$$F^{(-i\sigma\chi')}\left(\frac{dz_C^{(tot)}(\chi')}{d\chi'}\right) = 2\pi i\sigma\,F^{(-i\sigma\chi')}\left(z_C^{(tot)}(\chi')\right),$$
$$F^{(-i\sigma\chi')}\left(\frac{dz_C^{(tot)}(\chi')}{d\chi'}\right) = 2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma).$$ (8.8b)
For future use, we define, reversing the Fourier transform in (8.8b), that
$$W_s(\chi) = \frac{dz_C^{(tot)}}{d\chi} = F^{(i\sigma\chi)}\left(2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right),$$ (8.8c)
$$W_s(\chi) = 2\pi i\int_{-\infty}^{\infty} \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma.$$ (8.8d)
$$n_D^{(s)}(\sigma) = F^{(-i\sigma\chi)}\left(\Pi(\chi, D)\,n^{(s)}(\chi)\right)$$ (8.8e)
or
$$n_D^{(s)}(\sigma) = \int_{-\infty}^{\infty} \Pi(\chi, D)\,n^{(s)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi.$$ (8.8f)
[ ( , D)n ( s ) ( )] ,
The formula for $n_D^{(s)}(\sigma)$ can also be written as, consulting the prescription for Π(χ, D) in (8.6a),
$$n_D^{(s)}(\sigma) = \int_{-D}^{D} n^{(s)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi.$$ (8.8h)
To finish up the analysis of the second term, we substitute Eqs. (8.8b) and (8.8e) into (8.8a) to get
Now that the two terms on the right-hand side of Eq. (8.6d) have been expanded and analyzed,
we use their formulas in Eqs. (8.7e) and (8.8i) to write the formula for the uncalibrated spectral
signal contaminated by sampling noise:
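$$\tilde{Z}_{eff,totN}(\sigma) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) + n_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right].$$ (8.9a)
$$E\left(n_D^{(s)}(\sigma)\right) = E\left(\int_{-D}^{D} n^{(s)}(\chi)\,e^{-2\pi i\sigma\chi}\,d\chi\right) = \int_{-D}^{D} E\left(n^{(s)}(\chi)\right)e^{-2\pi i\sigma\chi}\,d\chi,$$
$$E\left(n_D^{(s)}(\sigma)\right) = 0.$$ (8.9b)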
Again using the linearity of E with respect to random quantities as explained in Sec. 3.10 of
Chapter 3, we apply the expectation operator to both sides of (8.9a) to get
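$$E\left(\tilde{Z}_{eff,totN}(\sigma)\right) \cong E\left(H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma)\right) + E\left(n_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right)$$
$$\qquad = H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) + E\left(n_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right),$$ (8.9c)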
where in the last step we apply Eq. (3.9f) of Chapter 3, noting that E (c ) = c for nonrandom
quantities c. The convolution in the second term on the right-hand side can be written as the
integral [see Eqs. (2.38b) and (2.38a) in Chapter 2]
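$$E\left(n_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right) = E\left(\int_{-\infty}^{\infty}\left[2\pi i\sigma'\,H(u\sigma')\,M(R\sigma'\theta_{ma})\,Z_{FOV}(\sigma')\right] n_D^{(s)}(\sigma - \sigma')\,d\sigma'\right).$$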
We use the linearity of E as explained in Sec. 3.10 of Chapter 3 to move E inside the integral
and then substitute from (8.9b) to get
$$E\left(\tilde{Z}_{eff,totN}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma).$$ (8.9e)
This shows that the sampling noise can always be reduced to negligible levels in the uncalibrated
spectral signal by averaging together many independent measurements of the same spectral
radiance. In this respect, the sampling noise behaves the same way as the detector noise and
mirror-misalignment noise examined in the two previous chapters [see Eq. (7.18e) in Chapter 7
and the discussion following Eq. (6.30c) in Chapter 6].
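The effect of averaging can be illustrated with a small simulation. This is a sketch only, under assumptions that are not in the text: the noise-free signal is a single cosine at a hypothetical wavenumber, the sampling-position noise is Gaussian and independent from sample to sample and from scan to scan, and the linearized model of Eq. (8.2g) is used for each scan.

```python
import numpy as np

rng = np.random.default_rng(1)
chi = np.linspace(-0.05, 0.05, 1001)      # OPD grid (cm), hypothetical
sigma0 = 500.0                            # hypothetical wavenumber (cm^-1)
z = np.cos(2 * np.pi * sigma0 * chi)      # assumed noise-free signal
dz = -2 * np.pi * sigma0 * np.sin(2 * np.pi * sigma0 * chi)

def averaged_error(n_scans, jitter_rms=1.0e-6):
    """RMS deviation from the noise-free signal after averaging n_scans scans."""
    jitter = rng.normal(0.0, jitter_rms, size=(n_scans, chi.size))
    z_noisy = z + jitter * dz             # linearized model, Eq. (8.2g)
    return float(np.sqrt(np.mean((z_noisy.mean(axis=0) - z) ** 2)))

err_single = averaged_error(1)
err_averaged = averaged_error(400)
```

With zero-mean position noise, the residual error in the averaged signal falls roughly as the inverse square root of the number of scans, consistent with the expectation value in Eq. (8.9e) being free of sampling noise.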
To describe the uncalibrated spectral signal generated from observation of $L^{(1)}$, we again use
functions $Z_{FOV}^{(1)}$ and $Z_{mnf}^{(1)}$ defined in Eqs. (7.20b) and (7.20c) of Chapter 7:
When we write down these formulas, functions $L^{(1)}$, $L_{FOV}^{(1)}$, and $L_{mnf}^{(1)}$ can be used interchangeably
as shown in Eq. (8.10a). Similarly, describing the uncalibrated spectral signal generated from
observation of $L^{(2)}$, we reuse functions $Z_{FOV}^{(2)}$ and $Z_{mnf}^{(2)}$ defined in Eqs. (7.20e) and (7.20f) of
Chapter 7,
Calibrating the Spectral Signal Contaminated by Sampling Noise · 8.5
interchangeably.
Still using the same notation as in Sec. 7.6 of Chapter 7, we call Z (1)
eff ,totN (σ ) the uncalibrated,
noise-contaminated spectral signal produced when the interferometer observes L(1) ( σ ) and
Z (2) (σ ) the uncalibrated, noise-contaminated spectral signal produced when the
eff ,totN
Z eff ,totN (σ )
on the left-hand side of Eq. (8.9a) is the uncalibrated spectral signal for any interferometer
measurement contaminated by sampling noise, we can get the formulas for Z (1) eff ,totN (σ ) and
of (8.11a) and (8.11b) to get the result of this averaging. Following the same reasoning used to go
from Eq. (8.9a) to (8.9e) above, we see that
$$E\left(\tilde{Z}_{eff,totN}^{(1)}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(1)}(\sigma)$$
and
$$E\left(\tilde{Z}_{eff,totN}^{(2)}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(2)}(\sigma).$$
Following the same procedure as in Eqs. (7.20i) and (7.20j) in Chapter 7, we again remove the
tilde and change the totN subscript to tot to define
$$Z_{eff,tot}^{(1)}(\sigma) = E\left(\tilde{Z}_{eff,totN}^{(1)}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(1)}(\sigma)$$ (8.11c)
and
$$Z_{eff,tot}^{(2)}(\sigma) = E\left(\tilde{Z}_{eff,totN}^{(2)}(\sigma)\right) \cong H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}^{(2)}(\sigma).$$ (8.11d)
The noise cannot, of course, be averaged away from the spectral measurement itself because [as
is discussed following Eq. (7.21a) in Chapter 7] in practice we cannot take the same amount of
care when collecting the spectral measurements as we do when collecting the known calibration
data. Just as is done in Sec. 7.6 of Chapter 7, we use $\tilde{Z}_{eff,totN}^{(meas)}(\sigma)$ to represent the signal for the
uncalibrated spectral measurement contaminated by noise, in this case sampling noise. When
analyzing sampling noise, $\tilde{Z}_{eff,totN}^{(meas)}(\sigma)$ is the same quantity as
$$\tilde{Z}_{eff,totN}(\sigma)$$
$$\text{Measured Radiance} = \left[L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)\right]\frac{\tilde{Z}_{eff,totN}^{(meas)}(\sigma) - Z_{eff,tot}^{(1)}(\sigma)}{Z_{eff,tot}^{(2)}(\sigma) - Z_{eff,tot}^{(1)}(\sigma)} + L^{(1)}(|\sigma|).$$ (8.12a)
L(2) ( ) L(1) ( )
(2) (1)
Z eff ,tot ( ) Z eff ,tot ( )
L(2) ( ) L(1) ( )
WA
H(u ) M( R rms )R ( ) ( ) a ( ) f ( )[L(2) ( ) L(1) ( )]
4
or
L(2) ( ) L(1) ( )
(2) (1)
Z eff ,tot ( ) Z eff ,tot ( )
(8.12c)
1
ª WA º
« H(u ) M( R rms )R ( ) ( ) a ( ) f ( )» .
¬ 4 ¼
This result is identical to (7.21c) in Chapter 7 because the sampling noise, like the
mirror-misalignment noise, can be reduced to negligible levels by averaging together many independent
measurements of the same radiance when gathering data for the calibration algorithms. In fact,
formula (8.12c) holds true whenever the noise in the data can be removed this way. To find the
value of
( meas ) ( ) Z (1) ( )
Z eff ,totN eff ,tot
(1)
which becomes, consulting Eqs. (8.7f) and (8.10d) for the formulas of Z mnf and Z mnf ,
( meas ) ( ) Z (1) ( )
Z eff ,totN eff ,tot
WA
H(u ) M( R ma )R ( ) ( ) a ( ) f ( ) ¬ª L mnf ( ) L(1) ( ) ¼º . (8.12d)
4
n (Ds ) ( ) 2 i H(u ) M( R ma ) Z FOV ( ) .
Equations (8.12c) and (8.12d) can now be put into (8.12a) to get
-967 -
8 · Sampling-Error NEdN in Double-Sided Interferograms
Measured Radiance
4{n (Ds ) ( ) 2 i H(u ) M( R ma ) Z FOV ( ) } (8.12e)
L mnf ( ) .
(WA )H(u ) M( R ma ) R ( ) ( ) a ( ) f ( )
Equation (6.55a) in Chapter 6 shows that the complex-valued transfer function H can be written
as
H(u ) H(u ) ei ( ) (8.12f)
with ( ) being the phase of the complex-valued function H(uı). According to the discussion
following Eq. (6.55a), Eq. (5A.6b) in Appendix 5A of Chapter 5 applies to the transfer function
H in (8.12f); that is, H is Hermitian:
H(u ) H(u ) . (8.12g)
( ) ( ) . (8.12i)
Measured Radiance
4e i ( ) {n (Ds ) ( ) 2 i H(u ) M( R ma ) Z FOV ( ) } (8.12j)
L mnf ( ) .
(WA ) H(u ) M( R ma )R ( ) ( ) a ( ) f ( )
The denominator of the second term on the right-hand side is real, but the numerator of this term
almost certainly has both a real and imaginary component. In this regard, Eq. (8.12j) resembles Eq.
(7.21e) in Chapter 7, which also shows the measured radiance spectrum to be the sum of
$L_{mnf}(|\sigma|)$ and a complex random term. Just like in the discussion following (7.21e), we note that
only the real part of the second term acts as a source of unavoidable noise, since we can always
discard any imaginary components of the noise-contaminated measured radiance. Once again we
can conclude that only the real part of the second term of the formula is the random spectral noise
δL for the measured radiance,
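$$\delta L = \frac{4\,\mathrm{Re}\left(e^{-i\psi(\sigma)}\left\{n_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.$$ (8.12k)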
written as even functions of wavenumber (that is, as functions of the absolute value of σ), and
Eq. (8.12h) shows that |H| is also an even function of σ. Equation (4.139g) in Chapter 4 states
that η(σ) is even, and Eq. (5.10f) in Chapter 5 reveals M to be an even function of σ. Hence the
whole denominator of (8.12k) must be an even function of wavenumber σ. To analyze the
numerator, we note that everything in the formula for $Z_{FOV}$ in Eq. (8.1b) is real, so $Z_{FOV}$ is real
and, since η(σ) and the other functions in the formula are even, function $Z_{FOV}$ is also even:
The convolution in the numerator of Eq. (8.12k) can be written as [see Eqs. (2.38a) and (2.38b) in
Chapter 2]
-969 -
8 · Sampling-Error NEdN in Double-Sided Interferograms
{ n (s)
D (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] }∗
∞
= ³ ª¬−2π iσ ′ H(uσ ′)
∗
{ }
M( Rσ ′θ ma ) Z FOV (σ ′) º¼ n (Ds ) (σ − σ ′)∗ dσ ′
−∞
This is the formula for the complex value of the convolution. The numerator in (8.12k) is
proportional to
(
Re e − iψ (σ ) {n (Ds ) (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ]} ; )
and since the real part of any complex number c can be written as 0.5(c + c* ) , we see that the real
part of the convolution is
( {
Re e − iψ (σ ) n (Ds ) (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] })
=
2
e (
1 −iψ (σ ) ( s )
{
n D (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] }
{ })
∗
+ eiψ (σ ) n (Ds ) (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] (8.13b)
=
2
e (
1 −iψ (σ ) ( s )
{
n D (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] }
{
+ eiψ (σ ) n (Ds ) (σ )∗ ∗ ª¬ −2π iσ H(uσ )∗ M( Rσθ ma ) Z FOV (σ ) º¼ . })
Equations (8.8g), (8.12g), and (8.12i) show that this can also be written as
( {
Re e − iψ (σ ) n (Ds ) (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] })
=
2
e (
1 −iψ (σ ) ( s )
{
n D (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] }
{
+ e − iψ ( −σ ) n (Ds ) (−σ ) ∗ [ 2π i (−σ ) H(−uσ ) M( Rσθ ma ) Z FOV (σ ) ] . })
- 970 -
Random Sampling Error in the Measured Spectrum · 8.6
Since, as has already been noted, M and Z FOV are even, it follows that
( {
Re e − iψ (σ ) n (Ds ) (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] })
=
2
(
1 − iψ (σ ) ( s )
e {
n D (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] } (8.13c)
{ })
+ e − iψ ( −σ ) n (Ds ) (−σ ) ∗ [ 2π i (−σ ) H(−uσ ) M(− Rσθ ma ) Z FOV (−σ ) ] .
The right-hand side of (8.13c) is clearly an even function of wavenumber; when σ is replaced by
−σ, only the order of the sum changes. Consequently the left-hand side of (8.13c) must also be an
even function of σ; hence, the numerator of (8.12k), just like the denominator of (8.12k), must be
an even function of wavenumber. Consequently, it makes sense to write the formula for the
random sampling error in (8.12k) as
$$\delta L(|\sigma|) = \frac{4\,\mathrm{Re}\left(e^{-i\psi(\sigma)}\left\{n_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.$$ (8.13d)
The absolute value signs in the argument for δL remind us that both sides of this formula are
even functions of σ.
The linearity of the expectation operator E with respect to random quantities (see Sec. 3.10 in
Chapter 3) lets us apply E to both sides of (8.13d) and take the nonrandom quantities outside the
expectation value to get
(
E δ L ( σ ) = )
( ( {
4 E Re e − iψ (σ ) n (Ds ) (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ )] }) )
. (8.14a)
(WA ∆Ω) H(uσ ) M( Rσθ ma )R ( σ )η (σ )τ a ( σ )τ f ( σ )
To evaluate the numerator of the right-hand side, we again note that the real part of any complex
number c can be written as 0.5(c + c*) and then use the linearity of E to get
( ( {
E Re e − iψ (σ ) n (Ds ) (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ] }) )
1
(
= [e − iψ (σ )E n (Ds ) (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ma ) Z FOV (σ ) ]
2
) (8.14b)
({ } )] .
∗
+ eiψ (σ )E n (Ds ) (σ ) ∗ [ 2π iσ H(uσ ) M( Rσθ ) Z (σ ) ]
ma FOV
$$E\left(n_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right) = 0;$$ (8.14c)
and if the mean of a random complex number is zero so is the mean of its complex conjugate:
$$E\left(\left\{n_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}^*\right) = 0.$$ (8.14d)
$$E\left(\mathrm{Re}\left(e^{-i\psi(\sigma)}\left\{n_D^{(s)}(\sigma) * \left[2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\right]\right\}\right)\right) = 0.$$ (8.14e)
$$E\left(\delta L(|\sigma|)\right) = 0.$$ (8.14f)
E ¨§ ª L ( ) E L ( ) º ¸· E ª¬ L ( ) º¼ .
2 2
©¬
¼ ¹
(8.15a)
E ª¬ L ( ) º¼
2
¬
E ª Re e i ( ) n (Ds ) ( ) 2 i H(u ) M( R ma ) Z FOV ( ) º 2
¼ , (8.15b)
2
ª¬ A 4 H(u ) M( R ma ) R ( ) ( ) a ( ) f ( ) º¼
where we have used that $W^2 = 1$ because W = 1 or −1 [see discussion immediately preceding Eq.
Calculating the NEdN from the Random Sampling Error · 8.7
J ( s ) ( ) E ª Re e i ( ) n (Ds ) ( ) 2 i H(u ) M( R ma ) Z FOV ( ) º 2 ,
¬ ¼ (8.15c)
16 J ( s ) ( )
E ª¬ L ( ) º¼
2
ª¬ A H(u ) M( R ma ) R ( ) ( ) a ( ) f ( ) º¼
2
. (8.15d)
Equation (6.3f) in Chapter 6 states that the NEdN associated with any random error δL is its
standard deviation, that is, the square root of the variance of δL. Hence, $\mathrm{NEdN}_{samp}$, the NEdN
caused by the sampling error, is
$$\mathrm{NEdN}_{samp}(|\sigma|) = \frac{4\sqrt{J^{(s)}(\sigma)}}{A\,\Delta\Omega\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)}.$$ (8.15e)
The $J^{(s)}$ function specifies how the sampling noise interacts with the radiance spectrum, and the
denominator rescales the result so it has the right size with respect to the measured spectrum.
(s)
To evaluate J , we set up three new functions of wavenumber called T1 ( ) , T2 ( ) , and
T3 ( ) . Using function Ws ( ) from Eqs. (8.8c) or (8.8d) above, we define
T1 ( ) E §¨ ª¬ F ( i )
( , D) n ( s ) ( ) Ws ( ) º¼ ·¸ ,
2
©
¹
(8.16a)
T2 ( ) E §¨ ª¬ F ( i )
( , D ) n ( s ) ( ) Ws ( ) º¼ ·¸ ,
2
©
¹
(8.16b)
and
T3 ( )
(8.16c)
E F ( i )
( , D ) n ( s ) ( ) Ws ( ) ( F ( i )
( , D ) n ( s ) ( ) Ws ( ) .
-973 -
8 · Sampling-Error NEdN in Double-Sided Interferograms
is real, as are functions n ( s ) ( χ ) and Π ( χ , D) introduced in (8.2a) and (8.6a), so when taking the
complex conjugate of the Fourier transform in Eq. (8.16a) we get, applying Eqs. (2.29a) and
(2.29c) in Chapter 2,
∗
ª∞ º
( )
∗
ª ( − iσχ )
Π ( χ , D ) n ( χ ) Ws ( χ ) ¼º = « ³ Π ( χ , D) n ( s ) ( χ ) Ws ( χ )e −2π iσχ d χ »
(s)
¬F
¬ −∞ ¼
∞
³ Π( χ , D) n ( χ ) Ws ( χ )e 2π iσχ d χ
(s)
= (8.16e)
−∞
(
= F (iσχ ) Π ( χ , D) n ( s ) ( χ ) Ws ( χ ) . )
Hence the Fourier transforms in (8.16a) and (8.16b) are complex conjugates, which means their
squares must also be complex conjugates, as are the expectation values of the squares. We thus
end up with the relationship
T1 (σ )∗ = T2 (σ ) . (8.16f)
Consequently any formula derived for T1 (σ ) can be turned into a formula for T2 (σ ) just by
taking the complex conjugate of both sides of the equation.
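The conjugate relationship behind Eqs. (8.16e) and (8.16f) holds for any real function, which can be confirmed with a discretized transform. In the sketch below the real function is just a hypothetical random sequence standing in for the product Π(χ, D) n^(s)(χ) W_s(χ), and the integrals are replaced by Riemann sums.

```python
import numpy as np

rng = np.random.default_rng(2)
chi = np.linspace(-1.0, 1.0, 401)
d_chi = chi[1] - chi[0]
real_function = rng.standard_normal(chi.size)  # hypothetical stand-in, real valued

sigma = np.linspace(-3.0, 3.0, 13)
phase = 2 * np.pi * np.outer(sigma, chi)
forward = np.exp(-1j * phase) @ real_function * d_chi       # exp(-2 pi i sigma chi) kernel
opposite_sign = np.exp(1j * phase) @ real_function * d_chi  # exp(+2 pi i sigma chi) kernel

transforms_are_conjugates = np.allclose(np.conj(forward), opposite_sign)
squares_are_conjugates = np.allclose(np.conj(forward ** 2), opposite_sign ** 2)
```

Because conjugation commutes with squaring and with averaging, the same relationship carries over to the expectation values of the squares.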
Working first with the $T_1$ term in Eq. (8.16a), we consult the definition of the Fourier-transform
operator F [see Eqs. (2.29a) and (2.29c) in Chapter 2] and Eq. (3.17c) in Chapter 3 to
get
§∞ ∞
·
T1 (σ ) = E ¨ ³ Π ( χ , D) n ( χ ) Ws ( χ )e
(s) −2π iσχ
d χ ³ Π ( χ ′, D) n ( s ) ( χ ′) Ws ( χ ′)e −2π iσχ ′ d χ ′ ¸
© −∞ −∞ ¹
∞ ∞
= ³ d χ Π ( χ , D ) W ( χ )e s
−2π iσχ
³ d χ ′Π ( χ ′, D) W ( χ ′)e
s
−2π iσχ ′
( )
E n ( s ) ( χ )n ( s ) ( χ ′) .
−∞ −∞
This can also be written as, first substituting from Eq. (8.3b) and then applying (8.3c),
∞ ∞
³ d χ Π ( χ , D ) W ( χ )e ³ d χ ′Π ( χ ′, D) W ( χ ′)e
−2π iσχ −2π iσχ ′
T1 (σ ) = ( χ − χ ′) .
(s)
s s onn (8.17)
−∞ −∞
2 D sinc(2πσ D)
- 974 -
Calculating the NEdN from the Random Sampling Error · 8.7
Applying the Fourier convolution theorem to the Fourier transform of the product function
( , D) Ws ( )
"
³
( , D) W ( ) e
2 i
s d
"
(8.18a)
[2 D sinc(2 D)] [2 i H(u ) M( R ma ) Z FOV ( )] .
This means, according to Eq. (5C.1) in Appendix 5C of Chapter 5, that (8.18a) can be
approximated as
"
³
( , D) W ( ) e
2 i
s d
"
"
³
( , D) W ( ) e
2 i
s d 2 i H(u ) M( R ma ) Z mnf ( ) . (8.18b)
"
Taking the complex conjugate of both sides gives, since
( , D) , Ws ( ) , M, and Z mnf are real
quantities,
"
³
( , D) W ( ) e
2 i
s d 2 i H(u ) M( R ma ) Z mnf ( ) . (8.18c)
"
-975 -
8 · Sampling-Error NEdN in Double-Sided Interferograms
T1 ( )
" " "
³ d
( , D ) W ( )e ³ d
( , D ) W ( )e ³ d p
2 i 2 i (s)
s s
nn ( ) e2 i ( )
" " "
" " "
³ d ³ d
( , D) W ( ) e ³ d
( , D) W ( ) e
(s) 2 i ( ) 2 i ( )
( )
pnn s s
" " "
"
³p
(s)
T1 ( )
nn ( )[2 i ( ) H u ( ) M R( ) ma Z mnf ( )] (
" (8.18d)
[2 i ( )H u ( ) M R( ) ma Z mnf ( )]d .
Glancing back at the formula for $Z_{mnf}$ in Eq. (8.7f) above, we see that every function in the
formula depends on |σ| except for η, and according to Eq. (4.139g) in Chapter 4, η is also an
even function of σ. Hence $Z_{mnf}$ is even:
$$Z_{mnf}(-\sigma) = Z_{mnf}(\sigma).$$ (8.18e)
Equation (5.10f) in Chapter 5 shows that $M(R\sigma\theta_{ma})$ is an even function of σ, and Eq. (5A.6b) in
Appendix 5A of Chapter 5 shows that H is Hermitian. Consequently the formula for $T_1(\sigma)$ in
(8.18d) can be written as
T1 ( )
"
4 2 ³ pnn
(s)
( )[( ) H u ( ) M R ( ) ma Z mnf ( )] ( (8.18f)
"
Equation (8.4d) shows that the power spectrum of the sampling-position noise is real, and we
already know that M and Z mnf are real. Hence only the transfer function H in (8.18f) can have
a nonzero imaginary component, so when Eq. (8.16f) is applied to (8.18f) to get the formula for
T2 ( ) , the result is
- 976 -
Calculating the NEdN from the Random Sampling Error · 8.7
T2 (σ ) =
∞
Function T3 (σ ) in Eq. (8.16c) can be written as, applying the Fourier transform operator as
shown in Eqs. (2.29a,c) in Chapter 2,
T3 (σ )
§ ∞ ∞
·
³−∞ ³−∞ Π( χ ′, D) n ( χ ′) Ws ( χ ′)e d χ ′ ¸¹ .
−2π iσχ 2π iσχ ′
= E¨ Π ( χ , D )
n (s)
( χ ) Ws ( χ ) e d χ ⋅ (s)
Equation (3.17c) in Chapter 3 shows that the expectation operator E can be taken inside the
integrals to get, after applying Eq. (8.3b),
T3 (σ )
∞ ∞
= ³ d χ Π ( χ , D ) W ( χ )e
s
−2π iσχ
³ d χ ′ Π ( χ ′, D) W ( χ ′)e
s
2π iσχ ′
(
E n ( s ) ( χ )n ( s ) ( χ ′) ) (8.19a)
−∞ −∞
∞ ∞
³ d χ Π ( χ , D ) W ( χ )e ³ d χ ′ Π ( χ ′, D) W ( χ ′)e
−2π iσχ 2π iσχ ′
(χ ′ − χ ) .
(s)
= s s onn
−∞ −∞
(s)
is even,
According to Eq. (8.3c), the autocorrelation function onn
( − χ ) = onn
(χ ) ,
(s) (s)
onn
1 (s) 1 (s)
(χ ′ − χ ) = ( χ − χ ′) + (χ ′ − χ ) .
(s)
onn onn onn (8.19b)
2 2
-977 -
8 · Sampling-Error NEdN in Double-Sided Interferograms
" "
1
T3 ( ) ³ d
( , D) Ws ( )e 2 i ³ d
( , D ) Ws ( )e 2 i onn
(s)
( )
2 " "
" "
1
³ d
( , D) Ws ( )e 2 i ³ d
( , D ) Ws ( )e 2 i onn
(s)
( ) .
2 " "
Again, just like in the analysis of T1 above, Eq. (8.4b) is applied to get
T3 ( )
" " "
1
³ d
( , D) Ws ( )e 2 i ³ d
( , D ) Ws ( )e 2 i ³ d
(s)
( ) e
pnn 2 i ( )
or
T3 ( )
" " "
1
³ d pnn
( ) ³ d
( , D ) Ws ( )e
(s) 2 i ( )
³ d
( , D) Ws ( )e 2i(() ) (8.19c)
2 " " "
" " "
1
³ d pnn
( ) ³ d
( , D ) Ws ( )e
(s) 2 i ( )
³ d
( , D) Ws ( )e 2i(() ). .
2 " " "
T3 ( )
"
2 2{³ (s)
( )[( ) H u ( ) M R ( ) ma Z mnf ( )] (
pnn
"
[( ) H u ( ) M R( ) ma Z mnf ( )] d
"
³
(s)
( ) [( ) H u ( ) M R ( ) ma Z mnf ( )] (
pnn
"
[( ) H u ( ) M R ( ) ma Z mnf ( )] d }
or
$$T_3(\sigma) = 2\pi^2\left\{\int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\left|(\sigma - \sigma')\,H\left(u(\sigma - \sigma')\right)M\left(R(\sigma - \sigma')\theta_{ma}\right)Z_{mnf}(\sigma - \sigma')\right|^2 d\sigma' + \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\left|(\sigma + \sigma')\,H\left(u(\sigma + \sigma')\right)M\left(R(\sigma + \sigma')\theta_{ma}\right)Z_{mnf}(\sigma + \sigma')\right|^2 d\sigma'\right\}$$ (8.19d)
because everything inside the integrals except H is real. According to Eq. (3.54g) in Chapter 3,
noise-power spectra such as $p_{nn}^{(s)}$ can never be negative. The $p_{nn}^{(s)}(\sigma')$ power spectra in both
integrals of (8.19d) are multiplied by the squared magnitudes of complex numbers before being
integrated over dσ′. Consequently neither of the integrals in (8.19d) can be negative, showing that
$$T_3(\sigma) \geq 0$$ (8.19e)
for all values of wavenumber σ.
Having found formulas for T1 , T2 , and T3 , we are now prepared to expand the J ( s ) function
defined in Eq. (8.15c) above. Reversing the transform in Eq. (8.8c) gives
and Eq. (8.8e) above shows that $n_D^{(s)}$ is the Fourier transform
\[
n_D^{(s)}(\sigma) = F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi) \big) .
\]
According to Eq. (2.39j) in Chapter 2, the Fourier convolution theorem shows that
\[
n_D^{(s)}(\sigma) \ast \big[ 2\pi i\sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{FOV}(\sigma) \big] = F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) . \qquad (8.20b)
\]
\[
J^{(s)}(\sigma) = \mathrm{E}\Big( \Big[ \mathrm{Re}\Big( e^{-i\psi(\sigma)}\, F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \Big) \Big]^2 \Big) . \qquad (8.20c)
\]
We now begin the analysis of the right-hand side of Eq. (8.20c). Once again noting that the
real part of any complex number c is 0.5(c + c* ) , we write
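\[
\begin{aligned}
\mathrm{Re}\Big( e^{-i\psi(\sigma)}\, F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \Big) = \frac{1}{2} \Big\{ & e^{-i\psi(\sigma)}\, F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \\
& + e^{i\psi(\sigma)} \Big[ F^{(-i\sigma\chi')}\big( \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi') \big) \Big]^{*} \Big\} .
\end{aligned}
\]
The product $\Pi\, n^{(s)}\, W_s$ inside the Fourier transforms is real, according to Eqs. (8.6a), (8.2a), and (8.8c), so [applying formulas (2.29a) and (2.29c) in Chapter 2]
\[
\begin{aligned}
\Big[ F^{(-i\sigma\chi')}\big( \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi') \big) \Big]^{*} &= \Big[ \int_{-\infty}^{\infty} \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi')\, e^{-2\pi i\sigma\chi'}\, d\chi' \Big]^{*} \\
&= \int_{-\infty}^{\infty} \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi')\, e^{2\pi i\sigma\chi'}\, d\chi' \\
&= F^{(i\sigma\chi')}\big( \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi') \big) .
\end{aligned}
\]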
This shows that
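\[
\begin{aligned}
\mathrm{Re}\Big( e^{-i\psi(\sigma)}\, F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \Big) = \frac{1}{2} \Big\{ & e^{-i\psi(\sigma)}\, F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \\
& + e^{i\psi(\sigma)}\, F^{(i\sigma\chi')}\big( \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi') \big) \Big\} . \qquad (8.21a)
\end{aligned}
\]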
Squaring this formula leads to
\[
\begin{aligned}
\Big[ \mathrm{Re}\Big( e^{-i\psi(\sigma)}\, F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \Big) \Big]^2 = \frac{1}{4} \Big\{ & e^{-2i\psi(\sigma)} \Big[ F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \Big]^2 \\
& + e^{2i\psi(\sigma)} \Big[ F^{(i\sigma\chi')}\big( \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi') \big) \Big]^2 \\
& + 2\, F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \cdot F^{(i\sigma\chi')}\big( \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi') \big) \Big\} . \qquad (8.21b)
\end{aligned}
\]
The expectation operator E is linear with respect to random quantities (see Sec. 3.10 of Chapter
3), so E can be applied to both sides of (8.21b) to get
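\[
\begin{aligned}
\mathrm{E}\Big( \Big[ \mathrm{Re}\Big( e^{-i\psi(\sigma)}\, F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \Big) \Big]^2 \Big) ={}& \frac{1}{4}\, e^{-2i\psi(\sigma)}\, \mathrm{E}\Big( \Big[ F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \Big]^2 \Big) \\
&+ \frac{1}{4}\, e^{2i\psi(\sigma)}\, \mathrm{E}\Big( \Big[ F^{(i\sigma\chi')}\big( \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi') \big) \Big]^2 \Big) \\
&+ \frac{1}{2}\, \mathrm{E}\Big( F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \cdot F^{(i\sigma\chi')}\big( \Pi(\chi',D)\, n^{(s)}(\chi')\, W_s(\chi') \big) \Big) .
\end{aligned}
\]
Recognizing the three expectation values as T1, T2, and T3 gives
\[
\mathrm{E}\Big( \Big[ \mathrm{Re}\Big( e^{-i\psi(\sigma)}\, F^{(-i\sigma\chi)}\big( \Pi(\chi,D)\, n^{(s)}(\chi)\, W_s(\chi) \big) \Big) \Big]^2 \Big) = \frac{1}{4}\, e^{-2i\psi(\sigma)}\, T_1(\sigma) + \frac{1}{4}\, e^{2i\psi(\sigma)}\, T_2(\sigma) + \frac{1}{2}\, T_3(\sigma) ,
\]
and Eq. (8.20c) becomes
\[
J^{(s)}(\sigma) = \frac{1}{4}\, e^{-2i\psi(\sigma)}\, T_1(\sigma) + \frac{1}{4}\, e^{2i\psi(\sigma)}\, T_2(\sigma) + \frac{1}{2}\, T_3(\sigma) . \qquad (8.21c)
\]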
Equations (8.18f), (8.18g), and (8.19d) have formulas for T1 , T2 , and T3 that can be substituted
into (8.21c) to get
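\[
\begin{aligned}
J^{(s)}(\sigma) ={}& -\pi^2 e^{-2i\psi(\sigma)} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \big[(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\big] \cdot \big[(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\big]\, d\sigma' \\
& -\pi^2 e^{2i\psi(\sigma)} \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \big[(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\big]^{*} \cdot \big[(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\big]^{*}\, d\sigma' \\
& +\pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| e^{-i\psi(\sigma)} (\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma') \Big|^2 d\sigma' \\
& +\pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| e^{-i\psi(\sigma)} (\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma') \Big|^2 d\sigma' , \qquad (8.22a)
\end{aligned}
\]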
where e−iψ (σ ) terms are inserted into the squared magnitudes of the last two integrals. We can do
this because, for any complex number c,
\[
|c|^2 = \big| e^{-i\psi} c \big|^2 .
\]
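The two complex-number identities used in this part of the derivation can be checked numerically. The short sketch below is only an illustration; the function names and the test values of c and ψ are arbitrary assumptions, not quantities from the text.

```python
import cmath

def real_part(c: complex) -> float:
    """Re(c) computed as 0.5 * (c + c*)."""
    return (0.5 * (c + c.conjugate())).real

def rotated_squared_magnitude(c: complex, psi: float) -> float:
    """|exp(-i*psi) * c|**2, which should equal |c|**2 for any real psi."""
    return abs(cmath.exp(-1j * psi) * c) ** 2

c = complex(3.0, -4.0)  # arbitrary test value (assumption)
psi = 0.7               # arbitrary phase angle in radians (assumption)

print(real_part(c))
print(abs(c) ** 2, rotated_squared_magnitude(c, psi))
```

Multiplying by a unit-magnitude phase factor leaves the squared magnitude unchanged, which is why the phase factors can be moved inside the last two integrals.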
It follows that
Equation (8.22a) can now be written as, remembering that M and Z mnf are real-valued functions,
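\[
\begin{aligned}
J^{(s)}(\sigma) = \pi^2 \Big\{ & -\int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, a(\sigma,-\sigma')\, a(\sigma,\sigma')\, d\sigma' - \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, a(\sigma,-\sigma')^{*}\, a(\sigma,\sigma')^{*}\, d\sigma' \\
& + \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \big| a(\sigma,-\sigma') \big|^2 d\sigma' + \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \big| a(\sigma,\sigma') \big|^2 d\sigma' \Big\}
\end{aligned}
\]
or
\[
J^{(s)}(\sigma) = \pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma') \Big\{ \big| a(\sigma,-\sigma') \big|^2 + \big| a(\sigma,\sigma') \big|^2 - a(\sigma,-\sigma')\, a(\sigma,\sigma') - a(\sigma,-\sigma')^{*}\, a(\sigma,\sigma')^{*} \Big\}\, d\sigma' . \qquad (8.22d)
\]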
We note that
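\[
\begin{aligned}
\big| a(\sigma,-\sigma') - a(\sigma,\sigma')^{*} \big|^2 &= \big[ a(\sigma,-\sigma') - a(\sigma,\sigma')^{*} \big] \cdot \big[ a(\sigma,-\sigma') - a(\sigma,\sigma')^{*} \big]^{*} \\
&= a(\sigma,-\sigma')\, a(\sigma,-\sigma')^{*} + a(\sigma,\sigma')^{*}\, a(\sigma,\sigma') - a(\sigma,-\sigma')\, a(\sigma,\sigma') - a(\sigma,-\sigma')^{*}\, a(\sigma,\sigma')^{*} \\
&= \big| a(\sigma,-\sigma') \big|^2 + \big| a(\sigma,\sigma') \big|^2 - a(\sigma,-\sigma')\, a(\sigma,\sigma') - a(\sigma,-\sigma')^{*}\, a(\sigma,\sigma')^{*} ,
\end{aligned}
\]
so that
\[
J^{(s)}(\sigma) = \pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \big| a(\sigma,-\sigma') - a(\sigma,\sigma')^{*} \big|^2 d\sigma' ,
\]
or
\[
\begin{aligned}
J^{(s)}(\sigma) = \pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| & e^{-i\psi(\sigma)} (\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma') \\
& - e^{i\psi(\sigma)} (\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)^{*}\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma') \Big|^2 d\sigma' . \qquad (8.22e)
\end{aligned}
\]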
The $p_{nn}^{(s)}$ noise-power spectrum can never be negative [see inequality (3.54g) in Chapter 3], and inside the integral in (8.22e) the noise-power spectrum is multiplied by the squared magnitude of a complex number. Hence the integral in (8.22e) is over the product of two non-negative quantities and itself can never be negative:
\[
J^{(s)}(\sigma) \ge 0 . \qquad (8.22f)
\]
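The algebraic step that turns the four-term integrand of (8.22d) into a single squared magnitude can be spot-checked with random complex numbers. In the sketch below, `a_minus` and `a_plus` are stand-ins (an assumption made for illustration) for a(σ, −σ′) and a(σ, σ′); nothing about the actual filter or radiance enters the check.

```python
import random

def squared_difference(a_minus: complex, a_plus: complex) -> float:
    """|a_minus - conj(a_plus)|**2, the compact integrand."""
    return abs(a_minus - a_plus.conjugate()) ** 2

def expanded_form(a_minus: complex, a_plus: complex) -> complex:
    """The four-term integrand written out term by term."""
    return (abs(a_minus) ** 2 + abs(a_plus) ** 2
            - a_minus * a_plus
            - (a_minus * a_plus).conjugate())

def max_mismatch(trials: int = 1000, seed: int = 0) -> float:
    """Largest |compact - expanded| over random complex pairs."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a_minus = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        a_plus = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
        diff = squared_difference(a_minus, a_plus) - expanded_form(a_minus, a_plus)
        worst = max(worst, abs(diff))
    return worst

print(max_mismatch())
```

Because the compact form is a squared magnitude, it is non-negative term by term, which is the property the inequality relies on.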
This shows there can never be any problem taking the square root of J ( s ) in formula (8.15e)
when calculating the sampling-error NEdN. Combining Eqs. (8.15e) and (8.22e) in a single place
gives
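\[
\mathrm{NEdN}_{samp}(\sigma) = \frac{4 \sqrt{J^{(s)}(\sigma)}}{A\, \Delta\Omega\, |H(u\sigma)|\, M(R\sigma\theta_{ma})\, \mathcal{R}(\sigma)\, \eta(\sigma)\, \tau_a(\sigma)\, \tau_f(\sigma)} , \qquad (8.22g)
\]
where
\[
\begin{aligned}
J^{(s)}(\sigma) = \pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| & e^{-i\psi(\sigma)} (\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma') \\
& - e^{i\psi(\sigma)} (\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)^{*}\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma') \Big|^2 d\sigma' . \qquad (8.22h)
\end{aligned}
\]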
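Part of the formula for $J^{(s)}$ can also be written as a convolution. Equation (8.4c), which shows that $p_{nn}^{(s)}$ is even, can be applied to the second integral in Eq. (8.19d) to get
\[
\begin{aligned}
T_3(\sigma) = 2\pi^2 \Big\{ & \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| (\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma') \Big|^2 d\sigma' \\
& + \int_{-\infty}^{\infty} p_{nn}^{(s)}(-\sigma')\, \Big| (\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma') \Big|^2 d\sigma' \Big\} .
\end{aligned}
\]
Changing the variable of integration in the second integral to $\sigma'' = -\sigma'$ gives
\[
\begin{aligned}
T_3(\sigma) = 2\pi^2 \Big\{ & \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| (\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma') \Big|^2 d\sigma' \\
& + \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma'')\, \Big| (\sigma-\sigma'')\,H\big(u(\sigma-\sigma'')\big)\,M\big(R(\sigma-\sigma'')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma'') \Big|^2 d\sigma'' \Big\} ,
\end{aligned}
\]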
which becomes, glancing back at the definition of a convolution in Eq. (2.38a) in Chapter 2,
\[
T_3(\sigma) = 4\pi^2 \Big\{ p_{nn}^{(s)}(\sigma) \ast \Big[ \big| \sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{mnf}(\sigma) \big|^2 \Big] \Big\} . \qquad (8.23a)
\]
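A minimal numerical sketch of this step compares the two-integral form of T3 with the single-convolution form on a discrete grid. Everything beyond the structure of the formulas is an assumption made for illustration: M and Z_mnf are set to one, |H|² is taken as the three-pole Butterworth magnitude squared with a 1600 cm⁻¹ cutoff, and the power spectrum is the even rectangular-band shape used later in Eq. (8.25b).

```python
import math

def p_nn(s, level=1.25e-13, center=35.0, half_width=5.0):
    """Even rectangular-band power spectrum (cm^3), as in Eq. (8.25b)."""
    return level if abs(abs(s) - center) <= half_width else 0.0

def weight(s, cutoff=1600.0):
    """|s * H(u s)|^2 with M = Z_mnf = 1 (illustrative assumption)."""
    return s * s / (1.0 + (s / cutoff) ** 6)

def t3_two_integrals(sigma, step=0.5, limit=50.0):
    """2*pi^2 times the sum of the (sigma - s') and (sigma + s') integrals."""
    n = int(round(limit / step))
    total = 0.0
    for k in range(-n, n + 1):
        sp = k * step
        total += p_nn(sp) * (weight(sigma - sp) + weight(sigma + sp)) * step
    return 2.0 * math.pi ** 2 * total

def t3_convolution(sigma, step=0.5, limit=50.0):
    """4*pi^2 times the discrete convolution of p_nn with the weight."""
    n = int(round(limit / step))
    total = 0.0
    for k in range(-n, n + 1):
        sp = k * step
        total += p_nn(sp) * weight(sigma - sp) * step
    return 4.0 * math.pi ** 2 * total

print(t3_two_integrals(1000.0), t3_convolution(1000.0))
```

The two results agree because the power spectrum is even, so the (σ + σ′) integral is the (σ − σ′) integral with the integration variable reversed.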
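Substituting Eq. (8.23a) into (8.21c) gives
\[
J^{(s)}(\sigma) = \frac{1}{4}\, e^{-2i\psi(\sigma)}\, T_1(\sigma) + \frac{1}{4}\, e^{2i\psi(\sigma)}\, T_2(\sigma) + 2\pi^2 \Big\{ p_{nn}^{(s)}(\sigma) \ast \Big[ \big| \sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{mnf}(\sigma) \big|^2 \Big] \Big\} .
\]
Because $T_2 = T_1^{*}$,
\[
\begin{aligned}
J^{(s)}(\sigma) &= \frac{1}{4}\, e^{-2i\psi(\sigma)}\, T_1(\sigma) + \frac{1}{4}\, e^{2i\psi(\sigma)}\, T_1(\sigma)^{*} + 2\pi^2 \Big\{ p_{nn}^{(s)}(\sigma) \ast \Big[ \big| \sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{mnf}(\sigma) \big|^2 \Big] \Big\} \\
&= \frac{1}{4}\, e^{-2i\psi(\sigma)}\, T_1(\sigma) + \frac{1}{4} \Big[ e^{-2i\psi(\sigma)}\, T_1(\sigma) \Big]^{*} + 2\pi^2 \Big\{ p_{nn}^{(s)}(\sigma) \ast \Big[ \big| \sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{mnf}(\sigma) \big|^2 \Big] \Big\} .
\end{aligned}
\]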
Again noting that
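\[
\mathrm{Re}(c) = \frac{1}{2} (c + c^{*})
\]
for any complex number c, we see that
\[
J^{(s)}(\sigma) = \frac{1}{2}\, \mathrm{Re}\Big( e^{-2i\psi(\sigma)}\, T_1(\sigma) \Big) + 2\pi^2 \Big\{ p_{nn}^{(s)}(\sigma) \ast \Big[ \big| \sigma\, H(u\sigma)\, M(R\sigma\theta_{ma})\, Z_{mnf}(\sigma) \big|^2 \Big] \Big\} , \qquad (8.23b)
\]
with
\[
\begin{aligned}
T_1(\sigma) = -4\pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, & \big[(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma-\sigma')\big] \\
& \cdot \big[(\sigma+\sigma')\,H\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{ma}\big)\,Z_{mnf}(\sigma+\sigma')\big]\, d\sigma' . \qquad (8.23c)
\end{aligned}
\]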
This alternative formula for J ( s ) is useful later on when analyzing the behavior of the sampling-
noise NEdN associated with the measurement of an isolated emission line (see Sec. 8.9).
\[
\frac{2D}{N} = 3.125 \times 10^{-4}\ \text{cm} . \qquad (8.24b)
\]
The background radiance from the interferometer’s interior surfaces is assumed to be small
compared to the 400-K Planck radiance being measured, so we say that
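\[
L^{(dir)}(\sigma) = L_{FOV}^{(fore)}(\sigma) = L_{FOV}^{(back)}(\sigma) = L_{mnf}^{(fore)}(\sigma) = L_{mnf}^{(back)}(\sigma) = 0 . \qquad (8.24c)
\]
Again, the beam radius is taken to be R = 3 cm, which makes the beam cross-sectional area
\[
A = \pi R^2 = 28.27\ \text{cm}^2 . \qquad (8.24d)
\]
\[
\mathcal{R}(\sigma) = 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} . \qquad (8.24f)
\]
The beam-splitter efficiency and the transmissions of the fore and aft optics are also ideal,
\[
\tau_a(\sigma) = \tau_f(\sigma) = \eta(\sigma) = 1 . \qquad (8.24g)
\]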
Black-Body Spectrum Contaminated by Sampling Noise · 8.8
W = 1. (8.24i)
The detector electronics again have a three-pole, low-pass Butterworth filter with a cutoff
frequency of 8000 Hz (see Fig. 7.3 in Chapter 7). The OPD velocity u is still 5 cm/sec, so the
wavenumber corresponding to the cutoff frequency is
\[
\frac{8000\ \text{Hz}}{5\ \text{cm/sec}} = 1600\ \text{cm}^{-1} . \qquad (8.24j)
\]
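The arithmetic behind Eqs. (8.24d) and (8.24j) is simple enough to check in a few lines. The function names below are illustrative assumptions; the input values are the ones quoted in the text.

```python
import math

def beam_area_cm2(radius_cm: float) -> float:
    """Cross-sectional area of a circular beam, A = pi * R^2."""
    return math.pi * radius_cm ** 2

def cutoff_wavenumber(cutoff_hz: float, opd_velocity_cm_per_sec: float) -> float:
    """Wavenumber (cm^-1) that maps to a given electrical frequency."""
    return cutoff_hz / opd_velocity_cm_per_sec

area = beam_area_cm2(3.0)                     # R = 3 cm
sigma_cutoff = cutoff_wavenumber(8000.0, 5.0)  # 8000 Hz, u = 5 cm/sec

print(f"A = {area:.2f} cm^2")
print(f"cutoff wavenumber = {sigma_cutoff:.0f} cm^-1")
```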
One difference from the interferometer system simulated in the previous chapter is the band of
wavenumbers over which the spectrum is measured: this time it is 650 to 1250 cm-1. Another
difference is the radiances used to calibrate the instrument. Since we are simulating the
measurement of a 400-K black-body spectrum, the high-temperature calibration is now a 500-K
instead of a 350-K black-body radiance. The low-temperature calibration is still that of liquid
nitrogen (77 K).
Figure 8.1(a) shows that the sampling-position noise contaminating the 400-K black-body measurement has a quasi-harmonic $p_{nn}^{(s)}$ noise-power spectrum. This has the same shape as the spectrum in Fig. 7.2(c) in Chapter 7, with the power spectrum in Fig. 8.1(a) having $\sigma_C = 30$ cm$^{-1}$ and $\sigma_M = 10$ cm$^{-1}$. The upper level in the sampling-position power spectrum is $1.25 \times 10^{-13}$ cm$^3$, the value marked in Fig. 8.1(a). Imitating Eq. (7.49b) in Chapter 7, we write the formula for the quasi-harmonic spectrum as
\[
p_{nn}^{(s)}(\sigma) = [1.25 \times 10^{-13}\ \text{cm}^3] \cdot \Big\{ \Pi\big( \sigma - 35\ \text{cm}^{-1},\ 5\ \text{cm}^{-1} \big) + \Pi\big( \sigma + 35\ \text{cm}^{-1},\ 5\ \text{cm}^{-1} \big) \Big\} . \qquad (8.25b)
\]
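Consulting Eq. (8.5b) above, we see that the variance in the sampling-position error due to this noise-power spectrum is
\[
\int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma)\, d\sigma = 2 \cdot [1.25 \times 10^{-13}\ \text{cm}^3] \cdot [10\ \text{cm}^{-1}] = 2.5 \times 10^{-12}\ \text{cm}^2 ,
\]
which means that the root-mean-square average of the error in the sampling position is $1.581 \times 10^{-6}$ cm. Dividing by the sample spacing in Eq. (8.24b) gives
\[
\frac{1.581 \times 10^{-6}\ \text{cm}}{3.125 \times 10^{-4}\ \text{cm}} ,
\]
or approximately 0.5% of the OPD separation between adjacent samples. This may be somewhat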
larger than the typical size of the sampling error in well-designed interferometers, but the bad
sampling does make it easier to see how sampling error affects the measured spectra. Figures
8.1(b) and 8.1(c) give an example of sampling-position noise obeying the quasi-harmonic noise-power spectrum in Fig. 8.1(a). Both figures plot the same simulation of the $n^{(s)}(\chi)$ random function, with the χ axis expanded in Fig. 8.1(b) to provide a detailed example of the $n^{(s)}(\chi)$
oscillations. This sampling-position error is a zero-mean and normally distributed random
quantity. Comparing this example of quasi-harmonic noise to the one shown in Figs. 7.7(a) and
7.7(b) in Chapter 7, we see that here the random oscillations occur at a somewhat lower
frequency. This is due to our choice of a much smaller value of σ C , which is 30 cm-1 for the
noise in Figs. 8.1(b) and 8.1(c) compared to 100 cm-1 for the noise in Figs. 7.7(a) and 7.7(b).
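The numbers quoted above for the sampling-position error can be reproduced directly from the rectangular-band spectrum of Eq. (8.25b). The sketch below assumes, as the reference to Eq. (8.5b) suggests, that the variance is the area under the noise-power spectrum; the constant and function names are illustrative.

```python
import math

LEVEL_CM3 = 1.25e-13          # upper level of p_nn in Eq. (8.25b), cm^3
HALF_WIDTH = 5.0              # half-width of each rectangular band, cm^-1
SAMPLE_SPACING_CM = 3.125e-4  # OPD spacing between samples, Eq. (8.24b)

def sampling_variance(level: float = LEVEL_CM3,
                      half_width: float = HALF_WIDTH) -> float:
    """Area under the two rectangular bands (assumed equal to the variance)."""
    return 2.0 * level * (2.0 * half_width)

variance = sampling_variance()
rms = math.sqrt(variance)
fraction = rms / SAMPLE_SPACING_CM

print(f"variance = {variance:.3e} cm^2")
print(f"rms error = {rms:.3e} cm")
print(f"fraction of sample spacing = {100.0 * fraction:.2f} %")
```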
Figure 8.2(a) plots ten simulated measurements of the 400-K black-body radiance spectrum
for the interferometer system specified by the discussion accompanying Eqs. (8.24a)–(8.24j)
above. In Fig. 8.2(a), and only in Fig. 8.2(a), the actual sampling noise is multiplied by a factor of
20 before being added back to the true radiance; it can be regarded as increasing the $s_{rms}$ root-mean-square sampling-position error to 10% of the intersample spacing. This increase makes it
easy to see how the sampling noise reshapes the spectral measurements, because now the width
of the black solid line representing the true 400-K spectrum does not cover over the dashed lines
representing the noise-contaminated measurements. We note that there is a region near
σ = 1031 cm-1 where the error is always small. The solid curve in Fig. 8.2(b) is the NEdN versus wavenumber curve predicted by formulas (8.22g) and (8.22h) for this sampling-position noise, and it also shows the NEdN dipping down to zero near σ = 1031 cm-1. The NEdN is, of course, just the standard deviation of the error in the noise-contaminated spectral measurements (see Sec. 6.1 in Chapter 6). We can take a large number of noise-contaminated measurements and calculate directly the standard deviation of their error at any wavenumber σ. We have done this for 300 measurements contaminated by statistically independent examples of sampling-position noise obeying the power spectrum in Fig. 8.1(a) and plotted the results with crosses in Fig. 8.2(b). As expected, there is a good match to the predicted NEdN values (that is, the solid curve), and we see that the crosses marking the standard deviation also dip down to zero near σ = 1031 cm-1. [The reason they do not go as far down as the solid curve is explained in the discussion following Eq. (8.34c) in Sec. 8.10 below.]
FIGURE 8.1(a). Quasi-harmonic noise-power spectrum $p_{nn}^{(s)}(\sigma)$, with upper level 1.25 × 10^-13 cm^3, plotted against σ (in cm-1).
FIGURE 8.1(b). Sampling-position noise $n^{(s)}(\chi)$ (in cm) plotted against χ (in cm) on an expanded χ axis.
FIGURE 8.1(c). Sampling-position noise $n^{(s)}(\chi)$ (in cm) plotted against χ (in cm) over the full χ axis.
The formula for NEdNsamp in Eqs. (8.22g) and (8.22h) predicts this dip. The phase angle ψ of
the three-pole, low-pass filter used in the interferometer simulation [this phase angle is
introduced in Eq. (8.12f) above] is to a very good approximation linear in wavenumber,
ψ (σ ) ≅ − Kσ +ψ 0 , (8.26a)
for a real ψ 0 and a real, positive constant K. Many types of low-pass filter have this sort of
approximately linear dependence of the transfer function’s phase. Equations (8.24f)–(8.24h) can
be applied to the formula for NEdNsamp in Eq. (8.22g) to get
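\[
\mathrm{NEdN}_{samp}(\sigma) = \frac{4 \sqrt{J^{(s)}(\sigma)}}{A\, \Delta\Omega\, |H(u\sigma)| \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right]} . \qquad (8.26b)
\]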
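The near-linearity of the filter phase assumed in Eq. (8.26a) can be examined with a short sketch. It assumes the standard normalized three-pole Butterworth transfer function, 1/(s³ + 2s² + 2s + 1) with s = iσ/σ_cutoff, which may differ in detail from the filter model used in the simulation; the fitting routine and its names are illustrative.

```python
import cmath

def butterworth3(omega: float) -> complex:
    """Normalized three-pole Butterworth response at s = i*omega (assumed form)."""
    s = 1j * omega
    return 1.0 / (s ** 3 + 2.0 * s ** 2 + 2.0 * s + 1.0)

def fit_linear_phase(sigma_lo=650.0, sigma_hi=1250.0, cutoff=1600.0, n=121):
    """Least-squares fit psi(sigma) ~ -K*sigma + psi0 over the measured band.

    Returns (K, psi0, largest deviation of the true phase from the fit).
    """
    sigmas = [sigma_lo + (sigma_hi - sigma_lo) * k / (n - 1) for k in range(n)]
    phases = [cmath.phase(butterworth3(s / cutoff)) for s in sigmas]
    mean_s = sum(sigmas) / n
    mean_p = sum(phases) / n
    slope = (sum((s - mean_s) * (p - mean_p) for s, p in zip(sigmas, phases))
             / sum((s - mean_s) ** 2 for s in sigmas))
    psi0 = mean_p - slope * mean_s
    max_dev = max(abs(p - (slope * s + psi0)) for s, p in zip(sigmas, phases))
    return -slope, psi0, max_dev

K, psi0, max_dev = fit_linear_phase()
print(f"K = {K:.5f} cm, psi0 = {psi0:.4f} rad, max deviation = {max_dev:.4f} rad")
```

Under these assumptions the fitted K is positive and the residual stays at the level of a few hundredths of a radian across the 650 to 1250 cm-1 band.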
Equations (8.24c) and (8.24i) together with the previously used (8.24f)–(8.24h) can be substituted
into the formula for Z mnf in Eq. (8.7f) to get
\[
Z_{mnf}(\sigma) = \left( \frac{A\, \Delta\Omega}{4} \right) \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \cdot L_{mnf}(\sigma) \qquad (8.26c)
\]
in the formula used for J ( s ) [for example, Eq. (8.22h)]. Again we note, just as in the discussion
following Eq. (7.55f) in Chapter 7, that for this interferometer system the black-body spectrum is
smooth enough to neglect the nonrandom measurement errors due to the interferometer’s finite
field of view and finite interferogram length—that is, we do not need to worry about the
potentially different shapes of the L, LFOV, and Lmnf radiance functions. Hence the formula for
Z mnf can be written as
\[
Z_{mnf}(\sigma) \cong \left( \frac{A\, \Delta\Omega}{4} \right) \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \cdot L(\sigma) , \qquad (8.26d)
\]
where in Eq. (8.26d) L is the spectral radiance curve for Planck radiation coming from a 400-K
black body.
The formula for J ( s ) can be simplified in the same way that the NEdNsamp and Z mnf formulas
were. Equations (8.24h) and (8.26d) can be substituted into (8.22h) to get
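\[
\begin{aligned}
J^{(s)}(\sigma) = \left( \frac{\pi A\, \Delta\Omega}{4} \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \right)^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| & e^{-i\psi(\sigma)} (\sigma-\sigma')\, H\big(u(\sigma-\sigma')\big)\, L(\sigma-\sigma') \\
& - e^{i\psi(\sigma)} (\sigma+\sigma')\, H\big(u(\sigma+\sigma')\big)^{*}\, L(\sigma+\sigma') \Big|^2 d\sigma' .
\end{aligned}
\]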
FIGURE 8.2(a). Radiance (in mW/m2/sr/cm-1) plotted against σ (in cm-1), with the region near σ ≅ 1031 cm-1 marked.
This graph contains 10 simulated measurements of a 400 K black-body spectrum
contaminated by the sampling noise. The noise is increased by a factor of 20 over the
size specified by the noise-power spectrum in Fig. 8.1(a) to make it easier to see.
FIGURE 8.2(b). Radiance error (in mW/m2/sr/cm-1) plotted against σ (in cm-1), with the dip near σ ≅ 1031 cm-1 marked.
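Because ψ is the phase of the transfer function H, this can be written as
\[
\begin{aligned}
J^{(s)}(\sigma) = \left( \frac{\pi A\, \Delta\Omega}{4} \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \right)^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| & e^{i[-\psi(\sigma) + \psi(\sigma-\sigma')]} (\sigma-\sigma')\, \big| H\big(u(\sigma-\sigma')\big) \big|\, L(\sigma-\sigma') \\
& - e^{i[\psi(\sigma) - \psi(\sigma+\sigma')]} (\sigma+\sigma')\, \big| H\big(u(\sigma+\sigma')\big) \big|\, L(\sigma+\sigma') \Big|^2 d\sigma' ,
\end{aligned}
\]
so that approximation (8.26a) gives
\[
\begin{aligned}
J^{(s)}(\sigma) \cong{}& \left( \frac{\pi A\, \Delta\Omega}{4} \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \right)^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| e^{i[K\sigma - \psi_0 - K(\sigma-\sigma') + \psi_0]} (\sigma-\sigma')\, \big| H\big(u(\sigma-\sigma')\big) \big|\, L(\sigma-\sigma') \\
& \qquad - e^{i[-K\sigma + \psi_0 + K(\sigma+\sigma') - \psi_0]} (\sigma+\sigma')\, \big| H\big(u(\sigma+\sigma')\big) \big|\, L(\sigma+\sigma') \Big|^2 d\sigma' \\
={}& \left( \frac{\pi A\, \Delta\Omega}{4} \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \right)^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| e^{iK\sigma'} (\sigma-\sigma')\, \big| H\big(u(\sigma-\sigma')\big) \big|\, L(\sigma-\sigma') \\
& \qquad - e^{iK\sigma'} (\sigma+\sigma')\, \big| H\big(u(\sigma+\sigma')\big) \big|\, L(\sigma+\sigma') \Big|^2 d\sigma' .
\end{aligned}
\]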
Since
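\[
\big| C_1 e^{iK\sigma'} + C_2 e^{iK\sigma'} \big| = \big| e^{iK\sigma'} (C_1 + C_2) \big| = \big| e^{iK\sigma'} \big| \cdot \big| C_1 + C_2 \big| = \big| C_1 + C_2 \big|
\]
for any two complex numbers $C_1$ and $C_2$, this formula for $J^{(s)}$ reduces to
\[
\begin{aligned}
J^{(s)}(\sigma) \cong \left( \frac{\pi A\, \Delta\Omega}{4} \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \right)^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| & (\sigma-\sigma')\, \big| H\big(u(\sigma-\sigma')\big) \big|\, L(\sigma-\sigma') \\
& - (\sigma+\sigma')\, \big| H\big(u(\sigma+\sigma')\big) \big|\, L(\sigma+\sigma') \Big|^2 d\sigma' . \qquad (8.27)
\end{aligned}
\]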
The black-body radiance L varies slowly with wavenumber σ, as does the magnitude |H| of the filter transfer function. We can define a new function
\[
g(\sigma) = |H(u\sigma)|\, L(\sigma) , \qquad (8.28a)
\]
which is also a slowly varying function of wavenumber σ. Now the integral in Eq. (8.27) can be written as
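\[
\begin{aligned}
& \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| (\sigma-\sigma')\, \big| H\big(u(\sigma-\sigma')\big) \big|\, L(\sigma-\sigma') - (\sigma+\sigma')\, \big| H\big(u(\sigma+\sigma')\big) \big|\, L(\sigma+\sigma') \Big|^2 d\sigma' \\
& \qquad = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \big| (\sigma-\sigma')\, g(\sigma-\sigma') - (\sigma+\sigma')\, g(\sigma+\sigma') \big|^2 d\sigma' .
\end{aligned}
\]
Expanding g to first order about σ gives
\[
\begin{aligned}
& \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| (\sigma-\sigma')\, \big| H\big(u(\sigma-\sigma')\big) \big|\, L(\sigma-\sigma') - (\sigma+\sigma')\, \big| H\big(u(\sigma+\sigma')\big) \big|\, L(\sigma+\sigma') \Big|^2 d\sigma' \\
& \qquad \cong \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| (\sigma-\sigma') \Big[ g(\sigma) - \sigma' \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} \Big] - (\sigma+\sigma') \Big[ g(\sigma) + \sigma' \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} \Big] \Big|^2 d\sigma' \\
& \qquad = \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| \sigma g(\sigma) - \sigma\sigma' \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} - \sigma' g(\sigma) + \sigma'^2 \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} \\
& \qquad\qquad - \Big[ \sigma g(\sigma) + \sigma\sigma' \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} + \sigma' g(\sigma) + \sigma'^2 \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} \Big] \Big|^2 d\sigma' .
\end{aligned}
\]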
This simplifies to
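\[
\begin{aligned}
& \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| (\sigma-\sigma')\, \big| H\big(u(\sigma-\sigma')\big) \big|\, L(\sigma-\sigma') - (\sigma+\sigma')\, \big| H\big(u(\sigma+\sigma')\big) \big|\, L(\sigma+\sigma') \Big|^2 d\sigma' \\
& \qquad \cong 4\, \Big| g(\sigma) + \sigma \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} \Big|^2 \int_{-\infty}^{\infty} \sigma'^2\, p_{nn}^{(s)}(\sigma')\, d\sigma' . \qquad (8.28b)
\end{aligned}
\]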
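Substituting this result into Eq. (8.27) gives
\[
J^{(s)}(\sigma) \cong \left( \frac{\pi A\, \Delta\Omega}{2} \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \right)^2 \Big| g(\sigma) + \sigma \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} \Big|^2 \int_{-\infty}^{\infty} \sigma'^2\, p_{nn}^{(s)}(\sigma')\, d\sigma' ,
\]
and Eq. (8.26b) then becomes
\[
\mathrm{NEdN}_{samp}(\sigma) \cong \frac{2\pi}{|H(u\sigma)|} \cdot \Big| g(\sigma) + \sigma \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} \Big| \cdot \left\{ \int_{-\infty}^{\infty} \sigma'^2\, p_{nn}^{(s)}(\sigma')\, d\sigma' \right\}^{1/2} . \qquad (8.28c)
\]
This shows that the sampling-error NEdN is small wherever
\[
T(\sigma) = g(\sigma) + \sigma \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} \qquad (8.28d)
\]
is small; in fact for the approximation shown in (8.28c), the NEdNsamp value is zero when
\[
g(\sigma) = -\sigma \left. \frac{dg}{d\sigma} \right|_{\text{at } \sigma} , \qquad (8.28e)
\]
because then $T(\sigma) = g(\sigma) + \sigma\, (dg/d\sigma)$ is zero in (8.28c). Formula (8.28a) defines function g used to define T(σ) in (8.28d). Figure 8.3 is a graph of T(σ) versus σ for the T(σ) function specified by a 400-K black-body spectrum and the magnitude |H| of the filter transfer function. We see that T(σ) is zero for
\[
\sigma \cong 1030.5\ \text{cm}^{-1} , \qquad (8.28f)
\]
which explains the dip at σ ≅ 1031 cm-1 in Fig. 8.2(b) and the negligible sampling error near σ ≅ 1030.5 cm-1 of all ten noise-contaminated measurements in Fig. 8.2(a). We can expect this sort of behavior whenever we examine the NEdNsamp curve for a sample-noise-contaminated black-body spectrum.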
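The location of the zero of T(σ) can be estimated with a short root-finding sketch. It assumes that g is the product of a three-pole Butterworth magnitude (cutoff 1600 cm-1) and the Planck radiance per unit wavenumber at 400 K; the radiation constants, the finite-difference step, and the function names are choices made here for illustration rather than values taken from the text.

```python
import math

C1 = 1.191042972e-5  # 2*h*c^2 in mW / (m^2 sr cm^-4)
C2 = 1.438776877     # h*c/k in cm K

def planck_radiance(sigma: float, temperature: float) -> float:
    """Black-body spectral radiance in mW/m^2/sr/cm^-1."""
    return C1 * sigma ** 3 / math.expm1(C2 * sigma / temperature)

def butterworth3_magnitude(sigma: float, cutoff: float = 1600.0) -> float:
    """|H| of a three-pole Butterworth low-pass filter (assumed form)."""
    return 1.0 / math.sqrt(1.0 + (sigma / cutoff) ** 6)

def g(sigma: float, temperature: float = 400.0) -> float:
    return butterworth3_magnitude(sigma) * planck_radiance(sigma, temperature)

def t_function(sigma: float, temperature: float = 400.0, h: float = 0.01) -> float:
    """T = g + sigma * dg/dsigma, with a central difference for the derivative."""
    dg = (g(sigma + h, temperature) - g(sigma - h, temperature)) / (2.0 * h)
    return g(sigma, temperature) + sigma * dg

def find_zero(lo: float = 800.0, hi: float = 1300.0, tol: float = 1e-6) -> float:
    """Bisection on T(sigma); assumes a single sign change in [lo, hi]."""
    f_lo = t_function(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = t_function(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(f"T(sigma) = 0 near sigma = {find_zero():.1f} cm^-1")
```

Under these assumptions the zero falls close to the 1030.5 cm-1 value quoted in Eq. (8.28f).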
Sampling Noise and an Isolated Lorentz Emission Line · 8.9
FIGURE 8.3. T(σ) (in mW/m2/sr/cm-1) plotted against σ (in cm-1), with the zero crossing at 1030.5 cm-1 marked.
change the spectral radiance entering the system into the single Lorentz emission line shown in
Fig. 7.6 in Chapter 7. Again the expression for NEdNsamp in Eq. (8.22g) reduces to the formula
shown in (8.26b),
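\[
\mathrm{NEdN}_{samp}(\sigma) = \frac{4 \sqrt{J^{(s)}(\sigma)}}{A\, \Delta\Omega\, |H(u\sigma)| \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right]} . \qquad (8.29a)
\]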
The formula for Z mnf associated with Eq. (8.22h) is the same as it was before in (8.26c),
\[
Z_{mnf}(\sigma) = \left( \frac{A\, \Delta\Omega}{4} \right) \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \cdot L_{mnf}(\sigma) ,
\]
where Lmnf is the Lorentz emission line as measured by the interferometer. The effects of the
interferometer’s finite interferogram length and finite field of view are the same as when they are
analyzed in Sec. 7.15 of Chapter 7, so we can again ignore the slight differences in shape of the
L, LFOV, and Lmnf radiance functions [see discussion following Eq. (7.56c) in Chapter 7] to get
\[
Z_{mnf}(\sigma) \cong \left( \frac{A\, \Delta\Omega}{4} \right) \cdot \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \cdot L(\sigma) , \qquad (8.29b)
\]
where L is the spectral radiance of the Lorentz emission line entering the system, that is, the
spectral radiance in Fig. 7.6 in Chapter 7.
If we use formula (8.23b) instead of (8.22h) for the J ( s ) function in Eq. (8.29a), it will be
easier to understand how the sampling-position noise can generate ghost lines when the
interferometer measures the emission line. According to Eq. (8.24h), the M function is one, so
Eq. (8.23b) can be written as
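\[
J^{(s)}(\sigma) = \frac{1}{2}\, \mathrm{Re}\Big( e^{-2i\psi(\sigma)}\, T_1(\sigma) \Big) + 2\pi^2 \Big\{ p_{nn}^{(s)}(\sigma) \ast \Big[ \big| \sigma\, H(u\sigma)\, Z_{mnf}(\sigma) \big|^2 \Big] \Big\} .
\]
Substituting Eq. (8.29b) then gives
\[
J^{(s)}(\sigma) = \frac{1}{2}\, \mathrm{Re}\Big( e^{-2i\psi(\sigma)}\, T_1(\sigma) \Big) + \frac{\pi^2 A^2 \Delta\Omega^2}{8} \left\{ p_{nn}^{(s)}(\sigma) \ast \left[ \Big| \sigma\, H(u\sigma) \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] L(\sigma) \Big|^2 \right] \right\} . \qquad (8.30a)
\]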
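The formula for T1(σ) comes from substituting (8.24h) and (8.29b) into Eq. (8.23c) to get
\[
\begin{aligned}
T_1(\sigma) = -\frac{\pi^2 A^2 \Delta\Omega^2}{4} \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right]^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, & \big[ (\sigma-\sigma')\, H\big(u(\sigma-\sigma')\big)\, L(\sigma-\sigma') \big] \\
& \cdot \big[ (\sigma+\sigma')\, H\big(u(\sigma+\sigma')\big)\, L(\sigma+\sigma') \big]\, d\sigma' . \qquad (8.30b)
\end{aligned}
\]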
The L radiance function is narrow enough (see Fig. 7.6 in Chapter 7), and the [σH(uσ)] varies slowly enough, that we can make the approximation that
\[
\sigma\, H(u\sigma)\, L(\sigma) \cong \sigma_e\, H(u\sigma_e)\, L(\sigma) , \qquad (8.30c)
\]
where $\sigma_e$ is the wavenumber of the emission line's peak value (for the Lorentz emission line in this simulation, $\sigma_e = 950$ cm$^{-1}$). When σ is far from $\sigma_e$, function L in Eq. (8.30c) is essentially zero, making the value assigned to [σH(uσ)] irrelevant; and, of course, when σ is near to $\sigma_e$, we can approximate [σH(uσ)] by its value [$\sigma_e H(u\sigma_e)$] at $\sigma_e$. In effect, L is treated as a sort of delta function to which we have applied formula (2.68e) in Chapter 2. Equations (8.30a) and (8.30b) can now be written as
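\[
J^{(s)}(\sigma) \cong \frac{1}{2}\, \mathrm{Re}\Big( e^{-2i\psi(\sigma)}\, T_1(\sigma) \Big) + \frac{\pi^2 A^2 \Delta\Omega^2}{8} \left| \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \sigma_e\, H(u\sigma_e) \right|^2 \Big\{ p_{nn}^{(s)}(\sigma) \ast \big[ L(\sigma)^2 \big] \Big\} , \qquad (8.30d)
\]
with
\[
T_1(\sigma) \cong -\frac{\pi^2 A^2 \Delta\Omega^2}{4} \left[ \left( 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right) \sigma_e\, H(u\sigma_e) \right]^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, L(\sigma-\sigma') \cdot L(\sigma+\sigma')\, d\sigma' . \qquad (8.30e)
\]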
The solid curve in Fig. 8.4(a) is the Lorentz emission line L centered over the graph of the $p_{nn}^{(s)}(\sigma')$ function in Fig. 8.4(b), with $p_{nn}^{(s)}(\sigma')$ still having the same basic quasi-harmonic shape shown in Fig. 8.1(a). The effective half-width of the Lorentz line is taken to be $\sigma_w$. The two dashed curves in Fig. 8.4(a) show the L function displaced to either side of the original emission line, with new peak values at $\sigma_e \pm \sigma_w$.
FIGURES 8.4(a) and 8.4(b). The emission line L(σ) centered at σ_e = 950 cm-1 together with its displaced copies L(σ − σ_w) and L(σ + σ_w), with the wavenumbers σ_e − σ_C = 920 cm-1 and σ_e + σ_C = 980 cm-1 marked.
When these two dashed curves are closer together, having peaks at $\sigma_e \pm \sigma'$ with $\sigma' < \sigma_w$, then those wavenumbers σ where the dashed curves have significant overlap show where the product
\[
L(\sigma-\sigma') \cdot L(\sigma+\sigma')
\]
is significantly different from zero. When these dashed curves are further apart, having peaks at $\sigma_e \pm \sigma'$ with $\sigma' > \sigma_w$, then there is no significant overlap and the product
\[
L(\sigma-\sigma') \cdot L(\sigma+\sigma')
\]
is not significantly different from zero. Hence, the position of the dashed curves in Fig. 8.4(a) shows where this product drops to zero; any further apart and the
\[
L(\sigma-\sigma') \cdot L(\sigma+\sigma')
\]
product cannot make any significant contribution to the integral in (8.30e). Notice, however, that when
\[
-\sigma_w \le \sigma' \le \sigma_w
\]
the noise-power spectrum is zero, so the
\[
p_{nn}^{(s)}(\sigma')\, L(\sigma-\sigma') \cdot L(\sigma+\sigma')
\]
product is zero for all σ′ values for the configuration shown in Figs. 8.4(a) and 8.4(b), because when the “double L” product is non-negligible then $p_{nn}^{(s)}$ is zero, and when $p_{nn}^{(s)}$ is nonzero then the “double L” product is negligible. We conclude that the integral in (8.30e) is very small or zero, which means that T1 can be neglected in Eq. (8.30d). Hence, (8.30d) simplifies to
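\[
J^{(s)}(\sigma) \cong \frac{\pi^2 A^2 \Delta\Omega^2}{8} \left| \left[ 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right] \sigma_e\, H(u\sigma_e) \right|^2 \Big\{ p_{nn}^{(s)}(\sigma) \ast \big[ L(\sigma)^2 \big] \Big\} . \qquad (8.30f)
\]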
Consequently, the NEdNsamp formula for this sort of measurement can be written as, substituting
(8.30f) into (8.29a),
\[
\mathrm{NEdN}_{samp}(\sigma) \cong \sqrt{2}\, \pi\, \frac{\sigma_e\, |H(u\sigma_e)|}{|H(u\sigma)|} \Big\{ p_{nn}^{(s)}(\sigma) \ast \big[ L(\sigma)^2 \big] \Big\}^{1/2} . \qquad (8.30g)
\]
This approximate formula for the sampling-noise NEdN can be used whenever the $p_{nn}^{(s)}$ power spectrum “straddles” a strong emission line the way it does in Figs. 8.4(a) and 8.4(b).
The ten dotted lines in Fig. 8.5(a) plot ten spectral measurements of the Lorentz emission line
using the simulated interferometer contaminated by this quasi-harmonic sampling-position noise,
and the two split solid lines show the true spectral values. The continuous solid line in Fig. 8.5(a)
is the NEdNsamp curve specified by the formulas in (8.29a), (8.30d), and (8.30e). The formula in
(8.30g) shows that NEdNsamp is approximately proportional to the square root of the convolution of the squared emission-line radiance L with the quasi-harmonic power spectrum $p_{nn}^{(s)}$ in Eq. (8.25b). According to the discussion at the end of Sec. 7.15 of Chapter 7, a similar convolution in
the NEdN formula for the misalignment noise is also associated with ghost lines on either side of
the Lorentz emission line, as can be seen by comparing Fig. 8.5(a) to Fig. 7.8(a) in Chapter 7.
The resemblance is also present in Fig. 8.5(b), which gives an expanded view of the ghost-line
region on the right-hand side of the emission line. Just like in Fig. 7.8(a) for the misalignment
noise, the convolution predicts the presence of ghost lines on either side of the emission line, with
the center of the ghost-line region offset by wavenumber intervals of
\[
\pm \left( \sigma_C + \frac{\sigma_M}{2} \right)
\]
from the wavenumber marking the peak of the emission line. Unlike the quasi-harmonic noise-power spectrum in Chapter 7, the noise-power spectrum used here has $\sigma_C = 30$ cm$^{-1}$ and $\sigma_M = 10$ cm$^{-1}$ so that
\[
\sigma_C + \frac{\sigma_M}{2} = 35\ \text{cm}^{-1} . \qquad (8.31)
\]
This agrees with the ghost-line offsets seen in Figs. 8.5(a) and 8.5(b).
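The ghost-line offset can be illustrated by convolving the rectangular-band power spectrum with a squared Lorentzian and locating the centroid of the right-hand ghost region. In this sketch the Lorentz half-width of 1 cm-1 and the unit peak height are assumptions chosen only for illustration (the actual line is specified in Fig. 7.6 of Chapter 7), and the helper names are hypothetical.

```python
def p_nn(s, level=1.25e-13, sigma_c=30.0, sigma_m=10.0):
    """Quasi-harmonic spectrum: constant for sigma_c <= |s| <= sigma_c + sigma_m."""
    return level if sigma_c <= abs(s) <= sigma_c + sigma_m else 0.0

def lorentz(s, center=950.0, half_width=1.0):
    """Unit-peak Lorentz line; the half-width is an illustrative assumption."""
    return 1.0 / (1.0 + ((s - center) / half_width) ** 2)

def convolution(sigma, step=0.25, limit=40.0):
    """Discrete approximation of p_nn convolved with L^2, evaluated at sigma."""
    n = int(round(limit / step))
    return sum(p_nn(k * step) * lorentz(sigma - k * step) ** 2 * step
               for k in range(-n, n + 1))

def right_ghost_centroid(center=950.0, span=100.0, step=0.25):
    """Centroid of the convolution on the high-wavenumber side of the line."""
    n = int(round(span / step))
    sigmas = [center + k * step for k in range(1, n + 1)]
    values = [convolution(s) for s in sigmas]
    return sum(s * v for s, v in zip(sigmas, values)) / sum(values)

offset = 30.0 + 10.0 / 2.0
print(f"predicted offset = {offset} cm^-1")
print(f"centroid of right-hand ghost region = {right_ghost_centroid():.2f} cm^-1")
```

With these assumptions the centroid sits about 35 cm-1 above the 950 cm-1 line center, in line with Eq. (8.31).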
Figure 8.6 compares the standard deviations of the errors in the measured radiances to the
NEdNsamp values predicted by the formulas in (8.29a), (8.30d) and (8.30e). It follows the same
format as Fig. 8.2(b), and once again we see a good match between the calculated standard
deviations represented by the crosses and the NEdNsamp predictions represented by the solid line.
The only difference between the procedure used to generate Fig. 8.2(b) and the procedure used to
generate Fig. 8.6 is that the standard deviations in Fig. 8.6 are calculated from 900, instead of from
300, noise-contaminated interferometer measurements.
FIGURE 8.5(a). Radiance (in mW/m2/sr/cm-1) plotted against σ (in cm-1) from 850 to 1050 cm-1, with the noise-free spectrum labeled.
FIGURE 8.5(b). Radiance (in mW/m2/sr/cm-1) plotted against σ (in cm-1) from 950 to 1050 cm-1, with the noise-free spectrum and the NEdNsamp curve labeled.
FIGURE 8.6. Radiance error (in mW/m2/sr/cm-1) plotted against σ (in cm-1) from 800 to 1100 cm-1.
\[
\tau_f(\sigma) = 0.5 \qquad (8.32a)
\]
rather than one [the value it has had up to now is one, see Eq. (8.24g)] so that the fore-optics background radiance $L_{FOV}^{(fore)}$ is no longer insignificant as it was in Eq. (8.24c). For this sort of setup, a first-order estimate for the effective fore-optics emissivity is
\[
1 - \tau_f(\sigma) = 0.5 ,
\]
and the effective temperature of the background radiance is taken to be 350 K. The measured
emission line is the same one used before, having the spectral radiance shown in Fig. 7.6 of
Chapter 7, and the sampling-position noise is the same as in Figs. 8.5(a) and 8.5(b); that is, it is
the noise specified by the power spectrum in Fig. 8.1(a). Because significant amounts of
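background radiance are present, Eq. (8.30g) is no longer a good approximation for NEdNsamp; we must instead return to Eqs. (8.22g) and (8.22h), remembering to allow for $L_{FOV}^{(fore)}$ no longer being zero and $\tau_f$ being 0.5. Since we still have
\[
\mathcal{R}(\sigma) = 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} , \quad \tau_a(\sigma) = \eta(\sigma) = 1 , \quad M(R\sigma\theta_{ma}) = 1.0 , \quad W = 1 ,
\]
and
\[
L^{(dir)}(\sigma) = L_{FOV}^{(back)}(\sigma) = L_{mnf}^{(back)}(\sigma) = 0 ,
\]
Eq. (8.22g) becomes
\[
\mathrm{NEdN}_{samp}(\sigma) = \frac{8 \sqrt{J^{(s)}(\sigma)}}{A\, \Delta\Omega\, |H(u\sigma)| \cdot \left( 1\, \frac{\text{amp} \cdot \text{sec}}{\text{erg}} \right)} ; \qquad (8.32b)
\]
and we have, from Eq. (8.7f), that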
\[
\begin{aligned}
J^{(s)}(\sigma) = \pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma')\, \Big| & e^{-i\psi(\sigma)} (\sigma-\sigma')\, H\big(u(\sigma-\sigma')\big)\, Z_{mnf}(\sigma-\sigma') \\
& - e^{i\psi(\sigma)} (\sigma+\sigma')\, H\big(u(\sigma+\sigma')\big)^{*}\, Z_{mnf}(\sigma+\sigma') \Big|^2 d\sigma' . \qquad (8.32d)
\end{aligned}
\]
Again, according to the discussion after Eq. (7.56b) in Chapter 7, we can neglect the difference between the L, LFOV, and Lmnf spectral radiance functions; for the same reasons, we can also neglect the difference between the $L^{(fore)}$, $L_{FOV}^{(fore)}$, and $L_{mnf}^{(fore)}$ background radiance spectra.
The dotted lines in Figs. 8.7(a) and 8.7(b) show ten spectral measurements of the Lorentz
emission line contaminated by sampling noise when the 350-K background radiance is present,
and Fig. 8.7(c) is a close-up of the right-hand side of the same set of curves. The continuous solid
lines in Figs. 8.7(a)–8.7(c) show the NEdNsamp values predicted by Eqs. (8.32b)–(8.32d), and the
split solid lines give the true L(σ ) spectral radiance. Comparing Figs. 8.7(a) and 8.7(c) to the
plots in Figs. 8.5(a) and 8.5(b) without the background radiance, we see that the background
radiance prevents the measurement error from dropping to zero outside the regions where the
ghost lines occur. Figure 8.7(b), which is a somewhat expanded version of Fig. 8.7(a), makes it
easy to see that when the presence of the ghost lines is disregarded, the NEdN from the Planck
black-body radiance drops to zero near σ ≅ 940 cm-1 . This is the same sort of behavior seen
before in Figs. 8.2(a) and 8.2(b), with the dip now occurring at a smaller wavenumber (940 cm-1
instead of 1030 cm-1) because the background radiance curve is at a lower temperature, 350 K,
instead of the 400 K of Figs. 8.2(a) and 8.2(b). This dip can be seen even more plainly in Fig. 8.8.
Just like in Fig. 8.6, the crosses plot standard deviations of the radiance errors, calculated from
900 measurements contaminated by the power spectrum in Fig. 8.1(a). There is again a good
match between the standard deviations and the solid curve showing the NEdNsamp values
predicted by Eqs. (8.32b)–(8.32d), and again the crosses do not go down as far as the NEdN
curve in the region of the dip.
\[
p_{nn}^{(s)}(\sigma) = o_0\, \delta(\sigma) \qquad (8.33a)
\]
for some positive and constant o0 value, the formula for J ( s ) in Eq. (8.22h) reduces to
FIGURE 8.7(a). Radiance (in mW/m2/sr/cm-1) plotted against σ (in cm-1) from 850 to 1050 cm-1, with the noise-free spectrum and the NEdNsamp curve labeled.
Error from Quasi-Static Sampling Noise · 8.10
FIGURE 8.7(b).
1.0
1
1
0.80.8 Noise-free
Spectrum
LinpV
kR
NEdNV
kR 0.60.6
Lmeas1V
kR
Lmeas2V
kR
Lmeas3V 0.40.4
kR
Radiance Lmeas4VkR
(in mW/m2/sr/cm-1
)
Lmeas5V
0.20.2
kR
Lmeas6V
kR
Lmeas7V
kR
Lmeas8V
0.0 0
kR
Lmeas9V
kR
Lmeas10V
NEdNsamp
-0.2
kR 0.2
-0.40.4
0.5
850 900 950 1000 1050
850
850 900 950
R 1000 1050
1050
kR
(in cm-1)
-1009 -
8 · Sampling-Error NEdN in Double-Sided Interferograms
FIGURE 8.7(c). [Plot not reproduced. Vertical axis: radiance (in mW/m²/sr/cm⁻¹); horizontal axis: σ (in cm⁻¹), 950 to 1050. Legend: noise-free spectrum, NEdN, ten measured spectra, NEdNsamp.]
FIGURE 8.8. [Plot not reproduced. Vertical axis: radiance error (in mW/m²/sr/cm⁻¹); horizontal axis: σ (in cm⁻¹), 800 to 1100. Legend: estimated NEdN, NEdN.]
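$$ J^{(s)}(\sigma) = \pi^2 o_0 \left| e^{-i\psi(\sigma)}\,\sigma H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) - e^{i\psi(\sigma)}\,\sigma H(u\sigma)^{*}\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \right|^2 . \tag{8.33b} $$
$$ J^{(s)}(\sigma) = \pi^2 o_0 \left| \sigma\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) - \sigma\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \right|^2 = 0 , \tag{8.33c} $$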
when the nonzero noise-power spectrum for the sampling-position noise is a delta function.
This odd result is an artifact of the approximations made in deriving the NEdNsamp formulas,
and we can show this by getting the same result using another line of reasoning. Equation (8.33a)
specifies a noise-power spectrum concentrated at σ = 0. Substituting Eq. (8.33a) into (8.4b)
gives
$$ o_{nn}^{(s)}(\chi) = o_0 . $$
From the definition of the autocorrelation function in Eq. (8.3b), we see that an autocorrelation
function can have the same nonzero $o_0$ value at all OPD values χ only when the random sampling
error $n^{(s)}$ is the same at all OPD values χ. We interpret this to mean that all the samples of the
interferogram signal are shifted by the same random value from their expected positions during a
single sweep of the interferometer’s moving mirror. Later, after many new spectral measurements
and many more sweeps of the moving mirror, we find that the shift in the sample positions has
changed to another random value. We can think of this as quasi-static sampling noise; although
effectively constant during each sweep, the sampling shift can gradually change over many
sweeps to a new random value. Suppose r is the random shift in the OPD value χ for every
sample of a spectral measurement’s interferogram, which means the random function $n^{(s)}(\chi)$
defined in Sec. 8.2 above is now
$$ n^{(s)}(\chi) = r . \tag{8.34a} $$
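The constant-per-sweep model of Eq. (8.34a) can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the shift r is drawn from a zero-mean normal distribution, and the function names, the sweep count, and the value of r_rms are hypothetical choices rather than anything taken from the text. It shows that the ensemble autocorrelation is the same at every lag, while the average of the squared noise along any one sweep depends on which sweep is examined.

```python
import random

def simulate_sweeps(n_sweeps, n_samples, r_rms, seed=1):
    """Return one noise record per sweep; every sample in a sweep shares one shift r."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, r_rms)] * n_samples for _ in range(n_sweeps)]

def ensemble_autocorrelation(records, lag):
    """Average n(chi_0) * n(chi_0 + lag) over the ensemble of sweeps."""
    return sum(rec[0] * rec[lag] for rec in records) / len(records)

def sweep_average_of_square(record):
    """Average of n**2 along a single sweep (an OPD average, not an ensemble average)."""
    return sum(x * x for x in record) / len(record)

# Hypothetical numbers: 20000 sweeps, 16 samples per sweep, r_rms in cm.
records = simulate_sweeps(n_sweeps=20000, n_samples=16, r_rms=2.0e-6)
lags = [ensemble_autocorrelation(records, lag) for lag in range(4)]
print("ensemble autocorrelation at lags 0..3:", lags)
print("single-sweep averages of n^2:",
      [sweep_average_of_square(rec) for rec in records[:3]])
```

The identical values across lags correspond to the constant autocorrelation $o_0$; the sweep-to-sweep scatter of the single-sweep averages is what makes the process nonergodic.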
If we take a very large number of spectral measurements, there is no way to tell ahead of time
what r will be for any particular sweep of the moving mirror—but whatever r happens to be at
the beginning of the sweep, it has the same value at the end of the sweep. This is why it makes
sense to use (8.34a) to specify the sampling noise n ( s ) as a stationary but nonergodic function
[function n ( s ) is identical to the stationary but nonergodic random function discussed following
Eq. (3.47a) in Chapter 3]. Substituting (8.34a) into (8.8e) gives, using the linearity of the Fourier
transform from Sec. 2.6 of Chapter 2,
$$ n_D^{(s)}(\sigma) = F^{(-i\sigma\chi)}\big( r\,\Pi(\chi, D) \big) = r\,F^{(-i\sigma\chi)}\big( \Pi(\chi, D) \big) , $$
which becomes, using formula (8.7b),
$$ n_D^{(s)}(\sigma) = 2rD\,\mathrm{sinc}(2\pi\sigma D) . \tag{8.34b} $$
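Equation (8.34b) can be checked numerically. The following is a minimal sketch, assuming that the window function in Eq. (8.7b) is a unit-height boxcar running from −D to D and that sinc(x) means sin(x)/x; both assumptions are inferred from the form of the result, and the function names and numerical values are hypothetical.

```python
import cmath
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the removable singularity filled in."""
    return 1.0 if x == 0.0 else math.sin(x) / x

def boxcar_transform(sigma, D, r, steps=20000):
    """Midpoint-rule estimate of the integral of r*exp(-2*pi*i*sigma*chi) over -D..D."""
    h = 2.0 * D / steps
    total = 0j
    for k in range(steps):
        chi = -D + (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * sigma * chi)
    return r * total * h

D = 0.5       # hypothetical maximum OPD, in cm
r = 1.0e-6    # hypothetical sampling shift, in cm
for sigma in (0.0, 0.3, 1.0, 2.7):
    numeric = boxcar_transform(sigma, D, r)
    closed_form = 2.0 * r * D * sinc(2.0 * math.pi * sigma * D)
    print(sigma, numeric.real, closed_form)
```

The imaginary part of the numerical integral vanishes to rounding error because the boxcar is an even function of χ.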
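$$ \delta L \cong \frac{4r\,\mathrm{Re}\Big( e^{-i\psi(\sigma)} \big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * [2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)] \big\} \Big)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . $$
Just like before [see discussion following Eq. (8.18a)], we note that the product
$$ 2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma}) $$
varies slowly with wavenumber σ compared to the sinc function, setting up the approximation
$$ \delta L \cong \frac{4r\,\mathrm{Re}\Big( 2\pi i\sigma\,e^{-i\psi(\sigma)}\,H(u\sigma)\,M(R\sigma\theta_{ma}) \big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * Z_{FOV}(\sigma) \big\} \Big)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} $$
based on Eq. (5C.1) in Appendix 5C of Chapter 5. The formula for H in Eq. (8.12f) shows that
$$ \delta L \cong \frac{4r\,\mathrm{Re}\Big( 2\pi i\sigma\,|H(u\sigma)|\,M(R\sigma\theta_{ma}) \big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * Z_{FOV}(\sigma) \big\} \Big)}{(WA\,\Delta\Omega)\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . \tag{8.34c} $$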
Functions $M$, $|H|$, and $Z_{FOV}$ inside Re( ) on the right-hand side are strictly real. Formula
(8.34c) is based on the standard approximations used in this chapter; nothing extra has been
added. Consequently, when we rely on these approximations, the error $\delta L$ ends up proportional
to the real part of a strictly imaginary quantity; that is, it ends up being zero. Hence, we have
confirmed that the standard approximations used so far in this chapter end up predicting zero
sampling noise when the sampling-position noise is quasi-static with a delta function for its
power spectrum.
The best way to interpret the results in (8.33d) and (8.34c) is to regard them as predicting that
for this case the sampling error in the radiance measurement is going to be small instead of
completely nonexistent. There is already a strong hint in Sec. 8.8 that there are times when these
approximations break down—we remember that in Fig. 8.2(b) the exact sampling error marked
by the crosses does not follow the solid curve all the way down to zero at 1030.5 cm⁻¹. The
approximation used when taking the slowly varying H and M functions outside the convolution
with $[2D\,\mathrm{sinc}(2\pi\sigma D)]$ is actually rather good. These functions are also under our control when
designing the instrument; they can be made effectively constant over the band of wavenumbers
being measured, turning the approximation used to remove them from the convolution into an
exact equality. Consequently, if a more accurate formula for NEdNsamp is desired, it is better to
rethink the approximation specified in Eq. (8.2e) above. When the error from the linear
approximation
$$ z_C^{(tot)}\big( \chi_j + n^{(s)}(\chi_j) \big) \cong z_C^{(tot)}(\chi_j) + n^{(s)}(\chi_j) \left. \frac{dz_C^{(tot)}}{d\chi} \right|_{\chi_j} $$
disappears, it is clearly time to consider what happens when the quadratic approximation is used:
$$ z_C^{(tot)}\big( \chi_j + n^{(s)}(\chi_j) \big) \cong z_C^{(tot)}(\chi_j) + n^{(s)}(\chi_j) \left. \frac{dz_C^{(tot)}}{d\chi} \right|_{\chi_j} + \frac{1}{2}\,[n^{(s)}(\chi_j)]^2 \left. \frac{d^2 z_C^{(tot)}}{d\chi^2} \right|_{\chi_j} . \tag{8.35} $$
Including the effect of the third term on the right-hand side of (8.35), the quadratic term in the
Taylor series for $z_C^{(tot)}( \chi_j + n^{(s)}(\chi_j) )$, would stop the solid curve in Fig. 8.2(b) from dipping
down so close to zero, and would also prevent the noise formulas from producing a strictly zero value
for NEdNsamp when the sampling-position noise is quasi-static and obeys the delta-function power
spectrum in (8.33a).
Retaining both the linear and quadratic terms in (8.35) is, according to Eq. (8.34a), the same
as retaining O (r ) and O(r 2 ) terms everywhere they occur in the noise equations. Postponing for
a while the expansion of the signal error in powers of r, we use the exact formula for the
noise-contaminated signal $z_{CN}^{(tot)}(\chi)$, writing that
$$ z_{CN}^{(tot)}(\chi) = z_C^{(tot)}(\chi + r) \tag{8.36a} $$
rather than using the approximation in Eq. (8.2g) above. Our strategy is to repeat the same
procedure used before to derive NEdNsamp, taking advantage of the way the sampling error is now
a random constant r instead of a random function n ( s ) . Having already set up Eq. (8.36a) to
replace (8.2g) at the end of Sec. 8.2, we skip past the next section (because there is no reason to
repeat the explanation of the sampling-noise autocorrelation function and power spectrum) and
move on to Sec. 8.4. The formula corresponding to Eq. (8.6b) is
$$ Z_{\mathrm{eff,tot}N}(\sigma) = F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\,z_{CN}^{(tot)}(\chi) \big) = F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\,z_C^{(tot)}(\chi + r) \big) , \tag{8.36b} $$
representing the uncalibrated spectral signal contaminated by sampling noise. Applying Eq.
(2.39j) of Chapter 2 (the Fourier convolution theorem) gives
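$$ Z_{\mathrm{eff,tot}N}(\sigma) = F^{(-i\sigma\chi)}\big( \Pi(\chi, D) \big) * F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi + r) \big) , $$
$$ Z_{\mathrm{eff,tot}N}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi + r) \big) . $$
$$ Z_{\mathrm{eff,tot}N}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * \Big[ e^{2\pi i\sigma r}\,F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi) \big) \Big] , $$
which can be written as, since the small value of r makes $e^{2\pi i\sigma r}$ a slowly varying function of σ
compared to the sinc function [see Eq. (5C.1) in Appendix 5C of Chapter 5],
$$ Z_{\mathrm{eff,tot}N}(\sigma) \cong e^{2\pi i\sigma r} \Big\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi) \big) \Big\} . $$
Using Eq. (8.7b) to replace $[2D\,\mathrm{sinc}(2\pi\sigma D)]$ by $F^{(-i\sigma\chi)}( \Pi(\chi, D) )$, we get
$$ Z_{\mathrm{eff,tot}N}(\sigma) \cong e^{2\pi i\sigma r} \Big\{ F^{(-i\sigma\chi)}\big( \Pi(\chi, D) \big) * F^{(-i\sigma\chi)}\big( z_C^{(tot)}(\chi) \big) \Big\} , $$
$$ Z_{\mathrm{eff,tot}N}(\sigma) \cong e^{2\pi i\sigma r}\,F^{(-i\sigma\chi)}\big( \Pi(\chi, D)\,z_C^{(tot)}(\chi) \big) . $$
$$ Z_{\mathrm{eff,tot}N}(\sigma) \cong e^{2\pi i\sigma r}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) . \tag{8.36c} $$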
The alert reader will notice that the error in Eq. (8.36c) can now be entirely eliminated by taking
the magnitude of the complex spectral signal contaminated by this particular type of sampling
noise:
$$ \big| Z_{\mathrm{eff,tot}N}(\sigma) \big| \cong \big| e^{2\pi i\sigma r}\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big| = \big| H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big| = |H(u\sigma)|\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) . \tag{8.36d} $$
Here, the last step acknowledges that only $H(u\sigma)$ is a complex-valued function of σ.
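The cancellation in Eq. (8.36d) is easy to confirm numerically. The sketch below is only illustrative: the complex number standing in for the product of H, M, and Z_mnf at one wavenumber, the wavenumber itself, and the shift values are all hypothetical.

```python
import cmath
import math

def contaminated(signal, sigma, r):
    """Apply the quasi-static phase factor exp(2*pi*i*sigma*r) to a spectral value."""
    return cmath.exp(2j * math.pi * sigma * r) * signal

signal = complex(0.8, -0.3)   # hypothetical value of H*M*Z_mnf at one wavenumber
sigma = 1000.0                # wavenumber in cm^-1 (illustrative)
for r in (0.0, 1.0e-6, -3.0e-6):
    z = contaminated(signal, sigma, r)
    # The magnitude is unchanged, but the real part of the signal does change.
    print(r, abs(z) - abs(signal), (z - signal).real)
```

The magnitude difference stays at rounding level for every shift, whereas the real part of the signal moves with r; this is why a calibration that keeps the real part still sees the sampling error.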
Unfortunately, leaving aside this special case, in general taking the magnitude of the complex
spectral signal increases the amount of noise present. When, for example, the signal is
contaminated by detector noise, taking the magnitude of the complex spectral signal puts both the
avoidable and unavoidable detector-noise components into the spectral measurement.110
Consequently the signal-processing algorithms of Fourier-transform spectrometers usually avoid
taking the magnitude of the complex, noise-contaminated spectral signal and instead use
calibration algorithms like the one described in Sec. 5.19 of Chapter 5 (we have, in fact, already
applied this algorithm to standard sampling noise in Sec. 8.5 above). Although we know that our
analysis here is for the special case of sampling-position noise characterized by a delta function
power spectrum, a real spectroscopist cannot know this ahead of time and so would process his
Fourier-transform data as though other types of noise—for example, detector noise—dominate
his noise budget. Hence we should now investigate what happens to sampling-position noise
characterized by a delta-function power spectrum when it is processed this way—that is,
processed as though it is detector noise. The first step, then, is to approximate $e^{2\pi i\sigma r}$ in such a
way as to convert it into an additive noise.
We decide to take advantage of the smallness of r, expanding $e^{2\pi i\sigma r}$ into a power series
while remembering to retain, as promised in the discussion immediately preceding Eq. (8.36a)
above, both the $O(r)$ and $O(r^2)$ terms,
$$ e^{2\pi i\sigma r} = \cos(2\pi\sigma r) + i\sin(2\pi\sigma r) \cong 1 - \frac{1}{2}(2\pi\sigma r)^2 + i(2\pi\sigma r) = 1 + 2\pi i\sigma r - 2\pi^2\sigma^2 r^2 . \tag{8.37a} $$
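A quick numerical comparison shows how much the quadratic term in Eq. (8.37a) improves on the linear approximation. This is a minimal sketch; the wavenumber and shift are hypothetical values chosen so that 2πσr is small, and the function names are introduced here only for illustration.

```python
import cmath
import math

def exact_phase(sigma, r):
    """Exact phase factor exp(2*pi*i*sigma*r)."""
    return cmath.exp(2j * math.pi * sigma * r)

def linear_phase(sigma, r):
    """Expansion truncated after the O(r) term."""
    return 1.0 + 2j * math.pi * sigma * r

def quadratic_phase(sigma, r):
    """Expansion truncated after the O(r^2) term, as in Eq. (8.37a)."""
    x = 2.0 * math.pi * sigma * r
    return 1.0 + 1j * x - 0.5 * x * x

sigma = 1000.0   # wavenumber in cm^-1 (illustrative)
r = 1.0e-6       # sampling shift in cm (illustrative)
exact = exact_phase(sigma, r)
print("linear error   :", abs(exact - linear_phase(sigma, r)))
print("quadratic error:", abs(exact - quadratic_phase(sigma, r)))
```

The linear error is of order (2πσr)²/2 and the quadratic error of order (2πσr)³/6, so each retained term gains roughly a factor of 2πσr in accuracy.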
When put back into (8.36c), this gives
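$$ Z_{\mathrm{eff,tot}N}(\sigma) \cong \big( 1 + 2\pi i\sigma r - 2\pi^2\sigma^2 r^2 \big)\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) . \tag{8.37b} $$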
110
The discussion following Eq. (6.35d) in Chapter 6 explains the difference between the avoidable and unavoidable
detector-noise in a spectral measurement.
independent measurements of the same spectrum—that is, by taking its expectation value—so we
apply the expectation operator E to both sides of (8.37b) to get
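$$ E\big( Z_{\mathrm{eff,tot}N}(\sigma) \big) \cong \big[ 1 + 2\pi i\sigma\,E(r) - 2\pi^2\sigma^2\,E(r^2) \big]\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) . $$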
Here, we have once again applied Eqs. (3.9f) and (3.16a) from Chapter 3 to simplify the formula
by distributing operator E over the expression for the uncalibrated spectral signal. Substitution of
(8.34a) into (8.3a) shows that the random parameter r is zero-mean,
E ( r ) = 0 . (8.37c)
This also makes good intuitive sense because we expect the sampling offset r to be equally
likely to take on a positive or a negative value for any given sweep of the interferometer’s
moving mirror. Hence, the expectation value of the uncalibrated spectral signal can be written as
$$ E\big( Z_{\mathrm{eff,tot}N}(\sigma) \big) \cong \big( 1 - 2\pi^2\sigma^2 r_{rms}^2 \big)\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) , \tag{8.37d} $$
where we define
$$ r_{rms} = \sqrt{E(r^2)} . \tag{8.37e} $$
Since r is taken to be a small random error in the sampling position, the factor
$$ \big( 1 - 2\pi^2\sigma^2 r_{rms}^2 \big) $$
in Eq. (8.37d) is always positive with
$$ 2\pi^2\sigma^2 r_{rms}^2 \ll 1 . $$
Since $E(r) = 0$, we see that
$$ r_{rms}^2 = E(r^2) = E\big( [r - E(r)]^2 \big) . \tag{8.37f} $$
Hence rrms, being the square root of the variance E([r − E(r )]2 ) , is the standard deviation of r .
[See Eqs. (3.5c) and (3.8e) of Chapter 3 for definitions of the standard deviation and variance.]
The uncalibrated, noise-contaminated spectral signal in (8.37b) can be written as, after both
adding and subtracting
$$ 2\pi^2\sigma^2 r_{rms}^2\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) $$
from the formula,
Z eff ,totN (σ ) ≅
Taking the expectation value of both sides of (8.38b) gives, again applying Eqs. (3.9f) and
(3.16a) from Chapter 3,
$$ E(\rho^{(2)}) = r_{rms}^2 - E(r^2) $$
$$ E(\rho^{(2)}) = 0 . \tag{8.38c} $$
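$$ Z_{\mathrm{eff,tot}N}(\sigma) \cong \big[ H(u\sigma)\,M(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] + \big( r - i\pi\sigma\rho^{(2)} \big) \cdot \big[ 2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] \tag{8.38e} $$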
According to the discussion following Eq. (8.37e), we can count on M (σ rrms ) defined in (8.38d)
always being a positive quantity slightly less than one. Substituting Eq. (8.38d) into (8.37d) gives
$$ E\big( Z_{\mathrm{eff,tot}N}(\sigma) \big) \cong H(u\sigma)\,M(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) . \tag{8.38f} $$
It is now time to apply the calibration algorithm using the same procedure as in Sec. 8.5
above. We note that Eq. (8.38e) corresponds to Eq. (8.9a) in Sec. 8.4, only now the leading
nonrandom term is
$$ \big[ H(u\sigma)\,M(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] $$
instead of
$$ \big[ H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] , $$
contaminated signal spectrum, is now, applying (8.39a) and (8.39b) to Eq. (8.11e),
where the superscript (1) is added to show that Eq. (8.10d) specifying $Z_{mnf}^{(1)}$ is now the proper
formula for $Z_{mnf}$ (because $L^{(1)}$ is now the input radiance). Similarly, when the interferometer
observes the $L^{(2)}$ calibration radiance, we have
where now Eq. (8.10f) specifying $Z_{mnf}^{(2)}$ is the proper formula for the $Z_{mnf}$ function. Applying
(8.39d) and (8.39e) to Eqs. (8.11c) and (8.11d) respectively gives
The formula corresponding to Eq. (8.12b) above is, again applying (8.39d) and (8.39e),
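$$ \frac{L^{(2)}(|\sigma|) - L^{(1)}(|\sigma|)}{Z_{\mathrm{eff,tot}}^{(2)}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma)} = \left[ \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|) \right]^{-1} . \tag{8.39h} $$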
This is the formula corresponding to Eq. (8.12c) in Sec. 8.5 above. To construct the formula
corresponding to Eq. (8.12d), we subtract (8.39f) from (8.39c) to get
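$$ Z_{\mathrm{eff,tot}N}^{(meas)}(\sigma) - Z_{\mathrm{eff,tot}}^{(1)}(\sigma) \cong \frac{WA\,\Delta\Omega}{4}\,H(u\sigma)\,M(\sigma r_{rms})\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,\big[ L_{mnf}(|\sigma|) - L^{(1)}(|\sigma|) \big] + \big( r - i\pi\sigma\rho^{(2)} \big) \cdot \big[ 2\pi i\sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{mnf}(\sigma) \big] . \tag{8.39i} $$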
Equations (8.39h) and (8.39i) can now be substituted into the fundamental calibration formula
(8.12a) to get
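$$ \text{Measured Radiance} = L_{mnf}(|\sigma|) + \frac{8\pi\sigma\,[\pi\sigma\rho^{(2)} + ir]\,Z_{mnf}(\sigma)}{(WA\,\Delta\Omega)\,M(\sigma r_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} . $$
$$ \text{Measured Radiance} = L_{mnf}(|\sigma|) + \frac{2\pi\sigma\,[\pi\sigma\rho^{(2)} + ir]}{M(\sigma r_{rms})} \left\{ L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \right\} . \tag{8.39j} $$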
As always, the true error in the measured radiance is the real part of the complex error terms that
are present [see, for example, the discussion following Eq. (7.21e) in Chapter 7 or Eq. (6.35d) in
Chapter 6]. Hence the error δ L is the real part of the second term on the right-hand side:
$$ \delta L = \frac{2\pi^2\sigma^2\rho^{(2)}}{M(\sigma r_{rms})} \left\{ L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \right\} . \tag{8.39k} $$
has, except for the slowly varying $\sigma^2 / M(\sigma r_{rms})$ factor, the same shape as the $L_{mnf}$ radiance
being measured. This is good news if all that is needed is the shape of the input Lmnf radiance—
maybe we just want the position of absorption or emission lines in an unknown spectrum—
because the change in ρ (2) from measurement to measurement acts like a small random change in
the zero level of the radiance curve. It is, however, disturbing news if the Lmnf spectrum must be
radiometrically accurate, because there is little “off shape” evidence of the sampling error in the
measurement.
When the interferometer is contaminated by quasi-static or delta-function sampling-position
noise, the expected value for δ L is, applying the expectation operator to both sides of Eq.
(8.39k),
$$ E(\delta L) = \frac{2\pi^2\sigma^2}{M(\sigma r_{rms})} \left\{ L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \right\} \cdot E(\rho^{(2)}) . $$
$$ E(\delta L) = 0 , $$
confirming that δ L is still, just like in Eq. (8.14f) above, a zero-mean random quantity. Hence,
its variance is, applying formula (8.15a) to (8.39k),
$$ E(\delta L^2) = E\left( \left[ \frac{2\pi^2\sigma^2}{M(\sigma r_{rms})} \left\{ L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \right\} \right]^2 [\rho^{(2)}]^2 \right) . $$
The linearity of operator E with respect to random quantities (see Sec. 3.10 of Chapter 3) lets us
write
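$$ E(\delta L^2) = \left[ \frac{2\pi^2\sigma^2}{M(\sigma r_{rms})} \left\{ L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \right\} \right]^2 \cdot E\big( [\rho^{(2)}]^2 \big) . $$
$$ E(\delta L^2) = \left[ \frac{2\pi^2\sigma^2}{1 - 2\pi^2\sigma^2 r_{rms}^2} \left\{ L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \right\} \right]^2 \cdot E\big( [r_{rms}^2 - r^2]^2 \big) . $$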
Again it is important to remember that, according to the discussion following Eq. (8.37e), the
factor $(1 - 2\pi^2\sigma^2 r_{rms}^2)$ is always a positive number slightly less than one. The standard deviation
of $\delta L$ is the square root of its variance [see Eq. (3.5c) in Chapter 3]; and, as explained in Sec.
6.1 of Chapter 6, the NEdN of a spectral measurement is the standard deviation of its random
error. Hence, the formula for the NEdN of delta-function sampling-position noise is
$$ \mathrm{NEdN}_{samp}^{delta} = \frac{2\pi^2\sigma^2 \sqrt{E\big( [r_{rms}^2 - r^2]^2 \big)}}{1 - 2\pi^2\sigma^2 r_{rms}^2} \left| L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \right| . \tag{8.40a} $$
Using the linearity of the expectation operator E with respect to random quantities [see Eq. (3.9f)
and Sec. 3.10 of Chapter 3] and then substituting from Eq. (8.37f), we note that
$$ E\big( [r_{rms}^2 - r^2]^2 \big) = E\big( r_{rms}^4 - 2 r_{rms}^2 r^2 + r^4 \big) = r_{rms}^4 - 2 r_{rms}^2\,E(r^2) + E(r^4) = E(r^4) - r_{rms}^4 . $$
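The $\mathrm{NEdN}_{samp}^{delta}$ formula can now be written as
$$ \mathrm{NEdN}_{samp}^{delta} = \frac{2\pi^2\sigma^2 \sqrt{E(r^4) - r_{rms}^4}}{1 - 2\pi^2\sigma^2 r_{rms}^2} \left| L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \right| . \tag{8.40b} $$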
We already know, according to Eq. (8.37c), that r has a zero-mean probability density
distribution. If we also assume that this is a zero-mean normal distribution, then Eq. (7A.5d) in
Appendix 7A of Chapter 7 shows that
$$ E(r^4) = 3 r_{rms}^4 , $$
where, of course, we know from the discussion following Eq. (8.37f) that rrms is the standard
deviation of r . Now we have
$$ \mathrm{NEdN}_{samp}^{delta} = \frac{2\sqrt{2}\,\pi^2\sigma^2 r_{rms}^2}{1 - 2\pi^2\sigma^2 r_{rms}^2} \left| L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \right| \tag{8.40c} $$
as the formula for the NEdN of our measurement. By keeping both the $O(r)$ and $O(r^2)$ terms
everywhere they occur in the noise equations, we have ended up with a reasonable formula for
the quasi-static sampling noise. We see that neglecting the quadratic term in Eq. (8.2e) is the reason
our previous NEdN formula gave zero for the quasi-static sampling noise.
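The step from Eq. (8.40a) to Eq. (8.40c) rests on the normal-distribution result for the fourth moment of r. The following Monte Carlo sketch checks that step under the same assumption of a zero-mean normal r; the function name, the number of draws, the seed, and the value of r_rms are hypothetical.

```python
import random

def fourth_moment_check(r_rms, n_draws=200000, seed=0):
    """Estimate E(r^4) and E([r_rms^2 - r^2]^2) for zero-mean normal r."""
    rng = random.Random(seed)
    m4 = 0.0
    var_rho = 0.0
    for _ in range(n_draws):
        r = rng.gauss(0.0, r_rms)
        m4 += r ** 4
        var_rho += (r_rms ** 2 - r ** 2) ** 2
    return m4 / n_draws, var_rho / n_draws

r_rms = 1.5e-6   # hypothetical standard deviation of the sampling shift, in cm
m4, var_rho = fourth_moment_check(r_rms)
# For a normal distribution these ratios should approach 3 and 2 respectively.
print("E(r^4) / r_rms^4:", m4 / r_rms ** 4)
print("E([r_rms^2 - r^2]^2) / r_rms^4:", var_rho / r_rms ** 4)
```

A ratio near 2 for the second quantity is what produces the factor of √2 in Eq. (8.40c); a different distribution for r would change that factor.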
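$$ \mathrm{NEdN}_{tilt} = \left[ \frac{8\pi \sqrt{J^{(\theta 2)}(\sigma)}}{WA\,\Delta\Omega\,M(R\sigma\theta_{rms})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right] , \tag{8.41a} $$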
where
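$$ J^{(\theta 2)}(\sigma) = \frac{1}{4} \int_{-\infty}^{\infty} p_{nn}^{(\theta 2)}(\sigma') \big[ (\sigma + \sigma')^2 Z_{mnf}(\sigma + \sigma') + (\sigma - \sigma')^2 Z_{mnf}(\sigma - \sigma') \big]^2 d\sigma' . \tag{8.41b} $$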
The corresponding pair of formulas for the sampling-error NEdN is, in Eqs. (8.22g) and (8.22h)
above,
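$$ \mathrm{NEdN}_{samp}(|\sigma|) = \frac{4 \sqrt{J^{(s)}(\sigma)}}{WA\,\Delta\Omega\,|H(u\sigma)|\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \tag{8.41c} $$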
and
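$$ J^{(s)}(\sigma) = \pi^2 \int_{-\infty}^{\infty} p_{nn}^{(s)}(\sigma') \Big| e^{-i\psi(\sigma)} (\sigma - \sigma')\,H\big( u(\sigma - \sigma') \big)\,M\big( R(\sigma - \sigma')\theta_{ma} \big)\,Z_{mnf}(\sigma - \sigma') - e^{i\psi(\sigma)} (\sigma + \sigma')\,H\big( u(\sigma + \sigma') \big)^{*}\,M\big( R(\sigma + \sigma')\theta_{ma} \big)\,Z_{mnf}(\sigma + \sigma') \Big|^2 d\sigma' . \tag{8.41d} $$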
The formula for Z mnf is, of course, the same in both sets of equations; Eq. (8.7f) in this chapter
just repeats the definition in (7.16f) in the previous chapter. For the two types of NEdN,
$$ Z_{mnf}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\big[ \tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|) \big] . \tag{8.41e} $$
Speaking very approximately, and noting what happens when the formulas for $J^{(\theta 2)}$ and $J^{(s)}$
are substituted into Eqs. (8.41a) and (8.41c), we see that both NEdNtilt and NEdNsamp diminish as
$\sqrt{J}$ and thus (disregarding for now the effect of the integrals over $d\sigma'$) decrease in an
approximately linear way with $Z_{mnf}$. This point is not just academic because, at least in
principle, both $L_{mnf}^{(fore)}$ and $L_{mnf}^{(back)}$ are under the control of the interferometer’s designer. Hence, by
Comparing the Sampling-Error, Misalignment, and Detector NEdNs · 8.11
under typical operating conditions when measuring a typical input spectrum—the sum
$$ \tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) , $$
we can minimize both NEdNtilt and NEdNsamp. This same relationship shows up in the formula for
random quasi-static sampling error. Equation (8.40b) shows that if
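$$ L_{mnf}(|\sigma|) + \frac{L_{mnf}^{(fore)}(|\sigma|) - L_{mnf}^{(back)}(|\sigma|)}{\tau_f(|\sigma|)} \approx 0 $$
or
$$ L_{mnf}^{(back)}(|\sigma|) \approx \tau_f(|\sigma|)\,L_{mnf}(|\sigma|) + L_{mnf}^{(fore)}(|\sigma|) , \tag{8.41f} $$
then $\mathrm{NEdN}_{samp}^{delta} \approx 0$ also.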
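The balance condition in Eq. (8.41f) can be tried out with a small calculation based on the normal-distribution formula of Eq. (8.40c). This is a sketch under stated assumptions: the function name and argument names are hypothetical, the absolute value is taken so that the result is a non-negative standard deviation, and the radiance, transmission, and shift values are made-up numbers in consistent units.

```python
import math

def nedn_samp_delta(sigma, r_rms, l_mnf, l_fore, l_back, tau_f):
    """Quasi-static sampling NEdN following Eq. (8.40c), for normally distributed r."""
    x = 2.0 * math.pi ** 2 * sigma ** 2 * r_rms ** 2
    radiance_factor = l_mnf + (l_fore - l_back) / tau_f
    return math.sqrt(2.0) * x / (1.0 - x) * abs(radiance_factor)

sigma = 1000.0    # wavenumber in cm^-1 (illustrative)
r_rms = 1.0e-6    # rms sampling shift in cm (illustrative)
l_mnf, l_fore, tau_f = 0.25, 0.125, 0.5   # made-up radiances and transmission

# Background radiance chosen to satisfy Eq. (8.41f): tau_f * L + L_fore.
l_back_balanced = tau_f * l_mnf + l_fore
print("balanced  :", nedn_samp_delta(sigma, r_rms, l_mnf, l_fore, l_back_balanced, tau_f))
print("unbalanced:", nedn_samp_delta(sigma, r_rms, l_mnf, l_fore, 0.0, tau_f))
```

With the balanced background the bracketed radiance term vanishes, so the computed NEdN drops to zero regardless of the size of r_rms.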
It is not difficult to understand why (8.41f) minimizes random misalignment and sampling
errors—for both types of noise this minimizing relationship is present from the start of our
analysis. It is also not very difficult to show how this works. Working first with the sampling-
position noise, we get from Eq. (8.1c) that
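$$ z_C^{(tot)}(\chi) = \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma , \tag{8.42a} $$
$$ \frac{dz_C^{(tot)}}{d\chi} = 2\pi i \int_{-\infty}^{\infty} \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{8.42b} $$
$$ z_{CN}^{(tot)}(\chi) = z_C^{(tot)}(\chi) + 2\pi i\,n^{(s)}(\chi) \int_{-\infty}^{\infty} \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{8.42c} $$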
Examining the derivation of Eq. (8.2g), we see that it comes from removing the j subscripts in
Eq. (8.2e). If we want to include the $O\big( [n^{(s)}]^2 \big)$ error term from the analysis of the quasi-static
sampling error in Sec. 8.10, we can similarly remove the j subscripts from Eq. (8.35) and
substitute from (8.42a) to get
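$$ z_{CN}^{(tot)}(\chi) \Big|_{\text{sampling noise}} = z_C^{(tot)}(\chi) + 2\pi i\,n^{(s)}(\chi) \int_{-\infty}^{\infty} \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma - 2\pi^2\,[n^{(s)}(\chi)]^2 \int_{-\infty}^{\infty} \sigma^2 H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{8.42d} $$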
Clearly the size of the sampling noise is governed by the size of Z FOV . As for the mirror-tilt error
in Chapter 7, Eq. (7.8a) can be substituted into (7.11d) to get
( tot ) § · ª" º
zCN ( ) u 1h ¨ ¸ « ³ M( R rms ) Z FOV ( ) e 2 i d »
© u ¹ ¬ " ¼
( )
mirror-tilt
noise (8.42e)
§ · ª ( 2) º
"
u a h ¨ ¸ « n ( ) ³ 2 Z FOV ( ) e 2 i d » .
1
©u¹ ¬ " ¼
Again the size of the signal noise is governed by the size of Z FOV . The formula for Z FOV is [see
Eq. (8.1b) above or (7.7b) in Chapter 7]
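$$ Z_{FOV}(\sigma) = \left( \frac{WA\,\Delta\Omega}{4} \right) R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\big[ \tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|) \big] . \tag{8.42f} $$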
All the random-error terms in the formulas for the interferogram signal contaminated by sampling
noise and random misalignment errors—that is, all the random-error terms on the right-hand
sides of Eqs. (8.42d) and (8.42e)—can be minimized by minimizing Z FOV . Equation (8.42f)
shows that this occurs when
$$ \tau_f L_{FOV} + L_{FOV}^{(fore)} - L_{FOV}^{(back)} \approx 0 $$
or
$$ L_{FOV}^{(back)}(|\sigma|) \approx \tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) . \tag{8.42g} $$
Assuming that the interferometer does a reasonable job of resolving the L, L(back) , and L( fore )
radiance spectra—that is, assuming that the distorting effects of the finite interferogram and finite
field of view are negligible—we know, just like in Eqs. (7.19a) and (7.19b) in Chapter 7, that
$$ L^{(fore)}(|\sigma|) \cong L_{FOV}^{(fore)}(|\sigma|) \cong L_{mnf}^{(fore)}(|\sigma|) , \tag{8.42i} $$
and
$$ L(|\sigma|) \cong L_{FOV}(|\sigma|) \cong L_{mnf}(|\sigma|) . \tag{8.42j} $$
Under these conditions, Eqs. (8.42g) and (8.41f) are effectively identical. Since (8.41f) comes
from minimizing the final formulas for NEdNtilt and NEdNsamp, and (8.42g) comes from
minimizing the raw noise contaminating the initial interferogram signals, we have now confirmed
that this noise-minimizing relationship is present from the beginning of the analysis and
continues through to the end.
The noise associated with the randomly changing misalignment and sampling errors is
sometimes called multiplicative noise.111 The name comes from the way these random errors
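enter the equations only after being multiplied by integrals proportional to $Z_{FOV}$, which is itself
proportional to
$$ \tau_f L_{FOV} + L_{FOV}^{(fore)} - L_{FOV}^{(back)} . $$
$$ \int_{-\infty}^{\infty} \sigma^2 Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma $$
$$ \int_{-\infty}^{\infty} \sigma\,H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma $$
and $[n^{(s)}]^2$ is multiplied by
$$ \int_{-\infty}^{\infty} \sigma^2 H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma $$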
111
John Chamberlain, The Principles of Interferometric Spectroscopy, pp. 303–309.
zC ( χ ) + zC( cold ) ( χ ) .
Even though there is a convolution with h( χ / u ) before the addition (to show what happens to
the noise when it passes through the signal processing chain after leaving the detector), no terms
proportional to the input or background radiances are included in the random error before it is
added to the uncontaminated signal. This is why the random error coming from the detector is
sometimes called additive noise.
The noise-free components of the interference signals in Eqs. (8.42d) and (8.43) are the same.
It is easy to show this is true. Setting n (det) to zero in (8.43) reduces the right-hand side to
zC ( χ ) + zC( cold ) ( χ ) ,
which becomes, after substituting from Eqs. (6.5d) and (6.12a) in Chapter 6,
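$$ z_C(\chi) + z_C^{(cold)}(\chi) = \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_f(|\sigma|)\,\tau_a(|\sigma|)\,L_{FOV}(|\sigma|)\,e^{2\pi i\sigma\chi}\,d\sigma + \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,\eta(\sigma)\,R(|\sigma|)\,\tau_a(|\sigma|)\,\big[ L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|) \big]\,e^{2\pi i\sigma\chi}\,d\sigma . $$
$$ z_C(\chi) + z_C^{(cold)}(\chi) = \frac{WA\,\Delta\Omega}{4} \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|) \cdot \big[ \tau_f(|\sigma|)\,L_{FOV}(|\sigma|) + L_{FOV}^{(fore)}(|\sigma|) - L_{FOV}^{(back)}(|\sigma|) \big]\,e^{2\pi i\sigma\chi}\,d\sigma . $$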
$$ z_C(\chi) + z_C^{(cold)}(\chi) = \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma , $$
This is the same zC(tot ) that the right-hand side of (8.42d) reduces to when the sampling-position
noise n ( s ) is zero; in both cases, not surprisingly, the same function can be used to represent the
noise-free signal.
The right-hand side of Eq. (8.42e) for the random mirror-misalignment error also reduces to
$z_C^{(tot)}$ as the misalignment noise goes to zero, but unfortunately it takes some analysis to show
this. When $n^{(\theta 2)}$ is zero, the Fourier F operator defined in Eqs. (2.29a) and (2.29c) in Chapter 2
can be used to write the right-hand side of (8.42e) as
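$$ u^{-1} h\!\left( \frac{\chi}{u} \right) * \left[ \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] = u^{-1} h\!\left( \frac{\chi}{u} \right) * F^{(i\sigma\chi)}\big( M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma) \big) . \tag{8.45a} $$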
We note that the transform in Eq. (6.27b) in Chapter 6 can be reversed to get (replacing the
dummy variables χ ′′ , σ by χ , σ ′ respectively)
$$ h\!\left( \frac{\chi}{u} \right) = F^{(i\sigma'\chi)}\big( u H(u\sigma') \big) = u\,F^{(i\sigma'\chi)}\big( H(u\sigma') \big) , $$
where in the last step we have used the linearity of F to move u outside the Fourier transform (see
Sec. 2.6 in Chapter 2). This can also be written as
$$ \frac{1}{u}\,h\!\left( \frac{\chi}{u} \right) = F^{(i\sigma'\chi)}\big( H(u\sigma') \big) . \tag{8.45b} $$
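$$ u^{-1} h\!\left( \frac{\chi}{u} \right) * \left[ \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] = F^{(i\sigma'\chi)}\big( H(u\sigma') \big) * F^{(i\sigma\chi)}\big( M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma) \big) . $$
Writing the right-hand side as a Fourier integral [after applying Eq. (2.39j) in Chapter 2], we get
$$ u^{-1} h\!\left( \frac{\chi}{u} \right) * \left[ \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] = F^{(i\sigma\chi)}\big( H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma) \big) . $$
We again apply (2.29c) in Chapter 2 to the right-hand side to write
$$ u^{-1} h\!\left( \frac{\chi}{u} \right) * \left[ \int_{-\infty}^{\infty} M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma \right] = \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{rms})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{8.45c} $$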
Formula (7.3d) in Chapter 7 states that
2
rms 2 x2 y2 ;
and when the misalignment noise drops to zero, we expect the x , y standard deviations of the
random misalignment angles and also to go to zero, giving us
x y
rms .
The discussion following Eq. (7.2e) defines to be the bias-tilt angle of the randomly varying
misalignment, so when the randomly changing misalignment error goes to zero, it makes sense to
regard as the static misalignment angle șma,
rms ) ma .
Replacing $\theta_{rms}$ by $\theta_{ma}$ and then comparing the right-hand side of (8.45c) to (8.42a), we see that as
the misalignment noise drops to zero, the noise-free signal once again simplifies to the Fourier
transform
$$ z_C^{(tot)}(\chi) = \int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{ma})\,Z_{FOV}(\sigma)\,e^{2\pi i\sigma\chi}\,d\sigma . \tag{8.45d} $$
This is, of course, the same noise-free signal we get in Eq. (8.44) above. Hence we have now
demonstrated that the noise-free signals from our analysis of the detector noise, the sampling-
position noise, and the mirror-misalignment noise indeed all reduce to the same expression, as
they should.
According to the discussion following Eq. (8.42g), the approximate radiance equalities
specified by (8.41f) and (8.42g) are essentially equivalent in well-designed interferometers, and
from this it follows that Z FOV [whose formula is given by (8.42f)] is minimized by (8.42g) at the
same time that NEdNtilt and NEdNsamp are minimized by (8.41f). At this point, however, we notice
that the zC(tot ) noise-free signal component in (8.45d) is also minimized when Z FOV is minimized.
This seems to cause a problem, because the spectral measurement depends on this noise-free
component—it clearly does not make sense to design the interferometer for minimal tilt and
sample noise if the signal itself then goes away.
To solve this puzzle, we need to be more explicit about what exactly is being measured.
According to the mathematics of information theory, the more unexpected an occurrence is, the
more information it provides.112 Turning this statement around, the more expected an occurrence
is, the less information it provides. With this idea as a guide, we can divide the $L(\sigma)$ radiance
spectrum being measured into an expected component and an unexpected (or unknown)
component,
$$ L(|\sigma|) = L^{(exp)}(|\sigma|) + L^{(unk)}(|\sigma|) . \tag{8.46a} $$
The L(exp) spectral radiance is what we expect to measure; it could, for example, be the average
spectrum measured in the past under circumstances similar to the present. Assuming that there
are N past measurements, we can label each measurement with an index j = 1, 2,… , N and call
the radiance seen in the jth past measurement L( j ) ( σ ) so that
$$ L^{(exp)}(|\sigma|) = \frac{1}{N} \sum_{j=1}^{N} L^{(j)}(|\sigma|) . \tag{8.46b} $$
According to (8.46a), the unknown component L(unk ) for the spectrum L now being measured
must be the difference between that spectrum and $L^{(exp)}$, so
$$ L^{(unk)}(|\sigma|) = L(|\sigma|) - L^{(exp)}(|\sigma|) . \tag{8.46c} $$
112
A. Papoulis, Probability, Random Variables, and Stochastic Processes, p. 534.
Function L(unk ) is the real information in the signal because we cannot know anything about it
ahead of time; in fact, because it is defined to be the difference between L and L(exp) , it’s equally
likely to be positive or negative. Not knowing anything about it ahead of time, we cannot design
the instrument around it; we can, however, just like any other truly unpredictable quantity,
estimate its expected size by calculating the associated standard deviation:
$$ L^{(unk)}(|\sigma|) \approx \sqrt{ \frac{1}{N} \sum_{j=1}^{N} \big[ L^{(j)}(|\sigma|) - L^{(exp)}(|\sigma|) \big]^2 } . \tag{8.46d} $$
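Equations (8.46b) and (8.46d) amount to a per-wavenumber mean and a per-wavenumber standard deviation over the past measurements. The sketch below illustrates that bookkeeping; the function names are hypothetical, each spectrum is assumed to be a list of radiance values on a common wavenumber grid, and the sample numbers are made up.

```python
import math

def expected_spectrum(past):
    """Per-wavenumber average of the past spectra, as in Eq. (8.46b)."""
    n = len(past)
    return [sum(spec[k] for spec in past) / n for k in range(len(past[0]))]

def unknown_size(past):
    """Per-wavenumber standard deviation (1/N form), as in Eq. (8.46d)."""
    l_exp = expected_spectrum(past)
    n = len(past)
    return [math.sqrt(sum((spec[k] - l_exp[k]) ** 2 for spec in past) / n)
            for k in range(len(l_exp))]

# Three made-up past spectra sampled at the same four wavenumbers.
past_spectra = [
    [1.0, 2.0, 3.0, 2.5],
    [1.2, 2.0, 2.8, 2.5],
    [0.8, 2.0, 3.2, 2.5],
]
print("L_exp:", expected_spectrum(past_spectra))
print("size of L_unk:", unknown_size(past_spectra))
```

Wavenumbers where the past spectra never change give a zero estimate for the unknown component, which matches the idea that a fully expected radiance carries no new information.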
To show the effect of the interferometer’s finite field of view, we follow the pattern of Eqs.
(5.83e), (6.11b), and (6.11c) in Chapters 5 and 6, setting
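$$ \Delta\sigma = \frac{\Delta\Omega\,\sigma}{2\pi} $$
and
$$ L_{FOV}^{(unk)}(|\sigma|) = \begin{cases} L^{(unk)}(|\sigma|) & \text{for small } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ can be approximated as one} \\[1ex] \dfrac{1}{\Delta\sigma} \cdot \displaystyle\int_{\sigma \cdot \left( 1 + \frac{\Delta\Omega}{4\pi} \right) - \frac{\Delta\sigma}{2}}^{\sigma \cdot \left( 1 + \frac{\Delta\Omega}{4\pi} \right) + \frac{\Delta\sigma}{2}} L^{(unk)}(|\sigma'|)\,d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ cannot be approximated as one} \end{cases} \tag{8.47b} $$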
Similarly, following the pattern of Eqs. (5.108d), (6.25b), and (6.25c) in Chapters 5 and 6, the
distorting effect of the finite interferogram length can be introduced by defining
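$$ L_{mnf}^{(exp)}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(exp)}(|\sigma|) \tag{8.47c} $$
and
$$ L_{mnf}^{(unk)}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(unk)}(|\sigma|) . \tag{8.47d} $$
The analysis following Eq. (5.108d) applies equally well to $L_{mnf}^{(exp)}$ and $L_{mnf}^{(unk)}$, letting us write
$$ L_{mnf}^{(exp)}(\sigma) = L_{mnf}^{(exp)}(|\sigma|) \tag{8.47e} $$
and
$$ L_{mnf}^{(unk)}(\sigma) = L_{mnf}^{(unk)}(|\sigma|) \tag{8.47f} $$
$$ L_{mnf}^{(exp)}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(exp)}(|\sigma|) \tag{8.47g} $$
and
$$ L_{mnf}^{(unk)}(|\sigma|) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * L_{FOV}^{(unk)}(|\sigma|) . \tag{8.47h} $$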
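We now combine results. Substituting (8.46a) into the right-hand side of Eq. (5.83e) in Chapter 5 gives
$$L_{\mathrm{FOV}}(\sigma) = \begin{cases} L^{(\mathrm{exp})}(\sigma) + L^{(\mathrm{unk})}(\sigma) & \text{for small } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ can be approximated as one} \\[2ex] \dfrac{1}{\Delta\sigma}\displaystyle\int_{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L^{(\mathrm{exp})}(\sigma')\,d\sigma' + \dfrac{1}{\Delta\sigma}\displaystyle\int_{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right)-\frac{\Delta\sigma}{2}}^{\sigma\left(1+\frac{\Delta\Omega}{4\pi}\right)+\frac{\Delta\sigma}{2}} L^{(\mathrm{unk})}(\sigma')\,d\sigma' & \text{for slightly larger } \Delta\Omega \text{ where } \cos\alpha_{\varepsilon} \text{ cannot be approximated as one} \end{cases}$$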
Equations (8.47a) and (8.47b) show that this formula is the same thing as saying that
$$L_{\mathrm{FOV}}(\sigma) = L^{(\mathrm{exp})}_{\mathrm{FOV}}(\sigma) + L^{(\mathrm{unk})}_{\mathrm{FOV}}(\sigma) . \qquad (8.47i)$$
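Equation (8.47i) can now be substituted into Eq. (5.108d) in Chapter 5 to get
$$L_{\mathrm{mnf}}(\sigma) = [2D\,\mathrm{sinc}(2\pi\sigma D)] * \left[ L^{(\mathrm{exp})}_{\mathrm{FOV}}(|\sigma|) + L^{(\mathrm{unk})}_{\mathrm{FOV}}(|\sigma|) \right] ,$$
which becomes, using the linearity of the convolution [see Eq. (2.38d) in Chapter 2],
$$L_{\mathrm{mnf}}(\sigma) = \left\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * L^{(\mathrm{exp})}_{\mathrm{FOV}}(|\sigma|) \right\} + \left\{ [2D\,\mathrm{sinc}(2\pi\sigma D)] * L^{(\mathrm{unk})}_{\mathrm{FOV}}(|\sigma|) \right\} .$$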
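Substitution from Eqs. (8.47g,h) gives
$$L_{\mathrm{mnf}}(\sigma) = L^{(\mathrm{exp})}_{\mathrm{mnf}}(|\sigma|) + L^{(\mathrm{unk})}_{\mathrm{mnf}}(|\sigma|) .$$
If the right-hand side of this formula is an even function of $\sigma$ (and it is), then the left-hand side must also be an even function of $\sigma$, allowing us to write
$$L_{\mathrm{mnf}}(|\sigma|) = L^{(\mathrm{exp})}_{\mathrm{mnf}}(|\sigma|) + L^{(\mathrm{unk})}_{\mathrm{mnf}}(|\sigma|) . \qquad (8.47j)$$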
Equations (8.47i) and (8.47j) match the form of (8.46a), showing that the distinction between the
expected and unknown radiances extends naturally to the distorted radiance functions produced
by the finite field of view and finite interferogram length.
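The step from (8.46a) to (8.47i) and (8.47j) rests only on linearity, and that can be checked numerically. The sketch below is not the instrument model of Chapters 5 and 6: it stands in for the finite field of view with a plain moving-box average (ignoring the small wavenumber shift), and for the finite interferogram length with a sampled $2D\,\mathrm{sinc}$ kernel, taking $\mathrm{sinc}(x) = \sin(x)/x$ as an assumed convention. The names `fov_average`, `finite_length`, and `distort`, along with all numerical values, are hypothetical.

```python
import numpy as np

def fov_average(spectrum, width):
    """Moving-box average over `width` samples (stand-in for finite-FOV smearing)."""
    box = np.ones(width) / width
    return np.convolve(spectrum, box, mode="same")

def finite_length(spectrum, sigma_step, d_max):
    """Convolve with 2D*sinc(2*pi*sigma*D) sampled on the wavenumber grid."""
    n = spectrum.size
    offsets = (np.arange(n) - n // 2) * sigma_step
    # np.sinc(x) = sin(pi*x)/(pi*x), so sin(2*pi*s*D)/(2*pi*s*D) = np.sinc(2*s*D)
    kernel = 2.0 * d_max * np.sinc(2.0 * offsets * d_max) * sigma_step
    return np.convolve(spectrum, kernel, mode="same")

def distort(spectrum, width=5, sigma_step=1.0, d_max=0.1):
    """Apply the FOV stand-in, then the finite-length convolution."""
    return finite_length(fov_average(spectrum, width), sigma_step, d_max)

rng = np.random.default_rng(1)
l_exp = 1.0 + 0.1 * np.cos(np.linspace(0.0, 3.0, 256))   # smooth expected part
l_unk = 0.02 * rng.standard_normal(256)                  # small unknown part

lhs = distort(l_exp + l_unk)             # distort the total radiance
rhs = distort(l_exp) + distort(l_unk)    # distort each part, then add
max_diff = float(np.max(np.abs(lhs - rhs)))
```

Here `max_diff` comes out at the level of floating-point rounding, which is the numerical counterpart of saying that the expected and unknown radiances keep their separate identities after both distortions.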
The expected component L(exp) of the measured radiance often acts like a type of background
radiance generated outside the instrument. Suppose, for example, a spectroscopist is trying to
measure the infrared spectrum of a small burning candle with an interferometer having a
relatively large field of view. The optical signal coming from the candle could easily turn out to
be rather small compared to the infrared background signal coming from the laboratory walls. In this sort of situation, we can say that
$$L^{(\mathrm{exp})}(\sigma) \approx L^{(\mathrm{wall})}(\sigma) . \qquad (8.48)$$
Of course $L^{(\mathrm{exp})}$, if defined by (8.46b), cannot be exactly the same as $L^{(\mathrm{wall})}$ because the candle would contribute some small average radiance to $L^{(\mathrm{exp})}$, but this could easily turn out to be negligible, justifying the approximation in (8.48).
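Having divided $L$ into $L^{(\mathrm{exp})}$ and $L^{(\mathrm{unk})}$, we can revisit the minimization conditions for the sampling and mirror-misalignment noise in Eqs. (8.41f) and (8.42g). Substituting Eqs. (8.47i) and (8.47j) into (8.41f) and (8.42g) gives
$$L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|) \cong \tau_f(|\sigma|)\,L^{(\mathrm{exp})}_{\mathrm{FOV}}(|\sigma|) + \tau_f(|\sigma|)\,L^{(\mathrm{unk})}_{\mathrm{FOV}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|) \qquad (8.49a)$$
and
$$L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|) \cong \tau_f(|\sigma|)\,L^{(\mathrm{exp})}_{\mathrm{mnf}}(|\sigma|) + \tau_f(|\sigma|)\,L^{(\mathrm{unk})}_{\mathrm{mnf}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) . \qquad (8.49b)$$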
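Now we can make sense of the situation that occurs when instruments are designed to minimize multiplicative noise like $\mathrm{NEdN}_{\mathrm{tilt}}$ and $\mathrm{NEdN}_{\mathrm{samp}}$. Substituting (8.47i) into formula (8.42f) gives
$$Z_{\mathrm{FOV}}(\sigma) = \left(\frac{WA\,\Delta\Omega}{4}\right) R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|) \cdot \left\{ \tau_f(|\sigma|)\left[ L^{(\mathrm{exp})}_{\mathrm{FOV}}(|\sigma|) + L^{(\mathrm{unk})}_{\mathrm{FOV}}(|\sigma|) \right] + L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|) \right\}$$
or
$$Z_{\mathrm{FOV}}(\sigma) = \left(\frac{WA\,\Delta\Omega}{4}\right) R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|) \cdot \left\{ \tau_f(|\sigma|)\,L^{(\mathrm{unk})}_{\mathrm{FOV}}(|\sigma|) + \left[ \tau_f(|\sigma|)\,L^{(\mathrm{exp})}_{\mathrm{FOV}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|) \right] \right\} . \qquad (8.50a)$$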
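This can be put into formula (8.45d) for the noise-free signal to get
$$z_C^{(\mathrm{tot})}(\chi) = \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{\mathrm{ma}})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,e^{2\pi i\sigma\chi} \cdot \left\{ \tau_f(|\sigma|)\,L^{(\mathrm{unk})}_{\mathrm{FOV}}(|\sigma|) + \left[ \tau_f(|\sigma|)\,L^{(\mathrm{exp})}_{\mathrm{FOV}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|) \right] \right\} d\sigma ,$$
which we can write as
$$z_C^{(\mathrm{tot})}(\chi) = z_C^{(\mathrm{unk})}(\chi) + z_C^{(\mathrm{exp})}(\chi) , \qquad (8.50b)$$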
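where
$$z_C^{(\mathrm{unk})}(\chi) = \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{\mathrm{ma}})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,L^{(\mathrm{unk})}_{\mathrm{FOV}}(|\sigma|)\,e^{2\pi i\sigma\chi}\,d\sigma \qquad (8.50c)$$
and
$$z_C^{(\mathrm{exp})}(\chi) = \frac{WA\,\Delta\Omega}{4}\int_{-\infty}^{\infty} H(u\sigma)\,M(R\sigma\theta_{\mathrm{ma}})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,e^{2\pi i\sigma\chi} \cdot \left[ \tau_f(|\sigma|)\,L^{(\mathrm{exp})}_{\mathrm{FOV}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{FOV}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{FOV}}(|\sigma|) \right] d\sigma . \qquad (8.50d)$$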
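Now, if the interferometer is built so that (8.49c) holds true, then all that can happen is that $z_C^{(\mathrm{exp})}$ disappears, reducing $z_C^{(\mathrm{tot})}$ in (8.50b) to
$$z_C^{(\mathrm{tot})}(\chi) = z_C^{(\mathrm{unk})}(\chi) . \qquad (8.50e)$$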
The zC(unk) component of the noise-free signal is, however, all we really cared about in the first
place. The zC(exp) expected signal component is already known—it provides no new information
because that part of the signal is expected to be there every time the experiment is done. Hence an
interferometer can be designed so that approximations (8.49c) and (8.49d) hold true without
affecting the relevant part of the signal passing through the instrument. Now that there is no
concern about decreasing the quality of the measurement, Eq. (8.47j) can be substituted into
formula (8.41e) to get
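$$Z_{\mathrm{mnf}}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|) \cdot \left[ \tau_f(|\sigma|)\,L^{(\mathrm{exp})}_{\mathrm{mnf}}(|\sigma|) + \tau_f(|\sigma|)\,L^{(\mathrm{unk})}_{\mathrm{mnf}}(|\sigma|) + L^{(\mathrm{fore})}_{\mathrm{mnf}}(|\sigma|) - L^{(\mathrm{back})}_{\mathrm{mnf}}(|\sigma|) \right] .$$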
Condition (8.49d) can then be applied to minimize multiplicative noise such as NEdNtilt and
NEdNsamp, leading to
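$$Z^{(\min)}_{\mathrm{mnf}}(\sigma) = \frac{WA\,\Delta\Omega}{4}\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)\,L^{(\mathrm{unk})}_{\mathrm{mnf}}(|\sigma|) . \qquad (8.50f)$$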
This is now substituted into formulas (8.41a) and (8.41b) for NEdNtilt to get
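$$\mathrm{NEdN}^{(\min)}_{\mathrm{tilt}} = \left[ \frac{8\pi\,J^{(\min,\theta^{2})}(\sigma)}{\Delta\Omega\,M(R\sigma\theta_{\mathrm{rms}})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} \right] , \qquad (8.50g)$$
where
$$J^{(\min,\theta^{2})}(\sigma) = \frac{1}{4}\int_{-\infty}^{\infty} p^{(\theta^{2})}_{nn}(\sigma') \left[ (\sigma+\sigma')^{2}\,Z^{(\min)}_{\mathrm{mnf}}(\sigma+\sigma') + (\sigma-\sigma')^{2}\,Z^{(\min)}_{\mathrm{mnf}}(\sigma-\sigma') \right]^{2} d\sigma' . \qquad (8.50h)$$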
Equation (8.50f) can also be substituted into formulas (8.41c) and (8.41d) to get
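$$\mathrm{NEdN}^{(\min)}_{\mathrm{samp}}(\sigma) = \frac{4\,J^{(\min,s)}(\sigma)}{A\,\Delta\Omega\,H(u\sigma)\,M(R\sigma\theta_{\mathrm{ma}})\,R(|\sigma|)\,\eta(\sigma)\,\tau_a(|\sigma|)\,\tau_f(|\sigma|)} , \qquad (8.50i)$$
with
$$J^{(\min,s)}(\sigma) = \pi^{2}\int_{-\infty}^{\infty} p^{(s)}_{nn}(\sigma') \left| e^{-i\psi(\sigma)}(\sigma-\sigma')\,H\big(u(\sigma-\sigma')\big)\,M\big(R(\sigma-\sigma')\theta_{\mathrm{ma}}\big)\,Z^{(\min)}_{\mathrm{mnf}}(\sigma-\sigma') - e^{i\psi(\sigma)}(\sigma+\sigma')\,H^{*}\big(u(\sigma+\sigma')\big)\,M\big(R(\sigma+\sigma')\theta_{\mathrm{ma}}\big)\,Z^{(\min)}_{\mathrm{mnf}}(\sigma+\sigma') \right|^{2} d\sigma' . \qquad (8.50j)$$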
There is no guarantee, of course, that the interferometer will always be used under the
conditions for which it is designed, or even that it is possible to design the interferometer so that
the minimizing conditions in (8.49c) and (8.49d) are satisfied. We know, for example, that
detector noise dominates the random-error budgets of most well-designed interferometers.
According to the discussion at the beginning of Sec. 6.15 of Chapter 6, many detectors operate
under close to ideal conditions, so any increase in background radiance L(back) needed to satisfy
(8.49c) and (8.49d) can easily end up increasing the NEdN(det) detector noise more than it
decreases the NEdNtilt and NEdNsamp multiplicative noise. Perhaps, then, it is best just to note that
for any Fourier-transform spectrometer
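$$\mathrm{NEdN}_{\mathrm{tilt}} \ge \mathrm{NEdN}^{(\min)}_{\mathrm{tilt}} \qquad (8.51a)$$
and
$$\mathrm{NEdN}_{\mathrm{samp}} \ge \mathrm{NEdN}^{(\min)}_{\mathrm{samp}} , \qquad (8.51b)$$
with $\mathrm{NEdN}^{(\min)}_{\mathrm{tilt}}$ and $\mathrm{NEdN}^{(\min)}_{\mathrm{samp}}$ specified by Eqs. (8.50g) and (8.50i) above.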
Inequalities such as the ones in (8.51a) and (8.51b) can be very useful. If a proposed interferometer design, with a good guess of $L^{(\mathrm{unk})}_{\mathrm{FOV}}$ based on Eq. (8.46d), produces unacceptably large values for $\mathrm{NEdN}^{(\min)}_{\mathrm{tilt}}$ and $\mathrm{NEdN}^{(\min)}_{\mathrm{samp}}$, then the design fails, because there is no way the true NEdNs of the actual instrument can be smaller. The multiplicative noise in the system must be reduced before further progress can be made.
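The screening logic implied by (8.51a) and (8.51b) is simple enough to sketch in a few lines. The function name `design_fails`, the per-wavenumber arrays, and the single allowed-NEdN threshold are assumptions made for illustration; the values of the lower bounds themselves would have to come from Eqs. (8.50g) and (8.50i), which are not evaluated here.

```python
import numpy as np

def design_fails(nedn_tilt_min, nedn_samp_min, nedn_allowed):
    """Return True if either lower bound exceeds the allowed NEdN anywhere.

    nedn_tilt_min, nedn_samp_min : lower bounds sampled over wavenumber
    nedn_allowed                 : largest NEdN the application can tolerate
    """
    tilt = np.asarray(nedn_tilt_min, dtype=float)
    samp = np.asarray(nedn_samp_min, dtype=float)
    # The true NEdNs can only be larger than these bounds, so one
    # violation anywhere in the band is enough to reject the design.
    return bool(np.any(tilt > nedn_allowed) or np.any(samp > nedn_allowed))

# Hypothetical numbers in arbitrary radiance units.
verdict_ok = design_fails([0.2, 0.3], [0.1, 0.25], nedn_allowed=0.5)
verdict_bad = design_fails([0.2, 0.7], [0.1, 0.25], nedn_allowed=0.5)
```

The test is one-sided. A `True` result rules the design out, but a `False` result only means the design has not yet been ruled out, since the actual tilt and sampling NEdNs may sit well above their minimum values.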
BIBLIOGRAPHY
Articles
Bell, E. E., and Sanderson, R. B. “Spectral Errors Resulting From Random Sampling-Position
Errors in Fourier Transform Spectroscopy.” Applied Optics, 11, no. 3 (March 1972), pp.
688–689.
Cohen, D. “Characterization of a Space-Class Fourier Transform Spectrometer Against a
Detailed Performance Model.” IEEE Aerospace Conference at Snowmass, CO (March
1999).
Cohen, D. “Noise-Equivalent Change in Radiance for Misalignment Noise in a Double-Sided
Interferogram.” Applied Optics, 42, no. 31 (1 November 2003), pp. 6292–6304.
Cohen, D. “Noise-Equivalent Change in Radiance for Sampling Noise in a Double-Sided
Interferogram.” Applied Optics, 42, no. 13 (1 May 2003), pp. 2289–2300.
Cohen, D. “Performance Degradation of a Michelson Interferometer When Its Misalignment
Angle Is a Rapidly Varying Random Time Series.” Applied Optics, 36, no. 18 (20 June
1997), pp. 4034–4041.
Cohen, D. “Performance Degradation of a Michelson Interferometer Due to Random Sampling
Errors.” Applied Optics, 38, no. 1 (1 January 1999), pp. 139–151.
Forman, Michael L., W. Howard Steel, and George A. Vanasse. “Correction of Asymmetric
Interferograms Obtained in Fourier Spectroscopy.” Journal of the Optical Society of
America, 56, no. 1 (January 1966), pp. 59–63.
Haschberger, Peter. “Impact of the Sinusoidal Drive on the Instrumental Line Shape Function of
a Michelson Interferometer with Rotating Retroreflector.” Applied Spectroscopy, 48, no. 3
(1994), pp. 307–315.
Hirschfeld, Tomas. “Multiple Order Spectra in Fourier Transform Infrared Spectroscopy.”
Applied Optics, 16, no. 7 (July 1977), pp. 1905–1907.
Kauppinen, Jyrki, and Pekka Saarinen. “Line-Shape Distortions in Misaligned Cube Corner
Interferometers.” Applied Optics, 31, no. 1 (January 1992), pp. 69–73.
Lambert, D. K., and P. L. Richards. “New Results in the Theory of a Plane-Mirror
Interferometer.” Journal of the Optical Society of America, 68, no. 8 (August 1978), pp.
1124–1130.
Learner, R. C. M., A. P. Thorne, and J. W. Brault. “Ghosts and Artifacts in Fourier-Transform
Spectrometry.” Applied Optics, 35, no. 16 (June 1996), pp. 2947–2953.
Loewenstein, Ernest V. “Fourier Spectroscopy: An Introduction.” Proceedings of the Aspen
International Conference on Fourier Spectroscopy, Aspen, CO (March 16–20, 1970), pp.
3–17.
Books
Abramowitz, Milton, and Irene A. Stegun (eds.). Handbook of Mathematical Functions (National
Bureau of Standards, Applied Mathematics Series 55, Washington, DC, 1964).
Bass, Michael (ed.). Handbook of Optics, Vols. I and II, 2nd ed. (Optical Society of America,
McGraw-Hill, Inc., New York, 1995).
Batygin, V. V., and I. N. Toptygin. Problems in Electrodynamics (Academic Press, New York,
1964).
Beer, Reinhard. Remote Sensing by Fourier Transform Spectrometry (John Wiley & Sons, Inc.,
New York, 1992).
Beers, Yardley. Introduction to the Theory of Error, 2nd ed. (Addison-Wesley Publishing
Company, Inc., Reading, MA, 1957).
Bennett, Jean M., and Lars Mattson. Introduction to Surface Roughness and Scattering (Optical
Society of America, Washington, DC, 1989).
Blake, Ian F. An Introduction to Applied Probability (John Wiley & Sons, Inc., New York, 1979).
Bois, G. Petit. Tables of Indefinite Integrals (Dover Publications, Inc., New York, 1961),
unabridged translation of a book first published by B. G. Teubner in 1906.
Born, Max, and Emil Wolf. Principles of Optics: Electromagnetic Theory of Propagation,
Interference, and Diffraction of Light, 7th exp. ed. (Cambridge University Press, New
York, 1999).
Bracewell, Ron. The Fourier Transform and Its Applications (McGraw-Hill Book Company,
New York, 1965).
Chamberlain, John. The Principles of Interferometric Spectroscopy (John Wiley & Sons, New
York, 1979).
Champeney, D. C. A Handbook of Fourier Theorems (Cambridge University Press, New York,
1987).
Chandrasekhar, S. Radiative Transfer (Dover Publications, Inc., New York, 1960), slightly
revised from 1950 book.
Cohen, D. Demystifying Electromagnetic Equations: A Complete Explanation of EM Unit
Systems and Equation Transformations (SPIE Press, Bellingham, WA, 2001).
Davenport, Wilbur B., Jr., and William L. Root. An Introduction to the Theory of Random Signals
and Noise (McGraw-Hill Book Company, Inc., New York, 1958).
Davis, Sumner P., Mark C. Abrams, and James W. Brault. Fourier Transform Spectrometry
(Academic Press, New York, 2001).
Defense Supply Agency, Standardization Division. Military Standardization Handbook Optical
Design, MIL-HDBK-141, 5 October 1962.
Dereniak, Eustace L., and Devon G. Crowe. Optical Radiation Detectors (John Wiley & Sons,
Inc., New York, 1984).
Ditchburn, R. W. Light, Vols. 1 and 2, 2nd ed. (Interscience Publishers, a division of John Wiley
& Sons, Inc., New York, 1963).
Evans, Merran, Nicholas Hastings, and Brian Peacock. Statistical Distributions, 2nd ed. (John
Wiley & Sons, Inc., New York, 1993).
Eyges, Leonard. The Classical Electromagnetic Field (Dover Publications, Inc., New York,
1980), an unabridged and corrected edition of 1972 book published by Addison Wesley.
Francon, M. Optical Interferometry (Academic Press, New York, 1966).
Freeman, J. J. Principles of Noise (John Wiley & Sons, Inc., New York, 1958).
Gabel, Robert A., and Richard A. Roberts. Signals and Linear Systems, 2nd ed. (John Wiley and
Sons, Inc., New York, 1980).
Gaskill, Jack D. Linear Systems, Fourier Transforms, and Optics (John Wiley & Sons, Inc., New
York, 1978).
Goldstein, D. Polarized Light, 2nd ed. (Marcel Dekker, Inc., New York, 2003).
Goodman, Joseph W. Introduction to Fourier Optics (McGraw-Hill, Inc., New York, 1988),
reissue of 1968 book.
Goodman, Joseph W. Statistical Optics (John Wiley & Sons, New York, 1985).
Goody, R. M., and Y. L. Yung. Atmospheric Radiation: Theoretical Basis, 2nd ed. (Oxford
University Press, New York, 1989).
Gradshteyn, I. S., and I. M. Ryzhik. Table of Integrals, Series, and Products, 5th ed., edited by
Alan Jeffrey (Academic Press, New York, 1994).
Griffiths, David J. Introduction to Electrodynamics, 2nd ed. (Prentice-Hall, Englewood Cliffs,
NJ, 1989).
Griffiths, Peter R., and James A. de Haseth. Fourier Transform Infrared Spectrometry (John
Wiley and Sons, Inc., New York, 1986).
Heavens, O. S. Optical Properties of Thin Solid Films (Butterworths Scientific Publications,
London, 1955).
Hecht, Eugene. Optics, 2nd ed., with contributions by Alfred Zajac (Addison-Wesley Publishing
Company, Reading, MA, 1987).
Helstrom, Carl W. Statistical Theory of Signal Detection, 2nd ed. (Pergamon Press, New York,
1968).
Jackson, John David. Classical Electrodynamics, 3rd ed. (John Wiley & Sons, Inc., New York,
1999).
Jaffe, Bernard. Michelson and the Speed of Light (Anchor Books, Doubleday and Company, Inc.,
New York, 1960).
Jeffrey, Alan. Handbook of Mathematical Formulas and Integrals (Academic Press, Inc., New
York, 1995).
Jenkins, F., and H. White. Fundamentals of Optics, 3rd ed. (McGraw-Hill Book Company, New
York, 1957).
Kay, Steven M. Fundamentals of Statistical Signal Processing: Estimation Theory (PTR Prentice
Hall, Inc., Englewood Cliffs, NJ, 1993).
Keigo, Iizuka. Engineering Optics, rev. translation of the 2nd original Japanese ed. (Springer-
Verlag, New York, 1983).
Klambauer, Gabriel. Aspects of Calculus (Springer-Verlag, New York, 1986).
Klein, Miles V. Optics (John Wiley & Sons, Inc., New York, 1970).
Kusse, Bruce, and Eric Westwig. Mathematical Physics: Applied Mathematics for Scientists and
Engineers (John Wiley and Sons, New York, 1998).
Lamb, H. Hydrodynamics (Dover Publications, Inc., New York, 1945), copy of the 6th ed. first
published in 1879.
Landau, L. D., and E. M. Lifshitz. Electrodynamics of Continuous Media, translated from the
Russian by J. B. Sykes and J. S. Bell (Pergamon Press, New York, 1960).
Landau, L. D., and E. M. Lifshitz. The Classical Theory of Fields, 3rd rev. English ed., translated
from the Russian by Morton Hamermesh (Pergamon Press, New York, 1971).
Lathi, B. P. An Introduction to Random Signals and Communication Theory (International
Textbook Company, Scranton, PA, 1968).
Lighthill, M. J. Introduction to Fourier Analysis and Generalized Functions (Cambridge
University Press, New York, 1958).
Index
1/f noise, 6.7, 764-767 430, 444, 448, 470, 522-526, 555
beam splitter, 2, 3, 7, 12, 14, 18, 19, 22, 24, 26, 28, 31, 42,
A 44, 46, 47, 54, 58, 59, 355, 394-400, 406, 407, 411-415,
A/D converter. See analog-to-digital converter 456, 464, 466, 467, 470, 472, 474, 478, 479, 481, 489, 536-
absolutely integrable, 69, 74, 75, 108 538, 543, 574, 575, 577, 585, 586, 602, 608, 698, 749
absorption line, 742, 806, 1022. See also emission line Bessel function, 456, 462, 485, 530, 630, 871
AC coupling, 619, 630, 750, 752, 757, 880 bias angle, 922, 927-929, 931, 939
additive noise, 1016, 1028 bias-tilt angle, 870, 912, 948, 951, 1030
aft optics, 5.8, 605, 607-609, 618, 626, 628-630, 632, 635, black-body, 559, 605, 756, 930-932, 934, 939, 987, 991
659, 700, 986, 1025 black-body spectrum, 8.8, 929, 931-933, 986-988, 991, 992,
alias, 195, 200, 716-720, 723 994, 996, 1007
aliasing, 188, 195-198, 200, 218, 708, 709, 715, 716, 851, BLIP, 807, 814
853, 854 Boltzmann's constant, 559
amplitude-reflection coefficient, 368, 407, 412, 413, 467, bounded function, 69, 71
473, 475
amplitude-transmission coefficient, 357-359, 362, 367, 407, C
473 cadmium red line, 24, 28
analog-to-digital converter, 8.1, 8.2, 555, 696, 697, 749, 849, calibration, 5.19, 465, 487, 488, 555, 681, 683, 685, 704,
850, 953, 954 725, 726, 742, 753, 762, 764, 766, 767, 782, 784, 822, 843,
angle-wavenumber transform, 4.8, 380, 382, 386, 391, 393, 853, 893, 932, 953, 954, 966, 987, 1020, 1021
394 calibration algorithm, 685, 686, 762, 782-785, 806, 820,
anti-aliasing filter, 6.22, 7.4, 555, 849, 853, 879, 880 853, 865, 891, 892, 964-967, 1016, 1019
anti-Hermitian function, 101, 220, 222 Cassegrain telescope, 385, 386, 389
apodization, 5.16, 650, 654-656 cat's-eye, 54
apodizing, 654, 656 Cauchy, Augustin, 1
approximation, gray-body. See gray-body approximation Cauchy principal value, 82, 118-122, 140, 142-144, 158
artificially created even signals, 3.27, 319 causal system, 729
autocorrelation, 3.13, 249, 319, 322, 523, 903 central dark fringe, 12, 14, 22, 26
autocorrelation function, 8.3, 223, 250, 251, 258, 274, 275, central fringe, 12, 14, 31
277, 278, 280, 281, 284, 285, 288, 290, 299, 301, 304, 319, central limit theorem, 3.11, 227, 243, 246, 248
329, 448, 522, 524, 791, 794, 860-862, 903, 904, 912, 948, characteristic function, 231, 267
956, 957, 977, 1012, 1014 co-adding, 764
autocovariance, 3.13, 249-251, 258 coefficient
avoidable misalignment noise, 7.7, 895 amplitude-reflection, 368, 407, 412, 413, 467, 473, 475
avoidable noise, 6.8, 767-770, 787, 788, 843, 848, 901, 1016 amplitude-transmission, 357-359, 362, 367, 407, 473
power reflection, 478, 481
B power transmission, 478, 481
background-limited infrared proton, 807. See also BLIP compensator plate, 7, 14, 394-397, 399, 412, 456, 466, 467,
background radiance, 4.18, 5.13, 6.3, 6.4, 6.5, 465, 466, 468, 473, 474, 489, 533-536, 538-541, 543-546, 549, 551, 553,
473, 474, 476, 479-481, 483-486, 555, 587, 626, 628-631, 554
639-641, 664, 681, 686, 698, 752, 753, 755, 758, 782, 806, complex scalar field, 403-409, 476
852, 911, 930, 934, 935, 941, 986, 1006, 1007, 1021, 1025, complex vector field, 335, 490-498
1028, 1034, 1038 constructive interference, 46
balanced background signal, 330, 464, 474, 475, 585 continuous function, 49, 64, 65, 161, 188, 195, 200, 202,
balanced output, 46, 47, 49, 54, 438 204, 217, 727
balanced radiation field, 4.15, 394, 415, 417, 438, 552 convolution, 110-115, 130, 132, 161, 162, 175, 202, 204,
balanced signal, 4.16, 5.4, 56, 453, 454, 464, 465, 551, 573, 218, 282, 284, 312, 313, 625, 626, 646, 650, 679, 683, 684,
575, 582-585, 587, 588, 591, 594, 599, 602, 603, 605, 606, 702, 703, 727, 738, 741, 771, 772, 774, 775, 777, 778, 823,
608, 610, 611, 616, 630-632 826, 830, 832-837, 858, 879-883, 887-889, 900, 908, 910,
band-limited function, 200, 393 922, 924, 925, 939, 941, 963, 969, 970, 984, 1002, 1013,
band-limited radiation, 4.10, 390, 428 1028, 1034
band-limited white noise, 3.25, 6.13, 299, 301, 767, 795, three-dimensional, 215
798, 807, 808, 812, 817, 924 two-dimensional, 211, 212, 214-216
bandwidth, 301, 795, 808 corner cube, 54, 55, 56
beam-chopped radiation, 4.9, 4.14, 383-385, 391, 394, 427, corner frequency, 766
correlated random variable, 239, 265, 913, 914 766, 795, 812, 819, 842, 846, 903
cosine curve, 66, 67, 218 double-sided signal, 6.8, 6.10, 6.14, 6.16, 7.5, 7.11, 682,
cosine transform, 2.2, 2.4, 67, 68, 70, 73-75, 80, 81, 83-86, 767, 772, 787, 789, 800, 814, 815, 817, 842, 843, 845, 848,
89-91, 93, 95, 96, 98, 103, 218, 463, 464, 780 882, 884, 889, 890, 909, 955
coupling, AC. See AC coupling
covariance, 237, 262 E
covariance stationary, 258 Earth's orbital velocity, 14, 19, 23
cross-correlation function, 259, 281, 283, 913, 948, 950, 951 effective spectrum, 5.11, 5.12, 622-624, 644, 645, 650, 663,
cross-power spectrum, 281, 282, 286, 913, 914, 948-952 665, 666, 668, 674, 677, 681, 686, 690, 692, 700, 702, 708
curl, 331, 335, 492, 493 Einstein, Albert, 1, 23
elastic vibrations, 2, 43
D electric field, 522
D*, 813, 820 electromagnetic radiation, 5.1, 55, 372, 394, 555, 556, 611,
D-limited Fourier transform, 779, 792, 800, 840, 889, 898, 810
900, 901, 961 electromagnetic wave, 4.1, 4.2, 329, 330, 335, 339, 360,
D-star, 813 428, 432, 489
delta function, 144-146, 148-154, 157, 158, 161, 162, 169, emission line, 8.9, 742, 930, 934-936, 939, 941, 985, 996,
172, 191-193, 216-218, 229, 279, 301, 312, 431, 445, 474, 998, 999, 1002, 1006, 1007, 1022
580, 581, 620, 626, 647, 727, 729, 813, 819, 999, 1007, ensemble, 3.14, 251-253, 279, 280, 303, 765, 798, 800, 869
1012-1014, 1016 ensemble average, 271, 272, 274, 278, 280
nth derivative of, 154 ergodic, 272, 274, 277, 279, 791
dependent random variable, 3.5, 3.9, 223, 263 in the autocorrelation function, 274, 275, 277, 278
derivative in the mean, 271, 272, 274, 277-279
of a generalized function, 130 in the variance, 277, 278
of the delta function, 153, 154 random function, 3.18, 271, 274, 275, 279, 280
destructive interference, 46 ergodicity, 223, 279, 280, 301, 329
detector circuit, 5.10, 6.9, 6.22, 7.4, 617-619, 621, 622, 624- ether, 1, 2, 14, 23, 31, 330
626, 630, 636-641, 643, 656, 667, 668, 674, 681, 685, 686, drift, 23
696, 698, 699, 727, 730, 748, 750, 768-771, 777, 821, 822, luminiferous, 1, 2, 14
849, 853, 879, 880, 897 stationary, 14, 23
detector NEdN, 8.11, 1024 wind, 1, 1.2, 14, 23, 24, 26, 54
detector noise, 6.6, 6.9, 6.10, 6.12, 6.13, 6.14, 6.17, 6.18, even function, 2.3, 51, 76, 77, 79-84, 86, 88, 111, 114, 119,
6.19, 6.20, 726, 742, 763, 764, 766-770, 772, 786-789, 121, 128, 135, 139, 140, 206, 268, 281, 282, 288, 296, 303,
791, 792, 794, 795, 800, 806, 807, 814, 815, 817, 819-821, 307, 319, 327, 438, 454, 460-463, 471, 477, 483, 485-488,
823, 828, 829, 840, 844-846, 848, 849, 853, 894, 941, 953, 575, 577, 580, 584, 589, 590, 601, 603, 604, 606, 610, 613-
964, 1016, 1027, 1031, 1038 615, 617, 624-627, 630, 631, 633-635, 655, 668, 672-674,
detector responsivity, 611, 612, 632, 751, 759, 808, 874 680, 684, 703, 729, 732, 746, 767, 768, 775, 786, 787, 790,
detector signal, 5.9, 611, 618, 626, 629, 698, 768, 822, 874 792, 800, 815, 818, 819, 826-828, 840, 843, 844, 848, 891,
DFT, 182, 183, 185, 187, 188, 190, 192, 195, 197, 699, 849, 895-897, 899, 900, 904, 913, 924, 945, 950, 952, 957, 968,
851. See also discrete Fourier transform 969, 971, 976, 1033, 1034
Dirac delta function, 144 expectation operator, 3.4, 3.10, 230, 232, 239-243, 245, 247,
direction-chopped radiation, 500, 555 248, 250, 258-260, 266, 268, 270, 271, 283, 284, 290, 314,
direction cosines, 339 318, 321, 329, 432, 440, 445, 745, 761, 781, 809, 814, 817,
discrete Fourier transform, 5.23, 62, 173, 181, 182, 218, 555, 841, 842, 878, 889, 890, 892, 902, 915, 916, 918, 921, 956,
699, 704, 708, 709, 713-716, 720, 722, 723, 849. See also 962, 963, 965, 971, 977, 981, 1017, 1022, 1023
DFT
distribution theory, 121 F
distributions, 121, 246, 870, 873, 914 fast-Fourier transform, 55, 96, 699
divergence, 335 Fellgett advantage, 55
divergent integral, 117 FFT, 55, 188, 699. See also fast-Fourier transform
dot product, 349, 373, 405, 490, 494, 495 field of view, 22, 26, 28, 31, 330, 453-455, 460, 461, 472,
double-sided interferogram, 5.15, 555, 643, 646, 650, 667, 483, 485, 573, 588, 594, 601, 603, 605, 608, 612, 626, 627,
677, 680-683, 701, 742, 850, 853, 865, 953, 959, 975 630, 637, 639, 645, 656, 659-661, 665, 667, 683, 686, 731,
double-sided NEdN, 848 753, 754, 756-758, 809, 857, 931, 986, 1034
double-sided power spectrum, 289, 296, 297, 455, 474, 483, filter theory, 117
finite field of view, 5.17, 656, 673, 676, 684, 685, 726, 744, delta, 144-146, 148-154, 157, 158, 161, 162, 169, 172,
775, 783, 857, 859, 887, 891, 894, 935, 936, 991, 998, 191-193, 216-218, 229, 279, 301, 312, 431, 445, 474,
1026, 1032, 1034 580, 581, 620, 626, 647, 727, 729, 813, 819, 999, 1007,
finite variation, 69, 73, 141 1012-1014, 1016
fixed mirror, 24, 26, 27, 33, 35, 44, 58, 394-397, 399, 412, Dirac delta, 144
466, 553, 574, 667, 692, 696, 698, 749 even, 2.3, 51, 76, 77, 79-84, 86, 88, 111, 114, 119, 121,
focal plane, 385, 387, 594-597, 599, 600, 688 128, 135, 139, 140, 206, 268, 281, 282, 288, 296, 303,
fore optics, 607-609, 618, 626-628, 634, 635, 749, 934, 986, 307, 319, 327, 438, 454, 460-463, 471, 477, 483, 485-
1006 488, 575, 577, 580, 584, 589, 590, 601, 603, 604, 606,
Fourier convolution theorem, 2.9, 2.17, 110, 112, 114, 115, 610, 613-615, 617, 624-627, 630, 631, 633-635, 668,
159, 160, 162, 176, 202, 204, 212-216, 218, 286, 292, 311, 672-674, 680, 703, 729, 732, 746, 767, 768, 775, 778-
625, 645, 646, 655, 679, 728, 773, 777, 778, 824, 829, 831, 780, 786, 790, 792, 800, 815, 819, 826-828, 840, 843,
885, 889, 906, 907, 919, 960, 975, 979, 1015 844, 848, 891, 895-897, 899, 900, 904, 913, 924, 945,
Fourier identities, 2.8, 103, 209 950, 952, 957, 968, 969, 971, 976, 1033, 1034
Fourier scaling theorem, 107, 210, 211 generalized, 2.11, 2.13, 2.17, 62, 121-130, 132, 136-139,
Fourier series, 2.20, 62, 173, 177-179, 181 141-145, 148, 152, 155, 156, 159-162, 167-170, 172,
Fourier shift theorem, 106, 209, 725, 1015 175, 218
Fourier transform, 2.1, 2.5, 2.6, 2.7, 2.10, 2.13, 2.25, 3.23, Hermitian, 101, 102, 219, 221, 282, 286-288, 372, 419,
31, 50-52, 54, 57-59, 62, 70, 76, 89, 93-107, 109, 112, 114, 420, 425, 429, 624, 625, 668, 678, 729, 731, 786, 822,
115, 117-122, 124, 136-142, 144, 157, 167, 168, 171, 173, 825, 880, 962, 968, 976
176, 178, 181, 182, 188, 194, 197, 200, 202, 204, 207-210, impulse-response, 282, 285, 287, 625, 626, 727-730, 770,
213-215, 218, 231, 281, 282, 285-290, 292, 297-299, 302, 822, 879
303, 310, 311, 371, 372, 381-384, 391, 393, 426, 447, 449, instrument line-shape, 114, 115, 648
451, 456, 464, 488, 525, 605, 610, 614, 620, 623, 625, 626, instrument-response, 114, 115, 647, 728, 729
639-641, 643-646, 650, 654-656, 677-680, 683, 699, 704, mixed, 2.3, 76, 77
708, 709, 715, 728-730, 754, 756, 757, 772-775, 777-780, odd, 2.3, 76, 77, 79-85, 88, 90, 111, 119-121, 128, 129,
786-788, 790, 792, 794, 822, 824, 826, 829-831, 833-835, 139, 140, 142, 148, 158, 218, 228, 266, 267, 303, 314,
837, 838, 840, 849, 860, 862, 866, 880, 884-886, 888, 889, 459, 463, 488, 604, 615, 634, 674, 732, 788, 800, 899,
895, 896, 899, 904, 913, 914, 919, 920, 922, 957, 959, 961, 946, 947, 950, 968
962, 974, 975, 979, 980, 1029, 1030 random, 3.2, 3.13, 3.15, 3.23, 3.26, 223-225, 242, 249,
Fourier transform of generalized functions, 144, 159, 167, 250, 252, 253, 257-261, 271-275, 277-282, 284, 287-
168 290, 296, 297, 299, 301-303, 319, 328, 432, 438, 522,
Fourier transform of the delta function, 2.16, 157 523, 526, 535, 744, 746, 747, 760-766, 780, 792, 798,
Fourier transform of the shah function, 2.19, 165, 171 800, 815, 840, 844-847, 860, 869, 871, 873, 874, 876,
Fourier transform pairs, 143, 159, 160, 206, 319, 677, 703 877, 882, 892, 903, 911, 912, 914, 951, 953, 956, 962,
frequency, 24, 31, 34, 37, 39-41, 43, 45, 47, 49, 51, 67, 94- 988, 1012, 1013
96, 107, 108, 115, 121, 188, 190, 192, 195, 196, 198, 200, stationary random, 3.15, 252, 260, 261, 271, 279, 282,
201, 203, 224, 249, 289, 297, 298, 314, 319, 533, 534, 538, 287, 303, 319, 791, 861, 862
557, 558, 560, 723, 766, 808-811, 813, 819, 820, 853-855, tapering, 678, 822, 825, 827
931. See also Nyquist frequency test, 121-133, 135, 136, 138, 141, 142, 144-148, 151-154,
Fresnel, Augustin, 1 161-164, 168-171, 173-175
fringe, 1.6, 12, 14, 22-24, 26, 28-30, 47, 50, 52, 58 transfer, 285-287, 620-622, 624, 645, 661, 668, 674, 681,
central. see central fringe 728-731, 733, 777, 778, 821, 822, 825, 831, 853, 880,
central dark. see central dark fringe 888, 923, 931, 968, 976, 991, 994, 996
fringe shift, 23, 34, 54 functional, 121-123, 126, 130, 144, 145
function
anti-Hermitian, 101, 220, 222 G
autocorrelation, 8.3, 223, 250, 251, 258, 274, 275, 277, Gaussian
278, 280, 281, 284, 285, 288, 290, 299, 301, 304, 319, multivariate, 261-263
329, 448, 522, 524, 791, 794, 860-862, 903, 904, 912, probability distribution, 227, 243, 246, 800
948, 956, 957, 977, 1012, 1014 random processes, 3.16, 261, 262, 279
band-limited, 200 generalized function, 2.11, 2.13, 2.17, 62, 121-132, 136,
bounded, 69, 71 137-139, 141-145, 148, 152, 155, 156, 159-162, 167-170,
characteristic, 231, 267 172, 175, 218
continuous, 49, 64, 65, 161, 188, 195, 200, 202, 204, 217 generalized function, derivative of a, 130
cross-correlation, 259, 281, 283, 913, 948, 950, 951
generalized function theory, 62, 143, 144 Jones, 813
generalized limit, 2.12, 132, 133, 135-138, 141, 142, 145- jump discontinuity, 69, 70, 80, 81, 119, 124
147, 157, 160-162, 164, 167-171, 175, 177, 218
geometric optics, 383, 385 K
geometric series, 167, 168, 184 Kronecker delta, 185
ghost line, 939, 941, 996, 998, 1002, 1007
gray-body approximation, 559, 935
Green, George, 1 L
laser-based servo controls, 1.8, 57, 59
light
H monochromatic, 1.3, 3, 24, 28, 31, 580
Hartley transform, 87-89, 93, 99 speed of, 1, 19, 23, 31, 346, 559, 808
Heaviside step function, 155, 156, 310, 320, 322, 828, 835, white, 6, 7, 12, 14, 22, 26, 28, 40, 41, 44, 47, 50
840, 845 linear combination, 125-127, 161, 727
Heidinger rings, 593, 597 linear operation, 2.6, 97, 99, 110, 727, 823
Hermitian function, 101, 102, 219, 221, 282, 286-288, 372, linear operator, 97, 240, 335, 496, 816
419, 420, 425, 429, 624, 625, 668, 678, 729, 731, 786, 822, linear polarization, 4.4, 349-351, 355, 356, 362, 366
825, 880, 962, 968, 976 linear system, 3.21, 282, 285, 287, 288, 727, 729
Hertz, Heinrich, 1 Lorentz, Hendrik, 1
homogeneous, 299, 524 luminiferous ether. See ether, luminiferous
homogeneous random field, 297, 523, 524
I M
magnetic-induction field, 331, 332, 350, 368, 372
ILS, 647. See also instrument line shape
impulse-response function, 282, 285, 287, 625, 626, 727-730, 770, 822, 879
independent random variable, 3.6, 233, 234, 236, 239, 243, 245, 260, 261, 280, 810, 811, 872, 914, 921, 927, 930
index of refraction, 355, 385, 532-534
information theory, 1031
infrared spectra, 55, 58, 464, 501, 626, 641, 752, 1034
instrument line shape, 647, 648
instrument line-shape function, 114, 115, 648
instrument-response function, 114, 115, 647, 728, 729
interference
    constructive, 46
    destructive, 46
interferogram, 5.24, 5.25, 197, 200, 463-465, 487, 488, 579-581, 583-585, 603, 604, 610, 683, 704, 715, 721, 723, 726, 744, 764, 775, 782, 783, 807, 857, 859, 865, 887, 891, 894, 930, 935, 938, 986, 991, 998, 1012, 1026, 1031, 1034
interferogram signal, 5.22, 5.26, 197, 555, 580, 623-625, 642, 643, 650, 654, 656, 666-668, 678, 683, 696, 698, 699, 701, 704, 709, 715, 716, 723, 725, 742, 748, 764, 767, 775, 782, 798, 802, 804, 806, 857, 930, 1012, 1026, 1027
inverse Fourier transform, 2.5, 6.4, 137, 139, 168, 171, 194, 204, 208-211, 213-215, 280, 281, 287, 371, 372, 381, 382, 426, 464, 605, 610, 620, 621, 623-625, 668, 678, 680, 729, 750, 753, 767, 774, 822, 826
inverse-square law, 5.3, 571, 573

J
Jacquinot advantage, 55
jointly normal random variable, 3.17, 263, 266, 271, 914, 917
jointly wide-sense stationary, 259, 281, 913

magnetic permeability, 331
Maxwell, James Clerk, 1
Maxwell's equations, 2, 50, 330, 344, 363, 496, 556
mean, 3.3, 3.8, 3.13, 62, 64, 226-230, 235, 240, 241, 243, 246-250, 260, 262-269, 271, 272, 278, 301, 798, 800, 811, 870, 871, 877, 902, 909, 912, 914, 921, 922, 932, 939, 945-947, 956, 958, 972, 988, 1017, 1022, 1023
mercury green line, 24
Michelson, Albert, 1, 2, 12, 14, 22-24, 28, 31, 42, 49, 50, 52, 54
Michelson-based spectroscopy, 29
Michelson interferometer, 1.1, 1.4, 1.5, 2, 3, 4.11, 4.12, 4.13, 5.4, 5.5, 5.6, 5.7, 5.13, 14, 23, 24, 31, 41, 44, 46, 47, 50, 52, 54, 55, 62, 115, 117, 197, 200, 330, 355, 385, 390, 391, 394, 395, 400, 415, 427, 464, 481, 502, 534, 543, 551, 555, 573, 585, 588, 599, 626, 660, 667, 682, 683, 685, 781, 849, 853, 929
Michelson-Morley experiment, 1
Michelson's mistake, 22
mirror, fixed. See fixed mirror
mirror-misalignment NEdN, 865, 911
mirror-misalignment noise, 7.3, 7.8, 870, 873, 874, 879, 882, 884, 889, 890, 896-898, 909, 964, 967, 1031, 1034, 1035
mirror, moving. See moving mirror
misalignment angle, 7.2, 456, 459, 502, 504, 575, 692, 694, 867, 868, 878, 904, 927, 930, 932, 951, 952, 954, 1030
misalignment NEdN, 7.11, 8.11, 909, 926, 953, 1024
misalignment noise, 7.4, 7.5, 7.6, 7.7, 7.15, 726, 865, 879, 882, 891, 895, 929, 932, 933, 935, 939, 941, 953, 996, 1002, 1006, 1029, 1030
mixed function, 2.3, 76, 77
monochromatic beam, 7, 22, 26, 28, 37, 46
monochromatic light, 1.3, 3, 24, 28, 31, 580
monochromatic plane wave, 4.4, 4.12, 348-353, 357, 360, 362, 363, 368, 373, 395, 400, 403, 406, 411, 412, 415, 416, 465, 478, 500, 532-534, 538-540, 543-547, 549, 551, 554
monochromatic wavetrain, 4.3, 7, 14, 22, 37, 39-42, 44-47, 58, 344, 434, 522, 525
Morley, Edward, 1
moving mirror, 7.2, 24, 26, 28-31, 33, 44-47, 51, 52, 55, 57, 58, 394, 395, 401, 412, 414, 415, 454, 456, 459, 460, 473, 481, 502-504, 507, 510, 546, 547, 551, 552, 554, 574, 575, 577, 588, 591-594, 597, 602, 617, 630, 636, 667, 668, 675, 692, 694, 696, 726, 748, 749, 763, 768, 821, 867, 869, 871, 873, 878, 879, 1012, 1017
multidimensional Wiener-Khinchin theorem, 3.24, 297, 298, 434, 522, 525
multiplicative noise, 1027, 1035, 1037, 1038
multivariate Gaussian, 261-263

N
NEdN, 6.1, 6.16, 8.7, 742-745, 747, 763, 768, 807, 814, 815, 821, 844, 845, 848, 853, 865, 911, 930, 932, 933, 947, 953, 972, 973, 988, 990, 1002, 1007, 1023, 1038
    detector, 6.21, 8.11, 844, 1024
    double-sided, 848
    mirror-misalignment, 865, 911
    misalignment, 7.11, 8.11, 909, 926, 953, 1024
    sampling error, 8.11, 885, 953, 984, 1002, 1024
    sampling-noise, 953, 985, 1002
    single-sided, 848
noise
    additive, 1016, 1028
    avoidable, 6.8, 767-770, 787, 788, 843, 848, 901, 1016
    avoidable misalignment, 7.7, 895
    band-limited, 3.25, 299, 301
    band-limited white, 6.13, 87, 795, 798, 808, 812, 817, 924
    detector, 953, 964, 1016, 1027, 1031, 1038
    mirror-misalignment, 964, 967, 1031, 1034, 1035
    misalignment, 726, 953, 996, 1002, 1006, 1029, 1030
    multiplicative, 1027, 1035, 1037, 1038
    photon, 6.15, 806-808, 812, 814
    quasi-harmonic, 924, 926, 929, 930, 939, 987, 988, 999
    quasi-static sampling, 8.10, 988, 1002, 1007, 1012, 1020, 1022, 1023, 1025
    sampling-position, 8.2, 8.3, 954-958, 976, 987, 988, 990, 996, 998, 1002, 1006, 1012-1014, 1020, 1022
    signal, 6.5, 225, 280, 753, 759, 843, 848, 853, 900, 1026
    unavoidable, 6.8, 767-770, 786-788, 843, 900, 969, 1016
    unavoidable misalignment, 7.7, 895
    white, 223, 301
noise-equivalent change in radiance, 742, 743. See also NEdN
noise-power spectrum, 223, 312, 328, 765, 766, 812, 813, 905, 921, 924-932, 934, 939, 948, 979, 983, 987, 988, 992, 1002, 1012
normal probability distribution, 265, 870, 873, 914, 932, 945-947
nth derivative of the delta function, 154
Nyquist frequency, 190, 192, 196, 197, 201, 203, 704, 716
Nyquist wavenumber, 704, 716-718, 798, 851

O
odd function, 2.3, 76, 77, 79-85, 88, 90, 111, 119-121, 128, 129, 139, 140, 142, 148, 158, 218, 228, 266, 267, 303, 314, 459, 463, 488, 604, 615, 634, 674, 732, 788, 800, 899, 946, 947, 950, 968
off-axis signal, 5.6, 555, 588, 589, 591-593
off-center sampling, 5.26, 723, 822
OPD, 395, 398, 414, 453, 482, 500, 501, 573, 582, 597, 748-750, 757, 762-764, 791, 815, 843, 849, 850, 860, 869, 879, 882, 883, 932, 939, 953-959, 986, 988, 1012. See also optical-path difference
OPD velocity, 619, 636, 748, 792, 795, 813, 879, 931, 987
optical axis, 385-389, 395, 400, 404-407, 416, 425, 453, 456, 465, 534, 544-546, 552, 553, 573, 588, 590, 592-594, 599, 606, 628, 660, 686, 696
optical-path difference, 41, 395, 501, 577, 585, 617, 619, 622, 624, 630, 643, 650, 655, 659, 666, 667, 686, 690, 696, 699, 712-715, 723, 930, 953. See also OPD
oversampling, 5.24, 704, 715, 723, 852-854

P
p-wave, 359, 362, 368, 407, 412, 413
pencil rays, 556-558, 566-568, 570, 571, 573, 575, 577, 584, 588, 590, 597, 606
pencils of rays. See pencil rays
permittivity, 331
photon noise, 6.15, 806-808, 812, 814
photovoltaic, 807, 814
Planck radiation, 559, 931, 932, 986, 991, 1007
Planck's constant, 559, 808
plane of incidence, 353, 355, 357-361, 367-369, 400, 406, 407, 410, 411, 467, 540, 547
plane wave, 4.5, 4.6, 4.13, 344, 346, 350, 352, 353, 355-360, 362, 366, 367, 375, 383, 385, 386, 394, 395, 400, 401, 405-407, 409, 412-417, 425, 451, 465, 467, 478, 556, 570, 571, 573, 594, 596, 597, 599, 602, 606-608, 611, 612, 660
polarization, 54, 350, 454, 480
polarization, linear, 4.4, 349, 356, 358
polychromatic plane wave, 362, 372, 373, 383, 605-607
polychromatic wavefield, 4.7, 368, 369, 428
power reflection coefficient, 478, 481
power spectrum, 3.20, 3.22, 3.23, 8.3, 280-282, 287-290, 296, 297, 299, 301, 305, 311, 319, 320, 525, 579, 580, 591, 592, 594, 617, 766, 767, 791, 794, 795, 798, 819, 846, 860, 862, 864, 956-958, 976, 979, 987, 990, 995, 1002, 1006, 1007, 1013, 1014, 1016, 1020
power transmission coefficient, 478, 481
Poynting vector, 430, 438, 470
principle of independent superposition, 41, 47
prism-based spectrometer, 55
probability density distribution, 798, 800, 870, 871, 914,
927, 932, 945-947, 1023
propagation vector, 338, 349, 353, 354, 362-364, 376, 382, 383, 385, 386, 390, 394, 395, 399-401, 405, 407, 416, 421, 434, 453, 455, 482, 500, 502, 504, 507, 573, 660
pupil function, 452, 456, 460, 522, 528, 529, 531
PV. See photovoltaic

Q
quantum efficiency, 808
quasi-harmonic noise, 924, 926, 929, 930, 939, 987, 988, 999
quasi-static sampling noise, 8.10, 988, 1002, 1007, 1012, 1020, 1022, 1023, 1025

R
radiance, 5.2, 5.3, 55, 248, 416, 425, 474, 476, 481, 484, 485, 566-568, 570, 571, 573, 627, 629, 630, 639-641, 643, 647, 664, 681, 685, 686, 698, 699, 703, 726, 742, 743, 745, 747, 748, 750, 753, 758, 762, 775, 781, 782, 806, 808, 809, 813, 816, 823, 844, 857, 887, 891, 892, 911, 932, 933, 936-938, 967, 969, 987, 988, 991, 992, 994, 998, 999, 1002, 1007, 1013, 1020-1022, 1031, 1034, 1035
radiant energy, 355, 430, 432, 433, 438, 470, 480, 555-558, 566-568, 571, 572
radiometric spectral radiance, 6.2, 329, 455, 486, 742-744, 748, 751-753, 760, 762, 763, 772, 775, 778, 783, 784, 789, 800, 816, 821, 852, 857-859, 879, 880, 890, 891, 905, 932, 936, 938, 964, 968, 973, 986, 991, 998, 1006, 1007, 1026, 1031
radiometry, 455, 555-557, 566
random error, 50, 223, 247, 742-745, 747, 759, 763, 764, 766, 768, 789, 844, 853, 865, 953, 955, 973, 1017, 1019, 1023, 1027, 1028
random function, 3.2, 3.13, 3.15, 3.23, 3.26, 223-225, 242, 249, 250, 252, 253, 257-261, 271-275, 277-282, 284, 287-290, 296, 297, 299, 301-303, 319, 328, 432, 438, 522, 523, 525, 526, 744, 746, 747, 760-766, 780, 792, 798, 800, 815, 840, 844-847, 860, 869, 871, 873, 874, 876, 877, 882, 892, 903, 911, 912, 914, 951, 953, 956, 962, 988, 1012, 1013
random process, 249, 301, 791. See also Gaussian random processes
random signal, 223, 762, 810, 1028
random variable, 3.1, 3.5, 3.6, 3.7, 3.9, 3.17, 223-227, 230-243, 246, 249-251, 254, 255, 257, 259-269, 271, 273, 275, 432, 438, 446, 523, 525, 526, 809-811, 814, 816, 869, 902, 909, 911, 915, 916, 921, 922, 947, 948, 972, 1018
rays, 383, 385, 394, 395, 459, 464, 467, 474, 532-534, 541-545, 551, 556, 570, 573, 585, 588, 589, 592, 594, 606. See also pencil rays
real linear operator, 335, 496-498
real scalar field, 493
relativity theory, 23
resolving power, 647, 667, 668, 682
response time, 36, 808
retroreflector, 54, 55, 59
Revercomb calibration algorithm, 686
ringing, 647, 656, 683, 709, 806

S
s-wave, 357, 358, 362, 367, 368, 407, 412, 413
sampling error, 8.6, 8.7, 696, 969, 971-973, 988, 996, 1012-1014, 1022, 1025, 1027
sampling-error NEdN, 8.11, 953, 984, 985, 1002, 1024
sampling-position error, 696, 987, 988, 1017
sampling-position noise, 8.2, 8.3, 954, 955-958, 976, 987, 988, 990, 996, 998, 1002, 1006, 1012-1014, 1020, 1022
sampling theorem, 2.24, 200
self-apodization, 666, 667
shah function, 2.18, 2.19, 162, 165, 171, 175
signal noise, 6.5, 225, 280, 753, 759, 843, 848, 853, 900, 1026
signal-to-noise ratio, 55, 742
sine curve, 66, 67, 218
sine transform, 2.2, 2.4, 67, 68, 70, 75, 80-85, 87-89, 91, 93, 95, 96, 98, 99, 119, 121, 218
single-sided interferogram, 5.18, 555, 643, 667, 673, 674, 681, 682, 726, 742, 850, 853
single-sided NEdN, 848
single-sided power spectrum, 296, 297, 455, 576, 766, 812, 819, 820
single-sided signal, 6.18, 6.19, 6.20, 6.21, 821, 823, 829, 840, 844, 845, 848
Snell's law, 532, 534
SNR. See signal-to-noise ratio
solid angle, 390, 437, 453, 455, 461, 472-474, 476, 478, 483, 485, 555-558, 566, 567, 570-573, 589, 597, 599, 601, 603, 629, 743, 806, 809
source fluctuations, 55
space look, 642
specific detectivity, 813
spectral doublet, 28
spectral intensity function, 49, 51
spectral line, 1.4, 1.5, 24, 26-30, 32, 47, 50-52, 55
spectral multiplet, 32
spectral radiance, 6.2, 329, 455, 486, 555-560, 566, 570, 571, 575, 576, 590, 591, 594, 597, 599, 601, 605, 606, 608, 612, 629, 631, 643, 646, 647, 671, 677, 685, 686, 703, 725, 726, 742-744, 748, 751-753, 760, 762, 763, 772, 775, 778, 783, 784, 789, 800, 816, 821, 852, 857-859, 879, 880, 887, 890, 891, 894, 905, 932, 936, 938, 964, 968, 973, 986, 991, 998, 1006, 1007, 1026, 1031
spectral resolution, 647, 665, 667, 677, 709, 715, 821, 823, 853, 930, 938
spectrometer
    Fourier-transform, 1.7, 31, 50, 52, 54, 55, 57-59, 599, 617, 623, 640, 643, 647, 667, 699, 707, 719, 727, 742, 764, 767, 1016, 1038
    grating based, 55
    prism-based, 55
spectroscope, 24
spectroscopy, Michelson-based, 383, 437
speed of light, 1, 19, 23, 31, 346, 559, 808
standard deviation, 3.3, 226, 228, 229, 243, 246-248, 267, 269, 745, 747, 766, 821, 853, 870-873, 909, 915, 927, 933, 941, 945, 947, 973, 990, 1002, 1007, 1018, 1023, 1030, 1032
stationarity, 223, 280, 297, 301, 791
stationary, 18, 20, 252-254, 258, 259, 262, 263, 271, 272, 274, 278-280, 319, 523, 524, 791, 1012
stationary ether. See ether, stationary
stationary random function, 3.15, 252, 260, 261, 271, 279, 282, 287, 304, 319, 523, 791, 861, 862, 869
step function, 320. See also Heaviside step function
stochastic process, 225, 249
strongly ergodic, 278, 279

T
T-limited Fourier transform, 793, 814, 846
tapering function, 678, 822, 825, 827
Taylor series, 995, 1014
test function, 121-133, 135, 136, 138, 141, 142, 144-148, 151-154, 161-164, 168-171, 173-175
theory of relativity, 23
thin film, 353, 360, 361, 479, 539
three-dimensional convolution, 215
three-dimensional delta function, 216
three-dimensional Fourier transform, 382, 391, 426, 447-449, 525
time average, 34, 36, 38, 43, 253, 271, 272, 274-276, 278, 453
time-chopped radiation, 4.10, 4.14, 390-393, 427, 430, 444, 448, 470, 522-524, 526
time-invariant linear system, 727
time-limited Fourier transform, 302
transfer function, 285-287, 620-622, 624, 645, 661, 668, 674, 681, 728-731, 733, 777, 778, 821, 822, 825, 831, 853, 880, 888, 923, 931, 968, 976, 991, 994, 996
transform
    angle-wavenumber, 4.8, 380, 382, 386, 391, 393, 394
    cosine, 2.2, 2.4, 67, 68, 70, 73-75, 80, 81, 83-86, 89-91, 93, 95, 96, 98, 103, 218, 463, 464, 780
    D-limited Fourier, 779, 792, 800, 840, 889, 898, 900, 901, 961
    fast-Fourier, 55, 96, 699
    Fourier, 2.1, 2.5, 2.6, 2.7, 2.10, 2.13, 2.25, 3.23, 14, 30, 31, 50-52, 54, 57-59, 62, 70, 76, 89, 93-107, 109, 112, 114, 115, 117-122, 124, 136-142, 157, 167, 168, 171, 176, 178, 181, 182, 188, 194, 197, 200, 202, 204, 207-210, 213-215, 218, 231, 281, 282, 285-290, 292, 297-299, 300, 303, 310, 311, 371, 372, 381-384, 391, 393, 426, 447, 449, 451, 456, 464, 488, 525, 605, 610, 614, 620, 623, 625, 626, 639-641, 643-646, 650, 654-656, 677-680, 683, 699, 704, 708, 709, 715, 728-730, 754, 756, 757, 772-775, 777-780, 786-788, 790, 792, 794, 822, 824, 826, 829-831, 833-835, 837, 838, 840, 860, 866, 880, 882, 884-886, 888, 889, 895, 896, 899, 904, 913, 914, 919, 920, 922, 957, 959, 961, 962, 974, 975, 979, 980, 1029, 1030
    Hartley, 87-89, 93, 99
    inverse Fourier, 2.5, 6.4, 137, 139, 168, 171, 194, 204, 208-211, 213-215, 280, 281, 287, 371, 372, 381, 382, 426, 464, 605, 610, 620, 621, 623-625, 668, 678, 680, 729, 750, 753, 767, 774, 822, 826
    sine, 2.2, 2.4, 67, 68, 70, 75, 80, 81, 83-85, 87-89, 91, 93, 95, 96, 98, 99, 119, 121, 218
    three-dimensional Fourier, 382, 391, 426, 447-449, 525
    time-limited Fourier, 302
    two-dimensional Fourier, 210, 213, 215, 451, 456, 528, 529, 531
    vector Fourier, 209, 382
    vector inverse Fourier, 209, 382
transverse vibrations, 1, 2
truncated interferogram signal, 701, 708, 715, 806, 959
tunnel diagram, 400, 467, 543-545, 551
two-dimensional convolution, 211, 212, 214-216
two-dimensional delta function, 216
two-dimensional Fourier transform, 210, 213, 215, 451, 456, 528, 529, 531

U
unapodized spectral resolution, 647, 715, 930, 938, 986
unavoidable misalignment noise, 7.7. See also mirror-misalignment noise
unavoidable noise, 6.8, 767-770, 786-788, 843, 900, 969, 1016
unbalanced background signal, 330, 464, 465, 470, 472, 474, 479, 482, 485, 486, 551, 585, 630
unbalanced output, 46, 47, 50, 54-56
unbalanced radiation field, 4.17, 394, 464, 467, 470, 472
unbalanced signal, 5.5, 55, 464, 465, 585-587, 632
uncalibrated spectrum, 6.19, 7.5, 8.4, 682, 683, 685, 781-785, 800, 829, 842, 849, 882, 884, 889-893, 959, 962, 964-966, 1015-1019
uncorrelated random variable, 239
undersampling, 5.25, 200, 715, 716, 718, 723, 852, 853, 855
unfolded interferometer, 400, 401, 406, 407, 415, 465, 482

V
variance, 3.3, 7.10, 226, 228, 229, 231, 240, 241, 244, 246, 248, 266, 277, 278, 301, 798, 800, 811, 812, 821, 905, 908, 909, 951, 972, 973, 987, 1018, 1022, 1023
vector calculus, 491
vector Fourier transform, 209, 382
vector inverse Fourier transform, 209, 382
vector notation, 208, 211, 215, 217, 218, 298, 372, 490, 491
velocity at Earth's equator, 14, 23
vibrations
    elastic, 2, 43
    transverse, 1, 2

W
wavefield, 4.7, 7, 14, 23, 31, 34, 36, 37, 39-41, 54, 55, 346, 349, 353, 355, 357, 359-363, 368, 369, 406, 407, 413, 428, 478, 534-540, 808
wavelength, 2, 3, 7, 10, 12, 14, 22, 24, 26, 28-31, 34, 47, 55, 57, 58, 249, 346, 351, 352, 385, 391, 392, 420, 428, 435, 533-539, 547, 555-557, 560, 566, 571, 607, 611, 692, 814
wavenumber, 34, 49, 51, 53, 346, 348, 353, 357, 359, 362, 363, 368, 370, 383, 392-394, 401, 407, 411, 412, 416, 428, 429, 434, 436-438, 451, 453, 455, 462, 464, 468, 477, 478, 486, 556, 557, 559, 560, 566, 570, 571, 575, 580, 584, 606, 607, 611, 622, 627, 631, 645, 647, 648, 650, 664-666, 671, 673-677, 682, 684-686, 691, 692, 700, 705-707, 709, 713, 714, 716-719, 723, 726, 727, 743, 744, 747, 755, 783, 789, 790, 798, 800, 806, 808, 813, 814, 816, 848, 849, 853, 857-859, 874, 891, 927-929, 931, 933, 935, 936, 941, 964, 969, 971, 973, 975, 979, 987, 988, 990, 994, 995, 997, 999, 1001, 1002, 1007, 1013
wavetrain, 7, 14, 22, 37, 39-42, 44-47, 58
weakly ergodic, 278, 279, 869
weakly stationary, 258
white light. See light, white
white noise. See noise, white
wide-sense stationary, 258-261, 263, 279-283, 285, 287, 288, 290, 291, 299, 302, 304, 791, 792, 815, 860-862, 865, 869, 877, 903, 912, 922, 948, 953, 957, 958
Wiener-Khinchin theorem, 3.24, 223, 297-299, 434, 522, 525
window function, 654-658
windowing, 654

Y
Yerkes Observatory, 28
Young, Thomas, 28

Z
Zeeman, Pieter, 1
zero-path difference, 26, 395, 577. See also ZPD
ZPD, 26, 28, 30, 31, 55, 395, 414, 577, 587, 591-594, 597, 599, 617, 667, 668, 807
ZPD position, 28-31, 46, 52, 395, 396, 413, 414, 577, 591, 593, 667-670, 749