
Approximation Methods

Physics 130B, UCSD Fall 2009


Joel Broida
November 15, 2009
Contents

1 The Variation Method
  1.1 The Variation Theorem
  1.2 Excited States
  1.3 Linear Variation Functions
    1.3.1 Proof that the Roots of the Secular Equation are Real
2 Time-Independent Perturbation Theory
  2.1 Perturbation Theory for a Nondegenerate Energy Level
  2.2 Perturbation Theory for a Degenerate Energy Level
  2.3 Perturbation Treatment of the First Excited States of Helium
  2.4 Spin-Orbit Coupling and the Hydrogen Atom Fine Structure
    2.4.1 Supplement: Miscellaneous Proofs
  2.5 The Zeeman Effect
    2.5.1 Strong External Field
    2.5.2 Weak External Field
    2.5.3 Intermediate-Field Case
    2.5.4 Supplement: The Electromagnetic Hamiltonian
3 Time-Dependent Perturbation Theory
  3.1 Transitions Between Two Discrete States
  3.2 Transitions to a Continuum of States
1 The Variation Method

1.1 The Variation Theorem

The variation method is one approach to approximating the ground state energy of a system without actually solving the Schrödinger equation. It is based on the following theorem, sometimes called the variation theorem.

Theorem 1.1. Let a system be described by a time-independent Hamiltonian $H$, and let $\varphi$ be any normalized well-behaved function that satisfies the boundary conditions of the problem. If $E_0$ is the true ground state energy of the system, then
$$\langle\varphi|H\varphi\rangle \ge E_0. \tag{1.1}$$

Proof. Consider the integral $I = \langle\varphi|(H - E_0)\varphi\rangle$. Then
$$I = \langle\varphi|H\varphi\rangle - E_0\langle\varphi|\varphi\rangle = \langle\varphi|H\varphi\rangle - E_0.$$
We must show that $I \ge 0$. Let $\psi_n$ be the true (stationary state) solutions to the Schrödinger equation, so that $H\psi_n = E_n\psi_n$. By assumption, the $\psi_n$ form a complete, orthonormal set, so we can write
$$\varphi = \sum_n a_n\psi_n \quad\text{where}\quad \langle\psi_n|\psi_m\rangle = \delta_{nm}.$$
Then
$$I = \Big\langle \sum_n a_n\psi_n \Big|\, (H - E_0) \sum_m a_m\psi_m \Big\rangle
= \sum_{n,m} a_n^* a_m \big(\langle\psi_n|H\psi_m\rangle - E_0\,\delta_{nm}\big)
= \sum_{n,m} a_n^* a_m (E_m - E_0)\,\delta_{nm}
= \sum_n |a_n|^2 (E_n - E_0).$$
But $|a_n|^2 \ge 0$ and $E_n > E_0$ for all $n > 0$ because $E_0$ is the ground state of the system. Therefore $I \ge 0$ as claimed.

Suppose we have a trial function $\varphi$ that is not normalized. Then multiplying $\varphi$ by a normalization constant $N$, equation (1.1) becomes $|N|^2\langle\varphi|H\varphi\rangle \ge E_0$. But by definition we know that $1 = \langle N\varphi|N\varphi\rangle = |N|^2\langle\varphi|\varphi\rangle$ so that $|N|^2 = 1/\langle\varphi|\varphi\rangle$, and hence our variation theorem becomes
$$\frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} \ge E_0. \tag{1.2}$$
The integral in (1.1) (or the ratio of integrals in (1.2)) is called the variational integral.

So the idea is to try a number of different trial functions and see how low we can get the variational integral to go. Fortunately, the variational integral approaches $E_0$ a lot faster than $\varphi$ approaches $\psi_0$, so it is possible to get a good approximation to $E_0$ even with a poor $\varphi$. A common approach is to introduce arbitrary parameters into the trial function and then minimize the energy with respect to them.

Before continuing with an example, there are two points I need to make. First, I state without proof that the bound stationary states of a one-dimensional system are characterized by having no nodes interior to the boundary points in the ground state (i.e., the wavefunction is never zero there), and the number of nodes increases by one for each successive excited state. While the proof of this statement is not particularly difficult (it's really a statement about Sturm-Liouville type differential equations), it would take us too far astray at the moment. If you are interested, a proof may be found in Messiah, Quantum Mechanics, Chapter III, Sections 8-12.
A related issue is the following: in one dimension, the bound states are nondegenerate. To prove this, suppose we have two degenerate states $\psi_1$ and $\psi_2$, both with the same energy $E$. Multiply the Schrödinger equation for $\psi_1$ by $\psi_2$:
$$-\frac{\hbar^2}{2m}\,\psi_2\,\frac{d^2\psi_1}{dx^2} + V\psi_1\psi_2 = E\psi_1\psi_2$$
and multiply the Schrödinger equation for $\psi_2$ by $\psi_1$:
$$-\frac{\hbar^2}{2m}\,\psi_1\,\frac{d^2\psi_2}{dx^2} + V\psi_1\psi_2 = E\psi_1\psi_2.$$
Subtracting, we obtain
$$\psi_2\,\frac{d^2\psi_1}{dx^2} - \psi_1\,\frac{d^2\psi_2}{dx^2} = 0.$$
But then
$$\frac{d}{dx}\left(\psi_2\,\frac{d\psi_1}{dx} - \psi_1\,\frac{d\psi_2}{dx}\right) = \psi_2\,\frac{d^2\psi_1}{dx^2} - \psi_1\,\frac{d^2\psi_2}{dx^2} = 0$$
so that
$$\psi_2\,\frac{d\psi_1}{dx} - \psi_1\,\frac{d\psi_2}{dx} = \text{const}.$$
However, we know that $\psi \to 0$ as $x \to \pm\infty$, and hence the constant must equal zero. Dividing this result by $\psi_1\psi_2$ gives $\psi_1'/\psi_1 = \psi_2'/\psi_2$, i.e., $d\ln\psi_1 = d\ln\psi_2$, or $\ln\psi_1 = \ln\psi_2 + \ln k$ where $\ln k$ is an integration constant. This is equivalent to $\psi_1 = k\psi_2$, so that $\psi_1$ and $\psi_2$ are linearly dependent. They therefore describe the same physical state, and the level is in fact nondegenerate as claimed.
The second topic I need to address is the notion of classification by symmetry. So, let us consider the time-independent Schrödinger equation $H\psi = E\psi$, and suppose that the potential energy function $V(x)$ is symmetric, i.e.,
$$V(-x) = V(x).$$
Under these conditions, the total Hamiltonian is also symmetric:
$$H(-x) = H(x).$$
To understand the consequences of this, let us introduce an operator $\Pi$ called the parity operator, defined by
$$\Pi f(x) = f(-x)$$
where $f(x)$ is an arbitrary function. It is easy to see that $\Pi$ is Hermitian because
$$\langle f|\Pi g\rangle = \int_{-\infty}^{\infty} f(x)^*\,g(-x)\,dx = \int_{-\infty}^{\infty} f(-x)^*\,g(x)\,dx = \int_{-\infty}^{\infty} [\Pi f(x)]^*\,g(x)\,dx = \langle\Pi f|g\rangle$$
where in the middle step we simply changed variables $x \to -x$. (I will use the symbol $dx$ to denote the volume element in whatever $n$-dimensional space is under consideration.)

Now what can we say about the eigenvalues of $\Pi$? Well, if $\Pi f = \lambda f$, then
$$\Pi^2 f = \Pi(\lambda f) = \lambda\,\Pi f = \lambda^2 f.$$
On the other hand, it is clear that
$$\Pi^2 f(x) = \Pi(\Pi f(x)) = \Pi f(-x) = f(x)$$
and hence we must have $\lambda^2 = 1$, so the eigenvalues of $\Pi$ are $\pm 1$. Let us denote the corresponding eigenfunctions by $f_\pm$:
$$\Pi f_+ = f_+ \qquad\text{and}\qquad \Pi f_- = -f_-.$$
In other words,
$$f_+(-x) = f_+(x) \qquad\text{and}\qquad f_-(-x) = -f_-(x).$$
Thus $f_+$ is any even function, and $f_-$ is any odd function. Note that what we have shown is the existence of a Hermitian operator with only two eigenvalues, each of which is infinitely degenerate. (I leave it as an easy exercise for you to show that $f_+$ and $f_-$ are orthogonal, as they should be.)

Next, note that any $f(x)$ can always be written in the form
$$f(x) = f_+(x) + f_-(x)$$
where
$$f_+(x) = \frac{f(x) + f(-x)}{2} \qquad\text{and}\qquad f_-(x) = \frac{f(x) - f(-x)}{2}$$
are obviously symmetric and antisymmetric, respectively. Thus the eigenfunctions of the parity operator are complete, i.e., any function can be written as the sum of a symmetric function and an antisymmetric function.

It will be extremely convenient to now introduce the operators $\Pi_\pm$ defined by
$$\Pi_\pm = \frac{1 \pm \Pi}{2}.$$
In terms of these operators, we can write
$$\Pi_\pm f = f_\pm.$$
It is easy to see that the operators $\Pi_\pm$ satisfy the three properties
$$\Pi_\pm^2 = \Pi_\pm, \qquad \Pi_+\Pi_- = \Pi_-\Pi_+ = 0, \qquad \Pi_+ + \Pi_- = 1.$$
The operators $\Pi_\pm$ are called projection operators.

Returning to our symmetric Hamiltonian, we observe that
$$\Pi(H(x)\psi(x)) = H(-x)\psi(-x) = H(x)\psi(-x) = H(x)\,\Pi\psi(x)$$
and thus the Hamiltonian commutes with the parity operator. But if $[H,\Pi] = 0$, then it is trivial to see that $[H,\Pi_\pm] = 0$ also, and therefore acting on $H\psi_E = E\psi_E$ with $\Pi_\pm$ we see that
$$H\psi_{E+} = E\psi_{E+} \qquad\text{and}\qquad H\psi_{E-} = E\psi_{E-}.$$
Thus the stationary states in a symmetric potential can always be classified according to their parity, i.e., they can always be chosen to have a definite symmetry. Moreover, since, as we saw above, the bound states in one dimension are nondegenerate, it follows that each bound state in a one-dimensional symmetric potential must be either even or odd.
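The even/odd decomposition is easy to verify numerically. Here is a minimal sketch (assuming NumPy; not part of the notes) applying $\Pi_\pm f(x) = \frac{1}{2}[f(x) \pm f(-x)]$ to an arbitrary test function:

```python
# Decompose an arbitrary function into even and odd parts using the
# projection operators Pi_± f(x) = (f(x) ± f(-x))/2.
import numpy as np

f = lambda x: np.exp(x) * np.sin(x + 1.0)      # arbitrary test function

f_plus  = lambda x: (f(x) + f(-x)) / 2          # even part,  Pi_+ f
f_minus = lambda x: (f(x) - f(-x)) / 2          # odd part,   Pi_- f

x = np.linspace(-2.0, 2.0, 9)
print(np.allclose(f(x), f_plus(x) + f_minus(x)))   # completeness: f = f_+ + f_-
print(np.allclose(f_plus(x), f_plus(-x)))          # f_+ is even
print(np.allclose(f_minus(x), -f_minus(-x)))       # f_- is odd
```

The same three lines also confirm the projection-operator identities $\Pi_+ + \Pi_- = 1$ and the definite parity of each piece.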
Example 1.1. Let us find a trial function for a particle in a one-dimensional box of length $l$. Since the true wavefunction vanishes at the ends $x = 0$ and $x = l$, our trial function must also have this property. A simple (un-normalized) function that obeys these boundary conditions is
$$\varphi = x(l - x) \quad\text{for } 0 \le x \le l$$
and $\varphi = 0$ outside the box.

The integrals in equation (1.2) are
$$\langle\varphi|H\varphi\rangle = -\frac{\hbar^2}{2m}\int_0^l x(l-x)\,\frac{d^2}{dx^2}\,x(l-x)\,dx = \frac{\hbar^2}{m}\int_0^l x(l-x)\,dx = \frac{\hbar^2 l^3}{6m}$$
and
$$\langle\varphi|\varphi\rangle = \int_0^l x^2(l-x)^2\,dx = \frac{l^5}{30}.$$
Therefore
$$E_0 \le \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} = \frac{5\hbar^2}{ml^2}.$$
For comparison, the exact solution has energy levels
$$E_n = \frac{n^2\pi^2\hbar^2}{2ml^2}, \qquad n = 1, 2, \ldots$$
so the ground state ($n = 1$) has energy
$$\frac{\pi^2\hbar^2}{2ml^2} = 4.9348\,\frac{\hbar^2}{ml^2}$$
for an error of 1.3%. Figure 1 is a plot of the exact normalized ground state solution to the particle in a box together with the normalized trial function. You can see how close the trial function is to the exact solution.

[Figure 1: Plot of $\sqrt{2}\sin\pi x$ (exact solution) and $\sqrt{30}\,x(1-x)$ (trial function), for $l = 1$.]
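The two integrals in this example are simple enough to check by computer algebra. A sketch (assuming SymPy is available; not part of the original notes) reproducing $W = 5\hbar^2/ml^2$ and the 1.3% error:

```python
# Verify Example 1.1 symbolically: for the trial function phi = x(l - x),
# the variational integral is W = <phi|H phi>/<phi|phi> = 5 hbar^2/(m l^2).
import sympy as sp

x, l, hbar, m = sp.symbols('x l hbar m', positive=True)
phi = x * (l - x)

# <phi|H phi> = -(hbar^2/2m) * integral of phi * phi'' over the box
num = -(hbar**2 / (2*m)) * sp.integrate(phi * sp.diff(phi, x, 2), (x, 0, l))
den = sp.integrate(phi**2, (x, 0, l))        # <phi|phi> = l^5/30

W = sp.simplify(num / den)                   # 5 hbar^2/(m l^2)

exact = sp.pi**2 * hbar**2 / (2*m*l**2)      # true ground state energy
error = float((W - exact) / exact) * 100     # relative error in percent
print(W, round(error, 1))                    # error is about 1.3 percent
```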
Example 1.2. Let us construct a variation function with a parameter $\alpha$ for the one-dimensional harmonic oscillator, and find the optimal value for that parameter.

What do we know in general? First, the wavefunction must vanish as $x \to \pm\infty$. The most obvious function that satisfies this is $e^{-x^2}$. However, $x$ has units of length, and we can only take the exponential of a dimensionless quantity (think of the power series expansion for $e^{-x^2}$). However, if we include a constant $\alpha$ with dimensions of length$^{-2}$, then $e^{-\alpha x^2}$ is satisfactory from a dimensional standpoint. In addition, since the potential $V = \frac{1}{2}kx^2$ is symmetric, we know that the eigenstates will have a definite parity. And since the ground state has no nodes, it must be an even function (since an odd function has a node at the origin). Thus the trial function
$$\varphi = e^{-\alpha x^2}$$
has all of our desired properties.

Since $\varphi$ is unnormalized, we use equation (1.2). The Hamiltonian is
$$H = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2 x^2$$
and hence
$$\langle\varphi|H\varphi\rangle = -\frac{\hbar^2}{2m}\int_{-\infty}^{\infty} e^{-\alpha x^2}\,\frac{d^2 e^{-\alpha x^2}}{dx^2}\,dx + \frac{1}{2}m\omega^2\int_{-\infty}^{\infty} x^2 e^{-2\alpha x^2}\,dx$$
$$= -\frac{\hbar^2}{2m}\int_{-\infty}^{\infty}\big(4\alpha^2 x^2 e^{-2\alpha x^2} - 2\alpha\,e^{-2\alpha x^2}\big)\,dx + \frac{1}{2}m\omega^2\int_{-\infty}^{\infty} x^2 e^{-2\alpha x^2}\,dx$$
$$= \left(-\frac{2\alpha^2\hbar^2}{m} + \frac{1}{2}m\omega^2\right)\int_{-\infty}^{\infty} x^2 e^{-2\alpha x^2}\,dx + \frac{\alpha\hbar^2}{m}\int_{-\infty}^{\infty} e^{-2\alpha x^2}\,dx.$$
The second integral is easy (and you should already know the answer):
$$\int_{-\infty}^{\infty} e^{-2\alpha x^2}\,dx = \sqrt{\frac{\pi}{2\alpha}}.$$
Using this, the first integral is also easy. Letting $\beta = 2\alpha$ we have
$$\int_{-\infty}^{\infty} x^2 e^{-2\alpha x^2}\,dx = \int_{-\infty}^{\infty} x^2 e^{-\beta x^2}\,dx = -\frac{d}{d\beta}\int_{-\infty}^{\infty} e^{-\beta x^2}\,dx = -\frac{d}{d\beta}\sqrt{\frac{\pi}{\beta}} = \frac{1}{2}\pi^{1/2}\beta^{-3/2} = \frac{1}{2}\pi^{1/2}(2\alpha)^{-3/2}.$$
After a little algebra, we now arrive at
$$\langle\varphi|H\varphi\rangle = \frac{\hbar^2\pi^{1/2}\alpha^{1/2}}{2^{3/2}m} + \frac{m\omega^2\pi^{1/2}}{2^{7/2}\alpha^{3/2}}.$$
And the denominator in equation (1.2) is just
$$\langle\varphi|\varphi\rangle = \int_{-\infty}^{\infty} e^{-2\alpha x^2}\,dx = \sqrt{\frac{\pi}{2\alpha}}.$$
Thus our variational integral becomes
$$W := \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} = \frac{\hbar^2\alpha}{2m} + \frac{m\omega^2}{8\alpha}.$$
To minimize this with respect to $\alpha$ we set $dW/d\alpha = 0$ and solve for $\alpha$:
$$\frac{\hbar^2}{2m} - \frac{m\omega^2}{8\alpha^2} = 0$$
or
$$\alpha = \pm\frac{m\omega}{2\hbar}.$$
The negative root must be rejected because otherwise $\varphi = e^{-\alpha x^2}$ would be divergent. Substituting the positive root for $\alpha$ into our expression for $W$ yields
$$W = \frac{1}{2}\hbar\omega$$
which is the exact ground state harmonic oscillator energy. This isn't surprising, because up to normalization, our $\varphi$ with $\alpha = m\omega/2\hbar$ is just the exact ground state harmonic oscillator wave function.
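This calculation is easy to verify symbolically. A sketch (assuming SymPy is available; not code from the notes) that reproduces $W(\alpha)$, the optimal $\alpha = m\omega/2\hbar$, and the exact ground state energy $\hbar\omega/2$:

```python
# Symbolic check of Example 1.2 for the trial function phi = exp(-alpha x^2).
import sympy as sp

x, a, hbar, m, w = sp.symbols('x alpha hbar m omega', positive=True)
phi = sp.exp(-a * x**2)

# H phi with H = -(hbar^2/2m) d^2/dx^2 + (1/2) m omega^2 x^2
Hphi = -(hbar**2/(2*m)) * sp.diff(phi, x, 2) + sp.Rational(1, 2)*m*w**2*x**2*phi

num = sp.integrate(phi * Hphi, (x, -sp.oo, sp.oo))   # <phi|H phi>
den = sp.integrate(phi**2, (x, -sp.oo, sp.oo))       # <phi|phi>

W = sp.simplify(num / den)       # hbar^2 alpha/(2m) + m omega^2/(8 alpha)

# Minimize W over alpha, keeping only the positive root
a_opt = [s for s in sp.solve(sp.diff(W, a), a) if s.is_positive][0]
W_min = sp.simplify(W.subs(a, a_opt))
print(a_opt, W_min)              # m*omega/(2*hbar), hbar*omega/2
```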
1.2 Excited States

So far all we have discussed is how to approximate the ground-state energy of a system. Now we want to take a look at how to go about approximating the energy of an excited state. Let us assume that the stationary states of our system are numbered so that
$$E_0 \le E_1 \le E_2 \le \cdots.$$
If $\{\psi_n\}$ is a complete set of orthonormal eigenstates of $H$, then our normalized trial function $\varphi$ can be written $\varphi = \sum_n a_n\psi_n$ where $a_n = \langle\psi_n|\varphi\rangle$. Then, as we have seen,
$$\langle\varphi|H\varphi\rangle = \sum_{n,m} a_n^* a_m E_m\langle\psi_n|\psi_m\rangle = \sum_{n,m} a_n^* a_m E_m\,\delta_{nm} = \sum_{n=0}^{\infty} |a_n|^2 E_n$$
and
$$\langle\varphi|\varphi\rangle = \sum_{n=0}^{\infty} |a_n|^2 = 1.$$
Suppose we restrict ourselves to trial functions $\varphi$ that are orthogonal to the true ground-state wavefunction $\psi_0$. Then $a_0 = \langle\psi_0|\varphi\rangle = 0$ and we are left with
$$\langle\varphi|H\varphi\rangle = \sum_{n=1}^{\infty} |a_n|^2 E_n \qquad\text{and}\qquad \langle\varphi|\varphi\rangle = \sum_{n=1}^{\infty} |a_n|^2 = 1.$$
For $n \ge 1$ we have $E_n \ge E_1$, so that $|a_n|^2 E_n \ge |a_n|^2 E_1$ and hence
$$\sum_{n=1}^{\infty} |a_n|^2 E_n \ge \sum_{n=1}^{\infty} |a_n|^2 E_1 = E_1\sum_{n=1}^{\infty} |a_n|^2 = E_1.$$
This gives us our desired result:
$$\langle\varphi|H\varphi\rangle \ge E_1 \qquad\text{if } \langle\psi_0|\varphi\rangle = 0 \text{ and } \langle\varphi|\varphi\rangle = 1. \tag{1.3}$$
While equation (1.3) gives an upper bound on the energy $E_1$ of the first excited state, it depends on the restriction $\langle\psi_0|\varphi\rangle = 0$, which can be problematic. However, for some systems this is not a difficult requirement to achieve even though we don't know the exact ground-state wavefunction. For example, a one-dimensional problem with a symmetric potential has a ground-state wavefunction that is always even, while the first excited state is always odd. This means that any (normalized) trial function that is an odd function will automatically satisfy $\langle\psi_0|\varphi\rangle = 0$.

It is also possible to extend this approach to approximating the energy levels of higher excited states. In particular, if we somehow choose the trial function $\varphi$ so that
$$\langle\psi_0|\varphi\rangle = \langle\psi_1|\varphi\rangle = \cdots = \langle\psi_n|\varphi\rangle = 0,$$
then, following exactly the same argument as above, it is easy to see that if $\langle\varphi|\varphi\rangle = 1$ we have
$$\langle\varphi|H\varphi\rangle \ge E_{n+1}.$$
For example, consider any particle moving under a central potential $V(r)$ (e.g., the hydrogen atom). Then the Schrödinger equation separates into a radial equation that depends on $V(r)$ times an angular equation (that is independent of $V$) with solutions that are just the spherical harmonics $Y_l^m(\theta,\phi)$. It may very well be that we can't solve the radial equation with this potential, but we know that spherical harmonics with different values of $l$ are orthogonal. Thus, we can get an upper bound to the energy of the lowest state with a particular angular momentum $l$ by choosing a trial function that contains the factor $Y_l^m$.
1.3 Linear Variation Functions

The approach that we are now going to describe is probably the most common method of finding approximate molecular wave functions. A linear variation function is a linear combination of $n$ linearly independent functions $f_i$:
$$\varphi = \sum_{i=1}^n c_i f_i.$$
The functions $f_i$ are called basis functions, and they must obey the boundary conditions of the problem. The coefficients $c_i$ are to be determined by minimizing the variational integral.

We shall restrict ourselves to a real $\varphi$, so the functions $f_i$ and coefficients $c_i$ are taken to be real. Later we will remove this requirement. Furthermore, note that the basis functions are not generally orthogonal, since they are not necessarily the eigenfunctions of any operator. Let us define the overlap integrals $S_{ij}$ by
$$S_{ij} := \langle f_i|f_j\rangle = \int f_i^* f_j\,dx$$
(where the asterisk on $f_i$ isn't necessary because we are assuming that our basis functions are real). Then (remember that the $c_i$ are real)
$$\langle\varphi|\varphi\rangle = \sum_{i,j=1}^n c_i c_j\langle f_i|f_j\rangle = \sum_{i,j=1}^n c_i c_j S_{ij}.$$
Next, we define the integrals
$$H_{ij} := \langle f_i|Hf_j\rangle = \int f_i^* Hf_j\,dx$$
so that
$$\langle\varphi|H\varphi\rangle = \sum_{i,j=1}^n c_i c_j\langle f_i|Hf_j\rangle = \sum_{i,j=1}^n c_i c_j H_{ij}.$$
Then the variation theorem (1.2) becomes
$$W = \frac{\langle\varphi|H\varphi\rangle}{\langle\varphi|\varphi\rangle} = \frac{\sum_{i,j=1}^n c_i c_j H_{ij}}{\sum_{i,j=1}^n c_i c_j S_{ij}}$$
or
$$W\sum_{i,j=1}^n c_i c_j S_{ij} = \sum_{i,j=1}^n c_i c_j H_{ij}. \tag{1.4}$$
Now $W$ is a function of the $n$ coefficients $c_i$, and we know that $W \ge E_0$. In order to minimize $W$ with respect to all of the $c_k$'s, we must require that at the minimum we have
$$\frac{\partial W}{\partial c_k} = 0; \qquad k = 1, \ldots, n.$$
Taking the derivative of (1.4) with respect to $c_k$ and using $\partial c_i/\partial c_k = \delta_{ik}$, we have
$$\frac{\partial W}{\partial c_k}\sum_{i,j=1}^n c_i c_j S_{ij} + W\sum_{i,j=1}^n (\delta_{ik}c_j + c_i\delta_{jk})S_{ij} = \sum_{i,j=1}^n (\delta_{ik}c_j + c_i\delta_{jk})H_{ij}$$
or (since $\partial W/\partial c_k = 0$)
$$W\sum_{j=1}^n c_j S_{kj} + W\sum_{i=1}^n c_i S_{ik} = \sum_{j=1}^n c_j H_{kj} + \sum_{i=1}^n c_i H_{ik}.$$
However, the basis functions $f_i$ are real so we have
$$S_{ik} = \int f_i f_k\,dx = S_{ki}$$
and since $H$ is Hermitian (and $Hf(x)$ is real) we also have
$$H_{ik} = \langle f_i|Hf_k\rangle = \langle Hf_i|f_k\rangle = \langle f_k|Hf_i\rangle^* = \langle f_k|Hf_i\rangle = H_{ki}.$$
Therefore, because the summation indices are dummy indices, we see that the two terms on each side of the last equation are identical, and we are left with
$$W\sum_{j=1}^n c_j S_{kj} = \sum_{j=1}^n c_j H_{kj}$$
or
$$\sum_{j=1}^n (H_{kj} - WS_{kj})c_j = 0; \qquad k = 1, \ldots, n. \tag{1.5}$$
This is just a system of $n$ homogeneous linear equations in $n$ unknowns (the $n$ coefficients $c_j$), and hence for a nontrivial solution to exist (we don't want all of the $c_j$'s to be zero) we must have the secular equation
$$\det(H_{kj} - WS_{kj}) = 0. \tag{1.6}$$
(You can think of this as a system of the form $\sum_j a_{kj}x_j = 0$, where the matrix $A = (a_{kj})$ must be singular, or else $A^{-1}$ would exist and then the equation $Ax = 0$ would imply that $x = 0$. The requirement that $A$ be singular is equivalent to the requirement that $\det A = 0$.) Written out, equation (1.6) looks like
$$\begin{vmatrix}
H_{11} - WS_{11} & H_{12} - WS_{12} & \cdots & H_{1n} - WS_{1n} \\
H_{21} - WS_{21} & H_{22} - WS_{22} & \cdots & H_{2n} - WS_{2n} \\
\vdots & \vdots & & \vdots \\
H_{n1} - WS_{n1} & H_{n2} - WS_{n2} & \cdots & H_{nn} - WS_{nn}
\end{vmatrix} = 0.$$
The determinant in (1.6) is a polynomial in $W$ of degree $n$, and it can be proved that all $n$ roots of this equation are real. (The proof is given at the end of this section for those who are interested.) Let us arrange the roots in order of increasing value as
$$W_0 \le W_1 \le \cdots \le W_{n-1}.$$
Similarly, we number the bound states of the system so that the corresponding true energies of these bound states are also arranged in increasing order:
$$E_0 \le E_1 \le \cdots \le E_{n-1} \le E_n \le \cdots.$$
From the variation theorem we know that $E_0 \le W_0$. Furthermore, it can also be proved (see the homework) that
$$E_i \le W_i \quad\text{for each } i = 0, \ldots, n-1.$$
In other words, the linear variation method provides upper bounds for the energies of the lowest $n$ bound states of the system. It can also be shown that increasing the number of basis functions used (and hence increasing the number of states whose energies are approximated) improves the accuracy of the previously calculated energies.

Once we have found the $n$ roots $W_i$, we can substitute them one at a time back into equation (1.5) and solve for the coefficients $c_j^{(i)}$, where the superscript denotes the fact that this particular set of coefficients applies to the root $W_i$. (Again, this is just like finding the eigenvector corresponding to a given eigenvalue.) Note also that all we can really find is the ratios of the coefficients, say relative to $c_1$, and then fix $c_1$ by normalization.
There are some tricks that can simplify the solution of equation (1.6). For example, if we choose the basis functions to be orthonormal, then $S_{kj} = \delta_{kj}$. If the originally chosen set of basis functions isn't orthonormal, we can always use the Gram-Schmidt process to construct an orthonormal set. Also, we can make some of the off-diagonal $H_{kj}$'s vanish if we choose our basis functions to be eigenfunctions of some other Hermitian operator $A$ that commutes with $H$. This is because of the following theorem:

Theorem 1.2. Let $f_i$ and $f_j$ be eigenfunctions of a Hermitian operator $A$ corresponding to the eigenvalues $a_i \ne a_j$. If $H$ is an operator that commutes with $A$, then
$$H_{ji} = \langle f_j|Hf_i\rangle = 0.$$
Proof. Let us first assume that the eigenvalue $a_i$ is nondegenerate. Then $Af_i = a_i f_i$ and
$$A(Hf_i) = HAf_i = a_i(Hf_i).$$
Thus $Hf_i$ is in the eigenspace $V_{a_i}$ of $A$ corresponding to the eigenvalue $a_i$. But $a_i$ is nondegenerate, so the eigenspace is one-dimensional and spanned by $f_i$. Hence we must have $Hf_i = b_i f_i$ for some scalar $b_i$. Recalling that eigenfunctions belonging to distinct eigenvalues of a Hermitian operator are orthogonal, we have
$$\langle f_j|Hf_i\rangle = b_i\langle f_j|f_i\rangle = 0.$$
Now assume that the eigenvalue $a_i$ is degenerate. This means that the eigenspace $V_{a_i}$ has dimension greater than one, say $\dim V_{a_i} = n$. Then $V_{a_i}$ has a basis $g_1, \ldots, g_n$ consisting of eigenvectors of $A$ corresponding to the eigenvalue $a_i$, i.e., $Ag_k = a_i g_k$ for each $k = 1, \ldots, n$. So since $Hf_i$ is in $V_{a_i}$ (as shown above), we have $Hf_i = \sum_{k=1}^n c_k g_k$ for some expansion coefficients $c_k$. But then we again have
$$\langle f_j|Hf_i\rangle = \sum_{k=1}^n c_k\langle f_j|g_k\rangle = 0$$
because the eigenfunctions $f_j$ and $g_k$ belong to the distinct eigenvalues $a_j$ and $a_i$ respectively.
Another (possibly easier) way to prove Theorem 1.2 is this. Let $Af_i = a_i f_i$ and $Af_j = a_j f_j$ where $a_i \ne a_j$. (In other words, $f_i$ and $f_j$ belong to different eigenspaces of $A$.) Then on the one hand we have
$$\langle f_j|HAf_i\rangle = a_i\langle f_j|Hf_i\rangle$$
while on the other hand, we can use the fact that $H$ and $A$ commute, along with the fact that $A$ is Hermitian and hence has real eigenvalues, to write
$$\langle f_j|HAf_i\rangle = \langle f_j|AHf_i\rangle = \langle Af_j|Hf_i\rangle = a_j\langle f_j|Hf_i\rangle.$$
Equating these results shows that $(a_i - a_j)\langle f_j|Hf_i\rangle = 0$. Therefore, if $a_i \ne a_j$, we must have $\langle f_j|Hf_i\rangle = 0$.

Finally, it is left as a homework problem to show that equations (1.5) and (1.6) also hold if the variation function $\varphi$ is in fact allowed to be complex.
Example 1.3. In Example 1.1 we constructed the trial function $\varphi = x(l-x)$ for the ground state of the one-dimensional particle in a box. Let us now construct a linear variation function $\varphi = \sum_i c_i f_i$ to approximate the energies of the first four states. This means that we need at least four independent functions $f_i$ that obey the boundary conditions of vanishing at the ends of the box. While there are an infinite number of possibilities, we want to limit ourselves to integrals that are easy to evaluate.

We begin by taking
$$f_1 = x(l-x),$$
and another simple function that obeys the proper boundary conditions is
$$f_2 = x^2(l-x)^2.$$
If the origin were chosen to be at the center of the box, we know that the exact solutions would have a definite parity, alternating between even and odd functions, starting with the even ground state. To see that both $f_1$ and $f_2$ are even functions, we shift the origin to the center of the box by changing variables to $x' = x - l/2$. Then $x = x' + l/2$ and we find
$$f_1 = (x' + l/2)(l/2 - x') \qquad\text{and}\qquad f_2 = (x' + l/2)^2(l/2 - x')^2$$
which shows that $f_1$ and $f_2$ are both clearly even functions of $x'$.

Since both $f_1$ and $f_2$ are even functions, if we took $\varphi = c_1 f_1 + c_2 f_2$ we would end up with an upper bound for the two lowest energy even states (the $n = 1$ and $n = 3$ states). In order to also approximate the odd $n = 2$ and $n = 4$ states, we must add in two odd functions. Thus we need two functions that vanish at $x = 0$, $x = l$ and $x = l/2$. Two functions that satisfy these requirements are
$$f_3 = x(l-x)(l/2 - x)$$
and
$$f_4 = x^2(l-x)^2(l/2 - x).$$
By again changing variables as we did for $f_1$ and $f_2$, you can easily show that $f_3$ and $f_4$ are indeed odd functions. Note also that the four functions we have chosen are linearly independent, as they must be.

One of the advantages in choosing our functions to have a definite parity is that many of the integrals that occur in equation (1.6) will vanish. In particular, since any integral of an odd function over an even interval is identically zero, and since the product of an even function with an odd function is odd, it should be clear that
$$S_{13} = S_{31} = 0, \qquad S_{14} = S_{41} = 0,$$
$$S_{23} = S_{32} = 0, \qquad S_{24} = S_{42} = 0.$$
Furthermore, since the functions have a definite parity, they are eigenfunctions of the parity operator with $\Pi f_{1,2} = +f_{1,2}$ and $\Pi f_{3,4} = -f_{3,4}$. And since the potential is symmetric, we have $[\Pi, H] = 0$, so that by Theorem 1.2 we know that $H_{ij} = 0$ if one index refers to an even function and the other refers to an odd function:
$$H_{13} = H_{31} = 0, \qquad H_{14} = H_{41} = 0,$$
$$H_{23} = H_{32} = 0, \qquad H_{24} = H_{42} = 0.$$
With these simplifications, (1.6) becomes
$$\begin{vmatrix}
H_{11} - WS_{11} & H_{12} - WS_{12} & 0 & 0 \\
H_{21} - WS_{21} & H_{22} - WS_{22} & 0 & 0 \\
0 & 0 & H_{33} - WS_{33} & H_{34} - WS_{34} \\
0 & 0 & H_{43} - WS_{43} & H_{44} - WS_{44}
\end{vmatrix} = 0.$$
Since the determinant of a block diagonal matrix is the product of the determinants of the blocks, we can find all four roots by finding the two roots of each of the following equations:
$$\begin{vmatrix} H_{11} - WS_{11} & H_{12} - WS_{12} \\ H_{21} - WS_{21} & H_{22} - WS_{22} \end{vmatrix} = 0 \tag{1.7a}$$
$$\begin{vmatrix} H_{33} - WS_{33} & H_{34} - WS_{34} \\ H_{43} - WS_{43} & H_{44} - WS_{44} \end{vmatrix} = 0. \tag{1.7b}$$
Let the roots of (1.7a) be denoted $W_1, W_3$. These are the approximations to the energies of the $n = 1$ and $n = 3$ even states. Similarly, the roots $W_2, W_4$ of (1.7b) are the approximations to the odd energy states $n = 2$ and $n = 4$. Once we have the roots $W_i$, we substitute them one at a time back into equation (1.5) to determine the set of coefficients $c_j^{(i)}$ corresponding to that particular root. In the particular case of $W_1$, this yields the set of equations
$$\begin{aligned}
(H_{11} - W_1 S_{11})c_1^{(1)} + (H_{12} - W_1 S_{12})c_2^{(1)} &= 0 \\
(H_{21} - W_1 S_{21})c_1^{(1)} + (H_{22} - W_1 S_{22})c_2^{(1)} &= 0
\end{aligned} \tag{1.8a}$$
$$\begin{aligned}
(H_{33} - W_1 S_{33})c_3^{(1)} + (H_{34} - W_1 S_{34})c_4^{(1)} &= 0 \\
(H_{43} - W_1 S_{43})c_3^{(1)} + (H_{44} - W_1 S_{44})c_4^{(1)} &= 0.
\end{aligned} \tag{1.8b}$$
Now, $W_1$ was a root of (1.7a), so the determinant of the coefficients in (1.8a) must vanish, and we have a nontrivial solution for $c_1^{(1)}$ and $c_2^{(1)}$. However, $W_1$ was not a root of (1.7b), so the determinant of the coefficients in (1.8b) does not vanish, and hence there is only the trivial solution $c_3^{(1)} = c_4^{(1)} = 0$. Thus the trial function for $W_1$ is $\varphi_1 = c_1^{(1)}f_1 + c_2^{(1)}f_2$. Exactly the same reasoning applies to the other three roots, and we have the trial functions
$$\varphi_1 = c_1^{(1)}f_1 + c_2^{(1)}f_2 \qquad \varphi_3 = c_1^{(3)}f_1 + c_2^{(3)}f_2$$
$$\varphi_2 = c_3^{(2)}f_3 + c_4^{(2)}f_4 \qquad \varphi_4 = c_3^{(4)}f_3 + c_4^{(4)}f_4.$$
So we see that the even states $\psi_1$ and $\psi_3$ are approximated by the trial functions $\varphi_1$ and $\varphi_3$ consisting of linear combinations of the even functions $f_1$ and $f_2$. Similarly, the odd states $\psi_2$ and $\psi_4$ are approximated by the trial functions $\varphi_2$ and $\varphi_4$ that are linear combinations of the odd functions $f_3$ and $f_4$.

To proceed any further, we need to evaluate the non-zero integrals $H_{ij}$ and $S_{ij}$. From Example 1.1 we can immediately write down $H_{11}$ and $S_{11}$. The rest of the integrals are also straightforward to evaluate, and the result is
$$H_{11} = \hbar^2 l^3/6m \qquad H_{12} = H_{21} = \hbar^2 l^5/30m \qquad H_{22} = \hbar^2 l^7/105m$$
$$H_{33} = \hbar^2 l^5/40m \qquad H_{44} = \hbar^2 l^9/1260m \qquad H_{34} = H_{43} = \hbar^2 l^7/280m$$
$$S_{11} = l^5/30 \qquad S_{12} = S_{21} = l^7/140 \qquad S_{22} = l^9/630$$
$$S_{33} = l^7/840 \qquad S_{44} = l^{11}/27720 \qquad S_{34} = S_{43} = l^9/5040.$$
Substituting these results into equation (1.7a) to determine $W_1$ and $W_3$, we have
$$\begin{vmatrix}
\dfrac{\hbar^2 l^3}{6m} - \dfrac{l^5}{30}W & \dfrac{\hbar^2 l^5}{30m} - \dfrac{l^7}{140}W \\[2ex]
\dfrac{\hbar^2 l^5}{30m} - \dfrac{l^7}{140}W & \dfrac{\hbar^2 l^7}{105m} - \dfrac{l^9}{630}W
\end{vmatrix} = 0.$$
To evaluate this, it is easiest to recall that multiplying any single row of a determinant by some scalar is the same as multiplying the original determinant by that same scalar. (This is an obvious consequence of the definition
$$\det A = \sum_{i_1,\ldots,i_n=1}^n \varepsilon^{i_1\cdots i_n}\,a_{1i_1}\cdots a_{ni_n}.)$$
Since the right hand side of this equation is zero, we don't change anything by multiplying any row in this determinant by some constant. Multiplying the first row by $420m/l^3$ and the second row by $1260m/l^5$ we obtain
$$\begin{vmatrix}
70\hbar^2 - 14ml^2 W & 14\hbar^2 l^2 - 3ml^4 W \\
42\hbar^2 - 9ml^2 W & 12\hbar^2 l^2 - 2ml^4 W
\end{vmatrix} = 0 \tag{1.9}$$
or
$$m^2 l^4 W^2 - 56m\hbar^2 l^2 W + 252\hbar^4 = 0.$$
The roots of this quadratic are
$$W_{1,3} = \frac{\hbar^2}{ml^2}\big(28 \mp \sqrt{532}\big) = 4.93487\,\frac{\hbar^2}{ml^2},\ 51.0651\,\frac{\hbar^2}{ml^2}.$$
Similarly, substituting the values for $H_{ij}$ and $S_{ij}$ into (1.7b) results in
$$W_{2,4} = \frac{\hbar^2}{ml^2}\big(60 \mp \sqrt{1620}\big) = 19.7508\,\frac{\hbar^2}{ml^2},\ 100.249\,\frac{\hbar^2}{ml^2}.$$
For comparison, the first four exact solutions $E_n = n^2\pi^2\hbar^2/2ml^2$ are
$$E_n = 4.9348\,\frac{\hbar^2}{ml^2},\ 19.7392\,\frac{\hbar^2}{ml^2},\ 44.4132\,\frac{\hbar^2}{ml^2},\ 78.9568\,\frac{\hbar^2}{ml^2}$$
so the errors are (in order of increasing energy level) 0.0014%, 0.059%, 15.0% and 27.0%. As expected, we did great for $n = 1$ and $n = 2$, but not so great for $n = 3$ and $n = 4$.
We still have to find the approximate wave functions that correspond to each of the $W_i$'s. We want to substitute $W_1 = 4.93487\,\hbar^2/ml^2$ into equations (1.8a) and use the integrals we have already evaluated. However, it is somewhat easier to note that the coefficients of $c_{1,2}^{(1)}$ in equations (1.8a) are equivalent to the entries in equation (1.9). Furthermore, as we have already noted, all we can find is the ratio of the $c_i$'s, so the two equations in (1.9) are equivalent, and we only need to use either one of them. (That the equations are equivalent is a consequence of the fact that the determinant (1.9) is zero, so the rows must be linearly dependent. Hence we get no new information by using both rows.)
So choosing the first row we have
$$70\hbar^2 - 14ml^2 W_1 = 70\hbar^2 - 14ml^2(4.93487\,\hbar^2/ml^2) = 0.91182\,\hbar^2$$
$$14\hbar^2 l^2 - 3ml^4 W_1 = 14\hbar^2 l^2 - 3ml^4(4.93487\,\hbar^2/ml^2) = -0.80461\,\hbar^2 l^2$$
so that
$$c_2^{(1)} = \frac{0.91182\,\hbar^2}{0.80461\,\hbar^2 l^2}\,c_1^{(1)} = 1.133\,c_1^{(1)}/l^2.$$
To fix the value of $c_1^{(1)}$ we use the normalization condition:
$$\begin{aligned}
1 = \langle\varphi_1|\varphi_1\rangle &= \big\langle c_1^{(1)}f_1 + c_2^{(1)}f_2 \,\big|\, c_1^{(1)}f_1 + c_2^{(1)}f_2\big\rangle \\
&= [c_1^{(1)}]^2 S_{11} + 2c_1^{(1)}c_2^{(1)}S_{12} + [c_2^{(1)}]^2 S_{22} \\
&= [c_1^{(1)}]^2\left(S_{11} + 2\,\frac{1.133}{l^2}\,S_{12} + \frac{(1.133)^2}{l^4}\,S_{22}\right) \\
&= [c_1^{(1)}]^2\left(\frac{l^5}{30} + 2\,\frac{1.133}{l^2}\,\frac{l^7}{140} + \frac{(1.133)^2}{l^4}\,\frac{l^9}{630}\right) = 0.05156\,[c_1^{(1)}]^2\,l^5
\end{aligned}$$
and hence $c_1^{(1)} = 4.404\,l^{-5/2}$.

Putting this all together we finally obtain
$$\varphi_1 = 4.404\,l^{-5/2}f_1 + 4.990\,l^{-9/2}f_2 = 4.404\,l^{-5/2}\,x(l-x) + 4.990\,l^{-9/2}\,x^2(l-x)^2$$
$$= l^{-1/2}\big[4.404(x/l)(1 - x/l) + 4.990(x/l)^2(1 - x/l)^2\big].$$
As you can see from Figure 2, the function $\varphi_1$ is almost identical to the exact solution $\psi_1 = \sqrt{2}\sin(\pi x/l)$.

[Figure 2: Plot of $\varphi_1$ (trial function) and $\psi_1$ (exact solution) vs $x/l$.]
Repeating all of this with the other roots $W_2$, $W_3$ and $W_4$, we eventually arrive at
$$\varphi_2 = l^{-1/2}\big[16.78(x/l)(1 - x/l)(1/2 - x/l) + 71.85(x/l)^2(1 - x/l)^2(1/2 - x/l)\big]$$
$$\varphi_3 = l^{-1/2}\big[28.65(x/l)(1 - x/l) - 132.7(x/l)^2(1 - x/l)^2\big]$$
$$\varphi_4 = l^{-1/2}\big[98.99(x/l)(1 - x/l)(1/2 - x/l) - 572.3(x/l)^2(1 - x/l)^2(1/2 - x/l)\big].$$
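The whole example can be reproduced numerically. A sketch (assuming NumPy and SciPy; not code from the notes) that builds the $4\times 4$ matrices $H$ and $S$ from the integrals above (with $\hbar = m = l = 1$) and recovers the four roots:

```python
# Numerical reproduction of Example 1.3: solve det(H - W S) = 0 as a
# generalized eigenvalue problem.  The even (f1, f2) and odd (f3, f4)
# blocks decouple exactly as in equations (1.7a)-(1.7b).
import numpy as np
from scipy.linalg import eigh

H = np.zeros((4, 4))
S = np.zeros((4, 4))
H[0, 0], H[0, 1], H[1, 1] = 1/6,  1/30,   1/105    # even block
H[2, 2], H[2, 3], H[3, 3] = 1/40, 1/280,  1/1260   # odd block
S[0, 0], S[0, 1], S[1, 1] = 1/30, 1/140,  1/630
S[2, 2], S[2, 3], S[3, 3] = 1/840, 1/5040, 1/27720
H += np.triu(H, 1).T    # symmetrize: copy upper triangle to lower
S += np.triu(S, 1).T

W, C = eigh(H, S)       # ascending roots W_1, W_2, W_3, W_4
print(W)                # approximately [4.93487, 19.7508, 51.0651, 100.249]

exact = np.arange(1, 5)**2 * np.pi**2 / 2
print(exact)            # approximately [4.9348, 19.7392, 44.4132, 78.9568]
```

Note that `eigh` returns the roots in increasing order, which interleaves the even-block roots ($W_1, W_3$) with the odd-block roots ($W_2, W_4$).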
1.3.1 Proof that the Roots of the Secular Equation are Real

In this section we will prove that the roots of the polynomial in $W$ defined by equation (1.6) are in fact real. In order to show this, we must first review some basic linear algebra.

Let $V$ be a vector space over $\mathbb{C}$. By an inner product on $V$ (sometimes called the Hermitian inner product), we mean a mapping $\langle\cdot,\cdot\rangle : V \times V \to \mathbb{C}$ such that for all $u, v, w \in V$ and $a, b \in \mathbb{C}$ we have

(IP1) $\langle au + bv, w\rangle = a^*\langle u, w\rangle + b^*\langle v, w\rangle$;
(IP2) $\langle u, v\rangle = \langle v, u\rangle^*$;
(IP3) $\langle u, u\rangle \ge 0$ and $\langle u, u\rangle = 0$ if and only if $u = 0$.

If $\{e_i\}$ is a basis for $V$, then in terms of components we have
$$\langle u, v\rangle = \sum_{i,j} u_i^* v_j\langle e_i, e_j\rangle := \sum_{i,j} u_i^* v_j\,g_{ij}$$
where we have defined the (square) matrix $G = (g_{ij}) = (\langle e_i, e_j\rangle)$. As a matrix product, we may write
$$\langle u, v\rangle = u^\dagger G v.$$
I emphasize that this is the most general inner product on $V$, and any inner product can be written in this form. (For example, if $V$ is a real space and $g_{ij} = \langle e_i, e_j\rangle = \delta_{ij}$, then we obtain the usual Euclidean inner product on $V$.) Notice that
$$g_{ij} = \langle e_i, e_j\rangle = \langle e_j, e_i\rangle^* = g_{ji}^*$$
and hence $G = G^\dagger$, so that $G$ is in fact a Hermitian matrix. (Some of you may realize that in the case where $V$ is a real vector space, the matrix $G$ is just the usual metric on $V$.)

Now, given an inner product, we may define a norm on $V$ by $\|u\| = \langle u, u\rangle^{1/2}$. Note that because of condition (IP3), we have $\|u\| \ge 0$, and $\|u\| = 0$ if and only if $u = 0$. This imposes a condition on $G$ because
$$\|u\|^2 = \langle u, u\rangle = u^\dagger G u = \sum_{i,j} u_i^* u_j\,g_{ij} \ge 0$$
and equality holds if and only if $u = 0$. A Hermitian matrix $G$ with the property that $u^\dagger G u > 0$ for all $u \ne 0$ is said to be positive definite.

It is important to realize that, conversely, given a positive definite Hermitian matrix $G$, we can define an inner product by $\langle u, v\rangle = u^\dagger G v$. That this is true follows easily by reversing the above steps.
Another fundamental concept is that of the kernel of a linear transformation (or
matrix). If T is a linear transformation, we dene the kernel of T to be the set
Ker T = u V : Tu = 0 .
A linear transformation whose kernel is zero is said to be nonsingular.
The reason the kernel is so useful is that it allows us to determine whether or not
a linear transformation is an isomorphism (i.e., one-to-one). A linear transformation
T on V is said to be one-to-one if u ,= v implies Tu ,= Tv. An equivalent way
to say this is that Tu = Tv implies u = v (this is the contrapositive statement).
Thus, if Tu = Tv, the using the linearity of T we see that 0 = TuTv = T(uv)
and hence u v Ker T. But if Ker T = 0, then we in fact have u = v so
that T is an isomorphism. Conversely, if T is an isomorphism, then we must have
Ker T = 0. This is because T is one-to-one, and any linear transformation has
the property that T0 = 0. (Because Tu = T(u +0) = Tu +T0 so that T0 = 0.)
Now suppose that T is a nonsingular surjective (i.e., onto) linear transformation
on V . Such a T is said to be a bijection. You should already know that the matrix
representation A = (a
ij
) of T with respect to the basis e
i
for V is dened by
Te
i
=

j
e
j
a
ji
.
This is frequently written as A = [T]
e
. Then the fact that T is a bijection simply
means that the matrix A is invertible (i.e., that A
1
exists).
(Actually, if T : U → V is a nonsingular (one-to-one) linear transformation between two finite-dimensional vector spaces of equal dimensions, then it is automatically surjective. This is a consequence of the well-known rank theorem, which says

rank T + dim Ker T = dim U

where rank T is another term for the dimension of the image of T. Therefore, if Ker T = {0} we have dim Ker T = 0 so that rank T = dim U = dim V. The proof of the rank theorem is also not hard: Let dim U = n, and let w_1, . . . , w_k be a basis for Ker T. Extend this to a basis w_1, . . . , w_n for U. Then Im T is spanned by Tw_{k+1}, . . . , Tw_n, and it is easy to see that these are linearly independent. Thus dim U = n = k + (n − k) = dim Ker T + dim Im T.)
Note that if G is positive definite, then we must have Ker G = {0}. This is because if u ≠ 0 and Gu = 0, we would have ⟨u, u⟩ = uᵀGu = 0, in contradiction to the assumed positive definiteness of G. Thus a positive definite matrix is necessarily nonsingular.
Let us take a more careful look at S_ij = ⟨f_i|f_j⟩. I claim that the matrix S = (S_ij) is positive definite. To show this, I will prove a general result. Suppose I have n linearly independent (complex) vectors v_1, . . . , v_n, and I construct the nonsingular matrix M whose columns are just the vectors v_i. Letting v_ij denote the jth component of the vector v_i, we have

    M = [ v_11  v_21  ···  v_n1 ]
        [ v_12  v_22  ···  v_n2 ]
        [   ⋮     ⋮            ⋮ ]
        [ v_1n  v_2n  ···  v_nn ] .
From this we see that

    M† = [ v*_11  v*_12  ···  v*_1n ]
         [ v*_21  v*_22  ···  v*_2n ]
         [    ⋮      ⋮            ⋮  ]
         [ v*_n1  v*_n2  ···  v*_nn ]
and therefore

    M†M = [ ⟨v_1|v_1⟩  ⟨v_1|v_2⟩  ···  ⟨v_1|v_n⟩ ]
          [ ⟨v_2|v_1⟩  ⟨v_2|v_2⟩  ···  ⟨v_2|v_n⟩ ]
          [      ⋮          ⋮                ⋮    ]
          [ ⟨v_n|v_1⟩  ⟨v_n|v_2⟩  ···  ⟨v_n|v_n⟩ ] .    (1.10)

A matrix of this form is called a Gram matrix.
If I denote the Hermitian matrix M†M by S, then for any vector c ≠ 0 we have

⟨c|Sc⟩ = ⟨c|M†Mc⟩ = ⟨Mc|Mc⟩ = ‖Mc‖² > 0

so that S is positive definite. That this is strictly greater than zero (and not greater than or equal to zero) follows from the fact that M is nonsingular so its kernel is {0}, together with the assumption that c ≠ 0. In other words, any matrix of the form (1.10) is positive definite.
But this is exactly what we had when we defined S_ij = ⟨f_i|f_j⟩ = ⟨i|j⟩, where the linearly independent functions f_i define a basis for a vector space. In other words, what we really have is f_i = v_i, so that the matrix M†M defined above is exactly the matrix S defined by S_ij = ⟨i|j⟩.
With all of this formalism out of the way, it is now easy to show that the roots of the secular equation are real. Let us write equation (1.5) in matrix form as

Hc = WSc

so that

⟨c|Hc⟩ = W⟨c|Sc⟩ .

On the other hand, using the fact that H is Hermitian and S is real and symmetric, we can write

⟨c|Hc⟩ = ⟨Hc|c⟩ = ⟨WSc|c⟩ = W*⟨Sc|c⟩ = W*⟨c|Sc⟩ .

Thus we have

(W − W*)⟨c|Sc⟩ = 0

which implies W = W* because c ≠ 0, so that ⟨c|Sc⟩ > 0.
Note that this proof is also valid in the case where ψ is complex because (1.5) still holds, and S = M†M is Hermitian so that ⟨Sc|c⟩ = ⟨c|Sc⟩.
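The two facts just proved — a Gram matrix S = M†M is positive definite, and the roots of Hc = WSc are real when H is Hermitian — are easy to check numerically. A minimal sketch (assuming NumPy; the dimension and random seed here are arbitrary illustrations, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Columns of M are n linearly independent complex vectors v_i,
# so S = M^dagger M is the Gram matrix S_ij = <v_i|v_j> of (1.10).
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
S = M.conj().T @ M

# Positive definiteness: all eigenvalues of the Hermitian matrix S
# are strictly positive, i.e. <c|Sc> = ||Mc||^2 > 0 for c != 0.
assert np.all(np.linalg.eigvalsh(S) > 0)

# A Hermitian H for the secular problem Hc = W S c.
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = A + A.conj().T

# Solve the generalized problem as an ordinary one for S^{-1} H.
W = np.linalg.eigvals(np.linalg.solve(S, H))
print(np.max(np.abs(W.imag)))  # essentially zero: the roots W are real
```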
2 Time-Independent Perturbation Theory

2.1 Perturbation Theory for a Nondegenerate Energy Level

Suppose that we want to solve the time-independent Schrödinger equation Hψ_n = E_n ψ_n, but the Hamiltonian is too complicated for us to find an exact solution. However, let us suppose that the Hamiltonian can be written in the form

H = H_0 + λH′

where we know the exact solutions to H_0 ψ_n^(0) = E_n^(0) ψ_n^(0). (We will use a superscript 0 to denote the energies and eigenstates of the unperturbed Hamiltonian H_0.) The additional term H′ is called a perturbation, and it must in some sense be considered small relative to H_0. The dimensionless parameter λ is redundant, but is introduced for mathematical convenience; it will not remain a part of our final solution. For example, the unperturbed Hamiltonian H_0 could be the (free) hydrogen atom, and the perturbation H′ could represent the interaction energy eE·r of the electron with an electric field E. (This leads to an energy level shift called the Stark effect.)
The full (i.e., interacting or perturbed) Schrödinger equation is written

Hψ_n = (H_0 + λH′)ψ_n = E_n ψ_n    (2.1)

and the unperturbed equation is

H_0 ψ_n^(0) = E_n^(0) ψ_n^(0) .    (2.2)

We think of the parameter λ as varying from 0 to 1, and taking the system smoothly from the unperturbed system described by H_0 to the fully interacting system described by H. And as long as we are discussing nondegenerate states, we can think of each unperturbed state ψ_n^(0) as undergoing a smooth transition to the exact state ψ_n. In other words,

lim_{λ→0} ψ_n = ψ_n^(0)  and  lim_{λ→0} E_n = E_n^(0) .
Since the states ψ_n = ψ_n(λ, x) and energies E_n = E_n(λ) depend on λ, let us expand both in a Taylor series about λ = 0:

ψ_n = ψ_n^(0) + λ(∂ψ_n/∂λ)_{λ=0} + (λ²/2!)(∂²ψ_n/∂λ²)_{λ=0} + ···

E_n = E_n^(0) + λ(dE_n/dλ)_{λ=0} + (λ²/2!)(d²E_n/dλ²)_{λ=0} + ··· .

Now introduce the notation

ψ_n^(k) = (1/k!)(∂^k ψ_n/∂λ^k)_{λ=0}    E_n^(k) = (1/k!)(d^k E_n/dλ^k)_{λ=0}
so we can write

ψ_n = ψ_n^(0) + λψ_n^(1) + λ²ψ_n^(2) + ···    (2.3a)

E_n = E_n^(0) + λE_n^(1) + λ²E_n^(2) + ··· .    (2.3b)

For each k = 1, 2, . . . we call ψ_n^(k) and E_n^(k) the kth-order correction to the wavefunction and energy. We assume that the series converges for λ = 1, and that the first few terms give a good approximation to the exact solutions.
It will be convenient to simplify some of our notation, so integrals such as ⟨ψ_n^(j)|ψ_n^(k)⟩ will simply be written ⟨n^(j)|n^(k)⟩. We assume that the unperturbed states are orthonormal so that

⟨m^(0)|n^(0)⟩ = δ_mn

and we also choose our normalization so that

⟨n^(0)|n⟩ = 1 .    (2.4)

If this last condition on ψ_n isn't satisfied, then multiplying ψ_n by ⟨n^(0)|n⟩⁻¹ will ensure that it is. Since multiplying the Schrödinger equation Hψ_n = E_n ψ_n by a constant doesn't change E_n, this has no effect on the energy levels. If so desired, at the end of the calculation we can always re-normalize ψ_n in the usual way.
Substituting (2.3a) into (2.4) yields

1 = ⟨n^(0)|n^(0)⟩ + λ⟨n^(0)|n^(1)⟩ + λ²⟨n^(0)|n^(2)⟩ + ··· .

Now, it is a general result that if you have a power series equation of the form Σ_{n=0}^∞ a_n x^n = 0 for all x, then a_n = 0 for all n. That a_0 = 0 follows by letting x = 0. Now take the derivative with respect to x and let x = 0 to obtain a_1 = 0. Taking the derivative again and letting x = 0 yields a_2 = 0. Clearly we can continue this procedure to arrive at a_n = 0 for all n. Applying this result to the above power series in λ and using the fact that ⟨n^(0)|n^(0)⟩ = 1, we conclude that

⟨n^(0)|n^(k)⟩ = 0  for all k = 1, 2, . . . .    (2.5)
We now substitute equations (2.3) into the Schrödinger equation (2.1):

(H_0 + λH′)(ψ_n^(0) + λψ_n^(1) + λ²ψ_n^(2) + ···)
    = (E_n^(0) + λE_n^(1) + λ²E_n^(2) + ···)(ψ_n^(0) + λψ_n^(1) + λ²ψ_n^(2) + ···)

or, grouping powers of λ,

H_0 ψ_n^(0) + λ(H_0 ψ_n^(1) + H′ψ_n^(0)) + λ²(H_0 ψ_n^(2) + H′ψ_n^(1)) + ···
    = E_n^(0) ψ_n^(0) + λ(E_n^(0) ψ_n^(1) + E_n^(1) ψ_n^(0))
      + λ²(E_n^(0) ψ_n^(2) + E_n^(1) ψ_n^(1) + E_n^(2) ψ_n^(0)) + ··· .
Again ignoring questions of convergence, we can equate powers of λ on both sides of this equation. For λ⁰ we simply have

H_0 ψ_n^(0) = E_n^(0) ψ_n^(0)    (2.6a)

which doesn't tell us anything new. For λ¹ we have

H_0 ψ_n^(1) + H′ψ_n^(0) = E_n^(0) ψ_n^(1) + E_n^(1) ψ_n^(0)

or

(H_0 − E_n^(0))ψ_n^(1) = (E_n^(1) − H′)ψ_n^(0) .    (2.6b)

For λ² we have

H_0 ψ_n^(2) + H′ψ_n^(1) = E_n^(0) ψ_n^(2) + E_n^(1) ψ_n^(1) + E_n^(2) ψ_n^(0)

or

(H_0 − E_n^(0))ψ_n^(2) = (E_n^(1) − H′)ψ_n^(1) + E_n^(2) ψ_n^(0) .    (2.6c)

And in general we have for k ≥ 1

(H_0 − E_n^(0))ψ_n^(k) = (E_n^(1) − H′)ψ_n^(k−1) + E_n^(2) ψ_n^(k−2) + ··· + E_n^(k) ψ_n^(0) .    (2.6d)
Notice that at each step along the way, ψ_n^(k) is determined by ψ_n^(k−1), ψ_n^(k−2), . . . , ψ_n^(0). We can also add an arbitrary multiple of ψ_n^(0) to each ψ_n^(k) without affecting the left side of these equations. Hence we can choose this multiple so that ⟨n^(0)|n^(k)⟩ = 0 for k ≥ 1, which is the same result as we had in (2.5).
Now using the hermiticity of H_0 and the fact that E_n^(0) is real, we have

⟨n^(0)|H_0 n^(k)⟩ = ⟨H_0 n^(0)|n^(k)⟩ = E_n^(0)⟨n^(0)|n^(k)⟩ = 0  for k ≥ 1 .

Then multiplying (2.6d) from the left by ψ_n^(0)* and integrating, we see that the left-hand side vanishes, and we are left with (since ⟨n^(0)|n^(0)⟩ = 1)

0 = −⟨n^(0)|H′n^(k−1)⟩ + E_n^(k)

or

E_n^(k) = ⟨n^(0)|H′n^(k−1)⟩  for k ≥ 1 .    (2.7)

In particular, we have the extremely important result for the first order energy correction to the nth state

E_n^(1) = ⟨n^(0)|H′n^(0)⟩ = ∫ ψ_n^(0)* H′ ψ_n^(0) dx .    (2.8)

Letting λ = 1 in (2.3b), we see that to first order, the energy of the nth state is given by

E_n ≈ E_n^(0) + E_n^(1) = E_n^(0) + ∫ ψ_n^(0)* H′ ψ_n^(0) dx .
Example 2.1. Let the unperturbed system be the free harmonic oscillator, with ground-state wavefunction

ψ_0^(0) = (mω/πℏ)^{1/4} e^{−mωx²/2ℏ}

and energy levels

E_n^(0) = (n + 1/2)ℏω .

Now consider the anharmonic oscillator with Hamiltonian

H = H_0 + H′ := H_0 + ax³ + bx⁴ .

The first-order energy correction to the ground state is given by

E_0^(1) = ⟨0^(0)|H′0^(0)⟩ = (mω/πℏ)^{1/2} ∫_{−∞}^{∞} e^{−mωx²/ℏ}(ax³ + bx⁴) dx .

However, the integral over x³ vanishes by symmetry (the integral of an odd function over an even interval), and we are left with (letting α = mω/ℏ)

E_0^(1) = b(mω/πℏ)^{1/2} ∫_{−∞}^{∞} x⁴ e^{−mωx²/ℏ} dx = b(α/π)^{1/2} ∫_{−∞}^{∞} x⁴ e^{−αx²} dx

       = b(α/π)^{1/2} (d²/dα²) ∫_{−∞}^{∞} e^{−αx²} dx = b(α/π)^{1/2} (d²/dα²)(π/α)^{1/2} = 3b/4α² = 3bℏ²/4m²ω² .

Thus, to first order, the ground state energy of the anharmonic oscillator is given by

E_0 ≈ E_0^(0) + E_0^(1) = ℏω/2 + 3bℏ²/4m²ω² .
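This first-order result is easy to check against a direct numerical diagonalization. A minimal sketch (not part of the notes), in oscillator units ℏ = m = ω = 1 so that the predicted shift is 3b/4, using a truncated number-basis representation with x = (a + a†)/√2:

```python
import numpy as np

# Oscillator units hbar = m = omega = 1, so E_n^(0) = n + 1/2 and
# first-order theory predicts E_0 ≈ 1/2 + (3/4) b for the b x^4 term
# (the odd a x^3 term contributes nothing at first order).
N = 80          # basis truncation (chosen large enough for the ground state)
b = 1e-3        # small quartic coupling

# Lowering operator and position operator in the number basis.
a = np.diag(np.sqrt(np.arange(1, N)), 1)
x = (a + a.T) / np.sqrt(2)

H0 = np.diag(np.arange(N) + 0.5)
H = H0 + b * np.linalg.matrix_power(x, 4)

E0 = np.linalg.eigvalsh(H)[0]
print(E0, 0.5 + 0.75 * b)   # the two agree up to O(b^2)
```

The leftover discrepancy is the second-order (and higher) corrections, which are O(b²) and hence tiny for this choice of b.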
Now let's find the first-order correction to the wavefunction. Since the unperturbed states ψ_n^(0) form a complete orthonormal set, we may expand ψ_n^(1) in terms of them as

ψ_n^(1) = Σ_m a_m ψ_m^(0)  where  a_m = ⟨m^(0)|n^(1)⟩ .

(It would be way too cluttered to try and label these expansion coefficients to denote the fact that they also refer to the first-order correction of the nth state.) Then for m ≠ n, we multiply (2.6b) from the left by ψ_m^(0)* and integrate:

⟨m^(0)|(H_0 − E_n^(0))n^(1)⟩ = E_n^(1)⟨m^(0)|n^(0)⟩ − ⟨m^(0)|H′n^(0)⟩

or (since H_0 ψ_m^(0) = E_m^(0) ψ_m^(0) and ⟨m^(0)|n^(0)⟩ = 0 for m ≠ n)

(E_m^(0) − E_n^(0))⟨m^(0)|n^(1)⟩ = −⟨m^(0)|H′n^(0)⟩ .

Therefore

a_m = ⟨m^(0)|n^(1)⟩ = ⟨m^(0)|H′n^(0)⟩ / (E_n^(0) − E_m^(0))  for m ≠ n .    (2.9)

You should realize that this last step was where the assumed nondegeneracy of the states came in. In order for us to divide by E_n^(0) − E_m^(0), we must assume that it is nonzero. This is true as long as m ≠ n implies that E_n^(0) ≠ E_m^(0). Since a_n = ⟨n^(0)|n^(1)⟩ = 0 (this is equation (2.5)), we finally obtain

ψ_n^(1) = Σ_{m≠n} [⟨m^(0)|H′n^(0)⟩ / (E_n^(0) − E_m^(0))] ψ_m^(0) .    (2.10)
Now that we have the first-order correction to the wavefunction, it is easy to get the second-order correction to the energy. Using (2.10) in (2.7) with k = 2 we immediately have

E_n^(2) = Σ_{m≠n} ⟨m^(0)|H′n^(0)⟩⟨n^(0)|H′m^(0)⟩ / (E_n^(0) − E_m^(0)) = Σ_{m≠n} |⟨n^(0)|H′m^(0)⟩|² / (E_n^(0) − E_m^(0)) .    (2.11)
The last term we will compute is the second-order correction to the wavefunction. We again expand in terms of the ψ_n^(0) as

ψ_n^(2) = Σ_m b_m ψ_m^(0)

where b_m = ⟨m^(0)|n^(2)⟩. Multiplying (2.6c) from the left by ψ_m^(0)* and integrating we have (assuming m ≠ n)

(E_m^(0) − E_n^(0))⟨m^(0)|n^(2)⟩ = E_n^(1)⟨m^(0)|n^(1)⟩ − ⟨m^(0)|H′n^(1)⟩

or

b_m = ⟨m^(0)|n^(2)⟩ = [E_n^(1)/(E_m^(0) − E_n^(0))] ⟨m^(0)|n^(1)⟩ − ⟨m^(0)|H′n^(1)⟩ / (E_m^(0) − E_n^(0)) .

Now use (2.9) in the first term on the right-hand side and use (2.10) in the second term to write

b_m = Σ_{k≠n} ⟨m^(0)|H′k^(0)⟩⟨k^(0)|H′n^(0)⟩ / [(E_n^(0) − E_m^(0))(E_n^(0) − E_k^(0))] − E_n^(1)⟨m^(0)|H′n^(0)⟩ / (E_n^(0) − E_m^(0))² .
Using (2.8) we finally obtain

ψ_n^(2) = Σ_{m≠n} Σ_{k≠n} [⟨m^(0)|H′k^(0)⟩⟨k^(0)|H′n^(0)⟩ / ((E_n^(0) − E_m^(0))(E_n^(0) − E_k^(0)))] ψ_m^(0)
        − Σ_{m≠n} [⟨m^(0)|H′n^(0)⟩⟨n^(0)|H′n^(0)⟩ / (E_n^(0) − E_m^(0))²] ψ_m^(0) .    (2.12)
Let me make several points. First, recall that because of equation (2.4), our states are not normalized. Second, be sure to realize that the sums in equations (2.10), (2.11) and (2.12) are over states, and not energy levels. If some of the energy levels other than the nth are degenerate, then we must include a term in each of these sums for each linearly independent wavefunction corresponding to the degenerate energy level. The reason for this is that the expansions of ψ_n^(1) and ψ_n^(2) were in terms of a complete set of functions, and hence we must be sure to include all linearly independent states in the sums. Furthermore, if there happens to be a continuum of states in the unperturbed system, then we must also include an integral over these so that we have included all linearly independent states in our expansion.
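Formulas (2.8) and (2.11) can be exercised on a finite-dimensional toy problem (a sketch of my own, assuming NumPy; the level spacings and seed are arbitrary). For a diagonal, nondegenerate H_0 plus a Hermitian λH′, the perturbative series should reproduce the exact eigenvalue through order λ²:

```python
import numpy as np

rng = np.random.default_rng(1)
d, lam, n = 6, 1e-3, 0

E0 = np.arange(d, dtype=float)     # nondegenerate unperturbed levels
A = rng.normal(size=(d, d))
Hp = A + A.T                       # Hermitian (real symmetric) H'

# First- and second-order corrections, equations (2.8) and (2.11).
E1 = Hp[n, n]
E2 = sum(Hp[n, m] ** 2 / (E0[n] - E0[m]) for m in range(d) if m != n)

E_exact = np.linalg.eigvalsh(np.diag(E0) + lam * Hp)[n]
E_pert = E0[n] + lam * E1 + lam**2 * E2
print(abs(E_exact - E_pert))       # residual is O(lam^3)
```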
2.2 Perturbation Theory for a Degenerate Energy Level

We now turn to the perturbation treatment of a degenerate energy level, meaning that there are multiple unperturbed states that all have the same energy. If we let d be the degree of degeneracy, then we have states ψ_1^(0), . . . , ψ_d^(0) satisfying the unperturbed Schrödinger equation

H_0 ψ_n^(0) = E_n^(0) ψ_n^(0)    (2.13a)

with

E_1^(0) = E_2^(0) = ··· = E_d^(0) .    (2.13b)

You must be careful with the notation here, because we don't want to clutter it up with too many indices. Even though we write E_1^(0), . . . , E_d^(0), this does not mean that these are necessarily the d lowest-lying states that satisfy the unperturbed Schrödinger equation. We are referring here to a single degenerate energy level.
The interacting (or perturbed) Schrödinger equation is

Hψ_n = (H_0 + λH′)ψ_n = E_n ψ_n .
In our treatment of a nondegenerate energy level, we assumed that lim_{λ→0} E_n = E_n^(0) and lim_{λ→0} ψ_n = ψ_n^(0), where the state ψ_n^(0) was unique. However, in the case of degeneracy, the second of these does not hold. While it is true that as λ goes to zero we still have

lim_{λ→0} E_n = E_n^(0)

the presence of the perturbation generally splits the degenerate energy level into multiple distinct states. However, there are varying degrees of splitting, and while the perturbation may completely remove the degeneracy, it may also only partially remove it or have no effect at all. This is illustrated in the figure below.
[Figure 3: Splitting of energy levels due to a perturbation. As λ runs from 0 to 1, some degenerate levels split completely, some split only partially, and some are unaffected.]
The important point to realize here is that in the limit λ → 0, the state ψ_n does not necessarily go to a unique ψ_n^(0), but rather only to some linear combination of the normalized degenerate states ψ_1^(0), . . . , ψ_d^(0). This is because any such linear combination

c_1 ψ_1^(0) + c_2 ψ_2^(0) + ··· + c_d ψ_d^(0)

will satisfy (2.13a) with the same eigenvalue E_n^(0). Thus there are an infinite number of such linear combinations made up of these d linearly independent normalized eigenfunctions, and any of them will work as the unperturbed state.
For example, recall that the hydrogen atom states are labeled ψ_nlm, where the energy depends only on n, and the factor e^{imφ} makes the wave function complex for m ≠ 0. The 2p states correspond to n = 2 and l = 1, and these include the complex wave functions 2p_1 and 2p_{−1}. However, instead of these complex wave functions, we can take the real linear combinations defined by

ψ_{2p_x} = (1/√2)(ψ_{2p_1} + ψ_{2p_{−1}})

and

ψ_{2p_y} = (1/i√2)(ψ_{2p_1} − ψ_{2p_{−1}})

which have the same energies. For most purposes in chemistry, these real wave functions are much more convenient to work with. And while the 2p_0, 2p_1 and 2p_{−1} states are degenerate, the presence of an electric or magnetic field will split the degeneracy because the interaction term in the Hamiltonian depends on the m value of the state.
Returning to our problem, all we can say is that

lim_{λ→0} ψ_n = Σ_{i=1}^d c_i ψ_i^(0) ,  1 ≤ n ≤ d .

Hence the first thing we must do is determine the correct zeroth-order wave functions, which we denote by φ_n^(0). In other words,

φ_n^(0) := lim_{λ→0} ψ_n = Σ_{i=1}^d c_i ψ_i^(0) ,  1 ≤ n ≤ d    (2.14)

where each φ_n^(0) has a different set of coefficients c_i. (These should be labeled c_i^(n), but I'm trying to keep it simple.) Note that since H_0 ψ_i^(0) = E_d^(0) ψ_i^(0) for each i = 1, . . . , d, it follows that

H_0 φ_n^(0) = E_d^(0) φ_n^(0) .    (2.15)
For the d-fold degenerate case, we proceed as in the nondegenerate case, except that now we use φ_n^(0) instead of ψ_n^(0) for the zeroth-order wave function. Then equations (2.3) become

ψ_n = φ_n^(0) + λψ_n^(1) + λ²ψ_n^(2) + ···    (2.16a)

E_n = E_d^(0) + λE_n^(1) + λ²E_n^(2) + ···    (2.16b)

where we have used (2.13b). Equations (2.16) apply for each n = 1, . . . , d. As in the nondegenerate case, we substitute these into the Schrödinger equation Hψ_n = E_n ψ_n and equate powers of λ. This is exactly the same as we had before, except that now we have φ_n^(0) instead of ψ_n^(0), so we can immediately write down the results from equations (2.6).
Equating the coefficients of λ⁰ we have H_0 φ_n^(0) = E_d^(0) φ_n^(0). Since for each n = 1, . . . , d the linear combination φ_n^(0) is an eigenstate of H_0 with eigenvalue E_d^(0) (this is just the statement of equation (2.15)), this doesn't give us any new information.
From the coefficients of λ¹ we have (for each n = 1, . . . , d)

(H_0 − E_d^(0))ψ_n^(1) = (E_n^(1) − H′)φ_n^(0) .    (2.17)
Multiplying this from the left by φ_n^(0)* and integrating we have (here I'm not using ⟨n^(0)| as a shorthand for φ_n^(0), to make sure there is no confusion with ψ_n^(0))

⟨φ_n^(0)|H_0 ψ_n^(1)⟩ − E_d^(0)⟨φ_n^(0)|ψ_n^(1)⟩ = E_n^(1)⟨φ_n^(0)|φ_n^(0)⟩ − ⟨φ_n^(0)|H′φ_n^(0)⟩ .

Using (2.15) we see that the left-hand side of this equation vanishes, so assuming that the correct zeroth-order wave functions are normalized, we arrive at the first order correction to the energy

E_n^(1) = ⟨φ_n^(0)|H′φ_n^(0)⟩ .    (2.18)
This is similar to the nondegenerate result (2.8) except that now we use the correct zeroth-order wave functions. Of course, in order to evaluate these integrals, we must know the functions φ_n^(0) which, so far, we don't.
So, for any 1 ≤ m ≤ d, we multiply (2.17) from the left by one of the d-fold degenerate unperturbed wave functions ψ_m^(0)* and integrate to obtain

⟨ψ_m^(0)|H_0 ψ_n^(1)⟩ − E_d^(0)⟨ψ_m^(0)|ψ_n^(1)⟩ = E_n^(1)⟨ψ_m^(0)|φ_n^(0)⟩ − ⟨ψ_m^(0)|H′φ_n^(0)⟩ .

Since H_0 ψ_m^(0) = E_d^(0) ψ_m^(0), we see that the left-hand side of this equation vanishes, and we are left with

⟨ψ_m^(0)|H′φ_n^(0)⟩ − E_n^(1)⟨ψ_m^(0)|φ_n^(0)⟩ = 0 ,  m = 1, . . . , d .

There is no loss of generality in assuming that the zeroth-order wave functions ψ_i^(0) of the degenerate level are orthonormal, so we take

⟨ψ_m^(0)|ψ_i^(0)⟩ = δ_mi  for m, i = 1, . . . , d .    (2.19)

(If the zeroth-order wave functions ψ_i^(0) aren't orthonormal, then apply the Gram-Schmidt process to construct an orthonormal set. Since the new orthonormal functions are just linear combinations of the original set, and the correct zeroth-order functions φ_n^(0) are linear combinations of the ψ_i^(0), the φ_n^(0) will just be different linear combinations of the new orthonormal functions.) Then substituting the definition (2.14) for φ_n^(0) we have

Σ_{i=1}^d c_i ⟨ψ_m^(0)|H′ψ_i^(0)⟩ − E_n^(1) Σ_{i=1}^d c_i ⟨ψ_m^(0)|ψ_i^(0)⟩ = 0

or

Σ_{i=1}^d (H′_mi − E_n^(1) δ_mi) c_i = 0 ,  m = 1, . . . , d    (2.20a)

where

H′_mi = ⟨ψ_m^(0)|H′ψ_i^(0)⟩ .

This is just another homogeneous system of d equations in the d unknowns c_i. In fact, if we let c be the vector with components c_i, then we can write (2.20a) in matrix form as

H′c = E_n^(1) c    (2.20b)

which shows that this is nothing more than an eigenvalue equation for the matrix H′ acting on the d-dimensional eigenspace of degenerate wave functions.
As usual, if (2.20a) is to have a nontrivial solution, we must have the secular equation

det(H′_mi − E_n^(1) δ_mi) = 0 .    (2.21)
Written out, this looks like

    | H′_11 − E_n^(1)   H′_12             ···   H′_1d           |
    | H′_21             H′_22 − E_n^(1)   ···   H′_2d           |
    |     ⋮                 ⋮                       ⋮           |  = 0 .
    | H′_d1             H′_d2             ···   H′_dd − E_n^(1) |

This is a polynomial of degree d in E_n^(1), and the d roots E_1^(1), E_2^(1), . . . , E_d^(1) are the first-order corrections to the energy of the d-fold degenerate unperturbed state. So, we solve (2.21) for the eigenvalues E_n^(1), and use these in (2.20b) to solve for the eigenvectors c. These then define the correct zeroth-order wave functions according to (2.14).
Again, note that all we are doing is finding the eigenvalues and eigenvectors of the matrix H′_mi. And since H′ is Hermitian, eigenvectors belonging to distinct eigenvalues are orthogonal. But each eigenvector c has components that are just the expansion coefficients in (2.14), and therefore (reverting to a more complete notation)

⟨φ_m^(0)|H′φ_n^(0)⟩ = Σ_{i,j=1}^d c_i^(m)* ⟨ψ_i^(0)|H′ψ_j^(0)⟩ c_j^(n) = Σ_{i,j=1}^d c_i^(m)* H′_ij c_j^(n)
                    = c^(m)† H′ c^(n) = E_n^(1) c^(m)† c^(n) = E_n^(1) ⟨c^(m)|c^(n)⟩

or

⟨φ_m^(0)|H′φ_n^(0)⟩ = E_n^(1) δ_mn    (2.22)

where we assume that the eigenvectors are normalized.
In the case where m = n, we arrive back at (2.18). What about the case m ≠ n? Recall that in our treatment of nondegenerate perturbation theory, the reason we had to assume the nondegeneracy was because equations (2.10) and (2.11) would blow up if there were another state ψ_m^(0) with the same energy as ψ_n^(0). However, in that case, we would be saved if the numerator also went to zero, and that is precisely what happens if we use the correct zeroth-order wave functions. Essentially then, the degenerate case proceeds just like the nondegenerate case, except that we must use the correct zeroth-order wave functions.
Returning to (2.21), if all d roots are distinct, then we have completely split the degeneracy into d distinct levels

E_d^(0) + E_1^(1) ,  E_d^(0) + E_2^(1) ,  . . . ,  E_d^(0) + E_d^(1) .

If not all of the roots are distinct, then we have only partly removed the degeneracy (at least to first order). We will assume that all d roots are distinct, and hence that the degeneracy has been completely lifted in first order.
Now that we have the d roots E_n^(1), we can take them one-at-a-time and plug back into the system of equations (2.20a) and solve for c_2, . . . , c_d in terms of c_1. (Recall that because the determinant of the coefficient matrix of the system (2.20a) is zero, the d equations in (2.20a) are linearly dependent, and hence we can only find d − 1 of the unknowns in terms of one of them.) Finally, we fix c_1 by normalization, using equations (2.14) and (2.19):

1 = ⟨φ_n^(0)|φ_n^(0)⟩ = Σ_{i,j=1}^d c_i* c_j ⟨ψ_i^(0)|ψ_j^(0)⟩ = Σ_{i,j=1}^d c_i* c_j δ_ij = Σ_{i=1}^d |c_i|² .    (2.23)

Also be sure to realize that we obtain a separate set of coefficients c_i for each root E_n^(1). This is how we get the d independent zeroth-order wave functions.
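The whole recipe — build H′_mi in the degenerate subspace, diagonalize it per (2.20b), read off the shifts E_n^(1) and the coefficient vectors c — can be illustrated on a small matrix model (a sketch of my own construction, assuming NumPy; the level values and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
d, lam = 3, 1e-4

# H0 has a d-fold degenerate level at E = 1 plus one distant spectator level.
H0 = np.diag([1.0] * d + [5.0])
A = rng.normal(size=(d + 1, d + 1))
Hp = A + A.T                       # Hermitian perturbation

# Degenerate perturbation theory: diagonalize H' restricted to the
# degenerate subspace (equation (2.20b)).  The eigenvalues are the
# E_n^(1); the columns of C are the coefficient vectors c of (2.14).
E1, C = np.linalg.eigh(Hp[:d, :d])

exact = np.linalg.eigvalsh(H0 + lam * Hp)[:d]
print(np.sort(1.0 + lam * E1) - exact)   # agreement up to O(lam^2)
```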
Obviously, finding the roots of (2.21) is a difficult problem in general. However, under some special conditions, the problem may be much more tractable. The best situation would be if all off-diagonal elements H′_mi, m ≠ i, vanished. Then the determinant is just the product of the diagonal elements, and the d roots are simply E_n^(1) = H′_mm for m = 1, . . . , d, or

E_1^(1) = H′_11 ,  E_2^(1) = H′_22 ,  . . . ,  E_d^(1) = H′_dd .

Let us assume that all d roots are distinct. Taking the root E_n^(1) = E_1^(1) = H′_11 as a specific example, (2.20a) becomes the set of d − 1 equations

(H′_22 − E_1^(1))c_2 = 0
(H′_33 − E_1^(1))c_3 = 0
        ⋮
(H′_dd − E_1^(1))c_d = 0 .

Since E_1^(1) = H′_11 ≠ H′_mm for m = 2, 3, . . . , d, it follows that c_2 = c_3 = ··· = c_d = 0. Normalization then implies that c_1 = 1, and the corresponding zeroth-order wave function defined by (2.14) is φ_1^(0) = ψ_1^(0). Clearly this applies to any of the d roots, so we have

φ_i^(0) = ψ_i^(0) ,  i = 1, . . . , d .

Thus we have shown that when the secular equation is diagonal and the d matrix elements H′_mm are all distinct, then the initial wave functions ψ_i^(0) are the correct zeroth-order wave functions φ_i^(0).
Another situation that lends itself to a relatively simple solution is when the secular determinant is block diagonal. For example, in the case where d = 4 we would have

    | H′_11 − E_n^(1)   H′_12             0                 0               |
    | H′_21             H′_22 − E_n^(1)   0                 0               |
    | 0                 0                 H′_33 − E_n^(1)   H′_34           |  = 0 .
    | 0                 0                 H′_43             H′_44 − E_n^(1) |

This is of the same form as we had in Example 1.3 (except with S_ij = δ_ij). Exactly the same reasoning we used to show that two of the variation functions were linear combinations of f_1 and f_2 and two of the variation functions were linear combinations of f_3 and f_4 now shows that the correct zeroth-order wave functions are of the form

φ_1^(0) = c_1^(1) ψ_1^(0) + c_2^(1) ψ_2^(0)    φ_2^(0) = c_1^(2) ψ_1^(0) + c_2^(2) ψ_2^(0)
φ_3^(0) = c_3^(3) ψ_3^(0) + c_4^(3) ψ_4^(0)    φ_4^(0) = c_3^(4) ψ_3^(0) + c_4^(4) ψ_4^(0) .
Is there any way we can choose our initial wave functions ψ_i^(0) to make things easier? Well, referring back to Theorem 1.2, suppose we have a Hermitian operator A that commutes with both H_0 and H′. If we choose our initial wave functions to be eigenfunctions of both A and H_0, then the off-diagonal matrix elements H′_ij = ⟨ψ_i^(0)|H′ψ_j^(0)⟩ will vanish if ψ_i^(0) and ψ_j^(0) belong to different eigenspaces of A. Therefore, if the functions ψ_i^(0) all have different eigenvalues of A, the secular determinant will be diagonal so that the φ_i^(0) = ψ_i^(0).
If more than one ψ_i^(0) belongs to a given eigenvalue a_k of A (in other words, dim V_{a_k} > 1), then this subcollection will form a block in the secular determinant. So in general, we will have a secular determinant that is block diagonal, where each block has size dim V_{a_k}. In this case, each correct zeroth-order wave function will be a linear combination of those ψ_i^(0) that belong to the same eigenvalue of A.
that belong to the same eigenvalue of A.
Before proceeding with an example, let me prove a very important and useful
property of the spherical harmonics. The parity operation is r r, and in
spherical coordinates, this is equivalent to and +.
x
y
z

r
32
Indeed, we know that (for the unit sphere) z = cos , and from the gure we see
that z would be at . Similarly, a point on the x-axis at = 0 goes to the
point x at = . Alternatively, letting in x = sin cos doesnt change
x, so in order to have x x we need cos cos which is accomplished by
letting +.
Now observe that under parity, r r and p p, so that L = r p is
unchanged. Thus angular momentum is a pseudo-vector, as you probably already
knew. But this means that the parity operation commutes with the quantum
mechanical operator L, so that the three operators L
2
, L
z
and are mutually
commuting, and the eigenfunctions Y
m
l
(, ) of angular momentum can be chosen
to have a denite parity. Note also that since and L commute, it follows that
and L

commute, so acting on any Y


m
l
with L

wont change its parity.


Look at the explicit form of the state Y_l^l:

Y_l^l(θ, φ) = (−1)^l [(2l + 1)!/4π]^{1/2} (1/(2^l l!)) (sin θ)^l e^{ilφ} .

Letting θ → π − θ we have (sin θ)^l → (sin θ)^l, but under φ → φ + π we have e^{ilφ} → e^{ilπ} e^{ilφ} = (−1)^l e^{ilφ}. Therefore, under parity we see that Y_l^l → (−1)^l Y_l^l. But we can get to any Y_l^m by repeatedly applying L_− to Y_l^l, and since this doesn't change the parity of Y_l^m, we have the extremely useful result

Y_l^m(π − θ, φ + π) = (−1)^l Y_l^m(θ, φ) .    (2.24)
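Relation (2.24) can be spot-checked numerically for the top state Y_l^l, whose explicit form was given above. A small self-contained check (my own, using only the standard library):

```python
import cmath
import math

def Y_ll(l, theta, phi):
    """Explicit top state Y_l^l from the text (Condon-Shortley phase)."""
    norm = math.sqrt(math.factorial(2 * l + 1) / (4 * math.pi))
    return ((-1) ** l * norm / (2 ** l * math.factorial(l))
            * math.sin(theta) ** l * cmath.exp(1j * l * phi))

# Verify Y_l^l(pi - theta, phi + pi) = (-1)^l Y_l^l(theta, phi).
theta, phi = 0.7, 1.3
for l in range(5):
    lhs = Y_ll(l, math.pi - theta, phi + math.pi)
    rhs = (-1) ** l * Y_ll(l, theta, phi)
    assert abs(lhs - rhs) < 1e-12
print("parity relation (2.24) holds for Y_l^l")
```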
Example 2.2 (Stark Effect). In this example we will take a look at the effect of a uniform electric field E = ℰẑ on a hydrogen atom, where the unperturbed Hamiltonian is given by

H_0 = p²/2m − e²/r

and r = r_1 − r_2 is the relative position vector from the proton to the electron. We first need to find the perturbing potential energy.
The force on a particle of charge q in an electric field E = −∇φ is F = qE = −q∇φ, where φ(r) is the electric potential. On the other hand, the force is also given in terms of the potential energy V(r) by F = −∇V, and hence ∇V = q∇φ, so that

∫_0^r ∇V · dr = q ∫_0^r ∇φ · dr

or

V(r) − V(0) = q[φ(r) − φ(0)] .

If we take V(0) = φ(0) = 0, then we have

V(r) = qφ(r) .
Thus the interaction Hamiltonian H′ consists of both the energy eφ(r_2) of the proton and the energy −eφ(r_1) of the electron, and therefore

H′ = e[φ(r_2) − φ(r_1)] .

But the electric field is constant so that

∫_{r_1}^{r_2} E · dr = E · (r_2 − r_1) = −E · (r_1 − r_2) = −E · r = −ℰz

while we also have

∫_{r_1}^{r_2} E · dr = −∫_{r_1}^{r_2} ∇φ · dr = −[φ(r_2) − φ(r_1)] .

Hence the final form of our perturbation is H′ = eE · r or

H′ = eℰz .

Note also that if we define the electric dipole moment μ_e = e(r_2 − r_1) = −er, then H′ can be called a dipole interaction because

H′ = −μ_e · E .
Let us first consider the ground state ψ_100 of the hydrogen atom. This state is nondegenerate, so the first-order energy correction to the ground state is, from equation (2.8),

E_100^(1) = ⟨ψ_100|eℰz|ψ_100⟩ = eℰ⟨ψ_100|z|ψ_100⟩ .

But H_0 is parity invariant, so the states ψ_nlm all have a definite parity (−1)^l. Then E_100^(1) is the integral of an odd function over an even interval, and hence it vanishes:

E_100^(1) = 0 .

In fact, this shows that any nondegenerate state of the hydrogen atom has no first-order Stark effect.
Now consider the n = 2 levels of hydrogen. This is a four-fold degenerate state consisting of the wave functions ψ_200, ψ_210, ψ_211 and ψ_21−1. Since the parity of the states is given by (−1)^l, we see that the l = 0 state has even parity while the l = 1 states are odd.
However, it is not hard to see that [H′, L_z] = 0. This is either a consequence of the fact that H′ is a function of z = r cos θ while L_z = −iℏ ∂/∂φ, or you can note that [L_i, r_j] = iℏ Σ_k ε_ijk r_k so that [L_z, z] = 0. Either way, we have

0 = ⟨ψ_nl′m′|[H′, L_z]|ψ_nlm⟩ = ⟨ψ_nl′m′|H′L_z − L_z H′|ψ_nlm⟩ = (m − m′)ℏ⟨ψ_nl′m′|H′|ψ_nlm⟩

and hence we have the selection rule

⟨ψ_nl′m′|H′|ψ_nlm⟩ = 0  if m ≠ m′ .

(This is an example of Theorem 1.2.) This shows that H′ can only connect states with the same m values. And since H′ has odd parity, it can only connect states with opposite parities, i.e., in the present case it can only connect an l = 0 state with an l = 1 state.
Suppressing the index n = 2, we order our basis states ψ_lm as ψ_00, ψ_10, ψ_11, ψ_1−1. (In other words, the rows and columns are labeled by these functions in this order.) Then the secular equation (2.21) becomes (also writing E instead of E^(1) for simplicity)

    | −E               ⟨ψ_00|H′|ψ_10⟩   0    0  |
    | ⟨ψ_10|H′|ψ_00⟩   −E               0    0  |  = 0
    | 0                0               −E    0  |
    | 0                0                0   −E  |

or (since it's block diagonal)

[E² − (H′_12)²]E² = 0

where

H′_12 = ⟨ψ_00|H′|ψ_10⟩ = ⟨ψ_10|H′|ψ_00⟩ = H′_21

because both H′ and the wave functions are real. Therefore the roots of the secular equation are

E_n^(1) = ±H′_12 , 0 , 0 .
For our wave functions we have ψ_nlm = R_nl Y_l^m, or

ψ_200 = (1/2a_0³)^{1/2} (1 − r/2a_0) e^{−r/2a_0} Y_0^0

ψ_210 = (1/24a_0³)^{1/2} (r/a_0) e^{−r/2a_0} Y_1^0

where a_0 is the Bohr radius defined by a_0 = ℏ²/m_e e², and hence
H′_12 = ⟨ψ_200|eℰz|ψ_210⟩
      = eℰ (2/√3)(2a_0)⁻³ ∫ e^{−r/a_0} (r/a_0)(1 − r/2a_0) z Y_0^0 Y_1^0 r² dr dΩ .

But

Y_0^0 = 1/√4π  and  z = r cos θ = r(4π/3)^{1/2} Y_1^0

so that using

∫ dΩ Y_l^m* Y_{l′}^{m′} = δ_ll′ δ_mm′

we have

H′_12 = eℰ (2a_0)⁻³ (2/3a_0) ∫ e^{−r/a_0} r⁴ (1 − r/2a_0) Y_1^0 Y_1^0 dr dΩ
      = eℰ (2a_0)⁻³ (2/3a_0) ∫_0^∞ (r⁴ − r⁵/2a_0) e^{−r/a_0} dr .

Using the general result

∫_0^∞ r^n e^{−βr} dr = (−1)^n (d^n/dβ^n) ∫_0^∞ e^{−βr} dr = (−1)^n (d^n/dβ^n) β⁻¹ = n!/β^{n+1}

we finally arrive at

H′_12 = −3eℰa_0 .
Now we need to find the corresponding eigenvectors c that will specify the correct zeroth-order wave functions. These are the solutions to the system of equations H′c = E_n^(1) c for each value of E_n^(1) (see equation (2.20b)). Let E_1^(1) = H′_12. Then the eigenvector c^(1) satisfies

    [ −H′_12    H′_12    0        0      ] [ c_1 ]
    [  H′_12   −H′_12    0        0      ] [ c_2 ]  = 0 .
    [  0         0      −H′_12    0      ] [ c_3 ]
    [  0         0       0       −H′_12  ] [ c_4 ]

This implies that c_1 = c_2 and c_3 = c_4 = 0. Normalizing we have c_1 = c_2 = 1/√2, so that

φ_1^(0) = (1/√2)(ψ_200 + ψ_210) .
Next we let E_2^(1) = −H′_12. Now the eigenvector c^(2) satisfies

    [ H′_12   H′_12   0       0     ] [ c_1 ]
    [ H′_12   H′_12   0       0     ] [ c_2 ]  = 0
    [ 0       0       H′_12   0     ] [ c_3 ]
    [ 0       0       0       H′_12 ] [ c_4 ]

so that c_1 = −c_2 and c_3 = c_4 = 0. Again, normalization yields c_1 = −c_2 = 1/√2, and hence

φ_2^(0) = (1/√2)(ψ_200 − ψ_210) .
Finally, for the two degenerate roots E_3^(1) = E_4^(1) = 0 we have

    [ 0       H′_12   0   0 ] [ c_1 ]
    [ H′_12   0       0   0 ] [ c_2 ]  = 0
    [ 0       0       0   0 ] [ c_3 ]
    [ 0       0       0   0 ] [ c_4 ]

so that c_1 = c_2 = 0 while c_3 and c_4 are completely arbitrary. Thus we can simply choose

φ_3^(0) = ψ_211  and  φ_4^(0) = ψ_21−1 .

In summary, the correct zeroth-order wave functions for treating the Stark effect are φ_1^(0), which gets a first-order energy shift of −3eℰa_0; the wave function φ_2^(0), which gets a first-order energy shift of +3eℰa_0; and the original degenerate states φ_3^(0) = ψ_211 and φ_4^(0) = ψ_21−1, which remain degenerate to this order.
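The 4×4 problem just solved takes only a few lines to reproduce numerically (a check of my own, assuming NumPy and working in units where 3eℰa_0 = 1): diagonalizing H′ in the ordered basis (ψ_200, ψ_210, ψ_211, ψ_21−1) returns the shifts ∓3eℰa_0, 0, 0 and the combinations (ψ_200 ± ψ_210)/√2.

```python
import numpy as np

h12 = -1.0   # H'_12 = -3 e E a_0, in units where 3 e E a_0 = 1

# H' in the ordered basis (psi_200, psi_210, psi_211, psi_21-1);
# the m and parity selection rules leave only the 200 <-> 210 element.
Hp = np.zeros((4, 4))
Hp[0, 1] = Hp[1, 0] = h12

E1, C = np.linalg.eigh(Hp)
print(E1)          # shifts: [-1, 0, 0, 1], i.e. -+ 3 e E a_0 and two zeros
print(C[:, 0])     # lowest shift: (psi_200 + psi_210)/sqrt(2), up to sign
```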
2.3 Perturbation Treatment of the First Excited States of Helium

The helium atom consists of a nucleus with two protons and two neutrons, and two orbiting electrons. If we take the nuclear charge to be +Ze instead of +2e, then our discussion will apply equally well to helium-like ions such as H⁻, Li⁺ or Be²⁺. Neglecting terms such as spin-orbit coupling, the Hamiltonian is

H = −(ℏ²/2m_e)∇_1² − (ℏ²/2m_e)∇_2² − Ze²/r_1 − Ze²/r_2 + e²/r_12    (2.25)

where r_i is the distance to electron i, r_12 is the distance from electron 1 to electron 2, and ∇_i² is the Laplacian with respect to the coordinates of electron i. The Schrödinger equation is thus a function of six variables, the three coordinates for each of the two electrons. (Technically, the electron mass m_e is the reduced mass m = m_e M/(m_e + M), where M is the mass of the nucleus. But M ≫ m_e so that m ≈ m_e. If this isn't familiar to you, we will treat two-body problems such as this in detail when we discuss identical particles.)
Because of the term e²/r_12 the Schrödinger equation isn't separable, and we must resort to approximation methods. We write

H = H^0 + H′
where

$$H^0 = H^0_1 + H^0_2 = -\frac{\hbar^2}{2m_e}\nabla_1^2 - \frac{Ze^2}{r_1} - \frac{\hbar^2}{2m_e}\nabla_2^2 - \frac{Ze^2}{r_2} \tag{2.26}$$

is the sum of two independent hydrogen atom Hamiltonians, and

$$H' = \frac{e^2}{r_{12}}\,. \tag{2.27}$$

We can now use separation of variables to write the unperturbed wave function $\psi(\mathbf{r}_1, \mathbf{r}_2)$ as a product

$$\psi(\mathbf{r}_1, \mathbf{r}_2) = \psi_1(\mathbf{r}_1)\,\psi_2(\mathbf{r}_2)\,.$$
In this case we have the time-independent equation

$$H^0\psi = (H^0_1 + H^0_2)\psi_1\psi_2 = \psi_2H^0_1\psi_1 + \psi_1H^0_2\psi_2 = E\psi_1\psi_2$$

so that dividing by $\psi_1\psi_2$ yields

$$\frac{H^0_1\psi_1}{\psi_1} = E - \frac{H^0_2\psi_2}{\psi_2}\,.$$

Since the left side of this equation is a function of $\mathbf{r}_1$ only, and the right side is a function of $\mathbf{r}_2$ only, each side must in fact be equal to a constant, and we can write $E = E_1 + E_2$ where each $E_i$ is the energy of a hydrogenlike wave function:

$$E_1 = -\frac{Z^2}{n_1^2}\frac{e^2}{2a_0} \qquad E_2 = -\frac{Z^2}{n_2^2}\frac{e^2}{2a_0}$$

and $a_0$ is the Bohr radius

$$a_0 = \frac{\hbar^2}{m_ee^2} = 0.529\ \text{\AA}\,.$$

In other words, we have the unperturbed zeroth-order energies

$$E^{(0)} = -Z^2\left(\frac{1}{n_1^2} + \frac{1}{n_2^2}\right)\frac{e^2}{2a_0}\,, \qquad n_1 = 1, 2, \ldots,\quad n_2 = 1, 2, \ldots\,. \tag{2.28}$$
Correspondingly, the zeroth-order wave functions are products of the usual hydro-
genlike wave functions.
The lowest excited states of helium have $n_1 = 1,\ n_2 = 2$ or $n_1 = 2,\ n_2 = 1$. Then from (2.28) we have (for $Z = 2$)

$$E^{(0)} = -2^2\left(\frac{1}{1^2} + \frac{1}{2^2}\right)\frac{e^2}{2a_0} = -5(13.606\ \text{eV}) = -68.03\ \text{eV}\,.$$
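As a one-line numerical check (a sketch, not part of the notes), equation (2.28) with $Z = 2$, $n_1 = 1$, $n_2 = 2$ and $e^2/2a_0 = 13.606$ eV gives:

```python
# Equation (2.28) for the lowest excited configuration of helium
# (Z = 2, n1 = 1, n2 = 2), using e^2/(2 a0) = 13.606 eV.
Z, n1, n2 = 2, 1, 2
E0 = -Z**2 * (1/n1**2 + 1/n2**2) * 13.606
print(round(E0, 2))   # -68.03
```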
For $n = 2$, the possible values of $l$ are $l = 0, 1$, and since there are $2l+1$ values of $m_l$, we see that the $n = 2$ level of a hydrogenlike atom is fourfold degenerate. (This just says that the 2s and 2p states have the same energy.) Thus the first excited unperturbed state of He is eightfold degenerate, and the eight unperturbed wave functions are

$$\psi^{(0)}_1 = 1s(1)2s(2) \qquad\quad \psi^{(0)}_2 = 2s(1)1s(2)$$
$$\psi^{(0)}_3 = 1s(1)2p_x(2) \qquad \psi^{(0)}_4 = 2p_x(1)1s(2)$$
$$\psi^{(0)}_5 = 1s(1)2p_y(2) \qquad \psi^{(0)}_6 = 2p_y(1)1s(2)$$
$$\psi^{(0)}_7 = 1s(1)2p_z(2) \qquad \psi^{(0)}_8 = 2p_z(1)1s(2)$$
Here the notation $1s(1)2s(2)$ means, for example, that electron 1 is in the 1s state and electron 2 is in the 2s state. I have also chosen to use the real hydrogenlike wave functions $2p_x$, $2p_y$ and $2p_z$, which are defined as linear combinations of the complex wave functions $2p_0$, $2p_1$ and $2p_{-1}$:

$$2p_x := \frac{1}{\sqrt{2}}(2p_1 + 2p_{-1}) = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} re^{-Zr/2a_0}\sin\theta\cos\varphi = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} xe^{-Zr/2a_0} \tag{2.29a}$$

$$2p_y := \frac{1}{i\sqrt{2}}(2p_1 - 2p_{-1}) = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} re^{-Zr/2a_0}\sin\theta\sin\varphi = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} ye^{-Zr/2a_0} \tag{2.29b}$$

$$2p_z := 2p_0 = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} re^{-Zr/2a_0}\cos\theta = \frac{1}{4\sqrt{2\pi}}\left(\frac{Z}{a_0}\right)^{5/2} ze^{-Zr/2a_0} \tag{2.29c}$$
This is perfectly valid since any linear combination of solutions with a given energy is also a solution with that energy. (However, the $2p_x$ and $2p_y$ functions are not eigenfunctions of $L_z$ since they are linear combinations of eigenfunctions with different values of $m_l$.) These real hydrogenlike wave functions are more convenient for many purposes in constructing chemical bonds and molecular wave functions. In fact, you have probably seen these wave functions in more elementary chemistry courses. For example, a contour plot in the plane (i.e., a cross section) of a real 2p wave function is shown in Figure 4 below. (Let $\theta = \pi/2$ in any of equations (2.29).) The three-dimensional orbital is obtained by rotating this plot about the horizontal axis, so we see that the actual shape of a real 2p orbital (i.e., a one-electron wave function) is two separated, distorted ellipsoids.
It is not hard to verify that the real 2p wave functions are orthonormal, and hence the eight degenerate wave functions $\psi^{(0)}_i$ are also orthonormal as required by equation (2.19). The secular determinant contains $8^2 = 64$ elements. However, $H'$ is real, as are the $\psi^{(0)}_i$, so that $H'_{ij} = H'_{ji}$ and the determinant is symmetric about the main diagonal. This cuts the number of integrals almost in half.

Figure 4: Contour plot in the plane of a real 2p wave function.
Even better, by using parity we can easily show that most of the $H'_{ij}$ are zero. Indeed, the perturbing Hamiltonian $H' = e^2/r_{12}$ is an even function of $\mathbf{r}$ since

$$r_{12} = \left[(x_1-x_2)^2 + (y_1-y_2)^2 + (z_1-z_2)^2\right]^{1/2}$$

and this is unchanged if $\mathbf{r}_1 \to -\mathbf{r}_1$ and $\mathbf{r}_2 \to -\mathbf{r}_2$. Also, the hydrogenlike s-wave functions depend only on $r = |\mathbf{r}|$ and hence are invariant under $\mathbf{r} \to -\mathbf{r}$. Furthermore, you can see from the above forms that the 2p wave functions are odd under parity since they depend on $r$ and either $x$, $y$ or $z$. Hence, since we are integrating over all space, any integral with only a single factor of 2p must vanish:

$$H'_{13} = H'_{14} = H'_{15} = H'_{16} = H'_{17} = H'_{18} = 0$$

and

$$H'_{23} = H'_{24} = H'_{25} = H'_{26} = H'_{27} = H'_{28} = 0\,.$$
Now consider an integral such as

$$H'_{35} = \int 1s(1)2p_x(2)\,\frac{e^2}{r_{12}}\,1s(1)2p_y(2)\,d\mathbf{r}_1\,d\mathbf{r}_2\,.$$

If we let $x_1 \to -x_1$ and $x_2 \to -x_2$, then $r_{12}$ is unchanged, as are $1s(1)$ and $2p_y(2)$. However, $2p_x(2)$ changes sign, and the net result is that the integrand is an odd function under this transformation. Hence it is not hard to see that the integral vanishes. This lets us conclude that

$$H'_{35} = H'_{36} = H'_{37} = H'_{38} = 0$$

and

$$H'_{45} = H'_{46} = H'_{47} = H'_{48} = 0\,.$$
Similarly, by considering the transformation $y_1 \to -y_1$ and $y_2 \to -y_2$, it follows that

$$H'_{57} = H'_{58} = H'_{67} = H'_{68} = 0\,.$$

With these simplifications, the secular equation becomes

$$\begin{vmatrix} b_{11} & H'_{12} & 0 & 0 & 0 & 0 & 0 & 0 \\ H'_{12} & b_{22} & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & b_{33} & H'_{34} & 0 & 0 & 0 & 0 \\ 0 & 0 & H'_{34} & b_{44} & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & b_{55} & H'_{56} & 0 & 0 \\ 0 & 0 & 0 & 0 & H'_{56} & b_{66} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & b_{77} & H'_{78} \\ 0 & 0 & 0 & 0 & 0 & 0 & H'_{78} & b_{88} \end{vmatrix} = 0 \tag{2.30}$$

where

$$b_{ii} = H'_{ii} - E^{(1)}\,, \qquad i = 1, 2, \ldots, 8\,.$$
Since the secular determinant is in block-diagonal form with $2\times 2$ blocks on the diagonal, the same logic that we used in Example 1.3 would seem to tell us that the correct zeroth-order wave functions have the form

$$\phi^{(0)}_1 = c_1\psi^{(0)}_1 + c_2\psi^{(0)}_2 \qquad \phi^{(0)}_2 = \bar c_1\psi^{(0)}_1 + \bar c_2\psi^{(0)}_2$$
$$\phi^{(0)}_3 = c_3\psi^{(0)}_3 + c_4\psi^{(0)}_4 \qquad \phi^{(0)}_4 = \bar c_3\psi^{(0)}_3 + \bar c_4\psi^{(0)}_4$$
$$\phi^{(0)}_5 = c_5\psi^{(0)}_5 + c_6\psi^{(0)}_6 \qquad \phi^{(0)}_6 = \bar c_5\psi^{(0)}_5 + \bar c_6\psi^{(0)}_6$$
$$\phi^{(0)}_7 = c_7\psi^{(0)}_7 + c_8\psi^{(0)}_8 \qquad \phi^{(0)}_8 = \bar c_7\psi^{(0)}_7 + \bar c_8\psi^{(0)}_8$$

where the barred and unbarred coefficients distinguish between the two roots of each second-order determinant. However, while that argument applies to the upper $2\times 2$ determinant (i.e., the first two equations of the system), it doesn't apply to the whole determinant in this case. This is because it turns out (as we will see below) that the lower three $2\times 2$ determinants are identical. Therefore their pairs of roots are the same, and all we can say is that each of the two roots has a three-dimensional eigenspace inside the six-dimensional space spanned by $\psi^{(0)}_3, \ldots, \psi^{(0)}_8$. In other words, all we can say is that for each of the two roots and for each $n = 3, 4, \ldots, 8$, the function $\phi^{(0)}_n$ will be a linear combination of $\psi^{(0)}_3, \ldots, \psi^{(0)}_8$. However, we can choose any basis we wish for these eigenspaces, so we choose the three two-dimensional orthonormal pairs $\phi^{(0)}_n$ as shown above.
The first determinant is

$$\begin{vmatrix} H'_{11} - E^{(1)} & H'_{12} \\ H'_{12} & H'_{22} - E^{(1)} \end{vmatrix} = 0 \tag{2.31}$$
where

$$H'_{11} = \int 1s(1)2s(2)\,\frac{e^2}{r_{12}}\,1s(1)2s(2)\,d\mathbf{r}_1 d\mathbf{r}_2 = \int [1s(1)]^2[2s(2)]^2\,\frac{e^2}{r_{12}}\,d\mathbf{r}_1 d\mathbf{r}_2$$

$$H'_{22} = \int [1s(2)]^2[2s(1)]^2\,\frac{e^2}{r_{12}}\,d\mathbf{r}_1 d\mathbf{r}_2\,.$$

Since the integration variables are just dummy variables, it is pretty obvious that letting $\mathbf{r}_1 \leftrightarrow \mathbf{r}_2$ shows that

$$H'_{11} = H'_{22}\,.$$

Similarly, it is easy to see that

$$H'_{33} = H'_{44} \qquad H'_{55} = H'_{66} \qquad H'_{77} = H'_{88}\,.$$
The integral $H'_{11}$ is sometimes denoted by $J_{1s2s}$ and called a Coulomb integral:

$$H'_{11} = J_{1s2s} = \int [1s(1)]^2[2s(2)]^2\,\frac{e^2}{r_{12}}\,d\mathbf{r}_1 d\mathbf{r}_2\,.$$

The reason for the name is that this represents the electrostatic energy of repulsion between an electron with the probability density function $[1s]^2$ and an electron with probability density function $[2s]^2$. The integral $H'_{12}$ is denoted by $K_{1s2s}$ and called an exchange integral:

$$H'_{12} = K_{1s2s} = \int 1s(1)2s(2)\,\frac{e^2}{r_{12}}\,2s(1)1s(2)\,d\mathbf{r}_1 d\mathbf{r}_2\,.$$

Here the functions to the left and right of $H'$ differ from each other by the exchange of electrons 1 and 2. The general definitions of the Coulomb and exchange integrals are

$$J_{ij} = \langle f_i(1)f_j(2)|e^2/r_{12}|f_i(1)f_j(2)\rangle$$
$$K_{ij} = \langle f_i(1)f_j(2)|e^2/r_{12}|f_j(1)f_i(2)\rangle$$

where the range of integration is over the full range of spatial coordinates of particles 1 and 2, and the functions $f_i$, $f_j$ are spatial orbitals.
Substituting these integrals into (2.31) we have

$$\begin{vmatrix} J_{1s2s} - E^{(1)} & K_{1s2s} \\ K_{1s2s} & J_{1s2s} - E^{(1)} \end{vmatrix} = 0 \tag{2.32}$$

or

$$J_{1s2s} - E^{(1)} = \mp K_{1s2s}$$

and hence the two roots are

$$E^{(1)}_1 = J_{1s2s} - K_{1s2s} \qquad\text{and}\qquad E^{(1)}_2 = J_{1s2s} + K_{1s2s}\,.$$
Just as in Example 1.3, we substitute $E^{(1)}_1$ back into (2.20a) to write

$$K_{1s2s}\,c_1 + K_{1s2s}\,c_2 = 0$$
$$K_{1s2s}\,c_1 + K_{1s2s}\,c_2 = 0$$

and hence $c_2 = -c_1$. Normalizing $\phi^{(0)}_1$ we have (using the orthonormality of the $\psi^{(0)}_i$)

$$\langle\phi^{(0)}_1|\phi^{(0)}_1\rangle = \langle c_1\psi^{(0)}_1 - c_1\psi^{(0)}_2\,|\,c_1\psi^{(0)}_1 - c_1\psi^{(0)}_2\rangle = |c_1|^2 + |c_2|^2 = 1$$

so that $c_1 = 1/\sqrt{2}$. Thus the zeroth-order wave function corresponding to $E^{(1)}_1$ is

$$\phi^{(0)}_1 = 2^{-1/2}\left[\psi^{(0)}_1 - \psi^{(0)}_2\right] = 2^{-1/2}\left[1s(1)2s(2) - 2s(1)1s(2)\right].$$

Similarly, the wave function corresponding to $E^{(1)}_2$ is easily found to be

$$\phi^{(0)}_2 = 2^{-1/2}\left[\psi^{(0)}_1 + \psi^{(0)}_2\right] = 2^{-1/2}\left[1s(1)2s(2) + 2s(1)1s(2)\right].$$
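The structure of the $2\times 2$ block can also be seen numerically. This sketch (not from the notes) uses the numerical values of $J_{1s2s}$ and $K_{1s2s}$ that are quoted later in this section; only the symmetric $[[J, K], [K, J]]$ structure matters here:

```python
import numpy as np

# The 2x2 block of (2.32), with J and K set to the 1s2s values found
# later in the section (11.42 eV and 1.19 eV).
J, K = 11.42, 1.19
evals, evecs = np.linalg.eigh(np.array([[J, K],
                                        [K, J]]))

# Eigenvalues (ascending) are J - K = 10.23 and J + K = 12.61, and both
# eigenvectors have components of magnitude 1/sqrt(2), reproducing the
# antisymmetric and symmetric combinations phi_1 and phi_2.
print(evals)
print(evecs)
```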
This takes care of the first determinant in (2.30), but we still have the remaining three to handle.

First look at the integrals $H'_{33}$ and $H'_{55}$:

$$H'_{33} = \int 1s(1)2p_x(2)\,\frac{e^2}{r_{12}}\,1s(1)2p_x(2)\,d\mathbf{r}_1 d\mathbf{r}_2$$

$$H'_{55} = \int 1s(1)2p_y(2)\,\frac{e^2}{r_{12}}\,1s(1)2p_y(2)\,d\mathbf{r}_1 d\mathbf{r}_2\,.$$

The only difference between these is the 2p(2) orbital, and the only difference between the $2p_x$ and $2p_y$ orbitals is their spatial orientation. Since the 1s orbitals are spherically symmetric, it should be clear that these integrals are the same. Formally, in $H'_{33}$ we can change variables by letting $x_1 \to y_1$, $y_1 \to x_1$, $x_2 \to y_2$ and $y_2 \to x_2$. This leaves $r_{12}$ unchanged, and transforms $H'_{33}$ into $H'_{55}$. The same argument shows that $H'_{77} = H'_{33}$ also. Hence we have

$$H'_{33} = H'_{55} = H'_{77} = \int 1s(1)2p_z(2)\,\frac{e^2}{r_{12}}\,1s(1)2p_z(2)\,d\mathbf{r}_1 d\mathbf{r}_2 := J_{1s2p}\,.$$

A similar argument shows that we also have equal exchange integrals:

$$H'_{34} = H'_{56} = H'_{78} = \int 1s(1)2p_z(2)\,\frac{e^2}{r_{12}}\,2p_z(1)1s(2)\,d\mathbf{r}_1 d\mathbf{r}_2 := K_{1s2p}\,.$$

Thus the remaining three determinants in (2.30) are the same and have the form

$$\begin{vmatrix} J_{1s2p} - E^{(1)} & K_{1s2p} \\ K_{1s2p} & J_{1s2p} - E^{(1)} \end{vmatrix} = 0\,.$$
But this is the same as (2.32) if we replace 2s by 2p, and hence we can immediately write down the solutions:

$$E^{(1)}_3 = E^{(1)}_5 = E^{(1)}_7 = J_{1s2p} - K_{1s2p}$$
$$E^{(1)}_4 = E^{(1)}_6 = E^{(1)}_8 = J_{1s2p} + K_{1s2p}$$

and

$$\phi^{(0)}_3 = 2^{-1/2}\left[1s(1)2p_x(2) - 1s(2)2p_x(1)\right]$$
$$\phi^{(0)}_4 = 2^{-1/2}\left[1s(1)2p_x(2) + 1s(2)2p_x(1)\right]$$
$$\phi^{(0)}_5 = 2^{-1/2}\left[1s(1)2p_y(2) - 1s(2)2p_y(1)\right]$$
$$\phi^{(0)}_6 = 2^{-1/2}\left[1s(1)2p_y(2) + 1s(2)2p_y(1)\right]$$
$$\phi^{(0)}_7 = 2^{-1/2}\left[1s(1)2p_z(2) - 1s(2)2p_z(1)\right]$$
$$\phi^{(0)}_8 = 2^{-1/2}\left[1s(1)2p_z(2) + 1s(2)2p_z(1)\right]$$
So what has happened? Starting from the eight degenerate (unperturbed) states $\psi^{(0)}_i$ that would exist in the absence of electron-electron repulsion, we find that including this repulsion term splits the degenerate states into two nondegenerate levels associated with the configuration 1s2s, and two triply degenerate levels associated with the configuration 1s2p. Interestingly, going to higher-order energy corrections will not completely remove the degeneracy, and in fact it takes the application of an external magnetic field to do so.
In order to evaluate the Coulomb and exchange integrals in the expressions for $E^{(1)}$ we need to use the expansion

$$\frac{1}{r_{12}} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{4\pi}{2l+1}\,\frac{r_<^{\,l}}{r_>^{\,l+1}}\,[Y_l^m(\theta_1,\varphi_1)]^*\,Y_l^m(\theta_2,\varphi_2) \tag{2.33}$$

where $r_<$ means the smaller of $r_1$ and $r_2$, and $r_>$ is the larger of these. The details of this type of integral are left to the homework, and the results are
$$J_{1s2s} = \frac{17}{81}\frac{Ze^2}{a_0} = 11.42\ \text{eV} \qquad J_{1s2p} = \frac{59}{243}\frac{Ze^2}{a_0} = 13.21\ \text{eV}$$

$$K_{1s2s} = \frac{16}{729}\frac{Ze^2}{a_0} = 1.19\ \text{eV} \qquad K_{1s2p} = \frac{112}{6561}\frac{Ze^2}{a_0} = 0.93\ \text{eV}$$

where we used $Z = 2$ and $e^2/2a_0 = 13.606$ eV. Recalling that $E^{(0)} = -68.03$ eV, we obtain

$$E^{(0)} + E^{(1)}_1 = E^{(0)} + J_{1s2s} - K_{1s2s} = -57.8\ \text{eV}$$
$$E^{(0)} + E^{(1)}_2 = E^{(0)} + J_{1s2s} + K_{1s2s} = -55.4\ \text{eV}$$
$$E^{(0)} + E^{(1)}_3 = E^{(0)} + J_{1s2p} - K_{1s2p} = -55.7\ \text{eV}$$
$$E^{(0)} + E^{(1)}_4 = E^{(0)} + J_{1s2p} + K_{1s2p} = -53.9\ \text{eV}\,.$$
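These numbers are easy to reproduce from the quoted fractions. A sketch (the fractions and constants are taken directly from the text above):

```python
# Reproduce the Coulomb/exchange values and the four perturbed levels,
# using Z = 2 and e^2/(2 a0) = 13.606 eV, so Z e^2/a0 = 4 * 13.606 eV.
ry = 13.606
Ze2_a0 = 4 * ry

J1s2s = (17/81)    * Ze2_a0   # 11.42 eV
K1s2s = (16/729)   * Ze2_a0   #  1.19 eV
J1s2p = (59/243)   * Ze2_a0   # 13.21 eV
K1s2p = (112/6561) * Ze2_a0   #  0.93 eV

E0 = -5 * ry                  # -68.03 eV
for E in (E0 + J1s2s - K1s2s, E0 + J1s2s + K1s2s,
          E0 + J1s2p - K1s2p, E0 + J1s2p + K1s2p):
    print(round(E, 1))        # -57.8, -55.4, -55.7, -53.9
```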
Figure 5: The first excited levels of the helium atom.

(See Figure 5.) The first-order energy corrections place the lower 1s2p level below the upper 1s2s level, which disagrees with the actual helium spectrum. This is due to the neglect of higher-order corrections. Since the electron-electron repulsion is not a small quantity, this is not surprising.
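As an aside, the expansion (2.33) itself is easy to test numerically. Summing over $m$ with the spherical-harmonic addition theorem collapses (2.33) to a Legendre series, $1/r_{12} = \sum_l (r_<^{\,l}/r_>^{\,l+1})\,P_l(\cos\gamma)$, where $\gamma$ is the angle between $\mathbf{r}_1$ and $\mathbf{r}_2$. The following sketch compares that series against the law of cosines (the particular values of $r_1$, $r_2$ and $\gamma$ are arbitrary choices, not from the notes):

```python
import numpy as np
from numpy.polynomial import legendre as leg

# Check the m-summed form of (2.33): 1/r12 = sum_l (r_<^l/r_>^(l+1)) P_l(cos g).
r1, r2, gamma = 1.0, 2.5, 0.7
r_lt, r_gt = min(r1, r2), max(r1, r2)

# legval with coefficient vector [0,...,0,1] evaluates the single
# Legendre polynomial P_l; 60 terms is plenty since r_< / r_> = 0.4.
series = sum((r_lt**l / r_gt**(l + 1)) * leg.legval(np.cos(gamma), [0]*l + [1])
             for l in range(60))

exact = 1.0 / np.sqrt(r1**2 + r2**2 - 2*r1*r2*np.cos(gamma))  # law of cosines
print(abs(series - exact) < 1e-10)   # True
```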
Finally, let us look at the sources of the degeneracy of the original eight zeroth-order wave functions and the reason for the partial lifting of this degeneracy. There are three types of degeneracy to consider: (1) The degeneracy between states with the same $n$ but different values of $l$. The 2s and 2p functions have the same energy. (2) The degeneracy between wave functions with the same $n$ and $l$ but different values of $m_l$. The $2p_x$, $2p_y$ and $2p_z$ functions have the same energy. (This could just as well have been the $2p_0$, $2p_1$ and $2p_{-1}$ complex functions.) (3) There is an exchange degeneracy between functions that differ only in the exchange of electrons between the orbitals. For example, $\psi^{(0)}_1 = 1s(1)2s(2)$ and $\psi^{(0)}_2 = 1s(2)2s(1)$ have the same energy.

By introducing the electron-electron perturbation $H' = e^2/r_{12}$ we removed the degeneracy associated with $l$ and the exchange degeneracy, but not the degeneracy due to $m_l$. To understand the reason for the lifting of the $l$ degeneracy, realize that a 2s electron has a greater probability than a 2p electron of being closer to the nucleus than a 1s electron, and hence a 2s electron is not as effectively shielded from the nucleus by the 1s electrons as a 2p electron is. Since the energy levels are given by

$$E = -\frac{Z^2}{n^2}\frac{e^2}{2a_0}$$

we see that a larger nuclear charge means a lower energy, and hence the 2s electron has a lower energy than the 2p electron. This is also evident from the Coulomb integrals, where we see that $J_{1s2s}$ is less than $J_{1s2p}$. These integrals represent the electrostatic repulsion of their respective charge distributions: when the 2s electron penetrates the 1s charge distribution it only feels a repulsion due to the unpenetrated portion of the 1s distribution. Therefore the 1s-2s electrostatic repulsion is less than the 1s-2p repulsion, and the 1s2s levels lie below the 1s2p levels. So we see that the interelectronic repulsion in many-electron atoms lifts the $l$ degeneracy, and the orbital energies for the same value of $n$ increase with increasing $l$.

To understand the removal of the exchange degeneracy, note that the original zeroth-order wave functions specified which electron went into which orbital. Since the secular determinant wasn't diagonal, these couldn't have been the correct zeroth-order wave functions. In fact, the correct zeroth-order wave functions do not assign a specific electron to a specific orbital, as is evident from the form of each $\phi^{(0)}_i$. This is a consequence of the indistinguishability of identical particles, and will be discussed at length a little later in this course. Since, for example, $\phi^{(0)}_1$ and $\phi^{(0)}_2$ have different energies, the exchange degeneracy is removed by using the correct zeroth-order wave functions.
2.4 Spin-Orbit Coupling and the Hydrogen Atom Fine Structure

The Hamiltonian

$$H^0 = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right) + \frac{1}{2mr^2}L^2 - \frac{e^2}{r} \tag{2.34}$$

used to derive the hydrogen atom wave functions $\psi_{nlm}$ that we have worked with so far consists of the kinetic energy of the electron plus the potential energy of the Coulomb force binding the electron and proton together. (Recall that in this equation, $m$ is really the reduced mass $m = m_eM_p/(m_e+M_p) \approx m_e$.) While this works very well, the actual Hamiltonian is somewhat more complicated than this. In this section we derive an additional term in the Hamiltonian that is due to a coupling between the orbital angular momentum $\mathbf{L}$ and the spin angular momentum $\mathbf{S}$.

The discussion that follows is a somewhat heuristic approach to deriving an interaction term that agrees with experiment. You shouldn't take the physical picture too seriously. However, the basic idea is simple enough. From the point of view of the electron, the moving nucleus (i.e., a proton) generates a current that is the source of a magnetic field $\mathbf{B}$. This current is proportional to the electron's angular momentum $\mathbf{L}$. The interaction energy of a magnetic moment $\boldsymbol{\mu}$ with this magnetic field is $-\boldsymbol{\mu}\cdot\mathbf{B}$. Since the magnetic moment of an electron is proportional to its spin $\mathbf{S}$, we see that the interaction energy will be proportional to $\mathbf{L}\cdot\mathbf{S}$.
With the above disclaimer, the interaction term we are looking for is due to the fact that from the point of view of the electron, the moving hydrogen nucleus (the proton) forms a current, and thus generates a magnetic field. From special relativity, we know that the electric and magnetic fields are related by a Lorentz transformation so that

$$\mathbf{B}'_\perp = \gamma(\mathbf{B} - \boldsymbol{\beta}\times\mathbf{E})_\perp \qquad \mathbf{B}'_\parallel = \mathbf{B}_\parallel$$
$$\mathbf{E}'_\perp = \gamma(\mathbf{E} + \boldsymbol{\beta}\times\mathbf{B})_\perp \qquad \mathbf{E}'_\parallel = \mathbf{E}_\parallel$$

where $\boldsymbol{\beta} = \mathbf{v}/c$ is the velocity of the primed frame with respect to the unprimed frame, $\gamma = (1-\beta^2)^{-1/2}$, and $\perp$, $\parallel$ refer to the components perpendicular or parallel to $\boldsymbol{\beta}$.

We let the primed frame be the proton rest frame, and note that there is no $\mathbf{B}'$ field in the proton's frame due to the proton itself. Also, if $\beta \ll 1$, then $\gamma \approx 1$ and we then have

$$\mathbf{B} = \boldsymbol{\beta}\times\mathbf{E}' \qquad\text{and}\qquad \mathbf{E} = \mathbf{E}'\,.$$

If $\mathbf{v}$ is the electron's velocity with respect to the lab (or the proton), then the proton moves with velocity $-\mathbf{v}$ relative to the electron, i.e. $\boldsymbol{\beta} = -\mathbf{v}/c$, so the field felt by the electron is

$$\mathbf{B} = -\frac{\mathbf{v}}{c}\times\mathbf{E}'\,. \tag{2.35}$$
The electric field $\mathbf{E}'$ due to the proton is

$$\mathbf{E}' = \frac{e}{r^2}\,\hat{\mathbf{r}} = \frac{e}{r^3}\,\mathbf{r} \tag{2.36}$$

where $e > 0$ and $\mathbf{r}$ is the position vector from the proton to the electron.

From basic electrodynamics, we know that the energy of a particle with magnetic moment $\boldsymbol{\mu}$ in a magnetic field $\mathbf{B}$ is given by (see the end of this section)

$$W = -\boldsymbol{\mu}\cdot\mathbf{B} \tag{2.37}$$

so we need to know $\boldsymbol{\mu}$. Consider a particle of charge $q$ moving in a circular orbit. It forms an effective current

$$I = \frac{q}{t} = \frac{q}{2\pi r/v} = \frac{qv}{2\pi r}\,.$$

By definition, the magnetic moment has magnitude

$$\mu = \frac{I}{c}\times\text{area} = \frac{qv}{2\pi rc}\,\pi r^2 = \frac{qvr}{2c}\,.$$

But the angular momentum of the particle is $L = mvr$, so we conclude that the magnetic moment due to orbital motion is

$$\boldsymbol{\mu}_l = \frac{q}{2mc}\,\mathbf{L}\,. \tag{2.38}$$
The ratio of $\mu$ to $L$ is called the gyromagnetic ratio.

While the above derivation of (2.38) was purely classical, we know that the electron also possesses an intrinsic spin angular momentum. Let us hypothesize that the electron magnetic moment associated with this spin is of the form

$$\boldsymbol{\mu}_s = -g\,\frac{e}{2mc}\,\mathbf{S}\,.$$

The constant $g$ is found by experiment to be very close to 2. (However, the relativistic Dirac equation predicts that $g$ is exactly 2. Higher order corrections in quantum electrodynamics predict a slightly different value, and the measurement of $g-2$ is one of the most accurate experimental results in all of physics.)

So we now have the electron magnetic moment given by

$$\boldsymbol{\mu}_s = -\frac{e}{mc}\,\mathbf{S} \tag{2.39}$$

and hence the interaction energy of the electron with the magnetic field of the proton is (using equations (2.35) and (2.36))

$$W = -\boldsymbol{\mu}_s\cdot\mathbf{B} = +\frac{e}{mc}\,\mathbf{S}\cdot\mathbf{B} = -\frac{e}{mc}\,\mathbf{S}\cdot\left(\frac{e}{r^3c}\,\mathbf{v}\times\mathbf{r}\right) = \frac{e^2}{m^2c^2r^3}\,\mathbf{S}\cdot(\mathbf{r}\times\mathbf{p})$$

or

$$W = \frac{e^2}{m^2c^2r^3}\,\mathbf{S}\cdot\mathbf{L}\,. \tag{2.40}$$
Alternatively, we can write $W$ in another form as follows. If we assume that the electron moves in a spherically symmetric potential field, then the force $-e\mathbf{E}'$ on the electron may be written as the negative gradient of its potential energy:

$$-e\mathbf{E}' = -\nabla V(r) = -\frac{dV}{dr}\,\hat{\mathbf{r}} = -\frac{\mathbf{r}}{r}\frac{dV}{dr}\,.$$

Using this in (2.35) we have

$$\mathbf{B} = -\frac{\mathbf{v}}{c}\times\mathbf{r}\,\frac{1}{er}\frac{dV}{dr} = \frac{1}{mc}\,\mathbf{r}\times\mathbf{p}\,\frac{1}{er}\frac{dV}{dr}$$

and hence

$$W = -\boldsymbol{\mu}_s\cdot\mathbf{B} = \frac{e}{m^2c^2}\,\mathbf{S}\cdot(\mathbf{r}\times\mathbf{p})\,\frac{1}{er}\frac{dV}{dr}$$

or

$$W = \frac{1}{m^2c^2}\frac{1}{r}\frac{dV}{dr}\,\mathbf{S}\cdot\mathbf{L}\,. \tag{2.41}$$
However, we have made one major mistake. The classical equation that leads to (2.37) is

$$\frac{d\mathbf{L}}{dt} = \mathbf{N} = \boldsymbol{\mu}\times\mathbf{B} \tag{2.42}$$

where $\mathbf{L}$ is the angular momentum of the particle in its rest frame, $\mathbf{N}$ is the applied torque, and $\mathbf{B}$ is the magnetic field in that frame. But this only applies if the electron's rest frame isn't rotating. If it is, then the left side of this equation isn't valid (i.e., it isn't equal to only the applied torque), and we must use the correct (operator) expression from classical mechanics:

$$\left(\frac{d}{dt}\right)_{\text{lab}} = \left(\frac{d}{dt}\right)_{\text{rot}} + \boldsymbol{\omega}\times\,. \tag{2.43}$$

(If you don't know this result, I will derive it at the end of this section so you can see what is going on and why.)

For the electron, (2.42) gives $d\mathbf{S}/dt$ in the lab frame, so in the electron's frame we must use

$$\left(\frac{d\mathbf{S}}{dt}\right)_{\text{rot}} = \left(\frac{d\mathbf{S}}{dt}\right)_{\text{lab}} - \boldsymbol{\omega}_T\times\mathbf{S} \tag{2.44}$$

where $\boldsymbol{\omega}_T$ is called the Thomas precessional frequency. Thus we see that the change in the spin angular momentum of the electron, $(d\mathbf{S}/dt)_{\text{rot}}$, is given by the change due to the applied torque $\boldsymbol{\mu}\times\mathbf{B}$ minus an effect due to the rotation of the coordinate system:

$$\left(\frac{d\mathbf{S}}{dt}\right)_{\text{rot}} = \boldsymbol{\mu}\times\mathbf{B} - \boldsymbol{\omega}_T\times\mathbf{S} = -\frac{e}{mc}\,\mathbf{S}\times\mathbf{B} + \mathbf{S}\times\boldsymbol{\omega}_T$$

or

$$\left(\frac{d\mathbf{S}}{dt}\right)_{\text{rot}} = \mathbf{S}\times\left(-\frac{e\mathbf{B}}{mc} + \boldsymbol{\omega}_T\right)\,. \tag{2.45}$$
This is the analogue of (2.42), so the analogue of (2.37) is

$$W = -\mathbf{S}\cdot\left(-\frac{e\mathbf{B}}{mc} + \boldsymbol{\omega}_T\right) = \frac{e}{mc}\,\mathbf{S}\cdot\mathbf{B} - \mathbf{S}\cdot\boldsymbol{\omega}_T\,. \tag{2.46}$$

Note that the first term is what we already calculated in equation (2.40). What we need to know is the Thomas factor $\mathbf{S}\cdot\boldsymbol{\omega}_T$. This is not a particularly easy calculation to do exactly, so we will give a very simplified derivation. (See Jackson, Classical Electrodynamics, Chapter 11 if you want a careful derivation.)

Basically, Thomas precession can be attributed to time dilation, i.e., observers on the electron and proton disagree on the time required for one particle to make a revolution about the other. Let $T$ be the time required for a revolution according to the electron, and let it be $T'$ according to the proton. Then $T' = \gamma T$ where $\gamma = (1-\beta^2)^{-1/2}$. (Note that a circular orbit means an acceleration, so even this isn't really correct.) Then the electron and proton each measure orbital angular velocities of $2\pi/T$ and $2\pi/T'$ respectively.

To the electron, its spin $\mathbf{S}$ maintains its direction in space, but to the proton, it appears to precess at a rate equal to the difference in angular velocities, or

$$\omega_T = \frac{2\pi}{T} - \frac{2\pi}{T'} = 2\pi\left(\frac{1}{T} - \frac{1}{T'}\right) = 2\pi\left(\frac{\gamma}{T'} - \frac{1}{T'}\right) = \frac{2\pi}{T'}\left[(1-\beta^2)^{-1/2} - 1\right] \approx \frac{2\pi}{T'}\frac{\beta^2}{2}\,.$$
But in general we know that $\omega = v/r$ and hence

$$\frac{2\pi}{T'} = \omega = \frac{v}{r} = \frac{mvr}{mr^2} = \frac{L}{mr^2}$$

and therefore

$$\omega_T = \frac{L}{mr^2}\frac{\beta^2}{2} = \frac{L}{mr^2}\frac{v^2}{2c^2} = \frac{1}{2}\frac{L}{m^2c^2}\frac{1}{r}\frac{mv^2}{r}\,.$$

We also know that $\mathbf{F} = m\mathbf{a}$, where for circular motion we have an inward directed acceleration $a = v^2/r$. Since $\mathbf{F} = -\nabla V$, we have

$$\mathbf{F} = -\frac{mv^2}{r}\,\hat{\mathbf{r}} = -\frac{dV}{dr}\,\hat{\mathbf{r}}$$

and we can write

$$\boldsymbol{\omega}_T = \frac{1}{2}\frac{1}{m^2c^2}\frac{1}{r}\frac{dV}{dr}\,\mathbf{L}\,. \tag{2.47}$$

From this we see that $\mathbf{S}\cdot\boldsymbol{\omega}_T$ is just one-half the energy given by equation (2.41), and equation (2.46) shows that it is subtracted off. Therefore the correct spin-orbit energy is given by

$$W = \frac{1}{2m^2c^2}\frac{1}{r}\frac{dV}{dr}\,\mathbf{L}\cdot\mathbf{S} \tag{2.48a}$$

or, from (2.40) with a slight change of notation,

$$H_{\text{so}} = \frac{e^2}{2m^2c^2r^3}\,\mathbf{L}\cdot\mathbf{S}\,. \tag{2.48b}$$
Calculating the spin-orbit interaction energy $E_{\text{so}}$ by finding the eigenfunctions and eigenvalues of the Hamiltonian $H = H^0 + H_{\text{so}}$ is a difficult problem. Since the effect of $H_{\text{so}}$ is small compared to $H^0$ (at least for the lighter atoms), we will estimate the value of $E_{\text{so}}$ by using first-order perturbation theory. Then the first-order energy shifts for the hydrogen atom will be the integrals

$$E^{(1)}_{\text{so}} = \langle\psi|H_{\text{so}}|\psi\rangle$$

where the hydrogen atom wave functions including spin are of the form

$$\psi = R_{nl}(r)\,Y_l^m(\theta,\varphi)\,\chi(s)\,.$$

From $\mathbf{J} = \mathbf{L} + \mathbf{S}$, we have $J^2 = L^2 + S^2 + 2\mathbf{L}\cdot\mathbf{S}$ so that

$$\mathbf{L}\cdot\mathbf{S} = \frac{1}{2}\left(J^2 - L^2 - S^2\right)\,. \tag{2.49}$$

Note that neither $\mathbf{L}$ nor $\mathbf{S}$ separately commutes with $\mathbf{L}\cdot\mathbf{S}$, but you can easily show that $\mathbf{J} = \mathbf{L}+\mathbf{S}$ does in fact commute with $\mathbf{L}\cdot\mathbf{S}$. Because of this, we can choose our states to be simultaneous eigenfunctions of $J^2$, $J_z$, $L^2$ and $S^2$, all of which commute with $H$.
Since $Y_l^m$ is an eigenfunction of $L_z$ and $\chi$ is an eigenfunction of $S_z$, the wave function $Y_l^m\chi$ is an eigenfunction of $J_z = L_z + S_z$ but not of $J^2$. However, by the usual addition of angular momentum problem, in this case $\mathbf{L}$ and $\mathbf{S}$, we can construct simultaneous eigenfunctions of $J^2$, $J_z$, $L^2$ and $S^2$. In this case we have $s = 1/2$, so we know that the resulting possible $j$ values are $j = l - 1/2,\ l + 1/2$. The reason we want to do this is because there are $2(2l+1)$ degenerate levels for a given $n$ and $l$, where the additional factor of 2 comes from the two possible spin orientations.

Let us assume that we have constructed these eigenfunctions, and we now denote the hydrogen atom wave functions by

$$\psi = R_{nl}(r)\,\Phi(\theta,\varphi,s)$$

where, by (2.49),

$$\mathbf{L}\cdot\mathbf{S} = \frac{\hbar^2}{2}\left[j(j+1) - l(l+1) - s(s+1)\right] = \frac{\hbar^2}{2}\left[j(j+1) - l(l+1) - \frac{3}{4}\right]\,.$$
Using this, our first-order energy estimate becomes

$$E^{(1)}_{\text{so}} = \left\langle R_{nl}\Phi\left|\frac{e^2}{2m^2c^2}\frac{1}{r^3}\,\mathbf{L}\cdot\mathbf{S}\right|R_{nl}\Phi\right\rangle = \frac{e^2\hbar^2}{4m^2c^2}\left[j(j+1) - l(l+1) - \frac{3}{4}\right]\left\langle R_{nl}\left|\frac{1}{r^3}\right|R_{nl}\right\rangle \tag{2.50}$$

where

$$\left\langle R_{nl}\Phi\left|\frac{1}{r^3}\right|R_{nl}\Phi\right\rangle = \left\langle R_{nl}\left|\frac{1}{r^3}\right|R_{nl}\right\rangle$$

because $\langle\Phi|\Phi\rangle = 1$. The integral in (2.50) is not at all hard to do if you use some clever tricks. I will show how to do it at the end of this section, and the answer is

$$\left\langle R_{nl}\left|\frac{1}{r^3}\right|R_{nl}\right\rangle = \frac{1}{a_0^3\,n^3\,l(l+1/2)(l+1)} \tag{2.51}$$

where the Bohr radius is

$$a_0 = \frac{\hbar^2}{me^2} = \frac{\hbar}{\alpha mc} \tag{2.52}$$

and the fine structure constant is

$$\alpha = \frac{e^2}{\hbar c} \approx \frac{1}{137}\,. \tag{2.53}$$

Note that for $l = 0$ we also have $\mathbf{L}\cdot\mathbf{S} = 0$ anyway, so there is no spin-orbit energy.
Recall that the energy corresponding to $H^0$ is

$$E^{(0)}_n = -\frac{me^4}{2\hbar^2n^2} = -\frac{mc^2\alpha^2}{2n^2} \tag{2.54a}$$

or

$$E^{(0)}_n = \frac{E^{(0)}_1}{n^2} = \frac{-13.6\ \text{eV}}{n^2}\,. \tag{2.54b}$$

Combining (2.50) and (2.51) we have

$$E^{(1)}_{\text{so}} = \frac{e^2\hbar^2}{4m^2c^2a_0^3n^3}\left[\frac{j(j+1) - l(l+1) - 3/4}{l(l+1/2)(l+1)}\right] = -\frac{E^{(0)}_n\alpha^2}{2n}\left[\frac{j(j+1) - l(l+1) - 3/4}{l(l+1/2)(l+1)}\right]\,. \tag{2.55}$$
Since $j = l \pm 1/2$, this gives us the two corrections to the energy

$$E^{(1)}_{\text{so}} = -\frac{E^{(0)}_n\alpha^2}{n}\left[\frac{1}{(2l+1)(l+1)}\right] \qquad\text{for } j = l + 1/2 \text{ and } l \neq 0 \tag{2.56a}$$

$$E^{(1)}_{\text{so}} = +\frac{E^{(0)}_n\alpha^2}{n}\left[\frac{1}{l(2l+1)}\right] \qquad\text{for } j = l - 1/2 \text{ and } l \neq 0\,. \tag{2.56b}$$
There is yet another correction to the hydrogen atom energy levels due to the relativistic contribution to the kinetic energy of the electron. The kinetic energy is really the difference between the total relativistic energy $E = (p^2c^2 + m^2c^4)^{1/2}$ and the rest energy $mc^2$. To order $p^4$ this is

$$T = (p^2c^2 + m^2c^4)^{1/2} - mc^2 \approx \frac{p^2}{2m} - \frac{p^4}{8m^3c^2}\,.$$

Since the Hamiltonian is the sum of kinetic and potential energies, we see from this that the term

$$H_{\text{rel}} = -\frac{p^4}{8m^3c^2} \tag{2.57}$$

may be treated as a perturbation to the states $\psi_{nlm}$.
While the states $\psi_{nlm}$ are in general degenerate, in this case we don't have to worry about it. The reason is that $H_{\text{rel}}$ is rotationally invariant, so it's already diagonal in the $\psi_{nlm}$ basis, and that is precisely what the zeroth-order wave functions $\phi^{(0)}_n$ accomplish (see equation (2.22)). Therefore we can use simple first-order perturbation theory so that

$$E^{(1)}_{\text{rel}} = -\frac{1}{8m^3c^2}\,\langle\psi_{nlm}|p^4|\psi_{nlm}\rangle\,.$$

Using $H^0 = p^2/2m - e^2/r$ we can write

$$p^4 = 4m^2\left(\frac{p^2}{2m}\right)^2 = 4m^2\left(H^0 + \frac{e^2}{r}\right)^2$$
and therefore

$$E^{(1)}_{\text{rel}} = -\frac{1}{2mc^2}\left[(E^{(0)}_n)^2 + 2E^{(0)}_ne^2\left\langle\frac{1}{r}\right\rangle + e^4\left\langle\frac{1}{r^2}\right\rangle\right]$$

where $\langle\,\cdot\,\rangle$ is shorthand for $\langle\psi_{nlm}|\cdot|\psi_{nlm}\rangle$. These integrals are not hard to evaluate (see the end of this section), and the result (in different forms) is

$$E^{(1)}_{\text{rel}} = -\frac{(E^{(0)}_n)^2}{2mc^2}\left[\frac{4n}{l+1/2} - 3\right] = \frac{E^{(0)}_n\alpha^2}{n^2}\left[\frac{n}{l+1/2} - \frac{3}{4}\right] = -\frac{1}{2}mc^2\alpha^4\left[\frac{1}{n^3(l+1/2)} - \frac{3}{4n^4}\right]\,. \tag{2.58}$$
Adding equations (2.56) and (2.58) we obtain the fine structure energy shift

$$E^{(1)}_{\text{fs}} = -\frac{mc^2\alpha^4}{2n^3}\left[\frac{1}{j+1/2} - \frac{3}{4n}\right] = \frac{E^{(0)}_n\alpha^2}{n^2}\left[\frac{n}{j+1/2} - \frac{3}{4}\right] \tag{2.59}$$

which is valid for both $j = l \pm 1/2$. This is the first-order energy correction due to the fine structure Hamiltonian

$$H_{\text{fs}} = H_{\text{so}} + H_{\text{rel}}\,. \tag{2.60}$$
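As a quick numerical illustration (a sketch, not part of the notes), equation (2.59) can be evaluated for the $n = 2$ levels of hydrogen. The formula depends only on $n$ and $j$, so the $2s_{1/2}$ and $2p_{1/2}$ levels shift together:

```python
# First-order fine-structure shifts from (2.59),
#   E_fs = (E_n^(0) alpha^2 / n^2) * (n/(j + 1/2) - 3/4),
# in eV, for the n = 2 levels of hydrogen.
alpha = 1 / 137.036
E1 = -13.606

def E_fs(n, j):
    En = E1 / n**2
    return (En * alpha**2 / n**2) * (n / (j + 0.5) - 0.75)

# Both shifts are negative and of order 1e-5 eV, with the j = 3/2 level
# lying above the j = 1/2 level.
print(E_fs(2, 0.5), E_fs(2, 1.5))
```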
2.4.1 Supplement: Miscellaneous Proofs

Now let's go back and prove several miscellaneous results stated in this section. The first thing we want to show is that the energy of a magnetic moment in a uniform magnetic field is given by $-\boldsymbol{\mu}\cdot\mathbf{B}$, where for a loop of area $A$ carrying current $I$, $\boldsymbol{\mu}$ is defined to have magnitude $IA/c$ and to point perpendicular to the loop, in the direction of your thumb if the fingers of your right hand are along the direction of the current. To see this, we simply calculate the work required to rotate a current loop from its equilibrium position to the desired orientation.

Consider Figure 6 below, where the current flows counterclockwise, out of the page at the bottom and into the page at the top. Let the loop have length $a$ on the sides and $b$ across the top and bottom, so its area is $ab$. The magnetic force on a current-carrying wire is

$$\mathbf{F}_B = \frac{I}{c}\int d\mathbf{l}\times\mathbf{B}$$

and hence the forces on the opposite $a$ sides of the loop cancel, and the force on the top and bottom $b$ sides is $F_B = IbB/c$.

Figure 6: A current loop in a uniform magnetic field.

The equilibrium position of the loop is horizontal, so the potential energy of the loop is the work required to rotate it from $\theta = 0$ to some value $\theta$. This work is given by $W = \int\mathbf{F}\cdot d\mathbf{r}$ where $\mathbf{F}$ is the force that I must apply against the magnetic field to rotate the loop.

Since the loop is rotating, the force I must apply at the top of the loop is in the direction of $\hat{\boldsymbol{\theta}}$ and perpendicular to the loop, and hence has magnitude $F_B\cos\theta$. Then the work I do is (the factor of 2 takes into account both the top and bottom sides)

$$W = \int\mathbf{F}\cdot d\mathbf{r} = 2\int F_B\cos\theta\,(a/2)\,d\theta = \frac{IabB}{c}\int_0^\theta\cos\theta\,d\theta = \mu B\sin\theta\,.$$
But note that $\boldsymbol{\mu}\cdot\mathbf{B} = \mu B\cos(90^\circ + \theta) = -\mu B\sin\theta$, and therefore

$$W = -\boldsymbol{\mu}\cdot\mathbf{B}\,. \tag{2.61}$$

In this derivation, I never explicitly mentioned the torque on the loop due to $\mathbf{B}$. However, we see that

$$\|\mathbf{N}\| = \|\mathbf{r}\times\mathbf{F}_B\| = 2(a/2)F_B\sin(90^\circ+\theta) = \frac{IabB}{c}\sin(90^\circ+\theta) = \|\boldsymbol{\mu}\times\mathbf{B}\|$$

and therefore

$$\mathbf{N} = \boldsymbol{\mu}\times\mathbf{B}\,. \tag{2.62}$$

Note that $W = \int\|\mathbf{N}\|\,d\theta$.
Next I will prove equation (2.43). Let $\mathbf{A}$ be a vector as seen in both the rotating and lab frames, and let $\mathbf{e}_i$ be a fixed basis in the rotating frame. Then (using the summation convention) $\mathbf{A} = A_i\mathbf{e}_i$ so that

$$\frac{d\mathbf{A}}{dt} = \frac{d}{dt}(A_i\mathbf{e}_i) = \frac{dA_i}{dt}\mathbf{e}_i + A_i\frac{d\mathbf{e}_i}{dt}\,.$$

Now $(dA_i/dt)\mathbf{e}_i$ is the rate of change of $\mathbf{A}$ with respect to the rotating frame, so we have

$$\frac{dA_i}{dt}\mathbf{e}_i = \left(\frac{d\mathbf{A}}{dt}\right)_{\text{rot}}\,.$$

And $\mathbf{e}_i$ is a fixed basis vector in the frame that is rotating with respect to the lab frame. Then, just like any vector rotating in the lab with angular velocity $\boldsymbol{\omega}$, we have

$$\frac{d\mathbf{e}_i}{dt} = \boldsymbol{\omega}\times\mathbf{e}_i\,.$$

(See the figure below. Here $\omega = d\varphi/dt$, and $\|d\mathbf{v}\| = v\sin\theta\,d\varphi$ so $\|d\mathbf{v}/dt\| = v\sin\theta\,\omega$, or $d\mathbf{v}/dt = \boldsymbol{\omega}\times\mathbf{v}$.)

[Figure: a vector $\mathbf{v}$ at angle $\theta$ to $\boldsymbol{\omega}$, precessing through $d\varphi$ with tip displacement $d\mathbf{v}$.]

Then

$$A_i\frac{d\mathbf{e}_i}{dt} = A_i\,\boldsymbol{\omega}\times\mathbf{e}_i = \boldsymbol{\omega}\times A_i\mathbf{e}_i = \boldsymbol{\omega}\times\mathbf{A}\,.$$

Putting this all together we have

$$\left(\frac{d\mathbf{A}}{dt}\right)_{\text{lab}} = \left(\frac{d\mathbf{A}}{dt}\right)_{\text{rot}} + \boldsymbol{\omega}\times\mathbf{A}\,.$$

Equation (2.43) is just the operator version of this result.
Finally, let me show how to evaluate the integrals $\langle 1/r\rangle$, $\langle 1/r^2\rangle$ and $\langle 1/r^3\rangle$, where the expectation values are taken with respect to the hydrogen atom wave functions $\psi_{nlm}$.

First, instead of $\langle 1/r\rangle$, consider $\langle\lambda/r\rangle$. This can be interpreted as the first-order correction to the energy due to the perturbation $\lambda/r$. But $H^0 = T + V = T - e^2/r$, so $H = H^0 + H' = H^0 + \lambda/r = T - (e^2-\lambda)/r$, and this is just our original problem if we replace $e^2$ by $e^2 - \lambda$ everywhere. In particular, the exact energy solution is then

$$E_n(\lambda) = -\frac{m(e^2-\lambda)^2}{2\hbar^2n^2} = -\frac{me^4}{2\hbar^2n^2} + \frac{me^2\lambda}{\hbar^2n^2} - \frac{m\lambda^2}{2\hbar^2n^2}\,.$$

But another way of looking at this is as the expansion of $E_n(\lambda)$ given in (2.3b):

$$E_n = E^{(0)}_n + \lambda\left(\frac{dE_n}{d\lambda}\right)_{\lambda=0} + \frac{\lambda^2}{2!}\left(\frac{d^2E_n}{d\lambda^2}\right)_{\lambda=0} + \cdots = E^{(0)}_n + \lambda E^{(1)}_n + \lambda^2E^{(2)}_n + \cdots$$

where the first-order correction $\lambda E^{(1)}_n = \langle H'\rangle$ is just the term linear in $\lambda$. Therefore, letting $\lambda \to 1$, we have $\langle 1/r\rangle = \langle H'\rangle = me^2/\hbar^2n^2$ or

$$\left\langle\frac{1}{r}\right\rangle = \frac{1}{a_0n^2}\,. \tag{2.63}$$

Note that if you have the exact solution $E_n(\lambda)$, you can obtain $E^{(1)}_n$ by simply evaluating $(dE_n/d\lambda)_{\lambda=0}$.
Before continuing, let me rewrite the hydrogen atom Hamiltonian as follows:

$$H^0 = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right) + \frac{1}{2mr^2}L^2 - \frac{e^2}{r} = \frac{p_r^2}{2m} + \frac{L^2}{2mr^2} - \frac{e^2}{r} \tag{2.64}$$

where I have defined the radial momentum $p_r$ by

$$p_r = -i\hbar\left(\frac{\partial}{\partial r} + \frac{1}{r}\right)\,.$$
Now consider $\langle\lambda/r^2\rangle$. Again, letting $H = H^0 + H' = H^0 + \lambda/r^2$, we can still solve the problem exactly because all we are doing is modifying the centrifugal term:

$$\frac{L^2}{2mr^2} \to \frac{L^2 + 2m\lambda}{2mr^2} \to \frac{\hbar^2l(l+1) + 2m\lambda}{2mr^2} = \frac{\hbar^2l'(l'+1)}{2mr^2}$$

where $l' = l'(\lambda)$ is a function of $\lambda$. (Just write $\hbar^2l'(l'+1) = \hbar^2l(l+1) + 2m\lambda$ and use the quadratic formula to find $l'$ as a function of $\lambda$.)

Recall that the exact energies were defined by

$$E_n = -\frac{me^4}{2\hbar^2n^2} = -\frac{me^4}{2\hbar^2(k+l+1)^2}$$

where $k = 0, 1, 2, \ldots$ was the integer that terminated the power series solution of the radial equation. Now what we have is

$$E(l') = -\frac{me^4}{2\hbar^2(k+l'+1)^2} = E(\lambda) = E^{(0)} + \lambda E^{(1)} + \cdots$$

where (note $\lambda = 0$ implies $l' = l$)

$$E^{(1)} = \left.\frac{dE}{d\lambda}\right|_{\lambda=0} = \left.\frac{dl'}{d\lambda}\right|_{l'=l}\left.\frac{dE}{dl'}\right|_{l'=l}\,.$$

Then from the explicit form of $E(l')$ and the definition of $n$ we have

$$\left.\frac{dE}{dl'}\right|_{l'=l} = \frac{me^4}{\hbar^2(k+l+1)^3} = \frac{me^4}{\hbar^2n^3}$$
and taking the derivative of $\hbar^2l'(l'+1) = \hbar^2l(l+1) + 2m\lambda$ with respect to $\lambda$ yields

$$\left.\frac{dl'}{d\lambda}\right|_{l'=l} = \frac{2m}{\hbar^2}\frac{1}{2l+1} = \frac{m}{\hbar^2}\frac{1}{l+1/2}\,.$$

Therefore

$$E^{(1)} = \frac{(me^2/\hbar^2)^2}{(l+1/2)n^3}$$

and $\langle\lambda/r^2\rangle = \lambda E^{(1)}$, so that

$$\left\langle\frac{1}{r^2}\right\rangle = \frac{1}{a_0^2(l+1/2)n^3}\,. \tag{2.65}$$
The last integral to evaluate is $\langle 1/r^3\rangle$. Since there is no term in $H_0$ that goes like $1/r^3$, we have to try something else. Note that $H_0\psi_{nlm} = E_n\psi_{nlm}$ so that
$$\langle [H_0, p_r]\rangle = \langle \psi_{nlm}|H_0 p_r - p_r H_0|\psi_{nlm}\rangle = E_n\langle p_r\rangle - \langle p_r\rangle E_n = 0 .$$
Using
$$\left[\frac{1}{r}, \frac{\partial}{\partial r}\right] = \frac{1}{r^2} \qquad\text{and}\qquad \left[\frac{1}{r^2}, \frac{\partial}{\partial r}\right] = \frac{2}{r^3}$$
(recall $[ab, c] = a[b, c] + [a, c]b$), it is easy to use (2.64) and show that
$$[H_0, p_r] = -\frac{i\hbar}{m}\frac{L^2}{r^3} + \frac{i\hbar e^2}{r^2}.$$
But now
$$0 = \langle [H_0, p_r]\rangle = -\frac{i\hbar}{m}\left\langle \frac{L^2}{r^3}\right\rangle + i\hbar e^2\left\langle \frac{1}{r^2}\right\rangle = -\frac{i\hbar^3 l(l+1)}{m}\left\langle \frac{1}{r^3}\right\rangle + i\hbar e^2\left\langle \frac{1}{r^2}\right\rangle$$
and therefore
$$\left\langle \frac{1}{r^3}\right\rangle = \frac{me^2}{\hbar^2 l(l+1)}\left\langle \frac{1}{r^2}\right\rangle \qquad\text{or}\qquad \left\langle \frac{1}{r^3}\right\rangle = \frac{1}{a_0\, l(l+1)}\left\langle \frac{1}{r^2}\right\rangle. \qquad (2.66)$$
Combining this with (2.65) we have
$$\left\langle \frac{1}{r^3}\right\rangle = \frac{1}{a_0^3\, l(l+1)(l+1/2)\, n^3}. \qquad (2.67)$$
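All three expectation values can be verified directly for a particular state. The following sketch (a numerical aside, not part of the original notes) assumes the standard normalized radial function $R_{21}(r) = r e^{-r/2}/(2\sqrt6)$ in units $a_0 = 1$ and integrates $\langle 1/r^k\rangle = \int_0^\infty R_{21}^2\, r^{2-k}\,dr$ by the trapezoidal rule, comparing with (2.63), (2.65) and (2.67):

```python
# Check <1/r>, <1/r^2>, <1/r^3> for the n = 2, l = 1 hydrogen state using
# the standard normalized radial function R_21 (assumed), units a0 = 1.
import math

def R21(r):
    return r * math.exp(-r / 2.0) / (2.0 * math.sqrt(6.0))

def radial_moment(k, rmax=80.0, steps=200000):
    """Trapezoidal estimate of <1/r^k> = int_0^inf R21^2 r^(2-k) dr."""
    h = rmax / steps
    total = 0.0
    for i in range(1, steps):            # integrand vanishes at both endpoints
        r = i * h
        total += R21(r) ** 2 * r ** (2 - k)
    return total * h

n, l = 2, 1
assert abs(radial_moment(1) - 1.0 / n**2) < 1e-6                       # (2.63)
assert abs(radial_moment(2) - 1.0 / ((l + 0.5) * n**3)) < 1e-6         # (2.65)
assert abs(radial_moment(3) - 1.0 / (l*(l+1)*(l+0.5)*n**3)) < 1e-6     # (2.67)
```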
2.5 The Zeeman Effect

In the previous section we studied the effect of an atomic electron's magnetic moment interacting with the magnetic field generated by the nucleus (a proton). In this section, I want to investigate what happens when a hydrogen atom is placed in a uniform external magnetic field $\mathbf{B}$. These types of interactions are generally referred to as the Zeeman effect, and they were instrumental in the discovery of spin. (Pieter Zeeman and H. A. Lorentz shared the second Nobel prize in physics in 1902. For a very interesting summary of the history of spin, read Chapter 10 in the text Quantum Mechanics by Hendrik Hameka.)

The hydrogen atom Hamiltonian, including fine structure, is given by
$$H = H_0 + H_{fs} = H_0 + H_{so} + H_{rel}$$
where
$$H_0 = -\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{2}{r}\frac{\partial}{\partial r}\right) + \frac{1}{2mr^2}L^2 - \frac{e^2}{r} \qquad\text{(equation (2.34))}$$
$$H_{so} = \frac{e^2}{2m^2c^2 r^3}\,\mathbf{L}\cdot\mathbf{S} \qquad\text{(equation (2.48b))}$$
$$H_{rel} = -\frac{p^4}{8m^3c^2} \qquad\text{(equation (2.57))}.$$
(And where I'm approximating the reduced mass by the electron mass $m_e$.) The easy way to include the presence of an external field $\mathbf{B}$ is to simply add an interaction energy
$$H_{mag} = -\boldsymbol{\mu}_{tot}\cdot\mathbf{B}$$
where, from equations (2.38) and (2.39), we know that the total magnetic moment for a hydrogenic electron is
$$\boldsymbol{\mu}_{tot} = \boldsymbol{\mu}_l + \boldsymbol{\mu}_s = -\frac{e}{2m_ec}(\mathbf{L} + 2\mathbf{S}) = -\frac{e}{2m_ec}(\mathbf{J} + \mathbf{S}). \qquad (2.68)$$
However, the correct way to arrive at this is to rewrite the Hamiltonian taking into account the presence of an electromagnetic field. For those who are interested, I work through this approach at the end of this section.
In any case, the Hamiltonian for a hydrogen atom in an external uniform magnetic field is then
$$H = H_0 + H_{so} + H_{rel} + H_{mag}.$$
There are really three cases to consider. (I'll ignore $H_{rel}$ for now because it's a correction to the kinetic energy and irrelevant to this discussion.) The first is when $\mathbf{B}$ is strong enough that $H_{mag}$ is large relative to $H_{so}$. In this case we can treat $H_{so}$ as a perturbation on the states defined by $H_0 + H_{mag}$, where these states are simultaneous eigenfunctions of $L^2$, $S^2$, $L_z$ and $S_z$ (rather than $J^2$ and $J_z$). The reason that $\mathbf{J}$ is not a good quantum number is that the external field exerts a torque $\boldsymbol{\mu}_{tot}\times\mathbf{B}$ on the total magnetic moment, and this is equivalent to a changing total angular momentum $d\mathbf{J}/dt$. Thus $\mathbf{J}$ is not conserved, and in fact precesses about $\mathbf{B}$. In addition, if there is a spin–orbit interaction, then this internal field causes $\mathbf{L}$ and $\mathbf{S}$ to precess about $\mathbf{J}$.

The second case is when $\mathbf{B}$ is weak and $H_{so}$ dominates $H_{mag}$. In this situation, $H_{mag}$ is treated as a perturbation on the states defined by $H_0 + H_{so}$. As we saw in our discussion of $H_{so}$, in this case we must choose our states to be eigenfunctions of $L^2$, $S^2$, $J^2$ and $J_z$ because $\mathbf{L}$ and $\mathbf{S}$ are not conserved separately, even though $\mathbf{J} = \mathbf{L} + \mathbf{S}$ is conserved. (Neither $\mathbf{L}$ nor $\mathbf{S}$ alone commutes with $\mathbf{L}\cdot\mathbf{S}$, but $[J_i, \mathbf{L}\cdot\mathbf{S}] = 0$ and hence $J^2$ commutes with $H$.)

And the third and most difficult case is when both $H_{so}$ and $H_{mag}$ are roughly equivalent. Under this intermediate-field situation, we must take them together and use degenerate perturbation theory to break the degeneracies of the basis states.
2.5.1 Strong External Field

Let us first consider the case where the external magnetic field is much stronger than the internal field felt by the electron due to its orbital motion. Taking $\mathbf{B} = B\hat{\mathbf{z}}$ we have
$$H_{mag} = \frac{eB}{2m_ec}(L_z + 2S_z). \qquad (2.69)$$
If we first ignore spin, then the first-order correction to the hydrogen atom energy levels is
$$E^{(1)}_{nlm} = \left\langle \psi_{nlm}\left|\frac{eB}{2m_ec}L_z\right|\psi_{nlm}\right\rangle = \frac{e\hbar}{2m_ec}Bm := \mu_B Bm$$
where
$$\mu_B = \frac{e\hbar}{2m_ec} = 5.79\times 10^{-9}\ \text{eV/gauss} = 9.29\times 10^{-21}\ \text{erg/gauss}$$
is called the (electron) Bohr magneton. Thus we see that for a given $l$, the $(2l+1)$-fold degeneracy is lifted. For example, the 3-fold degenerate $l = 1$ state is split into the three states $m = 1, 0, -1$, with an energy difference of $\mu_B B$ between adjacent states. This strong field case is sometimes called the Paschen-Back effect.

If we now include spin, then
$$E^{(1)}_{nlm_lm_s} = \mu_B B(m_l + 2m_s) \qquad (2.70)$$
where $m_s = \pm 1/2$. This yields the further splitting (or lifting of degeneracies) sometimes called the anomalous Zeeman effect: each $m_l$ level splits again according to $m_s = \pm 1/2$, with the combinations $(m_l, m_s) = (1, -1/2)$ and $(-1, 1/2)$ remaining degenerate. This gives us the energy levels $E^{(0)}_n + E^{(1)}_{nlm_lm_s}$ where $E^{(0)}_n$ is given by (2.54a).
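To get a sense of the scale of these splittings, here is a quick numerical aside (not part of the original notes; SI constant values assumed) that recomputes $\mu_B$ and the level spacing for a typical $10^4$ gauss (1 tesla) laboratory field:

```python
# Scale of the Paschen-Back splitting (2.70): mu_B in eV/gauss computed
# from SI constants (assumed CODATA values), then the spacing between
# adjacent m_l + 2m_s values at B = 10^4 gauss.
e_SI    = 1.602176634e-19     # C
hbar_SI = 1.054571817e-34     # J s
m_e     = 9.1093837015e-31    # kg

mu_B_J_per_T = e_SI * hbar_SI / (2.0 * m_e)        # J/T
mu_B_eV_per_gauss = mu_B_J_per_T / e_SI / 1.0e4    # 1 T = 10^4 gauss

# Matches the quoted value 5.79e-9 eV/gauss to three figures.
assert abs(mu_B_eV_per_gauss - 5.79e-9) < 1e-11

splitting_eV = mu_B_eV_per_gauss * 1.0e4           # at B = 10^4 gauss
assert abs(splitting_eV - 5.79e-5) < 1e-7
```

The splitting at laboratory fields is thus of order $10^{-5}$–$10^{-4}$ eV, tiny compared to the eV-scale level spacings but comparable to the fine structure.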
However, since the basis states we used here are just the usual hydrogen atom wave functions, it is easy to include further corrections due to both $H_{so}$ and the relativistic correction $H_{rel}$ discussed in Section 2.4. We simply apply first-order perturbation theory using these as the perturbing potentials. For $H_{rel}$, we can simply use the result (2.58). However, we can't just use equations (2.56) for $H_{so}$ because they were derived using the eigenfunctions of $J^2$, which don't apply when there is a strong external magnetic field.

To get around this problem, we simply calculate $\langle \psi_{nlm_lm_s}|\mathbf{L}\cdot\mathbf{S}|\psi_{nlm_lm_s}\rangle$. We have
$$\mathbf{L}\cdot\mathbf{S} = L_xS_x + L_yS_y + L_zS_z$$
where $L_x = (L_+ + L_-)/2$ and $L_y = (L_+ - L_-)/2i$, with similar results for $S_x$ and $S_y$. Using these, it is quite easy to see that the orthogonality of the eigenfunctions yields
$$\langle L_xS_x\rangle = \langle L_yS_y\rangle = 0$$
while
$$\langle \psi_{nlm_lm_s}|L_zS_z|\psi_{nlm_lm_s}\rangle = \hbar^2 m_lm_s. \qquad (2.71)$$
Combining the results for $H_{rel}$ and $H_{so}$ we obtain the following corrections to the unperturbed energies $E^{(0)}_n + E^{(1)}_{nlm_lm_s}$:
$$E^{(1)}_{rel} + E^{(1)}_{so} = \frac{mc^2\alpha^4}{2n^3}\left(\frac{3}{4n} - \frac{1}{l+1/2}\right) + \frac{e^2\hbar^2}{2m^2c^2}\,m_lm_s\,\frac{1}{a_0^3\, n^3\, l(l+1)(l+1/2)}$$
where we used equations (2.58), (2.48b), (2.67) and (2.71). After a little algebra, which I leave to you, we arrive at
$$E^{(1)}_{rel} + E^{(1)}_{so} = \frac{me^4\alpha^2}{2\hbar^2 n^3}\left\{\frac{3}{4n} - \left[\frac{l(l+1) - m_lm_s}{l(l+1)(l+1/2)}\right]\right\} = -E^{(0)}_1\frac{\alpha^2}{n^3}\left\{\frac{3}{4n} - \left[\frac{l(l+1) - m_lm_s}{l(l+1)(l+1/2)}\right]\right\}. \qquad (2.72)$$
2.5.2 Weak External Field

Now we turn to the second case where the external field is weak relative to the spin–orbit term. As we discussed above, now we must take our basis states to be eigenfunctions of $L^2$, $S^2$, $J^2$ and $J_z$.

For a many-electron atom, there are basically two ways to calculate the total $\mathbf{J}$. The first way is to calculate $\mathbf{L} = \sum \mathbf{L}_i$ and $\mathbf{S} = \sum \mathbf{S}_i$ and then evaluate $\mathbf{J} = \mathbf{L} + \mathbf{S}$. This is called LS or Russell-Saunders coupling. It is applicable to the lighter elements, where interelectronic repulsion energies are significantly greater than the spin–orbit interaction energies. This is because if the spin–orbit coupling is weak, then $\mathbf{L}$ and $\mathbf{S}$ almost commute with $H_0 + H_{so}$.

The second way is to first calculate $\mathbf{J}_i = \mathbf{L}_i + \mathbf{S}_i$ so that $\mathbf{J} = \sum \mathbf{J}_i$. This is called jj coupling. It is used for heavier elements where the electrons are moving very rapidly, and hence there is a strong spin–orbit interaction. Because of this, $\mathbf{L}$ and $\mathbf{S}$ no longer commute with $H$, even though $\mathbf{J}$ does. This type of coupling is also more difficult to use, so we will deal only with the LS scheme.

Here is the physical situation. The total moment is $\boldsymbol{\mu}_{tot} \propto \mathbf{J} + \mathbf{S} = \mathbf{L} + 2\mathbf{S}$. Since $\mathbf{J}$ commutes with $H_0 + H_{so}$, it is conserved (and hence is fixed in space), even though $\mathbf{L}$ and $\mathbf{S}$ are not. This means that $\mathbf{L}$ and $\mathbf{S}$ both precess about $\mathbf{J}$. If the applied external $\mathbf{B}$ field is much weaker than the internal field, then $\mathbf{J}$ will precess much more slowly about $\mathbf{B}$ than $\mathbf{L}$ and $\mathbf{S}$ precess about $\mathbf{J}$. We need to evaluate the correction (2.69) in first-order perturbation theory.
Since our basis states are eigenfunctions of $J^2$ and $J_z$ but not $L_z$ and $S_z$, we can't directly evaluate the expectation value of $L_z + 2S_z = J_z + S_z$. The correct way to handle this is to use the Wigner-Eckart theorem, which is rather beyond the scope of this course. Instead, we will use a physical argument that gets us to the same answer.

We note that since $\mathbf{L}$ and $\mathbf{S}$ (and hence $\boldsymbol{\mu}_{tot}$) precess rapidly about $\mathbf{J}$, the time average of the Hamiltonian $H^{av}_{mag} = -\langle\boldsymbol{\mu}_{tot}\cdot\mathbf{B}\rangle$ will be the same as $-\langle\boldsymbol{\mu}_{tot}\rangle\cdot\mathbf{B}$. But the average of $\boldsymbol{\mu}_{tot}$ is just its component along $\mathbf{J}$, which is
$$\langle\boldsymbol{\mu}_{tot}\rangle = (\boldsymbol{\mu}_{tot}\cdot\hat{\mathbf{J}})\hat{\mathbf{J}} = \frac{\boldsymbol{\mu}_{tot}\cdot\mathbf{J}}{J^2}\,\mathbf{J}.$$
Using $\mathbf{L} = \mathbf{J} - \mathbf{S}$ so that $L^2 = J^2 + S^2 - 2\mathbf{S}\cdot\mathbf{J}$ we have
$$(\mathbf{J} + \mathbf{S})\cdot\mathbf{J} = J^2 + \mathbf{S}\cdot\mathbf{J} = J^2 + \frac{1}{2}(J^2 + S^2 - L^2).$$
Then since $\mathbf{B} = B\hat{\mathbf{z}}$, we now have
$$H^{av}_{mag} = -B\langle\boldsymbol{\mu}_{tot}\rangle\cdot\hat{\mathbf{z}} = \frac{eB}{2m_ec}\frac{(\mathbf{J}+\mathbf{S})\cdot\mathbf{J}}{J^2}\,J_z = \frac{eBJ_z}{2m_ec}\left(1 + \frac{J^2 + S^2 - L^2}{2J^2}\right).$$
Our basis states are simultaneous eigenstates of $L^2$, $S^2$, $J^2$ and $J_z$, so the average energy $E^{av}_{mag}$ is given by the first-order correction
$$E^{av}_{mag} = \frac{e\hbar Bm_j}{2m_ec}\left[1 + \frac{j(j+1) + s(s+1) - l(l+1)}{2j(j+1)}\right] = \frac{e\hbar Bm_j}{2m_ec}\left[\frac{3}{2} + \frac{3/4 - l(l+1)}{2j(j+1)}\right] := \mu_B Bm_j\, g_J \qquad (2.73)$$
where the Landé g-factor $g_J$ is defined by
$$g_J = 1 + \frac{j(j+1) + s(s+1) - l(l+1)}{2j(j+1)}.$$
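A quick check of this definition (a numerical aside) reproduces the familiar values for the single-electron ($s = 1/2$) levels of hydrogen:

```python
# Lande g-factor for one electron (s = 1/2).  Familiar values:
# g = 2 for S_{1/2}, 2/3 for P_{1/2}, 4/3 for P_{3/2}.

def g_J(l, s, j):
    return 1.0 + (j*(j+1) + s*(s+1) - l*(l+1)) / (2.0 * j*(j+1))

assert abs(g_J(0, 0.5, 0.5) - 2.0)     < 1e-12   # S_{1/2}
assert abs(g_J(1, 0.5, 0.5) - 2.0/3.0) < 1e-12   # P_{1/2}
assert abs(g_J(1, 0.5, 1.5) - 4.0/3.0) < 1e-12   # P_{3/2}
```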
The total energy of a hydrogen atom in a uniform magnetic field is now given by the sum of the unperturbed energy $E^{(0)}_n$ (equation (2.54a)), the fine-structure correction $E^{(1)}_{fs}$ (equation (2.59)) and $E^{av}_{mag}$ (equation (2.73)).
2.5.3 Intermediate-Field Case

Finally, we consider the intermediate-field case where the internal and external magnetic fields are approximately the same. In this situation, we must apply degenerate perturbation theory to the degenerate unperturbed states $\psi_{nlm_lm_s}$ by treating $H' = H_{fs} + H_{mag}$ as a perturbation. It is easiest to simply work out an example.

As we saw in our discussion of spin–orbit coupling, it is best to work in the basis in which our states are simultaneous eigenstates of $L^2$, $S^2$, $J^2$ and $J_z$. (The choice of basis has no effect on the eigenvalues of $H_{fs} + H_{mag}$, and the eigenvalues are just what we are looking for when we solve (2.21).) Let us consider the hydrogen atom states with $n = 2$, so that $l = 0, 1$. Since $s = 1/2$, the possible $j$ values are
$$0 \otimes \tfrac12 + 1 \otimes \tfrac12 = \tfrac12 + \left(\tfrac32 \oplus \tfrac12\right)$$
or $j = 1/2,\ 3/2,\ 1/2$. Our basis states $|l\,s\,j\,m_j\rangle$ are given in terms of the states $|l\,s\,m_l\,m_s\rangle$ using the appropriate Clebsch-Gordan coefficients (which you can look up or calculate for yourself).
For $l = 0$ we have $j = 1/2$ so $m_j = \pm 1/2$, and we have the two states
$$\psi_1 := \left|0\ \tfrac12\ \tfrac12\ \tfrac12\right\rangle = \left|0\ \tfrac12\ 0\ \tfrac12\right\rangle$$
$$\psi_2 := \left|0\ \tfrac12\ \tfrac12\ {-\tfrac12}\right\rangle = \left|0\ \tfrac12\ 0\ {-\tfrac12}\right\rangle$$
where the first state in each line is the state $|l\,s\,j\,m_j\rangle$, and the second state in each line is the linear combination of states $|l\,s\,m_l\,m_s\rangle$ with Clebsch-Gordan coefficients. (For $l = 0$ the C-G coefficients are just 1.)

For $l = 1$ we have the four states with $j = 3/2$ and the two states with $j = 1/2$ (which we order with a little hindsight so the determinant (2.21) turns out block diagonal):
$$\psi_3 := \left|1\ \tfrac12\ \tfrac32\ \tfrac32\right\rangle = \left|1\ \tfrac12\ 1\ \tfrac12\right\rangle$$
$$\psi_4 := \left|1\ \tfrac12\ \tfrac32\ {-\tfrac32}\right\rangle = \left|1\ \tfrac12\ {-1}\ {-\tfrac12}\right\rangle$$
$$\psi_5 := \left|1\ \tfrac12\ \tfrac32\ \tfrac12\right\rangle = \sqrt{\tfrac23}\left|1\ \tfrac12\ 0\ \tfrac12\right\rangle + \sqrt{\tfrac13}\left|1\ \tfrac12\ 1\ {-\tfrac12}\right\rangle$$
$$\psi_6 := \left|1\ \tfrac12\ \tfrac12\ \tfrac12\right\rangle = -\sqrt{\tfrac13}\left|1\ \tfrac12\ 0\ \tfrac12\right\rangle + \sqrt{\tfrac23}\left|1\ \tfrac12\ 1\ {-\tfrac12}\right\rangle$$
$$\psi_7 := \left|1\ \tfrac12\ \tfrac32\ {-\tfrac12}\right\rangle = \sqrt{\tfrac13}\left|1\ \tfrac12\ {-1}\ \tfrac12\right\rangle + \sqrt{\tfrac23}\left|1\ \tfrac12\ 0\ {-\tfrac12}\right\rangle$$
$$\psi_8 := \left|1\ \tfrac12\ \tfrac12\ {-\tfrac12}\right\rangle = -\sqrt{\tfrac23}\left|1\ \tfrac12\ {-1}\ \tfrac12\right\rangle + \sqrt{\tfrac13}\left|1\ \tfrac12\ 0\ {-\tfrac12}\right\rangle.$$
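These Clebsch-Gordan combinations are easy to check. The sketch below (a numerical aside; the kets are represented as hypothetical dictionaries keyed by $(m_l, m_s)$) verifies that $\psi_5$ and $\psi_6$ are orthonormal and computes their matrix elements of $(L_z + 2S_z)/\hbar = m_l + 2m_s$, i.e. $H_{mag}$ in units of $\mu_B B$:

```python
# Check orthonormality and the H_mag matrix elements (in units mu_B B)
# for the j = 3/2 and j = 1/2 states psi_5 and psi_6 built above.
import math

r13, r23 = math.sqrt(1.0/3.0), math.sqrt(2.0/3.0)
psi5 = {(0, 0.5): r23, (1, -0.5): r13}     # |3/2, 1/2>
psi6 = {(0, 0.5): -r13, (1, -0.5): r23}    # |1/2, 1/2>

def braket(a, b, weight=lambda ml, ms: 1.0):
    """<a| O |b> for an operator diagonal with eigenvalue weight(m_l, m_s)."""
    return sum(a.get(k, 0.0) * v * weight(*k) for k, v in b.items())

mag = lambda ml, ms: ml + 2 * ms           # H_mag / (mu_B B)

assert abs(braket(psi5, psi5) - 1.0) < 1e-12              # normalized
assert abs(braket(psi5, psi6)) < 1e-12                    # orthogonal
assert abs(braket(psi5, psi5, mag) - 2.0/3.0) < 1e-12
assert abs(braket(psi6, psi6, mag) - 1.0/3.0) < 1e-12
assert abs(braket(psi6, psi5, mag) + math.sqrt(2.0)/3.0) < 1e-12
```

The last three assertions anticipate exactly the $2\times 2$ block structure worked out next.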
Now we need to evaluate the matrices of $H_{fs} = H_{so} + H_{rel}$ and $H_{mag}$ in the $|j\,m_j\rangle$ basis $\psi_i$. Since $H_{rel} \propto p^4$, it's already diagonal in the $|j\,m_j\rangle$ basis. And since $H_{so} \propto \mathbf{S}\cdot\mathbf{L} = (1/2)(J^2 - L^2 - S^2)$, it's also diagonal in the $|j\,m_j\rangle$ basis. Therefore $H_{fs}$ is diagonal and its contribution is given by (2.59):
$$\langle jm_j|H_{fs}|jm_j\rangle = \frac{E^{(0)}_n\alpha^2}{n^2}\left(-\frac{3}{4} + \frac{n}{j+1/2}\right) = \frac{E^{(0)}_1\alpha^2}{16}\left(\frac{2}{j+1/2} - \frac{3}{4}\right)$$
where I used $E^{(0)}_n = E^{(0)}_1/n^2$ and let $n = 2$. For states with $j = 1/2$, this gives a contribution
$$\langle\psi_i|H_{fs}|\psi_i\rangle = \frac{5E^{(0)}_1\alpha^2}{64} := -5\delta \qquad\text{for } i = 1, 2, 6, 8 \qquad (2.74a)$$
and for states with $j = 3/2$ this is
$$\langle\psi_i|H_{fs}|\psi_i\rangle = \frac{E^{(0)}_1\alpha^2}{64} := -\delta \qquad\text{for } i = 3, 4, 5, 7 \qquad (2.74b)$$
where $\delta := -E^{(0)}_1\alpha^2/64 > 0$ (recall $E^{(0)}_1 < 0$).
Next, we easily see that the first four states $\psi_1$–$\psi_4$ are eigenstates of $H_{mag} \propto L_z + 2S_z$ (since they each contain only a single ket $|l\,s\,m_l\,m_s\rangle$). Hence $H_{mag}$ is already diagonal in this $4\times 4$ block, and so contributes the diagonal terms
$$\langle\psi_i|H_{mag}|\psi_i\rangle = \mu_B B(m_l + 2m_s) := \gamma\,(m_l + 2m_s) \qquad\text{for } i = 1, 2, 3, 4$$
where $\gamma := \mu_B B$. For the remaining four states $\psi_5$–$\psi_8$ we must explicitly evaluate the matrix elements. For example,
$$H_{mag}|\psi_5\rangle = \frac{\mu_B B}{\hbar}(L_z + 2S_z)\left[\sqrt{\tfrac23}\left|1\ \tfrac12\ 0\ \tfrac12\right\rangle + \sqrt{\tfrac13}\left|1\ \tfrac12\ 1\ {-\tfrac12}\right\rangle\right]$$
$$= \mu_B B\left[1\cdot\sqrt{\tfrac23}\left|1\ \tfrac12\ 0\ \tfrac12\right\rangle + 0\cdot\sqrt{\tfrac13}\left|1\ \tfrac12\ 1\ {-\tfrac12}\right\rangle\right] = \mu_B B\sqrt{\tfrac23}\left|1\ \tfrac12\ 0\ \tfrac12\right\rangle$$
and therefore (using the orthonormality of the states $|l\,s\,m_l\,m_s\rangle$)
$$\langle\psi_5|H_{mag}|\psi_5\rangle = \tfrac23\mu_B B := \tfrac23\gamma$$
and
$$\langle\psi_6|H_{mag}|\psi_5\rangle = \langle\psi_5|H_{mag}|\psi_6\rangle = -\tfrac{\sqrt2}{3}\mu_B B := -\tfrac{\sqrt2}{3}\gamma.$$
Also,
$$\langle\psi_6|H_{mag}|\psi_6\rangle = \tfrac13\mu_B B := \tfrac13\gamma.$$
Since all other matrix elements with $\psi_5$ and $\psi_6$ vanish, there is a $2\times 2$ block corresponding to the subspace spanned by $\psi_5$ and $\psi_6$. Similarly, there is a $2\times 2$ block corresponding to the subspace spanned by $\psi_7$ and $\psi_8$ with
$$\langle\psi_7|H_{mag}|\psi_7\rangle = -\tfrac23\gamma$$
$$\langle\psi_8|H_{mag}|\psi_7\rangle = \langle\psi_7|H_{mag}|\psi_8\rangle = -\tfrac{\sqrt2}{3}\gamma$$
$$\langle\psi_8|H_{mag}|\psi_8\rangle = -\tfrac13\gamma.$$
Combining all of these matrix elements, the matrix of $H' = H_{fs} + H_{mag}$ used in (2.21) becomes (all entries not shown are zero)
$$\begin{pmatrix}
-5\delta+\gamma & & & & & & & \\
 & -5\delta-\gamma & & & & & & \\
 & & -\delta+2\gamma & & & & & \\
 & & & -\delta-2\gamma & & & & \\
 & & & & -\delta+\frac23\gamma & -\frac{\sqrt2}{3}\gamma & & \\
 & & & & -\frac{\sqrt2}{3}\gamma & -5\delta+\frac13\gamma & & \\
 & & & & & & -\delta-\frac23\gamma & -\frac{\sqrt2}{3}\gamma \\
 & & & & & & -\frac{\sqrt2}{3}\gamma & -5\delta-\frac13\gamma
\end{pmatrix}.$$
Now we need to find the eigenvalues of this matrix (which are the first-order energy corrections). Since it's block diagonal, the first four diagonal entries are precisely the first four eigenvalues. For the remaining four eigenvalues, we must diagonalize the two $2\times 2$ submatrices. Calling the eigenvalues $\lambda$, the characteristic equation for the $\psi_5, \psi_6$ block is
$$\begin{vmatrix} -\delta+\frac23\gamma-\lambda & -\frac{\sqrt2}{3}\gamma \\[2pt] -\frac{\sqrt2}{3}\gamma & -5\delta+\frac13\gamma-\lambda \end{vmatrix} = \lambda^2 + (6\delta-\gamma)\lambda + 5\delta^2 - \frac{11}{3}\delta\gamma = 0 .$$
From the quadratic formula we find the roots
$$\lambda_{\{5,6\}} = -3\delta + \frac{\gamma}{2} \pm \sqrt{4\delta^2 + \frac23\delta\gamma + \frac14\gamma^2}.$$
Looking at the $\psi_7, \psi_8$ block, we see that we can just let $\gamma \to -\gamma$ and use the same equation for the roots:
$$\lambda_{\{7,8\}} = -3\delta - \frac{\gamma}{2} \pm \sqrt{4\delta^2 - \frac23\delta\gamma + \frac14\gamma^2}.$$
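As a check on the algebra, the sketch below (a numerical aside; the sample $\delta, \gamma$ values are arbitrary test inputs) diagonalizes the $\psi_5, \psi_6$ block directly and compares with the quadratic-formula roots:

```python
# Compare direct 2x2 eigenvalues of the psi_5, psi_6 block with the
# closed-form roots lambda_{5,6} derived above.
import math

def block_eigs(d, g):
    """Eigenvalues of [[-d + 2g/3, -sqrt2 g/3], [-sqrt2 g/3, -5d + g/3]]."""
    a, b, c = -d + 2*g/3, -math.sqrt(2.0)*g/3, -5*d + g/3
    tr, det = a + c, a*c - b*b
    disc = math.sqrt(tr*tr/4.0 - det)
    return tr/2.0 + disc, tr/2.0 - disc

def root_formula(d, g, sign):
    return -3*d + g/2 + sign*math.sqrt(4*d*d + (2.0/3.0)*d*g + g*g/4)

for d, g in [(1.0, 0.3), (0.2, 5.0), (2.5, 2.5)]:
    lam_plus, lam_minus = block_eigs(d, g)
    assert abs(lam_plus - root_formula(d, g, +1)) < 1e-9
    assert abs(lam_minus - root_formula(d, g, -1)) < 1e-9
```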
The energy $E^{(1)}_i$ of each of these eight states is then given by
$$E^{(1)}_1 = E^{(0)}_2 - 5\delta + \gamma$$
$$E^{(1)}_2 = E^{(0)}_2 - 5\delta - \gamma$$
$$E^{(1)}_3 = E^{(0)}_2 - \delta + 2\gamma$$
$$E^{(1)}_4 = E^{(0)}_2 - \delta - 2\gamma$$
$$E^{(1)}_5 = E^{(0)}_2 - 3\delta + \frac{\gamma}{2} + \sqrt{4\delta^2 + \frac23\delta\gamma + \frac14\gamma^2}$$
$$E^{(1)}_6 = E^{(0)}_2 - 3\delta + \frac{\gamma}{2} - \sqrt{4\delta^2 + \frac23\delta\gamma + \frac14\gamma^2}$$
$$E^{(1)}_7 = E^{(0)}_2 - 3\delta - \frac{\gamma}{2} + \sqrt{4\delta^2 - \frac23\delta\gamma + \frac14\gamma^2}$$
$$E^{(1)}_8 = E^{(0)}_2 - 3\delta - \frac{\gamma}{2} - \sqrt{4\delta^2 - \frac23\delta\gamma + \frac14\gamma^2}$$
For $i = 1, 2, 3, 4$ the energy $E^{(1)}_i$ corresponds to $\psi_i$. But for $i = 5, 6$ the energy $E^{(1)}_i$ corresponds to some linear combination of $\psi_5$ and $\psi_6$, and similarly for $i = 7, 8$ the energy $E^{(1)}_i$ corresponds to a linear combination of $\psi_7$ and $\psi_8$. (This is the essential content of Section 2.2.)

It is easy to see that for $\gamma = 0$ (i.e., $B = 0$), these energies reduce to $E_{fs}$ given by (2.74), and for very large $\gamma$, we obtain the Paschen-Back energies given by (2.70). Thus our results have the correct limiting behavior. See Figure 7 below.

Figure 7: Intermediate-field energy corrections as a function of B for n = 2.
2.5.4 Supplement: The Electromagnetic Hamiltonian

In a proper derivation of the Lagrange equations of motion, one starts from d'Alembert's principle of virtual work, and derives Lagrange's equations
$$\frac{d}{dt}\frac{\partial T}{\partial \dot q_i} - \frac{\partial T}{\partial q_i} = Q_i \qquad (2.75)$$
where the $q_i$ are generalized coordinates, $T = T(q_i, \dot q_i)$ is the kinetic energy and $Q_i = \sum_j F_j(\partial x_j/\partial q_i)$ is a generalized force. In the particular case that $Q_i$ is derivable from a conservative force $F_j = -\partial V/\partial x_j$, then we have $Q_i = -\partial V/\partial q_i$. Since the potential energy $V$ is assumed to be independent of $\dot q_i$, we can replace $\partial T/\partial \dot q_i$ by $\partial(T - V)/\partial \dot q_i$ and we arrive at the usual Lagrange's equations
$$\frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0 \qquad (2.76)$$
where $L = T - V$. However, even if there is no potential function $V$, we can still arrive at this result if there exists a function $U = U(q_i, \dot q_i)$ such that the generalized forces may be written as
$$Q_i = -\frac{\partial U}{\partial q_i} + \frac{d}{dt}\frac{\partial U}{\partial \dot q_i}$$
because defining $L = T - U$ we again arrive at equation (2.76). The function $U$ is called a generalized potential or a velocity-dependent potential. We now seek such a function to describe the force on a charged particle in an electromagnetic field.
Recall from electromagnetism that the Lorentz force law is given by
$$\mathbf{F} = q\left(\mathbf{E} + \frac{\mathbf{v}}{c}\times\mathbf{B}\right)$$
or
$$\mathbf{F} = q\left(-\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} + \frac{\mathbf{v}}{c}\times(\nabla\times\mathbf{A})\right)$$
where $\mathbf{E} = -\nabla\phi - (1/c)\partial\mathbf{A}/\partial t$ and $\mathbf{B} = \nabla\times\mathbf{A}$. Our goal is to write this in the form
$$F_i = -\frac{\partial U}{\partial x_i} + \frac{d}{dt}\frac{\partial U}{\partial \dot x_i}$$
for a suitable $U$. All it takes is some vector algebra. We have
$$[\mathbf{v}\times(\nabla\times\mathbf{A})]_i = \varepsilon_{ijk}\varepsilon_{klm}v_j\partial_l A_m = (\delta^{\,l}_i\delta^{\,m}_j - \delta^{\,m}_i\delta^{\,l}_j)v_j\partial_l A_m = v_j\partial_i A_j - v_j\partial_j A_i = v_j\partial_i A_j - (\mathbf{v}\cdot\nabla)A_i .$$
But $\dot x_j$ and $x_i$ are independent variables (in other words, $\dot x_j$ has no explicit dependence on $x_i$) so that
$$v_j\partial_i A_j = \dot x_j\frac{\partial A_j}{\partial x_i} = \frac{\partial}{\partial x_i}(\dot x_j A_j) = \frac{\partial}{\partial x_i}(\mathbf{v}\cdot\mathbf{A})$$
and we have
$$[\mathbf{v}\times(\nabla\times\mathbf{A})]_i = \frac{\partial}{\partial x_i}(\mathbf{v}\cdot\mathbf{A}) - (\mathbf{v}\cdot\nabla)A_i .$$
But we also have
$$\frac{dA_i}{dt} = \frac{\partial A_i}{\partial x_j}\frac{dx_j}{dt} + \frac{\partial A_i}{\partial t} = v_j\frac{\partial A_i}{\partial x_j} + \frac{\partial A_i}{\partial t} = (\mathbf{v}\cdot\nabla)A_i + \frac{\partial A_i}{\partial t}$$
so that
$$(\mathbf{v}\cdot\nabla)A_i = \frac{dA_i}{dt} - \frac{\partial A_i}{\partial t}$$
and therefore
$$[\mathbf{v}\times(\nabla\times\mathbf{A})]_i = \frac{\partial}{\partial x_i}(\mathbf{v}\cdot\mathbf{A}) - \frac{dA_i}{dt} + \frac{\partial A_i}{\partial t}.$$
But we can write $A_i = \partial(v_jA_j)/\partial v_i = \partial(\mathbf{v}\cdot\mathbf{A})/\partial v_i$, which gives us
$$[\mathbf{v}\times(\nabla\times\mathbf{A})]_i = \frac{\partial}{\partial x_i}(\mathbf{v}\cdot\mathbf{A}) - \frac{d}{dt}\frac{\partial}{\partial v_i}(\mathbf{v}\cdot\mathbf{A}) + \frac{\partial A_i}{\partial t}.$$
The Lorentz force law can now be written in the form
$$F_i = q\left(-\frac{\partial\phi}{\partial x_i} - \frac{1}{c}\frac{\partial A_i}{\partial t} + \frac{1}{c}[\mathbf{v}\times(\nabla\times\mathbf{A})]_i\right)$$
$$= q\left(-\frac{\partial\phi}{\partial x_i} - \frac{1}{c}\frac{\partial A_i}{\partial t} + \frac{1}{c}\frac{\partial}{\partial x_i}(\mathbf{v}\cdot\mathbf{A}) - \frac{1}{c}\frac{d}{dt}\frac{\partial}{\partial v_i}(\mathbf{v}\cdot\mathbf{A}) + \frac{1}{c}\frac{\partial A_i}{\partial t}\right)$$
$$= q\left[-\frac{\partial}{\partial x_i}\left(\phi - \frac{\mathbf{v}}{c}\cdot\mathbf{A}\right) - \frac{d}{dt}\frac{\partial}{\partial v_i}\left(\frac{\mathbf{v}}{c}\cdot\mathbf{A}\right)\right].$$
Since $\phi$ is independent of $\mathbf{v}$, we can write
$$-\frac{d}{dt}\frac{\partial}{\partial v_i}\left(\frac{\mathbf{v}}{c}\cdot\mathbf{A}\right) = \frac{d}{dt}\frac{\partial}{\partial v_i}\left(\phi - \frac{\mathbf{v}}{c}\cdot\mathbf{A}\right)$$
so that
$$F_i = q\left[-\frac{\partial}{\partial x_i}\left(\phi - \frac{\mathbf{v}}{c}\cdot\mathbf{A}\right) + \frac{d}{dt}\frac{\partial}{\partial v_i}\left(\phi - \frac{\mathbf{v}}{c}\cdot\mathbf{A}\right)\right]$$
or
$$F_i = -\frac{\partial U}{\partial x_i} + \frac{d}{dt}\frac{\partial U}{\partial \dot x_i}$$
where $U = q(\phi - \mathbf{v}/c\cdot\mathbf{A})$. This shows that $U$ is a generalized potential, and that the Lagrangian for a particle of charge $q$ in an electromagnetic field is
$$L = T - q\phi + \frac{q}{c}\mathbf{v}\cdot\mathbf{A} \qquad (2.77a)$$
or
$$L = \frac{1}{2}mv^2 - q\phi + \frac{q}{c}\mathbf{v}\cdot\mathbf{A}. \qquad (2.77b)$$
From this, the canonical momentum is defined by $p_i = \partial L/\partial\dot x_i = \partial L/\partial v_i$ so that
$$\mathbf{p} = m\mathbf{v} + \frac{q}{c}\mathbf{A}.$$
Using this, the Hamiltonian is then given by
$$H = \sum p_i\dot x_i - L = \mathbf{p}\cdot\mathbf{v} - L = mv^2 + \frac{q}{c}\mathbf{A}\cdot\mathbf{v} - \frac{1}{2}mv^2 + q\phi - \frac{q}{c}\mathbf{A}\cdot\mathbf{v} = \frac{1}{2}mv^2 + q\phi = \frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 + q\phi .$$
This is the basis for the oft-heard statement that to include electromagnetic forces, you need to make the replacement $\mathbf{p} \to \mathbf{p} - (q/c)\mathbf{A}$. Including any other additional potential energy terms, the Hamiltonian becomes
$$H = \frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 + q\phi + V(\mathbf{r}). \qquad (2.78)$$
Let's evaluate (2.78) for the case of a uniform magnetic field. Since $\mathbf{B} = \nabla\times\mathbf{A}$, it is not hard to verify that
$$\mathbf{A} = -\frac{1}{2}\,\mathbf{r}\times\mathbf{B}$$
will work (I'll work it out, but you could also just plug into a vector identity if you take the time to look it up):
$$[\nabla\times(\mathbf{r}\times\mathbf{B})]_i = \varepsilon_{ijk}\varepsilon_{klm}\partial_j(x_lB_m) = (\delta_{il}\delta_{jm} - \delta_{im}\delta_{jl})[\delta_{jl}B_m + x_l\partial_jB_m] = B_i - 3B_i = -2B_i$$
where I used $\partial_jx_l = \delta_{jl}$, $\delta_{jl}\delta_{lj} = \delta_{jj} = 3$ and $\partial_jB_m = 0$ since $\mathbf{B}$ is uniform. This shows that $\mathbf{B} = -(1/2)[\nabla\times(\mathbf{r}\times\mathbf{B})] = \nabla\times\mathbf{A}$ as claimed. Note also that for this $\mathbf{B}$ we have
$$-2\nabla\cdot\mathbf{A} = \nabla\cdot(\mathbf{r}\times\mathbf{B}) = \varepsilon_{ijk}\partial_i(x_jB_k) = \varepsilon_{ijk}\delta_{ij}B_k = 0$$
because $\varepsilon_{ijk}\delta_{ij} = \varepsilon_{iik} = 0$. Hence $\nabla\cdot\mathbf{A} = 0$.
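Both claims can also be confirmed by brute force. The sketch below (a numerical aside; the field value and sample point are arbitrary test inputs) differentiates $\mathbf{A} = -\frac12\mathbf{r}\times\mathbf{B}$ by central finite differences and checks $\nabla\times\mathbf{A} = \mathbf{B}$ and $\nabla\cdot\mathbf{A} = 0$:

```python
# Finite-difference check that A = -(1/2) r x B has curl A = B and
# div A = 0 for a uniform field B (here B = (0, 0, 2.0) as a test value).
B = (0.0, 0.0, 2.0)

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def A(x, y, z):
    return tuple(-0.5 * c for c in cross((x, y, z), B))

def partial(f, i, p, h=1e-5):
    """Central difference of the vector function f along coordinate i."""
    q_plus = list(p); q_plus[i] += h
    q_minus = list(p); q_minus[i] -= h
    return [(a - b) / (2*h) for a, b in zip(f(*q_plus), f(*q_minus))]

p = (0.7, -1.3, 0.4)                     # arbitrary point
dAdx, dAdy, dAdz = (partial(A, i, p) for i in range(3))
curl = (dAdy[2] - dAdz[1], dAdz[0] - dAdx[2], dAdx[1] - dAdy[0])
div = dAdx[0] + dAdy[1] + dAdz[2]

assert all(abs(c - b) < 1e-8 for c, b in zip(curl, B))
assert abs(div) < 1e-8
```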
Before writing out (2.78), let me use this last result to show that
$$(\mathbf{p}\cdot\mathbf{A})\psi = -i\hbar\nabla\cdot(\mathbf{A}\psi) = -i\hbar(\nabla\cdot\mathbf{A})\psi - i\hbar\mathbf{A}\cdot\nabla\psi = (\mathbf{A}\cdot\mathbf{p})\psi$$
and hence $\mathbf{p}\cdot\mathbf{A} = \mathbf{A}\cdot\mathbf{p}$. (Note this shows that $\mathbf{p}\cdot\mathbf{A} = \mathbf{A}\cdot\mathbf{p}$ even if $\mathbf{B}$ is not uniform, as long as we are using the Coulomb gauge $\nabla\cdot\mathbf{A} = 0$.) Now using this, we have
$$\frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 = \frac{1}{2m}\left[p^2 - \frac{q}{c}(\mathbf{p}\cdot\mathbf{A} + \mathbf{A}\cdot\mathbf{p}) + \frac{q^2}{c^2}A^2\right] = \frac{p^2}{2m} - \frac{q}{mc}\mathbf{A}\cdot\mathbf{p} + \frac{q^2}{2mc^2}A^2 .$$
But (thinking of the scalar triple product as a determinant and switching two rows)
$$-\frac{q}{mc}\mathbf{A}\cdot\mathbf{p} = \frac{q}{2mc}(\mathbf{r}\times\mathbf{B})\cdot\mathbf{p} = -\frac{q}{2mc}\mathbf{B}\cdot(\mathbf{r}\times\mathbf{p}) = -\frac{q}{2mc}\mathbf{B}\cdot\mathbf{L}.$$
And using (I'll leave the proof to you)
$$A^2 = \frac{1}{4}(\mathbf{r}\times\mathbf{B})\cdot(\mathbf{r}\times\mathbf{B}) = \frac{1}{4}[r^2B^2 - (\mathbf{r}\cdot\mathbf{B})^2]$$
we obtain
$$\frac{1}{2m}\left(\mathbf{p} - \frac{q}{c}\mathbf{A}\right)^2 = \frac{p^2}{2m} - \frac{q}{2mc}\mathbf{B}\cdot\mathbf{L} + \frac{q^2}{8mc^2}[r^2B^2 - (\mathbf{r}\cdot\mathbf{B})^2].$$
Let's compare the relative magnitudes of the $\mathbf{B}\cdot\mathbf{L}$ term and the quadratic (last) term for an electron. Taking $r^2 \approx a_0^2$ and $L \approx \hbar$, we have
$$\frac{(e^2/8mc^2)r^2B^2}{(e/2mc)B\cdot L} \approx \frac{(e^2/8mc^2)a_0^2B^2}{(e\hbar/2mc)B} = \frac{1}{4}\frac{e^2}{\hbar c}\frac{B}{e/a_0^2} = \frac{1}{4}\cdot\frac{1}{137}\cdot\frac{B}{(4.8\times 10^{-10}\ \text{esu})/(0.5\times 10^{-8}\ \text{cm})^2} \approx \frac{B}{9\times 10^9\ \text{gauss}}.$$
Since magnetic fields in the lab are of order $10^4$ gauss or less, we see that the quadratic term is negligible in comparison.
Referring back to (2.38), we see that
$$\boldsymbol{\mu}_l = \frac{q}{2mc}\mathbf{L}$$
where, for an electron, we have $q = -e$. And as we have also seen, for spin we must postulate a magnetic moment of the form
$$\boldsymbol{\mu}_s = g\frac{q}{2mc}\mathbf{S}$$
where $g = 2$ for an electron (and $g = 5.59$ for a proton). Therefore, an electron has a total magnetic moment
$$\boldsymbol{\mu}_{tot} = -\frac{e}{2m_ec}(\mathbf{L} + 2\mathbf{S})$$
as we stated in (2.68).

Combining our results, the Hamiltonian for a hydrogen atom in a uniform external magnetic field is then given by
$$H = \frac{p^2}{2m_e} - \frac{e^2}{r} - \boldsymbol{\mu}_{tot}\cdot\mathbf{B} = H_0 - \boldsymbol{\mu}_{tot}\cdot\mathbf{B} = H_0 + H'$$
where we are taking $q\phi + V(r) = -e^2/r$, and $m_e$ in this equation is really the reduced mass, which is approximately the same as the electron mass.
3 Time-Dependent Perturbation Theory

3.1 Transitions Between Two Discrete States

We now turn our attention to the situation where the perturbation depends on time. In this situation, we assume that the system is originally in some definite state, and that applying a time-dependent external force then induces a transition to another state. For example, shining electromagnetic radiation on an atom in its ground state may cause it to undergo a transition to a higher energy state. We assume that the external force is weak enough that perturbation theory applies.

There are several ways to deal with this problem, and everyone seems to have their own approach. We shall follow a method that is closely related to the time-independent method that we employed.

To begin, suppose
$$H = H_0 + H'(t)$$
and that we have the orthonormal solutions
$$H_0\psi_n = E_n\psi_n \qquad\text{with}\qquad \Psi_n(t) = \psi_ne^{-iE_nt/\hbar}.$$
Note that we no longer need to add a superscript 0 to the energies, because with a time-dependent Hamiltonian there is no energy conservation and hence we are not looking for energy corrections.

We would like to solve the time-dependent Schrödinger equation
$$H\Psi(t) = [H_0 + H'(t)]\Psi(t) = i\hbar\frac{\partial\Psi(t)}{\partial t}. \qquad (3.1)$$
In this case, the solutions $\psi_n$ still form a complete set (they describe every possible state available to the system), the difference being that now the state $\Psi(t)$ that results from the perturbation will depend on time. So let us write
$$\Psi(t) = \sum_k c_k(t)e^{-iE_kt/\hbar}\psi_k. \qquad (3.2)$$
The reason for this form is that we want the time-dependent coefficients $c_k(t)$ to reduce to constants if $H'(t) = 0$. In other words, $H'(t) \to 0$ implies that $\Psi(t)$ reduces to a fixed superposition of the stationary states $\Psi_k(t)$.

Our goal is to find the probability that if the system is in an eigenstate $\psi_i = \Psi(0)$ at time $t = 0$, it will be found in the eigenstate $\psi_f$ at a later time $t$. This probability is given by
$$P_{i\to f}(t) = |\langle\psi_f|\Psi(t)\rangle|^2 = |c_f(t)|^2 \qquad (3.3)$$
where $\langle\Psi(t)|\Psi(t)\rangle = 1$ implies
$$\sum_k |c_k(t)|^2 = 1 .$$
Using (3.2) in (3.1) we obtain
$$\sum_k c_k(t)e^{-iE_kt/\hbar}[E_k + H'(t)]\psi_k = \sum_k i\hbar\left[\dot c_k(t) - \frac{iE_k}{\hbar}c_k(t)\right]e^{-iE_kt/\hbar}\psi_k$$
or
$$i\hbar\sum_k \dot c_k(t)e^{-iE_kt/\hbar}\psi_k = \sum_k H'(t)c_k(t)e^{-iE_kt/\hbar}\psi_k. \qquad (3.4)$$
But $\langle\psi_n|\psi_k\rangle = \delta_{nk}$ so that
$$i\hbar\,\dot c_n(t)e^{-iE_nt/\hbar} = \sum_k \langle\psi_n|H'(t)|\psi_k\rangle c_k(t)e^{-iE_kt/\hbar}.$$
Defining the Bohr angular frequency
$$\omega_{nk} = \frac{E_n - E_k}{\hbar} \qquad (3.5)$$
we can write
$$\dot c_n(t) = \frac{1}{i\hbar}\sum_k \langle\psi_n|H'(t)|\psi_k\rangle c_k(t)e^{i\omega_{nk}t}. \qquad (3.6a)$$
This set of equations for $c_n(t)$ is exact and completely equivalent to the original Schrödinger equation (3.1). Defining
$$H'_{nk}(t) = \langle\psi_n|H'(t)|\psi_k\rangle$$
we may write out (3.6a) in matrix form as (for a finite number of terms)
$$i\hbar\begin{pmatrix}\dot c_1(t)\\ \dot c_2(t)\\ \vdots\\ \dot c_n(t)\end{pmatrix} = \begin{pmatrix} H'_{11} & H'_{12}e^{i\omega_{12}t} & \cdots & H'_{1n}e^{i\omega_{1n}t}\\ H'_{21}e^{i\omega_{21}t} & H'_{22} & \cdots & H'_{2n}e^{i\omega_{2n}t}\\ \vdots & \vdots & & \vdots\\ H'_{n1}e^{i\omega_{n1}t} & H'_{n2}e^{i\omega_{n2}t} & \cdots & H'_{nn}\end{pmatrix}\begin{pmatrix}c_1(t)\\ c_2(t)\\ \vdots\\ c_n(t)\end{pmatrix}. \qquad (3.6b)$$
As we did in the time-independent case, we now let $H'(t) \to \lambda H'(t)$, and expand $c_k(t)$ in a power series in $\lambda$:
$$c_k(t) = c_k^{(0)}(t) + \lambda c_k^{(1)}(t) + \cdots. \qquad (3.7)$$
Inserting this into (3.6a) yields
$$\dot c_n^{(0)}(t) + \lambda\dot c_n^{(1)}(t) + \lambda^2\dot c_n^{(2)}(t) + \cdots = \frac{1}{i\hbar}\sum_k H'_{nk}(t)\left[\lambda c_k^{(0)}(t) + \lambda^2 c_k^{(1)}(t) + \lambda^3 c_k^{(2)}(t) + \cdots\right]e^{i\omega_{nk}t}.$$
Equating powers of $\lambda$, for $\lambda^0$ we have
$$\dot c_n^{(0)}(t) = 0 \qquad (3.8a)$$
and for $\lambda^{s+1}$ with $s \geq 0$ we have
$$\dot c_n^{(s+1)}(t) = \frac{1}{i\hbar}\sum_k H'_{nk}(t)c_k^{(s)}(t)e^{i\omega_{nk}t}. \qquad (3.8b)$$
In principle, these may be solved successively. Solving (3.8a) gives $c_k^{(0)}(t)$, and using this in (3.8b) then gives $c_n^{(1)}(t)$. Then putting these back into (3.8b) again yields $c_n^{(2)}(t)$, and in principle this can be continued to any desired order.
Let us assume that the system is initially in the state $\psi_i$, so that
$$c_n(0) = \delta_{ni}. \qquad (3.9a)$$
Since this must be true for all $\lambda$, we have
$$c_n^{(0)}(0) = \delta_{ni} \qquad (3.9b)$$
and
$$c_n^{(s)}(0) = 0 \qquad\text{for } s \geq 1. \qquad (3.9c)$$
From (3.8a) we see that the zeroth-order coefficients are constant in time, so we have
$$c_n^{(0)}(t) = \delta_{ni} \qquad (3.9d)$$
and the zeroth-order solutions are completely determined.
Using (3.9d) in (3.8b) we obtain, to first order,
$$\dot c_n^{(1)}(t) = \frac{1}{i\hbar}\sum_k H'_{nk}(t)\delta_{ki}e^{i\omega_{nk}t} = \frac{1}{i\hbar}H'_{ni}(t)e^{i\omega_{ni}t}$$
so that
$$c_n^{(1)}(t) = \frac{1}{i\hbar}\int_0^t H'_{ni}(t')e^{i\omega_{ni}t'}\,dt' \qquad (3.10)$$
where the constant of integration is zero by (3.9c). Using (3.9d) and (3.10) in (3.2) yields $\Psi(t)$ to first order:
$$\Psi(t) = \psi_ie^{-iE_it/\hbar} + \sum_k\left[\frac{1}{i\hbar}\int_0^t H'_{ki}(t')e^{i\omega_{ki}t'}\,dt'\right]e^{-iE_kt/\hbar}\psi_k.$$
From (3.3) we know that the transition probability to the state $\psi_f$ is given by
$$P_{i\to f}(t) = |\langle\psi_f|\Psi(t)\rangle|^2 = |c_f(t)|^2$$
where $c_f(t) = c_f^{(0)}(t) + \lambda c_f^{(1)}(t) + \cdots$. We will only consider transitions to states $\psi_f$ that are distinct from the initial state $\psi_i$, and hence $c_f^{(0)}(t) = 0$. Then the first-order transition probability is
$$P_{i\to f}(t) = \lambda^2\left|c_f^{(1)}(t)\right|^2$$
or, from (3.10) and letting $\lambda \to 1$,
$$P_{i\to f}(t) = \frac{1}{\hbar^2}\left|\int_0^t H'_{fi}(t')e^{i\omega_{fi}t'}\,dt'\right|^2. \qquad (3.11)$$
A minor point is that our initial conditions could equally well be defined at $t \to -\infty$. In this case, the lower limit on the above integrals would obviously be $-\infty$ rather than 0.
Example 3.1. Consider a one-dimensional harmonic oscillator of a particle of charge $q$ with characteristic frequency $\omega$. Let this oscillator be placed in an electric field $\mathcal{E}$ that is turned on and off so that its potential energy is given by
$$H'(t) = -q\mathcal{E}xe^{-t^2/\tau^2}$$
where $\tau$ is a constant. If the particle starts out in its ground state, let us find the probability that it will be in its first excited state after a time $t \gg \tau$.

Since $t \gg \tau$, we may as well take $t = \pm\infty$ as limits. From (3.11), we see that we must evaluate the integral
$$I = \int_{-\infty}^{\infty} H'_{10}(t')e^{i\omega_{10}t'}\,dt'$$
where
$$H'_{10}(t) = -q\mathcal{E}e^{-t^2/\tau^2}\langle\psi_1|x|\psi_0\rangle$$
and $E_n = (n + 1/2)\hbar\omega$ so that $\omega_{10} = (E_1 - E_0)/\hbar = \omega$. Then (keeping $\omega_{10}$ for generality at this point)
$$I = -q\mathcal{E}\langle\psi_1|x|\psi_0\rangle\int_{-\infty}^{\infty} e^{-t^2/\tau^2}e^{i\omega_{10}t}\,dt = -q\mathcal{E}\langle\psi_1|x|\psi_0\rangle\int_{-\infty}^{\infty} e^{-(1/\tau^2)(t^2 - i\omega_{10}\tau^2t)}\,dt$$
$$= -q\mathcal{E}\langle\psi_1|x|\psi_0\rangle\, e^{-\omega_{10}^2\tau^2/4}\int_{-\infty}^{\infty} e^{-(1/\tau^2)(t - i\omega_{10}\tau^2/2)^2}\,dt = -q\mathcal{E}\langle\psi_1|x|\psi_0\rangle\, e^{-\omega_{10}^2\tau^2/4}\int_{-\infty}^{\infty} e^{-(1/\tau^2)u^2}\,du$$
$$= -q\mathcal{E}\langle\psi_1|x|\psi_0\rangle\, e^{-\omega_{10}^2\tau^2/4}\,\tau\sqrt{\pi}.$$
The easy way to do the spatial integral is to use the harmonic oscillator ladder operators. From
$$x = \sqrt{\frac{\hbar}{2m\omega}}\,(a + a^\dagger)$$
where
$$a\psi_n = \sqrt{n}\,\psi_{n-1} \qquad\text{and}\qquad a^\dagger\psi_n = \sqrt{n+1}\,\psi_{n+1}$$
we have
$$\langle\psi_1|x|\psi_0\rangle = \sqrt{\frac{\hbar}{2m\omega}}\,\langle\psi_1|a^\dagger\psi_0\rangle = \sqrt{\frac{\hbar}{2m\omega}}\,\langle\psi_1|\psi_1\rangle = \sqrt{\frac{\hbar}{2m\omega}}.$$
Therefore
$$I = -q\mathcal{E}\sqrt{\frac{\pi\hbar\tau^2}{2m\omega}}\;e^{-\omega_{10}^2\tau^2/4}$$
so that
$$P_{0\to 1}(t \gg \tau) = \frac{|I|^2}{\hbar^2} = \frac{q^2\mathcal{E}^2\pi\tau^2}{2m\omega\hbar}\,e^{-\omega_{10}^2\tau^2/2} = \frac{q^2\mathcal{E}^2\pi\tau^2}{2m\omega\hbar}\,e^{-\omega^2\tau^2/2}.$$
Note that as $\tau \to \infty$ (i.e., the electric field is turned on very slowly), we have $P_{0\to 1} \to 0$. This shows that the system adjusts adiabatically to the field and is not shocked into a transition.
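The Gaussian integral above is easy to confirm numerically. The sketch below (a numerical aside; sample $\tau, \omega$ values are arbitrary test inputs) evaluates $\left|\int e^{-t^2/\tau^2}e^{i\omega t}\,dt\right|$ by the trapezoidal rule and compares with the closed form $\sqrt{\pi}\,\tau\,e^{-\omega^2\tau^2/4}$:

```python
# Numerical check of the Gaussian integral in Example 3.1.
import cmath, math

def pulse_integral(tau, w, T=12.0, steps=200000):
    """Trapezoidal estimate of |int exp(-t^2/tau^2 + i w t) dt| over [-T tau, T tau]."""
    a, b = -T * tau, T * tau
    h = (b - a) / steps
    total = 0.5 * (cmath.exp(-a*a/tau**2 + 1j*w*a) + cmath.exp(-b*b/tau**2 + 1j*w*b))
    for i in range(1, steps):
        t = a + i * h
        total += cmath.exp(-t*t/tau**2 + 1j*w*t)
    return abs(total * h)

for tau, w in [(1.0, 1.0), (0.5, 3.0)]:
    exact = math.sqrt(math.pi) * tau * math.exp(-w*w*tau*tau/4.0)
    assert abs(pulse_integral(tau, w) - exact) < 1e-6 * exact
```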
Example 3.2. Let us consider a harmonic perturbation of the form
$$H'(t) = V_0(\mathbf{r})\cos\omega t, \qquad t \geq 0 .$$
Note that letting $\omega = 0$ we obtain the constant perturbation $H'(t) = V_0(\mathbf{r})$ as a special case. It just isn't much harder to treat the more general situation, which represents the interaction of the system with an electromagnetic wave of frequency $\omega$.

If we define
$$V_{fi} = \langle\psi_f|V_0(\mathbf{r})|\psi_i\rangle ,$$
then
$$H'_{fi} = \langle\psi_f|V_0(\mathbf{r})\cos\omega t|\psi_i\rangle = \langle\psi_f|V_0(\mathbf{r})|\psi_i\rangle\cos\omega t = V_{fi}\cos\omega t .$$
Using $\cos\omega t = (e^{i\omega t} + e^{-i\omega t})/2$, we then have
$$\int_0^t H'_{fi}(t')e^{i\omega_{fi}t'}\,dt' = \frac{V_{fi}}{2}\int_0^t (e^{i\omega t'} + e^{-i\omega t'})e^{i\omega_{fi}t'}\,dt' = \frac{V_{fi}}{2}\int_0^t \left(e^{i(\omega_{fi}+\omega)t'} + e^{i(\omega_{fi}-\omega)t'}\right)dt'$$
$$= \frac{V_{fi}}{2}\left[\frac{e^{i(\omega_{fi}+\omega)t} - 1}{i(\omega_{fi}+\omega)} + \frac{e^{i(\omega_{fi}-\omega)t} - 1}{i(\omega_{fi}-\omega)}\right].$$
Inserting this into (3.11), we can write
$$P_{i\to f}(t;\omega) = \frac{|V_{fi}|^2}{4\hbar^2}\left|\frac{1 - e^{i(\omega_{fi}+\omega)t}}{\omega_{fi}+\omega} + \frac{1 - e^{i(\omega_{fi}-\omega)t}}{\omega_{fi}-\omega}\right|^2 \qquad (3.12)$$
where I'm specifically including $\omega$ as an argument of $P_{i\to f}$ because the transition probability depends on $\omega$.

Let us consider the special case of a constant (i.e., time-independent) perturbation, $\omega = 0$. In this case, (3.12) reduces to
$$P_{i\to f}(t;0) = \frac{|V_{fi}|^2}{\hbar^2\omega_{fi}^2}\left|1 - e^{i\omega_{fi}t}\right|^2 = \frac{|V_{fi}|^2}{\hbar^2\omega_{fi}^2}\,2(1 - \cos\omega_{fi}t).$$
Using the elementary identity
$$\cos A = \cos(A/2 + A/2) = \cos^2 A/2 - \sin^2 A/2 = 1 - 2\sin^2 A/2$$
we can write the transition probability as
$$P_{i\to f}(t;0) = \frac{|V_{fi}|^2}{\hbar^2}\left(\frac{\sin\omega_{fi}t/2}{\omega_{fi}/2}\right)^2 := \frac{|V_{fi}|^2}{\hbar^2}F(t;\omega_{fi}). \qquad (3.13)$$
The function
$$F(t;\omega_{fi}) = \left(\frac{\sin\omega_{fi}t/2}{\omega_{fi}/2}\right)^2 = t^2\left(\frac{\sin\omega_{fi}t/2}{\omega_{fi}t/2}\right)^2$$
has amplitude equal to $t^2$, and zeros at $\omega_{fi} = 2n\pi/t$. See Figure 8 below.

Figure 8: Plot of $F(t;\omega_{fi})$ vs $\omega_{fi}$ for $t = 2$.

The main peak lies between zeros at $\pm 2\pi/t$, so its width goes like $1/t$ while its height goes like $t^2$, and hence its area grows like $t$.
It is also interesting to see how the transition probability depends on time.

Figure 9: Plot of $F(t;\omega_{fi})$ vs $t$ for $\omega_{fi} = 2$.

Here we see clearly that for times $t = 2n\pi/\omega_{fi}$ the transition probability is zero, and the system is certain to be in its initial state. Because of this oscillatory behavior, the greatest probability for a transition is to allow the perturbation to act only for a short time $\approx \pi/\omega_{fi}$.
For future reference, let me make a (very un-rigorous but useful) mathematical observation. From Figure 8, we see that as $t \to \infty$, the function $F(t,\omega) = t^2[(\sin\omega t/2)/(\omega t/2)]^2$ has an amplitude $t^2$ that also goes to infinity, and a width $4\pi/t$ centered at $\omega = 0$ that goes to zero. Then if we include $F(t,\omega)$ inside the integral of a smooth function $f(\omega)$, the only contribution to the integral will come from where $\omega = 0$. Using the well-known result
$$\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx = \pi$$
we have (with $x = \omega t/2$ so $dx = (t/2)\,d\omega$)
$$\lim_{t\to\infty}\int_{-\infty}^{\infty} f(\omega)\,t^2\left(\frac{\sin\omega t/2}{\omega t/2}\right)^2 d\omega = 2tf(0)\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx = 2\pi tf(0)$$
and hence we conclude that
$$F(t;\omega) = \left(\frac{\sin\omega t/2}{\omega/2}\right)^2 = t^2\left(\frac{\sin\omega t/2}{\omega t/2}\right)^2 \;\xrightarrow[\;t\to\infty\;]{}\; 2\pi t\,\delta(\omega). \qquad (3.14)$$
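The normalization $2\pi t$ in (3.14) can be checked by brute force. The sketch below (a numerical aside; the cutoff and step count are arbitrary test choices) integrates $F(t;\omega)$ over a large $\omega$ window and compares with $2\pi t$:

```python
# Check that the area under F(t; w) = [sin(w t/2) / (w/2)]^2 approaches
# 2 pi t, as claimed in (3.14).
import math

def F(t, w):
    if w == 0.0:
        return t * t
    return (math.sin(w * t / 2.0) / (w / 2.0)) ** 2

def area(t, W=2000.0, steps=400000):
    """Trapezoid over [-W, W]; the tails fall off like 1/w^2."""
    h = 2.0 * W / steps
    total = 0.5 * (F(t, -W) + F(t, W))
    for i in range(1, steps):
        total += F(t, -W + i * h)
    return total * h

for t in (2.0, 5.0):
    assert abs(area(t) - 2.0 * math.pi * t) / (2.0 * math.pi * t) < 1e-3
```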
Example 3.3. Let us take a look at equation (3.12) when $\omega \approx \pm\omega_{fi}$. This is called a resonance phenomenon. We will assume that $\omega \geq 0$ by definition, and we will consider the case where $\omega_{fi} > 0$. The alternative case where $\omega_{fi} < 0$ can be treated in an analogous manner.

We begin by rewriting the two complex terms in (3.12). For the first we have
$$A_+ := \frac{1 - e^{i(\omega_{fi}+\omega)t}}{\omega_{fi}+\omega} = e^{i(\omega_{fi}+\omega)t/2}\left[\frac{e^{-i(\omega_{fi}+\omega)t/2} - e^{i(\omega_{fi}+\omega)t/2}}{\omega_{fi}+\omega}\right] = -ie^{i(\omega_{fi}+\omega)t/2}\left[\frac{\sin(\omega_{fi}+\omega)t/2}{(\omega_{fi}+\omega)/2}\right]$$
and similarly for the second
$$A_- := \frac{1 - e^{i(\omega_{fi}-\omega)t}}{\omega_{fi}-\omega} = -ie^{i(\omega_{fi}-\omega)t/2}\left[\frac{\sin(\omega_{fi}-\omega)t/2}{(\omega_{fi}-\omega)/2}\right].$$
If $\omega \approx \omega_{fi}$, then $A_-$ dominates and is called the resonant term, while the term $A_+$ is called the anti-resonant term. (These terms would be switched if we were considering the case $\omega_{fi} < 0$.)

We are considering the case where $|\omega_{fi} - \omega| \ll |\omega_{fi} + \omega|$, so $A_+$ can be neglected in comparison to $A_-$. Under these conditions, (3.12) becomes
$$P_{i\to f}(t;\omega) = \frac{|V_{fi}|^2}{4\hbar^2}|A_-|^2 = \frac{|V_{fi}|^2}{4\hbar^2}\left[\frac{\sin(\omega_{fi}-\omega)t/2}{(\omega_{fi}-\omega)/2}\right]^2 := \frac{|V_{fi}|^2}{4\hbar^2}F(t;\omega_{fi}-\omega). \qquad (3.15)$$
A plot of $F(t;\omega_{fi}-\omega)$ as a function of $\omega$ would be identical to Figure 8 except that the peak would be centered over the point $\omega = \omega_{fi}$. In particular, $F(t;\omega_{fi}-\omega)$ has a maximum value of $t^2$, and a width between its first two zeros of
$$\Delta\omega = \frac{4\pi}{t}. \qquad (3.16)$$
Here is another way to view Example 3.3. Let us consider a time-dependent potential of the form
$$H'(t) = V_0(\mathbf{r})e^{-i\omega t}. \qquad (3.17)$$
Then
$$\int_0^t H'_{fi}(t')e^{i\omega_{fi}t'}\,dt' = V_{fi}\int_0^t e^{i(\omega_{fi}-\omega)t'}\,dt' = V_{fi}\,\frac{e^{i(\omega_{fi}-\omega)t} - 1}{i(\omega_{fi}-\omega)} = V_{fi}\,e^{i(\omega_{fi}-\omega)t/2}\,\frac{\sin(\omega_{fi}-\omega)t/2}{(\omega_{fi}-\omega)/2}$$
and (3.11) becomes
$$P_{i\to f}(t) = \frac{|V_{fi}|^2}{\hbar^2}\left[\frac{\sin(\omega_{fi}-\omega)t/2}{(\omega_{fi}-\omega)/2}\right]^2. \qquad (3.18)$$
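It is instructive to compare (3.18) with an "exact" calculation. The sketch below (a numerical aside; it assumes the two-level problem in rotating-wave form, with the conjugate coupling $V e^{+i\omega t}$ driving the reverse transition, and sample parameter values chosen for the test) integrates the coupled amplitude equations (3.6a) by RK4 and checks that for small $V$ the result agrees with first-order perturbation theory:

```python
# First-order result (3.18) vs direct integration of the two-level
# amplitude equations, units hbar = 1.  For small V the two agree.
import cmath, math

def exact_P(w_fi, w, V, T, steps=20000):
    """RK4 for i c_f' = V e^{+i(w_fi - w)t} c_i,  i c_i' = V e^{-i(w_fi - w)t} c_f."""
    d = w_fi - w
    def deriv(t, ci, cf):
        return (-1j * V * cmath.exp(-1j * d * t) * cf,
                -1j * V * cmath.exp(+1j * d * t) * ci)
    h = T / steps
    ci, cf, t = 1.0 + 0j, 0.0 + 0j, 0.0
    for _ in range(steps):
        k1 = deriv(t, ci, cf)
        k2 = deriv(t + h/2, ci + h/2*k1[0], cf + h/2*k1[1])
        k3 = deriv(t + h/2, ci + h/2*k2[0], cf + h/2*k2[1])
        k4 = deriv(t + h, ci + h*k3[0], cf + h*k3[1])
        ci += h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        cf += h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
        t += h
    return abs(cf) ** 2

def first_order_P(w_fi, w, V, T):
    d = w_fi - w
    s = T if d == 0 else math.sin(d * T / 2.0) / (d / 2.0)
    return V * V * s * s                     # equation (3.18), hbar = 1

w_fi, w, V, T = 5.0, 4.7, 0.001, 20.0        # weak coupling, near resonance
p_exact, p1 = exact_P(w_fi, w, V, T), first_order_P(w_fi, w, V, T)
assert p_exact > 0.0
assert abs(p_exact - p1) < 1e-2 * p1
```

For larger $V$ or longer times the two curves drift apart, which is exactly the breakdown of the first-order approximation discussed at the end of this section.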
As $t \to \infty$, we can use (3.14) to write
$$\lim_{t\to\infty} P_{i\to f}(t) = \frac{2\pi}{\hbar}|V_{fi}|^2\,\delta(E_f - E_i - \hbar\omega)\,t$$
where we used the general result $\delta(ax) = (1/|a|)\delta(x)$ so that $\delta(\omega) = \delta(E/\hbar) = \hbar\delta(E)$. Note that the transition probability grows linearly with time. We can write this as
$$P_{i\to f}(t\to\infty) = \Gamma_{i\to f}\,t \qquad (3.19a)$$
where the transition rate (i.e., the transition probability per unit time) is defined by
$$\Gamma_{i\to f} = \frac{2\pi}{\hbar}|V_{fi}|^2\,\delta(E_f - E_i - \hbar\omega). \qquad (3.19b)$$
(The result (3.19b) differs from (3.15) by a factor of 4 in the denominator. This is because in Example 3.2 we used $\cos\omega t$, which contains the terms $(1/2)e^{\pm i\omega t}$.)

Because of the delta function, we only get transitions in those cases where $|E_f - E_i| = \hbar\omega$, which is simply a statement of energy conservation. In the case of a potential of the form $V_0e^{+i\omega t}$, the delta function becomes $\delta(E_f - E_i + \hbar\omega)$, so we have $E_f = E_i - \hbar\omega$ and the system has emitted a quantum of energy $\hbar\omega$. And in the case where we have a potential of the form $V_0e^{-i\omega t}$, we have $E_f = E_i + \hbar\omega$ so the system has absorbed a quantum of energy $\hbar\omega$.

In Example 3.3, we saw that resonance occurs when $\omega = \omega_{fi}$. Since we are considering the case where $\hbar\omega_{fi} = E_f - E_i \geq 0$, this means that resonance is at the point where $E_f = E_i + \hbar\omega$. In other words, a system with energy $E_i$ undergoes a resonant absorption of a quantum of energy $\hbar\omega$ to transition to a state with energy $E_f$. Had we started with the case where $\omega_{fi} < 0$, we would have found that the system underwent a resonant induced emission of the same quantum of energy $\hbar\omega$, so that $E_f = E_i - \hbar\omega$.
Also recall that in Example 3.3, we neglected A_+ relative to A_−. Noting that |A_+(ω)|² = |A_−(−ω)|², it is easy to see that a plot of |A_+|² is exactly the same as a plot of |A_−|² reflected about the vertical axis ω = 0. See Figure 10 below. Note that both of these curves have a width Δω = 4π/t that narrows as time increases.
Figure 10: Plot of |A_+|² and |A_−|² vs ω for t = 2 and ω_fi = 20.
In addition, we see that A_+ will be negligible relative to A_− as long as they are well-separated, in other words, as long as

$$\Delta\omega \ll 2\,|\omega_{fi}|\,.$$

Since Δω = 4π/t, this is equivalent to requiring

$$t \gg \frac{1}{|\omega_{fi}|} \sim \frac{1}{\omega}\,.$$

Physically, this means that the perturbation must act over a long enough time interval t for the system to oscillate enough that it indeed appears sinusoidal.
On the other hand, in both Examples 3.2 and 3.3, the transition probability P_{i→f}(t; ω) has a maximum value proportional to t². Since this approaches infinity as t → ∞, and since a probability always has to be less than or equal to 1, there is clearly something wrong. One answer is that the first-order approximation we are using has a limited time range. In Example 3.3, resonance occurs when ω = ω_fi, in which case

$$P_{i\to f}(t;\,\omega=\omega_{fi}) = \frac{|V_{fi}|^2}{4\hbar^2}\,t^2\,.$$

So in order for our first-order approximation to be valid, we must have

$$t \ll \frac{\hbar}{|V_{fi}|}\,.$$

Combining this with the previous paragraph, we conclude that

$$\frac{1}{|\omega_{fi}|} \ll \frac{\hbar}{|V_{fi}|}\,.$$
This is the same as

$$\hbar\,|\omega_{fi}| = |E_f - E_i| \gg |V_{fi}| = |\langle\psi_f|V_0|\psi_i\rangle|$$

and hence the energy difference between the initial and final states must be much larger than the matrix element V_fi between these states.
3.2 Transitions to a Continuum of States
In the previous section we considered the transition probability P_{i→f}(t) from an initial state ψ_i to a final state ψ_f. But in the real experimental world, detectors generally observe transitions over a (at least) small range of energies and over a finite range of incident angles. Thus, we should treat not a single final state ψ_f, but rather a group (or continuum) of closely spaced states centered about some ψ_f. Since the area under the curve in Figure 8 grows like t, we expect that the transition probability to a set of states with approximately the same energy as ψ_f will grow linearly with time. (We saw this for a transition to a single state in equation (3.19a).)
Let us now generalize (3.19b) to a more physically realistic detector. After all, no physical transition rate can go like a delta function. To get a good idea of what to expect, we first consider the perturbation (3.17) and the resulting transition probability (3.18).
For a physically realistic detector, instead of a transition to a single final state we must consider all transitions to a group of final states centered about E_f:

$$P(t) = \sum_{E_f\in\Delta E_f} \frac{|V_{fi}|^2}{\hbar^2}\left[\frac{\sin(\omega_{fi}-\omega)t/2}{(\omega_{fi}-\omega)/2}\right]^2 = \sum_{E_f\in\Delta E_f} |V_{fi}|^2\left[\frac{\sin(E_f-E_i-\hbar\omega)t/2\hbar}{(E_f-E_i-\hbar\omega)/2}\right]^2$$

where the sum is over all states with energies in the range ΔE_f. We assume that the final states are very closely spaced, and hence may be treated as a continuum of states. In that case, the sum may be converted to an integral over the interval ΔE_f by writing the number of states with energy between E_f and E_f + dE_f as ρ(E_f) dE_f, where ρ(E_f) is called the density of final states. It is just the number of states per unit energy. Then
$$P(t) = \int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2} \rho(E_f)\,dE_f\,|V_{fi}|^2\left[\frac{\sin(E_f-E_i-\hbar\omega)t/2\hbar}{(E_f-E_i-\hbar\omega)/2}\right]^2. \qquad (3.20)$$
As t becomes very large, we have seen that the term in brackets becomes sharply peaked about E_f = E_i + ℏω, and hence we may assume that ρ(E_f) and |V_fi| are essentially constant over the region of integration, which we may also let go to ±∞. Changing variables to x = (E_f − E_i − ℏω)t/2ℏ we then have
$$P(t) = \rho(E_f)\,|V_{fi}|^2\,\frac{2t}{\hbar}\int_{-\infty}^{\infty}\frac{\sin^2 x}{x^2}\,dx = \frac{2\pi}{\hbar}\,\rho(E_f)\,|V_{fi}|^2\,t\,.$$
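This step uses the dimensionless integral ∫_{−∞}^{∞} sin²x/x² dx = π. A brute-force check (not part of the original notes) over a large finite window reproduces it; the neglected tails contribute only about 1/L:

```python
import numpy as np

# Riemann-sum check of  ∫ sin^2(x)/x^2 dx = pi  over [-L, L].
L = 2000.0
x = np.linspace(-L, L, 4_000_001)
integrand = np.sinc(x / np.pi)**2      # np.sinc(x/pi) = sin(x)/x, finite at x = 0
val = np.sum(integrand) * (x[1] - x[0])
assert abs(val - np.pi) < 1e-2         # tails of size ~1/L are the dominant error
```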
Defining the transition rate Λ = dP/dt, we finally arrive at

$$\Lambda = \frac{2\pi}{\hbar}\,\rho(E_f)\,|V_{fi}|^2\,\bigg|_{E_f = E_i+\hbar\omega} \qquad (3.21)$$

which is called Fermi's golden rule.
A completely equivalent way to write this is to take equations (3.19) and write

$$P(t) = \sum_{\text{final states}} P_{i\to f}(t) = \sum_{\text{final states}} \lambda_{i\to f}\,t = \Lambda t$$

where

$$\lambda_{i\to f} = \frac{2\pi}{\hbar}\,|V_{fi}|^2\,\delta(E_f - E_i - \hbar\omega)$$

and

$$\Lambda = \sum_{\text{final states}} \lambda_{i\to f}\,.$$
If you wish, you can then replace the sum over states by an integral over energies if you include a density of states factor ρ(E). This has the same effect as simply using (3.14) in (3.20) to write

$$P(t) = \int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2} \rho(E_f)\,dE_f\,|V_{fi}|^2\,\frac{2\pi}{\hbar}\,t\,\delta(E_f - E_i - \hbar\omega)$$
$$= \left[\frac{2\pi}{\hbar}\,|V_{fi}|^2\int_{E_f-\Delta E_f/2}^{E_f+\Delta E_f/2}\rho(E_f)\,\delta(E_f - E_i - \hbar\omega)\,dE_f\right] t = \Lambda t\,.$$
Example 3.4. Let us consider a simple, one-dimensional model of photo-ionization, in which a particle of charge e in its ground state ψ₀ in a potential U(x) is irradiated by light of frequency ω, and hence is ejected into the continuum.

To keep things simple, we first assume that the wavelength of the incident light is much longer than atomic dimensions. Under these conditions, the electric field of the light may be considered uniform in space, but harmonic in time. (The magnetic field of the light exerts a force that is of order v/c less than the electric force, and may be neglected.) Since we are treating the absorption of energy, we write the electric field as E = ℰ e^{−iωt} x̂. Using E = −∇φ we have

$$\int \mathcal{E}\,dx = \mathcal{E}\,e^{-i\omega t}\int dx = \mathcal{E}\,e^{-i\omega t}\,x = -\int \frac{d\varphi}{dx}\,dx = -\varphi(x)$$

so that φ(x) = −ℰ e^{−iωt} x. From Example 2.2 we know that the interaction energy of the particle in the electric field is given by −eφ(x), and hence the perturbation is

$$H'(x,t) = e\mathcal{E}\,x\,e^{-i\omega t} = V_0(x)\,e^{-i\omega t}.$$
The second assumption we shall make is that the frequency ω is large enough that the final state energy E_f is very large compared to U(x), and therefore we may treat the final state of the ejected particle as a plane wave (i.e., a free particle of definite energy and momentum).

We need to find the density of final states and the normalization of these states. The standard trick to accomplishing this is to consider our system to be in a box of length L, and then let L → ∞. By a proper choice of boundary conditions, this will give us a discrete set of normalizable states. However, we can't treat this like a "particle in a box," because such states must vanish at the walls, and a state of definite momentum can't vanish. Therefore, we employ the mathematical (but non-physical) trick of assuming periodic boundary conditions, whereby the walls are taken to lie at x₀ and x₀ + L together with ψ(x₀ + L) = ψ(x₀).

The free particle plane waves are of the form e^{ipx/ℏ}, so our periodic boundary conditions become

$$e^{ip(x_0+L)/\hbar} = e^{ipx_0/\hbar}$$
so that e^{ipL/ℏ} = 1 and hence

$$p = \pm\sqrt{2mE} = \frac{2\pi n\hbar}{L}\,; \qquad n = 0, \pm 1, \pm 2, \ldots.$$

This shows that the momentum (and hence energy) of the particle takes on discrete values. Note that as L gets larger and larger, the spacing of the states becomes closer and closer, and in the limit L → ∞ they become the usual free particle continuum states of definite momentum. This is the justification for using periodic boundary conditions. Finally, the normalization condition ∫_{x₀}^{x₀+L} |ψ|² dx = 1 implies that the normalized wave functions are then

$$\psi_E = \frac{1}{\sqrt{L}}\,e^{\pm i\sqrt{2mE}\,x/\hbar}\,.$$
The next thing we need to do is find the density of states ρ(E), which is defined as the number of states with an energy between E and E + dE, i.e., ρ(E) = dN/dE. Consider a state with energy E defined by

$$\sqrt{2mE} = \frac{2\pi N\hbar}{L} \qquad\text{so that}\qquad N = \frac{L}{2\pi\hbar}\sqrt{2mE}\,.$$

From n = 0, ±1, ±2, …, ±N, we see that there are 2N + 1 states with energy less than or equal to E. Calling this number N(E), we have

$$N(E) = 2N + 1 = \frac{L}{\pi\hbar}\sqrt{2mE} + 1\,.$$

But then

$$N(E+dE) = \frac{L}{\pi\hbar}\sqrt{2m(E+dE)} + 1 = \frac{L}{\pi\hbar}\sqrt{2mE}\,\sqrt{1+dE/E} + 1$$
$$\approx \frac{L}{\pi\hbar}\sqrt{2mE}\,(1+dE/2E) + 1 = N(E) + \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,dE$$

and hence

$$dN = N(E+dE) - N(E) = \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,dE\,.$$

Directly from the definition of ρ(E) we then have

$$\rho(E) = \frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E}}\,. \qquad (3.22)$$
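Equation (3.22) can be checked by brute-force counting of the allowed momenta p = 2πnℏ/L. The sketch below (not part of the original notes; units with ℏ = m = 1 are a convenience choice) counts the states with energy between E and E + dE and compares with the formula:

```python
import numpy as np

# Count plane-wave states p = 2*pi*n*hbar/L (n = 0, +/-1, +/-2, ...) with
# p^2/2m <= E, and compare dN/dE with rho(E) = (L/(2*pi*hbar))*sqrt(2m/E).
hbar, m, L = 1.0, 1.0, 1e6      # big box so the states are densely spaced
E, dE = 2.0, 1e-3

def n_states(E):
    # largest |n| with (2*pi*n*hbar/L)^2/(2m) <= E, counting +/-n and n = 0
    N = int(np.floor(L * np.sqrt(2*m*E) / (2*np.pi*hbar)))
    return 2*N + 1

rho_counted = (n_states(E + dE) - n_states(E)) / dE
rho_formula = (L / (2*np.pi*hbar)) * np.sqrt(2*m / E)
assert abs(rho_counted / rho_formula - 1) < 2e-2   # agrees up to floor granularity
```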
Now we turn to the matrix element V_fi. The initial state is the normalized wave function ψ₀ with energy E₀ = −ε, where ε is the binding energy. The final state is the normalized free particle state ψ_{E_f} with energy E_f = E₀ + ℏω = ℏω − ε. Then

$$V_{fi} = \mathcal{E}\,\langle\psi_{E_f}|ex|\psi_0\rangle = \mathcal{E}\int \frac{1}{\sqrt{L}}\,e^{-i\sqrt{2mE_f}\,x/\hbar}\,e\,x\,\psi_0\,dx\,.$$
Note that this is the quantum mechanical average of the energy of an electric dipole in a uniform electric field ℰ.
Putting all of this together in (3.21), we have the transition rate

$$\Lambda = \frac{2\pi}{\hbar}\,\frac{L}{2\pi\hbar}\sqrt{\frac{2m}{E_f}}\;e^2\mathcal{E}^2\,\frac{1}{L}\left|\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx\right|^2 = \frac{e^2\mathcal{E}^2}{\hbar^2}\sqrt{\frac{2m}{E_f}}\left|\int e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx\right|^2. \qquad (3.23)$$

Note that the box size L has canceled out of the final result, as it must.
Let's actually evaluate the integral in (3.23) for the specific example of a particle in a square well potential. Recall that the solutions to this problem consist of sines and cosines inside the well, and exponentially decaying solutions outside. To simplify the calculation, we assume first that the well is so narrow that the ground state is the only bound state (a cosine wave function), and second, that this state is only very slightly bound, so that its wave function extends far beyond the edges of the well. By making the well so narrow, we can simply replace the cosine wave function inside the well by extending the exponential wave functions back to the origin.

With these additional simplifications, the normalized ground state wave function is

$$\psi_0 = \left(\frac{2m\varepsilon}{\hbar^2}\right)^{1/4} e^{-\sqrt{2m\varepsilon}\,|x|/\hbar}$$
where ε is the binding energy. Then the integral in (3.23) becomes

$$\int_{-\infty}^{\infty} e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx = \left(\frac{2m\varepsilon}{\hbar^2}\right)^{1/4}\int e^{-\sqrt{2m}\,(\sqrt{\varepsilon}\,|x| + i\sqrt{E_f}\,x)/\hbar}\,x\,dx$$
$$= \left(\frac{2m\varepsilon}{\hbar^2}\right)^{1/4}\left[\int_{-\infty}^{0} e^{\sqrt{2m}\,(\sqrt{\varepsilon} - i\sqrt{E_f}\,)x/\hbar}\,x\,dx + \int_{0}^{\infty} e^{-\sqrt{2m}\,(\sqrt{\varepsilon} + i\sqrt{E_f}\,)x/\hbar}\,x\,dx\right].$$
Using

$$\int_{-\infty}^{0} e^{ax}\,x\,dx = \frac{\partial}{\partial a}\int_{-\infty}^{0} e^{ax}\,dx = \frac{\partial}{\partial a}\,\frac{1}{a} = -\frac{1}{a^2}$$

and

$$\int_{0}^{\infty} e^{-bx}\,x\,dx = -\frac{\partial}{\partial b}\int_{0}^{\infty} e^{-bx}\,dx = -\frac{\partial}{\partial b}\,\frac{1}{b} = \frac{1}{b^2}$$
we have

$$\int_{-\infty}^{\infty} e^{-i\sqrt{2mE_f}\,x/\hbar}\,x\,\psi_0\,dx = \left(\frac{2m\varepsilon}{\hbar^2}\right)^{1/4}\frac{\hbar^2}{2m}\left[\frac{1}{(\sqrt{\varepsilon}+i\sqrt{E_f}\,)^2} - \frac{1}{(\sqrt{\varepsilon}-i\sqrt{E_f}\,)^2}\right]$$
$$= \left(\frac{2m\varepsilon}{\hbar^2}\right)^{1/4}\frac{\hbar^2}{2m}\,\frac{(-4i)\sqrt{\varepsilon E_f}}{(\varepsilon+E_f)^2}\,.$$
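Since several sign-sensitive steps went into this closed form, a direct numerical quadrature is a useful sanity check (not part of the original notes). In units with ℏ = 1 and m = 1/2 (so √(2mE) = √E; a convenience assumption), the result reduces to ε^{1/4}(−4i)√(εE_f)/(ε + E_f)²:

```python
import numpy as np

# Check the dipole integral for psi_0 = eps^{1/4} exp(-sqrt(eps)|x|)
# (units hbar = 1, m = 1/2, so kappa = sqrt(eps) and k = sqrt(E_f)).
eps, Ef = 0.3, 5.0                       # binding and final energies (arbitrary)
kappa, k = np.sqrt(eps), np.sqrt(Ef)

x = np.linspace(-200.0, 200.0, 2_000_001)
dx = x[1] - x[0]
psi0 = eps**0.25 * np.exp(-kappa * np.abs(x))
assert abs(np.sum(np.abs(psi0)**2) * dx - 1.0) < 1e-6     # psi_0 is normalized

numeric = np.sum(np.exp(-1j * k * x) * x * psi0) * dx
closed = eps**0.25 * (-4j) * np.sqrt(eps * Ef) / (eps + Ef)**2
assert abs(numeric - closed) < 1e-5
```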
Hence equation (3.23) becomes

$$\Lambda = \frac{8e^2\mathcal{E}^2\hbar}{m}\,\frac{\varepsilon^{3/2} E_f^{1/2}}{(\varepsilon+E_f)^4}$$

where E_f = ℏω − ε, or ε + E_f = ℏω. Since our second initial assumption was essentially that ℏω ≫ ε, we can replace E_f in the numerator by ℏω, leaving us with the final result

$$\Lambda = \frac{8e^2\mathcal{E}^2\,\varepsilon^{3/2}}{m\,\hbar^{5/2}\,\omega^{7/2}}\,.$$
What this means is that if we have a collection of N particles of charge e and mass m in their ground state in a potential well with binding energy ε, and they are placed in an electromagnetic wave of frequency ω and electric vector ℰ, then the number of photoelectrons with energy ℏω − ε produced per second is NΛ.
Now that we have an idea of what the density of states means and how to use the golden rule, let us consider a somewhat more general three-dimensional problem. We will consider an atomic decay ψ_i → ψ_f, with the emission of a particle (photon, electron etc.), whose detection is far from the atom, and hence may be described by a plane wave

$$\psi(\mathbf{r},t) = \frac{1}{\sqrt{V}}\,e^{i(\mathbf{p}\cdot\mathbf{r} - E_p t)/\hbar}\,.$$

(At the end of our derivation, we will generalize to multiple particles in the final state.) Here V is the volume of a box that contains the entire system, and the factor 1/√V is necessary to normalize the wave function. If we take the box to be very large, its shape doesn't matter, so we take it to be a cube of side L. In order to determine the allowed momenta, we impose periodic boundary conditions:
$$\psi(x+L, y, z) = \psi(x, y, z)$$

and similarly for y and z. Then e^{ip_xL/ℏ} = e^{ip_yL/ℏ} = e^{ip_zL/ℏ} = 1, so that we must have

$$p_x = \frac{2\pi\hbar}{L}\,n_x\,; \qquad p_y = \frac{2\pi\hbar}{L}\,n_y\,; \qquad p_z = \frac{2\pi\hbar}{L}\,n_z$$

where each n_i = 0, ±1, ±2, ….
Our real detector will measure all incoming momenta in a range p to p + Δp, and hence we want to calculate the transition rate to all final states in this range. Thus we want

$$\Lambda = \sum \lambda_{i\to f}(\mathbf{p})$$
where λ_{i→f}(p) is given by (3.19b). Since each momentum state is described by the triple of integers (n_x, n_y, n_z), this is equivalent to the sum

$$\Lambda = \sum_{n_x,n_y,n_z} \lambda_{i\to f}(\mathbf{n}) \to \int d^3n\,\lambda_{i\to f}(\mathbf{n})$$

where we have gone over to an integral in the limit of a very large box, so that compared to L, each n_i becomes an infinitesimal dn_i. Noting that

$$d^3n = dn_x\,dn_y\,dn_z = \left(\frac{L}{2\pi\hbar}\right)^3 dp_x\,dp_y\,dp_z = \frac{V}{(2\pi\hbar)^3}\,d^3p \qquad (3.24)$$
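The counting rule (3.24) says that the number of lattice momenta p_i = 2πℏn_i/L in a momentum-space region is V/(2πℏ)³ times its volume. A direct count over a sphere |p| ≤ P (a check of my own, not from the notes; ℏ = 1 and the box size are arbitrary choices) confirms this:

```python
import numpy as np

# Count lattice momenta p_i = 2*pi*hbar*n_i/L inside the sphere |p| <= P and
# compare with V * (4*pi/3) P^3 / (2*pi*hbar)^3 from (3.24).
hbar, L, P = 1.0, 400.0, 1.0
nmax = int(np.ceil(P * L / (2 * np.pi * hbar)))      # |n_i| cannot exceed this
n = np.arange(-nmax, nmax + 1)
nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
p2 = (2*np.pi*hbar/L)**2 * (nx**2 + ny**2 + nz**2)

counted = np.count_nonzero(p2 <= P**2)
predicted = L**3 * (4*np.pi/3) * P**3 / (2*np.pi*hbar)**3
assert abs(counted / predicted - 1) < 1e-2
```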
we then have (from (3.19b))

$$\Lambda = \frac{2\pi}{\hbar}\int \frac{V}{(2\pi\hbar)^3}\,d^3p\;|M_{fi}|^2\,\delta(E_f - E_i + E) \qquad (3.25)$$

where we have assumed that the emitted particle has energy E (which is essentially the integration variable), and we changed notation slightly to |M_fi|² = |⟨ψ_f|H′(t)|ψ_i⟩|², where H′(t) = V₀(r)e^{+iωt} is of the form (3.17) with the sign of the exponent appropriate to emission.
If we let dΩ_p = d cos θ_p dφ_p be the element of solid angle about the direction defined by p, then

$$\Lambda = \frac{2\pi}{\hbar}\int d\Omega_p \int \frac{V}{(2\pi\hbar)^3}\,p^2\,dp\;|M_{fi}|^2\,\delta(E_f - E_i + E)$$
$$= \frac{2\pi}{\hbar}\int d\Omega_p \int \frac{V}{(2\pi\hbar)^3}\,dE\left(\frac{p^2\,dp}{dE}\right)|M_{fi}|^2\,\delta(E_f - E_i + E)$$
$$= \frac{2\pi}{\hbar}\int d\Omega_p\,\frac{V}{(2\pi\hbar)^3}\left[\left(\frac{p^2\,dp}{dE}\right)|M_{fi}|^2\right]_{E=E_i-E_f}. \qquad (3.26)$$
Here the integral is over Ω_p, and is to cover whatever solid angle range we wish to include. This could be just a small detector angle, or as large as 4π to include all emitted particles. The quantity in brackets is evaluated at E = E_i − E_f as required by the energy conserving delta function. And the factor of V in the numerator will be canceled by the normalization factor (1/√V)² coming from |M_fi|² and due to the outgoing plane wave particle.
From (3.24) we see that

$$\int d\Omega_p\,\frac{V}{(2\pi\hbar)^3}\left(\frac{p^2\,dp}{dE}\right) = \frac{V}{(2\pi\hbar)^3}\,\frac{d^3p}{dE} = \frac{d^3n}{dE} := \rho(E) \qquad (3.27)$$

where the density of states ρ(E) is defined as the number of states per unit of energy. Note that in the case of a photon (i.e., a massless particle) we have E = pc so that

$$\frac{p^2\,dp}{dE} = \frac{p^2}{c} = \frac{E^2}{c^3} = \frac{\hbar^2\omega^2}{c^3}$$
where we used the alternative relation E = ℏω. And in the case of a massive particle, we have E = p²/2m and

$$\frac{p^2\,dp}{dE} = p^2\,\frac{m}{p} = mp = m\sqrt{2mE}\,.$$

You should compare (3.27) using these results to (3.22). In all cases, the density of states has dimensions of 1/energy, as it should.
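Both expressions for p² dp/dE can be verified by a finite-difference derivative of the dispersion relation (a check of my own, not from the notes; the unit choices ℏ = m = c = 1 are arbitrary):

```python
import numpy as np

# Finite-difference check of p^2 dp/dE for the two dispersion relations.
m, c = 1.0, 1.0
E, dE = 3.0, 1e-6

# Massive particle: E = p^2/2m  =>  p^2 dp/dE = m*p = m*sqrt(2mE)
p = lambda E: np.sqrt(2*m*E)
lhs_massive = p(E)**2 * (p(E + dE) - p(E)) / dE
assert abs(lhs_massive - m*np.sqrt(2*m*E)) < 1e-4

# Photon: E = p*c  =>  p^2 dp/dE = p^2/c = E^2/c^3
p_ph = lambda E: E / c
lhs_photon = p_ph(E)**2 * (p_ph(E + dE) - p_ph(E)) / dE
assert abs(lhs_photon - E**2 / c**3) < 1e-4
```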
In terms of the density of states, (3.26) may be written

$$\Lambda = \frac{2\pi}{\hbar}\,\rho(E)\,|M_{fi}|^2\,\bigg|_{E=E_i-E_f}. \qquad (3.28)$$
This is the golden rule for the emission of a particle of energy E. If the final state contains several particles labeled by k, then (3.25) becomes

$$\Lambda = \frac{2\pi}{\hbar}\int_{\text{indep }p_k} \prod_k \frac{V\,d^3p_k}{(2\pi\hbar)^3}\;|M_{fi}|^2\,\delta\Big(E_f - E_i + \sum_k E_k\Big)$$

where the integral is over all independent momenta, since momentum conservation is a condition on the total momentum of the emitted particles and hence eliminates a degree of freedom. However, the product of phase space factors V d³p_k/(2πℏ)³ is over all particles in the final state. Alternatively, we may leave the integral over all momenta if we include a momentum conserving delta function in addition:

$$\Lambda = \frac{2\pi}{\hbar}\int \prod_k \frac{V\,d^3p_k}{(2\pi\hbar)^3}\;|M_{fi}|^2\,\delta\Big(E_f + \sum_k E_k - E_i\Big)\,\delta^3\Big(\mathbf{p}_f + \sum_k \mathbf{p}_k - \mathbf{p}_i\Big)\,.$$