Nuclear Physics B 603 (2001) 3–41
www.elsevier.nl/locate/npe
A large N duality via a geometric transition

F. Cachazo a , K. Intriligator b , C. Vafa a
a Jefferson Physical Laboratory, Harvard University, Cambridge, MA 02138, USA
b UCSD Physics Department, 9500 Gilman Drive, La Jolla, CA 92093, USA
Received 9 May 2001; accepted 10 May 2001
Abstract
We propose a large $N$ dual of 4d, $\mathcal{N}=1$ supersymmetric, $SU(N)$ Yang–Mills with adjoint field $\Phi$ and arbitrary superpotential $W(\Phi)$. The field theory is geometrically engineered via D-branes partially wrapped over certain cycles of a non-trivial Calabi–Yau geometry. The large $N$, or low-energy, dual arises from a geometric transition of the Calabi–Yau, where the branes have disappeared and have been replaced by suitable fluxes. This duality yields highly non-trivial exact results for the gauge theory. The predictions indeed agree with expected results in cases where it is possible to use standard techniques for analyzing the strongly coupled, supersymmetric gauge theories. Moreover, the proposed large $N$ dual provides a simpler and more unified approach for obtaining exact results for this class of supersymmetric gauge theories. © 2001 Published by Elsevier Science B.V.
1. Introduction
Partially wrapping D-branes over non-trivial cycles of non-compact geometries yields large classes of interesting gauge theories, depending on the choice of geometry. It has also been suggested in [1,2] that $N \gg 1$ D-branes, wrapped over cycles, have a dual description (in a suitable regime of parameters) involving transitions in geometry, where the D-branes have disappeared and have been replaced by fluxes. This duality can be reformulated and explained as a geometric flop in the context of M-theory propagating on $G_2$ holonomy manifolds [3,4]. In this paper, we use these ideas to propose a new class of dualities.
The simplest case, which will be the main focus of this paper, corresponds to an $\mathcal{N}=1$ supersymmetric gauge theory with adjoint chiral superfield $\Phi$ and tree-level superpotential
$$W_{\rm tree} = \sum_{p=1}^{n+1} \frac{g_p}{p}\,{\rm Tr}\,\Phi^p \equiv \sum_{p=1}^{n+1} g_p u_p, \tag{1.1}$$
where the gauge group can be either $SU(N)$ or $U(N)$, depending on whether or not we treat $g_1$ as a Lagrange multiplier imposing tracelessness of $\Phi$. For simplicity, we generally refer
to $U(N)$, with the understanding that the $SU(N)$ can be obtained by imposing the Lagrange multiplier condition. Without the superpotential (1.1), the theory would be $\mathcal{N}=2$ super-Yang–Mills. The theory with superpotential (1.1) arises [5] by wrapping $N$ type IIB D5 branes on special cycles of certain Calabi–Yau geometries; the choice of $n$ and the parameters $g_p$ are given by the geometry. Using the corresponding geometric transition, we construct a dual theory without the D-branes, but with suitable fluxes. There is also a mirror IIA description, involving D6 branes wrapping 3-cycles. The IIB description is simpler, in that there are no worldsheet instanton corrections to the superpotential. However, the IIA perspective is useful for explaining the origin of these dualities, as they are related to geometric flop transitions in M-theory on $G_2$ holonomy geometries [3].
The classical theory with superpotential (1.1) has many vacua, where the eigenvalues of $\Phi$ are various roots $a_i$ of
$$W'(x) = \sum_{p=0}^{n} g_{p+1} x^p \equiv g_{n+1} \prod_{i=1}^{n} (x - a_i). \tag{1.2}$$
In the vacuum where classically $P(x) \equiv \det(x - \Phi) = \prod_{i=1}^{n} (x - a_i)^{N_i}$, the gauge group is broken as
$$U(N) \to \prod_{i=1}^{n} U(N_i) \quad \text{with} \quad \sum_{i=1}^{n} N_i = N. \tag{1.3}$$
In the geometric construction [5], this is seen because we can wrap $N_i$ D5 branes on any of $n$ choices of $S^2 \cong \mathbf{P}^1$. Such a vacuum exists for any partition of $N = \sum_{i=1}^{n} N_i$.
Applying the proposal of [1,2] to each $S^2$, a transition occurs where we are instead left with $n$ $S^3$'s. As we discuss, the non-compact Calabi–Yau geometry is now given by the following surface in $\mathbf{C}^4$:
$$W'(x)^2 + f_{n-1}(x) + y^2 + z^2 + v^2 = 0, \tag{1.4}$$
with $W'(x)$ the degree $n$ polynomial (1.2) and $f_{n-1}(x)$ a degree $n-1$ polynomial. As for any Calabi–Yau, we can form an integral basis of 3-cycles, $A_i$ and $B_i$, which form a symplectic pairing
$$(A_i, B_j) = -(B_j, A_i) = \delta_{ij}, \qquad (A_i, A_j) = (B_i, B_j) = 0, \tag{1.5}$$
with the periods of the Calabi–Yau given by the integral of the holomorphic 3-form $\Omega$ over these cycles. In the present case (1.4), we have $i = 1, \ldots, n$, with the $A_i$ cycles compact and the $B_i$ non-compact. We denote the periods as
$$\int_{A_i} \Omega \equiv S_i, \qquad \int_{B_i}^{\Lambda_0} \Omega \equiv \Pi_i = \frac{\partial \mathcal{F}}{\partial S_i}, \tag{1.6}$$
with $\mathcal{F}(S_i)$ the prepotential. $\Lambda_0$ is a cutoff needed to regulate the divergent $B_i$ integrals; this is actually an infrared cutoff in the geometry integral, which will naturally be identified with the ultraviolet cutoff of the 4d QFT. Using (1.6), the polynomial $f_{n-1}(x)$ in (1.4) is to be solved for in terms of the $n$ $A_i$ periods $S_i$.
As in [2], the dual theory obtained after the transitions to the geometry (1.4) has a superpotential due to fluxes through the 3-cycles of (1.4):
$$-\frac{1}{2\pi i} W_{\rm eff} = \sum_{i=1}^{n} \left( N_i \Pi_i + \alpha_i S_i \right), \tag{1.7}$$
with $N_i$ 3-form $(H_R + \tau H_{NS})$ flux through $A_i$ and $\alpha_i$ 3-form flux $(H_R + \tau H_{NS})$ through $B_i$ [6,7]. If not for the superpotential (1.7), the dual theory would yield a 4d, $\mathcal{N}=2$ supersymmetric, $U(1)^n$ gauge theory, with the $S_i$ the $\mathcal{N}=1$ chiral superfields in the $\mathcal{N}=2$ $U(1)^n$ vector multiplets. In terms of this field theory, the superpotential (1.7) corresponds to breaking $\mathcal{N}=2$ to $\mathcal{N}=1$ by adding electric and magnetic Fayet–Iliopoulos terms [8]. There will be $\mathcal{N}=1$ supersymmetric vacua, with the $S_i$ massive and thus fixed to some particular $\langle S_i \rangle$, but with the $\mathcal{N}=1$ $U(1)^n$ gauge fields left massless. In the applications we consider, all $\alpha_j \sim 1/g_0^2$, the bare gauge coupling of the gauge theory; this combines in a natural way with $\Lambda_0$, replacing the cutoff with the physical scale $\Lambda$ of the gauge theory.
The duality proposal, generalizing that of [2], is that these $U(1)^n$ gauge fields coincide with those of the original theory (1.3) after the $SU(N_i)$ get a mass gap and confine. In particular, the exact quantum effective gauge couplings $\tau_{ij}(g_r, \Lambda; N_i)$ of the remaining massless $U(1)^n$ gauge fields should be given by the prepotential of the above dual, $\tau_{ij} = \partial^2 \mathcal{F} / \partial S_i \partial S_j$, evaluated at $\langle S_i \rangle$. Further, as in [2], the $S_j$ are to be identified with the $SU(N_j)$ glueball chiral superfields $S_j = -\frac{1}{32\pi^2} {\rm Tr}_{SU(N_j)} W_\alpha W^\alpha$, whose first component is the $SU(N_j)$ gaugino bilinear. Finally, we claim that the superpotential (1.7) is the exact quantum effective superpotential of the low-energy $SU(N)$ theory with superpotential (1.1), in the vacuum with the Higgsing (1.3).
Note that the $U(1)^n$ dual theory only knows about the values of the $N_i$ via the coefficients appearing in (1.7). In particular, the $\Pi_i(S_j; g_r, \Lambda)$ and $\mathcal{F}(S_i; g_r, \Lambda)$ are completely independent of the $N_i$, depending only on $\Lambda$ and the parameters $g_r$ via (1.4). Upon adding (1.7) to the dual theory, one obtains $\langle S_i \rangle$ which are complicated functions of the $N_i$, $g_r$, and $\Lambda$. Integrating out the $S_i$ gives the exact quantum 1PI effective superpotential $W_{\rm eff}(g_r, \Lambda, N_i)$ of the original theory.
The geometric transition leads to a new duality, which can be stated in purely field theory terms: the $U(N)$ theory with adjoint $\Phi$ and superpotential (1.1) is dual to a $U(1)^n$ theory and superpotential (1.7). This duality is reminiscent of that of [9].
The above duality makes some highly non-trivial predictions for the exact $U(1)^n$ gauge couplings $\tau_{ij}(g_r, \Lambda)$ and the exact effective superpotential $W_{\rm eff}(g_r, \Lambda)$. This allows us to check the duality, by comparing with the exact results which can (at least in principle) be obtained for these quantities via a direct field theory analysis. The above quantities can be exactly obtained (again, at least in principle) by viewing the $\mathcal{N}=1$ $U(N)$ theory with adjoint $\Phi$ and superpotential (1.1) as a deformation of $\mathcal{N}=2$, and using the known exact results for $\mathcal{N}=2$ field theories. We find perfect agreement between these results, which is a highly non-trivial check of our proposed duality.
The organization of this paper is as follows. In Section 2 we review the large $N$ duality of [2] for $\mathcal{N}=1$ Yang–Mills theory, and briefly discuss the extension to include massive flavors in the fundamental of $U(N)$. In Section 3, we discuss how to geometrically engineer the general $\mathcal{N}=1$ theory with adjoint and superpotential (1.1). In Section 4 we propose the large $N$ dual of these theories via the transition in the CY geometry where $S^2$'s are blown down, $S^3$'s are blown up, and the branes have been replaced with fluxes. In Section 5 we analyze the $U(N)$ theory with adjoint and superpotential using standard supersymmetric field theory tools. In Section 6 we specialize these results to the case of the cubic superpotential. In Section 7 we analyze the proposed large $N$ duals and show how the leading order computation of gauge theory based on gaugino condensate follows from monodromies of the geometry. In Section 8 we specialize to the cubic superpotential and compute exact results for the quantum corrected superpotential using the proposed dual. We find perfect agreement with the results based on a direct gauge theory analysis. In Appendix A we present the details of the analysis for one of the field theory examples, and in Appendix B we discuss the series expansion for computing the periods for the case of cubic superpotential.
2. Review of the large $N$ duality for $\mathcal{N}=1$ Yang–Mills

Consider type IIA string theory on a non-compact Calabi–Yau threefold of $T^*S^3$, i.e., the conifold, with defining equation given by
$$x^2 + y^2 + z^2 + v^2 = \mu,$$
and consider wrapping $N$ D6 branes on the $S^3$, with the unwrapped dimensions filling the Minkowski spacetime. This gives rise to a 4d $\mathcal{N}=1$ $U(N)$ pure Yang–Mills theory. The duality proposed in [2], which was motivated by embedding the large $N$ topological string duality of [1] into superstrings, states that in the large $N$ limit this theory is equivalent to type IIA strings propagating on the blow up of the conifold. This is a geometry involving a rigid sphere $\mathbf{P}^1$, where the normal bundle to the $\mathbf{P}^1$ in the CY is given by a $\mathcal{O}(-1) + \mathcal{O}(-1)$ bundle over it (i.e., two copies of the spinor bundle over the sphere). The branes have disappeared and have been replaced by an RR flux through $\mathbf{P}^1$ and an NS flux on the dual four cycle [2]. This duality has been embedded into M-theory, where it admits a purely geometric interpretation [3,4]. The $SU(N)$ gauge theory decouples from the bulk in the limit where the size $S$ of the blowup sphere $\mathbf{P}^1$ is small. The size $S$ is fixed in terms of the units of flux, and the appropriate decoupling limit is large $N$. $S$ gets identified [2] with the glueball superfield $S = -\frac{1}{32\pi^2} {\rm Tr}\, W_\alpha W^\alpha$ of the $SU(N)$ theory, so its expectation value corresponds to gaugino condensation in the $SU(N)$ theory.
As noted in [2] one can also consider the mirror description of this geometry, which is simpler to work with (as the worldsheet instanton corrections to spacetime superpotential are absent). This corresponds to switching from IIA to IIB theory and reversing the arrow of transition: the original $U(N)$ theory is obtained from type IIB D5 branes wrapped around the $\mathbf{P}^1$ in the blown up conifold geometry and, in the large $N$ limit, this is equivalent to type IIB on the deformed conifold background:
$$f = x^2 + y^2 + z^2 + v^2 - \mu = 0.$$
The deformation parameter $\mu$ will, again, be identified with the $SU(N)$ glueball superfield. Rather than the $N$ original D5 branes, there are now $N$ units of RR flux through $S^3$, and also some NS flux through the non-compact cycle dual to $S^3$. This mirror description is related to a particular limit of the large $N$ duality proposed in [10] and [11].
The value of the modulus $\mu$ is fixed [2] by the fluxes, and this is captured by a superpotential for $S$, whose first component is proportional to $\mu$. Specializing (1.5) and (1.6) to the conifold, we have a single compact 3-cycle $A \cong S^3$, and a single dual, non-compact 3-cycle $B$. The $A$ period of the holomorphic 3-form is $S$. There are $N$ units of RR flux through $A$, and the NS flux $\alpha$ through $B$; $\alpha$ is identified with the bare coupling of the 4d $U(N)$ gauge theory.
The holomorphic three-form $\Omega$ is given by
$$\Omega = \frac{dx\, dy\, dz\, dv}{df} \sim \frac{dx\, dy\, dz}{v} = \frac{dx\, dy\, dz}{\sqrt{\mu - x^2 - y^2 - z^2}}.$$
The 3-cycles $A$ and $B$ can be viewed as 2-spheres spanned by a real subspace of $y, z$ fibered over $x$, as in [12–14], and integrating $\Omega$ over the fiber $y, z$ yields a one-form in the $x$-plane:
$$\int_{S^2} \Omega = dx\, \sqrt{x^2 - \mu}.$$
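The $A$-cycle, projected to the $x$-plane, becomes an interval between $x = \pm\sqrt{\mu}$. Thus the $A$-period is given by
$$S = \int_A \Omega = \frac{1}{2\pi i} \int_{-\sqrt{\mu}}^{\sqrt{\mu}} dx\, \sqrt{x^2 - \mu} = \frac{\mu}{4}.$$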
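As a quick numerical sanity check, the sketch below evaluates the $A$-period, reading it as $S = \frac{1}{2\pi i}\int_{-\sqrt{\mu}}^{\sqrt{\mu}} dx\,\sqrt{x^2-\mu}$ for real $\mu > 0$. It assumes the branch $\sqrt{x^2-\mu} = i\sqrt{\mu-x^2}$ on the interval (the opposite branch flips the overall sign); the function name `a_period` and the midpoint quadrature are illustrative choices, not part of the original analysis.

```python
import math

def a_period(mu, steps=2000):
    """Evaluate (1/(2*pi*i)) * integral of sqrt(x^2 - mu) over [-sqrt(mu), sqrt(mu)].

    Assumes real mu > 0 and the branch sqrt(x^2 - mu) = i*sqrt(mu - x^2).
    """
    # Substitute x = sqrt(mu)*sin(theta): sqrt(mu - x^2) dx = mu*cos(theta)^2 dtheta.
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2 + (k + 0.5) * h  # midpoint rule
        total += mu * math.cos(theta) ** 2 * h
    # total is the real integral of sqrt(mu - x^2); the factor i from the branch
    # choice cancels the i in 1/(2*pi*i).
    return total / (2 * math.pi)

if __name__ == "__main__":
    for mu in (1.0, 2.0, 5.0):
        print(mu, a_period(mu), mu / 4)
```

Under these assumptions the two printed columns should agree with $\mu/4$ to within the quadrature error.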
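The $B$-period can be viewed as an integral from $x = \sqrt{\mu}$ to infinity. However this integral is divergent, and thus must be cut off to regulate the infinity. Giving $S$ dimension 3, $x$ has dimension 3/2, so we put the cutoff at $x = \Lambda_0^{3/2}$ where $\Lambda_0$ has mass dimension 1:
$$\Pi = \int_B \Omega = \frac{1}{2\pi i} \int_{\sqrt{\mu}}^{\Lambda_0^{3/2}} dx\, \sqrt{x^2 - \mu} = \frac{1}{2\pi i} \left[ \frac{1}{2} \Lambda_0^3 - 3S \log \Lambda_0 - S(1 - \log S) \right] + O(1/\Lambda_0).$$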
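The large-cutoff expansion of the $B$-period can be checked against the elementary antiderivative of $\sqrt{x^2-\mu}$. The sketch below compares the closed form of $\int_{\sqrt{\mu}}^{\Lambda_0^{3/2}} dx\,\sqrt{x^2-\mu}$ with the bracket $\frac{1}{2}\Lambda_0^3 - 3S\log\Lambda_0 - S(1-\log S)$ at $S = \mu/4$, for real $\mu > 0$. The overall factor $1/2\pi i$ is dropped on both sides, and the function names are hypothetical.

```python
import math

def b_integral_exact(mu, lam0):
    """Closed form of the integral of sqrt(x^2 - mu) from sqrt(mu) to lam0**1.5."""
    a = math.sqrt(mu)
    upper = lam0 ** 1.5
    root = math.sqrt(upper * upper - mu)
    # Antiderivative: (1/2) * [x*sqrt(x^2 - mu) - mu*log(x + sqrt(x^2 - mu))]
    return 0.5 * (upper * root - mu * math.log((upper + root) / a))

def b_integral_asymptotic(mu, lam0):
    """Leading large-cutoff behaviour, written in terms of S = mu/4."""
    s = mu / 4.0
    return 0.5 * lam0 ** 3 - 3.0 * s * math.log(lam0) - s * (1.0 - math.log(s))

if __name__ == "__main__":
    for lam0 in (10.0, 100.0, 1000.0):
        diff = b_integral_exact(1.0, lam0) - b_integral_asymptotic(1.0, lam0)
        print(lam0, diff)
```

The printed difference shrinks as the cutoff grows, consistent with the remainder vanishing as $\Lambda_0 \to \infty$.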
Note that, under $\Lambda_0^3 \to e^{2\pi i} \Lambda_0^3$, $\Pi \to \Pi - S$, shifting the $B$ period by an $A$ period. Using the fact that we have $N$ units of RR flux through $S^3$ and $\alpha$ units of NS flux through the $B$-cycle, we find the superpotential [2]:
$$W_{\rm eff} = N \left[ 3S \log \Lambda_0 + S(1 - \log S) \right] - 2\pi i\, \alpha S.$$
$\alpha$ is related to the bare coupling constant of the $SU(N)$ gauge theory by $2\pi i \alpha = 8\pi^2/g_0^2$. The coefficient of $S$ in the above superpotential is given by
$$S \left( 3N \log \Lambda_0 - 2\pi i \alpha \right),$$
which is the geometric analog of the running of the coupling. $\alpha$ depends on $\Lambda_0$ in such a way that the above quantity is finite as $\Lambda_0 \to \infty$:
$$\frac{8\pi^2}{g^2(\Lambda_0)} = {\rm const} + 3N \log \Lambda_0,$$
which is exactly the expected running of the coupling constant for the 4d $\mathcal{N}=1$ $U(N)$ Yang–Mills theory. The upshot is to replace the cutoff $\Lambda_0$ in the above expression with the scale of the gauge theory, which we will denote by $\Lambda$. We thus have for the superpotential
$$W_{\rm eff} = S \log \left[ \Lambda^{3N} / S^N \right] + NS$$
(the linear term $NS$ is a matter of convention and defines what one means by the physical scale $\Lambda$). This is indeed the superpotential of [15] for the massive glueball $S$. Integrating out $S$ via $dW_{\rm eff}/dS = 0$ leads to the $N$ supersymmetric vacua of $SU(N)$ $\mathcal{N}=1$ supersymmetric Yang–Mills:
$$\langle S \rangle = e^{2\pi i k/N} \Lambda^3, \qquad k = 1, \ldots, N.$$
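The extremization can be verified directly. Reading the superpotential as $W_{\rm eff} = S\log(\Lambda^{3N}/S^N) + NS$, its derivative is $dW_{\rm eff}/dS = \log(\Lambda^{3N}/S^N)$, which vanishes when $S^N = \Lambda^{3N}$. The following sketch (hypothetical function names, real $\Lambda > 0$ assumed) checks this at each of the $N$ candidate vacua.

```python
import cmath
import math

def dW_dS(S, N, lam):
    """dW_eff/dS for W_eff = S*log(lam**(3N)/S**N) + N*S.

    The -N from differentiating the logarithm cancels the +N from the linear term.
    """
    return cmath.log(lam ** (3 * N) / S ** N)

def vacua(N, lam):
    """Candidate vacua <S> = exp(2*pi*i*k/N) * lam**3 for k = 1..N."""
    return [cmath.exp(2j * math.pi * k / N) * lam ** 3 for k in range(1, N + 1)]

if __name__ == "__main__":
    N, lam = 4, 1.3
    for S in vacua(N, lam):
        print(S, abs(dW_dS(S, N, lam)))
```

Each printed residual should be at the level of floating-point rounding.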
2.1. Gauge-theoretic reformulation of the duality

We can formulate the above large $N$ duality in purely gauge theoretic terms. The conifold geometry without the fluxes corresponds to an $\mathcal{N}=2$ $U(1)$ gauge theory with a charged hypermultiplet [16]. Turning on fluxes is equivalent to adding electric and magnetic Fayet–Iliopoulos superpotential terms, which softly break $\mathcal{N}=2$ to $\mathcal{N}=1$. The $\mathcal{N}=2$ vector multiplet consists of a neutral $\mathcal{N}=1$ chiral superfield $S$ and an $\mathcal{N}=1$ photon. The $\mathcal{N}=1$ $U(1)$ photon is left massless, and is to be identified with the overall $U(1) \subset U(N)$ of the original $\mathcal{N}=1$ theory. The $\mathcal{N}=1$ chiral superfield $S$ gets a mass, and is to be identified with the massive glueball chiral superfield $S$ of the $SU(N)$ theory.
The identification of the $U(1)$ of the dual theory with the $U(1) \subset U(N)$ is consistent with the fact that minimization of the superpotential gives rise to
$$N \frac{\partial \Pi}{\partial S} + \alpha = N\tau + \alpha = 0,$$
where we used the special geometry to connect the periods of the $B$-cycles with the coupling constant $\tau$ of the $U(1)$. Note that the coupling of the $U(1)$ theory is $-\alpha/N$ as it should be where $-\alpha$ is the bare coupling of the $U(N)$ theory and $U(1)$ is identified with $1/N$ times the identity matrix in $U(N)$ adjoint. In fact the charged hypermultiplet of the $U(1)$ is nothing but the baryon field of the original $U(N)$ theory. To see this note that before turning on the RR flux on $S^3$, wrapping a D3 brane around it gives a charged hypermultiplet. Turning on the RR flux induces $N$ units of fundamental charge on it, as noted in the context of AdS/CFT correspondence in [17–19]. After turning on the flux the field is not allowed by itself, i.e., it is attached to $N$ fundamental strings going off to infinity. Thus after the FI deformations of the superpotential it is slightly misleading to think of the $U(1)$ theory as having a fundamental hypermultiplet. In that context one can simply view this as an effective $U(1)$ theory with the SW $\mathcal{N}=2$ geometry as would have been the case with a fundamental hypermultiplet.
2.2. Adding massive fields

As discussed in [2], we can also consider adding some quark chiral superfields, in the fundamental representation of $SU(N)$. In the type IIB description this is done by taking a D5 brane wrapping a holomorphic 2-cycle not intersecting the $\mathbf{P}^1$, but separated by a distance $\xi$, where $\xi$ is proportional to the mass of the hypermultiplet, as the matter comes from strings stretching between the non-compact brane and the $N$ branes wrapped on $\mathbf{P}^1$. If $(\zeta_1, \zeta_2)$ denote the $\mathcal{O}(-1) + \mathcal{O}(-1)$ bundle over $\mathbf{P}^1$, the 2-cycle is the curve $(\zeta_1, \zeta_2) = (\xi, 0)$ over a point on $\mathbf{P}^1$. Passing this through the conifold transition, which in these coordinates is given by
$$\zeta_1 a - \zeta_2 b = \mu,$$
and rewriting it by a change of variables in the form
$$F(x, y) = x^2 + y^2 - \mu = \zeta_2 b,$$
we have a D5 brane wrapping a 2-cycle given by $\zeta_2 = 0$ and $x = \xi$. Since here $x$ has dimension 3/2, and $\xi$ should be proportional to the mass $m_0$, we identify $\xi = m_0 \Lambda_0^{1/2}$. As discussed in [20] such a D-brane gives rise to an additional spacetime superpotential

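$$\Delta W_{\rm eff} = \frac{1}{2} \int_{m_0 \Lambda_0^{1/2}}^{\Lambda_0^{3/2}} dx\, \sqrt{x^2 - \mu} = S \log(m_0/\Lambda_0) + O(1/\Lambda_0).$$
This gives the running of the mass parameter with the cutoff $\Lambda_0$. We define the renormalized mass by $m/\Lambda = m_0/\Lambda_0$. Generalizing to any number of matter fields in the fundamental representation, with mass matrix $m$, we find (see Fig. 1)
$$W_{\rm eff} = S \log \left[ \Lambda^{3N} / S^N \right] + NS + S\, {\rm Tr} \log [m/\Lambda] = S \left[ \log \left( \frac{\Lambda^{3N - N_f} \det m}{S^N} \right) + N \right].$$
Integrating out $S$ via $dW_{\rm eff}/dS = 0$ yields the correct field theory result:
$$S^N = \Lambda^{3N - N_f} \det m.$$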
3. Geometric engineering $\mathcal{N}=1$ theories with adjoint and superpotential $W_{\rm tree}(\Phi)$

The $\mathcal{N}=1$ $SU(N)$ Yang–Mills theory of the previous section can be regarded as a special case of the more general theory with adjoint $\Phi$ and superpotential as in (1.1),
$$W_{\rm tree}(\Phi) = \sum_{p=1}^{n+1} \frac{g_p}{p}\, {\rm Tr}\, \Phi^p. \tag{3.1}$$
For $n = 1$, the adjoint gets a mass $m = g_2$ and we recover the case reviewed in the previous section. We here review the geometric construction of [5] for general $n$.
Fig. 1. Location of the branch cut in the x-plane. Contours of integration of the different periods of
the geometry including those coming from massive fields.
For $W_{\rm tree}(\Phi) = 0$, the 4d field theory would be pure $\mathcal{N}=2$ Yang–Mills system. To geometrically engineer that, all we need is a $\mathbf{P}^1$ in a Calabi–Yau manifold for which the normal bundle is $\mathcal{O}(-2) + \mathcal{O}(0)$ (i.e., it has the same normal geometry as if the $\mathbf{P}^1$ were in a K3). If we wrap $N$ D5 branes around the $\mathbf{P}^1$ we obtain an $\mathcal{N}=2$ $U(N)$ gauge theory in the uncompactified worldvolume of the D5 brane. The adjoint scalar $\Phi$ gets identified with the deformations of the brane in the $\mathcal{O}(0)$ direction, normal to the $\mathbf{P}^1$.
To describe the geometry in more detail, let $z$ denote the coordinate in the north patch of $\mathbf{P}^1$ and $z' = 1/z$ in the south patch. Let $x, x'$ denote the coordinate of $\mathcal{O}(0)$ direction in the north and south patches respectively, and let $u, u'$ denote the coordinates of $\mathcal{O}(-2)$ in the north and south patches, respectively. Then we have
$$z' = 1/z, \qquad x' = x, \qquad u' = u z^2. \tag{3.2}$$
There is a continuous family of $\mathbf{P}^1$'s, labeled by arbitrary $x$, at $u = 0 = u'$. Each of the $N$ D5 branes can wrap a $\mathbf{P}^1$ at any value of $x$. In the $\mathcal{N}=2$ gauge theory living in the unwrapped directions, this freedom to choose any $x$ for each brane corresponds to moving along the Coulomb branch, with the $a_i$ of each brane corresponding to an eigenvalue of the adjoint field $\Phi$.
This connection between $x$ and the Coulomb branch moduli makes it clear how the geometry must be deformed to obtain the $\mathcal{N}=1$ theory with superpotential (3.1). Rather than having the $\mathbf{P}^1$, with coordinate $z$ and $z'$ at the point $u = u' = 0$, for arbitrary $x$, it should exist only for particular values of $x$, namely the values $x = a_i$ where $W'(x) \equiv g_{n+1} \prod_{i=1}^{n} (x - a_i) = 0$. This is the case if (3.2) is deformed to
$$z' = 1/z, \qquad x' = x, \qquad u' = u z^2 + W'(x) z, \tag{3.3}$$
which is indeed only compatible with $u = u' = 0$ at the $n$ choices of $x = a_i$ where $W'(x) = 0$. Note that now we can distribute the $N$ D5-branes among the vacua $a_i$, i.e., $N_i$ branes wrapping the corresponding $S^2$ at $x = a_i$. This gives a geometric realization of the breaking of $U(N) \to \prod_i U(N_i)$.
4. Large N duality proposal

We now obtain the large $N$ dual of the $U(N)$ theory with adjoint and superpotential $W_{\rm tree}(\Phi)$ by considering the geometric transition where each of the $n$ $\mathbf{P}^1$'s has shrunk and has been replaced by a finite size $S^3$. As already mentioned, the sizes of the $n$ $S^3$'s will correspond to the non-zero gaugino condensation expectation values in the $n$ factors of $\mathcal{N}=1$ non-Abelian gauge groups in (1.3). The needed blow-down of the $n$ $\mathbf{P}^1$'s of the geometry of the previous section has been discussed in [21] and we will review it here.
We start with the defining equation (3.3). Its blowdown can be obtained by the change of variables as follows: define $x_1 \equiv x$, $x_2 \equiv u'$, $x_3 \equiv z' u'$, $x_4 \equiv u$; using (3.3), these satisfy
$$x_2 x_4 - x_3^2 + x_3 W'(x_1) = 0.$$
By completing the square involving $x_3$ and $W'$ and redefining the variables slightly we obtain the equation
$$W'(x)^2 + y^2 + z^2 + v^2 = 0. \tag{4.1}$$
This geometry is singular, even for a generic $W(x)$; near each critical point of $W(x)$ it has the standard conifold singularity. The large $N$ dual follows from desingularizing the geometry (4.1), allowing the $n$ $S^3$'s to have finite size, rather than zero size as in (4.1).
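The blowdown relation can be checked numerically. The sketch below reads the gluing rule (3.3) as $z' = 1/z$, $u' = uz^2 + W'(x)z$, and the blowdown relation as $x_2 x_4 - x_3^2 + x_3 W'(x_1) = 0$ with $x_1 = x$, $x_2 = u'$, $x_3 = z'u'$, $x_4 = u$. The function names and the sample polynomial are illustrative assumptions.

```python
def w_prime(x, g_top, roots):
    """W'(x) = g_top * prod_i (x - a_i), as in (1.2)."""
    out = g_top
    for a in roots:
        out *= (x - a)
    return out

def blowdown_residual(x, z, u, g_top, roots):
    """Left-hand side of x2*x4 - x3^2 + x3*W'(x1), which should vanish."""
    wp = w_prime(x, g_top, roots)
    u_south = u * z ** 2 + wp * z   # u' from the deformed gluing rule (3.3)
    z_south = 1 / z                 # z' = 1/z
    x1, x2, x3, x4 = x, u_south, z_south * u_south, u
    return x2 * x4 - x3 ** 2 + x3 * w_prime(x1, g_top, roots)

if __name__ == "__main__":
    # Arbitrary complex sample point and a degree-2 W' (cubic W).
    res = blowdown_residual(0.3 + 0.2j, 1.1 - 0.4j, -0.7 + 0.5j, 2.0, [0.0, -1.5])
    print(abs(res))
```

Any point with $z \neq 0$ should give a residual at rounding level, since the relation holds identically in $x$, $z$, $u$.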
4.1. Desingularization of the geometry
Consider the most general desingularization of (4.1), subject to the restriction of [13] that the deformation be a normalizable mode. For the case at hand, as $W'^2$ is a polynomial of degree $2n$, the most general desingularization of (4.1) subject to the normalizability restriction is to add a polynomial $f_{n-1}(x)$ of degree $n-1$ in $x$ [14], giving the geometry
$$W'(x)^2 + f_{n-1}(x) + y^2 + z^2 + v^2 = 0. \tag{4.2}$$
Under this deformation, each of the $n$ critical points $x = a_i$ (1.2) (where $W' = 0$) splits into two, which we denote as $a_i^+$ and $a_i^-$.
As in the case of the conifold, the period integrals of the holomorphic three-form over the $A_i$ and $B_i$ cycles can be written as integrals of an effective one-form $\omega$ over projections of the cycles to the $x$ plane. As in the conifold case, the non-trivial 3-cycles have simple projections to the $x$ plane. The one-form is given by doing the integral of $\Omega$ over the fiber $S^2$ cycles (corresponding to the $y, z, v$ coordinates on the surface (3.3)); this gives
$$\omega = dx\, \sqrt{W'^2(x) + f_{n-1}(x)}. \tag{4.3}$$
Fig. 2. Geometry before and after introducing the deformation $f_{n-1}(x)$. The choice of branch cuts and integration contours for the different periods is also shown. Dashed lines are paths on the lower sheet.
Therefore, the periods of the holomorphic three-form over the $n$ 3-cycles $A_i$ of (4.2), which are compact 3-spheres, are given by (see Fig. 2)
$$S_i = \pm \frac{1}{2\pi i} \int_{a_i^-}^{a_i^+} \omega, \tag{4.4}$$
where the sign depends on the orientation; the periods over the dual $B_i$ cycles are
$$\Pi_i = \frac{1}{2\pi i} \int_{a_i^+}^{\Lambda_0} \omega. \tag{4.5}$$
The map between the $n$ coefficients in $f_{n-1}(x)$ and the $S_i$ can thus be obtained by direct computation, and $f_{n-1}(x)$ can then be solved for as particular functions $f_{n-1}(x; S_i)$. As we already mentioned, the $n$ values of $S_i$ are mapped under the duality to the $n$ glueball fields $S_i = -\frac{1}{32\pi^2} {\rm Tr}_{SU(N_i)} W_\alpha W^\alpha$ for the non-Abelian factors in (1.3). (The $S_i$ can be defined in a gauge invariant way.) Just as with the case of pure $\mathcal{N}=1$ $U(N)$ Yang–Mills, the $S_i$ of the dual theory will become massive and obtain particular expectation values thanks to a superpotential $W_{\rm eff}$, with the expectation values $\langle S_i \rangle$ determined from finding the critical points of $W_{\rm eff}$. The dual superpotential $W_{\rm eff}$ arises from the nonzero fluxes left after the transition.
Rather than having D-branes, as present before the transition, the above deformed geometry will have $N_i$ units of $H_R$ flux through the $i$th $S^3$ cycle $A_i$. In addition, there is an $H_{NS}$ flux $\alpha$ through each of the dual non-compact $B_i$ cycles, with $2\pi i \alpha = 8\pi^2/g_0^2$ given in terms of the bare coupling constant $g_0$ of the original 4d $U(N)$ field theory. We thus have the superpotential, given in terms of the $A_i$ and $B_i$ periods (1.6) as
$$-\frac{1}{2\pi i} W_{\rm eff} = \sum_{i=1}^{n} N_i \Pi_i + \alpha \sum_{i=1}^{n} S_i. \tag{4.6}$$
This $W_{\rm eff}$ depends on the coefficients $g_r$ of the classical superpotential (1.1) of the original $U(N)$ theory with adjoint by way of the geometry (4.2). $W_{\rm eff}$ is a function of the $n$ $S_i$, or equivalently the $n$ unknown parameters in $f_{n-1}(x)$. The supersymmetric vacua have fixed $\langle S_i \rangle$, obtained by solving
$$\frac{\partial W_{\rm eff}}{\partial S_i} = 0, \qquad i = 1, \ldots, n. \tag{4.7}$$
These $\langle S_i \rangle$ will depend on the $N_i$, the parameters $g_r$ entering in the original $W_{\rm tree}(\Phi)$ and thus on the geometry (4.2), and $\Lambda_0$, the $B_i$ integral infrared cutoff.
In the classical limit, where we set the $S_i$ to zero, and thus $f_{n-1}(x) = 0$, the period of the one-form (4.3) gives
$$\Pi_i = \frac{1}{2\pi i} \int_{a_i}^{\Lambda_0} dx\, W'(x) = \frac{1}{2\pi i} \left( W(\Lambda_0) - W(a_i) \right). \tag{4.8}$$
Then the dual superpotential is $W_{\rm eff} = \sum_i N_i W(a_i)$ (ignoring the irrelevant constant $W(\Lambda_0)$). This indeed matches with the classical superpotential of the original $U(N)$ theory, given by simply evaluating the superpotential (1.1) in the vacuum with breaking (1.3), where $N_i$ eigenvalues of the field $\Phi$ take the value $a_i$.
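To make the classical matching concrete, the following sketch evaluates $\sum_i N_i W(a_i)$ with $W(x) = \sum_p g_p x^p / p$ for a cubic superpotential. The specific couplings, the partition $(N_1, N_2) = (1, 2)$, and the function names are assumptions chosen for illustration. For $W(x) = g_2 x^2/2 + g_3 x^3/3$ the critical points are $a_1 = 0$ and $a_2 = -g_2/g_3$, with $W(a_2) = g_2^3 / (6 g_3^2)$.

```python
def tree_w(x, g):
    """W(x) = sum_p g_p * x**p / p, with g given as a dict {p: g_p}."""
    return sum(gp * x ** p / p for p, gp in g.items())

def w_classical(n_list, roots, g):
    """Classical superpotential: N_i eigenvalues of Phi sit at each root a_i."""
    return sum(n_i * tree_w(a_i, g) for n_i, a_i in zip(n_list, roots))

if __name__ == "__main__":
    g = {2: 1.0, 3: 2.0}                 # hypothetical couplings g_2, g_3
    roots = [0.0, -g[2] / g[3]]          # roots of W'(x) = g_2*x + g_3*x**2
    print(w_classical([1, 2], roots, g))
    print(2 * g[2] ** 3 / (6 * g[3] ** 2))  # closed form for N_2 = 2
```

The two printed values coincide, since only the eigenvalues at $a_2$ contribute in this example.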
4.2. Aspects of the $U(1)^n$ gauge fields
The dual theory obtained after the transition is an $\mathcal{N}=2$ $U(1)^n$ gauge theory, broken to $\mathcal{N}=1$ $U(1)^n$ by the superpotential $W_{\rm eff}$ (4.6). The $S_i$, which are in the same $\mathcal{N}=2$ multiplet as the $U(1)^n$, get masses and are frozen to particular $\langle S_i \rangle$ by $W_{\rm eff}$. On the other hand, the $\mathcal{N}=1$ $U(1)^n$ gauge fields remain massless. The couplings $\tau_{ij}$ of these $U(1)$'s can be determined from $\Pi_i(S)$ or the $\mathcal{N}=2$ prepotential $\mathcal{F}(S_i)$, with $\Pi_i = \partial \mathcal{F} / \partial S_i$, of the geometry under consideration:
$$\tau_{ij} = \frac{\partial \Pi_i}{\partial S_j} = \frac{\partial^2 \mathcal{F}(S_i)}{\partial S_i \partial S_j}. \tag{4.9}$$
The couplings (4.9) should be evaluated at the $\langle S_i \rangle$ obtained from (4.7). Note that (4.7) and (4.9) imply
$$\sum_i N_i \tau_{ij} + \alpha = 0. \tag{4.10}$$
We identify $F_i$, the $i$th block $U(1)$ field strength, with the generator in $U(N)$ which is $1/N_i$ times the identity matrix in the $i$th block and zero elsewhere. In this way the $F_i - F_j$ correspond to field strengths of the $U(1)^{n-1}$'s coming from the $SU(N)$ and the $\sum_i N_i F_i$ will correspond to the overall $U(1)$. Thus the above equation is consistent with the fact that the overall $U(1)$ is a linear combination of the $U(1)^n$'s with coefficients given by $N_i$, together with the fact that the bare coupling constant of the overall $U(1)$ should be the same as that of the original $U(N)$ theory, as the $U(1)$ is decoupled. Moreover, it is consistent with the fact that there is no coupling between the field strength of this overall $U(1)$ with the other $U(1)^{n-1}$. Thus extremizing the superpotential is equivalent to this structure for the gauge coupling constants of the $U(1)$ factors.
One can also relate the coupling constants of the $U(1)$ factors to the period matrix of the hyperelliptic curve
$$y^2 = W'(x)^2 + f_{n-1}(x).$$
To see that, from (4.9) we will have to compute the period integrals of $\partial\omega / \partial(S_i - S_j)$ about the cycles of the hyperelliptic curve, where $\omega = y\, dx$. As we will discuss in Section 7 the coefficient of $x^{n-1}$ of $f_{n-1}(x)$ is proportional to the sum of $S_i$'s and thus considering $\partial\omega / \partial(S_i - S_j)$ gives rise to a linear combination of
$$\frac{x^{n-r}\, dx}{y}$$
with $2 \le r \le n$, a basis of the $n-1$ holomorphic one-forms on the hyperelliptic curve. Thus $\tau_{ij}$ can be identified with the period matrix of the hyperelliptic curve.
4.3. Gauge theoretic reformulation
Just as in the case of $n = 1$ we can reformulate this duality in terms of a duality of two gauge systems. We start with $\mathcal{N}=2$ pure Yang–Mills theory for gauge group $U(N)$ and deform it by the superpotential $W_{\rm tree}(\Phi)$ of degree $n+1$ in the scalar field, breaking the $U(N)$ into $n$ factors $U(N_i)$. The $SU(N_i)$ gaugino bilinear together with the $U(1) \subset U(N_i)$ forms an $\mathcal{N}=2$ multiplet. One considers a dual $\mathcal{N}=2$ multiplet containing $U(1)^n$ softly broken to $\mathcal{N}=1$ by a superpotential term. Note that the $\mathcal{N}=2$ we have proposed is of the form that appears in an $\mathcal{N}=2$ theory with a $U(n)$ gauge group with some matter fields (whose structure is dictated by the superpotential). In fact the dual $\mathcal{N}=2$ system we have been considering is of the type studied in [22] and was connected to a type IIB description considered here in [14]. In such a formulation the decoupling of the overall $U(1)$ from the other $U(1)$'s occurs as in (4.10), consistent with the minimization of the superpotential.
5. Field theory analysis

We now analyze the strong coupling dynamics of the $U(N)$ theory, with adjoint $\Phi$ and superpotential (1.1), in the vacuum with the classical breaking (1.3). In the quantum theory, each $\mathcal{N}=1$ super Yang–Mills $SU(N_i)$ in (1.3) generally confines, with $N_i$ supersymmetric vacua. The $N_i$ vacua correspond to $N_i$th roots of unity phases of the gaugino condensate $\langle S_i \rangle \neq 0$, with $S_i = -\frac{1}{32\pi^2} {\rm Tr}\, W_\alpha W^\alpha$ the $SU(N_i)$ glueball chiral superfield. The $U(1)^n$ in (1.3) are free, and therefore remain unconfined and present in the low energy theory.
The vacua can also have more interesting behavior. For example, in $SU(3)$ with a cubic superpotential for $\Phi$ but no quadratic mass term, the vacuum is at the non-trivial conformal field theory point of [22].
The low energy theory contains an effective superpotential $W_{\rm eff}(g_p, \Lambda)$ which gives the chiral superfield expectation values via [23]
$$\frac{\partial W_{\rm eff}(g_p, \Lambda)}{\partial g_p} = \langle u_p \rangle, \qquad \frac{\partial W_{\rm eff}(g_p, \Lambda)}{\partial \log \Lambda^{2N}} = \langle S \rangle \equiv \sum_{i=1}^{n} \langle S_i \rangle. \tag{5.1}$$
$W_{\rm eff}$ can often be obtained exactly, thanks to its holomorphic dependence on $g_p$ and $\Lambda$ [24]. In the present case, we'll discuss how $W_{\rm eff}$ can indeed, in principle, be obtained exactly via the $\mathcal{N}=2$ curves [25–27]; in practice, however, the result is quite difficult to obtain.
5.1. Approximate $W_{\rm eff}$ via naive integrating in
The effective superpotential can often be obtained exactly via starting from the low-energy effective theory and integrating in the massive matter fields [23,28]. As discussed in [28], for this procedure to give an exact answer, one must be able to argue that the scale matching relations are known exactly and that a possible additional unknown contribution $W_\Delta$ to the superpotential necessarily vanishes. Our $\mathcal{N}=1$ theory with adjoint $\Phi$ and superpotential (1.1) does not admit the kind of symmetry and limits arguments needed to prove the naive scale matching relations and $W_\Delta = 0$ as exact statements. So naive integrating in need not give the exact answer for $W_{\rm eff}$; nevertheless, it is still useful here for obtaining an approximate answer.
To illustrate how naive integrating in can fail to give the exact answer in the theory with adjoint $\Phi$, consider the vacuum where classically $\langle \Phi \rangle = 0$, leaving $SU(N)$ unbroken. Such a vacuum exists for any tree level superpotential (1.1). The mass of $\Phi$ in this vacuum is $W''(0) = g_2 \equiv m$, independent of the other $g_p$. The low energy theory is $\mathcal{N}=1$ $SU(N)$ pure Yang–Mills and the dynamical scale $\Lambda_L$ of this theory is related to that of the original high energy theory by matching the running gauge coupling at the threshold scale $m$, giving $\Lambda_L^{3N} = m^N \Lambda^{2N}$. The low-energy theory has $N$ vacua with gaugino condensation and low-energy superpotential
$$W_{\rm low} = e^{2\pi i k/N} N \Lambda_L^3 = e^{2\pi i k/N} N m \Lambda^2. \tag{5.2}$$
Using (5.1) one could use this to try to find the $\langle u_r \rangle$ in this vacuum, but the answer would be incorrect for $SU(N)$ with $N > 3$. The exact answer can be found from deforming the $\mathcal{N}=2$ curve following [29], as reviewed in the next subsection. The exact effective superpotential is found from this to be
$$W_{\rm exact} = N \sum_{p=1}^{[n/2]} \frac{g_{2p}}{2p} \binom{2p}{p} \Lambda^{2p}. \tag{5.3}$$
The $g_2$ term coincides with (5.2), so both give the same $\langle u_2 \rangle$, but (5.2) gives all other $\langle u_r \rangle = 0$, whereas (5.3) gives higher $\langle u_{2p} \rangle \sim N \Lambda^{2p} \neq 0$.
The terms in (5.3) which are missing from (5.2) are weighted by $g_{2p} \Lambda^{2p}$, which should be small as compared with the leading term $m \Lambda^2$. The reason is that the higher $g_{2p}$ appear irrelevant in the original $SU(N)$ description, so their required UV cutoff should be larger than the dynamical scale $\Lambda$ in order for the theory to be well-defined, i.e., the $g_{3+n} \Lambda^n$ should be small. So the lesson is that naive integrating in here needn't give the exact answer, but it does generally give the leading term or terms.
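The comparison between (5.2) and (5.3) can be made explicit. Reading (5.3) as $W_{\rm exact} = N \sum_p \frac{g_{2p}}{2p} \binom{2p}{p} \Lambda^{2p}$, the relation (5.1) gives $\langle u_{2p} \rangle = \partial W_{\rm exact} / \partial g_{2p} = \frac{N}{2p} \binom{2p}{p} \Lambda^{2p}$. The sketch below (hypothetical function names; couplings passed as a dict keyed by the even index $2p$) evaluates these and confirms that the $g_2$ term reproduces $N m \Lambda^2$ in the $k = N$ vacuum, where the phase is 1.

```python
from math import comb

def w_exact(g_even, n_colors, lam):
    """N * sum_p g_{2p}/(2p) * C(2p, p) * lam**(2p); g_even maps 2p -> g_{2p}."""
    return n_colors * sum(
        g / two_p * comb(two_p, two_p // 2) * lam ** two_p
        for two_p, g in g_even.items()
    )

def u_vev(two_p, n_colors, lam):
    """<u_{2p}> obtained as the derivative of w_exact with respect to g_{2p}."""
    return n_colors / two_p * comb(two_p, two_p // 2) * lam ** two_p

if __name__ == "__main__":
    n_colors, lam, m = 3, 1.5, 0.8
    print(w_exact({2: m}, n_colors, lam), n_colors * m * lam ** 2)  # g_2 term only
    print(u_vev(2, n_colors, lam), u_vev(4, n_colors, lam))         # u_4 is non-zero
```

Because the expression is linear in the couplings, adding a small $g_4$ shifts the superpotential by exactly $g_4 \langle u_4 \rangle$, which is the piece absent from the naive result.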
On the other hand, naive integrating in actually does give the exact answer for $W_{\rm eff}$ in the vacua where $SU(N) \to SU(2) \times U(1)^{N-2}$ [30]. In fact, the exact curve of the entire $\mathcal{N}=2$ theory can be re-derived via integrating in in the $SU(2) \times U(1)^{N-2}$ vacua [30].
We now outline the naive integrating-in procedure for the general vacuum (1.3). The
low-energy $\mathcal{N}=1$ SYM with gauge group (1.3) leads to a low-energy superpotential via
gaugino condensation in each of the decoupled, non-abelian groups:
\[ W_{\rm low} = W_{\rm cl}(g_r) + \sum_{i=1}^{n} e^{2\pi i k_i/N_i}\, N_i \Lambda_i^3 . \tag{5.4} \]
The term $W_{\rm cl}(g_r)$ is simply the value of the classical superpotential (1.1), evaluated in the
classical vacuum:
\[ W_{\rm cl} = \sum_{i=1}^{n} N_i \sum_{p=1}^{n+1} g_p \frac{a_i^p}{p} , \tag{5.5} \]
with the $a_i$ defined in (1.2). As in (5.1), $\partial W_{\rm cl}(g_r)/\partial g_r = \langle u_r\rangle_{\rm cl}$.
The dynamical scale $\Lambda_i$ entering in (5.4) is that of the low-energy SU($N_i$) theory, which
is related to the scale $\Lambda$ of the high-energy theory by matching the running gauge coupling
across two thresholds: that of the massive SU(N)/SU($N_i$) W-bosons, and that of the mass
of the field $\Phi$ in the vacuum. The classical masses of the W-bosons which are charged
under SU($N_i$) are $m_{W_{ij}} = a_j - a_i$. The mass of the SU($N_i$) adjoint $\Phi_i$ is classically
$m_{\Phi_i} = W''(a_i) = g_{n+1}\prod_{j\neq i}(a_j - a_i)$. The scale $\Lambda_i$ of the low-energy SU($N_i$) is thus
obtained by naive threshold matching to be
\[ \Lambda_i^{3N_i} = \Lambda^{2N}\, m_{\Phi_i}^{N_i} \prod_{j\neq i} m_{W_{ij}}^{-2N_j} = g_{n+1}^{N_i}\, \Lambda^{2N} \prod_{j\neq i} (a_j - a_i)^{N_i - 2N_j} . \tag{5.6} \]
It will be useful in what follows to also integrate in the glueball fields $S_i$:
\[ W_{\rm low} = W_{\rm cl}(g_r) + \sum_{i=1}^{n} S_i \left[ \log\frac{\Lambda_i^{3N_i}}{S_i^{N_i}} + N_i \right] . \tag{5.7} \]
The $S_i$ are massive, with supersymmetric vacua $\langle S_i\rangle^{N_i} = \Lambda_i^{3N_i}$, and integrating out the $S_i$
leads back to (5.4).
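The step of integrating out a glueball field can be written out explicitly for a single factor. The following is only a sketch based on (5.7) as written above, with the index $i$ suppressed:

```latex
\begin{align*}
W(S) &= S\left[\log\frac{\Lambda^{3N}}{S^{N}} + N\right], \\
\frac{\partial W}{\partial S} &= \log\frac{\Lambda^{3N}}{S^{N}} - N + N = 0
  \;\Longrightarrow\; \langle S\rangle^{N} = \Lambda^{3N},
  \qquad \langle S\rangle = e^{2\pi i k/N}\,\Lambda^{3}, \\
W(\langle S\rangle) &= N\,\langle S\rangle = e^{2\pi i k/N}\, N\, \Lambda^{3} .
\end{align*}
```

The $N$ roots of $\langle S\rangle^{N} = \Lambda^{3N}$ are the $N$ gaugino-condensation vacua, and the value of $W$ at each root is the corresponding term of (5.4).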
The final result of naive integrating in is thus expressed in terms of the $a_i(g_r)$ as
\[ W_{\rm low}(g_r) = \sum_{i=1}^{n} \left[ N_i \sum_{p=1}^{n+1} g_p \frac{a_i^p}{p} + S_i \left( \log\frac{g_{n+1}^{N_i}\, \Lambda^{2N} \prod_{j\neq i}(a_j - a_i)^{(N_i - 2N_j)}}{S_i^{N_i}} + N_i \right) \right] . \tag{5.8} \]
The quantum term in (5.8), coming from SU($N_i$) gaugino condensation, is to be omitted
when $N_i = 1$; e.g., in the case of [30], where $N_1 = 2$ and all other $N_i = 1$. The result
(5.8) happens to be exact when no $N_i > 2$ but, as emphasized above, (5.8) is only an
approximation to the exact answer in the more general case, where some $N_i \geq 3$.
5.2. The exact $W_{\rm exact}(g_r)$ via deforming the $\mathcal{N}=2$ results
In this subsection, we obtain the exact 1PI generating function $W_{\rm exact}(g_r)$ by deforming
the exact solution [25–27] of the $\mathcal{N}=2$ theory by the $W_{\rm tree}(\Phi)$ (1.1). The large N
duality proposal of Section 4 gives the exact superpotential $W_{\rm exact}(g_r; S_i)$ as (4.6), with
the glueball fields included. (As verified in Section 7, the naive integrating in result (5.8) is
indeed an approximation to this exact result; generally there is an infinite series expansion
of corrections to the naive formula (5.8).) Upon integrating out the massive $S_i$ from
$W_{\rm exact}(g_r, S_i)$ (4.6), one obtains $W_{\rm exact}(g_r)$, which we will verify indeed agrees with the
field theory result obtained in this subsection. Our $W_{\rm exact}(g_r; S_i)$ (4.6), however, contains
the additional information about the glueball fields $S_i$. Although the $S_i$ are massive, this
additional information about their superpotential is physical; for example, $\Delta W$ between the
different $\langle S_i\rangle$ vacua gives the BPS tension of the associated domain walls. Perhaps there is
also a way to exactly integrate in the $S_i$ in the context of the deformed $\mathcal{N}=2$ field theory,
though this is not presently known.

The $\mathcal{N}=2$ theory deformed by $W_{\rm tree} = \sum_{r=1}^{n+1} g_r u_r$ only has unbroken supersymmetry
on submanifolds of the Coulomb branch, where there are additional massless fields besides
the $u_r$. The additional massless fields are the magnetic monopoles or dyons, which
become massless on some particular submanifolds $\langle u_p\rangle$ [25]. Near a point with $l$ massless
monopoles, the superpotential is
\[ W = \sum_{k=1}^{l} M_k(u_r)\, q_k \tilde q_k + \sum_{p=1}^{n+1} g_p u_p , \tag{5.9} \]
and the supersymmetric vacua are at those $\langle u_p\rangle$ satisfying
\[ M_k(\langle u_p\rangle) = 0 \quad\text{and}\quad \sum_{k=1}^{l} \frac{\partial M_k(\langle u_p\rangle)}{\partial u_p}\, \langle q_k \tilde q_k\rangle + g_p = 0 ; \tag{5.10} \]
the first equations are for all $k = 1, \ldots, l$ and the second for all $p = 1, \ldots, N$ (with $g_p = 0$
for $p > n + 1$). The value of the superpotential (5.9) in this vacuum is simply
\[ W_{\rm eff} = \sum_{p=1}^{n+1} g_p \langle u_p\rangle , \tag{5.11} \]
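with $\langle u_p\rangle$ the solution of $M_k(\langle u_p\rangle) = 0$, where the monopoles are massless. The explicit
monopole masses $M_k(u_r)$ on the Coulomb branch can be obtained via the appropriate
periods of the one-form $\lambda$ [26],
\[ M_k = \oint_{\gamma_k} \lambda , \quad\text{with}\quad \lambda = \frac{1}{2\pi i}\, \frac{x}{y}\, \frac{\partial P_N(x)}{\partial x}\, dx , \quad\text{satisfying}\quad \frac{\partial\lambda}{\partial s_r} \propto \frac{x^{N-r}\, dx}{y} + d(\ldots) , \tag{5.12} \]
but this will not be needed here.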

In the vacuum (1.3), there are n massless photons, whereas the original $\mathcal{N}=2$ theory
had N massless photons. So the vacuum (1.3) must have N − n mutually local magnetic
monopoles being massless and getting an expectation value as in (5.10), $\langle q_k \tilde q_k\rangle \neq 0$ for
$k = 1, \ldots, N - n$. It can indeed be shown from (5.10) that if the highest Casimir with
nonzero $g_p$ in $W_{\rm tree}$ is $u_{n+1}$, as in (1.1), then the supersymmetric vacuum necessarily has
at least $l = N - n$ mutually local monopoles condensed. (More than N − n condensed
monopoles correspond to those classical vacua in (1.3) where some $N_i = 0$, and thus there
are fewer than n photons left massless.) The vacuum obtained from integrating out $u_p$ as
in (5.10) will give some values of the $\langle u_p\rangle$ which are determined in terms of the $g_p$.
Solving for the supersymmetric vacua as in (5.10) is equivalent to minimizing
$W_{\rm tree} = \sum_{p=1}^{n+1} g_p u_p$, subject to the constraint that $\langle u_p\rangle$ lie on the codimension N − n subspace
of the Coulomb branch where at least N − n mutually local monopoles or dyons are
massless. This is just a matter of replacing the monopoles with N − n Lagrange multipliers,
imposing that the $u_r$ lie in the subspace with N − n massless monopoles; i.e., we integrate
out the $u_p$ with $W = W_{\rm tree} + \sum_{k=1}^{N-n} L_k M_k(u)$, with $M_k(u)$ the monopole masses on the
Coulomb branch and $L_k$ Lagrange multipliers, and the $\langle L_k\rangle = \langle q_k \tilde q_k\rangle$. The resulting $\langle u_p\rangle$
will be some fixed value, depending on the $g_r$ and $\Lambda$, giving finally
$W_{\rm exact}(g_r, \Lambda) = \sum_r g_r \langle u_r\rangle$.
Recall that the curve of the U(N) theory is
\[ y^2 = P(x; u_r)^2 - 4\Lambda^{2N} , \qquad P(x, u_r) \equiv \det(x - \Phi) = \sum_{k=0}^{N} x^{N-k} s_k , \tag{5.13} \]
with the $s_k$ related to the $u_r$ by
\[ k s_k + \sum_{r=1}^{k} r u_r s_{k-r} = 0 , \tag{5.14} \]
and $s_0 \equiv 1$ and $u_0 \equiv 0$; thus $s_1 = -u_1$, $s_2 = \frac12 u_1^2 - u_2$, etc. (for SU(N) we impose $u_1 = 0$).
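The recursion (5.14) is Newton's identity relating the coefficients of $\det(x - \Phi)$ to the Casimirs. A minimal sketch of a numerical check, assuming the normalization $u_r = \frac{1}{r}{\rm Tr}\,\Phi^r$ (which is what $s_2 = \frac12 u_1^2 - u_2$ implies) and a diagonal $\Phi$; the function names are illustrative only:

```python
from fractions import Fraction

def char_poly_coeffs(eigs):
    """Return [s_0, ..., s_N] with det(x - Phi) = sum_k x^(N-k) s_k, Phi = diag(eigs)."""
    coeffs = [Fraction(1)]
    for e in eigs:
        # Multiply the current polynomial (descending powers) by (x - e).
        new = coeffs + [Fraction(0)]
        for k in range(len(coeffs)):
            new[k + 1] -= e * coeffs[k]
        coeffs = new
    return coeffs

def casimirs(eigs, rmax):
    """Return [u_0, ..., u_rmax] with u_0 = 0 and u_r = Tr(Phi^r) / r (assumed normalization)."""
    return [Fraction(0)] + [sum(Fraction(e) ** r for e in eigs) / r
                            for r in range(1, rmax + 1)]

def s_from_u(u, N):
    """Solve (5.14) recursively: k s_k = - sum_{r=1}^{k} r u_r s_{k-r}, with s_0 = 1."""
    s = [Fraction(1)]
    for k in range(1, N + 1):
        s.append(-sum(r * u[r] * s[k - r] for r in range(1, k + 1)) / k)
    return s
```

Exact rational arithmetic is used so that the comparison with the characteristic polynomial is an equality rather than a floating-point approximation.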

The condition for having N − n mutually local massless magnetic monopoles is that
\[ P_N(x; \langle u_p\rangle)^2 - 4\Lambda^{2N} = H_{N-n}(x)^2\, F_{2n}(x) , \tag{5.15} \]
where $H_{N-n}$ is a polynomial in x of degree N − n and $F_{2n}$ is a polynomial in x of
degree 2n. The LHS of (5.15) has 2N roots, and the RHS says that N − n pairs of roots
should be tuned to coincide; thus (5.15) is satisfied on codimension N − n subspaces of
the Coulomb branch. We need to integrate out the $u_p$, with $W_{\rm tree} = \sum_{p=1}^{n+1} g_p u_p$, subject to
the constraint that $\langle u_p\rangle$ satisfy (5.15).
Of the n massless photons, the one corresponding to the trace of U(N) does not couple
to the rest of the theory and so its coupling constant is the same as the one we started with.
The other n − 1 photons which are left massless in (1.3) have gauge couplings which are
given by the period matrix of the reduced curve
\[ y^2 = F_{2n}(x; \langle u_r\rangle) = F_{2n}(x; g_p, \Lambda) , \tag{5.16} \]
with $F_{2n}(x; \langle u_p\rangle)$ the same function appearing in (5.15) and $\langle u_r\rangle$ the point on the solution
space of (5.15) which minimizes $W_{\rm tree}$. The curve (5.16) thus gives the exact gauge
couplings of the U(1)^{n−1} which remain massless in (1.3) as functions of $g_p$ and $\Lambda$.
The dual Calabi–Yau geometry which we proposed in Section 4,
\[ W'(x)^2 + f_{n-1}(x) + y^2 + z^2 + v^2 = 0 , \]
is already similar to the SW geometry (5.16), giving the coupling constants of the massless
U(1)'s. To show that the $\tau_{ij}$ obtained from (5.16) agrees with that obtained from (4.9), we
need to show that the $F_{2n}(x)$ of (5.15) and (5.16) is given by
\[ g_{n+1}^2\, F_{2n}(x) = W'(x)^2 + f_{n-1}(x) , \tag{5.17} \]
with the factor of $g_{n+1}^2$ because the highest order term in $F_{2n}(x)$ is $x^{2n}$, whereas that of
$W'(x)$ is $g_{n+1} x^n$. We will indeed verify that the structure of $F_{2n}$ predicted from (5.17)
is correct, i.e., it is a deformation of $W'^2$ by a polynomial in x of degree n − 1.
However, more needs to be done to show that the dual geometry and gauge theory predict
the same coupling constants for the U(1)'s. Namely, we have to show that the coefficients
of the $f_{n-1}$ predicted from the dual geometry and from the gauge theory have identical
dependence on $N_i$ and the parameters of the superpotential. This is indeed a highly non-trivial
statement, which we will later verify for the cubic superpotential in Section 8.
As a first hint about why (5.17) holds, consider the classical limit, $\Lambda \to 0$, where
$P_N(x) = \det(x - \Phi) \to \prod_{i=1}^{n} (x - a_i)^{N_i}$, with $a_i$ the roots of $W'(x) = g_{n+1} \prod_{i=1}^{n} (x - a_i)$.
In this limit $P_N^2 - 4\Lambda^{2N} \to H_{N-n}^2 F_{2n}$, as in (5.15), with $H_{N-n}(x) = \prod_{i=1}^{n} (x - a_i)^{N_i - 1}$
and $F_{2n} = \prod_{i=1}^{n} (x - a_i)^2 = g_{n+1}^{-2} W'(x)^2$. The motivation for this splitting is applying the
intuition of [29] to each SU($N_i$) factor: each $P_{N_i}^2 - 1$ splits to $(x - a_i)^2$ times a degree
$N_i - 1$ polynomial. We thus find that (5.17) holds in the $\Lambda \to 0$ limit, and see that the
$f_{n-1}(x)$ appearing in (5.17) satisfies $f_{n-1}(x) \to 0$ for $\Lambda \to 0$.
To prove (5.17) exactly, and also get some insight into how the $\langle u_r\rangle$ are determined, we
note that we can minimize our $W_{\rm tree}$ (1.1), subject to the constraint that the $\langle u_r\rangle$ satisfy
(5.15), by introducing several Lagrange multipliers:
\[ W = \sum_{r=1}^{n+1} g_r u_r + \sum_{i=1}^{l} \left[ L_i \left( P_N(x; u_r)\big|_{x=p_i} - 2\epsilon_i \Lambda^N \right) + Q_i\, \frac{\partial}{\partial x} P_N(x; u_r)\Big|_{x=p_i} \right] , \tag{5.18} \]
with $\epsilon_i = \pm 1$. We are generally allowing $l$ mutually local massless monopoles, and will see
that $l \geq N - n$. The $L_i$, $Q_i$, and $p_i$ are all treated as Lagrange multipliers; so we should
independently take derivatives of (5.18) with respect to all $u_r$, $L_i$, $Q_i$, and $p_i$, and set all
these derivatives to zero. The $p_i$ will be the roots of $H_l(x)$ in (5.15), and the $L_i$ and $Q_i$
constraints implement the LHS of (5.15) having double zeros at these $l$ points $p_i$.
The variation of (5.18) with respect to $p_i$ gives
\[ Q_i\, \frac{\partial^2 P_N}{\partial x^2}\Big|_{x=p_i} = 0 , \tag{5.19} \]
where we used the $Q_i$ constraint to eliminate the term involving $L_i$. For generic $g_r$, the
RHS of (5.15) has some double roots, but no triple or higher roots; therefore (5.19) implies
that $\langle Q_i\rangle = 0$. The situation where the RHS of (5.15) does have triple or higher order roots
is where the unperturbed $\mathcal{N}=2$ theory has an interacting $\mathcal{N}=2$ superconformal field
theory, as in [22]. Our $\mathcal{N}=1$ theory with $W_{\rm tree}$ does put the vacuum at such points for
some special choices of the $g_r$, but we will consider the generic situation for the moment.
Since the $\langle Q_i\rangle = 0$, the variation of (5.18) with respect to all $u_r$ gives
\[ g_r + \sum_{i=1}^{l} \sum_{j=0}^{N} L_i\, p_i^{N-j}\, \frac{\partial s_j}{\partial u_r} = 0 , \tag{5.20} \]
with the understanding that the $g_r = 0$ for $r > n + 1$. Using (5.14), which implies
$\partial s_j/\partial u_r = -s_{j-r}$, (5.20) becomes
\[ g_r = \sum_{i=1}^{l} \sum_{j=0}^{N} L_i\, p_i^{N-j}\, s_{j-r} . \tag{5.21} \]
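We should also impose the $L_i$ and $Q_i$ constraints in (5.18). These equations and (5.21) fix
the $\langle u_r\rangle$, $\langle L_i\rangle$, $\langle p_i\rangle$, and $\langle Q_i\rangle$ as functions of the $g_r$ and $\Lambda$. The $\langle L_i\rangle$ are proportional to
the expectation values $\langle q_i \tilde q_i\rangle$ of the $l \geq N - n$ condensed, mutually local, monopoles.
Following a similar argument in [31], we multiply (5.21) by $x^{r-1}$ and sum:
\begin{align*}
W_{\rm cl}'(x) &= \sum_{r=1}^{N} g_r x^{r-1}
 = \sum_{r=1}^{N} \sum_{i=1}^{l} \sum_{j=0}^{N} x^{r-1} p_i^{N-j} s_{j-r} L_i \\
 &= \sum_{r=-\infty}^{N} \sum_{i=1}^{l} \sum_{j=0}^{N} x^{r-1} p_i^{N-j} s_{j-r} L_i - 2L\Lambda^N x^{-1} + O(x^{-2}) \\
 &= \sum_{i=1}^{l} \sum_{j=-\infty}^{N} P_N(x; \langle u\rangle)\, x^{j-N-1} p_i^{N-j} L_i - 2L\Lambda^N x^{-1} + O(x^{-2}) \\
 &= \sum_{i=1}^{l} \frac{P_N(x; \langle u\rangle)}{x - p_i}\, L_i - 2L\Lambda^N x^{-1} + O(x^{-2}) . \tag{5.22}
\end{align*}
We define $L \equiv \sum_{i=1}^{l} L_i \epsilon_i$.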
Defining, as in [31], the order $l - 1$ polynomial $B_{l-1}(x)$ by
\[ \sum_{i=1}^{l} \frac{L_i}{x - p_i} = \frac{B_{l-1}(x)}{H_l(x)} , \tag{5.23} \]
with $H_l(x)$ the polynomial appearing in (5.15), we thus have
\[ W_{\rm cl}'(x) + 2L\Lambda^N x^{-1} = B_{l-1}(x) \sqrt{F_{2N-2l}(x) + \frac{4\Lambda^{2N}}{H_l(x)^2}} + O(x^{-2}) . \tag{5.24} \]
Since the highest order term in $W_{\rm cl}'$ is $g_{n+1} x^n$, we see that $B_{l-1}(x)$ should actually be order
$n - N + l$. This shows that $l \geq N - n$ and, in particular, for $l = N - n$, $B_{N-n-1} = g_{n+1}$ is
a constant. Squaring (5.24) gives
\[ g_{n+1}^2\, F_{2n} = W_{\rm cl}'^{\,2} + 4 g_{n+1} L \Lambda^N x^{n-1} + O(x^{n-2}) . \tag{5.25} \]
We have thus derived (5.17), $g_{n+1}^2 F_{2n} = W'^2 + f_{n-1}(x)$, and found that
$f_{n-1}(x) = 4 g_{n+1} L \Lambda^N x^{n-1} + O(x^{n-2})$.
This shows that the exact $\tau_{ij}(g_r, \Lambda)$ of the U(1)^n photons left massless found using the
reduced $\mathcal{N}=2$ curve (5.16), evaluated in the supersymmetric vacua, is consistent with
that of (4.9), found in Section 4 via our large N duality. However, as noted before, to show
they are exactly the same we have to match the coefficients of $f_{n-1}(x)$, which depend
in a highly non-trivial way on $N_i$ and the coupling constants of the superpotential. The
above method also, in principle, gives the $\langle u_r\rangle$, and thus $W_{\rm eff}(g_r)$, which can be compared
with the duality result $W_{\rm exact}(g_r, S_i)$ (4.6) (upon integrating out the $S_i$). The duality results
(4.9) and (4.6) give the answers, and in particular the $N_i$ dependence, in a much simpler
and more elegant fashion.
It is interesting to ask if the duality results of Section 4 could be recovered more directly
by a field theory analysis which includes the n glueball chiral superfields $S_i$ of the unbroken
gauge group $\prod_{i=1}^{n} U(N_i)$. In the original SU(N) theory, we can construct N generalized
glueball objects ${\rm Tr}\, \Phi^i W_\alpha W^\alpha$, $i = 0, \ldots, N - 1$. The N − n monopole condensates or
Lagrange multiplier expectation values in the above analysis are (indirectly) related to N − n
of these generalized glueballs. The n remaining ones should be those of the unbroken low
energy $\prod_{i=1}^{n} U(N_i)$. It is not known how to exactly include these from a direct field theory
analysis.
For any $W_{\rm tree}$, there are vacua where classically U(N) or SU(N) is unbroken and, in the
quantum theory, N − 1 mutually local monopoles condense. These are the only vacua for
$W_{\rm tree} = m u_2$, but also exist for any $n \geq 1$. The condition for having the N − 1 mutually
local massless monopoles is [29]
\[ P(x; \langle u_r\rangle)^2 - 4\Lambda^{2N} = H_{N-1}(x)^2\, F_2(x) , \tag{5.26} \]
which is satisfied via Chebyshev polynomials:
\[ P_N(x, \langle u_r\rangle) = \Lambda^N T_N(x/\Lambda) , \qquad T_N(x \equiv t + t^{-1}) \equiv t^N + t^{-N} . \tag{5.27} \]
With the normalization of (5.27), $T_N(x) = x^N - N x^{N-2} + \cdots$ are the Chebyshev
polynomials of the first kind. The roots of $P_N = \det(x - \Phi)$, as given by (5.27), are
$\phi_j = 2\Lambda \cos(\pi(2j + 1)/2N)$, $j = 0, \ldots, N - 1$; this gives (5.3).
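The statement that the Chebyshev roots reproduce (5.3) can be checked numerically. The sketch below assumes $u_r = \frac{1}{r}{\rm Tr}\,\Phi^r$ and the roots $\phi_j = 2\Lambda\cos(\pi(2j+1)/2N)$, and compares with $\langle u_{2p}\rangle = N\Lambda^{2p}\binom{2p}{p}/2p$, the coefficient of $g_{2p}$ in (5.3); the check is restricted to $p < N$, and the function names are illustrative:

```python
import math

def u_from_chebyshev_roots(N, r, lam=1.0):
    """u_r = (1/r) * sum_j phi_j^r, with phi_j = 2*lam*cos(pi*(2j+1)/(2N))."""
    roots = [2.0 * lam * math.cos(math.pi * (2 * j + 1) / (2 * N)) for j in range(N)]
    return sum(x ** r for x in roots) / r

def u_exact(N, p, lam=1.0):
    """Coefficient of g_{2p} in (5.3): N * lam^(2p) * binom(2p, p) / (2p)."""
    return N * lam ** (2 * p) * math.comb(2 * p, p) / (2 * p)
```

For odd $r$ the sum over roots vanishes by the symmetry $\phi_j \to -\phi_{N-1-j}$, consistent with only even Casimirs appearing in (5.3).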
More generally, we can use Chebyshev polynomials to construct new solutions of
the massless monopoles constraint (5.15). Given a solution $P_N(x)$ of (5.15) which is
appropriate for the SU(N) theory where the vacuum is broken to
\[ SU(N) \to \prod_{i=1}^{n} SU(N_i) \times U(1)^{n-1} \quad\text{with}\quad \sum_{i=1}^{n} N_i = N , \tag{5.28} \]
we can immediately construct the solution $P_{KN}(x)$ which is appropriate for a SU(KN)
theory, with the same $W_{\rm tree}$ (1.1), in the vacuum where the gauge group is broken as
\[ SU(KN) \to \prod_{i=1}^{n} SU(KN_i) \times U(1)^{n-1} \quad\text{with}\quad \sum_{i} N_i = N . \tag{5.29} \]
The solution $P_{KN}(x)$ of (5.15) for the theory (5.29) is given by the Chebyshev polynomial
of the K = 1 solution $P_N(x)$:
\[ P_{KN}(x) = \tilde\Lambda^{NK}\, T_K\!\left( \frac{P_N(x)}{\Lambda^N} \right) , \tag{5.30} \]
with $\tilde\Lambda$ and $\Lambda$ the scales of SU(KN) and SU(N), respectively. To see that this satisfies the
condition of (5.15) note
\begin{align*}
P_{KN}(x)^2 - 4\tilde\Lambda^{2KN}
 &= \tilde\Lambda^{2NK} \left[ T_K^2\!\left( \frac{P_N}{\Lambda^N} \right) - 4 \right]
  = \tilde\Lambda^{2KN}\, U_{K-1}^2\!\left( \frac{P_N}{\Lambda^N} \right) \left[ \frac{P_N^2}{\Lambda^{2N}} - 4 \right] \\
 &= \tilde\Lambda^{2KN} \Lambda^{-2N}\, U_{K-1}^2\!\left( \frac{P_N}{\Lambda^N} \right) H_{N-n}(x)^2\, F_{2n}(x) \equiv H_{KN-n}(x)^2\, F_{2n}(x) . \tag{5.31}
\end{align*}
We denote by $U_{K-1}(x \equiv t + t^{-1}) \equiv (t^K - t^{-K})/(t - t^{-1}) = x^{K-1} + \cdots$ the Chebyshev
polynomials of the second kind, and the second line uses the fact that $P_N$ is a solution of (5.15). Thus $P_{NK}(x)$
given by (5.30) indeed satisfies the condition (5.15) appropriate for (5.29). Furthermore,
the U(1)^{n−1} in (5.29) has gauge couplings given by the curve $y^2 = F_{2n}(x)$, which is the
same as that of the K = 1 theory. This fits with the dual geometry prediction of Section 4,
as will be discussed in the next section.
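The algebraic identity behind (5.31) is $T_K(z)^2 - 4 = (z^2 - 4)\,U_{K-1}(z)^2$ in the normalization $T_K(t + t^{-1}) = t^K + t^{-K}$. A minimal sketch of a check with integer polynomial arithmetic follows; the three-term recurrences used to build $T_K$ and $U_K$ are the standard ones for this normalization and are not spelled out in the text:

```python
def poly_mul(a, b):
    """Multiply polynomials given as ascending integer coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_sub(a, b):
    """Subtract b from a and drop trailing zero coefficients."""
    n = max(len(a), len(b))
    out = [(a[i] if i < len(a) else 0) - (b[i] if i < len(b) else 0) for i in range(n)]
    while len(out) > 1 and out[-1] == 0:
        out.pop()
    return out

def _recur(first, K):
    """Run p_{k+1} = z * p_k - p_{k-1} starting from p_0 = first, p_1 = z."""
    prev, cur = first, [0, 1]
    if K == 0:
        return prev
    for _ in range(K - 1):
        prev, cur = cur, poly_sub(poly_mul([0, 1], cur), prev)
    return cur

def cheb_T(K):
    """T_K with T_K(t + 1/t) = t^K + t^-K, so T_0 = 2."""
    return _recur([2], K)

def cheb_U(K):
    """U_K with U_{K-1}(t + 1/t) = (t^K - t^-K) / (t - 1/t), so U_0 = 1."""
    return _recur([1], K)
```

Substituting $z = P_N/\Lambda^N$ and using (5.15) for $P_N$ then gives the double roots of $P_{KN}^2 - 4\tilde\Lambda^{2KN}$ claimed in (5.31).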
Expanding out (5.30) relates the expectation values $\langle\tilde u_p\rangle$ of the SU(KN) theory to the
$\langle u_p\rangle$ of the SU(N) theory. The relation is especially simple for the lower Casimirs:
\[ \langle\tilde u_2\rangle = K \langle u_2\rangle , \qquad \langle\tilde u_3\rangle = K \langle u_3\rangle , \tag{5.32} \]
with some more complicated relations for the general higher Casimirs.
By the above construction, it suffices to consider (1.3) where the $N_i$ have no common
integer divisor. The simple K dependence fits with the duality results of Section 4.
5.3. Other possible connections
The quantum $\mathcal{N}=2$ theory is related to an integrable hierarchy, which is known to have
integrable Whitham hierarchy deformations; see, e.g., [32]. Our superpotential $W_{\rm tree}$ is
naturally regarded as a Whitham deformation of the $\mathcal{N}=2$ theory, where the Whitham
times are the $g_r$ in (1.1) which, from the $\mathcal{N}=2$ perspective, are spurions breaking
$\mathcal{N}=2$ to $\mathcal{N}=1$. The exact solution can still be obtained as a function of the Whitham
hierarchy; see, e.g., the last reference of [32]. It would be interesting to see how this
function is related to the $S_i$ and $\Pi_i$ periods of Section 4.
The $\mathcal{N}=1$ U(N) field theories with adjoint $\Phi$, $N_f$ fundamental flavors, and general
superpotential $W_{\rm tree}(\Phi)$ (1.1) can also be constructed via N IIA D4 branes suspended
between an NS brane and n NS′ branes. The construction was discussed in detail in [31]
and references cited therein. Four of the five directions transverse to the D4s in IIA are
conventionally written as having complex coordinates w and v. The NS′ branes are given
by some (v, w) curve, which classically is $w = W_{\rm tree}'(v)$, giving the n NS′ branes at the
minima of $W_{\rm tree}$. Going to M-theory, the brane configuration becomes a smooth M5 brane
configuration, as in [33]. Our geometric flop transition duality is roughly reminiscent of
exchanging the roles of v and w; it was already speculated [31] that this exchange could
be related to the field theory duality of [9]. Perhaps this can be made more precise.
6. The case with the cubic superpotential in more detail

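Consider in more detail the case n = 2, with $W_{\rm cl} = g u_3 + m u_2 + \lambda u_1$. Then
$W'(\Phi) = g(\Phi - a_1)(\Phi - a_2)$, with
\[ a_1 = -\frac{m}{2g} + \sqrt{\left(\frac{m}{2g}\right)^2 - \frac{\lambda}{g}} , \qquad a_2 = -\frac{m}{2g} - \sqrt{\left(\frac{m}{2g}\right)^2 - \frac{\lambda}{g}} . \tag{6.1} \]
For SU(N) → SU($N_1$) × SU($N_2$) × U(1), as opposed to U(N) → U($N_1$) × U($N_2$),
$\lambda$ should be treated as a Lagrange multiplier, enforcing $u_1 = 0$. In that case,
\[ a_1 = \frac{m}{g}\, \frac{N_2}{(N_1 - N_2)} , \qquad a_2 = -\frac{m}{g}\, \frac{N_1}{(N_1 - N_2)} . \tag{6.2} \]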
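The classical low-energy superpotential is
\[ W_{\rm cl} = \frac{m^3}{g^2}\, \frac{N_1 N_2 (N_1 + N_2)}{6 (N_1 - N_2)^2} \tag{6.3} \]
and
\[ \Lambda_1^{3N_1} = g^{N_1} \Delta^{N_1 - 2N_2} \Lambda^{2N} , \qquad \Lambda_2^{3N_2} = g^{N_2} \Delta^{N_2 - 2N_1} \Lambda^{2N} , \tag{6.4} \]
with $m_W = \Delta \equiv a_1 - a_2 = (m/g)(N/(N_1 - N_2))$ and $m_\Phi = g\Delta$. Naive integrating in
then gives $W_{\rm eff} = W_{\rm cl} + W_{\rm np}$ with
\begin{align*}
W_{\rm np} &= \sum_{i=1}^{2} S_i \left[ \log\frac{\Lambda_i^{3N_i}}{S_i^{N_i}} + N_i \right] \\
 &= N_1 \left[ S_1 \log\frac{g\Delta\Lambda^2}{S_1} + S_1 + S_2 \log\frac{\Lambda^2}{\Delta^2} \right]
  + N_2 \left[ S_2 \log\frac{g\Delta\Lambda^2}{S_2} + S_2 + S_1 \log\frac{\Lambda^2}{\Delta^2} \right] . \tag{6.5}
\end{align*}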
The exact answer for the value of the superpotential at the minima of W can be obtained
via deforming the $\mathcal{N}=2$ curve; it is given by (5.11), with the $\langle u_r\rangle$ given by solving (5.15)
for n = 2:
\[ P_N^2 - 4\Lambda^{2N} = H_{N-2}^2\, F_4 . \tag{6.6} \]
Again, this does not include the glueball fields.
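The classical data (6.2) and (6.3) can be cross-checked with exact arithmetic. The sketch below assumes $W(x) = \frac13 g x^3 + \frac12 m x^2 + \lambda x$ with $W'(x) = g(x - a_1)(x - a_2)$, so that $\lambda = g a_1 a_2$, and evaluates $W_{\rm cl} = N_1 W(a_1) + N_2 W(a_2)$; the helper names are illustrative only:

```python
from fractions import Fraction

def W(x, g, m, lam):
    """Cubic tree-level superpotential W(x) = g x^3/3 + m x^2/2 + lam x."""
    return g * x ** 3 / 3 + m * x ** 2 / 2 + lam * x

def su_n_vacuum(N1, N2, g, m):
    """Roots (6.2) for SU(N) -> SU(N1) x SU(N2) x U(1), and the implied lam = g a1 a2."""
    a1 = m * N2 / (g * (N1 - N2))
    a2 = -m * N1 / (g * (N1 - N2))
    return a1, a2, g * a1 * a2

def W_cl(N1, N2, g, m):
    """Classical superpotential N1 W(a1) + N2 W(a2) in the traceless vacuum."""
    a1, a2, lam = su_n_vacuum(N1, N2, g, m)
    return N1 * W(a1, g, m, lam) + N2 * W(a2, g, m, lam)
```

Because $N_1 a_1 + N_2 a_2 = 0$ in this vacuum, the $\lambda$-dependent part of $W_{\rm cl}$ drops out, which is why (6.3) depends only on $m$, $g$ and the $N_i$.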

As discussed in the previous section, a solution of (6.6) appropriate for SU(N) →
SU($N_1$) × SU($N_2$) × U(1) can be used to immediately construct a solution of (6.6)
appropriate for SU(KN) → SU($KN_1$) × SU($KN_2$) × U(1). Using (5.32), the low energy
effective superpotential for the SU(KN) theory is
\[ W_{\rm eff}^{SU(KN)} = m \langle\tilde u_2\rangle + g \langle\tilde u_3\rangle = K m \langle u_2\rangle + K g \langle u_3\rangle = K\, W_{\rm eff}^{SU(N)} , \tag{6.7} \]
simply a factor of K times that of the SU(N) theory. The $\langle\Phi_{\rm cl}\rangle$ which minimizes $W_{\rm eff}$,
giving the vacuum on the solution space of (6.6), is thus K independent, so K really does
just factor out as an overall multiplicative factor in the superpotential.
6.1. Examples
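6.1.1. U(3N) → U(2N) × U(N)
As a simple example of the procedure outlined in the last section, consider the case of
U(3N) in the vacuum where the unbroken group is U(2N) × U(N). As discussed above
it suffices to consider the case N = 1. The superpotential of (5.18) is
\[ W = \lambda u_1 + m u_2 + g u_3 + L \left( p^3 + s_1 p^2 + s_2 p + s_3 \mp 2\Lambda^3 \right) + Q \left( 3p^2 + 2 s_1 p + s_2 \right) . \tag{6.8} \]
The p equation of motion (along with that of Q) gives $\langle Q\rangle = 0$ and (5.21) then gives
$\lambda = L(p^2 + p s_1 + s_2)$, $m = L(p + s_1)$, $g = L$. Thus $\langle s_1\rangle = g^{-1} m - p$, $\langle s_2\rangle = g^{-1}(\lambda - mp)$, and
$\langle s_3\rangle = \pm 2\Lambda^3 - \lambda p g^{-1}$. $\langle p\rangle$ is fixed by the Q constraint to be either $a_1$ or $a_2$ of (6.1), so
$W_{\rm cl}'(x) = g(x - p)(x + p + g^{-1} m)$. We then have $\langle P_3(x)\rangle = g^{-1}(x - p) W_{\rm cl}'(x) \pm 2\Lambda^3$, and thus
$P_3^2 - 4\Lambda^6 = (x - p)^2 F_4(x)$, with $g^2 F_4(x) = W'(x)^2 \pm 4 g \Lambda^3 (g x + g p + m)$, which matches
with (5.25). For SU(3), we treat $\lambda$ also as a Lagrange multiplier, enforcing $\langle s_1\rangle = -\langle u_1\rangle = 0$,
i.e., $\langle p\rangle = m/g$. The Q constraint then gives $\langle\lambda\rangle = -2m^2/g$, so $\langle u_2\rangle = 3(m/g)^2$ and
$\langle u_3\rangle = -2(m/g)^3 \mp 2\Lambda^3$. Plugging these back into W gives $W_{\rm low} = (m^3/g^2) \mp 2 g \Lambda^3$.
Equivalently, we could simply solve the L and Q constraints at the outset by taking
$P_3 = (x - a)^2 (x - b) \pm 2\Lambda^3$, giving $\langle u_1\rangle = 2a + b$, $\langle u_2\rangle = \frac12 (2a^2 + b^2)$, $\langle u_3\rangle = \frac13 (2a^3 + b^3) \mp 2\Lambda^3$
and thus $W_{\rm low} = 2W(a) + W(b) \mp 2 g \Lambda^3$. Minimizing with respect to a and b gives
$\langle a\rangle = a_1$, $\langle b\rangle = a_2$ and $W_{\rm low} = W_{\rm cl} \mp 2 g \Lambda^3$ with $W_{\rm cl} = 2W(a_1) + W(a_2)$. In order to get
the SU(3) → SU(2) × U(1) answer we impose $\partial W_{\rm low}/\partial\lambda = 0$, which implies $a_1 = m/g$,
$a_2 = -2m/g$.
We thus find for SU(3) $W_{\rm low} = (m^3/g^2) \mp 2 g \Lambda^3$, and the remaining massless photon
has gauge coupling $\tau(g\Lambda/m)$ which is given exactly by the curve
$y^2 = g^2 F_4(x) = W'^2 \pm 4 g \Lambda^3 (g x + 2m)$, with $W'(x) = g(x - \frac{m}{g})(x + 2\frac{m}{g})$. This curve degenerates at
$(m/g)^3 = \mp\Lambda^3$, i.e., $\langle u_3\rangle = 0$, which is where an additional magnetic monopole becomes massless
in the $\mathcal{N}=2$ theory. The SU(2) glueball has $\langle S\rangle = \mp g \Lambda^3$. Here the upper and lower signs
correspond to the two choices $\epsilon = \pm 1$ in (5.18).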
6.1.2. Splittings of SU(5)
The computation of the one parameter family of $\mathcal{N}=2$ curves for the different splittings
of SU(5), namely, SU(3) × SU(2) × U(1) and SU(4) × U(1), can be done explicitly. This
will provide the highly non-trivial exact answer for the low energy effective superpotential
that will be used to check the answer from the geometry in Section 8.4. As discussed
before, this answer also provides the solution for SU(5K) → SU(3K) × SU(2K) × U(1)
and SU(5K) → SU(4K) × SU(K) × U(1) for any integer K.
We need to solve (6.6) for N = 5, i.e., to find $P_5(x)$ such that
\[ P_5^2(x) - 4\Lambda^{10} = F_4(x)\, H_3^2(x) . \tag{6.9} \]
Clearly, $P_5(x)$ has five parameters, given by the positions of the roots, since the coefficient
of $x^5$ can be normalized to one. However, three of them have to be used to produce the
three double roots and one in order to impose the quantum tracelessness condition, i.e., to
set to zero the $x^4$ coefficient. This leaves us with a one parameter family of curves.
Let us set $\Lambda^5 = 1/2$ and $H_3(x) = (x - a)(x - b)x$. The LHS of (6.9) can be factored as
$(P_5 - 1)(P_5 + 1)$, where it is clear that the two factors should contain no common roots.
Therefore, we can freely set
\[ P_5(x) = (x - a)^2 (x - b)^2 (x - c) - 1 . \tag{6.10} \]
Now we want to make sure that $P_5 - 1$ will have a double root at x = 0. This condition can
be easily implemented by
\[ P_5(0) = 1 , \qquad \frac{dP_5}{dx}(0) = 0 . \]
In terms of a, b and c, these conditions read as follows:
\[ a^2 b^2 c = -2 , \qquad ab \left[ 2c(a + b) + ab \right] = 0 . \tag{6.11} \]
Finally, we can impose the tracelessness condition by shifting $x \to x + \frac15 (2(a + b) + c)$.
We can now read off the gauge theory Casimir expectation values (using $\langle{\rm Tr}\,\Phi\rangle = 0$),
\[ P_5(x) = \langle\det(x - \Phi)\rangle = x^5 - \frac12 \langle{\rm Tr}\,\Phi^2\rangle x^3 - \frac13 \langle{\rm Tr}\,\Phi^3\rangle x^2 + \cdots . \]
Since our solution is symmetric in a and b it is more natural to write it in terms of the
symmetric polynomials $s = a + b$ and $k = ab$. The constraints (6.11) now read $k^2 c = -2$
and $k(2cs + k) = 0$. Assuming that $k \neq 0$ we can solve for k as $k = -2cs$. Then we are left
with only one constraint, namely, $2 s^2 c^3 = -1$.
The Casimirs are now given by:
\[ u_2 = \frac15 \left( 2c^2 + 18cs + 3s^2 \right) , \qquad u_3 = \frac{2}{25} \left( 2c^3 - 23c^2 s + 9cs^2 + s^3 \right) , \]
and the superpotential is now a function of c or s depending on how we use the constraint.
Let us introduce the constraint through a Lagrange multiplier $\kappa$ and write the superpotential
as
\[ W_{\rm eff}(c, s, \kappa) = g u_3(c, s) + m u_2(c, s) + \kappa \left( \Lambda^5 + s^2 c^3 \right) , \]
where we have introduced back $\Lambda$ for later convenience.
Now we need to solve $\partial W_{\rm eff}/\partial c = 0$ and $\partial W_{\rm eff}/\partial s = 0$ and then impose the constraint.
Computing these two equations and using one of them to eliminate $\kappa$ from the other we
get the following simple equation
\[ 3c + s = -\frac{5m}{g} , \tag{6.12} \]
subject to the constraint $s^2 c^3 = -\Lambda^5$. There is yet a better way to write the constraint,
namely, $s^4 c^6 = \Lambda^{10}$. This will make very simple the identification of the different vacua.
Now we can see how the different splittings will come out. The classical limit
corresponds to setting $\Lambda \to 0$ and the constraint can be solved in two ways, namely, s = 0
or c = 0. The former leads to $c = -5m/3g$ using (6.12) while the latter leads to $s = -5m/g$.
Plugging this in the superpotential we reproduce in the former case the classical answer
for SU(4) × U(1) and in the latter we get that of SU(3) × SU(2) × U(1).
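The Casimir formulas of this subsection can be checked directly from the roots. The sketch below assumes the reading used above: roots $a, a, b, b, c$ of the classical part of (6.10), the relation $k = ab = -2cs$ from (6.11), the shift by $\frac15(2(a+b)+c)$, and $u_r = \frac{1}{r}{\rm Tr}\,\Phi^r$. Only the relation $k = -2cs$ is imposed here; the overall scale fixed by $a^2 b^2 c = -2$ plays no role in the identity being tested. The function name is illustrative:

```python
from fractions import Fraction

def su5_casimirs(a, b):
    """Return (u2, u3, c, s) for roots a, a, b, b, c with ab = -2 c (a + b), after the traceless shift."""
    s = a + b
    c = -a * b / (2 * s)                  # solves k = ab = -2 c s for c
    roots = [a, a, b, b, c]
    delta = sum(roots) / 5                # shift making the x^4 coefficient vanish
    shifted = [r - delta for r in roots]
    u2 = sum(x ** 2 for x in shifted) / 2
    u3 = sum(x ** 3 for x in shifted) / 3
    return u2, u3, c, s
```

The constant $-1$ in (6.10) only changes the $x^0$ coefficient of $P_5$, so $u_2$ and $u_3$ are the classical functions of the shifted roots, which is what the code computes.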
6.1.3. SU(4) × U(1)
In order to get $W_{\rm low}$ we need to solve for c using (6.12) and $s^4 = \Lambda^{10}/c^6$. Clearly, we
have 4 solutions to the constraint giving $s = s(c)$. These are the $N_1 N_2 = 4$ vacua. The
equation we need to solve is then
\[ c = -\frac{5m}{3g} - \frac{s(c)}{3} ; \]
this can be solved recursively using $t \equiv \left( \frac{3g\Lambda}{5m} \right)^{5/2}$ as expansion parameter. Once this is
done, s can also be found and plugging them back in the superpotential we get
\begin{align*}
W_{\rm low} = \frac{125}{27}\, \frac{m^3}{g^2} \Big[ & \frac{2}{25} + 4t - \frac13 t^2 - \frac{7}{54} t^3 - \frac{5}{54} t^4 - \frac{221}{2592} t^5 - \frac{22}{243} t^6 \\
 & - \frac{2185}{20736} t^7 - \frac{286}{2187} t^8 - \frac{9147325}{53747712} t^9 + \cdots \Big] .
\end{align*}
The above exact answer for the value of the superpotential at the critical point differs from
the naive integrating in analysis (5.4), which would terminate at order $t^2$. The coefficients
of the classical $t^0$ term and the $t$ term agree with the exact answer above, but the coefficient of
the $t^2$ term differs from the exact answer.
6.1.4. SU(3) × SU(2) × U(1)
In this case we need to solve for s using $c^6 = \Lambda^{10}/s^4$. Here, we have 6 solutions giving
the $N_1 N_2 = 6$ choices of vacua. The equation in this case becomes
\[ s = -\frac{5m}{g} - 3c(s) ; \]
solving as before but using as expansion parameter $t \equiv \left( \frac{g\Lambda}{5m} \right)^{5/3}$ we get for the
superpotential the following expression:
\[ W_{\rm low} = 125\, \frac{m^3}{g^2} \left[ \frac{1}{25} + 3t^2 + 2t^3 + 6t^4 + 26t^5 + 135t^6 + 782t^7 + \frac{14630}{3} t^8 + 32076 t^9 + \cdots \right] . \]
Again this differs from the result of the naive low energy analysis (5.4) which would
terminate at order $t^3$; up to that order the naive answer agrees with the above exact answer.
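The two series above can be regenerated with the Lagrange inversion theorem. The sketch below rests on a rewriting that is ours rather than the paper's: substituting (6.12) into the Casimirs gives, for SU(4) × U(1), $W_{\rm low} = \frac{125}{27}\frac{m^3}{g^2}\left[\frac{2}{25} + 4\sigma - \frac73\sigma^2 + \frac{10}{27}\sigma^3\right]$ with $\sigma = t\,(1 - \sigma/3)^{-3/2}$, and for SU(3) × SU(2) × U(1), $W_{\rm low} = 125\frac{m^3}{g^2}\left[\frac{1}{25} + 3\tau^2 - 10\tau^3\right]$ with $\tau = t\,(1 - 3\tau)^{-2/3}$. One branch of each constraint is chosen, and the function names are illustrative:

```python
from fractions import Fraction

def binom_series(alpha, a, order):
    """Coefficients of (1 - a z)^(-alpha) up to z^order, for rational alpha."""
    out = [Fraction(1)]
    for j in range(1, order + 1):
        out.append(out[-1] * (alpha + j - 1) * a / j)
    return out

def lagrange_coeffs(dG, alpha, a, nmax):
    """[t^n] G(z(t)) for n = 1..nmax, where z = t (1 - a z)^(-alpha).

    Uses [t^n] G = (1/n) [z^(n-1)] G'(z) phi(z)^n with phi = (1 - a z)^(-alpha);
    dG lists the coefficients of G'(z) in ascending powers.
    """
    result = []
    for n in range(1, nmax + 1):
        phi_n = binom_series(alpha * n, a, n - 1)
        coeff = sum(dG[i] * phi_n[n - 1 - i] for i in range(min(len(dG), n)))
        result.append(coeff / n)
    return result

# SU(4) x U(1): G = 4 z - (7/3) z^2 + (10/27) z^3, alpha = 3/2, a = 1/3.
su4_u1 = lagrange_coeffs([Fraction(4), Fraction(-14, 3), Fraction(10, 9)],
                         Fraction(3, 2), Fraction(1, 3), 5)
# SU(3) x SU(2) x U(1): G = 3 z^2 - 10 z^3, alpha = 2/3, a = 3.
su3_su2 = lagrange_coeffs([Fraction(0), Fraction(6), Fraction(-30)],
                          Fraction(2, 3), Fraction(3), 6)
```

The lists `su4_u1` and `su3_su2` hold the coefficients of $t^1, t^2, \ldots$ inside the square brackets; raising `nmax` extends the expansions to higher order.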
6.1.5. Splitting U(5) → U(3) × U(2)
It is also possible to find the curve for U(5) and from it to compute the SU(5) answer
by imposing the tracelessness constraint. However, the computation for U(5) is more
cumbersome than the SU(5) counterpart. In this part of the section we will simply show
the answer for the low energy effective superpotential; the computation can be found
in Appendix A.
Since we now do not impose the tracelessness condition, $\lambda$ is a free parameter, rather
than a Lagrange multiplier. $\lambda/g$, $m/g$ and $\Lambda$ combine into a single expansion parameter
\[ T \equiv \left( \frac{\Lambda}{\Delta} \right)^{5/3} , \quad\text{with}\quad \Delta = a_1 - a_2 = \sqrt{\left( \frac{m}{g} \right)^2 - 4\frac{\lambda}{g}} . \]
The low energy superpotential is then given by
\[ W_{\rm low} = 3W(a_1) + 2W(a_2) + g\Delta^3 \left( 3T^2 + 2T^3 + 4T^4 + 10T^5 + \cdots \right) . \]
In the dual geometric picture we will see that U(5) is the natural answer obtained, and then
one has to impose the constraint to get the SU(5) superpotential.
7. The analysis of the dual geometry

The dual geometry proposal gives rise to the superpotential of Section 4.1:
\[ -\frac{1}{2\pi i} W_{\rm eff} = \sum_{i=1}^{n} N_i \Pi_i + \alpha \sum_{i=1}^{n} S_i , \tag{7.1} \]
where the $\Pi_i$ are the periods of the dual cycles and the $S_i$ are the sizes of the $S^3$'s as defined
in (4.4) and (4.5).
Using (4.4) and (4.5), it is seen that under $\Lambda_0 \to e^{2\pi i} \Lambda_0$ the $\Pi_i$ period will change by
\[ \Delta\Pi_i = -2 \sum_{j=1}^{n} (\pm S_j) . \tag{7.2} \]
The factor of two comes from the fact that we are dealing with two copies of the
x-plane connected by the n branch cuts (see Fig. 2). Let us choose the orientation of the
fundamental periods to be clockwise; it is then easy to see that we always get the
upper sign in (7.2) for all i and j. We thus see that, in general, $\Pi_i$ must depend on the
cutoff $\Lambda_0$ as
\[ \Pi_i = -\frac{2}{2\pi i} \sum_{j=1}^{n} S_j \log\Lambda_0 + \cdots , \tag{7.3} \]
with $\cdots$ single-valued under $\Lambda_0 \to e^{2\pi i} \Lambda_0$.

We now consider the full $\Lambda_0$-dependence. Consider the region of integration where x is
large compared to all $a_i$'s. We can then expand the effective one-form in x around
$x = \infty$ and it is easy to see that
\[ \sqrt{W'(x)^2 + f_{n-1}}\; dx = \left( W'(x) + \frac{b_{n-1}}{2 g_{n+1}}\, \frac{1}{x} + O\!\left( \frac{1}{x^2} \right) \right) dx , \]
where $b_{n-1}$ is the coefficient of $x^{n-1}$ in the deformation polynomial $f_{n-1}(x)$ and
$W'(x) = g_{n+1} \prod_{j=1}^{n} (x - a_j)$. Integrating this we get
\[ \Pi_i = \cdots + W(\Lambda_0) + \frac{b_{n-1}}{2 g_{n+1}} \log\Lambda_0 + O\!\left( \frac{1}{\Lambda_0} \right) , \tag{7.4} \]
where $\cdots$ are the $\Lambda_0$-independent pieces. This allows us to make the following
identification using (7.3) and (7.4):
\[ b_{n-1} = -4 g_{n+1} \sum_{j=1}^{n} S_j . \]
Comparing with (5.25), we see that we must have $\sum_j \langle S_j\rangle = -L\Lambda^N$, where both sides can
be solved for in terms of the $g_r$ and $\Lambda$. As mentioned in Section 4.1, $W(\Lambda_0)$ is an irrelevant
constant that can be ignored. However, we have to deal with the logarithmic dependence
because we want to take $\Lambda_0 \to \infty$ at the end. Notice that, had we included deformations
of degree higher than n − 1, more singular divergences would have appeared in (7.4) that
do not have a counterpart on the gauge theory side. This shows again that, as in (5.25), the
deformation f in $F \sim W'^2 + f$ must have degree at most n − 1.
Since every $\Pi_i$ has the same logarithmic divergence we can write the contribution to the
superpotential as follows:
\[ W_{\rm eff} = \cdots + 2 \sum_{i=1}^{n} N_i \sum_{j=1}^{n} S_j \log\Lambda_0 - 2\pi i \alpha \sum_{k=1}^{n} S_k . \]
Now it is clear that the only way to obtain finite expressions is to take $\alpha$ depending on $\Lambda_0$
such that
\[ N \log\Lambda = N \log\Lambda_0 - \pi i \alpha \tag{7.5} \]
is finite. Using $\sum_{j=1}^{n} N_j = N$, we can replace $\Lambda_0$ in $W_{\rm eff}$ by the physical scale $\Lambda$ of the
SU(N) theory.

Note that, for fixed $\Lambda$, the superpotential for a splitting of the form $KN \to \sum_{i=1}^{n} KN_i$
has a trivial K dependence:
\[ -\frac{1}{2\pi i} W_{\rm eff} = \sum_{i=1}^{n} K N_i \Pi_i = K \sum_{i=1}^{n} N_i \Pi_i , \]
if we replace $\Lambda_0$ by $\Lambda$ in the $\Pi_i$'s by using the $\alpha$ term. This matches with the results
obtained from the gauge theory solution (5.30) using Chebyshev polynomials.
Some of the $S_i$ dependence of $\Pi_i$ can also be determined by using monodromy
arguments. Consider the semiclassical regime, $|a_i^+ - a_i^-| \ll |a_j - a_k|$ for all i, j, k. Recall
that $W'(x)^2 + f_{n-1}(x) = g_{n+1}^2 \prod_{k=1}^{n} (x - a_k^+)(x - a_k^-)$. In this regime $S_i$ can be written as
follows:
\[ S_i = \frac{1}{2\pi i}\, W''(a_i) \int_{a_i^-}^{a_i^+} \sqrt{(x - a_i)^2 - \Delta_{\rm eff}}\; dx , \]
where we have Taylor-expanded $W'(x)^2 + f_{n-1}(x)$ around $x = a_i$ and
\[ \Delta_{\rm eff} \equiv -\frac{1}{W''(a_i)^2}\, f_{n-1}(a_i) + \cdots . \]
Each $S_i$, in this limit, has been reduced to that of the single conifold, which has
$S_i = W''(a_i) \Delta_{\rm eff}$ up to a numerical coefficient. On the other hand, it is easy to see that under
$\Delta_{\rm eff} \to e^{2\pi i} \Delta_{\rm eff}$, $\Pi_i$ changes by $\Delta\Pi_i = S_i$. Therefore, we conclude that
\[ \Pi_i = \frac{1}{2\pi i} S_i \log\Delta_{\rm eff} + \cdots = \frac{1}{2\pi i} S_i \log\frac{S_i}{W''(a_i)} + \cdots . \]
Fig. 3. (a) Contours of integration for $S_j$, $S_i$ and $\Pi_i$ before moving the j-th $S^3$ around the i-th $S^3$.
(b) The $\Pi_i$ contour goes around the j-th sphere after the operation in (a).
Finally, we want to consider what happens to $\Pi_i$ when we move the j-th 3-sphere all
the way around the i-th 3-sphere. This corresponds to changing $\Delta_{ij} = a_i - a_j$ to $e^{2\pi i} \Delta_{ij}$
leaving $a_i$ fixed. Under this operation we get $\Delta\Pi_i = 2S_j$ (see Fig. 3). Therefore,
\[ \Pi_i = \cdots + \frac{2}{2\pi i} \sum_{j\neq i} S_j \log\Delta_{ij} . \]
Now we can collect all these partial results in order to write
\[ 2\pi i\, \Pi_i = S_i \log\frac{S_i}{W''(a_i)} + 2 \sum_{j\neq i} S_j \log\Delta_{ij} - 2 \sum_{k=1}^{n} S_k \log\Lambda_0 + \cdots . \]
Plugging this back in (7.1) and collecting all the $S_i$ pieces, we get
\[ W_{\rm eff} = \sum_{i=1}^{n} S_i \log\left[ \frac{W''(a_i)^{N_i} \prod_{j\neq i} \Delta_{ij}^{-2N_j}\, \Lambda^{2N}}{S_i^{N_i}} \right] + \cdots , \]
with the $\cdots$ single-valued.
Comparing this to (5.8) and (5.6) we see that we have rederived the approximate $W_{\rm eff}$
obtained in Section 5.1 as well as the naive threshold matching relations. However, the
above analysis cannot rule out further corrections to each $\Pi_i$ and hence to $W_{\rm eff}$ in the
form of a power series in the $S_i$'s. Indeed, as we will discuss in detail for the case of the cubic
superpotential, there is generally an infinite power series in the $S_i$'s which corrects the above
expression.
8. Cubic superpotential from geometry: an explicit computation

In this section we consider the n = 2 case, deforming the $\mathcal{N}=2$ theory by
$W_{\rm tree} = \lambda u_1 + m u_2 + g u_3$. This was discussed in detail from the gauge theory perspective in
Section 6. We now focus on the geometry side of the duality. In order to get the contribution
of the fluxes to the superpotential, we need to compute the periods of the relevant cycles in
the geometry. For this n = 2 case, (7.1) gives
\[ -\frac{1}{2\pi i} W_{\rm eff} = N_1 \Pi_1 + N_2 \Pi_2 + \alpha (S_1 + S_2) . \tag{8.1} \]
The fundamental periods are given as in (4.4) by
\[ S_1 = \frac{1}{2\pi i} \int_{x_3}^{x_4} \omega , \qquad S_2 = \frac{1}{2\pi i} \int_{x_1}^{x_2} \omega , \tag{8.2} \]
and the dual periods by
\[ \Pi_1 = \frac{1}{2\pi i} \int_{x_3}^{\Lambda_0} \omega , \qquad \Pi_2 = \frac{1}{2\pi i} \int_{x_1}^{\Lambda_0} \omega , \tag{8.3} \]
where we have denoted by $x_i$ the roots of the quartic polynomial $W'(x)^2 + f_1(x)$ appearing
in the definition of the effective one-form $\omega$, instead of $a_i^+$, $a_i^-$ as in the last section, in order to
simplify the notation.
To compute the effective superpotential, we need to express the dual periods $\Pi_1$ and
$\Pi_2$ in terms of the fundamental periods $S_1$ and $S_2$. Since, on the gauge theory side, one
does not have the exact answer for the superpotential in terms of the glueball fields, we
need to integrate out the $S_i$, fixing them at their supersymmetric vacua $\langle S_i\rangle$. This will give
$W_{\rm exact}(\lambda, m, g, \Lambda)$, which can be compared with the gauge theory results.
Recall that $\lambda$ is a free parameter only for the U(N) theory. For SU(N), which we will
also compare, $\lambda$ is a Lagrange multiplier imposing (quantum) tracelessness; this will fix
$\langle\lambda\rangle$ in terms of m, g and $\Lambda$ and the $N_i$.
8.1. Computation of the periods

As discussed in the general case in Section 7, monodromy arguments alone suffice
to show the general form of the $S_i$ dependence of the dual periods. In our case,
this reads
\[ \Pi_1 = \frac{1}{2\pi i} \left[ W(\Lambda_0) - W(a_1) + S_1 \log\frac{S_1}{g\Delta} - S_1 + 2S_2 \log\Delta - 2(S_1 + S_2) \log\Lambda_0 + P \right] , \tag{8.4} \]
where $P = P(S_1, S_2)$ is an infinite power series in $S_1$ and $S_2$, $\Delta = a_1 - a_2$ and
$W(x) = (1/3) g x^3 + (1/2) m x^2 + \lambda x$. Recall that $W'(x) = g(x - a_1)(x - a_2)$ was introduced in
Section 6. Use has also been made of $W''(a_1) = g\Delta$.
The explicit computation of P (S1 , S2 ) can be found in Appendix B up to order Si4 where
a method to compute higher order contributions is also given. Here we will only show the
result for 1 and 2 that will be used later in this section.

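$$\begin{aligned}
2\pi i\,\Pi_1 ={}& W(\Lambda_0) - W(a_1) + S_1\Big(\log\frac{S_1}{g\Delta} - 1\Big) + 2S_2\log\Delta - 2(S_1 + S_2)\log\Lambda_0 \\
&+ g\Delta^3\Big[\frac{1}{(g\Delta^3)^2}\big(2S_1^2 - 10S_1S_2 + 5S_2^2\big) \\
&\qquad + \frac{1}{(g\Delta^3)^3}\Big(\frac{32}{3}S_1^3 - 91S_1^2S_2 + 118S_1S_2^2 - \frac{91}{3}S_2^3\Big) \\
&\qquad + \frac{1}{(g\Delta^3)^4}\Big(\frac{280}{3}S_1^4 - \frac{3484}{3}S_1^3S_2 + 2636S_1^2S_2^2 - \frac{5272}{3}S_1S_2^3 + \frac{871}{3}S_2^4\Big) + O\Big(\frac{S^5}{(g\Delta^3)^5}\Big)\Big]
\end{aligned}$$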
and

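$$\begin{aligned}
2\pi i\,\Pi_2 ={}& W(-\Lambda_0) - W(a_2) + S_2\Big(\log\frac{S_2}{-g\Delta} - 1\Big) + 2S_1\log\Delta - 2(S_1 + S_2)\log\Lambda_0 \\
&- g\Delta^3\Big[\frac{1}{(g\Delta^3)^2}\big(2S_2^2 - 10S_1S_2 + 5S_1^2\big) \\
&\qquad - \frac{1}{(g\Delta^3)^3}\Big(\frac{32}{3}S_2^3 - 91S_2^2S_1 + 118S_2S_1^2 - \frac{91}{3}S_1^3\Big) \\
&\qquad + \frac{1}{(g\Delta^3)^4}\Big(\frac{280}{3}S_2^4 - \frac{3484}{3}S_2^3S_1 + 2636S_2^2S_1^2 - \frac{5272}{3}S_2S_1^3 + \frac{871}{3}S_1^4\Big) + O\Big(\frac{S^5}{(g\Delta^3)^5}\Big)\Big].
\end{aligned}$$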
8.2. Low energy superpotential

In order to compute the low energy superpotential we have to integrate out $S_1$ and $S_2$ from the effective superpotential. To do this in practice, it is convenient to define
$$x \equiv \frac{S_1}{g\Delta^3}, \qquad y \equiv -\frac{S_2}{g\Delta^3}.$$
In terms of these new variables the dual periods can be written as follows:
$$\Pi_1(x, y) = \frac{1}{2\pi i}\big[-W(a_1) + g\Delta^3 F(x, y)\big], \qquad \Pi_2(x, y) = \frac{1}{2\pi i}\big[-W(a_2) + g\Delta^3 F(y, x)\big],$$
where
$$\begin{aligned}
F(x, y) ={}& x(\log x - 1) - 2(x - y)\log\frac{\Lambda_0}{\Delta} + 2x^2 + 10xy + 5y^2 \\
&+ \Big(\frac{32}{3}x^3 + 91x^2y + 118xy^2 + \frac{91}{3}y^3\Big) \\
&+ \Big(\frac{280}{3}x^4 + \frac{3484}{3}x^3y + 2636x^2y^2 + \frac{5272}{3}xy^3 + \frac{871}{3}y^4\Big) + \cdots.
\end{aligned}$$
Note that we have removed the irrelevant constants $W(\Lambda_0)$ in $\Pi_1$ and $W(-\Lambda_0)$ in $\Pi_2$.

Now the effective superpotential is given by
$$-\frac{1}{2\pi i}\,W_{\rm eff}(x, y) = N_1\Pi_1 + N_2\Pi_2 + \alpha\, g\Delta^3 (x - y). \tag{8.5}$$
Let us separate the contributions to (8.5) as
$$W_{\rm eff}(x, y) = W_{\rm cl} + W_{\rm np}(x, y),$$
where $W_{\rm cl} = N_1W(a_1) + N_2W(a_2)$ and $W_{\rm np}(x, y) = g\Delta^3\,(N_1F(x, y) + N_2F(y, x))$. In this expression, the cutoff $\Lambda_0$ gets combined with the bare coupling to generate what we identify with the gauge theory scale $\Lambda$ of the underlying N = 2 SU(N) Yang–Mills theory, as in (7.5).
Having identified the gauge theory scale, we can proceed to integrate out $S_1$ and $S_2$ or, equivalently, $x$ and $y$. The equations that need to be solved are:
$$\frac{\partial W_{\rm eff}}{\partial x} = 0, \qquad \frac{\partial W_{\rm eff}}{\partial y} = 0.$$
The leading order can be easily extracted and reads:
$$N_1\log(x) = 2(N_1 + N_2)\log(\Lambda/\Delta), \qquad N_2\log(y) = 2(N_1 + N_2)\log(\Lambda/\Delta).$$
Now we can see the appearance of the $N_1N_2$ vacua of the gauge theory from the solutions to the above equations, namely,
$$x^{N_1} = \Big(\frac{\Lambda}{\Delta}\Big)^{2(N_1+N_2)}, \qquad y^{N_2} = \Big(\frac{\Lambda}{\Delta}\Big)^{2(N_1+N_2)}.$$
It is useful to define the expansion parameter
$$T \equiv \Big(\frac{\Lambda}{\Delta}\Big)^{\frac{2(N_1+N_2)}{N_1N_2}},$$
and the solution is then given by
$$x = T^{N_2}, \qquad y = T^{N_1},$$
where the choice of the $N_1N_2$-th root will determine the vacuum.
Note that the meaning of leading order depends on the values of $N_1$ and $N_2$. Assuming a power series expansion for $x$ and $y$ in $T$, we can compute $W_{\rm low}$ order by order. This gives us the answer for the U(N) theory. To obtain the answer for SU(N), we only have to impose that the quantum trace of the chiral superfield be zero: $\langle{\rm Tr}\,\Phi\rangle = \partial W_{\rm low}(\lambda)/\partial\lambda = 0$. This should be imposed order by order in $T$.
8.3. Quantum tracelessness
Let us start by writing
$$W_{\rm low}(\Lambda, \lambda) = N_1W(a_1) + N_2W(a_2) + g\Delta^3P(T)$$
with $P(T) = N_1F(x(T), y(T)) + N_2F(y(T), x(T))$. It is then easy to see that
$$\frac{\partial W_{\rm low}(\Lambda, \lambda)}{\partial\lambda} = N_1a_1 + N_2a_2 - 2\Delta\Big[3P(T) - \frac{2(N_1 + N_2)}{N_1N_2}\,T\,\frac{dP(T)}{dT}\Big], \tag{8.6}$$
where it was important to remember that $T$ itself depends on $\lambda$ through $\Delta$. Therefore, we are forced to define a better expansion parameter given by
$$t = \Big(\frac{\Lambda}{\Delta_c}\Big)^{\frac{2(N_1+N_2)}{N_1N_2}},$$
where $\Delta_c$ is computed using the Lagrange multiplier $\lambda_c$ obtained by solving the classical tracelessness constraint
$$\frac{\lambda_c}{g} = -\frac{N_1N_2}{(N_1 - N_2)^2}\Big(\frac{m}{g}\Big)^2. \tag{8.7}$$
Having found $\lambda = \lambda(t)$ such that the quantum trace (8.6) vanishes, we can use it to compute the low energy superpotential for our SU(N) theory, which is now given as a power expansion in $t$. It is possible to give an explicit formula for the first two terms, i.e., the classical contribution and the first quantum correction, for any $N_1$ and $N_2$. Higher order corrections have to be computed independently in each case. Assuming that $N_2 < N_1$, we get
$$W_{\rm low}(t) = \frac{1}{6}\,\frac{m^3}{g^2}\,\frac{N_1N_2(N_1 + N_2)}{(N_1 - N_2)^2}\Big[1 + \frac{6(N_1 + N_2)^2}{N_2(N_1 - N_2)}\,t^{N_2} + O\big(t^{N_2+1}\big)\Big].$$
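The classical prefactor in the formula above can be checked with exact rational arithmetic. The following is a minimal sketch, not taken from the paper: it assumes the classical vacuum has $N_1$ eigenvalues at $a_1$ and $N_2$ at $a_2$, with $W'(x) = gx^2 + mx + \lambda = g(x - a_1)(x - a_2)$ and the tracelessness condition $N_1a_1 + N_2a_2 = 0$. All function names are hypothetical.

```python
from fractions import Fraction

def classical_data(n1, n2, m=Fraction(1), g=Fraction(1)):
    """Classical roots a1, a2 and Lagrange multiplier lambda_c (m, g as Fractions)."""
    # a1 + a2 = -m/g and n1*a1 + n2*a2 = 0 give a1 = -n2*s, a2 = n1*s.
    s = -m / (g * (n1 - n2))
    a1, a2 = -n2 * s, n1 * s
    lam_c = g * a1 * a2  # constant term of W'(x) = g x^2 + m x + lambda
    return a1, a2, lam_c

def w_tree(x, lam, m, g):
    """Cubic superpotential W(x) = g x^3/3 + m x^2/2 + lambda x."""
    return g * x**3 / 3 + m * x**2 / 2 + lam * x

def w_classical(n1, n2, m=Fraction(1), g=Fraction(1)):
    """N1 W(a1) + N2 W(a2) evaluated at the traceless classical vacuum."""
    a1, a2, lam_c = classical_data(n1, n2, m, g)
    return n1 * w_tree(a1, lam_c, m, g) + n2 * w_tree(a2, lam_c, m, g)

def w_classical_closed_form(n1, n2, m=Fraction(1), g=Fraction(1)):
    """(1/6) (m^3/g^2) N1 N2 (N1+N2) / (N1-N2)^2, the leading term quoted above."""
    return Fraction(1, 6) * m**3 / g**2 * n1 * n2 * (n1 + n2) / Fraction((n1 - n2) ** 2)

if __name__ == "__main__":
    for n1, n2 in [(2, 1), (3, 2), (4, 1)]:
        print(n1, n2, w_classical(n1, n2), w_classical_closed_form(n1, n2))
```

For $m = g = 1$ the three splittings considered in Section 8.4 give $1$, $5$ and $10/27$, which are the classical terms appearing in the examples below.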
8.4. Examples
Let us consider the different cases for which the deformed N = 2 field theory results
have been computed in Section 6, in order to compare the answer with that of the geometry.
8.4.1. U(3N) → U(2N) × U(N)

We only need to consider the case U(3) → U(2) × U(1). As we saw in Section 6.1, this is particularly simple from the field theory perspective, where $W_{\rm eff} = W_{\rm cl} \pm 2g\Lambda^3$, with only one quantum correction term. In order to reproduce this simple result, some miraculous cancellations have to occur order by order in our series. Since we have computed the dual periods up to order $S_1^4$, and therefore the effective superpotential up to order $x^4 \sim t^4$, we cannot compare orders equal to or higher than $t^5$, even though they already appear in our computation in the form $xy^2$ or $x^3y$, since $y \sim t^2$.
Let $N_1 = 2$ and $N_2 = 1$. Integrating out $x$ and $y$ we get
$$x(T) = T\big(1 + T + 10T^2 + 140T^3 + \cdots\big), \qquad y(T) = T^2\big(1 + 10T + 140T^2 + \cdots\big).$$
Plugging this back into $W_{\rm eff}$ we get the answer for the U(3) case,
$$W_{\rm low}(T) = W_{\rm cl} + g\Delta^3\big(2T + O(T^5)\big),$$
which is consistent with the exact answer $W = W_{\rm cl} + 2g\Delta^3T$ discussed in Section 6.1, to the order we have computed. One might worry that imposing quantum tracelessness for SU(3) → SU(2) × U(1) could result in $T$ being a complicated expansion in terms of $t$. However, one can check that the classical trace is not corrected quantum mechanically in this case and therefore $T = t$. We thus have
$$W_{\rm low}(t) = \frac{m^3}{g^2}\big(1 + 54t + O(t^5)\big),$$
and, recalling the definition $t = (g\Lambda/3m)^3$, we get
$$W_{\rm low}(t) = \frac{m^3}{g^2} \pm 2g\Lambda^3,$$
in perfect agreement with the field theory result.
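The cancellations described above can be checked numerically. The sketch below is illustrative only and rests on assumptions that are not spelled out in the text: the function extremized is taken to be $w = N_1F(x, y) - N_2F(y, x)$ with the $\log(\Lambda_0/\Delta)$ terms traded for $\log T$ via $T^{N_1N_2} = (\Lambda/\Delta)^{2(N_1+N_2)}$. The relative minus sign is a convention chosen here because it reproduces the leading-order equations of Section 8.2, and the overall factor relating $w$ to $W_{\rm np}$ is left out. The polynomial part of $F$ is truncated at quartic order, exactly as quoted in the text; all names are hypothetical.

```python
import math

# Polynomial part of F(x, y) from Section 8.2: {(i, j): coefficient of x^i y^j}.
F_POLY = {
    (2, 0): 2.0, (1, 1): 10.0, (0, 2): 5.0,
    (3, 0): 32 / 3, (2, 1): 91.0, (1, 2): 118.0, (0, 3): 91 / 3,
    (4, 0): 280 / 3, (3, 1): 3484 / 3, (2, 2): 2636.0,
    (1, 3): 5272 / 3, (0, 4): 871 / 3,
}

def combined_poly(n1, n2):
    """q(x, y) = n1*p(x, y) - n2*p(y, x); the relative sign is an assumption."""
    q = {}
    for (i, j), c in F_POLY.items():
        q[(i, j)] = q.get((i, j), 0.0) + n1 * c
        q[(j, i)] = q.get((j, i), 0.0) - n2 * c
    return q

def evaluate(q, x, y):
    return sum(c * x**i * y**j for (i, j), c in q.items())

def gradient(q, x, y):
    dx = sum(c * i * x**(i - 1) * y**j for (i, j), c in q.items() if i > 0)
    dy = sum(c * j * x**i * y**(j - 1) for (i, j), c in q.items() if j > 0)
    return dx, dy

def solve_saddle(n1, n2, T, iterations=200):
    """Fixed-point iteration for dw/dx = dw/dy = 0, starting from x = T^n2, y = T^n1."""
    q = combined_poly(n1, n2)
    x, y = T**n2, T**n1
    for _ in range(iterations):
        dx, dy = gradient(q, x, y)
        x = T**n2 * math.exp(-dx / n1)   # from n1*log x - n1*n2*log T + dq/dx = 0
        y = T**n1 * math.exp(dy / n2)    # from -n2*log y + n1*n2*log T + dq/dy = 0
    w = (n1 * x * (math.log(x) - 1) - n2 * y * (math.log(y) - 1)
         - n1 * n2 * (x - y) * math.log(T) + evaluate(q, x, y))
    return x, y, w

if __name__ == "__main__":
    T = 1e-3
    x, y, w = solve_saddle(2, 1, T)
    print(x / T, y / T**2, w / T)
```

For small $T$ the iteration should reproduce the series for $x(T)$ and $y(T)$ quoted above, and the stationary value of $w$ should reduce to a single term linear in $T$ up to corrections of order $T^5$, which is the cancellation the text describes.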

We can also use the geometry analysis to obtain the gauge coupling of the IR U(1) gauge theory photon, and compare with the field theory analysis. The field theory result obtained in Section 6.1 is that the original SU(3) curve degenerates as $P_3^2 - 4\Lambda^6 = (x - m/g)^2F_4(x)$, with $g^2F_4(x) = W'(x)^2 \pm 4g^2\Lambda^3(x + 2m/g)$. The remaining massless photon has gauge coupling given by the complex modulus of the torus $y^2 = F_4(x)$. This matches perfectly with the geometry result if, at the extremum of our effective superpotential for $S$, we have $f_1(x; \langle S\rangle) = \pm 4g^2\Lambda^3(x + 2m/g)$. Strikingly, this is indeed the case.
8.4.2. U(5N) → U(3N) × U(2N)
In this case the deformed N = 2 field theory analysis predicts an infinite series, discussed in Section 6.1. From the dual geometry, to the order we have computed, we will be able to compare up to order $t^9$, because $x \sim t^2$ and $y \sim t^3$; the $t^{10}$ term therefore receives corrections from the $x^5$ terms.
Let N1 = 3 and N2 = 2. Integrating out x and y we get

8 2 10 3
2
4
5
6
7
x(T ) = T 1 + T T + a4 T + a5 T + a6 T + a7 T +
3
3
and

y(T ) = T 3 1 + 5T 2 + 11T 3 + b4 T 4 + b5 T 5 + b6 T 6 + .
The undetermined coefficients are shown to stress the fact that they do not contribute to
the order we are computing, despite being allowed by naive power counting. Plugging this
back in Weff we get the answer for U (5):

3
Wlow (T ) = Wcl + g 3T 2 2T 3 + 4T 4 10T 5

85 6 266 7 8170 8 3332 9
T +
T
T + .
+ T
3
3
27
3
In this case, we do have to take care with the quantum corrections to the trace, in order
to get the correct SU(5) superpotential. It turns out that

c
25
100 3 550 4 10400 5
1 t2 +
=
t
t +
t
g
g
3
3
3
9

508300 7 11338250 8
6
t
t + .
7875t +
9
27
Using this to compute T = T (t), a1 = a1 (t), and a2 = a2 (t), and plugging back in the
effective superpotential, we get

250 m3 1
+ 3t 2 2t 3 + 6t 4 26t 5 + 135t 6
Wlow (t) =
2 g 2 25

14630 8
7
9
t 32076t + .
782t +
3
This is in perfect agreement with the deformed N = 2 field theory answer.
8.4.3. U(5N) → U(4N) × U(N)
The deformed N = 2 field theory analysis again predicts an infinite series for $W_{\rm eff}$. This is also seen from the geometry dual, and we will be able to compare up to order $t^4$, since we have computed the dual periods to order $S_1^4$. Let $N_1 = 4$ and $N_2 = 1$. Integrating out $x$ and $y$ we get

47 2 73 3
3
x(T ) = T 1 T T T + ,
y(T ) = T 4 + O T 5 .
2
8
2
Plugging this back in the effective superpotential we get the U (4) answer:

47 3 75 4
3
2
Wlow = Wcl + g 4T 3T T T + .
6
2
For the SU (4) case, the vanishing of the quantum corrected trace implies that

c
25
175 3
25
=
t + .
1 + t + t2 +
g
g
3
9
72
Using this as in the previous case, we finally get the low energy superpotential to be

5
1 2
125 m3 2
7 3
5 4
+
4t
t
t
t
+
O
t
.
Wlow =
27 g 2 25
3
54
54
This exactly agrees, to this order, with the expected answer.
Acknowledgements
We would like to thank S. Katz for participation at the initial stages of this project. We
would also like to thank M. Aganagic, J. Edelstein and K. Hori for valuable discussions.
The work of F.C. and C.V. is supported in part by NSF grants PHY-9802709 and DMS
9709694. The work of K.I. is supported by DOE-FG03-97ER40546.
Appendix A. Deformed N = 2 field theory analysis for U(5) → U(3) × U(2)

We here find the supersymmetric vacua of the deformed N = 2 theory for one of
the splittings of U (5). The analysis goes along much the same lines as for SU(5). We
parameterize
2

2
P5 (x) = x (q + a) x (q + b) x (q + c) 1.
(A.1)
For SU(5), q was fixed by the tracelessness condition but now it is a free parameter. Since
a and b appear in a symmetric way it turns out to be useful to define s = a + b + 2q and
k = (a + q)(b + q). The constraints are now given by
k = q 2 q(2q s) + 2c(2q s),
4(2q s)2 c3 = 1.
From (A.1) we can read off u1 , u2 , and u3 using that

1 2
1 3
u1 u2 x 3 +
u1 + u1 u2 u3 x 2 + .
P5 (x) = x 5 u1 x 4 +
2
6
Plugging ui = ui (q, s, c) in Weff and introducing a Lagrange multiplier in order to impose
the constraint left after we eliminate k, we get

Weff = gu3 + mu2 + u1 + h 5 4(2q s)2 c3 .
The equations we need to solve are given by Weff /c = 0, Weff /s = 0, and
Weff /q = 0. Using the first to eliminate h in the second and the third, these equations
simplify to

+ m(q + c + s) + g q 2 + c2 + 3cs + s 2 2q(2c + s) = 0,

5 m(q + c + 2s) g 5q 2 14qc + c2 4qs + 8cs + 2s 2 = 0,
4(2q s)2 c3 = 5 .
In order to find an expansion parameter around the classical solution we have to take the limit $\Lambda = 0$ and solve the equations. We find that
$$q = \frac{-m + \sqrt{m^2 - 4g\lambda}}{2g}, \qquad s = -\frac{m}{g}.$$
Therefore, $(2q - s) = \sqrt{m^2/g^2 - 4\lambda/g} = \Delta$, and it is clear that the expansion parameter is $T$ given by $T^6 = (\Lambda/\Delta)^{10}$. Again, there are six possible solutions giving the six possible vacua $N_1N_2 = 6$, since $N_1 = 2$ and $N_2 = 3$.
Solving these equations assuming a power expansion in T for s = s(T ), q = q(T ) and
c = c(T ), we get after plugging back in the effective superpotential,

$$W_{\rm low} = 3W(a_1) + 2W(a_2) + g\Delta^3\big(3T^2 + 2T^3 + 4T^4 + 10T^5 + \cdots\big),$$
where $W(x) = \frac{g}{3}x^3 + \frac{m}{2}x^2 + \lambda x$, $W'(x) = g(x - a_1)(x - a_2)$ and $\Delta = a_1 - a_2$.
Appendix B. Computation of periods for the cubic superpotential

In this appendix we show the explicit computation of the corrections $P(S_1, S_2)$ in the expression for $\Pi_1$ in (8.4).
The computation of $P(S_1, S_2)$ will not be done directly in terms of $S_1$ and $S_2$. Instead, we will write all four periods in terms of two new variables, $\Delta_{21}$ and $\Delta_{43}$, to be defined below, and at the end we will recollect $P(S_1, S_2)$. This procedure can be carried out systematically up to any order in the $S_i$.
B.1. Computation
For practical purposes we will write the effective one-form as follows:
$$dx\,\sqrt{W'^2(x) + f_1(x)} = dx\; g\sqrt{(x - x_1)(x - x_2)(x - x_3)(x - x_4)}. \tag{B.1}$$
It is also convenient to define new variables given by
$$\Delta_{21} \equiv \frac{1}{2}(x_2 - x_1), \qquad \Delta_{43} \equiv \frac{1}{2}(x_4 - x_3),$$
$$Q \equiv \frac{1}{2}(x_1 + x_2 + x_3 + x_4), \qquad I \equiv \frac{1}{2}\big[(x_3 + x_4) - (x_1 + x_2)\big].$$
It is clear that, since $f_1(x)$ is considered a small perturbation, we will have
$$|\Delta_{21}|,\ |\Delta_{43}| \ll |I|.$$
We will use this in order to expand all four periods in powers of $\Delta_{21}$ and $\Delta_{43}$.
Let us consider S1 . For this we change variables to y = x 12 (x1 + x2 ) and the integral
becomes:
g
S1 =
2
y4
y3

(y y3 )(y y4 ) y 2 221 .
Expanding the second square root for 21 small, each term in the series can be computed
explicitly and it is most easily given in terms of a generating function

F (a) (y3 + a)(y4 + a) + (y3 + y4 + 2a)

(B.2)
2
as follows:
g
g
(n)
(y3 + y4 )(y4 y3 )2 +
cn 2n
(0),
21 F
32
2
n=1
where cn are the coefficients in the expansion of 1 x and F (n) (a) is the nth derivative
with respect to a.
The explicit answer has the following structure:

g
g
S1 = 243 I K 221 , 243 , I 2 ,
(B.3)
4
2I
where

1
1
1
1
2
K(x, y, z) = xy 1 + (x + y) + 2 (x + y) + 2 xy + .
4
4z
8z
8z
S1 =
It is important to notice that this is symmetric in (x, y), namely, K(x, y, z) = K(y, x, z).
This allows us to write

g
g
S2 = 221 I + K 221 , 243 , I 2 .
(B.4)
4
2I
Let us now compute the dual periods starting with 1 . In this case we can use the same
expansion as before for S1 , however, we have to keep in mind that 0 will be taken to
infinity at the end and therefore we shall discard any contribution of order 1
0 or higher
in an expansion around infinity.
In this case it is also useful to define a generating function

(I + a) + 43 + (I + a) 43
2
2
,
G(a) = (I + a) 43 log
(B.5)
(I + a) + 43 (I + a) 43
and the answer is given by

2i
1
1
1
1 = 30 Q20 + Q2 I 2 2 243 + 221 0
g
3
2
4

1 2
1
1
+ I 21 243 log 0 (I + Q)3 + I (I + Q)2
2
24
8

1 2
1 2
1 2
2
+ 21 (I + Q) + 43 Q + I 43 21 log(243)
4
4
2

(n)
+
cn 2n
21 G (0),
n=1
(B.6)
where cn are as before the coefficients of the power expansion of 1 x.

This result is not enough because we want it to show only explicit dependence on
classical superpotential parameters m, g, and two deformation parameters 21 and 43 .
In order to do this we only have to realize that, since $f_1(x)$ in (B.1) is of degree one and $W'^2(x)$ of degree four, the coefficients of $x^3$ and $x^2$ are given in terms of the classical roots $a_1$ and $a_2$. This allows us to write
$$Q = a_1 + a_2, \qquad I^2 = (a_1 - a_2)^2 - 2\big(\Delta_{21}^2 + \Delta_{43}^2\big).$$
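Using this, (B.3) and (B.4), we can explicitly compare order by order in $\Delta_{21}$ and $\Delta_{43}$ the two expressions for $\Pi_1$ given by (B.6) and (8.4) to obtain the following result:
$$\begin{aligned}
2\pi i\,\Pi_1 ={}& W(\Lambda_0) - W(a_1) + S_1\Big(\log\frac{S_1}{g\Delta} - 1\Big) + 2S_2\log\Delta - 2(S_1 + S_2)\log\Lambda_0 \\
&+ g\Delta^3\Big[\frac{1}{(g\Delta^3)^2}\big(2S_1^2 - 10S_1S_2 + 5S_2^2\big) \\
&\qquad + \frac{1}{(g\Delta^3)^3}\Big(\frac{32}{3}S_1^3 - 91S_1^2S_2 + 118S_1S_2^2 - \frac{91}{3}S_2^3\Big) \\
&\qquad + \frac{1}{(g\Delta^3)^4}\Big(\frac{280}{3}S_1^4 - \frac{3484}{3}S_1^3S_2 + 2636S_1^2S_2^2 - \frac{5272}{3}S_1S_2^3 + \frac{871}{3}S_2^4\Big) + O\Big(\frac{S^5}{(g\Delta^3)^5}\Big)\Big].
\end{aligned}$$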
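Likewise, we can get $\Pi_2$ from the above result by simply exchanging $a_1 \leftrightarrow a_2$, $S_1 \leftrightarrow S_2$, and $\Lambda_0 \leftrightarrow -\Lambda_0$. This leads to
$$\begin{aligned}
2\pi i\,\Pi_2 ={}& W(-\Lambda_0) - W(a_2) + S_2\Big(\log\frac{S_2}{-g\Delta} - 1\Big) + 2S_1\log\Delta - 2(S_1 + S_2)\log\Lambda_0 \\
&- g\Delta^3\Big[\frac{1}{(g\Delta^3)^2}\big(2S_2^2 - 10S_1S_2 + 5S_1^2\big) \\
&\qquad - \frac{1}{(g\Delta^3)^3}\Big(\frac{32}{3}S_2^3 - 91S_2^2S_1 + 118S_2S_1^2 - \frac{91}{3}S_1^3\Big) \\
&\qquad + \frac{1}{(g\Delta^3)^4}\Big(\frac{280}{3}S_2^4 - \frac{3484}{3}S_2^3S_1 + 2636S_2^2S_1^2 - \frac{5272}{3}S_2S_1^3 + \frac{871}{3}S_1^4\Big) + O\Big(\frac{S^5}{(g\Delta^3)^5}\Big)\Big].
\end{aligned}$$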
This completes our computation of the periods.
References
[1] R. Gopakumar, C. Vafa, On the gauge theory/geometry correspondence, hep-th/9811131.
[2] C. Vafa, Superstrings and topological strings at large N, hep-th/0008142.
[3] M. Atiyah, J. Maldacena, C. Vafa, An M-theory flop as a large N duality, hep-th/0011256.
[4] B.S. Acharya, On realising N = 1 super Yang–Mills in M theory, hep-th/0011089.
[5] S. Kachru, S. Katz, A. Lawrence, J. McGreevy, Open string instantons and superpotentials, hep-th/9912151, Phys. Rev. D 62 (2000) 026001.
[6] T.R. Taylor, C. Vafa, RR flux on Calabi–Yau and partial supersymmetry breaking, hep-th/9912152, Phys. Lett. B 474 (2000) 130.
[7] P. Mayr, On supersymmetry breaking in string theory and its realization in brane worlds, hep-th/0003198, Nucl. Phys. B 593 (2001) 99.
[8] I. Antoniadis, H. Partouche, T.R. Taylor, Spontaneous breaking of N = 2 global supersymmetry, hep-th/9512006, Phys. Lett. B 372 (1996) 83;
S. Ferrara, L. Girardello, M. Porrati, Spontaneous breaking of N = 2 to N = 1 in rigid and local
supersymmetric theories, hep-th/9512180, Phys. Lett. B 376 (1996) 275;
H. Partouche, B. Pioline, Partial spontaneous breaking of global supersymmetry, hepth/9702115, Nucl. Phys. Proc. Suppl. B 56 (1997) 322.
[9] D. Kutasov, A comment on duality in N = 1 supersymmetric non-Abelian gauge theories, hepth/9503086, Phys. Lett. B 351 (1995) 230;
D. Kutasov, A. Schwimmer, On duality in supersymmetric YangMills theory, hep-th/9505004,
Phys. Lett. B 354 (1995) 315;
D. Kutasov, A. Schwimmer, N. Seiberg, Chiral rings, singularity theory, and electricmagnetic
duality, hep-th/9510222, Nucl. Phys. B 459 (1996) 455.
[10] I.R. Klebanov, M.J. Strassler, Supergravity and a confining gauge theory: duality cascades and
SB-resolution of naked singularities, hep-th/0007191, JHEP 0008 (2000) 052.
[11] J.M. Maldacena, C. Nunez, Towards the large N limit of pure N = 1 super-YangMills, hepth/0008001, Phys. Rev. Lett. 86 (2001) 588.
[12] A. Klemm, W. Lerche, P. Mayr, C. Vafa, N. Warner, Self-dual strings and N = 2 supersymmetric field theory, hep-th/9604034.
[13] S. Gukov, C. Vafa, E. Witten, CFTs from CalabiYau four-folds, hep-th/9906070, Nucl. Phys.
B 584 (2000) 69.
[14] A. Shapere, C. Vafa, BPS structure of ArgyresDouglas superconformal theories, hepth/9910182.
[15] G. Veneziano, S. Yankielowicz, Phys. Lett. B 113 (1982) 321;
T.R. Taylor, G. Veneziano, S. Yankielowicz, Nucl. Phys. B 218 (1983) 493.
[16] A. Strominger, Massless black holes conifolds in string theory, hep-th/9504090, Nucl. Phys.
B 451 (1995) 96.
[17] E. Witten, Baryons and branes in anti-de-Sitter space, hep-th/9805112, JHEP 9807 (1998) 006.
[18] D.J. Gross, H. Ooguri, Aspects of large N gauge theory dynamics as seen by string theory,
hep-th/9805129, Phys. Rev. D 58 (1998) 106002.
[19] S. Gubser, I. Klebanov, Baryons and domain walls in an N = 1 superconformal gauge theory,
hep-th/9808075, Phys. Rev. D 58 (1998) 125025.
[20] M. Aganagic, C. Vafa, Mirror symmetry, D-branes and counting holomorphic discs, hepth/0012041.
[21] S. Katz, D. Morrison, Gorenstein threefold singularities with small resolutions via invariant
theory for Weyl groups, J. Alg. Geom. 1 (1992) 449530.
[22] P.C. Argyres, M.R. Douglas, New phenomena in SU(3) supersymmetric gauge theory, hepth/9595062, Nucl. Phys. B 448 (1995) 93;
P.C. Argyres, M.R. Plesser, N. Seiberg, E. Witten, New N = 2 superconformal field theories,
hep-th/9511154, Nucl. Phys. B 461 (1996) 71.
[23] K. Intriligator, R.G. Leigh, N. Seiberg, Exact superpotentials in four dimensions, hepth/9403198, Phys. Rev. D 50 (1994) 1092.
[24] N. Seiberg, Naturalness versus supersymmetric non-renormalization theorems, hepph/9309335, Phys. Lett. B 318 (1993) 469.
[25] N. Seiberg, E. Witten, Monopole condensation and confinement in N = 2 supersymmetric
YangMills theory, hep-th/9407087, Nucl. Phys. B 426 (1994) 19.
[26] P.C. Argyres, A.E. Faraggi, The vacuum structure and spectrum of N = 2 supersymmetric
SU(N) gauge theory, hep-th/9411057, Phys. Rev. Lett. 74 (1995) 3931.
[27] A. Klemm, W. Lerche, S. Theisen, S. Yankielowicz, Simple singularities and N = 2
supersymmetric YangMills theory, hep-th/9411048, Phys. Lett. B 344 (1995) 169.
[28] K. Intriligator, Integrating in and exact superpotentials in 4d, hep-th/9407106, Phys. Lett.
B 336 (1994) 409.
[29] M. Douglas, S. Shenker, Dynamics of SU(N) supersymmetric gauge theory, hep-th/9503163,
Nucl. Phys. B 447 (1995) 271.
[30] S. Elitzur, A. Forge, A. Giveon, K. Intriligator, E. Rabinovici, Massless monopoles via
confining phase superpotentials, hep-th/9603051, Phys. Lett. B 379 (1996) 121.
[31] J. de Boer, Y. Oz, Monopole condensation and confining phase of N = 1 gauge theories via
M-theory fivebrane, hep-th/9708044, Nucl. Phys. B 511 (1998) 155.
[32] A. Gorsky, I. Krichever, A. Marshakov, A. Mironov, A. Morozov, Integrability and Seiberg
Witten exact solution, hep-th/9505035, Phys. Lett. B 355 (1995) 466;
E. Martinec, N. Warner, Integrable systems and supersymmetric gauge theory, hep-th/9509161,
Nucl. Phys. B 549 (1996) 97;
T. Nakatsu, K. Takasaki, WhithamToda hierarchy and N = 2 supersymmetric YangMills
theory, hep-th/9505162, Mod. Phys. Lett. A 11 (1996) 157;
J.D. Edelstein, M. Marino, J. Mas, Whitham hierarchies, instanton corrections, and soft
supersymmetry breaking in N = 2 SU(N) super-YangMills theory, hep-th/9805172, Nucl.
Phys. B 541 (1999) 671.
[33] E. Witten, Solutions of four-dimensional field theories via M theory, hep-th/9703166, Nucl.
Phys. B 500 (1997) 3;
K. Hori, H. Ooguri, Y. Oz, Strong coupling dynamics of four-dimensional N = 1 gauge theories
from M-theory fivebrane, hep-th/9706082, Adv. Theor. Math. Phys. 1 (1998) 1;
E. Witten, Branes and the dynamics of QCD, hep-th/9706109, Nucl. Phys. B 507 (1997) 658.

Non-singlet structure functions beyond

the next-to-next-to-leading order
W.L. van Neerven, A. Vogt
Instituut-Lorentz, University of Leiden, P.O. Box 9506, 2300 RA Leiden, The Netherlands
Received 13 March 2001; accepted 28 March 2001
Abstract
We study the evolution of the flavour non-singlet deep-inelastic structure functions $F_{2,\rm NS}$ and $F_3$ at the next-to-next-to-next-to-leading order (N$^3$LO) of massless perturbative QCD. The present information on the corresponding three-loop coefficient functions is used to derive approximate expressions of these quantities which prove completely sufficient for values $x > 10^{-2}$ of the Bjorken variable. The inclusion of the N$^3$LO corrections reduces the theoretical uncertainty of $\alpha_s$ determinations from non-singlet scaling violations arising from the truncation of the perturbation series to less than 1%. We also study the predictions of the soft-gluon resummation, of renormalization-scheme optimizations by the principle of minimal sensitivity (PMS) and the effective charge (ECH) method, and of the Padé summation for the structure-function evolution kernels. The PMS, ECH and Padé approaches are found to facilitate a reliable estimate of the corrections beyond N$^3$LO. © 2001 Elsevier Science B.V. All rights reserved.
PACS: 12.38.Bx; 12.38.Cy; 13.60.Hb
1. Introduction
Structure functions in inclusive deep-inelastic lepton–nucleon scattering (DIS) are among the observables best suited for precise determinations of the strong coupling constant $\alpha_s$. At present their experimental uncertainties result in an error $\Delta_{\rm exp}\alpha_s(M_Z^2) \simeq 0.002$ at the mass of the Z-boson [1]. A further reduction of this error can be expected, especially from measurements at the electron–proton collider HERA after the forthcoming luminosity upgrade. The standard next-to-leading order (NLO) approximation of perturbative QCD summarized in Ref. [2], on the other hand, leads to a theoretical error $\Delta_{\rm th}\alpha_s(M_Z^2) \simeq 0.005$. This error is dominated by the uncertainty due to the truncation of the perturbation series as estimated from the renormalization-scale dependence. Hence calculations beyond NLO are required to make full use of the present and forthcoming data on structure functions.
E-mail address: avogt@lorentz.leidenuniv.nl (A. Vogt).
0550-3213/01/$ – see front matter © 2001 Elsevier Science B.V. All rights reserved.
PII: S0550-3213(01)00158-4
W.L. van Neerven, A. Vogt / Nuclear Physics B 603 (2001) 42–68
The ingredients necessary for next-to-next-to-leading order (NNLO) analyses of the structure functions in Bjorken-x space$^1$ have not been completed up to now: unlike the two-loop coefficient functions, which were calculated some time ago [5] (and completely checked recently [6]), only partial results [7–10] have been obtained for the three-loop splitting functions so far. However, we have recently demonstrated [11–13] that the uncertainties resulting from the incompleteness of this information are entirely negligible at $x > 0.05$. Moreover, these uncertainties are small even at much lower $x$, down to $x \simeq 10^{-4}$ at not too small scales, $Q^2 \gtrsim 10$ GeV$^2$ [13]. Thus analyses of structure functions in DIS (and of total cross sections for Drell–Yan lepton-pair production [14]) can be promoted to NNLO over a wide kinematic region. Besides more accurate determinations of the parton densities, such analyses facilitate a considerably improved theoretical accuracy $\Delta_{\rm th}\alpha_s(M_Z^2) \simeq 0.002$ of the determinations of the strong coupling.
In the present article we extend, for $x > 10^{-2}$, our treatment [11] of the flavour non-singlet (NS) sector dominating $\alpha_s$-extractions from DIS to the next-to-next-to-next-to-leading order (N$^3$LO). This extension is facilitated by two circumstances. The first is the existence of constraints on the three-loop coefficient functions which prove to be sufficiently restrictive in this region of $x$. The seven lowest even-integer and odd-integer moments have been computed [7,8] for the structure function $F_{2,\rm NS}$ in electromagnetic DIS and $F_3^{\nu+\bar\nu}$ in charged-current DIS, respectively. Furthermore the four leading large-x terms of these functions are known from the soft-gluon resummation [15,16]. The second circumstance is the rapid convergence of the splitting-function expansion in the usual $\overline{\rm MS}$ factorization scheme also employed in Refs. [7,8]. Already the impact of the three-loop splitting functions is small at $x > 10^{-2}$, in absolute size (less than 1% on $\alpha_s(M_Z^2)$) as well as compared to the two-loop coefficient functions [11,12]. Hence one can safely expect that the effect of the unknown four-loop splitting functions on determinations of $\alpha_s(M_Z^2)$ will be well below the 1% accuracy we are aiming at.
As demonstrated below, the N$^3$LO approximation suffices for achieving this accuracy in the region $x \lesssim 0.75$ usually covered by analyses of DIS data [1]. Terms beyond this order are relevant at $x \gtrsim 0.8$, on the other hand, mainly due to the presence of large soft-gluon logarithms up to $[\ln^{2l-1}(1 - x)]/(1 - x)$ in the l-loop coefficient functions. The resummation of these logarithms [15,16] has been extended to the next-to-next-to-leading logarithmic (NNLL) accuracy recently [17]. Here we will study the predictions of this resummation for the factorization-scheme independent ("physical") kernels governing the scaling violations ("evolution") of the non-singlet structure functions. Other approaches to estimate higher-order corrections to these kernels, not restricted to very large $x$, include Padé summations of the perturbation series [18] as well as renormalization scheme optimizations such as the principle of minimal sensitivity (PMS) [19] and the effective charge (ECH) method [20]. We will also compare these estimates to the full NNLO and N$^3$LO evolution kernels, and investigate the resulting predictions at order $\alpha_s^5$ (N$^4$LO) and beyond.
1 See Refs. [3,4] for NNLO analyses based on the integer Mellin-N results of Refs. [7,8] only.
The outline of this article is as follows: In Section 2 we express the physical evolution kernels, up to N$^4$LO and NNLL accuracy, in terms of the corresponding splitting functions and coefficient functions. The information on the three-loop coefficient functions for $F_{2,\rm NS}$ and $F_3$ discussed above is employed in Section 3 to derive approximate expressions for their x-dependence. Besides these functions the N$^3$LO evolution kernels also involve convolutions of lower-order coefficient functions, for which we provide compact expressions in Section 4. These results are put together in Section 5 to study the effects of the N$^3$LO terms on the evolution of the structure functions and on the resulting determinations of the strong coupling constant. In Section 6 we discuss the predictions of the soft-gluon resummation and of the Padé, PMS and ECH approximations. Finally our results are summarized in Section 7. Some relations for the convolutions in Section 4 can be found in Appendix A.
2. Fixed-order and resummed evolution kernels

For the choice $\mu_r^2 = \mu_f^2 = Q^2$ of the renormalization and mass-factorization scales, the structure functions
$$\mathcal{F}_1 = 2F_{1,\rm NS}, \qquad \mathcal{F}_2 = \frac{1}{x}F_{2,\rm NS}, \qquad \mathcal{F}_3 = F_3^{\nu+\bar\nu} \tag{2.1}$$
are in perturbative QCD given by
$$\begin{aligned}
\mathcal{F}_a(x, Q^2) &= \big[C_a(Q^2) \otimes q_{a,\rm NS}(Q^2)\big](x) \\
&= \sum_{l=0} a_s^l(Q^2)\,\big[c_{a,l} \otimes q_{a,\rm NS}(Q^2)\big](x) \\
&= \sum_{l=0} a_s^l(Q^2)\int_x^1\frac{dy}{y}\,c_{a,l}(y)\,q_{a,\rm NS}\Big(\frac{x}{y}, Q^2\Big).
\end{aligned} \tag{2.2}$$
Here $c_{a,l}$ represents the l-loop non-singlet coefficient functions with $c_{a,0}(x) = \delta(1 - x)$, and $q_{a,\rm NS}$ stands for the respective combinations of the quark densities. The scale dependence of the running coupling of QCD, in this article normalized as
$$a_s \equiv \frac{\alpha_s}{4\pi}, \tag{2.3}$$
is governed by
$$\frac{da_s}{d\ln Q^2} = \beta(a_s) = -\sum_{l=0} a_s^{l+2}\beta_l. \tag{2.4}$$
Besides $\beta_0$ and $\beta_1$ [2] also the coefficients $\beta_2$ and $\beta_3$ have been computed [21,22] in the $\overline{\rm MS}$ renormalization scheme adopted throughout this study. All these four coefficients,
$$\begin{aligned}
\beta_0 &= 11 - \frac{2}{3}N_f, \\
\beta_1 &= 102 - \frac{38}{3}N_f, \\
\beta_2 &= \frac{2857}{2} - \frac{5033}{18}N_f + \frac{325}{54}N_f^2, \\
\beta_3 &= 29243.0 - 6946.30\,N_f + 405.089\,N_f^2 + \frac{1093}{729}N_f^3,
\end{aligned} \tag{2.5}$$
are required for N$^3$LO calculations. The irrational coefficients in Eq. (2.5) and in Eqs. (2.15) and (2.18) below have been truncated to six digits for brevity. $N_f$ denotes the number of effectively massless flavours (mass effects are not considered in this article).
Finally the evolution equations for the parton densities in Eq. (2.2) read
$$\frac{d}{d\ln Q^2}\,q_{a,\rm NS}(x, Q^2) = \big[P_a(Q^2) \otimes q_{a,\rm NS}(Q^2)\big](x) = \sum_{l=0} a_s^{l+1}\big[P_{a,l} \otimes q_{a,\rm NS}(Q^2)\big](x), \tag{2.6}$$
where $\otimes$ abbreviates the Mellin convolution written out in the third line of Eq. (2.2). Like the coefficient functions $c_{a,l}(x)$, the (l + 1)-loop splitting functions $P_{a,l}(x)$ are scale independent for the above choice of $\mu_r$ and $\mu_f$.
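The Mellin convolution in the third line of Eq. (2.2) turns into an ordinary product under Mellin moments, $\int_0^1 dx\,x^{N-1}f(x)$, which is the property exploited later in this section. The following is a minimal numerical sketch of that statement using a simple midpoint rule; the helper names and the test functions $f(x) = x$, $g(x) = x^2$ are illustrative choices, not taken from the article.

```python
def midpoint(func, lo, hi, n=200):
    """Midpoint-rule estimate of the integral of func over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))

def mellin_convolution(f, g, x, n=200):
    """[f (x) g](x) = int_x^1 dy/y f(y) g(x/y), as in the third line of Eq. (2.2)."""
    return midpoint(lambda y: f(y) * g(x / y) / y, x, 1.0, n)

def mellin_moment(f, N, n=200):
    """N-th Mellin moment int_0^1 dx x^(N-1) f(x)."""
    return midpoint(lambda x: x ** (N - 1) * f(x), 0.0, 1.0, n)

if __name__ == "__main__":
    f = lambda x: x
    g = lambda x: x ** 2
    N = 3
    lhs = mellin_moment(lambda x: mellin_convolution(f, g, x), N)
    rhs = mellin_moment(f, N) * mellin_moment(g, N)
    print(lhs, rhs)  # both should be close to 1/20
```

For these test functions the convolution is $x - x^2$ in closed form, so both the pointwise value and the factorization of moments can be compared against exact numbers.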
Explicit expressions up to order $\alpha_s^2$ can be found in Refs. [23] and [5] for the non-singlet splitting functions and coefficient functions, respectively. For the third-order splitting functions $P_{a,2}(x)$ we will employ our approximate expressions of Ref. [13]. The three-loop coefficient functions $c_{a,3}(x)$ are the subject of Section 3 below.
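As an illustration of Eqs. (2.4) and (2.5), the running of $a_s = \alpha_s/(4\pi)$ can be integrated numerically. The sketch below uses the coefficients exactly as quoted in Eq. (2.5) and a standard fourth-order Runge–Kutta step; the function names, the step count and the starting value are arbitrary illustrative choices, and this is not the numerical procedure used in the article.

```python
import math

def beta_coefficients(nf):
    """beta_0..beta_3 of Eq. (2.5) for nf massless flavours (a_s = alpha_s / 4 pi)."""
    b0 = 11.0 - 2.0 * nf / 3.0
    b1 = 102.0 - 38.0 * nf / 3.0
    b2 = 2857.0 / 2.0 - 5033.0 * nf / 18.0 + 325.0 * nf ** 2 / 54.0
    b3 = 29243.0 - 6946.30 * nf + 405.089 * nf ** 2 + 1093.0 * nf ** 3 / 729.0
    return [b0, b1, b2, b3]

def beta(a_s, coeffs):
    """Right-hand side of Eq. (2.4): -sum_l beta_l a_s^(l+2), truncated to len(coeffs) terms."""
    return -sum(b * a_s ** (l + 2) for l, b in enumerate(coeffs))

def evolve(a0, log_q2_ratio, coeffs, steps=1000):
    """RK4 integration of d a_s / d ln Q^2 = beta(a_s) over ln(Q^2 / Q0^2)."""
    h = log_q2_ratio / steps
    a = a0
    for _ in range(steps):
        k1 = beta(a, coeffs)
        k2 = beta(a + 0.5 * h * k1, coeffs)
        k3 = beta(a + 0.5 * h * k2, coeffs)
        k4 = beta(a + h * k3, coeffs)
        a += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return a

if __name__ == "__main__":
    coeffs = beta_coefficients(4)
    a0 = 0.2 / (4 * math.pi)           # illustrative starting value alpha_s = 0.2
    L = math.log(100.0)                # evolve up by a factor 100 in Q^2
    for order in range(1, 5):
        print(order, 4 * math.pi * evolve(a0, L, coeffs[:order]))
```

Truncating the coefficient list to its first entry reproduces the closed-form one-loop solution $a_s(Q^2) = a_0/(1 + \beta_0 a_0 \ln(Q^2/Q_0^2))$, which gives a simple check of the integrator.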
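It is convenient to express the scaling violations of the non-singlet structure functions in terms of these structure functions themselves, thus explicitly eliminating any dependence on the factorization scheme and the scale $\mu_f$. The corresponding "physical" evolution kernels $K_a$ for $\mu_r^2 = Q^2$ can be derived by differentiating Eq. (2.2) with respect to $Q^2$ by means of Eqs. (2.4) and (2.6), and finally eliminating $q_{a,\rm NS}$ using the inverse of Eq. (2.2). Suppressing the dependencies on $x$ and $Q^2$ one arrives at the evolution equations
$$\begin{aligned}
\frac{d}{d\ln Q^2}\,\mathcal{F}_a &= \Big(P_a + \beta\,\frac{d\ln C_a}{da_s}\Big) \otimes \mathcal{F}_a \\
&= K_a \otimes \mathcal{F}_a = \sum_{l=0} a_s^{l+1}K_{a,l} \otimes \mathcal{F}_a \\
&= \Big\{a_sP_0 + \sum_{l=1} a_s^{l+1}\Big(P_{a,l} - \sum_{k=0}^{l-1}\beta_k\,\tilde c_{a,l-k}\Big)\Big\} \otimes \mathcal{F}_a,
\end{aligned} \tag{2.7}$$
with
$$\begin{aligned}
\tilde c_{a,1} &= c_{a,1}, \\
\tilde c_{a,2} &= 2c_{a,2} - c_{a,1}^{\otimes 2}, \\
\tilde c_{a,3} &= 3c_{a,3} - 3c_{a,2} \otimes c_{a,1} + c_{a,1}^{\otimes 3}, \\
\tilde c_{a,4} &= 4c_{a,4} - 4c_{a,3} \otimes c_{a,1} - 2c_{a,2}^{\otimes 2} + 4c_{a,2} \otimes c_{a,1}^{\otimes 2} - c_{a,1}^{\otimes 4}.
\end{aligned} \tag{2.8}$$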
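In Eq. (2.8) we have used the abbreviation $f^{\otimes l}$ for the (l − 1)-fold convolution of a function $f(x)$ with itself, i.e., $f^{\otimes 2} = f \otimes f$, etc. The generalizations $\tilde K_{a,l}$ of the kernels $K_{a,l}$ in Eq. (2.7) to $\mu_r^2 \neq Q^2$ can be obtained by expanding$^2$ $a_s(Q^2)$ in terms of $a_s(\mu_r^2)$ and $L = \ln(Q^2/\mu_r^2)$, yielding
$$\begin{aligned}
\tilde K_{a,0} &= K_{a,0}, \\
\tilde K_{a,1} &= K_{a,1} - \beta_0LK_{a,0}, \\
\tilde K_{a,2} &= K_{a,2} - 2\beta_0LK_{a,1} - \big(\beta_1L - \beta_0^2L^2\big)K_{a,0}, \\
\tilde K_{a,3} &= K_{a,3} - 3\beta_0LK_{a,2} - \big(2\beta_1L - 3\beta_0^2L^2\big)K_{a,1} - \Big(\beta_2L - \frac{5}{2}\beta_1\beta_0L^2 + \beta_0^3L^3\Big)K_{a,0}, \\
\tilde K_{a,4} &= K_{a,4} - 4\beta_0LK_{a,3} - \big(3\beta_1L - 6\beta_0^2L^2\big)K_{a,2} - \big(2\beta_2L - 7\beta_1\beta_0L^2 + 4\beta_0^3L^3\big)K_{a,1} \\
&\quad - \Big(\beta_3L - 3\beta_2\beta_0L^2 - \frac{3}{2}\beta_1^2L^2 + \frac{13}{3}\beta_1\beta_0^2L^3 - \beta_0^4L^4\Big)K_{a,0}.
\end{aligned} \tag{2.9}$$
2 Up to the fifth order this expansion can be read off from the $K_{a,0}$ terms of Eq. (2.9).
At N$^m$LO the terms up to $l = m$ are included in Eq. (2.7). Hence Eqs. (2.8) and (2.9) formally specify the evolution kernels $K_a$ up to N$^4$LO. Their extension to higher orders is straightforward but irrelevant for the time being, as at least the coefficient functions beyond four loops will not be calculated in the foreseeable future.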
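The combinations in Eq. (2.8) are the expansion coefficients of $a_s\,d\ln C_a/da_s$. In Mellin-N space convolutions become ordinary products, so the pattern can be checked with plain numbers standing in for the moments of $c_{a,l}$. The sketch below does this with exact rational power-series arithmetic; the function names and the sample values are illustrative and not taken from the article.

```python
from fractions import Fraction

def series_mul(a, b, order):
    """Product of two power series (coefficient lists), truncated at a^order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a[:order + 1]):
        for j, bj in enumerate(b[:order + 1 - i]):
            out[i + j] += ai * bj
    return out

def series_inverse(a, order):
    """Inverse of a power series whose constant term a[0] equals 1."""
    inv = [Fraction(1)] + [Fraction(0)] * order
    for n in range(1, order + 1):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv

def log_derivative_coefficients(c, order=4):
    """Coefficients of a^1..a^order in a * d/da ln(1 + c[1] a + c[2] a^2 + ...)."""
    C = [Fraction(1)] + [Fraction(v) for v in c[1:order + 1]]
    a_c_prime = [Fraction(l) * C[l] for l in range(order + 1)]  # a * C'(a)
    return series_mul(a_c_prime, series_inverse(C, order), order)[1:]

def ctilde_closed_form(c):
    """Right-hand sides of Eq. (2.8), with products replacing convolutions."""
    c1, c2, c3, c4 = (Fraction(v) for v in c[1:5])
    return [
        c1,
        2 * c2 - c1 ** 2,
        3 * c3 - 3 * c2 * c1 + c1 ** 3,
        4 * c4 - 4 * c3 * c1 - 2 * c2 ** 2 + 4 * c2 * c1 ** 2 - c1 ** 4,
    ]

if __name__ == "__main__":
    sample = [1, 2, -3, 5, 7]  # c[0] = 1 is the l = 0 term; the rest are arbitrary
    print(log_derivative_coefficients(sample))
    print(ctilde_closed_form(sample))
```

Both lists agree for any choice of sample values, which confirms the numerical factors and signs in Eq. (2.8) through fourth order.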
The leading terms of the coefficient functions for $x \to 1$, however, are known to all orders from the soft-gluon resummation [15–17]. Switching to Mellin moments,
$$f^N = \int_0^1 dx\,x^{N-1}f(x), \tag{2.10}$$
for the remainder of this section, the large-N (large-x) behaviour of the coefficient functions in Eq. (2.2) takes the form
$$C^N_{\rm res} = g_0(a_s)\exp\big[\ln N\,g_1(\lambda) + g_2(\lambda) + a_s\,g_3(\lambda) + O\big(a_s^2f(\lambda)\big)\big], \tag{2.11}$$
up to terms which vanish for $N \to \infty$. Here we have used the abbreviation
$$\lambda = a_s\beta_0\ln N = \frac{\alpha_s}{4\pi}\,\beta_0\ln N, \tag{2.12}$$
and we have again put $\mu_r^2 = \mu_f^2 = Q^2$. By virtue of the first line of (2.7), Eq. (2.11) leads to the following expression for the resummed kernel up to next-to-next-to-leading logarithmic (NNLL) accuracy [17]:
1
2
dg1
N
= A1 as + A2 as2 + A3 as3 ln N 1 + as + as2 2
Kres
0
0
d

d
dg
2
as2 0
as 0 + as2 1
(2.13)
g3 () + O as3 (f ()) .
d
d
Thus the leading logarithmic (LL), next-to-leading logarithmic (NLL) and NNLL large-N contributions to the physical evolution kernels are of the form $(a_s \ln N)^n$, $a_s (a_s \ln N)^n$ and $a_s^2 (a_s \ln N)^n$, respectively. This is in contrast to the coefficient functions which receive contributions up to $(a_s \ln^2 N)^n$. The constants $A_l$ in Eq. (2.13) are the coefficients of the leading [24] large-x terms $1/[1-x]_+$ of the l-loop $\overline{\rm MS}$ splitting functions; recall that
  f^N = - \ln N + O(1)   for   f(x) = \frac{1}{[1-x]_+} .   (2.14)
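The large-N behaviour quoted in Eq. (2.14) is easy to verify for integer N. With the +-prescription the moment of 1/[1-x]_+ is \int_0^1 dx (x^{N-1} - 1)/(1-x) = -\sum_{k=1}^{N-1} 1/k, a standard result that is used here as an assumption of this sketch rather than taken from the text. The Python lines below (hypothetical function name) show that the O(1) remainder approaches minus the Euler-Mascheroni constant.

```python
import math

EULER_GAMMA = 0.5772156649015329


def moment_of_plus_distribution(n):
    """N-th Mellin moment of 1/[1-x]_+ for integer N >= 1.
    Equals minus the harmonic number H_{N-1}."""
    return -sum(1.0 / k for k in range(1, n))


def remainder(n):
    """O(1) remainder in f^N = -ln N + O(1); tends to -gamma_e for large N."""
    return moment_of_plus_distribution(n) + math.log(n)
```

For N = 1000 the remainder differs from -\gamma_e by about 1/(2N).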
As in Eq. (2.5) inserting the numerical values for the QCD colour factors $C_A$ and $C_F$, these constants are given by
  A_1 = \frac{16}{3} ,   A_2 = 66.4732 - \frac{160}{27} N_f ,   (2.15)
and the yet approximate, but sufficiently accurate three-loop result [9,13]
  A_3 = (1178.8 \pm 11.5) - (183.95 \pm 0.85) N_f - \frac{64}{81} N_f^2 .   (2.16)
Inserting the explicit form of the functions $g_1$, $g_2$ [15] and $g_3$ [17] in Eq. (2.13) and restoring the dependence on $L = \ln(Q^2/\mu_r^2)$ leads to
N
Kres
=
( 2)
A1
+ as2 L2 A1 0
ln(1 ) + as LA1
0
1
2(1 )2

+ as A1 1 + ln(1 ) + (B1 A1 e )02 A2 0
1
02 (1 )

( 2)

ln(1 )
+ as2 L (B1 e A1 )0 A2
1 1
(1 )2
0 (1 )2

( 2)
+ as2 A1 e2 + 2 02 + 2A2 e 0 + A3 2B1 e 02 2B2 0
20 (1 )2

2
2 2
2
+ 2(A1 e B1 )1 0 ln(1 ) + A1 1 ln (1 )

1
A1 2 0 2 A2 1 0 2 2 2 ln(1 )
3
20 (1 )2

(1 )
+ 2D2
+ O as3 f (, L)
2
(1 2)

N
asl+1 Kres,l
.
(2.17)
l=0
Here $\zeta_2 = \pi^2/6$, and $\gamma_e$ represents the Euler–Mascheroni constant, $\gamma_e \simeq 0.577216$. Furthermore $B_1 = -4$ [15], and the constants $B_2$ and $D_2$ are related by [17]
  B_2 + D_2 = 36.2657 + 6.34888 N_f .   (2.18)
We will return to the latter coefficients at the end of the next section.
After subtracting the terms up to order $a_s^{m+1}$ in $K^N_{res}$ already taken into account in the N$^m$LO terms (2.9), Eq. (2.17) can be added to these fixed-order results to obtain the (N$^m$LO + resummed) approximation for the non-singlet evolution kernels,
  K^N_{N^mLO+res} = \sum_{l=0}^{m} a_s^{l+1} \big( K^N_{a,l} - K^N_{res,l} \big) + K^N_{res} .   (2.19)
Due to the renormalon singularities at $\lambda = 1$ and $\lambda = 1/2$ in Eq. (2.17) the resummed evolution equations cannot be uniquely inverted to x-space, unlike the fixed-order approximations $K^N_{a,l} F^N_a$. Note that the strength of these singularities, located at $N \simeq 2000$ and $N \simeq 45$ for $\lambda = 1$ and $\lambda = 1/2$, respectively, at $\alpha_s = 0.2$ and $N_f = 4$, increases with the order of the soft-gluon expansion: the behaviour is logarithmic at the leading-log level, but involves poles of order k for the N$^k$LL approximations. For our numerical study of the all-order case at the end of Section 6 we will use the standard 'minimal prescription' contour [27] for the Mellin inversion. This contour runs to the left of the renormalon singularities, but to the right of all other poles in the N-plane.
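The quoted positions of the singularities follow directly from Eq. (2.12). The Python sketch below assumes the standard one-loop coefficient \beta_0 = 11 - 2 N_f / 3 for the normalization a_s = \alpha_s / (4\pi); this value is an assumption here, since the defining equation lies outside this section, and the function names are hypothetical.

```python
import math


def beta0(nf):
    """One-loop beta-function coefficient for a_s = alpha_s/(4 pi) (assumed)."""
    return 11.0 - 2.0 * nf / 3.0


def moment_at_lambda(lam, alpha_s, nf):
    """Solve lambda = a_s * beta_0 * ln N of Eq. (2.12) for N."""
    a_s = alpha_s / (4.0 * math.pi)
    return math.exp(lam / (a_s * beta0(nf)))
```

With alpha_s = 0.2 and N_f = 4 this gives N of roughly 1.9 x 10^3 for lambda = 1 and roughly 43 for lambda = 1/2, consistent with the rounded values in the text.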
3. The 3-loop non-singlet coefficient functions

The l-loop coefficient functions $c_{a,l}$ for the non-singlet structure functions $F_{a=1,2,3}$ defined in Eq. (2.1) can be represented as
  c_{a,l}(x, N_f) = \sum_{m=0}^{2l-1} A_l^{(m)} D_m + \tilde{B}_l \, \delta(1-x) + c^{smooth}_{a,l}(x, N_f)
                    + \sum_{n=1}^{2l-1} \big( C^{(n)}_{a,l} L_1^n + D^{(n)}_{a,l} L_0^n \big) .   (3.1)
Here $A_l^{(m)}$, $\tilde{B}_l$, $C^{(n)}_{a,l}$ and $D^{(n)}_{a,l}$ are numerical coefficients which in general depend on the number of flavours $N_f$, and we have employed the abbreviations
  D_k = \Big[ \frac{\ln^k(1-x)}{1-x} \Big]_+ ,   L_1 = \ln(1-x) ,   L_0 = \ln x   (3.2)
for the +-distributions (see Eqs. (3.3) and (3.4) below) and the end-point logarithms. The functions $c^{smooth}_{a,l}(x, N_f)$ collect all contributions which are finite for $0 \le x \le 1$. This regular term constitutes the mathematically complicated part of Eq. (3.1); it involves higher transcendental functions like the harmonic polylogarithms introduced in Ref. [28]. As usual, the +-distributions are defined via
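  \int_0^1 dx \, [a(x)]_+ f(x) = \int_0^1 dx \, a(x) \big\{ f(x) - f(1) \big\} ,   (3.3)
where f(x) is a regular function. The convolutions with the distributions occurring in Eq. (3.1) can be written as (see footnote 3)
  x [D_k \otimes f](x) = \int_x^1 dy \, \frac{\ln^k(1-y)}{1-y} \Big\{ \frac{x}{y} f\Big(\frac{x}{y}\Big) - x f(x) \Big\}
                         + x f(x) \, \frac{1}{k+1} \ln^{k+1}(1-x) .   (3.4)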
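A direct numerical transcription of Eq. (3.4) may help when implementing such convolutions in an x-space program. The Python sketch below uses a simple midpoint rule, which is an assumption of this example and not the method of the paper; the function name is hypothetical. For the trivial test function f = 1 and k = 0 the result can be compared with the closed form x [ln(1-x) - ln x].

```python
import math


def x_times_dk_convolution(f, x, k=0, steps=20000):
    """x * [D_k (x) f](x) following Eq. (3.4): a subtracted integral over
    x <= y <= 1 plus the end-point term x f(x) ln^(k+1)(1-x) / (k+1)."""
    h = (1.0 - x) / steps
    total = 0.0
    for i in range(steps):
        y = x + (i + 0.5) * h  # midpoints never reach y = 1
        weight = math.log(1.0 - y) ** k / (1.0 - y)
        total += weight * ((x / y) * f(x / y) - x * f(x))
    endpoint = x * f(x) * math.log(1.0 - x) ** (k + 1) / (k + 1)
    return total * h + endpoint
```

The subtraction of x f(x) inside the integrand removes the non-integrable behaviour at y = 1, which is the purpose of the +-prescription.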
As already indicated in Eq. (3.1), the coefficients of $D_m$ and of $\delta(1-x)$ are independent of the choice of the structure function.
3 The second line of Eq. (3.4) is given by $-x f(x) \int_0^x dy \, a(y)$ for a general +-distribution $[a(x)]_+$.
The three-loop contributions $c_{S,3}$ known from the soft-gluon resummation read
  c_{S,3}(x, 3) = \frac{512}{27} (D_5 - L_1^5) - \frac{14400}{81} D_4 + 264.062 D_3 + 1781.704 D_2 ,
  c_{S,3}(x, 4) = \frac{512}{27} (D_5 - L_1^5) - \frac{13760}{81} D_4 + 188.210 D_3 + 1962.178 D_2 ,
  c_{S,3}(x, 5) = \frac{512}{27} (D_5 - L_1^5) - \frac{13120}{81} D_4 + 113.938 D_3 + 2131.195 D_2 ,   (3.5)
where we have again truncated the irrational coefficients and restricted ourselves to the practically relevant cases $N_f$ = 3, 4 and 5. Besides the $D_m$-terms determined in Ref. [16], Eq. (3.5) also includes the leading integrable large-x logarithm. The general relation between the coefficients of this term and the leading +-distribution has been conjectured in Ref. [29]. Eqs. (3.5) complement the main present constraints on $c_{a,3}(x, N_f)$ provided by the computation [7,8] of the seven lowest even-integer and odd-integer moments (2.10), respectively, for electromagnetic (e.m.) DIS and the charged-current (CC) combination $F_3^{\nu+\bar\nu}$. Note that the coefficients of the leading small-x logarithms are presently unknown here, unlike for the splitting functions and the singlet coefficient functions [10].
We use this information for approximate reconstructions of $c_{2,3}(x, N_f)$ and $c_{3,3}(x, N_f)$ at $N_f$ = 3, 4 and 5. Our method is analogous to the treatment of the three-loop splitting functions in Refs. [11–13]: A simple ansatz is chosen for $c^{smooth}_{a,3}$ in Eq. (3.1), and its free parameters are determined from the available moments together with a reasonably balanced subset of the coefficients $A_3^{(0,1)}$, $\tilde{B}_3$, $C^{(n)}_{a,3}$ and $D^{(n)}_{a,3}$. The ansatz for $c^{smooth}_{a,3}$ and the choice of the non-vanishing end-point parameters are then varied in order to estimate the residual uncertainty of $c_{a,3}$. Specifically we keep $A_3^{(1)}$; one of each pair $A_3^{(0)}$ and $\tilde{B}_3$, $C^{(4)}_{a,3}$ and $C^{(3)}_{a,3}$, $C^{(2)}_{a,3}$ and $C^{(1)}_{a,3}$; one or two of the $D^{(n<4)}_{a,3}$; and one or two parameters of a polynomial up to second order in x representing $c^{smooth}_{a,3}$. For a few combinations the resulting system of linear equations which fixes these parameters by the seven moments becomes almost singular, resulting in exceptionally large numerical coefficients. After rejecting those about 5% of the combinations for which the modulus of at least one parameter exceeds $10^5$, we are left with about 90 approximations for each case.
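The way a set of moments fixes the free parameters of an ansatz through a linear system can be illustrated with a deliberately small toy model. The Python sketch below is not the ansatz of this paper: it assumes a hypothetical three-parameter function p_0 ln(1-x) + p_1 x + p_2, uses the standard moments \int_0^1 dx x^{N-1} ln(1-x) = -S_1(N)/N, \int_0^1 dx x^{N-1} x = 1/(N+1) and \int_0^1 dx x^{N-1} = 1/N, and solves the system exactly with rational arithmetic. All names are hypothetical.

```python
from fractions import Fraction


def harmonic(n):
    """Harmonic sum S_1(n) as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def basis_moments(n):
    """Mellin moments (2.10) of the toy basis [ln(1-x), x, 1]."""
    return [-harmonic(n) / n, Fraction(1, n + 1), Fraction(1, n)]


def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination for a small square system."""
    size = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][size] for r in range(size)]


def fit_from_moments(moment_values, ns):
    """Determine the toy parameters from moments given at the integers ns."""
    return solve_linear([basis_moments(n) for n in ns], moment_values)
```

In the actual procedure the basis is larger and some combinations make the matrix nearly singular, which is why approximations with very large coefficients are rejected.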
Before we present the approximate results for $c_{a,3}(x, N_f)$, it is appropriate to illustrate our procedure by applying it to a known quantity, for which we choose the two-loop e.m. coefficient function $c_{2,2}(x, N_f = 4)$. Adopting the coefficients of $D_3$ and $D_2$ defined in Eq. (3.2) from the soft-gluon exponentiation, the procedure described in the preceding paragraph is applied to this function with the small adjustment that two of the $C^{(1,2,3)}_{2,2}$ are kept, as $C^{(4)}$ does not occur at two loops according to Eq. (3.1). Also here we reject a couple of combinations, those with parameter(s) of modulus above $3 \cdot 10^3$. The remaining about 70 approximations are compared to the exact result of Ref. [5] in Fig. 1.
The seven lowest even-integer moments supplemented by the soft-gluon coefficients $A_2^{(m>1)}$ prove to constrain $c_{2,2}(x)$ rather tightly at $x \gtrsim 0.3$. The region $0.1 \lesssim x \lesssim 0.3$ is less accurately covered, and at x < 0.1 the lack of small-x information mentioned above becomes very prominent. Also shown in Fig. 1 is the exact result [5] for $c_{2,2}$ in charged-current DIS. The difference to the electromagnetic case, originating in a sign difference of the contributions from $\gamma^*/W + q \to q + q + \bar{q}$ with identical quarks in the final state, is clearly visible only at $x \lesssim 0.2$. The effect of this difference on the evolution of the structure functions at NNLO is unnoticeable at $x \ge 0.1$, and amounts to less than 1% for x > 0.01, see Fig. 11 of Ref. [11]. We expect that the corresponding three-loop effect will at least not be larger. Hence the approximations for $c_{2,3}(x, N_f)$, constructed for the e.m. case, should be applicable also for neutrino DIS without introducing any relevant error.
Fig. 1. Approximations for the two-loop coefficient functions $c_{2,2}(x, N_f = 4)$ for $F^{e.m.}_{2,NS}$ obtained from the lowest seven even-integer moments and the two leading soft-gluon terms, compared to the exact result of Ref. [5]. Also shown is the corresponding exact coefficient function for $F_{2,NS}$ in charged-current DIS.
The corresponding approximations for $c_{2,3}$ and $c_{3,3}$ are shown in Fig. 2 for $N_f = 4$ (concerning the scale of the ordinate recall the rather small expansion parameter (2.3)). As expected, the accuracy pattern is qualitatively similar to the two-loop case of Fig. 1. The uncertainty of $c_{3,3}$ is smaller than that of $c_{2,3}$ at small x, since for $c_{3,3}$ the lowest calculated moment, N = 1, is closer to the location of the rightmost pole at N = 0. For both functions two representatives, denoted by A and B, are selected which rather completely cover the uncertainty bands. With $c_{S,3}$ of Eq. (3.5) these representatives read
A
c2,3
(x, 4) = cS,3 (x, 4) 6456.231D1 1085.97(1 x) + 258.876L41
22430.79L1 74705.15x 2 4062.14(2 x) 313.0L30,

B
c2,3
(x, 4) = cS,3 (x, 4) 5081.227D1 + 5028.23D0 + 1059.423L41
7292.84L21 17741.28(2 x) 18154.5L0 + 1168.02L30,

and
(3.6)
Fig. 2. Approximations for the three-loop coefficient functions for $F^{e.m.}_{2,NS}$ (left) and the charged-current $F_3^{\nu+\bar\nu}$ (right) derived from the respective seven lowest moments [7,8] and the soft-gluon terms (3.5). The full lines show the selected functions (3.6) and (3.7).
A
c3,3
(x, 4) = cS,3 (x, 4) 6940.648D1 1526.23D0 42.598L41
33562.64L1 91639.87x 2 5898.84 + 424.49L20,

B
(x, 4) = cS,3 (x, 4) 4907.988D1 + 5587.906D0 + 1092.436L41
c3,3
8267.99L21 18120.78 7083.63L0 + 283.59L30.
(3.7)
The uncertainty bands of Fig. 2 do not directly indicate the range of applicability of our approximations, as the coefficient functions enter the structure functions and their evolution only via the smoothening convolution (2.2) with non-perturbative initial distributions. In Fig. 3 we therefore present the convolutions of the results (3.6) and (3.7) with a typical non-singlet shape. This illustration shows that the residual uncertainties of $c_{a,3}$ do not lead to any relevant effects for $x \ge 0.1$. The situation at smaller x depends on the numerical size of the $c_{a,3}$ contributions to the evolution kernels given by Eqs. (2.7) and (2.8). Anticipating our findings in Section 5, we note that these contributions are actually unproblematically small for $x > 10^{-2}$.
The results for $N_f = 3$ and $N_f = 5$ are similar to those presented in Fig. 2 and Fig. 3. For brevity they are not displayed graphically in this article. The selected approximations for $N_f = 3$ are given by
A
(x, 3) = cS,3 (x, 3) 6973.782D1 3929.78(1 x) + 232.676L41
c2,3
35949.75L1 93141.53x 2 10283.51 418.43L30,

B
c2,3
(x, 3) = cS,3 (x, 3) 5575.903D1 + 5474.48D0 + 927.478L41
4646.38L21 23345.85 11094.92L0 + 759.69L30,
(3.8)
Fig. 3. The convolution of the approximations (3.6) and (3.7) selected from the previous figure with
a shape typical of hadronic non-singlet distributions.
and
A
(x, 3) = cS,3 (x, 3) 7018.496D1 2957.33(1 x) + 162.667L41
c3,3
28363.91L1 74640.87x 2 5720.48(1 + x) + 330.21L20,

B
(x, 3) = cS,3 (x, 3) 5160.335D1 + 6896.360D0 + 1109.512L41
c3,3
7715.43L21 20541.22 7595.83L0 + 290.34L30.
(3.9)
The corresponding functions for Nf = 5 read

A
c2,3
(x, 5) = cS,3 (x, 5) 5951.174D1 391.37D0 2341.422L31
+ 19986.58L1 + 5517.39(2 x) + 5969.63L0 284.23L30,

B
(x, 5) = cS,3 (x, 5) 4802.695D1 + 3784.97D0 + 1041.041L41
c2,3
8021.15L21 15556.5(2 x) 16445.21L0 + 1084.36L30,
(3.10)
and
A
c3,3
(x, 5) = cS,3 (x, 5) 6560.902D1 2412.54D0 + 98.499L31
27899.69L1 82015.71x 2 2983.82 61.43L30,

B
c3,3
(x, 5) = cS,3 (x, 5) 4637.854D1 + 4317.29D0 + 1070.036L41
8767.685L21 15676.57 6543.88L0 + 276.32L30.
(3.11)
In all cases the average $\frac{1}{2} (c^A_{a,3} + c^B_{a,3})$ represents our central result.
We conclude this section by returning to the coefficients $B_2$ and $D_2$ entering the NNLL soft-gluon resummation of the quark coefficient functions (2.11) and structure-function evolution kernels (2.17). If only one of these constants were present, say $D_2$, then this constant would be fixed by the consistency of Eq. (2.11) with the soft-gluon part $c_{S,2}$ of the NNLO coefficient functions of Ref. [5], more precisely by the coefficient $A_2^{(0)}$ in Eq. (3.1). Digressing for a moment, we note that this situation is actually realized for the (very closely related) NNLL soft-gluon resummations of the quark–antiquark annihilation contribution to the Drell–Yan cross section [17] and of the cross section for Higgs production via gluon–gluon fusion in the heavy top-quark limit. For these two processes the NNLO soft- and virtual-gluon contributions have been computed in Refs. [25] and [26], respectively. In the present DIS case, however, this consistency condition only implies the constraint (2.18).
(3)
An exact result for the coefficient A2 of the three-loop coefficient functions (3.1) would
suffice to determine B2 and D2 , as this coefficient is related to the combination B2 + 2D2
independent from Eq. (2.18).
As discussed in the paragraph below that of Eq. (3.5), all our approximations (shown
(3)
for Nf = 4 in Fig. 2) include A2 . For about 95% of these approximations this coefficient
falls into the range
6800 . . . 5800
3
(3)
A2 6350 . . . 5500
(3.12)
for Nf = 4 .
5950 . . . 5200
5
The comparison of these results to the expansion of Eq. (2.11) (using g3 () of Ref. [17])
leads to the rather weak constraints
32 . . . 87
3
B2 42 . . . 93
(3.13)
for Nf = 4 ,
49 . . . 98
5
which can to sufficient accuracy be combined to the estimate
1
B2 P1 + 0 P0 ,
3
with
8 12.
(3.14)
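Here $P^{\delta}_{l-1}$ represent the coefficients of $\delta(1-x)$ in the l-loop quark splitting functions. Retaining the colour factors $C_A = N_c = 3$, $C_F = (N_c^2 - 1)/(2 N_c)$, these coefficients read
  P^{\delta}_0 = 3 C_F ,
  P^{\delta}_1 = C_F^2 \Big[ \frac{3}{2} + 24 \zeta_3 - 12 \zeta_2 \Big] + C_F C_A \Big[ \frac{17}{6} - 12 \zeta_3 + \frac{44}{3} \zeta_2 \Big] - C_F N_f \Big[ \frac{1}{3} + \frac{8}{3} \zeta_2 \Big] ,   (3.15)
for our normalization (2.3) of the expansion parameter.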
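For orientation, the coefficients of Eq. (3.15) can be evaluated numerically. The Python sketch below assumes the reading of Eq. (3.15) with the standard two-loop signs (the minus signs are not legible in every copy of the text) and uses hypothetical function names.

```python
import math

ZETA2 = math.pi**2 / 6.0
ZETA3 = 1.2020569031595942
C_A = 3.0
C_F = 4.0 / 3.0


def p_delta_0():
    """Coefficient of delta(1-x) in the one-loop quark splitting function."""
    return 3.0 * C_F


def p_delta_1(nf):
    """Coefficient of delta(1-x) in the two-loop non-singlet splitting function,
    for the expansion parameter a_s = alpha_s/(4 pi)."""
    return (C_F**2 * (1.5 + 24.0 * ZETA3 - 12.0 * ZETA2)
            + C_F * C_A * (17.0 / 6.0 - 12.0 * ZETA3 + 44.0 / 3.0 * ZETA2)
            - C_F * nf * (1.0 / 3.0 + 8.0 / 3.0 * ZETA2))
```

For N_f = 4 this gives a one-loop value of 4 and a two-loop value of about 43.8.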
4. Convolutions for the N$^3$LO evolution kernels
Besides the (l+1)-loop splitting functions and the l-loop coefficient functions, the N$^l$LO evolution kernels (2.7) involve simple and multiple convolutions of the coefficient functions of lower order. Required at N$^3$LO are the simple and double convolutions of the one-loop coefficient functions $c_{a,1}$ with themselves, $c_{a,1}^{\otimes 2}$ and $c_{a,1}^{\otimes 3}$, and the convolutions of the one- with the two-loop coefficient functions, $c_{a,1} \otimes c_{a,2}$, see Eq. (2.8). Especially the latter lead to rather complex exact expressions. These terms, however, do not require any attention if the evolution is carried out using the moment-space technique [30], as in N-space the convolutions reduce to products. On the other hand, many analyses of data on structure functions are performed using 'brute-force' x-space programs for solving the evolution equations. For application in such programs we provide compact and accurate parametrizations of the convolution contributions to the evolution kernels up to N$^3$LO.
These approximations are derived analogously to those of the two-loop coefficient functions in Ref. [11]: The +-distribution parts are treated exactly (truncating irrational coefficients), see Appendix A. The integrable x < 1 terms are fitted to the exact results for $x \ge 10^{-6}$. Finally the coefficients of $\delta(1-x)$ are slightly adjusted from their exact values using the lowest integer moments. The resulting parametrizations deviate from the exact results by no more than a few permille. This accuracy applies directly to Eqs. (4.1)–(4.6) as well as to their convolutions with typical hadronic input distributions.
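The statement that convolutions reduce to products in N-space can be illustrated numerically. The Python sketch below uses the standard definition [f \otimes g](x) = \int_x^1 (dy/y) f(y) g(x/y) together with the moments (2.10); the midpoint integration, the simple test functions and the function names are assumptions of this example.

```python
def mellin_moment(func, n, steps=400):
    """Midpoint-rule estimate of int_0^1 dx x^(n-1) func(x), cf. Eq. (2.10)."""
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h) ** (n - 1) * func((i + 0.5) * h)
                   for i in range(steps))


def convolution(f, g, x, steps=400):
    """Midpoint-rule estimate of [f (x) g](x) = int_x^1 dy/y f(y) g(x/y)."""
    h = (1.0 - x) / steps
    total = 0.0
    for i in range(steps):
        y = x + (i + 0.5) * h
        total += f(y) * g(x / y) / y
    return total * h


def moment_of_convolution(f, g, n):
    """Moment of the convolution, to be compared with the product of moments."""
    return mellin_moment(lambda x: convolution(f, g, x), n)
```

For f(x) = x and g(x) = x^2 the convolution is x - x^2, and its third moment 1/20 equals the product (1/4)(1/5) of the individual moments.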
Using the abbreviations (3.2) the simple convolutions of the one-loop coefficient
functions for F2,NS and F3 can be written as
2
(x) =
c2,1
256
D3 64D2 283.157D1 + 304.751D0 + 346.213(1 x)
9
26.51L31 + 192.9L21 + 198.2L1 + 113.0L21L0
1.230L30 + 9.466L20 + 32.45L0 483.3x 410.5,
(4.1)
and
2
(x) =
c3,1
256
D3 64D2 283.157D1 + 304.751D0 + 345.993(1 x)
9
27.09L31 + 162.1L21 + 248.0L1 + 91.79L21L0
1.198L30 + 3.054L20 + 65.54L0 305.3x 335.7.
(4.2)
The corresponding parametrizations for the double convolutions read

3
c2,1
(x) =
1024
1280
D5
D4 2757.883D3 + 9900.585D2
9
3
+ 3917.516D1 12573.13D0 2851.0(1 x) 151.4L51 + 118.9L41
6155L31 47990L21 30080L1 + 6423L21L0 0.35L50
4.30L40 106.7L30 1257L20 4345L0 + 3618x + 8547,
(4.3)
and
3
(x) =
c3,1
1024
1280
D5
D4 2757.883D3 + 9900.585D2 + 3917.516D1
9
3
12573.13D0 2888.1(1 x) 138.4L51 + 409.0L41
1479L31 24700L21 + 9646L1 10080L21L0 0.119L50 + 3.126L40
+ 84.84L30 + 288.7L20 + 264.87L0 + 15410x + 15890.
(4.4)
The convolutions $c_{a,2} \otimes c_{a,1}$ for $F_{2,NS}$ in electromagnetic DIS and for the charged-current combination $F_3^{\nu+\bar\nu}$ are parametrized as
[c2,2 c2,1 ](x)

1536
D5 343.702D4 633.29D3 + 5958.86D2 6805.10D1
=
27
2464.47D0 101.7L51 155.1L41 6553L31 23590L21 + 10620L1
+ 9290L21L0 0.35L50 + 0.64L40 + 92.93L30 + 761.9L20 + 2450L0
1251x + 6286 + 8609.2 (1 x)

+ Nf 7.0912D4 55.3087D3 + 18.629D2 + 619.865D1
584.260D0 11.71L41 + 60.82L31 618.0L21 1979L1
919.6L21L0 + 0.48L40 1.08L30 43.83L20 125.5L0

295.1x + 522.4 809.14 (1 x) ,
(4.5)
and
[c3,2 c3,1 ](x)
= 1536/27D5 343.702D4 633.29D3 + 5958.86D2 6805.10D1
2464.47D0 77.39L51 + 289.1L41 2823L31 12500L21 + 25420L1
+ 9515L21L0 0.524L50 6.104L40 + 39.23L30 + 553.5L20 + 1393L0
+ 20080x + 2548 + 8478.2 (1 x)

+ Nf 7.0912D4 55.3087D3 + 18.629D2 + 619.865D1
584.260D0 14.30L41 + 10.47L31 775.2L21 2458L1
392.9L21L0 + 0.482L40 + 2.541L30 41.04L20 223.9L0

891.9x + 468.0 803.43 (1 x) .
(4.6)
The corresponding results for the charged-current quantities F2,NS and F3 are obtained
by replacing the non-Dk Nf terms of Eqs. (4.5) and (4.6), respectively, by
109.2L51 243.4L41 6890L31 24000L21 + 10840L1 + 9144L21L0 0.45L50,
+1.80L40 + 114.0L30 + 856.6L20 + 2602L0 711.6x + 6298 + 8569.2 (1 x),
and
77.39L51 + 295.5L41 2587L31 10580L21 + 30580L1 + 6461L21L0 0.404L50,
5.525L40 + 23.80L30 + 484.7L20 + 1577L0 + 22220x + 3349 + 8485.0 (1 x).
5. Numerical results for the scaling violations

In this section we illustrate the effect of the next-to-next-to-next-to-leading order (N$^3$LO) contributions to the physical evolution kernels (2.7)–(2.9) for the electromagnetic structure function $F_{2,NS}$ and the charged-current combination $F_3^{\nu+\bar\nu}$, henceforth simply denoted $F_3$. Specifically, we will discuss the logarithmic derivatives $\dot{F}_a \equiv d \ln F_a / d \ln Q^2$ calculated at a fixed reference scale $Q^2 = Q_0^2$ for the initial conditions
  F_{2,NS}(x, Q_0^2) = x F_3(x, Q_0^2) = x^{1/2} (1-x)^3 .   (5.1)
The simple model shape (5.1) incorporates the most important features of non-singlet x-distributions of nucleons. Its overall normalization is irrelevant for the logarithmic derivatives considered here. The reference scale $Q_0^2$ is specified via
  \alpha_s(\mu_r^2 = Q_0^2) = 0.2 ,   (5.2)
irrespective of the order of the expansion. Eq. (5.2) corresponds to $Q_0^2 \simeq 30$ GeV$^2$, a scale typical for fixed-target DIS, for $\alpha_s(M_Z^2) \simeq 0.116$ beyond leading order (LO). The same input (5.1) and (5.2) is chosen at all orders and for both structure functions in order to facilitate a direct comparison of the various contributions to the evolution kernels (2.7). All graphical illustrations below refer to $N_f = 4$ effectively massless quark flavours.
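As a rough orientation only, the leading-order part of such a logarithmic derivative is simple to evaluate in moment space. The Python sketch below is not the N$^3$LO calculation of this section: it assumes the standard one-loop non-singlet kernel P_0^N = C_F [3 + 2/(N(N+1)) - 4 S_1(N)] for the expansion parameter a_s = \alpha_s/(4\pi) and integer N, and it takes the moments (2.10) of the model shape of Eq. (5.1) directly, without the structure-function-specific normalization of Eq. (2.1). The function names are hypothetical.

```python
import math

C_F = 4.0 / 3.0


def input_moment(n):
    """Moment (2.10) of x^(1/2) (1-x)^3, i.e. the Euler Beta function B(n + 1/2, 4)."""
    return math.gamma(n + 0.5) * math.gamma(4.0) / math.gamma(n + 4.5)


def lo_log_derivative(n, alpha_s=0.2):
    """Leading-order d ln F^N / d ln Q^2 = a_s * P_0^N for integer n >= 1."""
    a_s = alpha_s / (4.0 * math.pi)
    s1 = sum(1.0 / k for k in range(1, n + 1))
    p0 = C_F * (3.0 + 2.0 / (n * (n + 1)) - 4.0 * s1)
    return a_s * p0
```

At this order the logarithmic derivative of each moment does not depend on the input shape; the shape (5.1) matters once the moments are combined in the inversion to x-space.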
Before turning to the numerical results we have to specify our treatment of the four-loop splitting functions $P_{a,3}$. These functions enter Eq. (2.7) at order $\alpha_s^4$ (N$^3$LO) together with the three-loop coefficient functions (3.6)–(3.11) and the convolutions (4.3)–(4.6). As already mentioned in the introduction, the size of the two- and three-loop terms in the expansion of the non-singlet splitting functions strongly indicates that the effects of $P_{a,3}$ are very small in the x-region addressed by the present study, $x > 10^{-2}$. Hence a rather rough estimate of these quantities is sufficient here. We have checked that the [0/1] Padé approximation (see footnote 4) gives a reasonable, though not particularly accurate estimate of the three-loop non-singlet splitting functions $P^N_{a,2}$ in N-space for $N_f = 3, \ldots, 5$. Thus we choose the Mellin inverse of
  P^N_{a,3} \simeq \eta \, P^N_{a,3} \big|_{[1/1] Padé} ,   \eta = 0, \ldots, 2 ,   (5.3)
as our estimate of $P_{a,3}(x)$, i.e., we assign a 100% error to the predictions of the [1/1] Padé summation (the [0/2] Padé results are similar). The results obtained by combining Eq. (5.3) for $\eta = 2$ with $c^A_{a,3}$ in Eqs. (3.6)–(3.11) are denoted by N$^3$LO$_A$ in the figures below, those using $\eta = 0$ and $c^B_{a,3}$ by N$^3$LO$_B$. As in Section 3 the central predictions $\frac{1}{2}$(N$^3$LO$_A$ + N$^3$LO$_B$) are not shown separately.
The logarithmic scale derivatives $\dot{F}_{2,NS}$ and $\dot{F}_3$ resulting from Eqs. (5.1), (5.2) are shown in the left parts of Fig. 4 and Fig. 5, respectively, for the standard choice $\mu_r^2 = Q^2$ of the renormalization scale. At $x \lesssim 0.5$ the size of the NNLO and N$^3$LO corrections can barely be read on the scale of these graphs; therefore the differences $\dot{F}_a - (\dot{F}_a)_{NNLO}$ are displayed on a larger scale in the right parts of both figures. The difference of the N$^3$LO$_A$ and N$^3$LO$_B$ results is very small down to $x \simeq 10^{-2}$ even on this enlarged scale, demonstrating that our approximations (3.6)–(3.11) and (5.3) are completely sufficient in this region of x.
For both structure functions the N$^3$LO corrections are large only at very large x, where the kernels are dominated by the universal soft-gluon contributions. Towards smaller x the N$^3$LO effects rapidly decrease, e.g., from 6% at x = 0.85 to 2% at x = 0.65. The corresponding NNLO contributions amount to about 12% and 6%, respectively. At $10^{-2} \le x \le 0.6$ the N$^3$LO corrections are particularly small for $\dot{F}_{2,NS}$.
4 A brief discussion of the Padé summations and the resulting higher-order approximations can be found in the next section.
Fig. 4. The perturbative expansion of the scale derivative $\dot{F}_{2,NS} \equiv d \ln F_{2,NS} / d \ln Q^2$ of the electromagnetic structure function $F_{2,NS}$ at $\mu_r^2 = Q^2 \simeq 30$ GeV$^2$ for the initial conditions specified in Eqs. (5.1) and (5.2). The differences between the predictions at different orders in $\alpha_s$ are shown on a larger scale in the right part.
Fig. 5. As Fig. 4, but for the charged-current combination $F_3 \equiv F_3^{\nu+\bar\nu}$. In all figures the subscripts A and B at N$^3$LO refer to the approximations discussed below Eq. (5.3).
Fig. 6. The dependence of the NLO, NNLO and N$^3$LO predictions for $d \ln F_{2,NS} / d \ln Q^2$ at $Q^2 = Q_0^2 \simeq 30$ GeV$^2$ on the renormalization scale $\mu_r$ for six typical values of x.
The dependence of these scale derivatives on the renormalization scale (see footnote 5) is illustrated in Figs. 6 and 7. In the former figure the consequences of varying $\mu_r$ are shown for $\dot{F}_{2,NS}$ at six representative values of x (note that the scales of the ordinates are different in all six parts). Here we vary $\mu_r$ over a rather wide range, $\frac{1}{8} Q^2 \le \mu_r^2 \le 8 Q^2$, corresponding to $0.29 \gtrsim \alpha_s(\mu_r^2) \gtrsim 0.15$ for the initial condition (5.2). In the latter figure we display the absolute scale uncertainties of $\dot{F}_{2,NS}$ and $\dot{F}_3$ at $Q^2 = Q_0^2$, estimated by
  \Delta^{abs} \dot{F}_a \equiv \frac{1}{2} \Big\{ \max\big[ \dot{F}_a(\mu_r^2 = \tfrac{1}{4} Q^2, \ldots, 4 Q^2) \big] - \min\big[ \dot{F}_a(\mu_r^2 = \tfrac{1}{4} Q^2, \ldots, 4 Q^2) \big] \Big\} ,   (5.4)
i.e., using the smaller conventional interval $\frac{1}{2} Q, \ldots, 2Q$ for $\mu_r$. Also shown here are the further improvements resulting from including the approximate $\alpha_s^5$ (N$^4$LO) contributions to Eqs. (2.7)–(2.9) discussed in the next section.
5 As already mentioned below Eq. (2.4), we use the $\overline{\rm MS}$ renormalization scheme throughout this study.
Fig. 7. The renormalization scale uncertainties, as estimated by the quantities $\Delta^{abs} \dot{F}_a$ defined in Eq. (5.4), of the perturbative predictions for the scale derivatives of $F_{2,NS}$ and $F_3$ displayed in Fig. 4 and Fig. 5, respectively. Also included are the corresponding approximate N$^4$LO results derived in Section 6 below.
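The estimate of Eq. (5.4) is half the spread of a prediction over the conventional scale interval. A minimal Python sketch follows; the callable `prediction`, which maps a value of \mu_r^2 to the scale derivative, the logarithmic scan and the number of scan points are assumptions of this example, and the names are hypothetical.

```python
import math


def scale_uncertainty(prediction, q2, factor=4.0, steps=41):
    """Half the difference between the maximum and minimum of `prediction`
    for q2/factor <= mu_r^2 <= factor*q2, scanned on a logarithmic grid."""
    values = []
    for i in range(steps):
        exponent = -1.0 + 2.0 * i / (steps - 1)  # runs from -1 to +1
        values.append(prediction(q2 * factor**exponent))
    return 0.5 * (max(values) - min(values))
```

For a monotonic scale dependence the extrema sit at the ends of the interval; the scan also catches a maximum or minimum inside it.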
Our new N$^3$LO results represent a clear improvement over the NNLO stability [11] for all x-values of Fig. 6 except for x = 0.05 (here, however, the absolute spread is very small, see Fig. 7), where the difference between the N$^3$LO$_A$ and N$^3$LO$_B$ results at small $\mu_r$ becomes comparable to the $\mu_r$-variation at NNLO. This enhanced sensitivity at small scales is due to the larger values of $\alpha_s$ up to almost 0.3, which enter the approximate contributions to Eq. (2.7) as $\alpha_s^4$. The present approximation uncertainties of the N$^3$LO results are actually dominated by the estimate (5.3) for the four-loop splitting functions, not by the residual uncertainties of the three-loop coefficient functions quantified in Eqs. (3.6)–(3.11).
The results shown in Fig. 7 correspond to relative uncertainties $(\Delta^{abs} \dot{F}_a)/\dot{F}_a$ of 8% at NNLO and 5% and 3% at N$^3$LO and N$^4$LO, respectively, for both $F_{2,NS}$ and $F_3$ at x = 0.85. The corresponding figures at x = 0.65 read 5% (NNLO), 2% (N$^3$LO) and 1% (N$^4$LO). These scale uncertainties are rather similar to the relative size of the highest-order contributions at $\mu_r^2 = Q^2$ in Figs. 4 and 5 (for the N$^4$LO contribution see Fig. 11 in Section 6). Hence the $\mu_r$-variation (5.4) and the size of the last contribution included in Eq. (2.7) yield consistent uncertainty estimates in this region of x. At smaller x the absolute scale uncertainties are very small at N$^3$LO and N$^4$LO. For $\Delta^{abs} \dot{F}_{2,NS}$ values even below 0.001 are reached for $x \le 0.5$ at N$^3$LO and $x \le 0.6$ at N$^4$LO.
We conclude this section by illustrating the effect of the higher-order terms in Eq. (2.7) on the determination of $\alpha_s$ from the scaling violations of non-singlet structure functions. For this illustration we assume that $F_{2,NS}$ and $F_3$ at $Q_0^2 \simeq 30$ GeV$^2$ are given by Eq. (5.1) with negligible uncertainty. The resulting average N$^3$LO predictions of $\dot{F}_{2,NS}$ and $\dot{F}_3$ for $\mu_r^2 = Q_0^2$ and $\alpha_s = 0.2$ are employed as model data at $x_k = 0.1 k - 0.05$ with k = 1, ..., 8. Roughly following the experimental pattern, we assign errors of 0.005 for k = 2, ..., 6, of 0.01 for k = 1, 7 and of 0.02 for k = 8 to these data points (for Eq. (5.5) only the relative size of these errors is relevant). Again already including the N$^4$LO estimate obtained in the next section, the fits of $\alpha_s(Q_0^2)$ to these model data yield
  \alpha_s(Q_0^2)_{NLO}     = 0.2080^{+0.021}_{-0.013} ,     0.2035^{+0.019}_{-0.011} ,
  \alpha_s(Q_0^2)_{NNLO}    = 0.2010^{+0.008}_{-0.0025} ,    0.1995^{+0.0065}_{-0.0015} ,
  \alpha_s(Q_0^2)_{N^3LO}   = 0.2000^{+0.003}_{-0.001} ,     0.2000^{+0.0025}_{-0.0005} ,
  \alpha_s(Q_0^2)_{N^4LO}   = 0.2000^{+0.0015}_{-0.0005} ,   0.2005^{+0.0015}_{-0.0005} ,   (5.5)
where the first column refers to $F_{2,NS}$ and the second to $F_3$. The central values represent the respective results for $\mu_r^2 = Q_0^2$, and the errors are due to the renormalization scale variation $\frac{1}{4} Q^2 \le \mu_r^2 \le 4 Q^2$, for the N$^3$LO and N$^4$LO cases combined with the approximation uncertainties. Unlike the NNLO terms, the N$^3$LO and N$^4$LO corrections do not cause significant shifts of the central values, but just lead to a reduction of the $\mu_r$ uncertainties which reach about 1% at N$^3$LO. The difference of the NLO and NNLO central results for $F_3$ is half as large as that for $F_{2,NS}$. This effect is due to larger positive corrections to the logarithmic derivative at x < 0.4 in the former case (see Figs. 4 and 5), which counteract the effect of the negative corrections at large x in the fit. As far as Eq. (5.5) can be compared to the fits of real data in Refs. [3,4] (where higher-twist contributions affecting the central values are included), our findings are consistent with those results.
6. Resummations and optimizations

Finally we address the predictions of the soft-gluon resummation, the ECH and PMS scheme optimizations, and the Padé approximations for the physical evolution kernel $K_a$ in Eq. (2.7). Being especially interested in the region of large x/large N, where the higher-order corrections are large but similar for $F_{2,NS}$ and $F_3$, we will for brevity focus on the former, more accurately measured quantity. Also in this section the numerical results are given for $N_f = 4$ and the initial conditions (5.1) and (5.2). We will mainly consider the predictions of the above-mentioned approaches at fixed order in $\alpha_s$, and only at the end briefly turn to the all-order results for the soft-gluon exponentiation and the Padé summations.
The N$^l$LO predictions of the soft-gluon resummation for the kernels (2.7) are given by the terms $a_s^{l+1} K_{res,l}$ in Eq. (2.17). Recall that the leading logarithmic (LL), next-to-leading logarithmic (NLL) and next-to-next-to-leading logarithmic (NNLL) contributions behave as $\ln^{l+1} N$, $\ln^{l} N$ and (at $l \ge 2$) $\ln^{l-1} N$, respectively. The terms in the l-loop coefficient functions $c_{a,l}$ proportional to $\ln^k N$ with k = l+2, ..., 2l cancel in the combinations (2.8) for $l \ge 2$. This implies that, from l = 5 onwards, actually none of the four leading $\ln^k N$ terms of $c_{a,l}$ presently fixed by the soft-gluon exponentiation (2.11) contributes to the N$^l$LO kernels (2.7). Consequently, we expect a pattern for the numerical soft-gluon approximations to the physical kernels which is rather different from that discussed in Ref. [16] for the $\overline{\rm MS}$ coefficient functions.
The cumulative effect of soft-gluon terms at NNLO ($\alpha_s^3$) and N$^3$LO ($\alpha_s^4$) is compared to the (approximate) full results in Fig. 8. The two solid curves in the right plot refer to the N$^3$LO$_A$ and N$^3$LO$_B$ approximations discussed below Eq. (5.3), the two NNLL results to the values 8 and 12 of the parameter in Eq. (3.14). For the (undisplayed) NLO contribution the LL and NLL predictions are considerably smaller and larger, respectively, than the full result, whereas the inclusion of also the $a_s^2 N^0$ term arising from soft and virtual emissions leads to a reasonable approximation. Combined with this situation, the results of Fig. 8 indicate that the number of soft-gluon logarithms required for a realistic approximation at N$^l$LO systematically increases with the order l: The full NLO, NNLO, and N$^3$LO curves run between the LL and NLL, close to the NLL, and between the NLL and NNLL results, respectively. The NNLL soft-gluon contribution may thus be expected to represent a reasonable estimate for the N$^4$LO ($\alpha_s^5$) term of Eq. (2.7) at large x/large N. As shown in Fig. 9 (see footnote 6), however, the spread due to the present uncertainty (3.14) of the parameters $B_2$ and $D_2$ entering Eq. (2.17) is unfortunately rather large. Moreover, even with this uncertainty removed, e.g., by a future exact calculation of the three-loop coefficient functions, Figs. 8 and 9 indicate that the soft-gluon resummation can hardly be expected to provide accurate information on the N$^4$LO term, even for moments as large as $N \simeq 30$.
The renormalization-scheme optimizations assume that the higher-order corrections to the N$^l$LO physical kernels $K_a^{(l)N}$ in N-space given by
  \frac{d \ln F_a^N}{d \ln Q^2} = K_a^{(l)N} = a_s K_0^N \big( 1 + a_s r^N_{a,1} + \ldots + a_s^l r^N_{a,l} \big)   (6.1)
are small in a certain 'optimal' scheme. The principle of minimal sensitivity (PMS) proposed in Ref. [19] selects this scheme by the requirement
  \frac{d K_a^N(RS)}{d(RS)} = 0 ,   (6.2)
where d/d(RS) abbreviates the derivatives with respect to the l independent parameters specifying the renormalization scheme at N$^l$LO. In the effective charge (ECH) method of Ref. [20], on the other hand, these parameters are chosen such that
  r^N_{a,1} = \ldots = r^N_{a,l} = 0 .   (6.3)
6 Already Fig. 3 demonstrates that the universal soft-gluon terms do not provide a good approximation for x < 0.7, hence the corresponding x-space results in Figs. 9 and 12 are shown only at larger x.
Fig. 8. The successive soft-gluon approximations for the NNLO (left) and N$^3$LO (right) contributions to the moments (2.10) of the evolution kernel (2.7) at $\mu_r^2 = Q^2$, compared with the full results for $F_{2,NS}$ addressed in the previous section. Besides the $\ln^n N$ terms of $K_{res,2}$ in Eq. (2.17), also the $N^0$ contribution is included for the NNLO case.
Fig. 9. The results of the soft-gluon resummation (2.17) for the N$^4$LO contribution to the kernel (2.7). The left part corresponds to the right plot of Fig. 8, in the right part the resulting large-x predictions are shown for the $\alpha_s^5$ corrections to the results of Fig. 4.
Assuming that in these schemes the next terms $r_{a,l+1}^N$ are not just small but vanishing, the
transformation back to $\overline{\rm MS}$ (or any other scheme) leads to the respective PMS and ECH
predictions for this quantity in terms of $r_{a,1}^N, \ldots, r_{a,l}^N$ and the coefficients (2.4) of the $\beta$-function.
Up to $r_4$ these predictions are explicitly given in Eqs. (6)–(11) and (13)–(17) of
Ref. [31], thus we refrain from repeating them here.
Another approach for estimating the higher-order corrections is provided by the Padé
summation of the perturbation series, for QCD in detail discussed, e.g., in Ref. [18]. In this
method $K_a^{(l)N}$ in Eq. (6.1) is replaced by
$$ K_{a,[\mathcal N/\mathcal D]}^N = a_s K_0^N \, \frac{1 + a_s p_{a,1}^N + \ldots + a_s^{\mathcal N} p_{a,\mathcal N}^N}{1 + a_s q_{a,1}^N + \ldots + a_s^{\mathcal D} q_{a,\mathcal D}^N} \tag{6.4} $$
with
$$ \mathcal D \geq 1 \quad \text{and} \quad \mathcal N + \mathcal D = l. \tag{6.5} $$
The determination of the parameters $p_i$ and $q_j$ from the $r_1, \ldots, r_l$ of Eq. (6.1) is
automated in programs for symbolic manipulation such as MAPLE [32]. Expanding
$K_{a,[\mathcal N/\mathcal D]}^N$ to order $l+1$ then yields the $[\mathcal N/\mathcal D]$ Padé predictions for the N$^{l+1}$LO coefficients
$r_{a,l+1}^N$. These predictions need not be written down here either. Beyond the second-order
results there is no obvious relation between the predictions of the scheme optimizations
and those of the Padé approximations. Consistent results of these methods for $r_{l>2}$ are thus
usually considered as evidence of the approximate correctness of these predictions [18].
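The Padé prediction described above can be sketched numerically. The following Python fragment is an illustration only and not the MAPLE procedure used for the paper: it takes hypothetical coefficients `c = [1, r_1, ..., r_l]` of the bracket in Eq. (6.1), solves the linear equations that fix $q_1, \ldots, q_{\mathcal D}$ for a chosen $[\mathcal N/\mathcal D]$ with $\mathcal N + \mathcal D = l$, and returns the implied $r_{l+1}$. The function name `pade_predict` and the input numbers are assumptions made for this sketch.

```python
from fractions import Fraction


def pade_predict(c, n_num, n_den):
    """Predict the next series coefficient from an [n_num/n_den] Pade approximant.

    c holds the known coefficients [1, r_1, ..., r_l] with l = n_num + n_den.
    """
    c = [Fraction(x) for x in c]
    l = n_num + n_den
    if n_den < 1 or len(c) != l + 1 or c[0] != 1:
        raise ValueError("need D >= 1, c[0] = 1 and exactly N + D + 1 coefficients")

    def coeff(k):
        # Series coefficient c_k, taken as zero for negative k.
        return c[k] if k >= 0 else Fraction(0)

    # Orders a^(N+1) ... a^(N+D) of Q(a) * C(a) must vanish:
    #   sum_{j=1..D} q_j c_{k-j} = -c_k   for k = N+1, ..., N+D.
    rows = [[coeff(k - j) for j in range(1, n_den + 1)] + [-coeff(k)]
            for k in range(n_num + 1, l + 1)]

    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(n_den):
        pivot = next((r for r in range(col, n_den) if rows[r][col] != 0), None)
        if pivot is None:
            raise ZeroDivisionError("singular Pade system for these coefficients")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n_den):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    q = [rows[j][n_den] for j in range(n_den)]

    # The order a^(l+1) term of Q * C also vanishes, which fixes c_{l+1}.
    return -sum(q[j - 1] * coeff(l + 1 - j) for j in range(1, n_den + 1))


# Hypothetical input: r_1 = 2, r_2 = 5 with a [1/1] approximant.
print(pade_predict([1, 2, 5], 1, 1))  # r_2**2 / r_1 = 25/2
```

For a [1/1] approximant the result reduces to the closed form $r_3 = r_2^2/r_1$, and for a purely geometric series any $[\mathcal N/\mathcal D]$ choice reproduces the next power exactly, which is a convenient sanity check of such a routine.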
The PMS, ECH and Padé results for the NNLO and N$^3$LO $N$-space kernels (6.1) are
compared in Fig. 10 to the (approximate) full results already shown, on the same scale, in
Fig. 8. Disregarding large relative, but small absolute deviations at NNLO for small $N$, the
PMS and ECH results (which are very similar at NNLO and identical at N$^3$LO) represent
good approximations at both orders. The Padé approximations are somewhat smaller;
however, this offset seems to decrease with the order in $\alpha_s$. In the left part of Fig. 11
we present the corresponding N$^4$LO predictions. The inner three curves have been derived
from the central N$^3$LO results of Section 5. The PMS and ECH results are again very similar
and are therefore not shown separately. The impact of the present uncertainty of the N$^3$LO kernels,
dominated by the estimate (5.3) of the four-loop splitting functions, is included in the two
dotted curves which represent our final estimate for the N$^4$LO term and its uncertainty.
In the right part of Fig. 11 the N$^4$LO corrections to the results of Fig. 4 are compared
to the N$^3$LO contribution. Within the large uncertainties of the latter, these results are
consistent with the NNLL soft-gluon prediction shown in Fig. 9. The consequences of
including the N$^4$LO estimates have been presented in Fig. 7 and Eq. (5.5), respectively, for
the renormalization-scale stability and the determination of $\alpha_s$.
Finally the $\alpha_s^{l>5}$ infinite-order predictions of the soft-gluon resummation (2.17) (using
the minimal prescription contour [27]) and the Padé approximations (6.4) are compared
in Fig. 12. For the present uncertainties (3.14) (the curves in the figure refer to = 8
(upper), $D_2 = 0$ in Eq. (2.18) (middle) and = 12 (lower)) it is not possible to draw any
conclusions from the soft-gluon result. The Padé summation, on the other hand, provides
rather definite predictions: The terms beyond $\alpha_s^5$ can be expected to have a very small
impact at $x \lesssim 0.75$. For our standard reference value $\alpha_s = 0.2$ their effect reaches about
the size of the N$^4$LO and N$^3$LO contributions at $x \simeq 0.9$ and $x \simeq 0.95$, respectively.
Fig. 10. The PMS, ECH and Padé estimates of the NNLO (left) and N$^3$LO (right) parts of the
$N$-space evolution kernel for $F_{2,\rm NS}$ at $\mu_r^2 = Q^2$, compared with the full results illustrated in $x$-space
in Section 5. The scales of the graphs are the same as in Fig. 8.
Fig. 11. The PMS, ECH and Padé estimates of the N$^4$LO contributions to the evolution kernel (2.7)
for $F_{2,\rm NS}$. The left part is analogous to Fig. 10, in the right part the resulting $\alpha_s^5$ corrections to the
result in Fig. 4 are compared to the $\alpha_s^4$ (N$^3$LO) contribution. The scales of the graphs are the same
as in Fig. 9.
Fig. 12. Predictions of the soft-gluon resummation (2.17) and the Padé approximations (6.4) for the
contributions beyond N$^4$LO to the logarithmic scale derivative of $F_2 \equiv F_{2,\rm NS}$.
7. Summary
We have investigated the predictions of massless perturbative QCD for the scaling violations of the most important non-singlet structure functions in unpolarized DIS, extending
our previous NNLO results [11] to N$^3$LO and N$^4$LO for the region $x > 10^{-2}$. The main
objective of this extension is to reduce the theoretical uncertainty of determinations of $\alpha_s$
from inclusive DIS to about 1%, an accuracy which is sufficient to make full use of present
and future structure function measurements. Our results also facilitate improved determinations of power-suppressed contributions to the structure-function evolution by fits to data,
especially at large $x$ where the uncertainties are still sizeable at NNLO.
The new ingredients entering the N$^3$LO physical evolution kernels are the four-loop
splitting functions and the three-loop coefficient functions. The impact of the former
quantities is expected to be very small in the $\overline{\rm MS}$ scheme at $x > 10^{-2}$; it has been estimated
by a Padé approximation assigned a 100% uncertainty. For the latter quantities we have
derived approximate expressions based on the available integer moment [7,8] and
soft-gluon [15,16] results. The effect of these functions is very well under control at $x > 10^{-2}$,
and almost perfectly at $x \gtrsim 0.1$. In fact, the uncertainty of the splitting functions dominates
the small residual uncertainties of the evolution kernels. Hence the accuracy of our present
N$^3$LO results will be superseded only by a future four-loop calculation.
We have also studied the predictions of the NNLL soft-gluon resummation [17] and
of the Padé, PMS and ECH approximations [18–20]. Presently the predictions of the
resummation for the physical evolution kernels beyond N$^3$LO (in any case applicable
only at $x \gtrsim 0.8$) suffer from the incomplete determination of the soft-gluon parameters
$B_2$ and $D_2$, a problem which will be removed by the forthcoming exact calculation of the
three-loop coefficient functions [33]. The Padé, PMS and ECH approximations are found
to agree rather well with the NNLO and N$^3$LO results for the evolution kernels; these
approaches seem to provide reliable predictions of the effects at N$^4$LO and beyond.
For $\alpha_s \lesssim 0.2$ the N$^3$LO and N$^4$LO corrections at $\mu_r \simeq Q$ are very small at $x < 0.6$ and
$x < 0.8$, respectively, especially for the most accurately measured structure function $F_2$.
Consequently the central values of $\alpha_s$ determined from the non-singlet scaling violations
hardly change any more once the larger NNLO terms have been included, see also Ref. [3].
The scale uncertainty of the resulting $\alpha_s(M_Z^2)$ is reduced to the unproblematic level of less
than 1% at N$^3$LO and 0.5% at N$^4$LO. In order to ensure an overall theoretical accuracy of
about 1%, the heavy quark (especially charm) mass effects also need to be controlled with
this precision. We will address this point in a forthcoming publication.
FORTRAN subroutines of our approximations of the three-loop coefficient functions in
Section 3 and of the parametrizations of the convolutions entering the evolution kernels in
Section 4 can be found at http://www.lorentz.leidenuniv.nl/avogt.
Acknowledgements
We are grateful to J. Vermaseren for communicating the fourteenth moment of the three-loop coefficient function for $F_2$ to us prior to publication. This work has been supported
by the European Community TMR research network "Quantum Chromodynamics and the
Deep Structure of Elementary Particles" under contract No. FMRX-CT98-0194.
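Appendix A. Convolutions of +-distributions

The convolutions $D_i \otimes D_j$ of the +-distributions (3.2) for $i + j \leq 4$ are given by
$$
\begin{aligned}
D_0 \otimes D_0 &= 2 D_1 - \zeta_2\, \delta(1-x), \\
D_1 \otimes D_0 &= \tfrac{3}{2} D_2 - \zeta_2 D_0 + \zeta_3\, \delta(1-x), \\
D_1 \otimes D_1 &= D_3 - 2\zeta_2 D_1 + 2\zeta_3 D_0 - \tfrac{1}{10} \zeta_2^2\, \delta(1-x), \\
D_2 \otimes D_0 &= \tfrac{4}{3} D_3 - 2\zeta_2 D_1 + 2\zeta_3 D_0 - \tfrac{4}{5} \zeta_2^2\, \delta(1-x), \\
D_2 \otimes D_1 &= \tfrac{5}{6} D_4 - 3\zeta_2 D_2 + 6\zeta_3 D_1 - \zeta_2^2 D_0 + (4\zeta_5 - 2\zeta_2\zeta_3)\, \delta(1-x), \\
D_2 \otimes D_2 &= \tfrac{2}{3} D_5 - 4\zeta_2 D_3 + 12\zeta_3 D_2 - 4\zeta_2^2 D_1 + (16\zeta_5 - 8\zeta_2\zeta_3) D_0 + \left( 4\zeta_3^2 - \tfrac{46}{35} \zeta_2^3 \right) \delta(1-x), \\
D_3 \otimes D_0 &= \tfrac{5}{4} D_4 - 3\zeta_2 D_2 + 6\zeta_3 D_1 - \tfrac{12}{5} \zeta_2^2 D_0 + 6\zeta_5\, \delta(1-x), \\
D_3 \otimes D_1 &= \tfrac{3}{4} D_5 - 4\zeta_2 D_3 + 12\zeta_3 D_2 - \tfrac{27}{5} \zeta_2^2 D_1 + (18\zeta_5 - 6\zeta_2\zeta_3) D_0 + \left( 3\zeta_3^2 - \tfrac{36}{35} \zeta_2^3 \right) \delta(1-x), \\
D_4 \otimes D_0 &= \tfrac{6}{5} D_5 - 4\zeta_2 D_3 + 12\zeta_3 D_2 - \tfrac{48}{5} \zeta_2^2 D_1 + 24\zeta_5 D_0 - \tfrac{192}{35} \zeta_2^3\, \delta(1-x),
\end{aligned} \tag{A.1}
$$
up to integrable contributions dealt with numerically in Section 4. Here $\zeta_l$ stands for the
Riemann $\zeta$-function, and $\zeta_4$ and $\zeta_6$ have been expressed in terms of $\zeta_2^2$ and $\zeta_2^3$, respectively.
A convenient method to derive (and to extend, if required) Eq. (A.1) is by using the relation
between $D_i$ and the harmonic sums discussed, for example, in Ref. [34].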
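As a quick consistency check on Eq. (A.1), the coefficient of the highest distribution $D_{i+j+1}$ in each line can be recovered from the leading large-$N$ logarithms alone. The sketch below assumes the standard leading behaviour of the Mellin moments of $D_k$, namely $(-1)^{k+1} \ln^{k+1} N/(k+1)$ (an assumption stated here, not quoted from the paper), which gives the coefficient $(i+j+2)/[(i+1)(j+1)]$. The function name and the comparison table are illustrative.

```python
from fractions import Fraction


def leading_coefficient(i, j):
    """Coefficient of D_{i+j+1} in D_i (x) D_j from leading large-N logarithms.

    Assumes moments of D_k behave as (-1)**(k+1) * ln(N)**(k+1) / (k+1).
    """
    return Fraction(i + j + 2, (i + 1) * (j + 1))


# Leading coefficients as they appear in Eq. (A.1).
table_a1 = {
    (0, 0): Fraction(2), (1, 0): Fraction(3, 2), (1, 1): Fraction(1),
    (2, 0): Fraction(4, 3), (2, 1): Fraction(5, 6), (2, 2): Fraction(2, 3),
    (3, 0): Fraction(5, 4), (3, 1): Fraction(3, 4), (4, 0): Fraction(6, 5),
}

for (i, j), expected in sorted(table_a1.items()):
    status = "ok" if leading_coefficient(i, j) == expected else "MISMATCH"
    print(f"D_{i} (x) D_{j}: {leading_coefficient(i, j)}  {status}")
```

This only tests the first term of each line; the subleading terms involving $\zeta_2$, $\zeta_3$ and $\zeta_5$ require the full moment relations mentioned above.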
References
[1] D.E. Groom et al., Particle Data Group, Eur. Phys. J. C 15 (2000) 1, and references therein.
[2] W. Furmanski, R. Petronzio, Z. Phys. C 11 (1982) 293.
[3] A.L. Kataev, A. Kotikov, G. Parente, A.V. Sidorov, Phys. Lett. B 417 (1998) 374;
A.L. Kataev, G. Parente, A.V. Sidorov, Nucl. Phys. B 573 (2000) 405;
A.L. Kataev, G. Parente, A.V. Sidorov, hep-ph/0012014, CERN-TH-2000-343.
[4] J. Santiago, F.J. Yndurain, Nucl. Phys. B 563 (1999) 45;
J. Santiago, F.J. Yndurain, hep-ph/0102312, FTUAM 01-01.
[5] E.B. Zijlstra, W.L. van Neerven, Phys. Lett. B 272 (1991) 127;
E.B. Zijlstra, W.L. van Neerven, Phys. Lett. B 273 (1991) 476;
E.B. Zijlstra, W.L. van Neerven, Phys. Lett. B 297 (1992) 377;
E.B. Zijlstra, W.L. van Neerven, Nucl. Phys. B 383 (1992) 525.
[6] S. Moch, J.A.M. Vermaseren, Nucl. Phys. B 573 (2000) 853.
[7] S.A. Larin, T. van Ritbergen, J.A.M. Vermaseren, Nucl. Phys. B 427 (1994) 41;
S.A. Larin, P. Nogueira, T. van Ritbergen, J.A.M. Vermaseren, Nucl. Phys. B 492 (1997) 338.
[8] A. Retey, J.A.M. Vermaseren, hep-ph/0007294, NIKHEF-2000-018;
J.A.M. Vermaseren, private communication.
[9] J.A. Gracey, Phys. Lett. B 322 (1994) 141;
J.F. Bennett, J.A. Gracey, Nucl. Phys. B 517 (1998) 241.
[10] S. Catani, F. Hautmann, Nucl. Phys. B 427 (1994) 475;
J. Blümlein, A. Vogt, Phys. Lett. B 370 (1996) 149;
V.S. Fadin, L.N. Lipatov, Phys. Lett. B 429 (1998) 127, and references therein;
M. Ciafaloni, G. Camici, Phys. Lett. B 430 (1998) 349.
[11] W.L. van Neerven, A. Vogt, Nucl. Phys. B 568 (2000) 263.
[12] W.L. van Neerven, A. Vogt, Nucl. Phys. B 588 (2000) 345.
[13] W.L. van Neerven, A. Vogt, Phys. Lett. B 490 (2000) 111.
[14] R. Hamberg, W.L. van Neerven, T. Matsuura, Nucl. Phys. B 359 (1991) 343;
W.L. van Neerven, E.B. Zijlstra, Nucl. Phys. B 382 (1992).
[15] G. Sterman, Nucl. Phys. B 281 (1987) 310;
L. Magnea, Nucl. Phys. B 349 (1991) 703;
S. Catani, L. Trentadue, Nucl. Phys. B 327 (1989) 323;
S. Catani, L. Trentadue, Nucl. Phys. B 353 (1991) 183;
S. Catani, G. Marchesini, B.R. Webber, Nucl. Phys. B 349 (1991) 635.
[16] A. Vogt, Phys. Lett. B 471 (1999) 97.
[17] A. Vogt, Phys. Lett. B 497 (2001) 228.
[18] M.A. Samuel, J. Ellis, M. Karliner, Phys. Rev. Lett. 74 (1995) 4380;
J. Ellis, E. Gardi, M. Karliner, M.A. Samuel, Phys. Lett. B 366 (1996) 268;
J. Ellis, E. Gardi, M. Karliner, M.A. Samuel, Phys. Rev. D 54 (1996) 6986;
S.J. Brodsky et al., Phys. Rev. D 56 (1997) 6980.
[19] P.M. Stevenson, Phys. Rev. D 23 (1981) 2916.
[20] G. Grunberg, Phys. Rev. D 29 (1984) 2315.
[21] O.V. Tarasov, A.A. Vladimirov, A.Yu. Zharkov, Phys. Lett. B 93 (1980) 429;
S.A. Larin, J.A.M. Vermaseren, Phys. Lett. B 303 (1993) 334.
[22] T. van Ritbergen, J.A.M. Vermaseren, S.A. Larin, Phys. Lett. B 400 (1997) 379.
[23] G. Curci, W. Furmanski, R. Petronzio, Nucl. Phys. B 175 (1980) 27.
[24] A. Gonzales-Arroyo, C. Lopez, F.J. Yndurain, Nucl. Phys. B 153 (1979) 161;
G.P. Korchemsky, Mod. Phys. Lett. A 4 (1989) 1257;
S. Albino, R.D. Ball, hep-ph/0011133, CERN-TH/2000-332.
[25] T. Matsuura, W.L. van Neerven, Z. Phys. C 38 (1988) 623.
[26] S. Catani, D. de Florian, M. Grazzini, hep-ph/0102227, CERN-TH-2001-044;
R.V. Harlander, W.B. Kilgore, hep-ph/0102241, BNL-HET-01-6.
[27] S. Catani, M.L. Mangano, P. Nason, L. Trentadue, Nucl. Phys. B 478 (1996) 273.
[28] E. Remiddi, J.A.M. Vermaseren, Int. J. Mod. Phys. A 15 (2000) 725.
[29] M. Krämer, E. Laenen, M. Spira, Nucl. Phys. B 511 (1998) 523.
[30] M. Diemoz, F. Ferroni, E. Longo, G. Martinelli, Z. Phys. C 39 (1988);
M. Glück, E. Reya, A. Vogt, Z. Phys. C 48 (1990) 471;
Ch. Berger, D. Graudenz, M. Hampel, A. Vogt, Z. Phys. C 70 (1996) 77;
D.A. Kosower, Nucl. Phys. B 506 (1997) 439;
D.A. Kosower, Nucl. Phys. B 520 (1998) 263.
[31] A.L. Kataev, V.V. Starshenko, Mod. Phys. Lett. A 10 (1995) 235.
[32] D. Redfern, The Maple Handbook (Maple V release 4), Springer, 1996.
[33] S. Moch, J.A.M. Vermaseren, Nucl. Phys. (Proc. Suppl.) 89 (2000) 131;
S. Moch, J.A.M. Vermaseren, Nucl. Phys. (Proc. Suppl.) 89 (2000) 137;
S. Moch, J.A.M. Vermaseren, M. Zhou, in preparation.
[34] J. Blümlein, S. Kurth, Phys. Rev. D 60 (1999) 014018.

Evolution equation for the structure function $g_2(x, Q^2)$

V.M. Braun a , G.P. Korchemsky b , A.N. Manashov a,c,1
a Institut für Theoretische Physik, Universität Regensburg, D-93040 Regensburg, Germany
b Laboratoire de Physique Théorique, Université de Paris XI, 91405 Orsay Cédex, France 2
c Departament d'ECM, Universitat de Barcelona, 08028 Barcelona, Spain
Received 13 March 2001; accepted 4 April 2001
Abstract
We perform an extensive study of the scale dependence of flavor-singlet contributions to the
structure function $g_2(x, Q^2)$ in polarized deep-inelastic scattering. We find that the mixing between
quark-antiquark-gluon and three-gluon twist-3 operators only involves the three-gluon operator with
the lowest anomalous dimension and is weak in other cases. This means, effectively, that only those
three-gluon operators with the lowest anomalous dimension for each moment are important, and
allows one to formulate a simple two-component parton-like description of $g_2(x, Q^2)$ in analogy with
the conventional description of twist-2 parton distributions. A similar simplification was observed
earlier for the nonsinglet distributions, although the reason is in our case different. © 2001 Elsevier
Science B.V. All rights reserved.
1. Introduction
Twist-three parton distributions in the nucleon are attracting constant interest as unique
probes of quark-gluon correlations in hadrons. Quantitative studies of twist-three effects
are becoming possible with the increasing precision of experimental data at SLAC and
RHIC, and can constitute an important part of the future spin physics program on high-luminosity accelerators like ELFE, eRHIC, etc.
The structure function $g_2(x, Q^2)$ of polarized deep inelastic scattering has received the most
attention in the past. The experimental studies at SLAC [1–3] have confirmed theoretical
expectations about the shape of $g_2(x, Q^2)$ and provided first evidence on the most
interesting twist-3 contribution. On the theoretical side, a lot of effort was invested to
understand the physical interpretation of twist-3 distributions (see, e.g., [4–6] for a
review of various aspects) and their scale dependence [7–12]. One-loop corrections to the
coefficient functions have been calculated [13–15].
E-mail address: vladimir.braun@physik.uni-regensburg.de (V.M. Braun).
1 Permanent address: Department of Theoretical Physics, Sankt-Petersburg State University, St. Petersburg,
Russia.
2 Unité Mixte de Recherche du CNRS (UMR 8627).
PII: S0550-3213(01)00165-1
V.M. Braun et al. / Nuclear Physics B 603 (2001) 69–124
In spite of the significant progress that has been achieved, understanding of the scale
dependence of $g_2(x, Q^2)$ still poses an outstanding theoretical problem. The difficulty
is due to the well-known fact [4–6] that the structure function $g_2(x, Q^2)$ presents by
itself only one special projection of a more general three-particle quark-antiquark-gluon
correlation function in the nucleon that depends, generally, on two variables: the
momentum fractions carried by partons.
In deep inelastic scattering with a transversely polarized target, only this special
projection can be measured. On the other hand, the scale dependence of the parent
quark-antiquark-gluon correlation function involves the full function in a nontrivial way [8] and
the knowledge of one particular projection $g_2(x, Q_0^2)$ at a given value of $Q_0^2$ does not allow
one to predict $g_2(x, Q^2)$ at different momentum transfers: a DGLAP-type evolution equation
for $g_2(x, Q^2)$ in QCD does not exist or, at least, is not warranted. The reason is simply that
inclusive measurements do not provide complete information on the relevant three-particle
parton correlation function.
From the phenomenological point of view this conclusion is not satisfactory since it
would mean that one cannot relate results of the measurements of g2 (x, Q2 ) at different
values of $Q^2$ to one another without model assumptions. The theoretical challenge is,
therefore, to find out whether the complicated pattern of quark-gluon correlations can
be reduced to a few effective degrees of freedom. In this case one will be able to find a
meaningful approximation to the scale dependence that introduces a minimum amount of
nonperturbative parameters.
Such an approximation is known [10] for the flavor-nonsinglet (NS) contribution to the
structure function. To explain this result, it is convenient to use the language of the Operator
Product Expansion (OPE), see Section 2 for more details. The statement of the OPE is
that odd moments $n = 3, 5, \ldots$ of the structure function $g_2(x, Q^2)$ can be expanded in
contributions of multiplicatively renormalized local quark-antiquark-gluon operators$^3$
$$ \int_0^1 dx\, x^{n-1} g_2^{\rm NS}(x, Q^2) = \sum_{k=0}^{n-3} C_{n-3}^k \left( \frac{\alpha_s(Q)}{\alpha_s(\mu)} \right)^{\gamma_{n-3}^k/b} \langle\!\langle O_{n-3}^k(\mu) \rangle\!\rangle, \tag{1.1} $$
where $C_{n-3}^k$ are the coefficient functions and $\langle\!\langle O_{n-3}^k(\mu) \rangle\!\rangle$ are reduced matrix elements
normalized at the scale $\mu$; $\gamma_{n-3}^k$ are the corresponding anomalous dimensions that we
assume are ordered with $k$: $\gamma_{n-3}^0 < \gamma_{n-3}^1 < \ldots < \gamma_{n-3}^{n-3}$ for each $n$, and $b = \frac{11}{3} N_c - \frac{2}{3} n_f$. Note that the number of contributing operators rises linearly with $n$ in the r.h.s.
of (1.1). This should be compared with the familiar case of leading twist-2 distributions.
There, a single operator (flavor-nonsinglet) exists for each moment. A measurement of the
moment of g2 (x, Q2 ) cannot separate between contributions of different operators and is,
therefore, not sufficient to predict the scale dependence.
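The point made above, that one measured moment cannot fix the scale dependence once several operators contribute, can be made concrete with a small numerical sketch of the structure of Eq. (1.1). All numbers below (coefficients, anomalous dimensions, matrix elements and values of the coupling) are hypothetical placeholders chosen only for illustration; they are not results of this paper.

```python
def beta0(n_c=3, n_f=4):
    """b = 11/3 N_c - 2/3 n_f, as defined below Eq. (1.1)."""
    return 11.0 / 3.0 * n_c - 2.0 / 3.0 * n_f


def evolve_moment(coeffs, gammas, matrix_elems, alpha_q, alpha_mu, b):
    """Sum over operators k of C^k (alpha_s(Q)/alpha_s(mu))**(gamma^k/b) <<O^k(mu)>>."""
    ratio = alpha_q / alpha_mu
    return sum(c * ratio ** (g / b) * o
               for c, g, o in zip(coeffs, gammas, matrix_elems))


b = beta0()
coeffs = [1.0, 1.0]      # hypothetical coefficient functions C^k
gammas = [8.0, 12.0]     # hypothetical anomalous dimensions, ordered in k
model_a = [0.10, 0.00]   # two hypothetical sets of matrix elements with the
model_b = [0.04, 0.06]   # same moment at the input scale mu

for name, model in (("A", model_a), ("B", model_b)):
    at_mu = evolve_moment(coeffs, gammas, model, 0.30, 0.30, b)
    at_q = evolve_moment(coeffs, gammas, model, 0.20, 0.30, b)
    print(f"model {name}: moment at mu = {at_mu:.4f}, at Q = {at_q:.4f}")
```

Both toy models give the same moment at the input scale but different values after evolution, which is the ambiguity that a single-operator (twist-2-like) description avoids.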
3 We neglect the Wandzura-Wilczek twist-2 contributions throughout this paper.
The situation simplifies drastically, however, in the large $N_c$ limit. It turns out [10] that
the tree-level coefficient functions, $C_{n-3}^k$, of all operators other than the one with the lowest
anomalous dimension for each $n$ are suppressed by powers of $1/N_c^2$ so that to this accuracy
one can approximate the sum in (1.1) by the first term $k = 0$. The corresponding anomalous
dimension $\gamma_{n-3}^{k=0}$ can be calculated analytically and the result can be reformulated as a DGLAP-type evolution equation
$$ Q^2 \frac{d}{dQ^2}\, g_2^{\rm NS}(x, Q^2) = \frac{\alpha_s}{4\pi} \int_x^1 \frac{dz}{z}\, P^{\rm NS}(x/z)\, g_2^{\rm NS}(z, Q^2), $$
$$ P^{\rm NS}(z) = \left[ \frac{4 C_F}{1-z} \right]_+ + \delta(1-z) \left[ C_F + \frac{1}{N_c} \left( 2 - \frac{\pi^2}{3} \right) \right] - 2 C_F, \tag{1.2} $$
where $C_F = (N_c^2 - 1)/(2N_c)$. Here, we have also included the $1/N_c^2$ corrections calculated
in [12].
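To see what Eq. (1.2) implies moment by moment, one can take Mellin moments of the kernel. The sketch below assumes the convention $P(n) = \int_0^1 dz\, z^{n-1} P^{\rm NS}(z)$ and uses the standard integral of the plus-distribution, $\int_0^1 dz\, z^{n-1} [1/(1-z)]_+ = -\sum_{k=1}^{n-1} 1/k$; the function names are illustrative and not taken from the paper.

```python
import math


def harmonic(m):
    """Harmonic number H_m = 1 + 1/2 + ... + 1/m, with H_0 = 0."""
    return sum(1.0 / k for k in range(1, m + 1))


def pns_moment(n, n_c=3.0):
    """n-th Mellin moment of P^NS(z) in Eq. (1.2), for integer n >= 1."""
    c_f = (n_c ** 2 - 1.0) / (2.0 * n_c)
    plus_part = -4.0 * c_f * harmonic(n - 1)             # from [4 C_F/(1-z)]_+
    delta_part = c_f + (2.0 - math.pi ** 2 / 3.0) / n_c  # from the delta(1-z) term
    const_part = -2.0 * c_f / n                          # from the constant -2 C_F
    return plus_part + delta_part + const_part


def pns_moment_large_nc(n, n_c):
    """Leading large-N_c form obtained by replacing C_F with N_c/2."""
    return -2.0 * n_c * (harmonic(n - 1) - 0.25 + 0.5 / n)


for n in (3, 5, 7):
    print(n, round(pns_moment(n), 4), round(pns_moment_large_nc(n, 3.0), 4))
```

With this convention each moment evolves independently, $d g_2^{\rm NS}(n)/d\ln Q^2 = (\alpha_s/4\pi)\, P(n)\, g_2^{\rm NS}(n)$, and the two functions differ only by terms suppressed by $1/N_c^2$ relative to the leading one.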
The present paper is devoted to the extension of this analysis to the flavor-singlet sector
in which case twist-3 composite local three-gluon operators have to be included. Coefficient functions of three-gluon operators vanish at tree level, so that gluon contributions
appear entirely through the evolution. The number of independent three-gluon operators
is, roughly speaking, half of the number of quark-gluon operators and, similarly, is rising
with the moment $n$. The subject of this work is to find out whether the whole set of gluon
operators contributes significantly, or whether one can reduce the gluon contribution to a certain
single degree of freedom. We find that such a reduction is indeed possible and formulate
a two-channel DGLAP-type evolution equation for the structure function $g_2(x, Q^2)$, which
is our main result.
In physical terms, the approximation constructed in this paper corresponds to the introduction of quark and gluon transverse spin parton distributions which are identified with
the particular components of quark-antiquark-gluon and three-gluon parton correlation
functions that possess the lowest anomalous dimension. We will find that, first, the structure function $g_2(x, Q^2)$ is dominated by contributions of these two distributions at large
scales (including the $O(\alpha_s)$ gluon contribution calculated in [15]), and, second, the leakage of transverse spin to genuine three-particle degrees of freedom at lower scales is small
due to the specific pattern of the QCD evolution. We would like to emphasize that the importance of this result lies not so much in the possibility to calculate the scale dependence, but
in the identification of important transverse spin degrees of freedom that are preserved by
QCD interactions.
The outline of the paper is as follows. Section 2 is mainly introductory. We introduce
necessary notation and present a summary of the existing calculations of the coefficient
functions in the operator product expansion. Section 3 is devoted to the general
formalism of the renormalization of twist-3 gluon operators. We emphasize the importance
of the conformal symmetry and introduce a convenient framework that allows one to treat
renormalization as a quantum mechanical problem with a hermitian Hamiltonian. This
section is necessarily rather technical and a reader who is only interested in the applications
may prefer to skip this discussion and go over directly to Section 4 where we collect
our results. The main result of this paper is the generalization of the DGLAP evolution
equation (1.2) to the flavor-singlet channel; it is repeated in Section 5 which also contains
a summary and conclusions.
In Appendix A we present a detailed calculation of the relevant anomalous dimensions.
In Appendix B we collect necessary formulae for the Racah 6j -symbols of the SL(2, R)
group. The conformal basis representation of the QCD evolution kernels is discussed in
Appendix C.
2. The operator product expansion

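The hadronic tensor which appears in the description of deep inelastic scattering of
polarized leptons on polarized nucleons involves two structure functions$^4$
$$ W^{(A)}_{\mu\nu} = \frac{1}{p \cdot q}\, \epsilon_{\mu\nu\alpha\beta}\, q^\alpha \left\{ s^\beta g_1(x_B, Q^2) + \left[ s^\beta - \frac{s \cdot q}{p \cdot q}\, p^\beta \right] g_2(x_B, Q^2) \right\} \tag{2.1} $$
and is related to the antisymmetric part of the Fourier-transform of the T-product of two
electromagnetic currents:
$$ W^{(A)}_{\mu\nu} = \frac{1}{\pi}\, {\rm Im}\, T^{(A)}_{\mu\nu}, \qquad i T^{(A)}_{\mu\nu} = \frac{i}{2} \int d^4x\, e^{iqx}\, \langle p, s| T\{ j_\mu(x/2) j_\nu(-x/2) - j_\nu(x/2) j_\mu(-x/2) \} |p, s\rangle. \tag{2.2} $$
The light-cone expansion of (2.2) at $x^2 \to 0$ goes in terms of nonlocal light-cone operators
of increasing twist, schematically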
i
T {j (x)j (x) j (x)j (x)}
2

x 2 0 i
=
[C O ]tw-2 + [C O ]tw-3 + (higher twists) ,
2
16 x
(2.3)
where $C \otimes O$ stands for the product (convolution) of the coefficient functions and
operators of the corresponding twist. This expression is explicitly $U(1)$-gauge invariant,
i.e., $\partial/\partial x_\mu\, T\{ j_\mu(x) j_\nu(-x) \} = 0$.
2.1. Twist-2
For the sake of completeness and in order to facilitate the comparison to twist-3, we
collect here the relevant portion of the results for the leading-twist.
Retaining the tree-level quark contribution and the leading gluon correction that starts at
order O(s ) one obtains [15]
4 We define the nucleon spin vector as s = u(p,
s) 5 u(p, s) where u(p, s) is the nucleon spinor
u(p,
s)u(p, s) = 2M, so that s 2 = 4M 2 .
[C O ]
tw-2
x
= 4
x
+

q=u,d,s,...
1
eq2
du
q(ux)/
x 5 q(ux) + (x x)

2
MS

4s 2 2
ln x MS + 2E u ln u + u(1 u)
x (ux)} ,
Tr{Gx (ux)G
(2.4)
where Gx = x Ga t a and the subscript [ ]2 indicates the normalization scale of the

operator. Here and in what follows it is implied that the gauge invariance
of nonlocal

light-cone operators is restored by including the gauge factors P exp(i dz A (z)). Note
simplicity of the answer (2.4): the entire gluon contribution can be eliminated by choosing
the proper scale of the quark operator 2 = 1/(x 2e2E ). This property is lost in the
MS
momentum space since after the Fourier transformation contributions of different lightcone separations get mixed.
Going over to the matrix elements, one introduces the quark and gluon helicity
distributions
1
p, s|q(x)/
x 5 q(x)|p, s = (sx)

d e2ipx 2q , 2 ,
x (x)}|p, s = i (sx)(px)
p, s| Tr{Gx (x)G
4
1

d e2ipx 2g , 2 .
(2.5)
In the first case, positive (negative) correspond to the contribution of quarks (antiquarks)
2q(xB ) = q (xB ) q (xB ), 2q(xB ) 2q(x
B ) = q (xB ) q (xB ), respectively. For
gluons, 2g( ) = 2g( ).
Ignoring the gluon contribution for a moment, making a Fourier transformation of (2.4),
taking the imaginary part and comparing with the definition of structure functions in (2.1) one
obtains
$$ g_1(x_B, Q^2) = \frac{1}{2} \sum_q e_q^2 \left[ \Delta q(x_B, \mu^2 = Q^2) + \Delta \bar q(x_B, \mu^2 = Q^2) \right], \tag{2.6} $$
where $x_B = Q^2/(2 p \cdot q)$ is the Bjorken variable, and the twist-2 contribution to $g_2(x_B, Q^2)$
$$ g_2^{\text{tw-2}}(x_B, Q^2) = g_2^{WW}(x_B, Q^2) = -g_1(x_B, Q^2) + \int_{x_B}^1 \frac{dy}{y}\, g_1(y, Q^2). \tag{2.7} $$
Eq. (2.6) tells that the structure function $g_1(x, Q^2)$ is a measure of the quark helicity
distribution in the nucleon, as is well known. Eq. (2.7) is the familiar Wandzura-Wilczek
relation [16]. We would like to stress that despite the fact that the relation in (2.6) is affected
by higher order perturbative radiative corrections, the Wandzura-Wilczek relation between
the structure functions, Eq. (2.7), is exact to all orders of perturbation theory and is only
modified by twist-3 contributions to $g_2(x_B, Q^2)$ that are the subject of this paper. The reason
for this is that the relation (2.7) follows from the particular (and unique) form of the Lorentz
structure for the antisymmetric part of the T-product in (2.3) which in turn is dictated by the
$U(1)$ gauge invariance. Although the coefficient function in front of the twist-2 operator
in (2.3) has a nontrivial perturbative expansion, it affects both structure functions $g_1(x, Q^2)$
and $g_2(x, Q^2)$ simultaneously and in the same way. Hence the Wandzura-Wilczek relation
holds true.
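For most of the subsequent discussion it will be convenient to go over to the moments
space:
$$ f(n, Q^2) = \int_0^1 dx_B\, x_B^{n-1} f(x_B, Q^2), \tag{2.8} $$
for any function $f$. In particular, restoring the gluon contribution we get [15]
$$ g_1(n, Q^2) = \frac{1}{2} \sum_{q=u,d,s,\ldots} e_q^2 \left\{ \Delta q(n, \mu^2_{\overline{\rm MS}}) + \Delta \bar q(n, \mu^2_{\overline{\rm MS}}) + \frac{\alpha_s}{2\pi}\, \frac{n-1}{n(n+1)}\, \Delta g(n, \mu^2) \left[ \ln \frac{Q^2}{\mu^2_{\overline{\rm MS}}} - \psi(n) - 1 - \gamma_E \right] \right\}, $$
$$ g_2^{\text{tw-2}}(n, Q^2) = -\frac{n-1}{n}\, g_1(n, Q^2). \tag{2.9} $$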
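The step from Eq. (2.7) to the second line of Eq. (2.9) can be checked with a toy input. The sketch below assumes a hypothetical test function $g_1(x) = x^p$ (chosen only because all integrals are elementary) and verifies in exact rational arithmetic that the moments (2.8) of the Wandzura-Wilczek expression equal $-(n-1)/n$ times the moments of $g_1$.

```python
from fractions import Fraction


def g1_moment(n, p):
    """Moment (2.8) of the toy function g1(x) = x**p: integral of x**(n-1+p)."""
    return Fraction(1, n + p)


def g2_ww_moment(n, p):
    """Moment (2.8) of g2_WW(x) = -g1(x) + int_x^1 dy/y g1(y) for g1(x) = x**p.

    For p >= 1 the inner integral is (1 - x**p)/p, so each term is a power of x.
    """
    first = -Fraction(1, n + p)                                      # from -g1(x)
    second = Fraction(1, p) * (Fraction(1, n) - Fraction(1, n + p))  # from the integral
    return first + second


for n in (1, 3, 5):
    for p in (1, 2):
        lhs = g2_ww_moment(n, p)
        rhs = -Fraction(n - 1, n) * g1_moment(n, p)
        print(n, p, lhs, rhs, lhs == rhs)
```

The $n = 1$ moment vanishes for any $p$, reflecting the factor $(n-1)$ in Eq. (2.9).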
2.2. Twist-3
The twist-3 contribution to the T-product in (2.3) is more complicated. One obtains the
following expression [15] 5
[C O ]tw-3
i
= 2
2x
+ u

q=u,d,s,...
2 s
1
eq2
u
du

dv (u + v)S (u, v, u) + (u v)S (u, v, u)

(v, u, u) + 2uO
(u, v, u) + (u v)O
(u, u, v)
(u + v)O

4s 2 2
ln x MS + 2E + 1
+

(v, u, u) + u O
(u, v, u) v O
(u, u, v)
uu
+ 14 u 2 + u ln u v O

1 2
(u, u, v) + O
(u, v, u) + O
(v, u, u)
u (u + 2) O
, (2.10)
12
2
MS
5 To avoid misunderstanding, note that this result does not include the $O(\alpha_s)$ correction to the coefficient
function of quark-antiquark-gluon operators. This correction has been calculated recently in [13,14] using a
different operator basis.
where $\bar u = 1 - u$ and we have introduced the C-even nonlocal quark-gluon operator

S (u, v, u) = S+ (u, v, u) + S (u, v, u),
1
x (bx) gGx (bx)5]/
G
x q(cx)
S (a, b, c) = q(ax)[ig
2
(2.11)
and the nonlocal three-gluon operator

(u, v, w) = ig f abc Gax (ux)G
bx (vx)Gcx (wx).
O
2
(2.12)
The relatively complicated expression in the last three lines in (2.10) reflects quark-gluon
mixing and reproduces the corresponding term in the renormalization group equation for
the twist-3 operator S (u, v, u) [9,17]
[S (u, v, u)]2
2
u s
22
s
ln
= [S (u, v, u)]2
ds dt (2u)3
1
2 21
u
u

[2u(s t) + 4(u s)(t + s)]O (s, v, t)

(s, t, v) O
(v, s, t)] .
2|u|(s t)[O
(2.13)
Similar to the twist-2 contribution in Eq. (2.4) this contribution can be eliminated by
appropriate choice of the normalization scale of the quark operator. Finally, the expression
in the second line in (2.10) is not affected by the scale choice and defines a genuine
twist-3 gluon coefficient function.
Nucleon matrix elements of the nonlocal operators in (2.11), (2.12) define parton
correlation functions
p, s|S (u, v, u)|p, s
1
= 2i(px)[s (px) p (sx)]
D eipx[1 u+2 v3 u] Dq (1 , 2 , 3 )
(2.14)
and
(u, v, u)|p, s
p, s|O
1
= 2(px) [s (px) p (sx)]
2
D eipx[1 u+2 v3 u] Dg (1 , 2 , 3 ),
(2.15)
where the integration measure is given by
$$ \int \mathcal D\xi \equiv \int_{-1}^{1} d\xi_1\, d\xi_2\, d\xi_3\, \delta(\xi_1 + \xi_2 + \xi_3). \tag{2.16} $$
The quark correlation functions $D_q^\pm(\xi_i)$ have the following symmetry property:
$$ D_q^\pm(\xi_1, \xi_2, \xi_3) = D_q^\mp(-\xi_3, -\xi_2, -\xi_1). \tag{2.17} $$
They are in general complex functions, but the imaginary parts do not contribute to the
structure functions and can be omitted [18]. In turn, the gluon correlation function $D_g(\xi_i)$
is real and antisymmetric under the interchange of the first and the third argument:
$$ D_g(\xi_1, \xi_2, \xi_3) = -D_g(\xi_3, \xi_2, \xi_1). \tag{2.18} $$
Substituting (2.10) into (2.3) and taking the Fourier transform one obtains the moments of
the structure function for positive odd n
tw-3

g2 n, Q2
1
=
2
4
eq2
q=u,d,s,...
1

q
D Dq i , 2MS n (1 , 3 )
1

2
s Dg (i , MS )
n+1

Q2
g
qg
n (i ) + n (i ) ln 2 (n) E 1
. (2.19)
MS
Here Dq (i ) is the distribution function corresponding to the C-even combination of

quarkgluon operators (2.11)
p, s|S (u, v, u)|p, s = 4i(px)[s (px) p (sx)]
1
D eipx[1 u+2 v3 u] Dq (1 , 2 , 3 )
(2.20)
so that
Dq (1 , 2 , 3 ) = Re Dq+ (1 , 2 , 3 ) = Re Dq (3 , 2 , 1 ),
(2.21)
the quark coefficient function is defined as 6

q
n (1 , 3 ) =
1n1 (3 )n1
3
1 + 3
(2.22)
q
and the gluon coefficient functions can be expressed in terms of n as

q

g
q
n (i ) = n1 (1 , 3 ) + n1 (1 3 , 3 ) + (1 3 ),

2
qg
g
n (i ) = 1 +
n (i )
n(n 2)

2(n 1) q
q
n1 (1 , 1 3 ) + n1 (3 , 1 + 3 ) .
+
n(n 2)
(2.23)
6 In contrast to Ref. [12] we prefer to define the quark contribution in terms of ${\rm Re}\, D_q^+(\xi_i)$ instead of
${\rm Re}\, D_q^-(\xi_i)$. Because of this, the quark coefficient function in (2.22) differs from the one given in [12] by the
replacement $\xi_3 \leftrightarrow -\xi_1$. The gluon coefficients are symmetric under this transformation and are not affected.

q
Explicit expressions for n and n for the lowest moments n = 3, 5, 7 can be found
in [15].
Equivalently, one may choose to expand moments of the structure function in
contributions of local operators, for example
k
1
gG 5 ]/
S N = q(
D x)k [ig G
x x (D x)Nk q,
2
i
k
bx (D
x)N1k Gcx .
[G ]N = gf abc Gax (D x)k G
(2.24)
2
The reduced matrix elements of local operators
k

p, s| S N |p, s = 2(ipx)N+1 [s (px) p (sx)] [S ]kN ,

p, s|[G ]kN |p, s = 2(ipx)N+1 [s (px) p (sx)] [G]kN
(2.25)
are equal to moments of the three-particle distributions$^7$
$$ [S^\pm]_N^k = \int \mathcal D\xi\, \xi_1^k\, \xi_3^{N-k}\, D_q^\pm(\xi_i), \qquad [G]_N^k = \int \mathcal D\xi\, \xi_1^k\, \xi_3^{N-1-k}\, D_g(\xi_i). \tag{2.26} $$
The symmetry relations (2.17) and (2.18) imply that $[S^\pm]_N^k = (-1)^N [S^\mp]_N^{N-k}$ and
$[G]_N^k = -[G]_N^{N-1-k}$. Therefore, the number of independent quark and gluon matrix
elements contributing to a given $n = (N+3)$th moment is equal to $\ell_q = N + 1$ and $\ell_g = [N/2]$, respectively. Finally, we define the reduced matrix elements corresponding to the
C-even quark-gluon operator (2.11) as
$$ [S]_N^k = {\rm Re}\, [S^+]_N^k = (-1)^N\, {\rm Re}\, [S^-]_N^{N-k} = \int \mathcal D\xi\, \xi_1^k\, \xi_3^{N-k}\, D_q(\xi_i). \tag{2.27} $$
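The counting of independent matrix elements quoted above follows directly from the symmetry relations and can be reproduced by brute force. In the sketch below (function names are illustrative), the gluon elements $[G]_N^k$, $k = 0, \ldots, N-1$, are paired by the antisymmetry $k \to N-1-k$, while the quark relation maps plus onto minus distributions, so that one of the two sets, $k = 0, \ldots, N$, remains independent.

```python
def independent_gluon_elements(big_n):
    """Count independent [G]^k_N under [G]^k_N = -[G]^(N-1-k)_N, k = 0..N-1."""
    independent = set()
    for k in range(big_n):
        partner = big_n - 1 - k
        if partner == k:
            continue            # self-conjugate element equals minus itself: zero
        independent.add(min(k, partner))
    return len(independent)


def independent_quark_elements(big_n):
    """Count independent [S^+]^k_N, k = 0..N; the [S^-] set is fixed by symmetry."""
    return big_n + 1


for n in (3, 5, 7, 9):
    big_n = n - 3
    print(f"n = {n}: quark {independent_quark_elements(big_n)}, "
          f"gluon {independent_gluon_elements(big_n)}")
```

For the lowest moment $n = 3$ this gives one quark and no gluon matrix element, and the gluon count grows at roughly half the rate of the quark count for higher moments.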
2.3. Transverse spin parton densities: why and why not

In order to pursue a parton model-like interpretation, one can introduce transverse spin
distributions as the specific projections of general quark-antiquark-gluon and three-gluon
operators. Their definition is suggested by the explicit form of the contribution of these
operators to the operator product expansion (2.10)
u

dv p, s| (u + v)S+ (u, v, u) + (u v)S (u, v, u) |p, s
1

(sx)
= i s p
d e2i upx 2qT , 2
(px)
(2.28)
7 The distinction between plus and minus distributions is delicate since it is affected by the convention used
to define the 5 matrix. We use 5 = i0 1 2 3 = i 0 1 2 3 and < 0123 = <0123 = 1 [19]. A sign change
in the definition of the 5 matrix results in the replacement S + S and, for matrix elements [S]kN
(1)N [S]Nk
N . The coefficient functions in the OPE are changed accordingly, so that the results for the
physical observables remain intact.
and
u

(v, u, u) + 2u O
(u, v, u) + (u v)O
(u, u, v) |p, s
dv p, s| (u + v)O
1
= 2[s (px) p (sx)]

d e2i upx 2gT , 2 ,
(2.29)
so that
1
1
d
n1
2qT ( ) = 4
D n (i ) Dq (i ),
1
1
d
n1
2gT ( ) = 2
D n (i )Dg (i ).
(2.30)
, Q2 )
The functions 2qT (xB

and 2gT (xB , Q2 ) describe the momentum fraction distribution of the transverse spin of the proton and have the same support property as the parton
distribution in (2.5). Note that 2gT ( ) = 2gT ( ). However, in contrast with (2.5), they
do not have any probabilistic interpretation but rather can be expressed through the more
general three parton correlation functions Dq (1 , 2 , 3 ) and Dg (1 , 2 , 3 ) integrating out
the dependence on one momentum fraction. The explicit expressions for the lowest moments (2.30) look as follows
1

dx 2qT (x) + 2qT (x) = 0,
(2.31)
1
1

Dx Dq (xi ) = 4 [S]00 ,
dx x 2qT (x) + 2qT (x) = 4

2
1
(2.32)

dx x 2qT (x) + 2qT (x) = 4
1

Dx x12 2x1 x3 + 3x32 Dq (xi )

= 4 [S]22 2 [S]12 + 3 [S]02
(2.33)
for the quark distribution and

1
1
dx 2gT (x) =
dx x 2 2gT (x) = 0,
(2.34)
1
1
dx x 2gT (x) = 10
4
for the gluon distribution.

Dx x1 Dg (xi ) = 10 [G]02
(2.35)
At tree level, neglecting the $O(\alpha_s)$-correction to (2.19), one obtains using (2.30)
\[
g_{2,\,\text{tw-3}}^{\rm Born}(x_B, Q^2) = \frac{1}{2} \sum_q e_q^2 \int_{x_B}^{1} \frac{dy}{y}\, \big[ \Delta q_T(y) + \Delta q_T(-y) \big]
\tag{2.36}
\]
that looks very similar to the leading twist expressions (2.6) and (2.7). The two contributions
in the square brackets can be interpreted as the contributions of quarks and antiquarks,
respectively. Following the analogy with the leading twist, it is convenient to introduce the
combinations of definite signature
\[
\Delta q_T^{\pm}(y, Q^2) = \Delta q_T(y, Q^2) \pm \Delta q_T(-y, Q^2).
\tag{2.37}
\]
The distribution $\Delta q_T^{+}(y, Q^2)$ corresponds to the even signature and can be obtained by
analytic continuation from even moments $N$ of the OPE. The distribution $\Delta q_T^{-}(y, Q^2)$
can be obtained by analytic continuation from odd moments $N$ and defines the
valence quark contribution. The gluon contribution enters into (2.19) through $O(\alpha_s)$
corrections and, according to (2.30), its genuine twist-3 part is parameterized by the
gluon distribution $\Delta g_T(x)$. This suggests that, similar to the leading twist expressions (2.6)
and (2.7), the twist-3 structure function $g_2^{\text{tw-3}}(x)$ can be described in terms of the quark
and gluon distributions, $\Delta q_T^{+}(x)$ and $\Delta g_T(x)$, respectively.
A deficiency of this interpretation is, however, that it does not go through beyond the
leading order. This is seen explicitly on the gluon contribution in Eqs. (2.19) and (2.23): the
qg
coefficient function n (i ) that is responsible for the mixing with quarkgluon operators
g
does not coincide with n (i ) and, therefore, this mixing brings in gluon contributions that
are not expressed entirely in terms of $\Delta g_T(\xi)$ defined in (2.30). Another reason is that the
scale dependence of the distributions introduced in (2.28), (2.29) involves the full three-particle functions $D_q$ and $D_g$ in a nontrivial way and, again, brings in additional degrees
of freedom.
The aim of this paper is to analyse the effects of QCD evolution in some detail. We will find
that, although the above-mentioned difficulties do exist, their numerical impact is likely to
be minimal. We will then be able to write an approximate effective two-channel evolution
equation involving the two distributions $\Delta q_T^{+}(y, \mu^2)$ and $\Delta g_T(y, \mu^2)$ in full analogy with
the flavor-singlet DGLAP evolution equations in the leading twist.
3. Hamiltonian approach to the three-particle evolution equations

Choosing $\mu^2_{\overline{\rm MS}} = Q^2$ one can eliminate large logarithmic corrections to the gluon
coefficient function in (2.19). To the leading logarithmic accuracy (LL) and retaining the
flavor-singlet (S) contribution to the structure function $g_2(x)$ we write
g2LL
where

2
n, Q2 = eq2
n
1
1

q
D n (1 , 3 ) DqS i , Q2 ,
(3.1)
\[
\langle e_q^2 \rangle = \frac{1}{n_f} \sum_{q=u,d,s,\ldots} e_q^2, \qquad
D_q^S(\xi_i) = D_u(\xi_i) + D_d(\xi_i) + D_s(\xi_i) + \cdots.
\tag{3.2}
\]
The gluon distribution is not present explicitly (to this accuracy) but reappears through the
evolution of the quark distribution to lower scales. To see how this happens, expand (3.1)
in contributions of flavor-singlet local operators $[S]_N^k(Q^2)$ defined in (2.27). Using (2.22)
and (2.26) one finds
\[
g_2^{LL}(n, Q^2) = \langle e_q^2 \rangle\, \frac{2}{n} \sum_{k=0}^{N} (-1)^{N-k} (N - k + 1)\, [S]_N^k(Q^2).
\tag{3.3}
\]
The scale dependence of the reduced matrix elements is described by the system of
coupled evolution equations
\[
\begin{aligned}
Q^2 \frac{d}{dQ^2}\, [S]_N^k(Q^2) &= -\frac{\alpha_s}{4\pi} \Big( \sum_{k'} [V_N^{qq}]_{kk'}\, [S]_N^{k'}(Q^2) + \sum_{m} [V_N^{qg}]_{km}\, [G]_N^{m}(Q^2) \Big), \\
Q^2 \frac{d}{dQ^2}\, [G]_N^m(Q^2) &= -\frac{\alpha_s}{4\pi} \Big( \sum_{k} [V_N^{gq}]_{mk}\, [S]_N^{k}(Q^2) + \sum_{m'} [V_N^{gg}]_{mm'}\, [G]_N^{m'}(Q^2) \Big)
\end{aligned}
\tag{3.4}
\]
with $[V_N^{AB}]$ being the known matrices of anomalous dimensions [8], $k, k' = 0, \ldots, N$ and
$m, m' = 0, \ldots, [N/2] - 1$. (Here $N = n - 3$ is the number of derivatives in the quark–gluon
operator, $[\ldots]$ stands for an integer part.) Solving these equations one defines $[3N/2] + 1$
linear combinations of the matrix elements

\[
O_{N,\alpha} = \sum_{0 \le k \le N} C_k^q(N)\, [S]_N^k + \sum_{0 \le m \le [N/2]-1} C_m^g(N)\, [G]_N^m, \qquad \alpha = 0, \ldots, [3N/2]
\tag{3.5}
\]
that are renormalized multiplicatively and obtain the moments of the structure function in
the standard form (1.1). The corresponding anomalous dimensions $\gamma_N$ can be found by
diagonalizing the full matrix of the anomalous dimensions entering (3.4)
\[
V_N = \begin{pmatrix} V_N^{qq} & V_N^{qg} \\ V_N^{gq} & V_N^{gg} \end{pmatrix}, \qquad
\big( C^q, C^g \big) \big( V_N - \gamma_N\, \mathbb{1} \big) = 0.
\tag{3.6}
\]
The left eigenstates of the mixing matrix define the vector of the coefficient functions
$(C_k^q(N), C_m^g(N))$ entering (3.5).
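The operator counting behind (3.4)–(3.6) can be checked with a few lines of code. The sketch below is only an illustration, not part of the original derivation: the function name is hypothetical, and it assumes the index ranges quoted above (k = 0, ..., N for the quark–gluon operators and m = 0, ..., [N/2] − 1 for the three-gluon operators).

```python
def forward_operator_counts(N):
    """Count the forward reduced matrix elements for N derivatives.

    Assumes quark-gluon indices k = 0..N and three-gluon indices
    m = 0..[N/2]-1, as quoted in the text.
    """
    n_quark = N + 1       # number of [S]_N^k
    n_gluon = N // 2      # number of [G]_N^m
    return n_quark, n_gluon, n_quark + n_gluon


for N in (0, 2, 4, 6):
    n_quark, n_gluon, total = forward_operator_counts(N)
    # The total should equal the number [3N/2] + 1 of multiplicatively
    # renormalizable combinations in (3.5).
    assert total == (3 * N) // 2 + 1
    print(N, n_quark, n_gluon, total)
```

For N = 2 this gives three quark–gluon and one three-gluon matrix element, i.e. the 4 × 4 mixing matrix discussed next.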
For the lowest values of the moments the mixing matrix $V_N$ looks as follows. For $N = 0$ the
matrix consists of only one element
\[
V_0 = \frac{17}{6}\, N_c + \frac{1}{6}\, \frac{1}{N_c} + \frac{2}{3}\, n_f,
\tag{3.7}
\]
while for $N = 2$ it has the following form
17 N 7 1 + 1 n
1N + 7
V2 =
4 c
6 Nc
5 f
1N + 7 1 + 2 n
c
2
12 Nc
15 f
3 N + 3 1 + 1n
20
c
20 Nc
5 f
37 N
120
c
1
2
2 c
4 Nc + 5 nf
59 N 3 1 + 4 n
12 c
2 Nc
15 f
3
23 1
2
5 Nc + 20 Nc + 5 nf
7 N
40 c
1 N + 1 1 + 1n
12
c
2 Nc
5 f
1N + 1 1 + 2 n
c
3
12 N
15 f
c
287 N 37 1 + 1 n
60 c
60 Nc
5 f
23
120 Nc
17
30 nf
7
30 nf
1 n
10 f
2
307
3 nf + 60 Nc
.
(3.8)
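Going through an explicit calculation of (3.5) and (3.6) and putting $N_c = n_f = 3$ one finds
for the two lowest moments (3.3)
\[
\begin{aligned}
\tfrac{3}{2}\, \langle e_q^2 \rangle^{-1} g_2^{LL}(3, Q^2) &= L^{\gamma_0^0/b}\, [S]_0^0, \\
\tfrac{5}{2}\, \langle e_q^2 \rangle^{-1} g_2^{LL}(5, Q^2)
&= L^{\gamma_2^0/b} \big( 0.415\, [S]_2^2 - 2.558\, [S]_2^1 + 2.776\, [S]_2^0 + 0.966\, [G]_2^0 \big) \\
&\quad + L^{\gamma_2^1/b} \big( 0.340\, [S]_2^2 - 0.152\, [S]_2^1 - 0.261\, [S]_2^0 - 0.114\, [G]_2^0 \big) \\
&\quad + L^{\gamma_2^2/b} \big( 0.134\, [S]_2^2 + 0.496\, [S]_2^1 + 0.404\, [S]_2^0 - 0.909\, [G]_2^0 \big) \\
&\quad + L^{\gamma_2^3/b} \big( 0.111\, [S]_2^2 + 0.214\, [S]_2^1 + 0.080\, [S]_2^0 + 0.057\, [G]_2^0 \big),
\end{aligned}
\tag{3.9}
\]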
where $L = \alpha_s(Q^2)/\alpha_s(\mu^2)$, all reduced matrix elements on the r.h.s. are normalized at the
scale $\mu^2$ and the flavor-singlet anomalous dimensions are equal to
\[
\gamma_0^0 = 10.5556, \qquad \gamma_2^0 = 10.7393, \qquad \gamma_2^1 = 13.5155, \qquad \gamma_2^2 = 17.6794, \qquad \gamma_2^3 = 18.1714.
\tag{3.10}
\]
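As a quick consistency check of the N = 0 entry, one can evaluate the single matrix element (3.7) at N_c = n_f = 3 and compare it with the first value quoted in (3.10). The sketch below assumes that (3.7) reads V_0 = (17/6) N_c + 1/(6 N_c) + (2/3) n_f, which is how the garbled source line is interpreted here; the function name is hypothetical.

```python
from fractions import Fraction


def v0(n_c, n_f):
    """N = 0 mixing 'matrix' (a single number), assuming the reading
    V_0 = 17/6 N_c + 1/(6 N_c) + 2/3 n_f of Eq. (3.7)."""
    n_c = Fraction(n_c)
    n_f = Fraction(n_f)
    return Fraction(17, 6) * n_c + Fraction(1, 6) / n_c + Fraction(2, 3) * n_f


value = v0(3, 3)
print(value, float(value))  # exact rational value and its decimal form
```

With exact rational arithmetic the result is 95/9, which rounds to the quoted 10.5556.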
If $\mu^2 = Q^2$, then $L = 1$ and the coefficients in front of $[S]_2^2$, $[S]_2^1$, $[S]_2^0$, $[G]_2^0$ coincide
with their tree-level values $1$, $-2$, $3$ and $0$, respectively, as expected from (3.3). Note
that the largest coefficients occur in the contribution of the operator with the lowest
anomalous dimension, which is similar to the flavor-nonsinglet case [10], and the most important
correction is apparently associated with the operator with the second-largest anomalous
dimension. The aim of this work is to explain this structure and understand how it extends to
arbitrary moment $n$. To this end, we develop a new framework for solving the three-particle
evolution equations, dubbed the Hamiltonian approach in what follows. A short account of
the same technique is presented in our letter [12], where it was used to calculate the $1/N_c^2$
correction to the evolution in the flavor-nonsinglet sector.
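The statement that the coefficients in (3.9) reduce to their tree-level values at L = 1 can be verified numerically. The sketch below is illustrative only. It assumes the signs of the coefficients as reconstructed from the extraction (lost minus signs restored), and it assumes the one-loop value b = 11 N_c/3 − 2 n_f/3 = 9 for N_c = n_f = 3, which is not defined in this section.

```python
# Anomalous dimensions gamma_2^alpha, alpha = 0..3, from (3.10).
GAMMA = [10.7393, 13.5155, 17.6794, 18.1714]

# Rows: alpha = 0..3; columns: [S]_2^2, [S]_2^1, [S]_2^0, [G]_2^0.
# Signs are an assumption (reconstructed from the garbled source).
COEFF = [
    [0.415, -2.558, 2.776, 0.966],
    [0.340, -0.152, -0.261, -0.114],
    [0.134, 0.496, 0.404, -0.909],
    [0.111, 0.214, 0.080, 0.057],
]


def effective_coefficients(L, b=9.0):
    """Coefficients multiplying the four matrix elements in (3.9)
    for a given ratio L = alpha_s(Q^2)/alpha_s(mu^2).
    b = 9 is an assumed one-loop beta-function coefficient."""
    return [
        sum(L ** (gamma / b) * row[i] for gamma, row in zip(GAMMA, COEFF))
        for i in range(4)
    ]


tree = effective_coefficients(1.0)
print([round(c, 3) for c in tree])
print([round(c, 3) for c in effective_coefficients(0.5)])
```

At L = 1 the four sums agree with 1, −2, 3 and 0 to within the rounding of the quoted three-digit coefficients. The second call shows how the weights shift once evolution is switched on.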
The basic idea of our approach can be explained as follows. As follows from (3.8), the
mixing matrices $V_N$ in (3.4) do not have any obvious symmetry and, in general, are quite
complicated. In particular, they are not hermitian and their eigenvectors are not orthogonal
to each other. On the other hand, by a numerical diagonalization, Eq. (3.6), one finds that
all eigenvalues of these matrices (anomalous dimensions) are real for arbitrary $N$. This
property is not obvious and suggests some hidden symmetry of the problem, which
is not manifest in the particular representation of the evolution equations (3.4) involving
only forward matrix elements of the operators. We will argue that this symmetry indeed
exists and is nothing else but the familiar conformal symmetry of the QCD Lagrangian.
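The general mechanism alluded to here, a matrix that is not symmetric yet has real eigenvalues because it is self-adjoint with respect to a non-trivial scalar product, can be illustrated on a toy example. The 2 × 2 numbers below are purely hypothetical and have nothing to do with the actual mixing matrices; the sketch only shows that M = G⁻¹S, with G a positive diagonal metric and S symmetric, has real eigenvalues and eigenvectors orthogonal in the metric G.

```python
import math

G = [[2.0, 0.0], [0.0, 1.0]]   # hypothetical positive-definite diagonal metric
S = [[1.0, 1.0], [1.0, 3.0]]   # hypothetical symmetric matrix
# M = G^{-1} S is not symmetric, but is self-adjoint w.r.t. <u, v>_G = u^T G v.
M = [[S[i][j] / G[i][i] for j in range(2)] for i in range(2)]


def eigen_2x2(m):
    """Eigenvalues and right eigenvectors of a real 2x2 matrix with m[0][1] != 0."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = trace * trace - 4.0 * det
    assert disc >= 0.0, "eigenvalues would be complex"
    root = math.sqrt(disc)
    values = [(trace - root) / 2.0, (trace + root) / 2.0]
    vectors = [[m[0][1], lam - m[0][0]] for lam in values]
    return values, vectors


def g_product(u, v):
    """Scalar product with the metric G."""
    return sum(u[i] * G[i][j] * v[j] for i in range(2) for j in range(2))


values, vectors = eigen_2x2(M)
print(values)
print(g_product(vectors[0], vectors[1]))                    # vanishes up to rounding
print(sum(a * b for a, b in zip(vectors[0], vectors[1])))   # Euclidean product, non-zero
```

The eigenvectors are not orthogonal in the ordinary sense, but they are orthogonal once the appropriate scalar product is used, which is the situation the conformal scalar product is argued to realize for the evolution kernels.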
The conformal symmetry manifests itself through the SL(2, R) invariance of the
renormalization group equations describing the evolution of the local twist-3 operators
including operators with total derivatives. The SL(2, R) symmetry of these evolution
equations is obscured by the restriction to forward matrix elements of the operators in
(3.4) (or, equivalently, the condition that momentum fractions of the partons sum to zero
in (3.1), cf. (2.16), $\xi_1 + \xi_2 + \xi_3 = 0$). Since the operators containing total derivatives have
vanishing forward matrix elements, it seems natural to neglect their mixing with the twist-3
operators (2.24) in the discussion of deep inelastic scattering. But it is this reduction that
complicates the structure of the evolution equations if it is imposed from the beginning.
Our approach relies on the conformal symmetry of the evolution equations and can be
illustrated by the following scheme indicating a chain of transformations on the matrix of
the evolution kernels VN :

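\[
\underbrace{\text{forward non-hermitian kernels}}_{\ell_q + \ell_g}
\;\longrightarrow\;
\underbrace{\text{non-forward hermitian kernels}}_{\frac{1}{2}[\ell_q(\ell_q - 1) + \ell_g(\ell_g - 1)]}
\;\longrightarrow\;
\underbrace{\text{hermitian kernels in conformal basis}}_{\ell_q + \ell_g}
\]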
Instead of dealing with the non-hermitian forward mixing matrices $V_N$ of dimension
$\ell_q + \ell_g = (N + 1) + [N/2]$, with $\ell_q$ and $\ell_g$ being the total number of quark–antiquark–gluon
and three-gluon forward matrix elements, we choose to consider much bigger
matrices (but hermitian with respect to the so-called conformal scalar product) of
dimension $[\ell_q(\ell_q - 1) + \ell_g(\ell_g - 1)]/2$ that take into account the mixing with the
operators containing total derivatives. Diagonalizing the thus defined non-forward
evolution kernels, we expand their eigenstates over the basis of spherical harmonics of
the conformal SL(2, R) group and obtain a much simpler matrix equation for the coefficients
in this expansion (see Appendix C). Thanks to the conformal invariance, the corresponding
matrix, defining the non-forward evolution kernels in the conformal basis representation,
has the smaller dimension $\ell_q + \ell_g$ and is now hermitian. We then make a forward projection
at the very end.
3.1. Coefficient functions of local operators
An arbitrary local three-particle operator $O_{N,\alpha}$ is defined by the set of coefficients in
the expansion over the standard basis of operators built from the elementary fields $\Phi_k$ and
covariant derivatives (cf. (3.5)), schematically
\[
O_{N,\alpha} = \sum_{k_1+k_2+k_3=N} w_{k_1 k_2 k_3}\, D^{k_1}\Phi_1\, D^{k_2}\Phi_2\, D^{k_3}\Phi_3,
\tag{3.11}
\]
or, equivalently, by a characteristic homogeneous polynomial of three variables
\[
\Psi_{N,\alpha}(x_1, x_2, x_3) = \sum_{k_1+k_2+k_3=N} w^{N,\alpha}_{k_1 k_2 k_3}\, x_1^{k_1} x_2^{k_2} x_3^{k_3},
\tag{3.12}
\]
which we shall refer to as the coefficient function of the operator (not to be confused with the
coefficient functions in the OPE). The rationale for the name is that with the help of the
coefficient function the local operator can be projected out of the corresponding nonlocal
operator, in our case
#
q
ON, = 2Nc N, (z1 , z2 , z3 )S (z1 , z2 , z3 )
$%
g
(z1 , z2 , z3 ) %%
+ nf N1, (z1 , z2 , z3 )O
z =0
i%

S (zi ) %
q
g
%
= 2Nc N, (zi ), nf N1, (zi )
,
(3.13)
(zi ) %
O
zi =0

where $\Psi_N^q(x_i)$ and $\Psi_{N-1}^g(x_i)$ are homogeneous polynomials of degree $N_{\bar q g q} = N$ and
$N_{g\tilde g g} = N - 1$, respectively. 8 The construction of the multiplicatively renormalizable
operators (3.13) is equivalent to finding the appropriate coefficient functions. The
normalization color factors have been included in (3.13) for later convenience (see
Eq. (3.65)).
To expose the symmetries of the problem, it is convenient to start with the evolution
(zi ) defined in (2.11)
equations for the corresponding nonlocal operators S (zi ) and O
and (2.12), respectively. They have the following general form

d
s &
&qg O
(z1 , z2 , z3 ) ,
S (z1 , z2 , z3 ) =
Hqq S (z1 , z2 , z3 ) + H
2
dQ
4

d
&gg O
(z1 , z2 , z3 ) = s H
(z1 , z2 , z3 ) ,
&gq S (z1 , z2 , z3 ) + H
Q2
O
2
dQ
4
Q2
(3.14)
&ab (a, b =
where zi stand for the light-cone coordinates of the quarks and gluons, and H
q, g) are integral operators acting on quark (gluon) coordinates and
describing
the inter
&
&
action between quarks and gluons on the light-cone, Hqq S (zi ) = dzk Hqq (zi |zk )S (zk )
and similar for the other kernels. Note that the short-distance expansion of the nonlocal
gives rise to the local twist-3 operators (2.24) as well as operators
operators S and O
with the total derivatives. In difference to the previous discussion we do not assume the
translation invariance, S (z1 , z2 , z3 ) = S (z1 + , z2 + , z3 + ), so that the mixing with
operators containing total derivatives is included in (3.14). The explicit expressions for the
&ab can be found in [9]. They will not be needed for what follows.
kernels H
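The evolution equation (3.14) has the form of the Schrödinger equation with the $2 \times 2$
matrix of the evolution kernels $\widehat{H}_{ab}$ playing the rôle of the Hamiltonian. Let $\widehat{\Psi}^q_{N,\alpha}(z_i)$
and $\widehat{\Psi}^g_{N-1,\alpha}(z_i)$ be homogeneous polynomials in the light-cone coordinates of quarks and
gluons of degree $N$ and $N - 1$, respectively, satisfying the Schrödinger equation
\[
\begin{pmatrix} \widehat{H}_{qq} & \widehat{H}_{qg} \\ \widehat{H}_{gq} & \widehat{H}_{gg} \end{pmatrix}
\begin{pmatrix} \widehat{\Psi}^q_{N,\alpha}(z_i) \\ \widehat{\Psi}^g_{N-1,\alpha}(z_i) \end{pmatrix}
= E_{N,\alpha}
\begin{pmatrix} \widehat{\Psi}^q_{N,\alpha}(z_i) \\ \widehat{\Psi}^g_{N-1,\alpha}(z_i) \end{pmatrix}
\tag{3.15}
\]
with the same evolution kernels as in (3.14). The nonlocal operators $S(z_i)$ and $\widetilde{O}(z_i)$
can be expanded in terms of these functions with certain operator-valued coefficients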

&q (zi )

S (zi )
N,
=
(3.16)
ON, Q2 ,
g
(zi ) Q2
&
O
N1, (zi )
N,
where the subscript on the l.h.s. stands for the normalization scale. It follows from
the evolution equation that the local operators $O_{N,\alpha}$ that appear in this expansion are
renormalized multiplicatively and their anomalous dimensions are determined by the
corresponding energy eigenvalues $E_{N,\alpha}$
\[
\gamma_N^{\alpha} = E_{N,\alpha}
\tag{3.17}
\]
with the subscript $\alpha = 0, 1, \ldots, [3N/2]$ enumerating different solutions to (3.15). The scale
dependence of $O_{N,\alpha}$ takes, therefore, the standard form
\[
O_{N,\alpha}(Q^2) = O_{N,\alpha}(\mu^2) \left( \frac{\alpha_s(Q^2)}{\alpha_s(\mu^2)} \right)^{\gamma_N^{\alpha}/b}.
\tag{3.18}
\]
8 The difference in the number of derivatives is compensated by the different dimensions of the quark and
gluon field.
So far we have introduced two different sets of polynomials: $(\widehat{\Psi}^q_N(z_i), \widehat{\Psi}^g_{N-1}(z_i))$ and
$(\Psi^q_N(x_i), \Psi^g_{N-1}(x_i))$. The former set defines the expansion (3.16) of the nonlocal operators
$S(z_i)$ and $\widetilde{O}(z_i)$ over the complete set of local multiplicatively renormalizable operators
$O_{N,\alpha}$, while the latter determines the particular form of the multiplicatively renormalizable
operators $O_{N,\alpha}$, cf. (3.11), (3.12). It follows from the definitions (3.13) and (3.16) that
\[
\Big[ 2N_c\, \Psi^q_{N,\alpha}(\partial_{z_i})\, \widehat{\Psi}^q_{N',\alpha'}(z_i) + n_f\, \Psi^g_{N-1,\alpha}(\partial_{z_i})\, \widehat{\Psi}^g_{N'-1,\alpha'}(z_i) \Big]\Big|_{z_i=0} = \delta_{NN'}\, \delta_{\alpha\alpha'},
\tag{3.19}
\]
and in this sense the polynomials $\widehat{\Psi}$ are dual to the coefficient functions $\Psi$. In what
follows we shall refer to the $\widehat{\Psi}$-functions as coefficient functions of a local operator in the dual
representation. One can determine the functions $\Psi^q_{N,\alpha}$ and $\Psi^g_{N-1,\alpha}$ from the orthogonality
condition (3.19) provided the complete set of eigenfunctions $\widehat{\Psi}^q_{N,\alpha}$ and $\widehat{\Psi}^g_{N-1,\alpha}$ of the
Schrödinger equation (3.15) is given. This task seems complicated, but in fact is not. Using
conformal symmetry, we will be able to find an explicit expression connecting the direct
$\Psi(x_i)$ and dual $\widehat{\Psi}(z_i)$ coefficient functions of a multiplicatively renormalizable operator.
The answer is given in Eq. (3.37).
3.2. Conformal symmetry
A remarkable property of the evolution kernels in (3.14) is that they are invariant under
the projective transformations of the light-cone coordinates of quarks and gluons, $z_k$,
\[
z_k \to \frac{a z_k + b}{c z_k + d}, \qquad ad - bc = 1.
\tag{3.20}
\]
This invariance has its roots in the conformal symmetry of the QCD Lagrangian and the
transformations (3.20) form the SL(2, R) (collinear) subgroup of the full conformal group
acting on the fields living on the light-cone. As is well known, the conformal symmetry of
QCD is broken by quantum corrections. However, since the leading-order renormalization
group equations are driven by tree-level counterterms, they have to respect the symmetry
of the QCD Lagrangian.
The action of the SL(2, R) transformations (3.20) on (quantum) fields $\Phi_a(z)$, where
$a = q, \bar q, g, \tilde g$ corresponds to $q(z)$, $\bar q(z)$, $G_x(z)$, $\widetilde{G}_x(z)$, respectively, is defined as
\[
\Phi_a(z) \to (cz + d)^{-2 j_a}\, \Phi_a\!\left( \frac{az + b}{cz + d} \right)
\tag{3.21}
\]
and is described by three generators $\hat L^{+}$, $\hat L^{-}$ and $\hat L^{0}$ that can be realized as first-order
differential operators acting on the field coordinates:
\[
\hat L^{-}_a \Phi_a(z) = -\partial_z \Phi_a(z), \qquad
\hat L^{+}_a \Phi_a(z) = \big( z^2 \partial_z + 2 j_a z \big) \Phi_a(z), \qquad
\hat L^{0}_a \Phi_a(z) = \big( z \partial_z + j_a \big) \Phi_a(z).
\tag{3.22}
\]
Here $j_a = (l_a + s_a)/2$ is the conformal spin of the field $\Phi_a(z)$, with $l_a$ being the canonical
dimension (3/2 for quarks and 2 for gluons) and $s_a$ the spin projection on the light-cone
direction, $\Sigma_{pz} \Phi_a = i s_a \Phi_a$. In the case at hand the spin projections have their maximum
values $s_q = s_{\bar q} = 1/2$, $s_g = s_{\tilde g} = 1$, leading to
\[
j_q = j_{\bar q} = 1, \qquad j_g = j_{\tilde g} = 3/2.
\tag{3.23}
\]
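The algebra generated by the differential operators (3.22) can be checked explicitly on polynomials. The sketch below is an illustration under stated assumptions: it takes the generators in the form L⁻ = −∂_z, L⁺ = z²∂_z + 2jz, L⁰ = z∂_z + j (the minus sign in L⁻ is a reconstruction), represents a polynomial by the list of its coefficients, and uses hypothetical helper names. It verifies [L⁺, L⁻] = 2L⁰ and that the one-particle Casimir operator, built as in (3.27), equals j(j − 1).

```python
from fractions import Fraction


def l_minus(p):
    """-d/dz acting on p[0] + p[1] z + p[2] z^2 + ..."""
    return [-k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def l_plus(p, j):
    """z^2 d/dz + 2 j z."""
    return [Fraction(0)] + [(k + 2 * j) * c for k, c in enumerate(p)]


def l_zero(p, j):
    """z d/dz + j."""
    return [(k + j) * c for k, c in enumerate(p)]


def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def scale(p, factor):
    return [factor * c for c in p]


def sub(p, q):
    return add(p, scale(q, -1))


def is_zero(p):
    return all(c == 0 for c in p)


def casimir(p, j):
    """L^2 = (L+ L- + L- L+)/2 + L0^2 for a single particle."""
    mixed = add(l_plus(l_minus(p), j), l_minus(l_plus(p, j)))
    return add(scale(mixed, Fraction(1, 2)), l_zero(l_zero(p, j), j))


p = [Fraction(1), Fraction(2), Fraction(0), Fraction(5)]   # 1 + 2z + 5z^3
for j in (Fraction(1), Fraction(3, 2)):                    # quark and gluon spins (3.23)
    commutator = sub(l_plus(l_minus(p), j), l_minus(l_plus(p, j)))
    assert is_zero(sub(commutator, scale(l_zero(p, j), 2)))
    assert is_zero(sub(casimir(p, j), scale(p, j * (j - 1))))
print("SL(2) relations hold for j = 1 and j = 3/2")
```

Exact rational arithmetic is used so that the half-integer gluon spin j = 3/2 does not introduce rounding.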
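In order to unify the notation, we introduce two $2 \times 2$ matrices of the evolution kernels
$\widehat{H}$ and the generators of conformal transformations $\hat L_k$ ($k = \pm, 0$) in the quark–antiquark–gluon
and three-gluon channels
\[
\widehat{H} = \begin{pmatrix} \widehat{H}_{qq} & \widehat{H}_{qg} \\ \widehat{H}_{gq} & \widehat{H}_{gg} \end{pmatrix}, \qquad
\hat L_k = \begin{pmatrix} \hat L^{k}_{\bar q g q} & 0 \\ 0 & \hat L^{k}_{g\tilde g g} \end{pmatrix},
\tag{3.24}
\]
where $\hat L^{k}_{\bar q g q}$ and $\hat L^{k}_{g\tilde g g}$ ($k = \pm, 0$) are the total three-particle SL(2, R) generators acting on
antiquark–quark–gluon and three-gluon coordinates, respectively:
\[
\hat L^{k}_{\bar q g q} = \hat L^{k}_{\bar q} + \hat L^{k}_{q} + \hat L^{k}_{g}, \qquad
\hat L^{k}_{g\tilde g g} = \hat L^{k}_{g} + \hat L^{k}_{\tilde g} + \hat L^{k}_{g}.
\tag{3.25}
\]
The conformal invariance of the evolution equation is stated as
\[
[\widehat{H}, \hat L_k] = [\widehat{H}, \hat L^2] = [\hat L^2, \hat L_k] = 0,
\tag{3.26}
\]
where
\[
\hat L^2 = \frac{1}{2} \big( \hat L_+ \hat L_- + \hat L_- \hat L_+ \big) + \hat L_0^2
\tag{3.27}
\]
is the three-particle quadratic Casimir operator.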
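Thanks to the conformal invariance (3.26) the solutions to (3.15) can be classified
according to the representations of the SL(2, R) group. Namely, we can impose additional
constraints on the eigenfunctions
\[
[\hat L_0 - J] \begin{pmatrix} \widehat{\Psi}^q_N(z_i) \\ \widehat{\Psi}^g_{N-1}(z_i) \end{pmatrix}
= \hat L_- \begin{pmatrix} \widehat{\Psi}^q_N(z_i) \\ \widehat{\Psi}^g_{N-1}(z_i) \end{pmatrix}
= \big[ \hat L^2 - J(J - 1) \big] \begin{pmatrix} \widehat{\Psi}^q_N(z_i) \\ \widehat{\Psi}^g_{N-1}(z_i) \end{pmatrix} = 0,
\tag{3.28}
\]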
where the SL(2, R) generators are given by the same expressions as in (3.22). Since the
eigenfunctions are homogeneous polynomials in the light-cone coordinates $z_k$, the first
equation in (3.28) is automatically satisfied provided
\[
J = \frac{7}{2} + N_{q\bar q g} = \frac{9}{2} + N_{g\tilde g g},
\tag{3.29}
\]
where $N_{q\bar q g}$ and $N_{g\tilde g g}$ count the total number of covariant derivatives in the corresponding
local operator, $N_{q\bar q g} = N_{g\tilde g g} + 1 = N$. Notice that for $J = 7/2$, or equivalently $N_{q\bar q g} = 0$,
there exists no three-gluon contribution. The second condition ensures that $\widehat{\Psi}^q_N$ and $\widehat{\Psi}^g_{N-1}$
are invariant under translations of the light-cone coordinates. This requirement defines the
so-called highest weight of the discrete series representation of the SL(2, R) group, labeled
by the conformal spin $J = N + 7/2$. An infinite tower of solutions to (3.15) can be obtained
from the highest weight by the repeated application of the step-up operator $\hat L_+$:
\[
\begin{pmatrix} {}^{n}\widehat{\Psi}^q_N(z_i) \\ {}^{n}\widehat{\Psi}^g_{N-1}(z_i) \end{pmatrix}
= \hat L_+^{\,n} \begin{pmatrix} \widehat{\Psi}^q_N(z_i) \\ \widehat{\Psi}^g_{N-1}(z_i) \end{pmatrix}.
\tag{3.30}
\]
Since $\hat L_+$ commutes with the Hamiltonian, Eq. (3.26), all states ${}^{n}\widehat{\Psi}_N$, with different $n = 0, 1, 2, \ldots$,
have the same energy. As we will show in a moment (see Eq. (3.36)), the
corresponding operators ${}^{n}O_N$ are just those obtained from the highest weight state operator
by adding the $n$th power of the total derivative and, therefore, they do not survive upon
taking a forward matrix element. Note that the expansion in (3.16) formally includes
operators with arbitrary powers of total derivatives, but we can ignore their contribution
and concentrate on studying the properties of the highest weights (3.28) only.
Going over from the dual coefficient functions $\widehat{\Psi}(z_i)$ to the coefficient functions $\Psi(x_i)$
defined in (3.13) corresponds to going over to a different (non-standard) representation of
the conformal group. Using the relation (3.19) and requiring
%

%
,0
%
&(z1 , z2 , z3 )%
&
= (z1 , z2 , z3 ) L ,0
Lk (z1 , z2 , z3 )
k (z1 , z2 , z3 ) z =0
z =0
i
(3.31)
one finds the following representation of the SL(2, R) generators on the space of the
coefficient functions $\Psi(x_1, x_2, x_3)$
\[
L^{0}_k \Psi(x_k) = \big( x_k \partial_{x_k} + j_k \big) \Psi(x_k), \qquad
L^{+}_k \Psi(x_k) = x_k \Psi(x_k), \qquad
L^{-}_k \Psi(x_k) = -\big( x_k \partial^2_{x_k} + 2 j_k \partial_{x_k} \big) \Psi(x_k)
\tag{3.32}
\]
with the conformal spins $j_k$ defined in (3.23). Note that these expressions are more
complicated compared to the standard expressions (3.22). Eqs. (3.32) and (3.22) define
the SL(2) generators in two different representations and, as such, they are related to each
other through a transformation
\[
L^{\pm}_k = T^{-1} \hat L^{\pm}_k T, \qquad
L^{0}_k = T^{-1} \hat L^{0}_k T, \qquad
\widehat{\Psi}(z_i) = [T \Psi](z_i)
\tag{3.33}
\]
that maps into each other the coefficient functions in two different representations. The
explicit form of the T-transformation is given by
\[
\widehat{\Psi}(z_i) = \Psi(\partial_{x_i}) \prod_{i=1}^{3} (1 - x_i z_i)^{-2 j_i} \Big|_{x_i = 0}
= \prod_{i=1}^{3} \int_0^{\infty} \frac{dt_i\, t_i^{2 j_i - 1}\, e^{-t_i}}{\Gamma(2 j_i)}\; \Psi(z_i t_i).
\tag{3.34}
\]
To verify this relation, use (3.22) and (3.32) to check that $\hat L^{\pm}_k \widehat{\Psi}(z_i) = [T(L^{\pm}_k \Psi)](z_i)$ and
$\hat L^{0}_k \widehat{\Psi}(z_i) = [T(L^{0}_k \Psi)](z_i)$. The conformal constraints (3.28) on the coefficient functions
corresponding to the highest weight look exactly as before:
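\[
[L_0 - J] \begin{pmatrix} \Psi^q_N(x_i) \\ \Psi^g_{N-1}(x_i) \end{pmatrix}
= L_- \begin{pmatrix} \Psi^q_N(x_i) \\ \Psi^g_{N-1}(x_i) \end{pmatrix}
= \big[ L^2 - J(J - 1) \big] \begin{pmatrix} \Psi^q_N(x_i) \\ \Psi^g_{N-1}(x_i) \end{pmatrix} = 0,
\tag{3.35}
\]
with the SL(2) generators defined in the representation (3.32). Note that the step-up
operator $L_+$ has become very simple and its action consists of adding the sum of derivatives
acting on each of the three fields, as we anticipated:
\[
\begin{pmatrix} {}^{n}\Psi^q_N(x_i) \\ {}^{n}\Psi^g_{N-1}(x_i) \end{pmatrix}
= L_+^{\,n} \begin{pmatrix} \Psi^q_N(x_i) \\ \Psi^g_{N-1}(x_i) \end{pmatrix}
= (x_1 + x_2 + x_3)^n \begin{pmatrix} \Psi^q_N(x_i) \\ \Psi^g_{N-1}(x_i) \end{pmatrix}.
\tag{3.36}
\]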
Finally, applying the transformation (3.34) to the coefficient function (3.12) we obtain the
following expression for the dual coefficient function
\[
\widehat{\Psi}(z_i) = \sum_{k_1+k_2+k_3=N} w_{k_1 k_2 k_3}\, z_1^{k_1} z_2^{k_2} z_3^{k_3}\;
\frac{\Gamma(k_1 + 2 j_1)}{\Gamma(2 j_1)}\, \frac{\Gamma(k_2 + 2 j_2)}{\Gamma(2 j_2)}\, \frac{\Gamma(k_3 + 2 j_3)}{\Gamma(2 j_3)}.
\tag{3.37}
\]
This relation establishes the one-to-one correspondence between the coefficient functions (3.12) and their dual counterparts (3.37).
Much of the following discussion is based on the fact that the coefficient functions
of multiplicatively renormalizable operators $\Psi_{N,\alpha}(x_i)$ satisfying the highest weight
condition (3.35) are orthogonal with respect to the so-called conformal scalar product.
This property becomes crucial in establishing the hermiticity of the evolution Hamiltonian
in (3.15). The hermiticity property will be quite helpful in the further analysis and, as we
argue below, is a direct consequence of the conformal symmetry.
Eq. (3.19) suggests defining the following scalar product on the space of coefficient
functions given by (3.12)
\[
\langle \Psi_{N,\alpha} | \Psi_{N,\alpha'} \rangle \equiv \Psi_{N,\alpha}(\partial_{z_i})\, \widehat{\Psi}_{N,\alpha'}(z_i) \Big|_{z_i=0}
= \Psi_{N,\alpha}(\partial_i)\, [T \Psi_{N,\alpha'}](z_i) \Big|_{z_i=0}.
\tag{3.38}
\]
For the coefficient functions of local operators without total derivatives that satisfy the
constraints (3.28) (i.e., those that we are interested in) one can equivalently rewrite the
definition in (3.38) in a more familiar integral form:
\[
\langle \Psi_{N,\alpha} | \Psi_{N,\alpha'} \rangle = \frac{\Gamma(2 j_1 + 2 j_2 + 2 j_3 + 2N)}{\Gamma(2 j_1)\, \Gamma(2 j_2)\, \Gamma(2 j_3)}
\int_0^1 [dx]\; x_1^{2 j_1 - 1} x_2^{2 j_2 - 1} x_3^{2 j_3 - 1}\; \Psi_{N,\alpha}(x_i)\, \Psi_{N,\alpha'}(x_i),
\tag{3.39}
\]
where $j_k$ are the conformal spins of the operators entering (3.11). Here, the integration
goes over the region $0 \le x_k \le 1$, $x_1 + x_2 + x_3 = 1$ and the integration measure is defined
as
\[
[dx] = dx_1\, dx_2\, dx_3\; \delta(x_1 + x_2 + x_3 - 1).
\tag{3.40}
\]
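The equivalence of the two forms of the scalar product for highest-weight polynomials can be tested on a simple example. The sketch below is not taken from the paper: it assumes the quark–antiquark–gluon spins 2j = (2, 3, 2), represents a homogeneous polynomial as a dictionary from exponent triples to coefficients, and evaluates the integral form (3.39) term by term with the Dirichlet formula ∫[dx] ∏ x_i^{a_i − 1} = ∏ Γ(a_i)/Γ(Σ a_i). The function names are hypothetical.

```python
from math import factorial

TWO_J = (2, 3, 2)   # 2*j_i for j_1 = j_3 = 1, j_2 = 3/2 (assumed ordering)


def pochhammer(a, n):
    """Rising factorial a (a+1) ... (a+n-1) = Gamma(a+n)/Gamma(a)."""
    result = 1
    for step in range(n):
        result *= a + step
    return result


def product_derivative(psi, chi):
    """Scalar product in the derivative form (3.38), using the dual
    coefficient function (3.37): only equal monomials contribute."""
    total = 0
    for powers, coeff in psi.items():
        if powers in chi:
            weight = 1
            for two_j, k in zip(TWO_J, powers):
                weight *= factorial(k) * pochhammer(two_j, k)
            total += coeff * chi[powers] * weight
    return total


def product_integral(psi, chi):
    """Scalar product in the integral form (3.39) for two homogeneous
    polynomials of the same degree, via the Dirichlet integral."""
    total = 0
    for powers_a, coeff_a in psi.items():
        for powers_b, coeff_b in chi.items():
            weight = 1
            for two_j, ka, kb in zip(TWO_J, powers_a, powers_b):
                weight *= pochhammer(two_j, ka + kb)
            total += coeff_a * coeff_b * weight
    return total


# Linear highest-weight polynomials: sum_i 2 j_i a_i = 0 (lowering operator gives zero).
PSI = {(1, 0, 0): 3, (0, 1, 0): -2}    # 3 x1 - 2 x2
CHI = {(1, 0, 0): 1, (0, 0, 1): -1}    # x1 - x3
X1 = {(1, 0, 0): 1}                    # x1 alone is not a highest weight

print(product_derivative(PSI, CHI), product_integral(PSI, CHI))
print(product_derivative(X1, X1), product_integral(X1, X1))
```

For the two highest-weight polynomials both forms give the same number, whereas for the monomial x1, which does not satisfy the highest-weight condition, they differ. This is consistent with the restriction stated before (3.39).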
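The SL(2) generators (3.32) are (anti)self-adjoint operators on the space of the coefficient
functions endowed with the scalar product (3.38). Indeed, using Eqs. (3.31), (3.32)
and (3.33) it is straightforward to verify that
\[
\langle \Psi_{N,\alpha} | L^{0}_k \Psi_{N,\alpha'} \rangle = \langle L^{0}_k \Psi_{N,\alpha} | \Psi_{N,\alpha'} \rangle, \qquad
\langle \Psi_{N,\alpha} | L^{\pm}_k \Psi_{N,\alpha'} \rangle = -\langle L^{\mp}_k \Psi_{N,\alpha} | \Psi_{N,\alpha'} \rangle.
\tag{3.41}
\]
As a consequence, the two-particle Casimir operators $L^2_{ik} = (L_i + L_k)^2$, defined as in (3.27),
are self-adjoint operators.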
3.3. Helicity basis

As we will see in the next section, the evolution Hamiltonians (3.24) can be written in
terms of the two-particle Casimir operators $\hat L^2_{ik} = (\hat L_i + \hat L_k)^2$ in the dual representation.
This property ensures that the Hamiltonians inherit the hermiticity properties of the generators
and are self-adjoint operators as well. As a consequence, their eigenvalues, alias the
anomalous dimensions (3.17), are real and the corresponding eigenfunctions are orthogonal
to each other
\[
\langle \Psi_{N,\alpha} | \Psi_{N,\alpha'} \rangle \sim \delta_{\alpha\alpha'}.
\tag{3.42}
\]
In fact, for three-gluon operators there is a complication that the solution to the Schrödinger
equation (3.15) has to be found on the subspace of functions $\widehat{\Psi}^g_{N,\alpha}(z_i)$ that are
antisymmetric under the exchange of the first and third gluon, $z_1 \leftrightarrow z_3$. The permutation
operator $P_{13}$ that projects onto the states with correct symmetry is not a self-adjoint
operator so that one has to be careful. 9 In physical terms, the problem arises because,
as we will see in a minute, the three-gluon operator $\widetilde{O}$ contains a sum of contributions
of gluons with opposite helicity, antisymmetrized because of the crossing symmetry. The
way out [20] is therefore to write down the evolution equation for the helicity eigenstates
and restore the crossing symmetry at the end.
The construction of the helicity basis is based on the decomposition of the quark and
gluon fields entering the definition of the nonlocal operators $S(z_i)$ and $\widetilde{O}(z_i)$ into the
components of different chirality
1 5
1 5
q(z),
q (z) = q(z)
,
2
2

1
G
G,n i<
G,n ,
(z) =
2
q (z) =
(3.43)
=<
where <
p n /(pn). The fields defined in this way satisfy the conditions
5 q (z) = q (z),

i<
G (z) = G
(z)
(3.44)
and describe quark, antiquark and gluon of a definite helicity. Making this decomposition
one obtains the expressions for the nonlocal operators $S(z_i)$ and $\widetilde{O}(z_i)$ in terms of the
chiral fields. We recall that $\mu = 1, 2$ is a transverse index and, in order to construct
the three-particle states with definite overall helicity, one has to take particular linear
combinations, projecting onto the two complex vectors in the transverse plane:
(1)
(2)
w = e, + ie, ,
(1)
(2)
w = e, ie, .
(3.45)
We find for the quarkantiquarkgluon operator

nq + (z3 ) + q + (z3 )G+
nq (z1 ),
Sw (z1 , z2 , z3 ) = q (z1 )G+
w (z2 )/
w (z2 )/
nq (z3 ) + q (z3 )G
nq + (z1 ).
Sw (z1 , z2 , z3 ) = q + (z1 )G
w (z2 )/
w (z2 )/
(3.46)
9 The situation is in fact very similar to the Hartree–Fock construction of the completely antisymmetric
fermionic wave functions in quantum mechanics.
Notice that the quark and the antiquark have opposite helicity in both cases, and the helicity
of the gluon, $\pm 1$, coincides with the total helicity of the system. (We tacitly imply that the
momenta of the three partons are aligned along the same light-cone direction defined by
the proton momentum $p$.) A similar decomposition of the three-gluon operator looks as
follows
w (z1 , z2 , z3 ) = 1 Tw (z1 , z2 , z3 ) Tw (z3 , z2 , z1 ) ,
O
2

1
w (z1 , z2 , z3 ) = Tw (z1 , z2 , z3 ) Tw (z3 , z2 , z1 ) ,
O
(3.47)
2
where the notation was introduced
ig
b,+
c,+
Tw (z1 , z2 , z3 ) = f abc Ga,
w (z1 )Gw (z2 )Gw (z3 ),
2
ig
b,
c,
Tw (z1 , z2 , z3 ) = f abc Ga,+
(3.48)
w (z1 )Gw (z2 )Gw (z3 ).
2
The operators $T_w(z_i)$ and $T_{\bar w}(z_i)$ describe the state of three gluons with the total helicity
$\pm 1$, respectively.
It follows from (3.47) and (3.48) that the operators $T_w(z_1, z_2, z_3)$ and $\widetilde{O}_w(z_1, z_2, z_3)$ are
antisymmetric under the interchange of gluons with the same and the opposite helicity,
respectively:
\[
\widetilde{O}_w(z_1, z_2, z_3) = -\widetilde{O}_w(z_3, z_2, z_1) = -P_{31}\, \widetilde{O}_w(z_1, z_2, z_3), \qquad
T_w(z_1, z_2, z_3) = -T_w(z_1, z_3, z_2) = -P_{23}\, T_w(z_1, z_2, z_3)
\tag{3.49}
\]
with $P_{ik}$ being the permutation operators. Using this, we can invert (3.47) to get
\[
T_w(z_1, z_2, z_3) = \widetilde{O}_w(z_1, z_2, z_3) + \widetilde{O}_w(z_2, z_3, z_1) - \widetilde{O}_w(z_3, z_1, z_2)
= \big( 1 + P_{12} P_{31} - P_{23} P_{31} \big)\, \widetilde{O}_w(z_1, z_2, z_3).
\tag{3.50}
\]
The same relation holds between the operators $T_{\bar w}$ and $\widetilde{O}_{\bar w}$.
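The inversion formula (3.50) relies only on the symmetry properties (3.49), so it can be tested with ordinary functions standing in for the operators. The sketch below is an illustration with hypothetical names: it builds a function T that is antisymmetric in its last two arguments, forms O(z1, z2, z3) = ½[T(z1, z2, z3) − T(z3, z2, z1)] as in the assumed reading of (3.47), and checks that the combination O(z1, z2, z3) + O(z2, z3, z1) − O(z3, z1, z2) returns T.

```python
from fractions import Fraction


def antisymmetrize_23(f):
    """Build T(z1, z2, z3) = f(z1, z2, z3) - f(z1, z3, z2), odd under z2 <-> z3."""
    return lambda z1, z2, z3: f(z1, z2, z3) - f(z1, z3, z2)


def make_o(t):
    """O(z1, z2, z3) = [T(z1, z2, z3) - T(z3, z2, z1)] / 2, odd under z1 <-> z3."""
    half = Fraction(1, 2)
    return lambda z1, z2, z3: half * (t(z1, z2, z3) - t(z3, z2, z1))


def reconstruct_t(o):
    """Right-hand side of (3.50)."""
    return lambda z1, z2, z3: o(z1, z2, z3) + o(z2, z3, z1) - o(z3, z1, z2)


def sample(z1, z2, z3):
    """Arbitrary test polynomial without any symmetry (hypothetical)."""
    return z1 + 2 * z2 ** 2 + 3 * z3 ** 3 + 5 * z1 * z2


T = antisymmetrize_23(sample)
O = make_o(T)
T_back = reconstruct_t(O)

points = [(1, 2, 5), (-3, 4, 7), (2, -1, 0)]
for z in points:
    assert O(*z) == -O(z[2], z[1], z[0])   # antisymmetry of O, cf. (3.49)
    assert T_back(*z) == T(*z)             # inversion, cf. (3.50)
print("inversion formula reproduces T at all sample points")
```

The check works for any choice of the starting function, since it uses nothing beyond the antisymmetry of T under the exchange of its second and third arguments.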
The relations (3.47) and (3.50) allow one to rewrite the evolution equation (3.14) for the
three-gluon operator $\widetilde{O}(z_i)$ in terms of $T(z_i)$ with definite helicity. The difference amounts
to the following redefinition of the evolution kernels (3.24), $\widehat{H} \to \widehat{H}^h$:
&

&
&qh 1
&qg 1
Hqq H
Hqq H
0
0
&h =
=
.
H
&hq H
&hh
&gq H
&gg
0 (1 + P12 P23 )
0 12 (1 P31 )
H
H
(3.51)
Note that the kernel $\widehat{H}_{qq}$ is not affected by this transformation. Despite the fact that the
two evolution equations are obviously equivalent, the advantage of dealing with the helicity
operators $T_w(z_i)$ (or $T_{\bar w}(z_i)$) is that, as we will show below, the evolution kernel $\widehat{H}^h$
becomes hermitian on the space of the coefficient functions. The original kernel in (3.24)
is not hermitian due to the presence of additional permutation operators in (3.51). The
explicit expressions for the operator $\widehat{H}^h$ will be given below.
According to (3.47), going over from $\widetilde{O}(z_i)$ to the helicity states $T(z_i)$ is equivalent to
the following ansatz for the dual coefficient function of the three-gluon operator:
\[
\widehat{\Psi}^g_{N-1}(z_1, z_2, z_3) = \frac{1}{2} \Big[ \widehat{\Psi}^h_{N-1}(z_1, z_2, z_3) - \widehat{\Psi}^h_{N-1}(z_3, z_2, z_1) \Big],
\tag{3.52}
\]
where the coefficient functions $\widehat{\Psi}^h_{N-1}(z_i)$ are defined by the short-distance expansion of
$T_w(z_i)$. Applying the equivalence relation (3.50) one finds
\[
\widehat{\Psi}^h_{N-1}(z_1, z_2, z_3) = \widehat{\Psi}^g_{N-1}(z_1, z_2, z_3) + \widehat{\Psi}^g_{N-1}(z_2, z_3, z_1) - \widehat{\Psi}^g_{N-1}(z_3, z_1, z_2).
\tag{3.53}
\]
Substituting (3.47) into (3.13), one rewrites the multiplicatively renormalizable operators as
\[
O_{N,\alpha} = \Big[ 2N_c\, \Psi^q_{N,\alpha}(\partial_{z_i})\, S(z_i) + n_f\, \Psi^h_{N-1,\alpha}(\partial_{z_i})\, T(z_i) \Big]\Big|_{z_i=0},
\tag{3.54}
\]
and obtains similar relations between the thus defined coefficient functions $\Psi^h_{N-1,\alpha}(x_i)$ and
$\Psi^g_{N-1,\alpha}(x_i)$. We find that the local three-gluon operators are projected out of the $T(z_i)$ by
the following coefficient function
\[
\Psi^h_{N-1,\alpha}(x_1, x_2, x_3) = \frac{1}{2} \Big[ \Psi^g_{N-1,\alpha}(x_1, x_2, x_3) - \Psi^g_{N-1,\alpha}(x_1, x_3, x_2) \Big]
\tag{3.55}
\]
that has the same symmetry (3.49) as the helicity operator itself:
\[
\Psi^h_{N-1}(x_1, x_3, x_2) = P_{23}\, \Psi^h_{N-1}(x_1, x_2, x_3) = -\Psi^h_{N-1}(x_1, x_2, x_3).
\tag{3.56}
\]
The inverse relation looks as
\[
\Psi^g_{N-1,\alpha}(x_1, x_2, x_3) = \Psi^h_{N-1,\alpha}(x_1, x_2, x_3) + \Psi^h_{N-1,\alpha}(x_3, x_1, x_2) - \Psi^h_{N-1,\alpha}(x_2, x_3, x_1).
\tag{3.57}
\]
Notice that the relations between the two sets of the coefficient functions are different in
the direct and dual representations. Nevertheless, it follows from (3.52) and (3.57) that the
coefficient functions $\widehat{\Psi}^h_{N-1}(z_i)$ and $\Psi^h_{N-1}(x_i)$ satisfy the same conformal constraints (3.28)
and (3.35) as $\widehat{\Psi}^g_{N-1}(z_i)$ and $\Psi^g_{N-1}(x_i)$, respectively.
Last but not least, the original Schrödinger equation (3.15) can be transformed into a
Schrödinger equation for the coefficient functions $\Psi^q_N(x_i)$ and $\Psi^h_{N-1}(x_i)$
\[
\begin{pmatrix} H_{qq} & H_{qh} \\ H_{hq} & H_{hh} \end{pmatrix}
\begin{pmatrix} \Psi^q_N(x_i) \\ \Psi^h_{N-1}(x_i) \end{pmatrix}
= E_N \begin{pmatrix} \Psi^q_N(x_i) \\ \Psi^h_{N-1}(x_i) \end{pmatrix},
\tag{3.58}
\]
where the evolution kernels without a hat in the helicity basis, $H^h_{ab}$, are obtained from
the corresponding kernels $\widehat{H}^h_{ab}$ (3.51) through the T-transformation (3.33)
\[
H_{ab} = T_a^{-1}\, \widehat{H}_{ab}\, T_b.
\tag{3.59}
\]
3.4. Conformal basis

The solution of the evolution equations (3.58) can be simplified significantly by expanding
the coefficient functions $\Psi^q_N(x_i)$ and $\Psi^h_{N-1}(x_i)$ over the basis of spherical harmonics
consistent with the conformal symmetry constraints (3.28).
3.4.1. General construction
For a generic three-particle operator, (3.11) and (3.12), such a conformal basis can
be constructed as follows [21]. Conformal symmetry allows one to fix the total three-particle
conformal spin $J = j_1 + j_2 + j_3 + N$ of the state, which translates to the condition that
the corresponding coefficient function satisfies Eq. (3.28). We define the set of functions
$Y^{(31)2}_{Jj}$ by requiring that they obey Eq. (3.28) and, in addition, have a definite value of
the conformal spin in one of the two-particle channels, (31) for definiteness. The latter
condition leads to
\[
L^2_{31}\, Y^{(31)2}_{Jj}(x_i) = j(j - 1)\, Y^{(31)2}_{Jj}(x_i),
\tag{3.60}
\]
where $j = j_1 + j_3 + n$ and $n = 0, \ldots, N$.
Taken together, Eqs. (3.60), (3.28) determine the functions $Y^{(31)2}_{Jj}$ uniquely and have the
following solution:
\[
Y^{(31)2}_{Jj}(x_i) = r_{Jj}\, (x_1 + x_2 + x_3)^{J - j - j_2}\, (x_1 + x_3)^{j - j_1 - j_3}\;
P^{(2 j_2 - 1,\, 2 j - 1)}_{J - j - j_2}\!\left( \frac{x_1 - x_2 + x_3}{x_1 + x_2 + x_3} \right)
P^{(2 j_3 - 1,\, 2 j_1 - 1)}_{j - j_1 - j_3}\!\left( \frac{x_1 - x_3}{x_1 + x_3} \right).
\tag{3.61}
\]
Here $P^{(\alpha,\beta)}_n(x)$ is the Jacobi polynomial and $r_{Jj}$ an arbitrary normalization factor.
The basis functions $Y^{(31)2}_{Jj}(x_i)$ are orthogonal with respect to the scalar product (3.39).
Requiring that $\| Y^{(31)2}_{Jj} \|^2 = 1$ we fix the normalization to be
\[
r^{-2}_{Jj} = \frac{\Gamma(2J)}{\Gamma(2 j_1)\, \Gamma(2 j_2)\, \Gamma(2 j_3)}\;
\frac{\Gamma(j + j_1 - j_3)\, \Gamma(j - j_1 + j_3)}{\Gamma(j - j_1 - j_3 + 1)\, \Gamma(j + j_1 + j_3 - 1)\, (2j - 1)}\;
\frac{\Gamma(J - j + j_2)\, \Gamma(J + j - j_2)}{\Gamma(J - j - j_2 + 1)\, \Gamma(J + j + j_2 - 1)\, (2J - 1)}.
\tag{3.62}
\]
The above construction of the conformal basis involves an obvious ambiguity in the
order in which the spins of the partons are coupled to the total spin $J$. Choosing in (3.60) a different
two-particle channel one obtains a different conformal basis related to the original one
through the matrix of Racah 6j-symbols of the SL(2, R) group
\[
Y^{(31)2}_{Jj}(x_i) = \sum_{j_1 + j_2 \le j' \le J - j_3} \Omega_{jj'}(J)\, Y^{(12)3}_{Jj'}(x_i).
\tag{3.63}
\]
Properties of the Racah 6j-symbols as well as their explicit expressions in terms of the
generalized hypergeometric series ${}_4F_3(1)$ are summarized in Appendix B.
3.4.2. Quark–gluon conformal basis
We now specify the above general construction to the particular cases of quark–antiquark–gluon
and three-gluon operators and define
%
q
(31)2
(x1 , x2 , x3 )%j =j =1,j =3/2 ,
YN,k (x1 , x2 , x3 ) = YN+7/2,k+2
1
3
2
%
(31)2
h
%
YN1,k1 (x1 , x2 , x3 ) = YN+7/2,n+2 (x1 , x2 , x3 ) j =j =j =3/2 ,
(3.64)
1
where the prefactors have been chosen for later convenience. The three-particle conformal
spin is given by $J = N + 7/2 = j_q + j_g + j_{\bar q} + N = j_g + j_g + j_g + N - 1$ in both cases,
and the conformal spin in the (31)-subchannel is equal to $j = k + 2 = j_q + j_{\bar q} + k =
j_g + j_g + k - 1$. Notice that $j \ge 3$ in the three-gluon case and, therefore, the basis functions $Y^h_{N-1,k-1}(x_i)$ in the
gluon-gluon subchannel are well-defined for $k - 1 \ge 0$. In what follows we assume that
$Y^h_{N-1,-1}(x_i) = 0$.
Each of the two sets of functions in (3.64) forms an orthonormal basis with respect to the conformal scalar product (3.39), whose explicit form depends on the conformal spin of the particles and, therefore, is different for the quark-antiquark-gluon and three-gluon systems. This suggests defining the scalar product on the space of two-dimensional vectors of coefficient functions in the following form:
$$\Psi = \begin{pmatrix}\Psi^q\\ \Psi^h\end{pmatrix}, \qquad \langle\Psi_1|\Psi_2\rangle = 2N_c\,\langle\Psi_1^q|\Psi_2^q\rangle + n_f\,\langle\Psi_1^h|\Psi_2^h\rangle, \tag{3.65}$$
where the scalar product in each sector, $\langle\Psi_1^q|\Psi_2^q\rangle$ and $\langle\Psi_1^h|\Psi_2^h\rangle$, is obtained from (3.39) by substituting $j_1 = j_3 = 1$, $j_2 = 3/2$ and $j_1 = j_2 = j_3 = 3/2$, respectively. As we show below, such a choice of the scalar product ensures the hermiticity of the evolution kernels.
We shall look for the solutions to the evolution equation (3.58) in the following form:
$$\begin{pmatrix}\Psi^q_N(x_i)\\ \Psi^h_{N-1}(x_i)\end{pmatrix} = \sum_{k=0}^{N}\begin{pmatrix} u^q_{N,k}\,Y^q_{N,k}(x_i)/\sqrt{2N_c}\\ u^h_{N,k}\,Y^h_{N-1,k-1}(x_i)/\sqrt{n_f}\end{pmatrix} \tag{3.66}$$
with $u^q_{N,k}$ and $u^h_{N,k}$ being the expansion coefficients. In this way, the evolution kernels finally become symmetric and real matrices acting on the vector of the expansion coefficients. We fix the normalization of the coefficients by requiring the eigenstates (3.66) to have unit norm,
$$\|\Psi_N\|^2 = 2N_c\,\|\Psi^q_N\|^2 + n_f\,\|\Psi^h_{N-1}\|^2 = \sum_{k=0}^{N}\left(|u^q_{N,k}|^2 + |u^h_{N,k}|^2\right) = 1, \tag{3.67}$$
where $u^h_{N,0} = 0$.
3.5. QCD evolution kernels
To one-loop accuracy, the evolution kernels are given [20] by the sum of two-particle Hamiltonians describing the pair-wise interaction between quarks and gluons on the light-cone: $\widetilde H = \widetilde H_{12} + \widetilde H_{23} + \widetilde H_{31}$, or, equivalently, $H = H_{12} + H_{23} + H_{31}$. Conformal invariance (3.26) then implies that each pair-wise Hamiltonian $H_{ik}$ only depends on the sum of conformal spins of the interacting particles, e.g., in the direct representation $H_{ik} = H(J_{ik})$, where
$$L^2_{ik} = (L_i + L_k)^2 = J_{ik}(J_{ik}-1) \tag{3.68}$$
and the SL(2) generators $L_{i,\alpha}$ ($\alpha = \pm, 0$) are given in Eq. (3.22). The explicit form of
this dependence can most easily be obtained by comparing the eigenvalues. To this end
it is sufficient to calculate the one-gluon exchange diagrams in a simplified situation when
momenta of the contributing partons sum up to zero. Alternatively, one can start with the
known integral representation for the evolution Hamiltonian [9] and project it onto the
conformal basis (3.64).
3.5.1. Diagonal evolution kernels

The diagonal quark evolution kernel is given by
$$H_{qq}\,|Y^q_{N,k}\rangle = \left[H_{S^+} + \frac{2n_f}{3}\,\delta_{J_{13},2}\right]|Y^q_{N,k}\rangle, \tag{3.69}$$
where $b = 11/3\,N_c - 2/3\,n_f$ is the lowest-order coefficient of the QCD $\beta$-function. The first term in brackets stands for the flavour-nonsinglet evolution kernel (see below) and the second term, proportional to $\delta_{J_{13},2}$, comes from the additional Feynman diagram in which the quark and the antiquark annihilate to produce a gluon that splits again into a $q\bar q$ pair. Since the quark and the antiquark are produced in this way at one space point, the contribution of this diagram is different from zero only for $J_{13} = j_q + j_{\bar q} = 2$. The flavour-nonsinglet Hamiltonian $H_{S^+}$ can be represented as [20,22-24]:
$$H_{S^+} = N_c\,H^{(0)} - \frac{2}{N_c}\,H^{(1)}, \tag{3.70}$$
where
$$H^{(0)} = V^{(0)}_{qg}(J_{12}) + U^{(0)}_{qg}(J_{23}), \tag{3.71}$$
$$H^{(1)} = V^{(1)}_{qg}(J_{12}) + U^{(1)}_{qg}(J_{23}) + U^{(1)}_{qq}(J_{13}). \tag{3.72}$$
Here, the notation was introduced for the two-particle quark-quark and quark-gluon kernels
$$\begin{aligned}
V^{(0)}_{qg}(J) &= \psi(J+3/2) + \psi(J-3/2) - 2\psi(1) - 3/4,\\
U^{(0)}_{qg}(J) &= \psi(J+1/2) + \psi(J-1/2) - 2\psi(1) - 3/4,\\
V^{(1)}_{qg}(J) &= \frac{(-1)^{J-5/2}}{(J-3/2)(J-1/2)(J+1/2)},\\
U^{(1)}_{qg}(J) &= -\frac{(-1)^{J-5/2}}{2(J-1/2)},\\
U^{(1)}_{qq}(J) &= \frac{1}{2}\left[\psi(J-1) + \psi(J+1)\right] - \psi(1) - 3/4,
\end{aligned} \tag{3.73}$$
where $\psi(x) = d\ln\Gamma(x)/dx$.
The diagonal gluon kernel is defined as
$$H_{hh}\,|Y^h_{N-1,k-1}\rangle = N_c\left[H_{3/2} - V_{3/2}\right]|Y^h_{N-1,k-1}\rangle, \tag{3.74}$$
where
$$H_{3/2} = 2\left[U_{gg}(J_{12}) + U_{gg}(J_{23}) + U_{gg}(J_{31})\right] - b/N_c, \tag{3.75}$$
$$V_{3/2} = V_{gg}(J_{12}) + V_{gg}(J_{31}), \tag{3.76}$$
$$U_{gg}(j) = \psi(j) - \psi(1), \tag{3.77}$$
and
$$V_{gg}(j) = \frac{2}{j(j-1)} + \frac{3\left(1 + (-1)^j\right)}{(j-2)(j-1)\,j\,(j+1)}. \tag{3.78}$$
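As a cross-check of the two-particle kernels, the sketch below evaluates them with exact fractions and computes the energy of the single quark-sector state at $N = 0$, for which all pair spins take their minimal values, $J_{12} = J_{23} = j_q + j_g = 5/2$ and $J_{13} = j_q + j_{\bar q} = 2$. The signs used here (minus signs are not legible in the extracted equations) are assumptions; with them the result equals $\frac{17}{6}N_c + \frac{1}{6N_c} + \frac{2}{3}n_f$. The function names are illustrative.

```python
from fractions import Fraction as F


def S(n):
    """psi(n) - psi(1) for a positive integer n, i.e. the harmonic number H_{n-1}."""
    n = F(n)
    if n.denominator != 1 or n < 1:
        raise ValueError("argument must be a positive integer")
    return sum((F(1, m) for m in range(1, n.numerator)), F(0))


def parity(p):
    """(-1)**p for an integer p, possibly passed as a Fraction."""
    p = F(p)
    if p.denominator != 1:
        raise ValueError("exponent must be an integer")
    return -1 if p.numerator % 2 else 1


# Quark-gluon and quark-antiquark kernels (signs are assumptions, see lead-in).
def V0_qg(J):
    return S(J + F(3, 2)) + S(J - F(3, 2)) - F(3, 4)


def U0_qg(J):
    return S(J + F(1, 2)) + S(J - F(1, 2)) - F(3, 4)


def V1_qg(J):
    return F(parity(J - F(5, 2))) / ((J - F(3, 2)) * (J - F(1, 2)) * (J + F(1, 2)))


def U1_qg(J):
    return -F(parity(J - F(5, 2))) / (2 * (J - F(1, 2)))


def U1_qq(J):
    return (S(J - 1) + S(J + 1)) / 2 - F(3, 4)


# Gluon-gluon kernels.
def U_gg(j):
    return S(j)


def V_gg(j):
    j = F(j)
    return 2 / (j * (j - 1)) + 3 * (1 + parity(j)) / ((j - 2) * (j - 1) * j * (j + 1))


def quark_energy_N0(Nc, nf):
    """Energy of the only quark-sector state for N = 0 (J13 = 2, so the delta term is on)."""
    J12 = J23 = F(5, 2)
    J13 = F(2)
    H0 = V0_qg(J12) + U0_qg(J23)
    H1 = V1_qg(J12) + U1_qg(J23) + U1_qq(J13)
    return Nc * H0 - F(2, Nc) * H1 + F(2 * nf, 3)


print(quark_energy_N0(3, 3))   # exact rational value for Nc = 3, nf = 3
print(V_gg(4))                 # even spin: equals 1/((j-1)(j-2)) + 1/(j(j+1))
```

For even $j$ the gluon kernel $V_{gg}(j)$ reduces to $1/((j-1)(j-2)) + 1/(j(j+1))$, which is a convenient way to check the pairing of numerators and denominators in the second term.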
We notice that the kernel Hhh is invariant under permutations of the two gluons of the same
helicity
[Hhh , P23 ] = 0.
(3.79)
Note that the pair-wise diagonal Hamiltonians $\widetilde H_{ik}$ and $H_{ik}$ have the same functional dependence on the Casimir operators, i.e., $\widetilde H_{ik} = h(\widetilde J_{ik})$ and $H_{ik} = h(J_{ik})$ with the same function $h$.
3.5.2. Off-diagonal evolution kernels
The off-diagonal kernels describe the mixing between quark-antiquark-gluon and three-gluon states. The conformal symmetry implies that the conformal spin of both states should be the same, and this applies both to the total conformal spin of the three-parton system and the conformal spin of the parton pair involved in the mixing. It follows that, acting on the basis function in the quark sector, $|Y^q_{N,k}\rangle$, the evolution kernel $H_{hq}$ transforms it into the basis function in the gluon sector, $|Y^h_{N-1,k-1}\rangle$, with the same N, k. Similarly, the evolution kernel $H_{qh}$ maps the gluon conformal basis into the quark one. As a consequence, the off-diagonal kernels can be written down in the following form:
$$H_{qh} = W_{qh}(J_{31})\,\frac{1-P_{23}}{2}, \qquad H_{hq} = \frac{1-P_{23}}{2}\,W_{hq}(J_{31}), \tag{3.80}$$
where the operators $W_{qh}$ and $W_{hq}$ are defined as
$$W_{qh}\,|Y^h_{N-1,k-1}\rangle = n_f\,\frac{J_{31}^2 - J_{31} + 2(-1)^{J_{31}}}{J_{31}(J_{31}-1)\sqrt{(J_{31}+1)(J_{31}-2)}}\;|Y^q_{N,k}\rangle,$$
$$W_{hq}\,|Y^q_{N,k}\rangle = 2N_c\,\frac{J_{31}^2 - J_{31} + 2(-1)^{J_{31}}}{J_{31}(J_{31}-1)\sqrt{(J_{31}+1)(J_{31}-2)}}\;|Y^h_{N-1,k-1}\rangle. \tag{3.81}$$
The following comments are in order. The off-diagonal kernels $H_{qh}$ and $H_{hq}$ originate from the Feynman diagrams in which quark and antiquark annihilate into two gluons of opposite helicity. As a consequence, these kernels depend on the conformal spin in the quark-antiquark subchannel, $J_{31}$, and the additional projector $(1-P_{23})/2$ takes into account antisymmetry of the three-gluon state under the interchange of gluons with the same helicity, Eq. (3.49).
It follows from (3.81) that the off-diagonal kernels are adjoint to each other, $H_{qh} = H_{hq}^\dagger$, with respect to the scalar product (3.65). Indeed, calculating the matrix elements of the off-diagonal kernels between the states $\Psi^q_N$ and $\Psi^h_{N-1}$ defined in (3.66) and taking into account their symmetry properties (3.56) one finds
$$\langle\Psi^q_N|H_{qh}\Psi^h_{N-1}\rangle = \langle H_{hq}\Psi^q_N|\Psi^h_{N-1}\rangle = \sqrt{2N_c n_f}\,\sum_{k=1}^{N}\frac{(k+1)(k+2) + 2(-1)^k}{(k+1)(k+2)\sqrt{k(k+3)}}\;u^q_{N,k}\,u^h_{N,k}, \tag{3.82}$$
where we have substituted $J_{31} = k + 2$. Repeating a similar calculation for the matrix elements of the diagonal kernels, $\langle\Psi^q_N|H_{qq}\Psi^q_N\rangle$ and $\langle\Psi^h_{N-1}|H_{hh}\Psi^h_{N-1}\rangle$, one makes sure that $H_{qq}^\dagger = H_{qq}$ and $H_{hh}^\dagger = H_{hh}$. We conclude that the matrix of the evolution kernels entering the Schrödinger equation (3.58) is a hermitian operator on the space of the coefficient functions endowed with the scalar product (3.65). As a consequence, its eigenvalues $E_N$ are real and the corresponding eigenfunctions are orthogonal to each other.
3.6. Expansion in conformal operators

Once the Schrödinger equation in the helicity basis (3.58) is solved, we can easily restore the gluon coefficient function $\Psi^g_{N,\alpha}$ using the symmetry relation (3.57) and reconstruct the multiplicatively renormalizable operators (3.13). Their reduced forward matrix elements can be expressed in terms of the multiparton distributions as follows:

ON, Q2
=
1
1
N,
#

$
q
g
D 4Nc N, (i )Dq i , Q2 + nf N1, (i )Dg i , Q2 .
(3.83)
Notice that in contrast with (3.39) the integration goes here over the region $\xi_1 + \xi_2 + \xi_3 = 0$. Comparing this expression with the expansion in the standard operator basis (3.5) and assuming that the eigenvectors $\Psi_{N,\alpha}$ are normalized to unity one gets the equivalence relations

%
q
q
Ck (N)1n 3Nk ,
4Nc N, (i )%. =0 =
i
%
g
nf N1, (i )%. =0
i
0kN

g
Cm (N) 1m 3N1m 1N1m 3m .
(3.84)
0m[N/2]1
q
Next, expanding the OPE coefficient function n (1 , 3 ) (3.1) over the eigenstates of
the evolution kernels
%

q
q
2Nc N, (i ) %%
N+3 (1 , 1 3 , 3 )
wN,
=
(3.85)
g
0
nf N1, (i ) %1 +2 +3 =0
we can decompose moments of the structure function g2 (x, Q2 ) (3.3) in multiplicatively

renormalizable contributions (3.83) as
g2LL
[3N/2]

1 2 4

2
N + 3, Q = eq
wN, ON, Q2 .
2
N +3
(3.86)
=0
The expansion coefficients $w_{N,\alpha}$ can be expressed in terms of $C^q_k(N)$ and $C^g_m(N)$ by comparing the coefficients in front of different powers of $\xi_1$ and $\xi_3$ on both sides of (3.85).
3.6.1. Quark coefficient function
There exists, however, a more efficient way of finding the same coefficients. To this end, observe that the OPE coefficient function $\Phi^q_{N+3}(\xi_1, -\xi_1-\xi_3, \xi_3)$ can uniquely be continued from the hyperplane $\xi_1 + \xi_2 + \xi_3 = 0$ to arbitrary values of $\xi_i$ by requiring that the thus defined function $\Phi^q_{N+3}(\xi_1, \xi_2, \xi_3)$ satisfies the conformal constraints (3.35) (and coincides with $\Phi^q_{N+3}(\xi_1, -\xi_1-\xi_3, \xi_3)$ at $\xi_2 = -\xi_1 - \xi_3$). To find an explicit expression, let us expand $\Phi^q_{N+3}(\xi_1, \xi_2, \xi_3)$ over the conformal basis (3.64)
N+3 (x1 , x2 , x3 ) =
N
N,k YN,k (xi )
(3.87)
k=0
q
N,k
q
q
N+3 |YN,k .
=
Projecting out onto the hyperplane
with
using (3.61) that the basis functions are reduced to

%
q
N (1,1) x1 x3
%
.
YN,k (xi ) x =0 (x1 + x3 ) Pk
i i
x1 + x3
i xi
= 0, we find
(3.88)
and form an orthogonal basis on the subspace x1 +x3 = 1 and 0 x1 , x3 1. This property
allows us to calculate the expansion coefficients N,k as
q
N,k
1
(1,1)
dx1 dx3 (1 x1 x3 ) x1x3 Pk

0

x1 x3
q
N+3 (x1 , x1 x3 , x3 ).
x1 + x3
(3.89)
Substituting the actual expression for N+3 (i ) (2.22) and performing the integration one
arrives at

N + k + 5
N + k + 4
q
Nk
Nk
+ (1)
N,k = (1)
+1
1
N k+1
N k+2
1/2

(k + 1)(k + 2)(2k + 3)(N k + 1)(N k + 2) E 4 (N + 3)
.
8(N + 3)(N + k + 4)(N + k + 5)

E(2N + 6)
(3.90)
Using these coefficients one can easily calculate the norm of the quark coefficient function

N
, q ,2
q 2 E 4 (n) n4
1
2
,
, =
1
+
1
4(n)
=
E
N,k
N+3
E(2n) 4
n2
n3
(3.91)
k=0
where in the r.h.s. n = N + 3.

Finally, using (3.87) and (3.90) one finds the expansion coefficients for moments of the
structure function
N, =
% q
q
N+3 %N, =
q
N
1 q q
u .
2Nc k=0 N,k N,k
(3.92)
The coefficients $u^q_{N,k}$ stand for the expansion of a multiplicatively renormalizable operator in the conformal basis, cf. Eqs. (3.66) and (3.67). Their dependence on the particular operator (index $\alpha$) is tacitly assumed.
3.6.2. Gluon coefficient functions

A similar procedure can be used to calculate the expansion coefficients of the gluon coefficient functions $\Phi^g_k$ and $\Phi^{qg}_k$ defined in (2.19) and (2.23). The corresponding conformal harmonics in the helicity basis are given in the forward direction by
$$Y^h_{N-1,k}(x_i)\Big|_{\sum_i x_i = 0} \sim (x_1+x_3)^{N-1}\, P_k^{(2,2)}\!\left(\frac{x_1-x_3}{x_1+x_3}\right). \tag{3.93}$$
In order to decompose the gluon coefficient function (2.23) over this basis we first construct the same coefficient function in the helicity representation
$$\Phi^h_{N+3}(x_i) = \frac{1}{2}\left[\Phi^g_{N+3}(x_1,x_2,x_3) - \Phi^g_{N+3}(x_1,x_3,x_2)\right] \tag{3.94}$$
so that $P_{23}\,\Phi^h_{N+3}(x_i) = -\Phi^h_{N+3}(x_i)$ and then define the expansion coefficients $\varphi^h_{N-1,k}$ as
$$\Phi^h_{N+3}(x_i) = \sum_{k=0}^{N-1} \varphi^h_{N-1,k}\,Y^h_{N-1,k}(x_i). \tag{3.95}$$
Using orthogonality of the Jacobi polynomials we can calculate these coefficients as
$$\varphi^h_{N-1,k} \sim \int_0^1 dx_1\,dx_3\;\delta(1-x_1-x_3)\;(x_1 x_3)^2\;P_k^{(2,2)}\!\left(\frac{x_1-x_3}{x_1+x_3}\right)\Phi^h_{N+3}(x_1, -x_1-x_3, x_3). \tag{3.96}$$
Substituting the gluon coefficient function (2.23) into (3.94) and (3.96) and going through
the calculation one finds the following explicit expressions for the expansion coefficients
for even conformal spins N
h
N1,k
=
(N + k + 5)(N + 7 + 5/2 k + 1/2 k 2)

hN,k
(k + 1)(N k + 1)
(3.97)
for even k and

h
N1,k
=
(k + 4)(N + k + 6)
hN,k
2(N k)
(3.98)
for odd k. Here, the normalization constant $h_{N,k}$ is given by

h2N,k =
(k + 1)(2 k + 5)(N k)(N k + 1)(N + 3)2 E 4 (N + 3)

.
8(k + 3)(k + 4)(k + 2)(N + 5 + k)(N + k + 6) E(2N + 6)
(3.99)
Using (3.95) we can calculate the norm of the gluon coefficient function

, h ,2 N1
2
h
,
, =
N1,k
N+3
=
k=0
E 4 (n)
n3 (n 1)
2(n 4)(n + 1)
1+
)
((n)
+
E ,
E(2n)
16
(n 1)2 n
where n = N + 3, as above.
(3.100)

The second gluon coefficient function, $\Phi^{qg}_n(\xi_i)$, appears in the expression for the moments of the structure function (2.19) due to mixing between quark-antiquark-gluon and three-gluon distribution amplitudes. Similar to (3.94) and (3.95), one can transform this function into the helicity representation and decompose it over the conformal basis
qh

1 qg
qg
N+3 (x1 , x2 , x3 ) N+3 (x1 , x3 , x2 )
2
N1

h
h
=
KN1,k
YN1,k
(xi ).
N+3 (xi ) =
(3.101)
k=0
Substituting $\Phi^{qg}_n(\xi_i)$ by its explicit expression (2.23), it becomes straightforward to calculate the coefficients $K^h_{N-1,k}$ using (3.96). The resulting explicit expression is cumbersome and will not be displayed here. Comparing the expansions (3.95) and (3.101) one finds, however, [12] that the two coefficient functions $\Phi^g_n(\xi_i)$ and $\Phi^{qg}_n(\xi_i)$ are to a good accuracy proportional to each other:
$$\Phi^{qg}_n(\xi_i) \simeq c(n)\,\Phi^g_n(\xi_i). \tag{3.102}$$
Applying the transformations (3.94) and (3.101) to both sides of this relation and using the orthogonality of the conformal basis we can calculate the coefficient c(n) as

/, h ,2
qh % /, ,2 N1
h
h
, ,
N1,k
KN1,k
c(n) = n %nh ,nh , =
n
(3.103)
k=0
with the norm $\|\Phi^h_n\|$ given in (3.100). Going through the calculation one finds that c(n) is given at large n by the following expansion:
$$c(n) = 1 + \frac{1}{n^2}\left[4\ln n + 4\gamma_E - 6\right] + O\!\left(1/n^3\right). \tag{3.104}$$
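To get a feeling for how close to unity the proportionality coefficient is, the large-n expansion can be evaluated directly. The sketch below assumes the form $c(n) = 1 + [4\ln n + 4\gamma_E - 6]/n^2$ with the $O(1/n^3)$ remainder dropped; since it is an asymptotic expression, the values at small n are indicative only.

```python
from math import log

EULER_GAMMA = 0.5772156649015329


def c_large_n(n):
    """Leading large-n expansion of c(n); the O(1/n^3) remainder is neglected."""
    return 1.0 + (4.0 * log(n) + 4.0 * EULER_GAMMA - 6.0) / n ** 2


for n in (5, 10, 30, 100):
    print(n, round(c_large_n(n), 5))
```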
4. Results
In this section we present detailed results on the solution of the Schrödinger equation in Eq. (3.58). The main advantage of the Hamiltonian approach described in the previous section is that it allows one to understand qualitative features of the solutions using the intuition and a wealth of analytical tools well known from quantum mechanics. For this analysis, we decompose the full Hamiltonian in (3.58) in two parts:
$$H_h = H_0 + V, \qquad H_0 = \begin{pmatrix} H_{qq} & 0\\ 0 & H_{hh}\end{pmatrix}, \qquad V = \begin{pmatrix} 0 & H_{qh}\\ H_{hq} & 0\end{pmatrix}. \tag{4.1}$$
The Hamiltonian $H_0$ governs the scale-dependence separately in the quark and gluon sectors, whereas the off-diagonal kernel V describes the mixing between the two sectors. Our strategy throughout this section will be first to solve the Schrödinger equation for $H_0$ and then examine the deformation of the spectrum induced by V that we, somewhat imprecisely, refer to as the quark-gluon mixing. The rationale for such a two-step procedure is that properties of the individual Hamiltonians $H_{qq}$ and $H_{hh}$ have been studied in detail in recent works [22-24] using their relation to completely integrable models. We will, therefore, be able to use these results.
4.1. Diagonal evolution kernels
The spectrum of eigenvalues of $H_0$ is obviously given by the superposition of the two independent spectra of quark-antiquark-gluon and three-gluon operators,
$$H_{qq}\,\Psi^{q,(0)}_{N,\alpha} = E^q_{N,\alpha}\,\Psi^{q,(0)}_{N,\alpha}, \qquad H_{hh}\,\Psi^{h,(0)}_{N-1,\beta} = E^g_{N,\beta}\,\Psi^{h,(0)}_{N-1,\beta}, \tag{4.2}$$
and the corresponding eigenfunctions are given by
$$\Psi^{(0)}_{N,\alpha}(x_i) = \begin{pmatrix}\Psi^{q,(0)}_{N,\alpha}(x_i)\\ 0\end{pmatrix}, \qquad \Psi^{(0)}_{N,\beta+N+1}(x_i) = \begin{pmatrix} 0\\ \Psi^{h,(0)}_{N-1,\beta}(x_i)\end{pmatrix}. \tag{4.3}$$
The superscript (0) stands to remind that the off-diagonal mixing terms are omitted and the subscripts $\alpha$ and $\beta$ enumerate different quark and gluon eigenstates corresponding to multiplicatively renormalizable operators with N (quark) or N - 1 (gluon) covariant derivatives with the same canonical dimension (and the same conformal spin J = N + 7/2). In addition, we require that the gluon eigenfunctions satisfy the symmetry property (3.56). For a given N the total number of the quark and gluon eigenstates is equal to N + 1 and [N/2], respectively, so that $\alpha = 0, \ldots, N$ and $\beta = 0, \ldots, [N/2] - 1$.
The results of the numerical calculation of the spectrum for N < 20 and $n_f = 3$ are shown in Fig. 1. Note that the gluon eigenstates (open circles) and the quark eigenstates (crosses) occupy two bands that lie on top of each other. For large N the eigenvalues (anomalous dimensions) rise logarithmically,
$$4C_F\ln N \lesssim E^q_{N,\alpha} \lesssim 4N_c\ln N, \qquad 4N_c\ln N \lesssim E^g_{N,\beta} \lesssim 6N_c\ln N, \tag{4.4}$$
and the coefficients in front of ln N are related to the color charges of the corresponding (classical) parton configurations. Since O(N) levels have to fit within the band-width O(ln N) (4.4), the distance between the neighboring levels in general goes to zero. The analysis of the fine structure of the spectra [23,24] reveals, however, that in the limit $N \to \infty$ three levels remain separated from the rest of the spectrum by a finite gap. These three special levels are: the lowest quark level, the highest quark level and the lowest gluon level; they will play a decisive role in what follows.

Fig. 1. The spectrum of anomalous dimensions of flavor-singlet twist-3 operators with the mixing between quark-antiquark-gluon (crosses) and three-gluon operators (open circles) switched off, cf. (4.1).
Fig. 2. The dependence of the anomalous dimensions of flavor-singlet quark-antiquark-gluon (crosses) and three-gluon operators (open circles) on the number of light quark flavors $n_f$. The dotted line corresponds to the linear dependence $(2/3)n_f$, see text. The quark-gluon mixing is switched off, cf. (4.1).

Fig. 3. The coefficients of the expansion (4.6) of $Y^q_{N,k=0}(x_i)$ over the eigenstates of the quark Hamiltonian $H_{qq}$. N = 8 and N = 20 for the left and the right panel, respectively.

4.1.1. The highest quark-antiquark-gluon state
The eigenvalues $E^q_{N,\alpha}$ and $E^g_{N,\beta}$ depend on the number of light quark flavors $n_f$. This dependence is shown in Fig. 2 for N = 8 and $0 \le n_f \le 10$ and reveals the following remarkable pattern: In the gluon sector the $n_f$-dependence of all energy levels is linear, $E^g_{N,\beta} \sim 2n_f/3$, and it can be traced to the additive b-correction to (3.75). In contrast to this, in the quark sector all energy levels except the highest one vary very slowly with $n_f$. At the same time, the $n_f$-dependence of the highest quark level is almost identical to that of the gluon levels, $E^q_{N,N} \sim 2n_f/3$. To understand this property, notice that the $n_f$-dependence of the diagonal quark kernel in Eq. (3.69) comes entirely from the annihilation term $(2n_f/3)\,\delta_{J_{13},2}$. At large $n_f$ this term dominates and the corresponding eigenstate is given by
$$\Psi^{q,(0)}_{N,N}(x_i) = Y^q_{N,k=0}(x_i) + O(1/n_f). \tag{4.5}$$
Remarkably enough, this relation holds true with high accuracy for small values of $n_f$ as well, including $n_f \to 0$. To illustrate this, we show in Fig. 3 the coefficients in the expansion of $Y^q_{N,k=0}(x_i)$ over the complete set of the quark eigenstates:
$$Y^q_{N,k=0}(x_i) = \sum_{\alpha=0}^{N} \langle Y^q_{N,k=0}|\Psi^{q,(0)}_{N,\alpha}\rangle\,\Psi^{q,(0)}_{N,\alpha}(x_i) \tag{4.6}$$
for $n_f = 3$ and two different values of N. Note that the normalization is such that $\|\Psi^{q,(0)}_{N,\alpha}\| = 1$ and, as a consequence, the sum of the coefficients squared is equal to one. It is seen that the overlap with the exact wave function of the highest quark state $\alpha = N$ is very close to unity.
As yet another test of the approximation for the wave function (4.5) one evaluates the corresponding energy (see Appendix A for details)
$$E^q_{N,N} \simeq \langle Y^q_{N,k=0}|H_{qq}|Y^q_{N,k=0}\rangle = N_c\left[2\psi(N+3) + 2\psi(N+4) + 4\gamma_E - \frac{6}{(N+3)^2} - \frac{19}{6}\right] + \frac{1}{N_c}\,\frac{6}{(N+3)^2(N+4)} + \frac{2}{3}\,n_f. \tag{4.7}$$
Table 1
Exact numerical results for the energy of the three special levels (see text) compared with the calculation using the approximations for the corresponding eigenfunctions. The results for the lowest quark-antiquark-gluon, the highest quark-antiquark-gluon and the lowest three-gluon eigenstates are shown in the two left, two middle and two right columns, respectively.

N        $E^q_{N,0}$   $\langle\Phi^q|H_{qq}|\Phi^q\rangle$   $E^q_{N,N}$   $\langle Y_{k=0}|H_{qq}|Y_{k=0}\rangle$   $E^g_{N,0}$   $\langle\Phi^h|H_{hh}|\Phi^h\rangle$
N = 2    10.933        10.987        18.107        17.993        17.350        17.350
N = 4    12.629        12.644        22.587        22.395        22.087        22.160
N = 6    13.938        13.946        25.785        25.561        25.474        25.541
N = 8    14.994        14.999        28.285        28.046        28.107        28.162
N = 10   15.876        15.881        30.343        30.094        30.260        30.304
N = 30   20.824        20.828        41.634        41.367        41.847        41.878

The first few terms of the expansion of the matrix element at large N are
$$E^q_{N,N} \simeq N_c\left[4\ln(N+3) + 4\gamma_E - \frac{19}{6} - \frac{19}{3}\,\frac{1}{(N+3)^2}\right] + \frac{2}{3}\,n_f + O\!\left(\frac{1}{(N+3)^3}\right). \tag{4.8}$$
The approximation (4.7) is compared with the exact numerical calculations of the energy in Table 1 (two middle columns). The accuracy turns out to be very good, better than 1% for all N.
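The closed-form matrix element can be checked against the tabulated numbers. The sketch below evaluates $N_c[2\psi(N+3) + 2\psi(N+4) + 4\gamma_E - 6/(N+3)^2 - 19/6] + 6/(N_c(N+3)^2(N+4)) + 2n_f/3$, using $\psi(n) + \gamma_E = H_{n-1}$ for integer n, and compares it with the fourth numerical column of Table 1. Two assumptions are made here: the signs of the $6/(N+3)^2$ and $19/6$ terms (not legible in the extracted text) and the values $N_c = 3$, $n_f = 3$ for the table.

```python
from math import log

EULER_GAMMA = 0.5772156649015329


def harmonic(n):
    """Harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / m for m in range(1, n + 1))


def highest_quark_energy(N, Nc=3, nf=3):
    """Matrix element <Y_{k=0}|H_qq|Y_{k=0}> in closed form (signs assumed, see lead-in)."""
    n = N + 3
    # 2 psi(n) + 2 psi(n + 1) + 4 gamma_E = 2 H_{n-1} + 2 H_n
    bracket = 2.0 * harmonic(n - 1) + 2.0 * harmonic(n) - 6.0 / n ** 2 - 19.0 / 6.0
    return Nc * bracket + 6.0 / (Nc * n ** 2 * (n + 1)) + 2.0 * nf / 3.0


def highest_quark_energy_asymptotic(N, Nc=3, nf=3):
    """Large-N expansion of the same matrix element, dropping O(1/(N+3)^3)."""
    n = N + 3
    return Nc * (4.0 * log(n) + 4.0 * EULER_GAMMA - 19.0 / 6.0
                 - 19.0 / (3.0 * n ** 2)) + 2.0 * nf / 3.0


# Fourth numerical column of Table 1, copied from the text.
TABLE_1_APPROX = {2: 17.993, 4: 22.395, 6: 25.561, 8: 28.046, 10: 30.094, 30: 41.367}

for N, tabulated in TABLE_1_APPROX.items():
    print(N, round(highest_quark_energy(N), 3), tabulated,
          round(highest_quark_energy_asymptotic(N), 3))
```

With these assumptions the closed form agrees with the tabulated values to the quoted three decimals, and the large-N expansion approaches it quickly as N grows.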
Last but not least, we note that according to (3.88) the wave function entering (4.5), $Y^q_{N,0}(x_i) \sim (x_1+x_3)^N = (-x_2)^N$, depends on a single (gluon) momentum fraction only and, therefore, it defines a local operator (3.11) and (3.12) that contains derivatives acting on the gluon but not on the quark fields. All such operators can be obtained from a Taylor expansion of the nonlocal operator (2.11) in which the quark and the antiquark are located at the same space-time point and which can be rewritten as a two-gluon operator using the equations of motion:
x (vx)/
ax (vx)Dab Gbx (0).
G
x q(0) = i G
S (0, v, 0) = i q(0)g
(4.9)
Here a, b are color indices and $D^{ab}$ is the covariant derivative in the adjoint representation. Thus, to a good accuracy, the quark-antiquark-gluon three-particle operator with the highest anomalous dimension is in fact a two-particle gluon operator!
4.1.2. The lowest quark-antiquark-gluon state
Since eigenfunctions of the hermitian Hamiltonian Hqq are orthogonal to each other
and since the eigenfunction corresponding to the highest anomalous dimension turns out
to be very close to the eigenfunction of the nf -dependent contribution of the annihilation
diagram that gives rise to the second term in the Hamiltonian (3.69), it follows that the
wave functions of all other levels have a negligible overlap with this term and, therefore,
are to a good accuracy nf -independent, in agreement with Fig. 2.
Fig. 4. The coefficients of the expansion of $\Phi^q_{N+3}(x_i)/\|\Phi^q_{N+3}\|$ (4.10) over the eigenstates $\Psi^{q,(0)}_{N,\alpha}$ of the quark Hamiltonian $H_{qq}$.

This means that within the accuracy of (4.5), the eigenfunctions and eigenvalues of all quark levels other than the highest one coincide with those of the flavor-nonsinglet quark Hamiltonian $H_{S^+}$ that was studied in [21,23,24]. In particular, it has been shown that the wave function corresponding to the lowest anomalous dimension coincides, in the large-$N_c$ limit, with the tree-level coefficient function in the OPE (2.22) continued from the hyperplane $\xi_1 + \xi_2 + \xi_3 = 0$ to arbitrary values of $\xi_i$ using the conformal symmetry, cf. (3.87):
$$\Psi^{q,(0)}_{N,\alpha=0} = \frac{\Phi^q_{N+3}(x_i)}{\|\Phi^q_{N+3}\|} + O\!\left(1/N_c^2\right). \tag{4.10}$$
The additional factor in the r.h.s. takes into account the different normalization of the states. In Fig. 4 we show the coefficients of the expansion of $\Phi^q_{N+3}(x_i)/\|\Phi^q_{N+3}\|$ over the eigenstates of the quark Hamiltonian $H_{qq}$ for N = 8 and N = 20. It is seen that the expansion is dominated by the lowest quark state $\alpha = 0$. The corresponding eigenvalue (anomalous dimension) can be calculated as (see Appendix A for details)
$$E^q_{N,0} \simeq \frac{\langle\Phi^q_{N+3}|H_{qq}|\Phi^q_{N+3}\rangle}{\|\Phi^q_{N+3}\|^2} = N_c\,E^{(0)}_N - \frac{1}{N_c}\,E^{(1)}_N, \tag{4.11}$$
where [10]
$$E^{(0)}_N = 2\psi(N+3) + \frac{1}{N+3} - \frac{1}{2} + 2\gamma_E \tag{4.12}$$
and [12]
$$E^{(1)}_N = 2\left[\ln(N+3) + \gamma_E + \frac{3}{4} - \frac{\pi^2}{6} + O\!\left(\frac{\ln^2(N+3)}{(N+3)^2}\right)\right]. \tag{4.13}$$
The quality of this approximation is illustrated in Table 1 (compare the numbers in the first two columns).
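As a rough numerical illustration, the leading terms of the energy of the lowest quark level can be evaluated and set against the exact values in the first numerical column of Table 1. The sketch assumes $E \simeq N_c E^{(0)}_N - E^{(1)}_N/N_c$ with $E^{(0)}_N = 2\psi(N+3) + 1/(N+3) - 1/2 + 2\gamma_E$ and $E^{(1)}_N = 2[\ln(N+3) + \gamma_E + 3/4 - \pi^2/6]$, dropping the $O(\ln^2(N+3)/(N+3)^2)$ remainder; the relative signs and $N_c = 3$ are assumptions, since minus signs are not legible in the extracted text.

```python
from math import log, pi

EULER_GAMMA = 0.5772156649015329


def harmonic(n):
    """Harmonic number H_n = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / m for m in range(1, n + 1))


def lowest_quark_energy(N, Nc=3):
    """Leading approximation to the lowest quark level (remainder terms dropped)."""
    n = N + 3
    # 2 psi(n) + 2 gamma_E = 2 H_{n-1}
    e0 = 2.0 * harmonic(n - 1) + 1.0 / n - 0.5
    e1 = 2.0 * (log(n) + EULER_GAMMA + 0.75 - pi ** 2 / 6.0)
    return Nc * e0 - e1 / Nc


# First numerical column of Table 1 (exact lowest quark level), copied from the text.
TABLE_1_EXACT = {2: 10.933, 4: 12.629, 6: 13.938, 8: 14.994, 10: 15.876, 30: 20.824}

for N, exact in TABLE_1_EXACT.items():
    print(N, round(lowest_quark_energy(N), 3), exact)
```

Because the remainder is only dropped, not computed, the agreement is modest at N = 2 and improves steadily with N; the second column of Table 1, by contrast, is the full matrix element rather than its large-N expansion.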
Fig. 5. The coefficients of the expansion of $\Phi^h_{N+3}(x_i)/\|\Phi^h_{N+3}\|$ (4.14) over the eigenstates $\Psi^h_{N-1,\beta}$ of the gluon Hamiltonian $H_{hh}$.
4.1.3. The lowest three-gluon state

Last but not least, we have to work out an approximate description of the lowest three-gluon state. As noticed in [12], the corresponding eigenfunction appears to be close to the one-loop gluon coefficient function $\Phi^g(\xi_i)$ (2.23) transformed to the helicity representation (3.94) (and continued to arbitrary values of $\xi_i$ using the conformal symmetry):
$$\Psi^{h,(0)}_{N-1,\beta=0} \simeq \frac{\Phi^h_{N+3}(x_i)}{\|\Phi^h_{N+3}\|} = \frac{1}{2\,\|\Phi^h_{N+3}\|}\left[\Phi^g_{N+3}(x_1,x_2,x_3) - \Phi^g_{N+3}(x_1,x_3,x_2)\right]. \tag{4.14}$$
In contrast to the above, this approximation cannot be justified in any formal limit, but is no less accurate, as illustrated in Fig. 5 where we show the corresponding (normalized) expansion coefficients in the basis of exact three-gluon eigenstates.
The corresponding approximation for the energy can be calculated as (see Appendix A for details)
$$E^g_{N,0} \simeq \frac{\langle\Phi^h_{N+3}|H_{hh}|\Phi^h_{N+3}\rangle}{\|\Phi^h_{N+3}\|^2} = N_c\,E^g_N + \frac{2}{3}\,n_f \tag{4.15}$$
with
1 2
g
EN = 4 ln(N + 3) + 4E +
3
3

2 2

2
1
6
+
ln(N + 3) + E
N +3
3
3
2

ln (N + 3)
+O
.
(N + 3)2
(4.16)
It is compared with the exact result in Table 1 (the two last columns). Note that the energies of the lowest gluon and the highest quark states are very close to each other, as seen also from Fig. 1:
$$\left(E^q_{N,N} - E^g_{N,0}\right)/E^q_{N,N} \sim \frac{10^{-2}}{\ln(N+3)}. \tag{4.17}$$
In fact, the difference is so small that one can treat these two levels as degenerate for most purposes.
4.2. The quark-gluon mixing
The mixing of quark-antiquark-gluon and three-gluon operators is governed by the off-diagonal part V of the Hamiltonian (4.1). It turns out that the mixing is generally rather weak and for most practical purposes can be taken into account using standard perturbation theory. To the leading order, the mixing-induced corrections to the pure quark or gluon eigenstates (4.2), (4.3) are equal to
$$\Psi^{(1)}_{N,\alpha}(x_i) = \sum_{\beta} \frac{V_{\beta\alpha}}{E^q_{N,\alpha} - E^g_{N,\beta}} \begin{pmatrix} 0\\ \Psi^{h,(0)}_{N-1,\beta}(x_i)\end{pmatrix}, \tag{4.18}$$
$$\Psi^{(1)}_{N,\beta+N+1}(x_i) = -\sum_{\alpha} \frac{V_{\beta\alpha}}{E^q_{N,\alpha} - E^g_{N,\beta}} \begin{pmatrix} \Psi^{q,(0)}_{N,\alpha}(x_i)\\ 0\end{pmatrix}, \tag{4.19}$$
where the notation was introduced for the mixing matrix
$$V_{\beta\alpha} = \langle\Psi^{h,(0)}_{N-1,\beta}|H_{hq}|\Psi^{q,(0)}_{N,\alpha}\rangle. \tag{4.20}$$
Table 2
The exact energy of the three special levels (see text) with the mixing-induced corrections taken into account compared with the energy of the corresponding eigenstates with the mixing effects neglected. The results for the lowest quark-antiquark-gluon, the highest quark-antiquark-gluon and the lowest three-gluon eigenstates are shown in the two left, two middle and two right columns, respectively.

N        $E_{N,0}$   $E_{N,0} - E^q_{N,0}$   $E_{N,N}$   $E_{N,N} - E^q_{N,N}$   $E_{N,N+1}$   $E_{N,N+1} - E^g_{N,0}$
N = 2    10.739      -0.193      17.679      -0.428      18.171      0.821
N = 4    12.571      -0.058      22.324      -0.262      22.648      0.561
N = 6    13.912      -0.027      25.637      -0.148      25.872      0.397
N = 8    14.980      -0.015      28.195      -0.090      28.420      0.312
N = 10   15.867      -0.009      30.278      -0.065      30.530      0.270
N = 30   20.823      -0.001      41.600      -0.034      42.056      0.209

Fig. 6. A density plot for the matrix $|V_{\beta\alpha}|$ for N = 100.

It is easy to see that $\langle\Psi^{(0)}_N|V|\Psi^{(0)}_N\rangle = 0$ and, therefore, the energy eigenvalues do not receive any corrections to the first order of the perturbative expansion. This explains why the mixing-induced corrections to the anomalous dimensions are very small (see Table 2). The numerical results presented below are in all cases based on the exact diagonalization of the mixing matrix, and the first-order expressions in (4.18) only serve to illustrate the picture. Note that the perturbation theory cannot be used to describe the mixing between the highest quark-antiquark-gluon and the lowest three-gluon states because of the vanishing energy denominators; in this case the explicit diagonalization is mandatory.
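The statement that a purely off-diagonal perturbation shifts the energies only at second order can be illustrated with a two-level toy model, one quark level and one gluon level coupled by a single matrix element. This is a minimal sketch and not the calculation of the paper: the two unperturbed energies are taken from Table 1 for N = 2, while the coupling `v` is a hypothetical number chosen for illustration.

```python
import math


def exact_levels(e_q, e_g, v):
    """Eigenvalues of the symmetric matrix [[e_q, v], [v, e_g]], lower one first."""
    mean = 0.5 * (e_q + e_g)
    radius = math.hypot(0.5 * (e_q - e_g), v)
    return mean - radius, mean + radius


def second_order_shift(e_q, e_g, v):
    """Second-order shift of the level e_q; the first-order shift vanishes
    because the perturbation has no diagonal matrix elements."""
    return v * v / (e_q - e_g)


e_q, e_g = 10.933, 17.350   # lowest quark and lowest gluon levels for N = 2 (Table 1)
v = 0.5                     # hypothetical mixing matrix element, illustration only

lower, upper = exact_levels(e_q, e_g, v)
print("exact shift of the quark level:", lower - e_q)
print("second-order estimate:        ", second_order_shift(e_q, e_g, v))
```

The two levels repel each other, and the shift is suppressed by the energy gap. When the gap becomes small, as for the highest quark and the lowest gluon levels, the second-order estimate blows up and the matrix has to be diagonalized exactly, in line with the remark above.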
The mixing matrix $V_{\beta\alpha}$ has a rather peculiar wing-like shape, as illustrated in Fig. 6. We remind that the index $\alpha$ along the horizontal axis serves to enumerate different quark-antiquark-gluon states, starting with the one with the lowest anomalous dimension at $\alpha = 0$, and the index $\beta$ enumerates the three-gluon eigenstates in a similar fashion. Dark regions in Fig. 6 correspond to large absolute values of $V_{\beta\alpha}$ and light regions indicate small matrix elements. Note the complicated chess-board pattern with alternating large and small entries.
The most important feature that is seen in Fig. 6 is that the lowest quark eigenstate $\alpha = 0$ mixes significantly only with the lowest gluon eigenstate $\beta = 0$. In fact, we find a (roughly) exponential hierarchy of the matrix elements
$$|V_{0,0}| \gg |V_{1,0}| \gg \cdots \gg |V_{[N/2]-1,0}|, \tag{4.21}$$
that is valid for all N, see Fig. 7. Higher quark-antiquark-gluon states get mixed with gluons more heavily and involve many three-gluon states in an essential way. For a few highest quark states, $\alpha \sim N$, the mixing is again simplified somewhat, but still involves several (in general $\sim \ln N$) gluon states, see Fig. 6. It can be shown that the mixing coefficient $V_{0,N}$ of the highest quark-antiquark-gluon state (4.5) with the lowest three-gluon state (4.14) vanishes in the large-N limit and this smallness overcomes the enhancement due to the small energy denominator in (4.18) in the same limit, cf. (4.17).
Fig. 7. The mixing coefficients $V_{\beta 0}$ of the different three-gluon states $\beta = 0, 1, \ldots, [N/2] - 1$ with the lowest quark-antiquark-gluon state, for four different values of N: N = 8 (triangles), N = 20 (squares), N = 50 (open circles) and N = 100 (full circles).
Since the Hamiltonian (4.1) is hermitian, the same mixing matrix describes the mixing of the gluon states with the quark states. It is seen that the lowest gluon states $\beta = 0, 1, \ldots$ get mixed with many quark states while the highest gluon states $\beta \sim N/2$ only mix with quark states with $\alpha \sim 2/3\,N$.
As yet another illustration of the mixing pattern, we show in Fig. 8 the exact numerical results for the expansion coefficients of two different exact eigenstates of the full Hamiltonian (4.1), $\Psi_{N,0}(x_i)$ and $\Psi_{N,N+1}(x_i)$, over the pure quark and gluon eigenstates. The chosen states are those that reduce to the lowest quark and the lowest gluon states if the mixing is switched off. For the lowest quark state (which is the lowest eigenstate in the whole spectrum) the single significant contribution comes from the lowest gluon state ($\sim 10^{-3}$) while all other contributions are suppressed by another order of magnitude. For the gluon state the mixing is much larger ($\sim 10^{-1}$) and involves many quark states. This is the same structure that we observed earlier in Fig. 6. Another conclusion is that since the mixing-induced corrections are at most of the order of 10% (for large $N \sim 100$), the perturbative expansion in (4.18) should be rather accurate and this indeed can be verified by the direct numerical calculation.
Fig. 8. Exact numerical results for the expansion coefficients of the exact eigenstates of the Hamiltonian $H_h$ over the eigenvectors of the diagonal blocks: (a) the exact lowest eigenstate, $\langle\Psi_{N,0}|\Psi^{(0)}_{N,k}\rangle$; (b) the exact lowest eigenstate in the gluon sector, $\langle\Psi_{N,N+1}|\Psi^{(0)}_{N,k}\rangle$. The index k enumerates the operators ordered according to their anomalous dimension; $0 \le k = \alpha \le N$ corresponds to quark operators, $k = N + 1 + \beta \ge N + 1$ to gluons, cf. (4.3). The quark eigenstates are shown by crosses, the gluon eigenstates by open circles.
4.3. Reduction of the mixing matrix: the two-channel DGLAP equation

We have defined the transverse spin quark and gluon distributions $\Delta q_T(x)$ and $\Delta g_T(x)$ as the specific projections of the generic quark-antiquark-gluon and three-gluon operators, Eqs. (2.28) and (2.29), respectively. They correspond to the leading order quark and gluon contributions to the operator product expansion of the T-product of the two electromagnetic currents (2.10) and determine the odd moments of the structure function (2.19) at a high scale $Q^2$

tw-3 1

s

1
2qT+ n, Q2 + CgT (n) 2gT n, Q2 . (4.22)
g2 n, Q2
=
eq2
2
n
q=u,d,s,...
Here, we took into account the relation between the gluon coefficient functions (3.102) and
introduced a notation for the overall gluon coefficient function CgT
1
CgT (n) =
dx x n1 CgT (x) =
0

1
1 c(n)[(n) + E + 1]
n(n + 1)
(4.23)
with c(n) given by (3.104). It is therefore natural to consider a model for the structure function in which $g_2(x, Q^2)$ can be expressed in terms of these two distributions defined at a low scale $\mu^2$. Such a model cannot be exact, since it is not theoretically consistent: other partonic degrees of freedom are generated through the QCD evolution of $\Delta q_T(x, Q^2)$ and $\Delta g_T(x, Q^2)$ down to a low scale. Assuming the absence of their contribution at two different scales $Q^2$ and $\mu^2$ simultaneously is, therefore, not possible. We shall argue, however, that the admixture of the additional degrees of freedom turns out to be small numerically, at least for large N. This means that the two-component model for the structure function $g_2(x, Q^2)$ based on the parton distributions $\Delta q_T(x, Q^2)$ and $\Delta g_T(x, Q^2)$ has a good numerical accuracy and can eventually be made more sophisticated (involve more states) when and if high accuracy experimental data become available.
In order to unravel the qualitative structure of the evolution it is necessary to go over to large moments N so that the number of contributing operators becomes large and possible systematic effects more transparent. To this end we take N = 100 and, as a first step, examine the coefficients in the expansion of $\Delta q_T^+(N+3) = \int_0^1 dx\, x^{N+2}\, \Delta q_T^+(x, Q^2)$ and $\Delta g_T(N+3) = \int_0^1 dx\, x^{N+2}\, \Delta g_T(x, Q^2)$ in contributions of the multiplicatively renormalizable operators (3.83). We recall that the moments $\Delta q_T^+(n)$ and $\Delta g_T(n)$ are given by reduced matrix elements of the local composite quark-antiquark-gluon and three-gluon operators defined as
q
%
g
%
(zi ) %
2gT (n) = n (i ) O
.
2qT+ (n) = 2 n (i ) S (zi ) %z =0 ,
(4.24)
z =0
i
Replacing the nonlocal operators by the expansion over the multiplicatively renormalizable
operators (3.16) one gets

2qT+ N
+ 3, Q
=2
3N/2
% q

q
N+3 %N, ON, (2 ) LEN, /b ,
=0

2gT N + 3, Q2 =
3N/2
% h

2 E /b
h
%
N+3
L N, ,
N1, ON,
(4.25)
=0
q
h
are quark and gluon components of the exact eigenstate of the
where N, and N1,
evolution kernel in the helicity basis.
q
q
The expansion coefficients N+3 |N, for N = 100 are shown in Fig. 9a. It is seen
that $\Delta q_T^+(N+3)$ is grossly dominated by the contribution of the local operator with the
lowest anomalous dimension, which means that all other contributions to the evolution
are small. Among the other contributions, noticeable corrections come from a few quark
operators with low anomalous dimensions, and a single gluon operator with the lowest
anomalous dimension in the gluon sector. The contributions of the quark operators with
anomalous dimensions close to the lowest one are less interesting than the gluon
contribution for the following two reasons:
• Since the anomalous dimensions of important quark operators are all close to that of
  the leading quark operator, the corresponding contributions are barely distinguishable
  by the evolution.
• The admixture of quark operators with the next-to-lowest, etc., anomalous
  dimensions is roughly the same in the flavor-nonsinglet and the flavor-singlet
  channels. Because of this, one does not expect any qualitative effects. On the other
  hand, the appearance of the gluon operator is a new feature of the singlet evolution.
Fig. 9. Exact numerical results for the normalized coefficients of the expansion of the
N = 100th moment of (a) the quark and (b) the gluon transverse spin distributions in the
contributions of multiplicatively renormalized operators. The index k enumerates the
operators ordered according to their anomalous dimension; 0 ≤ k ≤ N corresponds to quark
operators and k ≥ N + 1 to gluons, cf. (4.3). The quark eigenstates are shown by crosses,
the gluon eigenstates by open circles.

Neglecting the higher quark and gluon states in (4.25) we find that for arbitrary (large)
N the moments of the quark distribution receive the dominant contribution from only
two states, the lowest quark level (index 0) and the lowest gluon level (index N + 1), and,
therefore, depend only on two nonperturbative parameters. As a consequence, the moments
$\Delta q_T^+(n,Q^2)$ thus defined satisfy the two-channel evolution equation, which can be written in
the standard DGLAP form

\[
Q^2\frac{d}{dQ^2}\,\Delta q_T^+(x;Q^2) = \frac{\alpha_s}{4\pi}\int_x^1\frac{dy}{y}
  \left[P_{qq}^T(x/y)\,\Delta q_T^+(y;Q^2) + P_{qg}^T(x/y)\,\Delta g_T(y;Q^2)\right]. \tag{4.26}
\]
In order to determine the splitting functions one can take moments so that the convolution
in (4.26) gets replaced by a product and identify the moments of the splitting functions
with the corresponding anomalous dimensions (with sign minus).
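The statement that taking moments replaces the convolution in (4.26) by a product can be checked numerically. The sketch below is only an illustration: the test functions f(z) = z and g(y) = 1 − y are arbitrary choices (they are not the splitting functions or distributions of this paper), and all helper names are hypothetical.

```python
# Minimal sketch: Mellin moments turn the convolution
#   (f * g)(x) = int_x^1 dy/y f(x/y) g(y)
# into a product of moments, M[f * g](n) = M[f](n) * M[g](n).
# The test functions are illustrative assumptions, not quantities from the paper.

def midpoint_integral(func, a, b, steps=400):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (i + 0.5) * h) for i in range(steps))

def mellin_moment(func, n, steps=400):
    """n-th moment: int_0^1 dx x^(n-1) func(x)."""
    return midpoint_integral(lambda x: x ** (n - 1) * func(x), 0.0, 1.0, steps)

def convolution(f, g, x, steps=400):
    """(f * g)(x) = int_x^1 dy/y f(x/y) g(y)."""
    return midpoint_integral(lambda y: f(x / y) * g(y) / y, x, 1.0, steps)

def f(z):
    return z            # exact moment: 1/(n+1)

def g(y):
    return 1.0 - y      # exact moment: 1/(n(n+1))

n = 3
lhs = mellin_moment(lambda x: convolution(f, g, x), n)   # moment of the convolution
rhs = mellin_moment(f, n) * mellin_moment(g, n)          # product of the moments
exact = 1.0 / (n * (n + 1) ** 2)                         # analytic value of both
print(lhs, rhs, exact)
```

For these test functions the convolution is 1 − x + x ln x, whose n-th moment is 1/(n(n+1)²), so all three printed numbers should agree up to the quadrature error.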
Since moments of the quark distributions are obtained from the nonlocal quark–antiquark–gluon
operators by projecting onto the quark coefficient function (4.24), one can
q
calculate the anomalous dimensions by applying n (i ) to the evolution equation (3.14).
q
g
Since the coefficient functions N+3 and N+3 of the quark and gluon transverse spin
distributions 2qT (N + 3) and 2gT (N + 3) (2.30) turn out to be very close to the lowest
eigenstates of the pure quark and pure gluon Hamiltonian H0 (4.1), see Eqs. (4.10),
(zi ) as
(4.14), we may invert (4.24) and expand the nonlocal operators S (zi ) and O
S (zi ) =

&nq (zi )

+
2
q 2 2qT n, Q + ,
n 2 n
(zi ) =
O

&ng (zi )
n
nh

2gT n, Q2 + ,
(4.27)
&ng are related to the coefficient functions through the transformation

&nq and
where
(3.34) and the dots denote the contribution of higher quark and gluon eigenstates. Then,
substituting (4.27) into the evolution equation (3.14), one calculates the corresponding
anomalous dimensions (for even N ) as
1
T
dy y N+2 Pqq
(y) =
N+3 |Hqq |N+3

q
N+3 2
1
= EN,0 ,
T
dy y N+2 Pqg
(y) = 2
h
N+3
N+3 |Hqh |N+3
N+3
h
N+3
h
N+3
(4.28)
where the last factor in the second expression serves to correct for the different
normalization of the quark and gluon coefficient functions. The first matrix element has
been calculated in [10,12] and the answer for the large-N expansion of $E^q_{N,0}$ is given
in (4.12), (4.13):
\[
\int_0^1 dy\, y^{n-1} P_{qq}^T(y) = -4C_F\left[\psi(n) + \frac{1}{2n} + \gamma_E\right] + C_F
  + \frac{1}{N_c}\left(2 - \frac{\pi^2}{3}\right) + O\!\left(\frac{\ln^2 n}{n^2}\right). \tag{4.29}
\]
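The combination ψ(n) + γ_E in the large-n expansion of the moments of the quark splitting function is what one expects from a plus-distribution: for integer n, ∫_0^1 dx (x^{n−1} − 1)/(1 − x) = −[ψ(n) + γ_E] = −H_{n−1}, with H_m the harmonic numbers. A minimal numerical sketch of this identity (function names are hypothetical):

```python
def harmonic(m):
    """Harmonic number H_m = 1 + 1/2 + ... + 1/m (H_0 = 0)."""
    return sum(1.0 / k for k in range(1, m + 1))

def plus_moment(n, steps=2000):
    """Midpoint-rule value of int_0^1 dx (x^(n-1) - 1)/(1 - x).

    This is the n-th moment of the plus-distribution [1/(1-x)]_+ .
    The midpoints never hit x = 1, so the integrand is always finite.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += (x ** (n - 1) - 1.0) / (1.0 - x)
    return h * total

for n in (3, 5, 11):
    # psi(n) + gamma_E equals H_{n-1} for integer n
    print(n, plus_moment(n), -harmonic(n - 1))
```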
For the second matrix element we have

N

%
% h
(k + 1)(k + 2) + 2(1)k q
q
h
N+3 %Hqh %N+3
= nf
N,k N1,k1
.
(k
+
1)(k
+
2)
k(k
+
3)
k=1
(4.30)
h
Using the explicit expressions for the coefficients N,k (3.90) and N1,k1
(3.97) one
gets

% h
q%
E 4 (n)n (n 1)(n3 2n2 6n 12)
%
%
n Hqh n = nf
2E(2n)
4(n + 1)(n 2)
2
(n 2n + 4)((n) + E )
.
+
(4.31)
2(n 1)
Combining together Eqs. (4.31), (3.100) and (3.91) and expanding at large n = N + 3 we
obtain

\[
\int_0^1 dy\, y^{n-1} P_{qg}^T(y) = -4n_f\left[\frac{1}{n} - \frac{1}{n^2}
  + \frac{-5 + 4\gamma_E + 4\ln n}{n^3} + O\!\left(\frac{1}{n^4}\right)\right], \tag{4.32}
\]
which agrees well with the exact expression for all n > 3.
In order to get a closed system of equations, we have to consider the evolution of $\Delta g_T(x)$
as well. The coefficients of the expansion of the corresponding coefficient function over
the basis of multiplicatively renormalizable operators, Eq. (4.25), are shown in Fig. 9b for
N = 100. We see that the gluon distribution is dominated by the contribution of the gluon
operator with the lowest anomalous dimension in the gluon sector, and the contributions of all
other gluon operators with higher anomalous dimensions are strongly suppressed. Notice,
however, that the contribution of the quark operator with the lowest anomalous dimension
is small compared to the contributions of a large number, of order N, of other quark operators
with larger anomalous dimensions. This means that although we can formally write

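\[
Q^2\frac{d}{dQ^2}\,\Delta g_T(x;Q^2) = \frac{\alpha_s}{4\pi}\int_x^1\frac{dy}{y}
  \left[P_{gg}^T(x/y)\,\Delta g_T(y;Q^2) + P_{gq}^T(x/y)\,\Delta q_T^+(y;Q^2) + \ldots\right], \tag{4.33}
\]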
with
1
T
dy y N+2 Pgg
(y) =
h
h
|Hhh |N+3

N+3
h
N+3
1
dy y
0
g
N+2
1
T
Pgq
(y) =
= EN,0 ,
h
h
N+3
|Hhq |N+3 N+3
h
N+3
N+3
N+3
(4.34)
where $E^g_{N,0}$ is given in (4.15), taking into account the mixing term $P_{gq}^T(x/y)$ is not
justified to the accuracy that the mixing with other quark degrees of freedom is omitted.
We suggest, therefore, to neglect the mixing of quarks to the gluon distribution altogether.
It is this mixing that sets the limitations for the accuracy of the two-component transverse
spin model. This accuracy is in fact rather high since the sum of squares of the coefficients
of all quark operators in Fig. 9b is less than 2%.
For smaller values of N , the hierarchy of different contributions does not look so
convincing, see Fig. 10, but qualitatively remains the same. Most importantly, contributions
of all gluon operators other than the one with the lowest anomalous dimension remain
negligible.
The DGLAP equations (4.26) and (4.33) can easily be solved going over to moments.
Neglecting the quark mixing in (4.33) we find using (4.22)

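\[
g_2^{LL}(n,Q^2) = \sum_q \frac{e_q^2}{2n}\left\{\Delta q_T^+(n,\mu^2)\,L^{-\gamma_{qq}^T(n)/b}
  + \Delta g_T(n,\mu^2)\,\frac{\gamma_{qg}^T(n)}{\gamma_{gg}^T(n)-\gamma_{qq}^T(n)}
  \left[L^{-\gamma_{gg}^T(n)/b} - L^{-\gamma_{qq}^T(n)/b}\right]\right\}, \tag{4.35}
\]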

Fig. 10. Same as in Fig. 9, but for N = 10.

where $\gamma_{ab}^T(n) = \int_0^1 dx\, x^{n-1} P_{ab}^T(x)$ and $L = \alpha_s(Q^2)/\alpha_s(\mu^2)$. For the two lowest
moments n = 3 and n = 5, expanding moments of the parton distributions at a low
normalization scale, $\Delta q_T^+(n,\mu^2)$ and $\Delta g_T(n,\mu^2)$, over the reduced matrix elements of local
operators $[S]^k_N$ and $[G]^k_N$, Eqs. (2.31)–(2.34), and using the explicit expressions for
the evolution kernels, one obtains

3 2 1 LL
eq g2 3, Q2 = L10.556/b S00 ,
2
#

$
5 2 1 LL
eq g2 5, Q2 = L10.987/b S22 2 S21 + 3 S20 + 0.974 G02
2

+ L17.350/b 0.974 G02 .
(4.36)
These results have to be compared with the exact expressions in (3.9). We see that the
approximation advocated in this work corresponds to taking into account (for n = 5) the
two multiplicatively renormalizable operators with a large gluon contribution (the first and
the third lines in (3.9)) and neglecting the other terms. The coefficients in front of the three
quark local operators in the first line in (3.9) are reproduced reasonably well. Note that
values of the anomalous dimensions come out to be very close to the exact results.
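The moment-space solution of the two-channel system (4.26), (4.33), with the quark admixture in the gluon equation neglected, can be cross-checked against a direct numerical integration. The sketch below assumes one-loop running of the coupling, dα_s/d ln Q² = −b α_s²/(4π), so that in terms of L = α_s(Q²)/α_s(μ²) the moments obey L dq/dL = −(γ_qq q + γ_qg g)/b and L dg/dL = −γ_gg g/b. The numerical values of the moments γ_ab, of b and of the initial conditions are arbitrary illustrative inputs, not results of this paper, and the function names are hypothetical.

```python
import math

def evolve_closed_form(q0, g0, gqq, gqg, ggg, b, L):
    """Closed-form solution in terms of L = alpha_s(Q^2)/alpha_s(mu^2).

    Assumes gqq != ggg (non-degenerate channels).
    """
    lq = L ** (-gqq / b)
    lg = L ** (-ggg / b)
    g = g0 * lg
    q = q0 * lq + g0 * gqg / (ggg - gqq) * (lg - lq)
    return q, g

def evolve_rk4(q0, g0, gqq, gqg, ggg, b, L, steps=2000):
    """Fourth-order Runge-Kutta integration in t = ln L of
    dq/dt = -(gqq*q + gqg*g)/b,  dg/dt = -ggg*g/b."""
    def rhs(q, g):
        return -(gqq * q + gqg * g) / b, -ggg * g / b

    h = math.log(L) / steps
    q, g = q0, g0
    for _ in range(steps):
        k1q, k1g = rhs(q, g)
        k2q, k2g = rhs(q + 0.5 * h * k1q, g + 0.5 * h * k1g)
        k3q, k3g = rhs(q + 0.5 * h * k2q, g + 0.5 * h * k2g)
        k4q, k4g = rhs(q + h * k3q, g + h * k3g)
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
        g += h * (k1g + 2 * k2g + 2 * k3g + k4g) / 6.0
    return q, g

# Illustrative inputs (assumptions): the moments of the splitting functions are
# taken negative, so that both channels decrease as L < 1 decreases.
params = dict(q0=1.0, g0=0.5, gqq=-11.0, gqg=-0.8, ggg=-17.0, b=9.0, L=0.6)
closed = evolve_closed_form(**params)
numeric = evolve_rk4(**params)
print(closed, numeric)
```

The two printed pairs should coincide to high accuracy; the gluon channel evolves autonomously, while the quark channel picks up a second exponent through the mixing term.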
5. Summary and conclusions

Based on our systematic analysis of the evolution pattern of twist-three operators we
arrive at the following approximate two-channel evolution equation for the flavor-singlet
quark and gluon transverse spin distributions:

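\[
Q^2\frac{d}{dQ^2}\,\Delta q_T^{+,S}(x;Q^2) = \frac{\alpha_s}{4\pi}\int_x^1\frac{dy}{y}
  \left[P_{qq}^T(x/y)\,\Delta q_T^{+,S}(y;Q^2) + P_{qg}^T(x/y)\,\Delta g_T(y;Q^2)\right],
\]
\[
Q^2\frac{d}{dQ^2}\,\Delta g_T(x;Q^2) = \frac{\alpha_s}{4\pi}\int_x^1\frac{dy}{y}\,
  P_{gg}^T(x/y)\,\Delta g_T(y;Q^2), \tag{5.1}
\]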
where the flavor-singlet quark distribution is defined similarly to (3.2):
\[
\Delta q_T^{+,S}(x) = \Delta u_T(x) + \Delta\bar u_T(x) + \Delta d_T(x) + \Delta\bar d_T(x) + \ldots. \tag{5.2}
\]
The Hamiltonian approach developed in this paper can be used to determine the asymptotic
expansion of the moments of the splitting functions at large N that translates to the
expansion in powers of 1 x at large momentum fractions x. Since quark and gluon
distributions in the nucleon appear to be decreasing strongly at large x, this contribution
to the splitting functions is the most important one numerically. Using the results for the
moments given in the text, we derive the following expressions for the splitting functions:

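\[
P_{qq}^T(x) = \left[\frac{4C_F}{1-x}\right]_+ + \delta(1-x)\left[C_F + \frac{1}{N_c}\left(2 - \frac{\pi^2}{3}\right)\right] - 2C_F,
\]
\[
P_{gg}^T(x) = \left[\frac{4N_c}{1-x}\right]_+ + \delta(1-x)\left[N_c\left(\frac{\pi^2}{3} - \frac{1}{3}\right) - \frac{2}{3}\,n_f\right]
  + N_c\left(\frac{\pi^2}{3} - 2\right) + N_c\ln\frac{1-x}{x}\left(\frac{2\pi^2}{3} - 6\right),
\]
\[
P_{qg}^T(x) = -4n_f\left[x - 2(1-x)^2\ln(1-x)\right]. \tag{5.3}
\]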
Here, the first two expressions are accurate up to corrections of order $O(1-x)$ for $x \to 1$
and the third expression has the accuracy $O((1-x)^3)$. Note that the quark splitting
function $P_{qq}^T(x)$ is the same in the flavor-singlet and flavor-nonsinglet channels, cf. (1.2).
Remarkably enough, the twist-3 evolution kernels obtained turn out to be very similar to
the well-known expressions for the twist-2 DGLAP kernels. Moreover, the leading $x \to 1$
asymptotics of the diagonal kernels is the same for twist-2 and twist-3 and the difference
occurs at the level of subleading $\delta(1-x)$ corrections. One can argue following [25] that
this property is rather general and it holds to all orders of perturbation theory.
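The large-n expansion (4.32) and the x-space form of the quark-gluon splitting function in (5.3) can be compared directly. The sketch below computes the exact n-th moment of the function x − 2(1 − x)² ln(1 − x) that multiplies n_f in (5.3) and compares it with the bracket 1/n − 1/n² + (−5 + 4γ_E + 4 ln n)/n³ of (4.32); the common overall normalization is left out. The closed form uses the standard integral ∫_0^1 dx x^{n−1}(1 − x)² ln(1 − x) = B(n,3)[ψ(3) − ψ(n+3)]; the function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def harmonic(m):
    """Harmonic number H_m = 1 + 1/2 + ... + 1/m."""
    return sum(1.0 / k for k in range(1, m + 1))

def pqg_shape_moment_exact(n):
    """Exact int_0^1 dx x^(n-1) [x - 2 (1-x)^2 ln(1-x)] for integer n >= 1.

    Uses B(n,3) = 2/(n(n+1)(n+2)) and psi(n+3) - psi(3) = H_{n+2} - 3/2.
    """
    return 1.0 / (n + 1) + 4.0 * (harmonic(n + 2) - 1.5) / (n * (n + 1) * (n + 2))

def pqg_shape_moment_asymptotic(n):
    """Large-n expansion through order 1/n^3."""
    return 1.0 / n - 1.0 / n ** 2 + (-5.0 + 4.0 * EULER_GAMMA + 4.0 * math.log(n)) / n ** 3

for n in (5, 11, 101):
    exact = pqg_shape_moment_exact(n)
    approx = pqg_shape_moment_asymptotic(n)
    print(n, exact, approx, abs(exact - approx) / exact)
```

The relative difference falls quickly with n, as expected for a truncation error of relative order 1/n³ up to logarithms.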
To the leading logarithmic accuracy, the structure function $g_2(x,Q^2)$ is expressed
through the quark distribution
\[
g_2^{LL}(x,Q^2) = g_2^{WW}(x,Q^2) + \frac{1}{2}\sum_q e_q^2\int_x^1\frac{dy}{y}\,\Delta q_T^+(y,Q^2), \tag{5.4}
\]
where $g_2^{WW}(x,Q^2)$ is the Wandzura–Wilczek contribution (2.7) and the gluon contribution
arises entirely through the evolution equations (5.1) (after the separation of the flavor-singlet
part). Note that both the gluon and the quark distributions are defined by analytic
continuation from the odd moments n = 1, 3, ..., and must satisfy the constraints
\[
\int_0^1 dx\,\Delta q_T^+(x,Q^2) = \int_0^1 dx\,\Delta g_T(x,Q^2) = 0 \tag{5.5}
\]
that follow from properties of the quark and gluon coefficient functions in (2.30). The
Burkhardt–Cottingham sum rule [26], $\int_0^1 dx\, g_2(x,Q^2) = 0$, then follows from (5.4), whose derivation
involves an additional implicit assumption about the absence of subtraction constants in
the dispersion relation for the spin-dependent Compton amplitude, see [4].
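The step from (5.4) and (5.5) to the Burkhardt–Cottingham sum rule can be made explicit by interchanging the order of integration. The derivation below is a sketch; it uses the standard Wandzura–Wilczek form $g_2^{WW}(x) = -g_1(x) + \int_x^1 (dy/y)\, g_1(y)$, which is assumed here to coincide with (2.7).

```latex
\begin{align}
\int_0^1 dx \int_x^1 \frac{dy}{y}\, f(y)
  &= \int_0^1 \frac{dy}{y}\, f(y) \int_0^y dx
   = \int_0^1 dy\, f(y), \\
\int_0^1 dx\, g_2^{WW}(x,Q^2)
  &= -\int_0^1 dx\, g_1(x,Q^2) + \int_0^1 dy\, g_1(y,Q^2) = 0, \\
\int_0^1 dx\, g_2^{LL}(x,Q^2)
  &= \frac{1}{2}\sum_q e_q^2 \int_0^1 dy\, \Delta q_T^+(y,Q^2) = 0 .
\end{align}
```

The first line is applied once with $f = g_1$ and once with $f = \Delta q_T^+$; the last equality is the first constraint in (5.5).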
The gluon distribution is subject to the additional constraint
\[
\int_0^1 dx\, x^2\,\Delta g_T(x,Q^2) = 0, \tag{5.6}
\]
which ensures that the gluon contribution vanishes for the third moment.
To the next-to-leading logarithmic accuracy, the twist-3 gluon contribution at large
scales can be calculated as a finite part of the box diagram in the background gluon field
and the result [15] projected onto the gluon transverse spin distribution has the form
g2NL
1

dy T
1 2 s
2
C (x/y) 2gT (y, Q2 ),
eq
x, Q = +
2 q
y g
x

x
T
Cg (x) = (1 x) 1 + ln
1x

19
83 2
3 2 2
(1 x)
ln (1 x)
ln(1 x) +
3
9
54
9

4
+ O (1 x) ,
(5.7)
where the dots stand for the leading-order quark contribution and the $O(\alpha_s)$ quark
corrections. We have argued in [15] and in this paper that this projection has a high
accuracy for all integer moments (i.e., all contributions left out by this projection are very
small).
To summarize, the set of formulas given in this section presents a theoretically motivated
approximation for the QCD description of deep inelastic scattering from a transversely
polarized nucleon. The importance of this approximation lies not so much in the possibility
to calculate the scale dependence, but in the identification of important transverse spin
degrees of freedom that are preserved by the QCD interaction. The formalism developed in
this paper is very general and can be applied to the study of other higher-twist distributions.
It is based on the operator product expansion and conformal symmetry of the QCD
Lagrangian. This technique turns out to be very effective in dealing with the operator
renormalization for large N and in many cases a WKB-type expansion in 1/N can be
constructed analytically. Going over from the moments to the momentum fraction space
involves analytic continuation and in general may be quite complicated, see [12] for a
discussion. Because of this, the small-x behavior of twist-3 (and higher-twist) parton
distributions presents a nontrivial problem and deserves further study.
Acknowledgements
The work by A.M. was supported by the DFG, project Nr. 920585, by the grant of the
Spanish Ministry of Science, and by the grant 00-01-00500 of the Russian Foundation for
Fundamental Research. The work by G.K. was supported by the EU network Training
and Mobility of Researchers, FMRX-CT98-0194. G.K. is grateful to L. Frankfurt and
M. Karliner for useful discussions and warm hospitality at the Tel-Aviv University.
Appendix A. Calculation of the anomalous dimensions

In this appendix we describe the calculation of the energy of the lowest quark and
gluon levels, Eqs. (4.11) and (4.16), respectively, and the energy of the highest quark level,
Eq. (4.7).
The energy of the lowest gluon level is defined by Eq. (4.15) and involves the expectation
value of the Hamiltonian Hhh over the gluon state nh (xi ) and its norm. To calculate the
both, we shall use the expansion of the wave function over the conformal basis. An arbitrary
three-gluon state can be characterized by total conformal spin J = n + 1/2 = N + 7/2
and the conformal spin in a certain two-particle channel. Depending on the particular
choice of the two-particle channel one can define three different sets of basis functions
(12)3
(23)1
(31)2
(xi ), Yn+1/2,k+3
(xi ) and Yn+1/2,k+3
(xi ). The functions belonging to each basis
Yn+1/2,k+3
are linearly independent whereas the ones belonging to two different sets are related to each
other through a linear transformation
(h) (12)3

(31)2
(h) (23)1
km Yn+1/2,m+3 (xi ) =
(1)m+k km Yn+1/2,m+3 (xi ),
Yn+1/2,k+3 (xi ) =
m
(A.1)
(h)
km
(31)2
(12)3
= Yn+1/2,k+3 |Yn+1/2,m+3
with
being the Racah 6j -symbols.
The gluon coefficient function in the helicity representation can be expanded in any one
of the three conformal bases so that one gets three sets of the expansion coefficients:
nh (xi ) =
n4
(31)2
(31)2
n,k Yn+1/2,k+3 (xi ) =
k=0
n4
n4
(12)3
(12)3
n,k Yn+1/2,k+3 (xi )
k=0
(23)1 (23)1
n,k
Yn+1/2,k+3 (xi ).
(A.2)
k=0
These coefficients can be calculated similarly to (3.96) and are given by

(n + 2 + k)(n + 4 + k 2 /2 + 5 k/2)
(12)3
(31)2
n,k = n,k = hn,k
,
(k + 1)(n 2 k)
(23)1
n,k
=0
(A.3)
for even k, and

(k + 4)(n + 3 + k)
1 (23)1
(12)3
(31)2
n,k
= n,k
= n,k
= hn,k
2
2(n 3 k)
(A.4)
for odd k. Here, the normalization factor hn,k is defined as

h2n,k =
(k + 1)(n + 3 + k)(n + 2 + k)(2k + 5)n2 E 4 (n)

.
8(k + 2)(k + 3)(k + 4)(n + 2 + k)(n + 3 + k) E(2n)
(A.5)
The relations between the coefficients respect the symmetry properties of nh (xi ). The
(23)1
coefficients n,k vanish for even k to ensure the antisymmetry of the state under the
interchange of gluons with the same helicity, x2 x3 . The sum of the coefficients in the
three different channels vanishes for arbitrary k since the three-gluon state is annihilated
by the operator 1 + P + P 2 = (P 3 1)/(P 1) with P being the operator of cyclic
permutations.
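The operator identity used here, (P − 1)(1 + P + P²) = P³ − 1 = 0 for a cyclic permutation P of three objects, can be illustrated with 3 × 3 permutation matrices. The sketch is purely illustrative: it acts on a three-component vector rather than on the three-gluon coefficient functions, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Element-wise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def apply_matrix(m, v):
    """Matrix-vector product."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
P = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]     # cyclic shift (v1, v2, v3) -> (v2, v3, v1)
P2 = matmul(P, P)
P3 = matmul(P2, P)
S = matadd(matadd(identity, P), P2)       # the operator 1 + P + P^2

v = [2, -3, 1]                            # components sum to zero
print(P3 == identity, S, apply_matrix(S, v))
```

Since 1 + P + P² is the matrix of all ones, it annihilates exactly those vectors whose three cyclically related components sum to zero, which is the analogue of the vanishing sum of the coefficients in the three channels.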
(31)2
The spherical harmonics Yn,k (xi ) form an orthonormal basis on the space of the
(31)2
(31)2
coefficient functions endowed with the scalar product (3.39), Yn,k |Yn,m = k,m . The
(12)3
(23)1
same is true for the states Yn,k
(xi ) and Yn,k
(xi ). Using this property we may calculate
the norm of the gluon state in one of the three equivalent forms
n4
n4
n4
, h ,2
(12)32
(23)12
(31)22
, , =
n,k
n,k
n,k
=
=
.
n
k=0
k=0
(A.6)
k=0
Substituting the expressions for the expansion coefficients (A.3) and (A.4), one gets a finite
sum over two-particle conformal spin k whose evaluation leads to Eq. (3.100). Expanding
the result at large n one obtains

, h ,2 E 4 (n) n4

1
2
, , =
1 + (2 ln n + 2E 1) + O 1/n ,
(A.7)
n
E(2n) 16
n
where we factored out the ratio of the E-functions for later convenience.
In order to calculate the expectation value of the evolution kernel defined in (3.74), we
notice that Hhh has a two-particle structure and can be split in three contributions each
of which only depends on the operator of the conformal spin Jab in a given two-particle
channel
%
%
h%
%
n %Hhh %nh = Nc nh %2Ugg (J12 ) Vgg (J12 )%nh
%
%
+ Nc nh %2Ugg (J31 ) Vgg (J31 )%nh
, ,2
%
%
+ 2Nc nh %Ugg (J23 )%nh b ,nh , .
(A.8)
(12)3
Since the operator J12 is diagonal in the conformal basis Yn,k (xi ) (by construction), it
is natural to evaluate the first term in (A.8) by expanding the gluon state nh over this
particular basis. By the same token, the second and the third terms in (A.8) are most easily
(31)2
(23)1
calculated using the expansion over Yn,k
(xi ) and Yn,k
(xi ), respectively. In this way one
arrives at
n4

%
(31)22 (23)12
h%
(12)3 2
Ugg (k + 3) n,k
+ n,k
+ n,k
n %Hhh %nh = 2Nc
k=0
Nc
n4

k=0

, ,2
(31)22
(12)3 2
Vgg (k + 3) n,k
+ n,k
b ,nh , ,
(A.9)
where we have used that the conformal spin in the two-gluon channel is equal to 2jg + k =
k + 3. Using explicit expressions for the kernels Ugg and Vgg defined in (3.77) and the
expansion coefficients (A.3) and (A.4), one can rewrite (A.9) as a finite sum over the
two-particle spin k. The sum involving $V_{gg}$ can be calculated analytically while the sum
involving $U_{gg}$ can be expanded in powers of 1/n at large n. After some algebra
one arrives at

n4

(31)22 E 4 (n) n4 2 3 1
ln n
(12)3 2
=
+ +O
Vgg (k + 3) n,k
+ n,k
E(2n) 4 6
2 n
n2
k=0
(A.10)
and
n4

(31)22 (23)12
(12)3 2
Ugg (k + 3) n,k
+ n,k
+ n,k
k=0
0

E 4 (n) n4
2 1 1
1
2
=
2(ln n + E ) (ln n + E )
ln n + E +
+
E(2n) 8
12 2 n
2
2 1
ln n
(A.11)
.
+O
n2
Substituting these expressions into (A.9) and combining them with (A.7) one arrives at the
expression for the energy of the lowest gluon state given in the text, Eq. (4.15).
Calculation of the energy of the lowest quark level, Eq. (4.11), goes along the same
q
lines. The quark coefficient function n is defined in Eq. (2.22) on the hyperplane
x1 + x2 + x3 = 0 corresponding to the kinematics of the forward scattering, and can be
continued to arbitrary values of the momentum fractions xi using the conformal symmetry.
Its expansion over the conformal spherical harmonics in the three different two-particle
channels looks like
q
n (xi ) =
n3
(12)3 (12)3
n,k
Yn+1/2,k+5/2(xi ) =
k=0
n3
n3
(23)1 (23)1
n,k
Yn+1/2,k+5/2 (xi )
k=0
(31)2 (31)2
n,k Yn+1/2,k+2 (xi ),
(A.12)
k=0
(12)3
where, e.g., Yn+1/2,k+5/2
(xi ) is defined by the general expression (3.61) with j1 = j3 = 1,
j2 = 3/2, the total conformal spin equal to J = n = N + 3 and the two-particle conformal
spin in the (12) channel given by j12 = j1 + j2 + k = 3 + k.
The explicit expressions for the expansion coefficients can be obtained using the
representation similar to (3.89). The result is:
(12)3
= 2(1)k
n,k
n(n + 2 + k)
qn,k ,
k+1
(23)1
= 2(1)n (k + 3)(n + 2 + k)qn,k ,
n,k
(31)2
n,k
= (3 + 2k)(k + 2)qn,k
(1)nk 1 n + 2 + k (1)nk + 1 n + 1 + k
+
,
2
n2k
2
n1k
with the normalization factors
(A.13)
E 4 (n)
(k + 1)(n 2 k)
,
4(n + 2 + k)(k + 2)(k + 3) E(2n)
(n 1 k)(k + 3) 2
2
qn,k
(A.14)
q .
=2
(3 + 2k)(n + 1 + k) n,k
Using the expansion in (A.12) one finds three equivalent representations for the norm of
the quark state
2
=
qn,k
n3
n3
n3
, q ,2
(12)32
(23)12
(31)22
,n , =
=
=
.
n,k
n,k
n,k
k=0
k=0
(A.15)
k=0
Substituting explicit expressions for the expansion coefficients and performing the
summation one arrives at Eq. (3.91). The expansion of the norm at large n reads

, q ,2 E 4 (n) n4

1
,n , =
1 + 2 (1 4 ln n 4E ) + O 1/n2 .
(A.16)
E(2n) 4
n
The expectation value of the diagonal quark evolution kernel Hqq defined in (3.69) can be
written in the form analogous to (A.8)
% q
% q q%
q%
2 (1)
(0)
(J12 )
Vqg (J12 )%n
n %Hqq %n = n %Nc Vqg
Nc
% q
q%
2 (1)
(0)
+ n %Nc Uqg
(J23 )
Uqg (J23 )%n
Nc
% q
q % 2nf
2
J31 ,2
U (1) (J31 )%n .
+ n %
(A.17)
3
Nc qq
Expanding the quark state (A.12) over the suitable conformal basis one obtains

n3
% q
q%
(12)32
2 (1)
(0)
%
%
n Hqq n =
n,k
V (k + 5/2)
Nc Vqg (k + 5/2)
Nc qg
k=0
n3

(23)1 2
n,k
k=0
n3
k=0
(31)2 2
n,k

2 (1)
(0)
U (k + 5/2)
Nc Uqg (k + 5/2)
Nc qg

2nf
2 (1)
k,0
U (k + 2) ,
3
Nc qq
(A.18)
where we have replaced the operators of the two-particle conformal spins by the
corresponding eigenvalues Jab = ja + jb + (n 3). The necessary expansion coefficients
and explicit expressions for the evolution kernels entering this expression are given in
(A.13) and (3.73), respectively.
It turns out that the part of the sum in (A.18) proportional to Nc can be calculated exactly
n3

(23)12 (0)
(12)32 (0)
Vqg (k + 5/2) + n,k
Uqg (k + 5/2)
n,k
k=0

, q ,2
1 1
= ,n , (n) + + 2E .
(A.19)
n 2
The 1/Nc correction to the sum (A.18) is given by a ratio of rather complicated sums and
can easily be expanded at large n leading to

n3

, q ,2 7 2

(12)32 (1)

2
,
,
+ O ln n/n
n,k
,
Vqg (k + 5/2) = n
4
6
k=0
n3
, q ,2
(23)12 (1)

n,k
Uqg (k + 5/2) = ,n , 0 + O 1/n2 ,
k=0
n3
, q ,2
(31)22 (1)

n,k
Uqq (k + 2) = ,n , ln n + E 1 + O ln2 n/n2 .
(A.20)
k=0
Finally, the $n_f$-dependent contribution to (A.18) is given for odd n by

(31)22
(n 1)(n + 2) E 4 (n) ,
q ,2
= ,n , 0 + O 1/n4 .
=3
n,0
(A.21)
(n + 1)(n 2) E(2n)
Combining (A.19), (A.20), (A.21) and (A.17) we obtain the expression for the energy of
the lowest quark level given in Eq. (4.11).
The eigenstate of the highest quark level can be approximated by the expression in (4.6).
Calculating the corresponding energy (4.7) and using (3.69), one gets
%
% q

q
2
q
EN,N = nf + YN,k=0 %HS + %YN,k=0 .
(A.22)
3
Using the explicit expression (3.70) for the Hamiltonian HS + and taking into account that
q
q
YN,k=0 is symmetric with respect to the interchange of the quarks, (P13 1)YN,k=0 = 0,
one obtains
% q
q %
2
2 (1)
q
U (J13 = 2) + YN,0 %V (J12 )%YN,0 ,
EN,N = nf
(A.23)
3
Nc qq
where the notation was introduced for the linear combination of the kernels (3.73)

\[
V(J) = N_c\left[V_{qg}^{(0)}(J) + U_{qg}^{(0)}(J)\right]
  - \frac{2}{N_c}\left[V_{qg}^{(1)}(J) + U_{qg}^{(1)}(J)\right]. \tag{A.24}
\]
In order to calculate the matrix element entering (A.23) we use the Racah decomposition
q
(B.1) to expand YN,k=0 over the conformal basis in the quarkgluon channel

q
(12)3
2,m+5/2 (N + 7/2) YN+7/2,m+5/2
(xi ).
YN,0 (xi ) =
(A.25)
m
(1)
Taking into account that Uqq
(2) = 0 one finds

2
q
V (m + 5/2)[2,m+5/2(N + 7/2)]2 .
EN,N = nf +
3
N
(A.26)
m=0
Finally, using the explicit expressions for the Racah symbols (B.3) and the two-particle
evolution kernels (3.73) and performing the summation in (A.26) one arrives at the
result given in Eq. (4.7).
Appendix B. Racah symbols for the SL(2, R) group

In this appendix we derive the explicit expression for the Racah 6j-symbols $\Omega_{jj'}$ defined
in (3.63),
\[
Y^{(31)2}_{Jj}(x_i) = \sum_{j_1+j_2\,\le\, j'\,\le\, J-j_3} \Omega_{jj'}(J)\, Y^{(12)3}_{Jj'}(x_i), \tag{B.1}
\]
where the basis functions $Y^{(31)2}_{Jj}$ are given in Eq. (3.61). In order to find the coefficients
$\Omega_{jj'}(J)$ it is sufficient to set the three variables $x_i$ in Eq. (B.1) to the following values:
x1 = x, x2 = x and x3 = 1. Using the explicit expressions for the basis functions (3.61)
one gets from (B.1):
E(j + j3 j1 ) E(J j + j2 )
E(2j3 ) E(2j2 )(j j1 j3 )!(J j j2 )!
2 F1 (j1 + j3 j, 1 + j3 j1 j, 2j3; x)
(1)J j j2 rJj
2 F1 (J + j + j2 , J + j + j2 1, 2j2 ; x)
J
j3
jj (J )
j =j1 +j2
rJj (x)j j1 j2
E(J + j j3 )
.

(j j1 j2 )!(J j j3 )! (2j 1) E(j + j1 + j2 1)
(B.2)
The product of the two hypergeometric functions in the l.h.s. of this identity is nothing else
but the generating function for the Racah polynomials (see [27] for details). Expanding the
hypergeometric functions and comparing the terms in the l.h.s. and the r.h.s. of (B.2) with
the same power of x one derives after some algebra the following explicit expression:
jj (J )
E(J j1 j2 j3 + 1)
E(2j1 )E(J + j1 + j2 + j3 1)
f (J, j , j1 , j2 , j3 ) f (J, j, j1 , j3 , j2 )
%

j + j1 + j2 , j + j1 + j2 1, j + j1 + j3 , j + j1 + j3 1 %%
4 F3
%1 ,
2j1 , J + j1 + j2 + j3 , J + j1 + j2 + j3 1
(B.3)
= (1)J j j2
where
f (J, j , j1 , j2 , j3 )

E(j + j1 j2 )E(j + j1 + j2 1)
= (2j 1)
E(j + j2 j1 )E(j j1 j2 + 1)
E(J j + j3 )E(J + j + j3 1) 1/2
.
E(J j j3 + 1)E(J + j j3 )
(B.4)
Appendix C. Evolution kernels in the conformal basis

To solve the Schrödinger equation (3.58) we expand the eigenstates over the conformal
basis in the quark–antiquark–gluon and three-gluon sectors, Eq. (3.66). In this representation
the evolution kernels are given by real and symmetric matrices acting on the vector of
the expansion coefficients $(u^q_{N,k}, u^h_{N,k})$ defined in (3.66). In this appendix we work out the
explicit form of these matrices and discuss their properties.
q
The diagonal quark evolution kernel (3.69) is described in the conformal basis YN,k by
a square matrix of dimension ;q = N + 1
\[
[H_{qq}]_{km} = \langle Y^q_{N,k}|H_{qq}|Y^q_{N,m}\rangle
  = \langle Y^q_{N,k}|H_{S^+}|Y^q_{N,m}\rangle + \frac{2n_f}{3}\,\delta_{k,m}\,\delta_{k,0} \tag{C.1}
\]
with $0 \le k, m \le N$. In turn, the Hamiltonian $H_{S^+}$ is given by the sum of pair-wise kernels
(3.73) depending on the conformal spins in the different two-particle channels. We recall
that the operator $J_{31}$ is diagonal by definition on the space of the states $Y^q_{N,k}$ so that
\[
\langle Y^q_{N,k}|V(J_{31})|Y^q_{N,m}\rangle = V(k+2)\,\delta_{km} \tag{C.2}
\]
for an arbitrary $V$. On the other hand, contributions to the Hamiltonian that are functions
of $J_{12}$ can be expanded over the conformal basis in the two-particle (12)-channel
N
% q
% q
(12)3
q % (12)3
q %
V (l + 5/2) YN,k %YN+7/2,l+5/2 YN+7/2,l+5/2 %YN,m .
YN,k %V (J12 )%YN,m =
l=0
(C.3)
A similar representation exists for $V(J_{23})$. The scalar product of the Y-functions
belonging to different conformal bases is given by the Racah 6j-symbols defined in (B.1)
and (B.3). In particular
\[
\langle Y^{(12)3}_{N+7/2,\,l+5/2}|Y^q_{N,m}\rangle = \Omega_{m+2,\,l+5/2}(N+7/2) \equiv \Omega^{(q)}_{ml}, \tag{C.4}
\]
\[
\langle Y^{(23)1}_{N+7/2,\,l+5/2}|Y^q_{N,m}\rangle = (-1)^{l+m}\,\Omega_{m+2,\,l+5/2}(N+7/2) \equiv (-1)^{l+m}\,\Omega^{(q)}_{ml}. \tag{C.5}
\]
Taking into account these relations one finds

\[
\langle Y^q_{N,k}|V(J_{12})|Y^q_{N,m}\rangle = \sum_{l=0}^{N} V(l+5/2)\,\Omega^{(q)}_{ml}\,\Omega^{(q)}_{kl}, \tag{C.6}
\]
N
% q
q %
(q) (q)
YN,k %V (J23 )%YN,m =
(1)l+m V (l + 5/2)ml kl .
(C.7)
l=0
Using these identities and the explicit expression for HS + given in Section 3.5.1 one finds
the following matrix representation of the diagonal quark kernel

2nf
2 (1)
k,0 k,m
[Hqq ]km = Uqq (k + 2) +
Nc
3

N

2 (1)
(q) (q)
(0)
+
Oml Okl Nc Vqg
(l + 5/2)
V (l + 5/2)
Nc qg
l=0

2 (1)
k+m
(0)
+ (1)
U (l + 5/2) .
Nc Ugq (l + 5/2)
Nc gq
(C.8)
The explicit expressions for the Racah symbols and the evolution kernels are given in
Eqs. (B.3) and (3.73), respectively.
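Matrix elements of the type (C.6) have the structure Ω diag(V) Ωᵀ, which is symmetric by construction and, for an orthogonal matrix Ω, has the kernel values V(l) as its eigenvalues. The toy sketch below illustrates this with a 2 × 2 rotation; the rotation angle and the two "kernel" values are hypothetical stand-ins for the Racah symbols and the two-particle kernels, not numbers from this paper.

```python
import math

def kernel_matrix(omega, v):
    """M[k][m] = sum_l v[l] * omega[m][l] * omega[k][l]."""
    n = len(v)
    return [[sum(v[l] * omega[m][l] * omega[k][l] for l in range(n))
             for m in range(n)] for k in range(n)]

theta = 0.3                                   # arbitrary rotation angle
omega = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta),  math.cos(theta)]]  # orthogonal change of basis
v = [1.5, 4.0]                                 # kernel values in the diagonal basis

M = kernel_matrix(omega, v)
trace = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(M, trace, det)
```

The trace and determinant reproduce the sum and product of the input values, confirming that the change of basis preserves the spectrum while making the matrix non-diagonal.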
The gluon diagonal kernel in the helicity representation (3.74) is given in the conformal
g
basis YN1,k by a square matrix of dimension ;g = N/2
%
%
%
% h

h

h
%Hhh %Y h
%
%
[Hhh ]km = YN1,k
(C.9)
N1,m = Nc YN1,k H3/2 V3/2 YN1,m
with $0 \le k, m \le [N/2] - 1$. To obtain the explicit expression one has to follow steps
similar to those in the quark case. Introducing the notation for the relevant Racah symbols
% g

(12)3
(h)
YN+7/2,l+3 %YN1,m = m+3,l+3 (N + 7/2) ml
(C.10)
and using the explicit expressions for the gluon kernels (3.75) one finds

[Hhh ]km = k,m Nc 2Ugg (k + 3) Vgg (k + 3) b
+ Nc
[N/2]

(h) (h)
ml kl 2Ugg (l + 3) 1 + (1)k+m Vgg (l + 3) . (C.11)
l=0
The explicit expressions for the Racah symbols and the functions Ugg , Vgg are given in
Eqs. (B.3) and (3.77), respectively.
Finally, the explicit expressions for the off-diagonal kernels Hqh and Hhq are given
in (3.82).
References
[1] K. Abe et al., E143 Collaboration, Phys. Rev. Lett. 76 (1996) 587.
[2] K. Abe et al., E154 Collaboration, Phys. Lett. B 404 (1997) 377.
[3] P.L. Anthony et al., E155 Collaboration, Phys. Lett. B 458 (1999) 529;
G.S. Mitchell, E155 Collaboration, hep-ex/9903055.
[4] B.L. Ioffe, V.A. Khoze, L.N. Lipatov, Hard Processes 1: Phenomenology, Quark Parton Model,
North-Holland, Amsterdam, 1984.
[5] M. Anselmino, A. Efremov, E. Leader, Phys. Rep. 261 (1995) 1.
[6] J. Kodaira, K. Tanaka, Prog. Theor. Phys. 101 (1999) 191.
[7] E.V. Shuryak, A.I. Vainshtein, Nucl. Phys. B 201 (1982) 141.
[8] A.P. Bukhvostov, E.A. Kuraev, L.N. Lipatov, Sov. Phys. JETP 60 (1984) 22.
[9] I.I. Balitsky, V.M. Braun, Nucl. Phys. B 311 (1989) 541.
[10] A. Ali, V.M. Braun, G. Hiller, Phys. Lett. B 266 (1991) 117.
[11] J. Kodaira, Y. Yasui, T. Uematsu, Phys. Lett. B 344 (1995) 348;
J. Kodaira et al., Phys. Lett. B 387 (1996) 855;
J. Kodaira et al., Prog. Theor. Phys. 99 (1998) 315.
[12] V.M. Braun, G.P. Korchemsky, A.N. Manashov, Phys. Lett. B 476 (2000) 455.
[13] X. Ji, W. Lu, J. Osborne, X. Song, Phys. Rev. D 62 (2000) 094016.
[14] A. Belitsky, X. Ji, W. Lu, J. Osborne, hep-ph/0007305.
[15] V.M. Braun, G.P. Korchemsky, A.N. Manashov, hep-ph/0010128.
[16] S. Wandzura, F. Wilczek, Phys. Lett. B 72 (1977) 195.
[17] B. Geyer, D. Muller, D. Robaschik, Nucl. Phys. Proc. Suppl. C 51 (1996) 106;
D. Muller, Phys. Lett. B 407 (1997) 314.
[18] R.L. Jaffe, Nucl. Phys. B 229 (1983) 205.
[19] L.B. Okun, Leptons and Quarks, North-Holland, Amsterdam, 1982.

[20] A.P. Bukhvostov, G.V. Frolov, L.N. Lipatov, E.A. Kuraev, Nucl. Phys. B 258 (1985) 601.
[21] V.M. Braun, S.E. Derkachov, G.P. Korchemsky, A.N. Manashov, Nucl. Phys. B 553 (1999)
355.
[22] V.M. Braun, S.E. Derkachov, A.N. Manashov, Phys. Rev. Lett. 81 (1998) 2020.
[23] A.V. Belitsky, Nucl. Phys. B 558 (1999) 259;
A.V. Belitsky, Nucl. Phys. B 574 (2000) 407.
[24] S.E. Derkachov, G.P. Korchemsky, A.N. Manashov, Nucl. Phys. B 566 (2000) 203.
[25] G.P. Korchemsky, Mod. Phys. Lett. A 4 (1989) 1257.
[26] H. Burkhardt, W.N. Cottingham, Ann. Phys. 56 (1970) 453.
[27] R. Koekoek, R. Swarttouw, The Askey scheme of hypergeometric orthogonal polynomials and
its q-analogue, Report 98-17, Delft University of Technology, Faculty TWI, 1998.

$\pi\pi$ scattering
G. Colangelo a , J. Gasser b , H. Leutwyler b
a Institute for Theoretical Physics, University of Zürich, Winterthurerstr. 190, CH-8057 Zürich, Switzerland
b Institute for Theoretical Physics, University of Bern, Sidlerstr. 5, CH-3012 Bern, Switzerland
Abstract
We demonstrate that, together with the available experimental information, chiral symmetry
determines the low energy behaviour of the $\pi\pi$ scattering amplitude to within very small
uncertainties. In particular, the threshold parameters of the S-, P-, D- and F-waves are predicted, as
well as the mass and width of the $\rho$ and of the broad bump in the S-wave. The implications for the
coupling constants that occur in the effective Lagrangian beyond leading order and also show up in
other processes, are discussed. Also, we analyze the dependence of various observables on the mass
of the two lightest quarks in some detail, in view of the extrapolations required to reach the small
physical masses on the lattice. The analysis relies on the standard hypothesis, according to which the
quark condensate is the leading order parameter of the spontaneously broken symmetry. Our results
provide the basis for an experimental test of this hypothesis, in particular in the framework of the
ongoing DIRAC experiment: the prediction for the lifetime of the ground state of a $\pi^+\pi^-$ atom
reads $\tau = (2.9 \pm 0.1)\times 10^{-15}$ s. © 2001 Elsevier Science B.V. All rights reserved.
PACS: 11.30.Rd; 12.38.Aw; 12.39.Fe; 13.75.Lb
Keywords: Roy equations; Meson–meson interactions; Pion–pion scattering; Chiral symmetries
1. Introduction
The study of ππ scattering is a classical subject in the field of strong interactions.
The properties of the pions are intimately related to an approximate symmetry of
QCD. In the chiral limit, where $m_u$ and $m_d$ vanish, this symmetry becomes exact, the
Lagrangian being invariant under the group $SU(2)_R \times SU(2)_L$ of chiral rotations. The
symmetry is spontaneously broken to the isospin subgroup $SU(2)_V$. The pions represent
the corresponding Goldstone bosons.
E-mail address: gasser@itp.unibe.ch (J. Gasser).
PII: S0550-3213(01)00147-X
G. Colangelo et al. / Nuclear Physics B 603 (2001) 125-179
In reality, the quarks are not massless. The theory only possesses an approximate chiral
symmetry, because $m_u$ and $m_d$ happen to be very small. The consequences of the fact
that the symmetry breaking is small may be worked out by means of an effective field
theory [1]. The various quantities of interest are expanded in powers of the momenta and
quark masses.¹ In the case of the pion mass, for instance, the expansion starts with [3]
$$M_\pi^2 = (m_u + m_d)\,B + O(m^2), \qquad B = \frac{1}{F^2}\,|\langle 0|\bar{u}u|0\rangle|, \tag{1.1}$$
where $F$ is the value of the pion decay constant in the chiral limit, $m_u, m_d \to 0$. The
formula shows that the square of the pion mass is proportional to the product of $m_u + m_d$
with the order parameter $\langle 0|\bar{u}u|0\rangle$. The two factors represent quantitative measures
for explicit and spontaneous symmetry breaking, respectively. If the explicit symmetry
breaking is turned off, the pions do become massless, as they should: the symmetry is then
exact, so that the spectrum contains three massless Goldstone bosons, while all other levels
form massive, degenerate isospin multiplets.
The properties of the Goldstone bosons are strongly constrained by chiral symmetry:
in the chiral limit, the ππ scattering amplitude vanishes when the momenta of the pions
tend to zero. To first order in the symmetry breaking, the S-wave scattering lengths are
proportional to the square of the pion mass [4]:
$$a_0^0 = \frac{7M_\pi^2}{32\pi F_\pi^2}, \qquad a_0^2 = -\frac{M_\pi^2}{16\pi F_\pi^2}, \tag{1.2}$$
where $a_\ell^I$ stands for the scattering length in the isospin $I$ channel with angular momentum $\ell$. The two low energy theorems (1.2) are valid only at leading order in a series
expansion in powers of the quark masses. The next-to-leading order corrections were
calculated in [5], and even the next-to-next-to-leading order corrections are now known [6].
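As a quick numerical illustration of the leading order theorems (1.2), the following sketch evaluates both scattering lengths. The charged pion mass of 139.57 MeV is an assumed input that does not appear in the text above; $F_\pi = 92.4$ MeV is the value quoted in footnote 2. The function name is hypothetical.

```python
import math

M_PI = 139.57  # charged pion mass in MeV (assumed input)
F_PI = 92.4    # pion decay constant in MeV (value quoted in footnote 2)


def tree_level_scattering_lengths(m_pi=M_PI, f_pi=F_PI):
    """Leading order S-wave scattering lengths of Eq. (1.2), in units of M_pi."""
    ratio = (m_pi / f_pi) ** 2
    a00 = 7.0 * ratio / (32.0 * math.pi)   # isospin 0
    a02 = -ratio / (16.0 * math.pi)        # isospin 2
    return a00, a02


if __name__ == "__main__":
    a00, a02 = tree_level_scattering_lengths()
    print(f"a00 = {a00:.4f}, a02 = {a02:.4f}")
```

With these inputs the tree level values come out near 0.16 and -0.045, and their ratio is fixed at -7/2 independently of the inputs.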
In the following, we exploit the fact that analyticity, unitarity and crossing symmetry
impose further constraints on the scattering amplitude. These were analyzed in detail in [7],
on the basis of the Roy equations [8] and of the experimental data available at intermediate
energies. The upshot of that analysis is that $a_0^0$ and $a_0^2$ are the essential low energy
parameters: once these are known, the available experimental data determine the low
energy behaviour of the ππ scattering amplitude to within remarkably small uncertainties.
As discussed above, chiral symmetry predicts exactly these two parameters. Hence the low
energy behaviour of the scattering amplitude is fully determined by the experimental data
in the intermediate energy region and the theoretical properties just mentioned: analyticity,
unitarity, crossing symmetry and chiral symmetry.
The resulting predictions for the S-wave scattering lengths were already presented in [9].
The purpose of the present paper is to (i) discuss the analysis that underlies these
predictions in more detail, (ii) present the results for the threshold parameters of the P-,
D- and F-waves, (iii) give an explicit representation for the S- and P-wave phase shifts
and (iv) extract the information about the coupling constants of the effective Lagrangian.
¹ In fact, effective Lagrangians were used to analyze the low energy structure of the strong interactions even
before the discovery of QCD, that is at a time when the origin of chiral symmetry and its breaking were totally
obscure. In particular, the application of the so-called hard meson approach had been worked out in detail for
ππ scattering [2].
Several authors have performed a comparison of the chiral perturbation theory
predictions with the data, in particular also in view of a determination of the effective
coupling constants $\ell_1$ and $\ell_2$ [10-23]. Stern and collaborators [24,25] investigate the
problem from a different point of view, referred to as Generalized Chiral Perturbation
Theory. These authors treat the S-wave scattering lengths as free parameters and
investigate the possibility that their values strongly deviate from those predicted by
Weinberg. In the language of the effective chiral Lagrangian, this scenario would arise if
the standard estimates for the effective coupling constant $\ell_3$ were entirely wrong: the quark
condensate would then fail to represent the leading order parameter of the spontaneously
broken chiral symmetry. Indeed, these estimates rely on a theoretical picture that has not
been tested experimentally.
On the experimental side, the situation is the following. As shown in early numerical
analyses of the Roy equations [26], only data sufficiently close to threshold can provide
significant bounds on the scattering lengths. The often quoted values $a_0^0 = 0.26 \pm 0.05$,
$a_0^2 = -0.028 \pm 0.012$ [27,28] mainly rely on the $3 \times 10^4$ $K_{e4}$ decays collected by
the Geneva-Saclay collaboration, which provided its final results in 1977 [29]. There are
new data from Brookhaven [30,31], where more than $4 \times 10^5$ $K_{e4}$ decays are being analyzed, and the low energy behaviour of the relevant form factors is now also known much
better [23,33]. As will be discussed in Section 13, the preliminary results of this experiment indeed reduce the uncertainties significantly. A similar experiment is proposed by the
NA48 collaboration at CERN [34]. Unfortunately, the data taking at the DAΦNE facility
is delayed, due to technical problems with the accelerator. A beautiful experiment is under
way at CERN [35], which is based on the fact that π+π− atoms decay into a pair of neutral
pions, through the strong transition π+π− → π0π0. Since the momentum transfer nearly
vanishes, only the scattering lengths are relevant: at leading order in isospin breaking, the
transition amplitude is proportional to $a_0^0 - a_0^2$ [36]. The corrections at next-to-leading
order are now also known [37], as a result of which a measurement of the lifetime of a
π+π− atom amounts to a measurement of this combination of scattering lengths. Finally,
we mention the new data on pion production off nucleons, obtained by the CHAOS collaboration at Triumf [38]. The scattering lengths may be extracted from these data by means
of a Chew-Low extrapolation procedure. Chiral symmetry, however, suppresses the one-pion exchange contribution with a factor of $t$, so that a careful data selection is required to
arrive at a coherent Chew-Low fit. It remains to be seen whether these data permit a
significant reduction of the uncertainties in the experimental determination of $a_0^0$ and $a_0^2$.
The experiments mentioned above are of particular interest, because they offer a test
of the hypothesis that the quark condensate represents the leading order parameter of the
spontaneously broken symmetry: if the predictions obtained in the present paper should
turn out to be in contradiction with the outcome of these experiments, the commonly
accepted theoretical picture would require thorough revision.
2. Chiral representation
Throughout the present paper we work in the isospin limit: we disregard the e.m. interaction and set $m_u = m_d = m$. The various elastic reactions among two pions may then
be represented by a single scattering amplitude $A(s, t, u)$. Only two of the Mandelstam
variables are independent, $s + t + u = 4M_\pi^2$ and, as a consequence of Bose statistics, the
amplitude is invariant under an interchange of $t$ and $u$.
As discussed in detail in Ref. [10], chiral perturbation theory allows one to study
the properties of the scattering amplitude that follow from the occurrence of a
spontaneously broken approximate symmetry. The method is based on a systematic
expansion in powers of the momenta and of the light quark masses. We refer to this as
the chiral expansion and use the standard bookkeeping, which counts the quark masses
like two powers of momentum, $m = O(p^2)$.
The two loop representation of the scattering amplitude given in [6] yields the first three
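terms in the chiral expansion of the partial waves:
$$t_\ell^I(s) = t_\ell^I(s)_2 + t_\ell^I(s)_4 + t_\ell^I(s)_6 + O(p^8). \tag{2.1}$$
At leading order, only the S- and P-waves are different from zero:
$$t_0^0(s)_2 = \frac{2s - M_\pi^2}{32\pi F_\pi^2}, \qquad t_1^1(s)_2 = \frac{s - 4M_\pi^2}{96\pi F_\pi^2}, \qquad t_0^2(s)_2 = -\frac{s - 2M_\pi^2}{32\pi F_\pi^2}. \tag{2.2}$$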
In the low energy expansion, inelastic reactions start showing up only at $O(p^8)$. The
unitarity condition therefore reads:
$$\mathrm{Im}\, t_\ell^I(s) = \sigma(s)\,|t_\ell^I(s)|^2 + O(p^8), \qquad \sigma(s) = \sqrt{1 - \frac{4M_\pi^2}{s}}. \tag{2.3}$$
The condition immediately implies that the imaginary parts of the two loop amplitude may
be worked out from the one-loop representation:
$$\mathrm{Im}\, t_\ell^I(s) = \sigma(s)\, t_\ell^I(s)_2 \left\{ t_\ell^I(s)_2 + 2\,\mathrm{Re}\, t_\ell^I(s)_4 \right\} + O(p^8). \tag{2.4}$$
The formula shows that, at low energies, the imaginary parts of the partial waves with $\ell \geq 2$
are of order $p^8$ and hence beyond the accuracy of the two loop calculation.
Stated differently, the imaginary part of the two loop representation is due exclusively to
the S- and P-waves. This implies that, up to and including $O(p^6)$, the chiral representation
of the scattering amplitude only involves three functions of a single variable:
$$A(s,t,u) = C(s,t,u) + 32\pi\left\{ \tfrac{1}{3} U^0(s) + \tfrac{3}{2}(s-u)\,U^1(t) + \tfrac{3}{2}(s-t)\,U^1(u) + \tfrac{1}{2} U^2(t) + \tfrac{1}{2} U^2(u) - \tfrac{1}{3} U^2(s) \right\} + O(p^8). \tag{2.5}$$
The first term is a crossing symmetric polynomial in $s$, $t$, $u$,
$$C(s,t,u) = c_1 + s\,c_2 + s^2 c_3 + (t-u)^2 c_4 + s^3 c_5 + s\,(t-u)^2 c_6. \tag{2.6}$$
The functions $U^0(s)$, $U^1(s)$ and $U^2(s)$ describe the unitarity corrections associated with
s-channel isospin $I = 0, 1, 2$, respectively. In view of the fact that the chiral perturbation
theory representation for the imaginary parts of the partial waves grows with the power
$\mathrm{Im}\, t_\ell^I(s)_6 \propto s^3$, we need to apply several subtractions for the dispersive representation of
these functions to converge. It is convenient to subtract at $s = 0$ and to write the dispersion
integrals in the form
$$U^0(s) = \frac{s^4}{\pi}\int_{4M_\pi^2}^{\infty} ds'\, \frac{\sigma(s')\, t_0^0(s')_2 \left\{ t_0^0(s')_2 + 2\,\mathrm{Re}\, t_0^0(s')_4 \right\}}{s'^4\,(s'-s)},$$
$$U^1(s) = \frac{s^3}{\pi}\int_{4M_\pi^2}^{\infty} ds'\, \frac{\sigma(s')\, t_1^1(s')_2 \left\{ t_1^1(s')_2 + 2\,\mathrm{Re}\, t_1^1(s')_4 \right\}}{s'^3\,(s'-4M_\pi^2)(s'-s)},$$
$$U^2(s) = \frac{s^4}{\pi}\int_{4M_\pi^2}^{\infty} ds'\, \frac{\sigma(s')\, t_0^2(s')_2 \left\{ t_0^2(s')_2 + 2\,\mathrm{Re}\, t_0^2(s')_4 \right\}}{s'^4\,(s'-s)}. \tag{2.7}$$
The subtraction constants are collected in the polynomial C(s, t, u). Alternatively, we
could set C(s, t, u) = 0 and book the subtraction terms as polynomial contributions to
$U^0(s)$, $U^1(s)$, $U^2(s)$. The decomposition of $C(s, t, u)$ into a set of three polynomials of
a single variable is not unique, however, so that we would have to adopt a convention for
this splitting; we find it more convenient to work with the above representation of the
amplitude.
The specific structure of the unitarity correction given above was noted already in [24]. It
is straightforward to check that the explicit result of the full two loop calculation described
in [6] is indeed of this structure. The essential result of that calculation is the expression
for the polynomial part of the amplitude, in terms of the effective coupling constants.
The corresponding formulae, which specify how the coefficients $c_1, \ldots, c_6$ depend on the
quark masses, are given in Appendix B. These, in particular, contain Weinberg's low energy
theorem, which in this language states that the expansion of the coefficients $c_1$ and $c_2$ starts
with
$$c_1 = -\frac{M_\pi^2}{F_\pi^2}\left\{1 + O(M_\pi^2)\right\}, \qquad c_2 = \frac{1}{F_\pi^2}\left\{1 + O(M_\pi^2)\right\}. \tag{2.8}$$
The two loop calculation specifies the expansion of these two coefficients up to and
including next-to-next-to-leading order.
3. Phenomenological representation
As shown by Roy [8], the fixed-t dispersion relations for the isospin amplitudes can
be written in such a form that they express the scattering amplitude in terms of the
imaginary parts in the physical region of the s-channel. The resulting representation for
$A(s, t, u)$ contains two subtraction constants, which may be identified with the scattering
lengths $a_0^0$ and $a_0^2$. Unitarity converts this representation into a set of coupled integral
equations, which we recently examined in great detail [7]. In the present context, the
main result of interest is that the representation allows us to determine the imaginary parts
of the scattering amplitude in terms of $a_0^0$ and $a_0^2$. Since the resulting representation is
based on the available experimental information, we refer to it as the phenomenological
representation.
In the following, we treat the imaginary parts of the partial wave amplitudes as if they
were completely known from phenomenology; we will discuss the uncertainties in these
quantities as well as their dependence on $a_0^0$ and $a_0^2$ in detail, once we have identified the
manner in which they enter our predictions for the scattering lengths.
The chiral representation shows that the singularities generated by the imaginary parts
of the partial waves with $\ell \geq 2$ start manifesting themselves only at $O(p^8)$. Accordingly,
we may expand the corresponding contributions to the dispersion integrals into a Taylor
series of the momenta. The singularities due to the imaginary parts of the S- and P-waves, on the other hand, start manifesting themselves already at $O(p^4)$; these cannot be
replaced by a polynomial. The corresponding contributions to the amplitude are of the same
structure as the unitarity corrections and also involve three functions of a single variable.
It is convenient to subtract the relevant dispersion integrals in the same manner as for the
chiral representation:
$$\overline{W}^0(s) = \frac{s^4}{\pi}\int_{4M_\pi^2}^{\infty} ds'\, \frac{\mathrm{Im}\, t_0^0(s')}{s'^4\,(s'-s)},$$
$$\overline{W}^1(s) = \frac{s^3}{\pi}\int_{4M_\pi^2}^{\infty} ds'\, \frac{\mathrm{Im}\, t_1^1(s')}{s'^3\,(s'-4M_\pi^2)(s'-s)},$$
$$\overline{W}^2(s) = \frac{s^4}{\pi}\int_{4M_\pi^2}^{\infty} ds'\, \frac{\mathrm{Im}\, t_0^2(s')}{s'^4\,(s'-s)}. \tag{3.1}$$
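Since all other contributions can be replaced by a polynomial, the phenomenological
amplitude takes the form
$$A(s,t,u) = 16\pi a_0^2 + \frac{4\pi}{3M_\pi^2}\left(2a_0^0 - 5a_0^2\right) s + P(s,t,u) + 32\pi\left\{ \tfrac{1}{3}\overline{W}^0(s) + \tfrac{3}{2}(s-u)\,\overline{W}^1(t) + \tfrac{3}{2}(s-t)\,\overline{W}^1(u) + \tfrac{1}{2}\overline{W}^2(t) + \tfrac{1}{2}\overline{W}^2(u) - \tfrac{1}{3}\overline{W}^2(s) \right\} + O(p^8). \tag{3.2}$$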
We have explicitly displayed the contributions from the subtraction constants $a_0^0$ and $a_0^2$.
The term $P(s, t, u)$ is a crossing symmetric polynomial
$$P(s,t,u) = p_1 + p_2\, s + p_3\, s^2 + p_4\,(t-u)^2 + p_5\, s^3 + p_6\, s\,(t-u)^2. \tag{3.3}$$
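As demonstrated in the appendix, its coefficients can be expressed in terms of the following
integrals over the imaginary parts of the partial waves:
$$I_n^I = \sum_{\ell=0}^{\infty} \frac{(2\ell+1)}{\pi}\int_{4M_\pi^2}^{\infty} ds\, \frac{\mathrm{Im}\, t_\ell^I(s)}{s^{n+2}\,(s-4M_\pi^2)},$$
$$H = \sum_{\ell=2}^{\infty} \frac{(2\ell+1)\,\ell(\ell+1)}{\pi}\int_{4M_\pi^2}^{\infty} ds\, \frac{2\,\mathrm{Im}\, t_\ell^0(s) + 4\,\mathrm{Im}\, t_\ell^2(s)}{9\, s^3\,(s-4M_\pi^2)}. \tag{3.4}$$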
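The explicit expressions read
$$p_1 = -128\pi M_\pi^4 \left( I_0^1 + I_0^2 + 2M_\pi^2 I_1^1 + 2M_\pi^2 I_1^2 + 8M_\pi^4 I_2^2 \right),$$
$$p_2 = -\frac{64\pi M_\pi^2}{3} \left( 2I_0^0 - 6I_0^1 - 2I_0^2 - 15M_\pi^2 I_1^1 - 3M_\pi^2 I_1^2 - 36M_\pi^4 I_2^2 + 6M_\pi^2 H \right),$$
$$p_3 = \frac{8\pi}{3} \left( 4I_0^0 - 9I_0^1 - I_0^2 - 16M_\pi^2 I_1^0 - 42M_\pi^2 I_1^1 + 22M_\pi^2 I_1^2 - 72M_\pi^4 I_2^2 + 24M_\pi^2 H \right),$$
$$p_4 = 8\pi \left( I_0^1 + I_0^2 + 2M_\pi^2 I_1^1 + 2M_\pi^2 I_1^2 - 24M_\pi^4 I_2^2 \right),$$
$$p_5 = \frac{4\pi}{3} \left( 8I_1^0 + 9I_1^1 - 11I_1^2 - 32M_\pi^2 I_2^0 + 44M_\pi^2 I_2^2 - 6H \right),$$
$$p_6 = 4\pi \left( I_1^1 - 3I_1^2 + 12M_\pi^2 I_2^2 + 2H \right). \tag{3.5}$$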
The fact that, at low energies, the scattering amplitude may be represented in terms of
integrals over the imaginary parts that can be evaluated phenomenologically, was noted
earlier, by Stern and collaborators [24]. These authors also worked out the implications for
the threshold parameters and the effective coupling constants of the chiral Lagrangian and
we will compare their results with ours, but we first need to specify the framework we are
using.
4. Matching conditions
In the preceding sections, we have set up two different representations of the scattering
amplitude: one based on chiral perturbation theory and one relying on the Roy equations.
The purpose of the present section is to show that, in their common domain of validity, the
two representations agree, provided the parameters occurring therein are properly matched.
The chiral and phenomenological representations are of the same structure. The
coefficients of the polynomials $C(s, t, u)$ and $P(s, t, u)$ are defined differently and, instead
of the functions $U^I(s)$ occurring in the chiral representation, the phenomenological one
involves the functions $\overline{W}^I(s)$. The latter are defined in Eq. (3.1), as integrals over the
imaginary parts of the physical S- and P-waves.
The key observation is that, in the integrals (3.1), only the region where $s'$ is of
order $p^2$ matters for the comparison of the two representations. The remainder generates
contributions to the amplitude that are at most of order $p^8$. Moreover, for small values
of $s'$, the quantities $\mathrm{Im}\, t_\ell^I(s')$ are given by the chiral representation in Eq. (2.4), except
for contributions that again only manifest themselves at $O(p^8)$. This implies that the
differences between the functions $\overline{W}^I(s)$ and $U^I(s)$ are beyond the accuracy of the chiral
representation:
$$\overline{W}^0(s) = U^0(s) + O(p^8), \qquad \overline{W}^1(s) = U^1(s) + O(p^6), \qquad \overline{W}^2(s) = U^2(s) + O(p^8). \tag{4.1}$$
Hence the two representations agree if and only if the polynomial parts do,
$$C(s,t,u) = 16\pi a_0^2 + \frac{4\pi}{3M_\pi^2}\left(2a_0^0 - 5a_0^2\right) s + P(s,t,u) + O(p^8).$$
This implies that the coefficients of $C(s, t, u)$ and $P(s, t, u)$ are related by
$$c_1 = 16\pi a_0^2 + p_1 + O(p^8), \qquad c_2 = \frac{4\pi}{3M_\pi^2}\left(2a_0^0 - 5a_0^2\right) + p_2 + O(p^6),$$
$$c_3 = p_3 + O(p^4), \qquad c_4 = p_4 + O(p^4),$$
$$c_5 = p_5 + O(p^2), \qquad c_6 = p_6 + O(p^2). \tag{4.2}$$
The chiral representation specifies the coefficients $c_1, \ldots, c_6$ in terms of the effective
coupling constants, while the quantities $p_1, \ldots, p_6$ are experimentally accessible. Since
the main uncertainties in the latter arise from the poorly known values of the scattering
lengths $a_0^0$, $a_0^2$, the above relations essentially determine the coefficients $c_1, \ldots, c_6$ in terms
of these two parameters.
5. Symmetry breaking in the effective Lagrangian

As discussed in Section 2, unitarity fully determines the ππ scattering amplitude to third
order of the chiral expansion, in terms of the coupling constants occurring in the derivative
expansion of the effective Lagrangian to $O(p^6)$,
$$\mathcal{L}_{\mathrm{eff}} = \mathcal{L}_2 + \mathcal{L}_4 + \mathcal{L}_6 + \cdots. \tag{5.1}$$
The leading term $\mathcal{L}_2$ only contains $F$ and $M^2 \equiv 2mB$. The vertices relevant for ππ
scattering involve the coupling constants $\ell_1, \ell_2, \ell_3, \ell_4$ from $\mathcal{L}_4$, and $\mathcal{L}_6$ generates 6 further
couplings: $r_1, \ldots, r_6$. We need to distinguish two different categories of coupling constants:
(a) Terms that survive in the chiral limit. Four of the coupling constants that enter
the two loop representation of the scattering amplitude belong to this category:
$\ell_1, \ell_2, r_5, r_6$.
(b) Symmetry breaking terms. The corresponding vertices are proportional to a power
of the quark mass and involve the coupling constants $\ell_3, \ell_4, r_1, r_2, r_3, r_4$.
The constants of the first category show up in the momentum dependence of the scattering
amplitude, so that these couplings may be determined phenomenologically. The symmetry
breaking terms, on the other hand, specify the dependence of the amplitude on the quark
masses. Since these cannot be varied experimentally, information concerning the second
category of coupling constants can only be obtained from sources other than ππ scattering.
In part, we are relying on theoretical estimates here. Although these are rather crude, the
uncertainties do not significantly affect our results, for the following reason.
The quark masses $m_u$, $m_d$, which are responsible for the symmetry breaking effects, are
very small compared to the intrinsic scale $\Lambda$ of the theory, which is of order 500 MeV or
1 GeV. The group $SU(2)_R \times SU(2)_L$ therefore represents a nearly perfect symmetry of the
QCD Hamiltonian. In the isospin limit, the symmetry breaking effects are controlled by
the ratio $m/\Lambda$, with $m = \frac{1}{2}(m_u + m_d)$. In view of $m \simeq 5$ MeV, the expansion parameter is
of the order of $10^{-2}$, indicating that the expansion converges very rapidly.
In the framework of the effective theory, it is convenient to replace powers of $m$ by
powers of $M_\pi^2$ and to identify the intrinsic scale $\Lambda$ with $4\pi F_\pi$. The expansion parameter
$m/\Lambda$ is then replaced by
$$\xi = \left(\frac{M_\pi}{4\pi F_\pi}\right)^2. \tag{5.2}$$
The numerical value² $\xi = 1.445 \times 10^{-2}$ confirms the estimate just given.
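The quoted value of the expansion parameter can be reproduced in a few lines. This is only a sketch: the charged pion mass of 139.57 MeV is an assumed input, $F_\pi = 92.4$ MeV is taken from footnote 2, and the function name is hypothetical.

```python
import math


def expansion_parameter(m_pi=139.57, f_pi=92.4):
    """xi = (M_pi / (4 pi F_pi))^2 of Eq. (5.2); inputs in MeV (m_pi is an assumed value)."""
    return (m_pi / (4.0 * math.pi * f_pi)) ** 2


if __name__ == "__main__":
    print(f"xi = {expansion_parameter():.5f}")
```

The result agrees with the quoted $1.445 \times 10^{-2}$ to the digits given, and is of the same order as the crude estimate $m/\Lambda \sim 10^{-2}$.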
We know of only one mechanism that can upset the above crude order of magnitude
estimate for the symmetry breaking effects: the perturbations generated by the quark
mass term in the QCD Hamiltonian, $m_u \bar{u}u + m_d \bar{d}d$, may be enhanced by small energy
denominators. Indeed, small energy denominators do occur:
(i) In the chiral limit, the pions are massless, so that the straightforward expansion in
powers of the quark masses leads to infrared singularities. For a finite pion mass, these
singularities are cut off at a scale of the order of $M_\pi$ and the divergences are converted to
finite expressions that involve the logarithm of $M_\pi$. The most important contributions of
this type are generated by the vertices contained in the leading order effective Lagrangian,
which are fully determined by $F_\pi$ and $M_\pi$. Accordingly, the coefficients of the leading
chiral logarithms do not involve any unknown constants. In those cases where this
coefficient happens to be large, the symmetry breaking effects are indeed enhanced, so
that the above rule of thumb estimate then fails.
(ii) States that remain massive in the chiral limit may give rise to small energy
denominators if their mass happens to be small. In the framework of chiral perturbation
theory, the occurrence of such states manifests itself only indirectly, through the fact
that some of the effective coupling constants are comparatively large. The ρ-meson
represents the most prominent example and it is well-known that some of the coupling
constants (for instance $\ell_1$ and $\ell_2$) are dominated by the contribution from this state [10].
In fact, for all of those effective couplings that have been determined experimentally, the
observed magnitude is well accounted for by the hypothesis that they are dominated by the
resonances seen at low energies [40].
6. Low energy theorems

² Throughout this paper, we identify $M_\pi$ with the mass of the charged pion and use $F_\pi = 92.4$ MeV [39].
As the two loop formulae are rather lengthy, we first discuss the principle used to arrive
at the prediction for the S-wave scattering lengths at one loop level, where the algebra is
quite simple. The first order corrections to the two low energy theorems (2.8) are readily
obtained from the formulae given in Appendix B. Expressed in terms of the scale invariant
effective coupling constants $\bar\ell_1, \ldots, \bar\ell_4$ introduced in [5], the result reads:
$$c_1 = -\frac{M_\pi^2}{F_\pi^2}\left\{ 1 + \xi\left[ -\frac{4}{3}\bar\ell_1 + \frac{1}{2}\bar\ell_3 + 2\bar\ell_4 - \frac{197}{210} \right] + O(\xi^2) \right\},$$
$$c_2 = \frac{1}{F_\pi^2}\left\{ 1 + \xi\left[ -\frac{4}{3}\bar\ell_1 + 2\bar\ell_4 - \frac{67}{140} \right] + O(\xi^2) \right\}. \tag{6.1}$$
The corrections involve both types of couplings: $\bar\ell_1$ is of type (a) and can thus be
determined from the momentum dependence of the scattering amplitude, while $\bar\ell_3$ and
$\bar\ell_4$ are of type (b). Indeed, both $\bar\ell_1$ and $\bar\ell_2$ show up in the terms proportional to $s^2$ and
$(t-u)^2$:
$$c_3 = \frac{1}{(4\pi F_\pi)^2}\left\{ \frac{\bar\ell_1}{3} + \frac{\bar\ell_2}{6} - \frac{47}{84} + O(\xi) \right\}, \qquad c_4 = \frac{1}{(4\pi F_\pi)^2}\left\{ \frac{\bar\ell_2}{6} - \frac{127}{840} + O(\xi) \right\}.$$
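These formulae show that, up to and including terms of order $\xi$, the quantities
$$C_1 \equiv F_\pi^2\left\{ c_2 + 4M_\pi^2\,(c_3 - c_4) \right\}, \qquad C_2 \equiv \frac{F_\pi^2}{M_\pi^2}\left\{ -c_1 + 4M_\pi^4\,(c_3 - c_4) \right\} \tag{6.2}$$
exclusively contain the symmetry breaking couplings $\bar\ell_3$ and $\bar\ell_4$:
$$C_1 = 1 + \xi\left\{ 2\bar\ell_4 - \frac{887}{420} \right\} + O(\xi^2), \qquad C_2 = 1 + \xi\left\{ \frac{1}{2}\bar\ell_3 + 2\bar\ell_4 - \frac{18}{7} \right\} + O(\xi^2).$$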
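The cancellation of $\bar\ell_1$ and $\bar\ell_2$ in $C_1$ and $C_2$ can be checked with exact rational arithmetic. The sketch below takes the $O(\xi)$ brackets of $c_1$, $c_2$ and the leading brackets of $c_3$, $c_4$ as inputs, with the signs $-\frac{4}{3}\bar\ell_1$, $-\frac{197}{210}$, $-\frac{67}{140}$, $-\frac{47}{84}$ and $-\frac{127}{840}$ assumed as in Eq. (6.1), and uses $4M_\pi^2 F_\pi^2 (c_3 - c_4) = 4\xi\{\ldots\}$. All names are hypothetical.

```python
from fractions import Fraction as Fr

KEYS = ("l1", "l2", "l3", "l4", "const")


def vec(l1=0, l2=0, l3=0, l4=0, const=0):
    """Coefficients of (l1bar, l2bar, l3bar, l4bar, 1) in a bracket."""
    return {k: Fr(v) for k, v in zip(KEYS, (l1, l2, l3, l4, const))}


def add(u, v, scale=1):
    """Return u + scale * v, component by component."""
    return {k: u[k] + Fr(scale) * v[k] for k in KEYS}


# O(xi) brackets: c1 = -(M^2/F^2)(1 + xi*c1_xi), c2 = (1/F^2)(1 + xi*c2_xi)
c1_xi = vec(l1=Fr(-4, 3), l3=Fr(1, 2), l4=2, const=Fr(-197, 210))
c2_xi = vec(l1=Fr(-4, 3), l4=2, const=Fr(-67, 140))
# Leading brackets: c3, c4 = bracket / (4 pi F)^2
c3_br = vec(l1=Fr(1, 3), l2=Fr(1, 6), const=Fr(-47, 84))
c4_br = vec(l2=Fr(1, 6), const=Fr(-127, 840))

diff = add(c3_br, c4_br, -1)   # bracket of c3 - c4
C1_xi = add(c2_xi, diff, 4)    # O(xi) bracket of C1
C2_xi = add(c1_xi, diff, 4)    # O(xi) bracket of C2

if __name__ == "__main__":
    print("C1:", C1_xi)
    print("C2:", C2_xi)
```

Both brackets come out free of $\bar\ell_1$ and $\bar\ell_2$, with constants $-\frac{887}{420}$ and $-\frac{18}{7}$, which also fixes the signs assumed in the inputs.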
In the following, we analyze the low energy theorems for the S-wave scattering lengths
by means of the quantities $C_1$ and $C_2$ defined in Eq. (6.2). The one for $2a_0^0 - 5a_0^2$, for
instance, is obtained by inserting the matching relations (4.2) in the definition of $C_1$ and
solving for the scattering lengths. The result reads
$$2a_0^0 - 5a_0^2 = \frac{3M_\pi^2}{4\pi F_\pi^2}\, C_1 + M_\pi^4\,\alpha_1 + O(M_\pi^8), \tag{6.3}$$
where $\alpha_1$ collects the contributions from the phenomenological moments,
$$\alpha_1 = 16M_\pi^2\left\{ 8I_1^0 + 9I_1^1 - 11I_1^2 - 36M_\pi^2 I_2^2 - 6H \right\}. \tag{6.4}$$
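The analogous low energy theorems for $a_0^0$ and $a_0^2$ read
$$a_0^0 = \frac{7M_\pi^2}{32\pi F_\pi^2}\, C_0 + M_\pi^4\,\alpha_0 + O(M_\pi^8), \qquad a_0^2 = -\frac{M_\pi^2}{16\pi F_\pi^2}\, C_2 + M_\pi^4\,\alpha_2 + O(M_\pi^8), \tag{6.5}$$
where $C_0$ is a combination of $C_1$ and $C_2$,
$$C_0 = \frac{1}{7}\,(12\,C_1 - 5\,C_2), \tag{6.6}$$
while $\alpha_0$, $\alpha_2$ again stand for a collection of moments
$$\alpha_0 = \frac{4}{3}\left\{ 5I_0^0 + 10I_0^2 + 28M_\pi^2 I_1^0 + 24M_\pi^2 I_1^1 - 16M_\pi^2 I_1^2 - 96M_\pi^4 I_2^2 + 6M_\pi^2 H \right\},$$
$$\alpha_2 = \frac{8}{3}\left\{ I_0^0 + 2I_0^2 - 4M_\pi^2 I_1^0 - 6M_\pi^2 I_1^1 + 10M_\pi^2 I_1^2 + 24M_\pi^4 I_2^2 + 6M_\pi^2 H \right\}. \tag{6.7}$$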
The relations (6.3)-(6.7) specify the S-wave scattering lengths in terms of $C_1$, $C_2$ and
the phenomenological moments $I_n^I$ and $H$. Note that these contain infrared singularities.
Their chiral expansion starts with the contributions generated by the square of the tree level
amplitudes:
$$I_1^0 = \frac{101}{M_\pi^2 K} + O(1), \qquad I_2^0 = \frac{227}{14\,M_\pi^4 K} + O(M_\pi^{-2}),$$
$$I_1^1 = \frac{2}{M_\pi^2 K} + O(1), \qquad I_2^1 = \frac{1}{7\,M_\pi^4 K} + O(M_\pi^{-2}),$$
$$I_1^2 = \frac{14}{M_\pi^2 K} + O(1), \qquad I_2^2 = \frac{13}{7\,M_\pi^4 K} + O(M_\pi^{-2}),$$
$$H = O(1), \qquad K \equiv 61440\,\pi^3 F_\pi^4. \tag{6.8}$$
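The leading coefficients in Eq. (6.8) can be reproduced exactly by inserting $\mathrm{Im}\, t = \sigma\, t_2^2$ with the tree amplitudes (2.2) into the moments. The sketch below assumes the weight $1/[s^{n+2}(s - 4M_\pi^2)]$ for the moments, works in units $M_\pi = 1$, and substitutes $s = 4/(1 - v^2)$ with $v = \sigma(s)$, which turns each integral into a polynomial integral over $0 \le v \le 1$. All function names are hypothetical.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Multiply two polynomials in x = v^2 given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_pow(a, k):
    out = [Fraction(1)]
    for _ in range(k):
        out = poly_mul(out, a)
    return out


def integrate_0_1(poly_in_x):
    """Integral over v in [0, 1] of sum_k c_k v^(2k)."""
    return sum(c / (2 * k + 1) for k, c in enumerate(poly_in_x))


# Tree amplitudes of Eq. (2.2) with M_pi = 1: t = (a*s + b) / (c * pi * F^2),
# stored as isospin -> (ell, a, b, c).
TREE = {0: (0, 2, -1, 32), 1: (1, 1, -4, 96), 2: (0, -1, 2, 32)}


def leading_moment(isospin, n):
    """Leading coefficient of I_n^I in units of 1 / (M_pi^(2n) K), for n >= 1."""
    ell, a, b, c = TREE[isospin]
    # (a*s + b) * (1 - x) with s = 4 / (1 - x)
    numerator = [Fraction(4 * a + b), Fraction(-b)]
    integrand = poly_mul(poly_pow(numerator, 2),
                         poly_pow([Fraction(1), Fraction(-1)], n - 1))
    # sigma ds / (s - 4) = 2 dv / (1 - x) and 1/s^(n+2) = (1 - x)^(n+2) / 4^(n+2)
    prefactor = Fraction(61440 * (2 * ell + 1), c * c) * Fraction(2, 4 ** (n + 2))
    return prefactor * integrate_0_1(integrand)


if __name__ == "__main__":
    for n in (1, 2):
        print(n, [leading_moment(i, n) for i in (0, 1, 2)])
```

Only the S- and P-waves enter, because the tree level amplitudes of the higher partial waves vanish.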
The evaluation of the moments requires phenomenological information. Since the
behaviour of the imaginary parts near threshold is sensitive to the scattering lengths we
are looking for, the same applies to these moments. In the narrow range of interest, the
dependence is well described by the quadratic formulae in Appendix E, which yield
$$M_\pi^4\,\alpha_0 = 0.0448 + 0.30\,\Delta a_0^0 - 0.37\,\Delta a_0^2 + 0.5\,(\Delta a_0^0)^2 - 1.2\,\Delta a_0^0\,\Delta a_0^2 + 1.8\,(\Delta a_0^2)^2,$$
$$M_\pi^4\,\alpha_1 = 0.0619 + 0.48\,\Delta a_0^0 - 0.26\,\Delta a_0^2 + 0.86\,(\Delta a_0^0)^2 - 1.7\,\Delta a_0^0\,\Delta a_0^2 + 0.3\,(\Delta a_0^2)^2,$$
$$M_\pi^4\,\alpha_2 = 0.00553 + 0.023\,\Delta a_0^0 - 0.095\,\Delta a_0^2 - 0.1\,\Delta a_0^0\,\Delta a_0^2 + 0.7\,(\Delta a_0^2)^2, \tag{6.9}$$
with $\Delta a_0^0 = a_0^0 - 0.225$, $\Delta a_0^2 = a_0^2 + 0.03706$.
7. The coupling constants $\bar\ell_3$ and $\bar\ell_4$

The representation of the S-wave scattering lengths derived in the preceding section
splits the correction to Weinberg's leading order formulae into two parts: a correction factor
$C_n$, which at first nonleading order only involves the coupling constants $\bar\ell_3$ and $\bar\ell_4$, and a
term $\alpha_n$ that can be determined on phenomenological grounds.
The significance of the coupling constants $\bar\ell_3$ and $\bar\ell_4$ is best seen in the expansion of $M_\pi$
and $F_\pi$ in powers of the quark mass. The relation of Gell-Mann, Oakes and Renner [3]
states that the expansion of $M_\pi^2$ starts with a term linear in $m$. The coupling constant $\bar\ell_3$
determines the first order correction:
$$M_\pi^2 = M^2\left\{ 1 - \tfrac{1}{2}\,\xi\,\bar\ell_3 + O(\xi^2) \right\}, \qquad M^2 \equiv 2Bm. \tag{7.1}$$
The constant $B$ stands for the value of $|\langle 0|\bar{u}u|0\rangle|/F_\pi^2$ in the chiral limit. Note that $\bar\ell_3$
contains a chiral logarithm, $\bar\ell_3 = -\ln M_\pi^2 + O(1)$. The coupling constant $\bar\ell_4$, which also
contains a chiral logarithm with unit coefficient, $\bar\ell_4 = -\ln M_\pi^2 + O(1)$, is the analogous
term in the expansion of the pion decay constant,
$$F_\pi = F\left\{ 1 + \xi\,\bar\ell_4 + O(\xi^2) \right\}, \tag{7.2}$$
where $F$ is the value of $F_\pi$ in the chiral limit.

The same two coupling constants also show up in the scalar form factor
$$\langle \pi(p')|\, m_u \bar{u}u + m_d \bar{d}d\, |\pi(p)\rangle = \sigma_\pi\left\{ 1 + \tfrac{1}{6}\langle r^2\rangle_s\, t + O(t^2) \right\}. \tag{7.3}$$
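The value of the matrix element at $t = 0$ is the pion σ-term. According to the Feynman-Hellman
theorem, it is given by $\sigma_\pi = m\,\partial M_\pi^2/\partial m$. The relation (7.1) thus shows that $\bar\ell_3$
also determines the σ-term to first nonleading order:
$$\sigma_\pi = M_\pi^2\left\{ 1 - \tfrac{1}{2}\,\xi\,(\bar\ell_3 - 1) + \xi^2 \Delta_\sigma + O(\xi^3) \right\}. \tag{7.4}$$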
Moreover, chiral symmetry implies that the same coupling constant that determines the
difference between $F_\pi$ and $F$ also fixes the scalar radius at leading order of the chiral
expansion [10]:
$$\langle r^2\rangle_s = \frac{3}{8\pi^2 F_\pi^2}\left\{ \bar\ell_4 - \frac{13}{12} + \xi\,\Delta_r + O(\xi^2) \right\}. \tag{7.5}$$
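We may therefore eliminate $\bar\ell_4$ in favour of the scalar radius and rewrite the correction
factors in the form
$$C_0 = 1 + \frac{M_\pi^2}{3}\langle r^2\rangle_s - \frac{5\xi}{14}\left\{ \bar\ell_3 - \frac{563}{525} \right\} + \xi^2 \Delta_0 + O(\xi^3),$$
$$C_1 = 1 + \frac{M_\pi^2}{3}\langle r^2\rangle_s + \frac{23\,\xi}{420} + \xi^2 \Delta_1 + O(\xi^3),$$
$$C_2 = 1 + \frac{M_\pi^2}{3}\langle r^2\rangle_s + \frac{\xi}{2}\left\{ \bar\ell_3 - \frac{17}{21} \right\} + \xi^2 \Delta_2 + O(\xi^3), \tag{7.6}$$
with $\Delta_0 \equiv (12\Delta_1 - 5\Delta_2)/7$. The first order corrections are then determined by $\langle r^2\rangle_s$
and $\bar\ell_3$, while $\Delta_0$, $\Delta_1$ and $\Delta_2$ represent the two loop contributions. The scalar form factor
is also known to two loops [41]. The explicit expressions for the second order corrections
are given in Appendix C.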
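The numerical constants in Eq. (7.6) follow from those of Section 6 once $\bar\ell_4$ is traded for the scalar radius. A short sketch of that bookkeeping, assuming $\frac{1}{3}M_\pi^2\langle r^2\rangle_s = 2\xi\,(\bar\ell_4 - \frac{13}{12})$ at this order, the one loop constants $-\frac{887}{420}$ and $-\frac{18}{7}$ in $C_1$ and $C_2$, and $C_0 = (12C_1 - 5C_2)/7$; the function name is hypothetical.

```python
from fractions import Fraction as Fr


def one_loop_constants():
    """Constants of Eq. (7.6) derived from the O(xi) terms of C1 and C2."""
    shift = 2 * Fr(13, 12)            # constant absorbed into (M^2/3) <r^2>_s
    c1_const = Fr(-887, 420) + shift  # O(xi) constant left in C1
    c2_const = Fr(-18, 7) + shift     # O(xi) constant left in C2
    c2_l3 = Fr(1, 2)                  # coefficient of l3bar in C2
    # C0 = (12 C1 - 5 C2) / 7, applied to the O(xi) terms
    c0_l3 = -5 * c2_l3 / 7
    c0_const = (12 * c1_const - 5 * c2_const) / 7
    return {
        "C1 constant": c1_const,                 # expected 23/420
        "C2 offset": -c2_const / c2_l3,          # C2 = ... + (xi/2)(l3bar - offset)
        "C0 l3 coefficient": c0_l3,              # expected -5/14
        "C0 offset": -c0_const / c0_l3,          # C0 = ... - (5 xi/14)(l3bar - offset)
    }


if __name__ == "__main__":
    for name, value in one_loop_constants().items():
        print(name, value)
```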
For the numerical value of the scalar radius, we rely on the dispersive evaluation of
the scalar form factor described in Ref. [42]. We have repeated that calculation with the
information about the phase shift $\delta_0^0(s)$ obtained in Ref. [7]. In view of the strong final state
interaction in the S-wave, the scalar radius is significantly larger than the electromagnetic
one, $\langle r^2\rangle_{\mathrm{e.m.}} = 0.439 \pm 0.008$ fm² [43]. The result reads
$$\langle r^2\rangle_s = 0.61 \pm 0.04\ \mathrm{fm}^2, \tag{7.7}$$
where the error is our estimate of the uncertainties to be attached to the dispersive
calculation. The number confirms the value given in Ref. [42] and is consistent with earlier
estimates of the low energy constant $\bar\ell_4$, based on the symmetry breaking seen in $F_K/F_\pi$
or on the decay $K \to \pi\ell\nu$ [44], but is more accurate. It corresponds to $\frac{1}{3} M_\pi^2 \langle r^2\rangle_s =
0.102 \pm 0.007$, so that the contribution from the scalar radius represents a correction of
order 10%, in $C_0$, $C_1$, as well as in $C_2$.
The crucial parameter that distinguishes the standard framework from the one proposed
in Ref. [24] is $\bar\ell_3$. The value of this coupling constant is not known accurately. Numerically,
however, a significant change in the prediction for the scattering lengths can only arise if
the crude estimate
$$\bar\ell_3 = 2.9 \pm 2.4 \tag{7.8}$$
given in Ref. [10] should turn out to be entirely wrong: with this estimate, the contribution
from $\bar\ell_3$ to $a_0^0$ and $a_0^2$ is of order 0.002 and 0.001, respectively. We do not make an attempt
at reducing the uncertainty in $\bar\ell_3$ within the standard framework, because it barely affects
our final result. Instead, we will explicitly display the sensitivity of the outcome to this
coupling constant.
8. Results for $a_0^0$ and $a_0^2$ at one loop level
We first drop the two loop corrections $\Delta_n$. Inserting the values $\langle r^2\rangle_s = 0.61$ fm² and
$\bar\ell_3 = 2.9$, the low energy theorems (7.6) yield
$$C_0 = 1.092, \qquad C_1 = 1.103, \qquad C_2 = 1.117. \tag{8.1}$$
The correction factor $C_1$ is fully determined by the contribution from the scalar radius.
The numerical values of $C_0$ and $C_2$ differ little from $C_1$: the estimate (7.8) implies that
the contributions from the coupling constant $\bar\ell_3$ are very small, so that these terms are also
dominated by the scalar radius. Inserting the values (6.9), (8.1) in the relations (6.5) and
solving for $a_0^0$, $a_0^2$, we then get
$$a_0^0 = 0.2195, \qquad a_0^2 = -0.0446, \qquad 2a_0^0 - 5a_0^2 = 0.662. \tag{8.2}$$
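The steps leading to (8.1) and (8.2) can be sketched numerically: evaluate the correction factors (7.6) without the two loop terms, then solve the coupled relations (6.5) with the quadratic representation (6.9) by fixed point iteration. The pion mass (139.57 MeV) and the conversion constant $\hbar c = 197.327$ MeV fm are assumed inputs, the signs in the quadratic formulae are taken as $-0.37$, $-1.2$, $-0.095$ and $-0.1$, and the iteration scheme is one possible choice, not necessarily the procedure used for the published numbers.

```python
import math

M_PI = 139.57    # charged pion mass in MeV (assumed input)
F_PI = 92.4      # pion decay constant in MeV (footnote 2)
HBARC = 197.327  # MeV fm (assumed input, converts fm^2 to MeV^-2)


def correction_factors(r2_s=0.61, l3bar=2.9):
    """C0, C1, C2 of Eq. (7.6) with the two loop terms dropped."""
    xi = (M_PI / (4.0 * math.pi * F_PI)) ** 2
    radius = (M_PI / HBARC) ** 2 * r2_s / 3.0   # (M_pi^2 / 3) <r^2>_s
    c0 = 1.0 + radius - 5.0 * xi / 14.0 * (l3bar - 563.0 / 525.0)
    c1 = 1.0 + radius + 23.0 * xi / 420.0
    c2 = 1.0 + radius + xi / 2.0 * (l3bar - 17.0 / 21.0)
    return c0, c1, c2


def moments(a00, a02):
    """M_pi^4 alpha_0 and M_pi^4 alpha_2 from the quadratic formulae (6.9)."""
    d0, d2 = a00 - 0.225, a02 + 0.03706
    alpha0 = 0.0448 + 0.30 * d0 - 0.37 * d2 + 0.5 * d0**2 - 1.2 * d0 * d2 + 1.8 * d2**2
    alpha2 = 0.00553 + 0.023 * d0 - 0.095 * d2 - 0.1 * d0 * d2 + 0.7 * d2**2
    return alpha0, alpha2


def solve_scattering_lengths(c0, c2, n_iter=100):
    """Fixed point iteration of Eq. (6.5), starting from the tree level values."""
    w0 = 7.0 * (M_PI / F_PI) ** 2 / (32.0 * math.pi)
    w2 = -((M_PI / F_PI) ** 2) / (16.0 * math.pi)
    a00, a02 = w0, w2
    for _ in range(n_iter):
        alpha0, alpha2 = moments(a00, a02)
        a00, a02 = w0 * c0 + alpha0, w2 * c2 + alpha2
    return a00, a02


if __name__ == "__main__":
    c0, c1, c2 = correction_factors()
    a00, a02 = solve_scattering_lengths(c0, c2)
    print(f"C0 = {c0:.3f}, C1 = {c1:.3f}, C2 = {c2:.3f}")
    print(f"a00 = {a00:.4f}, a02 = {a02:.4f}, 2a00 - 5a02 = {2 * a00 - 5 * a02:.3f}")
```

With these inputs the sketch lands close to the values in (8.1) and (8.2); small differences in the last digit can arise from the rounded coefficients in (6.9) and from the assumed constants.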
These numbers are somewhat different from those obtained in [5], which are also based
on the one loop representation of the scattering amplitude. In fact, even if the two loop
corrections $\Delta_n$ are dropped, the formulae (6.5) for the S-wave scattering lengths differ
from those given in Ref. [5]. In the case of $a_0^0$, for example, the formula given there reads
$$a_0^0 = \frac{7M_\pi^2}{32\pi F_\pi^2}\left\{ 1 + \frac{M_\pi^2}{3}\langle r^2\rangle_s - \frac{5\xi}{14}\left( \bar\ell_3 - \frac{353}{15} \right) \right\} + \frac{25}{4}\, M_\pi^4 \left( a_2^0 + 2a_2^2 \right) + O(M_\pi^6),$$
where $a_2^0$ and $a_2^2$ are the D-wave scattering lengths. As far as the contributions
proportional to $\langle r^2\rangle_s$ and $\bar\ell_3$ are concerned, the expression is the same, but instead of the
phenomenological moments contained in $\alpha_0$, the above formula contains the term
$$\frac{25}{4}\left( a_2^0 + 2a_2^2 \right) + \frac{737}{6720\,\pi^3 F_\pi^4}. \tag{8.3}$$
Indeed, the D-wave scattering lengths may be expressed in terms of moments, up to
and including contributions of first nonleading order. Projecting the phenomenological
representation (3.2) onto the D-waves, we find that the functions $\overline{W}^I(s)$ do not contribute
to the scattering lengths, while the contribution from the background polynomial reads
$$a_2^0 = \frac{16}{45}\left\{ I_0^0 + 3I_0^1 + 5I_0^2 - 4M_\pi^2\left( I_1^0 - 3I_1^1 + 5I_1^2 \right) + 30M_\pi^2 H \right\} + O(M_\pi^4),$$
$$a_2^2 = \frac{8}{45}\left\{ 2I_0^0 - 3I_0^1 + I_0^2 - 4M_\pi^2\left( 2I_1^0 + 3I_1^1 + I_1^2 \right) + 24M_\pi^2 H \right\} + O(M_\pi^4). \tag{8.4}$$
The comparison with the exact representation for the D-wave scattering lengths given
in [7] shows that the contributions from the imaginary parts of the S- and P-waves can
be represented in terms of the moments and the coefficients agree with those above. The
formula (8.4) includes the contributions from the higher partial waves, up to and including
corrections of first nonleading order. In the difference,
$$\Delta a_0^0 \equiv M_\pi^4\left\{ \alpha_0 - \frac{25}{4}\left( a_2^0 + 2a_2^2 \right) - \frac{737}{6720\,\pi^3 F_\pi^4} \right\},$$
the leading moments cancel, but the terms with $I_1^I$, $I_2^I$ and $H$ remain:
$$\Delta a_0^0 = -\frac{737\,M_\pi^4}{6720\,\pi^3 F_\pi^4} + 8M_\pi^6\left\{ 8I_1^0 + 4I_1^1 + 4I_1^2 - H \right\} - 128\,M_\pi^8 I_2^2. \tag{8.5}$$
The low energy expansion of the moments in Eq. (6.8) shows that the contributions of
$O(M_\pi^4)$ in $\Delta a_0^0$ indeed cancel out, demonstrating that the formula given in Ref. [5] agrees
with our representation, up to terms that are beyond the algebraic accuracy of that formula.
Numerically, however, the leading order terms represent a rather poor approximation for
the moments, so that there is a numerical difference: the numerical values of the moments
are given in Appendix E. Inserting these in (8.4), we obtain $a_2^0 = 1.76 \times 10^{-3}\, M_\pi^{-4}$, $a_2^2 =
0.171 \times 10^{-3}\, M_\pi^{-4}$, so that the one loop formula of Ref. [5] yields $a_0^0 = 0.205$, instead of the
value $a_0^0 = 0.2195$ given above. The difference arises because we are matching the chiral
and phenomenological representations differently: we represent the amplitude in terms of
three functions of a single variable $s$ and match the coefficients of the Taylor expansion at
$s = 0$. In Ref. [5], the one loop formulae for the various scattering lengths were obtained
by directly evaluating the chiral representation at threshold; in other words, the matching
was performed at $s = 4M_\pi^2$ rather than at $s = 0$.
We emphasize that the above discussion in the framework of the one loop approximation
only serves to explicitly demonstrate that the choice of the matching conditions is not
irrelevant. Admittedly, in our final analysis, where we will be working at two loop accuracy,
the noise due to that choice is significantly smaller.
9. Infrared singularities
From a purely algebraic point of view, the manner in which the matching is done is
irrelevant, as long as it is performed in the common region of validity of the chiral and
phenomenological representations. We could also match the two loop representation to the
phenomenological one at threshold and would then obtain a formula analogous to the one
given in [5], but now valid to next-to-next-to-leading order. Alternatively, we could match
the two representations of the scattering amplitude at the center of the Mandelstam triangle;
the result would only differ by contributions that are beyond the accuracy of the chiral
representation.
There is a good reason for preferring the procedure specified above to a matching at
threshold: the branch cut required by unitarity starts there. The modifications of the tree
level result generated by the higher order effects are quite large at threshold, because they
are enhanced by a small energy denominator. Indeed, $a_0^0$ contains a chiral logarithm with
an unusually large coefficient:
\[ a_0^0 = \frac{7M_\pi^2}{32\pi F_\pi^2}\left\{1 + \frac{9}{2}\,\ell_\chi + \dots\right\}, \qquad \ell_\chi = \left(\frac{M_\pi}{4\pi F_\pi}\right)^2 \ln\frac{\mu^2}{M_\pi^2}. \]
The phenomenon gives rise to an exceptionally large correction that violates the rule of
thumb of Section 5 by an order of magnitude: the one-loop correction increases the tree
level prediction by about 25%!
At the center of the Mandelstam triangle, the amplitude also contains a chiral logarithm
($s_0 = \frac{4}{3} M_\pi^2$):
\[ A(s_0, s_0, s_0) = \frac{M_\pi^2}{3F_\pi^2}\left\{1 + \frac{11}{6}\,\ell_\chi + \dots\right\}. \]
The coefficient is less than half as big as the one in a00 , but it still represents a sizeable
correction.
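To get a feeling for the size of these logarithms, the following sketch evaluates the chiral logarithm numerically. The inputs $M_\pi = 139.57$ MeV, $F_\pi = 92.4$ MeV and the scale $\mu = 0.77$ GeV are assumptions made only for this illustration; the coefficients 9/2 and 11/6 are the ones quoted in the text. The logarithm is only part of the full one loop correction, so the numbers are indicative rather than a reproduction of the 25% quoted above.

```python
import math

# Illustrative inputs (assumptions, not fixed by the text at this point).
M_PI = 139.57   # pion mass in MeV
F_PI = 92.4     # pion decay constant in MeV
MU = 770.0      # renormalization scale in MeV


def chiral_log(mu=MU, m_pi=M_PI, f_pi=F_PI):
    """Chiral logarithm (M_pi / (4 pi F_pi))^2 * ln(mu^2 / M_pi^2)."""
    return (m_pi / (4.0 * math.pi * f_pi)) ** 2 * math.log(mu**2 / m_pi**2)


ell_chi = chiral_log()

# Logarithmic corrections relative to the tree level result.
threshold_log = 9.0 / 2.0 * ell_chi   # coefficient in a_0^0 (threshold)
center_log = 11.0 / 6.0 * ell_chi     # coefficient at the center of the triangle

print(f"ell_chi           = {ell_chi:.4f}")
print(f"9/2  * ell_chi    = {threshold_log:.3f}")
print(f"11/6 * ell_chi    = {center_log:.3f}")
```

With these inputs the logarithm at threshold alone amounts to a correction of roughly 20%, while the one at the center of the Mandelstam triangle is below 10%.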
In our matching procedure, we replace $a_0^0$ and $a_0^2$ by $C_0$ and $C_2$ and at the same time also
eliminate $\bar\ell_4$ in favour of the scalar radius. What matters for the convergence properties of
the quantities appearing in our matching conditions are the infrared singularities contained
in
C0
M2 2
5
r s = 1 + ,
3
14
C2
M2 2
1
r s = 1 + + .
3
2
The coefficients occurring here are remarkably small. The term $C_1 - \frac{1}{3} M_\pi^2 \langle r^2\rangle_s$ does not
contain a chiral logarithm at all. We can therefore expect that, for the quantities that are
relevant for the determination of the S-wave scattering lengths, the perturbation series
converges very rapidly, much more so than for a matching at threshold or at the center of
the Mandelstam triangle. As we will see, this is indeed borne out by the numerical analysis.
10. Estimates for symmetry breaking at $O(p^6)$
We now extend the analysis to next-to-next-to-leading order. For that purpose, we need
an estimate for the symmetry breaking couplings $r_1, \dots, r_4$ and $r_{S_2}$ of $\mathcal{L}_6$, which enter the
low energy theorems for $C_0$, $C_1$, $C_2$ at order $M_\pi^4$, as well as the relation between the scalar
radius and the coupling constant $\bar\ell_4$. The corresponding correction terms $\Delta_0$, $\Delta_1$, $\Delta_2$
are listed in (C.2) and (C.3). In the normalization used there, the resonance estimates of
Refs. [6,22,45] amount to
\[ r_1 \simeq -1.5, \quad r_2 \simeq 3.2, \quad r_3 \simeq -4.2, \quad r_4 \simeq -2.5, \quad r_{S_2} \simeq -0.7. \qquad (10.1) \]
Inserting these numbers, we obtain a shift in $C_0$, $C_1$, $C_2$ by 0.3, 0.5 and 0.8 permille,
respectively. This confirms the expectation that the effects due to the symmetry breaking
coupling constants $r_n$ are tiny. Since the scale is set by the scalar or pseudoscalar
non-Goldstone states contributing to the relevant sum rules, $M_S \simeq 1$ GeV, the corresponding
corrections are of order $M_\pi^4/M_S^4 \simeq 4 \times 10^{-4}$. In the SU(2) framework we are using here,
the $K\bar K$ continuum also contributes to the effective coupling constants, but in view of
$4M_K^2 \simeq M_S^2$, the corresponding scale is even somewhat larger. In the following, we assume
that the estimates in Eq. (10.1) are valid to within a factor of two.
In the case of $r_1, \dots, r_4$, the main uncertainty stems from the continuum underneath
the resonances, that is from the chiral logarithms. Since the formulae (C.2) are quadratic
in these, the scale dependence of those coupling constants is rather pronounced. This can
be seen by varying the scale $\mu$, at which the running coupling constants are assumed to be
saturated by the resonance contributions. For $0.5\ \mathrm{GeV} < \mu < 1\ \mathrm{GeV}$, the corrections vary
in the range
0.002 2 *0 0.005,
0.001 2 *1 0.003,
0.005 2 *2 0.001.
In the representation (7.5) for the scalar radius, the two loop correction $\Delta_r$ represents an
effect of first order. Estimating the magnitude in the same manner as for $\Delta_0$, $\Delta_1$, $\Delta_2$, the
result varies in the range $0.18 \le \Delta_r \le 0.28$. The correction thus shifts the scalar radius
by $0.04 \pm 0.01\ \mathrm{fm}^2$.
In the following, the central values are calculated by using the resonance estimates (10.1)
at the scale $\mu = M_\rho$. For some of the quantities analyzed in the present paper, the result
is insensitive to the uncertainties inherent in these estimates, but in some cases, they even
dominate our error bars; we will discuss the sensitivity of the various results in detail.
11. Final results for $a_0^0$ and $a_0^2$
We are now in a position to describe the determination of $a_0^0$ and $a_0^2$ at two loop accuracy.
Our matching conditions identify two different representations for the coefficients
$c_1, \dots, c_6$: the chiral representation specified in Eq. (B.2) and the phenomenological one
in (4.2). For the evaluation of the S-wave scattering lengths, only the first four coefficients
are relevant. For these, the chiral representation involves the effective coupling constants
$\bar\ell_1$, $\bar\ell_2$, $\bar\ell_3$, $\bar\ell_4$, $r_1$, $r_2$, $r_3$, $r_4$, while the phenomenological representation contains only the
two parameters $a_0^0$ and $a_0^2$, which enter explicitly as well as implicitly, through the moments
$p_1, \dots, p_4$. In principle, we solve the four conditions for the four variables $a_0^0$, $a_0^2$, $\bar\ell_1$, $\bar\ell_2$,
treating the symmetry breaking coupling constants $\bar\ell_3$, $\bar\ell_4$, $r_1, \dots, r_4$ as known.
The constant $\bar\ell_3$ is varied in the range specified in (7.8). Concerning $\bar\ell_4$, we rely on the
result for the scalar radius given in (7.7), thus in effect replacing the input variable $\bar\ell_4$
Table 1
Solution of the matching conditions. The first row contains the central values. The next four rows
indicate the uncertainties in this result, arising from the one in $\langle r^2\rangle_s$, $\bar\ell_3$, $r_n$ and in the experimental
input used in the Roy equations. The last row is obtained by adding these up in quadrature.

                        $a_0^0$   $a_0^2$    $\bar\ell_1$   $\bar\ell_2$   $\bar\ell_4$   $r_5$   $r_6$
  central value         0.220     -0.0444    -0.36          4.31           4.39           3.8     1.0
  $\langle r^2\rangle_s$   0.002     0.0003     0.04           0.02           0.19           0.05    0.03
  $\bar\ell_3$          0.004     0.0009     0.01           0.00           0.02           0.01    0.00
  $r_n$                 0.001     0.0002     0.51           0.10           0.10           1.04    0.10
  Exp.                  0.001     0.0002     0.29           0.04           0.03           0.12    0.02
  Tot.                  0.005     0.0010     0.59           0.11           0.22           1.05    0.11

by $\langle r^2\rangle_s$. The analysis then involves a fifth condition: the relation (7.5), which expresses
the scalar radius in terms of effective coupling constants.
If all of the input variables are taken at their central values, the representation for
the moments given in Appendix E can be used. The solution of the resulting system of
numerical equations occurs at the values quoted in Table 1, first row. The next four rows
indicate the sensitivity to the input used for $\langle r^2\rangle_s$, $\bar\ell_3$, to the uncertainties in the symmetry
breaking coupling constants $r_n$ of $O(p^6)$, and to those in the experimental information
used when solving the Roy equations. The details of the error analysis that underlies these
numbers are described in Appendix F.
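As a simple cross-check of the last row of Table 1, the sketch below adds the four individual uncertainties of each column in quadrature and compares the outcome with the quoted total. The dictionary keys are labels chosen here for illustration; the numbers are those of the table. Agreement is only expected up to the rounding of the tabulated entries.

```python
import math

# Individual uncertainties from Table 1, in the order:
# scalar radius, l3-bar, r_n, experimental input.
table1_errors = {
    "a00": [0.002, 0.004, 0.001, 0.001],
    "a02": [0.0003, 0.0009, 0.0002, 0.0002],
    "l1": [0.04, 0.01, 0.51, 0.29],
    "l2": [0.02, 0.00, 0.10, 0.04],
    "l4": [0.19, 0.02, 0.10, 0.03],
    "r5": [0.05, 0.01, 1.04, 0.12],
    "r6": [0.03, 0.00, 0.10, 0.02],
}

# Totals quoted in the last row of Table 1.
quoted_total = {
    "a00": 0.005, "a02": 0.0010, "l1": 0.59, "l2": 0.11,
    "l4": 0.22, "r5": 1.05, "r6": 0.11,
}


def quadrature(errors):
    """Combine independent uncertainties in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


totals = {name: quadrature(errs) for name, errs in table1_errors.items()}

for name, total in totals.items():
    print(f"{name}: computed {total:.4f}, quoted {quoted_total[name]}")
```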
Table 1 shows that the uncertainties in the prediction for $a_0^0$ and $a_0^2$ are dominated by
those from $\bar\ell_3$. In particular, the result for the S-wave scattering lengths is not sensitive
to the contributions from the coupling constants occurring at $O(p^6)$. Adding up the
uncertainties due to these and to the experimental input in the Roy equations, we arrive
at
\[ a_0^0 = 0.220 \pm 0.001 + 0.027\,\Delta_{r^2} - 0.0017\,\Delta_{\ell_3}, \]
\[ a_0^2 = -0.0444 \pm 0.0003 - 0.004\,\Delta_{r^2} - 0.0004\,\Delta_{\ell_3}, \]
where $\Delta_{r^2}$ and $\Delta_{\ell_3}$ are defined by
\[ \langle r^2\rangle_s = 0.61\ \mathrm{fm}^2\,(1 + \Delta_{r^2}), \qquad \bar\ell_3 = 2.9 + \Delta_{\ell_3}. \qquad (11.1) \]
Our final result for the S-wave scattering lengths follows from this representation with the
estimates for $\Delta_{r^2}$, $\Delta_{\ell_3}$ given in (7.7), (7.8), and reads
\[ a_0^0 = 0.220 \pm 0.005, \qquad a_0^2 = -0.0444 \pm 0.0010, \]
\[ 2a_0^0 - 5a_0^2 = 0.663 \pm 0.006, \qquad a_0^0 - a_0^2 = 0.265 \pm 0.004. \qquad (11.2) \]
Expressed in terms of the coefficients $C_0$, $C_1$, $C_2$, this result corresponds to
\[ C_0 = 1.096 \pm 0.021, \qquad C_1 = 1.104 \pm 0.009, \qquad C_2 = 1.115 \pm 0.022. \qquad (11.3) \]
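The step from the linear representation of Section 11 to the final errors in (11.2) can be sketched numerically. The block below assumes that the uncertainties entering through (7.7) and (7.8) are $\pm 0.04\ \mathrm{fm}^2$ on a scalar radius of $0.61\ \mathrm{fm}^2$ and $\pm 2.4$ on $\bar\ell_3$; these two numbers are not stated in this section and are used here as assumptions. The three contributions are treated as independent and added in quadrature.

```python
import math

# Assumed input uncertainties (not quoted in this section).
SIGMA_R2_REL = 0.04 / 0.61   # relative uncertainty of the scalar radius
SIGMA_L3 = 2.4               # uncertainty of l3-bar

# Coefficients of the linear representation: (noise, d/dDelta_r2, d/dDelta_l3).
COEFFS = {
    "a00": (0.001, 0.027, -0.0017),
    "a02": (0.0003, -0.004, -0.0004),
}


def total_error(noise, slope_r2, slope_l3):
    """Quadrature sum of the three independent error sources."""
    return math.sqrt(
        noise**2 + (slope_r2 * SIGMA_R2_REL) ** 2 + (slope_l3 * SIGMA_L3) ** 2
    )


err_a00 = total_error(*COEFFS["a00"])
err_a02 = total_error(*COEFFS["a02"])

# Central values of the two combinations quoted in (11.2).
a00, a02 = 0.220, -0.0444
combo_1 = 2 * a00 - 5 * a02
combo_2 = a00 - a02

print(f"error on a00: {err_a00:.4f}")
print(f"error on a02: {err_a02:.5f}")
print(f"2 a00 - 5 a02 = {combo_1:.3f},  a00 - a02 = {combo_2:.3f}")
```

Under these assumptions the errors come out close to the values 0.005 and 0.0010 quoted in (11.2), and the central values of the two combinations agree with the quoted ones up to rounding.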
12. Discussion
The terms omitted in the chiral perturbation series represent an inherent limitation of our
calculation. The matching must be done in such a manner that these are small. In contrast
to a matching at threshold (that is, to the straightforward expansion of the scattering
lengths), our method fulfills this criterion remarkably well: we are using the expansion
in powers of the quark masses only for the coefficients $C_0$, $C_1$ and $C_2$, while the curvature
generated by the unitarity cut is evaluated phenomenologically. As discussed in Section 9,
the infrared singularities occurring in the expansion of these quantities have remarkably
small residues. Indeed, truncating the expansion of $C_n$ at order 1, $m$ and $m^2$, respectively,
and solving Eq. (6.5) in the corresponding approximation, we obtain
\[ a_0^0 = 0.197 \to 0.2195 \to 0.220, \]
\[ a_0^2 = -0.0402 \to -0.0446 \to -0.0444, \]
\[ 2a_0^0 - 5a_0^2 = 0.594 \to 0.662 \to 0.663, \qquad (12.1) \]
indicating that the series converges very rapidly. For this reason, we expect the contributions from yet higher orders to be entirely negligible.
The rapid convergence of the series is a virtue of the specific method used to match the
chiral and phenomenological representations. To demonstrate this, we briefly discuss the
alternative approach used in Refs. [5,6], where the results for the various scattering lengths
and effective ranges are obtained by directly evaluating the chiral representation of the
scattering amplitude at threshold. Keeping the values of the effective coupling constants
fixed at the central values and truncating the series at order $m$, $m^2$ and $m^3$, we obtain the
sequence
\[ a_0^0 = 0.159 \to 0.200 \to 0.216, \]
\[ a_0^2 = -0.0454 \to -0.0445 \to -0.0445, \]
\[ 2a_0^0 - 5a_0^2 = 0.545 \to 0.624 \to 0.654. \qquad (12.2) \]
The first terms on the right correspond to Weinberg's formulae. The second and third terms
are in agreement with the old one loop results of Ref. [5] and the two loop results of
Refs. [6,23,46], respectively. As indicated by the difference between the second and third
terms, the corrections of $O(p^6)$ are by no means negligible for a matching at threshold.
This is illustrated in Fig. 1, where the three full circles correspond to the sequence (12.2).
The large, crossed error bars indicate the values quoted in the 1979 compilation of
Ref. [28]. The triangle and the diamond near the center of the figure correspond to set I
and set II of Ref. [6], respectively. The ellipse represents the 68% confidence contour of
our final result in Eq. (11.2). The details of the error analysis that underlies this result are
described in Appendix F.
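The difference in convergence between the two matching procedures can be made explicit with a few lines of arithmetic. The sketch below first evaluates Weinberg's leading order formulae, $a_0^0 = 7M_\pi^2/(32\pi F_\pi^2)$ and $a_0^2 = -M_\pi^2/(16\pi F_\pi^2)$, with the assumed inputs $M_\pi = 139.57$ MeV and $F_\pi = 92.4$ MeV, and then compares the size of the last step in the sequences (12.1) and (12.2). The variable names are chosen for this illustration only.

```python
import math

M_PI = 139.57   # MeV, assumed input
F_PI = 92.4     # MeV, assumed input

# Weinberg's leading order predictions for the S-wave scattering lengths.
a00_lo = 7.0 * M_PI**2 / (32.0 * math.pi * F_PI**2)
a02_lo = -(M_PI**2) / (16.0 * math.pi * F_PI**2)

# Successive truncations quoted in (12.1) (matching at s = 0)
# and in (12.2) (direct evaluation at threshold).
seq_matching = {
    "a00": [0.197, 0.2195, 0.220],
    "a02": [-0.0402, -0.0446, -0.0444],
}
seq_threshold = {
    "a00": [0.159, 0.200, 0.216],
    "a02": [-0.0454, -0.0445, -0.0445],
}


def last_step(sequence):
    """Magnitude of the change between the last two truncations."""
    return abs(sequence[-1] - sequence[-2])


print(f"leading order: a00 = {a00_lo:.3f}, a02 = {a02_lo:.4f}")
print(f"last step in a00, matching at s = 0 : {last_step(seq_matching['a00']):.4f}")
print(f"last step in a00, at threshold      : {last_step(seq_threshold['a00']):.4f}")
```

For $a_0^0$, the last step of the threshold expansion is more than an order of magnitude larger than the one found with the matching used here.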
The reason why the straightforward expansion of the scattering lengths in powers of the
quark masses converges rather slowly is that these represent the values of the amplitude
at threshold, that is at the place where the branch cut required by unitarity starts. The
truncated chiral representation does not describe that singularity well enough, particularly
at one loop, where the relevant imaginary parts stem from the tree level approximation.
Fig. 1. Constraints imposed on the S-wave scattering lengths by chiral symmetry. The three full
circles illustrate the convergence of the chiral perturbation series at threshold, according to Eq. (12.2).
The one at the left corresponds to Weinberg's leading order formulae. The error ellipse represents
our final result, while the narrow, curved band indicates the region allowed in generalized CHPT.
If the effective coupling constants are the same, the only difference between our method
and a matching at threshold is the one between the functions $W^I(s)$ and $U^I(s)$. In
particular, the results for $a_0^0$, $a_0^2$ only differ because the numerical values of $W^I(s)$ and
$U^I(s)$ at $s = 4M_\pi^2$ are not the same. As mentioned above, the difference between the two
sets of functions affects the scattering amplitude only at $O(p^8)$ and beyond. Numerically,
however, it is not irrelevant which one of the two is used to describe the effects generated
by the unitarity cuts: while the functions $W^I(s)$ account for the imaginary parts of the S-
and P-waves to the accuracy to which these are known, the quantities $U^I(s)$ represent a
comparatively crude approximation, obtained by evaluating the imaginary parts with the
one-loop representation.
13. Correlation between $a_0^0$ and $a_0^2$

As mentioned earlier, the main difference between generalized chiral perturbation theory
and the standard one used in the present paper resides in the coupling constant $\bar\ell_3$. Apart
from that, the formulae are identical; only the bookkeeping for the chiral power of the
Fig. 2. S-wave scattering lengths as functions of $\bar\ell_3$.
quark mass matrix is different.$^3$ In particular, the relation between the scalar radius and the
coupling constant $\bar\ell_4$ also holds in that framework, but there is no prediction for the S-wave
scattering lengths $a_0^0$ and $a_0^2$, because these involve the coupling constant $\bar\ell_3$. The fact that
$\bar\ell_4$ is strongly constrained by the value of the scalar radius implies, however, that there is a
strong correlation between $a_0^0$ and $a_0^2$, independently of whether the quark condensate is the
leading order parameter: apart from higher order corrections, both of these are controlled
by the same parameter $\bar\ell_3$. The dependence is approximately described by the parabolas
\[ a_0^0 = 0.225 - 1.6 \times 10^{-3}\,\bar\ell_3 - 1.3 \times 10^{-5}\,(\bar\ell_3)^2, \]
\[ a_0^2 = -0.0433 - 3.6 \times 10^{-4}\,\bar\ell_3 - 4.3 \times 10^{-6}\,(\bar\ell_3)^2, \qquad (13.1) \]
which are displayed in Fig. 2. Note that the interval shown far exceeds the range relevant
for the standard picture, which is indicated by the vertical bar.
Eliminating the parameter $\bar\ell_3$, we obtain a correlation between $a_0^0$ and $a_0^2$:
\[ a_0^2 = -0.0444 \pm 0.0008 + 0.236\,(a_0^0 - 0.22) - 0.61\,(a_0^0 - 0.22)^2 - 9.9\,(a_0^0 - 0.22)^3. \qquad (13.2) \]
The error given accounts for the various sources of uncertainty in our input; evaluating
these as described in Appendix F, we find that they are nearly independent of $a_0^0$. The
correlation is indicated in Fig. 1: the values of $a_0^0$ and $a_0^2$ are constrained to the narrow strip,
whose boundaries touch the error ellipse associated with the standard picture. As discussed
in Ref. [7], a qualitatively similar correlation also results from the Olsson sum rule [47];
the two conditions are perfectly compatible, but the one above is considerably more
stringent. Fig. 1 also shows that for $a_0^0 < 0.18$, or $\bar\ell_3 > 25$, the center of the region allowed
by the correlation falls outside the universal band, which is indicated by the tilted lines.
The same happens on the opposite side, for $a_0^0 > 0.28$, $\bar\ell_3 < -54$. Since the Roy equations
only admit solutions if the two subtraction constants $a_0^0$ and $a_0^2$ are in the universal band,
$^3$ If $\bar\ell_3$ is large, the symmetry breaking effects generated by the quark masses are larger than in the standard
framework, so that a reordering of the series that gives these more weight is called for.
Fig. 3. Phase relevant for the decay $K \to \pi\pi e\nu$. The three bands correspond to the three indicated
values of the S-wave scattering length $a_0^0$. The uncertainties are dominated by those from the
experimental input used in the Roy equations. The triangles are the data points of Rosselet et al. [29],
while the full circles represent the preliminary E865 results [31].
exceedingly large values of $\bar\ell_3$ are thus excluded. Note also that the correlation implies an
upper bound on the $I = 2$ scattering length: $a_0^2 < -0.035$.
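The relation between the parabolas (13.1) and the correlation (13.2) can be checked numerically. The sketch below takes both formulae with all coefficients other than the leading term of $a_0^0$ and the linear term of (13.2) entering with a negative sign, which is an assumption of this illustration, scans $\bar\ell_3$ over an interval chosen here for convenience, and compares the value of $a_0^2$ obtained from the parabola with the one obtained from the correlation. It also locates the maximum of the $a_0^2$ parabola, which is the origin of the upper bound on the $I = 2$ scattering length. The function names are hypothetical.

```python
def a00_of_l3(l3):
    """Parabola (13.1) for the I = 0 scattering length."""
    return 0.225 - 1.6e-3 * l3 - 1.3e-5 * l3**2


def a02_of_l3(l3):
    """Parabola (13.1) for the I = 2 scattering length."""
    return -0.0433 - 3.6e-4 * l3 - 4.3e-6 * l3**2


def a02_from_a00(a00):
    """Central value of the correlation (13.2)."""
    x = a00 - 0.22
    return -0.0444 + 0.236 * x - 0.61 * x**2 - 9.9 * x**3


# Largest mismatch between the two descriptions for -20 <= l3 <= 25.
max_dev = max(
    abs(a02_of_l3(l3) - a02_from_a00(a00_of_l3(l3))) for l3 in range(-20, 26)
)

# Vertex of the a02 parabola: the largest value a02 can reach.
l3_at_max = -3.6e-4 / (2 * 4.3e-6)
a02_max = a02_of_l3(l3_at_max)

print(f"largest deviation between (13.1) and (13.2): {max_dev:.5f}")
print(f"maximum of a02: {a02_max:.4f} at l3 = {l3_at_max:.1f}")
```

In the scanned interval the two descriptions differ by much less than the uncertainty $\pm 0.0008$ attached to (13.2), and the maximum of the parabola lies slightly below $-0.035$.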
The correlation between $a_0^2$ and $a_0^0$ can be used, for instance, to analyze the information
about the phase difference $\delta_0^0 - \delta_1^1$ obtained from the decay $K \to \pi\pi e\nu$. At the low
energies occurring there, this difference is dominated by the contribution $\propto a_0^0$ from the
$I = 0$ S-wave scattering length. The relation (13.2) allows us to correct for the higher
order terms of the threshold expansion: the phase difference can be expressed in terms of
the energy and the value of $a_0^0$, up to very small uncertainties. This is illustrated in Fig. 3:
the center of the three narrow bands shown is obtained by fixing the value of $a_0^2$ with the
correlation (13.2) and inserting the result in the numerical parametrization of the phase
shifts in Appendix D of Ref. [7]. At a given value of $a_0^0$, the uncertainties in the result for
the phase difference $\delta_0^0(s) - \delta_1^1(s)$ are dominated by the one in the experimental input used
for the $I = 0$ S-wave. Near threshold, the uncertainties are proportional to $(s - 4M_\pi^2)^{3/2}$;
in the range shown, they amount to less than a third of a degree. While the data of Rosselet
et al. [29] are consistent with all three of the indicated values of $a_0^0$, the preliminary results
of the E865 experiment at Brookhaven [30,31] are not. Instead they beautifully confirm
the prediction (11.2): the best fit to these data is obtained for $a_0^0 = 0.218$, with $\chi^2 = 5.7$
for 5 degrees of freedom. As pointed out in Ref. [48], the correlation (13.2) can be used
to convert data on the phase difference into data on the scattering lengths. For a detailed
discussion of the consequences for the value of $a_0^0$, we refer to [48,49].
Fig. 4. Values of the coupling constants $\bar\ell_1$ and $\bar\ell_2$. The shaded ellipse shows the result of our
calculation. The rectangles indicate the ranges quoted in Refs. [10,23,25]. The triangle and the
diamond correspond to set I and set II of [6], respectively. The cross represents the resonance
saturation estimate of Ref. [50]. The full circle is the result obtained by matching at one loop and the
thin ellipse close to it represents the uncertainties in the effective one loop couplings
$\bar\ell_1^{\mathrm{eff}}$, $\bar\ell_2^{\mathrm{eff}}$.
14. Results for $\bar\ell_1$ and $\bar\ell_2$

The effective coupling constants of $\mathcal{L}_4$ enter the chiral perturbation theory representation
of the scattering amplitude and of the scalar form factor only as corrections, so that our
results for these are subject to significantly larger uncertainties than those for $a_0^0$, $a_0^2$.
According to Table 1, we obtain
\[ \bar\ell_1 = -0.4 \pm 0.6, \qquad \bar\ell_2 = 4.3 \pm 0.1. \qquad (14.1) \]
The noise in the symmetry breaking couplings $r_n$ of $\mathcal{L}_6$ and the one in the Roy equation
input yield comparable contributions, while those from the other entries are negligibly
small. The corresponding error ellipse is shown in Fig. 4.
In order to investigate the uncertainties due to the neglected higher order terms, we again
compare this with what is found if the phenomenological representation is matched to the
one loop approximation of the chiral perturbation series. For the central values of the input
parameters, the solution of the matching conditions then occurs at $\bar\ell_1 = -1.8$, $\bar\ell_2 = 5.4$:
the two loop effects shift the one loop result by about $+1.4$ and $-1.1$ units, respectively.
The shift arises from the fact that the expansion of the coefficients $c_3$ and $c_4$ contains
very strong infrared singularities at first nonleading order. Analogous contributions also
occur in $c_1$ and $c_2$, at next-to-next-to-leading order, but in the combinations $C_0$, $C_1$, $C_2$
that matter for the determination of the scattering lengths, these singularities only generate
very small effects: in these quantities, the contributions of order $p^4$ amount to less than 1%.
We conclude that, unlike the result for $a_0^0$, $a_0^2$, where the uncertainties from the neglected
higher order terms are tiny, the one for $\bar\ell_1$ and $\bar\ell_2$ is sensitive to these. Although we expect
the corresponding contributions to be small compared to the first order shift given above,
they might be of the same order as those from the uncertainties in our input; we do not
offer a quantitative guess.
The couplings $\bar\ell_1$ and $\bar\ell_2$ are quark mass independent, whereas the physical quantities
used to estimate their values incorporate quark mass effects. As a result of this, it
is problematic to rely on phenomenological determinations based on the one loop
approximation when analyzing quantities at two loop order. The large infrared singularities
that accompany the contributions from $\bar\ell_1$ and $\bar\ell_2$ are automatically accounted for in the two
loop representation, but are missing in the framework of a one loop calculation: in the
phenomenological analysis, their contributions are lumped into those from the coupling
constants. As an illustration, we mention the set I of couplings introduced in [6], which uses
the one-loop values for $\bar\ell_1$ and $\bar\ell_2$, but leads to D-wave scattering lengths that do not agree
well with the values extracted from experiment, as was first pointed out in Ref. [25]. For a
detailed discussion of this issue, we refer to [50].
We now show that, once the shift in the values of $\bar\ell_1$, $\bar\ell_2$ is accounted for, the one and
two loop representations for the coefficients $c_1, \dots, c_4$ become nearly the same, so that
the results obtained by matching the phenomenological representation with the chiral one
at two loop level nearly coincide with those found in the one loop approximation. The
infrared singularities responsible for that shift are those contained in the coefficients $\bar b_3$,
$\bar b_4$. If we solve the expressions for these coefficients in one loop approximation, we obtain
\[ \bar\ell_1^{\mathrm{eff}} \equiv 3(\bar b_3 - \bar b_4) + \frac{4}{3}, \qquad \bar\ell_2^{\mathrm{eff}} \equiv 6\,\bar b_4 + \frac{5}{6}. \qquad (14.2) \]
The expansion of these quantities in powers of the quark masses starts with
$\bar\ell_n^{\mathrm{eff}} = \bar\ell_n + O(\xi)$. The infrared singularities generated by the two loop graphs show up in the terms of
order $\xi$, in particular through contributions proportional to $L^2 = \ln^2(\mu^2/M_\pi^2)$, which are
very important numerically. Accounting for the uncertainties in our input, we obtain
\[ \bar\ell_1^{\mathrm{eff}} = -1.9 \pm 0.2, \qquad \bar\ell_2^{\mathrm{eff}} = 5.25 \pm 0.04. \qquad (14.3) \]
The comparison with the values $\bar\ell_1 = -1.8$, $\bar\ell_2 = 5.4$, found when matching at one loop,
shows that the couplings relevant in the context of the one loop approximation may indeed
be characterized in this manner (compare Fig. 4, where the values for $\bar\ell_1$, $\bar\ell_2$ obtained at
one loop are indicated by the full circle, while the thin ellipse corresponds to the above
numerical result for $\bar\ell_1^{\mathrm{eff}}$, $\bar\ell_2^{\mathrm{eff}}$).
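The effective couplings defined in (14.2) are easily evaluated. The sketch below inserts central values $\bar b_3 = -0.33$ and $\bar b_4 = 0.74$, taken from the results of Section 16 with the signs assumed here, into the one loop relations $\bar\ell_1^{\mathrm{eff}} = 3(\bar b_3 - \bar b_4) + 4/3$ and $\bar\ell_2^{\mathrm{eff}} = 6\bar b_4 + 5/6$. The function names are hypothetical, and only central values are propagated, so agreement with (14.3) is expected within the quoted errors rather than digit by digit.

```python
def l1_eff(b3, b4):
    """Effective one loop coupling l1 from the coefficients b3, b4."""
    return 3.0 * (b3 - b4) + 4.0 / 3.0


def l2_eff(b4):
    """Effective one loop coupling l2 from the coefficient b4."""
    return 6.0 * b4 + 5.0 / 6.0


# Central values of b3 and b4 (signs as assumed in the lead-in).
B3, B4 = -0.33, 0.74

l1 = l1_eff(B3, B4)
l2 = l2_eff(B4)

# Distance to the values found when matching at one loop.
shift_l1 = l1 - (-1.8)
shift_l2 = l2 - 5.4

print(f"l1_eff = {l1:.2f}  (one loop matching: -1.8, shift {shift_l1:+.2f})")
print(f"l2_eff = {l2:.2f}  (one loop matching:  5.4, shift {shift_l2:+.2f})")
```

Both effective couplings stay within about 0.15 of the values found when matching at one loop, far closer than the two loop results (14.1).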
Now comes the point we wish to make: we may also evaluate the one loop formulae (6.1)
for $c_1$, $c_2$, replacing $\bar\ell_1$, $\bar\ell_2$ by the above effective values. The outcome differs from what is
obtained with the two loop formulae only by a fraction of a percent; the difference is in
the noise of the two loop result. In this sense, the main effect of the infrared singularities in
the two loop graphs amounts to a shift in the values of the coupling constants $\bar\ell_1$, $\bar\ell_2$. This
explains why the matching conditions used in the present paper yield very accurate results
for the S-wave scattering lengths already at one loop, while the corresponding results for
these two couplings are off.
The literature contains quite a few determinations of the coupling constants $\bar\ell_1$ and $\bar\ell_2$
that are based on the one loop approximation of chiral perturbation theory [10–17], starting
with the estimates $\bar\ell_1 = -2.3 \pm 3.7$, $\bar\ell_2 = 6.0 \pm 1.3$ given in Ref. [10], which are perfectly
consistent with our result for $\bar\ell_1^{\mathrm{eff}}$, $\bar\ell_2^{\mathrm{eff}}$. Note that, in the case of $\bar\ell_2$, the shift generated by
the two loop graphs takes the result outside the quoted range (as stated in Ref. [10], that
range only measures the accuracy to which the first order corrections can be calculated and
does not include an estimate of contributions due to higher order terms).
The results for the effective coupling constants obtained by Girlanda et al. [25] read
$\bar\ell_1 = -0.37 \pm 0.95 \pm 1.71$, $\bar\ell_2 = 4.17 \pm 0.19 \pm 0.43$. The first error comes from the evaluation
of the integrals over the imaginary parts, while the second reflects the uncertainties in the
contributions from the couplings of $\mathcal{L}_6$. Our results in Eq. (14.1) confirm these numbers,
with substantially smaller errors; we repeat, however, that these only account for the
noise seen in our calculation.
Amoros, Bijnens and Talavera [23] have extracted values for the coupling constants of
$\mathcal{L}_4$ from their two loop analysis of the $K_{e4}$ form factors, which is based on SU(3)$_R$ $\times$
SU(3)$_L$ chiral perturbation theory, and obtain $\bar\ell_1 = 0.4 \pm 2.4$, $\bar\ell_2 = 4.9 \pm 1.0$. Fig. 4
shows that these are perfectly consistent with ours. As these authors are relying on the one
loop relations between the coupling constants $L_n$ of that framework and the couplings $\bar\ell_n$
relevant for SU(2)$_R$ $\times$ SU(2)$_L$, the results are accompanied by comparatively large errors.
15. Values of $\bar\ell_4$, $r_5$ and $r_6$

For the central values of the input, the matching conditions lead to $\bar\ell_4 = 4.39$ (first row in
Table 1). The uncertainties in this number due to the various sources of error are dominated
by the one in the scalar radius and the noise in the symmetry breaking coupling constants
$r_1$, $r_2$, $r_3$, $r_4$, $r_{S_2}$ of $\mathcal{L}_6$. In order to estimate the uncertainties due to the higher order
effects that our calculation neglects, we compare the above two loop result with the value
$\bar\ell_4 = 4.60$, obtained by truncating the chiral representation for the scalar radius at leading
order. The comparison shows that the shift generated by the two loop contributions is of
the same size as the one due to the uncertainty in the scalar radius. Those from yet higher
orders are expected to be significantly smaller, so that the uncertainty in the final result is
dominated by the sources of error listed in the table. The net result reads
\[ \bar\ell_4 = 4.4 \pm 0.2. \qquad (15.1) \]
The number is consistent with the one loop estimate $\bar\ell_4 = 4.3 \pm 0.9$, given in Ref. [10].
The infrared singularities that accompany the coupling constant $\bar\ell_4$ are much weaker than
those occurring together with $\bar\ell_1$, $\bar\ell_2$. The same is true also for $\bar\ell_3$, where the uncertainties
are much too large for such effects to matter at all.
The above result confirms the value $\bar\ell_4 = 4.4 \pm 0.3$, obtained by Bijnens, Colangelo
and Talavera [22], from a comparison of the two loop representation with the dispersive
result of the scalar radius, but this was to be expected, because the input used in the two
evaluations is nearly the same.
In the framework of the calculation mentioned in Section 14, Amoros, Bijnens and
Talavera [23] obtain $\bar\ell_4 = 4.2 \pm 0.18$, also consistent with our result (as emphasized by
these authors, the error bar does not account for the uncertainties due to higher order
effects, which in their approach are quite substantial).
The coupling constants $r_n = (4\pi)^4 r_n^r(\mu)$ are scale dependent. We could introduce
corresponding scale independent quantities, analogous to the terms $\bar\ell_n$ used for the coupling
constants of $\mathcal{L}_4$. The scale dependence is rather complicated, however, because it is
quadratic in $\ln\mu$. We instead quote the values obtained for $\mu = M_\rho = 0.77$ GeV. Our
analysis does not shed any light on the symmetry breaking coupling constants $r_1, \dots, r_4$,
which belong to the input of our calculation, but we can determine $r_5$ and $r_6$ from the
matching conditions for $c_5$ and $c_6$, which we did not yet make use of. Numerically, we
find:
\[ r_5 = 3.8 \pm 1.0, \qquad r_6 = 1.0 \pm 0.1. \qquad (15.2) \]
Table 1 shows that the noise seen in our calculation is dominated by the one in the estimates
for the symmetry breaking coupling constants $r_1, \dots, r_4$. Note that the error bars do not
account for the uncertainties due to higher order contributions; our evaluation does not
give us any handle on these.
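For orientation, the rescaled couplings can be converted back to the renormalized ones. The sketch below assumes that the normalization used in this section is $r_n = (4\pi)^4 r_n^r(\mu)$, which is our reading of the definition given above, and simply divides the central values of (15.2) by $(4\pi)^4$. The variable names are chosen for this illustration.

```python
import math

# Conversion factor between the rescaled and the renormalized couplings,
# assuming the normalization r_n = (4 pi)^4 r_n^r(mu).
FACTOR = (4.0 * math.pi) ** 4

# Central values and errors of the rescaled couplings at mu = 0.77 GeV.
rescaled = {"r5": (3.8, 1.0), "r6": (1.0, 0.1)}

# Renormalized couplings r_n^r(mu) and their errors.
renormalized = {
    name: (value / FACTOR, error / FACTOR)
    for name, (value, error) in rescaled.items()
}

print(f"(4 pi)^4 = {FACTOR:.1f}")
for name, (value, error) in renormalized.items():
    print(f"{name}^r = ({value * 1e4:.2f} +/- {error * 1e4:.2f}) x 10^-4")
```

The factor $(4\pi)^4 \approx 2.5 \times 10^4$ explains why rescaled couplings of order one correspond to renormalized couplings of order $10^{-4}$.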
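The resonance estimates of Refs. [6,22,45] offer a test: they lead to
\[ r_5 \simeq 2.7, \qquad r_6 \simeq 0.75, \qquad (15.3) \]
and thus corroborate the outcome of our analysis, both in sign and in magnitude. In fact,
as pointed out by Ecker [50], the estimates
\[ \bar\ell_1 \simeq -0.7, \quad \bar\ell_2 \simeq 5.0, \quad \bar\ell_3 \simeq 1.9, \quad \bar\ell_4 \simeq 3.7, \qquad (15.4) \]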
obtained from resonance saturation of sum rules [40], are perfectly consistent with the
numbers found at two loop accuracy. We conclude that there is good evidence for the
picture drawn in Ref. [10] to be valid: the values of all of the effective coupling constants
encountered in the two loop representation of the scattering amplitude are consistent
with the assumption that these are dominated by the contributions from the singularities
due to the exchange of the lightest non-Goldstone states. Admittedly, this assumption
does not lead to very sharp values, because the separation of the resonance contributions
from the continuum underneath is not unique. The problem manifests itself in the scale
dependence of the coupling constants: resonance saturation can literally hold only at
one particular scale. Also, it is not a straightforward matter to formulate the resonance
saturation hypothesis for singularities due to the exchange of particles of spin two or
higher [20,21]. Even so, we consider it important that the values found for the coupling
constants are within the noise inherent in the assumption that, once the poles and cuts due
to the Goldstone bosons are removed, the low energy behaviour of the scattering amplitude
is dominated by the singularities due to the remaining states. Since these remain massive
in the chiral limit, their contributions to the chiral expansion are suppressed by powers of
momenta or quark masses, but they do show up at nonleading orders.
16. The coefficients $\bar b_1, \dots, \bar b_6$

The matching conditions (4.2) express the coefficients cn of the chiral representation
in terms of the S-wave scattering lengths and moments of the imaginary parts. Inserting
the numerical representation for the dependence of the moments on the scattering lengths
and comparing the result with Eq. (B.2), we obtain the following representation for the
coefficients introduced in Ref. [6]:
2
2

b1 = 0.1 0.1 21 *a00 + 1670 *a02 + 9 *a00 + 96 *a00 *a02 972 *a02 ,
2
2

b2 = 8.2 0.4 + 179 *a00 602 *a02 135 *a00 + 315 *a00 *a02 65 *a02 ,
2
2

b3 = 0.41 0.06 + 3.5 *a00 12.9 *a02 + 7 *a00 30 *a00 *a02 + 40 *a02 ,

2
2
b4 = 0.71 0.01 + 1.3 *a00 4.1 *a02 *a00 4 *a00 *a02 + 25 *a02 ,
2

b5 = 2.99 0.35 + 32.6 *a00 97.0 *a02 + 104 *a00 451 *a00 *a02
2

+ 602 *a02 ,
2
2

b6 = 2.18 0.01 + 7.2 *a00 28.4 *a02 3 *a00 + 9 *a00 *a02 62 *a02 ,
(16.1)
with $\Delta a_0^0 = a_0^0 - 0.225$, $\Delta a_0^2 = a_0^2 + 0.03706$. The error bars indicate the uncertainties in
the outcome due to those in the experimental input used when solving the Roy equations.
The representation holds for arbitrary values of the scattering lengths in the vicinity of the
point of reference. Inserting our results from (11.2) and adding errors quadratically, we
finally obtain
\[ \bar b_1 = -12.4 \pm 1.6, \qquad \bar b_2 = 11.8 \pm 0.6, \qquad \bar b_3 = -0.33 \pm 0.07, \]
\[ \bar b_4 = 0.74 \pm 0.01, \qquad \bar b_5 = 3.58 \pm 0.37, \qquad \bar b_6 = 2.35 \pm 0.02. \qquad (16.2) \]
We emphasize that the error bars only indicate the noise seen in our evaluation. In
$\bar b_1, \dots, \bar b_4$, the two loop representation does account for the contributions of
next-to-leading order, but in the case of $\bar b_5$, $\bar b_6$, it only yields the leading terms; these quantities
are particularly sensitive to the neglected higher orders.
The above results may be compared with the values found in the literature. Girlanda,
Knecht, Moussallam and Stern [25] work within generalized chiral perturbation theory
and do not have a prediction for the magnitude of the coefficients $\bar b_1$ and $\bar b_2$, because the
corresponding expressions contain the two free parameters $\alpha$ and $\beta$. In their framework,
the analogs of the constants $\bar b_3, \dots, \bar b_6$ are denoted by $\lambda_1, \dots, \lambda_4$. The explicit relation
between the two sets of quantities is given in Eq. (A.1). In our notation, the numerical
values of Ref. [25] correspond to $\bar b_3 = -0.56 \pm 0.37$, $\bar b_4 = 0.737 \pm 0.039$, $\bar b_5 = 3.25 \pm 1.50$,
$\bar b_6 = 2.42 \pm 0.22$ and are perfectly consistent with our results, where the errors are smaller.
The result for $\bar b_1$ and $\bar b_2$, obtained above within the standard framework, amounts to a
prediction for the magnitude of $\alpha$ and $\beta$. Numerically, we obtain
\[ \alpha = 1.08 \pm 0.07, \qquad \beta = 1.12 \pm 0.01. \qquad (16.3) \]
Fig. 5. Result for b3 and b4 . The errors in our result are dominated by those in the experimental input
used when solving the Roy equations: the nearly degenerate ellipse indicates the result obtained if
these could be ignored. The rectangles correspond to the values quoted in Refs. [19,25], while the
diamond marks the one obtained in Ref. [6], set II.
Fig. 6. Result for b5 and b6 . The strip between the two horizontal lines corresponds to the value for
b6 of Wanders [19].
Wanders [19] has obtained values for the coefficients $b_3$, $b_4$ and $b_6$ from manifestly
crossing symmetric dispersion relations. Matching the chiral and dispersive representations
at the center of the Mandelstam triangle, he obtains the values $b_3 = -0.403 \pm 0.032$,
$b_4 = 0.719 \pm 0.024$, $b_6 = 2.29 \pm 0.075$, which are also consistent with our numbers, see
Figs. 5 and 6. Note that the quoted errors only account for the uncertainties arising from
the procedure used in Ref. [19] and do not cover those in the input. Fig. 5 shows that, in the
case of $b_3$, the experimental input in the Roy equations represents the dominating source
of error.
Amoros, Bijnens and Talavera [23] have determined the coefficients $b_n$ on the basis of
their analysis of the $K_{e4}$ form factors, referred to earlier. The results for the coefficients
$b_3, \ldots, b_6$ are accompanied by rather large errors and we do not list these here, but merely
note that the central values in Eq. (16.2) are within the quoted range, in all cases. For
the first two terms, however, Amoros et al. arrive at comparatively accurate values,
$b_1 = -10.8 \pm 3.3$, $b_2 = 10.8 \pm 3.2$, which are also perfectly consistent with those in Eq. (16.2).
The fact that, in their analysis, the remaining coefficients are subject to large uncertainties
also manifests itself in column C of Table 2: the error bars in the first five rows of the table,
$a_0^0, \ldots, a_1^1$, are much smaller than those in the remainder.
17. S- and P -wave phase shifts

For the reasons discussed in detail in Ref. [7], the two S-wave scattering lengths are
the essential parameters in the low energy domain. The result in Eq. (11.2) specifies these
to within very small uncertainties. In particular, we can now work out the phase shifts of
the S- and P-waves on this basis, using the Roy equation analysis of [7]. The available
experimental information for the imaginary parts above $\sqrt{s_0} = 0.8$ GeV, as well as the
scattering lengths $a_0^0$, $a_0^2$ are used as an input, while the output of the calculation consists
of the phases $\delta_0^0(s)$, $\delta_1^1(s)$ and $\delta_0^2(s)$, in the region below $s_0$. In view of the two subtractions
occurring in the Roy equations, the behaviour of the imaginary parts above 1 GeV has
very little influence on the behaviour of the solutions below 0.8 GeV. Also, there is a
consistency check: in the region above s0 , the output must agree with the input. For the
values of the scattering lengths required by chiral symmetry, this condition is indeed met.
In fact, the solutions of the Roy equations closely follow the input, within the rather broad
range of variations allowed for the imaginary parts in Ref. [7]. This also means that the
Roy equations do not strongly constrain the behaviour of the phases above 0.8 GeV.
The result is shown in Figs. 7, 8 and 9. For comparison, these figures also show the data
points of the phase shift analyses given by Hyams et al. [51], Protopopescu et al. [52], the
solutions A and B of Hoogland et al. (ACM) [53] and the one of Losty et al. [54], as well as
the P-wave phase extracted from the data on the reactions $e^+e^- \to \pi^+\pi^-$ and $\tau \to \nu_\tau \pi^-\pi^0$.
For further information on the S-wave phase shifts, we refer the reader to [55,56].
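The three central curves are described by the parametrization [57]
$$
\tan\delta_\ell^I = \sqrt{1 - \frac{4M_\pi^2}{s}}\; q^{2\ell}
\left\{ A_\ell^I + B_\ell^I q^2 + C_\ell^I q^4 + D_\ell^I q^6 \right\}
\left( \frac{4M_\pi^2 - s_\ell^I}{s - s_\ell^I} \right),
\tag{17.1}
$$
with $s = 4(M_\pi^2 + q^2)$. The numerical values of the coefficients are: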
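$$
\begin{aligned}
A_0^0 &= 0.220, & A_1^1 &= 0.379 \cdot 10^{-1}, & A_0^2 &= -0.444 \cdot 10^{-1},\\
B_0^0 &= 0.268, & B_1^1 &= 0.140 \cdot 10^{-4}, & B_0^2 &= -0.857 \cdot 10^{-1},\\
C_0^0 &= -0.139 \cdot 10^{-1}, & C_1^1 &= -0.673 \cdot 10^{-4}, & C_0^2 &= -0.221 \cdot 10^{-2},\\
D_0^0 &= -0.139 \cdot 10^{-2}, & D_1^1 &= 0.163 \cdot 10^{-7}, & D_0^2 &= -0.129 \cdot 10^{-3},
\end{aligned}
\tag{17.2}
$$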
Fig. 7. I = 0 S-wave phase shift. The full line results with the central values of the scattering
lengths and of the experimental input used in the Roy equations. The shaded region corresponds
to the uncertainties of the result. The dotted lines indicate the boundaries of the region allowed if the
constraints imposed by chiral symmetry are ignored [7]. The data points are from Refs. [51,52].
in units of $M_\pi$. In particular, the constants $A_\ell^I$ represent the scattering lengths of the three
partial waves under consideration, while the $B_\ell^I$ are related to the effective ranges.
The parameters $s_\ell^I$ specify the value of $s$ where $\delta_\ell^I(s)$ passes through $90^\circ$:
$$
s_0^0 = 36.77\,M_\pi^2, \qquad s_1^1 = 30.72\,M_\pi^2, \qquad s_0^2 = -21.62\,M_\pi^2. \tag{17.3}
$$
In the channels with $I = 0, 1$, the corresponding energies are 846 MeV and 774 MeV,
respectively (the negative sign of $s_0^2$ indicates that in the $I = 2$ channel, which is exotic,
the phase remains below $90^\circ$).
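As a quick cross-check of the two energies quoted above, the values of $s_\ell^I$ in Eq. (17.3) can be converted from units of $M_\pi^2$ to MeV. The sketch below assumes the charged pion mass $M_\pi = 139.57$ MeV, which is not stated in this section; the dictionary and function names are illustrative only.

```python
import math

# Assumption: charged pion mass in MeV (not quoted in this section).
M_PI_MEV = 139.57

# s_l^I in units of M_pi^2, as read from Eq. (17.3); only the channels
# where the phase actually reaches 90 degrees (positive s_l^I) are listed.
S_90 = {"I=0 S-wave": 36.77, "I=1 P-wave": 30.72}

def energy_mev(s_in_pion_units, m_pi=M_PI_MEV):
    """Energy sqrt(s) in MeV for s given in units of M_pi^2."""
    return math.sqrt(s_in_pion_units) * m_pi

energies = {name: energy_mev(s) for name, s in S_90.items()}
for name, e in energies.items():
    print(f"{name}: phase passes through 90 degrees at {e:.0f} MeV")
```

With the assumed pion mass this gives 846 MeV and 774 MeV, in line with the numbers in the text.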
The value of the phase difference $\delta_0^0 - \delta_0^2$ at $s = M_K^2$ is of special interest, in connection
with the decays $K \to \pi\pi$. In particular, the phase of $\varepsilon'/\varepsilon$ is determined by that phase
difference. Our representation of the scattering amplitude allows us to pin this quantity
down at the 3% level of accuracy:
$$
\delta_0^0(M_{K^0}^2) - \delta_0^2(M_{K^0}^2) = 47.7^\circ \pm 1.5^\circ. \tag{17.4}
$$
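The central value in Eq. (17.4) can be approximately reproduced from the parametrization (17.1). The following is a minimal sketch, not the authors' code: it assumes $M_\pi = 139.57$ MeV and $M_{K^0} = 497.6$ MeV (neither value is quoted in this section) and takes the signs and powers of ten of the coefficients as we read them from Eqs. (17.2) and (17.3), where the extracted text is damaged. The names `WAVES` and `phase_deg` are illustrative.

```python
import math

M_PI = 139.57   # MeV, assumed charged pion mass
M_K0 = 497.6    # MeV, assumed neutral kaon mass

# (ell, A, B, C, D, s_res) in units of M_pi, as we read Eqs. (17.2), (17.3).
# Signs and exponents are reconstructed from a damaged text: an assumption.
WAVES = {
    "S0": (0, 0.220, 0.268, -0.139e-1, -0.139e-2, 36.77),
    "S2": (0, -0.444e-1, -0.857e-1, -0.221e-2, -0.129e-3, -21.62),
}

def phase_deg(wave, sqrt_s_mev):
    """Phase shift in degrees from Eq. (17.1).

    Uses the principal branch of atan, so it is only meant for energies
    below the point where the phase reaches 90 degrees.
    """
    ell, a, b, c, d, s_res = WAVES[wave]
    s = (sqrt_s_mev / M_PI) ** 2          # s in units of M_pi^2
    q2 = s / 4.0 - 1.0                    # from s = 4 (M_pi^2 + q^2)
    poly = a + b * q2 + c * q2**2 + d * q2**3
    tan_delta = (math.sqrt(1.0 - 4.0 / s) * q2**ell * poly
                 * (4.0 - s_res) / (s - s_res))
    return math.degrees(math.atan(tan_delta))

phase_difference = phase_deg("S0", M_K0) - phase_deg("S2", M_K0)
print(f"delta_0^0 - delta_0^2 at the kaon mass: {phase_difference:.1f} degrees")
```

The result lands close to the quoted central value of 47.7 degrees; the error band of Eq. (17.4) is not reproduced, since it reflects the uncertainties of the full analysis rather than of this parametrization.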
We add two remarks concerning the comparison with the P-wave phase shift extracted
from the $e^+e^-$ and $\tau$ data. First, we note that the agreement at 0.8 GeV is enforced by
our approach: in the Roy equation analysis, the value of the phase shift at that energy
represents an input parameter and we have made use of those data to pin it down. Once
Fig. 8. P -wave phase shift. The phase of the pion form factor is also shown, but it can barely be
distinguished from the central result of our analysis. The data points are from Refs. [51,52].
that is done, however, the behaviour of the phase shift at lower energies is unambiguously
fixed: chiral symmetry determines the two subtraction constants, so that the solution of the
Roy equations becomes unique. In other words, disregarding the small effects due to the
uncertainties in the input of our analysis, which are shown in Fig. 8, there is only one
interpolation between threshold and 0.8 GeV that is consistent with the constraints imposed
by analyticity, unitarity and chiral symmetry. Fig. 8 shows that the predicted curve indeed
very closely follows the phase extracted from the $e^+e^-$ and $\tau$ data. This confirms the
conclusions reached in Ref. [58].
Actually, the figure conceals a discrepancy in the threshold region, where the phase is
too small for the effect to be seen by eye: evaluating the P-wave scattering length with the
Gounaris-Sakurai parametrization of the form factor given in Ref. [59] (the curve shown
in the figure), we obtain a result that is smaller than the value for $a_1^1$ in Table 2, by about
10%, that is by many standard deviations of our prediction. The discrepancy is in the noise
of the data on the form factor: there is little experimental information in the threshold
region, so that the behaviour of the form factor is not strongly constrained there. Indeed,
there are alternative parametrizations that also fit the data, but have a distinctly different
behaviour near and below threshold. Even parametrizations with unphysical singularities at
s = 0, such as the one proposed in [60], provide decent fits in the experimentally accessible
Fig. 9. I = 2 S-wave phase shift. The full line results with the central values of the scattering
lengths and of the experimental input used in the Roy equations. The shaded region corresponds
to the uncertainties of the result. The data points represent the two phase shift representations of the
Aachen-Cern-Munich collaboration [53] and the one of Losty et al. [54].
region. In this respect, the present work does add significant information about the P -wave
phase shift, as it predicts the behaviour near threshold, within very narrow limits.
18. Poles on the second sheet

The partial wave amplitude $t_1^1(s)$ contains a pole on the second sheet. Denoting the pole
position by $s = (M_\rho - \frac{i}{2}\Gamma_\rho)^2$, we obtain
$$
M_\rho = 762.4 \pm 1.8\ \mathrm{MeV}, \qquad \Gamma_\rho = 145.2 \pm 2.8\ \mathrm{MeV}. \tag{18.1}
$$
Note that the values quoted for the mass often represent the energy where the real part
of the amplitude vanishes; in contrast to the position of the pole, that value is not
independent of the process considered. As the scattering is approximately elastic there, the
corresponding mass is the energy where the phase shift goes through $90^\circ$. For the P-wave,
this happens at
$$
m_\rho = 773.5 \pm 2.5\ \mathrm{MeV}.
$$
The real part of the pole position is smaller than the energy where the phase shift passes
through $90^\circ$, by about 10 MeV. The uncertainty in $\Gamma_\rho$ is significantly smaller than the error
bar quoted in [7]: the constraints imposed on the scattering amplitude by the low energy
theorems of chiral symmetry also allow a better determination of the width.
The $I = 0$ S-wave also contains a pole on the second sheet. The uncertainties in the pole
position are considerably larger than in the case of the $\rho$, because the singularity is far from
the real axis. Also, the uncertainties in the phase shift are somewhat larger here. Varying
the input parameters as well as the analytic form of the representation used for $t_0^0$, we find
that the pole occurs in the region $\sqrt{s} = (470 \pm 30) - i\,(295 \pm 20)$ MeV, while the phase
passes through $90^\circ$ at $\sqrt{s} = 844 \pm 13$ MeV.

There is no harm in calling this an unusually broad resonance, but that sheds little light
on the low energy structure of the scattering amplitude. In particular, it should not come as
a surprise if the values for the mass and width of the resonance, obtained on the basis of
the assumption that the pole represents the most important feature in this channel, are very
different from the real and imaginary parts of the energy at which the amplitude actually
has a pole: there is more to the physics of the S-wave than the occurrence of a pole far
from the real axis. A collection of numbers concerning the pole position is given in [61]
and for a recent review of the abundant literature on the subject, we refer to [62]. A recent
discussion in the framework of the NN interaction is given in [63].
We add a remark concerning the physics behind the pole in $t_0^0$; admittedly, the
reasoning is of qualitative nature. In the chiral limit, current algebra predicts
$t_0^0 = s/(16\pi F_\pi^2)$: the amplitude vanishes at threshold, but the real part grows quadratically with
the energy, so that the imaginary part rises with the fourth power. The rapid growth signals
the occurrence of a strong final state interaction. In order to estimate the strength of the
corresponding branch cut, we invoke the inverse amplitude method, replacing the above
formula by $t_0^0 = s/(16\pi F_\pi^2 - i s)$. The virtue of this operation is that, while it retains
the algebraic accuracy of the current algebra approximation, it yields an expression that
does obey elastic unitarity. The formula shows that, in this approximation, the amplitude
contains a pole at $\sqrt{s} = \sqrt{-i\,16\pi}\,F_\pi = 463 - i\,463$ MeV, indeed not far away from the
place where the full amplitude has a pole.
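The pole estimate above is simple enough to verify numerically. The sketch below assumes $F_\pi = 92.4$ MeV (a value not quoted in this section) and uses the unitarized chiral-limit form $t_0^0 = s/(16\pi F_\pi^2 - i s)$; the function name `iam_amplitude` is illustrative.

```python
import cmath
import math

F_PI_MEV = 92.4  # MeV, assumed value of the pion decay constant

def iam_amplitude(s, f_pi=F_PI_MEV):
    """Unitarized chiral-limit I=0 S-wave: t = s / (16 pi F^2 - i s), s in MeV^2."""
    return s / (16.0 * math.pi * f_pi**2 - 1j * s)

# The denominator vanishes at s = -i 16 pi F^2; the principal square root
# has a positive real part and a negative imaginary part.
s_pole = -1j * 16.0 * math.pi * F_PI_MEV**2
sqrt_s_pole = cmath.sqrt(s_pole)
print(f"pole at sqrt(s) = {sqrt_s_pole.real:.0f} - i {abs(sqrt_s_pole.imag):.0f} MeV")

# Elastic unitarity in the chiral limit: Im t = |t|^2 for real s > 0.
t = iam_amplitude(500.0**2)
print(f"Im t = {t.imag:.6f}, |t|^2 = {abs(t)**2:.6f}")
```

With the assumed decay constant the pole comes out at 463 - i 463 MeV, and the second check confirms that the modified formula obeys elastic unitarity on the real axis.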
The physics of the P -wave is very different, because the unitarity cut generated by low
energy intermediate states is very weak. Repeating the above exercise for $t_1^1$, one again
finds a pole with equal real and imaginary parts, but it is entirely fictitious, as it occurs
at $1.1 - i\,1.1$ GeV, far beyond the region where current algebra provides a meaningful
approximation. The occurrence of a pole near the real axis cannot be understood on the
basis of chiral symmetry and unitarity alone.
In the framework of the effective theory, the difference manifests itself as follows.
While the unitarity corrections account perfectly well for the low energy behaviour of the
imaginary parts, the presence of the $\rho$ only shows up in the values of the effective coupling
constants $\ell_1$, $\ell_2$ and $\ell_6$. There is no such pole in $t_1^1$, for instance, if the underlying theory is
identified with the linear $\sigma$-model, and the values of those coupling constants are then very
different [10]. In this sense, the pole in $t_1^1$ reflects a special property of QCD, while the
one in $t_0^0$ can be understood on the basis of the fact that chiral symmetry predicts a strong
unitarity cut: the pole position is related to the magnitude of $F_\pi$.
19. Threshold parameters

The scattering lengths of the partial waves with $\ell \geq 1$, as well as the effective ranges
(also of those of the S-waves) can be expressed in terms of sum rules over the imaginary
parts [64]. The corresponding numerical values are listed in Table 2, together with
the S-wave scattering lengths. Column A indicates our final results, obtained by matching
the phenomenological and chiral representations in the subthreshold region and using the
Roy equations to evaluate the amplitude and its derivatives at threshold. In column B,
we quote the numbers obtained from a direct evaluation of the two loop representation at
threshold, using our central values for the effective coupling constants; this amounts to
truncating the expansion of the threshold parameters in powers of $m_u$ and $m_d$. Column C
lists the results of Ref. [23], where the amplitude is also expanded at threshold, but the
coupling constants are determined on the basis of a two loop analysis of the $K_{e4}$ form
factors (see the next section for a comment concerning these entries). The comparison of
the columns A, B and C clearly shows that two loop chiral perturbation theory works very
well in describing both $\pi\pi$ scattering and $K_{e4}$ decays. For the reasons given in Section 9,
the method described in the present paper yields the smallest error bars. In fact, it is quite
remarkable that the results for the effective range $b_3^1$ in columns B and C represent a decent
estimate: in this case, only the infrared singularities occurring in the expansion in powers
of the quark masses contribute. For comparison, column D lists the values of Ref. [7],
which are obtained by analyzing the available data with the Roy equations and do not
invoke chiral perturbation theory. Finally, column E contains the values of the compilation
in Ref. [28].
Table 2
Threshold parameters. Our results are listed in column A. The numbers in the next two columns are
obtained by evaluating the chiral representation at threshold: the entries under B follow from our
values of the effective coupling constants, while those under C are taken from Ref. [23]. Column D
gives the outcome of a Roy equation analysis that does not invoke chiral symmetry [7], while E
contains the old experimental values [28]
A
Units
a00
0.220 0.005
0.216
0.220 0.005
0.24 0.06
0.26 0.05
b00
0.276 0.006
0.268
0.280 0.011
0.26 0.02
0.25 0.03
a02
b02
a11
b11
a20
b20
a22
b22
a31
b31
0.444 0.010
0.445
0.423 0.010
0.36 0.13
0.28 0.12
101
0.803 0.012
0.808
0.762 0.021
0.79 0.05
0.82 0.08
101 M2
0.379 0.005
0.380
0.380 0.021
0.37 0.02
0.38 0.02
0.567 0.013
0.537
0.58 0.12
0.54 0.04
0.175 0.003
0.176
0.22 0.04
0.17 0.01
0.355 0.014
0.343
0.32 0.10
0.35 0.06
0.170 0.013
0.172
0.29 0.10
0.18 0.08
0.326 0.012
0.339
0.560 0.019
0.545
0.61 0.11
0.58 0.12
0.402 0.018
0.312
0.36 0.02
0.44 0.14
0.36 0.9
M2
101 M2
102 M4
0.17 0.03
102 M4
103 M6
0.13 0.3
103 M4
103 M6
0.34 0.07
0.6 0.2
104 M6
104 M8
20. Quark mass dependence of $M_\pi^2$ and $F_\pi$

The dependence of physical quantities on the quark masses is of interest, in particular,
for the following reason: by now, dynamical quarks with a mass of the order of the physical
value of ms are within reach on the lattice, but it is notoriously difficult to equip the two
lightest quarks with their proper masses. Invariably, the numerical results obtained for the
physical values of mu and md rely on an extrapolation of numerical data. For a recent,
comprehensive review of lattice work on the light quark masses, we refer to [65].
In this connection, chiral perturbation theory may turn out to be useful, because it
predicts the mass dependence in terms of a few constants: the coupling constants of the
effective Lagrangian. Above, we have determined some of these and we now discuss the
consequences for the dependence of $M_\pi^2$, $F_\pi$, $a_0^0$, $a_0^2$ on the masses of the two lightest
quarks: we keep $m_s$ fixed at the physical value and set $m_u = m_d = m$, but vary the value
of $m$ in the range $0 < m < \frac{1}{2} m_s$ (at the upper end of that range, the pion mass is about
500 MeV).
In the preceding sections, we have expressed all of the quantities in terms of the physical
pion mass and the physical decay constant, using the ratio $\xi$ as an expansion parameter.
Also, the logarithmic infrared singularities were normalized at the scale $M_\pi$. In particular,
the coupling constants $\bar\ell_n$ contain a chiral logarithm with unit coefficient, so that they may
be represented as
$$
\bar\ell_n \equiv \ln\frac{\Lambda_n^2}{M_\pi^2}, \tag{20.1}
$$
where $\Lambda_n$ is the intrinsic scale of $\ell_n$ and is independent both of $m$ and of the running
scale $\mu$. In order to explicitly exhibit the quark mass dependence, we replace $M_\pi^2$ by the
variable $M^2 \equiv 2mB$ and also normalize the chiral logarithms at the scale $M$, trading the
quantities $\xi$ and $\bar\ell_n$ for
$$
x \equiv \frac{M^2}{16\pi^2 F^2}, \qquad \tilde\ell_n \equiv \ln\frac{\Lambda_n^2}{M^2},
$$
respectively. According to (7.1), (7.2) the two sets of variables are related by
$$
x = \xi\left\{1 + \tfrac{1}{2}\,\xi\,(\bar\ell_3 + 4\bar\ell_4) + O(\xi^2)\right\}, \qquad
\tilde\ell_n = \bar\ell_n - \tfrac{1}{2}\,\xi\,\bar\ell_3 + O(\xi^2), \tag{20.2}
$$
$$
\xi = x\left\{1 - \tfrac{1}{2}\,x\,(\tilde\ell_3 + 4\tilde\ell_4) + O(x^2)\right\}, \qquad
\bar\ell_n = \tilde\ell_n + \tfrac{1}{2}\,x\,\tilde\ell_3 + O(x^2). \tag{20.3}
$$
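The expansions of $M_\pi^2$ and $F_\pi$ in powers of $m$ are known to next-to-next-to-leading order
[6,66,67]. In the above notation, the explicit expressions may be written in the form [68]:
$$
\begin{aligned}
M_\pi^2 &= M^2\left\{1 - \tfrac{1}{2}\,x\,\tilde\ell_3 + \tfrac{17}{8}\,x^2\,\tilde\ell_M^{\,2} + x^2 k_M + O(x^3)\right\},\\
F_\pi &= F\left\{1 + x\,\tilde\ell_4 - \tfrac{5}{4}\,x^2\,\tilde\ell_F^{\,2} + x^2 k_F + O(x^3)\right\},\\
\tilde\ell_M &= \frac{1}{51}\left(28\,\tilde\ell_1 + 32\,\tilde\ell_2 - 9\,\tilde\ell_3 + 49\right),\\
\tilde\ell_F &= \frac{1}{30}\left(14\,\tilde\ell_1 + 16\,\tilde\ell_2 + 6\,\tilde\ell_3 - 6\,\tilde\ell_4 + 23\right).
\end{aligned}
\tag{20.4}
$$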
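As a consistency check, the conversion between the two sets of variables can be recovered from the first-order terms of the mass and decay constant expansions. The short derivation below is a sketch under two assumptions: that the physical expansion parameter is $\xi = M_\pi^2/(16\pi^2 F_\pi^2)$, and that the tilde and bar notation for the two normalizations of the logarithms is as defined in the comments.

```latex
% Assumed definitions:
%   \xi = M_\pi^2 / (16 \pi^2 F_\pi^2),   x = M^2 / (16 \pi^2 F^2),
%   \bar\ell_n = \ln(\Lambda_n^2 / M_\pi^2),   \tilde\ell_n = \ln(\Lambda_n^2 / M^2).
% First-order input:
%   M_\pi^2 = M^2 (1 - x \tilde\ell_3 / 2),   F_\pi = F (1 + x \tilde\ell_4).
\begin{align*}
\frac{\xi}{x}
  &= \frac{M_\pi^2}{M^2}\,\frac{F^2}{F_\pi^2}
   = \left(1 - \tfrac{1}{2}\,x\,\tilde\ell_3\right)
     \left(1 + x\,\tilde\ell_4\right)^{-2} + O(x^2) \\
  &= 1 - \tfrac{1}{2}\,x\left(\tilde\ell_3 + 4\,\tilde\ell_4\right) + O(x^2), \\[4pt]
\bar\ell_n - \tilde\ell_n
  &= \ln\frac{M^2}{M_\pi^2}
   = -\ln\left(1 - \tfrac{1}{2}\,x\,\tilde\ell_3 + O(x^2)\right)
   = \tfrac{1}{2}\,x\,\tilde\ell_3 + O(x^2).
\end{align*}
```

Both lines agree with the relations labelled (20.3); inverting them to first order in $\xi$ gives the relations labelled (20.2).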
In this representation, the infrared singularities are hidden in the scale invariant quantities
$\tilde\ell_1, \ldots, \tilde\ell_4$. Those generated by the two loop graphs have been completed to a square. The
normalization of the auxiliary quantities $\tilde\ell_M$, $\tilde\ell_F$ is chosen such that their mass dependence
is also of the form
$$
\tilde\ell_M = \ln\frac{\Lambda_M^2}{M^2}, \qquad \tilde\ell_F = \ln\frac{\Lambda_F^2}{M^2}. \tag{20.5}
$$
The constants $k_M$, $k_F$ collect the analytic contributions at order $x^2$ and are independent
of $m$ and $\mu$. By completing the logarithms at order $x^2$ to a square, we have in effect
chosen a particular running scale: the one where the coefficient of the term linear in $\ln M^2$
vanishes. This simplifies the representation, but is without physical significance: the
decomposition into an infrared singular part arising from the Goldstone bosons and a
regular remainder is not unique. In Ref. [22], a somewhat different representation for the
analytic terms of order $x^2$ is used, which involves the two parameters $r_M$, $r_F$ instead of
$k_M$, $k_F$.
21. Numerical results for quark mass dependence

In addition to the coupling constants $\ell_1, \ldots, \ell_4$ that also govern the low energy
properties of the scattering amplitude, the ratios $M_\pi^2/M^2$ and $F_\pi/F$ contain the two
fourth order constants $k_M$ and $k_F$. We expect that the contributions from these terms are
of order $M_\pi^4/M_S^4$ and can just as well be dropped unless $m$ is taken much larger than in
nature. We did not make an attempt at quantifying the uncertainties associated with these
terms, because they are small compared to those from the coupling constants $\ell_1, \ldots, \ell_4$.
The bands shown in Fig. 10 correspond to rM = rF = 0. If we were to instead set the
constants kM , kF equal to zero, the boundaries would be slightly shifted, but the shifts are
small compared to the width of the bands.
For small values of $m$, the contributions of $O(m^2)$ dominate. These are determined by
the two scales $\Lambda_3$ and $\Lambda_4$. As discussed in Section 7, the information about the first one
is meagre: the crude estimate (7.8) amounts to 0.2 GeV < $\Lambda_3$ < 2 GeV. For the second
one, however, the value for $\bar\ell_4$ obtained in Section 11 yields a rather decent determination:
$$
\Lambda_4 = 1.26 \pm 0.14\ \mathrm{GeV}. \tag{21.1}
$$
The two parameters $\Lambda_3$, $\Lambda_4$ play the same role as the coefficients $c_M$ and $c_F$ in the
polynomial approximations $M_\pi^2 = 2mB(1 + c_M m)$, $F_\pi = F(1 + c_F m)$, that are sometimes
used to perform the extrapolation of lattice data. In contrast to these approximations, the
formulae (20.4) do account for the infrared singularities of the functions $M_\pi^2(m)$ and
$F_\pi(m)$: to the order of the expansion in powers of $m$ considered, that representation
is exact.
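To illustrate how the chiral logarithm differs from a polynomial ansatz, the sketch below evaluates the ratio $F_\pi/F$ keeping only the term of first order in $x$, that is $F_\pi/F \approx 1 + x \ln(\Lambda_4^2/M^2)$ with $x = M^2/(16\pi^2 F^2)$. It is a truncation for illustration only: the order $x^2$ terms of the full formula are dropped, and the chiral-limit decay constant $F = 86.2$ MeV is an assumed input that is not quoted in this section. Function names are illustrative.

```python
import math

F_GEV = 0.0862        # GeV, assumed chiral-limit decay constant (not from this section)
LAMBDA4_GEV = 1.26    # GeV, central value of Eq. (21.1)

def x_param(M, F=F_GEV):
    """Expansion parameter x = M^2 / (16 pi^2 F^2), with M and F in GeV."""
    return M**2 / (16.0 * math.pi**2 * F**2)

def fpi_over_f_first_order(M, F=F_GEV, lam4=LAMBDA4_GEV):
    """F_pi / F truncated after the term linear in x (order x^2 terms dropped)."""
    return 1.0 + x_param(M, F) * math.log(lam4**2 / M**2)

def slope_in_m2(M, F=F_GEV, lam4=LAMBDA4_GEV):
    """d(F_pi/F)/d(M^2) of the truncated formula; a linear ansatz would give a constant."""
    return (math.log(lam4**2 / M**2) - 1.0) / (16.0 * math.pi**2 * F**2)

for M in (0.135, 0.3, 0.5):
    print(f"M = {M:.3f} GeV: F_pi/F = {fpi_over_f_first_order(M):.3f}, "
          f"slope in M^2 = {slope_in_m2(M):.2f} GeV^-2")
```

The slope with respect to $M^2$ (equivalently, with respect to $m$) changes markedly over the range shown, which is the feature a linear ansatz in $m$ cannot capture.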
Consider first the ratio $F_\pi/F$, for which the poorly known scale $\Lambda_3$ only enters at
next-to-next-to-leading order. The upper one of the two shaded regions in Fig. 10 shows
the behaviour of this ratio as a function of $M$, according to formula (20.4). The change
in $F_\pi$ occurring if $M$ is increased from the physical value to $M_K$ is of the expected
Fig. 10. Dependence of the ratios $F_\pi/F$ and $M_\pi^2/M^2$ on the mass of the two lightest quarks. The
variable $M$ is defined by $M^2 = (m_u + m_d)B$ and $F$ is the value of the pion decay constant for
$m_u = m_d = 0$. The strange quark mass is held fixed at the physical value.
size, comparable to the difference between $F_K$ and $F_\pi$. As pointed out in Ref. [68], a
linear extrapolation in $m$ is meaningless. The essential parameter here is the scale $\Lambda_4$ that
determines the magnitude of the term of order $M^2$. The corrections of order $M^4$ are small:
the scale relevant for these is $\Lambda_F \simeq 0.5$ GeV.
In the case of the ratio $M_\pi^2/M^2$, on the other hand, the dominating contribution is
determined by the scale $\Lambda_3$; the corrections of $O(M^4)$ are small also in this case (the
relevant scale is $\Lambda_M \simeq 0.6$ GeV). The fact that the information about the value of $\Lambda_3$ is
very meagre shows up through very large uncertainties. In particular, with $\Lambda_3 \simeq 0.5$ GeV,
the ratio $M_\pi^2/M^2$ would remain very close to 1, on the entire interval shown. Note that
outside the range of values for $\bar\ell_3$ considered in the present paper, the dependence of $M_\pi^2$
on the quark masses would necessarily exhibit strong curvature. This is illustrated with the
dashed line that indicates the behaviour of the ratio $M_\pi^2/M^2$ for $\bar\ell_3 = -10$. According to
Fig. 2 this value corresponds to $a_0^0 \simeq 0.24$.
The above discussion shows that brute force is not the only way to reach on the lattice
the very small values of mu and md observed in nature. It suffices to equip the strange
quark with the physical value of ms and to measure the dependence of the pion mass on
$m_u$, $m_d$ in the region where $M_\pi$ is comparable to $M_K$. Since the dependence on the quark
masses is known rather accurately in terms of the two constants $B$ and $\Lambda_3$, a fit to the
data based on Eq. (20.4) should provide an extrapolation to the physical quark masses
that is under good control. Moreover, the resulting value for $\Lambda_3$ would be of considerable
interest, because that scale also shows up in other contexts, in the scattering lengths, for
example. For recent lattice work in this direction, we refer to [69]. A measurement of the
mass dependence of $F_\pi$ in the same region would be useful too, because it would provide
a check on the dispersive analysis of the scalar radius that underlies our determination
of $\bar\ell_4$: in view of the strong unitarity cut in the scalar form factor, a direct evaluation
Fig. 11. Size of the corrections to Weinberg's leading order predictions for the S-wave scattering
lengths, as a function of the pion mass.
of the scalar radius on the lattice is likely more difficult. Chiral logarithms also occur in
the quenched approximation [70,71], but since the coefficients differ from those in the
full theory, a naive comparison of the above formulae with quenched lattice data is not
meaningful.
In Section 9, we noted that the expansion of the scattering length $a_0^0$ in powers of the
quark mass contains an unusually large infrared singularity at one loop level. We can now
complete that discussion with an evaluation of the contributions arising at next-to-next-to-leading
order, repeating the analysis of Ref. [72] with the information about the coupling
constants available now. The result is shown in Fig. 11, where we indicate the behaviour
of the correction factors $R_0$, $R_2$, defined by
$$
a_0^0 = \frac{7M_\pi^2}{32\pi F_\pi^2}\,R_0, \qquad a_0^2 = -\frac{M_\pi^2}{16\pi F_\pi^2}\,R_2,
$$
as a function of $M_\pi$. The reason for choosing the variable $M_\pi$ rather than $M$ is that the
uncertainties in $\Lambda_3$ then affect the result less strongly. The comparison with Fig. 10, where
a larger mass range is shown, demonstrates that $R_0$ grows much more rapidly with the
quark masses than $F_\pi$. The effect arises from the chiral logarithms associated with the
unitarity cut: the coefficient of the leading infrared singularity in $a_0^0$ exceeds the one in
$F_\pi$ by a factor $\frac{9}{2}$. Note that the chiral perturbation theory formulae underlying the figure
are meaningful only in the range where the corrections are small. The shaded regions
exclusively account for the uncertainties in the values of the coupling constants. In the case
of $R_0$, those due to the terms of order $M_\pi^6$ are by no means negligible for $M_\pi > 0.2$ GeV,
so that it matters what exactly is plotted. The curves shown in the figure are obtained by
expressing $R_0$, $R_2$ in terms of the coupling constants $F, \ell_1, \ldots$ and of $M_\pi$, expanding the
result in powers of $M_\pi$ and truncating the series at order $M_\pi^4$. On the left half of the figure,
the behaviour of $R_0$ obtained, for instance, by truncating the expansion in powers of $M$
instead of the one in $M_\pi$ is practically the same, but on the right half, there is a substantial
difference, indicating that the chiral perturbation series is out of control there.
We conclude that in the case of the $I = 0$ scattering length, a meaningful extrapolation
of lattice data to the physical values of $m_u$ and $m_d$ requires significantly smaller quark
masses than in the case of $M_\pi$ or $F_\pi$. In the $I = 2$ channel, the effects are much smaller,
because this channel is exotic: the final state interaction is weak and repulsive. The lattice
result, $a_0^2 = -0.0374 \pm 0.0049$ [73], corresponds to $R_2 = 0.82 \pm 0.11$. It is on the low side,
but not inconsistent with our prediction: $a_0^2 = -0.0444 \pm 0.0010$, $R_2 = 0.98 \pm 0.02$. As
in the case of $F_\pi$ and $M_\pi$, the comparison between the lattice result and our prediction
is not really meaningful, because that result relies on the quenched approximation. The
evaluation of the scattering lengths to one loop in the quenched approximation [71,74] has
shown that the infrared singularities are different from those in full QCD. Moreover, as
pointed out by Bernard and Golterman [74], the very method used to extract the infinite
volume scattering lengths from finite volume observables [75] is affected: in addition to
the purely statistical error, the numbers in [73] have a sizeable systematic error.
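The correction factors quoted in this section follow directly from the definitions of $R_0$ and $R_2$ at the physical point. The sketch below assumes $M_\pi = 139.57$ MeV and $F_\pi = 92.4$ MeV (neither value is quoted here) and takes the $I = 2$ scattering length with a negative sign, which is how we read the damaged text; function names are illustrative.

```python
import math

M_PI = 139.57  # MeV, assumed charged pion mass
F_PI = 92.4    # MeV, assumed pion decay constant

def weinberg_leading_order(m_pi=M_PI, f_pi=F_PI):
    """Leading-order S-wave scattering lengths (a_0^0, a_0^2)."""
    r = m_pi**2 / (math.pi * f_pi**2)
    return 7.0 * r / 32.0, -r / 16.0

a00_lo, a20_lo = weinberg_leading_order()

def correction_factor(a, a_lo):
    """R = a / a_LO, i.e. R_0 or R_2 evaluated at the physical pion mass."""
    return a / a_lo

r0_ours = correction_factor(0.220, a00_lo)
r2_ours = correction_factor(-0.0444, a20_lo)
r2_lattice = correction_factor(-0.0374, a20_lo)
r2_lattice_err = 0.0049 / abs(a20_lo)

print(f"leading order: a00 = {a00_lo:.3f}, a20 = {a20_lo:.4f}")
print(f"R0 = {r0_ours:.2f}, R2 = {r2_ours:.2f}")
print(f"lattice R2 = {r2_lattice:.2f} +/- {r2_lattice_err:.2f}")
```

With these inputs the leading-order values are 0.159 and -0.0454, and the lattice number translates into $R_2 = 0.82 \pm 0.11$, as stated in the text. The much larger deviation of $R_0$ from unity than of $R_2$ is visible at a glance.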
22. Summary and conclusion
1. The Roy equations determine the scattering amplitude in terms of the imaginary
parts at intermediate energies, except for two subtraction constants: the S-wave scattering
lengths $a_0^0$, $a_0^2$. At low energies, the contributions from the imaginary parts are small, so that
the current experimental information about these suffices, but the one about the scattering
lengths is subject to comparatively large uncertainties.
2. The low energy theorems of chiral symmetry provide the missing element: they
predict the values of the two subtraction constants. In the chiral limit, where the pions are
massless, the S-wave scattering lengths vanish. The breaking of the symmetry generated
by the quark masses $m_u$ and $m_d$ leads to nonzero values, for $M_\pi$ as well as for $a_0^0$ and $a_0^2$.
The Gell-Mann-Oakes-Renner relation (1.1) shows that the leading term in the expansion
of $M_\pi^2$ in powers of the quark masses is determined by the quark condensate and by the
pion decay constant. Weinberg's low energy theorems (1.2) demonstrate that the same
two constants also determine the leading term in the expansion of the scattering lengths.
Ignoring the higher order contributions, these relations predict $a_0^0 = 0.159$, $a_0^2 = -0.0454$.
3. Chiral perturbation theory allows us to analyze the higher order terms of the expansion
in a systematic manner. In the isospin limit, mu = md = m, the perturbation series of the
scattering amplitude has been worked out to next-to-next-to-leading order (two loops).
The result, in particular, specifies the expansion of $a_0^0$ and $a_0^2$ in powers of $m$ up to and
including $O(m^3)$. The isospin breaking effects due to $m_u \neq m_d$ only show up at nonleading
orders of the expansion and are small: in contrast to the kaons or the nucleons, the pions
are protected from isospin breaking. For an analysis of the isospin breaking effects due to
the electromagnetic interaction, we refer the reader to [76]. In the present paper, we have
ignored these effects altogether.
4. Chiral symmetry does not fully determine the higher order contributions, because it
does not predict the values of the coupling constants occurring in the effective Lagrangian.
There are two categories of coupling constants: terms that survive in the chiral limit
and symmetry breaking terms proportional to a power of m. The former show up in
the momentum dependence of the scattering amplitude, so that these couplings can be
determined phenomenologically. For the coupling constants of the second category, which
describe the dependence on the quark masses, we need to rely on sources other than
scattering.
5. The higher order terms of the expansion are dominated by those of next-to-leading
order, which involve the coupling constants $\ell_1$, $\ell_2$ from the first category and $\ell_3$, $\ell_4$ from
the second. We rely on the dispersive analysis of the scalar pion form factor to pin down
the coupling constant $\ell_4$. Crude theoretical estimates indicate that the contributions from $\ell_3$
are very small, but the uncertainties of these estimates dominate the error in our final result
for the S-wave scattering lengths. We also rely on theoretical estimates for the symmetry
breaking coupling constants $r_1, \ldots, r_4$ of next-to-next-to-leading order. These indicate that
the contributions to $a_0^0$ and $a_0^2$ from those constants are tiny and could just as well be
dropped.
6. The expansion in powers of the quark masses contains infrared singularities: the
chiral logarithms characteristic of chiral perturbation theory. In the case of $a_0^0$, for instance,
these singularities enhance the magnitude of the corrections quite substantially. The origin
of the phenomenon is well understood: the final state interaction in the $I = 0$ S-wave
generates a strong branch cut. For this reason, the straightforward expansion of $a_0^0$ in
powers of $m$ converges only rather slowly. We exploit the fact that, in the subthreshold
region, the expansion of the scattering amplitude converges much more rapidly: in our
approach, the chiral and phenomenological representations of the scattering amplitude
are matched there. With this method, even the one loop approximation of the chiral
perturbation series yields values for the scattering lengths that are within the errors of
our final result, which reads
$$
\begin{aligned}
a_0^0 &= 0.220 \pm 0.005, & a_0^2 &= -0.0444 \pm 0.0010,\\
2a_0^0 - 5a_0^2 &= 0.663 \pm 0.007, & a_0^0 - a_0^2 &= 0.265 \pm 0.004.
\end{aligned}
$$
7. We have worked out the implications for the phase shifts of the S- and P -waves. As
shown in Figs. 7-9, chiral symmetry and the existing experimental information constrain
these to a rather narrow range. The corresponding predictions for the scattering lengths of
the D- and F -waves, as well as for the effective ranges are listed in Table 2.
8. Our representation of the scattering amplitude, in particular, also yields an accurate
prediction for the phase of $\varepsilon'/\varepsilon$: the result for the phase difference between the two
S-waves at $s = M_K^2$ reads
$$
\delta_0^0(M_{K^0}^2) - \delta_0^2(M_{K^0}^2) = 47.7^\circ \pm 1.5^\circ. \tag{22.1}
$$
9. The mass and the width of the $\rho$-meson can be calculated to within remarkably small
uncertainties:
$$
M_\rho = 762.4 \pm 1.8\ \mathrm{MeV}, \qquad \Gamma_\rho = 145.7 \pm 2.6\ \mathrm{MeV}. \tag{22.2}
$$
Also, we confirm that the $I = 0$ S-wave contains a pole far away from the real axis, at
$\sqrt{s} = (470 \pm 30) - i\,(295 \pm 20)$ MeV. The phenomenon is related to the fact that chiral
symmetry requires the scattering amplitude to be very small at threshold and then to grow
with the square of the energy.
10. The consequences for the coupling constants $\ell_1$, $\ell_2$, $\ell_4$, $r_5$ and $r_6$ of the effective
Lagrangian were studied in detail. Our results are in good agreement with previous work,
but are more accurate. In particular, we have shown that $\ell_1$ and $\ell_2$ are accompanied by
strong infrared singularities generated by the two loop graphs, which shift the numerical
values obtained in one loop approximation, quite substantially. The effective couplings
relevant at one loop level are given in Eq. (14.3). We have shown that, with these values,
a very decent representation of the scattering amplitude is obtained by matching the Roy
equations with the one loop approximation of chiral perturbation theory: the result can
barely be distinguished from the representation that underlies the present paper.
11. The results for the various quantities of interest are strongly correlated. We have
examined the correlations in detail, not only for the threshold parameters and coupling
constants, but also for the coefficients b1 , . . . , b6 of the polynomial that occurs in the chiral
perturbation theory representation of the scattering amplitude.
12. The resulting picture for the low energy structure of the scattering amplitude is
consistent with the resonance saturation hypothesis: for all of the effective coupling
constants encountered in the two loop representation, our results are consistent with the
assumption that, once the poles and cuts due to pion exchange are removed, the low
energy structure of the amplitude is dominated by the singularities due to the lightest
non-Goldstone states [10]. Since the splitting of a resonance contribution from the
continuum underneath it is not unique, the saturation hypothesis does not lead to very
sharp predictions, but it is by no means trivial that these are consistent with the values
found, both in sign and magnitude.
13. On the lattice, it is difficult to reach the small values of $m_u$ and $m_d$ that are realized
in nature. We have shown that chiral perturbation theory can be used to extrapolate the
results obtained at comparatively large values for these masses, in a controlled manner.
The method at the same time also allows a measurement of some of the coupling constants
that occur in the symmetry breaking part of the effective Lagrangian, in particular, of $\ell_3$.
In quite a few cases, the uncertainties in our results are dominated by those from this term.
14. We emphasize that most of our results rely on the standard picture, according to
which the quark condensate represents the leading order parameter of the spontaneously
broken symmetry, so that the Gell-Mann–Oakes–Renner relation holds. The crude
theoretical estimates for the coupling constant $\ell_3$ we are relying on indicate that the higher
order terms in the expansion of $M_\pi^2$ are very small, so that the square of the pion mass
indeed grows linearly with $m = \frac{1}{2}(m_u + m_d)$; a curvature only shows up if $m$ is taken
much larger than the physical value. In Ref. [24], $\ell_3$ is instead treated as a free parameter
and is allowed to take large values, so that the dependence of $M_\pi^2$ on the quark masses fails
to be approximately linear, even in the region below the physical value of $m$. There is no
prediction for the scattering lengths in that framework.
15. Even if the quark condensate is not assumed to represent the leading order parameter,
a strong correlation between $a_0^0$ and $a_0^2$ emerges, which originates in the relation between
these quantities and the scalar radius. The correlation is of interest, in particular, in
connection with the analysis of $K_{e4}$ data: as shown in Fig. 3, the preliminary results of
the E865 experiment at Brookhaven [30,31] yield a remarkably good determination of $a_0^0$.
The outcome beautifully confirms the prediction (11.2): the best fit to these data is obtained
for $a_0^0 = 0.218$, with $\chi^2 = 5.7$ for 5 degrees of freedom. For a detailed discussion of the
consequences for the value of $a_0^0$, we refer to [48,49].
16. A measurement that aims at determining the lifetime of a $\pi^+\pi^-$ atom to an accuracy
of 10% is currently under way at CERN. The interference of the electromagnetic and strong
interaction effects in the bound state and in the decay is now well understood, also on the
basis of chiral perturbation theory [37]. The decay rate of the ground state can be written
in the form
$$\Gamma = \frac{2}{9}\,\alpha^3\, p\,\left|a_0^0 - a_0^2\right|^2 (1 + \delta),$$
with $p = \sqrt{M_{\pi^+}^2 - M_{\pi^0}^2 - \tfrac{1}{4}\alpha^2 M_{\pi^+}^2}$. The term $\delta$ accounts for the corrections of order $\alpha$ as
well as those due to $m_u \neq m_d$. According to Ref. [37], these effects increase the rate by
$\delta = 0.058 \pm 0.012$, that is by about 6%. Inserting our result (11.2) for $a_0^0 - a_0^2$, we arrive
at the following prediction for the lifetime [77]:
$$\tau = (2.9 \pm 0.1) \cdot 10^{-15}\ \mathrm{s}. \tag{22.3}$$
Since the decay rate is proportional to $|a_0^0 - a_0^2|^2$, the outcome of the experiment is expected
to lead to a determination of $|a_0^0 - a_0^2|$ to an accuracy of 5%, thereby subjecting chiral
perturbation theory to a very sensitive test.
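As a rough numerical cross-check of the lifetime prediction, the decay-rate formula of item 16 can be evaluated directly. The sketch below is illustrative only: the fine-structure constant, the two pion masses and the value of ħ are standard numbers that are not quoted in the text and are assumptions of this example, the scattering lengths are treated as dimensionless quantities, and the function name `pionium_lifetime` is hypothetical. The uncertainty is estimated by adding the quoted relative errors of $a_0^0 - a_0^2$ (counted twice, since it enters squared) and of $1+\delta$ in quadrature.

```python
import math

# Assumed standard inputs (not quoted in the text):
ALPHA = 1.0 / 137.036        # fine-structure constant
M_PI_CHARGED = 139.570       # MeV, charged pion mass
M_PI_NEUTRAL = 134.977       # MeV, neutral pion mass
HBAR = 6.582119569e-22       # MeV * s

def pionium_lifetime(a0_minus_a2, delta):
    """Lifetime in seconds from Gamma = (2/9) alpha^3 p |a00 - a02|^2 (1 + delta)."""
    p = math.sqrt(M_PI_CHARGED**2 - M_PI_NEUTRAL**2
                  - 0.25 * ALPHA**2 * M_PI_CHARGED**2)   # MeV
    gamma = (2.0 / 9.0) * ALPHA**3 * p * a0_minus_a2**2 * (1.0 + delta)  # MeV
    return HBAR / gamma

# Central values quoted in the summary: a00 - a02 = 0.265, delta = 0.058.
tau = pionium_lifetime(0.265, 0.058)

# Relative error: twice that of (a00 - a02), combined with that of (1 + delta).
rel_err = math.hypot(2 * 0.004 / 0.265, 0.012 / 1.058)
tau_err = tau * rel_err

print(f"tau = {tau:.2e} s, uncertainty = {tau_err:.1e} s")
```

With these inputs the result lands close to the value quoted in Eq. (22.3); small differences in the last digit depend on the assumed masses and constants.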
Acknowledgements
We are indebted to S. Pislak and P. Truöl for providing us with preliminary data of the
E865 collaboration, and we thank H. Bijnens and G. Wanders for informative discussions
and comments. This work was supported in part by the Swiss National Science Foundation,
and by TMR, BBW-Contract No. 97.0131 and EC-Contract No. ERBFMRX-CT980169
(EURODAΦNE).
Appendix A. Notation
We use the following abbreviations:

$$\xi = \left(\frac{M_\pi}{4\pi F_\pi}\right)^{2}, \qquad x = \left(\frac{M}{4\pi F}\right)^{2}, \qquad \tilde L = \ln\frac{\mu^2}{M_\pi^2}, \qquad N = 16\pi^2.$$
The intrinsic scales of the coupling constants of $\mathcal{L}_4$ are denoted by $\Lambda_n$. In terms of these,
the standard renormalized couplings are given by
$$\ell_n^r(\mu) = \frac{\gamma_n}{32\pi^2}\,\ln\frac{\Lambda_n^2}{\mu^2}, \qquad \gamma_1 = \tfrac{1}{3}, \quad \gamma_2 = \tfrac{2}{3}, \quad \gamma_3 = -\tfrac{1}{2}, \quad \gamma_4 = 2,$$
where $\mu$ is the running scale. The scales relevant in the various applications are different
and the formulae can be simplified considerably if the coupling constants are normalized
at the appropriate scale. For this reason, we use three different symbols:
$$\bar\ell_n = \ln\frac{\Lambda_n^2}{M_\pi^2}, \qquad \hat\ell_n = \ln\frac{\Lambda_n^2}{M^2}, \qquad \tilde\ell_n = \ln\frac{\Lambda_n^2}{\mu^2}.$$
The first coincides with the one introduced in [5], in the framework of a one loop analysis,
where there is no need to distinguish $\bar\ell_n$ from $\hat\ell_n$. The quantities $\tilde\ell_n$ differ from the running
coupling constants $\ell_n^r(\mu)$ only by a numerical factor, which it is convenient to remove
in order to simplify the expressions. For the same reason, we work with the coefficients
$\bar b_1, \ldots, \bar b_6$, introduced in [6],
$$\bar b_1 = N b_1, \quad \bar b_2 = N b_2, \quad \bar b_3 = N b_3, \quad \bar b_4 = N b_4, \quad \bar b_5 = N^2 b_5, \quad \bar b_6 = N^2 b_6,$$
and also stretch the coupling constants of $\mathcal{L}_6$ by a power of $N = 16\pi^2$,
$$\tilde r_n = N^2\, r_n^r(\mu), \qquad n = 1, \ldots, 8.$$
In generalized chiral perturbation theory, the coefficients $\bar b_n$ are replaced by the constants
$\alpha$, $\beta$, $\lambda_1, \ldots, \lambda_4$:
2 2
152 2
= 1 + (3b1 + 4b2 + 4b3 4b4 ) 11
36 9 ,
= 1 + (b2 + 4b3 4b4 ) + 4 2 (3b5 b6 ) 13 2 2 +
72
N1 = b3 b 4 + 2(3b5 b6 ) +
N2 = 2b4 1 ,
1 2
48
152 2
9 ,
38
3 ,
N 3 = b5 13 b6 + 82
27 ,
2
4
14
N 4 = b6 .
2
27
(A.1)
Throughout this paper, we identify $M_\pi$ with the mass of the charged pion and use $F_\pi =
92.4$ MeV [39].
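The relation between the intrinsic scales and the running couplings introduced in this appendix can be made concrete with a few lines of code. The sketch below evaluates $\ell_n^r(\mu) = \gamma_n\,[\bar\ell_n + \ln(M_\pi^2/\mu^2)]/(32\pi^2)$, which is just the definition of $\ell_n^r(\mu)$ rewritten in terms of $\bar\ell_n$. The function name, the pion mass value, the input $\bar\ell_1 = -0.4$ and the two scales are illustrative assumptions and are not taken from this appendix.

```python
import math

M_PI = 139.57  # MeV, charged pion mass (assumed standard value)
GAMMA = {1: 1.0 / 3.0, 2: 2.0 / 3.0, 3: -0.5, 4: 2.0}

def running_coupling(n, lbar_n, mu):
    """l_n^r(mu) from lbar_n = ln(Lambda_n^2 / M_pi^2); mu in MeV."""
    # ln(Lambda_n^2 / mu^2) = lbar_n + ln(M_pi^2 / mu^2)
    log_lambda_over_mu = lbar_n + math.log(M_PI**2 / mu**2)
    return GAMMA[n] / (32.0 * math.pi**2) * log_lambda_over_mu

# Illustrative input: lbar_1 = -0.4, evaluated at two scales.
l1_at_770 = running_coupling(1, -0.4, 770.0)
l1_at_1000 = running_coupling(1, -0.4, 1000.0)

# The shift between two scales depends only on their ratio, not on lbar_1.
shift = l1_at_770 - l1_at_1000
expected_shift = GAMMA[1] / (32.0 * math.pi**2) * math.log(1000.0**2 / 770.0**2)

print(l1_at_770, l1_at_1000, shift, expected_shift)
```

Working with the scale-independent logarithms and converting only at the end avoids carrying an explicit $\mu$ through intermediate formulae, which is the simplification the three symbols are meant to provide.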
Appendix B. Polynomial part of the chiral representation

In this appendix, we convert the representation obtained in Ref. [6] for the scattering
amplitude to two loops of chiral perturbation theory into an explicit expression for the
coefficients of the polynomial C(s, t, u) defined in Eq. (2.6). As a first step, we decompose
that representation into three functions of a single variable and a polynomial, according
to Eq. (2.5). The resulting representation for the functions $U^0(s)$, $U^1(s)$, $U^2(s)$ contains
linear combinations of the loop integrals $\bar J(s), \ldots, \bar K_4(s)$ introduced in Ref. [6]. Note
that the decomposition is not unique: Eq. (2.5) fixes the functions $U^I(s)$ only up to a
polynomial in $s$.
Next, we expand the loop integrals in powers of $s$, using the explicit expressions of
Ref. [6]. In terms of the dimensionless variable $\bar s = s/M_\pi^2$, the result reads:

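$$\begin{aligned}
\bar J(s) &= \frac{\bar s}{96\pi^2}\left\{1 + \frac{\bar s}{10} + \frac{\bar s^2}{70} + O(\bar s^3)\right\}, \\
\bar K_1(s) &= -\frac{\bar s}{256\pi^4}\left\{1 + \frac{\bar s}{12} + \frac{\bar s^2}{90} + O(\bar s^3)\right\}, \\
\bar K_2(s) &= -\frac{\bar s}{384\pi^4}\left\{1 + \frac{7\bar s}{120} + \frac{\bar s^2}{168} + O(\bar s^3)\right\}, \\
\bar K_3(s) &= \frac{\bar s}{1024\pi^4}\left\{\frac{\pi^2 - 6}{3} + \frac{(2\pi^2 - 15)\,\bar s}{30} + \frac{(6\pi^2 - 49)\,\bar s^2}{420} + O(\bar s^3)\right\}, \\
\bar K_4(s) &= -\frac{\bar s}{3072\pi^4}\left\{\frac{\pi^2 - 9}{3} + \frac{(2\pi^2 - 19)\,\bar s}{20} + \frac{(3\pi^2 - 29)\,\bar s^2}{105} + O(\bar s^3)\right\}.
\end{aligned} \tag{B.1}$$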
These representations suffice to determine the Taylor series expansions of the functions
$U^I(s)$ around $s = 0$, up to and including $O(s^3)$ in the case of $U^0(s)$, $U^2(s)$ and to $O(s^2)$
for $U^1(s)$.
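The truncated expansions of the loop integrals can be checked numerically against closed-form expressions. The sketch below is not taken from the paper: it assumes the closed forms commonly used in the two-loop literature, $\bar J = zh + 2/N$, $\bar K_1 = zh^2$, $\bar K_2 = z^2h^2 - 4/N^2$, $\bar K_3 = Nzh^3/\bar s + \pi^2 h/(N\bar s) - \pi^2/(2N^2)$ and $\bar K_4 = [\tfrac12\bar K_1 + \tfrac13\bar K_3 + \bar J/N + (\pi^2-6)\bar s/(12N^2)]/(\bar s z)$, with $z = 1 - 4/\bar s$ and $h = \ln[(\sqrt z - 1)/(\sqrt z + 1)]/(N\sqrt z)$. Both these definitions and the overall signs of the series are assumptions of this example, and the function names are hypothetical.

```python
import cmath
import math

N = 16 * math.pi**2

def loop_functions_exact(sbar):
    """Assumed closed forms of J, K1..K4 at sbar = s / M_pi^2 (sbar < 4, sbar != 0)."""
    s = complex(sbar)
    z = 1 - 4 / s
    root = cmath.sqrt(z)
    h = cmath.log((root - 1) / (root + 1)) / (N * root)
    j = z * h + 2 / N
    k1 = z * h**2
    k2 = z**2 * h**2 - 4 / N**2
    k3 = N * z * h**3 / s + math.pi**2 * h / (N * s) - math.pi**2 / (2 * N**2)
    k4 = (k1 / 2 + k3 / 3 + j / N + (math.pi**2 - 6) * s / (12 * N**2)) / (s * z)
    # Below threshold the functions are real; discard rounding-level imaginary parts.
    return [f.real for f in (j, k1, k2, k3, k4)]

def loop_functions_series(sbar):
    """Taylor expansions through relative order sbar^2."""
    p2 = math.pi**2
    j = sbar / (96 * p2) * (1 + sbar / 10 + sbar**2 / 70)
    k1 = -sbar / (256 * p2**2) * (1 + sbar / 12 + sbar**2 / 90)
    k2 = -sbar / (384 * p2**2) * (1 + 7 * sbar / 120 + sbar**2 / 168)
    k3 = sbar / (1024 * p2**2) * ((p2 - 6) / 3 + (2 * p2 - 15) * sbar / 30
                                  + (6 * p2 - 49) * sbar**2 / 420)
    k4 = -sbar / (3072 * p2**2) * ((p2 - 9) / 3 + (2 * p2 - 19) * sbar / 20
                                   + (3 * p2 - 29) * sbar**2 / 105)
    return [j, k1, k2, k3, k4]

for sbar in (0.05, 0.1, -0.1):
    exact = loop_functions_exact(sbar)
    series = loop_functions_series(sbar)
    worst = max(abs(e - a) / abs(e) for e, a in zip(exact, series))
    print(f"sbar = {sbar:+.2f}: largest relative deviation = {worst:.1e}")
```

The deviations shrink like the cube of $\bar s$, as expected for series truncated at relative order $\bar s^2$.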
The ambiguity mentioned above is fixed with the dispersive representation (2.7), which
requires the first few derivatives of the functions $U^I(s)$ to vanish at $s = 0$. Starting with
an arbitrary decomposition, for which this requirement need not be obeyed, we truncate
the Taylor series for $U^0(s)$, $U^2(s)$ at order $s^3$ and the one for $U^1(s)$ at order $s^2$. It is
straightforward to check that the functions obtained by subtracting these terms indeed obey
the dispersion relations (2.7). Absorbing the subtractions in $C(s,t,u)$, we may then read
off the coefficients of this polynomial:

8b1 32b2 464b3 3824b4
M2
68
2
c1 = 2 1 + b 1
315
105
63
315
315
F

601 2 17947
+
,
945
2835

1
323
11b1 211b2 628b3 5164, b 4
c2 = 2 1 + b2
+ 2
F
1260
70
315
315
315

3977 5237 2
+
,
630
7560

1
3 + 1 + 18b1 + 59b2 + 731b3 + 3601b4 5387
c3 =
b
42
35
105
315
315
15120
NF4

19121
,
7560

43b1 8b2 23b3 997b4 467 2

1
31
+
+
+
c4 =
b
4
2520
420 63
63
315
7560
NF4
63829
,
45360
b1 379b2 25b3 731b4

137
1
+
+
+ b5
c5 = 2 6
16
1680
28
180
N F 1680

269 2 61673
+
,
+
15120 18144
31
1
b1
47b2 65b3 547b4
c6 = 2 6
+
+ b6
1680 112 1680
252
420
N F

44287
2
+
.
+
15120 90720
(B.2)
The constants $\bar b_n$ represent dimensionless combinations of coupling constants, introduced
in Ref. [6]. In the notation of Appendix A, the expressions read:
41 3
7L
13 49 L 2
+
24 +
b1 =
6
3
2
18
6

1 562
4
4

47
26

+ L
3
9
9
3
108
2
161 4 3
281 802 153 264
33 4 524 +
+
+
+ r1 +
3
2
27
27
4
9

34 2 3509
,
+
27
1296
41
2L
2 431 L 2
+ 24 +
b2 =
3
3
9
36
1242 53 204 203

+ L 61 +
+
+
9
2
3
54
161 4
1662 93 84
+ 3 4 + 524 41
+
+ r2
3
27
2
9

2
1789
317
,
+
216
432
b3 = L + 1 + 2 7 40 L + L 381 202 + 24 + 365

2
3
6
12
9
9
3
216

7063
41 4 22 4 891 382 74 311
+
+
+
+
,
+ r3 +
3
3
27
9
3
432
864
b4 = L + 2 5 + 5 L + L 1 + 82 + 24 47
6
6
36
6
9
9
3
216

22 4 51 42 54 17 2 1655

+
+
+
+
,
+ r4 +
3
27
27
9
216
2592
b5 = 85L + L 71 + 1072 625 + r5 311 1452 + 7 66029 ,

72
8
72
288
36
108
54
20736

2
2
5, 1 252 257
11375
5L
71 352
+ L
+
. (B.3)
b6 =
+ r6
24
72
72
864
108
108
27 20736
Note that the quark masses exclusively enter through $\xi$ and $\tilde L$: the remaining quantities
are independent thereof. The expressions involve the logarithm of $M_\pi^2$, as well as the square
thereof: in the chiral limit, the coefficients $\bar b_n$ diverge logarithmically. The coefficients of
the leading infrared singularities are pure numbers, which are determined by the structure
of the symmetry group and the transformation properties of the symmetry breaking part of
the Hamiltonian, that is of the quark mass term. The scales of the logarithmic divergences,
on the other hand, are not determined by the symmetry, but are fixed by the intrinsic scales
$\Lambda_1, \ldots, \Lambda_4$ of the effective coupling constants of $\mathcal{L}_4$.
Appendix C. The corrections $\Delta_0$, $\Delta_1$, $\Delta_2$, $\Delta_r$

The leading terms in the expansion of the scalar radius in powers of the quark masses
are determined by $\bar\ell_4$. The next-to-leading order correction $\Delta_r$, which is defined in (7.5),
was calculated in [22], on the basis of an evaluation of the scalar form factor to two loops.
In the notation introduced above, their result reads:

29
31
34
145
*r = L 2 + L 1 2 + 44
+ rS2 + 3 4 + 242
18
9
9
216
22
5
13
23 2 869
+
.
(C.1)
1 + 22 3 4
9
24
6
72
648
As discussed in Section 6, the quantities $C_0$, $C_1$ and $C_2$ tend to unity in the chiral limit.
According to Eq. (7.6), the first order corrections can be expressed in terms of the scalar
radius and the coupling constant $\bar\ell_3$. To work out the corrections of second order, it suffices
to insert the relations (B.2), (B.3) in the definition (6.2) of $C_1$ and $C_2$ and read off the
coefficients of the terms of order $\xi^2$. The result reads

401 802 53
71L 2
5393
+L
+ 4 4 +
*1 =
12
9
9
2
315
+ r2 + 4r3 4r4 2rS2 3 4 + 2
+
18261 31182 793 1444 521 2 24221

+
+
+
,
+
315
315
21
35
252
3024

2
175L
1482
9311
*2 =
+ L 101
+ 3 + 64 +
18
9
630
2
r1 + 4r3 4r4 2rS2 + 3 + 3 4 + 42
2
5561 43722 1043 1254 2939 2 43109

+
+
+
.
+
(C.2)
63
315
35
21
1260
5040
According to (6.6), the analogous correction in the low energy theorem for $C_0$ can be
expressed in terms of these:
$$\Delta_0 = \tfrac{1}{7}\,(12\,\Delta_1 - 5\,\Delta_2). \tag{C.3}$$
Appendix D. Phenomenological representation

In the present appendix, we convert the low energy representation of the scattering
amplitude constructed in Ref. [7] into the form given in Section 3. That representation
consists of a sum of two terms:
$$A(s,t,u) = A(s,t,u)_{SP} + A(s,t,u)_d. \tag{D.1}$$
The first describes the contributions generated by the imaginary parts of the S- and P-waves
below $\sqrt{s_2} = 2$ GeV, while the background amplitude $A(s,t,u)_d$ collects those
from the higher partial waves and higher energies.
The explicit expression for the first term involves three functions of a single variable:
$$A(s,t,u)_{SP} = 32\pi\left\{\tfrac{1}{3}W^0(s) + \tfrac{3}{2}(s-u)\,W^1(t) + \tfrac{3}{2}(s-t)\,W^1(u) + \tfrac{1}{2}W^2(t) + \tfrac{1}{2}W^2(u) - \tfrac{1}{3}W^2(s)\right\}. \tag{D.2}$$
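The functions $W^0(s)$, $W^1(s)$, $W^2(s)$ are determined by the imaginary parts of the S- and
P-waves and by the two subtraction constants $a_0^0$, $a_0^2$:
$$\begin{aligned}
W^0(s) &= \frac{a_0^0\, s}{4M_\pi^2} + \frac{s(s-4M_\pi^2)}{\pi}\int_{4M_\pi^2}^{s_2} ds'\,\frac{\mathrm{Im}\,t_0^0(s')}{s'(s'-4M_\pi^2)(s'-s)}, \\
W^1(s) &= \frac{s}{\pi}\int_{4M_\pi^2}^{s_2} ds'\,\frac{\mathrm{Im}\,t_1^1(s')}{s'(s'-4M_\pi^2)(s'-s)}, \\
W^2(s) &= \frac{a_0^2\, s}{4M_\pi^2} + \frac{s(s-4M_\pi^2)}{\pi}\int_{4M_\pi^2}^{s_2} ds'\,\frac{\mathrm{Im}\,t_0^2(s')}{s'(s'-4M_\pi^2)(s'-s)}.
\end{aligned} \tag{D.3}$$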
These functions are closely related to the quantities $\bar W^I(s)$ introduced in Section 3, but
there are two differences: the subtractions are not the same and the range of integration
differs.
To compare $W^0(s)$ with $\bar W^0(s)$, we consider the function
$$w^0(s) = \frac{s^4}{\pi}\int_{4M_\pi^2}^{s_2} ds'\,\frac{\mathrm{Im}\,t_0^0(s')}{s'^4\,(s'-s)},$$
which is intermediate between the two: it differs from $\bar W^0(s)$ only in the range of
integration and from $W^0(s)$ only by a subtraction polynomial. The latter may be expressed
in terms of the following moments of the imaginary part:
$$J_n^0 = \frac{1}{\pi}\int_{4M_\pi^2}^{s_2} ds\,\frac{\mathrm{Im}\,t_0^0(s)}{s^{n+2}\,(s-4M_\pi^2)}, \qquad n = 0, 1, 2. \tag{D.4}$$
The explicit relation between $W^0(s)$ and $w^0(s)$ reads
$$W^0(s) = w^0(s) + \frac{a_0^0\, s}{4M_\pi^2} + s\,(s-4M_\pi^2)\,J_0^0 + s^2\,(s-4M_\pi^2)\,J_1^0 - 4M_\pi^2\, s^3 J_2^0. \tag{D.5}$$
The difference $\bar W^0(s) - w^0(s)$, on the other hand, is given by an integral over the region
$s' > s_2$, which merely generates contributions of $O(p^8)$, so that
$$w^0(s) = \bar W^0(s) + O(p^8).$$
To the accuracy to which the two loop representation holds, we may thus replace the term
$w^0(s)$ on the r.h.s. of Eq. (D.5) by $\bar W^0(s)$.
This shows that, up to terms of $O(p^8)$, the functions $\bar W^0(s)$ and $W^0(s)$ only differ by
a polynomial whose coefficients are determined by the moments $J_n^0$. The same reasoning
also applies to the components with I = 1, 2. The relevant moments are given by
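$$J_n^1 = \frac{1}{\pi}\int_{4M_\pi^2}^{s_2} ds\,\frac{\mathrm{Im}\,t_1^1(s)}{s^{n+2}\,(s-4M_\pi^2)}, \qquad J_n^2 = \frac{1}{\pi}\int_{4M_\pi^2}^{s_2} ds\,\frac{\mathrm{Im}\,t_0^2(s)}{s^{n+2}\,(s-4M_\pi^2)}. \tag{D.6}$$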
The net result amounts to a representation for $A(s,t,u)_{SP}$ in terms of the functions $\bar W^I(s)$
and a set of polynomials involving the above moments:
$$\begin{aligned}
W^0(s) &= \bar W^0(s) + \frac{a_0^0\, s}{4M_\pi^2} + s\,(s-4M_\pi^2)\,J_0^0 + s^2\,(s-4M_\pi^2)\,J_1^0 - 4M_\pi^2\, s^3 J_2^0 + O(p^8), \\
W^1(s) &= \bar W^1(s) + s\,J_0^1 + s^2 J_1^1 + O(p^6), \\
W^2(s) &= \bar W^2(s) + \frac{a_0^2\, s}{4M_\pi^2} + s\,(s-4M_\pi^2)\,J_0^2 + s^2\,(s-4M_\pi^2)\,J_1^2 - 4M_\pi^2\, s^3 J_2^2 + O(p^8).
\end{aligned} \tag{D.7}$$
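The subtraction polynomials in (D.5) and (D.7) follow from an algebraic identity between the integration kernels, which can be verified exactly with rational arithmetic. The sketch below sets $M_\pi^2 = 1$ and compares the kernels point by point. The function names are hypothetical, and the kernel used for $\bar W^1(s)$, namely $s^3/[s'^3(s'-4M_\pi^2)(s'-s)]$, is an assumption about the definition given in Section 3, which is not reproduced in this appendix.

```python
from fractions import Fraction
import random

def j_kernel(n, sp, m2):
    """Kernel of the moment J_n: 1 / [s'^(n+2) (s' - 4 M^2)]."""
    return 1 / (sp ** (n + 2) * (sp - 4 * m2))

def s_wave_lhs(s, sp, m2):
    """Kernel of the dispersive part of W^0 or W^2."""
    return s * (s - 4 * m2) / (sp * (sp - 4 * m2) * (sp - s))

def s_wave_rhs(s, sp, m2):
    """Kernel of w^0 plus the polynomial in s multiplying J_0, J_1, J_2."""
    return (s**4 / (sp**4 * (sp - s))
            + s * (s - 4 * m2) * j_kernel(0, sp, m2)
            + s**2 * (s - 4 * m2) * j_kernel(1, sp, m2)
            - 4 * m2 * s**3 * j_kernel(2, sp, m2))

def p_wave_lhs(s, sp, m2):
    """Kernel of W^1."""
    return s / (sp * (sp - 4 * m2) * (sp - s))

def p_wave_rhs(s, sp, m2):
    """Assumed kernel of the more strongly subtracted P-wave function plus s J_0 + s^2 J_1."""
    return (s**3 / (sp**3 * (sp - 4 * m2) * (sp - s))
            + s * j_kernel(0, sp, m2)
            + s**2 * j_kernel(1, sp, m2))

random.seed(0)
m2 = Fraction(1)
mismatches = 0
for _ in range(200):
    sp = Fraction(random.randint(5, 60))        # s' above threshold 4 M^2
    s = Fraction(random.randint(-30, 30), 7)    # s below s', so s' - s != 0
    if s_wave_lhs(s, sp, m2) != s_wave_rhs(s, sp, m2):
        mismatches += 1
    if p_wave_lhs(s, sp, m2) != p_wave_rhs(s, sp, m2):
        mismatches += 1
print("kernel mismatches:", mismatches)
```

Because the identity holds for the kernels themselves, it holds for the integrals with any imaginary part, which is why the comparison does not require a model for $\mathrm{Im}\,t$.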
Appendix E. Moments of the background amplitude

We now turn to the second part in the decomposition (D.1). The chiral representation
shows that the infrared singularities contained in $A(s,t,u)_d$ start manifesting themselves
only at higher orders. Up to and including $O(p^6)$, the background amplitude is a crossing
symmetric polynomial of the momenta:
$$\begin{aligned}
A(s,t,u)_d &= P(s,t,u) + O(p^8), \\
P(s,t,u) &= p_1 + p_2\, s + p_3\, s^2 + p_4\,(t-u)^2 + p_5\, s^3 + p_6\, s\,(t-u)^2.
\end{aligned} \tag{E.1}$$
The coefficients $p_1, \ldots, p_6$ may be calculated by expanding the dispersion integrals in
powers of the momenta. For a detailed discussion, we refer to Appendix B of Ref. [7].
By construction, $A(s,t,u)_d$ does not contribute to the S-wave scattering lengths. This
condition fixes $p_1$ and $p_2$ in terms of the remaining coefficients:
$$p_1 = -16M_\pi^4\, p_4, \qquad p_2 = 4M_\pi^2\left(-p_3 + p_4 - 4M_\pi^2\, p_5\right). \tag{E.2}$$
The explicit expressions for these read

16 2
8 0
M 8I10 21I11 + 11I12 + 12H ,
4I0 9I01 I02 +
3
3

p4 = 8 I01 + I02 + 16M2 I11 + I12 ,

4 0
8I1 + 9I11 11I12 6H ,
p5 =
3

p6 = 4 I11 3I12 + 2H .
p3 =
(E.3)
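The moments $I_n^I$ represent integrals over the imaginary parts at t = 0 (total cross sections),
except that the contributions from the S- and P-waves below $\sqrt{s_2} = 2$ GeV are removed.
In terms of the imaginary parts of the partial waves, the explicit expression reads [7]:
$$I_n^I = \sum_{\ell=2}^{\infty}(2\ell+1)\,\frac{1}{\pi}\int_{4M_\pi^2}^{s_2} ds\,\frac{\mathrm{Im}\,t_\ell^I(s)}{s^{n+2}\,(s-4M_\pi^2)} + \sum_{\ell=0}^{\infty}(2\ell+1)\,\frac{1}{\pi}\int_{s_2}^{\infty} ds\,\frac{\mathrm{Im}\,t_\ell^I(s)}{s^{n+2}\,(s-4M_\pi^2)}. \tag{E.4}$$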
The term $H$ represents an analogous integral over the derivatives of the imaginary parts
with respect to t at t = 0. Since the S-wave contributions are independent of t, they drop
out. Moreover, on account of crossing symmetry, the contributions with I = 1 may be
expressed in terms of those with I = 0, 2:
$$H = \sum_{\ell=2}^{\infty}(2\ell+1)\,\ell(\ell+1)\,\frac{1}{\pi}\int_{4M_\pi^2}^{\infty} ds\,\frac{2\,\mathrm{Im}\,t_\ell^0(s) + 4\,\mathrm{Im}\,t_\ell^2(s)}{9\,s^3\,(s-4M_\pi^2)}. \tag{E.5}$$
There is an important difference between the moments relevant for the background
amplitude and those associated with the S- and P-waves: while $I_n^I$ and $H$ remain finite
when the quark masses are sent to zero,
$$I_0^I = O(1), \qquad I_1^I = O(1), \qquad I_2^I = O(1), \qquad H = O(1),$$
the S- and P-wave moments with $n \geq 1$ explode in that limit:
$$J_0^I = O(1), \qquad J_1^I = O(M_\pi^{-2}), \qquad J_2^I = O(M_\pi^{-4}).$$
The phenomenon arises from the manner in which we have chosen to decompose the
amplitude into a contribution from the S- and P-waves, described by the functions $\bar W^I(s)$,
and a polynomial. These functions develop an infrared singularity in the chiral limit, which
cancels the one occurring in the polynomial: the full scattering amplitude approaches a
finite limit when the quark masses are turned off. In fact, the problem does not occur in the
original form of the decomposition, based on the functions $W^I(s)$: these do have a decent
chiral limit.
The same singularities also show up in the chiral representation of the amplitude: as we
have normalized the unitarity correction by subtracting the dispersion integrals at s = 0,
they contain a quadratic infrared divergence in the chiral limit. Indeed, the relations (B.2)
show that the coefficients $c_5$ and $c_6$ contain contributions that are inversely proportional
to $M_\pi^2$ and precisely cancel this divergence. The above choice of the decomposition has
the advantage that it is scale independent and leads to a simple form of the matching
conditions. The flip side of the coin is that the two pieces do not have a smooth chiral
limit. We could modify the normalization of the unitarity correction in such a way that it
remains finite in the chiral limit, at the price of introducing an arbitrary scale to normalize
the logarithmic infrared singularities. There is no gain in doing that, however: anyway,
only the sum of the polynomial and the unitarity correction is relevant, so that different
decompositions lead to identical results. We stick to the one above.
Finally, we add up the two parts of the amplitude. The result takes the form of Eq. (3.2),
where $\bar P(s,t,u)$ represents the sum of the two polynomials associated with the two
parts. The coefficients $\bar p_1, \ldots, \bar p_6$ involve a linear combination of the various moments
introduced above. In fact, up to terms of $O(p^8)$, the result may be expressed in terms of
the combinations
$$\bar I_n^0 = I_n^0 + J_n^0, \qquad \bar I_n^1 = I_n^1 + 3J_n^1, \qquad \bar I_n^2 = I_n^2 + J_n^2. \tag{E.6}$$
The contributions from the S- and P-wave moments $J_n^I$ precisely represent the pieces
needed to complete the sum over the angular momenta: the factor of 3 in front of the
P-wave moments accounts for the weight $2\ell + 1$ that occurs in the definition (E.4) of the
moments $I_n^1$. The result is independent of the energy $s_2$ used in the decomposition (D.1) of
the amplitude and is given in Eqs. (3.4), (3.5).
The moments are readily evaluated with the information given in Ref. [7]. Since the
angular momentum barrier suppresses the higher partial waves near threshold, the terms
$I_2^0$, $I_2^1$ and $I_2^2$ are negligibly small. The contributions from the S- and P-waves depend on
the values of the two S-wave scattering lengths $a_0^0$ and $a_0^2$. In the narrow range of interest
here, this dependence is well described by a quadratic interpolation of the form
$$\bar I = u_0 + u_1\,\Delta a_0^0 + u_2\,\Delta a_0^2 + u_3\,(\Delta a_0^0)^2 + u_4\,\Delta a_0^0\,\Delta a_0^2 + u_5\,(\Delta a_0^2)^2,$$
with $\Delta a_0^0 = a_0^0 - 0.225$, $\Delta a_0^2 = a_0^2 + 0.03706$. The numerical values of the coefficients are
listed in Table 3.
The quantity $H$ exclusively receives contributions from the partial waves with $\ell \geq 2$.
As discussed in detail in [7], $H$ is dominated by the contribution from the lowest spin 2
resonance, which is independent of $a_0^0$, $a_0^2$. The behaviour of the integrand in the threshold
region does depend on the two S-wave scattering lengths, because these determine the
threshold parameters of the higher partial waves, but since the contributions from that
region are very small, we ignore the dependence on $a_0^0$, $a_0^2$ and use the value given in [7]:
$$H = 0.32\ \mathrm{GeV}^{-6}. \tag{E.7}$$
Table 3
Moments of the background amplitude

                             u0       u1       u2       u3       u4       u5
I = 0   I_0^0 (GeV^-4)     9.44     58.5     75.2     82.4      232      322
        I_1^0 (GeV^-6)     66.7      507      462      902     1920     2280
        I_2^0 (GeV^-6)      609     4980     3300     9800    16200    16900
I = 1   I_0^1 (GeV^-4)     1.90     1.97     7.99     1.09     2.66     17.4
        I_1^1 (GeV^-6)     3.92     10.6     6.35     25.6     25.2     99.3
        I_2^1 (GeV^-6)      2.1     2.84     18.3      110     72.6      349
I = 2   I_0^2 (GeV^-4)    0.469     1.56     15.1    0.295     14.5      125
        I_1^2 (GeV^-6)     2.61     5.84     97.8     2.65     54.3      932
        I_2^2 (GeV^-6)     21.4     33.9      884     23.6      317     9190
Appendix F. Error analysis and correlations

The matching conditions and the two loop representation for the scalar radius determine
the values of the constants
$$x = \left\{a_0^0,\ a_0^2,\ \bar\ell_1,\ \bar\ell_2,\ \bar\ell_4,\ \tilde r_5,\ \tilde r_6\right\}$$
as functions of the parameters occurring in our input,
$$y = \left\{\langle r^2\rangle_s,\ \bar\ell_3,\ \tilde r_1,\ \tilde r_2,\ \tilde r_3,\ \tilde r_4,\ \tilde r_{S_2},\ \theta_0,\ \theta_1\right\}.$$
The last two of these characterize the experimental input used when solving the Roy
equations. Strictly speaking, that part of our input involves three functions, the imaginary
parts of the S- and P-waves above 0.8 GeV, but in practice, the solution of the matching
conditions is sensitive only to two parameters: the values of the phases $\delta_0^0(s)$ and $\delta_1^1(s)$ at
$\sqrt{s} = 0.8$ GeV, which we denote by $\theta_0$ and $\theta_1$, respectively. In Ref. [7], the uncertainties
in these parameters are estimated at $\theta_0 = 82.3^\circ \pm 3.4^\circ$, $\theta_1 = 108.9^\circ \pm 2^\circ$.
The uncertainties in the input give rise both to uncertainties in the individual components
of $x$ and to correlations among these. We describe the correlations in terms of a Gaussian
distribution: the probability for $x$ to be contained in the volume element $dx$ is represented
as
$$dP = N \exp\left(-\tfrac{1}{2}Q\right) dx, \qquad Q = \sum_{ab} C_{ab}\,\Delta x^a\,\Delta x^b, \qquad \Delta x^a = x^a - \langle x^a\rangle. \tag{F.1}$$
The uncertainties in our results are characterized by the coefficients $C_{ab}$ of the quadratic
form in the exponential. In the Gaussian approximation, the matrix $C$ is given by the
inverse of the correlation matrix $K$,
$$C \cdot K = 1, \qquad K^{ab} = \langle \Delta x^a\,\Delta x^b\rangle, \tag{F.2}$$
so that the error analysis boils down to an evaluation of the matrix $K$.
In the small region of interest, the response to a change of the input variables is
approximately linear. We denote the central values of the input parameters by $y_c$ and set
$y^i = y_c^i\,(1 + \varepsilon_i)$, with $\langle\varepsilon_i\rangle = 0$. Linearity then implies that the mean value of $x^a$ coincides
with the solution of the matching conditions that corresponds to our central set of input
parameters, and that the correlation matrix $K$ can be expressed in terms of the matrix
$\langle\varepsilon_i\,\varepsilon_k\rangle$. We treat the input variables as statistically independent, so that this matrix is
diagonal,
$$\langle\varepsilon_i\,\varepsilon_k\rangle = \delta_{ik}\,(\sigma_i)^2. \tag{F.3}$$

Table 4
Numerical elements of the correlation matrix (F.2)

          Δa_0^0     Δa_0^2     Δℓ_1       Δℓ_2       Δℓ_4       Δr_5       Δr_6
Δa_0^0    2.0×10^-5  3.2×10^-6  1.9×10^-4  1.7×10^-5  4.2×10^-4  3.2×10^-4  3.3×10^-5
Δa_0^2               9.7×10^-7  1.6×10^-4  1.2×10^-5  4.2×10^-6  2.2×10^-4  2.2×10^-5
Δℓ_1                            3.5×10^-1  3.3×10^-2  6.7×10^-2  5.4×10^-1  3.7×10^-2
Δℓ_2                                       1.2×10^-2  7.2×10^-3  1.1×10^-2  4.6×10^-3
Δℓ_4                                                  4.8×10^-2  9.1×10^-2  2.2×10^-3
Δr_5                                                             1.1        9.2×10^-2
Δr_6                                                                        1.1×10^-2

The result of the dispersive evaluation, $\langle r^2\rangle_s = 0.61 \pm 0.04\ \mathrm{fm}^2$, implies that the mean
square deviation in the variable $\varepsilon_1$ is given by $\sigma_1 = 0.04/0.61$. We interpret the estimate
$\bar\ell_3 = 2.9 \pm 2.4$ in the same manner: $\sigma_2 = 2.4/2.9$. Concerning the variables $\tilde r_1, \ldots, \tilde r_4$ and
$\tilde r_{S_2}$, we assume that all values in the interval from 0 to twice the value obtained from
resonance saturation are equally likely, so that, for $n = 3, \ldots, 7$, the mean square deviation
becomes $\sigma_n = 1/\sqrt{3}$. Finally, the uncertainties in the input parameters $\theta_0$ and $\theta_1$ amount
to $\sigma_8 = 3.4/82.3$ and $\sigma_9 = 2/108.9$, respectively.
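The linear error propagation described above can be sketched in a few lines. The relative uncertainties are those quoted in the text; the factor $1/\sqrt{3}$ is checked against the standard deviation of a flat distribution on the interval from 0 to twice the central value. The response coefficients in `toy_response` are purely hypothetical numbers chosen for illustration (the actual sensitivities of the results to the inputs are not given in this appendix), and the function names are likewise assumptions of this example.

```python
import math

# Relative uncertainties sigma_1 ... sigma_9 of the input parameters (from the text).
sigma = (
    [0.04 / 0.61, 2.4 / 2.9]        # scalar radius, l_3
    + [1 / math.sqrt(3)] * 5        # r_1, ..., r_4, r_S2
    + [3.4 / 82.3, 2.0 / 108.9]     # theta_0, theta_1
)

def uniform_relative_std(central):
    """Standard deviation divided by the central value, flat distribution on [0, 2*central]."""
    width = 2.0 * central
    return (width / math.sqrt(12.0)) / central

def correlation_matrix(response, sigma):
    """K_ab = sum_i D_ai D_bi sigma_i^2 for a linear response dx^a = sum_i D_ai eps_i."""
    size = len(response)
    return [
        [sum(response[a][i] * response[b][i] * sigma[i] ** 2 for i in range(len(sigma)))
         for b in range(size)]
        for a in range(size)
    ]

# Hypothetical response coefficients for two output quantities (illustration only).
toy_response = [
    [0.010, 0.002, 0.0, 0.0, 0.0, 0.0, 0.0, 0.020, 0.005],
    [0.001, 0.001, 0.0, 0.0, 0.0, 0.0, 0.0, 0.004, 0.001],
]
K_toy = correlation_matrix(toy_response, sigma)
errors = [math.sqrt(K_toy[a][a]) for a in range(2)]
print(K_toy, errors)
```

Because the inputs are treated as statistically independent, the matrix is a plain sum over inputs, so the contribution of each input to a diagonal element can be read off term by term.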
We also need an estimate for the sensitivity of our results to the value of the scale $\mu$
used when applying the resonance estimates. For the mean values, we use $\mu = M_\rho$. To
estimate the uncertainties due to that choice, we evaluate the shift occurring in the quantity
of interest if the coupling constants $\tilde r_n$ are held fixed, but the scale is replaced by
$\mu = 1$ GeV, repeat the calculation for $\mu = 0.5$ GeV, and take the mean square of the two shifts.
Likewise, the corresponding contribution to the correlation matrix is the average of the two
matrices associated with those two shifts. Alternatively we could assume that all values of
$\mu$ in the interval between 0.5 and 1 GeV are equally likely. The uncertainties then become
somewhat smaller, but it suffices to slightly stretch the interval for the outcome to be nearly
the same.
The elements of the resulting correlation matrix are listed in Table 4. The off-diagonal
elements are of interest only if their numerical values are comparable to the product of
the square roots of the corresponding diagonal entries: the numbers listed are significant
only insofar as this condition is met (those below the diagonal are omitted, as the matrix
is symmetric). The errors quoted in the various rows of Table 1 are the square roots of the
corresponding contributions to the diagonal elements of the correlation matrix.
If all variables except $a_0^0$ and $a_0^2$ are integrated out, the distribution reduces to a Gaussian
in these two variables:
$$dp = n \exp\left(-\tfrac{1}{2}q\right) da_0^0\, da_0^2, \qquad q = c_{11}\,(\Delta a_0^0)^2 + 2c_{12}\,\Delta a_0^0\,\Delta a_0^2 + c_{22}\,(\Delta a_0^2)^2,$$
where the 2 × 2 matrix $c$ is the inverse of the submatrix of $K$ that collects the correlations
of $a_0^0$ and $a_0^2$. The result is illustrated in Fig. 1: the small ellipse shows the standard 68%
confidence limit, that is the contour where $q = 1$. For a careful analysis of the errors and
correlations associated with the various numerical evaluations found in the literature, we
refer to [46], where the corresponding error ellipses are also shown. Note that the radiative
corrections in the value of $F_\pi$ are often not accounted for. At the accuracy under discussion,
these matter, as they increase the results for the S-wave scattering lengths by about two
percent.
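The reduction to the two scattering lengths can be illustrated with the corresponding 2 × 2 block of the correlation matrix. The sketch below takes the three entries for $a_0^0$ and $a_0^2$ from Table 4, inverts the block by hand and evaluates the quadratic form $q$. It assumes that the off-diagonal element enters with a positive sign; the variable and function names are hypothetical.

```python
import math

# 2x2 block of the correlation matrix for (a00, a02), entries as listed in Table 4.
# Assumption of this sketch: the off-diagonal element is positive.
K11, K12, K22 = 2.0e-5, 3.2e-6, 9.7e-7

# Inverse of the symmetric 2x2 block: c = K^{-1}.
det = K11 * K22 - K12 ** 2
c11, c12, c22 = K22 / det, -K12 / det, K11 / det

sigma_a00 = math.sqrt(K11)               # one standard deviation of a00
sigma_a02 = math.sqrt(K22)               # one standard deviation of a02
rho = K12 / (sigma_a00 * sigma_a02)      # correlation coefficient

def q(delta_a00, delta_a02):
    """Quadratic form in the exponent of the two-variable Gaussian."""
    return (c11 * delta_a00 ** 2
            + 2 * c12 * delta_a00 * delta_a02
            + c22 * delta_a02 ** 2)

print(sigma_a00, sigma_a02, rho)
```

The extreme values of each variable on the contour $q = 1$ are $\pm\sqrt{K_{11}}$ and $\pm\sqrt{K_{22}}$, which is how the one-dimensional errors of the two scattering lengths are recovered from the ellipse.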
References
[1] S. Weinberg, Physica A 96 (1979) 327.
[2] R. Arnowitt, M.H. Friedman, P. Nath, R. Suitor, Phys. Rev. Lett. 20 (1968) 475;
R. Arnowitt, M.H. Friedman, P. Nath, R. Suitor, Phys. Rev. 175 (1968) 1820.
[3] M. Gell-Mann, R.J. Oakes, B. Renner, Phys. Rev. 175 (1968) 2195.
[4] S. Weinberg, Phys. Rev. Lett. 17 (1966) 616.
[5] J. Gasser, H. Leutwyler, Phys. Lett. B 125 (1983) 325.
[6] J. Bijnens, G. Colangelo, G. Ecker, J. Gasser, M.E. Sainio, Phys. Lett. B 374 (1996) 210, hep-ph/9511397;
J. Bijnens, G. Colangelo, G. Ecker, J. Gasser, M.E. Sainio, Nucl. Phys. B 508 (1997) 263,
hep-ph/9707291;
J. Bijnens, G. Colangelo, G. Ecker, J. Gasser, M.E. Sainio, Nucl. Phys. B 517 (1998) 639,
Erratum.
[7] B. Ananthanarayan, G. Colangelo, J. Gasser, H. Leutwyler, hep-ph/0005297, Phys. Rep., in
press.
[8] S.M. Roy, Phys. Lett. B 36 (1971) 353.
[9] G. Colangelo, J. Gasser, H. Leutwyler, Phys. Lett. B 488 (2000) 261, hep-ph/0007112.
[10] J. Gasser, H. Leutwyler, Ann. Phys. 158 (1984) 142.
[11] J.F. Donoghue, C. Ramirez, G. Valencia, Phys. Rev. D 38 (1988) 2195.
[12] J. Sá Borges, Phys. Lett. B 149 (1984) 21;
J. Sá Borges, J. Soares Barbosa, V. Oguri, Phys. Lett. B 393 (1997) 413;
I.P. Cavalcante, J. Sá Borges, hep-ph/0101037;
I.P. Cavalcante, J. Sá Borges, hep-ph/0101104.
[13] J. Bijnens, Nucl. Phys. B 337 (1990) 635.
[14] C. Riggenbach, J. Gasser, J.F. Donoghue, B.R. Holstein, Phys. Rev. D 43 (1991) 127.
[15] M.R. Pennington, J. Portolés, Phys. Lett. B 344 (1995) 399, hep-ph/9409426.
[16] B. Ananthanarayan, D. Toublan, G. Wanders, Phys. Rev. D 51 (1995) 1093, hep-ph/9410302;
B. Ananthanarayan, D. Toublan, G. Wanders, Phys. Rev. D 53 (1996) 2362, hep-ph/9510254.
[17] B. Ananthanarayan, P. Büttiker, Phys. Rev. D 54 (1996) 1125, hep-ph/9601285;
B. Ananthanarayan, P. Büttiker, Phys. Rev. D 54 (1996) 5501, hep-ph/9604217;
B. Ananthanarayan, P. Büttiker, Phys. Lett. B 415 (1997) 402, hep-ph/9707305.
[18] G. Wanders, Helv. Phys. Acta 70 (1997) 287, hep-ph/9605436.
[19] G. Wanders, Phys. Rev. D 56 (1997) 4328, hep-ph/9705323.
[20] D. Toublan, Phys. Rev. D 53 (1996) 6602, hep-ph/9509217.
[21] B. Ananthanarayan, Phys. Rev. D 58 (1998) 036002, hep-ph/9802338.
[22] J. Bijnens, G. Colangelo, P. Talavera, JHEP 9805 (1998) 014, hep-ph/9805389.
[23] G. Amoros, J. Bijnens, P. Talavera, Nucl. Phys. B 585 (2000) 293;
G. Amoros, J. Bijnens, P. Talavera, Nucl. Phys. B 598 (2001) 665, Erratum.
[24] M. Knecht, B. Moussallam, J. Stern, N.H. Fuchs, Nucl. Phys. B 457 (1995) 513, hep-ph/9507319;
M. Knecht, B. Moussallam, J. Stern, N.H. Fuchs, Nucl. Phys. B 471 (1996) 445, hep-ph/9512404.
[25] L. Girlanda, M. Knecht, B. Moussallam, J. Stern, Phys. Lett. B 409 (1997) 461, hep-ph/9703448.
[26] J.L. Basdevant, J.C. Le Guillou, H. Navelet, Nuovo Cimento A 7 (1972) 363;
J.L. Basdevant, C.D. Froggatt, J.L. Petersen, Phys. Lett. B 41 (1972) 173;
J.L. Basdevant, C.D. Froggatt, J.L. Petersen, Phys. Lett. B 41 (1972) 178;
M.R. Pennington, S.D. Protopopescu, Phys. Rev. D 7 (1973) 1429;
M.R. Pennington, S.D. Protopopescu, Phys. Rev. D 7 (1973) 2591;
J.L. Basdevant, C.D. Froggatt, J.L. Petersen, Nucl. Phys. B 72 (1974) 413.
[27] C.D. Froggatt, J.L. Petersen, Nucl. Phys. B 129 (1977) 89.
[28] M.M. Nagels et al., Nucl. Phys. B 147 (1979) 189.
[29] L. Rosselet et al., Phys. Rev. D 15 (1977) 574.
[30] J. Lowe, in Ref. [32], p. 375;
J. Lowe, hep-ph/9711361;
J. Lowe, in: Proceedings of the Workshop on Physics and Detectors for DAΦNE, Frascati, Nov.
16–19, 1999, p. 439, http://wwwsis.lnf.infn.it/talkshow/dafne99.htm;
S. Pislak et al., A new measurement of K+ → π+π−e+ν (Ke4), Talk given at Laboratori
Nazionali di Frascati, June 22, 2000.
[31] P. Truöl et al., E865 Collaboration, hep-ex/0012012.
[32] A.M. Bernstein, D. Drechsel, T. Walcher (Eds.), Chiral Dynamics: Theory and Experiment,
Lecture Notes in Physics, Vol. 513, Springer, 1997, Workshop held in Mainz, Germany,
1–5 Sept., 1997.
[33] J. Bijnens, G. Colangelo, J. Gasser, Nucl. Phys. B 427 (1994) 427, hep-ph/9403390.
[34] R. Batley et al., NA48 Collaboration, CERN/SPSC 2000-003.
[35] B. Adeva et al., CERN proposal CERN/SPSLC 95-1 (1995); available at
http://dirac.web.cern.ch/DIRAC/.
[36] S. Deser, M.L. Goldberger, K. Baumann, W. Thirring, Phys. Rev. 96 (1954) 774;
J.L. Uretsky, T.R. Palfrey Jr., Phys. Rev. 121 (1961) 1798;
T.L. Trueman, Nucl. Phys. 26 (1961) 57;
S.M. Bilenky, V.K. Nguyen, L.L. Nemenov, F.G. Tkebuchava, Sov. J. Nucl. Phys. 10 (1969)
469.
[37] J. Gasser, V.E. Lyubovitskij, A. Rusetsky, Phys. Lett. B 471 (1999) 244, hep-ph/9910438;
H. Sazdjian, Phys. Lett. B 490 (2000) 203, hep-ph/0004226;
H. Sazdjian, hep-ph/0012228;
Recent work on related matters is described in:
A. Gashi, G. Rasche, G.C. Oades, W.S. Woolcock, Nucl. Phys. A 628 (1998) 101, nucl-th/9704017;
H. Jallouli, H. Sazdjian, Phys. Rev. D 58 (1998) 014011, hep-ph/9706450;
P. Labelle, K. Buckley, hep-ph/9804201;
M.A. Ivanov, V.E. Lyubovitskij, E.Z. Lipartia, A.G. Rusetsky, Phys. Rev. D 58 (1998) 094024,
hep-ph/9805356;
P. Minkowski, in: M.A. Ivanov et al. (Eds.), Proceedings of the International Workshop
Hadronic Atoms and Positronium in the Standard Model, Dubna, 26–31 May, 1998, p. 74,
hep-ph/9808387;
E.A. Kuraev, Phys. Atom. Nucl. 61 (1998) 239;
U. Jentschura, G. Soff, V. Ivanov, S.G. Karshenboim, Phys. Lett. A 241 (1998) 351;
B.R. Holstein, Phys. Rev. D 60 (1999) 114030, nucl-th/9901041;
X. Kong, F. Ravndal, Phys. Rev. D 59 (1999) 014031;
X. Kong, F. Ravndal, Phys. Rev. D 61 (2000) 077506, hep-ph/9905539;
H.W. Hammer, J.N. Ng, Eur. Phys. J. A 6 (1999) 115, hep-ph/9902284;
D. Eiras, J. Soto, Phys. Rev. D 61 (2000) 114027, hep-ph/9905543;
D. Eiras, J. Soto, Phys. Lett. B 491 (2000) 101, hep-ph/0005066.
[38] M. Kermani et al., CHAOS Collaboration, Phys. Rev. C 58 (1998) 3419;
M. Kermani et al., CHAOS Collaboration, Phys. Rev. C 58 (1998) 3431.
[39] B.R. Holstein, Phys. Lett. B 244 (1990) 83.
[40] G. Ecker, J. Gasser, A. Pich, E. de Rafael, Nucl. Phys. B 321 (1989) 311.
[41] J. Bijnens, G. Colangelo, G. Ecker, JHEP 9902 (1999) 020, hep-ph/9902437;
J. Bijnens, G. Colangelo, G. Ecker, Ann. Phys. 280 (2000) 100, hep-ph/9907333.
[42] J.F. Donoghue, J. Gasser, H. Leutwyler, Nucl. Phys. B 343 (1990) 341.
[43] S.R. Amendolia et al., NA7 Collaboration, Nucl. Phys. B 277 (1986) 168.
[44] J. Gasser, H. Leutwyler, Nucl. Phys. B 250 (1985) 465.
[45] T. Hannah, Phys. Rev. D 55 (1997) 5613, hep-ph/9701389.
[46] J. Nieves, E. Ruiz Arriola, Eur. Phys. J. A 8 (2000) 377, hep-ph/9906437.
[47] M.G. Olsson, Phys. Rev. 162 (1967) 1338.
[48] G. Colangelo, J. Gasser, H. Leutwyler, hep-ph/0103063, Phys. Rev. Lett., in press.
[49] Forthcoming paper by the E865 collaboration.
[50] G. Ecker, in [32], p. 337–351, hep-ph/9710560.
[51] B. Hyams et al., Nucl. Phys. B 64 (1973) 134.
[52] S.D. Protopopescu et al., Phys. Rev. D 7 (1973) 1279.
[53] W. Hoogland et al., Nucl. Phys. B 126 (1977) 109.
[54] M.J. Losty et al., Nucl. Phys. B 69 (1974) 185.
[55] K. Takamatsu, Sigma Collaboration, Prog. Theor. Phys. 102 (2001) E52, hep-ph/0012324.
[56] D. Alde et al., GAMS Collaboration, Phys. Lett. B 397 (1997) 350;
R. Bellazzini et al., GAMS Collaboration, Phys. Lett. B 467 (1999) 296.
[57] A. Schenk, Nucl. Phys. B 363 (1991) 97.
[58] A. Pich, J. Portolés, hep-ph/0101194.
[59] S. Anderson et al., CLEO Collaboration, Phys. Rev. D 61 (2000) 112002, hep-ex/9910046.
[60] J.H. Kühn, A. Santamaria, Z. Phys. C 48 (1990) 445.
[61] D.E. Groom et al., Eur. Phys. J. C 15 (2000) 1.
[62] S. Ishida et al. (Eds.), Possible Existence of the σ-Meson and Its Implications to Hadron
Physics, workshop held in Kyoto, Japan, 12–14 June 2000, Soryushiron Kenkyu, Vol. 102,
No. 5 (2001). Reprint available at http://amaterasu.kek.jp/YITPws/online/index.html.
[63] E. Oset, H. Toki, M. Mizobe, T.T. Takahashi, Prog. Theor. Phys. 103 (2000) 351, nucl-th/0011008.
[64] G. Wanders, Helv. Phys. Acta 39 (1966) 228.
[65] V. Lubicz, Nucl. Phys. Proc. Suppl. 94 (2001) 116, hep-lat/0012003.
[66] U. Bürgi, Nucl. Phys. B 479 (1996) 392, hep-ph/9602429.
[67] G. Colangelo, Phys. Lett. B 350 (1995) 85, hep-ph/9502285;
G. Colangelo, Phys. Lett. B 361 (1995) 234, Erratum.
[68] H. Leutwyler, Nucl. Phys. Proc. Suppl. 94 (2001) 108, hep-ph/0011049.
[69] J. Heitger, R. Sommer, H. Wittig, ALPHA Collaboration, Nucl. Phys. B 588 (2000) 377, hep-lat/0006026.
[70] S.R. Sharpe, Phys. Rev. D 41 (1990) 3233;
S.R. Sharpe, Phys. Rev. D 46 (1992) 3146, hep-lat/9205020;
C.W. Bernard, M.F. Golterman, Phys. Rev. D 46 (1992) 853, hep-lat/9204007.
[71] G. Colangelo, E. Pallante, Phys. Lett. B 409 (1997) 455, hep-lat/9702019;
G. Colangelo, E. Pallante, Nucl. Phys. B 520 (1998) 433, hep-lat/9708005.
[72] G. Colangelo, Phys. Lett. B 395 (1997) 289, hep-ph/9607205.
[73] S. Aoki et al., JLQCD Collaboration, Nucl. Phys. Proc. Suppl. 83 (2000) 241, hep-lat/9911025.
[74] C.W. Bernard, M.F. Golterman, Phys. Rev. D 53 (1996) 476, hep-lat/9507004.
[75] M. Lüscher, Commun. Math. Phys. 105 (1986) 153.
[76] K. Maltman, C.E. Wolfe, Phys. Lett. B 393 (1997) 19, nucl-th/9610051;
U. Meissner, G. Muller, S. Steininger, Phys. Lett. B 406 (1997) 154, hep-ph/9704377;
U. Meissner, G. Muller, S. Steininger, Phys. Lett. B 407 (1997) 454;
K. Maltman, C.E. Wolfe, Phys. Lett. B 424 (1998) 413;
M. Knecht, R. Urech, Nucl. Phys. B 519 (1998) 329, hep-ph/9709348.
[77] J. Gasser, V.E. Lyubovitskij, A. Rusetsky, A. Gall, hep-ph/0103157, Phys. Rev. D, in press.

Helicity modulus as renormalized coupling
in the O(3) σ-model
ALPHA Collaboration
Heiko Molke 1 , Ulli Wolff
Institut für Physik, Humboldt Universität, Invalidenstr. 110, D-10099 Berlin, Germany
Received 29 December 2000; accepted 4 April 2001
Abstract
For the family of O(n) invariant nonlinear σ-models we consider boundary conditions that are
periodic up to an O(n) rotation. The helicity modulus is related to the change in free energy under
variations of the corresponding angle. It defines a nonperturbative finite volume running coupling
similar to the Schrödinger functional for QCD. For the two-dimensional O(3) model we investigate
this quantity by analytical and numerical techniques. We establish its universal continuum relation
to the finite volume massgap coupling at all scales and coupling strengths. © 2001 Elsevier Science
B.V. All rights reserved.
PACS: 11.15.Ha
1. Introduction
The property of asymptotic freedom is a decisive feature of QCD as well as of a
large class of two-dimensional non-Abelian spin models like the O(n) σ-models for
n > 2. Although it is based only on proofs in perturbation theory (to all orders), the
following structural properties of these models are widely accepted and assumed here. 2
The continuum limit is reached at vanishing bare coupling and a mass scale emerges
in the renormalized continuum theory by dimensional transmutation. Many features
associated with much higher energies or short distance can be related to each other by
renormalized perturbation theory. Usually one singles out one suitable high energy quantity
as renormalized coupling and constructs expansions for other observables in its powers.
E-mail address: uwolff@physik.hu-berlin.de (U. Wolff).
1 Present address: DESY, Platanenallee 6, D-15738 Zeuthen, Germany.
2 See Ref. [2] for a diverging point of view.
PII: S 0 5 5 0 - 3 2 1 3 ( 0 1 ) 0 0 1 7 6 - 6
H. Molke, U. Wolff / Nuclear Physics B 603 (2001) 180–194
Other phenomena around the fundamental scale, like the spectrum, are not accessible to
this strategy and are often investigated by numerical simulation. These opposite sectors
are really features of one and the same theory and it is hence both interesting and possible to
relate them, that is, to compute renormalized coupling constants at high energy in terms of
the low energy scale. This has been the programme of the ALPHA Collaboration in recent
years. An efficient method has been developed first for the O(3) σ-model [3] followed by
quenched QCD which is reviewed in [4]. Many present activities are related to the goal of
including dynamical quarks.
To relate the perturbative sector with low energy physics very dissimilar continuum
scales have to be accommodated on a lattice with a spacing that is small compared to
all other scales. In a direct approach this either calls for unmanageably large lattices or
one has to compromise with the attempted limits like the continuum extrapolation. The
ALPHA strategy overcomes this problem by a finite size scaling technique. A running
coupling constant $\bar g^2(L)$ is constructed in the continuum limit, which at large L can be
related to spectral scales and which at small L can be used as an expansion parameter and
thus gives access to the perturbative sector. Step by step one computes $\bar g^2(2L)$ in terms of
$\bar g^2(L)$ by continuum extrapolation. Since the system size L is used as the physical scale,
L/a is the only large scale ratio, where a is the lattice spacing. The choice of $\bar g$ is not
unique and a number of practical criteria were taken into account.
For the O(n) model the finite volume mass gap was used in [3]
$$\bar g^2 = \frac{2}{n-1}\, m(L)\, L, \tag{1.1}$$
where m(L) is the gap of the transfer matrix referring to spatial periodic boundary
conditions with periodicity L. For L → ∞, m(L) saturates to the infinite volume massgap,
which we identify with the dynamically generated scale. At small L the mass gap becomes
perturbative [5] and can be used as an expansion parameter. For gauge theories a similarly
convenient though less obvious quantity was defined via the Schrödinger functional [4,6].
The basic mechanism is to study the response of the free energy under the variation of an
angle that enters into non-trivial boundary conditions. Again the system size is the only
scale beside the cutoff, and for the quenched theory $\alpha_s$ in the high energy limit has been
computed to a satisfactory precision.
In this article we define and investigate an alternative coupling $\bar g_\Upsilon(L)$ for the O(n)
σ-model in two dimensions, which is closely related to the helicity modulus $\Upsilon$. We compare
its properties with the massgap coupling $\bar g(L)$. They can be analytically related in the
continuum for both small and large coupling. Both expansions are checked and the
crossover range is controlled by high precision numerical simulations. In Section 2 we
define the helicity modulus and relate it to $\bar g$ at strong coupling. In Section 3 $\bar g_\Upsilon$ is properly
normalized and its weak coupling behavior is explored up to two loops with details given in
Appendix A. Section 4 summarizes our numerical work and gives conclusions. This work
is based on the diploma thesis [1] of H. Molke. The introduction of $\bar g_\Upsilon$ goes back to earlier
attempts [7,8] to investigate renormalization by finite size techniques.
2. Helicity modulus and transfer matrix

In this section we introduce the helicity modulus. For earlier discussions of similar
quantities see Ref. [9] and further references found there.
We consider the O(n) σ-model with its standard nearest neighbor lattice action
$$S = -\frac{1}{g_0^2} \sum_{x,\mu} s(x) \cdot s(x + a\hat\mu), \tag{2.1}$$
where s(x) is the unit length spin field and $\hat\mu$, μ = 0, 1, are unit vectors along the axes of a
square lattice with spacing a. We take T as the size in the time or 0-direction and L in the
other direction. In space direction we impose strictly periodic boundary conditions, while
in the time direction we demand periodicity up to a planar SO(n) rotation in spin space by
an angle α,
$$s(x + T\hat 0) = \exp(\alpha K_{ij})\, s(x). \tag{2.2}$$
Here $K_{ij}$ generates rotations in the ij plane,
$$(K_{ij})_{kl} = \delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}, \tag{2.3}$$
and we assume i ≠ j fixed to an arbitrary pair of values until further notice. Integration of
all spins with the O(n) invariant measure gives the partition function
$$Z_\alpha = \int Ds\, \exp(-S). \tag{2.4}$$
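As a small illustration of the generator in (2.3), the following sketch builds $K_{ij}$ as a plain matrix and exponentiates it in closed form. It is only a sanity check under the assumption that (2.3) reads $(K_{ij})_{kl} = \delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}$; the function names and the 0-based indices are illustrative choices, not part of the paper.

```python
import math

def generator(n, i, j):
    """K_ij with (K_ij)_kl = d_ik d_jl - d_il d_jk, as nested lists (0-based i != j)."""
    K = [[0.0] * n for _ in range(n)]
    K[i][j] = 1.0
    K[j][i] = -1.0
    return K

def matmul(A, B):
    """Plain matrix product of two square nested-list matrices."""
    n = len(A)
    return [[sum(A[r][m] * B[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]

def rotation(n, i, j, alpha):
    """exp(alpha K_ij) = 1 + sin(alpha) K_ij + (1 - cos(alpha)) K_ij^2.

    The closed form follows from K_ij^3 = -K_ij.
    """
    K = generator(n, i, j)
    K2 = matmul(K, K)
    return [[(1.0 if r == c else 0.0)
             + math.sin(alpha) * K[r][c]
             + (1.0 - math.cos(alpha)) * K2[r][c]
             for c in range(n)] for r in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]
```

The result is a planar rotation: it is orthogonal, leaves the n − 2 directions outside the ij plane untouched, and composes additively in the angle, which is what makes the twisted boundary condition (2.2) a one-parameter family.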
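Ratios $Z_{\alpha_1}/Z_{\alpha_2}$ depend on differences in free energy for different boundary conditions and
are expected to be universal. We now define the helicity modulus $\Upsilon$ by
$$\Upsilon = -\frac{1}{Z_0} \left.\frac{\partial^2 Z_\alpha}{\partial\alpha^2}\right|_{\alpha=0}. \tag{2.5}$$
From this definition it is rather easy to establish a connection between $\Upsilon$ and the transfer
matrix $\mathbb{T}$. For the fully periodic case $Z_0$ we have
$$Z_0 = \mathrm{tr}\, \mathbb{T}^{T/a}. \tag{2.6}$$
The extra twist by an angle α corresponds to inserting a rotation operator under the trace
in the Hilbert space where $\mathbb{T}$ operates,
$$Z_\alpha = \mathrm{tr}\left[\mathbb{T}^{T/a} \exp(i\alpha \hat K_{ij})\right]. \tag{2.7}$$
If we realize states as wavefunctions $\psi(\sigma)$ on spatial one-dimensional spin fields $\sigma$, this
induced operator is given by
$$\left[\exp(i\alpha\hat K_{ij})\,\psi\right](\sigma) = \psi\left(\exp(\alpha K_{ij})\,\sigma\right). \tag{2.8}$$
The operator $\mathbb{T}$ possesses real positive eigenvalues $\lambda_0 > \lambda_1 \ge \lambda_2 \ge \cdots$ with $\lambda_0$
corresponding to the nondegenerate ground state. These eigenvalues depend on L and a.
We define the finite volume massgap m(L) as
$$\exp(-m(L)\,a) = \frac{\lambda_1}{\lambda_0}. \tag{2.9}$$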
Due to the O(n) invariance $\mathbb{T}$ and $\hat K_{ij}$ commute and we simultaneously diagonalize
the generator that appears in (2.7). Hence for each eigenstate there is a value $\kappa_k$, k =
0, 1, 2, . . . . For n = 3 these are just the magnetic quantum numbers of eigenstates in all
possible integer isospin multiplets in the spectrum. The α-dependent partition function is
now given by
$$Z_\alpha = \sum_{k\ge 0} (\lambda_k)^{T/a} \exp(i\alpha\kappa_k), \tag{2.10}$$
and for the helicity modulus we obtain
$$\Upsilon = \frac{1}{Z_0} \sum_{k \ge 0} (\lambda_k)^{T/a}\, \kappa_k^2. \tag{2.11}$$
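The step from the twisted partition function to the spectral sum for the helicity modulus can be checked on a toy spectrum. The sketch below assumes the definitions read $Z_\alpha = \sum_k \lambda_k^{T/a} e^{i\alpha\kappa_k}$ and $\Upsilon = -Z_0^{-1}\,\partial^2 Z_\alpha/\partial\alpha^2|_{\alpha=0}$; the eigenvalues in `TOY_LEVELS` are made up for illustration and are not results of the paper.

```python
import cmath

def partition_function(levels, t_over_a, alpha):
    """Z_alpha = sum_k lambda_k^(T/a) exp(i alpha kappa_k) for (lambda, kappa) pairs."""
    return sum((lam ** t_over_a) * cmath.exp(1j * alpha * kappa)
               for lam, kappa in levels)

def helicity_from_spectrum(levels, t_over_a):
    """Spectral sum: (1/Z_0) sum_k lambda_k^(T/a) kappa_k^2."""
    z0 = sum(lam ** t_over_a for lam, _ in levels)
    return sum((lam ** t_over_a) * kappa ** 2 for lam, kappa in levels) / z0

def helicity_from_derivative(levels, t_over_a, h=1e-4):
    """-(1/Z_0) d^2 Z_alpha / d alpha^2 at alpha = 0, by central differences."""
    z0 = partition_function(levels, t_over_a, 0.0).real
    zp = partition_function(levels, t_over_a, h).real
    zm = partition_function(levels, t_over_a, -h).real
    return -(zp - 2.0 * z0 + zm) / (h * h * z0)

# Hypothetical toy spectrum for n = 3: an invariant ground state plus one
# degenerate triplet with magnetic quantum numbers 1, 0, -1.
TOY_LEVELS = [(1.0, 0), (0.8, 1), (0.8, 0), (0.8, -1)]
```

For this toy input both routes give the same number, and only the states with nonzero $\kappa_k$ contribute, which is why the ground state drops out of the numerator.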
For large mT we expect the ground state and the lowest excitations above it to dominate.
While the ground state is O(n) invariant, $\kappa_0 = 0$, we expect an n-fold degenerate vector
multiplet of one particle states as the next excitations above it. On it the generators are
represented in the form (2.3) and have eigenvalues 1, −1 and n − 2 times 0. Asymptotically
this implies
$$\Upsilon \simeq 2 \sum_p \exp(-E_p T) \qquad (T \to \infty). \tag{2.12}$$
The sum here is over single particle momenta p = 2πj/L with −L/2a < j ≤ L/2a,
and $E_p$ are the associated energies. Contributions of higher states are assumed to be
exponentially suppressed. We now take the continuum limit at fixed large values
for mL and mT. The one-particle spectrum is expected to become relativistic, $E_p = \sqrt{m^2 + p^2}$,
and we find that $\Upsilon$ is given in terms of the renormalized coupling (1.1) introduced in [3]
by
$$\Upsilon \simeq 2 \sum_{j=-\infty}^{+\infty} \exp\left(-\rho\sqrt{z^2 + (2\pi j)^2}\right) \qquad (\bar g^2 \gg 1), \tag{2.13}$$
with
$$z = \frac{n-1}{2}\, \bar g^2. \tag{2.14}$$
This formula holds in the continuum limit for large $\bar g^2$ and fixed aspect ratio
$$\rho = \frac{T}{L}. \tag{2.15}$$
For not extremely large $\bar g^2$ it only takes the few lowest j-values to carry out the sum
in (2.13) numerically to machine precision. By Poisson resummation one may derive the
asymptotic form
$$\Upsilon \simeq \frac{2z}{\pi}\, K_1(\rho z) + O\left(\exp\left(-z\sqrt{\rho^2 + 1}\right)\right) \tag{2.16}$$
$$\simeq \sqrt{\frac{2z}{\pi\rho}}\, \exp(-\rho z)\left[1 + O(z^{-1})\right], \tag{2.17}$$
where $K_1$ is a modified Bessel function. In Fig. 1 we show how these asymptotic forms
approach the sum (2.13) for the case ρ = 1. Also shown is the j = 0 term alone, which
underestimates the sum by at most 5% for z ≤ 3.5.
Fig. 1. Relative discrepancies between the strong coupling formula (2.13) and approximations to it.
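The comparison described for Fig. 1 can be reproduced with a few lines of code. The sketch below assumes the strong coupling sum has the form $2\sum_j \exp(-\rho\sqrt{z^2+(2\pi j)^2})$ with $z = (n-1)\bar g^2/2$, and that its Poisson-resummed leading term is $(2z/\pi)K_1(\rho z)$; these readings, the cutoff `jmax`, and the quadrature used for $K_1$ are assumptions of this example rather than statements of the paper.

```python
import math

def strong_coupling_sum(z, rho=1.0, jmax=200):
    """2 * sum over |j| <= jmax of exp(-rho * sqrt(z^2 + (2 pi j)^2))."""
    total = math.exp(-rho * z)  # j = 0 term
    for j in range(1, jmax + 1):
        # +j and -j contribute equally
        total += 2.0 * math.exp(-rho * math.sqrt(z * z + (2.0 * math.pi * j) ** 2))
    return 2.0 * total

def j0_term(z, rho=1.0):
    """The j = 0 term of the sum alone."""
    return 2.0 * math.exp(-rho * z)

def bessel_k1(x, tmax=10.0, steps=4000):
    """K_1(x) = int_0^inf exp(-x cosh t) cosh t dt, trapezoidal rule on [0, tmax]."""
    h = tmax / steps
    acc = 0.5 * math.exp(-x)  # t = 0 endpoint carries weight 1/2
    for i in range(1, steps + 1):
        c = math.cosh(i * h)
        weight = 0.5 if i == steps else 1.0
        acc += weight * math.exp(-x * c) * c
    return acc * h

def poisson_leading(z, rho=1.0):
    """Leading Poisson-resummed term (2 z / pi) K_1(rho z)."""
    return 2.0 * z / math.pi * bessel_k1(rho * z)

def large_z_form(z, rho=1.0):
    """Leading large-z behaviour sqrt(2 z / (pi rho)) exp(-rho z)."""
    return math.sqrt(2.0 * z / (math.pi * rho)) * math.exp(-rho * z)
```

With ρ = 1 the j = 0 term falls short of the full sum by a little under 5% at z = 3.5, while the Bessel form only becomes accurate at larger z, consistent with the two approximations being useful in opposite regimes.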
3. Helicity modulus in perturbation theory
3.1. Preparation and leading order
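By changing the integration variables in (2.4)
$$s(x) \to \exp\left(\alpha\, \frac{x_0}{T}\, K_{ij}\right) s(x), \tag{3.1}$$
we arrive at
$$Z_\alpha = \int Ds\, \exp(-S_B), \tag{3.2}$$
where the action
$$S_B = -\frac{1}{g_0^2} \sum_{x,\mu} s(x)\, B_\mu\, s(x + a\hat\mu), \tag{3.3}$$
has acquired a constant SO(n) gauge field
$$B_0 = \exp\left(\alpha\, \frac{a}{T}\, K_{ij}\right), \qquad B_1 = 1, \tag{3.4}$$
and s(x) has become strictly periodic. The helicity modulus is now expressed by an
expectation value in the periodic ensemble
$$\Upsilon = \left\langle \frac{\partial^2 S_B}{\partial\alpha^2} \right\rangle - \left\langle \left(\frac{\partial S_B}{\partial\alpha}\right)^2 \right\rangle, \tag{3.5}$$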
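with $B_\mu = 1$ after taking all derivatives. So far the angle α has referred to rotations with
one particular generator $K_{ij}$. At this point we average over all planes i < j and split $\Upsilon$
into two O(n) invariant contributions $\Upsilon = \Upsilon_1 - \Upsilon_2$,
$$\Upsilon_1 = \left\langle \frac{\partial^2 S_B}{\partial\alpha^2} \right\rangle = \frac{2}{n g_0^2}\, \frac{a^2}{V} \sum_x \left\langle s(x) \cdot s(x + a\hat 0) \right\rangle \tag{3.6}$$
and
$$\Upsilon_2 = \left\langle \left(\frac{\partial S_B}{\partial\alpha}\right)^2 \right\rangle = \frac{2}{n(n-1) g_0^4}\, \frac{a^4}{V} \sum_{i<j} \sum_{x,y} \left\langle j^0_{ij}(x)\, j^0_{ij}(y) \right\rangle. \tag{3.7}$$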
In these expressions the volume V = T L and the currents
$$j^\mu_{ij}(x) = s(x)\, K_{ij}\, \partial_\mu s(x), \tag{3.8}$$
with the discrete derivative
$$\partial_\mu s(x) = \frac{1}{a}\left[s(x + a\hat\mu) - s(x)\right] \tag{3.9}$$
have been introduced. While $\Upsilon_1$ is a nearest neighbor correlation essentially corresponding
to the internal energy, $\Upsilon_2$ is a kind of susceptibility with correlations over all separations.
In perturbation theory contributions to $\Upsilon_2$ start at the one loop level and we find to leading
order in $g_0$
$$\Upsilon = \frac{2}{n g_0^2} + O(1). \tag{3.10}$$
A correctly normalized and nonperturbatively defined renormalized L-dependent coupling
constant can now be defined as
$$\bar g_\Upsilon^2 = \frac{2}{n}\, \Upsilon^{-1}, \tag{3.11}$$
quite in the spirit of the Schrödinger functional. Its relation to $\bar g$ from Ref. [3] is
$$\bar g_\Upsilon^2 = \begin{cases} \bar g^2 + O(\bar g^4), & \bar g^2 \to 0, \\[4pt] \dfrac{1}{n \bar g} \sqrt{\dfrac{4\pi\rho}{n-1}}\, \exp\left(\rho\, \dfrac{n-1}{2}\, \bar g^2\right), & \bar g^2 \to \infty. \end{cases} \tag{3.12}$$
As $\bar g^2$ is proportional to L at large volume or strong coupling, we find exponential growth
for $\bar g_\Upsilon^2$. This is also expected for the Schrödinger functional coupling in gauge theory [10].
The origin in both cases is the exponentially small sensitivity of the free energy to boundary
effects in a physically large volume.
3.2. Results of a two loop calculation
For simplicity we confined ourselves to the case T = L, ρ = 1 for our perturbative
and our numerical calculations. Details on the perturbative evaluation of $\Upsilon$ are given
in Appendix A. Nontrivial coefficients were evaluated for sequences of lattices of finite
L/a and then fitted to the expected asymptotic form. The extrapolation was carried out
as described in Appendix D of Ref. [11] with lattices up to L/a = 100. The cost in CPU
time was negligible in this two-dimensional case. As for the Schrödinger functional, it
was advantageous to compute some of the two loop diagrams in position space rather than
momentum space.
The nearest neighbor correlation in $\Upsilon_1$ is rather simple to compute [12] to the required
order. As a result we find
$$\Upsilon_1 = \frac{2}{n}\, \frac{1}{g_0^2} - \frac{n-1}{2n} - \frac{n-1}{16n}\, g_0^2 + O(g_0^4, a^2). \tag{3.13}$$
The current–current correlation in $\Upsilon_2$ is more involved and has the structure
$$\Upsilon_2 = \Upsilon_2^{(0)} + \Upsilon_2^{(1)} g_0^2 + O(g_0^4). \tag{3.14}$$
By combining the results for $\Upsilon_2^{(0)}$ and $\Upsilon_2^{(1)}$ from Appendix A with the expansion of $\bar g^2$
on the lattice [3,16] we find in the continuum limit the renormalized perturbative series
$$\bar g_\Upsilon^2(L) = \bar g^2(sL) + d_1\, \bar g^4(sL) + d_2\, \bar g^6(sL) + O\left(\bar g^8(sL)\right), \tag{3.15}$$
with
$$d_1(s) = (n-2)\left[0.16350689821 - \frac{1}{2\pi} \ln s\right], \tag{3.16}$$
$$d_2(s) = d_1(s)^2 + (n-2)\left[0.00315826256 - (n-2)\, 0.007733893180 - \frac{1}{(2\pi)^2} \ln s\right]. \tag{3.17}$$
Note that a free relative factor s between the scales at which $\bar g_\Upsilon$ and $\bar g$ are taken has been
introduced here for later use.
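To get a feeling for the size of the two-loop corrections, the truncated series (3.15) can be evaluated directly. The sketch below assumes the coefficients read $d_1(s) = (n-2)[0.16350689821 - \ln s/(2\pi)]$ and $d_2(s) = d_1(s)^2 + (n-2)[0.00315826256 - (n-2)\,0.007733893180 - \ln s/(2\pi)^2]$; the function names are illustrative.

```python
import math

D1_CONST = 0.16350689821
D2_CONST_A = 0.00315826256
D2_CONST_B = 0.007733893180

def d1(s=1.0, n=3):
    """One-loop coefficient (3.16), as read above."""
    return (n - 2) * (D1_CONST - math.log(s) / (2.0 * math.pi))

def d2(s=1.0, n=3):
    """Two-loop coefficient (3.17), as read above."""
    return d1(s, n) ** 2 + (n - 2) * (
        D2_CONST_A - (n - 2) * D2_CONST_B - math.log(s) / (2.0 * math.pi) ** 2
    )

def two_loop_coupling(gbar2, s=1.0, n=3):
    """Series (3.15) truncated after the g^6 term; gbar2 is the massgap coupling at scale s*L."""
    return gbar2 + d1(s, n) * gbar2 ** 2 + d2(s, n) * gbar2 ** 3
```

For n = 3 and s = 1 the truncated series gives about 0.8372 at $\bar g^2 = 0.7390$ and about 0.9377 at $\bar g^2 = 0.8166$, close to the continuum values 0.8388(6) and 0.9386(6) quoted in Table 2; the small remaining differences are of the size one would expect from the neglected higher orders.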
4. Numerical results
Our main goal in this section is to establish the nonperturbative relation between $\bar g^2(L)$
and $\bar g_\Upsilon^2(L)$ in the continuum limit of the O(3) model for arbitrary values of these couplings.
Our strategy is to first construct series of values L/a and $\beta = 1/g_0^2$ which correspond
to fixed values of $\bar g$. For precisely the same series we measure $\bar g_\Upsilon^2$ and extrapolate these
values to L/a → ∞. For the first part of the task we include but extend as necessary
the data from [3]. These runs were carried out with precisely the same code as described
there. In particular, we took advantage of free boundary conditions in the time direction to
extract the massgap, and the reweighting technique allowed for a post-run fine-tuning of
β to match a desired value of $\bar g^2$. The simulation of $\Upsilon$ is rather conventional on an L × L
torus. We employed the single cluster algorithm [13] and measured the observables given
in Eqs. (3.6) and (3.7). Both kinds of results are collected in Table 1.
For each of the eight series at fixed $\bar g^2$ we have pairs of values with errors $\delta(\bar g^2)$ of $\bar g^2$ and
$\delta(\bar g_\Upsilon^2)$ of $\bar g_\Upsilon^2$. For the extrapolation in a/L we combine them into effective errors in $\bar g_\Upsilon^2$ only
$$\delta_{\rm eff}(\bar g_\Upsilon^2)^2 = \delta(\bar g_\Upsilon^2)^2 + \left(\frac{\partial \bar g_\Upsilon^2}{\partial \bar g^2}\right)^2 \delta(\bar g^2)^2. \tag{4.1}$$
Table 1
Simulation results for $\bar g^2$ and $\bar g_\Upsilon^2$

L/a   β        $\bar g^2$      $\bar g_\Upsilon^2$

10    2.0489   0.7390(5)    0.8276(1)
16    2.1260   0.7390(6)    0.8341(1)
20    2.1626   0.7390(5)    0.8361(1)
24    2.1930   0.7390(6)    0.8369(1)
32    2.2422   0.7390(5)    0.8363(1)

6     1.8439   0.8166(5)    0.9108(2)
8     1.8947   0.8166(4)    0.9210(2)
10    1.9319   0.8166(6)    0.9276(2)
12    1.9637   0.8166(6)    0.9295(2)
16    2.0100   0.8166(5)    0.9350(2)
20    2.0489   0.8166(8)    0.9351(2)
32    2.1260   0.8166(8)    0.9375(3)

6     1.7276   0.9176(6)    1.0327(3)
8     1.7791   0.9176(5)    1.0460(3)
10    1.8171   0.9176(6)    1.0551(3)
12    1.8497   0.9176(7)    1.0587(3)
16    1.8965   0.9176(5)    1.0642(3)
24    1.9637   0.9176(6)    1.0674(3)

6     1.6050   1.0595(7)    1.2063(4)
8     1.6589   1.0595(7)    1.2276(4)
10    1.6982   1.0595(7)    1.2414(4)
12    1.7306   1.0595(7)    1.2462(4)
16    1.7800   1.0595(6)    1.2541(4)
20    1.8171   1.0595(7)    1.2579(4)

10    1.3634   2.0373(18)   2.9635(12)
12    1.4060   2.0373(18)   2.9848(12)
16    1.4683   2.0373(18)   3.0120(12)
24    1.5470   2.0373(18)   3.0283(13)
32    1.6000   2.0373(18)   3.0314(18)

10    1.2939   2.4596(16)   4.268(2)
12    1.3413   2.4596(16)   4.284(3)
16    1.4095   2.4596(17)   4.302(3)
20    1.4579   2.4596(17)   4.316(3)
24    1.4948   2.4596(17)   4.337(3)
32    1.5500   2.4596(16)   4.344(3)

10    1.2211   3.0280(27)   7.022(6)
12    1.2736   3.0280(27)   7.076(7)
16    1.3481   3.0280(27)   7.140(7)
24    1.4413   3.0280(27)   7.158(8)
32    1.5000   3.0280(27)   7.208(8)

10    1.1427   3.7692(17)   13.947(8)
12    1.2014   3.7692(19)   14.030(8)
16    1.2847   3.7692(18)   14.106(9)
24    1.3862   3.7692(17)   14.179(11)
32    1.4500   3.7692(16)   14.218(11)
The required slope can be estimated with sufficient accuracy from the weak and strong coupling
behavior. We then extrapolate by fitting a function $A + B(a/L)^2$ to the $\bar g_\Upsilon^2$ values with
these errors. A typical case is shown in Fig. 2 where the coarsest lattice was not included
in the fit. Recently there has been some debate in the literature [2,14] whether the use of an
asymptotic cutoff dependence proportional to $a^2$, based on perturbation theory, is justified.
Our data are compatible with this form, and we assume it for continuum extrapolations in
this work. As a variation, a linear cutoff dependence, as suggested for other quantities in
[14], could however also be fitted to the data in Fig. 2 and would lead to an extrapolation
to 4.377(7), which differs from the quadratic extrapolation not dramatically but clearly
significantly. To distinguish numerically between these and yet other behaviors would
require much higher precision and is of interest for future study. All our extrapolated
continuum results are collected in Table 2.
They are plotted in Fig. 3 and the close-up in Fig. 4, where one sees the data branch off
from the 2-loop curve (3.15). After a very narrow transition region they approach the strong
coupling behavior which we constructed by performing the sum in (2.13).
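The extrapolation procedure described above can be sketched as a weighted straight-line fit in $(a/L)^2$. The example uses the series at $\bar g^2 = 2.4596$ from Table 1, with lattice sizes assigned as L/a = 10, 12, 16, 20, 24, 32, and drops the coarsest lattice as in Fig. 2. The slope value 3.5 used to build the effective errors is an assumption of this example (a rough estimate of $\partial \bar g_\Upsilon^2/\partial \bar g^2$ near this coupling), not a number quoted in the text.

```python
import math

# (L/a, gbar_Upsilon^2, error of gbar_Upsilon^2, error of gbar^2) for gbar^2 = 2.4596
SERIES = [
    (10, 4.268, 0.002, 0.0016),
    (12, 4.284, 0.003, 0.0016),
    (16, 4.302, 0.003, 0.0017),
    (20, 4.316, 0.003, 0.0017),
    (24, 4.337, 0.003, 0.0017),
    (32, 4.344, 0.003, 0.0016),
]

def effective_error(err_ups, err_gbar2, slope):
    """Combine both errors in quadrature, the gbar^2 error scaled by the slope."""
    return math.hypot(err_ups, slope * err_gbar2)

def fit_continuum(points, slope):
    """Weighted least squares for y = A + B x with x = (a/L)^2.

    Returns (A, B, standard error of A).
    """
    sw = sx = sy = sxx = sxy = 0.0
    for l_over_a, y, err_y, err_g in points:
        x = 1.0 / l_over_a ** 2
        w = 1.0 / effective_error(err_y, err_g, slope) ** 2
        sw += w
        sx += w * x
        sy += w * y
        sxx += w * x * x
        sxy += w * x * y
    det = sw * sxx - sx * sx
    a = (sxx * sy - sx * sxy) / det
    b = (sw * sxy - sx * sy) / det
    return a, b, math.sqrt(sxx / det)

# Drop the coarsest lattice (L/a = 10) before fitting; slope 3.5 is an assumed input.
A, B, A_ERR = fit_continuum(SERIES[1:], slope=3.5)
```

Under these assumptions the intercept comes out close to the continuum value 4.349(6) listed in Table 2, with an uncertainty of similar size, which suggests the sketch captures the essentials of the procedure.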
Fig. 2. Continuum extrapolation of $\bar g_\Upsilon^2$ for $\bar g^2 = 2.4596$ as a typical example.
Table 2
Nonperturbative relation between $\bar g^2$ and $\bar g_\Upsilon^2$

$\bar g^2$    $\bar g_\Upsilon^2$
0.7390    0.8388(6)
0.8166    0.9386(6)
0.9176    1.0701(7)
1.0595    1.2631(9)
2.0373    3.0407(33)
2.4596    4.349(6)
3.0280    7.210(15)
3.7692    14.234(20)
To get a feeling for the lattice artefacts associated with our two couplings individually
we estimated their step scaling function (SSF) for a pair of values corresponding to similar
L, $\bar g^2 \approx 1.06$ and $\bar g_\Upsilon^2 \approx 1.29$. The SSF, the focus of interest in [3], gives $\bar g^2(2L)$ as a
function of $\bar g^2(L)$. Pairs of simulations at identical β and sizes L/a and 2L/a yield
$$\Sigma(u, a/L) = \bar g^2(2L)\Big|_{\bar g^2(L) = u}. \tag{4.2}$$
$\Sigma$ is then extrapolated to the continuum limit. A completely analogous quantity $\Sigma_\Upsilon$ is
defined in terms of $\bar g_\Upsilon^2$. In Fig. 5 the continuum approaches of $\Sigma$ and $\Sigma_\Upsilon$ are shown. Lattice
artefacts amount to a few percent at say L/a = 8 with $\Sigma_\Upsilon$ deviating significantly more
from its continuum limit than $\Sigma$. This trend is expected, since $\Upsilon$ receives contributions
from several low-lying eigenstates of the transfer matrix (cf. (2.11)) while $\bar g$ is constructed
in terms of the massgap only.
Instead of relating the two couplings at one scale L one may also consider the connection
between $\bar g_\Upsilon^2(L)$ and $\bar g^2(sL)$. This has already been incorporated in the perturbative
Fig. 3. Coupling $\bar g_\Upsilon^2$ versus $\bar g^2$ with asymptotic expansions.
Fig. 4. Close-up of $\bar g_\Upsilon^2 - \bar g^2$ in the transition region.
formulas (3.15). One may hope that an appropriate choice of s improves the accuracy
of the approximation as was for instance found in [10]. A somewhat natural value is s =
2.7936 (n = 3), for which $d_1(s)$ in (3.16) vanishes and which coincides with the ratio of
the Λ-parameters associated with the short distance behaviors of the two couplings. To
compare such an expansion with nonperturbative results we would have to simulate series
of lattices of size L and sL at the same bare couplings and take the continuum limit. While
this is not possible, we gained good control over the step scaling function for $\bar g^2(L)$ in [3].
Fig. 5. Continuum approach of step scaling functions.
We used its four loop approximation [16,17] 3 and fitted the remainder with an effective
five loop coefficient to evolve from $\bar g^2(L)$ to $\bar g^2(sL)$. In this way we found however no
value s ≠ 1 which significantly improved the series for $\bar g_\Upsilon^2(L)$.
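The natural scale factor quoted above follows from a one-line computation. Assuming the one-loop coefficient has the form $d_1(s) = (n-2)[0.16350689821 - \ln s/(2\pi)]$, it vanishes at $s = \exp(2\pi \cdot 0.16350689821)$ for any n > 2; the function name below is illustrative.

```python
import math

D1_CONST = 0.16350689821  # constant part of d_1(s)/(n - 2)

def d1(s, n=3):
    """One-loop coefficient d_1(s), in the form assumed above."""
    return (n - 2) * (D1_CONST - math.log(s) / (2.0 * math.pi))

def natural_scale_factor():
    """Value of s at which d_1(s) vanishes."""
    return math.exp(2.0 * math.pi * D1_CONST)

S_STAR = natural_scale_factor()
```

This reproduces s ≈ 2.7936, the value identified in the text with the ratio of the Λ-parameters of the two couplings.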
To conclude, we have investigated two nonperturbatively defined coupling constants for
the O(3) nonlinear σ-model with exponentially different low energy behaviors. Analytical
relations, valid in the continuum limit, are available for both weak and strong coupling.
Precise numerical simulations covered the intermediate range and matched with both
asymptotic expansions. As expected, the helicity modulus coupling shows larger lattice
artefacts than the finite volume massgap.
Acknowledgement
We would like to thank Erhard Seiler for pointing out to us that the strong coupling
asymptotics in an earlier version of this paper was incorrect.
Appendix A. Two loop expansion of $\Upsilon$

In this appendix we use lattice units with a = 1 and take T = L. For the perturbative
expansion of the σ-model on a finite lattice we have to fix the global O(n) invariance by
the Faddeev–Popov technique [15]. For O(n) invariant integrands I(s) it amounts to the
replacement I(s) → I(s) f(s) F(s) which does not change the value of the integral. For
the noninvariant function f we take
$$f(s) = \delta(M_1)\, \delta(M_2) \cdots \delta(M_{n-1}),$$
3 We evaluated b = 0.0040 at n = 3 for Eq. (3.47) in [17].
4
(A.1)

where $M = \sum_x s(x)$ is the total magnetization, which is hence forced to point in the
n-direction. The compensating Faddeev–Popov factor is in this case given by
$$F(s) = |M|^{n-1} \tag{A.2}$$
up to an irrelevant overall constant factor. The spins are parameterized by an n − 1
component real field $\pi(x)$,
$$s = (g_0 \pi, \sigma), \qquad \sigma = \sqrt{1 - g_0^2 \pi^2}. \tag{A.3}$$
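The resulting contributions are gathered in an effective action
$$S_{\rm eff} = \frac{1}{2} \sum_{x,\mu} \left[(\partial_\mu \pi)^2 + \frac{1}{g_0^2} (\partial_\mu \sigma)^2\right] + \sum_x \ln\sigma - (n-1) \ln \sum_x \sigma. \tag{A.4}$$
It is understood to be expanded in $g_0$,
$$S_{\rm eff} = \sum_{k=0}^{\infty} g_0^{2k} S^{(k)}, \tag{A.5}$$
and the function f still has to be included in the path integral. It leads to the omission of the
zero momentum mode in the propagator of the π field and makes perturbative coefficients
now well defined. The last two terms in $S_{\rm eff}$ correspond to the O(n) invariant measure and
to F. We shall only need the leading terms
$$S^{(0)} = \frac{1}{2} \sum_{x,\mu} (\partial_\mu \pi)^2, \tag{A.6}$$
$$S^{(1)} = \frac{1}{8} \sum_{x,\mu} (\partial_\mu \pi^2)^2 - \frac{1}{2}\left(1 - \frac{n-1}{V}\right) \sum_x \pi^2. \tag{A.7}$$
The term $S^{(0)}$ defines the propagator
$$\langle \pi_k(x)\, \pi_l(y) \rangle_0 = \delta_{kl}\, G(x - y) = \delta_{kl}\, \frac{1}{V} {\sum_p}' \frac{e^{ip(x-y)}}{\hat p^2}, \tag{A.8}$$
where 1 ≤ k, l ≤ n − 1 and the primed sum is over the appropriate lattice momenta
excluding p = (0, 0) and we have introduced $\hat p_\mu = 2 \sin(p_\mu/2)$.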
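To compute $\Upsilon_1$ we set
$$s(x) \cdot s(x + a\hat 0) = 1 - g_0^2 E^{(0)} - g_0^4 E^{(1)} + O(g_0^6), \tag{A.9}$$
$$E^{(0)} = \frac{1}{2} (\partial_0 \pi)^2, \tag{A.10}$$
$$E^{(1)} = \frac{1}{8} (\partial_0 \pi^2)^2, \tag{A.11}$$
and find
$$\Upsilon_1 = \frac{2}{n g_0^2} \left\{ 1 - g_0^2 \langle E^{(0)} \rangle_0 - g_0^4 \left[ \langle E^{(1)} \rangle_0 - \langle E^{(0)} S^{(1)} \rangle_0^c \right] \right\}, \tag{A.12}$$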
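with the connected correlation in the last bracket. These contributions evaluate to
$$\langle E^{(0)} \rangle_0 = \frac{n-1}{4} \left(1 - \frac{1}{L^2}\right), \tag{A.13}$$
$$\langle E^{(1)} \rangle_0 - \langle E^{(0)} S^{(1)} \rangle_0^c = \frac{n-1}{32} \left(1 - \frac{1}{L^2}\right)^2 - \frac{(n-1)(n-2)}{4}\, A_1\, \frac{1}{L^2}. \tag{A.14}$$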
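The contribution $\Upsilon_2$ has been expanded in (3.14). We introduce
$$\frac{1}{V g_0^4} \sum_{i<j} \sum_{x,y} j^0_{ij}(x)\, j^0_{ij}(y) = H^{(0)} + g_0^2 H^{(1)} + O(g_0^4), \tag{A.15}$$
$$H^{(0)} = \frac{4}{V} \sum_{k<l} \left( \sum_x \pi_k\, \tilde\partial_0 \pi_l \right)^2, \tag{A.16}$$
$$H^{(1)} = \frac{1}{V} \sum_k \left( \sum_x \pi^2\, \tilde\partial_0 \pi_k \right)^2 \tag{A.17}$$
with the symmetric derivative
$$\tilde\partial_\mu \pi(x) = \frac{1}{2} \left[ \pi(x + \hat\mu) - \pi(x - \hat\mu) \right] \tag{A.18}$$
and find
$$\Upsilon_2 = \frac{2}{n(n-1)} \left\{ \langle H^{(0)} \rangle_0 + g_0^2 \left[ \langle H^{(1)} \rangle_0 - \langle H^{(0)} S^{(1)} \rangle_0^c \right] \right\}. \tag{A.19}$$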
Numerical values are

1
H 0 = (n 1)(n 2) A1 A2 ,
4

(1)
H 0 = (n 1) (n 2)B1 + B2 ,

(0) (1) c
1
1
1
1
H S 0 = (n 1)(n 2) A1 A2 A1 + A2
1
4
4
2
V

n2
A3 .
+2
V

(0)
(A.20)
(A.21)
(A.22)
In the above expressions the following L-dependent constants were introduced,

A1 =
A2 =
A3 =
1 1
,
V p p 2
4
1 p
,
(p 2 )2

2
1 sin (p )
B1 =
(A.23)
(A.24)

x
(p 2 )3
G(x)2 G(x),
(A.25)
(A.26)
B2 = B1 2

2
G(x) G(x) .
(A.27)
Evaluated as x- or p-sums as they stand, only O(V ) terms have to be summed.
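As an illustration of how such O(V) momentum sums are evaluated, the sketch below computes two of the constants on an L × L lattice. It assumes the definitions $A_1 = V^{-1}\sum'_p 1/\hat p^2$ and $A_2 = V^{-1}\sum'_p \sum_\mu \hat p_\mu^4/(\hat p^2)^2$, with the zero mode omitted and $\hat p_\mu = 2\sin(p_\mu/2)$; this reading of the definitions, and the combination $A_1 - A_2/4$ used for the one-loop constant, are assumptions of the example.

```python
import math

def lattice_sums(L):
    """Return (A1, A2) on an L x L lattice with the zero momentum mode omitted."""
    V = L * L
    # hat2[k] = (2 sin(p/2))^2 for the lattice momentum p = 2 pi k / L
    hat2 = [(2.0 * math.sin(math.pi * k / L)) ** 2 for k in range(L)]
    a1 = 0.0
    a2 = 0.0
    for k0 in range(L):
        for k1 in range(L):
            if k0 == 0 and k1 == 0:
                continue  # primed sum: skip p = (0, 0)
            p2 = hat2[k0] + hat2[k1]
            a1 += 1.0 / p2
            a2 += (hat2[k0] ** 2 + hat2[k1] ** 2) / (p2 * p2)
    return a1 / V, a2 / V

def one_loop_constant(L):
    """A1 - A2/4 with the logarithm ln(L)/(2 pi) subtracted."""
    a1, a2 = lattice_sums(L)
    return a1 - 0.25 * a2 - math.log(L) / (2.0 * math.pi)
```

Under these assumptions the subtracted combination approaches a constant close to −0.1217 as L grows, which is the size of the constant multiplying (n − 2) in the one-loop coefficient $k_1$, up to the sign conventions assumed here.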

It is now straightforward to express the coefficients $k_1$, $k_2$ in
$$\bar g_\Upsilon^2 = \frac{2}{n \Upsilon} = g_0^2 + k_1 g_0^4 + k_2 g_0^6 + O(g_0^8), \tag{A.28}$$
in terms of the above constants. We evaluated them numerically for L = 8, . . . , 100, and
determined the asymptotic behavior as explained in the appendix of Ref. [11]. The result
is
$$k_1 = (n-2)\left[-0.12165689529 + \frac{\ln(L)}{2\pi}\right] + \frac{n-1}{4} + O(1/L^2), \tag{A.29}$$
$$k_2 - k_1^2 = (n-2)\left[\frac{\ln(L)}{(2\pi)^2} + 0.02514054821\right] - (n-2)^2\, 0.00773389318 + \frac{5}{96} + O(1/L^2). \tag{A.30}$$
Errors are beyond the digits given here, and the coefficients of $\ln(L)/L^2$ and $1/L^2$
corrections are also known but not listed here. They are of the same size as the constants
appearing here. To obtain the last fraction we set $B_2 = 1/48 + O(1/L^2)$ which we observed
to very high accuracy.
The massgap coupling (1.1) has been computed perturbatively up to two loops in [3] and
to three loops in [16,17]. From the last reference we extract
$$\bar g^2 = g_0^2 + m_1 g_0^4 + m_2 g_0^6 + O(g_0^8), \tag{A.31}$$
with
$$m_1 = (n-2)\left[\frac{\ln(L)}{2\pi} + \frac{\ln(\sqrt{2}/\pi) + \gamma}{2\pi}\right] + \frac{1}{4} + O(1/L^2), \tag{A.32}$$
$$m_2 - m_1^2 = (n-2)\left[\frac{\ln(L)}{(2\pi)^2} + 0.021982285645\right] + \frac{5}{96} + O(1/L^2). \tag{A.33}$$
The exact fraction was again found to numerical precision.
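The two bare expansions can be combined numerically to recover the renormalized coefficients of Section 3. Eliminating $g_0^2$ between $\bar g_\Upsilon^2 = g_0^2 + k_1 g_0^4 + k_2 g_0^6$ and $\bar g^2 = g_0^2 + m_1 g_0^4 + m_2 g_0^6$ gives $d_1 = k_1 - m_1$ and $d_2 = d_1^2 + (k_2 - k_1^2) - (m_2 - m_1^2)$ at s = 1. The sketch below assumes the signs and groupings of the coefficients as written in the code comments, with γ denoting Euler's constant; it is a consistency check, not a derivation.

```python
import math

EULER_GAMMA = 0.5772156649015329
TWO_PI = 2.0 * math.pi

def k1(L, n):
    # assumed: (n-2)[ln(L)/(2 pi) - 0.12165689529] + (n-1)/4
    return (n - 2) * (math.log(L) / TWO_PI - 0.12165689529) + (n - 1) / 4.0

def m1(L, n):
    # assumed: (n-2)[ln(L) + ln(sqrt(2)/pi) + gamma]/(2 pi) + 1/4
    return (n - 2) * (math.log(L) + math.log(math.sqrt(2.0) / math.pi)
                      + EULER_GAMMA) / TWO_PI + 0.25

def k2_minus_k1sq(L, n):
    return ((n - 2) * (math.log(L) / TWO_PI ** 2 + 0.02514054821)
            - (n - 2) ** 2 * 0.00773389318 + 5.0 / 96.0)

def m2_minus_m1sq(L, n):
    return (n - 2) * (math.log(L) / TWO_PI ** 2 + 0.021982285645) + 5.0 / 96.0

def d1_from_bare(L, n):
    """d_1 at s = 1; the ln(L) terms cancel in the difference."""
    return k1(L, n) - m1(L, n)

def d2_from_bare(L, n):
    """d_2 at s = 1 from the bare two-loop coefficients."""
    return d1_from_bare(L, n) ** 2 + k2_minus_k1sq(L, n) - m2_minus_m1sq(L, n)
```

With these inputs the L dependence drops out and, for n = 3, the combination reproduces 0.16350689821 for $d_1$ and the constant 0.00315826256 − 0.00773389318 in $d_2 - d_1^2$, in agreement with the coefficients quoted in Section 3.2.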
References
[1] H. Molke, Renormierte Kopplungen im O(3) σ-Modell, Diploma Thesis, Humboldt University, Berlin, 2000, http://dochost.rz.hu-berlin.de/diplom/physik/molke-heiko-2000-05-02/PS/Molke.ps.
[2] A. Patrascioiu, E. Seiler, hep-lat/0009005.
[3] M. Lüscher, P. Weisz, U. Wolff, Nucl. Phys. B 359 (1991) 221.
[4] R. Sommer, ALPHA Collaboration, Lectures given at 36. Internationale Universitätswochen
für Kern- und Teilchenphysik, Schladming 1997, hep-ph/9711243.
[5] M. Lüscher, Phys. Lett. B 118 (1982) 391.
[6] M. Lüscher, R. Narayanan, P. Weisz, U. Wolff, Nucl. Phys. B 384 (1992) 168, hep-lat/9207009.
[7] U. Wolff, Nucl. Phys. B 265 (1986) 506.
[8] U. Wolff, Nucl. Phys. B 265 (1986) 537.
[9] N. Schultka, cond-mat/9611043;
S. Chakravarty, Phys. Rev. 66 (1990) 481.
[10] G. de Divitiis et al., ALPHA Collaboration, Nucl. Phys. B 437 (1995) 447, hep-lat/9411017.
[11] A. Bode, P. Weisz, U. Wolff, ALPHA Collaboration, Nucl. Phys. B 576 (2000) 517, hep-lat/9911018.
[12] B. Alles, A. Buonanno, G. Cella, Nucl. Phys. B 500 (1997) 513, hep-lat/9701001.
[13] U. Wolff, Phys. Rev. Lett. 62 (1989) 361.
[14] P. Hasenfratz, F. Niedermayer, Nucl. Phys. B 596 (2001) 481.
[15] P. Hasenfratz, Phys. Lett. B 141 (1984) 385.
[16] D. Shin, Nucl. Phys. B 496 (1997) 408, hep-lat/9611006.
[17] D. Shin, Nucl. Phys. B 546 (1999) 669, hep-lat/9810025.

Sudakov suppression in azimuthal spin asymmetries

Daniël Boer
RIKEN-BNL Research Center, Brookhaven National Laboratory, Upton, NY 11973, USA
Received 8 February 2001; accepted 28 March 2001
Abstract
It is shown that transverse momentum dependent azimuthal spin asymmetries suffer from
suppression due to Sudakov factors, in the region where the transverse momentum is much smaller
than the large energy scale $Q^2$. The size and $Q^2$ dependence of this suppression are studied
numerically for two such asymmetries, both arising due to the Collins effect. General features are
discussed of how the fall-off with $Q^2$ is affected by the nonperturbative Sudakov factor and by the
transverse momentum weights and angular dependences that appear in different asymmetries. For
a subset of asymmetries the asymptotic $Q^2$ behavior is calculated analytically, providing an upper
bound for the decrease with energy of other asymmetries. The effect of Sudakov factors on the
transverse momentum distributions is found to be very significant already at present-day collider
energies. Therefore, it is essential to take into account Sudakov factors in transverse momentum
dependent azimuthal spin asymmetries. © 2001 Elsevier Science B.V. All rights reserved.
PACS: 13.65.+i; 13.88.+e
1. Introduction
In this article we will study the effects of Sudakov factors in transverse momentum
dependent azimuthal spin asymmetries, like the Collins effect asymmetry of Ref. [1]. We
will demonstrate explicitly that such asymmetries suffer from suppression due to these
Sudakov factors, in the region where the transverse momentum is much smaller than
the large energy scale $Q^2$. This Sudakov suppression stems from soft gluon radiation
and increases with energy. It implies that tree level estimates of such asymmetries tend
to overestimate magnitudes and increasingly so with rising energy. In this paper, the
$Q^2$ dependence of the suppression due to Sudakov factors will be studied numerically
in two examples which are relevant for present-day experimental studies. In addition,
the asymptotic $Q^2$ dependence of an important subset of asymmetries is calculated
analytically, providing an upper bound for the decrease with energy of other asymmetries.
E-mail address: dboer@bnl.gov (D. Boer).
PII: S 0 5 5 0 - 3 2 1 3 ( 0 1 ) 0 0 1 5 6 - 0
D. Boer / Nuclear Physics B 603 (2001) 195–217
The first example we consider is a Collins effect driven cos(2φ) asymmetry in electron–positron
annihilation into two almost back-to-back pions [2,3], which in principle can
be determined from existing LEP data. This asymmetry allows for a determination
of the Collins effect fragmentation function, which in turn would be useful for the
extraction of the transversity distribution function $h_1$. The latter can be done via the single
spin asymmetry in semi-inclusive Deep Inelastic Scattering (DIS) which was originally
proposed by Collins [1], our second example. In Ref. [1] Collins already remarks that
the Sudakov factors will have the effect of diluting this single spin asymmetry due to
broadening of the transverse momentum distribution by soft gluon emission. We will study
this effect in a quantitative way to gain insight into the parametric dependences of such
Sudakov suppression.
For this we will follow the recent analysis [4] of a helicity non-flip double transverse
spin asymmetry in vector boson production in hadron–hadron scattering. That asymmetry
is for instance relevant for the polarized proton–proton collisions to be performed at
BNL-RHIC. In Ref. [4] the effect of the Sudakov factors compared to the tree level
asymmetry expression was numerically estimated. It was shown that the inclusion of
Sudakov factors causes suppression by at least an order of magnitude compared to the tree
level result at scales around $M_W$ or $M_Z$. Moreover, the suppression increases with energy
approximately as $Q^{0.6}$ (in the studied range of roughly 10–100 GeV). The conclusion
was that the Sudakov suppression together with a kinematic suppression (due to explicit
lightcone momentum fractions that appear in the prefactor) imply that the asymmetry will
be negligible for Z or W production and is interesting only at much lower energies.
Similarly, Sudakov suppression will turn out to be an important issue for the actual
determination of the above mentioned $\cos(2\phi)$ asymmetry from LEP data (or in general at
high energy $e^+e^-$ colliders). Due to the lack of knowledge of the nonperturbative Sudakov
factor in the case of electron–positron annihilation into almost back-to-back hadrons, solid
numerical conclusions about the size and $Q^2$ dependence of the suppression cannot be
drawn. Nevertheless, one can draw conclusions about what determines the $Q^2$ dependence
of the transverse momentum distribution of the asymmetries and about the size of the
suppression for generic nonperturbative Sudakov factors. For that purpose comparisons
to tree level are most instructive, since the evolution of the often unknown distribution
and fragmentation functions themselves does not affect the relative Sudakov suppression
compared to tree level.
The two examples to be investigated here (the above mentioned $\cos(2\phi)$ asymmetry
and the original Collins $\sin(\phi)$ asymmetry in semi-inclusive DIS) have different
transverse momentum weights and angular dependences, producing different Sudakov
suppression effects. General conclusions about how these properties affect the $Q^2$
dependence can be drawn. In general, a larger power of transverse momentum in the weight
of an asymmetry implies larger suppression. Both the numerical calculations at realistic
collider energies and the asymptotic behavior of asymmetries exhibit this feature.
Before going into the details of the specific examples, we will first give an overview
of the issues regarding factorization theorems for transverse momentum dependent cross
sections, in the region where the transverse momentum is very small compared to the hard
scale in the process [5–7]. This will provide the theoretical justification of the factorized
asymmetry expressions that will be derived.
The outline of this paper is as follows. In Section 2 we will discuss the essentials of
factorization theorems for the Drell–Yan process [7] (relevant for the double transverse
spin asymmetry studied in Ref. [4]), which are in fact completely analogous to the case of
back-to-back jets in electron–positron annihilation [5,6] and semi-inclusive DIS (SIDIS).
Following the approach of Ref. [4], we will then study the $\cos(2\phi)$ asymmetry in electron–positron
annihilation (Section 3) and the Collins $\sin(\phi)$ asymmetry in semi-inclusive
DIS (Section 4). In Section 5 we will discuss the asymptotic behavior of the Sudakov
suppression for certain subsets of asymmetries.
2. Factorization and transverse momentum

For definiteness, we will consider here the Drell–Yan process ($H_1 + H_2 \to \ell + \bar\ell + X$)
and discuss what is known about the factorization of this process. First of all, it is well known
that a local operator product expansion (OPE) cannot be applied, since the process
receives relevant contributions off the light cone [8], i.e., from matrix elements like

$$\sum_X \langle P_1, P_2 | J^\mu(\xi) | X \rangle \langle X | J^\nu(0) | P_1, P_2 \rangle \quad \text{with } \xi^2 \neq 0. \tag{1}$$

Therefore, one usually considers nonlocal operators and what is called a working
redefinition of twist [8], which means the lowest value of $t$ in $1/Q^{t-2}$ at which a function
can contribute to the cross section ($Q$ is the invariant mass of the lepton pair). The nonlocal
operators appear in the leading twist factorization theorem

$$\frac{d\sigma}{d\Omega\, dx_1\, dx_2} = \sum_{a,b} \int_{x_1}^{1} dx \int_{x_2}^{1} d\bar x\; \Phi^a(x)\, H^{ab}(x, \bar x; Q^2)\, \bar\Phi^b(\bar x), \tag{2}$$

where $a, b$ are flavor indices and $H^{ab}(x, \bar x; Q^2)$ is the hard (partonic) part of the scattering,
which is a function of the hard scale $Q$ and the lightcone momentum fractions $x, \bar x$ only.
The correlation functions $\Phi$ ($\bar\Phi^a = \Phi^{\bar a}$) describe the soft (nonperturbative) physics and are
given by nonlocal operator matrix elements

$$\Phi_{ij}(x) \equiv \int \frac{d\lambda}{2\pi}\, e^{i\lambda x}\, \langle P, S | \bar\psi_j(0)\, \mathcal{L}^+[0, \lambda n_-]\, \psi_i(\lambda n_-) | P, S \rangle, \tag{3}$$

where the path-ordered exponential,

$$\mathcal{L}^+[0, \lambda n_-] = \mathcal{P} \exp\left( -ig \int_0^{\lambda} d\eta\, A^+(\eta n_-) \right), \tag{4}$$

renders the matrix element color gauge invariant. The hadronic state $|P, S\rangle$ is determined
by the hadron momentum $P$ and spin vector $S$. Also, we have suppressed the factorization
scale dependence in Eq. (2).
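Secondly, if one considers the cross section differential in the transverse momentum
$\mathbf{q}_T$ of the lepton pair, then for the case of $\mathbf{q}_T^2 \equiv Q_T^2 \ll Q^2$ collinear factorization, like in
Eq. (2), does not hold. Instead, the leading twist factorization takes the form [5–7,9,10]

$$\frac{d\sigma}{d\Omega\, dx_1\, dx_2\, d^2\mathbf{q}_T} = \sum_{a,b} \int_{x_1}^{1} dx \int_{x_2}^{1} d\bar x \int d^2\mathbf{k}_T\, d^2\mathbf{p}_T \int \frac{d^2\mathbf{b}}{(2\pi)^2}\, e^{i\mathbf{b}\cdot(\mathbf{p}_T + \mathbf{k}_T - \mathbf{q}_T)}\, \Phi^a(x, \mathbf{k}_T)\, H^{ab}(x, \bar x; Q)\, e^{-S(b,Q)}\, \bar\Phi^b(\bar x, \mathbf{p}_T) + Y(x_1, x_2, Q, Q_T). \tag{5}$$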
The correlation function $\Phi^a(x, \mathbf{k}_T)$ is now also a function of transverse momentum, the
factor $e^{-S(b,Q)}$ is the so-called Sudakov form factor and the factor $Y(x_1, x_2, Q, Q_T)$
becomes important only when $Q_T \sim Q$. Here we use a different notation than in, e.g.,
Refs. [5–7,9,10], but to make a connection with for instance the notation of Ref. [5] we
note that $\Phi(x, \mathbf{k}_T)$ corresponds to the Fourier transform of $\tilde{P}_{A/i}(x_A, b)\, \slashed{P}_A/2$ and the $e^{-S}$
terms correspond to each other.
The factor $Y(x_1, x_2, Q, Q_T)$ is present to retrieve collinear factorization when $Q_T$ is
a hard scale itself, i.e., $Q_T \sim Q$. We will neglect the $Y$ term from here on, since we will
only be interested in the region of $Q_T^2 \ll Q^2$. We will now comment on the other terms in
the above factorization theorem and explain the additional restrictions we will impose in
the limit $Q_T^2 \ll Q^2$.
The factor $e^{-S(b,Q)}$ is the so-called Sudakov form factor, which arises due to
exponentiation of soft gluon contributions. This is in contrast to inclusive cross sections,
like Eq. (2), in which there is a cancellation of soft gluon contributions. At values
$b^2 \equiv \mathbf{b}^2 \ll 1/\Lambda^2$, the Sudakov form factor is perturbatively calculable and is of the
form

$$S(b, Q) = \int_{b_0^2/b^2}^{Q^2} \frac{d\mu^2}{\mu^2} \left[ A\big(\alpha_s(\mu)\big) \ln\frac{Q^2}{\mu^2} + B\big(\alpha_s(\mu)\big) \right], \tag{6}$$

where $b_0 = 2\exp(-\gamma_E) \approx 1.123$ (we choose the usual constants $C_1 = b_0$, $C_2 = 1$). One
can expand the functions $A$ and $B$ in $\alpha_s$ and the first few coefficients are known [11,12]. In
order to obtain a first estimate of the effect of including the Sudakov factor we will consider
only the leading contribution, i.e., take into account only the first term in the expansion of
$A$: $A^{(1)} = C_F/\pi$, which is the same for unpolarized as well as polarized scattering. This
leads to the expression [13]

$$S(b, Q) = -\frac{16}{33 - 2n_f} \left[ \log\left(\frac{b^2 Q^2}{b_0^2}\right) + \log\left(\frac{Q^2}{\Lambda^2}\right) \log\left( 1 - \frac{\log(b^2 Q^2/b_0^2)}{\log(Q^2/\Lambda^2)} \right) \right]. \tag{7}$$

We will take for the number of flavors $n_f = 5$ and also $\Lambda_{QCD} = 200$ MeV.
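As a cross-check of the closed form in Eq. (7), the following minimal sketch evaluates the integral of Eq. (6) numerically and compares it with Eq. (7). It assumes $A = C_F \alpha_s/\pi$, $B = 0$, and a one-loop running coupling $\alpha_s(\mu) = 12\pi/[(33 - 2n_f)\ln(\mu^2/\Lambda^2)]$; the function names and the Simpson-rule integration are illustrative choices, not part of the original analysis.

```python
import math

B0 = 2.0 * math.exp(-0.5772156649015329)  # b_0 = 2 exp(-gamma_E), about 1.123
N_F = 5                                   # number of flavors quoted in the text
LAMBDA_QCD = 0.2                          # GeV, value quoted in the text


def sudakov_closed(b, Q, nf=N_F, lam=LAMBDA_QCD):
    """Closed form of Eq. (7); b in GeV^-1, Q in GeV."""
    l = math.log(b * b * Q * Q / (B0 * B0))
    L = math.log(Q * Q / (lam * lam))
    return -16.0 / (33.0 - 2.0 * nf) * (l + L * math.log(1.0 - l / L))


def sudakov_numeric(b, Q, nf=N_F, lam=LAMBDA_QCD, n=2000):
    """Eq. (6) with A = C_F*alpha_s/pi and B = 0 (assumption: one-loop alpha_s).

    Uses t = ln(mu^2/Lambda^2), so that d(mu^2)/mu^2 = dt, and Simpson's rule.
    """
    c_f = 4.0 / 3.0
    t_lo = math.log(B0 * B0 / (b * b * lam * lam))
    t_hi = math.log(Q * Q / (lam * lam))

    def integrand(t):
        alpha_s = 12.0 * math.pi / ((33.0 - 2.0 * nf) * t)
        return c_f / math.pi * alpha_s * (t_hi - t)

    h = (t_hi - t_lo) / n  # n must be even for Simpson's rule
    total = integrand(t_lo) + integrand(t_hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(t_lo + i * h)
    return total * h / 3.0


if __name__ == "__main__":
    for b in (0.1, 0.3, 0.5):
        print(b, sudakov_closed(b, 90.0), sudakov_numeric(b, 90.0))
```

The two functions should agree to numerical precision, and the exponent vanishes at $b = b_0/Q$, where the integration range in Eq. (6) shrinks to zero.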
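The correlation function $\Phi^a(x, \mathbf{k}_T)$ is now defined as [9,14]

$$\Phi(x, \mathbf{k}_T) \equiv \int dk^- \int \frac{d^4x}{(2\pi)^4}\, e^{ik\cdot x}\, \langle P, S | \bar\psi(0)\, \mathcal{L}^+[0, -\infty]\, \mathcal{L}^+[-\infty, x]\, \psi(x) | P, S \rangle \Big|_{k^+ = xP^+,\, \mathbf{k}_T} \tag{8}$$

with for each colored field a path-ordered exponential like

$$\mathcal{L}^+[-\infty, x] = \mathcal{P} \exp\left( -ig \int_{-\infty}^{x^-} dy^-\, A^+(y) \right) \Bigg|_{y^+ = x^+,\, \mathbf{y}_T = \mathbf{x}_T}. \tag{9}$$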
The correlation function $\Phi^a(x, \mathbf{k}_T)$ also has a factorization scale dependence ($\mu = b_0/b$),
which is linked to that of the hard scattering part $H$.
The correlation function $\Phi^a(x, \mathbf{k}_T)$ contains the nonperturbative dependence on the
transverse momentum, which cannot be calculated perturbatively and should be fitted to
experiment. In for example Refs. [5–7] it is explained how these functions can be replaced
by the ordinary parton distribution functions $\Phi^a(x)$, by introduction of a $b$-regulator, e.g.,
the usual cut-off $b_{\max}$, via $b_* = b/\sqrt{1 + b^2/b_{\max}^2}$, and by introducing a nonperturbative
Sudakov factor $S_{NP}$. Schematically, this proceeds as follows. The first term in Eq. (5) can
be written as an overall $b$ integration of an integrand called $\tilde W(b)$,

$$\frac{d\sigma}{d\Omega\, dx_1\, dx_2\, d^2\mathbf{q}_T} = \int_{x_1}^{1} dx \int_{x_2}^{1} d\bar x \int \frac{d^2\mathbf{b}}{(2\pi)^2}\, e^{-i\mathbf{b}\cdot\mathbf{q}_T}\, \tilde W(b) + Y. \tag{10}$$

This function $\tilde W(b)$ can be trivially rewritten as $\tilde W(b_*) \times (\tilde W(b)/\tilde W(b_*))$. The first term
($\tilde W(b_*)$) can be calculated within perturbation theory since $\alpha_s(b_0/b_*)$ is always small (we
will take the usual $b_{\max} = 0.5\ \text{GeV}^{-1}$, hence $\alpha_s(b_0/b_* \gtrsim 2) \lesssim 0.3$) and the second term
can be shown to be of the form

$$\frac{\tilde W(b)}{\tilde W(b_*)} = \exp\left[ -\ln\left(Q^2/Q_0^2\right) g_Q(b) - g_A(x, b) - g_B(\bar x, b) \right] \equiv \exp\left( -S_{NP}(b) \right). \tag{11}$$
The functions $g$ are not calculable in perturbation theory and need to be fitted to experiment.
The functions $g_A(x, b)$ and $g_B(\bar x, b)$ parameterize the intrinsic transverse momentum
of the functions $\tilde\Phi^a(x, b)$ and $\tilde{\bar\Phi}^b(\bar x, b)$ (in the polarized case $\tilde\Phi^a$ is actually a function
of $\mathbf{b}$, see further comments below). It will be important later on that the functions $g_A$ and $g_B$
are independent of the scale $Q$. Finally, a lightcone expansion allows one to express $\tilde\Phi^a(x, b_*)$
in terms of $\Phi^a(x)$.
We will consider $\tilde\Phi^a(x, b_*)\, \tilde{\bar\Phi}^b(\bar x, b_*)\, H(x, \bar x; Q)$ only to lowest order in $\alpha_s$. The reason
is that the perturbative tail of $\tilde\Phi^a(x, b_*)$ is not known for the Collins function. Because
of the unknown size of the distribution and/or fragmentation functions appearing in the
asymmetries, the magnitude of the asymmetries cannot be estimated. Therefore, here we
will be interested in the effect of soft gluons on the transverse momentum dependence of
the asymmetries, which determines the $Q^2$ dependence of the $Q_T$ dependence. Although
the hard scattering part is restricted to tree level, one can draw conclusions on the size of
the effect of Sudakov factors by comparison to the tree level result, because both results
will receive the same corrections to the hard part. Corrections to the hard scattering part
can be easily inserted with hindsight (for a one-loop example cf. Ref. [5], pp. 432 and 433).
As said, we are interested in the $Q^2$ evolution of the $Q_T$ dependence of the asymmetries.
This $Q^2$ dependence arises in a nontrivial way via Bessel functions, such that a numerical
study is warranted.
In Refs. [5–7] unpolarized scattering is considered, where only dependence on the
length of $\mathbf{b}$ appears. Here we will consider polarized scattering and the above sketched
procedure of Refs. [5–7] to factor a nonperturbative part from $\tilde W(b)$ can also be applied
after $\Phi^a(x, \mathbf{k}_T)$ is parameterized in terms of functions of $\mathbf{k}_T^2$. Explicit factors of $\mathbf{k}_T \cdot \mathbf{S}_T$
or $\slashed{k}_T$ can be included in the hard scattering part $H$.
The above factorization theorem for the Drell–Yan process provides the justification of
the factorized expression for the double transverse spin asymmetry studied in Ref. [4].
In Refs. [5,6] an analogous factorization theorem had been discussed earlier for back-to-back
jets in electron–positron annihilation, which comes down to a replacement of
$\Phi(x, \mathbf{k}_T) \to \Delta(z, \mathbf{k}_T)$, the fragmentation correlation function. In Ref. [1] it is argued
that a similar factorization theorem holds for semi-inclusive DIS. Therefore, factorization
theorems analogous to the one discussed above (differing only by obvious replacements)
provide the theoretical foundation for the examples to be considered below.
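It is interesting to see what happens if one takes tree level everywhere in the above
factorized expression Eq. (5) (ignoring the $Y$ term). At tree level one finds the reduction

$$\int \frac{d^2\mathbf{b}}{(2\pi)^2}\, e^{i\mathbf{b}\cdot(\mathbf{p}_T + \mathbf{k}_T - \mathbf{q}_T)}\, e^{-S(b)} \to \delta^2(\mathbf{p}_T + \mathbf{k}_T - \mathbf{q}_T), \tag{12}$$

such that one arrives at the expression used by Ralston and Soper for the hadron tensor
[4,15,16]

$$W^{\mu\nu} = \frac{1}{3} \int d^2\mathbf{p}_T\, d^2\mathbf{k}_T\, \delta^2(\mathbf{p}_T + \mathbf{k}_T - \mathbf{q}_T)\, \mathrm{Tr}\left[ \Phi(x_1, \mathbf{p}_T)\, V_1^\mu\, \bar\Phi(x_2, \mathbf{k}_T)\, V_2^\nu \right] \Big|_{p^+,\, k^-} + \begin{pmatrix} q \leftrightarrow -q \\ \mu \leftrightarrow \nu \end{pmatrix}. \tag{13}$$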
The vertex $V_i$ can be either the photon, $Z$ or $W$ boson vertex. In the above unpolarized
case $\Phi(x, \mathbf{p}_T)$ is parameterized as $\frac{1}{2} f_1(x, \mathbf{p}_T^2)\, \slashed{P}$, where $f_1$ is the parton momentum
distribution function. Collins [17] has shown that polarization does not affect the
factorization theorems and therefore, we conclude that the tree level formalism of Ralston
and Soper Eq. (13) is in accordance with the factorization theorem Eq. (5) also in the case
when polarization is taken into account. Clearly, Eq. (13) is only applicable for $Q_T$ values
of the order of the intrinsic transverse momentum. To extend the formalism of Ralston and
Soper to larger values of $Q_T$, but still under the restriction of $Q_T^2 \ll Q^2$, one can use the
above factorized expression Eq. (5) (without the $Y$ term) beyond tree level.
3. Unpolarized asymmetry in electron–positron annihilation

In this section we investigate a $\cos(2\phi)$ asymmetry in the process of electron–positron
annihilation into two almost back-to-back pions, $e^+e^- \to \pi^+\pi^- X$. This process is very
similar to the Drell–Yan process, so we can employ the same expression as given in
Eq. (5), but with the correlation functions $\Phi$, $\bar\Phi$ replaced by the fragmentation correlation
function [18]

$$\Delta_{ij}(k) = \sum_X \int \frac{d^4x}{(2\pi)^4}\, e^{ik\cdot x}\, \langle 0 | \psi_i(x)\, \mathcal{L}^+[x, -\infty] | P, S; X \rangle \langle P, S; X | \mathcal{L}^+[-\infty, 0]\, \bar\psi_j(0) | 0 \rangle, \tag{14}$$

which is used to define transverse momentum dependent fragmentation functions.
The asymmetry requires observing the transverse momentum of the vector boson
compared to the two pions. The tree level asymmetry expression for the $\cos(2\phi)$
asymmetry in the process $e^+e^- \to \pi^+\pi^- X$ was discussed in Refs. [2,3], to which we
refer for details. Here we briefly repeat the essentials. The asymmetry depends on the
fragmentation function $H_1^\perp$ associated with the Collins effect [1]. The definition of the
Collins function is given by [19]

$$\Delta(z, \mathbf{k}_T) = \frac{M}{4P^-} \left\{ D_1(z, \mathbf{k}_T^2)\, \frac{\slashed{P}}{M} + H_1^\perp(z, \mathbf{k}_T^2)\, \frac{\sigma_{\mu\nu} k_T^\mu P^\nu}{M^2} \right\}, \tag{15}$$
where we have only displayed the fragmentation functions that can be present for
unpolarized hadron production ($D_1$ is the ordinary unpolarized fragmentation function).
The $\cos(2\phi)$ asymmetry is an azimuthal spin asymmetry in the sense that the asymmetry
arises due to the correlation of the transverse spin states of the quark–antiquark pair. On
average the quark and antiquark will not be transversely polarized, but for each particular
event the transverse spin need not be zero and, moreover, the spin states of the quark
and antiquark are exactly correlated. Subsequently, the directions of the produced pions
are correlated due to the Collins effect and this correlation does not average out after
summing over all quark polarization states. The Collins effect correlates the azimuthal
angle of the transverse spin of a fragmenting quark with that of the transverse momentum
of the produced hadron (both taken around the quark momentum), via a $\sin(\phi)$ distribution
of their difference angle $\phi$.
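One finds for the leading order unpolarized cross section, taking into account both
photon and $Z$-boson contributions [2,3],

$$\frac{d\sigma(e^+e^- \to h_1 h_2 X)}{d\Omega\, dz_1\, dz_2\, d^2\mathbf{q}_T} = \frac{3\alpha^2}{Q^2}\, z_1^2 z_2^2 \sum_{a,\bar a} \left\{ K_1^a(y)\, \mathcal{F}\left[ D_1 \bar D_1 \right] + \left[ K_3^a(y) \cos(2\phi_1) + K_4^a(y) \sin(2\phi_1) \right] \mathcal{F}\left[ \left( 2\, \hat{\mathbf{h}}\cdot\mathbf{p}_T\, \hat{\mathbf{h}}\cdot\mathbf{k}_T - \mathbf{p}_T\cdot\mathbf{k}_T \right) \frac{H_1^\perp \bar H_1^\perp}{M_1 M_2} \right] \right\}. \tag{16}$$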
The convolutions $\mathcal{F}$ will be discussed below (Eqs. (35) and (37)), but first we will give the
definition of the functions $K_i^a(y)$ appearing in this expression and comment on the frame
in which this cross section is expressed. As before, $a$ is the flavor index; the functions
$K_i^a(y)$ are defined as

$$K_1^a(y) = A(y) \left[ e_a^2 + 2 g_V^l e_a g_V^a \chi_1 + c_1^l c_1^a \chi_2 \right] - \frac{C(y)}{2} \left[ 2 g_A^l e_a g_A^a \chi_1 + c_3^l c_3^a \chi_2 \right], \tag{17}$$

$$K_3^a(y) = B(y) \left[ e_a^2 + 2 g_V^l e_a g_V^a \chi_1 + c_1^l c_2^a \chi_2 \right], \tag{18}$$

$$K_4^a(y) = B(y) \left[ 2 g_V^l e_a g_A^a \chi_3 \right], \tag{19}$$

which contain the combinations of the couplings

$$c_1^j = \left( {g_V^j}^2 + {g_A^j}^2 \right), \tag{20}$$

$$c_2^j = \left( {g_V^j}^2 - {g_A^j}^2 \right), \tag{21}$$

$$c_3^j = 2 g_V^j g_A^j, \tag{22}$$
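where $g_V$ and $g_A$ are the vector and axial-vector couplings to the $Z$-boson. The propagator
factors are given by

$$\chi_1 = \frac{1}{\sin^2(2\theta_W)}\, \frac{Q^2 (Q^2 - M_Z^2)}{(Q^2 - M_Z^2)^2 + \Gamma_Z^2 M_Z^2}, \tag{23}$$

$$\chi_2 = \frac{1}{\sin^2(2\theta_W)}\, \frac{Q^2}{Q^2 - M_Z^2}\, \chi_1, \tag{24}$$

$$\chi_3 = -\frac{\Gamma_Z M_Z}{Q^2 - M_Z^2}\, \chi_1. \tag{25}$$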
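The above cross section is expressed in the following frame. A normalized timelike
vector $\hat t$ is defined by $q$ (the vector boson momentum) and a normalized spacelike vector
$\hat z$ is defined by $\tilde P^\mu = P^\mu - (P\cdot q/q^2)\, q^\mu$ for one of the outgoing momenta, for which we
choose $P_2$ ($P_i$ are the hadron momenta),

$$\hat t^\mu \equiv \frac{q^\mu}{Q}, \tag{26}$$

$$\hat z^\mu \equiv \frac{Q}{P_2\cdot q}\, \tilde P_2^\mu = 2\, \frac{P_2^\mu}{z_2 Q} - \frac{q^\mu}{Q}. \tag{27}$$

The azimuthal angles lie inside the plane orthogonal to $\hat t$ and $\hat z$. In particular, $\phi^\ell$ gives
the orientation of $\hat l_\perp^\mu \equiv (g^{\mu\nu} - \hat t^\mu \hat t^\nu + \hat z^\mu \hat z^\nu)\, l_\nu$, the perpendicular part of the lepton
momentum $l$.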
Fig. 1. Kinematics of the annihilation process in the lepton center of mass frame for a back-to-back
jet situation. $P_2$ is the momentum of a hadron in one jet, $P_1$ is the momentum of a hadron in the
opposite jet.

In the cross sections we also encounter the following functions of $y = l^-/q^-$, which in
the lepton center of mass frame equals $y = (1 + \cos\theta_2)/2$, where $\theta_2$ is the angle of $\hat z$ with
respect to the momentum of the incoming lepton $l$:

$$A(y) = \left( \tfrac{1}{2} - y + y^2 \right) \overset{cm}{=} \tfrac{1}{4} \left( 1 + \cos^2\theta_2 \right), \tag{28}$$

$$B(y) = y(1 - y) \overset{cm}{=} \tfrac{1}{4} \sin^2\theta_2, \tag{29}$$

$$C(y) = (1 - 2y) \overset{cm}{=} -\cos\theta_2. \tag{30}$$
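The center of mass forms in Eqs. (28)–(30) follow from substituting $y = (1 + \cos\theta_2)/2$. The short check below confirms the three identities numerically; the function names are illustrative only.

```python
import math


def A_of_y(y):
    """A(y) = 1/2 - y + y^2, Eq. (28)."""
    return 0.5 - y + y * y


def B_of_y(y):
    """B(y) = y (1 - y), Eq. (29)."""
    return y * (1.0 - y)


def C_of_y(y):
    """C(y) = 1 - 2y, Eq. (30)."""
    return 1.0 - 2.0 * y


def cm_residuals(theta2):
    """Differences between the y forms and the center of mass forms at angle theta2."""
    y = 0.5 * (1.0 + math.cos(theta2))
    c, s = math.cos(theta2), math.sin(theta2)
    return (
        A_of_y(y) - 0.25 * (1.0 + c * c),
        B_of_y(y) - 0.25 * s * s,
        C_of_y(y) + c,
    )


if __name__ == "__main__":
    for theta2 in (0.0, 0.7, math.pi / 2, 2.5, math.pi):
        print(theta2, cm_residuals(theta2))
```

All residuals vanish up to rounding. At $\theta_2 = 90^\circ$ one has $y = 1/2$, where $B(y)$ is largest and $C(y) = 0$.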
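Since we have chosen $P_2$ to define the longitudinal direction, the momentum $P_1$ can
be used to express the directions orthogonal to $\hat t$ and $\hat z$. One obtains $P_{1\perp}^\mu = g_\perp^{\mu\nu} P_{1\nu}$ (see
Fig. 1),

$$P_{1\perp}^\mu = -z_1 q_T^\mu = z_1 Q_T\, \hat h^\mu, \tag{31}$$

where we define the normalized vector $\hat h^\mu = P_{1\perp}^\mu/|\mathbf{P}_{1\perp}|$. The angle $\phi_1$ is between $\hat h$
and $\hat l_\perp$.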
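The asymmetry $A(\mathbf{q}_T)$ at the $Z$ mass (ignoring the photon and interference contributions,
which can be easily included) is now defined by:

$$\frac{d\sigma(e^+e^- \to h_1 h_2 X)}{d\Omega\, dz_1\, dz_2\, d^2\mathbf{q}_T} \propto \left\{ 1 + \cos(2\phi_1)\, A(\mathbf{q}_T) \right\}, \tag{32}$$

with

$$A(\mathbf{q}_T) \equiv \frac{\sum_a c_1^e c_2^a\, B(y)\, \mathcal{F}\left[ \left( 2\, \mathbf{q}_T\cdot\mathbf{p}_T\, \mathbf{q}_T\cdot\mathbf{k}_T - \mathbf{q}_T^2\, \mathbf{p}_T\cdot\mathbf{k}_T \right) H_1^\perp \bar H_1^\perp \right]}{Q_T^2 M_1 M_2 \sum_b \left( c_1^e c_1^b A(y) - \frac{1}{2} c_3^e c_3^b C(y) \right) \mathcal{F}\left[ D_1 \bar D_1 \right]}. \tag{33}$$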
An observation of this asymmetry would confirm the Collins effect without the need of
polarization. To avoid repeating irrelevant factors, we will first focus on the numerator and
denominator of the term

$$\frac{\mathcal{F}\left[ \left( 2\, \mathbf{q}_T\cdot\mathbf{p}_T\, \mathbf{q}_T\cdot\mathbf{k}_T - \mathbf{q}_T^2\, \mathbf{p}_T\cdot\mathbf{k}_T \right) H_1^\perp \bar H_1^\perp \right]}{Q_T^2 M_1 M_2\, \mathcal{F}\left[ D_1 \bar D_1 \right]}. \tag{34}$$

In Refs. [2,3] tree level was considered, which means the expressions are valid in the
region where the observed transverse momentum is very small compared to the hard
scale(s), applicable only in the region of intrinsic transverse momentum. Eq. (34) is a tree
level expression if one uses the convolution notation

$$\mathcal{F}\left[ D \bar D \right] \equiv \int d^2\mathbf{p}_T\, d^2\mathbf{k}_T\, \delta^2(\mathbf{p}_T + \mathbf{k}_T - \mathbf{q}_T)\, D^a(z_1, z_1^2 \mathbf{p}_T^2)\, \bar D^a(z_2, z_2^2 \mathbf{k}_T^2), \tag{35}$$
which only involves intrinsic transverse momenta. In order to go beyond this region,
we include the Sudakov factor arising from resummed perturbative corrections to the
transverse momentum distribution, by considering the approach of Refs. [5,6]. This will
extend the range of applicability from the region of intrinsic transverse momentum to the
region of moderate $\mathbf{q}_T$ values (still under the restriction that $\mathbf{q}_T^2 \ll Q^2$). This is outlined for
the general case in the previous section; here we will apply it to the $\cos(2\phi)$ asymmetry.
Resummation of soft gluons into Sudakov form factors (see, e.g., [20]) results in
a replacement in Eq. (35) of

$$\delta^2(\mathbf{p}_T + \mathbf{k}_T - \mathbf{q}_T) \to \int \frac{d^2\mathbf{b}}{(2\pi)^2}\, e^{i\mathbf{b}\cdot(\mathbf{p}_T + \mathbf{k}_T - \mathbf{q}_T)}\, e^{-S(b)}, \tag{36}$$

leading to (suppressing the flavor indices)

$$\mathcal{F}\left[ D \bar D \right] \equiv \int \frac{d^2\mathbf{b}}{(2\pi)^2}\, e^{-i\mathbf{b}\cdot\mathbf{q}_T}\, e^{-S(b)}\, \tilde D(z_1, b)\, \tilde{\bar D}(z_2, b) = \frac{1}{2\pi} \int_0^\infty db\, b\, J_0(bQ_T)\, e^{-S(b)}\, \tilde D(z_1, b)\, \tilde{\bar D}(z_2, b). \tag{37}$$
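The function $\tilde D$ denotes the Fourier transform of $D$. In order to compute the above
expression, we will assume a Gaussian transverse momentum dependence for $D_1(z, z^2\mathbf{k}_T^2)$:

$$D_1(z, z^2\mathbf{k}_T^2) = D_1(z)\, \frac{R_u^2}{\pi z^2} \exp\left( -R_u^2 \mathbf{k}_T^2 \right) \equiv D_1(z)\, G(|\mathbf{k}_T|; R_u)/z^2. \tag{38}$$

For details see Ref. [19]. Taking the Fourier transform of Eq. (38) yields

$$\tilde D_1(z, b^2) = D_1(z) \exp\left( -\frac{b^2}{4R_u^2} \right) \Big/ z^2. \tag{39}$$

The numerator in Eq. (34),

$$\mathcal{F}\left[ \left( 2\, \mathbf{q}_T\cdot\mathbf{p}_T\, \mathbf{q}_T\cdot\mathbf{k}_T - \mathbf{q}_T^2\, \mathbf{p}_T\cdot\mathbf{k}_T \right) D \bar D \right] \equiv \int \frac{d^2\mathbf{b}}{(2\pi)^2}\, e^{-i\mathbf{b}\cdot\mathbf{q}_T}\, e^{-S(b)} \int d^2\mathbf{p}_T\, d^2\mathbf{k}_T \left( 2\, \mathbf{q}_T\cdot\mathbf{p}_T\, \mathbf{q}_T\cdot\mathbf{k}_T - \mathbf{q}_T^2\, \mathbf{p}_T\cdot\mathbf{k}_T \right) e^{i\mathbf{b}\cdot(\mathbf{p}_T + \mathbf{k}_T)}\, D(z_1, z_1^2\mathbf{p}_T^2)\, \bar D(z_2, z_2^2\mathbf{k}_T^2), \tag{40}$$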
cannot be treated exactly like the denominator. A model for the transverse momentum
dependence of the function $H_1^\perp$ is needed. In order to get a first estimate, or rather an
upper bound for the asymmetry, one might be inclined to assume the maximally allowed
function, by saturating the bound satisfied by $H_1^\perp$ [21]

$$|\mathbf{k}_T|\, H_1^\perp(z, |\mathbf{k}_T|) \leq z M_h\, D_1(z, |\mathbf{k}_T|), \tag{41}$$

producing a $1/|\mathbf{k}_T|$ behavior of $H_1^\perp(z, \mathbf{k}_T)$. However, this is not consistent with the fact
that the Collins effect should vanish in the limit $\mathbf{k}_T \to 0$.
A model by Collins [1] suggests the following transverse momentum dependence.
Collins' parameterization for the fragmentation function $H_1^\perp$ is (note that Collins uses
the function $\Delta\hat D_{H/a} \sim \epsilon_T^{ij} s_{Ti} k_{Tj} H_1^\perp$, where $s_T$ is the transverse spin of the fragmenting
quark)

$$\frac{H_1^\perp(z, \mathbf{k}_T^2)}{D_1(z, \mathbf{k}_T^2)} \sim \frac{2 M_C M_h}{\mathbf{k}_T^2 + M_C^2}\, \mathrm{Im}\left[ A^*(k^2) B(k^2) \right] \frac{(1 - z)}{z}, \tag{42}$$

where $M_h$ is the mass of the produced hadron and $M_C$ is the quark mass that appears in
a dressed fermion propagator $i(A(k^2)\slashed{k} + B(k^2) M_C)/(k^2 - M_C^2)$; the functions $A$ and $B$
are unity at $k^2 = M_C^2$. But for the present purpose, the additional fall-off with $1/\mathbf{k}_T^2$ on top
of the Gaussian fall-off is not needed. We will restrict to a Gaussian fall-off (with unknown
magnitude) and assume the simple form $H_1^{\perp a}(z, z^2\mathbf{k}_T^2) = H_1^{\perp a}(z)\, G(|\mathbf{k}_T|; R)/z^2$. Here
one should take the radius $R$ to be larger than $R_u$ of the unpolarized function $D_1$, such
as to satisfy the bound Eq. (41) for all $|\mathbf{k}_T|$. Also, we will assume that the fragmentation
functions for both hadrons are Gaussians of equal width, i.e., we take $R_1 = R_2 = R$ and
$R_{u1} = R_{u2} = R_u$ and also $M_1 = M_2 = M$.
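One finds

$$\int d^2\mathbf{p}_T\, d^2\mathbf{k}_T \left( 2\, \mathbf{q}_T\cdot\mathbf{p}_T\, \mathbf{q}_T\cdot\mathbf{k}_T - \mathbf{q}_T^2\, \mathbf{p}_T\cdot\mathbf{k}_T \right) e^{i\mathbf{b}\cdot(\mathbf{p}_T + \mathbf{k}_T)}\, G(|\mathbf{p}_T|; R)\, G(|\mathbf{k}_T|; R) = -\frac{1}{4R^4} \left[ 2 (\mathbf{q}_T\cdot\mathbf{b})^2 - \mathbf{q}_T^2 \mathbf{b}^2 \right] \exp\left( -\frac{b^2}{2R^2} \right), \tag{43}$$

which after application to Eq. (40) yields

$$\mathcal{F}\left[ \left( 2\, \mathbf{q}_T\cdot\mathbf{p}_T\, \mathbf{q}_T\cdot\mathbf{k}_T - \mathbf{q}_T^2\, \mathbf{p}_T\cdot\mathbf{k}_T \right) D \bar D \right] = -\int \frac{d^2\mathbf{b}}{(2\pi)^2}\, e^{-i\mathbf{b}\cdot\mathbf{q}_T}\, \frac{1}{4R^4} \left[ 2 (\mathbf{q}_T\cdot\mathbf{b})^2 - \mathbf{q}_T^2 \mathbf{b}^2 \right] e^{-S(b)}\, \tilde D(z_1, b)\, \tilde{\bar D}(z_2, b). \tag{44}$$

It is important to note that the factor $1/R^4$ stems from the intrinsic transverse momentum
of the functions $H_1^\perp(z, \mathbf{k}_T)$ and therefore, is not dependent on the scale $Q$ (see discussion
after Eq. (11)).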
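In analogy to $A_{TT}^V$ [4], Eq. (34) can be transformed into (keeping in mind that in the end
the summation over $a, b$ in numerator and denominator should be performed separately)

$$\frac{\mathcal{F}\left[ \left( 2\, \mathbf{q}_T\cdot\mathbf{p}_T\, \mathbf{q}_T\cdot\mathbf{k}_T - \mathbf{q}_T^2\, \mathbf{p}_T\cdot\mathbf{k}_T \right) H_1^\perp \bar H_1^\perp \right]}{Q_T^2 M_1 M_2\, \mathcal{F}\left[ D_1 \bar D_1 \right]} = \frac{H_1^{\perp a}(z_1)\, \bar H_1^{\perp a}(z_2)}{4 M^4 R^4\, D_1^b(z_1)\, \bar D_1^b(z_2)}\, \mathcal{A}(Q_T) = \frac{H_1^{\perp(1)a}(z_1)\, \bar H_1^{\perp(1)a}(z_2)}{D_1^b(z_1)\, \bar D_1^b(z_2)}\, \mathcal{A}(Q_T), \tag{45}$$

where for future reference we have also expressed the asymmetry in terms of the function
$H_1^{\perp(1)}(z) = z^2 \int d^2\mathbf{k}_T\, \mathbf{k}_T^2/(2M^2)\, H_1^\perp(z, z^2\mathbf{k}_T^2)$, which occurs frequently in $Q_T$-weighted
cross sections (e.g., [2]); furthermore,

$$\mathcal{A}(Q_T) \equiv M^2\, \frac{\int_0^\infty db\, b^3\, J_2(bQ_T) \exp\left( -S(b_*) - S_{NP}(b) \right)}{\int_0^\infty db\, b\, J_0(bQ_T) \exp\left( -S(b_*) - S_{NP}(b) \right)}. \tag{46}$$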
This is similar to the factor $\mathcal{A}(Q_T)$ of Ref. [4], but with the replacement of $J_0(bQ_T) \to J_2(bQ_T)$
in the numerator. We note that unlike the case of $A_{TT}^V$ investigated in Ref. [4],
the present asymmetry does not need to oscillate as a function of $Q_T$. Rather the vanishing
of the numerator after integration over $d^2\mathbf{q}_T$ is due to the angular integration. Note that
the expression in Eq. (45) only depends on $Q_T$ and not on $\phi_1$. Also, we note that this asymmetry has a kinematic
zero at $Q_T = 0$, since $\hat h$ cannot be defined in that case (and indeed $J_2(bQ_T) = 0$ at $Q_T = 0$). This
is also seen from Ref. [22], where a similar asymmetry factor has been investigated as a
function of $Q_T$.
Here the main focus will be on the factor $\mathcal{A}(Q_T)$, which is a measure for the effect
of the Sudakov factors on the asymmetry compared to the tree level result
$M^2 Q_T^2 (R^6/R_u^2) \exp[-(R^2 - R_u^2) Q_T^2/2]$ (cf. Eq. (48)), which is valid only for values of $Q_T$ of the order
of the intrinsic transverse momentum.
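The tree level quantity quoted above can be checked directly: replacing $\exp(-S(b_*) - S_{NP}(b))$ in Eq. (46) by the Gaussians $\exp(-b^2/2R^2)$ (numerator) and $\exp(-b^2/2R_u^2)$ (denominator) gives two standard Gaussian-weighted Bessel integrals. The sketch below does this numerically with the standard library only; the Bessel functions are evaluated from Bessel's integral representation, and the function names, grid sizes, and cut-off `b_hi` are illustrative assumptions.

```python
import math


def bessel_j(n, x, m=80):
    """J_n(x) for integer n from (1/pi) * int_0^pi cos(n*tau - x*sin(tau)) dtau (trapezoid rule)."""
    total = 0.5 * (1.0 + math.cos(n * math.pi))  # endpoints tau = 0 and tau = pi
    for i in range(1, m):
        tau = math.pi * i / m
        total += math.cos(n * tau - x * math.sin(tau))
    return total / m


def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def asym_factor_tree(q_t, R2, Ru2, M=1.0, b_hi=12.0, n=1200):
    """Eq. (46) with Gaussian weights only (tree level); R2 and Ru2 in GeV^-2."""
    num = simpson(lambda b: b ** 3 * bessel_j(2, b * q_t) * math.exp(-b * b / (2.0 * R2)), 0.0, b_hi, n)
    den = simpson(lambda b: b * bessel_j(0, b * q_t) * math.exp(-b * b / (2.0 * Ru2)), 0.0, b_hi, n)
    return M * M * num / den


def asym_factor_tree_closed(q_t, R2, Ru2, M=1.0):
    """Closed form M^2 Q_T^2 (R^6/R_u^2) exp[-(R^2 - R_u^2) Q_T^2 / 2]."""
    return M * M * q_t * q_t * R2 ** 3 / Ru2 * math.exp(-(R2 - Ru2) * q_t * q_t / 2.0)


if __name__ == "__main__":
    for q_t in (0.5, 1.0):
        print(q_t, asym_factor_tree(q_t, 1.5, 1.0), asym_factor_tree_closed(q_t, 1.5, 1.0))
```

The example values $R_u^2 = 1\ \text{GeV}^{-2}$ and $R^2/R_u^2 = 3/2$ are the ones used for the tree level curve in Fig. 4.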
In the above expression we have introduced the usual cut-off $b_{\max}$, via
$b_* = b/\sqrt{1 + b^2/b_{\max}^2}$, and replaced $\frac{1}{2} b^2/R^2 \to S_{NP}(b)$, for which we take one of the standard
nonperturbative smearing functions, needed to describe the low $\mathbf{q}_T$ region properly. It is
important to realize that $S_{NP}(b)$ is introduced only in part to take care of the smearing
due to the intrinsic transverse momentum (cf. Eq. (11)), hence one cannot simply equate
$\frac{1}{2} b^2/R^2$ with $S_{NP}(b)$. But taking into account the term $\exp(-\frac{1}{2} b^2/R^2)$ in addition to
$S_{NP}(b)$ will just produce a change in the coefficient of the $b^2$ term in $S_{NP}(b)$. To keep the
unpolarized cross section unaffected, we will, therefore, introduce as nonperturbative term
$\exp[-S_{NP}(b) + \frac{1}{2} b^2/R^2]$, in order not to count the contribution from intrinsic transverse
momentum twice. It is also worth mentioning again that since $R \neq R_u$ (important at tree
level), $S_{NP}(b)$ need not be the same in numerator and denominator (much less relevant
however, since it affects only $g_A$, $g_B$ and not $g_Q$, cf. Eq. (11)) and that in principle, it can
depend on $z_1$ and $z_2$, but we will not take into account these refinements.
Here we will take for the nonperturbative Sudakov factor the parameterization of
Ladinsky–Yuan (Ref. [23] with $x_1 x_2 = 10^{-2}$),

$$S_{NP}(b) = g_1 b^2 + g_2 b^2 \ln\left( \frac{Q}{2Q_0} \right), \tag{47}$$

with $g_1 = 0.11\ \text{GeV}^2$, $g_2 = 0.58\ \text{GeV}^2$, $Q_0 = 1.6$ GeV and $b_{\max} = 0.5\ \text{GeV}^{-1}$. Part of the
results will depend considerably on this choice and this issue will be addressed in detail
below. The value $g_1 = 0.11\ \text{GeV}^2$ can be viewed as an intrinsic transverse momentum of
$\langle p_T^2 \rangle = 1/R^2 = (220\ \text{MeV})^2$.
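Since Eq. (47) is purely quadratic in $b$, it can be written as $S_{NP}(b) = c(Q)\, b^2$ with $c(Q) = g_1 + g_2 \ln(Q/2Q_0)$. The small sketch below evaluates this coefficient for the parameter values quoted above; the value $M_Z = 91.19$ GeV and the function names are assumptions made for illustration.

```python
import math

G1 = 0.11   # GeV^2, Ladinsky-Yuan parameter quoted in the text
G2 = 0.58   # GeV^2, Ladinsky-Yuan parameter quoted in the text
Q0 = 1.6    # GeV
M_Z = 91.19  # GeV (assumed value of the Z mass)


def snp_coefficient(Q):
    """Coefficient c(Q) of b^2 in Eq. (47), in GeV^2."""
    return G1 + G2 * math.log(Q / (2.0 * Q0))


def s_np(b, Q):
    """Nonperturbative Sudakov exponent S_NP(b) of Eq. (47); b in GeV^-1, Q in GeV."""
    return snp_coefficient(Q) * b * b


if __name__ == "__main__":
    for Q in (30.0, 60.0, M_Z):
        print(Q, round(snp_coefficient(Q), 2))
```

The coefficient grows only logarithmically with $Q$, so the Gaussian smearing width in $S_{NP}$ increases slowly with energy.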
The reason we have chosen the parameterization of Ladinsky–Yuan [23], which is
fitted to the transverse momentum distribution of $W/Z$ production in $p\bar p$ ($pp$)
scattering,
is that unfortunately there is no nonperturbative Sudakov factor available for $e^+e^- \to A + B + X$,
except for the energy–energy correlation function obtained from low energy
data [24]. This is surprising considering the wealth of data from LEP experiments.
Moreover, the nonperturbative Sudakov factor in the energy–energy correlation function
($\sum_{A,B} \int dz_1\, z_1\, dz_2\, z_2\, Q^2\, d\sigma/dQ_T^2$) as fitted in Ref. [24] is not the same as the one in
the differential cross section as a function of the lightcone momentum fractions. For related
discussions, also relevant for SIDIS, see Refs. [25,26].
Although the nonperturbative Sudakov factors for Drell–Yan and $e^+e^-$ need not be
related, the $S_{NP}$ of Ref. [23] can be viewed as a generic one and allows us to study the
general features of the Sudakov suppression. We will also investigate the dependence
on the nonperturbative Sudakov factor by varying the parameters. At a later stage one
can always insert a more appropriate (phenomenologically determined) nonperturbative
Sudakov factor into the asymmetry expression Eq. (46). Our results also underline the
importance of a good determination of $S_{NP}$.

Fig. 2. The asymmetry factor $\mathcal{A}(Q_T)$ (in units of $M^2$) at $Q = 30$ GeV (upper curve), $Q = 60$ GeV
(middle curve) and at $Q = 90$ GeV.
In Fig. 2 the asymmetry factor $\mathcal{A}(Q_T)$ is given at the scales $Q = 30$ GeV, $Q = 60$ GeV
and $Q = M_Z$.
The asymmetry factor $\mathcal{A}(Q_T)$ at $Q = 30$ GeV ($S_{NP}(b) = 1.41\, b^2$), $Q = 60$ GeV
($S_{NP}(b) = 1.81\, b^2$) and $Q = M_Z$ ($S_{NP}(b) = 2.05\, b^2$) is seen to be $\sim 0.57$ (at $Q_T \sim 3.6$ GeV),
$\sim 0.31$ (at $Q_T \sim 3.8$ GeV) and $\sim 0.22$ (at $Q_T \sim 4$ GeV), respectively. One observes that the
magnitude of the asymmetry factor goes down with increasing energy and the position of
the maximum and also the average $Q_T$ shifts to higher values of $Q_T$.
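For readers who want to explore this behavior, the following is a minimal numerical sketch of Eq. (46), combining the leading-order Sudakov exponent of Eq. (7), evaluated at $b_*$, with $S_{NP}(b)$ of Eq. (47). It is not the code used for the figures. It assumes that the closed form of Eq. (7) is used for all $b > 0$ (including $b < b_0/Q$), sets $M = 1$ so that the result is in units of $M^2$, and uses illustrative grid sizes; all function names are hypothetical.

```python
import math

B0 = 2.0 * math.exp(-0.5772156649015329)      # b_0 = 2 exp(-gamma_E)
N_F, LAMBDA_QCD = 5, 0.2                      # flavors, Lambda_QCD in GeV
G1, G2, Q0, B_MAX = 0.11, 0.58, 1.6, 0.5      # Ladinsky-Yuan values, b_max in GeV^-1


def bessel_j(n, x, m=64):
    """J_n(x) for integer n via Bessel's integral and the trapezoid rule."""
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for i in range(1, m):
        tau = math.pi * i / m
        total += math.cos(n * tau - x * math.sin(tau))
    return total / m


def simpson(f, a, b, n):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def sudakov(b, Q):
    """Leading-order Sudakov exponent, Eq. (7)."""
    l = math.log(b * b * Q * Q / (B0 * B0))
    L = math.log(Q * Q / (LAMBDA_QCD * LAMBDA_QCD))
    return -16.0 / (33.0 - 2.0 * N_F) * (l + L * math.log(1.0 - l / L))


def b_star(b):
    """Regulated impact parameter b_* = b / sqrt(1 + b^2 / b_max^2)."""
    return b / math.sqrt(1.0 + b * b / (B_MAX * B_MAX))


def s_np(b, Q):
    """Nonperturbative exponent, Eq. (47)."""
    return (G1 + G2 * math.log(Q / (2.0 * Q0))) * b * b


def asymmetry_factor(q_t, Q, b_hi=5.0, n=1000):
    """Asymmetry factor of Eq. (46) in units of M^2; q_t and Q in GeV."""
    def weight(b):
        # The explicit powers of b make the integrands vanish at b = 0.
        return math.exp(-sudakov(b_star(b), Q) - s_np(b, Q)) if b > 0.0 else 0.0

    num = simpson(lambda b: b ** 3 * bessel_j(2, b * q_t) * weight(b), 0.0, b_hi, n)
    den = simpson(lambda b: b * bessel_j(0, b * q_t) * weight(b), 0.0, b_hi, n)
    return num / den


if __name__ == "__main__":
    for Q in (30.0, 60.0, 90.0):
        print(Q, [round(asymmetry_factor(q_t, Q), 3) for q_t in (1.0, 2.0, 3.5, 5.0)])
```

Scanning `q_t` at fixed `Q` traces out curves of the kind shown in Fig. 2. Changing `G1`, `G2`, or `B_MAX` shows how strongly the result depends on the nonperturbative input.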
Now we will discuss the dependence of these results on the choice of bmax and SNP .
Taking a higher value of $b_{\max}$ increases the Gaussian width. The above choice of $b_{\max} = 0.5\ \text{GeV}^{-1}$
can be considered as optimistic ($1/b_{\max}$ is the scale down to which one trusts
perturbation theory), enhancing the asymmetry factor somewhat.
The asymmetry factor decreases with increasing Gaussian smearing width in SNP , to
which it has a considerable sensitivity. Empirically, we find that if the Gaussian width is
reduced by a factor , then the maximum of the asymmetry increases roughly by and
the maximum value of QT by . In Fig. 3 this is illustrated for the asymmetry factor
$\mathcal{A}(Q_T)$ at $Q = M_Z$. The solid curve is for $S_{NP}(b) = 2.05\, b^2$ and the dashed curve for
$S_{NP}(b) = 1.37\, b^2$, where the latter width is taken from a recent two parameter fit [27].
The decrease with energy of the maximum of the asymmetry was found not to be very
sensitive to changes to the Gaussian width in $S_{NP}$. We find that the decrease goes as
$Q^{-0.9}$ to $Q^{-1.0}$.
So far we have only considered the asymmetry as a function of $Q_T$, which means the
cross section needs to be kept differential in the angle $\phi_1$ and the magnitude $Q_T$ of the
transverse momentum $\mathbf{q}_T$. Since the asymmetry does not vanish after integration over
$Q_T^2$ and since the $Q_T^2$-integrated cross section has been studied using LEP data [28], we
will now consider that case. Note that the integration has to be done in numerator and
denominator separately. Since the denominator has no dependence on $\phi_1$ (the $\cos(2\phi_1)$
dependence belongs to the numerator) one can in fact integrate over $\mathbf{q}_T$ completely. This
corresponds to the inclusive cross section in which soft gluon contributions cancel, hence
this integration over the denominator of the asymmetry factor $\mathcal{A}(Q_T)$ yields 1 and has no
dependence on Sudakov factors.

Fig. 3. The asymmetry factor $\mathcal{A}(Q_T)$ (in units of $M^2$) at $Q = 90$ GeV with $S_{NP}(b) = 2.05\, b^2$ (solid
curve) and $S_{NP}(b) = 1.37\, b^2$ (dashed curve).

Fig. 4. The asymmetry factor $\mathcal{A}(Q_T)$ (in units of $M^2$) at $Q = 90$ GeV multiplied by a factor 10
(solid curve) and the tree level quantity (in units of $M^2$) using $R_u^2 = 1\ \text{GeV}^{-2}$ and $R^2/R_u^2 = 3/2$.
The $Q_T^2$-integrated result should be compared to tree level, therefore, we will first give
the tree level expression. The tree level asymmetry is (neglecting the small $c_3^a$ dependent
term in the denominator)

$$A^{(0)}(Q_T) = \frac{\sum_a c_2^a\, H_1^{\perp a}(z_1)\, \bar H_1^{\perp a}(z_2)}{\sum_b c_1^b\, D_1^b(z_1)\, \bar D_1^b(z_2)}\; \frac{Q_T^2 R^2 \exp(-R^2 Q_T^2/2)}{4 M^2 R_u^2 \exp(-R_u^2 Q_T^2/2)}\; \frac{\sin^2\theta_2}{1 + \cos^2\theta_2}. \tag{48}$$

Note that if one integrates the cross section over $Q_T^2$, one should be careful to retain
all dependences on $Q_T$ in both numerator and denominator separately. From the above
expression we also infer that $\mathcal{A}(Q_T)$ itself should be compared with the tree level quantity
$M^2 Q_T^2 (R^6/R_u^2) \exp[-(R^2 - R_u^2) Q_T^2/2]$ as mentioned before.
In Fig. 4 we have displayed the comparison of $\mathcal{A}(Q_T)$ at $Q = 90$ GeV and the tree level
quantity using the values $R_u^2 = 1\ \text{GeV}^{-2}$ and $R^2/R_u^2 = 3/2$, which were chosen such as to
minimize the magnitude. The value $R_u^2 = 1\ \text{GeV}^{-2}$ can be regarded as too small already.
We conclude that inclusion of Sudakov factors has the effect of suppressing the tree level
result by at least an order of magnitude. This is important to keep in mind when making
predictions of transverse momentum dependent azimuthal spin asymmetries based on tree
level expressions.
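For the particular case of a produced $\pi^+$ and $\pi^-$, we will make the following simplifying
assumptions. We assume $D_1^{u\to\pi^+}(z) = D_1^{\bar d\to\pi^+}(z)$, $D_1^{d\to\pi^-}(z) = D_1^{\bar u\to\pi^-}(z)$ and neglect
unfavored fragmentation functions like $D^{d\to\pi^+}(z)$; and similarly for the Collins functions.
As a consequence of these assumptions the fragmentation functions can be taken outside
the flavor summation and appear as a square. After $Q_T^2$ integration one arrives at a cross
section differential in the angle $\phi_1$ of $\mathbf{q}_T$:

$$\frac{d\sigma}{d\Omega\, dz_1\, dz_2\, d\phi_1} \propto \left\{ 1 + A \left[ \frac{H_1^\perp(z_1)}{D_1(z_1)} \right]^2 F(y) \cos(2\phi_1) \right\}, \tag{49}$$

where

$$F(y) = \frac{\sum_{a=u,d} c_1^e c_2^a\, B(y)}{\sum_{b=u,d} \left( c_1^e c_1^b A(y) - \frac{1}{2} c_3^e c_3^b C(y) \right)} \approx \frac{\sum_a \left( {g_V^a}^2 - {g_A^a}^2 \right)}{\sum_b \left( {g_V^b}^2 + {g_A^b}^2 \right)}\, \frac{\sin^2\theta_2}{1 + \cos^2\theta_2}, \tag{50}$$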
which is largest at $\theta_2 = 90^\circ$: $F_{\max} \approx 0.5$ (for a plot of the full factor $F(\theta_2)$ see Ref. [3]).
For the prefactor $A$ one finds at tree level $A^{(0)} = 1/(2M^2R^2) = \langle p_T^2 \rangle/(2M^2)$ and one
should assume a typical intrinsic transverse momentum squared value, in the range of
$\langle p_T^2 \rangle \approx (200\text{–}700\ \text{MeV})^2$. For pions this means $A^{(0)} \approx 1\text{–}12$. This can be compared to
$A^{(0)} = 6/\pi$ of Ref. [28], which yields $\langle p_T^2 \rangle \approx (270\ \text{MeV})^2$ (or $R^2 \approx 13\ \text{GeV}^{-2}$), which
seems to be a reasonable value (remember that $R^2$ must be larger than $R_u^2$).
After including the Sudakov factors our numerical calculation yields $A = 0.07$
($S_{NP}(b, Q = M_Z) = 2.05\, b^2$), much smaller than the tree level values discussed above.
The Gaussian width in $S_{NP}$ is not crucial for this conclusion; if one takes a much smaller
width of, for instance, $1\ \text{GeV}^2$, then one obtains $A = 0.11$. It has to be emphasized that the
width used in $S_{NP}(b, Q)$ should increase with increasing energy.
Our result shows that upon including Sudakov factors one retrieves parton model
characteristics (also noted in Ref. [6]), but with transverse momentum spreads that are
significantly larger than would be expected from intrinsic transverse momentum (this is
supported by the presently available parameterizations of $S_{NP}$ in various processes, see,
e.g., [25]). This means that a large tree level value for $A^{(0)}$ (e.g., $6/\pi$) would lead to an
extracted Collins function that is considerably smaller than if Sudakov factors would have
been included. After the nonperturbative Sudakov factor has been obtained from LEP data,
one can estimate exactly how much. For our choice of $S_{NP}(b, Q = M_Z) = 2.05\, b^2$, the
effect on the magnitude of the Collins function is an additional factor $\sqrt{6/(0.07\pi)} \approx 5$
compared to the extraction using the tree level expression.
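The numbers quoted in this part of the section follow from simple arithmetic, sketched below. Because the asymmetry in Eq. (49) is proportional to $A\,[H_1^\perp/D_1]^2$, a fixed measured asymmetry implies that the extracted Collins function scales as $1/\sqrt{A}$. The pion mass value and the function names are assumptions made for illustration.

```python
import math

M_PION = 0.1396  # GeV, charged pion mass (assumed value)


def tree_prefactor(mean_pt2, M=M_PION):
    """Tree level prefactor A^(0) = <p_T^2> / (2 M^2); mean_pt2 in GeV^2."""
    return mean_pt2 / (2.0 * M * M)


def mean_pt2_from_prefactor(a0, M=M_PION):
    """Invert A^(0) = <p_T^2> / (2 M^2) for <p_T^2> in GeV^2."""
    return 2.0 * M * M * a0


def collins_rescaling(a_tree, a_sudakov):
    """Factor by which the extracted Collins function grows when A drops from a_tree to a_sudakov."""
    return math.sqrt(a_tree / a_sudakov)


if __name__ == "__main__":
    print(tree_prefactor(0.2 ** 2), tree_prefactor(0.7 ** 2))   # range of A^(0) for pions
    pt2 = mean_pt2_from_prefactor(6.0 / math.pi)
    print(math.sqrt(pt2), 1.0 / pt2)                            # <p_T^2>^(1/2) in GeV, R^2 in GeV^-2
    print(collins_rescaling(6.0 / math.pi, 0.07))               # rescaling factor
```

With these inputs the prefactor range, the inferred $\langle p_T^2 \rangle$ and $R^2$, and the rescaling factor of about 5 agree with the values given in the text.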
We conclude that the Sudakov factors produce a strong suppression compared to the
tree level result. A tree level analysis applied at Q = MZ is expected to overestimate
the asymmetry at least by an order of magnitude and therefore, it will underestimate the
Collins function significantly. Hence, Sudakov factors should be taken into account when
extracting the Collins function from LEP data or, in general, from $e^+e^-$ data obtained at
high values of $\sqrt{s}$.
4. Collins asymmetry in SIDIS

The Collins function $H_1^\perp$ originally was shown to lead to a single spin azimuthal
asymmetry in semi-inclusive DIS [1]. This asymmetry has received much attention, since
it would provide an additional way of accessing the transversity function $h_1$. A preliminary
determination of the Collins asymmetry from SMC data has been performed [29],
yielding an asymmetry for $\pi^+$ production of $11\% \pm 6\%$. Also, the $\sin\phi$ azimuthal
asymmetry as recently measured by the HERMES Collaboration [30] might be related to
the Collins asymmetry, providing further indication that the Collins function is nonzero.
Measurements by HERMES and COMPASS are expected to provide more conclusive
information on the Collins effect and its magnitude. Since these experiments are performed
at different energies, it is important to know the $Q^2$ dependence of the asymmetry. Here
we will investigate the $Q^2$ dependence of the transverse momentum ($Q_T$) distribution of
the asymmetry.
From Ref. [31] we extract the expression for this single spin asymmetry (extended to
include contributions from Z-boson exchange; the expressions for W -boson exchange can
be found in Ref. [31]):
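\[
\frac{d\sigma(\ell H \to \ell' h X)}{dx\, dz\, dy\, d\phi_\ell\, d^2 q_T}
= \frac{2\alpha^2\, x z^2 s}{Q^4} \sum_{a,\bar a} \left\{ K_1^a(y)\, \mathcal{F}[f_1 D_1]
+ |S_T|\, K_3^a(y) \sin(\phi_h + \phi_S)\, \mathcal{F}\left[\hat h \cdot k_T\, \frac{h_1 H_1^\perp}{M_h}\right] + \cdots \right\}, \qquad (51)
\]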
where $S_T$ is the transverse spin of the incoming hadron $H$. The couplings $K_1^a(y)$ and
$K_3^a(y)$ are of the same form as before, except that now
\[ A(y) = 1 - y + \tfrac{1}{2} y^2, \qquad (52) \]
\[ B(y) = (1 - y), \qquad (53) \]
\[ C(y) = y(2 - y), \qquad (54) \]
and $y = (P \cdot q)/(P \cdot l) \approx q^-/l^-$ ($l$ is the momentum of the beam lepton). Also, since $Q^2$
is now space-like, the width $\Gamma_Z$ can be ignored:
\[ \chi_1 = \frac{1}{\sin^2(2\theta_W)}\, \frac{Q^2}{Q^2 + M_Z^2}, \qquad (55) \]
\[ \chi_2 = (\chi_1)^2. \qquad (56) \]
The azimuthal angle $\phi_h$ ($\phi_S$) around the three-momentum of the virtual boson is between
the lepton's three-momentum and the outgoing hadron's three-momentum (transverse
spin), cf. Ref. [31].
Fig. 5. The asymmetry factor $\mathcal{A}(Q_T)$ (in units of $M_h$) at $Q = 30$ GeV (upper curve), $Q = 60$ GeV
(middle curve) and at $Q = 90$ GeV.
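If we write the cross section as
\[
\frac{d\sigma(\ell H \to \ell' h X)}{dx\, dz\, dy\, d\phi_\ell\, d^2 q_T} \propto \left\{ 1 + |S_T| \sin(\phi_h + \phi_S)\, A(q_T) \right\}, \qquad (57)
\]
the asymmetry analyzing power is given by
\[
A(q_T) \equiv \frac{\sum_a K_3^a(y)\, \mathcal{F}[q_T \cdot k_T\, h_1 H_1^\perp]}{Q_T M_h \sum_a K_1^a(y)\, \mathcal{F}[f_1 D_1]}. \qquad (58)
\]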
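Assuming again Gaussian transverse momentum dependence and including Sudakov
factors one arrives at
\[
A(q_T) = \frac{\sum_a K_3^a(y)\, h_1^a(x)\, H_1^{\perp a}(z)}{2 M_h^2 R^2 \sum_b K_1^b(y)\, f_1^b(x)\, D_1^b(z)}\, \mathcal{A}(Q_T), \qquad (59)
\]
where $R^2$ is the Gaussian width of $H_1^\perp$ and the asymmetry factor $\mathcal{A}(Q_T)$ is defined as
\[
\mathcal{A}(Q_T) \equiv M_h\, \frac{\int_0^\infty db\, b^2 J_1(bQ_T) \exp(-S(b_*) - S_{NP}(b))}{\int_0^\infty db\, b\, J_0(bQ_T) \exp(-S(b_*) - S_{NP}(b))}. \qquad (60)
\]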
This should be compared with Eq. (46) of the previous section. In Fig. 5 the asymmetry
factor $\mathcal{A}(Q_T)$ is given at the scales $Q = 30$ GeV, $Q = 60$ GeV and $Q = M_Z$.
The maximum of the asymmetry factor $\mathcal{A}(Q_T)$ at $Q = 30$ GeV, $Q = 60$ GeV and
$Q = M_Z$ is seen to be 0.60 (at $Q_T \approx 3.1$ GeV), 0.42 (at $Q_T \approx 3.4$ GeV) and 0.34 (at
$Q_T \approx 3.5$ GeV), respectively. Again we note that the magnitude of the asymmetry factor
goes down with increasing energy and the position of the maximum shifts to higher values
of $Q_T$.
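As an orientation for Eq. (60), one can look at a simplified limit that is not considered in the text: drop the perturbative factor $S(b_*)$ and keep only a Gaussian $S_{NP}(b) = g\,b^2$, where $g$ is a symbol introduced here purely for illustration. Both Bessel integrals are then elementary:

```latex
\int_0^\infty db\, b^2 J_1(bQ_T)\, e^{-g b^2} = \frac{Q_T}{4g^2}\, e^{-Q_T^2/(4g)},
\qquad
\int_0^\infty db\, b\, J_0(bQ_T)\, e^{-g b^2} = \frac{1}{2g}\, e^{-Q_T^2/(4g)},
\qquad\Longrightarrow\qquad
\mathcal{A}(Q_T)\Big|_{S = 0} = \frac{M_h Q_T}{2g}.
```

In this limit the asymmetry factor rises linearly with $Q_T$ and has no maximum, so within Eq. (60) the turnover seen in the numerical results has to come from the perturbative Sudakov factor.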
As before we have used the nonperturbative Sudakov factor by Ladinsky and Yuan,
Eq. (47), in the absence of a well-established $S_{NP}$ for SIDIS (cf. Refs. [25,26]). The
results obtained here can nevertheless be viewed as generic; only changes in the
specific numbers are expected. As for the previous $\cos(2\phi)$ asymmetry, the asymmetry
factor decreases with increasing Gaussian smearing width. Empirically we find that if the
Gaussian width in SNP is reduced by a factor , then both the maximum of the asymmetry
Fig. 6. The asymmetry factor $\mathcal{A}(Q_T)$ (in units of $M_h$) at $Q = 90$ GeV multiplied by a factor 5 (solid
curve) and the tree level quantity (in units of $M_h$) using $R_u^2 = 1~\mathrm{GeV}^{-2}$ and $R^2/R_u^2 = 3/2$.
factor and the maximum value of QT increase roughly by . However, the decrease with
energy of the maximum of the asymmetry is not very sensitive to changes in this Gaussian
width. We find that the decrease goes as $Q^{-0.5}$ to $Q^{-0.6}$, which is considerably slower than
for the $\cos(2\phi)$ asymmetry. In the next section we will discuss the comparison between
different asymmetries in more detail.
We will now compare these results to the tree level expression for this Collins
asymmetry. At tree level the convolution $\mathcal{F}$ (where $w$ denotes a weight function) is given
by
\[
\mathcal{F}[w(p_T, k_T)\, f D] \equiv \int d^2 p_T\, d^2 k_T\, \delta^2(p_T + q_T - k_T)\, w(p_T, k_T)\, f^a(x, p_T^2)\, D^a(z, z^2 k_T^2). \qquad (61)
\]
If Gaussian transverse momentum distribution and fragmentation functions are assumed,
one obtains at tree level
\[
A^{(0)}(Q_T) = \frac{\sum_a K_3^a(y)\, h_1^a(x)\, H_1^{\perp a}(z)}{\sum_b K_1^b(y)\, f_1^b(x)\, D_1^b(z)}\;
\frac{Q_T R^2 \exp(-R^2 Q_T^2/2)}{2 M_h R_u^2 \exp(-R_u^2 Q_T^2/2)}. \qquad (62)
\]
Therefore, one should compare $\mathcal{A}(Q_T)$ with the tree level quantity
$\exp[-(R^2 - R_u^2) Q_T^2/2]\, M_h Q_T R^4/R_u^2$.
In Fig. 6 we have displayed the comparison of $\mathcal{A}(Q_T)$ at $Q = 90$ GeV and the tree level
quantity using again the values $R_u^2 = 1~\mathrm{GeV}^{-2}$ and $R^2/R_u^2 = 3/2$, which minimize the
maximum asymmetry value. As before we conclude that inclusion of Sudakov factors has
the effect of suppressing the tree level result. It is also clear that for the Collins asymmetry
(which arises from a single Collins effect) the Sudakov suppression is less severe than for
the case of the $\cos(2\phi)$ asymmetry (which depends on the Collins effect squared). One also
observes a pronounced increase of the average $Q_T$ upon inclusion of Sudakov factors. The
difference to tree level becomes even more pronounced as the choice of $R_u^2$ is increased
to larger values more appropriate for a tree level analysis. Hence, also for transverse
momentum dependent azimuthal spin asymmetries in SIDIS one needs to include Sudakov
factors in order to extract distribution and fragmentation functions reliably.
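To get a feeling for the tree level curve in this comparison, the following sketch evaluates the tree level quantity $\exp[-(R^2 - R_u^2)Q_T^2/2]\, Q_T R^4/R_u^2$ (in units of $M_h$) for the quoted widths, reading them as $R_u^2 = 1$ GeV$^{-2}$ and $R^2 = 1.5$ GeV$^{-2}$. The function names and the grid scan are illustrative choices, not part of the original analysis.

```python
import math

def tree_level_quantity(qt, r2=1.5, ru2=1.0):
    """exp[-(R^2 - R_u^2) Q_T^2 / 2] * Q_T * R^4 / R_u^2, in units of M_h.

    qt is in GeV; r2 and ru2 are Gaussian widths in GeV^-2.
    """
    return math.exp(-(r2 - ru2) * qt ** 2 / 2.0) * qt * r2 ** 2 / ru2

def peak_position(r2=1.5, ru2=1.0):
    """Setting the Q_T derivative to zero gives Q_T^2 = 1 / (R^2 - R_u^2)."""
    return 1.0 / math.sqrt(r2 - ru2)

qt_peak = peak_position()                  # analytic peak position in GeV
peak_value = tree_level_quantity(qt_peak)  # peak height in units of M_h

# Crude grid scan from 0.001 to 10 GeV as an independent check of the peak
grid = [0.001 * i for i in range(1, 10001)]
qt_scan = max(grid, key=tree_level_quantity)

print(qt_peak, peak_value, qt_scan)
```

With these inputs the tree level quantity peaks near $Q_T \approx 1.4$ GeV at a value of about 1.9, to be set against the maximum of 0.34 near $Q_T \approx 3.5$ GeV quoted above for $\mathcal{A}(Q_T)$ at $Q = M_Z$. This is in line with the factor 5 used for display in Fig. 6 and with the shift of the peak to larger $Q_T$.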
We now turn to a more general discussion of the different types of transverse momentum
dependent azimuthal spin asymmetries, categorized by different transverse momentum
weights and angular dependences. The comparisons are most cleanly done for the
asymptotic $Q^2$ case.
5. Asymptotic behavior
In the previous sections we have studied asymmetries with different types of weights,
namely, $2\,\hat h \cdot p_T\, \hat h \cdot k_T - p_T \cdot k_T$ and $\hat h \cdot k_T$. The former (and also $p_T \cdot k_T$ in the
asymmetry $A_{TT}^V$ of Ref. [4]) is typical of a double transverse spin asymmetry. It has
two powers of transverse momentum and is therefore expected to decrease faster than
the (single spin) asymmetries which have weights with a single power. The reason for
this difference in decrease is that the asymmetries, which are convolutions of transverse
momentum distributions, are largest if these distributions are large and have a large
overlap. The distributions are largest at small $p_T$ and $k_T$ and the overlap is largest at
small transverse momentum $q_T$. However, the explicit powers of transverse momentum
in the weights tend to suppress the region of small transverse momentum. More powers
of transverse momentum in the weight thus imply more suppression. The power $n$ of
$b^n$ in the numerator compared to the denominator is a measure of this effect. A weight
proportional to $p_T^i k_T^j$ will lead to $n = 2$ (cf. Eq. (46)), whereas $p_T^i$ or $k_T^j$ will lead to $n = 1$
(cf. Eq. (60)). Larger $n$ means larger suppression, and since the transverse momentum
distributions broaden and decrease in magnitude with increasing energy, the convolution
also decreases. Asymmetries with $n = 1$ will fall off slower with energy and the Sudakov
suppression is less severe. This natural expectation is clearly observed in the numerical
analysis of the examples investigated here and in Ref. [4].
It is interesting to see what the asymptotic behavior of the Sudakov suppression is. As
is usually done, we will assume that at high values of $Q$ the nonperturbative factor is
irrelevant (true only asymptotically) and consider asymmetry factors of the form
\[
\mathcal{A}_{n,m}(Q_T) \equiv M^n\, \frac{\int_0^\infty db^2\, b^n J_m(bQ_T) \exp(-S(b))}{\int_0^\infty db^2\, J_0(bQ_T) \exp(-S(b))}. \qquad (63)
\]
The asymptotic behavior of the denominator at $Q_T = 0$ (relevant for the unpolarized
differential cross section) has been studied before [7,32] by means of a saddle point
approximation. Unfortunately, such a saddle point approximation cannot be applied to the
ratios $\mathcal{A}_{n,m}(Q_T)$ for $m \neq 0$ and $Q_T \neq 0$, since one cannot take $J_m(bQ_T) \approx J_m(b_{SP} Q_T)$
to deduce the $Q^2$ dependence. For instance, since $b_{SP} Q_T$ is in general not small
compared to $m$ (even under the assumption $Q_T^2 \ll Q^2$), no approximation to the Bessel functions
can be used in the region of interest (i.e., around the value of $Q_T$ where the asymmetry is
maximal).
In Ref. [4] the asymmetry factor $\mathcal{A}_{2,0}(Q_T)$ arises, which does allow for a straightforward
saddle point approximation at $Q_T = 0$, where the asymmetry is maximal. Therefore,
we simply assume that the weight $2\,\hat h \cdot p_T\, \hat h \cdot k_T - p_T \cdot k_T$ has similar behavior as
$p_T \cdot k_T$. From our numerical studies we conclude that the maximum of $\mathcal{A}_{2,2}(Q_T)$ actually
decreases faster with energy than the maximum of $\mathcal{A}_{2,0}(Q_T)$ (at $Q_T = 0$). Moreover, for
the asymptotic behavior of $\mathcal{A}_{n,0}(Q_T = 0)$ one can obtain an analytic expression. First we
would like to comment on the saddle point approximation of expressions of the form

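\[
\int db^2\, b^n \exp(-S(b)) = \int \frac{d\ln(b^2\Lambda^2)}{(\Lambda^2)^{1 + n/2}}\,
\exp\left[\left(1 + \frac{n}{2}\right) \ln(b^2\Lambda^2) - S(b)\right]. \qquad (64)
\]
As before we will keep only the leading term in the expansion of $A(\alpha_s(\mu))$ and $\alpha_s(\mu)/\pi =
1/(\beta_1 \ln(\mu^2/\Lambda^2))$:
\[
S(b, Q) = \int_{b_0^2/b^2}^{Q^2} \frac{d\mu^2}{\mu^2}\, \frac{C_F}{\beta_1 \ln(\mu^2/\Lambda^2)}\, \ln\frac{Q^2}{\mu^2}. \qquad (65)
\]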
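The integral Eq. (64) has a saddle point at
\[
b_{SP} = \frac{b_0}{\Lambda} \left(\frac{Q}{\Lambda}\right)^{-C_F/[C_F + (1 + \frac{n}{2})\beta_1]}. \qquad (66)
\]
For $n = 0$ we retrieve the power $-0.41$ of Refs. [7,32]. For general $n$ one finds in the saddle
point approximation (defining $\gamma_n = C_F/[C_F + (1 + \frac{n}{2})\beta_1]$)
\[
\int db^2\, b^n \exp(-S(b)) \overset{SP}{\approx} \left(\frac{b_0^2}{\Lambda^2}\right)^{(1 + \frac{n}{2})}
\left(\frac{Q^2}{\Lambda^2}\right)^{-(1 + \frac{n}{2})\gamma_n + \frac{C_F}{\beta_1}(1 - \gamma_n + \ln\gamma_n)}. \qquad (67)
\]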
In this way one finds for the numerator of $\mathcal{A}_{2,0}(Q_T = 0)$ an approximate power behavior
of $(Q^2)^{-0.94}$ and for the denominator $(Q^2)^{-0.62}$, when one takes $C_F = 4/3$ and $\beta_1 = 23/12$
(corresponding to 5 flavors, but 6 flavors only makes a few percent difference). The
approximated asymmetry factor $\mathcal{A}_{2,0}(Q_T = 0)$ then has an asymptotic power behavior of
$(Q^2)^{-0.32}$, which is a similar decrease as found numerically in Ref. [4] and apparently
holds also for very large values of $Q$. At lower energies, the saddle point is not very
pronounced and the introduction of the Gaussian smearing and $b_*$ will have an effect on
the approximation, hence we view this agreement as a coincidence. Of course, the saddle
point approximation should become better with increasing $Q^2$.
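The step from Eqs. (64) and (65) to the saddle point result can be made explicit as follows. This is a sketch in shorthand introduced here for convenience: $a = C_F/\beta_1$, $t_Q = \ln(Q^2/\Lambda^2)$, $t_b = \ln(b_0^2/(b^2\Lambda^2))$, and $\gamma_n$ for the ratio $C_F/[C_F + (1 + n/2)\beta_1]$.

```latex
S = a \int_{t_b}^{t_Q} dt\, \frac{t_Q - t}{t}
  = a \left[ t_Q \ln\frac{t_Q}{t_b} - t_Q + t_b \right],
\qquad
E(t_b) \equiv \left(1 + \tfrac{n}{2}\right)\left(\ln b_0^2 - t_b\right) - S ,
\\[4pt]
\frac{dE}{dt_b} = -\left(1 + \tfrac{n}{2}\right) - a\left(1 - \frac{t_Q}{t_b}\right) = 0
\;\;\Longrightarrow\;\; t_b = \gamma_n\, t_Q ,
\\[4pt]
E_{SP} = \left(1 + \tfrac{n}{2}\right)\ln b_0^2
       + t_Q \left[ -\left(1 + \tfrac{n}{2}\right)\gamma_n + a\left(1 - \gamma_n + \ln\gamma_n\right) \right]
       = \left(1 + \tfrac{n}{2}\right)\ln b_0^2 + a\, t_Q \ln\gamma_n .
```

The last equality uses $(1 + n/2)\gamma_n = a(1 - \gamma_n)$, which follows from the definition of $\gamma_n$. In this form the power of $Q^2$ is simply $(C_F/\beta_1)\ln\gamma_n$.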
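For the ratios $\mathcal{A}_{n,0}(Q_T = 0)$ one can also derive an analytic expression for the
asymptotic $Q^2$ dependence:
\[
\mathcal{A}_{n,0}(Q_T = 0) = M^n\, \frac{\int_0^\infty db^2\, b^n \exp(-S(b))}{\int_0^\infty db^2 \exp(-S(b))}
= \left(\frac{M^2 b_0^2}{\Lambda^2}\right)^{n/2} c_n \left(\frac{Q^2}{\Lambda^2}\right)^{\frac{C_F}{\beta_1} \ln c_n}, \qquad (68)
\]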
where $c_n = (1 + C_F/\beta_1)/(1 + n/2 + C_F/\beta_1)$. For $n = 2$ this yields $(Q^2)^{-0.32}$ for 5 flavors,
which shows that the saddle point approximation is an extremely good approximation
(the power of $Q^2$ differs by $O(10^{-9})$). One now expects the maximum value of
$\mathcal{A}_{1,1}(Q_T)$ (appearing in the Collins asymmetry) to fall off at least as fast as $\mathcal{A}_{1,0}(Q_T = 0)$
(for all $Q$), which asymptotically goes as $(Q^2)^{-0.18}$. We note that the introduction of
the Gaussian smearing and $b_*$ has a considerable effect on the approximation at lower
energies as our numerical studies demonstrate (in the studied range of 10-100 GeV
we found approximately $1/Q$ for $\max(\mathcal{A}_{2,2}(Q_T))$ and $1/\sqrt{Q}$ for $\max(\mathcal{A}_{1,1}(Q_T))$). Our
main conclusion is that for very high $Q$ transverse momentum dependent azimuthal spin
asymmetries will fall off as a fractional power of $1/Q$ and that the behavior (or an upper
bound to it) can approximately be found by looking at the power of $b$ in the integrand.
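The powers quoted in this section can be reproduced with a few lines of arithmetic. The sketch below assumes the one-loop coefficient $\beta_1 = (33 - 2 n_f)/12$, which gives the quoted $23/12$ for 5 flavors, and evaluates the exponents of Eqs. (67) and (68). The function names, and the shorthand `gamma_n` for $C_F/[C_F + (1 + n/2)\beta_1]$, are illustrative.

```python
import math

C_F = 4.0 / 3.0

def beta1(n_flavors):
    """One-loop coefficient, assumed here as (33 - 2 n_f) / 12."""
    return (33.0 - 2.0 * n_flavors) / 12.0

def gamma_n(n, n_flavors=5):
    """C_F / [C_F + (1 + n/2) beta_1]: the power in the saddle point position."""
    b1 = beta1(n_flavors)
    return C_F / (C_F + (1.0 + n / 2.0) * b1)

def saddle_power(n, n_flavors=5):
    """Power of Q^2 in the saddle point estimate of int db^2 b^n exp(-S)."""
    b1 = beta1(n_flavors)
    g = gamma_n(n, n_flavors)
    return -(1.0 + n / 2.0) * g + (C_F / b1) * (1.0 - g + math.log(g))

def exact_ratio_power(n, n_flavors=5):
    """Power of Q^2 in the analytic ratio: (C_F / beta_1) * ln(c_n)."""
    b1 = beta1(n_flavors)
    c_n = (1.0 + C_F / b1) / (1.0 + n / 2.0 + C_F / b1)
    return (C_F / b1) * math.log(c_n)

for n in (0, 1, 2):
    print(n, gamma_n(n), saddle_power(n), exact_ratio_power(n))
```

For 5 flavors this gives a saddle point power of about $-0.62$ for $n = 0$ and about $-0.94$ for $n = 2$. Their difference agrees with the analytic ratio power of about $-0.32$ to within rounding, and the $n = 1$ ratio power comes out close to $-0.18$.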
Finally, we want to mention that the increase of the Sudakov suppression with energy
provides a solution to the following problem. In general, azimuthal spin asymmetries in
collinear configurations, where partons are collinear to the parent hadrons, are suppressed
by explicit powers of the hard energy scales. The transverse momentum factors in the
weights have to be generated in the hard scattering part and for this one pays a price in
terms of inverse powers of the hard scale. The collinear case can be viewed as averaged
over the transverse momenta, relevant for q T -integrated cross sections, which receives
its main contribution from the region of small transverse momentum. On the other hand,
in the case of q T -dependent differential cross sections at small q T (compared to Q),
small parton transverse momentum needs to be included (one deals with nearly collinear
partons, therefore). In this case there need not be an explicit power suppression to generate
similar azimuthal spin asymmetries (the average transverse momentum is now a scale
in the problem, so dimensionless ratios using QT rather than Q can be formed). This
seems counterintuitive, since by considering a differential cross section for a nearly
collinear configuration, one can obtain information without power suppression, that is
power suppressed in the q T -integrated cross section, which receives its main contribution
from the region of small transverse momentum also (for an example, see Ref. [33]). But we
find that instead of explicit power suppression the presence of Sudakov factors gives rise
to partial power suppression. In this sense the power suppression is effectively replaced
by a Sudakov suppression. This means that the q T -dependent azimuthal spin asymmetries
vanish with increasing energy, as do their q T -integrated counterparts. Hence, departing
from the collinear configuration (that is, by including small transverse momenta) does
not allow one to measure azimuthal spin asymmetries in (differential) cross sections at
arbitrarily high energies.
6. Conclusions
In this article we have shown by quantitative examples how transverse momentum
dependent azimuthal spin asymmetries are suppressed by Sudakov factors, in the region
where the transverse momentum is much smaller than the large energy scale Q2 .
Physically, the Sudakov suppression stems from broadening of the transverse momentum
distribution due to recoil from soft gluon radiation and the suppression increases with
energy. This implies that tree level estimates of transverse momentum dependent azimuthal
spin asymmetries tend to overestimate the magnitudes and increasingly so with rising
energy.
The size and $Q^2$ dependence of the Sudakov suppression have been studied numerically
for two such asymmetries, both arising due to the Collins effect. The size of the suppression
(compared to tree level) depends considerably on the nonperturbative Sudakov factor that
must be determined from experiment; however, the $Q^2$ dependence of the suppression
turned out to be much less sensitive to the nonperturbative input.
We observe that in general the larger the power $n$ of the transverse momentum in
the weight of an asymmetry, the larger the suppression. For the Collins effect driven
$\cos(2\phi)$ asymmetry in electron-positron annihilation into two almost back-to-back pions
($n = 2$), the Sudakov suppression was numerically found to be approximately $1/Q$ for the
maximum of the asymmetry and an upper bound for the asymptotic behavior was found
to be $1/Q^{0.6}$. For the Collins effect single spin asymmetry in semi-inclusive deep inelastic
scattering ($n = 1$), the Sudakov suppression was numerically found to be approximately
$1/\sqrt{Q}$ for the maximum of the asymmetry and an upper bound for the asymptotic behavior
was found to be $1/Q^{0.4}$. For the maximum of the asymmetries of the type $\mathcal{A}_{n,0}(Q_T)$, the
asymptotic $Q^2$ dependence could be calculated analytically. This provides upper bounds
on the fall-off of $\mathcal{A}_{n,m}(Q_T)$, with $m \neq 0$.
Since our results depend on the input for the nonperturbative Sudakov factors $S_{NP}$,
which is not (well) determined for the above processes, the numerical conclusions about
the size and $Q^2$ dependence of the suppression should be viewed as generic, not as
precise predictions. Therefore, we would like to stress the need for an extraction of the
nonperturbative Sudakov factor from the process $e^+e^- \to A + B + X$ (for any two, almost
back-to-back hadrons $A$ and $B$) and from SIDIS. Considering the wealth of data from the
LEP and HERA experiments this should pose no problem.
The Sudakov suppression of the transverse momentum distribution of azimuthal spin
asymmetries is significant already for $Q$ values in the range of 10-100 GeV as can be
concluded from comparison to tree level. In single spin asymmetries the magnitude of the
suppression is less severe than in double spin asymmetries and in both cases a pronounced
shift of the average $Q_T$ to higher values is observed. We conclude that it is essential
to take into account Sudakov factors in transverse momentum dependent azimuthal spin
asymmetries.
Acknowledgements
I would like to thank John Collins, Anatoli Efremov, Zi-wei Lin, George Sterman and
Werner Vogelsang for helpful comments and discussions. Also, I thank Chris Dawson
for C++ help. Furthermore, I thank RIKEN, Brookhaven National Laboratory and the
U.S. Department of Energy (contract number DE-AC02-98CH10886) for providing the
facilities essential for the completion of this work.
References
[1] J.C. Collins, Nucl. Phys. B 396 (1993) 161.
[2] D. Boer, R. Jakob, P.J. Mulders, Nucl. Phys. B 504 (1997) 345.
[3] D. Boer, R. Jakob, P.J. Mulders, Phys. Lett. B 424 (1998) 143.
[4] D. Boer, Phys. Rev. D 62 (2000) 094029.
[5] J.C. Collins, D.E. Soper, Nucl. Phys. B 193 (1981) 381;
J.C. Collins, D.E. Soper, Nucl. Phys. B 213 (1983) 545, Erratum.
[6] J.C. Collins, D.E. Soper, Acta Phys. Pol. B 16 (1985) 1047.
[7] J.C. Collins, D.E. Soper, G. Sterman, Nucl. Phys. B 250 (1985) 199.
[8] R.L. Jaffe, hep-ph/9602236.
[9] J.C. Collins, D.E. Soper, G. Sterman, Phys. Lett. B 109 (1982) 388;
J.C. Collins, D.E. Soper, G. Sterman, Nucl. Phys. B 223 (1983) 381;
J.C. Collins, D.E. Soper, G. Sterman, Phys. Lett. B 126 (1983) 275.
[10] J.C. Collins, D.E. Soper, G. Sterman, Phys. Lett. B 134 (1984) 263.
[11] C. Davies, W.J. Stirling, Nucl. Phys. B 244 (1984) 337.
[12] A. Weber, Nucl. Phys. B 382 (1992) 63;
A. Weber, Nucl. Phys. B 403 (1993) 545.
[13] S. Frixione, P. Nason, G. Ridolfi, Nucl. Phys. B 542 (1999) 311.
[14] D. Boer, P.J. Mulders, Nucl. Phys. B 569 (2000) 505.
[15] J.P. Ralston, D.E. Soper, Nucl. Phys. B 152 (1979) 109.
[16] R.D. Tangerman, P.J. Mulders, Phys. Rev. D 51 (1995) 3357, hep-ph/9408305.
[17] J.C. Collins, Nucl. Phys. B 394 (1993) 169.
[18] J.C. Collins, D.E. Soper, Nucl. Phys. B 194 (1982) 445.
[19] P.J. Mulders, R.D. Tangerman, Nucl. Phys. B 461 (1996) 197;
P.J. Mulders, R.D. Tangerman, Nucl. Phys. B 484 (1997) 538, Erratum.
[20] J.C. Collins, in: A.H. Mueller (Ed.), Perturbative Quantum Chromodynamics, World Scientific,
Singapore, 1989, p. 573.
[21] M. Boglione, E. Leader, Phys. Rev. D 61 (2000) 114001.
[22] D. Boer, Phys. Rev. D 60 (1999) 014012.
[23] G.A. Ladinsky, C.-P. Yuan, Phys. Rev. D 50 (1994) R4239.
[24] J.C. Collins, D.E. Soper, Phys. Rev. Lett. 48 (1982) 655;
J.C. Collins, D.E. Soper, Nucl. Phys. B 284 (1987) 253.
[25] R. Meng, F.I. Olness, D.E. Soper, Phys. Rev. D 54 (1996) 1919.
[26] P. Nadolsky, D.R. Stump, C.-P. Yuan, Phys. Rev. D 61 (2000) 014003.
[27] F. Landry, R. Brock, G. Ladinsky, C.-P. Yuan, Phys. Rev. D 63 (2001) 013004.
[28] A.V. Efremov, O.G. Smirnova, L.G. Tkachev, Nucl. Phys. B Proc. Suppl. 79 (1999) 554;
A.V. Efremov, O.G. Smirnova, L.G. Tkachev, Nucl. Phys. B Proc. Suppl. 74 (1999) 49.
[29] A. Bravar, for the SMC Collaboration, Nucl. Phys. B Proc. Suppl. 79 (1999) 520.
[30] A. Airapetian et al., HERMES Collaboration, Phys. Rev. Lett. 84 (2000) 4047.
[31] D. Boer, R. Jakob, P.J. Mulders, Nucl. Phys. B 564 (2000) 471.
[32] G. Parisi, R. Petronzio, Nucl. Phys. B 154 (1979) 427.
[33] D. Boer, hep-ph/9912311.

Virtual photon impact factors with exact gluon kinematics
A. Bialas a,b , H. Navelet c , R. Peschanski c
a M. Smoluchowski Institute of Physics, Jagellonian University, Reymonta 4, 30-059 Cracow, Poland
b Institute of Nuclear Physics, Cracow, Poland
c Service Physique Theorique, CE Saclay, SPhT CEA-Saclay F-91191 Gif-sur-Yvette Cedex, France
Received 18 January 2001; accepted 28 March 2001
Abstract
An explicit analytic formula for the transverse and longitudinal impact factors $S_{T,L}(N, \gamma)$ of the
photon using $k_T$ factorization with exact gluon kinematics is given. Applications to the QCD dipole
model and the extraction of the unintegrated gluon structure function from data are proposed. © 2001
Elsevier Science B.V. All rights reserved.
1. Introduction
In the present knowledge on perturbative QCD resummations at leading level in
logarithms of the energy (i.e., $\log 1/x_{Bj}$) the coupling of external sources, in particular
a virtual gluon in deep-inelastic reactions, is based on the theorem of $k_T$ factorization [1],
proven in the leading logarithmic approximation of QCD [1]. The theorem states that the
unintegrated gluon distribution, i.e., the distribution of energy and transverse momentum
of gluons in the target, factorizes from the rest of the process. The remaining factor is
the so-called impact factor. Consequently, this impact factor is a universal quantity,
the same in all processes initiated by the same external source, e.g., the photon. The
unintegrated gluon distribution will depend on the target, but again the target impact
factor can also be factorized out, leaving a universal interaction term, given by
the Balitsky, Fadin, Kuraev, Lipatov (BFKL) Pomeron [2]. At next-to-leading level, the
modified interaction term is now known [3], but the impact factors have not yet been determined
at this order of perturbation theory [4]. It is expected that the effect of the exact kinematics of the
exchanged gluons, which is our subject, gives the main contribution to these higher order
terms.
E-mail addresses: bialas@th.if.uj.edu.pl (A. Bialas), navelet@spht.saclay.cea.fr (H. Navelet),
pesch@spht.saclay.cea.fr (R. Peschanski).
PII: S 0 5 5 0 - 3 2 1 3 ( 0 1 ) 0 0 1 5 3 - 5
A. Bialas et al. / Nuclear Physics B 603 (2001) 218-230
The $k_T$ factorized impact factors can be conveniently expressed in terms of two Mellin
variables: $\gamma$, conjugated to transverse momentum squared, and $N$, conjugated to energy.
Up to now, only the impact factors at $N = 0$ were considered, since, strictly speaking, $k_T$
factorization has been proven only at infinite energy, implying $N = 0$. If, however, the
validity of $k_T$ factorization is extended to the case of exact gluon kinematics, it implies the
knowledge of the combined $\gamma$, $N$ dependence of impact factors. To our knowledge, this
combined dependence has been derived only for real photoproduction of heavy flavors [5].
It is the purpose of the present paper to give an explicit analytic expression for the $\gamma$, $N$
dependence of virtual photon impact factors. To this end, using the $k_T$ factorization and
exact gluon kinematics, we derive the explicit formulae for the total cross section of
longitudinal and transverse photons on any target with a given distribution of gluons.
In Section 2, the formulae for the (virtual) photon impact factors following from
kT factorization are given. The details of the calculation, implying multidimensional
integration and resummation of generalized hypergeometric functions are presented in
Section 3 for the longitudinal case and in Section 4 for the transverse one. Applications
of our results are given in Subsection 5.1 for the QCD dipole model [6] which turned out
to be rather successful in description of the deep inelastic total and diffractive cross-section
of the virtual photons [7]. In Subsection 5.2 we suggest a model independent method of
extraction from data of the unintegrated gluon structure function. Our conclusions are given
in the last section.
2. An explicit formula for the total cross-sections using kT factorization

Our starting point is the formula for the total longitudinal and transverse photon cross-sections given in [8]:
4 2
L
FL = 4s Q2
Q2
1
0
2
dk

2
dz z(1 z)
k4
1
1
d p 2
2

2
p +Q
(p k)2 + Q
2
2

g xg , k 2 .
(1)
Using

g xg , k

=
dN
(xg )N
2i
d 2
k gN ( ),
2i
(2)
we write

L =
with
dN
(xBj )N
2i
1
d
gN ( )SL (N, ) Q2
,
2i
(3)
1
SL (N, ) = 4s

2
dz z(1 z)

dk 2
k2
Q2
2
2
1
1
d p 2
2 (p k)
2 +Q
2
p +Q
2 )N
(Q
,
2 +Q
2 + k 2 ]N
[(p (1 z)k)
2
(4)
where we have used the relation

xg = xBj
2 + k 2
(p (1 z)k )2 + Q
,
2
Q
2 = z(1 z)Q2 .
Q
(5)
Similarly,
4 2
FT
Q2
1

= s dz z2 + (1 z)2
T =
dk 2
k4
2

p
p k
d p 2
g xg , k 2
2
2
2

p +Q
(p k) + Q
and using (2) and (5) we have

1
dN
d
(xBj )N
gN ( )ST (N, ) Q2
T
,
2i
2i
(6)
(7)
with
1
ST (N, ) = s
0

dk 2 k 2 1
k 2 Q2

2

p
p k
2
d p 2
2 (p k)
2 +Q
2
p +Q
2 )N
(Q
.
2 +Q
2 + k 2 ]N
[(p (1 z)k)

dz z2 + (1 z)2
(8)
It turns out that the integrals (4) and (8) can be explicitly performed and expressed in terms
of digamma functions. The details of the calculation are given in Sections 3 and 4. Here
we only quote the final formulae:
( + + 1) ( + 1)
(N)

1
( + ) ( ) 3N 2 ( 2 1)
3
,
2
( 1)( 2 4)
2N
SL (N, ) = 8s
ST (N, ) = 2s
1
( + ) ( )
2
(N)
( 1)( 2 4)
(9)
( + ) ( ) N 2 (3(N + 1)2 + 9) 2N( 2 1) + ( 2 1)( 2 9)
4N

1
(10)
3(N + 1)2 + 9 + 2 1 ,
2
where we adopted the convenient notation N 2 + 1.

Note that the poles at = 0, 1, 2 in formulae (9), (10) are actually absent due
to zeroes in the numerators, as it should from regularity of the generalized (Meijer)
hypergeometric functions appearing in the derivation, see later. This provides numerous
and non trivial checks of the resummations leading to (9), (10).
3. Longitudinal photon impact factor

As the first step in the calculation we observe that, using the symmetry of the integrand
with respect to interchange of z and 1 z, the integrals in (4) can be written as a sum of
two terms:
2
SL (N, ) = 8s Q2
(11)
(A B),
where
1
A=
0
1
B=
0

2 dk 2 2
k
dz z(1 z)
k4

2 )N
1
(Q
d 2p 2
,
2 )2 [(p (1 z)k)
2+Q
2 + k 2 ]N
(p + Q
(12)

2 dk 2 2
k
dz z(1 z)
k4

2 )N
1
(Q
d 2p 2
. (13)
2 ][(p k)2 + Q
2 ] [(p (1 z)k)
2+Q
2 + k 2 ]N
[p + Q
Using several times the identity
\[
\frac{1}{C^M} = \frac{1}{\Gamma(M)} \int_0^\infty dt\, t^{M-1} e^{-tC}, \qquad (14)
\]
we transform (12) and (13) into

1
A=
(N)
1
0

2 2 N

dz z(1 z) Q
dk 2 2
k
k4

v dv
dt t N1

2 t p (1 z)k 2 + Q
2 + k 2 ,
d 2 p exp v p2 + Q
(15)
1
B=
(N)
1
0

2 2 N
dk 2 2 1

k
dz z(1 z) Q
dt t N1
dv
dv
k4
(N)

2 v (p k)2 + Q
2
d 2 p exp v p2 + Q

2

2 + k 2 .
(16)
t p (1 z)k + Q
Thus the integration over d 2 p reduces to a Gaussian form and can be easily performed.
After rescaling the variables v = ty, v = ty and substitution u = tk 2 the integrals over
du and dt factorize from the rest and can be explicitly evaluated. The final result of these
operations reads
A=
( 1) (N + 2) 2 2
Q
(N)

1
y dy
dz z (1 z)
(y + z)1 ,
(1 + y)N2 +4
(17)
B=
(N + 2) ( 1) 2 2
Q
(N)
1

dy dy [(y + z)(y + 1 z)]1
dz z(1 z)
.
(1 + y + y )N2 +4
(18)
To evaluate integrals in (17) one takes $h = 1 + y$, $y + z = h[1 - (1 - z)/h]$. Using the
formula for the Gauss series
\[
(1 - x)^{-a} = \sum_{n=0}^{\infty} \frac{\Gamma(a + n)}{\Gamma(a)}\, \frac{x^n}{n!} \qquad (19)
\]
and picking the factor ( 1) in formula (18), the integral over dv leads to

y dy
( 1)
(y + z)1
(1 + y)N2 +4

( 1 + n)
(1 z)n
=
(N + 2 + n)(N + 1 + n) n!
2 G1 ( 1, N + 1; N + 3; 1 z),
(20)
where the Meijer function ${}_2G_1(u, v; w; t) \equiv \frac{\Gamma(u)\Gamma(v)}{\Gamma(w)}\, {}_2F_1(u, v; w; t)$.
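The two elementary identities used in this derivation, the exponential representation (14) and the Gauss series (19), are easy to check numerically. The sketch below is only a sanity check for sample values; the function names, the trapezoidal rule and the chosen test values $C = 1.3$, $M = 2.5$, $x = 0.3$, $a = 1.7$ are illustrative assumptions, not part of the calculation in the paper.

```python
import math

def schwinger_rhs(c, m, steps=200000, t_max_over_c=60.0):
    """Right-hand side of Eq. (14): (1/Gamma(M)) int_0^inf dt t^(M-1) exp(-t C).

    Composite trapezoidal rule on [0, t_max]; assumes M >= 1 and C > 0, so the
    integrand is finite at t = 0 and negligible beyond t_max = 60 / C.
    """
    t_max = t_max_over_c / c
    h = t_max / steps

    def integrand(t):
        return t ** (m - 1.0) * math.exp(-t * c)

    total = 0.5 * (integrand(0.0) + integrand(t_max))
    for i in range(1, steps):
        total += integrand(i * h)
    return total * h / math.gamma(m)

def gauss_series(x, a, terms=400):
    """Partial sum of Eq. (19): sum_n Gamma(a+n)/Gamma(a) x^n / n!, for |x| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # ratio of consecutive terms: (a + n) x / (n + 1)
        term *= (a + n) * x / (n + 1)
    return total

print(schwinger_rhs(1.3, 2.5), 1.3 ** (-2.5))
print(gauss_series(0.3, 1.7), (1 - 0.3) ** (-1.7))
```

Each printed pair should agree to the accuracy of the quadrature or of the truncated series.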

Substituting this into (17) we can integrate over z to obtain
A=
(N + 2) ( + 1) 2 2
Q
TA ,
(N)
(21)
where
TA = 3 G2 (2, 1, N + 1; + 3, N + 3; 1).
(22)
To calculate the integrals over dy and dy in (18) we first rescale the variables y, y
zy, (1 z)y . Then we introduce h = 1 + y, h = 1 + y and, finally = h/ h . The last
change implies 1 + y + y 1 + zy + (1 z)y = h [1 z(1 )]. Using again (19) the

integral over dy dy can be transformed into the series

dy dy [(y + z)(y + 1 z)]1
(1 + y + y )N2 +4

2 (N + 2 )
(N 2 + 4 + n) n
z + (1 z)n
= z(1 z)
N (N 2 + 4)
(N + 3 + n)
n=0
2 (N + 2 )
z(1 z)
N (N 2 + 4)

2 G1 (1, N 2 + 4; N + 3; z) + (z 1 z) .
(23)
Using known relations [9] on 2 F1 functions, one writes also

dy dy [(y + z)(y + 1 z)]1
(1 + y + y )N2 +4

1
= z2
2 G1 ( 1, N + 2; N + 3; z) + (z 1 z) . (24)
N ( 1)
Introducing this into (18) we perform integration over z to obtain finally
B
2 (N + 2) ( + 1) 2 2
Q
TB ,
(N)N
(25)
where
2
TB = Q2
3 G2 (3, 1, N + 2; + 4, N + 3; 1).
(26)
It turns out that the series in (22) and (26) can be summed up and expressed in terms
of the digamma functions. One uses known relations on Meijer functions at z = 1, see
Ref. [10]. One writes
TA 3 G2 (2, a, b; a + 4, b + 2; 1)
= 3 G2 (2, a, b; a + 4, b + 1; 1) 3 G2 (2, a, b + 1; a + 4, b + 2; 1)
1
= (b 1) 3 G2 (1, a, b 1; a + 3, b + 1; 1)
3

b 3 G2 (1, a, b; a + 3, b + 2; 1)
(27)
and
TB 3 G2 (3, a, b + 1; a + 5, b + 2; 1)
b(b 1)
=
3 G2 (1, a, b 1; a + 3, b + 2; 1),
6
using the abbreviations
a = 1,
b = N + 1 .
(28)
(29)
Using the generic formula for hypergeometric functions 3 F2 (1, a, b; a + p + 1,

b + q + 1; 1) [11], one writes:
3 G2 (1, a, b; a + p + 1, b + q
+ 1; 1)

(b a p)
(p
+
q
+
1)
(a) (b)
(1)p+1
(p + 1) (q + 1) (b a + q + 1)

p1
1
(1 + q + k) (1 a p + k)
(b a p) 1
+ (1)p
(1 a) (q + 1)
k p (k + 1) (b a + q p + k + 1)
k=0

+ {a b, p q} .
(30)
All in all, using for convenience the notations b a 1 and N b + a, we get

1
2TB
(a + 1 + ) (a + 1) 3N 2 ( 2 1)
TA
= 2
3
.
N
( 1)( 2 4)
2N
(31)
4. Transverse photon impact factor
T can be evaluated along the similar lines as L . Here we mark only the main
differences. The key point is that T can be evaluated using L , which gives a noticeable
simplification of the painful calculation.
Let us come back to the calculation of L , starting with formula (11) and define
(DL) Q2 A,
(NDL) Q2 B.
(32)
Using (8) we write

1
ST ( , N) = 2s (DT ) (NDT ) Q2
,
(33)
with
1

dz z2 + (1 z)2
(DT ) =
0
d 2p
1
(NDT ) =
0
(p2
2
dk 2 k 2
2 )N
(Q
(p)
2
,
2
2
) [(p (1 z)k)
2+Q
2 + k 2 ]N
+Q

dz z2 + (1 z)2
(34)
2
dk 2 k 2

2 )N
p(
p k)
(Q
.
2 ][(p k)2 + Q
2 ] [(p (1 z)k)
2+Q
2 + k 2 ]N
[p2 + Q
(35)
Let us rewrite the quantities of interest in the following form:
d 2p
1
(DL)
1
dz AL (z)I (z),
(NDL)
dz AL (z)J (z),
0
1
(DT )

dz AT (z) K(z) I (z) ,
1
(NDT )

1
dz AT (z) K(z) J (z) L(z) ,
2
(36)
where
$A_L(z) = z(1 - z)$,
$A_T(z) = z^2 + (1 - z)^2 = 1 - 2A_L(z)$.
The integrals I, J, K, L are defined as follows:

2 )N
2
dk 2 2
(Q
Q
I (z) =
k
,
d 2p 2
4
2
2
) [(p (1 z)k)
2+Q
2 + k 2 ]N
k
(p + Q

J (z) =

K(z) =

L(z) =
dk 2 2
k
k4
(38)
d 2p
dk 2 2
k
k4
dk 2 2
k
k4
2
Q
2 ][(p k)2 + Q
2]
[p2 + Q
2 )N
(Q
,
2+Q
2 + k 2 ]N
[(p (1 z)k)
(37)
d 2p
2 )N
1
(Q
,
2 ) [(p (1 z)k)
2+Q
2 + k 2 ]N
(p2 + Q

d 2p
(39)
(40)
k2
[p2
2 ][(p k)2
+Q
2]
+Q
2 )N
(Q
.
2+Q
2 + k 2 ]N
[(p (1 z)k)
(41)
Consequently, one reads for L :

1
(DL) (NDL) =

AL (z) I (z) J (z) .
(42)
The corresponding expression for T shows a nice simplification, since the integrand K(z)
cancels from the calculation:
1
(DT ) (NDT ) =

1
AT (z) J (z) I (z) + L(z) .
2
(43)
Using (37), we get

(DT ) (NDT ) = 2 (DL) (NDL)
1
+
0

1
dz J (z) I (z) +
2
1
0

dz 1 2z(1 z) L(z).
(44)
From formula (20), it is straightforward to obtain (in the a, b notations):

I (z) = Q2a
(b + 1) a
z 2 G1 (a, b; b + 2; 1 z),
(b + a)
(45)
where ${}_2G_1(a, b; b + 2; 1 - z) \equiv \frac{\Gamma(a)\Gamma(b)}{\Gamma(b + 2)}\, {}_2F_1(a, b; b + 2; 1 - z)$.
After some straightforward transformations from formula (25), one writes
(b + 1)
z(1 z)
J (z) = Q2a
(b + a + 1)
a1

z
2 G1 (a, b + 1; b + 2; 1 z) + (z 1 z) ,
L(z) = Q2a
a

(b)
z 2 G1 (a + 1, b; b + 1; 1 z) + (z 1 z) ,
(b + a + 1)
(46)
(47)
where L(z) is obtained from J (z) by suppressing the factor z(1 z) and replacing a
2 for J (z) in formula (41).
a + 1, b b 1, due to the numerator k 2 instead of Q
Inserting expressions (45)(47) in formula (44), one gets after integration over z:

(DT ) (NDT ) = 2 (DL) (NDL)
(b) (a + 1) 2a
Q
+
(a + b + 1)

3 G2 (1, a + 1, b; a + 2, b + 1; 1) ba 3 G2 (1, a, b; a + 2, b + 2; 1)

(b 1)(a + 1) 3 G2 (1, a + 1, b 1; a + 3, b + 1; 1) .
(48)
All in all, using for convenience the notations b a and N b + a, one finally obtains
(DT ) (NDT ) =
(b) (a + 1) 2a
1
Q
2
(a + b)
( 1)( 2 4)

(a + 1 + ) (a + 1)
N 2 (3(N + 1)2 + 9) 2N( 2 1) + ( 2 1)( 2 9)
4N

1
(49)
3(N + 1)2 + 9 + 2 1 .
2
5. Applications
5.1. Comparison with the dipole model
In a recent paper [12] we have shown that the dipole model is, strictly speaking, not
compatible with the formulae obtained from the kT factorization including the exact
kinematics of the corresponding Feynman diagrams. The point is that the results from
kT factorization are non-diagonal in impact parameter space, contrary to the fundamental
assumption of the dipole model. Thus, it is worthwhile to derive the explicit modifications
of the model due to the gluon kinematics.
The formulae of the previous section can now be compared with those obtained in the
QCD dipole model, which amounts to considering [7] the impact factors at N = 0:
dip
SL ( ) SL (N = 0, ) =
dip
ST ( ) ST (N = 0, ) =
2 s (1 ) 2 (1 ) 2 ( )
,
3
1 23 ( 32 ) ( 32 + )
(50)
2 s (1 + )(1 12 ) 2 (1 ) 2 ( )
.
3
1 23
( 32 ) ( 32 + )
(51)
The knowledge of $S_{L,T}(N, \gamma)$ allows one to take into account the modifications of the
QCD dipole model due to exact gluon kinematics. Indeed, let us insert the BFKL pole in
the formula for the unintegrated gluon structure function
v(
)w( ) 2
Q
, ( ) 2(1) ( ) (1 ),
gN ( ) =
(52)
N (
) 0
where $\bar\alpha = \alpha_s N_c/\pi$ is the effective value of the strong coupling constant in the BFKL
kernel, $w(\gamma)$ is the Mellin transform of the probability to find a dipole in the target, $Q_0$
sets the typical scale of the unintegrated gluon structure function $g(x_g, k^2)$ (see
formula (2)) and
(1 )
(53)
22 (1 + )
is the Mellin transform of the probability to find a gluon in a dipole. In its final form, the
modified QCD dipole model for the longitudinal (3) and transverse (7) cross sections with full
kinematics reads:

Q2
d
2
(
)
(xBj )
L = Q
(54)
),
v(
)w( )SL N = (
,
2i
Q20
v(
)=
T = Q2

Q2
d
)
(xBj )(
),
v(
)w( )ST N = (
.
2i
Q20
(55)
Note that in these formulae, one has N 2 + 1 = (
) 2 + 1.
Formulae (54) and (55) give the explicit dependence of the cross-sections on the shift of
the hard pomeron intercept.
We have compared numerical results (in the saddle-point approximation) from
Eqs. (9) and (10) with those obtained from Eqs. (50) and (51) in the range 0.01 >
x > 0.0001 and 20 GeV$^2$ < $Q^2$ < 160 GeV$^2$. They give a rather similar
dependence on both x and $Q^2$ (the deviations do not exceed 5%). The normalization changes,
however: the cross-sections including the full gluon kinematics are smaller by about a factor of 2
than the ones obtained from the high-energy approximation. Note the interesting
fact that the ratio $R = \sigma_L/\sigma_T$ is affected: it increases by about 15%.
It is not very surprising that the dependence on kinematic variables does not differ substantially
between the two approaches. Indeed, this dependence is mostly controlled by the position
of the saddle point, which is the same in the two formulae. The rather important change
in normalization, however, could not have been easily guessed. In particular, the ratio of
normalizations in $R = \sigma_L/\sigma_T$ is a quantity of experimental interest.
5.2. Method of extraction of the unintegrated gluon distribution

The knowledge of the impact factors $S_L(N, \gamma)$ and $S_T(N, \gamma)$ as explicit analytic
functions of their two variables allows a model-independent determination of the
unintegrated gluon structure function from $\sigma_L$ or $\sigma_T$ or from the experimental observable
2
F2 4Q2 (T + L ).
Indeed, formulae (3), (7) yield

dxBj
(xBj )N dQ2 Q2
T ,L xBj , Q2
T ,L (N, )
xBj
= gN ( )ST ,L (N, ).
(56)
Hence, from formula (2) defining the unintegrated gluon structure function, one gets

d 2 T ,L (N, )
dN
(xg )N
.
g xg , k 2 =
(57)
k
2i
2i
ST ,L (N, )
This relation (57) has quite interesting features.
First, it allows a determination of the unintegrated gluon structure function from
experiments. For instance, one may consider the ratio ( T + L )/(ST + SL ) using data
on F2 .
Second, the expected universality properties of g(xg , k 2 ) from kT factorization gives
predictions for various processes, e.g., the ratio R = FL /FT , the photoproduction and
leptoproduction of heavy flavors, diffractive leptoproduction of vector mesons and any
other process where kT factorization applies.
Third, the applicability of this relation goes beyond a specific model like the QCD dipole
one, provided the coupling to the virtual photon comes from the exchange of two off-mass
shell gluons with exact kinematics.
These results come from the assumption that kT factorization is a valid approximation,
e.g., the dominance of the two considered Feynman diagrams for the impact factors, when
exact gluon kinematics is taken into account. It remains to be seen to what extent higher-order
QCD contributions and non-perturbative corrections may spoil this approximation.
6. Conclusions
Our formulae (9), (10) give the exact two-variable dependence of the longitudinal and
transverse impact factors of the photon in terms of the usual Mellin variables N, $\gamma$. N is
the variable conjugated to $x_{Bj}$, while $\gamma$ is conjugated to $Q^2$. In the particular framework
of the QCD dipole model, they give the modification taking into account the shift in $N =
\bar\alpha\chi(\gamma)$ of the BFKL singularity. This results mostly in a change of the normalization of the
cross-section, whereas the dependence on kinematic variables predicted by the QCD dipole
model is hardly affected. Note that the relative normalization in $R = F_L/F_T$ is increased
by 15%.
More generally, our result opens the way to a model-independent extraction of the
universal unintegrated gluon structure function which appears in various processes,
whenever kT factorization can be applied.
Acknowledgements
A.B. thanks the Service de Physique Théorique of Saclay for support and kind
hospitality. This investigation was supported in part by the KBN Grant No 2 P03B 086
14 and by the Subsidium of the Foundation for Polish Science 1/99.
References
[1] S. Catani, M. Ciafaloni, F. Hautmann, Nucl. Phys. B 366 (1991) 135;
S. Catani, M. Ciafaloni, F. Hautmann, Nucl. Phys. B 29A (1992) 182;
J.C. Collins, R.K. Ellis, Nucl. Phys. B 360 (1991) 3;
E.M. Levin, M.G. Ryskin, Yu.M. Shabelski, A.G. Shuvaev, Sov. J. Nucl. Phys. 53 (1991) 657.
[2] L.N. Lipatov, Sov. J. Nucl. Phys. 23 (1976) 642;
V.S. Fadin, E.A. Kuraev, L.N. Lipatov, Phys. Lett. B 60 (1975) 50;
E.A. Kuraev, L.N. Lipatov, V.S. Fadin, Sov. Phys. JETP 44 (1976) 45;
E.A. Kuraev, L.N. Lipatov, V.S. Fadin, Sov. Phys. JETP 45 (1977) 199;
I.I. Balitsky, L.N. Lipatov, Sov. J. Nucl. Phys. 28 (1978) 822;
L.N. Lipatov, Zh. Eksp. Teor. Fiz. 90 (1986) 1536, Sov. Phys. JETP 63 (1986) 904, English
translation.
[3] V.S. Fadin, L.N. Lipatov, Phys. Lett. B 429 (1998) 127;
M. Ciafaloni, Phys. Lett. B 429 (1998) 363;
M. Ciafaloni, G. Camici, Phys. Lett. B 430 (1998) 349.
[4] J. Bartels, S. Gieseke, C.-F. Qiao, Phys. Rev. D 63 (2001) 056014;
V. Fadin, D. Ivanov, M. Kotsky, hep-ph/0007119;
V. Fadin, J. Bartels, private communications.
[5] S. Catani, M. Ciafaloni, F. Hautmann, Nucl. Phys. Proc. Suppl. A 29 (1992) 182;
We were informed by M. Ciafaloni, that the virtual photon impact factors might have been
discussed (but not published) some time ago by F. Hautmann.
[6] L.L. Frankfurt, M.I. Strikman, Phys. Rep. 160 (1988) 235;
A.H. Mueller, Nucl. Phys. B 335 (1990) 115;
A.H. Mueller, Nucl. Phys. B 415 (1994) 373;
N.N. Nikolaev, B.G. Zakharov, Z. Phys. C 49 (1991) 607;
N.N. Nikolaev, B.G. Zakharov, Phys. Lett. B 332 (1994) 184;
A.H. Mueller, B. Patel, Nucl. Phys. B 425 (1994) 471.
[7] H. Navelet, R. Peschanski, C. Royon, Phys. Lett. B 366 (1996) 329;
H. Navelet, R. Peschanski, C. Royon, S. Wallon, Phys. Lett. B 385 (1996) 357;
A. Bialas, R. Peschanski, C. Royon, Phys. Rev. D 57 (1998) 6899;
S. Munier, R. Peschanski, C. Royon, Nucl. Phys. B 534 (1998) 297.
[8] J. Kwiecinski, A. Martin, A. Stasto, Phys. Rev. D 56 (1997) 3991.
[9] I.S. Gradstein, I.M. Ryzhik, Table of Integrals, Series and Products, Academic Press, 1980.
[10] A. Prudnikov, Y. Brychkov, O. Marichev, Integrals and Series, Vol. 3, Gordon and Breach,
1986.
[11] S. Munier, H. Navelet, Eur. Phys. J. C 13 (2000) 651, Note a misprint in formula (A. 38): instead
of sinb , read: sinb (1)p .
[12] A. Bialas, H. Navelet, R. Peschanski, Nucl. Phys. B 593 (2001) 438.

Determination of the string scale in D-brane scenarios and dark matter implications
D.G. Cerdeño a,b , E. Gabrielli c , S. Khalil d,e , C. Muñoz a,b ,
E. Torrente-Lujan a
a Departamento de Física Teórica C-XI, Universidad Autónoma de Madrid, Cantoblanco, 28049 Madrid, Spain
b Instituto de Física Teórica C-XVI, Universidad Autónoma de Madrid, Cantoblanco, 28049 Madrid, Spain
c Institute of Physics, University of Helsinki, P.O. Box 9, Siltavuorenpenger 20 C, FIN-00014 Helsinki, Finland
d Centre for Theoretical Physics, University of Sussex, BN1 9QJ Brighton, UK
e Ain Shams University, Faculty of Science, 11566 Cairo, Egypt
Abstract
We analyze different phenomenological aspects of D-brane constructions. First, we obtain that
scenarios with the gauge group and particle content of the supersymmetric standard model lead
naturally to intermediate values for the string scale, in order to reproduce the value of gauge couplings
deduced from experiments. Second, the soft terms, which turn out to be generically non-universal,
and Yukawa couplings of these scenarios are studied in detail. Finally, using these soft terms and the
string scale as the initial scale for their running, we compute the neutralino–nucleon cross section.
In particular we find regions in the parameter space of D-brane scenarios with cross sections in the
range of $10^{-6}$–$10^{-5}$ pb, i.e., where current dark matter experiments are sensitive. For instance, this
can be obtained for $\tan\beta > 5$. © 2001 Elsevier Science B.V. All rights reserved.
PACS: 11.25.Mj; 12.10.Kt; 95.35.+d; 04.65.+e
Keywords: D-branes; String scale; Soft terms; Dark matter
1. Introduction
Although the standard model provides a correct description of the observable world,
there are strong indications that it is just a low-energy effective theory of
some more fundamental one. The only candidates for such a theory are, at present, the string
theories, which have the potential to unify the strong and electroweak interactions with
gravitation in a consistent way.
In the late eighties, working in the context of the perturbative heterotic string, a
number of interesting four-dimensional vacua with particle content not far from that of the
supersymmetric standard model were found [1]. Supersymmetry breaking was most of the
time assumed to take place non-perturbatively by gaugino condensation in a hidden sector
of the theory. Until recently, it was thought that this was the only way to construct
realistic string models. However, in the late nineties, it was discovered that explicit
models with realistic properties can also be constructed using D-brane configurations from
type I string vacua [2–8]. Besides, it has been realized that the string scale, $M_I$, may be
anywhere between the weak scale, $M_W$, and the Planck scale, $M_{Planck}$ [3,4,9–12].
This is to be compared to the perturbative heterotic string, where the relation
$M_I = \sqrt{\alpha/8}\, M_{Planck}$, with $\alpha$ the gauge coupling, fixes the value of the string scale.
E-mail address: carlos.munnoz@uam.es (C. Muñoz).
PII: S0550-3213(01)00159-6
D.G. Cerdeño et al. / Nuclear Physics B 603 (2001) 231–258
The freedom to play with the value of $M_I$ is particularly interesting since there are
several arguments in favour of supersymmetric scenarios with scales $M_I \approx 10^{10}$–$10^{14}$ GeV.
First, these scales were suggested in [11] to explain many experimental observations such as
neutrino masses or the scale for axion physics. Second, with the string scale of order
$10^{10}$–$10^{12}$ GeV one is able to attack the hierarchy problem of unified theories [12]. The
mechanism is the following. In supergravity models supersymmetry can be spontaneously
broken in a hidden sector of the theory and the gravitino mass, which sets the overall scale
of the soft terms, is given by:
\[
m_{3/2} \approx \frac{F}{M_{Planck}}, \tag{1}
\]
where F is the auxiliary field whose vacuum expectation value breaks supersymmetry.
Since in supergravity one would expect $F \approx M_{Planck}^2$, one obtains $m_{3/2} \approx M_{Planck}$ and
therefore the hierarchy problem solved in principle by supersymmetry would be reintroduced,
unless non-perturbative effects such as gaugino condensation produce $F \approx
M_W M_{Planck}$. However, if the scale of the fundamental theory is $M_I \approx 10^{10}$–$10^{12}$ GeV
instead of $M_{Planck}$, then $F \approx M_I^2$ and one gets $m_{3/2} \approx M_W$ in a natural way, without
invoking any hierarchically suppressed non-perturbative effect. Third, for intermediate
scale scenarios charge and color breaking constraints become less important. Let us recall
that charge and color breaking minima in supersymmetric theories might make the standard
vacuum unstable. Imposing that the standard vacuum be the global minimum, the
corresponding constraints turn out to be very strong and, in fact, working with the usual
unification scale $M_{GUT} \approx 10^{16}$ GeV, there are extensive regions in the parameter space
of soft supersymmetry-breaking terms that become forbidden [13]. For example, for the
dilaton-dominated scenario of superstrings the whole parameter space turns out to be
excluded [14] on these grounds. The stability of the corresponding constraints with respect
to variations of the initial scale for the running of the soft breaking terms was studied in
[15], finding that the larger the scale is, the stronger the bounds become. In particular, by
taking $M_{Planck}$ rather than $M_{GUT}$ for the initial scale stronger constraints were obtained.
Obviously the smaller the scale is, the weaker the bounds become. In [16] intermediate
scales rather than $M_{GUT}$ were considered for the dilaton-dominated scenario with the
interesting result that it is allowed in a large region of parameter space. Finally, there are
other arguments in favour of scenarios with intermediate string scales $M_I \approx 10^{10}$–$10^{14}$ GeV.
For example, these scales might also explain the observed ultra-high energy ($\approx 10^{20}$ eV)
cosmic rays as products of long-lived massive string mode decays. Besides, several models
of chaotic inflation also favour these scales [17].
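The hierarchy argument built on Eq. (1) can be made concrete with a rough numerical estimate. The lines below are only an order-of-magnitude sketch: they assume $M_{Planck} \approx 1.2\times10^{19}$ GeV (a value not quoted in the text) and drop all factors of order one.

```latex
m_{3/2} \approx \frac{F}{M_{Planck}} \approx \frac{M_I^2}{M_{Planck}}
\quad\Longrightarrow\quad
M_I = 10^{11}\ \mathrm{GeV}:\qquad
m_{3/2} \approx \frac{10^{22}\ \mathrm{GeV}^2}{1.2\times10^{19}\ \mathrm{GeV}}
\approx 8\times10^{2}\ \mathrm{GeV}.
```

Under the same assumptions, $M_I = 10^{10}$ GeV would give $m_{3/2}$ of order 10 GeV and $M_I = 10^{12}$ GeV of order $10^{5}$ GeV, so the electroweak scale sits inside the quoted window.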
In the present article we analyze in detail whether or not those intermediate
string scales are also necessary in order to reproduce the low-energy data, i.e., the values of
the gauge couplings deduced from CERN $e^+e^-$ collider LEP experiments. In this sense,
we will see that D-brane scenarios indeed lead naturally to intermediate values for the
string scale $M_I$.
On the other hand, it has been noted that the neutralino–nucleon cross section is quite
sensitive to the value of the initial scale for the running of the soft breaking terms [18]. The
smaller the scale is, the larger the cross section becomes. In particular, by taking $10^{10}$–$10^{12}$
GeV rather than $10^{16}$ GeV for the initial scale, the cross section increases substantially, to about
$10^{-6}$–$10^{-5}$ pb. This result is extremely interesting since the lightest neutralino is
usually the lightest supersymmetric particle (LSP), and therefore a natural candidate for
dark matter in supersymmetric theories [19], and current dark matter detectors, DAMA
[20] and CDMS [21], are sensitive to a neutralino–nucleon cross section in the above range.
The initial scale for the running of the soft terms in D-brane scenarios is $M_I$.
As mentioned above, several theoretical and phenomenological arguments suggest that
intermediate values for this scale are welcome. Thus it is natural to wonder how much the
standard neutralino–nucleon cross section analysis will be modified in D-brane scenarios.
This is another aim of this article.
The content of the article is as follows. In Section 2 we will try to determine the string
scale in D-brane scenarios imposing the experimental constraints on the values of the gauge
coupling constants. Although we will concentrate mainly on scenarios where the $SU(3)_c$,
$SU(2)_L$ and $U(1)_Y$ groups of the standard model come from different sets of Dp-branes,
we will also review the scenario where they come from the same set of Dp-branes. The
fact that the $U(1)_Y$ group arises as a linear combination of different U(1)'s, due to their
D-brane origin, is crucial in the analysis.
In Section 3 we will use the results of Section 2, in particular the matter distribution due
to the D-brane origin of the U(1) gauge groups, in order to derive the soft supersymmetry
breaking terms of the D-brane scenarios which may give rise to the supersymmetric
standard model. Generically they are non-universal. This analysis is carried out under the
assumption of dilaton/moduli supersymmetry breaking [27–30]. We emphasize that this
assumption of dilaton/moduli dominance is more compelling in the D-brane scenarios
where only closed string fields like $S$ and $T_i$ can move into the bulk and transmit
supersymmetry breaking from one D-brane sector to another. Finally, we will also
discuss the structure of Yukawa coupling matrices.
In Section 4, using the soft terms of the D-brane scenarios previously studied, we
compute the neutralino–nucleon cross section. We will see how the compatibility of
regions in the parameter space of these scenarios with the sensitivity of current dark
matter experiments depends not only on the value of the string scale but also on the
non-universality of the soft terms.
Finally, the conclusions are left for Section 5.
2. D-brane scenarios and the string scale

As mentioned in the introduction there exists the interesting possibility that the
supersymmetric standard model might be built using D-brane configurations. In this case
there are two possible avenues to carry it out: (i) The SU(3)c , SU(2)L and U (1)Y groups of
the standard model come from different sets of Dp-branes. (ii) They come from the same
set of Dp-branes.
Since the two scenarios are interesting and qualitatively different, we will discuss both
separately. We will see in detail below that the first one (i), in order to reproduce the
values of the gauge couplings deduced from CERN $e^+e^-$ collider LEP experiments,
leads naturally to intermediate values for the string scale $M_I$. This is
an interesting result since there are several arguments in favour of intermediate scales, as
discussed in the introduction. This approach was used first in [31] for the case of
non-supersymmetric Dp-branes with the result of a string scale of the order of a few TeV. In
any case, it is worth remarking on the difficulty of obtaining three copies of quarks and leptons
if the gauge groups are attached to different sets of Dp-branes.$^1$ Thus whether or not the
scenarios discussed below may arise from different sets of Dp-branes in explicit string
constructions is an important issue which is worth addressing in the future.
Concerning the other scenario (ii), models with the gauge group of the standard model
and three families of particles have been explicitly built [6,7]. We will review whether or
not intermediate scales arise naturally.
2.1. Embedding the gauge groups within different sets of Dp-branes
It is a plausible situation to assume that the SU(3)c and the SU(2)L groups of the
standard model could come from different sets of Dp-branes [3,32]. By different sets
we mean Dp-branes whose world-volume is not identical. In particular, notice that the
standard model contains particles (the left-handed quarks Qu ) transforming both under
SU(3)c and SU(2)L . That means that there must be some overlap of the world-volumes
of both sets of Dp-branes. Thus, e.g., one cannot put SU(3)c inside a set of D3-branes
and SU(2)L within another set of parallel D3-branes on a different point of the compact
space since then there would be no massless modes corresponding to the exchange of open
strings between both sets of branes which could give rise to the left-handed quarks.
Thus we need to embed SU(3)c inside D-branes, say Dp3 -branes, and SU(2)L within
other D-branes, say Dp2 -branes, in such a way that their corresponding world-volumes
have some overlap. Since we are working in general with type IIB orientifolds, pN can be
3, 5i , 7i , and 9, where the index i = 1, 2, 3, denotes what complex compact coordinate is
included in the D5-brane world-volume, or is transverse to the D7-brane world-volume.
Not all types of DpN -branes may be present simultaneously if we want to preserve N = 1
in D = 4. For a given D = 4, N = 1 vacuum we can have at most either D9-branes with
D5i -branes or D3-branes with D7i -branes.
1 We thank L.E. Ibáñez for discussions about this point (see also [6]).
Fig. 1. A generic D-brane scenario giving rise to the gauge bosons and matter of the standard model.
It contains three Dp3 -branes, two Dp2 -branes and one Dp1 -brane, where pN may be either 9 and
5i or 3 and 7i . The presence of extra D-branes, say Dq-branes, is also necessary as explained in the
text. For each set the DpN -branes are in fact on the top of each other.
In type IIB orientifold models, and in general on the world-volume of D-branes, SU(N)
groups come along with a U(1) factor, say $U(1)_N$, so that indeed we are dealing with
U(N) groups, in which both SU(N) and $U(1)_N$ share the same coupling constant, $\alpha_N$.
Thus $U(1)_Y$ might be a linear combination of two U(1) gauge groups arising from $U(3)_c$
and $U(2)_L$ within $Dp_3$- and $Dp_2$-branes, respectively [33]. Although this is the simplest
possibility, its analysis is somewhat subtle [31] and we prefer to carry it out at the end
of this subsection. Thus we will first analyze a more general case, where an extra U(1)
arising from another D-brane, say a $Dp_1$-brane, contributes to the combination giving rise to
the correct hypercharge of the standard model matter [6,31]. This is schematically shown
in Fig. 1, where open strings starting and ending on the same sets of $Dp_N$-branes give rise
to the gauge bosons of the standard model. For the sake of visualization each set is depicted
at parallel locations, but in fact they are intersecting each other as discussed above.
In [6] a $Z_3$ orientifold model with $U(3)_c \times U(2)_L \times U(1)$ observable gauge group, and
therefore giving rise to $SU(3)_c \times SU(2)_L \times U(1)_3 \times U(1)_2 \times U(1)_1$ as discussed above,
was explicitly built. Nevertheless, this model is embedded in D3-branes, i.e., $p_3 = p_2 =
p_1 = 3$, and therefore we will discuss it in detail in Section 2.2.
On the other hand, in [31] the existence of standard models coming from different
sets of non-supersymmetric Dp-branes was assumed and several consequences were
discussed. In particular, imposing $Dp_3 = Dp_1$, i.e., $\alpha_3 = \alpha_1$, the low-energy data are
reproduced for a string scale of the order of a few TeV. Here we will carry out the general
analysis of supersymmetric Dp-branes with the interesting result that intermediate values
($\approx 10^{10}$–$10^{12}$ GeV) for the string scale may arise in a natural way.
2.1.1. General scenario with $Dp_3 \neq Dp_2 \neq Dp_1 \neq Dp_3$

Let us denote by $Q_3$, $Q_2$ and $Q_1$ the charges of $U(1)_3$, $U(1)_2$, and $U(1)_1$, respectively.
Following the analysis of Antoniadis, Kiritsis and Tomaras [31], a family of quarks
and leptons can have the four assignments of quantum numbers given in Table 1, in order
to obtain the hypercharge of the standard model
\[
Y = c_3\sqrt{6}\,Q_3 + c_2\sqrt{4}\,Q_2 + \sqrt{2}\,Q_1, \tag{2}
\]
where $c_3 = -1/3$, $c_2 = -1/2$ ($c_2 = 1/2$) for the first (second) assignment and $c_3 = 2/3$,
$c_2 = -1/2$ ($c_2 = 1/2$) for the third (fourth) assignment. Note that U(N) generators are
normalized as $\mathrm{Tr}\,T^a T^b = \frac{1}{2}\delta^{ab}$, and therefore the fundamental representation of SU(N)
has $Q_N = 1/\sqrt{2N}$.
For example, as discussed above, the quark doublet $Q_u$ always arises from an open string
with one end on $Dp_3$-branes and the other end on $Dp_2$-branes. In the first assignment of
Table 1, $Q_u$ transforms as a $\bar{2}$ under U(2) and therefore $Q_2 = -1/\sqrt{4}$. $u^c$ ($d^c$) arises from
an open string with one end on $Dp_3$-branes and the other end on $Dp_1$ (Dq)-branes$^2$ with
$Q_1 = -1/\sqrt{2}$ (0). Finally, in the case of leptons, $L_e$ ($e^c$) arises from an open string with
one end on $Dp_2$ ($Dp_1$)-branes and the other end on Dq-branes with $Q_1 = 0$ ($1/\sqrt{2}$). This is
schematically shown in Fig. 1. The other three possible assignments can be analyzed
similarly. Let us remark that other scenarios with $u^c$, $d^c$ ($e^c$) arising from open strings
with both ends on $Dp_3$- ($Dp_2$)-branes are possible, since these particles can be obtained
as the antisymmetric product of two triplets of SU(3) (doublets of SU(2)). However, these
scenarios do not give rise to a modification of the analysis of the string scale [31], and
therefore we will not consider them here.
Concerning the possible quantum numbers of Higgses, they will be discussed in
Section 3 where they are important, e.g., in order to determine whether or not all Yukawa
couplings in D-brane scenarios are allowed.
Table 1
The four possible assignments of quantum numbers (multiplied by $\sqrt{2N}$) of a family of quarks and
leptons of the standard model under $U(1)_3 \times U(1)_2 \times U(1)_1$. Note that $Q_3$ is always fixed. The
usual hypercharge Y is given in the last column

                         First      Second     Third      Fourth
Matter fields      Q3    Q2   Q1    Q2   Q1    Q2   Q1    Q2   Q1     Y
Q_u (3, 2)          1    -1    0     1    0     1    0    -1    0    1/6
u^c (3bar, 1)      -1     0   -1     0   -1     0    0     0    0   -2/3
d^c (3bar, 1)      -1     0    0     0    0     0    1     0    1    1/3
L_e (1, 2)          0     1    0     1   -1     1    0     1   -1   -1/2
e^c (1, 1)          0     0    1     0    1     0    1     0    1     1
2 As we see from here the presence of extra D-branes, say Dq-branes, is necessary in order to reproduce the
correct hypercharge for the matter. In addition, in Section 2.2 we will see an explicit model where Dq-branes are
also necessary to cancel non-vanishing tadpoles. The additional U (1) factors associated to the Dq-branes will be
anomalous and therefore with a mass of the order of the string scale.
Let us now try to determine the type I string scale $M_I$, using the above information. On
the one hand, from (2) one obtains the following relation at $M_I$:
\[
\frac{1}{\alpha_Y(M_I)} = \frac{2}{\alpha_1(M_I)} + \frac{4c_2^2}{\alpha_2(M_I)} + \frac{6c_3^2}{\alpha_3(M_I)}. \tag{3}
\]
On the other hand, the usual RGEs for gauge couplings are given by
\[
\frac{1}{\alpha_j(M_I)} = \frac{1}{\alpha_j(M_Z)} + \frac{b_j^{ns}}{2\pi}\ln\frac{M_s}{M_Z} + \frac{b_j^{s}}{2\pi}\ln\frac{M_I}{M_s}, \tag{4}
\]
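where $b_j^{s}$ ($b_j^{ns}$) with j = 2, 3, Y, are the coefficients of the supersymmetric
(non-supersymmetric) $\beta$-functions, and the scale $M_s$ corresponds to the supersymmetric threshold,
200 GeV $\lesssim M_s \lesssim$ 1000 GeV. Thus using (3), (4) and the fact that always $c_2^2 = 1/4$ one
can compute $M_I$ with the result
\[
\ln\frac{M_I}{M_s} = \frac{1}{6c_3^2 b_3^{s} + b_2^{s} - b_Y^{s}}
\left\{ 2\pi\left[\frac{1}{\alpha_Y(M_Z)} - \frac{2}{\alpha_1(M_I)} - \frac{1}{\alpha_2(M_Z)} - \frac{6c_3^2}{\alpha_3(M_Z)}\right]
+ \left(b_Y^{ns} - b_2^{ns} - 6c_3^2 b_3^{ns}\right)\ln\frac{M_s}{M_Z} \right\}. \tag{5}
\]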
Using the experimental values [34] $M_Z = 91.1870$ GeV, $\alpha_3(M_Z) = 0.1184$, $\alpha_2(M_Z) =
0.0338$, $\alpha_Y(M_Z) = 0.01016$, and the matter content of the minimal supersymmetric
standard model (MSSM), i.e., $b_3^{s} = 3$, $b_2^{s} = -1$, $b_Y^{s} = -11$ and $b_3^{ns} = 7$, $b_2^{ns} = 19/6$, $b_Y^{ns} =
-41/6$, one obtains for $c_3 = -1/3$
\[
\ln\frac{M_I}{M_s} = 33.09 - \frac{1.05}{\alpha_1(M_I)} - 1.22\,\ln\frac{M_s}{M_Z}. \tag{6}
\]
For example, choosing the value of the coupling associated to the $Dp_1$-brane in the range
$0.07 \lesssim \alpha_1(M_I) \lesssim 0.1$ one obtains $M_I \approx 10^{10}$–$10^{12}$ GeV. This scenario is shown in Fig. 2 for
$\alpha_1(M_I) = 0.1$ and $M_s = 1$ TeV. As discussed above, this intermediate initial scale is an
attractive possibility. Values of the coupling $\alpha_1(M_I)$ smaller than 0.07 are not interesting
since $M_I$ also becomes smaller, and therefore $m_{3/2}$ in (1) would be too low to be compatible
with the experimental bounds on supersymmetric particle masses. Although the larger the
coupling is, the larger $M_I$ becomes (e.g., for $\alpha_1(M_I) = 1$ one is even able to obtain $M_I \approx
5\times10^{15}$ GeV), one should be careful with the range of validity of the perturbative regime.
On the other hand, the case $c_3 = 2/3$ is less interesting since one obtains the upper bound
$M_I \lesssim 3\times10^{8}$ GeV.
It is worth noticing that non-supersymmetric scenarios can also be analyzed with the
above formula (5) with the substitutions $M_s \to M_Z$, $b_i^{s} \to b_i^{ns}$. For example, $M_I \approx 1$ TeV
can be obtained with $\alpha_1(M_I) \approx 0.035$ for $c_3 = -1/3$, and $\alpha_1(M_I) \approx 0.056$ for $c_3 = 2/3$.
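The numbers quoted around Eqs. (5) and (6) can be cross-checked with a few lines of code. The sketch below simply evaluates Eq. (5); the function names `ln_MI_over_Ms` and `string_scale` are illustrative and not from the paper, and the signs of the β-function coefficients are an assumption fixed by the convention of Eq. (4), in which an asymptotically free coupling has a positive coefficient.

```python
import math

# Inputs quoted in the text (couplings at M_Z, masses in GeV).
M_Z = 91.1870
ALPHA_3, ALPHA_2, ALPHA_Y = 0.1184, 0.0338, 0.01016

# Beta-function coefficients in the sign convention of Eq. (4) (assumed signs).
B_SUSY = {"3": 3.0, "2": -1.0, "Y": -11.0}
B_NONSUSY = {"3": 7.0, "2": 19.0 / 6.0, "Y": -41.0 / 6.0}


def ln_MI_over_Ms(alpha_1, c3, M_s):
    """Right-hand side of Eq. (5) for given alpha_1(M_I), c3 and threshold M_s."""
    k3 = 6.0 * c3 ** 2
    denom = k3 * B_SUSY["3"] + B_SUSY["2"] - B_SUSY["Y"]
    couplings = 1.0 / ALPHA_Y - 2.0 / alpha_1 - 1.0 / ALPHA_2 - k3 / ALPHA_3
    threshold = B_NONSUSY["Y"] - B_NONSUSY["2"] - k3 * B_NONSUSY["3"]
    return (2.0 * math.pi * couplings + threshold * math.log(M_s / M_Z)) / denom


def string_scale(alpha_1, c3=-1.0 / 3.0, M_s=1000.0):
    """String scale M_I in GeV obtained by exponentiating Eq. (5)."""
    return M_s * math.exp(ln_MI_over_Ms(alpha_1, c3, M_s))


if __name__ == "__main__":
    for a1 in (0.07, 0.1, 1.0):
        print(f"alpha_1 = {a1}: M_I = {string_scale(a1):.2e} GeV")
```

With $\alpha_1(M_I) = 0.1$ and $M_s = 1$ TeV this gives a string scale of a few times $10^{11}$ GeV, inside the intermediate window discussed in the text; taking $\alpha_1 \to \infty$ and $M_s = M_Z$ isolates the constant 33.09 of Eq. (6).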
2.1.2. Scenarios with Dp1 = Dp3 or Dp1 = Dp2
Let us now simplify the above analysis assuming that the D-brane associated to the
$U(1)_1$ is on top of one of the other D-branes. In this case we have two possibilities, either
$Dp_1 = Dp_3$ or $Dp_1 = Dp_2$. Let us start analyzing the possibility $Dp_1 = Dp_3$, which
implies $\alpha_1 = \alpha_3$. Then Eq. (5) is still valid with the substitutions
\[
\frac{2}{\alpha_1(M_I)} + \frac{6c_3^2}{\alpha_3(M_Z)} \to \frac{2+6c_3^2}{\alpha_3(M_Z)}, \qquad
6c_3^2\, b_3^{s,ns} \to (2+6c_3^2)\, b_3^{s,ns}.
\]
As a consequence, for $c_3 = -1/3$, one obtains the
following prediction: $M_I \approx 6\times10^{8}$ GeV, with $M_s = 200$ GeV. This value is slightly too low to
obtain $m_{3/2} \approx M_W$, as discussed below Eq. (1). Obviously, the larger $M_s$ is, the
smaller $M_I$ becomes. The case $c_3 = 2/3$ is much worse since $M_I \approx 100$ TeV.
Fig. 2. Running of the gauge couplings of the MSSM with energy Q embedding the gauge groups
within different sets of Dp-branes (solid lines). Due to the D-brane origin of the U(1) gauge groups,
relation (3) must be fulfilled. For comparison the running of the MSSM couplings with the usual
normalization factor for the hypercharge, 3/5, is also shown with dashed lines.
The other scenario $Dp_1 = Dp_2$, which implies $\alpha_1 = \alpha_2$, does not improve the above
situation. One can use again Eq. (5), but now with the substitutions
\[
\frac{2}{\alpha_1(M_I)} + \frac{1}{\alpha_2(M_Z)} \to \frac{3}{\alpha_2(M_Z)}, \qquad
b_2^{s,ns} \to 3\, b_2^{s,ns}.
\]
In particular, for $c_3 = -1/3$, $M_I \approx 500$ GeV, with $M_s = 200$ GeV,
whereas $\ln(M_I/M_s)$ is even negative for $c_3 = 2/3$.
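The prediction for $Dp_1 = Dp_3$ follows from Eq. (5) once $\alpha_1$ is identified with $\alpha_3$, so it has no free coupling left. The sketch below is one way to check it numerically; the function name is hypothetical, and the β-function signs are the same assumption as in the convention of Eq. (4).

```python
import math

# Inputs quoted in the text (couplings at M_Z, masses in GeV).
M_Z = 91.1870
ALPHA_3, ALPHA_2, ALPHA_Y = 0.1184, 0.0338, 0.01016


def string_scale_dp1_eq_dp3(c3, M_s):
    """M_I in GeV from Eq. (5) with alpha_1 = alpha_3 (Dp1 on top of Dp3)."""
    k = 2.0 + 6.0 * c3 ** 2                      # replaces 6 c3^2 everywhere
    denom = k * 3.0 + (-1.0) - (-11.0)           # k b3^s + b2^s - bY^s
    couplings = 1.0 / ALPHA_Y - 1.0 / ALPHA_2 - k / ALPHA_3
    threshold = (-41.0 / 6.0) - (19.0 / 6.0) - k * 7.0   # bY^ns - b2^ns - k b3^ns
    log_ratio = (2.0 * math.pi * couplings
                 + threshold * math.log(M_s / M_Z)) / denom
    return M_s * math.exp(log_ratio)


if __name__ == "__main__":
    for c3 in (-1.0 / 3.0, 2.0 / 3.0):
        print(f"c3 = {c3:+.3f}: M_I = {string_scale_dp1_eq_dp3(c3, 200.0):.2e} GeV")
```

For $M_s = 200$ GeV this lands close to the values quoted in the text, namely about $6\times10^{8}$ GeV for $c_3 = -1/3$ and about 100 TeV for $c_3 = 2/3$.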
On the other hand, extra particles appear quite frequently in superstring theories. Since
their presence will modify the denominator in (5), one might obtain larger values for $M_I$.
For example, for $Dp_1 = Dp_3$, and restricting ourselves to the case of singlets, SU(2)
doublets and colour triplets, one has
\[
(2+6c_3^2)\, b_3^{s} + b_2^{s} - b_Y^{s} = 18 - \frac{1}{2}(2+6c_3^2)\, n_3 - \frac{1}{2} n_2 + q, \tag{7}
\]
where $q = \sum_{i=1}^{n_1} Y_i^2 + 2\sum_{j=1}^{n_2} Y_j^2 + 3\sum_{k=1}^{n_3} Y_k^2$ and $n_{1,2,3}$ is the number of extra singlets,
doublets and triplets that the model under consideration has. Extra (3, 2) representations
under SU(3) $\times$ SU(2) must be introduced in the formula for q just as two triplets each.
For instance, assuming the presence of two copies of $d^c + \bar{d}^c$ for the case $c_3 = -1/3$, one
obtains $M_I \approx 4\times10^{10}$ GeV ($\approx 8\times10^{9}$ GeV), with $M_s = 200$ GeV (1 TeV). Concerning
the running of the couplings, this scenario is similar to the one shown in Fig. 2.
As above, we can also analyze non-supersymmetric scenarios. A string scale of order
a few TeV can be obtained without extra particles. In particular, for $\alpha_1 = \alpha_2$, $c_3 = -1/3$
and $\alpha_1 = \alpha_3$, $c_3 = 2/3$ we recover the results of [31], $M_I \approx 300$ GeV and $M_I \approx 7$ TeV,
respectively.
2.1.3. Scenario without Dp1 -brane
Let us finally consider the scenario where the $U(1)_Y$ is only a linear combination of
the two U(1) gauge groups arising from $U(3)_c$ and $U(2)_L$ within $Dp_3$- and $Dp_2$-branes,
respectively [33].
As discussed in [31], there is only one assignment of quantum numbers for quarks and
leptons, in order to obtain the hypercharge of the standard model. The latter is given by
Eq. (2) with $c_3 = -1/3$, $c_2 = -1/2$, $Q_1 = 0$, i.e., $Y = -\frac{1}{3}\sqrt{6}\,Q_3 - \frac{1}{2}\sqrt{4}\,Q_2$. Whereas
the charges $Q_3$ and $Q_2$ for $Q_u$, $d^c$ and $L_e$ are as in the first assignment given in Table 1,
$Q_3 = 2/\sqrt{6}$, 0 and $Q_2 = 0$, $-2/\sqrt{4}$ for $u^c$, $e^c$. Clearly, $u^c$ and $e^c$ must arise from open
strings with both ends on $Dp_3$-branes and $Dp_2$-branes, respectively. As mentioned above,
this is possible since these particles can be obtained as the antisymmetric product of two
triplets of SU(3) and doublets of SU(2), respectively.
With the above hypercharge, instead of Eq. (3) one obtains
\[
\frac{1}{\alpha_Y(M_I)} = \frac{1}{\alpha_2(M_I)} + \frac{2/3}{\alpha_3(M_I)},
\]
and therefore Eq. (5) is still valid with $c_3 = -1/3$ and the substitution $2/\alpha_1(M_I) \to 0$. As
a consequence one can predict the string scale. For example, for $M_s = 200$ GeV ($M_s =
1$ TeV) one obtains $M_I \approx 1.8\times10^{16}$ GeV ($M_I \approx 10^{16}$ GeV). On the other hand, a
non-supersymmetric scenario [31] gives rise to a string scale which is too large, $M_I \approx
5\times10^{13}$ GeV.
It is worth noticing [6] that for $\alpha_3 = \alpha_2$ one obtains the standard GUT normalization for
couplings, $\alpha_Y = \frac{3}{5}\alpha_2$, and therefore $M_I \approx 2\times10^{16}$ GeV.
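The string scale quoted for the scenario without a $Dp_1$-brane can be read off directly from the numerical form of Eq. (5). The lines below are a worked check under the assumption that Eq. (6) holds with the $\alpha_1$ term removed, using the rounded coefficients 33.09 and 1.22 quoted there.

```latex
\ln\frac{M_I}{M_s} \approx 33.09 - 1.22\,\ln\frac{M_s}{M_Z}
\;\Longrightarrow\;
\begin{cases}
M_s = 200\ \mathrm{GeV}: & \ln\dfrac{M_I}{M_s} \approx 33.09 - 0.96 = 32.13, \quad M_I \approx 1.8\times10^{16}\ \mathrm{GeV},\\[2mm]
M_s = 1\ \mathrm{TeV}: & \ln\dfrac{M_I}{M_s} \approx 33.09 - 2.92 = 30.17, \quad M_I \approx 1.3\times10^{16}\ \mathrm{GeV}.
\end{cases}
```

The GUT normalization mentioned last follows from the same relation: setting $\alpha_3 = \alpha_2$ gives $1/\alpha_Y = (1 + 2/3)/\alpha_2 = (5/3)/\alpha_2$.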
We thus conclude that, concerning the string scale MI , the generic models analyzed
above are very interesting from the point of view of their predictivity. Besides, the values
obtained for MI can be accommodated in type I strings, choosing the appropriate values of
the moduli. For instance, for the example studied below Eq. (7) the experimental values of
couplings are obtained with $M_I \approx 3 \times 10^{11}$ GeV, and therefore the ratio
$\alpha_3(M_I)/\alpha_2(M_I) \approx 2$. Let us assume that SU(3)c is embedded inside
D9-branes and SU(2)L inside $D5_1$-branes. Then one has the following relationships
$$ \frac{M_1 M_2 M_3}{M_I^2} = \frac{\alpha_3\, M_{Planck}}{\sqrt{2}} , \qquad
   \frac{M_1 M_I^2}{M_2 M_3} = \frac{\alpha_2\, M_{Planck}}{\sqrt{2}} , \tag{8} $$
where $M_i$, $i = 1, 2, 3$, are the compactification masses associated to the compact radii $R_i$.

Choosing $M_2^2 M_3^2 / M_I^4 \approx 2$, which by Eq. (8) equals $\alpha_3/\alpha_2$, one is
able to reproduce the above ratio.
2.2. Embedding all gauge groups within the same set of Dp-branes
The difficulty of obtaining three copies of quarks and leptons when the gauge groups
come from different sets of DpN-branes, as mentioned above, is one of the motivations in
[6] to embed all gauge interactions in the same set of DpN-branes (p3 = p2 = p1 in the
notation above).
Here we will briefly review the results of Aldazabal, Ibáñez, Quevedo and Uranga [6]
concerning this issue. They are able to build $Z_3$ orientifold models with the gauge group
$SU(3)_c \times SU(2)_L \times U(1)_3 \times U(1)_2 \times U(1)_1$ embedded in D3-branes, with no
additional non-Abelian factors. They also argue that in the $Z_3$ orientifold, which leads
naturally to three families, only the combination
$$ Y = -\frac{1}{3}\sqrt{6}\,Q_3 - \frac{1}{2}\sqrt{4}\,Q_2 + \sqrt{2}\,Q_1 \tag{9} $$
will be non-anomalous. It is worth noticing that this is precisely the hypercharge given in
(2) with $c_3 = -1/3$ and $c_2 = -1/2$, i.e., the first assignment of Table 1. Likewise Fig. 1
with Dp3 = Dp2 = Dp1 = D3, and all D3-branes on top of each other, is also valid as
a schematic representation of this type of models. Dq-branes in the figure are now
D7-branes, which must be introduced in order to cancel non-vanishing tadpoles. Since
$\alpha_3 = \alpha_2 = \alpha_1 = \alpha$, instead of (3) one obtains
$$ \frac{1}{\alpha_Y(M_I)} = \frac{11/3}{\alpha(M_I)} , \tag{10} $$
which is not the standard GUT normalization for couplings. This is due to the D-brane
origin of the U(1) gauge groups.
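The factor 11/3 in Eq. (10) can be checked with exact arithmetic. The sketch below assumes that Eq. (3), which is not reproduced in this section, has the form $1/\alpha_Y = 2c_1^2/\alpha_1 + 4c_2^2/\alpha_2 + 6c_3^2/\alpha_3$ with $c_1 = 1$; this form is inferred from Eqs. (10) and (12) and is an assumption of the example, as is the function name.

```python
from fractions import Fraction


def hypercharge_normalization(c3, c2, c1=Fraction(1)):
    """Return k such that 1/alpha_Y = k/alpha when all U(N) couplings equal alpha.

    Assumes Eq. (3) reads 1/alpha_Y = 2 c1^2/alpha_1 + 4 c2^2/alpha_2 + 6 c3^2/alpha_3.
    """
    return 6 * c3**2 + 4 * c2**2 + 2 * c1**2


# All gauge groups on the same D3-branes, first assignment of Table 1.
k_d3 = hypercharge_normalization(Fraction(-1, 3), Fraction(-1, 2))

# Scenario without Dp1-brane: the U(1)_1 term is absent (c1 = 0).
k_no_dp1 = hypercharge_normalization(Fraction(-1, 3), Fraction(-1, 2), c1=Fraction(0))

print("D3-brane scenario: 1/alpha_Y =", k_d3, "/ alpha")
print("No Dp1-brane, alpha_3 = alpha_2: alpha_Y =", 1 / k_no_dp1, "* alpha_2")
```

Under this assumption the first case gives the 11/3 of Eq. (10), and the second gives the standard GUT factor 3/5 quoted in Section 2.1.3.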
A model with all these properties was explicitly built in [6]. Although extra U (1)s on the
D7-branes are present, they are anomalous and therefore the associated gauge bosons have
masses of the order of MI . In addition to D7-branes, anti-D7-branes trapped at different
Z3 fixed points are also present. Since they break supersymmetry at the string scale MI ,
they can be used as the hidden sector of supergravity theories. Thus this is an example of
gravity mediated supersymmetry breaking.
On the other hand, in this model not only quarks and leptons come in three generations
but also Higgses, i.e., it contains two pairs of extra doublets with respect to the MSSM.
In addition, three pairs of extra colour triplets are also present. Unfortunately, this matter
content cannot give rise to the correct values for $\alpha_j(M_Z)$. Although generically the extra
triplets will be heavy, this does not modify the previous result. One cannot exclude,
however, the possibility that other models with the matter content necessary to
reproduce the experimental values of couplings might be built. For example, if besides the
matter content of the MSSM, we have six copies of $H_1 + H_2$ and two copies of
$d^c + \bar{d}^c$, unification at around $M_I = 10^{10}$ GeV, with $\alpha(M_I) \approx 1/14$, is
obtained. This scenario is shown in Fig. 3 for $M_s = 1$ TeV.
It is worth noticing that these values can be accommodated in type I strings, choosing
the appropriate values of the moduli. For example, with an isotropic compact space, the
string scale is given by:
$$ M_I^4 = \frac{\alpha\, M_{Planck}}{\sqrt{2}}\, M_c^3 , \tag{11} $$
where $M_c$ is the compactification scale. Thus one gets $M_I \approx 10^{10-12}$ GeV with
$M_c \approx 10^{8-10}$ GeV.
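As an illustration of how Eq. (11) links the two scales, the following sketch inverts it for $M_c$. The numerical value of the Planck mass and the function name are assumptions of the example (the text does not state which value was used), and $\alpha = 1/14$ is taken from the unification example above.

```python
import math

M_PLANCK_GEV = 1.2e19  # assumed value of the Planck mass in GeV


def compactification_scale(m_string, alpha, m_planck=M_PLANCK_GEV):
    """Invert Eq. (11), M_I^4 = alpha * M_Planck * M_c^3 / sqrt(2), for M_c (in GeV)."""
    return (math.sqrt(2.0) * m_string**4 / (alpha * m_planck)) ** (1.0 / 3.0)


for m_string in (1e10, 1e11, 1e12):
    m_c = compactification_scale(m_string, alpha=1.0 / 14.0)
    print(f"M_I = {m_string:.0e} GeV  ->  M_c = {m_c:.1e} GeV")
```

Because $M_c \propto M_I^{4/3}$, two orders of magnitude in $M_I$ map onto somewhat less than three in $M_c$.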
Fig. 3. Running of the gauge couplings with energy Q embedding all gauge groups within the same
set of D3-branes (solid lines). In addition to the matter content of the MSSM, extra Higgs doublets
and vector-like states are also present. Due to the D-brane origin of the U (1) gauge groups, the
normalization factor of the hypercharge is 3/11 (see Eq. (10)). For comparison the running of the
MSSM couplings with the usual normalization factor for the hypercharge, 3/5, is also shown with
dashed lines.
Let us finally mention that another model with the gauge group of the standard model
and three families has recently been built [7]. The presence of additional Higgs doublets
and vector-like states allows a unification scale at an intermediate value.
3. Soft terms and Yukawa couplings in D-brane scenarios

General formulas for the soft supersymmetry-breaking terms in D-brane constructions
were obtained in [33], under the assumption of dilaton/moduli supersymmetry breaking
[27–30], using the parametrization introduced in [30]. On the other hand, general Yukawa
couplings in D-brane constructions have been studied in [33,35]. Since we need to use these
results to obtain the soft terms and Yukawa couplings associated to the D-brane scenarios
discussed above, we summarize them in Appendix A.
3.1. Embedding the gauge groups within different sets of Dp-branes
3.1.1. General scenario with Dp3 ≠ Dp2 ≠ Dp1 ≠ Dp3
For the sake of concreteness, let us assume the following distribution of D-branes in
the scenario proposed in Section 2.1. Dp3-branes are D9-branes, Dp1-branes are
$D5_3$-branes, Dp2-branes are $D5_1$-branes, and finally Dq-branes are $D5_2$-branes. Then, the
first assignment of Table 1 shown schematically in Fig. 1 gives rise to the following soft
masses, using formulas (A.1) and (A.2) in Appendix A. The gaugino masses are given by:
$$
\begin{aligned}
M_3 &= \sqrt{3}\, m_{3/2} \sin\theta , \qquad M_2 = \sqrt{3}\, m_{3/2}\, \Theta_1 \cos\theta , \\
M_Y &= \sqrt{3}\, m_{3/2}\, \alpha_Y(M_I) \left[ \frac{2\,\Theta_3 \cos\theta}{\alpha_1(M_I)}
      + \frac{\Theta_1 \cos\theta}{\alpha_2(M_I)} + \frac{6 c_3^2 \sin\theta}{\alpha_3(M_I)} \right] ,
\end{aligned}
\tag{12}
$$
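where it is worth noticing that relation (3) has been taken into account in order to obtain
the gaugino mass associated to the gauge group $U(1)_Y$. The scalar masses are given by:
$$
\begin{aligned}
m^2_{Q_u} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_1^2 \right) \cos^2\theta \right] , \\
m^2_{d^c} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_2^2 \right) \cos^2\theta \right] , \\
m^2_{u^c} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_3^2 \right) \cos^2\theta \right] , \\
m^2_{e^c} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_1^2 \cos^2\theta \right) \right] , \\
m^2_{L_e} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_3^2 \cos^2\theta \right) \right] .
\end{aligned}
\tag{13}
$$
Note that quarks of type $Q_u$, $d^c$ and $u^c$ are states $C^{95_1}$, $C^{95_2}$ and $C^{95_3}$,
respectively, whereas leptons of type $e^c$ and $L_e$ are states $C^{5_3 5_2}$ and $C^{5_1 5_2}$,
respectively.
These soft terms (12) and (13) are generically non-universal. For example, in the overall
modulus limit ($\Theta_{1,2,3} = 1/\sqrt{3}$) universality cannot be obtained. The dilaton limit
($\sin^2\theta = 1$) would give rise to tachyonic states $e^c$ and $L_e$.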
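The statement about tachyons in the dilaton limit can be verified numerically. The sketch below codes the two scalar-mass patterns that appear in Eq. (13), one for $C^{95_i}$ states and one for $C^{5_i 5_j}$ states, and evaluates them at $\sin^2\theta = 1$. Function and variable names are illustrative choices, and masses are given in units where $m_{3/2} = 1$.

```python
import math


def m2_C95(m32, theta, Theta_i):
    """Squared soft mass of a C^{9 5_i} state, as in the first three lines of Eq. (13)."""
    return m32**2 * (1.0 - 1.5 * (1.0 - Theta_i**2) * math.cos(theta) ** 2)


def m2_C55(m32, theta, Theta_k):
    """Squared soft mass of a C^{5_i 5_j} state (k != i, j), last two lines of Eq. (13)."""
    return m32**2 * (1.0 - 1.5 * (math.sin(theta) ** 2 + Theta_k**2 * math.cos(theta) ** 2))


def scalar_masses_sq(m32, theta, Thetas):
    """Squared soft masses of Eq. (13) for the first assignment of Table 1."""
    T1, T2, T3 = Thetas
    return {
        "Q_u": m2_C95(m32, theta, T1),
        "d^c": m2_C95(m32, theta, T2),
        "u^c": m2_C95(m32, theta, T3),
        "e^c": m2_C55(m32, theta, T1),
        "L_e": m2_C55(m32, theta, T3),
    }


# Dilaton limit: sin^2(theta) = 1, so the Theta_i drop out.
dilaton = scalar_masses_sq(1.0, math.pi / 2.0, (1.0 / math.sqrt(3.0),) * 3)
tachyons = sorted(name for name, m2 in dilaton.items() if m2 < 0.0)
print("tachyonic states in the dilaton limit:", tachyons)
```

In this limit the $C^{5_i 5_j}$ masses reduce to $m^2 = -m^2_{3/2}/2$, while the $C^{95_i}$ masses stay at $m^2_{3/2}$.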
Obviously, the other three assignments of quantum numbers in Table 1 give rise to the
same gaugino masses (12). The differences arise for some of the soft scalar masses in (13).
For the second assignment the masses of leptons of type $L_e$ in (13) must be replaced by
$$ m^2_{L_e} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_2^2 \cos^2\theta \right) \right] . \tag{14} $$
For the third assignment the masses of quarks of type $u^c$ and $d^c$ must be exchanged in (13),
i.e.,
$$
m^2_{d^c} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_3^2 \right) \cos^2\theta \right] , \qquad
m^2_{u^c} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_2^2 \right) \cos^2\theta \right] .
\tag{15}
$$
Finally, for the fourth assignment, both modifications (14) and (15) must be included in
Eq. (13).
Concerning the soft Higgs masses, we need to know the quantum numbers $Q_{3,2,1}$ of
the two Higgs doublets of the supersymmetric standard model. For the first and third
assignments of Table 1, whose hypercharges are
$Y = -\frac{1}{3}\sqrt{6}\,Q_3 - \frac{1}{2}\sqrt{4}\,Q_2 + \sqrt{2}\,Q_1$ and
$Y = \frac{2}{3}\sqrt{6}\,Q_3 - \frac{1}{2}\sqrt{4}\,Q_2 + \sqrt{2}\,Q_1$, respectively, there are two
possible assignments for the Higgs with hypercharge $1/2$, $H_2\,(0, 1/\sqrt{4}, 1/\sqrt{2})$ and
$H_2\,(0, -1/\sqrt{4}, 0)$. For the Higgs with hypercharge $-1/2$, there are also two possible
assignments, $H_1\,(0, 1/\sqrt{4}, 0)$ and $H_1\,(0, -1/\sqrt{4}, -1/\sqrt{2})$. Thus we have four
possible combinations:
$$ H_2\,(0, 1, 1) , \qquad H_1\,(0, 1, 0) , \tag{16} $$
$$ H_2\,(0, 1, 1) , \qquad H_1\,(0, -1, -1) , \tag{17} $$
$$ H_2\,(0, -1, 0) , \qquad H_1\,(0, -1, -1) , \tag{18} $$
$$ H_2\,(0, -1, 0) , \qquad H_1\,(0, 1, 0) , \tag{19} $$
where for simplicity we have multiplied the quantum numbers by $\sqrt{2N}$ as in Table 1.
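The counting of Higgs assignments can be reproduced by brute force. The sketch below works with the rescaled charges (multiplied by $\sqrt{2N}$), so that the hypercharge of a colour singlet is $Y = c_2 q_2 + c_1 q_1$, and assumes $c_2 = -1/2$, $c_1 = 1$ as in the first and third assignments. It also assumes that a doublet has $q_2 = \pm 1$ and $q_1 \in \{1, 0, -1\}$; these ranges and all names are choices of the example.

```python
from fractions import Fraction
from itertools import product

C2, C1 = Fraction(-1, 2), Fraction(1)  # assumed coefficients of Q_2 and Q_1


def hypercharge(q2, q1):
    """Hypercharge of a colour-singlet state with rescaled charges (0, q2, q1)."""
    return C2 * q2 + C1 * q1


def doublets_with(y):
    """All (q3, q2, q1) with q3 = 0, q2 = +-1, q1 in {1, 0, -1} and hypercharge y."""
    return [(0, q2, q1) for q2, q1 in product((1, -1), (1, 0, -1)) if hypercharge(q2, q1) == y]


h2_options = doublets_with(Fraction(1, 2))
h1_options = doublets_with(Fraction(-1, 2))
combinations = list(product(h2_options, h1_options))

for h2, h1 in combinations:
    print("H2", h2, " H1", h1)
```

Two options for each doublet give the four combinations listed in (16)–(19).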

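For example, combination (16) is the one shown in Fig. 1, where $H_2$ is a state $C^{5_1 5_3}$ and
$H_1$ is a state $C^{5_1 5_2}$. The soft masses corresponding to combinations (16)–(19), using
again formulas (A.2), are given respectively by
$$
\begin{aligned}
m^2_{H_2} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_2^2 \cos^2\theta \right) \right] , \\
m^2_{H_1} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_3^2 \cos^2\theta \right) \right] ,
\end{aligned}
\tag{20}
$$
$$ m^2_{H_2} = m^2_{H_1} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_2^2 \cos^2\theta \right) \right] , \tag{21} $$
$$
\begin{aligned}
m^2_{H_2} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_3^2 \cos^2\theta \right) \right] , \\
m^2_{H_1} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_2^2 \cos^2\theta \right) \right] ,
\end{aligned}
\tag{22}
$$
$$ m^2_{H_2} = m^2_{H_1} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_3^2 \cos^2\theta \right) \right] . \tag{23} $$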
With respect to the second and fourth assignments of quantum numbers in Table 1, it is
worth noticing that their hypercharges are equal to the ones of the first and third assignments,
respectively, but with an opposite sign in front of $Q_2$. As a consequence, the four
combinations of quantum numbers (16)–(19) are still valid using an opposite sign for the
value of $Q_2$. Thus the corresponding soft masses are as in Eqs. (20)–(23).
Summarizing, we have obtained in total sixteen scenarios with different soft terms.
Concerning the soft trilinear parameters, since these are related to Yukawa couplings we
need to discuss first the structure of the latter. This can be carried out straightforwardly
taking into account the previous information about quantum numbers and the formula
for the renormalizable Yukawa Lagrangian (A.4) in Appendix A. The sixteen possible
scenarios will have in principle different Yukawa couplings since fields uc , d c , Le , H2 and
H1 are attached to different D-branes.
Let us concentrate on the eight scenarios which are more realistic from the phenomenological
point of view. Following the discussion of Section 2.1.1 these are the ones with
$c_3 = -1/3$, i.e., the first assignment of Table 1 with the four possible combinations for
Higgses (16)–(19), and the second assignment of Table 1 with the same combinations but
with an opposite sign for the value of $Q_2$ as discussed above. Let us also assume that we
have three copies of quarks and leptons. For instance, the first assignment with Higgses as
in (16) implies that couplings $\sum_{a,b} Y_u^{ab} H_2 Q_u^a u^c_b$ correspond to
$C^{5_3 5_1} C^{95_1} C^{95_3}$, couplings $\sum_{a,b} Y_d^{ab} H_1 Q_u^a d^c_b$ correspond to
$C^{5_1 5_2} C^{95_1} C^{95_2}$, and finally couplings $\sum_{a,b} Y_e^{ab} H_1 L_e^a e^c_b$
correspond to $C^{5_1 5_2} C^{5_1 5_2} C^{5_3 5_2}$, where a, b are family indices. Whereas the
last type of couplings is forbidden, as can be seen from (A.4) in Appendix A (see footnote 3),
the other two are allowed with the result
$$ Y_{u,d} = g_{q,1}\, \hat{Y} , \qquad Y_e = 0 , \tag{24} $$
where $\hat{Y}$ is defined as
$$ \hat{Y} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix} , \tag{25} $$
$g_1$ is the gauge coupling associated to the $U(1)_1$ in the $D5_3$-brane and $g_q$ is the gauge
coupling associated to the $D5_2$-brane. An analysis of the above democratic texture for
$Y_{u,d}$ can be found in [36]. Lepton masses might appear from non-renormalizable couplings.
The Yukawa couplings for the other three combinations of Higgses (17)–(19) can be
obtained straightforwardly as for the previous one, with the result
$$ Y_{u,e} = g_{q,3}\, \hat{Y} , \qquad Y_d = 0 , \tag{26} $$
$$ Y_{u,d} = 0 , \qquad Y_e = g_3\, \hat{Y} , \tag{27} $$
$$ Y_d = g_1\, \hat{Y} , \qquad Y_{u,e} = 0 , \tag{28} $$
respectively. Here $g_3$ is the gauge coupling associated to the $SU(3)_c$ in the D9-brane.
On the other hand, for the second assignment of Table 1 the Yukawa couplings are given
by
$$ Y_{u,d,e} = g_{q,1,3}\, \hat{Y} , \tag{29} $$
$$ Y_u = g_q\, \hat{Y} , \qquad Y_{d,e} = 0 , \tag{30} $$
$$ Y_{u,d,e} = 0 , \tag{31} $$
$$ Y_{d,e} = g_{1,3}\, \hat{Y} , \qquad Y_u = 0 . \tag{32} $$
It is worth noticing that in principle some of these scenarios seem to be hopeless. For
instance, it is difficult to imagine how the scenario with Yukawa couplings (27) may give rise
to the observed fermion mass hierarchies with quark masses arising from non-renormalizable
terms.
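3 This is also obtained by realizing that the sum of the U(1) charges of the fields is non-vanishing.
Now that the structure of Yukawa couplings is known we can compute the corresponding
trilinear parameters using Eq. (A.3). Obviously, when Yukawa couplings are vanishing
trilinear parameters are also vanishing. When Yukawa couplings are non-vanishing, the
A-terms acquire the following values:
$$ A_u = \frac{\sqrt{3}}{2}\, m_{3/2} \left[ \left( \Theta_2 - \Theta_1 - \Theta_3 \right) \cos\theta - \sin\theta \right] \hat{Y} , \tag{33} $$
$$ A_d = \frac{\sqrt{3}}{2}\, m_{3/2} \left[ \left( \Theta_3 - \Theta_1 - \Theta_2 \right) \cos\theta - \sin\theta \right] \hat{Y} , \tag{34} $$
$$ A_e = \frac{\sqrt{3}}{2}\, m_{3/2} \left[ \sin\theta - \left( \Theta_1 + \Theta_2 + \Theta_3 \right) \cos\theta \right] \hat{Y} . \tag{35} $$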
For example, Yukawa couplings (24) have associated A-terms given by (33), (34) and
Ae = 0, Yukawa couplings (26) have associated A-terms (33), (35) and Ad = 0, etc.
3.1.2. Scenarios with Dp1 = Dp3 or Dp1 = Dp2
As we discussed in Section 2.1.2, scenarios with Dp1 = Dp3 are more interesting
from the phenomenological point of view than scenarios with Dp1 = Dp2, so we will
concentrate on the former. In any case, the analysis of the other scenario can be carried out
along similar lines. The first attempts to study these scenarios and their phenomenology,
in particular CP phases, Yukawa textures, and dark matter, were carried out in [37,38],
and [39–42], respectively. However, they consider in fact toy scenarios since the important
issue of the D-brane origin of the U (1)Y gauge group as a combination of other U (1)s and
its influence on the matter distribution (see, e.g., Fig. 1) was not included in their analyses.
Thus we will take into account the discussion of Section 2 concerning this issue in order
to obtain the soft terms and Yukawa couplings, as we already did with the general scenario
of the previous subsection.
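Assuming the same distribution of D-branes as in the previous subsection, we have that
Dp1- and Dp3-branes are D9-branes, Dp2-branes are $D5_1$-branes, and finally Dq-branes
are $D5_2$-branes. Then, the first assignment of Table 1 gives rise to the following gaugino
masses:
$$
\begin{aligned}
M_3 &= \sqrt{3}\, m_{3/2} \sin\theta , \qquad M_2 = \sqrt{3}\, m_{3/2}\, \Theta_1 \cos\theta , \\
M_Y &= \sqrt{3}\, m_{3/2}\, \alpha_Y(M_I) \left[ \frac{\Theta_1 \cos\theta}{\alpha_2(M_I)}
      + \frac{\left( 2 + 6 c_3^2 \right) \sin\theta}{\alpha_3(M_I)} \right] ,
\end{aligned}
\tag{36}
$$
and scalar masses:
$$
\begin{aligned}
m^2_{Q_u} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_1^2 \right) \cos^2\theta \right] , \\
m^2_{d^c} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_2^2 \right) \cos^2\theta \right] , \\
m^2_{u^c_i} &= m^2_{3/2} \left( 1 - 3\, \Theta_i^2 \cos^2\theta \right) , \\
m^2_{e^c} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_2^2 \right) \cos^2\theta \right] , \\
m^2_{L_e} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_3^2 \cos^2\theta \right) \right] .
\end{aligned}
\tag{37}
$$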
Here $i = 1, 2, 3$ labels the three complex compact dimensions. Thus $u^c_i$ are states $C^9_i$
coming from open strings starting and ending on D9-branes. These fields behave quite
similarly to untwisted sectors of perturbative heterotic orbifolds. It is then natural to use
this index as a family index. For example we will take $u^c_1 = t^c$, $u^c_2 = c^c$ and
$u^c_3 = u^c$. On the other hand, quarks of type $Q_u$, $d^c$ are states $C^{95_1}$ and $C^{95_2}$,
respectively, whereas leptons of type $e^c$ and $L_e$ are states $C^{95_2}$ and $C^{5_1 5_2}$,
respectively. As discussed below Eq. (13) these soft terms are also generically non-universal.
It is worth noticing here that, due to the family index i, a potential problem with
flavor-changing neutral currents (FCNC) may arise. Being conservative (see footnote 4), we can
avoid it by imposing $\Theta_2 = \Theta_3$. This constraint will be used in Section 4 when
discussing neutralino–proton cross sections in this scenario.
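For the second assignment, the value of $m^2_{L_e}$ must be replaced by
$$ m^2_{L_e} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_1^2 \right) \cos^2\theta \right] . \tag{38} $$
For the third assignment the masses of quarks of type $u^c$ and $d^c$ must be exchanged in (37),
i.e.,
$$
\begin{aligned}
m^2_{u^c} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_2^2 \right) \cos^2\theta \right] , \\
m^2_{d^c_i} &= m^2_{3/2} \left( 1 - 3\, \Theta_i^2 \cos^2\theta \right) .
\end{aligned}
\tag{39}
$$
For the fourth assignment, both modifications (38) and (39) must be included in Eq. (37).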
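Finally, the Higgs masses corresponding to the four combinations obtained in Eqs. (16)–(19)
are
$$
\begin{aligned}
m^2_{H_2} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_1^2 \right) \cos^2\theta \right] , \\
m^2_{H_1} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_3^2 \cos^2\theta \right) \right] ,
\end{aligned}
\tag{40}
$$
$$ m^2_{H_2} = m^2_{H_1} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_1^2 \right) \cos^2\theta \right] , \tag{41} $$
$$
\begin{aligned}
m^2_{H_2} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_3^2 \cos^2\theta \right) \right] , \\
m^2_{H_1} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_1^2 \right) \cos^2\theta \right] ,
\end{aligned}
\tag{42}
$$
$$ m^2_{H_2} = m^2_{H_1} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_3^2 \cos^2\theta \right) \right] , \tag{43} $$
respectively. For example $H_2$, $H_1$ in (40) are states $C^{95_1}$, $C^{5_1 5_2}$.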
4 Recall that the relevant mass terms are the low-energy ones, not those generated at the string scale. As
discussed, e.g., in [30], one has to do the low-energy running of the scalar masses, and, for the squark case, for
gluino masses heavier than (or of the same order as) the scalar masses, there are large flavor-independent gluino
loop contributions which are the dominant source of scalar masses.
The analysis of Yukawa couplings and trilinear parameters can be carried out as in the
previous subsection, assuming again three copies of quarks and leptons $Q_u$, $d^c$, $L_e$, $e^c$. As
an example let us consider the second assignment of Table 1 with Higgses as in (17) with
an opposite sign for $Q_2$. From Eq. (A.4) we deduce that the only allowed type of coupling,
which corresponds to $C^{95_1} C^{95_1} C^9_1$, is $H_2 Q_u^a t^c$. The result for Yukawa couplings
is then
$$ Y_u = g_3\, \hat{Y} , \qquad Y_{d,e} = 0 , \tag{44} $$
where $\hat{Y}$ is defined as
$$ \hat{Y} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix} . \tag{45} $$
This structure for Yukawa matrices and its viability has been studied in [38]. Other results
for Yukawa couplings arise for the other interesting scenarios. For example, for the first
assignment of Table 1 with Higgses as in (16), whereas $Y_u$ and $Y_e$ are still as above, $Y_d$ has
a democratic matrix structure as in (25). This structure was also analyzed in [38].
Concerning the trilinear parameters, these can be computed using again (A.3). For
instance, Yukawa couplings (44) have associated
$$ A_u = -\sqrt{3}\, m_{3/2} \sin\theta\, \hat{Y} , \qquad A_{d,e} = 0 . \tag{46} $$
3.1.3. Scenario without Dp1 -brane
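In this scenario the gauge groups of the standard model arise only from Dp3-branes,
which are D9-branes, and Dp2-branes, which are $D5_1$-branes. Then, following the
discussion in Section 2.1.3, quarks of type $Q_u$ are states $C^{95_1}$, quarks of type $d^c$ are
states $C^{95_2}$, and leptons of type $L_e$ are states $C^{5_1 5_2}$. On the other hand, quarks of
type $u^c$ are states $C^9_i$ and leptons of type $e^c$ are states $C^{5_1}_i$. As mentioned before, it
is natural to use this index i as a family index. The only combination of quantum numbers
which is now allowed for Higgses is (19), and therefore $H_1$, $H_2$ are states $C^{5_1 5_2}$.
Using again Eqs. (A.1) and (A.2) we can obtain the soft terms for this scenario. In
particular, the gaugino masses are
$$
\begin{aligned}
M_2 &= \sqrt{3}\, m_{3/2}\, \Theta_1 \cos\theta , \qquad M_3 = \sqrt{3}\, m_{3/2} \sin\theta , \\
M_Y &= \sqrt{3}\, m_{3/2}\, \alpha_Y(M_I) \left[ \frac{\Theta_1 \cos\theta}{\alpha_2(M_I)}
      + \frac{(2/3) \sin\theta}{\alpha_3(M_I)} \right] ,
\end{aligned}
\tag{47}
$$
and the scalar masses are
$$
\begin{aligned}
m^2_{Q_u} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_1^2 \right) \cos^2\theta \right] , \\
m^2_{d^c} &= m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_2^2 \right) \cos^2\theta \right] , \\
m^2_{L_e} &= m^2_{H_2} = m^2_{H_1} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( \sin^2\theta + \Theta_3^2 \cos^2\theta \right) \right] , \\
m^2_{u^c_i} &= m^2_{3/2} \left( 1 - 3\, \Theta_i^2 \cos^2\theta \right) , \qquad
m^2_{e^c} = m^2_{3/2} \left( 1 - 3 \sin^2\theta \right) , \\
m^2_{\mu^c} &= m^2_{3/2} \left( 1 - 3\, \Theta_3^2 \cos^2\theta \right) , \qquad
m^2_{\tau^c} = m^2_{3/2} \left( 1 - 3\, \Theta_2^2 \cos^2\theta \right) ,
\end{aligned}
\tag{48}
$$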
where the following assignments have been used for the leptons: $e^c$ is a state $C^{5_1}_1$,
$\mu^c$ is a state $C^{5_1}_2$, and finally $\tau^c$ is a state $C^{5_1}_3$. As in the previous two
scenarios these soft terms are also generically non-universal.
Concerning Yukawa couplings, these are allowed for leptons in this scenario since
$C^{5_1 5_2} C^{5_1 5_2} C^{5_1}_3$ exists. Assuming three copies of leptons $L_e$, one obtains
$$ Y_e = g_2\, \hat{Y} , \tag{49} $$
where $\hat{Y}$ has been defined in (45) and $g_2$ is the gauge coupling associated to the
$SU(2)_L$ in the $D5_1$-brane. The corresponding trilinear parameters are
$$ A_e = -\sqrt{3}\, m_{3/2}\, \Theta_1 \cos\theta\, \hat{Y} . \tag{50} $$
Notice however that the above assignment for leptons may be problematic concerning
FCNC since generically $m^2_{e^c} \neq m^2_{\mu^c}$. We can avoid that potential problem by choosing
$\tau^c$ as a state $C^{5_1}_1$ and, e.g., $e^c$ as a state $C^{5_1}_2$, $\mu^c$ as a state $C^{5_1}_3$. Then
we have to choose $u^c_1 = t^c$. Now, imposing $\Theta_2 = \Theta_3$, FCNC will not be present.
This constraint will be used in Section 4 when discussing neutralino–proton cross sections
in this scenario. Instead of the matrix structure (49) there will be a new matrix with the
non-vanishing entries in the second column.
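The effect of the reassignment can be made explicit with a short sketch. It encodes the mass pattern of the $C^{5_i}_j$ states as read off from Eq. (48): $1 - 3\sin^2\theta$ for $j = i$ and $1 - 3\Theta_k^2\cos^2\theta$ with $k \neq i, j$ otherwise. The function name and the sample parameter values are illustrative assumptions, and masses are in units of $m_{3/2}$.

```python
import math


def m2_C5(m32, theta, Thetas, i, j):
    """Squared soft mass of a state C^{5_i}_j, following the pattern of Eq. (48)."""
    if i == j:
        return m32**2 * (1.0 - 3.0 * math.sin(theta) ** 2)
    k = 6 - i - j  # the remaining index among {1, 2, 3}
    return m32**2 * (1.0 - 3.0 * Thetas[k - 1] ** 2 * math.cos(theta) ** 2)


theta = 0.7                                # sample goldstino angle
Theta_23 = math.sqrt((1.0 - 0.6**2) / 2.0)
Thetas = (0.6, Theta_23, Theta_23)         # Theta_2 = Theta_3, sum of squares = 1

# Original assignment: e^c = C^{5_1}_1, mu^c = C^{5_1}_2.
split_original = m2_C5(1.0, theta, Thetas, 1, 1) - m2_C5(1.0, theta, Thetas, 1, 2)

# Modified assignment: e^c = C^{5_1}_2, mu^c = C^{5_1}_3.
split_modified = m2_C5(1.0, theta, Thetas, 1, 2) - m2_C5(1.0, theta, Thetas, 1, 3)

print("m2(e^c) - m2(mu^c), original assignment:", split_original)
print("m2(e^c) - m2(mu^c), modified assignment:", split_modified)
```

With $\Theta_2 = \Theta_3$ the first two families are degenerate only in the modified assignment, which is the reason for moving $\tau^c$ to the state $C^{5_1}_1$.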
On the other hand, Yukawa couplings for quarks of type u are vanishing since couplings
$C^{5_1 5_2} C^{95_1} C^9_i$ are forbidden. However, couplings $C^{5_1 5_2} C^{95_1} C^{95_2}$ are allowed
and therefore Yukawa couplings for quarks of type d exist. The matrix structure is as in (25).
In any case, as discussed for other scenarios in Section 3.1.1, it is difficult to imagine how
this scenario may give rise to the observed fermion mass hierarchies with masses of quarks of
type u arising from non-renormalizable terms.
3.2. Embedding all gauge groups within the same set of Dp-branes
As discussed in Section 2.2, the D-brane model constructed in [6], where all gauge
groups are embedded in 3-branes, is very interesting. We will analyze here the soft terms
and Yukawa couplings of the model.
Let us recall that $Dp_{3,2,1}$-branes are in this scenario D3-branes, and Dq-branes are
$D7_{1,2,3}$-branes. The distribution of matter is as in Fig. 1 with all D3-branes on top of each
other. Since under a T-duality transformation with respect to the three compact dimensions
the 9-branes transform into 3-branes and the $5_i$-branes into $7_i$-branes, the formulas for
soft terms and Yukawas will be identical to the ones in Appendix A, (A.1)–(A.4), with the
obvious replacements $9 \to 3$, $5_i \to 7_i$ everywhere. This implies the following gaugino
masses:
$$ M_3 = M_2 = M_Y = \sqrt{3}\, m_{3/2} \sin\theta , \tag{51} $$
and scalar masses:
$$
\begin{aligned}
m^2_{Q_u^i} &= m^2_{u^c_i} = m^2_{H_2^i} = m^2_{3/2} \left( 1 - 3\, \Theta_i^2 \cos^2\theta \right) , \\
m^2_{d^c_i} &= m^2_{L_e^i} = m^2_{e^c_i} = m^2_{H_1^i} = m^2_{3/2} \left[ 1 - \frac{3}{2} \left( 1 - \Theta_i^2 \right) \cos^2\theta \right] .
\end{aligned}
\tag{52}
$$
Note that $Q_u$, $u^c$, and $H_2$ are states $C^3_i$, whereas $d^c$, $e^c$, $L_e$ and $H_1$ are states
$C^{37_i}$. Thus due to the index $i = 1, 2, 3$, three families arise in this model in a natural
way. For example, we will take $u^c_1 = u^c$, $u^c_2 = c^c$, $u^c_3 = t^c$, etc. In order to avoid
FCNC we may impose $\Theta_1 = \Theta_2$. It is also worth noticing that universality can be
obtained in the dilaton ($\sin^2\theta = 1$) and overall modulus ($\Theta_{1,2,3} = 1/\sqrt{3}$) limits,
unlike the scenarios in Section 3.1.
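The universality claim can be checked directly from Eq. (52). The sketch below evaluates the six squared masses (three $C^3_i$ and three $C^{37_i}$ states) and measures their spread in the two limits and at a generic point. The function names and the generic sample point are assumptions of the example; masses are in units of $m_{3/2}$.

```python
import math


def m2_C3(m32, theta, Theta_i):
    """Squared soft mass of a C^3_i state (Q_u, u^c, H_2), first line of Eq. (52)."""
    return m32**2 * (1.0 - 3.0 * Theta_i**2 * math.cos(theta) ** 2)


def m2_C37(m32, theta, Theta_i):
    """Squared soft mass of a C^{3 7_i} state (d^c, L_e, e^c, H_1), second line of Eq. (52)."""
    return m32**2 * (1.0 - 1.5 * (1.0 - Theta_i**2) * math.cos(theta) ** 2)


def mass_spread(m32, theta, Thetas):
    """Difference between the largest and smallest squared scalar mass of Eq. (52)."""
    masses = [m2_C3(m32, theta, t) for t in Thetas] + [m2_C37(m32, theta, t) for t in Thetas]
    return max(masses) - min(masses)


overall = (1.0 / math.sqrt(3.0),) * 3

spread_dilaton = mass_spread(1.0, math.pi / 2.0, (1.0, 0.0, 0.0))  # sin^2(theta) = 1
spread_modulus = mass_spread(1.0, 0.4, overall)                    # Theta_i = 1/sqrt(3)
spread_generic = mass_spread(1.0, 0.4, (1.0, 0.0, 0.0))            # generic point

print(spread_dilaton, spread_modulus, spread_generic)
```

For $\Theta_i = 1/\sqrt{3}$ both lines of Eq. (52) reduce to $m^2_{3/2}\sin^2\theta$, so the spread vanishes for any value of $\theta$.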
Let us now analyze the Yukawa couplings of the model [6]. Couplings of the type
$C^3_1 C^3_2 C^3_3$ are allowed. Assuming that the physical Higgs $H_2$ corresponds to $C^3_1$, the
following couplings exist: $g H_2 Q_c t^c$ and $g H_2 Q_t c^c$, i.e.,
$$ Y_u = g\, \hat{Y} , \tag{53} $$
where $\hat{Y}$ is defined as
$$ \hat{Y} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} . \tag{54} $$
The corresponding trilinear parameters are
$$ A_u = -\sqrt{3}\, m_{3/2} \sin\theta\, \hat{Y} . \tag{55} $$
Let us remark that in the presence of discrete torsion $g H_2 Q_t t^c$ may also be present [6].
On the other hand, couplings of the type $C^3_i C^{37_i} C^{37_i}$ are also allowed. Assuming that the
physical Higgs $H_1$ corresponds to $C^{37_3}$, the coupling $g Q_t b^c H_1$ also exists, i.e.,
$$ Y_b = g\, \tilde{Y} , \tag{56} $$
where $\tilde{Y}$ is defined as
$$ \tilde{Y} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} . \tag{57} $$
The trilinear parameters are
$$ A_b = -\sqrt{3}\, m_{3/2} \sin\theta\, \tilde{Y} . \tag{58} $$
More involved couplings with off-diagonal entries in the matrix for quarks of type d are
possible in some circumstances [6]. Finally, renormalizable Yukawa couplings for leptons
are not present since they are of the type $C^{37} C^{37} C^{37}$ and these are not allowed.
4. Neutralino–nucleon cross sections in D-brane scenarios

Recently there has been some theoretical activity [43–49] analyzing the compatibility
of regions in the parameter space of the MSSM with the sensitivity of current (DAMA
[20], CDMS [21], HEIDELBERG–MOSCOW [22], HDMS prototype [23], UKDMC [24],
CANFRANC [25]) and projected (GENIUS [26], DAMA 250 kg [20], CDMS Soudan
[21], etc.) dark matter detectors. In particular, DAMA and CDMS are sensitive to a
neutralino–nucleon cross section $\sigma_{\tilde\chi_1^0 - p}$ in the range of $10^{-6}$–$10^{-5}$ pb.
Working in the supergravity framework for the MSSM with universal soft terms, it was pointed
out in [43,44,46,49] that the large $\tan\beta$ regime allows regions where the above mentioned
range of $\sigma_{\tilde\chi_1^0 - p}$ is reached. Besides, working with non-universal soft scalar
masses $m_\alpha$, $\sigma_{\tilde\chi_1^0 - p} \approx 10^{-6}$ pb was also found for small values of
$\tan\beta$, if $m_\alpha$ fulfill some special conditions [43,44,46]. In particular, this was obtained
for $\tan\beta \gtrsim 25$ ($\tan\beta \gtrsim 4$) working with universal (non-universal) soft terms
in [46]. The case of non-universal gaugino masses was also analyzed in [47] with interesting
results.
The above analyses were performed assuming universality (and non-universality) of the
soft breaking terms at the unification scale, $M_{GUT} \approx 10^{16}$ GeV, as is usually done in
the MSSM literature. However, inspired by superstrings, where the unification scale may
be smaller, the sensitivity of the neutralino–nucleon cross section to the value of the initial
scale for the running of the soft breaking terms was analyzed in [18]. Working in the
supergravity context with universal soft terms, the result was that the smaller the scale
is, the larger the cross section becomes. In particular, by taking $10^{10-12}$ GeV rather than
$10^{16}$ GeV for the initial scale, the cross section increases substantially,
$\sigma_{\tilde\chi_1^0 - p} \approx 10^{-6}$–$10^{-5}$ pb.
The natural extension of this analysis is to carry it out with explicit D-brane
constructions. As mentioned in Section 3.1.2, the first attempts to study dark matter within
these constructions were carried out in [39–41] for the unification scale as the initial scale
and in [42] for an intermediate scale as the initial scale in the case of dilaton dominance.
Here we will take into account the crucial issue of the D-brane origin of the $U(1)_Y$ and
its consequences on the matter distribution and soft terms in these scenarios. Thus we will
analyze the D-brane scenarios introduced in Section 2, using their soft terms computed in
Section 3. The fact that in these scenarios intermediate initial scales and/or non-universal
soft terms are possible allows us to think that large cross sections, in the small $\tan\beta$
regime, could be obtained in principle. Let us recall that this can be understood from the
variation in the value of $\mu$, i.e., the Higgs mixing parameter which appears in the
superpotential $W = \mu H_1 H_2$. Both intermediate scales and non-universality can produce a
decrease in the value of $\mu$. When this occurs, the Higgsino components, $N_{13}$ and $N_{14}$, of
the lightest neutralino
$$ \tilde\chi_1^0 = N_{11} \tilde{B}^0 + N_{12} \tilde{W}^0 + N_{13} \tilde{H}_1^0 + N_{14} \tilde{H}_2^0 , \tag{59} $$
increase and therefore the scattering channels through Higgs exchange become important.
As a consequence the spin independent cross section also increases.
Before entering into details let us remark that we will work with the usual formulas for the
elastic scattering of relic LSPs on protons and neutrons that can be found in the literature
[19]. In particular, we will follow the re-evaluation of the rates carried out in [45], using
their central values for the hadronic matrix elements.
Let us discuss now the parameter space of our D-brane scenarios. As usual in
supersymmetric theories, the requirement of correct electroweak breaking leaves us
(modulo the sign of $\mu$) with the following parameters: the soft breaking terms (scalar
and gaugino masses and trilinear parameters) and $\tan\beta$. Although formulas for the soft
terms obtained in Section 3 leave us in principle with five parameters, $m_{3/2}$, $\theta$ and
$\Theta_i$ with $i = 1, 2, 3$, due to the relation $\sum_i |\Theta_i|^2 = 1$ only four of them are
independent. In our analysis we vary the parameters $\theta$ and $\Theta_i$ in the whole allowed
range, $0 \leq \theta < 2\pi$, $-1 \leq \Theta_i \leq 1$.
For the gravitino mass we take $m_{3/2} \leq 300$ GeV. Concerning Yukawa couplings we will
fix their values imposing the correct fermion mass spectrum at low energies, i.e., we are
assuming that Yukawa structures of D-brane scenarios give rise to those values.
Fig. 4. Scatter plot of the neutralino–proton cross section as a function of the neutralino mass for
the scenario with three different sets of Dp-branes. The string scale is $M_I = 10^{12}$ GeV. DAMA and
CDMS current limits and projected GENIUS limits are shown.
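A scan over the goldstino parameters described above can be organized as in the following sketch. It draws $\theta$ in $[0, 2\pi)$ and real $\Theta_i$ on the unit sphere so that $\sum_i \Theta_i^2 = 1$ holds exactly. The sampling measure (uniform on the sphere), the restriction to real $\Theta_i$, the random seed and the function name are assumptions of the example; the text does not specify how the scan was actually performed.

```python
import math
import random


def sample_goldstino_parameters(rng):
    """Draw (theta, (Theta_1, Theta_2, Theta_3)) with the Theta_i on the unit sphere."""
    theta = 2.0 * math.pi * rng.random()  # random() lies in [0, 1)
    while True:
        # Normalized Gaussian vector: uniformly distributed direction in three dimensions.
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            break
    return theta, tuple(x / norm for x in v)


rng = random.Random(0)  # fixed seed so that the scan is reproducible
points = [sample_goldstino_parameters(rng) for _ in range(1000)]

theta0, Thetas0 = points[0]
print("first point: theta =", theta0, " Thetas =", Thetas0)
```

Constraints such as $\Theta_2 = \Theta_3$, used below to avoid FCNC, would reduce the sampled space further and are not imposed here.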
We will analyze first the scenario of Section 2.1.1 with three different sets of Dp-branes,
where the standard model gauge groups live. Since for the third and fourth assignments of
quantum numbers in Table 1 $M_I \approx 3 \times 10^8$ GeV, and therefore $m_{3/2}$ is too low to be
phenomenologically interesting, we will consider only the soft masses associated to the
first and second assignments, i.e., the eight possible combinations given by Eqs. (12)–(14),
(20)–(23). The discussion of the corresponding trilinear parameters can be found below
Eq. (23).
In particular, the cross sections associated to combination (12), (13) and (20) are shown
in Fig. 4 for $M_I = 10^{12}$ GeV (i.e., the example studied in Fig. 2). The other possible
combinations give rise to similar results. Fig. 4 displays a scatter plot of
$\sigma_{\tilde\chi_1^0 - p}$ as a function of the LSP mass $m_{\tilde\chi_1^0}$ for a scanning of the
parameter space discussed above. Two different values of $\tan\beta$, 10 and 15, are shown.
Although this and the other figures below have been obtained using negative values of $\mu$,
for positive values the corresponding figures are equal. Notice that the spectrum of
supersymmetric particles is invariant under the transformation
$\mu, A, M \to -\mu, -A, -M$. Since the shift $\theta \to \theta + \pi$ implies for the
soft terms $M \to -M$, $A \to -A$ and $m \to m$, a figure with positive $\mu$ will be equal to
a figure with negative $\mu$ shifting $\theta \to \theta + \pi$. We have included in the figures
LEP and Tevatron bounds on supersymmetric masses. They forbid, e.g., values of $m_{3/2}$ smaller
than 170 GeV. Although bounds coming from CLEO $b \to s\gamma$ branching ratio measurements
are not included in the figures, we have checked explicitly that their qualitative patterns are
not modified when such bounds are considered. It is worth noticing that for $\tan\beta = 10$
there are regions of the parameter space consistent with DAMA limits. In fact, we have checked
that $\tan\beta > 5$ is enough to obtain compatibility with DAMA. Since the larger $\tan\beta$ is,
the larger the cross section becomes, for $\tan\beta = 15$ these regions increase.
Fig. 5. The same as in Fig. 4 but for the string scale $M_I = 5 \times 10^{15}$ GeV.
As discussed below Eq. (6), larger values of the string scale may be obtained
with $\alpha_1(M_I) > 0.1$. In particular we show the example where $M_I = 5 \times 10^{15}$ GeV,
corresponding to $\alpha_1(M_I) \approx 1$, in Fig. 5. Since the larger the scale is, the smaller the
cross section becomes, now the cross sections decrease with respect to the previous case. In
particular, $\tan\beta > 10$ is necessary in order to have compatibility with DAMA. On the other
hand, as discussed above, in the MSSM with universal soft terms at the unification scale
(which is close to the above $M_I$), $\tan\beta \gtrsim 20$ was needed to obtain compatibility.
Clearly the non-universality of the soft terms in this string scenario plays a crucial role
in increasing the cross sections.
Let us finally recall that both figures are obtained taking $m_{3/2} \leq 300$ GeV, which
corresponds to squark masses $m_{\tilde q} \lesssim 500$ GeV at low energies. We have checked that
larger values of $m_{3/2}$ produce cross sections below DAMA limits. In particular, the right hand
side and bottom of the figures will also be filled with points. Cross sections below projected
GENIUS limits will appear in both figures.
Although the scenario of Section 2.1.2 where Dp1 = Dp3 has soft terms different from
the previous scenario (see Section 3.1.2), the qualitative results concerning
neutralino–proton cross sections will be similar. This is shown in Fig. 6 for the example
discussed below Eq. (7) where $M_I = 8 \times 10^9$ GeV. We use the soft terms given by (36),
(37) and (40) with the constraint $\Theta_2 = \Theta_3$ in order to avoid FCNC, as discussed in
Section 3.1.2. Thus, apart from $\tan\beta$, only three independent parameters are left: $m_{3/2}$,
$\theta$ and one of the $\Theta_i$'s. Other combinations of soft terms do not modify our
conclusions. Note that there are regions of the parameter space consistent with DAMA limits,
as in Fig. 4. In this scenario $\tan\beta > 5$ is also enough to obtain such a consistency.
The scenario without Dp1-brane studied in Section 2.1.3 is shown in Fig. 7. We take the
string scale $M_I = 10^{16}$ GeV and the soft terms given in Section 3.1.3, with the constraint
$\Theta_2 = \Theta_3$ to avoid FCNC as discussed there. Since the string scale is large the
results are qualitatively similar to the ones in Fig. 5.
Fig. 6. The same as in Fig. 4 but for the scenario with Dp1 = Dp3. The string scale is
$M_I = 8 \times 10^9$ GeV.
Fig. 7. The same as in Fig. 4 but for the scenario without Dp1-brane. The string scale is
$M_I = 10^{16}$ GeV.
Let us finally analyze the scenario of Section 2.2 where all gauge groups are embedded
within the same set of D3-branes. In this scenario the soft terms are given by (51), (52),
(55) and (58). Since we will take $\Theta_1 = \Theta_2$ in order to avoid FCNC, there will be in
our analysis only three independent parameters: $m_{3/2}$, $\theta$ and one of the $\Theta_i$'s, say
$\Theta_3$. The cross sections are then shown in Fig. 8 for $M_I = 10^{10}$ GeV, i.e., the example
studied in Fig. 3. We consider two cases with $\tan\beta = 10$ and $\tan\beta = 25$. Now
$\tan\beta > 20$ is necessary to obtain regions consistent with DAMA limits. This is to be
compared with the previous scenarios with intermediate string scales where regions consistent
with DAMA were obtained for $\tan\beta > 5$. As discussed in the context of the MSSM in [18]
this is due to the different values of the $\alpha$'s at the string scale in both types of
scenarios. Unlike the previous ones, here gauge couplings are unified.
Fig. 8. The same as in Fig. 4 but for the scenario with all gauge groups embedded within the same
set of D3-branes, in such a way that gauge couplings unify at $M_I = 10^{10}$ GeV.
Let us recall that this scenario is the only one where universality can be obtained. This
is the case for the dilaton limit ($\sin^2\theta = 1$) and the overall modulus limit
($\Theta_{1,2,3} = 1/\sqrt{3}$). The two central curves in the figures correspond precisely to this
situation. Notice that the deviation from universality may increase or decrease the cross
sections, as shown in the figures, depending on the values of the parameters $\theta$ and
$\Theta_3$ chosen.
Before concluding let us discuss very briefly the effect of relic neutralino density bounds
on cross sections. The most robust evidence for the existence of dark matter comes from
relatively small scales. Lower limits inferred from the flat rotation curves of spiral galaxies
[19,50] are $\Omega_{\rm halo} \gtrsim 10\,\Omega_{\rm vis}$ or $\Omega_{\rm halo} h^2 \gtrsim 0.01$–$0.05$, where $h$ is the reduced Hubble
constant. On the opposite side, observations at large scales, $(6$–$20)\,h^{-1}$ Mpc, have provided
estimates of $\Omega_{\rm CDM} h^2 \approx 0.1$–$0.6$ [51], but values as low as $\Omega_{\rm CDM} h^2 \approx 0.02$ have also been
quoted [52]. Taking up-to-date limits on $h$, the baryon density from nucleosynthesis and
overall matter-balance analysis one is able to obtain a favoured range, $0.01 \lesssim \Omega_{\rm CDM} h^2 \lesssim
0.3$ (at $2\sigma$ CL) [53,54]. Note that conservative lower limits in the small and large scales
are of the same order of magnitude.
In this work the expected neutralino cosmological relic density has been computed
according to well known techniques (see [19]). Although bounds coming from them are not
included in the above figures we have checked explicitly that their qualitative patterns are
not modified for most of the regions in Figs. 4–7 when such bounds are considered.
However the analysis of regions of the parameter space consistent with DAMA limits
is more delicate. From the general behaviour $\Omega h^2 \sim 1/\sigma_{\rm ann}$, where $\sigma_{\rm ann}$ is the cross
section for annihilation of neutralinos, it is expected that such high neutralino–proton cross
sections as those presented above will then correspond to relatively low relic neutralino
densities. We have seen that this is in fact the case. On these grounds,$^5$ most of those
points are at the border of the range of validity or below. On the other hand, it is worth
remarking that we are clearly below the range of validity for the whole of the regions in the
scenario corresponding to Fig. 8.
5. Conclusions
In this paper we have analyzed different phenomenological aspects of D-brane scenarios.
First, assuming that the $SU(3)_c$, $SU(2)_L$ and $U(1)_Y$ groups of the standard model come
from different sets of D$p$-branes, intermediate values for the string scale $M_I \approx 10^{10-12}$
GeV are obtained in a natural way. The reason is the following. Due to the D-brane origin
of the $U(1)_Y$ gauge group, the hypercharge is a linear combination of different $U(1)$
charges. Thus, in order to reproduce the low-energy data, i.e., the values of the gauge
couplings deduced from CERN $e^+e^-$ collider LEP experiments, intermediate values for
$M_I$ are necessary. On the other hand, there is also the possibility that the gauge groups of
the standard model come from the same set of D$p$-branes. In fact explicit models with this
property can be found in the literature. Again the $U(1)_Y$ gauge group has a D-brane origin
and therefore the normalization factor of the hypercharge is not the usual one in GUTs.
The presence of additional doublets or triplets allows one to obtain intermediate values for the
string scale.
Second, taking into account the matter assignment to the different Dp-branes of the
above scenarios, we have derived Yukawa couplings and soft supersymmetry-breaking
terms. The analysis of the soft terms has been carried out under the assumption of
dilaton/moduli supersymmetry breaking, and they turn out to be generically non-universal.
Finally, we have computed the neutralino–nucleon cross section of these D-brane
scenarios. This computation is of particular interest since the lightest neutralino is
a natural candidate for dark matter in supersymmetric theories. Using the previously
obtained soft terms, and taking into account that the string scale $M_I$ is the initial scale
for their running, we have found regions in the parameter space of the D-brane scenarios
with cross sections in the range of $10^{-6}$–$10^{-5}$ pb. For instance, this can be obtained
for $\tan\beta > 5$. The above mentioned range is precisely the one where current dark matter
detectors, e.g., DAMA and CDMS, are sensitive.
Acknowledgements
D.G. Cerdeño acknowledges the financial support of the Comunidad de Madrid through
a FPI grant. The work of S. Khalil was supported by PPARC. The work of C. Muñoz was
supported in part by the Ministerio de Ciencia y Tecnología, and the European Union under
contract HPRN-CT-2000-00148. The work of E. Torrente-Lujan was supported in part by
the Ministerio de Ciencia y Tecnología.
5 Of course there is always the possibility that not all the dark matter in our Galaxy consists of neutralinos. This would
modify the analysis since, e.g., $\Omega_{\tilde\chi} < \Omega_{\rm CDM}$.
Appendix A
We summarize in this appendix the formulas for the soft terms [33] and Yukawa
couplings [35] in D-brane constructions, using one set of 9-branes and three sets of 5-branes, $5_i$. Assuming vanishing cosmological constant and neglecting phases, the gaugino
masses are given by
$$M_9 = \sqrt{3}\, m_{3/2} \sin\theta, \qquad M_{5_i} = \sqrt{3}\, m_{3/2}\, \Theta_i \cos\theta, \tag{A.1}$$
where $M_9$ ($M_{5_i}$) are the masses of gauginos coming from open strings starting and ending
on 9 ($5_i$)-branes. The scalar masses are given by
$$\begin{aligned}
m^2_{C_i^{9}} = m^2_{C_j^{5_k}} &= m_{3/2}^2 \left(1 - 3\,\Theta_i^2 \cos^2\theta\right), \\
m^2_{C_i^{5_i}} &= m_{3/2}^2 \left(1 - 3 \sin^2\theta\right), \\
m^2_{C^{95_i}} &= m_{3/2}^2 \left[1 - \tfrac{3}{2}\left(1 - \Theta_i^2\right) \cos^2\theta\right], \\
m^2_{C^{5_i 5_j}} &= m_{3/2}^2 \left[1 - \tfrac{3}{2}\left(\sin^2\theta + \Theta_k^2 \cos^2\theta\right)\right],
\end{aligned} \tag{A.2}$$
where $C_i^{9}$ denote matter fields coming from open strings starting and ending on 9-branes
(the index $i$, which labels the three complex compact dimensions, is very useful in order to
generate three families as we discuss in the text), $C_i^{5_i}$ and $C_j^{5_i}$ with $i \neq j$ are analogous
to the previous ones but with 9-branes replaced by $5_i$-branes, $C^{95_i}$ denote matter fields
coming from open strings starting (ending) on a 9-brane and ending (starting) on a $5_i$-brane, and $C^{5_i 5_j}$ with $i \neq j$ come from open strings starting on one type of $5_i$-brane and ending
on a different type of $5_j$-brane. Finally the results for the trilinear parameters are
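$$\begin{aligned}
A_{C_1^{9} C_2^{9} C_3^{9}} = A_{C_i^{9} C^{95_i} C^{95_i}} &= -\sqrt{3}\, m_{3/2} \sin\theta, \\
A_{C_1^{5_i} C_2^{5_i} C_3^{5_i}} = A_{C_i^{5_i} C^{95_i} C^{95_i}} = A_{C_j^{5_i} C^{5_i 5_k} C^{5_i 5_k}} &= -\sqrt{3}\, m_{3/2}\, \Theta_i \cos\theta, \\
A_{C^{5_1 5_2} C^{5_3 5_1} C^{5_2 5_3}} &= \frac{\sqrt{3}}{2}\, m_{3/2} \Big(\sin\theta - \sum_i \Theta_i \cos\theta\Big), \\
A_{C^{5_i 5_j} C^{95_i} C^{95_j}} &= \frac{\sqrt{3}}{2}\, m_{3/2} \left[\cos\theta\, (\Theta_k - \Theta_i - \Theta_j) - \sin\theta\right],
\end{aligned} \tag{A.3}$$
with $i, j, k = 1, 2, 3$ and $i \neq j \neq k \neq i$ in the above equations. The angle $\theta$ and the $\Theta_i$, with
$\sum_i |\Theta_i|^2 = 1$, just parameterize the direction of the goldstino in the $S$, $T_i$ field space.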
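As a rough numerical illustration of (A.1) and (A.2), the sketch below evaluates a few of the soft masses for a given goldstino direction. It assumes the reading $M_9 = \sqrt{3}\,m_{3/2}\sin\theta$, $M_{5_i} = \sqrt{3}\,m_{3/2}\Theta_i\cos\theta$, $m^2_{C_i^9} = m_{3/2}^2(1 - 3\Theta_i^2\cos^2\theta)$, $m^2_{C_i^{5_i}} = m_{3/2}^2(1 - 3\sin^2\theta)$ and $m^2_{C^{95_i}} = m_{3/2}^2[1 - \tfrac32(1-\Theta_i^2)\cos^2\theta]$; the function name `soft_masses` and the dictionary keys are hypothetical and chosen only for this example.

```python
import math

def soft_masses(theta, Theta, m32=1.0):
    """Evaluate some soft terms of (A.1)-(A.2), under the reading stated above.

    theta : goldstino angle
    Theta : (Theta_1, Theta_2, Theta_3), assumed real with sum of squares equal to 1
    m32   : gravitino mass (sets the units)
    """
    if abs(sum(t * t for t in Theta) - 1.0) > 1e-9:
        raise ValueError("Theta_i must satisfy sum_i Theta_i^2 = 1")
    s, c = math.sin(theta), math.cos(theta)
    out = {
        "M9": math.sqrt(3.0) * m32 * s,             # gauginos on the 9-branes
        "m2_C5i_i": m32 ** 2 * (1.0 - 3.0 * s * s)  # C_i^{5_i}, the same for every i
    }
    for i in range(3):
        label = i + 1
        out[f"M5_{label}"] = math.sqrt(3.0) * m32 * Theta[i] * c
        out[f"m2_C9_{label}"] = m32 ** 2 * (1.0 - 3.0 * Theta[i] ** 2 * c * c)
        out[f"m2_C95_{label}"] = m32 ** 2 * (1.0 - 1.5 * (1.0 - Theta[i] ** 2) * c * c)
    return out

# Dilaton limit: sin(theta) = 1
dilaton = soft_masses(math.pi / 2, (1 / math.sqrt(3),) * 3)
# Overall modulus limit: theta = 0, Theta_i = 1/sqrt(3)
modulus = soft_masses(0.0, (1 / math.sqrt(3),) * 3)
```

In the dilaton limit the 9-brane gaugino mass and the $C_i^9$, $C^{95_i}$ scalar masses come out independent of the $\Theta_i$, while in the overall modulus limit the three $M_{5_i}$ coincide.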
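On the other hand, the renormalizable Yukawa couplings which are allowed are given
by
$$\begin{aligned}
\mathcal{L}_Y = {} & g_9 \Big( C_1^{9} C_2^{9} C_3^{9} + C^{5_1 5_2} C^{5_3 5_1} C^{5_2 5_3} + \sum_{i=1}^{3} C_i^{9} C^{95_i} C^{95_i} \Big) \\
& + \sum_{i,j,k=1}^{3} g_{5_i} \Big( C_1^{5_i} C_2^{5_i} C_3^{5_i} + C_i^{5_i} C^{95_i} C^{95_i}
+ d_{ijk}\, C_j^{5_i} C^{5_i 5_k} C^{5_i 5_k} + \frac{1}{2}\, d_{ijk}\, C^{5_j 5_k} C^{95_j} C^{95_k} \Big),
\end{aligned} \tag{A.4}$$
with $d_{ijk} = 1$ if $i \neq j \neq k \neq i$, otherwise $d_{ijk} = 0$.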
References
[1] B. Greene, K. Kirklin, P. Miron, G.G. Ross, Nucl. Phys. B 278 (1986) 667;
B. Greene, K. Kirklin, P. Miron, G.G. Ross, Nucl. Phys. B 292 (1987) 602;
J.A. Casas, C. Muñoz, Phys. Lett. B 209 (1988) 214;
J.A. Casas, C. Muñoz, Phys. Lett. B 214 (1988) 63;
J.A. Casas, E.K. Katehou, C. Muñoz, Nucl. Phys. B 317 (1989) 171;
A. Font, L.E. Ibáñez, H.P. Nilles, F. Quevedo, Phys. Lett. B 210 (1988) 101;
I. Antoniadis, J. Ellis, J. Hagelin, D. Nanopoulos, Phys. Lett. B 231 (1989) 65.
[2] Z. Kakushadze, Phys. Lett. B 434 (1998) 269;
J. Lykken, E. Poppitz, S. Trivedi, Nucl. Phys. B 543 (1999) 105.
[3] G. Shiu, S.-H.H. Tye, Phys. Rev. D 58 (1998) 106007.
[4] Z. Kakushadze, S.-H.H. Tye, Phys. Rev. D 58 (1998) 126001.
[5] M. Cvetic, M. Plümacher, J. Wang, JHEP 0004 (2000) 004;
G. Aldazabal, L.E. Ibáñez, F. Quevedo, hep-ph/0001083.
[6] G. Aldazabal, L.E. Ibáñez, F. Quevedo, A.M. Uranga, JHEP 0008 (2000) 002.
[7] D. Bailin, G.V. Kraniotis, A. Love, hep-th/0011289.
[8] S.F. King, D.A.J. Rayner, hep-ph/0012076.
[9] J. Lykken, Phys. Rev. D 54 (1996) 3693.
[10] N. Arkani-Hamed, S. Dimopoulos, G. Dvali, Phys. Rev. Lett. B 249 (1998) 262;
I. Antoniadis, N. Arkani-Hamed, S. Dimopoulos, G. Dvali, Phys. Lett. B 436 (1998) 263;
I. Antoniadis, C. Bachas, Phys. Lett. B 450 (1999) 83.
[11] K. Benakli, Phys. Rev. D 60 (1999) 104002.
[12] C. Burgess, L.E. Ibáñez, F. Quevedo, Phys. Lett. B 447 (1999) 257.
[13] For a review, see, e.g., C. Muñoz, hep-ph/9709329, and references therein.
[14] J.A. Casas, A. Lleyda, C. Muñoz, Phys. Lett. B 380 (1996) 59.
[15] J.A. Casas, A. Lleyda, C. Muñoz, Phys. Lett. B 389 (1996) 305.
[16] S.A. Abel, B.C. Allanach, F. Quevedo, L.E. Ibáñez, M. Klein, JHEP 0012 (2000) 026.
[17] N. Kaloper, A. Linde, Phys. Rev. D 59 (1999) 101303.
[18] E. Gabrielli, S. Khalil, C. Muñoz, E. Torrente-Lujan, Phys. Rev. D 63 (2001) 025008.
[19] For a review, see, e.g., G. Jungman, M. Kamionkowski, K. Griest, Phys. Rep. 267 (1996) 195,
and references therein.
[20] R. Bernabei et al., Phys. Lett. B 480 (2000) 23.
[21] R. Abusaidi et al., Phys. Rev. Lett. 84 (2000) 5699.
[22] Heidelberg–Moscow Collaboration, Phys. Rev. D 59 (1998) 022001.
[23] Heidelberg–Moscow Collaboration, Phys. Rev. D 63 (2001) 022001.
[24] P.F. Smith et al., Phys. Rep. 307 (1998) 275.
[25] Canfranc Underground Laboratory Collaboration, hep-ph/0011318.
[26] H.V. Klapdor-Kleingrothaus, L. Baudis, G. Heusser, B. Majorovits, H. Paes, hep-ph/9910205.
[27] For a review, see, A. Brignole, L.E. Ibáñez, C. Muñoz, in: G. Kane (Ed.), Perspectives on
Supersymmetry, World Scientific, Singapore, 1998, p. 125, hep-ph/9707209.
[28] L.E. Ibáñez, D. Lüst, Nucl. Phys. B 382 (1992) 305.
[29] V.S. Kaplunovsky, J. Louis, Phys. Lett. B 306 (1993) 269.
[30] A. Brignole, L.E. Ibáñez, C. Muñoz, Nucl. Phys. B 422 (1994) 125;
A. Brignole, L.E. Ibáñez, C. Muñoz, Nucl. Phys. B 436 (1995) 747, Erratum;
A. Brignole, L.E. Ibáñez, C. Muñoz, C. Scheich, Z. Phys. C 74 (1997) 157.
[31] I. Antoniadis, E. Kiritsis, T.N. Tomaras, Phys. Lett. B 486 (2000) 186.
[32] L.E. Ibáñez, hep-ph/9804236.
[33] L.E. Ibáñez, C. Muñoz, S. Rigolin, Nucl. Phys. B 553 (1999) 43.
[34] Particle Data Group, Eur. Phys. J. C 15 (2000) 1.
[35] M. Berkooz, R.G. Leigh, Nucl. Phys. B 483 (1997) 187.
[36] H. Fritzsch, D. Holtmannspotter, Phys. Lett. B 338 (1994) 290;
H. Fritzsch, Z. Xing, Phys. Rev. D 61 (2000) 073016.
[37] M. Brhlik, L. Everett, G.L. Kane, J. Lykken, Phys. Rev. D 62 (2000) 035005;
E. Accomando, R. Arnowitt, B. Dutta, Phys. Rev. D 61 (2000) 075010;
T. Ibrahim, P. Nath, Phys. Rev. D 61 (2000) 093004;
S. Khalil, T. Kobayashi, O. Vives, Nucl. Phys. B 580 (2000) 275;
E. Gabrielli, S. Khalil, E. Torrente-Lujan, Nucl. Phys. B 594 (2001) 3;
D. Carvalho, M. Gómez, S. Khalil, hep-ph/0101250.
[38] L. Everett, G.L. Kane, S.F. King, JHEP 0008 (2000) 012.
[39] S. Khalil, Phys. Lett. B 484 (2000) 98.
[40] A. Corsetti, P. Nath, hep-ph/0003186.
[41] R. Arnowitt, B. Dutta, Y. Santoso, hep-ph/0005154.
[42] D. Bailin, G.V. Kraniotis, A. Love, Phys. Lett. B 491 (2000) 161.
[43] A. Bottino, F. Donato, N. Fornengo, S. Scopel, Phys. Rev. D 59 (1999) 095004.
[44] R. Arnowitt, P. Nath, Phys. Rev. D 60 (1999) 044004.
[45] J. Ellis, A. Ferstl, K. Olive, Phys. Lett. B 481 (2000) 304.
[46] E. Accomando, R. Arnowitt, B. Dutta, Y. Santoso, Nucl. Phys. B 585 (2000) 124.
[47] A. Corsetti, P. Nath, hep-ph/0003186.
[48] J.L. Feng, K.T. Matchev, F. Wilczek, Phys. Lett. B 482 (2000) 388.
[49] M.E. Gómez, J.D. Vergados, hep-ph/0012020.
[50] P. Salucci, M. Persic, astro-ph/9703027;
P. Salucci, M. Persic, astro-ph/9903432.
[51] See, e.g., W.L. Freedman, astro-ph/9905222, and references therein.
[52] N. Kaiser, astro-ph/9809341.
[53] B. Sadoulet, Rev. Mod. Phys. 71 (1999) 197.
[54] J.R. Primack, astro-ph/0007187;
J.R. Primack, M.A.K. Gross, astro-ph/0007165.

Holomorphic tachyons and fractional D-branes

Tadashi Takayanagi
Department of Physics, Faculty of Science, University of Tokyo, Tokyo 113-0033, Japan
Abstract
We study tachyon condensation on brane–antibrane systems in orbifold theories from the
viewpoint of boundary string field theory. We show that the condensation of holomorphic tachyon
fields generates various fractional D-branes. The boundary $N = 2$ supersymmetry in the worldsheet
theory ensures this result exactly. Furthermore, our results are consistent with the twisted RR-charges
from detailed calculations of boundary states. We also discuss the generation of RR-charges due to
holomorphic tachyon fields on multiple brane–antibrane pairs in flat space. © 2001 Elsevier Science
B.V. All rights reserved.
1. Introduction
Tachyon fields naturally appear in open string theory if we consider various configurations of D-branes. For example, brane–antibrane systems [1–3] and non-BPS D-branes [4]
in Type II superstring theory indeed have tachyon fields. Since the presence of a tachyon
signals an instability of the system, tachyon condensation is very important for understanding
the dynamical aspects of string theory.
Recently, tachyon condensation in open string theory has been intensively studied,
pioneered by Sen (for a review see [5]).$^1$ Sen conjectured that if the tachyon fields on
these unstable D-brane systems condense into the bottom of the tachyon potential, then the
negative energy density exactly cancels the D-brane tension [3]. After this conjecture was
proposed, the tachyon potentials for various unstable brane systems have been studied
by applying open string field theories and the off-shell structures for various unstable
brane systems have been revealed. Calculations with a good approximation called the level
truncation scheme have been performed in Witten's cubic string field theory [7] and
Berkovits's superstring field theory [8]. The obtained tachyon potentials agree with
Sen's conjecture (for example, see [9] and also refer to [10] for a review of the string field
theory approach to tachyon condensation).
E-mail address: takayana@hep-th.phys.s.u-tokyo.ac.jp (T. Takayanagi).
1 For earlier work on tachyon condensation see [6].
PII: S0550-3213(01)00148-1
T. Takayanagi / Nuclear Physics B 603 (2001) 259–285
Another open string field theory which has been applied to tachyon condensation is the
boundary string field theory (or background independent open string field theory [11,12]).
In this theory one has only to discuss a finite number of string fields because the string
fields which have expectation values are considered to correspond to only the relevant
and marginal perturbations on the worldsheet. For example, one can calculate the exact
tachyon potential [13,14]. Though the validity of this truncation has not been proved
completely, one can compute tachyon condensation for specific tachyon fields without
any approximation and the result agrees with Sen's conjecture exactly [14]. Motivated
by the previous results on the worldsheet $\sigma$-model approach [15], this formulation has
also been generalized for non-BPS D-branes [16] and brane–antibrane systems [17–19] in
superstring theory.
Most of these recent developments in open string field theories are restricted to unstable
D-brane systems in non-compact flat backgrounds. However, if one wants to know the
geometrical aspects of tachyon condensation, one should take up curved backgrounds.
Some results on tachyon condensation in curved space have already been obtained. For
example, tachyon condensation as marginal deformations [20–22] has been studied for $Z_2$ orbifolds [23–25]. The condensation of holomorphic tachyon fields has been discussed in
more general (Ricci-flat) Kähler manifolds [17,26,27]. For tachyon condensation in the SU(2)
WZW model see also [28,29]. The approach which utilizes noncommutative geometry [30]
has also been applied to various compact spaces [28,31,32].
Investigations of this problem may also be useful for the understanding of substringy
geometry for D-branes, which is called D-geometry (for example, see reviews [33]
and references therein). Since the condensation of topologically non-trivial tachyon field
generates lower-dimensional D-branes [20,34], one can regard a D-brane roughly as a
tachyon field on higher-dimensional space. If one handles the tachyon fields in boundary
string field theory (BSFT), this will give another stringy description of D-branes and this
will be useful to obtain D-geometry.
If we would like to get a BPS D-brane from a brane–antibrane system, it is natural
to require that the tachyon field T should be holomorphic [17,26,27]. This fact seems
to be correct in BSFT if one takes the large volume limit because the worldsheet theory
becomes localized at T = 0 [35] after the tachyon condensation. Then this equation gives
the holomorphic cycle (or divisor) on which the BPS D-brane [36] is wrapped [17].
If one would like to discuss this requirement in the worldsheet theory, holomorphy of
tachyon fields is equivalently stated as the boundary (B-type) $N = 2$ supersymmetry [37] on
the worldsheet [17]. On the other hand, if we consider the backgrounds where stringy
corrections do exist, then the above arguments will be modified. Therefore the investigation
of tachyon condensation in BSFT with the N = 2 supersymmetry is very interesting in the
stringy regions.
As a first step in this direction, in this paper we discuss tachyon condensation in orbifold theories
from the viewpoint of boundary string field theory. We consider tachyon fields which
preserve the boundary B-type $N = 2$ supersymmetry (holomorphic tachyons). After the
tachyon condensation we obtain various fractional D-branes. We can identify the decay
products completely by combining the boundary string field theory with some results
from boundary state calculations. From this argument we obtain intriguing identities for
the characters of the discrete groups which define orbifolds. The worldsheet extended
supersymmetry ensures these results of tachyon condensation exactly. Further, if we resolve
the orbifold singularities, then the final states are regarded as BPS D-branes which are
wrapped on various holomorphic cycles. Thus we see again the correspondence between
BPS D-branes and holomorphic tachyon fields. In this paper we mainly discuss only $Z_N$ orbifolds for simplicity.
The paper is organized as follows. In Section 2, we first review some known facts
on the relation between tachyon condensation in brane–antibrane systems and $N = 2$
supersymmetry. Further we discuss tachyon condensation for multiple brane–antibrane
pairs and the generation of RR-charges on flat space. In Section 3, we investigate tachyon
condensation on orbifolds from the viewpoint of boundary string field theory. In Section 4,
we conclude with some future directions. In Appendix A, we show the explicit calculations
of boundary states.
2. Tachyon condensation in BSFT and holomorphy of tachyon

In this section we first review tachyon condensation on brane–antibrane systems in
flat background within the framework of BSFT [11–14,16–19] and we next investigate
the generation of D-brane charges from various brane–antibrane systems. We obtain the
topological configurations which generalize the Atiyah–Bott–Shapiro construction [38].
In particular we are interested in the relation between the boundary N = 2 supersymmetry
and tachyon condensations, which was first discussed in [17]. Through the paper we use
the language of Type IIA theory, but all the arguments can be applied to Type IIB theory
straightforwardly.
2.1. Tachyon condensation on a brane and an antibrane
In BSFT for the superstring, $N = 1$ superconformal symmetry is preserved in the bulk
of the worldsheet, but conformal symmetry is broken at the boundary due to boundary
interactions. In other words, only the boundary can be off-shell and the open string fields
are expressed as boundary interactions. Then the boundary interactions which describe
the tachyon condensation should naturally preserve $N = 1$ supersymmetry. To realize this
supersymmetry one needs extra fermionic fields [34,35] on the boundary and this freedom
corresponds to Chan–Paton factors of non-BPS D-branes and brane–antibrane systems.
They are called boundary fermions and we denote them by $\eta$, $\bar\eta$ (complex fermion) for a
brane–antibrane and $\eta$ (real fermion) for a non-BPS D-brane. Then the worldsheet action $I$
for a brane–antibrane system in flat space is given by [17–19,35]
$$I = I_0 + I_B, \tag{2.1}$$

1
d 2 w w X w X + L w L + R w R ,

2

a
1
1
a
IB = d d D + T X + T (X ) ,
2
2
I0 =
(2.2)
(2.3)
where $w$, $\bar w$ denote the coordinates of the (Euclidean) worldsheet and $\tau = w + \bar w$ denotes
the boundary coordinate; we define $X^\mu = X_L^\mu + X_R^\mu$, $\psi_R^\mu$, $\psi_L^\mu$ ($\mu = 0, \ldots, 9$) as the
familiar bosonic and fermionic (left-moving and right-moving) fields on the worldsheet.
The tachyon field $T(X^a)$, $\bar T(X^a)$ depends only on the coordinates $X^a$ which are along the
worldvolume of the brane–antibrane.
Here we have used the $N = 1$ superspace formulation at the boundary of the worldsheet as
follows

X = X + 2i 12 R + L ,
= + F,
(2.4)
,

= + F
+
.
D =
Note that $\mathbf{\Gamma}$ is the $N = 1$ superfield for the boundary fermion $\eta$.
If we write the boundary interactions $I_B$ in the component form and integrate out the
auxiliary fields $F$, $\bar F$, then we get:

2
2
1
T + i
T +
TT .
IB = d i
(2.5)
From this it is easy to see that after the quantization of the boundary fermions
$$\{\eta, \bar\eta\} = 1, \tag{2.6}$$
the Chan–Paton factors $\sigma_+ = \frac{1}{2}(\sigma_1 + i\sigma_2)$, $\sigma_- = \frac{1}{2}(\sigma_1 - i\sigma_2)$ and $\sigma_3$ correspond to $\eta$,
$\bar\eta$ and $[\eta, \bar\eta]$, respectively. This explains the correct degrees of freedom of the Chan–Paton factors
($2 \times 2$ matrices) for a brane–antibrane. The above action includes only the perturbations
which represent the tachyon field $T$, $\bar T$. Furthermore, one can also incorporate the gauge
fields which correspond to the Chan–Paton factors $1$ and $\sigma_3$, but we will set these fields to
zero in this paper.
Also the worldsheet action for non-BPS D-branes can be easily obtained if one applies
the descent relation [23]. This relation says that if one performs the $Z_2$ projection of the
boundary interactions $\mathbf{\Gamma} = \bar{\mathbf{\Gamma}}$ for a brane–antibrane, then one gets those for a non-BPS
D-brane [16,35].
Now let us require $N = 2$ worldsheet supersymmetry. For example, this supersymmetry
is preserved for Calabi–Yau compactifications, as is well known. In order to investigate
D-branes in these examples, it is natural to consider a boundary analog of such an
extended supersymmetry, though this is not generic. If we get to an on-shell point after
the tachyon condensation, then this supersymmetry will be enhanced into $N = 2$ boundary
superconformal symmetries, which are classified into A-type and B-type superconformal
symmetry [37]. These coincide with the classification of BPS D-branes in Calabi–Yau
spaces in the large volume limit [36]. Therefore it will be particularly interesting to
consider the boundary interactions which preserve this symmetry. Then what kinds of
tachyon fields satisfy this requirement? It was pointed out in the paper [17] that the
B-type supersymmetry is not broken if one considers a holomorphic tachyon field for brane–antibrane systems.
More concretely, the boundary interaction which preserves B-type $N = 2$ supersymmetry (below we will omit the word B-type and simply call this $N = 2$ supersymmetry) can
be written [17] as follows (for earlier relevant work see also [39])

1
IB = d d d + d d T Zi + (h.c.),
(2.7)
2
where we have employed $N = 2$ boundary superspace $(\tau, \theta, \bar\theta)$ and the boundary fermionic
chiral and antichiral superfields $\mathbf{\Gamma}$, $\bar{\mathbf{\Gamma}}$ are defined in our conventions as
i
i
= + F ,
2
2
i
i

.
= + F
2
2
(2.8)
Note that the tachyon field $T(Z^i)$ depends only on the holomorphic coordinates $Z^i = X^{2i-1} + iX^{2i}$
along the worldvolume in order to preserve $N = 2$ boundary supersymmetry.
The most interesting feature of $N = 2$ supersymmetry is the fact that the boundary
superpotential term $\int d\tau\, d\theta\, \mathbf{\Gamma}\, T(Z^i)$ is not renormalized, as argued in [17]. On the
other hand the kinetic term for the boundary fermion is included in the boundary D-term
and will receive quantum corrections. We assume that the contributions from the D-term
are not singular and therefore the potential term dominates the D-term after the tachyon
condensation $|T| \to \infty$. For example, let us assume the following holomorphic tachyon
field on a D2–$\overline{\rm D2}$ which is extended in the $Z^1$ ($\equiv Z$) direction [17]:
$$T(Z) = \sum_{k=0}^{p} a_k Z^k = a_p \prod_{k=1}^{p} (Z - z_k). \tag{2.9}$$
Then the values of $\{z_k\}$ are not renormalized. As Sen and Witten argued in [20,34], if a
tachyon field which has a topological charge does condense, then the corresponding lower-dimensional D-branes are generated. In our example the tachyon field (2.9) has the winding
number $p$ and thus $p$ D0-branes should be produced, one at each point $z_k$.
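The winding-number statement can be checked numerically. The following is a minimal sketch, assuming a tachyon of the factorized form $T(Z) = a_p \prod_k (Z - z_k)$; the helper names `tachyon` and `winding_number` and the sample zeros are hypothetical. It accumulates the phase of $T$ along a circle $|Z| = R$ and counts the number of turns, which equals the number of zeros enclosed by the circle.

```python
import cmath
import math

def tachyon(zeros, a_p=1.0):
    """Return T(Z) = a_p * prod_k (Z - z_k) for the given list of zeros."""
    def T(z):
        value = a_p
        for zk in zeros:
            value *= (z - zk)
        return value
    return T

def winding_number(T, radius, steps=20000):
    """Number of times T(Z) winds around 0 as Z runs once around |Z| = radius."""
    total = 0.0
    prev = cmath.phase(T(complex(radius, 0.0)))
    for k in range(1, steps + 1):
        z = radius * cmath.exp(2j * math.pi * k / steps)
        cur = cmath.phase(T(z))
        delta = cur - prev
        # undo the 2*pi jump when the phase crosses the branch cut
        if delta > math.pi:
            delta -= 2.0 * math.pi
        elif delta < -math.pi:
            delta += 2.0 * math.pi
        total += delta
        prev = cur
    return round(total / (2.0 * math.pi))

# Three zeros: a large circle sees winding number p = 3
T = tachyon([0.5, -0.3 + 0.2j, 1j])
p = winding_number(T, 10.0)
```

Shrinking the circle so that it encloses fewer zeros lowers the count accordingly, which matches the picture of one D0-brane localized at each $z_k$.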
Let us see this in BSFT. In superstring theory the spacetime action $S$ of BSFT is argued
to be identified with the disk partition function $Z_{\rm disk}$ [15–19]
$$S = Z_{\rm disk} = \int [DX][D\psi][D\eta][D\bar\eta]\, \exp(-I_0 - I_B). \tag{2.10}$$
As the tachyon condenses infinitely, $a_p \to \infty$, the path integrals around the $p$ fixed points
$Z = z_k$ give dominant contributions to $Z_{\rm disk}$. Then the partition function becomes $p$ times
that for the $p = 1$ case [17]. On the other hand, the boundary perturbation for $p = 1$ can be
treated within a free theory. Using the results in [16–19], one can show
$$\frac{Z(a_1 = 0)}{Z(a_1 = \infty)}\, (\mathrm{Vol})^{-1} = \frac{1}{2\pi^2\alpha'} = \frac{T_{D2\overline{D2}}}{T_{D0}}, \tag{2.11}$$
where $T_{D0}$ and $T_{D2\overline{D2}}$ denote the tension of a D0-brane and of a D2–$\overline{\rm D2}$, respectively;
Vol denotes the volume of the D2-brane worldvolume. Thus after the condensation of
tachyon field (2.9), $p$ D0-branes are produced as expected.
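The numerical value of the tension ratio in (2.11) can be recovered from the standard relation between D-brane tensions. The short check below is only a sketch: it assumes $T_{D0} = (2\pi)^p (\alpha')^{p/2}\, T_{Dp}$ at $p = 2$ and that the brane–antibrane pair has twice the tension of a single D2-brane, $T_{D2\overline{D2}} = 2\,T_{D2}$.

```latex
\frac{T_{D2\overline{D2}}}{T_{D0}}
  = \frac{2\,T_{D2}}{(2\pi)^{2}\,\alpha'\,T_{D2}}
  = \frac{1}{2\pi^{2}\alpha'} .
```

With $p$ fixed points each contributing one such factor, the total tension after condensation is $p\,T_{D0}$.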
Another way to see this is to compute the RR-couplings of a D2–$\overline{\rm D2}$. As discussed in
[18,19,40,41], those couplings for a D$p$–$\overline{{\rm D}p}$ system in BSFT are written by using Quillen's
superconnection [42] if we ignore the contributions from non-abelian transverse scalars
[19,43]. They are given by the following formula

S = TDp Str C exp(2 F ),

(1) T T
3/2 2 DT
2 F
(i)
2 F =
(2.12)
,
(i)3/2 2 DT 2 F (2) T T
where Str is the supertrace and $F$ is the field strength of the superconnection.$^2$ Then let us
compute the RR-coupling which represents the D0-brane charge in the previous example.
As shown in [42], continuous deformations of the tachyon field do not change the result.
Therefore we can restrict the form of the tachyon field to
$$T(Z) = a_p Z^p. \tag{2.13}$$
Then the integration in (2.12) along the coordinate $Z^1$ does not depend on $a_p$ and we obtain
the following RR-coupling
$$\begin{aligned}
S_{RR} &= (i 2\pi\alpha')\, T_{D2} \int C_{D0} \wedge dT \wedge d\bar T\, e^{-T\bar T} \\
&= 4\pi^2\alpha'\, p^2\, T_{D2} \int C_{D0} \int_0^\infty 2r\, dr\, r^{2p-2}\, e^{-r^{2p}} \\
&= p\, T_{D0} \int C_{D0},
\end{aligned} \tag{2.14}$$
where we have used the relation $T_{D0} = (2\pi)^p (\alpha')^{p/2}\, T_{Dp}$; the 1-form field $C_{D0}$ denotes
the RR-field which couples to D0-branes. Thus we get $p$ units of D0-brane RR-charge
matching with the above result.
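The step from the radial integral to the factor $p$ in (2.14) can be verified numerically. The sketch below assumes the radial integrand $2r \cdot r^{2p-2} e^{-r^{2p}}$ with an overall factor $p^2$ from $|dT/dZ|^2$; the function names `radial_integral` and `d0_charge` are hypothetical. Substituting $u = r^{2p}$ gives $\int_0^\infty 2 r^{2p-1} e^{-r^{2p}}\, dr = 1/p$, so the charge in units of $T_{D0}$ should come out equal to $p$.

```python
import math

def radial_integral(p, r_max=6.0, n=20000):
    """Composite Simpson estimate of the integral of 2 r^(2p-1) exp(-r^(2p)) on [0, r_max].

    n must be even; r_max is chosen large enough that the tail is negligible.
    """
    def f(r):
        return 2.0 * r ** (2 * p - 1) * math.exp(-(r ** (2 * p)))
    h = r_max / n
    total = f(0.0) + f(r_max)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(k * h)
    return total * h / 3.0

def d0_charge(p):
    """D0-brane charge in units of T_D0: p^2 times the radial integral."""
    return p * p * radial_integral(p)

charges = [d0_charge(p) for p in (1, 2, 3, 4)]
```

The result depends only on the degree $p$, in line with the statement that the integration does not depend on $a_p$.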
Next we would like to comment on the higher-dimensional generalization. If the tachyon
field $T$ depends only on one coordinate (for example, $Z^1$), then the generalization is trivial.
More generally, let us consider the tachyon field $T(Z^1, Z^2, \ldots, Z^n)$ on a D$2n$–$\overline{{\rm D}2n}$. If the
holomorphic function $T$ is reducible as $T = T^{(1)} T^{(2)} \cdots T^{(q)}$, then we obtain the sum
of the decay products each corresponding to $T = T^{(1)}$, $T = T^{(2)}$, $\ldots$, $T = T^{(q)}$ [17].
Therefore we can assume the function $T(Z^1, Z^2, \ldots, Z^n)$ is irreducible. Then we will
obtain a D$(2n-2)$-brane wrapping on a codimension-two hypersurface $T = 0$. However,
this may be problematic. In general this configuration of the curved D-brane seems to be
unstable in spite of its holomorphy since the D-brane is put in flat space and cannot wrap
any cycles. It would be interesting to investigate this further, though we mainly discuss the
generation of D0-brane charges in this paper.
2 We have replaced T with T in the Ref. [19]. Note also that a factor 1 in front of DT is different from
Eq. (4.8) in [19]. This is because here we assume T anticommutes with any odd-forms.
Before closing this subsection, let us ask what will happen if we do not assume the
boundary $N = 2$ supersymmetry. First, one can produce lower-dimensional non-BPS D-branes. This requires a kink-like tachyon field, which is not holomorphic. Second, one will
also be able to produce D0-branes and anti-D0-branes at the same time. For example, let
us consider the following tachyon field on a D2–$\overline{\rm D2}$
$$T(Z, \bar Z) = a_{q+p,q}\, Z^{q+p} \bar Z^{q}. \tag{2.15}$$
In the same way as the above RR-charge computation, one can calculate the D0-brane charge
of this configuration. The result is $p$ times that of a D0-brane, which can also be seen
intuitively from the fact that the tachyon field (2.15) has the winding number $p$. One may
hastily conclude that the configuration (2.15) generates a system of $(q + p)$ D0-branes and
$q$ anti-D0-branes after the tachyon condensation. In fact this boundary interaction (2.15)
breaks $N = 2$ supersymmetry and thus should be renormalized. Therefore we can argue
that a system of $(q + p)$ D0-branes and $q$ anti-D0-branes for any $q$ will be produced in a
certain limit of the following configurations
$$T(Z, \bar Z) = \sum_{q=0}^{\infty} a_{q+p,q}\, Z^{q+p} \bar Z^{q}. \tag{2.16}$$
Generally, these are highly interacting theories and it will be difficult to analyze them further.
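The winding number quoted for the non-holomorphic field (2.15) can be read off in polar coordinates. The notation $Z = r e^{i\varphi}$ is introduced here only for illustration and is not taken from the original text.

```latex
T(Z,\bar Z) = a_{q+p,q}\, Z^{q+p}\bar Z^{q}
            = a_{q+p,q}\, r^{2q+p}\, e^{i(q+p)\varphi}\, e^{-iq\varphi}
            = a_{q+p,q}\, r^{2q+p}\, e^{ip\varphi} .
```

The phase advances by $2\pi p$ when $\varphi$ runs from $0$ to $2\pi$, independently of $q$. This is consistent with a net D0-brane charge $p$, which by itself cannot distinguish $(q+p)$ D0-branes plus $q$ anti-D0-branes from $p$ D0-branes alone.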
2.2. Tachyon condensation on multiple branes and antibranes
The above formulation can be generalized for multiple brane–antibrane systems. The
path-ordered formulation for these was given in [19]. If one wants to construct the
corresponding $N = 1$ boundary interaction, one has only to include more than one
boundary fermion [17,18]. We write the superfields for them as $\mathbf{\Gamma}_i$, $\bar{\mathbf{\Gamma}}_i$ ($i = 1, 2, \ldots, n$).
The quantization of the boundary fermions $\eta_i$ is written as
$$\{\eta_i, \bar\eta_j\} = \delta_{ij}. \tag{2.17}$$
Comparing this with the algebra of $\Gamma$-matrices $\Gamma_1, \ldots, \Gamma_{2n}$:
$$\{\Gamma_a, \Gamma_b\} = 2\delta_{ab}, \tag{2.18}$$
we get the correspondence
$$\eta_i \leftrightarrow \Gamma_i^{+} \equiv \frac{1}{2}(\Gamma_{2i-1} + i\Gamma_{2i}), \qquad
\bar\eta_i \leftrightarrow \Gamma_i^{-} \equiv \frac{1}{2}(\Gamma_{2i-1} - i\Gamma_{2i}). \tag{2.19}$$
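The correspondence between boundary fermions and $\Gamma$-matrices can be made concrete with an explicit representation. The sketch below is only illustrative: it assumes a Jordan–Wigner-type construction of $2n$ matrices of size $2^n \times 2^n$ from Pauli matrices, and it assumes the assignment $\eta_i \leftrightarrow \frac12(\Gamma_{2i-1} + i\Gamma_{2i})$, $\bar\eta_i \leftrightarrow \frac12(\Gamma_{2i-1} - i\Gamma_{2i})$; the opposite assignment satisfies the same algebra. All function names are hypothetical.

```python
from functools import reduce

# Pauli matrices and the 2x2 identity as nested lists
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]

def kron(A, B):
    """Kronecker product of two matrices given as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    """Ordinary matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B, coeff=1):
    """Return A + coeff * B."""
    return [[a + coeff * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def anticomm(A, B):
    """Anticommutator {A, B} = AB + BA."""
    return add(matmul(A, B), matmul(B, A))

def gamma_matrices(n):
    """2n anticommuting matrices of size 2^n, ordered Gamma_1, ..., Gamma_2n."""
    gammas = []
    for i in range(n):
        for s in (S1, S2):
            factors = [S3] * i + [s] + [I2] * (n - i - 1)
            gammas.append(reduce(kron, factors))
    return gammas

def boundary_fermions(n):
    """Matrices representing eta_i and etabar_i under the assumed assignment."""
    g = gamma_matrices(n)
    half = lambda M: [[0.5 * x for x in row] for row in M]
    eta = [half(add(g[2 * i], g[2 * i + 1], 1j)) for i in range(n)]
    etabar = [half(add(g[2 * i], g[2 * i + 1], -1j)) for i in range(n)]
    return eta, etabar

eta, etabar = boundary_fermions(2)   # 4x4 matrices for n = 2
```

For $n$ complex fermions the matrices act on a $2^n$-dimensional space, which is the size of the Chan–Paton space used in the following paragraph.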
Then we can get $2^n \times 2^n$ Chan–Paton matrices which correspond to $2^{n-1}$ branes and $2^{n-1}$
antibranes, even though for a general number of branes and antibranes the $N = 1$ boundary
superspace formulation is not known. In this formalism, the boundary interactions are
expanded as

i Ti (Xa )
IB = d d i D i +
+ Tij k Xa i j k
2

a
(2.20)
+ Tij
k X i j k + + (h.c.) ,
where we have omitted the summation over the indices $i, j, k, \bar i$. Note that in the above
equation the fields $T_i$, $T_{ijk}$, $T_{ij\bar k}$, $\ldots$, represent non-abelian tachyon fields on $2^{n-1}$ brane–antibrane pairs. Here the non-abelian gauge fields are neglected again; these correspond
to the boundary interactions which include an even number of boundary fermionic superfields.
If we are interested in $N = 2$ boundary supersymmetry, then the above boundary
interactions should be constrained. The boundary interactions which represent tachyon
fields should be in the boundary superpotential terms $\int d\tau\, d\theta\, W$. Therefore
(i) tachyon fields should be holomorphic (or depend only on $Z^i$) and (ii) the potential
terms should involve no anti-chiral superfields $\bar{\mathbf{\Gamma}}_i$. For example, the second requirement
does not allow the field $T_{ij\bar k}$.
Now let us consider tachyon condensation on $2^{n-1}$ D$(2n)$-branes and $2^{n-1}$ anti-D$(2n)$-branes. In such a system there should be decay modes which generate BPS D0-branes
following the general arguments in K-theory [34]. We assume the following $N = 2$
boundary interaction [17] for simplicity
1
IB = d d d
(2.21)
i i + d d
i Ti (Z) + (h.c.),
2 i
i
where the tachyon fields $T_i(Z)$ depend only on the holomorphic coordinates $Z^1, \ldots, Z^n$ of
the worldvolume. Note that if $n = 1$ or 2, then the general $N = 2$ boundary interaction can
be written in the above form. As argued in [17] the condensation of these tachyon fields
generally produces a D-brane wrapped on the intersection of hypersurfaces $T_i(Z) = 0$.
Below we would like to investigate this further. The results will be useful in the next
section.
We first turn to the RR-coupling which corresponds to the D0-brane charge. The non-abelian tachyon field $T$ can be written as
$$\begin{pmatrix} 0 & T \\ \bar T & 0 \end{pmatrix} = \sum_i \Gamma_i^{+} T_i + \sum_i \Gamma_i^{-} \bar T_i, \tag{2.22}$$
where note that the $\Gamma$-matrices here are not projected into the Weyl representation. Notice that
we regard the tachyon fields as holomorphic if the $T_i$ are holomorphic functions. Generally
this does not mean that the non-abelian tachyon field $T$ in the above matrix is holomorphic
in an ordinary sense.
Putting this into Eq. (2.12), we get the following RR-coupling in BSFT:

n

2
|Ti |
SRR = TD(2n) Str CD0 exp
i=1
n

+

i dTi + i dT i
exp 2i
267
i=1

n

1
2
= TD(2n)(2i )
|Ti |
CD0 exp
(2n)!
i=1

n
2n

+
Tr 2n+1
i dTi + i dT i
n
i=1
n
= (2i ) TD(2n)

exp
n

i=1

|Ti |
CD0
n

dTi dT i ,
(2.23)
i=1
where the chirality matrix 2n+1 = (i)n 1 2 2n was inserted in order to replace the
supertrace Str with the ordinary trace Tr. Note that there are no RR-charges other than
D0-branes produced from the tachyon fields (2.22) because of the trace over matrices.
If we assume Ti depends only on Zi , then the above integrations are divided into n
independent parts. If the degree of $T_i(Z_i)$ is $p_i$, then the result is given by
$$S_{RR} = \Big(\prod_{i=1}^{n} p_i\Big)\, T_{D0} \int C_{D0}. \qquad (2.24)$$
Thus we can conclude that $\prod_{i=1}^{n} p_i$ D0-branes are generated in this case. Furthermore
one can show that this configuration has a correct tension in BSFT. To see this one has only
to note that the partition function $Z_{disk}$ is also divided into n independent path integrals for
each direction $Z_1, \ldots, Z_n$:
$$Z_{disk} = \prod_{i=1}^{n} Z^{i}_{disk}. \qquad (2.25)$$
Then using the previous result of tachyon condensation on a D2-anti-D2 pair, it is easy to see that
the resulting tension is $\prod_{i=1}^{n} p_i$ times that of a D0-brane. In particular the configuration
$p_1 = \cdots = p_n = 1$ generates a BPS D0-brane and corresponds to the Atiyah-Bott-Shapiro [38]
construction of K-theory charges.
Let us turn to other configurations. For simplicity, we set n = 2 and consider a system
which consists of two D4-branes and two anti-D4-branes. We consider the following
holomorphic tachyon fields as generic examples
$$T_1(Z_1, Z_2) = (Z_1)^p (Z_2)^q - a, \qquad T_2(Z_1, Z_2) = (Z_1)^r (Z_2)^s - b, \qquad (2.26)$$
where p, q, r, s are non-zero integers and we assume $ps - qr \neq 0$.

First note that the above configurations include only 0-branes if either a or b is not
zero. This is because if one calculates the disk partition function in BSFT, one always
finds the factor $\exp(-\sum_{i=1}^{2} T_i \bar{T}_i)$, and this means that the degrees of freedom will be
localized at the points (0-branes) defined by the equations $T_1 = T_2 = 0$. One can calculate
the number of the points and the result (see footnote 3) is given by $|ps - rq|$. This shows that the total number
of generated D0-branes and anti-D0-branes is $|ps - rq|$, because each fixed point gives the
tension of a D0-brane (or anti-D0-brane). On the other hand, one can also calculate the
D0-brane RR-charge of these configurations with an appropriate change of the variables
in the integration (2.23). The result is $(ps - qr)$ times that of a D0-brane. Thus we can
conclude that the above tachyon fields (2.26) generate $(ps - rq)$ BPS D0-branes unless
a = b = 0. Mathematically one can say that the integration (2.23) counts the number of the
(localized) solutions to the algebraic equations $T_i = 0$, and this result will hold for general
n and nonsingular polynomials $T_i$.
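The counting of common zeros can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes nonzero a and b, writes $Z_1 = e^u$, $Z_2 = e^v$, and solves the resulting linear system for every choice of logarithm branch. The function name `count_common_zeros` and the rounding tolerance are illustrative assumptions.

```python
import cmath
import math


def count_common_zeros(p, q, r, s, a=1.0, b=1.0):
    """Count solutions of Z1^p Z2^q = a, Z1^r Z2^s = b with Z1, Z2 nonzero.

    Hypothetical helper: with Z1 = exp(u), Z2 = exp(v) the equations become
    p*u + q*v = log(a) + 2*pi*i*m and r*u + s*v = log(b) + 2*pi*i*n.
    Branch choices (m, n) in [0, |det|)^2 cover every coset of the lattice.
    """
    det = p * s - q * r
    if det == 0:
        raise ValueError("ps - qr must be nonzero")
    log_a, log_b = cmath.log(a), cmath.log(b)
    solutions = set()
    for m in range(abs(det)):
        for n in range(abs(det)):
            rhs1 = log_a + 2j * math.pi * m
            rhs2 = log_b + 2j * math.pi * n
            # Cramer's rule for the 2x2 linear system in (u, v).
            u = (s * rhs1 - q * rhs2) / det
            v = (-r * rhs1 + p * rhs2) / det
            z1, z2 = cmath.exp(u), cmath.exp(v)
            # Round so that numerically identical points collapse.
            solutions.add((round(z1.real, 6), round(z1.imag, 6),
                           round(z2.real, 6), round(z2.imag, 6)))
    return len(solutions)


if __name__ == "__main__":
    for exponents in [(2, 1, 1, 3), (1, 1, 1, -1), (3, 1, 1, 1)]:
        p, q, r, s = exponents
        print(exponents, count_common_zeros(p, q, r, s), abs(p * s - q * r))
```

In each case the number of distinct points should agree with $|ps - rq|$, which is the determinant of the exponent matrix.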
Next we consider the singular cases a = b = 0. The D0-brane RR-charge is again
given by $(ps - rq)$. After the condensation of these tachyon fields, 2-branes will also
be generated since the equations $T_1 = T_2 = 0$ are satisfied for $Z_1 = 0$ or $Z_2 = 0$. These
2-branes should be D2-anti-D2 systems because this configuration does not have D2-brane
RR-charge. The generation of D2-anti-D2 pairs is not so surprising. If one assumes p = r, q = s = 0,
then this configuration corresponds to the decay into p D2-anti-D2 pairs at $Z_1 = 0$, as can be
seen easily by using the U(2) rotational symmetry of $(\Gamma_1, \Gamma_2)$. Though we cannot determine
how many D2-anti-D2 systems will be produced for general (p, q, r, s), it is interesting to
note that a system which is generically made of D0-branes can become higher dimensional branes
at singular points in the field space of BSFT.
Finally we would like to comment on the relation between various tachyon condensation
modes and N = 2 boundary (B-type) supersymmetry. In the above arguments on tachyon
condensation in brane-antibrane systems, we have not observed the generation of
D0-branes and anti-D0-branes at the same time (see footnote 4). As can be seen from this example,
we believe that there is an intriguing correlation in general backgrounds between the
BPS nature of final objects and the holomorphy of the tachyon (or N = 2 supersymmetry).
In the next section we will see further evidence for this in orbifold theories,
which give the simplest examples in curved spaces.
3. Tachyon condensation on orbifolds

In this section we discuss tachyon condensation in brane-antibrane systems on
orbifolds [44]. We mainly consider the four dimensional orbifolds $C^2/Z_N$ ($N \geq 2$), but
similar arguments can be applied to higher dimensional examples or more complicated
orbifold projections. The relation between the tachyon condensation in these systems and
the equivariant K-theory was discussed in [34,45]. The tachyon condensation from the
viewpoint of noncommutative geometry [30] was also discussed for orbifolds [32]. Here
we investigate this in the framework of BSFT and determine precisely what will be generated after
the tachyon condensation. Before we do so, let us first review some useful facts
about D-branes on orbifolds [46,47].
3 To see this, let l be the g.c.d. of p and r, with $p = l p'$ and $r = l r'$. Then one obtains $|q r' - s p'|$ solutions for
$Z_2$ from $(Z_2)^{q r' - s p'} = a^{r'} b^{-p'}$. After we insert this in $T_1 = T_2 = 0$ again, we get l solutions for $Z_1$. Thus we
get $|ps - rq|$ solutions.
4 Of course, if one adds more boundary fermions while preserving N = 2 supersymmetry, then we can obtain
D0-anti-D0 systems. What we would like to say here is that we cannot obtain D0-anti-D0 systems if we start from the
minimal number ($= 2^{n-1}$) of D(2n)-anti-D(2n) pairs.
In Type II superstring theory we can consider the orbifold projections on C2 which
preserve half of the bulk supersymmetries. 5 This means that the discrete groups $\Gamma$ of
the orbifold projections should be subgroups of SU(2) and are known to be classified
into A, D, E series. Geometrically, the orbifolds $C^2/\Gamma$ can be realized in the neighborhoods
of the A, D, E singularities in K3 surfaces. These singularities are due to the vanishing
2-cycles in K3. If they are resolved by blowing up, then one gets ALE spaces (see,
for example, [48]). However, in string theory these singularities do not imply physical
singularities. Indeed there are B-field fluxes (= twisted NSNS-fields) through the 2-cycles
[49] and the worldsheet instantons and various D-branes which wrap these cycles do not
become tensionless. Thus the theory is not singular.
Below we concentrate on the A series for simplicity, which is equivalent to the familiar
discrete groups $Z_N$. The action of $Z_N$ is defined as follows:
$$Z_N = \{1, g, g^2, \ldots, g^{N-1}\} \ (g^N = 1), \qquad g:\ z_1 \to e^{\frac{2\pi i}{N}} z_1, \quad z_2 \to e^{-\frac{2\pi i}{N}} z_2, \qquad (3.1)$$
where $z_1, z_2$ denote the coordinates of $C^2$.

Now let us turn to D-branes on $C^2/Z_N$. In this paper we always assume that the D-branes
are particle-like in the $R^{1,5}$ direction. Then BPS Dp-branes exist for p = 0, 2, 4. In
particular D4-branes are wrapped on the whole $C^2/Z_N$. The D2-branes which are parallel
to the z1 -plane or z2 -plane are BPS objects and can be treated with the worldsheet N = 2
supersymmetry.
The open string spectrum of Dp-branes on the orbifold can be given by the $\Gamma$-projection
on the Chan-Paton degrees of freedom [46,47]. In other words, Dp-branes on $C^2/\Gamma$
are classified by the group theoretical representations of the $\Gamma$ action on Chan-Paton
factors. For $\Gamma = Z_N$, there are N irreducible representations and we denote these by
$\{\rho_\alpha\}$ ($\alpha = 0, 1, 2, \ldots, N-1$). The representation $\rho_\alpha$ is defined as the one dimensional
representation which gives the phase rotation $\exp(2\pi i\alpha/N)$. We call a Dp-brane
which corresponds to the representation $\rho_\alpha$ an $\alpha$-type Dp-brane. These N kinds of D-branes are
the most fundamental D-branes. For p = 0 they are called fractional D-branes [50], which
are identified with the D2-branes wrapped on vanishing 2-cycles. It is known that vanishing
2-cycles are also classified by the irreducible representations, and an $\alpha$-type D0-brane
corresponds to a D2-brane wrapped on the corresponding 2-cycle [50-52]. Fractional D-branes are
fixed at the origin z1 = z2 = 0 and cannot move from there. The tension of each of them
is 1/N times that of a bulk D0-brane, which can move freely in the orbifold. On the other
hand, for p = 2, 4 such a Dp-brane has the same tension as the ordinary Dp-brane since
the g-action acts on the worldvolume non-locally. These facts can also be verified by using
5 Of course the discrete group $\Gamma$ is completely different from the fermionic boundary superfield $\Gamma$, though we
use the same symbol below.
boundary state formalism for orbifold theories (see, for example, [37,51-58]) as we will
see in Appendix A.
All the other D-branes in the orbifold theory are regarded as linear combinations of
these fundamental Dp-branes and correspond to the reducible representations $\rho = \oplus_{\alpha=0}^{N-1} c_\alpha \rho_\alpha$ ($c_\alpha \in Z$). For example, the regular representation $\rho_{reg} = \oplus_{\alpha=0}^{N-1} \rho_\alpha$ corresponds
to a bulk D0-brane. Open strings between an $\alpha$-type Dp-brane and a $\beta$-type Dp-brane
belong to the representation $\rho_\alpha \otimes \bar{\rho}_\beta$, where the bar denotes complex conjugation. 6 Then the
super Yang-Mills theories called quiver gauge theories are realized on the worldvolume of
BPS D-branes as shown in [46].
Here we are interested in the tachyon field T which comes from the open string between
an $\alpha$-type Dp-brane and a $\beta$-type anti-Dp-brane. These open strings belong to $\rho_\alpha \otimes \bar{\rho}_\beta$ with
the opposite GSO-projection, and the action of $g \in Z_N$ is given by
$$g:\ T(z_1, z_2) \to e^{\frac{2\pi i}{N}(\alpha-\beta)}\, T\big(e^{-\frac{2\pi i}{N}} z_1,\ e^{\frac{2\pi i}{N}} z_2\big). \qquad (3.2)$$
3.1. Tachyon condensation on orbifolds in BSFT
Now let us investigate the tachyon condensation on orbifolds in BSFT. Again we
are interested in holomorphic tachyon fields, which preserve N = 2 supersymmetry. As
mentioned in the previous section, the spacetime action of BSFT in flat space is defined as
the disk partition function Eq. (2.10). If the worldsheet action $I_0 + I_B$ is invariant under a
certain transformation of the worldsheet fields $X$ and $\psi$, we can twist the theory by this
symmetry. In particular if we regard $g \in Z_N$ as the symmetry, then we get the BSFT action
for D-branes on orbifolds.
Generation of codimension two D-branes
Let us first turn to a D2-anti-D2 pair whose worldvolume is defined by $z_2 = 0$. We assume
that the D2-brane is $\alpha$-type and the anti-D2-brane is $\beta$-type. Note that the branes cannot
move from $z_2 = 0$. Recalling the $g \in Z_N$ action (3.2), the tachyon field should be
projected as follows:
$$T(z_1) = e^{\frac{2\pi i}{N}(\alpha-\beta)}\, T\big(e^{-\frac{2\pi i}{N}} z_1\big). \qquad (3.3)$$
In BSFT this can be equivalently stated as the requirement that the boundary interaction (2.7) be invariant
under the following transformation
$$g:\ \Gamma \to e^{-\frac{2\pi i}{N}(\alpha-\beta)}\, \Gamma, \qquad Z_1 \to e^{\frac{2\pi i}{N}} Z_1. \qquad (3.4)$$
Then the allowed tachyon field is given by
$$T(z_1) = a_q\, (z_1)^{\alpha-\beta+Nq}, \qquad (3.5)$$
where q is a non-negative integer for $\alpha \geq \beta$ and a positive integer for $\beta > \alpha$. The BSFT
action becomes
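6 Note that if one changes the orientation of the open strings, then they belong to $\bar{\rho}_\alpha \otimes \rho_\beta$.
$$S = \int_{C^2/\Gamma} [DZ_1][D\Gamma]\, \exp(-I_0 - I_B) = \frac{1}{N} \int_{C^2} [DZ_1][D\Gamma]\, \exp(-I_0 - I_B) = \frac{\alpha-\beta+Nq}{N}\, T_{D0} \int dx^0, \qquad (3.6)$$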
where we have used the fact that the disk partition function after the condensation of
the tachyon field (3.5) is the same as that for $(\alpha-\beta+Nq)$ D0-branes as explained in
the previous section. The calculation of the bulk RR-charge 7 proceeds in the same way as in
Section 2 and the result is $(\alpha-\beta+Nq)/N$ times that of a BPS bulk D0-brane. 8 Thus
we can conclude that $\alpha-\beta+Nq$ fractional D0-branes will be generated at the point
$z_1 = 0$. Then what kinds of fractional branes will be generated? To answer this question
completely we need the knowledge of twisted RR-charges and we will return to this in
the next subsection. Nevertheless we can obtain some hints from the above arguments.
First let us set $\alpha = \beta$. Then the mode q = 0 corresponds to the decay into the vacuum
as the tachyon condenses, $a_0 \to \infty$. This is impossible for the other cases $\alpha \neq \beta$ since the
types of the brane and the antibrane are different and they cannot annihilate. Note also that
the tachyon field for $\alpha = \beta$ has qN zeros and they are invariant under the geometric $Z_N$
action even if we deform the tachyon field (3.5) by allowed polynomials. Then it is natural
to identify these zeros as q bulk D0-branes. For example, it is obvious that they can move
from $z_1 = 0$. Furthermore, it is not difficult to see that the condensation of the tachyon (3.5)
will generate both q bulk D0-branes and $(\alpha-\beta)$ fractional D0-branes if we assume $\alpha \geq \beta$.
On the other hand, if we assume $\alpha < \beta$, then we will obtain both (q - 1) bulk D0-branes
and $N + (\alpha-\beta)$ fractional D0-branes.
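The bookkeeping in this paragraph can be written as a small helper. This is an illustrative sketch only: the labels `alpha` and `beta` stand for the representation types of the brane and the antibrane, the function name `d0_content` is hypothetical, and the rule it encodes is simply the counting stated above (each bulk D0-brane accounts for N zeros of the tachyon).

```python
def d0_content(alpha, beta, N, q):
    """Return (bulk, fractional) D0-brane numbers for the tachyon z1^(alpha-beta+N*q).

    Hypothetical helper following the counting in the text:
    alpha >= beta: q bulk and (alpha - beta) fractional D0-branes (q >= 0);
    alpha <  beta: (q - 1) bulk and N + (alpha - beta) fractional ones (q >= 1).
    """
    if not (0 <= alpha < N and 0 <= beta < N):
        raise ValueError("alpha and beta must lie in 0..N-1")
    if alpha >= beta:
        if q < 0:
            raise ValueError("q must be non-negative when alpha >= beta")
        bulk, fractional = q, alpha - beta
    else:
        if q < 1:
            raise ValueError("q must be positive when alpha < beta")
        bulk, fractional = q - 1, N + alpha - beta
    # Consistency check: the zeros of the tachyon are all accounted for.
    assert bulk * N + fractional == alpha - beta + N * q
    return bulk, fractional


if __name__ == "__main__":
    print(d0_content(3, 1, 5, 2))
    print(d0_content(1, 3, 5, 2))
```

In both branches the total $N \cdot \text{bulk} + \text{fractional}$ reproduces the degree of the tachyon polynomial, which is the number of fractional D0-branes fixed by the tension and bulk RR-charge.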
Next we turn to tachyon condensation on a D4-anti-D4 pair. In this case we can assume the
following tachyon field
$$T(z_1, z_2) = a_{q,r}\, (z_1)^{\alpha-\beta+Nq}\, (z_1 z_2)^r. \qquad (3.7)$$
We can apply the RR-coupling formula (2.12) or (2.23) to this. Then it is easy to see that the
final state after the tachyon condensation consists of $(\alpha-\beta+Nq+r)$ D2-branes on $z_1 = 0$
and r D2-branes on $z_2 = 0$, each of which corresponds to an irreducible representation.
Generation of codimension four D-branes
Next we consider two D4-anti-D4 pairs and discuss the generation of D0-branes. We can use
the boundary interaction (2.21) with i = 1, 2. To see the matrix representation of tachyon
fields T1 , T2 explicitly, let us use the standard expressions of matrices
7 Here bulk RR-charge means the RR-charge in the untwisted sector. Notice that there are also twisted
RR-charges, which are characteristic of orbifold theories. These charges will be discussed later.
8 Note that one can also consider g invariant tachyon field T = a (z )+Nq . For these the different N = 2
q
1
supersymmetry is preserved and have opposite RR-charge. Thus fractional anti D-branes will be produced from
these.

1 =

3 =
0
1
1
0
0
3
3
0

,
2 =
4 =
0
2
2
0

,

0 i1
,
i1
0
(3.8)
where 1 , 2 , 3 denote Pauli matrices. Then the non-abelian tachyon field T is given by

T2 T1
.
T=
(3.9)
T1 T2
Now we assume that the two D4-branes and two antiD4-branes correspond to the
representation + and + , respectively. The mod N integers , , are
arbitrary. The reason why we restrict to this form is because we want to maintain the
N = 2 supersymmetry in the presence of the boundary perturbation. Indeed if we assume
this form, we can read off from Eq. (3.9) the g-action on boundary fermionic superfields
as follows
g : 1 e
2 i
N ()
1 ,
2 e
2 i
N
2.
(3.10)
Further we can assume that and 0 without losing generality.

The holomorphic tachyon fields are classified into the form Eq. (2.26) with a = b = 0
and in addition they should be ZN -invariant. Here we are interested in the generation of
only D0-branes and thus we assume q = r = 0. 9
Then the tachyon fields are classified into the following form
T1 (z1 , z2 ) = aq (z1 )+Nq ,
T2 (z1 , z2 ) = br (z2 )+Nr .
(3.11)
In the same way as the codimension two case, we can conclude that ( + Nq)
( + Nr) fractional D0-branes will be generated. This can also be regarded as purely
fractional D0-branes and bulk D0-branes. The number of the former is given by ( )
mod N . In particular if one sets = 0 or = , then we obtain only bulk D0-branes. This
is consistent with the fact that the two branes and the two antibranes have identical type for
these cases. The more detailed argument which uses twisted RR-charges will be discussed
in the next section.
3.2. Tachyon condensation on orbifolds and twisted RR-charges
Here we discuss the previous examples of tachyon condensation on the orbifolds from
somewhat different viewpoint: we pay attention to the twisted RR-charges in the orbifold
theories.
Generally, an orbifold theory in the closed string sector [44] consists of an untwisted sector and twisted sectors. Our orbifold $C^2/Z_N$ possesses (N - 1) twisted sectors, which
are twisted by $g, g^2, \ldots, g^{N-1}$. In each of the twisted NSNS-sectors there are four massless
scalars and these correspond to the moduli of hyper-Kähler geometry. On the other hand, in
9 As we saw in Section 2, some D2-anti-D2 systems will be generated for $qr \neq 0$. Note that we cannot deform this
as in (2.26) because of the orbifold projection.
each of the twisted RR-sectors there is one vector field for Type IIA theory. The RR-charges
for these vector fields are called twisted RR-charges.
These charges are carried by D-branes which do not belong to the regular representation.
In other words, these represent the geometrical information that the branes are wrapped
on some non-trivial 2-cycles in the ALE space. Therefore we argue that the twisted
RR-charges should be conserved during the tachyon condensation. 10 Our example is
consistent with this claim as we will see below. Note that this claim is in striking contrast
with the fact that for the untwisted (or equivalently bulk) RR-charge the generation of lower
dimensional D-brane charges does indeed occur. This is due to the non-compactness of
the orbifold. If one considers the orbifolded torus, then the untwisted charges should be
conserved. Indeed the results from the description of tachyon condensation as the marginal
deformation were obtained in the $Z_2$-orbifolded torus [23-25] and the results are consistent
with this.
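The statement that twisted charges are carried only by branes outside the regular representation can be illustrated numerically. The sketch below assumes, as in the boundary state (3.12), that the kth twisted charge of an $\alpha$-type brane is proportional to the phase $e^{2\pi i k\alpha/N}$ with an $\alpha$-independent coefficient; only that phase is kept here, and the name `twisted_phase_sum` is hypothetical.

```python
import cmath
import math


def twisted_phase_sum(multiplicities, k):
    """Sum of exp(2*pi*i*k*alpha/N) weighted by the multiplicity of each type.

    multiplicities[alpha] is the (integer) number of alpha-type branes;
    N is the length of the list. Only the alpha-dependent phase is tracked,
    which is an assumption of this sketch.
    """
    N = len(multiplicities)
    return sum(c * cmath.exp(2j * math.pi * k * alpha / N)
               for alpha, c in enumerate(multiplicities))


if __name__ == "__main__":
    N = 5
    regular = [1] * N            # regular representation: one brane of each type
    single = [0, 1, 0, 0, 0]     # a single fractional brane
    for k in range(N):
        print(k, abs(twisted_phase_sum(regular, k)), abs(twisted_phase_sum(single, k)))
```

For the regular representation the sum vanishes in every twisted sector (k = 1, ..., N - 1) and equals N in the untwisted sector, while a single fractional brane has a non-vanishing phase in every sector.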
To see this more generally, let us remember the RR-coupling formula (2.12). For
compact space, as shown in [42] the Chern character of the superconnection does not
change in cohomology if we shift the value of the tachyon fields continuously. This also
supports the above arguments.
Now let us return to our examples in $C^2/Z_N$. In principle the calculations of twisted RR-charges are possible in BSFT, but the determination of the normalizations is not so easy.
Therefore we calculate the charges in the boundary state formalism. For boundary states in
orbifold theories see, for example, [37,51-58]. This formalism is useful to know couplings
with various fields in closed string sector because the boundary state is the description of a
D-brane from the viewpoint of closed string theory. The detailed computations are shown
in the Appendix A and here we will discuss the results.
The outline of the determination is as follows. First we can find the boundary state
which represents an $\alpha$-type Dp-brane so as to satisfy Cardy's condition [59]. Then the
total boundary state is given by
$$|Dp(\alpha)\rangle = \sum_{k=0}^{N-1} e^{\frac{2\pi i k\alpha}{N}}\, |T^{(k)}\rangle. \qquad (3.12)$$
Here we defined the boundary states for the untwisted sector $|T^{(0)}\rangle = |U\rangle$ and the kth twisted
sectors $|T^{(k)}\rangle$ as follows
$$|U\rangle = \frac{T_p}{2}\big(|U\rangle_{NSNS} + |U\rangle_{RR}\big), \qquad |T^{(k)}\rangle = \frac{T'_p}{2}\big(|T^{(k)}\rangle_{NSNS} + |T^{(k)}\rangle_{RR}\big), \qquad (3.13)$$
where the two normalizations $T_p$ and $T'_p$ can be computed as in Eq. (A.30). Next note that
in the low energy limit the boundary state for each sector is proportional to a massless state
in the sector. Thus we can read off the coupling to the massless field from the coefficient
of the boundary state for each sector [51,52,57,58,60].
10 Similar conservation law for D-branes in NS5-brane background was recently discussed in [29].
In this way we can compute the twisted RR-charges and the result is as follows for the
kth twisted RR-charge Q(k)
,p of a -type Dp-brane
Q(k)
,p
p

k 1 2 2 3/2 1/2
1 2 i k
N
2 sin
= e
2 ( ) .
N
N
(3.14)
Note that the above method cannot determine the phase factors which do not depend on .
Then let us discuss the twisted RR-charges before and after the tachyon condensation.
First consider the generation of the D(p - 2)-brane charge from a Dp-anti-Dp pair by the tachyon
field (3.5). The original Dp-anti-Dp pair (p = 2, 4) has the kth twisted RR-charge $(Q^{(k)}_{\alpha,p} - Q^{(k)}_{\beta,p})$.
Without losing generality we can assume $\alpha \geq \beta$. If one notes the following elementary
formula
$$e^{\frac{2\pi i k\alpha}{N}} - e^{\frac{2\pi i k\beta}{N}} = (i)\, e^{\frac{i\pi k}{N}}\, 2\sin\Big(\frac{\pi k}{N}\Big) \Big[ e^{\frac{2\pi i k\beta}{N}} + \cdots + e^{\frac{2\pi i k(\alpha-1)}{N}} \Big], \qquad (3.15)$$
then one obtains
$$Q^{(k)}_{\alpha,p} - Q^{(k)}_{\beta,p} = (i)\, e^{\frac{i\pi k}{N}} \sum_{\gamma=\beta}^{\alpha-1} Q^{(k)}_{\gamma,p-2}. \qquad (3.16)$$
This shows that the final state after the tachyon condensation on a D2-anti-D2 pair should be
fractional D0-branes of type $\{\beta, \beta+1, \ldots, \alpha-1\}$ with some bulk D0-branes. 11 This is
consistent with the results in the previous subsection that the final state consists of q bulk
D0-branes and $(\alpha-\beta)$ fractional D0-branes. Combining this with the above arguments we
can determine the final state completely.
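The elementary formula (3.15) is a finite geometric sum and can be verified numerically. The following sketch assumes the labelling $\alpha \geq \beta$ for the two representation types; the function names are illustrative.

```python
import cmath
import math


def phase_difference(alpha, beta, k, N):
    """Left-hand side of (3.15): exp(2 pi i k alpha/N) - exp(2 pi i k beta/N)."""
    return (cmath.exp(2j * math.pi * k * alpha / N)
            - cmath.exp(2j * math.pi * k * beta / N))


def geometric_sum_form(alpha, beta, k, N):
    """Right-hand side of (3.15), with the sum running over beta, ..., alpha - 1."""
    prefactor = 1j * cmath.exp(1j * math.pi * k / N) * 2 * math.sin(math.pi * k / N)
    total = sum(cmath.exp(2j * math.pi * k * gamma / N)
                for gamma in range(beta, alpha))
    return prefactor * total


if __name__ == "__main__":
    worst = 0.0
    for N in range(2, 8):
        for k in range(1, N):
            for beta in range(N):
                for alpha in range(beta, N):
                    diff = abs(phase_difference(alpha, beta, k, N)
                               - geometric_sum_form(alpha, beta, k, N))
                    worst = max(worst, diff)
    print("largest deviation:", worst)
```

The prefactor comes from $e^{2\pi i k/N} - 1 = 2i \sin(\pi k/N)\, e^{i\pi k/N}$, which is why the sum contains exactly $\alpha - \beta$ terms, one for each fractional D0-brane type.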
For p = 4, one can also consider more general tachyon field (3.7). These will produce
the intersecting D2-brane system as mentioned in the previous subsection. Then we can
find that the twisted charges are conserved if the charges of r D2-branes on z1 = 0 do
cancel those of z2 = 0. Note also that this configuration is BPS.
Next we turn to the generation of codimension four D-brane charges from the tachyon
fields (3.11). In the same way as before we obtain the following formula
(k)
(k)
(k)
2
Q(k)
,4 + Q+,4 Q,4 Q+,4 = (i)
1

Q(k)
+,0 ,
(3.17)
=+1 =0
where we assumed and 0. This decomposition rule is again consistent with the
result in the previous subsection. Thus we can conclude that after the tachyon condensation
there are ( ) fractional D0-branes and their types are given by the above formula.
In both examples of generating D0-branes, if we shift the Kähler moduli and blow up
the orbifold singularities, then we get (mutually) BPS D2-branes which are wrapped on
the corresponding holomorphic 2-cycles in ALE spaces. Since the Kähler structure is
11 The extra phase (i) e ik/N can be canceled by the phase factor which cannot be determined from the
calculations in the Appendix A because it does not depend on . The origin of e ik/N is easy to understand.
If one considers the DpD(p 2) open string, then the g-action on the fermionic zero mode generates the
factor ei /N . Thus one must project the open string as g = e
2 i ( 1 +)
N 2
.
independent of the complex structure, these holomorphic 2-cycles can be defined by
the equations $T_i = 0$ in the $A_{N-1}$-type hypersurface
$$XY = Z^N, \qquad X = z_1^N, \quad Y = z_2^N, \quad Z = z_1 z_2. \qquad (3.18)$$
It will also be interesting to discuss the relation between the shift of complex structure
and the corresponding tachyon field for these examples, and we leave this as a future
problem.
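The coordinates X, Y, Z of (3.18) are invariant under the orbifold action and satisfy the hypersurface equation identically. The short check below is a sketch under the assumption that g rotates $z_1$ and $z_2$ by opposite phases, as required for a subgroup of SU(2); the function names are hypothetical.

```python
import cmath
import math


def zn_action(z1, z2, N, k=1):
    """Apply g^k, assuming z1 -> w^k z1 and z2 -> w^(-k) z2 with w = exp(2 pi i / N)."""
    w = cmath.exp(2j * math.pi * k / N)
    return w * z1, z2 / w


def invariants(z1, z2, N):
    """Return (X, Y, Z) = (z1^N, z2^N, z1 z2)."""
    return z1 ** N, z2 ** N, z1 * z2


if __name__ == "__main__":
    N = 5
    point = (0.3 + 0.7j, -1.1 + 0.2j)
    X, Y, Z = invariants(*point, N)
    Xg, Yg, Zg = invariants(*zn_action(*point, N, k=2), N)
    print(abs(X - Xg), abs(Y - Yg), abs(Z - Zg))  # invariance under g^2
    print(abs(X * Y - Z ** N))                    # hypersurface equation
```

All printed deviations should be at the level of floating-point rounding, for any point and any power of g.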
3.3. Some comments on generalizations
Before we close this section, let us comment on some generalizations of our results.
First it is easy to see that the generalizations to higher dimensional $Z_N$-orbifolds $C^n/Z_N$
($n \geq 3$) are straightforward since the above arguments largely depend on the algebraic
properties of the discrete groups $Z_N$.
On the other hand, for other types of orbifolds $C^n/\Gamma$ the results will be non-trivial. Here
we do not investigate these further, but it may be natural to conjecture that the following
general relation will hold for each g with a coefficient C(g)
n1
2
i (g)
i=1
n1
2
i=1
i (g) = C(g)
(g),
(3.19)
where (g) is the character of g for the irreducible representation and denotes a
certain subset of irreducible representations which depends on i , i . Note that if we return
to the C 2 /ZN examples, then the character is given by (g k ) = exp(2ik/N) and the
relation Eq. (3.19) is equivalent to Eq. (3.17). The coefficient C(g) will be due to a phase
factor and due to the trace over the zeromodes as in (A.6).
4. Conclusions
In this paper we have discussed the boundary string field theory description of tachyon
condensation with worldsheet N = 2 supersymmetry. This extended supersymmetry
generally requires that the tachyon field should be holomorphic [17]. Therefore it is natural
to believe that this constraint is related to the spacetime supersymmetry of final states after
the tachyon condensation. We have investigated this issue in two examples.
First we have considered brane-antibrane systems in flat space and discussed the
generalization of the Atiyah-Bott-Shapiro configuration. In these arguments the
RR-coupling formula for brane-antibrane systems also played an important role. As a
result, we obtained only BPS configurations from the minimal number of brane-antibrane
pairs.
Next we have investigated tachyon condensation on $Z_N$-orbifolds, mainly in four
dimensions. This is one of the simplest examples in curved spaces and most of our
arguments can be performed algebraically. In this example we have seen that holomorphic
tachyon fields generate various BPS fractional D-branes which are wrapped on various
holomorphic cycles. The conservation law of various twisted RR-charges was used to
identify the final states.
Finally let us mention some future directions. If one wants to see the generation of lower-dimensional D-branes explicitly, it will be useful to construct the (off-shell) boundary states
during the tachyon condensation in the same way as in [21,22,25,61]. This will clarify
the generation of fractional D-branes from brane-antibrane systems.
In particular for BPS D-branes on the four dimensional orbifolds (or K3 surface), the
worldsheet N = 4 superconformal symmetry is realized [37]. Thus it is intriguing to
construct N = 4 boundary interactions and discuss tachyon condensation in BSFT.
As mentioned in Section 3.3, it will also be interesting to investigate other examples of
orbifolds because the consideration of tachyon condensation seems to imply non-trivial
relations among the characters of irreducible representations.
We hope to return to these issues in future work.
Acknowledgements
I am very grateful to T. Eguchi for encouragement and valuable advice. I also thank
Y. Hikida, Y. Matsuo, M. Naka, M. Nozaki, K. Ohmori, Y. Sugawara, S. Terashima and
T. Uesugi for useful discussions. This work is supported by JSPS Research Fellowships
for Young Scientists.
Appendix A. Detailed boundary state computations

Here we compute the cylinder amplitudes of open strings between fractional Dp-branes
(p = 0, 2, 4) on the orbifold $C^2/Z_N$ in order to get correct normalizations of the boundary
states. Similar calculations for p = 0 or for $Z_2$-orbifolds have been performed in various
papers, for example [51-58] (see also [62]). Let us first summarize our conventions.
Conventions for open string
We define the open string Hamiltonian of the worldsheet theory as
$$H_o = \alpha' p^\mu p_\mu + N_o + a, \qquad (A.1)$$
where $p$ is the momentum and $N_o$ is the contribution from oscillators; a denotes the
zero-point energy
$$a = -\frac{1}{2} \ \text{(for NS-sector)}, \qquad a = 0 \ \text{(for R-sector)}. \qquad (A.2)$$
The modulus of the cylinder is denoted by t and we define $q = e^{2\pi i\tau}$ as
$$q = e^{2\pi i\tau} \equiv e^{-2\pi t}. \qquad (A.3)$$
The one-loop amplitude Zopen of open string between a -type Dp-brane and -type
Dp-brane can be written as
Zopen =
N1
1 i 2 ()k (k)
e N
Zopen,
N
(A.4)
k=0
where
(k)
Zopen
is defined by
(k)
Zopen
=2

F
dt
k 1 + (1)
2Ho t
TrNSR g
e
.
2t
2
(A.5)
0
2
This means the ZN -projection into the states which satisfy g = ei N () .

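Next let us consider the bosonic zeromodes along the $C^2/Z_N$ direction. The traces over
these zeromodes become
$$\mathrm{Tr}(1) = V_p \int \Big(\frac{dk}{2\pi}\Big)^p, \qquad \mathrm{Tr}\, g^k = \frac{1}{\big(2\sin(\frac{\pi k}{N})\big)^p}, \qquad (A.6)$$
where $V_p$ denotes the volume of a Dp-brane before the $Z_N$-projection. The second
equation follows from the calculation [62]
$$\int (dz)^2\, \langle z|g^k|z\rangle = \frac{1}{\big(2\sin(\frac{\pi k}{N})\big)^2}, \qquad \langle z|z'\rangle = \delta^2(z - z'). \qquad (A.7)$$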
Then we turn to the fermionic zeromodes in the R-sector along the $C^2/Z_N$ direction. The
action of g on these is defined as follows:
$$g\, |s_1, s_2\rangle = e^{\frac{2\pi i}{N}(s_1 - s_2)}\, |s_1, s_2\rangle, \qquad (A.8)$$
where $s_1, s_2 \in \{\pm\frac{1}{2}\}$ denote the spins of the spacetime fermions. From this one can obtain
the zeromode trace in the R-sector as
$$\mathrm{Tr}_R\, g^k = e^{\frac{2\pi i k}{N}} + e^{-\frac{2\pi i k}{N}} + 2 = 4\cos^2\Big(\frac{\pi k}{N}\Big). \qquad (A.9)$$
Below we use the trace tr over only oscillators (not the bosonic and fermionic zeromodes).
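The zeromode trace (A.9) is a sum over the four spin states and is easy to confirm directly. A minimal sketch, with the hypothetical function name `ramond_zero_mode_trace`:

```python
import cmath
import math
from itertools import product


def ramond_zero_mode_trace(k, N):
    """Sum exp(2 pi i k (s1 - s2) / N) over s1, s2 in {+1/2, -1/2}, as in (A.8)."""
    spins = (0.5, -0.5)
    return sum(cmath.exp(2j * math.pi * k * (s1 - s2) / N)
               for s1, s2 in product(spins, repeat=2))


if __name__ == "__main__":
    for N in (2, 3, 5):
        for k in range(N):
            closed_form = 4 * math.cos(math.pi * k / N) ** 2
            print(N, k, abs(ramond_zero_mode_trace(k, N) - closed_form))
```

The two states with $s_1 = s_2$ contribute 1 each, and the other two contribute conjugate phases, which gives $2 + 2\cos(2\pi k/N) = 4\cos^2(\pi k/N)$.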
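Formulae of $\theta$-functions
Here we summarize the formulae of $\theta$-functions. First we define the following
functions:
$$\eta(\tau) = q^{\frac{1}{24}} \prod_{n=1}^{\infty} (1 - q^n),$$
$$\theta_1(\nu, \tau) = 2 q^{\frac{1}{8}} \sin(\pi\nu) \prod_{n=1}^{\infty} (1 - q^n)(1 - e^{2\pi i\nu} q^n)(1 - e^{-2\pi i\nu} q^n),$$
$$\theta_2(\nu, \tau) = 2 q^{\frac{1}{8}} \cos(\pi\nu) \prod_{n=1}^{\infty} (1 - q^n)(1 + e^{2\pi i\nu} q^n)(1 + e^{-2\pi i\nu} q^n),$$
$$\theta_3(\nu, \tau) = \prod_{n=1}^{\infty} (1 - q^n)(1 + e^{2\pi i\nu} q^{n-\frac{1}{2}})(1 + e^{-2\pi i\nu} q^{n-\frac{1}{2}}),$$
$$\theta_4(\nu, \tau) = \prod_{n=1}^{\infty} (1 - q^n)(1 - e^{2\pi i\nu} q^{n-\frac{1}{2}})(1 - e^{-2\pi i\nu} q^{n-\frac{1}{2}}), \qquad (A.10)$$
where we have defined $q = e^{2\pi i\tau}$.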

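Then the modular transformations are given as follows
$$\eta(\tau) = (-i\tau)^{-\frac{1}{2}}\, \eta(-1/\tau),$$
$$\theta_1(\nu, \tau) = i(-i\tau)^{-\frac{1}{2}}\, e^{-\frac{i\pi\nu^2}{\tau}}\, \theta_1(\nu/\tau, -1/\tau),$$
$$\theta_2(\nu, \tau) = (-i\tau)^{-\frac{1}{2}}\, e^{-\frac{i\pi\nu^2}{\tau}}\, \theta_4(\nu/\tau, -1/\tau),$$
$$\theta_3(\nu, \tau) = (-i\tau)^{-\frac{1}{2}}\, e^{-\frac{i\pi\nu^2}{\tau}}\, \theta_3(\nu/\tau, -1/\tau),$$
$$\theta_4(\nu, \tau) = (-i\tau)^{-\frac{1}{2}}\, e^{-\frac{i\pi\nu^2}{\tau}}\, \theta_2(\nu/\tau, -1/\tau). \qquad (A.11)$$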
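The product formulae and modular transformations can be spot-checked numerically with truncated products. The following is a rough sketch rather than a library-grade implementation: the truncation at 200 factors, the function names, and the sample values of $\tau$ and $\nu$ are all assumptions made for illustration. Powers of q are computed as $e^{2\pi i\tau x}$ to avoid branch ambiguities.

```python
import cmath
import math


def eta(tau, terms=200):
    """Dedekind eta function from its truncated product representation."""
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= 1 - cmath.exp(2j * math.pi * tau * n)
    return cmath.exp(2j * math.pi * tau / 24) * prod


def theta(index, nu, tau, terms=200):
    """Jacobi theta_index(nu, tau), index in 1..4, from truncated products."""
    def qpow(x):
        return cmath.exp(2j * math.pi * tau * x)

    z = cmath.exp(2j * math.pi * nu)
    shift = 0.0 if index in (1, 2) else 0.5   # q^n versus q^(n - 1/2)
    sign = 1 if index in (2, 3) else -1       # (1 + ...) versus (1 - ...)
    prod = 1.0
    for n in range(1, terms + 1):
        qn = qpow(n - shift)
        prod *= (1 - qpow(n)) * (1 + sign * z * qn) * (1 + sign * qn / z)
    if index == 1:
        return 2 * qpow(1 / 8) * cmath.sin(math.pi * nu) * prod
    if index == 2:
        return 2 * qpow(1 / 8) * cmath.cos(math.pi * nu) * prod
    return prod


if __name__ == "__main__":
    tau, nu = 0.8j, 0.3
    factor = (-1j * tau) ** -0.5 * cmath.exp(-1j * math.pi * nu ** 2 / tau)
    print(abs(eta(tau) - (-1j * tau) ** -0.5 * eta(-1 / tau)))
    print(abs(theta(2, nu, tau) - factor * theta(4, nu / tau, -1 / tau)))
    print(abs(theta(3, nu, tau) - factor * theta(3, nu / tau, -1 / tau)))
    # Jacobi's identity at nu = 0, responsible for the vanishing of the
    # untwisted amplitude between identical BPS branes.
    print(abs(theta(3, 0, tau) ** 4 - theta(4, 0, tau) ** 4 - theta(2, 0, tau) ** 4))
```

For purely imaginary $\tau$ the factor $(-i\tau)^{-1/2}$ is real and positive, so no branch choice is involved in these checks.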
Open string cylinder amplitudes

(k)
Let us compute the open string cylinder amplitudes Zopen . We only consider the two
coincident Dp-branes.
For the untwisted part k = 0, we obtain

(0)
Zopen
= 2Vp+1
dt 2 p+1
3 (0, it)4 4 (0, it)4 2 (0, it)4
2
8 t
2t
2(it)12
=2
3p+5
2
3p+5
2
ds s
p9
2
( )
p+1
2
Vp+1
3 (0, is/)4 2 (0, is/)4 4 (0, is/)4

,
2(is/)12
(A.12)
where Vp+1 is equal to Vp times the volume V1 of time-like direction. Note that in the
last expression we have performed the modular transformation. For the kth twisted parts,
the result is

2
dt 2 12 (2 sin( k
2V1
1
N ))
(k)
Zopen =
8
k p
k
2
2t
2(it)6
(2 sin( N ))
1 ( N , it)
0

3 (0, it)2 3 (k/N, it)2 4 (0, it)2 4 (k/N, it)2

2 (0, it)2 2 (k/N, it)2
2p
(2 sin( k
N ))
(i)2

k 2
5 (2 sin(
1
N ))
ds s 2
k
6 ( , is/)2
2
(is/)
1 ( N , it)
1 k

2
2
3 (0, is/) 3 (k , is/) 4 (0, is/)2 4 (k , is/)2

2 (0, is/)2 2 (k , is/)2 ,
5
= 2 2 2 2 V1
(A.13)
where we have defined k = iks/N .

Then let us compare these results with those from the boundary state calculations. Before
that we summarize the conventions. We use the light-cone gauge in NSR formulation [63]
and closely follow the normalization in [60].
Conventions for boundary state
The closed string Hamiltonian is defined by
Hc = k k + 2(NL + NR ) + 4a,
(A.14)
where NL and NR are the contributions from left-moving and right-moving oscillators;
a denotes the zero energy
k
1
(for NSNS-sector),
a = 0 (for RR-sector).
a= +
(A.15)
2 N
Note also that the momentum k in twisted sectors is always zero along the orbifold
direction C 2 /ZN .
Further one can define the propagator as

1

=
(A.16)
ds e 2 sHc .
2
The boundary state for the untwisted-sector and k = 1, 2, . . . , (N 1)th twisted-sectors
are given by
Tp
(|U NSNS + |U RR ),
2

(k) Tp
T
(|T (k) NSNS + |T (k) RR ),
(A.17)
=
2
where the constants Tp , Tp represent the tension and charges of the D-brane and will be
determined later. We have defined |U sector and |T (k) sector as
9p

1
dk
U, +, k a
|U NSNS =
(A.18)
U, , k a NSNS ,
NSNS
2
2
9p

dk
U, +, k a
U, , k a
|U RR = 2
(A.19)
+
,
RR
RR
2
5

(k)

1
dk (k)
T
=
(A.20)
T , +, k i NSNS T (k) , , k i NSNS ,
NSNS
2
2
|U =
(k)
T

=
NSNS
dk
2
5
(k)

T , +, k i
RR

+ T (k) , , k i RR ,
(A.21)
where k a and k i are momenta of the Dp-brane in the untwisted and twisted sectors,
respectively. If we regard x 6 , . . . , x 9 as the coordinates of C 2 /ZN , then we can take a =
1, . . . , 9 p and i = 1, 2, . . . , 5. The explicit forms of |U, , k a sector , |T (k) , , k i sector
are determined by the requirement that they should satisfy the desirable boundary
conditions. These conditions are solved by elementary calculations and the explicit forms
are given by coherent states of left and right-moving oscillators. Here we show the
explicit expression only for p = 0 in NSNS-sector as follows (we assume k < N/2 for
simplicity of the notation and we define T (0) = U )
(k)

T , ?, k
NSNS

5

5

1

= exp
n n + i?
r r
n
n=1
=2
r>0 =2

1
1
exp
k
k
k
k
n + Nk n N n N
n Nk n+ N n+ N
n=0
n=1

1
1
k
k
k
k
n Nk n+ N n+ N
n + Nk n N n N
n=0
n=1

r k r k +
r+ k r+ k
+ i?
N
r>0
+ i?

r>0
r+
r+ Nk
r>0

r>0
r k r k
N

(k)

T , ?, k (0)
NSNS
(A.22)
) and
where we defined the zeromode as |T (k) , ?, k (0)
NSNS . The oscillators ( ,
( , ) are for bosonic fields (XL , XR ) and for fermionic fields (L , R ) on the
denote the oscillators for (Z 1 , Z 1 , Z 2 , Z 2 ) and (, ,
, ) are
worldsheet; (, ,
, )
L
R
L
R
their superpartners. They follow the canonical (anti)commutation relations

m+ k , n k = (m + k/N)m,n ,
N
N

m k , n+ k = (m k/N)m,n ,
N
N

r k , r+ k = r+s ,
N
N

r+ k , r k = r+s .
(A.23)
N
The expressions for the others are also written almost in the same form as (A.22). For
more details we recommend the readers to refer to [51,52,55,57], for example.
We also comment that the above definition (A.20) does not work for k = N/2 because
there are extra fermionic zeromodes in twisted NSNS-sector along the orbifold direction.
In this case one should change the factor in front of R.H.S. of (A.20) into 1 and the sign in
the middle of (A.20) into +.
Next the zeromodes are normalized as follows: for the untwisted and twisted sectors

a a (0)
= Vp+1 (2)9p 9p k a k a ,
k k
i i (0)

k k
(A.24)
= V1 (2)5 5 k i k i .
Finally we get the total boundary state $|Dp(\alpha)\rangle$ which describes an $\alpha$-type Dp-brane as
follows:
$$|Dp(\alpha)\rangle = \sum_{k=0}^{N-1} e^{\frac{2\pi i k\alpha}{N}}\, |T^{(k)}\rangle. \qquad (A.25)$$
The phase factors $e^{2\pi i k\alpha/N}$ are inserted in order to be consistent with the open string
calculations. These are proportional to the charges in the twisted sectors.
Open-closed duality
As argued by Cardy [59], the one-loop amplitude of open string should be equal to the
tree level amplitude between two boundary states in closed string. This requirement is
called Cardy's condition and often gives a crucial consistency condition of D-branes. In
our case, we can write this requirement as follows:
Zopen = Dp()||Dp().
(A.26)
Then comparing this with the Eq. (A.4) and using (A.25), we obtain

1 (k)
Zopen = T (k) T (k) .
N
(A.27)
Boundary state calculations and determination of the normalization

Now let us compute the cylinder amplitude in the boundary state formalism. The result
for untwisted-sector is given by
U ||U
9p
Vp+1 Tp2
1 2
dk
ds e 2 k s
=
16
2
3 (0, is/)4 2 (0, is/)4 4 (0, is/)4

.
(is/)12
(A.28)
For kth twisted-sectors we obtain

5

V1 (Tp )2

2
1 2
dk
ds e 2 k s (is/)6 (i)1(k , is/)
T (k) T (k) =
16
2

2
3 (0, is/) 3 (k , is/)2 4 (0, is/)2 4 (k , is/)2

2 (0, is/)2 2 (k , is/)2 .
(A.29)
Then, after performing the integration in the above equations, we can determine the
normalizations Tp , Tp from Cardy's condition:
3p
7
1
Tp = 23p 2 p ( ) 2 ,
N

p
1 2 3 1
k 1 2

2
2
Tp = 2 ( ) 2 sin
.
N
N
(A.30)
Tension and charges

Finally let us determine the tension TDp and kth twisted RR-charges Q(k)
,p of a -type
Dp-brane. Generally, one can compute a coupling with a closed string field from the
overlap of the boundary state with the corresponding vertex operator as discussed in [60].
Therefore the tension and twisted RR-charges of our example can also be read off from the
boundary state |Dp() (A.25) as follows
T0
TD2 = N T2 ,
TD4 = N T4 ,
TD0 = ,
N
2 i
1
N k T ,
Q(k)
(A.31)
,p = e
p
N
where the factor 1/√N is needed for the correct normalization of untwisted fields [57];
the different coefficients of the tensions for p = 2, 4 are due to the fact that the volume
factor V_{p+1} in (A.24) should be divided by N in the physical context. It is then clear that
the tension of a D0-brane of the type described by (A.25) is 1/N times that of an ordinary D0-brane in flat space.
On the other hand, for a D2- or D4-brane of this type the tension is the same as that of an ordinary
D-brane. Some aspects of twisted RR-charges were discussed in Section 3. Note that the
D-brane also has an untwisted charge and twisted NSNS charges, which are proportional to
the tension and the twisted RR-charges, respectively.
References
[1] M.B. Green, Pointlike states for type 2b superstrings, Phys. Lett. B 329 (1994) 435, hep-th/9403040.
[2] T. Banks, L. Susskind, Brane-antibrane forces, hep-th/9511194.
[3] A. Sen, Tachyon condensation on the brane-antibrane system, JHEP 9808 (1998) 012, hep-th/9805170.
[4] O. Bergman, M.R. Gaberdiel, Stable non-BPS D-particles, Phys. Lett. B 441 (1998) 133, hep-th/9806155.
[5] A. Sen, Non-BPS states and branes in string theory, hep-th/9904207.
[6] K. Bardakci, Dual models and spontaneous symmetry breaking, Nucl. Phys. B 68 (1974) 331;
K. Bardakci, Spontaneous symmetry breakdown in the standard dual string model, Nucl.
Phys. B 133 (1978) 297;
K. Bardakci, M.B. Halpern, Explicit spontaneous breakdown in a dual model, Phys. Rev. D 10
(1974) 4230;
K. Bardakci, M.B. Halpern, Explicit spontaneous breakdown in a dual model. 2. N point
functions, Nucl. Phys. B 96 (1975) 285.
[7] E. Witten, Noncommutative geometry and string field theory, Nucl. Phys. B 268 (1986) 253.
[8] N. Berkovits, SuperPoincare invariant superstring field theory, Nucl. Phys. B 450 (1995) 90,
hep-th/9503099;
N. Berkovits, A new approach to superstring field theory, Fortsch. Phys. 48 (2000) 31, hep-th/9912121.
[9] V.A. Kostelecky, S. Samuel, On a nonperturbative vacuum for the open bosonic string, Nucl.
Phys. B 336 (1990) 263;
V.A. Kostelecky, S. Samuel, The static tachyon potential in the open bosonic string theory,
Phys. Lett. B 207 (1988) 169;
A. Sen, B. Zwiebach, Tachyon condensation in string field theory, JHEP 0003 (2000) 002,
hep-th/9912249;
N. Berkovits, A. Sen, B. Zwiebach, Tachyon condensation in superstring field theory, Nucl.
Phys. B 587 (2000) 147, hep-th/0002211.
[10] K. Ohmori, A review on tachyon condensation in open string field theories, hep-th/0102085.
[11] E. Witten, On background independent open string field theory, Phys. Rev. D 46 (1992) 5467,
hep-th/9208027;
E. Witten, Some computations in background independent off-shell string theory, Phys.
Rev. D 47 (1993) 3405, hep-th/9210065.
[12] S.L. Shatashvili, Comment on the background independent open string theory, Phys. Lett. B 311
(1993) 83, hep-th/9303143;
S.L. Shatashvili, On the problems with background independence in string theory, hep-th/9311177.
[13] A.A. Gerasimov, S.L. Shatashvili, On exact tachyon potential in open string field theory,
JHEP 0010 (2000) 034, hep-th/0009103.
[14] D. Kutasov, M. Marino, G. Moore, Some exact results on tachyon condensation in string field
theory, JHEP 0010 (2000) 045, hep-th/0009148.
[15] E.S. Fradkin, A.A. Tseytlin, Nonlinear electrodynamics from quantized strings, Phys.
Lett. B 163 (1985) 123;
O.D. Andreev, A.A. Tseytlin, Partition function representation for the open superstring effective
action: cancellation of Mobius infinities and derivative corrections to Born-Infeld Lagrangian,
Nucl. Phys. B 311 (1988) 205;
A.A. Tseytlin, Sigma model approach to string theory effective actions with tachyons, hep-th/0011033.
[16] D. Kutasov, M. Marino, G. Moore, Remarks on tachyon condensation in superstring field
theory, hep-th/0010108.
[17] K. Hori, Linear models of supersymmetric D-branes, hep-th/0012179.
[18] P. Kraus, F. Larsen, Boundary string field theory of the D D-bar system, hep-th/0012198.
[19] T. Takayanagi, S. Terashima, T. Uesugi, Brane-antibrane action from boundary string field
[20] A. Sen, SO(32) spinors of type I and other solitons on brane-antibrane pair, JHEP 9809 (1998)
023, hep-th/9808141.
[21] M. Frau, L. Gallot, A. Lerda, P. Strigazzi, Stable non-BPS D-branes in type I string theory,
Nucl. Phys. B 564 (2000) 60, hep-th/9903123.
[22] Y. Matsuo, Tachyon condensation and boundary states in bosonic string, hep-th/0001044.
[23] A. Sen, BPS D-branes on non-supersymmetric cycles, JHEP 9812 (1998) 021, hep-th/9812031.
[24] J. Majumder, A. Sen, Blowing up D-branes on non-supersymmetric cycles, JHEP 9909 (1999)
004, hep-th/9906109;
J. Majumder, A. Sen, Vortex pair creation on brane-antibrane pair via marginal deformation,
JHEP 0006 (2000) 010, hep-th/0003124;
J. Majumder, A. Sen, Non-BPS D-branes on a Calabi-Yau orbifold, JHEP 0009 (2000) 047,
hep-th/0007158.
[25] M. Naka, T. Takayanagi, T. Uesugi, Boundary state description of tachyon condensation,
JHEP 0006 (2000) 007, hep-th/0005114.
[26] Y. Oz, T. Pantev, D. Waldram, Brane-antibrane systems on Calabi-Yau spaces, hep-th/0009112.
[27] R. Tatar, A note on non-commutative field theory and stability of brane-antibrane systems,
hep-th/0009213.
[28] Y. Hikida, M. Nozaki, T. Takayanagi, Tachyon condensation on fuzzy sphere and noncommutative solitons, Nucl. Phys. B 595 (2001) 319, hep-th/0008023.
[29] Y. Hikida, M. Nozaki, Y. Sugawara, Formation of spherical D2-brane from multiple D0-branes,
hep-th/0101211.
[30] K. Dasgupta, S. Mukhi, G. Rajesh, Noncommutative tachyons, JHEP 0006 (2000) 022, hep-th/0005006;
J.A. Harvey, P. Kraus, F. Larsen, E.J. Martinec, D-branes and strings as non-commutative
solitons, JHEP 0007 (2000) 042, hep-th/0005031;
J.A. Harvey, P. Kraus, F. Larsen, Exact noncommutative solitons, JHEP 0012 (2000) 024, hep-th/0010060.
[31] I. Bars, H. Kajiura, Y. Matsuo, T. Takayanagi, Tachyon condensation on noncommutative torus,
hep-th/0010101;
E.M. Sahraoui, E.H. Saidi, Solitons on compact and noncompact spaces in large noncommutativity, hep-th/0012259.
[32] E.J. Martinec, G. Moore, Noncommutative solitons on orbifolds, hep-th/0101199.
[33] M.R. Douglas, Two lectures on D-geometry and noncommutative geometry, hep-th/9901146;
M.R. Douglas, Topics in D-geometry, Class. Quant. Grav. 17 (2000) 1057, hep-th/9910170.
[34] E. Witten, D-branes and K-theory, JHEP 9812 (1998) 019, hep-th/9810188.
[35] J.A. Harvey, D. Kutasov, E.J. Martinec, On the relevance of tachyons, hep-th/0003101.
[36] K. Becker, M. Becker, A. Strominger, Five-branes, membranes and nonperturbative string
theory, Nucl. Phys. B 456 (1995) 130, hep-th/9507158.
[37] H. Ooguri, Y. Oz, Z. Yin, D-branes on Calabi-Yau spaces and their mirrors, Nucl. Phys. B 477
(1996) 407, hep-th/9606112.
[38] M.F. Atiyah, R. Bott, A. Shapiro, Clifford Modules, Topology 3 (1964) 3.
[39] N.P. Warner, Supersymmetry in boundary integrable models, Nucl. Phys. B 450 (1995) 663,
hep-th/9506064.
[40] C. Kennedy, A. Wilkins, Ramond-Ramond couplings on brane-antibrane systems, Phys.
Lett. B 464 (1999) 206, hep-th/9905195.
[41] M. Alishahiha, H. Ita, Y. Oz, On superconnections and the tachyon effective action, hep-th/0012222.
[42] D. Quillen, Superconnection and the Chern character, Topology 24 (1985) 89.
[43] S. Terashima, A construction of commutative D-branes from lower dimensional non-BPS D-branes, hep-th/0101087.
[44] L. Dixon, J.A. Harvey, C. Vafa, E. Witten, Strings on orbifolds, Nucl. Phys. B 261 (1985) 678;
L. Dixon, J.A. Harvey, C. Vafa, E. Witten, Strings on orbifolds. 2, Nucl. Phys. B 274 (1986)
285;
L. Dixon, D. Friedan, E. Martinec, S. Shenker, The conformal field theory of orbifolds, Nucl.
Phys. B 282 (1987) 13.
[45] H. Garcia-Compean, D-branes in orbifold singularities and equivariant K-theory, Nucl.
Phys. B 557 (1999) 480, hep-th/9812226.
[46] M.R. Douglas, G. Moore, D-branes, Quivers, and ALE Instantons, hep-th/9603167.
[47] C.V. Johnson, R.C. Myers, Aspects of type IIB theory on ALE spaces, Phys. Rev. D 55 (1997)
6382, hep-th/9610140.
[48] T. Eguchi, P.B. Gilkey, A.J. Hanson, Gravitation, gauge theories and differential geometry,
Phys. Rep. 66 (1980) 213.
[49] P.S. Aspinwall, Enhanced gauge symmetries and K3 surfaces, Phys. Lett. B 357 (1995) 329,
hep-th/9507012.
[50] D. Diaconescu, M.R. Douglas, J. Gomis, Fractional branes and wrapped branes, JHEP 9802
(1998) 013, hep-th/9712230.
[51] D. Diaconescu, J. Gomis, Fractional branes and boundary states in orbifold theories, JHEP 0010
(2000) 001, hep-th/9906242.
[52] T. Takayanagi, String creation and monodromy from fractional D-branes on ALE spaces,
JHEP 0002 (2000) 040, hep-th/9912157.
[53] F. Hussain, R. Iengo, C. Nunez, C.A. Scrucca, Interaction of moving D-branes on orbifolds,
Phys. Lett. B 409 (1997) 101, hep-th/9706186.
[54] A. Sen, Stable non-BPS bound states of BPS D-branes, JHEP 9808 (1998) 010, hep-th/9805019;
O. Bergman, M.R. Gaberdiel, Non-BPS states in heterotic-type IIA duality, JHEP 9903 (1999)
013, hep-th/9901014;
M.R. Gaberdiel, B.J. Stefanski, Dirichlet branes on orbifolds, Nucl. Phys. B 578 (2000) 58,
hep-th/9910109.
[55] M. Billo, B. Craps, F. Roose, On D-branes in type 0 string theory, Phys. Lett. B 457 (1999) 61,
hep-th/9902196.
[56] I. Brunner, R. Entin, C. Romelsberger, D-branes on T(4)/Z(2) and T-duality, JHEP 9906 (1999)
016, hep-th/9905078.
[57] M. Billo, B. Craps, F. Roose, Orbifold boundary states from Cardy's condition, JHEP 0101
(2001) 038, hep-th/0011060.
[58] M. Bertolini, P. Di Vecchia, M. Frau, A. Lerda, R. Marotta, I. Pesando, Fractional D-branes and
their gauge duals, JHEP 0102 (2001) 014, hep-th/0011077;
M. Frau, A. Liccardo, R. Musto, The geometry of fractional branes, hep-th/0012035;
P. Merlatti, G. Sabella, World volume action for fractional branes, hep-th/0012193.
[59] J.L. Cardy, Boundary conditions, fusion rules and the Verlinde formula, Nucl. Phys. B 324
(1989) 581.
[60] P. Di Vecchia, M. Frau, I. Pesando, S. Sciuto, A. Lerda, R. Russo, Classical p-branes from
boundary state, Nucl. Phys. B 507 (1997) 259, hep-th/9707068.
[61] S.P. de Alwis, Boundary string field theory the boundary state formalism and D-brane tension,
hep-th/0101200.
[62] E.G. Gimon, C.V. Johnson, K3 orientifolds, Nucl. Phys. B 477 (1996) 715, hep-th/9604129.
[63] O. Bergman, M.R. Gaberdiel, A non-supersymmetric open-string theory and S-duality, Nucl.
Phys. B 499 (1997) 183, hep-th/9701137.

Non-critical non-singular bosonic strings, linear dilatons and holography

Enrique Álvarez a,b , César Gómez c , Lorenzo Hernández a,b , Pedro Resco a,b
a Instituto de Física Teórica, C-XVI, Universidad Autónoma de Madrid, E-28049-Madrid, Spain 1
b Departamento de Física Teórica, C-XI, Universidad Autónoma de Madrid, E-28049-Madrid, Spain
c I.M.A.F.F., C.S.I.C., Calle de Serrano 113, E-28006-Madrid, Spain
Abstract
AdS_5 with linear dilaton and nonvanishing B-field is shown to be a solution of the noncritical
string beta function equations. A noncritical (D = 5) solution interpolating between flat spacetime
and AdS_5, with asymptotic linear dilaton and nonvanishing B-field, is also presented. This solution
is free of spacetime singularities and its string coupling constant is bounded everywhere. Both
solutions admit a holographic interpretation in terms of N = 0 field theories. Closed string tachyon
stability is also discussed. © 2001 Elsevier Science B.V. All rights reserved.
1. Introduction
Much effort has recently been devoted to research in anti-de Sitter (AdS) string
backgrounds. The reason is at least twofold. First of all, holography (cf. [6,7,10]) relates
bulk gravitational physics in AdS_5 with some conformal field theory (CFT) at the boundary.
In addition, AdS is by far the simplest geometry where the ideas of Randall-Sundrum [9]
on four-dimensional confinement of gravity and hierarchy generation can be implemented.
The ten-dimensional background AdS_5 × S^5 is a well-known background geometry for
the type IIB string description of the near horizon limit of D3 branes. In order for it to
become a consistent string background, a self-dual Ramond-Ramond (RR) five-form has
to be turned on. The five-sphere S^5 plays the role of an internal manifold, and is associated
with the extra scalars required by N = 4 supersymmetry.
One of the main motivations of the present paper stems from the natural question as to
whether AdS_5 can define a noncritical string background. We will actually show this to be
the case with euclidean signature, provided both an appropriate dilaton and Kalb-Ramond
field are turned on.
E-mail address: enrique.alvarez@uam.es (E. Álvarez).
1 Unidad de Investigación Asociada al Centro de Física Miguel Catalán (C.S.I.C.).
PII: S0550-3213(01)00146-8
E. Álvarez et al. / Nuclear Physics B 603 (2001) 286-296
A simple extrapolation of the holographic principle to the present situation suggests that
this background should be dual to some four-dimensional, nonsupersymmetric (N = 0)
CFT on the boundary.
Actually, the simplest noncritical bosonic string background is the linear dilaton [8],
with a dilaton field
$$ \Phi = Q r \qquad (1) $$
living in flat d-dimensional Minkowski (or euclidean) spacetime,
$$ ds^2 = d\vec{x}^{\,2}_{(1,d-1)} + dr^2 \qquad (2) $$
with
$$ Q^2 \equiv -\frac{d - 25}{6\alpha'} \qquad (3) $$
(where $Q \in \mathbb{R}$ whenever $D \equiv d + 1 < 26$).
It is plain that we actually have two Z_2-related different potential vacua corresponding
to the two admissible signs in (1). Bearing in mind a holographic interpretation of the
coordinate r, it is natural to ask whether there are topologically nontrivial configurations
which interpolate between these two vacua, i.e., with a dilaton behavior of the type
$$ \Phi(r) \sim -Qr \quad (r \to +\infty), \qquad \Phi(r) \sim +Qr \quad (r \to -\infty). \qquad (4) $$
The answer is in the affirmative (again, only with euclidean signature). The first
interpretation of this solution that comes to mind is that of a sort of tunneling between
the two Z_2-related linear dilaton behaviors defined above.
Asymptotically ($r \to +\infty$) this interpolating solution is just the linear dilaton in flat
spacetime with weak string coupling constant. As $r \to -\infty$ the solution becomes AdS_5
with linear dilaton, but again with weak string coupling constant (see Fig. 2). In addition,
the solution is free of singularities, with a bounded curvature everywhere.
It is of course interesting to look for the holographic interpretation of this solution.
String backgrounds that behave asymptotically at r as
2
+ dr 2 + ds 2 (M),
ds 2 = d x(1,d1)
gs2
=e
r
(5)
(6)
where is a constant and M a compact internal manifold, trivially fibered on the flat
(d + 1)-dimensional spacetime, have been first considered, from the holographic point of
view, in Ref. [1].
It was argued there that string backgrounds of this type are holographic. This fact can
be easily understood by considering the metric in the Einstein frame. In this frame, and
thanks to the linear dilaton, the (d + 1)-dimensional spacetime metric becomes effectively an
AdS metric, which in particular implies the necessary condition for holography, namely
that timelike geodesics never reach the boundary. According to [1], the holographic dual
of these backgrounds is a nonlocal little string theory. One of the main problems of the linear
dilaton backgrounds of type (5) considered in [1] is how to regulate the divergences in the
strong coupling region ($r \to -\infty$).
At this point it is worth noticing that the interpolating solution we present in this
paper behaves asymptotically exactly as in (5), except that we do not have any
internal compact manifold. 2 Moreover, our solution automatically regulates the divergence
at $r \to -\infty$ by becoming, in a smooth way, AdS_5 with weak string coupling constant. From
the holographic point of view the boundary theory could be a nonlocal little string theory
with N = 0 supersymmetry. If this is the case, it would be very interesting to understand whether
the hypothetical nonlocality alluded to above could be due in our case to the nonvanishing
expectation value of the Kalb-Ramond field.
The other solution we find, namely AdS_5 with linear dilaton and B-field, is also
interesting in itself from the point of view of holography. The most natural candidate
for the boundary theory would of course be an N = 0 theory coupled to a nonvanishing
B-field. In the context of N = 8 gauged supergravity there are two nontrivial AdS_5 minima
associated with expectation values for the 10 and the 20 multiplets [5]. Holographically,
these two vacua should correspond to N = 0 four-dimensional conformal field theories.
In these cases the string background is critical, AdS_5 × W_5, but with the internal manifold
W fibered in a nontrivial way on the holographic coordinate. It would be interesting to
study whether there exists any relation between these N = 0 backgrounds predicted by N = 8
supergravity and the AdS_5 solution with nonvanishing B described in this paper.
Finally, let us just mention that the interpolating solution we present is only valid with
euclidean signature. This is the reason we tend to interpret it as a sort of tunneling effect.
2. AdS5 and the bosonic string

The first step in determining whether strings can live in a given background is to check
the vanishing of the Weyl anomaly coefficients, that is:
$$ R_{AB} + 2\nabla_A\nabla_B\Phi - \frac{1}{4} H_{ACD} H_B{}^{CD} = 0, $$
$$ -\frac{1}{2}\nabla^C H_{CAB} + \nabla^C\Phi\, H_{CAB} = 0, $$
$$ q + (\nabla\Phi)^2 - \frac{1}{2}\nabla^2\Phi - \frac{1}{24} H^2 = 0. \qquad (7) $$
We have denoted by capital latin letters spacetime indices ($A, B, \ldots = 0, 1, \ldots, D - 1 \equiv d$),
and by greek letters Poincaré indices ($\mu, \nu, \ldots = 0, \ldots, d - 1$).
We intend to look for backgrounds in which the metric part enjoys Poincaré invariance,
that is
$$ ds^2 = a(r)\, d\vec{x}^{\,2}_{(1,d-1)} + dr^2. \qquad (8) $$
2 Although we could, just by introducing toroidal spectator dimensions.
We allow, however, for nontrivial dilaton and Kalb-Ramond backgrounds. The dilaton background
still preserves IO(1, d − 1) invariance because it depends on the holographic coordinate
only, $\Phi = \Phi(r)$. But any nontrivial antisymmetric background necessarily breaks it (in the
sense that $\mathcal{L}(k) H_{ABC} \neq 0$ for all nontranslational Killing vectors k), the only remaining
unbroken symmetry being the four-dimensional translation group, T_4.
Let us begin by assuming an AdS ansatz,
$$ a(r) = e^{2Qr}, \qquad (9) $$
dressing it with a linear dilaton,
$$ \Phi(r) = Qr + \Phi_0, \qquad (10) $$
and allowing for a nontrivial Kalb-Ramond field,
$$ b_{\mu\nu}(r) = \frac{c_{\mu\nu}}{2Q}\, e^{2\Phi_0}\, a(r) + b^0_{\mu\nu}, \qquad (11) $$
where both $c_{\mu\nu}$ and $b^0_{\mu\nu}$ are constant tensors.
Remarkably enough, this ansatz can easily be shown to be a solution, provided only that
the constants obey the following relationships:
c c = 0 ( = ),
(2 d) = 2e40 c c ,
d = e40 c c .
(12)
These equations are compatible only if
$$ d = \frac{d(d-2)}{2}, \qquad (13) $$
which uniquely fixes
$$ d = 4, \qquad (14) $$
a result that seems worth noticing. On the other hand, the other equations of the set (12) are
equivalent to
2
= cj2k ,
c0i
(15)
where i, j, k = 1, 2, 3 in cyclic order. It is precisely this condition that forbids a

minkowskian solution, if the KalbRamond field is to be real.
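For completeness, the compatibility condition (13) can be solved in one line. The only assumption made in this sketch is that the trivial root d = 0 is discarded as unphysical:

```latex
d = \frac{d(d-2)}{2}
\;\Longrightarrow\; 2d = d^2 - 2d
\;\Longrightarrow\; d\,(d-4) = 0
\;\Longrightarrow\; d = 4 \quad (d \neq 0).
```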
Let us define the useful combination

2
c0i
.
c2
(16)
i
The only remaining equation relates this scale with the value of the dilaton at the origin:
$$ c^2 = -4q\, e^{-4\Phi_0}. \qquad (17) $$
The gravitational part of this solution is just AdS with radius $R = 1/Q = \sqrt{2\alpha'/7}$, whereas
the dilatonic part is exactly the noncritical linear dilaton; to combine the two in a noncritical
background it has been necessary to turn on the antisymmetric field, which in turn is only
possible in euclidean signature (Fig. 1). 3
Fig. 1. Warp factor, dilaton and Kalb-Ramond field solution vs. holographic coordinate.
There are some simple generalizations of this solution. First of all, we have considered
for the time being the bosonic string only, but nothing prevents us from assuming worldsheet
supersymmetry (wss). This changes the value of q only, to $q = (d - 9)/4\alpha' = -5/4\alpha'$, in
such a way that the new radius of AdS is $\sqrt{4\alpha'/5}$.
A different generalization involves adding a Ricci flat internal manifold coordinatized
by y:
$$ ds^2 = a(r)\, d\vec{x}^{\,2}_{d} + dr^2 + d\vec{y}^{\,2}_{d'}. \qquad (18) $$
The new value of the constant q is now $q = (d + d' - 25)/6\alpha'$, or, with wss, $q = (d + d' - 9)/4\alpha'$. It is interesting that by adding spectator dimensions the AdS radius
becomes larger.
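As a quick numerical illustration of these statements, the following sketch evaluates q and the AdS radius R = 1/Q = 1/sqrt(-q) in units where α' = 1. The function name and the `spectators` argument (standing for d') are labels chosen here for illustration, not notation from the text.

```python
import math

def ads_radius(d=4, spectators=0, wss=False, alpha_prime=1.0):
    """Return (q, R) with R = 1/Q and Q^2 = -q.

    Bosonic string:           q = (d + d' - 25) / (6 alpha')
    Worldsheet supersymmetry: q = (d + d' - 9)  / (4 alpha')
    `spectators` plays the role of d' (flat internal dimensions).
    """
    if wss:
        q = (d + spectators - 9) / (4.0 * alpha_prime)
    else:
        q = (d + spectators - 25) / (6.0 * alpha_prime)
    if q >= 0:
        raise ValueError("Q is real only for q < 0")
    return q, 1.0 / math.sqrt(-q)

q_bos, r_bos = ads_radius()                 # bosonic string, d = 4
q_wss, r_wss = ads_radius(wss=True)         # worldsheet supersymmetry, d = 4
_, r_spect = ads_radius(spectators=6)       # six flat spectator dimensions

print(q_bos, r_bos)      # radius equals sqrt(2/7) in units of sqrt(alpha')
print(q_wss, r_wss)      # radius equals sqrt(4/5) in units of sqrt(alpha')
print(r_spect > r_bos)   # spectator dimensions enlarge the radius
```

The two printed radii reproduce the values quoted for the bosonic and wss cases, and the last line checks the remark about spectator dimensions.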
3. The interpolating background

The solution discussed in the previous section has a strong coupling region, where the
dilaton diverges. It would be interesting to look for a solution which somehow interpolates
between the two possible signs of the linear dilaton background and, as such, has a string
coupling constant that is bounded everywhere. If we write the general condition for the vanishing
of the Weyl anomaly coefficients (7) with the previous ansatz, we get the equations relating
the conformal factor of the metric with the constant characterizing the Kalb-Ramond
background:
$$ -\frac{2aa'' + (d-2)(a')^2}{4a} + a'\Phi' + \frac{1}{2a}\, c^2 e^{4\Phi} = 0, $$
$$ -\frac{d}{4}\,\frac{2aa'' - (a')^2}{a^2} + 2\Phi'' + \frac{d c^2}{4a^2}\, e^{4\Phi} = 0, $$
$$ q + (\Phi')^2 - \frac{1}{2}\Phi'' - \frac{d}{4}\,\frac{a'}{a}\,\Phi' + \frac{d c^2}{8a^2}\, e^{4\Phi} = 0. \qquad (19) $$
3 A similar AdS solution with B-field was found in [4].
It is a simple matter to check that a euclidean solution is obtained provided d = 4 and
$$ a(r) = \frac{e^{2Qr}}{1 + 2Qc'\, e^{2Qr}} \qquad (20) $$
where $c'$ is a new constant with dimension of a length. The dilaton is given by
$$ \Phi = \Phi_0 + \log a - Qr \qquad (21) $$
and the Kalb-Ramond field strength is just
$$ H_{4\mu\nu} = 2Q\,\frac{c_{\mu\nu}}{c}\, a^2 e^{-2Qr} \qquad (22) $$
with c defined above in Eqs. (16) and (17).
This new solution reduces to the noncritical AdS background of the previous section
when $c' = 0$. On the other hand, when $c' \neq 0$ the dilaton background is of the interpolating
type, because
$$ \lim_{r \to \pm\infty} \Phi(r) = \mp Qr. \qquad (23) $$
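The claim that (20) and (21) solve (19) for d = 4 can be checked numerically. The sketch below is a consistency check under stated assumptions, not the authors' derivation: it takes Q² = -q, the relation c² = -4q e^{-4Φ0} from (17), and the euclidean sign convention in which the c² terms of (19) enter with a plus sign. The numerical values of Q, c' and Φ0 are arbitrary test inputs.

```python
import math

Q = 1.3          # dilaton slope (arbitrary test value)
C_PRIME = 0.7    # length constant c' of Eq. (20) (arbitrary test value)
PHI0 = 0.2       # dilaton zero mode (arbitrary test value)
D = 4            # number of Poincare directions
q = -Q**2                                   # assumption: Q^2 = -q
c2 = -4.0 * q * math.exp(-4.0 * PHI0)       # Eq. (17)

def fields(r):
    """Return a, a', a'', Phi, Phi', Phi'' for Eqs. (20)-(21), in closed form."""
    x = math.exp(2.0 * Q * r)
    u = 1.0 / (1.0 + 2.0 * Q * C_PRIME * x)   # u = a * exp(-2 Q r)
    a = x * u
    a1 = 2.0 * Q * u * a                      # a'
    a2 = 4.0 * Q**2 * (2.0 * u**2 - u) * a    # a''
    phi = PHI0 + math.log(a) - Q * r
    phi1 = 2.0 * Q * u - Q                    # Phi'
    phi2 = -4.0 * Q**2 * u * (1.0 - u)        # Phi''
    return a, a1, a2, phi, phi1, phi2

def residuals(r):
    """Left-hand sides of the three equations (19); each should vanish."""
    a, a1, a2, phi, phi1, phi2 = fields(r)
    e4 = c2 * math.exp(4.0 * phi)
    eq1 = -(2*a*a2 + (D - 2)*a1**2) / (4*a) + a1*phi1 + e4 / (2*a)
    eq2 = -(D / 4) * (2*a*a2 - a1**2) / a**2 + 2*phi2 + D*e4 / (4*a**2)
    eq3 = q + phi1**2 - 0.5*phi2 - (D / 4)*(a1 / a)*phi1 + D*e4 / (8*a**2)
    return eq1, eq2, eq3

for r in (-3.0, -0.5, 0.0, 1.0, 3.0):
    e1, e2, e3 = residuals(r)
    # first residual is divided by a so that all three are O(Q^2) quantities
    print(r, e1 / fields(r)[0], e2, e3)

# Dilaton slope in units of Q: tends to +1 as r -> -infinity, -1 as r -> +infinity
print(fields(-10.0)[4] / Q, fields(10.0)[4] / Q)
```

All residuals vanish up to rounding error, and the two asymptotic slopes show the interpolating behavior of (23).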
Several characteristics of the solution are summarized in Figs. 2, 3 and 4. Notice, in
particular, that each component of the Kalb-Ramond field strength has a Gaussian-like form,
with the location of the maximum fixed at $r \sim -\frac{1}{4Q}\log\left(4Q^2 (c')^2\right)$, and height $H_{4\mu\nu} \sim c_{\mu\nu}/(4cc')$.
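The position and height of the maximum follow directly from (20) and (22). A short derivation, writing $x \equiv e^{2Qr}$ (a shorthand introduced here), denoting by $c'$ the length constant of (20) and assuming it is positive, reads:

```latex
H_{4\mu\nu} = 2Q\,\frac{c_{\mu\nu}}{c}\,\frac{x}{(1+2Qc'x)^2}, \qquad x \equiv e^{2Qr},
\\
\frac{d}{dx}\,\frac{x}{(1+2Qc'x)^2} = \frac{1-2Qc'x}{(1+2Qc'x)^3} = 0
\;\Longrightarrow\; x_{\max} = \frac{1}{2Qc'}
\;\Longrightarrow\; r_{\max} = -\frac{1}{4Q}\log\!\big(4Q^2 (c')^2\big),
\\
H_{4\mu\nu}\big|_{\max} = 2Q\,\frac{c_{\mu\nu}}{c}\,\frac{1}{8Qc'} = \frac{c_{\mu\nu}}{4\,c\,c'} .
```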
There are two features worth discussing. The first is that the solution is singularity free.
If the scalar curvature is plotted (cf. Fig. 4), it can be easily seen that there is a maximum
curvature $R_{max} = 16Q^2/9$ as well as a negative minimum value, given by $R_{min} = -20Q^2$.
Fig. 2. Warp factor and dilaton of the interpolating background.
Fig. 3. Kalb-Ramond field strength shape.
Fig. 4. Curvature scalar. Transition between AdS and flat space is observed.
This means that there is some hope of not being led astray by worldsheet sigma
model computations. Higher orders in $l_s$ could, however, become important as soon as
the curvature is big when measured in string units (that is, $R\, l_s^2 \gtrsim 1$).
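The quoted curvature bounds can be reproduced with a few lines of code. The sketch below assumes the standard expression for the Ricci scalar of a warped metric ds² = a(r) dx² + dr² with D flat directions, R = -D a''/a - D(D-3)/4 (a'/a)², evaluated on the warp factor (20). The values of Q and c' are illustrative only.

```python
import math

Q, C_PRIME, D = 1.0, 0.5, 4   # illustrative values; c' is the length constant in (20)

def scalar_curvature(r):
    """Ricci scalar of ds^2 = a(r) dx^2 + dr^2 (D flat directions),
    R = -D a''/a - D (D - 3)/4 (a'/a)^2, for the warp factor (20)."""
    x = math.exp(2.0 * Q * r)
    u = 1.0 / (1.0 + 2.0 * Q * C_PRIME * x)
    a1_over_a = 2.0 * Q * u
    a2_over_a = 4.0 * Q**2 * (2.0 * u**2 - u)
    return -D * a2_over_a - D * (D - 3) / 4.0 * a1_over_a**2

grid = [-12.0 + 0.001 * i for i in range(24001)]   # r from -12 to 12
values = [scalar_curvature(r) for r in grid]

print(max(values) / Q**2, 16.0 / 9.0)   # maximum curvature vs 16 Q^2 / 9
print(min(values) / Q**2, -20.0)        # AdS_5 value, approached as r -> -infinity
print(scalar_curvature(12.0))           # tends to zero (flat space) as r -> +infinity
```

The minimum is the AdS_5 value -D(D+1)Q² reached deep in the AdS region, while the maximum is attained at finite r in the transition region.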
The second comment refers to the form of the graph of the dilaton in Fig. 2. It shows
that the quantity
$$ g_s \equiv e^{\Phi} \qquad (24) $$
which is usually referred to as the string coupling, is always small:

gs 1.
(25)
In spite of the fact that the string coupling is always small, tadpoles at higher genus can
modify our solution.
Fig. 5. c-function behaviour.
An interesting characteristic of the solution is the B-field action
$$ I \equiv \frac{1}{12}\int dr\, d^4x\, \sqrt{g}\; e^{-2\Phi} H^2 \qquad (26) $$
which can be easily computed to be
$$ I = \frac{V_4}{c'}, \qquad (27) $$
where $V_4$ is the four-dimensional euclidean volume. The corresponding density is then a
well defined concept, which vanishes for the linear dilaton and diverges in the pure AdS
background.
Actually, vanishing of the Weyl anomaly coefficients leads to
$$ \frac{1}{12}\, e^{-2\Phi} H^2 = -\nabla_A\!\left(e^{-2\Phi}\nabla^A\Phi\right) + 2q\, e^{-2\Phi}. \qquad (28) $$
This suggests that I can be used as a way to measure the topological change involved in
passing from $\Phi = Qr$ at $r = -\infty$ to $\Phi = -Qr$ at $r = +\infty$. The generalization to worldsheet
supersymmetry and/or flat spectator dimensions is straightforward.
Finally, if a geometric c-function is defined (cf. [2]) the monotonic behavior depicted in
Fig. 5 is obtained.
4. Comments on closed tachyons

It has often been argued that a closed tachyon condensate is in some sense equivalent, for
the string, to being noncritical, the precise relationship being given by the Weyl anomaly
coefficient
$$ q + (\nabla\Phi)^2 - \frac{1}{2}\nabla^2\Phi - \frac{1}{24} H^2 + \frac{m^2}{16}\, T^2 = 0 \qquad (29) $$
(where $m^2 = -4/\alpha'$ for the closed string tachyon).
If we assume worldsheet supersymmetry (wss), the only changes are in the numerical
value of q, $q = (D - 10)/4\alpha'$, and the tachyon mass, $m^2 = -2/\alpha'$. Then the tachyon
potential is guaranteed to be even in T.
The preceding equation (29) depends only on the combination
m2 2 D 6T 2 26
T =
.
(30)
16
6
A vacuum expectation value (vev) for the tachyon is equivalent, from this point of view,
to a lower dimension. Actually, our solution trivially generalizes to this case, just by
changing Q q by c everywhere. Of course this does not change the number of

geometrical dimensions, which remains fixed to d = 4 (plus the holographic coordinate).
In order to analyze tachyons in AdS we can use the Breitenlohner and Freedman [3]
bound
c q +
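$$ -\frac{1}{4}(D-1)^2 \le m^2 R^2 \qquad (31) $$
which for the AdS_5 solution of Section 2 gives
$$ m^2 \ge -\frac{14}{\alpha'}. \qquad (32) $$
This is not exactly true, due to dilatonic effects. The relevant Lagrangian for a scalar field
in the dilatonic AdS background of Section 2 is
$$ S \equiv \frac{1}{2}\int d(\mathrm{vol})\; e^{-2\Phi}\left(\nabla_A T\,\nabla^A T - m^2 T^2\right). \qquad (33) $$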
In our case the g factor produces an effect e2Qr , whereas the dilaton term behaves as
2Qr
e
.
The explicit form of the action is then
$$ S \equiv \frac{1}{2}\int d^4x\, dr\; e^{2Qr}\left[ e^{-2Qr}(\partial_\mu T)^2 + (\partial_r T)^2 - m^2 T^2 \right]. \qquad (34) $$
For radial fluctuations ($\partial_\mu T = 0$) the net effect is just a change of radius in AdS, from
R = 1/Q, which is the true radius of AdS, to an effective nondilatonic one, namely $R_e = 2/Q$.
All generic effects of fields propagating in AdS spaces remain the same. Using $R_e$
the bound becomes
$$ m^2 \ge -\frac{21}{6\alpha'}, \qquad (35) $$
exactly the bound corresponding to the linear dilaton background. As is clear from this
bound, the closed string tachyon with $m^2 = -4/\alpha'$ produces an instability.
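The numbers in (32) and (35) follow from the Breitenlohner-Freedman bound with D = 5 and Q² = 21/(6α'). A minimal sketch in exact rational arithmetic, working in units of 1/α' (the function and variable names are chosen here for illustration):

```python
from fractions import Fraction

D = 5                                # dimension of the AdS_5 spacetime
Q2 = Fraction(25 - (D - 1), 6)       # Q^2 in units of 1/alpha' (d = D - 1 = 4)

def bf_bound(radius_sq):
    """Breitenlohner-Freedman bound m^2 >= -(D-1)^2 / (4 R^2), in units of 1/alpha'."""
    return -Fraction((D - 1) ** 2, 4) / radius_sq

true_bound = bf_bound(1 / Q2)        # true AdS radius R = 1/Q
eff_bound = bf_bound(4 / Q2)         # effective radius R_e = 2/Q
tachyon_m2 = Fraction(-4)            # bosonic closed-string tachyon, m^2 = -4/alpha'

print(true_bound)                    # -14
print(eff_bound)                     # -7/2, i.e. -21/6
print(tachyon_m2 < eff_bound)        # True: the tachyon lies below the bound
```

With the effective radius the bound coincides with the linear dilaton value -21/(6α'), and the bosonic tachyon mass lies below it, in line with the instability noted above.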
Let us now assume, for the sake of the discussion, that the tachyon vev complies with the
Breitenlohner-Freedman bound, that is $T^2 \ge 1/2$ (6) (where the latter figure stands for the
wss case). We would then have a five-dimensional AdS background where the closed
string tachyon acquires positive energy; i.e., a good tachyon (in this case the bound
(34) becomes $m^2 \ge -4/\alpha'$ ($-2/\alpha'$)).
In this framework we can now use the interpolating metric to define a tunneling
amplitude between both tachyonic vevs. The physical effect of the ensuing tunneling
would be to restore the symmetry in the real vacuum:
$$ \langle 0|T|0\rangle = 0. \qquad (36) $$
This real vacuum, $|0\rangle$, would be just a quantum mechanical superposition and, as such,
not a solution of the classical equations of motion (in the same sense that the θ-vacuum in
QCD is not a classical solution of the Yang-Mills equations). Given our inability to
determine the form of the closed tachyon potential, the above should be taken as a possible
scenario only.
Acknowledgements
This work has been partially supported by the European Union TMR program FMRX-CT96-0012 "Integrability, Non-perturbative Effects, and Symmetry in Quantum Field
Theory" and by the Spanish grant AEN96-1655. The work of E.A. has also been supported
by the European Union TMR program ERBFMRX-CT96-0090 "Beyond the Standard Model"
and the Spanish grant AEN96-1664. The work of L.H. has been supported by the Spanish
predoctoral grant AP99-4367460. The work of P.R. has been supported in part by a UAM
postgraduate grant.
References
[1] O. Aharony, M. Berkooz, D. Kutasov, N. Seiberg, Linear dilatons, NS5-branes and holography,
JHEP 9810 (1998) 004, hep-th/9808149.
[2] E. Alvarez, C. Gomez, Holography and the C-theorem, in: Contributed to 4th Annual European
TMR Conference on Integrability, Nonperturbative Effects and Symmetry in Quantum Field
Theory, Paris, France, 7-13 September 2000, hep-th/0009203.
[3] P. Breitenlohner, D. Freedman, Stability in gauge extended supergravity, Ann. Phys. 144 (1982)
249.
[4] J. de Boer, S. Shatashvili, Two-dimensional conformal field theories on AdS_{2d+1} backgrounds,
JHEP 9906 (1999) 013, hep-th/9905032.
[5] L. Girardello, M. Porrati, M. Petrini, A. Zaffaroni, Novel local CFT and exact results on
perturbations of N = 4 super Yang-Mills from AdS dynamics, JHEP 9812 (1998) 022, hep-th/9810126.
[6] S. Gubser, I. Klebanov, A.M. Polyakov, Gauge theory correlators from noncritical string theory,
Phys. Lett. B 428 (1998) 105, hep-th/9802109.
[7] J. Maldacena, The large N limit of superconformal field theories and supergravity, Adv. Theor.
Math. Phys. 2 (1998) 231-252, hep-th/9711200.
[8] R. Myers, New dimensions for old strings, Phys. Lett. B 199 (1987) 371.
[9] L. Randall, R. Sundrum, A large mass hierarchy from a small extra dimension, Phys. Rev.
Lett. 83 (1999) 3370-3373, hep-th/9905221;
L. Randall, R. Sundrum, An alternative to compactification, Phys. Rev. Lett. 83 (1999) 4690-4693,
hep-th/9906064.
[10] E. Witten, AdS space and holography, Adv. Theor. Math. Phys. 2 (1998) 253-291, hep-th/9802150;
E. Witten, Anti-de Sitter space, thermal phase transition, and confinement in gauge theories,
Adv. Theor. Math. Phys. 2 (1998) 505, hep-th/9803131.

QCD radiation off heavy particles

E. Norrbin, T. Sjöstrand
Department of Theoretical Physics, Lund University, Sölvegatan 14A, SE-223 62 Lund, Sweden
Received 12 October 2000; accepted 7 March 2001
Abstract
We study QCD radiation in decay processes involving heavy particles. As input, the first-order
gluon emission rate is calculated in a number of reactions, and comparisons of the energy flow
patterns show a non-negligible process dependence. To proceed further, the QCD parton shower
language offers a convenient approach to include multi-gluon emission effects, and to describe
exclusive event properties. An existing shower algorithm is extended to take into account the
process-dependent mass, spin and parity effects, as given by the matrix element calculations. This
allows an improved description of multiple gluon emission effects off b and t quarks, and also off
nonstandard particles like squarks and gluinos. Phenomenological applications are presented for
bottom production at LEP, Higgs particle decay to heavy flavours, top production and decay at linear
colliders, and some simple supersymmetric processes. © 2001 Published by Elsevier Science B.V.
PACS: 12.38.Bx; 12.38.Cy; 13.65.+i
Keywords: QCD phenomenology; Parton showers; Heavy particles; Radiation patterns; Supersymmetry events
1. Introduction
Heavy particles often tend to be in the focus of particle physics research. The search
for new physics at the high-energy frontier is an obvious example. Not only are the
hypothetical new heavy states here of interest, but often their detection relies on decay
chains that involve heavy flavours. Here the b quark is the prime example. By secondary
vertices or semileptonic decays it can be tagged. It is a main decay product of top, where
the b tag is a standard requirement for top identification above background at hadron
colliders. A light Standard Model Higgs ($m_h \lesssim 140$ GeV) predominantly decays to $b\bar{b}$,
and for any Higgs state the $b\bar{b}$ branching ratio is a key parameter in pinning down its
nature. These tagging and identification aspects also appear in other scenarios beyond the
standard model.
E-mail addresses: emanuel@thep.lu.se (E. Norrbin), torbjorn@thep.lu.se (T. Sjöstrand).
PII: S0550-3213(01)00099-2
E. Norrbin, T. Sjöstrand / Nuclear Physics B 603 (2001) 297-342
Many of the potentially new particles carry colour, like squarks, gluinos and leptoquarks.
Also colourless states often decay to quarks. Therefore QCD radiation is unavoidable, and
the large value of the $\alpha_s$ coupling implies that such radiation can be quite profuse. It is
well-known, e.g., from LEP, that fixed-order perturbation theory fails to describe QCD
radiation effects off light quarks: the rate of a few well separated jets can be reproduced,
but the internal structure of these jets requires the resummation of multiple-gluon effects.
The most successful method for achieving such a resummation is to apply the parton-shower
language, where explicit final states can be generated, with full respect for energy-momentum
conservation and other constraints. By introducing some low fixed cut-off
scale $Q_0 \approx 1$ GeV, the very soft and nonperturbative régime of QCD can be factored off
and put in a universal hadronization description, such as the Lund string fragmentation
model [1]. With this approach, it is possible to obtain a quite accurate description of
essentially all hadronic final-state properties of $e^+e^-$ annihilation events. The physics
is more complicated in a hadron collider environment, with additional effects, e.g., from
initial-state QCD radiation and underlying events, so the level of ambition may have to be
set accordingly. However, this applies to all conceivable descriptions, so again we expect
a realistic description of jets to be most easily achieved with the help of the parton-shower
language.
Heavy quarks radiate less than light ones in the soft-gluon region, so in some respects
QCD multiple-emission corrections are not as crucial. The b is sufficiently light that it
still radiates profusely, but top and supersymmetric particles may be heavy enough that
multiple-gluon emission effects are limited. Furthermore, for particles that are very short-lived,
the width will provide a cut-off on soft-gluon emission [2]. So for new-particle
searches, higher-order QCD effects may be less of an issue. However, whenever higher
precision is required, like for a mass determination of a new state, it may still be necessary
to model whatever soft-gluon emission effects there are. And, of course, the decays of
these new states will bring further radiation, e.g., like $t \to bW^+$, where the radiation off
the t and off the b become intertwined, and where width effects may become important for
short-lived particles. The aim of the current article is precisely to improve the description
of gluon radiation off b and c quarks as well as off heavier objects, in order to allow
higher-precision physics studies.
The starting point is the calculation of a large number of first-order matrix elements,
for gluon emission in decay processes, often generalizing on results already found in the
literature. The processes thereby covered include gluon emission, e.g., in
$\gamma^*/Z^0/h^0/A^0 \to b\bar{b}$ (with differences between a vector, axial vector, scalar and
pseudoscalar source of the b's), $t \to bW^+$, and various supersymmetric reactions. Quite apart from the
implementation in the shower framework, these calculations can also be used to assess
the degree of (non)universality, i.e., the dependence on the colour and spin structure of the
processes under identical kinematical conditions. In particular, we will show that the dead
cone picture can be quite misleading for the emission of more energetic gluons.
These matrix elements are taken as input to an improved version of an existing shower
algorithm [3], which primarily has been used for radiation off lighter quarks and gluons.
This shower algorithm tends to slightly overestimate the amount of gluon radiation for
$e^+e^- \to \gamma^*/Z^0 \to q\bar{q}g$ when q is massless, so a simple rejection procedure can be used
to match the hardest emission to the massless matrix elements. The radiation off heavier
quarks is underestimated in the collinear region, however. Or, put differently, the dead cone
[4] effects are exaggerated. In order to allow a corresponding approach for heavy flavours,
the evolution variable therefore has been modified to bring up the emission rate. It therefore
becomes possible to use the process-specific matrix elements as rejection factors also here,
and furthermore to do it in all stages of the shower.
The one place where precision tests are possible is in studies of b production at LEP1
[5]. We therefore study this topic in detail here, and use it as a test bed for different
possible variations of the basic shower algorithm. Other studies include Higgs, top and
supersymmetric particle production and decay. Furthermore, we also include some plots to
illustrate differences between gluon radiation off sources of different spin.
While not the main thrust of this article, we note that new data on gluon branching to
heavy flavours, $g \to c\bar{c}$ and $g \to b\bar{b}$, has been presented by the LEP collaborations [5].
We therefore also discuss what implications this might have for the shower algorithm, and
whether the data could be accommodated by reasonable modifications.
The plan of the article is as follows. In Section 2, the existing older shower algorithm is
explained, together with a few intermediate variants thereof. The new approach is described
in Section 3. A survey of the matrix-element calculations and the consequent radiation
patterns is given in Section 4. Some applications are then presented in Section 5. The
separate topic of gluon splitting is discussed in Section 6. Finally, a summary and outlook
is given in Section 7.
2. Previous models
Several shower algorithms have been proposed in the literature. Today the three most
commonly used ones probably are those found in PYTHIA/JETSET [3,6], HERWIG [7,8]
and ARIADNE [9]. The studies in this article will be based on the first of these.
The PYTHIA final-state shower consists of an evolution in the squared mass $m^2$ of a
parton. That is, emissions are ordered in decreasing mass of the radiating parton, and the
Sudakov form factor [10] is defined as the no-emission rate in the relevant mass range. Such
a choice is not as sophisticated as the angular one in HERWIG or the transverse momentum
one in ARIADNE, but usually the three tend to give similar results for $e^+e^-$ annihilation
events. An exception, where small but significant differences were found, is the emission
of photons in the shower [11]. In general, comparisons between the three are helpful
in estimating a range of theoretical uncertainty, in interpretations of existing data or in
predictions for the future.
One of the advantages of the PYTHIA algorithm is that a mapping between the parton-shower
and matrix-element variables is rather straightforward to $\mathcal{O}(\alpha_s)$ for massless
quarks, and that already the basic shower populates the full phase space region very closely
the same way as the matrix element. It is therefore possible to introduce a simple correction
to the shower to bring the two into agreement. Also in ARIADNE the massless matrix-element
matching is straightforward. By contrast, the HERWIG angular-ordered approach
does not automatically cover the full $q\bar{q}g$ phase space, which means that a subset of
three-jet events are not generated at all. This leads to problems in the description of LEP data, that
were overcome by separately adding the missing class of three-jet events [12], in addition
to matching to the matrix elements in the allowed region. More recently, a similar approach
has been applied to top decay [13], where again HERWIG did not populate the full phase
space.
2.1. The massless shower
In addition to mass, the other main variable in the PYTHIA shower is z, as used in the
splitting kernels. It is defined as the energy fraction taken by the first daughter in the CM
frame of the event. That is, in a branching $a \to b + c$, $E_b = zE_a$ and $E_c = (1-z)E_a$. In the
original choice of z, which is done at the same time as $m_a$ is selected, the b and c masses
are not yet known, and are therefore imagined massless, also in cases where either of them
is known to have a non-vanishing on-shell mass. A cut-off scale $m_{\min} = Q_0 \approx 1$ GeV is
used to constrain the allowed phase space, so that only branchings with $m_a > m_{\min}$ are
allowed. For a massive quark, the cut-off is shifted to
$$m_{a,\min} = \sqrt{m_a^2 + \frac{Q_0^2}{4}} + \frac{Q_0}{2}. \tag{1}$$
The allowed z range, $z_- < z < z_+$, then becomes
$$z_\pm = \frac{1}{2}\left\{1 \pm \beta_a\,\Theta(m_a - m_{a,\min})\right\}, \tag{2}$$
with $\beta_a = |\mathbf{p}_a|/E_a$ the a velocity and $\Theta(x)$ the step function.
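As an illustration only, Eqs. (1) and (2) can be written out as a short Python sketch. The function names, the default $Q_0 = 1$ GeV and the reading of $m_a$ under the square root as the on-shell quark mass are assumptions made for this example, not taken from the PYTHIA code.

```python
import math


def m_a_min(m_onshell, q0=1.0):
    """Eq. (1): lowest mass at which a quark of on-shell mass m_onshell may branch."""
    return math.sqrt(m_onshell**2 + q0**2 / 4.0) + q0 / 2.0


def z_range(m_a, e_a, m_onshell=0.0, q0=1.0):
    """Eq. (2): (z_minus, z_plus) for a parton of virtual mass m_a and energy e_a."""
    # beta_a = |p_a| / E_a; guard against rounding for m_a very close to e_a
    beta_a = math.sqrt(max(0.0, 1.0 - (m_a / e_a) ** 2))
    # Step function: the range collapses to z = 1/2 below the cut-off
    theta = 1.0 if m_a > m_a_min(m_onshell, q0) else 0.0
    return 0.5 * (1.0 - beta_a * theta), 0.5 * (1.0 + beta_a * theta)
```

For a massless parton the cut-off reduces to $Q_0$, as in the text, and the range is symmetric around z = 1/2.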
At a later stage of the evolution, when $m_b$ and $m_c$ are being selected, possibly well above
$Q_0$, the previously found z may be incompatible with these masses. The adopted solution
is to take into account mass effects by reducing the magnitude of the three-momenta
$\mathbf{p}_b = -\mathbf{p}_c$ in the rest frame of a. Expressed in terms of four-momenta in an
arbitrary frame, this is equivalent to
$$p_b = (1-k_b)\,p_b^{(0)} + k_c\,p_c^{(0)}, \qquad p_c = (1-k_c)\,p_c^{(0)} + k_b\,p_b^{(0)}, \tag{3}$$
where $p_b^{(0)}$ and $p_c^{(0)}$ are the original massless momenta and $p_b$ and $p_c$ the modified massive
ones. The parameters $k_b$ and $k_c$ are found from the constraints $p_b^2 = m_b^2$ and $p_c^2 = m_c^2$:
$$k_{b,c} = \frac{m_a^2 - \lambda_{abc} \pm (m_c^2 - m_b^2)}{2m_a^2}, \tag{4}$$
with
$$\lambda_{abc} = \sqrt{(m_a^2 - m_b^2 - m_c^2)^2 - 4m_b^2m_c^2}. \tag{5}$$
Fig. 1. Example of showers, with the notations used in the text. (a) A generic shower. (b) A shower
giving a three-jet event.
The relation between the preliminary and final energy sharing thus is given by
$$\frac{E_b}{E_a} = (1-k_b)z + k_c(1-z) = \frac{m_a^2 - \lambda_{abc} + m_b^2 - m_c^2}{2m_a^2} + \frac{\lambda_{abc}}{m_a^2}\,z, \tag{6}$$
with $z = E_b^{(0)}/E_a$ as above. The transverse momentum $p_\perp$ of b and c with respect to the a
direction is given by
$$p_\perp^2 = \frac{\lambda_{abc}^2}{m_a^2}\left[\frac{z(1-z)}{\beta_a^2} - \frac{1-\beta_a^2}{4\beta_a^2}\right], \tag{7}$$
but is approximated by $p_\perp^2 \approx z(1-z)m_a^2$ when used as argument in $\alpha_s(p_\perp^2)$ for the shower.
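Angular ordering is not automatic, but is implemented by vetoing emissions that don't
correspond to decreasing opening angles. The opening angle of a branching $a \to b + c$ is
calculated approximately as
$$\theta \approx \frac{p_{\perp b}}{E_b} + \frac{p_{\perp c}}{E_c} \approx \sqrt{z(1-z)}\,m_a\left(\frac{1}{zE_a} + \frac{1}{(1-z)E_a}\right) = \frac{1}{\sqrt{z(1-z)}}\,\frac{m_a}{E_a}. \tag{8}$$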
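The momentum reshuffling of Eqs. (3)-(6) can be checked numerically. The following Python sketch is an illustration under our own naming (four-momenta as `(E, px, py, pz)` tuples, hypothetical helper functions); it is not the shower implementation itself.

```python
import math


def lam(ma, mb, mc):
    """Eq. (5)."""
    return math.sqrt((ma**2 - mb**2 - mc**2) ** 2 - 4.0 * mb**2 * mc**2)


def k_factors(ma, mb, mc):
    """Eq. (4): returns (k_b, k_c); upper sign for b, lower sign for c."""
    l = lam(ma, mb, mc)
    kb = (ma**2 - l + (mc**2 - mb**2)) / (2.0 * ma**2)
    kc = (ma**2 - l - (mc**2 - mb**2)) / (2.0 * ma**2)
    return kb, kc


def massive_momenta(pb0, pc0, ma, mb, mc):
    """Eq. (3): map massless four-momenta pb0, pc0 onto massive ones."""
    kb, kc = k_factors(ma, mb, mc)
    pb = tuple((1.0 - kb) * b + kc * c for b, c in zip(pb0, pc0))
    pc = tuple((1.0 - kc) * c + kb * b for b, c in zip(pb0, pc0))
    return pb, pc


def final_energy_fraction(z, ma, mb, mc):
    """Eq. (6): E_b / E_a after the mass correction."""
    kb, kc = k_factors(ma, mb, mc)
    return (1.0 - kb) * z + kc * (1.0 - z)


def minkowski_sq(p):
    """Invariant mass squared of a four-momentum (E, px, py, pz)."""
    e, px, py, pz = p
    return e * e - px * px - py * py - pz * pz
```

For a quark of mass $m_q$ radiating a massless gluon ($m_b = m_q$, $m_c = 0$) the sketch gives $k_b = 0$ and $k_c = m_q^2/m_a^2$, the values used for Eq. (20).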
The procedure thus is the following, Fig. 1a. In the $\gamma^*/Z^0$ decay, the two original partons
1 and 2 are produced, back-to-back in the rest frame of the pair. In a first step, they are
evolved downwards from a maximal mass equal to the CM energy, with the restriction
that the two masses together should be below this CM energy. When the two branchings
are found, they define $m_1$ and $m_2$ and the z values of $1 \to 3 + 4$ and $2 \to 5 + 6$. These
branchings obviously have smaller opening angles than the 180° one between 1 and 2, so
no angular-ordering constraints appear here. A matching procedure to the matrix element
is used to correct the branchings, however, as will be described below. In subsequent steps,
a pair of partons like 3 and 4 are evolved in parallel, from maximum masses given by the
smaller of the mother (1) mass and the respective daughter (3 or 4) energy. Here angular
ordering restricts the region of allowed z values in their branchings, but there are no
matrix-element corrections. Once $m_3$ and $m_4$ are fixed, the kinematics of the $1 \to 3 + 4$ branching
needs to be modified according to Eq. (3). This is the reason why the evolution is always
done for a pair of partons (whereof not both need branch further, however), and why the
final kinematics of a branching is postponed to a later stage than the choice of z value.
Several other aspects of the shower could be discussed, such as the choice of non-isotropic
azimuthal angles to improve the coherence description and include gluon spin
effects, the possibility also to emit photons, or the option to force some branchings in order
to match higher-order matrix elements [14]. These are of less interest here and thus not
covered.
2.2. The massless matrix element correction
Let us now compare the parton-shower (PS) population of three-jet phase space with the
matrix-element (ME) one, for the case of $e^+e^-$ annihilation to massless quarks. With the
conventional ME numbering $q(1)\bar{q}(2)g(3)$, with $x_j = 2E_j/E_{CM}$, the matrix element is of
the form [15]
$$\frac{1}{\sigma_0}\frac{d\sigma_{ME}}{dx_1\,dx_2} = \frac{\alpha_s}{2\pi}\,C_F\,\frac{x_1^2 + x_2^2}{(1-x_1)(1-x_2)}, \tag{9}$$
with the colour factor $C_F = 4/3$. We have normalized to the lowest-order cross section $\sigma_0$,
so that the expression can be interpreted as a probability distribution. For future reference,
we will use the notation
$$\frac{1}{\sigma_0}\frac{d\sigma}{dx_1\,dx_2} = \frac{\alpha_s}{2\pi}\,C_F\,\frac{N(x_1, x_2, r)}{(1-x_1)(1-x_2)}, \tag{10}$$
where $r = m_q/E_{CM} = m_{\bar{q}}/E_{CM}$. Thus $N_{ME}(x_1, x_2, 0) = x_1^2 + x_2^2$.

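There are two shower histories that could give such a three-jet event. One is
$\gamma^*/Z^0(0) \to q(i)\bar{q}(2) \to q(1)\bar{q}(2)g(3)$, i.e., with an intermediate (i) quark
branching $q(i) \to q(1)g(3)$, illustrated in Fig. 1b. This gives
$$Q^2 = m_i^2 = (p_0 - p_2)^2 = (1 - x_2)E_{CM}^2, \tag{11}$$
$$z = \frac{p_0 p_1}{p_0 p_i} = \frac{E_1}{E_i} = \frac{x_1}{x_1 + x_3} = \frac{x_1}{2 - x_2}, \tag{12}$$
$$\frac{dQ^2}{Q^2}\,\frac{dz}{1-z} = \frac{dx_2}{1-x_2}\,\frac{dx_1}{x_3}. \tag{13}$$
The parton-shower probability for such a branching is
$$\frac{\alpha_s}{2\pi}\,C_F\,\frac{dQ^2}{Q^2}\,\frac{1+z^2}{1-z}\,dz = \frac{\alpha_s}{2\pi}\,C_F\left[1 + \left(\frac{x_1}{2-x_2}\right)^2\right]\frac{1-x_1}{x_3}\,\frac{dx_1\,dx_2}{(1-x_1)(1-x_2)}. \tag{14}$$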
In the second shower history, the roles of q and $\bar{q}$ are interchanged, i.e., $x_1 \leftrightarrow x_2$. This is the
same set of Feynman graphs as in the matrix-element description, except that the shower
does not include any interference between the two diagrams. The two shower expressions
can therefore be added to give the overall shower population of the three-jet phase space,
of the form in Eq. (10) but with
$$N_{PS}(x_1, x_2, 0) = \frac{1-x_1}{x_3}\left[1 + \left(\frac{x_1}{2-x_2}\right)^2\right] + \frac{1-x_2}{x_3}\left[1 + \left(\frac{x_2}{2-x_1}\right)^2\right]. \tag{15}$$
Fig. 2. The gluon emission rate as a function of emission angle $\theta_{13} = \theta_{qg}$, for a 10 GeV gluon
energy at $E_{CM} = 91$ GeV, and with $m_b = 4.8$ GeV. All curves are normalized to the massless
matrix-element expression, Eq. (9), here thus represented by the small-dotted line at unity. Dashed:
the massless shower before correction, $N_{PS}(x_1, x_2, 0)/N_{ME}(x_1, x_2, 0) = 1/R_{ME/PS}(x_1, x_2, 0)$. Full:
the rate from massive matrix elements, $N_{ME}(x_1, x_2, r)/N_{ME}(x_1, x_2, 0)$. Dash-dotted: the rate from
massive parton shower, $N_{PS}(x_1, x_2, r)/N_{ME}(x_1, x_2, 0)$. Large dots: the new rate from massive
parton showers, $N'_{PS}(x_1, x_2, r)/N_{ME}(x_1, x_2, 0)$.
In spite of NPS being lengthier than NME , it turns out that the two almost exactly agree
over the whole phase space, but with the shower rate somewhat above, see, e.g., Fig. 2. It
is therefore straightforward and efficient to use the ratio
$$R_{ME/PS}(x_1, x_2, 0) = \frac{d\sigma_{ME}}{d\sigma_{PS}} = \frac{N_{ME}(x_1, x_2, 0)}{N_{PS}(x_1, x_2, 0)} \tag{16}$$
as an acceptance factor inside the shower evolution, in order to correct the first emission of
the quark and antiquark to give a sum in agreement with the matrix element.
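To make the massless correction concrete, Eqs. (9), (15) and (16) can be coded directly. This is a minimal sketch with function names of our own choosing; it only evaluates the weights and does not generate events.

```python
def n_me_massless(x1, x2):
    """Numerator of the massless matrix element, Eq. (9)."""
    return x1**2 + x2**2


def n_ps_massless(x1, x2):
    """Sum of the two shower histories, Eq. (15)."""
    x3 = 2.0 - x1 - x2
    quark_side = (1.0 - x1) / x3 * (1.0 + (x1 / (2.0 - x2)) ** 2)
    antiquark_side = (1.0 - x2) / x3 * (1.0 + (x2 / (2.0 - x1)) ** 2)
    return quark_side + antiquark_side


def r_me_ps_massless(x1, x2):
    """Acceptance factor of Eq. (16) for the first branching."""
    return n_me_massless(x1, x2) / n_ps_massless(x1, x2)
```

At the symmetric point $x_1 = x_2 = 2/3$ this gives $N_{ME} = 8/9$ and $N_{PS} = 5/4$, so about 71% of the trial emissions there would be kept.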
Clearly, the shower will contain further branchings that modify the simple result, e.g.,
by the emission both from the q and the $\bar{q}$, but these effects are formally of $\mathcal{O}(\alpha_s^2)$ and thus
beyond the accuracy we strive to match. One should also note that the shower modifies the
distribution in three-jet phase space by the appearance of Sudakov form factors, and by
using a running $\alpha_s(p_\perp^2)$ rather than a fixed one. In both these respects, however, the shower
should be an improvement over the fixed-order result.
2.3. The massive matrix element correction

The prescription of correcting the first branchings by the factor in Eq. (16) was the
original one, used up until JETSET 7.3, for massless and massive quarks alike, where the
$x_i = 2E_i/E_{CM}$ variables were defined with masses included. In version 7.4 an intermediate
improvement was introduced, in that the massive matrix element expression was used, as
given for a vector source like the $\gamma^*$ [16]:
$$N_{ME}(x_1, x_2, r) = x_1^2 + x_2^2 - 4r^2x_3 - 8r^4 - (2r^2 + 4r^4)\left(\frac{1-x_2}{1-x_1} + \frac{1-x_1}{1-x_2}\right). \tag{17}$$
The shower algorithm itself was not changed, nor the assumed shower weight, i.e., an
acceptance factor NME (x1 , x2 , r)/NPS (x1 , x2 , 0) was applied for the first branching on
either side. (The older behaviour remained as an option.)
The mass suppression in the matrix element is illustrated in Fig. 2. We recall that, in
the soft-gluon limit, a spin-independent (and thus universal) eikonal expression holds [8]:
$$\frac{d\sigma_{q\bar{q}g}}{\sigma_{q\bar{q}}} \propto (-1)\left(\frac{p_1}{p_1p_3} - \frac{p_2}{p_2p_3}\right)^2\frac{d^3p_3}{E_3} \propto \left[\frac{2p_1p_2}{(p_1p_3)(p_2p_3)} - \frac{m_1^2}{(p_1p_3)^2} - \frac{m_2^2}{(p_2p_3)^2}\right]E_3\,dE_3\,d\cos\theta_{13}. \tag{18}$$
In the limit of small angles $\theta_{13}$ this gives a mass suppression factor
$$\frac{d\sigma(x_3, \theta_{13}, r)}{d\sigma(x_3, \theta_{13}, 0)} \approx \left(\frac{\theta_{13}^2}{\theta_{13}^2 + 4r^2}\right)^2, \tag{19}$$
i.e., the characteristic dead cone of opening angle approximately $2r = m_q/E_q$. Note that
the mapping between $(x_1, x_2)$ and $(x_3, \theta_{13})$ depends on r, so Eq. (19) is not quite the same
as $N_{ME}(x_1, x_2, r)/N_{ME}(x_1, x_2, 0)$.
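For orientation, the massive numerator of Eq. (17) and the small-angle suppression factor of Eq. (19) are easy to evaluate. The sketch below uses illustrative function names of our own and assumes the equal-mass vector case discussed in the text.

```python
def n_me_vector(x1, x2, r):
    """Eq. (17): massive matrix-element numerator for a vector source, r = m_q / E_CM."""
    x3 = 2.0 - x1 - x2
    propagator_ratio = (1.0 - x2) / (1.0 - x1) + (1.0 - x1) / (1.0 - x2)
    return (x1**2 + x2**2 - 4.0 * r**2 * x3 - 8.0 * r**4
            - (2.0 * r**2 + 4.0 * r**4) * propagator_ratio)


def dead_cone_factor(theta13, r):
    """Eq. (19): small-angle mass suppression relative to the massless rate."""
    return (theta13**2 / (theta13**2 + 4.0 * r**2)) ** 2
```

At the dead-cone angle itself, $\theta_{13} = 2r$, the factor equals 1/4, and it approaches unity for $\theta_{13} \gg 2r$.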
2.4. The massive phase space correction in the shower
More recently [5], the issue of masses in the shower was further studied, since the
expression for $N_{PS}(x_1, x_2, 0)$ had not been touched when $N_{ME}$ was improved.
In the derivation of $N_{PS}(x_1, x_2, r)$, one can start from the ansatz
$$x_2 = 1 - \frac{m_i^2 - m_q^2}{E_{CM}^2},$$
$$x_1 = \left(1 + \frac{m_i^2 - m_q^2}{E_{CM}^2}\right)\left((1-k_1)z + k_3(1-z)\right),$$
$$x_3 = \left(1 + \frac{m_i^2 - m_q^2}{E_{CM}^2}\right)\left((1-k_3)(1-z) + k_1z\right). \tag{20}$$
The quark mass enters both in the energy sharing between the intermediate quark i and
the antiquark 2, and in the correction procedure of Eq. (3) for the splitting of energy in
the branching $q(i) \to q(1)g(3)$. The constraints $p_1^2 = m_q^2$ and $p_3^2 = 0$ give $k_1 = 0$ and
$k_3 = m_q^2/m_i^2$. One then obtains
$$Q^2 = m_i^2 = (1 - x_2 + r^2)E_{CM}^2, \tag{21}$$
$$z = \frac{1}{2-x_2}\left(x_1 - r^2\frac{x_3}{1-x_2}\right), \tag{22}$$
$$\frac{dQ^2}{Q^2}\,\frac{dz}{1-z} = \frac{dx_2}{1-x_2+r^2}\,\frac{dx_1}{x_3}. \tag{23}$$
This gives the answer
$$N_{PS}(x_1, x_2, r) = \frac{1-x_1}{x_3}\,\frac{1-x_2}{1-x_2+r^2}\left[1 + \frac{1}{(2-x_2)^2}\left(x_1 - r^2\frac{x_3}{1-x_2}\right)^2\right] + \{x_1 \leftrightarrow x_2\}, \tag{24}$$
where the second term comes from the graph where the antiquark radiates.
The mass effects go in the right direction, $N_{PS}(x_1, x_2, r) < N_{PS}(x_1, x_2, 0)$, but actually
so much so that $N_{PS}(x_1, x_2, r) < N_{ME}(x_1, x_2, r)$ in major regions of phase space. This is
illustrated in Fig. 2. Very crudely, one could say that the massive shower exaggerates the
angle of the dead cone by about a factor of two, in this rather typical example. There is
no dead cone as such built in, however, but a rather more coincidental mass suppression
mainly generated by the factor $(1 - x_2)/(1 - x_2 + r^2)$.
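The massive shower weight of Eq. (24) can be sketched in the same way. The helper below is written for this illustration (names are ours); it makes explicit that each history carries the extra factor $(1 - x_2)/(1 - x_2 + r^2)$, or its mirror image, relative to the massless case.

```python
def n_ps_massive(x1, x2, r):
    """Eq. (24): shower weight with masses in the kinematics, r = m_q / E_CM."""
    x3 = 2.0 - x1 - x2

    def one_history(xa, xb):
        # Parton a radiates the gluon, parton b recoils; z follows Eq. (22)
        z = (xa - r**2 * x3 / (1.0 - xb)) / (2.0 - xb)
        mass_suppression = (1.0 - xb) / (1.0 - xb + r**2)
        return (1.0 - xa) / x3 * mass_suppression * (1.0 + z**2)

    return one_history(x1, x2) + one_history(x2, x1)
```

Setting r = 0 recovers Eq. (15), and for r > 0 the weight falls below the massless one, in line with the inequality quoted above.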
Thus the amount of gluon emission off massive quarks is underestimated already in
the original prescription, where masses entered in the kinematics but not in the ME/PS
correction factor. When the intermediate correction ratio $N_{ME}(x_1, x_2, r)/N_{PS}(x_1, x_2, 0)$
is applied, the net result is a distribution even more off from the correct one, by a factor
$N_{PS}(x_1, x_2, r)/N_{PS}(x_1, x_2, 0)$. Thus it would have been better not to introduce the mass
correction in JETSET 7.4.
Based on the results above, one can now instead use the correct ME/PS factor
$N_{ME}(x_1, x_2, r)/N_{PS}(x_1, x_2, r)$. A technical problem is that this ratio can exceed unity,
in the example of Fig. 2 by up to almost a factor of two. This could be solved, e.g.,
by enhancing the raw rate of emissions by this factor. However, another trick was
applied, based on the facts that the shift of Eqs. (3) and (20) implies that smaller-energy
gluons would be allowed for a massive quark than a massless one, and additionally that
the accessible z range is overestimated in the original ansatz. Therefore, without any
(noticeable) loss of phase space, z can be rescaled to a $z'$ according to
$$(1 - z') = (1 - z)^k, \quad \text{with } k = \frac{\ln(r^2)}{\ln(Q_0^2/E_{CM}^2)} < 1. \tag{25}$$
The ME/PS correction factor then has to be compensated by k, and thereby comes below
unity almost everywhere; the remaining weighting errors are too small to be relevant.
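The rescaling of Eq. (25) amounts to two one-line functions. In this sketch the names and the default $Q_0 = 1$ GeV are our own assumptions; the comment on the Jacobian follows from differentiating Eq. (25) and indicates where the compensating factor k comes from.

```python
import math


def rescale_exponent(m_q, e_cm, q0=1.0):
    """Exponent k of Eq. (25), with r = m_q / e_cm; below 1 when m_q > q0."""
    return math.log((m_q / e_cm) ** 2) / math.log((q0 / e_cm) ** 2)


def rescale_z(z, k):
    """Eq. (25): map z onto z' with 1 - z' = (1 - z)**k.

    Differentiating gives dz'/(1 - z') = k dz/(1 - z), so the emission rate
    in the new variable differs from the old one by the constant factor k.
    """
    return 1.0 - (1.0 - z) ** k
```

For the b-quark example of Fig. 2 ($m_b = 4.8$ GeV, $E_{CM} = 91$ GeV) the exponent lies between 0 and 1, so $z' \le z$ throughout.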
This procedure, default since PYTHIA version 6.130, improves the shower description of
mass effects in the amount of three-jet events [5]. Mass effects are only included correctly
for the first branching of the q and $\bar{q}$ in the shower, however. Subsequent emissions involve
no correction procedure. Instead the dead cone effect is exaggerated, similarly to what was
shown in Fig. 2. Furthermore, even for the first branchings, only the possibility of a vector
source decaying to two identical-mass quarks is included, while the $Z^0$ actually is a mixture
of vector and axial vector, and, e.g., the W would decay to two unequal-mass quarks. We
will therefore next try to develop a more powerful and general approach.
3. A new approach
One of the advantages of the Monte Carlo approach is that, so long as some upper
estimate can be found that allows simple generation, rejection down to a more complex
expression is straightforward. In particular, for an evolution in some variable Q with a
(Sudakov) form factor built up from the no-emission probability, the veto algorithm
[6] can be used, in which the initial overestimate of the emission rate is compensated
by the possibility of a rejection, with a continued evolution downwards in $Q^2$ from the
rejected value onwards. As we have seen, the current set of $Q^2$ and z variables is not so
convenient, since the emission rate off massive quarks is an underestimate of the correct
rate, and therefore the standard procedure does not work except after some extra tricks.
We will therefore pick another set of variables, preferably such that they reduce to the old
ones in the massless limit. Several approaches could have been taken, but here we have
chosen the minimal one of retaining the z definition and thereby the existing kinematics
machinery. As we will show, a modification for the branching $a \to bc$ from $Q^2 = m_a^2$ to
$Q^2 = m_a^2 - (m_a^2)_{\text{on-shell}}$ is enough to do the job.
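The veto algorithm referred to above can be illustrated with a toy evolution in $Q^2$. The sketch assumes a constant overestimate `c_over` of the emission rate per unit $\ln Q^2$ and a user-supplied acceptance probability; these names and the constant-rate simplification are ours, chosen only to show why evolution must continue from the rejected scale.

```python
import random


def next_emission(q2_max, q2_min, c_over, accept_prob, rng):
    """Return the Q^2 of the next accepted emission, or None if the cut-off is reached.

    c_over      -- overestimated emission rate per unit ln(Q^2); requires q2_min > 0
    accept_prob -- function Q^2 -> (true rate) / c_over, assumed to lie in [0, 1]
    """
    q2 = q2_max
    while True:
        # Overestimated Sudakov factor (q2_new / q2)**c_over, solved for q2_new
        q2 *= rng.random() ** (1.0 / c_over)
        if q2 <= q2_min:
            return None  # no emission above the cut-off
        if rng.random() < accept_prob(q2):
            return q2  # trial emission accepted
        # Rejected: keep evolving downwards from the rejected value


if __name__ == "__main__":
    rng = random.Random(1)
    print(next_emission(91.0**2, 1.0, 2.0, lambda q2: 0.5, rng))
```

Restarting from `q2_max` after a rejection would instead distort the form factor, which is why the rejected scale is kept as the new starting point.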
3.1. The choice of shower variables
Again consider $e^+e^- \to \gamma^*/Z^0 \to q(1)\bar{q}(2)g(3)$, and the formulae of Eq. (20), but now
assume $Q^2 = m_i^2 - m_q^2 = m_i^2 - m_1^2$ for the $q(i) \to q(1)g(3)$ branching. That corresponds
to $1/Q^2$ being the propagator of the off-shell parton 1. Then Eq. (22) is unchanged while
$$Q^2 = m_i^2 - m_1^2 = (1 - x_2)E_{CM}^2, \tag{26}$$
$$\frac{dQ^2}{Q^2}\,\frac{dz}{1-z} = \frac{dx_2}{1-x_2}\,\frac{dx_1}{x_3}. \tag{27}$$
So, with this simple trick, we have recovered the Jacobian of the massless case, Eq. (13).
This means that there is little mass suppression left in the shower evolution proper:
$$N'_{PS}(x_1, x_2, r) = \frac{1-x_1}{x_3}\left[1 + \frac{1}{(2-x_2)^2}\left(x_1 - r^2\frac{x_3}{1-x_2}\right)^2\right] + \{x_1 \leftrightarrow x_2\}, \tag{28}$$
as can be seen in Fig. 2. In particular, note that $N'_{PS}(x_1, x_2, r)$ reduces to $N_{PS}(x_1, x_2, 0)$,
Eq. (14), in the soft limit $x_3 \to 0$. Therefore a matrix-element correction now is even more
required, but also simpler to implement, to bring down the rate to a reasonable level.
So far, we have assumed decay to two equal-mass quarks, like in $Z^0$ decay. For many of
the examples to be discussed, like W or t decay, the two decay products have unequal
masses. It is also convenient to allow slightly unequal masses for particles with
non-negligible widths, e.g., in $e^+e^- \to t\bar{t}$. We will therefore generalize to the case with
$r_1 = m_1/E_{CM} = m_q/E_{CM} \neq r_2 = m_2/E_{CM} = m_{\bar{q}}/E_{CM}$. The range of kinematically allowed $x_i$
values is then $2r_1 \le x_1 \le 1 + r_1^2 - r_2^2$, $2r_2 \le x_2 \le 1 + r_2^2 - r_1^2$, $0 \le x_3 \le 1 - (r_1 + r_2)^2$,
with the joint condition that
$$\left(2 - 2x_1 - 2x_2 + x_1x_2 + 2r_1^2 + 2r_2^2\right)^2 \le \left(x_1^2 - 4r_1^2\right)\left(x_2^2 - 4r_2^2\right). \tag{29}$$
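The ansatz of Eq. (20) is modified by $m_q \to m_2$ while $k_3 \to m_1^2/m_i^2$. Furthermore
$$Q^2 = m_i^2 - m_1^2 = (p_0 - p_2)^2 - m_1^2 = \left(1 + r_2^2 - r_1^2 - x_2\right)E_{CM}^2, \tag{30}$$
$$z = \frac{1}{2-x_2}\left(x_1 - r_1^2\,\frac{x_3}{1 + r_2^2 - r_1^2 - x_2}\right), \tag{31}$$
$$\frac{dQ^2}{Q^2}\,\frac{dz}{1-z} = \frac{dx_2}{1 + r_2^2 - r_1^2 - x_2}\,\frac{dx_1}{x_3}. \tag{32}$$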
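The unequal-mass variables of Eqs. (29)-(31) can be collected in a small helper. This is an illustrative sketch with names of our own; `in_phase_space` simply combines the individual ranges quoted above with the joint condition of Eq. (29).

```python
def in_phase_space(x1, x2, r1, r2):
    """Check the allowed x ranges and the joint condition of Eq. (29)."""
    x3 = 2.0 - x1 - x2
    if not (2.0 * r1 <= x1 <= 1.0 + r1**2 - r2**2):
        return False
    if not (2.0 * r2 <= x2 <= 1.0 + r2**2 - r1**2):
        return False
    if not (0.0 <= x3 <= 1.0 - (r1 + r2) ** 2):
        return False
    lhs = (2.0 - 2.0 * x1 - 2.0 * x2 + x1 * x2 + 2.0 * r1**2 + 2.0 * r2**2) ** 2
    rhs = (x1**2 - 4.0 * r1**2) * (x2**2 - 4.0 * r2**2)
    return lhs <= rhs


def shower_variables(x1, x2, r1, r2, e_cm):
    """Eqs. (30) and (31): (Q^2, z) for emission off parton 1."""
    x3 = 2.0 - x1 - x2
    d1 = 1.0 + r2**2 - r1**2 - x2  # Q^2 in units of E_CM^2
    z = (x1 - r1**2 * x3 / d1) / (2.0 - x2)
    return d1 * e_cm**2, z
```

For $r_1 = r_2 = 0$ the helper reproduces the massless relations of Eqs. (11) and (12).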
When a colourless particle decays to two colour triplets, the two possible one-gluon-emission
shower histories then add up to give
$$\frac{1}{\sigma_0}\frac{d\sigma_{PS}}{dx_1\,dx_2} = \frac{\alpha_s}{2\pi}\,C_F\left[\frac{1 + z_1^2}{x_3\left(1 + r_2^2 - r_1^2 - x_2\right)} + \frac{1 + z_2^2}{x_3\left(1 + r_1^2 - r_2^2 - x_1\right)}\right]. \tag{33}$$
Here $z_1$ is given by Eq. (31) and $z_2$ is obtained by exchanging $1 \leftrightarrow 2$. Since the numerators
$1 \le 1 + z_i^2 \le 2$, their exact form is not a main concern for the qualitative discussions.
As it turns out, the Monte Carlo procedure is simplified if the shower is generated with a
numerator 2 instead of $1 + z_i^2$, so we will make this replacement in the following, with the
matrix element correction procedure compensating this overestimate.
3.2. The matrix element correction
Next, one should try to relate this to the structure of the matrix elements under similar
conditions. We may expect graphs where either of the partons can radiate a gluon and
therefore can give a propagator $1/Q_1^2 = 1/(m_{13}^2 - m_1^2)$, cf. Eq. (30), or
$1/Q_2^2 = 1/(m_{23}^2 - m_2^2)$, depending on which side radiates. (Diagrams with a four-boson vertex, such as
$\gamma^*/Z^0 \to \tilde{t}\bar{\tilde{t}}g$, are not singular and therefore do not affect the discussion.) After summing
and squaring, the cross section then should have the form
$$\frac{1}{\sigma_0}\frac{d\sigma_{ME}}{dx_1\,dx_2} = \frac{\alpha_s}{2\pi}\,C_F\left[\frac{E_{CM}^4A(x_1, x_2)}{(m_{13}^2 - m_1^2)^2} + \frac{E_{CM}^4B(x_1, x_2)}{(m_{23}^2 - m_2^2)^2} + \frac{E_{CM}^4C(x_1, x_2)}{(m_{13}^2 - m_1^2)(m_{23}^2 - m_2^2)}\right]$$
$$= \frac{\alpha_s}{2\pi}\,C_F\left[\frac{A(x_1, x_2)}{(1 + r_2^2 - r_1^2 - x_2)^2} + \frac{B(x_1, x_2)}{(1 + r_1^2 - r_2^2 - x_1)^2} + \frac{C(x_1, x_2)}{(1 + r_2^2 - r_1^2 - x_2)(1 + r_1^2 - r_2^2 - x_1)}\right]. \tag{34}$$
The individual functions A, B and C depend on the gauge choice, but the total cross
section of course is gauge-independent. It also has some general features, which can be
seen in the soft-gluon limit expression, Eq. (18). Here the interference term is giving a
positive expression and the seemingly quadratically-divergent terms are negative and drive
the cross section to zero in the collinear limit, the dead cone. In order to have an upper
estimate of the cross section, for Monte Carlo applications, it is thus enough to have
the singularity structure of the interference term modelled, and no need to worry about
quadratic divergences.
As before, we could now compare the total shower and total matrix element rates, and
define a corrective factor between the two. To simplify some of the continued studies,
however, we have adopted an alternative approach. Instead of adding the two shower
histories to compare with the full matrix element, one can split the matrix element in two,
such that each part can be compared with only one shower history. Such a subdivision of
course is arbitrary, but should still be sensible. A gluon emitted close to the p1 direction
should predominantly be emitted by parton 1, and vice versa. A suitable such subdivision
is in the proportions given by the propagators, $1/Q_1^2 : 1/Q_2^2$, which also is the proportions
between the two shower histories (in the $1 + z^2 \to 2$ approximation). For massless partons,
and going to the soft-gluon limit, this means a probability $(1 + \cos\theta_{13})/2 = (1 - \cos\theta_{23})/2$
for emission off parton 1 and the rest for emission off parton 2.
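In the description of the gluon emission rate off parton 1, a matrix element fraction
$$W_{ME,1}(x_1, x_2) = \frac{Q_2^2}{Q_1^2 + Q_2^2}\,\frac{1}{\sigma_0}\frac{d\sigma_{ME}}{dx_1\,dx_2} = \frac{1 + r_1^2 - r_2^2 - x_1}{x_3}\,\frac{1}{\sigma_0}\frac{d\sigma_{ME}}{dx_1\,dx_2} \tag{35}$$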
should be compared with the first half of the total parton shower expression in Eq. (33),
$$W_{PS,1}(x_1, x_2) = \frac{\alpha_s}{2\pi}\,C_F\,\frac{2}{x_3\left(1 + r_2^2 - r_1^2 - x_2\right)}. \tag{36}$$
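The ME/PS correction factor then becomes
$$R_1(x_1, x_2) = \frac{W_{ME,1}(x_1, x_2)}{W_{PS,1}(x_1, x_2)} = \frac{\left(1 + r_2^2 - r_1^2 - x_2\right)\left(1 + r_1^2 - r_2^2 - x_1\right)}{2}\,\frac{1}{\frac{\alpha_s}{2\pi}C_F}\,\frac{1}{\sigma_0}\frac{d\sigma_{ME}}{dx_1\,dx_2}, \tag{37}$$
which reduces to $R_1(x_1, x_2) = N_{ME}(x_1, x_2, r)/2$ for $r_1 = r_2 = r$.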

The intention is that R1 (x1 , x2 ) should be finite and well-behaved over all of phase space,
with one factor of each divergence now multiplied on to the matrix element expression.
We will illustrate the behaviour later on, but the key observation is that, for all the matrix
elements we will study, searches over the full mass parameter plane and phase space have
failed to find any point where the ratio is above unity. $R_1$ therefore serves well as a simple
rejection factor. (Without the replacement $1 + z^2 \to 2$ in the shower description one would
find $R_1(x_1, x_2) > 1$, and some extra precautions would be necessary.) The correction factor
R2 for the other shower history looks the same. Put another way, the relative probability
for a gluon to be emitted by parton 1 or by parton 2 is unchanged by the matrix element
correction procedure.
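As a hedged illustration of how $R_1$ of Eq. (37) enters as a rejection weight, the sketch below specializes to the equal-mass vector matrix element of Eq. (17). The function names and the way the matrix element is passed in (already divided by $\frac{\alpha_s}{2\pi}C_F$) are choices made for this example only.

```python
import random


def n_me_vector(x1, x2, r):
    """Eq. (17), repeated here so that the example is self-contained."""
    x3 = 2.0 - x1 - x2
    ratio = (1.0 - x2) / (1.0 - x1) + (1.0 - x1) / (1.0 - x2)
    return (x1**2 + x2**2 - 4.0 * r**2 * x3 - 8.0 * r**4
            - (2.0 * r**2 + 4.0 * r**4) * ratio)


def r1_factor(x1, x2, r1, r2, me_over_coupling):
    """Eq. (37); me_over_coupling(x1, x2) is (1/sigma_0) dsigma_ME/dx1dx2 / (alpha_s C_F / 2 pi)."""
    d1 = 1.0 + r2**2 - r1**2 - x2
    d2 = 1.0 + r1**2 - r2**2 - x1
    return 0.5 * d1 * d2 * me_over_coupling(x1, x2)


def accept_emission(x1, x2, r, rng):
    """Accept or reject a trial emission off parton 1 for the equal-mass vector case."""
    def me(a, b):
        return n_me_vector(a, b, r) / ((1.0 - a) * (1.0 - b))
    return rng.random() < r1_factor(x1, x2, r, r, me)
```

For equal masses the two propagator factors cancel the denominator of Eq. (10), leaving $N_{ME}/2$, which is bounded above by $(x_1^2 + x_2^2)/2 \le 1$.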
The above procedure works not only for quarks, but also, e.g., for squarks. The
numerator of the splitting kernel would have been 2z rather than $1 + z^2$ [18], but both are
equally well approximated from above by 2. For a gluino, the colour charge is $N_C$ rather
than $C_F$, so the assumed shower emission rate has to be scaled up by a factor $N_C/C_F = 9/4$
(also for the recoiling colour triplet parton, since the separation of radiation is not
perfect), with no other change required.
We will also encounter processes where the decaying particle carries colour, like in
$t \to bW^+$. Then gluon emission off this particle has to be considered, i.e., graphs like
$t(0) \to t(i)g(3) \to b(1)W^+(2)g(3)$, proceeding via an intermediate off-shell top. Neglecting the
top width, this introduces a new kind of inverse propagators
$$Q_0^2 = m_0^2 - m_i^2 = m_{123}^2 - m_{12}^2 = 2(p_1 + p_2)p_3 = x_3E_{CM}^2. \tag{38}$$
The leading term of the matrix element, proportional to $1/(Q_0^2Q_1^2)$, will therefore have the
same singularity structure as the shower rate in Eq. (36). In a shower description, where
only the b is allowed to radiate, there is thus no problem in principle of letting that radiation
account for the full emission pattern of the matrix elements, i.e.,
$$R_1(x_1, x_2) = \frac{W_{ME}(x_1, x_2)}{W_{PS,1}(x_1, x_2)} = \frac{x_3\left(1 + r_2^2 - r_1^2 - x_2\right)}{2}\,\frac{1}{\frac{\alpha_s}{2\pi}C_F}\,\frac{1}{\sigma_0}\frac{d\sigma_{ME}}{dx_1\,dx_2}. \tag{39}$$
Since also here the shower rate turns out to exceed the matrix element one, a simple
rejection approach should work well.
3.3. Subsequent branchings
So far, we have considered matrix elements as providing the probability of exclusive
three-jet events. An alternative interpretation, however, would be in terms of an inclusive
density of gluon emissions, with the possibility of several such per event. This interpretation works well in the soft-gluon limit, while the emission of a hard gluon reduces the
phase space for subsequent emissions and thus ruins the picture of independent emissions.
Furthermore, the possibility for gluons to branch in their turn leads to the need to include
coherence effects [17] that constrain allowed emissions. Nevertheless, the matrix elements
can be used to extract important information, not only on the emission rate of hard gluons,
but also on that of soft and collinear ones.
Historically, different approaches have therefore been taken. In the JETSET/PYTHIA
procedure used until now, only the first branching is corrected by matrix-element
information. Subsequent emissions involve no corrections, but only the process-independent
splitting kernels, and therefore give the wrong emission rate off heavy quarks
that has already been noted. In the HERWIG routine, a correction is performed for every
emission that is the hardest so far [12]; whereas JETSET emissions tend to be ordered
in hardness, this is less likely in HERWIG, so the distinction is then relevant. ARIADNE,
finally, imposes a correction at all steps of the cascade.
Since our new algorithm does not have the correct dead-cone behaviour built in from the
onset, it is clear that some correction procedure will be required. Rather than imposing
the process-independent collinear behaviour, we have chosen to base ourselves on the
matrix elements also in this region. This offers a smooth interpolation between the hard
process-specific and the soft collinear universal behaviours, and sidesteps the issue of
which emissions are to be considered the hardest so far [12]. Thus the shower rate will be
corrected to the matrix element one at every step of the shower off the original partons.
The gluon cascading $g \to gg$ is unaffected, since there are no gluon-mass effects or
matrix elements to be considered here. Light quarks (in $\gamma^*/Z^0$ decay) are also essentially
unaffected, since there the shower matches so well with the matrix elements anyway.
The kinematics of the cascade changes in each emission, as the energy of the radiating
parton is reduced by the previous emissions. The mapping of an emission on to the matrix
element variables is thereby not unique. However, from the dead cone formula, Eq. (19),
we see that the emission angle and the mass-to-energy ratio of the emitting quark should
be represented faithfully in the choice of matching matrix element variables. For the
branchings subsequent to the first one, say $3 \to 7 + 8$ in Fig. 1a, the mother energy $E_3$
is fixed. The choice of a $Q^2 = m_3^2 - m_7^2$ and a z of the branching maps onto four-momenta
$p_7$ and $p_8$ as described in Eqs. (3)–(7), with $p_7^2 = m_q^2 > 0$ and $p_8^2 = 0$. In order to make
contact with the matrix element variables, now construct a hypothetical recoiling parton $2'$,
such that $\mathbf{p}_{2'} = -\mathbf{p}_3$ and $p_{2'}^2 = m_{\bar{q}}^2$. The CM energy of this reduced system is then given by
$$E'_{CM} = E_3 + E_{2'} = E_3 + \sqrt{\mathbf{p}_3^2 + m_{\bar{q}}^2} = E_3 + \sqrt{E_3^2 - Q^2 - m_q^2 + m_{\bar{q}}^2}. \tag{40}$$
Now matrix element and parton shower weights can be evaluated for $x'_1 = 2E_7/E'_{CM}$,
$x'_2 = 2E_{2'}/E'_{CM}$, $r'_1 = m_q/E'_{CM}$ and $r'_2 = m_{\bar{q}}/E'_{CM}$. Note that, in the limit $E_4 \to 0$, also
$E'_{CM} \to E_{CM}$, and the ordinary matrix elements are recovered.
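The construction of the reduced system, Eq. (40), can be sketched as follows. The function name, argument order and return convention are assumptions made for this illustration only.

```python
import math


def reduced_system(e3, q2, m_q, m_qbar, e7):
    """Eq. (40) and the primed variables for a later branching 3 -> 7 + 8.

    e3 -- energy of the radiating mother 3; q2 -- Q^2 = m_3^2 - m_7^2
    e7 -- energy of the daughter quark 7 after the branching
    Returns (E'_CM, x'_1, x'_2, r'_1, r'_2).
    """
    p3_sq = e3**2 - q2 - m_q**2            # |p_3|^2, using m_3^2 = Q^2 + m_q^2
    e2_prime = math.sqrt(p3_sq + m_qbar**2)  # hypothetical recoiler 2'
    e_cm_prime = e3 + e2_prime
    return (e_cm_prime,
            2.0 * e7 / e_cm_prime,
            2.0 * e2_prime / e_cm_prime,
            m_q / e_cm_prime,
            m_qbar / e_cm_prime)
```

As a consistency check, inserting the first-branching values $E_3 = (2 - x_2)E_{CM}/2$ and $Q^2 = (1 - x_2)E_{CM}^2$ with equal masses returns $E'_{CM} = E_{CM}$ and $x'_2 = x_2$.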
The above procedure is not unique. As an alternative, one could retain the original $E_{\mathrm{CM}}$,
but construct an off-shell $p_{2'}$ to carry all the energy and momentum of the system except for
$p_3 = p_7 + p_8$, i.e., $p_{2'} = p_2 + p_4$ in the case of Fig. 1a. This gives equivalent results so long
as the gluon energy already emitted is not too large, and else gives a somewhat lower rate
of wide-angle emission, reflecting that the parton $2'$ then is assigned such a large mass that
it radiates less. The overall picture therefore is less appealing, even if results in practical
applications are almost equivalent.
In this article, emphasis is put on the gluon emission off the primary quarks (or other
primary particles). The subsequent branchings $g \to gg$ have not been affected by the mass
considerations, so are not discussed here. The rate of $g \to q\bar{q}$ branchings is another topic
of some interest, given that LEP results do not quite agree with predictions [5]. This issue
is further discussed in Section 6.
However, given that such a secondary $c\bar{c}$ or $b\bar{b}$ pair has been produced, the possibility of
further gluon emission off this pair should be considered, even if most pairs are produced
at such a low mass that the phase space left for further radiation is limited. In order to
provide a sensible behaviour in the collinear region, again matrix-element input is applied,
calculated for the decay of a colour octet source. (Process 66 of Table 1, so with the wrong
spin of the source, but correct for the radiating parton, which is the main point.) To first
approximation, this means that radiation occurs independently off the $q$ and $\bar{q}$; see the
discussion on radiation patterns below. The kinematics for the matrix element corrections
is set up about as described above, i.e., by mapping onto a reduced system at rest with
preserved energy for the radiating parton. A simplification is that here the recoiling parton
always has the same mass as the radiating one.
3.4. Additional issues
In the older shower algorithm, the decay products of a branching $a \to bc$ were assumed
massless until assigned a mass by some subsequent step. This meant that the $p_\perp^2$ of a
branching, Eq. (7) with $\beta_a = 1$ as a first simplification, was simplified further to
$p_\perp^2 \approx z(1 - z)m_a^2$ when used as argument in $\alpha_s(p_\perp^2)$. In a branching $q \to qg$, where the $q$ is
assumed to have a non-vanishing rest mass, one would instead obtain
$p_\perp^2 \approx z(1 - z)m_a^2(1 - m_b^2/m_a^2)^2$. (This follows from Eq. (7) or alternatively from the rescaling of $x_3$ in Eq. (20).)
A smaller $p_\perp^2$ estimate implies a larger $\alpha_s$ and emission rate for a given kinematical
configuration, but also that more phase space is cut out by the requirement $p_\perp^2 > Q_0^2/4$,
where $Q_0$ is the soft cut-off scale of the cascade. The net result of such a change is therefore
not obvious, and we will study it later on.
Also the calculation of the approximate opening angle of a branching $a \to bc$, Eq. (8),
would be affected by the same considerations. The massive parton energy $E_b$ is increased
at the expense of $E_c$, as given by Eq. (20). Since the common $p_\perp$ is also decreased, the ratio
$p_\perp/E_c$ is preserved, while $p_\perp/E_b$ is decreased. The net result is a decrease of the opening
angle by a factor $(1 + (m_b^2/m_a^2)(1 - z)/z)^{-1}$. As another option, we will then consider
the consequences of such a decrease in the decay opening angle, without any change of
the production angle, as a way of minimally relaxing the angular ordering condition off
massive quarks.
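The size of the two mass effects discussed above can be sketched numerically. The following Python functions only evaluate the approximate expressions quoted in the text; the function names and the example values are assumptions made for illustration.

```python
def pt2_massless(z, ma2):
    """Older estimate: p_perp^2 ~ z (1 - z) m_a^2, daughters taken massless."""
    return z * (1.0 - z) * ma2


def pt2_massive(z, ma2, mb2):
    """Estimate for q -> q g with a massive daughter b of squared mass mb2."""
    return z * (1.0 - z) * ma2 * (1.0 - mb2 / ma2) ** 2


def opening_angle_factor(z, ma2, mb2):
    """Reduction factor of the decay opening angle for a massive daughter b."""
    return 1.0 / (1.0 + (mb2 / ma2) * (1.0 - z) / z)


def passes_cutoff(pt2, q0):
    """Phase-space requirement p_perp^2 > Q_0^2 / 4."""
    return pt2 > q0**2 / 4.0


if __name__ == "__main__":
    # Illustrative values: z = 0.5, m_a^2 = 100 GeV^2, m_b^2 = 25 GeV^2.
    print(pt2_massless(0.5, 100.0), pt2_massive(0.5, 100.0, 25.0))
    print(opening_angle_factor(0.5, 100.0, 25.0))
```

The mass-corrected $p_\perp^2$ is always the smaller of the two, which is what drives both the larger $\alpha_s$ and the reduced phase space.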
While our choice of Q2 m2 variable has significant advantages for the matching
to matrix-element expressions, it does not offer as neat an implementation of coherence
effects as the angular variable of HERWIG or the transverse momentum one of ARIADNE.
Without any further constraints, the amount of radiation is overestimated. Therefore, by
default, angular ordering of emissions is imposed as a further constraint. This, on the other
hand, tends to restrict emissions somewhat too much. The bulk of these ambiguities affect
rather soft gluons, which do not give rise to separate jets. For the precision studies in
Section 5.1, however, also small effects could be of interest. We have therefore introduced
a new intermediate coherence option, as compared to the minimal modification above,
wherein no angular constraint is imposed on emissions off the primary $q\bar{q}$ pair. These
emissions thus are ordered only in mass. Angular ordering is still imposed in the cascades
initiated by the gluons emitted off the primary quarks. In particular each such cascade
is restricted to a cone given by the emission angle of the initiating gluon. In this option,
generic event properties are only slightly changed compared with the default procedure.
So far, we have mainly considered configurations where a colour singlet decays to
stable particles, e.g., $\gamma^*/Z^0 \to b\bar{b}$. Another class of events involves sequential decays of
coloured objects. The obvious example would be $\gamma^*/Z^0 \to t\bar{t} \to bW^+\,\bar{b}W^-$. Even leaving
aside the continued fate of the Ws, e.g., assuming they decay leptonically, the event now
contains four colour charges that may radiate. In the limit $\Gamma_t \to 0$, the radiation in the
top production stage $\gamma^*/Z^0 \to t\bar{t}$ decouples completely from that in the top decays. For a
finite $\Gamma_t$, gluons with energies below or around this scale can receive contributions from
many colour sources, however, leading to complex radiation patterns [2]. Close to the $t\bar{t}$
threshold, the main sources are the respective top decays, described by dipoles $\widehat{tb}$ and $\widehat{\bar{t}\bar{b}}$,
and radiation off the $\widehat{b\bar{b}}$ dipole created after the $t$ and $\bar{t}$ have decayed. Gluons with energies
above (below) $\Gamma_t$ predominantly feel the former (latter) dipoles. The radiation pattern can
be written as
$$\frac{E_g^2}{\Gamma_t^2 + E_g^2}\left(\widehat{tb} + \widehat{\bar{t}\bar{b}}\right) + \frac{\Gamma_t^2}{\Gamma_t^2 + E_g^2}\,\widehat{b\bar{b}}. \tag{41}$$
Thus the reduced radiation induced by the top finite lifetime is compensated by the
radiation from new sources that would not have been present for a long-lived top. The
$b\bar{b}$ dipole introduces a dependence on the opening angle between the $b$ and the $\bar{b}$ that is not
there in the separate top decays, so the compensation is not complete. On the perturbative
level, this gives a dipole effect [19] that could be observed at low momenta. However, even
in the approximation of allowing radiation only within each top quark decay separately, the
string fragmentation picture [1] would imply that a nonperturbative colour string should be
stretched between the $b$ and $\bar{b}$ (or their respective cascades), and this would introduce a
string effect [20] of almost equal character and magnitude [21]. Thus only very careful
studies, e.g., for high-precision measurements of the top mass, would be sensitive to the
detailed nature of the soft-gluon emission source. The critical transition is the one between
a top long-lived enough to produce top hadrons and one decaying too rapidly for that,
where the hadronic final state does change character.
The shower algorithm contains options that allow it to be run either with soft or with
hard emission dampened according to the respective factors in Eq. (41). However, currently
the PYTHIA program contains no machinery to detect when a description of this kind
is required, nor a prescription how to combine all possible sources of radiation. This is
obviously an interesting task for the future, but one that will be required primarily for
particles with a width significantly larger than that of the top.
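The two damping factors of Eq. (41) are simple enough to tabulate directly. The sketch below is a stand-alone Python illustration and not the PYTHIA implementation; the function name and the width value used in the example are assumptions.

```python
def dipole_weights(e_gluon, gamma_t):
    """Return the weights multiplying the two dipole classes in Eq. (41).

    The first weight applies to the top-decay dipoles (t b and tbar bbar),
    the second to the b bbar dipole.
    """
    denom = gamma_t**2 + e_gluon**2
    w_decay = e_gluon**2 / denom   # dominates for E_g >> Gamma_t
    w_bb = gamma_t**2 / denom      # dominates for E_g << Gamma_t
    return w_decay, w_bb


if __name__ == "__main__":
    gamma_t = 1.5  # GeV, illustrative value only
    for e_g in (0.1, 1.5, 10.0):
        print(e_g, dipole_weights(e_g, gamma_t))
```

The two weights always add up to unity and cross at $E_g = \Gamma_t$, which is the transition between the two regimes described above.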
The normal generation sequence therefore contains a set of separated showers. For
instance, in the top example above, the $\gamma^*/Z^0 \to t\bar{t}$ induces a first cascade, whereby the $t$
and $\bar{t}$ shower down to the mass shell. Thereafter follows the respective top decay, $t \to bW^+$
and $\bar{t} \to \bar{b}W^-$, and the separate radiation in those decays. At some yet later stage, the two
Ws may decay to quarks that again radiate. Anytime a colour singlet is exchanged, like the
Ws above, there is a clean separation into disjoint QCD subsystems, while the exchange of
a coloured state like the top will hook up separately showering systems to the strings that
later will produce the observable hadrons. Interconnection effects [22] could complicate
this picture, but so far the evidence is that any such effects would be small.
In hadronic collisions, more complex processes would occur, and also initial-state QCD
radiation has to be considered as a potential source of further interference effects. In this
article we will not address these additional complications, but defer that for some future
study. Currently such interference is almost completely neglected, except for some angular
restrictions [23].
Even if $\Gamma_t$ is neglected in the showering, for the sake of providing a unique separation of
radiation before and after the top propagator, this does not mean that all top quarks have to
have the same mass. Instead, resonance masses are chosen according to the relevant
Breit–Wigners, convoluted with the respective cross section formulae. In particular, this means
that the $t$ and $\bar{t}$ masses of an event would be unequal. The shower algorithm described above
then operates on these event-by-event masses, without any reference to their nominal on-shell equivalents.
The hadronization of a partonic configuration, obtained by the chain of decays and
showers outlined above, is described by the Lund string model [1]. All coloured partons
belong to strings, stretched from a quark endpoint via a number of intermediate gluons to
an antiquark one. (Also diquark endpoints and closed gluon loops can be considered, but
are of no relevance in this article.) Normally each such string would have a reasonably large
invariant mass, enough to produce several hadrons. However, occasionally a string could
come to have a small mass, e.g., by the splitting of a string in two by shower branchings
$g \to q\bar{q}$. Then a special treatment may be required for one- or two-hadron decays. While the
normal string algorithm has remained essentially unchanged over a number of years, this
low-mass cluster treatment has recently been improved [24]. The direct consequences
for the topics studied here are minimal, except that quark mass values have now been
optimized, especially to describe charm asymmetries in fixed-target experiments. The
current default values thus are $m_u = m_d = 0.33$ GeV, $m_s = 0.5$ GeV, $m_c = 1.5$ GeV and
$m_b = 4.8$ GeV.
4. Matrix elements
As input and starting point for our shower studies, we will need the matrix elements
for the processes of interest. In this article this essentially means the two-body decay
of a particle, with associated gluon radiation. Some of the formulae are available in the
literature, but most are not, or at least not easily found. We have therefore calculated a
number of processes. This also gives us a chance to test the degree of universality of the
radiation patterns in channels of different colour and spin structure, but with the same
masses. These results are interesting in their own right.
4.1. Calculations
A number of matrix elements have been calculated, using COMPHEP [25] for the
actual calculation, including an extension package for Supersymmetric processes [26], and
MATHEMATICA [27] for subsequent simplification of the expressions. The list of lowest-order
(LO) processes is given in Table 1. The LO expression gives the two-body decay rate
of a particle, i.e., $a \to bc$, and the matching first-order (FO) one the same decay with an
additional gluon in the final state, $a \to bcg$.
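Table 1
The processes that have been calculated, also with one extra gluon in the final state. Colour is given
with 1 for singlet, 3 for triplet and 8 for octet. See the text for an explanation of the $\gamma_5$ column and
further comments

Colour                Spin                $\gamma_5$                        Example                               Codes
$1 \to 3 + \bar{3}$   -                   -                                 (eikonal)                             6–9
$1 \to 3 + \bar{3}$   $1 \to 1/2 + 1/2$   1, $\gamma_5$, $1 \pm \gamma_5$   $Z^0 \to q\bar{q}$                    11–14
$3 \to 3 + 1$         $1/2 \to 1/2 + 1$   1, $\gamma_5$, $1 \pm \gamma_5$   $t \to bW^+$                          16–19
$1 \to 3 + \bar{3}$   $0 \to 1/2 + 1/2$   1, $\gamma_5$, $1 \pm \gamma_5$   $H^0 \to q\bar{q}$                    21–24
$3 \to 3 + 1$         $1/2 \to 1/2 + 0$   1, $\gamma_5$, $1 \pm \gamma_5$   $t \to bH^+$                          26–29
$1 \to 3 + \bar{3}$   $1 \to 0 + 0$       1                                 $Z^0 \to \tilde{q}\bar{\tilde{q}}$    31–34
$3 \to 3 + 1$         $0 \to 0 + 1$       1                                 $\tilde{q} \to \tilde{q}'W^+$         36–39
$1 \to 3 + \bar{3}$   $0 \to 0 + 0$       1                                 $H^0 \to \tilde{q}\bar{\tilde{q}}$    41–44
$3 \to 3 + 1$         $0 \to 0 + 0$       1                                 $\tilde{q} \to \tilde{q}'H^+$         46–49
$1 \to 3 + \bar{3}$   $1/2 \to 1/2 + 0$   1, $\gamma_5$, $1 \pm \gamma_5$   $\tilde{\chi} \to q\bar{\tilde{q}}$   51–54
$3 \to 3 + 1$         $0 \to 1/2 + 1/2$   1, $\gamma_5$, $1 \pm \gamma_5$   $\tilde{q} \to q\tilde{\chi}$         56–59
$3 \to 3 + 1$         $1/2 \to 0 + 1/2$   1, $\gamma_5$, $1 \pm \gamma_5$   $t \to \tilde{t}\tilde{\chi}$         61–64
$8 \to 3 + \bar{3}$   $1/2 \to 1/2 + 0$   1, $\gamma_5$, $1 \pm \gamma_5$   $\tilde{g} \to q\bar{\tilde{q}}$      66–69
$3 \to 3 + 8$         $0 \to 1/2 + 1/2$   1, $\gamma_5$, $1 \pm \gamma_5$   $\tilde{q} \to q\tilde{g}$            71–74
$3 \to 3 + 8$         $1/2 \to 0 + 1/2$   1, $\gamma_5$, $1 \pm \gamma_5$   $t \to \tilde{t}\tilde{g}$            76–79
$1 \to 8 + 8$         -                   -                                 (eikonal)                             81–84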
While the matrix element calculations in this section have been performed from scratch,
some checks are based on results in the literature. These include: $V \to q\bar{q}$ for $m_q = m_{\bar{q}}$
[16], $H^0 \to q\bar{q}$ for $m_q = m_{\bar{q}}$ [28], $V \to \tilde{q}\bar{\tilde{q}}$ for $m_{\tilde{q}} = m_{\bar{\tilde{q}}}$ [18], and $t \to bW^+$ for $m_b = 0$
[13]. No doubt, many more are available, without our knowledge.
The process selection is based on the particle content of the Minimal Supersymmetric
Standard Model (MSSM), i.e., includes squarks $\tilde{q}$, gluinos $\tilde{g}$, neutralinos and charginos
$\tilde{\chi}$, and Higgs states $h^0$, $H^0$, $A^0$ and $H^\pm$. The idea, however, is that these situations could
represent also a number of other non-standard particles. For instance, the decay of a spin 0
leptoquark $L_Q \to q\ell$ is closely similar to $\tilde{q} \to q\tilde{\chi}$.
All calculations have been performed in the zero-width limit of the decaying state and
the decay products, in order to allow a gauge invariant separation of radiation in the
production and decay stages. (As explained above, the Monte Carlo simulation of processes
does include mass selection according to the appropriate Breit–Wigners, so we here only
comment on the width dependence of the additional gluon radiation, i.e., in the ratio of
first to leading order cross sections.) Such a separation occurs naturally for exchanged
colourless particles, so is then no problem. For a coloured particle of width $\Gamma$, this is a poor
approximation in the region of gluon energies (in the particle rest frame) below $\Gamma$. The $t$ is
still sufficiently narrow that the production–decay interference is not a major problem, see
above, but for heavier squarks and gluinos a more complex description may be required.
The lowest coloured SUSY states, e.g., stop, tend to have small widths, however, and it is
on such states we will concentrate our studies. The additional complications for very wide
particles will be deferred to some future study.
The classification by colour and spin is fairly obvious, but that does not completely
specify the structure of the process. Consider, e.g., $e^+e^- \to \gamma^*/Z^0 \to q\bar{q}$. The cross
section (neglecting mass effects) is then
$$\sigma_0 \propto e_e^2 e_q^2 + 2 e_e v_e e_q v_q\,\mathrm{Re}\,\chi + (v_e^2 + a_e^2)\,|\chi|^2\,(v_q^2 + a_q^2), \tag{42}$$
where
$$\chi = \frac{1}{16 \sin^2\theta_W \cos^2\theta_W}\,\frac{s}{s - m_Z^2 + i m_Z \Gamma_Z} \tag{43}$$
represents the ratio between the $Z^0$ and the $\gamma^*$ propagators and couplings. The term
proportional to $a_q^2$ corresponds to the $q\bar{q}$ pair coming from an axial vector source, the rest
to it coming from a vector source. Since the QCD radiation from these two is somewhat
different, we need to include the proper mixture, which depends on the CM energy,
$s = E_{\mathrm{CM}}^2$. It has then been simplest to perform the calculations for a pure vector source and a
pure axial vector source, represented by 1 and $\gamma_5$ in Table 1, and mix in the proportions
required. We note that this mixing strategy is possible since the total cross section does
not contain any interference terms of the character $v_q a_q$. (Such terms do arise when
the forward–backward charge asymmetry is considered.) The pure left-handed mixture
$V - A = 1 - \gamma_5$ is of special interest, since it represents the W bosons. Therefore it has
been calculated separately, and is denoted by $1 \pm \gamma_5$ in Table 1; as we already noted the sign
(of the interference term in the squared matrix element) is irrelevant for the QCD emission
aspects. In total, four alternatives are therefore open in our implementation: to have a pure
vector source, a pure axial vector, an arbitrary mixture $\alpha V + (1 - \alpha)A$ and the special
equal mixture. In this order, that gives the four codes 11–14 in Table 1, according to the
numbering scheme used in the PYTHIA function PYMAEL introduced in version 6.153.
(This routine takes as input the process code, $x_1$, $x_2$, $r_1$, $r_2$ and $\alpha$, and returns the ratio
$(1/\sigma_0)\, d\sigma/dx_1 dx_2$, omitting a factor of $(\alpha_s/2\pi) C_F$.)
Correspondingly, also most other processes can come either with or without $\gamma_5$ factors
in the amplitude, or in arbitrary mixtures thereof. In the Higgs sector, normally the $h^0$ and
$H^0$ are scalar, the $A^0$ pseudoscalar and the $H^\pm$ a parameter-dependent mixture of 1 and
$\gamma_5$. If the coupling structure is generalized, also the neutral Higgses could be mixtures,
however. The $\tilde{q}_L$ and $\tilde{q}_R$ squark partners of the left- and right-handed quarks come with
wave function factors $1 - \gamma_5$ and $1 + \gamma_5$, respectively. The squark mass eigenstates will be
mixtures of these, with significant mixing expected especially in the third generation. With
two squarks in a process, $\gamma_5^2 = 1$ ensures that the matrix element still can be written as a
sum $\alpha\,1 + (1 - \alpha)\gamma_5$. Again, therefore, we have chosen to perform most of the calculations
with and without a $\gamma_5$ factor and then leave open to have the mixing depend on the current
parameter choice. As a further simplification, the $1 \pm \gamma_5$ mixture is used whenever the
correct choice is not known, since this mixture represents an average behaviour. In some
instances, further restrictions exist, e.g., a pseudoscalar cannot decay to two scalars, at least
among the MSSM processes at our disposal.
The process at the top of Table 1 is the spin-independent eikonal answer of Eq. (18),
extended from the soft region where it is intended to be valid:
$$\frac{1}{\sigma_0}\frac{d\sigma}{dx_1\,dx_2} = \frac{\alpha_s}{2\pi} C_F \left\{ \frac{2(x_1 + x_2 - 1 - r_1^2 - r_2^2)}{(1 + r_1^2 - r_2^2 - x_1)(1 + r_2^2 - r_1^2 - x_2)} - \frac{2r_1^2}{(1 + r_2^2 - r_1^2 - x_2)^2} - \frac{2r_2^2}{(1 + r_1^2 - r_2^2 - x_1)^2} \right\}. \tag{44}$$
The first numerator, $2(x_1 + x_2 - 1 - r_1^2 - r_2^2)$, here is based on an evaluation of the $2p_1p_2$
numerator of Eq. (18) with $p_1$ and $p_2$ given by their values after the emission of $p_3$. Away
from the soft-gluon limit, however, there is some leeway in this assumption. If instead
the $p_1$ and $p_2$ values before the emission of $p_3$ had been used, one would have obtained
$x_1 + x_2 - 1 = 1 - x_3 \to 1$. Such a seemingly minor substitution has quite dramatic effects
for collinear emission even at rather small $x_3$, however, and is not really an option. In order
to have a not too unrealistic alternative to compare with, we therefore only allow deviations
proportional to $x_3^2$. (This is also what comes out of the $x_1^2 + x_2^2$ numerator of the massless
process $V \to q\bar{q}g$.) The extreme in this direction would be $x_1 + x_2 - 1 \to 1 - x_3 + x_3^2$,
but we will also allow arbitrary admixtures $x_1 + x_2 - 1 \to 1 - x_3 + \alpha x_3^2$, with $\alpha$ as a free
parameter, obviously in no physics relation to the $\alpha$ introduced above.
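As a concrete reading of Eq. (44), the following Python sketch evaluates the eikonal rate with the overall factor $(\alpha_s/2\pi)C_F$ left out. The function name is hypothetical, and applying the $\alpha x_3^2$ admixture inside the mass-dependent numerator is an assumption of this sketch, since the text only states the substitution for the combination $x_1 + x_2 - 1$.

```python
def eikonal_rate(x1, x2, r1, r2, alpha=0.0):
    """Eikonal (1/sigma_0) dsigma/dx1 dx2 of Eq. (44), without (alpha_s/2pi) C_F.

    alpha = 0 gives the plain eikonal, alpha = 1 the extreme x3^2 admixture.
    The caller is responsible for staying inside the physical phase space.
    """
    x3 = 2.0 - x1 - x2
    d1 = 1.0 + r1**2 - r2**2 - x1      # propagator-like factor of parton 2
    d2 = 1.0 + r2**2 - r1**2 - x2      # propagator-like factor of parton 1
    # x1 + x2 - 1 equals 1 - x3; the admixture adds alpha * x3^2 (assumption).
    numerator = 2.0 * (x1 + x2 - 1.0 + alpha * x3**2 - r1**2 - r2**2)
    return numerator / (d1 * d2) - 2.0 * r1**2 / d2**2 - 2.0 * r2**2 / d1**2


if __name__ == "__main__":
    print(eikonal_rate(0.9, 0.9, 0.0, 0.0))
    print(eikonal_rate(0.9, 0.9, 0.0, 0.0, alpha=1.0))
    print(eikonal_rate(0.9, 0.9, 0.2, 0.2))
```

At the same $(x_1, x_2)$ point the mass terms lower the rate, in line with the dead cone suppression built into the eikonal form.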
In all the formulae, the decay product mass ratios are kept as free parameters, $r_1 =
m_b/m_a$ and $r_2 = m_c/m_a$, while the gluon is massless. Since we are not interested in cross
sections per se, but in the probability for gluon emission, in the LO cross sections only the
mass dependence is retained, normalized to unity for $r_1 = r_2 = 0$, e.g.,
$$\sigma_0(V, A \to q_1\bar{q}_2) = \frac{1}{2}\left\{ 2 - r_1^2 - r_2^2 - (r_1^2 - r_2^2)^2 \pm 6r_1r_2 \right\} \sqrt{(1 - r_1^2 - r_2^2)^2 - 4r_1^2r_2^2}. \tag{45}$$
What is omitted is then some set of couplings and propagators. Exactly the same set is also
omitted from the first-order $d\sigma/dx_1 dx_2$, thereby leaving the ratio unchanged. Additionally
the common factor $(\alpha_s/2\pi)C_F$ is omitted, to leave the choice of $\alpha_s(p_\perp^2)$ free to be made
elsewhere. The ratio of the first to leading order cross sections then gives the assumed
differential gluon-emission rate. In the case of a mixture without and with $\gamma_5$ factors, the
sum has to be taken for numerator and denominator separately, e.g.,
$$\frac{1}{\sigma_0}\frac{d\sigma}{dx_1\,dx_2}(e^+e^- \to \gamma^*/Z^0 \to q\bar{q}) = \frac{\alpha\, d\sigma^V/dx_1 dx_2 + (1 - \alpha)\, d\sigma^A/dx_1 dx_2}{\alpha\,\sigma_0^V + (1 - \alpha)\,\sigma_0^A}. \tag{46}$$
The vector fraction $\alpha$ can here be read off from the lowest-order expression in Eq. (42),
where the mass factors of Eq. (45) were omitted. The mass effects are instead included in
the individual $d\sigma/dx_1 dx_2$ and $\sigma_0$ terms of Eq. (46). This standard, e.g., means that $V - A$
corresponds to $\alpha = 1/2$, which then because of mass effects gives a somewhat larger vector
than axial fraction.
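The normalization of Eq. (45) and the mixing prescription of Eq. (46) can be sketched as follows. This is a minimal Python illustration with hypothetical function names; the first-order rates are passed in as plain numbers, and the upper (lower) sign of the $6r_1r_2$ term is taken for the vector (axial) source.

```python
import math


def sigma0_va(r1, r2, axial=False):
    """Mass factor of Eq. (45), normalized to unity for r1 = r2 = 0."""
    lam = (1.0 - r1**2 - r2**2) ** 2 - 4.0 * r1**2 * r2**2
    if lam <= 0.0:
        return 0.0  # below threshold
    sign = -1.0 if axial else 1.0
    bracket = 2.0 - r1**2 - r2**2 - (r1**2 - r2**2) ** 2 + sign * 6.0 * r1 * r2
    return 0.5 * bracket * math.sqrt(lam)


def mixed_rate(alpha, dsig_v, dsig_a, r1, r2):
    """Eq. (46): mix numerator and denominator separately.

    dsig_v, dsig_a : first-order d(sigma)/dx1 dx2 for pure V and pure A sources,
                     evaluated at the same phase-space point.
    """
    numerator = alpha * dsig_v + (1.0 - alpha) * dsig_a
    denominator = alpha * sigma0_va(r1, r2) + (1.0 - alpha) * sigma0_va(r1, r2, axial=True)
    return numerator / denominator


if __name__ == "__main__":
    print(sigma0_va(0.2, 0.2), sigma0_va(0.2, 0.2, axial=True))
```

For equal masses the two factors reduce to $(1 + 2r^2)\beta$ and $\beta^3$ with $\beta = \sqrt{1 - 4r^2}$, so the vector part is indeed the larger one at $\alpha = 1/2$.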
Note that first-order corrections to the total cross section are not included in the $\sigma_0$
denominator. This is not a unique choice, but a rather natural one: if we consider the gluon
emission rate as the ratio of two cross sections, then including $O(\alpha_s)$ corrections to one
but not the other is not likely to improve the overall accuracy of the calculation. In the
soft-gluon limit, it would even break the spin independence of the radiation pattern (cf. the
next subsection), i.e., give the wrong physics. And since a complete one-loop calculation
of both quantities is well beyond the scope of this article, we remain with lowest non-trivial
order for both quantities. Furthermore, if the total cross section is written in the
form $\sigma_{\mathrm{tot}} = \sigma_0(1 + a\,\alpha_s/\pi)$, then decay rates calculated so far tend to give $a$ values of
order unity [28,29], i.e., small effects. An exception would be Coulomb corrections in the
threshold region, but there the phase space for real gluon emission is vanishingly small
anyway, so of no physical interest. By contrast, one-loop corrections to three-jet rates tend
to be larger [30], although most of that is absorbed by the parton-shower choice of a smaller
kinematics-dependent scale like $p_\perp^2$ [30,31].
4.2. Radiation patterns
Before applying the matrix elements to specific physics situations, it is interesting
to compare them between each other under similar conditions. This will provide an
understanding of the amount of spin and colour dependence in the matrix elements, and
thus the extent to which a process-dependent Monte Carlo implementation can be expected
to provide an improvement over a process-blind one, e.g., based on some simple dead cone
formula.
Fig. 3 is intended to provide a first qualitative glimpse of differences. The amount
of detail may be bewildering, but at this point the idea is to bring up differences and
similarities in a broad sense, without concentrating on each process specifically.
First consider the full curves in Figs. 3a–c, which are all for the colour structure
$1 \to 3 + \bar{3}$ and the same kinematics, and only differ by the spin pattern: $1 \to 1/2 + 1/2$,
$0 \to 1/2 + 1/2$, $1 \to 0 + 0$, $0 \to 0 + 0$, and $1/2 \to 1/2 + 0$, in relevant cases with and without a
$\gamma_5$ (but no intermediate mixtures such as $1 \pm \gamma_5$). In the first frame, where the gluon is soft,
these curves almost completely overlap. At smaller $x_3$ the agreement becomes even better,
and one can then truly speak of a universal soft-gluon emission pattern, with a characteristic
dead cone of opening angle $2r = 0.4 \approx 23^\circ$ in this example. By contrast, when $x_3$ is
increased, the curves tend to disagree more and more. At $x_3 = 0.3$, the dead cone is still
visible, although it is starting to fill in, but at $x_3 = 0.6$ it is completely gone. Note that
the vertical scale changes significantly between the three frames; in absolute numbers the
differences between the curves are about a factor two larger at $x_3 = 0.1$ than at 0.6, but in
relative terms the differences are negligible at small $x_3$.
This should bring home the message that the dead cone concept can only be used at
small gluon energies, and is irrelevant for energetic gluons. It is only the lowest of the full
curves, the completely spinless process $0 \to 0 + 0$, that does preserve the exact dead cone
concept, i.e., has a cross section that always vanishes in the collinear limit (in the rest frame
of the decaying particle).
Fig. 3. The gluon emission rate as a function of the emission angle $\theta_{13}$. Specifically, the vertical axis
gives $(1/\sigma_0)\,d\sigma/dx_1 dx_2$ with a normalization factor $(\alpha_s/2\pi)C_F$ removed. This three-jet phase space
density differs from $d\sigma/d\theta_{13}\,dx_3$ by a simple Jacobian. A variety of different processes are compared:
decay of a colour singlet to a triplet plus an antitriplet (full curves), ditto in the eikonal approximation
(dotted), decay of a triplet to a triplet plus singlet (dashed), and gluino processes (dash-dotted). The four
frames differ in the scaled masses $r_i = m_i/E_{\mathrm{CM}}$ of the two decay products, and in the gluon energy
fraction $x_3$. Further explanations are given in the text.
The lower dotted curve is the eikonal expression of Eq. (44), which is constructed to have
an exact dead cone. It does agree fairly well with the spinless process, but not with anything
else. The upper dotted curve is the modified eikonal, with an $x_3^2$ term added. It tends to
overshoot all processes at small angles, while some intermediate mixture, $\alpha \approx 0.5$, would
do a sensible job for many processes in the small-angle region. For medium small gluon
energies and large angles, both eikonal forms undershoot, however. In general, one may
conclude that the eikonal formula is not particularly useful for practical considerations,
and that the process-specific matrix elements need to be used.
Of course, the detailed pattern depends on the masses. In Fig. 3d these are lower than
in the first three, and representative for $Z^0 \to b\bar{b}$. Then the dead cone concept works up
to somewhat larger gluon energies, and a trace of it is still left at $x_3 = 0.6$. It is again
preserved exactly for $0 \to 0 + 0$ and the eikonal, and also approximately for $1 \to 0 + 0$.
In order to quantify the difference in the total amount of gluon radiation, we compare
three measures on the three-jet phase space. Since the total three-jet cross section contains
a soft-gluon divergence, this soft region has to be avoided. One measure is thus to integrate
the total amount of radiated gluon energy, in shorthand
$$\langle x_3 \rangle = \int x_3\,\frac{1}{\sigma_0}\frac{d\sigma}{dx_1\,dx_2}\,dx_1\,dx_2. \tag{47}$$
Correspondingly we define a shorthand $\langle (1 - x_1)(1 - x_2) \rangle$ as an alternative removal of the
denominator of the matrix elements (generalized to $\langle (1 + r_1^2 - r_2^2 - x_1)(1 + r_2^2 - r_1^2 - x_2) \rangle$
for unequal masses). Finally, a Durham distance [32] $y_{ij} = \min(x_i^2, x_j^2)(1 - \cos\theta_{ij})/2$
is used to define a hard three-jet region for which $y_D = \min(y_{12}, y_{13}, y_{23}) > 0.1$ and a
corresponding three-jet rate $\langle \Theta(y_D - 0.1) \rangle$, where $\Theta$ is the step function. To simplify a
comparison between the processes, all results have arbitrarily been normalized to those for
the $V \to q\bar{q}$ process.
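The hard three-jet criterion can be made explicit for the special case of three massless partons, where the opening angles follow from the energy fractions alone. The Python sketch below rests on that massless assumption, and its function names are hypothetical; for massive daughters the angles would have to be computed from the full kinematics.

```python
def durham_y_min(x1, x2):
    """Smallest Durham distance y_D for three massless partons.

    Uses y_ij = min(x_i^2, x_j^2) (1 - cos theta_ij) / 2 together with the
    massless relation (1 - cos theta_ij) / 2 = (1 - x_k) / (x_i x_j).
    Assumes all three energy fractions are strictly positive.
    """
    x = (x1, x2, 2.0 - x1 - x2)
    y_values = []
    for i, j, k in ((0, 1, 2), (0, 2, 1), (1, 2, 0)):
        half_one_minus_cos = (1.0 - x[k]) / (x[i] * x[j])
        y_values.append(min(x[i], x[j]) ** 2 * half_one_minus_cos)
    return min(y_values)


def is_hard_three_jet(x1, x2, y_cut=0.1):
    """Step function Theta(y_D - y_cut) used for the three-jet rate."""
    return durham_y_min(x1, x2) > y_cut


if __name__ == "__main__":
    print(durham_y_min(2.0 / 3.0, 2.0 / 3.0), is_hard_three_jet(2.0 / 3.0, 2.0 / 3.0))
    print(durham_y_min(0.95, 0.95), is_hard_three_jet(0.95, 0.95))
```

The symmetric configuration with all $x_i = 2/3$ gives $y_D = 1/3$ and is counted as a hard three-jet event, while a soft gluon with $x_3 = 0.1$ is not.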
As can be seen in Table 2, the three measures give about the same message, namely
that differences between the processes are significant. In $\langle x_3 \rangle$ the ratio between the two
extremes is a factor 1.30, in $\langle (1 - x_1)(1 - x_2) \rangle$ 1.70, and in $\langle \Theta(y_D - 0.1) \rangle$ 1.91. This
shows a steady progression of larger ratios the more one is biased towards the hard three-jet
region. The eikonal expression is below all the calculated processes, and the modified
eikonal (full blast) above.
The results in Table 2 are for the case where the daughter masses constitute a significant
fraction of the energy available. The other limit, where instead the daughters are massless,
is shown in Table 3. The main message is that the process dependence remains also in
the latter case, even if normally reduced in magnitude by about a factor of two between
the extremes. The detailed picture is not so simple, however. Some processes agree in
the massless limit when they do not for nonvanishing masses while, in the other extreme,
others disagree even more for vanishing masses.
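In the massless limit the $\langle (1 - x_1)(1 - x_2) \rangle$ measure reduces to an integral of the matrix-element numerator over the phase space $x_1, x_2 \le 1$, $x_1 + x_2 \ge 1$, so the eikonal entries 0.667 and 1.667 of Table 3 can be cross-checked by hand or numerically. The Python sketch below does this with a midpoint rule; the substitution $x_3 = s$, $1 - x_1 = st$ and all function names are choices made for this illustration, and the $x_1^2 + x_2^2$ numerator of the massless $V \to q\bar{q}g$ process is used as reference.

```python
def integrate_massless(numerator, n=200):
    """Integrate numerator(x1, x2) over the massless three-body phase space.

    Substitution: x3 = s, 1 - x1 = s t, 1 - x2 = s (1 - t), with Jacobian s,
    which maps the triangle onto the unit square 0 < s, t < 1.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        for j in range(n):
            t = (j + 0.5) * h
            x1 = 1.0 - s * t
            x2 = 1.0 - s * (1.0 - t)
            total += numerator(x1, x2) * s * h * h
    return total


def vector_numerator(x1, x2):
    """Numerator of the massless V -> q qbar g matrix element."""
    return x1**2 + x2**2


def eikonal_numerator(x1, x2, alpha=0.0):
    """Massless eikonal numerator 2 (1 - x3 + alpha x3^2), cf. Eq. (44)."""
    x3 = 2.0 - x1 - x2
    return 2.0 * (1.0 - x3 + alpha * x3**2)


reference = integrate_massless(vector_numerator)
ratio_eikonal = integrate_massless(eikonal_numerator) / reference
ratio_modified = integrate_massless(lambda a, b: eikonal_numerator(a, b, 1.0)) / reference

if __name__ == "__main__":
    print(ratio_eikonal, ratio_modified)
```

Analytically the three integrals are $1/2$, $1/3$ and $5/6$, giving the ratios $2/3$ and $5/3$.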
The set of processes with colour structure $3 \to 3 + 1$ can be viewed as crossed versions
of the $1 \to 3 + \bar{3}$ ones, but the differences in kinematics result in another overall picture.
Examples of radiation patterns are shown by dashed curves in Fig. 3. The small-angle
behaviour displays the universal dead cone effect at small $x_3$ and again diverges wildly at
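Table 2
Three measures on the amount of gluon radiation in different processes, for mass ratios $r_1 = r_2 =
0.2$. For clarity, results have been normalized to the process in the top line. See the text for a detailed
explanation

Colour                Spin                $\gamma_5$    $\langle x_3 \rangle$   $\langle (1-x_1)(1-x_2) \rangle$   $\langle \Theta(y_D - 0.1) \rangle$
$1 \to 3 + \bar{3}$   $1 \to 1/2 + 1/2$   1             1.000                   1.000                              1.000
$1 \to 3 + \bar{3}$   $1 \to 1/2 + 1/2$   $\gamma_5$    1.056                   1.112                              1.133
$1 \to 3 + \bar{3}$   $0 \to 1/2 + 1/2$   1             1.134                   1.293                              1.376
$1 \to 3 + \bar{3}$   $0 \to 1/2 + 1/2$   $\gamma_5$    1.093                   1.207                              1.271
$1 \to 3 + \bar{3}$   $1 \to 0 + 0$       1             1.073                   1.205                              1.310
$1 \to 3 + \bar{3}$   $0 \to 0 + 0$       1             0.875                   0.758                              0.720
$1 \to 3 + \bar{3}$   $1/2 \to 1/2 + 0$   1             0.953                   0.918                              0.916
$1 \to 3 + \bar{3}$   $1/2 \to 1/2 + 0$   $\gamma_5$    1.057                   1.132                              1.179
$1 \to 3 + \bar{3}$   Eikonal             -             0.802                   0.695                              0.659
$1 \to 3 + \bar{3}$   Eikonal + $x_3^2$   -             1.201                   1.518                              1.670
$3 \to 3 + 1$         $1/2 \to 1/2 + 1$   1             0.323                   0.306                              0.287
$3 \to 3 + 1$         $1/2 \to 1/2 + 1$   $\gamma_5$    0.356                   0.365                              0.349
$3 \to 3 + 1$         $1/2 \to 1/2 + 0$   1             0.312                   0.284                              0.258
$3 \to 3 + 1$         $1/2 \to 1/2 + 0$   $\gamma_5$    0.357                   0.363                              0.344
$3 \to 3 + 1$         $0 \to 0 + 1$       1             0.287                   0.242                              0.218
$3 \to 3 + 1$         $0 \to 0 + 0$       1             0.279                   0.224                              0.194
$3 \to 3 + 1$         $0 \to 1/2 + 1/2$   1             0.359                   0.379                              0.375
$3 \to 3 + 1$         $0 \to 1/2 + 1/2$   $\gamma_5$    0.347                   0.354                              0.346
$3 \to 3 + 1$         $1/2 \to 0 + 1/2$   1             0.294                   0.257                              0.239
$3 \to 3 + 1$         $1/2 \to 0 + 1/2$   $\gamma_5$    0.314                   0.302                              0.298
$3 \to 3 + 8$         $0 \to 1/2 + 1/2$   1             1.634                   1.833                              1.922
$3 \to 3 + 8$         $0 \to 1/2 + 1/2$   $\gamma_5$    1.574                   1.712                              1.775
$3 \to 3 + 8$         $1/2 \to 0 + 1/2$   1             1.385                   1.320                              1.291
$3 \to 3 + 8$         $1/2 \to 0 + 1/2$   $\gamma_5$    1.549                   1.664                              1.675
$8 \to 3 + \bar{3}$   $1/2 \to 1/2 + 0$   1             0.561                   0.493                              0.445
$8 \to 3 + \bar{3}$   $1/2 \to 1/2 + 0$   $\gamma_5$    0.621                   0.607                              0.574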
larger $x_3$. Note that we have oriented all processes so that the radiating daughter colour
charge is at $0^\circ$, wherefore the radiation continues to drop off at large angles. This is unlike
the previous set of curves, which turn around at or near the mid angle to the other radiating
daughter. (For small $x_3$ the two daughters are almost back-to-back, i.e., a bisector at $90^\circ$,
while an $x_3$ of 0.6 allows an almost symmetric configuration with $120^\circ$ between all three.)
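It is here interesting to recall the QED answer for the process $V \to f\bar{f}$ with $m_f =
m_{\bar{f}} = 0$ [33]
$$\frac{1}{\sigma_0}\frac{d\sigma}{dx_1\,dx_2} = \frac{\alpha_{\mathrm{em}}}{2\pi}\,\frac{x_1^2 + x_2^2}{(1 - x_1)(1 - x_2)} \left( e_f\,\frac{1 - x_1}{x_3} - e_{\bar{f}}\,\frac{1 - x_2}{x_3} \right)^2, \tag{48}$$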
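Table 3
Two measures on the amount of gluon radiation in different processes, for mass ratios $r_1 =
r_2 = 0$, cf. Table 2. For massless daughters the $\gamma_5$ factor makes no difference, and so those
results are not shown separately. The $\langle x_3 \rangle$ measure now is collinear divergent and therefore
not shown

Colour                Spin                $\langle (1-x_1)(1-x_2) \rangle$   $\langle \Theta(y_D - 0.1) \rangle$
$1 \to 3 + \bar{3}$   $1 \to 1/2 + 1/2$   1.000                              1.000
$1 \to 3 + \bar{3}$   $0 \to 1/2 + 1/2$   1.167                              1.184
$1 \to 3 + \bar{3}$   $1 \to 0 + 0$       1.000                              1.141
$1 \to 3 + \bar{3}$   $0 \to 0 + 0$       0.667                              0.773
$1 \to 3 + \bar{3}$   $1/2 \to 1/2 + 0$   0.917                              0.979
$1 \to 3 + \bar{3}$   Eikonal             0.667                              0.773
$1 \to 3 + \bar{3}$   Eikonal + $x_3^2$   1.667                              1.595
$3 \to 3 + 1$         $1/2 \to 1/2 + 1$   0.347                              0.282
$3 \to 3 + 1$         $1/2 \to 1/2 + 0$   0.347                              0.282
$3 \to 3 + 1$         $0 \to 0 + 1$       0.222                              0.214
$3 \to 3 + 1$         $0 \to 0 + 0$       0.222                              0.214
$3 \to 3 + 1$         $0 \to 1/2 + 1/2$   0.389                              0.328
$3 \to 3 + 1$         $1/2 \to 0 + 1/2$   0.264                              0.260
$3 \to 3 + 8$         $0 \to 1/2 + 1/2$   1.701                              1.659
$3 \to 3 + 8$         $1/2 \to 0 + 1/2$   1.389                              1.384
$8 \to 3 + \bar{3}$   $1/2 \to 1/2 + 0$   0.573                              0.487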
where $e_V = e_f + e_{\bar{f}}$ has been used to eliminate the explicit appearance of terms
corresponding to radiation off the $V$. For $e_V = 0$, $e_{\bar{f}} = -e_f$ this is the QED analogue of
Eq. (9), while the current case corresponds to $e_{\bar{f}} = 0$, $\alpha_{\mathrm{em}} e_f^2 \to \alpha_s C_F$:
$$\frac{1}{\sigma_0}\frac{d\sigma}{dx_1\,dx_2} = \frac{\alpha_s}{2\pi} C_F\,\frac{x_1^2 + x_2^2}{(1 - x_1)(1 - x_2)} \left( \frac{1 - x_1}{x_3} \right)^2 = \frac{\alpha_s}{2\pi} C_F\,\frac{1}{x_3}\,\frac{x_1^2 + x_2^2}{1 - x_2}\,\frac{1 - x_1}{x_3}. \tag{49}$$
Here the first part of the final expression essentially is the shower answer of Eq. (36). In
the soft-gluon limit $(1 - x_1)/x_3 \approx (1 + \cos\theta_{13})/2 \approx (1 - \cos\theta_{23})/2$, so this extra factor
gives a further dampening at large emission angles, above the shower ansatz (which in
itself is dampened by one such angular factor relative to the colour singlet decay). A main
consequence of this extra factor is that the three-jet activity in a colour $3 \to 3 + 1$ event
is less than half of that of a $1 \to 3 + \bar{3}$ one, a pattern which remains when masses are
included, see Table 2.
Finally, we come to the processes with a gluino, denoted by dash-dotted lines in Fig. 3. In
the limit of infinitely many colours $N_C$, the process $3 \to 3 + 8$ may be viewed as one colour
flowing through from the initial triplet to the gluino, and a different colour–anticolour pair
Fig. 4. Test of the additivity of gluon emission rates, $d\sigma/dx_1 dx_2$ with a normalization factor
$\sigma_0 C_F \alpha_s/2\pi$ removed. The dotted curves show two processes with colour flow $3 \to 3 + 1$ and
$3 \to 1 + 3$, respectively (and spin $1/2 \to 1/2 + 1$). The upper and lower full curves give the
complete expressions for $1 \to 3 + \bar{3}$ (spin $1 \to 1/2 + 1/2$) and $8 \to 3 + \bar{3}$ (spin $1/2 \to 1/2 + 0$)
processes, respectively. The two dashed curves, almost completely hidden by the full ones, are the
same processes according to the additive approximations in Eq. (50).
created between the final triplet and the gluino, i.e., as the sum of $3 \to 1 + 3$ and
$1 \to 3 + \bar{3}$. Interference terms between the two colour flows would be suppressed by a factor
$1/N_C^2$, just as in the radiation pattern for the colour-related process $V \to q\bar{q}g$ [19]. This
process therefore shows the most three-jet activity of the ones studied, especially in the
not-displayed gluino hemisphere.
By the same token, $8 \to 3 + \bar{3}$ may be approximated by the incoherent sum of a
$3 \to 3 + 1$ and a $3 \to 1 + 3$ radiation pattern, which gives less radiation than the $1 \to 3 + \bar{3}$
processes. The QED formula, Eq. (48), may here offer a convenient starting point. The
$1 \to 3 + \bar{3}$ process corresponds to $e_{\bar{f}} = -e_f$, so that the interference term $-e_f e_{\bar{f}}$ is positive.
The $8 \to 3 + \bar{3}$ process, e.g., $\tilde{g} \to q\bar{\tilde{q}}$, can instead be emulated in QED by $e_{\bar{f}} = e_f$, i.e., a
doubly-charged gluino [34]. The interference term then is negative, although suppressed
by a colour factor $1/N_C^2$ in the QCD case.
Thus, given a radiation pattern $f_1(\theta_{13})$ for emission off parton 1 like in a colour
$3 \to 3 + 1$ process, and a corresponding $f_2(\theta_{13})$ for a colour $3 \to 1 + 3$ one (essentially
obtainable by swapping the kinematics of the above process, $x_1 \leftrightarrow x_2$, $r_1 \leftrightarrow r_2$), one may
guess at the radiation pattern for the $1 \to 3 + \bar{3}$ and $8 \to 3 + \bar{3}$ processes:

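\[
\begin{aligned}
f_{1 \to 3 + \bar{3}}(\theta_{13}) &= f_1(\theta_{13}) + f_2(\theta_{13}) + 2 \sqrt{f_1(\theta_{13}) f_2(\theta_{13})}, \\
f_{8 \to 3 + \bar{3}}(\theta_{13}) &= f_1(\theta_{13}) + f_2(\theta_{13}) - \frac{2}{9} \sqrt{f_1(\theta_{13}) f_2(\theta_{13})}.
\end{aligned}
\tag{50}
\]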
This turns out to be a good approximation up to fairly large x3 values, i.e., so long as the
spin structure is not too important, see, e.g., Fig. 4. The importance of the interference term
also gives a simple explanation for the difference between the height of the peaks in the
full and dashed curves in Fig. 3a,b.
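The additive guess of Eq. (50) can be illustrated numerically. The following is only a minimal sketch, assuming that the interference term is proportional to $\sqrt{f_1 f_2}$ and that the two single-flow patterns are already known as non-negative numbers at a given emission angle; the function name and the input values are hypothetical and not taken from the shower implementation.

```python
import math

N_C = 3  # number of colours in QCD


def additive_patterns(f1, f2):
    """Combine two single-flow radiation patterns in the spirit of Eq. (50).

    f1, f2: non-negative emission rates at the same theta_13 for the
    3 -> 3 + 1 and 3 -> 1 + 3 colour flows (hypothetical inputs).
    Returns the (singlet, octet) estimates for the 1 -> 3 + 3bar and
    8 -> 3 + 3bar processes.
    """
    interference = math.sqrt(f1 * f2)
    # Colour singlet source: constructive interference between the two flows.
    singlet = f1 + f2 + 2.0 * interference
    # Colour octet source: opposite sign, suppressed by 1/N_C^2 (2/9 for N_C = 3).
    octet = f1 + f2 - (2.0 / N_C**2) * interference
    return singlet, octet


if __name__ == "__main__":
    print(additive_patterns(1.0, 1.0))
```

For equal inputs the singlet estimate is four times a single pattern, while the octet estimate stays just below the incoherent sum, which is the ordering described in the text.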
4.3. Parity dependence
Let us further quantify the differences induced by the presence or absence of a $\gamma_5$ factor in the matrix
element, i.e., between vector (V) and axial vector (A) sources, between scalar (S) and
pseudoscalar (P) ones, etc. We will use a measure which is the mean of the matrix element
ratios over the whole phase space for gluon emission. Since the matrix elements with and
without the $\gamma_5$ coupling both have the same divergence structures, the ratio is well-behaved
everywhere. All phase space points are here given equal weight, so large differences may
not always be reflected in significant changes of the radiation pattern. To be specific, we
study the function

\[
Q(r_1, r_2) = \frac{1}{\int dx_1\, dx_2} \int Q(x_1, x_2, r_1, r_2)\, dx_1\, dx_2 , \tag{51}
\]
where
\[
Q(x_1, x_2, r_1, r_2) = \frac{\sigma_0^{1}\, d\sigma^{\gamma_5}/dx_1\, dx_2}{\sigma_0^{\gamma_5}\, d\sigma^{1}/dx_1\, dx_2} . \tag{52}
\]
$Q(x_1, x_2, 0, 0)$ is equal to unity in all of phase space and in most cases $Q(x_1, x_2, r, 0)$
is also equal to unity. The exception is when the particle with $r \neq 0$ is a boson (spin 0 or 1)
and the one with $r = 0$ has spin 1/2. All the processes in Table 1 which have both non-$\gamma_5$ and
$\gamma_5$ couplings are represented in Fig. 5. Processes with the same spin structure have the
same ratio, where both the spin of the initial and final states, as well as the order of the
decay products, are of importance. We notice that the difference between, e.g., vector and
axial vector couplings can be rather large in some cases, even for small and intermediate
masses. The most important case, however, with two fermions in the final state, shows small
differences for $r < 0.5$, so this aspect will only be significant in the case of top production.
Physical consequences will be investigated in Section 5.
5. Applications
In this section, the matrix element corrected shower is used to study some processes at
current and future colliders. It is not to be seen as a comprehensive review, but as simple
examples intended to illustrate the main features.
5.1. Bottom in Z0 decay
We start by examining a process where data with good statistics already exists and
detailed comparisons are possible, namely gluon radiation off bottom quarks produced
in the process $e^+e^- \to Z^0 \to b\bar{b}$ at the $Z^0$ pole. Gluons are not observed directly in the
final state; instead they materialize as jets of hadrons. We will mainly consider jets on the
Fig. 5. Q(r1 , r2 ), Eq. (51), with r1 fixed, r2 fixed, or r = r1 = r2 . The processes are grouped
according to spin structure.
parton level, but we also study the effects of hadronization and decays. The experimental
data with which our results are compared have already been corrected to the parton level,
using, among others, the same model for fragmentation which we will use.
Jets are constructed by considering all pairs of particles (i, j ) in an event, where the
particles can be hadrons, partons or clusters of hadrons or partons, finding the pair with the
smallest distance yij . If this number is smaller than the jet resolution parameter, yc ,
the pair is joined into a new cluster by summing up their four-momenta. This procedure
is repeated until all yij are larger than yc . The resulting clusters are the jets in the event
at the resolution scale yc . Jet algorithms differ mainly by the definition of yij , and several
jet measures have been proposed in the literature [32,35]. Following the lead of the LEP
analyses we want to compare with, we have settled on the DURHAM [32] algorithm which
defines
\[
y_{ij} = \frac{2 \min(E_i^2, E_j^2)\,(1 - \cos\theta_{ij})}{E_{\mathrm{vis}}^2} . \tag{53}
\]
For small angles, this is approximately the relative transverse momentum squared of the
pair, scaled to the visible energy in the event. If jets are constructed with a large yc , only the
most energetic partons will be resolved, and most events will become 2-jet events. When
yc is decreased, smaller structures will start to emerge and multi-parton final states become
more important. It is in this region that the parton shower approach to gluon radiation is
most useful.
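The clustering procedure described above can be sketched in a few lines. This is only an illustrative sketch, assuming particles are given as four-momenta (E, px, py, pz) and that clusters are combined by simple four-momentum addition; the function names and the toy event are hypothetical and this is not the code used for the results shown here.

```python
import math


def durham_y(p_i, p_j, e_vis):
    """Durham distance y_ij of Eq. (53) for four-momenta (E, px, py, pz)."""
    mag_i = math.sqrt(sum(c * c for c in p_i[1:]))
    mag_j = math.sqrt(sum(c * c for c in p_j[1:]))
    if mag_i == 0.0 or mag_j == 0.0:
        cos_ij = 1.0  # assumption: a particle at rest gets distance zero
    else:
        dot = sum(a * b for a, b in zip(p_i[1:], p_j[1:]))
        cos_ij = max(-1.0, min(1.0, dot / (mag_i * mag_j)))
    return 2.0 * min(p_i[0], p_j[0]) ** 2 * (1.0 - cos_ij) / e_vis ** 2


def cluster(particles, y_cut):
    """Join the closest pair repeatedly until all y_ij are at least y_cut."""
    jets = [tuple(p) for p in particles]
    e_vis = sum(p[0] for p in jets)
    while len(jets) > 1:
        # Find the pair with the smallest distance measure.
        y_min, a, b = min(
            (durham_y(jets[i], jets[j], e_vis), i, j)
            for i in range(len(jets))
            for j in range(i + 1, len(jets))
        )
        if y_min >= y_cut:
            break
        # Merge the pair by summing the four-momenta.
        merged = tuple(x + y for x, y in zip(jets[a], jets[b]))
        jets = [p for k, p in enumerate(jets) if k not in (a, b)] + [merged]
    return jets


if __name__ == "__main__":
    # Hand-made toy momenta in GeV: two hard partons and one softer gluon.
    event = [(45.0, 0.0, 0.0, 45.0), (40.0, 0.0, 0.0, -40.0), (5.0, 3.0, 0.0, -4.0)]
    for y_cut in (0.01, 0.001):
        print(y_cut, len(cluster(event, y_cut)))
```

In the toy event the gluon is absorbed into the nearby parton for the larger resolution parameter and resolved as a third jet for the smaller one, which is the behaviour with decreasing yc described in the text.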
For a primary quark flavour q, the n-jet rate is defined as
\[
R_n^q(y_c) = \frac{\sigma_{q\bar{q} \to n\,\mathrm{jets}}(y_c)}{\sigma_{q\bar{q} \to \mathrm{hadrons}}}, \tag{54}
\]
which for n > 2 is generally a decreasing function of yc . We want to study the difference
between gluon radiation off b quarks and off light quarks. A suitable observable is then the
ratio between the respective n-jet rates
\[
R_n^{bl}(y_c) = \frac{R_n^b(y_c)}{R_n^l(y_c)}, \quad \text{with } l = u, d \text{ or } s. \tag{55}
\]
Experiments at LEP [36,37] have found both R3bl and R4bl to be smaller than one,
approaching unity for large yc . This can be understood qualitatively as a consequence of
the well-known dead cone effect [4], stating that the radiation of collinear gluons off heavy
quarks is suppressed. The production of well separated jets, however, is not significantly
suppressed and should approach that of the light quarks for large yc .
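In a Monte Carlo study the rates of Eqs. (54) and (55) are estimated from event counts. The sketch below assumes unweighted events, so that cross section ratios reduce to ratios of counts, and uses a simple binomial error estimate; the function names and the example counts are hypothetical and purely illustrative.

```python
import math


def jet_rate(n_njet, n_total):
    """n-jet rate of Eq. (54), estimated from unweighted event counts."""
    return n_njet / n_total


def jet_rate_ratio(nb_njet, nb_total, nl_njet, nl_total):
    """Ratio of Eq. (55) between b and light-quark samples, with its error.

    The error assumes independent samples and binomial statistics, for
    which the relative error on a rate R from n selected events is
    sqrt((1 - R) / n).
    """
    r_b = jet_rate(nb_njet, nb_total)
    r_l = jet_rate(nl_njet, nl_total)
    ratio = r_b / r_l
    rel_err = math.sqrt((1.0 - r_b) / nb_njet + (1.0 - r_l) / nl_njet)
    return ratio, ratio * rel_err


if __name__ == "__main__":
    # Hypothetical counts at one value of y_c, not taken from the paper.
    print(jet_rate_ratio(300_000, 1_000_000, 320_000, 1_000_000))
```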
We want to study this effect more quantitatively by using our improved, matrix element
corrected, parton shower. The new approach, described in Section 3, has been implemented
in PYTHIA 6.153. This is compared to PYTHIA 6.152 containing the older approach of
Section 2.4, which includes the correct massive matrix element correction in the first
emission only. As a reference, we also include results from the algorithm implemented
in PYTHIA 6.129, where mass-effects are incorrectly included in the matrix element
correction of the first emission, cf. Section 2.3.
The jet rate reflects both the amount of energy radiated and the direction in which it is
radiated. Fig. 6a shows the gluon radiation pattern in the process $e^+e^- \to Z^0 \to q\bar{q}$ at the
$Z^0$ pole, for the old and the new shower routines both for light (u, d and s) quarks and
heavy (bottom) quarks. The angle, $\theta_g$, is defined as the angle between the radiated gluon
and the primary quark in the CM system of the primary quark pair. In principle, gluons
are radiated by the $q\bar{q}$ dipole as a whole, but in our implementation each radiated gluon
is assigned to an initial quark, and in the collinear limit this separation is quite sensible.
In the case of the total energy flow, Fig. 6b, we add the two contributions, and include
the energy taken by quarks produced in gluon splitting, so two symmetrical peaks appear.
Fig. 6c shows the distribution of $\Delta E = E_{rad}/E_{CM}$, i.e., the total energy fraction radiated
in the shower. Fig. 6d, finally, shows the gluon multiplicity in the shower, which obviously
is quite dependent on the lower shower cut-off Q0 .
We note first of all that the difference between light and heavy quarks is largest in
the region of small angles, the dead cone, and also that the radiated energy fraction is
much larger for light quarks, peaked around 0.5 if Q0 = 1 GeV is used to remove the
collinear emission off light quarks. For heavy quarks, this collinear emission is regulated
by the mass. For large angles, the light and heavy quark distributions converge as they
should. In the new approach, the amount of gluon radiation in b events at small angles is
somewhat enhanced relative to the old one, where the dead cone was slightly exaggerated,
but differences are small. By the change of evolution variable, Eq. (26), the fraction of
Fig. 6. Gluon radiation and energy flow in the process $e^+e^- \to Z^0 \to q\bar{q}$, where q is either a light
(u, d, s) quark or a b quark. The b-mass is here set to the default in PYTHIA, 4.8 GeV. (a) Energy
weighted gluon angle distribution. (b) Energy flow. (c) Radiated energy fraction, $\Delta E = E_{rad}/E_{CM}$.
(d) Gluon multiplicity.
events without any shower at all ($\Delta E = 0$) has decreased by almost a factor of 3 from
about 4 per mille to 1.4 per mille. As a consequence, the peak at $\Delta E = 0$ has been flattened out. The
gluon multiplicity has also increased slightly.
We now want to investigate how the changes in the algorithm affect R3bl and R4bl . Since
one is studying small deviations from unity, these measures are very sensitive to changes in
the gluon radiation pattern. Also, the LEP experiments have large samples of Z0 events and
studies at the per cent level are feasible. We will mostly study the behaviour of the model
on the parton level, comparing different alternatives. Here the parton level is defined by the
partonic configuration at the shower cut-off scale Q0 , below which no further emissions
occur. But first we should comment on the effects of fragmentation. Below the Q0 scale,
the Lund string fragmentation model [1] describes how the partons transform into the
primary hadrons. Subsequently these may decay further. Once a Q0 has been chosen,
the parameters of the fragmentation model should be fitted to data. Here we will only
study the variation with Q0 , both at the parton level and the hadron level, without any such
retuning. At the hadron level, we consider first the primary hadrons and study decay effects
separately.
Fig. 7a shows the effect of fragmentation and the Q0 dependence. We calculate R3bl (yc )
at the parton level and at the level of primary hadrons for Q0 = 1 and 2 GeV, respectively.
Fig. 7. Study of fragmentation and decay effects. (a) Effects of changing the cut-off, Q0 , in the
shower, both on the parton level and at the level of primary hadrons (i.e., before decays). (b) Effects of
hadronization, decay of rank 1 hadrons and decay of all hadrons. In (b) each alternative is represented
by the $\pm 1\sigma$ curves given by the Monte Carlo statistics.
The Q0 dependence is largest on the parton level with R3bl slightly lower for the larger
Q0 . This is because the increase in Q0 increases the 3-jet rate for light quark events, while
heavy-quark events are not influenced as much: collinear emissions, which dominate at the
end of the cascade, are suppressed anyway. Intuitively one might have expected the jet rate
to be higher for a lower Q0 , since this corresponds to a larger number of partons, but the
further partons emitted between 1 and 2 GeV cannot give rise to new jets of their own.
They only smear out the energy of the existing jets and possibly make them fall below
the cut-off. Furthermore, the hadronization stage tends to decrease the Q0 dependence
because the Lund model is infrared safe and collinear gluon emissions do not affect the
string fragmentation process. A retuning of the fragmentation parameters would further
limit the effect of small changes in Q0 .
Fig. 7b shows the effects of fragmentation and decay. The rank 1 hadrons are the
ones that contain the primary quarks from the decay of the Z0 . In the case of bottom
production, the rank 1 hadrons will be mainly B mesons and b baryons. Because of the
hard fragmentation function for heavy hadrons, the multiplicity of primary hadrons in b
events is smaller than that in a light quark event, reflected in the lower R3bl curve. However,
once the heavy hadrons have decayed, their decay products will more than compensate this.
Thus, if the primary hadrons are allowed to decay, the value of R3bl lies above unity for yc >
0.03. So, on the one hand, heavy hadrons take a large part of the primary quark energy, but
they also decay to many particles. For R3bl there is not a large difference between allowing
all primary hadrons to decay or only the rank 1 ones, which can be seen in the small
difference between the two top curves in Fig. 7b. In the following, we will compare our
results to experimental data on the parton level, bearing in mind that this comparison can
be ambiguous in view of the dependence on Q0 . Our aim here is not to achieve a perfect
fit to data, but merely ensure that the results are at the right level. As we will see, the data
is anyway not good enough to make precise discriminations between all different model
variations.
Fig. 8. R3bl (yc ) for different model parameters and variations. (a) Different main versions. Some
variations of the latest version (6.153): (b) Different bottom masses. (c) Different sources. (d) Other
minor variations of the main theme, see the text for details. The data are from [36] and [37]. The
data points from ALEPH (except the first) have been shifted 0.001 units in $y_c$ for clarity. In (a) and
(b) the $\pm 1\sigma$ curves are given, while only the central value is shown in (c) and (d) for clarity, but the
statistics is comparable.
In Figs. 8, 9, R3bl and R4bl are shown as functions of yc for several model variations.
Figs. 8a, 9a show the difference between the old and the modified shower models. An
even older version, where mass effects are exaggerated, is also shown as reference. Each
model curve is displayed as a one sigma band, showing the size of the Monte Carlo error
for $15 \times 10^6$ events of each kind. From the increase in multiplicity and energy flow in
b events in the new model, one would naively expect the 3- and 4-jet rates to increase
for heavy quarks, thus increasing Rnbl . Actually, the main effect is to reduce the 3-jet rate
for heavy quarks, again illustrating that allowing more collinear/soft emissions will not
necessarily increase the rate of well separated jets. The naively expected increase of R4bl
(from subsequent branchings) is balanced by a corresponding smearing loss as for R3bl ,
giving only a small net effect.
As already noted, the old and new models differ mainly in the treatment of subsequent
branchings; both include matrix element corrections to the first gluon emission on each side
of the event. A clear discrimination between these two versions is not possible, especially
in view of the variations that follow below and the relatively large experimental errors. It
is clear, however, that the algorithm in PYTHIA 6.129 is ruled out by the data.
Fig. 9. R4bl (yc ) for different model parameters and variations. See caption Fig. 8 for details.
Next we consider variations to the new default, in order to assess the uncertainties of the
model:
• The bottom mass.
• Parity dependence (vector vs axial vector source).
• Change of the $p_\perp^2$ argument in $\alpha_s$ from $z(1-z)m_a^2$ to $z(1-z)m_a^2(1 - m_b^2/m_a^2)^2$ in a
branching $a \to bc$, with a being a heavy quark.
• Coherence effects.
The main free parameter is the bottom mass, and Figs. 8b, 9b show the result of varying
this mass between 4.6 GeV and 5 GeV. Fits of NLO QCD calculations to LEP data give a
value around 3 GeV for the running bottom mass in the $\overline{\mathrm{MS}}$ renormalization scheme [5,36,
37] at renormalization scale $m_Z$. Our model is based on LO matrix elements and the mass
in our case is the constituent quark mass, so they need not agree. However, a somewhat
lower mass than the default seems to be favoured, especially for R4bl .
In Sections 4.2 and 4.3 we saw that there are slight differences between different sources.
The relevant mixture for Z0 is given by Eq. (42), but in Figs. 8c, 9c we study the two
extreme cases of a pure vector source and a pure axial vector one. At the Z0 -pole, the axial
vector component dominates, as can be seen in the figure, but at higher energies the vector
one will take over. We see that the differences are non-negligible, but not as large as the
shown mass dependence, at least for small yc .
Finally, in Figs. 8d, 9d, some further aspects of the model, discussed in Section 3.4, are
varied. Both the modified $p_\perp$ in $\alpha_s$ and the minimal change of coherence tend to reduce
R3bl , again showing how an increase in the total amount of radiation need not give more
separate jets. The intermediate coherence option, which still is a realistic alternative, does
give an increase, especially of R4bl . Thereby the overall agreement with data is improved,
as much as with a reduced b mass, although uncertainties are sufficiently large that no firm
conclusions should be drawn.
Other minor issues could be studied and here we just mention a few. The definition of
$R_n^{bl}$ used here is based on a classification of events with heavy or light primary quarks. An
alternative is to use a ratio between b-tagged and anti b-tagged events. The difference here
is in the classification of events with gluon splitting, $g \to b\bar{b}$, in the shower. For the $R_3^{bl}$
ratio this is a minor issue, but it is non-negligible in the case of $R_4^{bl}$ [5]. This uncertainty
is currently under study in the DELPHI collaboration, and is not further addressed here.
Another issue is the energy dependence of the results. Obviously, the mass effects will
decrease for larger energies, because of the decrease of $r = m_b/\sqrt{s}$, and this has already
been studied at 189 GeV [5]. At the same time, however, the fraction of vector coupling
increases significantly above the Z0 -pole, thus decreasing R3bl , cf. Fig. 8c. At intermediate
energies this could therefore give rise to a partial cancellation of effects.
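The size of the mass effect is governed by the scaled mass $r = m/\sqrt{s}$. As a rough illustration, the sketch below evaluates r for the default b mass of 4.8 GeV at two energies and for a 175 GeV top at 500 GeV; the function name is hypothetical and the Z0 pole energy of 91.2 GeV is an assumed round value.

```python
def scaled_mass(mass_gev, ecm_gev):
    """Scaled mass r = m / sqrt(s), with both arguments in GeV."""
    return mass_gev / ecm_gev


if __name__ == "__main__":
    M_B = 4.8      # default b mass quoted in the text
    M_TOP = 175.0  # top mass quoted in the text
    Z_POLE = 91.2  # assumed value of the Z0 pole energy
    for label, mass, ecm in [
        ("b at the Z0 pole", M_B, Z_POLE),
        ("b at 189 GeV", M_B, 189.0),
        ("t at 500 GeV", M_TOP, 500.0),
    ]:
        print(f"{label}: r = {scaled_mass(mass, ecm):.3f}")
```

The b-quark value roughly halves between the Z0 pole and 189 GeV, while the top value at 500 GeV reproduces the r = 0.35 used in the discussion of top production.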
5.2. Bottom in Higgs decay
We now turn to gluon radiation in Higgs decay. In the standard model, the decay $H^0 \to b\bar{b}$
is expected to be large, or even dominate, for Higgs masses up to the $W^+W^-$ threshold,
and in extensions to the standard model also heavier Higgs states can have significant $b\bar{b}$
branching ratios. When the fragmentation function for B mesons, i.e., the distribution of
$z = 2E_B/\sqrt{s}$, is measured at LEP1, the b quark is produced from a spin 1 source. If instead
the source is a Higgs boson, the difference in gluon radiation could give rise to a changed
fragmentation function. Such a change would influence the experimental vertex detection
efficiency which, if uncorrected, would give rise to errors in the determination of cross
sections. It is therefore important to be able to describe in detail the gluon radiation pattern
in this decay. A different measure of the nature of the source is the jet topology, which is
also studied in the form of a modified R3bl ratio, cf. the preceding section.
We choose to study the production of $b\bar{b}$ pairs at a CM energy of 130 GeV. This number
is in the middle between the current lower limit and the $W^+W^-$ threshold, at around the
value expected for the MSSM $h^0$, but obviously the relative comparison of the sources
is only mildly energy-dependent. Five different sources are compared. The first sample
is considered as a reference and consists of gluon radiation in the decay $\gamma/Z \to q\bar{q}$,
where q is a light u, d or s quark. For simplicity they are assumed to be produced in equal
amounts. The other four samples consist of gluon radiation in the decay $X \to b\bar{b}$, where
X is a vector, axial vector, scalar or pseudoscalar source. Clearly the $\gamma/Z$ is a mixture,
Eq. (42), and also $h^0/H^0/A^0$ need not be pure states, but the separation allows us to study
the extreme range of possibilities.
Fig. 10. Comparison between vector, axial vector, scalar and pseudoscalar sources of $b\bar{b}$ pairs
at 130 GeV. (a) Energy flow in the shower. (b) 3-jet rate, normalized to $\gamma/Z \to q\bar{q}$, i.e.,
$R_3^{bl}(y_c) = R_3^b(X \to b\bar{b})/R_3^l(\gamma/Z \to q\bar{q})$, where X is V, A, S or P and q is a light flavour.
$\pm 1\sigma$ Monte Carlo error bands are shown. (c) Distribution of $z_b = 2E_b/\sqrt{s}$ at the parton level.
(d) Distribution of $z_B = 2E_B/\sqrt{s}$, i.e., the fragmentation function, at the hadron level (only primary
hadrons are considered).
Fig. 10a shows the angular energy flow. The difference between having or not having
a $\gamma_5$ in the coupling is negligible, but the radiation at large angles is larger for the spin 0
source than for a spin 1 one. The measure in Fig. 10b, i.e., the normalized 3-jet rate, is most
sensitive and shows that the 3-jet rate is significantly larger for a spin 0 source, especially
for well separated jets (large yc ), which is consistent with the larger energy flow at large
angles. The parity dependence is very small, on the other hand, and is neglected in the
following.
Considering the large spin dependence one could expect the fragmentation function also
to change considerably when going from a vector to a scalar source. Fortunately, as can
be seen in Figs. 10c,d, the changes are minor, indicating that the fragmentation function is
mainly sensitive to the bulk of gluons at smaller angles, where the sources give the same
emission rate. The larger gluon energy flow for a scalar source is reflected in a slightly
smaller z, with a difference of about 1% both at the parton and hadron level. The small
effect on the fragmentation function is positive from an experimentalist's point of view, in
that fragmentation functions measured at the Z0 pole can be simply extrapolated also for
possible spin 0 sources at higher energies.
Fig. 11. Gluon multiplicity and energy flow in the $t\bar{t}$ shower at 500 GeV centre-of-mass.
5.3. Top production and decay

For top production around the $t\bar{t}$ threshold, real gluon emission will be limited because
of phase space, so aspects such as the form of the QCD confinement potential, best tackled
with other perturbative techniques, will be more relevant [38]. In this section we study the
process $e^+e^- \to t\bar{t}$ at 500 GeV, where gluon emission will start to become important, and
compare the new shower routine with the older versions. We focus on fundamental aspects
such as gluon radiation patterns and top mass determinations. In the process we will also
study gluon radiation in the decay of the top, and differences in the radiation patterns for
the two decay channels $t \to bW^+$ and $t \to bH^+$ to demonstrate the advertised process
dependence of the new shower routine.
We first study gluon radiation in the production of the top. The matrix element used in
the correction of the shower is an energy dependent mixture of vector and axial vector
ones. High above the Z0 pole, the vector part will dominate and at 500 GeV the
parameter, introduced in Section 4.1 to parameterize the relative mixture, is about 0.78 for
$t\bar{t}$ production. The dead cone effect is expected to be very pronounced in this case, because
the scaled mass is large, around $r = 0.35$. Fig. 11 shows the gluon multiplicity and the
energy flow in the $t\bar{t}$ shower. Most events have no gluon emission, but the no-gluon rate
has decreased in the new shower as expected. The result of an even older version, where
the dead cone effect is severely exaggerated in all branchings, is shown as a reference.
The lesson is that the correct description of the first emission, i.e., the difference between
6.129 and 6.152, is much more important than that of subsequent emissions, between 6.152
and 6.153. Once a gluon is emitted, it tends to be quite energetic and splits into several
further gluons. This is the reason for the large spread in the multiplicity and the dip in this
distribution for the old shower routine. The total energy flow has also increased in the new
shower relative to the older ones. Even if the energy flow seems to be large at large angles,
most energy is kept by the original top quarks at $\theta_E = 0$ and $\pi$, respectively.
The dominant top decay is $t \to bW^+$, where further gluons will be radiated. In the old
shower routine, gluon radiation off the b did not take into account the full tb interference
structure. Furthermore, gluon radiation in a hypothetical decay $t \to bH^+$ was the same as
in the one above, apart from the difference induced by the different W boson and Higgs
masses. In order to separate off this trivial mass effect we put $m_H = m_W$, well aware that
this is not realistic.
The matrix element used in the decay $t \to bW^+$ has the familiar $V - A$ coupling, but
the $t \to bH^+$ case is less obvious. In the MSSM the vertex factor is proportional to
$m_t(1 - \gamma_5) + m_b \tan^2\beta\,(1 + \gamma_5)$, i.e., a parameter-dependent mixture of scalar (S) and pseudoscalar
(P) couplings. Naively, the $S - P$ one would be expected to dominate because of the large
difference between the top and bottom masses. However, the favoured values of $\tan\beta \gtrsim 3$
could cancel this effect and give rise to an almost pure scalar coupling. We therefore study
the two extreme cases of S and P separately. The difference between the matrix elements
of this type has been studied in Tables 2 and 3 and found to be non-negligible for massive
daughters but to vanish in the massless limit. For the process $t \to bX$, where X is a $W^+$,
pure scalar $H^+$ or pure pseudoscalar $H^+$, some differences are visible in the gluon energy
flow for larger gluon energies, see Fig. 12a, but the differences induced in observables such
as the jet rates are small, Fig. 12b. That is, we are closer to the massless case of Table 3,
indicating that the small mass of the radiating b is more important than the large mass of
the non-radiating W+ /H+ . In what follows, therefore, only the decay to W+ is shown. This
is the most important case and, as it turns out, the intermediate alternative, cf. Fig. 12a.
Figs. 12c,d show the gluon multiplicity and energy flow in the decay of the top for the
new and the older shower routines. The gluon multiplicity is slightly more peaked in the
new shower and the no-gluon rate has been somewhat reduced. The angle $\theta_E$ is defined as
the angle of emitted energy to the primary bottom quark in the rest frame of the decaying
top, which is why no peak appears at large angles, corresponding to the direction of the
colour neutral W/H. In the new correction to the shower, gluon radiation at large angles is
more severely suppressed. As a consequence, there is more energy left for radiation at small
ones, allowing the curves to cross there. This is an effect in addition to the influence of the
new matrix element correction scheme, visible, e.g., in Fig. 6a. The differences between
decays to W and H are very small also here (not shown). The result of PYTHIA 6.129 is
similar to 6.152, but the dead cone is slightly more pronounced, as noted before.
In a complete event, gluon emission from the decay and production are added, and also
possible radiation in hadronic decays of the W/H must be considered. In principle, these
showers will also interfere but, as a realistic first approximation, we work in the zero width
limit where these interferences vanish. Considering the two stages of production and decay
together, but ignoring the decay products of the W/H, Fig. 13a shows the total gluon
multiplicity for the new and older shower routines. Since the radiation has increased in
the production and decreased in the decay, effects tend to cancel. The net result is a slight
increase in the average gluon multiplicity and a somewhat smaller width. The 3-jet rate,
however, has decreased, Fig. 13b, presumably because of the reduced probability to have
gluon emissions at large angles in the top decay, cf. Fig. 12d.
The total gluon multiplicity is not a very good measure of the event properties and
certainly not an observable. To give a practical example where gluon radiation is important,
we consider the reconstruction of the top mass in a simplified scenario, assuming that the
W/H can be completely reconstructed in order only to study the effects of gluon emission
in the top production and decay. We find two jets and calculate the top mass by considering
Fig. 12. (a) The gluon emission rate as a function of the emission angle $\theta_{13}$ for top decay, $t \to bX$.
The emission rate is given for the two gluon energies $x_3 = 0.1$ (full curves) and $x_3 = 0.4$ (dashed).
The curves corresponding to the higher energy have been multiplied by a factor of 40 for clarity.
Three different curves are shown for each energy: pseudoscalar Higgs, W boson and scalar Higgs
(from high to low curves). (b) The 3-jet rate in complete $t\bar{t}$ events as a function of $y_c$ for the three
different cases above, where the W/H are removed from the cluster search. (c), (d) Gluon multiplicity
and energy flow in the shower induced by the top decay (in the rest-frame of the decaying top with
the W at $\theta_E = \pi$).
the two possible combinations of jet + W momenta. The combination which minimizes
$(m_1 - m_t)^2 + (m_2 - m_t)^2$, where $m_t = 175$ GeV, gives the reconstructed top masses $m_1$
and $m_2$. The distribution of the reconstructed mass is given in Fig. 13c, where jets are
clustered on the parton level. The reduced radiation level survives in this measure as a
slightly narrower peak for the new shower. The effects of fragmentation and decay are
quite large for the reconstructed mass, but some differences still survive in the high mass
wing, as seen in Fig. 13d.
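The pairing step of this simplified mass reconstruction can be sketched as follows. This is only a minimal sketch, assuming the two jets and the two reconstructed W bosons are given as four-momenta (E, px, py, pz) in GeV; the function names are hypothetical and the example kinematics (tops decaying at rest, an assumed W mass of 80.4 GeV) are chosen only to make the result easy to check.

```python
import math

M_TOP = 175.0  # nominal top mass in GeV, as used in the text


def inv_mass(p):
    """Invariant mass of a four-momentum (E, px, py, pz)."""
    e, px, py, pz = p
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def add(p, q):
    """Component-wise sum of two four-momenta."""
    return tuple(a + b for a, b in zip(p, q))


def reconstruct_top_masses(jets, ws, m_top=M_TOP):
    """Pick the jet + W pairing minimizing (m1 - mt)^2 + (m2 - mt)^2."""
    (j1, j2), (w1, w2) = jets, ws
    pairings = [((j1, w1), (j2, w2)), ((j1, w2), (j2, w1))]
    best = None
    for first, second in pairings:
        m1 = inv_mass(add(*first))
        m2 = inv_mass(add(*second))
        chi2 = (m1 - m_top) ** 2 + (m2 - m_top) ** 2
        if best is None or chi2 < best[0]:
            best = (chi2, m1, m2)
    return best[1], best[2]


if __name__ == "__main__":
    m_w = 80.4  # assumed W mass in GeV
    e_b = (M_TOP**2 - m_w**2) / (2.0 * M_TOP)  # massless b energy, top at rest
    e_w = M_TOP - e_b
    jets = ((e_b, 0.0, 0.0, e_b), (e_b, e_b, 0.0, 0.0))
    ws = ((e_w, -e_b, 0.0, 0.0), (e_w, 0.0, 0.0, -e_b))  # deliberately swapped
    print(reconstruct_top_masses(jets, ws))
```

With gluon emission the jets lose energy or pick up extra momentum, so the selected masses shift away from the nominal value; this is what produces the width of the reconstructed peak discussed above.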
5.4. Supersymmetry production and decay
In supersymmetric models, a whole new set of processes involving coloured particles is
introduced. We have calculated the most important LO matrix elements of the MSSM,
listed in Table 1, to be used as input to the parton shower routine. In versions of
PYTHIA prior to 6.153, supersymmetric particles did not shower. Because of the large
Fig. 13. $t\bar{t}$ events after top decays. (a) Total gluon multiplicity. (b) The 3-jet rate on the parton level,
with W/H removed from the clustering. Initial-state photon radiation is included, but both models
are equally affected by this. (c) Reconstructed top mass, parton level. (d) Reconstructed top mass,
hadron level.
masses of these hypothetical particles, this is a good first approximation. However, if
supersymmetric particles are discovered at the LHC or a future linear collider, detailed
studies of their properties will profit from a better understanding of QCD radiation patterns.
A complication that appears in the MSSM is significant rates of three body decays
involving one or several coloured particles in the initial or final states. Often several
interfering diagrams with intermediate off-shell propagators contribute. An example is
the decay $\tilde{\chi}^+ \to \tilde{\chi}^0 q\bar{q}'$ with either a W or squark propagator. These processes do not
fit into the present framework with corrections to matrix elements of the type $a \to bcg$.
As a preliminary solution, the $q\bar{q}'$ pair above is assumed to be produced from a $V - A$
source. A new kind of gluon radiation process in the MSSM is from four-vertices such as
$\tilde{q} \to \tilde{q}' W^+ g$. These diagrams introduce no new divergences. The interference terms only
contribute to the ordinary divergences, and the squared amplitude gives rise to a constant.
An upper estimate is again given by the new shower expression, which is corrected by the
full matrix element.
As a first example, consider top vs. stop production at a linear $e^+e^-$ collider and the
decay $t \to bW^+$ vs. $\tilde{t} \to b\tilde{\chi}^+$. Again, identical masses are assumed and consequently
$m_{\tilde{t}} = m_t$ and $m_{\tilde{\chi}} = m_W$ are used. As usual, the $t\bar{t}$ source is an energy dependent mixture of
Fig. 14. Top vs. stop production and decay at a 500 GeV $e^+e^-$ collider. (a) Energy flow in the $t\bar{t}$ ($\tilde{t}\bar{\tilde{t}}$)
shower. (b) 3-jet rate on the parton level for the production and decay taken together, with the W/$\tilde{\chi}$
removed from the clustering.
vector and axial vector. Only a scalar coupling is possible for $Z^0 \to \tilde{t}\bar{\tilde{t}}$, so there is no ambiguity here
either. The top decay has the $V - A$ coupling, while the stop one is a parameter dependent
mixture of scalar and pseudoscalar. Again we consider the two extreme cases separately.
While the stop obviously decays isotropically, the current PYTHIA implementation does
not include top polarization information and thus also decays this particle isotropically.
Initial-state photon radiation has not been included in the comparison, since the different
threshold behaviours of the $t\bar{t}/\tilde{t}\bar{\tilde{t}}$ cross sections allow more energetic radiation in the former
process, which will affect event topologies.
Fig. 14a shows the energy flow for top vs. stop production. The multiplicity is the same
for both top and stop, but the energy flow is slightly larger for stop at large angles. The
difference between the gluon radiation patterns in the top and stop decays is negligible
(not shown). As a result of the increased gluonic energy flow, the 3-jet rate at the parton
level, with the W/$\tilde{\chi}$ removed from the cluster analysis, is slightly larger for stop, Fig. 14b.
We notice further that the parity dependence is negligible also in this case. The differences
between supersymmetric and standard model processes are not large in this example, at
least when mass effects are neglected, and the ambiguity in the coupling structure is
negligible. This is all good news, but since gluon emission from supersymmetric particles
is a completely new feature, many more tests should be done.
As a last example we consider a full simulation of gluino pair production (via $gg \to \tilde g\tilde g$)
at a 14 TeV pp collider. Since we have not developed a full shower formalism for $2 \to 2$
processes, as a first approximation the eikonal expression will be used for radiation off the
$\tilde g\tilde g$ system, including a colour factor rescaling by $N_C/C_F = 9/4$. Initial state radiation is
studied separately and multiple interactions are neglected. A scenario with $m_{\tilde g} = 450$ GeV,
$m_{\tilde b_1} = 250$ GeV and $m_{\tilde t_1} = 200$ GeV is used. The dominant gluino decays are then
$\tilde g \to \tilde b_1 b$ ($\tilde t_1 t$) and the squarks decay by processes of the type $\tilde q \to \tilde\chi b$. If the $\tilde\chi$ is a neutralino,
which here is the lightest supersymmetric particle, it will not decay further. A chargino
decays to a neutralino plus either leptons or quarks in approximately equal amounts. This
MSSM scenario is implemented in the PYTHIA event generator [6] and the details are
described in the SPYTHIA manual [39].
Fig. 15. Gluino production at a 14 TeV pp collider. (a), (b) Transverse gluon energy flow in
pseudorapidity, $\eta = \frac{1}{2}\ln((p + p_z)/(p - p_z))$. (c), (d) Jet multiplicity for $R = 0.75$, $E_{\perp,\min} = 10$
and 40 GeV, respectively.
To assess the importance of gluon radiation off supersymmetric particles in a
hadron-hadron collision, we compare the transverse gluon energy flow from SUSY and
non-SUSY particles in the final state radiation (FSR), Fig. 15a. The additional gluon radiation
from coloured SUSY particles is thus seen to be small compared to the ordinary gluon
radiation from quarks. In this paper we have only considered QCD final state radiation in
$e^+e^-$ annihilation. In a hadron collider environment, gluons can be radiated also from the
incoming partons (initial state radiation, ISR). This contribution is almost as large as the
FSR one, Fig. 15a, but the rapidity distribution is different.
In Fig. 15b the three PYTHIA versions considered in this paper are compared. Only
the FSR component is shown, and the additional radiation from SUSY particles in the
latest version gives rise to an increased total transverse gluon energy flow. This increase is,
however, slightly compensated by the decrease in gluon radiation in the top and squark
decays, cf. Section 5.3. In Figs. 15c,d the result of a simple cluster search is shown.
Particles with a summed transverse energy $E_\perp > E_{\perp,\min}$ inside a
$\sqrt{(\Delta\eta)^2 + (\Delta\varphi)^2} < R$ cone are joined in a cluster. For large $E_{\perp,\min}$, the jet multiplicity is slightly increased
when gluon radiation from supersymmetric particles is included. This could be caused by
energetic gluons radiated off the gluinos. For small $E_{\perp,\min}$, on the other hand, smaller
structures are probed and the radiation off b and lighter quarks is most important. In
summary, we conclude that gluon radiation off supersymmetric particles at a hadron
collider is small compared to the other sources, and of importance mainly in high-precision
studies.
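The cone condition used in the cluster search can be illustrated with a short sketch. It is not the clustering routine used for Fig. 15: the function names, the particle tuples and the fixed cone axis are assumptions made for illustration, and only the pseudorapidity definition and the cone radius R = 0.75 are taken from the text.

```python
import math

def pseudorapidity(px, py, pz):
    """eta = 0.5 * ln((p + pz) / (p - pz)); undefined exactly along the beam axis."""
    p = math.sqrt(px * px + py * py + pz * pz)
    return 0.5 * math.log((p + pz) / (p - pz))

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into (-pi, pi]."""
    d = (phi1 - phi2) % (2.0 * math.pi)
    return d - 2.0 * math.pi if d > math.pi else d

def in_cone(eta, phi, eta_axis, phi_axis, radius=0.75):
    """True if sqrt(d_eta^2 + d_phi^2) < radius."""
    return math.hypot(eta - eta_axis, delta_phi(phi, phi_axis)) < radius

def cone_transverse_energy(particles, eta_axis, phi_axis, radius=0.75):
    """Sum E_perp of all (e_perp, eta, phi) entries inside the cone around a given axis."""
    return sum(e_perp for e_perp, eta, phi in particles
               if in_cone(eta, phi, eta_axis, phi_axis, radius))

# Hypothetical particle list: (E_perp in GeV, eta, phi).
particles = [(20.0, 0.1, 0.1), (15.0, 0.3, -0.2), (30.0, 2.0, 1.0)]
e_sum = cone_transverse_energy(particles, 0.0, 0.0)
is_jet = e_sum > 10.0  # compare with a chosen E_perp,min threshold
```

A complete cluster search would also have to choose the cone axes, for instance by seeding on the hardest particles. That step is deliberately left out here.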
6. Gluon splitting to heavy quarks

Data at LEP1 show a larger rate of secondary charm and bottom production than
predicted in most shower descriptions [5,40], or in analytical studies [41]. We therefore
comment on a few of the issues in the Monte Carlo simulation and how a relaxation of
some demands would affect rates.
6.1. Strong coupling argument and kinematics
The default behaviour in PYTHIA is to let $\alpha_s$ have $p_\perp^2$ as argument. Actually, the
exact kinematics has not yet been reconstructed when $\alpha_s$ is invoked, so the approximate
expression $p_\perp^2 \approx z(1-z)m_a^2$ is used, see discussion at Eq. (7). Since $\alpha_s$ blows up when
its argument approaches $\Lambda_{QCD}$, this translates into a requirement on $p_\perp^2$ or on $z$ and $m_a$,
restricting allowed emissions to $p_\perp > Q_0/2$, where $Q_0$ is the shower cut-off scale. Also
when full kinematics is reconstructed, this is reflected in a suppression of branchings with
small $p_\perp$. Therefore, in $g \to q\bar q$ branchings analyzed in the g rest frame, the quarks do not
come out with the $1 + \cos^2\theta$ angular distribution (with respect to the direction of motion
of the gluon) one might expect away from threshold, or a somewhat more isotropic one
closer to threshold, but are rather peaked at 90° and dying out at 0° and 180°.
For $g \to q\bar q$ branchings, the soft-gluon results that lead to the choice of $p_\perp^2$ as scale
[42] are no longer compelling, however. One could instead use some other scale that does
not depend on $z$ but only on $m_a = m_g$, i.e., the off-shellness of the branching gluon, and
remove the $p_\perp$ cut. A reasonable choice, even if not unique, is to use $m_g^2/4$, where the
factor 1/4 ensures continuity with $p_\perp^2$ for $z = 1/2$. This possibility has been added as a
new option.
Actually, the change of $\alpha_s$ argument in itself leads to a reduced $g \to q\bar q$ splitting rate,
while the removal of the $p_\perp > Q_0/2$ requirement increases it. The net result is a decrease
by about 10%, Table 4. The topologies of the events are changed somewhat, so rates within
experimental cuts could be more affected. However, the changes are not as big as might
have been expected; see the following.
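The two scale choices can be compared in a small numerical sketch. The one-loop running coupling and the values of $\Lambda_{QCD}$, $n_f$ and $Q_0$ below are illustrative assumptions, not the PYTHIA defaults; only the approximate $p_\perp^2 \approx z(1-z)m^2$, the alternative scale $m_g^2/4$ and the $p_\perp > Q_0/2$ requirement are taken from the text.

```python
import math

LAMBDA_QCD = 0.29  # GeV, assumed value for illustration
N_F = 5            # assumed number of active flavours
Q0 = 1.0           # GeV, assumed shower cut-off scale

def alpha_s(q2):
    """One-loop running coupling (an assumption); diverges as q2 -> LAMBDA_QCD^2."""
    if q2 <= LAMBDA_QCD ** 2:
        raise ValueError("scale at or below Lambda_QCD")
    return 12.0 * math.pi / ((33.0 - 2.0 * N_F) * math.log(q2 / LAMBDA_QCD ** 2))

def pt2_approx(z, m2):
    """Approximate transverse momentum squared, z(1-z) m^2."""
    return z * (1.0 - z) * m2

def scale_mass(m2):
    """Alternative z-independent scale m_g^2 / 4."""
    return m2 / 4.0

def allowed_by_pt_cut(z, m2):
    """Default requirement p_perp > Q0/2, applied to the approximate p_perp."""
    return pt2_approx(z, m2) > (Q0 / 2.0) ** 2

m2 = 16.0  # GeV^2, hypothetical gluon virtuality
print(pt2_approx(0.5, m2), scale_mass(m2))                    # the two scales coincide at z = 1/2
print(alpha_s(pt2_approx(0.3, m2)), alpha_s(scale_mass(m2)))  # coupling is smaller at the mass scale
print(allowed_by_pt_cut(0.01, m2))                            # asymmetric splitting removed by the cut
```

Since $z(1-z) \le 1/4$, the scale $m_g^2/4$ is never below the approximate $p_\perp^2$, so with a monotonically falling coupling the new argument can only lower $\alpha_s$. This is in line with the reduced splitting rate noted above.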
6.2. Coherence
In the above subsection, it appears as if the $1 + \cos^2\theta$ distribution would be recovered
in the new $\alpha_s(m_g^2/4)$ option. However, this neglects the coherence condition, which is
imposed as a requirement in the shower that successive opening angles in branchings
become smaller. Such a condition actually disfavours branchings with z close to 0 or
1, since the opening angle becomes large in this limit, Eq. (8). It should be noted that
the opening angle discussed here is not the true one, but the one based on approximate
kinematics, including neglect of masses. More generally, the coherence formalism is not
really developed with this kind of configurations in mind, especially not with a heavy quark
pair close to threshold.
Table 4
The rate of gluon splitting to $q\bar q$ pairs, in $Z^0$ decay at 91.2 GeV with the normal primary
flavour mixture
$\alpha_s$ argument     Coherence      $g \to u\bar u + d\bar d + s\bar s$ (%)   $g \to c\bar c$ (%)   $g \to b\bar b$ (%)
$\alpha_s(p_\perp^2)$   full           14.3   1.26   0.15
$\alpha_s(p_\perp^2)$   intermediate   14.8   1.27   0.16
$\alpha_s(p_\perp^2)$   reduced        21.1   1.92   0.26
$\alpha_s(p_\perp^2)$   none           38.8   3.06   0.31
$\alpha_s(m_g^2/4)$     full           12.9   1.15   0.15
$\alpha_s(m_g^2/4)$     intermediate   13.3   1.17   0.15
$\alpha_s(m_g^2/4)$     reduced        20.0   1.78   0.28
$\alpha_s(m_g^2/4)$     none           43.3   3.47   0.46
As a means of exploring consequences, two new coherence level options have thus
been introduced. In the first, the $p_\perp^2$ of a $g \to q\bar q$ branching is reduced by the correct
mass-dependent factor, $1 - 4m_q^2/m_g^2$, while the massless approximation is kept for the
longitudinal momentum. This is fully within the uncertainty of the game, and no less
reasonable than the default. In the second, no angular ordering at all is imposed on
$g \to q\bar q$ branchings. This is certainly an extreme scenario, and should be viewed with caution.
However, it is still interesting to see what it leads to.
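The size of the mass-dependent reduction in the first option can be sketched as follows. The function name and the quark mass are hypothetical, and the massless estimate $z(1-z)m_g^2$ for $p_\perp^2$ is assumed as the starting point.

```python
def reduced_pt2(z, m_g2, m_q):
    """Massless estimate z(1-z) m_g^2 scaled by the factor 1 - 4 m_q^2 / m_g^2."""
    factor = 1.0 - 4.0 * m_q ** 2 / m_g2
    if factor < 0.0:
        raise ValueError("gluon mass below the q qbar threshold")
    return z * (1.0 - z) * m_g2 * factor

m_b = 4.8  # GeV, assumed bottom mass for illustration
print(reduced_pt2(0.5, 16.0, 0.0))            # massless quarks: unchanged
print(reduced_pt2(0.5, 4.0 * m_b ** 2, m_b))  # at threshold the factor vanishes
print(reduced_pt2(0.5, 400.0, m_b))           # well above threshold: mild reduction
```

The factor suppresses the opening angle most strongly for heavy quark pairs close to threshold, which is where the massless coherence condition is least trustworthy.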
In Section 3.4 another 'intermediate' coherence variation of the default 'full' one
was introduced, affecting the rate of gluon emission off the primary quarks but not the
subsequent gluon cascades. This variation has negligible consequences for the secondary
quark rate at LEP energies and is not considered further.
It turns out that the decay angle distribution of the gluon is much more distorted
by the coherence than by the $\alpha_s$ and kinematics considerations described earlier. Both
modifications are required if one would like to have a $1 + \cos^2\theta$ shape, however. Also
other distributions, like gluon mass and energy, are affected by the choice of options.
The most dramatic effect appears in the total gluon branching rate, however, Table 4.
Already the reduced angular ordering requirement can boost the $g \to b\bar b$ rate by almost a
factor of two. The effects are even bigger without any angular ordering constraints at all.
It is difficult to know what to make of these big effects. Experimental information on the
angular distribution of secondary $c\bar c$/$b\bar b$ pairs might help understand what is going on, but
probably that is not possible experimentally. Anyway, the measured (but still uncertain)
values are fully bracketed by the range of the models, indicating that there need not be a
conflict between theory and experiment.
7. Summary and outlook

We have in this article studied QCD radiation off heavy particles, both based on the
calculation of a wide set of first-order matrix elements and by the usage of these matrix
elements as input to a parton-shower description of multiple-gluon emission.
The matrix-element calculations provide at least two important insights. One is the
significant spin dependence on the rate of gluon emission, given identical kinematical
conditions. Some such effects could be expected simply from the different shape of the
splitting kernel of a quark and a squark, say, but equally important is the spin of the
decaying particle, and the parity of the process (combining that of mother and daughters).
The best illustration is the difference between a vector and a scalar colour singlet source
each decaying to a pair of squarks, where the gluon emission rate differs by up to a factor
of 1.8 in our limited study, and is likely to reach even higher levels in other observables.
The $\gamma_5$ dependence disappears in the massless limit, but the bulk of other effects remain
also there, so this is not only an issue for the production of heavy flavours. This process
dependence would not be caught by the traditional parton-shower philosophy, where the
universal aspects of gluon emission are emphasized.
The other insight is that the dead cone concept is one that only applies universally for
soft gluon emission. Only one process provides an exact zero in the collinear emission rate
while, for reasonably hard gluons, some processes show no dip at all at small angles. And,
while soft gluons may be ideologically interesting, the experimentally observable event
shapes are often more crucially dependent on some intermediate range of gluon energies.
Again, therefore, the message is that a universal shower description may be misleading.
Fortunately, it is not impossible to combine the shower picture with a process
dependence of the kind noted above. We have in this article developed one approach to
the problem, wherein matrix element information is provided for all shower branchings of
the primary particles. This combines a process dependent resummation of gluon emission
off these particles with the traditional strengths of the shower formalism, such as providing
exclusive final states with exact energy-momentum conservation.
To illustrate this formalism, a few physics examples have been studied. For bottom production
at the $Z^0$ pole, detailed comparisons with data are possible. We find the new shower
routine to be in good agreement with current data if the uncertainties in both model and data
are taken into account. A significant dependence on coherence options was found both for
jet rates and gluon splitting rates in the shower. It has further been possible to rule out an
older version of the shower routine where mass effects were not correctly accounted for.
Differences in jet rates were found for different coupling structures in Higgs decay to
bottom, whereas the influence on the fragmentation function is minor. Both the production
and decay of the top quark are dependent on the matrix element correction, influencing, e.g.,
the mass reconstruction of the top quark. Again the jet rate was found to be most sensitive.
Gluon radiation off supersymmetric particles, squarks and gluinos, has been introduced,
but the effect of this additional radiation is small in high energy processes, especially at
hadron colliders, because of the background from initial state radiation and showers off
the standard model decay products. Differences between top and stop events are small
if equal masses are assumed, but many more comparisons between standard model and
supersymmetric processes could be envisaged.
From this limited study we conclude that gluon emission off b and lighter quarks
dominates the picture. If this component is modelled well, the rest is less important, but
still of ideological interest. Since the scaled bottom mass $r = m_b/m_X$, in the decay of a
resonance X, is small in most cases, we see little if any parity dependence. In $Z^0$ decays
this effect is less than 1%, but it could be visible in a high-precision study. Also for the spin
dependence, effects are small in most cases, although the $H^0 \to b\bar b$ example shows they
need not be quite as small. It thus turns out that the new process dependence will mainly
be important when high precision is aimed for.
We note that the current study is not the end of the story. While we have sampled a
fair selection of colour and spin structures, more are likely to turn up even in the simple
context of two-body decays considered here. The MSSM also allows a significant rate
for three-body decays, normally as a sequence of two consecutive two-body ones with an
intermediate off-shell propagator, and with the possibility of interference between several
such intermediate states. The real challenge, however, may well be provided by the more
complex production processes at hadron colliders, where the concept of a sequence of
s-channel processes need no longer be valid, e.g., squark production $gg \to \tilde q\bar{\tilde q}$ with a
t-channel squark propagator. The large rate of initial-state radiation also implies that the
initial-final state interference terms may be more important than the final-state radiation
off the heavy particles themselves. Much work therefore lies ahead, if one desires a good
description of QCD effects in many processes of interest at the LHC.
References
[1] B. Andersson, G. Gustafson, G. Ingelman, T. Sjöstrand, Phys. Rep. 97 (1983) 31;
T. Sjöstrand, Nucl. Phys. B 248 (1984) 469.
[2] Yu.L. Dokshitzer, V.A. Khoze, L.H. Orr, W.J. Stirling, Nucl. Phys. B 403 (1993) 65.
[3] M. Bengtsson, T. Sjöstrand, Phys. Lett. B 185 (1987) 435;
M. Bengtsson, T. Sjöstrand, Nucl. Phys. B 289 (1987) 810.
[4] Yu.L. Dokshitzer, V.A. Khoze, S.I. Troyan, J. Phys. G 17 (1991) 1602.
[5] A. Ballestrero et al., in: S. Jadach et al. (Eds.), Reports of the Working Groups on Precision
Calculations for LEP2 Physics, CERN 2000-009, p. 137.
[6] T. Sjöstrand, Comput. Phys. Commun. 82 (1994) 74, LU TP 95-20, hep-ph/9508391.
[7] G. Marchesini, B.R. Webber, Nucl. Phys. B 238 (1984) 1;
G. Marchesini, B.R. Webber, G. Abbiendi, I.G. Knowles, M.H. Seymour, L. Stanco, Comput.
Phys. Commun. 67 (1992) 465.
[8] G. Marchesini, B.R. Webber, Nucl. Phys. B 330 (1990) 261.
[9] G. Gustafson, U. Pettersson, Nucl. Phys. B 306 (1988) 746;
L. Lönnblad, Comput. Phys. Commun. 71 (1992) 15.
[10] V.V. Sudakov, Zh. Eksp. Teor. Fiz. 30 (1956) 87.
[11] Workshop on Photon Radiation from Quarks, S. Cartwright (Ed.), CERN 92-04.
[12] M.H. Seymour, Z. Phys. C 56 (1992) 161;
M.H. Seymour, Comput. Phys. Commun. 90 (1995) 95.
[13] G. Corcella, M.H. Seymour, Phys. Lett. B 442 (1998) 417.
[14] J. André, T. Sjöstrand, Phys. Rev. D 57 (1998) 5767.
[15] J. Ellis, M.K. Gaillard, G.G. Ross, Nucl. Phys. B 111 (1976) 253.
[16] B.L. Ioffe, Phys. Lett. B 78 (1978) 277.
[17] A.H. Mueller, Phys. Lett. B 104 (1981) 161;
B.I. Ermolaev, V.S. Fadin, JETP Lett. 33 (1981) 269.
[18] W. Beenakker, R. Höpker, P.M. Zerwas, Phys. Lett. B 349 (1995) 463.
[19] Ya.I. Azimov, Yu.L. Dokshitzer, V.A. Khoze, S.I. Troyan, Phys. Lett. B 165 (1985) 147.
[20] B. Andersson, G. Gustafson, T. Sjöstrand, Phys. Lett. B 94 (1980) 211.
[21] V.A. Khoze, T. Sjöstrand, Phys. Lett. B 328 (1994) 466.
[22] G. Gustafson, U. Pettersson, P. Zerwas, Phys. Lett. B 209 (1988) 90;
T. Sjöstrand, V.A. Khoze, Z. Phys. C 62 (1994) 281;
T. Sjöstrand, V.A. Khoze, Phys. Rev. Lett. 72 (1994) 28;
L. Lönnblad, T. Sjöstrand, Phys. Lett. B 351 (1995) 293;
L. Lönnblad, T. Sjöstrand, Eur. Phys. J. C 2 (1998) 165.
[23] CDF Collaboration, F. Abe et al., Phys. Rev. D 50 (1994) 5562.
[24] E. Norrbin, T. Sjöstrand, Phys. Lett. B 442 (1998) 407;
E. Norrbin, T. Sjöstrand, Eur. Phys. J. C 17 (2000) 137.
[25] A. Pukhov et al., preprint INP MSU 98-41/542, hep-ph/9908288.
[26] A. Arhib, M.C. Peyranere, G. Moultaka, Phys. Lett. B 341 (1995) 313.
[27] MATHEMATICA, Version 2.2, Wolfram Research, Champaign, IL, 1993.
[28] P. Janot, Phys. Lett. B 223 (1989) 110.
[29] J. Jersak, E. Laermann, P.M. Zerwas, Phys. Lett. B 98 (1981) 363;
M. Jezabek, J.H. Kühn, Nucl. Phys. B 314 (1989) 1.
[30] R.K. Ellis, D.A. Ross, A.E. Terrano, Nucl. Phys. B 178 (1981) 421.
[31] Z. Kunszt, Phys. Lett. 99B (1981) 429.
[32] S. Catani, Yu.L. Dokshitzer, M. Olsson, G. Turnock, B.R. Webber, Phys. Lett. B 269 (1991)
432.
[33] T.R. Grose, K.O. Mikaelian, Phys. Rev. D 23 (1981) 123.
[34] Yu.L. Dokshitzer, V.A. Khoze, A.H. Mueller, S.I. Troyan, Rev. Mod. Phys. 60 (1988) 373.
[35] JADE Collaboration, W. Bartel et al., Z. Phys. C 33 (1986) 23;
JADE Collaboration, S. Bethke et al., Phys. Lett. B 213 (1988) 235;
N. Brown, W.J. Stirling, Z. Phys. C 53 (1992) 629;
Y.L. Dokshitzer, G.D. Leder, S. Moretti, B.R. Webber, JHEP 9708 (1997) 001;
L. Lönnblad, S. Moretti, T. Sjöstrand, JHEP 9808 (1998) 001.
[36] DELPHI Collaboration, P. Abreu et al., Phys. Lett. B 418 (1998) 430;
P. Bambade, private communication.
[37] ALEPH Collaboration, R. Barate et al., CERN-EP/2000-093, hep-ex/0008013.
[38] A.H. Hoang et al., Eur. Phys. J. Direct C 3 (2000) 1, hep-ph/0001286, to appear in the
Proceedings of the Workshop Physics Studies for a Future Linear Collider, Top Quark WG.
[39] S. Mrenna, Comput. Phys. Commun. 101 (1997) 232.
[40] M.L. Mangano, in: K. Huitu et al. (Eds.), International Europhysics Conference on High Energy
Physics, IOP Publishing, Bristol, 2000, p. 33.
[41] M.H. Seymour, Nucl. Phys. B 436 (1995) 163;
D.J. Miller, M.H. Seymour, Phys. Lett. B 435 (1998) 213.
[42] D. Amati, A. Bassetto, M. Ciafaloni, G. Marchesini, G. Veneziano, Nucl. Phys. B 173 (1980)
429.

Statistical properties of the spectrum of the QCD Dirac operator at low energy
D. Toublan a,b , J.J.M. Verbaarschot b
a Loomis Laboratory of Physics, University of Illinois, Urbana-Champaign, IL 61801, USA
b Department of Physics and Astronomy, State University of New York, Stony Brook, NY 11794, USA
Abstract
We analyze the statistical properties of the spectrum of the QCD Dirac operator at low energy in
a finite box of volume $L^4$ by means of partially quenched Chiral Perturbation Theory, a low-energy
effective field theory based on the symmetries of QCD. We derive the two-point spectral correlation
function from the discontinuity of the chiral susceptibility. For eigenvalues much smaller than
$m_c = F^2/\Sigma L^2$, where $F$ is the pion decay constant and $\Sigma$ is the absolute value of the quark condensate,
our result for the two-point correlation function coincides with the result previously obtained from
chiral Random Matrix Theory (chRMT). The departure from the chRMT result above that scale is
due to the contribution of the nonzero momentum modes. In terms of the variance of the number
of eigenvalues in an interval containing n eigenvalues on average, it amounts to a crossover from a
$\log n$-behavior to a $n^2\log n$-behavior. © 2001 Elsevier Science B.V. All rights reserved.
PACS: 11.30.Rd; 12.39.Fe; 12.38.Lg; 71.30.+h
Keywords: QCD Dirac operator; Chiral random matrix theory; Partially quenched chiral perturbation theory;
Thouless energy; Microscopic spectral density; Number variance; Spectral quark mass dependence
1. Introduction
Since the seminal work by Wigner, Dyson and Mehta [1–3] on the Random Matrix
Theory description of level correlations in nuclei, the problem of level statistics has been
analyzed in great detail for many different quantum systems (see [4] for a recent review).
In QCD, a statistical analysis can be applied to the eigenvalues of the Dirac operator $i\slashed{D}$
defined by
$$ i\slashed{D}\,\phi_k = i\lambda_k\,\phi_k. \tag{1.1} $$
E-mail address: toublan@uiuc.edu (D. Toublan).
PII: S0550-3213(01)00093-1
D. Toublan, J.J.M. Verbaarschot / Nuclear Physics B 603 (2001) 343–368
The Euclidean QCD partition function for $N_f$ quarks of mass $m_f$ is given by
$$ Z^{QCD} = \int [dA] \prod_{f=1}^{N_f} \det(i\slashed{D} + m_f)\, e^{-S_{YM}[A]}, \tag{1.2} $$
and can thus be expressed as
$$ Z^{QCD} = \Big\langle \prod_{f=1}^{N_f} \prod_k (i\lambda_k + m_f) \Big\rangle_{YM}. \tag{1.3} $$
Here, $\langle\cdots\rangle_{YM}$ denotes an average over the Yang-Mills partition function. This shows that,
although the eigenvalues cannot be observed directly, their properties are of fundamental
importance to the physics of QCD.
For small enough energies, below the so-called Thouless energy [5–8], the eigenvalues
of the QCD Dirac operator are strongly correlated, and their correlations are given by
chiral Random Matrix Theory (chRMT) [9–12]. For energy differences much larger than
the Thouless energy but much smaller than $\Lambda_{QCD}$, the eigenvalues of the QCD Dirac
operator show much weaker correlations that are different from chRMT. In this domain,
the eigenvalue correlations can be computed perturbatively by means of partially quenched
Chiral Perturbation Theory (pqChPT). This is a low-energy effective theory based only on
the symmetries of QCD formulated to probe the spectrum of the QCD Dirac operator.
Finally, for energies beyond $\Lambda_{QCD}$ the eigenvalues are uncorrelated. These different
domains have been identified in lattice QCD simulations [13–15], and can be derived from
random hopping models with the chiral symmetry of the QCD partition function [16–21].
The occurrence of Random Matrix behavior in the Dirac spectrum can be understood
naturally from universality arguments [22–27].
The Thouless energy can be studied quantitatively by means of pqChPT. From the
analysis of the average spectral density of the QCD Dirac operator, $\rho(\lambda)$, we found
[7,8,28] that the Thouless energy is given by $m_c = F^2/\Sigma L^2$, where $F$ is the pion decay
constant and $\Sigma$ is the absolute value of the quark condensate. The $1/L^2$-dependence of $m_c$
is well-known from the theory of mesoscopic systems (see for example [29]). In addition
to $\Lambda_{QCD}$ there is one other important energy scale in the Dirac spectrum: the smallest
nonzero eigenvalue $\lambda_{\min}$. It is directly related to the spectral density near zero virtuality
and is therefore given by the Banks-Casher formula [30]: $\lambda_{\min} = 1/\rho(0) = \pi/\Sigma V$. With
the terminology adopted from the study of disordered mesoscopic systems [29] we can
thus distinguish four different domains in the Dirac spectrum. The quantum domain and
the ergodic domain are separated by $\lambda_{\min}$, the ergodic domain and the diffusive domain are
separated by the Thouless energy $m_c$, and the diffusive domain and the ballistic domain
are separated by $\Lambda_{QCD}$.
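The ordering of the three scales and the resulting four domains can be made concrete with a short sketch. The function names and the numerical inputs are hypothetical; the sketch assumes $V = L^4$, $\lambda_{\min} = \pi/\Sigma V$ and $m_c = F^2/\Sigma L^2$ as quoted above, and it treats the boundaries as sharp although they are really crossovers.

```python
import math

def dirac_scales(sigma, F, L, lambda_qcd):
    """Return the three separating scales for a box of volume V = L^4."""
    V = L ** 4
    return {
        "lambda_min": math.pi / (sigma * V),  # Banks-Casher estimate of the level spacing
        "m_c": F ** 2 / (sigma * L ** 2),     # Thouless energy
        "lambda_qcd": lambda_qcd,
    }

def classify(lam, sigma, F, L, lambda_qcd):
    """Name the domain of the eigenvalue scale lam (sharp boundaries assumed)."""
    s = dirac_scales(sigma, F, L, lambda_qcd)
    if not s["lambda_min"] < s["m_c"] < s["lambda_qcd"]:
        raise ValueError("scales are not ordered; box too small for all four domains")
    if lam < s["lambda_min"]:
        return "quantum"
    if lam < s["m_c"]:
        return "ergodic"
    if lam < s["lambda_qcd"]:
        return "diffusive"
    return "ballistic"

# Hypothetical inputs in units where Sigma = F = Lambda_QCD = 1 and L = 10.
for lam in (1e-4, 1e-3, 0.1, 10.0):
    print(lam, classify(lam, sigma=1.0, F=1.0, L=10.0, lambda_qcd=1.0))
```

With these definitions the ratio $m_c/\lambda_{\min} = F^2L^2/\pi$ depends only on $FL$, so an ergodic window exists only when the box is large compared to $1/F$.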
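In this article, we turn our attention to correlations of Dirac eigenvalues, i.e., the
fluctuations about the average behavior of the spectrum. For this purpose we consider
multi-level correlation functions. We focus our study on the average connected two-point
spectral correlation function, $\rho_c(\lambda_1, \lambda_2)$, defined by
$$ \rho_c(\lambda, \lambda') = \Big\langle \sum_{i,j} \delta(\lambda - \lambda_i)\,\delta(\lambda' - \lambda_j) \Big\rangle_{QCD} - \Big\langle \sum_k \delta(\lambda - \lambda_k) \Big\rangle_{QCD} \Big\langle \sum_k \delta(\lambda' - \lambda_k) \Big\rangle_{QCD} $$
$$ \phantom{\rho_c(\lambda, \lambda')} = \delta(\lambda - \lambda') \Big\langle \sum_k \delta(\lambda - \lambda_k) \Big\rangle_{QCD} + R(\lambda, \lambda'), \tag{1.4} $$
where $\langle\cdots\rangle_{QCD}$ denotes the average over the QCD partition function, and the $\lambda_k$ are the
eigenvalues of the Dirac operator. In the last line of this equation we have decomposed the
correlation function into a term containing the self-correlations and the two-point cluster
function, $R(\lambda_1, \lambda_2)$. Both terms enter in the disconnected scalar susceptibility which is a
more natural object in a field theory context. It is defined by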
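$$ \frac{1}{V}\,\partial_{m_1}\partial_{m_2} \ln Z^{QCD} = \frac{1}{V} \Big\langle \sum_k \frac{1}{i\lambda_k + m_1} \sum_j \frac{1}{i\lambda_j + m_2} \Big\rangle_{QCD} - \frac{1}{V} \Big\langle \sum_k \frac{1}{i\lambda_k + m_1} \Big\rangle_{QCD} \Big\langle \sum_j \frac{1}{i\lambda_j + m_2} \Big\rangle_{QCD} \tag{1.5} $$
$$ \phantom{\frac{1}{V}\,\partial_{m_1}\partial_{m_2} \ln Z^{QCD}} = \frac{1}{V} \int d\lambda_1\, d\lambda_2\, \frac{\rho_c(\lambda_1, \lambda_2)}{(i\lambda_1 + m_1)(i\lambda_2 + m_2)}. \tag{1.6} $$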
Because of the averaging over the QCD partition function, $\rho_c(\lambda_1, \lambda_2)$ depends on the quark
masses. Therefore it is not possible to derive the two-point correlation function by inverting
the relation (1.6). In order to invert this relation and to compute the correlation function
one has to introduce special scalar sources, unequivocally related to the eigenvalues of the
QCD Dirac operator, in the QCD partition function. This partition function contains extra
degrees of freedom: one fermionic and one bosonic ghost-quark for each special scalar
source related to an argument of the spectral correlation function. The pqQCD partition
function with two special sources required for the calculation of the spectral two-point
function is given by

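$$ Z^{pqQCD} = \int [dA]\, \frac{\det(i\slashed{D} + z_1 + j_1/2)\,\det(i\slashed{D} + z_2 + j_2/2)}{\det(i\slashed{D} + z_1 - j_1/2)\,\det(i\slashed{D} + z_2 - j_2/2)} \prod_{f=1}^{N_f} \det(i\slashed{D} + m_f)\, e^{-S_{YM}[A]} $$
$$ \phantom{Z^{pqQCD}} = \Big\langle \prod_k \frac{i\lambda_k + z_1 + j_1/2}{i\lambda_k + z_1 - j_1/2}\, \frac{i\lambda_k + z_2 + j_2/2}{i\lambda_k + z_2 - j_2/2} \prod_{f=1}^{N_f} (i\lambda_k + m_f) \Big\rangle_{YM}. \tag{1.7} $$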
Notice that for j1 = j2 = 0 the pqQCD partition function reduces to the QCD partition
function. Therefore, the derivatives of the pqQCD partition function with respect to the
source terms at j1 = j2 = 0 are given by averages over the QCD partition function. This
enlarged partition function was first introduced to study the quenched limit in QCD and
is therefore known as partially quenched QCD (pqQCD) [31]. It has already been used to
compute the spectral density of the QCD Dirac operator in [7,8]. In that case, only one
such special scalar source had to be introduced.
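We will be interested in the disconnected scalar susceptibility defined by
$$ \chi(z_1, z_2) = \frac{1}{V}\,\partial_{j_1}\partial_{j_2} \ln Z^{pqQCD} \Big|_{j_1 = j_2 = 0} $$
$$ \phantom{\chi(z_1, z_2)} = \frac{1}{V} \Big\langle \sum_k \frac{1}{i\lambda_k + z_1} \sum_j \frac{1}{i\lambda_j + z_2} \Big\rangle_{QCD} - \frac{1}{V} \Big\langle \sum_k \frac{1}{i\lambda_k + z_1} \Big\rangle_{QCD} \Big\langle \sum_j \frac{1}{i\lambda_j + z_2} \Big\rangle_{QCD}. \tag{1.8} $$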
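It is related to the spectral two-point correlation function of the QCD Dirac operator (1.4)
as follows
$$ \chi(z_1, z_2) = \frac{1}{V} \int d\lambda_1\, d\lambda_2\, \frac{\rho_c(\lambda_1, \lambda_2)}{(i\lambda_1 + z_1)(i\lambda_2 + z_2)}. \tag{1.9} $$
Because the average over the QCD partition function does not depend on $z_1$ and $z_2$, this
integral equation can be inverted. Using that $\chi(z_1, z_2)$ is odd in both $z_1$ and $z_2$, we find that
$$ \frac{1}{V}\,\rho_c(\lambda_1, \lambda_2) = \frac{1}{4\pi^2}\, \mathrm{Disc}\big|_{z_1 = i\lambda_1,\, z_2 = i\lambda_2}\, \chi(z_1, z_2) $$
$$ \phantom{\frac{1}{V}\,\rho_c(\lambda_1, \lambda_2)} = \frac{1}{4\pi^2} \lim_{\epsilon \to 0^+} \big[ \chi(i\lambda_1 + \epsilon, i\lambda_2 + \epsilon) + \chi(-i\lambda_1 + \epsilon, i\lambda_2 + \epsilon) + \chi(i\lambda_1 + \epsilon, -i\lambda_2 + \epsilon) + \chi(-i\lambda_1 + \epsilon, -i\lambda_2 + \epsilon) \big]. \tag{1.10} $$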
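The inversion by a discontinuity can be checked numerically on a toy spectrum. The sketch below is only an illustration under stated assumptions: it uses a single fixed, hypothetical spectrum that is symmetric about zero, sets $V = 1$, omits the averaging, and keeps only the product of two resolvents (the first term of the susceptibility). Summing that product over the four sign combinations $(\pm i\lambda_1 + \epsilon, \pm i\lambda_2 + \epsilon)$ should then reproduce the product of two Lorentzian-smeared level densities.

```python
import math

def resolvent(z, eigs):
    """G(z) = sum_k 1 / (i lambda_k + z); odd in z for a spectrum symmetric about zero."""
    return sum(1.0 / (1j * lk + z) for lk in eigs)

def chi0(z1, z2, eigs):
    """Unaveraged product of resolvents, a stand-in for the first term of the susceptibility."""
    return resolvent(z1, eigs) * resolvent(z2, eigs)

def disc_two_point(l1, l2, eigs, eps):
    """Sum of chi0 over the four sign combinations, divided by 4 pi^2."""
    total = 0.0
    for s1 in (1, -1):
        for s2 in (1, -1):
            total += chi0(s1 * 1j * l1 + eps, s2 * 1j * l2 + eps, eigs)
    return total / (4.0 * math.pi ** 2)

def smeared_density(lam, eigs, eps):
    """Level density with each delta function replaced by a Lorentzian of width eps."""
    return sum((eps / math.pi) / ((lam - lk) ** 2 + eps ** 2) for lk in eigs)

eigs = [-1.3, -0.5, 0.5, 1.3]  # hypothetical spectrum, symmetric about zero
eps = 0.05
lhs = disc_two_point(0.5, 1.3, eigs, eps)
rhs = smeared_density(0.5, eigs, eps) * smeared_density(1.3, eigs, eps)
print(lhs, rhs)  # imaginary part of lhs vanishes up to rounding; real parts agree
```

The principal-value parts cancel between $+i\lambda$ and $-i\lambda$ because the spectrum is symmetric, which is the numerical counterpart of using the oddness of the susceptibility.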
At low energies, as already discussed in [7,8], the properties of the spectrum of the QCD
Dirac operator can be obtained from the low-energy limit of (1.7). As is the case for the
usual chiral Lagrangian, this effective theory is completely determined by the symmetries
of the pqQCD partition function. This topic will be discussed in the next section. An
alternative way to derive the perturbative results for the scalar susceptibility is the use
of the replica method [15,32]. However, nonperturbative results cannot be obtained this
way [33–40].
Partially quenched Chiral Perturbation Theory, which is the basis of most of our
results, will be introduced in Section 2. The zero momentum sector of this theory will
be analyzed in Section 3. It will be shown that in this domain the two-point spectral
correlation function for arbitrary topological charge is given by chiral Random Matrix
Theory. This nonperturbative result generalizes a calculation in [41] to arbitrary topological
charge without starting from chiral Random Matrix Theory, but rather from an effective
chiral Lagrangian which is obtained from the symmetries of the microscopic theory.
The contribution of the nonzero momentum modes to the scalar susceptibility and its
discontinuity is calculated in Section 4. In Section 5 we evaluate the number variance for
the different domains in the Dirac spectrum. Concluding remarks are made in Section 6.
2. Partially quenched Chiral Perturbation Theory

It is well-known that the low-energy limit of QCD is given by a theory of weakly
interacting Goldstone bosons. The reason is two-fold: the spontaneous breaking of chiral
symmetry and the existence of a mass gap for the non-Goldstone excitations. In QCD,
chiral symmetry is maximally broken consistent with the Vafa-Witten theorem, implying
that $SU_L(N_f) \times SU_R(N_f)$ is broken spontaneously to the diagonal subgroup $SU_V(N_f)$.
The corresponding low-energy effective theory has been investigated in great detail by
means of Chiral Perturbation Theory. It describes successfully the strong interaction
phenomenology at low energies [42–44].
In this section we construct the low-energy limit of pqQCD. This low-energy effective
theory, known as partially quenched Chiral Perturbation Theory (pqChPT), is again solely
based on the symmetries of pqQCD and is given by a theory of Goldstone modes
associated with the spontaneous breaking of chiral symmetry. Because the
unitary symmetry is not a good symmetry for the bosonic ghost quarks, i.e., unitary
transformations violate the complex conjugation of the fields necessary to obtain a
convergent integral, we start from the complexified flavor symmetry group given by
$Gl(N_f + 2|2) \times Gl(N_f + 2|2)$. The flavor symmetry group of the bosonic ghost quarks
is then given by $Gl(2)/U(2) \times Gl(2)/U(2)$, which results in convergent bosonic integrals.
The flavor symmetry group of the fermionic quarks is $U(N_f + 2) \times U(N_f + 2)$. As is
the case for QCD we will assume that the axial symmetry of pqQCD is maximally
broken by the vacuum expectation value of the chiral condensate. We will also assume that
supersymmetry is not spontaneously broken. Because of spontaneous breaking of chiral
symmetry to the diagonal subgroup, the complete Goldstone manifold, which also includes
fermionic Goldstone modes, is then given by the maximum super-Riemannian submanifold
of $Gl(N_f + 2|2)$ [7,8,45]. It will be denoted by $\hat{Gl}(N_f + 2|2)$. Its fermion-fermion block
is given by $U(N_f + 2)$ and the boson-boson block is given by $Gl(2)/U(2)$. The matrix
elements of the boson-fermion and the fermion-boson blocks of this manifold are given
by independent Grassmann variables. The unbroken chiral symmetry group of the pqQCD
partition function is thus given by $\hat{Gl}_L(N_f + 2|2) \times \hat{Gl}_R(N_f + 2|2)$.
At low energies the relevant excitations are the Goldstone fields parameterized by
$$ U = \exp\big( i\sqrt{2}\, \Phi_a T_a / F \big), \tag{2.1} $$
with $T_a$ the generators of the Goldstone manifold $\hat{Gl}(N_f + 2|2)$. Under a
$\hat{Gl}_L(N_f + 2|2) \times \hat{Gl}_R(N_f + 2|2)$ transformation of the quark fields, the Goldstone fields transform as
$$ U \to U_L\, U\, U_R^{-1}. \tag{2.2} $$
The low energy effective partition function is obtained from the requirement that its
transformation properties are the same as for the pqQCD partition function. Under flavor
transformations the pqQCD partition function transforms as
$$ Z^{pqQCD}\big( U_L M U_R^{-1},\; \theta + i \log \mathrm{Sdet}\, U_L U_R^{-1} \big) = Z^{pqQCD}(M, \theta), \tag{2.3} $$
where $\theta$ is the vacuum angle, and the quark mass matrix is denoted by $M$. To lowest order
in the momenta and quark masses, the effective partition function is thus given by
$$ Z^{eff}(M, \theta) = \int_{U \in \hat{Gl}(N_f + 2|2)} dU\, e^{-\int d^4x\, L(x)} \tag{2.4} $$
with the effective chiral Lagrangian given by [7,31]

2

m

F2
1
1
Str U U
Str M U + MU
.
+
L=
4
2
2
F
(2.5)
Here, the quark mass matrix is defined by
$$ M = \mathrm{diag}\big( z_1 + j_1/2,\; z_2 + j_2/2,\; \underbrace{m, \ldots, m}_{N_f},\; z_1 - j_1/2,\; z_2 - j_2/2 \big). \tag{2.6} $$
The axial anomaly is included through a mass term for the superfield $\Phi_0 \propto iF\,\mathrm{Str} \ln U$,
while integrating over the full axial symmetry group $\hat{Gl}(N_f + 2|2)$.
This term serves as a constraint that projects out the flavor singlet channel. The first
two terms in (2.5) also appear in ChPT to lowest order. However, in the case of
pqChPT, there are both fermionic and bosonic Goldstone modes. Their masses are
given by $\sqrt{2m\Sigma}/F$, $\sqrt{2z_1\Sigma}/F$, $\sqrt{2z_2\Sigma}/F$, $\sqrt{(m + z_1)\Sigma}/F$, $\sqrt{(m + z_2)\Sigma}/F$, and
$\sqrt{(z_1 + z_2)\Sigma}/F$, depending on their quark content.
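The Goldstone masses listed above all follow one pattern, which a short sketch makes explicit. It assumes the Gell-Mann-Oakes-Renner form $M^2 = (m_i + m_j)\Sigma/F^2$ for a mode built from quarks of mass $m_i$ and $m_j$; the function names and numerical values are hypothetical.

```python
import math

def goldstone_mass(m_i, m_j, sigma, F):
    """Mass of the mode with quark content (i, j): sqrt((m_i + m_j) * Sigma) / F."""
    return math.sqrt((m_i + m_j) * sigma) / F

def thouless_energy(sigma, F, L):
    """Quark mass scale m_c = F^2 / (Sigma L^2)."""
    return F ** 2 / (sigma * L ** 2)

# Hypothetical values in units where Sigma = F = 1.
sigma, F, L = 1.0, 1.0, 10.0
m, z1, z2 = 0.02, 0.005, 0.001
masses = {
    "sea-sea": goldstone_mass(m, m, sigma, F),
    "sea-valence 1": goldstone_mass(m, z1, sigma, F),
    "valence 1-valence 2": goldstone_mass(z1, z2, sigma, F),
}
m_c = thouless_energy(sigma, F, L)
# At the Thouless energy the Compton wavelength of the mode is of the order of the box size.
print(goldstone_mass(m_c, m_c, sigma, F) * L)
```

With the normalization used here the product $ML$ at $m = m_c$ is $\sqrt{2}$ rather than exactly one, so the Thouless energy should be read as an order-of-magnitude scale.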

The partially quenched effective partition function was first formulated for the supergroup
$U(N_f + k|k)$ to study the quenched approximation in QCD [31]. This formulation
is suitable for perturbative calculations. However, an effective partition function based on
this group cannot be used for nonperturbative calculations of the group integral. For example,
the supertrace leads to the appearance of both positive and negative masses. The
correct integration manifold is the super-Riemannian manifold $\hat{Gl}(N_f + 2|2)$ [7,8,45].
As already mentioned in the introduction, it is possible to distinguish different domains
in the spectrum of the Dirac operator. First, the effective partition function is only valid for
$z_1, z_2 \ll \Lambda_{\rm QCD}$. An important scale is given by the Thouless energy, defined as the quark mass scale for which the Compton wavelength of the lightest corresponding Goldstone mode is equal to the size of the box, i.e.,
$$\frac{m_c\Sigma}{F^2} = \frac{1}{L^2}. \tag{2.7}$$
For $|z_i| \ll m_c$ the $z_i$-dependence of the condensate is determined by the fluctuations of the zero momentum modes. In this domain, the partition function factorizes into a zero momentum sector and a nonzero momentum sector [7,46,47]. In [7,8] it was shown that in this domain the pqQCD partition function for the one-point function reduces to the chRMT partition function. In this article, we will show that the same is true for the spectral two-point correlation function. Inside this domain, we can distinguish a second scale: the smallest nonzero eigenvalue
$$\lambda_{\min} = \frac{\pi}{\Sigma V}. \tag{2.8}$$
For $|z_i| \gg \lambda_{\min}$ the group integrals can be evaluated perturbatively, whereas for smaller values of the $z_i$ the group integrals have to be calculated exactly. In the domain $z_i \gg F^2/\Sigma L^2$, the nonzero momentum modes become important and have to be taken into account. These three domains for the spectral two-point function will be analyzed in detail in the remainder of this article.
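As a rough numerical illustration of the three domains, the sketch below evaluates $\lambda_{\min} = \pi/\Sigma V$ and $m_c = F^2/\Sigma L^2$ for a hypercubic box of side $L$ and labels a valence mass accordingly. The values of $\Sigma$, $F$ and $L$ and the helper names are assumptions chosen for illustration, and the sharp thresholds stand in for what are really smooth crossovers ($\ll$ and $\gg$ in the text). The labels are only meaningful for masses well below $\Lambda_{\rm QCD}$.

```python
import math

HBARC_MEV_FM = 197.327  # hbar*c in MeV fm, used to convert fm to MeV^-1

def spectral_scales(sigma, f_pi, box_fm):
    """Return (lambda_min, m_c) in MeV for a hypercubic box of side box_fm."""
    L = box_fm / HBARC_MEV_FM            # box side in MeV^-1
    V = L ** 4                           # four-volume in MeV^-4
    lambda_min = math.pi / (sigma * V)   # smallest nonzero eigenvalue scale
    m_c = f_pi ** 2 / (sigma * L ** 2)   # Thouless energy
    return lambda_min, m_c

def domain(z, sigma, f_pi, box_fm):
    """Crude label for a valence mass z (MeV); real boundaries are crossovers."""
    lambda_min, m_c = spectral_scales(sigma, f_pi, box_fm)
    if abs(z) < lambda_min:
        return "quantum"
    if abs(z) < m_c:
        return "ergodic"
    return "diffusive"

# Illustrative inputs (assumptions, not taken from the text).
SIGMA = 250.0 ** 3   # MeV^3
F_PI = 93.0          # MeV
BOX_FM = 20.0        # fm

lambda_min, m_c = spectral_scales(SIGMA, F_PI, BOX_FM)
print(f"lambda_min = {lambda_min:.4g} MeV, m_c = {m_c:.4g} MeV")
for z in (1e-4, 1e-2, 1.0):
    print(f"z = {z:g} MeV -> {domain(z, SIGMA, F_PI, BOX_FM)} domain")
```

Note that the ratio $m_c/\lambda_{\min} = F^2L^2/\pi$ does not depend on $\Sigma$, so for small boxes the ergodic window can close entirely.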
3. Quantum and ergodic domains

In these domains, corresponding to $|z_i| \ll m_c$, the zero momentum mode sector and the nonzero momentum sectors of the pqChPT partition function factorize [7]. The $z_i$-dependence of the partition function, and therefore the spectral two-point function, can be obtained from its zero momentum part only. It is given by the super-unitary matrix integral
Zeff (M, )

=
f +2|2)
U Gl(N
2

mV
1
Str M U + MU
dU exp
. (3.1)
2
2
F
We decompose the partition function according to
$$Z^{\rm eff}(M,\theta) = \sum_\nu e^{i\nu\theta} Z_\nu(M), \tag{3.2}$$
with the partition function in the sector of topological charge $\nu$ given by (for an in-depth analysis of the normalization factors of the different topological sectors we refer to [48])

2
Z (M) = C1 exp
2mV

V
(3.3)
dU Sdet U exp
Str MU + M U 1 .
2
f +2|2)
U Gl(N
Exact integrals over the Goldstone manifold $\hat{Gl}(N_f+2|2)$ are mathematically quite
complicated. In the following we will restrict ourselves to the simplest case, and calculate
the scalar susceptibility in the quantum and ergodic domains only in the quenched
limit. A calculation of the two-point correlation function based on the supersymmetric
formulation of chiral Random Matrix Theory [49,50] was first given in [41] for the sector
of zero topological charge. In this article we start directly from the partially quenched chiral
Lagrangian and extend the calculation to all values of the topological charge. Remarkably,
the k-point correlation functions can be expressed in terms of the zero momentum part of
the ChPT partition function with 2k additional flavors [51]. The relation of these additional
flavors with the equal number of additional flavors that enter in the pqQCD partition has
been worked out in detail for the one-point function [7].
3.1. Scalar susceptibility in the quenched limit
In the quenched limit and in a sector of topological charge $\nu$, the zero-mode part of the effective partition function can be written as
$$Z^{\rm eff}_\nu(z_1,z_2,j_1,j_2) = \int_{\hat{Gl}(2|2)} dU\, \mathrm{Sdet}^\nu U\, \exp\Big[\frac{\Sigma V}{2}\,\mathrm{Str}\, M\big(U + U^{-1}\big)\Big], \tag{3.4}$$
where the mass matrix is given by
$$\Sigma V M = \begin{pmatrix} M_1 & 0\\ 0 & M_2\end{pmatrix}, \tag{3.5}$$
with $M_i = \Sigma V\,\mathrm{diag}(z_i + j_i/2,\ z_i - j_i/2)$. The Goldstone manifold $\hat{Gl}(2|2)$ is the maximum Riemannian submanifold of $Gl(2|2)$ with fermion-fermion block given by $U(2)$ and boson-boson block given by $Gl(2)/U(2)$ [7,8,45].
3.1.1. Parameterization of the Goldstone manifold
We parameterize the Goldstone manifold in terms of Goldstone modes related with the
one-point functions corresponding to z1 and z2 , and Goldstone modes that describe twopoint correlations. A convenient parameterization is given by [41],

1 ww w
w1 0
w1 0
,
U=
(3.6)
0 w2
0 w2
w
1 ww

where $w_{1,2} \in \hat{Gl}(1|1)$ and $w, \bar w \in Gl(1|1)$. An explicit parameterization of these supermatrices will be given below. The advantage of the parameterization (3.6) becomes clear upon consideration of the supertrace that appears in the exponent in the effective partition function (3.4):

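$$\frac{\Sigma V}{2}\,\mathrm{Str}\, M\big(U + U^{-1}\big) = S_1 + S_2, \tag{3.7}$$
with
$$S_1 = \frac12\,\mathrm{Str}\, M_1\Big[w_1\sqrt{1 - w\bar w}\, w_1 + w_1^{-1}\sqrt{1 - w\bar w}\, w_1^{-1}\Big], \tag{3.8}$$
and
$$S_2 = \frac12\,\mathrm{Str}\, M_2\Big[w_2\sqrt{1 - \bar w w}\, w_2 + w_2^{-1}\sqrt{1 - \bar w w}\, w_2^{-1}\Big]. \tag{3.9}$$
Furthermore, the source terms $j_1$ and $j_2$ only occur in combination with $w_1$ and $w_2$, respectively. As we will see in the next subsection, the invariant measure of $\hat{Gl}(2|2)$ in the parameterization (3.6) factorizes according to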
(w, w,
w1 , w2 ) = (w, w)(w
1 )(w2 ).
(3.10)
We thus find that the integrals over w1 and w2 in the computation of the scalar
susceptibility factorize

1 2
(z1 , z2 ) =
log Z
V j1 j2
j1 =j2 =0

1
=
2 (w, w),
(3.11)
dw d w (w, w)I
1 (w, w)I
V
where the integrals I1 (w, w)
and I2 (w, w)
are given by

Ii = dwi (wi )ji Si eSi j =0 , i = 1, 2.
(3.12)
i
A further simplification arises by using a polar decomposition of the $2\times2$ supermatrices

that appear in the parameterization (3.6),
wi = vi i vi1 ,
w = vSu
i = 1, 2,
w = u
Sv 1 ,
(3.13)
(3.14)
where 1,2 and S are 2 2 diagonal supermatrices with commuting elements given by

i = diag eii /2 , esi /2 , i = 1, 2,
(3.15)

S = diag sin ei , i sinh ei ,
(3.16)

i
i

,
S = diag sin e , i sinh e
(3.17)

C = 1 S
(3.18)
S = diag(cos , cosh ),
and u, v, v1 , and v2 , all elements of U (1|1)/U (1) U (1), can be conveniently
parameterized according to

0
0
u = exp
,
v = exp
,
0
0

0 i
vi = exp
(3.19)
,
i = 1, 2.
i 0
After these Grassmann diagonalizations, the supertraces in (3.8) and (3.9) are given by

1
1
1
,
S1 = Str v11 vCv 1 v1 1 v11 M1 v1 1 + 1
(3.20)
1 v1 M1 v1 1
2
and

1
1
1
S2 = Str v21 uCu1 v2 2 v21 M2 v2 2 + 1
(3.21)
.
2 v2 M2 v2 2
2
3.1.2. Measure
The parameterization of the Goldstone manifold is of the form
U = W T W = W 2 W 1 T W W 2 T .
The Berezinian of the transformation from the variables
invariant measure we thus consider
1

1
T U 1 dU T = W 2 dW 2 + [dT ]T .
(3.22)
T
to T is one. To calculate the

(3.23)
The measure thus factorizes into a product of one factor depending only on W and another
factor depending only on T (and on T after the transformation from T to T ). The W dependent part of the measure trivially factorizes into a w1 -dependent piece and a w2 dependent piece. We thus find
d(U ) = w12 dw12 w22 dw22 T 1 dT
dw d w.
(w1 ) dw1 (w2 ) dw2 (w, w)
(3.24)
The first two integration measures simply follow from the invariant measure of Gl(1|1)/
U (1|1) whereas the integration measure of the T -integrations is given by the invariant
measure of Gl(1|1). Both measures will be calculated in the next part of this section.
The matrices wi in the coset Gl(1|1)/U (1|1) have four independent parameters and can
be parameterized as in (3.6), (3.19). We first calculate the measure wi1 dwi . To obtain
the measure wi2 dwi2 we only have to replace the diagonal elements by their square. The
invariant measure is given by the Berezinian of the transformation from variables

wi vi1 wi1 dwi vi
(3.25)
to variables di , dsi , di and di . One easily derives that
1

wi = 1
i wi = i [vi i + di i vi ],
(3.26)
where vi = vi1 dvi . The Berezinian of the transformation from the variables wi to the
di and the off-diagonal elements of vi is given by
1 0
0
0
0 1
0
0
Sdet
0 0 esi /2 eii /2
0
0
eii /2 esi /2
1
= s /2
.
(e i eii /2 )(eii /2 esi /2 )
0
(3.27)
The Berezinian of the transformation from w to w is one (factors from the bosonic
integrations cancel against factors from the fermionic integrations), and (v)12 (v)21 =
d d. For the Berezinian, denoted by B, we thus find
wi1 dwi Bdi dsi di di
desi /2 deii /2 di di
(esi /2 eii /2 )(eii /2 esi /2 )
i dsi di di di
.
=
4(esi /2 eii /2 )(esi /2 eii /2 )
(3.28)
The integration measure for the wi variables is then simply obtained by squaring the
diagonal elements of i ,
(wi ) dwi =
(esi
i dsi di di di
.
eii )(esi eii )
(3.29)
Next we calculate the invariant measure of Gl(1|1). This group has eight independent
parameters and can be parameterized as in (3.6) and (3.19). With the observation that [33,
35] T 1 dT = dw d w we calculate the integration measure starting from
dw v 1 dw u = vS + dS Su,
S + d
S
Sv,
d w u1 d w v = u
(3.30)
with u = u1 du and v = v 1 dv. The Berezinian of the transformation from the

variables on the left hand side to the variables on the right hand side is given by
1
(sin2
+ sinh2 )2
1
(cos2
cosh2 )2
(3.31)
, S } to the variables
The Jacobian for the transformation of the variables {S11 , S22 , S11
22
{, , , } is simply given by
i sinh 2 sin 2.
(3.32)
For the invariant measure of Gl(1|1) we thus find

dw d w =
T 1 dT = (w, w)
i sinh 2 sin 2
(cos2 cosh2 )2
d d d d d d d d. (3.33)
3.1.3. Integration over the Goldstone manifold

The first supertrace S1 in (3.20) appearing in the integral (3.11) can be written as

S1
= z1 + 12 j1 cos cos 1 z1 12 j1 cosh cosh s1
V
+ j1 (cos cos 1 cosh cosh s1 )1 1

+ (cos cosh ) z1 + 12 j1 cos 1 z1 12 j1 cosh s1 ( 1 )( 1 )

+ j1 cos 1 cosh s1 (cos cosh ) ( 1 )1 + 1 ( 1 )
+ j1 (cos cosh )(cos 1 cosh s1 ) 1 1 .
(3.34)
The Grassmann integrals in (3.11) are calculated by collecting the coefficient of 1 1

in the expansion of eS1 . Only the terms linear in j1 contribute to susceptibility. Using the
nilpotency of the Grassmann variables, such as for example ( 1 )2 = 0, one easily shows
that

1
d d d1 d1 j1 eS1 j =0
1
V
!
= exp V z1 (cos cos 1 cosh cosh s1 ) (cos cosh )(cos 1 cosh s1 )
(1 + V z1 cos cosh 1 cosh cosh s1 ).
(3.35)
e S2 .
An analogous result is obtained from the expansion of

The integration over the
Grassmann variables in the disconnected scalar susceptibility (3.11) is now trivial. Taking
into account the measures (3.33), (3.29) we arrive at
1
(z1 , z2 ) = 4 V
2

dt
dp
1
(t 2
tp
F (V z1 )F (V z2 ),
p 2 )2
(3.36)
where
2
1
ex(t cos p cosh s) e(is)
(ei es )(ei es )
0
0

(cos cosh s) 1 + x(t cos p cosh s) .
(3.37)
F (x) = (t p)
ds
To avoid Efetov-Wegner terms [52,53] and problems related to the singularity in the
integrand of $F(x)$, we compute

2
x
+
d ds ex(t cos p cosh s) e(is)
F (x) =
(t p) t p
4
0
0

s+i
1 esi 2 + x(t cos p cosh s)
1e

=
(3.38)
+
(t + p)I (xt)K (xp) .
t p
In the first equality we have used the identity
$$\frac{\cos\phi - \cosh s}{e^{i\phi} - e^{s}} = \frac12\big(1 - e^{-s-i\phi}\big), \tag{3.39}$$
and the second equality follows from relations for modified Bessel functions. We thus find
that

F (x) = t 2 p2 I (xt)K (xp).
(3.40)
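The elementary identity (3.39) used in the first equality above, written here with an angle called `phi` and a real variable `s` (the name of the angle is an assumption, since it is only a dummy variable), states that (cos phi - cosh s)/(e^{i phi} - e^{s}) equals (1 - e^{-s - i phi})/2. It follows from multiplying out (e^{i phi} - e^{s})(1 - e^{-s - i phi}) = 2(cos phi - cosh s). A quick numerical spot check, as a sketch:

```python
import cmath
import math

def lhs(phi, s):
    """(cos(phi) - cosh(s)) / (exp(i*phi) - exp(s))."""
    return (math.cos(phi) - math.cosh(s)) / (cmath.exp(1j * phi) - math.exp(s))

def rhs(phi, s):
    """(1 - exp(-s - i*phi)) / 2."""
    return 0.5 * (1.0 - cmath.exp(-s - 1j * phi))

# Sample points away from phi = s = 0, where both numerator and denominator vanish.
points = [(0.3, 0.7), (2.0, 0.1), (5.5, 3.0)]
max_diff = max(abs(lhs(phi, s) - rhs(phi, s)) for phi, s in points)
print(f"largest deviation over sample points: {max_diff:.2e}")
```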
Inserting this expression into (3.36), the final result for the disconnected scalar susceptibility is given by
1
(z1 , z2 ) = 4 V z1 z2
4
dt tI (V z1 t)I (V z2 t)
0
dp pK (V z1 p)K (V z2 p)
1

4 2 V z1 z2
= 2
z1 I+1 (V z1 )I (V z2 ) z2 I+1 (V z2 )I (V z1 )
2
2
(z1 z2 )

z1 K+1 (V z1 )K (V z2 ) z2 K+1 (V z2 )K (V z1 ) . (3.41)
In the limit z1 = z2 , this expression coincides with the result obtained in [13].
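The step from the integral representation to the closed form in (3.41) rests on standard indefinite integrals of products of Bessel functions. For the $t$-integral the relevant relation is $\int_0^1 dt\, t\, I_\nu(at) I_\nu(bt) = [a I_{\nu+1}(a) I_\nu(b) - b I_{\nu+1}(b) I_\nu(a)]/(a^2 - b^2)$, with $a$ and $b$ standing for the two rescaled masses. The sketch below checks this relation numerically for integer $\nu$; the series-based `bessel_i` and the Simpson integrator are simple stand-ins written for this illustration, not routines used by the authors.

```python
import math

def bessel_i(nu, x, terms=40):
    """Modified Bessel function I_nu(x) for integer nu >= 0 via its power series."""
    half = x / 2.0
    return sum(half ** (2 * k + nu) / (math.factorial(k) * math.factorial(k + nu))
               for k in range(terms))

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def integral_lhs(nu, a, b):
    """Numerical value of the integral of t * I_nu(a t) * I_nu(b t) over [0, 1]."""
    return simpson(lambda t: t * bessel_i(nu, a * t) * bessel_i(nu, b * t), 0.0, 1.0)

def closed_form_rhs(nu, a, b):
    """[a I_{nu+1}(a) I_nu(b) - b I_{nu+1}(b) I_nu(a)] / (a^2 - b^2)."""
    return (a * bessel_i(nu + 1, a) * bessel_i(nu, b)
            - b * bessel_i(nu + 1, b) * bessel_i(nu, a)) / (a * a - b * b)

checks = [(0, 2.0, 1.0), (2, 3.0, 0.5)]
deviations = [abs(integral_lhs(nu, a, b) - closed_form_rhs(nu, a, b))
              for nu, a, b in checks]
print("deviations:", deviations)
```

The closed form is singular in appearance at $a = b$, but the limit is finite, in line with the remark on the limit $z_1 = z_2$.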
3.2. Two-point correlation function
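Finally, the two-point spectral correlation function is given by the discontinuity of the disconnected scalar susceptibility across the imaginary axis (1.10). This follows by using relations for Bessel functions such as
$$\begin{aligned}
K_\nu(-iz) &= \frac{\pi i}{2}\, e^{i\nu\pi/2}\big[J_\nu(z) + iN_\nu(z)\big],\\
K_\nu(iz) &= e^{-i\nu\pi}K_\nu(-iz) - i\pi I_\nu(-iz),\\
I_\nu(-iz) &= e^{-i\nu\pi/2}J_\nu(z), \qquad -\frac{\pi}{2} < \arg z \le \frac{\pi}{2},\\
-\frac{2}{\pi z} &= J_\nu(z)N_{\nu+1}(z) - J_{\nu+1}(z)N_\nu(z).
\end{aligned} \tag{3.42}$$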
In the quantum and ergodic domains, we then obtain the two-point correlation function
c (1 , 2 )
V 2 2

1 2
J (V 1 ) J+1 (V 1 )J1 (V 1 )
2
2
1 2
2
J
(V
)J
(V
J
(V
)J
(V
)
.
1
+1
1
2
2
+1
2
1
(1 22 )2
(3.43)
The first term represents the contribution due to the self-correlations of the eigenvalues (see Eq. (1.4)). It comes from the non-trivial $\epsilon \to 0$ limit in the discontinuity of the susceptibility (1.10). This result coincides with the chRMT result [54]. In the ergodic domain, where $V\Sigma\lambda_{1,2} \gg 1$, the two-point spectral correlation function reduces to
= (1 2 )
1
1
(3.44)
+ ,
2 2 (1 2 )2
and agrees with the asymptotic result for the two-point correlation function of the Wigner-Dyson ensembles.
c (1 , 2 ) =
4. Ergodic and diffusive domains

In these domains, corresponding to $\lambda_{\min} \ll |z_i| \ll \Lambda_{\rm QCD}$, the scalar susceptibility can be computed perturbatively within pqChPT with Lagrangian for $\theta = 0$ given by (2.5),
Leff =

M 2
F2
Str U U 1 Str M U + U 1 + 0 2 + .
4
2
2
2
(4.1)
The $(N_f + 4)\times(N_f + 4)$ matrix $U$ is parameterized as
$$U = \exp\big(i\sqrt{2}\,\Phi^a T_a/F\big), \tag{4.2}$$
with $T_a$ the generators of the Goldstone manifold $\hat{Gl}(N_f+2|2)$, and the singlet field is normalized as
$$\Phi_0 = iF\,\mathrm{Str}\ln U. \tag{4.3}$$
Its mass is related to the topological susceptibility by [7]

2 2 2m
(4.4)
,
2
F V
F2
where m
has been introduced to simplify expressions below. For m
= m/Nf this
expression gives the topological susceptibility for Nf light quarks with mass m,
" 2 # mV
=
(4.5)
.
Nf
M02 =
We can distinguish two types of Goldstone modes: those that correspond to the diagonal generators and those that correspond to the off-diagonal generators. To one-loop order, the off-diagonal Goldstone modes do not mix with the super-singlet field $\Phi_0$. Therefore, their propagator is simply given by
$$G(p^2) = \frac{1}{p^2 + M^2}, \tag{4.6}$$
where $M$ is the mass of the corresponding Goldstone mode. The diagonal Goldstone modes, on the other hand, do mix with the super-singlet mode. It is still possible to diagonalize the quadratic form in the Goldstone fields (see [28,31,55]). This results in the following propagator for diagonal mesons in the sector of fermionic quarks ($1 \le i, j \le N_f + 2$),

Gij p2 = ij
(p2 + M02 )(p2 + M 2 )

1
,
2 )((1 + N )p2 + M 2 + N M 2 )
p2 + Mii2
(p2 + Mii2 )(p2 + Mjj
f
f 0
(4.7)
where $i$ denotes the flavor of the quarks of one of the diagonal mesons, and $j$ the flavor of the quarks of the other one. The masses in the propagator are given by the Gell-Mann-Oakes-Renner relation, $M_{ii}^2 = 2z_i\Sigma/F^2$ for $i = 1, 2$, and $M^2 \equiv M_{ii}^2 = 2m\Sigma/F^2$ for $i = 3, \ldots, N_f + 2$.
We will consider two limits of this theory: $M \to \infty$ (i.e., $m \to \infty$), which is the quenched case, and $M_0 \to \infty$, which is QCD with $N_f$ light flavors of quarks (and a completely decoupled singlet field). In the quenched case, the propagator matrix for the diagonal mesons is given by

Gij p2 = ij
(p2 + M02 )
1
,
2 )
p2 + Mii2
(p2 + Mii2 )(p2 + Mjj
for 1 i, j 2,
(4.8)
whereas in the QCD limit we find

Gij p2 = ij
1
(p2 + M 2 )
,
2 )
p2 + Mii2
Nf (p2 + Mii2 )(p2 + Mjj
for 1 i, j Nf + 2.
(4.9)
Below we only need the propagators for $1 \le i, j \le 2$ so that both limits can be treated at the same time by introducing the propagator

Gij p2 = ij
1
p2 + Mii2
(p2 + 2m /F 2 )
2 )
(p2 + Mii2 )(p2 + Mjj
for 1 i, j 2.
(4.10)
In the QCD limit we have that = 1/Nf and m = m/Nf , whereas the quenched case is
obtained by the substitution = and m = m.

4.1. Disconnected scalar susceptibility
We compute the disconnected scalar susceptibility for spectral quarks with masses given
by z1 and z2 . The mass of all Nf other quarks is taken to be equal to m. To one-loop order
the contributions represented by the following diagrams have to be taken into account:
Fig. 1. One-loop diagrams in pqChPT which contribute to (z1 , z2 ). The full lines represent either
the standard propagator of an off-diagonal meson (4.6) or the propagator of a diagonal meson
(4.10) (with a cross). The wiggly lines denote the two different scalar sources.
The propagators in the first diagram in Fig. 1 are given by $G(p^2) = (p^2 + M_{12}^2)^{-1}$ with $M_{12}^2 = (z_1 + z_2)\Sigma/F^2$. In the second diagram of Fig. 1, the lines with a cross denote the propagators that mix the diagonal mesons with the super-$\Phi_0$, $G_{12}(p)$ in (4.10). Contributions of this type are also essential in the calculation of the resolvent from the partially quenched chiral Lagrangian [7,28,31]. The disconnected scalar susceptibility computed to one-loop order within pqChPT is thus given by
2
M12
2
2 2 2
G p + 2G12 p2
VF4 p
$
%
(p2 + 2m /F 2 )2
2
1
=
+
2
.
2 )2
2 )2 (p2 + M 2 )2
V F 4 p (p2 + M12
(p2 + M11
22
(z1 , z2 ) =
In the notation of [56] with

(r) 2
r
Gr M 2 =
p + M2 ,
V
p
(4.11)
(4.12)
the scalar susceptibility can be rewritten as

$
2 2(z1 m )2
2 2(z2 m )2
2
2
+
M11 +
(z1 , z2 ) = 4 G2 M12
G
G2 M22
2
F
(z1 z2 )2
(z1 z2 )2
%
2
F 2 2(z1 m )(z2 m ) 2
G1 M11 G1 M22 . (4.13)
+
(z1 z2 )3
The functions $G_1(M^2)$ and $G_2(M^2)$ for momenta in a finite box were analyzed in detail in [56]. They are obviously related to properties of the propagator of a free scalar particle at the origin (notice that $G_2(M^2) = -\partial_{M^2}G_1(M^2)$). For a box with volume $L^4$ and momentum cutoff $\Lambda$ they are given by [43,56],

G1 M 2 =
2
1
M2
2
M ,L ,
M
log
+
g
1
16 2
2

M2
1
G2 M 2 =
1
+
log
+ g2 M 2 , L ,
2
2
16
(4.14)
where

gr M , L =
2
1
16 2

d r3
0

n=0
eM
2 n2 L2 /4
(4.15)
and the sum is over a four-dimensional lattice of integers. The functions $g_r(M^2, L)$ obviously vanish in the thermodynamic limit. For $1/L^2 \ll M \ll 1/L$ they dominate the logarithmic terms in the propagators and can be expanded in powers of $M$ resulting in

$$g_1(M^2, L) = \frac{1}{M^2L^4} + O\big(M^0/L^2\big), \qquad g_2(M^2, L) = \frac{1}{M^4L^4} + O\big(\log(ML)\big). \tag{4.16}$$
In the domain $1/L^2 \ll M_{ii} \ll 1/L$ ($i = 1, 2$) the disconnected scalar susceptibility (4.11) is thus given by
(z1 , z2 ) =
m2
2z12 z22 V
1
+ .
(z1 + z2 )2 V
(4.17)
Therefore in the quenched limit, using the relation (4.4), the result reads,
(z1 , z2 ) =
2 2
2z12 z22 2 V 3
1
+ ,
(z1 + z2 )2 V
(4.18)
and in the QCD limit, with m = m/Nf , one finds

(z1 , z2 ) =
1
m2
+ .
+
2
2
2
(z1 + z2 )2 V
2Nf z1 z2 V
(4.19)
In the QCD limit, for spectral quarks with equal masses z1 = z2 = m, we indeed recover
the ChPT result that can be easily derived from the results given in [57],
=
Nf2 + 2
4Nf2
1
+ .
m2 V
(4.20)
In the thermodynamic limit at fixed values of the quark masses, the finite volume
corrections g1,2 (M 2 , L) in (4.14) can be ignored. This results in the susceptibility
(z1 , z2 )
$
2
z1 + z2
=
ln
2
4
16 F
2

z1
2
(z1 m ) m (z1 + z2 ) + z1 (z1 3z2 ) ln z1 z2
+
3
(z1 z2 )
%

2 2
,
+ (z1 z2 ) z1 + z22 2m (z1 + z2 ) + 2m2
(4.21)
where we have defined the scale = 2 F 2 /2 (compare to (4.14)). If the two spectral
quarks have the same mass z1 = z2 = z, a somewhat simpler expression is obtained,
$ 2
%
m
z
4m
2
2
1
+
2
(z) =
(4.22)
ln
+
.
16 2 F 4 3z2
3z
In the limit $z \to 0$, the disconnected scalar susceptibility is singular (notice that this expression is only valid for $z \gg 1/\Sigma V$). This is due to the double pole occurring in the neutral meson propagator when $\bar m \neq 0$. Such singularities are common in pqChPT. They also appear, for instance, in the small mass behavior of the scalar radius of the pion in the quenched approximation [58]. In the case $\bar m = 0$, i.e., for $\langle\nu^2\rangle = 0$ or in the chiral limit, the algebraic singularities in (4.21) cancel and only a logarithmic singularity remains.
The physically more interesting QCD limit of the disconnected scalar susceptibility,
(z1 , z2 ), is obtained from (4.21) by putting m = m/Nf and = 1/Nf ,
(z1 , z2 )
2 $
z1 + z2 z12 + z22 + 2m2 2(z1 + z2 )m

1
+
ln
16 2 F 2
2
Nf2 (z1 z2 )2

%
2
z1
2
2
z1 (z1 3z2 ) + 4z1 z2 m (z1 + z2 )m ln z1 z2 .
+ 2
Nf (z1 z2 )3
(4.23)
For $z_1 \to z_2 = z$ the apparent singularities cancel, and in this limit the expression (4.23) simplifies to
2

z
4m 2
m
2
N
+
+
2
ln
,
(z) =
(4.24)
f
3z
16 2 Nf2 F 4 3z2
=
or, in the chiral limit,

(z) =
2 (Nf2 + 2)
16 2 Nf2 F 4
ln
z
.
(4.25)
Finally, in the case z1 = z2 = m, we recover the result derived in [57] within ChPT.
4.2. Two-point correlation function
In this section, we calculate the two-point spectral correlation function from the
discontinuities of the disconnected scalar susceptibility (see (1.10)) obtained perturbatively
within pqChPT in the previous section. Therefore, our results for the two-point function
are only valid within the domain of validity of perturbative pqChPT,
1/L4 i QCD ,
i = 1, 2,
|1 2 | min 1/L4 .
(4.26)
The second condition arises from the susceptibilities (i1 + #, i2 + #) that contribute
to the two-point correlation function (see Eq. (1.10)). In that case, we have zero-momentum modes with mass set by $|\lambda_1 - \lambda_2|$, and a perturbative evaluation of the susceptibility is only possible for $V\Sigma|\lambda_1 - \lambda_2| \gg 1$. For eigenvalues inside the domain (4.26) satisfying the additional condition $\lambda_i \ll F^2/\Sigma L^2$, that is the ergodic domain, only the zero momentum modes contribute to the scalar susceptibility, which is then given by (4.17). For the connected two-point correlation function we then find
$
%
1
1
1
+
+ .
c (1 , 2 ) = 2
(4.27)
2 (1 2 )2 (1 + 2 )2
This result is also obtained from a perturbative expansion of the chRMT result, that is
the expression (3.43) derived from the zero-momentum sector of the pqChPT partition
function in Section 3.2. It does not depend on any of the low-energy coupling constants that
appear in the effective Lagrangian (4.1) and is thus the same, independent of the number
of flavors and the quark masses, provided that chiral symmetry is broken spontaneously.
The perturbative result for the two-point spectral correlation function is obtained from
the discontinuity of (4.13). At finite volume, in the domain (4.26), it is given by
1
1
1
1
2 2 (1 2 )2 2 2 (1 + 2 )2
2
4 2

2
p4
( + 2 )2
2 p F 4 (1 2 )
F4 1
+
+

2
2
2
2
2 2 F 4
p4 +
p4 +
( 2 )2
( + 2 )2
p=0
F4 1
F4 1
c (1 , 2 ) =
2
p4
2
p4
2 2
2 2
F4 1
F4 2
4
p
+
2m
/F
+

,
2 2 2
4 + 2 2 2
2 2 F 4
p4 +
p
4
4
p=0
1
2
F
F
(4.28)
where the sum is over the momenta $p = 2\pi k/L$ with $k$ a four-dimensional lattice of integers. In the ergodic domain, only the contribution of the first two terms has to be taken into account. In the diffusive domain, the thermodynamic limit of (4.28) is given by
$
(21 + 22 )(2 (21 + 22 ) 2m2 )
(21 22 )2
2V
c (1 , 2 ) =
+
8
ln
64 4 F 4
164
(21 22 )2

21
m |1 ||2 |
1
2 4
2 2
4
ln 2
16
+
4
+
6
1
1
2
2
(|2 | + |1 |)3
(21 22 )3
2
%
2

2
+ 2 21 41 621 22 342 ln 12 2 22 42 622 21 341 ln 22 .
(4.29)
This result can also be obtained directly from the thermodynamic limit of the scalar susceptibility in the diffusive domain (4.21). The two-point correlation function is even in both $\lambda_1$ and $\lambda_2$ but is not translationally invariant. These properties originate from the pairing of the Dirac eigenvalues. For massless quarks (or topological susceptibility equal to zero in the quenched case) the expression (4.29) simplifies to
$
(21 22 )2
2V
42
+
c (1 , 2 ) =
ln
64 4 F 4
164
(21 22 )2

2

2
1
21 41 621 22 342 ln 12
2 21 + 22 + 2
2
1 2
%
2

(4.30)
,
22 42 621 22 341 ln 22
where = 1/Nf in the QCD case, and = in the quenched case. Notice that in the limit
1 2 the terms proportional to 2 are regular. For |1 2 | 1,2 the infrared singular
part of the correlation function (4.30) simplifies to
c (1 , 2 ) =
2
|1 2 |
.
log
32 4 F 4
(4.31)
This result, derived for m = 0, does not depend on the parameter . It is therefore valid for
QCD with any number of flavors of massless quarks, even zero. It cannot be obtained from
chRMT. Beyond the energy scale for which the kinetic term in the chiral Lagrangian cannot
be neglected, also known as the Thouless energy, pqChPT differs from chRMT. This was
observed earlier in the analysis of the spectral density in [5,7] and will be discussed in
greater detail in the next section.
5. Number variance
In the study of disordered systems, a frequently used measure of the spectral correlations is the number variance of the eigenvalues. It is defined as the variance of the number of eigenvalues in an interval that contains $n$ eigenvalues on average. If the actual number of eigenvalues in each such interval for the $i$th member of the ensemble is given by $n_i$, the $p$th moment of the number of eigenvalues is given by the ensemble average $\langle n_i^p\rangle$. With the number variance denoted by $\Sigma^2(n)$ we thus have
$$n = \langle n_i\rangle, \qquad \Sigma^2(n) = \langle n_i^2\rangle - \langle n_i\rangle^2. \tag{5.1}$$
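If we denote the eigenvalues for the $i$th member of the ensemble by $\lambda_k^{(i)}$, the corresponding spectral density is given by
$$\rho_i(\lambda) = \sum_k \delta\big(\lambda - \lambda_k^{(i)}\big). \tag{5.2}$$
The number of eigenvalues inside the interval $[a, b]$ is equal to
$$n_i = \int_a^b d\lambda\, \rho_i(\lambda). \tag{5.3}$$
The average number of eigenvalues inside this interval is thus given by
$$n = \int_a^b d\lambda\, \langle\rho_i(\lambda)\rangle, \tag{5.4}$$
and the number variance can be written as
$$\Sigma^2(n) = \int_a^b\!\!\int_a^b d\lambda_1\, d\lambda_2\, \rho_c(\lambda_1, \lambda_2), \tag{5.5}$$
with the connected two-point correlation function given by
$$\rho_c(\lambda_1, \lambda_2) = \langle\rho_i(\lambda_1)\rho_i(\lambda_2)\rangle - \langle\rho_i(\lambda_1)\rangle\langle\rho_i(\lambda_2)\rangle. \tag{5.6}$$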
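The definitions above translate directly into an estimator: count the levels of each ensemble member in a fixed interval, then take the mean and the variance of the counts. The sketch below applies this to an ensemble of uncorrelated levels with unit mean density, used here only as a stand-in for a real Dirac spectrum; for such levels the number variance should be close to $n$. The ensemble sizes, the seed, and the function names are assumptions made for this illustration.

```python
import random
from statistics import mean, pvariance

def count_in_interval(levels, a, b):
    """Number of levels n_i in the interval [a, b)."""
    return sum(1 for x in levels if a <= x < b)

def number_variance(spectra, a, b):
    """Return (mean count, variance of count) over the ensemble of spectra."""
    counts = [count_in_interval(levels, a, b) for levels in spectra]
    return mean(counts), pvariance(counts)

def uncorrelated_ensemble(n_members, n_levels, rng):
    """Spectra of independent levels, uniform on [0, n_levels) (mean density 1)."""
    return [[rng.uniform(0.0, n_levels) for _ in range(n_levels)]
            for _ in range(n_members)]

rng = random.Random(12345)                      # fixed seed for reproducibility
spectra = uncorrelated_ensemble(2000, 500, rng)
n_bar, sigma2 = number_variance(spectra, 100.0, 105.0)  # interval of length 5
print(f"mean count = {n_bar:.3f}, number variance = {sigma2:.3f}")
```

Because the density is already constant here, no unfolding step is needed; for a real spectrum the levels would first have to be rescaled to unit mean spacing.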
If the average spectral density is not constant on the interval [a, b] there will be a
contribution to the number variance due to the variation of the average spectral density.
We will always assume that this contribution has been eliminated by a procedure called
unfolding [3].
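The two-point correlation function can be decomposed as
$$\langle\rho_i(\lambda)\rho_i(\lambda')\rangle = \delta(\lambda - \lambda')\Big\langle\sum_k \delta(\lambda - \lambda_k)\Big\rangle + \Big\langle\sum_{k\neq l}\delta(\lambda - \lambda_k)\delta(\lambda' - \lambda_l)\Big\rangle. \tag{5.7}$$
The connected two-point correlation function can be written as (see (1.4))
$$\rho_c(\lambda, \lambda') = \delta(\lambda - \lambda')\rho(\lambda) + R(\lambda, \lambda'), \tag{5.8}$$
where the two-point cluster function $R(\lambda, \lambda')$ contains only correlations of different eigenvalues and is regular for $\lambda \to \lambda'$.
For a finite total number of eigenvalues, we obtain by integration of the connected two-point correlation function the sum rule
$$\int d\lambda_1\, \rho_c(\lambda_1, \lambda_2) = 0. \tag{5.9}$$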
The contribution due to the self-correlations is canceled by the contribution from the two-point cluster function. The coefficient of the linear term in the asymptotic expansion of the number variance is given by
d 2 (n)
=2
dn
V
n/V
d c

n
, ,
V
(5.10)
where we have used that $\rho_c(\lambda_1, \lambda_2) = \rho_c(\lambda_2, \lambda_1)$. This quantity is also known as the spectral compressibility. We observe that for large $n$ the coefficient of the linear term vanishes provided that the correlation function approaches zero faster than $1/\lambda$, $1/\lambda'$, i.e., when the integration in the sum rule (5.9) is convergent without imposing a cutoff on the total number of eigenvalues. If this is the case, the correct asymptotic result for the number variance is obtained from the asymptotic result of the two-point cluster function, provided it is regularized such that the sum rule (5.9) is satisfied.
In the previous sections, the two-point correlation function was derived under the assumption that $\lambda_1, \lambda_2 \ll \Lambda_{\rm QCD}$. On this scale, the variations of the average spectral density [7] can be neglected. The average number of eigenvalues in the interval $[0, G]$ is given by
$$n = \int_0^G d\lambda\, \rho(\lambda). \tag{5.11}$$
For $m = 0$ the spectral density for QCD is given by the Banks-Casher relation
$$\rho(\lambda) = \frac{\Sigma V}{\pi} + \cdots. \tag{5.12}$$
The average number of eigenvalues in the interval $[0, G]$ is thus given by
$$n = \frac{\Sigma V G}{\pi} + \cdots, \tag{5.13}$$
and the number variance can be written as
$$\Sigma^2(n) = \int_0^{n\pi/V\Sigma} d\lambda_1 \int_0^{n\pi/V\Sigma} d\lambda_2\, \rho_c(\lambda_1, \lambda_2) = n + \int_0^{n\pi/V\Sigma} d\lambda_1 \int_0^{n\pi/V\Sigma} d\lambda_2\, R(\lambda_1, \lambda_2). \tag{5.14}$$
Perturbative calculations of the spectral two-point function are only valid for $|\lambda_1 - \lambda_2| \gg \lambda_{\min}$ and do not include the term due to self-correlations. However, they are included in the nonperturbative result (3.41) valid in the ergodic domain and in the quantum domain. Therefore the number variance in the quantum domain and in the ergodic domain is obtained by simply inserting the non-perturbative result (3.42) into (5.14). We are not aware of any analytic expression for these integrals. However, both in the case of small $n$ and in the case of large $n$, one can derive concise analytic expressions for the number variance. Only the self-correlations in (3.42) contribute to the small-$n$ limit of the number variance,
$$\Sigma^2(n) = n + O\big(n^2\big). \tag{5.15}$$
As discussed above the asymptotic large-n result of the number variance can only be
obtained from the asymptotic result of the two-point correlation function after it has been
regularized such that the sum-rule (5.9) is satisfied. In the ergodic domain this is achieved
by

c (1 , 2 ) = () ( ) + ( + )
$
%
1
1
1
(5.16)
+
+ ,
2 2 (1 2 )2 + a 2 (1 + 2 )2 + a 2
where a is determined by the sum rule (5.9),
a=
1
.
2()
(5.17)
In the ergodic domain, we have that 1/L2 QCD so that () is well approximated by
(0). Then (0) drops out of the expression for the number variance, and at leading order
we find the familiar logarithmic dependence known from Random Matrix Theory,
$$\Sigma^2(n) = \frac{1}{2\pi^2}\log n + \cdots. \tag{5.18}$$
It coincides with the result derived from chRMT. Notice that in the bulk of the spectrum the coefficient of $\log n$ is a factor of 2 larger.
At finite volume, for eigenvalues much larger than the smallest eigenvalues but in the domain of chiral perturbation theory, the number variance can be calculated from the expression for the two-point correlation function that includes the nonzero-momentum modes (4.28). We find
2

k 4 + 2n2 n
1
1
c
2
(n) =
log n +
log
2 2
4 2
k4
k=0

+ 8
n
2 2 nc
2
k=0
2m 2
k 2
+ F2
4 2

2 ,
k 4 + 2n2 n
c
(5.19)
where $n_c = m_c/\lambda_{\min} = F^2L^2/\pi$ is the dimensionless Thouless scale. We have used that $p = 2\pi k/L$ with $k$ a four-dimensional hypercubic lattice with unit cell of length one.
In the diffusive domain, the thermodynamic limit of (5.19) is given by

1 + 22 n 2
n
log
,
2 (n) =
(5.20)
4
16
nc
nQCD
where nQCD /min QCD /min . Therefore, for QCD with Nf flavors of massless
quarks one finds that in the diffusive domain nc n nQCD , the number variance of the
Dirac spectrum is given by

Nf2 + 2 n 2
n
log
.
2 (n) =
(5.21)
2
4
nQCD
16 Nf nc
The number variance computed from pqChPT for eigenvalues below $\Lambda_{\rm QCD}$ is shown in Fig. 2.
Finally, in the ballistic domain, for $n \gg n_{\rm QCD}$, the eigenvalues of the Dirac operator in QCD are correlated as those for a free Dirac operator. The $n^2\log n$ behavior saturates as $n$ for $n \sim n_{\rm QCD}$, where the interactions become weak because of asymptotic freedom. The spectrum thus obeys Poisson statistics, and the number variance is given by
$$\Sigma^2(n) = n. \tag{5.22}$$
6. Conclusions
We have analyzed the fluctuations of the eigenvalues of the QCD Dirac operator by means of the partially quenched chiral susceptibility. The two-point spectral correlation function is obtained from its discontinuity across the imaginary axis. The variance, $\Sigma^2(n)$, of the number of eigenvalues in an interval containing $n$ eigenvalues on average is obtained by integrating the spectral two-point correlation function. The generating function for the scalar susceptibility is given by the QCD partition function with two additional fermionic quarks and two additional bosonic ghost quarks with a mass equal to the spectral mass (also known as valence quark mass) that enters in the partially quenched chiral susceptibility. For spectral quark masses well below $\Lambda_{\rm QCD}$ the low-energy limit of this theory is completely determined by its global symmetries.
Fig. 2. The number variance for levels below $n_{\rm QCD}$ computed from pqChPT for $N_f = 3$ massless quarks. The solid curve represents the result from the zero-mode sector of the partition function, or chRMT, and the dashed curve shows the perturbative result. The volume of the box is $(20\ \mathrm{fm})^4$, the Thouless scale is $n_c \approx 27.5$ and $n_{\rm QCD} \approx 10^5$. Notice how late the asymptotic $\log n$ behavior is reached, and how early the non-zero momentum corrections are visible. A strikingly similar figure has been obtained for a disordered metal in [59].
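The level-number scales quoted in the caption can be related to physical inputs through $n_c = F^2L^2/\pi$ and $n_{\rm QCD} = \Lambda_{\rm QCD}/\lambda_{\min} = \Lambda_{\rm QCD}\Sigma V/\pi$. The sketch below inverts the first relation to see which value of $F$ corresponds to $n_c \approx 27.5$ in a 20 fm box, and evaluates the second for assumed inputs. The values $\Lambda_{\rm QCD} = 200$ MeV and $\Sigma = (250\ \mathrm{MeV})^3$ are assumptions made here for illustration; the inputs actually used for the figure are not stated in this section.

```python
import math

HBARC_MEV_FM = 197.327  # hbar*c in MeV fm, used to convert fm to MeV^-1

def thouless_level(f_pi, box_fm):
    """Dimensionless Thouless scale n_c = F^2 L^2 / pi."""
    L = box_fm / HBARC_MEV_FM
    return f_pi ** 2 * L ** 2 / math.pi

def decay_constant_from_thouless(n_c, box_fm):
    """Invert n_c = F^2 L^2 / pi for F (in MeV)."""
    L = box_fm / HBARC_MEV_FM
    return math.sqrt(math.pi * n_c) / L

def qcd_level(lambda_qcd, sigma, box_fm):
    """Level number n_QCD = Lambda_QCD * Sigma * V / pi."""
    L = box_fm / HBARC_MEV_FM
    return lambda_qcd * sigma * L ** 4 / math.pi

f_implied = decay_constant_from_thouless(27.5, 20.0)
n_qcd = qcd_level(200.0, 250.0 ** 3, 20.0)   # assumed Lambda_QCD and Sigma
print(f"F implied by n_c = 27.5 at L = 20 fm: {f_implied:.1f} MeV")
print(f"n_QCD for the assumed inputs: {n_qcd:.3g}")
```

With these assumptions the implied decay constant comes out close to the physical pion decay constant and $n_{\rm QCD}$ is of order $10^5$, which is at least consistent with the scales quoted in the caption.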
Based on this partition function we have distinguished three important scales in the Dirac spectrum: the smallest nonzero eigenvalue, $\lambda_{\min} = \pi/\Sigma V$; the spectral mass for which the Compton wavelength of the corresponding Goldstone boson is equal to the size of the box, $m_c = F^2/\Sigma L^2$; and $\Lambda_{\rm QCD}$. The analogues of these scales are well-known in mesoscopic physics, where they separate the quantum domain, the ergodic domain, the diffusive domain and the ballistic domain, respectively.
For spectral quark masses well below $m_c$, in the ergodic domain, the kinetic term in the partially quenched chiral Lagrangian decouples from the zero momentum part, and only the latter part has to be taken into account. In this domain the partition function only depends on the number of flavors, the topological charge $\nu$ and the combination $mV\Sigma$ or $zV\Sigma$. Nontrivial results are obtained by keeping this combination fixed in the thermodynamic limit. A perturbative expansion of the zero momentum part of the partition function is possible for spectral quark masses that are much larger than $\lambda_{\min}$. For smaller spectral quark masses we have calculated the super-integrals that appear in the partition function for the sector of topological charge $\nu$. Our results for the scalar susceptibility and the corresponding spectral two-point correlation function in this domain are in complete agreement with chiral Random Matrix Theory. This result, together with earlier work on the microscopic spectral density, provides strong evidence that all higher order correlation functions that can be derived from the low-energy limit of the QCD partition function also coincide with the results from chiral Random Matrix Theory. The number variance in this domain is given by $\log n/2\pi^2$, showing that the fluctuations of the eigenvalues are strongly suppressed.
For spectral masses in the diffusive domain, $m_c \ll z \ll \Lambda_{QCD}$, when nonzero
momentum modes have to be taken into account, a perturbative evaluation of the partition function is
justified. In this limit for QCD we have calculated the scalar susceptibility to one loop
in chiral perturbation theory. We have found that in the case of massive quarks (including
the quenched case) the scalar susceptibility shows a quadratic infrared divergence. At finite
volume this divergence is regulated by nonzero momentum modes, and thus only appears
for valence quark masses beyond the Thouless energy. We expect that this divergence will
show up much more prominently in lattice QCD simulations than its analogue of quenched
chiral logarithms.
The prediction for the behavior of the two-point correlation function in the diffusive
regime is $\rho_c(\lambda_1 - \lambda_2) \sim 1/(\lambda_1 - \lambda_2)^{(d-4)/2}$ [60]. In agreement with this result we have
found a logarithmic dependence of $\rho_c(\lambda_1 - \lambda_2)$ on $|\lambda_1 - \lambda_2|$. The corresponding number
variance is given by $\Sigma^2(n) \sim n^2 \log n$. The proportionality constant, given by
$(N_f^2 + 2)/(16\pi^2 N_f^2 F^2 L^2)$, provides us with an alternative method to determine the pion decay
constant.
The behavior of the number variance is very different in the various domains we have
defined. Deep in the quantum domain, for small n, the interval contains randomly either
zero or one level and $\Sigma^2(n) = n$. When the first level is reached, the number variance
almost stops increasing. It has a $\log n$ behavior up to the Thouless scale $n_c = F^2 L^2/\pi$.
Therefore the spectrum is quite stiff in this domain. Around the Thouless scale, the number
variance increases much faster again. Well above $n_c$ but below $n_{QCD}$, the level number
corresponding to $\Lambda_{QCD}$, the number variance grows like $n^2 \log n$. Finally, for level number
much larger than $n_{QCD}$, the Dirac spectrum is in essence a free particle spectrum because of
asymptotic freedom. The levels are therefore randomly distributed and the number variance
is again given by $\Sigma^2(n) = n$. This behavior is typical of a disordered metallic phase in
condensed matter physics.
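As a consistency check on the scales quoted above, the Thouless scale in level-number units can be read as the ratio of the crossover mass to the smallest eigenvalue. The short derivation below is only a sketch: it assumes a four-dimensional box of volume $V = L^4$ and takes $\Sigma$ to be the chiral condensate entering $\lambda_{\min} = \pi/(\Sigma V)$ and $m_c = F^2/(\Sigma L^2)$.

```latex
n_c \;=\; \frac{m_c}{\lambda_{\min}}
    \;=\; \frac{F^2}{\Sigma L^2}\cdot\frac{\Sigma V}{\pi}
    \;=\; \frac{F^2 L^2}{\pi}
    \qquad (V = L^4).
```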
In conclusion, we have shown that the fluctuations of the QCD Dirac eigenvalues
below $\Lambda_{QCD}$ in the phase of broken chiral symmetry are completely determined by the
symmetries of the QCD partition function.
Acknowledgements
This work was partially supported by the US DOE grant DE-FG-88ER40388. One
of us (D.T.) was supported in part by Holderbank-Stiftung and by Janggen-Pöhn-Stiftung. D. Dalmazi, P. Damgaard, J. Kogut, J. Osborn, K. Splittorff, and T. Wettig are
acknowledged for useful discussions.
References
[1] E.P. Wigner, Ann. Math. 53 (1951) 36.
[2] F.J. Dyson, J. Math. Phys. 3 (1962) 140, 157, 166, 1199.
[3] M. Mehta, Random Matrices, Academic Press, San Diego, 1991;
M. Mehta, J. Math. Phys. 4 (1963) 701.
[4] T. Guhr, A. Müller-Groeling, H.A. Weidenmüller, Phys. Rep. 299 (1998) 189.
[5] J.J.M. Verbaarschot, Phys. Lett. B 368 (1996) 137.
[6] J.C. Osborn, J.J.M. Verbaarschot, Phys. Rev. Lett. 81 (1998) 268;
J.C. Osborn, J.J.M. Verbaarschot, Nucl. Phys. B 525 (1998) 738.
[7] J.C. Osborn, D. Toublan, J.J.M. Verbaarschot, Nucl. Phys. B 540 (1999) 317.
[8] P.H. Damgaard, J.C. Osborn, D. Toublan, J.J.M. Verbaarschot, Nucl. Phys. B 547 (1999) 305.
[9] P.H. Damgaard, U.M. Heller, A. Krasnitz, Phys. Lett. B 445 (1999) 366.
[10] E.V. Shuryak, J.J.M. Verbaarschot, Nucl. Phys. A 560 (1993) 306.
[11] J.J.M. Verbaarschot, Phys. Rev. Lett. 72 (1994) 2531.
[12] J.Z. Ma, T. Guhr, T. Wettig, Eur. Phys. J. A 2 (1998) 87.
[13] M. Göckeler, H. Hehl, P.E.L. Rakow, A. Schäfer, T. Wettig, Phys. Rev. D 59 (1999) 094503.
[14] M.E. Berbenni-Bitsch, M. Göckeler, T. Guhr, A.D. Jackson, J.-Z. Ma, S. Meyer, A. Schäfer,
H.A. Weidenmüller, T. Wettig, T. Wilke, Phys. Lett. B 438 (1998) 14;
M.E. Berbenni-Bitsch, M. Göckeler, S. Meyer, A. Schäfer, T. Wettig, Nucl. Phys. B Proc.
Suppl. 73 (1999) 605.
[15] M.E. Berbenni-Bitsch, M. Göckeler, H. Hehl, S. Meyer, P.E.L. Rakow, A. Schäfer, T. Wettig,
Phys. Lett. B 466 (1999) 293;
M.E. Berbenni-Bitsch, M. Göckeler, H. Hehl, S. Meyer, P.E.L. Rakow, A. Schäfer, T. Wettig,
Nucl. Phys. Proc. Suppl. 83 (2000) 974.
[16] R. Gade, F. Wegner, Nucl. Phys. B 360 (1991) 213;
R. Gade, Nucl. Phys. B 398 (1993) 499.
[17] A. Altland, B.D. Simons, Nucl. Phys. B 562 (1999) 445.
[18] K. Takahashi, S. Iida, Nucl. Phys. B 573 (2000) 685.
[19] T. Guhr, T. Wilke, H.A. Weidenmüller, Phys. Rev. Lett. 85 (2000) 2252.
[20] K. Takahashi, S. Iida, cond-mat/0011003.
[21] T. Guhr, T. Wilke, Nucl. Phys. B 593 (2001) 361.
[22] G. Akemann, P. Damgaard, U. Magnea, S. Nishigaki, Nucl. Phys. B 487 [FS] (1997) 721.
[23] E. Brézin, S. Hikami, A. Zee, Nucl. Phys. B 464 (1996) 411.
[24] A.D. Jackson, M.K. Sener, J.J.M. Verbaarschot, Nucl. Phys. B 479 (1996) 707.
[25] T. Guhr, T. Wettig, Nucl. Phys. B 506 (1997) 589.
[26] A.D. Jackson, M.K. Sener, J.J.M. Verbaarschot, Nucl. Phys. B 506 (1997) 612.
[27] M.K. Sener, J.J.M. Verbaarschot, Phys. Rev. Lett. 81 (1998) 248.
[28] D. Toublan, J.J.M. Verbaarschot, Nucl. Phys. B 560 (1999) 259.
[29] G. Montambaux, in: E. Giacobino, S. Reynaud, J. Zinn-Justin (Eds.), Quantum Fluctuations,
Les Houches, Session LXIII, Elsevier Science, 1995, cond-mat/9602071.
[30] T. Banks, A. Casher, Nucl. Phys. B 169 (1980) 103.
[31] C. Bernard, M. Golterman, Phys. Rev. D 49 (1994) 486;
C. Bernard, M. Golterman, hep-lat/9311070.
[32] P.H. Damgaard, K. Splittorff, Nucl. Phys. B 572 (2000) 478.
[33] J.J.M. Verbaarschot, M.R. Zirnbauer, J. Phys. A 17 (1985) 1093.
[34] A. Kamenev, M. Mezard, J. Phys. A 32 (1999) 4373;
A. Kamenev, M. Mezard, Phys. Rev. B 60 (1999) 3944.
[35] I.V. Yurkevich, I.V. Lerner, Phys. Rev. B 60 (1999) 3955.
[36] E. Kanzieper, cond-mat/9908130.
[37] M.R. Zirnbauer, cond-mat/9903338.
[38] P.H. Damgaard, K. Splittorff, Phys. Rev. D 62 (2000) 054509;
P.H. Damgaard, Phys. Lett. B 476 (2000) 465.
[39] D. Dalmazi, J.J. Verbaarschot, Nucl. Phys. B 592 (2001) 419.
[40] G. Akemann, D. Dalmazi, P.H. Damgaard, J.J. Verbaarschot, hep-th/0011072.
[41] A.V. Andreev, B.D. Simons, N. Taniguchi, Nucl. Phys. B 432 (1996) 3420.
[42] S. Weinberg, Phys. Rev. Lett. 18 (1967) 188;
S. Weinberg, Phys. Rev. 166 (1968) 1568;
S. Weinberg, Physica A 96 (1979) 327.
[43] J. Gasser, H. Leutwyler, Ann. Phys. 158 (1984) 142;
[44] H. Leutwyler, Ann. Phys. 235 (1994) 165.
[45] M.R. Zirnbauer, J. Math. Phys. 37 (1996) 4986;
F.J. Dyson, Commun. Math. Phys. 19 (1970) 235.
[46] J. Gasser, H. Leutwyler, Phys. Lett. B 188 (1987) 477;
[47] H. Leutwyler, A. Smilga, Phys. Rev. D 46 (1992) 5607.
[48] P.H. Damgaard, Nucl. Phys. B 556 (1999) 327.
[49] K.B. Efetov, Supersymmetry in Disorder and Chaos, Cambridge Univ. Press, 1997;
K.B. Efetov, Adv. Phys. 32 (1983) 53.
[50] J.J.M. Verbaarschot, H.A. Weidenmüller, M.R. Zirnbauer, Phys. Rep. 129 (1985) 367.
[51] P.H. Damgaard, Phys. Lett. B 424 (1998) 322;
G. Akemann, P.H. Damgaard, Nucl. Phys. B 528 (1998) 411;
G. Akemann, P.H. Damgaard, Phys. Lett. B 432 (1998) 390;
G. Akemann, P.H. Damgaard, Nucl. Phys. B 576 (2000) 597.
[52] P.B. Gossiaux, Z. Pluhar, H.A. Weidenmüller, Ann. Phys. (NY) 268 (1998) 273.
[53] M.R. Zirnbauer, F.D.M. Haldane, Phys. Rev. B 52 (1995) 8729.
[54] J.J.M. Verbaarschot, I. Zahed, Phys. Rev. Lett. 70 (1993) 3852.
[55] M.F.L. Golterman, K.C. Leung, Phys. Rev. D 57 (1998) 5703;
M.F.L. Golterman, Acta Phys. Polon. B 25 (1994) 1731.
[56] P. Hasenfratz, H. Leutwyler, Nucl. Phys. B 343 (1990) 241.
[57] A. Smilga, J.J. Verbaarschot, Phys. Rev. D 54 (1996) 1087.
[58] G. Colangelo, E. Pallante, Nucl. Phys. B 520 (1998) 433.
[59] D. Braun, G. Montambaux, Phys. Rev. B 52 (1995) 13903.
[60] B.L. Altshuler, B.I. Shklovskii, Zh. Eksp. Teor. Fiz. 91 (1986) 220, Sov. Phys. JETP 64 (1986)
127.

QCD with adjoint scalars in 2D: properties in the colourless scalar sector
P. Bialas a , A. Morel b , B. Petersson c , K. Petrov c , T. Reisz b,d
a Institute of Comp. Science, Jagellonian University, 33-072 Krakow, Poland
b Service de Physique Théorique de Saclay, CE-Saclay, F-91191 Gif-sur-Yvette cedex, France
c Fakultät für Physik, Universität Bielefeld, P.O. Box 100131, D-33501 Bielefeld, Germany
d Institut für Theoretische Physik, Universität Heidelberg, Philosophenweg 16, D-69120 Heidelberg, Germany
Received 27 December 2000; accepted 22 March 2001
Abstract
We present a numerical study of an SU(3) gauged 2D model for adjoint scalar fields, defined
by dimensional reduction of pure gauge QCD in (2 + 1)D at high temperature. We show that the
correlations between Polyakov loops are saturated by two colourless bound states, respectively, even
and odd under the Z2 symmetry related to time reversal in the original theory. Their contributions
(poles) in correlation functions of local composite operators An , respectively, of degree n = 2p and
2p + 1 in the scalar fields (p = 1, 2) fulfill factorization. The contributions of two particle states
(cuts) are detected. Their size agrees with estimates based on a meanfield-like decomposition of
the p = 2 operators into polynomials in p = 1 operators. In contrast to the naive picture of Debye
screening, no sizable signal in any An correlation can be attributed to 1/n times a Debye screening
length associated with n elementary fields. These results are quantitatively consistent with the picture
of scalar matter fields confined within colourless boundstates whose residual strong interactions
are very weak. 2001 Elsevier Science B.V. All rights reserved.
E-mail addresses: pbialas@agrest.if.uj.edu.pl (P. Bialas), morel@spht.saclay.cea.fr (A. Morel),
bengt@physik.uni-bielefeld.de (B. Petersson), petrov@physik.uni-bielefeld.de (K. Petrov),
t.reisz@thphys.uni-heidelberg.de (T. Reisz).
PII: S0550-3213(01)00152-3
P. Bialas et al. / Nuclear Physics B 603 (2001) 369–388
1. Introduction
Dimensional reduction is a powerful technique to study the infrared region of field
theories at high temperature [1–5]. Combined with a non-perturbative lattice simulation
of the reduced model, it has been employed to investigate the properties of gauge theories
and QCD with dynamical quarks in the plasma phase [6–9]. For a recent review see [10].
It is, however, still not clear what the limitations and the domain of validity of this
approach are. Therefore, in a recent work [11], we have studied in detail the reduction to two
dimensions of pure gauge QCD in (2 + 1) dimensions at high temperature. We refer the
reader to this article for a more complete discussion of our motivations, references to the
related literature and details on the reduction procedure. The reduced model is a model for
scalars belonging to the SU(3) algebra (formerly the electric gluons in a static gauge). They
interact with the 2D gauge fields, the static parts of the original 3D spatial gauge fields, and
via an effective potential whose self-couplings are computed by perturbative integration
over the non-static degrees of freedom. In [11] we restricted the perturbative integration to
the one loop order. The main conclusion of our investigation is that dimensional reduction
works very well in this case. In particular it was shown that it works within a few percent
for the correlation function of Polyakov loops (as well as for spacelike Wilson loops)
down to $1.5T_c$, where $T_c$ is the critical temperature in the (2 + 1)D theory. In fact for
$T \geq 1.5T_c$ numerical simulations showed that the Polyakov loop correlations measured in
(2 + 1)D QCD and in the reduced model were very close to one another down to quite short
distances, although our formalism is in principle an expansion both in high temperature and
in p/T, where p is the relevant momentum scale. It was further shown, by simulations at
different values of the bare parameters, that our measurements are in the scaling region,
and thus our results are valid in the continuum limit.
Less expected was the observation that these correlations assume a shape typical
of single particle propagation, as opposed to the standard picture of a screening mass
associated with two massive electric gluons. This feature, together with our previous
motivations, invites us to pursue the numerical exploration of the reduced model per se,
in particular the investigation of states connected with the Polyakov loop correlations.
While in [11] we only measured the correlations of Polyakov loops, here we analyze
separately those of SU(3) invariant polynomials of degree n in the elementary scalars A,
namely $A_n = \mathrm{tr}\, A^n$. The action is invariant under a global sign reversal of all the A's. This
Z2-symmetry, which we denote by R, following the authors of Ref. [12], corresponds
to Euclidean time reversal in the (2 + 1)D theory. In this latter article it was suggested
to investigate operators odd under this symmetry, to obtain a possible gauge invariant
definition of the Debye mass. To investigate both operators which are even and odd under
R we consider separately correlations involving even and odd polynomials, n = 2p and
2p + 1. The R-symmetry may be spontaneously broken in the 2D model. In fact there
exist two phases, corresponding to R being conserved or broken. Only in the symmetric
phase does the model correspond to the reduction of the high temperature (2 + 1)D QCD phase
[9,12–14]. The details of the actual phase diagram will not be studied in the present article.
From our data we can conclude, however, that the values of the coupling constants in the
reduced 2D theory are in the unphysical broken phase. In the same way as has been done
for the reduction from (3 + 1)D to 3D, we solve this problem by working in the metastable
part of the symmetric phase. Using zero field initial conditions on large enough lattices, we
make sure that we stay in the phase of unbroken R , where even and odd operators do not
mix. Investigations of states in the full and reduced model in the case of the (3 + 1)D → 3D
reduction can be found in [6–9] and [15–20], although a detailed analysis of the nature of
these states, as we perform in this paper, has not yet been made.
The 2D action under consideration is recalled in Section 2, together with its meaning
in terms of the (2 + 1)D QCD model, from which it originates, and the relevant operators
and correlations are defined. The simulations are performed in the temperature range 2Tc to
12Tc , where Tc is the deconfining temperature of the latter model. Our results are presented
and discussed in Sections 3 and 4. In Section 3, we first show that the measured correlations
fulfill the factorization properties expected if the lowest states in the n-even and n-odd
channels are two distinct one particle states, whose masses are then extracted from the
large distance decays. According to the criteria proposed in Ref. [12], the state found in
the odd channel is a candidate to define a Debye screening mass, although not the only
one. In Section 4, we further analyze the composite operators An , n = 2p and 2p + 1 and
their correlations An,m , showing that all the properties observed for p = 2 can be deduced
with a good accuracy from their knowledge for p = 1. This follows from the assumption
that, given the SU(3) and R symmetry constraints, the effective model for the elementary
A's after integration over the gauge fields is a free field model for the massive composites
$A_2$ and $A_3$. In particular, we give evidence that the (small) deviations from factorization
observed at short distances are mainly due to intermediate states containing two of the
above particles. The summary and the conclusions can be found in Section 5.
2. The reduced action, operators and correlations

In this section, we write down the reduced 2D action derived in [11], and define our
notations and the quantities of interest for the present work.
The lattice is an $L_s \times L_s$ square; the spacing is a, set to one unless specified otherwise.
The weight in the partition function is written $\exp(-S)$, with S a function of the SU(3)
group elements $U(x; i)$, i = 1, 2, on the links and of the scalars A(x) in the adjoint
representation on the sites:
$$ A(x) = \sum_{\alpha=1}^{8} A^\alpha(x)\,\lambda_\alpha, \qquad \mathrm{tr}\,\lambda_\alpha\lambda_\beta = \frac{1}{2}\,\delta_{\alpha\beta}. \tag{1} $$
Greek superscripts on A will always denote colour indices, unlike integers n, m used in
powers of the algebra element A. We write the 2D reduced action as follows:
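$$ S = S_U + S_{U,A} + S_A, $$
$$ S_U = \beta_3 L_0 \sum_x \left[ 1 - \frac{1}{3}\,\mathrm{Re}\,\mathrm{tr}\, U(x;1)\,U(x + a\hat{1}; 2)\,U(x + a\hat{2}; 1)^{-1}\,U(x;2)^{-1} \right], $$
$$ S_{U,A} = \frac{\beta_3 L_0}{6} \sum_x \sum_{i=1}^{2} \mathrm{tr}\left[ D_i(U)A(x) \right]^2, \tag{2} $$
$$ D_i(U)A(x) = U(x;i)\,A(x + a\hat{i})\,U(x;i)^{-1} - A(x), $$
$$ S_A = \sum_x \left[ k_2\,\mathrm{tr}\,A^2(x) + k_4 \left( \mathrm{tr}\,A^2(x) \right)^2 \right]. \tag{3} $$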
In the above, $S_U$ is the pure gauge term, $S_{U,A}$ the gauge invariant kinetic term for the
scalars and $S_A$ the scalar potential, whose self-couplings $k_2$ and $k_4$ result from the one-loop
integration over the non-static components of the 3D gauge fields. All terms have
the global R-symmetry $A(x) \to -A(x)$, while the Z3 symmetry of the original (2 + 1)D
SU(3) model is broken by the perturbative reduction procedure. It was found in [11] that
$$ k_2 = -\frac{3}{2\pi}\left( c_0 \log L_0 + c_1 \right), \quad c_0 = 1, \quad c_1 = \frac{5}{2}\log 2 - 1, \quad k_4 = \frac{L_0^2}{64\pi}. \tag{4} $$
The values of the parameters $\beta_3$, $L_0$ follow from the original lattice regularization of 3D
pure QCD at temperature T and gauge coupling squared $g_3^2$ in the continuum. The latter
has dimension one in energy and is used to set the scale:
$$ \beta_3 = \frac{6}{a g_3^2}, \qquad L_0 = \frac{1}{aT}. \tag{5} $$
Accordingly, the continuum limit $a \to 0$ is obtained by letting $\beta_3$ and $L_0$ go to infinity
with the dimensionless temperature
$$ \tau = \frac{T}{g_3^2} = \frac{\beta_3}{6L_0} \tag{6} $$
being kept fixed.

The original three-dimensional pure SU(3) gauge theory has a global Z3 symmetry. The
corresponding order parameter is the Polyakov loop. It is a static operator. In the reduced
theory it has the form
$$ L(x) = \frac{1}{3}\,\mathrm{tr}\,\exp\left[ i L_0 A(x) \right]. \tag{7} $$
At sufficiently high temperature the symmetry is spontaneously broken, signalling the
deconfinement of static charges in the fundamental representation. The reduced theory
in the above form does not have the Z3 symmetry any more, because the perturbative
reduction is made around one of the broken vacua, where A(x) = 0. The phase transition
in the three-dimensional theory appears at $\tau_c \approx 0.61$ [21]. The reduced model should be
valid in the deconfined phase, at sufficiently high temperature and long distances. In [11]
we employed it to investigate the correlations between Polyakov loops in this phase, where
they are related to screening. We performed a detailed numerical analysis in the reduced
model and a comparison with the results in the full (2 + 1)D theory. Our simulations were
performed for two values of the parameter L0 , namely 4 and 8. It was shown that scaling
was very good, when comparing the data collected for fixed $\tau$ at these two values of $L_0$.
In this paper, we will show data collected for L0 = 4. As this is in the scaling domain, we
can give the results in physical units. Distances R are given by
$$ RT = \frac{r}{L_0}, \tag{8} $$
where r is the distance in lattice units, and temperatures are measured in units of the three-dimensional
critical temperature $T_c$. For $L_0$ fixed in the scaling region one may use
$$ \frac{T}{T_c} = \frac{\beta_3}{\beta_{3c}}. \tag{9} $$
To discuss the effective Lagrangian of the model at fixed $\tau$ in the scaling limit, it is
convenient to normalize the scalar fields differently, defining $\phi(x)$ via
$$ A(x) = \phi(x)\,\sqrt{\frac{6}{L_0\beta_3}}. \tag{10} $$
The corresponding Lagrangian $L_{eff}$ was derived in [11] from the small a expansion of the
effective action S. For clarity of the discussion we reproduce it here. In S, we define
$A_i$ by
$$ U(x;i) = \exp\left[ i a g_2 A_i(x) \right], \tag{11} $$
where $g_2^2 = g_3^2 T$ is the effective coupling of the 2D theory. Taking the limit $a \to 0$ (but for
the UV logarithm $\log L_0 = -\log aT$ in the $\phi^2$ term), one obtains
$$ L_{eff} = \frac{1}{4}\sum_{c=1}^{8} F^c_{ij}F^c_{ij} + \mathrm{tr}\,[D_i\phi]^2 + \frac{g_2^2}{32\pi}\left(\frac{g_2}{T}\right)^2 \mathrm{tr}\,\phi^4 + L_{CT}, $$
$$ F_{ij} = \partial_i A_j - \partial_j A_i + i g_2 [A_i, A_j], \qquad D_i\phi = \partial_i\phi + i g_2 [A_i, \phi], \tag{12} $$
$$ L_{CT} = -\frac{3 g_2^2}{2\pi}\left[ -\log(aT) + \frac{5}{2}\log 2 - 1 \right] \mathrm{tr}\,\phi^2. \tag{13} $$
This is a 2D, SU(3) gauge invariant Lagrangian for an adjoint scalar $\phi$, but it is far from
being the most general one. The gauge coupling g2 , with its canonical dimension one in
energy, sets the scale. The non-kinetic quadratic term is the counterterm LCT , suited to
a lattice UV regularization with spacing a. The appearance of this term in the context of
dimensional reduction in, e.g., the lattice regularization framework was first discussed in
Refs. [3,5]. It is well known that such logarithmic terms also appear in general, when one
wants to define the continuum limit of a 2D lattice model, see, e.g., [22,23]. The reflection
symmetry $\phi \to -\phi$ is related to the Euclidean time reflection in the original (2 + 1)D
theory, denoted R and discussed by Arnold and Yaffe [12]. In the two-dimensional model
of Eq. (2), it may, however, be spontaneously broken in some subspace of the unrestricted
parameter space $\{\beta_3 L_0, k_2, k_4\}$. As was discussed in Refs. [9,13,14], and as follows from
the invariance of the original three-dimensional theory under Euclidean time reflection, the
physical phase is the R symmetric phase. The phase diagram in the 3D adjoint Higgs
model, related to the 4D → 3D QCD reduction, has been studied in Ref. [16]. For the case of
SU(2) in (3 + 1)D, R is the center of the gauge group so that R breaking is also gauge
symmetry breaking, a subject previously discussed in Refs. [13,14] and [9].
We have made a numerical investigation of the relative positions of the phase transition
and of the reduction point in the reduced model, for T /Tc = 1.97, and for two values
L0 = 4 and 8. We find that the reduction point is near to the phase transition, but in the
broken phase. The phase transition is first order and strong enough, so that we can study
the reduction in the metastable symmetric phase, by using appropriate starting values for
the fields, and employing a large enough (32 × 32) lattice.
We now turn to the definitions of the quantities relevant for the present investigation of
the 2D model. The Polyakov loop correlation P(x) is defined by
$$ P(x) = \langle L(0)\,L^\dagger(x) \rangle - |\langle L \rangle|^2, \tag{14} $$
where L(x) is the Polyakov loop operator of Eq. (7). We now expand L(x) in powers of
A(x), and we will study the operators $A^n(x)$ and connected correlations of their traces,
defining
$$ A_n(x) \equiv \mathrm{tr}\,A^n(x), \tag{15} $$
$$ A_{n,m}(x) \equiv \langle A_n(x)A_m(0) \rangle - \langle A_n \rangle \langle A_m \rangle. \tag{16} $$
When not ambiguous, the notation $A_n$ may also represent $\langle A_n(x) \rangle$. Any operator $A_n(x)$ is
gauge invariant, even or odd under the R-symmetry of S for n even or odd, respectively.
Because the reduced model was derived from a small A-fields expansion, its properties are
significant for the (2 + 1)D model in the unbroken R phase only, where $\langle A_{2p+1} \rangle = 0$ and
$\langle A_{2p} \rangle$ is small.
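To make the expansion of L(x) in powers of A(x) concrete, the following Python sketch compares the exact Polyakov loop of Eq. (7) with its truncated power series for a single, hand-picked traceless Hermitian 3 × 3 matrix. The function names and the sample matrix are illustrative assumptions and are not taken from the simulation code used in this work.

```python
import numpy as np
from math import factorial

def polyakov_loop(A, L0):
    """Exact L = (1/3) tr exp(i L0 A) for Hermitian A, via its eigenvalues."""
    eigenvalues = np.linalg.eigvalsh(A)
    return np.sum(np.exp(1j * L0 * eigenvalues)) / 3.0

def polyakov_loop_truncated(A, L0, n_max):
    """Power series (1/3) sum_n (i L0)^n / n! tr A^n, truncated at n = n_max."""
    total = 1.0 + 0.0j                      # n = 0 term: tr(1)/3 = 1
    power = np.eye(3, dtype=complex)
    for n in range(1, n_max + 1):
        power = power @ A                   # running power A^n
        total += (1j * L0) ** n / factorial(n) * np.trace(power) / 3.0
    return total

# Hypothetical small field: traceless, Hermitian, eigenvalues well below 1/L0.
A_sample = np.diag([0.10, -0.03, -0.07]).astype(complex)
L0 = 4

exact = polyakov_loop(A_sample, L0)
approx = polyakov_loop_truncated(A_sample, L0, n_max=5)
truncation_error = abs(exact - approx)
print(truncation_error)
```

For a field of this size the neglected terms start at sixth order in $L_0 A$, which is why a low-order truncation in the $A_n$ can be adequate when the A-fields are small.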
In this article we will concentrate on the study of correlations of these operators,
which are directly related to the static operator L(x) and therefore relevant for the high
temperature properties of the original (2 + 1)D SU(3) theory. As can be easily seen from
the effective Lagrangian above, the 2D model has further symmetries beside R , namely
reflections of the 1- and 2-axis, which can be used to classify further operators, whose
correlations may be studied. (In 2D there is no spin quantum number.) A corresponding
analysis has been made in [16,20] in the case of the three-dimensional adjoint Higgs model.
Although a full numerical analysis of the two-dimensional adjoint Higgs model certainly
has an interest per se, the operators other than those which we defined above are not directly
related to a static (2 + 1)D operator. One would need a further study to ascertain to what
extent their correlations are related to the high temperature physics of the original theory.
We therefore do not discuss them in the context of this paper.
We have performed a numerical simulation of the model defined above, with a flat
measure for the A's and the standard Haar measure for the gauge fields. The algorithm
and error estimate techniques used are the same as in [11] and are not reproduced here. The
lattice size is $L_s = 32$, and $L_0 = 4$, throughout the present work. The $\beta_3$ values are 29,
42, 84 and 173. This corresponds to values of $T/T_c$ approximately equal to 1.97, 2.85,
5.70 and 11.73, respectively. We were able to extract information from operators and
correlations corresponding to n = 2 to 5. The cases n = 2 and 3 for the 3D reduced model
were investigated in [16].
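The relation between the bare parameters and the temperature can be sketched numerically. The Python snippet below assumes the dimensionless temperature is $\beta_3/(6L_0)$, as in Eq. (6), and uses the critical value 0.61 quoted from [21]. Since that value is given to two digits only, the snippet is expected to reproduce the quoted $T/T_c$ values to about one percent, not exactly. Function and variable names are illustrative.

```python
TAU_C = 0.61  # approximate critical dimensionless temperature quoted from [21]

def dimensionless_temperature(beta3, L0):
    """tau = T / g3^2 = beta3 / (6 L0), following Eqs. (5) and (6)."""
    return beta3 / (6.0 * L0)

def temperature_over_tc(beta3, L0, tau_c=TAU_C):
    """T / Tc at fixed L0, using the ratio tau / tau_c."""
    return dimensionless_temperature(beta3, L0) / tau_c

L0 = 4
quoted = {29: 1.97, 42: 2.85, 84: 5.70, 173: 11.73}  # values stated in the text
estimates = {beta3: temperature_over_tc(beta3, L0) for beta3 in quoted}
for beta3, t_quoted in quoted.items():
    print(f"beta3 = {beta3:3d}: T/Tc ~ {estimates[beta3]:.2f} (text quotes {t_quoted})")
```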
3. The lowest states of the scalar spectrum

In this section we present our results for the $A_{n,m}$ correlations measured, and describe
them for each temperature in terms of two states S and P, respectively even and odd under
R and appearing for n, m both even and both odd. Their physical masses will be denoted
$M_S$ and $M_P$. For simplicity we often use in this and the next section the bare parameters r
and $\beta_3$, related to R and T by Eqs. (8) and (9).
Fig. 1. The on-axis correlations $A_{n,m}(r)$ at $T/T_c = 1.97$ ($\beta_3 = 29$), versus the distance in units of
1/T. The even cases [n, m] = [2, 2], [2, 4] and [4, 4] all have the same shape, and the odd cases
[3, 3], [3, 5] and [5, 5], again similar to each other in shape, are steeper.
In the simulations reported here, all runs were initialized with zero A-field values and,
as already stated, the system stayed in the metastable R unbroken phase, as desired, with
$\langle A_{2p+1}(x) \rangle$ being always compatible with zero. In this respect, the situation is thus similar
to that encountered in [13–16].
For a first look at the data obtained in even and odd channels, we show in Fig. 1 the
on-axis correlations $A_{n,m}(r)$, $n \le m \in [2, 5]$, at $T/T_c = 1.97$ ($\beta_3 = 29$). They are plotted
against RT, that is, the physical distance in units of the inverse temperature. In the even
cases, the three correlations all have the same shape, and they decrease by about one
order of magnitude each time two more powers of A are involved. The same is true for
the odd cases, with a common decay of the correlations steeper than in the former case
(smaller correlation length in lattice units). The overall situation is similar for higher $\beta_3$.
For n, m larger than 5, as well as for $\beta_3$ very large, the signal/noise ratio becomes very
small. This can be understood at the qualitative level by noting that the rescaling Eq. (10)
of the A-fields normalizes the kinetic term for the $\phi$-fields to the standard, parameter
independent form $\frac{1}{2}\mathrm{tr}(D_i\phi)^2$. Hence if the field renormalization by the interactions is
weak, the $\phi$-field correlations should depend only weakly on $\beta_3$, which means that $A_{n,m}$ scales
like $\beta_3^{-(n+m)/2}$. This will be illustrated more quantitatively in Section 4. Due to this scale
factor, the A-fields remain small in practice down to quite low values of $\beta_3$, which
a posteriori explains why the perturbative reduction may still work at a temperature as
low as $1.5T_c$. In fact, we checked that the Polyakov loop correlations are actually fully
reconstructed within errors by keeping {n, m} up to {5, 5} only in their expansion in $A_{n,m}$'s
obtained from the small A expansion of (14).
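The scaling argument can be written out in one line. The sketch below assumes, as in the text, that the correlations of the rescaled field of Eq. (10) depend only weakly on $\beta_3$; the symbol $\Phi_{n,m}$ for the connected correlation of $\mathrm{tr}\,\phi^n$ and $\mathrm{tr}\,\phi^m$ is introduced here purely for illustration.

```latex
A_n(x) = \operatorname{tr} A^n(x)
       = \left(\frac{6}{L_0\beta_3}\right)^{n/2} \operatorname{tr}\phi^n(x)
\quad\Longrightarrow\quad
A_{n,m}(r) = \left(\frac{6}{L_0\beta_3}\right)^{(n+m)/2} \Phi_{n,m}(r)
           \;\propto\; \beta_3^{-(n+m)/2}
\quad (L_0 \text{ fixed}).
```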
Now we want to analyze these $A_{n,m}$ data quantitatively in terms of the lowest states of
the spectrum. Let $m_i = aM_i$ be the lowest mass with quantum number i = S, P, in lattice
units. For particle i we introduce a lattice propagator $\Delta_{Latt}(m_i, p)$ in momentum space:
$$ \Delta_{Latt}^{-1}(m_i, p) = \hat{p}^2 + 4\sinh^2\frac{m_i}{2}, \qquad \hat{p}^2 = 4\sin^2\frac{p_1}{2} + 4\sin^2\frac{p_2}{2}. \tag{17} $$
The corresponding contribution to $A_{n,m}(r)$ then reads
$$ A_{n,m}(m_i, r) = g^i_{n,m}\,\frac{1}{L_s^2}\sum_{p_1,p_2} \cos(p_1 r)\,\Delta_{Latt}(m_i, p), \tag{18} $$
where $g^i_{n,m}$ measures the residue of $A_{n,m}$ at the pole of (17), which on large enough lattices
sits at $p^2 \approx \hat{p}^2 \approx -m_i^2$. With our definitions, $g^i_{n,m}$ is non-zero only for i = S if n and m
are even, and for i = P if n and m are odd. Prior to any fit of the masses to the data,
we notice that a first consequence of our expectations on the lowest part of the spectrum
comes from residue factorization $g^i_{n,m} = \gamma^i_n \gamma^i_m$, a property which we can probe directly on
the correlations since, as r becomes large, it implies
$$ X_n \equiv \frac{A_{n,n}(r)\,A_{n+2,n+2}(r)}{A^2_{n,n+2}(r)} \to 1. \tag{19} $$
That this is so is demonstrated in Figs. 2–5, showing $X_2$ and $X_3$ for $T/T_c = 1.97$
and 5.7 ($\beta_3 = 29$ and 84). The agreement is very good in all cases, although the quality of
the data is poorer for $X_3$ due to the correlations involving $A_5$ getting very small. Similar
results are obtained for other values of $T/T_c$. We thus conclude at this point that single
particle propagation accounts very well for the largest correlation length occurring in each
of the two channels. Most of the observed deviations of $X_n$ from one will be interpreted in
the next section in terms of two particle state contributions (also shown in the same figures).
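The step from residue factorization to Eq. (19) is short enough to spell out. The sketch below assumes that a single pole dominates each channel at large r, with a common r-dependence $f_i(r)$; the factor symbols $\gamma^i_n$ and $f_i$ are notation introduced here for illustration.

```latex
A_{n,m}(r) \simeq \gamma^i_n \gamma^i_m\, f_i(r)
\quad\Longrightarrow\quad
X_n = \frac{A_{n,n}(r)\,A_{n+2,n+2}(r)}{A_{n,n+2}^2(r)}
    \simeq \frac{(\gamma^i_n)^2 (\gamma^i_{n+2})^2 f_i^2(r)}
                {(\gamma^i_n \gamma^i_{n+2})^2 f_i^2(r)} = 1 .
```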
We now proceed to assign values to the two lowest masses $M_S$ and $M_P$ expected
from the above findings. This we do in various ways, in order to further reinforce the
statement that the correlations do have the characteristics associated with the pole structure
of Eq. (17). Down to $r \approx 1$, an excellent approximation to the on-axis correlation (18) is
given by
$$ A_{n,m}(m_i, r) \simeq c \left[ \frac{e^{-m_i r}}{[m_i r]^{1/2}} + \frac{e^{-m_i(L_s - r)}}{[m_i(L_s - r)]^{1/2}} \right], \tag{20} $$
where c is constant in r. We performed fits of this formula to all our $A_{n,m}(r)$ data taken
at $r > r_{min}$. These fits are stable with respect to $r_{min}$ provided it is larger than about 4, and
the values found for $m_i$ in different correlations are always consistent with each other. The
smallest errors were obtained by using fits to $A_{2,2}$ and $A_{3,3}$.
Fig. 2. Residue factorization: data for the quantity $X_2$, Eq. (19), at $T/T_c = 1.97$ ($\beta_3 = 29$) versus the
distance in units of 1/T. It approaches one at large distances. The quantity $\tilde{X}_2$ corresponds to our
interpretation (Section 4, Eq. (35)) of the deviation from one of $X_2$ at shorter distances.
Effective masses can also be obtained without any fitting by using 0-momentum
correlations, defined for a generic x-space correlation $C(x_1, x_2)$ by
$$ C^0(r) = \frac{1}{L_s}\sum_{x_2} C(r, x_2). \tag{21} $$
If the lowest mass in C is m, the ansatz (17) gives
$$ C^0(r) \propto \cosh\left[ m(L_s/2 - r) \right], \tag{22} $$
and m can be extracted at any r by inverting this relation:
$$ m_{eff}(r) = \log\left[ Y(r) + \sqrt{Y^2(r) - 1} \right], \qquad Y(r) = \frac{C^0(r+1) + C^0(r-1)}{2C^0(r)}. \tag{23} $$
As an overall consistency check, we have extracted an effective mass $m_{eff}(r)$ from 0-momentum
Polyakov loop correlations (14), and compared it to the $m_S$ values obtained
by our fits to $A_{2,2}$. We find that $m_{eff}(r)$ is indeed nearly constant, in fact slowly decreasing
towards a value compatible with $m_S$, due to smaller and steeper contributions to (14) of the
heavier particle P.
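The effective-mass formula of Eq. (23) can be checked on synthetic data. The Python sketch below builds the zero-momentum correlator of a single particle from the lattice propagator of Eq. (17) and verifies that the extracted effective mass is flat in r. The mass value, the lattice size and the function names are arbitrary choices made for this illustration, not values from the simulations.

```python
import numpy as np

def zero_momentum_correlator(m, Ls):
    """C0(r) for one pole of mass m: only p2 = 0 survives the sum over x2."""
    p1 = 2.0 * np.pi * np.arange(Ls) / Ls
    inverse_propagator = 4.0 * np.sin(p1 / 2.0) ** 2 + 4.0 * np.sinh(m / 2.0) ** 2
    r = np.arange(Ls)
    # Fourier sum over p1 for every distance r, normalized by Ls^2
    return (np.cos(np.outer(r, p1)) / inverse_propagator).sum(axis=1) / Ls**2

def effective_mass(C0):
    """m_eff(r) = log[Y + sqrt(Y^2 - 1)] for r = 1 .. len(C0) - 2."""
    Y = (C0[2:] + C0[:-2]) / (2.0 * C0[1:-1])
    return np.log(Y + np.sqrt(Y**2 - 1.0))

m_true, Ls = 0.35, 32           # hypothetical lattice mass and lattice size
C0 = zero_momentum_correlator(m_true, Ls)
m_eff = effective_mass(C0)
max_deviation = np.max(np.abs(m_eff - m_true))
print(max_deviation)
```

With a single pole the effective mass reproduces the input mass at every r away from the source; in real data, heavier states add steeper contributions, so $m_{eff}(r)$ would only approach its plateau from above.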
A contrario, we invalidate the interpretation of the largest correlation length in $A_{n,n}$
as being n times shorter than the Debye screening length, the inverse of a mass $m_E$
associated with electric gluons of the initial (2 + 1)D model (the scalars of the reduced
model). This scenario was advocated by D'Hoker in his perturbative study of QCD3 at high
temperature [24]. If such were the case, the on-axis correlations $A_{n,n}$ should rather look like
$$ A_{n,n}(n m_E, r) \propto \left[ \frac{e^{-m_E r}}{[m_E r]^{1/2}} + \frac{e^{-m_E(L_s - r)}}{[m_E(L_s - r)]^{1/2}} \right]^n, \tag{24} $$
which differs in shape from (20), as was illustrated in [11] for Polyakov loop correlations.
We nevertheless tried fits with (24), but got definitely worse agreement in the range of
temperatures which we have investigated, i.e., up to $12T_c$. Hence the above picture is
ruled out by the data in this temperature range, and if a mass can be defined for the electric
gluon in high temperature QCD3 it is most probably larger than both $m_S/2$ and $m_P/3$.
In a constituent gluon picture, as advocated in Ref. [25], one would have bound states
instead of a cut. One would, however, expect $m_P/m_S \approx 3/2$.
Fig. 3. Residue factorization: data for the quantity $X_3$, Eq. (19), at $T/T_c = 1.97$ ($\beta_3 = 29$) versus the
distance in units of 1/T. It approaches one at large distances. The quantity $\tilde{X}_3$ corresponds to our
interpretation (Section 4, Eq. (36)) of the deviation from one of $X_3$ at shorter distances.
Our final results for the S and P masses in units of the scale $\sqrt{g_3^2 T}$ are collected in
Table 1 for the values of $T/T_c$ investigated. They are taken from fits to $A_{2,2}$ and $A_{3,3}$,
respectively. The values for $M_S$ agree with those obtained in [11] from the Polyakov loop
correlations. As can be seen from the table, the ratios $M_P/M_S$ vary with $T/T_c$. There is,
however, no clear tendency in the region we have investigated, the ratios being 1.8, 2.0,
1.7, 1.6 in order of increasing temperature. We can, of course, not exclude that the ratio
goes to 1.5 at still higher temperatures.
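As a cross-check, the quoted ratios can be recomputed from the entries of Table 1. The short Python sketch below does this and attaches an error estimate to each ratio. The error formula assumes uncorrelated uncertainties on $M_S$ and $M_P$, which is an assumption of this illustration and is not stated in the text; the names `TABLE_1` and `mass_ratio` are likewise hypothetical.

```python
import math

# Table 1 values: T/Tc -> (M_S, err_S, M_P, err_P), all in units of g_3^2 T.
TABLE_1 = {
    1.97: (1.39, 0.03, 2.51, 0.13),
    2.85: (1.48, 0.05, 3.02, 0.16),
    5.70: (1.89, 0.05, 3.22, 0.15),
    11.73: (2.12, 0.08, 3.44, 0.11),
}


def mass_ratio(m_s, err_s, m_p, err_p):
    """Return M_P/M_S and its error, assuming uncorrelated uncertainties."""
    ratio = m_p / m_s
    err = ratio * math.sqrt((err_s / m_s) ** 2 + (err_p / m_p) ** 2)
    return ratio, err


ratios = {t: mass_ratio(*row) for t, row in TABLE_1.items()}
for t, (ratio, err) in ratios.items():
    print(f"T/Tc = {t:5.2f}: M_P/M_S = {ratio:.2f} +/- {err:.2f}")
```

Under this assumption the errors on the ratios are of the order of 0.1, which is one way to read the statement that no clear tendency is visible in the investigated range.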
Fig. 4. Same as in Fig. 2 at $T/T_c = 5.7$ ($\beta_3 = 84$).
Fig. 5. Same as in Fig. 3 at $T/T_c = 5.7$ ($\beta_3 = 84$).
Table 1
Masses in units of $g_3^2 T$ for the S and P states, as measured from fits to $A_{2,2}$ and $A_{3,3}$,
respectively, for different values of $T/T_c$

T/Tc     M_S/(g_3^2 T)   M_P/(g_3^2 T)
1.97     1.39(3)         2.51(13)
2.85     1.48(5)         3.02(16)
5.70     1.89(5)         3.22(15)
11.73    2.12(8)         3.44(11)

4. Weak strong interactions between colourless states

Here we will show that even at quite short distances (r small compared to $m_S^{-1}$) all
the condensates $A_n \equiv \langle A_n(x) \rangle$ and correlations $A_{n,m}(r)$ can be reconstructed to a good
accuracy from the data for $A_2$, $A_{2,2}$ and $A_{3,3}$. The assumption is that the elementary fields
$A^\alpha(x)$ (Greek superscripts are colour indices) interact only through S and P exchanges
between the non-interacting composites $A_2(x)$ and $A_3(x)$, the scale of the fields being
fixed by the size of $A_2$, while $A_3 = 0$. The precise way in which this idea is implemented and
the corresponding technicalities are detailed in Appendix A.
Here we limit ourselves to the simplest applications and give the results, starting with
the local condensates.
4.1. Weak residual interactions: the A-fields condensates
Since SU(3) has rank 2, any $A_n(x)$ can be reduced to a polynomial in $A_2(x)$ and $A_3(x)$.
An elegant method 1 and explicit formulae are given in Appendix A. For n odd, $A_n$ is zero
by R symmetry. For n even, we apply Wick contraction to all pairs of elementary fields
$A^\alpha$, followed by the mean-field-like substitution
$$ A^\alpha(x) A^\beta(x) \to \frac{1}{4}\,\delta_{\alpha,\beta}\, A_2(x). \tag{25} $$
As an illustration, consider $A_4$. With the definitions of Section 2, we have
$$ 2A_4(x) = A_2^2(x) = \frac{1}{2^2} \sum_{\alpha,\beta=1}^{8} A^\alpha(x) A^\alpha(x) A^\beta(x) A^\beta(x). \tag{26} $$
There we apply (25) and then replace $A_2(x)$ by its average $A_2$. The $A^\alpha A^\alpha$ and
$A^\beta A^\beta$ contractions give $(8A_2/4)^2$, and the additional contributions from $\alpha = \beta$ give
$2 \times 8\,(A_2/4)^2$. Noting that $A_{2,2}(0) = \langle A_2^2(x) \rangle - A_2^2$ (see (16)), the net result can be put into
the two equivalent forms
$$ 2A_4 = \frac{5}{4} A_2^2, \tag{27} $$
$$ A_{2,2}(0) = \frac{1}{4} A_2^2. \tag{28} $$
1 We thank M. Bauer for providing us with this simple trick.
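The counting that leads to (27) and (28) can be reproduced by brute force. The following Python sketch enumerates all Wick pairings of the four fields in (26), uses the rule (25) as a propagator with $A_2(x)$ replaced by its average, and sums over the colour indices. The function names are illustrative only, and the average $A_2$ is set to one so that the result is the numerical coefficient.

```python
from fractions import Fraction
from itertools import product

N_COLOURS = 8  # dimension of the adjoint representation of SU(3)


def pairings(items):
    """Yield every way of grouping the items into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def mean_a2_squared(a2=Fraction(1)):
    """Average of A_2^2(x) from Eq. (26), using rule (25) for each contraction."""
    propagator = a2 / 4  # rule (25): A^a A^b -> delta_ab * A_2 / 4
    total = Fraction(0)
    for alpha, beta in product(range(N_COLOURS), repeat=2):
        colours = (alpha, alpha, beta, beta)  # colour indices of the four fields
        for pairing in pairings([0, 1, 2, 3]):
            # A pairing contributes only if each contracted pair has equal colours.
            if all(colours[i] == colours[j] for i, j in pairing):
                total += propagator ** 2
    return total / 4  # prefactor 1/2^2 of Eq. (26)


coefficient = mean_a2_squared()
print(coefficient, coefficient - 1)
```

The first number is the coefficient in (27) and the second, obtained after subtracting the disconnected part $A_2^2$, is the coefficient in (28).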
This prediction is remarkably well verified in all cases. At $\beta_3 = 29$, the left- and right-hand
sides of (28) are, respectively, $4.86(1) \times 10^{-3}$ and $4.825(10) \times 10^{-3}$. They are
$6.093(13) \times 10^{-4}$ and $6.069(1) \times 10^{-4}$ at $\beta_3 = 84$. Similar manipulations lead to
$$ A_{3,3}(0) = \frac{5}{64} A_2^3. \tag{29} $$
In this case the left- and right-hand sides are measured to be $2.105(7) \times 10^{-4}$ and
$2.095(6) \times 10^{-4}$ for $\beta_3 = 29$, $9.365(30) \times 10^{-6}$ and $9.344(3) \times 10^{-6}$ for $\beta_3 = 84$. Hence
the effects of residual interactions via non-quadratic effective couplings in $A_2(x)$ and
$A_3(x)$ are less than one percent in the correlations at zero distance.
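The numbers quoted for (28) and (29) can be checked against each other, since both right-hand sides are functions of the single average $A_2$. The sketch below extracts $A_2$ from the quoted right-hand side of (28) and recomputes the right-hand side of (29). It uses central values only and ignores the quoted errors; the dictionary and function names are hypothetical.

```python
import math

# Right-hand sides of Eq. (28), A_2^2 / 4, as quoted in the text (keys: beta_3).
RHS_28 = {29: 4.825e-3, 84: 6.069e-4}
# Right-hand sides of Eq. (29), 5 A_2^3 / 64, as quoted in the text.
RHS_29 = {29: 2.095e-4, 84: 9.344e-6}


def rhs_29_from_rhs_28(rhs_28):
    """Recompute 5 A_2^3 / 64 from the quoted value of A_2^2 / 4."""
    a2 = math.sqrt(4.0 * rhs_28)
    return 5.0 * a2 ** 3 / 64.0


predicted = {beta: rhs_29_from_rhs_28(value) for beta, value in RHS_28.items()}
for beta in RHS_28:
    print(beta, predicted[beta], RHS_29[beta])
```

The recomputed values agree with the quoted ones to better than one part in a thousand, which also confirms the negative powers of ten in the quoted numbers.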
Before going to the correlations at non-zero distance, let us discuss their normalization,
as measured by the values of A2,2(0) and A3,3 (0) described just above. At the beginning
of Section 3, we argued that the behaviour in $\beta_3$, n, m observed for $A_{n,m}$ could follow
from the absence of a large renormalization, by the interactions, of the -fields defined by
Eq. (10). Here we note that in the confined phase the effective degrees of freedom are the
massive composites i = tr i , i = 2, 3, so that in the limit where they are considered as
free fields, one may write (see (17))

1
i (0)i (0) Ri d 2 p
,
2
p
+ 4 sinh(m2i /4)

p1
p2
p
2 = 4 sin2
(30)
+ 4 sin2
,
2
2
the residue Ri being one if neither nor the composites get renormalized. We computed
Ri as the ratio of the l.h.s. of (30), directly measured, to the integral in the r.h.s, evaluated
numerically on the lattice for the mass values fitted to the correlation data. The result is
shown in Fig. 6: in the whole temperature range, both residues in the even and odd channels
remain uniformly very close to one.
4.2. Weak residual interactions: properties of the correlations
As we have seen in Section 2 (see Fig. 1), the different An,m (r)s corresponding to the
same channel have very similar shapes. Their analysis in terms of one particle exchange
was successful, confirmed by the agreement with residue factorization. Nevertheless
although the quantity Xn of Eq. (19) does go to one at large distances, it is significantly
different from one at medium and short distances (see Figs. 2-5). We will now show that
two particle exchange is responsible for most of this lack of factorization.
The simplest consequence of our assumptions for correlations at finite r is, using
Eq. (27),
$$ A_{2,4}(r) = \frac{5}{4} A_2 A_{2,2}(r), \tag{31} $$
which is very well verified at any distance as shown in Fig. 7 for $T/T_c = 1.97$ ($\beta_3 = 29$).
A similar agreement is found for the relation
$$ A_{3,5}(r) = \frac{35}{24} A_2 A_{3,3}(r), \tag{32} $$
derived in Appendix A. A new situation arises when we consider $A_{4,4}$ or $A_{5,5}$, where both
the initial and final states may couple to a two-particle state, (SS) or (SP), respectively.
Then the intermediate state in a connected correlation between 0 and r may consist of
either one or two particles. For example, to compute $A_{4,4}(r)$, we apply the substitution
rule (25) to the sum (26), and then average using the definitions of $A_{2,2}$ and $A_2$. One finds
$$ 4A_{4,4}(r) = \left(\frac{5}{4}\right)^2 \left[ 4A_2^2 A_{2,2}(r) + 2A_{2,2}^2(r) \right]. \tag{33} $$
A similar treatment given in Appendix A leads to the prediction
$$ A_{5,5}(r) = \left(\frac{35}{24}\right)^2 \left[ A_2^2 A_{3,3}(r) + A_{2,2}(r) A_{3,3}(r) \right]. \tag{34} $$

Fig. 6. The residues $R_i$ defined by Eq. (30) stay close to one in the whole temperature range.
In the two expressions above, the second contribution, a product of two propagators in
space, is that of a two-particle intermediate state, and it provides a correction to exact
factorization. From the definitions Eq. (19) of $X_2$ and $X_3$, one actually gets the following
estimates:
$$ X_2(r) \approx \tilde X_2(r) \equiv 1 + \frac{A_{2,2}(r)}{2A_2^2}, \tag{35} $$
$$ X_3(r) \approx \tilde X_3(r) \equiv 1 + \frac{A_{2,2}(r)}{A_2^2}. \tag{36} $$
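The step from (31)-(34) to (35) and (36) can be verified with exact rational arithmetic. The sketch below assumes that the factorization ratios of Eq. (19), which is not reproduced in this section, have the form $X_2 = A_{2,2}A_{4,4}/A_{2,4}^2$ and $X_3 = A_{3,3}A_{5,5}/A_{3,5}^2$; this form is an assumption of the illustration, chosen because it equals one under exact residue factorization. The input values are arbitrary test numbers, not data.

```python
from fractions import Fraction


def predicted_correlations(a2, a22, a33):
    """Evaluate Eqs. (31)-(34) for given A_2, A_{2,2}(r) and A_{3,3}(r)."""
    a24 = Fraction(5, 4) * a2 * a22                                   # Eq. (31)
    a35 = Fraction(35, 24) * a2 * a33                                 # Eq. (32)
    a44 = Fraction(5, 4) ** 2 * (4 * a2**2 * a22 + 2 * a22**2) / 4    # Eq. (33)
    a55 = Fraction(35, 24) ** 2 * (a2**2 * a33 + a22 * a33)           # Eq. (34)
    return a24, a35, a44, a55


def factorization_ratios(a2, a22, a33):
    """Ratios X_2 and X_3 in the assumed form described in the lead-in."""
    a24, a35, a44, a55 = predicted_correlations(a2, a22, a33)
    x2 = a22 * a44 / a24**2
    x3 = a33 * a55 / a35**2
    return x2, x3


# Arbitrary rational test values (hypothetical, not measured data).
a2, a22, a33 = Fraction(7, 50), Fraction(1, 300), Fraction(1, 5000)
x2, x3 = factorization_ratios(a2, a22, a33)
print(x2 == 1 + a22 / (2 * a2**2), x3 == 1 + a22 / a2**2)
```

With this form of the ratios, the numerical prefactors 5/4 and 35/24 cancel and only the two-particle term survives as a deviation from one.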
Fig. 7. Plot of $4A_{2,4}/(5A_2 A_{2,2})$ at $T/T_c = 1.97$ ($\beta_3 = 29$). This quantity is one if Eq. (31)
holds exactly.

The estimates $\tilde X_2(r)$, $\tilde X_3(r)$ are displayed in Figs. 2, 3 (respectively, Figs. 4, 5), for
comparison with the measured values $X_2(r)$, $X_3(r)$ at $T/T_c = 1.97$ (respectively, 5.7),
i.e., $\beta_3 = 29$ (respectively, 84). We see that the corrections to factorization implied by
two-particle propagation provide a reasonable explanation of the behaviour observed for the $X$s
at intermediate and short distances. This is especially true in the case of $X_2$, showing that
there is very little room for contributions from direct non-quadratic couplings in $A_2(x)$
in the full effective action (that resulting from integration over the gauge fields). This
justifies our statement that the residual interactions between the colourless bound states
of the adjoint scalars are very weak.
5. Conclusions
In this paper, we have studied properties of the two-dimensional model derived in [11]
by dimensional reduction of 3D QCD at high temperature. In this model, scalar fields
A in the adjoint representation of SU(3) interact via SU(3) gauge fields U , in addition to
a self-interaction generated by integration over the non-static 3D gauge degrees of freedom.
Such properties are interesting since it was shown in [11] that dimensional reduction works
remarkably well in this case. Also, they offer an opportunity to explore non-perturbative
features in a low dimensional situation where the IR singularities are particularly severe.
By means of numerical simulations, we explored that part of phase space where the
R-symmetry $A \to -A$ is unbroken, in accordance with the small-A expansion used to
derive the model, known to be valid quite soon above the transition temperature of pure
3D QCD. We identified two bound states S and P, respectively, even and odd under R,
and thus coupled to monomials, respectively, of degree 2n and 2n + 1 in the A's. The
S signal coincides with that previously obtained from Polyakov loop correlations [11],
where however the P-state contributions could not be disentangled.
These results came out from the measurement of three even-even and three odd-odd
distinct correlations, as functions of the on-axis lattice distance r. Great care was taken
in the analysis of their shape in r, with the result that in all cases, the signal found was
that expected from the occurrence of genuine poles in momentum space. A contrario,
this demonstrates that the picture where the decay with r of such correlations reflects the
propagation in 3D of p = 2n or 2n + 1 electric gluons, i.e., a correlation length equal to
1/p times the Debye screening length, is inadequate in the case under study.
By comparing the size of the three different correlations measured for each of the S
and P sectors, we were able to show that residue factorization holds, as expected on
general grounds when one particle propagates between different states. The agreement
with factorization was expectedly found to be particularly good at large distances, but we
could even show that deviations at shorter distances are to a large extent compatible with
propagation of two particles, namely two S or S and P , respectively, in the S or P channel.
The overall picture thus is that the scalar sector of the reduced model at large distances,
thought to accurately describe static properties of 3D QCD at high temperature, consists
of two weakly interacting colourless particles, respectively, even and odd under the R
symmetry of the model.
This study invites further investigation of several problems. Of course, a
similarly detailed analysis for full QCD in (3 + 1)D would be interesting. Furthermore,
the construction of a reduced model where the Z3 -symmetry of the pure gauge theory is
not spoiled by the reduction process is highly desirable [26], with the hope that it exhibits
a transition to a symmetric Z3 phase analogous to the low temperature QCD phase.
Acknowledgements
We thank the DFG for support under the contract Ka 1198/4-1. K.P. was also supported
by DAAD and P.B. partially by KBN grant P03B01917.
Appendix A. Mean field technique for composite fields correlators

A.1. Formulae for traces and determinants
Let $\Phi$ be a complex $N \times N$ matrix. Here we give an elegant trick 2 to compute the traces
$$ \Phi_n \equiv \mathrm{tr}\,\Phi^n, \tag{A.1} $$
for $n > N$, given $\Phi_p$ for $p \le N$.

Consider the determinant $P_N(t) \equiv \mathrm{Det}(1 - t\Phi)$, where t is a complex variable. It is
a polynomial of degree N in t and its term of degree N is $(-1)^N \mathrm{Det}(\Phi)$, and we have
$$ \log P_N(t) = \mathrm{tr} \log(1 - t\Phi). \tag{A.2} $$
Both sides of this identity can be expanded in t in some finite neighbourhood of zero.
The method consists in identifying the coefficients of the two series. The first N orders
determine the coefficients of $P_N(t)$ from the $\Phi_n$'s, $n \le N$. Then, the higher orders directly
express any $\Phi_n$, $n > N$, as a function of the $\Phi_p$'s, $p \le N$. Note that instead of computing
$\mathrm{Det}(\Phi)$ from the order-N coefficient of $P_N$, one can alternatively compute $\Phi_N$ given
$\mathrm{Det}(\Phi)$, which is convenient for SU(N) group matrices.
If applied to $\Phi = A$, an element of the SU(3) algebra (in which case $A_1 = 0$), this
technique gives $A_n$, $n > 3$, in terms of $A_2$ and $A_3$ taken as independent variables. The first
non-trivial identities are
2 See footnote 1.
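$$ A_4 = \frac{1}{2}(A_2)^2, \qquad A_5 = \frac{5}{6} A_2 A_3, $$
$$ A_6 = \frac{1}{4}(A_2)^3 + \frac{1}{3}(A_3)^2, \qquad A_7 = \frac{7}{12} A_3 (A_2)^2, $$
$$ A_8 = \frac{1}{8}(A_2)^4 + \frac{4}{9}(A_3)^2 A_2, \tag{A.3} $$
and the determinant is
$$ \mathrm{Det}\,A = \frac{2A_3}{3!}. \tag{A.4} $$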
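The identities (A.3) and (A.4) hold for any traceless $3 \times 3$ matrix, so they can be tested numerically on a random element of the SU(3) algebra. The Python sketch below does this with plain lists and the standard library only. The helper names are hypothetical, and the random Hermitian traceless matrix is just one convenient choice of test input.

```python
import math
import random


def matmul(x, y):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def trace_power(a, n):
    """Real part of tr a^n (the trace is real for Hermitian a)."""
    p = a
    for _ in range(n - 1):
        p = matmul(p, a)
    return sum(p[i][i] for i in range(3)).real


def det3(a):
    """Real part of the determinant of a 3x3 matrix, by cofactor expansion."""
    value = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
             - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
             + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    return value.real


def random_traceless_hermitian(rng):
    """Random Hermitian traceless 3x3 matrix, i.e. an su(3) algebra element."""
    m = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(3)] for _ in range(3)]
    h = [[(m[i][j] + m[j][i].conjugate()) / 2 for j in range(3)] for i in range(3)]
    shift = sum(h[i][i] for i in range(3)) / 3
    for i in range(3):
        h[i][i] -= shift
    return h


def identities_hold(a, tol=1e-9):
    """Check Eqs. (A.3) and (A.4) for the matrix a."""
    a2, a3 = trace_power(a, 2), trace_power(a, 3)
    predicted = {
        4: a2**2 / 2,
        5: 5 * a2 * a3 / 6,
        6: a2**3 / 4 + a3**2 / 3,
        7: 7 * a3 * a2**2 / 12,
        8: a2**4 / 8 + 4 * a3**2 * a2 / 9,
    }
    traces_ok = all(
        math.isclose(trace_power(a, n), value, rel_tol=tol, abs_tol=tol)
        for n, value in predicted.items()
    )
    det_ok = math.isclose(det3(a), 2 * a3 / 6, rel_tol=tol, abs_tol=tol)
    return traces_ok and det_ok


A = random_traceless_hermitian(random.Random(0))
print(identities_hold(A))
```

The same identities follow from Newton's relations between power sums and elementary symmetric polynomials of the three eigenvalues, with the first elementary symmetric polynomial set to zero by tracelessness.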
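In what follows, we will have to manipulate monomials of the elementary scalar fields
$A^\alpha(x)$, defined for SU(N) through
$$ A(x) = \sum_{\alpha=1}^{N^2-1} A^\alpha(x)\,\lambda^\alpha, \tag{A.5} $$
where the traceless basis $\lambda^\alpha$ is subject to the normalization
$$ \mathrm{tr}\,\lambda^\alpha \lambda^\beta = \frac{1}{2}\,\delta_{\alpha\beta}. \tag{A.6} $$
On this basis the anti-commutators read
$$ \{\lambda^\alpha, \lambda^\beta\} = c_{\alpha\beta}\,1_N + \sum_{\gamma=1}^{N^2-1} d_{\alpha\beta\gamma}\,\lambda^\gamma, \tag{A.7} $$
with real and totally symmetric tensors c and d. With these normalizations and notations,
we have
$$ \mathrm{tr}\,A^2 = \sum_{\alpha,\beta=1}^{N^2-1} \mathrm{tr}(\lambda^\alpha \lambda^\beta)\, A^\alpha A^\beta = \frac{1}{2} \sum_{\alpha=1}^{N^2-1} A^\alpha A^\alpha, \tag{A.8} $$
$$ \mathrm{tr}\,A^3 = \sum_{\alpha,\beta,\gamma=1}^{N^2-1} \mathrm{tr}(\lambda^\alpha \lambda^\beta \lambda^\gamma)\, A^\alpha A^\beta A^\gamma = \frac{1}{4} \sum_{\alpha,\beta,\gamma=1}^{N^2-1} d_{\alpha\beta\gamma}\, A^\alpha A^\beta A^\gamma. \tag{A.9} $$
The projection property
$$ \sum_{\alpha=1}^{N^2-1} \lambda^\alpha_{ab}\, \lambda^\alpha_{dc} = \frac{1}{2}\left[ \delta_{ac}\delta_{bd} - \frac{1}{N}\,\delta_{ab}\delta_{cd} \right], \tag{A.10} $$
can be used to derive that for any pair X, Y of complex $N \times N$ matrices the following
identities hold:
$$ \sum_{\alpha} \mathrm{tr}(\lambda^\alpha X)\, \mathrm{tr}(\lambda^\alpha Y) = \frac{1}{2}\left[ \mathrm{tr}\,XY - \frac{1}{N}\,\mathrm{tr}\,X\,\mathrm{tr}\,Y \right], \tag{A.11} $$
$$ \sum_{\alpha} \mathrm{tr}(\lambda^\alpha X \lambda^\alpha Y) = \frac{1}{2}\left[ \mathrm{tr}\,X\,\mathrm{tr}\,Y - \frac{1}{N}\,\mathrm{tr}\,XY \right]. \tag{A.12} $$
In what follows we specialize to N = 3.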

A.2. Correlators of composite operators
By gauge invariance, $A_2(x)$ and $A_3(x)$ can be chosen as the two effective degrees of
freedom. By R-symmetry, the even and odd sectors under $A \to -A$ decouple. Here we
derive consequences of the assumption that their dynamics is determined at leading order
by their given vacuum expectation values $A_2$ and 0, respectively, and their connected
two-body correlations $A_{2,2}(x)$ and $A_{3,3}(x)$.
We use a Wick-like treatment to express the higher order connected correlation functions
and averages through the quantities mentioned above. If n = 2p, any $A^\alpha(x)$ is assigned to
belong to a pair $\propto A_2(x)$, then considered as a free field denoted S(x). So each monomial
is replaced by a sum over all such pairings, and each of the p pairs is subject to the
substitution $W_2$,
$$ W_2:\quad A^\alpha A^\beta \to \frac{1}{4}\,\delta_{\alpha\beta}\, S(x), \tag{A.13} $$
leading to a monomial of degree p in S(x). If n = 2p + 1, one first performs the p - 1
possible substitutions $W_2$ (the result after p substitutions would transform as an octet
under SU(3) and thus vanishes), which yield a monomial necessarily proportional to
$S^{p-1}(x)\,\mathrm{tr}\,A^3(x)$. There we apply the substitution $W_3$,
$$ W_3:\quad \mathrm{tr}\,A^3(x) \to P(x), \tag{A.14} $$
where P(x) is also considered as a free field.

Once the local operators have been expressed in terms of S(x) and P(x), any average is
obtained by using
$$ \langle S(x) \rangle = A_2, \tag{A.15} $$
$$ \langle S(x)S(0) \rangle = A_{2,2}(x) + A_2^2, \tag{A.16} $$
$$ \langle P(x)S(0) \rangle = 0, \tag{A.17} $$
$$ \langle P(x) \rangle = 0, \tag{A.18} $$
$$ \langle P(x)P(0) \rangle = A_{3,3}(x). \tag{A.19} $$
These rules generalize the way in which we computed averages involving $A_4(x)$ in Section 4.
Let us now detail calculations involving $A_5(x)$.
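From Eqs. (A.3), (A.8), (A.9), we have
$$ \frac{6}{5} A_5(x) = \sum_{\alpha,\beta,\gamma,\delta,\epsilon} A^\alpha A^\beta\, \mathrm{tr}(\lambda^\alpha \lambda^\beta)\; A^\gamma A^\delta A^\epsilon\, \mathrm{tr}(\lambda^\gamma \lambda^\delta \lambda^\epsilon). \tag{A.20} $$
We apply rule $W_2$ to the right-hand side. The contraction of $\alpha$ with $\beta$ produces S(x)P(x)
once. Using Eqs. (A.11), (A.12), (A.8), (A.9), one finds that the 6 $W_2$-contractions of either
$\alpha$ or $\beta$ with either one of the three other indices contribute each the same amount
$$ \frac{1}{4} \cdot \frac{1}{2}\, S(x)\, \mathrm{tr}\,A^3(x), \tag{A.21} $$
that is, from rule $W_3$,
$$ \frac{1}{8}\, S(x)P(x). \tag{A.22} $$
We thus arrive at the substitution
$$ A_5(x) \to \frac{5}{6}\left(1 + \frac{6}{8}\right) S(x)P(x) = \frac{35}{24}\, S(x)P(x), \tag{A.23} $$
which we perform in the two-point correlations $A_{3,5}(x) \equiv \langle A_3(x)A_5(0) \rangle$ and
$A_{5,5}(x) \equiv \langle A_5(x)A_5(0) \rangle$ to get
$$ A_{3,5}(x) = \frac{35}{24}\, \langle P(x)P(0)S(0) \rangle, \tag{A.24} $$
$$ A_{5,5}(x) = \left(\frac{35}{24}\right)^2 \langle P(x)P(0)S(x)S(0) \rangle. \tag{A.25} $$
According to Eqs. (A.15)-(A.19), these averages are given by
$$ A_{3,5}(x) = \frac{35}{24}\, A_2 A_{3,3}(x), \tag{A.26} $$
$$ A_{5,5}(x) = \left(\frac{35}{24}\right)^2 A_{3,3}(x)\left[ A_2^2 + A_{2,2}(x) \right]. \tag{A.27} $$
As a last application, we derive the value of $A_{3,3}(0) \equiv \langle A_3^2(x) \rangle$. By definition
$$ A_3^2(x) = \sum_{\alpha,\beta,\gamma,\delta,\epsilon,\zeta} A^\alpha A^\beta A^\gamma\, \mathrm{tr}(\lambda^\alpha \lambda^\beta \lambda^\gamma)\; A^\delta A^\epsilon A^\zeta\, \mathrm{tr}(\lambda^\delta \lambda^\epsilon \lambda^\zeta), \tag{A.28} $$
where all the fields are taken at the same point x. Applying all possible $W_2$ substitutions
and using Eqs. (A.11), (A.12) leads to the substitution
$$ A_3^2(x) \to \frac{5 S^3(x)}{64}, \tag{A.29} $$
and averaging via Eq. (A.15) provides the final result (29).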
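The coefficient 5/64 can be checked independently by an explicit sum over colour indices. The Python sketch below builds the symmetric tensor d of (A.7) from the Gell-Mann matrices divided by two, which is one basis satisfying the normalization (A.6) and is an assumption of this illustration. It then writes $\mathrm{tr}\,A^3$ as in (A.9), enumerates all pairings of the six fields in (A.28), and applies the rule $W_2$ with S replaced by $A_2 = 1$. All function names are hypothetical.

```python
import math
from itertools import product

SQRT3 = math.sqrt(3.0)
I = 1j

# Gell-Mann matrices; T^a = lambda^a / 2 satisfies tr(T^a T^b) = delta_ab / 2, as in (A.6).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -I, 0], [I, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -I], [0, 0, 0], [I, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -I], [0, I, 0]],
    [[1 / SQRT3, 0, 0], [0, 1 / SQRT3, 0], [0, 0, -2 / SQRT3]],
]
T = [[[entry / 2 for entry in row] for row in m] for m in GELL_MANN]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def trace(m):
    return sum(m[i][i] for i in range(3))


def d_tensor():
    """d_abc = 2 tr({T^a, T^b} T^c), obtained by tracing (A.7) against T^c."""
    d = [[[0.0] * 8 for _ in range(8)] for _ in range(8)]
    for a in range(8):
        for b in range(8):
            ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
            anti = [[ab[i][j] + ba[i][j] for j in range(3)] for i in range(3)]
            for c in range(8):
                d[a][b][c] = (2 * trace(matmul(anti, T[c]))).real
    return d


def pairings(items):
    """Yield every way of grouping the items into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def a33_at_zero_coefficient():
    """Coefficient of A_2^3 in the average of (tr A^3)^2 under rule W_2."""
    d = d_tensor()
    propagator = 0.25  # rule W_2 with S(x) replaced by A_2 = 1
    total = 0.0
    for pairing in pairings(list(range(6))):
        # Each contracted pair shares one colour value; sum over the three values.
        for values in product(range(8), repeat=3):
            idx = [0] * 6
            for (i, j), v in zip(pairing, values):
                idx[i] = idx[j] = v
            total += d[idx[0]][idx[1]][idx[2]] * d[idx[3]][idx[4]][idx[5]]
    return total * propagator ** 3 / 16  # factor (1/4)^2 from (A.9) used twice


print(a33_at_zero_coefficient(), 5 / 64)
```

Pairings inside a single trace drop out because d vanishes when two of its indices are summed, so only the six pairings that connect the two traces contribute.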
References
[1] P. Ginsparg, Nucl. Phys. B 170 (1980) 388.
[2] T. Appelquist, R. Pisarski, Phys. Rev. D 23 (1981) 2305.
[3] S. Nadkarni, Phys. Rev. D 27 (1983) 917;
S. Nadkarni, Phys. Rev. D 38 (1988) 3287;
S. Nadkarni, Phys. Rev. Lett. 60 (1988) 491.
[4] N.P. Landsman, Nucl. Phys. B 322 (1989) 498.
[5] T. Reisz, Z. Phys. C 53 (1992) 169.
[6] P. Lacock, D.E. Miller, T. Reisz, Nucl. Phys. B 369 (1992) 501.
[7] L. Kärkkäinen, P. Lacock, D.E. Miller, B. Petersson, T. Reisz, Phys. Lett. B 282 (1992) 121.
[8] L. Kärkkäinen, P. Lacock, B. Petersson, T. Reisz, Nucl. Phys. B 395 (1993) 733.
[9] K. Kajantie, M. Laine, K. Rummukainen, M. Shaposhnikov, Nucl. Phys. B 503 (1997) 357.
[10] O. Philipsen, Static correlation lengths in QCD at high temperature and finite density, hep-lat/0011019.
[11] P. Bialas, A. Morel, B. Petersson, K. Petrov, T. Reisz, High temperature 3D QCD: dimensional
reduction at work, Nucl. Phys. B 581 (2000) 477.
[12] P. Arnold, L.G. Yaffe, Phys. Rev. D 52 (1995) 7208.
[13] L. Kärkkäinen, P. Lacock, D.E. Miller, B. Petersson, T. Reisz, Nucl. Phys. B 418 (1994) 3.
[14] T. Reisz, Dimensionally reduced SU(2) Yang-Mills theory is confined, in: B. Geyer, E.M. Ilgenfritz (Eds.), Quantum Field Theoretical Aspects of High Energy Physics, Frankenhausen,
1993, pp. 230-235.
[15] K. Kajantie, M. Laine, J. Peisa, A. Rajantie, K. Rummukainen, M. Shaposhnikov, Phys. Rev.
Lett. 79 (1997) 3130.
[16] K. Kajantie, M. Laine, A. Rajantie, K. Rummukainen, M. Tsypin, JHEP 9811 (1998) 11.
[17] F. Karsch, M. Oevers, P. Petreczky, Phys. Lett. B 442 (1998) 291.
[18] S. Datta, S. Gupta, Nucl. Phys. B 534 (1998) 392;
S. Datta, S. Gupta, Phys. Lett. B 471 (2000) 382.
[19] A. Hart, O. Philipsen, J.D. Stack, M. Teper, Phys. Lett. B 396 (1997) 217.
[20] A. Hart, O. Philipsen, Nucl. Phys. B 572 (2000) 243.
[21] C. Legeland, Aspects of (2 + 1) Dimensional Lattice Gauge Theory, Ph.D. Thesis, University
of Bielefeld, Germany, September 1998.
[22] G. Parisi, Statistical Field Theory, Addison-Wesley, New York, 1988.
[23] M. Alford, M. Gleiser, Phys. Rev. D 48 (1993) 2838;
J. Borrill, M. Gleiser, Nucl. Phys. B 483 (1997) 416.
[24] E. D'Hoker, Nucl. Phys. B 201 (1982) 401.
[25] W. Buchmüller, O. Philipsen, Phys. Lett. B 397 (1997) 112.
[26] R. Pisarski, Quark-gluon plasma as a condensate of SU(3) Wilson lines, hep-ph/0006205;
K. Kajantie, M. Laine, J. Peisa, A. Rajantie, K. Rummukainen, M. Shaposhnikov, Phys. Rev.
Lett. 79 (1997) 3130.

The non-abelian Born-Infeld action at order $F^6$

Alexander Sevrin a , Jan Troost b , Walter Troost c
a Theoretische Natuurkunde, Vrije Universiteit Brussel Pleinlaan 2, B-1050 Brussels, Belgium
b Center for Theoretical Physics, MIT, 77 Mass Ave, Cambridge, MA 02139, USA
c Instituut voor Theoretische Fysica, Katholieke Universiteit Leuven, Celestijnenlaan 200D,
B-3001 Leuven, Belgium

Abstract
To gain insight into the non-abelian Born-Infeld (NBI) action, we study coinciding D-branes
wrapped on tori, and turn on magnetic fields on their worldvolume. We then compare predictions
for the spectrum of open strings stretching between these D-branes, from perturbative string theory
and from the effective NBI action. Under some plausible assumptions, we find corrections to the
Str-prescription for the NBI action at order $F^6$. In the process we give a way to classify terms in the
NBI action that can be written in terms of field strengths only, in terms of permutation group theory.
© 2001 Elsevier Science B.V. All rights reserved.
PACS: 12.60.Jv
1. Introduction
Consider a flat D-brane in type II string theory. The bosonic massless degrees of freedom
of an open string ending on the D-brane are a U(1) gauge field, associated to excitations
of the string longitudinal to the brane, and neutral scalar fields, associated to transverse
excitations of the brane. The effective action for these massless degrees of freedom for
slowly varying field strengths is known to all orders in the string length $\sqrt{\alpha'}$. It is the
Born-Infeld action: 1
$$ S = -T_p \int d^{p+1}\sigma\, \sqrt{-\det\left(\eta_{\mu\nu} + 2\pi\alpha' F_{\mu\nu}\right)}. \tag{1} $$
Expanding this action in the field strength, we obtain a Maxwell action with higher order
corrections in F.
E-mail address: asevrin@tena4.vub.ac.be (A. Sevrin).
1 The Dp-brane tension we denote $T_p$, and $\mu, \nu \in \{0, 1, \ldots, p\}$. We choose the static gauge and we leave out the
transverse scalars for reasons to be explained below.
PII: S 0 5 5 0 - 3 2 1 3 ( 0 1 ) 0 0 1 0 4 - 3
A. Sevrin et al. / Nuclear Physics B 603 (2001) 389-412
When N D-branes coincide, the massless degrees of freedom of open strings beginning
and ending on them are a U (N) gauge field, and a number of scalar fields in the adjoint
of the gauge group. The extra degrees of freedom come from strings stretching from one
D-brane to another that become massless when these D-branes coincide. A problem that
seems to appear naturally by analogy with the abelian case, is to write down the effective
action for these massless degrees of freedom, for slowly varying field strengths. In fact, we
know the first terms of such a non-abelian Born-Infeld (NBI) action exactly. From string
scattering amplitudes [1] and a three-loop beta-function calculation [2], we know that the
expansion of the NBI Lagrangian in powers of the field strengths begins with: 2
$$ \mathcal{L} = \mathrm{Tr}\Big( \tfrac{1}{4} F^{\mu_1\mu_2}F_{\mu_2\mu_1} + \tfrac{1}{24} F^{\mu_1\mu_2}F_{\mu_2\mu_3}F^{\mu_3\mu_4}F_{\mu_4\mu_1} + \tfrac{1}{12} F^{\mu_1\mu_2}F^{\mu_3\mu_4}F_{\mu_2\mu_3}F_{\mu_4\mu_1} $$
$$ \qquad - \tfrac{1}{48} F^{\mu_1\mu_2}F_{\mu_2\mu_1}F^{\nu_1\nu_2}F_{\nu_2\nu_1} - \tfrac{1}{96} F^{\mu_1\mu_2}F^{\nu_1\nu_2}F_{\mu_2\mu_1}F_{\nu_2\nu_1} + O(F^6) \Big). \tag{2} $$
Through this order, this coincides with the expansion of the symmetrized trace action [3]:
$$ S = -\int d^{p+1}\sigma\; \mathrm{Str}\, \sqrt{-\det\left(\eta_{\mu\nu} + F_{\mu\nu}\right)}, \tag{3} $$
where the prescription is to formally expand the square root and the determinant in F first,
then to symmetrize over all orderings of the field strength factors, and finally to perform
the trace.
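In the abelian limit all orderings coincide, so the coefficients of the expansion above must add up to those of the Born-Infeld square root: 1/24 + 1/12 = 1/8 for the single-trace quartic structure and 1/48 + 1/96 = 1/32 for the double-trace one. The Python sketch below checks this numerically for a small random-looking field strength. It works in Euclidean signature with $2\pi\alpha' = 1$ and compares $\sqrt{\det(1+F)}$ with its expansion through fourth order; the signature, the overall sign and the dropped constant are simplifying assumptions of the illustration, and the function names are hypothetical.

```python
import math


def det(m):
    """Determinant by cofactor expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for col in range(len(m)):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def antisymmetric_4x4(f01, f02, f03, f12, f13, f23):
    """Build an antisymmetric 4x4 field strength from its six components."""
    upper = {(0, 1): f01, (0, 2): f02, (0, 3): f03, (1, 2): f12, (1, 3): f13, (2, 3): f23}
    f = [[0.0] * 4 for _ in range(4)]
    for (i, j), value in upper.items():
        f[i][j] = value
        f[j][i] = -value
    return f


def born_infeld_exact(f):
    n = len(f)
    one_plus_f = [[(1.0 if i == j else 0.0) + f[i][j] for j in range(n)] for i in range(n)]
    return math.sqrt(det(one_plus_f))


def born_infeld_order2(f):
    return 1.0 - trace(matmul(f, f)) / 4


def born_infeld_order4(f):
    f2 = matmul(f, f)
    t2, t4 = trace(f2), trace(matmul(f2, f2))
    # 1/8 = 1/24 + 1/12 and 1/32 = 1/48 + 1/96, the abelian sums of the quartic coefficients.
    return 1.0 - t2 / 4 - t4 / 8 + t2 ** 2 / 32


scale = 0.01
F = antisymmetric_4x4(*(scale * c for c in (1.0, -2.0, 3.0, 0.5, -1.5, 2.0)))
residual_order2 = abs(born_infeld_exact(F) - born_infeld_order2(F))
residual_order4 = abs(born_infeld_exact(F) - born_infeld_order4(F))
print(residual_order2, residual_order4)
```

The residual drops by orders of magnitude once the quartic terms are included, as expected if the remaining mismatch starts at order $F^6$.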
There is some ambiguity in the expression of the NBI action in terms of field strengths
and their covariant derivatives, since $[D_\mu, D_\nu]F_{\rho\sigma} = i[F_{\mu\nu}, F_{\rho\sigma}]$. One could rewrite
expression (2) by assembling the second with the third term (and the fourth with the fifth)
at the cost of introducing extra [D, D]F F F terms. The all order proof [3] of the symmetric
trace formula is only claimed to be valid up to this type of terms, and therefore pertains
only to the sum of the coefficients of the second and third terms (and likewise for the fourth
and fifth), which in fact also follows from specialising to the abelian case. Nevertheless, it
is remarkable that, to fourth order, the symmetric trace gives the complete expression for
the superstring (though not for the bosonic string), and thus it deserves to be investigated
in detail at higher orders. In this paper, we will embed the symmetric trace hypothesis
into a more general action. Since we are approximating the NBI at string tree level, we do
keep the restriction of considering only an overall trace in the fundamental over the gauge
group factors. Expanding for slowly varying field strengths is admittedly ambiguous,
and an unambiguous order would add the number of F s to twice the number of Ds. We
will not include the most general possibility, but limit ourselves to a subset adapted to the
exploratory program that we propose in Section 2: all terms where the covariant derivatives
occur in antisymmetric combinations, and can therefore be written purely in terms of the
field strengths, are included in our analysis, but symmetric derivative combinations are
not. In other words, we adopt here the definition that acceleration terms are expressed as
symmetrized products of covariant derivatives.
2 We put $2\pi\alpha' = 1$ from now on, and ignore the overall factor $T_p$ and an additive constant.
A direct calculation of the $F^6$ terms would imply the study of a 6-gluon open string
amplitude or a 5-loop $\beta$-function. Both are technically very involved. In what follows we
develop a simpler approach, which allows us to determine the $F^6$ term to a large extent.
2. Wrapped D-branes and the NBI action

2.1. Magnetic field strengths on tori
In this section, we map out our testing ground for any proposal for the NBI action.
Consider N coinciding D2n-branes, wrapped around a $T^{2n}$ torus. Switch on constant
magnetic fields in the Cartan subalgebra (CSA) of the U(N) gauge group. These
correspond to embedded D-branes of lower dimension. Choose the magnetic fields to be
block-diagonal in the Lorentz indices, for simplicity. The plan [4] is now to compare the
spectrum for small fluctuations around this background as predicted by string theory, with
the spectrum predicted by the proposed non-abelian Born-Infeld. Since we only want to
consider the (originally) massless degrees of freedom of the open string, we decouple the
massive modes by sending $\alpha' \to 0$. To maintain the relevance of the non-linear corrections
to Yang-Mills theory prescribed by the NBI, we crank up the magnetic field to keep $\alpha' F$
constant.
2.2. Perturbative string theory spectrum
To write down the spectrum for the low-lying modes predicted by perturbative string
theory, we need some notation. Suppose we restrict to the situation in which we have only
2 D2n-branes. 3 Since the magnetic background is in a CSA, we can diagonalize it, and
associate a magnetic field strength to each of the two branes, $F^{(1)}_{2i-1,2i}$ and $F^{(2)}_{2i-1,2i}$. We
chose the background to be block-diagonal in the Lorentz indices. T-dualizing along the
2, 4, ..., 2n directions, we end up with two Dn-branes at angles given by:
$$ \tan \gamma_i^{(n)} = F^{(n)}_{2i-1,2i}. \tag{4} $$
Then the modes of the open string connecting the two Dn-branes, which correspond to
the off-diagonal gauge field modes in the directions $2k-1, 2k$, $k \in \{1, \ldots, n\}$, have a
spectrum: 4
$$ M_k^2 = \sum_{i=1}^{n} (2m_i + 1)\,\epsilon_i \pm 2\epsilon_k, \tag{5} $$
$$ \epsilon_i = \gamma_i^{(1)} - \gamma_i^{(2)}. \tag{6} $$
The details of how to compute this spectrum can be found in [7] and some handy formulas
are in [5].
3 We will do this throughout this paper. The rationale is that perturbative string theory as well as the linear
analysis we perform is only sensitive to the interactions between each pair of D-branes [5].
4 The modes of the scalar fields and the fermions have a similar spectrum, and we do not expect them to provide
any additional information [5,6].
2.3. Yang-Mills analysis

To gain some intuition for how this spectrum comes about and to prepare for
the treatment in the case of the effective action, we take a look at the Yang-Mills
approximation to the problem. Consider then the Yang-Mills truncation of the non-abelian
Born-Infeld action. We can study the same background as before, and determine the
spectrum of the fluctuations around the background in this approximation. This was done
in full detail in [6,8]. The result is:
$$ M_k^2 = \sum_{i=1}^{n} (2m_i + 1)\left( F^{(1)}_{2i-1,2i} - F^{(2)}_{2i-1,2i} \right) \pm 2\left( F^{(1)}_{2k-1,2k} - F^{(2)}_{2k-1,2k} \right), \tag{7} $$
where, for convenience, we chose $F^{(1)}_{2i-1,2i} > F^{(2)}_{2i-1,2i}$. It is clear that for small field
strengths the string spectrum (5) reduces to the Yang-Mills spectrum (7), as expected.
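The statement that (5) reduces to (7) for small fields follows from $\arctan F \approx F$. The Python sketch below evaluates both spectra for given field strengths on the two branes, with $2\pi\alpha' = 1$. The function names, the argument layout (lists indexed by the two-plane i) and the explicit `sign` argument for the spin-dependent term are choices of this illustration, not notation taken from the paper.

```python
import math


def yang_mills_mass_squared(f1, f2, m, k, sign=+1):
    """Eq. (7): Landau levels spaced by the field strength difference on the two branes."""
    levels = sum((2 * m_i + 1) * (a - b) for m_i, a, b in zip(m, f1, f2))
    return levels + sign * 2 * (f1[k] - f2[k])


def string_mass_squared(f1, f2, m, k, sign=+1):
    """Eqs. (4)-(6): same structure with the field difference replaced by the angle difference."""
    eps = [math.atan(a) - math.atan(b) for a, b in zip(f1, f2)]
    levels = sum((2 * m_i + 1) * e for m_i, e in zip(m, eps))
    return levels + sign * 2 * eps[k]


# Weak fields on a four-torus (two two-planes): the spectra nearly coincide.
weak_f1, weak_f2 = [0.02, 0.03], [0.01, 0.01]
ym_weak = yang_mills_mass_squared(weak_f1, weak_f2, [0, 0], 0)
st_weak = string_mass_squared(weak_f1, weak_f2, [0, 0], 0)

# Strong fields on a two-torus: the string spectrum is compressed by the arctangent.
ym_strong = yang_mills_mass_squared([2.0], [-2.0], [0], 0)
st_strong = string_mass_squared([2.0], [-2.0], [0], 0)
print(ym_weak, st_weak, ym_strong, st_strong)
```

For the weak-field example the two results differ only at relative order $F^2$, whereas for fields of order one the difference is large, which is the non-linearity that the NBI action is required to reproduce.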
The Yang-Mills spectrum can be argued for as follows. An endpoint of a string
ending on one of these D-branes behaves as an electric charge in a magnetic field.
The corresponding Landau problem has a harmonic oscillator spectrum with frequency
proportional to the magnetic field. The other endpoint of the string acts as a particle with
the opposite charge. This makes intuitive the fact that for the global motion of the string,
the difference between the field strengths on the two branes acts as spacing of the energy
levels. The zero-point energy, moreover, can be attributed to a Zeeman splitting of the
energy levels due to the fact that different combinations of the gauge field in directions
$2k-1, 2k$ have spin $\pm 1$ under the SO(2) associated to these directions.
2.4. String theory as rescaled YM

String theory adds a non-linearity to this spectrum that can for instance be intuitively
understood in the T-dual picture, where magnetic fields are interchanged for rotated branes.
(See [4] and [5], for instance.) For our purposes, the important observation is that the string
spectrum is merely a rescaled Yang-Mills spectrum. Denoting
$$ f_i^0 = \frac{1}{2}\left( F^{(1)}_{2i-1,2i} + F^{(2)}_{2i-1,2i} \right), \tag{8} $$
$$ f_i^3 = \frac{1}{2}\left( F^{(1)}_{2i-1,2i} - F^{(2)}_{2i-1,2i} \right), \tag{9} $$
the spectrum is rescaled by a factor
$$ \alpha_i^2 = \frac{\epsilon_i}{2f_i^3} = \frac{1}{2f_i^3} \arctan\left( \frac{2f_i^3}{1 + (f_i^0)^2 - (f_i^3)^2} \right) \tag{10} $$
for field strength fluctuations in directions $2i-1, 2i$.

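Two properties of the rescaling factor (10) are used repeatedly: it tends to one for weak fields, and it is even under a simultaneous sign flip of the background fields, so that only even powers of the field strength appear in its expansion. The Python sketch below evaluates (10), compares it with the difference of the two brane angles, and checks the leading expansion $1 - (f^0)^2 - (f^3)^2/3$, which was obtained here by expanding the arctangent and is not quoted from the paper. Function names are hypothetical, and the arctangent form is used only on its principal branch.

```python
import math


def rescaling_factor(f0, f3):
    """Eq. (10), valid on the principal branch (1 + f0^2 - f3^2 > 0)."""
    return math.atan(2 * f3 / (1 + f0**2 - f3**2)) / (2 * f3)


def rescaling_factor_from_angles(f0, f3):
    """Same quantity from the angle difference of branes with fields f0 + f3 and f0 - f3."""
    return (math.atan(f0 + f3) - math.atan(f0 - f3)) / (2 * f3)


def rescaling_factor_order2(f0, f3):
    """Leading weak-field expansion of Eq. (10), through second order in the fields."""
    return 1 - f0**2 - f3**2 / 3


print(rescaling_factor(0.3, 0.2), rescaling_factor_from_angles(0.3, 0.2))
print(rescaling_factor(0.01, 0.02), rescaling_factor_order2(0.01, 0.02))
```

The first deviation from one is quadratic in the fields, which is why the mismatch between the Yang-Mills and string spectra first constrains the quartic terms of the Lagrangian.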

A clear-cut question is then whether a proposal for the NBI action reproduces
this rescaled Yang-Mills spectrum predicted by perturbative string theory. This was
investigated in detail for the Str-prescription in [5] (expanding on the initial explorations
in [4] and [9]). For the simplest case, on $T^2$, the symmetrized trace prescription yielded a
spectrum with the same structure as the Yang-Mills spectrum, but with incorrect spacings.
The disagreement shows up from third order on, confirming the veracity of the $F^2$ and $F^4$
terms. This clearly demonstrates that the Str-prescription is too crude an approximation to
the NBI to yield the correct mass spectrum on our testing ground. For $T^4$, the situation
remained unclear since the complete spectrum predicted by the Str-action remained
undetermined. For BPS configurations on $T^4$, the Str-action reproduces precisely the right
spectrum, but for other settings, it seems highly unlikely that the Str-prescription would
lead to the correct results. On $T^6$, the Str would probably not yield the right spectrum even
for BPS configurations [5].
The spectrum as predicted by string theory is a rescaled Yang-Mills spectrum, compare
Eqs. (5) and (7). Therefore, we will assume that the action relevant for this physical
situation should yield a rescaled Yang-Mills action for the fluctuations, meaning that it
can be brought back to a Yang-Mills action by a suitable coordinate transformation. This
is certainly the simplest and perhaps the most natural way to reproduce the desired string
theory results. In these circumstances it seems less natural to allow in the Lagrangian terms
containing derivatives that cannot be written as combinations of field strengths. Not
least, they would make it much more difficult in practice to obtain results for the spectrum,
since one would be trying to diagonalize higher order operators. As indicated before, to
obtain this rescaled Yang-Mills action we do include terms in the action corresponding
to all possible orderings and Lorentz contractions of field strengths. There might be an
a posteriori justification for this approach, if one could prove that for the fluctuation
eigenfunctions (they can explicitly be written down in terms of theta-functions as in
[6]) other kinds of derivative terms are suppressed.
From the formula for the rescaling factor, we expect only terms in the Lagrangian with
an even number of field strengths to contribute in our backgrounds. For this reason, we do
not consider terms with an odd number of field strengths.
2.5. BPS conditions
As already pointed out in [5], the translation of the BPS conditions in string theory in
terms of the background field strength in the effective action might provide an additional
handle on the NBI action. Concretely, in Section 6 we will investigate what constraints
on the NBI follow from the demand that self-dual configurations on $T^4$ should solve the
equations of motion.
These constraints on the action are a priori independent from the ones obtained from
the analysis along the line discussed in the previous subsection. They turn out to provide
an independent check on some of the results obtained with the rescaled YM program, and
also to give additional constraints on the NBI action.
3. The NBI at order $F^4$
We start by carrying out the program proposed above at the first non-trivial level, the $F^4$
terms in the non-abelian Born-Infeld. This will serve to illustrate the method we use in a
simple setting. Moreover, it will turn out that the straightforward spectral analysis, under
the assumptions we make, is able to replace a four-point function computation in open
string theory, or a three-loop beta-function computation in a non-linear $\sigma$-model approach,
demonstrating the power of our method.
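The most general Lagrangian we can write down under the stated restrictions is then:
$$ \mathcal{L} = \mathrm{Tr}\Big( a_1^2\, F_{\mu_1\mu_2}F^{\mu_2\mu_1} + a_1^4\, F_{\mu_1\mu_2}F^{\mu_2\mu_3}F_{\mu_3\mu_4}F^{\mu_4\mu_1} + a_2^4\, F_{\mu_1\mu_2}F_{\mu_3\mu_4}F^{\mu_2\mu_3}F^{\mu_4\mu_1} $$
$$ \qquad + a_1^{2,2}\, F_{\mu_1\mu_2}F^{\mu_2\mu_1}F_{\nu_1\nu_2}F^{\nu_2\nu_1} + a_2^{2,2}\, F_{\mu_1\mu_2}F_{\nu_1\nu_2}F^{\mu_2\mu_1}F^{\nu_2\nu_1} \Big). \tag{11} $$
At this low order, it is easy to check that these are indeed the only linearly independent
terms. At higher order the analysis becomes less transparent. In Section 4, we will therefore
introduce a diagrammatic representation for these terms.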
The symmetric trace prescription would relate the coefficients in Eq. (11) by $a_2^4 = 2a_1^4 =
-4a_1^{2,2} = -8a_2^{2,2}$ and the determinant formula sets this equal to $a_1^2/3$. Let us see how this
result comes about by imposing the correspondence of the spectrum with Eq. (7) with
the rescaling factor (10). First of all, we demand that the abelian action be reproduced
if we restrict to a U(1) subgroup. The 3 constraints this yields on the coefficients are
easy to determine and they are listed in Appendix C, Eq. (C.2). Next, we determine the
action quadratic in off-diagonal gauge field fluctuations, in a background block-diagonal in
the Lorentz indices. We restrict to a U(2) subgroup since we always work with 2 branes
only. The action for the quadratic fluctuations in this background is given in Appendix D,
Eqs. (D.8) and (D.9), for second and fourth order respectively. Its structure is as follows:

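\[
\begin{aligned}
\mathcal{L}_{(2,4)} ={}& c^{\mathrm{kin}}_{i}(f,a)\Bigl[\bigl(\delta_1 F^{(a)}_{0,2i-1}\bigr)^2 + \bigl(\delta_1 F^{(a)}_{0,2i}\bigr)^2\Bigr] \\
& - c_{i}(f,a)\bigl(\delta_1 F^{(a)}_{2i-1,2i}\bigr)^2 \\
& - \frac{1}{2}\sum_{i\neq j} c_{ij}(f,a)\Bigl[\bigl(\delta_1 F^{(a)}_{2i-1,2j-1}\bigr)^2 + \bigl(\delta_1 F^{(a)}_{2i-1,2j}\bigr)^2 + \bigl(\delta_1 F^{(a)}_{2i,2j-1}\bigr)^2 + \bigl(\delta_1 F^{(a)}_{2i,2j}\bigr)^2\Bigr] \\
& + c^{\mathrm{nym}}_{ij}(f,a)\,\delta_1 F^{(a)}_{2i-1,2i}\,\delta_1 F^{(a)}_{2j-1,2j} \\
& - 2c^{\mathrm{quad}}_{i}(f,a)\,\delta_2 F^{(3)}_{2i-1,2i}\, f^{3}_{i},
\end{aligned} \tag{12}
\]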
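where (see Appendix D) in the c(·,·)-coefficients f represents the background field strength
values, a stands for the coefficients $a^n_k$ of Eq. (11),
\[ \delta_1 F_{\mu\nu} = D_\mu \delta A_\nu - D_\nu \delta A_\mu, \tag{13} \]
\[ \delta_2 F_{\mu\nu} = i[\delta A_\mu, \delta A_\nu], \tag{14} \]
\[ D_\mu = \partial_\mu + i[A_\mu, \,\cdot\,], \tag{15} \]
and the superscript (a) runs over two orthogonal non-CSA SU(2) components.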
The different lines are treated as follows:
• The first line is the kinetic term. The first step in the comparison with the Yang-Mills
action is a rescaling of the fluctuations of the gauge potentials such that the kinetic
term has the standard normalisation:
\[ \delta A_n = b_i^{-1}\,\delta a_n \quad \text{for } n \in \{2i-1, 2i\}, \tag{16} \]
\[ c^{\mathrm{kin}}_i = b_i^{2}. \tag{17} \]
• The second line represents the deformation energy of the modes in directions 2i − 1, 2i.
By a rescaling of the space coordinates,
\[ X_n = b_i \beta_i\, x_n \quad \text{for } n \in \{2i-1, 2i\}, \tag{18} \]
\[ c_i = \beta_i^{2}\, b_i^{4}, \tag{19} \]
it is brought to the standard Yang-Mills form with a rescaled background potential
\[ \tilde{a}_n = b_i \beta_i\, A_n \quad \text{for } n \in \{2i-1, 2i\}. \tag{20} \]
• In the third line, the rescalings above destroy the Yang-Mills structure unless, when
$c_{ij} \neq 0$, we have that $\beta_i = \beta_j$ (= β). This being granted, the overall factor agrees with
the Yang-Mills value provided
\[ c_{ij} = b_i^{2}\, b_j^{2}\, \beta^{2}. \tag{21} \]
• The fourth line is absent from the Yang-Mills action. In accordance with our
assumptions we put its coefficients $c^{\mathrm{nym}}_{ij}$ to zero.
• The fifth line contains the terms linear in the second order fluctuation
$\delta_2 F_{\mu\nu} = i[\delta A_\mu, \delta A_\nu]$. They have to follow the same scaling as the second line, but in fact this is
not an independent condition. If the Yang-Mills structure of the third line is imposed,
this follows from the fact that fluctuations of the background configuration that are
gauge transformations leave the action unchanged.
Additional terms arise, with the structure
\[ \delta_1 F^{(a)}_{2j,2i}\,\delta_1 F^{(a)}_{2i-1,2j-1} - \delta_1 F^{(a)}_{2j-1,2i}\,\delta_1 F^{(a)}_{2i-1,2j}. \]
Partial integration can be combined with the Lie-algebraic structure of these terms to
absorb them into the second, fourth and fifth lines.
Summarising: the non-Yang-Mills terms have to be put to zero, and then $c_i (c^{\mathrm{kin}}_i)^{-2} = \beta^2$
should be independent of i, and $c_i (c^{\mathrm{kin}}_i)^{-1}$ should equal the required i-dependent scaling factor of
Eq. (10). These demands uniquely fix (after normalizing $a^2_1 = 1/4$) all coefficients in the
action (11):
\[ a^{4}_{1} = \frac{1}{24}, \qquad a^{4}_{2} = \frac{1}{12}, \qquad a^{2,2}_{1} = -\frac{1}{48}, \qquad a^{2,2}_{2} = -\frac{1}{96}, \tag{22--25} \]
which matches the action, Eq. (2), predicted by the computation of scattering amplitudes,
a beta-function calculation, and the symmetric trace prescription.
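The agreement with the symmetric trace prescription can be illustrated with a small counting exercise. The sketch below is not the spectral computation of this section: it assumes the abelian Born-Infeld coefficients at fourth order (+1/8 for the single loop tr F^4 and −1/32 for (tr F^2)^2, in the normalisation a^2_1 = 1/4) and distributes them over the 4! orderings of the factors in the trace. The helper names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

def adjacent(p, q, n=4):
    """True if trace positions p and q are neighbours on the cyclic trace."""
    return (p - q) % n in (1, n - 1)

def ordered_fraction(edges):
    """Fraction of the 4! orderings of the four factors in which every Lorentz
    contraction (edge between two factors) joins cyclically adjacent positions."""
    perms = list(permutations(range(4)))          # pos[a] = trace position of factor a
    good = sum(all(adjacent(pos[a], pos[b]) for a, b in edges) for pos in perms)
    return Fraction(good, len(perms))

# Assumed abelian Born-Infeld coefficients at order F^4 (with a^2_1 = 1/4).
C_LOOP = Fraction(1, 8)      # coefficient of tr F^4
C_PAIRS = Fraction(-1, 32)   # coefficient of (tr F^2)^2

loop_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]    # one loop through all four F's
pair_edges = [(0, 1), (2, 3)]                    # two separate two-loops

w_square = ordered_fraction(loop_edges)      # orderings giving the square diagram
w_parallel = ordered_fraction(pair_edges)    # orderings giving two parallel lines

a4_1 = C_LOOP * w_square             # square
a4_2 = C_LOOP * (1 - w_square)       # diabolo (all remaining orderings)
a22_1 = C_PAIRS * w_parallel         # two parallel lines
a22_2 = C_PAIRS * (1 - w_parallel)   # cross (all remaining orderings)

print(a4_1, a4_2, a22_1, a22_2)
```

Under these assumptions the four numbers come out in the ratio 2 : 4 : −1 : −1/2, the same ratio as the symmetric trace relation quoted before Eq. (12).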
4. Group theory and contractions

4.1. Diagrams
The implementation of our program at order 4 in the previous section starts from the
most general action consisting of terms that could be written in terms of field strengths
alone. This action is easy to write down in low orders, but at higher order, a more systematic
approach is called for. In this section we will describe an attempt to bring some systematics
into the classification of the different terms at order 2, 4, and 6 by using permutation group
theory. The explicit examples will be taken from order 6, and we will also give results at
order 4, but the scheme carries over to all orders.
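Let us consider some typical terms in the action at order 6:
\[ \mathrm{Tr}\, F_{\mu_1\mu_2}F^{\mu_2\mu_3}F_{\mu_3\mu_4}F^{\mu_4\mu_1}F_{\nu_1\nu_2}F^{\nu_2\nu_1}, \tag{26} \]
\[ \mathrm{Tr}\, F_{\mu_1\mu_2}F_{\mu_3\mu_4}F_{\nu_1\nu_2}F^{\mu_2\mu_3}F^{\mu_4\mu_1}F^{\nu_2\nu_1}. \tag{27} \]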
The interplay between the Lorentz index contractions and the group theory trace can be
encoded in different ways. A pictorial way is to associate a diagram to each such term, by
drawing points on the corners of a regular hexagon, indicating the position of the F -factors
in the trace, and lines (with arrows, which will however soon be dropped) connecting the
different points, indicating the Lorentz contractions. The terms given in Eqs. (26), (27) are
then represented by Fig. 1, where the leftmost F in the trace is represented by the upper
left corner of the hexagon.
An alternative description, geared towards the permutation group considerations that
follow, goes as follows. Label the first index on the 6 field strengths from 1 to 6. Then the
sequence of indices in the second position is a permutation of the first index. We will denote
this permutation i(·), and use it to label the diagram. The permutations corresponding to
(26), (27) are (1234)(56) and (1425)(36) in cycle notation. 5 Obviously, each term at
order 6 can be represented by one (or more) of the 6! = 720 possible permutations.
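The labelling rule can be made concrete with a short sketch. It assumes that each field strength is given as a pair of index labels (first index, second index) in trace order; the label strings such as "m1" and "n1" are placeholders for the Lorentz indices of the two example terms, and the function names are illustrative.

```python
def index_permutation(factors):
    """factors: list of (first_index, second_index) labels in trace order.
    Returns i as a dict on positions 1..n: i(k) is the position of the factor
    whose first index equals the second index of factor k."""
    first_pos = {first: k for k, (first, _) in enumerate(factors, start=1)}
    return {k: first_pos[second] for k, (_, second) in enumerate(factors, start=1)}

def cycles(perm):
    """Cycle notation as a list of tuples, each starting at its smallest element."""
    seen, out = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, k = [], start
        while k not in seen:
            seen.add(k)
            cyc.append(k)
            k = perm[k]
        out.append(tuple(cyc))
    return out

# Index structure of the two example terms (26) and (27).
term26 = [("m1", "m2"), ("m2", "m3"), ("m3", "m4"), ("m4", "m1"), ("n1", "n2"), ("n2", "n1")]
term27 = [("m1", "m2"), ("m3", "m4"), ("n1", "n2"), ("m2", "m3"), ("m4", "m1"), ("n2", "n1")]

cycles26 = cycles(index_permutation(term26))
cycles27 = cycles(index_permutation(term27))
print(cycles26)  # [(1, 2, 3, 4), (5, 6)]
print(cycles27)  # [(1, 4, 2, 5), (3, 6)]
```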
4.2. Conjugacy classes
The complex linear combinations of diagrams are taken as a representation space for
the permutation group of 6 elements. The action of S6 on this representation space is by
conjugation, as we now explain. The action of the permutation group consists of reshuffling
the vertices of the diagrams, which is the same as reshuffling the F's in the trace. The action
of a permutation g on the vertices becomes, after relabeling:
\[
g:\; F_{1,i(1)} \cdots F_{6,i(6)} \;\to\; F_{g^{-1}(1),\,i(g^{-1}(1))} \cdots F_{g^{-1}(6),\,i(g^{-1}(6))}
 = F_{1,\,g i g^{-1}(1)} \cdots F_{6,\,g i g^{-1}(6)}. \tag{28}
\]
Fig. 1. Diagrammatic way of representing the terms in Eqs. (26), (27).

5 I.e., for the second example, i(5) = 1, i(1) = 4, i(4) = 2, etc.
Evidently, the set of diagrams within one conjugacy class is invariant under this action.
As far as this representation of the permutation group on the diagrams is concerned, we
can study each conjugacy class separately. Each of these representations separately is in
fact a (transitive) representation by permutation of diagrams.
The arrows on the diagrams can be dropped. Two diagrams that are the same up to the
orientation of a loop are equivalent 6 since they correspond to the same term in the action
(up to an unimportant sign): reversing the arrow in a loop amounts to flipping the order of
the indices in all the field strengths connected by that loop.
4.3. An induced representation
Now we analyse the representation of the permutation group on each conjugacy class.
Consider a specific conjugacy class, choose a diagram (without the arrows) and label it
i1. The chosen diagram is invariant under a subgroup of the permutation group (acting by
conjugation as above). We call this invariance group of i1 H1. For both our examples (26),
(27) the invariance group is isomorphic to Z4 × Z2 × Z2.
It is clear that every other diagram i in the conjugacy class can be reached by the
action of some group element g, namely $i = g i_1 g^{-1}$. Every gh with h ∈ H1 yields that
same diagram i. Therefore, the set of diagrams within a conjugacy class is the same
as the set of the left cosets with respect to the invariance group (of a diagram in that
conjugacy class). The action of the group on this set of cosets is the left regular action.
This representation is the representation of S6 induced [10] by the trivial representation of H1.
Via the Frobenius character formula we can then decompose this induced representation
into irreducible ones, using the character table of S6. This decomposition provides an inroad
into the structure of the terms in the NBI, at order F^6 and potentially beyond. Note that if
we had picked a different diagram i2 in the same conjugacy class to start with, we would
have $i_2 = g i_1 g^{-1}$ for some g ∈ S6. The invariance group $H_2 = g H_1 g^{-1}$ would yield a
construction equivalent to the previous one. Therefore, $H_1 \cong H^{cc}$ is uniquely associated to
a conjugacy class (c.c.). The results for the invariance groups are summarized in Table 1, 7
for the relevant conjugacy classes. 8 The split into irreducible representations is assembled
in Table 2.
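The orders of the invariance groups, and hence the number of diagrams per conjugacy class, can be checked by brute force. The following sketch treats a diagram without arrows as the set of its contraction lines and counts the elements of S6 that leave a representative unchanged; the representatives and helper names are arbitrary illustrative choices.

```python
from itertools import permutations

def diagram(perm):
    """Undirected diagram of a permutation (tuple on 0..n-1): the set of
    contraction lines {k, perm[k]}. The orientation of loops is forgotten."""
    return frozenset(frozenset((k, perm[k])) for k in range(len(perm)))

def act(g, diag):
    """Relabel the vertices of a diagram with the permutation g (conjugation)."""
    return frozenset(frozenset(g[v] for v in line) for line in diag)

# One representative per class, in 0-based positions (arbitrary choice).
reps = {
    "[6]":     (1, 2, 3, 4, 5, 0),
    "[4 2]":   (1, 2, 3, 0, 5, 4),
    "[3 3]":   (1, 2, 0, 4, 5, 3),
    "[2 2 2]": (1, 0, 3, 2, 5, 4),
}

group = list(permutations(range(6)))     # all 720 elements of S6
orders = {}
for name, perm in reps.items():
    d = diagram(perm)
    orders[name] = sum(1 for g in group if act(g, d) == d)
    print(name, "|H| =", orders[name], " diagrams in class =", len(group) // orders[name])
```

The index 720/|H| obtained this way is the dimension of the induced representation for each class.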
6 One can easily see that this equivalence relation is compatible with the action of the group.
7 The symbol × in Table 1 does not denote a direct product. It is easy to deduce from the context how the
product of subgroups should be taken. The subgroups are ordered as follows. First the cyclic permutation within
a loop, then the permutation of loops of equal length, finally the orientation reversal. For loops of length 2, this
last group is trivial.
8 We explain shortly why other conjugacy classes are irrelevant.
Table 1
Invariance groups associated to conjugacy classes

Conjugacy class    Invariance group H^cc
[6]                Z6 × Z2
[4 2]              Z2 × Z4 × Z2
[3 3]              (Z3)^2 × S2 × (Z2)^2
[2 2 2]            (Z2)^3 × S3
Table 2
Irreducible components and double cosets

Conjugacy class    Irreducible reps                                                # invariants
[6]                60 = [6] + [2 1^4] + 2·[2^3] + 2·[4 2] + [3 1^3] + [3 2 1]      14
[4 2]              45 = [6] + [5 1] + [2^3] + 2·[4 2] + [3 2 1]                    9
[3 3]              10 = [6] + [4 2]                                                3
[2 2 2]            15 = [6] + [2^3] + [4 2]                                        5
4.4. Cyclicity and double cosets

At this stage a term in the Lagrangian corresponds to several diagrams, since the trace
is cyclic: a cyclic permutation corresponds to a rotation of the diagrams. We denote the
subgroup of S6 corresponding to these rotations as N = $Z_6^c$ (where c stands for cyclicity).
Then it should be clear that the cosets gH within the double coset NgH correspond to
equivalent diagrams. We therefore finally obtain that inequivalent diagrams correspond to
double cosets NgH.
To count these double cosets in the left regular representation on the H-cosets, it is
sufficient to count $Z_6^c$ invariants within each irreducible component of the representation.
To do that, we can use Frobenius reciprocity and the character tables for S6 and $Z_6^c$.
The results at order 6 are the following. We do not consider conjugacy classes of S6 with a
cycle of length 1, since a field strength contracted with itself yields a vanishing term
in the action. Only four conjugacy classes are then left. The number of
double cosets in each of these conjugacy classes is summarized in Table 2, along with the
decomposition into irreducible representations.
The diagrams corresponding to these invariants are drawn and labelled in Appendix A.
As we already indicated, this analysis generalizes to any order and gives therefore a
systematic way to count the number of unknown coefficients in the NBI action including
terms written using field strengths only, at any given order.
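The counting of double cosets can also be cross-checked without character tables. The sketch below enumerates all permutations without fixed points, forgets loop orientation, and counts classes of diagrams under rotation of the trace positions. It is a brute-force check, not the Frobenius reciprocity argument used in the text, and the function names are illustrative.

```python
from itertools import permutations

def cycle_type(perm):
    """Cycle lengths of a permutation (tuple on 0..n-1), largest first."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, k = 0, start
        while k not in seen:
            seen.add(k)
            k = perm[k]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def diagram(perm):
    """Undirected diagram: the set of contraction lines {k, perm[k]}."""
    return frozenset(frozenset((k, perm[k])) for k in range(len(perm)))

def rotate(diag, n):
    """Cyclic shift of the trace positions by one step."""
    return frozenset(frozenset((v + 1) % n for v in line) for line in diag)

def key(diag):
    """Sortable description that identifies a diagram uniquely."""
    return tuple(sorted(tuple(sorted(line)) for line in diag))

def count_invariants(n):
    """Per cycle type: (number of diagrams, number of rotation classes)."""
    diagrams = {}
    for perm in permutations(range(n)):
        ct = cycle_type(perm)
        if 1 not in ct:                    # skip self-contracted field strengths
            diagrams.setdefault(ct, set()).add(diagram(perm))
    counts = {}
    for ct, diags in diagrams.items():
        classes = set()
        for d in diags:
            orbit, cur = [], d
            for _ in range(n):
                orbit.append(key(cur))
                cur = rotate(cur, n)
            classes.add(min(orbit))        # canonical member of the rotation class
        counts[ct] = (len(diags), len(classes))
    return counts

for n in (4, 6):
    print(n, count_invariants(n))
```

For n = 6 this reproduces the dimensions 60, 45, 10, 15 and the 14 + 9 + 3 + 5 = 31 invariants of Table 2; for n = 4 it gives two invariants in each of the classes [4] and [2 2].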
4.5. Invariant linear combinations

In the previous analysis, we split the representation space into conjugacy classes, next
into inequivalent irreducible representations, and we determined the number of double
cosets within an irreducible representation. Now we would like to write down explicitly
these Z6c invariants in terms of the diagrams, which translates directly into terms in the
action at order 6.
For most of the irreducible representations, the number of corresponding invariants
is larger than one. Lacking a criterion to decide which linear combinations are
most suitable, we made the following arbitrary choice. Corresponding to a specific
irreducible representation of the permutation group, there is a Young diagram and a
Young symmetrizer: acting with this Young symmetrizer on a specific diagram
automatically yields a vector in that irrep. The resulting vector can then simply be symmetrised
with respect to the $Z_6^c$ cyclic group. The result of this procedure is one of the sought-after
invariants. We have recorded in Table 5 (in Appendix B) a complete 9 set of
combinations obtained in this way, together with the S6 irrep in which they are found.
Each line involves a choice of starting diagram, for which we found no good criterion
(like an a priori guarantee to give a linearly independent combination).
Alternatively, one may project on a generically reducible subspace formed by the sum of
equivalent S6 representations using the minimal projection operator [10] e(F) associated
to a specific irrep F. For example, in the [6] class the projection on the [4 2] representations
yields a reducible representation, [4 2] ⊕ [4 2]. Each of these contains two $Z_6^c$ invariants.
Acting with e(F ) on a few (arbitrarily chosen) diagrams yields vectors from this reducible
space, and it is easy to pick out specific Z6c invariants. Since this seems to offer no particular
advantages (each choice of basis for the resulting invariants seems arbitrary), we do not
dwell on this further.
We pause here a moment to return to the results obtained in Section 3. If we carry out
the group theory analysis described in the previous paragraphs at order 4, we obtain the
results in Table 3.
In the part of the NBI action purely in terms of field strengths at order four only two of
the four potential fourth-order invariants are actually present:
\[ \mathcal{L}_{\mathrm{NBI}\text{-}F^4} = \frac{1}{4} I^{2}_{2} + \frac{1}{24}\Bigl( I^{4}_{1} - \frac{1}{4} I^{22}_{1} \Bigr). \tag{29} \]
The group theory that we introduced will similarly simplify the form of the NBI action
at order 6. At this stage we performed only the first step, providing a catalogue of
combinations that are in the different irreps, as recorded in Table 5 in Appendix B. We
now proceed to impose the data from the known string spectra.
9 We have not bothered to include the results of this analysis for the terms with Lorentz contraction structure
[3 3]. The reason is that, for the backgrounds we have studied, these terms give no contribution to the quadratic
action for the fluctuation, and therefore these terms remain completely arbitrary.
Table 3
Combinations of diagrams based on the permutation group: order 4. The square diagram is
represented by i1, the diabolo is i2, the cross is i3 and i4 the two parallel lines

Conjugacy class    Irrep    Linear combination    Name
[4]                [4]      i1 + 2i2              I^4_1
[4]                [2 2]    i1 − i2               I^4_2
[2 2]              [4]      i3 + 2i4              I^22_1
[2 2]              [2 2]    i3 − i4               I^22_2
5. A NBI at order 6
5.1. Reality
A first, fairly trivial constraint on the action comes from the demand that the action
be real. The complex conjugate of a term represented by a diagram, is given by the
term corresponding to the mirror diagram. This can easily be seen using the hermiticity
of the Lie algebra generators. We conclude that diagrams that are mirror to each other
have complex conjugate coefficients. The diagrams that are mirror-symmetric have real
coefficients. 10
Note that all diagrams at order 4 were mirror-symmetric, and therefore they all
necessarily had real coefficients. This is not true at sixth order. However, it turns out that,
apart from the general structure as described for the fourth order calculation in Eq. (12), an
additional term is present at sixth order, that is off-diagonal in the SU(2) components of the
field fluctuations. The rescaled Yang-Mills requirement puts this to zero. This annihilates
the imaginary parts of the complex conjugate coefficients so that, in conclusion, also at
sixth order all coefficients are real.
5.2. String spectrum data
In Section 3 we executed our program of demanding a rescaled Yang-Mills action for
the action quadratic in the fluctuations on our testing ground. It was successful there in
determining the coefficients of the NBI at order 4 that we know to be correct. In this
section, as discussed previously, we explore which constraints are found on the NBI if we
extend this analysis to order 6.
The action for the quadratic fluctuations at order 6 has virtually the same structure as that
discussed in detail for order 4 in Section 3. We follow the same route and rescale the action
by $c^{\mathrm{kin}}_i$ and demand that the action is a rescaled YM action with appropriate rescaling
factor. The constraints from gauge invariance (see Section 3) were not imposed a priori,
but were used as a check on the computation. The result is a large set of linear equations
for the coefficients $a^{nnn}_m$ of the different terms in the action (see Eq. (A.1)). Of these, 21 are
independent, leaving 10 out of 31 (see Table 2) of the coefficients in the general sixth order
action undetermined.
Of these 10 undetermined coefficients, 3 are the coefficients of the invariants in class
[3 3]: for the background we consider these invariants give vanishing contribution as
we now argue. 11 The background (see Section 2) has block-diagonal field strengths, and
therefore the Lorentz contraction of three background fields is frustrated and vanishes. 12
Consequently, the quadratic variations could only arise when the Lorentz contraction
structure is $(F F \delta_1 F)$ times $(F F \delta_1 F)$. But this also vanishes, since the k-sum in $F_{ki} F_{jk}$
will contain only one term, and is hence diagonal in ij. We ignore these terms in the sequel,
and continue with the remaining 28 terms of Table 5 in Appendix B, 7 combinations out
of 28 having arbitrary coefficients.
10 This mirror-operation is the only group operation represented on all double cosets.
To present the result in detail, we make a change of basis. We still base our choice
of combinations of diagrams on the permutation group considerations of Section 4. We
remind the reader that in many cases, a given irrep occurs more than once, and in addition
a given irrep usually contains two invariants. For such cases, the choice of basis for specific
invariants is a priori quite arbitrary, and what was written in Table 5 is a raw choice. With
hindsight, this choice can be improved, and the result is recorded in Table 4. The following
changes were made:
• If the value of coefficients for a given representation is completely fixed, 13 this
combination was chosen as one of the basis vectors. The other basis vectors (which
therefore have zero coefficients) were taken to be orthogonal with the natural metric
for the diagrams. This is the case for the [4 2] and [3 2 1] irreps in all classes, as well
as for the [6]. Whereas this last fact is obvious (it corresponds to the abelian case),
the general reason for the other ones is unclear.
• If the values of the coefficients are fixed numbers for some, and arbitrary parameters
for other combinations, we have separated the basis accordingly. This is the case for
all the [2^3] irreps, where for each class a single combination is fixed, and for the [3 1^3]
likewise.
• The stand-alone [2 1^4] invariant (which has arbitrary coefficient as well) is not
touched.
In Table 4, we have again listed the potential cyclic invariants in the group-theoretic
classification, in the changed basis. The resulting sixth-order terms in the action are
\[
\begin{aligned}
\mathcal{L}_{(6)} ={}& \frac{1}{720} I^{6}_{6} + \frac{1}{6480} I^{6}_{42} - \frac{1}{5760} I^{6}_{321} + \frac{1}{720} I^{6}_{222} \\
& + \lambda_1 I'^{6}_{222} + \lambda_2 I''^{6}_{222} + \lambda_3 I'''^{6}_{222} + \lambda_4 I'^{6}_{3111} + \lambda_5 I^{6}_{21111} \\
& - \frac{1}{480} I^{42}_{6} + \frac{1}{3240} I^{42}_{42} - \frac{1}{11520} I^{42}_{321} - \frac{1}{360} I^{42}_{222} + \lambda_6 I'^{42}_{222} \\
& + \frac{1}{5760} I^{222}_{6} - \frac{1}{25920} I^{222}_{42} - \frac{1}{2880} I^{222}_{222} + \lambda_7 I'^{222}_{222}.
\end{aligned} \tag{30}
\]
It is clear that the resulting expression displays a remarkable amount of structure, but we
have not been able to penetrate beyond the obvious.
11 It is obvious that this argument extends to very many higher order terms with a structure that factorises with
Lorentz contractions of an odd number of field strengths.
12 The same argument eliminates terms arising from $\delta_2 F$, see Eq. (D.3).
13 We are here taking into account the requirements from reality and the rescaled Yang-Mills ansatz, not the
BPS conditions. See further for the incorporation of those.
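One part of Eq. (30) that can be checked independently is the abelian limit, in which all diagrams of a class collapse to the same term and only the [6] invariants survive. The sketch below assumes the standard abelian expansion L = −sqrt(det(1+F)) with t_k = tr F^k, takes the [6] rows of Table 4 as multiplicities of equivalent diagrams, and assumes that the class-42 prefactor carries a minus sign (signs are an assumption here). All variable names are illustrative.

```python
from fractions import Fraction as Fr

# sqrt(det(1+F)) = exp(X) with X = -t2/4 - t4/8 - t6/12 (odd traces vanish).
x2, x4, x6 = Fr(-1, 4), Fr(-1, 8), Fr(-1, 12)

# Sixth-order terms of L = -exp(X):
abelian = {
    "t6":    -x6,              # from X
    "t2*t4": -(x2 * x4),       # from X^2/2, cross term counted twice
    "t2^3":  -(x2 ** 3 / 6),   # from X^3/6
}

# (prefactor in Eq. (30), coefficients of the [6] invariant of the class)
rows = {
    "t6":    (Fr(1, 720),  [1, 6, 6, 3, 6, 6, 6, 6, 2, 3, 6, 3, 3, 3]),   # class 6
    "t2*t4": (Fr(-1, 480), [2, 2, 2, 2, 2, 2, 1, 1, 1]),                  # class 42, sign assumed
    "t2^3":  (Fr(1, 5760), [3, 2, 1, 3, 6]),                              # class 222
}

limits = {name: prefactor * sum(weights) for name, (prefactor, weights) in rows.items()}
for name in abelian:
    print(name, abelian[name], limits[name])
```

With these assumptions the three prefactors times the number of diagrams in each class reproduce the abelian coefficients 1/12, −1/32 and 1/384.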
Table 4
Results on the coefficients of cyclic invariants by irreducible representation. The last column contains
names for future reference. The column before last has the following meaning: 'abel' indicates a
coefficient fixed by the abelian case (or by Tseytlin's proof of the symmetric trace formula), 'fixed'
and 'undet' mean fixed resp. undetermined by the rescaled Yang-Mills analysis

Class  S6-rep    Invariant linear combination                   Coefficient  Name
222    [6]       3i1 + 2i2 + 1i3 + 3i4 + 6i5                    abel         I^222_6
222    [4 2]     1i1 + 4i2 − 3i3 − 4i4 + 2i5                    fixed        I^222_42
222    [4 2]     2i1 + 1i2 + 1i3 − 1i4 − 3i5                    0
222    [2^3]     −3i1 + 2i2 + 1i3 + 0i4 + 0i5                   fixed        I^222_222
222    [2^3]     0i1 − 1i2 + 1i3 − 3i4 + 3i5                    undet        I'^222_222
42     [6]       2, 2, 2, 2, 2, 2, 1, 1, 1                      abel         I^42_6
42     [4 2]     2, 0, 2, 1, 1, 1, 0, 2, 1                      fixed        I^42_42
42     [4 2]     1, 3, 2, 2, 1, 1, 0, 1, 1                      0
42     [4 2]     3, 3, 2, 2, 3, 3, 6, 1, 1                      0
42     [4 2]     6, 0, 1, 4, 3, 3, 0, 1, 4                      0
42     [3 2 1]   2, 0, 2, 2, 1, 1, 0, 2, 2                      fixed        I^42_321
42     [3 2 1]   0, 0, 0, 0, 1, 1, 0, 0, 0                      0
42     [2^3]     0, 2, 2, 0, 0, 0, 1, 1, 0                      fixed        I^42_222
42     [2^3]     1, 1, 0, 2, 1, 1, 1, 0, 1                      undet        I'^42_222
6      [6]       1, 6, 6, 3, 6, 6, 6, 6, 2, 3, 6, 3, 3, 3       abel         I^6_6
6      [4 2]     1, 2, 2, 1, 2, 2, 6, 2, 2, 3, 2, 5, 3, 1       fixed        I^6_42
6      [4 2]     1, 2, 1, 2, 1, 1, 3, 2, 1, 0, 1, 1, 0, 2       0
6      [4 2]     1, 6, 1, 2, 1, 1, 1, 6, 3, 2, 9, 3, 2, 2       0
6      [4 2]     2, 3, 4, 5, 4, 4, 2, 3, 3, 1, 3, 3, 1, 2       0
6      [3 2 1]   2, 2, 1, 0, 1, 4, 2, 2, 2, 1, 2, 2, 1, 2       fixed        I^6_321
6      [3 2 1]   0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0       0
6      [2^3]     1, 2, 2, 3, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1       fixed        I^6_222
6      [2^3]     1, 4, 1, 0, 1, 5, 1, 2, 1, 1, 1, 1, 1, 2       undet        I'^6_222
6      [2^3]     0, 2, 2, 0, 2, 2, 2, 2, 0, 2, 4, 2, 2, 2       undet        I''^6_222
6      [2^3]     0, 2, 2, 0, 2, 2, 4, 4, 0, 1, 2, 2, 1, 2       undet        I'''^6_222
6      [3 1^3]   0, 0, 2, 0, 2, 0, 0, 0, 0, 1, 0, 0, 1, 0       fixed=0      I^6_3111
6      [3 1^3]   1, 2, 1, 0, 1, 1, 1, 2, 1, 1, 1, 1, 1, 2       undet        I'^6_3111
6      [2 1^4]   1, 2, 2, 3, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1       undet        I^6_21111
An important check on the arbitrariness is provided by the fact that some commutator
combinations cannot possibly contribute in the restricted class of backgrounds that we investigated. These are, in an obvious notation (see Appendix D if an explanation is needed)
Tr[F1 F1][F2 F2][F3 F3],   (31)
Tr[F1 F1][F2 F2][F3 F4],   (32)
Tr[F1 F2][F3 F4][F5 F6],   (33)
Tr[F1 F2][F3 F5][F4 F6],   (34)
Tr[F1 F2][F2 F5][F4 F6].   (35)
The reason is obvious: the quadratic variation of these products of three commutators always has one commutator left, and vanishes since the background is abelian. 14 The first
line corresponds to $\lambda_7$, the second to $\lambda_6$. The last three lines generate through linear combinations the [3 1 1 1]-invariant with $\lambda_4$, as well as the [2 2 2] invariants with $\lambda_1$ and $\lambda_2$.
6. BPS configurations
Turning on magnetic fields usually breaks all supersymmetry, with the result that the
D-brane configuration becomes unstable. This can already be seen from the mass formulae,
Eqs. (5) and (7), which exhibit the generic presence of tachyonic modes in the spectrum.
However, it was noticed in [11] that for very specific choices of the background some
supersymmetry survives.
We will first formulate this in the T-dual picture. We take two Dp-branes, one of them
in the (2, 4, ..., 2p) direction and the other one rotated over an angle $\theta_1$ in the (2 3) plane,
over an angle $\theta_2$ in the (4 5) plane, ..., over an angle $\theta_p$ in the (2p 2p+1) plane. Searching
for common directions in the supersymmetry charge and the rotated charge gives BPS
configurations which are summarized below:

p    BPS angle                susys    BPS magnetic fields
2    θ1 = θ2                  8        f_1^3 = f_2^3
3    θ1 = θ2 + θ3             4        f_1^3 = f_2^3 + f_3^3 + f_1^3 f_2^3 f_3^3
4    θ1 = θ2 + θ3 + θ4        2        f_1^3 = f_2^3 + f_3^3 + f_4^3 + f_1^3 f_2^3 f_3^3
                                       + f_1^3 f_3^3 f_4^3 + f_1^3 f_2^3 f_4^3 − f_2^3 f_3^3 f_4^3
4    θ1 = θ2, θ3 = θ4         4        f_1^3 = f_2^3, f_3^3 = f_4^3
4    θ1 = θ2 = θ3 = θ4        6        f_1^3 = f_2^3 = f_3^3 = f_4^3
14 It is less obvious that the (independent) combination $F_1 F_{[2} F_{|3|} F_4 F_5 F_{6]}$ (leaving the label 3 out of the
antisymmetrisation) also does not contribute: here the block-diagonal nature of the background is involved. This
is in fact the invariant $I^{6}_{21111}$ with coefficient $\lambda_5$. We have no short explanation for the seventh invariant, with $\lambda_3$.
We assumed that none of the angles are zero. In the table we list the conditions on the
angles, the number of preserved supercharges and finally the T-dual picture where the
condition on the angles is translated, using Eqs. (6) and (4), into a condition on the
magnetic fields. For simplicity we took the magnetic field entirely in the 3 direction,
i.e., $f_i^0$ in Eq. (8) vanishes.
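The translation from linear angle conditions to non-linear field conditions is the tangent addition formula. The sketch below assumes the convention f_i = tan(θ_i), i.e. units in which the factors from Eqs. (4) and (6) are set to one; that convention and the sample angles are assumptions made for illustration only.

```python
import math

def fields(thetas):
    """Magnetic fields for given angles, assuming f_i = tan(theta_i)."""
    return [math.tan(t) for t in thetas]

t2, t3, t4 = 0.3, 0.2, 0.1    # arbitrary sample angles (radians)

# p = 3: theta_1 = theta_2 + theta_3
f1, f2, f3 = fields([t2 + t3, t2, t3])
ok_p3 = math.isclose(f1, f2 + f3 + f1 * f2 * f3)

# p = 4: theta_1 = theta_2 + theta_3 + theta_4
g1, g2, g3, g4 = fields([t2 + t3 + t4, t2, t3, t4])
rhs = g2 + g3 + g4 + g1 * g2 * g3 + g1 * g3 * g4 + g1 * g2 * g4 - g2 * g3 * g4
ok_p4 = math.isclose(g1, rhs)

print(ok_p3, ok_p4)
```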
Though the conditions on the angles are linear, they translate for two cases into non-linear
conditions on the magnetic fields. In fact, when switching on the U(1) part of the
magnetic field, one always gets such corrections. At first sight one would expect this to
give a crucial handle on the NBI. Indeed, BPS configurations should solve the equations
of motion, with the result that the non-linear conditions relate different orders in the NBI.
However, all backgrounds considered above are in the torus of U(2) and thus insensitive to
different orderings in the equations of motion. In fact they all solve the equations of motion
of the abelian Born-Infeld action and as a consequence those arising from our action
through order F^6 as well. There is one case where we do have a good guess for the general
BPS condition: rotated D2-branes or D4-branes with magnetic fields. In that case the
obvious guess for the full non-abelian BPS condition is self-duality of the magnetic field.
In [12], some arguments were put forward to sustain the claim that self-dual static
magnetic backgrounds solving the equations of motion while simultaneously minimizing
the energy is equivalent to demanding that the whole NBI for such configurations collapses
to the leading Yang-Mills term. It was shown that the symmetrized trace prescription does
share this property. Implementing this assumption in our case gives five conditions on the
general form of the action at sixth order. It turns out that three of these are dependent on
the previously implemented rescaled Yang-Mills conditions, thus providing a consistency
check on both our results and the proposal in [12]. The remaining two take an extremely
simple form in terms of the coefficients in Eq. (30), viz.
\[ \lambda_3 = \frac{1}{1440}, \qquad \lambda_7 = \lambda_1 - \frac{\lambda_2}{4} + \frac{\lambda_6}{2}. \]
Note that different Lorentz contraction structures are connected. 15 As far as the
permutation group structure is concerned, the conditions are pure [2 2 2].
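The way self-duality connects different Lorentz contraction structures rests on a quadratic identity for a self-dual, matrix-valued field strength in four Euclidean dimensions: the symmetrised product F_{μρ}F_{νρ} + F_{νρ}F_{μρ} equals δ_{μν} F_{ρσ}F_{ρσ}/2. The sketch below checks this exactly for arbitrarily chosen non-commuting integer 2×2 matrices; the matrices, the sign convention ε_{0123} = +1 and the helper names are assumptions made for illustration.

```python
def mul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

ZERO = [[0, 0], [0, 0]]

# Three arbitrary, mutually non-commuting matrices (illustrative values).
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = [[2, -1], [1, 3]]

# Self-dual field strength F_{mn} = (1/2) eps_{mnrs} F_{rs}, with eps_{0123} = +1:
# F_01 = F_23 = A,  F_02 = -F_13 = B,  F_03 = F_12 = C.
F = [[ZERO for _ in range(4)] for _ in range(4)]
for (m, n), val in {(0, 1): A, (2, 3): A, (0, 2): B, (1, 3): scale(-1, B),
                    (0, 3): C, (1, 2): C}.items():
    F[m][n] = val
    F[n][m] = scale(-1, val)          # antisymmetry in the Lorentz indices

# total = sum over r, s of F_rs F_rs
total = ZERO
for r in range(4):
    for s in range(4):
        total = add(total, mul(F[r][s], F[r][s]))

identity_holds = True
for m in range(4):
    for n in range(4):
        lhs = ZERO
        for r in range(4):
            lhs = add(lhs, add(mul(F[m][r], F[n][r]), mul(F[n][r], F[m][r])))
        rhs_times_2 = total if m == n else ZERO     # 2 * (delta_mn * total / 2)
        identity_holds = identity_holds and scale(2, lhs) == rhs_times_2

print(identity_holds)
```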
What about the full non-abelian version of the non-linear BPS conditions? While the
first order correction to the linear relations can easily be deduced from the fact that they
should solve the equations of motion through order F^3, nothing can be said to all orders
yet. A more detailed study of these BPS configurations and their consequences for the NBI
has to wait for a better understanding of supersymmetry in the NBI [13].
7. Conclusions
In this paper, we made a first systematic attempt to determine corrections to the Str-terms
in the NBI action. The physical testing ground on which we worked was D-branes
wrapped on tori with magnetic backgrounds turned on. Central in an important part of
our analysis is the fact that the spectrum for open strings stretching between these D-branes
as predicted by perturbative string theory differs by a mere rescaling from the YM
approximation to the spectrum. We made two bold assumptions to proceed in unknown
territory. The first was that we did not take along all possible derivative corrections to
the NBI. This was inspired partly by practical motives, partly by the second assumption,
namely, that the action quadratic in the fluctuations in this background obtained from the
NBI should be a rescaled YM action. Under these assumptions we were able to put severe
constraints on the NBI action. A weak a posteriori argument is that this method yields
correct results at order 4. More encouraging is the fact that this approach yields constraints
that are compatible with the constraints we obtained from analysing BPS configurations;
this was not evident from the outset.
15 Self-duality, $F_{\mu\nu} = \epsilon_{\mu\nu\rho\sigma} F_{\rho\sigma}/2$, implies that $F_{\mu\rho}F_{\nu\rho} + F_{\nu\rho}F_{\mu\rho} = \delta_{\mu\nu} F_{\rho\sigma}F_{\rho\sigma}/2$. Repeatedly using
this allows us to rewrite all terms of class 6 and 4 2 in terms of the five elements of the class 2 2 2.
By construction, this action remedies a severe defect of the Str NBI action, pointed out in
[4,9] and [5], namely, that it does not predict the correct spectrum for open strings on our
testing ground. A direct calculation of the terms at order 6 of the NBI via six-point functions
in string theory or a five-loop beta-function calculation in a non-linear σ-model would be
welcome, of course, to see whether our assumptions are valid. As long as this calculation is
not available, other methods to get a grip on the NBI are worth study. A natural extension
of our ideas is to just enlarge the testing ground by looking at other compactification
manifolds, and by looking at different, possibly electric backgrounds. It could help also if
we could gain more insight into the relative norms of the linear combinations of diagrams
corresponding to double cosets, for instance, to see whether there might be a systematic
expansion for the NBI, although this is perhaps asking for too much. 16
There are of course alternative techniques that are more complementary to our approach.
The most promising route to obtain a grip on the non-abelian Born-Infeld action might be via
supersymmetry. Simply by applying the Noether procedure one can try to modify ten-dimensional YM with
non-linear corrections and modify the supersymmetry transformation rules accordingly.
In the abelian case this fixes, e.g., uniquely the fourth-order term in the BI action [14].
Because of the severely restricted form of the supersymmetry algebra in ten dimensions,
it might even be 17 that the BI action is the only supersymmetric deformation of abelian
YM. Continuing this line of thought, it should be clear that a similar analysis should be
performed in the non-abelian case. A good starting point would be the BPS conditions in
higher dimensions in Section 6; they provide a hint of how to modify the supersymmetry
variations. It seems important to us to carry out this program in the maximum of ten
dimensions, because the supersymmetry algebra is largest there and therefore puts (much)
stronger restrictions on the form of the action. For lower dimensions some partial results
for a supersymmetric non-abelian extension of Born-Infeld theory are available [15].
Closely linked to the idea of a supersymmetric action on the brane is the idea of the construction
of an action invariant under κ-symmetry [16]. This approach starts from the observation
that the form of the Wess-Zumino term, which describes the coupling of the gauge
fields to the Ramond-Ramond bulk fields, is severely restricted, even in the non-abelian
case. As the variation of the Wess-Zumino term under the κ-transformations has to be cancelled
by the variation of the NBI, one gets a recursive method to construct the NBI. This
program was already carried out through quartic order in the Yang-Mills field strength,
and including all fermion bilinear terms up to terms cubic in the field strength [16]. The orderings
are indeed completely fixed by requiring κ-invariance. Surprisingly, it was found
that at such low order, deviations from the symmetrized trace proposal do already appear.
Another route to the NBI derives from the study of the equivalence of non-commutative
and commutative Born-Infeld actions via the Seiberg-Witten map. In this way one obtains
constraints on derivative corrections to the Born-Infeld action. It would be interesting to
see whether this can teach us anything about the NBI action, as claimed in [17].
Solutions to different non-abelian extensions of Born-Infeld theory, not necessarily
related to string theory, have been studied. The non-linearity of the action often leads to
a smoothing out of solutions to an ordinary YM or Maxwell action. It could be interesting
to study what kind of corrections can be expected from more general proposals for non-abelian
extensions of Born-Infeld theory, including ours.
16 We thank J. de Boer for a discussion on this point.
17 This was suggested to us by Savdeep Sethi.
Acknowledgements
We thank Eric Bergshoeff, Jan de Boer, Mees de Roo, Savdeep Sethi and Pierre van
Baal for useful discussions. A.S. and W.T. are supported by the European Commission
RTN programme HPRN-CT-2000-00131, in which A.S. is associated to the university of
Leuven. The work of J.T. is supported in part by funds provided by the US Department
of Energy under cooperative research agreement DE-FC02-94ER40818. J.T., moreover,
thanks the Vrije Universiteit Brussel and the FWO Vlaanderen for support during the first
stages of this work.
Appendix A. Drawing diagrams

We draw here all the diagrams corresponding to the double cosets as introduced in
Section 4. 18
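In (A.1) we give a few examples in our ansatz for the action that should leave no
ambiguity as to which terms in the action the diagrams correspond to. The upper index
on $a^{nnn}_m$ indicates the class, the lower index the number of the diagram:
\[
\begin{aligned}
\mathcal{L} = \mathrm{Tr}\Bigl( \cdots &+ a^{6}_{3}\, F_{\mu_1\mu_2}F_{\mu_2\mu_3}F_{\mu_3\mu_4}F_{\mu_6\mu_1}F_{\mu_4\mu_5}F_{\mu_5\mu_6} \\
 \cdots &+ a^{6}_{4}\, F_{\mu_1\mu_2}F_{\mu_2\mu_3}F_{\mu_3\mu_4}F_{\mu_6\mu_1}F_{\mu_5\mu_6}F_{\mu_4\mu_5} \\
 \cdots &+ a^{6}_{14}\, F_{\mu_1\mu_2}F_{\mu_4\mu_5}F_{\mu_2\mu_3}F_{\mu_6\mu_1}F_{\mu_3\mu_4}F_{\mu_5\mu_6} \\
 \cdots &+ a^{2,4}_{1}\, F_{\nu_1\nu_2}F_{\nu_2\nu_1}F_{\mu_1\mu_2}F_{\mu_2\mu_3}F_{\mu_3\mu_4}F_{\mu_4\mu_1} \\
 \cdots &+ a^{2,2,2}_{3}\, F_{\mu_1\mu_2}F_{\nu_1\nu_2}F_{\rho_1\rho_2}F_{\mu_2\mu_1}F_{\nu_2\nu_1}F_{\rho_2\rho_1} \Bigr)
 + O\bigl(F^{8}\bigr).
\end{aligned} \tag{A.1}
\]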
18 In fact, it was intuitively easy to see, even before we knew the group theory from Section 4 that drawing all
different diagrams was sufficient to enumerate all different terms in the action.
407
Appendix B. Table
The table in this appendix records the result of the construction of a basis of invariants
based on permutation group analysis. The notation is as follows.
Table 5
Cyclic invariants by irreducible representation
Class   S6-rep     Prefactor   Invariant linear combination
222     [6]        1/15        3i1 + 2i2 + i3 + 3i4 + 6i5
222     [4 2]      1/10        2i1 + 3i2 − i3 − 3i4 − i5
222     [4 2]      1/20        3i1 − 2i2 − i3 + 2i4 + 4i5
222     [2^3]      1/12        6i1 + 5i2 + i3 + 3i4 − 3i5
222     [2^3]      1/12        0i1 − i2 + i3 − 3i4 + 3i5
42      [6]        1/15        2, 2, 2, 2, 2, 2, 1, 1, 1
42      [4 2]      1/10        3, 3, 0, 2, 1, 1, 1, 2, 1
42      [4 2]      1/20        5, 3, 4, 2, 3, 3, 1, 2, 1
42      [4 2]      1/20        4, 2, 2, 0, 0, 0, 1, 1, 2
42      [4 2]      1/20        4, 2, 2, 4, 4, 4, 1, 1, 2
42      [3 2 1]    1/15        1, 0, 1, 1, 2, 3, 0, 1, 1
42      [3 2 1]    1/15        1, 0, 1, 1, 3, 2, 0, 1, 1
42      [2^3]      1/12        1, 5, 4, 2, 1, 1, 1, 2, 1
42      [2^3]      1/6         1, 1, 2, 2, 1, 1, 2, 1, 1
6       [6]        1/60        1, 6, 6, 3, 6, 6, 6, 6, 2, 3, 6, 3, 3, 3
6       [4 2]      1/20        1, 1, 1, 3, 1, 1, 0, 0, 2, 1/2, 1, 0, 1/2, 0
6       [4 2]      1/40        1, 4, 1, 0, 1, 1, 5, 2, 1, 1, 1, 1, 1, 2
6       [4 2]      1/20        1, 2, 1, 0, 1, 1, 1, 2, 1, 0, 3, 3, 0, 0
6       [4 2]      1/20        1, 2, 1, 2, 1, 1, 1, 0, 1, 1, 3, 1, 1, 2
6       [3 2 1]    1/45        4, 4, 2, 0, 2, 8, 4, 4, 4, 2, 4, 4, 2, 4
6       [3 2 1]    1/45        1, 1, 3, 0, 2, 2, 1, 1, 1, 3, 1, 1, 2, 1
6       [2^3]      1/12        1, 4, 2, 1, 2, 2, 2, 4, 0, 1, 0, 3, 1, 1
6       [2^3]      1/36        1, 2, 1, 0, 1, 7, 1, 4, 1, 1, 5, 1, 1, 4
6       [2^3 1]    1/12        1, 3, 3, 3, 3, 3, 3, 3, 2, 0, 0, 0, 0, 0
6       [2^3]      1/24        1, 2, 3, 2, 3, 5, 1, 0, 1, 1, 1, 1, 1, 2
6       [3 1^3]    1/18        0, 0, 1, 0, 1, 0, 0, 0, 0, 1/2, 0, 0, 1/2, 0
6       [3 1^3]    1/18        1, 2, 3, 0, 1, 1, 1, 2, 1, 0, 1, 1, 2, 2
6       [2 1^4]    1/36        1, 2, 2, 3, 2, 2, 2, 2, 2, 1, 2, 1, 1, 1
Fig. 2. Diagrammatic representation of the five [2 2 2] terms in the action.
Fig. 3. Diagrammatic representation of the nine [4 2] terms in the action, numbered as indicated,
from left to right.
Fig. 4. Diagrammatic representation of the fourteen [6] terms in the action.
The first five lines correspond to the class of terms with three double contractions
of Lorentz indices, labeled class 2 2 2. The second column gives the permutation group
class in the standard cycle notation, and the remaining columns give the corresponding invariant. The
combination is written as a weighted sum of diagrams i_n, the latter labeled in the order
given in Fig. 2 (see Appendix A). In the following lines, this information is given for the
classes of terms with Lorentz contractions following the patterns 42 and 6, respectively.
For the invariant linear combinations we just give the coefficients, again corresponding to
Figs. 3 and 4 in Appendix A.
Appendix C. Abelian constraint

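We know the Born–Infeld action for the gauge group U(1). After expanding the
determinant and the square root it looks as follows:
\[
\begin{aligned}
\mathcal{L}_6 ={}& \tfrac{1}{4} F_{\mu_1\mu_2}F_{\mu_2\mu_1}
 + \tfrac{1}{8} F_{\mu_1\mu_2}F_{\mu_2\mu_3}F_{\mu_3\mu_4}F_{\mu_4\mu_1}
 - \tfrac{1}{32}\bigl(F_{\mu_1\mu_2}F_{\mu_2\mu_1}\bigr)^2 \\
&+ \tfrac{1}{12} F_{\mu_1\mu_2}F_{\mu_2\mu_3}F_{\mu_3\mu_4}F_{\mu_4\mu_5}F_{\mu_5\mu_6}F_{\mu_6\mu_1}
 - \tfrac{1}{32} F_{\mu_1\mu_2}F_{\mu_2\mu_1}\,F_{\nu_1\nu_2}F_{\nu_2\nu_3}F_{\nu_3\nu_4}F_{\nu_4\nu_1} \\
&+ \tfrac{1}{384}\bigl(F_{\mu_1\mu_2}F_{\mu_2\mu_1}\bigr)^3 + O\bigl(F^8\bigr).
\end{aligned}
\tag{C.1}
\]
From this we derive the following constraints on the coefficients in our general ansatz:
\[
a^{2}_{1} = \tfrac{1}{4}, \qquad
a^{4}_{1} + a^{4}_{2} = \tfrac{1}{8}, \qquad
a^{2,2}_{1} + a^{2,2}_{2} = -\tfrac{1}{32},
\tag{C.2}
\]
up to fourth order, and
\[
\sum_{i=1}^{14} a^{6}_{i} = \tfrac{1}{12}, \qquad
\sum_{i=1}^{9} a^{2,4}_{i} = -\tfrac{1}{32}, \qquad
\sum_{i=1}^{5} a^{2,2,2}_{i} = \tfrac{1}{384},
\tag{C.3}
\]
at sixth order.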
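As a quick numerical cross-check of the expansion (C.1), the following sketch compares the sixth-order series with a closed form. It assumes the normalization L = 1 − sqrt(det(1 + F)) with Euclidean index contractions, so that F_{μ1μ2}F_{μ2μ1} is the matrix trace tr F²; this normalization and all function names are illustrative assumptions, not taken from the text.

```python
import math
import random

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0.0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d

def random_field_strength(n, scale, rng):
    """Random antisymmetric n x n matrix with entries of size at most `scale`."""
    f = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            f[i][j] = scale * rng.uniform(-1.0, 1.0)
            f[j][i] = -f[i][j]
    return f

def series_through_f6(f):
    """Right-hand side of (C.1), written in terms of t_k = tr F^k."""
    f2 = matmul(f, f)
    f4 = matmul(f2, f2)
    f6 = matmul(f4, f2)
    t2, t4, t6 = trace(f2), trace(f4), trace(f6)
    return (t2 / 4 + t4 / 8 - t2**2 / 32
            + t6 / 12 - t2 * t4 / 32 + t2**3 / 384)

def closed_form(f):
    """Assumed normalization: 1 - sqrt(det(1 + F))."""
    n = len(f)
    one_plus_f = [[(1.0 if i == j else 0.0) + f[i][j] for j in range(n)]
                  for i in range(n)]
    return 1.0 - math.sqrt(det(one_plus_f))

rng = random.Random(0)
f = random_field_strength(6, 0.01, rng)
difference = closed_form(f) - series_through_f6(f)
print(f"closed form minus series: {difference:.3e}")
```

For small field strengths the difference should be of order F^8, far below the size of the sixth-order terms themselves.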
Appendix D. Some technical details

D.1. Action quadratic in fluctuations in a magnetic background
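We split the field strength into background and fluctuations, F → F + δF, and the
fluctuations into a part linear in the gauge field fluctuations and a part quadratic in the
gauge field fluctuations:
\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + i[A_\mu, A_\nu], \tag{D.1} \]
\[ \delta F_{\mu\nu} = \delta_1 F_{\mu\nu} + \delta_2 F_{\mu\nu}, \tag{D.2} \]
\[ \delta_1 F_{\mu\nu} = D_\mu \delta A_\nu - D_\nu \delta A_\mu, \tag{D.3} \]
\[ \delta_2 F_{\mu\nu} = i[\delta A_\mu, \delta A_\nu], \tag{D.4} \]
\[ D_\mu\,\cdot = \partial_\mu\,\cdot + i[A_\mu, \cdot\,]. \tag{D.5} \]
We substitute F → F + δ_1 F + δ_2 F into the action up to order F^6 and restrict to the terms
quadratic in the fluctuations. We get terms proportional to δ_2 F and terms proportional to
(δ_1 F)^2.
D.2. Terms proportional to δ_2 F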
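When we choose F in the CSA we can write the terms proportional to δ_2 F as:
\[
\begin{aligned}
\mathcal{L}^{(2)}_{\delta_2} ={}& \tfrac{1}{2}\,\delta_2F_{\mu_1\mu_2}F_{\mu_2\mu_1}
 + \tfrac{1}{2}\,\delta_2F_{\mu_1\mu_2}F_{\mu_2\mu_3}F_{\mu_3\mu_4}F_{\mu_4\mu_1}
 - \tfrac{1}{8}\,\delta_2F_{\mu_1\mu_2}F_{\mu_2\mu_1}\,F_{\nu_1\nu_2}F_{\nu_2\nu_1} \\
&+ \tfrac{1}{2}\,\delta_2F_{\mu_1\mu_2}F_{\mu_2\mu_3}F_{\mu_3\mu_4}F_{\mu_4\mu_5}F_{\mu_5\mu_6}F_{\mu_6\mu_1}
 - \tfrac{1}{16}\,\delta_2F_{\mu_1\mu_2}F_{\mu_2\mu_1}\,F_{\nu_1\nu_2}F_{\nu_2\nu_3}F_{\nu_3\nu_4}F_{\nu_4\nu_1} \\
&- \tfrac{1}{8}\,\delta_2F_{\mu_1\mu_2}F_{\mu_2\mu_3}F_{\mu_3\mu_4}F_{\mu_4\mu_1}\,F_{\nu_1\nu_2}F_{\nu_2\nu_1}
 + \tfrac{1}{64}\,\delta_2F_{\mu_1\mu_2}F_{\mu_2\mu_1}\bigl(F_{\nu_1\nu_2}F_{\nu_2\nu_1}\bigr)^2 .
\end{aligned}
\tag{D.6}
\]
This part of the action naturally has the same coefficients as the abelian action.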
D.3. Terms proportional to (δ_1 F)^2

The U(2) components of δ_1 F (see Eq. (D.3)) are denoted as in δ_1 F = Σ_{n=1,2} δ_1 F^{(n)} σ_n
and the background splits likewise as F = F^0 + F^3 σ_3. For the Lorentz index contraction
we use a shorthand notation indicating the sequence(s) of contractions, easily understood
and generalised from the following hypothetical example:
A1 2 B3 1 C1 2 D2 3 E2 1 A1 B3 C1 D2 E2 .
Our calculation to order four gives for the off-diagonal fluctuations:

2
1
1
2
2
L(2)
1 = 2a1 1 F1 1 F2 + 1 F1 1 F2

+ 2 1 F11 1 F21 + 1 F12 1 F22 4a14 + 4a24 F30 F40 + 4a14 F33 F43

+ 2 1 F11 1 F31 + 1 F12 1 F32 2a14 + 2a24 F20 F40 + 2a14 + 2a24 F23 F43

+ 2 1 F11 1 F21 + 1 F12 1 F22

2a12,2 + 2a22,2 F10 F20 + 2a12,2 2a22,2 F13 F23

+ 2 1 F11 1 F11 + 1 F12 1 F12

4a12,2 + 4a22,2 F20 F20 + 4a22,2 F23 F23 .
(D.7)
The corresponding expressions for the general form (with our restrictions) of the action at
order F^6 are not very illuminating, and we refrain from giving them explicitly.
D.4. Background block-diagonal in Lorentz indices
After filling in the background we obtain, from the quadratic terms:
2
2

(a)
(a) 2
(a)
+ 1 F0,2i
1 F2i1,2i
L(2) = 1 F0,2i1

2
2
2
1

(a)
(a)
(a)
(a) 2
+ 1 F2i1,2j
+ 1 F2i,2j
+ 1 F2i,2j
1 F2i1,2j
1
1
2
i=j
(3)
22 F2i1,2i
fi3 ,
and from the quartic terms we find:

2
(a)
(a) 2
L(4) = 8 1 F0,2i1
+ 1 F0,2i

2
2
2
a14 + a24 fi0 + a14 fi3 + 2a12,2 + 2a22,2 fk0
2

+ 2a12,2 2a22,2 fk3
(D.8)
2 4
2

(a)
3a1 + 3a24 + 4a12,2 + 4a22,2 fi0
+ 8 1 F2i1,2i

2
2
+ a14 + a24 + 4a22,2 fi3 + 2a14 + 2a24 fk0

2
+ 2a12,2 2a22,2 fk3

2
2
2
(a)
(a)
(a)
(a) 2
1 F2i1,2j
+8
+ 1 F2i,2j
1 + 1 F2i1,2j
1 + 1 F2i,2j
i=j

2 2
2
a14 + a24 fi0 + a14 fi3 + a12,2 + a22,2 fk0
2

+ a12,2 a22,2 fk3

(a)
(a)
+ 8 1 F2i1,2i 1 F2j 1,2j

a14 + a24 + 4a12,2 + 4a22,2 fi0 fj0 + a14 + a24 + 4a22,2 fi3 fj3

2 2 2
2
(3)
+ 2 F2i1,2i
fi3 6 fi0 + 2 fi3 fk0 fk3

(3)
fi0 2fk0 fk3
+ 2 F2i1,2i

(3)
+8
2 F2i1,2i
fi3 a14 + a24 fi0 fj0 + a14 + a24 fi3 fj3 .
(D.9)
i=j
The sixth order calculation is analogous; the results are omitted.
References
[1] D.J. Gross, E. Witten, Nucl. Phys. B 277 (1986) 1;
A.A. Tseytlin, Nucl. Phys. B 276 (1986) 391;
A.A. Tseytlin, Nucl. Phys. B 291 (1987) 876.
[2] D. Brecher, M.J. Perry, Nucl. Phys. B 527 (1998) 121, hep-th/9801127;
K. Behrndt, Open superstring in non-abelian gauge field, in: Proceedings of the XXIII Int.
Symp., Ahrenshoop, Akademie der Wissenschaften der DDR, 1989, p. 174;
K. Behrndt, Untersuchung der Weyl-Invarianz im Verallgemeinerten σ-Modell für Offene Strings,
PhD thesis, Humboldt-Universität zu Berlin, 1990.
[3] A.A. Tseytlin, Nucl. Phys. B 501 (1997) 41, hep-th/9701125.
[4] A. Hashimoto, W. Taylor, Nucl. Phys. B 503 (1997) 193, hep-th/9703217.
[5] F. Denef, A. Sevrin, J. Troost, Nucl. Phys. B 581 (2000) 135, hep-th/0002180.
[6] J. Troost, Nucl. Phys. B 568 (2000) 180, hep-th/9909187.
[7] A. Abouelsaood, C. Callan, C. Nappi, S. Yost, Nucl. Phys. B 280 (1987) 599.
[8] P. van Baal, Commun. Math. Phys. 94 (1984) 397;
P. van Baal, Commun. Math. Phys. 85 (1982) 529.
[9] P. Bain, On the non-abelian Born–Infeld action, to appear in the Proceedings of the Cargèse '99
Summer School, hep-th/9909154.
[10] B. Simon, Representations of Finite and Compact Groups, Graduate Studies in Mathematics,
Vol. 10, AMS, 1996;
A. Speiser, Die Theorie der Gruppen von endlicher Ordnung, 4te Auflage, Birkhäuser Verlag,
Basel, 1956;
D.E. Littlewood, The Theory of Group Characters, 2nd edn., Clarendon Press, Oxford, 1958.
[11] M. Berkooz, M.R. Douglas, R. Leigh, Nucl. Phys. B 480 (1996) 265, hep-th/9606139.
[12] D. Brecher, Phys. Lett. B 442 (1998) 117, hep-th/9804180.

[13] E. Bergshoeff, M. de Roo, A. Sevrin, in preparation.
[14] E. Bergshoeff, M. Rakowski, E. Sezgin, Phys. Lett. B 185 (1987) 371;
R. Metsaev, M. Rakhmanov, Phys. Lett. B 193 (1987) 202.
[15] S. Cecotti, S. Ferrara, Phys. Lett. B 187 (1987) 335;
S. Ketov, N = 1 and N = 2 supersymmetric non-abelian Born–Infeld actions from superspace,
hep-th/0005265;
A. Refolli, N. Terzi, D. Zanon, Phys. Lett. B 486 (2000) 337, hep-th/0006067.
[16] E.A. Bergshoeff, M. de Roo, A. Sevrin, hep-th/0011018;
E.A. Bergshoeff, M. de Roo, A. Sevrin, hep-th/0011264;
E.A. Bergshoeff, M. de Roo, A. Sevrin, hep-th/0010151.
[17] L. Cornalba, On the general structure of the non-abelian Born–Infeld action, hep-th/0006018.

The uniqueness of the abelian Born–Infeld action

Lies De Fossé, Paul Koerber ¹, Alexander Sevrin
Theoretische Natuurkunde, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussels, Belgium
Received 5 March 2001; accepted 4 April 2001
Abstract
Starting from BPS solutions to Yang–Mills which define a stable holomorphic vector bundle, we
investigate its deformations. Assuming slowly varying fieldstrengths, we find in the abelian case
a unique deformation given by the abelian Born–Infeld action. We obtain the deformed Donaldson–Uhlenbeck–Yau
stability condition to all orders in α′. This result provides strong evidence supporting
the claim that the only supersymmetric deformation of the abelian d = 10 supersymmetric Yang–Mills
action is the Born–Infeld action. © 2001 Elsevier Science B.V. All rights reserved.
1. Introduction
An exciting consequence of the discovery of D-branes [1] was their close relation to
gauge theories. The bosonic worldvolume degrees of freedom of a single Dp-brane are
9 − p scalar fields and a U(1) gauge field in p + 1 dimensions. The former describe the
transversal fluctuations of the D-brane while the latter describes an open string longitudinal
to the brane. For slowly varying fields, the effective action governing the low-energy
dynamics of a D-brane is known through all orders in α′: it is the ten-dimensional
supersymmetric Born–Infeld action, dimensionally reduced to p + 1 dimensions [2]. Its
supersymmetric extension was obtained in [3]. The knowledge of the full effective action
was crucial for numerous applications.
Once several, say n, D-branes coincide, the gauge group is enhanced from U(1) to
U(n) [4]. The non-abelian extension of the Born–Infeld theory is not known yet. The
most natural form for it is the symmetrized trace proposal in [5]. However, as shown
in [6] and [7], this does not correctly capture all of the D-brane dynamics. Using the
mass spectrum as a guideline, partial higher order results were obtained in [8]. In [9],
κ-symmetry was shown to be a powerful but technically involved tool to fix the ordering
ambiguities.
E-mail addresses: lies@tena4.vub.ac.be (L. De Fossé), koerber@tena4.vub.ac.be (P. Koerber),
asevrin@tena4.vub.ac.be (A. Sevrin).
1 Aspirant FWO.
PII: S0550-3213(01)00166-3
L. De Fossé et al. / Nuclear Physics B 603 (2001) 413–426
In [7], it was pointed out that BPS configurations of Dp-branes at angles [10–12]
might provide an important tool to probe the structure of the effective action. Upon
T-dualizing we end up with D2p-branes in the presence of constant magnetic background
fields. In the large volume limit (α′ → 0) the BPS conditions define a stable holomorphic
vector bundle [13]. Moving away from the large volume limit, these conditions receive
corrections. As a BPS configuration necessarily solves the equations of motion, we obtain
relations between different orders in α′ in the effective action.
In the present paper we start the exploration of the consequences of this idea. As
we consider the present paper as a feasibility study we will make two simplifying
assumptions: we work in the limit of slowly varying fieldstrengths and restrict our attention
to the abelian case. The first assumption means that we will ignore
terms containing derivatives of the fieldstrength. The second assumption is implemented by
taking the magnetic background fields to live in the Cartan subalgebra of u(n). As a starting
point we take the theory in the α′ → 0 limit, i.e., we take the Yang–Mills action reduced
to the torus of U(n) in the presence of magnetic background fields which define a stable
holomorphic vector bundle. Subsequently we add arbitrary powers of the fieldstrength to
it and demand that the BPS configurations solve the equations of motion. This problem
turns out to have a unique solution. The resulting action is precisely the abelian Born–Infeld
action and the stability condition, also known as the Donaldson–Uhlenbeck–Yau
condition [14], acquires corrections which are unique as well.
This result provides a serious incentive to extend the analysis to the much harder
non-abelian case [15]. In addition, there is a suspicion that, because of the severely
restricted form of the supersymmetry algebra in ten dimensions, the BI action is the only
supersymmetric deformation of abelian Yang–Mills. ² E.g., supersymmetry fixes in the
abelian case uniquely the fourth order term in the BI action [16]. As BPS configurations
are intimately related to supersymmetry, we believe that our present paper lends strong
support to this claim.
Our paper is organized as follows. In Section 2 we review BPS configurations of Dp-branes
at angles. Section 3 relates this to supersymmetric Yang–Mills theory. Using an
example, we outline our strategy in Section 4. Section 5 provides the proof of our
assertion. We discuss our results and diverse applications in the final section. Conventions
are given in Appendix A while Appendix B gathers some useful results concerning the
abelian Born–Infeld action.
2. BPS configurations from string theory

Simple BPS configurations of D-branes arise as follows [10–12]. One starts with two
coinciding Dp-branes. Keeping one of them fixed, one performs a Lorentz transformation
on the other one. For all boosts and generic rotations, all supersymmetry gets broken in
this way. However, there are particular rotations for which some of the supersymmetry is
preserved.
Consider two Dp-branes in the (1, 3, . . . , 2p − 1) directions. Keeping one of them fixed,
rotate the other one successively over an angle θ_1 in the (1 2) plane, over an angle θ_2 in the
(3 4) plane, . . . , over an angle θ_p in the (2p − 1 2p) plane. The following table summarizes
for various values of p the BPS conditions on the angles (taken to be non-zero unless stated
otherwise) and the number of remaining supersymmetries.
Table 1
p   BPS condition                           Susys
2   θ_1 + θ_2 = 2πn                         8
3   θ_1 + θ_2 + θ_3 = 2πn                   4
4   θ_1 + θ_2 + θ_3 + θ_4 = 2πn             2
4   θ_1 + θ_2 = 2πn, θ_3 + θ_4 = 2πm        4
4   θ_1 = θ_2 = θ_3 = θ_4                   6
In Table 1 we took n, m ∈ Z.
2 This was suggested by Savdeep Sethi.
In order to make contact with the Born–Infeld theory, we T-dualize the system in the
2, 4, . . . , 2p directions. In this way, we end up with two coinciding D2p-branes with
magnetic fields turned on. Indeed, having two D2p-branes extended in the 1, 2, . . . , 2p
directions with magnetic flux F_{2i−1 2i}, i ∈ {1, . . . , p},
\[
F_{2i-1\,2i} = \begin{pmatrix} g_i + f_i & 0 \\ 0 & g_i - f_i \end{pmatrix},
\tag{2.1}
\]
we can choose a gauge such that the potentials have the form
\[
A_{2i-1} = 0, \qquad A_{2i} = F_{2i-1\,2i}\, x^{2i-1}.
\tag{2.2}
\]
T-dualizing back, we end up with two Dp-branes with transversal coordinates given by
\[
X^{2i} = 2\pi\alpha' A_{2i}.
\tag{2.3}
\]
Using Eq. (2.2) in Eq. (2.3), we recognize the original configuration with the two
Dp-branes at angles with the angles given by
\[
\theta_i = \arctan\bigl(2\pi\alpha'(g_i + f_i)\bigr) - \arctan\bigl(2\pi\alpha'(g_i - f_i)\bigr)
 = \arctan\frac{4\pi\alpha' f_i}{1 + (2\pi\alpha')^2\,(g_i^2 - f_i^2)}.
\tag{2.4}
\]
In Table 2 we translate the BPS conditions on the angles into BPS conditions on the
fieldstrengths, choosing for simplicity the U(1) part to be zero, g_i = 0.
One notices that for p > 2, the BPS condition expressed in terms of fieldstrengths
corresponding to the angular relation Σ_i θ_i = 2πn gets α′² corrections. In what follows,
except when stated otherwise, we will always study BPS conditions of this type. In the
remainder of this paper, we will put 2πα′ = 1.
Table 2
p   BPS condition                           Fieldstrengths
2   θ_1 + θ_2 = 2πn                         f_1 + f_2 = 0
3   θ_1 + θ_2 + θ_3 = 2πn                   f_1 + f_2 + f_3 = (2πα′)² f_1 f_2 f_3
4   θ_1 + θ_2 + θ_3 + θ_4 = 2πn             f_1 + f_2 + f_3 + f_4 = (2πα′)² (f_1 f_2 f_3 + f_1 f_3 f_4 + f_1 f_2 f_4 + f_2 f_3 f_4)
4   θ_1 + θ_2 = 2πn, θ_3 + θ_4 = 2πm        f_1 + f_2 = f_3 + f_4 = 0
4   θ_1 = θ_2 = θ_3 = θ_4                   f_1 = f_2 = f_3 = f_4
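The p = 3 row of Table 2 can be checked numerically. With g_i = 0 and 2πα′ = 1, Eq. (2.4) reduces to θ_i = 2 arctan f_i, so the cubic relation between the fluxes should force the angles to sum to a multiple of 2π. The sketch below assumes exactly that reduction; the helper names are hypothetical.

```python
import math

def third_flux(f1, f2):
    """Solve f1 + f2 + f3 = f1*f2*f3 for f3 (requires f1*f2 != 1)."""
    return (f1 + f2) / (f1 * f2 - 1.0)

def total_angle(fluxes):
    """Sum of the angles theta_i = 2*arctan(f_i), i.e. Eq. (2.4) at g_i = 0."""
    return sum(2.0 * math.atan(f) for f in fluxes)

def winding(fluxes):
    """Total angle in units of 2*pi; an integer n when the BPS condition holds."""
    return total_angle(fluxes) / (2.0 * math.pi)

for f1, f2 in [(0.5, 0.3), (2.0, 3.0)]:
    fluxes = (f1, f2, third_flux(f1, f2))
    print(fluxes, winding(fluxes))
```

Small fluxes give n = 0, while larger ones can land on a non-zero integer, which is why the table quotes the condition modulo 2π.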
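3. BPS configurations in supersymmetric Yang–Mills

The supersymmetric U(n) Yang–Mills theory in d = 10 is given by ³
\[
S = \int d^{10}x\, \operatorname{Tr}\Bigl( -\tfrac{1}{4} F_{\mu\nu}F^{\mu\nu} + \tfrac{i}{2}\,\bar\psi\, \Gamma^\mu D_\mu \psi \Bigr),
\tag{3.1}
\]
where ψ is a Majorana–Weyl spinor which transforms in the adjoint representation of
U(n). The action is invariant under the supersymmetry transformation rules
\[
\delta A_\mu = i\,\bar\epsilon\,\Gamma_\mu \psi,
\tag{3.2}
\]
\[
\delta\psi = \tfrac{1}{2} F_{\mu\nu}\Gamma^{\mu\nu}\epsilon + \eta,
\tag{3.3}
\]
with A_μ the U(n) gauge potential and ε and η constant Majorana–Weyl spinors.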
The leading term of the effective theory describing n coinciding D2p-branes (p ≤ 4) is
nothing but Eq. (3.1) dimensionally reduced to 2p + 1 dimensions. The gauge potentials
in the transverse directions appear as 9 − 2p scalar fields in the adjoint representation of
U(n), which are reinterpreted as the transversal coordinates of the D-branes. As they will
not play any significant role in this paper, we drop them from now on.
We now proceed with the analysis of Eq. (3.3) in the presence of magnetic background
fields and demand that some supersymmetry is preserved. I.e., we investigate whether for
certain magnetic background fields there is an ε such that δψ = 0. In fact we can use the
η transformation in Eq. (3.3) to reduce any F from u(n) to su(n). We start by switching on
F_{2i−1 2i} ∈ su(n), i ∈ {1, . . . , p}, which satisfy the BPS condition suggested by D-branes at
angles
\[
\sum_{i=1}^{p} F_{2i-1\,2i} = 0.
\tag{3.4}
\]
For further convenience, we switch to complex coordinates (for details, we refer to
Appendix A), where we have that F_{αᾱ} = i F_{2α−1 2α}. Eq. (3.4) becomes in complex
coordinates, ⁴
3 We ignore an overall multiplicative constant.
4 Unless stated otherwise, we sum over repeated indices.
F = 0.
(3.5)
= F = 0,
(3.6)
We get that
holds provided that

=
p

(1 + 11
),
(3.7)
=1
with an arbitrary Majorana–Weyl spinor. This reduces the number of supersymmetry
charges from 16 to 16/2^{p−1}.
It is not hard to check that when all magnetic fields are switched on, δψ = 0 still holds
provided the magnetic fields do not only satisfy Eq. (3.5) but
\[
F_{\alpha\beta} = F_{\bar\alpha\bar\beta} = 0, \qquad \alpha, \beta \in \{1, \dots, p\},
\tag{3.8}
\]
as well. For p = 2, Eqs. (3.5) and (3.8) are nothing but the well-known instanton equations.
In general Eq. (3.8) defines a holomorphic vector bundle while Eq. (3.5), which can be
rewritten in a more covariant form,
\[
g^{\alpha\bar\beta} F_{\alpha\bar\beta} = 0,
\tag{3.9}
\]
is the Donaldson–Uhlenbeck–Yau condition for stability of the vector bundle [13].
It remains to check whether these configurations solve the equations of motion,
D^μ F_{μν} = 0. In complex coordinates, this becomes
\[
0 = D_{\bar\alpha} F_{\alpha\bar\beta} + D_{\alpha} F_{\bar\alpha\bar\beta}
  = D_{\bar\beta} F_{\alpha\bar\alpha} + 2 D_{\alpha} F_{\bar\alpha\bar\beta},
\tag{3.10}
\]
where we used the Bianchi identities. This is indeed satisfied if Eqs. (3.5) and (3.8) hold.
Note that magnetic field configurations satisfying Eqs. (3.5) and (3.8) always solve the
equations of motion and always preserve supersymmetry, even when they are not constant.
As a consequence, we will no longer demand that they be constant.
4. Deformations
A natural question which arises is whether we can deform the Yang–Mills action in
such a way that the BPS configurations given in the previous section remain solutions
to the equations of motion. Though the discussion in the previous section holds for both
the abelian and the non-abelian case, we focus in the remainder of this paper on
the abelian case. In this way we avoid the additional complication of having to take the
different orderings into account. From now on the magnetic fields take values in the
Cartan subalgebra of u(n) and we postpone the study of the non-abelian extension to a
future paper [15]. In addition, we will work under the assumption that the fieldstrengths
vary slowly. In other words, we add terms polynomial in the fieldstrength to the action
and ignore terms containing derivatives of the fieldstrength (acceleration terms). We will
further comment on these assumptions in the concluding section. Under these assumptions,
we arrive at equations of motion of the form
\[
D^\mu F_{\mu\nu} + x\, D^\mu\bigl(F_{\mu\rho}F^{\rho\sigma}F_{\sigma\nu}\bigr)
 + y\, D^\mu\bigl(F_{\mu\nu}F_{\rho\sigma}F^{\sigma\rho}\bigr) + O\bigl(F^5\bigr) = 0,
\tag{4.1}
\]
where x and y are real constants. As we saw before, the analysis of the leading order term
led to the conditions (in complex coordinates)
\[
F_{\alpha\beta} = F_{\bar\alpha\bar\beta} = 0,
\tag{4.2}
\]
\[
g^{\alpha\bar\beta} F_{\alpha\bar\beta} = F_{\alpha\bar\alpha} = 0,
\tag{4.3}
\]
where in the last line we used the fact that we are working in flat space. Passing to complex
coordinates while implementing the holomorphicity conditions Eqs. (4.2), (4.1) becomes,

D F + xD F F F + 2yD F F F + O F 5 = 0.
(4.4)
Upon using the Bianchi identities and Eq. (4.2), this results in

x
x
F D F F + xF F D F
D F + F F F + 2y +
3
2
5
+ 2yF F D F + O F = 0,
(4.5)
which vanishes if
\[
y = -\frac{x}{4},
\tag{4.6}
\]
holds and provided that we deform the Donaldson–Uhlenbeck–Yau condition, Eq. (4.3), to
\[
F_{\alpha\bar\alpha} + \frac{x}{3}\, F_{\alpha\bar\beta}F_{\beta\bar\gamma}F_{\gamma\bar\alpha} + O\bigl(F^5\bigr) = 0.
\tag{4.7}
\]
Rescaling F and multiplying the equation of motion with a constant, we can put x = 1.
Upon restoring the SO(2p) invariance, we find that the equations of motion integrate to the
action
\[
S = \int d^{2p+1}x \Bigl( \tfrac{1}{4} F_{\mu_1\mu_2}F^{\mu_2\mu_1}
 + \tfrac{1}{8} F_{\mu_1\mu_2}F^{\mu_2\mu_3}F_{\mu_3\mu_4}F^{\mu_4\mu_1}
 - \tfrac{1}{32}\bigl(F_{\mu_1\mu_2}F^{\mu_2\mu_1}\bigr)^2 + O\bigl(F^6\bigr) \Bigr),
\tag{4.8}
\]
which, modulo an undetermined overall multiplicative constant, we recognize as the Born–Infeld
action through order F^4 (see Appendix B).
In a similar way, one can push this calculation an order higher by adding the most general
integrable terms through fifth order in F to the equations of motion. Again we require that
the (deformed) BPS solutions solve the equations of motion. The scale of the fieldstrengths
was already fixed at the previous order. In this calculation one needs, e.g., that the last two
terms in Eq. (4.5) get completed to a derivative of Eq. (4.7). In the end one finds that the
equations of motion get uniquely fixed and they indeed integrate to the Born–Infeld action
through sixth order in F. Furthermore, the Donaldson–Uhlenbeck–Yau condition acquires
an order F^5 correction,
\[
F_{\alpha\bar\alpha} + \tfrac{1}{3} F_{\alpha\bar\beta}F_{\beta\bar\gamma}F_{\gamma\bar\alpha}
 + \tfrac{1}{5} F_{\alpha\bar\beta}F_{\beta\bar\gamma}F_{\gamma\bar\delta}F_{\delta\bar\epsilon}F_{\epsilon\bar\alpha}
 + O\bigl(F^7\bigr) = 0.
\tag{4.9}
\]
These results raise the suspicion that the Born–Infeld action is the only deformation of
Yang–Mills which allows for BPS solutions of the form Eqs. (4.2), (4.3). Furthermore,
one expects that the holomorphicity conditions, Eq. (4.2), remain unchanged, while
the Donaldson–Uhlenbeck–Yau condition, Eq. (4.3), receives corrections. In the next
section, we will show that this is indeed the case.
5. All order results

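In this section we will construct the unique deformation of abelian Yang–Mills which
allows for BPS solutions which are in leading order given by Eqs. (4.2), (4.3). Consider
a general term in the deformed Yang–Mills Lagrangian,
\[
\lambda_{(p_1,p_2,\dots,p_n)} \bigl(\operatorname{tr}F^2\bigr)^{p_1}\bigl(\operatorname{tr}F^4\bigr)^{p_2}\cdots\bigl(\operatorname{tr}F^{2n}\bigr)^{p_n},
\qquad p_i \in \mathbb{N},\ i \in \{1, \dots, n\},
\tag{5.1}
\]
where λ_{(p_1,p_2,...,p_n)} ∈ R. Dropping the overall λ_{(p_1,...,p_n)}, this term contributes to the
equations of motion by
\[
\sum_{j=1}^{n} 4 j p_j\, D^\mu \Bigl[ \bigl(F^{2j-1}\bigr)_{\mu\nu}
 \bigl(\operatorname{tr}F^2\bigr)^{p_1}\bigl(\operatorname{tr}F^4\bigr)^{p_2}\cdots
 \bigl(\operatorname{tr}F^{2j}\bigr)^{p_j-1}\cdots\bigl(\operatorname{tr}F^{2n}\bigr)^{p_n} \Bigr].
\tag{5.2}
\]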
Passing to complex coordinates, Eq. (A.1), we get

n

4jpj 2p1 ++pn 1 D
F 2j 1
j =1
where

Fm
F

2 p1 2j pj 1 2n pn
F
F
F
,
(5.3)
F 2 F2 3 Fm
F1 2 F2 3 Fm 1 .
(5.4)
Using the Bianchi identities and Eq. (4.2), we find for the action of the derivative operator
D on (F 2j 1 ) :
2j
2

1
1
D F h F 2j 1h +
D F 2j 1 .
D F 2j 1 =
h
2j 1
h=1
Implementing this result in Eq. (5.3) yields,

2j 2
n

1
2j 1h
2j 1
1
p1 ++pn 1
h
D F F
D F
4jpj 2
+

h
2j 1
j =1
h=1
p

p 1
p
F 2 1 F 2j j F 2n n
(5.5)
n

p
p 1
p 1
p

pg D F 2g F 2j 1 F 2 1 F 2j j F 2g g F 2n n
g=1,g=j

2j 1 2 p1 2g pj 2 2n pn

2j
F
F
+ (pj 1) D F
F
F
.

(5.6)
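We now study the different types of terms in the equations of motion. We will determine
the numerical prefactors such that the (deformed) BPS configurations solve them.
Terms of the form D_β̄ (F^{2r−1})_{αᾱ}: there is one of these terms at each order and they add
up to
\[
D_{\bar\beta}\Bigl( 4\cdot 1\,\lambda_{(1)}\, F_{\alpha\bar\alpha}
 + \frac{4\cdot 2}{3}\,\lambda_{(0,1)} \bigl(F^3\bigr)_{\alpha\bar\alpha}
 + \frac{4\cdot 3}{5}\,\lambda_{(0,0,1)} \bigl(F^5\bigr)_{\alpha\bar\alpha} + \cdots \Bigr).
\tag{5.7}
\]
In leading order it vanishes because of Eq. (4.3). It is clear that the all order expression
should vanish by itself, thereby giving the deformed Donaldson–Uhlenbeck–Yau
condition,
\[
\frac{1}{1}\,\lambda_{(1)}\, F_{\alpha\bar\alpha}
 + \frac{2}{3}\,\lambda_{(0,1)} \bigl(F^3\bigr)_{\alpha\bar\alpha}
 + \frac{3}{5}\,\lambda_{(0,0,1)} \bigl(F^5\bigr)_{\alpha\bar\alpha} + \cdots = 0.
\tag{5.8}
\]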
Terms of the form (D F 2r )(F 2l1 ) (tail), where
n

p
2i pi
p p

F
(tail) = F 2 1 F 4 2 F 2n n =
,
(5.9)
i=1
as these terms involve traces over even powers of the fieldstrength they can never be
cancelled by a condition such as Eq. (5.8), so they should cancel order by order among
themselves. If we look at the first versus the last two terms of Eq. (5.6), we see
immediately that a term of this form originates from two different terms in the action,
namely (tr F^{2l+2r})(tail) and (tr F^{2l})(tr F^{2r})(tail). Suppose first that l ≠ r. Requiring
such a term to vanish results, using the first two terms in Eq. (5.6), in the following
condition,
\[
(l + r)(p_{l+r} + 1)\,\lambda_{(\dots,p_l,\dots,p_r,\dots,p_{l+r}+1,\dots)}
 + 4 l r (p_l + 1)(p_r + 1)\,\lambda_{(\dots,p_l+1,\dots,p_r+1,\dots,p_{l+r},\dots)} = 0.
\tag{5.10}
\]
The Born–Infeld coefficients Eq. (B.4) satisfy this condition. Analogously, using the
first and third term in Eq. (5.6), we get when l = r,
\[
(p_{2l} + 1)\,\lambda_{(\dots,p_l,\dots,p_{2l}+1,\dots)}
 + 2 l (p_l + 2)(p_l + 1)\,\lambda_{(\dots,p_l+2,\dots,p_{2l},\dots)} = 0,
\tag{5.11}
\]
again satisfied by the Born–Infeld coefficients Eq. (B.4). Note that the two conditions
Eq. (5.10) and Eq. (5.11) are enough to determine all coefficients at a certain order if
one is known. We give an example of the chain of relations at order F^8,
(4, 0, 0, 0) <-(5.11)-> (2, 1, 0, 0) <-(5.10)-> (1, 0, 1, 0) <-(5.10)-> (0, 0, 0, 1)
(2, 1, 0, 0) <-(5.11)-> (0, 2, 0, 0) <-(5.11)-> (0, 0, 0, 1)
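So, up until now, we find Born–Infeld modulo a proportionality factor at each order,
\[
\lambda_{(p_1,p_2,\dots,p_n)} = \frac{(-1)^{k+1}}{4^{k}}\,
 \frac{1}{p_1!\cdots p_n!}\,\frac{1}{1^{p_1}\cdots n^{p_n}}\; X_{\sum_{j=1}^{n} j p_j},
 \qquad k = \sum_{j=1}^{n} p_j,
\tag{5.12}
\]
where the X_{Σ_j j p_j} ∈ R are unknown constants.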

Terms of the form (D F 2r1 )(tail): they relate different orders in F . The only way
to cancel these terms is by virtue of Eq. (5.8). Using Eqs. (5.6) and (5.12) we find that
such a term appears in the equation of motion as

Xl lpl +r

(1) l pl

D F 2r1 (tail),

p
p
l
l
2r 1
2 l
l (pl !) l l
(5.13)
where all summations and products run from l = 1 through l = n. For a given tail, the
sum over r of such terms has to vanish through the use of Eq. (5.8). This determines
all unknowns X_r in terms of two,
\[
X_r = X_2 \Bigl(\frac{X_2}{X_1}\Bigr)^{r-2}, \qquad r \ge 3.
\tag{5.14}
\]
We still have the freedom to rescale the fieldstrength in the equations of motion by
an arbitrary factor and we also note that the equations of motion are only determined
modulo an arbitrary multiplicative factor. In other words, we can only determine the
action modulo an overall multiplicative factor. This freedom can be used to put X_1 =
X_2 = 1. Combining this with Eq. (5.14), we get
\[
X_r = 1, \qquad r \ge 1.
\tag{5.15}
\]
At this point the equations of motion are completely fixed and they are exactly equal
to the equations of motion of the abelian Born–Infeld theory, implying that our action
is, modulo an overall multiplicative constant, the Born–Infeld action. This fixes the
BPS condition, Eq. (5.8), as well,
\[
0 = F_{\alpha\bar\alpha} + \tfrac{1}{3}\bigl(F^3\bigr)_{\alpha\bar\alpha} + \tfrac{1}{5}\bigl(F^5\bigr)_{\alpha\bar\alpha} + \cdots
  = \operatorname{tr}\,\operatorname{arctanh} F,
\tag{5.16}
\]
where F is a p × p matrix with elements F_{αβ̄} and the trace is taken over the Lorentz
indices.
Terms of the form (D F 2r1 )(F 2s ) (tail): these are the only terms left and they will
cancel because of Eq. (5.16). Using Eqs. (5.6), (5.12) and (5.15), we get the prefactor
of such term,

2s
1
(1) l pl
2r1

D
F (tail),
F

2 l pl l (pl !) l l pl 2r 1
(5.17)
and it is clear that when summing over r they vanish because of Eq. (5.16).
This completes the proof that the abelian Born–Infeld action is the unique deformation of
the abelian Yang–Mills action which allows for BPS solutions.
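The recursion relations (5.10) and (5.11) can be checked in exact rational arithmetic. The sketch below assumes the coefficients take the form of Eq. (5.12) with every X_r set to 1 and k equal to the sum of the p_j; the symbol name `lam`, the tuple encoding of (p_1, ..., p_n) and the helper names are illustrative choices, not the paper's notation.

```python
from fractions import Fraction
from math import factorial

def born_infeld_coefficient(p):
    """Coefficient of prod_j (tr F^{2j})^{p_j}, Eq. (5.12) with all X_r = 1."""
    k = sum(p)
    denom = 4**k
    for j, pj in enumerate(p, start=1):
        denom *= factorial(pj) * j**pj
    return Fraction((-1) ** (k + 1), denom)

def bump(p, changes):
    """Return a copy of p with p_j increased by changes[j] (j is 1-based)."""
    q = list(p)
    for j, d in changes.items():
        q[j - 1] += d
    return tuple(q)

def relation_5_10(p, l, r):
    """Left-hand side of Eq. (5.10); needs l != r and len(p) >= l + r."""
    lam = born_infeld_coefficient
    first = (l + r) * (p[l + r - 1] + 1) * lam(bump(p, {l + r: 1}))
    second = 4 * l * r * (p[l - 1] + 1) * (p[r - 1] + 1) * lam(bump(p, {l: 1, r: 1}))
    return first + second

def relation_5_11(p, l):
    """Left-hand side of Eq. (5.11); needs len(p) >= 2 * l."""
    lam = born_infeld_coefficient
    first = (p[2 * l - 1] + 1) * lam(bump(p, {2 * l: 1}))
    second = 2 * l * (p[l - 1] + 2) * (p[l - 1] + 1) * lam(bump(p, {l: 2}))
    return first + second

# The order F^8 chain: every link should evaluate to zero.
zero = (0, 0, 0, 0)
print(relation_5_11((2, 0, 0, 0), 1))   # links (4,0,0,0) and (2,1,0,0)
print(relation_5_10((1, 0, 0, 0), 1, 2))  # links (2,1,0,0) and (1,0,1,0)
print(relation_5_10(zero, 1, 3))        # links (1,0,1,0) and (0,0,0,1)
print(relation_5_11(zero, 2))           # links (0,2,0,0) and (0,0,0,1)
```

Because both relations act within a fixed order in F, the common factor X_r drops out, so setting it to 1 loses nothing for this check.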
6. Discussion and conclusions

Fieldstrength configurations which define a stable, Eq. (4.3), holomorphic, Eq. (4.2),
vector bundle solve the Yang–Mills equations of motion. Such configurations are relevant
in the study of BPS solutions for D-branes in the α′ → 0 limit. In this paper we deformed
the abelian theory by adding arbitrary powers of the fieldstrength to the Yang–Mills
Lagrangian. Demanding that a deformation of Eqs. (4.2), (4.3) still solves the equations of
motion, we showed that the deformation is uniquely determined: it is precisely the abelian
Born–Infeld theory. The holomorphicity condition Eq. (4.2) remains unchanged while the
Donaldson–Uhlenbeck–Yau stability condition gets deformed to
\[
\operatorname{tr}\,\operatorname{arctanh} F = 0,
\tag{6.1}
\]
with F a p × p matrix with elements F_{αβ̄}.

The analysis in Section 5 holds not only in flat space but in Kähler geometries as well.
Defining F to be the p × p matrix with elements F_α^β ≡ g^{βγ̄} F_{αγ̄}, with g the Kähler
metric, one finds again Eq. (6.1). In this context, it might be worthwhile to mention that
it would be interesting to include the transverse scalars in the analysis. This would make
it possible to get a result to all orders in α′ for the stability condition for branes wrapped around a
holomorphic submanifold of a Kähler manifold [17].
Eqs. (4.2) and (6.1) play an important role in the study of BPS configurations for D-branes
at finite α′. As supersymmetry and the magnetic field configurations discussed above
are closely related, our result provides evidence strengthening the belief that the only
supersymmetric deformation of ten-dimensional supersymmetric U(1) Yang–Mills theory
is the supersymmetric Born–Infeld action.
Eq. (6.1) holds in any dimension d = 2p. It is a U(p) invariant and therefore it can
be rewritten in terms of Casimir invariants. As U(p) has Casimir invariants of order
1, 2, . . . , p, one can rewrite Eq. (6.1) in a less elegant though more familiar form when
specializing to particular dimensions. We tabulate the resulting equivalent expressions for
the cases relevant to D-brane physics in Table 3.
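Table 3
p   Stability condition
2   F_{1\bar1} + F_{2\bar2} = 0
3   F_{1\bar1} + F_{2\bar2} + F_{3\bar3} + F_{1\bar1}F_{2\bar2}F_{3\bar3} − F_{1\bar2}F_{2\bar1}F_{3\bar3} − F_{1\bar3}F_{2\bar2}F_{3\bar1} + F_{1\bar2}F_{2\bar3}F_{3\bar1}
    + F_{1\bar3}F_{2\bar1}F_{3\bar2} − F_{1\bar1}F_{2\bar3}F_{3\bar2} = 0
4   F_{1\bar1} + F_{2\bar2} + F_{3\bar3} + F_{4\bar4} − F_{1\bar3}F_{2\bar2}F_{3\bar1} + F_{1\bar2}F_{2\bar3}F_{3\bar1} + F_{1\bar4}F_{4\bar3}F_{3\bar1} − F_{1\bar3}F_{4\bar4}F_{3\bar1}
    + F_{1\bar3}F_{2\bar1}F_{3\bar2} − F_{1\bar1}F_{2\bar3}F_{3\bar2} − F_{1\bar2}F_{2\bar1}F_{3\bar3} + F_{1\bar1}F_{2\bar2}F_{3\bar3} − F_{1\bar4}F_{2\bar2}F_{4\bar1} + F_{1\bar2}F_{2\bar4}F_{4\bar1}
    − F_{1\bar4}F_{3\bar3}F_{4\bar1} + F_{1\bar3}F_{3\bar4}F_{4\bar1} + F_{1\bar4}F_{2\bar1}F_{4\bar2} − F_{1\bar1}F_{2\bar4}F_{4\bar2} − F_{2\bar4}F_{3\bar3}F_{4\bar2} + F_{2\bar3}F_{3\bar4}F_{4\bar2}
    + F_{2\bar4}F_{3\bar2}F_{4\bar3} − F_{1\bar1}F_{3\bar4}F_{4\bar3} − F_{2\bar2}F_{3\bar4}F_{4\bar3} − F_{1\bar2}F_{2\bar1}F_{4\bar4} + F_{1\bar1}F_{2\bar2}F_{4\bar4} − F_{2\bar3}F_{3\bar2}F_{4\bar4}
    + F_{1\bar1}F_{3\bar3}F_{4\bar4} + F_{2\bar2}F_{3\bar3}F_{4\bar4} = 0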
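For a diagonal background the entries of Table 3 reduce to elementary symmetric polynomials, which makes the link with Eq. (6.1) easy to test numerically. The sketch below assumes F_{αᾱ} = i f_α with real f_α (as suggested by the complex-coordinate convention of Section 3) and 2πα′ = 1; the function names are hypothetical.

```python
import cmath
import math
from itertools import combinations

def elementary_symmetric(values, k):
    """Sum of all products of k distinct entries of `values`."""
    return sum(math.prod(c) for c in combinations(values, k))

def stability_polynomial(diagonal):
    """tr F plus the cubic invariant of Table 3 for a diagonal F (p = 3 or 4)."""
    return elementary_symmetric(diagonal, 1) + elementary_symmetric(diagonal, 3)

def tr_arctanh(diagonal):
    """Left-hand side of Eq. (6.1) for a diagonal F."""
    return sum(cmath.atanh(x) for x in diagonal)

def complete_fluxes(partial):
    """Append the real flux that makes e1(f) - e3(f) vanish."""
    e1 = elementary_symmetric(partial, 1)
    e2 = elementary_symmetric(partial, 2)
    e3 = elementary_symmetric(partial, 3)
    return list(partial) + [(e3 - e1) / (1.0 - e2)]

for partial in [(0.5, 0.3), (0.1, 0.2, 0.3)]:
    fluxes = complete_fluxes(partial)
    diagonal = [1j * f for f in fluxes]
    print(len(fluxes), abs(stability_polynomial(diagonal)), abs(tr_arctanh(diagonal)))
```

Both printed numbers should vanish up to rounding, showing that the polynomial condition and tr arctanh F = 0 select the same configurations in this diagonal setting.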
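We expect that these BPS solutions minimize the energy. Let us briefly investigate this
for the case where Eq. (4.2) holds. ⁵ For simplicity, we will only switch on the magnetic
fields F_{1\bar1}, F_{2\bar2}, . . . , F_{p\bar p}. The energy is given by
\[
E = \int d^{2p}x\, \bigl|\det(1 - F)\bigr| = \int d^{2p}x\, \Bigl|\prod_{\alpha=1}^{p} \bigl(1 - F_{\alpha\bar\alpha}\bigr)\Bigr|.
\tag{6.2}
\]
For p = 2 we get,
\[
E = \int d^{4}x\, \bigl|1 - F_{1\bar1} - F_{2\bar2} + F_{1\bar1}F_{2\bar2}\bigr|
 \ \ge\ \int d^{4}x\, \Bigl|\, \bigl|1 + F_{1\bar1}F_{2\bar2}\bigr| - \bigl|F_{1\bar1} + F_{2\bar2}\bigr| \,\Bigr|.
\tag{6.3}
\]
Contrary to the Yang–Mills case, we find two situations in which the relation gets saturated.
The first is when F_{1\bar1} + F_{2\bar2} = 0, which we recognize as the familiar BPS condition we have
been discussing so far. The second configuration is characterized by 1 + F_{1\bar1}F_{2\bar2} = 0. This
corresponds to a D2/D4 system. Though this system is not supersymmetric, it becomes so
when we switch on magnetic fields F_{1\bar1} and F_{2\bar2} on the D4-brane which precisely satisfy
1 + F_{1\bar1}F_{2\bar2} = 0. For p = 3, we get
\[
\begin{aligned}
E &= \int d^{6}x\, \bigl|1 - F_{1\bar1} - F_{2\bar2} - F_{3\bar3} + F_{1\bar1}F_{2\bar2} + F_{1\bar1}F_{3\bar3} + F_{2\bar2}F_{3\bar3} - F_{1\bar1}F_{2\bar2}F_{3\bar3}\bigr| \\
 &\ge \int d^{6}x\, \Bigl|\, \bigl|1 + F_{1\bar1}F_{2\bar2} + F_{1\bar1}F_{3\bar3} + F_{2\bar2}F_{3\bar3}\bigr| - \bigl|F_{1\bar1} + F_{2\bar2} + F_{3\bar3} + F_{1\bar1}F_{2\bar2}F_{3\bar3}\bigr| \,\Bigr|.
\end{aligned}
\tag{6.4}
\]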
Again the result is saturated in two cases. When the last factor vanishes in Eq. (6.4) we
have the standard BPS condition. When the first factor vanishes, we find a configuration
corresponding to either a D0/D6 or a D4/D6 system. By switching on magnetic fields F_{1\bar1},
F_{2\bar2} and F_{3\bar3} on the D6-brane which satisfy this relation we obtain a BPS configuration.
A similar analysis holds for p = 4. Either one recovers the standard BPS configuration
of D8-branes or a D2/D8 system (or equivalently a D6/D8 system) with magnetic fields
on the D8-brane such that the result is BPS. Aspects of some of these non-standard BPS
configurations were studied in [18]. Even though these exotic BPS configurations have no
α′ → 0 limit, they are in fact T-dual to the BPS configurations studied in Section 2, as is
demonstrated in Fig. 1.
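The two ways of saturating the energy bound can be illustrated with a short numerical sketch. It assumes diagonal fields F_{αᾱ} = i f_α with real f_α and reads the energy density as the modulus of the product of the factors (1 − F_{αᾱ}), with the bound given by the difference between the moduli of its real and imaginary parts; this reading of Eqs. (6.2) to (6.4), like the function names, is an assumption made for illustration.

```python
def product_of_factors(fluxes):
    """prod_a (1 - i f_a) for diagonal F_{a abar} = i f_a."""
    value = complex(1.0, 0.0)
    for f in fluxes:
        value *= complex(1.0, -f)
    return value

def energy_density(fluxes):
    """Modulus of the product, the assumed integrand of Eq. (6.2)."""
    return abs(product_of_factors(fluxes))

def lower_bound(fluxes):
    """| |real part| - |imaginary part| |, the assumed right-hand side of (6.3), (6.4)."""
    value = product_of_factors(fluxes)
    return abs(abs(value.real) - abs(value.imag))

cases = {
    "generic": (0.4, 0.9),
    "standard BPS, f1 + f2 = 0": (0.7, -0.7),
    "exotic BPS, f1 * f2 = 1": (2.0, 0.5),
}
for label, fluxes in cases.items():
    print(label, energy_density(fluxes), lower_bound(fluxes))
```

In the generic case the density lies strictly above the bound, while each of the two special cases makes one of the two moduli vanish and so saturates it.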
Our analysis was performed under the assumption that the field strengths vary slowly,
i.e., we ignored terms containing derivatives of the field strength. Such terms are expected to
be present [19]. It would be very interesting to investigate whether our method can handle
such terms as well. However, an additional complication will arise in such an analysis. As
explained in [20] and [21], derivative terms are ambiguous because of field redefinitions.
Nonetheless, it is worthwhile to investigate this point, as this will further clarify the relation
between the commutative and non-commutative pictures [22,23].
5 Throughout this discussion, we assume that the rhs of the inequalities are differences of invariants. This is
certainly so for p = 4, see, e.g., [13]. For p 6 this is very probably true as well. We postpone a more detailed
examination of the energy of BPS configurations to a future publication.
Fig. 1. Before T-dualizing, we have a D-brane extending in the (2i − 1, 2i) plane with u(1) magnetic
flux $F_{2i-1\,2i}$ and another D-brane without magnetic flux perpendicular to it. The dotted lines show
the directions along which we T-dualize. After T-dualizing twice, we end up with two D-branes
which coincide in the (2i − 1, 2i) plane with an su(2) flux $F_{2i-1\,2i}\,\sigma_3$. This shows that the more exotic
BPS conditions minimizing the energy are T-dual to the ones studied in this paper.
Another point which deserves further attention is the study of the BPS conditions as
a function of the string coupling constant $g_S$. In this way the method developed in this paper
might provide an alternative approach to the study of the effective action as a function of
the string coupling constant. In [20], it was shown that through second order in $g_S$ and in
flat space the Born–Infeld action, modulo a renormalization of the tension, still describes
the effective dynamics. It would be intriguing if such a claim could be pushed to higher
orders (note that in non-trivial geometries this is very probably not true).
Finally, the results in this paper provide sufficient motivation for a detailed investigation
of the non-abelian case. As Eqs. (4.2) and (4.3) hold both in the abelian and the non-abelian
case, we can still use them as a starting point and investigate allowed deformations. Not only do
we expect a concrete ordering prescription for the action, but Eq. (6.1) should be supplemented
with an ordering prescription as well. Note that derivative terms might become relevant in
this case. We will report on this in [15].
Acknowledgements
We thank Eric Bergshoeff, Frederik Denef, Mees de Roo, Marc Henneaux, Walter Troost
and Michel Van den Bergh for useful discussions. In particular, we are grateful to Jan
Troost for numerous suggestions and illuminating conversations. This work is supported
in part by the FWO-Vlaanderen and in part by the European Commission RTN programme
HPRN-CT-2000-00131, in which the authors are associated to the university of Leuven.
Appendix A. Notations and conventions

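Our metric follows the mostly plus convention. Indices denoted by $\mu, \nu, \ldots$ run from
0 to 2p, those denoted by i, j, ... run from 1 to 2p, and those denoted by $\alpha, \beta, \ldots$ run from 1 to p. We
use real 32 × 32 $\gamma$-matrices satisfying $\{\gamma_\mu, \gamma_\nu\} = 2 g_{\mu\nu}$ and $\gamma_\mu^T = \gamma_0 \gamma_\mu \gamma_0$. By $\gamma_{\mu_1 \cdots \mu_n}$ we
denote the (weighted) completely antisymmetrized product $\gamma_{[\mu_1} \gamma_{\mu_2} \cdots \gamma_{\mu_n]}$ with $[[\,\cdot\,]] = [\,\cdot\,]$.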
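Instead of using real spatial coordinates $x^i$, $i \in \{1, \ldots, 2p\}$, we will often use complex
coordinates $z^\alpha$, $\alpha \in \{1, \ldots, p\}$,
$$z^\alpha \equiv \frac{1}{\sqrt{2}} \big( x^{2\alpha-1} + i x^{2\alpha} \big), \qquad \bar z^{\bar\alpha} \equiv \frac{1}{\sqrt{2}} \big( x^{2\alpha-1} - i x^{2\alpha} \big). \qquad (A.1)$$
As we work in flat space, the metric is $g_{\alpha\beta} = g_{\bar\alpha\bar\beta} = 0$, $g_{\alpha\bar\beta} = \delta_{\alpha\bar\beta}$.
Consider the rotation group SO(2p). The subgroup preserving the complex structure is
U(p). If we denote the so(2p) generators by $M_{ij} = -M_{ji}$, the u(p) generators are given
by the subset $M_{\alpha\bar\beta}$. The u(1) generator commuting with all the u(p) generators is given by
$\sum_\alpha M_{\alpha\bar\alpha}$. The remaining so(2p) generators, $M_{\alpha\beta}$ and $M_{\bar\alpha\bar\beta}$, transform
in the $p(p-1)/2$ and the $\overline{p(p-1)/2}$ of su(p), respectively.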
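Appendix B. The abelian Born–Infeld action
In this appendix we derive a few properties of the abelian Born–Infeld action needed in
Section 5.
The Born–Infeld Lagrangian can be rewritten as 6
$$L_{BI} = -\sqrt{\det\big( \delta^\mu{}_\nu - F^\mu{}_\nu \big)} = \sum_{k=0}^{\infty} \frac{(-1)^{k+1}}{4^k\, k!} \Big( \mathrm{tr}\, F^2 + \frac{1}{2}\, \mathrm{tr}\, F^4 + \cdots + \frac{1}{p}\, \mathrm{tr}\, F^{2p} + \cdots \Big)^k. \qquad (B.1)$$
A general term in the abelian Born–Infeld Lagrangian,
$$\lambda^{BI}_{(p_1, p_2, \ldots, p_n)} \big( \mathrm{tr}\, F^2 \big)^{p_1} \big( \mathrm{tr}\, F^4 \big)^{p_2} \cdots \big( \mathrm{tr}\, F^{2n} \big)^{p_n}, \qquad (B.2)$$
originates from the kth term in the Taylor expansion, with k given by
$$k = p_1 + p_2 + \cdots + p_n. \qquad (B.3)$$
Hence, the numerical prefactor becomes
$$\lambda^{BI}_{(p_1, p_2, \ldots, p_n)} = \frac{(-1)^{k+1}}{4^k}\, \frac{1}{p_1! \cdots p_n!}\, \frac{1}{1^{p_1} \cdots n^{p_n}}. \qquad (B.4)$$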
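As a cross-check of Eq. (B.4), the following minimal Python sketch evaluates the prefactor with exact rational arithmetic and compares the resulting expansion with the Taylor series of $-\sqrt{1 + f^2}$ for a single 2 × 2 block F = [[0, f], [−f, 0]]. The function names and the choice of a Euclidean test block are assumptions made for this illustration only; they are not part of the original derivation.

```python
from fractions import Fraction
from math import factorial


def bi_prefactor(powers):
    """Coefficient of (tr F^2)^p1 (tr F^4)^p2 ... (tr F^2n)^pn as in Eq. (B.4).

    `powers` is the tuple (p1, p2, ..., pn); the empty tuple gives the k = 0 term.
    """
    k = sum(powers)
    coeff = Fraction((-1) ** (k + 1), 4 ** k)
    for j, p_j in enumerate(powers, start=1):
        coeff /= factorial(p_j) * j ** p_j
    return coeff


def lagrangian_through_f4(f):
    """Expansion of L_BI through order f^4 for the block F = [[0, f], [-f, 0]].

    For this (Euclidean, illustrative) block, tr F^2 = -2 f^2 and tr F^4 = 2 f^4.
    """
    tr_f2 = -2 * f ** 2
    tr_f4 = 2 * f ** 4
    return (bi_prefactor(())
            + bi_prefactor((1,)) * tr_f2
            + bi_prefactor((0, 1)) * tr_f4
            + bi_prefactor((2,)) * tr_f2 ** 2)


if __name__ == "__main__":
    f = Fraction(1, 10)
    # Compare with -sqrt(1 + f^2) = -1 - f^2/2 + f^4/8 + O(f^6).
    print(lagrangian_through_f4(f), -1 - f ** 2 / 2 + f ** 4 / 8)
```

The two printed fractions coincide, which confirms the sign and the combinatorial factors of (B.4) through order $F^4$ for this test case.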
References
[1] J. Polchinski, Phys. Rev. Lett. 75 (1995) 4724, hep-th/9510017;
J. Dai, R. Leigh, J. Polchinski, Mod. Phys. Lett. A 4 (1989) 2073.
[2] E.S. Fradkin, A.A. Tseytlin, Phys. Lett. B 163 (1985) 123;
R.G. Leigh, Mod. Phys. Lett. A 4 (1989) 2767;
A.A. Tseytlin, Born–Infeld action, supersymmetry and string theory, in: M. Shifman (Ed.), The
Many Faces of the Superworld, World Scientific, 2000, hep-th/9908105.
6 The trace denotes a trace over the Lorentz indices.
[3] M. Aganagic, C. Popescu, J.H. Schwarz, Phys. Lett. B 393 (1997) 311, hep-th/9610249;
M. Aganagic, C. Popescu, J.H. Schwarz, Nucl. Phys. B 495 (1997) 99, hep-th/9612080;
M. Cederwall, A. von Gussich, B.E.W. Nilsson, P. Sundell, A. Westerberg, Nucl. Phys. B 490
(1997) 179, hep-th/9611159;
E. Bergshoeff, P.K. Townsend, Nucl. Phys. B 490 (1997) 145, hep-th/9611173.
[4] E. Witten, Nucl. Phys. B 460 (1996) 35, hep-th/9510135.
[5] A.A. Tseytlin, Nucl. Phys. B 501 (1997) 41, hep-th/9701125.
[6] A. Hashimoto, W. Taylor, Nucl. Phys. B 503 (1997) 193, hep-th/9703217.
[7] F. Denef, A. Sevrin, J. Troost, Nucl. Phys. B 581 (2000) 135, hep-th/0002180.
[8] A. Sevrin, J. Troost, W. Troost, The non-abelian Born–Infeld action at order $F^6$, Nucl.
Phys. B 603 (2001) 369, preceding article in this issue, hep-th/0101192.
[9] E.A. Bergshoeff, M. de Roo, A. Sevrin, Non-abelian Born–Infeld and κ-symmetry, hep-th/0011018;
E.A. Bergshoeff, M. de Roo, A. Sevrin, On the supersymmetric Born–Infeld action, hep-th/0011264;
E.A. Bergshoeff, M. de Roo, A. Sevrin, Towards a supersymmetric non-abelian Born–Infeld
[10] M. Berkooz, M.R. Douglas, R. Leigh, Nucl. Phys. B 480 (1996) 265, hep-th/9606139.
[11] E. Bergshoeff, R. Kallosh, T. Ortin, G. Papadopoulos, Nucl. Phys. B 502 (1997) 149, hep-th/9705040.
[12] J.P. Gauntlett, G.W. Gibbons, G. Papadopoulos, P.K. Townsend, Nucl. Phys. B 500 (1997) 133,
hep-th/9702202;
N. Ohta, P.K. Townsend, Phys. Lett. B 418 (1998) 77, hep-th/9710129.
[13] See, e.g., Chapter 15 in the second volume in: M.B. Green, J.H. Schwarz, E. Witten, Superstring
Theory, Cambridge University Press, 1986.
[14] K. Uhlenbeck, S.-T. Yau, Commun. Pure Appl. Math. 39 (1986) 257;
K. Uhlenbeck, S.-T. Yau, Commun. Pure Appl. Math. 42 (1989) 703;
S.K. Donaldson, Duke Math. J. 54 (1987) 231.
[15] L. De Fossé, P. Koerber, A. Sevrin, J. Troost, in preparation.
[16] E. Bergshoeff, M. Rakowski, E. Sezgin, Phys. Lett. B 185 (1987) 371;
R. Metsaev, M. Rakhmanov, Phys. Lett. B 193 (1987) 202.
[17] J.A. Harvey, G. Moore, Commun. Math. Phys. 197 (1998) 489, hep-th/9609017.
[18] B. Chen, H. Itoyama, T. Matsuo, K. Murakami, Nucl. Phys. B 576 (2000) 177, hep-th/9910263;
M. Mihailescu, I.Y. Park, T.A. Tran, D-branes as solitons of an N = 1 D = 10 non-commutative
gauge theory, hep-th/0011079;
E. Witten, BPS bound states of D0–D6 and D0–D8 systems in a B-field, hep-th/0012054.
[19] O.D. Andreev, A.A. Tseytlin, Nucl. Phys. B 311 (1988) 205.
[20] A.A. Tseytlin, Nucl. Phys. B 276 (1986) 391;
A.A. Tseytlin, Nucl. Phys. B 291 (1987) 876, Erratum.
[21] D.J. Gross, E. Witten, Nucl. Phys. B 277 (1986) 1.
[22] N. Seiberg, E. Witten, JHEP 9909 (1999) 032, hep-th/9908142.
[23] L. Cornalba, JHEP 0009 (2000) 017, hep-th/9912293.

Impact parameter dependent S-matrix for dipole–proton scattering from diffractive meson electroproduction
S. Munier a , A.M. Stasto a,b , A.H. Mueller c,d
a INFN Sezione di Firenze, Largo E. Fermi 2, 50125 Firenze, Italy
b Department of Theoretical Physics, H. Niewodniczański Institute of Nuclear Physics, 31-341 Kraków, Poland
c Department of Physics, Columbia University, New York, NY 10027, USA
d Laboratoire de Physique Théorique, Université Paris-Sud, F-91405 Orsay Cedex, France
Received 27 February 2001; accepted 4 April 2001
Abstract
We extract the S-matrix element for dipole–proton scattering using the data on diffractive
electroproduction of vector mesons at HERA. By considering the full t dependence of this process
we are able to reliably unfold the profile of the S-matrix for impact parameter values b > 0.3 fm.
We show that the results depend only weakly on the choice of the form for the vector meson wave
function. We relate this result to the discussion about possible saturation effects at HERA. © 2001
Published by Elsevier Science B.V.
1. Introduction
One of the key issues in deeply inelastic scattering and deeply inelastic diffraction is
whether parton saturation effects are present in the HERA energy regime. The answer to
this question has important implications for our understanding of the very early stages of
relativistic heavy ion collisions where parton saturation would result in the production of a
high density and high field strength, $F_{\mu\nu} \sim 1/\sqrt{\alpha_s}$, state of QCD. Such a state would be a
new and exciting regime of nonperturbative but small coupling QCD.
Parton saturation can be viewed as a manifestation of unitarity limits being reached.
However, unitarity limits are difficult to see directly in deep inelastic scattering and
This research was partially supported by the EU Framework TMR programme, contract FMRX-CT98-0194,
by the Polish Committee for Scientific Research grants Nos. KBN 2P03B 120 19, 2P03B 051 19, 5P03B 144 20
and by the US Department of Energy.
E-mail address: stasto@fi.infn.it (A.M. Stasto).

PII: S 0 5 5 0 - 3 2 1 3 ( 0 1 ) 0 0 1 6 8 - 7
S. Munier et al. / Nuclear Physics B 603 (2001) 427–445
diffraction. Thus, most studies of parton saturation at HERA have been done in terms
of models which explicitly impose unitarity at moderate $Q^2$ and small x. The most widely
used such model is that of Golec-Biernat and Wüsthoff [1], where the S-matrix for the
scattering of a dipole of separation r on a proton is given by $S = e^{-r^2 Q_s^2/4}$, with the
saturation momentum, $Q_s^2$, depending on the energy of the scattering. The fact that the
Golec-Biernat–Wüsthoff model and its generalization work so well at HERA is perhaps
some evidence that deep inelastic scattering, and especially diffraction, have reached
unitarity limits.
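The Golec-Biernat–Wüsthoff form of the S-matrix quoted above can be illustrated with a minimal Python sketch. The saturation momentum is treated here as an external input: its energy dependence is not specified in this section, so the values passed to the functions are purely illustrative, and the function names are assumptions of this sketch.

```python
import math


def s_matrix_gbw(r, qs2):
    """S = exp(-r^2 Q_s^2 / 4) for a dipole of size r (GeV^-1) and Q_s^2 (GeV^2)."""
    return math.exp(-r * r * qs2 / 4.0)


def inelastic_probability(r, qs2):
    """1 - S^2: probability that the dipole induces an inelastic reaction."""
    s = s_matrix_gbw(r, qs2)
    return 1.0 - s * s


if __name__ == "__main__":
    qs2 = 1.0  # illustrative value of Q_s^2 in GeV^2, not a fitted number
    for r in (0.5, 1.0, 2.0, 4.0):
        print(r, s_matrix_gbw(r, qs2), inelastic_probability(r, qs2))
```

Small dipoles give S close to 1 (weak interaction), while dipoles larger than about $2/Q_s$ give S close to 0, the unitarity limit.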
It would be nice to see the attainment of unitarity limits directly without having to
use a model. This is not an easy task. Recall that in proton–proton scattering the energy
dependence of total and elastic cross sections changes little between ISR and Tevatron
energies. The near saturation of unitarity limits was found in proton–proton scattering only
when Amaldi and Schubert [2] (for a recent analysis see [3]) evaluated the proton–proton
elastic scattering amplitude in impact parameter space and found near blackness for small
impact parameters.
Vector meson production seems to be the best process from which to extract the dipole–
proton elastic scattering amplitude. For example, it is easy to check that our procedure (see
Eqs. (4)–(12) below) would not work for the total diffractive cross section because there
is not a single definite state produced, a condition which seems necessary in the way we
proceed.
In this paper we follow the Amaldi–Schubert analysis in order to determine the dipole–
proton scattering amplitude in impact parameter space in terms of diffractive ρ-production
data at HERA. Recall that diffractive electroproduction of a ρ-meson can be viewed in
terms of the following sequence of transitions: (i) The virtual photon goes into a quark–
antiquark pair (dipole). (ii) The dipole scatters elastically on the proton. (iii) The dipole
then goes into a ρ-meson. Thus ρ-electroproduction naturally carries information on the
dipole–proton elastic scattering amplitude. The main uncertainty is in the wavefunction of
the ρ-meson, which appears to be reasonably well constrained by previous phenomenology.
The dipole–proton S-matrix which we extract as a function of impact parameter
is averaged over the dipole sizes which naturally appear in the $\gamma^*$ and ρ-meson
wavefunctions. This averaging does not affect the fact that for weak interactions S is near
1 and for strong interactions S is near zero. Thus the smallness of S is a measure of how
close one is to the unitarity limit, S = 0. Perhaps the best way to measure how close the
dipole–proton scattering is to the unitarity limit is to note that $1 - \langle S \rangle^2 \simeq 1 - \langle S^2 \rangle$ gives the
probability that a dipole passing the proton will induce an inelastic reaction at the impact
parameter in question ($\langle\,\cdot\,\rangle$ denotes the average over dipole sizes as explained in Eq. (11)).
From our analysis this probability is likely significantly greater than 1/2 for $Q^2 \lesssim 2$ GeV$^2$
and for b near zero (see Fig. 6), although better large momentum transfer data would be
needed to definitively determine S at $b \approx 0$. Thus, it is likely that saturation effects are
important in this low $Q^2$ regime. Our estimate of $Q_s^2$ is about 1–1.5 GeV$^2$ at $b \approx 0.3$ fm,
although this estimate has large uncertainties because it depends on knowing the average
dipole size as a function of $Q^2$. Finally, we determine the total dipole–proton cross sections
for $x \approx 10^{-4}$ to be 14.9, 10.6 and 7.5 mb at $Q^2$ = 0.45, 3.5 and 7 GeV$^2$, respectively.
The outline of the paper is as follows: in Section 2 we establish the relation between
the S-matrix for dipoleproton scattering averaged over the distribution of dipole sizes
and the diffractive differential cross section for electroproduction of vector mesons. Due
to the particular properties of this distribution, this formula translates into a method of
determining the S-matrix profile in impact parameter space for a dipole of given size r.
The only theoretical input needed is the vector meson wave function. In Section 3 we
discuss the parametrizations for the wave functions that we used, and we study in detail
the model independence of our procedure. In Section 4 we apply this method to the
analysis of the available HERA data and present our main results on the S-matrix profile in
impact parameter space. We discuss the uncertainties due to the experimental errors on the
measured cross section and due to the lack of data in the high t region. We also discuss the
possible implications for studies of saturation effects at high energies. Finally, in Section 5
we state the summary of our results. We append some more technical points.
2. Dipole–proton S-matrix and diffractive meson production cross section

We consider the process of diffractive production of a vector meson, $\gamma^* p \to V p$,
shown in Fig. 1. We adopt here the dipole formulation [4,5] in which the virtual photon
fluctuates into a $q\bar q$ pair (dipole) which subsequently interacts elastically with a proton,
and eventually forms a meson bound state. This picture is justified at high center-of-mass
energy of the $\gamma^* p$ system, since in this regime it is guaranteed that these successive
processes are clearly separated in time. We assume that the energy is sufficiently high so
that s-channel helicity conservation holds to a good accuracy, and we limit ourselves to
the case of longitudinal vector meson production. The kinematics of this process is shown
in Fig. 1: we define $\Delta$, the two-dimensional momentum transfer, related to the Mandelstam variable t by
$t = -\Delta^2$. $Q^2 = -q^2$ and $x = Q^2/(2P \cdot q)$ are as usual the virtuality of the photon and the
Bjorken variable, respectively.
Fig. 1. Diagrammatic representation of diffractive production of vector mesons. $\gamma^*(q)$ is the virtual
photon of momentum q, $V(q + \Delta)$ is the produced vector meson with momentum $q + \Delta$. The target
proton is scattered elastically.
In this framework, the amplitude $A_{el}(x, \Delta, Q)$ for the scattering can be written in the
following factorized form:
$$A_{el}(x, \Delta, Q) = \sum_{h,\bar h} \int d^2r \, dz \, \psi^{\gamma}_{h,\bar h}(z, r; Q) \, A_{el}^{q\bar q - p}(x, r, \Delta) \, \psi^{V*}_{h,\bar h}(z, r), \qquad (1)$$
where $A_{el}^{q\bar q - p}(x, r, \Delta)$ is the elementary amplitude for the elastic scattering of a dipole
of size r on the proton. $\psi^{\gamma}_{h,\bar h}$ is the photon light-cone wave function projected onto a
state made of a quark–antiquark pair of charges $e_q$, masses $m_q$ and respective helicities
h and $\bar h$. This function is computed in first order light-cone perturbation theory of quantum
electrodynamics, and for the longitudinally polarized virtual photon, it reads [4,6]
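$$\psi^{\gamma}_{h,\bar h}(z, r; Q) = \delta_{h,-\bar h} \, e_q \sqrt{\frac{N_c}{4\pi}} \, 2z(1 - z) \, Q \, \frac{K_0(\epsilon r)}{2\pi}, \qquad (2)$$
where
$$\epsilon^2 = Q^2 z(1 - z) + m_q^2. \qquad (3)$$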
Similarly, $\psi^{V}_{h,\bar h}$ is the wave function of the vector meson produced in the final state. We
discuss it in more detail in the next section. The variable z in Eq. (1) is the fraction of
longitudinal momentum of the virtual photon carried by the quark.
The amplitude is normalized in such a way that the differential cross section for the full
process is
$$\frac{d\sigma}{dt} = \frac{1}{16\pi} \big| A_{el}(x, \Delta, Q) \big|^2. \qquad (4)$$
Furthermore, the elementary amplitude $A_{el}^{q\bar q - p}(x, r, \Delta)$ can be related to the S-matrix
element S(x, r, b) for the scattering of a dipole of size r at impact parameter b [7]:
$$A_{el}^{q\bar q - p}(x, r, \Delta) = \int d^2b \, \tilde A_{el}^{q\bar q - p}(x, r, b) \, e^{i b \cdot \Delta} = 2 \int d^2b \, \big[ 1 - S(x, r, b) \big] \, e^{i b \cdot \Delta}. \qquad (5)$$
We can check briefly the consistency of this formula, which defines S(x, r, b) (see
Ref. [8]). In the standard normalizations adopted here for the amplitudes, the optical
theorem takes the form $\sigma_{tot}^{q\bar q - p}(x, r) = \mathrm{Im}\, i A_{el}^{q\bar q - p}(x, r, \Delta = 0)$. Using formula (5), this
relation translates into the following expression for the total dipole–proton cross section:
$$\sigma_{tot}^{q\bar q - p}(x, r) = 2 \int d^2b \, \big[ 1 - \mathrm{Re}\, S(x, r, b) \big]. \qquad (6)$$
On the other hand, taking the amplitude (5) squared, dividing it by the flux factor and
integrating it over phase space, it is easy to see that the elastic cross section is given by
$$\sigma_{el}^{q\bar q - p}(x, r) = \int d^2b \, \big| 1 - S(x, r, b) \big|^2. \qquad (7)$$
When S = 0, the unitarity limit is reached, which is equivalent to scattering on a black
body. One sees from formulae (6) and (7) that the elastic cross section is then half the total one,
as it should be. This justifies the consistency of the normalization adopted for S(x, r, b)
in Eq. (5). We will assume in the following that $i A_{el}^{q\bar q - p}(x, r, \Delta)$ is purely imaginary, i.e.,
that S(x, r, b) is real.
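The black-body check of Eqs. (6) and (7) can be reproduced numerically. The sketch below is only an illustration under stated assumptions: S(b) is taken real and azimuthally symmetric, so that $d^2b = 2\pi b \, db$, and the black-disk profile, the grid and the function names are choices made for this example.

```python
import math


def cross_sections(s_profile, b_max, n=20000):
    """Return (sigma_tot, sigma_el) from Eqs. (6) and (7) for a real S(b).

    Uses d^2b = 2*pi*b*db and a midpoint rule on [0, b_max]; the result is in
    the squared units of b.
    """
    db = b_max / n
    sigma_tot = 0.0
    sigma_el = 0.0
    for i in range(n):
        b = (i + 0.5) * db
        s = s_profile(b)
        weight = 2.0 * math.pi * b * db
        sigma_tot += 2.0 * (1.0 - s) * weight
        sigma_el += (1.0 - s) ** 2 * weight
    return sigma_tot, sigma_el


def black_disk(radius):
    """S = 0 inside the disk (unitarity limit), S = 1 outside (no interaction)."""
    return lambda b: 0.0 if b < radius else 1.0


if __name__ == "__main__":
    tot, el = cross_sections(black_disk(1.0), b_max=2.0)
    print(tot, el, el / tot)  # the ratio is 1/2 for a black disk
```

For any profile with 0 ≤ S ≤ 1 the ratio of elastic to total cross section stays at or below 1/2, the black-disk value.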
The main goal of this paper is to extract S(x, r, b) taking advantage of the present
experimental knowledge of the cross section for diffractive vector meson production, see
Eq. (1). To this aim we express the amplitude which appears in Eqs. (1), (4) by means of
S(x, r, b) using relation (5). Then we take the inverse Fourier transform of the square root
of Eq. (4), which gives
$$\int \frac{d^2\Delta}{(2\pi)^2} \, e^{-i b \cdot \Delta} \sqrt{\frac{d\sigma}{dt}} = \frac{1}{\sqrt{16\pi}} \sum_{h,\bar h} \int d^2r \, dz \, \psi^{\gamma}_{h,\bar h}(z, r; Q) \, 2 \big[ 1 - S(x, r, b) \big] \, \psi^{V*}_{h,\bar h}(z, r). \qquad (8)$$
In the following we suppress the angular dependence of the S-matrix, since in this process
one is only sensitive to quantities which are angular averaged. We single out the
(logarithmic) distribution of dipole sizes at the photon vertex, which is the overlap between
the photon and the meson wave functions,
$$p(r, Q) \equiv 2\pi r^2 \sum_{h,\bar h} \int_0^1 dz \, \psi^{\gamma}_{h,\bar h}(z, r; Q) \, \psi^{V*}_{h,\bar h}(z, r), \qquad (9)$$
and we denote by N(Q) its integral
$$N(Q) \equiv \int \frac{dr}{r} \, p(r, Q). \qquad (10)$$
We then define the mean value of a given function f(r) with respect to the probability
measure p(r, Q)/N(Q) by
$$\langle f(r) \rangle_p \equiv \int \frac{dr}{r} \, \frac{p(r, Q)}{N(Q)} \, f(r). \qquad (11)$$
We now rewrite Eq. (8) using the definitions (9), (10), (11), and we obtain the following
formula:
$$\langle S(x, r, b) \rangle_p = 1 - \frac{1}{2 \pi^{3/2} N(Q)} \int d^2\Delta \, e^{-i b \cdot \Delta} \sqrt{\frac{d\sigma}{dt}}, \qquad (12)$$
which shows that the measurement of the differential cross section $d\sigma/dt$ enables us
to determine the S-matrix at fixed impact parameter b, averaged over the dipole size
distribution p(r, Q) defined by Eq. (9).
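To make the use of Eq. (12) concrete, here is a minimal numerical sketch in Python. After the angular integration, $\int d^2\Delta \, e^{-i b \cdot \Delta} f = \pi \int d|t| \, J_0(b\sqrt{|t|}) f$ with $|t| = \Delta^2$, so the prefactor becomes $1/(2\sqrt{\pi} N(Q))$. The pure exponential cross section, the numerical values of the slope and of N(Q), the finite cut-off on |t| and all function names are assumptions chosen for illustration; with an exponential the result can be compared with the closed form $1 - \sqrt{\sigma_0} \, e^{-b^2/(2B)} / (\sqrt{\pi} N B)$.

```python
import math


def bessel_j0(x, n=200):
    """J0(x) = (1/pi) * integral_0^pi cos(x sin(theta)) dtheta, midpoint rule."""
    h = math.pi / n
    return sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(n)) * h / math.pi


def mean_s_matrix(b, dsigma_dt, n_q, t_cut, steps=2000):
    """<S>_p at impact parameter b (GeV^-1) from Eq. (12).

    dsigma_dt(|t|) must be given in GeV^-4; n_q stands for N(Q); the |t|
    integral is truncated at t_cut (GeV^2), assumed large enough to converge.
    """
    dt = t_cut / steps
    integral = 0.0
    for i in range(steps):
        t_abs = (i + 0.5) * dt
        integral += bessel_j0(b * math.sqrt(t_abs)) * math.sqrt(dsigma_dt(t_abs)) * dt
    return 1.0 - integral / (2.0 * math.sqrt(math.pi) * n_q)


if __name__ == "__main__":
    sigma0, slope, n_q = 1.0, 8.0, 0.1  # illustrative numbers, not fitted values
    b = 2.0
    numeric = mean_s_matrix(b, lambda t: sigma0 * math.exp(-slope * t), n_q, t_cut=6.0)
    exact = 1.0 - math.sqrt(sigma0) * math.exp(-b * b / (2.0 * slope)) / (math.sqrt(math.pi) * n_q * slope)
    print(numeric, exact)
```

The agreement between the two printed numbers checks the normalization of the Bessel transform; real data would replace the exponential by the measured $d\sigma/dt$ converted to GeV$^{-4}$.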
One should stress that N(Q) which appears in Eq. (12) is the only source of theoretical
uncertainty in this formula for the average S-matrix. This quantity depends on the parametrization of the wave function of the vector meson. However, as we shall discuss in detail
in Section 3, N(Q) is very well constrained and is hardly model-dependent. We now have
Fig. 2. Logarithmic distribution of dipole sizes at the photon vertex p(r, Q)/N(Q), for different Q2 .
to investigate the meaning of this average over dipole sizes, by studying p(r, Q) in more
detail.
The distribution p(r, Q) has some interesting properties which are quite general and
rather independent of the particular model for the meson wave function: it is sharply
peaked at a specific value of r, which depends on $Q^2$ like $r_Q = A/\sqrt{Q^2 + m_V^2}$, and its
width, roughly independent of $Q^2$, is of order unity on a logarithmic scale. To illustrate
these properties, we choose a particular model for $\psi^V(z, r)$ [9] (details will be given in the
next section), and we plot p(r, Q) for various values of $Q^2$ in Fig. 2. One can easily obtain
a rough estimate of $r_Q$ considering the fact that on the one hand, the photon wave function
behaves like log r at small r, and on the other hand, it can be approximated by $\exp(-\epsilon r)$
in the asymptotic region $r \to \infty$, see Eq. (2). Assuming furthermore that the vector meson
wave function is smooth in r, the integrand $p(r, Q)/r \sim r K_0(\epsilon r)$ is then peaked around
the maximum of the function $r \exp(-\epsilon r)$, i.e., around $r_Q \approx 1/\epsilon$. In the case of longitudinal
vector meson production, the dominant contribution comes from symmetric configurations
for which z is close to 1/2. This leads to the above-quoted formula for $r_Q$, with $A \approx 2$.
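The estimate $r_Q \approx A/\sqrt{Q^2 + m_V^2}$ with $A \approx 2$ is easy to evaluate. The short Python sketch below converts it to fm; the ρ mass of 0.77 GeV, the conversion constant 0.1973 GeV fm and the function name are inputs assumed for this illustration.

```python
import math

HBARC_GEV_FM = 0.1973  # hbar*c in GeV*fm, converts GeV^-1 to fm
M_RHO_GEV = 0.77       # assumed rho-meson mass in GeV


def dipole_size_fm(q2, a=2.0, m_v=M_RHO_GEV):
    """r_Q = A / sqrt(Q^2 + m_V^2), with Q^2 in GeV^2, returned in fm."""
    return a * HBARC_GEV_FM / math.sqrt(q2 + m_v ** 2)


if __name__ == "__main__":
    for q2 in (0.45, 3.5, 7.0, 27.0):
        print(q2, round(dipole_size_fm(q2), 3))
```

With A = 2 the typical dipole size decreases from roughly 0.4 fm at $Q^2$ = 0.45 GeV$^2$ to below 0.1 fm at $Q^2$ = 27 GeV$^2$.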
When $\langle S(x, r, b) \rangle_p$ is significantly below 1, the quantity $r_Q$ can be interpreted as the
mean value of the sizes r of the dipoles which participate in the interaction. In this case,
the dipole–proton amplitude $\tilde A_{el}^{q\bar q - p}(x, r, b)$ is large for any dipole of size r present in
the initial wave function and thus does not filter these dipoles: the dominant dipole sizes
involved in the interaction are then effectively distributed according to p(r, Q). However,
in the opposite case when $\langle S(x, r, b) \rangle_p \simeq 1$, the amplitude $\tilde A_{el}^{q\bar q - p}(x, r, b)$ is small for the
relevant values of r. In this situation, we are sensitive to the behaviour of the dipole cross
section at small r, i.e., $\tilde A_{el}^{q\bar q - p}(x, r, b) \propto r^2$. This is known as the colour-transparency
property. Then the interacting dipoles are effectively distributed according to $r^2 p(r, Q)$.
This results in the dominant dipole sizes being shifted towards larger values 1 of $r \sim r_s$,
for which an estimate can be provided using the same technique as previously, and yields
$r_s \approx 6/\sqrt{Q^2 + m_V^2}$.
According to this discussion and considering the fact that one is a priori more interested
in the case when S(x, r, b) < 1, one can make the following approximation:
$$\langle S(x, r, b) \rangle_p \simeq S(x, r_Q, b) \quad \text{with} \quad r_Q \simeq \frac{A}{\sqrt{Q^2 + m_V^2}}, \quad A \approx 2. \qquad (13)$$
Thus for fixed $Q^2$ and energy, Eqs. (12), (13) enable us to determine $S(x, r_Q, b)$, which
is the S-matrix element for the scattering of a dipole of size $r_Q$ on a proton, at energy
$\sqrt{Q^2/x}$ and impact parameter b. A novel feature of our analysis is the fact that we are able
to extract the profile of the S-matrix (or equivalently of the dipole–proton amplitude) in
impact parameter space. The previous studies of Refs. [1,10–16] provided an estimate of
the dipole cross section integrated over the impact parameter.
3. From mesons to dipoles

As already mentioned in Section 2, the only theoretical uncertainty in the
extraction of the S-matrix profile in impact parameter space is the form of the vector meson
wave function, which enters the overlap function p(r, Q) (Eq. (9)) and its integral
N(Q) (Eq. (10)). In this section, we present the different models that we choose for the
vector meson wave function, then we compare them phenomenologically, and finally we
evaluate the uncertainty on $S(x, r_Q, b)$ induced by the freedom of choice of the vector
meson wave function.
3.1. Models for the vector meson wave function
Several different models for the vector meson wave function exist in the literature [9–
11,17,18]. All these approaches use the information from spectroscopic models of long
distance physics. In these models one assumes that the meson is composed of a constituent
quark and antiquark which move in a harmonic oscillator potential. This results in a wave
function which has a Gaussian dependence on the spatial separation between the quarks.
Additionally these models are supplemented by the short-distance physics driven by the
QCD exchange of hard gluons between the valence quarks of the vector meson. Finally, a
relativization procedure has to be applied.
1 The quantity $r_s$ in this case is the so-called scanning radius introduced in [10].
Of course, there are many uncertainties in the above procedure of obtaining the wave
functions for the vector mesons; however, they are constrained by model-independent
features. First of all, $\psi^V_{h,\bar h}$ has to satisfy the following normalization condition [9]:
$$1 = \sum_{h,\bar h} \int d^2r \, dz \, \big| \psi^V_{h,\bar h}(z, r) \big|^2. \qquad (14)$$
Second, the value of the wave function at the origin is related to the leptonic decay width
$\Gamma(V \to e^+ e^-)$ of the vector meson, given by the following formula:

1
dz V (z, r = 0) =
0
fV
,
Nc 2eV
where 0|Jem
(0)|V eq fV mV ,
(15)
where $f_V$ is the coupling of the meson to the electromagnetic current and $e_V$ the isospin
factor, which is the effective charge of the quarks in units of the elementary charge e: for
the ρ meson it is the charge of the combination $(u\bar u - d\bar d)/\sqrt{2}$, i.e., $e_V = 1/\sqrt{2}$. Finally,
one requires that the mean radius be consistent with the electromagnetic radius of the vector meson.
In our calculation we consider the models proposed in Ref. [9] (which we refer to as
DGKP) and [10,11] (which we denote NNZ(1994) and NNPZ(1997), respectively).
3.1.1. Model DGKP
The following wave function is adopted:

fV
1
Vh,h (z, r) = h,h z(1 z)

(16)
f (z) exp 2 r 2 /2 .
4 Nc eV
The parameter $\omega$ and the overall normalization are fixed by the condition (14) and by the
value of the leptonic width. (The exact values of these parameters as well as the form of
the function f(z) are given in Appendix A.) Using the above form of the wave function
$\psi^V$, one obtains p(r, Q) by substituting it into Eq. (9). In the case of model DGKP, it reads:

dz
2 2
fV eq z(1 z)f (z)e r /2 2z(1 z)QK0 (r).
p(r, Q) = 2r 2
(17)
4
3.1.2. Models NNZ(1994) and NNPZ(1997)
The wave function of the vector meson is given by

2

1
Nc
h,h
V (z, r) = h,h
(18)
m z(1 z) r 2 + m2q (r, z).
4 mV z(1 z) V
The form of the radial wave function (r, z) is given in Appendix A. The expression for
the overlap p(r, Q) then is
Nc eV 2eq Q
p(r, Q) = 2r 2
(2)2 mV

2
2
(r, z).
dz mV z(1 z) + mq K0 (r) /K1 (r)
(19)
r
The models NNZ(1994) and NNPZ(1997) differ only by the tuning of the parameters.
Expressions (17) and (19) differ on several points. We study the influence of these
differences on the distribution p(r, Q) and on its integral N(Q) in the next sections.
3.2. Comparison of the theoretical predictions with the HERA data at t = 0
Before applying the models (16), (17) and (18), (19) for the vector meson wave functions
to our procedure of extracting the S-matrix element, we additionally checked how they
compared with the data on forward diffractive vector meson electroproduction at HERA.
To this aim we compute $d\sigma/dt|_{t=0}$ using formulae (1), (4) and replace the dipole–
proton amplitude $A_{el}^{q\bar q - p}$ by the one proposed in Golec-Biernat and Wüsthoff's saturation
model ($\hat\sigma(x, r)$ in the notation of Ref. [1]). 2
Fig. 3. The forward longitudinal cross section for the production of the $\rho^0$ vector meson. Data are
from [20] and the three curves correspond to different assumptions for the wave functions of the
vector meson. Solid curve: model DGKP [9], dashed curve: model NNPZ(1997) [11] and dotted
curve: model NNZ(1994) [10]. The dipole cross section is taken from the Golec-Biernat and Wüsthoff
model [1].
2 A similar study was done in detail recently in [19], where different forms of the wave functions for $\rho^0$ and
$J/\psi$ were considered and used in the calculation of the forward longitudinal and transverse total cross sections.
The meson wave functions were chosen according to general principles rather than just relying on existing models.
Fig. 4. The overlap of the wave functions of the photon and vector meson p(r, Q) plotted as a
function of r (the size r is given in GeV$^{-1}$). Solid curve: DGKP [9], dashed curve: NNPZ(1997)
[11], dotted curve: NNZ(1994) [10]. The dashed-dotted curve corresponds to the dipole cross section
by Golec-Biernat and Wüsthoff [1] calculated for $x = 10^{-3}$. The relative normalization of the dipole
cross section and the wave functions is not preserved.
The theoretical curves corresponding to all three models DGKP, NNZ(1994) and
NNPZ(1997) are plotted in Fig. 3. Let us stress that there is no additional tuning of any of
the parameters of the wave functions. One of the models, NNZ(1994), is able to describe
the data quite well, whereas the other ones underestimate the normalization by as much as
a factor of 2. However, the energy dependence of the cross section is well described by all
models. Qualitatively similar conclusions were reached in the study of Ref. [19].
In order to understand this discrepancy between the different models, we plot in Fig. 4
the integrand p(r, Q)/r which appears in Eq. (10) together with the dipole cross section
from the Golec-Biernat and Wüsthoff model. One can see that different models for the
wave functions result in different shapes in r for p(r, Q)/r. It is clear from this plot that
the exact shape of the function p(r, Q)/r for large values of r is very important in this case,
since it is weighted by the dipole cross section $\hat\sigma(x, r)$, which is large in this regime.
3.3. Overlap integral N(Q), mean radius $r_Q$ and theoretical uncertainty on $S(x, r_Q, b)$
The results on the forward production cross section presented in Fig. 3 suggest that it is
essential to study in detail the model dependence of the S-matrix given by formulae (12)
and (13).
Fig. 5. Integrated dipole distribution N(Q) as a function of Q2 (the values of Q2 are given in GeV2 ).
The curves corresponding to the different models for the meson wave function are shown. The shaded
area is theoretically forbidden.
Let us first study the properties of the quantity N(Q). In Fig. 5 we plot it for the three
different models for the vector meson wave function. In the region of interest, $0.5 < Q^2 <
10.0$ GeV$^2$, they result in comparable N(Q), within 10–25%. This means that the average
S-matrix obtained using Eq. (12) is quite robust against the choice of the model for the
vector meson wave function. In fact one can give some very general bounds on N(Q),
nearly independent of the model for $\psi^V$. From these upper bounds on N(Q) we could
obtain an absolute upper bound for the S-matrix which would be model-independent. We
refer to Appendix B for technical details. We represent these bounds in Fig. 5. One can see
that the models happen to be quite close to these limits (within, say, 25%).
Second, we estimate the variation of $r_Q$ with respect to the different models for the wave
functions. We compute the mean value $r_Q$ as:
$$r_Q = e^{\langle \log r \rangle_p}. \qquad (20)$$
The results for the different models at different values of $Q^2$ are given in Table 1. The
values of $r_Q$ are consistent within 25% for the various models. This confirms the validity
of formula (13).
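Definitions (10), (11) and (20) translate directly into a quadrature on a logarithmic grid, since $dr/r = d(\log r)$. The Python sketch below is a minimal illustration: it does not use any of the meson wave function models, but a toy distribution p(r) with a Gaussian shape in log r, for which $N = s\sqrt{2\pi}$ and $r_Q = r_0$ are known exactly. The grid limits and all names are assumptions of this example.

```python
import math


def norm_and_mean_size(p, r_min=1e-3, r_max=1e3, n=4000):
    """Return (N, r_Q) with N = int (dr/r) p(r) and r_Q = exp(<log r>_p).

    The integrals are evaluated with a midpoint rule on a grid uniform in log r.
    """
    dlog = math.log(r_max / r_min) / n
    radii = [r_min * math.exp((i + 0.5) * dlog) for i in range(n)]
    weights = [p(r) * dlog for r in radii]
    norm = sum(weights)
    mean_log = sum(w * math.log(r) for w, r in zip(weights, radii)) / norm
    return norm, math.exp(mean_log)


def toy_distribution(r0, s):
    """Toy p(r): Gaussian in log r centred at log r0 with width s (not a physical model)."""
    return lambda r: math.exp(-0.5 * (math.log(r / r0) / s) ** 2)


if __name__ == "__main__":
    norm, r_q = norm_and_mean_size(toy_distribution(r0=1.5, s=0.5))
    print(norm, r_q)  # close to 0.5*sqrt(2*pi) and 1.5, respectively
```

Replacing the toy distribution by the overlap p(r, Q) of a given model would yield the corresponding N(Q) and mean dipole size.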
Table 1
The mean values $r_Q$ of the dipole size for different $Q^2$ obtained in models DGKP [9],
NNZ(1994) [10] and NNPZ(1997) [11]. $r_Q$ is evaluated according to formula (20). Consistency with
formula (13) is checked by quoting the corresponding values of A, all quite close to 2 and hardly
dependent on $Q^2$

                      DGKP              NNZ               NNPZ
Q^2 [GeV^2]     r_Q [fm]    A     r_Q [fm]    A     r_Q [fm]    A
0.45            0.35      1.88    0.49      2.63    0.52      2.79
3.5             0.21      2.24    0.26      2.77    0.26      2.77
7               0.16      2.32    0.20      2.90    0.19      2.75
27              0.09      2.49    0.11      3.04    0.10      2.76
The three models appear to be more or less equivalent for our purpose. For simplicity,
we then only consider model DGKP [9] in the following.
4. Impact parameter analysis of HERA data

In this section we extract the b-dependent S-matrix from the HERA data. To this aim,
we apply formulae (12), (13) to the data, together with expressions (17), (19) obtained
from the discussion of the vector meson wave function in the previous section. We also try
to deduce from these results the value of the saturation scale.
4.1. Profile S(b)
The experimental data for diffractive production of vector mesons are usually parametrized by the forward diffractive cross section $d\sigma/dt|_{t=0}$ and the logarithmic slope in momentum transfer B(t) as follows:
$$\frac{d\sigma}{dt} = \frac{d\sigma}{dt}\bigg|_{t=0} e^{-B(t)\,|t|}. \qquad (21)$$
We take the available data [20,21] for the electroproduction of longitudinally polarized $\rho^0$
vector mesons. These data are given for momentum transfer |t| below $|t|_{max} = 0.6$ GeV$^2$,
which enables us to determine reliably the S-matrix only for impact parameter values
larger than $b = 1/\sqrt{|t|_{max}} \approx 0.3$ fm. In order to compute the Fourier transform appearing
in Eq. (12), we have to assume an extrapolation of $d\sigma/dt$ to larger values of |t|. The formula
we use is
$$S(x, r_Q, b) = 1 - \frac{1}{2\sqrt{\pi}\, N(Q)} \sqrt{\frac{d\sigma}{dt}\bigg|_{t=0}} \left[ \int_0^{|t|_{max}} d|t| \, J_0\big( b\sqrt{|t|} \big) \, e^{-B(t)|t|/2} + \int_{|t|_{max}}^{+\infty} d|t| \, J_0\big( b\sqrt{|t|} \big) \, E(t) \right]. \qquad (22)$$
439
Assuming that B(t) is non-increasing with t, as various sets of data seem to indicate, we
choose different functional forms for the extrapolating functions E(t):
• An exponential form $\exp(-d \cdot t)$, with the constant d being of the order of $B(t_{max})$.
This choice provides an upper bound on the S-matrix for b close to 0.
• A power law form $t^{-n}$. This choice is motivated by the fact that the data for
photoproduction at high t [22] indicate a t-dependence governed by such a power
law, with n = 3. Such less steep t-dependence was also obtained in a theoretical
calculation [23]. We consider here two values for n: n = 3 and n = 6.
The parameters of these functions E(t) are fixed from the fit to the experimental points
corresponding to the highest values for t, in the region 0.4 GeV2 < t < 0.6 GeV2 . These
specific assumptions give an idea of the uncertainty on the determination of S(x, rQ , b)
due to our lack of knowledge of the differential cross section for $t > t_{max}$. Another source
of uncertainty is the experimental errors on the measurements of B(t) and $d\sigma/dt|_{t=0}$.
A complete error analysis would lie beyond the scope of this paper. Here we just estimate
the influence of these uncertainties on S(x, rQ , b) by varying the measured quantities inside
the error bars. By this method, we believe that we obtain a strict overestimate of the errors.
The theoretical input N(Q) needed in formula (22) is computed within the model DGKP.
The value rQ is estimated using the procedure described in Sections 2 and 3. The other
models NNPZ(1997) and NNZ(1994) have also been tested and lead to very similar results.
We take three values of $Q^2$: $Q_1^2 = 0.45$ GeV$^2$, $Q_2^2 = 3.5$ GeV$^2$ and $Q_3^2 = 7$ GeV$^2$. These
values correspond to bins of the ZEUS analysis [20], which we consider in the following.
We note that the H1 data [21] lead to similar results. We recall that these values of $Q^2$
correspond to the respective values of the dipole size $r_Q$ (see Table 1): $r_{Q1} = 0.35$ fm,
$r_{Q2} = 0.21$ fm, $r_{Q3} = 0.16$ fm. For these values of $Q^2$, we take similar low values of
x, namely $x_1 = 4.7 \times 10^{-4}$, $x_2 = 4.3 \times 10^{-4}$ and $x_3 = 5.8 \times 10^{-4}$, respectively. The
experimental slope B(t) is parametrized in the region t < 0.6 GeV$^2$ by $B(t) = B_0 - ct$,
where the values of $B_0$ and c are $B_0 = 9.5$ GeV$^{-2}$, c = 4.0 GeV$^{-4}$ (for $Q^2 = 0.45$ GeV$^2$)
and c = 6.1 GeV$^{-4}$ (for $Q^2 = 3.5$ GeV$^2$ and $Q^2 = 7$ GeV$^2$).
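The measured-region integral entering formula (22) can be sketched numerically as follows. This is only an illustration: it reads the slope as $B(t) = B_0 - ct$ with the $Q^2 = 0.45$ GeV$^2$ values quoted above, takes the exponent as $-B(t)t/2$, and uses hypothetical function names and simple midpoint quadrature. The prefactor with N(Q) and $d\sigma/dt|_{t=0}$ and the extrapolated part E(t) are deliberately left out.

```python
import math

HBARC = 0.1973  # GeV fm, converts b from fm to GeV^-1

def bessel_j0(x, steps=400):
    """J0(x) from (1/pi) * int_0^pi cos(x sin(theta)) d(theta), midpoint rule."""
    h = math.pi / steps
    total = sum(math.cos(x * math.sin((k + 0.5) * h)) for k in range(steps))
    return total * h / math.pi

def slope(t, b0=9.5, c=4.0):
    """B(t) = B0 - c t in GeV^-2 (assumed reading of the parametrization)."""
    return b0 - c * t

def measured_region_integral(b_fm, t_max=0.6, steps=600):
    """int_0^tmax dt J0(b sqrt(t)) exp(-B(t) t / 2), with t in GeV^2."""
    b = b_fm / HBARC
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += bessel_j0(b * math.sqrt(t)) * math.exp(-slope(t) * t / 2.0)
    return total * h
```

Evaluating `measured_region_integral` at a few values of b shows how the profile falls with impact parameter before any assumption on E(t) is made.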
The results for S(x, rQ , b) are shown in Fig. 6, and are obtained using Eq. (22)
applied to the ZEUS data. For each Q2 , the curves corresponding to three choices for
the extrapolation function E(t) are drawn. The uncertainty corresponding to the errors
on the measurements themselves are computed for the intermediate value of Q2 , and are
represented in the lower part of the plot by a hashed band.
First, we observe that for b > 0.3 fm, the curves corresponding to different values of Q2
are clearly separated and ordered according to:
S(x1 , rQ1 , b) < S(x2 , rQ2 , b) < S(x3 , rQ3 , b).
(23)
As $x_1 \simeq x_2 \simeq x_3$, this means that the proton is less transparent to dipoles of larger size r.
We also note that in this region of b > 0.3 fm, the results are not very dependent on the
choice of extrapolation of the t-dependence of the data. However, for b < 0.3 fm (shaded
region in Fig. 6), the results become very sensitive to the asymptotic behaviour of the cross
section $d\sigma/dt$, and data at larger t are needed to be able to make firm statements about
the value of $S(x, r_Q, b)$ near b = 0. We also note that the errors on the measurements
for t < 0.6 GeV$^2$ lead at most to an uncertainty of about 25% for the S-matrix at impact
parameter b = 0.3 fm.
Fig. 6. S-matrix for dipole–proton scattering as a function of impact parameter b. Three different
$Q^2$ are considered, corresponding to three typical values for the size of the interacting dipole (which
are estimated according to $r_Q \equiv e^{\langle \log r \rangle_p}$). For each value of $Q^2$, the curves corresponding to three
extrapolations of the data for t > 0.6 GeV$^2$ are shown. The shaded band indicates the region of
impact parameter b where the choice of this extrapolation is crucial, and thus where our extraction
is not reliable. The hashed band on the bottom is an estimate of the errors due to the experimental
uncertainties on $d\sigma/dt|_{t=0}$ and on B(t) (for t < 0.6 GeV$^2$).
Finally, one can compute the total cross section for dipole–proton scattering using the
following formula (which is nothing more than Eq. (6) averaged over r, for real S-matrix):
$$ \big\langle \sigma_{tot}^{q\bar{q}p}(x, r) \big\rangle_p = 2 \int d^2b \, \big( 1 - \langle S(x, r, b) \rangle_p \big). \qquad (24) $$
Using formula (12) to replace $\langle S(x, r, b) \rangle_p$ and performing the integral over b, one sees
that this cross section is determined by N(Q) and by the forward differential cross section
$d\sigma/dt|_{t=0}$. For photon virtualities $Q_1^2$, $Q_2^2$ and $Q_3^2$, one obtains the values $\langle \sigma_{tot}^{q\bar{q}p} \rangle_p$ =
14.9 mb, 10.6 mb and 7.5 mb, respectively. These results are consistent with those of
Refs. [1,11–15]; however, the relationship between $Q^2$ and the radius $r_Q$ quoted in these
papers remains a large source of uncertainty in this comparison.
4.2. Towards an estimate of the saturation scale
One can try to estimate the quark saturation scale $Q_s^2$ using the results presented in
Fig. 6. From this analysis it is in principle possible to extract the dependence of this scale
on the impact parameter b of the collision. We try the following phenomenological formula
[24] for the S-matrix:
$$ S(x, r_Q, b) = \exp\big( -r_Q^2 \, Q_s^2(x, b)/4 \big). \qquad (25) $$
This exponential form comes from a Glauber-like summation of multiple independent
scatterings of the quark–antiquark pair on the target nucleon [25,26]. It is also supported
by the Golec-Biernat and Wüsthoff model [1], where the radius $R_0(x)$ introduced there can
be related to the saturation scale by $R_0(x) = 1/Q_s(x, b)$. However, in that work, it was
assumed that the b-profile of the S-matrix has the form of a sharp cutoff at a distance of
order 1 fm.
Using the approximation $r_Q \simeq 2/\sqrt{Q^2 + m_V^2}$ one obtains $Q_s^2 \simeq 1$–$1.5$ GeV$^2$ and $Q_s^2 \simeq$
0.2 GeV$^2$ for b = 0.3 fm and b = 1.0 fm, respectively. These values are consistent in
order of magnitude when computed at different $Q^2$, which confirms the relevance of
formula (25). These results would suggest that at small impact parameter values the
saturation scale is in the semi-hard regime and support the onset of the shadowing effects
at HERA. One should however stress that these estimates are rather rough and should be
taken with care since they strongly depend on the precise value of $r_Q$. Indeed, $r_Q$ comes
squared in the formula for $Q_s^2$.
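The sensitivity to $r_Q$ can be made concrete by inverting formula (25). The sketch below assumes the reading $S = \exp(-r_Q^2 Q_s^2/4)$ and $r_Q \simeq 2/\sqrt{Q^2 + m_V^2}$; the function names and the value used for the $\rho$ mass are illustrative assumptions, and no value of S read off Fig. 6 is implied.

```python
import math

HBARC = 0.1973  # GeV fm
M_RHO = 0.775   # GeV, assumed value for the vector meson mass m_V

def dipole_size_fm(q2, m_v=M_RHO):
    """r_Q ~ 2 / sqrt(Q^2 + m_V^2), converted from GeV^-1 to fm."""
    return 2.0 / math.sqrt(q2 + m_v ** 2) * HBARC

def saturation_scale_sq(s_value, r_q_fm):
    """Invert S = exp(-r_Q^2 Q_s^2 / 4); returns Q_s^2 in GeV^2."""
    if not 0.0 < s_value <= 1.0:
        raise ValueError("S must lie in (0, 1]")
    r = r_q_fm / HBARC
    return -4.0 * math.log(s_value) / r ** 2
```

Since $Q_s^2 \propto 1/r_Q^2$ at fixed S, a 25% spread in $r_Q$ translates into a spread of roughly a factor 1.6 in $Q_s^2$.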
5. Summary
In this paper we have shown that using the HERA data on diffractive electroproduction
of vector mesons it is possible to extract the S-matrix element S(x, r, b) for dipole–proton
scattering in impact parameter space. By considering the full t dependence of this process
we have shown how to obtain the S-matrix element averaged over dipole sizes r. Due to
the particular properties of the overlap function between photon and meson wave functions,
this average can be replaced by the S-matrix element evaluated at $r = r_Q \simeq 2/\sqrt{Q^2 + m_V^2}$.
We have then shown that our results on the S-matrix are only weakly dependent on the
choice for the model of the meson wave function. Since the data are available only for
low momentum transfer, t < 0.6 GeV$^2$, we have shown that our results are reliable for
impact parameter values larger than $b \simeq 0.3$ fm. It would be very interesting to extend
the measurement of diffractive electroproduction of vector mesons to larger values of t.
This would enable us to explore the very interesting regime of central impact parameter
collisions at HERA. We have also estimated the value of the dipole cross section integrated
over the impact parameter and have found it to be consistent with other analyses. Finally
we have discussed how to use this result to estimate the saturation scale at the HERA collider.
We have suggested that the onset of the shadowing effects can be dependent on the impact
parameter of the collision and that for central collisions it could occur in the semi-hard
regime. However, further theoretical and experimental studies are necessary.
Acknowledgements
We thank Marcello Ciafaloni for his comments, and Allen C. Caldwell and Mara S.
Soares for correspondence. S.M. thanks Columbia University for its welcome at a preliminary
stage of this work, and the Service de Physique Théorique de Saclay for support at that
time. He also thanks Robi Peschanski and Bernard Pire for their suggestions. A.M.S. thanks
Krzysztof Golec-Biernat and Jan Kwiecinski for interesting discussions. A.H.M. wishes
to thank Marcello Ciafaloni for his hospitality at the University of Florence where this
collaboration began, and he wishes to thank Dominique Schiff for her hospitality in Orsay.
Appendix A. Models for the meson wave function

In this appendix, we detail the phenomenological forms we adopt for the meson wave
function.
A.1. Model by Dosch, Gousset, Kulzinger, Pirner (DGKP, Ref. [9])
The form of the vector meson wave function is given by:

3/2
N
Vh,h (z, r) = h,h 4 z(1 z)

4

2
1 m2V
1 2 2
z 1/2
exp
exp r .
2 2
2
The parameters chosen are the following:
= 0.330 GeV
and N = 4.48.
(A.1)
An additional feature of this model is that the mass $m_q$ of the quarks that appears in the
photon wave function is running with $Q^2$. This gives important effects only for very low $Q^2$
and for photoproduction. In this regime, it is argued that the quarks should have constituent
mass. The formula for the running mass reads:
$$ m_q(Q^2) = \begin{cases} 0.220\ \mathrm{GeV} \cdot \big(1 - Q^2/Q_0^2\big) & \text{for } Q^2 < Q_0^2 = 1.05\ \mathrm{GeV}^2, \\ 0 & \text{for } Q^2 > Q_0^2. \end{cases} \qquad (A.2) $$
This model is referred to as DGKP.
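As a small illustration, the running mass (A.2) can be coded directly. The sketch assumes the linear form $0.220\ \mathrm{GeV} \cdot (1 - Q^2/Q_0^2)$ below $Q_0^2$; the function and argument names are hypothetical.

```python
def running_quark_mass(q2, m0=0.220, q0_sq=1.05):
    """Running quark mass m_q(Q^2) in GeV for Q^2 in GeV^2.

    Linear interpolation from the constituent mass m0 at Q^2 = 0
    down to zero at Q^2 = Q_0^2, and zero above (assumed reading of (A.2)).
    """
    if q2 < 0:
        raise ValueError("Q^2 must be non-negative")
    if q2 >= q0_sq:
        return 0.0
    return m0 * (1.0 - q2 / q0_sq)
```

For the lowest bin considered in this paper, $Q^2 = 0.45$ GeV$^2$, this gives a mass of about 0.126 GeV, while the two higher bins lie above $Q_0^2$ and correspond to massless quarks.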
A.2. Models by Nemchik, Nikolaev, Predazzi, Zakharov (NNZ(1994), Ref. [10] and NNPZ
(1997), Ref. [11])
In this model, the radial wave function $\phi(r, z)$ appearing in Eq. (19) satisfies the
following normalization condition:
$$ 1 = \frac{N_c}{2\pi} \int_0^1 \frac{dz}{z^2(1-z)^2} \int d^2r \left\{ m_q^2 \, \phi^2(z, r) + \big[ z^2 + (1-z)^2 \big] \big[ \partial_r \phi(z, r) \big]^2 \right\}. \qquad (A.3) $$
This equation is nothing else but the normalization condition (14), applied to the
transversely polarized meson wave function. The obtained normalization factor $\Psi_0(1S)$
is assumed to be the same for a longitudinally polarized meson. The function $\phi(r, z)$ is
defined by:
$$ \phi(z, r) = \Psi_0(1S) \left\{ 4z(1-z) \sqrt{2\pi R^2} \, \exp\left[ -\frac{m_q^2 R^2}{8z(1-z)} \right] \exp\left[ -\frac{2z(1-z) r^2}{R^2} \right] \exp\left[ \frac{m_q^2 R^2}{2} \right] + C^4 \, \frac{16 a^3(r)}{A(z, r) B^3(z, r)} \, r K_1\big( r A(z, r)/B(z, r) \big) \right\}. \qquad (A.4) $$
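The functions A and B are given by
$$ A^2(z, r) = 1 + \frac{C^2 a^2(r) m_q^2}{z(1-z)} - 4 C^2 a^2(r) m_q^2, \qquad B^2(z, r) = \frac{C^2 a^2(r)}{z(1-z)}, \qquad (A.5) $$
and $a(r) \equiv 3/(8 m_q \alpha_s(r))$. The following prescription is taken for the running coupling
$\alpha_s(r)$:
$$ \alpha_s(r) = \alpha_0 \ \text{ for } r > r_s, \quad \text{and} \quad \alpha_s(r) = \frac{4\pi}{\beta_0 \log(1/\Lambda^2 r^2)} \ \text{ for } r < r_s, \qquad (A.6) $$
where the parameter values are
$$ r_s = 0.42\ \mathrm{fm}, \qquad \alpha_0 = 0.8, \qquad \Lambda = 200\ \mathrm{MeV}. \qquad (A.7) $$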
The masses of the quarks are taken to be 0.15 GeV.
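The frozen running coupling and the radius a(r) can be sketched as below. The value $\beta_0 = 9$ (three light flavours) is an assumption that is not quoted in the text, and the function names are hypothetical; with this choice the two branches of the coupling nearly match at $r = r_s$.

```python
import math

HBARC = 0.1973  # GeV fm

def alpha_s(r_fm, r_s=0.42, alpha_0=0.8, lam=0.200, beta_0=9.0):
    """Coupling frozen at alpha_0 for r >= r_s, one-loop running below r_s.

    r_fm and r_s are in fm, lam (Lambda) in GeV; beta_0 = 9 is an assumption.
    """
    if r_fm <= 0:
        raise ValueError("r must be positive")
    if r_fm >= r_s:
        return alpha_0
    lam_r = lam * r_fm / HBARC  # dimensionless Lambda * r
    return 4.0 * math.pi / (beta_0 * math.log(1.0 / lam_r ** 2))

def coulomb_radius(r_fm, m_q=0.15):
    """a(r) = 3 / (8 m_q alpha_s(r)) in GeV^-1, with m_q in GeV."""
    return 3.0 / (8.0 * m_q * alpha_s(r_fm))
```

In the frozen region this gives a(r) = 3.125 GeV$^{-1}$, and a(r) grows as the coupling decreases at smaller r.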

The other parameters $\Psi_0(1S)$, C and $R^2$ are chosen taking into account several
constraints: the normalization condition (A.3) for the wave function has to be satisfied,
and the value of the leptonic decay width must agree with the experimental measurement.
Additionally, the mean radius of the meson has to be of the order of a hadronic scale. Two
different sets of parameters are considered:
C = 0.25, $R^2$ = 0.76 fm$^2$ (reference [10]),
and C = 0.36, $R^2$ = 1.37 fm$^2$ (reference [11]).   (A.8)
These two choices are referred to as NNZ(1994) and NNPZ(1997), respectively.
Appendix B. Model-independent bounds on the overlap integral N (Q)

In this appendix we explore further the properties of N(Q) and show that upper bounds
can be given on this quantity, regardless of the model adopted for $\psi_V$. These provide upper
bounds on $S(x, r_Q, b)$.
N(Q) can be seen as a scalar product of the two wave functions, $N(Q) = \langle \psi_\gamma | \psi_V \rangle$,
see Eqs. (9) and (10). The Cauchy–Schwarz inequality then applies. It leads to an upper
estimate for the integral: the product of the two wave functions is smaller than the square
root of the product of the integrals of the squared wave functions. Using the normalization
condition (14) for the meson wave function 3 and computing the full integral of the squared
photon wave function, one eventually obtains:

m4q
m2q
em Nc
1
1
1
tanh
N(Q) 2eV
.
1 6 2 + 24 4
6
Q
Q 1 + 4m2 /Q2
1 + 4m2q /Q2
q
(B.1)
The r.h.s. of Eq. (B.1) grows with Q2 and thus this inequality is only interesting in
the small-Q2 region. Although independent of the precise form of the vector meson wave
function, this bound nevertheless depends on the masses of the quarks.
Second, with the additional assumption that V (z, r) is maximum for r = 0, we can
write:

N(Q) 2eV d 2 r dz (z = 1/2, r; Q)V (z, r = 0).
(B.2)
The integrals over z and r in the r.h.s. are now factorized. The integration over r is
performed analytically, while the one over z, involving V , can be expressed as a function
of the coupling fV of the meson to the electromagnetic current, using the relation (15).
This finally leads to:
N(Q) 2 em fV
Q
.
Q2 + 4m2q
(B.3)
The r.h.s. vanishes like 1/Q at large Q, which makes this inequality useful for large Q.
The region in the (Q2 ,N(Q)) plane which is forbidden by these bounds is depicted in
Fig. 5. One sees that all the models are very close to the upper bounds.
References
[1] K. Golec-Biernat, M. Wüsthoff, Phys. Rev. D 59 (1999) 014017;
K. Golec-Biernat, M. Wüsthoff, Phys. Rev. D 60 (1999) 114023.
[2] U. Amaldi, K.R. Schubert, Nucl. Phys. B 166 (1980) 301.
[3] B.Z. Kopeliovich, I.K. Potashnikova, B. Povh, E. Predazzi, Phys. Rev. D 63 (2001) 054001.
3 Strictly speaking, this bound is only valid for models for which the condition (14) is enforced for the
longitudinally polarized vector meson.
[4] N.N. Nikolaev, B.G. Zakharov, Z. Phys. C 49 (1991) 607;

N.N. Nikolaev, B.G. Zakharov, JETP 78 (1994) 598.
[5] A.H. Mueller, Nucl. Phys. B 415 (1994) 373;
A.H. Mueller, B. Patel, Nucl. Phys. B 425 (1994) 471;
A.H. Mueller, Nucl. Phys. B 437 (1995) 107.
[6] J.D. Bjorken, J.B. Kogut, D.E. Soper, Phys. Rev. D 3 (1971) 1382.
[7] A.H. Mueller, Eur. Phys. J. A 1 (1998) 19.
[8] L.D. Landau, E.M. Lifshitz, Quantum Mechanics, Mir, 1966.
[9] H.G. Dosch, T. Gousset, G. Kulzinger, H.J. Pirner, Phys. Rev. D 55 (1997) 2602;
G. Kulzinger, H.G. Dosch, H.J. Pirner, Eur. Phys. J. C 7 (1999) 73.
[10] J. Nemchik, N.N. Nikolaev, B.G. Zakharov, Phys.Lett. B 341 (1994) 228.
[11] J. Nemchik, N.N. Nikolaev, E. Predazzi, B.G. Zakharov, Z. Phys. C 75 (1997) 71.
[12] M. McDermott, L. Frankfurt, V. Guzey, M. Strikman, Eur. Phys. J. C 16 (2000) 641.
[13] M. Rueter, H.G. Dosch, Phys. Rev. D 57 (1998) 4097.
[14] J.R. Forshaw, G. Kerley, G. Shaw, Phys. Rev. D 60 (1999) 074012.
[15] G. Cvetic, D. Schildknecht, A. Shoshi, Acta Phys. Polon. B 30 (1999) 3265.
[16] A. Capella, E.G. Ferreiro, C.A. Salgado, A.B. Kaidalov, Nucl. Phys. B 593 (2001) 336;
A. Capella, E.G. Ferreiro, C.A. Salgado, A.B. Kaidalov, Phys. Rev. D 63 (2001) 054010.
[17] S.J. Brodsky, L. Frankfurt, J.F. Gunion, A.H. Mueller, M. Strikman, Phys. Rev. D 50 (1994)
3134.
[18] L. Frankfurt, W. Koepf, M. Strikman, Phys. Rev. D 54 (1996) 3194.
[19] A. Caldwell, M.S. Soares, hep-ph/0101085.
[20] ZEUS Collaboration, Eur. Phys. J. C 6 (1999) 603.
[21] H1 Collaboration, Eur. Phys. J. C 13 (2000) 371.
[22] ZEUS Collaboration, Study of the diffractive production of vector mesons at large Q2 or at
large |t| at HERA, EPS 1999, Tampere.
[23] D.Yu. Ivanov, Phys. Rev. D 53 (1996) 3564;
I.F. Ginzburg, D.Yu. Ivanov, Phys. Rev. D 54 (1996) 5523.
[24] A.H. Mueller, Lectures given at International Summer School on Particle Production Spanning MeV and TeV Energies (Nijmegen 99), Nijmegen, Netherlands, 8–20 August 1999, and at
17th Autumn School: QCD: Perturbative or Nonperturbative? (AUTUMN
99), Lisbon, Portugal, 29 September–4 October 1999, hep-ph/9911289.
[25] A.H. Mueller, Nucl. Phys. B 335 (1990) 115.
[26] A.L. Ayala Filho, M.B. Gay Ducati, E.M. Levin, Nucl. Phys. B 493 (1997) 305;
A.L. Ayala Filho, M.B. Gay Ducati, E.M. Levin, Nucl. Phys. B 511 (1998) 355.
Nuclear Physics B 603 [PM] (2001) 449–496

The many faces of Ocneanu cells

V.B. Petkova a,b , J.-B. Zuber c,1
a Institute for Nuclear Research and Nuclear Energy, 72 Tzarigradsko Chaussee, 1784 Sofia, Bulgaria
b School of Computing and Mathematics, University of Northumbria, NE1 8ST Newcastle upon Tyne, UK
c TH Division, CERN, CH-1211, Genève 23, Switzerland
Received 8 February 2001; accepted 12 March 2001
Abstract
We define generalised chiral vertex operators covariant under the Ocneanu double triangle
algebra A, a novel quantum symmetry intrinsic to a given rational 2d conformal field theory. This
provides a chiral approach, which, unlike the conventional one, makes explicit various algebraic
structures encountered previously in the study of these theories and of the associated critical lattice
models, and thus allows their unified treatment. The triangular Ocneanu cells, the 3j -symbols of the
weak Hopf algebra $\mathcal{A}$, reappear in several guises. With $\mathcal{A}$ and its dual algebra $\hat{\mathcal{A}}$ one associates a pair
of graphs, $G$ and $\tilde{G}$. While $G$ are known to encode complete sets of conformal boundary states, the
Ocneanu graphs $\tilde{G}$ classify twisted torus partition functions. The fusion algebra of the twist operators
provides the data determining $\hat{\mathcal{A}}$. The study of bulk field correlators in the presence of twists reveals
that the Ocneanu graph quantum symmetry also gives information on the field operator algebra.
© 2001 Elsevier Science B.V. All rights reserved.
PACS: 11.25.Hf
1. Introduction
This paper stems from the desire to understand Ocneanu's recent work on quantum
groupoids [1,2], also called, in a loose sense, "finite subgroups of the quantum groups",
and to reformulate and to exploit it in the context of 2d rational conformal field theories
(RCFT). Our approach is inspired by the study of boundary conditions in CFT, either on
manifolds with boundaries, or on closed manifolds (e.g., a torus) where the introduction of
defect lines (or twists) is possible.
E-mail addresses: valentina.petkova@unn.ac.uk (V.B. Petkova), zuber@spht.saclay.cea.fr (J.-B. Zuber).
1 On leave from: SPhT, CEA Saclay, F-91191 Gif-sur-Yvette.
PII: S0550-3213(01)00096-7
In Boundary CFT (BCFT), the type of boundary states and the corresponding character
multiplicities in cylinder partition functions are conveniently encoded in a graph (or a set
of graphs) G [3], with vertices denoted by a, b. More precisely the adjacency matrices of
the graphs are given by a set of (nonnegative integer valued) matrices $n_i = \{n_{ia}{}^b\}$ forming
a representation of the Verlinde fusion algebra
$$ n_i \, n_j = \sum_k N_{ij}{}^k \, n_k, \qquad (1.1) $$
and it is usually sufficient to specify only a fundamental subset of them, which generates
the other through fusion.
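A concrete check of (1.1) may help. The sketch below takes the $D_4$ Dynkin diagram with the level-4 sl(2) fusion rules, generates the matrices $n_i$ from the fundamental one through the standard sl(2) recursion $n_{i+1} = n_2 n_i - n_{i-1}$, and verifies the representation property. The choice of example, the truncated Clebsch–Gordan form of the fusion rules and all function names are assumptions made for illustration.

```python
def su2_fusion(level):
    """N[i][j][k] (0-based labels) for sl(2) at the given level."""
    h = level + 2
    size = level + 1
    coeff = [[[0] * size for _ in range(size)] for _ in range(size)]
    for i in range(1, h):
        for j in range(1, h):
            low = abs(i - j) + 1
            high = min(i + j - 1, 2 * h - i - j - 1)
            for k in range(low, high + 1, 2):
                coeff[i - 1][j - 1][k - 1] = 1
    return coeff

def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]

def fused_adjacency(graph, level):
    """n_1 = identity, n_2 = graph, n_{i+1} = graph * n_i - n_{i-1}."""
    size = len(graph)
    identity = [[int(r == c) for c in range(size)] for r in range(size)]
    mats = [identity, [row[:] for row in graph]]
    for _ in range(level - 1):
        prod = matmul(graph, mats[-1])
        mats.append([[prod[r][c] - mats[-2][r][c] for c in range(size)]
                     for r in range(size)])
    return mats

def is_fusion_representation(mats, coeff):
    """Check n_i n_j = sum_k N_ij^k n_k for all i, j."""
    size = len(mats[0])
    for i, n_i in enumerate(mats):
        for j, n_j in enumerate(mats):
            rhs = [[sum(coeff[i][j][k] * mats[k][r][c] for k in range(len(mats)))
                    for c in range(size)] for r in range(size)]
            if matmul(n_i, n_j) != rhs:
                return False
    return True

# D4 Dynkin diagram: vertex 0 is the central node, 1..3 are the legs.
D4 = [[0, 1, 1, 1], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0]]
n = fused_adjacency(D4, 4)
ok = is_fusion_representation(n, su2_fusion(4))
```

The same check applied to the Verlinde matrices themselves (the regular representation) also passes, which corresponds to the diagonal case.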
In accordance with these data we define generalised chiral vertex operators (GCVO),
covariant under Ocneanu's double triangle algebra (DTA) $\mathcal{A}$, a finite-dimensional $C^*$
weak Hopf algebra (WHA) in the axiomatic setting of [4]. They can be looked at as
extensions to the complex plane of the boundary fields and at the same time they yield
a precise operator meaning to these fields. The fact that the GCVO have nontrivial braiding
allows one to give a global operator definition of the half-plane bulk fields, described in the
traditional approach only through their small distance (vanishing imaginary coordinate)
expansion. The bulk fields are defined as compositions of two generalised or conventional
CVO, which makes the construction of their correlators and the derivation of the equations
they satisfy straightforward.
The 3j-symbols $^{(1)}F$ of the Ocneanu quantum symmetry, also called "cells", reproduce
the boundary field operator product expansion (OPE) coefficients, while the 6j-symbols $F$
coincide with the fusing matrices, i.e., the OPE coefficients of the conventional CVO. In
the diagonal theories, in which each local field is left-right symmetric, there is a one to
one correspondence between the set I and the spectrum of orthonormal boundary states;
then (1.1) is realised by the Verlinde matrices themselves and the two symbols $F$ and $^{(1)}F$
coincide. The 3j-symbols diagonalise the braiding matrices of the generalised CVO (the
R matrix of the quantum symmetry). These new braiding matrices are identified with the
Boltzmann weights (in the limit $u \to \pm i\infty$ of their spectral parameter) of the critical $\widehat{sl}(n)$
lattice models which generalise the Pasquier ADE lattice models and their fused versions.
Once again the 3j-symbols provide the basic ingredients of these models. In particular
their identification with the Ocneanu intertwining cells gives some new solutions for the
boundary field OPE coefficients in the exceptional $E_r$ cases of $\widehat{sl}(2)$ theories; for the A
and D-series these constants were computed in [5,6].
Through a discussion parallel to that of boundary states, one may also study the allowed
twists (or defect lines) on a torus. The compatibility with conformal invariance and
a duality argument similar to Cardy's consistency condition [7] restrict the multiplicities
$\tilde{V}_{ii'} = \{\tilde{V}_{ii';x}{}^y\}$ of occurrence of representations $(i, i')$ in the presence of twists x, y, to
be now nonnegative integer valued matrix representations of the squared Verlinde fusion
algebra [8]:
$$ \tilde{V}_{ii'} \, \tilde{V}_{jj'} = \sum_{k,k'} N_{ij}{}^k \, N_{i'j'}{}^{k'} \, \tilde{V}_{kk'}, \qquad \tilde{V}_{ij^*;1}{}^1 = Z_{ij}, \qquad (1.2) $$
where $Z_{ij}$ is the modular invariant matrix. Pairs $\tilde{V}_{i1}$, $\tilde{V}_{1i}$ of these matrices give rise
to another graph $\tilde{G}$ with vertices x, y [1,2]. Combining the concepts of twists and of
boundaries, i.e., inserting twists in the presence of boundaries, leads to yet another set
of multiplicities, $\tilde{n}_x = \{\tilde{n}_{ax}{}^b\}$, which form a matrix realisation of a new, in general noncommutative, fusion algebra:
$$ \tilde{n}_x \, \tilde{n}_y = \sum_z \tilde{N}_{xy}{}^z \, \tilde{n}_z, \qquad (1.3) $$
$$ \tilde{N}_x \, \tilde{N}_y = \sum_z \tilde{N}_{xy}{}^z \, \tilde{N}_z. \qquad (1.4) $$
This algebra admits an interpretation as the algebra of the twist operators used in the
construction of the partition functions in [8]. It is associated with the Ocneanu graph $\tilde{G}$
in the sense of the relation
$$ \tilde{V}_{ij} \, \tilde{N}_x = \sum_z \tilde{V}_{ij;x}{}^z \, \tilde{N}_z, \qquad (1.5) $$
and we shall also refer to it as the $\tilde{G}$ graph algebra. In the cases described by
a block-diagonal modular invariant (a diagonal invariant of an extended theory) it
possesses subalgebras interpreted as graph algebras of the chiral graph G, and furthermore
a subalgebra identified with the extended fusion algebra. We find, extending the analysis
in [8] to correlators in the presence of twists, that the representations of (1.4) are closely
related to the operator product algebra of the physical local fields of arbitrary spin.
In this approach, we see repeated manifestations of the quantum algebra $\mathcal{A}$ and of
its dual algebra $\hat{\mathcal{A}}$, both satisfying the axioms of the WHA of [4]. The structure as
a whole is maybe most easily described in the combinatorial terms of Ocneanu quantum
(co)homology [4] (see also the related notion of 2-category in [9] 2 ). The latter considers
simplicial 3-complexes built out of the elements depicted on Fig. 1. There are three types
of oriented 1-simplices and the triangular 2-simplices come with multiplicities. Each
tetrahedral 3-simplex (arrows omitted in Fig. 1) is assigned a C-valued 3-chain, subject
to a set of pentagon relations (the "Big Pentagon" of [4]); the middle tetrahedron $^{(2)}F$
appears with its inverse, $^{(2)}\tilde{F}$, while $F$, $^{(1)}F$, $^{(1)}\tilde{F}$, $\tilde{F}$ can be chosen unitary. These data
enable one to construct on an abstract level $\mathcal{A}$ and its dual $\hat{\mathcal{A}}$, which are matrix algebras
with basis elements represented by two sets of double triangles, see Fig. 2, related, up to
a constant, by $^{(2)}F$.
Fig. 1. The simplices.
Fig. 2. The double triangles.
2 We thank A. Wassermann for pointing this out to us.
In the present context, the 1-simplices are labelled by the finite set I of representations
of the Verlinde fusion algebra and by the sets $V$ and $\tilde{V}$ of vertices of graphs $G$ and
$\tilde{G}$ of cardinality $|V| = \mathrm{tr}(Z)$ and $|\tilde{V}| = \mathrm{tr}(Z^t Z)$, respectively, where Z is the modular
invariant matrix. Each triangular 2-simplex comes with a multiplicity label, $t = 1, \dots, N_{ij}{}^k$,
or an analogous label running over $1, \dots, n_{ia}{}^c$, over $1, \dots, \tilde{n}_{ax}{}^c$, or over $1, \dots, \tilde{N}_{xy}{}^z$, and these multiplicities are subject
to the relations (1.1)–(1.5). The first two tetrahedra on Fig. 1 represent the 6j- and the
3j-symbols $F$, $^{(1)}F$ discussed above.
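The two cardinalities quoted above are easy to evaluate for a given modular invariant. The sketch below uses, as an assumed example, the $D_4$ invariant of sl(2) at level 4, $Z = |\chi_1 + \chi_5|^2 + 2|\chi_3|^2$; the helper names are hypothetical.

```python
def trace(matrix):
    return sum(matrix[i][i] for i in range(len(matrix)))

def transpose_times(matrix):
    """Return Z^t Z for a square integer matrix Z."""
    size = len(matrix)
    return [[sum(matrix[m][r] * matrix[m][c] for m in range(size))
             for c in range(size)] for r in range(size)]

# D4 modular invariant (labels 1..5 stored 0-based): |chi_1 + chi_5|^2 + 2 |chi_3|^2
Z_D4 = [[0] * 5 for _ in range(5)]
for i in (0, 4):
    for j in (0, 4):
        Z_D4[i][j] = 1
Z_D4[2][2] = 2

num_vertices_G = trace(Z_D4)                         # |V| = tr(Z)
num_vertices_G_tilde = trace(transpose_times(Z_D4))  # |V~| = tr(Z^t Z)
```

Here tr(Z) = 4 reproduces the number of vertices of the $D_4$ diagram, and tr($Z^t Z$) = 8 is the corresponding number of twist labels; for a diagonal invariant both traces equal the number of representations.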
Thus Ocneanu's double triangle algebra, which is attached specifically to each 2d CFT
and governs many of its aspects (spectrum multiplicities, structure constants, lattice
realisations), appears as its natural "quantum symmetry". The problem of identifying the
underlying quantum symmetry of a given CFT is by no means new. Several attempts and
partial answers were achieved at the end of the 80s and beginning of the 90s, see the
discussion below in Sections 4 and 5. The previous approaches dealt with the chiral CFT, or
equivalently, with the diagonal theories. These are also the only examples of CFT discussed
in [4], where the relevance of the WHA as a quantum symmetry was first proposed in the
framework of algebraic QFT; in these diagonal cases the four triangle multiplicities above
coincide with the Verlinde fusion multiplicities Nij k and accordingly all the tetrahedra on
Fig. 1 reduce to the RCFT fusing matrix F . The development of BCFT on the one hand
and the work of Ocneanu on the other made available new tools and new ideas; our present
considerations yield in particular explicit and nontrivial examples of the structure of WHA.
The main novelty of the WHA approach is that it has a coassociative coproduct consistent
with the CFT fusion rules (the Ocneanu horizontal product). The presence of boundaries
provides an extension of the Hilbert space of the theory consistent with the fusion rules
and basic axioms of the RCFT. At the same time it should be stressed that the parallel
with the previous discussions on the hidden quantum group symmetry is to some extent
superficial, or deceptive, since this is only one of the facets of the Ocneanu symmetry; in
contrast to the former the new approach encompasses the full structure of the 2d CFT, so
is much richer in content and applications.
We should not conceal, however, that our understanding is still fragmentary. The
determination of the cells and of the remaining tetrahedra of Fig. 1 from the complicated
set of equations they satisfy poses a difficult technical problem and only partial results in
the $\widehat{sl}(2)$ related models are known. Some of the previous quantities, related in particular
to the dual structure of the DTA, are still awaiting a better field theoretic interpretation.
Moreover, several of our results are conjectures, tested mainly on the case of $\widehat{sl}(2)$, but lack
a general proof. On several of these points, it seems that the approach based on the theory of
subfactors [1,10,11] is more systematic. Still, our field theoretic approach provides explicit
realisations and exposes some new facts which show the consistency of the whole picture.
This paper is organised as follows: after a brief summary of notations (Section 2) we
introduce the double triangle algebra (Section 3), then define the GCVO and discuss their
fusing and braiding properties (Section 4). In Section 5 we show how the bulk fields may
be expressed in terms of GCVO and how the equations they satisfy and the various OPE
coefficients may be rederived in a more systematic way. Section 6 discusses briefly the
relation to the lattice models and the determination of their Boltzmann weights in terms
of the cells. Finally, Section 7 deals with the construction of solutions of (1.2)–(1.5) and
of the resulting Ocneanu graphs $\tilde{G}$ and contains a derivation of a formula relating the
OPE coefficients of arbitrary spin fields to data of the graph. Details are relegated to two
appendices. Sections 5, 6 and 7 may be read independently of one another.
Preliminary accounts of this work have been reported at several conferences (ICMP,
London, July 2000; 24th Johns Hopkins Workshop, Budapest, August 2000; TMR Network
Conference, Paris, September 2000 [12]; Kyoto Workshop on Modular Invariance, ADE,
Subfactors and Geometry of Moduli Spaces, November, 2000) or have been published
separately [8]. It should be stressed that this work was strongly influenced by Ocneanu's
(unfortunately unpublished) work and that many of the concepts and results presented here
originate in his work.
2. Notations
A rational conformal field theory is conventionally described by data of different nature:
• Chiral data specify the chiral algebra $\mathcal{A}$ and its finite set I of irreducible representations $V_i$, $i \in I$, the characters $\chi_i(q) = \mathrm{tr}_{V_i} \, q^{L_0 - c/24}$, the unitary and symmetric matrix $S_{ij}$ of modular transformations of the $\chi$, the fusion coefficients $N_{ij}{}^k$, $i, j, k \in I$,
assumed to be given by Verlinde formula
$$ N_{ij}{}^k = \sum_{l \in I} \frac{S_{il} \, S_{jl} \, S^*_{kl}}{S_{1l}}. \qquad (2.1) $$
Our convention is that the label i = 1 refers to the vacuum representation, and $V_{i^*}$
denotes the representation conjugate to Vi . Chiral vertex operators ijt (z) and their
fusion and braiding matrices Fpt [ ikjl ] and Bpt [ ki jl ] are also part of the set of chiral
data.
• Spectral data specify which representations of $\mathcal{A} \otimes \mathcal{A}$ appear in the bulk: these data
are usually conveniently encoded in the partition function on a torus, with the property
of modular invariance
$$ Z = \sum_{i,j \in I} Z_{ij} \, \chi_i(q) \, \big(\chi_j(q)\big)^*; \qquad (2.2) $$
here, the integer $Z_{ij}$ specifies the multiplicity of occurrence of $V_i \otimes \bar{V}_j$ in the Hilbert
space of the theory; unicity of the vacuum is expressed by $Z_{11} = 1$.
• Finally these spectral data must be supplemented by data on the structure constants
of the Operator Product Algebra (OPA). This last set of data is the one which is most
difficult to determine as it results from the solution of a large system of nonlinear
equations involving the braiding matrices whose general form is in general unknown.
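The Verlinde formula (2.1) listed among the chiral data can be evaluated explicitly in a simple case. The sketch below assumes the standard sl(2) level-k modular matrix $S_{ij} = \sqrt{2/(k+2)} \, \sin(\pi i j/(k+2))$, which is real so that $S^* = S$; this choice of example and the function names are illustrative.

```python
import math

def su2_modular_s(level):
    """Modular S matrix of sl(2) at the given level, labels 1..level+1."""
    h = level + 2
    return [[math.sqrt(2.0 / h) * math.sin(math.pi * i * j / h)
             for j in range(1, h)] for i in range(1, h)]

def verlinde_fusion(s):
    """N[i][j][k] = sum_l S_il S_jl S_kl / S_1l for a real S matrix (0-based labels)."""
    size = len(s)
    return [[[round(sum(s[i][l] * s[j][l] * s[k][l] / s[0][l] for l in range(size)))
              for k in range(size)] for j in range(size)] for i in range(size)]

fusion_level2 = verlinde_fusion(su2_modular_s(2))
```

At level 2 the three labels reproduce the familiar rules: the product of the middle representation with itself contains the first and the third, each with multiplicity one.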
It has been recognized some time ago that these spectral and OPA data have to do with
graphs. The latter (ADE Dynkin diagrams and their generalizations) (i) encode in the
spectrum of their adjacency matrices the spectral data [13–15]; (ii) contain, through the
so-called Pasquier algebra, information on the OPA structure constants, see [16–18] and
below, Section 7. In fact these graphs are nothing else than the graphs of adjacency matrices
$n_i$ of (1.1). These matrices $n_i$ are diagonalisable in a common orthonormal basis:
$$ n_{ia}{}^b = \sum_{j \in \mathrm{Exp}} \frac{S_{ij}}{S_{1j}} \, \psi_a^j \, \psi_b^{j*} \qquad (2.3) $$
and obey the identities
$$ n_{ia}{}^b = n_{i^* a^*}{}^{b^*} = n_{i^* b}{}^a. \qquad (2.4) $$
Here and throughout this paper, we make use of the notation Exp to denote the terms
appearing in the diagonal part of the modular invariant (2.2)
$$ \mathrm{Exp} = \{ (j, \alpha), \ \alpha = 1, \dots, Z_{jj} \}. \qquad (2.5) $$
The two notations $(j, \alpha)$ and $j$, $j \in \mathrm{Exp}$, will be used interchangeably. In the following,
$\psi^1$ refers to the Perron–Frobenius eigenvector, whose components are all positive. Finally,
in (2.4), the conjugation of vertices $a \to a^*$ is defined through $\psi_{a^*}^j = (\psi_a^j)^* = \psi_a^{j^*}$.
A particular set of matrices n is provided by the Verlinde matrices N themselves, which
form the regular representation of the fusion algebra. This is the diagonal case for which
Exp = I and the corresponding torus partition function is simply given by $Z_{ij} = \delta_{ij}$.
3. Ocneanu graph quantum algebra

Given a solution of Eq. (1.1) consider an auxiliary Hilbert space V j
Cmj with
=

j,
b
b
basis states |eba , = 1, 2, . . . , nj a . It has dimension mj = a,b nj a = a,b, 1, in

particular dim V 1 = tr(n1 ) = |V|. A scalar product in j I V j is defined as

j, j ,
Sj 1
Pa Pb
1

eba eb a = bb aa jj
(3.1)
, dj =
, Pa = a1 .
dj
S11
1
i,
We define the tensor product decomposition of states |ecb
|eb a for coinciding b = b
according to
j,
c ij k

ka N

i,
j, n
i
(1)
e h e
=
F
bk
cb
ba
c
kI =1 t =1
j
a
t

Pb
dk
di dj
1/4
k,

e (ij ; t) .
ca
(3.2)
This is a truncated tensor product, in the sense that we restrict to a subspace $V^i\otimes_h V^j$ of
$V^i\otimes V^j$, $(cb)\otimes_h(b'a) = \delta_{bb'}\,(cb)\otimes(b'a)$, with $\dim(V^i\otimes_h V^j) = \sum_{a,c}(n_in_j)_a{}^{c}\le m_im_j$.
The multiplicity of $V^k$ in $V^i\otimes_h V^j$ is identified with the Verlinde multiplicity $N_{ij}{}^{k}$. Then
the counting of states in both sides of (3.2) is consistent, taking into account (1.1). In (3.2)
$|e^{k,\beta}_{ca}(ij;t)\rangle$ give a basis, normalised as in (3.1), for the space $V^k$ in
$$V^i\otimes_h V^j = \oplus_k\, N_{ij}{}^{k}\,V^k. \qquad (3.3)$$
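The counting of states behind (3.3) can be checked numerically. The sketch below is illustrative only and makes the following assumptions, none of which is taken from the text: the graph is $D_4$ ($h = 6$), the matrices $n_i$ are generated by the sl(2) recursion $n_{i+1} = n_2n_i - n_{i-1}$, and the Verlinde multiplicities are given by the truncated Clebsch–Gordan rule coded in the hypothetical helper `fusion`. It compares $\sum_{a,c}(n_in_j)_a{}^{c}$ with $\sum_k N_{ij}{}^{k} m_k$ and with the bound $m_im_j$.

```python
# D4 Dynkin diagram: vertex 0 is the centre, vertices 1, 2, 3 are the leaves.
G = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]
h = 6  # Coxeter number of D4

def matmul(A, B):
    n = len(A)
    return [[sum(A[x][y] * B[y][z] for y in range(n)) for z in range(n)]
            for x in range(n)]

def n_matrices(G, h):
    """n_1 = identity, n_2 = G, n_{i+1} = G n_i - n_{i-1}; returns [n_1, ..., n_{h-1}]."""
    size = len(G)
    ns = [[[int(a == b) for b in range(size)] for a in range(size)], G]
    while len(ns) < h - 1:
        prod = matmul(G, ns[-1])
        ns.append([[prod[a][b] - ns[-2][a][b] for b in range(size)]
                   for a in range(size)])
    return ns

def fusion(i, j, k, h):
    """Assumed sl(2) fusion rule at level h-2, labels 1..h-1 (truncated Clebsch-Gordan)."""
    lo, hi = abs(i - j) + 1, min(i + j - 1, 2 * h - 1 - i - j)
    return int(lo <= k <= hi and (k - lo) % 2 == 0)

n = n_matrices(G, h)
m = [sum(sum(row) for row in nj) for nj in n]   # m_j = dim V^j

counts = {}
for i in range(1, h):
    for j in range(1, h):
        truncated = sum(sum(row) for row in matmul(n[i - 1], n[j - 1]))  # dim(V^i x_h V^j)
        decomposed = sum(fusion(i, j, k, h) * m[k - 1] for k in range(1, h))
        counts[(i, j)] = (truncated, decomposed, m[i - 1] * m[j - 1])

print(counts[(3, 3)])
```

For every pair the first two numbers agree and do not exceed the third, which is the dimension of the untruncated product $V^i\otimes V^j$.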
The ${}^{(1)}F\in\mathbb{C}$ are Clebsch–Gordan coefficients (3j-symbols), assumed to satisfy the
conditions:
if one of the indices i or j is equal to 1, the tensor product must trivialise and
accordingly

t
1 j
(1)
(3.4)
Fbk
= kj bc t 11 ;
c a
(1) F
the unitarity conditions, expressing the completeness and orthogonality of the bases
in V i h V j

t

t

i j
i j
(1)
(1)
F bk
F bk
= kk t t ,
c a
c a
b,,

t

t

i j
i j
(1)
(1)
(3.5)
F b k
F bk
= bb ,
c a
c a
k,,t
where ${}^{(1)}F^{*}$ is the complex conjugate of ${}^{(1)}F$.

In the original (combinatorial) realisation of [1] (for the ADE graphs of the case sl(2)),
V j is the linear space of essential paths of length j on the graph G. Then (3.2) is
interpreted as a composition of essential paths, which is not an essential path in general,
but is a linear combination of such paths.
The requirement of associativity of the product (3.2) leads to the mixed pentagon
relation
$$F\;{}^{(1)}F\;{}^{(1)}F = {}^{(1)}F\;{}^{(1)}F,$$
or, more explicitly

u u

t

i j 2 3 (1)
i m 1 2 (1)
j
Fmp
F bl
F cm
l k t2 t3
a d 1 2
b
m,2 ,t3 ,t2

u

u

p k 1 2 (1)
i j 1 3
(1)
=
F cl
F bp
.
a d 1 3
a c 1 2
(3.6)
k
d
2 t3
2 3
(3.7)
Here F is the matrix (the 6j-symbols), unitary in the sense of the analogue of (3.5),
relating the two bases in $V^i\otimes_h V^j\otimes_h V^k$. To make contact with the standard notation (cf.,
e.g., [19]),
Fig. 3. Graphical representation of the (1) F 3j -symbols, of their orthogonality relations, of the
6j -symbols and of the pentagon identity (3.7). Factors depending on Pa and dj have been omitted.
Fmp
i
l

j
i
=
k
k
j
l

p
.
m
(3.8)
There is a gauge freedom in ${}^{(1)}F$ due to the arbitrariness in the choice of basis, $e^{j,\beta}_{cb}\to\sum_{\beta'} U^{j,\beta,\beta'}_{cb}\,e^{j,\beta'}_{cb}$, where U is an arbitrary unitary matrix. It is useful to have a graphical
notation for the 3j-symbols ${}^{(1)}F$ by means of triangles, and for the 6j-symbols F by
means of tetrahedra (see Fig. 3). Then relations (3.4)–(3.7) are simply depicted. 3 In this
graphical representation, the gauge freedom consists in changing any edge $b\xrightarrow{\,j,\beta\,}c$ by
a unitary matrix $U^{j,\beta,\beta'}_{cb}$.
3 The reader should not be confused by the multiplicity of graphical representations used in this paper for
the same objects. It turns out that depending on the question, a different representation may be clearer or more
profitable. The triangles used here for the cells may be regarded as obtained from the tetrahedra of Fig. 1 with
three black vertices and one white vertex by projecting the three edges with their label a, b, c on the triangle with black vertices.
Likewise, in the representation of Fig. 1, the pentagon identity (3.7) is depicted by the two ways of cutting
The pentagon equation (3.7) can be solved for F given the 3j-symbols ${}^{(1)}F$ and using
the unitarity relation (3.5). Conversely, the relation (3.7) can be interpreted, given F, as
a (recursive) relation for ${}^{(1)}F$. In fact, the matrix F is taken to be the matrix ${}^{(1)}F$ of
the diagonal case ($n_i = N_i$), as we identify in that case the 3j- and the 6j-symbols and
Eq. (3.7) coincides then with the standard pentagon identity for F
$$F\,F\,F = F\,F. \qquad (3.9)$$
We can also look at (3.7) and its solutions ${}^{(1)}F$ as providing more general realisations of
the pentagon identity (3.9), corresponding to the matrix representations $n_i$ of the Verlinde
fusion algebra (1.1). If we consider along with the states $|e^{j,\beta}_{ca}\rangle$ (triangles with one white
and two black vertices), the vector spaces of diagonal states $|e^{i,t}_{kj}\rangle$, $t = 1,2,\dots,N_{ij}{}^{k}$ (the
triangles with three black vertices in Fig. 1), we can identify the basis states in the r.h.s.
of (3.2) with the mixed products $|e^{k,\beta}_{ca}\rangle\otimes|e^{i,t}_{kj}\rangle$.
A solution of (3.9) is determined by the chiral data characterising the CFT. For instance,
in the theories based on $\widehat{sl}(2)$, the solution provides the fusing matrices of the CVO and is
known to be given in terms of the 6j -symbols of the quantum algebra Uq (sl(2)), restricted
to matrix elements consistent with the fusion rules. For given F the solution of (3.7) is by
definition restricted by the data in (1.1), (2.3).
In agreement with the symmetry (2.4) we introduce two (commuting) antilinear
involutive maps $V^{j}\to V^{j^*}$, $(e^{j,\beta}_{ca})^{*} = e^{j^*,\beta^*}_{c^*a^*}$, and $(e^{j,\beta}_{ca})^{+} = e^{j^*,\beta^+}_{ac}$. Correspondingly there
are two bilinear forms on $V^{j^*}\otimes V^{j}$ determined by the sesquilinear form (3.1), i.e., two
dual bases in $V^{j^*}$. The first, given by $*$, corresponds to the complex conjugation of the
components of the initial basis in $V^j$ when it is realised through unit vectors in $\mathbb{C}^{m_j}$. The
second basis is determined requiring that

j, j,
dj
1 j , +
j,
h eca = eca eca ,
(3.10)
eaa (j j ) eac
Pc
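which implies
$${}^{(1)}F_{c1}\begin{bmatrix} j^* & j\\ a & a'\end{bmatrix}_{\beta^+\,\beta'}^{1\ \,1} = \delta_{\beta\beta'}\,\delta_{aa'}\sqrt{\frac{P_c}{P_a\,d_j}}. \qquad (3.11)$$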
This is a gauge fixing choice consistent with the unitarity condition (3.5) and the relation
$$d_p\,P_a = \sum_c n_{pa}{}^{c}\,P_c, \qquad (3.12)$$
derived from (2.3). In the diagonal case it coincides with the standard gauge fixing of
the fusing matrices F of the conformal models based on $\widehat{sl}(n)$.
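The relation (3.12) lends itself to a direct numerical check. In the sketch below, which is only an illustration, we again assume the $D_4$ graph with $h = 6$, the sl(2) recursion for the $n_p$, the values $d_p = \sin(p\pi/h)/\sin(\pi/h)$, and we take the distinguished vertex 1 to be one of the leaves; the Perron–Frobenius vector is obtained by a simple power iteration (helper names are ours).

```python
import math

# D4 Dynkin diagram: vertex 0 is the centre, vertices 1, 2, 3 are the leaves.
G = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]
h = 6  # Coxeter number of D4

def matmul(A, B):
    n = len(A)
    return [[sum(A[x][y] * B[y][z] for y in range(n)) for z in range(n)]
            for x in range(n)]

def n_matrices(G, h):
    """n_1 = identity, n_2 = G, n_{i+1} = G n_i - n_{i-1}; returns [n_1, ..., n_{h-1}]."""
    size = len(G)
    ns = [[[int(a == b) for b in range(size)] for a in range(size)], G]
    while len(ns) < h - 1:
        prod = matmul(G, ns[-1])
        ns.append([[prod[a][b] - ns[-2][a][b] for b in range(size)]
                   for a in range(size)])
    return ns

def perron_frobenius(G, steps=500):
    """Power iteration on G + 1; the shift avoids oscillations on a bipartite graph."""
    size = len(G)
    v = [1.0] * size
    for _ in range(steps):
        w = [v[a] + sum(G[a][b] * v[b] for b in range(size)) for a in range(size)]
        top = max(w)
        v = [x / top for x in w]
    return v

n = n_matrices(G, h)
psi = perron_frobenius(G)
P = [x / psi[1] for x in psi]   # P_a = psi_a^1 / psi_1^1, vertex 1 taken to be a leaf
d = [math.sin(p * math.pi / h) / math.sin(math.pi / h) for p in range(1, h)]

# Largest violation of d_p P_a = sum_c n_{pa}^c P_c over all p and a.
max_dev = max(abs(d[p] * P[a] - sum(n[p][a][c] * P[c] for c in range(len(G))))
              for p in range(h - 1) for a in range(len(G)))
print(P, max_dev)
```

The deviation stays at the level of rounding errors, as expected since (3.12) for general p follows from the eigenvalue equation for $n_2$ and the recursion satisfied by both $n_p$ and $d_p$.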
Assuming that on tensor
+
+
+
products (x y) = x y , (x y) = y x , and denoting the dual basis states in the
a double tetrahedron into two or three tetrahedra.
k ,
k , +
tensor product ec a (i j ; t ) := (eca (ij ); t) and eac (j i ; (t )) := (eca (ij ); t)+
(since Njk i = Nijk ), these maps imply the symmetry relations for the 3j -symbols (1) F
(1)
Fbk
i
c
j
a
k,
t

=
(1)
F b k
i
c
j
a
t

k,

=
(1)
F bk
j
a
i
c
while from the pentagon relation taken at l = 1 and (3.11) one derives

t

+
Pb dk (1) i k t
i j
(1)
F bk
=
F cj
.
c a
b a +
Pc dj
+ (t )
+ +
(3.13)
(3.14)

The space $\oplus_{j\in I}\,\mathrm{End}(V^j)$ is a matrix algebra $A = \oplus_{j\in I} M_{m_j}$ on which a second
product (or a coproduct) is defined via the 3j-symbols ${}^{(1)}F$ in (3.2). This is the Ocneanu
double triangle algebra [1], an example (and presumably a prototype) of the notion of
weak $C^*$ Hopf algebra introduced in [4]; this structure has also received the name of
quantum groupoid [1,20]; see also [10,11] for recent developments of the original
Ocneanu approach. Together with its dual algebra, A is interpreted in the present context
as the quantum symmetry of the CFT, either diagonal or nondiagonal. We review below
briefly some basic properties of A and give further details in Appendix A.
The matrix units in Mmj (block matrices in A) are identified with states in V j V j ,
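$$e^{(ca),(c'a')}_{j;\beta,\beta'} = \frac{\sqrt{d_j}}{(P_cP_aP_{c'}P_{a'})^{1/4}}\ |e^{j,\beta}_{ca}\rangle\langle e^{j,\beta'}_{c'a'}|, \qquad (3.15)$$
so that
$$e^{(c'a')(c''a'')}_{k;\beta',\beta''}\,|e^{i,\beta}_{ca}\rangle = \delta_{ik}\,\delta_{aa''}\,\delta_{cc''}\,\delta_{\beta\beta''}\left(\frac{P_aP_c}{P_{a'}P_{c'}}\right)^{1/4}|e^{k,\beta'}_{c'a'}\rangle. \qquad (3.16)$$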
They are depicted as 4-point blocks in Fig. 4, where the states in $V^j$ correspond to
3-point vertices, or, dually, to triangles, whence the name double triangle algebra for the
algebra A spanned by the elements (3.15), $j\in I$. Their matrix (vertical) multiplication
is simply
$$e^{(ca)(c'a')}_{j;\beta,\beta'}\ e^{(d'b')(db)}_{i;\gamma',\gamma} = \delta_{ij}\,\delta_{a'b'}\,\delta_{c'd'}\,\delta_{\beta'\gamma'}\ e^{(ca)(db)}_{j;\beta,\gamma}. \qquad (3.17)$$
The product (3.17) is illustrated on Fig. 5 by composing vertically the blocks representing
the two elements (the second above the first), and a similar picture represents (3.16).
The identity element $1_v$ in A with respect to this multiplication is given by $1_v = \sum_{i,c,b,\beta} e^{(cb)(cb)}_{i;\beta,\beta}$.
A second, horizontal, product is defined [1], composing two blocks horizontally, see
Fig. 6. Its decomposition is inherited from the r.h.s. of the product h in (3.2), and thus
the r.h.s. in Fig. 6 involves the 3j -symbols (1) F and (1) F . The normalisation constant is

d
p;b,b
= cibc cjab /cpac , with cjab = j S111 .
chosen for later convenience as given by gij
Alternatively, we can define [4] a coproduct : A A A
Pa Pb 1
j,
Fig. 4. Two alternative representations of (a) the basis vectors ( P jP )1/4 |eca , (b) the matrix units
a c
(ca),(c a )
ej ;,
.
Fig. 5. The vertical product.
Fig. 6. The horizontal product.
(ca)(c a )
:=
ek,,
i,j
t
(1)
F bk
b,b
, , ,
(cb)(c b )
ei,,
i
c
j
a
(ba)(b a )
ej, ,
t

(1)
F b k
i
c
j
a
t

(3.18)
The unitarity (3.5) of (1) F ensures that (ab) = (a)(b) while the coproduct is
coassociative, ( I d) = (I d ) , whenever there exist a unitary F (in the sense
of the diagonal analogue of (3.5)), satisfying along with (1) F the pentagon identity (3.7).
The star operation in A, (xy)+ = x + y + , is inherited from the map (+) defined above,
(cb)(c b ) +
(bc)(b c )
ei,,
(3.19)
= ei , + , + .
It is a homomorphism of the algebra, i.e., of the vertical product, and an antihomomorphism of the horizontal product, (a h b)+ = b + h a + , so that (a + ) = (a)+ .
The algebra A is given a coalgebra structure defining a counit : A C according to
(ca)(c a )
ej,,
(3.20)
:= j 1 ac a c 1 1 ,
which satisfies the compatibility condition ( I d) = I d = (I d ) .
The definitions (3.18), (3.20) imply, however, that (1v ) = 1v 1v and that the counit
is not a homomorphism of the algebra, (u)(w) = (uw) for general elements u, w A,
i.e., the DTA is not a Hopf algebra, see the appendix for more details on its structure of
weak Hopf algebra; in particular the antipode is defined according to

(cb)(c b )
Pb Pc (b c )(bc)
Pb Pc (c b )(cb) +
S ei,,
(3.21)
=
e
ei , + , + =
.
Pb Pc
Pb Pc i, ,
Using the unitarity (3.5) of the 3j -symbols (1) F , it is straightforward to show that the
elements
1 (cb)(cb)
ei =
(3.22)
e
,
cbc i,,
c,b, i
realise the Verlinde algebra with respect to the horizontal product in A,

ei h ej =
Nij k ek ,
(3.23)
and e1 is the identity matrix of A for that product.

4. Generalised chiral vertex operators
We now return to the field theory. Let $i, j, k\in I$ s.t. $N_{ij}{}^{k}\neq 0$ and let I label
descendent states in $V_i$. The chiral vertex operator $\phi^{k}_{ij,t;I}(z)$, with t being a basis label,
$t = 1,2,\dots,N_{ij}{}^{k}$, is an intertwining operator $V_j\to V_k$ [21]. We tensor this field with an
intertwining operator $V^j\to V^k$
$$P^{k,\beta';j,\beta}_{cb,ab} = \sqrt{\frac{d_j}{P_aP_b}}\ |e^{k,\beta'}_{cb}\rangle\langle e^{j,\beta}_{ab}|, \qquad (4.1)$$
which corresponds to a state in V k h V j . This defines a generalised chiral vertex operator

(GCVO)

Vj V j
Vk V k ,
j I
c
kI
a
i,;
I (z) =
k
ij,t
;I (z)
j,k,t

(1)
F ak
b,,
i
c
j
b
t
k,;j,
Pcb,ab .
(4.2)
The projectors (4.1) satisfy

i,;k,
k , ;j,
i,;j,
Pcb,ab Pa b ,db = bb aa kk Pcb,db ,

k,
k,;j, j ,
Pcb,ab ea b = bb aa jj ecb
,
1 k,;j, 1
edd Pcb,ab ed d = k1 j 1 cd bd bd ad Pa .
(4.3)
From (4.2), (4.3) we have in particular

1
j,
j
c a
= j 1 (0)|0 eca =: |j, ,
j, (0)|0 eaa
= 1, 2, . . . , nj a c ,
(4.4)
where |j, is the explicit form of the highest weight state of the chiral algebra
representation Vj, , augmented with the additional coupling label , used in the

computation of the cylinder partition function in the Hilbert space Ha|c = j, Vj, , [3].
The correlators of the generalised CVO (4.2) are computed projecting on vacuum states
1 in the space V V 1 ; recall that V 1 has a nontrivial dimension |V|. Since
|0 |eaa
1
1,1;1,1
1,1;1,1
Pab,db = ab bd Paa,aa
, the first and the last labels of any n-point correlator coincide,
i.e., we can associate with it a closed path {a, a1, . . . , an1 , a} with elements marked by
the graph indices and passing through the coordinate points z1 , . . . , zn . E.g., the 2-point
correlator reads

1 1
a c

j
j
j
a
(4.5)
j , (z1 )c j,
(z2 ) a = (1) F c1
Pa 0|j1 j (z1 )j 1 (z2 )|0.
a a
For real arguments one recovers the correlators of the boundary fields. Note that the
normalisation of boundary field correlators following from (3.1) differs from that used
6c TL
= Pa .
in [3], (Eq. (4.7)) by a factor 11 / S11 , i.e., 1a = S11
1 2 limL/T Z1|a e
(1 )
The algebra A acts on the operators (4.2) with the help of the antipode (3.21), namely
(c a )(c a )
for ep;
= e(1) e(2) we define a representation (e)
,
(c a )(c a ) c a
c
a
i, (z) : = e(1)
i,
(z)S(e(2) )
ep; ,

Pa Pc 1/4 c a
= ip aa cc
i, (z).
Pa Pc
(4.6)
Definition (4.2) has to be compared with earlier work [22,23] based on the use
of quantum groups (Hopf algebras), or some related versions, e.g., [24], obtained by
modifying the standard Hopf algebra axioms (see [4] for a discussion on the latter and
further references). The papers [22–24] deal essentially with the diagonal case, and, more
importantly, exploit the true $U_q(g)$ 3j-symbols at roots of unity (e.g., for g = sl(2)) in
formulae analogous to (4.2). We stress again that the 3j-symbols ${}^{(1)}F$ in (4.2) and (3.2)
reduce in the diagonal case to the 6j-symbols of the quantum groups $U_q(g)$ (or products of
them), restricted to labels consistent with the CFT fusion rules. Thus the decomposition of
the Ocneanu horizontal product fits precisely the CFT fusion, without the need of additional
truncation as in the case of quantum group representations. As emphasized in [4], unlike the
alternative approaches which deviate from the standard Hopf algebras, the use of a WHA
as a quantum symmetry retains coassociativity reflected in (3.7). A price to be paid is the
multiplicity of vacua, which has, however, a physical interpretation in BCFT, as providing
a complete set of conformal boundary states.
From the operator representation (4.2) one derives various identities. In particular
inserting the r.h.s. of (4.2) in the product of two generalised CVO, then applying the OPE
for the standard CVO and finally using the pentagon identity (3.7), and once again the
representation (4.2), we derive for small z12 the OPE
c
b
i,
(z1 )b j, (z2 )

i
(1)
=
F bp
c
p,,t

i
(1)
F bp
=
c
p,,t
j
a
j
a
t

t
p, |ij ;t (z12 )|j, 0c ap,; (z2 )

p
p, 0|ij ;t (z12 )|j, 0c ap, (z2 ) + .
(4.7)
For arguments restricted to the real line one recovers the boundary field small distance
expansion [25] with OPE coefficients given by the 3j -symbols of (3.2). Conversely, the
expansion (4.7) was the starting point in [3] for the derivation of the pentagon identity (3.7).
Denote by c Uja the space of generalised CVO (4.2). The generalised CVO have
a nontrivial braiding defined through a new braiding matrix with 4 + 2 indices of two
types,
d
b
a
a
c
c
:
U i bU j
Uj dUi ,
B(@)
(4.8)
b
a
b
i,
(z1 )b j, (z2 ) =

d, ,

i

Bbd
c

j
a
(4.9)
21 (@) = 1,
12 (@)B
B
consistently with the commutativity of the intertwiners

nib c nj a b =
nj d c nia d .
b
(@)c dj, (z2 )d i, (z1 ),
(4.10)
(4.11)
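In (4.9) $z_{12}\notin\mathbb{R}$, and $\epsilon$ stands for $\epsilon_{12} = \mathrm{sign}(\mathrm{Im}(z_{12}))$ and for i = 1 or j = 1 the matrix
$\hat B$ is trivial. The braiding matrices $\hat B$ satisfy the Yang–Baxter (YB) equation
$$\hat B_{12}(\epsilon_{12})\,\hat B_{23}(\epsilon_{13})\,\hat B_{12}(\epsilon_{23}) = \hat B_{23}(\epsilon_{23})\,\hat B_{12}(\epsilon_{13})\,\hat B_{23}(\epsilon_{12}). \qquad (4.12)$$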
Combining (4.9) with the definition (4.2) of the generalised CVO, using then the braiding
of the standard CVO and projecting on the state |0, we obtain the relation

d,

bd i
B
c

j
a

(@) (1) F dk
j
c
i
a
t

= ei@ij
k

(1)
F bk
i
c
j
a
t
,
(4.13)
where the phase in the r.h.s., depending on the scaling dimensions $\Delta^{k}_{ij} = \Delta_i + \Delta_j - \Delta_k$,
comes from the standard CVO braiding matrix B. In the diagonal case, where we can
identify ${}^{(1)}F$ and $\hat B$ with the standard fusing and braiding matrices, F and B, this relation
is nothing else than the simplest hexagon relation (the q-Racah identity). Inverting (4.13)
we get a bilinear representation of $\hat B$ in terms of ${}^{(1)}F$

i

Bbd
c
j
a

(@) =

(1)
F bk
k,,t
i
c
j
a
t
e
i@kij (1)
F dk
j
c
i
a
t

(4.14)
whenever we know (1) F and the scaling dimensions j , i.e.,

This formula determines B
(1)
bd . It also implies the symmetries
the 3j -symbols Fbi diagonalise the matrix B

k
j k
k j
j

Bbd
(@) = Bdb
(@)
(@) = Bb d
a
c
c a
a c

k
b d j
(@).
=B
(4.15)
c a
The relation (4.13) is a particular case of the more general identity derived by inserting
(4.2) in (4.9) and using the analog of (4.9) for the standard CVO
(1)
(1) F (1)F ,
F (1) F B = B
or, more explicitly

j
(1)
F an
c
nI

c V
cc
B
m
b

i
d

i n
i j
Bnl
d b
k m

j (1)
i m (1)
j
F al
F c k
a
c b
d
(1)
F ck

l
,
b
(4.16)
as illustrated on Fig. 7. For m = 1 we recover (4.13). Eq. (4.16) implies that (products of
two) 3j-symbols ${}^{(1)}F$ intertwine the two representations B and $\hat B$ of the braiding group.
This is to be compared with the cells introduced in lattice models, [26,27,14,28], see
Section 6.
Another relation derived from the product of three GCVO gives a generalisation of
the braiding–fusing identity of Moore–Seiberg [21]
(1) F = B
B
(1) F ,
B
Fig. 7. Eqs. (4.16), (4.17).
or,

1

p

Bcc
a

b ,2 ,3 ,
k
d

1

(1)

(1)
F bp
1 3
F b p
i
c
j
d
i
a
j
c
t
2
1 t
1 2
bc
B
i
a
k
b

1
1 3

j

Bcb
b
k
d
3 2
(4.17)
2 3
In the diagonal case this is the equation from which one obtains (taking d = 1) the relation
between the braiding B and fusing F matrices; inserting this relation back in (4.17)
reproduces the standard pentagon identity for F . In general (4.17) provides a recursive
in the spirit of [29]. Namely, solving it for the B
matrix in the l.h.s., i.e.,
construction of B
(1)
(1)
B
F , we get an equation which determines B
recursively,
= b,b F B
writing it as B
given the subset of 3j -symbols with one of the labels i, j, p fixed to the fundamental
23 B
12 B
23 , i.e., to
representation(s). Using (4.14) the r.h.s of (4.17) can be completed to B
the r.h.s. of the YB equation (4.12). Similarly one derives a second braidingfusing identity
so that its r.h.s. is completed to the l.h.s. of (4.12). Comparing the two identities and using
twice in the l.h.s. of one of them the pentagon identity (3.7) and the unitarity of F , one
recovers the YB relation.
Together with the interpretation of ${}^{(1)}F$ as 3j-symbols, the braiding matrix $\hat B$ can
be interpreted as the R matrix of a quasitriangular WHA, see Appendix A. The
relation (4.14) is an analogue of the representation of the R matrix in terms of the
3j-symbols, while (4.16) is an analogue of the relation between vertex representation and
path representation of the quantum group R matrix (Vertex-IRF correspondence), see [30].
There is an important difference in the analogy with the role of quantum groups in CFT.
Namely in the present approach the summations in all identities, like, e.g., over k in (4.14),
or over n in (4.16), run according to the fusion rules, while in their analogues, where
the true quantum group 3j-symbols appear, these summations run within the standard
classical tensor product bounds. When interpreted in the CFT framework the analogue
of the braiding relation (4.9) is then required to hold only on a physical subspace, or
alternatively, the conformal Hilbert spaces (and the conventional CVO in the definition of
the covariant CVO of [22]) have to be extended to accommodate unphysical intermediate
states incompatible with the fusion rules, [23,31,32], see also the recent work [33] for
a related discussion and further references.
5. Bulk fields chiral representation

Let now the pairs $I = (i,\bar i)$, $i,\bar i\in I$, label the physical spectrum, corresponding to
nonzero matrix elements of the modular mass matrix $Z_{i\bar i}$, or, in a more precise notation,
which we will for simplicity skip in this section, $(i,\bar i;\alpha)$, $\alpha = 1,2,\dots,Z_{i\bar i}$. We define
(upper) half-plane bulk fields as compositions of two GCVO (4.2)

u

i
i
a
H
(i,i ,u)
(1)
a b
)=
Ra, (j ) Fbj
i, (z)b i , (z)
(i,i)
(z, z
a
a

a,b, , j,,u

n, ;l,
n,k,l;t ,t
n
k
ik;t
z)
C(i,
=
(5.1)
(z) (
,a; , Pab ,ab .
i l;t
i)a,b
n,k,l,t,t
a,b , ,
H (z, z ) transforms under

Here z H is the complex conjugate of z H+ and the field (i,
i)
a tensor product representation of one copy of the chiral algebra A labelled by the pairs
(i, i ), see [3,34,35] for discussions of more general gluing conditions. 4 The choice of the
constants in (5.1), related according to

n,k,l;t ,t
C(i,
,a; , =
i)a,b

j,u,u ,
(i,i ,u)
Ra,

(j )(1)F an
j
a
l
b
u

Fkj
i i
n l
u u
t t
(5.2)
is such that when applying for small z z = 2iy the OPE (4.7) for the two CVO in (5.1)
1 ) we recover in the leading order the boundary field a a (x)
(and projecting on |eaa
j,
,u)
(i,i
contributing with the bulk-boundary reflection coefficient Ra,
j,i ,1;u,1
(j ) = C(i,i)a,a,a;,1
4 For convenience we keep the same notation for the half-plane field H (with (i, i ) appearing in the CVO
(i,i)
corresponding
product in the r.h.s. of (5.1)) as for its full-plane counterpart P , with the second label in (i, i)
(i,i)
to a representation of a second copy of the chiral algebra; in our convention the diagonal torus modular invariants
H , i I.
correspond to the fields (i,i)
(i,i,u)
of [25]. (We denote here Ra,
(j ) what was denoted
expressed in terms of the graph eigenvector matrices
Ra(i,i ) (1) =
a; B j ;u
(i,i)
in [3].) For j = 1 it is
ai
ai eii
.
a1 di
(5.3)
From the operator representations (4.2) and (5.1), which involve the two sets of
constants, (1) F and R (or, C), one recovers all correlators of the fields and H ; they are
expressed as linear combinations of standard CVO correlators. E.g., the 2-point function
1 , is
projected on the state |eaa
c b

j, (z2 )IH (z, z ) a
Pa (i,i ,t )
j
i
1
= ab ac
Ra, + (j )0|jj
z)|0.
(5.4)
(z2 ) (z) (
i
1
i
i
;t
dj t
In (5.4) we have adopted an ordering corresponding to real parts increasing from right
to left, i.e., Re(z2 z) > 0. The inverse order would give a function which differs by
a phase (due to the nontrivial braiding of products of CVOs), even if the difference of
scaling dimensions, the spin sI = i i , is (half)integer, as required from the physical
spectrum. The phase vanishes if we furthermore restrict the argument z2 of the generalised
CVO to the real axis boundary of H+ and thus the bulk and the boundary fields commute.
Let us now briefly review the derivation of the equations resulting from the sewing constraints of Cardy–Lewellen [25,36] in the BCFT. The operator representations introduced
here both for the boundary and the bulk fields make these derivations straightforward (in
fact also slightly more general) and reduce them to the use of the fusing and braiding
relations for the conventional CVO. First requiring locality (commutativity) of a bulk and
boundary operators, IH (z, z )a bj (x2 ) = a bj (x2 )IH (z, z ), has further implications, leading to an equation for the unknown constant C in the operator representation (5.1). It reads,
omitting for simplicity the multiplicity indices

n,k,l
j g
i j
(1)

C(i,i)a,b
Fbl
Bll
()
,a
a b
k g
l

k ,l ,g
j k
j i
(1)

Bk k
().
C(i,i)b,b
Fbn
=
(5.5)
,b
n l
a b

k
Projecting the product of two fields on |0, or on 0|, i.e., setting g = 1, or k = 1 in

(5.5), one recovers the (first) bulk-boundary Cardy–Lewellen equation [25,36]; (5.5) is
a slightly more general version of it, corresponding to a 5-point chiral block. This equation
(i,i ,t )
(j ) in terms of the
provides a closed expression for the scalar reflection coefficients Ra,
(1)
3j -symbols F and the modular matrix S(j ) of 1-point torus correlators

(i,i )
i
Pa Ra (j )
k j
b (1)

Ski (j ).
(5.6)
Fak
= Si1
b a
dj R (i,i ) (1)
1i
1
k,b
In the diagonal case the l.h.s. reproduces Sai (j )/S1i [5,3].
H H first expressing
With (5.1) at hand one also derives the OPE of two bulk fields K
L
their product as a product of four standard CVO, exchanging then the second and third
fields and fusing each of the two pairs labelled by (k, l) and (k , l ) (this can be depicted by
a 6-point chiral block diagram slightly more general than Fig. 10 of [3]). In the process one
J ;t,t
. It reads symbolically,
finds an expression for the OPE coefficients, to be denoted dKL
ordering the constants in the l.h.s. in the sequence they appear in the above steps,
F FB()CC = dC,
(5.7)
or
m,n,g
g ,n,i
C(k,k)a,b,a
C(l,l)a,b,a

l k
k
J
(+)Fnj
=
dKLBgg
m
n n
j,j,g

l
k
F
g n j g

l
m,g,i
C(j,j)a,b,a .
i
(5.8)
Setting i = 1 and substituting the constants C with the reflection coefficients R as in (5.2),
this can be also rewritten, introducing a new constant M, as
(k,k;u1 )
(l,l;u2 )
(p1 )Ra,
(p2 )
Ra,
1
2
(j,j ,u3 ;p3 ,3 );a

=
M(k,k,u
;p , )(l,l,u
1
j,j,p3 ,u3 ,3
(j,j;u )
Ra,3 3 (p3 )
;p
,
)
2 2 2
(5.9)
with $u_1 = 1,\dots,N_{k\bar k}{}^{p_1}$, $\alpha_1 = 1,\dots,n_{p_1a}{}^{a}$, etc. This is the second of the two basic bulk-boundary Lewellen equations [36]. In the diagonal case $K = (k,k)$ the OPE coefficients
$d = d^{(H)}$ coincide with their full-plane counterparts $d^{(P)}$ and in the unitary gauge used
here are simply $d^{(P)J;t,t'}_{KL} = \delta_{tt'}$ for $N_{kl}{}^{j}\neq 0$.
Eq. (5.9), taken at $p_1 = p_2 = 1$, allows one to derive and generalise to higher rank cases
(see [37,3]) the empirical sl(2) result of [17] on the coincidence of the relative scalar OPE
coefficients and the structure constants of the Pasquier algebra [16]. The latter algebra
(i,i )
has 1-dimensional representations (characters) given by ai /a1 = eii di Ra (1),
cf. (5.3). A generalisation of this result will be discussed in Section 7.
The reflection coefficients satisfy
Ra(i,i) (j )
(i,i)
= Ra (j )e
j
i i
= Ra(i
,i )
(j )e
j
i i
(5.10)
1
and furthermore (5.8) implies, choosing (the positive) constant dKK
= 1.

a,
2 dk
)
(k,k;u)
(l,l;u
Ra,
(j )Ra,
(j ) a1
= lk lk uu kk
.
dj
(5.11)
The identity (5.11) reduces for j = 1 to the orthonormality property of al (expressing

the completeness of the set of boundary states) and in the diagonal cases to the unitarity
relation for the modular matrices S(j ). In general d (H ) and d (P ) differ by phases depending
on the spins sK = k k ,
(P )J
J
= ei 2 (sK +sL sJ ) dKL
,
dKL
(5.12)
reproducing in particular the spin-dependent full-plane 2-point function normalisation,

(P )1
dKK = (1)sK , proved to be consistent with the locality and reflection positivity
requirements [38].
6. Relation to integrable lattice models

Some of the identities in Sections 3 and 4, most notably the YB equation, coincide
with the basic identities of the related IRF integrable lattice models. The lattice Boltzmann
weights, however, depend on a spectral parameter u, which does not appear in the CFT,
and to compare the two discussions, a proper limit of this parameter has to be taken. This
correspondence has been established in the diagonal cases, [39], and in this section we
show how it generalises to all models built on graphs related to $\widehat{sl}(n)_{h-n}$ CFT.
The data required to define the generalised sl(n)-IRF models that we consider are
a graph G (we postulate that one of the graphs met in the CFT discussion is
appropriate) and a pair of representations $j_1$ and $j_2$ for sl(n). Then to each vertex of the
square lattice is assigned a vertex a of the graph. The Boltzmann weights $W_{j_1j_2}\!\begin{pmatrix} c & d\\ b & a\end{pmatrix}(u)$ are
functions of the four vertices a, b, c, d around a square face and of a spectral parameter u. It
is conventional to tilt the lattice by 45 degrees and to represent the Boltzmann weights as in
Fig. 8. Representation $j_1$ is assigned to the SW–NE bonds, and $j_2$ to the SE–NW ones [40].
Intuitively, one goes from vertex a to vertex b through the action of $j_2$, and from b to c
through j1 , and accordingly, the weights depend also in general on bond labels , , . . . ,
which specify which path from a to b, from b to c etc. is chosen: = 1, . . . , nj1 b c ,
= 1, . . . , nj2 a b , etc.
The Boltzmann weights are solutions of the spectral parameter dependent YB and
inversion (unitarity) equations. Knowing them for the fundamental representation(s)
enables one to construct the other weights by a fusion procedure [41–43].
In the simplest case where implicitly all the bonds carry the fundamental representation
of sl(n), the Boltzmann weights have the general form

W
c
b
d
a

c d
u
bd + sin(u)[2]q U
(u) = sin
,
h
a b

h1
(6.1)
where [2]q = 2 cos(/ h) for q = e2i h (h the Coxeter number of the graph G), and
[2]q U are Hecke algebra generators satisfying U 2 = U etc. Choosing the labels j = k =
Fig. 8. (a) The Boltzmann weight W for j1 = j2 = ; (b) U as a product of two cells.
we can cast it into a form

in the bilinear representation (4.14) for the braiding matrix B,
similar to (6.1)
bd () = bd q a q b CUbd ,
B
(6.2)
with (cf. Fig. 8)

U
c d
a

(1)
Fb
1
(1)
Fd
1
c
(6.3)
The constants a, b, C are determined from (4.10) and (4.13); from (4.10) we get
C = q ab + q ba , and from (4.13) we get a = h(Su 12 Su ), 2b a = h(Su
1 Su
Su
2 ), hence C = [2]q . Here are Sugawara conformal dimensions, while in (4.13)
enter the dimensions = Su
, of the minimal Wn model of central charge
h
c = (n 1)[1 + 2n(n + 1) n(n + 1)( h1
h + h1 )]; this shift of the dimensions is accounted for by the sign in front of the second term in (6.2). One obtains a = (n 1)/2n,
b = 1/2n.
When (6.2) is inserted in the YB equation (4.12), the latter reduces to the Hecke algebra
relation for the operators [2]q U in (6.3), which can be identified with the operators in the
r.h.s. of (6.1). Thus the Hecke generators are expressed in terms of the 3j -symbols (1) F ,
recovering a formula in [2]. Furthermore comparing (6.1) and (6.2) we obtain
bd
B

c
(@) = 2iq
1
@ 2n
lim
ui@
i@u

W
c
b
d
a

(u).
(6.4)
In other words, we can look at the correlators of the generalised CVO with all
representation labels fixed to the fundamental ones as realising a representation of the
corresponding Hecke algebra in parallel to the path representations of the lattice theory.
In the sl(2) case (6.2), (6.3) reproduce, inserting (3.11), the Boltzmann weight of the ADE
Pasquier models [44]

bd f f = q 1/2bd ac Pb Pd .
q 1/4B
(6.5)
a c
Pa
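The sl(2) statements above can be tested numerically on a small example. The sketch below is an illustration under assumptions of ours: the graph is $D_4$ ($h = 6$), the operator $[2]_qU$ is taken in the Pasquier form with matrix elements $\delta_{ac}\sqrt{P_bP_d}/P_a$, so only the sector a = c is built, the value $[2]_q = 2\cos(\pi/h)$ is used, and the unfused weight is assumed to be $W(u) = \sin(\pi/h - u)\,1 + \sin(u)\,[2]_qU$. It checks the quadratic relation $([2]_qU)^2 = [2]_q\,([2]_qU)$, equivalent to $U^2 = U$, and the inversion relation $W(u)W(-u) = \sin(\pi/h-u)\sin(\pi/h+u)\,1$.

```python
import math

# D4 Dynkin diagram: vertex 0 is the centre, vertices 1, 2, 3 are the leaves.
G = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]
h = 6                                  # Coxeter number of D4
lam = math.pi / h
beta = 2 * math.cos(lam)               # assumed value of [2]_q
P = [math.sqrt(3.0), 1.0, 1.0, 1.0]    # Perron-Frobenius components, normalised on a leaf

def block(a):
    """e_{bd} = sqrt(P_b P_d) / P_a over the neighbours b, d of a (sector a = c)."""
    nb = [b for b in range(len(G)) if G[a][b]]
    return [[math.sqrt(P[b] * P[d]) / P[a] for d in nb] for b in nb]

def matmul(A, B):
    n = len(A)
    return [[sum(A[x][y] * B[y][z] for y in range(n)) for z in range(n)]
            for x in range(n)]

def weight(e, u):
    """Assumed unfused weight W(u) = sin(lam - u) * 1 + sin(u) * e."""
    n = len(e)
    return [[math.sin(lam - u) * int(x == y) + math.sin(u) * e[x][y]
             for y in range(n)] for x in range(n)]

def hecke_defect(a):
    """Largest entry of e^2 - beta * e."""
    e = block(a)
    e2 = matmul(e, e)
    return max(abs(e2[x][y] - beta * e[x][y])
               for x in range(len(e)) for y in range(len(e)))

def inversion_defect(a, u):
    """Largest entry of W(u) W(-u) - sin(lam - u) sin(lam + u) * 1."""
    e = block(a)
    prod = matmul(weight(e, u), weight(e, -u))
    rho = math.sin(lam - u) * math.sin(lam + u)
    return max(abs(prod[x][y] - rho * int(x == y))
               for x in range(len(e)) for y in range(len(e)))

pf_defect = max(abs(sum(G[a][b] * P[b] for b in range(4)) - beta * P[a]) for a in range(4))
worst_hecke = max(hecke_defect(a) for a in range(4))
worst_inversion = max(inversion_defect(a, 0.37) for a in range(4))
print(pf_defect, worst_hecke, worst_inversion)
```

Both relations follow from $\sum_{d} G_{ad}P_d = [2]_q P_a$, which is why the Perron–Frobenius components enter the weights; the value u = 0.37 is arbitrary.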
There is no general information on (1) F in the higher rank cases, however the particular
(fundamental) matrix elements in (6.3) are recovered from the sl(n) examples of
Boltzmann weights found in the literature, [4547,14,48]. Recently exhaustive results were
obtained 2 for all sl(3) graphs but one. A general existence theorem for (1) F and W
for a subclass of graphs corresponding to conformal embeddings appears in [49]. On the
other hand, as the counter-example of [2] shows, some solutions of (1.1) do not support
a representation of the Hecke algebra, i.e., a system of 3j -symbols (1) F .
In the sl(2) and sl(3) cases one can formulate [2] a quartic relation directly for the
cells (1) F which in turn implies the Hecke algebra (or YB) relations; in our notation it
reads

b ,
1 ,2 ,
3 4
(1)
F b
1
ac
Pc Pc
=
Pb

b,
1 ,2
3 ,4
(1)F b
(1)
1 2
(1)
F b
d c

Fc
3
b d
1 1

ac
3 4
F b
F b
F c

(1)
(1)
2 2
1 2
+ 1
(1)
3
b d
d c
(1)
4 3

F b
4
a c
1 4
+ 1
2
3 2
4 1
a c
1 4

Pc Pc
1

=
ad 1 2 3 4
cc 1 4 2 3 .
[2]2
Pa
(6.6)
The first delta-term here is present only for the sl(3) case where = and the two 3-point
couplings corresponding to the function are identical; for n = 2, where by convention
refers to the identity representation, the first term is zero; accordingly we recover the TLJ
algebra relation.
The fused Boltzmann weights are similarly expected to be related to more general
braiding matrix elements. The recursive construction of the general $\hat B$ elements using the
fusing–braiding relation (4.17) is analogous to the fusion procedure of the lattice models
yielding the fused Boltzmann weights. The inversion equation for the Boltzmann weights
in the lattice models turns into the unitarity identity (4.10). The relation (4.17) taken for
p = 1 leads to the (crossing) identity

Pb Pa
i
i
k
k
bd
cb
(6.7)
(@)
= ac ,
(@)B
B

a b
b d
Pb Pd

b
while (4.13) with i = 1 reads

j j

(@) Pd = Pb e2i@j ,
Bbd
a a
(6.8)
a property analogous to one satisfied by the full (u-dependent) Boltzmann weights.

We now turn to the relation (4.16), which has the form of the intertwining relation for the
square Ocneanu
cells, studied in
To make contact with the notation in [14],
a[26,14,27,28].
i
i
(1) F
a
c
aj c b is identified with Y c j with b fixed and i, j, c, a restricted by nib , nj b = 0.
The data found in those papers provide thus a partial information on the 3j -symbols (i.e.,
on the boundary field OPE coefficients), namely determine those matrix elements in which
one of the representation labels is fixed to the
weight and b fixed to 1. On
fundamental

the other hand, knowledge of the cells (1)Faj c bi for arbitrary a, b, c and i, j is sufficient
to determine all the cells using the pentagon equation, in a way similar to the discussion

at the end of Section 4. A general solution for ${}^{(1)}F$ for the $\widehat{sl}(2)$ D-series has been found
in [6].
We conclude with the remark that it would be interesting to extend the correspondences
discussed in this section to the boundary lattice theories, see [50,51], and in particular to
clarify the role of the reflection equations [52] in the present setting.
7. Ocneanu graphs and the associated algebras

In the following we shall motivate on physical grounds and by analogy with a situation
already encountered in BCFT the construction of new sets of (nonnegative integer valued)
matrices and their associated graphs. On a mathematical level, this construction has been
justified in the subfactor approach [1,2,10,11], but the field theoretical approach provides
new insight.
In BCFT we know that three sets of matrices play an interlaced role, generalising the
fusion matrices $N_i$. The first is the set of $|V|\times|V|$ matrices $n_i$ defined in (1.1), which form
a representation of the fusion algebra and define the graph G. As recalled in Section 2,
their diagonalisation introduces a set of orthonormal eigenvectors $\psi_a^j$, $a\in V$, $j\in\mathrm{Exp}$.
The second set of matrices, also of size $|V|\times|V|$, denoted $\hat N_a=\{\hat N_{ab}{}^c\}$ in [3], forms
the regular representation of an associative algebra,
$$\hat N_a\,\hat N_b=\sum_c\hat N_{ab}{}^c\,\hat N_c \qquad (7.1)$$
(the Ocneanu algebra) [26,53]. It is attached to the graph in the sense that
$$n_i\,\hat N_a=\sum_b n_{ia}{}^b\,\hat N_b, \qquad (7.2)$$
i.e., if the matrix $\hat N_a$ is assigned to vertex a of the graph G, the action of $n_i$ on $\hat N_a$ gives
a sum over the neighbouring matrices $\hat N_b$ (neighbouring in the sense of the adjacency
matrix $n_{ia}{}^b$).
These matrices $\hat N_a$ have integer entries, in general of indefinite
sign.$^5$ At this point, we recall that RCFT and the associated graphs G come in two types.
Those for which the modular invariant partition function is block-diagonal and expressible
in terms of the n matrices as
$$Z_{ij}=\sum_{a\in T}n_{i1}{}^a\,n_{j1}{}^a \qquad (7.3)$$
for a certain subset T of vertices are called of type I. They are interpreted as diagonal
theories in the sense of some extended chiral algebra $A^{ext}$. The set T is in one-to-one
correspondence with the set of ordinary representations of that algebra $A^{ext}$ and the integer
$n_{i1}{}^a$ is the multiplicity $\mathrm{mult}_a(i)$ of representation $V_i$ of A in the representation of $A^{ext}$
labelled by a. Then all matrices $\hat N_a$, $a\in V$, have nonnegative integer entries and the subset
$\{\hat N_a\}_{a\in T}$ forms a subalgebra isomorphic to the fusion algebra of $A^{ext}$ [15].$^6$
An interpretation of the whole set of $\hat N_{ab}{}^c$ as fusion coefficients of a class of twisted
representations of $A^{ext}$ broader than considered in Section 2 has been proposed in [3,55],
see also [56]. In contrast, a theory of type II cannot be written as in (7.3) and is obtained
5 A case where this integrality property of the $\hat N_{ab}{}^c$ seemed invalid was pointed out in [18], but later it was
shown by Xu that integrality could be restored at the expense of commutativity [49], see Section 7.2.
6 These statements are for us empirical facts, of which we know no general proof. They seem to have been
established for a variety of cases in the subfactor approach or are taken as assumptions. Note that our definition of
type I in (7.3) above is slightly more restrictive than the one used previously [3,18]. It rules out one of the graphs
($E_3^{(12)}$ in the Table of [3]). See also [54] for cases which go beyond this simple classification.
from some type I one (its parent theory) through an automorphism $\zeta$ of its fusion rules
acting on its right sector with respect to the left one [21,57]. We thus expect many of their
properties to be more simply expressed in terms of data pertaining to the parent theory. For
example, their torus partition function reads
$$Z_{ij}=\sum_{a\in T}n_{i1}{}^a\,n_{j1}{}^{\zeta(a)}, \qquad (7.4)$$
where the n's are those of the parent type I theory.
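As a concrete illustration of the type I formula (7.3), the following Python sketch builds the matrices $n_i$ for the $D_4$ Dynkin diagram (the $\widehat{sl}(2)$ theory at level 4, Coxeter number 6) from the standard recursion $n_1=I$, $n_2=G$, $n_{i+1}=n_2n_i-n_{i-1}$, and then evaluates the sum over T. The vertex numbering, the choice of an end point as the vertex 1, and the identification of T with the three end points are assumptions made for this example; all function names are hypothetical.

```python
def matmul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def nimrep(adjacency, coxeter):
    """Return [n_1, ..., n_{h-1}] with n_1 = I, n_2 = G, n_{i+1} = G n_i - n_{i-1}."""
    size = len(adjacency)
    identity = [[int(r == c) for c in range(size)] for r in range(size)]
    mats = [identity, adjacency]
    while len(mats) < coxeter - 1:
        prod = matmul(adjacency, mats[-1])
        mats.append([[prod[r][c] - mats[-2][r][c] for c in range(size)]
                     for r in range(size)])
    return mats


def torus_matrix(mats, subset, root=0):
    """Z_ij = sum over a in T of n_{i,root}^a * n_{j,root}^a, as in Eq. (7.3)."""
    return [[sum(ni[root][a] * nj[root][a] for a in subset) for nj in mats]
            for ni in mats]


# D4 graph: vertex 1 is the centre, vertices 0, 2, 3 are the three end points.
D4 = [[0, 1, 0, 0],
      [1, 0, 1, 1],
      [0, 1, 0, 0],
      [0, 1, 0, 0]]
T = [0, 2, 3]              # assumption: T consists of the three end points
n = nimrep(D4, coxeter=6)  # n[i - 1] is the matrix n_i, i = 1, ..., 5
Z = torus_matrix(n, T)

for row in Z:
    print(row)
```

Under these assumptions the only nonzero entries are $Z_{11}=Z_{15}=Z_{51}=Z_{55}=1$ and $Z_{33}=2$, which is the $D_4$ invariant $|\chi_1+\chi_5|^2+2|\chi_3|^2$; the sum of the squared entries is 8.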

We can then define the dual (in the sense of [58]) of the $\hat N$ algebra by the algebra of
linear maps $\hat N\to\mathbb{C}$. This algebra, also called the Pasquier algebra, is realised by matrices
$M_{(i,\alpha)}$ labelled by the elements of Exp and, as mentioned in Section 5, relates to the scalar
OPE coefficients.
As a side remark, we recall that in the sl(2) case it is this M algebra which also
appears as the perturbed chiral ring of N = 2 superconformal CFTs perturbed by their
least relevant operator (or of their topological counterparts) [59], hence as a specialisation
of the Frobenius algebra [60]. We shall return to these algebras and their CFT interpretation
in the next sections.
In the following, we are going to introduce four sets of matrices, which generalise the
previous three, define again graph(s) $\tilde G$, and satisfy analogous relations. The matrices n
give rise to two sets, denoted $\tilde V$ and $\tilde n$, while the dual pair $(\hat N,\hat M)$ generalises to a pair
$(\tilde N,\tilde M)$.
7.1. The $\tilde V$ matrices and Ocneanu graphs
We first consider the integral, nonnegative matrix solutions of a system of equations for
commuting matrices $\tilde V_{i\bar i;x}{}^{y}$ with $i,\bar i\in I$. It generalises (1.1), with the Verlinde fusion
multiplicities $N_{ij}{}^{k}$ replaced by the product $N_{ij}{}^{k}N_{\bar i\bar j}{}^{\bar k}$:
$$\sum_{y}\tilde V_{ij;x}{}^{y}\,\tilde V_{i'j';y}{}^{z}=\sum_{i'',j''}N_{ii'}{}^{i''}N_{jj'}{}^{j''}\,\tilde V_{i''j'';x}{}^{z}. \qquad (7.5)$$
The labels x, y, ... of these matrices take their values in a finite set denoted $\tilde V$, whose
cardinality equals $|\tilde V|=\sum_{j,\bar j}(Z_{j\bar j})^2$ in terms of the modular invariant matrix Z.
This property, and more generally the physical interpretation of (7.5), follow from the
discussion of torus partition functions in the presence of twist operators (physically, defect
lines) denoted $X_x$, see [8] for details. The discussion is parallel to the way Eq. (1.1) appears
in the study of cylinder partition functions and involves the consistency between two
alternative pictures. In one picture, two twist operators $X_x$ and $X_y^\dagger$, attached to homology
cycles of type a of the torus, act in the Hilbert space of the ordinary bulk theory, $H=\oplus_{i,\bar i}\,Z_{i\bar i}\,V_i\otimes\bar V_{\bar i}$,
and are assumed to commute with the generators of the two copies of
the chiral algebra A. This forces them to be linear combinations of operators $P^{(k,\bar k;\alpha,\alpha')}$
intertwining the different copies of equivalent representations of $A\otimes A$
$$X_x=\sum_{i,\bar i}\ \sum_{\alpha,\alpha'=1}^{Z_{i\bar i}}\frac{\Psi_x^{(i,\bar i;\alpha,\alpha')}}{\sqrt{S_{1i}S_{1\bar i}}}\;P^{(i,\bar i;\alpha,\alpha')}, \qquad (7.6)$$
with
$$P^{(i,\bar i;\alpha,\alpha')}\,P^{(j,\bar j;\beta,\beta')}=\delta_{ij}\,\delta_{\bar i\bar j}\,\delta_{\alpha'\beta}\;P^{(i,\bar i;\alpha,\beta')}. \qquad (7.7)$$
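The other picture makes use of a Hilbert space $H_{x|y}$ associated with the homology cycles of
type b; the nonnegative integer $\tilde V_{ij;x}{}^{y}$ describes the multiplicity of representation $V_i\otimes\bar V_j$
in $H_{x|y}$. The equality of the twisted partition functions computed in these two alternative
ways leads to a consistency condition of the form
$$\tilde V_{i\bar i;x}{}^{y}=\sum_{J}\ \sum_{\alpha,\alpha'=1}^{Z_{j\bar j}}\frac{S_{ij}S_{\bar i\bar j}}{S_{1j}S_{1\bar j}}\;\Psi_x^{(j,\bar j;\alpha,\alpha')}\,\Psi_y^{(j,\bar j;\alpha,\alpha')*},\qquad i,\bar i\in I, \qquad (7.8)$$
where $\Psi_y^{(J;\alpha,\alpha')*}$ is the complex conjugate of $\Psi_y^{(J;\alpha,\alpha')}$. Then $\Psi=\{\Psi_x^{(J;\alpha,\alpha')}\}$ is assumed
to be a square, unitary matrix, labelled by the $x\in\tilde V$ and by the pairs $J=(j,\bar j)$ of labels
supplemented by their multiplicities in the spectrum $\alpha,\alpha'=1,\dots,Z_{j\bar j}$:
$$\sum_{x\in\tilde V}\Psi_x^{(J;\alpha,\alpha')}\,\Psi_x^{(J';\beta,\beta')*}=\delta_{jj'}\,\delta_{\bar j\bar j'}\,\delta_{\alpha\beta}\,\delta_{\alpha'\beta'},\qquad
\sum_{J}\ \sum_{\alpha,\alpha'=1}^{Z_{j\bar j}}\Psi_x^{(J;\alpha,\alpha')}\,\Psi_{x'}^{(J;\alpha,\alpha')*}=\delta_{xx'}. \qquad (7.9)$$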
Following a standard argument, in (7.8) the $\Psi^{(J;\alpha,\alpha')}$ appear as the eigenvectors and the
ratios $S_{ij}S_{\bar i\bar j}/S_{1j}S_{1\bar j}$ as the eigenvalues of the $\tilde V_{i\bar i}$ matrices. As the latter satisfy the double
fusion algebra (7.5), so do the matrices $\tilde V$.
In fact the integer numbers $\tilde V_{ij;x}{}^{y}$ may be regarded not only as the entries of $|\tilde V|\times|\tilde V|$
matrices $\tilde V_{ij}$, $i,j\in I$, as we just did, but also as those of $|I|\times|I|$ matrices $\tilde V_x{}^{y}$, $x,y\in\tilde V$.
By convention, the label 1 refers to the trivial (neutral) twist, and it is thus natural to
impose the further constraint that for y = z = 1, $\tilde V$ reduces to the modular invariant matrix,
up to a conjugation
$$\tilde V_{i\bar i^*;1}{}^{1}=Z_{i\bar i}. \qquad (7.10)$$
This is consistent with (7.8) if
$$\Psi_1^{(J;\alpha,\alpha')}=\delta_{\alpha\alpha'}\sqrt{S_{1j}S_{1\bar j}}=:\delta_{\alpha\alpha'}\,\Psi_1^{(J)}. \qquad (7.11)$$
In particular $\Psi_1^{(1)}=S_{11}$ and denoting $d_x=\Psi_x^{(1)}/\Psi_1^{(1)}$, this implies, using the unitarity of
$\Psi$ and of the modular matrix S, the completeness relation
$$\sum_{x\in\tilde V}d_x^2=\frac{1}{(S_{11})^2}=\sum_{i\in I}d_i^2 \qquad (7.12)$$
(see also [10]). It also follows from (7.8) that $\tilde V_{ij;x}{}^{y}=\tilde V_{i^*j^*;y}{}^{x}$ and conversely, this latter
property, together with (7.5), suffices to guarantee the diagonalisability of the $\tilde V$ in an
orthonormal basis, as in (7.8).
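The quantum dimensions $d_i=S_{1i}/S_{11}$ entering (7.12) satisfy $\sum_i d_i^2=1/(S_{11})^2$ purely by unitarity of S. The following minimal numerical check assumes the standard $\widehat{sl}(2)_k$ modular matrix $S_{ij}=\sqrt{2/h}\,\sin(\pi ij/h)$ with $h=k+2$ and $i,j=1,\dots,h-1$; the function names are hypothetical.

```python
import math


def s_matrix(level):
    """Modular S matrix of sl(2) at the given level; index 0 is the identity."""
    h = level + 2
    norm = math.sqrt(2.0 / h)
    return [[norm * math.sin(math.pi * i * j / h) for j in range(1, h)]
            for i in range(1, h)]


def quantum_dimensions(level):
    """d_i = S_{1i} / S_{11} for all integrable representations."""
    S = s_matrix(level)
    return [S[0][i] / S[0][0] for i in range(len(S))]


def completeness_gap(level):
    """|sum_i d_i^2 - 1 / S_11^2|, which should vanish up to rounding."""
    S = s_matrix(level)
    total = sum(d * d for d in quantum_dimensions(level))
    return abs(total - 1.0 / S[0][0] ** 2)


for level in (1, 2, 3, 4, 10):
    print(level, completeness_gap(level))
```

In the diagonal case, where $\Psi$ coincides with S, the same computation also gives the left-hand side of (7.12).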
Then, if we define the matrices $T^x_{ij}:=\tilde V_{ij^*;1}{}^{x}$, thus $T^1=Z$, taking the x = z = 1 matrix
element of (7.5) yields
$$\sum_{x\in\tilde V}T^x_{ij}\,T^x_{i'^*j'^*}=\sum_{i'',j''}N_{ii'}{}^{i''}N_{jj'}{}^{j''}\,Z_{i''j''}, \qquad (7.13)$$
which is the way the matrices $T^x$ appeared originally in the work of Ocneanu, under the
name of "modular splitting method".
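For a diagonal $\widehat{sl}(2)_k$ theory, where $Z_{ij}=\delta_{ij}$ and (as discussed in Section 7.5) $T^x_{ij}$ reduces to the Verlinde coefficient $N_{ij}{}^{x}$, relation (7.13) can be checked by brute force. The sketch below assumes the standard sl(2) fusion rules with labels $1,\dots,k+1$, for which all representations are self-conjugate; the function names are hypothetical.

```python
def fusion(i, j, k, level):
    """sl(2) fusion coefficient N_ij^k at the given level, labels 1..level+1."""
    h = level + 2
    lower = abs(i - j) + 1
    upper = min(i + j - 1, 2 * h - i - j - 1)
    return int(lower <= k <= upper and (i + j + k) % 2 == 1)


def modular_splitting_holds(level):
    """Check sum_x T^x_ij T^x_i'j' == sum_{i'',j''} N N Z with Z diagonal."""
    labels = range(1, level + 2)
    for i in labels:
        for j in labels:
            for ip in labels:
                for jp in labels:
                    # Diagonal theory: T^x_ij is the fusion coefficient N_ij^x.
                    lhs = sum(fusion(i, j, x, level) * fusion(ip, jp, x, level)
                              for x in labels)
                    # Z_{i''j''} = delta_{i''j''} collapses the double sum.
                    rhs = sum(fusion(i, ip, m, level) * fusion(j, jp, m, level)
                              for m in labels)
                    if lhs != rhs:
                        return False
    return True


print(all(modular_splitting_holds(level) for level in range(1, 7)))
```

The equality tested here follows from the associativity and total symmetry of the sl(2) fusion coefficients, so it only probes the simplest solution of (7.13).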
The set of matrices $\tilde V_{ij}$ may be regarded as the adjacency matrices of a set of graphs
with a common set of vertices $\tilde V$. In any RCFT, the fusion ring is generated by a finite
number of representations f of I called fundamental, and because of (7.5), it is sufficient
to give the graphs of $\tilde V_{f1}$ and of $\tilde V_{1f}$ for these representations to generate the whole set by
fusion. (For example, for the $\widehat{sl}(2)$ theories, $\tilde V_{21}$ and $\tilde V_{12}$ suffice.) Following Ocneanu [1],
it is convenient to represent these graphs simultaneously on the same chart, with edges of
different colours. We shall refer to this multiple graph as the Ocneanu graph $\tilde G$ associated
with the graph G of the original theory. Examples are given in Fig. 10 of Appendix B for
$\widehat{sl}(2)$ theories, and additional ones may be found in [2,11]. If one attaches the matrix $T^x$ to
vertex x, the two kinds of edges of the graph describe the action of the fusion matrices on
the left and right indices of the $T^x$. For example, the edges of the first colour (red full lines
on Fig. 10) encode the $\tilde V_{f1;y}{}^{z}$ in
$$\sum_{i'}N_{fi}{}^{i'}\,T^z_{i'j}=\sum_{y}\tilde V_{f1;y}{}^{z}\,T^y_{ij} \qquad (7.14)$$
or in short, $(N_f\otimes I)\,T^z=\sum_y\tilde V_{f1;y}{}^{z}\,T^y$, and likewise, those of the second colour (blue,
broken) describe $(I\otimes N_f)\,T^z=\sum_y T^y\,\tilde V_{1f;y}{}^{z}$.
7.2. The $\tilde N$ algebra
In turn, this Ocneanu graph $\tilde G$ may be used to define a new algebra in the same way as
$\hat N$ was attached to the graph G. To each vertex x of the graph we attach a matrix $\tilde N_x=\{\tilde N_{yx}{}^{z}\}$
of size $|\tilde V|\times|\tilde V|$. For the special vertex 1, $\tilde N_1=I$. The matrices $\tilde N$ are assumed
to satisfy the algebra (1.5): $\tilde V_{ij}\,\tilde N_x=\sum_{z\in\tilde V}\tilde V_{ij;x}{}^{z}\,\tilde N_z$ (compare with (7.2)). Using the spectral
decomposition (7.8) of the $\tilde V$, one may construct an explicit solution for these matrices $\tilde N_x$
$$\tilde N_{yx}{}^{z}=\sum_{J;\,\alpha,\beta,\gamma}\frac{\Psi_y^{(J;\alpha,\beta)}\,\Psi_x^{(J;\beta,\gamma)}\,\Psi_z^{(J;\alpha,\gamma)*}}{\Psi_1^{(J)}}. \qquad (7.15)$$
Taking into account the orthonormality of the $\Psi$, one finds that $\tilde N_{1x}{}^{z}=\tilde N_{x1}{}^{z}=\delta_{xz}$ and
that the $\tilde N_x$ form a matrix representation of an algebra
$$E_x\,E_y=\sum_{z}\tilde N_{xy}{}^{z}\,E_z, \qquad (7.16)$$
with an identity and a finite basis. The algebra is associative, but in general noncommutative if some $Z_{ij}>1$. Indeed, if all $Z_{ij}\le 1$, the summation over $\alpha,\beta,\gamma$ in (7.15)
is trivial, and this equation is, once again, nothing else than the spectral decomposition of
the matrices $\tilde N$ in terms of the one-dimensional representations $\Psi_x/\Psi_1$ of the algebra. If,
however, some $Z_{ij}>1$, the matrices $\tilde N$ are not simultaneously diagonalisable, but rather
block-diagonalisable with blocks $\hat\Psi_x^{(J;\alpha,\beta)}=\Psi_x^{(J;\alpha,\beta)}/\Psi_1^{(J)}$ forming a $Z_{j\bar j}$-dimensional
representation of the algebra
$$\sum_{\beta}\hat\Psi_x^{(J;\alpha,\beta)}\,\hat\Psi_y^{(J;\beta,\gamma)}=\sum_{z}\tilde N_{xy}{}^{z}\,\hat\Psi_z^{(J;\alpha,\gamma)}. \qquad (7.17)$$
(See also [10] (Lemma 5.2) for a similar although somewhat less explicit variant of (7.15),
with the $\tilde N_{xy}{}^{z}$ being the sector product matrices.) By inspection, one checks,
at least in all ADE cases (see below and Appendix B), that these $\tilde N$ matrices have
nonnegative integer entries. They may indeed be viewed as multiplicities (of dual triangles
with three white vertices), and accordingly the algebra (7.16) appears as the algebra of the
center of $\hat A$, with the product in the l.h.s. of (7.16) given by the vertical product of [1],
compare with (3.23) and see Appendix A. These matrices are recovered also directly from
(7.6), (7.7),
$$\tilde N_{yx}{}^{z}=\mathrm{Tr}\big(X_y\,X_x\,X_z^\dagger\big) \qquad (7.18)$$
using that $\mathrm{Tr}\,P^{(J;\alpha,\alpha)}:=S_{1j}S_{1\bar j}$ (this definition of the trace may be justified in unitary
CFTs in exactly the same way as the norm of the Ishibashi states, via the
asymptotics of the characters $\chi_j(\tau)$, see [3,61] and (4.2) of [8]). Equivalently, we have
$$X_x\,X_y=\sum_{z}\tilde N_{xy}{}^{z}\,X_z, \qquad (7.19)$$
thus justifying the name of "twist fusion algebra" that we give to the $\tilde N$ algebra. In this
latter context, the noncommutativity of this $\tilde N$ algebra may be viewed as coming from
the impenetrability and the resulting lack of commutativity of the defect lines to which the
twists $X_x$ and $X_y$ are attached.
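If a conjugation in the set $\tilde V$ is defined through
$$\Psi_{x^*}^{(J;\alpha,\beta)}=\big(\Psi_x^{(J;\beta,\alpha)}\big)^* \qquad (7.20)$$
(note the reversal of the indices $\alpha$ and $\beta$!), it follows that
$$\tilde N_{yx}{}^{1}=\delta_{xy^*}, \qquad (7.21)$$
and the noncommutativity of $\tilde N$ modifies the analogue of the symmetry relations (2.4)
according to
$$\tilde N_{yx}{}^{z}=\tilde N_{x^*y^*}{}^{z^*}=\tilde N_{zx^*}{}^{y}.$$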
Eq. (7.15) may be rewritten as a sum of
;
eJyz;,
(J ;,) (J ;, )
1
z
,
Zj j y
eJ ; eJ ;

xz;,

y,
j j; 1
;
J ;
J ;
eJxy;,
eyz; , = exz;, ,
(7.22)
j j Zj j

J,,
(matrix) idempotents
;
1 ,
Zj j eJ,
=N
(7.23)
x =
N
J ;, J ;
e,
Zj j x
J, ,
Zj j xJ
eJ ; ,
(7.24)
where, suppressing the matrix indices, the sum runs over the physical spectrum $(J,\alpha):=(J;\alpha,\alpha)$.
These are the labels of the representations of the $\tilde N$ algebra, which are $Z_{j\bar j}$-dimensional
and given according to (7.17) by the matrices $\hat\Psi_x^{J}$, i.e.,

x = xJ ;, ,
x J ;, N
J ;, : N

z

x N
x J ;, N
y =
z .
y =
xy J ;, N
N
J ;, N
J ;, N
(7.25)
The formula (7.24) is then interpreted as a decomposition of the regular representation
of the $\tilde N$-algebra into a sum of representations labelled by J, each appearing with multiplicity $Z_{j\bar j}$,
so the dimension is $\sum_{j,\bar j}Z_{j\bar j}\,\dim(J)=\sum_{j,\bar j}Z_{j\bar j}^2=|\tilde V|$. In [11] a formula analogous to
(7.25), or (7.17), appears directly for the elements $E_x$ in $\hat A$ spanning (with respect to the
vertical product) the algebra (1.4), see Appendix A.
We now return to the graph algebra $\hat N$ of the chiral graph G mentioned in the
introduction to this section. In fact, we shall restrict our attention to type I cases, which are
the only ones for which all the matrix elements of the $\hat N_a$ are nonnegative integers. Because
in this case Eq. (7.3) applies, each exponent appears $(n_{j1}{}^{a})^2$ times for each representation
a of the extended algebra, identified with a vertex $a\in T$. It is advantageous to denote
the corresponding eigenvectors of the n matrices as $\psi^{(j,a;\alpha,\beta)}$, with $\alpha,\beta=1,\dots,n_{j1}{}^{a}$.
In [18], various formulae have been established for the components $\psi_b^{(j,a;\alpha,\beta)}$, $b\in T$. It is
easy to extend them to
$$\psi_1^{(j,a;\alpha,\beta)}=\delta_{\alpha\beta}\,\psi_1^{(j,a)},\qquad \psi_1^{(j,a)}=\sqrt{S_{1j}\,S^{ext}_{1a}},\qquad\text{for } n_{j1}{}^{a}\neq 0,$$
$$\psi_b^{(j,a;\alpha,\beta)}=\psi_1^{(j,a;\alpha,\beta)}\,\frac{S^{ext}_{ba}}{S^{ext}_{1a}},\qquad\text{for } a,b\in T, \qquad (7.26)$$
using the modular invariance identity
$$\sum_{i\in I}S_{ij}\,n_{i1}{}^{b}=\sum_{a\in T}S^{ext}_{ba}\,n_{j1}{}^{a},\qquad b\in T. \qquad (7.27)$$
The similarity with the case of the $\tilde N$ algebra (of which the $\hat N$ algebra turns out to be
a subalgebra in these type I cases) suggests a formula which encompasses and generalises
all known cases
$$\hat N_{cb}{}^{d}=\sum_{a\in T,\ j\in I}\ \sum_{\alpha,\beta,\gamma=1}^{n_{j1}{}^{a}}\frac{\psi_c^{(j,a;\alpha,\beta)}\,\psi_b^{(j,a;\beta,\gamma)}\,\psi_d^{(j,a;\alpha,\gamma)*}}{\psi_1^{(j,a)}}. \qquad (7.28)$$
It is an easy matter to check that the relations (7.1) and (7.2) are indeed satisfied. We have
checked in the simplest case n = 2 of $\widehat{sl}(2n)_{2n}\subset\widehat{so}(4n^2-1)_1$, for which multiplicities
$n_{i1}{}^{a}>1$ are known to occur, and we conjecture in general, that this formula always yields
nonnegative integers, and gives an explicit realisation of the considerations of [49,62].$^7$
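7.3. The $\tilde n$ matrices
We then introduce a new set of matrices $\tilde n_x=\{\tilde n_{ax}{}^{b}\}$, $a,b\in V$, which form a nonnegative
integer valued representation (nimrep) of this $\tilde N$ algebra, see (1.3), in clear analogy
with (1.1).$^8$ Like the $\tilde N$, these matrices are non-commuting in general, if some $Z_{j\bar j}>1$,
and they admit a block decomposition like (7.15)
$$\tilde n_{ax}{}^{b}=\sum_{j}\ \sum_{\alpha,\beta=1}^{Z_{jj}}\psi_a^{(j,\alpha)}\,\frac{\Psi_x^{(j,j;\alpha,\beta)}}{\Psi_1^{(j,j)}}\,\psi_b^{(j,\beta)*}=\tilde n_{bx^*}{}^{a}. \qquad (7.29)$$
One also checks, using the orthonormality and conjugation properties of the $\psi$ and $\Psi$, that
$$\sum_{x\in\tilde V}\tilde n_{ax}{}^{a'}\,\tilde n_{bx}{}^{b'}=\sum_{i\in I}n_{ia}{}^{b}\,n_{ia'}{}^{b'}. \qquad (7.30)$$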
These matrices are again interpreted as multiplicities: namely $\tilde n_{ax}{}^{b}$ describes the dimension
of the space $\hat V_{ax}^{b}$ of dual triangles with fixed markings x, a, b (one black, two white
vertices). Varying a, b, they form a basis of the dual vector space $\hat V^{x}$. Then (1.3) serves
as a consistency condition needed to give sense to the dual (vertical) product $\hat V^{x}\otimes_v\hat V^{y}$,
in which $\hat V^{z}$ appears with multiplicity $\tilde N_{xy}{}^{z}$, the latter replacing the Verlinde multiplicities
in a formula analogous to (3.3). On the other hand (7.30) is interpreted as the equality
between the dimensions of the space of double triangles and that of dual double triangles
with a, a', b, b' fixed and justifies a change of basis considered in Appendix A (see Fig. 9).
Recalling that in Section 3, $m_j=\sum_{a,b\in V}n_{ja}{}^{b}$ stands for the dimension of the space $V^{j}$
of triangles (or CVOs), we now denote $\tilde m_x=\sum_{a,b\in V}\tilde n_{xa}{}^{b}$ the dimension of the space $\hat V^{x}$.
The equality of the dimensions of the double triangle algebra A and of its dual $\hat A$ amounts
to the identity
$$\sum_{j\in I}m_j^2=\sum_{x\in\tilde V}\tilde m_x^2, \qquad (7.31)$$
which results indeed from the summation over a, a', b, b' in (7.30). On the other hand a less
trivial equality holds, checked case by case in all $\widehat{sl}(2)$ cases,
$$\sum_{j\in I}m_j=\sum_{C_x,\ x\in\tilde V}\tilde m_x, \qquad (7.32)$$
where the sum in the r.h.s. runs over the classes $C_x$ in $\tilde V$ (or classes in the $\tilde N$ algebra),
determined by $x\sim y$ iff $\forall J$, $\mathrm{tr}\,\hat\Psi^{J}(\tilde N_x)=\sum_\alpha\hat\Psi_x^{(J;\alpha,\alpha)}=\mathrm{tr}\,\hat\Psi^{J}(\tilde N_y)$, i.e., the characters
7 It is understood that in cases where exponents come with a nontrivial multiplicity, the remaining arbitrariness
in the choice of the $\psi$ is used to make the $\hat N$ nonnegative integers, and this seems always possible in type I cases.
8 In the subfactor approach, given an inclusion of subfactors $N\subset M$, the equality (1.3) is interpreted as an
associativity condition for the M-M, M-N sectors, similarly as the analogous identity (1.1) for the N-N, N-M
sectors [11].
of the representations of the $\tilde N$ algebra are constant on the class $C_x$. For cases with trivial
multiplicities $Z_{j\bar j}=0,1$ the summation in the r.h.s. runs over the set $\tilde V$ and (7.32) expresses
the equality of dimensions of the regular representations of A and $\hat A$. In the $\widehat{sl}(2)$ $D_{even}$
cases there are two nontrivial classes $C_x$, each containing the "fork" vertices in the chiral
subgraphs of $\tilde G$, see Appendix B.
The physical interpretation of the matrices (7.29) is obtained by looking at the effect
of a twist in the presence of boundaries. One considers the RCFT on a finite cylinder with
boundary states $|a\rangle$ and $\langle b|$ at the ends and a twist operator $X_x$ in between. Repeating
the calculations of partition functions carried out in [3,8], one finds that the open string
channel is described by a Hilbert space with representation $V_i$ occurring with multiplicity
$(n_i\tilde n_x)_a{}^{b}$, i.e., the matrix element of a (commuting) product of the matrices $n_i$ and $\tilde n_x$.
Thus $(\tilde n_x)_a{}^{b}$ is the multiplicity of the identity character in this twisted cylinder partition
function.
7.4. The $\tilde M$ matrices
The last set of matrices that we may associate with the Ocneanu graph generalises the
Pasquier algebra. We can define a dual (in the sense of [58]) of the $\tilde N$ algebra by the algebra
of linear maps
(J ;, )
x(J ;, )
x =

,
J ;, N
x1
x1
+

+

+

I ;, +
J ;, Nx = I ;, Nx J ;, Nx

(I ;, )(J ;, ) (K; , ) +
=
M
K; , Nx
+
J ;,

x =
x + N
:N
J ;,
(7.33)
K; ,
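with structure constants
$$\tilde M_{(I;\alpha,\alpha')(J;\beta,\beta')}{}^{(K;\gamma,\gamma')}=\sum_{x\in\tilde V}\frac{\Psi_x^{(I;\alpha,\alpha')}}{\Psi_x^{(1)}}\;\Psi_x^{(J;\beta,\beta')}\,\Psi_x^{(K;\gamma,\gamma')*}. \qquad (7.34)$$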
This algebra is abelian and its 1-dimensional representations, or characters, are given
by (7.33). An involution (*) in the set $\{(I;\alpha,\alpha')\}$ is defined by the complex conjugation
$\Psi_x^{(I;\alpha,\alpha')^*}=\big(\Psi_x^{(I;\alpha,\alpha')}\big)^*$ so that $\tilde M_{(I;\alpha,\alpha')^*}={}^{t}\tilde M_{(I;\alpha,\alpha')}$. The subset of the numbers formed
by the $\tilde M_{(I;\alpha,\alpha)(J;\beta,\beta)}{}^{(K;\gamma,\gamma)}$, i.e., diagonal in the multiplicity indices, does not form
a subalgebra but does play a physical role. Their explicit computation (again in the ADE
cases) shows that (i) they are nonnegative algebraic numbers; (ii) they give the squared
moduli of the relative structure constants of the OPA of the corresponding CFT
$$\big|d_{(I;\alpha)(J;\beta)}{}^{(K;\gamma)}\big|^2=\tilde M_{(I;\alpha,\alpha)(J;\beta,\beta)}{}^{(K;\gamma,\gamma)}. \qquad (7.35)$$
We recalled in Section 5 that the Pasquier algebra gives access to the relative structure
constants of spinless fields. The OPA structure constants of non-left-right symmetric fields,
however, in general escaped this determination in terms of graph-related data.$^9$ The
empirical result in [17] only states that in the cases of conformal embeddings $D_4$, $E_6$, $E_8$
the l.h.s. of (7.35) factorises into a product of scalar constants (and hence is expressed
by the Pasquier algebra structure constants) and that for the $D_{even}$ series this factorisation
holds in a somewhat weaker sense; this factorisation is confirmed (see Appendix B) by
what is computed for the r.h.s.
In fact (7.35) can be derived extending the consideration of [8] to 4-point functions
of physical fields in the presence of twists; it is sufficient to look at the functions on
the plane, which can be interpreted as the L/T limit of the torus correlators,
limL/T Tr(e2LH . . .), when we map it to the plane through w z =2iw/T . Let us
sketch the argument which is a generalisation of the derivation of the locality equations; we
shall use the convention of notation in [18]. We consider a 4-point function with insertion
of two twist operators (7.6) (omitting the labels (P ) on the fields and the OPE coefficients)
0|(J ; ) (z1 , z 1 )(I ; ) (z2 , z 2 )Xx (I ; ) (z3 , z 3 )(J ; ) (z4 , z 4 )Xx |0
, )
(k,k;
x
(J ;)
(1)
d(J ; )(J ;) d(I ; )(K; , )
=
(k,k)
1
,
k,k,
(1)
(K; , )
d(I ; )(J ; )
(1)
i
0|j1 j (z1 )i k (z2 )ijk (z3 )i1
(z4 )|0 (right chiral block),
(7.36)
(J ; )
taking into account that d(J ; )(1) = 1. This correlator is alternatively written as
0|(I ; ) (z2 , z 2 )Xx (I ; ) (z3 , z 3 )(J ; ) (z4 , z 4 )Xx (J ; ) (z1 , z 1 )|0
=

p,p,,
(i,i;,)
(1)
d(I ; )(I ;)
1(i,i)
p
(I ;)
(P ;, )
d(I ; )(P ;, ) d(J ; )(J ; )

j
(j,j;,)
(j,j)
i
0|i1 i (z2 )ip
(z3 )jj (z4 )j 1 (z1 )|0 (right chiral block).
(7.37)
Next we use the braiding relations for the chiral blocks to identify the two products of
chiral correlators , i.e.,

move
j andj to the very right this brings about the product of
fusing matrices Fkp
j j
i i
Fk p
j j
i i
. We equate the coefficients and, furthermore, take the
value p = 1 in the resulting identity it implies = and = and also trivialises the
fusion matrices to the ones in the diagonal counterpart of (3.11), i.e., we get ratios of square
roots of q-dimensions, which precisely match the factors 1 (7.26) coming from the twists,
and we finally obtain taking also into account the symmetries of the OPE coefficients (this
produces the same sign (1)sI +sJ in both sides), see [18],

(K; , ) 2 x(K; , ) x(I ;,) x(J ;,)

d

=
,
(I ;)(J ;)
x(1)
x(1)
x(1)
,
k,k,
(7.38)
9 Eq. (5.7) represents the constants d in terms of the 3j- and 6j-symbols and the general nonscalar reflection
coefficients.
from which (7.35) follows. In deriving (7.38) we have assumed that the decomposition of
the physical fields involves several copies of each product of left and right chiral blocks,
i.e.,

( , )
(K; , )
d(I ;)(J ;, ) ijk (z) ikj (z) (,)(, ) .
I ; (z, z ) =
, ,
j,j,k,k,,
These copies are labelled by the pairs (, ), ( , ) and they correspond to the
, )
(k,k;
multiplicity of states in the projectors Px

in (7.6); explicit construction is provided
by the Coulomb gas realisation of the corresponding correlators. In the previous discussion
we have suppressed for simplicity the multiplicity indices $t=1,2,\dots,N_{ij}{}^{k}$ and $\bar t=1,2,\dots,N_{\bar i\bar j}{}^{\bar k}$,
appearing in the higher rank cases; when restored, the modulus square in
the l.h.s. of (7.38) and (7.35) is replaced by $\sum_{t,\bar t}\big|d_{(I;\alpha)(J;\beta)}{}^{(K;\gamma;t,\bar t)}\big|^2$. Note that in the presence
of a twist operator the identity 1-point function appears normalised as $\langle 0|\Phi_{(1)}X_x|0\rangle=\Psi_x^{(1)}/\Psi_1^{(1)}=d_x$.
An intriguing issue is the fact that from a mathematical point of view, the indices x play
a role dual to that of representation labels i I (dual in the algebraic sense, going from
a linear space to the space of its linear functionals, see Appendix A and also Eq. (7.12)),
while from a physical point of view, they play a role dual to that of the labels of bulk fields:
this is apparent in Eq. (7.15) where there is a (Fourier-like) duality between the set $\tilde V$ of x
and the set $\widetilde{\mathrm{Exp}}$ of pairs (J) counted with a multiplicity $(Z_{j\bar j})^2$, that is between the vertices
and the "exponents" of the graph $\tilde G$.
We conclude this subsection with the remark that some correlators including twist
operators may be interpreted as generalised order-disorder field correlators, compare
with [38], where such functions matching the operator content of the $Z_2$-twisted torus
partition functions of [63,64] were constructed and their OPE coefficients computed. We
recall [8] that the partition functions of [63,64] provide the simplest examples of solutions
of (7.13).
7.5. Constructing the $\tilde G$ graphs
Let us see now how the matrices $\tilde V_1{}^{x}$ and $\Psi$, from which the graph $\tilde G$ and all the
other matrices $\tilde V$, $\tilde N$, $\tilde n$ and $\tilde M$ may be constructed, can be determined in a given CFT,
i.e., starting from a given modular invariant Z and the associated graph G. (See [65] for
a detailed discussion of the particular $E_6$ case of $\widehat{sl}(2)$ following a different approach.)
First, in the case of a diagonal theory, $Z_{ij}=\delta_{ij}$, it is natural to identify the set $\tilde V$ with
the set I of representations, since their cardinality agrees, and to take
$$\tilde V_{ij}=N_i\,N_j \qquad (7.39)$$
understood as a matrix product, i.e., $\tilde V_{ij;x}{}^{y}=\sum_{k\in I}N_{ix}{}^{k}N_{jk}{}^{y}$, in particular $\tilde V_{ij;1}{}^{k}=N_{ij}{}^{k}$.
The corresponding $\Psi_x^{(j,j)}$ are just the modular matrix elements $S_{xj}$ and the Ocneanu graph
$\tilde G=\tilde A$, which is generated by the fundamental $\tilde V_{f1}$ and $\tilde V_{1f}$, both equal to $N_f$, is
identical to the ordinary graph G = A.
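That the diagonal choice (7.39) solves the double fusion algebra (7.5) follows from the commutativity of the $N_i$, and can also be confirmed numerically. The sketch below does so for $\widehat{sl}(2)_k$, assuming the standard fusion rules with labels $1,\dots,k+1$; the function names are hypothetical.

```python
def fusion(i, j, k, level):
    """sl(2) fusion coefficient N_ij^k at the given level, labels 1..level+1."""
    h = level + 2
    lower = abs(i - j) + 1
    upper = min(i + j - 1, 2 * h - i - j - 1)
    return int(lower <= k <= upper and (i + j + k) % 2 == 1)


def fusion_matrix(i, level):
    """Matrix N_i with entries (N_i)_x^y = N_ix^y."""
    labels = range(1, level + 2)
    return [[fusion(i, x, y, level) for y in labels] for x in labels]


def matmul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]


def v_matrix(i, j, level):
    """V_ij = N_i N_j as a matrix product, Eq. (7.39)."""
    return matmul(fusion_matrix(i, level), fusion_matrix(j, level))


def double_fusion_holds(level):
    """Check V_ij V_i'j' == sum_{i'',j''} N_ii'^i'' N_jj'^j'' V_i''j'', Eq. (7.5)."""
    labels = range(1, level + 2)
    size = level + 1
    V = {(i, j): v_matrix(i, j, level) for i in labels for j in labels}
    for (i, j), v_ij in V.items():
        for (ip, jp), v_ipjp in V.items():
            lhs = matmul(v_ij, v_ipjp)
            rhs = [[0] * size for _ in range(size)]
            for (m, n), v_mn in V.items():
                coeff = fusion(i, ip, m, level) * fusion(j, jp, n, level)
                if coeff:
                    for r in range(size):
                        for c in range(size):
                            rhs[r][c] += coeff * v_mn[r][c]
            if lhs != rhs:
                return False
    return True


print(double_fusion_holds(2), double_fusion_holds(3))
```

The first row of `v_matrix(i, j, level)` reproduces the fusion coefficients $N_{ij}{}^{k}$, in line with $\tilde V_{ij;1}{}^{k}=N_{ij}{}^{k}$.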
As a second case, consider a nondiagonal theory with a matrix $Z_{ij}=\delta_{i\zeta(j)}$, where $\zeta$
is the conjugation of representations or some other automorphism of the fusion rules
(like the $Z_2$ automorphism in the $D_{2\ell+1}$ cases of $\widehat{sl}(2)$ theories). Then $\tilde V=I$, and
$\tilde V_{ij}=\tilde V^{(diag)}_{i\zeta(j)}=N_i\,N_{\zeta(j)}$. The graph is generated by $\tilde V_{f1}=N_f$ and $\tilde V_{1f}=N_{\zeta(f)}$, each one
giving a subgraph isomorphic to A (see Fig. 10 for the case of $D_{odd}$). The $\Psi_x^{(J)}=S_{xj}\,\delta_{\bar j\zeta(j)}$,
and one finds that the $\tilde N$ matrices reduce to those of the diagonal case, i.e., $\tilde N_{xy}{}^{z}=N_{xy}{}^{z}$,
$x,y,z\in I$, while $\tilde n_{ax}{}^{b}=n_{xa}{}^{b}$ and $\tilde M_{(i,\zeta(i))(j,\zeta(j))}{}^{(k,\zeta(k))}=N_{ij}{}^{k}$.
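In both of these cases the structure constants reduce to the Verlinde formula $N_{ij}{}^{k}=\sum_x S_{xi}S_{xj}S^*_{xk}/S_{x1}$, since the $\Psi$ are modular matrix elements. A minimal numerical check for $\widehat{sl}(2)_k$, assuming the standard real symmetric S matrix and fusion rules (function names hypothetical):

```python
import math


def fusion(i, j, k, level):
    """sl(2) fusion coefficient N_ij^k at the given level, labels 1..level+1."""
    h = level + 2
    lower = abs(i - j) + 1
    upper = min(i + j - 1, 2 * h - i - j - 1)
    return int(lower <= k <= upper and (i + j + k) % 2 == 1)


def s_matrix(level):
    """Modular S matrix of sl(2) at the given level; index 0 is the identity."""
    h = level + 2
    norm = math.sqrt(2.0 / h)
    return [[norm * math.sin(math.pi * i * j / h) for j in range(1, h)]
            for i in range(1, h)]


def verlinde(i, j, k, S):
    """sum_x S_xi S_xj S_xk / S_x1 (S is real for sl(2))."""
    return sum(row[i - 1] * row[j - 1] * row[k - 1] / row[0] for row in S)


def verlinde_matches(level, tol=1e-9):
    """Compare the Verlinde sum with the combinatorial fusion rules."""
    S = s_matrix(level)
    labels = range(1, level + 2)
    return all(abs(verlinde(i, j, k, S) - fusion(i, j, k, level)) < tol
               for i in labels for j in labels for k in labels)


print(all(verlinde_matches(level) for level in range(1, 8)))
```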
General expressions may be obtained for type I theories (7.3). The algebra (7.1) enables one to define a partition of the set V into equivalence classes $T_\kappa$: $a\sim a'$ if
$\exists\,b\in T$: $\hat N_{ab}{}^{a'}\neq 0$ [14,58]. The number of such classes equals the number of representations of A coupled to the identity in the modular invariant, i.e., of $i\in I$ such that
$Z_{1i}\neq 0$.$^{10}$ Since (7.3) applies to the matrix $T^1=Z$, this suggests looking for similar expressions for the other $T^x$. We find that in all known type I cases, in particular for $\widehat{sl}(2)$
theories, the labels x may be taken of the form $(a,b,\kappa)$, $a,b\in V$, $\kappa$ a class label, and
$$T^x_{ij}=\tilde V_{ij^*;1}{}^{(a,b,\kappa)}=\big(P^{(\kappa)}_{ab}\big)_{ij}:=\sum_{c\in T_\kappa}n_{ic}{}^{a}\,n_{jc}{}^{b} \qquad (7.40)$$
with c running over a certain subset $T_\kappa$ of vertices, or equivalently
$$\tilde V_{ij;1}{}^{(a,b,\kappa)}=\sum_{c\in T_\kappa}n_{ic}{}^{a}\,n_{jb}{}^{c}. \qquad (7.41)$$
One checks that indeed $T^1=Z$, the modular invariant matrix as given in (7.3). As the
matrices $n_i$ form a representation of the fusion algebra, $n_in_j=\sum_k N_{ij}{}^{k}\,n_k$, one finds that
upon left multiplication by any $N_f$,
$$N_f\cdot P^{(\kappa)}_{ab}=\sum_{a'}n_{fa}{}^{a'}\,P^{(\kappa)}_{a'b} \qquad (7.42)$$
and likewise, upon right multiplication $P^{(\kappa)}_{ab}\cdot N_f=\sum_d P^{(\kappa)}_{ad}\,n_{fd}{}^{b}$ by repeated use of
$n_f^{T}=n_{f^*}$.

In theories like $Z_n$ orbifolds of $\widehat{sl}(n)$ theories, there is a partition of the set of vertices
into classes $T_\kappa$ such that
$$\hat N_{ba}{}^{a'}=0,\qquad a,a'\in T_\kappa,\quad b\in T_{\kappa'}\neq T, \qquad (7.43)$$
because the $\hat N$ algebra respects the $Z_n$ grading of the vertices. Then let us prove that (7.13)
follows from the ansatz (7.41) with $x=(a,1,\kappa)$
() ()

dc c nj 1 c nj 1 c
Nii i
ni 1 d N
Pa1 ij Pa1 i j =
a
cc i I
c,c i I
Nii
dc c nj 1 c nj 1 c
ni 1 d N
dT
10 This may be established in cases where the $\hat N$ algebra is commutative, and where the structure constants of
both $\hat N$ and its dual $\hat M$ are nonnegative, following the work of [58]. It seems to extend to noncommutative cases
as well, as we checked on the aforementioned case of $\widehat{sl}(4)_4$, where some entries of $\hat M$ are negative or even
imaginary.
Nii i Njj j Zi j
(7.44)
i ,j
where we have repeatedly used (7.2) and (1.1) and on the second line, we have used (7.43)
to restrict the summation over d to the set T; the constraint $c\sim c'$ is then automatically
enforced, which enables us to sum over independent c and c'.

For the case of a conformal embedding $\hat h_k\subset\hat g_1$, we checked in all $\widehat{sl}(2)$ cases and
conjecture in general that the label $\kappa$ may be dropped, and x represented by a pair of
vertices (a, b), $a\in V$, and b running over a subset of vertices. Then we make use of
formula (7.26) to express the eigenvectors $\psi_c^{(j,d;\alpha,\beta)}$, $c\in T$, in terms of the modular $S^{ext}$
matrix of the extended algebra (i.e., of the $\hat g_1$ current algebra). In that way we find,
multiplying (7.41) with Si j Si j ,
(J ; , )
a(j,d;,) (j,d;,
b
,
ext
S1d
x = (a, b),
(7.45)
d,,

where the sum in the l.h.s. runs according to = 1, . . . , Zj j = dT nj 1 d nj1 d , and
that in the r.h.s. runs on = 1, . . . , nj 1 d , = 1, . . . , nj1 d . If there is only one d T
in the sum we can identify = (, ),
if there are more, first has to be split into
a multiple index and then each identified with a pair (, )
depending on d. For a T
(J ; , )
(J ) ext ext
ab c = ext N ab c
= 1 Sad /S1d is consistent with (7.45) and implies that N
a

ext
2

for a, b, c T , using
j,jd Zj j S1j S1j = (S1d ) , and hence, that Na , a T , form
a subalgebra isomorphic to the extended fusion algebra.
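In particular in the cases with commutative $\tilde N$ algebra one computes
$$\tilde V_{i,j;(a_1,b_1)}{}^{(a_2,b_2)}=\sum_{c\in T}\big(n_i\hat N_c\big)_{a_1}{}^{a_2}\,\big(n_j\hat N_c\big)_{b_1}{}^{b_2},$$
$$\tilde N_{(a_1,b_1)(a_2,b_2)}{}^{(a_3,b_3)}=\sum_{c\in T}\big(\hat N_{a_1}\hat N_c\big)_{a_2}{}^{a_3}\,\big(\hat N_{b_1}\hat N_c\big)_{b_2}{}^{b_3},$$
$$\tilde n_{(a,b)}=\hat N_a\,\hat N_b, \qquad (7.46)$$
which ensures that these matrices are integral, nonnegative valued. In particular
$$\tilde V_{(i,1);(a_1,1)}{}^{(a_2,1)}=n_{ia_1}{}^{a_2},\qquad \tilde V_{(1,j);(1,b_1)}{}^{(1,b_2)}=n_{jb_1}{}^{b_2}.$$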
V
The description of $T^x$ as a bilinear form in the n matrices does not seem to restrict to
type I theories like in (7.41). Indeed, this is what happens in the $D_{odd}$ and $E_7$ cases of $\widehat{sl}(2)$
theories. Knowledge of the $T^x$ matrices (and in general of $\tilde V_{ij;(1,1,\kappa)}{}^{x}$ for any $\kappa$) determines
the whole structure. It is easy to invert the (block-)diagonalisation formula of the $T^x$ and
to get, using also (7.11),
$$\sum_{\alpha}\Psi_x^{(J;\alpha,\alpha)}=\sqrt{S_{1j}S_{1\bar j}}\ \sum_{i,\bar i\in I}T^x_{i\bar i}\,S_{ij}\,S^*_{\bar i\bar j}. \qquad (7.47)$$
This determines $\Psi_x$ completely for cases with $Z_{j\bar j}=1$, while higher multiplicities $Z_{j\bar j}>1$
require a little more work and care, see Appendix B for an illustration on the $D_{2\ell}$ case of
$\widehat{sl}(2)$. Once $\Psi$ is known, it is a simple matter to obtain all $\tilde n$, $\tilde N$, $\tilde V$ and $\tilde M$ matrices.
8. Conclusions and perspectives

The reader who has followed us this far should by now be convinced of the relevance
and utility of Ocneanu's DTA A in the detailed study of rational CFT. In our view, two
new concepts developed in this paper in connection with this quantum algebra have proved
particularly useful:
• the generalised CVO, which are covariant under the action of A, unify the treatment
of bulk and boundary fields and permit a more direct discussion of their operator
algebras;
• the twist operators, whose role manifests itself in several ways, give a physical
interpretation to the abstract labels x of the dual algebra $\hat A$ and to the coefficients $\tilde N_{xy}{}^{z}$
and also, through their interplay with bulk fields, provide a new way to determine the
general OPA structure constants in the bulk.
Several points deserve further investigation. First, as already pointed out in the
Introduction, many of our statements which rely on the explicit examination of particular
cases, mainly based on sl(2) and sl(3), and which are presented as conjectures in general,
should be extended in a systematic way to all RCFT. The case of orbifold theories, in which
the relevant graphs would be affine Dynkin diagrams and their generalisations, should be
quite instructive. Other directions of generalisations include irrational CFTs (generic $c\ge 1$
CFTs or N = 2 superconformal CFTs) or noncompact theories like Liouville [66].
Secondly, among the five types of 3-chains F attached to the tetrahedra of Fig. 1, only
two, namely $F$ and $^{(1)}F$, have received a physical interpretation, as they underlie both
the CFT and the related integrable critical lattice models. Understanding the meaning of
the others, which all involve one or several twist labels x, presumably requires a deeper
discussion of the interplay of twist operators with bulk and/or boundary fields.
In fact the general properties of twists and their relations with twisted representations
of the underlying chiral algebra A await a good discussion. We regard as quite significant
that all partition functions either on a torus or on a cylinder with or without defect lines
(twists) are expressible as linear or bilinear forms with nonnegative integer coefficients of
the |V| linear combinations of characters
$$\hat\chi_a := \sum_{i} n_{i1}{}^{a}\, \chi_i = Z_{a|1} . \qquad (8.1)$$
This follows from Eqs. (7.3), (7.4) and from our ansatz (7.41) that in type I the matrices
Tx are bilinear in the n's. (In type II theories, we recall that the n's that appear here are
those of the parent type I theory.) The $\hat\chi_a$ thus appear as the building blocks of all partition
functions. Their natural interpretation, as alluded to above, is that they are the characters
of a class of more general representations of the extended algebra $A^{\rm ext}$. Among them, the
subset $\hat\chi_a$, $a \in T$, represents the ordinary, untwisted, representations. The others have been called
"twisted" [3], or "solitonic" [55]. The induction/restriction method [49,62] of constructing
these twisted sectors essentially amounts to the recursive solution of the system of
Eqs. (1.1), (7.2), (7.1). On the other hand, the direct definition of (some) of these twisted
representations, closer in spirit to the concept of twist as developed in Section 7, has been
achieved only in a limited number of cases, see, e.g., [56].
Acknowledgements
We want to thank Gabriella Böhm, Robert Coquereaux, Patrick Dorey, David Evans,
César Gómez, Liudmil Hadjiivanov, Paul Pearce, Andreas Recknagel, Christoph Schweigert,
Kornél Szlachányi, Raymond Stora, Ivan Todorov, Peter Vecsernyés, Antony
Wassermann, Gerard Watts for useful discussions, and especially, Adrian Ocneanu, for
the inspiration for this work. We thank Ivan Todorov for inviting us to the programme
"Number Theory and Physics" at ESI, Vienna. Hospitality of CERN where this long
paper was completed is also gratefully acknowledged. V.B.P. acknowledges the support
and hospitality of ICTP and INFN, Trieste and partial support of the Bulgarian National
Research Foundation (Contract O-643).
Appendix A. Ocneanu DTA dual structure

This appendix contains some more details on the Ocneanu DTA [1,11] and its WHA
interpretation [4].
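The coproduct (3.18) does not preserve the identity, i.e.,
$$\Delta(1_v) = \sum_{b}\ \sum_{i,j,a,c,\alpha,\gamma} e_{i,\alpha,\alpha}^{(cb)(cb)} \otimes e_{j,\gamma,\gamma}^{(ba)(ba)} =: 1_{v(1)} \otimes 1_{v(2)} \neq 1_v \otimes 1_v , \qquad (A.1)$$
while $\Delta(1) = 1 \otimes 1$ is one of the axioms of a Hopf algebra.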

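For $u, w \in A$ and $uw$ the matrix (vertical) product, one has
$$\varepsilon(uw) = \sum_{i,j,a,b,c,\alpha,\gamma} \varepsilon\big(u\, e_{i,\alpha,\alpha}^{(cb)(cb)}\big)\, \varepsilon\big(e_{j,\gamma,\gamma}^{(ba)(ba)}\, w\big) =: \varepsilon\big(u\, 1_{v(1)}\big)\, \varepsilon\big(1_{v(2)}\, w\big) , \qquad (A.2)$$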

e.g., for u = a,b Ca,b e1aa,bb , w = a,b Ca,b e1aa,bb one gets (uw) = tr(CC ), and

(u)(w) = a,b Cab a ,b Ca b = (uw) in general, while the counit of a Hopf algebra
is an algebra homomorphism.
The antipode is a linear anti-homomorphism, $S(uw) = S(w)S(u)$, defined according to
(3.21), and so that $S^{-1}(u) = (S(u^{*}))^{*}$. It is also an anti-cohomomorphism, i.e., inverts the
coproduct, in the sense that
$$\Delta \circ S = (S \otimes S) \circ \Delta^{\rm op} . \qquad (A.3)$$
Here $\Delta^{\rm op}(u) = u_{(2)} \otimes u_{(1)}$ for $\Delta(u) = u_{(1)} \otimes u_{(2)}$. Furthermore, instead of the Hopf algebra
postulate $S(u_{(1)})\, u_{(2)} = 1_v\, \varepsilon(u)$ the antipode of a WHA satisfies
$$S(u_{(1)})\, u_{(2)} \otimes u_{(3)} = (1_v \otimes u)\, \Delta(1_v) = 1_{v(1)} \otimes u\, 1_{v(2)} . \qquad (A.4)$$
The relations (A.3), (A.4) are checked using both unitarity relations (3.5), as well as (3.13),
(3.14); the choice of the coefficient in (3.21) is essential.
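One turns A into a quasitriangular WHA by defining an R-matrix, i.e., an element
$R \in \Delta^{\rm op}(1_v)\, A \otimes A\, \Delta(1_v)$, which intertwines the two coproducts,
$$\Delta^{\rm op}(u)\, R = R\, \Delta(u) , \qquad (A.5)$$
subject to the constraints,
$$(\Delta \otimes Id)\, R = R_{13} R_{23} , \qquad (Id \otimes \Delta)\, R = R_{13} R_{12} . \qquad (A.6)$$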
Namely,

R=
(1)
F bp
i,j,p
a,a ,b,c,c ,d

, , , ,, ,t
(1) F dp
j
c
i
a
i
c
t
j
a
t

(ba )(cd)
ej, ,
i,j,p
wa,a ;c,c ;,
(c b)(da)
ei, ,
(A.7)
i,j,p
Here wa,a ;c,c ;, is a unitary matrix and to make contact with the CFT under
consideration we choose
i,j,p
wa,a ;c,c ;, = aa cc e
iij
(A.8)

so that the coefficient in (A.7) reproduces the braiding matrix B(+) in (4.14). Similarly one
defines $R^{*} \in \Delta(1_v)\, A \otimes A\, \Delta^{\rm op}(1_v)$, corresponding to B(-); the inversion relation (4.10)
is equivalent to $R^{*} R = \Delta(1_v)$, $R R^{*} = \Delta^{\rm op}(1_v)$. The relations (A.6) are equivalent to the
fusing-braiding relation (4.17) and its counterpart discussed in Section 4. Denoting by P
the permutation operator in $V^i \otimes V^j$, the definition (A.7) with the choice (A.8) implies

1 j,
1 i,
i j
j,
i,

Bbd
() ecd h eda .
P R ecb h eba =
(A.9)
c
a
Pb
P
d

d,
,
The horizontal product in A depicted on Fig. 6 reads more explicitly

t
p;b,b
i j
(cb)(c b )
(da)(d a )
(1)
ei,,
h ej, ,
= bd b d
gij
Fbp
c a
p
, ,t
(1)
F b p
i
c
j
a
t

(ca)(c a )
ep,,
.

(A.10)
Fig. 9. Relating the two bases.
The dual algebra $\hat{A}$ of A is the space of linear functionals on A. It is a matrix algebra
$\hat{A} = \oplus_{x \in \tilde{V}}\, {\rm Mat}_{\tilde{m}_x}$ with matrix unit basis $\{E_x^{(a'a;\alpha)(d'd;\alpha')},\ x \in \tilde{V}\}$, depicted by double
triangles, or blocks, with an intermediate index x, see Figs. 2, 9. The indices $(a, a'; \alpha)$,
$a, a' \in V$, $\alpha = 1, 2, \ldots, \tilde{n}_{ax}{}^{a'}$, label the states in a linear vector space $\hat{V}_x$ of dimension
$\dim(\hat{V}_x) = \sum_{a,a'} \tilde{n}_{ax}{}^{a'} = \tilde{m}_x$. They are depicted in Fig. 1 as triangles with two white and
1 black vertices. The vertical and horizontal products are exchanged in the dual algebra,
i.e., the horizontal product is the matrix product in $\hat{A}$ and the vertical product for the basis
elements $E_x$ is given by a dual analogue of (A.10), with the convention that the second
element appears above the first. In this product the role of the multiplicities $N_j$ and $n_j$
is taken over by $\tilde{N}_x$ and $\tilde{n}_x$, with the relation (1.3) serving now as a consistency relation
replacing (1.1). The dual 3j- and 6j-symbols ${}^{(1)}\tilde{F}$ and $\tilde{F}$, the last two of the tetrahedra
in Fig. 1, satisfy unitarity relations analogous to (3.5) and two more pentagon relations
parallel to (3.9) and (3.7), respectively. The matrix $\tilde{F}$ dual to the fusing matrix F has all
indices of type x, while ${}^{(1)}\tilde{F}_{bz}\left[{x \ y \atop c \ a}\right]$ is a matrix with 3 + 3 indices of type $a, b, c \in V$ and
$x, y, z \in \tilde{V}$. All the steps of Section 2 can be repeated, in particular we can choose a gauge
fixing for ${}^{(1)}\tilde{F}$ analogous to (3.11), using that $d_x P_a = \sum_b \tilde{n}_{ax}{}^{b} P_b$.

The finite dimensional algebras A and $\hat{A}$ can be identified, looking at $\{e_j\}$ and $\{E_x\}$
as providing different bases, see Fig. 9. This introduces a new fusing matrix ${}^{(2)}F$, given,
up to a constant, by the numerical value of the linear functional $E_x(e_j) \in \mathbb{C}$. ${}^{(2)}F$ is
the third tetrahedron on Fig. 1, supported by two black and two white vertices, and two
types of triangles of multiplicities $n_j$ and $\tilde{n}_x$. More explicitly we have

(a a)(d d) (cb)(c b )
Ex ej = Ex;,
ej ;

da (2)
j b
da
F bc
= ac bd a c b d cj cx
(A.11)
c x
with

cjbc cxbc =

dx dj S11 2
.
Pb Pc 11
(A.12)
The equality (7.30) ensures that the number of elements on both sides of Fig. 9 for fixed
$a, a', d, d'$ and varying j and x is the same, so the linear transformation ${}^{(2)}F$ is invertible,
the inverse denoted ${}^{(2)}\tilde{F}$, in the sense of the relations

j b
j b
bc bc (2)
(2)

cj cx
Fbc
= jj ,
F bc
c x
c x
x,,

j,,

cjbc

cxbc (2) Fbc
j
c
b
x
(2)
Fbc
j
c
b
x
= xx .
(A.13)
We shall require that ${}^{(2)}F$ and ${}^{(2)}\tilde{F}$ are trivial for x = 1 and j = 1, analogously to (3.4).
This is consistent with the inverse relations (A.13), inserting (A.12) and using that
$$\sum_x \tilde{n}_{ax}{}^{b}\, d_x = \Big(\frac{\psi_1^1}{S_{11}}\Big)^{2} P_a P_b = \sum_j n_{ja}{}^{b}\, d_j . \qquad (A.14)$$
We recall that the ratio of constants $c_j^{bc}$ appears in the normalisation of the horizontal
product (A.10), and similarly a ratio of the constants $c_x^{bc}$ determines the constant in
the vertical product of the dual basis elements. Inserting the relation in Fig. 9 in both
sides of the horizontal product (A.10) and using furthermore that the horizontal product
acts trivially on the dual basis by a formula analogous to (3.22), one gets the pentagon
relation [1]

t

i j
i b
j c
(2)
(2)
(1)

F b p
F
F
ba
cb
a x
b x
a c

b , , ,

(1)
F bp
i
a
j
c
t

(2)
F ca
p
a
c
x
In terms of the functional values (A.11) the identity (A.15) reads [1]

Ex ei Ex ej
= xx Ex ei h ej
.
(A.15)
(A.16)
Similarly starting from the vertical product analogue of (A.10) for the dual basis we obtain
the dual analogue of (A.15), with ${}^{(1)}F$ and ${}^{(2)}F$ replaced by ${}^{(1)}\tilde{F}$ and ${}^{(2)}\tilde{F}$. The relation in
Fig. 9 allows one to define a sesquilinear form in the algebra determined by the pairing (A.11)
on $A \otimes \hat{A}$, s.t. $\langle E_x, E_{x'} \rangle = \delta_{xx'}\, c_x$. Assuming furthermore that $\langle e_j, e_{j'} \rangle = \delta_{jj'}\, c_j$ leads to
the identification ${}^{(2)}\tilde{F} = {}^{(2)}F^{*}$. Then the above two dual pentagon identities are equivalent
to the identities relating, via the pairing, the coproduct in each of the two algebras to the
product in its dual [4],

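$$\langle E_x E_y, e_p \rangle = \langle E_x \otimes E_y, \Delta(e_p) \rangle , \qquad \langle E_z, e_i e_j \rangle = \langle \hat\Delta(E_z), e_i \otimes e_j \rangle , \qquad (A.17)$$
where the products in the l.h.s. stand for the algebra multiplications in A and $\hat{A}$ (i.e., the
horizontal and vertical products, respectively).¹¹ The coproduct and the horizontal product
are related via the scalar product in A defined above
$$\langle e_i \otimes_h e_j, e_p \rangle = \langle e_i \otimes e_j, \Delta(e_p) \rangle . \qquad (A.18)$$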
We shall furthermore assume the analogues of the symmetry relations (3.13), (3.14)
(compatible with the form of (2)F and with the relation n ax b = n bx a )

Pb Pc (2) j c
j
b
j
c
(2)
(2)
(A.19)
Fbc
F cb
.
=
= F c b
c x
b x
b x
Pb Pc
Inserting the first equality of (A.19) in the relation obtained from (A.15) for p = 1, one
obtains using (3.11)

j a
j a
(2)
(2)
F ab
F cb
= ac .
(A.20)
b x
b x

b ,,
¹¹ The above identification ${}^{(2)}\tilde{F} = {}^{(2)}F^{*}$ appears in [4] (up to different notation) as a solution in the diagonal
cases, where the r.h.s. of (A.12) simplifies to a ratio of q-dimensions. In general the matrix defining the pairing
on $A \otimes \hat{A}$ and its inverse are left unrelated and the equalities (A.17) lead to dual (with respect to the 3j-symbols)
pentagons in both of which only the inverse matrix enters. We are indebted to Gabriella Böhm and Kornél
Szlachányi for a clarifying e-mail correspondence on this point.
Using (A.19) one also checks that the conjugation operation + computed directly from
(A.11) coincides with E + (e) = E(S 1 (e)+ ). To make contact with the basis for the dual
triangles exploited in [4] one has to introduce

(b b)(c c)
Pb Pc (cc )(bb )

,
Ex
= S Ex
x(cc )(bb ) :=
Pb Pc
so that + , e = , S(e)+ .
The identity (A.15) and its dual complete the set of pentagon type relations called the
"Big Pentagon" in [4]. In the diagonal case $Z_{jj'} = \delta_{jj'}$, where all multiplicities $N_i$, $n_i$, $\tilde{N}_x$
and $\tilde{n}_x$ coincide with the Verlinde ones, one can identify $F = {}^{(1)}F = {}^{(2)}F = {}^{(1)}\tilde{F} = \tilde{F}$ since
all pentagon relations involved coincide with (3.9) and the unitarity relations (3.5) and their
dual counterparts, as well as (A.13), reduce to the unitarity of F. The next simple cases
are the permutation modular invariants $Z_{jj'} = Z^{\rm diag}_{j\,\zeta(j')}$, where $\zeta$ is an automorphism of the
fusion rules. For any of these cases $\tilde{V}$ is identified with $\mathcal{I}$, $\tilde{N} = N$, $\tilde{n} = n$, and accordingly
$\tilde{F} = F$, ${}^{(1)}\tilde{F} = {}^{(1)}F$. We notice that in these cases the pentagon identity (A.16) looks like
the fusing-braiding identity (4.16) and this suggests that given ${}^{(1)}F$, and hence by (4.14),
given $\hat{B}$, the latter matrix may provide, up to some constant, a solution for ${}^{(2)}F$. In the
simplest example of the $D_{\rm odd}$ sl(2) series the matrices ${}^{(1)}F$ were computed in [6].
Defining the dual counterparts of (3.22)
$$\hat{E}_x = \sum_{c,b,\alpha} \frac{1}{c_x^{bc}}\, E_{x,\alpha,\alpha}^{(cb)(cb)} , \qquad (A.21)$$
and using the analogues of (A.10) and (3.5) with ${}^{(1)}F$ replaced by ${}^{(1)}\tilde{F}$ one obtains the
algebra (7.16) with the multiplication identified with the vertical product
$$\hat{E}_x \otimes_v \hat{E}_y = \sum_z \tilde{N}_{xy}{}^{z}\, \hat{E}_z . \qquad (A.22)$$
The identity in $\hat{A}$ is given by $1_h = \sum_{x,c,b,\alpha} E_{x,\alpha,\alpha}^{(cb)(cb)}$ and (A.11), (A.13) ensure that $1_h$
coincides with the counit $\frac{1}{|\mathcal{I}|}\, \varepsilon$.¹² The dual algebra $\hat{A}$ cannot be turned in general into
a quasitriangular WHA.
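The relations (A.22), (7.5), (1.5), imply that the chiral generators
$$p_j^{+} = \sum_x \tilde{V}_{(j1)1}{}^{x}\, \hat{E}_x , \qquad p_j^{-} = \sum_x \tilde{V}_{(1j)1}{}^{x}\, \hat{E}_x , \qquad (A.23)$$
satisfy the Verlinde algebra
$$p_i^{\pm} \otimes_v p_j^{\pm} = \sum_k N_{ij}{}^{k}\, p_k^{\pm} , \qquad (A.24)$$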
¹² The factor $1/P_b P_c$ in (A.12), dictated by the requirement of consistency of the full set of pentagon and
inversion equations, can be assigned entirely to one of the constants $c_j$ or $c_x$. Then one of the formulae (3.22)
or (A.21) gives elements in the center of the corresponding algebra, the so called minimal central projections.
However there seems to exist no consistent renormalisation of the two products and of the relation on Fig. 9
making central the basis elements of both algebras (3.23) and (A.22).
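while
$$p_i^{+} \otimes_v p_j^{-} = \sum_z T^{z}{}_{ij}\, \hat{E}_z . \qquad (A.25)$$
Since $\frac{1}{|\tilde{V}|}\, \hat\varepsilon(\hat{E}_z) = \delta_{z1}$, applying $\frac{1}{|\tilde{V}|}\, \hat\varepsilon$ to (A.25) reproduces in the r.h.s. the modular invariant
matrix $Z_{ij} = T^{1}{}_{ij}$ [1]; in [11] the analogous relation reads $\langle \alpha_i^{+}, \alpha_j^{-} \rangle = Z_{ij}$.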
Appendix B. The $\widehat{sl}(2)$ theories

In this appendix, we illustrate the construction of Ocneanu graphs and of the associated
matrix algebras on the $\widehat{sl}(2)$ theories and modular invariants of ADE type.
The cases of $A_n$ and $D_{2\ell+1}$ have been covered by the discussion in Section 7.5: the
A cases are diagonal theories and the $D_{2\ell+1}$ case is obtained from the diagonal case $A_{4\ell-1}$
with the same Coxeter number $h = 4\ell$ by the $Z_2$ automorphism $\zeta(j) = h - j$ of the fusion
rules. The $D_{2\ell}$, $E_6$ and $E_8$ cases have also been implicitly covered there. But we shall
collect here additional data on them and present the $E_7$ case which does not follow from
the previous formulae. Throughout this appendix, we follow the notations of [3] on the
vertices and on the eigenvectors of ordinary Dynkin diagrams.
For the $D_{2\ell}$ theories, in which condition (7.43) is satisfied, formula (7.41) applies with
b = 1 and = 0, 1. Diagonalising the matrices $\tilde{V}_1{}^{x}$ as explained in Section 7.5, and with
a little extra insight to find the appropriate combinations of eigenvectors with exponent
$(J) = (j, \bar{j}) = (2\ell - 1, 2\ell - 1)$ of multiplicity $(Z_{j\bar{j}})^2 = 4$, we find that
{x(J ;,)}
aj
=
2
j
hj
hj
a
!" 2
j =1,3,...,2&3
hj
hj
2#
a(2&1,+)
0
a(2&1,)
0
a(2&1,+)
a(2&1,)
(B.1)
which should be understood as follows: the exponent of in the first four columns run over
j = 1, 3, . . . , 2& 3 (h = 4& 2 is the Coxeter number), and for each j the corresponding
value of the pair (J ) = (j, j) is successively (j, j ), (j, h j ), (h j, j ), (h j, h j ).
In the last four columns, the exponent of is ( h2 , h2 ; , ) with successively (, ) =
(1, 1), (2, 2), (1, 2), (2, 1). The row index x is of the form x = (a, ), as in Section 7.5,
and the first line of (B.1) refers to = 0, the second to = 1.
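It is then easy to compute the various sets of matrices discussed in Section 7. One finds
that
$$\tilde{V}_{ij} = \begin{cases} \begin{pmatrix} n_i n_j & 0 \\ 0 & n_i n_j \end{pmatrix}, & \text{if } j \text{ is odd}, \\[2mm] \begin{pmatrix} 0 & n_i n_j \\ n_i n_j & 0 \end{pmatrix}, & \text{if } j \text{ is even}, \end{cases} \qquad (B.2)$$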

0
Na
0 N
a c ,

Nx =

a
0
a c 0 ,
N
and

n x =
a ,
N
if = 0,
a C, if = 1,
N
if = 0,
(B.3)
if = 1,
(B.4)
where the index c in (B.3) denotes the $Z_2$ involution of vertices of the (ordinary) $D_{2\ell}$
diagram which exchanges the two vertices of the fork and leaves the others invariant,
and $C_{ab} = \delta_{a b^{c}}$. Using these data, one checks (7.31) and (7.32). Finally the matrices $\tilde{M}$
restricted to the physical subset, i.e., those that do not involve $j = \bar{j} = 2\ell - 1$ with two
different labels, are all nonnegative. For $j_1, \bar{j}_1, \ldots \neq h/2 = 2\ell - 1$:
Mj1 j2 j3 , if there is 0 or 2 pairs of (j, h j )

(j3 ,j3 )

M
=
among (j1 , j1 ), (j2 , j2 ), (j3 , j3 ),
(j1 ,j1 )(j2 ,j2 )
0,
otherwise,
1
(2&1,2&1;,)

= Mj1 j2 (2&1,),
M
(j1 ,j1 )(j2 ,j2 )
2
(2&1,2&1,,),(2&1,2&1,,)(J ) = M(2&1,)(2&1,)j ,
M
(2&1,2&1,,),(2&1,2&1,,)(2&1,2&1, , ) = 2 M(2&1,)(2&1,)(2&1, ),
M
in terms of the ordinary Pasquier algebra structure constants $M_{j_1 j_2}{}^{j_3}$ for which explicit
expressions can be found in Appendix A of [17].¹³ These expressions of $\tilde{M}$ are in
agreement with their connection with the relative structure constants (7.35).
We now turn to the three exceptional cases.
The case of E6
In that case, it suffices to take x = (a, b), a = 1, ..., 6, b = 1, 2, and the $\tilde{V}_1{}^{x}$ equal to
the matrices $P_{ab} := P^{(1)}_{ab}$. According to what was stated above in Eq. (7.42), the two sets
$\{P_{a1}\}$ and $\{P_{a2}\}$ are separately closed upon the left action of $N_2$. Moreover, because of
symmetries $P_{13} = P_{62}$, $P_{14} = P_{52}$, $P_{16} = P_{61}$, $P_{15} = P_{51}$, $P_{32} = P_{23}$, $P_{42} = P_{24}$ the two
sets may also be regarded as $\{P_{1a}\}$ and $\{P_{2a}\}$, a = 1, ..., 6, and are separately closed upon
right action of $N_2$. See Fig. 10 on which each vertex x of the graph $\tilde{E}_6$ is assigned its
matrix $\tilde{V}_1{}^{x}$.
Using (7.45), it is easy to compute the various sets of matrices discussed in Section 7.
One finds, in accordance with (7.46), that

Na 0
if x = (a, 1),
0 N
a ,

Nx =
(B.5)
a
0
,
if
x
=
(a,
2),
a N
a N
6
N
¹³ With unfortunately a misprint which we correct here: in the last line of (A.2), the 1/2 should read $1/\sqrt{2}$.
Fig. 10. The Ocneanu graphs of ADE type: each vertex x is assigned its matrix $\tilde{V}_1{}^{x}$, written as a P
matrix as in (7.41) or $\tilde{P}$ in (B.8). Edges of $\tilde{V}_{21}$, respectively $\tilde{V}_{12}$, are shown in red full lines,
respectively blue broken ones, and the vertices of the different cosets for the action of $\tilde{V}_{21}$ are
depicted in different colours.
and $\tilde{n}_x$ is given by the last equation (7.46). One also computes $m_j$ = 6, 10, 14, 18, 20, 20,
20, 18, 14, 10, 6 for j = 1, ..., 11; $\tilde{m}_x$ = 6, 10, 14, 10, 6, 8 and 10, 20, 28, 20, 10, 14 for
x = (a, b = 1) and (a, 2), respectively, a = 1, ..., 6. Hence one checks (7.32): $\sum_j m_j = 156 = \sum_x \tilde{m}_x$,
and (7.31): $\sum_j m_j^2 = \sum_x \tilde{m}_x^2 = 2512$. Finally the matrices $\tilde{M}$ factorise into
a product of ordinary Pasquier matrices
$$\tilde{M}_{(I)(J)}{}^{(K)} = M_{ij}{}^{k}\, M_{\bar{i}\bar{j}}{}^{\bar{k}} . \qquad (B.6)$$
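As a quick numerical cross-check of the two E6 sums quoted above, one may simply re-add the listed multiplicities. The short Python sketch below does only this; the variable names are ours and purely illustrative, and nothing beyond the numbers printed in the text is assumed.

```python
# Multiplicities m_j (j = 1..11) and dual multiplicities m~_x quoted in the text for E6.
m_j = [6, 10, 14, 18, 20, 20, 20, 18, 14, 10, 6]
m_x_b1 = [6, 10, 14, 10, 6, 8]       # x = (a, 1), a = 1..6
m_x_b2 = [10, 20, 28, 20, 10, 14]    # x = (a, 2), a = 1..6
m_x = m_x_b1 + m_x_b2

# Linear sum rule (the check referred to as (7.32)) and quadratic one ((7.31)).
linear = (sum(m_j), sum(m_x))
quadratic = (sum(m * m for m in m_j), sum(m * m for m in m_x))
print(linear, quadratic)
```

Both pairs agree, with the values 156 and 2512 stated in the text.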
The case of E8
In that case there are 8 × 8/2 = 32 matrices that may be taken either as the four sets
$\{P_{a1}\}$, $\{P_{a2}\}$, $\{P_{a3}\}$ and $\{P_{a8}\}$, a = 1, 2, ..., 8, or using once again their symmetries, as
$\{P_{1a}\}$, $\{P_{2a}\}$, $\{P_{3a}\}$ and $\{P_{8a}\}$, a = 1, 2, ..., 8. See Fig. 10.
One then computes

Na 0
a
0 N
x =
N
0
0
0
a
N
x =
N
0
0
0
a
N
0

Na
0
0
0
a
N
0
0
0
,
0
a
N
0
a
N
0
a
7 N
N
if x = (a, 1),
0
0
a ,

N7 N
0
a
N
if x = (a, 2),
0
0
0
a
7 N
a
0
N
N
, if x = (a, 3),
x = 0
N
N
a

0
(N1 + N7 )Na
0
a
a
7 N
7 N
0
N
0 N
a
0
0
0
N
a
7 N
0
N
0
x = 0
N
(B.7)
0 N
a
a , if x = (a, 8),

7 N
0
N7 N
7 N
a
a
0
N
0
N
and $\tilde{n}_x$ as in (7.46). Also $m_j = m_{30-j}$ = 8, 14, 20, 26, 32, 38, 44, 48, 52, 56, 60, 62, 64,
64, 64 for j = 1, ..., 15; $\tilde{m}_x$ = (8, 14, 20, 26, 32, 22, 12, 16), (14, 28, 40, 52, 64, 44,
22, 32), (20, 40, 60, 78, 96, 64, 32, 48) and (16, 32, 48, 64, 78, 52, 26, 40) for x = (a, b =
1), (a, 2), (a, 3) and (a, 8), respectively, a = 1, ..., 8. Hence one checks $\sum_j m_j = 1240 = \sum_x \tilde{m}_x$,
$\sum_j m_j^2 = \sum_x \tilde{m}_x^2 = 63136$. Finally the matrices $\tilde{M}$ factorise again into a product
of ordinary Pasquier matrices, like in (B.6).
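The E8 sums can be re-derived from the listed data in the same spirit. The sketch below is only an arithmetic check under one reading of the text: the fifteen quoted values of $m_j$ are extended to j = 1, ..., 29 using the stated symmetry $m_j = m_{30-j}$. The names `half`, `m_x_by_b` and `totals` are hypothetical helpers introduced for the illustration.

```python
# E8 multiplicities as quoted in the text; m_j is listed for j = 1..15 and
# extended to j = 1..29 with the stated symmetry m_j = m_{30-j}.
half = [8, 14, 20, 26, 32, 38, 44, 48, 52, 56, 60, 62, 64, 64, 64]
m_j = half + half[-2::-1]   # the midpoint j = 15 is not repeated

# Dual multiplicities, keyed by the second label b of x = (a, b), a = 1..8.
m_x_by_b = {
    1: [8, 14, 20, 26, 32, 22, 12, 16],
    2: [14, 28, 40, 52, 64, 44, 22, 32],
    3: [20, 40, 60, 78, 96, 64, 32, 48],
    8: [16, 32, 48, 64, 78, 52, 26, 40],
}
m_x = [m for block in m_x_by_b.values() for m in block]

totals = {
    "sum m_j": sum(m_j),
    "sum m_x": sum(m_x),
    "sum m_j^2": sum(m * m for m in m_j),
    "sum m_x^2": sum(m * m for m in m_x),
}
print(len(m_j), len(m_x), totals)
```

The 29 values of $m_j$ and the 32 values of $\tilde{m}_x$ give the same linear and quadratic sums, matching the numbers in the text.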
The case of E7
This case is known to be related to the $D_{10}$ case. The $P^{(1)}_{ab}$ matrices of $D_{10}$ were defined
in (7.41) by
$$\big(P^{(1)}_{ab}\big)_{ij} = \sum_{c \in T = \{1,3,5,7,9,10\}} n_{ia}{}^{c}\, n_{jb}{}^{c} ,$$
with n the solutions of (1.1) pertaining to $D_{10}$. Using the same matrices, let us now define
the $\tilde{P}$ matrices (twisted version of the P's) by
$$\big(\tilde{P}_{ab}\big)_{ij} = \sum_{c \in \{1,3,5,7,9,10\}} n_{ia}{}^{c}\, n_{jb}{}^{\zeta(c)} , \qquad (B.8)$$
with $\zeta$ the usual involution acting on the vertices of T,
$$\{1, 3, 5, 7, 9, 10\} \mapsto \{1, 9, 5, 7, 3, 10\} .$$
As in Section 7 (Eq. (7.42)), we have the property that upon left (respectively right)
multiplication by $N_2$, $N_2 . \tilde{P}_{ab} = \sum_{a'} n_{2a}{}^{a'}\, \tilde{P}_{a'b}$ (respectively, $\tilde{P}_{ab} . N_2 = \sum_{b'} \tilde{P}_{ab'}\, n_{2b'}{}^{b}$).
Recall that here $n_2$ is the adjacency matrix of $D_{10}$. This relation explains the $D_{10}$ pattern
of the two chiral parts of the Ocneanu graph $\tilde{E}_7$ on Fig. 10: the red full (respectively
blue broken) thick line represents left (respectively right) fusion by $N_2$ and connects the
matrices $\tilde{P}_{a1}$ (respectively $\tilde{P}_{1a}$), a = 1, ..., 10.
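The twisted matrices and the left-multiplication property can be explored numerically. The following Python sketch rests on several assumptions that are ours, not statements of the text: the $D_{10}$ vertices are labelled as a chain 1, ..., 8 with the fork vertices 9 and 10 attached to 8; the matrices $n_i$ are generated from the adjacency matrix by the usual sl(2) recursion $n_{i+1} = n_2 n_i - n_{i-1}$; and $N_2$ is taken to be the adjacency matrix of $A_{17}$. All function names are hypothetical. As an extra consistency check, the entry (a, b) = (1, 1) is compared with the well-known $E_7$ modular invariant.

```python
# Assumption: D10 vertices are labelled as a chain 1-2-...-8 with the two fork
# vertices 9 and 10 both attached to 8 (to be checked against the conventions of [3]).
SIZE, H = 10, 18            # number of vertices and Coxeter number of D10
T = [1, 3, 5, 7, 9, 10]     # the subset T summed over in (B.8)
ZETA = {1: 1, 3: 9, 5: 5, 7: 7, 9: 3, 10: 10}   # involution exchanging 3 and 9


def matmul(x, y):
    """Plain integer matrix product of two lists of lists."""
    return [[sum(x[r][k] * y[k][c] for k in range(len(y))) for c in range(len(y[0]))]
            for r in range(len(x))]


def d10_adjacency():
    g = [[0] * SIZE for _ in range(SIZE)]
    for a, b in [(k, k + 1) for k in range(1, 8)] + [(8, 9), (8, 10)]:
        g[a - 1][b - 1] = g[b - 1][a - 1] = 1
    return g


def n_matrices(g):
    """Assumed recursion: n_1 = identity, n_2 = G, n_{i+1} = G n_i - n_{i-1}, i < H."""
    n = {1: [[int(r == c) for c in range(SIZE)] for r in range(SIZE)], 2: g}
    for i in range(2, H - 1):
        gn = matmul(g, n[i])
        n[i + 1] = [[gn[r][c] - n[i - 1][r][c] for c in range(SIZE)] for r in range(SIZE)]
    return n


def p_twisted(n, a, b):
    """(P~_ab)_{ij} = sum_{c in T} n_{ia}^c n_{jb}^{zeta(c)}, indices i, j = 1..H-1."""
    return [[sum(n[i][a - 1][c - 1] * n[j][b - 1][ZETA[c] - 1] for c in T)
             for j in range(1, H)] for i in range(1, H)]


G = d10_adjacency()
n = n_matrices(G)
N2 = [[int(abs(i - k) == 1) for k in range(H - 1)] for i in range(H - 1)]  # A17 adjacency


def left_action_holds(a, b):
    """Check N2 . P~_ab == sum_{a'} (n_2)_{a a'} P~_{a'b} entry by entry."""
    lhs = matmul(N2, p_twisted(n, a, b))
    rhs = [[0] * (H - 1) for _ in range(H - 1)]
    for a2 in range(1, SIZE + 1):
        if G[a - 1][a2 - 1]:
            p = p_twisted(n, a2, b)
            rhs = [[rhs[r][c] + p[r][c] for c in range(H - 1)] for r in range(H - 1)]
    return lhs == rhs


all_ok = all(left_action_holds(a, b) for a in range(1, SIZE + 1) for b in range(1, SIZE + 1))
Z = p_twisted(n, 1, 1)      # candidate for the E7 modular invariant matrix
print(all_ok, sum(map(sum, Z)))
```

Under these assumptions the left-multiplication relation holds for all pairs (a, b), and the matrix `Z` has the nonzero entries of the $E_7$ invariant, for instance the off-diagonal couplings between the labels 3 and 9.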
These matrices have to be supplemented by others to produce the full set of matrices Tx
and the second part (the coset [1]) of the graph $\tilde{E}_7$. Using the symmetries $(\tilde{P}_{ab})^T = \tilde{P}_{ba}$
etc., of the matrices $\tilde{P}$, we find that starting with matrix $\tilde{P}_{12}$, left multiplication by $N_2$
produces the chain of matrices forming the coset
P16

26 P24 ,

P12 P22 P32 = P18 P42 = P24 P14 X := P
where the splitting of $\tilde{P}_{52}$ into the sum $\tilde{P}_{14} + \tilde{P}_{16}$ has formed the triple point of the $E_7$
diagram. The matrix X itself may be expressed as a bilinear form in the matrices n (relative
to $D_{10}$)
7 N
9 b ni1 a nj 1 b
N
X=
(B.9)
a
a,b=2,4,6,8
in such a way that the pairs (a, b) that are summed over are
(a, b) {(4, 4), (6, 6), (8, 8), (2, 6), (6, 2), (4, 8), (8, 4), (6, 8), (8, 6)}.
One finds, following downward first the D10 subgraph, and then the E7 coset

x
9
0
0
N
N

Nx =
, x = 1, 8;
N9 =
;
0 nx(E7 )
0 n9(E7 ) n3(E7 )

0
n x10
0
x =
10 = N10
N
, x = 11, . . . , 17,
N
(E ) ,
n Tx10
0
0
n3 7
(B.10)
are relative to D10 ; n b , b = 1, . . . , 7, are

where n(E7 ) denote the n-matrices of E7 , and N
seven 10 7 rectangular matrices intertwining the D10 and E7 adjacency matrices (see [3],
Section 3.3, for a formula),
(E7 )
n x = ni
n 11 = n2(E7 ) ;
(E )
n 15 = n8 7 ;
(E7 )
x = 1, 8;
n 9 = n3
n 12 = n1(E7 )
(E )
n 16 = n3 7
+ n3(E7 ) ;
(E )
+ n5 7 ;
(E7 )
n 10 = n9
(E7 )
n3
n 13 = n8(E7 ) ;
n 14
(E7 )
(E7 )
n 17 = n7 n3 .
;
= n3(E7 ) + n5(E7 ) ;
=
One also computes $m_i = m_{18-i}$ = 7, 12, 17, 22, 27, 30, 33, 34, 35 for $1 \le i \le 9$, and $\tilde{m}_x$ =
7, 12, 17, 22, 27, 30, 33, 34, 17, 18; 12, 24, 34, 44, 30, 16, 22, so that $\sum_i m_i = \sum_x \tilde{m}_x = 399$,
$\sum_i m_i^2 = \sum_x \tilde{m}_x^2 = 10905$. Finally, the $\tilde{M}$ matrices may also be computed, and yield
nonnegative numbers $(0, \frac14, \frac12, \frac{1}{\sqrt2}, \frac34, 1, \sqrt2, 2)$, which match what was computed on the
relative structure constants $d^2$.
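The E7 sums can be checked in the same elementary way. The sketch below assumes that the nine quoted values of $m_i$ extend to i = 1, ..., 17 through $m_i = m_{18-i}$, and that the seventeen values of $\tilde{m}_x$ split into ten for the $D_{10}$ subgraph and seven for the $E_7$ coset, as the semicolon in the text suggests; the variable names are illustrative only.

```python
# E7 multiplicities as quoted in the text; m_i is listed for i = 1..9 and
# extended to i = 1..17 with the stated symmetry m_i = m_{18-i}.
half = [7, 12, 17, 22, 27, 30, 33, 34, 35]
m_i = half + half[-2::-1]   # the midpoint i = 9 is not repeated

m_x_d10 = [7, 12, 17, 22, 27, 30, 33, 34, 17, 18]   # assumed: the D10 subgraph
m_x_coset = [12, 24, 34, 44, 30, 16, 22]            # assumed: the E7 coset
m_x = m_x_d10 + m_x_coset

linear = (sum(m_i), sum(m_x))
quadratic = (sum(m * m for m in m_i), sum(m * m for m in m_x))
print(len(m_i), len(m_x), linear, quadratic)
```

Both lists have 17 entries and reproduce the sums 399 and 10905 stated in the text.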
References
[1] A. Ocneanu, Paths on Coxeter diagrams: from platonic solids and singularities to minimal
models and subfactors, in: Rajarama Bhat et al. (Eds.), Lectures on Operator Theory, Fields
Institute Monographies, AMS, 1999.
[2] A. Ocneanu, Quantum symmetries for SU(3) CFT Models, Lectures at Bariloche Summer
School, Argentina, January 2000, to appear in: R. Coquereaux, A. Garcia, R. Trinchero (Eds.),
AMS Contemporary Mathematics.
[3] R.E. Behrend, P.A. Pearce, V.B. Petkova, J.-B. Zuber, Phys. Lett. B 444 (1998) 163-166, hep-th/9809097;
R.E. Behrend, P.A. Pearce, V.B. Petkova, J.-B. Zuber, Nucl. Phys. B 579 (2000) 707-773, hep-th/9908036.
[4] G. Böhm, K. Szlachányi, Lett. Math. Phys. 200 (1996) 437-456, q-alg/9509008;
G. Böhm, K. Szlachányi, Weak C*-Hopf algebras: the coassociative symmetry of non-integral
dimensions, in: Quantum Groups and Quantum Spaces, Vol. 40, Banach Center, 1997, pp. 9-19;
G. Böhm, Weak C*-Hopf algebras and their application to spin models, PhD Thesis, Budapest,
1997.
[5] I. Runkel, Nucl. Phys. B 549 (1999) 563-578, hep-th/9811178.
[6] I. Runkel, Nucl. Phys. B 579 (2000) 561-589, hep-th/9908046.
[7] J.L. Cardy, Nucl. Phys. B 270 (1986) 186-204.
[8] V.B. Petkova, J.-B. Zuber, Generalised twisted partition functions, Phys. Lett. B 504 (2001)
157-164, hep-th/0011021.
[9] S. Mac Lane, Categories for the Working Mathematician, 2nd edn., Springer, 1998.
[10] J. Böckenhauer, D.E. Evans, Commun. Math. Phys. 205 (1999) 183-228, hep-th/9812110.
[11] J. Böckenhauer, D.E. Evans, Y. Kawahigashi, Commun. Math. Phys. 208 (1999) 429-487,
math.OA/9904109;
J. Böckenhauer, D.E. Evans, Y. Kawahigashi, Commun. Math. Phys. 210 (2000) 733-784,
math.OA/9907149.
[12] V.B. Petkova, J.-B. Zuber, PRHEP-tmr2000/038 (Proceedings of the TMR network conference
"Nonperturbative Quantum Effects 2000"), hep-th/0009219.
[13] A. Cappelli, C. Itzykson, J.-B. Zuber, Nucl. Phys. B 280 (1987) 445-465;
A. Cappelli, C. Itzykson, J.-B. Zuber, Commun. Math. Phys. 113 (1987) 1-26;
A. Kato, Mod. Phys. Lett. A 2 (1987) 585-600.
[14] P. Di Francesco, J.-B. Zuber, Nucl. Phys. B 338 (1990) 602-646.
[15] P. Di Francesco, J.-B. Zuber, SU(N) lattice integrable models and modular invariance, in:
S. Randjbar-Daemi, E. Sezgin, J.-B. Zuber (Eds.), Recent Developments in Conformal Field
Theories, Trieste Conference, 1989, World Scientific, 1990;
P. Di Francesco, Int. J. Mod. Phys. A 7 (1992) 407-500.
[16] V. Pasquier, J. Phys. A 20 (1987) 5707-5717.
[17] V.B. Petkova, J.-B. Zuber, Nucl. Phys. B 438 (1995) 347-372, hep-th/9410209.
[18] V.B. Petkova, J.-B. Zuber, Nucl. Phys. B 463 (1996) 161-193, hep-th/9510175;
V.B. Petkova, J.-B. Zuber, Conformal field theory and graphs, hep-th/9701103.
[19] A.N. Kirillov, N.Yu. Reshetikhin, Representations of the algebra Uq(sl(2)), q-orthogonal
polynomials and invariants of links, Adv. Series in Math. Phys. 7 (1989) 285-339.
[20] D. Nikshych, V. Turaev, L. Vainerman, Invariants of knots and 3-manifolds from quantum
groupoids, math.QA/0006078.
[21] G. Moore, N. Seiberg, Commun. Math. Phys. 123 (1989) 177-254.
[22] G. Moore, N.Yu. Reshetikhin, Nucl. Phys. B 328 (1989) 557-574.
[23] C. Gómez, H. Sierra, Phys. Lett. B 240 (1990) 149-157;
C. Gómez, H. Sierra, Nucl. Phys. B 352 (1991) 791-828;
C. Gómez, H. Sierra, A brief history of hidden quantum symmetries in conformal field theories,
hep-th/9211068.
[24] G. Mack, V. Schomerus, Nucl. Phys. B 370 (1992) 185-230.
[25] J.L. Cardy, D.C. Lewellen, Phys. Lett. B 259 (1991) 274-278.
[26] V. Pasquier, Modèles Exacts Invariants Conformes, Thèse d'Etat, Orsay, 1988.
[27] P. Roche, Commun. Math. Phys. 127 (1990) 395-424.
[28] P.A. Pearce, Y.K. Zhou, Int. J. Mod. Phys. B 7 (1993) 3649-3705, hep-th/9304009.
[29] P.P. Kulish, N.Yu. Reshetikhin, E.K. Sklyanin, Lett. Math. Phys. 5 (1981) 393-403.
[30] V. Pasquier, Commun. Math. Phys. 118 (1988) 355-364.
[31] G. Mack, V. Schomerus, Commun. Math. Phys. 134 (1990) 139-196.
[32] P. Furlan, A.Ch. Ganchev, V.B. Petkova, Int. J. Mod. Phys. A 6 (1991) 4859-4884.
[33] L.K. Hadjiivanov, I.T. Todorov, Monodromy representations of the braid group, hep-th/0012099.
[34] A. Recknagel, V. Schomerus, Nucl. Phys. B 531 (1998) 185-225, hep-th/9712186;
A. Recknagel, V. Schomerus, Nucl. Phys. B 545 (1999) 233-282, hep-th/9811237.
[35] J. Fuchs, C. Schweigert, Nucl. Phys. B 530 (1998) 99-136, hep-th/9712257.
[36] D.C. Lewellen, Nucl. Phys. B 372 (1992) 654-682.
[37] G. Pradisi, A. Sagnotti, Ya.S. Stanev, Phys. Lett. B 381 (1996) 97-104, hep-th/9603097.
[38] V.B. Petkova, Int. J. Mod. Phys. A 3 (1988) 2945-2958;
V.B. Petkova, Phys. Lett. B 225 (1989) 357-362;
P. Furlan, A.Ch. Ganchev, V.B. Petkova, Int. J. Mod. Phys. A 5 (1990) 2721-2735.
[39] C.-H. Rehren, Commun. Math. Phys. 116 (1988) 675-688.
[40] M. Jimbo, A. Kuniba, T. Miwa, M. Okado, Commun. Math. Phys. 119 (1988) 543-565.
[41] E. Date, M. Jimbo, A. Kuniba, T. Miwa, M. Okado, Adv. Stud. Pure Math. 16 (1988) 17-122.
[42] M. Wadati, T. Deguchi, Y. Akutsu, Phys. Rep. 180 (1989) 247-332.
[43] P.A. Pearce, Y.K. Zhou, Int. J. Mod. Phys. B 8 (1994) 3531-3577.
[44] V. Pasquier, Nucl. Phys. B 285 (1987) 162-172.
[45] M. Jimbo, T. Miwa, M. Okado, Lett. Math. Phys. 14 (1987) 123-131.
[46] H. Wenzl, Inv. Math. 92 (1988) 349-383.
[47] P. Fendley, J. Phys. A 22 (1989) 4633-4642.
[48] N. Sochen, Nucl. Phys. B 360 (1991) 613-640.
[49] F. Xu, Commun. Math. Phys. 192 (1998) 349-403.
[50] R.E. Behrend, P.A. Pearce, J.-B. Zuber, J. Phys. A 31 (1998) L763-L770, hep-th/9807142.
[51] R.E. Behrend, P.A. Pearce, Integrable and conformal boundary conditions for sl(2) A-D-E
lattice models and unitary minimal conformal field theories, J. Stat. Phys., to appear, hep-th/0006094.
[52] I.V. Cherednik, Teor. Mat. Fiz. 61 (1984) 35-44.
[53] A. Ocneanu, Quantized group string algebras and Galois theory for algebras, in: Operator
Algebras and Applications, Vol. 2, Warwick, 1987, London Math. Soc. Lect. Note Ser., Vol. 136,
Cambridge Univ. Press, pp. 119-172.
[54] J. Böckenhauer, D.E. Evans, Modular invariants from subfactors: type I coupling matrices and
intermediate subfactors, math.OA/9911239.
[55] J. Fuchs, C. Schweigert, Phys. Lett. B 490 (2000) 163-172, hep-th/0006181.
[56] A. Honecker, Nucl. Phys. B 400 (1993) 574-596, hep-th/9211130.
[57] R. Dijkgraaf, E. Verlinde, Nucl. Phys. B (Proc. Suppl.) 5B (1988) 87.
[58] E. Bannai, T. Ito, Algebraic Combinatorics I: Association Schemes, Benjamin/Cummings, New
York, 1984.
[59] W. Lerche, N.P. Warner, in: N. Berkovits, H. Itoyama et al. (Eds.), Strings and Symmetries,
1991, World Scientific, 1992;
P. Di Francesco, F. Lesage, J.-B. Zuber, Nucl. Phys. B 408 (1993) 600-634, hep-th/9306018.
[60] B. Dubrovin, Nucl. Phys. B 379 (1992) 627-689, hep-th/9303152;
B. Dubrovin, Springer Lect. Notes in Math. 1620 (1996) 120-348, hep-th/9407018.
[61] N. Ishibashi, Mod. Phys. Lett. A 4 (1987) 251-264.
[62] J. Böckenhauer, D.E. Evans, Commun. Math. Phys. 200 (1999) 57-103, hep-th/9805023.
[63] J.L. Cardy, Nucl. Phys. B 275 (1986) 200-218.
[64] J.-B. Zuber, Phys. Lett. B 176 (1986) 127-129.
[65] R. Coquereaux, Notes on the quantum tetrahedron, math-ph/0011006.
[66] J. Teschner, PRHEP-tmr2000/041 (Proceedings of the TMR network conference "Nonperturbative Quantum Effects, 2000"), hep-th/0009138;
B. Ponsot, J. Teschner, Clebsch-Gordan and Racah-Wigner coefficients for a continuous series
of representations of Uq(sl(2, R)), math.QA/0007097.

On the structure of open-closed topological field theory in two dimensions
C.I. Lazaroiu
C.N. Yang Institute for Theoretical Physics, SUNY at Stony Brook, NY 11794-3840, USA
Received 29 November 2000; accepted 21 March 2001
Abstract
I discuss the general formalism of two-dimensional topological field theories defined on open-closed
oriented Riemann surfaces, starting from an extension of Segal's geometric axioms. Exploiting the
topological sewing constraints allows for the identification of the algebraic structure governing such
systems. I give a careful treatment of bulk-boundary and boundary-bulk correspondences,
which are responsible for the relation between the closed and open sectors. The fact that these
correspondences need be neither injective nor surjective has interesting implications for the problem of
classifying boundary conditions. In particular, I give a clear geometric derivation of the (topological)
boundary state formalism and point out some of its limitations. Finally, I formulate the problem
of classifying (on-shell) boundary extensions of a given closed topological field theory in purely
algebraic terms and discuss reducibility of boundary extensions. © 2001 Elsevier Science B.V. All
rights reserved.
PACS: 11.10.Cd; 11.10.Kk; 11.25.-w
1. Introduction
The central importance of open-closed strings has become progressively clear since
the discovery of D-branes. It is now generally accepted that a deeper understanding of
open-closed string theory holds the key to deciphering not only D-brane dynamics but also
some of the basic structures involved in non-perturbative proposals for string theory such
as M-theory. On the other hand, recent studies of open-closed string theory on Calabi-Yau
manifolds hold the promise of providing new insight into the phenomena of mirror
symmetry and topology change, as well as harmonizing the mathematical program of
homological mirror symmetry [15] with modern developments in D-brane physics [1,5,
9,10,12,24].
E-mail address: calin@insti.physics.sunysb.edu (C.I. Lazaroiu).
PII: S 0 5 5 0 - 3 2 1 3 ( 0 1 ) 0 0 1 3 5 - 3
C.I. Lazaroiu / Nuclear Physics B 603 [PM] (2001) 497-530
It is somewhat surprising to notice that, in spite of its central importance, our
understanding of open-closed string theory is quite incomplete when compared with the
relatively well-developed framework available for the closed case. While considerable
progress has been made in providing systematic constructions [8,31,33,35], the current
approach is largely based on the boundary state formalism [23], which is sometimes
claimed to reduce most open-string questions to problems formulated in the bulk. While
this is certainly correct for some problems, this approach is in fact rather incomplete and
cannot fully replace a systematic analysis of open-closed conformal/string theory in its
native domain, namely through direct constructions motivated by the geometry of open-closed
Riemann surfaces and two-dimensional field theory dynamics. A clear approach to
this problem seems especially important for studies of extended moduli spaces, which
form the core of the homological mirror symmetry program. Indeed, general points
in the extended moduli space do not admit any standard geometric description, and a
clear definition of the systems under study is necessary in the absence of any intuitive
considerations.¹
The aim of this paper is to provide such an analysis for the simplified case of topological
open-closed field theories in two dimensions. Beyond being technically simpler, such
systems are bound to play a central role in current efforts to analyze D-brane dynamics
in curved backgrounds. In particular, understanding their structure is crucial for studies of
open-closed extensions of mirror symmetry.
By analogy with the closed case, openclosed topological strings can be built by
coupling an openclosed topological field theory to topological gravity defined on
openclosed Riemann surfaces (a generalization of the usual closed two-dimensional
topological gravity of [30]). A detailed understanding requires a close look at each of these
building blocks. In this paper I consider the first element only, namely the formalism of
openclosed topological field theories. These are distinguished from their gravitational
counterpart in that they do not contain a dynamical metric no integration over
worldsheet metrics is necessary in order to achieve diffeomorphism invariance. I consider
the abstract framework of such systems along the lines of [19,28,29] (see [22] for a review).
As in the closed case [18,20,21], one can exploit the topology of bounded Riemann surfaces
and the relevant axioms in order to encode all information about such theories into a
finite set of characteristic (structure) constants. These are subject to a set of conditions
stemming from the topological sewing constraints, and I analyze these in order to extract
the mathematical object they define. After making contact with the usual description in
terms of correlators, I discuss how (a topological version of) the boundary state formalism
can be recovered in this approach, and point out some of its conceptual limitations. I also
give an abstract definition of a boundary extension of a topological bulk theory and briefly
discuss the problem of irreducible versus reducible boundary extensions of a bulk theory.
Finally, I discuss a rather obvious category-theoretic interpretation of boundary data and
point out that this physically-motivated structure underlies recent work on D-brane
categories [7].
1 The reader used to the geometric approach via nonlinear sigma models will readily recognize simple
realizations of our axiomatic constructions as they apply to that situation. In [34], I consider this structure
for the case of open-closed A and B models. However, the main motivation of the present paper is to give a
clear definition of open-closed topological field theory suitable for situations when a worldsheet path integral
approach is either non-existent or unknown. In such cases, the meaning of boundary conditions is unclear, and
the classification of all boundary extensions of a given bulk theory must be approached in an abstract fashion.
The formalism of the present paper is restricted to open-closed topological field theories
on oriented Riemann surfaces. The unoriented case requires a slightly modified approach,
which will not be discussed here. Some of the results derived below are probably familiar to
topological field theory experts, though a clear, general and systematic derivation does not
seem to have been given before. The expert reader may be interested in the detailed analysis
of bulk-boundary and boundary-bulk maps and the topological version of the (generalized)
Cardy constraint discussed in Section 4, as well as the discussion of reducibility and the
category-theoretic interpretation of Section 5, as well as in our treatment of
boundary-condition changing sectors. The mathematical structure governing open-closed
topological field theories is summarized in Section 4.8. I have tried to make this and Section 5
accessible to a mathematical audience, and to this end they collect some results derived in
the rest of the paper in an attempt to make the presentation self-contained. This paper is
foundational and as such it does not contain examples. The example of topological sigma
models in the presence of (many) D-branes will be treated in detail in [34].
2. Axiomatics
2.1. Surfaces, state spaces and products
The framework of open-closed topological field theories in two dimensions (boundary
topological field theories) can be formulated through an extension of the geometric
category approach of [19,28] to the case of bounded Riemann surfaces. In this paper,
we restrict to the case of oriented strings, and hence consider oriented Riemann surfaces
only. Since we allow for general boundary conditions (i.e., we define our theory in the
presence of D-branes), each open string boundary will carry a label (decoration) a, which
indicates the associated boundary sector. 2 Our Riemann surfaces carry two types of
boundaries. First, one has closed and open string boundaries. The former are oriented
circles C, while the latter are oriented segments I . The open string boundaries I carry
boundary sector labels a, b at their ends, which we indicate by writing Ca or Iba (in the
latter case, the convention is that I is oriented from a to b). Second, one has boundary
sector boundaries, which are oriented open or closed curves a carrying a single label a.
These are the bounding curves of the surface $\Sigma$ on which the boundary conditions are imposed.
Since we deal with a topological field theory, we consider all objects up to orientation-preserving diffeomorphisms, which is to say that their parameterizations do not matter.
2 When a nonlinear sigma model description of some sort is available, the indices a label various boundary
conditions/choices of Chan-Paton data.
We shall declare a string boundary $C$ or $I_{ba}$ to be incoming if its orientation agrees
with that of $\Sigma$ and outgoing otherwise. Physically, such boundaries are associated with
incoming/outgoing strings. A topological field theory living on such surfaces defines a
bulk state space $H$ (obtained through quantization on the infinite cylinder) and a collection
of boundary state spaces $H_{ba}$ (obtained through quantization on an infinite strip carrying
boundary conditions a and b). Our convention for the latter is that $H_{ba}$ corresponds to the
state space of the oriented open string stretching from a to b (in this order). The bulk and
boundary state spaces $H$, $H_{ba}$ are $\mathbb{Z}_2$-graded:
$H = H^0 \oplus H^1$, (1)
$H_{ba} = H^0_{ba} \oplus H^1_{ba}$, (2)
where the $\mathbb{Z}_2$-degree of a state can be identified with its Grassmannality. That is, states
belonging to $H^0$, $H^0_{ba}$ are bosonic (and, when such a description is available, generated
by Grassmann-even worldsheet fields), while states in $H^1$, $H^1_{ba}$ are fermionic (and
generated by Grassmann-odd fields). 3 If a state $|\psi\rangle$ has pure degree n,
we shall write
$\deg|\psi\rangle = |\psi| = n$.
A surface $\Sigma$, together with an enumeration of its incoming and outgoing string boundary
components, defines a map $\Phi_\Sigma$ (called the associated product) between the associated
incoming and outgoing state spaces. These are defined through
$H^{in}_\Sigma := \bigotimes_{i=1}^{m} H_{\Gamma^{in}_i}$, $H^{out}_\Sigma := \bigotimes_{j=1}^{n} H_{\Gamma^{out}_j}$,
where $\Gamma^{in}_i$, $\Gamma^{out}_j$ are the (enumerated) incoming and outgoing string
boundary components of $\Sigma$. Let us recall the path integral definition for completeness. 4 In
the path integral formalism, one associates a configuration space with each string boundary
of $\Sigma$. In our situation, one has a bulk configuration space V (the space of configurations
of worldsheet fields restricted to a string bounding circle C) and open configuration spaces
Vba (the space of field configurations on a string bounding interval, subject to the boundary
conditions labeled by a and b at its two ends). Both bulk and boundary configuration
spaces are (infinite-dimensional) supermanifolds. 5 Next, one defines H, Hba as the spaces
of functionals over configuration spaces (functions defined on the supermanifolds V and
Vba ). Their Z2 -grading is induced by Taylor expansion with respect to odd coordinates on
V , Vba .
3 For topological sigma models, the $\mathbb{Z}_2$-grading is induced by a $\mathbb{Z}$-grading associated with the worldsheet
U(1) charge: states of even U(1) charge are Grassmann even and states of odd U(1) charge are Grassmann odd.
A general model does not possess a worldsheet U (1) symmetry, since it need not be obtained by twisting an
N = 2 superconformal field theory. However, the Z2 -degree is always defined.
4 In practice, one often obtains a realization of these axioms through cohomological field theories (such as the
A/B models); in this case, the two-dimensional metric enters as an explicit parameter and becomes irrelevant only
after taking the cohomology of a nilpotent operator Q. The path integral derivation of sewing constraints does not
directly apply to such models, though it is easy to show that they satisfy our axioms on-shell, i.e., after taking
Q-cohomology. For brevity, we illustrate the path-integral origin only for the simpler case of strict (i.e., off-shell
metric-independent) topological field theories.
5 Because we study topological field theories, Grassmann-odd worldsheet/boundary configurations are not
spinors from the worldsheet point of view. This is important when considering sewing operations which produce
nontrivial closed curves, in which case the path integral gives objects of the type $\mathrm{Tr}((-1)^F \{\ldots\})$. The factor
$(-1)^F$ is due to the fact that odd configurations are always periodic along such cycles. This is familiar from the
case of twisted sigma models, where the G-odd fields are related to Ramond sector fermions of the untwisted
model.
Given an enumeration of incoming/outgoing boundaries of $\Sigma$, define the incoming
and outgoing configuration spaces by $V_{in} = \prod_{i=1}^{m} V_{\Gamma^{in}_i}$ and $V_{out} = \prod_{j=1}^{n} V_{\Gamma^{out}_j}$. The
enumeration of incoming/outgoing string boundaries does matter in these definitions, since
the configuration spaces $V$, $V_{ba}$ contain Grassmann-odd elements. Picking $\phi_{in} \in V_{in}$ and
$\phi_{out} \in V_{out}$, we next consider the (Euclidean) path integral over field configurations $\phi$ on $\Sigma$
subject to the boundary conditions $\phi|_{\Gamma^{in}_i} = \phi^{in}_i$ and $\phi|_{\Gamma^{out}_j} = \phi^{out}_j$ on the string boundaries,
and to the boundary conditions indexed by the label a on each remaining boundary:
$K_\Sigma(\phi_{out}, \phi_{in}) = \int_{\phi|_{\Gamma^{out}_j} = \phi^{out}_j,\ \phi|_{\Gamma^{in}_i} = \phi^{in}_i} D[\phi]\, e^{-S[\phi]}$. (3)
This gives a function $K_\Sigma$ defined on $V_{out} \times V_{in}$. It allows us to define the map $\Phi_\Sigma$ from
$H_{in} = \bigotimes_{\Gamma^{in}} H_{\Gamma^{in}}$ to $H_{out} = \bigotimes_{\Gamma^{out}} H_{\Gamma^{out}}$ as follows. For each incoming state $\psi \in H_{in}$, we
define the associated outgoing state $\psi' = \Phi_\Sigma(\psi)$ to be the function(al) on $V_{out}$ given by the
following equation:
$\psi'(\phi_{out}) = \int D[\phi_{in}]\, K_\Sigma(\phi_{out}, \phi_{in})\, \psi(\phi_{in})$. (4)
Here $D[\phi_{in}]$ is the path integral measure on boundary configurations.
In the particular case when all boundaries of $\Sigma$ are incoming, the outgoing space
associated with $\Sigma$ is $H_{out} = \mathbb{C}$ and the map $\Phi_\Sigma$ is a complex-valued linear functional
defined on the incoming space. This is the correlator defined by $\Sigma$, and will also be
denoted by $\langle \ldots \rangle_\Sigma$.
2.2. Axioms
2.2.1. Degree
Topological products are subject to a degree axiom, which requires that all products
with a single output are maps of degree zero, while the maps with two inputs and no output
(the topological metrics, see below) have definite (but model-dependent) degree.
2.2.2. Sewing
The topological surfaces can be composed by sewing at their closed or open string
boundary components. Sewing is allowed only between two closed string boundaries
or two open string boundaries, and the orientations and endpoint labels of the sewn
boundaries must match. Since we deal with a topological field theory, parameterizations
at the boundaries do not matter, and hence there is no twist-sewing operation. Sewing
defines an associative composition on the collection of topological open-closed Riemann
surfaces, which endows it with the structure of a category. In this category, the objects are
direct products of closed and open string boundaries, i.e., oriented topological circles and
oriented segments with endpoint decorations. The morphisms are the Riemann surfaces
themselves mapping incoming into outgoing boundary components, while sewing gives
the morphism compositions (Fig. 1). These compositions are clearly associative. Since the
objects in our category are given by direct products, they come with a choice of ordering
on their components, and hence the Riemann surfaces connecting them are endowed with
distinguished enumerations of the incoming and outgoing string boundaries. What we have
is a generalization of (the topological version of) Segal's geometric category [19].
Fig. 1. A typical open-closed Riemann surface.
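The sewing axiom is the requirement that the correspondence $\Sigma \to \Phi_\Sigma$ be a functor
from the geometric category to the linear category defined by tensor products of the spaces
$H$, $H_{ba}$ together with linear maps between such products. This requires that sewing of two
Riemann surfaces $\Sigma_1$ and $\Sigma_2$ corresponds to composition of the associated maps $\Phi_{\Sigma_1}$ and
$\Phi_{\Sigma_2}$:
$\Phi_{\Sigma_2 \circ \Sigma_1} = \Phi_{\Sigma_2} \circ \Phi_{\Sigma_1}$, (5)
where $\Sigma_2 \circ \Sigma_1$ denotes the surface obtained by sewing the outgoing boundaries of $\Sigma_1$ to the incoming boundaries of $\Sigma_2$.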
The sewing axiom can be derived from elementary properties of the path integral in the
standard manner.
2.2.3. Permutation symmetry (equivariance)
One also requires that the correspondence $\Sigma \to \Phi_\Sigma$ be graded equivariant with
respect to arbitrary permutations of closed string boundaries and cyclic permutations of
open string boundaries. For closed string boundaries, such permutations correspond to the
associated action on the tensor product components of $H_{in}$ and $H_{out}$, and the map $\Phi_{\Sigma_{perm}}$
determined by the permuted surface should be related to $\Phi_\Sigma$ by composing with these
linear operations at its ends. The latter permutations act with signs dictated by the
degree of the permuted elements (permuting two G-odd states gives a minus sign, etc.).
For open boundaries, the condition is imposed on diagrams whose open string boundaries
are all incoming or all outgoing. This corresponds to graded cyclic symmetry of open
amplitudes. Note that only cyclic permutations are allowed, even for the case when all
open string boundaries carry the same label 6 (Fig. 2).
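The sign rule behind graded equivariance can be made concrete with a small sketch. The helper names `koszul_sign` and `cyclic_shift` below are illustrative and not part of the paper; the code only assumes the rule stated above, namely that exchanging two Grassmann-odd states costs a minus sign. For three open string states it reproduces the cyclic sign $(-1)^{|\psi_1|(|\psi_2|+|\psi_3|)}$.

```python
def koszul_sign(degrees, perm):
    """Sign picked up when homogeneous states with the given Z2-degrees are
    reordered so that position k holds the state formerly at position perm[k]."""
    sign = 1
    n = len(perm)
    for i in range(n):
        for j in range(i + 1, n):
            # An inversion means these two states have been moved past each other.
            if perm[i] > perm[j]:
                # Only the exchange of two odd states contributes a minus sign.
                if degrees[perm[i]] % 2 == 1 and degrees[perm[j]] % 2 == 1:
                    sign = -sign
    return sign


def cyclic_shift(n, k=1):
    """Permutation moving the first k entries to the end: (0,...,n-1) -> (k,...,n-1,0,...,k-1)."""
    return [(i + k) % n for i in range(n)]


if __name__ == "__main__":
    # Degrees of (psi_1, psi_2, psi_3); the cyclic shift gives (psi_2, psi_3, psi_1).
    print(koszul_sign([1, 1, 0], cyclic_shift(3)))
```

Only cyclic shifts are relevant for open string boundaries, while any permutation may be fed to `koszul_sign` for closed string boundaries.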
6 This fact is familiar in boundary conformal field theory. In that case, open amplitudes are invariant under
arbitrary permutations of the boundary insertions (assuming all such boundaries carry the same label) only if
such insertions are mutually local (see the second reference of [8]). This happens, for example, for those open
boundary correlators which can be continued to the bulk.
Fig. 2. Permutations allowed in the equivariance axiom. For closed boundaries, any permutation
is allowed among incoming or outgoing data. For open boundaries, equivariance requires cyclic
symmetry of amplitudes. This condition refers only to diagrams having only incoming or only
outgoing open boundaries with the topology of a segment. Only cyclic permutations are allowed
in the second diagram, even for the case a = b = c = d = e.
Fig. 3. The surfaces entering the normalization axiom.
2.2.4. Normalization
Finally, we have to impose a normalization constraint. This requires that the linear
maps defined by the surfaces in Fig. 3 are the identity operators of the corresponding state
spaces, and encodes triviality of topological propagators. 7
3. Basic products, boundary vacua, metrics and traces and their first properties
Any oriented open-closed Riemann surface (with any choice of orientation of its string
boundaries) can be obtained by sewing some combination of the five basic surfaces shown
in Fig. 4. This is the analogue of the well-known pants decomposition of closed Riemann
surfaces. For want of a better name, we shall call the three surfaces shown in Fig. 4(a)-(c)
closed pants, open pants and open-closed conduits. Beyond these,
we also need two exceptional surfaces, namely, the cylinder and the half-strip with certain
string boundary orientations, which are shown in Fig. 4(d), (e). Some of these surfaces are
endowed with boundary decorations, as shown in the figure.
Fig. 4. The basic open-closed Riemann surfaces are considered with the indicated boundary
orientations. Note that sewing with one of the surfaces (d), (e) allows us to reverse the orientation
of any outer leg of the surfaces (a), (b), (c).
7 Cohomological field theories satisfy our axioms only after taking BRST cohomology. For these models,
the bulk and boundary Hamiltonians $H$ and $H_{ba}$ are BRST exact, and they induce trivial propagators in BRST
cohomology. However, off-shell propagation in such models is nontrivial.
The sewing axiom allows us to decompose an arbitrary topological product into the
products defined by the basic surfaces. Hence the entire information about our theory is
encoded by some basic data, which we now consider in turn.
3.1. The closed (bulk) product
This is the degree zero bilinear product $C : H \otimes H \to H$ defined by the surface in
Fig. 4(a). Since closed Riemann surfaces form a closed subclass under sewing, we can
immediately identify this with the basic product of the associated closed topological field
theory. It defines a bulk state-operator correspondence $g$, as follows. To each state $|\phi\rangle \in H$,
we associate the operator $\phi := g(|\phi\rangle)$ from $H$ to $H$ given by:
$\phi|\phi'\rangle := C(|\phi\rangle, |\phi'\rangle)$. (6)
This parallels the bulk state-operator correspondence of closed conformal field theories,
albeit in a simplified fashion.
Since the vector space $H$ is typically finite-dimensional, we can choose a finite basis
$|\phi_i\rangle$ and define the coefficients $C_{ij}^k$ through the expansions:
$C(|\phi_i\rangle, |\phi_j\rangle) = \sum_k C_{ij}^k\, |\phi_k\rangle$. (7)
These are the well-known bulk structure constants familiar from closed topological
field theory. Via the state-operator correspondence, the product C corresponds to usual
composition:
$g(|\phi_i\rangle)\, g(|\phi_j\rangle) = g(C(|\phi_i\rangle, |\phi_j\rangle))$. (8)
This follows from the definition of g, by using associativity of the bulk product C, to be
discussed in Section 4.
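As a hedged illustration of Eqs. (6)-(8), the sketch below uses a toy bulk space that is not taken from the paper: the truncated polynomial algebra $\mathbb{C}[x]/(x^3)$ with basis $\phi_i = x^i$, whose structure constants are $C_{ij}^k = 1$ if $i + j = k < 3$ and zero otherwise. The names `basis`, `C`, `g`, `compose` and `check_eq8` are hypothetical helpers. The state-operator map sends a state to the matrix of multiplication by it, and the check confirms that operator composition reproduces the product.

```python
N = 3  # toy bulk space H = C[x]/(x^3) with basis phi_i = x^i (an assumed example)


def basis(i):
    """Coefficient vector of the basis state phi_i."""
    return [1 if k == i else 0 for k in range(N)]


def C(u, v):
    """Bulk product on coefficient vectors; C_ij^k = 1 if i + j = k < N, else 0."""
    w = [0] * N
    for i in range(N):
        for j in range(N):
            if i + j < N:
                w[i + j] += u[i] * v[j]
    return w


def g(u):
    """Matrix M[k][j] of the operator g(u): v -> C(u, v), as in Eq. (6)."""
    columns = [C(u, basis(j)) for j in range(N)]
    return [[columns[j][k] for j in range(N)] for k in range(N)]


def compose(m1, m2):
    """Matrix product, i.e. composition of the corresponding operators."""
    return [[sum(m1[k][l] * m2[l][j] for l in range(N)) for j in range(N)]
            for k in range(N)]


def check_eq8():
    """Verify g(phi_i) g(phi_j) = g(C(phi_i, phi_j)) on all pairs of basis states."""
    return all(compose(g(basis(i)), g(basis(j))) == g(C(basis(i), basis(j)))
               for i in range(N) for j in range(N))


if __name__ == "__main__":
    print(check_eq8())
```

The check succeeds here because the toy product is associative, which is exactly the property the text invokes.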
3.2. The open (boundary) product
The open pants of Fig. 4(b) define a degree zero bilinear product $B(cba) : H_{cb} \otimes H_{ba} \to H_{ca}$.
We introduce a boundary state-operator correspondence $g(cba)$ which associates to
each state $|\psi\rangle$ of $H_{cb}$ an operator $\psi(a) := g(cba)(|\psi\rangle)$ from $H_{ba}$ to $H_{ca}$:
$\psi(a)|\psi'\rangle := B(cba)(|\psi\rangle, |\psi'\rangle) \in H_{ca}$, (9)
where $|\psi'\rangle \in H_{ba}$. Choosing bases $|\psi^{ba}_\alpha\rangle$ for all spaces $H_{ba}$, we can define boundary
structure constants $B^\gamma_{\alpha\beta}(cba)$ via:
$B(cba)(\psi^{cb}_\alpha, \psi^{ba}_\beta) = \sum_\gamma B^\gamma_{\alpha\beta}(cba)\, \psi^{ca}_\gamma$. (10)
Associativity of the boundary product (to be discussed in Section 4) implies that the
boundary state-operator correspondence takes the boundary product into usual operator
compositions:
$g(cbe)(\psi_2)\, g(bae)(\psi_1) = g(cae)(B(cba)(\psi_2, \psi_1))$, (11)
for $\psi_1 \in H_{ba}$ and $\psi_2 \in H_{cb}$.
As we shall see in more detail below, the role of the diagonal spaces $H_a := H_{aa}$
is slightly different from that of the off-diagonal spaces $H_{ba}$ with $b \neq a$. Following
standard terminology, the operators $\psi^a := \psi^{aa}$ will be called boundary operators in
the sector a, while the operators $\psi^{ba}$ with $b \neq a$ will be called topological boundary
condition changing operators. They are the topological counterparts of CFT operators
bearing the same names.
3.3. The bulk-boundary maps
The surface of Fig. 4(c) defines degree zero bulk-boundary maps, 8 which we denote by
$e(a) : H \to H_a$. These take the closed (bulk) state space $H$ into each diagonal boundary
state space $H_a = H_{aa}$. There is generally no such map into the off-diagonal spaces $H_{ba}$
($b \neq a$). Expressing this map in the bases $\phi_i$ and $\psi^a_\alpha$ allows us to define bulk-boundary
coefficients $e_i^\alpha(a)$ for each boundary condition a:
$e(a)|\phi_i\rangle = \sum_\alpha e_i^\alpha(a)\, |\psi^a_\alpha\rangle$. (12)
The bulk and boundary state-operator correspondences translate this into the bulk-boundary
expansions. These are the maps $E(ab) = g(aab) \circ e(a) \circ g^{-1}$ between the bulk
and boundary operator spaces. We have:
$E(ab)(\phi_i) = \sum_\alpha e_i^\alpha(a)\, \psi^a_\alpha(b)$. (13)
This relation is the topological counterpart of the bulk-boundary expansion in boundary
conformal field theories [23]. In the conformal case, it is usually written without explicit
indication of (the conformal analogue of) the map E(ab). Following the same convention,
we could rewrite it in the more familiar form:
$\phi_i = \sum_\alpha e_i^\alpha(a)\, \psi^a_\alpha$. (14)
However, the maps e(a), E(ab) need be neither injective nor surjective, 9 hence care must be
used when interpreting formal relations such as (14).
8 In a nonlinear sigma model, these are realized through restriction of the bulk fields to the string boundary
components of the Riemann surface $\Sigma$. No such interpretation exists for systems which do not admit a sigma
model description.
3.4. Topological metrics
The last two surfaces of Fig. 4 define complex bilinear maps $\eta : H \otimes H \to \mathbb{C}$ and
$\rho(ab) : H_{ab} \otimes H_{ba} \to \mathbb{C}$, the bulk and boundary topological metrics. The equivariance
axiom shows that these have the graded symmetry properties:
$\eta(\phi_1, \phi_2) = (-1)^{|\phi_1||\phi_2|}\, \eta(\phi_2, \phi_1)$, (15)
$\rho(ab)(\psi_1, \psi_2) = (-1)^{|\psi_1||\psi_2|}\, \rho(ba)(\psi_2, \psi_1)$. (16)
The metrics are invariant with respect to the bulk and boundary products, respectively,
as can be seen from Fig. 5:
$\eta(C(\phi_1, \phi_2), \phi_3) = \eta(\phi_1, C(\phi_2, \phi_3))$, (17)
$\rho(ac)(B(abc)(\psi_1^{ab}, \psi_2^{bc}), \psi_3^{ca}) = \rho(ab)(\psi_1^{ab}, B(bca)(\psi_2^{bc}, \psi_3^{ca}))$. (18)
Moreover, Fig. 6 shows that the metrics are non-degenerate as bilinear forms: 10
$\eta(\phi_1, \phi_2) = 0$ for all $\phi_1$ $\Rightarrow$ $\phi_2 = 0$, (19)
$\rho(ab)(\psi_1^{ab}, \psi_2^{ba}) = 0$ for all $\psi_1^{ab}$ $\Rightarrow$ $\psi_2^{ba} = 0$, (20)
$\rho(ab)(\psi_1^{ab}, \psi_2^{ba}) = 0$ for all $\psi_2^{ba}$ $\Rightarrow$ $\psi_1^{ab} = 0$. (21)
9 Explicit examples of noninjectivity/nonsurjectivity are provided by Calabi-Yau sigma models [34].
10 We assume that our state spaces are all finite dimensional, which is the usual case in practice.
Fig. 5. Invariance of the topological metrics with respect to the bulk and boundary products.
Fig. 6. Graphical construction of inverses for the topological metrics. The two-legged surfaces
below the cuts define states q H H and p(ab) Hab Hba . The figure shows that
( idH )( q) = q and ((ab) idHba )( ab p) = ab . This implies that the topological
metrics are non-degenerate.
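Hence the topological metrics allow us to identify $H$ with $H^*$ and $H_{ab}$ with $H_{ba}^*$,
where $*$ indicates the linear dual of a vector space. Defining the metric coefficients via:
$\eta_{ij} := \eta(|\phi_i\rangle, |\phi_j\rangle)$, (22)
$\rho_{\alpha\beta}(ab) := \rho(ab)(\psi^{ab}_\alpha, \psi^{ba}_\beta)$, (23)
we introduce functionals $\langle\phi_i| \in H^*$ and $\langle\psi^{ab}_\alpha| \in H_{ab}^*$ through the conditions:
$\langle\phi_i|(|\phi_j\rangle) := \langle\phi_i|\phi_j\rangle = \eta_{ij}$, (24)
$\langle\psi^{ab}_\alpha|(|\psi^{ab}_\beta\rangle) := \langle\psi^{ab}_\alpha|\psi^{ab}_\beta\rangle = \rho_{\alpha\beta}(ab)$, (25)
and we have the completeness relation:
$\sum_{i,j} |\phi_i\rangle\, \eta^{ij}\, \langle\phi_j| = \mathrm{id}_H$, (26)
where $\eta^{ij}$ is the matrix inverse to $\eta_{ij}$.
For the boundary sector, we proceed similarly by defining a map $F(ab) : H_{ab} \to H_{ba}$
through the equation:
$F := \sum_{\alpha,\beta} |\psi^{ba}_\alpha\rangle\, \rho^{\alpha\beta}(ab)\, \langle\psi^{ab}_\beta|$, (27)
with $\rho^{\alpha\beta}(ab)$ the inverse of $\rho_{\alpha\beta}(ab)$. This takes $|\psi^{ab}_\alpha\rangle$ into $|\psi^{ba}_\alpha\rangle$ for all $\alpha$, and thus
gives an isomorphism between $H_{ab}$ and $H_{ba}$. Identifying these two spaces via the
isomorphism F, we can treat them as identical, in which case F can be viewed as the
identity operator of the space $H_{ab} \simeq H_{ba}$. Then (27) can be understood as the completeness
relation for the basis $|\psi^{ab}_\alpha\rangle \simeq |\psi^{ba}_\alpha\rangle$ in this vector space.
Fig. 7. Relating topological correlators to other products.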
Note that the bulk and boundary topological metrics are not related in any simple
fashion. In particular, the boundary topological metric is not the boundary restriction
of the bulk metric. 11 That is, one need not have $\rho(a)(e(a)(\phi_1), e(a)(\phi_2)) = \eta(\phi_1, \phi_2)$, as
can be seen from the geometry of the associated Riemann surfaces. 12
3.5. Reduction to correlators
Given a surface $\Sigma$ (without a choice of orientation for its boundary components $\Gamma_i$),
one can use it to define various products $\Phi_{\Sigma,O}$ associated to the possible orientations $O$
of $\Gamma_i$. A canonical choice is to consider incoming boundaries only, in which case one
obtains the correlator $\langle\ldots\rangle_\Sigma$. This can be related to the other products defined by $\Sigma$ with
the help of the topological metrics. For example, let $\langle\ldots\rangle$ be the correlator defined
by the surface and boundary orientations shown in Fig. 7. Then cutting very close to the
outgoing boundary gives:
$\langle\psi_1^{ab}\, \psi_2^{bc}\, \psi_3^{ca}\rangle = \rho(ac)(m(\psi_1^{ab}, \psi_2^{bc}), \psi_3^{ca})$, (28)
where m is the product defined by the three-pronged surface determined by the cut.
Due to non-degeneracy of the topological metrics, we conclude that all products can
be determined from the knowledge of correlators. Similarly, they can all be determined
from knowledge of the topological metrics and of products with a single output.
11 In a nonlinear sigma model, the bulk-boundary map e is typically given by restriction to the boundary.
However, the action of the model will generally contain a nonzero boundary term, such as a boundary coupling
to a gauge connection. The boundary metric is given by a path integral on the strip (upper half plane punctured at
the origin), and hence depends on the boundary action. The bulk metric is given by a path integral on a cylinder
(complex plane punctured at the origin), and depends only on the bulk action.
12 We use the notation $\rho(a) := \rho(aa)$ for the diagonal boundary sectors.
Fig. 8. Defining surfaces for the topological vacua. Each of these surfaces contains a single string
boundary, namely the circle/segment on their right. The boundary topological vacua arise from a path
integral with boundary condition a on the non-string boundary. This gives a functional on the space
of open string states supported on the string boundary, which is the segment Iaa to the right.
Fig. 9. The topological vacua are units for the basic topological products.
Using the state-operator correspondence, we can identify correlators with the topological
vevs of the associated operator products. 13 This recovers the usual formalism.
3.6. Units
The surfaces of Fig. 8 define degree zero linear maps from the field $\mathbb{C}$ of complex
numbers into the spaces $H$ and $H_a$. Evaluating these maps at the complex identity $1 \in \mathbb{C}$
defines special degree zero states which we denote by $|0\rangle \in H$ and $|0_a\rangle \in H_a$. These
states play the role of topological vacua in their respective spaces, as can be seen by
considering the surfaces shown in Fig. 9 below. This figure shows that $C(|0\rangle, |\phi_j\rangle) = |\phi_j\rangle$,
$C(|\phi_i\rangle, |0\rangle) = |\phi_i\rangle$ and $B(aab)(|0_a\rangle, |\psi^{ab}_\alpha\rangle) = |\psi^{ab}_\alpha\rangle$, $B(abb)(|\psi^{ab}_\alpha\rangle, |0_b\rangle) = |\psi^{ab}_\alpha\rangle$,
i.e.:
$C^k_{0j} = \delta^k_j$, $C^k_{i0} = \delta^k_i$ and $B^\beta_{0\alpha}(aab) = \delta^\beta_\alpha$, $B^\beta_{\alpha 0}(abb) = \delta^\beta_\alpha$. (29)
It follows that the operators associated with the states $|0\rangle$ (in $H$) and $|0_a\rangle$ (in $H_a$) are neutral
elements (units) with respect to the bulk and boundary operator products. Note that there
is no natural definition of a topological vacuum in a boundary condition changing sector
$H_{ab}$ with $a \neq b$.
Investigation of Fig. 10 shows that the boundary vacua are related to the bulk vacuum
through the maps e(a):
$e(a)|0\rangle = |0_a\rangle$. (30)
13 This involves the observation that $g(|\phi\rangle)|0\rangle = C(|\phi\rangle, |0\rangle) = |\phi\rangle$ and $g(baa)(|\psi^{ba}\rangle)|0_a\rangle =
B(baa)(|\psi^{ba}\rangle, |0_a\rangle) = |\psi^{ba}\rangle$, where $|0\rangle$ and $|0_a\rangle$ are the topological vacua discussed below, and the
definition of dual states given in Eq. (22).
Fig. 10. Relation between the boundary and bulk topological vacua. Sewing the cap to the conduit
of Fig. 4(c) gives a surface which is topologically equivalent with the half strip.
Fig. 11. The relation between topological traces and metrics.
3.7. Topological traces ( = topological one-point functions)

Considering the surfaces of Fig. 8 with the opposite string boundary orientations gives
linear maps $\mathrm{Tr}$, $\mathrm{Tr}_a$ from the spaces $H$, $H_a$ to the field of complex numbers. Fig. 11 shows
that the bulk and (diagonal) boundary topological metrics can be expressed in terms of
products and traces:
$\eta(\phi_1, \phi_2) = \mathrm{Tr}(C(\phi_1, \phi_2))$, (31)
$\rho(ab)(\psi_1^{ab}, \psi_2^{ba}) = \mathrm{Tr}_a(B(aba)(\psi_1^{ab}, \psi_2^{ba}))$. (32)
In particular, we have:
$\eta(|0\rangle, \phi) = \mathrm{Tr}(\phi)$, (33)
$\rho(a)(|0_a\rangle, \psi^a) = \mathrm{Tr}_a(\psi^a)$. (34)
Hence $\mathrm{Tr}$, $\mathrm{Tr}_a$ are the linear functionals on $H$, $H_a$ dual to the topological vacua $|0\rangle$, $|0_a\rangle$
with respect to the topological metrics $\eta$, $\rho(a)$.
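The relation between traces, products and metrics in Eqs. (31) and (33) can be checked on a toy model. The sketch below assumes, purely for illustration, the bulk algebra $\mathbb{C}[x]/(x^3)$ with basis $\phi_i = x^i$, vacuum $\phi_0 = 1$, and a trace that picks out the coefficient of $x^2$; none of these choices come from the paper, and the names `mul`, `Tr`, `eta`, `ETA` and `VACUUM` are hypothetical. With these choices the metric matrix is anti-diagonal, hence non-degenerate, and it happens to be its own inverse.

```python
N = 3  # toy bulk space C[x]/(x^3), basis phi_i = x^i (assumed example)


def e(i):
    """Coefficient vector of the basis state phi_i."""
    return [1 if k == i else 0 for k in range(N)]


def mul(u, v):
    """Bulk product C(u, v): polynomial multiplication truncated at degree N."""
    w = [0] * N
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            if i + j < N:
                w[i + j] += a * b
    return w


def Tr(u):
    """Assumed topological trace: coefficient of the top monomial x^(N-1)."""
    return u[N - 1]


def eta(u, v):
    """Bulk metric defined through Eq. (31): eta(u, v) = Tr(C(u, v))."""
    return Tr(mul(u, v))


def matmul(p, q):
    """Product of two N x N matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


VACUUM = e(0)                                                # the unit state |0>
ETA = [[eta(e(i), e(j)) for j in range(N)] for i in range(N)]  # eta_ij of Eq. (22)

if __name__ == "__main__":
    print(ETA)
    print(eta(VACUUM, [4, 5, 6]), Tr([4, 5, 6]))
```

Since `ETA` squares to the identity in this example, the inverse metric $\eta^{ij}$ entering the completeness relation (26) has the same entries as $\eta_{ij}$.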
4. Consequences of the sewing constraints

We saw that the entire information about an open-closed topological field theory is
encoded by the three classes of products $C$, $B$, $e$ and the topological metrics $\eta$ and $\rho$. As
in boundary conformal field theory, consistency of topological amplitudes under different
decompositions of the same Riemann surface into the basic surfaces of Fig. 4 imposes
constraints on this data. The sewing constraints in the conformal case have been analyzed
in detail in [27], and since we deal with a topological field theory (which is, in particular,
conformally invariant), we can apply some of those results. The main observation of [27] is
that all sewing constraints are satisfied provided that the five basic conditions described in
Fig. 12 are obeyed. The first two conditions in this figure are the basic sewing constraints of
the closed case (bulk crossing duality and bulk modular covariance), while the remaining
conditions (shown in Fig. 12(c)-(f)) encode boundary, open-open-closed and closed-open-open
crossing duality and a supplementary constraint relating the bulk and boundary
sector. Let us analyze the consequences of these conditions on our basic data $B$, $C$, $e$, $\eta$
and $\rho$. In fact, it turns out that the constraint of Fig. 12(b) is not required in the topological
case (it reduces to a tautology for topological field theories). We have included it in our
discussion since we want to stress similarity with the analysis of [27].
4.1. Bulk crossing symmetry
As in closed topological field theory, the sewing constraint described by Fig. 12(a)
amounts to the statement that the product C must be associative. On the other hand, the
equivariance axiom requires that this product is graded commutative:
$C(\phi_1, \phi_2) = (-1)^{|\phi_1||\phi_2|}\, C(\phi_2, \phi_1)$. (35)
Since $|0\rangle$ is a unit, we conclude that $C$ defines a structure of commutative and
associative ring with a unit on the bulk state space $H$. Since the product is bilinear,
this ring is in fact an algebra over the field of complex numbers. Moreover, we know
that the bulk topological metric is invariant with respect to the product $C$. A graded
associative, graded commutative $\mathbb{C}$-algebra with unit, endowed with a non-degenerate,
graded-symmetric, invariant bilinear form is called a Frobenius (super)algebra [16,17,20,21].
Hence we immediately recover the well-known fact that the bulk data $(H, C, \eta)$ define
a Frobenius algebra.
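To see the graded signs of Eqs. (15), (17) and (35) at work, the following sketch checks the Frobenius superalgebra axioms on a simple example that is assumed here and not discussed in the paper: the Grassmann algebra on two odd generators $\theta_0$, $\theta_1$, with a trace that picks the coefficient of $\theta_0\theta_1$. Basis monomials are labelled by bitmasks, and all function names are hypothetical.

```python
GENS = 2                    # odd generators theta_0, theta_1 (illustrative choice)
TOP = (1 << GENS) - 1       # bitmask of the top monomial theta_0 theta_1
BASIS = range(1 << GENS)    # monomials e_s labelled by bitmasks s of generators


def degree(s):
    """Z2-degree of the monomial e_s: number of generators mod 2."""
    return bin(s).count("1") % 2


def mul(s, t):
    """e_s e_t = sign * e_(s|t); returns (sign, mask), with sign 0 if the product vanishes."""
    if s & t:
        return 0, 0  # a repeated odd generator squares to zero
    # Each generator of e_t must be moved past every higher generator of e_s.
    swaps = sum(1 for a in range(GENS) for b in range(GENS)
                if (s >> a) & 1 and (t >> b) & 1 and a > b)
    return (-1) ** swaps, s | t


def trace(sign, mask):
    """Assumed trace: coefficient of the top monomial."""
    return sign if mask == TOP else 0


def eta(s, t):
    """Metric on basis monomials, eta(e_s, e_t) = Tr(e_s e_t)."""
    return trace(*mul(s, t))


def eta3_left(s, t, u):
    """eta(C(e_s, e_t), e_u), the left-hand side of Eq. (17)."""
    sign, st = mul(s, t)
    return sign * eta(st, u)


def eta3_right(s, t, u):
    """eta(e_s, C(e_t, e_u)), the right-hand side of Eq. (17)."""
    sign, tu = mul(t, u)
    return sign * eta(s, tu)


if __name__ == "__main__":
    print(eta(1, 2), eta(2, 1))  # eta(theta_0, theta_1) and eta(theta_1, theta_0)
```

In this example the metric pairs each monomial with its complementary monomial, so it is non-degenerate, and exchanging the two odd generators flips its sign as Eq. (15) requires.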
4.2. Boundary crossing symmetry
The constraint described by Fig. 12(c) can be written:
$B(acd)(B(abc)(\psi_1, \psi_2), \psi_3) = B(abd)(\psi_1, B(bcd)(\psi_2, \psi_3))$, (36)
which is a decorated associativity condition. On the other hand, equivariance requires
that the triple correlator on the decorated disk is (graded) cyclically symmetric:
$\langle\psi_1^{ab}\, \psi_2^{bc}\, \psi_3^{ca}\rangle_{abc} = (-1)^{|\psi_1^{ab}|(|\psi_2^{bc}|+|\psi_3^{ca}|)}\, \langle\psi_2^{bc}\, \psi_3^{ca}\, \psi_1^{ab}\rangle_{bca}$ (37)
$= (-1)^{|\psi_3^{ca}|(|\psi_1^{ab}|+|\psi_2^{bc}|)}\, \langle\psi_3^{ca}\, \psi_1^{ab}\, \psi_2^{bc}\rangle_{cab}$. (38)
Hence the boundary product is cyclic in the following sense:
$\rho(ac)(B(abc)(\psi_1, \psi_2), \psi_3) = (-1)^{|\psi_1|(|\psi_2|+|\psi_3|)}\, \rho(ba)(B(bca)(\psi_2, \psi_3), \psi_1)$ (39)
$= (-1)^{|\psi_3|(|\psi_1|+|\psi_2|)}\, \rho(cb)(B(cab)(\psi_3, \psi_1), \psi_2)$. (40)
Fig. 12. Graphical depiction of the sewing constraints. The constraint (b) is void for topological field
theories, as explained later in this section.
4.3. Consequences for traces

Associativity of the bulk and boundary products implies that relations (31) generalize to
all tree-level bulk and boundary amplitudes:
$\langle\phi_1, \ldots, \phi_n\rangle_{0,0} = \mathrm{Tr}(\phi_1 \cdots \phi_n)$, (41)
$\langle\psi_1^{a_1a_2}, \ldots, \psi_{n-1}^{a_{n-1}a_n}, \psi_n^{a_na_1}\rangle^{a_1 \ldots a_n}_{0,1} = \mathrm{Tr}_{a_1}(\psi_1^{a_1a_2} \cdots \psi_n^{a_na_1})$. (42)
It is clear from (31) that the traces and topological metrics determine each other, provided
that the bulk/boundary products are known. Hence one can view the traces as derived
concepts, if we treat the topological metrics as fundamental. Cyclicity of topological
metrics (Eqs. (39)) implies that the traces are graded cyclic:
$\mathrm{Tr}(\phi_1\phi_2\phi_3) = (-1)^{|\phi_1|(|\phi_2|+|\phi_3|)}\, \mathrm{Tr}(\phi_2\phi_3\phi_1)$, (43)
$\mathrm{Tr}_a(\psi_1^{ab}\psi_2^{bc}\psi_3^{ca}) = (-1)^{|\psi_1^{ab}|(|\psi_2^{bc}|+|\psi_3^{ca}|)}\, \mathrm{Tr}_b(\psi_2^{bc}\psi_3^{ca}\psi_1^{ab})$, (44)
which in particular justifies their name. Cyclicity of the traces also follows by applying
the equivariance axiom to three-point correlators. In particular, we see that the bulk trace
obeys a stronger constraint, which allows us to commute arbitrary entries:
$\mathrm{Tr}(\phi_1\phi_2) = (-1)^{|\phi_1||\phi_2|}\, \mathrm{Tr}(\phi_2\phi_1)$. (45)
This generally is not allowed for the boundary traces $\mathrm{Tr}_a$.

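Also note that non-degeneracy of the metrics is equivalent to the following properties
of the traces:
$\mathrm{Tr}(\phi_1\phi_2) = 0$ for all $\phi_2$ $\Rightarrow$ $\phi_1 = 0$, (46)
$\mathrm{Tr}_a(\psi_1^{ab}\psi_2^{ba}) = 0$ for all $\psi_2^{ba}$ $\Rightarrow$ $\psi_1^{ab} = 0$, (47)
$\mathrm{Tr}_a(\psi_1^{ab}\psi_2^{ba}) = 0$ for all $\psi_1^{ab}$ $\Rightarrow$ $\psi_2^{ba} = 0$. (48)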
The maps $\mathrm{Tr}$, $\mathrm{Tr}_a$ need not be related to the usual traces $\mathrm{tr}_{H_{ab}}$ on the spaces $H_{ab}$. For
the operators $\psi^a_\alpha$, we have $\psi^a_\alpha|\psi^a_\beta\rangle = B(a)(|\psi^a_\alpha\rangle, |\psi^a_\beta\rangle) = \sum_\gamma B^\gamma_{\alpha\beta}(a)\, |\psi^a_\gamma\rangle$ (where $B(a) :=
B(aaa)$). Hence:
$\mathrm{tr}_{H_a} \psi^a_\alpha = \sum_\beta B^\beta_{\alpha\beta}(a)$ (49)
and
$\mathrm{Tr}_a \psi^a_\alpha = \mathrm{Tr}_a(\psi^a_\alpha 1_a) = \rho_{\alpha 0}(a) = \mathrm{Tr}_a(1_a \psi^a_\alpha) = \rho_{0\alpha}(a)$. (50)
In the topological case, the one-point functions $\mathrm{Tr}(\phi_i) = \eta_{i0} = \eta_{0i}$ and $\mathrm{Tr}_a(\psi^a_\alpha) =
\rho_{\alpha 0}(a) = \rho_{0\alpha}(a)$ need not be zero for nonzero $i$, $\alpha$. Let us compare this situation with
the case of boundary conformal field theories. For such systems, one has a unique state
of conformal dimension h = 0, the conformal vacuum. For a boundary state $\psi^a_\alpha$ of
nonzero dimension, conformal invariance requires $\mathrm{Tr}_a(\psi^a_\alpha) = \langle\psi^{aa}_\alpha\rangle^{aa}_{0,1} = 0$, so that only
the boundary vacua $|0_a\rangle$ can have nontrivial one-point functions. In the topological case,
this constraint cannot be applied for 0-form observables, since these transform trivially
under the diffeomorphism group, and in particular under its conformal subgroup.
In the conformal case, it is well-known that $\langle 1_a\rangle^{a,a}_{0,1}$ can have different values for different
boundary conditions a; in general, one cannot normalize these to have the same value. The
same is true for topological theories.
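The difference between the ordinary operator trace of Eq. (49) and the topological trace of Eq. (50) can be seen in a two-dimensional toy sector. The sketch assumes, for illustration only, a single boundary sector with $H_a = \mathbb{C}[x]/(x^2)$, basis $\psi_0 = 1_a$, $\psi_1 = x$, structure constants $B^\gamma_{\alpha\beta} = 1$ if $\alpha + \beta = \gamma < 2$, and a topological trace picking the coefficient of $x$. All names are hypothetical.

```python
N = 2  # toy boundary space H_a = C[x]/(x^2), basis psi_0 = 1_a, psi_1 = x (assumed)


def B(alpha, beta, gamma):
    """Structure constant B^gamma_{alpha beta}(a) of the toy sector."""
    return 1 if alpha + beta == gamma and gamma < N else 0


def Tr_a(coeffs):
    """Assumed topological trace: coefficient of the top element x."""
    return coeffs[N - 1]


def operator_trace(alpha):
    """Ordinary trace of the operator psi_alpha on H_a, as in Eq. (49)."""
    return sum(B(alpha, beta, beta) for beta in range(N))


def rho(alpha, beta):
    """Metric coefficient rho_{alpha beta}(a) = Tr_a(B(psi_alpha, psi_beta))."""
    return Tr_a([B(alpha, beta, gamma) for gamma in range(N)])


def topological_trace(alpha):
    """Topological one-point function, Eq. (50): Tr_a(psi_alpha) = rho_{alpha 0}(a)."""
    return rho(alpha, 0)


if __name__ == "__main__":
    print([operator_trace(alpha) for alpha in range(N)])
    print([topological_trace(alpha) for alpha in range(N)])
```

In this example the operator trace is supported on the unit, while the topological trace is supported on the nonzero-index state, so the two functionals are genuinely different.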
4.4. The total boundary state space
The properties of boundary correlators can be given a more transparent form as follows.

Consider the total open state space $H_o := \oplus_{a,b} H_{ab}$. On this space, we introduce a total boundary product B defined as follows. If $\psi_1^{ab} \in H_{ab}$ and $\psi_2^{bc} \in H_{bc}$, then $B(\psi_1^{ab}, \psi_2^{bc}) := B(abc)(\psi_1^{ab}, \psi_2^{bc})$. Then we extend B to a bilinear map defined for arbitrary elements $\psi_1$, $\psi_2$ of $H_o$. We also define a total boundary topological metric $\rho$ on $H_o$ through the requirement of bilinearity and the condition that it reduces to $\rho(ab)$ on pure boundary states $\psi_1 \in H_{ab}$, $\psi_2 \in H_{ba}$ (the metric is defined to be zero on states $\psi_1 \in H_{a_1 b_1}$, $\psi_2 \in H_{a_2 b_2}$ which do not satisfy the constraints $a_1 = b_2$, $a_2 = b_1$). Non-degeneracy of all $\rho(ab)$ is equivalent to non-degeneracy of $\rho$. It is not hard to see that the two boundary sewing constraints are equivalent to the conditions:

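\[ B(B(\psi_1, \psi_2), \psi_3) = B(\psi_1, B(\psi_2, \psi_3)), \tag{51} \]
\[ \rho(B(\psi_1, \psi_2), \psi_3) = (-1)^{|\psi_1|(|\psi_2|+|\psi_3|)}\, \rho(B(\psi_2, \psi_3), \psi_1) \tag{52} \]
\[ \phantom{\rho(B(\psi_1, \psi_2), \psi_3)} = (-1)^{|\psi_3|(|\psi_1|+|\psi_2|)}\, \rho(B(\psi_3, \psi_1), \psi_2). \tag{53} \]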
The first of these equations shows that B is associative, while the second is the requirement that the total boundary correlator $\langle \psi_1, \psi_2, \psi_3 \rangle = \mathrm{Tr}(\psi_1 \psi_2 \psi_3)$ (defined on $H_o$ in the obvious fashion; see footnote 14) be cyclically symmetric.
Moreover, it is easy to see that the total boundary vacuum $|0_o\rangle := \oplus_a |0_a\rangle$ is a unit for the total boundary product. We conclude that $(H_o, B, \mathrm{Tr})$ forms a graded associative
algebra with unit (over the complex field), endowed with a non-degenerate graded cyclic
trace. This structure is familiar from open string field theory [14,32], where it appears in
a slightly different context. Since the boundary metric is invariant under the boundary
product:

\[ \rho(B(\psi_1, \psi_2), \psi_3) = \rho(\psi_1, B(\psi_2, \psi_3)), \tag{54} \]
we also conclude that $(H_o, B, \rho)$ is a non-commutative Frobenius algebra.
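The defining properties of a non-commutative Frobenius algebra can be checked on a familiar example. The sketch below is an illustration under stated assumptions, not a construction from the text: it takes the ungraded algebra of 2x2 matrices with the pairing tr(AB), and verifies invariance under the product, non-degeneracy (via the Gram determinant on a basis of elementary matrices), and non-commutativity. All function names are hypothetical.

```python
from itertools import product

N = 2  # toy choice: 2x2 matrices, ungraded

def matmul(a, b):
    """Product of two N x N matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def trace(m):
    return sum(m[i][i] for i in range(N))

def unit(i, j):
    """Elementary matrix with a single 1 in position (i, j)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(N)] for r in range(N)]

def pairing(a, b):
    """Toy boundary metric: the trace form tr(ab)."""
    return trace(matmul(a, b))

def det(m):
    """Determinant by Laplace expansion along the first row (exact for integers)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        if entry != 0:
            minor = [row[:col] + row[col + 1:] for row in m[1:]]
            total += (-1) ** col * entry * det(minor)
    return total

BASIS = [unit(i, j) for i in range(N) for j in range(N)]

# Invariance: pairing(xy, z) == pairing(x, yz) on all basis triples.
invariant = all(pairing(matmul(x, y), z) == pairing(x, matmul(y, z))
                for x, y, z in product(BASIS, repeat=3))

# Non-degeneracy: the Gram matrix of the pairing has nonzero determinant.
gram = [[pairing(x, y) for y in BASIS] for x in BASIS]
gram_det = det(gram)

# Non-commutativity: E_01 E_10 differs from E_10 E_01.
noncommutative = matmul(unit(0, 1), unit(1, 0)) != matmul(unit(1, 0), unit(0, 1))

print(invariant, gram_det, noncommutative)
```

Checking invariance on basis triples suffices because both sides are trilinear.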
4.5. Bulk-boundary crossing symmetry
The constraints of Fig. 12 (d) and (e) can be formulated as follows:
\[ B(abb)(\psi, e(b)(\phi)) = (-1)^{|\phi||\psi|}\, B(aab)(e(a)(\phi), \psi), \tag{55} \]
\[ B(aaa)(e(a)(\phi_1), e(a)(\phi_2)) = e(a)(C(\phi_1, \phi_2)). \tag{56} \]
To simplify their analysis, define the total bulk-boundary map $e : H \to H_o$ by $e := \oplus_a e(a)$. The image of this map is contained in the diagonal subspace $H_d := \oplus_a H_{aa}$.
Then conditions (55) and (56) can be rewritten as:
\[ B(\psi, e(\phi)) = (-1)^{|\phi||\psi|}\, B(e(\phi), \psi), \tag{57} \]
\[ B(e(\phi_1), e(\phi_2)) = e(C(\phi_1, \phi_2)). \tag{58} \]
The second equation shows that e is a morphism from the bulk ring (H, C) to the boundary ring $(H_o, B)$. This morphism preserves units, since $e(a)|0\rangle = |0_a\rangle$ (cf. (30)).
14 This correlator is defined to be zero on pure boundary states $\psi_i \in H_{a_i b_i}$ which fail to satisfy the requirement $b_1 = a_2$, $b_2 = a_3$, $b_3 = a_1$. The total boundary trace is defined through $\mathrm{Tr} := \oplus_a \mathrm{Tr}_a$, and satisfies $\rho(\psi_1, \psi_2) = \mathrm{Tr}(B(\psi_1, \psi_2))$.
If we define a multiplication $H \times H_o \to H_o$ by:
\[ \phi\psi := B(e(\phi), \psi), \tag{59} \]
then the boundary ring (Ho , B) becomes a (graded) algebra over the bulk ring (H, C). To
finish the check of graded algebra properties, we have to show that:

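\[ B(\psi_1, B(e(\phi), \psi_2)) = (-1)^{|\phi||\psi_1|}\, B(B(e(\phi), \psi_1), \psi_2) = (-1)^{|\phi||\psi_1|}\, B(e(\phi), B(\psi_1, \psi_2)) \tag{60} \]
and
\[ B(e(C(\phi_1, \phi_2)), \psi) = B(e(\phi_1), B(e(\phi_2), \psi)). \tag{61} \]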
These properties follow from the constraints (57), (58) and associativity of the boundary product B:
\[ B(\psi_1, B(e(\phi), \psi_2)) = B(B(\psi_1, e(\phi)), \psi_2), \tag{62} \]
\[ B(B(\psi_1, e(\phi)), \psi_2) = (-1)^{|\phi||\psi_1|}\, B(B(e(\phi), \psi_1), \psi_2) = (-1)^{|\phi||\psi_1|}\, B(e(\phi), B(\psi_1, \psi_2)) \tag{63} \]
and:
\[ B(e(C(\phi_1, \phi_2)), \psi) = B(B(e(\phi_1), e(\phi_2)), \psi) = B(e(\phi_1), B(e(\phi_2), \psi)). \tag{64} \]
We conclude that the (associative but generally not graded commutative) boundary ring $(H_o, B)$ has the structure of a graded algebra with unit over the (associative and graded commutative) bulk ring (H, C). The former is endowed with a non-degenerate complex-valued graded cyclic trace.
4.6. Bulk modular invariance
It is easy to see that the condition of Fig. 12(b) does not give any further constraints on
the bulk quantities C and $\eta$; for a topological field theory, this reduces to the tautology
tr(e(., )) = tr(e(., )). In the conformal case, the two surfaces obtained by cutting the
torus along homologically inequivalent cycles are not equivalent, since they have different
complex structures. For topological field theories, however, one can smoothly deform them
into one another, which is why the constraint is void.
4.7. Modular invariance on the cylinder
In contrast with bulk modular invariance, modular invariance on the cylinder gives a
non-trivial constraint on the theory. The reason is that the sewing condition depicted in
Fig. 12(f) involves both closed and open cuts of the surface, and thus relates bulk and
boundary data. This constraint can be written:

\[ \langle \psi_1, \psi_2 \rangle^{ab}_{0,2} = \langle f(a)(\psi_1), f(b)(\psi_2) \rangle_{0,0}, \tag{65} \]
where we used subscripts {g, h} to indicate the genus and number of boundaries of the
surfaces involved.
Fig. 13. The defining surface for the map f .
Fig. 14. The defining surface for the boundary state |aB = f (a)|0a . On the right, we have a cylinder
which interpolates between a bounding circle Ca (which, according to our definition, is not a string
boundary) and a closed string boundary C. As in the case of topological vacua, the point here is
that such a cylinder produces a closed string state H ( = functional on the space of closed string
configurations V on the rightmost circle) upon considering the path integral with boundary condition
a at the left end. Such a path integral contains no string state insertion on the left end, and hence this
end can be allowed to be a non-string boundary.
The map $f(a) : H_a \to H$ appearing in this equation is obtained by considering the surface of Fig. 4(c) with its opposite orientation (see Fig. 13). In particular, the bulk state $|a\rangle_B := f(a)|0_a\rangle$ is described by the surface of Fig. 14. Geometrically, the surface of Fig. 14 translates between non-string bounding circles $C_a$ and closed string circle boundaries C.
The map f(a) is related to e(a) as follows. If $\Sigma$ is the surface of Fig. 4(c) (with both string boundaries taken to be incoming), then the associated correlator $\langle \phi, \psi \rangle^a_\Sigma$ (for $\phi \in H$, $\psi \in H_a$) can be expressed in either of the following two ways:
\[ \rho(a)(e(a)\phi, \psi) = \eta(\phi, f(a)\psi). \tag{66} \]
This tells us that e(a) and f(a) are adjoint to each other with respect to the topological metrics on their defining spaces.
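One can rewrite relation (65) as follows:
\[ \mathrm{tr}_{H_{ab}}\left( (-1)^F\, \Phi_{ab}(\psi_1, \psi_2) \right) = \eta(f(a)(\psi_1), f(b)(\psi_2)), \tag{67} \]
where F is the fermion number (i.e., F counts the degree of states), the map $\Phi_{ab}(\psi_1, \psi_2) : H_{ab} \to H_{ab}$ is defined through:
\[ \psi \mapsto B(abb)(B(aab)(\psi_1, \psi), \psi_2), \tag{68} \]
and $\mathrm{tr}_{H_{ab}}$ is the usual trace in the vector space $H_{ab}$.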
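To make this more specific, let us expand (with summation over repeated indices):
\[ e(a)(\phi_i) = e_i^{\alpha}(a)\, \psi_\alpha^a, \tag{69} \]
\[ f(a)(\psi_\alpha^a) = f_\alpha^{i}(a)\, \phi_i. \tag{70} \]
Then (66) (applied to $\phi_i$ and $\psi_\alpha^a$) gives
\[ f_\alpha^{j}(a)\, \eta_{ij} = e_i^{\beta}(a)\, \rho_{\beta\alpha}(a). \tag{71} \]
Defining $f_{i\alpha}(a) := f_\alpha^{j}(a)\, \eta_{ji}$ and $e_{i\alpha}(a) := e_i^{\beta}(a)\, \rho_{\beta\alpha}(a)$, we obtain:
\[ f_{i\alpha}(a) = e_{i\alpha}(a) \ \Leftrightarrow\ f_\alpha^{i}(a) = \eta^{ij} e_{j\alpha}(a). \tag{72} \]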
In matrix form:
f(a) = eT (a)(a),
(73)
j := ej (a) and (a)

:= (a), (a)
ij := ij (a).
where f(a)i := fi (a), e(a)
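Applying (67) to the states $\psi_1 := \psi_\alpha^a$ and $\psi_2 := \psi_\beta^b$ gives:
\[ \eta_{ij}\, f_\alpha^{i}(a)\, f_\beta^{j}(b) = (-1)^{|\gamma|}\, B_{\alpha\gamma}^{\delta}(aab)\, B_{\delta\beta}^{\gamma}(abb). \tag{74} \]
In this equation, we assumed that the basis $\psi_\gamma^{ab}$ is formed of states of definite degrees $|\psi_\gamma^{ab}| := |\gamma|$. Combining with expression (72) for f, we can rewrite the constraint (67) as follows:
\[ \eta^{ij}\, e_{i\alpha}(a)\, e_{j\beta}(b) = (-1)^{|\gamma|}\, B_{\alpha\gamma}^{\delta}(aab)\, B_{\delta\beta}^{\gamma}(abb). \tag{75} \]
This can be recognized as a topological version of the generalized Cardy constraint (see footnote 15 and the discussion below).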
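To see the constraint (67) at work, consider a deliberately simple toy model. Everything in it is an assumption introduced here for illustration: the bulk space is one-dimensional with $\eta(1,1) = 1$, there is a single boundary sector consisting of ungraded n x n matrices with product given by matrix multiplication, the boundary metric is $r \cdot \mathrm{tr}(AB)$ for a normalization r, and e maps the bulk unit to the identity matrix. Adjointness (66) then gives $f(\psi) = r \cdot \mathrm{tr}(\psi)$, and the sketch compares the two sides of the Cardy constraint.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def unit(n, i, j):
    """Elementary n x n matrix with a single 1 in position (i, j)."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def cardy_sides(psi1, psi2, r):
    """Return (bulk side, boundary side) of the toy Cardy constraint.

    Bulk side: eta(f(psi1), f(psi2)) with f(psi) = r * tr(psi), eta(1, 1) = 1.
    Boundary side: trace of the endomorphism psi -> psi1 * psi * psi2
    (no sign factor, since the toy model is ungraded).
    """
    n = len(psi1)
    bulk = (r * trace(psi1)) * (r * trace(psi2))
    boundary = 0
    for i in range(n):
        for j in range(n):
            image = matmul(matmul(psi1, unit(n, i, j)), psi2)
            boundary += image[i][j]  # coefficient of E_ij in the image of E_ij
    return bulk, boundary

PSI_1 = [[1, 2], [3, 4]]
PSI_2 = [[2, 0], [1, 3]]

for r in (1, -1, 2):
    print(r, cardy_sides(PSI_1, PSI_2, r))
```

In this toy model the two sides agree only when $r^2 = 1$, so the constraint ties the normalization of the boundary metric to that of the bulk metric.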
Finally, we formulate the bulk-boundary sewing conditions in terms of the total boundary space $H_o$. We start by defining the total bulk-boundary map $e := \oplus_a e(a) : H_c \to H_o$ and the total boundary-bulk map $f := \sum_a f(a) : H_{od} \to H_c$. Here $H_{od} := \oplus_a H_a$ is the diagonal part of $H_o$.
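The adjointness relations (66) imply that e and f are adjoint with respect to the bulk metric $\eta$ and the total boundary metric $\rho$:
\[ \eta(\phi, f(\psi)) = \rho(e(\phi), \psi), \quad \text{for all } \phi \in H_c \text{ and } \psi \in H_o, \tag{76} \]
while the topological Cardy constraint becomes:
\[ \eta(f(\psi_1), f(\psi_2)) = \mathrm{tr}_{H_o}\left( (-1)^F\, \Phi(\psi_1, \psi_2) \right), \quad \text{for all } \psi_1, \psi_2 \in H_o, \tag{77} \]
where $\Phi(\psi_1, \psi_2)$ is the endomorphism of $H_o$ defined through:
\[ \Phi(\psi_1, \psi_2) = \oplus_{ab}\, \Phi_{ab}(\psi_1^a, \psi_2^b) = \left( \psi \in H_o \mapsto B(B(\psi_1, \psi), \psi_2) \right), \tag{78} \]
for all pairs of diagonal states $\psi_k = \oplus_a \psi_k^a \in H_o$ (k = 1, 2), of components $\psi_k^a \in H_{aa}$.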
Boundary states
In boundary conformal field theory, a relation similar to (65) is used to define
boundary states. Following this tradition, we call |aB := f (a)| H the topological
boundary state associated with the open string state | Ha (this is an extension of the
terminology used in the conformal case, as we shall see in a moment). Application of the
sewing constraints allows one to reduce certain questions about the diagonal boundary
15 By generalized we mean that we apply the modular constraint of Fig. 12(f) for arbitrary incoming states |
1
and |2 , and not only for |1 = |0a and |2 = |0b as is customary in the conformal field theory literature.
Fig. 15. Geometric description of the composition f (a)e(a). This figure shows that
f (a)e(a)() = C(f (a)|0a , ).
Fig. 16. Geometric description of the composition e(b)f (a).
sectors Haa to problems for the associated boundary states. In particular, the correlator
associated to the surface of Fig. 4(c) can be written:

a

, a = , f (a)() 0,0 = , Ba = e(a)(), 0,1 = (a) e(a)(), . (79)
The adjointness relation (66) tells us that we can compute tree-level bulk-boundary
two-point functions a either as e(a)() a a0,1 (i.e., by pulling the bulk state to the
boundary and computing a correlator on the disk) or as f (a)( a ) = Ba 1,0 , i.e.,
by pushing the state to its bulk image (its associated boundary state) and computing a
correlation function on the sphere. Even though these descriptions are equivalent at the
level of two-point functions, such a duality is not very powerful unless the maps e and f
have appropriate surjectivity/injectivity properties.
Limitations of the boundary state approach
What is the precise power of the boundary state approach? To answer this question, let us investigate the properties of the maps e(a) and f(a). Consideration of Fig. 15 shows that $f(a)e(a)|\phi\rangle = C(f(a)|0_a\rangle, |\phi\rangle) = C(|a\rangle_B, |\phi\rangle)$, where $|a\rangle_B := f(a)|0_a\rangle$. The best we can hope for is to have $f e = \sum_a f(a)e(a) = \mathrm{id}_H$. In that case, $e = \oplus_a e(a)$ would be injective and $f := \sum_a f(a)$ would be surjective. This can be achieved provided that $\sum_a f(a)(|0_a\rangle) = \sum_a |a\rangle_B = |0\rangle$, which is a sort of completeness constraint for the set of
boundary conditions (D-branes) present in the theory. Thus, one can expect a simplification
in the case when all topological D-branes have been included as a background, a situation
which suggests that in a certain sense the bulk data can be recovered from knowledge of
all possible boundary data.
Now consider the composition e(b)f (a), which is shown in Fig. 16. This composition
is also nontrivial, and nothing can be said about it from general considerations. Provided
that e(a)f (a) = idHa in a given model, one could conclude that the boundary-bulk map
is injective and the bulk-boundary map is surjective. Since the former is responsible for
associating a boundary state to an open string state, this would assure us that the boundary
state formalism gives precise information on all open string states (i.e., we do not lose
information on the spaces Ha when taking their images through f (a)). Unfortunately, this
is generally not the case, as one can see in the particular example of Calabi-Yau topological
sigma models [34].
In general, the maps e and f are neither injective nor surjective. In particular, the
boundary algebra (Ho , B) will generally have torsion as a module over the bulk algebra
(H, C).
We saw above that the map e is a ring morphism from the bulk to the boundary algebra.
It is natural to ask whether f has a similar property. This is clearly not the case, 16 since
the quoted property of e is a consequence of the constraint depicted in Fig. 12(e), and no
similar condition holds true for f . The closest candidate would be Fig. 12(d), which cannot
be interpreted in such a manner since one cannot continuously pull the closed string tube
in that figure along the open string strips in order to produce two instances of the map f .
On boundary states as D-branes
Let us add a few remarks on the way boundary states are used in traditional boundary conformal field theory. In standard treatments, one is interested in computing couplings of the boundary vacua $|0_a\rangle$, and we can do the same in our topological field theories. This amounts to considering only couplings of bulk states with the semiclassical D-brane state associated with the boundary condition labeled by a. The associated boundary state $|a\rangle_B = |0_a\rangle_B \in H$ is then called the boundary state defined by the D-brane $D_a$. Then one writes the constraint (67) for incoming states given by the boundary vacua $|0_a\rangle$, $|0_b\rangle$, in which case it reduces to:
\[ \eta(|a\rangle_B, |b\rangle_B) = \eta_{ij}\, f_0^{i}(a)\, f_0^{j}(b) = \mathrm{tr}_{H_{ab}} (-1)^F. \tag{80} \]
This can be recognized as a topological version of the standard Cardy relation (note that $|a\rangle_B = \sum_i f_0^{i}(a)|\phi_i\rangle$). The quantity on the right-hand side is the Witten index of the boundary sector $H_{ab}$.
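The Witten index on the right-hand side of (80) depends only on the degrees of a homogeneous basis of the boundary sector: it is the number of even basis states minus the number of odd ones. A minimal sketch, with a hypothetical function name and made-up degree lists:

```python
def witten_index(degrees):
    """Trace of (-1)^F over a homogeneous basis, given the Z2 degrees of its elements."""
    return sum((-1) ** (d % 2) for d in degrees)

# Hypothetical boundary sector with two even states and one odd state.
print(witten_index([0, 0, 1]))
# A sector with equally many even and odd states has vanishing index.
print(witten_index([0, 1, 0, 1]))
```

Under (80), a vanishing index for the sector $H_{ab}$ means that the boundary states of the two D-branes are orthogonal with respect to the bulk metric.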
It is more useful, however, to view the quantum D-brane Da as being defined by the
entire open string state space Haa , since this includes all of the associated open string
excitations. In fact, this is necessary for logical consistency since inclusion of D-branes
should completely characterize our theory together with the bulk data. Since the boundary
vacua |0a (and even more so the boundary states |aB ) do not suffice to so characterize the
theory, we must conclude that there is more information in D-brane physics than can be
possibly encoded by the boundary vacua |0a . It follows that the technique of boundary
states represents at best a first step towards a complete characterization of boundary
conformal/topological field theories. We should also note that the literature on boundary
states is almost exclusively concerned with the states |aB , which encode information only
about the boundary vacua.
Non-injectivity of f (a) (if present) sets sharp limitations for the boundary state
approach. In fact, this formalism is further weakened by the following problem. Suppose
that one is given a state | in the bulk space H. One would like to know whether it
corresponds to an open string state | a Haa for some a, and, if so, whether this state
16 One can use relation (66) to show that f is a ring morphism provided that ef = id. Unfortunately, this is
generally not true, as can be seen by considering the geometric interpretation of this composition (see below).
is unique. This is precisely the approach followed by many studies of open conformal field theory through the boundary state formalism: instead of trying to determine $H_{aa}$ directly, one tries to find a set of bulk states $|\phi\rangle$ which satisfy certain constraints (such as conformal invariance and Cardy's constraints in the case of rational conformal field theories). This approach is especially useful in the case of abstract models (such as Gepner models) for which a worldsheet description of the associated boundary conditions is difficult. Unfortunately, a solution of such constraints does not necessarily correspond to a true boundary state, since we are not assured that $|\phi\rangle$ lies in the image of any f(a). To establish this for any given candidate, one needs to provide a full construction of a boundary extension of the bulk theory, and show that the candidate boundary state is indeed the image under f(a) of some open string state (say, of the boundary vacuum $|0_a\rangle$). Moreover, the answer to this question could be highly ambiguous, since one does not know a priori what the allowed boundary sectors a are, nor does one know what the maps f(a) are. Even if one knew these maps, it is quite possible to have $|\phi\rangle = \sum_a f(a)|\psi_a\rangle$ for some nonzero states $|\psi_a\rangle \in H_{aa}$, in which case $|\phi\rangle$ is not associated with any single state belonging to one of the spaces $H_{aa}$. Since the full collection of possible boundary sectors a is generally large, it is likely that this problem is quite widespread.
The observations above raise some obvious questions about the extent to which recent work on open-closed Calabi-Yau compactifications (which is largely based on the boundary state approach) gives us unambiguous information about the associated D-branes, and how seriously one can take the geometric interpretation of those states as type A/B D-branes wrapped over specific cycles in the large radius limit. 17
It is apparent that a complete understanding of open-closed extensions of a closed topological/conformal field theory should go beyond the boundary state approach. At the very least, any attempt at classifying such extensions must explicitly include the freedom allowed by the choice of the maps e(a), i.e., must take into consideration the bulk-boundary operator products as part of the extension problem, rather than as an afterthought. In this respect, recent progress has been made in [33], where bulk-boundary products are discussed in certain rational conformal field theory situations.
What is an abstract boundary condition?
In an abstract system, one lacks a direct construction of the boundary theory through
boundary conditions and boundary couplings in the action. If one is interested in
classifying all open-closed theories compatible with given bulk data, one would like to
have a conceptual definition of the boundary part of such a system. The traditional
17 Many such identifications of D-branes with particular cycles have been proposed in recent work [5,6]. Such proposals are based largely on cohomological arguments, and seem to neglect most of the issues we just mentioned, as well as the more basic geometric fact that specifying a cohomology class does not fix a (cycle, bundle) pair or a complex of such with any reasonable degree of uniqueness (i.e., it does not fix them up to the natural ambiguity of such data in topological sigma models, which is specified by gauge transformations of a certain kind). I should however mention that there are situations where symmetry arguments can be used to reduce, though not completely eliminate, this ambiguity. In any case, it should be clear that the issue of such identifications is far from having been settled.
approach to this problem is to try to isolate the boundary data through use of the boundary state formalism, and to roughly identify the boundary states $|a\rangle_B$ with abstract boundary configurations. As mentioned above, this approach encounters certain conceptual difficulties, the most obvious of which is that it does not take into account the bulk-boundary map e(a). The latter is essentially restriction to the boundary in the standard case of nonlinear sigma models (with geometric boundary conditions). Therefore, its specification is crucial for any consistent definition of boundary data.
I believe that the only generally meaningful procedure is to define boundary data as the entire system $(H_o, B, \rho, e)$. This reduces the task of classifying boundary theories associated to given bulk data to the problem of finding all non-commutative Frobenius algebras over $(H, C, \eta)$ which obey the topological Cardy constraint. In such generality, this problem can be expected to have solutions which do not fit into a geometric boundary condition approach; for example, a nonlinear sigma model or Landau-Ginzburg model at the conformal point could possess more open-closed extensions than predicted by the classical boundary condition approach of [2,11]. Whether such an abstract boundary extension has a classical boundary condition interpretation or not is a model-dependent question which is largely irrelevant once a complete solution of the problem is known.
4.8. Summary
We showed that a (two-dimensional) topological open-closed field theory is equivalent to the following data:
(1) A Frobenius (super)algebra $(H, C, \eta)$ over the complex numbers. We recall that this is an associative, graded commutative algebra (H, C) with unit $|0\rangle$, endowed with a graded-symmetric nondegenerate bilinear form $\eta$ (the bulk topological metric). This metric has a definite degree $\deg(\eta)$ (i.e., $\eta(\phi_1, \phi_2) = 0$ unless $|\phi_1| + |\phi_2| = \deg(\eta)$), which is model-dependent.
(2) A collection of finite-dimensional vector spaces $(H_{ab})$ indexed by pairs of labels a, b drawn from a label set (taken to be finite, for ease of exposition), together with degree zero bilinear maps $B(abc) : H_{ab} \times H_{bc} \to H_{ac}$ and bilinear maps $\rho(ab) : H_{ab} \times H_{ba} \to \mathbb{C}$ of definite (but model-dependent) degree, with the following properties:
(2.1) $B(abc)(B(adb)(\psi_1, \psi_2), \psi_3) = B(adc)(\psi_1, B(dbc)(\psi_2, \psi_3))$.
(2.2) $B(abb)(\psi, |0_b\rangle) = B(aab)(|0_a\rangle, \psi) = \psi$ for some elements $|0_a\rangle \in H_a$.
(2.3) $\rho(ab)$ are nondegenerate and satisfy the graded-commutativity relations:
\[ \rho(ab)(\psi_1, \psi_2) = (-1)^{|\psi_1||\psi_2|}\, \rho(ba)(\psi_2, \psi_1) \tag{81} \]
and the cyclicity property:
\[ \rho(\psi_1, B(\psi_2, \psi_3)) = (-1)^{|\psi_1|(|\psi_2|+|\psi_3|)}\, \rho(\psi_2, B(\psi_3, \psi_1)) \tag{82} \]
\[ \phantom{\rho(\psi_1, B(\psi_2, \psi_3))} = (-1)^{|\psi_3|(|\psi_1|+|\psi_2|)}\, \rho(\psi_3, B(\psi_1, \psi_2)); \tag{83} \]
(3) Degree zero linear maps $e(a) : H \to H_a$ with the properties:
(3.1) $e(a)|0\rangle = |0_a\rangle$.
(3.2) $B(aaa)(e(a)\phi_1, e(a)\phi_2) = e(a)(C(\phi_1, \phi_2))$.
(3.3) $B(abb)(\psi, e(b)\phi) = (-1)^{|\phi||\psi|}\, B(aab)(e(a)\phi, \psi)$.
This data is such that:
(4) the topological Cardy constraint (67) is satisfied.
For the reader's convenience, let us explain how one can determine the basic data $\eta$, $\rho$, C, B, e from computations of topological correlators. It is convenient to choose the bases $\phi_i$ and $\psi_\alpha^{ab}$ of H, $H_{ab}$ to be homogeneous, i.e., such that $\phi_i$ are elements of definite degree $|i|$ and $\psi_\alpha^{ab}$ are elements of definite degree $|\alpha|$. 18 Since $\rho(ab)$ is nondegenerate and of definite degree, the spaces $H_{ab}$ and $H_{ba}$ are isomorphic (possibly after a shift of grading) as graded vector spaces and hence the bases $\psi_\alpha^{ab}$ and $\psi_\alpha^{ba}$ are indexed by the same set of labels $\alpha$. Raising/lowering indices with the bulk and boundary topological metrics, we define:
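\[ C_{ijk} := C_{ij}^{l}\, \eta_{lk}, \tag{84} \]
\[ B_{\alpha\beta\gamma}(abc) := B_{\alpha\beta}^{\delta}(abc)\, \rho_{\delta\gamma}(ac), \qquad e_{i\alpha}(a) := e_i^{\beta}(a)\, \rho_{\beta\alpha}(a), \tag{85} \]
where we use the notations $\rho(a) := \rho(aa)$ etc. One has:
\[ C_{ijk} = \eta(C(\phi_i, \phi_j), \phi_k) = \langle \phi_i \phi_j \phi_k \rangle_{0,0}, \tag{86} \]
\[ B_{\alpha\beta\gamma}(abc) = \rho(ac)(B(\psi_\alpha^{ab}, \psi_\beta^{bc}), \psi_\gamma^{ca}) = \langle \psi_\alpha^{ab} \psi_\beta^{bc} \psi_\gamma^{ca} \rangle_{0,1}, \tag{87} \]
\[ e_{i\alpha}(a) = \rho(a)(e(a)(\phi_i), \psi_\alpha^a) = \langle \phi_i \psi_\alpha^a \rangle_{0,1}. \tag{88} \]
On the other hand, we have:
\[ \eta_{ij} = \langle \phi_i \phi_j \rangle_{0,0}, \tag{89} \]
\[ \rho_{\alpha\beta}(ab) = \langle \psi_\alpha^{ab} \psi_\beta^{ba} \rangle_{0,1}. \tag{90} \]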
Hence all relevant data can be determined by computing:
(1) The two- and three-point functions on the sphere $\eta_{ij}$ and $C_{ijk}$.
(2) The boundary two- and three-point functions on the disk $\rho_{\alpha\beta}(ab)$ and $B_{\alpha\beta\gamma}(abc)$.
(3) The bulk-boundary two-point function on the disk $e_{i\alpha}(a) = f_{i\alpha}(a)$.
In coordinates, the constraints on this data are as follows:
(a) $\eta_{ij}$ and $\rho_{\alpha\beta}(ab)$ are non-degenerate and we have:
\[ \eta_{ij} = (-1)^{|i||j|}\, \eta_{ji}, \tag{91} \]
\[ \rho_{\alpha\beta}(ab) = (-1)^{|\alpha||\beta|}\, \rho_{\beta\alpha}(ba). \tag{92} \]
(b) $C_{ijk}$ are graded symmetric:
\[ C_{ijk} = (-1)^{|i||j|}\, C_{jik} = (-1)^{(|i|+|j|)|k|}\, C_{kij}, \ \text{etc.}, \tag{93} \]
and form the structure constants of an associative algebra with unit. We can choose this unit $|0\rangle$ to be part of our basis: $\phi_0 := |0\rangle$. In this case, we must have:
\[ C_{0j}^{k} = \delta_j^k, \tag{94} \]
\[ C_{i0}^{k} = \delta_i^k. \tag{95} \]
Notice that $\eta_{jk} = C_{0jk}$ with this choice of basis.

18 |i| and || should not be confused with absolute values!
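(c) $B_{\alpha\beta\gamma}(abc)$ are graded cyclically symmetric:
\[ B_{\alpha\beta\gamma}(abc) = (-1)^{|\alpha|(|\beta|+|\gamma|)}\, B_{\beta\gamma\alpha}(bca) = (-1)^{|\gamma|(|\alpha|+|\beta|)}\, B_{\gamma\alpha\beta}(cab) \tag{96} \]
and satisfy the associativity property:
\[ B_{\alpha\beta}^{\epsilon}(abc)\, B_{\epsilon\gamma}^{\delta}(acd) = B_{\alpha\epsilon}^{\delta}(abd)\, B_{\beta\gamma}^{\epsilon}(bcd). \tag{97} \]
They are also required to admit units $|0_a\rangle$ in the sense of Section 3.6. Choosing $\psi_0^{aa} := |0_a\rangle$, we can formulate this as the constraints:
\[ B_{0\alpha}^{\beta}(aab) = \delta_\alpha^\beta, \tag{98} \]
\[ B_{\alpha 0}^{\beta}(abb) = \delta_\alpha^\beta. \tag{99} \]
Notice that $\rho_{\alpha\beta}(ab) = B_{0\alpha\beta}(aab)$ with this choice of basis.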

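(d) $e_i^{\alpha}$ induce a graded algebra structure of $H_o$ over H, i.e., we have (see Eqs. (55), (56)):
\[ B_{\alpha\beta}^{\gamma}(abb)\, e_i^{\beta}(b) = (-1)^{|i||\alpha|}\, e_i^{\beta}(a)\, B_{\beta\alpha}^{\gamma}(aab), \tag{100} \]
\[ B_{\alpha\beta}^{\gamma}(aaa)\, e_i^{\alpha}(a)\, e_j^{\beta}(a) = e_k^{\gamma}(a)\, C_{ij}^{k}. \tag{101} \]
(e) The topological Cardy constraint (75) is satisfied.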

As in closed topological field theory, the entire information is contained in tree-level amplitudes. 19 Hence one can invert the argument and define an (on-shell) topological open-closed field theory to be given by such a structure. In particular, on-shell deformations of open-closed topological models are governed by the deformation theory of such objects.
5. Decompositions and irreducible boundary theories

We saw that the boundary and bulk-boundary sewing constraints admit a simplified form when expressed in terms of the total boundary state space. In this section, I analyze the structure of open-closed topological field theories from this point of view. Since the data summarized in Section 4.8 is quite complicated, it is useful to follow a formal approach and start with a few definitions.
Definition 5.1. A bulk algebra is a Frobenius (super)algebra $(H, C, \eta)$, whose unit we denote by $|0\rangle$. Remember that we take the invariant, nondegenerate metric $\eta$ to be graded-symmetric, as required by the equivariance axiom.
Definition 5.2. A boundary algebra is a triple $(H_o, B, \rho)$ with the properties:
(1) $H_o$ is a (finite-dimensional) complex vector space, endowed with a $\mathbb{Z}_2$-grading $H_o = H_o^0 \oplus H_o^1$.
(2) $(H_o, B)$ is an associative graded algebra over the field of complex numbers.
In particular, the product $B : H_o \times H_o \to H_o$ is bilinear with respect to complex
19 This is of course not the case in topological string theory, i.e., after coupling to topological gravity.
multiplication and has degree zero. This algebra is endowed with a unit which we denote by $|0_o\rangle$. The product B need not be graded-commutative.
(3) $\rho : H_o \times H_o \to \mathbb{C}$ is a nondegenerate bilinear form of degree zero, satisfying the properties:
(3.1) graded symmetry:
\[ \rho(\psi_1, \psi_2) = (-1)^{|\psi_1||\psi_2|}\, \rho(\psi_2, \psi_1); \tag{102} \]
(3.2) invariance:
\[ \rho(B(\psi_1, \psi_2), \psi_3) = \rho(\psi_1, B(\psi_2, \psi_3)); \tag{103} \]
(3.3) graded cyclicity:
\[ \rho(\psi_1, B(\psi_2, \psi_3)) = (-1)^{|\psi_1|(|\psi_2|+|\psi_3|)}\, \rho(\psi_2, B(\psi_3, \psi_1)) = (-1)^{|\psi_3|(|\psi_1|+|\psi_2|)}\, \rho(\psi_3, B(\psi_1, \psi_2)). \tag{104} \]
Definition 5.3. A boundary extension of a bulk algebra $(H, C, \eta)$ is a boundary algebra $(H_o, B, \rho)$ together with a degree zero linear map $e : H \to H_o$, having the properties:
(1) e endows $(H_o, B)$ with the structure of a graded unital algebra over the bulk ring (H, C), i.e., we have:
(1.1) $e|0\rangle = |0_o\rangle$.
(1.2) $B(e(\phi), \psi) = (-1)^{|\phi||\psi|}\, B(\psi, e(\phi))$.
(1.3) $B(e(\phi_1), e(\phi_2)) = e(C(\phi_1, \phi_2))$.
(2) The topological Cardy constraint
\[ \eta(f(\psi_1), f(\psi_2)) = \mathrm{tr}\left( (-1)^F\, \Phi(\psi_1, \psi_2) \right) \tag{105} \]
is satisfied. Here f is the adjoint of e with respect to the metrics $\eta$, $\rho$:
\[ \eta(f(\psi), \phi) = \rho(\psi, e(\phi)), \tag{106} \]
while $\Phi(\psi_1, \psi_2)$ is the endomorphism of $H_o$ defined through:
\[ \psi \mapsto B(B(\psi_1, \psi), \psi_2). \tag{107} \]
The module product $H \times H_o \to H_o$ is defined in the standard fashion, namely $\phi\psi := B(e(\phi), \psi)$. Note that complex multiplication in the boundary algebra is compatible with the extension map e (since the latter is linear). That is, the module structure of $(H_o, B)$ over (H, C) determined by e takes complex multiplication in H into complex multiplication in $H_o$:
\[ \lambda\psi = B(e(\lambda|0\rangle), \psi). \tag{108} \]
Boundary extensions correspond to open-closed topological field theories having a single boundary sector (i.e., when the label set has only one element). In Section 4, we expressed the sewing constraints in terms of the total boundary state space. In the language of the present section, what we did was to show that the axioms of open-closed topological field theory amount to the condition that the total boundary state space is an extension of
Fig. 17. Graphical representation of the decomposition conditions for the boundary metric and triple
correlator. This description is related to the category-theoretic interpretation of Section 5.2.
the bulk algebra. To formulate this statement (and its converse) precisely, we need a few
more mathematical definitions.
Definition 5.4. A reduction of a boundary extension $(H_o, B, \rho, e)$ is a (finite) direct sum decomposition $H_o = \oplus_{a,b} H_{ab}$ into (nonzero) vector spaces, with the properties:
(0) The decomposition is compatible with the grading, i.e., $H_{ab} = (H_{ab} \cap H_o^0) \oplus (H_{ab} \cap H_o^1)$. 20
(1) $\rho$ is zero except when restricted to subspaces of the form $H_{ab} \times H_{ba}$.
(2) The product B is zero except when restricted to subspaces of the form $H_{ab} \times H_{bc}$, and takes such a space into a subspace of $H_{ac}$.
(3) The map e is diagonal, i.e., its image e(H) is a subspace of $\oplus_a H_{aa}$.
Using condition (1), the requirement (2) can be reformulated as the statement that the triple correlator $\langle \psi_1, \psi_2, \psi_3 \rangle = \rho(\psi_1, B(\psi_2, \psi_3))$ equals zero except when restricted to subspaces of the form $H_{ab} \times H_{bc} \times H_{ca}$. That is, the boundary topological metric and triple correlator must be polygonal in the sense described in Fig. 17. An extension of the bulk algebra will be called reducible if it admits a reduction and irreducible otherwise. A reduction will be called trivial if the label set has only one element (in this case, the reduction does nothing).
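The polygonal conditions (1) and (2) can be made concrete in a toy model, chosen here purely for illustration: a total boundary space of 3x3 matrices, decomposed into one-dimensional sectors spanned by the elementary matrices $E_{ab}$, with the trace form as metric. The names below are hypothetical, and the sketch checks only conditions (1) and (2), not the full set of axioms.

```python
from itertools import product

LABELS = range(3)  # hypothetical label set with three boundary sectors

def unit(a, b):
    """Elementary matrix E_ab spanning the toy sector H_ab."""
    return [[1 if (r, c) == (a, b) else 0 for c in LABELS] for r in LABELS]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in LABELS) for j in LABELS]
            for i in LABELS]

def trace(m):
    return sum(m[i][i] for i in LABELS)

def support(m):
    """Set of sectors (a, b) in which the matrix has a nonzero component."""
    return {(i, j) for i in LABELS for j in LABELS if m[i][j] != 0}

def product_is_polygonal():
    """B vanishes unless applied to H_ab x H_bc, and then lands in H_ac."""
    for a, b, c, d in product(LABELS, repeat=4):
        expected = {(a, d)} if b == c else set()
        if support(matmul(unit(a, b), unit(c, d))) != expected:
            return False
    return True

def metric_is_polygonal():
    """The trace form is nonzero exactly on pairs in H_ab x H_ba."""
    for a, b, c, d in product(LABELS, repeat=4):
        value = trace(matmul(unit(a, b), unit(c, d)))
        if (value != 0) != (b == c and d == a):
            return False
    return True

print(product_is_polygonal(), metric_is_polygonal())
```

Condition (3) also holds in this toy model if e sends the bulk unit to the identity matrix, which has components only in the diagonal sectors.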
The result of the sewing constraint analysis can now be formulated as follows:
Proposition 5.1. An open-closed extension of a closed topological field theory is equivalent to an extension $(H_o, B, \rho, e)$ of the bulk algebra $(H, C, \eta)$, together with a (possibly trivial) reduction $H_o = \oplus_{a,b} H_{ab}$.
The fact that an open-closed topological field theory defines such an extension and reduction was shown in Section 4. The converse (that an extension, together with a reduction, suffices to recover the data of Section 4.8) is rather obvious and I will not give all formal details here. The only slightly subtle point is that the map $f : H_o \to H$ defined by the adjointness relation (106) has only diagonal components, i.e., is of the form $f = \sum_a f(a)$, with $f(a) := f|_{H_{aa}}$. This follows upon writing $f = \sum_{ab} f(ab)$, with
20 This constraint is nontrivial, since it requires that any element $|\psi\rangle \in H_{ab}$ be the sum of an even and an odd element of $H_o$, both of which belong to $H_{ab}$.
$f(ab) = f|_{H_{ab}}$ and noticing that (106) implies $\rho(a)(e(a)(\phi), \psi_{aa}) = \eta(\phi, f(a)(\psi_{aa}))$ when applied to the diagonal components $\psi_{aa}$ of $\psi$. Hence (106) can be simplified to the form:
\[ \eta\Big( \phi, \sum_{a \neq b} f(ab)(\psi_{ab}) \Big) = 0, \tag{109} \]
which must hold for all $\phi$. Since $\eta$ is non-degenerate, this requires $\sum_{a \neq b} f(ab)(\psi_{ab}) = 0$, so that $f(\psi) = \sum_a f(aa)(\psi_{aa})$. Since $\psi$ is arbitrary, it follows that the map f defined by the duality condition (106) is diagonal.
This result reveals a certain ambiguity in the construction of boundary sectors. In fact,
one can have two realizations of the structure in Section 4.8, such that they both define the
same total boundary extension (Ho , B, , e). Such realizations are indistinguishable from
the physical point of view, and thus they should be identified.
Consider the implications of this observation for the meaning of topological D-branes. The relevant problem is to give a precise formulation of what it means to determine all admissible boundary sectors a. Since any such sector a defines an extension $(H_{aa}, B(aaa), \rho(aa), e(a))$ of the bulk algebra, it is clear that each admissible label a in
Section 2 must correspond to an object of the type described in Definition 5.3. However,
such an extension need not be irreducible, and a reduction leads to a refinement of the
admissible set of boundary labels. It is clear that the meaningful procedure is to look for
all irreducible boundary extensions. This allows us to give the following
Definition 5.5. A topological D-brane (or abstract boundary condition) is an irreducible
boundary extension of the bulk algebra $(H, C, \eta)$.
We can now give a precise formulation of the boundary extension problem for a given
bulk topological field theory. Let S be the set of (isomorphism classes of) irreducible
boundary extensions of a given bulk algebra (this set may be infinite). Assuming S is
known, a boundary extension of the bulk theory is constructed as follows. First, pick a (finite) subset $S_0$ of S, enumerated by a set of labels, and let $(H_o^{(a)}, B^{(a)}, \rho^{(a)}, e^{(a)}) \in S_0$ be the boundary extension associated to the label a. Then boundary extensions of the given bulk theory, determined by this set of topological D-branes, correspond to structures of the type described in Section 4.8, based on this set of labels and having the property that $(H_{aa}, B(aaa), \rho(aa), e(a)) = (H_o^{(a)}, B^{(a)}, \rho^{(a)}, e^{(a)})$ for all labels a. This is a well-defined
mathematical problem, lying at the intersection between algebra and category theory (see
the next subsection). To my knowledge, this program has not been carried out even for a
single nontrivial example.
5.1. Category-theoretic interpretation
The presence of various boundary sectors gives a category-theoretic flavor to the boundary data of Section 4.8. Since this interpretation is quite obvious and not particularly deep, I will keep the following remarks short. The categorical formulation arises by viewing the labels a as objects and the open string states in $H_{ba}$ as morphisms from a to b: $\mathrm{Hom}(a, b) = H_{ba}$. Then one defines composition of morphisms $\mathrm{Hom}(a, b) \times \mathrm{Hom}(b, c) \to \mathrm{Hom}(a, c)$ to be given by the boundary product B(cba). This gives a category due to associativity of the boundary product B. Note that we have $\mathbb{Z}_2$-gradings on the spaces $H_{ba}$ and the composition B is a degree zero map. Thus, we have a $\mathbb{Z}_2$-graded category. The boundary topological metric gives non-degenerate bilinear pairings $\rho(ba)$ between Hom(a, b) and Hom(b, a) which obey the cyclicity constraint (39). Restricting to degree zero morphisms $H_{ba}^0$ gives a Grassmann-even sub-category. These are the fundamental objects behind the categories of D-branes recently considered in the literature [7] from a spacetime perspective. For the reader familiar with [7] (see footnote 21), we mention that morphism compositions in our approach arise naturally from the physical boundary product, or equivalently from triple boundary correlators. This gives the physical origin of morphism compositions, which are not restricted to the even subspaces $H_{ba}^0$. The application of our framework to open-closed topological sigma models is discussed in [34].
If we view the boundary data in this fashion, then the bulk-boundary product acts as an
exterior multiplication on the morphism spaces Hom(a, b) and endows them with a module
structure over the Frobenius algebra (H, C). The topological Cardy condition (75) gives a
constraint between this exterior multiplication and the boundary product B.
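As a purely illustrative aside, the categorical reading above can be mimicked in a toy model. The sketch below is not taken from the text: it assumes that each label carries a finite dimension, that Hom(a, b) is the space of matrices of the corresponding shape, and that the boundary product is ordinary matrix multiplication. The names `compose` and `identity` are hypothetical, and gradings and the topological metric are ignored.

```python
# Toy model (an assumption for illustration only): Hom(a, b) is the space of
# n_b x n_a integer matrices and the boundary product is matrix multiplication.

def compose(g, f):
    """Compose f in Hom(a, b) with g in Hom(b, c), giving an element of Hom(a, c)."""
    inner = len(f)
    if len(g[0]) != inner:
        raise ValueError("target of f must equal source of g")
    return [[sum(g[i][k] * f[k][j] for k in range(inner))
             for j in range(len(f[0]))]
            for i in range(len(g))]

def identity(n):
    """Unit morphism on a label of dimension n."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

# Three labels a, b, c with dimensions 1, 2, 3.
f = [[1], [2]]                 # element of Hom(a, b)
g = [[1, 0], [0, 1], [1, 1]]   # element of Hom(b, c)
h = [[1, 2, 3]]                # element of Hom(c, a)

# Associativity of the product is what makes the labels and morphisms a category.
left = compose(h, compose(g, f))
right = compose(compose(h, g), f)
```

In this toy model associativity holds automatically; in the framework of the text it is the associativity of the boundary product that plays this role.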
6. Conclusions and directions for further research

We gave a systematic derivation of the on-shell structure of two-dimensional
topological field theories on oriented open–closed Riemann surfaces. The analysis of
sewing constraints allowed us to encode all information contained in such a theory in
the well-defined algebraic structure summarized in Section 4.8. This mathematical object
can be used as a definition of open–closed on-shell topological field theories, and can
be taken as the starting point in the study of boundary extensions, as well as of the
on-shell deformation problem. We drew attention to the central role of the bulk-boundary
and boundary-bulk maps and gave a detailed analysis of the topological version (67) of the
(generalized) Cardy constraint.
21 The authors of [7] consider the Calabi–Yau B-model [3,4] in the presence of D-branes, some of which can
be described as holomorphic bundles over the target space. For such D-branes, they consider the category of
holomorphic bundles and bundle morphisms, and then proceed to restrict to the BPS saturated case, for which
they propose a modified stability condition inspired by mirror symmetry arguments. The physical origin of this
category-theoretic structure is not discussed in [7]. As we show in [34], the category of holomorphic bundles
arises by considering the realization of our structure for the open–closed B-model. For those topological
D-branes which can be described through holomorphic bundles $V_a$, the (physically relevant) morphism spaces are
$H_{ba} = \mathrm{Ext}^*(V_a, V_b) = \bigoplus_j \mathrm{Ext}^j(V_a, V_b)$, as proposed by Kontsevich in [15] from mathematical considerations.
The boundary product $B$ is the Yoneda product. The grading on the total extension spaces corresponds to a
$\mathbb{Z}$-grading induced by the global $U(1)$ worldsheet symmetry of the model. The category of holomorphic bundles
considered in [7] arises from this by restricting to degree zero morphisms (with respect to this grading), i.e., to the
spaces $\mathrm{Ext}^0(V_a, V_b) = \mathrm{Hom}(V_a, V_b)$, in which case the boundary product reduces to composition of morphisms.
Consideration of an arbitrary number of open string sectors led us to the problem of
decompositions of boundary extensions, a mathematical formulation of which was given
in Section 5.1. We also extracted a category structure from the boundary products and
proposed a precise definition of topological D-branes as irreducible boundary extensions.
It is natural to ask about the off-shell counterpart of this framework in open–closed
cohomological field theories. A correct treatment of this problem requires consideration of
topological open–closed string field theory along the lines of [13], generalized to the case
of an arbitrary number of boundary sectors. In fact, open–closed string field theory holds
the key to a physical understanding of recent mathematical work on homological mirror
symmetry, as we explain in more detail elsewhere. The tree-level off-shell structure
was recently discussed in [25], under restrictive conditions on the boundary data (see
footnote 22).
Understanding homological mirror symmetry [15] from a string field theoretic point of
view requires certain generalizations of the topological open–closed string theories of [2].
In particular, one has to consider such systems in the presence of an arbitrary number of
D-branes, which leads to boundary-condition-changing sectors. This problem is discussed
elsewhere [34].
Acknowledgements
The author thanks Jae-Suk Park for many stimulating conversations and Martin Roček
for his sustained interest in his work. He also wishes to thank Sorin Popescu for some
helpful mathematical observations.
References
[1] H. Ooguri, Y. Oz, Z. Yin, D-branes on Calabi–Yau spaces and their mirrors, Nucl. Phys. B 477
(1996) 407–430.
[2] E. Witten, Chern–Simons gauge theory as a string theory, hep-th/9207094.
[3] E. Witten, Topological sigma models, Commun. Math. Phys. 118 (1988) 411.
[4] E. Witten, Mirror manifolds and topological field theory, hep-th/9112056.
[5] I. Brunner, M.R. Douglas, A. Lawrence, C. Römelsberger, D-branes on the Quintic, hep-th/
9906200.
[6] D.-E. Diaconescu, M.R. Douglas, D-branes on stringy Calabi–Yau manifolds, hep-th/0006224.
[7] M.R. Douglas, B. Fiol, C. Römelsberger, Stability and BPS branes, hep-th/0002037;
Y. Oz, T. Pantev, D. Waldram, Brane–antibrane systems on Calabi–Yau spaces, hep-th/0009112.
22 The authors of [25] give a partial off-shell description of the tree-level algebraic structure in a boundary
sector containing a single D-brane. In fact, they mainly restrict to the case where the boundary product is
associative off-shell, since they are interested in giving a physical interpretation to recent results of Kontsevich and
Soibelman which gave a proof of Deligne's conjecture [26]. Since they work at open string tree level, they do not
consider the (off-shell) version of the topological Cardy constraint, which appears from a bulk-boundary sewing
condition on the cylinder (however, this condition does constrain tree level data). The work of [25] focuses on
the boundary deformation problem, which they relate to the mathematical results just cited. The sewing
bulk-boundary constraints considered in [25] can be viewed as a particular case of the more general string vertex
equations of [13], as they apply to their situation.
[8] A. Recknagel, V. Schomerus, Moduli spaces of D-branes in CFT-backgrounds, hep-th/9903139;
A. Recknagel, V. Schomerus, Boundary deformation theory and moduli spaces of D-branes,
Nucl. Phys. B 545 (1999) 233–282, hep-th/9811237;
A. Recknagel, V. Schomerus, D-branes in Gepner models, Nucl. Phys. B 531 (1998) 185–225,
hep-th/9712186.
[9] S. Kachru, S. Katz, A. Lawrence, J. McGreevy, Open string instantons and superpotentials,
Phys. Rev. D 62 (2000) 026001;
S. Kachru, S. Katz, A. Lawrence, J. McGreevy, Mirror symmetry for open strings, hep-th/
0006047.
[10] C. Vafa, Extending mirror symmetry to Calabi–Yau with bundles, hep-th/9804131.
[11] S. Govindarajan, T. Jayaraman, On the Landau–Ginzburg description of boundary CFTs and
special Lagrangian submanifolds, JHEP 0007 (2000) 016;
S. Govindarajan, T. Jayaraman (IMSc), T. Sarkar, On D-branes from gauged linear sigma
models, hep-th/0007075;
S. Govindarajan, T. Jayaraman (IMSc), T. Sarkar, Worldsheet approaches to D-branes on
supersymmetric cycles, Nucl. Phys. B 580 (2000) 519–547.
[12] K. Hori, A. Iqbal, C. Vafa, D-Branes and mirror symmetry, hep-th/0005247.
[13] B. Zwiebach, Oriented open–closed string theory revisited, Ann. Phys. 267 (1998) 193, hep-th/
9705241.
[14] M. Gaberdiel, B. Zwiebach, Tensor constructions of open string theories I: Foundations, Nucl.
Phys. B 505 (1997) 569, hep-th/9705038.
[15] M. Kontsevich, Homological algebra of mirror symmetry, in: Proc. of Int. Congress of
Mathematicians, Zurich, 1994, Birkhäuser, 1994, pp. 120–139, alg-geom/9411018.
[16] B. Dubrovin, Geometry of 2d Topological Field Theories, LNM, Vol. 1620, Springer, 1996,
hep-th/9407018.
[17] A.A. Voronov, Topological field theories, string backgrounds and homotopy algebras, hep-th/
9401023, Adv. Appl. Clifford Algebras 4 (1994) 167–178.
[18] R. Dijkgraaf, H. Verlinde, E. Verlinde, Notes on topological string theory and two-dimensional
quantum gravity, Trieste Spring School, 1990:0091-156.
[19] G.B. Segal, The definition of conformal field theory, in: Proc., Differential Geometric Methods
in Theoretical Physics, Como, 1987, pp. 167–171;
G.B. Segal, Two-dimensional conformal field theory and modular functions, in: Swansea
Proceedings, Mathematical Physics, 1988, pp. 22–37.
[20] R. Dijkgraaf, H. Verlinde, E. Verlinde, Topological strings in d < 1, Nucl. Phys. B 352 (1991)
59.
[21] R. Dijkgraaf, Topological field theory and two-dimensional quantum gravity, in: Proceedings,
Two-dimensional quantum gravity and random surfaces, Jerusalem 1990/1991, 191–238.
[22] R. Dijkgraaf, Les Houches Lectures on Fields, Strings and Duality, hep-th/9703136, Lectures
given at NATO Advanced Study Institute: Les Houches Summer School on Theoretical Physics,
Session 64: Quantum Symmetries, Les Houches, France, 1 August – 8 September 1995.
[23] J.L. Cardy, Conformal invariance in critical systems with boundaries, in: Proc., Infinite Lie
Algebras and Conformal Invariance In Condensed Matter and Particle Physics, Bonn, 1986,
pp. 81–92;
J.L. Cardy, Boundary conditions, fusion rules and the Verlinde formula, Nucl. Phys. B 324
(1989) 581;
J.L. Cardy, Boundary conditions in conformal field theory, in: Integrable systems in quantum
field theory and statistical mechanics, (Eds.) M. Jimbo et al., 127–148;
J.L. Cardy, D.C. Lewellen, Bulk and boundary operators in conformal field theory, Phys. Lett.
B 259 (1991) 274–278.
[24] A. Strominger, S.-T. Yau, E. Zaslow, Mirror Symmetry is T-Duality, Nucl. Phys. B 479 (1996)
243–259.
[25] C. Hofman, W.K. Ma, Deformations of topological open strings, hep-th/0006120.
[26] M. Kontsevich, Y. Soibelman, Deformations of algebras over operads and Deligne's conjecture,
math.QA/0001151.
[27] D.C. Lewellen, Sewing constraints for conformal field theories on surfaces with boundaries,
Nucl. Phys. B 392 (1993) 137–161.
[28] M. Atiyah, Topological quantum field theories, Inst. Hautes Etudes Sci. Publ. Math. 68 (1988)
175–186;
M. Atiyah, Quantum field theory and low-dimensional geometry, Progr. Theor. Phys. Suppl. 102
(1991) 1–13;
See also, M. Atiyah, An introduction to topological quantum field theories, Turkish
J. Math. 21 (1) (1997) 1–7.
[29] F. Quinn, Lectures on axiomatic topological quantum field theory, in: D.S. Freed, K. Uhlenbeck
(Eds.), Geometry and Quantum Field Theory, IAS/Park City Mathematics Series, Vol. 1, AMS,
1995.
[30] E. Witten, On the structure of the topological phase of two-dimensional gravity, Nucl. Phys.
B 340 (1990) 281.
[31] J. Fuchs, C. Schweigert, Branes: from free fields to general backgrounds, hep-th/9712257.
[32] E. Witten, Noncommutative geometry and string field theory, Nucl. Phys. B 268 (1986) 253.
[33] R.E. Behrend, P.A. Pearce, V.B. Petkova, J.-B. Zuber, Boundary conditions in rational
conformal field theories, Nucl. Phys. B 570 (2000) 525–589;
R.E. Behrend, P.A. Pearce, V.B. Petkova, J.-B. Zuber, Nucl. Phys. B 579 (2000) 707–773.
[34] C.I. Lazaroiu, in preparation.
[35] A. Recknagel, V. Schomerus, Moduli spaces of D-branes in CFT-backgrounds, hep-th/9903139;
A. Recknagel, V. Schomerus, Boundary deformation theory and moduli spaces of D-branes,
Nucl. Phys. B 545 (1999) 233–282, hep-th/9811237;
A. Recknagel, V. Schomerus, D-branes in Gepner models, Nucl. Phys. B 531 (1998) 185–225,
hep-th/9712186;
N. Ishibashi, The boundary and crosscap states in conformal field theories, Mod. Phys. Lett.
A 4 (1989) 251;
N. Ishibashi, T. Onogi, Conformal field theories on surfaces with boundaries and crosscaps,
Mod. Phys. Lett. A 4 (1989) 161.

Embedding variables in the canonical theory of gravitating shells
Petr Hájíček a , Claus Kiefer b
a Institut für Theoretische Physik, Universität Bern, Sidlerstrasse 5, CH-3012 Bern, Switzerland
b Fakultät für Physik, Universität Freiburg, Hermann-Herder-Strasse 3, 79104 Freiburg, Germany
Received 3 July 2000; accepted 20 March 2001
Abstract
A thin shell of light-like dust with its own gravitational field is studied in the special case of
spherical symmetry. The action functional for this system due to Louko, Whiting, and Friedman is
reduced to Kuchař form: the new variables are embeddings, their conjugate momenta, and Dirac
observables. The concepts of background manifold and covariant gauge fixing, that underlie these
variables, are reformulated in a way that implies the uniqueness and gauge invariance of the
background manifold. The reduced dynamics describes motion on this background manifold. © 2001
Elsevier Science B.V. All rights reserved.
PACS: 04.60
Keywords: Constrained systems; Thin shells
1. Introduction
The phenomenon of gravitational collapse leads to serious problems in the classical
theory of gravity. The structure of the resulting singularities contradicts the foundations of
the theory such as the equivalence principle. Thus, the existence of singularity theorems
[1] may constitute a strong motivation to address the quantisation of the gravitational field,
in the hope that such a framework can avoid the occurrence of singularities.
A prominent feature of the classical collapse is the existence of horizons which appear
sooner than the singularity. Such horizons not only imply that the singularity is inevitable
(which is, roughly, the content of the singularity theorems), they also seem to prevent any
object or information from leaving the region of collapse and from coming back to the
asymptotic region. It is the existence of horizons that makes gravity so different and the
problem of collapse so difficult. On the other hand, the problem of gravitational collapse
E-mail address: hajicek@itp.unibe.ch (P. Hájíček).
PII: S0550-3213(01)00141-9
P. Hájíček, C. Kiefer / Nuclear Physics B 603 [PM] (2001) 531–554
is a very special one. For its solution, a complete quantum theory of gravitation may be as
little needed as the complete quantum electrodynamics was needed for the first calculations
of atomic spectra.
Motivated by these ideas, we consider the quantum theory of a spherically symmetric
thin shell and its gravitational field. This is, in fact, a quite popular system. For example,
it was used to study the motion of domain walls in the early Universe [2], of black-hole
evaporation [3], of quantum black holes [4], and many others. In [5,6] and [7], gravitational
collapse of such a thin shell in its own gravitational field has been studied. The result was
analogous to what is known about the s-mode of the Coulomb problem. Two aspects of
this result were surprising. First, for low-mass shells, there were stationary states with
Sommerfeld spectrum and scattering states with the wave packets describing the shell
bouncing and re-expanding. The evolution was unitary. Second, there was an analogue
to the critical charge in the relativistic quantum mechanics of atoms. The role of charge
was played by the rest mass of the shell (not to be confused with its total energy) and the
critical value of the rest mass was about one Planck mass. As in the case of relativistic
atoms, the quantum-mechanical description breaks down for supercritical charges.
To understand these results was difficult. Even the simple scattering of the sub-critical
shells admitted several different interpretations. There were two problems. First, the radial
coordinate of the shell, which served as the argument of the wave function, did not
possess the status of a quantum observable. The Coulomb-like potential prevented one
from constructing a position operator similar to the Newton–Wigner operator. Second, the
model was completely reduced to the physical degrees of freedom, which in this case is
just the radius of the shell. However, the value of the radius is not as informative in a
black-hole spacetime as it is, for example, in Minkowski spacetime: points with the same
value of radial coordinate can lie in different asymptotically flat regions. These regions are
separated by horizons. The tools that were at one's disposal in [5,6] and [7] did not allow one to
decide whether the shell created a horizon and then, consequently, re-expanded behind this
horizon into a different asymptotically flat section, or whether it did not create any horizon
and re-expanded into the same region from which it collapsed.
In [8], two remedies have been proposed. The first is to work with a null (light-like) shell.
The classical dynamics of such a shell is equivalent to that of free photons on flat
two-dimensional spacetime (the charge is zero). For such a system, there is a well-defined
position operator [9]. Moreover, it admits a simple description of its asymptotic states,
unlike Coulomb scattering.
The second idea is that the equation of motion for r(t), which has been obtained from
Einstein's equations, must in fact result from a reduction to true degrees of freedom of an
action that contains the shell as well as the gravitational field. Indeed, in the
spherically-symmetric case, the gravitational degrees of freedom consist only of the gauge and the
dependent ones. It seems that the reduction has been performed in such a way that the
information about the geometry of spacetime has been lost. We shall, therefore, perform
the reduction explicitly in a careful way. This is the main purpose of this paper.
In general, the reduction procedure consists of two steps: the choice of gauge and
the solution of constraints. There exists a particular form of the gravitational action that
is effectively reduced, but which still contains some information about the geometry
of spacetime: the so-called Kuchař decomposition [10,11]. Kuchař variables are neatly
separated into pure gauge ones (so-called embeddings), dependent ones that are conjugate
to the embeddings, and physical degrees of freedom. Some progress in understanding
Kuchař decomposition has been achieved recently [12,13]: general existence of the
decomposition has been shown, and the crucial role of gauge choice in it has been
recognised. The nature of gauge choice in quantum gravity has also been elucidated. Two
important notions have been introduced: background manifold and covariant gauge fixing.
As a matter of fact, the present paper is the first practical application of these concepts.
It deals with the classical canonical analysis of the null-dust shell. Its main result is the
explicit construction of the Kuchař decomposition. This then serves as the starting point
for the quantum theory of the shell, which will be presented in a separate paper.
The plan of the paper is as follows. Section 2 explains the notions of background
manifold and covariant gauge fixing in a new and clear form. This enables us to show
the gauge invariance, the uniqueness and some additional structures of the background
manifold, which will be necessary for the interpretation of the shell quantum mechanics.
Some important points of [13] are then summarised in the new language. Section 3
describes the solutions of Einstein's equations containing the shell. After fixing the
gauge, these solutions are reformulated as a set of parameter-dependent metric fields and
shell trajectories on the background manifold. The parameters distinguish the physically
different solutions and will play the role of the physical variables in the Kuchař
decomposition. The representative metric fields and shell trajectories will be used to define
the transformation from the ADM to Kuchař variables on the constraint hyper-surface. This
transformation is performed in Section 4. The starting point is a (non-reduced) Hamiltonian
action principle [14] for the spherically-symmetric shell and its gravitational field. In
Section 4, an extension of the results to a whole neighbourhood of the constraint
hyper-surface is performed using the methods and theorems of [13]. The final action, the variables
in it and some discussion can be found in Section 5.
2. Background manifold and covariant gauge fixing

In this section we introduce the two basic notions of background manifold and covariant
gauge fixing, restricting ourselves, for the sake of simplicity, to vacuum general relativity.
The language is slightly different from that used in [13] so that we can prove more results;
we also summarise some points that are important for the paper.
Let $M$ be a four-manifold that admits a Lorentzian metric field $g$ such that $(M, g)$ is
a globally hyperbolic spacetime. Dynamically maximal solutions of Einstein's equations
are always of this form [15]. Then according to a theorem of Geroch [16], $M = \Sigma \times \mathbb{R}$,
where $\Sigma$ is an initial data manifold. Topological sectors of general relativity are uniquely
associated with different three-manifolds $\Sigma$. $M$ is called the background manifold. In this
way, each topological sector determines a unique background manifold $M$.
All diffeomorphisms of M form the group Diff M; this is considered as the gauge
group of general relativity. Observe that the group depends on the topological sector
chosen, i.e., on M, and that the manifold structure of M itself is gauge invariant. Single
points of M are, however, not gauge invariant, since they are being pushed around by
Diff M. In some important cases, not Diff M but some of its subgroups play the role
of gauge group. For example, in the case of asymptotically flat spacetimes, only those
diffeomorphisms are considered as gauge transformations that become trivial sufficiently
quickly at infinity. In general, in such cases, M is equipped with some gauge-invariant
structure in addition to the naked manifold one.
Let $\varphi \in \mathrm{Diff}\,M$ and let $g$ be a Lorentzian metric on $M$. Then the inverse pull-back
associated with $\varphi$ maps $g$ into another metric $g'$, $g' = (\varphi^{-1})^* g$. In this way, the group
Diff M acts on the space Riem M of all (suitably restricted) Lorentzian metrics on M.
The action is not transitive and so there are non-trivial orbits of the group in Riem M.
Such orbits are called geometries and the quotient space Riem M/Diff M is the space of
geometries. Let $\pi : \mathrm{Riem}\,M \to \mathrm{Riem}\,M/\mathrm{Diff}\,M$ be the natural projection for the quotient.
One can equip the space of geometries with some additional structure, e.g., a topology,
starting from a structure of Riem M and using the projection.
Suppose that we manage, at least for some open set $U \subset \mathrm{Riem}\,M/\mathrm{Diff}\,M$, to specify a
section $\sigma$. This is a map,
\[ \sigma : \mathrm{Riem}\,M/\mathrm{Diff}\,M \to \mathrm{Riem}\,M \]
such that $\pi \circ \sigma = \mathrm{id}$. The meaning of such a section is that a particular representative metric
on $M$ is chosen for each geometry in $U$. This is exactly what has been called covariant
gauge fixing in [13]. Clearly, the transformation between two covariant gauge fixings is not
a single diffeomorphism, but an element of the Bergmann–Komar group [17].
Given a covariant gauge fixing $\sigma$ on $U$, one can use it to construct a map from
$\mathrm{Riem}\,M/\mathrm{Diff}\,M \times \mathrm{Emb}(\Sigma, M)$, where $\mathrm{Emb}(\Sigma, M)$ is the space of embeddings of
the initial data surface $\Sigma$ into $M$, to the ADM phase space of general relativity. The
construction has been described in [13] and it goes, roughly, as follows. Let $\gamma \in U$ and
let $\sigma(\gamma)$ be the representative metric on $M$. Let $X : \Sigma \to M$ be an embedding. Then
$\sigma(\gamma)$ determines the first, $q_{kl}(x)$, and the second, $K_{kl}(x)$, fundamental forms of the surface
$X(\Sigma)$ in the spacetime $(M, \sigma(\gamma))$. The corresponding point $(q_{kl}(x), \pi^{kl}(x))$ of the ADM
phase space can be obtained in the well known way from $q_{kl}(x)$ and $K_{kl}(x)$.
In [13], this transformation has been restricted to give only the points of the constraint
surface $\Gamma$ of the ADM phase space; moreover, only those points of $\Gamma$ have been
selected, where the evolved spacetimes do not admit any isometry. Then the map from
$\mathrm{Riem}\,M/\mathrm{Diff}\,M \times \mathrm{Emb}(\Sigma, M)$ to $\Gamma$ has been shown to be invertible and extensible to a
neighbourhood of $\mathrm{Riem}\,M/\mathrm{Diff}\,M \times \mathrm{Emb}(\Sigma, M)$ in the larger space
$\mathrm{Riem}\,M/\mathrm{Diff}\,M \times T^*\mathrm{Emb}(\Sigma, M)$, which has then been mapped to a neighbourhood of $\Gamma$ in the ADM
phase space. Next, the Darboux–Weinstein theorem has been employed to prove some
nice symplectic properties of the map. These properties then make the map a general
transformation of the ADM to the Kuchař variables.
This procedure will here be applied to the model of a spherically-symmetric thin
gravitating shell in the subsequent sections. We shall find in the next section the set of
representative solutions for Einstein's equations for each physically distinct situation of
the shell because, as has been shown in [13], this part of the section suffices completely
to construct the above map to the constraint surface of our model.
3. Einstein dynamics of the shell

Any spherically-symmetric solution of Einstein's equations with a thin null shell as the
source has a simple structure. Inside the shell, the spacetime is flat; outside the shell, it is
isometric to a part of the Schwarzschild spacetime of mass $M$. The two geometries must
be stuck together along a spherically-symmetric null hyper-surface so that the points with
the same values of the radial coordinate $R$ coincide.
All physically distinct solutions can be labeled by three parameters: $\eta \in \{-1, +1\}$,
distinguishing between the outgoing ($\eta = +1$) and in-going ($\eta = -1$) null surfaces; the
asymptotic time of the surface, i.e., the retarded time $u = T - R \in (-\infty, \infty)$ for $\eta = +1$,
and the advanced time $v = T + R \in (-\infty, \infty)$ for $\eta = -1$; and the mass $M \in (0, \infty)$. An
in-going shell creates a black-hole (event) horizon at $R = 2M$ and ends up in the singularity
at $R = 0$. The outgoing shell starts from the singularity at $R = 0$ and emerges from a
white-hole (particle) horizon at $R = 2M$.
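We can write down the metric in the case $\eta = +1$ with the help of retarded
Eddington–Finkelstein coordinates $\tilde U$, $R$, $\vartheta$ and $\phi$. $\tilde U = u$ is the trajectory of the
shell, $\tilde U > u$ is a part of Minkowski spacetime,
\[ ds^2 = -d\tilde U^2 - 2\, d\tilde U\, dR + R^2\, d\Omega^2, \tag{1} \]
and $\tilde U < u$ is a part of Schwarzschild spacetime,
\[ ds^2 = -\left(1 - \frac{2M}{R}\right) d\tilde U^2 - 2\, d\tilde U\, dR + R^2\, d\Omega^2. \tag{2} \]
Similarly, for $\eta = -1$, the advanced Eddington–Finkelstein coordinates are $\tilde V$, $R$, $\vartheta$ and $\phi$,
and $\tilde V = v$ is the shell. Inside the shell, $\tilde V < v$,
\[ ds^2 = -d\tilde V^2 + 2\, d\tilde V\, dR + R^2\, d\Omega^2, \tag{3} \]
and outside the shell, $\tilde V > v$,
\[ ds^2 = -\left(1 - \frac{2M}{R}\right) d\tilde V^2 + 2\, d\tilde V\, dR + R^2\, d\Omega^2. \tag{4} \]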
Let us denote the spacetime given by the triple of parameters $\eta$, $M$ and $w$ by $(\eta, M, w)$,
where $w = u$ for $\eta = +1$ and $w = v$ for $\eta = -1$.
We observe that the two spacetimes $(\eta, M, w_1)$ and $(\eta, M, w_2)$ are isometric, the
isometry sending the point $(\tilde U, R, \vartheta, \phi)$ into $(\tilde U + w_2 - w_1, R, \vartheta, \phi)$. Hence, the geometries
of the solutions that differ only in the value of the parameter $w$ are equal. Yet, the
physical situations they represent are different; this is similar to the motion of a free mass
point in Minkowski spacetime. For each two different trajectories, there is a Poincaré
transformation that sends the first into the second. Still, the two motions are physically
different because they look different from one fixed inertial frame. For the shell, instead
of an inertial frame, we imagine that there is a fixed asymptotic family of observers. The
group of these isometries is a symmetry group rather than a gauge group. It can (and will)
be employed to define a time evolution.
Another interesting isometry is the map $T : (\eta, M, w_1) \to (-\eta, M, w_2)$ defined for
$\eta = +1$ and arbitrary $w_1$ and $w_2$ by the Eddington–Finkelstein coordinates as follows:
\[ T(\tilde U_1, R_1, \vartheta_1, \phi_1) = (\tilde V_2, R_2, \vartheta_2, \phi_2), \]
where
\[ \tilde V_2 = -\tilde U_1 + w_1 + w_2, \qquad R_2 = R_1, \qquad \vartheta_2 = \vartheta_1, \qquad \phi_2 = \phi_1. \]
For $\eta = -1$, we just take the inverse of the above so that $T^2 = \mathrm{id}$. $T$ can be viewed as a
time reversal symmetry.
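As a quick consistency check of this map, the following Python sketch assumes the coordinate form $\tilde V_2 = -\tilde U_1 + w_1 + w_2$ with the remaining coordinates unchanged. It is an illustration, not part of the construction, and the function names are hypothetical. It confirms that the shell $\tilde U_1 = w_1$ is sent to the shell $\tilde V_2 = w_2$ and that applying the map and then its inverse returns the original point.

```python
def T_out_to_in(point, w1, w2):
    """Map (U~, R, theta, phi) of (+1, M, w1) to (V~, R, theta, phi) of (-1, M, w2)."""
    U, R, theta, phi = point
    return (-U + w1 + w2, R, theta, phi)

def T_in_to_out(point, w1, w2):
    """Inverse map, from (-1, M, w2) back to (+1, M, w1)."""
    V, R, theta, phi = point
    return (-V + w1 + w2, R, theta, phi)

w1, w2 = 1.5, -0.25
shell_point = (w1, 3.0, 0.5, 1.0)      # a point on the outgoing shell U~ = w1
generic_point = (2.0, 3.0, 0.5, 1.0)   # a point inside the shell (U~ > w1)

image_of_shell = T_out_to_in(shell_point, w1, w2)
round_trip = T_in_to_out(T_out_to_in(generic_point, w1, w2), w1, w2)
```

The sign flip of the null coordinate is what turns the retarded form (1), (2) of the metric into the advanced form (3), (4).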
The Eddington–Finkelstein coordinates may be nicely adapted to the symmetry and
may simplify the metric, but they do not define a covariant gauge fixing. Indeed, the
identification of the points $(\tilde U_1, R_1, \vartheta_1, \phi_1)$ of the solution $(+1, M_1, w_1)$ with the points
$(\tilde V_2, R_2, \vartheta_2, \phi_2)$ of $(-1, M_2, w_2)$ satisfying the relations $\tilde V_2 = -\tilde U_1$, $R_2 = R_1$, $\vartheta_2 = \vartheta_1$ and
$\phi_2 = \phi_1$ will invert the time orientation of the asymptotic observers, which is to stay gauge
invariant. We need, however, a covariant gauge fixing if we are to transform the action to
the Kuchař form. The rest of this section will be devoted to a choice of gauge that will be
convenient for this problem.
To start with, we have to specify the background manifold. Our model comprises only
the spherically-symmetric part of general relativity with the shell. We shall, therefore,
admit only spherically-symmetric initial surfaces $\tilde\Sigma$ and only that subgroup of $\mathrm{Diff}\,\tilde M$
(where $\tilde M := \tilde\Sigma \times \mathbb{R}$), the elements of which commute with the rotations and are trivial
at infinity. Let $\rho$, $\vartheta$ and $\phi$ be coordinates on $\tilde\Sigma$ that are adapted to the symmetry and
$\rho \in [0, \infty)$, where $\rho = 0$ is the regular centre of symmetry; we assume that $\tilde\Sigma$ is smooth at
this centre. The shell is at $\rho = r$, and the infinity at $\rho = \infty$.
The coordinates $\vartheta$ and $\phi$ are ignorable coordinates; in the action, we can integrate
over them so that they disappear and the effective initial manifold $\Sigma$ is one-dimensional,
diffeomorphic to $\mathbb{R}_+$, and the effective background manifold $M$ is two-dimensional,
$\mathbb{R}_+ \times \mathbb{R}$. Our restricted gauge group induces an effective gauge group, $\mathrm{Diff}_{0,\infty} M$, on $M$;
it only contains diffeomorphisms that preserve the central boundary as well as, pointwise,
the infinity.
Let us choose coordinates $U$ and $V$ on $M$ that satisfy the following boundary conditions
at the gauge-invariant boundaries of $M$: at the regular centre inside the shell,
\[ U = V, \]
at $\mathcal{I}^-$, $U = -\infty$ and $V \in (-\infty, \infty)$, at $\mathcal{I}^+$, $V = \infty$ and $U \in (-\infty, \infty)$, and at $i^0$,
$U = -\infty$ and $V = \infty$. Otherwise, $U$ and $V$ are arbitrary.
Using these coordinates $U$ and $V$, one can define the representative metric (see
Section 2) by conditions on its components with respect to $U$ and $V$. We shall choose
them as follows.
1. $U$ and $V$ are double-null coordinates so that the representative line element takes the
form
\[ ds^2 = -A(U, V)\, dU\, dV + R^2(U, V) \left( d\vartheta^2 + \sin^2\vartheta\, d\phi^2 \right). \tag{5} \]
2. The representative metric is continuous at the shell.
3. For the outgoing shells, $U$ is the retarded time determined by the representative
metric at $V = \infty$. Analogously, for the in-going shells, $V$ is the advanced time at
$U = -\infty$.
Such a metric is uniquely defined for any physical situation given by the values of the
parameters $\eta$, $M$, and $w$. This can be shown as follows.
Consider first the case $\eta = +1$. The Eddington–Finkelstein coordinate $\tilde U$ satisfies
already the conditions for $U$, so we need only to find the function $V$. In the Minkowski
part, $U > u$, of the solution, the boundary conditions at the centre lead uniquely to:
\[ A = 1, \qquad R = \frac{V - U}{2}. \tag{6} \]
In the Schwarzschild part, $U < u$, of the solution, $V$ is an advanced null coordinate,
so it must be some function, $V = X(M, u, \tilde V)$, for each fixed $M$ and $u$, of the advanced
Eddington–Finkelstein coordinate $\tilde V$, which is defined by
\[ \tilde V := U + 2R + 4M \ln\left| \frac{R}{2M} - 1 \right|. \]
The function $X$ is uniquely determined by the boundary condition at the shell, requiring
that $V$ be continuous:
\[ X(M, u, \tilde V)\big|_{U=u} = (U + 2R)\big|_{U=u}, \]
or,
\[ X\!\left(M, u, u + 2R + 4M \ln\left| \frac{R}{2M} - 1 \right|\right) = u + 2R. \]
To solve this equation, we define
\[ x := u + 2R + 4M \ln\left| \frac{R}{2M} - 1 \right|, \]
calculate $R$ in terms of $M$, $u$, and $x$, and substitute the result into the right-hand side:
\[ X(M, u, x) = u + 2R(M, u, x). \tag{7} \]
A straightforward calculation yields
\[ R(M, u, x) = 2M \kappa\!\left( \exp\frac{x - u}{4M} \right), \tag{8} \]
where $\kappa$ is the well-known Kruskal function defined by its inverse,
\[ \kappa^{-1}(y) = (y - 1) e^{y}, \tag{9} \]
and $R > 2M$ was used. Eqs. (7) and (8) yield
\[ V = u + 4M \kappa\!\left[ \left( \frac{R}{2M} - 1 \right) \exp\left( \frac{U - u + 2R}{4M} \right) \right]. \tag{10} \]
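A similar calculation for $R < 2M$ leads to the same result. From this, it is easy to
calculate $R$, if we observe that
\[ \frac{R}{2M} + \ln\left( \frac{R}{2M} - 1 \right) = \ln \kappa^{-1}\!\left( \frac{R}{2M} \right). \]
Then Eq. (9) implies that
\[ R = 2M \kappa\!\left[ \left( \frac{V - u}{4M} - 1 \right) \exp\left( \frac{V - U}{4M} \right) \right]. \tag{11} \]
This relation defines the desired transformation from $(U, R, \vartheta, \phi)$ to $(U, V, \vartheta, \phi)$.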
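The chain (8), (10), (11) can be checked numerically. The Python sketch below is an illustration under stated assumptions rather than part of the derivation: it takes $\kappa^{-1}(y) = (y-1)e^y$ as in (9), evaluates $\kappa$ on the branch $y \ge 0$ by bisection (a choice made here for simplicity), and uses hypothetical function names. It maps a point $(U, R)$ to $V$ via (10) and recovers $R$ from $(U, V)$ via (11).

```python
import math

def kappa_inv(y):
    """Inverse Kruskal function, Eq. (9)."""
    return (y - 1.0) * math.exp(y)

def kappa(f):
    """Kruskal function on the branch y >= 0, found by bisection.

    kappa_inv increases monotonically for y >= 0 with kappa_inv(0) = -1,
    so the argument must satisfy f >= -1.
    """
    if f < -1.0:
        raise ValueError("kappa(f) requires f >= -1")
    lo, hi = 0.0, 1.0
    while kappa_inv(hi) < f:        # enlarge the bracket until it holds the root
        hi *= 2.0
    for _ in range(200):            # far more halvings than double precision needs
        mid = 0.5 * (lo + hi)
        if kappa_inv(mid) < f:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def R_of_x(M, u, x):
    """Eq. (8), valid for R > 2M."""
    return 2.0 * M * kappa(math.exp((x - u) / (4.0 * M)))

def V_of_UR(M, u, U, R):
    """Eq. (10): the double-null coordinate V at the point (U, R)."""
    z = (R / (2.0 * M) - 1.0) * math.exp((U - u + 2.0 * R) / (4.0 * M))
    return u + 4.0 * M * kappa(z)

def R_of_UV(M, u, U, V):
    """Eq. (11): the radius at the point (U, V)."""
    f_plus = ((V - u) / (4.0 * M) - 1.0) * math.exp((V - U) / (4.0 * M))
    return 2.0 * M * kappa(f_plus)
```

For instance, with $M = 1$, $u = 0$ and $U = -2$, computing `V_of_UR` for a given radius and feeding the result to `R_of_UV` should return that radius both outside and inside $R = 2M$, up to rounding.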

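As the last step, we calculate the metric for $U < u$ in the new coordinates. First, we
differentiate the function $R$. The derivative of (9) determines the derivative of $\kappa$:
\[ \kappa'(f) = \frac{1}{\kappa(f)\, e^{\kappa(f)}}, \tag{12} \]
which holds for any $f$. Then,
\[ dR = \frac{1}{2\kappa(f_+)\, e^{\kappa(f_+)}} \left[ -\left( \frac{V - u}{4M} - 1 \right) \exp\left( \frac{V - U}{4M} \right) dU + \frac{V - u}{4M} \exp\left( \frac{V - U}{4M} \right) dV \right] \]
with
\[ f_+ := \left( \frac{V - u}{4M} - 1 \right) \exp\left( \frac{V - U}{4M} \right). \tag{13} \]
Now, (11) implies that
\[ 1 - \frac{2M}{R} = \frac{\kappa(f_+) - 1}{\kappa(f_+)}, \tag{14} \]
and (9) that $f = (\kappa(f) - 1) \exp(\kappa(f))$. This leads to the relation
\[ 1 - \frac{2M}{R} = \frac{f_+}{\kappa(f_+)\, e^{\kappa(f_+)}}, \]
and thus
\[ dR = -\frac{1}{2} \left( 1 - \frac{2M}{R} \right) dU + \frac{1}{2\kappa(f_+)\, e^{\kappa(f_+)}}\, \frac{V - u}{4M} \exp\left( \frac{V - U}{4M} \right) dV. \]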
Substituting this into the metric (2) results, finally, in

R = 2M(f+ ),
A=

V U
V u
1
exp
,
4M
(f+ ) e(f+) 4M
(15)
539
where f+ is defined by (13), cf. (11). With these expressions, it is easy to verify that A
and R are continuous at the shell, as required. We note that these expressions contain
u as well as M, which become conjugate variables in the canonical formalism. This
makes the transition to the embedding variables non-trivial, and one must first look for
this transformation on the constraint surface.
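The continuity of $A$ and $R$ at the shell can be illustrated numerically. This is a minimal sketch under stated assumptions: the helper names and the parameter values are hypothetical, and the Kruskal function is obtained by a plain Newton solve of Eq. (9).

```python
import math

def kruskal(f, tol=1e-14, max_iter=200):
    """Root y > 0 of (y - 1) * exp(y) = f (Kruskal function), for f > -1."""
    if f <= -1.0:
        raise ValueError("kappa(f) is evaluated here only for f > -1")
    y = 1.0 if f < math.e else math.log(f)  # illustrative starting guess
    for _ in range(max_iter):
        step = ((y - 1.0) * math.exp(y) - f) / (y * math.exp(y))
        y -= step
        if abs(step) <= tol * max(1.0, abs(y)):
            return y
    raise RuntimeError("Newton iteration did not converge")

def metric_outside(M, u, U, V):
    """A and R of Eq. (15), valid in the Schwarzschild part U < u."""
    b = (V - u) / (4.0 * M)
    e = math.exp((V - U) / (4.0 * M))
    k = kruskal((b - 1.0) * e)          # kappa(f_+), with f_+ from Eq. (13)
    return b * e / (k * math.exp(k)), 2.0 * M * k

def metric_inside(U, V):
    """A and R of Eq. (6), valid in the Minkowski part U > u."""
    return 1.0, 0.5 * (V - U)
```

Evaluating both functions at $U = u$ for any $V > u$ should give the same pair $(A, R) = (1, (V-u)/2)$, since $\kappa(f_+) = (V-u)/4M$ there.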
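In the case of in-going shells ($\eta = -1$) a completely analogous procedure yields, for $V < v$, again (6), and for $V > v$,
$$A = \frac{1}{\kappa(f_-)\,e^{\kappa(f_-)}}\,\frac{v - U}{4M}\exp\frac{V - U}{4M}, \qquad R = 2M\,\kappa(f_-), \tag{16}$$
where
$$f_- := \left(\frac{v - U}{4M} - 1\right)\exp\frac{V - U}{4M}.$$
These expressions result from (15) by the substitution $V - u \to v - U$.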

As the result of the gauge fixing, the set of solutions $(\eta, M, w)$ can be written as a set of $(\eta, M, w)$-dependent metric fields (5) and a set of shell trajectories on a fixed background manifold $\mathcal{M}$. Here, the corresponding functions $A$ and $R$ have the form
$$A(\eta, M, w; U, V), \qquad R(\eta, M, w; U, V), \tag{17}$$
and the trajectory of the shell on the background manifold is simply $U = u$ for $\eta = +1$ and $V = v$ for $\eta = -1$.
A key property of the background manifold is that it possesses a unique asymptotic region with $\mathcal{I}^-$ defined by $U \to -\infty$ and $\mathcal{I}^+$ by $V \to +\infty$. As the shell cannot escape the background manifold, its reappearance at an asymptotic region must be interpreted as the reappearance at the asymptotic region of $\mathcal{M}$. In this way, the background manifold is a tool to solve the problem of where the shell reappears.
4. Transformation to embedding variables

4.1. Canonical formalism
The form of the canonical theory that is based on the embedding rather than ADM-type variables has been studied and advocated by Kuchař. In the recent paper [13] a large
step forward in this field has been achieved. The embedding variables have been associated
with background manifolds and gauge fixings similar to what has been done in the previous
section. The existence of this transformation has been shown in the general case.
The resulting formalism inspires hopes that some unpleasant features of the ADM
variables can be removed. First, the ADM variables lead to singular points in the physical
configuration space (super-space [18,19]) as well as at the constraint surface corresponding
to spaces or spacetimes with symmetries. Second, the symmetry of the ADM theory itself
is, on one hand, too large, containing all infinitesimal surface deformations, including also
those transformations that do not result from diffeomorphisms. On the other hand, it is too
small because only infinitesimal surface deformations and not finite group elements can
act on the whole phase space. The constraint surface that has been constructed in [13] has,
however, the form of a fibre bundle, which is a manifold (all points are regular), and the
fibre group of this bundle is the diffeomorphism group of the background manifold, so it
acts on the whole bundle.
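As a Hamiltonian action principle that implies the dynamics of our system, we take the action Eq. (2.6) of [14] (see also [3]). Let us briefly summarise the relevant formulae. The spherically symmetric metric is written in the form:
$$ds^2 = -N^2\,d\tau^2 + \Lambda^2\left(d\rho + N^\rho\,d\tau\right)^2 + R^2\,d\Omega^2,$$
and the shell is described by its radial coordinate $\rho = r$. The action reads
$$S_0 = \int d\tau\left[p\,\dot r + \int d\rho\left(P_\Lambda\dot\Lambda + P_R\dot R\right) - H_0\right], \tag{18}$$
and the Hamiltonian is
$$H_0 = \int d\rho\left(N H + N^\rho H_\rho\right) + N_\infty E_\infty,$$
where $N_\infty := \lim_{\rho\to\infty} N(\rho)$, $E_\infty$ is the ADM mass (see [14]), $N$ and $N^\rho$ are the lapse and shift functions, $H$ and $H_\rho$ are the constraints,
$$H = \frac{\Lambda P_\Lambda^2}{2R^2} - \frac{P_\Lambda P_R}{R} + \frac{R R''}{\Lambda} - \frac{R R'\Lambda'}{\Lambda^2} + \frac{R'^2}{2\Lambda} - \frac{\Lambda}{2} + \frac{\eta p}{\Lambda}\,\delta(\rho - r), \tag{19}$$
$$H_\rho = P_R R' - P_\Lambda'\Lambda - p\,\delta(\rho - r), \tag{20}$$
and the prime (dot) denotes the derivative with respect to $\rho$ ($\tau$).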

The main topic of this paper is to transform the variables in the action $S_0$. This transformation will be split into two steps. The first step is a transformation of the canonical coordinates $r$, $p$, $\Lambda$, $P_\Lambda$, $R$, and $P_R$ at the constraint surface $\Gamma$ that is defined by the constraints (19) and (20). The new coordinates are $u$ and $p_u = -M$ for $\eta = +1$, $v$ and $p_v = -M$ for $\eta = -1$, and the so-called embedding variables $U(\rho)$ and $V(\rho)$.
The second step is an extension of the functions $u$, $v$, $p_u$, $p_v$, $U(\rho)$, $P_U(\rho)$, $V(\rho)$, and $P_V(\rho)$ out of the constraint surface, where the functions $u$, $v$, $p_u$, $p_v$, $U(\rho)$, and $V(\rho)$ are defined by the above transformation, and $P_U(\rho)$, $P_V(\rho)$ by $P_U(\rho)|_\Gamma = P_V(\rho)|_\Gamma = 0$. The extension must satisfy the condition that the functions form a canonical chart in a neighbourhood of $\Gamma$. A proof that such an extension exists in general has been given in [13].
4.2. Transformation functions at the constraint surface
The constraint surface contains only points of the phase space that correspond to initial data for solutions of Einstein's equations. Hence, we can assume that the metric (5) is a spherically-symmetric solution with a shell, and so the functions $A(\eta, M, w; U, V)$ and $R(\eta, M, w; U, V)$ are those written down in the previous section, Eqs. (6), (15) and (16). According to Section 2, if such a metric is given, then, for each embedding, a unique first and second fundamental form can be calculated from it, and so the map from the embeddings to the ADM variables $q_{kl}(x)$ and $\pi^{kl}(x)$ can be constructed.
A very important point is to specify the family of embeddings that will be used throughout the paper. The embeddings are given by
$$U = U(\rho), \qquad V = V(\rho).$$
These functions have to satisfy several conditions.

1. As $\Sigma$ is spacelike, and $U$ and $V$ are null and increasing towards the future, we must have $U' < 0$ and $V' > 0$ everywhere.
2. At the regular centre, the four-metric is flat and the three-metric is to be smooth. This implies $U'(0) = -V'(0)$ in addition to the condition $U(0) = V(0)$. This follows from $T'(U(0), V(0)) = 0$ and means that $\Sigma$ must run parallel to $T = \text{const}$ in order to avoid conical singularities.
3. At infinity, the four-metric is the Schwarzschild metric. We require that the embedding approaches the Schwarzschild-time-constant surfaces $T = \text{const}$, and that $\rho$ becomes the Schwarzschild curvature coordinate $R$ asymptotically. More precisely, the behaviour of the Schwarzschild coordinates $T$ and $R$ along each embedding $U(\rho)$, $V(\rho)$ must satisfy
$$T(\rho) = T_\infty + O(\rho^{-1}), \tag{21}$$
$$R(\rho) = \rho + O(\rho^{-1}). \tag{22}$$
The asymptotic coordinate $T_\infty$ is a gauge-invariant quantity and it possesses the status of an observable.
4. At the shell ($\rho = r$) we require the functions $U(\rho)$ and $V(\rho)$ to be $C^\infty$. In fact, as the four-metric is continuous in the coordinates $U$ and $V$, but not smooth, only the $C^1$-part of this condition is gauge invariant. Jumps in all higher derivatives are gauge dependent, but the condition will simplify equations.
We suppose further that there is a whole foliation of the solution spacetimes. Any foliation can be considered as a one-parameter family of embeddings:
$$U = U(\tau, \rho), \qquad V = V(\tau, \rho),$$
the parameter being $\tau$. The metric (5) reads, in terms of the coordinates $\tau$, $\rho$, $\vartheta$ and $\varphi$:
$$ds^2 = -A\dot U\dot V\,d\tau^2 - A\left(\dot U V' + \dot V U'\right)d\tau\,d\rho - A U' V'\,d\rho^2 + R^2\,d\Omega^2.$$
From this metric, we can read off the values of the variables $\Lambda$, $R$, $N$ and $N^\rho$ immediately:
$$\Lambda = \sqrt{-A(o, U, V)\,U' V'}, \qquad R = R(o, U, V), \tag{23}$$
where $o$ symbolises the observables ($w$ and $M$, respectively), and
$$N = -\frac{\dot U V' - \dot V U'}{2U'V'}\sqrt{-A U' V'}, \qquad N^\rho = \frac{\dot U V' + \dot V U'}{2U'V'}.$$
The expression $\dot U V' - \dot V U'$ is the Jacobian of the transformation from $\tau$ and $\rho$ to $U$ and $V$, and we assume it to be positive.
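The identification of $\Lambda$, $N$ and $N^\rho$ can be cross-checked against the metric components at a single point. The sketch below is illustrative only: the function name and the sample numbers are assumptions, and the check simply compares $-N^2 + \Lambda^2 (N^\rho)^2$ with $-A\dot U\dot V$ and $2\Lambda^2 N^\rho$ with $-A(\dot U V' + \dot V U')$.

```python
import math

def adm_from_embedding(A, U_dot, V_dot, U_prime, V_prime):
    """Lambda, lapse N and shift N^rho read off from ds^2 = -A dU dV at one point."""
    if not (U_prime < 0.0 < V_prime):
        raise ValueError("a spacelike embedding needs U' < 0 and V' > 0")
    lam = math.sqrt(-A * U_prime * V_prime)
    shift = (U_dot * V_prime + V_dot * U_prime) / (2.0 * U_prime * V_prime)
    jacobian = U_dot * V_prime - V_dot * U_prime   # assumed positive
    lapse = -jacobian / (2.0 * U_prime * V_prime) * lam
    return lam, lapse, shift

# Hypothetical sample values, chosen only to exercise the formulae.
A, U_dot, V_dot, U_prime, V_prime = 0.8, 1.2, 0.7, -0.9, 1.1
lam, lapse, shift = adm_from_embedding(A, U_dot, V_dot, U_prime, V_prime)
g_tau_tau = -lapse**2 + lam**2 * shift**2      # should equal -A * U_dot * V_dot
g_tau_rho = lam**2 * shift                     # should equal -A (U_dot V' + V_dot U') / 2
```

With $U' V' < 0$ and a positive Jacobian, the lapse comes out positive, which is why the explicit minus sign appears in the expression for $N$.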
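To calculate the gravitational momenta, we can use the canonical equations that follow when the action $S_0$ is varied with respect to $P_\Lambda$ and $P_R$:
$$P_\Lambda = -\frac{R}{N}\left(\dot R - N^\rho R'\right), \qquad P_R = -\frac{\Lambda}{N}\left(\dot R - N^\rho R'\right) - \frac{R}{N}\left(\dot\Lambda - (N^\rho\Lambda)'\right).$$
Substituting for $R$, $\Lambda$, $N$ and $N^\rho$ gives
$$\dot\Lambda - (N^\rho\Lambda)' = \frac{1}{2\Lambda}\,\frac{\dot U V' - \dot V U'}{2U'V'}\left(-A_U U'^2 V' + A_V U' V'^2 - A U'' V' + A U' V''\right),$$
$$\dot R - N^\rho R' = \frac{\dot U V' - \dot V U'}{2U'V'}\left(R_U U' - R_V V'\right),$$
so that, finally,
$$P_\Lambda = \frac{R}{\sqrt{-A U' V'}}\left(R_U U' - R_V V'\right), \tag{24}$$
$$P_R = R_U U' - R_V V' + \frac{R A_U}{2A}\,U' - \frac{R A_V}{2A}\,V' + \frac{R}{2}\frac{U''}{U'} - \frac{R}{2}\frac{V''}{V'}. \tag{25}$$
Here, the indices $U$ and $V$ denote the partial derivatives with respect to $U$ and $V$. Eqs. (23), (24) and (25) are the transformation equations expressing the variables $\Lambda$, $R$, $P_\Lambda$ and $P_R$ in terms of the new variables at the constraint surface. The functions $A$ and $R$ are given by (6), (15), and (16).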
We now turn to the remaining variables $\eta$, $r$ and $p$. We leave $\eta$ unchanged; in fact, we shall consider the action as two different actions, one for each value of $\eta$. The variable $r$ is related to our new variables $u$, $v$, $M$, $U(\rho)$ and $V(\rho)$ in a different way for each value of $\eta$. If $\eta = +1$, then $r$ is determined by the equation $U(r) = u$. This is an equation with exactly one solution if $u$ satisfies the condition $u < U(0)$ because $U(\rho)$ is a monotonic function with the range $(-\infty, U(0))$. For the differentials of the variables $U(\rho)$, $r$ and $u$, we obtain the relation:
$$dr = \frac{du - dU(r)}{U'(r)}. \tag{26}$$
Similarly, if $\eta = -1$, then $r$ is defined by $V(r) = v$ for $v > V(0)$, and the relation between the differentials takes the form:
$$dr = \frac{dv - dV(r)}{V'(r)}. \tag{27}$$
The variable $p$ does not seem to be determined completely in [14] because Eq. (2.5a) of Ref. [14], which is the only equation that could serve this purpose, does not make sense in the limit $m \to 0$ of null shells. However, the constraint equations lead to some expressions for $p$; these determine $p$ only at the constraint surface, but this is, in fact, all we need. Let us, therefore, turn to the constraint equations.
4.3. The constraints

The constraint functions (19) and (20) contain finite parts, which are obtained for $\rho \neq r$ and in the limits $\rho \to r\pm$, and $\delta$-function parts. The $\delta$-function parts can be rewritten as equations for finite quantities, if one collects all terms with $\delta$-function and sets the coefficient equal to zero.
From the boundary conditions at the shell and Eqs. (23), (24) and (25), it follows that the functions $\Lambda(\rho)$ and $R(\rho)$ are continuous, whereas $\Lambda'(\rho)$, $R'(\rho)$, $P_\Lambda(\rho)$ and $P_R(\rho)$ jump across the shell, as the metric is not smooth. This implies in turn that the $\delta$-function part of the constraints is equivalent to
$$\eta p = -R\,[R'], \tag{28}$$
$$p = -\Lambda\,[P_\Lambda], \tag{29}$$
where the symbol $[g] := g_+ - g_-$ denotes the jump of the quantity $g$ across the shell.
Let us calculate the jumps. We have
$$[R'] = [R_U]\,U' + [R_V]\,V'.$$
For $\eta = +1$, we have to use (6) and (15) and to replace the limits $\rho \to r\pm$ by $U \to u\mp$. We obtain immediately from (6) that
$$R_{U-} = -\frac{1}{2}, \qquad R_{V-} = \frac{1}{2}.$$
Differentiating (15) with the help of formulae (12) and (13) leads to, for $U < u$,
$$R_{U+} = -\frac{f_+}{2\kappa(f_+)\,e^{\kappa(f_+)}}, \qquad R_{V+} = \frac{\dfrac{V - u}{4M}\exp\dfrac{V - U}{4M}}{2\kappa(f_+)\,e^{\kappa(f_+)}}.$$
Eq. (6) and $\kappa(f_+) = R/2M$ imply for the limits that
$$\lim_{U \to u}\kappa(f_+) = \frac{V - u}{4M},$$
so we have
$$R_{U+} = -\frac{1}{2} + \frac{2M}{V - u}, \qquad R_{V+} = \frac{1}{2}.$$
Hence,
$$[R_U] = \frac{2M}{V - u}, \qquad [R_V] = 0.$$
Similarly, for $\eta = -1$,
$$[R_U] = 0, \qquad [R_V] = -\frac{2M}{v - U}.$$
There is also the relation
$$R|_{U=u} = \frac{V - u}{2}, \qquad R|_{V=v} = \frac{v - U}{2},$$
and so (28) yields:
$$\eta = +1: \quad p = -M\,U'(r), \tag{30}$$
$$\eta = -1: \quad p = -M\,V'(r). \tag{31}$$
For $P_\Lambda$, (24) implies
$$\Lambda\,[P_\Lambda] = R\,[R_U]\,U' - R\,[R_V]\,V',$$
and so (29) gives the same result as (28).
Let us return to the finite part of the constraints (19) and (20). If we substitute the above expressions for $\Lambda$, $R$, $P_\Lambda$, and $P_R$, we obtain, after some lengthy but straightforward calculation, for each $\rho \neq r$:
$$\Lambda H = \frac{1}{2}\left(4R R_{UV} + 4R_U R_V + A\right)U'V' + \frac{R}{A}\left(A R_{UU} - A_U R_U\right)U'^2 + \frac{R}{A}\left(A R_{VV} - A_V R_V\right)V'^2,$$
and
$$H_\rho = -\frac{R}{A}\left(A R_{UU} - A_U R_U\right)U'^2 + \frac{R}{A}\left(A R_{VV} - A_V R_V\right)V'^2.$$
If $H$ and $H_\rho$ are zero for any embedding outside the shell, that is for all possible $U'$ and $V'$, the coefficients of $U'V'$, $U'^2$ and $V'^2$ must themselves vanish:
$$4R R_{UV} + 4R_U R_V + A = 0, \tag{32}$$
$$A R_{UU} - A_U R_U = 0, \tag{33}$$
$$A R_{VV} - A_V R_V = 0. \tag{34}$$
These three equations are equivalent to the full set of Einstein equations for any metric of the form (5). Thus, our functions $A$ and $R$ have to satisfy these equations. This is immediately clear for (6) which gives $A$ and $R$ inside the shell. A more tedious calculation verifies the validity of (32), (33), and (34) also outside the shell, where $A$ and $R$ are given by (15) and (16).
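The "more tedious calculation" for the outside region can be spot-checked numerically. The following is a rough sketch, not the authors' computation: it evaluates the left-hand sides of (32), (33) and (34) for the functions (15) with central finite differences; the step size, the sample point and all function names are assumptions.

```python
import math

def kruskal(f, tol=1e-14, max_iter=200):
    """Root y > 0 of (y - 1) * exp(y) = f (Kruskal function), for f > -1."""
    if f <= -1.0:
        raise ValueError("kappa(f) is evaluated here only for f > -1")
    y = 1.0 if f < math.e else math.log(f)  # illustrative starting guess
    for _ in range(max_iter):
        step = ((y - 1.0) * math.exp(y) - f) / (y * math.exp(y))
        y -= step
        if abs(step) <= tol * max(1.0, abs(y)):
            return y
    raise RuntimeError("Newton iteration did not converge")

def metric_outside(M, u, U, V):
    """A and R of Eq. (15), valid for U < u."""
    b = (V - u) / (4.0 * M)
    e = math.exp((V - U) / (4.0 * M))
    k = kruskal((b - 1.0) * e)
    return b * e / (k * math.exp(k)), 2.0 * M * k

def einstein_residuals(M, u, U, V, h=1e-3):
    """Left-hand sides of (32), (33), (34), using central differences of step h."""
    A = lambda x, y: metric_outside(M, u, x, y)[0]
    R = lambda x, y: metric_outside(M, u, x, y)[1]
    A0, R0 = A(U, V), R(U, V)
    R_U = (R(U + h, V) - R(U - h, V)) / (2 * h)
    R_V = (R(U, V + h) - R(U, V - h)) / (2 * h)
    A_U = (A(U + h, V) - A(U - h, V)) / (2 * h)
    A_V = (A(U, V + h) - A(U, V - h)) / (2 * h)
    R_UU = (R(U + h, V) - 2 * R0 + R(U - h, V)) / h**2
    R_VV = (R(U, V + h) - 2 * R0 + R(U, V - h)) / h**2
    R_UV = (R(U + h, V + h) - R(U + h, V - h)
            - R(U - h, V + h) + R(U - h, V - h)) / (4 * h**2)
    return (4 * R0 * R_UV + 4 * R_U * R_V + A0,
            A0 * R_UU - A_U * R_U,
            A0 * R_VV - A_V * R_V)
```

At a point with $U < u$ all three residuals should vanish up to discretisation and rounding error.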
4.4. Transformation of the Liouville form
As explained at the end of Section 4.1, the transformation of the action (18) to the new variables will be performed in two steps. The first step is restricted to the constraint surface and forms the content of the present section.
At the constraint surface, $H_0 = N_\infty E_\infty$ and the action (18) becomes
$$S_0|_\Gamma = \int d\tau\left[p\,\dot r - N_\infty E_\infty + \int d\rho\left(P_\Lambda\dot\Lambda + P_R\dot R\right)\right].$$
According to the discussion given in [20], the ADM boundary term $N_\infty E_\infty$ in the action can, after parametrisation at the infinity, be written as $E_\infty\dot T_\infty$ and can be considered as a part of a modified Liouville form. Let us denote this form by $\Theta$:
$$\Theta = \int d\rho\left(P_\Lambda\,d\Lambda + P_R\,dR\right) + p\,dr - E_\infty\,dT_\infty. \tag{35}$$
As a result, the transformation of the action is nothing but the transformation of the Liouville form $\Theta$.
We expect that the terms remaining after the transformation do not depend on any embeddings, because the pull-back of the symplectic form to the constraint surface is degenerate exactly in the direction of the gauge variables $U(\rho)$ and $V(\rho)$. As we shall see, the constraint surface consists of two components, $\Gamma^+$ and $\Gamma^-$, $\Gamma^+$ containing all outgoing and $\Gamma^-$ all in-going shells. We split this form into three terms for the in-going and outgoing part, respectively,
$$\Theta|_{\Gamma^+} = \Theta^+|_{\Gamma^+} + \Theta^-|_{\Gamma^+} + p\,dr,$$
where
$$\Theta^+|_{\Gamma^+} = \int_r^\infty d\rho\left(P_\Lambda\,d\Lambda + P_R\,dR\right) - M\,dT_\infty$$
because $E_\infty = M$ at the constraint surface, and
$$\Theta^-|_{\Gamma^+} = \int_0^r d\rho\left(P_\Lambda\,d\Lambda + P_R\,dR\right),$$
and similar expressions for $\Theta|_{\Gamma^-}$. Let us first transform the part $\Theta^+|_{\Gamma^+}$ of the Liouville form and make the ansatz
$$\Theta^+|_{\Gamma^+} \equiv \int_r^\infty d\rho\,\theta - M\,dT_\infty, \tag{36}$$
with
$$\theta = \left(f\,dU + g\,dV + h_i\,do^i\right)' + d\varphi, \tag{37}$$
where we have denoted the observables $u$ and $M$ collectively by $o^i$ ($i = 1, 2$). This has to be compared with the corresponding part of (35), where the substitutions are made from (23),
$$\frac{d\Lambda}{\Lambda} = \frac{A_U}{2A}\,dU + \frac{A_V}{2A}\,dV + \frac{A_i}{2A}\,do^i + \frac{dU'}{2U'} + \frac{dV'}{2V'}, \tag{38}$$
$$dR = R_U\,dU + R_V\,dV + R_i\,do^i, \tag{39}$$
and $P_\Lambda$ and $P_R$ are given by (24) and (25).
It turns out to be convenient to make for the functions in (37) the ansatz
$$f = \frac{R R_U}{2}\ln\left(-\frac{U'}{V'}\right) + F(U, V, o^i), \tag{40}$$
$$g = \frac{R R_V}{2}\ln\left(-\frac{U'}{V'}\right) + G(U, V, o^i), \tag{41}$$
$$h_i = \frac{R R_i}{2}\ln\left(-\frac{U'}{V'}\right) + H_i(U, V, o^i), \tag{42}$$
$$\varphi = R R_U U' - R R_V V' - \frac{R}{2}\left(R_U U' + R_V V'\right)\ln\left(-\frac{U'}{V'}\right) - F U' - G V' + \psi(U, V, o^i). \tag{43}$$
The functions $F$, $G$, $H_i$, $\psi$ are then determined through comparison with the coefficients of $dU$, $dV$, $dU'$, $dV'$, and $do^i$. This leads to the equations
$$F_V - G_U = \frac{R}{2A}\left(2A R_{UV} - A_U R_V - A_V R_U\right), \tag{44}$$
$$H_{iU} - F_i = -\frac{R}{2A}\left(2A R_{iU} - A_i R_U - A_U R_i\right), \tag{45}$$
$$H_{iV} - G_i = \frac{R}{2A}\left(2A R_{iV} - A_i R_V - A_V R_i\right), \tag{46}$$
$$\psi = 0. \tag{47}$$
We next calculate the right-hand side of these equations by using the explicit expressions for $A$ and $R$ found in Section 2. Outside the shell, these are the expressions (15). It is convenient to introduce the abbreviations
$$a = \frac{U - u}{4M}, \qquad b = \frac{V - u}{4M}. \tag{48}$$
One then has
$$A = \frac{b\,e^{b-a}}{\kappa\,e^{\kappa}}, \qquad R = 2M\kappa, \tag{49}$$
where $\kappa$ is a function of $f_+ = (b - 1)e^{b-a}$, see (13). The following identity turns out to be useful:
$$\frac{e^{b-a}}{e^{\kappa}} = \frac{\kappa - 1}{b - 1}. \tag{50}$$
After some lengthy, but straightforward calculations one finds
$$F_V - G_U = -\frac{\kappa - 1}{8b(b - 1)}, \tag{51}$$
$$H_{uU} = F_u - \frac{\kappa - 1}{8b(b - 1)}, \tag{52}$$
$$H_{uV} = G_u + \frac{\kappa - 1}{8b(b - 1)}, \tag{53}$$
$$H_{MU} = F_M - \frac{1}{2} - \frac{\kappa - 1}{2(b - 1)}, \tag{54}$$
$$H_{MV} = G_M + \frac{\kappa - 1}{2b(b - 1)}\,a + \frac{\kappa^2 - b^2}{2b(b - 1)}. \tag{55}$$
The freedom in the choice of solution to Eqs. (44)-(46) enables us to set $F \equiv 0$. From the first equation one then gets
$$G = \int_u^U dU\,\frac{\kappa - 1}{8b(b - 1)} = -\frac{M(\kappa^2 - b^2)}{4b(b - 1)}, \tag{56}$$
where we have chosen the boundary condition that $G = 0$ for $U = u$, i.e., at the shell, and calculated the integral by the substitution $x = \kappa$. One recognises from (56) that, at the shell, $G_M = 0$ and $G_u = -1/8b$. With this result for $G$, one can integrate Eqs. (52)-(55) for $H_i$ and choose the integration constants such that
$$H_u = -G, \tag{57}$$
$$H_M = -\frac{1}{2}(U - u) - 4bG, \tag{58}$$
having $H_i = 0$, $i = 1, 2$, at the shell. This then yields for the Liouville form outside the shell pulled back to the constraint surface
$$\Theta^+|_{\Gamma^+} = \left(f\,dU + g\,dV + h_i\,do^i\right)\Big|_r^\infty + d\left(\int_r^\infty d\rho\,\varphi\right) + \varphi\,dr\big|_{\rho=r} - M\,dT_\infty. \tag{59}$$
The fourth term on the right-hand side of (59) is a total derivative and will be omitted, since it does not contribute to the dynamics. Eqs. (26), (30) and (31) lead to
$$p\,dr = -M(du - dU).$$
Analogously, one finds for the part $\Theta^-|_{\Gamma^+}$ inside the shell
$$\Theta^-|_{\Gamma^+} = \left(k\,dU + l\,dV\right)\Big|_0^r + d\left(\int_0^r d\rho\,\tilde\varphi\right) - \tilde\varphi\,dr\big|_{\rho=r}, \tag{60}$$
with
$$k = \frac{R R_U}{2}\ln\left(-\frac{U'}{V'}\right), \tag{61}$$
$$l = \frac{R R_V}{2}\ln\left(-\frac{U'}{V'}\right), \tag{62}$$
$$\tilde\varphi = R R_U U' - R R_V V' - \frac{R}{2}\left(R_U U' + R_V V'\right)\ln\left(-\frac{U'}{V'}\right). \tag{63}$$
Compared to $f$, $g$, $h_i$, $\varphi$, there are no terms analogous to $G$, $F$, and $H_i$, since the classical solutions (6) inside the shell lead to a vanishing right-hand side of (44)-(46). Because of the boundary condition $U'(0) = -V'(0)$, the functions $k$ and $l$ vanish at the centre. The third term on the right-hand side of (60) is again a total derivative and will be neglected.
One has therefore only potential contributions at the shell and at infinity. We shall consider first the contribution from the shell. Since there $F = G = H_u = H_M = 0$, one has to calculate
$$\left(k\,dU + l\,dV\right)\big|_{\rho=r} - \tilde\varphi\,dr\big|_{\rho=r} - \left(f\,dU + g\,dV + h_u\,du + h_M\,dM\right)\big|_{\rho=r} + \varphi\,dr\big|_{\rho=r} - M(du - dU). \tag{64}$$
Using (6) and (15), one arrives at the following jump conditions at the shell:
$$[R R_U] = M, \qquad [R R_V] = 0, \qquad [R R_u] = -M, \qquad [R R_M] = 0. \tag{65}$$
Taking these into account, one recognises that all terms on the dust shell cancel. As we shall now demonstrate, the only non-vanishing terms originate from infinity:
$$\Theta|_{\Gamma^+} = \lim_{\rho\to\infty}\left[H_i\,do^i + F\,dU + G\,dV + \ln\left(-\frac{U'}{V'}\right)\left(\frac{R R_U}{2}\,dU + \frac{R R_V}{2}\,dV + \frac{R R_i}{2}\,do^i\right)\right] - M\,dT_\infty, \tag{66}$$
where the function $F = 0$, and $G$, $H_u$ and $H_M$ are given by (56), (57), and (58), respectively. The limit (66) is determined by the boundary condition 3 of Section 4.2, cf. (21) and (22).
Eqs. (21) and (22) determine the expansions of $U(\rho)$ and $V(\rho)$ uniquely. Indeed, for $\eta = +1$, $U$ near the space-like infinity coincides with the Eddington-Finkelstein retarded coordinate (see Section 3) and so is given in terms of $T$ and $R$ by
$$U = T - R - 2M\ln\left(\frac{R}{2M} - 1\right).$$
Then,
$$U(\rho) = -\rho - 2M\ln\frac{\rho}{2M} + T_\infty + O(\rho^{-1}). \tag{67}$$
The presence of the logarithmic term is due to the long range of the gravitational potential. Thus, the first diverging term is universal, the second depends on the observable $M$, and the asymptotic coordinate $T_\infty$ of the embedding appears only at the third position.
The asymptotic expansion of the function $V(\rho)$ can be determined from (10). We have first to get rid of $\kappa$:
$$(V - u - 4M)\exp\frac{V}{4M} = 2(R - 2M)\exp\left(\frac{U}{4M} + \frac{R}{2M}\right). \tag{68}$$
Then we substitute the expansions (22) and (67) into the right-hand side of (68):
$$(V - u - 4M)\exp\frac{V}{4M} = \rho\left(1 - 2M\rho^{-1} + O(\rho^{-2})\right)\exp\left[\frac{1}{4M}\left(\rho - 2M\ln\frac{\rho}{8M} + T_\infty + O(\rho^{-1})\right)\right]. \tag{69}$$
Let us remove the singular part in the exponent by setting
$$V(\rho) = \rho - 2M\ln\frac{\rho}{8M} + T_\infty + V_1(\rho).$$
Eq. (69) then becomes
$$\left(1 - 2M\rho^{-1} + O(\rho^{-2})\right)\exp\left(O(\rho^{-1})\right) = \left[1 - 2M\rho^{-1}\ln\frac{\rho}{8M} + (T_\infty - u - 4M)\rho^{-1} + V_1(\rho)\rho^{-1}\right]\exp\frac{V_1(\rho)}{4M}. \tag{70}$$
Taking the limits $\rho \to \infty$ of both sides of (70), we obtain
$$\lim_{\rho\to\infty}\left(1 + \frac{V_1(\rho)}{\rho}\right)\exp\frac{V_1(\rho)}{4M} = 1.$$
It follows that
$$\lim_{\rho\to\infty} V_1(\rho) = 0.$$
Eq. (70) implies then also that
$$V_1(\rho) = O\left(\frac{\ln\rho}{\rho}\right).$$
Hence, the expansion of the function $V(\rho)$ has the form:
$$V(\rho) = \rho - 2M\ln\frac{\rho}{8M} + T_\infty + O\left(\frac{\ln\rho}{\rho}\right), \tag{71}$$
the asymptotic coordinate $T_\infty$ appears again only at the third position, and this is the reason why the expansion must be carried so far.
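The expansion (71) can be compared with the exact $V$ obtained from Eq. (10) along an embedding with $R = \rho$ and $T = T_\infty$. The sketch below is illustrative and rests on assumptions: the corrections in (21) and (22) are dropped, the function names are hypothetical, and the Kruskal function is evaluated in logarithmic form, $\kappa + \ln(\kappa - 1) = s$, to avoid overflow of the exponential at large $\rho$.

```python
import math

def kruskal_of_exp(s, tol=1e-13, max_iter=100):
    """kappa(exp(s)) for s > 1: the root y > 1 of y + ln(y - 1) = s."""
    if s <= 1.0:
        raise ValueError("this sketch only handles s > 1")
    y = s - math.log(s)   # lies to the left of the root, and above 1
    for _ in range(max_iter):
        # Newton step for g(y) = y + ln(y - 1) - s, with g'(y) = y / (y - 1).
        step = (y + math.log(y - 1.0) - s) * (y - 1.0) / y
        y -= step
        if abs(step) <= tol * y:
            return y
    raise RuntimeError("Newton iteration did not converge")

def v_exact(M, u, T_inf, rho):
    """V from Eq. (10), with R = rho and U the retarded Eddington-Finkelstein time."""
    log_term = math.log(rho / (2.0 * M) - 1.0)
    U = T_inf - rho - 2.0 * M * log_term
    s = (U - u + 2.0 * rho) / (4.0 * M) + log_term   # log of the argument of kappa
    return u + 4.0 * M * kruskal_of_exp(s)

def v_asymptotic(M, T_inf, rho):
    """Leading terms of Eq. (71)."""
    return rho - 2.0 * M * math.log(rho / (8.0 * M)) + T_inf

err_1e3 = abs(v_exact(1.0, 0.0, 2.0, 1e3) - v_asymptotic(1.0, 2.0, 1e3))
err_1e6 = abs(v_exact(1.0, 0.0, 2.0, 1e6) - v_asymptotic(1.0, 2.0, 1e6))
```

The difference between the two expressions should shrink as $\rho$ grows, in line with the $O(\rho^{-1}\ln\rho)$ remainder.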
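Now, the expansion of all functions contained in $\Theta|_{\Gamma^+}$ is a straightforward matter. For $G$, we obtain from (56):
$$G = -\frac{M}{4}\,\frac{4R^2 - (V - u)^2}{(V - u)^2 - 4M(V - u)}.$$
Then, Eqs. (22), (67), and (71) give
$$G = -\frac{3}{4}M - 4M^2\rho^{-1}\ln\frac{\rho}{8M} + M(2T_\infty - 2u - 3M)\rho^{-1} + o(\rho^{-1}),$$
where $o(\rho^{-1})$ is defined by the property $\lim_{\rho\to\infty}\rho\,o(\rho^{-1}) = 0$, and
$$H_u = -G = \frac{3}{4}M + 4M^2\rho^{-1}\ln\frac{\rho}{8M} - M(2T_\infty - 2u - 3M)\rho^{-1} + O(\rho^{-2}),$$
as well as
$$H_M = \frac{5}{4}\rho + \frac{7}{2}M\ln\frac{\rho}{8M} + \frac{M}{2}(6 + 4\ln 2) - \frac{7}{4}(T_\infty - u) + O(\rho^{-1}).$$
Eq. (71) can be used to calculate $dV$:
$$dV = -d\left(2M\ln\frac{\rho}{8M}\right) + dT_\infty + O\left(\frac{\ln\rho}{\rho}\right).$$
Then we obtain:
$$H_i\,do^i + F\,dU + G\,dV = M(dT_\infty - du) + dZ + O\left(\frac{\ln\rho}{\rho}\right),$$
where
$$Z = -\frac{7}{4}M(T_\infty - u) + \frac{5}{4}\rho M + \frac{5}{2}M^2\ln\rho + \frac{1}{2}(4 - 13\ln 2)M^2 - \frac{5}{2}M^2\ln M.$$
Similarly,
$$\frac{R}{2}\ln\left(-\frac{U'}{V'}\right) = 2M + O(\rho^{-2}).$$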
The derivatives $R_U$, $R_V$, $R_u$ and $R_M$ can be expanded if we calculate them from (11) using the identity (50):
$$R_U = -\frac{1}{2}\,\frac{R - 2M}{R}, \qquad R_V = \frac{1}{2}\,\frac{R - 2M}{R}\,\frac{V - u}{V - u - 4M},$$
$$R_u = -2M\,\frac{R - 2M}{R}\,\frac{1}{V - u - 4M},$$
and
$$R_M = \frac{R}{M} - 2\,\frac{R - 2M}{R}\left(\frac{V - U}{4M} + \frac{V - u}{V - u - 4M}\right).$$
This gives:
$$\frac{R}{2}\ln\left(-\frac{U'}{V'}\right)\left(R_U\,dU + R_V\,dV + R_i\,do^i\right) = O\left(\frac{\ln\rho}{\rho}\right).$$
Collecting all terms, we finally have:
$$\Theta|_{\Gamma^+} = -M\,du + dZ + O\left(\frac{\ln\rho}{\rho}\right). \tag{72}$$
The exact form $dZ$ can be omitted because it has no influence on the symplectic form and the equations of motion.
The final result of this subsection can be formulated as follows. The constraint surface $\Gamma$ consists of two components: $\Gamma^+$ for the outgoing shells ($\eta = +1$), and $\Gamma^-$ for the in-going shells ($\eta = -1$). On $\Gamma^+$, we have the coordinates $M$, $u$, $U(\rho)$ and $V(\rho)$, and the pull-back of the Liouville form to $\Gamma^+$ is
$$\Theta|_{\Gamma^+} = -M\,du.$$
Thus, it is independent of $U(\rho)$ and $V(\rho)$, as expected.
In a completely analogous manner, the following result can be derived for the $\eta = -1$ case:
$$\Theta|_{\Gamma^-} = -M\,dv,$$
and our coordinates on $\Gamma^-$ are $M$, $v$, $U(\rho)$ and $V(\rho)$.
These results also show that the two Dirac observables M and u (or M and v) form
a conjugate pair. Indeed, the Poisson algebra of Dirac observables is well-defined by the
(degenerate) pull-back of the symplectic form to the constraint surface.
4.5. Extension to a neighbourhood of the constraint surface
In the previous subsection, the constraint surface pull-back of the Liouville form has been transformed to the Kuchař coordinates: the embeddings $U(\rho)$ and $V(\rho)$ that represent pure gauges, and $p_u$ and $u$ (or $p_v$ and $v$) that are Dirac observables. The next task is to extend these coordinates to a neighbourhood of the constraint surface $\Gamma = \Gamma^+ \cup \Gamma^-$.
In [13], the proof has been given that such an extension exists if there are no points with additional symmetry at $\Gamma$. For the action (18), the spacetime solutions are all spherically symmetric, but none of them exhibits any other symmetry, be it discrete or continuous. We can reformulate the result of [13] in a way suitable for our purposes as follows. There is a neighbourhood of $\Gamma$ in the phase space, and functions
$$U,\ P_U,\ V,\ P_V,\ u,\ p_u \tag{73}$$
near $\Gamma^+$ and
$$U,\ P_U,\ V,\ P_V,\ v,\ p_v \tag{74}$$
near $\Gamma^-$ such that, at $\Gamma$,
$$P_U(\rho) = P_V(\rho) = 0,$$
and $U(\rho)$, $V(\rho)$, $p_u$ and $u$ (or $p_v$ and $v$) coincide with our coordinates there. The functions (73) and (74) form canonical charts in this neighbourhood. The transformation between the old variables
$$R,\ P_R,\ \Lambda,\ P_\Lambda,\ r,\ p, \tag{75}$$
and the new ones, Eq. (73) or (74), is smooth and invertible there. At the constraint surface $\Gamma$, the transformation is given by Eqs. (23)-(25), (15), (16), $U(r) = u$, $V(r) = v$, (30) and (31). Outside the constraint surface, only the existence of the transformation has been shown so we do not know its form.
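Using this result, we can write the transformed action in this neighbourhood as follows,
$$S^+ = \int d\tau\left[p_u\dot u + \int_0^\infty d\rho\left(P_U(\rho)\dot U(\rho) + P_V(\rho)\dot V(\rho) - N^U(\rho)P_U(\rho) - N^V(\rho)P_V(\rho)\right)\right],$$
and
$$S^- = \int d\tau\left[p_v\dot v + \int_0^\infty d\rho\left(P_U(\rho)\dot U(\rho) + P_V(\rho)\dot V(\rho) - N^U(\rho)P_U(\rho) - N^V(\rho)P_V(\rho)\right)\right].$$
The two actions can be considered as the reduced form of just one action. Consider the case $\eta = +1$ first. The dynamical trajectory of the shell is given by the relation $u(\tau) = \text{const}$, whereas $v(\tau)$ is arbitrary, depending on the choice of the parameter $\tau$ (the only restriction is that $v(\tau)$ is an increasing function). This information can be obtained from the extended action:
$$S^+_{\text{ext}} = \int d\tau\left[p_u\dot u + p_v\dot v - n_u p_v + \int_0^\infty d\rho\left(P_U(\rho)\dot U(\rho) + P_V(\rho)\dot V(\rho) - N^U(\rho)P_U(\rho) - N^V(\rho)P_V(\rho)\right)\right],$$
where $n_u$ is a Lagrange multiplier, $v(\tau)$ a pure gauge and $p_v$ dependent. Similarly for $\eta = -1$:
$$S^-_{\text{ext}} = \int d\tau\left[p_u\dot u + p_v\dot v - n_v p_u + \int_0^\infty d\rho\left(P_U(\rho)\dot U(\rho) + P_V(\rho)\dot V(\rho) - N^U(\rho)P_U(\rho) - N^V(\rho)P_V(\rho)\right)\right].$$
One can set in $S^+_{\text{ext}}$, as $p_u = -M < 0$:
$$n_u = n\,p_u,$$
and, similarly, in $S^-_{\text{ext}}$,
$$n_v = n\,p_v.$$
Then, clearly, $S^+_{\text{ext}}$ and $S^-_{\text{ext}}$ are obtained by reducing the following action
$$S = \int d\tau\left[p_u\dot u + p_v\dot v - n\,p_u p_v + \int_0^\infty d\rho\left(P_U(\rho)\dot U(\rho) + P_V(\rho)\dot V(\rho) - N^U(\rho)P_U(\rho) - N^V(\rho)P_V(\rho)\right)\right].$$
Indeed, the case $\eta = +1$ ($\eta = -1$) is obtained from the solution $p_v = 0$ ($p_u = 0$) of the constraint
$$p_u p_v = 0.$$
The relation between the total energy $M$ and the two momenta $p_u$ and $p_v$ can now be written as follows:
$$M = -p_u - p_v.$$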
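As a quick consistency check (a sketch added for illustration, not part of the original derivation), one can vary the shell part of this action, $\int d\tau\,(p_u\dot u + p_v\dot v - n\,p_u p_v)$, with respect to $p_u$, $p_v$, $u$, $v$ and $n$:

```latex
\begin{align*}
\dot u &= n\,p_v, & \dot v &= n\,p_u, \\
\dot p_u &= 0,    & \dot p_v &= 0, \\
p_u\,p_v &= 0 .
\end{align*}
```

On the branch $p_v = 0$ this gives $\dot u = 0$, while $\dot v = n\,p_u$ is fixed only by the arbitrary multiplier $n$, which reproduces the outgoing case with $u = \text{const}$ and $v(\tau)$ pure gauge. With $p_u = -M < 0$, an increasing $v(\tau)$ corresponds to $n < 0$. The branch $p_u = 0$ gives the in-going case in the same way.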
5. Conclusions
We have demonstrated that there is a transformation of variables bringing the action (18) to the simple form of the so-called Kuchař decomposition
$$S = \int d\tau\left(p_u\dot u + p_v\dot v - n\,p_u p_v\right) + \int d\tau\int_0^\infty d\rho\left(P_U\dot U + P_V\dot V - H\right), \tag{76}$$
where $H = N^U P_U + N^V P_V$; $n$, $N^U(\rho)$ and $N^V(\rho)$ are the new Lagrange multipliers.
The dependence of the new variables $P_U$ and $P_V$ on the old ones is not known. This dependence would be needed for calculation of the spacetime geometry associated with any solution given in terms of the new variables. We know, however, that the new constraint equations, $P_U(\rho) = P_V(\rho) = 0$, are mathematically equivalent to the old constraints, Eqs. (19) and (20). One can, therefore, use the old constraints to calculate the geometry from the true degrees of freedom along the hypersurfaces of some foliation. The fact that two spacetimes obtained by this method using different foliations are isometric is guaranteed by the closure of the algebra of constraints [21].
The new phase space has non-trivial boundaries:
$$p_u \leq 0, \qquad p_v \leq 0, \tag{77}$$
$$\frac{-u + v}{2} > 0, \tag{78}$$
and
$$p_v = 0,\quad U \in (-\infty, u): \qquad V > u - 4p_u\,\kappa\left(-\exp\frac{u - U}{4p_u}\right), \tag{79}$$
$$p_u = 0,\quad V \in (v, \infty): \qquad U < v + 4p_v\,\kappa\left(-\exp\frac{V - v}{4p_v}\right). \tag{80}$$
The boundaries defined by inequalities (78)-(80) are due to the singularity.
The two dynamical systems defined by the actions (18) and (76) are equivalent: each maximal dynamical trajectory of the first, if transformed to the new variables, gives a maximal dynamical trajectory of the second and vice versa.
The variables $u$, $v$, $p_u$ and $p_v$ span the effective phase space of the shell. They contain all true degrees of freedom of the system. One can observe that the corresponding part of the action (76) coincides with the action for free motion of a zero-rest-mass spherically symmetric (light) shell in flat spacetime if one replaces the inequality (78) by
$$\frac{-u + v}{2} \geq 0.$$
Such a dynamics is complete if the singularity at the value zero of the radius of the shell, $(-u + v)/2$, can be considered as a harmless caustic so that the light can re-expand after passing through it. It might, therefore, seem also possible to extend the phase space of the gravitating shell in the same way so that the in-going and the out-going sectors are merged together into one bouncing solution.
However, such a formal extension of the dynamics (76) is not adequate. The physical
meaning of any solution written in terms of new variables (73) or (74) is given by
measurable quantities of geometrical or physical nature such as the curvature of spacetime
or the density of matter. These observables must be expressed as functions on the phase
space. They can of course be transformed between the phase spaces of the two systems
(18) and (76). They cannot be left out from any complete description of a system, though
they are often included only tacitly: an action alone does not define a system. This holds
just as well for the action (18) as for (76).
Let us consider these observables. The expression for the stress-energy tensor of the shell written down in [14] implies that the density of matter diverges at $r = 0$; this corresponds to $(-u + v)/2 = 0$ in terms of the new variables. Eqs. (15) and (16) can be used to show that the curvature of the solution spacetime diverges at the boundary defined by Eq. (79) for $p_v = 0$ and by Eq. (80) for $p_u = 0$. It follows that the observable quantities at and near the caustic are badly singular and that there is no sensible extension of the dynamics defined by action (76) to it, let alone through it. This confirms the more or less obvious fact that no measurable property (such as the singularity) can be changed by a transformation of variables.
The action for the null dust shell is now written in a form which can be taken as the
starting point for quantisation. Surprisingly, it will turn out that the quantum theory is, in
fact, singularity-free. This will be done in a separate paper [22].
Acknowledgements
Helpful discussions with K.V. Kuchař and L. Lusanna are acknowledged. The authors wish to thank I. Kouletsis for checking some of the equations. This work was supported by the Swiss National Science Foundation and the Tomalla Foundation Zürich. C.K. thanks the University of Bern for its kind hospitality, while part of this work was done; for the same reason, P.H. thanks the University of Florence.
References
[1] S.W. Hawking, G.F.R. Ellis, The Large Scale Structure of Space-Time, Cambridge Univ. Press, Cambridge, 1973.
[2] E. Farhi, A.H. Guth, J. Guven, Nucl. Phys. B 339 (1990) 417.
[3] P. Kraus, F. Wilczek, Nucl. Phys. B 433 (1995) 403.
[4] T. Dray, G. 't Hooft, Commun. Math. Phys. 99 (1985) 613.
[5] P. Hájíček, Commun. Math. Phys. 150 (1992) 545.
[6] P. Hájíček, B.S. Kay, K.V. Kuchař, Phys. Rev. D 46 (1992) 5439.
[7] P. Hájíček, in: J. Ehlers, H. Friedrich (Eds.), Canonical Gravity: From Classical to Quantum, Springer, Berlin, 1994.
[8] K.V. Kuchař, P. Hájíček, unpublished (1997).
[9] T.D. Newton, E.P. Wigner, Rev. Mod. Phys. 21 (1949) 400.
[10] K.V. Kuchař, J. Math. Phys. 13 (1972) 768.
[11] K.V. Kuchař, in: G. Kunstatter et al. (Eds.), Proceedings of the 4th Canadian Conference on General Relativity and Relativistic Astrophysics, World Scientific, Singapore, 1992.
[12] P. Hájíček, Nucl. Phys. B Proc. Suppl. 80 (2000) CD-ROM supplement. Also available as gr-qc/9903089.
[13] P. Hájíček, J. Kijowski, Phys. Rev. D 61 (2000) 024037.
[14] J. Louko, B. Whiting, J. Friedman, Phys. Rev. D 57 (1998) 2279.
[15] Y. Choquet-Bruhat, R.P. Geroch, Commun. Math. Phys. 14 (1969) 329.
[16] R.P. Geroch, J. Math. Phys. 11 (1970) 437.
[17] P.G. Bergmann, A.B. Komar, Int. J. Theor. Phys. 5 (1972) 15.
[18] A.E. Fischer, in: M. Carmeli, S.I. Fickler, L. Witten (Eds.), Relativity, Plenum, New York, 1970.
[19] D. Giulini, Phys. Rev. D 51 (1995) 5630.
[20] K.V. Kuchař, Phys. Rev. D 50 (1994) 3961.
[21] C. Teitelboim, Ann. Phys. (NY) 79 (1973) 542.
[22] P. Hájíček, Nucl. Phys. B 603 [PM] (2001) 515, Next article in this issue.

Unitary dynamics of spherical null gravitating shells
P. Hájíček
Institute for Theoretical Physics, University of Bern, Sidlerstrasse 5, CH-3012 Bern, Switzerland
Received 3 July 2000; accepted 20 March 2001
Abstract
The dynamics of a thin spherically symmetric shell of zero-rest-mass matter in its own
gravitational field is studied. A form of action principle is used that enables the reformulation of
the dynamics as motion on a fixed background manifold. A self-adjoint extension of the Hamiltonian
is obtained via the group quantization method. Operators of position and of direction of motion
are constructed. The shell is shown to avoid the singularity, to bounce and to re-expand to that
asymptotic region from which it contracted; the dynamics is, therefore, truly unitary. If a wave
packet is sufficiently narrow and/or energetic then an essential part of it can be concentrated under
its Schwarzschild radius near the bounce point but no black hole forms. The quantum Schwarzschild
horizon is a linear combination of a black and white hole apparent horizons rather than an event
horizon. 2001 Elsevier Science B.V. All rights reserved.
PACS: 04.60
Keywords: Unitarity in quantum gravity; Collapse; Black holes; Thin shells
1. Introduction
According to general relativity, all parts of a massive object definitely disappear if the
object falls through its Schwarzschild radius. The problem to be tackled in the present
paper is whether also a quantum system is, or is not, irretrievably lost if it falls under its
Schwarzschild radius.
We limit ourselves to a sufficiently simple model so that no approximations are needed
and the quantum theory can be constructed without problems. In this way, an important
question about the validity of approximative methods such as WKB expansion can also be
touched. The simplest system that can ever be invented for these aims seems to be a thin
shell with its own gravitational field made of light-like material, everything spherically
symmetric. A Hamiltonian action principle [1] for this system has been transformed to a
E-mail address: hajicek@itp.unibe.ch (P. Hájíček).
PII: S0550-3213(01)00140-7
P. Hájíček / Nuclear Physics B 603 [PM] (2001) 555-577
form suitable for quantization in [2] (foregoing paper); this will be used as a starting point.
Most of the results of the present paper have already been published in a short review [3];
here, all derivations and calculations will be described in sufficient detail, some new results
will be added, and a new interpretation of the results will be given.
The plan of the paper is as follows. In Section 2, the starting assumptions and equations
are collected. The action for the system from Ref. [1] is written down because we shall need
the form of the constraints. The same action after the transformation to a set of embedding
variables and Dirac observables is then given, the key notion of background manifold is
introduced, and the meaning of the new variables is discussed. A construction of quantum
mechanics including the position and the direction-of-motion operators is contained in
Section 3. The so-called group-theoretical quantization method is used, which is well
adapted to the problems such as the limited ranges of spectra and the construction of a
unitary dynamics. The quantum mechanics is formulated as a dynamics of the shell on the
background manifold; this enables straightforward and unique interpretations. In Section 4,
motion of wave packets is investigated. It turns out that no shell reaches the zero radius if
it starts away from it and so the singularity is avoided. The wave packets contract, bounce
and then expand, reaching the asymptotic region from which they have been sent in, so the
dynamics is unitary from the point of view of one family of observers. Some of the packets
can be sufficiently concentrated near their bouncing point so that an essential part of them
comes under the corresponding Schwarzschild radius, but no event horizon forms.
In Section 5, we consider the seemingly contradictory claims that the quantum shell can
cross its Schwarzschild radius and still re-expand. The solution of the paradox is that if the
matter creates a Schwarzschild (apparent) horizon outside then the horizon can be, even in
the classical version of the theory, of two types: white or black, that is, corresponding to the
white or black hole horizon in the Schwarzschild spacetime. The colour of the apparent
horizon in a Cauchy surface depends on the direction of motion of the shell: the horizon
is black if the shell is contracting and it is white if the shell is expanding. The quantum
horizon is a linear combination of both because the motion of the shell is. The quantum
horizon is grey, changing from mostly black to mostly white.
The semi-classical approximation fails blatantly near the bouncing point of the quantum
shell because every classical shell reaches its Schwarzschild radius, forms a black hole
and falls into the singularity. A cautious discussion of this point is given in Section 6. In
particular, the reason is explained why our results do not prevent massive quantum systems
from collapsing to black-hole-like objects.
2. Canonical formalism
In this section, we shall summarize the formulae derived in Refs. [1] (abbreviated as
LWF further on) and [2] that are needed to start the present paper.
In LWF, the spherically symmetric metric outside the shell is written in the form
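$$ds^2 = -N^2\,d\tau^2 + \Lambda^2\,(d\rho + N^\rho\,d\tau)^2 + R^2\,d\Omega^2,$$
and the shell is described by its radial coordinate $\rho = r$. The LWF action reads
$$S_0 = \int d\tau\,\Bigl[p\,\dot r + \int_0^\infty d\rho\,\bigl(P_\Lambda\dot\Lambda + P_R\dot R - H_0\bigr)\Bigr],$$
and the LWF Hamiltonian is
$$H_0 = N\mathcal{H} + N^\rho\mathcal{H}_\rho + N_\infty M,$$
where $N_\infty := \lim_{\rho\to\infty} N(\rho)$, $M$ is the ADM energy, $\mathcal{H}$ and $\mathcal{H}_\rho$ are the constraints,
$$\mathcal{H} = \frac{\Lambda P_\Lambda^2}{2R^2} - \frac{P_\Lambda P_R}{R} + \frac{RR''}{\Lambda} - \frac{RR'\Lambda'}{\Lambda^2} + \frac{R'^2}{2\Lambda} - \frac{\Lambda}{2} + \frac{\eta p}{\Lambda}\,\delta(\rho - r), \tag{1}$$
$$\mathcal{H}_\rho = P_R R' - P_\Lambda'\Lambda - p\,\delta(\rho - r); \tag{2}$$
the prime denotes the derivative with respect to $\rho$ and the dot that with respect to $\tau$.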
In Ref. [2], the variables $\eta$, $r$, $p$, $\Lambda$, $P_\Lambda$, $R$ and $P_R$ have been transformed to the
embedding variables $U(\rho)$ and $V(\rho)$, their canonical conjugates $P_U(\rho)$ and $P_V(\rho)$, and
the shell variables $u$, $v$, $p_u$ and $p_v$. The pair $(U(\rho), V(\rho))$ defines an embedding of the
half-axis into the so-called background manifold $\mathcal{M}$ that is covered by the coordinates $U$
and $V$ with the ranges
$$\frac{U + V}{2} \in (-\infty, \infty), \qquad \frac{-U + V}{2} \in (0, \infty).$$
The transformation to embedding variables is determined by a gauge condition, and there
has been a definite condition used in Ref. [2], where all details are given. The background
manifold carries then a set of metrics, one representative for each geometry. The variables
$u$ and $v$ are the coordinates of the shell trajectory in the background manifold:
$$U = u(\tau), \qquad V = v(\tau).$$
The full action that results has the form of the so-called Kuchař decomposition
$$S = \int d\tau\,(p_u\dot u + p_v\dot v - n\,p_u p_v) + \int d\tau\int_0^\infty d\rho\,(P_U\dot U + P_V\dot V - H), \tag{3}$$
where $H = N^U P_U + N^V P_V$; $n$, $N^U(\rho)$ and $N^V(\rho)$ are Lagrange multipliers.
The variables $u$, $v$, $p_u$ and $p_v$ span an extended phase space of the shell. They contain
all true degrees of freedom of the system. The phase space has non-trivial boundaries:
$$p_u \le 0, \qquad p_v \le 0, \qquad \frac{-u + v}{2} \ge 0. \tag{4}$$
The constraint surface of the extended action of the shell consists of two components:
outgoing shells for $p_v = 0$ and in-going shells for $p_u = 0$.
3. Group quantization
To quantize the system defined by the action (3), we apply the so-called group-theoretical quantization method [4]. There are three reasons for this choice. First, the
method as modified for the generally covariant systems by Rovelli [5] (see also [6] and [7])
is based on the algebra of Dirac observables of the system; dependent degrees of freedom
do not influence the definition of Hilbert space. Second, the group method has, in fact, been
invented to cope with restrictions such as Eq. (4). Finally, the method automatically leads
to self-adjoint operators representing all observables.
In particular, a unique self-adjoint extension of the Hamiltonian is obtained in this way,
and this is the reason that the dynamics is unitary. The uniqueness of the self-adjoint
extension of the Hamiltonian is truly a result of the group quantization in the sense that the
Hamiltonian operator itself, as calculated from the constraint, possesses a one-dimensional
family of such extensions.
To begin with, we have to find a complete system of Dirac observables. Let us choose
the functions $p_u$, $p_v$, $D_u := up_u$ and $D_v := vp_v$. Observe that $u$ alone is constant only
along outgoing shell trajectories ($p_u \neq 0$), and $v$ only along in-going ones ($p_v \neq 0$), but
$up_u$ and $vp_v$ are always constant. The only non-vanishing Poisson brackets are
$$\{D_u, p_u\} = p_u, \qquad \{D_v, p_v\} = p_v.$$
This Lie algebra generates a group $G_0$ of symplectic transformations of the phase space
that preserve the boundaries $p_u = 0$ and $p_v = 0$. $G_0$ is the Cartesian product of two copies
of the two-dimensional affine group $A$.
The group $A$ generated by $p_u$ and $D_u$ has three irreducible unitary representations. In the
first one, the spectrum of the operator $\hat p_u$ is $[0, \infty)$, in the second, $\hat p_u$ is the zero operator,
and in the third, the spectrum is $(-\infty, 0]$, see Ref. [8]. Thus, we must choose the third
representation; this can be described as follows (details are given in Ref. [8]).
The Hilbert space is constructed from complex functions $\psi_u(p)$ of $p \in [0, \infty)$; the scalar
product is defined by
$$(\psi_u, \phi_u) := \int_0^\infty \frac{dp}{p}\,\psi_u^*(p)\,\phi_u(p),$$
and the action of the generators $\hat p_u$ and $\hat D_u$ on smooth functions is
$$(\hat p_u\psi_u)(p) = -p\,\psi_u(p), \qquad (\hat D_u\psi_u)(p) = -ip\,\frac{d\psi_u(p)}{dp}.$$
Similarly, the group generated by $p_v$ and $D_v$ is represented on functions $\psi_v(p)$; the group
$G_0$ can, therefore, be represented on pairs $(\psi_u(p), \psi_v(p))$ of functions:
$$\hat p_u\bigl(\psi_u(p), \psi_v(p)\bigr) = \bigl(-p\,\psi_u(p), 0\bigr),$$
$$\hat p_v\bigl(\psi_u(p), \psi_v(p)\bigr) = \bigl(0, -p\,\psi_v(p)\bigr),$$
$$\hat D_u\bigl(\psi_u(p), \psi_v(p)\bigr) = \Bigl(-ip\,\frac{d\psi_u(p)}{dp}, 0\Bigr),$$
$$\hat D_v\bigl(\psi_u(p), \psi_v(p)\bigr) = \Bigl(0, -ip\,\frac{d\psi_v(p)}{dp}\Bigr).$$
This choice guarantees that the Casimir operator $\hat p_u\hat p_v$ is the zero operator on this Hilbert
space, and so the constraint is satisfied.
Handling the last inequality (4) is facilitated by the canonical transformation:
$$t = (u + v)/2, \qquad r = (-u + v)/2, \qquad p_t = p_u + p_v, \qquad p_r = -p_u + p_v. \tag{5}$$
The constraint function then becomes
$$p_u p_v = (p_t^2 - p_r^2)/4. \tag{6}$$
The positivity of $r$ is simply due to its role as the radius of the shell: it is defined as
a square root of a sum of squares of coordinates with the range $\mathbb{R}^3$. This suggests the
following trick. Let us extend the phase space so that $r \in (-\infty, +\infty)$ and let us define a
symplectic map $I$ on this extended space by $I(t, r, p_t, p_r) = (t, -r, p_t, -p_r)$. The quotient
of the extended space by $I$ is isomorphic to the original space, and we adopt it as our phase
space.
Clearly, only those functions on the extended space that are invariant with respect to $I$
will define functions on the quotient. Dirac observables of this kind are, e.g., $p_t$, $p_r^2$, the
dilatation $D := tp_t + rp_r = up_u + vp_v$ and the square of the boost $J^2 := (tp_r + rp_t)^2 =
(-up_u + vp_v)^2$. The action of the map $I$ on the functions $p_u$, $p_v$, $D_u$ and $D_v$ is:
$$Ip_uI = p_v, \qquad ID_uI = D_v, \qquad Ip_vI = p_u, \qquad ID_vI = D_u.$$
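The algebraic identities used in this passage can be checked numerically. The following is a minimal sketch, not part of the original derivation; it assumes the sign conventions $t = (u+v)/2$, $r = (-u+v)/2$, $p_t = p_u + p_v$, $p_r = -p_u + p_v$, and the helper names `to_tr`, `from_tr` and `reflect` are hypothetical.

```python
import random

def to_tr(u, v, pu, pv):
    """Map (u, v, p_u, p_v) to (t, r, p_t, p_r); assumed form of Eq. (5)."""
    return (u + v) / 2, (-u + v) / 2, pu + pv, -pu + pv

def from_tr(t, r, pt, pr):
    """Inverse of to_tr."""
    return t - r, t + r, (pt - pr) / 2, (pt + pr) / 2

def reflect(t, r, pt, pr):
    """The map I: flip the sign of r and of p_r."""
    return t, -r, pt, -pr

random.seed(0)
max_err = 0.0
for _ in range(1000):
    u, v, pu, pv = (random.uniform(-5, 5) for _ in range(4))
    t, r, pt, pr = to_tr(u, v, pu, pv)
    # Constraint function: p_u p_v = (p_t^2 - p_r^2) / 4
    max_err = max(max_err, abs(pu * pv - (pt**2 - pr**2) / 4))
    # Dilatation and boost expressed in both sets of variables
    max_err = max(max_err, abs((t * pt + r * pr) - (u * pu + v * pv)))
    max_err = max(max_err, abs((t * pr + r * pt) - (-u * pu + v * pv)))
    # I exchanges (u, p_u) with (v, p_v), and therefore D_u with D_v
    u2, v2, pu2, pv2 = from_tr(*reflect(t, r, pt, pr))
    max_err = max(max_err, abs(u2 - v), abs(v2 - u), abs(pu2 - pv), abs(pv2 - pu))

print(max_err)  # should be at the level of floating-point rounding
```

Random sampling is used only as a quick sanity check; each identity also follows from a one-line expansion of the products.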
There are only two choices for $\hat I$ that preserve these relations in the quantum theory:
$$\hat I\bigl(\psi_u(p), \psi_v(p)\bigr) = \pm\bigl(\psi_v(p), \psi_u(p)\bigr).$$
We choose the plus sign; it is easy to see that the other choice leads to an equivalent theory.
Observe that the resulting representation of the group $G := G_0 \otimes (\mathrm{id}, I)$ is irreducible.
There are two eigenspaces of $\hat I$: one to the eigenvalue $+1$, consisting of the pairs
with $\psi_u(p) = \psi_v(p)$, the other to the eigenvalue $-1$, containing the pairs with $\psi_u(p) =
-\psi_v(p)$. If we choose one of these eigenspaces as our final Hilbert space, we obtain
a representation of the classical algebra on the quotient space. Again, the two possible
choices give equivalent theories. The final result can easily be brought to the following
form. The states are determined by complex functions $\phi(p)$ on $\mathbb{R}_+$; the scalar product
$(\phi, \psi)$ is
$$(\phi, \psi) = \int_0^\infty \frac{dp}{p}\,\phi^*(p)\,\psi(p);$$
let us denote the corresponding Hilbert space by $\mathcal{K}$. The representatives of the above
algebra are
$$(\hat p_t\phi)(p) = -p\,\phi(p), \qquad (\hat p_r^2\phi)(p) = p^2\phi(p),$$
$$(\hat D\phi)(p) = -ip\,\frac{d\phi(p)}{dp}, \qquad (\hat J^2\phi)(p) = -p\,\frac{d\phi(p)}{dp} - p^2\,\frac{d^2\phi(p)}{dp^2}.$$
The next question is that of time evolution. Time evolution of a generally covariant
system described by Dirac observables may seem self-contradictory or gauge dependent.
Here, we apply the approach that has been worked out in [6] and [9] using the symmetry
group of time shifts found in Section 2 of Ref. [2], which is generated by the function $p_t$.
The operator $-\hat p_t$ has the meaning of the total energy $M$ of the system. We observe that it is
a self-adjoint operator with a positive spectrum and that it is diagonal in our representation.
The parameter $t$ of the unitary group $\hat U(t)$ that is generated by $-\hat p_t$ is easy to interpret:
$t$ represents the quantity that is conjugated to $p_t$ in the classical theory and this is given by
Eq. (5). Hence, $\hat U(t)$ describes the evolution of the shell states between the levels of the
function $(U + V)/2$ on $\mathcal{M}$.
The missing piece of information of where the shell is on $\mathcal{M}$ is carried by the quantity $r$
of Eq. (5). We try to define the corresponding position operator in three steps.
First, we observe that $r$ itself is not a Dirac observable, but the boost $J$ is, and that the
value of $J$ at the surface $t = 0$ coincides with $rp_t$. It follows that the meaning of the Dirac
observable $Jp_t^{-1}$ is the position at the time $t = 0$. This is in a nice correspondence with
the Newton–Wigner construction on one hand, and with the so-called evolving constants
of motion by Rovelli [10] on the other.
Second, we try to make $Jp_t^{-1}$ into a symmetric operator on our Hilbert space. As it is
odd with respect to $I$, we have to square it. Let us then choose the following factor ordering:
$$\hat r^2 := \frac{1}{\sqrt p}\,\hat J\,\frac{1}{p}\,\hat J\,\frac{1}{\sqrt p} = -\sqrt p\,\frac{d^2}{dp^2}\,\frac{1}{\sqrt p}. \tag{7}$$
Other choices are possible; the above one makes $\hat r^2$ essentially a Laplacian and this
simplifies the subsequent mathematics. Indeed, we can map $\mathcal{K}$ unitarily to $L^2(\mathbb{R}_+)$ by
sending each function $\psi(p) \in \mathcal{K}$ to $\tilde\psi(p) \in L^2(\mathbb{R}_+)$ as follows:
$$\tilde\psi(p) = \frac{1}{\sqrt p}\,\psi(p).$$
Then, the operator of squared position $\tilde r^2$ on $L^2(\mathbb{R}_+)$ corresponding to $\hat r^2$ is
$$\tilde r^2\tilde\psi(p) = \frac{1}{\sqrt p}\,\hat r^2\bigl(\sqrt p\,\tilde\psi(p)\bigr) = -\frac{d^2\tilde\psi(p)}{dp^2} = -\tilde\Delta\tilde\psi(p).$$
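Third, we have to extend the operator $\tilde r^2$ to a self-adjoint one. The Laplacian on the
half-axis possesses a one-dimensional family of such extensions [11]. The parameter is
$\theta \in [0, \pi)$ and the domain of $\tilde\Delta_\theta$ is defined by the boundary condition at zero:
$$\tilde\psi(0)\sin\theta + \tilde\psi'(0)\cos\theta = 0.$$
The complete system of normalized eigenfunctions of $\tilde\Delta_\theta$ is given by:
$$\tilde\psi_\theta(r, p) = \sqrt{\frac{2}{\pi}}\,\frac{r\cos\theta\cos rp - \sin\theta\sin rp}{\sqrt{r^2\cos^2\theta + \sin^2\theta}};$$
if $\theta \in (0, \pi/2)$, there is one additional bound state,
$$\tilde\psi_\theta(b, p) = \sqrt{2\tan\theta}\,\exp(-p\tan\theta),$$
so that
$$-\tilde\Delta_\theta\tilde\psi_\theta(r, p) = r^2\tilde\psi_\theta(r, p), \qquad -\tilde\Delta_\theta\tilde\psi_\theta(b, p) = -\tan^2\theta\,\tilde\psi_\theta(b, p).$$
The corresponding eigenfunctions of the operator $\hat r^2_\theta$ are:
$$\psi_\theta(r, p) = \sqrt{\frac{2p}{\pi}}\,\frac{r\cos\theta\cos rp - \sin\theta\sin rp}{\sqrt{r^2\cos^2\theta + \sin^2\theta}},$$
and we restrict ourselves to $\theta \in [\pi/2, \pi]$, so that there are no bound states and the operator
$\hat r_\theta$ is self-adjoint.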
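As a plausibility check of the one-parameter family of extensions, the sketch below verifies by finite differences that the scattering eigenfunctions satisfy both the boundary condition at $p = 0$ and the eigenvalue equation. It assumes the reading $\tilde\psi(0)\sin\theta + \tilde\psi'(0)\cos\theta = 0$ and $\tilde\psi_\theta \propto r\cos\theta\cos rp - \sin\theta\sin rp$; the function names are hypothetical and the step sizes are illustrative.

```python
import math

def psi_theta(theta, r, p):
    """Scattering eigenfunction of -d^2/dp^2 on the half-axis p >= 0."""
    num = r * math.cos(theta) * math.cos(r * p) - math.sin(theta) * math.sin(r * p)
    den = math.sqrt(r**2 * math.cos(theta)**2 + math.sin(theta)**2)
    return math.sqrt(2 / math.pi) * num / den

def boundary_residual(theta, r, h=1e-5):
    """psi(0) sin(theta) + psi'(0) cos(theta), with a one-sided difference for psi'(0)."""
    f0, f1, f2 = (psi_theta(theta, r, k * h) for k in range(3))
    dpsi0 = (-3 * f0 + 4 * f1 - f2) / (2 * h)
    return f0 * math.sin(theta) + dpsi0 * math.cos(theta)

def eigen_residual(theta, r, p, h=1e-3):
    """-psi'' - r^2 psi at the point p, with a central second difference."""
    f = lambda x: psi_theta(theta, r, x)
    d2 = (f(p + h) - 2 * f(p) + f(p - h)) / h**2
    return -d2 - r**2 * f(p)

thetas = (0.3, 1.2, math.pi / 2, 2.5)
radii = (0.5, 2.0)
worst_bc = max(abs(boundary_residual(th, r)) for th in thetas for r in radii)
worst_ev = max(abs(eigen_residual(th, r, 1.7)) for th in thetas for r in radii)
print(worst_bc, worst_ev)  # both residuals should be small
```

The residuals are limited by the finite-difference truncation error, not by the eigenfunctions themselves.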
To restrict the choice, we apply the idea of Newton and Wigner. First, the subgroup
$G_{t=0}$ that preserves the surface $t = 0$ is to be found. This is, in our case, $\hat U_D(\alpha)$ generated
by the dilatation $D$. Then, in the quantum theory, the eigenfunctions of the position at
$t = 0$ are to transform properly under this group; this means that the eigenfunction for
the eigenvalue $r$ is to be transformed to that for the eigenvalue $\hat U_D(\alpha)r$, for each $\alpha$. The
dilatation group generated by $\hat D$ acts on a wave function $\psi(p)$ as follows:
$$\psi(p) \mapsto \hat U_D(\alpha)\psi(p) = \psi(e^{\alpha}p),$$
where $\hat U_D(\alpha)$ is an element of the group parameterized by $\alpha$. Applying $\hat U_D(\alpha)$ to $\psi_\theta(r, p)$
yields
$$\hat U_D(\alpha)\psi_\theta(r, p) = e^{\alpha/2}\sqrt{\frac{2p}{\pi}}\,\frac{r\cos\theta\cos(e^{\alpha}rp) - \sin\theta\sin(e^{\alpha}rp)}{\sqrt{r^2\cos^2\theta + \sin^2\theta}}.$$
The factor $e^{\alpha/2}$ in the resulting functions of $p$ keeps the system $\delta$-normalized.
Let $\theta = \pi/2$; then
$$\hat U_D(\alpha)\psi_{\pi/2}(r, p) = e^{\alpha/2}\psi_{\pi/2}(e^{\alpha}r, p).$$
Similarly, for $\theta = \pi$,
$$\hat U_D(\alpha)\psi_{\pi}(r, p) = e^{\alpha/2}\psi_{\pi}(e^{\alpha}r, p),$$
but such relation can hold for no other $\theta$ from the interval $[\pi/2, \pi]$, because of the form of
the eigenfunction dependence on $r$. Now, Newton and Wigner require that
$$\hat U_D(\alpha)\psi(r, p) = e^{\alpha/2}\psi(e^{\alpha}r, p).$$
Then all values of $\theta$ except for $\theta = \pi/2$ and $\theta = \pi$ are excluded.
We have, therefore, only two choices for the self-adjoint extension of $\hat r^2$:
$$\psi(r, p) := \sqrt{\frac{2p}{\pi}}\,\sin rp, \qquad r \ge 0, \tag{8}$$
and
$$\psi(r, p) := \sqrt{\frac{2p}{\pi}}\,\cos rp, \qquad r \ge 0.$$
Let us select the first set, Eq. (8); by that, the construction of a position operator is finished.
The construction contains a lot of choice: the large factor-ordering freedom, and the
freedom of choosing the self-adjoint extension. One can react to this ambiguity in two
ways.
The first is to ask how the different choices influence the results. It seems plausible that
the qualitative, rough properties of the quantum system will be the same for all possible
choices. We hope (provisionally) that this is true.
The second question to ask is how the position is, in fact, measured in practice. This
question hits the crux of the problem. Indeed, the Newton–Wigner construction may be
formally elegant but, to my knowledge, nobody has managed to describe the corresponding
measurement. If we search for methods of how the position of various constituents in a
microscopic system is measured, we find the scattering method to dominate. For that it
is necessary to use a particular coupling the system under study, a crystal, say, has with
another agent, X-rays, say. One has to send the X-rays onto the crystal and to view what
comes out.
It seems, therefore, that the following approach would be more reliable than attempts at
a formal definition of a position operator of the shell. One can try, for example, to couple
the shell to some field, the quanta of which could be emitted by the shell on its way down
and up. The quanta will, or will not reach the asymptotic observers and their properties
at infinity might reveal something of what is going on with the shell. This is a future
project because it will be mathematically more difficult than our provisional attempt with
the position operator.
Another observable that we shall need is $\hat\eta$; this is to tell us the direction of motion of
the shell at the time zero, having the eigenvalues $+1$ for all purely outgoing shell states,
and $-1$ for the in-going ones. In fact, in the classical theory, $\eta = \operatorname{sgn} p_r$, but $\hat p_r$ does not
act as an operator on the Hilbert space $\mathcal{K}$, only $\hat p_r^2$. Hence, we need the following trick.
Consider the classical dilatation generator $D = tp_t + rp_r$. It is a Dirac observable; at
$t = 0$, its value is $rp_r$. Thus, for positive $r$, the sign of $D$ at $t = 0$ has the required value.
On the quotient space, the values at negative $r$ correspond to the $I$-mapped states with
positive $r$, and, as $D$ is $I$-invariant, the relation of the sign to the direction of motion is
again valid. Hence, we have the relation:
$$\operatorname{sgn} D = \eta_{t=0}.$$
The normalized eigenfunctions $\psi_a(p)$ of the operator $\hat D$ are solutions of the differential
equation:
$$\hat D\psi_a(p) = a\,\psi_a(p).$$
The corresponding normalized system is given by
$$\psi_a(p) = \frac{1}{\sqrt{2\pi}}\,p^{ia}.$$
Hence, the kernels $P_\pm(p, p')$ of the projectors $\hat P_\pm$ on the purely out- or in-going states are:
$$P_+(p, p') = \int_{-\infty}^{0} da\,\psi_a(p)\,\frac{\psi_a^*(p')}{p'}, \qquad P_-(p, p') = \int_{0}^{\infty} da\,\psi_a(p)\,\frac{\psi_a^*(p')}{p'},$$
so that
$$(\hat\eta\psi)(p) = \int_0^\infty dp'\,\bigl[P_+(p, p') - P_-(p, p')\bigr]\psi(p').$$
This finishes our construction of the shell quantum mechanics.
4. Motion of wave packets

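We shall work with the family of wave packets on the energy half-axis that are defined
by
$$\psi_{\kappa\lambda}(p) := \frac{(2\lambda)^{\kappa+1/2}}{\sqrt{(2\kappa)!}}\,p^{\kappa+1/2}e^{-\lambda p},$$
where $\kappa$ is a positive integer and $\lambda$ is a positive number with dimension of length. Using
the formula
$$\int_0^\infty dp\,p^n e^{-\xi p} = \frac{n!}{\xi^{n+1}}, \tag{9}$$
which is valid for all non-negative integers $n$ and for all complex $\xi$ that have a positive real
part, we easily show that the wave packets are normalized,
$$\int_0^\infty \frac{dp}{p}\,\psi^2_{\kappa\lambda}(p) = 1.$$
The expected energy,
$$\langle M\rangle_{\kappa\lambda} := \int_0^\infty \frac{dp}{p}\,p\,\psi^2_{\kappa\lambda}(p),$$
of the packet can be calculated by the same formula with the simple result
$$\langle M\rangle_{\kappa\lambda} = \frac{\kappa + 1/2}{\lambda}.$$
The (energy) width of the packet can be represented by the mean quadratic deviation,
$\Delta M_{\kappa\lambda}$, which is
$$\Delta M_{\kappa\lambda} = \frac{\sqrt{2\kappa + 1}}{2\lambda}.$$
Hence, by choosing $\kappa$ and $\lambda$ suitably, we can approximate any required energy and width
arbitrarily closely.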
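The three statements above (unit norm, mean energy $(\kappa + 1/2)/\lambda$, width $\sqrt{2\kappa+1}/(2\lambda)$) can be reproduced by direct quadrature with the measure $dp/p$. This is an illustrative sketch only; the cutoff `p_max`, the Simpson helper and the sample values of $\kappa$ and $\lambda$ are assumptions made for the example.

```python
import math

def psi(p, kappa, lam):
    """Wave packet psi_{kappa,lambda}(p) on the energy half-axis (real-valued)."""
    norm = (2 * lam) ** (kappa + 0.5) / math.sqrt(math.factorial(2 * kappa))
    return norm * p ** (kappa + 0.5) * math.exp(-lam * p)

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + (2 * k - 1) * h) for k in range(1, n // 2 + 1))
    s += 2 * sum(f(a + 2 * k * h) for k in range(1, n // 2))
    return s * h / 3

def moments(kappa, lam):
    """Norm, mean energy and energy width, all taken with the measure dp/p."""
    p_max = 60.0 / lam  # the integrand is negligible beyond this cutoff
    w = lambda p: psi(p, kappa, lam) ** 2 / p if p > 0 else 0.0
    norm = simpson(w, 0.0, p_max)
    mean = simpson(lambda p: p * w(p), 0.0, p_max)
    second = simpson(lambda p: p * p * w(p), 0.0, p_max)
    return norm, mean, math.sqrt(second - mean ** 2)

kappa, lam = 3, 0.5
norm, mean, width = moments(kappa, lam)
print(norm, mean, (kappa + 0.5) / lam)
print(width, math.sqrt(2 * kappa + 1) / (2 * lam))
```

Increasing $\kappa$ at fixed $\lambda$ raises the mean energy linearly while the width grows only like $\sqrt{\kappa}$, which is why the two parameters suffice to tune energy and width independently.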
The time evolution of the packet is generated by $-\hat p_t$:
$$\psi_{\kappa\lambda}(t, p) = \psi_{\kappa\lambda}(p)\,e^{-ipt}.$$
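Let us calculate the corresponding wave function $\Psi_{\kappa\lambda}(t, r)$ in the $r$-representation,
$$\Psi_{\kappa\lambda}(t, r) := \int_0^\infty \frac{dp}{p}\,\psi_{\kappa\lambda}(t, p)\,\psi(r, p),$$
where the functions $\psi(r, p)$ are defined by Eq. (8). Formula (9) then yields:
$$\Psi_{\kappa\lambda}(t, r) = \frac{1}{\sqrt{2\pi}}\,\frac{\kappa!\,(2\lambda)^{\kappa+1/2}}{\sqrt{(2\kappa)!}}\left[\frac{i}{(\lambda + it + ir)^{\kappa+1}} - \frac{i}{(\lambda + it - ir)^{\kappa+1}}\right]. \tag{10}$$
It follows immediately that
$$\lim_{r\to 0}|\Psi_{\kappa\lambda}(t, r)|^2 = 0.$$
The scalar product measure for the $r$-representation is just $dr$ because the eigenfunctions (8) are $\delta$-normalized, so the probability to find the shell between $r$ and $r + dr$ is
$|\Psi_{\kappa\lambda}(t, r)|^2\,dr$.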
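The closed form of the packet in the $r$-representation can be compared with a brute-force evaluation of the defining integral. The sketch below assumes the time dependence $e^{-ipt}$ and the eigenfunctions $\sqrt{2p/\pi}\,\sin rp$; the quadrature parameters and function names are hypothetical. It also checks that the packet vanishes at $r = 0$ and stays normalized with the measure $dr$.

```python
import cmath
import math

def psi_r(t, r, kappa, lam):
    """Closed form of the packet in the r-representation."""
    pref = math.factorial(kappa) * (2 * lam) ** (kappa + 0.5)
    pref /= math.sqrt(2 * math.pi * math.factorial(2 * kappa))
    a = 1j / (lam + 1j * t + 1j * r) ** (kappa + 1)
    b = 1j / (lam + 1j * t - 1j * r) ** (kappa + 1)
    return pref * (a - b)

def psi_r_numeric(t, r, kappa, lam, n=40000):
    """Midpoint-rule integral over p of psi(t, p) psi(r, p) / p."""
    p_max = 60.0 / lam  # cutoff where exp(-lam p) is negligible
    h = p_max / n
    c = (2 * lam) ** (kappa + 0.5) / math.sqrt(math.factorial(2 * kappa))
    total = 0j
    for k in range(n):
        p = (k + 0.5) * h
        packet = c * p ** (kappa + 0.5) * cmath.exp(-(lam + 1j * t) * p)
        total += packet * math.sqrt(2 * p / math.pi) * math.sin(r * p) / p
    return total * h

def norm_r(t, kappa, lam, n=20000):
    """Integral of |Psi|^2 dr over r in (0, inf), substituting r = lam tan(theta)."""
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * h
        r = lam * math.tan(th)
        total += abs(psi_r(t, r, kappa, lam)) ** 2 * lam / math.cos(th) ** 2
    return total * h

kappa, lam, t = 2, 1.0, 0.7
mismatch = abs(psi_r(t, 1.3, kappa, lam) - psi_r_numeric(t, 1.3, kappa, lam))
at_center = abs(psi_r(t, 0.0, kappa, lam))
total_probability = norm_r(t, kappa, lam)
print(mismatch, at_center, total_probability)
```

The vanishing at $r = 0$ is exact in the closed form because the two bracketed terms coincide there.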
Our first important result is, therefore, that the wave packets start away from the center
$r = 0$ and then keep away from it during the whole evolution. This can be interpreted
as the absence of singularity in the quantum theory: no part of the packet is squeezed up to
a point, unlike the shell in the classical theory.
Observe that the equation $\Psi_{\kappa\lambda}(t, 0) = 0$ is not a result of a boundary condition imposed
on the wave function. It is a result of the unitary dynamics. The nature of the question that
we are studying requires that the wave packets start in the asymptotic region so that their
wave function vanishes at $r = 0$ for $t \to -\infty$; this is the only condition put in by hand.
The fact that the dynamics preserves this equation is the property of the unique self-adjoint
extension of the Hamiltonian operator.
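A more tedious calculation is needed to obtain the time dependence $\langle r\rangle_{\kappa\lambda}(t)$ of the
expected radius of the shell,
$$\langle r\rangle_{\kappa\lambda}(t) := \int_0^\infty dr\,r\,|\Psi_{\kappa\lambda}(t, r)|^2. \tag{11}$$
Let first $\kappa = 0$. The wave function of the packet then is
$$\Psi_{0\lambda}(t, r) = 2\sqrt{\frac{\lambda}{\pi}}\,\frac{r}{r^2 + (\lambda + it)^2},$$
so the expectation value of $r$ is
$$\langle r\rangle_{0\lambda}(t) = \frac{4\lambda}{\pi}\int_0^\infty dr\,\frac{r^3}{(r^2 + \lambda^2 - t^2)^2 + 4\lambda^2t^2}.$$
This integral diverges logarithmically, so
$$\langle r\rangle_{0\lambda}(t) = \infty.$$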
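Let $\kappa \neq 0$. The substitution of Eq. (10) into (11) leads to:
$$\langle r\rangle_{\kappa\lambda}(t) = \frac{1}{2\pi}\,\frac{(\kappa!)^2(2\lambda)^{2\kappa+1}}{(2\kappa)!}\,\bigl[I_{\kappa\lambda}(t) - J_{\kappa\lambda}(t)\bigr],$$
where
$$I_{\kappa\lambda}(t) = \int_0^\infty r\,dr\left[\frac{1}{[(r + t)^2 + \lambda^2]^{\kappa+1}} + \frac{1}{[(r - t)^2 + \lambda^2]^{\kappa+1}}\right],$$
$$J_{\kappa\lambda}(t) = \int_0^\infty r\,dr\left[\frac{1}{[(\lambda - ir)^2 + t^2]^{\kappa+1}} + \frac{1}{[(\lambda + ir)^2 + t^2]^{\kappa+1}}\right].$$
The first integral can be brought by elementary methods to the following form:
$$I_{\kappa\lambda}(t) = \frac{1}{\kappa}\,\frac{1}{(t^2 + \lambda^2)^{\kappa}} + t\int_{-t}^{t}\frac{ds}{(s^2 + \lambda^2)^{\kappa+1}}.$$
Let us calculate the second integral. We obtain after a simple rearrangement:
$$J_{\kappa\lambda}(t) = (-1)^{\kappa+1}\int_0^\infty dr\left[\frac{r + i\lambda}{[(r + i\lambda)^2 - t^2]^{\kappa+1}} - \frac{i\lambda}{[(r + i\lambda)^2 - t^2]^{\kappa+1}} + \frac{r - i\lambda}{[(r - i\lambda)^2 - t^2]^{\kappa+1}} + \frac{i\lambda}{[(r - i\lambda)^2 - t^2]^{\kappa+1}}\right].$$
This suggests the introduction of integration contours $C_1$ defined in the complex plane by
$z = r + i\lambda$ for $r \in (0, \infty)$, and $C_2$ by $z = r - i\lambda$, $r \in (0, \infty)$. Then $J_{\kappa\lambda}(t)$ can be written as
follows:
$$J_{\kappa\lambda}(t) = (-1)^{\kappa+1}\int_{C_1} dz\left[\frac{z}{(z^2 - t^2)^{\kappa+1}} - \frac{i\lambda}{(z^2 - t^2)^{\kappa+1}}\right] + (-1)^{\kappa+1}\int_{C_2} dz\left[\frac{z}{(z^2 - t^2)^{\kappa+1}} + \frac{i\lambda}{(z^2 - t^2)^{\kappa+1}}\right].$$
The integrals of the first terms in the square brackets can be done immediately:
$$J_{\kappa\lambda}(t) = -\frac{1}{\kappa}\,\frac{1}{(\lambda^2 + t^2)^{\kappa}} + (-1)^{\kappa+1}\,i\lambda\int_{-C_1 + C_2}\frac{dz}{(z^2 - t^2)^{\kappa+1}}. \tag{12}$$
We obtain as the final result:
$$\langle r\rangle_{\kappa\lambda}(t) = \frac{1}{2\pi}\,\frac{(\kappa!)^2(2\lambda)^{2\kappa+1}}{(2\kappa)!}\left[\frac{2}{\kappa}\,\frac{1}{(\lambda^2 + t^2)^{\kappa}} + t\int_{-t}^{t}\frac{dx}{(x^2 + \lambda^2)^{\kappa+1}} + i\lambda(-1)^{\kappa+1}\int_{C_1}\frac{dz}{(z^2 - t^2)^{\kappa+1}} - i\lambda(-1)^{\kappa+1}\int_{C_2}\frac{dz}{(z^2 - t^2)^{\kappa+1}}\right]. \tag{13}$$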
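In fact, the R.H. side diverges for $\kappa = 0$ so, in this sense, this formula can be considered as
completely general, i.e., valid for all $\kappa$ and $t$.
Let us study some properties of the function $\langle r\rangle_{\kappa\lambda}(t)$. Eq. (13) implies that
$$\langle r\rangle_{\kappa\lambda}(-t) = \langle r\rangle_{\kappa\lambda}(t),$$
so the average motion of the packet is symmetric under time reversal. Eq. (13) is also
suitable for the calculation of the expansions about the points $t = 0$ and $t = \pm\infty$. Consider
first the point $t = 0$. Expanding the first term in the square bracket is easy:
$$\frac{1}{(\lambda^2 + t^2)^{\kappa}} = \frac{1}{\lambda^{2\kappa}}\sum_{k=0}^{\infty}(-1)^k\binom{\kappa + k - 1}{k}\left(\frac{t}{\lambda}\right)^{2k}.$$
The series on the R.H. side converges for $|t| < \lambda$.
To expand the next term, we expand the integrand in the powers of $x/\lambda$; the series
converges for $|t| < \lambda$. Integrating term by term yields:
$$t\int_{-t}^{t}\frac{dx}{(x^2 + \lambda^2)^{\kappa+1}} = \frac{2}{\lambda^{2\kappa}}\sum_{k=0}^{\infty}\frac{(-1)^k}{2k + 1}\binom{\kappa + k}{k}\left(\frac{t}{\lambda}\right)^{2k+2}.$$
Again, this series converges for $|t| < \lambda$.
A similar method can be applied to the remaining integrals:
$$\frac{1}{(z^2 - t^2)^{\kappa+1}} = \frac{1}{z^{2\kappa+2}}\sum_{k=0}^{\infty}\binom{\kappa + k}{k}\left(\frac{t}{z}\right)^{2k}.$$
The convergence is granted for $|t| < |z|$. As the minimal $|z|$ along both contours is $\lambda$, the
expansion is always valid for $|t| < \lambda$. Then
$$i\lambda(-1)^{\kappa+1}\int_{C_1 - C_2}\frac{dz}{(z^2 - t^2)^{\kappa+1}} = -\frac{2}{\lambda^{2\kappa}}\sum_{k=0}^{\infty}\frac{(-1)^k}{2\kappa + 2k + 1}\binom{\kappa + k}{k}\left(\frac{t}{\lambda}\right)^{2k}.$$
Collecting all terms, we obtain the expansion around $t = 0$,
$$\langle r\rangle_{\kappa\lambda}(t) = \frac{(\kappa!)^2\,2^{2\kappa+1}}{\pi\,(2\kappa)!}\,\lambda\left[\frac{\kappa + 1}{\kappa(2\kappa + 1)} + (\kappa + 1)\sum_{k=1}^{\infty}\frac{(-1)^{k+1}}{(2\kappa + 2k + 1)\,k\,(2k - 1)}\binom{\kappa + k - 1}{k - 1}\left(\frac{t}{\lambda}\right)^{2k}\right], \tag{14}$$
and the equation holds for $|t| < \lambda$. As the $k = 1$ term in Eq. (14) is positive, there is a
minimal expected radius $\langle r\rangle_{\kappa\lambda}(0)$ at $t = 0$,
$$\langle r\rangle_{\kappa\lambda}(0) = \frac{1}{\pi}\,\frac{2^{2\kappa}(\kappa!)^2}{\kappa\,(2\kappa)!}\,\frac{\kappa + 1}{\kappa + 1/2}\,\lambda > 0. \tag{15}$$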
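The expected radius near the bounce can be cross-checked against a direct numerical integration of $r\,|\Psi_{\kappa\lambda}(t, r)|^2$. The sketch below is a hedged illustration: the closed form at $t = 0$ and the power series in $(t/\lambda)^2$ are coded as reconstructed here, $\lambda/(\pi\kappa(\kappa + 1/2))$ times $2^{2\kappa}(\kappa!)^2(\kappa+1)/(2\kappa)!$ for the former, and the function names and quadrature settings are assumptions.

```python
import math

def density(t, r, kappa, lam):
    """|Psi_{kappa,lambda}(t, r)|^2 from the closed form of the packet."""
    c2 = math.factorial(kappa) ** 2 * (2 * lam) ** (2 * kappa + 1)
    c2 /= 2 * math.pi * math.factorial(2 * kappa)
    a = 1 / (lam + 1j * (t + r)) ** (kappa + 1)
    b = 1 / (lam + 1j * (t - r)) ** (kappa + 1)
    return c2 * abs(a - b) ** 2

def mean_radius(t, kappa, lam, n=40000):
    """Integral of r |Psi|^2 dr over (0, inf), substituting r = lam tan(theta)."""
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * h
        r = lam * math.tan(th)
        total += r * density(t, r, kappa, lam) * lam / math.cos(th) ** 2
    return total * h

def radius_at_bounce(kappa, lam):
    """Closed form for the expected radius at t = 0 (needs kappa >= 1)."""
    ratio = 4 ** kappa * math.factorial(kappa) ** 2 / math.factorial(2 * kappa)
    return lam * ratio * (kappa + 1) / (math.pi * kappa * (kappa + 0.5))

def radius_series(t, kappa, lam, terms=40):
    """Truncated power series in (t/lam)^2 around the bounce, for |t| < lam."""
    pref = lam * math.factorial(kappa) ** 2 * 2 ** (2 * kappa + 1)
    pref /= math.pi * math.factorial(2 * kappa)
    s = (kappa + 1) / (kappa * (2 * kappa + 1))
    for k in range(1, terms):
        coeff = math.comb(kappa + k - 1, k - 1)
        coeff /= (2 * kappa + 2 * k + 1) * k * (2 * k - 1)
        s += (kappa + 1) * (-1) ** (k + 1) * coeff * (t / lam) ** (2 * k)
    return pref * s

kappa, lam = 2, 1.0
r0_numeric = mean_radius(0.0, kappa, lam)
r0_formula = radius_at_bounce(kappa, lam)
r_small_t = mean_radius(0.3, kappa, lam)
r_series = radius_series(0.3, kappa, lam)
print(r0_numeric, r0_formula, r_small_t, r_series)
```

For $\kappa = 2$ and $\lambda = 1$ the closed form evaluates to $1.6/\pi$, and the value at $t = 0.3$ is slightly larger, as expected for a minimum at the bounce.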
For a large , the minimum at t = 0 may be only local and/or the curve may oscillate for
t (, ).
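Let us turn to the asymptotics $t \to \pm\infty$. It is sufficient to consider the case $t \to \infty$
because the other one is obtained by $t \to -t$. The first term in the square brackets in
Eq. (13) is clearly of the order $O(t^{-2\kappa})$ and it is not difficult to convince oneself that the
last two terms are both of the order $O(t^{-2\kappa-1})$.
The second term needs more care. First, we use the relation
$$\int_{-t}^{t}\frac{dx}{(x^2 + \lambda^2)^{\kappa+1}} = \frac{(-1)^{\kappa}}{\kappa!}\left(\frac{1}{2\lambda}\frac{d}{d\lambda}\right)^{\kappa}\int_{-t}^{t}\frac{dx}{x^2 + \lambda^2}$$
so that we obtain
$$t\int_{-t}^{t}\frac{dx}{(x^2 + \lambda^2)^{\kappa+1}} = \frac{(-1)^{\kappa}}{\kappa!}\left(\frac{1}{2\lambda}\frac{d}{d\lambda}\right)^{\kappa}\left[\frac{2t}{\lambda}\arctan\frac{t}{\lambda}\right].$$
For $t/\lambda > 0$, the following formula holds:
$$\arctan\frac{t}{\lambda} = \frac{\pi}{2} - \arctan\frac{\lambda}{t}.$$
Using it and expanding the function $\arctan(\lambda/t)$ around zero leads to
$$t\int_{-t}^{t}\frac{dx}{(x^2 + \lambda^2)^{\kappa+1}} = \frac{2(-1)^{\kappa}}{\kappa!}\left[\frac{\pi t}{2}\left(\frac{d}{d(\lambda^2)}\right)^{\kappa}\frac{1}{\lambda} - \left(\frac{d}{d(\lambda^2)}\right)^{\kappa}\sum_{k=0}^{\infty}\frac{(-1)^k}{2k + 1}\left(\frac{\lambda}{t}\right)^{2k}\right].$$
The series converges for $t > \lambda$ and can be differentiated term by term in this interval. The
first non-zero term of the differentiated series comes only from $k = \kappa$ and it has the value
$$\frac{(-1)^{\kappa}\,\kappa!}{2\kappa + 1}\,\frac{1}{t^{2\kappa}}.$$
It holds also
$$\left(\frac{d}{d(\lambda^2)}\right)^{\kappa}\frac{1}{\lambda} = (-1)^{\kappa}\,\frac{(2\kappa - 1)!!}{2^{\kappa}\lambda^{2\kappa+1}}.$$
Hence,
$$t\int_{-t}^{t}\frac{dx}{(x^2 + \lambda^2)^{\kappa+1}} = \frac{\pi\,(2\kappa - 1)!!}{2^{\kappa}\lambda^{2\kappa+1}\kappa!}\,t + O\bigl(t^{-2\kappa}\bigr).$$
Substituting this into Eq. (13) and using the symmetry $t \to -t$, we obtain for both cases
$t \to \pm\infty$:
$$\langle r\rangle_{\kappa\lambda}(t) \approx |t| + O\bigl(t^{-2\kappa}\bigr). \tag{16}$$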
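That the coefficient of $|t|$ in the asymptotic expected radius is exactly one rests on the identity $(2\kappa)! = 2^{\kappa}\kappa!\,(2\kappa - 1)!!$. The sketch below checks this with exact rational arithmetic; it assumes the normalization prefactor $(\kappa!)^2 2^{2\kappa}\lambda^{2\kappa+1}/(\pi(2\kappa)!)$ and the leading coefficient $\pi(2\kappa-1)!!/(2^{\kappa}\kappa!\,\lambda^{2\kappa+1})$ of the second term, and the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def double_factorial_odd(kappa):
    """(2 kappa - 1)!! = 1 * 3 * ... * (2 kappa - 1); the empty product is 1."""
    out = 1
    for m in range(1, 2 * kappa, 2):
        out *= m
    return out

def slope_at_infinity(kappa):
    """Coefficient of |t| in the large-|t| expected radius, as an exact rational.

    The factors lambda^(2 kappa + 1) / pi of the prefactor and
    pi / lambda^(2 kappa + 1) of the leading term cancel and are omitted.
    """
    prefactor = Fraction(factorial(kappa) ** 2 * 2 ** (2 * kappa), factorial(2 * kappa))
    leading = Fraction(double_factorial_odd(kappa), 2 ** kappa * factorial(kappa))
    return prefactor * leading

slopes = [slope_at_infinity(k) for k in range(1, 8)]
print(slopes)
```

A unit slope means that, far from the bounce, the mean radius moves at the speed of light, as a classical null shell would.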
A further interesting question about the motion of the packets is about the portion of
a given packet that is purely in-going at a given time $t$. The portion is
given by $\|\hat P_-\psi_{\kappa\lambda}\|^2$, where $\hat P_-$ is the projector defined in Section 3. Let us calculate this
quantity.
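If we write out the projector kernel and make some simple rearrangements in the
expression of the norm, we obtain:
$$\|\hat P_-\psi_{\kappa\lambda}\|^2 = \int_{-\infty}^{\infty}dq\int_{-\infty}^{\infty}dq'\left(\int_0^{\infty}da\,\psi_a(e^q)\,\psi_a^*(e^{q'})\right)\psi^*_{\kappa\lambda}(t, e^q)\,\psi_{\kappa\lambda}(t, e^{q'}),$$
where the transformation of integration variables $p$ and $p'$ to $e^q$ and $e^{q'}$ in the projector
kernels has been performed.
The integral in the parenthesis,
$$\int_0^{\infty}da\,\psi_a(e^q)\,\psi_a^*(e^{q'}) = \frac{1}{2\pi}\int_0^{\infty}da\,e^{ia(q - q')},$$
is a kernel in an integral that is exponentially damped at the infinities. Thus, we can
calculate it as a limit,
$$\frac{1}{2\pi}\lim_{\epsilon\to 0}\int_0^{\infty}da\,e^{ia(q - q') - \epsilon a} = \frac{1}{2\pi}\lim_{\epsilon\to 0}\frac{i}{(q - q') + i\epsilon} = \frac{i}{2\pi}\,P\,\frac{1}{q - q'} + \frac{1}{2}\,\delta(q - q'),$$
where $P$ denotes the principal value.
Doing the integral over the $\delta$-function gives the simple result:
$$\frac{1}{2}\,(\psi_{\kappa\lambda}, \psi_{\kappa\lambda}) = \frac{1}{2}.$$
The rest can be written as follows:
$$\|\hat P_-\psi_{\kappa\lambda}\|^2 = \frac{1}{2} + \frac{i}{2\pi}\int_{-\infty}^{\infty}dq\;P\!\int_{-\infty}^{\infty}dq'\,\psi_{\kappa\lambda}(e^q)\,\psi_{\kappa\lambda}(e^{q'})\,\frac{\cos t(e^q - e^{q'}) + i\sin t(e^q - e^{q'})}{q - q'}.$$
The integrand is a sum of a symmetric and an anti-symmetric function of the variables $q$
and $q'$. The principal value integral annihilates the anti-symmetric part. The integral from
the symmetric part is already regular, and we can write the final formula:
$$\|\hat P_-\psi_{\kappa\lambda}\|^2 = \frac{1}{2} - \frac{1}{2\pi}\int_{-\infty}^{\infty}dq\int_{-\infty}^{\infty}dq'\,\psi_{\kappa\lambda}(e^q)\,\psi_{\kappa\lambda}(e^{q'})\,\frac{\sin t(e^q - e^{q'})}{q - q'}. \tag{17}$$
Let us calculate the in-going portion for some simple values of $t$. Thus, for $t = 0$, we
obtain immediately:
$$\|\hat P_-\psi_{\kappa\lambda}\|^2_{t=0} = \frac{1}{2}.$$
At the time zero, the probabilities to catch the shell going in or out are equal.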
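The limit $t \to \pm\infty$ can be obtained, if we use the formula:
$$\lim_{t\to\pm\infty}\frac{\sin tx}{x} = \pm\pi\,\delta(x).$$
Hence,
$$\lim_{t\to\pm\infty}\frac{\sin t(e^q - e^{q'})}{q - q'} = \pm\pi\,\frac{e^q - e^{q'}}{q - q'}\,\delta(e^q - e^{q'}).$$
Substituting this into the integral of Eq. (17) and returning back to the variables $p$ and $p'$
results in:
$$\|\hat P_-\psi_{\kappa\lambda}\|^2_{t\to\pm\infty} = \frac{1}{2} \mp \frac{1}{2}\int_0^{\infty}\frac{dp}{p}\int_0^{\infty}\frac{dp'}{p'}\,\psi_{\kappa\lambda}(p)\,\psi_{\kappa\lambda}(p')\,\frac{p - p'}{\log p - \log p'}\,\delta(p - p').$$
The expression
$$\frac{p - p'}{\log p - \log p'}$$
is smooth and equal to $p$ at $p = p'$. Hence, finally
$$\|\hat P_-\psi_{\kappa\lambda}\|^2_{t\to-\infty} = 1, \qquad \|\hat P_-\psi_{\kappa\lambda}\|^2_{t\to\infty} = 0,$$
and we have only in-going, or only outgoing shells at the infinity.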

The obvious interpretation of these formulae is that the quantum shell always bounces at the
center and re-expands. We can, however, ask further questions. For example, what is the
time delay of the re-expansion as compared, say, with the same trajectory in the background
manifold $\mathcal{M}$ that carries the flat metric
$$ds^2 = -dU\,dV + (1/4)(-U + V)^2\,d\Omega^2 \tag{18}$$
in our coordinates $U$ and $V$? To find this time delay, the true metric with respect to these
coordinates would have to be calculated. The metric is determined by the quantum state in a similar
way as the position and the colour of the horizon are (see the next section). However,
unlike the points and the metric inside $\mathcal{M}$, the points and metric in the asymptotic region
are gauge invariant quantities. The method by which we should calculate the asymptotic
metric ought to make the gauge invariance of the result transparent. Such a method has first
to be developed.
The result that the quantum shell bounces and re-expands is clearly at variance with
the classical idea of black hole forming in the collapse and preventing anything that
falls into it from re-emerging. It is, therefore, natural to ask, if the packet is squeezed
enough so that an important part of it comes under its Schwarzschild radius. We can try to
answer this question by comparing the minimal expected radius $\langle r\rangle_{\kappa\lambda}(0)$ with the expected
Schwarzschild radius $\langle r_H\rangle_{\kappa\lambda}$ of the wave packet. The Schwarzschild radius is given by
$$\langle r_H\rangle_{\kappa\lambda} = 2G\langle M\rangle_{\kappa\lambda} = \frac{2\langle M\rangle_{\kappa\lambda}}{M_P^2},$$
where $M_P$ is the Planck energy. Now, the values of $\kappa$ and $\lambda$ for which a large part of the
packet gets under its Schwarzschild radius clearly satisfy the inequality
$$\langle r\rangle_{\kappa\lambda}(0) < \langle r_H\rangle_{\kappa\lambda},$$
or
$$(\lambda M_P)^2 < 2\pi\,\frac{(\kappa + 1/2)^2}{\kappa + 1}\,\frac{\kappa\,(2\kappa)!}{2^{2\kappa}(\kappa!)^2}. \tag{19}$$
Interpreting $\lambda$ roughly as the spatial width of the packet, we have $\lambda M_P \gg 1$ for
reasonably broad packets. Then the right-hand side can be estimated by the Stirling
formula:
( + 1/2)2 (2)!
2 .
2
+1
22 (!)2
Substituting this into the inequality (19) yields
P
> M
M
(20)
MP ,
2
which implies that the threshold energy for squeezing the packet under its Schwarzschild radius is much larger than the Planck energy. For narrow wave packets, we have
that $\lambda M_P \approx 1$, so the inequality (19) is satisfied, and the threshold energy is about one
Planck energy. The inequality (20) expresses, therefore, always the desired property. To
summarize: reasonably narrow packets can, in principle, get under their Schwarzschild
radius; their energy must be much larger than Planck energy. Even in such a case, the shell
bounces and re-expands.
This apparent paradox will be explained in the next section.
5. Grey horizons
In this section, we try to explain the apparently contradictory result that the quantum
shell can cross its Schwarzschild radius in both directions. The first possible idea that
comes to mind is simply to disregard everything that our model says about Planck
regime. This may be justified, because the model can hardly be considered as adequate
for this regime. However, the model is mathematically consistent, simple and solvable; it
must, therefore, provide some mechanism to make the horizon leaky. We shall study this
mechanism in the hope that it can work in more realistic situations, too.
To begin with, we have to recall that the Schwarzschild radius is the radius of a non-diverging null hyper-surface; anything moving to the future can cross such a hyper-surface
only in one direction. The local geometry is that of an apparent horizon. (Whether or not
an event horizon forms can also depend on the geometry near the singularity [12].)
However, as Einstein's equations are invariant under time reversal, there are two types of
Schwarzschild radius: that associated with a black hole and that associated with a white
hole. Let us call these Schwarzschild radii themselves black and white. The explanation
of the paradox that follows from the model is that quantum states can contain a linear
combination of black and white horizons, and that no event horizon forms. We call such a
combination a grey horizon.
The existence of grey horizons can be shown as follows. The position and the colour
of a Schwarzschild radius outside the shell is determined by the spacetime metric. For
our model, this metric is a combination of purely gauge and purely dependent degrees of
freedom, and so it is determined, within the classical version of the theory, by the physical
degrees of freedom through the constraints.
To explain the idea in more detail, let us start with the general case in the ADM
formalism. There are 16 canonical variables, the 6 components of the three-metric $q_{kl}$,
the 6 components of the conjugate momentum $\pi^{kl}$, 1 lapse and three shift functions. These
can be decomposed (non-uniquely) into physical, gauge, and dependent variables. Fixing
the gauge variables by hand (this also includes some boundary conditions in non-compact
cases) means that a particular space-like surface $\Sigma$ is chosen, and a particular coordinate
system $x^k$, $k = 1, 2, 3$, is laid onto this surface. Then the constraints turn into differential
equations determining the dependent part of $q_{kl}$ and $\pi^{kl}$ in terms of the physical one and
so the tensor fields $q_{kl}$ and $\pi^{kl}$ are determined uniquely along $\Sigma$ in the coordinates $x^k$
by the physical degrees of freedom. By this, the full spacetime metric $g_{\mu\nu}$ and all its first
derivatives are known at each point of $\Sigma$. Indeed, if we choose the Gaussian coordinates
$x^0$, $x^k$, adapted to $\Sigma$, then the four-metric at $\Sigma$ is
$$ds^2 = -(dx^0)^2 + q_{kl}\,dx^k\,dx^l,$$
and the derivatives of this metric with respect to the coordinates $x^0$ and $x^k$ are given by
$$\frac{\partial g_{00}}{\partial x^0} = 0, \qquad \frac{\partial g_{00}}{\partial x^k} = 0, \qquad \frac{\partial g_{0k}}{\partial x^0} = 0,$$
$$\frac{\partial g_{kl}}{\partial x^0} = -2K_{kl}, \qquad \frac{\partial g_{kl}}{\partial x^m} = \frac{\partial q_{kl}}{\partial x^m}, \qquad \frac{\partial g_{0k}}{\partial x^l} = 0,$$
where $K_{kl} := (\det q_{kl})^{-1/2}\bigl(\tfrac{1}{2}\,q_{mn}\pi^{mn}q_{kl} - \pi_{kl}\bigr)$ is the second fundamental form of the
surface $\Sigma$. Observe that the choice of Gaussian coordinates is equivalent to specifying the
lapse and shift at $\Sigma$ by hand. The lapse and shift could also be fixed by the condition that
the gauge is preserved by the evolution.
Let $S$ be a closed two-surface on $\Sigma$. We can calculate the Gaussian coordinates adapted
to $S$ in $\Sigma$ from the metric $q_{kl}$ of $\Sigma$; let them be $x'^A$ and $x'^3$, $A = 1, 2$, so that $S$ is given
by $x'^3 = 0$ and $x'^3$ increases in the outside direction. Let the corresponding components
of the tensor fields be $q'_{kl}$ and $K'_{kl}$. Then the induced two-metric on $S$ is $q'_{AB}$ and the full
three-metric on $\Sigma$ is
$$ds^2 = (dx'^3)^2 + q'_{AB}\,dx'^A\,dx'^B.$$
Now, let $l^\mu$ and $n^\mu$ be null vectors orthogonal to $S$, $l^\mu$ being the outgoing and $n^\mu$ the
in-going one. Their components in terms of the coordinates $x'^0 := x^0$ and $x'^k$ are
$$l^\mu = (1, 0, 0, 1), \qquad n^\mu = (1, 0, 0, -1).$$
Then
$$l^\mu\,\frac{\partial g_{AB}}{\partial x'^\mu} = \frac{\partial g_{AB}}{\partial x'^0} + \frac{\partial g_{AB}}{\partial x'^3} = -2K'_{AB} + \frac{\partial q'_{AB}}{\partial x'^3},$$
and, similarly,
$$n^\mu\,\frac{\partial g_{AB}}{\partial x'^\mu} = -2K'_{AB} - \frac{\partial q'_{AB}}{\partial x'^3}.$$
We can, therefore, check whether or not the following equation holds
$$g'^{AB}\left(\frac{\partial q'_{AB}}{\partial x'^3} \mp 2K'_{AB}\right) = 0, \tag{21}$$
and so can find if $S$ is an out- (in-)going apparent horizon; Eq. (21) is then valid with
the upper (lower) sign.
Let us look to see how this algorithm works for the shell model. The constraints are
PU () = 0 and PV () = 0. If the transformation from the original variables (), R(),
P (), PR (), , r and p to U (), V (), PU () PV (), , u, v, pu and pv were known,
it would provide the functionals:
PU () = PU [, R, P , PR , , r, p; ),
and
PV () = PV [, R, P , PR , , r, p; ).
The transformation is not known explicitly, but we know that the constraint equations
PU [, R, P , PR , , r, p; ) = 0,
PV [, R, P , PR , , r, p; ) = 0
(22)
are equivalent to the original constraints, Eqs. (1) and (2). Hence, our first trick is to work
with Eqs. (1) and (2) instead of Eqs. (22).
The constraints (1) and (2) contain the physical variables $\eta$, $M$, $u$ and $v$ also through $r$
and $p$. We can choose the gauge variables to be $R(\rho)$ and $\Lambda(\rho)$. A fixed function $R(\rho)$
determines $\rho$ in terms of the geometrical quantity $R$ and so it fixes a radial coordinate
along $\Sigma$. $\Lambda(\rho)$ contains derivatives of the embedding functions, so it determines the slope
of the embedding at each $\rho$. Integrating the slope gives a family of surfaces; a suitable
boundary condition at infinity selects one of them.
In order to obtain a suitable surface $\Sigma$, the functions $R(\rho)$ and $\Lambda(\rho)$ have to satisfy
some further boundary conditions at infinity, at the shell and at the regular center. The
condition at infinity, $\rho \to \infty$, is to guarantee that $\Sigma$ is asymptotically flat. That at the
shell is necessary in order that $\Sigma$ is smooth across the shell. Finally, at $\rho = 0$, we require
that $\Sigma$ cut the regular center rather than the singularity and that it be a smooth surface at
this point. Similarly, $P_R$ and $P_\Lambda$ have to satisfy suitable boundary conditions at $\rho = 0$,
$\rho = r$ and $\rho \to \infty$. The explicit form of these boundary conditions is carefully discussed
in [1].
Then the constraints become equations for the two functions $P_R(\rho)$ and $P_\Lambda(\rho)$. Eq. (1)
is an algebraic equation for $P_R$; solving it and inserting the solution into Eq. (2) gives an
ordinary differential equation for $P_\Lambda$. The differential equation, together with the boundary
conditions, determines $P_\Lambda(\rho)$ uniquely, and this, in turn, together with Eq. (1), gives
$P_R(\rho)$. The solution is unique. From the known functions $R(\rho)$, $\Lambda(\rho)$, $P_R(\rho)$ and $P_\Lambda(\rho)$,
we can determine $q_{kl}$ and $K_{kl}$ along $\Sigma$ and check Eq. (21).
One can try to solve Eqs. (1) and (2) for $P_R(\rho)$ and $P_\Lambda(\rho)$ explicitly, by choosing
the functions $R(\rho)$ and $\Lambda(\rho)$ in some way that simplifies the equations. Instead, we use
the uniqueness of the solution in the following simple trick. Any solution of the constraint
equations in the spherically symmetric case defines initial data and a surface for a solution
to Einstein's equations that is itself spherically symmetric. Hence, every such solution
of the constraints forms a space-like surface that can be embedded in some Schwarzschild
spacetime. There will always be the Schwarzschild solution of mass zero inside the shell,
and the Schwarzschild solution of mass $M$ outside it.
In this way, we find by inspection from the Kruskal diagram: if the shell is in-going,
$\eta = -1$, then it is contracting and any space-like surface containing such a shell can at
most intersect an outgoing apparent horizon at the radius $R = 2GM$, independently of
which of the two infinities the surface is connecting the shell with. An analogous result holds
for $\eta = +1$, where the shell is expanding. The corresponding $\rho_H$ is determined by the
equation $R(\rho_H) = 2GM$, and the horizon will cut $\Sigma$ if and only if $\rho_H > r$. We can assign
the value $+1$ ($-1$) to the horizon that is out- (in-)going and denote the quantity by $c$
(colour: black or white hole). Then $c = -\eta$.
In particular, if we choose the gauge so that $R(\rho) = \rho$, then $r$ is just $r(t)$, where $t$ is the
value of the parameter $t$ at which the shell intersects $\Sigma$, and we have:
(1) The condition that an apparent horizon intersects $\Sigma$ is $r(t) < 2GM$.
(2) The position of the horizon at $\Sigma$ is $\rho_H = 2GM$.
(3) The value of $c$ is $c = -\eta$.
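The three statements above can be collected into a small Python sketch. It assumes the gauge $R(\rho) = \rho$ and the relation $c = -\eta$, with $\eta = -1$ for an in-going shell and $c = +1$ for an outgoing (black) horizon. The function name `horizon_on_slice` and the default $G = 1$ are illustrative assumptions only.

```python
def horizon_on_slice(r_shell, M, eta, G=1.0):
    """Apparent horizon data on a slice, in the gauge R(rho) = rho.

    r_shell : radius r(t) at which the shell intersects the slice
    M       : Schwarzschild mass outside the shell
    eta     : -1 for an in-going shell, +1 for an outgoing one
    Returns None if no horizon cuts the slice, otherwise a dict with the
    horizon position rho_H and its colour c (+1 black, -1 white).
    """
    if eta not in (-1, 1):
        raise ValueError("eta must be -1 or +1")
    rho_H = 2.0 * G * M
    if r_shell >= rho_H:      # condition (1) fails: shell is outside 2GM
        return None
    return {"rho_H": rho_H, "colour": -eta}   # conditions (2) and (3)

inside = horizon_on_slice(r_shell=1.0, M=1.0, eta=-1)
outside = horizon_on_slice(r_shell=3.0, M=1.0, eta=-1)
```

For an in-going shell already inside $2GM$ the sketch reports a black horizon at $\rho_H = 2GM$; for a shell still outside $2GM$ it reports none.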
In this way, questions about the existence and colour of an apparent horizon outside the
shell are reduced to equations containing dynamical variables of the shell. In particular,
the result that $c = -\eta$ can be expressed by saying that the shell always creates a horizon
outside that cannot block its motion. All that matters is that the shell can bounce at the
singularity (which it cannot within the classical theory).
These results can be carried over to quantum mechanics after quantities such as $2GM - r$
and $\eta$ are expressed in terms of the operators describing the shell. Then we obtain a
quantum horizon with the expected radius $2GM$ and with the expected colour
to be mostly black at the time when the expected radius of the shell crosses the horizon
inwards, neutrally grey at the time of the bounce and mostly white when the shell crosses
it outwards.
This proof has, however, two weak points. First, the spacetime metric on the background
manifold is not a gauge invariant quantity; although all gauge invariant geometrical
properties can be extracted from it within the classical version of the theory, this does not
seem to be possible in the quantum theory [13]. Second, calculating the quantum spacetime
geometry along hyper-surfaces of a foliation on a given background manifold is foliation
dependent. For example, one can easily imagine two hyper-surfaces $\Sigma$ and $\Sigma'$ belonging
to different foliations, that intersect each other at a sphere outside the shell and such that
$\Sigma$ intersects the shell in its in-going and $\Sigma'$ in its outgoing state. Observe that the need for
a foliation is only due to our insistence on calculating the quantum metric.
The essence of these problems is the gauge dependence of the results of the calculation.
However, it seems that this dependence concerns only details such as the distribution of
different hues of grey along the horizon, not the qualitative fact that the horizon exists and
changes colour from almost black to almost white. Still, a more reliable method to establish
the existence and properties of grey horizons would require another material system to be
coupled to our model; this could probe the spacetime geometry around the shell in a gauge-invariant way.
It may still seem difficult to imagine any spacetime that contains an apparent horizon
of mixed colours. Nevertheless, examples of such spacetimes can readily be constructed
if the assumption of differentiability is abandoned. A continuous, piecewise differentiable
spacetime can make sense as a history within the path integral method.
The simplest construction of this kind is based on the existence of the time reversal
isometry T as defined in the foregoing paper [2] that maps an in-going shell spacetime
onto an outgoing one.
Let us choose a space-like hyper-surface $\Sigma_1$ crossing the shell before this hits the
singularity in a $(-1, M, u)$-spacetime $\mathcal{M}$, and find the corresponding surface $T\Sigma_1$ in the
spacetime $T\mathcal{M}$ with the parameters $(+1, M, v)$. Then we cut away the part of $\mathcal{M}$ that lies in
the future of $\Sigma_1$ and the part of $T\mathcal{M}$ in the past of $T\Sigma_1$. As their boundaries are isometric
to each other, the remaining halves can be stuck together in a continuous way. In the resulting
spacetime, the shell contracts from infinity until it reaches $\Sigma_1$ at the radius $r_1$; then,
it turns its motion abruptly to expand towards infinity again. There is no singularity and
the spacetime is flat everywhere inside the shell. If $r_1 < 2GM$, then there is an apparent
horizon at $R = 2GM$. It comes into being where the in-going shell crosses the radius $r =
2GM$ and is outgoing (black) until it reaches $\Sigma_1$. Then, it changes its colour abruptly to
white (in-going) and lasts only until the outgoing shell crosses it again.
The space-like hyper-surface $\Sigma_1$ can be chosen arbitrarily in $\mathcal{M}$. The construction can,
therefore, be repeated in the future of $T\Sigma_1$ in an analogous way so that we obtain a
spacetime with two pleats; the shell contracts, then expands, then contracts again and
hits the singularity. The horizon starts as a black ring, then changes to a white one, and
then it becomes black for all times. This history is, however, not continuous. Clearly, one
can repeat the construction arbitrarily many times; this leads to a pleated spacetime with
a zig-zag motion of the shell and alternating horizon rings of white and black colour. If the
spacetime is to be singularity free, however, there must be an odd number of pleats and an
even number of rings, beginning with the black ring and ending with a white one.
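The counting of pleats and rings can be made explicit with a short Python sketch. It assumes that every turning point lies inside $R = 2GM$, so that each stage of the shell's motion produces one horizon ring, black for a contracting stage and white for an expanding one. The function name `horizon_rings` is hypothetical.

```python
def horizon_rings(n_pleats):
    """Colours of successive horizon rings for a spacetime with n_pleats pleats.

    The shell starts in-going, and each pleat reverses its motion, so there
    are n_pleats + 1 stages. Assumes every turning point lies inside 2GM.
    Returns (colours, singularity_free).
    """
    if n_pleats < 0:
        raise ValueError("n_pleats must be non-negative")
    colours = ["black" if stage % 2 == 0 else "white"
               for stage in range(n_pleats + 1)]
    # The last stage is expanding (no singularity) only for an odd pleat count.
    singularity_free = (n_pleats % 2 == 1)
    return colours, singularity_free

one_pleat = horizon_rings(1)
two_pleats = horizon_rings(2)
```

With one pleat the sketch gives a black ring followed by a white one and no singularity; with two pleats it gives black, white, black and a final collapse, in line with the description above.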
The conditions that the surface $\Sigma_1$ cuts the trajectory of the shell at some small value of
the Schwarzschild radial coordinate $R$, is smooth and space-like everywhere and hits the
space-like infinity $i^0$ for large values of $R$ allow considerable freedom. We can require
in addition that $\Sigma_1$ joins smoothly to the surface $T = T_1$, where $T$ is the Schwarzschild
time coordinate and $T_1$ some constant, so that $\Sigma_1$ coincides with $T = T_1$ for all values of $R$
larger than, say, $R_1$. It is clear from the Penrose diagram that such a $\Sigma_1$ can run arbitrarily
close to the incoming shell trajectory and can be joined to $T = T_1$ for an arbitrarily low value of
$T_1 \in (-\infty, \infty)$, if $R_1$ is chosen sufficiently large. On the other hand, for $R_1 = 2GM + \epsilon$,
$\Sigma_1$ can join $T = T_1$ for arbitrarily large $T_1 \in (-\infty, \infty)$, provided only that $\epsilon > 0$.
Consider now an observer at the fixed value $R_0$ of the Schwarzschild radius in each shell
spacetime. We shall choose $\Sigma_1$ in such a way that $R_1 < R_0$. With this choice, the observer
trajectory $R = R_0$ remains smooth at $\Sigma_1$. Then, the lower bound on the possible values
of $T_1$ is $T_c$, which is the Schwarzschild time of the point at which the observer crosses
the shell. There is, however, no upper bound on $T_1$. Hence, we can construct a one-pleat
spacetime for each value of $T_1$ from the interval $(T_c, \infty)$ with a smooth trajectory of the
observer. For each value of $T_1$, the observer will measure the proper time
$$\Delta\tau = 2\sqrt{1 - \frac{2GM}{R_0}}\,(T_1 - T_c) \in (0, \infty)$$
between his two encounters with the shell. Thus, the time delay can be made arbitrarily
small or large. (Of course, all such histories and many others must be integrated with some
suitable measure in a path integral in order to obtain a reasonable value of the delay.)
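A minimal Python sketch of the proper-time formula, $2\sqrt{1 - 2GM/R_0}\,(T_1 - T_c)$, shows how the delay scales with $T_1$. The function name `proper_time_delay`, the default $G = 1$ and the sample numbers are assumptions made for illustration.

```python
import math

def proper_time_delay(T1, Tc, R0, M, G=1.0):
    """Proper time between the two shell encounters of a static observer.

    T1 : Schwarzschild time of the pleat surface, must exceed Tc
    Tc : Schwarzschild time at which the observer first crosses the shell
    R0 : fixed Schwarzschild radius of the observer, must exceed 2GM
    """
    if R0 <= 2.0 * G * M:
        raise ValueError("observer must sit outside R = 2GM")
    if T1 <= Tc:
        raise ValueError("T1 must lie in the interval (Tc, infinity)")
    return 2.0 * math.sqrt(1.0 - 2.0 * G * M / R0) * (T1 - Tc)

short_delay = proper_time_delay(T1=1.001, Tc=1.0, R0=8.0, M=1.0)
long_delay = proper_time_delay(T1=5.0, Tc=1.0, R0=8.0, M=1.0)
```

Letting `T1` approach `Tc` from above drives the result towards zero, while increasing `T1` makes it grow without bound, which is the freedom described in the text.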
Let us choose a gauge in each spacetime constructed above such that the trajectory of
the shell is $V = u$ for the in-going part and $U = u$ for the outgoing one. Then, the metric
in the asymptotic region, where the observer is, will read
$$ds^2 = -A(U, V)\,dU\,dV + R^2(U, V)\,d\Omega^2,$$
and it is clear that the functions $A(U, V)$ and $R(U, V)$ must have different forms for
different values of $T_1$, or else the proper time $\Delta\tau$ measured by the observer will be
independent of $T_1$. In most cases, the asymptotic behaviour of the metric in this gauge will
be different from (18). On the other hand, in each such spacetime, there will be double null
coordinates $U_1$ and $V_1$, say, in which the metric will have the asymptotic behaviour (18).
However, the trajectory of the shell will then be given, with respect to the coordinates $U_1$
and $V_1$, by different equations for different values of $T_1$.
6. Concluding remarks
Comparison of the motion of wave packets of Section 4 with the classical dynamics of
the shell as described in Section 3 of [2] shows a marked difference. Whereas all classical
shells cross their Schwarzschild radius and reach the singularity in some stage of their
evolution, the quantum wave packets never reach the singularity, but always bounce and re-expand; few of them manage to cross their Schwarzschild radius during their motion. This
behaviour is far from being a small perturbation around a classical solution if the classical
spacetime is considered as a whole. Even locally, the semi-classical approximation is not
valid near the bouncing point. It is surely valid in the whole asymptotic region, where
narrow wave packets follow more or less the classical trajectories of the shell.
The most important question, however, concerns the validity of the semi-classical
approximation near the Schwarzschild radius. We have seen that the geometry near the
radius can resemble the classical black hole geometry in the neighbourhood of the point
where the shell is crossing the Schwarzschild radius inwards. Then, the radius changes its
colour gradually and the geometry becomes very different from the classical one. Finally,
near the point where the shell crosses the Schwarzschild radius outwards, the radius is
predominantly white and the quantum geometry can be again similar to the classical
geometry, this time of a white hole horizon.
If the change of colour is very slow then the neighbourhood of the inward crossing where
the classical geometry is a good approximation can be large. It seems that sufficiently
large time delays would allow for arbitrarily slow change of colour. We cannot exclude,
therefore, that the quantum spacetime contains an extended region with the geometry
resembling its classical counterpart near a black hole horizon, at least locally. This can be
true even if the quantum spacetime as a whole differs strongly from any typical classical
collapse solution.
One can even imagine the following scenario (which needs a more realistic model
than a single thin shell). A quantum system with a large energy collapses and re-expands
with huge time delay. The black hole horizon phase is so long, that Hawking evaporation
becomes significant and must be taken into account in the calculation. It does then influence
the time delay and the period of validity of the black hole approximation. The black hole
becomes very small and only then the change of horizon colour becomes significant. The
white hole stage is quite short and it is only the small remnant of the system that, finally,
re-expands. The whole process can still preserve unitarity. In fact, this is a scenario for
the outcome of the Hawking evaporation process. At least, it is not excluded by the results of the
present paper.
The calculations of this paper are valid only for null shells. Similar calculations have
been performed in [14]. There, re-expansion and unitarity were found for massive shells whose
rest mass is smaller than the Planck mass ($10^{-5}$ g). It is very plausible that the
interpretation of these results is similar to that given in the present paper. Thus, we can
expect the results to be valid at least for all light shells. There is, in any case, a long way to go to
any astrophysically significant system and a lot of work is to be done before we can claim
some understanding of the collapse problem.
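For orientation, the order of magnitude of the Planck mass quoted above can be checked from $m_P = \sqrt{\hbar c / G}$. The Python sketch below uses CODATA values for the constants; the function name `planck_mass_grams` is an illustrative assumption.

```python
import math

# CODATA 2018 values in SI units
HBAR = 1.054571817e-34   # reduced Planck constant, J s
C = 299792458.0          # speed of light, m/s
G_NEWTON = 6.67430e-11   # Newton's constant, m^3 kg^-1 s^-2

def planck_mass_grams():
    """Planck mass sqrt(hbar * c / G), converted from kilograms to grams."""
    return math.sqrt(HBAR * C / G_NEWTON) * 1000.0

m_planck = planck_mass_grams()
```

The result is about $2.2 \times 10^{-5}$ g, consistent with the order of magnitude $10^{-5}$ g used as the bound on the rest mass.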
Our method of dealing with the problem employs simplified models and a kind of
effective theory of gravity; it does not worry about the final form of a full-fledged theory of
quantum gravity. This need not be a completely unreasonable approach. Even if the ultimate
quantum gravity theory were known, most calculations would still be performed within the
approximation of some effective theory and for simplified models (compare the situation
in QCD). The method can give useful hints also because of the fact that the black hole
geometry is made up from purely dependent degrees of freedom of the gravitational field,
and these degrees of freedom have no proper quantum character of their own.
To summarize: We have demonstrated, at least for light shells, that quantum theory can
smoothly unify two states of motion, one being the time reversal of the other, into one
history. In this way, geometry containing a piece of a black hole horizon can be followed by
geometry containing a piece of a white hole horizon, just the opposite of the situation we
know from the Kruskal diagram of classical general relativity. In this way, the quantum
evolution can stay unitary and the question posed at the beginning of the paper can be
answered as follows: a quantum system is not always lost if it falls under its Schwarzschild
radius.
Acknowledgements
Part of this work has been done at the University of Utah, where the author enjoyed
kind hospitality. Helpful discussions with K.V. Kuchař, C. Kiefer and V.F. Mukhanov
are acknowledged. The work was supported by the Swiss National Fonds, the Tomalla
Foundation Zürich, and the NSF Grant PHY-9734871 to the University of Utah.
References
[1] J. Louko, B. Whiting, J. Friedman, Phys. Rev. D 57 (1998) 2279.
[2] P. Hájíček, C. Kiefer, Nucl. Phys. B 603 [PM] (2001) 491, preceding article in this issue.
[3] P. Hájíček, Nucl. Phys. B Proc. Suppl. 88 (2000) 114.
[4] C.J. Isham, in: B.S. DeWitt, R. Stora (Eds.), Relativity, Groups and Topology II, Elsevier, Amsterdam, 1984.
[5] C. Rovelli, Nuovo Cimento B 100 (1987) 343.
[6] P. Hájíček, in: J. Ehlers, H. Friedrich (Eds.), Canonical Gravity: From Classical to Quantum, Springer, Berlin, 1994.
[7] P. Hájíček, C.J. Isham, J. Math. Phys. 37 (1996) 3522;
P. Hájíček, J. Math. Phys. 39 (1998) 4824.
[8] A.O. Barut, R. Raczka, Theory of Group Representations and Applications, PWN, Warsaw, 1980.
[9] P. Hájíček, J. Math. Phys. 36 (1995) 4612;
P. Hájíček, Class. Quant. Grav. 13 (1996) 1353.
[10] C. Rovelli, Phys. Rev. D 42 (1990) 2638;
C. Rovelli, Phys. Rev. D 43 (1991) 442;
C. Rovelli, Phys. Rev. D 44 (1991) 1339.
[11] M. Reed, B. Simon, Fourier Analysis, Self-Adjointness, Methods of Modern Mathematical Physics, Vol. 2, Academic Press, New York, 1975.
[12] P. Hájíček, Phys. Rev. D 36 (1987) 1065.
[13] P. Hájíček, Nucl. Phys. B Proc. Suppl. 80 (2000), gr-qc/9903089.
[14] P. Hájíček, B.S. Kay, K.V. Kuchař, Phys. Rev. D 46 (1992) 5439.

Erratum
Erratum to: On the relation between Stokes multipliers and T-Q systems of conformal field theory
[Nucl. Phys. B 563 (1999) 573]
Patrick Dorey, Roberto Tateo
Department of Mathematical Sciences, University of Durham, Durham DH1 3LE, UK
Here are the corrections to this paper.

On page 583, Eq. (3.8) should be:

eiE/4 C(E, l)D (E, l) = (l+1/2)+E/2D 2 E, l

+ (l+1/2)E/2D 2 E, l ;

y(x, 0, l)|M
1/4
1/2

2
2
2l + 1 1
=
x (M1)/4 y
x (M+1)/2, 0,
;
M +1
M +1
M + 1 2 M=1
= (M + 1)2M E M1 ;
E
On page 601, Ref. [24] should be:
C.M. Bender and S. Boettcher, Phys. Rev. Lett. 80 (1998) 5243, physics/9712001.
PII of original article: S0550-3213(99)00609-4.

E-mail addresses: p.e.dorey@durham.ac.uk (P. Dorey), roberto.tateo@durham.ac.uk (R. Tateo).
PII: S0550-3213(01)00163-8

Erratum
Erratum to: Differential equations and integrable models: the SU(3) case
[Nucl. Phys. B 571 (2000) 583]
Patrick Dorey, Roberto Tateo
Department of Mathematical Sciences, University of Durham, Durham DH1 3LE, UK
Here are the corrections to this paper.

W1,0,1 = i3 3;
On page 599, first line. The constant c is:
c=
3
27 (7/6) (5/3)
=
.
4
(1/3) 24/3
These do not affect any of the other results in the paper.
PII of original article: S0550-3213(99)00791-9.

E-mail addresses: p.e.dorey@durham.ac.uk (P. Dorey), roberto.tateo@durham.ac.uk (R. Tateo).
PII: S0550-3213(01)00164-X

CUMULATIVE AUTHOR INDEX B601–B603

Ahn, C.  B601 (2001) 539
Akemann, G.  B601 (2001) 77
Alford, M.  B602 (2001) 61
Alonso-Alberca, N.  B602 (2001) 329
ALPHA Collaboration  B603 (2001) 180
Álvarez, E.  B603 (2001) 286
Amoros, G.  B602 (2001) 87
Anastasiou, C.  B601 (2001) 318
Anastasiou, C.  B601 (2001) 341
Arutyunov, G.  B602 (2001) 238
Astier, P.  B601 (2001) 3
Autiero, D.  B601 (2001) 3

Bajnok, Z.  B601 (2001) 503
Baldisseri, A.  B601 (2001) 3
Baldo-Ceolin, M.  B601 (2001) 3
Banner, M.  B601 (2001) 3
Bassompierre, G.  B601 (2001) 3
Behrndt, K.  B601 (2001) 49
Benatti, F.  B602 (2001) 541
Benslama, K.  B601 (2001) 3
Besson, N.  B601 (2001) 3
Bialas, A.  B603 (2001) 218
Bialas, P.  B603 (2001) 369
Bijnens, J.  B602 (2001) 87
Bird, I.  B601 (2001) 3
Blumenfeld, B.  B601 (2001) 3
Bobisut, F.  B601 (2001) 3
Boer, D.  B603 (2001) 195
Bouchez, J.  B601 (2001) 3
Boyd, S.  B601 (2001) 3
Braun, V.M.  B603 (2001) 69
Bueno, A.  B601 (2001) 3
Bunyatov, S.  B601 (2001) 3
Burda, Z.  B602 (2001) 399

Cachazo, F.  B603 (2001) 3
Camilleri, L.  B601 (2001) 3
Cardini, A.  B601 (2001) 3
Cattaneo, P.W.  B601 (2001) 3
Cavasinni, V.  B601 (2001) 3
Cerdeño, D.G.  B603 (2001) 231
Cervera-Villanueva, A.  B601 (2001) 3
Chandrasekharan, S.  B602 (2001) 61
Chim, L.  B601 (2001) 539
Cognola, G.  B602 (2001) 383
Colangelo, G.  B603 (2001) 125
Collazuol, G.  B601 (2001) 3
Conforto, G.  B601 (2001) 3
Conta, C.  B601 (2001) 3
Contalbrigo, M.  B601 (2001) 3
Cousins, R.  B601 (2001) 3
Cox, J.  B602 (2001) 61
Curio, G.  B602 (2001) 172

Dalmazi, D.  B601 (2001) 77
Damgaard, P.H.  B601 (2001) 77
Daniels, D.  B601 (2001) 3
De Fossé, L.  B603 (2001) 413
Degaudenzi, H.  B601 (2001) 3
Del Prete, T.  B601 (2001) 3
De Santo, A.  B601 (2001) 3
Dignan, T.  B601 (2001) 3
Di Lella, L.  B601 (2001) 3
Do Couto e Silva, E.  B601 (2001) 3
Domenech-Garret, J.L.  B601 (2001) 395
Dorey, P.  B603 (2001) 581
Dorey, P.  B603 (2001) 582
Dumarchez, J.  B601 (2001) 3

Ellis, M.  B601 (2001) 3
Evslin, J.  B602 (2001) 486

Fabris, J.C.  B602 (2001) 644
Fazio, T.  B601 (2001) 3
Feldman, G.J.  B601 (2001) 3
Feng, J.L.  B602 (2001) 307
Ferrari, R.  B601 (2001) 3
Ferrère, D.  B601 (2001) 3
Flaminio, V.  B601 (2001) 3
Floreanini, R.  B602 (2001) 541
Forte, S.  B602 (2001) 585
Fradkin, E.  B601 (2001) 591
Fraternali, M.  B601 (2001) 3
Frau, M.  B602 (2001) 39
Frolov, S.  B602 (2001) 238

Gabrielli, E.  B603 (2001) 231
Gaillard, J.-M.  B601 (2001) 3
Gangler, E.  B601 (2001) 3
Garousi, M.R.  B602 (2001) 527
Gasser, J.  B603 (2001) 125
Gehrmann, T.  B601 (2001) 248
Gehrmann, T.  B601 (2001) 287
Geiser, A.  B601 (2001) 3
Geppert, D.  B601 (2001) 3
Gherghetta, T.  B602 (2001) 3
Gibin, D.  B601 (2001) 3
Gilkey, P.B.  B601 (2001) 125
Glover, E.W.N.  B601 (2001) 318
Glover, E.W.N.  B601 (2001) 341
Gninenko, S.  B601 (2001) 3
Godley, A.  B601 (2001) 3
Gómez, C.  B603 (2001) 286
Gomez-Cadenas, J.-J.  B601 (2001) 3
Gorbunov, D.S.  B602 (2001) 213
Gosset, J.  B601 (2001) 3
Gößling, C.  B601 (2001) 3
Gouanère, M.  B601 (2001) 3
Grant, A.  B601 (2001) 3
Graziani, G.  B601 (2001) 3
Guglielmi, A.  B601 (2001) 3
Gukov, S.  B601 (2001) 49

Hagner, C.  B601 (2001) 3
Hájíček, P.  B603 (2001) 531
Hájíček, P.  B603 (2001) 555
Hambye, T.  B602 (2001) 23
Hara, T.  B602 (2001) 499
Hernández, L.  B603 (2001) 286
Hernando, J.  B601 (2001) 3
Hubbard, D.  B601 (2001) 3
Huerta, M.  B601 (2001) 591
Hurst, P.  B601 (2001) 3
Hyett, N.  B601 (2001) 3

Iacopini, E.  B601 (2001) 3
Intriligator, K.  B603 (2001) 3
Iucci, A.  B601 (2001) 607

Joseph, C.  B601 (2001) 3
Juget, F.  B601 (2001) 3

Kamani, D.  B601 (2001) 149
Khalil, S.  B603 (2001) 231
Kiefer, C.  B603 (2001) 531
Kiem, Y.  B601 (2001) 27
Kim, J.E.  B602 (2001) 346
Kim, Y.  B602 (2001) 467
Kirsanov, M.  B601 (2001) 3
Kirsten, K.  B601 (2001) 125
Klemm, D.  B601 (2001) 380
Klimov, O.  B601 (2001) 3
Koerber, P.  B603 (2001) 413
Kokkonen, J.  B601 (2001) 3
Korchemsky, G.P.  B603 (2001) 69
Kovzelev, A.  B601 (2001) 3
Krasnoperov, A.  B601 (2001) 3
Krause, A.  B602 (2001) 172
Kuznetsov, V.  B601 (2001) 3

Lacaprara, S.  B601 (2001) 3
Lachaud, C.  B601 (2001) 3
Lakic, B.  B601 (2001) 3
Lanza, A.  B601 (2001) 3
LaRotonda, L.  B601 (2001) 3
Laveder, M.  B601 (2001) 3
Lazaroiu, C.I.  B603 (2001) 497
Leão, C.R.  B602 (2001) 514
Lee, H.M.  B602 (2001) 346
Letessier-Selvon, A.  B601 (2001) 3
Leutwyler, H.  B603 (2001) 125
Levy, J.-M.  B601 (2001) 3
Li, K.  B601 (2001) 607
Li, M.  B602 (2001) 201
Liccardo, A.  B602 (2001) 39
Ling, Y.  B601 (2001) 191
Linssen, L.  B601 (2001) 3
Ljubicic, A.  B601 (2001) 3
Long, H.N.  B601 (2001) 361
Long, J.  B601 (2001) 3
Lupi, A.  B601 (2001) 3

Ma, E.  B602 (2001) 23
Ma, J.P.  B602 (2001) 572
Manashov, A.N.  B603 (2001) 69
Mangano, M.L.  B602 (2001) 585
March-Russell, J.  B602 (2001) 307
Marchionni, A.  B601 (2001) 3
Martelli, F.  B601 (2001) 3
Martín-Delgado, M.A.  B601 (2001) 569
McInnes, B.  B602 (2001) 132
Méchain, X.  B601 (2001) 3
Meessen, P.  B602 (2001) 329
Meggiolaro, E.  B602 (2001) 261
Mendiburu, J.-P.  B601 (2001) 3
Merlatti, P.  B602 (2001) 453
Meyer, J.-P.  B601 (2001) 3
Mezzetto, M.  B601 (2001) 3
Mishra, S.R.  B601 (2001) 3
Molke, H.  B603 (2001) 180
Moon, S.-H.  B602 (2001) 467
Moorhead, G.F.  B601 (2001) 3
Morel, A.  B603 (2001) 369
Mueller, A.H.  B603 (2001) 427
Munier, S.  B603 (2001) 427
Muñoz, C.  B603 (2001) 231
Musto, R.  B602 (2001) 39

Nagao, T.  B602 (2001) 622
Naón, C.M.  B601 (2001) 607
Naumov, D.  B601 (2001) 3
Navelet, H.  B603 (2001) 218
Nédélec, P.  B601 (2001) 3
Nefedov, Yu.  B601 (2001) 3
Nguyen-Mau, C.  B601 (2001) 3
NOMAD Collaboration  B601 (2001) 3
Norrbin, E.  B603 (2001) 297

Oleari, C.  B601 (2001) 318
Oleari, C.  B601 (2001) 341
Oller, J.A.  B602 (2001) 641
Orestano, D.  B601 (2001) 3
Ortín, T.  B602 (2001) 329

Palla, L.  B601 (2001) 503
Park, D.H.  B601 (2001) 27
Pastore, F.  B601 (2001) 3
Peak, L.S.  B601 (2001) 3
Pearce, P.A.  B601 (2001) 539
Pelinson, A.M.  B602 (2001) 644
Pennacchio, E.  B601 (2001) 3
Peschanski, R.  B603 (2001) 218
Pessard, H.  B601 (2001) 3
Petersson, B.  B602 (2001) 399
Petersson, B.  B603 (2001) 369
Petkou, A.C.  B601 (2001) 380
Petkou, A.C.  B602 (2001) 238
Petkova, V.B.  B603 (2001) 449
Petriello, F.J.  B601 (2001) 169
Petrov, K.  B603 (2001) 369
Petti, R.  B601 (2001) 3
Placci, A.  B601 (2001) 3
Polesello, G.  B601 (2001) 3
Pollmann, D.  B601 (2001) 3
Polyarush, A.  B601 (2001) 3
Pomarol, A.  B602 (2001) 3
Popov, B.  B601 (2001) 3
Poulsen, C.  B601 (2001) 3

Rathouit, P.  B601 (2001) 3
Reisz, T.  B603 (2001) 369
Remiddi, E.  B601 (2001) 248
Remiddi, E.  B601 (2001) 287
Resco, P.  B603 (2001) 286
Rey, S.-J.  B602 (2001) 467
Rico, J.  B601 (2001) 3
Ridolfi, G.  B602 (2001) 585
Rivelles, V.O.  B602 (2001) 514
Roda, C.  B601 (2001) 3
Rodriguez-Laguna, J.  B601 (2001) 569
Romano, R.  B602 (2001) 541
Rubbia, A.  B601 (2001) 3
Russo, J.G.  B602 (2001) 109

Sabella, G.  B602 (2001) 453
Sakai, N.  B602 (2001) 413
Salvatore, F.  B601 (2001) 3
Sanchis-Lozano, M.A.  B601 (2001) 395
Sarkar, U.  B602 (2001) 23
Sato, H.-T.  B601 (2001) 27
Schahmaneche, K.  B601 (2001) 3
Schmidt, B.  B601 (2001) 3
Sekino, Y.  B602 (2001) 147
Sethi, S.  B602 (2001) 307
Sevior, M.  B601 (2001) 3
Sevrin, A.  B603 (2001) 389
Sevrin, A.  B603 (2001) 413
Shapiro, I.L.  B602 (2001) 644
Shmakova, M.  B601 (2001) 49
Sierra, G.  B601 (2001) 569
Sillou, D.  B601 (2001) 3
Siopsis, G.  B601 (2001) 380
Sjöstrand, T.  B603 (2001) 297
Smolin, L.  B601 (2001) 191
Smolin, L.  B601 (2001) 209
Soa, D.V.  B601 (2001) 361
Sokal, A.D.  B601 (2001) 425
Soler, F.J.P.  B601 (2001) 3
Sozzi, G.  B601 (2001) 3
Starinets, A.O.  B601 (2001) 425
Stasto, A.M.  B603 (2001) 427
Steele, D.  B601 (2001) 3
Stiegler, U.  B601 (2001) 3
Stipcevic, M.  B601 (2001) 3
Stolarczyk, Th.  B601 (2001) 3

Tabaczek, J.  B602 (2001) 399
Takács, G.  B601 (2001) 503
Takayanagi, T.  B603 (2001) 259
Talavera, P.  B602 (2001) 87
Tani, T.  B602 (2001) 434
Tareb-Reyes, M.  B601 (2001) 3
Tateo, R.  B603 (2001) 581
Tateo, R.  B603 (2001) 582
Taylor, G.N.  B601 (2001) 3
Tejeda-Yeomans, M.E.  B601 (2001) 318
Tejeda-Yeomans, M.E.  B601 (2001) 341
Tereshchenko, V.  B601 (2001) 3
Theis, U.  B602 (2001) 367
Tomizawa, S.  B602 (2001) 413
Toropin, A.  B601 (2001) 3
Torrente-Lujan, E.  B603 (2001) 231
Toublan, D.  B603 (2001) 343
Touchard, A.-M.  B601 (2001) 3
Tovey, S.N.  B601 (2001) 3
Tran, M.-T.  B601 (2001) 3
Troost, J.  B603 (2001) 389
Troost, W.  B603 (2001) 389
Tsesmelis, E.  B601 (2001) 3

Ulrichs, J.  B601 (2001)

Vacavant, L.  B601 (2001) 3
Vafa, C.  B603 (2001) 3
Valdata-Nappi, M.  B601 (2001) 3
Valuev, V.  B601 (2001) 3
Van Neerven, W.L.  B603 (2001) 42
Vannucci, F.  B601 (2001) 3
Varadarajan, U.  B602 (2001) 486
Varvell, K.E.  B601 (2001) 3
Vassilevich, D.V.  B601 (2001) 125
Veltri, M.  B601 (2001) 3
Verbaarschot, J.J.M.  B601 (2001) 77
Verbaarschot, J.J.M.  B603 (2001) 343
Vercesi, V.  B601 (2001) 3
Vidal-Sitjes, G.  B601 (2001) 3
Vieira, J.-M.  B601 (2001) 3
Vinogradova, T.  B601 (2001) 3
Vogt, A.  B603 (2001) 42

Wágner, F.  B601 (2001) 503
Wang, J.E.  B602 (2001) 486
Weber, F.V.  B601 (2001) 3
Weisse, T.  B601 (2001) 3
Wiese, U.-J.  B602 (2001) 61
Wilczek, F.  B602 (2001) 307
Wilson, F.F.  B601 (2001) 3
Winton, L.J.  B601 (2001) 3
Wolff, U.  B603 (2001) 180

Xiong, Z.  B602 (2001) 289

Yabsley, B.D.  B601 (2001) 3
Yang, J.M.  B602 (2001) 289
Yoneya, T.  B602 (2001) 499

Zaccone, H.  B601 (2001) 3
Zemba, G.R.  B601 (2001) 591
Zerbini, S.  B602 (2001) 383
Zuber, J.-B.  B603 (2001) 449
Zuber, K.  B601 (2001) 3
Zuccon, P.  B601 (2001) 3

0550-3213/2001 Published by Elsevier Science B.V.
PII: S0550-3213(01)00215-2
