The sequence, crystal structure determination and refinement of two crystal forms of lipase B from Candida antarctica
Jonas Uppenberg 1, Mogens Trier Hansen 2, Shamkant Patkar 2 and T Alwyn Jones 1*
1 Department of Molecular Biology, Uppsala University, Biomedical Centre, Box 590, S-751 24 Uppsala, Sweden and 2 Novo Nordisk, Novo Allé, DK-2880 Bagsvaerd, Denmark
Background: Lipases constitute a family of enzymes that hydrolyze triglycerides. They occur in many organisms and display a wide variety of substrate specificities.
In recent years, much progress has been made towards
explaining the mechanism of these enzymes and their
ability to hydrolyze their substrates at an oil-water interface.
Results: We have determined the DNA and amino acid
sequences for lipase B from the yeast Candida antarctica. The primary sequence has no significant homology to any other known lipase and deviates from the
consensus sequence around the active site serine that
is found in other lipases. We have determined the crystal structure of this enzyme using multiple isomorphous
replacement methods for two crystal forms. Models for
the orthorhombic and monoclinic crystal forms of the
enzyme have been refined to 1.55 Å and 2.1 Å resolution, respectively. Lipase B is an α/β type protein that
has many features in common with previously determined lipase structures and other related enzymes. In
the monoclinic crystal form, lipid-like molecules, most
likely β-octyl glucoside, can be seen close to the active
site. The behaviour of these lipid molecules in the crystal
structure has been studied at different pH values.
Conclusion: The structure of Candida antarctica lipase B shows that the enzyme has a Ser-His-Asp catalytic triad in its active site. The structure appears to be
in an 'open' conformation with a rather restricted entrance to the active site. We believe that this accounts for
the substrate specificity and high degree of stereospecificity of this lipase.
Structure 15 April 1994, 2:293-308
Key words: Candida antarctica, crystal structure, lipase, sequence, X-ray
Introduction
Lipases (EC 3.1.1.3) make up a diverse group of enzymes that have the ability to hydrolyze triglycerides at
a lipid-water interface. Activity is dramatically increased
upon binding to the lipid surface due to a conformational change of the enzyme. This mechanism of interfacial
activation on triglyceride substrates distinguishes
lipases from other esterases which primarily hydrolyze
water-soluble esters. A large number of lipases have
been characterized that display a wide variation in efficiency and substrate specificity [1]. Triglycerides may
be cleaved at all three ester bonds or specifically at
only one or two positions. Lipases can also show different specificities depending on the lengths of the
fatty acids. In organic media, the enzymatic behaviour
changes and lipases can be used for transesterification
and other synthetic reactions to produce new kinds of
lipids. The three-dimensional crystal structures of five
lipases have been published to date [2-6]. The five enzymes show many similarities in structure and they all
have a catalytic triad similar to the one found in serine
proteases [7,8]. The catalytic residues in the triad occur
in a different order in the protein sequence for lipases
(Ser-His-Asp/Glu), compared with serine proteases
(Asp-His-Ser in the subtilisin family, His-Asp-Ser in
the chymotrypsin family). The catalytic serine in lipases
is usually identified by the conserved sequence GxSxG
[9].
The five structures are made up of a mostly parallel
β-sheet, surrounded by α-helices. The human pancreatic lipase (HPL) has an additional domain that binds
colipase, a small protein involved in lipid binding. An
outstanding feature of most lipases is a mobile lid covering the catalytic site in its inactive form. The opening
of this lid is believed to be one of the key features of
interfacial activation. This has been demonstrated by
the structures of Rhizomucor miehei lipase (RML) [10]
and HPL [11], in complex with an inhibitor and a substrate analogue, respectively, where the lid has moved
considerably to make the active site accessible to the
ligand and made possible the formation of an oxyanion
hole. The movement of the lid also changes the overall
surface at the entrance of the active site, making it more
hydrophobic, and thereby changing the lipid-binding
properties. The structures of Geotrichum candidum
lipase (GCL) and Candida rugosa lipase (CRL) have
been identified as members of the α/β-hydrolase family [12]. This group of enzymes shares a similar fold
*Corresponding author.
© Current Biology Ltd ISSN 0969-2126
Structure 1994, Vol 2 No 4
[Fig. 1 sequence listing: nucleotide sequence of the Candida antarctica lipase B gene with the deduced amino acid sequence printed beneath the codons.]
Fig. 1. DNA sequence of the Candida antarctica lipase B gene and the deduced amino acid sequence. Numbers refer to the amino
acid position in the mature lipase. The pre-propeptide (amino acids -25 to -1, shown in italics) contains a sequence (-25 to -8)
typical of signal peptides [51] and a short propeptide ending in two basic amino acids forming a possible target for KEX2 type [52]
proteolytic processing into the mature protease. The position of the probes used for screening are indicated by solid lines: NOR930
(sense, line above sequence) and NOR929 (antisense; line below sequence). The active site residues are underlined with double lines and
the N-glycosylation is marked with a dashed line.
with an identical connectivity of the central β-sheet,
although no overall sequence homology can be detected. The members of this newly categorized fold all
have a catalytic triad with the same sequential order of
the catalytic residues (nucleophile-His-Asp/Glu). The
different nucleophiles account for much of the diversity
in the reactions performed by these enzymes and
so far serine, aspartic acid and cysteine residues have
been identified in this role. All lipases that have been
characterized to date have a serine as the nucleophilic
residue. GCL and CRL have a high overall sequence
homology and very similar structures. The lid regions,
however, are quite different. It is believed that the GCL
structure represents a closed form of the enzyme, while
CRL has been crystallized in its open, activated form.
Recently, the first bacterial lipase structure was determined from Pseudomonas glumae [6]. This structure
contains much of the α/β-hydrolase fold, but also exhibits details usually not present in lipases, such as a
calcium-binding site and a partially redundant catalytic
aspartate (a calcium site is also present in HPL, although further away from the active site). The structure
of a related enzyme, cutinase, has also been determined
[13]. Cutin, the natural substrate of this enzyme, is a
lipid polyester matrix found on the surface of plants.
Cutinase has lipolytic activity but does not display interfacial activation. The structure, which is similar in many
respects to the lipases, does not have a lid covering
the active site. A model of guinea pig pancreatic lipase
(GPL) has also been constructed, based on the crystal
structure of HPL [14]. The sequence identity between
the two enzymes is high, except for the region of the
lid, where GPL has a large deletion. This is believed to
account for the fact that this enzyme does not display
interfacial activation. GPL also shows phospholipase activity in addition to its ability to hydrolyze triglycerides.
The yeast Candida antarctica displays a non-specific
lipase activity towards triglycerides which is retained
even at high temperatures [15]. Two different lipases,
called A and B, with different isoelectric points and
molecular weights, have been isolated [16]. Lipase A is
non-specific and the more thermostable lipase. It has
a molecular weight of 45 kDa and a pI of 7.5. Lipase B
(CALB) has a molecular weight of 33 kDa and a pI of
6.0. This enzyme has proven to be a very stereospecific
enzyme both in hydrolysis [1] and in organic synthesis
[17-19] and has a potentially important application in
glucolipid synthesis [20].
Fig. 2. Stereo drawing of the Cα trace of CALB. The structure is coloured red at the amino terminus, then orange, light green, dark green, pale blue and finally dark blue at the carboxyl terminus.
We have determined the DNA and amino acid sequences (Fig. 1) of lipase B from C. antarctica (CALB)
and its three-dimensional crystal structure, using multiple isomorphous replacement (MIR) methods. This
lipase has been crystallized under a variety of conditions [21] and the structure has been determined in
two different crystal forms that grow under identical
conditions. The orthorhombic crystals have the best
diffraction properties and for the general description
of the lipase, the model determined from this crystal
form will be used. The monoclinic crystal form of the
enzyme displays some interesting properties, including
the binding of a detergent molecule in the active site.
Two data sets at different pH values have been collected from the monoclinic crystals.
Results and discussion

Sequence
CALB is made up of 317 amino acid residues with a
formula weight of 33 273 Da. The amino acid sequence
(Fig. 1) shows no significant homology to other lipase
sequences. From the structures determined for other lipases, we assumed that CALB would most likely contain
a Ser-His-Asp/Glu catalytic triad. However, the consensus sequence found in lipases around the active site
serine, GxSxG, is not present in CALB. The sequence
around Ser105 has the highest similarity to the consensus sequence but the first conserved glycine has
been replaced by a threonine to give TWSQG. Since the
whole sequence includes only one histidine residue,
residue 224 was rather easy to identify as a likely candidate for the active site. The catalytic aspartic or glutamic
acid residue could not be identified from the sequence
and was resolved only after the crystal structure had
been determined.
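The consensus check described in this section can be illustrated with a short sketch. This is not the authors' procedure: the function name `find_motifs` and the relaxed pattern are assumptions made for illustration, and the test string is a made-up fragment in which only the pentapeptide TWSQG is taken from the text.

```python
import re

# Strict lipase consensus around the nucleophilic serine: G-x-S-x-G.
STRICT = re.compile(r"G.S.G")
# Relaxed pattern (an assumption for illustration): any residue is allowed at
# the first position, so variants such as T-W-S-Q-G are reported as well.
# The lookahead lets overlapping candidates be found.
RELAXED = re.compile(r"(?=(..S.G))")


def find_motifs(seq, pattern):
    """Return (1-based serine position, matched pentapeptide) for each hit."""
    hits = []
    for m in pattern.finditer(seq):
        # With the lookahead pattern the text is in group 1, otherwise group 0.
        text = m.group(1) if pattern.groups else m.group(0)
        # The serine is the third residue of the pentapeptide.
        hits.append((m.start() + 3, text))
    return hits


# Hypothetical fragment, NOT the real CALB sequence; only TWSQG is from the text.
fragment = "AAAPVLTWSQGGLVA"
print(find_motifs(fragment, STRICT))
print(find_motifs(fragment, RELAXED))
```

On this toy fragment the strict pattern finds nothing, while the relaxed pattern reports the TWSQG pentapeptide, which mirrors why the catalytic serine of CALB is missed by a GxSxG search.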
Description of the molecule
CALB is a globular α/β type protein with approximate
dimensions of 30 Å × 40 Å × 50 Å (Fig. 2). The central
β-sheet is composed of seven strands of which the last
six are parallel, with the following strand topology [22]:
+2, -1x, +2x, +1x, +1x, +1x. The numbering of
helices and strands is shown in the secondary structure
diagram in Fig. 3. Most connections between strands
are formed by the right-handed β-α-β structural motif.
Fig. 3. Secondary structure diagram of
CALB. The assignment of secondary
structure was carried out with the program DSSP [53]. Helices α2, α9 and
α10 all have short regions where the
hydrogen bonding pattern for helices is
broken and the direction of the helix
changes.
Two exceptions involve the antiparallel connection between strands β1 and β2, and the last two strands of the
sheet which form a right-handed β-loop-β motif. Another β-sheet-like region is found in the last 12 residues
of the protein that form a hydrogen bonded pair of
strands with a type I β-hairpin connection. There are
10 α-helices in the structure. The first helix is found
immediately before the first strand. Four helices connect neighbouring β-strands: α3, α4 and α7 on one side
of the sheet and α2 on the other side. Helices α5, α6
and α10 make up most of the active site pocket and
are likely to be important in interfacial activation and
substrate specificity.
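The strand topology notation used in this section can be made concrete with a small sketch. It assumes a sheet order of β1, β3, β2, β4, β5, β6, β7 (the first four positions are stated in the comparison with related enzymes; the sequential placement of the last three is an assumption) and that only the first strand runs antiparallel to the rest. The function name `sheet_topology` is hypothetical.

```python
def sheet_topology(sheet_order, direction):
    """Describe each strand-to-strand connection as a topology token.

    sheet_order: strand numbers listed by spatial position in the sheet.
    direction:   dict mapping strand number to +1 or -1 (relative orientation).
    Returns one token per connection: the signed shift in sheet position,
    followed by 'x' when the two strands are parallel (crossover connection).
    """
    position = {strand: i for i, strand in enumerate(sheet_order)}
    strands = sorted(sheet_order)
    tokens = []
    for a, b in zip(strands, strands[1:]):
        shift = position[b] - position[a]
        cross = "x" if direction[a] == direction[b] else ""
        tokens.append(f"{shift:+d}{cross}")
    return tokens


# Assumed sheet order and orientations (see the note above).
order = [1, 3, 2, 4, 5, 6, 7]
direction = {1: -1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1}
print(", ".join(sheet_topology(order, direction)))
```

Under these assumptions the sketch yields six tokens for seven strands, with a single hairpin-type connection (no `x`) between the first two strands and crossover connections everywhere else.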
The active site
In CALB, a serine triad is found at the carboxy-terminal
edge of the parallel β-sheet. This suggests that this enzyme has the same reaction mechanism as the other lipases that have been studied to date. The catalytic triad
is made up of Ser105, Asp187 and His224 and, therefore, shares the sequential order of catalytic residues
of all lipases and α/β-hydrolases for which structures
have been determined. The catalytic serine is located
in the tight turn between β4 and the following helix,
α4, and has a similar conformation to that observed in
other lipases. This is a strained conformation, with φ
and ψ values of 53.4° and -126.9°, respectively, that
lie outside the most favoured regions found in proteins. The sequence around the serine is usually the
only conserved region found in lipases. A consensus
sequence GxSxG exists for lipases, as well as for many
of the α/β-hydrolases [23]. The X-ray structures have
shown how the tight turn in this region brings the Cα
atoms of the two conserved glycines into close contact
with each other, leaving no space for side chain atoms.
In CALB, this consensus is broken by the sequence
TWSQG. As shown in Fig. 4, the relative orientation
of the strand and the helix is different in CALB compared with the other lipases. The helix is slightly more
bent away from the strand, providing extra space for
the threonine residue that lies in the middle of β4. A
similar situation is found in the α/β-hydrolase enzyme
haloalkane dehalogenase [24], where the first glycine
in the consensus sequence is replaced by a valine. One
can only speculate whether the substitution G→T/V is
the cause of the helix movement or a result of it.
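The backbone torsion angles quoted for the catalytic serine are dihedral angles: φ is defined by the atoms C(i-1), N, Cα, C and ψ by N, Cα, C, N(i+1). As a minimal sketch of how such an angle is obtained from four atomic positions, the function below computes a signed dihedral in degrees. The function name and the coordinates in the usage lines are illustrative assumptions, not atoms from the CALB model.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the two outer bonds onto the plane perpendicular to the axis.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Made-up points for illustration: eclipsed (cis) and a quarter turn.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Applied to the four backbone atoms listed above, the same function would give φ or ψ for a residue; whether a given pair falls outside the favoured regions still has to be judged against a Ramachandran plot.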
The active site histidine, His224, is located at the beginning of helix α9, such that the side chain projects
into the active site. The active site aspartic acid, Asp187,
is found in a turn after the sixth strand, as expected
for a member of the α/β-hydrolase fold. The side chain
oxygen atoms of Asp187 form hydrogen bonds to main
chain and side chain atoms as well as to a buried water
molecule (Fig. 5). The next residue is a glutamic acid.
The same pair of residues is also found in GCL, CRL
and acetylcholine esterase [25] where the glutamic acid
is part of the catalytic triad. In CALB, the glutamic acid
points away from the active site into the surrounding
solvent and has no obvious functional role.
The region around Ser105 is remarkably polar in nature. In addition to His224, there are three residues
(Thr40, Asp134 and Gln157) that have polar side chain
atoms within 5 Å of the Oγ of the catalytic serine (Fig.
6). The amide group of the side chain of Gln157 is involved in three hydrogen bonds. A close contact to the
carbonyl oxygen of Ser153 leads us to the suggested
acceptor/donor assignments implied in Fig. 6. Thr40,
therefore, accepts a hydrogen bond and the protonated carboxylate of Asp134 donates a hydrogen bond
to Gln157. The three polar residues form a hydrogen
bond network that is fully accessible to the solvent.
This may impose restrictions on how amphipathic lipid
substrates can be oriented in the active site pocket by requiring that they make favourable hydrogen bonds
to polar protein atoms. One such set of interactions is
likely to be the formation of an oxyanion hole.
Fig. 4. The superposition of CALB (green), Rhizomucor miehei lipase (magenta), Geotrichum candidum lipase (red) and human pancreatic lipase (purple) around the active site serine. Only the Cα atoms of the β-strand were used in this alignment. Thr103, a buried water and Ser105 in CALB are also shown.
Fig. 5. A stereo picture showing density in a 2Fobs - Fcalc map around the catalytic triad at 1.55 Å resolution. A buried water residue is tightly associated with the catalytic residue Asp187 through a hydrogen bond. Carbon atoms are shown in green, nitrogens in purple and oxygens in red.
From the enzymology of serine proteases, it is known that
the negative charge of the tetrahedral substrate intermediate must be stabilized by hydrogen bonds from
the enzyme [26]. A similar oxyanion stabilization appears to be present in lipases [2,3]. In the liganded
structures of HPL and RML, probable oxyanion hole
interactions have been identified, based on hydrogen
bonds from the protein to the oxyanion intermediate-like inhibitor [10,11]. These open forms of HPL and
RML are very similar to each other in this functionally important region. Two main chain nitrogens form
hydrogen bonds to the ligand in both structures. The
first hydrogen bond donor is the residue following the
active site serine and the other is at the carboxy-terminal end of β2. In RML, the latter residue, Ser82, also
contributes a third hydrogen bond through its hydroxyl
group. We have aligned CALB with these structures and
have found a strong resemblance in this region (Fig.
7). Two backbone nitrogens, in residues Gln106 and
Thr40, are present at positions equivalent to those in
RML and HPL. Serine 82 from RML has a direct counterpart in Thr40 from CALB with side chain conformations
such that their hydroxyls are closely superimposable
(Fig. 7). The hydrogen bond assignments in Fig. 6 allow
the hydroxyl of Thr40 to act as a hydrogen bond donor
to the oxyanion without rearrangement. An oxyanion
hole stabilized by three hydrogen bonds has also been
identified in cutinase [27].
The residue immediately before Thr40 in CALB is a
glycine, which is a structurally conserved residue in lipases and most of the α/β-hydrolases. A side chain at
this position in CALB would create close contacts between its Cβ and the Cβ of the active site serine (3 Å)
and possibly disturb the oxyanion hole interactions.
Fig. 6. A stereo picture of the catalytic
triad residues and nearby polar residues,
Asp134, Gln157 and Thr40 that form a
hydrogen bonding network with the solvent in the active site cavity. Colour
scheme as for Fig. 5. At the top of the
picture is the most likely candidate for
a lid in CALB and the side chains that
form stabilizing hydrogen bonds in this
region. The residues that are disordered
in one of the molecules of the monoclinic crystal form in the high pH structure are shown in magenta.
Fig. 7. A stereo picture of the RML-phosphonate inhibitor complex and an
alignment with CALB in this region. All
residues believed to make up the oxyanion hole have a similar conformation in
the two enzymes. Hypothetical hydrogen bonds from the inhibitor to CALB
are indicated by dashed lines. RML is
shown in black, CALB in the colour
scheme used for Fig. 5.
The active site pocket and lid
In the structures of CALB, the active site is accessible
to external solvent through a narrow channel (Fig.
8). It is approximately 10 Å × 4 Å wide and 12 Å deep,
as measured from the Oγ of Ser105 to the surface.
Most of the channel is formed by three parts of the
structure: helices α5 and α10 and a loop region which
projects Ile189 into the channel. The channel walls are
very hydrophobic and are lined with mostly aliphatic
residues. No aromatic side chains are found in the
channel except Trp104, which precedes the catalytic
serine in the sequence. The side chain nitrogen of this
residue makes a hydrogen bond with the backbone
carbonyl oxygen of the active site histidine and stabilizes this region. In other lipases, this residue is often
a histidine with a similar hydrogen bond to the active
site histidine backbone.
Fig. 8. The active site pocket in its open conformation from the orthorhombic model. (a) View from above and (b) as a cross section. A solvent accessible surface was calculated with VOIDOO [54] using a 1 Å probe radius.
The accessible active site suggests that the enzyme has
adopted a conformation close to an activated state in
both crystal forms. Therefore, we cannot be certain
which part of the protein, if any, functions as a lid to
control entry to the active site. The most likely candidate
is the short helix α5. In the monoclinic crystal
form, α5 is disordered in one of the molecules, suggesting a region of high mobility that may undergo conformational changes of importance for lipid binding and
catalysis. In the open form of the enzyme we observe
a surface of aliphatic residues from α5 that lines the
channel leading into the active site. The helix also has
another surface which makes contact with the interior
of the protein. The most striking feature of this region is a buried aspartic acid side chain, Asp145, which
makes stabilizing hydrogen bonds with the side chains
of Ser150 and Thr158 (Fig. 6). The long helix at the
carboxy-terminal end of the structure, α10, is another
possible candidate for changing the accessibility of the
active site. This helix is dominated by alanines and
other hydrophobic residues on all sides and is kinked
in the middle at a proline residue. It has no hydrogen
bonds to the rest of the protein, which suggests it may
be relatively mobile. It also displays higher main chain
temperature factors than the rest of the structure. A
conformational change of α10 could change the size
and shape of the active site channel and the surrounding enzyme surface.
Why the enzyme crystallizes in an open conformation
is not clear, but this form has most likely been stabilized by the crystallization medium. In the orthorhombic crystal form, we believe the open form is stabilized
in part by Leu199 from a symmetry-related molecule
which points into the active site channel and keeps it
open. In the monoclinic form, density for a detergent-like molecule can be seen at the entrance to the active
site which may also prevent the lid from closing (see
the section describing the monoclinic crystal form). In
both HPL and RML, a tryptophan side chain covers the
active site in their inactive forms. There is no equivalent
aromatic side chain present in CALB.
The external hydrophobic surface
CALB has a large hydrophobic surface surrounding the
entrance of the active site channel (Fig. 8). It has an approximate area of 450 Å² and is probably in close contact with a lipid surface during hydrolysis. It is nearly
triangular in shape and is slightly concave. The surface
is dominated by side chains from aliphatic residues,
oriented towards the solvent. The surface displays only
two carboxyl residues, Asp223 and Glu188, that are
close to each other and near the entrance of the active site pocket. On the opposite side of the entrance,
there is a lysine residue that shows high mobility in
the monoclinic crystal form. In this crystal form, the
surface plays an important role in the crystal packing
(see below). In both crystal forms, this hydrophobic
surface interacts with neighbouring enzyme molecules.
Disulphide bridges and glycosylation
There are six cysteine residues in the sequence of
CALB. The crystallographic work shows that they are
all involved in disulphide bonds. The first is found between Cys22 and Cys64 and connects the antiparallel
first and third strands where the first strand ends and
begins a short loop to the second strand. The second
bridge is formed between Cys216 and Cys258 and connects two loop regions at the surface of the protein.
The third disulphide connects Cys293 and Cys311 and
stabilizes the carboxy-terminal end of the enzyme.
The amino acid sequence suggests one possible N-glycosylation site, with the characteristic sequence NxT,
at Asn74. This asparagine, which is followed by an aspartic acid and a threonine, is indeed the only residue
where the electron density indicates a glycosylation.
Two N-acetylglucosamine molecules have been built
into the density in the orthorhombic crystal form (Fig.
9). In the monoclinic crystal form, glycosylation can
be seen for only one of the two lipase molecules in
the asymmetric unit. The carbohydrate molecules are
located in a loop region after the third strand and point
into the surrounding solvent. The innermost carbohydrate unit makes hydrogen bonds with side chains
from residues Gln11 and Asp75 and with two well-ordered water molecules. The outermost molecule has
no clear hydrogen bonds. There are no visible interactions between the carbohydrates and neighbouring
lipase molecules.
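The search for candidate N-glycosylation sites from sequence alone can be sketched as a pattern scan. The text names the NxT sequon; the sketch below uses the commonly cited broader form N-x-S/T with x not proline, which is an assumption beyond what the text states. The function name and the test fragment are hypothetical; only the tripeptide NDT and the residue number 74 come from the text.

```python
import re

# N-x-S/T sequon, x != Pro (broader than the NxT form named in the text).
# The lookahead allows overlapping candidates to be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(seq, first_residue_number=1):
    """Return (residue number of Asn, tripeptide) for each candidate site."""
    return [(m.start() + first_residue_number, m.group(1))
            for m in SEQUON.finditer(seq)]


# Hypothetical fragment, NOT the real CALB sequence. It is numbered so that
# the asparagine of NDT lands on residue 74, as in the text.
fragment = "GGNDTGGNPTGG"
print(find_sequons(fragment, first_residue_number=72))
```

A sequence hit only marks a possible site. As the section notes, it is the electron density that shows whether the asparagine actually carries carbohydrate.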
Multiple conformations
The high resolution refinement has revealed a few side
chains that adopt multiple, discrete conformations. The
Oγ atom of Ser26 assumes two different positions. In
Ile87, the density indicates that the Cδ1 has two different positions and, similarly, Leu144 has two conformations for the δ-carbons. The density for the Oγ atom of
Ser105 in the active site is somewhat extended and the
atom has a higher temperature factor than comparable
atoms in this region, indicating some mobility which
might be of functional interest. A few residues show
weak density for their side chains, suggesting high mobility; the most prominent of these are Arg242, Arg249
and Glu269 (this is also apparent in their real space fit
values shown in Fig. 10). The average main chain temperature factors show the usual variation expected for
Fig. 9. A 2Fobs - Fcalc map around the
N-glycosylation site in CALB in the orthorhombic crystal form. Two N-acetylglucosamine molecules have been built
in the density. The average temperature factor for the carbohydrate atoms
is 27 A2 in the orthorhombic model.
loop regions (e.g. between et9 and 0tlO) and exposed

secondary structure elements (e.g. amino-terminal part
of ca9).
0.9
0.s
O.S
0.7
*Q
E
The majority of contacts are of a polar nature, with

many water-mediated hydrogen bonds. Only eight direct hydrogen bonds have been found between protein
molecules. The hydrophobic surface is packed against
a neighbouring molecule. This interaction is highly
hydrophobic and only one protein-protein hydrogen
bond can be located. In this region, the side chain of
Leul99 from a symmetry-related molecule points into
the active site and could be partly responsible for stabilization of the open active site channel.
The monoclinic crystal forms
0.6
Several native data sets have been collected on the

monoclinic crystal form at different pH values. The
. o.
striking feature of this crystal form is the packing of
0.4
0
the two molecules in the asymmetric unit (designated
U
0.3
molecules A and B) such that the large hydrophobic
surface around the active site pocket of one molecule
0.2
packs against the corresponding surface of the other
molecule (Fig. 11). Both molecules have an open con0.1
formation with the active site accessible from the out0
side, just as in the orthorhombic form. In the monoclinic crystal form, however, density for a lipid-like
molecule has been located at the entrance to the active
Fig. 10. Plots of the real space fit for all atoms (solid lines) and avsite (Figs 11 and 12). This is most likely a P-octyl gluerage temperature factors for main chain atoms (dashed lines) of
the current orthorhombic model as a function of residue number.
coside detergent molecule, since no other lipid comThe scale on the left shows the real-space correlation coefficient
pound was added in the crystallization experiment. The
and the scale on the right shows B-factor values in A2. For the
detergent molecules are most clearly visible in the maps
real space fit, a 2Fobs- Fcalc map was used. All 317 residues are
visible in the electron density map and have been modelled.
calculated from data collected at pH 3.6. When the
pH is raised to 5.5, a dramatic change takes place in
Solvent molecules
the crystal which includes a reduction in the length of
In the current orthorhombic model, 286 water molecules
the unit cell a-axis by 2A. The most drastic structural
change is seen in the active site of molecule B. There is no longer any density for the detergent molecule nor for most of the residues forming α5. This effect is not seen in molecule A, where the detergent molecule remains clearly visible.
The detergent molecules are in contact with both enzyme molecules (Fig. 11). The lipid tail projects into the active site pocket of one molecule while the carbohydrate portion has polar interactions with the other enzyme molecule. The electron densities for the carbohydrate moiety of both molecules are rather poor and do not allow us to accurately position them. In our current model, there are two hydrogen bonds from the sugar hydroxyls to main chain atoms of residues Val221 and Asp223.
The hydrophobic surfaces of the two molecules are separated at many places by a layer of solvent molecules and no direct hydrogen bonds exist between the two protein molecules. We believe that much of the observed electron density may be occupied by more hydrophobic molecules of the crystallization mixture, either disordered β-octyl glucoside or isopropanol molecules. This conclusion is based on the observation that much of the solvent density is rather large and packed against hydrophobic side chains that do not allow for effective hydrogen bonding. These solvent molecules have so far been modelled as waters.
have been included. Of these only seven are completely buried, lacking contact with external solvent molecules. Of special interest is the water molecule bound to an Oδ atom of the active site residue Asp187 (Fig. 5). In cutinase, a water molecule is also hydrogen bonded to the active site aspartic acid [27]. As mentioned earlier, one water molecule is hydrogen bonded to Thr103 and forms part of the turn around the active site serine, but there is no clear electron density for a water molecule at the proposed oxyanion hole of CALB. Since the active site is exposed to the solvent continuum, it is not surprising that a number of solvent molecules are also found in the active site pocket. Two water molecules form a bridge between the hydroxyl of Ser105 and the carbonyl oxygen of Thr40. Two well-ordered water molecules are packed between helices α5 and α10. These are present in both crystal forms and are absent only in molecule B of the high pH monoclinic form, where the displaced helix α5 changes the hydrogen bond network.
Given the hydrophobic character of the entrance to the active site, it cannot be ruled out that isopropanol rather than water molecules are responsible for some of the density peaks.
Crystal contacts in the orthorhombic crystal form
In the orthorhombic form, the lipase molecule has crystal contacts with 12 symmetry-related molecules.
Fig. 11. A stereo plot of the asymmetric unit in the monoclinic crystal form at pH 3.6 showing how one molecule packs its hydrophobic surface against that of another, thereby minimizing their exposure to the surrounding solvent. Molecule A in red, molecule B in green, waters and β-octyl glucoside molecules in purple. The molecules are related by an almost exact two-fold rotation.
Fig. 12. Stereo picture of an Fobs - Fcalc map around the lipid molecule, most likely β-octyl glucoside, in the monoclinic crystal form. The lipid part of the molecule points into the active site pocket of molecule A and the carbohydrate moiety forms hydrogen bonds to molecule B. The map is contoured at 2σ.
Comparison with enzymes of similar structure
In recent years, a number of structures have been determined that share a common overall motif, called the α/β-hydrolase fold [12]. The structure of CALB contains a subset of this fold. It has the same connectivity of the β-sheet as observed in other members of this group, with the characteristic non-sequential alignment of the first four strands in the sheet, β1, β3, β2, β4. Since all connections are of the right-handed type, the crossovers make possible an equal distribution of helices on both sides of the β-sheet. CALB is different from the other α/β-hydrolases in having only seven strands. All other enzymes in the group have at least one extra strand at the amino-terminal part of the sheet. Many parts of CALB can be superimposed with equivalent residues in other α/β-hydrolase enzymes, although the sequence identity is very low (Table 1). The overall structural alignments of CALB with RML and HPL are poorer. This can be partly explained by their slightly different secondary structure topology that deviates from the α/β-hydrolase fold [23].

Table 1. A structural comparison of CALB with lipases and α/β-hydrolase enzymes.

Enzyme   Number of equivalent   Percent of        Rms Cα (Å)   PDB entry
         residuesᵃ              smaller enzyme
RML      79 (12)                29                1.8          3TGL
HPLᵇ     114 (9)                36                1.8          -
GCL      134 (12)               42                2.0          1THG
AChE     143 (12)               45                2.0          1ACE
HAD      115 (17)               36                2.0          2HAD
CPII     131 (13)               41                2.1          2SC2

The alignment was determined with O's lsq_improve option, where each pair of equivalent atoms is separated by less than 3.8 Å after superposition. ᵃThe number of sequence identities in the set of structurally equivalent residues is given in parentheses. ᵇThe human pancreatic lipase coordinates were kindly provided by Dr F Winkler, Hoffmann-La Roche, Basel. Enzyme name abbreviations: RML = Rhizomucor miehei lipase, HPL = human pancreatic lipase, GCL = Geotrichum candidum lipase, AChE = acetylcholine esterase, HAD = haloalkane dehalogenase, CPII = carboxypeptidase II.
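The quantities reported in Table 1 can be illustrated with a minimal sketch. It assumes the two Cα traces have already been superimposed and paired residue by residue, which is the hard part that O's alignment option handles and that is not reproduced here. The function names and the simple pairing scheme are illustrative assumptions, not the procedure used in O.

```python
import math

def equivalent_residue_stats(coords_a, coords_b, cutoff=3.8):
    """Count paired C-alpha atoms closer than `cutoff` (in Angstrom) and
    return (count, rms distance over the retained pairs).

    coords_a and coords_b are equal-length lists of (x, y, z) tuples that are
    assumed to be already superimposed and already paired one-to-one.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be paired one-to-one")
    distances = [math.dist(p, q) for p, q in zip(coords_a, coords_b)]
    kept = [d for d in distances if d < cutoff]
    if not kept:
        return 0, float("nan")
    rms = math.sqrt(sum(d * d for d in kept) / len(kept))
    return len(kept), rms

def percent_of_smaller(n_equivalent, length_a, length_b):
    """Equivalent residues as a percentage of the shorter of the two chains."""
    return 100.0 * n_equivalent / min(length_a, length_b)
```

For example, with the 134 equivalent residues listed for GCL and the 317 residues of CALB, `percent_of_smaller(134, 317, n)` gives about 42 % for any partner length `n` larger than 317, in line with the table.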
The active site residues in CALB are located at the same positions in the structure as defined by the α/β-hydrolase fold. The catalytic triad in CALB can be superimposed on the triad of carboxypeptidase II (CPII) [28], for example, with a root mean square deviation (rmsd) of 0.96 Å for all non-hydrogen atoms. The same comparison with RML gives an rmsd of 0.68 Å (Fig. 7). The triad in HPL is less similar because of the different topological origin of the aspartic acid.
The active site serine in CALB is surrounded by many
polar residues (Fig. 6). In the activated form of RML,
there is an aspartic acid residue in the same region as
Gln157 in CALB. In the open form of pancreatic lipase,
no polar side chains can be found in this region which
is much more hydrophobic. The unusual buried polar
cluster found in a number of fungal lipases [29] is not
apparent in CALB.
Biological implications
Lipases are able to hydrolyze triglycerides at an oil-water interface, where their activity is drastically increased. Although Candida antarctica lipase B (CALB) is not as efficient as other lipases in hydrolyzing triglycerides, this enzyme is of particular interest because it displays strong stereospecificity on chiral substrates during hydrolysis or organic synthesis.
Here we report the sequence and structure of CALB. The structure has many features in common with other lipases. It is built up from a subset of the α/β-hydrolase fold and contains a Ser-His-Asp active site triad. In the present crystal forms, a rather narrow and deep channel leads into an open active site that contains an oxyanion hole. The shape of the channel probably accounts for the enzyme's stereospecificity. The lipase crystal structures that have appeared in recent years indicate that activation at the interface may be caused by a conformational change that exposes the active site of the enzyme. A putative lid has been identified based on the observed mobility of a short α-helix (α5). The long carboxy-terminal helix (α10) may also play an important role since it has no hydrogen bonds to other parts of the structure and interacts mainly through hydrophobic side chains. However, we cannot rule out the existence of a closed form of CALB resulting from a conformational change affecting a larger part of the structure.
The relatively low activity of the enzyme on large triglyceride substrates and the easy adoption of an open conformation suggest that CALB may be an intermediate between an esterase (which hydrolyzes water soluble substrates) and a true, interfacially activated lipase.
Materials and methods

Cloning and sequencing of the C. antarctica lipase B gene
The amino-terminal protein sequence of C. antarctica lipase
B was initially determined as LPSGSDPAFSQPKSVLDAGLTNEG.
Two slightly degenerate oligonucleotide probes, NOR929 [CCCTC GTT(C/G) GT(C/G)A GGCC(C/G) GCGTC (C/G)AGC
ACCGA CTTGG GCTG] and NOR930 [CC(C/G)T CGGGC
TCGGA CCC(C/G) GC(C/G/T) TTCTT CTCGC AGCCC AAG],
were synthesized on the basis of the extremely biased codon use
in a gene from the same organism, which had previously been
cloned and sequenced (MTH, unpublished data). Total DNA
from C. antarctica LF058 was isolated after grinding in a mortar with quartz essentially by the method of Yelton et al. [30]. Purified DNA was partially digested with Sau3A and fragments in the range of 3-10 kb were isolated after agarose electrophoresis. These were ligated into plasmid pBR322, which had been cut with BamHI and dephosphorylated using standard procedures [31]. An Escherichia coli MC1000 restriction deficient derivative with ampicillin resistance was transformed with the plasmids. The colonies were replicated onto filters and screened by hybridization to 32P-labelled probes as previously described [32]. Replica filters were screened with labelled probes NOR929 and NOR930. Seven colonies were identified, which hybridized to both probes after washing at 55°C in 6 x SSC (1 x SSC is 150 mM NaCl and 15 mM sodium citrate, pH 7.0). The hybridizing plasmids were shown to contain overlapping inserts. From one original insert of 7.8 kb, a 2.1 kb subclone was chosen for sequencing. Nested deletions were made from one end of the gene
using the exonuclease III-based Erase-a-Base System (Promega
Corporation). The sequence in one direction was determined
from the deletion plasmids by the dideoxy method [33] using
Table 2. Data collection and heavy atom refinement statistics.

Compound          Resolution  No. of unique  No. of        Completeness  Rmerge  Rdiff  Heavy atom   Number    Phasing  Rcullis
                  (Å)         reflections    observations  (%)           (%)     (%)    conc. (mM)   of sites  power    (%)
Monoclinic crystal form
Native (pH 3.6)   2.1         31323          58582         86            3.0
Native (pH 5.5)   2.5         18925          48771         94            5.6
UO2Cl2            3.5         7132           22324         97            4.6     18.0   3            12        2.3      59
UO2(NO3)2         3.5         5857           16819         80            4.2     21.3   10           15        2.3      57
CH3PbCH3COO       3.5         6847           19926         94            4.0     11.1   2            8         1.4      78
K2PtCl4 (1)       3.5         6652           12216         91            4.1     12.9   25           11        1.0      81
K2PtCl4 (2)       3.5         6388           12745         88            2.6     6.5    3            11        1.1      83
Overall figure of merit: 68 % (3.5 Å)
Orthorhombic crystal form
Native            1.55        37486          140416ᵃ       95            4.0ᵇ
UO2(CH3COO)2      3.0         4042           15030         69            6.3     18.0   55           9         2.1      58
Hg(CH3COO)2       3.0         5195           14809         91            3.6     12.7   10           8         1.1      75
K2PtCl4           3.0         5379           14875         93            4.5     10.4   30           1         1.0      79
Overall figure of merit: 54 % (3.0 Å)

ᵃThe observations are for the high resolution data set only (see Methods). ᵇThe overall Rmerge for the native data has been calculated for reflections between 100 and 1.8 Å (see Methods). $R_{merge} = \sum |I_j - \langle I_j \rangle| / \sum \langle I_j \rangle$, where $I_j$ is the intensity of an observation of reflection j and $\langle I_j \rangle$ is the average intensity for reflection j. $R_{diff} = \sum |I_{nat} - I_{der}| / \sum \langle I \rangle$, where $I_{nat}$ is the intensity for the native reflection, $I_{der}$ is the intensity for the derivative reflection and $\langle I \rangle$ is the average intensity of $I_{nat}$ and $I_{der}$. $R_{cullis} = \sum \left| |F_{PH} - F_P| - F_H(calc) \right| / \sum |F_{PH} - F_P|$ for centric reflections, where $F_{PH}$ and $F_P$ are the structure factors for derivative and native data respectively and $F_H(calc)$ is the calculated structure factor for the heavy atom contribution.
the Sequenase DNA sequencing kit (United States Biochemical Corporation). From this, oligonucleotide primers were made for sequencing the opposite strand. Due to the high CG content (63 %), several areas of severe compression were seen. To elucidate these areas, part of the gene was also sequenced using ITP (inosine-5'-triphosphate) rather than GTP (guanosine-5'-triphosphate) in the sequencing reactions.
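The degenerate probes NOR929 and NOR930 described in this section are written with alternatives in parentheses, such as (C/G). The following sketch, which is only an illustration and not part of the original procedure, expands that notation into the individual sequences it stands for and computes the fraction of G and C bases. The function names are hypothetical, and the probe strings should be checked against the printed sequences before use.

```python
import itertools
import re

def expand_degenerate(probe):
    """Expand '(C/G)'-style degenerate positions into all concrete sequences."""
    seq = probe.replace(" ", "")
    # Each token is either a parenthesised set of alternatives or a single base.
    tokens = re.findall(r"\(([ACGT/]+)\)|([ACGT])", seq)
    choices = [alt.split("/") if alt else [base] for alt, base in tokens]
    return ["".join(combo) for combo in itertools.product(*choices)]

def gc_fraction(seq):
    """Fraction of bases in `seq` that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)

# Probe strings as printed in the text above.
NOR929 = "CCCTC GTT(C/G) GT(C/G)A GGCC(C/G) GCGTC (C/G)AGC ACCGA CTTGG GCTG"
NOR930 = "CC(C/G)T CGGGC TCGGA CCC(C/G) GC(C/G/T) TTCTT CTCGC AGCCC AAG"

variants_929 = expand_degenerate(NOR929)
variants_930 = expand_degenerate(NOR930)
```

Each two-fold degenerate position doubles the number of sequences in the probe mixture, which is why probes designed from a strongly biased codon usage can be kept only slightly degenerate.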
X-ray crystallography
Protein for the X-ray structure determination of CALB was purified from the native organism as described previously [34]. The crystallization of CALB in five different space groups has been published elsewhere [21] and will only be summarized here for the relevant crystal forms.
The protein was crystallized at room temperature with the hanging drop method [35]. The reservoir contained 20 % polyethylene glycol 4000, 50 mM sodium acetate buffer, pH 3.6 and 10 % isopropanol. The protein concentration before any additions was 10 mg ml⁻¹. The protein solution was mixed with an equal volume from the reservoir. β-octyl glucoside was added to reach a final concentration of 0.6 %. Two different crystal forms grew under identical conditions. The monoclinic crystal form belonged to space group P2₁ and diffracted to 2.0 Å resolution. The crystals had cell constants a = 69.2 Å, b = 50.5 Å, c = 86.7 Å and β = 101.5°. The second crystal form grew as smaller crystals which were more elongated in shape. They were orthorhombic with space group P2₁2₁2₁ and diffracted to 1.8 Å with a rotating anode X-ray source and to 1.5 Å at a synchrotron source. The cell dimensions were a = 62.1 Å, b = 46.7 Å, c = 92.1 Å. Both crystal forms were used in the MIR determination of the structure. The data collection statistics are summarized in Table 2. In order to increase the chances of heavy-atom binding, the heavy-atom compounds were dissolved in a solution similar to the crystallization reservoir, but with the pH raised to 5.5. In the case of the monoclinic crystal form, this led to significant changes in the cell constants, a = 67.0 Å, b = 50.5 Å, c = 86.7 Å and β = 100.1°. The diffraction limit for these modified crystals was 2.5 Å.
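As a rough consistency check on the cell constants quoted above, the unit cell volume and the Matthews coefficient (cell volume per dalton of protein) can be estimated as sketched below. The molecular weight of about 33 kDa is an assumption that is not stated in this section. The number of asymmetric units per cell follows from the space groups (2 for P2₁, 4 for P2₁2₁2₁), and the number of molecules per asymmetric unit is taken as two for the monoclinic form and one for the orthorhombic form.

```python
import math

def cell_volume(a, b, c, beta_deg=90.0):
    """Volume (cubic Angstrom) of a monoclinic or orthorhombic cell.

    For a monoclinic cell V = a * b * c * sin(beta); beta = 90 degrees
    reduces this to the orthorhombic case.
    """
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(volume, asu_per_cell, molecules_per_asu, mol_weight_da):
    """Cell volume per dalton of protein (cubic Angstrom per Da)."""
    return volume / (asu_per_cell * molecules_per_asu * mol_weight_da)

ASSUMED_MW_DA = 33000.0  # assumption: approximate molecular weight of CALB

v_mono = cell_volume(69.2, 50.5, 86.7, 101.5)   # monoclinic form, pH 3.6
v_ortho = cell_volume(62.1, 46.7, 92.1)         # orthorhombic form

vm_mono = matthews_coefficient(v_mono, 2, 2, ASSUMED_MW_DA)
vm_ortho = matthews_coefficient(v_ortho, 4, 1, ASSUMED_MW_DA)
```

Under these assumptions both crystal forms come out slightly above 2 Å³ per dalton, which would correspond to fairly tightly packed crystals.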
Native and derivative data were collected for both space groups at 20°C on an SDMS Mark III multiwire area detector [36], mounted on a Rigaku rotating anode operating at 50 kV and 90 mA. The SDMS software was used for collecting the data and the stored images were processed with MADNES [37], followed by profile fitting in PROCOR [38]. Subsequent treatment of the data was mainly performed with the CCP4 program package [39]. A high resolution data set of the orthorhombic crystal form was collected at 20°C on beamline X11 at the EMBL outstation at DESY, Hamburg using a Mar-Research image plate. This dataset was processed with DENZO [40]. The low resolution data were not used in this data set since many strong reflections were overexposed on the image plate. Instead the high resolution data were merged with the SDMS data set. The R-merge between the two data sets in the overlapping resolution shell, 2.5-2.0 Å, was 9.6 %.
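The merging R-factor quoted here, and defined in the footnote to Table 2, can be sketched as follows. This is a minimal illustration using the common convention of summing over all repeated observations; the data layout (a dictionary keyed by Miller index) and the function name are assumptions, and the scaling applied by the processing programs is not reproduced.

```python
def r_merge(observations):
    """Merging R-factor over repeated intensity measurements.

    `observations` maps a reflection index (h, k, l) to a list of measured
    intensities. The numerator sums the absolute deviation of each
    observation from the mean intensity of its reflection; the denominator
    sums the observed intensities of the same reflections.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        if len(intensities) < 2:
            continue  # singly measured reflections carry no merging information
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    if denominator == 0:
        raise ValueError("no multiply measured reflections")
    return numerator / denominator
```

To compare two data sets in an overlapping resolution shell, each reflection in the shell would contribute one intensity from each data set after scaling.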
The monoclinic crystals were used initially in the search for heavy-atom derivatives. Difference Patterson maps gave a few outstanding peaks, which led to the identification of the major heavy-atom sites in the UO2Cl2 and K2PtCl4 data sets. Further sites were found by inspection of difference Fouriers. The heavy-atom parameters were refined with MLPHARE [41]. The two uranyl data sets and the methyl lead acetate data set had most of their sites in common. They were, therefore, always refined separately since joint refinement of these gave unreasonably high figures of merit, without improving the resulting MIR map. Anomalous data from the uranyl data sets were included in the phasing. MIR phases were calculated to 3.5 Å resolution and further improved by solvent flattening, histogram matching and application of Sayre's equation, using SQUASH [42]. An electron density map was calculated at this point and the skeletonized map showed many of the secondary structure elements. Two enzyme molecules could be identified in the asymmetric unit. These were superimposed manually in O [43] to give an approximate transformation operator. This operator was further improved with the rt-improve program and several cycles of two-fold density averaging were performed with A at 3.5 Å resolution [44]. The resulting map was used for model building. The tracing of the peptide backbone was done using skeletonized density and a Cα trace was constructed with the 'baton' option in O, placing Cα atoms into the density at a separation of 3.8 Å. The positions of the main chain atoms were generated from the Cα coordinates using a database of refined structures [45]. The side chains were added in their preferred rotamer conformations and the best fitting rotamer was selected by visual inspection of the map [46]. Two thirds of the sequence could easily be fitted to the density. This partial model was subjected to one round of energy minimization and simulated annealing molecular dynamics refinement in X-PLOR, using a 'slow cool' protocol [47].
The orthorhombic crystal form was then solved by molecular replacement, using the crude monoclinic structure as a search model. All diffraction data with F > 2σ, in the resolution range 8.0-3.5 Å, were used in the rotation and translation functions in X-PLOR [48]. The rotation function gave a unique solution with a 25σ peak height. The translation function also returned a single solution, with an 18σ peak height. Using this correctly placed model for phasing, it was then possible to locate the heavy-atom sites using difference Fourier maps. These sites were refined as described above and a new MIR map was calculated. The MIR maps of the two crystal forms were not averaged but were displayed, superimposed in O. Together with phase combined maps, this allowed us to interpret the rest of the structure.
The structure was refined with X-PLOR by simulated annealing and energy minimization using force-field parameters derived from the Cambridge Structural Data Base [49]. Individual restrained temperature factor refinement was performed on the orthorhombic data and the low pH monoclinic data set. For the high pH monoclinic data set, main chain and side chain atoms were grouped for each residue during temperature factor refinement, starting with the initial values from the low pH model. The protein structure was primarily refined with the orthorhombic data to 1.8 Å resolution. This model was then used as the starting model for refinement of the two monoclinic data sets. Waters were added in steps during the refinement. The real space fit of the model to calculated 2Fo - Fc maps was used to find incorrectly built regions of the model [46]. The model was also closely inspected at positions with unusual main chain dihedral angles, peptide flips or torsion angles. A sequencing error was found at residue 276 which had initially been determined as a glycine. The electron density indicated an alanine that was later confirmed to be correct by partial resequencing. At a late stage of the refinement, the 1.55 Å data set was collected for the orthorhombic crystal form and used to further improve the model. The correct conformations for prolines could be identified and the first residue could be placed correctly into density.
Heavy-atom sites in the monoclinic crystal form
Since the uranyl and lead derivatives shared most sites, they will be described together. There was one outstanding site, with more than twice the occupancy of any other site. It was located at the interface between the two molecules in the asymmetric unit, bound to Asp223 of molecule A and Glu188 and Asp223 of molecule B. Most other sites were bound to single aspartic acid side chains. One of them was bound to Asp145 on molecule B, while molecule A had no equivalent site. This is another indication that helix α5 has indeed been perturbed in this molecule, making the aspartic acid accessible for heavy-atom binding. One uranyl site was located between two proline residues from different lipase molecules, and another site was found at the carboxyl terminus of molecule A. The strongest platinum sites in the monoclinic form were located near the sulphur of Met298, one for each molecule in the asymmetric unit. This is also the only methionine found on the surface of the enzyme.
Heavy-atom sites in the orthorhombic crystal form
The main uranyl site found in the monoclinic form was stabilized by the special packing of the two molecules in the asymmetric unit. This packing is not present in the orthorhombic crystal form. The highest occupancy site, however, was still a uranyl acetate molecule bound to Asp223 and Glu188. Other sites were all located near aspartic acid side chains. One platinum site was found at the cysteine bridge Cys293-Cys311 and another was located between two lysine residues. The only methionine accessible from the surface was also modified. The active site histidine His224, as well as Lys136, both bound the platinum compound. There was one outstanding mercurial site, with an occupancy four times higher than the next site, and the only site used in MIR phasing. It was bound to Tyr91, with the difference Fourier peak at a distance of 2.3 Å from the Cε2 atom. A similar site but with lower occupancy was found at Tyr234.
Fig. 13. Ramachandran plot of the current orthorhombic model. Two residues with unusual conformations are evident. Asn51 is located in a kinked helix, adding an extra residue to one of the turns. Ser105 has the typical conformation for the active site nucleophile found in lipases and α/β-hydrolases.
Quality of the model
The current model in the orthorhombic crystal form has been refined using data between 7.5 Å and 1.55 Å to an R-factor of 15.6 % for reflections with amplitudes above 2σ and 15.8 % for all measured reflections. The R-factor for all reflections in the resolution shell 1.58-1.55 Å is 21.9 %. All 317 residues can be seen in the map, although density for some atoms is lacking, particularly for a number of exposed side chains. A total of 2324 non-hydrogen protein atoms, 286 water molecules and two carbohydrate molecules have been included in the model; in total there are 2638 non-hydrogen atoms.
Fig. 14. Real-space fit and main chain temperature factor diagrams for the monoclinic models for both molecules in the asymmetric unit as a function of residue number. The scales on the left show the real-space correlation coefficient and the scales on the right show B-factor values in Å². (a) Molecule A at pH 3.6. (b) Molecule B at pH 3.6. (c) Molecule A at pH 5.5. (d) Molecule B at pH 5.5. Part of the proposed lid region, helix α5, lacks continuous density in the structure of molecule B at high pH. In this molecule, density for a lipid molecule in the active site can be seen for the low pH structure. This density disappears in the structure determined at pH 5.5.
In the monoclinic crystal form, two models have been refined against the data sets collected at pH 3.6 and 5.5, respectively. The model representing the low pH structure has been refined to 2.1 Å, with an R-factor of 19.0 % for all reflections and consists of two protein chains with the complete sequence, 470 waters, two carbohydrates and two detergent molecules, with a total of 5186 non-hydrogen atoms. The high pH model has been refined to 2.5 Å, to an R-factor of 20.1 % for all reflections. One detergent molecule, two carbohydrates and 159 waters have been included in this model.
No non-glycine residues fall in the disallowed region of the Ramachandran plot (Fig. 13), but two fall in the generously allowed region as defined by PROCHECK [50]. One is Ser105 in the catalytic triad, which is part of a tight turn, found in all lipases and α/β-hydrolases at this position. The other is Asn51, located in a kink of a helix. There are two residues, Ser195 and Val306, that have high (> 2.5 Å) peptide flip values [46]. Serine 195 is in the middle of a nine residue long surface loop, where the ends of the loop are hydrogen bonded by main chain atoms. Valine 306 forms a β-bulge at the beginning of the carboxy-terminal β-hairpin.
Table 3. Summary of refinement.

                              Orthorhombic    Monoclinic crystal form
                              crystal form    pH 3.6        pH 5.5
Resolution of data (Å)        7.5-1.55        7.5-2.1       7.5-2.5
R-factorᵇ (%)                 15.8            19.0          20.1
Non-hydrogen protein atoms    2324            4648          4648
Water molecules               286             470           159
Deviations from idealityᵃ
  Bond lengths (Å)            0.007           0.006         0.006
  Bond angles (°)             1.1             0.9           1.3
  Dihedrals (°)               24.3            24.1          24.1
  Impropers (°)               0.9             0.8           1.2
Average B-factors
  Main chain atoms            8.7             19.2          23.5
  All protein atoms           9.7             19.9          24.7
  Water                       34.1            46.2          43.2

ᵃValues from X-PLOR. Parameters from the Cambridge data base of small molecule structures [49] were used for the bond lengths and bond angles. ᵇ$R\text{-factor} = \sum |F_{obs} - F_{calc}| / \sum F_{obs}$, where $F_{obs}$ and $F_{calc}$ are the amplitudes of the observed and the calculated structure factors.

The two molecules in the asymmetric unit of the monoclinic crystals are related by an almost perfect two-fold rotation. For the low pH structure the rotation is 179.56°, with a translation along the axis of 0.15 Å. The direction cosines for the rotation axis are (0.798, 0.587, 0.135). For the high pH structure, the rotation is 179.41°, the translation 0.30 Å and the direction cosines (0.800, 0.585, 0.130). These calculations were made with the program COORD2 (J Deisenhofer, unpublished program).
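The rotation angle, direction cosines and translation along the axis quoted above can all be derived from a superposition operator x' = Rx + t. The sketch below shows one way to do this; it is not the COORD2 implementation, which is unpublished, and the function names are illustrative. Because the rotation is close to 180°, the axis is taken from the symmetric part of R, which remains well conditioned there, and only its sign is taken from the antisymmetric part.

```python
import math

def rotation_from_axis_angle(axis, angle_deg):
    """Rodrigues' formula: 3x3 rotation matrix for an axis and an angle in degrees."""
    norm = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / norm for a in axis)
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    k = 1.0 - c
    return [
        [c + x * x * k, x * y * k - z * s, x * z * k + y * s],
        [y * x * k + z * s, c + y * y * k, y * z * k - x * s],
        [z * x * k - y * s, z * y * k + x * s, c + z * z * k],
    ]

def axis_angle_and_screw(R, t):
    """Return (angle in degrees, direction cosines, translation along the axis).

    Not defined for the identity rotation, where the axis is arbitrary.
    """
    cos_a = max(-1.0, min(1.0, (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0))
    angle = math.acos(cos_a)
    # R + R^T = 2 cos(a) I + 2 (1 - cos(a)) n n^T, so n n^T follows directly.
    denom = 2.0 * (1.0 - cos_a)
    nn = [[(R[i][j] + R[j][i] - (2.0 * cos_a if i == j else 0.0)) / denom
           for j in range(3)] for i in range(3)]
    k = max(range(3), key=lambda i: nn[i][i])  # largest axis component
    nk = math.sqrt(nn[k][k])
    axis = [nn[k][j] / nk for j in range(3)]
    # The antisymmetric part equals 2 sin(a) n and fixes the sign of the axis.
    anti = [R[2][1] - R[1][2], R[0][2] - R[2][0], R[1][0] - R[0][1]]
    if sum(a * n for a, n in zip(anti, axis)) < 0:
        axis = [-n for n in axis]
    screw = sum(ti * ni for ti, ni in zip(t, axis))
    return math.degrees(angle), axis, screw
```

A translation along the axis of only a few tenths of an ångström, combined with an angle within a degree of 180°, is what makes the relation between the two molecules an almost perfect two-fold.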
The rmsd after superposition of the two molecules in the asymmetric unit of the low pH monoclinic model is 0.18 Å for Cα atoms and 0.32 Å for all non-hydrogen atoms. The rmsd after superposition of the orthorhombic model and molecule A in the low pH monoclinic model is 0.24 Å for Cα atoms. The rmsds between models of the high and low pH forms are 0.12 and 0.15 Å for the Cα atoms of molecules A and B, respectively.
The temperature factor and real space fit diagrams reveal that most of the structure is well ordered and shows low mobility (Figs 10 and 14). The solvent exposed loop region from 242 to 268 has a higher average main chain temperature factor than the rest of the structure, indicating higher mobility. Many of the residues that lack side chain density are located in this region. The following helix, α10, also has a high average temperature factor. The possible movement of this helix is of great interest since it makes up a large portion of the active site pocket. In the high pH monoclinic crystal form, the high mobility of the α5 region is clearly manifested by a low real space fit and high temperature factors for molecule B. The refinement statistics for the models are summarized in Table 3.
The coordinates for the three models have been deposited at the Protein Data Bank. The DNA sequence has been deposited at the EMBL sequence data base with accession number Z30645.
Acknowledgments- We wish to thank Dr Fritz Winkler for providing us with the coordinates for the closed form of human pancreatic lipase, Dr Christian Cambillau for the coordinates of pancreatic
lipase-procolipase complex and Dr David Lawson for the coordinates
of the phosphonate inhibited form of Rhizomucor miehei lipase. This
investigation was carried out with financial support from Nordisk Industrifond and the Swedish Natural Science Research Council. The
expert technical assistance by Ms Inge Hoegh with the cloning and
DNA sequencing is gratefully acknowledged. We wish to thank Dr
Morten Kjeldgaard for his valiant efforts in tracing a nasty bug in
qoplot, which allowed us to make the coloured figures. The constructive comments of Dr C Cambillau are gratefully acknowledged.
References
1. Rogalska, E., Cudrey, C., Ferrato, F. & Verger, R. (1993). Stereoselective hydrolysis of triglycerides by animal and microbial lipases. Chirality 5, 24-30.
2. Brady, L., et al., & Brzozowski, A.M. (1990). A serine protease triad forms the catalytic centre of a triglyceride lipase. Nature 343, 767-770.
3. Winkler, F.K., D'Arcy, A. & Hunziker, W. (1990). Structure of human pancreatic lipase. Nature 343, 771-774.
4. Schrag, J.D., Li, Y., Wu, S. & Cygler, M. (1991). Ser-His-Glu forms the catalytic site of a lipase from Geotrichum candidum. Nature 351, 761-764.
5. Grochulski, P., et al., & Li, Y. (1993). Insights into interfacial activation from an open structure of Candida rugosa lipase. J. Biol. Chem. 268, 12843-12847.
6. Noble, M.E.M., Cleasby, A., Johnson, L.N., Egmond, M.R. & Frenken, L.G.J. (1993). The crystal structure of triacylglycerol lipase from Pseudomonas glumae reveals a partially redundant catalytic aspartate. FEBS Lett. 331, 123-128.
7. Wright, C.S., Alden, R.A. & Kraut, J. (1969). Structure of subtilisin BPN' at 2.5 Å resolution. Nature 221, 235-242.
8. Blow, D.M., Birktoft, J.J. & Hartley, B.S. (1969). Role of buried acid group in the mechanism of action of chymotrypsin. Nature 221, 337-340.
9. Boel, E., Huge-Jensen, B., Christensen, M., Thim, L. & Fiil, N.P. (1988). Rhizomucor miehei triglyceride lipase is synthesized as a precursor. Lipids 23, 701-706.
10. Brzozowski, A.M., et al., & Derewenda, U. (1991). A model for interfacial activation in lipases from the structure of a fungal lipase-inhibitor complex. Nature 351, 491-494.
11. van Tilbeurgh, H., Egloff, M.-P., Martinez, C., Rugani, N., Verger, R. & Cambillau, C. (1993). Interfacial activation of the lipase-procolipase complex by mixed micelles revealed by X-ray crystallography. Nature 362, 814-820.
12. Ollis, D.L., et al., & Cheah, E. (1992). The α/β-hydrolase fold. Protein Eng. 5, 197-211.
13. Martinez, C., DeGeus, P., Lauwereys, M., Matthyssens, G. & Cambillau, C. (1992). Fusarium solani cutinase is a lipolytic enzyme with a catalytic serine accessible to solvent. Nature 356, 615-618.
14. Hjorth, A., et al., & Carrière, F. (1993). A structural domain (the lid) found in pancreatic lipases is absent in the guinea pig (phospho)lipase. Biochemistry 32, 4702-4707.
15. Michiyo, M. (1989). Purification of a thermostable, nonspecific lipase from Candida and its use in transesterification. WO Patent 8802775, 1986. Chem. Abstr. 110, 20529.
16. Heldt-Hansen, H.P., Ishii, M., Patkar, S.A., Hansen, T.T. & Eigtved, P. (1989). Biocatalysis in agricultural biotechnology. In ACS Symposium Series 389. (Whitaker, J.R. & Sonnet, P.E., eds), pp. 157-172.
17. Frykman, H., Öhrner, N., Norin, T. & Hult, K. (1993). S-ethyl thiooctanoate as acyl donor in lipase catalysed resolution of secondary alcohols. Tetrahedron Lett. 34, 1367-1370.
18. Mattson, A., Öhrner, N., Hult, K. & Norin, T. (1993). Resolution of diols with C2-symmetry by lipase catalysed transesterification. Tetrahedron Asymm. 4, 925-930.
19. Partali, V., Waagen, V., Alvik, T. & Anthonsen, T. (1993). Enzymatic resolution of butanoic esters of 1-phenylmethyl and 1-[2-phenylethyl] ethers of 3-chloro-1,2-propanediol. Tetrahedron Asymm. 4, 961-968.
20. Adelhorst, K., Björkling, F., Godtfredsen, S. & Kirk, O. (1990). Enzyme catalyzed preparation of 6-O-acylglucopyranosides. Synthesis 2, 112-115.
21. Uppenberg, J., Patkar, S.A., Bergfors, T. & Jones, T.A. (1994). Crystallization and preliminary X-ray studies of lipase B from Candida antarctica. J. Mol. Biol. 235, 790-792.
22. Richardson, J.S. (1981). The anatomy and taxonomy of protein structure. Adv. Protein Chem. 34, 167-399.
23. Cygler, M., Schrag, J.D. & Ergan, F. (1992). Advances in structural understanding of lipases. Biotech. Genet. Eng. Rev. 10, 143-184.
24. Franken, S.M., Rozeboom, H.J., Kalk, K.H. & Dijkstra, B. (1991). Crystal structure of haloalkane dehalogenase; an enzyme to detoxify halogenated alkanes. EMBO J. 10, 1297-1302.
25. Sussman, J.L., et al., & Harel, M. (1991). Atomic structure of acetylcholinesterase from Torpedo californica: a prototypic acetylcholine-binding protein. Science 253, 872-879.
307
308

26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.
38.
39.
40.
41.
Carter, P. & Wells, JA (1990). Functional interaction among

catalytic residues in subtilisin BPN'. Proteins 7, 335-342.
Martinez, C., et al, & Nicolas, A (1994). Cutinase, a lipolytic enzyme with a preformed oxyanion hole. Biochemistry 33,
83-89.
Liao, D.-I. & Remington, SJ. (1990). Structure of wheat serine carboxypeptidase II at 3.5A resolution. J Biol Chem 256,
6528-6531.
Derewenda, U., et al, & Derewenda, Z.S. (1994). An unusual
buried polar cluster in a family of fungal lipases. Nature Struct.
BioL 1, 36-47.
Yelton, M.M., Hamer, J.E. & Timberlake, W.E. (1984). Transformation of Aspergillus nidulans by using trpC plasmid. Proc.
Nat Acad Sci USA 81, 1470-1474.
Sambrook, J., Fritsch, E.F. & Maniatis, T. (1989). Molecular
Cloning. Cold Spring Harbor Press, New York.
Boel, E., Hjort, I., Svensson, B., Norris, F., Norris, K.E. & Fill,
N.P. (1984). Glucoamylases G1 and G2 from Aspergillus niger
are synthesized from two different but closely related mRNAs.
EMBO J 3, 1097-1102.
Sanger, F., Nicklen, S. & Coulson, A (1977). DNA sequencing
with chain-terminating inhibitors. Proc Natl Acad Sci USA
74, 5463-5467.
Patkar, SA, et al, & Bjorkling, F. (1993). Purification of two
lipases from Candida antarcticaand their inhibition by various
inhibitors. Ind of Chem 32 B, 76-80.
McPherson, A (1982). Crystallization. In Preparationand Analysis of Protein Crystals pp. 96-97, J. Wiley & Sons Inc., New
York.
Hamlin, R. (1985). Multiwire area X-ray diffractometers. In Metb
ods in Enzymology. (Wyckoff, H. W., Hirs, C. H. W. &Timasheff, S. N. eds), Academic Press, London. pp. 416-452.
Messerschmidt, A. & Pflugrath, J.W. (1987). Crystal orientation
and X-ray pattern prediction routines for area-detector diffractometer systems in macromolecular crystallography. J. AppZ
Crystallogr. A 30, 306-315.
Kabsch, W. (1988). Evaluation of single-crystal X-ray diffraction
data from a position-sensitive detector. J. Appl Crystallogr. A
21, 916-924.
CCP4 (1979). The SERC (UK) collaborative computing project
no. 4. A suite of programs for protein crystallography, distributed from Daresbury Laboratory, Warrington, WA4 4AD, UK.
Otwinowski, Z. (1988). DENZO. A Programfor Automatic Evaluation of Film Densities Department of Molecular Biophysics
and Biochemistry, Yale University, New Haven, CT.
Otwinowski, Z. (1991). Isomorphous replacement and anomalous scattering. In Proceedings of the CCP4 Study Weekend
pp. 80-86, Daresbury Laboratory, Warrington, UK.
42.
43.
44.
45.
46.
47.
48.
49.
50.
51.
52.
53.
54.
Zhang, K.Y.J. & Main, P. (1990). The use of Sayre's equation
with solvent flattening and histogram matching for phase extension and refinement of protein structures. Acta Crystallogr.
A 46, 377-381.
Jones, T.A. & Kjeldgaard, M. (1992). O - The Manual. Uppsala, Sweden.
Jones, T.A. (1992). a, yaap, asap, @#*? A set of averaging
programs. In Molecular Replacement. Proceedings of the CCP4
Study Weekend pp. 99-105, SERC, Daresbury Laboratory, Warrington, UK.
Jones, T.A. & Thirup, S. (1986). Using known substructures
in protein model building and crystallography. EMBO J 5, 819-822.
Jones, T.A., Zou, J.Y., Cowan, S.W. & Kjeldgaard, M. (1991).
Improved methods for building protein models in electron
density maps and the location of errors in these models. Acta
Brünger, A.T. & Krukowski, A. (1990). Slow-cooling protocols
for crystallographic refinement by simulated annealing. Acta
Brünger, A.T. (1990). Extension of molecular replacement: a
new search strategy based on Patterson correlation refinement.
Acta Crystallogr. A 46, 46-57.
Engh, R.A. & Huber, R. (1991). Accurate bond and angle parameters for X-ray protein structure refinement. Acta Crystallogr. A 47, 392-400.
Morris, A.L., MacArthur, M.W., Hutchinson, E.G. & Thornton,
J.M. (1992). Stereochemical quality of protein structure coordinates. Proteins 12, 345-364.
von Heijne, G. (1986). A new method for predicting signal
sequence cleavage sites. Nucleic Acids Res. 14, 4683-4690.
Julius, D., Brake, A., Blair, L., Kunisawa, R. & Thorner, J.
(1984). Isolation of the putative structural gene for the lysine-arginine-cleaving endopeptidase required for processing of
yeast prepro-α-factor. Cell 37, 1075-1089.
Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and
geometrical features. Biopolymers 22, 2577-2637.
Kleywegt, G.J. & Jones, T.A. (1994). Detection, delineation,
measurement and display of cavities in macromolecular structures. Acta Crystallogr. D 50, in press.
Received: 9 Feb 1994; revisions requested: 18 Feb 1994;
revisions received: 7 Mar 1994. Accepted: 7 Mar 1994.

Structure 15 April 1994, 2:293-308

Key words: Candida antarctica, crystal structure, lipase, sequence, X-ray
