Vietor1
Karim Mazeau2
Miles Lakin2
Serge Perez1,2
1 Ingénierie Moléculaire, Institut National de la Recherche Agronomique, Rue de la Géraudière, BP 71627, 44316 Nantes Cedex, France
2 Centre de Recherches sur les Macromolécules Végétales,* Centre National de la Recherche Scientifique, BP53, 38041 Grenoble Cedex, France
Received 14 May 1999; accepted 29 March 2000
Abstract: The packing of β-1,4-glucopyranose chains has been modeled to further elaborate the molecular structures of native cellulose microfibrils. A chain pairing procedure was implemented that evaluates the optimal interchain distance and energy for all possible settings of the two chains. Starting with a rigid model of an isolated chain, its interaction with a second chain was studied at various helix-axis translations and mutual rotational orientations while keeping the chains at van der Waals separation. For each setting, the sum of the van der Waals and hydrogen-bonding energy was calculated. No energy minimization was performed during the initial screening, but the energy and interchain distances were mapped to a three-dimensional grid, with evaluation of parallel settings of the cellulose chains. The emergence of several energy minima suggests that parallel chains of cellulose can be paired in a variety of stable orientations. A further analysis considered all possible parallel arrangements occurring between a cellulose chain pair and a further cellulose chain. Among all the low-energy three-chain models, only a few yield closely packed three-dimensional arrangements. From these, unit-cell dimensions as well as lattice symmetry were derived; interestingly, two of them correspond closely to the observed allomorphs of crystalline native cellulose. The most favorable structural models were then optimized using a minicrystal procedure in conjunction with the MM3 force field. The two best crystal lattice predictions were for a triclinic (P1) and a monoclinic (P2₁) arrangement with unit cell dimensions a = 0.63, b = 0.69, c = 1.036 nm, α = 113.0°, β = 121.1°, γ = 76.0°, and a = 0.87, b = 0.75, c = 1.036 nm, γ = 94.1°, respectively. They correspond closely to the respective lattice symmetry and unit-cell dimensions that have been reported for cellulose Iα and cellulose Iβ allomorphs. The suitability of the modeling protocol is endorsed by the agreement between the predicted and experimental unit-cell dimensions. The results provide pertinent information toward the construction of macromolecular models of microfibrils. © 2000 John Wiley & Sons, Inc. Biopoly 54: 342–354, 2000
Keywords: β-1,4-glucopyranose chains; packing; molecular structure; native cellulose microfibrils; crystal structure prediction
Correspondence to: Serge Perez; email: perez@cermav.cnrs.fr
Contract grant sponsor: INRA, CARENET-2, and CNRS
* Associated with University Joseph Fourier, Grenoble.
Biopolymers, Vol. 54, 342–354 (2000) © 2000 John Wiley & Sons, Inc.
INTRODUCTION
Many recent advances in the theory and application of molecular modeling to the structural elucidation of carbohydrates and carbohydrate polymers have produced a wide range of useful results.1–4 In combination with experimental methods, computer modeling has become an integral part of the strategy for revealing three-dimensional structures, both in solution and in the condensed phase. Nevertheless, the realm of carbohydrate modeling has tended to emphasize intramolecular rather than intermolecular aspects. When dealing with materials in condensed phases, the modeling technique can be combined with information derived from electron and fiber diffraction to enable quantitative solution of the three-dimensional crystalline structure.5 In addition to rationalizing why the observed crystalline arrangement is the preferred form, a further goal is the prediction of all stable three-dimensional organizations accessible to the polysaccharide in a given conformation. It would also be desirable to extend this predictive methodology to less ordered systems such as gels, where chain–chain interactions may occur to promote the formation of the so-called junction zones.
These topics require the development of general rules for analyzing the stability of certain interhelix arrangements. Several authors have proposed methods for investigating the interhelix structure and energy through nonbonded forces.6–15 These procedures involve a minimization of the interhelix energy. In contrast, we have developed a method where the helices are positioned so as to allow contact but not interpenetration of the van der Waals surfaces of the two helices. After the helices are placed at the position of van der Waals contact for a given helix–helix rotation and translation, the energy is calculated. This procedure takes considerably less computer time than methods involving energy minimization, and has been successfully applied to synthetic polymers16 and to the polysaccharides chitin and starch.17 In this latter example, the structure predicted to be most stable corresponds to a duplex of parallel double helices, as found in both the crystalline A and B allomorphs.18,19 From these results, an explanation of the transition from the A to B allomorph has been proposed.17
Native cellulosic materials are organized into microfibrils in which crystalline domains coexist with amorphous zones. Little is known about the ultrastructures of the amorphous zones. Incidentally, the detailed crystal structure of native cellulose (cellulose I) is still a matter of debate despite more than 70 years of research effort. X-ray fiber diffraction experiments initially led to models based on two-chain or eight-chain unit cells depending upon the source of the sample,20,21 with the eight-chain unit cell invoked to account for weak signals in the diffraction pattern. Later experiments using cross-polarization/magic angle spinning proton NMR indicated the presence of two allomorphs in samples of native cellulose, designated Iα and Iβ.22–24 Subsequent results from electron diffraction25 verified the presence of these two allomorphs and provided data on crystal symmetry and unit cell size for each allomorph. The Iα allomorph crystallizes in the triclinic P1 space group and contains one cellobiose unit per unit cell with a parallel arrangement of the chains, as would be expected. The Iβ form crystallizes in the monoclinic space group P2₁ with two cellobiose moieties per unit cell. Since the two allomorphs are found within the same microfibril, parallel packing for Iβ is the inescapable conclusion. The chain repeat length is found to be invariant at 1.036 nm.
Building realistic macromolecular models of cellulosic microfibrils starting solely with information derived from the fiber repeat distance is still a difficult task. One needs to predict both crystalline allomorphic phases of cellulose together with less ordered regions, which could occur in the amorphous phase. This requires an exhaustive exploration of the low-energy three-dimensional arrangements of cellulose chains. The present work assesses the feasibility of this generation methodology. A very important aspect of the work is the adequacy of the proposed models. Therefore, prior to the generation of densely packed chains, one needs a proper validation of the method used; this can be achieved by a careful prediction of the two allomorphs of cellulose I. In addition to refining the methodology for predicting solid state polymorphism, the study should yield the crystal structure of the two cellulose I allomorphs, and provide some insight into the nature of each form and into transitions between them.
Vietor et al.
COMPUTATIONAL METHOD
Nomenclature
The atom coordinates for the glucose residue used in this study were taken from the MONOBANK database of monosaccharide structures.26 The atom numbering and the angle definitions used are shown in Figure 1. The relative orientation of two glucose residues was determined by three angles: the glycosidic bond angle τ, defined by the atoms C1–O4–C4, and the two torsion angles Φ (O5–C1–O1–C4) and Ψ (C1–O1–C4–C5). In addition, the torsion angle ω (O5–C5–C6–O6) was used to describe the position of the primary hydroxyl group.
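Table I  Conformational Parameters of the Generated Chains

Chain                        τ (°)    Φ (°)    Ψ (°)    c (nm)
Helix                        117.5    90       153      1.036
Translational (linkage 1)    117.5    77       141      1.036
Translational (linkage 2)    117.5    102      164      1.036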
FIGURE 2 Rigid residue potential energy surface of cellobiose. Iso-energy contours are shown at 1 kcal/mol intervals relative to the global minimum. Superimposed upon these contours are those calculated for the helical parameters n = 2 and h = 0.518 nm. Where the cellulose chain adopts true 2₁ helical symmetry, (Φ, Ψ) must assume the values (90°, 153°). This requirement may be relaxed if the glycosidic torsion angles alternate between (Φ, Ψ)1 = (77°, 141°) and (Φ, Ψ)2 = (102°, 164°).
glycosidic bond angle of 117.5°. A typical potential energy surface is represented in Figure 2.
Strict helical symmetry of a macromolecular chain requires that equivalent chemical units occupy equivalent positions about the molecular axis. Such model chains were prepared with glycosidic torsion angles falling at minima on the potential energy surface and generating helices having n = 2 and h = 0.518 nm. This type of chain will be referred to as helical, with τ = 117.5°, Φ = 90°, and Ψ = 153°.
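As a quick consistency check on these helical parameters (a worked step added here for illustration, not part of the original text), the fiber repeat c follows from the number of residues per turn n and the rise per residue h:

```latex
\[
  c = n\,h = 2 \times 0.518\ \text{nm} = 1.036\ \text{nm}
\]
```

This agrees with the chain repeat length of 1.036 nm used throughout for native cellulose.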
Cellulose chains exhibiting a repeat distance of 1.036 nm can also be constructed by regular alternation between two sets of glycosidic torsion angles. Such chains, without internal symmetry, will be referred to as translational. They were obtained by setting one glycosidic linkage to one of the minimum energy conformations determined with PFOS, with the next linkage manipulated until translational symmetry with the required period of 1.036 nm was obtained between residues i and (i + 2). Relative chain energies for the conformations obtained in this way were estimated by averaging the energies determined with PFOS for the individual sets (Φ, Ψ). The chain having the lowest energy was selected for further study. Conformational parameters for the generated chains are collected in Table I. Coordinates of both cellulose helices are available upon request from the authors.
FIGURE 3 Interhelical parameters used to define the geometric orientation of the two parallel cellulose chains (A and B): chain rotations μA and μB, interchain contact distance Δx, and longitudinal offset Δz.
requires a set of four interhelical parameters: μA, a rotation of chain A about the helical axis from 0° to 360°; μB, a rotation of chain B about its axis from 0° to 360°; and Δx and Δz, which are taken as positive and represent positional shifts normal and parallel to the identity axis, respectively. Δz is bounded between 0 and t (the fiber repeat). The spatial description of these parameters is shown in Figure 3.
The minimum energy arrangement of the two polymer chains with respect to a Δx displacement will tend to bring the molecules as close as possible without interpenetration of van der Waals radii. In reality, a small amount of repulsive energy resulting from interpenetration of some atom pairs can be compensated by additional attractions from the remaining atom pairs. However, nonbonded contact distances deviate by only about 10% (0.02–0.03 nm) in molecular solids.28 The contacting procedure28 involves describing the surface of each chain by circumscribing a hard sphere of
Using this contacting procedure for chain–chain construction, the search space is reduced to three geometric variables. The resulting interchain interaction energy (E_AB) can then be calculated to the required degree of approximation.
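The idea of bringing two chains into hard-sphere contact can be illustrated with a small sketch. It is not the CHACHA code: the function name `contact_offset`, the coordinate layout (chain B displaced along +x), and the radii in the usage example are assumptions made for illustration only. For fixed rotations and a fixed longitudinal shift, each atom pair that could collide imposes a minimum shift of chain B, and the contact distance is the largest of these.

```python
import math

def contact_offset(chain_a, chain_b, radii_a, radii_b):
    """Smallest shift of chain B along +x at which no hard spheres overlap.

    chain_a, chain_b: lists of (x, y, z) atom positions in nm.
    radii_a, radii_b: hard-sphere radii in nm (hypothetical values).
    Returns -inf if no atom pair can ever touch along x.
    """
    best = -math.inf
    for (xa, ya, za), ra in zip(chain_a, radii_a):
        for (xb, yb, zb), rb in zip(chain_b, radii_b):
            lateral2 = (ya - yb) ** 2 + (za - zb) ** 2
            reach2 = (ra + rb) ** 2
            if lateral2 < reach2:
                # Shift at which this pair is exactly in contact.
                touch = xa - xb + math.sqrt(reach2 - lateral2)
                best = max(best, touch)
    return best

# Two atoms of radius 0.17 nm, offset by 0.2 nm along the chain axis.
dx = contact_offset([(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.2)], [0.17], [0.17])
```

Because the offset is obtained in closed form for each rotation and translation setting, only the remaining three variables need to be scanned.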
For the simulation of cellulose, the interaction energy of the two chains was considered to be the sum of all pairwise atom–atom interactions and was calculated using a 6–12 potential function,29,30 with an additional term to cover the stabilization arising from the interchain hydrogen bonding. This was based on the distance between the oxygen atoms that can interact through hydrogen bonding (0.25–0.30 nm) without recourse to the hydroxyl hydrogens.
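A minimal sketch of such an energy sum is given below. The 6–12 form and the 0.25–0.30 nm oxygen–oxygen window follow the description above; everything else is an assumption for illustration: the function name `pair_energy`, the single well depth `epsilon` and contact distance `r_min` used for all atom types, the fixed stabilization `hb_energy` per qualifying O...O pair, and the treatment of every oxygen pair as a potential hydrogen bond. The published parameters (refs. 29, 30) are not reproduced here.

```python
import math

def pair_energy(chain_a, chain_b, epsilon=0.1, r_min=0.34,
                hb_energy=-2.0, hb_range=(0.25, 0.30)):
    """Return (E_vdW, E_HB) for two chains given as [(element, (x, y, z)), ...].

    Distances are in nm; energies are in the units of epsilon and hb_energy
    (hypothetical values, not the published force-field parameters).
    """
    e_vdw = 0.0
    e_hb = 0.0
    for elem_a, pos_a in chain_a:
        for elem_b, pos_b in chain_b:
            r = math.dist(pos_a, pos_b)
            ratio6 = (r_min / r) ** 6
            # 6-12 potential with minimum -epsilon at r = r_min.
            e_vdw += epsilon * (ratio6 ** 2 - 2.0 * ratio6)
            # Distance-only hydrogen-bond term between oxygen atoms.
            if elem_a == "O" and elem_b == "O" and hb_range[0] <= r <= hb_range[1]:
                e_hb += hb_energy
    return e_vdw, e_hb

# One O...O pair at 0.28 nm falls inside the hydrogen-bonding window.
e_vdw, e_hb = pair_energy([("O", (0.0, 0.0, 0.0))], [("O", (0.28, 0.0, 0.0))])
```

The total interchain energy in this sketch is simply `e_vdw + e_hb`, mirroring the sum of van der Waals and hydrogen-bonding contributions reported in the tables.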
The CHACHA program17 was used to map Δx and E_AB as a function of the structural variables μA, μB, and Δz. The analysis was performed by incrementing μA and μB over the whole angular range in steps of a few degrees, and the relative translation (Δz) between the two chains was studied over the length of the whole fiber repeat, typically in increments of 0.01–0.05 nm (i.e., about h/20). The rotations μA and μB were treated both as independent and as coupled (μA = μB). For each setting of the chains as a function of μA, μB, and Δz, the magnitude of the perpendicular offset Δx was derived according to the contact procedure described above. The value of the energy E_AB was then computed. The mapping procedure was used to search for low-energy regions. In order to pinpoint the energy minima, regions containing a local minimum were searched a second time using intervals of 1° (for μ) and 0.005 nm (for Δz). This procedure provided a complete overview of the symmetry (or lack of symmetry) of the chain–chain interactions. The set of interhelical parameters relates to the lattice symmetry that characterises the three-dimensional organisation as follows:
FIGURE 4 Building up crystalline structures. The calculations for three chains gave the relative positions of the chains in stable triplet interactions. For triclinic lattices, the sides a and b, the unit cell angles α, β, and γ, the projection of γ on the xy plane (γ*), and the lattice volume per cellobiose unit can be calculated from the parameters for the stable configurations using Eqs. (1)–(7). For the monoclinic lattices, Eqs. (8), (9), and (10) were used instead of (1), (2), and (6), respectively, to calculate a, b, and γ.
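\[ a = \sqrt{\Delta x_1^2 + \Delta z_1^2} \tag{1} \]
\[ b = \sqrt{\Delta x_2^2 + \Delta z_2^2} \tag{2} \]
\[ \tan\alpha = \frac{\Delta z_2}{\Delta x_2} \tag{3} \]
\[ \tan\beta = \frac{\Delta z_1}{\Delta x_1} \tag{4} \]
\[ \gamma^* = 180^\circ - \mu_{1,A} + \mu_{2,A} \tag{5} \]
\[ \cos\gamma = \cos\alpha\,\cos\beta + \sin\alpha\,\sin\beta\,\cos\gamma^* \tag{6} \]
\[ V = \Delta x_1\,\Delta x_2\,c\,\sin\gamma^* \tag{7} \]
\[ a = \sqrt{\Delta x_2^2 + (2\Delta x_1)^2 - 4\,\Delta x_1\,\Delta x_2\,\cos\gamma^*} \tag{8} \]
\[ b = \Delta x_2 \tag{9} \]
\[ \sin(180^\circ - \gamma) = \frac{2\,\Delta x_1\,\sin\gamma^*}{a} \tag{10} \]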
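The triclinic relations can be turned into a small helper, sketched below under stated assumptions. The function name `triclinic_cell` and its argument layout are hypothetical. Only the side lengths, the spherical-cosine relation for γ, and the volume per cellobiose unit are evaluated; α and β are taken as inputs in degrees rather than derived, so the sketch does not depend on how those two angles are obtained from the offsets.

```python
import math

def triclinic_cell(dx1, dz1, dx2, dz2, alpha, beta, gamma_star, c=1.036):
    """Return (a, b, gamma, volume) for a triclinic cell.

    dx*, dz*: perpendicular and longitudinal offsets of the two neighbours (nm).
    alpha, beta, gamma_star: angles in degrees; c: fiber repeat in nm.
    """
    a = math.hypot(dx1, dz1)  # side a from the first neighbour
    b = math.hypot(dx2, dz2)  # side b from the second neighbour
    al, be, gs = (math.radians(v) for v in (alpha, beta, gamma_star))
    # gamma from its projection gamma* on the xy plane
    cos_gamma = (math.cos(al) * math.cos(be)
                 + math.sin(al) * math.sin(be) * math.cos(gs))
    gamma = math.degrees(math.acos(cos_gamma))
    volume = dx1 * dx2 * c * math.sin(gs)  # volume in nm^3
    return a, b, gamma, volume

# With no longitudinal offsets and alpha = beta = 90, gamma equals gamma*.
a, b, gamma, volume = triclinic_cell(0.5, 0.0, 0.5, 0.0, 90.0, 90.0, 60.0)
```

The limiting case in the usage line is a convenient sanity check: when both neighbours sit at the same height as the central chain, the cell angle γ reduces to the in-plane angle γ*.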
Simplex Calculations
The CHACHA chain–chain algorithm allows three-chain arrays to be considered stable even where two of the chains do not interact. Such arrangements would lead to channel-like voids in the corresponding crystal lattice and were therefore excluded.
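The Simplex refinement applied to the retained three-chain arrangements starts from one vertex per variable, each obtained by scaling a single variable by 1.05. A minimal sketch of that start-up step follows; the function name `initial_simplex` and the example starting values are hypothetical, and the optimisation loop itself is not shown.

```python
def initial_simplex(start, factor=1.05):
    """Return n + 1 vertices: the start point plus one scaled copy per variable."""
    vertices = [list(start)]
    for i in range(len(start)):
        vertex = list(start)
        vertex[i] *= factor  # perturb one variable at a time
        vertices.append(vertex)
    return vertices

# Six free variables give seven vertices.
simplex = initial_simplex([0.50, 0.55, 77.0, 0.0, 0.27, 90.0])
```

One consequence of a multiplicative perturbation is visible in the example: a variable whose starting value is exactly zero is left unchanged, so its vertex coincides with the start point. An implementation would need an additive step for such variables.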
In order to confirm their viability, Simplex optimizations were performed on the CHACHA-derived three-chain arrangements. The interchain distances (Δx), the chain orientations (μ), the relative shifts (Δz), and the angle in the xy plane determined by the three-chain arrangement were varied (eight parameters). In order to retain the required translational symmetry for the triclinic cases, the rotation around the helical axis was held identical for the three chains. This gave 6 degrees of freedom in total. The Simplex was generated by multiplying the initial value of each variable in turn by 1.05 and using the resulting tuple as an additional vertex, giving 7 vertices in total. Simplex optimisation was continued until the minimum for the total interchain energy was reached. Both nonbonding and hydrogen-bonding interactions were taken into account for these energy calculations. Structures that showed large displacements or rotations were rejected.
RESULTS
Chain Construction
FIGURE 6 (a) Interchain potential energy surface as a function of coupled variation of μA and μB with the perpendicular offset Δx. Contours are drawn at intervals of 5 kcal/mol/cellobiose; (b) Interchain potential energy map at the optimum perpendicular offset Δx, as a function of the translation Δz along the chain direction and coupled rotations of μA and μB. Contours are drawn at 5 kcal/mol/cellobiose intervals.
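Maps of this kind come from evaluating the interchain energy on a grid of coupled rotation and translation values, followed by a finer search around the best grid point, as described in the Computational Method. The sketch below shows that coarse-then-fine pattern under stated assumptions: `two_stage_scan` and `frange` are hypothetical names, the default step sizes are illustrative choices within the ranges quoted in the text, and `energy` stands for any callable returning the energy at contact for a given rotation (degrees) and translation (nm). The fine grid may step slightly outside the coarse range, so `energy` is assumed to accept such values.

```python
def frange(start, stop, step):
    """Inclusive list of floats built from integer counts to limit drift."""
    count = int(round((stop - start) / step))
    return [start + i * step for i in range(count + 1)]

def two_stage_scan(energy, repeat=1.036, coarse_mu=5.0, coarse_dz=0.025,
                   fine_mu=1.0, fine_dz=0.005):
    """Return (E, mu, dz) at the lowest energy found by a coarse then fine scan."""
    coarse = [(energy(mu, dz), mu, dz)
              for mu in frange(0.0, 360.0 - coarse_mu, coarse_mu)
              for dz in frange(0.0, repeat, coarse_dz)]
    _, mu0, dz0 = min(coarse)
    # Re-scan one coarse cell on either side of the best coarse point.
    fine = [(energy(mu, dz), mu, dz)
            for mu in frange(mu0 - coarse_mu, mu0 + coarse_mu, fine_mu)
            for dz in frange(dz0 - coarse_dz, dz0 + coarse_dz, fine_dz)]
    return min(fine)

# Synthetic test surface with a single minimum at mu = 77, dz = 0.27 nm.
best_e, best_mu, best_dz = two_stage_scan(
    lambda mu, dz: (mu - 77.0) ** 2 + 1000.0 * (dz - 0.27) ** 2)
```

This sketch refines only the single best coarse point. To recover several minima, as in the tables of optimum values, the fine scan would be repeated around each local minimum of the coarse map.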
μB. This allowed for a straightforward two-dimensional study. The contour maps calculated as a function of the translation Δz along the fibre axis and the coupled rotation angles μA = μB are shown in Figure 6. Figure 6a is a representation of interchain energy in relation to coupled variations of μA and μB, with the perpendicular offset Δx. Figure 6b shows the interchain potential energy map at the optimum perpendicular offset Δx, as a function of the translation Δz along the chain direction and coupled rotations of μA
Table II  Optimum Values for the Chain–Chain Interactions for Coupled Rotations of Helical Cellulose Chains

O6 conformer   μA (°)   Δz (nm)   Δx (nm)   E(vdW) (kcal/mol/dis)   E(HB) (kcal/mol/dis)   E(Tot) (kcal/mol/dis)   Contacts
gg             77       0.00      0.481     7.24                    0                      7.24                    154
gg             77       0.520     0.488     6.42                    0                      6.42                    145
gg             85       0.226     0.501     5.50                    0                      5.50                    132
gg             86       0.219     0.503     5.47                    0                      5.47                    131
tg             77       0.00      0.482     7.03                    0                      7.03                    159
tg             60       0.274     0.468     6.17                    0                      6.17                    150
tg             96       0.324     0.544     6.89                    0                      5.89                    135
tg             58       0.000     0.477     5.60                    0                      5.60                    120
tg             88       0.270     0.522     5.18                    0                      5.18                    135
tg             74       0.520     0.494     5.07                    0                      5.07                    127
tg             50       0.179     0.487     4.97                    0                      4.97                    130
tg             111      0.000     0.635     4.91                    0                      4.91                    94
tg             9        0.269     0.642     4.57                    0                      4.57                    100
gt             126      0.279     0.669     5.61                    1.95                   7.56                    101
gt             169      0.000     0.809     3.35                    3.95                   7.30                    62
gt             77       0.000     0.480     6.11                    0                      6.11                    140
gt             101      0.219     0.550     5.68                    0                      5.68                    139
gt             61       0.279     0.468     5.65                    0                      5.65                    138
gt             77       0.520     0.488     5.39                    0                      5.39                    138
gt             58       0.00      0.477     5.13                    0                      5.13                    114
gt             88       0.209     0.508     5.08                    0                      5.08                    119
gt             9        0.264     0.646     4.76                    0                      4.76                    100
Packing of Translational Cellulose Chains
Two-chain Arrangements. Stable two-chain and three-chain arrangements were determined and selected as described for the helical chains. Since the translational chains do not contain an internal 2₁ axis, only coupled rotations needed to be considered. Hydrogen bonding was not taken into account at this stage. For the stable configurations the energies found were higher than for the helical chains. Also, the distance between the chains was larger, and the number of chain–chain contacts lower.
Three-chain Arrangements. No stable 3-chain arrangements that resulted in a viable space-filling lattice compatible with a monoclinic lattice could be determined. Although several arrangements compatible with a triclinic lattice were found, these resulted in rather large unit-cell volumes compared to those obtained for the helical chains.
FIGURE 7 (a) Chain : chain-pair potential energy surface as a function of chain rotation μ and the perpendicular offset Δx. Contours are drawn at 5 kcal/mol/cellobiose intervals above the global minimum; (b) Chain : chain-pair potential energy map at the optimum perpendicular offset Δx, as a function of the translation Δz along the chain direction and coupled chain rotations μ. Contours are drawn at 5 kcal/mol/cellobiose intervals above the global minimum.
Final Selection
Allomorphic Transitions
Comparison of the triclinic and monoclinic models
indicates that the interchain distances are quite simi-
Table III  Chain Conformation Parameters after MM3 Seven-Chain Minicrystal Minimization (see Table I for Original Values)

Lattice            τ (°)    Φ (°)   Ψ (°)   ω (°)   ω′ (°)   c (nm)   Lattice Energy (kJ/mol)   Change^a (nm)
Triclinic
  Helix            116      92      149     170              1.036    20.0                      0.0114
  Translational    115.3    85      136     169              1.036    ND^b                      0.0562
Monoclinic
  Helix chain A    116.1    91      157     63      61       1.036    15.7                      0.0136
  Helix chain B    115.2    92      149     66      66       1.036    18.0                      0.0095

a Root mean square of displacement of the non-hydrogen atoms of the central cellobiose unit (except O6 and O6′) after fitting to the original conformation.
b Lattice energy could not be determined due to the large lattice deformation.
CONCLUSION
The present work has established a computational
procedure to predict the different ways that a polysaccharide chain of known conformation is able to
interact with other chain-like molecules. The procedure has been applied to cellulose, for which stable
parallel chain pairings have been generated.
Few of these arrangements are capable of generating an efficiently packed three-dimensional array, but
REFERENCES
1. Perez, S.; Kouwijzer, M. L. C. E.; Mazeau, K.; Engelsen, S. B. E. J Mol Graphics 1997, 14, 307–321.
2. O'Sullivan, A. C. Cellulose 1997, 4, 173–207.
3. Kroon-Batenburg, L. M. J.; Kroon, J. Glycoconjugate J 1997, 14, 677–690.
4. Kroon-Batenburg, L. M. J.; Bouma, B.; Kroon, J. Macromolecules 1996, 29, 5695–5699.
5. Perez, S. Methods Enzymol 1991, 203, 510–556.
6. Aabloo, A.; French, A. D. Macromol Theory Simul 1994, 3, 185–191.
7. Aabloo, A.; French, A. D.; Mikelssar, R. H.; Perstin, A. Cellulose 1994, 1, 161–168.
8. Cousins, S. K.; Malcom Brown, R., Jr. Polymer 1995, 36, 3885–3888.
9. Heiner, A. P.; Sugiyama, J.; Telleman, O. Carbohydr Res 1995, 273, 207–223.
10. Hopfinger, A. J. Biopolymers 1971, 10, 1299–1315.
11. Hopfinger, A. J.; Walron, A. G. J Macromol Sci Phys 1969, B3, 195–208.
12. Hopfinger, A. J.; Walron, A. G. J Macromol Sci Phys 1970, B4, 185–199.
13. Marhofer, R. J.; Relling, S.; Brickman, J. Ber Bunsenges Phys Chem 1996, 100, 1350–1354.
14. Tai, K.; Kobayashi, M.; Tadokoro, H. J Polym Sci Polym Phys Eds 1976, 14, 783–797.
15. Woodcock, C.; Sarko, A. Macromolecules 1980, 13, 1183.
16. Perez, S. In Electron Crystallography of Organic Molecules; Fryer, J.; Dorset, D. L., Eds.; NATO ASI Series; Kluwer Academic: New York, 1990; pp 33–53.
17. Perez, S.; Imberty, A.; Scaringe, R. P. In Computer Modeling of Carbohydrate Molecules; French, A. D.; Brady, J. W., Eds.; ACS Symposium Series, American Chemical Society: Washington, DC, 1990; pp 281–299.
18. Imberty, A.; Chanzy, H.; Perez, S.; Buleon, A.; Tran, V. J Mol Biol 1988, 201, 365–378.
19. Imberty, A.; Perez, S. Biopolymers 1988, 27, 1205–1221.
20. Gardner, K. H.; Blackwell, J. Biopolymers 1974, 13, 1975–2001.
21. Sarko, A.; Muggli, R. Macromolecules 1974, 7, 486–494.
22. Atalla, R. H.; VanderHart, D. L. Science 1984, 223, 283.
23. VanderHart, D. L.; Atalla, R. H. Macromolecules 1984, 17, 1465–1472.
24. VanderHart, D. L.; Atalla, R. H. In The Structure of Cellulose; ACS Symposium Series 1987, American Chemical Society: Washington, DC, 1987; pp 88–118.
25. Sugiyama, J.; Vuong, R.; Chanzy, H. Macromolecules 1991, 24, 4168–4175.
26. Perez, S.; Delage, M. M. Carbohydr Res 1992, 212, 253–259.
27. Tvaroska, I.; Perez, S. Carbohydr Res 1986, 149, 389–410.
28. Scaringe, R. P.; Perez, S. J Phys Chem 1987, 91, 2394–2403.
29. Chou, K. C.; Nemethy, G.; Scheraga, H. A. J Phys Chem 1983, 87, 2869–2881.
30. Chou, K. C.; Nemethy, G.; Scheraga, H. A. J Am Chem Soc 1984, 106, 3161–3170.
31. French, A. D.; Miller, D. P.; Aabloo, A. Int J Biol Macromol 1993, 15, 30–36.
32. Allinger, N. L.; Yuh, Y. H.; Lii, J.-H. J Am Chem Soc 1989, 111, 8551–8566.
33. Allinger, N. L.; Rahman, M.; Lii, J.-H. J Am Chem Soc 1990, 112, 8293–8307.
34. Finkenstadt, V. L.; Millane, R. P. Macromolecules 1998, 31, 7776–7783.