
Mol Gen Genet (1984) 194:15-23

© Springer-Verlag 1984

The DNA of Arabidopsis thaliana


Leslie S. Leutwiler*, Barbara R. Hough-Evans and Elliot M. Meyerowitz
Division of Biology, California Institute of Technology, Pasadena, California 91125, USA

* Current Address: Department of Biology, University of California, Los Angeles, CA 90024, USA

Offprint requests to: E.M. Meyerowitz

Summary. Arabidopsis thaliana is a small flowering plant of the mustard family. It has a four to five week generation time, can be self- or cross-pollinated and bears as many as 10^4 seeds per plant. Many visible and biochemical mutations exist and have been mapped by recombination to one of the five chromosomes that comprise the haploid karyotype. With the experiments reported here we demonstrate that Arabidopsis has an extraordinarily small haploid genome size (approximately 7 × 10^7 nucleotide pairs) and a low level of cytosine methylation for an angiosperm. In addition, it appears to have little repetitive DNA in its nuclear DNA, in contrast to other higher plants.

Introduction

Plants have large differences in their nuclear DNA content. Within angiosperms there is a nearly thousand-fold range of variation, and there appears to be no correlation between genome size and organismal complexity (Bennett and Smith 1976). Most of the angiosperms currently used in molecular genetic studies have large genomes; correlating with these large (more than 10^9 nucleotide pairs) genomes is a large fraction of DNA that is repeated many times, with individual repetitive elements dispersed throughout the genome (Flavell 1980). Both of these properties, large haploid genome and high fraction of dispersed repetitive DNA, create difficulties in the standard procedures of molecular genetics: a large genome requires that large genomic libraries be screened to obtain any individual gene, and collection of overlapping clones to define a chromosomal region is greatly hindered by dispersed repetitive elements.

Arabidopsis thaliana is a small flowering plant of the mustard family with a short generation time (four to five weeks), the ability either to cross- or self-pollinate, and a haploid chromosome number of only five. It has been used in a number of experiments in classical and biochemical genetics (Rédei 1970). Numerous visible and biochemical mutations are known, and many of these have been mapped by recombination to positions on the five linkage groups (Koorneef et al. 1983). It is susceptible to infection by Agrobacterium tumefaciens; there thus exists a potential for using the Ti plasmid as a means of achieving DNA-mediated transformation of Arabidopsis (Aerts et al. 1979). Microspectrophotometric measurements of nuclear DNA content in this plant indicate that it has a small haploid genome (Bennett and Smith 1976). Thus, Arabidopsis has several advantages as an organism for use in molecular genetic studies of higher plants. As a preliminary to a number of such studies, we here report experiments that characterize the composition and organization of Arabidopsis DNA, including measurements of kinetic complexity, genome size, and nucleotide composition. We find that Arabidopsis has an extraordinarily small genome (approximately 7 × 10^7 nucleotide pairs) and little repetitive DNA. This confirms its potential as an organism for use in molecular genetic experimentation, and provides basic information as a foundation for such studies. In addition, it establishes a new minimum genome size for a higher plant.

Materials and methods

Biological material. Seeds of Arabidopsis thaliana strain Columbia were obtained from Dr. A. Kleinhofs, Program in Genetics, Washington State University, Pullman, WA 99164. Wheat germ (Fisher Natural - not toasted) was obtained from a local market.

DNA isolation. Arabidopsis seeds were germinated on a mixture of sterile soil and peat (mixed 3:1) and grown under constant illumination (7000 lux) at 25 °C and 60% relative humidity. Approximately 10 g of five week old plants were harvested, rinsed with water, and ground in a mortar and pestle with an equal weight of glass beads in the presence of liquid nitrogen. Two grams of the powder were added to 10 ml 150 mM Tris-HCl, pH 8.5; 100 mM EDTA; 2% N-lauroyl sarkosine, sodium salt; 0.1 mg/ml proteinase K and incubated at 37 °C with gentle stirring for 30 min. The residue was removed by centrifugation and re-extracted twice; the supernatant was ethanol precipitated to remove a fluorescent compound which interfered with the ethidium bromide fluorescence of subsequent steps. Following centrifugation of the ethanol precipitate, the pellets were resuspended in 10 mM Tris-HCl, pH 8.0; 1 mM EDTA (TE) and banded to equilibrium in CsCl density gradients in the presence of 1 mg/ml ethidium bromide. The bands were collected, the ethidium bromide was removed by butanol extraction, and the DNA ethanol precipitated. The pellets were resuspended in 1 ml TE, loaded on a 1 M NaCl solution and centrifuged in a SW 50.1 rotor at 40,000 rpm for 5 h in order to remove any residual RNA. The DNA pellet was resuspended in TE, and the DNA concentration was measured by absorption at 260 nm minus that at 320 nm to correct for light scatter.

Wheat germ DNA was prepared essentially according to the above procedure. Wheat germ (0.5 g) was ground in an ice cold mortar and pestle with an equal weight of glass beads in the presence of 3 ml of proteinase K (10 mg) in 150 mM Tris-HCl, pH 8.5, 100 mM EDTA. Following grinding, the slurry was made up to 10 ml with N-lauroyl sarkosine (to a final concentration of 2%) in the Tris-EDTA mixture and incubated at 37 °C with gentle stirring for 30 min. The remaining procedure for DNA isolation was the same as that for Arabidopsis DNA, except the initial ethanol precipitation step was omitted.

Chloroplast DNA isolation. Whole plants (8.75 g) were dark adapted 12-18 h and homogenized in 40 ml buffer A (0.3 M mannitol, 0.05 M Tris, 0.003 M EDTA, 0.001 M mercaptoethanol, 0.1% bovine serum albumin, pH 8.0 - Kolodner and Tewari 1975) with three 5-second bursts at high speed in a Waring blender. Following filtration through 4 layers of Miracloth, the supernatant was layered on 2 M sucrose: 80% Percoll: 60% Percoll: 40% Percoll in buffer A (0.6:1:1:1 v/v) and centrifuged at 8,000 rpm in a Sorvall HB-4 rotor for 30 min. The chloroplast band at the 40%-60% Percoll interface was collected, diluted in buffer A and pelleted in a HB-4 rotor at 2,000 rpm for 5 min. The pellet was solubilized in 5 ml 0.05 M Tris-HCl, pH 8.0, 0.02 M EDTA containing 2% N-lauroyl sarkosine and 100 µg/ml proteinase K, incubated at 37 °C 20 min and phenol extracted twice. The aqueous phase was re-extracted with chloroform: 1% isoamyl alcohol and ethanol precipitated. The DNA was further purified on a CsCl equilibrium density gradient.

Preparation of labeled probes. A. thaliana total DNA or Drosophila melanogaster embryo DNA was nick translated (Rigby et al. 1977) with [3H]-deoxycytidine triphosphate, and a lambda clone containing a chloroplast DNA insert (λbAt003) was nick translated with [32P]-deoxycytidine triphosphate. The nick translation mixture was incubated at 13 °C for 45 min and the DNA purified by passage over a Bio-Gel P-60 (100-200 mesh; Bio-Rad) column.

The nick translation yields a mixture of fragments ranging in size from greater than 1700 base pairs (bp) down to less than 154 bp. Therefore, it was necessary for the reassociation measurements to isolate single-stranded DNA fragments longer than 500 bp so that they (1) could be sheared uniformly with the unlabeled DNA and (2) were long enough to reassociate under the hybridization conditions. DNA from the plasmid pBR325 (Bolivar 1978) was digested with HinfI and used as a standard for size measurements. In order to make the nick translated DNA single-stranded the DNA probe was made 0.1 N in NaOH prior to loading. Following electrophoresis through 2% agarose gels to separate the different size DNA strands, a slot was cut in the gel perpendicular to the electric field in the track containing the nick translated DNA and at the location for fragments of 500 bp. A piece of Whatman DE 81 paper was inserted and electrophoresis continued until all activity present initially in the gel on fragments greater than 500 bp could be found on the DE 81 paper. The piece of DE 81 paper was then rinsed three times with 0.1 M Tris-HCl, pH 7.5, 0.1 M NaCl and the DNA eluted with 0.1 N NaOH.

Shearing of DNA. Unlabeled Arabidopsis (and Drosophila) DNA and labeled tracer DNA of the correct size (described above) were mixed before shearing in a Virtis homogenizer as described by Britten et al. (1974). Ten ml of the DNA mixture in TE was diluted with 20 ml glycerol and sheared in a Virtis homogenizer at 60,000 rpm in a dry ice-ethanol bath for 30 min. The average single-strand length of the DNA was determined by electrophoresis through a 2% agarose gel using HinfI digested pBR325 DNA as size standards. The unlabeled DNA was visible by UV fluorescence while the [32P]-labeled DNA was visualized by autoradiography of the gel. Both procedures indicated the DNA had been sheared to an average single-strand length of 375 nucleotides. After shearing the DNA was ethanol precipitated and the precipitate subsequently dissolved in a small volume of TE. This solution was layered on a Chelex-100 (200-400 mesh, sodium form; Bio-Rad) column in order to remove heavy metal ions; the fractions containing DNA were combined and reprecipitated with ethanol. The precipitate was resuspended in 0.4 M sodium phosphate buffer pH 6.8 (PB) and the DNA concentration determined.

DNA reassociation. Reaction mixtures were prepared and analyzed essentially according to Britten et al. (1974). DNA samples ranging in concentration from 3.33 µg/ml to 1,375 µg/ml in either 0.12 M or 0.4 M PB were sealed in capillaries or ampoules, denatured by boiling for 1 min, and reassociated in a 60 °C water bath for the required incubation times. After reassociation to the appropriate Cot, the samples were frozen in dry ice-acetone and kept frozen at −20 °C.

Reassociated DNA was separated from single-stranded DNA by hydroxylapatite (Bio-Rad, DNA Grade Bio-Gel HTP) chromatography in water-jacketed columns. DNA samples reassociated in 0.12 M PB were thawed and loaded directly on the column (2 ml bed volume), while those samples reassociated in 0.4 M PB were diluted to 0.12 M PB before loading. Single-stranded DNA was eluted in five to six 1 ml fractions with 0.12 M PB at 60 °C; double-stranded DNA bound to hydroxylapatite was denatured by heating the column to 98 °C and eluted in five 1 ml fractions with the same buffer. Reassociation of unlabeled DNA was determined by measuring the A260 nm for each fraction and correcting for light scattering at 320 nm. Reassociation of labeled DNA was determined subsequently by counting each fraction in Aquasol-2 (New England Nuclear) in a Beckman LS-250 scintillation counter. Tritium and [32P] cpm were measured simultaneously by counting in a half-tritium channel and a channel spanning the [14C]-[32P] energy range respectively.

Cot values were calculated for each sample by multiplying the DNA concentration (moles nucleotide per liter) by time (sec). Reassociation data were fit by the equation:

$$\frac{C}{C_0} = \frac{1}{1 + k C_0 t}$$

where C = concentration of single-stranded DNA at time t, C0 = original DNA concentration, and k = second order reassociation rate constant. The results were analyzed by a non-linear least-squares computer program (Pearson et al.

1977) designed to fit theoretical second order components to the observed reassociation kinetics.

Thermal denaturation of DNA. An aliquot of DNA from experiment A (Table 1) was reserved prior to shearing in order to determine the GC content of the DNA preparation. A. thaliana DNA at a concentration of 12.5 µg/ml and a control sample of calf thymus DNA at a concentration of 42.7 µg/ml both in 0.12 M PB were melted in a Beckman Acta double-beam spectrophotometer equipped with a jacketed cuvette holder and automatic sample changer. A reference cuvette contained 0.12 M PB in order to ensure a linear base line.

The temperature was raised to 45 °C and then increased at the rate of 1.5°/min to 95 °C. Separation of strands was monitored by the increase in absorbance at 260 nm. The hyperchromicity was calculated as the percent of final absorbance; the melting temperature (Tm) was that temperature which produced a 50% increase in hyperchromicity. The mol percent G + C content of native DNA was calculated using the equation of Felsenfeld (1971):

$$\%\,(\mathrm{G} + \mathrm{C}) = 2.44\,(T_m - 69.35)$$

Determination of percent of 5-methylcytosine. The percent of 5-methylcytosine in 130 µg A. thaliana, 200 µg D. melanogaster and 160 µg wheat germ DNA was determined by lyophilizing the DNAs separately and then hydrolyzing to bases by heating at 180 °C for 25 min in sealed glass ampoules containing 200 µl formic acid. The hydrolysates were evaporated to dryness under nitrogen and resuspended in 0.1 M HCl in a ratio of 1 µg starting DNA: 1 µl 0.1 M HCl.

The bases released by formic acid hydrolysis were analyzed by HPLC (Diala and Hoffman 1982; Diala et al. 1981). Samples (25-40 µl) were injected into an Altex Ultrasil-10 CX column and eluted at ambient temperature with 0.02 M ammonium phosphate (monobasic) buffer, pH 2.3. Cytosine and methyl cytosine bases were identified relative to the elution of authentic standards with detection at 280 nm, and the data were processed by a Shimadzu C-R1A chromatopac integrator and calculated using standard curves. Each value reported is the mean of three independent measurements.

Results

G + C content of Arabidopsis DNA

Arabidopsis thaliana DNA was isolated from whole plants by the procedures detailed in Materials and Methods. Since this DNA was to be used in all subsequent experiments, its purity was checked by observing its denaturation in a temperature-controlled spectrophotometer cell. The DNA melted in 0.12 M PB with Tm = 86.3 °C and no hyperchromicity was observed at temperatures lower than the DNA denaturation transition, indicating the absence of any RNA or single-stranded DNA in the preparation. The hyperchromicity on melting, calculated as percent of final (single-strand) absorbance, was 22%. The G + C content of DNA can be calculated from its Tm (Felsenfeld 1971); the value obtained for Arabidopsis DNA is 41.4% G + C, well within the range (35.6% to 49.1% G + C) known for angiosperms. The Arabidopsis figure of 41.4% is almost identical to the 41.2% G + C content of Brassica oleifera (Vanyushin and Belozerskii 1959), and close to the 45% G + C content of Brassica oleracea (Thomas and Sherratt 1956), both members of the same family as Arabidopsis thaliana. As a control for this denaturation experiment, calf thymus DNA was melted in parallel with the plant DNA. It melted at Tm = 86.7 °C. This Tm indicates a G + C content of 42.3%, comparable to published values for calf thymus DNA (40.0 to 45.0%, Shapiro 1976). Since the Arabidopsis DNA was found to be pure and of average G + C content, DNA reassociation could be carried out under standard conditions (Britten et al. 1974).

Reassociation kinetics of Arabidopsis DNA

The first experiment to characterize the sequence organization and kinetic complexity of total Arabidopsis thaliana DNA was to measure the reassociation rate of denatured DNA under defined conditions. Plant DNA was sheared to an average 375 nucleotides long, denatured, and allowed to reassociate. The percent reassociation was measured at a number of time points by separating single- and double-stranded DNA on temperature-controlled hydroxylapatite columns and then quantitating the amount of each component by UV light absorption at 260 nm. The reassociation values are shown in Fig. 1. The curve drawn through the measured points in Fig. 1 is a computer best fit to second order reassociation kinetics (see Materials and Methods). The best fit consisted of two kinetic components. In addition, a component of the DNA was reassociated by the time the first measurements were made (Cot = 0.002). Thus, three components were discovered: a rapidly reassociating component, which comprises 10% of the plant DNA, a middle repetitive component containing 27% of the DNA, and a slowly reassociating component with 55% of the DNA. The remaining 8% of the DNA did not reassociate; this DNA presumably consisted of small sheared fragments unable to form stable duplexes which will bind to hydroxylapatite. Table 1, Part A, displays the results of the analysis of this curve, including a remarkably small value for the overall kinetic complexity of the plant DNA.

In addition to the unlabeled Arabidopsis DNA present in this reassociation experiment, small quantities of [3H]-labeled Arabidopsis DNA, sheared to the same size distribution as the bulk unlabeled DNA, were included. The reassociation of this tracer, driven by the unlabeled DNA, was measured as was the reassociation of the bulk DNA, except that quantitation after the hydroxylapatite column was by scintillation spectrometry. This labeled tracer served as a control to be certain that the results derived from optical density measurements did not reflect unknown UV absorbing substances contaminating the DNA. They did not. As demonstrated in Table 1, Part A, the [3H] tracer results parallel those obtained from the unlabeled DNA.

The DNA used in this experiment was isolated from whole plants, and not only from nuclei. Consequently, chloroplast DNA would be expected to make up a significant fraction of the total DNA: in Sinapis alba, a member of the mustard family related to Arabidopsis, the chloroplast genome is 1.58 × 10^5 bp; the complexity of this genome is 1.34 × 10^5 bp, since about 2.4 × 10^4 bp, including the ribosomal RNA coding regions, are represented twice (Link et al. 1981). If the size of the Arabidopsis chloroplast genome is similar, and there are 20 to 80 chloroplasts per green cell (Rédei 1973), and 20 to 60 genomes per chloro-

[Figure 1: plot with y-axis scale 0 to 100 and x-axis log equivalent Cot, −3 to 4.]

Fig. 1. Reassociation kinetics of total Arabidopsis DNA fragments. 400 µg unlabeled plant DNA was combined with 0.14 µg labeled size selected whole plant DNA (specific activity: 5.6 × 10^5 cpm [3H]/µg) and 0.14 ng labeled sized λbAt003 DNA (specific activity: 1.7 × 10^8 cpm [32P]/µg). The DNA was sheared to an average single-strand length of 375 nucleotides, resuspended in 0.4 M (Cot ≥ 50) or 0.12 M (Cot ≤ 50) sodium phosphate, pH 6.8 (PB) and denatured by boiling for 1 min. Samples were reassociated at 60 °C to the appropriate Cot value and frozen in dry ice-acetone. All samples were made up to 1 ml in 0.12 M PB immediately prior to loading (except the Cot = 0.02 and 0.002 samples which were already in 3 ml 0.12 M PB). The reassociated DNA was fractionated by hydroxylapatite (HAP) chromatography and the relative amounts of double- and single-stranded plant DNA were determined by absorbance at 260 nm minus 320 nm. The curve drawn through the points represents a least squares fit for two components, assuming second order kinetics and allowing all the parameters to free float (rms = 0.021). The lower curves (dashed lines) represent the predicted reassociation kinetics for the pure middle repetitive and single copy components. The reassociation of labeled DNAs is not shown in the figure. The kinetic parameters for the reassociation of the unlabeled Arabidopsis DNA, [3H]-labeled Arabidopsis DNA and [32P]-labeled λbAt003 DNA are presented in Table 1A
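The two-component, second-order curve described in this caption can be illustrated with a short sketch. It only evaluates the model C/C0 = 1/(1 + k·Cot) for each component using the fitted values reported in Table 1, part A, for unlabeled Arabidopsis DNA; it does not reproduce the least-squares program of Pearson et al. (1977). The function and variable names are illustrative assumptions, and treating the non-reassociating remainder as a constant single-stranded fraction is a simplification.

```python
"""Two-component second-order reassociation curve (illustrative sketch)."""

# Fitted values for unlabeled Arabidopsis DNA, taken from Table 1, part A.
# Each entry: (fraction of DNA, rate constant in whole DNA in M^-1 s^-1).
COMPONENTS = {
    "middle repetitive": (0.27, 4.64),
    "single copy": (0.55, 0.0080),
}
# Foldback and highly repetitive DNA, already reassociated at the earliest Cot.
FOLDBACK = 0.10


def fraction_single_stranded(cot, components=COMPONENTS, foldback=FOLDBACK):
    """Fraction of total DNA still single-stranded at a given Cot (M*s).

    Each kinetic component follows C/C0 = 1 / (1 + k*Cot). DNA that never
    reassociates is assumed to stay single-stranded at every Cot.
    """
    reacting = sum(f for f, _ in components.values())
    unreactive = 1.0 - foldback - reacting
    remaining = sum(f / (1.0 + k * cot) for f, k in components.values())
    return unreactive + remaining


def cot_half(k):
    """Cot at which a component with rate constant k is half reassociated."""
    return 1.0 / k


if __name__ == "__main__":
    for log_cot in range(-3, 5):
        cot = 10.0 ** log_cot
        bound = 1.0 - fraction_single_stranded(cot)
        print(f"log Cot = {log_cot:+d}: {100 * bound:5.1f}% reassociated")
```

Under these assumptions the curve starts at 10% reassociated (the foldback fraction) and levels off at 92%, leaving the 8% of DNA that did not reassociate.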

Table 1. Kinetic analysis of Arabidopsis DNA reassociation

DNA prepn.               Component                    Fraction   Reassociation rate       Complexity     Root mean
                                                      of DNA     constants in whole DNA   (bp)           square error
                                                                 (M^-1 s^-1)

A  Arabidopsis (A260)    Foldback and highly repet.   0.10
                         Middle repet.                0.27       4.64                     6.0 × 10^4     0.021
                         Single copy                  0.55       0.0080                   7.0 × 10^7
   Arabidopsis [3H]      Foldback and highly repet.   0.14
                         Middle repet.                0.23       4.78                     4.9 × 10^4     0.009
                         Single copy                  0.50       0.012                    4.2 × 10^7
   λbAt003 [32P]                                                 1.93                     1.25 × 10^5    0.005

B  Drosophila [3H]       Foldback and highly repet.   (0.26)
                         Middle repet.                0.21       0.44                     4.84 × 10^5    0.017
                         Single copy                  0.45       0.0042                   1.10 × 10^8
   λbAt003 [32P]                                                 1.82                                    0.018

C  Arabidopsis [3H]      Foldback and highly repet.   (0.23)
                         Middle repet.                0.26       3.29                     8.1 × 10^4     0.012
                         Single copy                  0.47       0.0089                   5.4 × 10^7
   λbAt003 [32P]                                                 2.44                     1.2 × 10^5     0.013

The separation of kinetic components, and calculation of the rate constant and complexity of each component was as described in Materials and Methods, and in Britten et al. (1974). The highly repetitive fractions in the reassociation curves described in B and C are in parentheses to indicate that these measurements are overestimates: in the experiment detailed in part A the samples that provided the initial reassociation points were diluted to 3 ml before the start of the experiment to avoid reannealing during the loading of the hydroxylapatite column. The experiments of parts B and C were not diluted sufficiently, since the information desired was that from reassociation of single-copy sequences. The complexity of the chloroplast DNA hybridized by λbAt003 in preparations A and C was calculated using the reassociation rate constant of the cloned tracer sequences and the genome fraction of the middle repetitive component of the Arabidopsis DNA reassociation curve
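Two relations tie the columns of Table 1 together, and the sketch below checks them against the tabulated numbers. First, dividing each rate constant in whole DNA by the component's fraction gives the rate of the pure component, and complexity is inversely proportional to that rate, so their product should be roughly constant across rows. Second, the nuclear genome size quoted in the Discussion follows from the single-copy complexity scaled by the ratio of nuclear DNA (highly repeated plus single copy) to single-copy DNA, on the paper's assumption that the middle repetitive component is chloroplast DNA. The helper names are illustrative assumptions. The kinetic standard behind the constant is not stated in this section, so it is not assumed here.

```python
"""Consistency checks on Table 1 (illustrative sketch)."""

# Values transcribed from Table 1:
# (fraction of DNA, rate constant in whole DNA in M^-1 s^-1, complexity in bp)
ROWS = {
    "A Arabidopsis A260, middle repet.": (0.27, 4.64, 6.0e4),
    "A Arabidopsis A260, single copy": (0.55, 0.0080, 7.0e7),
    "A Arabidopsis 3H, middle repet.": (0.23, 4.78, 4.9e4),
    "A Arabidopsis 3H, single copy": (0.50, 0.012, 4.2e7),
    "B Drosophila 3H, middle repet.": (0.21, 0.44, 4.84e5),
    "B Drosophila 3H, single copy": (0.45, 0.0042, 1.10e8),
    "C Arabidopsis 3H, middle repet.": (0.26, 3.29, 8.1e4),
    "C Arabidopsis 3H, single copy": (0.47, 0.0089, 5.4e7),
}


def pure_rate(fraction, k_whole):
    """Rate constant the component would show if it were the only DNA present."""
    return k_whole / fraction


def rate_complexity_product(fraction, k_whole, complexity):
    """Pure-component rate times complexity; roughly constant if k ~ 1/complexity."""
    return pure_rate(fraction, k_whole) * complexity


def nuclear_genome_size(single_copy_complexity, f_single, f_highly_repeated):
    """Genome size from single-copy complexity, counting only the highly
    repeated and single-copy fractions as nuclear DNA."""
    return single_copy_complexity * (f_single + f_highly_repeated) / f_single


if __name__ == "__main__":
    for name, row in ROWS.items():
        print(f"{name:36s} k_pure x complexity = {rate_complexity_product(*row):.3g}")
    print(f"A260 estimate: {nuclear_genome_size(7.0e7, 0.55, 0.10):.2g} bp")
    print(f"[3H] estimate: {nuclear_genome_size(4.2e7, 0.50, 0.14):.2g} bp")
```

With the tabulated values the products for all eight rows agree to within about 3%, and the two genome-size estimates round to the 8.3 × 10^7 bp and 5.4 × 10^7 bp given in the Discussion.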

plast (Bedbrook and Kolodner 1979), then the amount of chloroplast DNA per green cell would be expected to be from 6 × 10^7 to 8 × 10^8 bp. It was therefore important to determine if the middle-repetitive component seen in the reassociation curves, whose complexity and amount is about equal to that expected of Arabidopsis chloroplast DNA, is indeed largely or totally due to this non-nuclear set of sequences. To do this, the reassociation experiment described above contained, in addition to unlabeled and [3H]-labeled Arabidopsis DNA, a trace quantity of [32P]-labeled Arabidopsis chloroplast DNA derived from a bacteriophage λ recombinant clone.
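The expected range of chloroplast DNA per green cell is simple arithmetic on the figures cited in the text, and can be sketched as follows. The calculation assumes, as the text does, that the Arabidopsis chloroplast genome is similar in size to that of Sinapis alba. The constant and function names are illustrative.

```python
"""Expected chloroplast DNA per green cell (illustrative sketch)."""

CHLOROPLAST_GENOME_BP = 1.58e5   # Sinapis alba genome size (Link et al. 1981)
DUPLICATED_BP = 2.4e4            # sequence represented twice in that genome

# Complexity counts the duplicated sequence only once.
CHLOROPLAST_COMPLEXITY_BP = CHLOROPLAST_GENOME_BP - DUPLICATED_BP


def chloroplast_dna_per_cell(chloroplasts, genomes_per_chloroplast,
                             genome_bp=CHLOROPLAST_GENOME_BP):
    """Total chloroplast DNA (bp) in one cell."""
    return chloroplasts * genomes_per_chloroplast * genome_bp


if __name__ == "__main__":
    low = chloroplast_dna_per_cell(20, 20)    # fewest chloroplasts and genomes
    high = chloroplast_dna_per_cell(80, 60)   # most chloroplasts and genomes
    print(f"complexity: {CHLOROPLAST_COMPLEXITY_BP:.3g} bp")
    print(f"per cell:   {low:.2g} to {high:.2g} bp")
```

The extremes come to 6.32 × 10^7 and 7.58 × 10^8 bp, which round to the 6 × 10^7 to 8 × 10^8 bp range stated in the text.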

[Figure 2: gel photograph and blot, lanes a and b, with molecular length standards at 5.15, 4.27, 3.53, 2.03, 1.58, 1.38 and 0.83 Kbp.]

Fig. 2a, b. Identification of λbAt003 as a chloroplast genomic clone. a Chloroplast DNA was prepared as described in Materials and Methods. Approximately 1 µg of chloroplast DNA was digested for 2 h with 2 units of EcoRI in the presence of 4 mM spermidine and buffer recommended by the supplier (New England Biolabs). Restricted DNA was separated by electrophoresis on a 0.5% agarose gel, stained with ethidium bromide and photographed under 302 nm illumination. b The gel was blotted to nitrocellulose, probed with λbAt003 DNA (specific activity = 2.1 × 10^8 cpm [32P]/µg), and exposed for 2 h without intensifier. Molecular length standards derived from EcoRI and HindIII digested λ phage DNA are indicated to the left of lane a

This λ clone, λbAt003, was isolated from an Arabidopsis library (Pruitt and Meyerowitz, unpublished) and shown to contain a portion (about 10^4 bp) of the chloroplast genome by the following series of experiments. First, the λ clone was [32P]-labeled by nick translation, and hybridized to a nitrocellulose gel blot filter (Southern 1975) prepared from Arabidopsis DNA digested with the restriction endonuclease EcoRI. After hybridization and autoradiography, it could be seen that the clone DNA hybridized to plant DNA fragments identical in size to the EcoRI restriction fragments of the cloned DNA, and with an intensity of hybridization much greater than that typically seen with single-copy λ clones. Thus, the clone contained unrearranged, repetitive Arabidopsis DNA. To show that this repetitive DNA was chloroplast DNA, Arabidopsis chloroplasts were purified and chloroplast DNA prepared as described in Materials and Methods. This DNA was digested with EcoRI, subjected to gel electrophoresis, blotted to nitrocellulose and hybridized with [32P]-labeled λbAt003. Autoradiography of the blot showed that the clone does hybridize to purified chloroplast DNA (Fig. 2), and that the pattern of hybridization is identical to that seen in the blots using whole plant DNA.

The reassociation of the λbAt003 insert sequences in the annealing experiment of Fig. 1 was driven by chloroplast sequences in the unlabeled DNA. The rate of reassociation of the chloroplast sequences in λbAt003 is similar to that of the middle repetitive component extracted from the plant DNA Cot curves. Thus, the middle repetitive DNA likely contains chloroplast sequences. That it is largely comprised of chloroplast DNA is indicated by the relation of the complexity of this component with that expected of chloroplast DNA (Table 1). It should be noted that mitochondrial DNA would not be expected to contribute detectably to our whole plant DNA reassociation curve: in several species of the genus Brassica, members of the mustard

[Figure 3: plot with y-axis scale to 100 and x-axis log equivalent Cot.]

Fig. 3. Reassociation kinetics of labeled Drosophila and λbAt003 DNA fragments. Equal amounts (200 µg) of Drosophila and Arabidopsis unlabeled DNA were combined with 0.1 µg labeled sized Drosophila DNA (specific activity: 8.6 × 10^5 cpm [3H]/µg) and [?].0 ng labeled sized λbAt003 DNA (specific activity: 2.4 × 10^7 cpm [32P]/µg) and treated as described in Fig. 1. The relative amounts of double- and single-stranded tracer DNA were determined by measuring [3H] cpm [Drosophila (●)] and [32P] cpm [λbAt003 (▲)] bound to HAP at each Cot value. The Drosophila curve represents a least squares fit for two components and allows all parameters to free float (rms = 0.017). The lower curve (dashed line) represents the predicted reassociation kinetics for the pure single copy component. The λbAt003 curve represents a least squares fit for one component, allowing all parameters to free float (rms = 0.018). The partial (42%) reannealing of λbAt003 is the annealing of the chloroplast insert in the clone driven by the plant DNA, the 58% not annealed is λ phage DNA, which is present at too low a concentration to reassociate

[Figure 4: plot with y-axis scale to 100 and x-axis log equivalent Cot.]

Fig. 4. Reassociation kinetics of labeled Arabidopsis and λbAt003 DNA fragments. Equal amounts (200 µg) of Drosophila and Arabidopsis unlabeled DNA were combined with 0.1 µg labeled sized whole plant DNA (specific activity: 6.2 × 10^5 cpm [3H]/µg) and 4.6 ng labeled sized λbAt003 DNA (specific activity: 3.3 × 10^6 cpm [32P]/µg) and treated as described in Fig. 1. The relative amounts of double- and single-stranded tracer DNA were determined by measuring [3H] cpm [Arabidopsis (○)] and [32P] cpm [λbAt003 (▲)] bound to HAP at each Cot value. The λbAt003 curve represents a least squares fit for one component, allowing all parameters to free float (rms = 0.013). The Arabidopsis curve represents a least squares fit for two components with the reassociation rate for the middle repetitive component fixed to the value obtained for the reassociation of λbAt003 DNA (rms = 0.017). The lower curves (dashed lines) represent the two components

family, mitochondrial D N A has been shown to comprise in two control experiments annealed the same way, with
only 1%-2% of the total D N A of mature leaves (Vedel no discrepancies caused by experimental artifact (Table 1,
and Mathieu 1982). parts B and C).
To be certain that the low complexity determined for
the slowly reassociating component of Arabidopsis D N A
is correct, and not the result of some contaminant in our
Arabidopsis D N A preparation that causes denatured D N A Methylcytosine in Arabidopsis DNA
to reanneal more rapidly than expected, a set of control Plant D N A usually has a high content of 5-methylcytosine,
experiments was performed. These controls comprised two with values reported from 2.0% of total nucleotides (9.7%
additional reassociation curves, in which the kinetic com- of cytosine residues) as 5-methylcytosine in Brassica oleifera
plexity of Arabidopsis D N A was directly compared to that (Vanyushin and Belozerskii 1959) to 10% of total nucleo-
of D N A from embryonic nuclei of Drosophila meIanogaster. tides (33% of all cytosines) as 5-methylcytosine in rye, Se-
Drosophila D N A was chosen for this comparison because tale cereale (Thomas and Sherratt 1956). Since the genomic
its complexity is known from a large number of different complexity of Arabidopsis D N A is unusually low, it seemed
experiments (Laird and McCarthy 1969; Laird 1971; Scha- possible that Arabidopsis D N A might also be unusual in
chat and Hogness 1973; Manning et al. 1975; Crain et al. its content of 5-methylcytosine. To find out, total plant
1976), and because its complexity (1 to 2 x 108 nucleotide Arabidopsis D N A was digested with the restriction endonu-
pairs) is near that derived for Arabidopsis DNA, thus pro- cleases HpaII and MspI in parallel preparations, and the
viding a control that gives reassociation points over the digested D N A subjected to electrophoresis in parallel lanes
same range of Cot values as does Arabidopsis DNA. Both of an agarose gel. Both enzymes recognize and cleave the
control curves were generated with identical mixtures of site C-C-G-G in double-stranded DNA, but MspI cleaves
equal amounts of unlabeled Arabidopsis and Drosophila DNA, and trace amounts of [32P]-labeled λbAt003 DNA. The first incubation mixture also contained [3H]-labeled Drosophila DNA tracer, the second, [3H]-labeled Arabidopsis tracer. All of the DNAs were sheared to an average 375 nucleotide single-strand length. The kinetic complexities of Drosophila and Arabidopsis DNAs were thus measured in parallel experiments; in addition, the experiment with Arabidopsis tracer provided a repeat of the original analysis of this DNA. As shown in Fig. 3 and in Part B of Table 1, there was no component in the Arabidopsis DNA preparation that accelerated reassociation: the Drosophila DNA annealed at the expected rate in the presence of plant DNA. In addition, as seen in Fig. 4 and in Part C of Table 1, the repeat Arabidopsis curve showed reannealing of the middle repetitive and slow components of Arabidopsis DNA almost identical to that determined in the original experiment. The similar reassociation rates of the [32P] λbAt003 DNA in the parallel control curves show that the DNA

whether or not the internal cytosine residue is methylated, while HpaII only cleaves if this base is unmethylated. Figure 5 shows the patterns of digested and undigested Arabidopsis DNA run in the same agarose gel. No difference can be seen between the lanes containing HpaII and MspI digested DNA; DNA in both of these lanes (whether the discrete bands seen, which are chloroplast derived, or the background continuum of chromosomal DNA) has been cleaved to a much smaller mean size than that of the undigested DNA. This indicates a low level of 5-methylcytosine at C-G sequences. As a control, and for contrast, wheat germ DNA was treated as was the Arabidopsis DNA. Wheat germ DNA has 82% of the cytosines in the sequence C-G methylated (Gruenbaum et al. 1981), and has a level of cytosine methylation typical of most angiosperms (Shapiro 1976). Figure 5 shows the result: undigested and HpaII-treated wheat germ DNA give similar gel patterns, while MspI-cleaved wheat germ DNA shows a broad distribution of fragments smaller than those in other lanes.
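The logic of the HpaII/MspI comparison can be illustrated with a small toy model. The sketch below is not part of the original analysis: the function name `digest`, the example sequence, and the methylated position are all hypothetical. It assumes only what the text states, that both enzymes recognize the same site (CCGG, cut between the two cytosines), and that HpaII is blocked when the internal cytosine is methylated while MspI is not.

```python
import re

SITE = "CCGG"  # recognition sequence shared by HpaII and MspI


def digest(seq, methylated_internal_c, enzyme):
    """Return fragment lengths after cutting seq at every cleavable CCGG site.

    methylated_internal_c: set of 0-based positions of internal cytosines
    (the second C of a CCGG site) assumed to carry a 5-methyl group.
    enzyme: "HpaII" (blocked by internal-C methylation) or "MspI" (not blocked).
    """
    cuts = []
    for match in re.finditer(SITE, seq):
        internal_c = match.start() + 1
        if enzyme == "HpaII" and internal_c in methylated_internal_c:
            continue  # HpaII does not cleave a site whose internal C is methylated
        cuts.append(match.start() + 1)  # both enzymes cut C^CGG
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 16-base sequence with two CCGG sites; the second site is methylated.
example = "AACCGGTTTTCCGGAA"
print(digest(example, {11}, "MspI"))   # [3, 8, 5]
print(digest(example, {11}, "HpaII"))  # [3, 13]
```

In this model, heavily methylated DNA gives larger HpaII fragments than MspI fragments, as reported for wheat germ DNA, whereas largely unmethylated DNA gives the same fragments with both enzymes, as reported for Arabidopsis.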

Quantitation of 5-methylcytosine levels in Arabidopsis DNA was achieved by hydrolyzing this DNA to individual bases, separating the bases by high pressure liquid chromatography, and quantitating levels of cytosine and 5-methylcytosine (Diala et al. 1981; Diala and Hoffman 1982). As controls, Drosophila embryo DNA, which is unmethylated (Urieli-Shoval et al. 1982), and wheat germ DNA, with 5-methylcytosine levels reported as being from 23.5% to 25.2% of total cytosines (Shapiro 1976), were processed as well. The results are shown in Table 2. The Drosophila and Triticum values obtained are close to the published values, and Arabidopsis thaliana showed only 4.6% of its cytosine residues as methylcytosine. Since chloroplast DNA of plants is generally not methylated (Kung 1977), and chloroplast DNA comprises up to 27% of the total Arabidopsis DNA used for measurement, the methylcytosine content of Arabidopsis nuclear DNA may be as high as 6.3% of all cytosines (4.6%/0.73). This is the lowest value reported for any angiosperm.

Fig. 5. Detection of plant DNA methylation by restriction enzyme analysis. DNA was prepared as described in Materials and Methods and digested for 1.75 h with enzymes in buffers recommended by the vendor (New England Biolabs). 1 µg of DNA and either 2 u of HpaII or 7 u of MspI were used in each digestion. Restricted DNA was separated by electrophoresis on a 1.5% agarose gel, stained with ethidium bromide, and the ethidium fluorescence photographed in 302 nm illumination. Undigested Arabidopsis DNA is shown in lane a. The adjacent lanes show Arabidopsis DNA digested with HpaII (lane b) and MspI (lane c). Lane d is undigested wheat germ DNA, lane e is wheat germ DNA digested with HpaII, and lane f wheat germ DNA restricted with MspI

Table 2

DNA                         mol % 5-methylcytosine
Arabidopsis thaliana        4.6
Drosophila melanogaster     < 0.2
Triticum (wheat germ)       20.1

The mole percent 5-methylcytosine is expressed as percent of cytosine residues that are 5-methylcytosine

Discussion

The most striking result obtained in this work is the low complexity of the slowly reassociating (single-copy) fraction of the Arabidopsis thaliana genome, with three different measurements giving a range of complexities from 4.2 × 10^7 bp to 7 × 10^7 bp for this fraction. It is possible to calculate both the total genome complexity and the total genome size from the reassociation results. Total complexity is essentially equal to the complexity of the single-copy fraction, since the complexity of the other fractions is orders of magnitude lower. Total genome size can be derived from the single-copy complexity and the fraction of the genomic DNA found in the single-copy component. If we assume that the middle repetitive component in our reassociation curves is entirely or almost entirely chloroplast DNA (see Results), then the value for nuclear genome size derives only from the highly repeated and nonrepeated DNA fractions. Using the figures from the experiment which measured reassociation by ultraviolet absorption (Table 1), we calculate a haploid nuclear genome size of 8.3 × 10^7 bp. Using the results from the [3H]-labeled tracer in the same experiment, we get the similar value 5.4 × 10^7 bp. The mean of these measurements gives an estimate of the size of the Arabidopsis nuclear genome of 7 × 10^7 bp. Even if the putative chloroplast DNA is included in the calculations, the figure arrived at for total cellular genome size is only 40% larger than the nuclear genome size calculated above, so that under any set of assumptions, the genome of Arabidopsis thaliana is smaller by far than that calculated for any other angiosperm (Bennett and Smith 1976).

If all of the middle repetitive DNA is indeed chloroplast, then the number of chloroplast genomes per haploid genome in an average Arabidopsis cell is in the range 350-600, as calculated from our reassociation experiments; a diploid cell thus has 700-1200 chloroplast genomes. This value is consistent with the amount of chloroplast DNA per cell in Arabidopsis and related plants (see above, and Vedel and Mathieu 1982). It is also consistent with the relative intensity of autoradiographic signals from Arabidopsis total DNA sequences on genome blots that have been probed with radioactively labeled cloned chloroplast and single-copy DNA fragments, and with the relative intensity of the autoradiographic signals from equal amounts of filter-bound cloned chloroplast and single-copy sequences probed with radioactively labeled Arabidopsis whole plant DNA (Pruitt and Meyerowitz, unpublished). While this evidence indicates that all of the middle repetitive component of the Arabidopsis DNA might be chloroplast DNA, it does not rule out the possibility that other, nuclear middle repetitive sequences may make up a part of this component. Experiments using cloned Arabidopsis genomic fragments to directly examine this middle repetitive component are now in progress (Pruitt and Meyerowitz, unpublished). It should be noted that the values for chloroplast DNA complexity obtained from total DNA are lower by a factor of two than the value obtained from molecular studies of the chloroplasts of related plants, and that the second order reassociation rate constant measured for reassociation of the middle repetitive Arabidopsis component is higher than that directly measured for Arabidopsis chloroplast DNA using the [32P]-labeled chloroplast clone. If the rate constant obtained for the middle repetitive Arabidopsis component is fixed to a value that is equal to that shown by the chloroplast tracer and thus to a value that gives a more typical chloroplast complexity, instead of leaving all parameters free in our computer analysis, and the best second-order reannealing curves are then fit to the measured points, we get the following result from the [3H]-traced reassociation experiment shown in Part A of Table 1: foldback and highly repetitive DNA are still 14% of total DNA, middle repetitive DNA is 24%, and single-copy 48%. We have fixed the reassociation rate constant of the middle repetitive component to 1.93, and get a calculated complexity of this component of 1.25 × 10^5 bp, a typical chloroplast figure for a member of the mustard family (Lebacq and Vedel 1981). The reassociation rate constant of the single-copy component now becomes 0.0097, with a corresponding complexity of 5 × 10^7 bp. The root mean square error becomes 0.015. The calculated nuclear genome size from this curve, assuming the middle repetitive component to be entirely chloroplast DNA, is 6.5 × 10^7 bp. The number of chloroplast genomes per haploid nuclear genome becomes 200. In other words, letting the middle repetitive reassociation rate constant float free or fixing it to a value that gives the middle repetitive component a complexity exactly equal to that expected of chloroplast DNA makes little difference in our calculations of genome size or fraction of genome in single-copy DNA. Thus, under any set of assumptions, or with any reasonable constraints on the parameters of our reassociation curves, we arrive at approximately the same estimate of the haploid genome size.

The measurements show that approximately 10% of total Arabidopsis DNA is foldback, or highly repetitive. The experiments reported shed no light on the nature of this DNA; experiments addressing this question are in progress.

Our calculations show, then, that the Arabidopsis nuclear genome is comparable in size to that of the tiny nematode Caenorhabditis elegans (Sulston and Brenner 1974), less than 20 times the size of the Escherichia coli chromosome (Cairns 1963), and only five times the size of the minute Saccharomyces cerevisiae genome (Lauer et al. 1977). Flavell (1980) has estimated that, if an average plant messenger is about 1,200 bases, and if there are around 15,000 genes per haploid genome in plants, the minimum amount of DNA required for specifying the mRNA sequences of a typical plant should be 1.8 × 10^7 NTP. Our kinetic measurement of the haploid Arabidopsis genome is about four times this minimum figure. If there are 15,000 genes in Arabidopsis, each occupies an average of less than 5,000 base pairs of chromosomal DNA.

The other characteristics of the Arabidopsis genetic material determined in this study, G + C content and 5-methylcytosine levels, are not strikingly unusual, though the methylcytosine content is quite low for a higher plant. Further use of recombinant DNA techniques will soon show in more detail the nature of the repetitive and methylated sequences. The future will also show if Arabidopsis thaliana, with its unique combination of advantages for studies in molecular genetics, will become an organism of choice for such studies.

Acknowledgements. We are very grateful to Drs. E. Diala and R. Hoffman for their advice, aid, and the use of their equipment in the 5-methylcytosine measurements, to Dr. R. Britten for the use of his temperature-controlled spectrophotometer and for his advice, to Drs. S. Johnson and J. Roberts for their advice, and to Dr. E. Davidson for the loan of equipment and for helpful suggestions. We thank Dr. A. Kleinhofs for Arabidopsis seeds. This work was supported by grant number 82-CRCR-I-1063 from the Science and Education Administration of the U.S. Department of Agriculture to E.M.M. L.S.L. was supported by NIH Fellowship 1F32 GM 07725, and by the Gosney Fund.

References

Aerts M, Jacobs M, Hernalsteens J-P, van Montagu M, Schell J (1979) Induction and in vitro culture of Arabidopsis thaliana crown gall tumours. Plant Sci Lett 17:43-50
Bedbrook JR, Kolodner R (1979) The structure of chloroplast DNA. Annu Rev Plant Physiol 30:593-620
Bennett MD, Smith JB (1976) Nuclear DNA amounts in angiosperms. Proc R Soc Lond B 274:227-274
Bolivar F (1978) Construction and characterization of new cloning vehicles. III. Derivatives of plasmid pBR322 carrying unique EcoRI sites for selection of EcoRI generated recombinant DNA molecules. Gene 4:121-136
Britten RJ, Graham DE, Neufeld BR (1974) Analysis of repeating DNA sequences by reassociation. In: Grossman L, Moldave K (eds) Methods in enzymology, vol 29E. Academic Press, New York, pp 363-406
Cairns J (1963) The chromosome of E. coli. Cold Spring Harbor Symp Quant Biol 28:43-46
Crain WR, Eden FC, Pearson WR, Davidson EH, Britten RJ (1976) Absence of short period interspersion of repetitive and non-repetitive sequences in the DNA of Drosophila melanogaster. Chromosoma (Berl) 56:309-326
Diala E, Plant M, Coalson D, Hoffman R (1981) DNA methylation in normal and SV40-transformed human fibroblasts. Biochem Biophys Res Commun 102:1379-1384
Diala E, Hoffman R (1982) Hypomethylation of HeLa cell DNA and the absence of 5-methylcytosine in SV40 and adenovirus (type 2) DNA: Analysis by HPLC. Biochem Biophys Res Commun 107:19-26
Felsenfeld G (1971) Analysis of temperature-dependent absorption spectra of nucleic acids. In: Cantoni GL, Davies DR (eds) Procedures in nucleic acid research, vol 2. Harper and Row Publishers, New York, pp 233-244
Flavell R (1980) The molecular characterization and organization of plant chromosomal DNA sequences. Annu Rev Plant Physiol 31:569-596
Gruenbaum Y, Navah-Many T, Cedar H, Razin A (1981) Sequence specificity of methylation in higher plant DNA. Nature 292:860-862
Kolodner R, Tewari KK (1975) The molecular size and conformation of the chloroplast DNA from higher plants. Biochim Biophys Acta 402:372-390
Koornneef M, van Eden J, Hanhart CJ, Stam P, Braaksma FJ, Feenstra WJ (1983) Linkage map of Arabidopsis thaliana. J Hered 74:265-272

Kung S-D (1977) Expression of chloroplast genomes in higher plants. Annu Rev Plant Physiol 28:401-437
Laird CD (1971) Chromatid structure: Relationship between DNA content and nucleotide sequence diversity. Chromosoma (Berl) 32:378-406
Laird CD, McCarthy BJ (1969) Molecular characterization of the Drosophila genome. Genetics 63:865-882
Lauer GD, Roberts TM, Klotz LC (1977) Determination of the nuclear DNA content of Saccharomyces cerevisiae and implications for the organization of DNA in yeast chromosomes. J Mol Biol 114:507-526
Lebacq P, Vedel F (1981) SalI restriction enzyme analysis of chloroplast and mitochondrial DNAs in the genus Brassica. Plant Sci Lett 23:1-9
Link G, Chambers SE, Thompson JA, Falk H (1981) Size and physical organization of chloroplast DNA from mustard (Sinapis alba L.). Mol Gen Genet 181:454-457
Manning JE, Schmid CW, Davidson N (1975) Interspersion of repetitive and nonrepetitive DNA sequences in the Drosophila melanogaster genome. Cell 4:141-155
Pearson WR, Davidson EH, Britten RJ (1977) A program for least squares analysis of reassociation and hybridization data. Nucl Acids Res 4:1727-1737
Rédei GP (1970) Arabidopsis thaliana (L.) Heynh: A review of the genetics and biology. Biblio Genetica 20:1-127
Rédei GP (1973) Extra chromosomal mutability determined by a nuclear gene locus in Arabidopsis. Mutat Res 18:149-162
Rigby PWJ, Dieckmann M, Rhodes C, Berg P (1977) Labeling deoxyribonucleic acid to high specific activity in vitro by nick translation with DNA polymerase I. J Mol Biol 113:237-251
Schachat FH, Hogness DS (1973) Repetitive sequences in isolated Thomas circles from Drosophila melanogaster. Cold Spring Harbor Symp Quant Biol 38:371-381
Shapiro HS (1976) Distribution of purines and pyrimidines in deoxyribonucleic acids. CRC Handbook Biochem Molec Biol Nucl Acids 2:241-281
Southern EM (1975) Detection of specific sequences among DNA fragments separated by gel electrophoresis. J Mol Biol 98:503-517
Sulston JE, Brenner S (1974) The DNA of Caenorhabditis elegans. Genetics 77:95-104
Thomas AJ, Sherrat HSA (1956) The isolation of nucleic acid fractions from plant leaves and their purine and pyrimidine composition. Biochem J 62:1-4
Urieli-Shoval S, Gruenbaum Y, Sedat J, Razin A (1982) The absence of detectable methylated bases in Drosophila melanogaster DNA. FEBS Lett 146:148-152
Vanyushin BF, Belozerskii AN (1959) Nucleotide composition of the desoxyribonucleotides of higher plants. Dokl Akad Nauk USSR 129:944-946
Vedel F, Mathieu C (1982) Isolation of purified mitochondrial DNA from Brassicae. Anal Biochem 127:1-8

Communicated by R.B. Goldberg

Received October 16, 1983
