
ANALYSIS

NATURE GENETICS | VOLUME 40 | NUMBER 2 | FEBRUARY 2008 141


Epigenetic regulation and the variability of gene expression
Jung Kyoon Choi¹ & Young-Joon Kim¹,²
We characterized the genetic variability of gene expression
in terms of trans and cis variability for each yeast transcript.
Genes that are highly regulated by nucleosomes showed a
high degree of trans variability. From the expression profiles of
mutants for various chromatin modifiers, we found that trans-
variable genes are distinctly regulated at the chromatin level.
The effect of chromatin regulators was highly significant, even
when compared with that of transcription factors. The DNA-
binding activities of transcription factors had a low influence
on trans variability. In the case of the basal transcription factor
TBP and TBP-associated factor TAF1, expression variability
was coupled with the histone acetyltransferase activities of
TAF1 and other factors, rather than with the binding of TBP
to DNA. Additionally, we found that the correlation of TATA-
box presence and expression variability could be explained in
terms of chromatin regulation. The lack of activating histone
modifications may subject TATA-containing promoters to
chromatin regulation processes. Our results suggest that
epigenetic regulation has a central role in the variation and
evolution of gene expression.
Phenotypic diversity can be achieved by diversifying the expression of
each component of the genetic regulatory network. Each promoter has
a unique capacity to respond to regulatory changes and effect variation
in the expression of its relevant gene. A previous study has provided a
basis for the measurement of this gene-specific capacity or expression
variability¹. Mutational variance, or transcriptional variance among
mutation-accumulation lines of yeast, was measured for each gene to
reflect the transcriptional sensitivity of the gene to genetic perturbations.
Mutational variance was considered to be proportional to the number of
potential cis and trans elements. However, the effect of each regulatory
element may vary considerably. Thus, we sought to measure the actual
magnitude of cis and trans effects on each gene. Because it is impossible
to distinguish cis and trans components without associated genotype
data, we turned to an experimental setting where genotype and expres-
sion could be associated.
We used the dataset of a cross between a standard laboratory strain
(BY) and a wild isolate (RM)²,³ of Saccharomyces cerevisiae (Fig. 1). For
each gene, we classified the BY × RM segregants according to inheritance
at the promoter region and selected those where all the flanking markers
were marked by either the BY or RM genotype. Consequently, the BY-
promoter group consisted of segregants that inherited the cis elements
solely from the BY parent; likewise, the RM-promoter group consisted
of segregants that inherited the cis elements solely from the RM parent.
Trans variability can be defined as the transcriptional variance within
each group, whereas cis variability is the transcriptional variance between
the means of the two groups (Supplementary Table 1 online). To evalu-
ate our method, we used the results of genetic linkage analyses. If genetic
association is found between a gene and particular trans-acting loci, the
gene should have higher trans variability compared to cis variability;
similarly, a gene associated with cis-acting loci should have higher cis
variability. Indeed, we found that this was the case (Supplementary Fig. 1
online). The trans variability and cis variability explained 69.4% and
27.4% of the total variability, respectively (Supplementary Fig. 2 online).
We used trans variability measures only for the BY strain, which is an
S288C derivative. Using the measures for the RM strain did not lead to
any significant differences (Supplementary Fig. 2). Even though the
trans and cis variability were calculated from a subset of the segregants,
the sum of the two measures accounted for most of the total variability
(93.7%). Therefore, we anticipate that these two measures are represen-
tative of the two main components of genetic variability.
The degree of trans or cis variability reflects not only the number of
associated trans- or cis-acting elements but also the strength of their
actual effects and their functional divergence. For example, a high trans
variability indicates that the gene is influenced by a number of regulators
acting in trans, particular regulators that exert distinct roles on that gene,
or other rapidly evolving regulators. A reliable measure of expression
variability should correlate with expression divergence observed in natural
conditions. We found that our measures were highly correlated with
expression divergence as calculated between BY and RM and between
BY and Saccharomyces paradoxus (CBS 432)⁴ (Fig. 2). In particular, trans
variability showed notably high correlations (Spearman correlation
$r_{\text{BY-RM}} = 0.367$, $P < 10^{-141}$; $r_{\text{BY-CBS}} = 0.425$, $P < 10^{-193}$).
By comparison, the correlations of the mutational variance¹ were
$r_{\text{BY-RM}} = 0.173$ ($P < 10^{-36}$) and $r_{\text{BY-CBS}} = 0.178$ ($P < 10^{-38}$).
The state of chromatin and its regulators should be considered as
important components in the genetic regulatory network, as they act
upstream of DNA binding by transcription factors. Therefore, their
influence on expression variability should be taken into account. A
recent study investigated the global association between sequence divergence
and chromatin structure for the first time⁵. It was suggested that
low accessibility of DNA repair molecules to closed chromatin regions
might lead to the high mutation rates observed in those regions of the
¹Genome Regulation Center and ²Department of Biochemistry, Yonsei
University, 134 Sinchon-dong, Seodaemun-gu, Seoul 120-749, Korea.
Correspondence should be addressed to Y.-J.K. (yjkim@yonsei.ac.kr)
Published online 29 January 2008; doi:10.1038/ng.2007.58

2008 Nature Publishing Group http://www.nature.com/naturegenetics
genome⁵. Considering promoter sequence divergences, one can speculate
that genes in regions of closed chromatin may involve high cis variability.
However, chromatin effects on trans variability have never been
studied.
We found that genes whose expression is largely affected by nucleo-
somes show high trans variability. Figure 3a shows that genes whose
expression is considerably affected by the depletion of histones have high
trans variability. These genes are thought to be activated or repressed
by nucleosomes in a normal state, implicating chromatin structure and
modifications in the regulation of gene expression. Genetic changes
related to chromatin regulators may thus be responsible. For example,
genes in a condensed chromatin structure are derepressed by histone
depletion, showing upregulated expression (right-side tails in Fig. 3a).
Because the expression of these genes should be highly dependent on
chromatin regulators that open the closed chromatin structure, the
observed trans variation should be attributable mainly to genetic per-
turbations of chromatin activation processes. On the other hand, genes
with high cis variability were mainly in an inactive chromatin structure,
showing only the right-side tails (Supplementary Fig. 3 online). This is
in line with the hypothesis that closed chromatin may contribute to cis
variability as a result of higher mutation rates.
To examine the trans effects of specific chromatin regulators, we used
an expression compendium of chromatin modifiers that was previously
assembled⁶. Changes in expression accompanying the perturbation
(mutation or deletion) of various chromatin regulators were measured.
We removed expression profiles for loss of histone proteins (H3
and H4), defective TBP, and TBP cofactors lacking known chromatin-
modifying activities. This resulted in a refined dataset consisting of 141
perturbation profiles for chromatin remodelers, histone acetyltransferases,
deacetylases, methyltransferases, ubiquitinating and deubiquitinating
enzymes, and silencing factors. We first defined groups
of trans-variable genes and nonvariable genes (see Methods). For each
perturbation profile, we carried out the Kolmogorov-Smirnov (K-S)
statistical test⁷ between the two groups, following the suggestion of a
previous study⁶. The Kolmogorov-Smirnov test compares the sample
and control distributions in order to determine the significance and
direction of discrepancy. In this study, the Kolmogorov-Smirnov statistic
compares perturbation effects among variable genes and those among
nonvariable genes; $D^{+}$ and $D^{-}$ indicate the extent of activation or
repression of genes caused by the relevant perturbation. The $D^{+}$ and $D^{-}$ scores
were computed for the 141 expression profiles covering more than 60
chromatin modifiers (Supplementary Table 2 online). We defined the
K-S score as $D^{+}$ when $D^{+} > D^{-}$ and $D^{-}$ when $D^{+} < D^{-}$.
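The directional statistics can be illustrated with a minimal sketch. It assumes that $D^{+}$ is the largest amount by which the empirical distribution function of the nonvariable (control) genes exceeds that of the variable (sample) genes, so that it is large when the variable genes are shifted toward higher expression changes, and that $D^{-}$ is the reverse. This orientation, the function names, and the negative sign attached to $D^{-}$ in `signed_ks` are assumptions for illustration; the sketch does not reproduce the scaling of the K-S scores reported in this study.

```python
from bisect import bisect_right


def ecdf(sorted_values, x):
    """Fraction of values less than or equal to x (input must be sorted)."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def ks_one_sided(sample, control):
    """Return (d_plus, d_minus) for two lists of expression changes.

    d_plus  = max over x of F_control(x) - F_sample(x)
    d_minus = max over x of F_sample(x) - F_control(x)
    The orientation is an assumption made for this illustration.
    """
    s = sorted(sample)
    c = sorted(control)
    d_plus = d_minus = 0.0
    # The extremes of the difference occur at observed data points.
    for x in s + c:
        diff = ecdf(c, x) - ecdf(s, x)
        d_plus = max(d_plus, diff)
        d_minus = max(d_minus, -diff)
    return d_plus, d_minus


def signed_ks(sample, control):
    """Keep the larger one-sided statistic; mark the D- direction as negative."""
    d_plus, d_minus = ks_one_sided(sample, control)
    return d_plus if d_plus > d_minus else -d_minus
```

Here `sample` would hold the expression changes of the variable genes in one perturbation profile and `control` those of the nonvariable genes.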
The trans-variable genes were distinctly regulated by chromatin
modifiers. The Kolmogorov-Smirnov scores for the real classification
of variable genes (shown in red) were more significant than the random
compositions of a sample and control group (shown in gray; Fig. 3b).
Examples of chromatin modifiers with |KS| > 45 include the ISWI

[Figure 1 graphic: gene A in the BY and RM parents, with the BY-promoter and RM-promoter segregants plotted against the expression level of gene A.]
Figure 1 Schematic of cis- and trans-
variability identification. The BY- and RM-
promoter groups were determined for gene A
from the segregants of a cross between BY and
RM strains. We first selected genetic markers
that covered the whole promoter region of gene
A within 10 kb (the bars boxed in yellow).
We then selected segregants in which all the
selected markers represented either the BY
or RM genotype (the top and bottom groups
of segregants with the same color within
the yellow box). As such, the BY-promoter
group contained segregants that inherited
the promoter sequence of gene A solely
from the BY parent, and the RM-promoter
group contained segregants that inherited
the promoter sequence from the RM parent.
Therefore, the transcriptional variance of gene
A within the BY or RM group can be thought to
arise from genetic perturbations in trans-acting
loci. In this study, we used trans-variability
measures only from the BY group. The variance
of the means of the BY and RM group was
designated as cis variability.
[Figure 2 graphic: scatter plots of BY-RM and BY-CBS expression divergence against trans variability in BY (r = 0.367 and r = 0.425) and against cis variability (BY-RM) (r = 0.253 and r = 0.166).]
Figure 2 Correspondence between expression variability and expression
divergence. We measured expression divergence between the BY and RM
strains (BY-RM) and between the BY strain and the S. paradoxus CBS strain
(BY-CBS). We used log-transformed values for plotting and the Spearman
rank correlation. Genes that are highly regulated by trans-acting elements
(high trans variability) show high expression divergence. Although cis-
element divergence between BY and RM is somewhat directly linked to
BY-RM expression divergence, this pattern is relatively weakly reflected in
BY-CBS divergence.

chromatin remodelers ISW1 and ISW2, SAGA complex subunit SPT3,
histone acetyltransferase ESA1 and histone deacetyltransferase TUP1
(Fig. 3c). We can see that highly variable genes are sensitive to activa-
tion and/or repression by the chromatin regulators. We measured the
sensitivity of each gene as the magnitude of expression change caused by
the perturbation of each chromatin regulator. The sensitivity measures
calculated over the 141 perturbation profiles explained a large fraction
of the trans variability ($r^{2} = 0.780$ from multiple linear regression
with the 141 profiles as explanatory variables; Supplementary Table 3
online). To assess the overall chromatin effects for each gene, we defined
chromatin regulation effects (CRE) as the average of the 141 sensitivity
measures. The CRE score was tightly correlated with the trans variability
(Spearman correlation $r = 0.508$, $P < 10^{-300}$; Supplementary Fig. 4a
online), implying that expression variability is largely affected by chromatin
regulators.
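As a rough sketch of this summary, assume that the sensitivity of a gene to a perturbation is the absolute log expression ratio in that profile (the exact definition is not restated here). The CRE of a gene is then the mean of these magnitudes, and its association with trans variability can be assessed by a Spearman rank correlation, computed below as the Pearson correlation of average ranks. All function names are hypothetical.

```python
def cre_score(log_ratios):
    """Mean absolute expression change of one gene across perturbation profiles."""
    return sum(abs(v) for v in log_ratios) / len(log_ratios)


def average_ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

With one list of log ratios per gene, `spearman([cre_score(g) for g in genes], trans_variability)` would give the kind of rank correlation described above.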
The genetic association studies using the BY × RM dataset indicated
that the expression of many genes could be mapped to chromatin modifiers⁸
but not to transcription factors⁹. Here, we wanted to compare the
influence of transcriptional regulators on expression variability with
that of chromatin regulators. To this end, we used global transcription
factor-deletion experiments¹⁰. First, we carried out the Kolmogorov-
Smirnov test of variable versus nonvariable genes for each deletion
profile. The most significant Kolmogorov-Smirnov score was obtained
for the loss of SPT10, which was included in the chromatin compendium.
After excluding several chromatin-related factors, we obtained
perturbation profiles for a total of 240 transcriptional regulators. We
compared the magnitude of the 240 Kolmogorov-Smirnov scores with
the 141 Kolmogorov-Smirnov scores from the chromatin compendium,
as shown in Figure 3b (green curve versus red curve). Second,
we computed the measure of transcription factor-deletion effects for
each gene in the same way that we obtained CRE. Its correlation with
trans variability was much lower than the correlation of CRE with trans
variability (r = 0.226; Supplementary Fig. 4b). When considered simultaneously,
CRE showed a much higher contribution to trans variability
than transcription factor-deletion effects ($P < 10^{-300}$ versus $P < 10^{-26}$
from multiple regression).
We next sought to assess the effect of a transcription factor on its
target genes. The purpose of this test was to determine whether the
[Figure 3 graphic. (a) Trans variability against histone regulation (H3Δ1-28 and H4Δ2-26). (b) Density of |K-S score|. (c) Density of expression change (log) for isw1isw2, esa1, tup1 and spt3; K-S scores shown in the panels: +53.2, -46.8, +48.6 and +68.6.]
Figure 3 Impact of chromatin regulation on expression variability. (a) Trans variability among genes with varying sensitivity to nucleosome regulation. Genes
were ordered by expression changes resulting from histone depletion ($\log_2$ ratio)²⁴; the average trans variability was obtained in each sliding window of 200
ordered genes. (b,c) We classified genes into the variable or nonvariable group and compared the distribution of sensitivities to 141 chromatin modifiers
between the two groups; the absolute value of the Kolmogorov-Smirnov score denotes the degree of discrepancy. (b) The discrepancy between the variable
and nonvariable group (red) was compared to that between two random groups (gray). We carried out random grouping 50 times, assigning the same number
of genes to each group. Responsiveness of variable genes to transcription factors was estimated by the Kolmogorov-Smirnov scores for the deletion profiles of
254 transcription factors¹⁰ (green). Transcription factor-binding effects were estimated by the Kolmogorov-Smirnov scores ($D^{+}$) for the DNA-binding profiles
of 203 transcription factors¹¹ (blue). (c) Expression changes due to the deletion of illustrative chromatin regulators are shown for the variable (red) and
nonvariable (black) genes. ISW1 and ISW2 cause variable expressions among both activated and repressed targets. SPT3 seems to cause variation mainly in
the process of gene activation. ESA1 may be a general activator (overall repression by the deletion), but less activated genes are more variable. Genes that
are highly repressed by TUP1 are highly variable.
Table 1 Functional description of transcription factors whose target genes show high expression variability
Transcription factor | Functional description
ABF1 | Mediates a number of chromatin-related events including gene activation and repression, chromatin remodeling and gene silencing; recruits the SIR complex²⁵
FHL1 | Interacts with IFH1 for chromatin silencing and ribosomal protein expression²⁶; physically interacts with H4 and H2A.1 (ref. 27)
REB1 | Physically interacts with TAF1 and MOT1, TATA binding protein-associated factors possessing chromatin regulation activities²⁸
RAP1 | Has a role in telomeric position effect (chromatin silencing) and telomere structure; recruits the SIR complex to chromosomal domains²⁵
SWI6 | Physically interacts with TAF12, subunit of TFIID and SAGA complexes, which is involved in chromatin modification and is similar to H2A²⁹
UME6 | Acts as a repressor by recruiting the SIN3-RPD3 complex³⁰ and the ISW2 chromatin remodeler³¹
General annotation for each transcription factor is from the Saccharomyces Genome Database, and specific interactions with chromatin modifiers are from relevant
literature.

cohort variability of any transcription factor was independent of chro-
matin effects. For this, we identified a group of target genes for each
transcription factor¹¹ and computed the average trans variability and
CRE of genes in each group. We found that the cohort trans variability
was highly associated with the cohort CRE (Fig. 4a). In analogy to the
cohort CRE, we also calculated the deletion effect of other transcription
factors on each cohort. However, this effect was not significant when
considered together with the cohort CRE ($P = 0.01$ versus $P = 1.76 \times 10^{-7}$
from multiple regression). The specific effects of certain transcription
factors on their target genes would have caused upward deviations from
the regression line (red line; Fig. 4a). However, the linear regression
yielded a high explanatory power ($r^{2} = 0.701$), ruling out transcription
factor-specific effects. Further, the overall effect of transcription factors is
reflected by the slope of the regression line: a regression line with a steeper
slope would indicate a higher degree of overall transcription-factor
effects compared to chromatin effects. We constructed a regression
line from randomized sets of transcription factor-binding genes. We
observed that the real line was not considerably steeper than the null line
(gray line; Fig. 4a). Functional descriptions for the transcription factors
with high cohort trans variability (20; Table 1) strongly suggest that
they are indeed involved in epigenetic regulation through interactions
with chromatin regulators. As an example of the opposite case, HAP2,
HAP3 and HAP5 have low cohort trans variability (<10) and low cohort
CRE (<20). The HAP2-HAP3-HAP5 complex forms the DNA-binding
component of the CCAAT-binding factor, which recognizes a distinct
motif (CCAAT). Indeed, none of the three proteins was found to interact
with chromatin modifiers⁶.
We then assessed the effects of transcription factors in terms of DNA-
binding activity. For each transcription factor, we defined DNA bind-
ing effects (DBE) as the average binding signal for the promoters of its
target genes. A high DBE score indicates that the transcription factor
has strong interactions with promoter DNA. We found that DBE does
not have any significant positive or negative correlation with cohort
trans variability (Fig. 4b). The transcription factors that conferred high
variability (Table 1), except for RAP1, do not have substantially higher
or lower DBE compared to other transcription factors. In other words,
DNA-binding activity does not influence the capability of a transcription
factor to generate variation. These findings were corroborated by a non-
parametric approach (see Methods and Supplementary Fig. 5 online).
In addition, we carried out the Kolmogorov-Smirnov test of variable
versus nonvariable genes for the DNA-binding profiles (blue density
line, Fig. 3b). The Kolmogorov-Smirnov scores were considerably lower
than those for chromatin perturbation profiles. Thus, it seems that, taken
together, the DNA-binding activities of transcription factors have a low
influence on trans variability.
More specifically, we examined the basal transcription factor TBP
and TBP-associated factor TAF1 (subunit of TFIID). TAF1 is involved
in the regulation of TATA-less genes with histone acetyltransferase
(HAT) activity¹²,¹³. The loss of TAF1 induces overall repression of
TATA-less genes¹². The TAF1 N-terminal domain (TAND) inhibits
TBP-TATA interactions by molecular mimicry of the TATA box¹⁴.
TBP dimerization also inhibits TBP-TATA binding; various mutations
affecting the dimerization surface of TBP induce elevated basal (EB)
transcription ($\text{TBP}^{\text{EB}}$ mutations) from TATA-containing promoters¹⁵.
Taken together, these results indicate that loss of TAF1 causes repression
of TATA-less genes by the loss of HAT activity (mode 1), and loss
of TAND and $\text{TBP}^{\text{EB}}$ mutations activate TATA-containing genes by
removing the inhibition of TBP-TATA binding (mode 2). We sought to
determine which mode was the primary cause of expression variability.
For this, we computed the Kolmogorov-Smirnov scores for TATA-less
genes in deletion mutants of TAF1 along with other HATs¹⁶ and
for TATA-containing genes in deletion mutants of TAND along with
$\text{TBP}^{\text{EB}}$ mutants¹⁴. When comparing mode 1 and mode 2, we found that
genes in the variable group were more responsive to TBP-associated
HAT activities than to DNA binding of TBP (Fig. 4c). These patterns
were not attributable to a larger influence or larger target size of the
HAT activities (Supplementary Fig. 6 online). In addition, it should be
noted that genetic changes in DNA-binding domains of transcription
factors are rarely found in natural conditions (under strong negative
selection)¹⁷.
Notably, the TATA-containing cohort showed the highest scores for
both trans variability and CRE (Fig. 4a). Two very recent studies have
reported that genes containing a TATA box in their promoters show an
increased variability and evolvability in expression¹,⁴. One study has
shown that the cause does not reside in promoter sequences: first, the
overall sequences of TATA-containing promoters were more conserved
[Figure 4 graphic. (a) Cohort trans variation against cohort chromatin effect, with ABF1, REB1, FHL1, RAP1, SWI6, UME6 and the TATA cohort labeled. (b) Cohort trans variation against DNA binding effect, with the same transcription factors labeled. (c) |K-S score|.]
Figure 4 Effects of transcription factors on expression variability. (a) The average trans variability and chromatin effect were calculated for each transcription
factor cohort. If the overall effect of transcription factors is high relative to chromatin effects, the regression line should have a steep slope. To test this, we
compared the regression line for the real cohorts (red) with that for random cohorts (gray). The random cohorts were generated by permuting the assignment of
genes (100 repeats). (b) The DNA binding effect of each transcription factor was assessed by the average binding affinity for its cohort genes. The correlation
between the DNA binding effect and cohort trans variability was not significant. Transcription factors that have high cohort CRE and trans variability are marked
by their names. (c) The Kolmogorov-Smirnov test was carried out between the variable and nonvariable group among TATA-less genes for the deletion of TAF1
(dark blue) and double deletions of TAF1 and other HATs (light blue), as well as among TATA-containing genes for the deletion of the TAND domain of TAF1
(black) along with mutations affecting TBP (gray). The TAND deletion and TBP alterations led to increased TBP binding to DNA. Variable genes tend to be more
sensitive to HAT activities than to DNA binding.

than TATA-less promoters, and second, the average conservation of
transcription factor-binding sites was similar between TATA-containing
and TATA-less promoters. Therefore, how the presence of a TATA box
causes expression variability remains unknown. From the regression
analysis shown in Figure 4a, we reasoned that the TATA-box effect could
also be explained by chromatin effects.
As previously reported¹², TATA box-containing genes tend to be
repressed by nucleosomes (Fig. 5a). Unlike the pattern observed
with trans variable genes (Fig. 3a), TATA-containing genes are biased
toward histone repression. Indeed, TATA-containing genes tend
to lack activating histone modifications (Fig. 5b). A question may
arise as to whether TATA-box presence or the chromatin effect is the
determinant of expression variability. We observed chromatin effects
regardless of TATA-box presence, whereas the TATA-box effect did
not seem to be significant under the same level of chromatin effects
(Fig. 5c). TATA-less genes (black line) with a high CRE score showed
much higher expression variability than TATA-containing genes (gray
line) with a low CRE score. This is supported by a partial correlation
analysis of trans variability (trv) with CRE (cre) and TATA box
presence (tata): $R_{\mathrm{trv}\cdot\mathrm{cre}\mid\mathrm{tata}} = 0.503$; $R_{\mathrm{trv}\cdot\mathrm{tata}\mid\mathrm{cre}} = 0.156$, where
$R_{a\cdot b\mid c}$ indicates the Spearman correlation of a and b, controlling
for c. Extending the perturbation data to include more chromatin
modifiers may be able to explain the remaining TATA-box effects
that are not accounted for by CRE. On the other hand, transcription
factor-deletion effects (tde) were not significantly higher among
TATA-containing genes ($R_{\mathrm{tde}\cdot\mathrm{tata}\mid\mathrm{cre}} = 0.088$). Therefore, there may
be other unknown effects on TATA-containing genes. It should also be
noted that TATA-box presence had no significant effect on cis variability
when trans variability and CRE were considered ($P = 7.70 \times 10^{-2}$
from multiple regression).
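A first-order partial correlation can be obtained from the three pairwise correlations. The sketch below applies the standard formula to Spearman coefficients; whether the values reported above were computed in exactly this way is an assumption, and the function name is illustrative.

```python
from math import sqrt


def partial_correlation(r_ab, r_ac, r_bc):
    """Correlation of a and b after controlling for c.

    Standard first-order formula:
        (r_ab - r_ac * r_bc) / sqrt((1 - r_ac**2) * (1 - r_bc**2))
    The inputs are pairwise correlations (here assumed to be Spearman
    rank correlations) and must satisfy |r_ac| < 1 and |r_bc| < 1.
    """
    return (r_ab - r_ac * r_bc) / sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))
```

For instance, with a for trans variability, b for CRE and c for TATA-box presence, the three inputs would be the pairwise rank correlations among those variables.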
What could be the mechanism by which epigenetic processes cre-
ate genetic variation in gene expression? First, molecular changes of
chromatin regulators may have a direct role. From a genetic linkage
analysis coupled with bioinformatic techniques, it has been reported
that the expression of many genes could be associated with chromatin
modifying factors⁸. In contrast, only a few of the associated trans-acting
loci were linked to transcription factors⁹. Second, the evolution
of transcriptional regulators can give rise to variation in their interactions
with epigenetic regulators. Notably, DNA-binding domains of
transcription factors are highly conserved, but their protein-interaction
domains evolve rapidly under positive selection¹⁷. A previous study
has presented a transcription model in which the combinatorial
regulation of transcription factors and chromatin modifiers plays a
key part. Finally, nongenetic mechanisms should also be considered.
An important feature of epigenetic regulation is its responsiveness to
the environment, as manifest in the differential epigenetic patterns
and resulting variation in expression observed in genetically identical
twins¹⁸,¹⁹. Epigenetic processes can be influenced by both exogenous
and endogenous factors. In our study, genetic perturbations may
have created variation in the cellular environment, which can influence
chromatin regulation processes. Indeed, TATA box-containing
genes were found to be highly responsive to stress and environmental
stimuli⁴,¹²,¹³. Thus, the general picture is that TATA-containing genes,
normally repressed in an inactive chromatin configuration, can be
highly regulated by epigenetic mechanisms in response to environ-
mental or genetic changes.
Opposing the notion that chromatin regulation is incidental to
intricate genetic regulation, our study has established that it has
a central role in creating regulatory variation. The available data
support the idea that highly evolvable genes are also responsive to
environmental perturbations¹,⁴. Expression variability seems to be a
gene-specific property maintained over different kinds of perturbations
(stochastic, environmental and genetic) and different time scales
in evolution (interstrain and interspecies)¹,⁴. However, no molecular
mechanism has been suggested for this. Our results hint at the
possibility that epigenetics is a unifying molecular framework that
can encompass such diverse variation sources. Interesting questions
remain as to how and why evolutionary processes have placed those
[Figure 5 graphic. (a) Fraction of TATA-containing genes against histone regulation (H3Δ1-28 and H4Δ2-26). (b) Frequency distribution of the correlation between TATA presence and histone modifications, with H3K79me, H3K4me, H4Ac and H3K14Ac marked. (c) Trans variability against chromatin regulation effects for TATA-less and TATA-containing genes.]
Figure 5 High expression variability of TATA-containing genes explained by chromatin regulation
effects. (a) The percentage of the TATA box among genes with varying sensitivity to nucleosome
regulation. We ordered genes by expression changes accompanying histone depletion ($\log_2$ ratio)²⁴ and
obtained the fraction of TATA-containing genes in each sliding window of 200 ordered genes. (b) The
correlation of TATA-box frequency and histone-modification marks. The fraction of TATA-containing
genes and the average histone modification signals²³ were calculated in a 10-kb window sliding along
the genome. The Spearman rank correlation was obtained using the values from all windows. The
statistical significance was assessed by permuting the genomic distribution of TATA-containing and TATA-less genes (10,000 times). The correlations
from the permuted genomes were used as the null distribution. The results for H3K36me3 and H3K9ac were omitted in the plot because of relatively
weak correlations (r = 0.106 and r = 0.050, respectively). (c) The relationship of variability and chromatin effects was investigated separately for
TATA-containing and TATA-less genes. We recalculated CRE by including TBP alterations and TBP-cofactor deletions in order to rule out their TATA-
specific effects. Genes were ordered by CRE and the average trans variability was obtained in each sliding window of 50 ordered genes.

genes responsive to environmental and genetic changes in specific
chromosomal regions where epigenetic regulation has dominant
effects over genetic regulation.
METHODS
Computation of cis and trans variability. The dataset from a cross between a
standard laboratory strain BY4716 (BY) and a wild isolate RM11-1a (RM) contained
the genotypes and expression data for the parents and 112 segregants²,³.
The following procedure was repeated for each transcript (Fig. 1): first, we
defined the promoter region as 1 kb upstream from the transcription start
point; second, we selected genetic markers that covered the whole promoter
region within 10 kb from their genomic location; and third, we examined the
genotypes (BY or RM) of these markers in each segregant to find the segregants
that inherited the promoter sequence from either the BY or the RM parent.
We defined the BY-promoter group and the RM-promoter group accordingly.
Finally, we calculated trans variability as the transcriptional variance among
the BY-promoter segregants and cis variability as transcriptional variance
between the means of the BY-promoter group and RM-promoter group.
These values were determined for 4,488 genes (Supplementary Table 1).
For trans variability, we used only the BY-promoter group and not the RM-
promoter group in order to properly incorporate other experimental data
obtained from the BY strain. Therefore, trans variability reflects the responsiveness
of each gene to genetic perturbations in its trans-acting loci in the
BY strain.
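The procedure above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the rule that a segregant is assigned to a promoter group only when all of its selected markers agree, and the use of the sample variance (n - 1 denominator) are assumptions made for illustration.

```python
from statistics import mean, variance


def promoter_origin(marker_genotypes):
    """Return 'BY' or 'RM' if all selected markers agree, else None.

    Assumption: a segregant is assigned to a group only when every
    marker covering the promoter carries the same parental genotype.
    """
    if all(g == "BY" for g in marker_genotypes):
        return "BY"
    if all(g == "RM" for g in marker_genotypes):
        return "RM"
    return None


def cis_trans_variability(expression, origins):
    """Compute (trans, cis) variability for one transcript.

    expression: expression values, one per segregant.
    origins: 'BY', 'RM' or None for each segregant (see promoter_origin).
    """
    by = [x for x, g in zip(expression, origins) if g == "BY"]
    rm = [x for x, g in zip(expression, origins) if g == "RM"]
    if len(by) < 2 or not rm:
        raise ValueError("need at least two BY-promoter and one RM-promoter segregant")
    # trans variability: variance among BY-promoter segregants only
    trans_var = variance(by)
    # cis variability: variance between the two group means
    cis_var = variance([mean(by), mean(rm)])
    return trans_var, cis_var
```

With the sample variance, the cis term reduces to half the squared difference between the two group means; a different variance estimator would rescale both measures without changing the ordering of genes.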
Characterization of cis and trans variability. From genetic linkage analyses,
some genes can be mapped to responsible cis or trans regulators⁸,⁹,²⁰. Different
methods yield different sets of cis- or trans-regulated genes. We used two sets
of cis-regulated genes (115 genes⁸ and 1,428 genes²⁰) and two sets of trans-regulated
genes (2,547 genes⁸ and 1,265 genes⁹). For each of the reported
trans- or cis-regulated genes, we compared the degree of trans and cis variability
(Supplementary Fig. 1). Next, we compared the variability measures
with expression divergence. Expression divergence between the BY and RM
strains was defined as transcriptional variance between the mean expression
of the BY parents and that of the RM parents from the BY × RM dataset²,³.
Expression divergence between BY (S. cerevisiae) and CBS 432 (S. paradoxus)
was computed from the dataset of a previous study⁴, which had obtained
the expression patterns of orthologous genes in 32 different conditions. We
computed the difference in expression between orthologs within the same
condition. The squared sum of those 32 expression differences was used to
estimate expression divergence.
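As a small illustration of the two divergence measures, the following Python sketch reads "the squared sum of those 32 expression differences" as the sum of the squared per-condition differences; that reading, the sample-variance estimator and the function names are assumptions.

```python
from statistics import mean, variance


def parental_divergence(by_parent_values, rm_parent_values):
    """Variance between the mean expression of BY parents and RM parents."""
    return variance([mean(by_parent_values), mean(rm_parent_values)])


def interspecies_divergence(expr_species_a, expr_species_b):
    """Sum of squared expression differences between orthologs.

    Each argument holds one value per condition, in the same order
    (32 conditions in the dataset described above).
    """
    if len(expr_species_a) != len(expr_species_b):
        raise ValueError("both orthologs need the same conditions")
    return sum((a - b) ** 2 for a, b in zip(expr_species_a, expr_species_b))
```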
Expression compendium of chromatin modifiers. From the expression compendium
of chromatin modifiers previously assembled⁶, we selected expression
profiles for chromatin remodelers, histone acetyltransferases, deacetyltransferases,
methyltransferases, ubiquitinating and deubiquitinating enzymes, and
silencing factors. We then added two more datasets²¹,²². This resulted in 141
profiles composed of expression changes (log₂ ratios) resulting from defective
chromatin regulators. With our measure of trans variability, we selected the
top 20% variable genes and used the lowest 50% as the control group. We then
carried out the Kolmogorov-Smirnov (K-S) statistical test⁷ for each expression
profile. The Kolmogorov-Smirnov test compares two sample distributions to
determine the significance and the direction of disparity (D⁺ and D⁻). The
D scores were computed for the 141 expression profiles as −log₁₀(P value)
(Supplementary Table 2). The Kolmogorov-Smirnov score was defined as
D⁺ if D⁺ > D⁻ and D⁻ if D⁺ < D⁻. We randomly selected 20% of genes and
carried out the same procedure to assess the significance of the real disparity
(50 repeats). The log₂ ratios in each profile were normalized to unit variance.
The absolute value of the normalized log₂ ratio represented the responsiveness
(sensitivity) of the gene to the relevant chromatin regulator. The average of
the 141 responsiveness measures was designated as the chromatin regulation
effects (CRE) score.
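The two quantities defined in this paragraph can be sketched in Python as follows. This is an illustration under stated assumptions, not the published implementation: the P value uses the standard asymptotic one-sided approximation p ≈ exp(−2·nm/(n+m)·D²) rather than the method of ref. 7, profiles are scaled by their standard deviation without centering, and all function names are hypothetical.

```python
import math
from bisect import bisect_right
from statistics import pstdev


def ks_d_statistics(target, control):
    """Return (d_plus, d_minus), the one-sided K-S distances.

    d_plus is large when target values tend to exceed control values
    (target ECDF below control ECDF); d_minus is the reverse.
    """
    t, c = sorted(target), sorted(control)
    d_plus = d_minus = 0.0
    for x in t + c:
        f_t = bisect_right(t, x) / len(t)  # ECDF of target at x
        f_c = bisect_right(c, x) / len(c)  # ECDF of control at x
        d_plus = max(d_plus, f_c - f_t)
        d_minus = max(d_minus, f_t - f_c)
    return d_plus, d_minus


def neg_log10_p(d, n, m):
    """-log10 of the asymptotic one-sided P value (assumed approximation)."""
    return 2.0 * (n * m / (n + m)) * d * d / math.log(10)


def ks_score(target, control):
    """Return (direction, score), keeping the larger one-sided score."""
    d_plus, d_minus = ks_d_statistics(target, control)
    n, m = len(target), len(control)
    s_plus, s_minus = neg_log10_p(d_plus, n, m), neg_log10_p(d_minus, n, m)
    return ("+", s_plus) if s_plus >= s_minus else ("-", s_minus)


def cre_scores(profiles):
    """Average absolute unit-variance log2 ratio per gene.

    profiles: list of profiles, each a list of log2 ratios in the
    same gene order. Assumption: scaling only, no mean centering.
    """
    totals = [0.0] * len(profiles[0])
    for profile in profiles:
        sd = pstdev(profile)
        if sd == 0:
            raise ValueError("profile has zero variance")
        for i, value in enumerate(profile):
            totals[i] += abs(value / sd)
    return [total / len(profiles) for total in totals]
```

In this sketch, `ks_score` would be called once per expression profile, with the values of the variable genes as `target` and those of the control genes as `control`; the random 20% selections described above can reuse the same function.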
Chromatin regulation effects on transcription factor cohorts. From genome-
wide ChIP-chip experiments and statistical models, a previous study¹¹ generated
DNA-binding profiles for 203 transcription factors. The strength of
DNA binding and its statistical significance were reported. With a criterion of
P < 0.001, 127 of the 203 transcription factors were found to have more than
five target genes. For each of the 127 groups, the average trans variability and
CRE were calculated and normalized to the z score. This is equivalent to testing
whether the mean of the target genes is greater than zero. We examined the
correlation between the average trans variability and CRE across the groups
by linear regression. The same number of genes was randomly mapped to
each transcription factor and the above procedure was repeated 100 times.
The average linear regression coefficient (the slope of the fitted line) for the
random assignments was compared with the real data. As a nonparametric
approach, we also carried out the Kolmogorov-Smirnov test between the target
group and the control group. In this case, the test determines whether the target
genes have greater (D⁺) or smaller (D⁻) values than the rest of the genes. The
Kolmogorov-Smirnov score was used in place of the z score (Supplementary
Fig. 5a).
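The cohort-level comparison can be sketched in Python as below. The exact z normalization is not spelled out in the text, so this sketch assumes the cohort mean is compared with the genome-wide mean and scaled by the genome-wide standard deviation over the square root of the cohort size; regressing trans variability on CRE (rather than the reverse) and all function names are likewise assumptions.

```python
import math
import random
from statistics import mean, pstdev


def cohort_z(values, cohort):
    """z score of the cohort mean against the genome-wide distribution.

    values: dict mapping gene -> measure (trans variability or CRE).
    cohort: list of target genes of one transcription factor.
    """
    all_values = list(values.values())
    mu, sigma = mean(all_values), pstdev(all_values)
    cohort_mean = mean([values[g] for g in cohort])
    return (cohort_mean - mu) / (sigma / math.sqrt(len(cohort)))


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx


def random_slopes(trans_var, cre, cohort_sizes, repeats=100, seed=0):
    """Slopes obtained after randomly assigning genes to each factor."""
    rng = random.Random(seed)
    genes = list(trans_var)
    slopes = []
    for _ in range(repeats):
        cohorts = [rng.sample(genes, k) for k in cohort_sizes]
        z_cre = [cohort_z(cre, c) for c in cohorts]
        z_trans = [cohort_z(trans_var, c) for c in cohorts]
        slopes.append(slope(z_cre, z_trans))
    return slopes
```

The mean of `random_slopes(...)` would then be compared with the slope obtained from the real target-gene cohorts.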
Transcription-factor binding effects. The strength of transcription factor
DNA binding was reported as the normalized intensity ratio of immunoprecipitated
(test) over unenriched (control) samples¹¹. To estimate the DNA-
binding activity of each transcription factor, we computed the average binding
affinity for its target genes (as the z score). As a nonparametric approach, we
carried out the Kolmogorov-Smirnov test between the target group and the
control group. In this case, a higher D⁺ score indicates that there is a greater
disparity in DNA-binding activity between the cohort and the rest of the genes
(Supplementary Fig. 5b). The z test is more focused on comparisons among
transcription factors, whereas the Kolmogorov-Smirnov test measures the
disparity. We also applied the Kolmogorov-Smirnov test of variable versus
nonvariable genes to the DNA-binding profile of each transcription factor.
The D⁺ score indicates the effect of DNA-binding activity on trans variability
(Fig. 3b).
TATA box and histone modifications. The classification of TATA-containing
and TATA-less genes¹² and genome-wide histone modification patterns²³
(H3K4me3, H3K36me3, H3K79me3, H3K9ac, H3K14ac and H4ac)
were as previously described. Normalized signal intensity values were used.
Chromosomal positions of the genes were downloaded from the Saccharomyces
Genome Database. The percentage of TATA-containing genes and the average
of histone modification signals were determined within a 10-kb sliding window
over the chromosomes. The Spearman rank correlation was measured
between the fraction of TATA-containing genes and each of the average histone
marks. To assess the statistical significance, we constructed a randomized
genome by permuting the distribution of TATA-containing and TATA-less
genes over the chromosomes (10,000 times). The correlation values obtained
from the permuted genomes were used to construct the null distribution.
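The windowed correlation and its permutation test can be sketched in Python as follows. The sketch handles a single chromosome, and the window step, the two-sided empirical P value with a +1 correction, and the function names are assumptions not stated in the text.

```python
import random
from statistics import mean


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        return 0.0  # undefined for constant input; treated as no correlation
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    return sxy / math_sqrt(sxx * syy)


def math_sqrt(value):
    return value ** 0.5


def window_values(positions, is_tata, signal, window=10_000, step=1_000):
    """TATA fraction and mean histone signal in each non-empty window."""
    fractions, signals = [], []
    for start in range(0, max(positions) + 1, step):
        idx = [i for i, p in enumerate(positions) if start <= p < start + window]
        if idx:
            fractions.append(mean([1.0 if is_tata[i] else 0.0 for i in idx]))
            signals.append(mean([signal[i] for i in idx]))
    return fractions, signals


def permutation_test(positions, is_tata, signal, n_perm=10_000, seed=0, **kwargs):
    """Observed correlation and empirical P value from shuffled TATA labels."""
    rng = random.Random(seed)
    observed = spearman(*window_values(positions, is_tata, signal, **kwargs))
    labels = list(is_tata)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # permute TATA / TATA-less labels over gene positions
        r = spearman(*window_values(positions, labels, signal, **kwargs))
        extreme += abs(r) >= abs(observed)
    return observed, (extreme + 1) / (n_perm + 1)
```

Shuffling only the TATA labels keeps gene positions and histone signals fixed, so the null distribution reflects the genomic arrangement of TATA-containing genes alone.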
Note: Supplementary information is available on the Nature Genetics website.
ACKNOWLEDGMENTS
We thank R. Brem for valuable comments and suggestions and M. Kupiec for
providing the chromatin modifier compendium. This work was supported by
grants from the Korean Ministry of Science and Technology to Y.-J.K. (Creative
Research Initiatives and Epigenomic Research of Human Disease).
AUTHOR CONTRIBUTIONS
J.K.C. designed the methodology, performed the analysis and wrote the paper;
Y.-J.K. helped with the analysis and wrote and revised the paper.
Published online at http://www.nature.com/naturegenetics
Reprints and permissions information is available online at http://npg.nature.com/reprintsandpermissions
1. Landry, C.R., Lemos, B., Rifkin, S.A., Dickinson, W.J. & Hartl, D.L. Genetic properties
influencing the evolvability of gene expression. Science 317, 118–121 (2007).
2. Brem, R.B., Yvert, G., Clinton, R. & Kruglyak, L. Genetic dissection of transcriptional
regulation in budding yeast. Science 296, 752–755 (2002).
3. Brem, R.B. & Kruglyak, L. The landscape of genetic complexity across 5,700 gene
expression traits in yeast. Proc. Natl. Acad. Sci. USA 102, 1572–1577 (2005).
4. Tirosh, I., Weinberger, A., Carmi, M. & Barkai, N. A genetic signature of interspecies
variations in gene expression. Nat. Genet. 38, 830–834 (2006).
5. Prendergast, J.G. et al. Chromatin structure and evolution in the human genome.
BMC Evol. Biol. 7, 72 (2007).
6. Steinfeld, I., Shamir, R. & Kupiec, M. A genome-wide analysis in Saccharomyces

cerevisiae demonstrates the influence of chromatin modifiers on transcription. Nat.
Genet. 39, 303–309 (2007).
7. Stephens, M.A. Use of the Kolmogorov-Smirnov, Cramer-von Mises and related
statistics without extensive tables. J. R. Stat. Soc. (B) 32, 115–122 (1970).
8. Lee, S-I., Pe'er, D., Dudley, A.M., Church, G.M. & Koller, D. Identifying regulatory
mechanisms using individual variation reveals key role for chromatin modification.
Proc. Natl. Acad. Sci. USA 103, 14062–14067 (2006).
9. Yvert, G. et al. Trans-acting regulatory variation in Saccharomyces cerevisiae and
the role of transcription factors. Nat. Genet. 35, 57–64 (2003).
10. Hu, Z., Killion, P.J. & Iyer, V.R. Genetic reconstruction of a functional transcriptional
regulatory network. Nat. Genet. 39, 683–687 (2007).
11. Harbison, C.T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature
431, 99–104 (2004).
12. Basehoar, A.D., Zanton, S.J. & Pugh, B.F. Identification and distinct regulation of
yeast TATA box-containing genes. Cell 116, 699–709 (2004).
13. Huisinga, K.L. & Pugh, B.F. A genome-wide housekeeping role for TFIID and a highly
regulated stress-related role for SAGA in Saccharomyces cerevisiae. Mol. Cell 13,
573–585 (2004).
14. Chitikila, C., Huisinga, K.L., Irvin, J.D., Basehoar, A.D. & Pugh, B.F. Interplay of
TBP inhibitors in global transcriptional control. Mol. Cell 10, 871–882 (2002).
15. Jackson-Fisher, A.J., Chitikila, C., Mitra, M. & Pugh, B.F. A role for TBP dimerization
in preventing unregulated gene expression. Mol. Cell 3, 717–727 (1999).
16. Durant, M. & Pugh, B.F. Genome-wide relationships between TAF1 and histone
acetyltransferases in Saccharomyces cerevisiae. Mol. Cell. Biol. 26, 2791–2802
(2006).
17. Wray, G.A. et al. The evolution of transcriptional regulation in eukaryotes. Mol. Biol.
Evol. 20, 1377–1419 (2003).
18. Choi, J.K. & Kim, S.C. Environmental effects on gene expression phenotype have
regional biases in the human genome. Genetics 175, 1607–1613 (2007).
19. Fraga, M.F. et al. Epigenetic differences arise during the lifetime of monozygotic twins.
Proc. Natl. Acad. Sci. USA 102, 10604–10609 (2005).
20. Ronald, J., Brem, R.B., Whittle, J. & Kruglyak, L. Local regulatory variation in
Saccharomyces cerevisiae. PLoS Genet. 1, e25 (2005).
21. Holstege, F.C. et al. Dissecting the regulatory circuitry of a eukaryotic genome. Cell 95,
717–728 (1998).
22. Wyrick, J.J. et al. Chromosomal landscape of nucleosome-dependent gene expression
and silencing in yeast. Nature 402, 418–421 (1999).
23. Pokholok, D.K. et al. Genome-wide map of nucleosome acetylation and methylation in
yeast. Cell 122, 517–527 (2005).
24. Sabet, N. et al. Global and specific transcriptional repression by the histone H3 amino
terminus in yeast. Proc. Natl. Acad. Sci. USA 100, 4084–4089 (2003).
25. Guarente, L. Diverse and dynamic functions of the Sir silencing complex. Nat. Genet.
23, 281–285 (1999).
26. Rudra, D., Zhao, Y. & Warner, J.R. Central role of Ifh1p-Fhl1p interaction in the synthesis
of yeast ribosomal proteins. EMBO J. 24, 533–542 (2005).
27. Ho, Y. et al. Systematic identification of protein complexes in Saccharomyces cerevisiae
by mass spectrometry. Nature 415, 180–183 (2002).
28. Gavin, A.C. et al. Proteome survey reveals modularity of the yeast cell machinery. Nature
440, 631–636 (2006).
29. Sanders, S.L., Jennings, J., Canutescu, A., Link, A.J. & Weil, P.A. Proteomics of the
eukaryotic transcription machinery: identification of proteins associated with components
of yeast TFIID by multidimensional mass spectrometry. Mol. Cell. Biol. 22, 4723–4738
(2002).
30. Kadosh, D. & Struhl, K. Repression by Ume6 involves recruitment of a complex containing
Sin3 corepressor and Rpd3 histone deacetylase to target promoters. Cell 89,
365–371 (1997).
31. Fazzio, T.G. et al. Widespread collaboration of Isw2 and Sin3-Rpd3 chromatin remodeling
complexes in transcriptional repression. Mol. Cell. Biol. 21, 6450–6460 (2001).