Biotechnology Journal
Biotechnol. J. 2010, 5, 768–780
DOI 10.1002/biot.201000059

Research Article
Ensembles of signal transduction models using Pareto Optimal Ensemble Techniques (POETs)
Sang Ok Song, Anirikh Chakrabarti and Jeffrey D. Varner

School of Chemical and Biomolecular Engineering, Cornell University, Ithaca, NY, USA
Mathematical modeling of complex gene expression programs is an emerging tool for understanding disease mechanisms. However, identification of large models sometimes requires training using qualitative, conflicting or even contradictory data sets. One strategy to address this challenge is to estimate experimentally constrained model ensembles using multiobjective optimization. In this study, we used Pareto Optimal Ensemble Techniques (POETs) to identify a family of proof-of-concept signal transduction models. POETs integrate Simulated Annealing (SA) with Pareto optimality to identify models near the optimal tradeoff surface between competing training objectives. We modeled a prototypical-signaling network using mass-action kinetics within an ordinary differential equation (ODE) framework (64 ODEs in total). The true model was used to generate synthetic immunoblots from which the POET algorithm identified the 117 unknown model parameters. POET generated an ensemble of signaling models, which collectively exhibited population-like behavior. For example, scaled gene expression levels were approximately normally distributed over the ensemble following the addition of extracellular ligand. Also, the ensemble recovered robust and fragile features of the true model, despite significant parameter uncertainty. Taken together, these results suggest that experimentally constrained model ensembles could capture qualitatively important network features without exact parameter information.

Received 11 May 2010
Revised 14 June 2010
Accepted 21 June 2010

Supporting information available online
Keywords: Mathematical modeling · Robustness and fragility · Systems biology
Correspondence: Professor Jeffrey D. Varner, School of Chemical and Biomolecular Engineering, 244 Olin Hall, Cornell University, Ithaca, NY 14853, USA
E-mail: jdv27@cornell.edu
Fax: +1-607-255-9166

Abbreviations: ODE, ordinary differential equation; POET, Pareto Optimal Ensemble Technique; SA, Simulated Annealing

© 2010 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim
www.biotechnology-journal.com

1 Introduction

Mathematical modeling of signal transduction and gene expression programs is an emerging tool for understanding disease mechanisms. Kitano [1] suggested that analysis of molecular networks using predictive computer models will play an increasingly important role in biomedical research. However, conventional wisdom suggests that the data requirement to identify and validate complex mechanistic models is too large. Molecular network models often exhibit complex behavior [2]. Typically, it is not possible to uniquely identify model parameters, even with extensive training data and perfect models [3]. Thus, despite identification standards [4] and the integration of model identification with experimental design [5], parameter estimation remains challenging even with structurally complete models. This reality has brought into the foreground a number of interesting questions. For example, do we actually need exact parameter knowledge to predict qualitatively important properties of a molecular network? Or can we estimate which components and connections are central to network function given only limited parameter information?

Two schools of thought have emerged on how uncertain models can be used to understand molecular network function. Bailey hypothesized that qualitative properties of metabolic or signaling networks could be determined using network structure without parameter knowledge [6]. Certainly, there is literature evidence supporting the Bailey hypothesis in metabolic networks [7]. Studies exploring network modularity [8] have also identified recurrent motifs that betray natural design principles. Alternatively, ensemble approaches, which use uncertain model families, have also emerged to deal with uncertainty in systems biology and other fields like weather prediction [9–13]. Their central value has been the ability to quantify simulation uncertainty and to constrain model predictions. For example, Gutenkunst et al. [14] showed that predictions were possible using ensembles of signal transduction models despite sometimes only order-of-magnitude parameter estimates. Beyond their ability to robustly describe data, uncertain deterministic ensembles might be a coarse-grained strategy to explore population dynamics when stochastic simulation is too expensive. There are several techniques to generate parameter ensembles. Battogtokh et al. [10] and later Brown et al. [12] generated experimentally constrained parameter ensembles using a Metropolis-type random walk through parameter space. Moles et al. [15] contrasted evolutionary and deterministic optimization techniques, any one of which could be adapted for ensemble generation. However, the unifying component of these previous identification strategies has been the minimization of a single objective function.

In this study, we used Pareto Optimal Ensemble Techniques (POETs) to identify a family of proof-of-concept signal transduction models. Our objectives were to test a modification to the original POET algorithm published by Song et al. [9] and to more deeply explore the properties of model ensembles. The motivation for POETs is practical. The identification of models with hundreds, thousands or even tens of thousands of parameters requires that we use measurements from multiple laboratories or even different cell lines. These training data can contain conflicts or can sometimes even be contradictory. Thus, a central challenge when identifying large models is the ability to balance conflicts in diverse training data. POETs, which integrate Simulated Annealing (SA) and multiobjective optimization through the notion of Pareto rank, find solutions that optimally balance these tradeoffs. The modified POETs strategy described here improved the performance of the original algorithm using a local parameter refinement step. Interestingly, the model ensemble generated using POET exhibited coarse-grained heterogeneity, suggesting that deterministic ensembles could perhaps be used to model heterogeneous populations. A secondary challenge was the subsequent characterization of network features in a family of models, using sensitivity analysis. Sensitivity analysis has enabled the investigation of robustness and fragility in molecular networks (see [9, 16–19]). Sensitivity analysis has also been crucial to model identification, discrimination and experimental design [3, 20–23]. However, sensitivity analysis, using first-order sensitivity coefficients, is a function of the model parameters. Thus, another open question explored here was whether qualitative properties estimated by sensitivity analysis were recovered by the ensemble. We demonstrate that model ensembles recovered highly robust and fragile features of the true model, despite significant parameter uncertainty.

2 Materials and methods

2.1 Formulation, solution and analysis of the model equations

We identified a family of models describing a growth factor-induced three-gene transcriptional program (Fig. 1). The model is available in SBML format in the supplemental materials. The model was formulated as a set of coupled ordinary differential equations (ODEs):

$$
\frac{dx}{dt} = S \cdot r(x, k), \qquad x(t_0) = x_0, \qquad y(t) = Y x(t) \tag{1}
$$

where $x$ denotes the species concentration vector (64 × 1), $k$ denotes the parameter vector (117 × 1) and $r(x, k)$ denotes the vector of reaction rates (117 × 1). The symbol $S$ denotes the stoichiometric matrix (64 × 117). The $(i, j)$ element of $S$, denoted by $\sigma_{ij}$, described the relationship between protein $i$ and rate $j$. If $\sigma_{ij} < 0$, then protein $i$ was consumed in $r_j$. Conversely, if $\sigma_{ij} > 0$, protein $i$ was produced by $r_j$. Lastly, if $\sigma_{ij} = 0$, protein $i$ was not involved in rate $j$. The symbol $y$ denotes the model output vector, where $Y$ denotes the measurement selection matrix.

We assumed mass-action kinetics for each interaction in the network. The rate expression for reaction $q$ was given by:

$$
r_q(x, k_q) = k_q \prod_{j \in \{R_q\}} x_j^{-\sigma_{jq}} \tag{2}
$$

The quantity $\{R_q\}$ denotes the set of reactants for reaction $q$, while $k_q$ denotes the rate constant governing reaction $q$. The symbols $\sigma_{jq}$ denote the stoichiometric coefficients (elements of $S$) for the reactants involved with reaction $q$. All reversible interactions were split into two irreversible steps; thus, every interaction in the model was non-negative. Inactive or infrastructure proteins and macromolecules (R1, A1, A2, iTF, iK, EXPORT, IMPORT, PH and PH-TF), RNAP and ribosomes were assumed to have zero-order production rates and first-order degradation rates. These rate constants were estimated along with the binding and catalytic model parameters. All initial conditions were zero except gene 1, 2, and 3 (1 if present, 0 if absent). We accounted for membrane, cytosolic and nuclear proteins and mRNA by explicitly defining separate species in each of these compartments.

Figure 1. Schematic of the prototypical signaling network used in this study. Extracellular ligand L binds surface receptor R1 driving the phosphorylation of transcription factor TF. TF up-regulates gene 1 expression. Gene 1 then initiates a cascade resulting in the expression of gene 2 and gene 3. Gene 3 down-regulates the expression of gene 1. The model is available in SBML format in the supplemental materials.

Mass-action kinetics, while expanding the dimension of the model, regularized its mathematical structure. This allowed automatic generation of the model code using the UNIVERSAL code generation tool. UNIVERSAL, an open source Java code-generator, supports the generation of model code from text and SBML files. UNIVERSAL currently supports multiple code types (Matlab/Octave-M, Octave-C, Sundials-C, GSL-C and Scilab) and it is extensible with a simple plugin API. UNIVERSAL is freely available as a Google Code project. Model code was generated as a C++ Octave module and solved using the LSODE routine of Octave (www.octave.org). When calculating the response of the model to ligand, we ran the model to steady-state and then simulated the addition of ligand. The steady-state was estimated numerically by repeatedly solving the model equations and estimating the difference between subsequent time points:

$$
\lVert x(t + \Delta t) - x(t) \rVert_2 \le \gamma \tag{3}
$$

The quantities $x(t)$ and $x(t + \Delta t)$ denote the simulated concentration vector at time $t$ and $t + \Delta t$, respectively. The L2 vector-norm was used as the distance metric. We used $\Delta t = 100$ s and $\gamma = 0.01$ for all simulations.

Sensitivity analysis was used to estimate which network components were fragile or robust. First-order sensitivity coefficients at time $t_q$:

$$
s_{ij}(t_q) = \left. \frac{\partial x_i}{\partial k_j} \right|_{t_q} \tag{4}
$$

were computed by solving the kinetic-sensitivity equations [24]:

$$
\begin{pmatrix} dx/dt \\ ds_j/dt \end{pmatrix} =
\begin{bmatrix} S \cdot r(x, k) \\ A(t)\, s_j + b_j(t) \end{bmatrix},
\qquad j = 1, 2, \ldots, P \tag{5}
$$

subject to the initial condition $s_j(t_0) = 0$. The quantity $j$ denotes the parameter index, $P$ denotes the number of parameters in the model, $A$ denotes the Jacobian matrix, and $b_j$ denotes the $j$th column of the matrix of first derivatives of the mass balances with respect to the parameters. Sensitivity coefficients were calculated by repeatedly solving the extended kinetic-sensitivity system for each parameter using the LSODE routine of OCTAVE (www.octave.org) over a sparse sampling (approximately 10%) of the ensemble (see Fig. 3). The Jacobian $A$ and the $b_j$ vector were calculated at each time step using their analytical expressions generated by UNIVERSAL. The resulting sensitivity coefficients were then scaled and time-averaged (Trapezoid rule):

$$
N_{ij} \equiv \frac{1}{T} \int_0^T dt \cdot \alpha_{ij}(t)\, s_{ij}(t) \tag{6}
$$

where $T$ denotes the final simulation time and $\alpha_{ij} = 1$ (unscaled) or $\alpha_{ij}(t) = k_j / x_i(t)$ (scaled).
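The sensitivity calculation in Eqs. (4)–(6) can be illustrated on a much smaller system. The following is a minimal sketch, not the authors' implementation: it assumes a hypothetical one-species model with zero-order production (`k0`) and first-order degradation (`k1`), a nonzero initial condition so that the scaling factor $k_j/x$ stays finite, and a fixed-step Runge-Kutta integrator in place of LSODE. All function names are illustrative.

```python
import math

def augmented_rhs(state, k):
    """Mass balance for one species plus one sensitivity equation per parameter."""
    x, s0, s1 = state
    k0, k1 = k
    dxdt = k0 - k1 * x            # zero-order production, first-order degradation
    jac = -k1                     # A = d(dx/dt)/dx
    b0, b1 = 1.0, -x              # b_j = d(dx/dt)/dk_j
    return (dxdt, jac * s0 + b0, jac * s1 + b1)

def rk4_step(state, k, h):
    """One classical Runge-Kutta step (an assumed stand-in for LSODE)."""
    def nudge(slope, scale):
        return tuple(v + scale * m for v, m in zip(state, slope))
    m1 = augmented_rhs(state, k)
    m2 = augmented_rhs(nudge(m1, h / 2), k)
    m3 = augmented_rhs(nudge(m2, h / 2), k)
    m4 = augmented_rhs(nudge(m3, h), k)
    return tuple(v + h / 6 * (a + 2 * b + 2 * c + d)
                 for v, a, b, c, d in zip(state, m1, m2, m3, m4))

def time_averaged_sensitivities(k, x0, t_final, n_steps, scaled=True):
    """Integrate from s_j(t0) = 0 and return (final_state, [N_0, N_1])."""
    h = t_final / n_steps
    state = (x0, 0.0, 0.0)
    trajectory = [state]
    for _ in range(n_steps):
        state = rk4_step(state, k, h)
        trajectory.append(state)
    averages = []
    for j in range(len(k)):
        # Integrand alpha_j(t) * s_j(t), with alpha_j = k_j / x (scaled) or 1 (unscaled).
        f = [(k[j] / pt[0] if scaled else 1.0) * pt[1 + j] for pt in trajectory]
        integral = sum(h * (f[i] + f[i + 1]) / 2 for i in range(n_steps))  # trapezoid rule
        averages.append(integral / t_final)
    return state, averages

final_state, N = time_averaged_sensitivities(k=(2.0, 0.5), x0=1.0, t_final=10.0, n_steps=2000)

# Analytic check for this toy model: s_0(T) = (1 - exp(-k1 * T)) / k1.
exact_s0 = (1.0 - math.exp(-0.5 * 10.0)) / 0.5
print(abs(final_state[1] - exact_s0))
print(N)
```

In this toy case the production constant has a positive time-averaged sensitivity and the degradation constant a negative one. For the full model, the same augmented system would be solved once per parameter, with $A$ and $b_j$ supplied analytically.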

The scaled time-averaged sensitivity coefficients were then organized into an array for each ensemble member:

$$
N^{(\varepsilon)} =
\begin{pmatrix}
N_{11}^{(\varepsilon)} & N_{12}^{(\varepsilon)} & \cdots & N_{1j}^{(\varepsilon)} & \cdots & N_{1P}^{(\varepsilon)} \\
N_{21}^{(\varepsilon)} & N_{22}^{(\varepsilon)} & \cdots & N_{2j}^{(\varepsilon)} & \cdots & N_{2P}^{(\varepsilon)} \\
\vdots & \vdots & & \vdots & & \vdots \\
N_{M1}^{(\varepsilon)} & N_{M2}^{(\varepsilon)} & \cdots & N_{Mj}^{(\varepsilon)} & \cdots & N_{MP}^{(\varepsilon)}
\end{pmatrix},
\qquad \varepsilon = 1, 2, \ldots, N_\varepsilon \tag{7}
$$

where $\varepsilon$ denotes the index of the ensemble member, $P$ denotes the number of parameters, $N_\varepsilon$ denotes the number of ensemble samples and $M$ denotes the number of model species. The $B_i$ matrix contained the time-averaged sensitivities for a single species for each parameter (rows) as a function of the ensemble (columns):

$$
B_i =
\begin{pmatrix}
N_{i1}^{(1)} & N_{i1}^{(2)} & \cdots & N_{i1}^{(\varepsilon)} & \cdots & N_{i1}^{(N_\varepsilon)} \\
N_{i2}^{(1)} & N_{i2}^{(2)} & \cdots & N_{i2}^{(\varepsilon)} & \cdots & N_{i2}^{(N_\varepsilon)} \\
\vdots & \vdots & & \vdots & & \vdots \\
N_{iP}^{(1)} & N_{iP}^{(2)} & \cdots & N_{iP}^{(\varepsilon)} & \cdots & N_{iP}^{(N_\varepsilon)}
\end{pmatrix},
\qquad i = 1, 2, \ldots, M \tag{8}
$$

To estimate the relative fragility or robustness of species and reactions in the network, we decomposed the $N^{(\varepsilon)}$ or the $B_i$ matrices using Singular Value Decomposition (SVD):

$$
\begin{aligned}
N^{(\varepsilon)} &= U^{(\varepsilon)} \Sigma^{(\varepsilon)} V^{T,(\varepsilon)} \\
B_i &= U_i^{(\varepsilon)} S_i^{(\varepsilon)} V_i^{T,(\varepsilon)}
\end{aligned} \tag{9}
$$

Coefficients of the left (right) singular vectors corresponding to largest $\beta$ singular values of $N^{(\varepsilon)}$ were rank-ordered to estimate important species (reaction) combinations. Only coefficients with magnitude greater than a threshold ($\delta = 0.1$) were considered. The fraction of the $\beta$ vectors in which a reaction or species index occurred was used to rank its importance. Similarly, the left singular vectors of $B_i$ showed which reaction combinations were important for species $i$, while the right singular vectors rank-ordered which ensemble members contributed most significantly to the sensitivity of species $i$.

2.2 POETs

POETs integrate SA with Pareto optimality to estimate parameter sets on or near the optimal tradeoff surface between competing training objectives (Fig. S1). Here, we modified the original algorithm [9] to improve its convergence properties. Denote a candidate parameter set at iteration $i + 1$ as $k_{i+1}$. The squared error for $k_{i+1}$ for training set $j$ was defined as:

$$
E_j(k) = \sum_{i=1}^{T_j} \left( \hat{M}_{ij} - \hat{y}_{ij}(k) \right)^2
+ \left( \frac{M'_{ij} - \max y_{ij}}{M'_{ij}} \right)^2 \tag{10}
$$

The symbol $\hat{M}_{ij}$ denotes scaled experimental observations (from training set $j$), while the symbol $\hat{y}_{ij}$ denotes the scaled simulation output (from training set $j$). The quantity $i$ denotes the sampled time index and $T_j$ denotes the number of time points for experiment $j$. We assumed only immunoblots were available for training with the exception of a single qRT-PCR or ELISA measurement of the highest intensity band. The first term in the objective function quantified the relative simulation error. The read-out from the training immunoblots was band intensity where we assumed intensity was only loosely proportional to concentration. Suppose we have the intensity for species $x$ at time $i = \{t_1, t_2, \ldots, t_n\}$ in condition $j$. The scaled-value measurement would then be given by:

$$
\hat{M}_{ij} = \frac{M_{ij} - \min_i M_{ij}}{\max_i M_{ij} - \min_i M_{ij}} \tag{11}
$$

Under this scaling, the lowest intensity band equaled zero, while the highest intensity band equaled one. A similar scaling was defined for the simulation output. The second term in the objective function quantified the error in the estimated concentration scale. We assumed only the highest intensity bands were quantified absolutely (denoted by $M'_{ij}$) and compared with the simulation. However, if these measurements were not available, the second term could be adjusted to ensure the model operated on physiologically relevant concentration scales.

We computed the Pareto rank of $k_{i+1}$ by comparing the simulation error at iteration $i + 1$ against the simulation archive $K_i$. We used the Fonseca and Fleming ranking scheme [25]:

$$
\mathrm{rank}(k_{i+1} \mid K_i) = p \tag{12}
$$

where $p$ denotes the number of parameter sets that dominate parameter set $k_{i+1}$. Parameter sets on or near the optimal trade-off surface have small rank (<2). Sets with increasing rank are progressively further away from the optimal trade-off surface. The parameter set $k_{i+1}$ was accepted or rejected by the SA with probability:

$$
P(k_{i+1}) \equiv \exp\left\{ -\mathrm{rank}(k_{i+1} \mid K_i) / T \right\} \tag{13}
$$

where $T$ is the computational annealing temperature. The initial temperature $T_0 = n/\log(2)$, where $n$ is user defined ($n = 4$ for this study). The final temperature was $T_f = 0.1$.
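The ranking and acceptance rules in Eqs. (12) and (13) can be sketched in a few lines. This is an illustrative sketch rather than the published POET code: it assumes the usual definition of dominance (no worse in every objective and strictly better in at least one), reads `log` as the natural logarithm, and uses a hypothetical two-objective archive of squared errors. The function names are not from the original.

```python
import math
import random

def dominates(a, b):
    """True if error vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_rank(candidate, archive):
    """Eq. (12): number of archived error vectors that dominate the candidate."""
    return sum(1 for member in archive if dominates(member, candidate))

def acceptance_probability(rank, temperature):
    """Eq. (13): exp(-rank / T)."""
    return math.exp(-rank / temperature)

def accept(candidate, archive, temperature, rng=random):
    """Draw a uniform number and compare it with the acceptance probability."""
    rank = pareto_rank(candidate, archive)
    return rng.random() < acceptance_probability(rank, temperature), rank

# Hypothetical archive of (E_1, E_2) values for three previously accepted parameter sets.
archive = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
t_initial = 4 / math.log(2)   # T_0 = n / log(2) with n = 4

print(pareto_rank((1.5, 1.5), archive))
print(pareto_rank((3.0, 3.0), archive))
print(acceptance_probability(4, t_initial))
```

Under these assumptions, the choice $T_0 = n/\log(2)$ means that a candidate of rank $n$ is accepted about half the time at the initial temperature, while rank-zero candidates are always accepted.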

The annealing temperature was discretized into 10 quanta between $T_0$ and $T_f$ and adjusted according to the schedule $T_k = \beta^k T_0$, where $\beta$ was defined as:

$$
\beta = \left( \frac{T_f}{T_0} \right)^{1/10} \tag{14}
$$

The epoch-counter $k$ was incremented after the addition of 50 members to the ensemble. Thus, as the ensemble grew, the likelihood of accepting parameter sets with a large Pareto rank decreased. To generate parameter diversity, we randomly perturbed each parameter by ≤ ± 50%. However, in addition to a random-walk strategy (previous algorithm), we performed a local pattern search every $q$ steps to minimize the residual for a single randomly selected objective. The local pattern-search algorithm has been described previously [26, 27]. The parameter ensemble used in the simulation and sensitivity studies was generated from the low-rank parameter sets in $K_i$.

3 Results

3.1 Summary

We identified and analyzed a family of canonical signal transduction models using POETs and sensitivity analysis. POET has previously been used to identify molecular models of pain signaling [9]. We modified the original algorithm by integrating a local pattern-search routine, which better controlled the absolute error in the ensemble identification. The original and modified algorithms were used to estimate an ensemble of signaling models. The model, which was assumed to have a known network structure, described the integration of extracellular signals with kinase activation, the phosphorylation of transcription factors, and the up-regulation of an associated transcriptional program (Fig. 1). Thus, while not specific to a particular growth factor, signaling cascade or expression program, it contained many of the general features encountered when identifying specific models. We modeled the molecular interactions in the prototypical-signaling network using mass-action kinetics within an ODE framework. ODEs and mass-action kinetics are common methods of modeling biological pathways [9, 16–18, 28–32]. We assumed spatial homogeneity but differentiated between cytosolic, membrane and nuclear localized processes. The true model (known parameters) was used to generate synthetic data from which we tested the POET algorithm. Each synthetic measurement was assumed to be a Northern or Western blot. Thus, we knew only relative amounts of protein or mRNA for any specific condition or time. To constrain the absolute concentration scale, we assumed a single ELISA or qRT-PCR measurement for the highest intensity band in each case. Lastly, we limited our training data to 20 samples per experiment (an upper limit on the lanes available on a Western blot).

The modified POET algorithm performed better than the original implementation and generated an ensemble that collectively exhibited population-like behavior. First, the ODE model used here was deterministic and did not describe stochastic gene expression fluctuations. However, because many different parameter sets were sampled, the deterministic ensemble exhibited population-like behavior. For example, scaled gene expression levels were approximately normally distributed following the addition of extracellular ligand. Thus, while gene expression was not described at a single-cell level, the ensemble captured coarse-grained expression heterogeneity. This suggested that deterministic ensembles could perhaps be used to model heterogeneous populations. Second, the model ensemble captured the robust and fragile features of the true model, despite significant parameter uncertainty. Edge (interactions between species) and node (species) ranks computed over the ensemble using sensitivity analysis were consistent with the true rankings, at least for highly fragile and robust network components. This suggested that, in practice, results from sensitivity analysis obtained by analyzing model ensembles could represent true behavior to a high degree of certainty, at least for highly fragile or robust network features. The true model is available in SBML format in the supplemental materials.

3.2 Estimating an ensemble of models using multiobjective optimization

We estimated an ensemble of signal transduction models from synthetic data sets using POET (Fig. S1). The canonical model had 117 unknown kinetic constants, primarily of three types (association, dissociation or catalytic rate constants). Because we used mass-action kinetics, every network interaction was governed by a single parameter. Using the true model, we generated 24 synthetic data sets using a (3,2,2,2)-level factorial design. The design variables considered were the level of ligand stimulation (L = 0, L = 10 and L = 50) and the presence and absence of gene 1, 2 and 3. In each data set, we assumed inactivated/activated kinase (cytosol), inactivated/activated transcription factor (cytosol), mRNA for protein 1 (cytosol) and the cytosolic level of protein 1 were measured at 20 points

equidistant over the time-course of the experiment (approximately 3 h). Each synthetic dataset became an objective in the optimization calculation from which we estimated the model ensemble (24 objectives in total).

The POET algorithm with local parameter refinement performed better than the original implementation (Fig. 2). Both implementations started from the same randomized parameter seed, used the same software libraries and were run over a 72-h period on the same hardware. Both implementations used a maximum acceptable Pareto rank of three or less. The modified algorithm generated 2882 ranked sets, of which 1062 had a Pareto rank equal to zero (Fig. 2, black circles). On the other hand, the original POET implementation generated 20 645 ranked sets, where 1538 had a Pareto rank equal to zero (Fig. 2, gray circles). While local refinement required additional function evaluations, the median training residuals were lower than those of the original implementation (Fig. S2). The quality of the resulting ensemble generated with local refinement was also higher. Approximately 47% of the model parameters (55 of 117) were constrained with a coefficient of variation (CV) of less than or equal to one (Fig. 3A). In comparison, the minimum CV produced by the original implementation was ≥ 1.7 (Fig. 3B). The top five constrained parameters were protein 1 (cytosol), RNAP and EXPORT degradation (all 0.64), the degradation of mRNA for gene 3 (0.65; negative regulator of P1 expression) and the constitutive expression of gene 1 (0.67). The top five least-constrained parameters were associated with kinase regulation or regulated gene 1 expression (CV > 2). Well-constrained parameters were pseudo-normally distributed with a strong positive skew, while parameters with a high CV were approximately exponentially distributed (Fig. S3). Analysis of the residuals produced by POET gave insight into relationships in the training data (Fig. 2). For example, O6 × O2 and similarly O8 × O4 were strongly correlated. This suggested that parameter sets that performed well for one objective had similar performance on the other. Other objectives showed no relationship (O8 × O2) or had strong fronts, for example O2 × O1.

Figure 2. Objective function array for parameter sets with Pareto rank = 0 for the original POET implementation (gray circles) and POET with local parameter refinement (black circles). Eight objectives are shown from the 24 objectives used in the model identification. The symbol Oj indicates the jth objective function. Points indicate the error associated with ensemble parameter sets. Objectives were defined using a (3,2,2,2)-level factorial design (ligand, gene1, gene2, gene3): O1 = (2,2,1,1), O2 = (2,2,1,2), O3 = (2,2,2,1), O4 = (2,2,2,2), O5 = (3,2,1,1), O6 = (3,2,1,2), O7 = (3,2,2,1) and O8 = (3,2,2,2). Design levels: ligand (1,2,3) = (0,10,50) and gene j (1,2) = (deleted, present).

Figure 3. Coefficient of variation (CV) of model parameters estimated using POET with local parameter refinement (A) and the original implementation (B). The solid line denotes the mean CV calculated over the entire ensemble, while the points denote the CV of the ensemble sample used in the sensitivity analysis calculations. Approximately 47% or 55 of 117 parameters had CV ≤ 1 for POET with local refinement. The minimum CV obtained using the original POET implementation was 1.7.

A key question is whether deterministic ensembles can describe heterogeneous populations. We have suggested that ensembles represent the averaged behavior of different cellular subpopulations. To test this idea, we explored the overall and individual behavior of the ensemble models relative to the training data. Overall, the ensemble recapitulated the mean activation of key network species following ligand addition (Fig. 4). Beyond describing the data, the ensemble predicted the cytosolic levels of unmeasured species (protein/mRNA for protein 2 and 3) for an experimental design not used for training (Fig. 5). Different network components had varying levels of uncertainty. For example, the levels of activated kinase (Fig. 4A) were well constrained, while the cytosolic level of mRNA for protein 1 (Fig. 4C) had significant uncertainty. The ensemble captured the correct trend for the level of activated transcription factor but was not absolutely correct (Fig. 4B). The scaled levels of different model components were normally distributed across the ensemble. For example, the scaled levels of protein 1 in the cytosol were normally distributed during the expression phase of the network response (Fig. 6). Directly after ligand addition (t = 0.1 h), the majority of cells were not expressing protein 1. However, after some time (t = 0.3 h) the population of protein 1-expressing cells was normally distributed. After approximately t = 1 h, the majority of cells had reached their maximum cytosolic level of protein 1. Interestingly, the correspondence between mRNA and protein levels varied significantly over the ensemble (Fig. S4). The mRNA-protein distribution shifted as a function of time to higher signal (Fig. S4, gray versus red circles) and became more biased toward protein.

Figure 4. Model performance following the addition of ligand L for the O8 = (3,2,2,2) synthetic data set. (A) Cytosolic levels of phosphorylated kinase versus time. (B) Cytosolic levels of phosphorylated transcription factor versus time. (C) Cytosolic levels of protein 1 mRNA versus time. (D) Cytosolic level of protein 1 versus time. Dashed lines denote the mean simulated value over the ensemble; the gray region denotes the 95% confidence interval. Points denote the mean synthetic data used to train/validate the model.

A criticism of mass-action kinetics is that they increase the number of parameters and species in network models. Alternatively, Michaelis–Menten kinetics (which are a realization of the law of mass action) or Hill kinetics are often used to reduce model dimension. However, Michaelis–Menten kinetics rely on the assumption that product formation is rate limiting ($k_{cat} \ll k_{off}$).
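The condition $k_{cat} \ll k_{off}$ can be related to the textbook form of the Michaelis–Menten rate law. The following worked relation is a standard result and is not taken from the original text; the symbols $E$, $S$, $ES$, $P$, $E_T$, $K_M$ and $K_d$ are introduced here only for illustration. For a single catalytic step written with mass-action rate constants, the quasi-steady-state approximation on the complex $ES$ gives:

$$
\begin{aligned}
E + S \;\underset{k_{off}}{\overset{k_{on}}{\rightleftharpoons}}\; ES \;&\xrightarrow{\;k_{cat}\;}\; E + P \\
v &= \frac{k_{cat}\, E_T\, S}{K_M + S}, \qquad K_M = \frac{k_{off} + k_{cat}}{k_{on}} \\
k_{cat} \ll k_{off} \;&\Longrightarrow\; K_M \approx \frac{k_{off}}{k_{on}} = K_d
\end{aligned}
$$

On this reading, each catalytic interaction carries three mass-action constants ($k_{on}$, $k_{off}$, $k_{cat}$), and checking whether $k_{cat}$ is small relative to $k_{off}$ in each ensemble member indicates whether binding can be treated as equilibrated relative to catalysis for that member.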

We explored the parameter ensemble for two key catalytic reactions in our network, namely, the activation of kinase by activated receptor and the phosphorylation of transcription factor by activated kinase to determine if the Michaelis–Menten assumption was valid. We considered parameter sets from the locally refined parameter ensemble with Pareto rank ≤ 3. For these reactions, the on- and catalytic rate constants had a CV ∼ 1, while the off-rates were not well constrained (CV > 2). On average, the Michaelis–Menten assumption was violated by ∼ 35% of the ensemble, suggesting that we could possibly reduce model complexity by changing the kinetics. However, mass-action kinetics have the advantages of regularized mathematical structure and simplicity, which offset the added complexity.

3.3 Rank-based assessment of nodes and edges was conserved by the ensemble

A key question when using model ensembles is whether the rank-based assessment of critical network components is correct, given significant parametric diversity. Previously, we approached this question by comparing the nodes or edges predicted to be important in a variety of models with literature [9, 17, 19, 33]. However, these comparisons were imperfect. Many factors were likely different between the experimental and modeling studies. Moreover, these comparisons were only as reliable as the underlying literature search, which was not exhaustive. In this study, we validated the classification of nodes and edges as fragile or robust by comparing the ‘true’ model with models from the ensemble.

Local processes such as transcription factor regulation and global infrastructure like RNAP, nuclear transport and translation were the most fragile components of the prototypical-signaling network. First-order sensitivity coefficients were computed for the true parameters and the ensemble. These coefficients were then time-averaged to form the N and B arrays (see Materials and methods). The magnitude of the coefficients of the left
[Figure 5 plot, panels (A)–(D): cytosolic species levels (A.U.) versus time; data sets O7 (panel A) and O8 (panels B–D)]
Figure 5. Model predictions following the addition of ligand L versus modified synthetic data. The dashed lines denote the mean simulated value over the
ensemble; the gray region denotes the 95% confidence interval. The points denote the mean synthetic data used to validate the model. The validation data
was generated from the training data by adding a background level of the ligand (L =1) and by considering species not used for training (with the exception
of protein 1). (A) Cytosolic levels of protein 1 versus time. Points denote the O7 = (3,2,2,1) data set in the presence of background ligand (L =1). (B) Cytoso-
lic levels of protein 3 versus time. Points denote the O8 = (3,2,2,2) data set in the presence of background ligand (L =1). (C) Cytosolic levels of protein 2 ver-
sus time. Points denote the O8 = (3,2,2,2) data set in the presence of background ligand (L =1). (D) Cytosolic levels of mRNA for protein 2 versus time.
Points denote the O8 = (3,2,2,2) data set in the presence of background ligand (L =1).

[Figure 6 plot: histograms of number of cells versus scaled protein concentration (A.U.) at t = 0.10, 0.20, 0.25, 0.30, 0.40, 0.50, 0.75 and 1.0 h]

Figure 6. Distributions for the scaled cytosolic protein 1 concentration as a function of time following the addition of extracellular ligand L. Bars denote expression bins for protein 1 (expression levels were sub-divided into 10 bins). The solid line denotes a normal distribution fit to the histogram (histfit function of Octave). Initially, the ensemble was synchronized with low scaled protein 1 expression (upper left-hand plot). After the addition of the ligand L the distribution of cells expressing protein 1 shifted to the right (progressing through an approximately normal distribution during active expression of protein 1). After t = 1.0 h, the bulk of the cells reached their maximum cytosolic levels of protein 1 (lower right-hand corner).
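
The histogram-plus-normal-fit display in Figure 6 can be approximated outside Octave. The sketch below is a rough Python analogue of what `histfit` displays, not a reimplementation of it: values are binned into 10 equal-width bins and overlaid with the counts expected from a normal distribution whose mean and standard deviation are the sample estimates. The synthetic sample and the function name `histogram_with_normal_fit` are assumptions made for illustration.

```python
import random
from statistics import NormalDist, fmean, stdev


def histogram_with_normal_fit(values, n_bins=10):
    """Bin values and compute the counts expected under a fitted normal.

    Returns (centers, counts, expected, fit) where expected[i] is
    n * bin_width * pdf(center_i) for the fitted NormalDist.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so that the maximum value falls in the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    fit = NormalDist(fmean(values), stdev(values))
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    expected = [len(values) * width * fit.pdf(c) for c in centers]
    return centers, counts, expected, fit


# Hypothetical stand-in for scaled protein 1 levels across ensemble members.
random.seed(1)
values = [random.gauss(0.6, 0.15) for _ in range(1000)]

centers, counts, expected, fit = histogram_with_normal_fit(values)
for c, n, e in zip(centers, counts, expected):
    print(f"bin center {c:5.2f}: observed {n:4d}, normal fit {e:7.1f}")
```

Comparing the observed and expected counts bin by bin gives a quick, informal check of the "approximately normal" description; a formal normality test would be needed for a stronger claim.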
(right) singular vectors corresponding to largest β singular values of N were used to rank-order the importance of the nodes (edges) in the model (Fig. 7). The most sensitive node combinations with β = 1 involved the regulation of activated transcription factor (aTF) and the transport of aTF into the nucleus (Fig. 7, top). Similarly, the most sensitive edges involved PH-TF regulation of aTF, the production, degradation and regulation of the specific kinase for TF (iK/aK), the production and degradation of iTF and the production/degradation of PH-TF. Analysis of additional singular vectors (increased β) highlighted the role of global infrastructure like RNAP, nuclear transport (IMPORT/EXPORT) and translation (Fig. 7, middle and bottom). Analysis of the left singular vectors of the B_aTF matrix also supported these findings. On the other hand, the most robust species and reaction combinations involved the assembly of the adaptor complex and the basal expression of gene 1, 2 and 3.

Subpopulations in the ensemble behaved differently. Analysis of the right singular vectors of B_aTF suggested which ensemble elements most influenced a particular species. For example, examination of the top and bottom three ranked ensemble members, estimated from the right singular vectors of B_aTF, showed the highest ranked ensemble members had similar aTF trajectories (Fig. S5, solid lines). Conversely, the lowest three had widely varying aTF levels (Fig. S5, dashed lines). Thus, subpopulations with qualitatively distinct behavior were present in the ensemble and decomposing the B array could identify these elements.

Edge and node ranks computed over the ensemble recovered the true rankings for highly fragile and highly robust network components (Fig. 8). We compared the node (species) and edge (interaction) ranks computed using sensitivity analysis for the true parameter set with the ensemble (β = 1). The Kendall and Spearman rank correlations were used to quantify the agreement between the true and estimated ranked lists (Table 1). The Spearman and Kendall correlation coefficients were approximately normally distributed for both node and edge

[Figure 7: species fragility maps, each running from robust to fragile, for the control ensemble (perfect information) and the POET ensemble (uncertain parameters) at β = 1, β = 10 and β = 20.]

Figure 7. Comparison of the species (node) fragility estimated from the ensemble versus the true parameter set for different β values (δ = 0.1). The fraction of the top-β modes in which a species was present was calculated for the true model (left) and the model ensemble (right).
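
The node ranking for β = 1 relies on the left singular vector of N associated with the largest singular value. The sketch below illustrates that idea in Python. It swaps in power iteration for a full singular value decomposition so that it runs with the standard library alone, and it uses a small hypothetical 3 × 3 array (rows as species, columns as parameters) rather than the time-averaged sensitivity array from the paper; the function names are illustrative.

```python
import math


def dominant_left_singular_vector(N, iterations=200):
    """Approximate the left singular vector for the largest singular value.

    Power iteration on N * N^T; N is a list of rows (species x parameters).
    ASSUMPTION: the largest singular value is well separated from the rest.
    """
    n_rows, n_cols = len(N), len(N[0])
    u = [1.0] * n_rows
    for _ in range(iterations):
        # v = N^T u, then w = N v, then renormalize.
        v = [sum(N[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        w = [sum(N[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        norm = math.sqrt(sum(x * x for x in w))
        u = [x / norm for x in w]
    return u


def rank_by_magnitude(vector):
    """Rank 1 = largest coefficient magnitude (most fragile in this sketch)."""
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    ranks = [0] * len(vector)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


# Hypothetical time-averaged sensitivities: 3 species (rows) x 3 parameters.
N = [
    [4.0, 1.0, 0.5],
    [0.2, 0.1, 0.0],
    [1.0, 2.0, 0.3],
]

u1 = dominant_left_singular_vector(N)
node_ranks = rank_by_magnitude(u1)
print("dominant left singular vector:", [round(x, 3) for x in u1])
print("node ranks:", node_ranks)
```

Applying the same two functions to the transpose of N would rank the columns (edges) by the dominant right singular vector. Extending the sketch to β > 1 would require the remaining singular vectors, for which a library SVD routine is the more practical choice.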
fragility over the model ensemble (data not shown). Ranks estimated using unscaled sensitivity coefficients gave the best correlation with the true parameter values. The Kendall correlation between the true node rank and that estimated from the ensemble was 0.57 ± 0.15, while the mean edge rank correlation was 0.72 ± 0.09. The mean Spearman rank correlation for node rank was 0.73 ± 0.16, while the mean correlation for edge rank was 0.87 ± 0.08. Additionally, if we computed the correlation between the true rank and the mean node/edge rank (mean rank calculated over the ensemble before the rank correlation test), the Spearman correlation for nodes and edges increased to 0.91 and 0.97, respectively. Both correlation metrics and visual inspection (Fig. 8, control versus POET) suggested that edge rank was recovered better than node rank. In addition to the rank correlation, we calculated the fraction of the ensemble in which an edge or node was ranked the same as the true parameter set (Fig. 8, bottom). Interestingly, both highly fragile and highly robust network features were recovered for edges (Fig. 8, bottom left) and nodes (Fig. 8, bottom right). For example, the highest and lowest ranked edges were recovered in more than 95% of the ensemble. However, minor network features were not similarly recovered (worst case recovery of only 20%). This suggested that we could expect to recover at least highly fragile or robust network features when using parametrically uncertain ensembles.

Table 1. Summary of the rank correlation for node and edge ranking between the ensemble and true parameter set

Method        Node           Edge
Scaled
  Kendall     0.51 ± 0.18    0.36 ± 0.11
  Spearman    0.65 ± 0.22    0.51 ± 0.15
Unscaled
  Kendall     0.57 ± 0.15    0.72 ± 0.09
  Spearman    0.73 ± 0.16    0.87 ± 0.08

4 Discussion

Mathematical modeling of complex gene expression programs is an emerging tool for understanding disease mechanisms. However, identification of large models with many unknown parameters requires that we use diverse training data. Training data taken from many sources can contain conflicts, for example different time scales, or can sometimes even be contradictory. Parameter estimation techniques that balance these conflicts might lead to robust model performance. POET has previously been used to identify molecular models of pain signaling [9]. We modified the original algorithm by incorporating a local parameter refinement step which generated candidate parameter sets with better error properties. Using the modified POET algorithm, we identified an ensemble of

[Figure 8: reaction and species rank maps (robust to fragile) by reaction or species index and ensemble index, for the control ensemble (perfect information) and the POET ensemble (uncertain parameters); bottom panels show percentage correct classification (20% to 100%) versus sorted reaction index and sorted species index, ordered from fragile to robust.]
Figure 8. Comparison of the reaction (edge) and species (node) rank estimated from the ensemble versus the true parameter set for β =1. The ordinal rank
of the magnitude of the left (right) singular vector corresponding to the largest singular value was computed for the true model (top) and the model ensemble
(middle). The fraction of trials in which a species or reaction was ranked exactly correctly was used to calculate the correct classification percentage.
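
The agreement measures used with Figure 8 and Table 1 (Kendall and Spearman rank correlation, and the fraction of ensemble members that rank an item exactly as the true parameter set does) can be sketched in a few lines of Python. The rank lists below are hypothetical, the formulas assume no tied ranks, and the function names are illustrative rather than taken from the authors' code.

```python
from itertools import combinations


def spearman(rank_a, rank_b):
    """Spearman rank correlation for two tie-free rank lists."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall(rank_a, rank_b):
    """Kendall tau for two tie-free rank lists (concordant minus discordant pairs)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(rank_a)), 2):
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def exact_match_fraction(true_rank, ensemble_ranks):
    """Per item, the fraction of ensemble members reproducing the true rank."""
    n_members = len(ensemble_ranks)
    return [
        sum(member[i] == true_rank[i] for member in ensemble_ranks) / n_members
        for i in range(len(true_rank))
    ]


# Hypothetical ranks for five nodes: the true model and three ensemble members.
true_rank = [1, 2, 3, 4, 5]
ensemble_ranks = [
    [1, 2, 3, 4, 5],
    [1, 3, 2, 4, 5],
    [1, 2, 4, 3, 5],
]

rho = spearman(true_rank, ensemble_ranks[1])
tau = kendall(true_rank, ensemble_ranks[1])
fractions = exact_match_fraction(true_rank, ensemble_ranks)
print("Spearman:", rho, "Kendall:", tau)
print("exact-match fraction per node:", fractions)
```

In this toy case the top and bottom ranked nodes are matched by every member while the middle ranks are not, which mirrors the qualitative pattern reported for the ensemble.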
parameter sets from synthetic data generated using the true parameters. We assumed that immunoblot training data (Western or Northern blots) were available to estimate the model ensemble. We introduced a systematic procedure to incorporate these types of experimental measurements into model identification. We characterized the parameter ensemble generated by POET by exploring the behavioral diversity of models in the ensemble and by examining how the fragility of nodes or edges varied over the ensemble.

The deterministic ensemble exhibited heterogeneous population-like behavior. In this study, we suggested that deterministic ensembles could be used to model heterogeneous populations in situations where stochastic computation was not feasible. There is a rich and growing literature exploring the role of stochastic fluctuations in biological processes such as gene expression [34]. Today, stochastic gene expression models are not computationally feasible except for small networks. However, as stochastic simulation algorithms continue to improve, for example with hybrid [35] or leaping strategies [36], then fully stochastic simulations will become tractable. Currently, the simulation of moderate to large problems typically relies on the population-averaged descriptions provided by ODEs. Within an ODE framework, we showed population-like effects using model ensembles. Population heterogeneity using deterministic model families was also recently explored for bacterial growth in batch cultures [37]. Distributions were generated because the model parameters varied over the ensemble, i.e., extrinsic noise led to population heterogeneity. Parameters controlling physical interactions, such as disassociation rates, or the rate of assembly or degradation of macromolecular machinery, such as ribosomes, were widely distributed over the ensemble. However, population heterogeneity can also arise from intrinsic noise [38]. Thus, deterministic ensembles, which do not capture intrinsic thermal fluctuations, provide a coarse-grained or extrinsic-only ability to simulate population diversity. Taken together, these studies motivate a deeper question as to whether a 'unique' parameter set exists in biology. These results suggest that not just variation in the copy number of infrastructure like ribosomes or RNAP but rather distributions in the strength of biophysical interactions could also drive population heterogeneity. More studies are required to explore these questions and to test the notion that ensembles can model population heterogeneity. One concrete next step could be to try and recapitulate experimentally measured distributions, for example, flow cytometry measurements of protein markers. Longer term, coarse-grained deterministic ensembles might be a strategy to explore drug effects across cell populations [1].

Sensitivity-based metrics, calculated from uncertain models, are often used to estimate which components of networks are fragile or robust. Thus, a reasonable question is whether the classification of nodes (species) and edges (interactions) as fragile or robust in uncertain models is correct. We explored this question by comparing nodes or edges estimated to be fragile or robust in the true model with those of the model ensemble. We showed that both locally and globally important network features were conserved across the ensemble. The most important local feature of our canonical network was transcription factor activation. Transcription factor regulation is a well-known integration layer in gene-expression architectures. For example, Bhardwaj et al. [39] showed in a range of networks that midlevel regulators, such as transcription factors, have the highest collaborative propensity. Thus, transcription factor regulation is perhaps one of the bow-ties described by Csete and Doyle [40]. Sensitivity analysis suggested that global infrastructure such as RNAP, nuclear transport and translation initiation were also fragile. The fragility of transcription and translation infrastructure has also been reported by Stelling et al. [16] exploring the robustness properties of Drosophila clock architectures, in cell-cycle architectures [19], and in growth factor signaling in LNCaP sub-clones [33], to cite just a few examples. Interestingly, highly fragile or robust network features were conserved across the ensemble. This suggested, as Bailey hypothesized, that analysis of experimentally constrained model ensembles could generate a reasonable estimate of what was important in a network without detailed parametric knowledge [6]. However, sensitivity analysis does not evaluate network performance following structural or operational perturbations [41]. Thus, an open question (yet to be explored) is whether an ensemble of models captures the fault tolerance or disturbance rejection properties of molecular networks.

The project described was supported by Award Number #U54CA143876 from the National Cancer Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute or the National Institutes of Health. We also acknowledge the generous support of the Office of Naval Research #N000140610293 to J.V. for the support of S.S.

The authors have declared no conflict of interest.

5 References

[1] Kitano, H., A robustness based approach to systems-oriented drug design. Nat. Rev. Drug Discov. 2007, 6, 202–210.
[2] Hornberg, J. J., Binder, B., Bruggeman, F. J., Schoeberl, B. et al., Control of mapk signalling: from complexity to what really matters. Oncogene 2005, 24, 5533–5542.
[3] Gadkar, K. G., Varner, J., Doyle, F. J., Model identification of signal transduction networks from data using a state regulator problem. Syst. Biol. (Stevenage) 2005, 2, 17–30.
[4] Gennemark, P., Wedelin, D., Benchmarks for identification of ordinary differential equations from time series data. Bioinformatics 2009, 25, 780–786.
[5] Bandara, S., Schlöder, J., Eils, R., Bock, H. G., Meyer, T., Optimal experimental design for parameter estimation of a cell signaling model. PLoS Comput. Biol. 2009, 5, e1000558.
[6] Bailey, J. E., Complex biology with no parameters. Nat. Biotechnol. 2001, 19, 503–504.
[7] Covert, M., Knight, E., Reed, J., Herrgard, M., Palsson, B., Integrating high-throughput and computational data elucidates bacterial networks. Nature 2004, 429, 92–96.
[8] Shen-Orr, S. S., Milo, R., Mangan, S., Alon, U., Network motifs in the transcriptional regulation network of Escherichia coli. Nature 2002, 31, 64–68.
[9] Song, S. O., Varner, J., Modeling and analysis of the molecular basis of pain in sensory neurons. PLoS One 2009, 4, e6758.
[10] Battogtokh, D., Asch, D. K., Case, M. E., Arnold, J., Schuttler, H. B., An ensemble method for identifying regulatory circuits with special reference to the qa gene cluster of Neurospora crassa. Proc. Natl. Acad. Sci. USA 2002, 99, 16904–16909.
[11] Kuepfer, L., Peter, M., Sauer, U., Stelling, J., Ensemble modeling for analysis of cell signaling dynamics. Nat. Biotechnol. 2007, 25, 1001–1006.
[12] Brown, K. S., Sethna, J. P., Statistical mechanical approaches to models with many poorly known parameters. Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 2003, 68, 021904.

[13] Palmer, T., Shutts, G., Hagedorn, R., Doblas-Reyes, F. et al., Representing model uncertainty in weather and climate prediction. Annu. Rev. Earth Planetary Sci. 2005, 33, 163–193.
[14] Gutenkunst, R. N., Waterfall, J. J., Casey, F. P., Brown, K. S. et al., Universally sloppy parameter sensitivities in systems biology models. PLoS Comput. Biol. 2007, 3, 1871–1878.
[15] Moles, C. G., Mendes, P., Banga, J. R., Parameter estimation in biochemical pathways: a comparison of global optimization methods. Genome Res. 2003, 13, 2467–2474.
[16] Stelling, J., Gilles, E. D., Doyle, F. J., Robustness properties of circadian clock architectures. Proc. Natl. Acad. Sci. USA 2004, 101, 13210–13215.
[17] Luan, D., Zai, M., Varner, J. D., Computationally derived points of fragility of a human cascade are consistent with current therapeutic strategies. PLoS Comput. Biol. 2007, 3, e142.
[18] Chen, W. W., Schoeberl, B., Jasper, P. J., Niepel, M. et al., Input-output behavior of erbb signaling pathways as revealed by a mass action model trained against dynamic data. Mol. Syst. Biol. 2009, 5, 239.
[19] Nayak, S., Salim, S., Luan, D., Zai, M., Varner, J. D., A test of highly optimized tolerance reveals fragile cell-cycle mechanisms are molecular targets in clinical cancer trials. PLoS One 2008, 3, e2016.
[20] Kholodenko, B. N., Kiyatkin, A., Bruggeman, F. J., Sontag, E. et al., Untangling the wires: a strategy to trace functional interactions in signaling and gene networks. Proc. Natl. Acad. Sci. USA 2002, 99, 12841–12846.
[21] Kremling, A., Fischer, S., Gadkar, K. G., Doyle, F. J. et al., A benchmark for methods in reverse engineering and model discrimination: Problem formulation and solutions. Genome Res. 2004, 14, 1773–1785.
[22] Gutenkunst, R. N., Waterfall, J. J., Casey, F. P., Brown, K. S. et al., Universally sloppy parameter sensitivities in systems biology. PLoS Comput. Biol. 2007, 3, e198.
[23] Casey, F. P., Baird, D., Feng, Q., Gutenkunst, R. N. et al., Optimal experimental design in an EGFR signaling and down-regulation model. IET Syst. Biol. 2007, 1, 190–202.
[24] Dickinson, R. P., Gelinas, R. J., Sensitivity analysis of ordinary differential equation systems – A direct method. J. Comp. Phys. 1976, 21, 123–143.
[25] Fonseca, C., Fleming, P. J., Genetic algorithms for multiobjective optimization: Formulation, discussion and generalization, in: Proceedings of the 5th International Conference on Genetic Algorithms, Morgan Kaufmann, San Mateo 1993, pp. 416–423.
[26] Gadkar, K. G., Doyle, F. J. 3rd, Crowley, T. J., Varner, J. D., Cybernetic model predictive control of a continuous bioreactor with cell recycle. Biotechnol. Prog. 2003, 19, 1487–1497.
[27] Varner, J. D., Large-scale prediction of phenotype: Concept. Biotechnol. Bioeng. 2000, 69, 664–678.
[28] Fussenegger, M., Bailey, J., Varner, J., A mathematical model of caspase function in apoptosis. Nat. Biotechnol. 2000, 18, 768–774.
[29] Schoeberl, B., Eichler-Jonsson, C., Gilles, E. D., Müller, G., Computational modeling of the dynamics of the map kinase cascade activated by surface and internalized egf receptors. Nat. Biotechnol. 2002, 20, 370–375.
[30] Li, H., Ung, C. Y., Ma, X. H., Liu, X. H. et al., Pathway sensitivity analysis for detecting pro-proliferation activities of oncogenes and tumor suppressors of epidermal growth factor receptor-extracellular signal-regulated protein kinase pathway at altered protein levels. Cancer 2009, 115, 4246–4263.
[31] Stites, E. C., Trampont, P. C., Ma, Z., Ravichandran, K. S., Network analysis of oncogenic ras activation in cancer. Science 2007, 318, 463–467.
[32] Helmy, M., Gohda, J., Inoue, J. I., Tomita, M. et al., Predicting novel features of toll-like receptor 3 signaling in macrophages. PLoS One 2009, 4, e4661.
[33] Tasseff, R., Nayak, S., Salim, S., Kaushik, P. et al., Analysis of the molecular networks in androgen dependent and independent prostate cancer revealed fragile and robust subsystems. PLoS One 2010, 5, e8864.
[34] Elowitz, M. B., Levine, A. J., Siggia, E. D., Swain, P. S., Stochastic gene expression in a single cell. Science 2002, 297, 1183–1186.
[35] Iyengar, K. A., Harris, L. A., Clancy, P., Accurate implementation of leaping in space: The spatial partitioned-leaping algorithm. J. Chem. Phys. 2010, 132, 094101.
[36] Cao, Y., Petzold, L. R., Rathinam, M., Gillespie, D. T., The numerical stability of leaping methods for stochastic simulation of chemically reacting systems. J. Chem. Phys. 2004, 121, 12169–12178.
[37] Lee, M. W., Vassiliadis, V. S., Park, J. M., Individual-based and stochastic modeling of cell population dynamics considering substrate dependency. Biotechnol. Bioeng. 2009, 103, 891–899.
[38] Swain, P. S., Elowitz, M. B., Siggia, E. D., Intrinsic and extrinsic contributions to stochasticity in gene expression. Proc. Natl. Acad. Sci. USA 2002, 99, 12795–12800.
[39] Bhardwaj, N., Yan, K. K., Gerstein, M. B., Analysis of diverse regulatory networks in a hierarchical context shows consistent tendencies for collaboration in the middle levels. Proc. Natl. Acad. Sci. USA 2010, 107, 6841–6846.
[40] Csete, M., Doyle, J., Bow ties, metabolism and disease. Trends Biotechnol. 2004, 22, 446–450.
[41] Shoemaker, J. E., Doyle, F. J., Identifying fragilities in biochemical networks: Robust performance analysis of fas signaling-induced apoptosis. Biophys. J. 2008, 95, 2610–2623.

© 2010 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim