Publication details
https://www.routledgehandbooks.com/doi/10.4324/9780203848852.ch12
J. Kyle Roberts, James P. Monaco, Holly Stovall, Virginia Foster
Published online on: 19 Jul 2010
How to cite: J. Kyle Roberts, James P. Monaco, Holly Stovall, & Virginia Foster. (19 Jul 2010). Explained Variance in Multilevel Models. In Handbook of Advanced Multilevel Analysis. Routledge.
Accessed on: 28 Jan 2017
Downloaded By: University College London At: 14:31 28 Jan 2017; For: 9780203848852, chapter12, 10.4324/9780203848852.ch12
Explained Variance in Multilevel Models
J. Kyle Roberts
Annette Caldwell Simmons School of Education and Human
Development, Southern Methodist University, Texas
James P. Monaco
Laboratory for Computational Imaging and
Bioinformatics, Rutgers University, New Jersey
With the rise of the use and utility of multilevel modeling (MLM), one question has consistently been posed to authors and on listservs: "How much variance does my model explain?" Answering this question within the MLM framework is not an easy task, since it is actually possible to explain "negative variance" when the addition of explanatory variables increases the corresponding variance components (Snijders & Bosker, 1999). Because previously proposed effect size measures consider variance at each level separately, a single measure is needed that helps researchers interpret the strength of the model as a whole. The purpose of this paper is to provide a history of past MLM effect sizes and to present three new measures that consider "whole model" effects.
The utility of effect sizes in research interpretation has generated considerable discussion, much of which centers on the role and function of effect sizes, especially concerning their relationship to statistical significance tests (cf. Harlow, Mulaik, & Steiger, 1997). Many authors agree that effect sizes can serve a valuable function in helping to evaluate the magnitude of a difference or relationship (cf. Cohen, 1994; Kirk, 1996; Schmidt, 1996; Shaver, 1985; Thompson, 1996; Wilkinson & APA Task Force on Statistical Inference, 1999). Their articles, along with more recent publications (cf. Knapp & Sawilowsky, 2001a; Roberts & Henson, 2002), continue to debate both the use and utility of measures of effect size, considered both in conjunction with and apart from statistical significance testing.

One positive thing that has occurred while researchers began to debate the issue of effect size reporting (e.g., Knapp & Sawilowsky, 2001b; Thompson, …
    The general principle to be followed, however, is to provide the reader not only with information about statistical significance but also with enough information to assess the magnitude of the observed effect or relationship. (p. 26)

12.1 Catalog of Effect Sizes in MLM

A history of effect sizes has been dealt with exhaustively by Huberty (2002) and does not bear repeating here. One thing absent from Huberty's catalog was the use of effect size indices in multilevel analysis. It was wisely absent from Huberty's manuscript, since there is much misconception, and even disagreement, as to the interpretation of these effects. We will quickly list some of the proposed effect size indices for use in MLM and give brief explanations as to their utility.

12.1.1 Intraclass Correlation

Intraclass correlation (ICC) is generally thought of as the degree of dependence of individuals upon a higher structure to which they belong; or, the proportion of total variance that is between the groups of the regression equation. Put more succinctly, it is the ratio

ρ = τ0² / (τ0² + σ²),  (12.1)

where the numerator is represented by the variance at the second level of the hierarchy (τ0²), and the denominator represents the total variation in the model at both level-2 and level-1 (σ²).

Although the ICC actually is not a measure of the effect size of an MLM model, it bears mentioning here because it sometimes is wrongly thought of as a measure of the "power" or strength of MLM over ordinary least squares (OLS) regression. This type of thinking is commonly illustrated in passages like the following:

    Determining the proportion of the total variance that lies systematically between schools, called the intraclass correlation (ICC), constitutes the first step in an HLM analysis. We conduct this analysis with a fully unconditional model, which means that no student or school characteristics are considered. This first step can also indicate whether HLM is needed or whether a single level analytic method is appropriate. Only when the ICC is more than trivial (i.e., greater than 10% of the total variance in the outcome) would the analyst need to consider multilevel methods [emphasis mine]. Ignoring this step (i.e., assuming an ICC of either 0 or 1) would be inappropriate if the research question were multilevel. Investigation of contextual effects, I argue, is by nature a multilevel question. (Lee, 2000, p. 128)
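The ICC can be illustrated with a short calculation from the two variance components of the fully unconditional (null) model. This is a minimal sketch in Python; the variance estimates are hypothetical, not values from the chapter.

```python
# Intraclass correlation from a fully unconditional (null) model:
# tau00 is the level-2 (between-group) variance component and
# sigma2 is the level-1 (within-group) variance component.
def icc(tau00: float, sigma2: float) -> float:
    """Proportion of total variance lying between level-2 groups."""
    return tau00 / (tau00 + sigma2)

# Hypothetical null-model estimates.
print(icc(tau00=15.0, sigma2=85.0))  # 0.15
```

By the rule of thumb quoted from Lee (2000), this hypothetical ICC of 0.15 would already suggest that a multilevel analysis is warranted.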
Roberts (2002) has rightly pointed out that it would be incorrect to interpret this statistic as a measure of the magnitude of difference between OLS and MLM estimates.

… is reflected in many forms by Hox (2002, p. 64), or conversely as:

R2² = (τ00(null) − τ00(full)) / τ00(null).  (12.5)

Table 12.1
Model                                         σ̂e²      σ̂u0²
M0: scienceij = γ00 + u0j + eij               1.979    23.923
M1: scienceij = γ00 + γ10(ses) + u0j + eij    0.651    80.890

In M1, just a single level-1 predictor is included in the model, with variance estimates of σ̂e² = 0.651 and σ̂u0² = 80.890. If we consider this in terms of Equation 12.4, we see that we actually would be decreasing the variance explained (or increasing σ̂u0²) at the second level with the introduction of this predictor (hence negative variance explained between the two models).

Although the amount of variance explained is noteworthy at level-1 (R1² = (1.979 − 0.651)/1.979 = 0.671), the amount of variance explained at the second level is actually −2.381 (R2² = (23.923 − 80.890)/23.923 = −2.381). Not only is this number troubling, but it is counter-intuitive to the way most researchers think about the effectiveness of a model. If we were to interpret this model without previous knowledge of multilevel models, we might be inclined to say that adding the predictor "ses" makes for worse prediction of "math achievement" at the second level than having no predictor at all, since it explains −238% of the variance!

12.1.3 Explained Variance as a Reduction in Mean Square Prediction Error

Snijders and Bosker (1999) argue for a slightly different approach to computing R² values in multilevel models, based on the model's associated mean square prediction error. The R² for level-1 is then computed as:

R1² = 1 − var(Yij − Σh γh · Xhij) / var(Yij)
    = 1 − (σ̂²(full) + τ̂0²(full)) / (σ̂²(null) + τ̂0²(null)),  (12.6)

where Yij is the outcome variable, γh represents the coefficient for predictor variable Xhij for all h variables, σ̂² is an estimate of the variance at the first level, and τ̂0² is an estimate of the variance at the second level. The level-2 R² is then found by dividing the σ̂² by the group cluster size (B), or by the average cluster size for unbalanced data, such that:

R2² = 1 − var(Ȳ.j − Σh γh · X̄h.j) / var(Ȳ.j)
    = 1 − (σ̂²(full)/B + τ̂0²(full)) / (σ̂²(null)/B + τ̂0²(null)).  (12.7)

In this formula, it is easy to see that the R² estimate at level-2 is similar to the R² for level-1, having just reduced the level-1 variance to represent an average variance for each group. Although this estimation differs from the previous definition of R² (Equations 12.3 and 12.4), it is still possible to obtain "negative" values for R².

Using Table 12.1, R² at level-1 is:

R1² = 1 − (0.651 + 80.890) / (1.979 + 23.923) = −2.148,

and for level-2 is:

R2² = 1 − (0.651/10 + 80.890) / (1.979/10 + 23.923) = −2.356.
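The level-specific computations above can be verified in a few lines. This sketch in Python uses the Table 12.1 variance estimates and, following the chapter's level-2 computation, a cluster size of B = 10.

```python
# Reproducing the chapter's Table 12.1 example: adding the level-1
# predictor "ses" lowers the level-1 variance but raises the level-2
# variance, so the level-2 "explained variance" comes out negative.
sigma2_null, tau00_null = 1.979, 23.923   # M0 (null model) estimates
sigma2_full, tau00_full = 0.651, 80.890   # M1 (one predictor) estimates

# Proportional-reduction R^2 at each level.
r2_level1 = (sigma2_null - sigma2_full) / sigma2_null
r2_level2 = (tau00_null - tau00_full) / tau00_null
print(round(r2_level1, 3), round(r2_level2, 3))   # 0.671 -2.381

# Snijders and Bosker's mean-square-prediction-error versions
# (Equations 12.6 and 12.7), with cluster size B = 10.
B = 10
r1_sb = 1 - (sigma2_full + tau00_full) / (sigma2_null + tau00_null)
r2_sb = 1 - (sigma2_full / B + tau00_full) / (sigma2_null / B + tau00_null)
print(round(r1_sb, 3), round(r2_sb, 3))           # -2.148 -2.356
```

Both formulations agree that the level-2 summary is badly behaved here, which is the chapter's motivation for a whole-model measure.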
Figure 12.1
Graphical representation of correlation between y, ŷ, and the predictor variables x1, x2, and x3.
In the case of hierarchical linear modeling, these weights are derived through maximum likelihood estimates of the fixed effects, with the individual estimates being the product of empirical Bayesian estimates.

Although it would seem that a researcher could simply correlate these ŷ and y values to obtain an estimate of R², we must remember and maintain in MLM that we wish to honor the procedure by which the estimates were obtained. In OLS regression, we can typically compute the total variance in the model as:

Σi=1..n (Yi − Ȳ)².  (12.9)

In a multilevel model, we must remember that we typically define the null model as having both fixed effects (the grand mean for the dependent variable) and random effects (the …

… where ȳj is the random estimate for group j and yij is the original outcome score for person i in group j. This number is the same value as the sum of the squares of all of the residual values eij, but is distinctly different from the variance estimate from the unbiased estimator of σ², which contains a correction factor for the Q + 1 regression parameters, such that:

σ̂² = Σ eij² / (n − Q − 1).  (12.12)

As noted in the OLS model, the error variance in a model can be viewed as the distance between ŷ and y. Likewise, in MLM, after the ŷ are calculated using the full model, the error variance for the total model can be explained as: …
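The OLS quantities in Equations 12.9 and 12.12 can be sketched directly; the outcome scores and residuals below are hypothetical, and Q is the number of predictors in the regression.

```python
# Equation 12.9: total variance as a sum of squared deviations from
# the mean. Equation 12.12: the unbiased OLS error-variance estimate,
# which corrects for the Q + 1 estimated regression parameters.
def total_sum_of_squares(y):
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)

def unbiased_error_variance(residuals, Q):
    n = len(residuals)
    return sum(e ** 2 for e in residuals) / (n - Q - 1)

y = [10.0, 12.0, 9.0, 14.0, 10.0]   # hypothetical outcome scores
e = [0.5, -1.2, 0.3, 0.9, -0.5]     # hypothetical residuals e_ij
print(total_sum_of_squares(y))      # 16.0
print(unbiased_error_variance(e, Q=1))
```

The denominator n − Q − 1 is what distinguishes the unbiased estimator from a raw mean of squared residuals, the point the passage above is making.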
… that they do not use the unbiased estimator for variance. However, consider if we were to aggregate this formula to the level-2 grouping structure such that we gain an R² value for each level-2 group and then average across all groups. Doing so would further enhance the above formulas such that the estimate of variance explained would be defined by:

(Σj R²j) / k,  (12.16)

where k is the number of level-2 groups and, with σij² representing the variance of the i individuals around their jth group mean for the full model, σj² is the variance of β0j around γ00, and e″ij and u″j are the residual scores for the level-1 and level-2 estimates, respectively. The final estimate of variance …

… where e′ij is an estimate of the residual for each person i in the jth group in the null model, and:

p(yij) = p(dij | sj) · p(sj),  (12.19)

where p(dij | sj) is the probability of person i, given that they belong to the jth group, and p(sj) is the probability of group j, given the entire sample of level-2 units. In extrapolating Equation 12.19 further and applying the probability density function from a Gaussian distribution, it can be shown that: …

… [one can use this] statistic to interpret just how well a model is performing. Since the goal of most research is to find variables that fully describe the variation in the dependent variable, a measure like this could potentially prove very useful in helping the researcher make judgments about the effectiveness of an MLM model.

12.2.2 Group Initiated R² Based on Weighted Least Squares

In addition to the alternative ways of computing an effect size mentioned above, another type of effect size can be conceived through maximum likelihood methods. In a typical multilevel ANOVA, the grand estimate for the slope coefficient is simply the weighted least squares estimator (or maximum likelihood estimate) γ00, where:

γ̂00 = (Σj Δj⁻¹ · Ȳ•j) / (Σj Δj⁻¹).  (12.24)

In OLS regression, the familiar measure is:

R² = 1 − Σi=1..n (yi − ŷi)² / Σi=1..n (yi − ȳ)²,  (12.25)

where yi is any given individual's score on the dependent variable, ŷi is that individual's predicted score from the linear regression equation, and ȳ is the mean of all individuals on the dependent variable. For any given group within a set of level-2 units, Equation 12.25 could be considered the mathematical equivalent of:

R²j = 1 − (1/nj) Σi=1..nj (ŷij − yij)² / σij²,  (12.26)
where nj is the number of people in group j and σij² is the variance of the individuals in group j. For simplification purposes, we will define the latter part of Equation 12.26 as an error term corresponding to a normalized error for a given group. Representing this with the term Ei, Equation 12.26 can be thought of as:

R²j = 1 − (1/nj) Σi=1..nj Ei.  (12.27)

… we are producing an equation for each second-level group based on weighted least squares estimators. It seems appropriate, then, to produce an entire model R² that is also weighted for the probability of the group from which the estimate was drawn. Here p(sj) is the probability of group j occurring given the sample of all j groups, such that:

p(sj) = (1 / √(2πσj²)) · exp(−(u′j)² / (2σj²)).  (12.29)

In expanding Equation 12.27 to include all groups and also to reflect the need to use a weighted estimator, R²T could be thought of as:

R²T = 1 − [1 / (Σj=1..J p(sj) · nj)] · Σj=1..J p(sj) · Ej
    = 1 − [1 / (Σj=1..J p(sj) · nj)] · Σj=1..J Σi=1..nj p(sj) · Ei,  (12.30)

where Ej = Σi=1..nj Ei is the total normalized error for group j.
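Equations 12.26 through 12.30 can be pulled together in a short sketch. It assumes, for simplicity, a single within-group variance per group in place of the individual σij² terms, and all data are hypothetical.

```python
import math

# Group-level R^2 (Equations 12.26-12.27): one minus the mean of the
# normalized squared prediction errors E_i within group j.
def group_r2(y, y_hat, var_j):
    errors = [(yh - yo) ** 2 / var_j for yh, yo in zip(y_hat, y)]
    return 1 - sum(errors) / len(y), errors

# Equation 12.29: Gaussian weight p(s_j) from the group's level-2
# residual u'_j and the variance of the group intercepts sigma2_j.
def group_weight(u_j, sigma2_j):
    return math.exp(-u_j ** 2 / (2 * sigma2_j)) / math.sqrt(2 * math.pi * sigma2_j)

# Two hypothetical groups: (observed y, predicted y-hat, var_j, u'_j).
groups = [
    ([10.0, 12.0, 9.0], [10.5, 11.5, 9.5], 4.0, -0.4),
    ([20.0, 18.0, 21.0, 19.0], [19.0, 18.5, 20.5, 19.5], 4.0, 0.6),
]
sigma2_j = 1.0

# Equation 12.30: whole-model R^2_T as one minus the weighted sum of
# normalized errors over the weighted total group size.
num = den = 0.0
for y, y_hat, var_j, u_j in groups:
    _, errors = group_r2(y, y_hat, var_j)
    w = group_weight(u_j, sigma2_j)
    num += w * sum(errors)   # p(s_j) * sum_i E_i
    den += w * len(y)        # p(s_j) * n_j
r2_total = 1 - num / den
print(round(r2_total, 3))    # 0.912
```

Because each group's error is normalized by its own variance and then weighted by p(sj), the result lands on an OLS-like scale, which is the interpretive advantage the chapter claims for this family of measures.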
And since we already have defined R²j in Equation 12.26, we can then interpret:

R²T = (Σj=1..J nj · p(sj) · R²j) / (Σj=1..J nj · p(sj)),  (12.31)

which, theoretically, is simply the weighted least squares average of all of the R² values from each group. What is present in Equation 12.31 is a solution that will produce estimates similar in interpretation to OLS R² measures. Equation 12.31 seems considerably more appropriate than Equations 12.15 and 12.23, since it honors both the nesting structure of the data and the fact that the model was derived through weighted least squares estimates.

… explanation, it is hoped that the continued development of these models will help in their proliferation.

There is a caution, however, in making these models more accessible. Just because a researcher has the software and programming skills to utilize complicated techniques does not mean that the technique is warranted. With the growth of a likewise complicated field of statistics, MLM, Goldstein (1995) voiced similar concerns:

    There is a danger, and this paper reminds us of it, that multilevel modeling will become so fashionable that its use will be a requirement of journal editors, or even worse, that the mere fact of having fitted a multilevel model will become a certificate of statistical probity. That would be a great pity. These models are as good as the data they fit; they are powerful tools, not universal panaceas. (p. 202)
References

Harlow, L. L., Mulaik, S. A., & Steiger, J. H. (Eds.). (1997). What if there were no significance tests? Mahwah, NJ: Erlbaum.
Hox, J. (1995). Applied multilevel analysis. Amsterdam: TT-Publikaties.
Hox, J. (2002). Multilevel analysis: Techniques and applications. Mahwah, NJ: Erlbaum.
Huberty, C. J. (2002). A history of effect size indices. Educational and Psychological Measurement, 62(2), 227–240.
Kirk, R. E. (1996). Practical significance: A concept whose time has come. Educational and Psychological Measurement, 56, 746–759.
Knapp, T. R., & Sawilowsky, S. S. (2001a). Constructive criticisms of methodological and editorial practices. The Journal of Experimental Education, 70, 65–79.
Knapp, T. R., & Sawilowsky, S. S. (2001b). Strong arguments: Rejoinder to Thompson. The Journal of Experimental Education, 70, 94–95.
Kreft, I., & de Leeuw, J. (1998). Introducing multilevel modeling. Thousand Oaks, CA: Sage.
Lee, V. E. (2000). Using hierarchical linear modeling to study social contexts: The case of school effects. Educational Psychologist, 35(2), 125–141.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage.
Roberts, J. K. (2002). The importance of intraclass correlation in multilevel and hierarchical linear modeling designs. Multiple Linear Regression Viewpoints, 28(2), 19–31.
Roberts, J. K. (2004). An introductory primer on multilevel and hierarchical linear modeling. Learning Disabilities: A Contemporary Journal, 2(1), 30–38.
Roberts, J. K., & Henson, R. K. (2002). Correction for bias in estimating effect sizes. Educational and Psychological Measurement, 62(2), 241–253.
Schmidt, F. (1996). Statistical significance testing and cumulative knowledge in psychology: Implications for the training of researchers. Psychological Methods, 1, 115–129.
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464.
Shaver, J. (1985). Chance and nonsense. Phi Delta Kappan, 67, 57–60.
Snijders, T., & Bosker, R. (1999). Multilevel analysis. Thousand Oaks, CA: Sage.
Thompson, B. (1996). AERA editorial policies regarding statistical significance testing: Three suggested reforms. Educational Researcher, 25(2), 26–30.
Thompson, B. (2001). Significance, effect sizes, stepwise methods, and other issues: Strong arguments move the field. Journal of Experimental Education, 70, 80–93.
Wilkinson, L., & American Psychological Association Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanation. American Psychologist, 54, 594–604. [Reprint available through the APA Home Page: http://www.apa.org/journals/amp/amp548594.html]
Xu, R. (2003). Measuring explained variation in linear mixed effects models. Statistics in Medicine, 22, 3527–3541.