Multilevel / Hierarchical Linear Modeling
Jasmina Tacheva
Method developed to deal with problems of single-level analysis, cross-level
inference, and the ecological fallacy.
Key elements
Varying coefficients
A model for those varying coefficients (which can itself include group-level
predictors)
Data requirements
In general, more than one unit per group is needed, and at least 10 groups
When the number of groups is small (less than 5), there is typically not
enough information to accurately estimate group-level variation; multilevel
models in this setting typically gain little beyond classical varying-coefficient
models.
Multilevel Analyses
If nesting is not taken into account when the data are analyzed, important
assumptions about the independence of errors are violated.
Independence of errors
Even if a simpler method (e.g., OLS) yields the same result as a multilevel
model, using it to analyze multilevel data is still inappropriate.
Children in the same classroom have the same teacher, so they cannot be
treated as independent observations. Students in another class also share a
teacher, but that teacher is different from the one in class #1; these
differences have to be controlled for.
Multilevel modeling does this in the most accurate way currently available.
Both approaches have problems: no pooling ignores information and can give
unacceptably variable inferences, and complete pooling suppresses variation
that can be important or even the main goal of a study.
Sub-group analyses can examine within-group relationships, but they do not
account for reliability and they treat parameters as fixed.
ANCOVA assumes slopes are the same across groups; differences in slopes violate this
assumption (homogeneity of regression slopes). MLM can test differences in slopes
without violating assumptions.
Classical regression can sometimes accommodate varying coefficients by using indicator
variables. The feature that distinguishes multilevel models from classical regression is in
the modeling of the variation between groups.
When enough groups are present at upper levels (typically more than 5), this
approach is more accurate than ICCs (intraclass correlations), which only deal
with the distributions of means, not with the relationships themselves.
Way to find middle ground between the "individualist fallacy" and the
"ecological fallacy"
One of the main purposes of multilevel models is to deal with cases where the
assumption of independence is violated; multilevel models do, however,
assume that (1) the level-1 and level-2 residuals are uncorrelated and (2) the
errors (as measured by the residuals) at the highest level are uncorrelated.
OLS assumptions vs. HLM assumptions
Example
Consider data from students in many schools, predicting in each school the
students' grades y on a standardized test given their scores on a pre-test x
and other information.
Varying-intercept model
Regressions have the same slope in each of the schools, and only the
intercepts vary.
We use i for individual students and j[i] for the school of student i
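Using this indexing, the varying-intercept model can be sketched in equation form (the notation follows Gelman and Hill; the normal distribution for the school intercepts is the model's standard assumption):

```latex
y_i = \alpha_{j[i]} + \beta x_i + \epsilon_i, \qquad i = 1, \ldots, n
\alpha_j \sim \mathrm{N}(\mu_\alpha, \sigma_\alpha^2), \qquad j = 1, \ldots, J
```

Each school j gets its own intercept \alpha_j, drawn from a common distribution, while the pre-test slope \beta is shared by all schools.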
In the case of a two-level model, one generally assumes that there is significant variation in σ²; that
is, one assumes that within-group variation is present. One does not necessarily assume, however,
that there will be significant intercept variation (τ00) or between-group slope variation (τ11).
In Step 1, one explores the group-level properties of the outcome variable to determine three things:
(1) the ICC(1) associated with the outcome variable (how much of the variance in the outcome can be
explained by group membership); (2) whether the group means of the outcome variable are reliable
(around .70; Bliese, 2000); (3) whether the variance of the intercept (τ00) is significantly larger than
zero. These three aspects of the outcome variable are examined by estimating an unconditional
means model.
This model states that the dependent variable is a function of a common intercept, γ00, and two error
terms: the between-group error term, u0j, and the within-group error term, rij. The model essentially
states that any Y value can be described in terms of an overall mean plus some error associated with
group membership and some individual error. Bryk and Raudenbush (1992) note that this model is
directly equivalent to a one-way random-effects ANOVA. In the unconditional means model, the fixed
portion of the model is γ00 (an intercept term) and the random component is u0j + rij. The random
portion of the model states that intercepts will be allowed to vary among groups.
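In equation form, the unconditional means model just described is (standard Bryk and Raudenbush notation):

```latex
Y_{ij} = \gamma_{00} + u_{0j} + r_{ij},
\qquad u_{0j} \sim \mathrm{N}(0, \tau_{00}), \quad r_{ij} \sim \mathrm{N}(0, \sigma^2)
```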
In the model, the fixed formula is WBEING~1. This states that the only
predictor of well-being is an intercept term. One can think of this model as
stating that in the absence of any predictors, the best estimate of any
specific outcome value is the mean value on the outcome. The random
formula is random=~1|GRP. This specifies that the intercept can vary as a
function of group membership. This is the simplest random formula that one
will encounter, and in many situations a random intercept model may be all
that is required to adequately account for the nested nature of the grouped
data. The option control=list(opt="optim") in the call to lme instructs the
program to use R's general-purpose optimization routine.
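The call described above can be sketched as follows. WBEING and GRP are the variable names from the text; the data frame here is simulated for illustration (the text itself fits this model to the bh1996 data set from the multilevel package):

```r
# Unconditional means model: an intercept-only fixed part plus a
# random intercept that varies by group.
library(nlme)

# Simulated stand-in for the bh1996 data: 20 groups of 10 observations,
# with some true between-group variation in the outcome.
set.seed(42)
d <- data.frame(GRP = factor(rep(1:20, each = 10)))
d$WBEING <- rep(rnorm(20, sd = 0.5), each = 10) + rnorm(200)

Null.Model <- lme(WBEING ~ 1,                     # fixed part: intercept only
                  random = ~ 1 | GRP,             # intercept varies by group
                  data = d,
                  control = list(opt = "optim"))  # use R's optim() routine
```

In many applications this random-intercept structure is all that is needed to account for the nesting.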
Estimating ICC. The unconditional means model provides between-group and within-group
variance estimates in the form of τ00 and σ², respectively. As with the ANOVA model, it is
useful to determine how much of the total variance is between groups. This can be
accomplished by calculating the Intraclass Correlation Coefficient (ICC) using the formula:
ICC = τ00 / (τ00 + σ²)
The VarCorr function provides estimates of variance for an lme object:
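A sketch of the ICC computation from a fitted lme object (data simulated for illustration; VarCorr on an lme fit returns a character matrix, so as.numeric is needed):

```r
# ICC = tau00 / (tau00 + sigma^2), using the variance estimates that
# VarCorr() extracts from a random-intercept lme fit.
library(nlme)

set.seed(42)
d <- data.frame(GRP = factor(rep(1:20, each = 10)))
d$WBEING <- rep(rnorm(20, sd = 0.5), each = 10) + rnorm(200)

fit <- lme(WBEING ~ 1, random = ~ 1 | GRP, data = d)
vc  <- VarCorr(fit)   # character matrix with "Variance" and "StdDev" columns

tau00  <- as.numeric(vc["(Intercept)", "Variance"])  # between-group variance
sigma2 <- as.numeric(vc["Residual",    "Variance"])  # within-group variance
icc <- tau00 / (tau00 + sigma2)
```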
The overall group-mean reliability is acceptable at .73, but several groups
have quite low reliability estimates. Specifically, group 72 and group 98 have
reliability estimates below .50.
The -2 log likelihood value for the gls model without the random intercept is 19536.17. The -2 log
likelihood value for the model with the random intercept is 19347.34. The difference of 188.83 is
significant on a chi-squared distribution with one degree of freedom (one model estimated a
variance term associated with a random intercept, the other did not, and this results in the one-df
difference). These results suggest that there is significant intercept variation.
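This comparison can be reproduced directly from the reported -2 log likelihood values; in practice, calling anova() on the fitted gls and lme objects performs the same test:

```r
# Likelihood-ratio test of the random intercept, from the -2 log
# likelihood values reported in the text.
neg2LL.without <- 19536.17   # gls model, no random intercept
neg2LL.with    <- 19347.34   # lme model, random intercept

delta <- neg2LL.without - neg2LL.with           # 188.83, on 1 df
p <- pchisq(delta, df = 1, lower.tail = FALSE)  # far below .001
```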
The first row states that individual well-being is a function of the group's intercept plus a
component that reflects the linear effect of individual reports of work hours plus some
random error. The second line states that each group's intercept is a function of some
common intercept (γ00) plus a component that reflects the linear effect of average group
work hours plus some random between-group error. The third line states that the slope
between individual work hours and well-being is fixed: it is not allowed to randomly vary
across groups. Stated another way, we assume that the relationship between work hours
and well-being varies by no more than chance levels among groups.
When we combine the three rows into a single equation we get an equation that looks like
a common regression equation with an extra error term (u0j). This error term indicates
that WBEING intercepts (i.e., means) can randomly differ across groups. The combined
model is:
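Writing HRS for individual work hours and G.HRS for group-mean work hours (the variable names used in the bh1996 data set), the combined model described above can be written as:

```latex
\mathrm{WBEING}_{ij} = \gamma_{00} + \gamma_{10}\,\mathrm{HRS}_{ij}
  + \gamma_{01}\,\mathrm{G.HRS}_{j} + u_{0j} + r_{ij}
```

This is a common regression equation plus the extra error term u0j, which lets the group intercepts differ randomly.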
From the plot of the first 25 groups in the bh1996 data set, it seems likely that
there is some slope variation. The plot, however, does not tell us whether or
not this variation is significant. Thus, the first thing to do is to determine
whether the slope variation differs by more than chance levels.
The last line of the model includes the error term u2j. This term indicates that
the leadership-consideration and well-being slope is permitted to randomly vary
across groups. The variance term associated with u2j is the between-group slope
variance, and it is this variance term that interests us in the cross-level
interaction hypothesis. Note that we have not permitted the slope between
individual work hours and individual well-being to randomly vary across groups.
In combined form the model is:
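With LEAD denoting leadership consideration, one plausible written-out form of this combined model (the subscripting follows the text's u2j term; variable names are taken from the bh1996 data set) is:

```latex
\mathrm{WBEING}_{ij} = \gamma_{00} + \gamma_{10}\,\mathrm{HRS}_{ij}
  + \gamma_{20}\,\mathrm{LEAD}_{ij} + u_{0j} + u_{2j}\,\mathrm{LEAD}_{ij} + r_{ij}
```

The random part now contains two group-level terms: u0j shifts each group's intercept, and u2j shifts each group's LEAD slope.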
Wald test
Chi-square: Snijders and Bosker (2012, p. 99) recommend using a "mixture
distribution" (or "chi-bar distribution") when testing a variance component
with a chi-square (deviance) difference, because the null value of a variance
lies on the boundary of the parameter space.
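For a random slope, the deviance-difference test adds one variance and one covariance, so the mixture p-value averages chi-square tail probabilities with 1 and 2 degrees of freedom. A minimal sketch (the deviance difference value here is illustrative, not taken from the text):

```r
# Mixture ("chi-bar") p-value for a likelihood-ratio test of a random
# slope (Snijders & Bosker, 2012): a 50:50 mix of chi-square(1) and
# chi-square(2) tail probabilities.
delta <- 5.99   # illustrative deviance difference between nested models

p.mixture <- 0.5 * pchisq(delta, df = 1, lower.tail = FALSE) +
             0.5 * pchisq(delta, df = 2, lower.tail = FALSE)
```

The naive chi-square(2) p-value would be conservative; the mixture correction makes the test of the variance component less so.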
aML
EGRET
GENSTAT
GLLAMM
HLM
MIXREG
MLwiN
SAS
S-Plus
SPSS
Stata
SYSTAT
WinBUGS
References
Kwok, O. M., Underhill, A. T., Berry, J. W., Luo, W., Elliott, T. R., & Yoon, M. (2008).
Analyzing longitudinal data with multilevel models: An example with individuals living
with lower extremity intra-articular fractures. Rehabilitation Psychology, 53(3), 370.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models (2nd ed.).
Newbury Park, CA: Sage Publications.