
Introduction to Path Analysis and Structural Equation Modelling with AMOS

Daniel Stahl, Biostatistics and Computing

Last week: Family Tree of SEM

Bivariate Correlation
Multiple Regression
Path Analysis
Structural Equation Modeling
Factor Analysis
Confirmatory Factor Analysis
Exploratory Factor Analysis
Last week: Path analysis

[Path diagrams: Model 1 and Model 2]

Last week: Path analysis with latent constructs = Structural Equation Modelling

[Path diagram: latent variables Pain, Depression and Function, each measured by Item 1 to Item 3 with error terms Error 1 to Error 3]

Last week: AMOS toolbar

[SEM diagram: measurement model (latent Pain, Depression and Function, each with Items 1-3 and Errors 1-3) and structural model linking the latent variables]

1. Observed variables
2. Unobserved variables
3. Drawing latent variable (draws latent variable and items)
4. Drawing path (causal relationship, regression)
5. Draw covariances (correlation, no direction)
6. Unique variable (error variable; add e.g. to each dependent variable)
7. List variables (open data file first, then drag and drop variables)
8. Select one object, select all, deselect
9. Move object
10. Delete
11. Select data file
12. Analysis properties (choose statistics)
13. Calculate estimates (starts the analysis)
14. View text (see results)
15. Copy graph to clipboard
16. Save
Last week: SEM diagram symbols

Observed variable
Latent variable with disturbance or error
Observed variable with measurement error (endogenous variable)
Unidirectional path (regression)
Correlation between variables
Reciprocal relation between variables
Latent variable with items (observed variables)
Endogenous variables have error variances pointing at them = unexplained variance

Last week: Correlation and Regression as AMOS path models

[Path diagrams: correlation (pain <-> depress), simple linear regression (pain -> depress with error term), multiple linear regression (pain and function -> depress with error term)]

Today

Variance, covariance, correlation and regression coefficients
SEM/path analysis is based on the covariance matrix
The logic of model testing in SEM
Model fit and model comparisons
Simple latent trait models

Exercise

Do a similar analysis as last week with the data file PATHINGRAM.sav. The data are from: Ingram, K. L., Cope, J. G., Harju, B. L., & Wuensch, K. L. (2000). Applying to graduate school: A test of the theory of planned behavior. Journal of Social Behavior and Personality, 15, 215-226. Ajzen's theory of planned behavior was used to predict students' intentions and application behaviour (applying to graduate school) from their attitudes, subjective norms and perceived behavioural control.

Five variables (derived from questionnaires)

Perceived Behavioural Control (PBC)
Subjective norm
Attitude
Intention (to apply to college)
Behaviour (applications)

Ajzen's theoretical model

PBC, subjective norm and attitude influence Intention
PBC, subjective norm and attitude correlate with each other
Intention influences Behaviour
PBC also influences Behaviour

Exercise

Draw the path diagram for the model.
(Conduct a path analysis with a series of multiple regression analyses using SPSS.)
(Calculate the standardised indirect effects using the standardised estimates from the regression analysis.)
Check your results using AMOS.
Use a bootstrap analysis to evaluate the indirect effect.
Remove some indirect effects and compare the results with the theoretical model.

Path diagram for AMOS

[Path diagram: Perceived Behavior Control, Subjective Norm and Attitude (correlated) -> Intention (error e1); Intention and Perceived Behavior Control -> Behavior (error e2)]

Results

[Path diagram with standardised estimates for the theory of planned behaviour model: path coefficients as in the table below, plus correlations among Perceived Behavior Control, Subjective Norm and Attitude and squared multiple correlations for Intention and Behavior]

Main results: AMOS output

Standardized Regression Weights (with bootstrap confidence intervals):

Parameter               Estimate   Lower   Upper      P
Intent   <-- PBC           -.126   -.352    .126   .293
Intent   <-- SubNorm        .095   -.118    .314   .430
Intent   <-- Attitude       .807    .596    .985   .002
Behavior <-- Intent         .350    .075    .548   .013
Behavior <-- PBC            .336    .092    .555   .005

Standardized Indirect Effects on Behavior:

            Estimate   95% CI             P
Attitude       .282    (0.08 to 0.50)    .007
SubNorm        .033    (-0.04 to 0.13)   .339
PBC           -.044    (-0.14 to 0.04)   .277
Intent         .000

Variances, Covariances and Correlations

How do we know that this model fits the data well? Are there better models? How can we compare two or more models? First we need to know a little bit about covariances.

SEM and path analysis are based on variances and covariances. Variance is a measure of the dispersion of data: it indicates how values are spread around the mean. Covariance is a measure of the covariation between two variables, such as pain and function.

Example: scores of pain and function

Variance is defined as:

Var(X) = \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1}

[Plot: individual pain and function scores ("score", 30 to 120) by group]

Centering

We centre both variables = subtracting the mean from each individual score. We get a mean of 0 but the same distribution around the mean and the same variance. Centering removes the constant in a regression and makes life easier for us.

Variance of pain and function

Descriptive Statistics:

Variable      N     Mean    Std. Deviation   Variance
pain          50   49.45     4.482            20.084
function      50   74.65    12.530           157.012
c_pain        50     .00     4.482            20.084
c_function    50     .00    12.530           157.012

[Plot: centred scores (c_score, -30 to 30) by group]

Scatterplot of pain versus body function (not centred)

Scatterplot of pain versus body function (centred)

[Scatterplots of c_pain (x-axis, -10 to 10) versus c_function (y-axis, -30 to 30); R Sq Linear = 0.33]

Covariance

The covariance is an unstandardised measure of association between two variables: a measure of the degree to which X and Y vary together.

Cov(X,Y) = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{N - 1}

In our example the covariance between pain and function is 32. This means that if pain increases by one unit of its own variance (variance of pain = 20), function will increase by 32.

[Scatterplot of c_pain versus c_function; R Sq Linear = 0.33]

Covariance matrix

The covariance of a variable X with itself is its variance:

Cov(X,X) = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(X_i - \bar{X})}{N - 1} = Var(X)

Covariance matrix of pain and function (= variance/covariance matrix):

            Pain   Function
Pain          20         32
Function      32        157

Correlation

The correlation coefficient r is a standardised measure of the association between two variables with a range of [-1, +1]:

Corr(X,Y) = \frac{Cov(X,Y)}{\sqrt{Var(X) \cdot Var(Y)}} = \frac{Cov(X,Y)}{sd(X) \cdot sd(Y)}

-1 = perfect negative association
 0 = no association (random)
+1 = perfect positive association

If we standardise X and Y to new variables Zx and Zy with a mean of 0 and a variance of 1, then Cov(Zx, Zy) = Corr(X, Y).
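As a quick numeric check (not on the original slide), plugging the pain/function values from the covariance matrix above into this formula gives

Corr(\text{pain},\text{function}) = \frac{32}{\sqrt{20 \times 157}} \approx \frac{32}{56.0} \approx 0.57

which, using the unrounded values 32.25, 20.084 and 157.012, matches the standardised coefficient of .574 used on the following slides.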

Correlation and regression coefficient b

In our example the correlation between pain and function is 0.574. The interpretation is: if pain increases by 1 standard deviation (= sqrt(Var)), then body function increases by 0.574 SD. The SD of pain is 4.5 and the SD of function is 12.5, so if pain increases by 4.5 then function will increase by 0.574 * 12.5 ≈ 7.2.

If we assume that variable X (pain) influences variable Y (function), then we can describe the relationship by a regression equation:

y = c + b*x; with centred variables c = 0, so y = b*x

We can estimate the regression coefficient b (based on the least-squares method) by:

b = \frac{Cov(X,Y)}{Var(X)}

Example

What is the regression coefficient b for the regression equation function = b*pain?

V/CV        Pain   Function
Pain          20         32
Function      32        157

b = \frac{Cov(Pain, Function)}{Var(Pain)} = \frac{32.25}{20.1} = 1.605

Coefficients (dependent variable: c_function):

Model 1        B           Std. Error   Beta    t       Sig.
(Constant)     5.8E-012    1.466                .000    1.000
c_pain         1.605        .330        .574   4.858     .000

Question

If we assume that function influences pain, what will b be (pain = b*function)?

b = \frac{Cov(X,Y)}{Var(Y)} = \frac{Cov(pain, function)}{Var(function)} = \frac{32.25}{157} = 0.205

Coefficients (dependent variable: c_pain):

Model 1        B           Std. Error   Beta    t       Sig.
(Constant)     -3E-012      .524                .000    1.000
c_function      .205        .042        .574   4.858     .000
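A small sketch (not from the slides) reproducing these coefficients directly from the variance/covariance matrix, with the matrix entries taken from the slide above:

import numpy as np

# variance/covariance matrix of (pain, function) from the slides
cov = np.array([[20.084,  32.25],
                [ 32.25, 157.012]])

b_fun_on_pain = cov[0, 1] / cov[0, 0]   # function = b * pain  -> 32.25 / 20.084 ≈ 1.605
b_pain_on_fun = cov[0, 1] / cov[1, 1]   # pain = b * function  -> 32.25 / 157.012 ≈ 0.205

beta = cov[0, 1] / np.sqrt(cov[0, 0] * cov[1, 1])   # standardised coefficient ≈ 0.574 either way
print(b_fun_on_pain, b_pain_on_fun, beta)

This illustrates the point of the slides: given only the covariance matrix, both regression coefficients and the standardised beta can be recovered without the raw data.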

Extension to Multiple Regression

In the case of multiple predictors (X1 and X2), b becomes a partial regression coefficient:

Y = c + b1*X1 + b2*X2
Depression = c + b1*Function + b2*Pain

b1 is the average change in Y per unit change in X1 with X2 held constant, i.e. the average change in depression score per unit change in body function with pain held constant. In our example:

depression = c + 0.061*pain - 0.523*function + error

A partial regression coefficient is obtained by removing (partialling out) the correlation with the other predictors. The point: partial regression coefficients can be viewed as functions of operations on a correlation or covariance matrix.

Optional

How to calculate the partial correlation (partial covariance) and the partial beta for standardised data simply from the correlation matrix (= covariance matrix for standardised data).

Optional: Partial correlation

The partial correlation can be calculated just from the correlation matrix:

r_{xy|z} = \frac{r_{xy} - r_{xz} r_{yz}}{\sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}}

Similarly, partial regression coefficients can be obtained from the variance/covariance matrix.

Optional: Example of partial correlation

Find the correlation between pain and depression while partialling out the effect of function.

Correlation matrix:

            pain      depress   function
pain        1          .337**   -.455**
depress      .337**   1         -.421**
function    -.455**   -.421**   1

r_{pain,depr|funct} = \frac{r_{pain,depr} - r_{pain,funct} \, r_{depr,funct}}{\sqrt{(1 - r_{pain,funct}^2)(1 - r_{depr,funct}^2)}} = \frac{0.337 - (-0.455)(-0.421)}{\sqrt{[1 - (-0.455)^2][1 - (-0.421)^2]}} = \frac{0.145}{0.80} = 0.18

SPSS partial correlations (control variable: function):

                          pain     depress
pain     Correlation      1.000     .180
         Sig. (2-tailed)      .     .028
         df                   0      146
depress  Correlation       .180    1.000
         Sig. (2-tailed)   .028        .
         df                 146        0
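As a small numeric sketch (not in the original slides), the same partial correlation can be computed in Python from the correlations quoted above; it reproduces the value of 0.18 from the SPSS output:

import numpy as np

r = {('pain', 'depress'): 0.337, ('pain', 'function'): -0.455, ('depress', 'function'): -0.421}

def partial_corr(r_xy, r_xz, r_yz):
    # r_xy.z = (r_xy - r_xz*r_yz) / sqrt((1 - r_xz^2)(1 - r_yz^2))
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

# pain-depression correlation, partialling out function: ≈ 0.18
print(partial_corr(r[('pain', 'depress')], r[('pain', 'function')], r[('depress', 'function')]))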

Optional: Standardised partial beta

Partial standardised regression coefficients can be obtained from the correlation coefficients. Regression model with standardised (z-transformed) data:

Z_y = \beta_1 Z_{x1} + \beta_2 Z_{x2}

\beta_{x1} = \frac{r_{y,x1} - r_{y,x2} \, r_{x1,x2}}{1 - r_{x1,x2}^2}
\qquad
\beta_{x2} = \frac{r_{y,x2} - r_{y,x1} \, r_{x1,x2}}{1 - r_{x1,x2}^2}

Conclusion: Variance/Covariance Matrix

Knowing the variance/covariance matrix of our variables allows us to estimate our regression as well as our path and structural equation models.
All information for SEM is in the variance/covariance matrix; we do not need the original data set to do our path analyses or SEMs. Raw data will be converted into a covariance matrix (and we can also just import a covariance or a correlation matrix into AMOS). (The sample size is needed for statistical tests.)
Because the analyses are based on the covariances, SEM is also called covariance structure analysis.
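As a numeric sketch of the standardised partial beta formulas above (not part of the slides), using the pain/function/depression correlations from the earlier matrix with depression as the outcome:

import numpy as np

# correlations from the earlier matrix: y = depress, x1 = pain, x2 = function
r_y_x1, r_y_x2, r_x1_x2 = 0.337, -0.421, -0.455

beta_x1 = (r_y_x1 - r_y_x2 * r_x1_x2) / (1 - r_x1_x2**2)   # standardised partial beta for pain
beta_x2 = (r_y_x2 - r_y_x1 * r_x1_x2) / (1 - r_x1_x2**2)   # standardised partial beta for function
print(round(beta_x1, 3), round(beta_x2, 3))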

Correlation matrix

If we use the correlation matrix instead of the covariance matrix, we obtain the standardised beta coefficients. Knowing the variances of the original data allows us to calculate the unstandardised regression coefficients from the correlation matrix. In general, correlation matrices may lead to imprecise parameter estimates and standard errors in complex models, so variance/covariance matrices are preferred.

Aim of SEM

The aim of SEM is often to find a model with the smallest necessary number of parameters which still adequately describes the observed covariance structure (the reconstructed or estimated covariance matrix based upon our theoretical model should resemble the observed covariance matrix). How do we know that a model is a good model and fits the data?

The Logic of Model Testing in SEM

1. We start with our observed covariance or correlation matrix for X1, X2 and X3:

        X1    X2    X3
   X1   1.0   r12   r13
   X2         1.0   r23
   X3               1.0

2. We hypothesize a model to test, e.g.: X1 -> X2 -> X3

3. This model can be represented by the following equations:

   ρ12 = r12 (direct effect of X1 on X2)
   ρ13 = r12 r23 (indirect effect of X1 on X3 via X2)
   ρ23 = r23 (direct effect of X2 on X3)

4. The ρij represent reconstructed or estimated correlations (covariances) based upon the theoretical model.

5. We compare the reconstructed with the observed correlations (covariances).

Observed correlation matrix (= covariance matrix with standardised variables):

            pain     function   depress
pain        1.000    -.455       .337
function              1.000     -.421
depress                          1.000

Our hypothesized model:

pain -> function -> depress (with error terms error2 on function and error on depress)

[Path diagram: standardised paths -.46 (pain -> function) and -.42 (function -> depress); squared multiple correlations .21 (function) and .18 (depress)]

This model can be represented by the following correlations (equations):

ρ(PF) = r(PF) = -0.46 (direct effect of pain on function)
ρ(PD) = r(PF) * r(FD) = -0.46 * -0.42 = 0.192 (indirect effect of pain on depression via function)
ρ(FD) = r(FD) = -0.42 (direct effect of function on depression)
All variances are 1 (standardised).

Reconstructed (estimated) correlations based upon our theoretical model:

            pain     function   depress
pain        1.000    -.455       .192
function              1.000     -.421
depress                          1.000

Is the expected correlation matrix very different from the observed matrix (the more similar, the better the model)? E.g. a chi-square goodness of fit test:

\chi^2 = \sum \frac{(r_{ij}^{\mathrm{observed}} - \rho_{ij}^{\mathrm{expected}})^2}{\rho_{ij}^{\mathrm{expected}}}
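A minimal Python sketch of this logic (not part of the slides), building the implied correlation matrix for the chain model pain -> function -> depress from the observed correlations quoted above:

import numpy as np

# observed correlations (pain, function, depress) from the slide
r_obs = np.array([[ 1.000, -0.455,  0.337],
                  [-0.455,  1.000, -0.421],
                  [ 0.337, -0.421,  1.000]])

# chain model: the implied pain-depress correlation is the product of the two path coefficients
r_pf, r_fd = r_obs[0, 1], r_obs[1, 2]
r_imp = r_obs.copy()
r_imp[0, 2] = r_imp[2, 0] = r_pf * r_fd          # -0.455 * -0.421 ≈ 0.192

discrepancy = r_obs - r_imp                      # only the pain-depress cell differs (0.337 - 0.192 = 0.145)
print(r_imp.round(3))
print(discrepancy.round(3))

The remaining discrepancy in the pain-depress cell is what the chi-square fit statistic on the next slides summarises.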

Conclusion

Next step: if our model variance/covariance matrix fits the observed data well, we can calculate the path regression coefficients and error variances from the matrix. In our simple example (without partial coefficients) the correlation coefficients and standardised betas are identical. Again, all we need is the covariance matrix!
Path coefficients can be estimated using multiple regression methods (standardised partial coefficients) based upon a given model and can be used to reconstruct the correlation (or covariance) matrix.
The estimated (reconstructed or expected) correlations can be compared with the observed correlations, and a chi-square test will show whether the model fits. Important: a non-significant chi-square denotes good fit. The more similar the observed and expected correlations, the smaller the chi-square and the better the model.

\chi^2 = \sum \frac{[r_{ij}(\mathrm{observed}) - r_{ij}(\mathrm{expected})]^2}{r_{ij}(\mathrm{expected})}

with n degrees of freedom, n = df(hypothesized model) - df(saturated model)

[Path diagram: pain -> function (-.46) -> depress (-.42), with squared multiple correlations .21 (function) and .18 (depress) and error terms error2 and error]

Chi-square (χ2) test for association

The most commonly used model fit statistic. χ2 measures the degree of discrepancy between the theoretically expected values and the empirical data. The larger the discrepancy (independence), the sooner χ2 becomes significant. Because we are dealing with a measure of misfit, the p-value for χ2 should be larger than .05 (ideally at least 0.2) to conclude that the theoretical model fits the data! However, the overall χ2 is a poor measure of fit and with large n, χ2 will generally be significant. The χ2 test can only be used to compare nested models (i.e. identical models except that Model 2 deletes at least one parameter found in Model 1).

In our example, the model is significantly different from the observed covariance matrix. Therefore, it does not adequately describe the observed relationships in our data!

Many other measures of model fit, each with their own assumptions and limitations, have been developed.

Coefficient estimation methods

Estimation methods minimise the discrepancy between the obtained covariance matrix and the covariance matrix implied by the model:
Maximum Likelihood (ML)
Unweighted Least Squares (ULS)
Generalized Least Squares (GLS)
Scale-free Least Squares (SLS)
Asymptotically Distribution Free (ADF) (related to the weighted least squares method)

Factor analysis addresses these questions: Can the covariances or correlations between a set of observed variables be explained in terms of their relationship with a smaller number of unobservable latent variables? I.e. does the correlation between each pair of observed variables result from their mutual association with the latent variables; i.e. is the partial correlation between any pair of observed variables, given the values of the latent variables, approximately zero?

Parameter estimation methods

                                           ML      ADF            GLS     ULS     SLS
Multinormal distribution assumed           Y       N              Y       N       N
Invariance of scale                        Y       Y              Y       N       Y
Minimum sample size                        >100    1.5*p(p+1)     >100    >100    >100
Inference (chi2)                           Y       Y              Y       N       N
Information criteria measures (AIC/BIC)    Y       N              N       N       N

(p = number of observed variables. If a method is not scale invariant, standardise the variables. ML gives the best precision if its assumptions are met.)

Maximum Likelihood Estimation

ML is the most commonly used method. The objective of maximising the likelihood is to find parameter values that maximise the joint probability density function, i.e. that maximise the likelihood of having observed the data (= finding parameters that make the predicted covariance matrix as similar as possible to the observed covariance matrix).
ML in AMOS assumes a multivariate normal distribution and continuous data: each variable should be normally distributed for each value of each other variable; specifically, ML requires multinormality of the endogenous variables. ML is often robust to violations of the multinormal distribution and continuous-data assumptions.
It is a good method for missing data if they are missing at random (MAR): AMOS uses full-information ML.
If ordinal data are used, they should have at least five categories, and skewness and kurtosis should be small (values in the range of -1 to +1 or -1.5 to +1.5). AMOS 7 can handle ordinal data but needs Bayesian estimation methods.

Checking the assumptions: the same visual inspection as for multiple regression - look at histograms, QQ plots and boxplots.

Test of overall model fit

Chi-square test
Bollen-Stine bootstrap

Test of individual parameter estimates (regression coefficients or covariances)

C.R.: critical ratio. The critical ratio is the parameter estimate divided by an estimate of its standard error (a z or t value).
Bootstrap confidence intervals and tests

What is bootstrapping?

Simulation studies (see Kline 1998: 209) suggest that under conditions of severe non-normality of the data, SEM parameter estimates (e.g. path estimates) are still fairly accurate, but the corresponding significance coefficients are too high. Use bootstrap tests if possible.
Bootstrapping is a way of estimating standard errors, confidence intervals and significance based not on assumptions of normality but on empirical resampling with replacement of the data. It has minimal assumptions: it is merely based on the assumption that the sample is a good representation of the unknown population, i.e. we assume that the observed data resemble the true distribution. It generates information on the variability of parameter estimates or of fit indexes from the empirical samples, not from assumptions about the probability theory of normal distributions. Bootstrapping in SEM still requires moderately large samples. If the data are multivariate normal, MLE will give less biased estimates; however, if the data lack multivariate normality, bootstrapping gives less biased estimates.

The general bootstrap algorithm:
1. Generate a sample of size n from your data set with replacement.
2. Compute your parameter of interest for this bootstrap sample (e.g. do an SEM analysis and get a regression coefficient b). For each random sample we get a different parameter estimate.
3. Repeat steps 1 and 2 1000 times. By this procedure we end up with 1000 bootstrap values B = (B1, B2, ..., B1000).
4. Sort the bootstrap values from smallest to largest. Using the sorted B1, B2, ..., B1000, find the 2.5th and 97.5th percentile values, or simply the 25th and 975th observations of the sorted 1000 values.
5. Calculate the standard deviation of the 1000 B's. This is the estimate of the standard error (SE) of your parameter estimate b. An approximate two-sided 95% confidence interval is B ± 1.96*SE. Test statistic z = B/SE; if z > 1.96, then p < 0.05.

Frequency distribution of the B's calculated from the bootstrap samples

[Histogram: frequency of the bootstrap estimates Bi]
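A minimal Python sketch of this algorithm (not part of the slides), applied to a simple regression slope on simulated data; in AMOS the same idea is requested by ticking the bootstrap options under Analysis properties:

import numpy as np

rng = np.random.default_rng(0)

def bootstrap_slope(x, y, n_boot=1000):
    """Bootstrap SE and percentile CI for the slope of y = b*x (centred data)."""
    n, est = len(x), []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                      # resample cases with replacement
        xb, yb = x[idx] - x[idx].mean(), y[idx] - y[idx].mean()
        est.append(np.sum(xb * yb) / np.sum(xb ** 2))    # b = Cov/Var on the bootstrap sample
    est = np.sort(est)
    se = est.std(ddof=1)                                 # bootstrap standard error
    return se, (est[int(0.025 * n_boot)], est[int(0.975 * n_boot)])

# hypothetical data, only to make the sketch runnable
x = rng.normal(size=50)
y = 1.6 * x + rng.normal(scale=2.0, size=50)
print(bootstrap_slope(x, y))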

Maximum Likelihood Estimation

Violations of the assumptions will cause an inflation of parameter estimates and chi-square values. Lack of normality will cause a Type I error (inflated chi-square values could lead us to think that the model does not fit well)! The Bollen-Stine bootstrap or the Satorra-Bentler adjusted chi-square (not in AMOS) can be used for inference if the normal distribution assumption is violated.
(but: see www.statmodel.com/chidiff.shtml)

Goodness of fit measures based on the predicted vs. observed covariance matrix

Chi-square test - criterion: P > 0.2. Compares the model with the saturated, full model (e.g. all paths) or compares two nested models. Not the most robust method; sample size dependent.
Bollen-Stine bootstrap - criterion: P > 0.2. Compares the model with the saturated, full model (e.g. all paths) or compares two nested models. A robust method that does not assume a multinormal distribution; sample size dependent.
Relative chi-square (chi2/df; CMIN/DF in AMOS) - criterion: <= 2.5.
Satorra-Bentler corrected chi-square - criterion: P > 0.2. Corrects for kurtosis and bias. Not in AMOS.

Note on violations of distributional assumptions: violations will tend to underestimate the standard errors of the parameter estimates, so regression paths or covariances are found to be statistically significant more often than they should be (an inflated Type I error rate). Use ADF estimation as a robust alternative, use other goodness of fit measures to evaluate models, and if possible use bootstrap tests and confidence intervals.

Goodness of fit comparisons between models

Normed Fit Index (NFI) - criterion: >= 0.9. 0.9 = 90% of the covariation in the data can be reproduced by the given model.
Comparative Fit Index (CFI) - criterion: >= 0.9. Similar interpretation to the NFI; least affected by sample size.
Incremental Fit Index (IFI) - criterion: >= 0.9. Similar to the CFI but more robust; underestimates fit if N is small.
Tucker-Lewis Index (TLI, = non-normed fit index, NNFI) - criterion: 0.9-0.95. Similar to the NFI but penalizes for model complexity and is little affected by sample size.

Goodness of fit measures based on the predicted and observed covariance matrix, but penalizing for lack of parsimony (overfitting)

Root mean square error of approximation (RMSEA) and RMR - criterion: < 0.05. Discrepancy per df; least affected by sample size, penalizes for lack of parsimony, and can be used to compare different models (but information criteria are better).
Parsimony ratio (PRATIO) = df(model) / df(independence model).
Parsimony-adjusted indices: PNFI = PRATIO * NFI (BBI), PRATIO * NNFI (not in AMOS?), PCFI = PRATIO * CFI.

Goodness of fit measures based on information theory

These measures can be used to compare non-nested as well as nested models, penalize for model complexity (overfitting), and all need ML estimation. Smaller values are better.

Akaike's IC (AIC) - penalizes for lack of parsimony. Idea: the smallest AIC of a set of models identifies the model most likely to be the best model (not the true model). Problem: may select more complex models.
Bayesian IC (BIC) - penalizes for sample size and lack of parsimony. Idea: the smallest BIC identifies the model most likely to be the true model among the set of models. Problem: may select too parsimonious models.
Consistent AIC (CAIC) - penalizes for sample size and lack of parsimony; the penalty is greater than for AIC and BIC. Problem: may select too parsimonious models.

Evaluation of fit of model structures

Model fit criteria do not specify which parts of the model may not fit. Compare the observed and model-implied covariances:
Evaluate the residual matrix (residual covariances): entries should be < 0.1, better < 0.05.
Standardised residuals.
Critical ratio: the estimate divided by its standard error. If the data are a random sample and the variables are multinormally distributed, estimates with critical ratios greater than 1.96 are significant at the .05 level, which suggests an important contribution to the model. Use bootstrap methods with small sample sizes and/or non-normal distributions.
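To make the parsimony penalty in the information criteria listed above concrete, here is a small sketch in the forms commonly used for SEM; the exact constants that AMOS uses are an assumption here and should be checked against its manual:

import numpy as np

def information_criteria(chi2, q, n):
    """Common SEM formulations (assumed, illustrative only):
    chi2 = minimised discrepancy, q = number of free parameters, n = sample size."""
    aic  = chi2 + 2 * q                  # penalty grows with the number of parameters
    bic  = chi2 + q * np.log(n)          # penalty also grows with sample size
    caic = chi2 + q * (np.log(n) + 1)    # stricter penalty than AIC and BIC
    return aic, bic, caic

print(information_criteria(chi2=12.3, q=8, n=150))   # hypothetical values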

Observed (implied) covariance matrix and residual matrices

Implied Covariances (Group number 1 - Default model):

            pain     function   depress
pain        4.079
function    -.397     .186
depress      .259    -.121       .448

Residual Covariances (Group number 1 - Default model):

            pain     function   depress
pain         .000
function     .000     .000
depress      .197     .000       .000

Standardized Residual Covariances (Group number 1 - Default model):

            pain     function   depress
pain         .000
function     .000     .000
depress     1.740     .000       .000

The residual matrices suggest that the covariance between depression and pain is not adequately modelled.

Critical ratios (C.R.)

Regression Weights: (Group number 1 - Default model)

                         Estimate   S.E.     C.R.     P
function <--- pain         -.097    .016   -6.223   ***
depress  <--- function     -.652    .116   -5.642   ***

[Path diagram with unstandardised estimates: pain -> function -> depress; paths -.10 and -.65; pain mean 4.93, variance 4.08; intercepts 3.25 and 3.10; error variances .15 and .37]


Exercise

Check the model fit of Ajzen's theoretical model. Try to find a better model (remove unnecessary paths, include necessary paths, based e.g. on chi-square tests for nested models, the residual covariance matrix, critical ratios, AIC, RMSEA).

Nested model comparison

AMOS examines every pair of models in which one model of the pair can be obtained by constraining the parameters of the other (e.g. removing one path = setting b to 0). For every such pair of "nested" models, a likelihood-ratio chi-square test can be performed to see whether the constrained model fits significantly worse (or fits about as well as the more complex model). For non-nested models use AIC (or BIC); the model with the smallest AIC is the best model. (See the model selection course or the handouts on the Biostatistics webpage for details.)

Nested model comparison

Right-click on the path you want to restrict to 0 (= delete) and enter a name for the path in the regression weight box.

Double-click on the XX-Default model box. Click New to create a new model and label it. Constrain the parameter (your path in this example) to 0 (see box): sn_int=0. You can add more than one constraint.

Go to Analysis properties, then run the model. Alternate between the output of the default model and model 2.

Output I: Nested model comparison

There was no significant difference between model 2 (without the path sn_int) and model 1 (with the path): chi2(1 df) = 0.935, p = 0.334. This suggests that the path from subjective norm to intention is not necessary. AMOS also reports the changes in the fit measures NFI, TLI, RFI and IFI; they all increase, which suggests a better fit.
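The p-value of this likelihood-ratio test can be checked by hand from the chi-square difference and its degrees of freedom; a minimal sketch (not part of the slides), using the values reported above:

from scipy.stats import chi2

# nested-model (likelihood-ratio) test: difference in chi2 values, df = difference in df
chi2_diff, df_diff = 0.935, 1          # constrained model (without sn_int) vs. full model
p = chi2.sf(chi2_diff, df_diff)
print(round(p, 3))                     # ≈ 0.334: the constrained model does not fit significantly worse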

Output II: Nested model comparisons

Quick introduction: How to define latent variables in AMOS

Example: a simple factor analysis. Can we reduce the three variables into one factor (latent variable) without losing too much information? = Factor analysis with maximum likelihood estimation.

Select View/Set, Analysis Properties, Output tab and check "Tests for normality and outliers."

Assessment of normality (Group number 1):

Variable         min      max      skew     c.r.     kurtosis    c.r.
pain            1.000    9.000     -.067    -.334      -.808    -2.013
function         .938    3.000    -1.502   -7.484      2.039     5.079
depress         1.000    4.400     1.579    7.868      2.522     6.285
Multivariate                                           4.760     5.305


Latent variable in AMOS

Use the button at the top left (#3) to create a latent variable.
Move the variables into the item boxes.
Name the error variables e1, e2, e3.
Name the latent variable "Well-being".
Tick standardised estimates, squared multiple correlations and factor score weights in the Analysis properties box.

Factor analysis in SPSS (Extraction method: Maximum Likelihood):

Communalities     Initial   Extraction
pain                .233      .365
depress             .204      .312
function            .288      .568

Factor Matrix (1 factor extracted, 4 iterations required):

                 Factor 1
pain               .604
depress            .558
function          -.754

SEM analysis in AMOS:

[Path diagram: latent variable Well-being -> pain (.60), depress (.56), function (-.75), with error terms e1-e3; squared multiple correlations .36, .31 and .57]
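A small numeric check (not in the original slides) of how the one-factor model reproduces the data, using the standardised loadings quoted above:

import numpy as np

# standardised loadings from the one-factor solution on the slide
loadings = np.array([0.604, 0.558, -0.754])          # pain, depress, function

communalities = loadings ** 2                         # ≈ .36, .31, .57 (squared multiple correlations)
implied_corr  = np.outer(loadings, loadings)          # model-implied correlations between the items
np.fill_diagonal(implied_corr, 1.0)
print(communalities.round(2))
print(implied_corr.round(3))

The implied correlations (.337, -.455, -.421) reproduce the observed pain/depression/function correlations from the earlier slides exactly: with three indicators and one factor the model has zero degrees of freedom (it is just identified).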

Latent variable models are a broad subclass of latent structure models. They postulate some relationship between the statistical properties of observable variables (or "manifest variables", or "indicators") and latent variables. A particular kind of statistical analysis corresponds to each kind of latent variable model (see the table below).

Kinds of latent variable models:

                           Latent variables
Manifest variables         Metrical                  Categorical
Metrical                   Factor analysis           Latent profile analysis
Categorical/(Ordinal)      Latent trait analysis     Latent class analysis

Checking assumptions of ML with AMOS

Skewness and kurtosis (peakedness of the distribution) for each variable should be within +/- 2. Mardia's statistic measures the degree to which the assumption of multivariate normality has been violated. Mardia's measure is based on functions of skewness and kurtosis and should be less than 3 to assume that the assumption of multivariate normality is met. Large values of Mardia's statistic may suggest some multivariate outliers in the data set (Tabachnick, B. G. & Fidell, L. S. (2001). Using Multivariate Statistics, Fourth Edition. Needham Heights, MA: Allyn & Bacon).

A multivariate outlier is an extreme combination of values, like a juvenile with a high income: an observation far from the centroid under the assumption of multinormality. The Mahalanobis distance is the most common measure used for multivariate outliers: the higher the Mahalanobis d-squared distance for a case, the more likely it is to be an outlier under the assumption of normality. The cases are listed in descending order of Mahalanobis d2. Check the cases with the highest d-squared as possible (but not necessarily actual) outliers. Consider cases as outliers if their Mahalanobis distances are well separated from the other distances (Arbuckle and Wothke, 1999).
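As an illustration (not part of the slides), the same d-squared values that AMOS lists can be computed from the raw data; the data below are hypothetical, with the last case made deliberately extreme:

import numpy as np

def mahalanobis_d2(X):
    """Squared Mahalanobis distance of each case from the centroid (larger = more extreme)."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False, ddof=1))
    return np.einsum('ij,jk,ik->i', diff, inv_cov, diff)

# hypothetical data: last case is an unusual combination of the two variables
X = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [2.5, 3.5], [1.5, 2.5], [1.0, 9.0]])
d2 = mahalanobis_d2(X)
print(np.argsort(d2)[::-1], d2.round(2))   # cases in descending order of d-squared, as AMOS lists them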

ML estimation requires indicator variables with a multivariate normal distribution and a valid specification of the model. Ordinal variables are widely used in practice: if ordinal data are used, they should have at least five categories and not be strongly skewed.


Assumption of parametric tests

Check assumptions as in any multivariable or multivariate analysis; see:
http://www2.chass.ncsu.edu/garson/pa765/assumpt.htm
Everitt, B. & Dunn, G. (2001). Applied Multivariate Data Analysis. Arnold.
Tabachnick, B. G. & Fidell, L. S. (2001). Using Multivariate Statistics, Fourth Edition. Needham Heights, MA: Allyn & Bacon.

