Bivariate Correlation
Multiple Regression
Path Analysis
Factor Analysis
Last week: Path analysis with latent constructs = Structural Equation Modelling

[Diagram: SEM with latent variables Pain, Depression and Function, each measured by observed indicators with error terms (Error 1-3)]
Last week:
Measurement model

[Diagram: measurement model — latent variables Pain, Depression and Function, each with observed indicator items and error terms]

AMOS toolbar:
1. Observed variables
2. Unobserved variables
3. Draw latent variable (draws a latent variable and its items)
4. Draw path (causal relationship, regression)
5. Draw covariances (correlation, no direction)
6. Unique variable (error variable; add one to each dependent variable)
7. List variables (open the data file first, then drag and drop variables)
8. Select one object / select all / deselect
9. Move object
10. Delete
13. Calculate estimates (starts the analysis)
14. View text output (see the results)
15. Copy graph to clipboard
16. Save

Structural model
[Diagram: structural model — pain, depress and function connected by paths, with an error term on each dependent variable]

Arrow types: unidirectional path (regression); correlation between variables; reciprocal relation between variables.
Today
- Variance, covariance, correlation and regression coefficients
- SEM/path analysis is based on the covariance matrix
- The logic of model testing in SEM
- Model fit and model comparisons
- Simple latent trait models
Exercise
Do a similar analysis as last week with the data file PATHINGRAM.sav. The data are from: Ingram, K. L., Cope, J. G., Harju, B. L., & Wuensch, K. L. (2000). Applying to graduate school: A test of the theory of planned behavior. Journal of Social Behavior and Personality, 15, 215-226. Ajzen's theory of planned behavior was used to predict students' intentions and application behaviour (to graduate school) from their attitudes, subjective norms and perceived behavioural control.
Exercise
- Draw the path diagram for the model.
- (Conduct a path analysis with a series of multiple regression analyses using SPSS.)
- (Calculate the standardised indirect effects using the standardised estimates from the regression analyses.)
- Check your results using AMOS.
- Use a bootstrap analysis to evaluate the indirect effect.
- Remove some indirect effects and compare the results with the theoretical model.
[Path diagram: Attitude, Subjective Norm and Perceived Behavioral Control → Intention (error e1); Intention and Perceived Behavioral Control → Behavior (error e2)]
Results
[AMOS output: the path diagram with standardised estimates (-.13, .51, .60, .35, .34); unstandardised regression weights — Intention <-- PBC -.126, Intention <-- SubNorm .095, Intention <-- Attitude .807, Behavior <-- Intent .350, Behavior <-- PBC .336; bootstrap lower/upper bounds and p-values; standardised indirect effects — SubNorm .033, PBC -.044, Intent .000]
How do we know that this model fits the data well? Are there better models? How can we compare two or more models? First we need to know a little bit about covariances.
SEM and path analysis are based on variances and covariances. Variance is a measure of the dispersion of data: it indicates how values are spread around the mean. Covariance is a measure of the covariation between two variables, such as pain and function.
$$\mathrm{Var}(X) = \frac{\sum_{i=1}^{N}(X_i - \bar{X})^2}{N - 1}$$

[Boxplot: score by group]
Centering
We centre both variables by subtracting the mean from each individual score. This gives a mean of 0 but the same distribution around the mean and the same variance. It also removes the constant in a regression and makes life easier for us.
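As a quick illustration, here is a minimal Python/NumPy sketch (with made-up pain scores, not the lecture data set) showing that centring sets the mean to 0 while leaving the variance untouched:

```python
import numpy as np

rng = np.random.default_rng(42)
pain = rng.normal(loc=49.5, scale=4.5, size=50)  # hypothetical pain scores

c_pain = pain - pain.mean()                      # centre: subtract the mean

print(round(c_pain.mean(), 12))                  # ~0.0
print(np.var(pain, ddof=1), np.var(c_pain, ddof=1))  # identical variances (N-1 denominator)
```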
[Boxplot: c_score by group]

Descriptive Statistics:

Variable      N    Mean    Std. Deviation   Variance
pain          50   49.45   4.482            20.084
function      50   74.65   12.530           157.012
c_pain        50   .00     4.482            20.084
c_function    50   .00     12.530           157.012

[Scatterplot: c_function against c_pain]
Covariance
The covariance is an unstandardised measure of association between two variables: a measure of the degree to which X and Y vary together. In our example the covariance between pain and function is 32; relative to the variance of pain (20), this means function increases by 32/20 = 1.6 units for each one-unit increase in pain.
$$\mathrm{Cov}(X,Y) = \frac{\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})}{N - 1}$$

[Scatterplot: c_function against c_pain, Cov = 32]
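A small sketch of this formula in Python (made-up centred values, not the lecture data set):

```python
import numpy as np

# hypothetical centred scores
c_pain = np.array([-4.0, -2.0, 0.0, 1.0, 5.0])
c_function = np.array([-9.0, -4.0, 1.0, 3.0, 9.0])

# Cov(X, Y) = sum over i of (Xi - mean(X)) * (Yi - mean(Y)) / (N - 1)
n = len(c_pain)
cov_manual = np.sum((c_pain - c_pain.mean()) * (c_function - c_function.mean())) / (n - 1)

print(cov_manual)                                 # manual computation
print(np.cov(c_pain, c_function, ddof=1)[0, 1])   # same value from NumPy
```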
Covariance matrix

The covariance of variable X with itself is:

$$\mathrm{Cov}(X,X) = \frac{\sum_{i=1}^{N}(X_i - \bar{X})(X_i - \bar{X})}{N - 1} = \mathrm{Var}(X)$$

Covariance matrix of pain and function (= variance/covariance matrix):

           Pain   Function
Pain       20     32
Function   32     157

Correlation

The correlation coefficient r is a standardised measure of the association between two variables, with a range from -1 to +1:

$$\mathrm{Corr}(X,Y) = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{\mathrm{Cov}(X,Y)}{\mathrm{sd}(X)\,\mathrm{sd}(Y)}$$

-1 = perfect negative association; 0 = no association (random); +1 = perfect positive association.

If we standardise X and Y to new variables Zx and Zy with mean 0 and variance 1, then Cov(Zx, Zy) = Corr(X, Y).
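To see the last point in code, a sketch with synthetic data: after z-standardisation the covariance of the two variables equals their correlation:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.6 * x + rng.normal(size=100)   # synthetic correlated variables

# standardise to mean 0, variance 1
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)

print(np.cov(zx, zy, ddof=1)[0, 1])  # Cov(Zx, Zy) ...
print(np.corrcoef(x, y)[0, 1])       # ... equals Corr(X, Y)
```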
If we assume that variable X (pain) influences variable Y (function), then we can describe the relationship by a regression equation:
$$y = c + b\,x \qquad \text{(for centred variables } c = 0 \text{, so } y = b\,x\text{)}$$
We can estimate the regression coefficient b (based on the least squares method) by:
$$b = \frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(X)}$$
Example
What's the regression coefficient b for the regression equation function = b·pain?

V/CV       Pain   Function
Pain       20     32
Function   32     157
Question
If we assume that function influences pain, what will b be? (pain = b·function)

V/CV       Pain   Function
Pain       20     32
Function   32     157

$$b = \frac{\mathrm{Cov}(X,Y)}{\mathrm{Var}(Y)} = \frac{\mathrm{Cov}(pain,\,function)}{\mathrm{Var}(function)} = \frac{32.25}{157} = 0.205$$
SPSS Coefficients (dependent variable: c_pain):

Model 1       B        Std. Error   Beta   t
(Constant)    -3E-012  .524                .000
c_function    .205     .042         .574   4.858
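Both slopes follow directly from the variance/covariance matrix; a short sketch using the numbers from the slides:

```python
# variance/covariance matrix of pain and function (values from the slides)
var_pain, var_function = 20.0, 157.0
cov_pf = 32.0  # the slides also use the unrounded value 32.25

b_function_on_pain = cov_pf / var_pain      # function = b * pain   -> 1.6
b_pain_on_function = cov_pf / var_function  # pain = b * function   -> ~0.204 (0.205 with cov = 32.25)

print(b_function_on_pain, b_pain_on_function)
```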
Optional
How to calculate the partial correlation (partial covariance) and partial beta for standardised data simply from the correlation matrix (= covariance matrix for standardised data).
Find the correlation between pain and depression while partialling out the effect of function.
$$r_{pain,depr \mid funct} = \frac{r_{pain,depr} - r_{pain,funct}\, r_{depr,funct}}{\sqrt{(1 - r^2_{pain,funct})(1 - r^2_{depr,funct})}}$$
Correlation matrix:

           pain      depress   function
pain       1         .337**    -.455**
depress    .337**    1         -.421**
function   -.455**   -.421**   1

**. Correlation is significant at the 0.01 level.
$$r_{pain,depr \mid funct} = \frac{.337 - (-.455)(-.421)}{\sqrt{(1 - .455^2)(1 - .421^2)}} \approx 0.18$$
Similar partial regression coefficients can be obtained from the Variance/Covariance matrix.
For y = depress with predictors x1 = pain and x2 = function:

$$\beta_{x_1} = \frac{r_{y,x_1} - r_{y,x_2}\, r_{x_1,x_2}}{1 - r^2_{x_1,x_2}} \qquad \beta_{x_2} = \frac{r_{y,x_2} - r_{y,x_1}\, r_{x_1,x_2}}{1 - r^2_{x_1,x_2}}$$
Correlation matrix
If we use the correlation matrix instead of the covariance matrix, we obtain the standardised beta coefficients. Knowing the variances of the original data allows us to calculate the unstandardised regression coefficients from the correlation matrix. In general, correlation matrices may lead to imprecise parameter estimates and standard errors in complex models, so variance/covariance matrices are preferred.
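A sketch that reproduces the partial correlation and the standardised betas from the correlation matrix above (plain Python, values from the slides):

```python
import math

# correlations from the slides: pain-depress, pain-function, depress-function
r_pd, r_pf, r_df = 0.337, -0.455, -0.421

# partial correlation of pain and depress, controlling for function
r_pd_given_f = (r_pd - r_pf * r_df) / math.sqrt((1 - r_pf**2) * (1 - r_df**2))
print(round(r_pd_given_f, 2))   # 0.18

# standardised partial regression coefficients for y = depress,
# x1 = pain, x2 = function (formulas above)
beta_pain = (r_pd - r_df * r_pf) / (1 - r_pf**2)
beta_function = (r_df - r_pd * r_pf) / (1 - r_pf**2)
print(round(beta_pain, 2), round(beta_function, 2))  # ~0.18, ~-0.34
```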
Aim of SEM
The aim of SEM is often to find a model with the smallest necessary number of parameters that still adequately describes the observed covariance structure: the covariance matrix reconstructed (estimated) from our theoretical model should resemble the observed covariance matrix. How do we know that a model is a good model and fits the data?
[Path diagram: pain → function → depress, with standardised coefficients -.46 (pain → function) and -.42 (function → depress), squared multiple correlations .21 (function) and .18 (depress), and error terms error and error2; observed correlations as in the matrix above]

β(PF) = r(PF) = -0.46 (direct effect of Pain on Function)
β(FD) = r(FD) = -0.42 (direct effect of Function on Depression)
Indirect effect of Pain on Depression via Function: β(PF) × β(FD) = -0.46 × -0.42 = 0.19
All variances are 1 (standardised).
Is the expected correlation matrix very different from the observed matrix (the more similar, the better the model)? E.g. the χ² goodness-of-fit test:

$$\chi^2 = \sum_{ij} \frac{\left(r_{ij}^{(o)} - r_{ij}^{(e)}\right)^2}{r_{ij}^{(e)}}$$
Conclusion
Next step: if our model variance/covariance matrix fits the observed data well, we can calculate the path regression coefficients and error variances from the matrix. In our simple example (without partial coefficients) the correlation coefficients and standardised betas are identical. Again, all we need is the covariance matrix! Path coefficients can be estimated using multiple regression methods (standardised partial coefficients) based upon a given model, and can be used to reconstruct the correlation (or covariance) matrix. The estimated (reconstructed, expected) correlations can then be compared with the observed correlations, and a chi-square test will show whether the model fits. Important: a non-significant chi-square denotes good fit. The more similar the observed and expected correlations, the smaller the chi-square and the better the model.
[Slide: χ² test of the model; path diagram as above — pain → function → depress with coefficients -.46 and -.42]
Our model is significantly different from the observed covariance matrix. Therefore, it does not adequately describe the observed relationships in our data!
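A minimal sketch of this logic for the chain model pain → function → depress: the model-implied pain–depress correlation is the product of the two standardised paths (tracing rule), and its gap to the observed correlation is what the χ² test penalises. Values are taken from the slides:

```python
# observed correlations (from the slides)
r_pf_obs, r_fd_obs, r_pd_obs = -0.455, -0.421, 0.337

# standardised path coefficients of the chain model pain -> function -> depress
b_pf, b_fd = -0.46, -0.42

# model-implied (expected) pain-depress correlation: only the indirect path exists
r_pd_expected = b_pf * b_fd
print(round(r_pd_expected, 2), "vs observed", r_pd_obs)
# 0.19 vs 0.337 -> the model reproduces this correlation poorly,
# which is why the chi-square test rejects it
```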
Many other measures of model fit have been developed, each with its own assumptions and limitations.
[Flattened table: comparison of estimation methods (GLS, ULS, SLS, ...) — Y/N entries for their distributional assumptions and scale properties, and recommended minimum sample sizes (> 100; for one method 1.5·p(p+1), where p = number of observed variables)]
Standardise variables
Same visual inspection as for multiple regression: look at histograms, Q-Q plots and boxplots.
What is bootstrapping?
Simulation studies (see Kline 1998: 209) suggest that under conditions of severe non-normality of data, SEM parameter estimates (e.g. path estimates) are still fairly accurate, but the corresponding significance coefficients are too high. → Use bootstrap tests if possible.
Bootstrapping is a way of estimating standard errors, confidence intervals and significance based not on assumptions of normality but on empirical resampling, with replacement, of the data. It has minimal assumptions: merely that the sample is a good representation of the unknown population, i.e. that the observed data resemble the true distribution. It generates information on the variability of parameter estimates or fit indexes from the empirical samples, not from the probability theory of normal distributions. Bootstrapping in SEM still requires moderately large samples. If the data are multivariate normal, MLE will give less biased estimates; however, if the data lack multivariate normality, bootstrapping gives less biased estimates.
[Histogram: frequency distribution of the bootstrap estimates]
The general bootstrap algorithm:
1. Generate a sample of size n from your data set, with replacement.
2. Compute your parameter of interest for this bootstrap sample (e.g. do an SEM analysis and get a regression coefficient b). For each random sample we get a different parameter estimate.
3. Repeat steps 1 and 2 1,000 times. This yields 1,000 bootstrap values θ̂1, θ̂2, ..., θ̂1000.
4. Sort the bootstrap values from smallest to largest. From the sorted values, take the 2.5th and 97.5th percentile values, i.e. simply the 25th and 975th of the sorted 1,000 values.
5. Calculate the standard deviation of the 1,000 θ̂s. This is the estimate of the standard error (SE) of your parameter estimate b. An approximate two-sided 95% confidence interval is b ± 1.96·SE. Test statistic: z = b/SE; if z > 1.96, then p < 0.05.
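A hedged Python sketch of this algorithm, using an ordinary regression slope as the parameter of interest on synthetic data (a real SEM bootstrap would refit the whole model, e.g. in AMOS, at each step):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)           # synthetic data set

def slope(x, y):
    # least-squares slope b = Cov(X, Y) / Var(X)
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

b = slope(x, y)

boot = []
for _ in range(1000):                      # steps 1-3: resample and re-estimate
    idx = rng.integers(0, n, size=n)       # sample of size n WITH replacement
    boot.append(slope(x[idx], y[idx]))
boot = np.sort(boot)                       # step 4: sort the bootstrap values

ci = (boot[24], boot[974])                 # 25th and 975th of 1,000 sorted values
se = boot.std(ddof=1)                      # step 5: bootstrap standard error
z = b / se                                 # z > 1.96 -> p < 0.05
print(b, ci, se, z)
```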
χ² test
- Compares the model with the saturated (full) model (e.g. all paths), or compares two nested models.
- Good fit: P > 0.2 (non-significant).

CMIN/DF (χ²/degrees of freedom in AMOS)
- Good fit: roughly < 2.5; sample size dependent.

If multivariate normality is violated: use ADF estimation as a robust alternative, and use other goodness-of-fit measures to evaluate models. Violations tend to underestimate the standard errors of parameter estimates, so regression paths or covariances are found statistically significant more often than they should be (Type I errors). If possible, use bootstrap tests and confidence intervals.

Goodness-of-fit measures based on the predicted and observed covariance matrices, penalising lack of parsimony (overfitting):
- Root mean square error of approximation (RMSEA; also RMR): discrepancy per df; least affected by sample size; penalises lack of parsimony; useful for comparing different models (but information criteria are better).
- Baseline-comparison indices (criterion > 0.9): NFI, where 0.9 means the model is 90% of the way from the independence model to the saturated model; CFI, similar but more robust (NFI underestimates fit if N is small); NNFI, similar to NFI but penalises model complexity and is little affected by sample size.
- Parsimony measures: PRATIO = df(model)/df(independence); parsimony indices PRATIO × BBI (NFI), PRATIO × NNFI, PRATIO × CFI (not all available in AMOS).
Information criteria:
- AIC — Idea/philosophy: penalises for lack of parsimony; the smallest AIC of a set of models marks the most likely best model (not the true model). Problem: may select more complex models.
- BIC — Idea/philosophy: penalises for sample size and lack of parsimony; the smallest BIC marks the model most likely to be the true model among the set of models.
- CAIC — Idea/philosophy: penalises for sample size and lack of parsimony, with a penalty greater than AIC and BIC. Problem: ?
[AMOS output: unstandardised estimates for the pain/function/depress model — means and variances, regression weights with critical ratios (C.R.) and p-values]
The residual matrices suggest that the relationship between depression and pain is not adequately modelled.
Exercise
Check the model fit of Ajzen's theoretical model. Try to find a better model (remove unnecessary paths, include necessary paths, based e.g. on χ² tests for nested models, the residual covariance matrix, critical ratios, AIC, RMSEA).
Select View/Set, Analysis Properties, Output tab and check "Tests for normality and outliers."
Assessment of normality (Group number 1)
Example: simple factor analysis. Can we reduce the three variables to one factor (latent variable) without losing too much information? = Factor analysis with maximum likelihood estimation.
Factor Matrix (SPSS, Factor 1 loadings): pain .604, depress .558, function -.754

[Path diagram: one-factor model — latent variable Well-being with loadings .60 (pain), .56 (depress) and -.75 (function), squared multiple correlations .36, .31 and .57, and error terms e1-e3]
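A sketch of the same idea in Python using scikit-learn's FactorAnalysis (maximum-likelihood-style fit via EM). The data are synthetic stand-ins for pain, depress and function, so the estimated loadings will only roughly resemble the slide:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(3)
n = 500
wellbeing = rng.normal(size=n)   # unobserved latent variable (synthetic)

# three observed indicators loading on the latent variable
X = np.column_stack([
    0.60 * wellbeing + rng.normal(scale=0.8, size=n),    # "pain"
    0.56 * wellbeing + rng.normal(scale=0.8, size=n),    # "depress"
    -0.75 * wellbeing + rng.normal(scale=0.66, size=n),  # "function"
])

fa = FactorAnalysis(n_components=1).fit(X)  # one-factor model
print(fa.components_)                       # estimated factor loadings
```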
Latent variable models are a broad subclass of latent structure models. They postulate some relationship between the statistical properties of observable variables (or "manifest variables", or "indicators") and latent variables. A special kind of statistical analysis corresponds to each kind of latent variable model.
Manifest \ Latent        Metrical                Categorical
Metrical                 Factor analysis         Latent profile analysis
Categorical/(Ordinal)    Latent trait analysis   Latent class analysis
Multivariate outlier: an extreme combination, like a juvenile with a high income — an observation farthest from the centroid under assumptions of multinormality. The Mahalanobis distance is the most common measure used for multivariate outliers: the higher the Mahalanobis d-squared for a case, the more likely it is an outlier under assumptions of normality. The cases are listed in descending order of Mahalanobis d². Check the cases with the highest d-squared as possible (but not necessarily) outliers. Consider cases as outliers if their Mahalanobis distances are well separated from the other distances (Arbuckle and Wothke, 1999).
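A sketch of this screening in NumPy (synthetic two-variable data; the planted case plays the "juvenile with a high income"):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.multivariate_normal(mean=[30.0, 40.0],
                            cov=[[25.0, 10.0], [10.0, 100.0]], size=50)
X[0] = [16.0, 95.0]   # planted multivariate outlier (young, high income)

centroid = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - centroid

# Mahalanobis d-squared for every case
d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)

# list cases in descending order of d2, as AMOS does;
# well-separated values at the top are candidate outliers
for case in np.argsort(d2)[::-1][:5]:
    print(case, round(float(d2[case]), 2))
```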
ML estimation requires indicator variables with a multivariate normal distribution and a valid specification of the model. Ordinal variables are widely used in practice: if ordinal data are used, they should have at least five categories and not be strongly skewed.
Assumption of parametric tests: check assumptions as in any multivariable or multivariate analysis, see:
- http://www2.chass.ncsu.edu/garson/pa765/assumpt.htm
- Everitt, B., & Dunn, G. (2001). Applied Multivariate Data Analysis. Arnold.
- Tabachnick, B. G., & Fidell, L. S. (2001). Using Multivariate Statistics (4th ed.). Needham Heights, MA: Allyn & Bacon.